Web Scraping to Predict Competitor Performance in Retail

Competitor monitoring is a popular and wide-ranging application for web scraping. After all, a large part of the valuable information comes from understanding the actions and strategies our competition employs.

Getting accurate data on a competitor’s performance indicators can be tough. Traditionally, we’d get a glimpse into the financials once a year, and only if the company is publicly traded or publishes reports. Even then, these reports might not be as illuminating as one might think.

Savvy companies will minimize any sensitive information and only report what’s necessary, such as spending, holdings, and overall revenue. You’ll rarely find the steps they took to reach their revenue goals.

Publicly accessible data

In retail and ecommerce, companies are naturally inclined to reveal a lot of data about products and services. From extensive descriptions and titles to pricing and stock data, all of these serve as guidance for potential customers.

Implementations differ across retailers, especially when it comes to stock on hand. Some retailers list remaining stock numbers only once they drop below 10. Others show an abstract indicator of remaining stock, usually expressed in colors or bar charts. Finally, some always display exact numbers.

As a result, web scraping will be much more effective on retailers that display the most data. Even if stock is presented abstractly (such as through bar charts or colors), these are still indicators that can be used for analysis, since changes over time are more illuminating than particular numbers.

Other important metrics, which are nearly always visible, are review counts. Collecting details about reviewers or the content they post should be avoided, as it involves personal data and is rarely as useful as the volume of reviews.

In both cases, changes in review counts and stock levels can be used as secondary indicators that serve as proxies for performance. While these metrics are not entirely accurate, they can support other data points to provide a better overall picture of the competitor.
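As a rough illustration of how abstract indicators can still be tracked over time, the sketch below maps a color-coded stock badge to an ordinal level and records when the level moves. The color names, the `STOCK_LEVELS` mapping, and the sample observations are all hypothetical; a real retailer's markup would need its own mapping.

```python
# Hypothetical mapping from a color-coded stock badge to an ordinal level.
# Higher numbers mean more stock; the actual colors depend on the retailer.
STOCK_LEVELS = {"red": 1, "yellow": 2, "green": 3}


def level_changes(observations):
    """Return (timestamp, change) pairs wherever the stock level moved.

    `observations` is a list of (timestamp, color) tuples in time order.
    """
    changes = []
    for (_, prev), (ts, curr) in zip(observations, observations[1:]):
        diff = STOCK_LEVELS[curr] - STOCK_LEVELS[prev]
        if diff != 0:
            changes.append((ts, diff))
    return changes


# Made-up daily observations for a single product.
observations = [
    ("2024-01-01", "green"),
    ("2024-01-02", "green"),
    ("2024-01-03", "yellow"),
    ("2024-01-04", "red"),
    ("2024-01-05", "green"),
]

changes = level_changes(observations)
print(changes)
```

The exact quantities stay unknown, but the timing and direction of each change are preserved, which is what matters for analysis over time.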

Understanding metric velocity

When measuring performance, numbers themselves aren’t that important. They might reveal something about the overall state of a business, but any number is a single snapshot in time. Performance, as most would understand, has to do with fluctuations in numbers over an extended period of time.

Deltas (that is, the changes in values between two observations) are slightly more informative. A delta is still a snapshot of the business, but it’s closer to a performance metric. With enough data, a predicted volume of sales (and, in turn, revenue) can be extrapolated.

These estimates can be broken down by category. Extra attention is warranted whenever larger events (e.g. Black Friday) come around. Differences between categories might be more pronounced then, enabling us to understand which products garner more interest from consumers.

It should be noted, however, that such extrapolation is only possible if a retailer provides exact stock numbers. Data gets a lot muddier if only abstract representations of held stock are shown, although it can still be valuable. If stock data is unavailable altogether, review counts can be used as a proxy.
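To make the idea concrete, here is a minimal sketch that turns a series of exact stock counts into deltas and a rough sales estimate. It assumes that every decrease in stock is a sale and every increase is a restock, which will not always hold (returns and inventory corrections look the same from the outside). The function names and sample numbers are illustrative only.

```python
def stock_deltas(snapshots):
    """Return the change between each pair of consecutive stock counts."""
    return [curr - prev for prev, curr in zip(snapshots, snapshots[1:])]


def estimated_units_sold(snapshots):
    """Sum all decreases in stock, treating them as sales.

    Assumption: increases are restocks and are ignored.
    """
    return sum(-d for d in stock_deltas(snapshots) if d < 0)


# Made-up daily stock counts scraped for one product.
daily_stock = [120, 112, 101, 95, 95, 80]

deltas = stock_deltas(daily_stock)
sold = estimated_units_sold(daily_stock)
print(deltas, sold)
```

Multiplying the estimated units by the listed price would give a crude revenue figure for the same period.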

For accurate performance reflections, however, velocity is the key. Velocity of metrics is the rate of change over time, which is the closest one can get to a glimpse into performance. To calculate velocity, historical data has to be collected so that the deltas of a particular metric can be measured over known time intervals.

Velocity is a long-term indicator of performance as it shows the direction in which the company is moving. Such data might be less relevant for particular products or certain categories, but, taken as a whole, it reflects how a retailer’s strategies are playing out.

Additionally, both velocity and deltas can be used to evaluate where a particular retailer might be strengthening its position. In other words, they can reveal something about the retailer’s strategic direction.

Finally, retail products often go through sales cycles in which retailers acquire larger-than-usual amounts of stock. Tracking such data helps us evaluate which products and categories are undergoing a particular cycle.
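A simple way to express velocity is the delta divided by the time elapsed between two observations. The sketch below does this for review counts collected at uneven intervals; the function name and the sample data are assumptions made for illustration.

```python
from datetime import date


def velocities(observations):
    """Return the per-day rate of change between consecutive observations.

    `observations` is a list of (date, value) tuples sorted by date.
    """
    rates = []
    for (d0, v0), (d1, v1) in zip(observations, observations[1:]):
        days = (d1 - d0).days
        if days <= 0:
            raise ValueError("observations must be strictly increasing in time")
        rates.append((v1 - v0) / days)
    return rates


# Made-up review counts for one product, collected at uneven intervals.
review_counts = [
    (date(2024, 3, 1), 200),
    (date(2024, 3, 8), 214),
    (date(2024, 3, 15), 235),
    (date(2024, 3, 29), 291),
]

rates = velocities(review_counts)
print(rates)
```

Dividing by elapsed days keeps intervals of different lengths comparable, so a rising sequence of rates suggests accelerating interest rather than just a longer gap between scrapes.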
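One hedged way to spot such stock acquisitions is to flag unusually large jumps between consecutive stock counts. In the sketch below, the `min_increase` threshold of 20 units is an arbitrary assumption and would need tuning per product or category; small increases are treated as noise (for example, returned items).

```python
def find_restocks(snapshots, min_increase=20):
    """Return (index, increase) for each snapshot where stock rose sharply.

    `min_increase` is a hypothetical threshold separating restocks from noise.
    """
    restocks = []
    for i in range(1, len(snapshots)):
        increase = snapshots[i] - snapshots[i - 1]
        if increase >= min_increase:
            restocks.append((i, increase))
    return restocks


# Made-up daily stock counts: one large restock on day 3, a small bump on day 6.
daily_stock = [50, 42, 35, 140, 131, 120, 123]

restocks = find_restocks(daily_stock)
print(restocks)
```

Comparing the size and timing of flagged restocks across products can hint at which categories a retailer is stocking up on.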

Engaging with other metrics

Datasets become more informative as more sources are added, as long as they are processed in an efficient and timely manner. Data tends to provide more signals when there is a larger volume and variety of it.

Including pricing data when calculating the above provides important context, as price fluctuations can greatly influence purchases. As such, it can be used to evaluate whether products or categories are becoming more popular simply due to pricing.

Additionally, overall website traffic can be used as a cross-check to improve accuracy. While higher traffic is itself an indicator of overall performance, combining stock data with traffic could give us a better understanding of how that traffic is distributed.

As with pricing, monitoring traffic alongside scraped data would allow us to better understand the underlying reasons why categories might be experiencing greater changes in sales.
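As a minimal sketch of this comparison, the example below groups estimated daily units sold by the price listed on that day and averages them per price point. The records are made up, and the estimated units are assumed to come from stock deltas; a real analysis would also need to account for seasonality and promotions.

```python
from collections import defaultdict


def average_units_by_price(daily_records):
    """Return the average estimated units sold per day at each price point.

    `daily_records` is a list of (price, estimated_units_sold) tuples.
    """
    totals = defaultdict(lambda: [0, 0])  # price -> [total units, number of days]
    for price, units in daily_records:
        totals[price][0] += units
        totals[price][1] += 1
    return {price: units / days for price, (units, days) in totals.items()}


# Made-up daily records: the product was discounted for two of the five days.
records = [(19.99, 10), (19.99, 12), (14.99, 25), (14.99, 23), (19.99, 11)]

averages = average_units_by_price(records)
print(averages)
```

If sales roughly double only while the price is lower, the jump in popularity is more likely a pricing effect than a shift in underlying demand.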

In the end, web scraping can serve as an advanced monitoring tool that can help us build a novel system that evaluates retailer performance. While such data is somewhat muddy, it can point us in the right direction and provide insights that would be otherwise unavailable.

Andrius Palionis

Since 2015, Andrius Palionis has been supporting major companies around the world in their journey towards data-driven decision making. His motto “persistence is progress” has driven him to transform global attitudes towards the importance of data to business success and growth. As a Director of Sales and later VP of Enterprise Solutions at Oxylabs, Andrius obtained an in-depth understanding of the main challenges that arise with data acquisition. Day to day, he uses his problem-solving and team management skills to accelerate the performance of numerous companies by successfully bridging their data needs with the most effective solutions.
