Alternative Data Guide

Alternative data is gaining popularity among hedge fund managers and financial institutions. Although it adds new challenges and complexity to the asset selection process, it also brings fresh promise and excitement to the otherwise trying active investing scene.

What is Alternative Data?

Traditional data sources such as financial statements, presentations, SEC filings, and sales figures are valuable, but they rarely give a complete or sufficiently timely picture. Investors in this data-driven era demand more actionable insights before deciding where to invest.

To fill this void, a number of companies have been quick to collect, clean, analyze, and interpret useful data from non-traditional sources. These sources include financial transactions, sensor inputs, web traffic, mobile devices, satellites, public records, and news sites. More often than not, they sit outside the companies being analyzed.

Alternative data gives investors unique and timely insights that help identify the best investment opportunities. Properly extracted, these insights can signal events well before they are reported by the popular financial news outlets.

The Growth of Alternative Data

Initially, there was a lot of skepticism surrounding alternative data and its usefulness. As time went by, more and more organizations, especially investment firms, saw its benefits and started to embrace the idea. Over the last decade, a good number of data brokers and aggregators began to specialize in supplying alternative data to investors. Today, it is widely accepted that alternative data can give insights that traditional data cannot. New collection methods, combined with advances in machine learning and computing power, have made these early insights far easier to access.

Alternative data drew more attention when giants like J.P. Morgan Chase and Goldman Sachs endorsed its power and stepped up their support for research in this area. What’s more, Bloomberg recently announced that it too is ready to sell alternative data sets to organizations that need them.

Wrap your head around these stats:

  • According to J.P. Morgan, asset managers are investing $2-3 billion in the acquisition and processing of alternative data.
  • The money invested in alternative data is likely to grow by 20-30% every year.
  • The total data generated all over the world is expected to increase to 163 ZB by 2025.
  • The number of data scientists has almost quadrupled in the last 5 years.
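
To get a feel for what 20-30% annual growth implies, the compounding can be worked through directly. The sketch below is illustrative only: the function name `project_spend`, the $2 billion and $3 billion starting points, and the five-year horizon are assumptions chosen for the example, not projections from J.P. Morgan.

```python
def project_spend(base_billion: float, annual_growth: float, years: int) -> float:
    """Compound a starting spend forward at a fixed annual growth rate."""
    return base_billion * (1.0 + annual_growth) ** years


if __name__ == "__main__":
    # Hypothetical scenarios: low end ($2B at 20%) and high end ($3B at 30%).
    for base, rate in [(2.0, 0.20), (3.0, 0.30)]:
        projected = project_spend(base, rate, years=5)
        print(f"${base:.0f}B growing {rate:.0%} a year for 5 years -> ${projected:.2f}B")
```

Under these assumed inputs, spending after five years would be roughly 2.5 to 3.7 times the starting figure, which is why even a modest-looking annual rate matters.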

Figure: Number of alternative data providers available from 1990 to 2019

During the past few years, more and more organizations have foreseen the explosion of alternative data and joined the bandwagon. Today there are more than 400 alternative data providers, according to YipitData, and that count covers only the larger ones.

How is Alternative Data Generated?

The real reason alternative data has flourished is the improvement in data collection methods and in the technology used to process such gigantic amounts of data. In addition, advances in the Internet of Things have further increased the amount of data available on a daily basis.

There are three ways in which alternative data is generated:

By Individuals:

Most of the data generated by individuals tends to be unstructured and comparatively harder to process. Individuals produce huge amounts of data every minute. Examples include social media interactions, product reviews on Amazon or other e-commerce sites, and trends on search engines like Google and Bing.

By Business Processes:

Most data generated by businesses tends to be structured and can provide great insights for financial decision making. This form of alternative data is also called ‘exhaust data’, as it mainly arises as a by-product of different business processes. Examples include credit card transactions, sales transactions, and data from government agencies.

By Sensors:

The data generated by sensors is largely unstructured. In the current era of technology and the Internet of Things, sensors are widespread. They continually pick up signals and pass them from one device to another. Devices such as CCTV cameras, machines, POS systems, and even parking lot sensors provide a host of vital business data. Alternative data can also be generated by satellite imaging and geo-location devices. This form of data is highly valuable for gauging things like how often a certain store is visited or how often certain products are shipped to certain locations.

Figure: How alternative data is generated

Major Types of Alternative Data

Let’s take a closer look at some of the different types of alternative data generated:

Web Data 

This type of data includes information relating to web traffic, popular web searches, demographics, click-through rates, etc. It is quite useful in gauging the results of advertising campaigns or the popularity of websites or products. It also provides excellent insights for market research and e-commerce.

Social Sentiment

This includes data resulting from the processing of social media posts and comments, as well as public reactions to the news, product ads, etc. The data can come in the form of textual posts, digital images, or videos, and can cover any form of online interaction between people on social media sites like Twitter, Facebook, and LinkedIn. This data helps give an idea of current trends and brand virality.

For example, Twitter sentiment analysis is a popular method of gauging public reactions to a product release, event, or public announcement. Most of this data tends to be unstructured and requires pre-processing.
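
As a rough illustration of what sentiment scoring involves, here is a minimal lexicon-based sketch. It assumes a tiny hand-made word list (`POSITIVE` and `NEGATIVE` are invented for this example) and made-up posts; production systems typically rely on much larger lexicons or trained models, so treat this as a toy, not a recommended method.

```python
import re

# Tiny hand-made lexicon for illustration only (an assumption, not a real resource).
POSITIVE = {"love", "great", "good", "amazing", "excellent"}
NEGATIVE = {"hate", "bad", "terrible", "broken", "awful"}


def sentiment_score(text: str) -> int:
    """Return (number of positive words) minus (number of negative words)."""
    # Pre-processing step: lowercase the post and keep only word-like tokens.
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def average_sentiment(posts: list[str]) -> float:
    """Average the per-post scores; 0.0 when there are no posts."""
    if not posts:
        return 0.0
    return sum(sentiment_score(p) for p in posts) / len(posts)


if __name__ == "__main__":
    # Hypothetical posts about a product release.
    sample_posts = [
        "Love the new camera, great battery!",
        "Terrible update, the app is broken.",
        "Just got mine today.",
    ]
    for post in sample_posts:
        print(sentiment_score(post), post)
    print("average:", average_sentiment(sample_posts))
```

Tracking the average over time, rather than reading any single score, is what turns raw posts into a usable trend signal.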

Geo-location Data 

Geo-location data is data received from electronic devices (especially mobile ones) that reveals their physical location. Besides GPS signals, it can also be derived from WiFi or Bluetooth signals. This type of alternative data provides a significant amount of information for making location-based decisions. Organizations can get a better understanding of which locations have the most demand for certain products or activities, and retail stores can use this information to decide on the best areas in which to expand. As Internet of Things technology spreads, this form of alternative data will become more and more useful.
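
One way to turn location pings into a store-visit count is to measure each ping's distance from the store and count the distinct devices that come within some radius. The sketch below assumes pings arrive as `(device_id, latitude, longitude)` tuples and uses an arbitrary 50-metre radius; both the data layout and the threshold are assumptions for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def count_visitors(pings, store, radius_m=50.0):
    """Count distinct device IDs with at least one ping within radius_m of the store."""
    visitors = {
        device_id
        for device_id, lat, lon in pings
        if haversine_m(lat, lon, store[0], store[1]) <= radius_m
    }
    return len(visitors)


if __name__ == "__main__":
    store = (40.0, -75.0)  # hypothetical store location
    pings = [
        ("a", 40.0001, -75.0),   # about 11 m away
        ("a", 40.0002, -75.0),   # same device again, counted once
        ("b", 40.0100, -75.0),   # about 1.1 km away, not a visit
        ("c", 40.0, -75.0003),   # about 26 m away
    ]
    print(count_visitors(pings, store))
```

Counting distinct devices rather than raw pings avoids inflating the figure when one phone reports its position many times during a single visit.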

Credit Card Transactions 

Data obtained from credit and debit card transactions can be quite useful in tracking retail revenue. It can also show how regularly an individual pays their bills, which indicates whether they are likely to make their loan payments on time. This form of alternative data is highly accurate and quite insightful. However, licenses for it can be quite expensive to obtain.
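
To make the revenue-tracking idea concrete, the following sketch sums card transactions per merchant and calendar quarter and then computes the quarter-over-quarter change. The `(merchant, date, amount)` record layout and the merchant names are hypothetical; real licensed datasets have their own schemas and usually need panel-bias adjustments that are not shown here.

```python
from collections import defaultdict
from datetime import date


def quarterly_spend(transactions):
    """Sum transaction amounts per (merchant, year, quarter)."""
    totals = defaultdict(float)
    for merchant, day, amount in transactions:
        quarter = (day.month - 1) // 3 + 1
        totals[(merchant, day.year, quarter)] += amount
    return dict(totals)


def quarter_over_quarter_change(totals, merchant, year, quarter):
    """Percent change versus the previous quarter, or None if data is missing."""
    prev_year, prev_quarter = (year, quarter - 1) if quarter > 1 else (year - 1, 4)
    current = totals.get((merchant, year, quarter))
    previous = totals.get((merchant, prev_year, prev_quarter))
    if current is None or not previous:
        return None
    return (current - previous) / previous * 100.0


if __name__ == "__main__":
    # Hypothetical transactions for two made-up merchants.
    sample = [
        ("ACME", date(2023, 1, 15), 100.0),
        ("ACME", date(2023, 3, 31), 50.0),
        ("ACME", date(2023, 4, 1), 180.0),
        ("Other", date(2023, 4, 2), 10.0),
    ]
    totals = quarterly_spend(sample)
    print(totals)
    print(quarter_over_quarter_change(totals, "ACME", 2023, 2))
```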

Email Receipts 

Email receipts are electronic receipts sent on delivery of a purchased product or service. They often arrive in connection with opt-in emails, rewards from a rewards app, invoices, etc. This form of data is quite useful for tracking retail revenues.

Here’s an interesting fact:

In November 2016, product receipts from more than 3 million email inboxes were evaluated to conclude that GoPro’s sales volumes were dropping. This, in turn, led to a drop in the company’s share price.

Point of Sale Transactions 

Transactions at the point of sale provide a wealth of information, not only about sales volumes and price trends but also about consumer preferences and product popularity.

Satellite Imagery

This form of alternative data is becoming increasingly popular, albeit expensive. It is obtained from satellites or low-flying drones. In raw form, the data consists of images, which are then processed to extract the required information. The World Bank, for example, uses satellite imagery to measure economic activity in South Asia.

Here’s another great application of satellite imagery:

A few investment firms are using satellite image data to assess the health of local economies. They extract information from images of parking lots to see how many cars are parked at different times. This gives them an insight into the economic state of the area.
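
Once cars have been counted in each image, the counts still need to be made comparable across lots of different sizes. A minimal sketch of one possible approach, assuming the counts have already been extracted and that each lot's capacity is known, is to convert counts into fill rates and average them per period. The lot IDs, capacities, and period labels below are invented for illustration.

```python
from statistics import mean


def occupancy_rates(observations, capacities):
    """Convert raw car counts into fill rates (cars / spaces) per observation."""
    return [
        (lot_id, period, cars / capacities[lot_id])
        for lot_id, period, cars in observations
        if capacities.get(lot_id)  # skip lots with unknown or zero capacity
    ]


def average_occupancy_by_period(observations, capacities):
    """Average the fill rate across all lots for each period."""
    by_period = {}
    for _, period, rate in occupancy_rates(observations, capacities):
        by_period.setdefault(period, []).append(rate)
    return {period: mean(rates) for period, rates in by_period.items()}


if __name__ == "__main__":
    capacities = {"lot_a": 100, "lot_b": 200}  # hypothetical lot sizes
    observations = [
        ("lot_a", "2023-Q1", 50),
        ("lot_b", "2023-Q1", 50),
        ("lot_a", "2023-Q2", 80),
        ("lot_b", "2023-Q2", 120),
        ("lot_c", "2023-Q2", 10),  # capacity unknown, so it is ignored
    ]
    print(average_occupancy_by_period(observations, capacities))
```

A rising average fill rate from one period to the next would be read as a sign of stronger local activity, subject to seasonality and weather effects that this sketch does not model.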

Other uses of satellite data include supply-chain disruption tracking or tracking metal production and storage.


Weather Data

Similar to satellite data, data on weather patterns can be quite useful in making a wide range of economic decisions. This form of data is collected from various sensors, like precipitation sensors, pressure sensors, thermometers, etc. It can be used to estimate how much or what kind of agricultural produce can be expected from a region, or which commodities are likely to be available.

How to Obtain Alternative Data

There are largely three ways in which alternative data can be obtained:

Web scraping: 

This is also known as web harvesting. This is usually done by programmers who write code to access information available over the internet. The process involves browsing web pages from link to link and downloading the necessary information from the relevant pages through a series of text processing functions.

The information thus extracted is then saved to a spreadsheet or converted into a form that can be easily interpreted. For example, one can build a web scraper to extract and analyze information about contracts and tenders from different web pages.

Web scraping has only become easier. A number of applications can now perform these tasks for users without any programming.
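
The extraction step described above can be sketched with Python's standard library alone. The example parses an HTML snippet, pulls out links marked as tenders, and writes them to CSV text that a spreadsheet can open. The `class="tender"` markup, the sample page, and the names `TenderLinkParser`, `extract_tenders`, and `tenders_to_csv` are all assumptions made for illustration; a real scraper would also fetch pages over the network and should respect each site's terms of use.

```python
import csv
import io
from html.parser import HTMLParser


class TenderLinkParser(HTMLParser):
    """Collect (title, href) pairs from <a class="tender"> elements."""

    def __init__(self):
        super().__init__()
        self.tenders = []
        self._href = None  # href of the tender link currently open, if any
        self._text = []    # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        if tag == "a" and "tender" in classes:
            self._href = attr_map.get("href") or ""
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            title = " ".join("".join(self._text).split())  # collapse whitespace
            self.tenders.append((title, self._href))
            self._href = None


def extract_tenders(html: str):
    """Return a list of (title, href) tuples found in the HTML text."""
    parser = TenderLinkParser()
    parser.feed(html)
    parser.close()
    return parser.tenders


def tenders_to_csv(rows) -> str:
    """Serialize the extracted rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["title", "url"])
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical page content standing in for a downloaded web page.
SAMPLE_HTML = """
<ul>
  <li><a class="tender" href="/tenders/101">Road resurfacing contract</a></li>
  <li><a href="/about">About us</a></li>
  <li><a class="tender featured" href="/tenders/102">School <b>IT</b> equipment</a></li>
</ul>
"""

if __name__ == "__main__":
    print(tenders_to_csv(extract_tenders(SAMPLE_HTML)))
```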

Acquisition of raw data: 

Raw data is unprocessed data obtained directly from a source. For example, the readings obtained from a sensor (which consist mainly of numbers) are a form of raw data. Such data has not been subjected to any cleaning, noise removal, or other processing, so the buyer has to do that work.
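
As an illustration of the cleaning that raw data usually needs, the sketch below drops missing sensor readings and then removes values that sit far from the median, using the median absolute deviation (MAD) as the yardstick. The function name `clean_readings`, the cut-off of 5 deviations, and the sample temperatures are assumptions for this example; the right rule depends on the sensor and the use case.

```python
from statistics import median


def clean_readings(readings, max_deviations=5.0):
    """Drop missing values, then drop points far from the median (MAD rule)."""
    present = [r for r in readings if r is not None]
    if len(present) < 3:
        return present  # too few points to judge outliers
    mid = median(present)
    mad = median(abs(r - mid) for r in present)
    if mad == 0:
        return present  # no spread estimate is possible; keep everything
    return [r for r in present if abs(r - mid) / mad <= max_deviations]


if __name__ == "__main__":
    # Hypothetical temperature readings with one gap and one sensor glitch.
    raw = [20.1, 20.3, None, 19.9, 250.0, 20.0, 20.2]
    print(clean_readings(raw))
```

The median-based rule is used here instead of a mean and standard deviation because a single extreme glitch can distort those enough to hide itself.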

3rd party licensing: 

Some companies obtain licenses to collect ‘exhaust’ data, like credit card transactions, POS transactions, etc., from different companies. They then process this data into formats that are easy to use and sell it to other organizations. Major players in this field include organizations like Quandl, YipitData, and iResearch.

How to use alternative data (in finance & investing)

One thing is for sure: 

The use of alternative data in finance and investing is increasing. Both fundamental and quantitative investment firms are using it to find novel sources of alpha.

These firms are using alternative data in various forms and from various sources. For example, organizations like Orbital Insight persistently monitor more than 260,000 parking lots through satellite imagery and then sell the satellite data to give insights on where people are shopping and when. These images also help predict the times of year at which shopping is likely to spike. This sort of information can give investors an early read on sales revenues well ahead of the quarterly results.

Figure: Alternative data adoption

Companies like Yodlee consolidate credit and debit card transactions and sell them to hedge funds. The datasets can cost up to a million dollars, but they can help uncover important insights in areas like fraud detection, retail trends, spending habits, etc.

Rakuten Intelligence uses its ‘’ app to get email receipt data. The app offers a service that helps users get rid of junk mail or spam. It can detect commercial emails in the user’s inbox, use them to gather insights about the user’s shopping preferences, and sell that information to investors.

Finally, social media sites like Facebook and Twitter provide their own APIs through which user posts can be collected, subject to each platform’s terms of use. This data can then be used to perform sentiment analysis. In fact, Twitter sentiment analysis is widely used to get insights into public sentiment about retail brands, events, or trending topics.


Conclusion

Alternative data is being labeled the new oil. Investors and hedge funds are always on the lookout for novel ways to improve alpha, and alternative data can give them the insights to get there. The field is still in its infancy but is quickly gaining popularity. More research is needed on better ways to process the enormous amounts of alternative data being acquired, and new strategies are needed for integrating this data into investment decisions. It is hard to say how valuable this will be twenty years down the line, but for the moment the possibilities are simply fascinating to curious minds.