Alternative data conferences connect actors across the buy-side ecosystem to explore novel use cases as demand for alternative data intensifies. Only the largest and most sophisticated players, each with a distinct role, can fully leverage their edge, and meeting the growing demands on data remains a challenge that few can solve.
With annual purchases of alternative data by U.S.-based buy-side firms projected to reach $900 million by 2021, the competition to find, extract, refine, package, and ultimately sell alternative data is immense. Quandl, a subsidiary of NASDAQ and the largest alternative data provider for financial professionals, is transforming investment management processes not only for the buy-side but also for private equity and venture capital.
Conferences such as Quandl’s 2020 alternative data conference offer a unique opportunity to explore the future of alternative data: potentially new data sources, new approaches to extract additional signals, and novel use cases. Here are a few of my observations from the conference.
Weaponizing alternative data
In recent years, the adoption of alternative data has changed the buy-side workflow by adding time-sensitive, context-specific alpha generated through signals extracted from alternative data. Because alternative data has proven able to generate alpha consistently, at least for a limited time until the signals diminish, competition for the data is increasingly fierce. Finding data with yet-to-be-extracted alpha is challenging.
The strategic weaponization of alternative data, coupled with the need to innovate across the alternative data pipeline to scale both demand and supply, emerged as a common theme across the speakers.
The standard use case for alternative data — satellite pictures of cars in mall parking lots — has considerable alpha decay. Incorporating new types of alternative data has also proven glacially slow, given the demands of backtesting and validation. The joint impact of the slow adoption process and the rapid alpha decay limits strategic opportunities for both the buy-side and the data vendors.
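To make the alpha-decay point concrete, a common simplification is to model a signal's excess return as decaying exponentially once the market starts trading on it. The sketch below is purely illustrative: the function name, the 40-basis-point starting alpha, and the 90-day half-life are assumptions, not figures from the conference.

```python
import math

def decayed_alpha(initial_alpha_bps, half_life_days, days_since_adoption):
    """Hypothetical exponential alpha-decay model.

    initial_alpha_bps: excess return (basis points) when the signal is fresh
    half_life_days: assumed number of days for the alpha to halve as the
                    signal becomes crowded
    """
    decay_rate = math.log(2) / half_life_days
    return initial_alpha_bps * math.exp(-decay_rate * days_since_adoption)

# A signal worth 40 bps with an assumed 90-day half-life:
print(round(decayed_alpha(40, 90, 0), 1))    # 40.0
print(round(decayed_alpha(40, 90, 90), 1))   # 20.0
print(round(decayed_alpha(40, 90, 180), 1))  # 10.0
```

Under this toy model, the window in which a dataset pays for itself shrinks with every additional fund that buys the same feed, which is exactly why the backtesting-and-validation lag matters.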
The near-term future of alternative data is not in new sources: it is in developing new approaches, often using complex machine learning tools, that extract signals from existing alternative data sources.
Extracting better signals
Companies such as P Street Advisors further refine primary sources of alternative data using artificial intelligence. Highly refined alternative data is an increasingly valuable tool because it captures new signals. But algorithmically enhanced data has its detractors, who argue that the tools used to strengthen the alpha are not always easily explainable: a black box feeding another black box. As much as the industry aspires to glass-box approaches, it is the black boxes that continue to win.
OmniSci looks at innovation from the computational angle. Alternative data is often massive, real-time, and multi-dimensional, and investors need to extract credible analytical signals without latency. OmniSci offers solutions that combine analytical speed with attractive graphical renderings: models trained over billions of rows of data that leverage machine learning approaches to speed up decisions around optimal trading strategies.
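The low-latency requirement can be shown in miniature. The snippet below is a generic sketch of a constant-time streaming aggregation, not OmniSci's API or architecture; the class name, window size, and prices are all invented for illustration.

```python
from collections import deque

class RollingSignal:
    """Constant-time rolling mean over a streaming price feed.

    Each update does O(1) work regardless of history length, which is
    the property latency-sensitive analytics pipelines scale up, with
    GPUs and columnar storage, to billions of rows.
    """
    def __init__(self, window):
        self.window = window
        self.values = deque()
        self.total = 0.0

    def update(self, price):
        self.values.append(price)
        self.total += price
        if len(self.values) > self.window:
            self.total -= self.values.popleft()
        return self.total / len(self.values)

sig = RollingSignal(window=3)
for p in [100.0, 101.0, 103.0, 99.0]:
    last = sig.update(p)
print(round(last, 2))  # mean of the last three prices: 101.0
```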
But that was not the only innovation on the agenda at the conference. One of the more traditional and broadly adopted types of alternative data is signal extraction from news and other documents using Natural Language Processing (NLP). Yet, as RavenPack, one of the pioneers in the field, demonstrated, NLP still has room to innovate. As underlying methodologies evolve, increasingly sophisticated NLP algorithms coupled with novel approaches to data science yield complex, multithreaded knowledge graphs that better capture relationships. Network graphs allow analysts to look beyond flat data to identify hidden and unexpected connections. Tracking how relationships in network graphs change over time reveals the direction, strength, and potential causality between factors and events.
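A minimal sketch of the co-occurrence idea behind such knowledge graphs, assuming a toy entity list and invented headlines (RavenPack's actual pipeline is far more sophisticated than string matching):

```python
from collections import Counter
from itertools import combinations

# Hypothetical entity list and toy headlines, for illustration only
ENTITIES = {"Tesla", "Panasonic", "Apple", "Foxconn"}

headlines = [
    "Tesla extends battery partnership with Panasonic",
    "Apple supplier Foxconn expands capacity",
    "Panasonic weighs new Tesla battery plant",
]

def cooccurrence_edges(docs, entities):
    """Count how often entity pairs appear in the same document.

    The resulting edge weights approximate relationship strength;
    recomputing them over rolling time windows shows relationships
    strengthening or fading, which is the dynamic view described above.
    """
    edges = Counter()
    for doc in docs:
        found = sorted(e for e in entities if e in doc)
        for pair in combinations(found, 2):
            edges[pair] += 1
    return edges

edges = cooccurrence_edges(headlines, ENTITIES)
print(edges)
# Counter({('Panasonic', 'Tesla'): 2, ('Apple', 'Foxconn'): 1})
```

Real systems would add entity disambiguation, relationship typing, and sentiment weighting on top of this skeleton, but the graph-over-time structure is the same.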
NLP is moving from extracting sentiment and signals from texts to dynamically depicting long-term, co-occurring, multithreaded relationships.
The usefulness of alternative data depends on the length and depth of the data: the higher the quality and the more timely its delivery, the more distinctive the data becomes.
Successful long-term alternative data strategies will likely combine immense computational power and sophisticated machine learning pipelines with an intense focus on the human components. As alternative data changes the information content and augments the existing information set, the benefits accrue to those who proactively capture it. The competitive pressures on alternative data providers continue to be immense.