Why Data Is the Lifeblood of Modern Hedge Funds
Quantitative and systematic hedge funds live and die by their data. A unique dataset, properly cleaned and integrated, can yield an edge worth hundreds of millions of dollars. Large firms spend tens of millions of dollars per year on data subscriptions, and their data teams often rival research teams in size. Below is a structured view of the most commonly used datasets, with rough pricing ranges so that candidates and new analysts can understand the economics.
Pricing varies enormously by firm size, use case, and negotiation. The ranges below reflect typical institutional contracts and should be treated as approximate.
Market Data
Market data is the foundation of every trading strategy. It includes historical and real-time prices, volumes, and order book activity across asset classes.
- Refinitiv (LSEG) Tick History — global tick data across equities, futures, FX, and bonds. Typical enterprise cost: $200K to $1M+ per year.
- Bloomberg B-PIPE and BQuant — real-time feeds and analytics. A Bloomberg terminal costs around $28K per year, with B-PIPE enterprise feeds running from $100K to several million per year.
- Polygon.io — lower-cost US equities and options data. Plans start at $29 per month and scale to $2K+ per month for enterprise.
- Databento — pay-as-you-go historical and live market data. Often $10K to $100K per year for mid-sized funds.
- CRSP — gold-standard US equity historical data for academic and systematic research. University licenses are around $5K to $20K; commercial licenses run $50K to $200K+.
- CME DataMine — futures tick data from CME Group. Individual products start at a few hundred dollars; full historical packages run into six figures.
Fundamental Data
Fundamental data covers company financials, estimates, filings, and corporate actions. It is the core input for equity long-short and factor strategies.
- Compustat (S&P Capital IQ) — standardized historical financials. Typical enterprise cost: $100K to $500K per year.
- FactSet — bundled fundamentals, estimates, and analytics. Per-seat licensing typically $12K to $20K per user per year.
- I/B/E/S (Refinitiv) — sell-side analyst estimates. Typically $100K to $400K per year.
- Visible Alpha — granular consensus estimates at the line-item level. Typically $100K to $300K per year.
- SEC EDGAR — free to access, but most funds pay for normalized feeds from vendors such as AlphaSense (which acquired Sentieo in 2022).
Alternative Data
Alternative data is where funds search for true edge. This category has exploded in the last decade and now includes hundreds of specialized vendors.
- Second Measure (Bloomberg) — consumer credit card transactions. Typically $150K to $500K per year.
- Yodlee, Earnest Analytics, Consumer Edge — alternative transaction panels. $100K to $400K per year each.
- SafeGraph, Placer.ai, Advan — foot traffic and geolocation data. $50K to $300K per year depending on coverage.
- Orbital Insight, RS Metrics, Planet Labs — satellite imagery and derived signals. $100K to $1M+ per year.
- Thinknum, 7Park Data — web-scraped company signals. $30K to $250K per year.
- RavenPack, Dataminr — news analytics and sentiment. $75K to $500K per year.
- Sensor Tower, Apptopia — app usage and downloads. $50K to $250K per year.
- MarineTraffic, ImportGenius, Panjiva — shipping, trade, and supply chain. $20K to $200K per year.
- LinkUp, Revelio Labs — job postings and labor market signals. $75K to $300K per year.
Fixed Income and Credit
- TRACE — free corporate bond trade reporting, but cleaned feeds from vendors cost $25K to $100K per year.
- IHS Markit / S&P CDS and credit data — typically $100K to $500K per year.
- Intex — structured product cash flow modeling. $100K to $400K per year.
- LSTA LoanX — leveraged loan pricing. $50K to $200K per year.
Macro and Economic Data
- Haver Analytics — macroeconomic time series. Typically $30K to $150K per year.
- FRED, World Bank, IMF — free sources used by most funds.
- Macrobond — integrated macro database. $15K to $60K per user per year.
ESG Data
- MSCI ESG, Sustainalytics, Refinitiv ESG — typically $75K to $400K per year each.
- Trucost (S&P) — carbon and environmental metrics. $50K to $200K per year.
How Funds Evaluate New Datasets
Hedge funds typically follow a structured evaluation process before licensing a new dataset. This usually involves a trial period of 30 to 90 days, during which researchers backtest the data against their existing signals, check its coverage and point-in-time accuracy, test for look-ahead bias, and estimate its marginal Sharpe contribution. Only a small fraction of evaluated datasets make it into production.
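As a rough illustration of two of these checks, the Python sketch below filters records to what was actually published by a given date (a simple guard against look-ahead bias) and estimates a marginal Sharpe contribution by blending a candidate return stream into an existing one. The function names, the "published" field, the 20% blend weight, the zero risk-free rate, and the sample returns are all assumptions made for illustration, not any fund's actual methodology.

```python
import math
from datetime import date
from statistics import mean, stdev


def visible_as_of(records, as_of):
    """Keep only records published on or before `as_of`.

    Assumes each record is a dict with a hypothetical "published" date field.
    Using anything published later would introduce look-ahead bias.
    """
    return [r for r in records if r["published"] <= as_of]


def sharpe(returns, periods_per_year=252):
    """Annualized Sharpe ratio, assuming a zero risk-free rate."""
    sd = stdev(returns)
    if sd == 0:
        raise ValueError("return series has zero volatility")
    return mean(returns) / sd * math.sqrt(periods_per_year)


def marginal_sharpe(existing, candidate, weight=0.2, periods_per_year=252):
    """Sharpe of the blended book minus Sharpe of the existing book.

    `weight` is the (assumed) share of risk given to the candidate signal.
    """
    if len(existing) != len(candidate):
        raise ValueError("return series must have the same length")
    blended = [(1 - weight) * e + weight * c for e, c in zip(existing, candidate)]
    return sharpe(blended, periods_per_year) - sharpe(existing, periods_per_year)


# Made-up daily returns, purely for illustration.
existing_returns = [0.01, -0.01, 0.02, -0.01, 0.01]
candidate_returns = [0.0, 0.02, -0.01, 0.02, 0.0]
uplift = marginal_sharpe(existing_returns, candidate_returns, weight=0.2)

# Made-up vendor records: only the first was available on 15 January 2024.
records = [
    {"ticker": "AAA", "published": date(2024, 1, 10)},
    {"ticker": "AAA", "published": date(2024, 2, 1)},
]
usable = visible_as_of(records, date(2024, 1, 15))
```

A positive uplift suggests the candidate diversifies the existing book at the chosen weight. In a real trial the series would span years rather than five days, and the result would be checked across several weights and sub-periods.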
The largest multi-strategy firms, such as Citadel, Point72, and Millennium, spend $50M to $200M+ per year on data, maintain dedicated data sourcing teams, and often buy exclusive rights to promising alternative datasets.
Learn More
If you are preparing for a quant research or data science role at a hedge fund, knowing the vendor landscape is a concrete way to demonstrate industry awareness. Explore our companies directory to see which firms are known for heavy data investment, and check our quant job listings for open data science and quantitative research positions.