Finance Theory Group

Finance Theory Insights

Data Abundance and Asset Price Informativeness

Jérôme Dugast, Thierry Foucault

Based on: Journal of Financial Economics, 2018, 130 (2), 367-391. DOI:

Prices of financial securities may become less informative about their value when investors have easier and faster access to digital information.


Digital technology (the representation of information in bits) considerably increases the volume and diversity of data. Furthermore, progress in computing power has reduced data processing costs and fostered the emergence of powerful forecasting techniques (“Artificial Intelligence”). This “Big data” revolution changes how information is obtained, processed, and used by information intermediaries and investors. Does it make asset prices more informative about fundamentals?

In fact, this is not necessarily the case, because digital technology changes the dynamics of information production: It makes imprecise signals more quickly available after the arrival of news, which undermines incentives to produce more precise signals.  

Consider the arrival of news about a public firm, e.g., a new regulatory filing, an earnings conference call, or a new product. News analytics enable investors to quickly obtain a signal about the firm’s future earnings. However, this early ("raw") signal is less precise than the ("processed") signal that one can obtain by collecting and analyzing additional information. This would not be a problem, and would even enhance price informativeness, if subsequent efforts in producing information about future earnings were not affected by the arrival of the early signal. However, this is not the case: An increase in the demand for early, but imprecise, signals after new data arrival reduces investors’ incentives to further process the data.

Bad signals drive out good signals

Information production is a dynamic process in which the precision of signals extracted from new data gradually increases over time. After new data (newswires, social media, regulatory filings, etc.) become available, investors can obtain two types of signals about the future earnings of a firm (or any event resolving uncertainty about the payoff of an asset): (i) a “raw signal”, which is available quickly but is imprecise, and (ii) a “processed signal”, which is precise but takes more time to obtain. The raw signal is informative, but its quality is uncertain: There is a chance that it is just noise. Its true quality can only be discovered by processing the data further, and it is revealed by the processed signal (e.g., think of hiring an analyst to research whether news about a firm is relevant for its future earnings).

Investors obtain their signals from information intermediaries (e.g., news analytics providers for the raw signal and securities analysts for processed signals) who possess the skills or technology required for data processing. The market for information is competitive: (i) information intermediaries charge a fee for each type of signal such that their revenues just cover their fixed production cost and (ii) the number of investors buying each type of signal adjusts so that the expected trading profit, net of information fees, on each type of signal is nil.
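
The two zero-profit conditions can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a hypothetical gross trading profit per informed investor, a / (1 + n)**2, that falls as the mass n of signal buyers rises, and a fee equal to fixed_cost / n so that the intermediary just breaks even. The function name, the profit function, and the parameters are all illustrative assumptions.

```python
import math


def equilibrium_demand(a: float, fixed_cost: float) -> float:
    """Mass of signal buyers in a toy free-entry equilibrium.

    Assumptions (hypothetical, not taken from the paper):
      * gross expected trading profit per buyer is a / (1 + n)**2,
      * the intermediary charges fee = fixed_cost / n (it breaks even),
      * buyers enter until trading profit net of the fee is zero.
    Combining the two conditions gives n * a / (1 + n)**2 = fixed_cost.
    """
    if a <= 0 or fixed_cost <= 0:
        raise ValueError("a and fixed_cost must be positive")
    # Rearranged: fixed_cost*n**2 + (2*fixed_cost - a)*n + fixed_cost = 0.
    disc = a * a - 4.0 * a * fixed_cost
    if disc < 0:
        # No level of demand generates enough revenue to cover the fixed cost.
        return 0.0
    # Keep the larger root: beyond it, further entry makes net profit negative.
    return ((a - 2.0 * fixed_cost) + math.sqrt(disc)) / (2.0 * fixed_cost)


if __name__ == "__main__":
    for cost in (0.30, 0.25, 0.16):
        print(cost, equilibrium_demand(1.0, cost))
```

In this toy setting, demand is zero while fixed_cost exceeds a / 4 and becomes strictly positive as soon as the cost falls to that threshold, which conveys why a signal market needs enough buyers to be viable.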

Once investors obtain their signal, they can trade on it with competitive, risk-neutral dealers and other uninformed traders, who do not buy the raw and processed signals or lack the skills and technology to produce them. Thus, the availability of the raw signal increases information disparities between investors and dealers. Yet, dealers extract some information about investors' signals from the aggregate demand for the asset and set its price accordingly.

As new information arrives only when the signals are made available, there are only two dates of interest before uncertainty about the firm’s earnings is resolved: the date at which the raw signal becomes available (date 1) and the date at which the processed signal becomes available (date 2). Of interest are the dynamics of the asset price and the pricing error, i.e., the mean squared difference between the asset price and its payoff. At each point in time, the informativeness of the asset price is measured by the percentage reduction in the pricing error relative to the case in which investors have no information at all.
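
The informativeness measure described above can be sketched in a few lines. The snippet assumes that, with no information, the price equals the prior mean of the payoff (a natural benchmark with competitive, risk-neutral dealers, but an assumption of this sketch), and it estimates the pricing error as a sample average. The function name and arguments are hypothetical.

```python
from statistics import fmean


def price_informativeness(prices, payoffs, prior_mean):
    """Fractional reduction in the mean squared pricing error.

    Benchmark (assumption of this sketch): with no information,
    the price is simply prior_mean, the prior expectation of the payoff.
    Returns 0.0 when prices are no better than the benchmark and 1.0
    when prices equal payoffs exactly.
    """
    if len(prices) != len(payoffs) or len(prices) == 0:
        raise ValueError("prices and payoffs must be non-empty and equal in length")
    # Pricing error: mean squared difference between price and payoff.
    mse_price = fmean([(p - v) ** 2 for p, v in zip(prices, payoffs)])
    # Pricing error when the price carries no information at all.
    mse_no_info = fmean([(prior_mean - v) ** 2 for v in payoffs])
    if mse_no_info == 0:
        raise ValueError("payoffs have no dispersion around prior_mean")
    return 1.0 - mse_price / mse_no_info


if __name__ == "__main__":
    payoffs = [0.0, 1.0, 0.0, 1.0]
    date1_prices = [0.4, 0.6, 0.4, 0.6]  # illustrative, partially revealing
    date2_prices = [0.2, 0.8, 0.2, 0.8]  # illustrative, more revealing
    print(price_informativeness(date1_prices, payoffs, prior_mean=0.5))
    print(price_informativeness(date2_prices, payoffs, prior_mean=0.5))
```

Multiplying the result by 100 gives the percentage reduction in the pricing error; the prices in the demo are made-up numbers, not values from the paper.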

Figure 1 shows the informativeness of the price at date 1 (dashed red line) and at date 2 (solid blue line) as a function of the cost of producing the raw signal. The figure is easiest to read from right (high cost) to left (low cost). As the cost of producing the raw signal declines, the informativeness of the asset price at date 2 declines (from about 70% when this cost is large to about 58% when it is zero), while the informativeness of the asset price shortly after news arrival (date 1) improves (from 0% to 58%).



The reason for these patterns is as follows. When the cost of producing the raw signal becomes sufficiently small (i.e., at the point where price informativeness at date 1 jumps from 0% to 30% in Figure 1), the market for the raw signal becomes viable. That is, information intermediaries who produce this signal can charge a fee low enough to attract sufficient demand to cover their fixed cost. This triggers a jump in the demand for the raw signal and therefore in the volume of trading on this signal. As a result, the informativeness of the price at date 1 improves, and the improvement is larger when the demand for the raw signal is stronger (i.e., when the cost of producing this signal is lower).

In turn, because the pricing error at date 1 is smaller, average trading profits on the processed signal at date 2 decrease. Thus, the demand for the processed signal declines. In other words, the demand for the raw signal crowds out the demand for the processed signal. As fewer investors trade on the processed signal, the price of the asset at date 2 becomes less informative (the blue line goes down). The improvement in price informativeness from date 1 to date 2 therefore becomes smaller (the blue line moves closer to the red line). Equivalently, the pricing error at date 2 increases relative to a world in which the raw signal is not available.

When the cost of the raw signal becomes small enough, the demand for the processed signal is so low that supplying it is no longer profitable: revenues from the sale of this signal are insufficient to cover the fixed cost of producing it. Consequently, information production stops after date 1 and price informativeness does not improve thereafter (the dashed red line overlaps with the solid blue line in Figure 1). In this case, as the raw signal is less precise than the processed signal, the asset price is less informative than if trading on the raw signal were impossible.

In sum, the quick availability of signals after the arrival of new data can crowd out the demand for more fundamental research (e.g., by securities analysts). This crowding out occurs when the cost of producing the processed signal is below a threshold. Otherwise, the availability of the raw signal improves price informativeness at both dates. To see why, consider the extreme case in which the cost of producing the processed signal is so high that there is no demand for it, whether or not the raw signal is produced. In this case, the availability of the raw signal at a low cost necessarily improves price informativeness relative to the case in which this signal is not available, because some investors buy this signal and the asset price can then reflect, at least partially, their information. This logic applies as long as the demand for the processed signal is low, i.e., as long as the cost of producing the processed signal is large enough.

Why is this a new problem?

Filtering out the noise in new data takes time. Thus, the precision of investors' signals about events resolving uncertainty (e.g., earnings or macroeconomic announcements) increases as these events approach. However, an improvement in the precision of early signals (due to a reduction in the cost of producing these signals) reduces the profitability of trading on subsequent, more precise signals, because by the time these signals are available the asset price already contains some information. Thus, a reduction in the cost of obtaining early signals makes producing more precise signals less attractive, which ultimately harms the informativeness of prices about future cash flows. This crowding-out mechanism is not specific to the digital world. However, it has become more acute because digitization and increased computing power make raw signals available ever more quickly.

One implication is that a decline in the cost of obtaining raw signals should make earnings announcements (or any announcements resolving uncertainty about asset payoffs) more informative, because asset prices just before these announcements become less informative. There is empirical evidence that algorithmic trading impairs price discovery prior to earnings announcements and therefore makes these announcements more informative. This negative effect of algorithmic trading on price discovery ahead of earnings announcements might stem from news-reading algorithms and their negative effect on the quality of analysts’ forecasts.


A key function of financial markets is to guide capital allocation by providing signals about firms' fundamentals. Paradoxically, the growth of alternative data and new techniques to process these data could reduce the accuracy of these signals by crowding out incentives for fundamental research.


Jérôme Dugast

Associate Professor of Finance

Université Paris Dauphine - PSL

Thierry Foucault

Chaired Professor of Finance

HEC Paris