Macroeconomic nowcasting with internet search data – Anna Simoni

We are delighted to present the project of Anna Simoni, Hi! PARIS Fellow 2021.

Anna Simoni is a senior researcher at CNRS and a professor at ENSAE Paris and École Polytechnique. She conducts her research at CREST (Center for Research in Economics and Statistics, a joint research unit of CNRS, ENSAE Paris – Institut Polytechnique de Paris, École polytechnique – Institut Polytechnique de Paris, GENES).

Her specialty is econometrics. She leads a project for Hi! PARIS entitled Macroeconomic nowcasting with high-dimensional Google Search data: theory and practice.

Macroeconomic nowcasting with internet search data 

Anna Simoni is deeply interested in economic issues and their implications for our societies. During her PhD at the Toulouse School of Economics, she specialized in econometrics, a field that aims at testing and estimating economic models. “I study how statistical methods need to be modified and adapted to answer questions of economic interest,” she explains. Anna Simoni is a senior researcher at CNRS, which awarded her a Bronze Medal in 2019, a high distinction for young scientists. She is also a professor of statistics and econometrics at ENSAE and École Polytechnique, and is affiliated with the Center for Research in Economics and Statistics (CREST).

The theoretical tools she develops make it possible to draw inferences about causal relationships in economic models and to make forecasts. Examples include estimating the effects of a policy on economic actors such as companies, central banks or administrations, or forecasting macroeconomic aggregates such as a country’s GDP. The latter example is related to her project for Hi! PARIS, which will develop tools for nowcasting macroeconomic and financial indicators, that is, estimating their current value before official figures are released. Official institutions already estimate and publish these indicators, but not in real time. “For instance, to know the GDP in the first quarter of 2022, you have to wait until mid-May, when official series are published,” says Anna Simoni.

To avoid this delay, she relies on alternative data sources coming from the internet. These data are produced by Google, which monitors the keywords typed into its search engine and classifies them into categories and subcategories such as business, news, health, jobs & education, etc. Google has maintained this database since January 2004, and researchers can access the variation in search volume in each category on a weekly basis. “These non-official data are biased because they only reflect the behaviour of a sub-population that uses the internet. Nonetheless, it can be useful to exploit it because the amount of information is huge”. Considering only the six main countries of the euro area, there are already over 1800 categories and about 700 time series. With such a large number of variables, the term “big data” is no exaggeration. One of the biggest challenges is therefore to select the most relevant variables in order to predict quantities such as the GDP growth rate.

This is where artificial intelligence and machine learning come into play, by helping researchers extract information from the data. Of course, these tools have to be properly tailored and grounded in a clear theoretical framework. “We have already developed an efficient procedure and preliminary results are very good, even though there are still open questions that we will try to answer thanks to the Hi! PARIS project,” says Anna Simoni. Not only can Google Search data help predict the GDP growth rate in real time when no official information is available, but when such information is available, Google Search data can also improve nowcasting accuracy. Finally, Anna Simoni’s work also explores how well these predictions perform during different phases of the economic cycle, for example recessions such as in 2008-2009, periods of stability, or periods when the trend reverses. Providing such insights could be critical for policy-makers who need to react quickly to changes in the economy.
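To make the variable-selection challenge concrete, here is a minimal sketch of one generic approach to nowcasting with many predictors: keep the few series most correlated with the target, then fit a ridge regression on them. This is an illustration under stated assumptions and is not presented as the procedure developed in the project. The data are synthetic stand-ins for search-volume series, and the function name `screen_and_ridge` and the parameters `k` and `alpha` are hypothetical choices made for this example.

```python
import numpy as np


def screen_and_ridge(X, y, k=10, alpha=1.0):
    """Keep the k columns of X most correlated with y, then fit ridge on them.

    X: (n_periods, n_series) predictors; y: (n_periods,) target.
    Assumes no column of X is constant (otherwise its correlation is undefined).
    Returns (selected column indices, ridge coefficients, intercept).
    """
    Xc = X - X.mean(axis=0)  # center predictors
    yc = y - y.mean()        # center target

    # Step 1 (screening): absolute sample correlation of each series with y.
    corr = np.abs(Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    selected = np.sort(np.argsort(corr)[-k:])

    # Step 2 (shrinkage): closed-form ridge estimate on the retained series.
    Xs = Xc[:, selected]
    beta = np.linalg.solve(Xs.T @ Xs + alpha * np.eye(k), Xs.T @ yc)
    intercept = y.mean() - X[:, selected].mean(axis=0) @ beta
    return selected, beta, intercept


# Synthetic example (hypothetical data): 120 periods, 300 candidate series,
# of which only the first three actually drive the target.
rng = np.random.default_rng(0)
n_periods, n_series = 120, 300
X = rng.standard_normal((n_periods, n_series))
y = X[:, 0] + X[:, 1] + X[:, 2] + 0.1 * rng.standard_normal(n_periods)

# Fit on all periods but the last, then "nowcast" the last one, whose
# predictors are observed while the target is treated as not yet published.
selected, beta, intercept = screen_and_ridge(X[:-1], y[:-1], k=10, alpha=1.0)
nowcast = intercept + X[-1, selected] @ beta

print("selected series:", selected)
print("nowcast:", nowcast, "actual:", y[-1])
```

In practice `k` and `alpha` would be tuned, for instance by out-of-sample validation, and weekly search data would first need to be aligned with the quarterly target; both steps are left out of this sketch.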