How informative are web searches for risk communication during COVID-19 in Germany?

Background: Risk communication during pandemics is of paramount importance. Assessing the level of public concern requires expensive and time-consuming surveys. We hypothesize that the relative search volume from Google Trends could be used as an indicator of public concern towards prevention measures as well as of the adequacy of the official messages spread. Methods: The search terms ‘RKI’, ‘corona’ and ‘protective mask’ in German were shortlisted. Cross-correlations between these terms and the reported cases from February 15th to April 27th were computed for each German federal state. The findings were contrasted against a timeline of official communications concerning COVID-19. Results: The highest correlations of the term ‘RKI’ (Robert Koch Institute, the national public health authority in Germany) with reported COVID-19 cases were found between lags of -2 and -12 days, meaning web searches were already performed two to twelve days before case numbers increased. A similar pattern was seen for the term ‘corona’. Cross-correlations indicated that most searches on ‘protective mask’ were performed six to twelve days after the increase of cases. Conclusions: The results for the term ‘protective mask’ indicate some degree of confusion in the population, which is supported by the contradictory recommendations on the wearing of face masks over time. In addition, the relative search volumes could be a useful tool to provide timely information on location-based risk communication strategies.


Introduction
COVID-19 (Coronavirus Disease 2019) is caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2). The global spread of the virus led to COVID-19 being classified as a pandemic in March 2020, affecting the lives of billions of people. At the time of writing, more than 29 million confirmed cases and nearly 950,000 deaths have been reported worldwide [1]. In Germany, around 260,000 confirmed cases and 9,400 deaths have been reported in its first wave [2].
The secondary attack rate of the infection and its enormous deadly toll highlighted the importance of public preventive measures to contain the spread of the virus and reduce the burden on health systems.
In addition to government-imposed contact restrictions and other policies, there are a number of individual hygiene measures, such as social distancing, regular hand washing and the wearing of masks, that serve to contain the spread and to protect the individual [3].
Risk communication strategies aim at increasing the capacity of the public to act as an effective response partner by encouraging prevention measures among the population [4,5]. In past research, several factors have been found to influence the persuasiveness of communications. These include authority credibility, transparency, perceived expertise and motivation to strive for a common interest [6]. During a pandemic, the actual change in behaviour is of great importance, as the implementation of measures on an individual level is imperative. In several health behaviour theories, such as the Health Belief Model or the Protection Motivation Theory, the perceived severity of an event is, among other factors, a key element in the process of developing and adapting health behaviour [7,8]. Therefore, the timing and, more importantly, the content of risk communication procedures need to be appropriately adapted to their target population.
Collecting data on the perceived severity and public concern regarding an event is both time-consuming and costly. Therefore, in order to retrieve faster and cheaper insights, it may be worthwhile to look at web searches, since the Internet is a common source for finding health-related information [9]. Web searches are not only a valuable resource for individuals seeking health information, but also for the scientific community, as search queries contain geospatial and temporal data. Numerous studies have shown how analysing web search queries can assist in describing and predicting outbreaks, especially in the context of sparse data [10][11][12]. Less frequently, web searches are analysed to describe the concern in the population in connection with an event or to get an idea of the general mood towards certain topics [13].
The aim of this study is to explore the usefulness of clustered web search queries as an indicator of public concern for informing and adapting risk communication strategies in the context of the ongoing pandemic in Germany.

Data analysis
Statistical analyses were conducted using SPSS version 25 and R version 3.6.2. Line plots in combination with histograms, and cross-correlations were generated to assess the association between Google search trends and the officially reported COVID-19 cases.
Cross-correlations were used to characterize the time dependence between time series from the two data sources. The shift between the two time series is termed the lag, and its sign indicates which of the two series leads. A lag of -1 suggests that a peak in Google Trends precedes a peak in the officially reported COVID-19 cases by one day, and vice versa for positive lags.
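The lag convention described above can be illustrated with a minimal sketch. It is not the SPSS or R routine used in the study: the function names `pearson` and `cross_correlation` and the toy series are hypothetical, and each lag is evaluated as a plain Pearson correlation on the overlapping part of the two series, which is a simplification of standard cross-correlation estimators.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def cross_correlation(searches, cases, max_lag):
    """Return {lag: r}, pairing searches[t + lag] with cases[t].

    A negative lag means the search series leads the case series.
    """
    n = len(cases)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        # Keep only the time points t for which t + lag is a valid index.
        t_values = [t for t in range(n) if 0 <= t + lag < n]
        s = [searches[t + lag] for t in t_values]
        c = [cases[t] for t in t_values]
        result[lag] = pearson(s, c)
    return result

# Toy data (hypothetical): the case curve is the search curve delayed by 3 days.
searches = [0, 1, 3, 7, 12, 18, 12, 7, 3, 1, 0, 0, 0, 0]
cases = [0, 0, 0, 0, 1, 3, 7, 12, 18, 12, 7, 3, 1, 0]

ccf = cross_correlation(searches, cases, max_lag=5)
best_lag = max(ccf, key=ccf.get)
print(best_lag)  # -3: searches precede cases by three days
```

In this toy example the highest coefficient occurs at a lag of -3, which under the convention above reads as "searches peak three days before reported cases".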
Both time series were normalized to account for the different units and scales (RSV and absolute number of daily new cases). Thresholds for interpreting a correlation coefficient suggested by Hinkle et al. were applied [14].
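As a rough illustration of these two steps, the sketch below assumes z-score normalization (the text does not state which normalization was used) and the rule-of-thumb bands commonly attributed to Hinkle et al. (0.90 very high, 0.70 high, 0.50 moderate, 0.30 low, below that negligible). The function names `z_normalize` and `interpret_r` are hypothetical.

```python
from statistics import mean, pstdev

def z_normalize(series):
    """Rescale a series to mean 0 and standard deviation 1 (assumed method)."""
    mu, sigma = mean(series), pstdev(series)
    return [(value - mu) / sigma for value in series]

def interpret_r(r):
    """Label the size of r using the bands commonly attributed to Hinkle et al."""
    size = abs(r)
    if size >= 0.90:
        return "very high"
    if size >= 0.70:
        return "high"
    if size >= 0.50:
        return "moderate"
    if size >= 0.30:
        return "low"
    return "negligible"

rsv = [12, 30, 100, 64, 41]          # hypothetical relative search volumes
new_cases = [5, 40, 310, 950, 720]   # hypothetical daily new cases

print([round(v, 2) for v in z_normalize(rsv)])
print(interpret_r(0.804))  # high
print(interpret_r(0.602))  # moderate
```

Note that the Pearson coefficient itself is unchanged by such a linear rescaling; putting both series on a common scale mainly helps when they are plotted or compared side by side.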
The risk communication milestones were reviewed for their content and relevance to the present study by two independent reviewers. Disagreements were resolved by discussion.

Time trends
Histograms of officially reported cases on a national level in combination with line graphs of Google search queries for 'RKI', 'corona' and 'protective mask' at state level are shown in Figures 1 to 3. The most relevant communications and announcements are also presented in these figures. Figures 1 and 2 show that the RSV of the search terms 'RKI' and 'corona' reach their peak (mid-March) before the peak of the epidemic (beginning of April) for all German federal states. The RSV of 'corona' shows a second peak of searches in mid-April. Meanwhile, the RSV of 'protective mask' presents three peaks, with the highest peak at a point where the curve of confirmed cases is already flattening (end of April) (Figure 3).

Time-series cross-correlations
The linear association between officially reported COVID-19 cases and Google Trends patterns for the search terms 'RKI', 'corona' and 'protective mask' was assessed using cross-correlations, as shown in Table 1 to Table 3.
Visual assessment of Table 1 and Table 2, which refer to the search terms 'corona' and 'RKI' respectively, reveals that the correlations reach a high level in most federal states with the exception of Brandenburg, Bremen, Mecklenburg-Western Pomerania and Saarland. These federal states are described separately from the rest in the following.
For the group of federal states showing high correlations, the search term 'corona' reached high coefficients (r = 0.707 to r = 0.804) when searches preceded the reported cases by several days (range of lags: -5 to -12 days). In the same group of states, high correlations (r = 0.711 to r = 0.856) for 'RKI' were reached when searches preceded the reported cases by two to ten days (range of lags: -2 to -10 days).
In the federal states Brandenburg, Bremen, Mecklenburg-Western Pomerania and Saarland, only moderate correlations (r = 0.453 to r = 0.688) were reached for the term 'corona'. Correlations were highest when the searches preceded the reported cases (range of lags: -5 to 14 days). For these states, a similar trend was observed for the term 'RKI' (r = 0.449 to r = 0.682; range of lags: -10 to 0 days).
A different picture results from the search term 'protective mask'. In contrast to the other two search terms, reported cases preceded the web searches for 'protective mask' by five to 14 days (range of lags: +5 to +14 days), with correlation coefficients indicating low to moderate positive correlations in all federal states (r = 0.400 to r = 0.602).

Risk communication milestones
Following the timeline of the data period (February 15th to April 27th) chronologically (Figure 4), an important announcement was made on March 2nd, when the RKI classified the risk as "moderate". Just over a week later, on March 11th, COVID-19 was classified as a pandemic by the WHO. On March 13th, most of the German states announced that they would temporarily close the schools. On March 17th, the RKI classified the risk as "high" and all schools and day care centres were closed. Five days later, on March 22nd, the Federal Government and the federal states agreed on guidelines for restricting social contacts. Central elements were the prohibition of staying in public places with more than one other person not living in the same household, and the closure of restaurants, bars, and service providers with physical contact, such as hairdressers or massage studios.
On March 31st, the German government expressed its opposition to the obligation to wear masks, as medical masks should be reserved for health care personnel. On April 2nd, the RKI extended its recommendation regarding wearing masks. While wearing masks was previously recommended for people with respiratory diseases, the RKI now recommended it for the general population. A few days later (April 6th), compulsory wearing of masks was introduced in Jena, a city in the federal state of Thuringia.

Discussion
This study aimed to explore the usefulness of clustered web search queries as an indicator of the degree of public concern for informing and adapting risk communication strategies in the context of the ongoing COVID-19 pandemic in Germany.
The findings showed that different search terms related to the pandemic had different trends over the course of the first wave. While searches for the terms 'RKI' and 'corona' reached their peak before the number of cases peaked, the term 'protective mask' reached its peak when the curve was already flattening.
The fact that the RSV of 'RKI' followed the same trend as the searches for the term 'corona' could indicate that the institute is trusted by the public. At times when the need for information on 'corona' was at its highest, web searches for the national public health authority were also at their highest level. This hypothesis is supported by data from the GESIS Panel Special Survey on the Coronavirus SARS-CoV-2 Outbreak in Germany. The data show that the RKI enjoys the greatest public trust in dealing with the pandemic compared to other institutions [15].
The temporal relationship between the RSV of 'corona' and the reported cases suggests that the term has a certain predictive value that could be used in the context of sparse data. A study by Effenberger et al. examined this temporal relationship in different countries and also found that the peaks of web searches for 'coronavirus' preceded the peaks of the reported cases by 11.5 days on average [16]. However, the visual examination of data from this study also suggests that the RSV is closely linked to the announcements of new policies or policy changes, since the second peak of the RSV coincided exactly with the announcement of the extension of the contact restriction policies.
The observation that an increase in web searches on 'RKI' and 'corona' preceded the increase of reported cases was also described in a similar study that monitored search terms related to COVID-19 such as 'corona', 'handwashing' and 'masks'. In contrast to the present study, however, web searches for 'face mask' and 'surgical mask' in that study behaved in the same way and were performed before most cases were reported [13]. Why, then, did the search term 'protective mask' show a different trend in Germany? The long absence of a nationwide uniform obligation to wear a mask, while such obligations had already been introduced at local level, could have led to confusion and an increased need for research on this topic on the Internet. In addition, there have been conflicting recommendations from different institutions as to whether masks should be worn by the general public or primarily by healthcare professionals. This public debate and uncertainty is reflected in the web searches for 'protective mask'.
The previous example shows how the RSV of search terms can help to inform risk communication strategies in order to assess the degree of public concern about certain issues. In a similar study that analysed trends in influenza-related search queries, it was also suggested that Google Trends identifies public perceptions of the influenza threat rather than representing the incidence of influenza [17].
In addition, the RSV of search terms can help to evaluate the adequacy of the announcements and messages disseminated to the target public. We were able to show that at a time when web searches for 'corona' and public concern were at their highest level, relatively few people informed themselves about preventive measures such as the wearing of protective masks. In retrospect, this is important information for the risk communication strategy, as it would be desirable for the public to inform itself about preventive measures at the peak of the pandemic or, even better, earlier.
The results of this study are subject to some limitations that should be considered. First of all, the intentions behind the web searches analysed are unclear. Different motives, such as seeking health information, updating oneself on current policies, or purchasing face masks, might have led individuals to search for the terms, so the searches do not necessarily represent certain behaviours towards any preventive measures or the disease itself. Furthermore, it is known from the literature that disease-related web searches are subject to media-driven interest known as "celebrity phenomena" [18], which refers to increased internet queries for certain terms that are temporarily popular in the media.
Another important limitation arises from the limited applicability of the analysis method in federal states with low absolute populations, as was the case in Brandenburg, Bremen, Mecklenburg-Western Pomerania and Saarland. As the results showed, no high correlations were found in these states. It is therefore questionable to what extent the presented method can be applied in areas with a low population.
Some search terms that are highly relevant for disease prevention and personal protection, namely hand washing and disinfection, could not be included in the study due to the low search frequency at state level. There could be various reasons why these terms were not searched for as frequently. It is possible that there was no need in the population to inform themselves about these topics.
As a final limitation, it should be mentioned that RSVs do not provide information about the absolute number of web searches. Therefore, it is not possible to draw conclusions about the actual number of searches performed, but only about their trends over time.
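This limitation can be made concrete with a small sketch. It assumes a simplified view of the RSV, namely that each series is rescaled so that its own maximum equals 100; the actual Google Trends procedure involves further steps (such as relating a term to total search volume and sampling) that are not modelled here. The function name `to_rsv` and the counts are hypothetical.

```python
def to_rsv(counts):
    """Scale a series so that its maximum becomes 100 (simplified RSV)."""
    peak = max(counts)
    return [round(100 * c / peak) for c in counts]

# Hypothetical absolute search counts for two regions of very different size.
small_region = [10, 20, 40, 20]
large_region = [1000, 2000, 4000, 2000]

print(to_rsv(small_region))  # [25, 50, 100, 50]
print(to_rsv(large_region))  # [25, 50, 100, 50]
```

Both regions yield the same scaled series although their absolute search counts differ by a factor of 100, which is why only the shape of the trend, not the volume, can be interpreted.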
Nevertheless, in this study we could present a novel tool for the fast assessment of public concern in times of an ongoing pandemic and show how web searches can also inform about the adequacy of risk communication strategies.

Conclusion
In the context of an ongoing pandemic, using RSVs of key terms related to the disease can help to inform about public concern. The results suggest that the data can also be used to assess the appropriateness of the messages disseminated. The example of the protective masks showed how initial public controversies and contradictions were reflected in the search behaviour of the population, which only informed itself about protective masks when the curve was already flattening. Further studies are needed to assess the real impact of inaccurate risk communication on the actual development of the pandemic.

Conflict of Interest
The authors declare no competing interests.

Funding Declaration
None.

Data availability
The Google Trends dataset generated for the current study is available and can be replicated at:

Authors' Contributions
RS developed the concept and design of the study, made substantial contributions to the analysis and interpretation of data, and critically revised the manuscript. KK performed the analyses, interpreted the data, wrote the first draft of the article and made substantial contributions to the conception of the work. EL made substantial contributions to the analysis and interpretation of data and critically revised the manuscript. JM revised the last version of the manuscript before submission. All authors have read and approved the manuscript in its current form.

Ethics declarations
Ethical approval and consent to participate were not necessary as the study was based on openly available aggregated data.