Information-Seeking Under Threat: How the Characteristics of Web Searches Changed During the Pandemic

To adjust to novel and threatening environments, people seek information. Here, we examine whether and how a threatening global event (the pandemic) altered the characteristics of the information people sought out online. An analysis of queries submitted to the Google search engine revealed that people were more likely to submit queries for information that could guide action (i.e., “How to” and “How do” searches) during the pandemic relative to before, controlling for total search volume. This tendency may have contributed to the rapid adaptation observed in response to the pandemic. Indeed, stress levels reported weekly by 17K individuals predicted the proportion of “How to” and “How do” searches, controlling for COVID-19 related confinement. Markedly, population stress levels were more strongly associated with this high-level feature of web searches than they were with searches for specific terms such as “anxiety” or “stress”. In contrast, COVID-19 related confinement, but not stress levels, was associated with the proportion of “What” and “Why” questions submitted to Google, suggesting that the confinement was related to an increased desire for general knowledge. Key results were replicated across two countries (UK and US). The study suggests that in situations of high stress people ask questions that can guide action. An intriguing possibility is that tracking of this feature could be used to monitor population stress levels beyond the pandemic.

To examine if and how the type of information people sought was altered in response to this threatening event, we analyzed queries submitted to the Google search engine before and during the COVID-19 pandemic in the UK and US. Our investigation was guided by our theory of the key motives of information-seeking. According to this theory, people search for information to fulfill three main aims (Sharot & Sunstein, 2020; Kelly & Sharot, 2021): (i) guide action, (ii) increase comprehension of the world around them, and (iii) experience positive affect. These motives may be weighted differently in different situations. For example, when experiencing threat people may be more inclined to search for information that can guide adaptive action.
We quantified proxies of these features by examining the questions people submitted to the Google search engine. In particular, we first calculated the percentage of Google queries containing the question-words "How to" and "How do" submitted in each geographical region out of all searches submitted in that region, every week from the date the "National Emergency" was declared (UK: March 23rd, 2020; US: March 13th, 2020) through March 21st, 2021, as well as every week from January 1st, 2017 until the "National Emergency" was declared. The rationale was that asking "How to" and "How do" will likely result in information that can directly guide action (e.g., "How do you install Zoom?"). In a controlled experiment, we confirmed that participants believe that when asking "How to" and "How do" questions, people want information that can guide their action. Second, we did the same for "What" and "Why" questions. The rationale was that such questions are likely to fill information-gaps that can increase people's general sense of comprehension (e.g., "What is Zoom?"). That rationale was also tested and confirmed in a controlled experiment. Changes to the proportion of these queries cannot be explained by changes in the volume of Google searches during the pandemic, as we examined the change in the percentage of these queries out of all searches. Third, we quantified the valence of the most frequent questions submitted to the Google search engine by calculating the difference in the percentage of positive and negative words out of all words entered for these queries each week in the UK and US. This was done by matching the search words to an emotion lexicon database that includes 6,789 words scored on valence (Hu & Liu, 2004).
We then examined (i) how these features changed following the declaration of a "National Emergency" in the UK and US, and (ii) how these changes related to population stress levels in the UK. The latter was done by relating the three measures quantified weekly during the pandemic to the weekly stress reports of an average of 17,468 individuals in the UK. In addition, we related the three features of information-seeking to COVID-19 related confinement in each geographical region. This allowed us to dissociate the effects of stress on information-seeking from the effects of COVID-19 related confinement.

Results
The pandemic resulted in significant changes to high-level features of web queries. To assess the high-level features of web searches we first extracted the Google search volume index of "How to", "How do", "What" and "Why" questions separately. A Google search volume index is equal to the number of searches for the specific term of interest in a given week and region divided by the total number of searches for that same week and region. These percentages are then normalized to represent search interest relative to the highest percentage for that region for the entire time frame (i.e., January 1st, 2017 to March 21st, 2021; see Method for details). A Google search volume index of 100 denotes the peak popularity for the term, while an index of 0 means there was not enough search data to calculate the popularity of this term. Note that weekly changes to the Google search volume index cannot be explained by weekly changes in the total volume of Google searches, as the index reflects the percentage of specific queries out of all searches that week. We then averaged separately the (i) "How to" and "How do" scores and the (ii) "What" and "Why" scores. We also calculated a third feature, a Valence index, which indicates the valence of the most frequent questions submitted to the Google search engine. To calculate this index, we extracted the most frequent search queries that included the words "How", "What" or "Why" and calculated the difference between the proportion of positive words and the proportion of negative words in the terms submitted, using a library of words that have been categorized as positive or negative (Hu & Liu, 2004). We transformed this number to be on a scale from 0 to 100, such that it would be easily comparable to the other two indexes. A score of 100 denotes the most positively valenced score and a score of 0 the most negatively valenced score.
In a separate study (see Supplementary Experiment), we validated the proposition that when people submit "How to" and "How do" search queries they are primarily motivated to find information that can guide action and when submitting "What" and "Why" questions they are primarily motivated to increase their general understanding.
Plotting the three measures over time (Figure 2) reveals a sharp increase in "How to" and "How do" questions and "What" and "Why" questions submitted following the declaration of a "National Emergency". In the US a second peak is observed around October 2020, a time in which COVID cases were again rising (Dong et al., 2021). In contrast, there was no change to the valence of searches in either country.
Population stress-levels are selectively associated with asking "How to" and "How do". Thus far, we have shown that there is an increase in the proportion of "How to" and "How do" and "What" and "Why" searches submitted to the Google search engine during the pandemic relative to before. We next examined whether these were related to population stress levels. We had access to self-reported stress levels collected every week in the UK between March 21st, 2020 and March 21st, 2021. Approximately 70,000 unique individuals completed the survey, with an average of 17,468 individuals responding each week in the UK. Specifically, participants were asked to indicate if over the previous week they felt worried and/or stressed about any of the following factors: (i) catching Covid-19, (ii) becoming seriously ill from Covid-19, (iii) finances, (iv) unemployment and (v) getting food. We computed the mean proportion of individuals who reported stress or worry over these factors. We then conducted three linear models predicting on a weekly basis the Google search volume index in the UK of "How to" and "How do" questions, "What" and "Why" questions, and the valence of submitted questions from stress levels. We also included weekly UK COVID-19 confinement scores in the models to disentangle the effects of emotional stress on web searches from the effects of confinement due to restrictions placed by the Government. Covid-19 related confinement data for each week in the UK was obtained from the Oxford University COVID-19 Government Response Tracker (Webster et al., 2021).
The data includes ordinal variables coded by severity/intensity of confinement, on a daily basis (from January 1st, 2020 to March 21st, 2021), due to the following: (i) school and university closures, (ii) workplace closures, (iii) public event cancelations, (iv) restrictions on gatherings, (v) public transport restrictions, (vi) stay at home requirements, (vii) restrictions on domestic travel, and (viii) restrictions on international travel; see Table 3 for coding. To obtain weekly values, we computed weekly averages of the daily ratings. Finally, to quantify an overall COVID-19 related confinement score, we normalised all variables to range between 0 and 1 and then averaged the 8 transformed variables together.
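The three steps just described (weekly averaging, rescaling to the 0 to 1 range, and averaging across indicators) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, and the rescaling is assumed to be a min-max transform over the observed weekly values, which the text does not spell out at this point.

```python
def weekly_means(daily_values, days_per_week=7):
    """Average daily ratings in consecutive 7-day blocks.

    A trailing partial week is averaged over the days it contains.
    """
    chunks = [daily_values[i:i + days_per_week]
              for i in range(0, len(daily_values), days_per_week)]
    return [sum(chunk) / len(chunk) for chunk in chunks]


def min_max(values):
    """Rescale a series to [0, 1] using its observed minimum and maximum (assumption)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def confinement_score(daily_indicators):
    """Combine several daily ordinal indicators into one weekly score in [0, 1].

    daily_indicators: list of equal-length daily series, one per indicator
    (eight in the paper; any number works here).
    """
    scaled = [min_max(weekly_means(series)) for series in daily_indicators]
    n_weeks = len(scaled[0])
    return [sum(s[w] for s in scaled) / len(scaled) for w in range(n_weeks)]
```

For example, two toy indicators observed over three weeks, `[0]*7 + [1]*7 + [2]*7` and `[0]*7 + [3]*14`, rescale to `[0, 0.5, 1]` and `[0, 1, 1]`, so the combined weekly scores are 0, 0.75 and 1.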
Importantly, to account for simple temporal trends we first removed the linear trend from the dependent variables ("How to" and "How do" questions, "What" and "Why" questions, and the Valence Index) and predictor variables (stress scores and COVID-19 related confinement), using the detrend function in the 'pracma' R package. The detrended dependent and predictor variables were then Z-scored before being entered in the linear models.
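As an illustration of this preprocessing, the sketch below removes a least-squares linear trend, Z-scores each series, and then computes standardized regression coefficients for a model with two predictors. It is a plain-Python stand-in, not the authors' R code: the function names are hypothetical, the trend is assumed to be fitted against the week index, and the Z-score is assumed to use the sample standard deviation.

```python
from statistics import mean, stdev


def detrend(y):
    """Subtract the least-squares line fitted against the time index 0..n-1."""
    t = list(range(len(y)))
    t_bar, y_bar = mean(t), mean(y)
    slope = (sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
             / sum((ti - t_bar) ** 2 for ti in t))
    return [yi - (y_bar + slope * (ti - t_bar)) for ti, yi in zip(t, y)]


def zscore(x):
    """Center on the mean and divide by the sample standard deviation."""
    m, s = mean(x), stdev(x)
    return [(v - m) / s for v in x]


def corr(a, b):
    """Pearson correlation of two series that are already Z-scored."""
    return sum(u * v for u, v in zip(a, b)) / (len(a) - 1)


def standardized_betas(y, x1, x2):
    """Standardized coefficients of y regressed on x1 and x2 after detrending.

    With Z-scored variables the intercept is zero and the two coefficients
    follow in closed form from the pairwise correlations.
    """
    y, x1, x2 = (zscore(detrend(v)) for v in (y, x1, x2))
    r_y1, r_y2, r_12 = corr(y, x1), corr(y, x2), corr(x1, x2)
    denom = 1.0 - r_12 ** 2
    return (r_y1 - r_y2 * r_12) / denom, (r_y2 - r_y1 * r_12) / denom
```

In this sketch `x1` would play the role of the weekly stress series and `x2` the weekly confinement score; a dependent series that equals `x1` plus a pure linear drift yields coefficients of 1 and 0, because detrending removes the drift.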
The linear model predicting "How to" and "How do" questions from stress levels and COVID-19 related confinement scores revealed that both high stress (β = 0.211 ± 0.075 (SE), t(49) = 2.817, p = 0.007) and greater COVID-19 related confinement (β = 0.776 ± 0.075 (SE), t(49) = 10.346, p = 0.0001; Figure 3a) predicted the proportion of "How to" and "How do" queries (R² of the model = 0.742). In other words, the relationship between stress levels and "How to" and "How do" searches cannot be solely explained by increased restrictions on movement during the pandemic, as our model controls for COVID-19 related confinement. In contrast, the proportion of "What" and "Why" questions was selectively predicted by COVID-19 related confinement (β = 0.676 ± 0.105 (SE), t(49) = 6.405, p = 0.0001), but not stress levels (β = 0.106 ± 0.105 (SE), t(49) = 1.005, p = 0.320), overall R² = 0.492 (Figure 3b). Valence of searches was not predicted by either variable (stress: β = 0.238 ± 0.147 (SE), t(49) = 1.621, p = 0.112; COVID-19 related confinement: β = -0.044 ± 0.147 (SE), t(49) = -0.300, p = 0.766; R² = 0.013; Figure 3c).
Thus far, we have shown that the relative volume of queries that can direct action is tightly related to stress levels. Next, we wanted to test the predictive validity of this simple model. Specifically, we used the proportion of the UK sample reporting stress to predict the proportion of "How to" and "How do" searches using a leave-one-out analysis. To account for a simple temporal trend, we first removed the linear trend from the dependent variable ("How to" and "How do" questions) and the predictor variable (stress levels). The detrended dependent and predictor variables were then Z-scored before being entered in the simple linear model. The simple model was then run on all the data save for one time point, which was held out from the analysis. We then used the regression beta to predict the proportion of "How to" and "How do" searches of the left-out time point.
This process was repeated so that each week's proportion of "How to" and "How do" searches was estimated from the simple model parameters generated without using that week to fit the data. The actual proportion of "How to" and "How do" searches of a week (data) and the predicted proportion of "How to" and "How do" searches (estimation) were then correlated and also compared using a paired sample t-test. This analysis indicates whether stress levels in the UK are a good predictor of the proportion of "How to" and "How do" searches. We observed a correlation between the predicted proportion of "How to" and "How do" searches (estimate) and the actual proportion of "How to" and "How do" searches (data) (r(50) = 0.401, p = 0.003). The means of the two sets of values were not significantly different from one another (t = 0.098, p = 0.922). This analysis suggests that stress levels in a population are a good predictor of the proportion of "How to" and "How do" searches during the pandemic.
Finally, we tested whether stress was better predicted by "How to" and "How do" Google searches than by searches for specific content terms (e.g., "stress" and "anxiety"), which are often used in an attempt to predict population wellbeing (Yang et al., 2015; Barros et al., 2019). To test this, we ran a model predicting the proportion of the UK sample reporting stress from the Google search index of "How to" and "How do" questions as well as the Google search index for "stress" and closely associated terms (i.e., "anxiety" and "mental health"). Once again, the dependent and predictor variables were first detrended and then Z-scored. The volume of "How to" and "How do" questions was the only significant predictor of stress (β = 1.751 ± 0.497 (SE), t(47) = 3.520, p = 0.001; Figure 4). All other predictors were not significant (all p-values > 0.404). Similar results are observed if multiple linear models are run to predict stress each time from only one term. Once again, the "How to" and "How do" Google search index was the strongest predictor of stress (β = 1.748 ± 0.478 (SE), t(50) = 3.658, p = 0.0006), while no other predictor was significant (all p > 0.271).
Note that in all linear models we control for temporal trends by detrending the dependent and predictor variables. The same results are observed when detrending is not employed (see Supplementary Table).

Discussion
The global pandemic generated a new set of practical and mental challenges. To overcome these challenges, people turned to technology. On average, people spent almost 7 hours a day online in 2020, up 7.3% from the previous year (Kemp, 2021). A large fraction of this time was dedicated to searching for and consuming information (Kemp, 2021). Here, we examined how the high-level features of questions that people submitted to the Google search engine changed in response to the pandemic. Our results reveal two significant changes to web searches in response to the pandemic.
First, we observed a sharp increase in searches for information that can guide action. In particular, both in the US and UK, the proportion of queries that included the question-words "How to" and "How do" was greater during the 12 months following the declaration of a "National Emergency" than in the years prior. This rise likely reflects an adaptive human tendency to ask questions that can facilitate rapid adjustment to new and potentially aversive environments. Our supplementary experiment indeed shows that participants believe that when people submit "How to" and "How do" search queries, they are primarily motivated to find information that can guide action (see Supplementary Experiment).
The fluctuation in the proportion of "How to" and "How do" questions submitted to Google was associated with the proportion of individuals who reported experiencing COVID-related stress in the UK. Specifically, the greater the proportion of the sample who reported stress, the greater the proportion of "How to" and "How do" questions submitted out of all searches, controlling for temporal trends in the data. This was observed using a simple model relating the weekly proportion of "How to" and "How do" questions submitted to the Google search engine to the proportion of individuals in the UK reporting COVID-related stress out of a sample of over 17K residents. Importantly, we were able to disentangle the effects of stress on web searches from the effects of COVID-related confinement, by controlling for the latter in the model. Thus, the relationship between stress levels and the volume of "How to" and "How do" queries cannot be explained simply by increased confinement. The findings show that in the face of a novel threatening situation and high stress people search for information that can help guide action. While in the past such information may have been sought directly from other people, with the development of the internet, individuals are now able to turn to the web for answers. This ability may have contributed to the high resilience and quick adaptation observed in response to the pandemic (Aknin et al., 2021; Globig et al., 2020).
Second, we observed a sharp increase in the proportion of queries that included the question-words "What" and "Why" during the global pandemic. Our supplementary study shows that participants believed that when people submit "What" and "Why" questions to the Google search engine they are primarily motivated to increase their general understanding, rather than intending to guide action (see Supplementary Experiment). Interestingly, this spike was not associated with stress levels. Instead, fluctuations in the proportion of "What" and "Why" questions were associated only with COVID-19 related confinement. Specifically, the more severe the levels of confinement imposed on people, the more they asked "What" and "Why" questions relative to other searches. This rise may thus be due to an increase in down-time due to restrictions on activities outside the home. It is important to emphasize that any change in the proportion of "What" and "Why" searches and "How to" and "How do" searches cannot simply be explained by a general increase in the number of Google searches, as the former are calculated as a proportion of the latter. Neither can it be explained by temporal linear trends, as the data was detrended. We did not observe a change in the number of negative words used in questions submitted to the Google search engine during the pandemic relative to the number of positive words used.
Markedly, we show that the type of questions people asked predicted the proportion of the UK sample reporting COVID-related stress during the pandemic better than the frequency of searches that include stress-related content (i.e., "anxiety", "stress", and "mental health"). An interesting question is whether tracking the frequency of "How to" and "How do" questions, or other high-level features of web queries, can (i) predict population-level stress beyond the time of a pandemic and (ii) predict individual differences in stress. If affirmative, quantifying such search features can prove extremely valuable for monitoring mental health on both an individual and a population level. Future studies are needed to explore these intriguing possibilities.

Data Extraction
Web Search Data. Weekly search data was extracted from Google Trends (www.googletrends.com) for 220 weeks (January 1st, 2017 through March 21st, 2021). This was done separately for the UK and the US. We extracted the Google search volume index for four question-words ("How to", "How do", "What", and "Why") separately. A Google search volume index value is equal to the number of searches for the specific term of interest in a given week and region (for example, the total number of searches that include the question-words "How to" in the UK in the first week of 2020) divided by the total number of searches in that same time and region (for example, the total number of Google searches submitted in the UK in the first week of 2020). These values are normalized to represent search interest relative to the highest value for that region for the entire time frame (i.e., January 1st, 2017 to March 21st, 2021). A value of 100 is the peak popularity for the term, a value of 50 means that the term is half as popular as the peak, while a score of 0 means there was not enough data to calculate the term's popularity. We then averaged separately the (i) "How to" and "How do" scores and the (ii) "What" and "Why" scores.
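The arithmetic behind this index can be illustrated with a short Python sketch. Google Trends does not expose raw query counts, so the counts below are hypothetical inputs used only to show the normalization described above; the function names are likewise illustrative, and details such as Google's rounding and sampling are ignored.

```python
def search_volume_index(term_counts, total_counts):
    """Return a 0-100 index: the weekly share of searches, scaled to the peak share."""
    if len(term_counts) != len(total_counts):
        raise ValueError("inputs must have the same length")
    # Share of all searches in each week that contain the term.
    shares = [t / n for t, n in zip(term_counts, total_counts)]
    peak = max(shares)
    if peak == 0:
        return [0.0] * len(shares)
    # The week with the highest share scores 100; other weeks scale relative to it.
    return [100.0 * (s / peak) for s in shares]


def average_indices(index_a, index_b):
    """Average two weekly indices element-wise (e.g. "How to" and "How do")."""
    return [(a + b) / 2.0 for a, b in zip(index_a, index_b)]
```

With hypothetical weekly counts of 10, 20 and 5 matching queries out of 1,000 total searches each week, the index is 50, 100 and 25: the second week is the peak, and the first week is half as popular as the peak.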
To quantify valence, we extracted the 25 most popular search queries for each week and region for questions including "How", "Why" and "What". That is, for each week we extracted 25 search queries per type of question (i.e., 75 total search queries for each week), as this is the maximum Google Trends reports. We then matched the words on the lists with a lexicon database (Hu & Liu, 2004) that includes 2,006 positive words and 4,783 negative words and counted the number of positive and negative words for each week and region. Finally, we subtracted the proportion of negative words from the proportion of positive words. We transformed this number to be on a scale from 0 to 100, such that it would be easily comparable to the other two indexes. A score of 100 denotes the most positively valenced value and a score of 0 the most negatively valenced value.
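A minimal Python sketch of this word-matching step follows. It is illustrative only: the tiny word sets stand in for the Hu & Liu (2004) lexicon, the whitespace tokenizer is a simplification, and the linear map from the theoretical range [-1, 1] onto [0, 100] is an assumption, since the exact transformation is not specified in the text.

```python
import string


def valence_difference(queries, positive_words, negative_words):
    """Proportion of positive words minus proportion of negative words.

    Proportions are taken over all words in the supplied queries.
    """
    words = [w.strip(string.punctuation) for q in queries for w in q.lower().split()]
    words = [w for w in words if w]  # drop tokens that were punctuation only
    if not words:
        return 0.0
    n_pos = sum(w in positive_words for w in words)
    n_neg = sum(w in negative_words for w in words)
    return (n_pos - n_neg) / len(words)


def to_valence_index(difference):
    """Map a difference in [-1, 1] linearly onto [0, 100] (assumed transformation)."""
    return 50.0 * (difference + 1.0)
```

For instance, with the stand-in sets `{"good", "best"}` and `{"bad"}`, the queries "how to be good", "why is it bad?" and "what is best" contain 11 words, two positive and one negative, giving a difference of 1/11 and an index of about 54.5 under the assumed mapping.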
Self-Reported Stress. Data was extracted, with permission, from the UK COVID-19 Social Study. The study is a panel study of over 70,000 UK citizens which aims to characterize the psychological and social experience of adults living in the UK during the Covid-19 pandemic (see Table 1 for demographics). The study commenced as a weekly survey, with participants receiving an invitation to the next wave of data collection 7 days following their last completion. All participants received up to 2 reminders (24 and 48 hours following their initial weekly invitation). The link to their last reminder remained live so they could return to the study a few days later if they chose to. Following week 22 of the study, monthly follow-ups rather than weekly follow-ups were sent.
To attain an equal number of responses across time, participants were randomized to receive their monthly invitation on either week 1, 2, 3, or 4 of the month, with subsequent invitations following 28 days after they completed the survey. An average of 17,468 individuals submitted data each week (see Table 2 for response frequency for each week). For full methods and demographics for the sample see www.COVIDSocialStudy.org. The UK COVID-19 Social Study was approved by the UCL Research Ethics Committee and all participants gave written informed consent.
Participants were asked: "over the past week, have any of the following been worrying you at all, even if only in a minor way?" They were presented with 18 factors that may cause worry (for example internet access, boredom, neighbours) and were to pick any that they were worried about. Five of these factors were a priori categorized by the authors of the survey (Fancourt et al., 2021) as ones that have been impacted by COVID. These were (i) catching Covid-19, (ii) becoming seriously ill from Covid-19, (iii) finances, (iv) losing your job/unemployment and (v) getting food. Second, they were asked "have any of these things been causing you significant stress? (e.g., they have been constantly on your mind or have been keeping you awake at night)". They were presented with the same 18 factors as above and were asked to tick any of those causing significant stress. For each week and factor, Fancourt et al. (2021) calculated the proportion of respondents that ticked that factor in response to question 1 and/or question 2. Factors i and ii were a priori combined by Fancourt and colleagues to make one factor, leaving us with four factors. For each week, the proportions of people ticking 1 and/or 2 were averaged across the four factors to produce one indicator of "stress levels" for that week. Table 1 shows the demographics of respondents to the UK COVID-19 Social Study. Importantly, data points reported by Fancourt et al. (2021) were weighted using auxiliary weights to the national census and Office for National Statistics (ONS) data. We used these weighted data points in our study. Thus, reported stress levels are representative of the UK population.
COVID-19 Confinement Score. To measure COVID-19 related confinement, we extracted eight confinement variables from a publicly available dataset (The Oxford University COVID-19 Government Response Tracker; Webster et al., 2021).
All variables are ordinally coded by severity/intensity of confinement, on a daily basis (from January 1st, 2020 to March 21st, 2021), for the following: (i) school and university closures, (ii) workplace closures, (iii) public event cancelations, (iv) restrictions on gatherings, (v) public transport restrictions, (vi) stay at home requirements, (vii) restrictions on domestic travel, and (viii) restrictions on international travel; see Table 3 for coding. To obtain weekly values, we computed weekly averages of the daily ratings. To quantify an overall COVID-19 related confinement score, we transformed all variables to range between 0 and 1 using the R function scaler from the R package bruceR. Finally, we averaged the 8 transformed variables together.
[...] to January 1st, 2017. We then compared the weekly scores before the "National Emergency" to those after using an independent samples t-test.
To assess whether our measures were related to the proportion of the UK sample reporting stress, we conducted three linear models predicting on a weekly basis Google's search volume index of "How to" and "How do" questions, "What" and "Why" questions, and the Valence index of questions submitted to Google in the UK, from UK stress levels. We also included weekly COVID-related confinement scores in the models to disentangle the effects of stress from the effects of confinement due to restrictions placed by the Government. To account for simple temporal trends, we first removed the linear trend from the dependent and predictor variables, using the detrend function in the pracma R package. The detrended dependent and predictor variables were then Z-scored. Finally, we were interested in whether stress was better predicted by "How to" and "How do" Google searches than by searches for stress-related terms. To test this, we first removed the linear trend from the dependent and predictor variables, and then Z-scored the dependent and predictor variables. We then ran a model predicting the proportion of the UK sample reporting COVID-related stress from the "How to" and "How do" Google search index as well as the Google search index for the words "stress", "anxiety", and "mental health" in the UK. In addition, we ran multiple linear models to predict the proportion of the UK sample reporting COVID-related stress each time from only one of the terms above.
Next, we tested the predictive validity of a simple model using stress levels to predict the proportion of "How to" and "How do" searches using a leave-one-out analysis. Once again, we first removed the linear trend from the dependent and predictor variables, and then the dependent and predictor variables were Z-scored. Specifically, the simple model was run on all the data save for one time point, which was held out from the analysis. We then used the regression beta to predict the proportion of "How to" and "How do" searches of the left-out time point. This process was repeated so that each week's proportion of "How to" and "How do" searches was estimated from the simple model parameters generated without using that week to fit the data. This resulted in two values for the proportion of "How to" and "How do" searches for each week: the actual proportion of "How to" and "How do" searches (data) and the predicted value from the leave-one-out validation (estimate). The actual and predicted proportions were then correlated and also compared using a paired sample t-test. This analysis indicates whether population stress levels are a good predictor of the proportion of "How to" and "How do" searches.
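The leave-one-out procedure described above can be sketched in a few lines of Python. This is an illustration rather than the authors' implementation: the function names are hypothetical, the simple model is assumed to include an intercept, and the inputs are assumed to be the already detrended and Z-scored weekly series.

```python
from math import sqrt
from statistics import mean, stdev


def loo_predictions(x, y):
    """For each week, fit y = a + b*x without that week, then predict that week."""
    preds = []
    for i in range(len(x)):
        xs = x[:i] + x[i + 1:]  # predictor with week i held out
        ys = y[:i] + y[i + 1:]  # outcome with week i held out
        x_bar, y_bar = mean(xs), mean(ys)
        b = (sum((u - x_bar) * (v - y_bar) for u, v in zip(xs, ys))
             / sum((u - x_bar) ** 2 for u in xs))
        a = y_bar - b * x_bar
        preds.append(a + b * x[i])
    return preds


def pearson_r(a, b):
    """Pearson correlation between actual values and estimates."""
    a_bar, b_bar = mean(a), mean(b)
    cov = sum((u - a_bar) * (v - b_bar) for u, v in zip(a, b))
    return cov / sqrt(sum((u - a_bar) ** 2 for u in a)
                      * sum((v - b_bar) ** 2 for v in b))


def paired_t(a, b):
    """Paired-sample t statistic for the mean difference between a and b."""
    d = [u - v for u, v in zip(a, b)]
    return mean(d) / (stdev(d) / sqrt(len(d)))
```

Here `x` would hold the weekly stress levels and `y` the weekly "How to" and "How do" index, both passed as Python lists; `pearson_r(y, loo_predictions(x, y))` then gives the data-versus-estimate correlation, and `paired_t` compares their means.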

Declarations
Data availability. Data and code are available at a dedicated GitHub repository [github.com/affective-brainlab/information_seeking_under_threat]. The source data underlying Figs. 1-4 and Supplementary Fig. 2 are provided as part of this repository. A reporting summary for this Article is available as a Supplementary Information file.
Code availability. Code supporting this study is available at a dedicated GitHub repository [github.com/affective-brainlab/information_seeking_under_threat].

Figure 2
Temporal evolution of high-level features of web queries before and during the pandemic. (a&b) In the (a) UK and (b) US there was a sharp increase in the Google search volume index of "How to" and "How do" questions, and (c&d) "What" and "Why" questions after a "National Emergency" was declared. (e&f) There was no change in the Valence index [0 (most negatively valenced) and 100 (most positively valenced)]. The X-axis indicates the weeks ranging from January 1st, 2019 to March 21st, 2021. The solid black line indicates the time the "National Emergency" was declared in the (a, c, & e) UK (March 23rd, 2020) and (b, d, & f) US (March 13th, 2020).

Figure 3
Stress is selectively associated with a greater percentage of "How to" and "How do" questions submitted to the Google search engine. Presented are the results of three separate linear regressions predicting the UK Google search volume index of (a) "How to" and "How do" questions, (b) "What" and "Why" questions, and (c) Valence Index [0 (most negatively valenced) and 100 (most positively valenced)], from (i) the proportion of the UK sample reporting COVID-related stress (detrended and Z-scored) and (ii) UK COVID-19 related confinement score (detrended and Z-scored). The X and Y values are the residuals (regressing out the respective control variable). The fine line represents the confidence interval. As can be observed, (a) increased stress was associated with increased "How to" and "How do" questions controlling for COVID-19 related confinement, while (b) increased COVID-19 related confinement was associated both with an increased proportion of "What" and "Why" questions, and "How to" and "How do" questions. ***P<0.001, **P<0.01 (two-sided).