Correlation Between National Surveillance and Search Engine Query Data on Respiratory Syncytial Virus Infections in Japan

DOI: https://doi.org/10.21203/rs.3.rs-1332314/v1

Abstract

Background

The respiratory syncytial virus (RSV) disease burden is significant, especially in infants and patients with an underlying disease, and prophylaxis with palivizumab is recommended for these high-risk groups. Early recognition of an RSV epidemic is important for timely administration of palivizumab. We aimed to assess the correlation between national surveillance and Google Trends data pertaining to RSV infections in Japan.

Methods

This retrospective study covered the period from January 1, 2018, to November 14, 2021, and evaluated the correlation between national surveillance and Google Trends data. Joinpoint regression was used to identify the points at which changes in trends occurred.

Results

A strong correlation was observed in every study year (2018 [r=0.87, p<0.01], 2019 [r=0.83, p<0.01], 2020 [r=0.83, p<0.01], and 2021 [r=0.96, p<0.01]). The change-points in the Google Trends data indicating the start of the RSV epidemic were observed earlier than those in the sentinel surveillance data in 2018 and 2021, and simultaneously with them in 2019 and 2020.

Conclusions

Our data suggested that Google Trends has the potential to enable early identification of the RSV epidemic. In countries without a national surveillance system, Google Trends may serve as an alternative early warning system.

Introduction

Respiratory syncytial virus (RSV) is a common cause of acute respiratory tract illness in children and adults [1, 2]. The disease burden is significant, especially in infants and patients with an underlying disease [1], including premature infants and patients with chronic lung disease, congenital heart failure, or immunocompromised status. Prophylactic administration of palivizumab is recommended for these high-risk groups [3]. The Japanese National Health Insurance coverage for palivizumab administration in these vulnerable populations is limited to six months. Therefore, early recognition of an RSV epidemic each season is crucial to timely and appropriate administration of palivizumab prophylaxis.

Surveillance systems for respiratory infections, including RSV, vary among nations. Weekly surveillance is the norm in the United States and United Kingdom [4, 5]. In Japan, a sentinel surveillance system at primary care clinics and hospitals contributes to nationwide surveillance by updating the data weekly on the National Institute of Infectious Diseases website [6]. Owing to the development of such national surveillance systems, large RSV epidemics were identified in these countries in 2021 despite the strain on public health resources caused by the coronavirus disease 2019 (COVID-19) pandemic [4, 5, 7, 8]. However, some high-income countries and most middle-income countries still lack an RSV surveillance system and are unable to detect an epidemic early or assess an ongoing epidemic accurately. Concerns about RSV outbreaks are increasing worldwide, calling for a readily accessible method of detection.

Some recent studies reported the utility of search engine query data in predicting a disease trend or an epidemic of an infectious disease. Google Trends is a tool for exploring a variety of themes pertaining to social and health topics [9], and its data on the influenza virus and COVID-19 were found to correlate with official surveillance data [10–13]. Therefore, we aimed to evaluate the correlation between national surveillance and Google Trends data on RSV infections in Japan to assess the utility of Google Trends as a tool for detecting increases in the RSV infection trend.

Methods

Google Trends data, generated from the total Google search data (https://trends.google.com/trends/?geo=JP), were used as search engine query data. Google Trends data are only available in the form of relative search volume, which is scaled on an index ranging from 0 to 100 (100 is the highest search volume in a given period). The search term, “RS virus” in Japanese (“RS uirusu”) was used to conduct a search in Japan between January 1, 2018, and November 14, 2021. A full-year analysis was conducted to obtain the weekly relative search volume (each year contained 52–53 weeks). The relevant data were collected on November 20, 2021.
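The 0 to 100 index described above can be illustrated with a minimal sketch. The function name and the weekly counts below are hypothetical, and the scaling shown is only the peak-normalization step stated in the text; the exact normalization that Google applies internally is not reproduced here.

```python
def relative_search_volume(weekly_counts):
    """Scale weekly values so that the peak week in the period equals 100.

    Assumption: this mirrors only the "100 is the highest search volume in a
    given period" rule from the text, not Google's internal processing.
    """
    peak = max(weekly_counts)
    if peak <= 0:
        # No searches at all in the period: every week maps to 0.
        return [0 for _ in weekly_counts]
    return [round(100 * count / peak) for count in weekly_counts]


# Hypothetical weekly search counts for one analysis period.
counts = [12, 30, 60, 45]
print(relative_search_volume(counts))  # [20, 50, 100, 75]
```

Because each period is scaled to its own peak, an index value of 100 in one period does not necessarily represent the same absolute search volume as 100 in another period.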

The official surveillance data in Japan are reported weekly by the Infectious Disease Surveillance Center at the National Institute of Infectious Diseases [6]. In Japan, data on common infectious diseases, including RSV infections, are collected via sentinel surveillance from about 3,000 paediatric sentinel sites [14]. The data are then expressed as the number of laboratory-confirmed cases per sentinel site and made publicly available online after about nine to ten days. In the study period, the surveillance data were available from February 26 (week 9), 2018, to November 7 (week 44), 2021, because the RSV sentinel surveillance system was modified during week 9 in 2018 to report laboratory-confirmed cases per sentinel site rather than the number of actual cases.

The Spearman rank correlation test was used to compare the Google Trends data with the official surveillance data. Two-sided p < 0.05 was considered to indicate statistical significance. Strong, moderate, mild, weak, and no correlation were defined as correlation coefficients of 0.8–1.0, 0.6–0.8, 0.4–0.6, 0.2–0.4, and 0.0–0.2, respectively. Statistical analyses were performed using EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan), a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria) [15].
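As an illustration of the correlation step, the following sketch computes Spearman's rank correlation coefficient in plain Python and maps it to the strength labels defined above. It is not the EZR workflow used in the study, it does not compute p-values, and the weekly values are hypothetical. Two choices are assumptions: the absolute value of the coefficient is classified, and a value on a shared boundary (for example, 0.8) is assigned to the stronger category.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def correlation_strength(rho):
    """Map |rho| to the labels in the text (boundary handling is assumed)."""
    r = abs(rho)
    for threshold, label in [(0.8, "strong"), (0.6, "moderate"),
                             (0.4, "mild"), (0.2, "weak")]:
        if r >= threshold:
            return label
    return "none"


# Hypothetical weekly values: relative search volume and cases per sentinel site.
search_volume = [10, 12, 30, 80, 100, 60, 25]
cases_per_site = [0.2, 0.3, 0.5, 2.0, 3.1, 2.5, 0.9]
rho = spearman_rho(search_volume, cases_per_site)
print(f"{rho:.2f}", correlation_strength(rho))  # 0.93 strong
```

Working on ranks rather than raw values means that the 0 to 100 scaling of the search index does not affect the coefficient, since any monotonic rescaling leaves the ranks unchanged.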

Additionally, to evaluate changes in epidemic trends, the surveillance data and relative search volume in Google Trends were analyzed using the Joinpoint Regression Program, Version 4.9.0.0 (Statistical Research and Applications Branch, National Cancer Institute), which enables the analysis of Joinpoints to identify significant trend changes. The epidemic curve of RSV infections is usually observed as a single peak each year. Thus, three Joinpoints (change-points) were established to estimate the epidemic trend over the following four periods: period 1) the pre-epidemic phase; period 2) the epidemic phase (increasing); period 3) the epidemic phase (decreasing); and period 4) the post-epidemic phase, on the assumption that this pattern would fail to appear if an epidemic did not occur. The slope of each segment between change-points was expressed as the weekly percentage change (WPC) with a 95% confidence interval. The present study was approved by the institutional review board of Okayama University Hospital (No. 2111-025).
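To make the WPC concrete, the sketch below assumes the log-linear form commonly used in joinpoint analyses, in which the logarithm of the weekly value is modeled as a straight line within each segment and the percentage change is derived from the slope b as WPC = (exp(b) − 1) × 100. It covers a single segment with known boundaries only; it does not reproduce the Joinpoint Regression Program, the placement of change-points, or the confidence intervals, and the data are hypothetical.

```python
from math import exp, log


def log_linear_slope(weeks, values):
    """Least-squares slope of ln(value) against week within one segment.

    Values must be strictly positive; weeks with a value of 0 would need
    separate handling (an assumption left outside this sketch).
    """
    n = len(weeks)
    logs = [log(v) for v in values]
    mean_week = sum(weeks) / n
    mean_log = sum(logs) / n
    sxx = sum((w - mean_week) ** 2 for w in weeks)
    sxy = sum((w - mean_week) * (l - mean_log) for w, l in zip(weeks, logs))
    return sxy / sxx


def weekly_percentage_change(weeks, values):
    """Convert the log-linear slope b into WPC = (exp(b) - 1) * 100."""
    return (exp(log_linear_slope(weeks, values)) - 1) * 100


# Hypothetical increasing segment: values grow by 10% each week.
weeks = list(range(22, 30))
values = [2.0 * 1.10 ** k for k in range(len(weeks))]
print(round(weekly_percentage_change(weeks, values), 1))  # 10.0
```

Under this form, a positive WPC indicates a segment in which the weekly value is rising and a negative WPC indicates a falling segment, which is how the increasing and decreasing epidemic phases are distinguished.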

Results

Figure 1 shows the trends in the relative search volume on Google Trends and the sentinel surveillance data for each year. The rising curve in the Google Trends data preceded that of the sentinel surveillance data in 2018, 2019, and 2021 (Figs. 1a, b, d). No epidemic surge was observed in either the Google Trends or the surveillance data in 2020, whereas a large epidemic surge was observed in 2021. A strong correlation was observed every year, namely, 2018 (r = 0.87, p < 0.01), 2019 (r = 0.83, p < 0.01), 2020 (r = 0.83, p < 0.01), and 2021 (r = 0.96, p < 0.01) (Fig. 2).

Table 1 and Fig. 3 show the results of the Joinpoint trend analysis of the relative search volume on Google Trends and the sentinel surveillance data. In 2018, the change-point at the start of period 2, suggesting the onset of an RSV epidemic, was observed in week 19 in the Google Trends data and in week 22 in the sentinel surveillance data. In 2019, the change-points in both datasets occurred in week 25. The change-points in period 2 in 2020 appeared during week 10 in both datasets, but no epidemic peak was observed that season, as already shown in Fig. 1. In 2021, as in 2018, the Google Trends data showed a change-point at week 11, earlier than the sentinel surveillance data at week 15 or 18. Joinpoint analysis of the surveillance data from 2021 revealed that the increasing phase corresponded to period 3, as shown in Fig. 3(h), rather than to period 2.
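The lead of Google Trends over sentinel surveillance follows from simple subtraction of the period 2 change-point weeks reported above. The short sketch below makes that arithmetic explicit; the variable names are illustrative, and for 2021 the sentinel value of week 15 is used, which is an assumption given that the text reports week 15 or 18.

```python
# Period 2 change-point weeks as reported in the text for each year.
# For 2021 the sentinel value is taken as week 15; using week 18 instead
# would give a lead of 7 weeks.
period2_week = {
    2018: {"google_trends": 19, "sentinel": 22},
    2019: {"google_trends": 25, "sentinel": 25},
    2020: {"google_trends": 10, "sentinel": 10},
    2021: {"google_trends": 11, "sentinel": 15},
}

# Positive values mean that the Google Trends change-point came first.
lead_weeks = {
    year: weeks["sentinel"] - weeks["google_trends"]
    for year, weeks in period2_week.items()
}
print(lead_weeks)  # {2018: 3, 2019: 0, 2020: 0, 2021: 4}
```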

Table 1

Trend changes in sentinel surveillance data and relative search volume on Google Trends for RSV between 2018 and 2021

| Data source | Year | Period 1 (week) | WPC (95% CI) | Period 2 (week) | WPC (95% CI) | Period 3 (week) | WPC (95% CI) | Period 4 (week) | WPC (95% CI) |
|---|---|---|---|---|---|---|---|---|---|
| Google Trends | 2018 | 1 | -2.6* (-5.0 to 0) | 19 | 10.7* (8.2 to 13.3) | 36 | -12.9* (-19.9 to -5.4) | 43 | -4 (-9.9 to 2.4) |
| Sentinel surveillance | 2018 | 9 | -3.6* (-6.1 to -1.0) | 22 | 16.4* (14.6 to 18.2) | 37 | -17.8* (-20.7 to -14.8) | 45 | -0.5 (-5.6 to 4.8) |
| Google Trends | 2019 | 1 | -0.2 (-1.9 to 1.6) | 25 | 17.0* (13.1 to 21.0) | 37 | -21.9* (-31.6 to -10.8) | 42 | -6.3* (-1.2 to -2.5) |
| Sentinel surveillance | 2019 | 1 | -1.4 (-3.1 to 0.4) | 25 | 21.6* (17.8 to 25.6) | 37 | -18.1* (-21.2 to -14.8) | 47 | 2.6 (-11.5 to 18.8) |
| Google Trends | 2020 | 1 | 5.2 (-0.1 to 10.8) | 10 | -12.8* (-16.1 to -9.3) | 24 | 6.0* (3.6 to 8.4) | 46 | -8.2 (-16.9 to 1.5) |
| Sentinel surveillance | 2020 | 1 | 0 (-2.6 to 2.8) | 10 | -23.7* (-26.9 to -20.4) | 24 | 17.1* (12.8 to 21.6) | 41 | 1.7 (-1.1 to 4.5) |
| Google Trends | 2021 | 1 | 9.6* (0.9 to 19.0) | 11 | 18.2* (15.9 to 20.5) | 27 | -21.6* (-24.0 to -19.2) | 39 | -1.9 (-16.5 to -15.3) |
| Sentinel surveillance | 2021 | 1 | 16.8* (12.6 to 21.1) | 15 | 3.7 (-29.1 to 51.6) | 18 | 16.6* (13.8 to 19.5) | 28 | -16.4* (18.8 to -14.0) |

*Statistically significant
Abbreviations: WPC, weekly percentage change; 95% CI, 95% confidence interval

Discussion

The present study revealed a strong correlation between the sentinel surveillance data and relative search volume on Google Trends. A correlation of this strength suggests that Internet search engine query data have practical implications for public health. Additionally, our findings suggested that Google Trends may enable early detection of an RSV epidemic even if a national surveillance system is unavailable.

The analysis of Google Trends data pertaining to infectious diseases was initially used to determine its utility in detecting the influenza A(H1N1)2009 pandemic, in which it demonstrated its ability to forecast influenza disease activity [10]. The Google Trends data for estimating the activity of the influenza virus, dubbed “Google Flu Trends”, showed favorable correlations in the United States and Europe [11, 13, 16, 17]. Recently, Google Trends for assessing RSV activity in the United States was also evaluated [18, 19]; although this has been done only in the United States thus far, our findings indicated that it may be applicable to other nations as well. Furthermore, our analysis suggested that Google Trends might be capable of detecting an increase in the RSV infection trend simultaneously with or even before the national surveillance system. Further evaluation in other countries or regions is needed. Google Trends is a readily available tool that can be used to advise the public health sector of infection risks. For countries that do not have a nationwide surveillance or alert system, Google Trends may serve as a useful alternative warning system, provided that the Internet penetration rate is at a level comparable with that of the nations discussed.

Early recognition of an RSV epidemic is important because it enables timely palivizumab administration to prevent the infection among high-risk patients. In 2021, the RSV epidemic was observed earlier than in 2018 or 2019. In situations such as this, early prophylaxis with palivizumab should be considered. In Japan, the Infectious Disease Surveillance Center issued an alert concerning an RSV epidemic in week 18 in 2021 [20]. Our trend analysis using Joinpoint regression indicated that the RSV epidemic started earlier, around week 11. If the Google Trends database had been used to monitor the RSV trend, a timelier warning might have been issued to the public health sector.

Although Google Trends analysis has important implications for early epidemic detection, the peak of the epidemic curve on Google Trends was higher than in the surveillance data for 2021. This discrepancy may be explained by the public’s focus on the RSV epidemic because the search volume on Google Trends mirrors the public’s concerns. Additionally, in 2020, a small peak was observed in the Google Trends data at week 9 while no peak was observed in the surveillance data. A small peak of this sort, which was apparently unrelated to any disease trend, may create the false impression that an epidemic is imminent. Moreover, topics of public interest, such as the announcement of a new drug or vaccine for RSV, will likely affect the search volume on Google Trends, undermining the reliability of the findings. Thus, Google Trends analysis is not an infallible method of predicting an infectious disease epidemic. Further studies are needed to evaluate the advantages and disadvantages of Internet search engine query data pertaining to other diseases and in other countries.

The present study had some limitations. First, the generalizability of the findings to other countries and regions was not evaluated. However, previous studies of Google Flu Trends demonstrated the service’s utility in the United States and Europe [10, 11, 13, 16, 17]; we may therefore expect a similar utility in predicting RSV trends. Second, we were able to obtain only a “relative” search volume because the “actual” search volume of Google Trends data is not available to the public. If the total number of Internet searches were very small, the results of an analysis of Google Trends data might become susceptible to over- or underestimation.

In conclusion, our study found a strong correlation between the relative search volume on Google Trends and sentinel surveillance data on RSV infections. Additionally, the Google Trends database was found to be able to detect an increasing trend in RSV infections simultaneously with or even before the national surveillance system. With its wide availability and user-friendly interface, Google Trends will likely gain more attention for its utility as a surveillance system for infectious diseases even among patients and their guardians.

Declarations

Ethics approval and consent to participate: All procedures were performed in accordance with relevant guidelines. The present study was approved by the institutional review board of Okayama University Hospital (No. 2111-025). Informed consent was not required in this study because the data were already publicly available.

Consent for publication: Patient consent was not required in this study. All authors approved the publication of this article.

Availability of data and materials: Our data are available on the Google Trends website (https://trends.google.co.jp/trends/explore?date=2018-01-01%202021-11-14&geo=JP&q=%2Fm%2F02f84_) and in the Infectious Diseases Weekly Report of the National Institute of Infectious Diseases (https://www.niid.go.jp/niid/ja/data/10762-idwr-sokuho-data-j-2144.html). The raw data are attached as supplementary files.

Competing interests: The authors have no conflicts of interest to declare.

Funding: None

Authors' contributions: Drs. Uda and Hagiya conceptualized and designed the study. Dr. Uda drafted the manuscript and performed the data analyses. Dr. Hagiya contributed to the data analysis and the critical revision of the manuscript. Drs. Yorifuji, Koyama, Tsuge, Yashiro, and Tsukahara also contributed to the critical revision of the manuscript.

Acknowledgments: We thank Mr. James R. Valera for his editorial assistance and helpful comments.

Authors' information:

Corresponding Author: Kazuhiro Uda, MD

Department of Pediatrics, Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences, Okayama, Japan 

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan 

E-mail address: [email protected]

Hideharu Hagiya, MD, PhD

Department of General Medicine, Okayama University Graduate School of Medicine, Dentistry and Pharmaceutical Sciences, Okayama, Japan

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

Takashi Yorifuji, MD, PhD

Department of Epidemiology, Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences, Okayama, Japan

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

Toshihiro Koyama, PhD

Department of Health Data Science, Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences, Okayama, Japan

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

Mitsuru Tsuge, MD, PhD

Department of Pediatrics, Acute Diseases, Okayama University Academic Field of Medicine, Dentistry, and Pharmaceutical Science, Okayama, Japan.

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

Masato Yashiro, MD, PhD 

Department of Pediatrics, Okayama University Hospital, Okayama, Japan 

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

Hirokazu Tsukahara, MD, PhD

Department of Pediatrics, Okayama University Graduate School of Medicine, Dentistry, and Pharmaceutical Sciences, Okayama, Japan

2-5-1 Shikata, Okayama City, Okayama 700-8558, Japan

E-mail address: [email protected]

References

  1. Nair H, Nokes DJ, Gessner BD, Dherani M, Madhi SA, Singleton RJ, O'Brien KL, Roca A, Wright PF, Bruce N et al: Global burden of acute lower respiratory infections due to respiratory syncytial virus in young children: a systematic review and meta-analysis. Lancet 2010, 375(9725):1545–1555.
  2. Shi T, Vennard S, Jasiewicz F, Brogden R, Nair H: Disease Burden Estimates of Respiratory Syncytial Virus related Acute Respiratory Infections in Adults With Comorbidity: A Systematic Review and Meta-Analysis. J Infect Dis 2021.
  3. American Academy of Pediatrics Committee on Infectious Diseases. Updated guidance for palivizumab prophylaxis among infants and young children at increased risk of hospitalization for respiratory syncytial virus infection. Pediatrics 2014, 134(2):415–420.
  4. The Centers for Disease Control and Prevention (CDC). National Respiratory and Enteric Virus Surveillance System (NREVSS), Respiratory Syncytial Virus (RSV) Surveillance. https://www.cdc.gov/surveillance/nrevss/rsv/index.html (accessed January 30, 2022).
  5. Public Health England. Weekly national Influenza and COVID-19 surveillance report, week 37 report, https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1018187/Weekly_Flu_and_COVID-19_report_w37.pdf (accessed November 30, 2021).
  6. The National Institute of Infectious Diseases. Infectious Diseases Weekly Report (IDWR), https://www.niid.go.jp/niid/en/idwr-e.html (accessed January 30, 2022).
  7. Ujiie M, Tsuzuki S, Nakamoto T, Iwamoto N: Resurgence of Respiratory Syncytial Virus Infections during COVID-19 Pandemic, Tokyo, Japan. Emerg Infect Dis 2021, 27(11).
  8. Delestrain C, Danis K, Hau I, Behillil S, Billard MN, Krajten L, Cohen R, Bont L, Epaud R: Impact of COVID-19 social distancing on viral infection in France: A delayed outbreak of RSV. Pediatr Pulmonol 2021.
  9. Nuti SV, Wayda B, Ranasinghe I, Wang S, Dreyer RP, Chen SI, Murugiah K: The use of google trends in health care research: a systematic review. PLoS One 2014, 9(10):e109583.
  10. Ginsberg J, Mohebbi MH, Patel RS, Brammer L, Smolinski MS, Brilliant L: Detecting influenza epidemics using search engine query data. Nature 2009, 457(7232):1012–1014.
  11. Davidson MW, Haim DA, Radin JM: Using networks to combine "big data" and traditional surveillance to improve influenza predictions. Sci Rep 2015, 5:8154.
  12. Cinarka H, Uysal MA, Cifter A, Niksarlioglu EY, Çarkoğlu A: The relationship between Google search interest for pulmonary symptoms and COVID-19 cases using dynamic conditional correlation analysis. Sci Rep 2021, 11(1):14387.
  13. Schneider PP, van Gool CJ, Spreeuwenberg P, Hooiveld M, Donker GA, Barnett DJ, Paget J: Using web search queries to monitor influenza-like illness: an exploratory retrospective analysis, Netherlands, 2017/18 influenza season. Euro Surveill 2020, 25(21).
  14. Ministry of Health, Labour and Welfare. Implementation Manual for the National Epidemiological Surveillance of Infectious Diseases Program. https://www.mhlw.go.jp/english/policy/health-medical/health/dl/implementation_manual.pdf. (Accessed January 30, 2022).
  15. Kanda Y: Investigation of the freely available easy-to-use software 'EZR' for medical statistics. Bone Marrow Transplant 2013, 48(3):452–458.
  16. Valdivia A, Lopez-Alcalde J, Vicente M, Pichiule M, Ruiz M, Ordobas M: Monitoring influenza activity in Europe with Google Flu Trends: comparison with the findings of sentinel physician networks - results for 2009-10. Euro Surveill 2010, 15(29).
  17. Hulth A, Rydevik G: Web query-based surveillance in Sweden during the influenza A(H1N1)2009 pandemic, April 2009 to February 2010. Euro Surveill 2011, 16(18).
  18. Oren E, Frere J, Yom-Tov E, Yom-Tov E: Respiratory syncytial virus tracking using internet search engine data. BMC Public Health 2018, 18(1):445.
  19. Crowson MG, Witsell D, Eskander A: Using Google Trends to Predict Pediatric Respiratory Syncytial Virus Encounters at a Major Health Care System. J Med Syst 2020, 44(3):57.
  20. The National Institute of Infectious Diseases (NIID). Pick up of infectious diseases: recent trend of coronavirus disease 2019 and Respiratory Syncytial virus (published online May 7, 2021). https://www.niid.go.jp/niid/ja/diseases/ka/corona-virus/2019-ncov/2487-idsc/idwr-topic/10360-idwrc-2116c.html (accessed January 30, 2022).