Seasonal variation and change trends for quit smoking: evidence from Internet search engine query data

Background: The health outcomes of smoking have generated considerable clinical interest in recent years. Although people in different countries are thought to be more interested in quitting smoking during the winter, few studies have tested this hypothesis. The current study aimed to quantify public interest in quitting smoking via Google. Methods: We used Google Trends to obtain the Internet search query volume for terms relating to quitting smoking in major northern and southern hemisphere countries. Normalized search volumes for the term “quit smoking + stop smoking + smoking cessation” were retrieved within the USA, the UK, Canada, Ireland, New Zealand and Australia from January 2004 to December 2018. Seasonal effects were investigated using cosinor analysis and seasonal decomposition of time series models. Results: Cosinor analysis revealed significant seasonal variation in these search terms in all the representative countries: the USA (p_cos = 2.36×10^-7), the UK (p_cos < 2.00×10^-16), Canada (p_cos < 2.00×10^-16), Ireland (p_cos < 2.00×10^-16), Australia (p_cos = 5.13×10^-6) and New Zealand (p_cos = 4.87×10^-7). Time series plots emphasized the consistency of the seasonal trends, with peaks in winter or late autumn recurring in nearly all years. The overall search volume for quitting smoking, examined by dynamic series analysis, declined from 2004 to 2018. Conclusions: This preliminary evidence from Google Trends showed significant seasonal variation and a decreasing trend in the relative search volume (RSV) for quitting smoking. These novel findings in smoking-cessation epidemiology need to be verified in further studies, and the mechanisms underlying them remain to be clarified.


Background
Tobacco use is the main cause of preventable disease and death worldwide [1]. The link between the growing popularity of smoking and the rising incidence of lung cancer, a disease that was previously uncommon, has been demonstrated in recent years [2]. According to one original study, continued smoking after a cancer diagnosis has a detrimental effect on the effectiveness of cancer treatment and on survival [3]. Moreover, numerous studies have highlighted that smoking is responsible for 5% of global deaths, while smoking-related conditions such as cardiorespiratory, autoimmune, malignant and cerebrovascular diseases and subfertility account for 14% of deaths among adults aged 30 years or older [4]. A prior investigation reported that smoking in pregnancy may result in numerous adverse outcomes, including stillbirth, low birth weight, miscarriage, and perinatal morbidity and mortality [5]. Given the negative impacts of cigarette smoking, the majority of smokers stop smoking around the age of 40. It is reported that people who quit smoking before the age of 35 have a life expectancy comparable to that of people who do not smoke [6]. In addition, mortality from both smoking-related and non-smoking-related cancers is reduced if smokers quit after a cancer diagnosis [7]. Smoking cessation is therefore recommended, because quitting leads to substantial health improvements [8]. Given the harmful effects of smoking and the benefits of quitting, evaluating global public interest in smoking cessation is very important. Several previous studies have reported on smoking cessation [9,10].
However, few efforts have been made to identify seasonal patterns in interest in quitting smoking in large populations. This study seeks to fill that void.
Recently, public interest in health topics has been evaluated by examining Internet search engines, which have become a major source of information [11]. Internet use has increased sharply during the past decade, and almost 5% of Internet searches are for health information [12]. Google Trends, a website provided by Google Inc., analyses the popularity of a particular search query term in Google Search across different regions and languages. In several previous studies, Google Trends has been employed to investigate seasonal and other time-varying patterns in health topics, including cellulitis [13], vitamin D [14] and gout [15]. Accordingly, the current study aimed to leverage Internet search query data from Google Trends to assess public interest in quitting smoking and to investigate whether this interest follows a seasonal pattern.

Google Trends interrogation and data collection
Google Trends (http://www.google.com/trends) is a tool based on Google Search that analyzes how often a particular query term is entered relative to the total search volume across the world. Instead of reporting absolute, raw search figures, Google Trends presents relative search volume (RSV). Each data point is divided by the total searches of the geography and time range it represents and multiplied by 100. The data points can be obtained in comma-separated values (CSV) format, and the results are normalized and scaled to a range of 0 to 100. A value of 100 marks the time point at which the term reached its highest number of searches within the selected region and time frame, while a score of 0 means that the term was below 1 percent of its peak popularity [16,17]. To make the data more reliable, the system automatically excludes repeated searches made by the same person over a short period of time.
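The scaling described above can be illustrated with a minimal Python sketch. The raw counts behind Google Trends are not public, so the function name and the inputs `term_counts` and `total_counts` are hypothetical; the sketch only assumes that each period's share of searches is rescaled so that the peak period equals 100.

```python
def relative_search_volume(term_counts, total_counts):
    """Rescale per-period search shares so that the peak period equals 100.

    Both arguments are hypothetical raw counts; Google Trends does not
    publish them, so this only illustrates the normalization idea.
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("both series must cover the same periods")
    # Share of all searches in each period that match the term.
    shares = [t / n if n > 0 else 0.0 for t, n in zip(term_counts, total_counts)]
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0 for _ in shares]
    # Scale so the highest share maps to 100, then round to whole numbers.
    return [round(100 * s / peak) for s in shares]


# Three hypothetical periods: the second has the highest share of searches.
print(relative_search_volume([50, 120, 80], [10000, 12000, 16000]))
```

Because the values are relative to the peak within the chosen region and time frame, RSV series downloaded for different regions or date ranges are not directly comparable in absolute terms.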
Google Trends offers two options for searching keywords: by "search term", which searches for the exact keywords; or by "search topic", a broader search that covers the particular keywords and related terms. In this study, the latter option (searching by the particular keywords or related terms) was used, because no definitive technical terms are available. Among all the relevant search choices, we chose the search terms "quit smoking", "stop smoking" and "smoking cessation", which are the most searched. Multiple terms can be combined with a plus sign (+), which means "OR", and excluded with a minus sign (-). In keeping with the current standards for reporting Google Trends data, weekly data from 2004/01/01 to 2018/12/31 were downloaded as CSV files for the United States of America (USA), the United Kingdom (UK), Canada and Ireland as well as New Zealand and Australia [18]. For statistical analysis, these countries were grouped by hemisphere (northern or southern). The searches were conducted on 6 March 2019, and before the analysis, the accuracy of the data was evaluated by two individuals who independently cross-checked the data.

Dynamic series analysis
Dynamic series analysis arranges a series of statistical indicators in chronological order to observe and compare how a specified object changes and develops over time. The dynamic series analysis model contains three kinds of statistical indicators used to describe the object: absolute, relative and average numbers. The fixed-base ratio and the link-relative ratio, both based on relative comparison, were used in this dynamic series analysis. In addition, to investigate systematic seasonal variation in quitting smoking, cosinor analysis was used to test whether there was significant seasonal variation in the volume of Internet searches for the term "quit smoking + stop smoking + smoking cessation". The method and the program used to implement the cosinor analysis were presented in detail by Barnett et al. [19]. In short, cosinor analysis fits a sinusoidal pattern to an observed time series and is a common parametric seasonal model based on a sinusoid (see Equation 1). Since this linear model contains two seasonal components, sine and cosine, the threshold of significance was adjusted to p < 0.025 to correct for multiple comparisons. In addition, to quantify the magnitude of seasonal peaks and troughs for countries demonstrating significant seasonality, we calculated the percent change in search volume from the winter months (USA, UK, Canada and Ireland in the northern hemisphere: December, January and February; New Zealand and Australia in the southern hemisphere: June, July and August) to the summer months (USA, UK, Canada and Ireland: June, July and August; New Zealand and Australia: December, January and February), following the process used in several previous studies [20]. The consistency of the seasonal variation was further emphasized by time series plots.
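As an illustration of the cosinor idea and of the winter-to-summer comparison, the following Python sketch fits a sinusoid with a 12-month period by ordinary least squares. It is not the R "season" implementation used in the study and it does not compute the p values for the cosine and sine terms. It assumes the common form y_t = M + c·cos(2πt/12) + s·sin(2πt/12), which is equivalent to M + A·cos(2πt/12 − P) with amplitude A and phase P. The function names, the inclusion of an intercept, the direction of the percent change (summer relative to winter) and the synthetic series are all assumptions made for the example.

```python
import math


def solve_linear(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_cosinor(values, period=12):
    """Least-squares fit of y_t = mean + c*cos(w*t) + s*sin(w*t), w = 2*pi/period.

    Returns (mean, amplitude, peak_time); peak_time is the position of the
    fitted maximum within one cycle, in the same units as the index t.
    """
    w = 2 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in range(len(values))]
    # Normal equations (X^T X) beta = X^T y for the three coefficients.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mean, c, s = solve_linear(xtx, xty)
    amplitude = math.hypot(c, s)
    peak_time = (math.atan2(s, c) / w) % period
    return mean, amplitude, peak_time


def percent_change(values, months, from_months, to_months):
    """Percent change in mean value from one set of calendar months to another."""
    start = [v for v, m in zip(values, months) if m in from_months]
    end = [v for v, m in zip(values, months) if m in to_months]
    start_mean = sum(start) / len(start)
    end_mean = sum(end) / len(end)
    return 100 * (end_mean - start_mean) / start_mean


# Synthetic monthly series over three years (index 0 = January), not study data.
# The seasonal peak is placed at index 1, i.e. February.
months = [t % 12 + 1 for t in range(36)]
series = [50 + 20 * math.cos(2 * math.pi * (t - 1) / 12) for t in range(36)]
mean, amplitude, peak_time = fit_cosinor(series)
# Northern hemisphere convention: winter = Dec-Feb, summer = Jun-Aug.
change = percent_change(series, months, {12, 1, 2}, {6, 7, 8})
print(round(mean, 2), round(amplitude, 2), round(peak_time, 2), round(change, 1))
```

For southern hemisphere countries the two month sets would simply be swapped, as described in the text.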
The "season" package in R version 3.5.1 was used to perform all the data processing and analyses.

Dynamic series analysis of the search volumes
[...] (Table S1). The results are presented in Figure 1. The seasonal pattern in the countries mentioned above was cyclical, with a period of 12 months. These data are limited to search queries under the health category that originated within the USA, the UK, Canada, Ireland, New Zealand and Australia. Meanwhile, the seasonal decomposition of the time series data showed a significant decreasing trend in countries from both the northern and the southern hemisphere.
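The two relative indicators used in the dynamic series analysis can be sketched in Python as follows. The sketch assumes the usual definitions: the fixed-base ratio compares every period with the first period, and the link-relative ratio compares every period with the one immediately before it. The function names and the annual values are hypothetical and are not taken from Table S1.

```python
def fixed_base_ratios(values):
    """Each value as a percentage of the first (base) period."""
    base = values[0]
    return [100 * v / base for v in values]


def link_relative_ratios(values):
    """Each value as a percentage of the immediately preceding period."""
    return [100 * cur / prev for prev, cur in zip(values, values[1:])]


# Hypothetical annual mean RSV values, not taken from the study.
annual_rsv = [80.0, 60.0, 45.0]
print(fixed_base_ratios(annual_rsv))
print(link_relative_ratios(annual_rsv))
```

A fixed-base ratio below 100 indicates a decline relative to the first year, while a link-relative ratio below 100 indicates a decline relative to the previous year.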

Cosinor analysis for the relative search volume
The results of the cosinor analyses are presented in Table 2, and the graphical outcomes are presented in Figure 2. [...] New Zealand ([...] p_cos = 4.87×10^-7, p_sin = 9.15×10^-6), the northern hemisphere (A = 26.71, p = 1.7, L = 7.7, p_cos < 2.00×10^-16, p_sin = 4.27×10^-13) and the southern hemisphere (A = 9.42, p = 5.5, L = 11.5, p_cos = 1.22×10^-11, p_sin = 7.50×10^-12). Search volumes were higher in the winter and late autumn months (January and February for the four northern hemisphere countries and May for the two southern hemisphere countries) and lower in the summer and late spring (July and August for the four northern hemisphere countries and November for the two southern hemisphere countries).

Relative search volumes on Google from January 2004 to December 2018
The time series plots emphasized the consistency of the seasonal trends confirmed by the cosinor analysis. A significant decreasing trend in RSV for quitting smoking was observed over the whole observation period (January 2004 to December 2018) in both hemispheres. However, in recent years, the magnitude of the seasonal changes has decreased (Figure 3).
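A long-term trend of this kind is commonly separated from the seasonal cycle with a centered moving average, which is the trend step of a classical seasonal decomposition. The source does not state which decomposition routine was used, so the Python sketch below is only an assumed illustration on synthetic monthly data; the function name is hypothetical and an even period is assumed.

```python
import math


def centered_moving_average(values, period=12):
    """Trend estimate from a centered moving average for an even period.

    The window spans period + 1 points with half weight on the two end
    points, so a full seasonal cycle averages out. Positions without a
    complete window are returned as None.
    """
    half = period // 2
    trend = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        total = sum(window) - 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


# Synthetic monthly series: a linear decline plus a 12-month cycle (not study data).
series = [100 - 0.5 * t + 10 * math.cos(2 * math.pi * t / 12) for t in range(36)]
trend = centered_moving_average(series)
print([round(x, 2) for x in trend if x is not None][:3])
```

On this synthetic series the moving average recovers the underlying linear decline, because the seasonal component sums to zero over each 12-month window.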

Discussion
To the best of our knowledge, the current study is the first of its kind highlighting the seasonal [...] who are not smokers, the incidence rate of early SIDS (sudden infant death syndrome) among smokers was 0.6 cases per 100,000 person-days higher [26]. A meta-analysis reported that sperm concentration was reduced by 13%, sperm motility by 10% and sperm morphology by 3% among smokers [27]. All of these harmful effects of smoking may result in infertility. Furthermore, it has been reported that people who are addicted to smoking are more likely to suffer from depression than those who do not smoke (Breslau, Kilbey, & Andreski, 1993) [28]. Given the severe effects of smoking on health and the enormous economic burden of smoking-related illness, most people are inclined to quit smoking. Since most smokers are already aware of the importance of stopping smoking, fewer people search for this topic on the Internet. Third, for many young smokers, boredom remains a reason to smoke [29]. Additionally, in the UK, it has been suggested that young people treat both mobile phone use and smoking as symbols of maturity [30]. The rise in mobile phone use may be responsible for the observed decrease in smoking among teenagers over the past few years. Nevertheless, the association between the declining use of cigarettes and the rise in mobile phone use has not been studied in other countries [31]. With the rapid development of technology during the past several years, people have more ways to fill idle time, which may help explain our results.
Some other factors may contribute to the seasonality of the search volumes for quitting smoking. The observed seasonality may result from increased Internet use in winter. However, this presumption has not been corroborated [32]. The association between depressive illness and smoking has previously been described; the same study also reported that admissions for depressive disorders peaked in the springtime and declined during the summer [28]. Since patients with depression are more likely to be smokers, this may explain why interest in quitting smoking reached a trough in the summer months. Besides, people tend to decide to quit smoking on New Year's Eve, which is a widely recognized seasonal phenomenon [33].
Another study showed that cigarette sales followed a strongly seasonal pattern each year, peaking in the summer months and reaching a low in the winter months. This is nearly a mirror image of the seasonal pattern in the search volumes for quitting smoking, suggesting a strong association between the two phenomena [34]. Additionally, there is robust evidence that current and former smokers have a significantly higher risk of developing chronic obstructive pulmonary disease (COPD) [35]. Willingness to quit smoking increased significantly when awareness of COPD was raised [36]. The data showed a higher search volume related to "COPD" during the winter months, which may be one reason why the search volumes for quitting smoking peaked in the winter months and late autumn. Since most of the countries in the southern hemisphere lie near the equator, the seasonal variation in weather may be smaller than in the northern hemisphere countries. This may explain why the seasonal pattern in the southern hemisphere differs slightly from that in the northern hemisphere, with a peak in late autumn and a nadir in late spring.
Our research involved data from the USA, the UK, Canada, Ireland, New Zealand and Australia, which represent both hemispheres. Our study has certain advantages, including the large and exhaustive amount of data involved, the long period of observation and the inclusion of representative countries from both hemispheres. Nevertheless, some inherent limitations remain. Although Google Trends contains a massive amount of data, and more than 65% of all queries are made through the Google search engine [37], [...]

The plots of cosinor analysis models for the seasonal variation in the relative search volume of [quit smoking + stop smoking + smoking cessation].

Supplementary Files
This is a list of supplementary files associated with this preprint.
Supplementary Material.doc