Seasonal fluctuation is a common phenomenon in the incidence of many infectious diseases, including TB. In this study, a clear seasonality was found in the time series of TB incidence in Guangdong, with incidence peaking in spring (March to May). This result is consistent with the study by Wang et al., which found that TB incidence in China from January 1997 to August 2019 peaked predominantly in spring and early summer [15]. In another study of the epidemiology of TB in Xinjiang, Wubuli et al. found a major peak in March and a trough in October [16]. Despite these common peaks and troughs, seasonal patterns varied slightly between studies of different regions, which might arise from regional differences in meteorological patterns and from differences in study periods. The mechanisms underlying seasonal periodicity remain poorly understood, although oscillatory changes in infectiousness, contact patterns, pathogen survival, host susceptibility, population behaviors and meteorological factors may contribute to this phenomenon [17]. For TB, strong wind and increased outdoor activities in spring may play a significant role in this seasonal pattern.
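As an illustration of how a seasonal peak and trough can be read off a monthly series, the sketch below averages incidence by calendar month. It is not the procedure used in this study: the function names and the synthetic series (built to peak in April and bottom out in October) are assumptions made for the example.

```python
import math
from statistics import mean

def monthly_profile(series, start_month=1):
    """Average a monthly series by calendar month (1 = January ... 12 = December)."""
    buckets = {m: [] for m in range(1, 13)}
    for i, value in enumerate(series):
        month = (start_month - 1 + i) % 12 + 1
        buckets[month].append(value)
    # Months with no observations are left out of the profile.
    return {m: mean(v) for m, v in buckets.items() if v}

def peak_and_trough(profile):
    """Return the calendar months with the highest and lowest average value."""
    return max(profile, key=profile.get), min(profile, key=profile.get)

# Synthetic data only (not study data): three years of a smooth annual cycle.
demo_series = [10 + 3 * math.cos(2 * math.pi * ((i % 12 + 1) - 4) / 12)
               for i in range(36)]
demo_peak, demo_trough = peak_and_trough(monthly_profile(demo_series))
print(demo_peak, demo_trough)
```

A calendar-month average ignores long-term trend, so on real surveillance data the series would usually be detrended first.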
This study examined the association between weather variables and TB. We found that weather factors including maximum temperature, maximum daily rainfall, minimum relative humidity, mean vapor pressure, extreme wind speed, maximum atmospheric pressure, mean atmospheric pressure and illumination duration were significantly associated with log(TB incidence). Of these, extreme wind speed and maximum atmospheric pressure were fitted into the model as covariates. Extreme wind speed at lag 5 and maximum atmospheric pressure at lag 6 were both positively associated with log(TB incidence). SARIMAX (0, 1, 1) (0, 1, 1)12 with extreme wind speed at lag 5 as a covariate was the optimal model, with a lower AIC and the highest prediction accuracy (a lower MSE than either the model without weather covariates or the model with maximum atmospheric pressure as a covariate). Including the weather covariate relaxed the traditional time series model's reliance on the past values of the series alone and improved the accuracy of the prediction.
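For reference, a SARIMAX (0, 1, 1) (0, 1, 1)12 model with a single lagged covariate is often written as a regression with seasonal ARIMA errors. The notation below is assumed for illustration and is not taken from the study: y_t is monthly TB incidence, x_t is extreme wind speed, B is the backshift operator, and β, θ and Θ are the covariate, non-seasonal MA and seasonal MA coefficients. The parameterization and sign convention of the fitted model may differ.

```latex
\begin{aligned}
\log y_t &= \beta\, x_{t-5} + \eta_t, \\
(1 - B)(1 - B^{12})\,\eta_t &= (1 + \theta B)(1 + \Theta B^{12})\,\varepsilon_t,
\qquad \varepsilon_t \sim \mathcal{N}(0, \sigma^2).
\end{aligned}
```

In this form the covariate enters linearly, and the differencing terms (1 - B) and (1 - B^{12}) remove trend and the 12-month seasonal cycle from the error process.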
Our results are similar to those of several previous studies on the effects of meteorological factors on TB in China. Xiao et al. [7] used a distributed lag nonlinear model (DLNM) to analyze 10 years of TB surveillance data in Jinghong, a city in Yunnan Province. After controlling for autocorrelation, the average temperature was negatively correlated with the incidence of TB, with a lag of 2 months; the total precipitation and the lowest relative humidity were negatively correlated, with lags of 3 months and 4 months respectively; and there was no lag in the effect of mean wind speed and total sunshine hours on TB incidence. Zhang et al. [8] used a geographically weighted regression (GWR) model to analyze the incidence of TB from 2005 to 2015 in different districts of the country together with local meteorological factors. They found that the average temperature was positively correlated with the incidence of TB, while the mean relative humidity and the mean wind speed were negatively correlated. The different lag effects of weather variables reported in other studies probably resulted from differences between the study locations.
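Lag periods such as those compared above are commonly screened by correlating the outcome with the covariate shifted back by k months. The sketch below uses a plain Pearson correlation, which is an assumption for illustration and not necessarily the procedure used in this study or in the studies cited. The function names and the synthetic data (built with a true lag of 3 months) are hypothetical.

```python
import math
import random

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def lagged_correlations(y, x, max_lag=6):
    """Correlation between y[t] and x[t - k] for each lag k = 0..max_lag."""
    return {k: pearson(y[k:], x[:len(x) - k]) for k in range(max_lag + 1)}

def make_demo(n=156, true_lag=3, seed=0):
    """Synthetic series (not study data): y responds to x after `true_lag` months."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [(0.8 * x[t - true_lag] if t >= true_lag else 0.0) + rng.gauss(0, 0.3)
         for t in range(n)]
    return y, x

demo_y, demo_x = make_demo()
demo_corrs = lagged_correlations(demo_y, demo_x)
best_lag = max(demo_corrs, key=lambda k: abs(demo_corrs[k]))
print(best_lag, round(demo_corrs[best_lag], 3))
```

With real seasonal series, both variables share an annual cycle, so they are usually differenced or prewhitened before lagged correlations are interpreted.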
Unlike the mean values of weather factors, which other studies found to be correlated with TB incidence, the maximum values of weather factors reflect the occurrence of more abnormal weather. Abnormal weather can reduce host resistance to pathogens, which is closely related to the occurrence of infectious diseases. The possible link between TB and weather factors may be attributable to the following reasons:
We found that extreme wind speed at lag 5 was positively associated with log(TB incidence). The higher the wind speed, the greater the possibility of disease transmission through respiratory droplets [18]. This result is consistent with the observation that TB incidence in Guangdong was high in spring, when wind speed was high.
We also found that maximum atmospheric pressure at lag 6 was positively associated with log(TB incidence). Air usually flows from high-pressure areas to low-pressure areas, so the correlation between TB and atmospheric pressure may be related to wind speed. However, the mechanism by which pressure affects the transmission of M. tuberculosis is poorly understood. Additional studies are warranted to further delineate the underlying mechanisms.
In the present study, maximum temperature was also found to be significantly associated with log(TB incidence). Temperature can affect the indoor and outdoor activities of TB patients and other susceptible people. For example, temperature was positively associated with the number of individuals walking on a track, except at extremely high temperatures [19, 20]. Frequent outdoor activities may increase the risk of infection with TB.
In addition to the above, minimum relative humidity, mean vapor pressure, maximum daily rainfall and illumination duration at different lags were also found to be significantly associated with log(TB incidence). The survival of viruses depends partially on the level of relative humidity (RH). Viruses with lipid envelopes tend to survive longer at lower RH (20–30%) [21]. Continuous exposure to dry air may reduce the production of protective mucus on the surface of the respiratory tract, thus weakening its resistance to pathogens [22]. In one study, precipitation, atmospheric pressure, and relative humidity were found to have negative effects on TB incidence by indirectly lowering the concentrations of inhalable particulate matter and sulfur dioxide. TB incidence was also found to be negatively correlated with the concentration of inhalable particulate matter, sulfur dioxide, or nitrogen dioxide [23]. In addition, the large amount of ultraviolet light provided by long hours of sunshine not only restricts the growth of M. tuberculosis but also promotes the synthesis of vitamin D, which can protect people from TB to some extent [24].
The ARIMA model, also known as the Box-Jenkins model, can analyze various types of time series data and is commonly used in time series analysis [25-27]. Unlike the ARIMA model, which is a univariate time series model, the ARIMAX model can handle multivariate time series data: it adds other variables related to the target series as input variables to improve prediction accuracy. Previous studies have explored various models, such as ARIMA [2,9], X12-ARIMA [9], ARIMA-generalized regression neural network (GRNN), DLNM [7] and GWR [8], in predicting TB [2]. However, few models have considered both seasonal variation and meteorological factors [28-30]. In a study of three cities in Jiangsu province, the ARIMAX model was found to be superior to the ARIMA and RNN models in predicting PTB when meteorological factors were taken into consideration [10]. A time series study in Guangzhou, China, showed that an ARIMA model with imported cases and minimum temperature as input variables was superior to a single ARIMA model in forecasting dengue transmission [11]. Another time series study in Abidjan, Côte d'Ivoire, also indicated that including rainfall as an input variable can increase the accuracy of the ARIMA model in predicting influenza [31]. In the present study, we found that adding meteorological factors decreased the MSE relative to the SARIMA model without covariates and improved the prediction accuracy (Table 4). Both ARIMA and ARIMAX are linear models. Considering that the relationship between weather factors and TB incidence may be nonlinear and lagged, distributed lag nonlinear models and long-term prediction models might be applied in future studies.
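The comparison described above, holdout MSE with and without a lagged weather covariate, can be sketched as follows. This is not the SARIMA/SARIMAX fit used in the study. To stay dependency-free it substitutes two simple stand-in forecasters: a seasonal-naive forecast, and the same forecast adjusted by a least-squares coefficient on the seasonally differenced lagged covariate. All names and the synthetic data are assumptions for illustration; only the 5-month lag and the 12-month holdout mirror the text.

```python
import math
import random

LAG = 5      # covariate lag in months (the lag reported for extreme wind speed)
SEASON = 12  # seasonal period for monthly data

def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def fit_beta(y, x, end):
    """No-intercept least-squares slope of the seasonal difference of y on the
    seasonal difference of x at lag LAG, using observations with t < end only."""
    num = den = 0.0
    for t in range(LAG + SEASON, end):
        dy = y[t] - y[t - SEASON]
        dx = x[t - LAG] - x[t - LAG - SEASON]
        num += dx * dy
        den += dx * dx
    return num / den

def compare_holdout(y, x, horizon=12):
    """Return (MSE without covariate, MSE with covariate) on the last `horizon` months.
    As with any covariate model, covariate values over the horizon must be known."""
    split = len(y) - horizon
    beta = fit_beta(y, x, split)
    actual = y[split:]
    baseline = [y[t - SEASON] for t in range(split, len(y))]
    adjusted = [y[t - SEASON] + beta * (x[t - LAG] - x[t - LAG - SEASON])
                for t in range(split, len(y))]
    return mse(actual, baseline), mse(actual, adjusted)

def make_series(n=168, seed=0):
    """Synthetic log-incidence y driven by covariate x at lag LAG (not study data)."""
    rng = random.Random(seed)
    x = [5 + 2 * math.sin(2 * math.pi * t / SEASON) + rng.gauss(0, 1) for t in range(n)]
    y = [2 + 0.5 * math.cos(2 * math.pi * t / SEASON)
         + 0.3 * x[max(t - LAG, 0)] + rng.gauss(0, 0.05) for t in range(n)]
    return y, x

log_incidence, wind = make_series()
mse_without, mse_with = compare_holdout(log_incidence, wind)
print(mse_without, mse_with)
```

On this synthetic series the covariate-adjusted forecast has the lower MSE by construction. With real data the same holdout comparison would be run on forecasts from fitted SARIMA and SARIMAX models.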
This study is not free from limitations. First, only monthly TB incidence data were available; weekly or daily incidence data might improve the accuracy of lag estimation. Second, the model was established from TB incidence and meteorological data in a single study area, Guangdong, over 13 years, and prediction was conducted for only a one-year period. Therefore, it is only suitable for predicting overall trends in Guangdong. Third, beyond local climate conditions, the results might be affected by potentially confounding variables such as vaccine usage, improvement in medical care, population growth, economic development, and population and ecological characteristics. However, these data were not available for assessment in the current study. Therefore, comparisons among different places are necessary to obtain results free from confounding factors. A longer study period, a longer prediction horizon, continuous data collection, the acquisition and screening of confounding variables, and improved and updated prediction methods are needed to improve the fit of the model and verify its prediction accuracy. These can hopefully be achieved in future studies. Lastly, the delayed effect of weather factors on TB was estimated only by mathematical methods, and its biological mechanism is not clear. It is hoped that future studies will reveal the mechanism of this delayed impact and help to interpret the results properly.