Background
This study aimed to characterize the spatial-temporal patterns and epidemiology of tuberculosis (TB) in China from 2004 to 2017 using Joinpoint regression analysis, a seasonal autoregressive integrated moving average (SARIMA) model, geographic cluster analysis, and a multivariate time series model.
Methods
TB data from January 2004 to December 2017 were obtained from the notifiable infectious disease reporting system supplied by the China CDC. Joinpoint regression analysis was used to assess the temporal trend. Monthly incidence was predicted with a seasonal autoregressive integrated moving average (SARIMA) model. Spatial autocorrelation analysis was performed to detect geographic clusters. A multivariate time series model was employed to analyze heterogeneous transmission.
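As an illustration of the spatial autocorrelation step, the following is a minimal sketch of the global Moran's I statistic, a standard measure for this kind of analysis. The abstract does not state which statistic or spatial weight matrix was used, so the binary adjacency weights, the function name `morans_i`, and the four-region toy values below are assumptions for illustration only, not the study's data or implementation.

```python
def morans_i(values, weights):
    """Global Moran's I for regional values and a spatial weight matrix.

    values  : list of n regional values (e.g. incidence rates)
    weights : n x n nested list, weights[i][j] > 0 if regions i and j are neighbours
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]            # deviations from the mean
    s0 = sum(sum(row) for row in weights)       # sum of all spatial weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    total = sum(d * d for d in dev)
    return (n / s0) * (cross / total)


# Hypothetical layout: four regions in a row, each adjacent to its neighbours.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = [10.0, 9.0, 2.0, 1.0]     # similar values sit next to each other
alternating = [10.0, 1.0, 10.0, 1.0]  # neighbours are always dissimilar

i_clustered = morans_i(clustered, weights)
i_alternating = morans_i(alternating, weights)
```

Values of I above zero indicate that similar incidence levels tend to be adjacent (clustering), while values below zero indicate that neighbouring regions tend to differ. In practice, significance is usually assessed with a permutation test, which this sketch omits.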
Results
We included 13,991,850 TB cases reported from 2004 to 2017. The final selected model was the 0-joinpoint model, with an average annual percent change of -3.3%. Seasonality was observed across the fourteen years, with seasonal peaks in January and March. The best SARIMA model was (0, 1, 1) × (0, 1, 1)₁₂, which had the minimum AIC (880.5) and SBC (886.4). The predicted values matched the observed incidence data for 2017 well. The provinces with a high incidence were located in the northwest (Xinjiang, Tibet) and south (Guangxi, Guizhou, Hainan) of China. The autoregressive component played the leading role in TB incidence, accounting for 81.5% to 84.5% of cases on average. In the western provinces, the endemic component was about twice as large as the average, while the spatial-temporal component was less important there. Most high-incidence areas were driven mainly by the autoregressive component over the fourteen years.
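To make the reported trend figure concrete: with zero joinpoints, the trend is a single log-linear segment, so the average annual percent change equals the annual percent change of that segment, computed as (e^b - 1) × 100 from the slope b of ln(rate) against year. The sketch below assumes an unweighted least-squares fit and uses synthetic rates constructed to fall by 3.3% per year. Neither the rates nor the helper name `annual_percent_change` come from the study.

```python
import math


def annual_percent_change(years, rates):
    """Fit ln(rate) = a + b * year by ordinary least squares.

    Returns the annual percent change, (exp(b) - 1) * 100.
    """
    n = len(years)
    year_mean = sum(years) / n
    logs = [math.log(r) for r in rates]
    log_mean = sum(logs) / n
    slope = (
        sum((y - year_mean) * (l - log_mean) for y, l in zip(years, logs))
        / sum((y - year_mean) ** 2 for y in years)
    )
    return (math.exp(slope) - 1) * 100


# Synthetic illustration only: a rate starting at 100 per 100,000 that
# falls by 3.3% each year from 2004 to 2017 (not the study's data).
years = list(range(2004, 2018))
rates = [100.0 * 0.967 ** k for k in range(len(years))]

apc = annual_percent_change(years, rates)
```

Because the synthetic series is exactly log-linear, the fitted value recovers the -3.3% decline built into it. Real surveillance data would scatter around the fitted line, and Joinpoint software additionally tests whether adding joinpoints improves the fit.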
Conclusion
A significant decreasing trend in TB incidence was observed from 2004 to 2017. Seasonal peaks occurred in January and March each year. Clear clusters were identified in Tibet and Xinjiang. The multivariate time series model revealed spatial heterogeneity in the components driving TB transmission. This suggests that preventive efforts should be targeted to each province according to the main component contributing to its epidemic.