Spatio-temporal distribution analysis of TB in Xinjiang Uygur Autonomous Region, China

Background: Tuberculosis (TB) is a major global public health problem that also affects economic and social development. China has the third largest burden of tuberculosis in the world. Within China, TB control has progressed most slowly in the west, and Xinjiang has the highest prevalence. This study investigated the spatial epidemiological features of pulmonary tuberculosis in Xinjiang Uygur Autonomous Region (referred to as Xinjiang) and compared regional differences in TB incidence for 2013-2016, to provide a scientific reference for TB prevention and control. Methods: Based on TB surveillance data, descriptive statistics were used to analyze the distribution characteristics of TB patients. Spatial correlation analysis and spatio-temporal scanning techniques were used to explore the clustering of TB in Xinjiang. Results: A total of 178,674 TB cases were notified in Xinjiang from 2013 to 2016, with an average annual incidence of 195.32/100,000. The incidence of TB in Xinjiang showed an upward trend. Male and female patients accounted for 52.56% and 47.44%, respectively, a sex ratio of 1.11:1. The number of cases increased continuously with age, and elderly patients aged 60 years and above accounted for 46.77%. Most patients were farmers and shepherds, accounting for 72.11%. The incidence of TB increased from east to west and from north to south. Obvious spatial aggregation was observed in TB incidence across 98 counties and districts from 2013 to 2016, and the global Moran's I was 0.5768 (P<0.001). The reported incidence rate of TB showed remarkable seasonality.
The hot spots of TB were mostly concentrated in southern Xinjiang with Kashgar as the center, while the cold spots were in northern Xinjiang with Urumqi as the center. Conclusion: TB incidence displayed spatial and temporal aggregation at the district and county level in Xinjiang during 2013-2016, with high-risk areas relatively concentrated in southern Xinjiang. It is necessary to conduct targeted TB prevention and control in key areas and allocate health resources reasonably.

Keywords: Tuberculosis; Spatial autocorrelation; Spatio-temporal scanning analysis

Background
Tuberculosis (TB) is an airborne infectious disease caused by Mycobacterium tuberculosis, which typically infects the lungs but can also affect other parts of the body [1][2]. TB continues to be a significant public health problem in the world [3].
According to the World Health Organization (WHO), TB ranks first among the most important infectious diseases in the world, although its incidence has declined slowly, by roughly 1.5% per year, since 2000 [4][5][6][7]. Globally, there were 9.6 million incident cases of active TB disease in 2014 and 1.5 million TB-related deaths, making TB the leading infectious disease killer worldwide. According to the global TB report released by the WHO in 2016, China ranked third among 22 high-TB-burden countries, with an estimated 930,000 TB patients in 2015 [8][9]. Although China has worked hard to combat TB, it still carries the third largest burden in the world, after India and Indonesia.
Within China, TB control has progressed most slowly in the west, and Xinjiang has the highest prevalence [10][11]. From 2010 to 2014, the Communicable Disease Network Direct Reporting System recorded 264,000 reported TB cases in Xinjiang, with 1,144 TB-related deaths. The incidence of TB in Xinjiang showed an upward trend in 2013-2016. The prevalence of TB among residents rose rapidly from 157.83/100,000 in 2011 to 202.32/100,000 in 2016, far above the national average, which points to a need to significantly improve disease monitoring and early warning [1]. Improvements in the direct reporting network system may themselves have contributed to the rise and change in TB case notifications in Xinjiang. There were regional differences in the prevalence of TB in Xinjiang, and research has shown that TB has a specific spatial distribution pattern [11]. Understanding such spatial variations in TB prevalence and its determinants within a social, spatial, and temporal context is crucial for better targeting of interventions and resources. Geospatial analytical methods, such as geographic information systems (GIS) and spatio-temporal scanning analysis, are effective tools for achieving such understanding [12][13][14].
In China, some studies have examined the spatio-temporal distribution characteristics of TB at the provincial or national level; however, the temporal and spatial distribution of tuberculosis in Xinjiang has received little attention in recent years [15][16][17][18].
Therefore, the main objectives of this study were to investigate the temporal trends and spatial patterns of the TB surveillance data of Xinjiang from 2013 to 2016 by epidemic characteristics analysis, spatial auto-correlation analysis and spatio-temporal scanning analysis.

Data source
District- and county-level data on monthly reported TB cases and incidence from 2013 to 2016 were obtained from the Xinjiang Center for Disease Prevention and Control (Xinjiang CDC). Data from the Communicable Disease Network Direct Reporting System and the Xinjiang Statistical Yearbook were also collected.

Statistical analyses
(1) Descriptive statistics
Descriptive statistics, including the distribution of cases by year, occupation, gender and age group, were used to describe the epidemic characteristics of morbidity and reported cases. The chi-square test was used to analyze the annual trend of incidence. Spatio-temporal clustering analysis with a Poisson model was applied to identify counties at high risk for TB between 1 January 2016 and 31 December 2016. To visualize the cluster pattern in a geographical context, the geographic information system (GIS) software ArcGIS 10.4.1 was used together with SaTScan 9.4 and GeoDa 1.6.0.
(2) Spatial autocorrelation
The examination of spatial data is strongly affected by the location from which observations are made. Neighboring regions affect one another, and proximate locations often share more similarities than widely spaced locations. Spatial autocorrelation describes the correlation between a variable in one spatial region and the same variable in adjacent regions, which can provide clues about where to look for the factors affecting disease. It is measured using both global and local metrics, so spatial autocorrelation analysis includes global spatial autocorrelation and local spatial autocorrelation [19][20]. To calculate spatial dependence, Moran's I (a statistic that measures spatial autocorrelation) was employed. Moran's I is the most commonly used global autocorrelation index, and its value ranges from -1 to 1. A positive value indicates positive correlation, a negative value indicates negative correlation, and 0 indicates no correlation, namely a spatially random distribution.

1) The global Moran's I
The global Moran's I, which quantifies the similarity of observations among adjacent geographical units from a global perspective, was used to analyze the overall degree of spatial autocorrelation and the spatial distribution pattern [1]. The global Moran's I is defined as

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^{n}(x_i - \bar{x})^2}$$

where $n$ is the number of samples, $x_i$ is the observed value in location $i$, $\bar{x}$ is the mean of the observed values across all locations, and $w_{ij}$ is an element of the binary spatial weight matrix that describes the spatial relationship between location $i$ and location $j$. The most frequently used spatial contiguity-based weight matrices were applied here: when two counties or districts share a geographical border, $w_{ij} = 1$; otherwise $w_{ij} = 0$.
To test the spatial autocorrelation between regions, the normalized statistic Z(I) was used to test Moran's I, with P < 0.05 indicating a significant correlation. The statistic Z(I) is defined as follows (Formula #2 in the supplemental file "Formulas.docx"):

$$Z(I) = \frac{I - E(I)}{\sqrt{\mathrm{Var}(I)}}$$

where $E(I) = -1/(n-1)$ is the expected value of $I$ under the null hypothesis of no spatial autocorrelation and $\mathrm{Var}(I)$ is its variance. The global Moran's I ranges from −1 to 1 due to the use of the standardized spatial weight matrix.
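As an illustration of the two statistics described above, the following minimal Python sketch computes the global Moran's I and a Z score for a toy example. The four-unit chain of neighbors, the incidence values, and the function names are hypothetical, and the variance is computed under the normality assumption, which is one common choice and not necessarily the one used in this study.

```python
import math

def global_morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    ss = sum(d * d for d in dev)
    return (n / s0) * (cross / ss)

def z_score_normality(i_obs, weights):
    """Z(I) = (I - E(I)) / sqrt(Var(I)), with Var(I) under normality (assumption)."""
    n = len(weights)
    s0 = sum(sum(row) for row in weights)
    s1 = 0.5 * sum((weights[i][j] + weights[j][i]) ** 2
                   for i in range(n) for j in range(n))
    s2 = sum((sum(weights[i]) + sum(weights[j][i] for j in range(n))) ** 2
             for i in range(n))
    expected = -1.0 / (n - 1)
    variance = ((n * n * s1 - n * s2 + 3 * s0 * s0)
                / ((n * n - 1) * s0 * s0)) - expected ** 2
    return (i_obs - expected) / math.sqrt(variance)

# Hypothetical data: four units in a row (0-1-2-3), binary contiguity weights.
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
incidence = [10.0, 12.0, 30.0, 32.0]  # made-up rates per 100,000

i_obs = global_morans_i(incidence, weights)
z = z_score_normality(i_obs, weights)
print(round(i_obs, 4), round(z, 3))
```

In this toy example low values sit next to low values and high next to high, so I is positive. With only four units the Z score is illustrative at best; a permutation-based test is often preferred in practice.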
2) The local Moran's I
Local spatial autocorrelation statistics can be used to identify different spatial patterns (or spatial aggregation patterns) that might exist in different spatial locations. This allows us to observe local non-stationarity and to find spatial heterogeneity in the data. In our study, the local Moran scatterplot is a two-dimensional plot that is centered on 0 and divided into four quadrants representing different types of spatial association. The first quadrant is the High-High (HH) cluster quadrant, i.e., counties or districts with high TB incidence surrounded by counties or districts with high TB incidence. Similarly, the second, third and fourth quadrants are the Low-High (LH), Low-Low (LL) and High-Low (HL) cluster quadrants, respectively.
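The quadrant assignment described above can be sketched in a few lines of Python. This is a simplified illustration, not the GeoDa procedure: it standardizes the values, averages the standardized values of each unit's neighbors (a row-standardized spatial lag), and labels the unit by the signs of the two quantities. The neighbor lists, incidence values, and function name are hypothetical, and every unit is assumed to have at least one neighbor.

```python
def moran_quadrants(values, neighbors):
    """Label each unit HH, LH, LL or HL from its value and its neighbors' mean."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    z = [(v - mean) / sd for v in values]  # standardized values, centered on 0
    labels = []
    for i in range(n):
        # Spatial lag: mean standardized value of the neighbors (assumes >= 1 neighbor).
        lag = sum(z[j] for j in neighbors[i]) / len(neighbors[i])
        if z[i] >= 0 and lag >= 0:
            labels.append("HH")  # first quadrant
        elif z[i] < 0 and lag >= 0:
            labels.append("LH")  # second quadrant
        elif z[i] < 0 and lag < 0:
            labels.append("LL")  # third quadrant
        else:
            labels.append("HL")  # fourth quadrant
    return labels

# Hypothetical data: four units in a row (0-1-2-3) with made-up rates.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
incidence = [10.0, 12.0, 30.0, 32.0]

labels = moran_quadrants(incidence, neighbors)
print(labels)  # ['LL', 'LL', 'HH', 'HH']
```

A quadrant label alone does not establish a significant cluster; whether a given HH or LL unit counts as a hot or cold spot depends on a separate significance test of the local statistic.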
The TB registration data were merged with a vector map to build spatial databases in ArcGIS 10.4.1. Global and local Moran's I were calculated with GeoDa 4.6.0, and spatio-temporal scanning analysis was performed with SaTScan 9.3, to detect the spatial autocorrelation and cluster range of the distribution.
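For readers unfamiliar with the Poisson-model scan statistic mentioned in the statistical analyses, the sketch below shows how a single candidate window can be scored with the log-likelihood ratio commonly used for this model. It is a simplified illustration under stated assumptions, not the SaTScan implementation: the function names and all counts are hypothetical, only high-rate windows are scored, and the Monte Carlo step that scan software uses to assess significance is omitted.

```python
import math

def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio for one candidate window under a Poisson model.

    cases_in: observed cases inside the window
    expected_in: expected cases inside the window under the null hypothesis
    total_cases: observed cases in the whole study area and period
    Returns 0.0 when the window does not show an excess of cases.
    """
    if cases_in <= expected_in:
        return 0.0
    cases_out = total_cases - cases_in
    expected_out = total_cases - expected_in
    llr = cases_in * math.log(cases_in / expected_in)
    if cases_out > 0:
        llr += cases_out * math.log(cases_out / expected_out)
    return llr

def relative_risk(cases_in, expected_in, total_cases):
    """Risk inside the window divided by risk outside it."""
    inside = cases_in / expected_in
    outside = (total_cases - cases_in) / (total_cases - expected_in)
    return inside / outside

# Hypothetical window: 300 observed cases where 200 were expected, out of 1000.
llr = poisson_llr(300, 200.0, 1000)
rr = relative_risk(300, 200.0, 1000)
print(round(llr, 2), round(rr, 2))
```

In a full scan, this score would be computed for every candidate space-time window, the window with the largest value would be reported as the most likely cluster, and its P value would be obtained by comparing that maximum with maxima from randomly simulated data sets.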

Ethical Review
The study protocol and the use of TB-related data were reviewed by the Xinjiang Uygur Autonomous Region Center for Disease Control and Prevention, and no ethical issues were identified. Therefore, no ethics approval was required by our Investigation Review Board.

Discussion
TB in Xinjiang was mainly characterized by a high infection rate, high prevalence, high mortality and a low cure rate. The incidence of TB showed an obvious spatio-temporal distribution. Spatial scientists, practitioners, and policy makers are interested in understanding the spatial variation in TB incidence at various geographic scales and resolutions. This study identified the spatio-temporal patterns of TB at the district and county level in Xinjiang, China, from January 2013 to December 2016.
The geographic distribution of TB incidence tends to vary across a geographic landscape.
Spatial correlation analysis and spatio-temporal analysis were used to detect spatial

Conclusions
It is well known that the incidence of TB is related to demographic characteristics such as gender, age, occupation, and ethnicity, as well as to geographical factors, economic development, and medical conditions. We used spatial statistics to observe the TB epidemic in Xinjiang from 2013 to 2016. The results showed that TB continues to be a significant public health problem in Xinjiang. The southern regions of Xinjiang have a higher TB burden, due in part to a less-developed economy and poverty. Analysis of the factors influencing TB in Xinjiang is the direction of our further research.

Declarations
Ethics approval and consent to participate
The Xinjiang Uygur Autonomous Region Center for Disease Control and Prevention approved the use of an anonymised database of routinely collected TB data for this analysis.

Competing Interests
The authors declare that there is no conflict of interest.

Figure 1 Comparison of TB incidence
Figure: Gini coefficient at spatial window stops
Figure: Temporal-space clustering area map of 98 districts and counties in Xinjiang. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Supplementary Files
Supplementary files associated with the primary manuscript: Formulas.docx