Geographical Clustering Analysis of Birth Defects in Guangxi

Birth defects (BD) are a major public health issue in the Guangxi Zhuang Autonomous Region of China. The overall prevalence of BD in Guangxi is about 1%, higher than in most other provinces of China. However, the geographical clustering of BD in Guangxi has not been described. The aim of this study was therefore to explore and detect the spatial clustering patterns of BD prevalence across a well-defined geographic space. The data were obtained from the Guangxi birth defects monitoring network (GXBDMN) from 2016 to 2020, which collected socio-demographic and clinical information on perinatal infants between 28 weeks of gestation and 7 days postnatal. Spatial autocorrelation analysis and hot spot analysis were used to explore the geographical clustering of BD prevalence in the 70 counties and 41 districts of Guangxi. A total of 44,418 perinatal infants were born with BD from 2016 to 2020. The overall prevalence of BD was 122.47/10,000 [95% confidence interval (CI): 121.34-123.60/10,000]. The local indicators of spatial association (LISA) statistic and the Gi* statistic showed that the spatial clustering patterns of BD prevalence changed over time, and that the largest High-High clustering area and hot spot area were both located in the city of Nanning. The spatial clustering of BD prevalence in Guangxi is therefore statistically significant. Spatial cluster analysis can provide reliable and accurate spatial distribution patterns for BD control and prevention.


Introduction

Birth defects (BD), also known as congenital anomalies, are structural or functional abnormalities, including metabolic disorders, that arise during intrauterine life [1-3]. BD is a significant cause of spontaneous abortion, stillbirth, and death and disability among infants and children under 5 years old [3]. It is estimated that about 3-6% of infants worldwide (about 7.9 million) are born with serious BD and that more than 3.3 million infants and children die of BD every year; BD is also one of the main causes of lost disability-adjusted life years (DALYs) among infants aged 0-1 years [4-7]. The prevalence of BD in China is about 5.6%, which means that around 0.9 million infants are born with BD every year. This seriously affects the survival and quality of life of infants and places a heavy financial burden on their families [8,9].

Spatial epidemiology is an emerging (or re-emerging) interdisciplinary field based on geographic information system (GIS) spatial analysis technology. It can be used to describe and analyze risk factors related to demographics, environment, behavior, socio-economics, genetics and infectious disease [10,11]. In recent years, spatial epidemiology has been widely used to study the relationship between environment and health, and it plays a significant role in public health [12,13]. The prevalence of BD has been shown to be closely related to geographical location [14,15]. The prevalence of BD in Guangxi is about 1%, higher than in most other provinces of China, and varies greatly between regions [16,17]. In this study, we applied spatial autocorrelation analysis and hot spot analysis to the geographical clustering of BD prevalence in the 70 counties and 41 districts of Guangxi. The results provide a scientific basis for the future development of BD prevention and control strategies.

Results

BD prevalence mapping

The prevalence of BD in the 70 counties and 41 districts of Guangxi was mapped and is shown in Figure 1. The highest values of BD prevalence in 2016, 2017, 2018, 2019 and 2020 were 294.79/10,000 (Chengzhong district of Liuzhou city), 266.28/10,000 (Jinchengjiang district of Hechi city), 408.60/10,000 (Xingning district of Nanning city), 522.88/10,000 (Xingning district of Nanning city), and 636.58/10,000 (Xingning district of Nanning city), respectively.

Spatial autocorrelation analysis of BD prevalence

The Z test of the global Moran's I statistic showed that spatial autocorrelation was significant at the 95% CI level in every year. The global Moran's I index for 2016, 2017, 2018, 2019 and 2020 was 0.11, 0.19, 0.28, 0.33 and 0.36, respectively. These results indicate positive spatial autocorrelation of BD prevalence across the entire study area.

The local indicators of spatial association (LISA) statistic showed that High-High, High-Low, and Low-High clusters were the significant local spatial clustering patterns of BD prevalence in the study area from 2016 to 2020; these patterns are illustrated in Figure 2. The above three significant local spatial

Hot spots analysis of BD prevalence

The Gi* statistic showed that spatial clustering of BD prevalence was significant at the 90% CI level in the study area.

Discussion

Spatial epidemiology based on GIS spatial analysis technology plays an important role in the control and prevention of diseases [13] and is commonly applied in disease surveillance and health risk analysis in public health [18,19]. Although spatial epidemiology has been applied to BD and maternal health problems in recent years [20-24], most studies used only simple descriptive mapping of spatial distribution patterns [25], and in Guangxi in particular the spatial epidemiology of BD is not well understood. To our knowledge, this study is the first spatial epidemiology report on BD prevalence in Guangxi using GIS technology.

Spatial cluster analysis is inherently interdisciplinary, so the exchange of ideas between applied epidemiology researchers and spatial statisticians is very important [26]. Spatial cluster analysis is commonly used to analyze the spatial distribution patterns of BD prevalence [27-29]. In our study, the global Moran's I statistic, the LISA statistic and the Gi* statistic were used to explore and detect the spatial clustering pattern of BD prevalence in Guangxi from 2016 to 2020.

The LISA statistic results showed that the spatial clustering pattern of BD prevalence changed over time. The

These spatial cluster analysis results indicate a significant spatial clustering pattern in Guangxi from 2016 to 2020: high values of BD prevalence clustered in the city of Nanning. The spatial cluster analysis methods used here therefore provide researchers with strong evidence of spatial clustering of BD prevalence. The clustering was centered on Nanning, the provincial capital of Guangxi, which has the highest gross domestic product (GDP) in the region.

Moreover, Nanning provides professional prenatal and postnatal health care services. For these reasons, more pregnant women, especially those with severe problems identified at other hospitals, choose to be hospitalized in Nanning hospitals that offer a high level of professional maternal and child health care. As a result, the prevalence of defects among perinatal infants may be higher in the city of Nanning. This referral pattern may be an important issue that health administrative departments need to consider when formulating future health policy.

However, our study has two major limitations. First, the calculation of BD prevalence was a critical process in this study. The importance of population-based cohort studies in precision medicine and translational medicine is widely recognized. Unlike a population-based cohort study, our data were collected by the Guangxi birth defects monitoring network (GXBDMN), which is a hospital-based, passive surveillance system, so they may not fully reflect the actual BD prevalence of each county and district in Guangxi.

The second limitation is that risk factors related to BD prevalence were not analyzed in this study. The geographical variation in BD prevalence may be explained by various factors, such as the level of socio-economic development, environmental factors such as air and water pollution, the medical standard of maternal and child health care institutions, and the lifestyle and behavior of pregnant women. The main aim of this study was to detect spatial clustering patterns and identify the location of spatial cluster areas, so these risk factors were outside its scope. In future work building on this research, we hope to gain more insight into the epidemiology of BD by using other spatial analysis methods, such as spatial regression analysis.

In conclusion, the spatial clustering of BD prevalence in Guangxi is statistically significant. Spatial cluster analysis can provide reliable and accurate spatial distribution patterns for BD control and prevention. In particular, the LISA statistic and the Gi* statistic allow the location of spatial cluster areas to be detected more reliably when researchers have some prior knowledge of spatial epidemiology and GIS spatial analysis technology.

In this study, the prevalence of BD among perinatal infants between 28 weeks of gestation and 7 days postnatal was mapped at the county level. Spatial autocorrelation analysis and hot spot analysis were then used to detect the spatial cluster types of BD prevalence using GIS spatial analysis technology. All data analysis, including data preprocessing, spatial analysis and mapping, was conducted in ArcGIS 10.2, a GIS software package developed by Esri.
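
The prevalence values mapped in this workflow are expressed per 10,000 perinatal infants. As a rough illustration only, the following Python sketch computes such a rate together with a 95% confidence interval. The text does not state how the interval was derived, so the normal-approximation (Wald) interval used here is an assumption, and the function name and the input counts are hypothetical.

```python
import math


def prevalence_per_10000(cases, births, z_crit=1.96):
    """Return (prevalence, lower, upper) per 10,000 perinatal infants.

    Assumption: a normal-approximation (Wald) interval; the study does
    not say which interval method it used.
    """
    if births <= 0 or not 0 <= cases <= births:
        raise ValueError("need 0 <= cases <= births and births > 0")
    p = cases / births                       # proportion of infants with BD
    se = math.sqrt(p * (1 - p) / births)     # standard error of the proportion
    scale = 10_000
    return p * scale, (p - z_crit * se) * scale, (p + z_crit * se) * scale


# Hypothetical county: 120 BD cases among 10,000 perinatal infants.
prev, lower, upper = prevalence_per_10000(120, 10_000)
print(f"{prev:.2f}/10,000 (95% CI: {lower:.2f}-{upper:.2f}/10,000)")
```

For counties with few births the Wald interval can be poor, and an exact or Wilson interval may be preferable.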

Study area

The study area, the Guangxi Zhuang Autonomous Region, is located in southwestern China (see Figure 4).

Global spatial autocorrelation is a correlation analysis of the entire study area which assumes that all the spatial elements (stochastic variables) lie on a plane [30]. It is used to describe the overall spatial distribution of a phenomenon and to judge whether the phenomenon clusters spatially anywhere in the study area. The global Moran's I statistic [30] is commonly used to test for spatial clustering in global spatial autocorrelation analysis.

The global Moran's I statistic treats the observed values as a stochastic process distributed in two-dimensional space and evaluates the resulting spatial pattern (clustered, dispersed, or random) [30]. In our study, the global Moran's I statistic was first applied to explore the spatial clustering pattern of BD prevalence in Guangxi. The formula of the global Moran's I statistic is:

$$I = \frac{n}{S_0} \cdot \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} w_{i,j} z_i z_j}{\sum_{i=1}^{n} z_i^2} \qquad (1)$$

In equation (1), $z_i$ is the deviation of the attribute of unit $i$ from its mean ($x_i - \bar{X}$), $w_{i,j}$ is the spatial weight between units $i$ and $j$, $n$ is the number of observations (units), and $S_0$ is the sum of all spatial weights.
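
To make these quantities concrete, the following is a minimal pure-Python sketch of a global Moran's I calculation built from the deviations, the spatial weights, the number of units, and the sum of all weights. It assumes the standard form of the statistic and a simple binary adjacency weight matrix; the function name and the toy data are hypothetical, and the study itself used ArcGIS rather than custom code.

```python
def global_morans_i(values, weights):
    """Global Moran's I for attribute values and an n x n weight matrix.

    values  : list of attribute values, one per spatial unit
    weights : n x n nested list, weights[i][j] is the spatial weight
              between units i and j (diagonal assumed to be 0)
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]                  # deviations from the mean
    s0 = sum(sum(row) for row in weights)           # sum of all spatial weights
    cross = sum(weights[i][j] * z[i] * z[j]
                for i in range(n) for j in range(n))
    denom = sum(zi * zi for zi in z)
    return (n / s0) * (cross / denom)


# Hypothetical example: four units in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = global_morans_i([1.0, 2.0, 3.0, 4.0], chain)       # similar neighbours
alternating = global_morans_i([1.0, 4.0, 1.0, 4.0], chain)  # dissimilar neighbours
print(smooth, alternating)
```

In this toy example a smooth gradient gives a positive index and an alternating pattern gives a negative one, matching the interpretation of positive and negative spatial autocorrelation.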

The Z test is usually used to test the global Moran's I index, whose values range from -1 to +1. Z scores > 1.96 or < -1.96 indicate that spatial autocorrelation is significant at the 95% CI

In equation (2), $x_i$ is the spatial attribute value of unit $i$, $w_{i,j}$ is the spatial weight between units $i$ and $j$, $n$ is the number of observations (units), and $S_i^2$ is the aggregation of all spatial weights.
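
As an illustration of how a local statistic of this kind can be computed, the sketch below assumes the commonly used Anselin form of local Moran's I, in which the deviation of unit i is multiplied by the weighted sum of its neighbours' deviations and divided by a variance term computed from the other units. Equation (2) may differ in detail. The quadrant labels follow the High-High, High-Low and Low-High naming used in the Results; the labelling rule, the function name and the toy data are assumptions, and no significance test is performed here.

```python
def local_morans_i(values, weights):
    """Return a list of (local_I, label) tuples, one per spatial unit.

    Assumed form: I_i = (z_i / s2_i) * sum_j w_ij * z_j, with
    s2_i = sum_{j != i} z_j**2 / (n - 1). Labels describe the sign of the
    unit's own deviation and of its neighbours' weighted deviation.
    """
    n = len(values)
    mean = sum(values) / n
    z = [v - mean for v in values]
    results = []
    for i in range(n):
        s2 = sum(z[j] ** 2 for j in range(n) if j != i) / (n - 1)
        lag = sum(weights[i][j] * z[j] for j in range(n) if j != i)
        local_i = (z[i] / s2) * lag
        if z[i] > 0 and lag > 0:
            label = "High-High"
        elif z[i] > 0 and lag < 0:
            label = "High-Low"
        elif z[i] < 0 and lag > 0:
            label = "Low-High"
        elif z[i] < 0 and lag < 0:
            label = "Low-Low"
        else:
            label = "Undefined"
        results.append((local_i, label))
    return results


# Hypothetical example: four units in a row with binary adjacency weights.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
for local_i, label in local_morans_i([1.0, 2.0, 3.0, 4.0], chain):
    print(f"{local_i:.3f} {label}")
```

In practice a unit would be reported as a cluster or outlier only if its local statistic is also statistically significant.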

The Z test is also used to test the LISA statistic. Z scores > 1.96 or < -1.96 indicate that spatial autocorrelation is significant at the 95% CI level. The local spatial clustering patterns can therefore be classified into four types

The Z test is also used to test the Gi* statistic. Z scores > 1.65 or < -1.65 indicate that spatial clustering is significant at the 90% CI level. Accordingly, a P value < 0.10 with a Z score > 0 indicates a hot spot; a P value < 0.10 with a Z score < 0 indicates a cold spot; and a P value > 0.10 indicates neither a hot spot nor a cold spot.
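
The decision rule above can be sketched in a few lines of Python. This example assumes that the P value is the two-sided P value of the Z score under a standard normal distribution; the function names are hypothetical, and the calculation of the Gi* Z score itself is not shown.

```python
import math


def two_sided_p(z_score):
    """Two-sided P value of a Z score under the standard normal distribution."""
    return math.erfc(abs(z_score) / math.sqrt(2))


def classify_gi_star(z_score, alpha=0.10):
    """Label a unit from its Gi* Z score using the 90% rule described above."""
    p_value = two_sided_p(z_score)
    if p_value < alpha and z_score > 0:
        return "hot spot"
    if p_value < alpha and z_score < 0:
        return "cold spot"
    return "not significant"


# Hypothetical Z scores for three spatial units.
for z in (2.0, -2.0, 1.0):
    print(z, round(two_sided_p(z), 4), classify_gi_star(z))
```

Under this assumption a Z score of 1.65 corresponds to a two-sided P value just below 0.10, which is why the two thresholds in the text agree.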