The relationship between key natural and social factors and the transmission of novel coronavirus disease 2019 in China

Novel coronavirus disease 2019 (COVID-19) has become a global pandemic. This study aims to explore the relationship between key natural and social factors and the transmission of COVID-19 in China. We collected the number of confirmed cases of COVID-19 in 21 provinces and cities in China as of February 28, 2020. Three provinces were included in the sample: Hainan, Guizhou, and Qinghai. The 18 cities included Shanghai and Tianjin, among others. Key natural factors comprised the monthly average temperatures in January and February 2020 and spatial location as determined by longitude and latitude. Social factors were population density, Gross Domestic Product (GDP), and the numbers of medical institutions and health practitioners, as well as the per capita values for GDP, medical institutions, and health practitioners. Excel was used to collate the data and draw the temporal and spatial distribution maps of the prevalence rate (PR) and the proportion of local infection (PLI). The influencing factors were analyzed with SPSS 21.0 statistical software, and the relationship between the dependent and independent variables was simulated by 11 models. Finally, we chose the exponential model according to the value of R² and the applicability of the model.

On January 30, 2020, the World Health Organization (WHO) issued a statement declaring novel coronavirus disease 2019 (COVID-19) a Public Health Emergency of International Concern [1]. COVID-19 is caused by a novel coronavirus. This virus was named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) by the International Committee on Taxonomy of Viruses on February 11, 2020 [2]. Compared with Severe Acute Respiratory Syndrome Coronavirus (SARS-CoV) and Middle East Respiratory Syndrome Coronavirus (MERS-CoV), SARS-CoV-2 poses a more serious global health threat [3]. Compared with Severe Acute Respiratory Syndrome (SARS), COVID-19 has presented a higher incidence, a wider spread, and faster transmission in mainland China [4,5].
In March, the WHO declared that there was a global pandemic of COVID-19 [6][7][8]. According to the real-time statistics of Johns Hopkins University in the United States, as of May 19, 2020, the number of confirmed cases of COVID-19 in the world had exceeded 4.73 million and the number of deaths had reached more than 316 thousand [9]. According to available data, the cumulative number of confirmed cases of COVID-19 in the United States had exceeded 1 477 516, the largest number of cases reported in any country in the world. The Russian Federation had the second highest number of confirmed infections, with a total of more than 299 941 cases [9]. In China, the cumulative number of reported confirmed cases had exceeded 84 500 [9], and evidence shows that the epidemic has been generally controlled in China [6].
The results of epidemiological investigations of patients with COVID-19 have shown that the main clinical manifestations are fever, dyspnea, dry cough, myalgia, fatigue, and, in a few cases, diarrhea and vomiting [10][11][12]. Its clinical features are similar to those of other coronavirus pneumonias [11]. Evidence from existing research indicates that COVID-19 is characterized by human-to-human transmission [11,13].
The basic reproduction number (R0) at the initial stage of the epidemic is estimated to be 3.58, that is, each patient infects more than three other people on average [14], indicating the general susceptibility of the population. At present, SARS-CoV-2 appears to be transmitted mainly through respiratory droplets and contact, and the digestive system may also be a potential route of infection [15]. In addition, when there is a high concentration of aerosol in a confined space, the virus may spread through aerosol after prolonged exposure [16]. This indicates that there are additional possible routes of transmission of COVID-19.
Current studies have reported potential factors affecting the prevalence rate (PR) of COVID-19, such as temperature and humidity. The results show that both temperature and humidity affect the spread and mortality of COVID-19 [17][18][19][20][21]; however, few studies have examined the factors affecting the proportion of local infection (PLI). We defined a confirmed case as an imported case if, within the longest incubation period before onset, the patient had a history of staying in Wuhan or other provinces and cities with exposure to novel coronavirus pneumonia cases. Cases infected by imported cases are called local cases, and the PLI is the proportion of local cases among the total number of confirmed cases.
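The definition of the PLI above can be illustrated with a minimal sketch. It assumes that every confirmed case is classified as either imported or local, so that the number of local cases equals confirmed minus imported cases; the function name and the case counts are hypothetical and are not taken from the study data.

```python
def proportion_local_infection(confirmed: int, imported: int) -> float:
    """Return the PLI: local cases divided by all confirmed cases.

    Assumption (not stated in the study): each confirmed case is either
    imported or local, so local = confirmed - imported.
    """
    if confirmed <= 0:
        raise ValueError("confirmed must be positive")
    if not 0 <= imported <= confirmed:
        raise ValueError("imported must lie between 0 and confirmed")
    local = confirmed - imported
    return local / confirmed


# Hypothetical region: 200 confirmed cases, 120 of them imported.
pli = proportion_local_infection(confirmed=200, imported=120)
print(f"PLI = {pli:.2f}")
```

A region in which most confirmed cases were imported therefore has a low PLI even if its PR is high.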
Consequently, our team collected data relating to the number and source of confirmed cases from 21 provinces and cities in China. The focus of this study was restricted to confirmed cases in China, to eliminate the impact of cultural and policy differences. To explore the factors affecting the PLI in COVID-19 outbreaks across China, we examined the key natural factors of temperature and spatial location determined by latitude and longitude. Social factors included population density, Gross Domestic Product (GDP), and the numbers of medical institutions and health practitioners, as well as the per capita values for GDP, medical institutions, and health practitioners.

Methods
To explore the factors that may affect the PLI, this study collected data on the number of confirmed cases of COVID-19 in the provinces of Hainan, Guizhou, and Qinghai, and in the cities of Shanghai, Tianjin, Chongqing, Xiamen, Changsha, Hangzhou, Xi'an, Zhengzhou, Changchun, Shenzhen, Linyi, Xinyang, Nanyang, Changde, Xiangtan, Nanning, Nantong, and Xuzhou as of February 28, 2020. The data were collated and organized in Excel 2010, and temporal and spatial distribution maps were drawn to represent the PR and the PLI. SPSS 21.0 statistical software was used for data analysis. In this study, the PLI was the dependent variable, and key natural and social factors were selected as independent variables. These included the monthly minimum, median and maximum average temperatures for January and February 2020, the spatial indicators of longitude and latitude, population density, GDP, and the numbers of medical institutions and health practitioners, as well as the per capita values for GDP, medical institutions and health practitioners. The relationship between the dependent variable and each independent variable was simulated by linear, logarithmic and 9 other models, and statistical significance was set at P < 0.05. The equations for the 11 models are as follows.
[Please see the supplementary files section to view the equations.] In the equations, the dependent variable is the PLI; the independent variables are the temperature in January and February, longitude, latitude, the numbers of medical institutions and health practitioners, GDP, and population density, as well as the per capita values for medical institutions, GDP, and health practitioners. b0, b1, b2, b3, and u are the coefficients of the models, as estimated by the statistical software from the data. Finally, we chose the exponential model according to the value of R² and the applicability of the model.
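The equations themselves are available only in the supplementary files. As a reading aid, and assuming that the 11 models are those offered by the Curve Estimation procedure of SPSS (an assumption, suggested by the number of models and by the coefficient names b0, b1, b2, b3 and u), they would take the forms below. The symbols Y for the PLI and x for an independent variable are chosen here for illustration only.

```latex
\begin{align*}
\text{Linear:}      \quad & Y = b_0 + b_1 x \\
\text{Logarithmic:} \quad & Y = b_0 + b_1 \ln x \\
\text{Inverse:}     \quad & Y = b_0 + b_1 / x \\
\text{Quadratic:}   \quad & Y = b_0 + b_1 x + b_2 x^2 \\
\text{Cubic:}       \quad & Y = b_0 + b_1 x + b_2 x^2 + b_3 x^3 \\
\text{Power:}       \quad & Y = b_0 \, x^{b_1} \\
\text{Compound:}    \quad & Y = b_0 \, b_1^{\,x} \\
\text{S-curve:}     \quad & Y = \exp(b_0 + b_1 / x) \\
\text{Logistic:}    \quad & Y = \frac{1}{1/u + b_0 \, b_1^{\,x}} \\
\text{Growth:}      \quad & Y = \exp(b_0 + b_1 x) \\
\text{Exponential:} \quad & Y = b_0 \exp(b_1 x)
\end{align*}
```

Under this reading, a negative b1 in the exponential model means that the PLI decreases as the independent variable increases, and a positive b1 means that it increases.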

Results
Spatio-temporal distribution of the PR and the PLI of COVID-19
The numbers of confirmed cases and local cases and the PLI of the selected 21 provinces and cities are shown in Table 1. In the spatio-temporal distribution map of COVID-19's PR, produced from the output of the modelling tools, different colors correspond to different PR values. By the end of February, the highest PR was observed in the city of Shenzhen, followed by Xinyang and Changsha. See Figure 1 for details.
In the spatio-temporal distribution map of COVID-19's PLI, the colors represent the median temperature gradients of the corresponding provinces and cities, and the numbers on the map refer to the PLI in the selected areas. By the end of February, the PLI in the city of Tianjin was the highest among the selected provinces and cities, followed by the city of Xiangtan; Hunan and Qinghai Provinces had the lowest PLI, as detailed in Figure 2.
When comparing the PR and the PLI, both were found to be higher in the city of Xinyang, whereas in the city of Tianjin the PR was lower and the PLI was higher. The observed values for the city of Shenzhen were the opposite of those for Tianjin.

Exponential model simulation of the PLI of COVID-19
Among the 21 selected regions, the population density of Qinghai Province was less than 0.01, which is significantly lower than that of the other 20 regions; Qinghai Province was therefore excluded from the exponential model simulation. The exponential model simulation results show that the independent variables of temperature, latitude, and population density affect the PLI. The values of B1 are -0.022, -0.024, -0.027, -0.024, -0.027, -0.031, 0.032 and -1.116 (for the six temperature variables, latitude, and population density, respectively), as detailed in Table 2. The exponential model simulation diagrams of the PLI and temperature in January and February are shown in Figure 3 and Figure 4, which indicate that the higher the temperature, the lower the PLI. From this analysis, if the temperature decreases by 1℃, the average PLI increases by 0.01. The exponential model simulation diagram of the PLI and latitude is shown in Figure 5, and that of the PLI and population density is shown in Figure 6. The corresponding values of R² are 0.297, 0.322, 0.349, 0.290, 0.314, 0.339, 0.344, and 0.301; the value of R² for the maximum average temperature in January is the largest (R² = 0.349). The influence of all other independent variables was not statistically significant, as detailed in Table 2.
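To make the exponential fit concrete, the following is a minimal sketch of how a model of the form PLI = b0 · exp(b1 · x) can be estimated by ordinary least squares on ln(PLI). This is a common way to fit such a model and is assumed here to correspond to what the statistical software does; R² is computed on the log scale. The function name, the temperatures and the coefficients (b0 = 0.5, b1 = -0.027) are hypothetical and serve only to generate synthetic, noise-free data; they are not the study's data.

```python
import math


def fit_exponential(x, y):
    """Fit y = b0 * exp(b1 * x) via least squares on ln(y).

    Returns (b0, b1, r_squared), where r_squared is computed on the
    log scale. All y values must be positive.
    """
    n = len(x)
    log_y = [math.log(v) for v in y]
    mean_x = sum(x) / n
    mean_ly = sum(log_y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (li - mean_ly) for xi, li in zip(x, log_y))
    b1 = sxy / sxx                      # slope of ln(y) on x
    ln_b0 = mean_ly - b1 * mean_x       # intercept of ln(y) on x
    ss_res = sum((li - (ln_b0 + b1 * xi)) ** 2 for xi, li in zip(x, log_y))
    ss_tot = sum((li - mean_ly) ** 2 for li in log_y)
    return math.exp(ln_b0), b1, 1.0 - ss_res / ss_tot


# Synthetic illustration only (hypothetical values, not the study's data).
temps = [-5.0, 0.0, 5.0, 10.0, 15.0, 20.0]
pli = [0.5 * math.exp(-0.027 * t) for t in temps]

b0, b1, r2 = fit_exponential(temps, pli)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, R^2 = {r2:.3f}")

# Predicted change in PLI when the temperature drops from 10 to 9 degrees.
delta = b0 * math.exp(b1 * 9.0) - b0 * math.exp(b1 * 10.0)
print(f"increase in PLI for a 1-degree drop at 10 degrees: {delta:.3f}")
```

In an exponential model the absolute change in the PLI per degree depends on b0 and on the temperature at which it is evaluated, whereas the relative change per degree is roughly -b1. With the hypothetical coefficients above, a 1℃ drop raises the predicted PLI by about 0.01, which is of the same order as the figure reported in the text; the b0 values of the study are not given here, so this is an illustration rather than a reproduction.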

Discussion
Compared with the spatio-temporal distribution of SARS in mainland China in 2003, COVID-19 showed different spatio-temporal aggregation patterns, which may be due to changes in social and demographic factors, different control strategies of local governments, and differences in the transmission mechanisms of the coronaviruses [4]. In this study, the PR of the selected provinces and cities showed a certain degree of spatial aggregation: the farther a region was from Hubei Province, the lower its PR. This may be because the cut-off date for the collection of confirmed cases was the end of February 2020, when the proportion of imported cases was relatively high, corresponding to the relatively high population turnover rates of nearby provinces and cities. The city of Shenzhen, however, is an exception. Although it is far from Hubei Province, its PR was also relatively high by the end of February. The reason may be that Shenzhen is a special economic zone with a high inter-provincial population turnover rate, which brings more imported cases to Shenzhen than to its neighboring provinces and cities; its low PLI also supports this hypothesis to some extent. While the PR in Shenzhen was high and the PLI low, the opposite was true for the city of Tianjin. The average temperature in Tianjin was between -5℃ and 2℃ in January and between -3℃ and 6℃ in February. In Shenzhen, the average temperature was between 13℃ and 20℃ in January and between 14℃ and 21℃ in February. It may be inferred that the PLI of COVID-19 is affected when the temperature is high, and this conjecture is supported by related studies [17].
A review of the SARS outbreak in 2003 shows that it gradually subsided as the weather warmed, was basically under control by April and May [22], and ended completely in July. This indicates that the change of temperature may have affected the outbreak of SARS [23,24], and SARS-CoV-2, which is similar to SARS-CoV [25], may thus also be affected by temperature. A study of a total of 24 139 confirmed COVID-19 cases in China and 26 other countries also found that temperature can significantly change the spread of COVID-19 [26]. The change in the early epidemic growth rate is consistent with the influence of local environmental conditions on the spread of the disease [27]. Guo's study also confirms that there is a negative correlation between R0 and temperature [18]. This suggests that there may be an optimal temperature for the spread of the virus, and that the rates of incidence and spread of COVID-19 are expected to slow down with the advent of spring and summer [19]. These results are in agreement with those of previous studies modelling the aerosol transmission of the influenza virus in animals [21]. The results of this project concur that the temperature in January and February influenced the PLI.
Some studies have shown that COVID-19 mortality has a significant negative correlation with temperature [20]. When the temperature increased, the mortality of ordinary and severe patients decreased significantly [22]. Previous studies have also shown that both cold and heat may adversely affect the mortality of respiratory diseases [28]. The results of this study show that the PLI of COVID-19 is negatively related to temperature, which may be because the viability of SARS-CoV-2 is reduced at higher temperatures, leading to a decrease in the incidence of local cases. The results also show that the higher the latitude, the higher the PLI, which supports this view, as temperatures are generally lower at higher latitudes.
A study in the United States shows that the growth rate of confirmed cases of COVID-19 is related to the size of the urban population, which means that the average rate of spread of SARS-CoV-2 is faster in big cities. If control measures are not taken, a larger proportion of the population will be infected in urban areas with larger populations [29]. However, according to the results of this analysis, population density had the opposite effect on the PLI of COVID-19 in China. A possible reason is that, after the city of Wuhan announced its closure, other provinces and cities in China with large populations and high levels of labor movement immediately began implementing epidemic prevention and control measures [30,31]. For example, Henan and Hunan Provinces are major labor export provinces, each exporting more than 15 million people, with Henan Province reaching 28.76 million. Examples of these measures included the prohibition of visiting during the Spring Festival; the closure of selected highway entrances and exits; the establishment of health and quarantine stations; the control of all provincial and country roads; and the isolation of Wuhan workers returning to their residences in other locations, together with the tracking of their close contacts. These measures contributed to the prevention of further outbreaks in other provinces of China, which may have altered the original impact of population density on the spread of COVID-19. This study shows that the higher the population density, the lower the PLI for the selected areas.
Studies have shown that the impact of the epidemic on the economy is very serious [32]. Before the outbreak of the epidemic, millions of people traveled around the world every year, contributing positively to the global economy. At the same time, the increase in employment opportunities in the tourism industry also contributed to global GDP. Since the spread of the global pandemic, however, the effect on tourism has had a negative impact on the global economy [33]. Experts believe that the impact of the COVID-19 epidemic may be more far-reaching than that of the Great Depression of the 1930s [34]. Most studies focus on the impact of the epidemic on GDP, whereas fewer studies focus on the impact of GDP on the outbreak. This study analyzed the effect of GDP on the PLI in 20 provinces and cities in China, and the results showed that the effect of GDP on the outbreak was not statistically significant. The reason may be that the implementation of the above prevention and control measures offset the impact of GDP on the outbreak of the epidemic.
The strict implementation of prevention and control measures has effectively contained the spread of the epidemic and reduced the demand for medical resources in the provinces of China other than Hubei [35]. This may be why the number of medical institutions, the number of medical institutions per capita, the number of health practitioners, and the number of health practitioners per capita had no statistically significant effect on the PLI in this study. Model simulations also suggest that, early in the epidemic or when the number of imported cases is limited, strict containment, defense, and suppression strategies can effectively reduce the demand for medical resources and avoid an impact on the medical system [36]. The tracking and management of close contacts of patients with COVID-19 are important components of prevention and control measures [37]. The emergency measures implemented by the Chinese government since late January have thus played a decisive role in the prevention and control of the epidemic, and the evidence shows that the active cooperation of the public can break the original chain of virus transmission [38]. For example, if close contacts of COVID-19 patients agree to be isolated, and the public avoids unnecessary outings and wears masks when going out, the transmission route can be effectively blocked and the number of infected people controlled, thereby reducing the burden on medical resources.

Limitations
This study has limitations. First, the sample size is small, as only the relevant data from 21 provinces and cities in China were selected. This may affect the results of the statistical analysis. Second, the impact of the various prevention and control policies and the willingness of the general public to co-operate need to be studied further.

Conclusions
Among the selected key natural and social factors, higher temperatures may decrease the transmission of COVID-19. From this analysis, if the temperature decreases by 1℃, the average PLI increases by 0.01. Further, locations at more northern latitudes had a higher PLI, and population density showed an inverse relationship with the PLI.

Availability of data and materials
The datasets supporting the conclusions of this article are included within additional file 1, and the datasets are publicly available.

Competing interests
The authors declare that they have no competing interests.

Figure 1 The spatio-temporal distribution map of COVID-19's PR. The map depicted in this figure was taken from Wikimedia Commons (http://commons.wikimedia.org/wiki/Main_Page). Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Figure 2 The spatio-temporal distribution map of COVID-19's PLI. The map depicted in this figure was taken from Wikimedia Commons (http://commons.wikimedia.org/wiki/Main_Page). The same note as for Figure 1 applies. This map has been provided by the authors.