Environmental Factors Contribute to the Transmissibility of COVID-19: Evidence from an Improved SEIR Model


COVID-19 is ravaging Brazil, and its spread shows spatial heterogeneity. Changes in the environment have been implicated as potential factors in COVID-19 transmission. However, despite considerable research efforts, the risk that environmental factors pose to COVID-19 transmission has not been elucidated from the perspective of infectious disease dynamics. The aim of this study is to model the influence of the environment on COVID-19 transmission and to analyze how the socio-ecological factors affecting the probability of virus transmission shifted dramatically in 10 states during the early stages of the epidemic in Brazil. First, this study used a Pearson correlation to analyze the interconnection between COVID-19 morbidity and socio-ecological factors, and identified factors with significant correlations as the dominant factors affecting COVID-19 transmission. Then, the time-lag effect of the dominant factors on COVID-19 morbidity was investigated by constructing a distributed lag nonlinear model, and the results were used to improve the SEIR model. Lastly, a machine learning method was introduced to explore the nonlinear relationship between the environmental propagation probability and socio-ecological factors. The analysis showed that population mobility directly caused by human activities had a greater impact on virus transmission than temperature and humidity. The heterogeneity of meteorological factors can be accounted for by the diverse climate patterns in Brazil. The improved SEIR model was adopted to explore the interconnection between COVID-19 transmission and the environment, offering a new strategy to probe the causal links between them.


Introduction
As the COVID-19 epidemic began to sweep across the world, the first case of COVID-19 in Brazil was reported on 25 February 2020 in São Paulo [1].
The World Health Organization (WHO) officially declared COVID-19 a pandemic in March 2020. Most countries, including Brazil, implemented widespread social distancing restrictions to mitigate the spread of the virus, and evidence indicates that these strategies can effectively reduce the number of cases and associated deaths [2].
According to the real-time statistics released by Johns Hopkins University, Brazil had approximately 7.676 million confirmed cases by the end of 2020, ranking third after the United States and India; thus, it was one of the most affected countries in the world.
In Brazil, a country of continental dimensions noted for its enormous socio-economic and environmental diversity, the spread of COVID-19 is very heterogeneous, affecting cities and rural regions differently [3]. As the primary component of the socioecological system, nonlinear complex global climate change presents a threat to human health in a variety of ways in relation to continuous COVID-19 transmission [4][5][6][7].
Recent studies suggest that COVID-19 spreads faster in cold and temperate climates and more slowly in warm, humid environments [8][9][10].
In addition, large-scale and diffuse human migration can affect the geographical distribution of infections and the spreading pattern of the pandemic [11], and COVID-19 transmission decreased significantly with mobility controls [12]. The social and natural environments therefore have a superimposed effect on COVID-19 transmission. However, most studies so far have analyzed the interconnection between the environment and COVID-19 transmission only through statistical analysis methods. The potential association has been explored only on a superficial level, resulting in different conclusions regarding the driving effects of environmental factors on COVID-19 transmission in the same area.
To date, a handful of studies have analyzed, made predictions about and evaluated the COVID-19 pandemic by modifying classical epidemiological dynamic models, including the SIR, SEIR, and SEIRD methods [13][14][15][16][17]. These studies combined the characteristics of COVID-19 and medical prevention and control measures to establish a dynamic model of infectious diseases in line with regional features to analyze and explore the trend of development of the pandemic. Although these epidemiological models are useful for estimating the dynamic process of COVID-19 transmission, they are primarily focused on the impact of social behavior on COVID-19 transmission, especially social distancing and quarantines, and few studies have considered the impact of meteorological factors on the transmission of COVID-19. It should be noted that it is also possible to comprehensively consider the influence of socio-ecological factors on COVID-19 transmission in the model, which can systematically explore the mechanism of environmental factors on COVID-19 transmission from the dynamic process of virus transmission.
Therefore, this study was intended to explore the role of the environment in COVID-19 transmission by improving the SEIR model so that the relationship between socio-ecological factors and active cases in the 10 seriously affected states of Brazil could be established. First, Pearson correlation was used to screen the environmental factors that significantly affected the pandemic trends and to identify which ones were dominant. Then, the time-lag effect of environmental factors on the spread of COVID-19 was analyzed. Considering the hysteresis of environmental factors, the classical SEIR infectious disease model was improved, and the nonlinear relationship between socio-ecological factors and COVID-19 transmission was built with a machine learning method to explore the impact of environmental change on COVID-19 transmission. In this study, environmental factors were added to the improved SEIR model as a process parameter, named the environmental propagation probability, and the least squares method was used to fit the trends of the simulated cases to the active cases. The environmental propagation probability was optimized with the associated optimization algorithm as part of this procedure. In this way, the impact of socio-ecological factors on virus transmission was analyzed from the perspective of infectious disease dynamics, and the role and importance of environmental changes in COVID-19 transmission were determined. This study provides a reference and scientific evidence for formulating COVID-19 prevention and control policies from the perspective of environmental change.

Study area
The Brazilian government maintains a dedicated COVID-19 portal that reports cases in real time. In this study, 10 severely affected states were selected from among the 27 states of Brazil, namely, Amazonas, Bahia, Ceará, Maranhão, Minas Gerais, Pará, Pernambuco, Rio de Janeiro, São Paulo, and Distrito Federal. The study area includes various climate types (Fig. 1, adapted from Alvares et al. [18]), covering longitudes from 34° 48′ 24″ W to 73° 47′ 44″ W and latitudes from 2° 37′ 56″ N to 25° 18′ 12″ S. Fig. 2 shows the COVID-19 data, including daily newly confirmed cases and active cases in the 10 Brazilian states.

Data collection
The COVID-19 dataset includes the daily number of cumulative confirmed cases, cumulative deaths, and cumulative cured cases of COVID-19 in the 10 states, which were officially reported by the Ministry of Health of Brazil from February 25 to December 31, 2020. This study concentrated on the interior of the Brazilian states.
Meteorological data were collected from the ERA5 reanalysis dataset (https://cds.climate.copernicus.eu) and were further processed to the state level using zonal statistical methods in ArcGIS 10.2.

Methods
In this study, the Pearson correlation was used to screen environmental factors, and those with a significant correlation between the environment and morbidity were identified as the dominant factors.

The distributed lag nonlinear model (DLNM) was first introduced by Gasparrini and Armstrong in 2010 to explore the influence of meteorological factors on human health, specifically to assess the effect of air temperature on health [19]. The central idea is the cross-basis function. Appropriate basis functions are selected for the exposure-response and exposure-lag dimensions, the cross-basis function is obtained as the tensor product of the two basis functions, and the cross basis is then included in the model for analysis. The DLNM has been used to assess the nonlinear and delayed effects as well as the relative risks (RR) of different environmental factors within the lag period [20][21][22]. In the DLNM used in this study, t is the observation day; E(yt) is the expected value of the observed incidence of COVID-19 on day t; α is the intercept; and cb is a cross-basis matrix used to estimate the nonlinear relationship between environmental factors and COVID-19 incidence and to describe the lag effects of the factors. In the cross basis, E is the environmental factor with 2 degrees of freedom (df), the lag is up to 15 days, and time is the indicator variable constructed using a natural spline to control for long-term trends.
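The model equation itself is not reproduced in this text. As a hedged sketch only, a DLNM with the terms defined above is commonly written in the following form. The log link and the notation ns(·) for the natural spline are assumptions, since the link function and the spline degrees of freedom are not stated here.

```latex
% Assumed generic DLNM form; the log link is an assumption, not taken from the text.
\log\!\big[E(y_t)\big] = \alpha + cb\big(E_t;\ \mathrm{df}=2,\ \mathrm{lag}=15\big) + ns\big(\mathrm{time}_t\big)
```

Under this assumed form, the relative risk at a given exposure level and lag is the exponentiated contribution of the cross-basis term relative to a chosen reference exposure.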
The classical SEIR model in epidemiology is used to assess the spread of infection and divides the population into susceptible (S), exposed (E), infectious (I) and recovered (R) populations, in which only infected people are considered to be infectious [23]. COVID-19 has been found to be highly contagious since it first appeared, and its spread has specific characteristics: both symptomatic and asymptomatic infected people are infectious, and people in the incubation period are contagious and can spread the virus further before they experience some or all of the symptoms [24,25].
In the improved model, α is the conversion rate of latent persons to infected persons, taken as the reciprocal of the incubation period, which is set at 5.2 days [24]; βA is the probability of transmission from asymptomatic infectious individuals to susceptible individuals, with a value of 0.4417 [26]; βS is the probability of transmission from symptomatic infectious individuals to susceptible individuals, with a value of 0.6043 [26]; γA is the probability of recovery of the asymptomatic infectious; γS is the probability of recovery of the symptomatic infectious; γq is the probability of recovery of the quarantined infectious, and the values of γA, γS and γq are all 0.1 [26]; δA is the probability of death among the asymptomatic infectious [26]; δS is the probability of death among the symptomatic infectious [26]; δq is the probability of death among the quarantined infectious, and the values of δA, δS and δq are all 0.0347 [26]; ε is the probability of lifting the quarantine, taken as the reciprocal of the quarantine period of 14 days; θ is the probability of the symptomatic infectious being quarantined, with a value of 0.8 [24]; ρ is the probability of quarantine, with a value of 0.5 [27]; σ is the proportion of infectious people who are symptomatic, with a value of 0.7 [24]; φ is the influence parameter of environmental factors on COVID-19 transmission, which is optimized; and N is the total population in each state, which is also a time-dependent parameter.
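The model equations are not reproduced in this text, so the following Python sketch only illustrates how a parameter such as φ could scale transmission and be estimated by least squares. The compartment wiring, the placement of φ in the force of infection, the constant (not time-varying) φ, the omission of ρ and ε, and the names `simulate_iq` and `fit_phi` are all assumptions for illustration; only the numerical rates come from the parameter list above.

```python
import numpy as np

# Rates quoted in the text; the compartment wiring below is an ASSUMPTION.
P = {"alpha": 1 / 5.2, "beta_a": 0.4417, "beta_s": 0.6043,
     "gamma": 0.1, "delta": 0.0347, "theta": 0.8, "sigma": 0.7}


def simulate_iq(phi, n_pop, e0, days, p=P):
    """Return daily quarantined infectious counts I_q from explicit daily steps."""
    s, e, i_a, i_s, i_q = float(n_pop - e0), float(e0), 0.0, 0.0, 0.0
    out = np.empty(days)
    for day in range(days):
        out[day] = i_q
        # phi scales the force of infection (hypothetical placement of phi).
        new_inf = phi * (p["beta_a"] * i_a + p["beta_s"] * i_s) * s / n_pop
        onset = p["alpha"] * e               # exposed becoming infectious
        leave = p["gamma"] + p["delta"]      # recovery plus death
        quarantined = p["theta"] * i_s       # symptomatic moved to quarantine
        s -= new_inf
        e += new_inf - onset
        i_a += (1 - p["sigma"]) * onset - leave * i_a
        i_s += p["sigma"] * onset - quarantined - leave * i_s
        i_q += quarantined - leave * i_q
    return out


def fit_phi(observed, n_pop, e0, grid):
    """Pick the phi on the grid that minimizes the sum of squared errors."""
    sse = [np.sum((simulate_iq(phi, n_pop, e0, len(observed)) - observed) ** 2)
           for phi in grid]
    return float(grid[int(np.argmin(sse))])


# Synthetic check: recover a known phi from noise-free simulated "active cases".
grid = np.linspace(0.1, 2.0, 191)
observed = simulate_iq(1.0, n_pop=1_000_000, e0=100, days=120)
phi_hat = fit_phi(observed, n_pop=1_000_000, e0=100, grid=grid)
```

A grid search is used here only because it needs no extra dependencies; the optimization algorithm actually used in the study is not specified in this text.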

Results
The Pearson correlation analysis (Fig. 4a) showed that the morbidity of COVID-19 was significantly correlated with temperature, relative humidity, population mobility and precipitation (P<0.01), with correlation coefficients of -0.38**, -0.534**, 0.738** and -0.573**, respectively. Wind speed and atmospheric pressure were not significantly associated with morbidity. This indicates that, to a certain extent, areas with lower temperature, lower humidity and higher population mobility are more likely to have higher morbidity from COVID-19, a finding that is consistent with those of many studies [28][29][30]. Because precipitation was significantly correlated with relative humidity, precipitation was excluded, and temperature, relative humidity and population mobility were selected as the dominant environmental factors for further analysis.
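The screening step above can be illustrated with a minimal Python sketch of the Pearson coefficient. The data values and the function names `pearson_r` and `t_statistic` are hypothetical and are not taken from the study; the t statistic is the usual one for testing a zero correlation, with n - 2 degrees of freedom.

```python
import numpy as np


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))


def t_statistic(r, n):
    """t statistic for H0: no correlation; compare with a t distribution, n - 2 df."""
    return float(r * np.sqrt((n - 2) / (1.0 - r ** 2)))


# Hypothetical illustrative series (not data from the study).
mobility = [0.2, 0.4, 0.5, 0.7, 0.9]
morbidity = [12.0, 18.0, 25.0, 30.0, 41.0]

r = pearson_r(mobility, morbidity)
t = t_statistic(r, len(mobility))
```

A factor would be retained as dominant when the p-value derived from this t statistic falls below the chosen threshold (P<0.01 in the text).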
The relative risk (RR) of environmental factors corresponds to different lag days (Fig. 4b-d). The COVID-19 data (from state outbreaks to the end of 2020) of 10 Brazilian states were added to the model for simulation. Fig. 5 shows that the simulated Iq agrees closely with the actual daily active cases, with R² values above 0.9, which indicates that the quasi-data assimilation algorithm adopted by the model can simulate historical data well. A multilayer neural network model was built with the dominant environmental factors as inputs and the optimized φ as the output.
The results (Fig. 6) demonstrated that population mobility between or within regions caused by human activities played the most significant role in the spread of COVID-19, followed by relative humidity and average daily temperature. The importance of temperature and relative humidity varied across regions, primarily due to the influence of climatic zones.
Brazil is a country with very diverse climate zones, spanning most climates in the tropics (Fig. 1). The results implied that only the temperature of the Rio de Janeiro and [35][36][37]. "Destroy ecosystems, especially forest ecosystems, and the virus can move from animals to humans," American science journalist David Quammen explains.
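The goodness of fit quoted above can be computed as in the following minimal sketch. It assumes that R² denotes the coefficient of determination between observed active cases and simulated Iq; the text does not state which definition was used, and the series below are hypothetical.

```python
import numpy as np


def r_squared(observed, simulated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    observed = np.asarray(observed, dtype=float)
    simulated = np.asarray(simulated, dtype=float)
    ss_res = np.sum((observed - simulated) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Hypothetical active-case series and two candidate simulations.
active = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
perfect = r_squared(active, active)        # identical series
biased = r_squared(active, active + 5.0)   # constant offset of 5 cases
```

With this definition, a simulation that only tracks the shape of the curve but carries a systematic offset is penalized, which a squared Pearson correlation would not do.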

Discussion
This study confirmed that environmental factors, including the temperature,

Conclusion
In summary, this study argued that changes in environmental factors can affect the

Figure 1
The location of Brazil with its Köppen climate types. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

The improved SEIR dynamic model.

Results of model fitting.

Figure 6
The importance of dominant environmental factors in the spread of the pandemic.
