The Spatiotemporal Transmission Dynamics of COVID-19 among Multiple Regions: A Modeling Study in China

Current explosive outbreak of COVID-19 around the world is a complex spatiotemporal process with hidden interactions between viruses and humans. This study aims at clarifying the transmission patterns and the driving mechanism that contributed to the COVID-19 epidemics across the provinces of China. Thus a new dynamical transmission model is established by ordinary diﬀerential system. The model takes into account the hidden circulation of COVID-19 virus among/within humans, which incorporates the spatial diﬀusion of infection by parameterizing human mobility. Theoretical analysis indicates that the basic reproduction number is a unique epidemic threshold, which can unite infectivity in each region by human mobility, and can totally determine whether COVID-19 proceeds among multiple regions. By validating the model with real epidemic data in China, it is found that (1) if without any intervention, COVID-19 would overrun China within three months, resulting in more than 1.1 billion infections; (2) high frequency of human mobility can trigger COVID-19 diﬀusion across each province in China, no matter where the initial infection locates; (3) travel restrictions and other non-pharmaceutical interventions must be implemented simultaneously for disease control; and (4) infection sites in central and east (rather than west and northeast) of China would easily stimulate quick diﬀusion of COVID-19 in the whole country.


Introduction
The pandemic coronavirus disease (COVID-19) is caused by a newly discovered coronavirus called SARS-CoV-2, which can spread from an infected person's mouth or nose in small liquid particles when they cough, sneeze, speak, sing or breathe [1]. Such easy transmission routes coupled with large human mobility quickly result in explosive outbreak across the world. As of June 10, 2021, this disease has attacked 212 countries, with a total of over 170 million confirmed cases and over 3.76 million deaths [1]. COVID-19 is disrupting global health, economic, political and social systems, and is posing comprehensive threats to people around the world.
The ongoing COVID-19 pandemic exhibits a clear time-space evolution. As the first case was reported in Wuhan, China on 29 December 2019, the disease has spread to all the provinces in China within a month [2]. By 21 February 2020, it has occurred in 27 countries, and the number of infected countries quickly increases to over 170 in late March. The infection size rose sharply from 282 to over 5 million during 5 months period. Such fast diffusion and hierarchical structures in time and place were possibly shaped by human behaviors (e.g., communication, work and movement). For example, after initial emergence in China, travel-related cases started appearing in other parts of the world with strong travel links to Wuhan [3]. This pattern along with the special characteristics of COVID-19: (1) high pathogenicity and hidden transmission among humans [4], (2) large asymptomatic patients with infectivity [5], (3) short serial interval [6], and (4) massive susceptibility [7], make it very difficult to assess the risk and control the infection. Recognizing the spatiotemporal transmission dynamics can help to forecast epidemic tendency, identify the potential drivers of transmission and high-risk population, and guide the designing of targeted interventions in resource limited settings.
Technically speaking, pure statistical model and mapping analysis can quantitatively tell the infection patterns in time and space. Mathematical frameworks incorporated epidemiology survey data can further capture the intrinsic variability of spatiotemporal transmission of epidemics, which are used increasingly in interdisciplinary studies [8]. Focusing on in the COVID-19 pandemic, many epidemiology-inspired models, including SIR, SIS, and SEIR models, had been built to study the spreading patterns [9,10,11,12,13,14]. By simulating the underlying transmission process, these studies found that (1) real-time mobility data from Wuhan can well elucidate the transmission in cities across China (2) various nonpharmaceutical interventions are effective in controlling the spread of the disease [13,15,16,17,18]; and (3) mobility networks of air travel can predict the global diffusion pattern at the early stages of the outbreak, and an unconstrained mobility would have significantly accelerated the spreading of COVID-19 [19].
This paper goes a further step to provide a new modeling framework with consideration of human mobility and surveillance data to clarify the hidden dynamics accounting for COVID-19 spatiotemporal transmission in China. Based on the deterministic compartment model, a multi-population transmission model of COVID-19 is established by ordinary differential equations (ODE). Qualitative theory is used to analyze the propagation dynamics of the model, including the expression of the basic reproduction number and the equilibria, the global stability of the diseasefree and endemic equilibria. Finally, this model is applied to investigate the detailed transmission patterns of COVID-19 across the provinces in China.

Modeling framework
To simulate the spatiotemporal transmission of COVID-19 across different regions, a new metapopulation dynamic model is proposed in this section. Based on the epidemiology features of COVID-19 and compartmental theory, the following basic assumptions are proposed.
• During the outbreak of COVID-19 infection, humans are divided into susceptible (S i ), exposed (E i ), preclinical infectious (I p i ), subclinical infectious (I s i ), clinical infectious (I c i ) and recovered (R i ) classes. Here I p i and I s i are inapparent infections, and I c i are apparent infections. The sum of these classes constitute the total population size, that is, It is assumed that N i is a constant, in which birth rate equals to death rate d. Here the subscript i denote the location of these parameters.
• The infection routes follows susceptible-latent-infected-recovered process. Individuals can be infected through contact with infectious individuals and then experience an incubation period 1/η. Exposed individuals progress to preclinical infectious (with probability ϕ) and subclinical infectious (with probability 1 − ϕ), subclinical infections with mild or no symptoms could not be easily found and treated, but they can self-recover after time 1/γ. Preclinical infections before symptom appear progress to become clinical infectious who develop symptoms after time 1/δ. They receive treatment and are cured successful through time 1/ω.
• When novel coronavirus carried by infected humans invades into a virgin area people there (local residents and visitors from other region) could be infected with certain probability. The model takes into account such spatial diffusion by incorporating a migration matrix P , in which element P ij denotes the average duration per unit time that the residents in region i stay in region j. It satisfies P ij ≥ 0 and ∑ j P ij = 1. Based on the above assumption, the essential features of the model framwork are depicted in Figure  1. Accordingly, the governing equations for simulating the spatiotemporal transmission dynamics  of COVID-19 are illustrated as follows: where λ j is the specific transmission rate in region j. The interpretation of other model parameters are presented in Table 1.

Basic reproduction number
The basic reproduction number R 0 , as one of the most important theoretical concepts in epidemiology, acts the critical measure of the transmissibility [20]. R 0 is interpreted as the average number of secondary cases that are produced by a single primary case in a fully susceptible population. In what follows, it is written S = (S 1 , S 2 , . . . , S n ) T and similarly for E, I p , I s , I c , R and N. The matrices and the column vector P i is the ith column of matrix (P ij ) n×n . System (1) can then be written as the following vectorial notation: Here for u ∈ R n , diag(u) denotes the n × n diagonal matrix whose main diagonal is u. It is clear that Ω= { (S, E, I p , I s , I c , R) ∈ R 6n + |0 ≤ S, E, I p , I s , I c , R ≤ N } is a compact absorbing and positively invariant set for (2). Direct calculation yields that system (2) has a disease-free equilibrium, denoted as Q 0 = ( S 0 , E 0 , I p0 , I s0 , I c0 , R 0 ) = (N, 0, 0, 0, 0, 0) . The basic reproduction number R 0 is calculated by using the theory of next generation matrix [20]. It is written as R 0 = ρ(F V −1 ), where F is the rate of occurring new infections, and V is the rate of transferring individuals outside the original group [20]. Here ρ represents the spectral radius of matrix. Based on the model (1), direct calculation yields that where I denote a identity matrix, and 0 is the zero matrix. It follows from the characteristic equation of F V −1 that the basic reproduction number is given by The three components of the R 0 are separately contributed by the infections preclinical, subclinical, and clinical states.

Global stability
The results concerning the global dynamics of system (2) is analyzed in the following.
Proof. It is denoted the expression of endemic equilibrium by S * , E * , I p * , I s * and I c * . Based on the equilibrium definition, letting the right-hand side of system (2) to be zeros and substituting S * , E * , I s * and I c * by I p * , it is obtained an equation about I p * as where Substituting I p * by 0 and N yields that f (0) = −dN < 0, and It follows from the zero-point theorem that system (2) has at least one positive equilibrium. Furthermore, due to f ′ (I p ) = m 1 1 > 0, f is an increasing function. Hence there exists a unique positive endemic equilibrium in the compact set Ω. (2) is globally asymptotically stable.
Proof. Since the total number of human population is a constant, the first equation of system (2) can be ignored.

The corresponding comparison system is
It is clear that model (4) is a linear system, and the coefficient matrix of its variables in the righthand side is exactly the matrix (F − V ). Hence, when R 0 = ρ ( F V −1 ) < 1, the unique equilibrium (E, I p , I s , I c ) = (0, 0, 0, 0) of this linear system (4) is globally asymptotically stable. Since According to the comparison theorem, with the same initial conditions, it has E(t) ≤Ē(t), I p (t) ≤ I p (t), I s (t) ≤Ī s (t), and I c (t) ≤Ī c (t) for any t > 0, yielding that Q 0 is globally asymptotically stable when R 0 < 1.
To establish global stability results of the endemic equilibrium, it is used the graph-theoretic method as presented in [21] and [22]. Theorem 4.3. If R 0 > 1, then, the unique endemic equilibrium Q * of system (2) is globally asymptotically stable in Ω.

Proof. Denote
where the variable with superscript as star are the expressions of endemic equilibrium in the model. Using the inequality 1 − x + ln x ≤ 0, for x > 0, direct differentiation yields: Here, Similarly, , n i i a + Figure 2: Digraph representation of the matrix A of transmission used to determine the coefficients in the Lyapunov function D. and as well as a n+i,i = ϕηE * i , a 2n+i,i = (1 − ϕ) ηE * i , a 3n+i,i =δI p * i . Let A = (a ij ) n×n with a ij > 0 as defined above and otherwise zero. The corresponding weighted digraph is shown in Figure 2. Along each of the cycles on the graph, it is verified that ∑ G ij = 0; for instance, G i,n+i + G n+i,i = 0, G j,n+i + G n+j,j + G i,n+j + G n+i,i = 0, and so on. It follows from Theorem 3.5 in [21] that there exist constants c i such that D = ∑ i c i Q i is a Lyapunov function for system (2). Let c 1 = · · · = c n = 1, and Further computation leads to .
Hence, with the functions D i and constants c i given above, the expression is a Lyapunov function for system (2). Its derivative is .
it is clear that system () has a unique equilibrium R i = R * i , which is global asymptotically stable. Using LaSalle's Invariance Principle, it is concluded that the endemic equilibrium Q * is global asymptotically stable in Ω.

Application to the outbreak in China
In this section, the proposed model is applied to analyze the spatiotemporal transmission dynamics of COVID-19 in Chinese provinces. Daily records of human infections were collected from authoritative data report. The permanent population size in each province were released by the 2019 National Bureau of Statistics. The daily migration data among provinces is collected from Baidu migration data (https://qianxi.baidu.com/). The model is validated by using Markov chain Monte Carlo (MCMC) method to fit the daily reported cases in 26 provinces (with cases more than 101 from 5 January 2020 to 15 March 2020). Here 6 parameters (β, α, κ, and the initial values of E, I c and I p in HuB) were estimated by MCMC. Since HuB province is considered to be the infection source, it is assumed that there is no infections in other provinces at initial time. The transmission rate λ i is derived from the effective reproduction number R t in province i. R t represents the number of new morbidity cases caused by an average morbidity case at time t. Here the R t in each province is estimated from the time series of its indigenous cases. Based on Bayesian framework, R t is calculated by the EpiEstim package in R language software [23], in which the intergenerational time follows gamma distribution, with the mean value and standard deviation as 7.5 and 3.4 respectively [24].
The fitting results are shown in Figures 3 and S1 (in Supplementary Information). It is found that the model performed well in fitting the daily reported incidences, except the data in some provinces such as HeB, ZJ, HeN, HuN, CQ and GZ. The fitting deviations are possibly due to the spatiotemporal heterogeneity of transmission parameters and detection efficiency. PRCC coefficients are used as global sensitivity to quantify the response of model outputs to the variation of the estimated parameters. By averaging the daily PRCC coefficients in the operation of fitting daily incidences, it is found that the output is strongly sensitive to the effective transmission rate of clinical infection (β) and the relative coefficient of migration matrix (κ), followed by the effective transmission rate of subclinical infection (α). Yet it seems that in the entire infection process the output is scarcely sensitive to the initial condition of the model. The reason for the negative correlation of β and κ with model output is that for given R t , small values of β and κ mean large values of transmission rates λ.
In the following simulations of the model, it is set that (1) Figure 5 shows the ranking of total infections in China with a unique infection source at initial time and human mobility at the entire process. Here it is assumed that there is no implementation of intervention, which is realized by setting the reproduction number in each province to be the value in early infection stage in HuB (equal 3.56). By simulating the transmission process through 57 days, it is found that (1) the initial infection located in HeN, ZJ, SH, JS and AH would caused the top five numbers of human cases (over 300 million); (2) when the initial infection is located in XZ, QH, JL, HLJ, XJ, it would lead to smallest infection sizes (around 142-183 million). Moreover, by simulating the transmission process through 120 days, it is observed that the infection would reach a saturated state: more than 1.1 billion people could be infected no matter where the infection initially occurs. In this case, all provinces reach the highest levels of new infections after about  two months (see Figure S3), but the attack rate exhibits spatial heterogeneity, in which the area near the initial infection source usually suffers worse.
H X Z Accumulative number of human cases  It is observed that more sites with initial infection and more frequency of human mobility would yield a little faster diffusion of the disease (that is more obvious in early infection period) and a little earlier arriving of the peak. When the disease starts to spread from January 23, 2021, the numbers of human cases would reach peak around early April or late March, in case of one initial infection site(XZ), or two initial infection sites (HeN and GD). However, in case of all size with initial infection, the peak is arriving basically in the middle of March, regardless of population mobility. In these four settings, the infection would last for three months, causing similar number of total infections.

Discussion
COVID-19 is posing increasing threats to public health around the world. Clear information about epidemiologic features and transmission patterns of the disease can help to control and prevent COVID-19 transmission. The present study is an attempt to provide a modeling framework to infer COVID-19 spatiotemporal transmission patterns by focusing on the COVID-19 outbreak in 31 provinces of China.
Since the outbreak of COVID-19, many epidemiological models have been proposed and applied to study the propagation of COVID-19. Focusing on the spatiotemporal transmission, modeling framework including mathematical model (e.g., ordinary/partial differential equation [9,10,11,12,15,16,19], difference equations [13],) computational model (e.g., agent-based model [17] and nextgeneration algorithm [18]), and statistical model (e.g., stochastic model [14] and ArcGIS [25]). Inspired by existing studies, this paper presented a new mathematical model via ODEs, which couples the intrinsic transmission dynamics, including the disease evolution in humans among different states (susceptible, exposed, preclinical infectious, subclinical infectious, clinical infectious and recovered), infection action by human-human contact, and human mobility among different regions. Moreover, the effects of human behavior and control strategy were characterized by model parameters, which can regulate the the spatiotemporal infectivity and transmissibility. Finally, MCMC algorithm was employed to estimate the uncertain parameters and then to evaluate the model. The modeling framework is an update of recent studies. By validating the proposed model with surveillance data in 31 provinces of China, the spatiotemporal transmission dynamics and the effects of human mobility and interventions were clarified, which can providing the following clues for guiding disease control.
First, there is a unique epidemic threshold, denoted by basic reproduction number R 0 , which can totally determine whether COVID-19 proceeds among multiple regions. If R 0 < 1, no matter how many infection sources there are, COVID-19 will always die out. Otherwise, the disease will persist in each region. R 0 unites infectivity in each region by human mobility. Such mobility contributes to transmission in two ways: susceptible persons in other regions could be infected when traveling to outbreak area, and infected persons may bring COVID-19 virus from outbreak area to other regions. Particularly, when R 0 > 1, no region can escape from infection if there exists human mobility among them.
Second, the effects of the implemented intervention in China are further evaluated. By using the proposed model to simulate the spatiotemporal transmission dynamics, it is found that if the interventions (e.g., social distancing, city lockdown, etc.) had not been implemented in China, COVID-19 would prevail all around China and the transmission would last about three months, resulting in over 1.1 billion patients. In this case, more than 78.6% population in China would be infected by COVID-19 virus, which is over 11,981 times of the total number of reported cases. The estimated effects of interventions are much more significant than previous results, which claimed that (1) if without non-pharmaceutical interventions in China, the number of cases was predicted to be 7.6 million by 29 February 2020 [15], or 37 million by 5 March 2020 [26], or increase the total infections by 93.7% [27]. The reason for the severity of our estimation could be that this study highlights the intrinsic spatiotemporal transmission dynamics and the total infection process.
Third, the role of human mobility in COVID-19 transmission is further clarified. Similarly to previous studies [13,25,27], it is verified that human mobility (by travel) can spark new infections in virgin areas and high frequency of human mobility in reality has driven COVID-19 diffusion across the 31 provinces of China. The present paper further indicates that the effects of human mobility in the spatiotemporal transmission of COVID-19 is more prominent in two cases: early stage of infection and when R 0 is a little bigger than one. If without intervention inside region, then human mobility would accelerate disease propagation across different regions, but it could not modify the number of total infections, unless travel is banned at the very beginning of infection. Hence, regional human migration plays as a trigger in the preliminary stage of infection, and then locally contracted infection dominates the following transmission process. The results demonstrate that non-pharmaceutical intervention is the core strategy, and travel ban at the same time can slow down the process and suppress incidence rate.
Fourth, the transmission patterns of COVID-19 in the whole country are further inferred. The initial infection located in central and east of China (HeN, ZJ, SH, JS and AH) would easily stimulate quick outbreak and large infection, but adverse consequence is observed if initial infection is located in west and northeast of China (XJ, HLJ, QH, XZ, GS, NMG, and YN), in that there exists less population flow. Yet if without any intervention, the transmission would continue three months, and then no matter where the outbreak occurs and how many sits do initial infection locates, the infections of COVID-19 would reach a saturation level, and more than 87% people in China would be infected.
In view of current situation of COVID-19 pandemic, China is facing high risk of sporadic outbreaks due to imported infections and is making great efforts for prevention. To control this disease, beside promoting vaccination (that is precisely what China is doing), the present study suggests that (1) identifying and isolating imported case is the primary mission, which can be accomplished by monitoring the travellers from foreign countries by tight and thorough surveillance system; and (2) in case of autochthonous infection, strict non-pharmaceutical interventions must be taken as soon as possible, including tracking close contacts and quarantine, travel restriction, lockdown of high risky community. Indeed, such intervention strategies are exactly as China is implementing. By doing so, more than 99.99% human infections would has been avoided according to this study.
Here several limitations need to be clarified. (1) The COVID-19 incidence data was based on public report information, which may yields data deviation from reality. (2) The biological parameters applied in the proposed model were extracted from literature, which may show geographical disparities.
(3) The model did not take into account the potential factors such as the difference of immunity and infectivity. Nevertheless, the model captures the dynamic evolution of disease in time and place, and incorporated the biologically intuitive parameterizations. It matches well with spatiotemporal data by fitting several parameters, lending confidence to the analysis and justifying the model's further generalization.
In summary, this paper develops an inference technique to identify COVID-19 transmission patterns and it is applied to explore the COVID-19 transmission patterns in the provinces of China. The proposed model takes into account the essential effects of human mobility and disease evolution, which allow to capture the hidden spatiotemporal dynamics and internal mechanism of COVID-19 transmission. The obtained results support the interventions that are being implemented in China.
Authors contributions HL and GZ designed the study. QJ, JL, HL and FT collected the data. QJ, JL, HL and GZ built the model and performed the analysis. QJ, JL and FT interpreted the data. QJ and GZ prepared the manuscript. All authors reviewed and approved the submitted manuscript.