Prediction of the epidemic trends of COVID-19 by the improved dynamic SEIR model

The outbreak of 2019 novel coronavirus disease (COVID-19) has become a public health emergency of international concern. The purpose of this study is to propose an improved dynamic SEIR (ID-SEIR) model to predict the epidemic trends of COVID-19. First, we obtain the values of the parameters in the ID-SEIR model by using the epidemic data of Wuhan as the training sample. Second, we use the proposed ID-SEIR model to predict the epidemic trends of COVID-19 for three of the most seriously affected areas, the USA, New York and Italy; the proposed method can also be applied to other countries and areas. Finally, we find that the proposed ID-SEIR model is reliable and can reasonably reflect the changes in national policies and public behavior during the epidemic. The model can also make predictions in line with the actual development of the epidemic and provide a reference for infection prevention and control.


Introduction
In late December 2019, an outbreak of novel coronavirus (2019-nCoV) pneumonia began in Wuhan (Hubei, China) and spread rapidly throughout the country. 2019-nCoV pneumonia was officially renamed 2019 novel coronavirus disease (COVID-19) on February 12, 2020. COVID-19 has since become a pandemic and has spread to more than 190 countries, areas or territories beyond China. As of April 26, 2020, a cumulative total of 84,341 confirmed cases and 4,643 deaths had been reported in China 1, and 2,725,984 confirmed cases with 189,158 deaths had been reported in other countries 2, with the United States of America (USA), Spain, Italy, France, Germany and the United Kingdom as the top six. WHO declared the COVID-19 outbreak a public health emergency of international concern on January 30, 2020 3.
To control the spread of COVID-19, China took unprecedented nationwide interventions on January 23, 2020. After more than two months of forceful infection prevention and control, China has basically curbed local transmission, and new confirmed cases now come mainly from abroad. China has accumulated rich experience in overcoming COVID-19, and Tian et al. [6] also found that the national emergency response appears to have delayed the growth and limited the size of the COVID-19 epidemic in China. However, the epidemic situation in the United States and European countries is not going well. The cumulative confirmed cases in the United States, Italy and Spain have each exceeded those in China. The characteristics of the virus and of patients are gradually becoming known as researchers continue to study COVID-19. The main mode of transmission of COVID-19 is person-to-person spread by respiratory droplets. Therefore, isolating patients and susceptible groups and reducing contact opportunities are effective ways to suppress the spread of the virus. Li et al. [3] report that patients with COVID-19 have a relatively long incubation period, 5.2 days on average, during which they are able to spread the virus, which increases the difficulty of infection control.
According to the symptom analysis of some early-stage patients in China, most patients were of the ordinary type and 25.5% were of the severe type; see Yang et al. [8] for details. Since the outbreak, many researchers have developed dynamic models to analyze and predict the trend of the pandemic. For example, Liu et al. [4] proposed a segmented logistic model to describe infection, death and cure of COVID-19. Tang et al. [5] proposed a deterministic "Susceptible-Exposed-Infectious-Recovered" (SEIR) model to show the transmission dynamics of the novel coronavirus and assess the impact of public health interventions on infection. Yang et al. [9] proposed a modified SEIR model to predict the COVID-19 epidemic peaks and sizes. Wei et al. [7] proposed the SEIR+CAQ dynamic model to fit and forecast the trend of COVID-19. Li et al. [3] proposed a stochastic transmission model to calculate the probability that newly introduced cases might generate outbreaks in other areas. Zhang et al. [10] estimated the trends in the demographic characteristics of cases and key time-to-event intervals, and estimated the dynamics of the net reproduction number at the provincial level using a Bayesian method.
However, although national infection prevention and control measures have been continuously adjusted, the previous references ignored both the adjustment of these measures and the improvement of treatment effect. Therefore, considering the characteristics of COVID-19, the types of infected population, the upgrading of infection prevention and control measures and the improvement of treatment, the present paper proposes an improved dynamic SEIR (ID-SEIR) model with dynamic parameters that change with time. We first fit the ID-SEIR model using the real epidemic data of Wuhan City, China, and then predict the epidemic trends of COVID-19 based on the epidemic data of the USA, New York and Italy and on the current situation. In this way, we provide references for infection prevention and control.

Improved dynamic SEIR (ID-SEIR) model
The SEIR model has become an important tool for studying epidemics; see, for example, Aron and Schwartz [1], Li et al. [2], Tang et al. [5], Wei et al. [7], Yang et al. [9], among many others. Then the total infectious population is

E t I t A t 
, where . We assume that the susceptible population   St is not necessarily infected in every contact with infectious population, then we let  denote the probability of being infected in each contact, where 0  Thus, the number of the infections at time and the number of population who has been in contact with infectious population but not infected is  , where the definitions of  and  can be found in Table1.
For the susceptible population who has been in contact with infectious population, whether they are infected or not, they must be quarantined.Assume that the probability of the susceptible population being tracked is  , where 0 1  , then the number of quarantined susceptible population is  at time t， and the number of exposed infections is Assume that some exposed infections will be quarantined, and the probability of exposed infections being tracked is ，where 0 1,   then the number of quarantined exposed infections is t Assume that the quarantined susceptible population will be released after a certain period of time, and the rate of release is ,  then the number of susceptible population is From the above discussions, we can obtain the following equations for the susceptible population

 
St and the quarantined susceptible population

E t I t A t S t dt E t I t A t S t E t I t A t S t S t dS t E t I t A t S t S t dt
After a period of time, the exposed population will be diagnosed, and assume that the proportion of symptomatic infections among all infected population is  .Thus, the number of asymptomatic at time , t and the quarantined infections will increase by , where  is the incubation rate.Then we have the following equations for the exposed population   Et and quarantined exposed population Note that the mild, ordinary and severe infected populations in the ID-SEIR model are undiagnosed infections, and assume that the probabilities of these three types of infections being diagnosed are i  respectively, where 01 i   and 1, 2,3.i  Thus, the number of the quarantined infections is at time , t where  is transfer rate of undiagnosed infected population.Note that the infected population will be cured or died after a period of time.For the mild infected population,    

t dt dI t E t I t i dt dA t E t A t dt dR t A t h h I t I t h dt dD t A t d d I t I t d dt
and the definitions of parameters in ID-SEIR model are given in Table 1.
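To illustrate how compartmental equations of this kind are solved numerically, the following minimal Python sketch integrates a classical four-compartment SEIR model with a forward Euler step. It is not the ID-SEIR model: the quarantined, asymptomatic, severity-specific, diagnosed and dead compartments are omitted, and the parameters are held constant rather than varying with time. The function name, the argument names and all numerical values except the 5.2-day incubation period are assumptions chosen for illustration.

```python
def simulate_seir(s0, e0, i0, r0, contact_rate, infect_prob,
                  incubation_rate, removal_rate, days, dt=0.1):
    """Integrate a classical SEIR model with forward Euler steps.

    Returns one (day, S, E, I, R) tuple per day, starting with day 0.
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    n = s + e + i + r                      # total population, stays constant
    steps_per_day = int(round(1.0 / dt))
    history = [(0, s, e, i, r)]
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            # New infections: contacts per day times infection probability
            # per contact, times the infectious fraction, applied to S.
            new_exposed = contact_rate * infect_prob * s * i / n * dt
            new_infectious = incubation_rate * e * dt
            new_removed = removal_rate * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_removed
            r += new_removed
        history.append((day, s, e, i, r))
    return history


# Illustrative values only (assumed, not fitted to any data set).
history = simulate_seir(
    s0=999_990, e0=0, i0=10, r0=0,
    contact_rate=10.0,          # assumed contacts per person per day
    infect_prob=0.05,           # assumed infection probability per contact
    incubation_rate=1 / 5.2,    # 5.2-day mean incubation period, Li et al. [3]
    removal_rate=1 / 10,        # assumed 10-day infectious period
    days=200,
)
peak_day, peak_infectious = max(((d, i) for d, _, _, i, _ in history),
                                key=lambda pair: pair[1])
print(f"Infectious compartment peaks on day {peak_day} "
      f"at about {peak_infectious:,.0f} people")
```

A time-varying model in the spirit of this section would replace the constant arguments with functions of the day index; the Euler loop itself would stay the same.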
Table 1 The definitions of parameters in ID-SEIR model

Parameter values in ID-SEIR model
To predict the epidemic trends of COVID-19 with our proposed ID-SEIR model, we first need to determine the values of the parameters in Table 1. In what follows, we discuss how these values are determined.
Note that the incubation rate is the rate at which the exposed population becomes infected, and Li et al. [3] show that the average incubation period of COVID-19 is 5.2 days and a whole quarantine cycle is 14 days. Then we take the incubation rate to be 1/5.2. In this paper, we take $d_3 = 0.1$ and $h_3 = 0.9$, because the number of deaths among undiagnosed infections cannot be obtained and the mortality of severe infections is higher than that of the total population. For simplicity, we fix    2, and the fitted epidemic trend is given in Fig. 3. Since our ID-SEIR model fits the COVID-19 data of Wuhan well, we use it to predict the epidemic trends of COVID-19 overseas and to provide references for infection prevention and control.
In the present paper, we apply the proposed ID-SEIR model to analyze the COVID-19 data of the USA, Italy and New York state, since the USA and Italy are among the worst-hit areas in the world and New York is the worst-hit area of the USA. Of course, our proposed ID-SEIR model can similarly be used to analyze the COVID-19 situation in other countries and areas.

Analysis of COVID-19 in USA
Up to April 26, 2020, "major disaster" declarations had been approved for all 50 states of the USA, and the cumulative confirmed cases in each of New York, New Jersey, Massachusetts, Illinois, California, Pennsylvania, Michigan, Louisiana, Florida, Connecticut, Texas and Georgia had exceeded 20,000. The USA treated COVID-19 like the flu in the early stage and implemented epidemic prevention and control relatively late. This allowed COVID-19 to spread rapidly in the USA.
The COVID-19 data for the USA were collected from March 11 to April 26 and are available from the official website of Johns Hopkins University. We apply our proposed ID-SEIR model to predict the peak of the cumulative infectious cases (PCIC) and the expected peak date (EPD) under different initial parameters. The prediction results are reported in Table 3 for different parameter values, where $S(0)$ denotes the initial value of the susceptible population, "PCIC" denotes the predicted peak of the cumulative infectious cases, 95% CI denotes the confidence interval of PCIC with nominal level 0.95, and "EPD" denotes the expected peak date. In addition, Fig. 4 shows the corresponding dynamic curves for the cumulative infectious cases. From Table 3 and Fig. 4, we have the following results.
(1) In the most extreme case, we take almost the entire population of the USA (330 million) as the initial value of the susceptible population. Taking a probability of being infected in each contact of 0.13 as an example, the predicted cumulative infectious cases will reach a peak on May 20, and the predicted value is about 1,313,200 (95% CI: 1,238,100-1,388,300). When this probability increases, the predicted peak of the cumulative infectious cases also increases while the expected peak date is postponed. For example, for a value of 0.17, the predicted cumulative infectious cases are about 1,632,800 (95% CI: 1,558,100-1,707,500), and the expected peak date is May 24.
(2) With this probability fixed, the predicted cumulative infectious cases decrease and the peak is reached sooner as the initial value of the susceptible population decreases. For a value of 0.13 and an initial susceptible population of 100 million, the predicted cumulative infectious cases will reach a peak on May 11, and the predicted value is about 1,123,100 (95% CI: 1,089,000-1,157,200).
(3) From Fig. 4, we can see that the cumulative infectious cases increase gradually to a peak and then decrease slowly as the epidemic develops. With the probability fixed at 0.17, the cumulative infectious cases grow faster as the initial value of the susceptible population increases. For fixed $S(0)$, the cumulative infectious cases grow more slowly as the probability decreases.
(4) We also find that our proposed ID-SEIR model fits the real cumulative infectious cases better with an initial susceptible population of 100 million and a probability of 0.17. In this case, the predicted cumulative infectious cases will reach a peak on May 14, and the predicted value is about 1,171,300 (95% CI: 1,144,600-1,198,000).
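As a consistency check on the figures quoted in items (1), (2) and (4), the short Python sketch below recomputes the midpoint and half-width of each reported 95% CI. The tuple layout, the labels and the helper name `interval_summary` are assumptions made for illustration; the numbers are the ones quoted in the text.

```python
# (label, PCIC, CI lower bound, CI upper bound), as quoted in the text.
usa_predictions = [
    ("S(0) = 330 million, parameter 0.13", 1_313_200, 1_238_100, 1_388_300),
    ("S(0) = 330 million, parameter 0.17", 1_632_800, 1_558_100, 1_707_500),
    ("S(0) = 100 million, parameter 0.13", 1_123_100, 1_089_000, 1_157_200),
    ("best-fitting setting, item (4)", 1_171_300, 1_144_600, 1_198_000),
]


def interval_summary(pcic, lower, upper):
    """Return (midpoint, half-width, half-width relative to PCIC)."""
    midpoint = (lower + upper) / 2
    half_width = (upper - lower) / 2
    return midpoint, half_width, half_width / pcic


for label, pcic, lower, upper in usa_predictions:
    midpoint, half_width, relative = interval_summary(pcic, lower, upper)
    print(f"{label}: midpoint {midpoint:,.0f}, "
          f"half-width {half_width:,.0f} ({relative:.1%} of PCIC)")
```

Each quoted interval is symmetric about its point prediction, and the intervals are narrower for the 100 million settings than for the 330 million settings.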

Analysis of New York's COVID-19
Since the outbreak of COVID-19 in New York, the number of new confirmed cases has increased exponentially and has exceeded that of the whole of China. However, the population of New York is only one third that of Hubei Province, China.
The COVID-19 data for New York come from the official website of the New York government. We apply our proposed ID-SEIR model to analyze the COVID-19 data from March 7 to April 26 and predict the peak of the cumulative infectious cases (PCIC) and the expected peak date (EPD) under different values of the probability of being infected in each contact and the probability of the susceptible population being tracked. The prediction results are reported in Table 4, and the dynamic curves for the cumulative infectious cases are reported in Fig. 5 for different values of these two parameters.
Fig. 5 The dynamic curves for the cumulative infectious cases for New York's COVID-19 data, with one parameter varied and the other fixed at 0.13
From Table 4 and Fig. 5, we have the following results.
(1) With the probability of the susceptible population being tracked fixed, the predicted cumulative infectious cases increase and the expected peak date is postponed as the probability of being infected in each contact increases.
(2) With the probability of being infected in each contact fixed, the predicted cumulative infectious cases decrease and the peak is reached sooner as the probability of the susceptible population being tracked increases. With both parameters equal to 0.15, the predicted cumulative infectious cases will reach a peak on May 9, and the predicted value is about 355,400 (95% CI: 330,900-379,900).
(3) Similar to the results of USA, the cumulative infectious cases will increase gradually to a peak and decrease slowly with the epidemic developing.For the fixed 0.13,   as  decreases, the cumulative infectious cases will have a higher growth than that with a small  .
(4) From Fig. 5, we can identify the parameter values for which our proposed ID-SEIR model best fits the real cumulative infectious cases. With these values, the predicted cumulative infectious cases will reach a peak on May 6, and the predicted value is about 309,200 (95% CI: 292,900-325,500).
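The PCIC and EPD reported in this section can be read off a simulated daily curve as its maximum and the calendar date on which that maximum occurs. The Python sketch below shows one way to do this, under that assumption. The function name `peak_of_series` and the toy curve are hypothetical and are not model output; only the start date of March 7 is taken from the text.

```python
from datetime import date, timedelta


def peak_of_series(start, values):
    """Return (peak_value, peak_date) for a daily series starting on `start`."""
    if not values:
        raise ValueError("values must not be empty")
    # Index of the first occurrence of the maximum value.
    peak_index = max(range(len(values)), key=lambda k: values[k])
    return values[peak_index], start + timedelta(days=peak_index)


# Hypothetical curve: rises for 40 days, then declines slowly.
toy_curve = [1000 * k for k in range(41)] + [40000 - 200 * k for k in range(1, 30)]
pcic, epd = peak_of_series(date(2020, 3, 7), toy_curve)
print(pcic, epd.isoformat())
```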

Analysis of Italy's COVID-19
Italians have taken protective measures, such as wearing masks and washing hands frequently. Therefore, the contact infection rate should decline with time. Under the forceful measures taken by Italy, the probability of the susceptible population being tracked should gradually increase with time.
Compared with Wuhan, China, Italy's domestic medical resources are relatively scarce, and patients need to queue for hospitalization. This means that the probabilities of being diagnosed and isolated are lower than in Wuhan, and the proportion of patients who die is higher than in Wuhan. The Italian government has actively taken measures to respond to COVID-19 and has continuously purchased medical equipment. Therefore, the probabilities of being diagnosed and isolated should increase, and the proportion of patients who die should decrease over time. According to the actual situation in Italy, the parameters of the ID-SEIR model are adjusted as in Table 5, and the other parameters are the same as those in Table 2. In Table 6, which covers different parameter values, "PCIC" denotes the predicted peak of the cumulative infectious cases, 95% CI denotes the confidence interval of PCIC with nominal level 0.95, and "EPD" denotes the expected peak date.
Fig. 6 gives the predicted curve of COVID-19 for cumulative infectious cases in Italy. The predicted cumulative infectious cases will reach a peak on June 1, and the predicted value is about 233,400 (95% CI: 231,200-235,700). In addition, the average fitting deviation rate excluding the first five days is about 6.21%; the first few days are excluded because the small number of cumulative cases at that stage leads to relatively large fitting deviations.

Conclusion and Discussion
In late December 2019, an outbreak of COVID-19 began in Wuhan (Hubei, China) and spread rapidly throughout the country. China then adopted unprecedented nationwide interventions on January 23, 2020 to limit the spread of the epidemic. Under the leadership of President Xi Jinping, China took a series of measures: for example, the whole country was quarantined, masks were strictly required, the national holiday was extended, strict limits on travel and public gatherings were introduced, public spaces were closed and rigorous temperature monitoring was implemented nationwide. By March 21, 2020, China had achieved great success in infection prevention and control.
As the province most severely affected by COVID-19 in China, Hubei Province also had no new confirmed cases for six consecutive days. It can be seen that the measures taken by China have played a key role in containing the spread of COVID-19.
However, COVID-19 is highly infectious and causes complex epidemics, and it has been spreading in almost all countries in the world. To predict the epidemic trends overseas, we applied our proposed ID-SEIR model to three of the most seriously affected areas, the USA, New York and Italy; the proposed method can also be applied to predict the epidemic trends of other countries and areas.
We find that the proposed ID-SEIR model can predict the epidemic trends of COVID-19 very well.
For the USA, the predicted cumulative infectious cases will reach a peak on May 14, and the predicted value is about 1,171,300 (95% CI: 1,144,600-1,198,000) with an initial susceptible population of 100 million and a probability of being infected in each contact of 0.17. For New York, the predicted cumulative infectious cases will reach a peak on May 6, and the predicted value is about 309,200 (95% CI: 292,900-325,500). For Italy, the predicted cumulative infectious cases will reach a peak on June 1, and the predicted value is about 233,400 (95% CI: 231,200-235,700). We also find that some parameters affect the predicted cumulative infectious cases and the expected peak date, for example, the initial value of the susceptible population, the probability of being infected in each contact, and the probability of the susceptible population being tracked. With both probabilities fixed, the predicted cumulative infectious cases decrease and the peak is reached sooner as the initial value of the susceptible population decreases. When the probability of being infected in each contact increases, the predicted peak of the cumulative infectious cases also increases while the expected peak date is postponed. The predicted cumulative infectious cases decrease and the peak is reached sooner as the probability of the susceptible population being tracked increases. Therefore, we hope that each country can learn from China's experience in infection prevention and control and adopt quarantine, strict mask wearing and the closure of public places, so as to reduce the susceptible population and the contact rate.
According to our analysis, the epidemic in several seriously affected countries and areas will continue for some time, and the elimination of COVID-19 will be a long-term battle. The proposed ID-SEIR model established in this paper is reliable and can reasonably reflect the changes in national policies and public behavior during the epidemic. The model can also make predictions in line with the actual development of the epidemic and provide a reference for infection prevention and control. However, the proposed ID-SEIR forecast model mainly focuses on the effect of time on the total infected cases and fails to reflect the effect of other possible factors or variables, such as travel, contact and other trajectory data, which will be the focus of future study.
The structural diagram of the improved dynamic SEIR (ID-SEIR) model adopted in the study for the novel COVID-19
The calibrated data and real data of the cumulative number of confirmed cases for Wuhan's COVID-19 data from January 23 to February 20, 2020
The dynamic curves for the cumulative infectious cases
The predicted curve of COVID-19 for cumulative infectious cases in Italy

Fig. 1 The structural diagram of the improved dynamic SEIR (ID-SEIR) model adopted in the study for the novel COVID-19
Since the diagnosed infected population $I_q(t)$ has been quarantined, they are not able to contact the susceptible population. Thus, there are five types of infectious population at any time $t$, including $E(t)$, $A(t)$ and $I_i(t)$, where $i = 1, 2, 3$.

Definitions
Probability of the susceptible population being tracked
Probability of being infected in each contact
Contact rate between the susceptible and infectious population
Transmissibility index of exposed infections compared with infected population
Probability of the mild infections being diagnosed
Probability of the ordinary infections being diagnosed
Probability of the severe infections being diagnosed
$h_1$: Proportion of cured infections among all undiagnosed mild infections
$h_2$: Proportion of cured infections among all undiagnosed ordinary infections
$h_3$: Proportion of cured infections among all undiagnosed severe infections
Proportion of the mild infections among all infected population
Proportion of the ordinary infections among all infected population
Proportion of the severe infections among all infected population
$d_1$: Proportion of dead infections among all undiagnosed mild infections
$d_2$: Proportion of dead infections among all undiagnosed ordinary infections
$d_3$: Proportion of dead infections among all undiagnosed severe infections
$h_A$: Proportion of cured infections among all asymptomatic infected population
$h_q$: Proportion of cured infections among all diagnosed infected population
Transfer rate of asymptomatic infected population
Transfer rate of diagnosed infected population
$d_A$: Proportion of dead infections among all asymptomatic infected population
$d_q$: Proportion of dead infections among all diagnosed infected population


From January 23, 2020, China adopted strong infection prevention and control measures, and by March 19, 2020 it had successfully curbed local transmission. Therefore, we use the COVID-19 data in Wuhan from January 23, 2020 to March 19, 2020 (by which time there had been no new confirmed cases for three consecutive days) as the training sample to obtain the optimal initial values of the model parameters. The COVID-19 data for Wuhan are available from the official website of the Hubei Provincial Health Commission, http://wjw.hubei.gov.cn. It is worth noting that, on February 12, 2020, Hubei Province began to count suspected cases with typical pneumonia imaging characteristics as clinically diagnosed cases. This resulted in 13,436 new confirmed cases in Wuhan City on February 12, a large jump compared with February 11. In the early stage of the outbreak, the reagents used for nucleic acid detection were in short supply, and many suspected cases were not included in the confirmed population in time. To better train the proposed ID-SEIR model with the Wuhan data, we first correct the data from February 5 to 11 so that the calibrated data can be used to obtain the values of the parameters in the model. Fig. 2 gives the calibrated data and the real data of the cumulative number of confirmed cases.

Fig. 2 The calibrated data and real data of the cumulative number of confirmed cases for Wuhan's COVID-19 data from January 23 to February 20, 2020
Thus, the values of the parameters in the ID-SEIR model for the Wuhan COVID-19 data are given in Table 2.
Table 2 The values of parameters in ID-SEIR model for Wuhan COVID-19 data

Fig. 3 The fitted epidemic trend for Wuhan's COVID-19 data from January 23 to March 19, 2020
Our proposed ID-SEIR model fits the epidemic trend for the COVID-19 data in Wuhan from January 23, 2020 to March 19, 2020 very well, with an average fitting deviation rate of about 5.58%.
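The text does not define the average fitting deviation rate. One plausible reading, assumed here, is the mean absolute relative error between the fitted and reported cumulative counts. The Python sketch below computes it under that assumption; the function name, the `skip_first` argument (which allows the first few days to be excluded) and the toy numbers are hypothetical and are not the Wuhan data.

```python
def average_deviation_rate(actual, fitted, skip_first=0):
    """Mean of |fitted - actual| / actual over days, skipping the first few."""
    if len(actual) != len(fitted):
        raise ValueError("actual and fitted must have the same length")
    pairs = list(zip(actual, fitted))[skip_first:]
    if not pairs:
        raise ValueError("no data points left after skipping")
    if any(a == 0 for a, _ in pairs):
        raise ValueError("actual values must be non-zero")
    return sum(abs(f - a) / a for a, f in pairs) / len(pairs)


# Toy numbers for illustration only.
reported = [100, 200, 400]
model_fit = [110, 190, 400]
rate = average_deviation_rate(reported, model_fit)
print(f"average deviation rate: {rate:.2%}")
```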
reports the dynamic curves for the cumulative infectious cases with different   0 S and  .

Fig. 4 Left plot: dynamic curves for the cumulative infectious cases with different $S(0)$ and parameter values. Right plot: dynamic curves for the cumulative infectious cases with different parameter values.
to April 26, 2020. Therefore, we consider that the proposed ID-SEIR model can predict the epidemic trends of COVID-19 very well if we take initial value
infectious cases are about 370,900 (95% CI: 344,400-397,400), and the expected peak date is May 11.
April 26, 2020. Therefore, we think that the proposed ID-SEIR model can predict the epidemic trends of New York's COVID-19 very well if we take

Fig. 6 The predicted curve of COVID-19 for cumulative infectious cases in Italy

COVID-19, we first propose an ID-SEIR model by considering the susceptible population, quarantined susceptible population, exposed population, quarantined exposed population, asymptomatic infections, infected population with different severity, diagnosed infected population, cured population and dead population at any time. Second, we obtain the values of the parameters in the ID-SEIR model by using the epidemic data of Wuhan as the training sample. Finally, we predict the epidemic trends of COVID-19


Table 3 Prediction results of epidemic trends for COVID-19 in USA

Table 4 Prediction results of epidemic trends for COVID-19 in New York
The situation of COVID-19 in Italy is still very serious. Until April 26, the total number of diagnosed patients in Italy had reached 197,675, with 26,644 deaths, which gives Italy a relatively high number of deaths. Based on Italy's COVID-19 data from February 20 to April 26, we predict the epidemic trends of COVID-19 in Italy with the proposed ID-SEIR model. Since the situation in Italy is different from that in China, we need to adjust some parameters in the ID-SEIR model to better fit and reflect the situation in Italy. After the outbreak of COVID-19, most

Table 5 Adjusted parameters of ID-SEIR model for COVID-19 in Italy
Since the COVID-19 outbreak in Italy occurred in late February, there were only 3 confirmed cases in Italy until February 20. Therefore, we collect the COVID-19 data from the official website of Johns Hopkins University. Based on the adjusted parameters, we apply the proposed ID-SEIR model to analyze the COVID-19 data of Italy, and report the prediction results in Table 6.

Table 6 Prediction results of epidemic trends for COVID-19 in Italy