COVID-19 propagation prediction and assessment method with imported cases and infection generations: Shanxi province as a case

While attention has focused on the 2019 coronavirus disease (COVID-19) in Hubei province, the epidemic in other provinces cannot be ignored, since it also affects the epidemic in the whole country. The most distinctive epidemic characteristic in all regions except Wuhan is that most confirmed cases are imported cases from Wuhan, so the propagation chain is relatively clear. Based on detailed contact tracing information of confirmed cases, combined with first-level outbreak response measures, we establish a disease transmission dynamical model to describe the infection propagation chain among the human population. Using Shanxi province as a case, modeling results indicate that the epidemic peak in Shanxi province occurred on February 2. In addition, our model suggests that, according to the current development trend, COVID-19 will disappear in February with a final epidemic size of approximately 175 cases. It is verified that the most effective outbreak control measures in Shanxi include home isolation of people, surveillance and isolation of second-generation cases, and contact tracing and management of contacts. With the end of the holiday, if the average number of contacts per person per day is less than 6, there is little impact on the incidence of COVID-19, and even if third- and fourth-generation cases occur, the epidemic will be under control no later than late March, with a final outbreak size of 220 cases. However, if the average number of contacts per person per day is greater than 6, COVID-19 cases will continue to be reported, resulting in another epidemic peak. Through the forecast and evaluation of COVID-19 in Shanxi, it is verified that the model with infection generations describes the mode of spread more accurately and can be extended to other regions with imported cases.


INTRODUCTION
On December 8th, 2019, the first case of unexplained pneumonia was reported in Wuhan city, Hubei province, People's Republic of China. Because of the long incubation period (95% confidence interval [CI], 4.1 to 7.0 days, and sometimes longer) before clinical symptoms appear [1], and because people travelled to and from affected areas by airplane and high-speed rail, the new coronavirus spread quickly and widely. From January 13, confirmed cases started to appear in other countries and in other Chinese provinces. The high infectivity and household clustering of COVID-19 infections raised considerable health concern and panic among people. From January 25, all provincial governments initiated a first-level emergency response. The measures included mandatory registration of all people arriving from Wuhan city, their family members and related contacts; home quarantine of the registered individuals for at least 14 days; suspension of bus and taxi services; closure of all entrances and exits of expressways leading to the city; surveillance for new COVID-19 cases; and contact tracing and management of contacts. In addition, all grass-roots organizations, government organs, enterprises and institutions were required to strengthen personnel management and health monitoring services. All of these measures are designed to improve early detection, reporting, quarantine and treatment of COVID-19 cases. On January 26, the State Council issued a directive extending the Spring Festival holiday until February 10.
With the series of prevention and control measures implemented locally, the outbreak in all provinces except Hubei appeared to be under control by February 10. However, with the end of the Spring Festival holiday, people started to return to work, and contact between people in the workplace and on public transport may increase the risk of transmission of SARS-CoV-2. This raises a number of questions. How many COVID-19 cases are being missed and not isolated? How much impact does the end of the holiday have on SARS-CoV-2 transmission? Will a second outbreak peak occur as people resume work after the Spring Festival? To answer these questions, mathematical models can be constructed to describe the transmission dynamics of COVID-19. The spread of an infectious disease involves three basic elements: the source of infection, the transmission routes and the characteristics of the susceptible population. For Shanxi province, the sources of infection are mainly imported cases from Wuhan city (called first-generation cases), and the main transmission route is close contact between imported cases and local people (who become second-generation cases). Therefore, departing from the traditional modeling approach, we establish a dynamical model with infection generations to describe the propagation chain of COVID-19. In addition, outbreak control interventions such as surveillance for new cases and contact tracing and management of contacts can also be well reflected in the model. Disease dynamics models originate from the general Kermack-McKendrick model (SIR, SIS) and have been extended to various model types and specifications.
Dynamical models are based on the transmission mechanism of infectious diseases and address the limitation of small sample data, allowing infectious disease transmission to be parameterized in order to predict the trend of the infection and assess the impact of prevention and control measures. A number of models of the transmission dynamics of COVID-19 have been put forward [2][3][4]. These models use differential equations, which assume that the residence time of an individual in a compartment is an exponentially distributed random variable. In fact, from actual COVID-19 case data, we note that normal, log-normal or gamma distributions are more appropriate. Therefore, there is a need for modeling frameworks that treat the residence time as normally distributed and for an integration modeling method that accounts for the spread of SARS-CoV-2 between infection generations.
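The contrast between exponential and general residence times can be written out explicitly. The block below is a standard illustration rather than a formula taken from this paper: the removal rate \gamma, the inflow A(t) and the zero initial condition are assumptions made for the example. With an exponential residence time the compartment obeys an ordinary differential equation, whereas a general residence-time distribution G (for example a normal distribution with mean \mu and standard deviation \sigma) leads to an integral form.

```latex
% Exponential residence time with rate \gamma (assumed), starting from I(0) = 0:
\frac{dI}{dt} = A(t) - \gamma I(t)
\quad\Longleftrightarrow\quad
I(t) = \int_0^t A(\tau)\, e^{-\gamma (t-\tau)}\, d\tau .

% General residence time with cumulative distribution function G:
I(t) = \int_0^t A(\tau)\, \bigl[\, 1 - G(t-\tau) \,\bigr]\, d\tau ,
\qquad
G(s) = \Phi\!\left( \frac{s-\mu}{\sigma} \right) \ \text{for a normal residence time.}
```

The exponential case is recovered by setting G(s) = 1 - e^{-\gamma s}, which is why the integral form can be seen as the more general of the two.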
In this study, using reported COVID-19 case data from Shanxi province, we establish an integration model to forecast the epidemic trend, peak time and peak number of COVID-19 cases, and to estimate the duration of the epidemic. In addition, we assess the effect on these estimates of outbreak response interventions, including surveillance, tracing and isolation of confirmed cases and their contacts. Through this study of Shanxi province, we verify the rationality and efficiency of the model.

Materials
Data source: The officially reported confirmed cases and their detailed track information [5]. Household population data were obtained from the National Bureau of Statistics [6].
Prerequisites:
a) Since the number of confirmed cases is very small compared to the total human population of Shanxi province, the susceptible group is not considered in the model.
b) Once people are infected by the SARS-CoV-2 virus, several days may pass before clinical symptoms appear, but they are infectious during this period. It is therefore assumed that people are infectious as soon as they are infected.
c) Infected individuals are known to be able to spread the virus without showing any symptoms. Asymptomatic infection after the incubation period is nevertheless not considered in the model.

Dynamical Model
In this paper, we take infectious cases as the objects of study. Once people are infected by the SARS-CoV-2 virus, they first undergo an incubation period of several days. After the incubation period, once clinical symptoms appear, people typically spend some days on home medication before going to the hospital. At the hospital, they need one to three days for a nucleic acid test before being confirmed. According to whether or not they have been identified and confirmed, we divide the infectious cases into two classes: non-confirmed cases and confirmed cases. Non-confirmed cases are people who have been infected and are infectious but have not yet been confirmed by medical institutions. After taking the nucleic acid test and being confirmed, non-confirmed cases become confirmed cases. Once confirmed, cases are treated in isolation in hospital and no longer contribute to transmission. The number of non-confirmed cases at time t is denoted by I(t). According to the propagation chain, they are divided into three subclasses: the first generation, the second generation and the third generation, whose numbers at time t are denoted by I_1(t), I_2(t) and I_3(t). First-generation non-confirmed cases are the non-confirmed cases imported from Wuhan city. Second-generation non-confirmed cases are local non-confirmed cases infected by first-generation cases, and they can be divided into two parts: family non-confirmed cases and non-family non-confirmed cases. Family non-confirmed cases are family members of first-generation cases who are infected but not yet confirmed. Non-family non-confirmed cases are people infected by first-generation non-confirmed cases, generally in public places or on public transport. Third-generation non-confirmed cases are local cases that were infected by second-generation non-confirmed cases and have not yet been confirmed; no such cases have been reported in Shanxi province.
For the first-generation non-confirmed cases, the number imported at time τ is A(τ), and these cases are confirmed by time t with probability G(t − τ). The daily number of close contacts per first-generation non-confirmed case is B = C_1 + C_2, where C_1 is the number of contacts with family members and C_2 is the number of contacts with non-family members. λ is the probability of infection after exposure. At time τ, the number of new second-generation non-confirmed cases is λ B I_1(τ). For family members and non-family members infected at time τ, the probabilities of being confirmed by time t are F_1(t − τ) and F_2(t − τ), respectively. Since all people from Wuhan city and their family members complied with home quarantine for at least 14 days, the effective number of contacts of family members with other people is 0. Non-family members make up a proportion σ = C_2/(C_1 + C_2) of the second-generation cases, and each of them has C_3 daily contacts with other people. If third-generation cases exist, the number of new third-generation non-confirmed cases at time τ is λ σ C_3 I_2(τ), and these cases are confirmed by time t with probability F_3(t − τ). The cumulative numbers of first-generation, second-generation and third-generation confirmed cases at time t are denoted by Q_1(t), Q_2(t) and Q_3(t), respectively. The transmission of the virus between these subpopulations is shown in Fig.1, and the corresponding dynamical model is system (1).
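System (1) itself is not reproduced in this text, so the following is only a minimal discrete-time sketch of the first two generations, written to be consistent with the integrals used later for fitting (for example ∫_0^t λ C_1 I_1(τ) F_1(t − τ) dτ for confirmed family cases). It assumes daily time steps, normally distributed confirmation delays, and the same delay distribution for family and non-family cases. The function names and all numerical inputs other than λ = 0.0149 are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal delay with mean mu and sd sigma."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def split_by_confirmation(inflow, mu, sigma):
    """Split a daily inflow of new cases into non-confirmed and cumulative confirmed series.

    confirmed[t]   = sum over tau <= t of inflow[tau] * CDF(t - tau)
    unconfirmed[t] = sum over tau <= t of inflow[tau] * (1 - CDF(t - tau))
    """
    unconfirmed, confirmed = [], []
    running_total = 0.0
    for t in range(len(inflow)):
        running_total += inflow[t]
        q = sum(inflow[tau] * normal_cdf(t - tau, mu, sigma) for tau in range(t + 1))
        confirmed.append(q)
        unconfirmed.append(running_total - q)
    return unconfirmed, confirmed


def simulate_two_generations(arrivals, lam, c1, c2, mu1, sd1, mu2, sd2):
    """First generation driven by arrivals; second generation driven by lam * C * I_1."""
    i1, q1 = split_by_confirmation(arrivals, mu1, sd1)
    family_inflow = [lam * c1 * x for x in i1]      # new family cases per day
    other_inflow = [lam * c2 * x for x in i1]       # new non-family cases per day
    i2_family, q2_family = split_by_confirmation(family_inflow, mu2, sd2)
    i2_other, q2_other = split_by_confirmation(other_inflow, mu2, sd2)
    return {
        "I1": i1,
        "Q1": q1,
        "I2": [a + b for a, b in zip(i2_family, i2_other)],
        "Q2": [a + b for a, b in zip(q2_family, q2_other)],
    }


# Hypothetical inputs for illustration only (not the values of Table 1).
arrivals = [2, 3, 5, 8, 6, 4, 2] + [0] * 33
result = simulate_two_generations(
    arrivals, lam=0.0149, c1=3.0, c2=4.0, mu1=6.0, sd1=2.0, mu2=8.0, sd2=3.0
)
print(round(result["Q1"][-1], 2), round(result["Q2"][-1], 2))
```

A normal delay assigns a small probability to negative delays; the sketch ignores this, which is acceptable only when the mean is several standard deviations above zero.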

Parameter values
In this subsection, we give the parameter values of model (1) (see Table 1) and describe the parameter estimation procedure, which is implemented with the function fminsearch in the Optimization Toolbox of MATLAB.
a) We use the actual arrival times of the first-generation non-confirmed cases, from February 14 to 25, to obtain A(t) (see Fig.2).
b) The time from arrival in Shanxi province to confirmation is tested and found to follow a normal distribution; analyzing it gives the 95% confidence intervals of µ_1 and σ_1. By fitting Q_1(t) to the first-generation cumulative confirmed cases, we estimate the values of µ_1 and σ_1. The fitting result is shown in Fig.3.
c) C_1 is obtained by analyzing family sizes in Shanxi province. µ_2 (with µ_2 = µ_3) and σ_2 are obtained by analyzing the time from the contact of second-generation cases with first-generation cases to confirmation, which follows a normal distribution. By fitting ∫_0^t λ C_1 I_1(τ) F_1(t − τ) dτ to the second-generation cumulative confirmed family cases, we obtain the value of λ and adjust the value of σ_2. The fitting result is shown in Fig.4.
d) By fitting ∫_0^t λ C_2 I_1(τ) F_2(t − τ) dτ to the second-generation cumulative confirmed non-family cases, we obtain the mean values of σ_3 and C_3. The fitting result is shown in Fig.5.
e) By fitting Q(t) = Q_1(t) + Q_2(t) to the total cumulative confirmed cases, we slightly adjust the mean value of λ from 0.0129 to 0.0149. The fitting result is shown in Fig.6.
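Step b) can be illustrated with a small least-squares fit of µ_1 and σ_1 to a cumulative confirmed curve. The paper uses fminsearch (a Nelder-Mead simplex search) in MATLAB. The sketch below deliberately substitutes a plain grid search so that it runs with the Python standard library only; in practice a Nelder-Mead routine such as scipy.optimize.minimize with method="Nelder-Mead" would play the role of fminsearch. The arrival series, the grids and the "true" parameters are hypothetical, and the observed curve is synthetic rather than the Shanxi data.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal delay."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def cumulative_confirmed(arrivals, mu, sigma):
    """Discrete analogue of Q_1(t) = integral of A(tau) * G(t - tau) d tau."""
    return [
        sum(arrivals[tau] * normal_cdf(t - tau, mu, sigma) for tau in range(t + 1))
        for t in range(len(arrivals))
    ]


def sum_squared_error(model, observed):
    """Least-squares objective between model curve and observed curve."""
    return sum((m - o) ** 2 for m, o in zip(model, observed))


def grid_fit(arrivals, observed, mu_grid, sigma_grid):
    """Return the (mu, sigma) pair on the grid with the smallest squared error."""
    best = None
    for mu in mu_grid:
        for sigma in sigma_grid:
            err = sum_squared_error(cumulative_confirmed(arrivals, mu, sigma), observed)
            if best is None or err < best[0]:
                best = (err, mu, sigma)
    return best[1], best[2]


# Synthetic check: generate a curve from known parameters, then recover them.
arrivals = [1, 2, 4, 6, 9, 7, 5, 3, 1] + [0] * 21     # hypothetical A(t)
observed = cumulative_confirmed(arrivals, 6.0, 2.0)   # hypothetical "data"
mu_grid = [4.0 + 0.25 * k for k in range(17)]         # 4.0 to 8.0 days
sigma_grid = [1.0 + 0.25 * k for k in range(13)]      # 1.0 to 4.0 days
mu_hat, sigma_hat = grid_fit(arrivals, observed, mu_grid, sigma_grid)
print(mu_hat, sigma_hat)
```

Steps c) to e) follow the same pattern with a different inflow (λ C_1 I_1 or λ C_2 I_1 instead of A) and a different target curve.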

RESULTS
Based on the above parameter values, the projected course of the epidemic in Shanxi province is shown in Fig.7 and Table 2. The inflection point (peak time) of the epidemic occurred on February 2. The disease is projected to disappear in late February or early March, with a mean final size of 175 cases. It is inferred that about 20 non-confirmed family cases and about 20 non-confirmed non-family cases have not yet been found and confirmed. Under the current surveillance and control measures, first-generation non-confirmed cases and second-generation non-confirmed family cases can be easily tracked and isolated. However, the situation of second-generation non-confirmed non-family cases is more complicated, and it is hard to track them all. Therefore, the 20 non-confirmed non-family cases are the current source of risk.
On February 10, the Chinese New Year break was over, and some groups had to return to work. Because non-confirmed non-family cases exist, contact between individuals in the workplace and on public transport may increase the risk of transmission. How much impact does this have on the incidence of disease? If it causes third-generation infections, will there be a secondary outbreak? We estimate the course of the epidemic after the end of the holiday with third-generation infection, assuming different numbers of contacts between second-generation non-confirmed non-family cases and other people (see Fig.8 and Table 3). Let σ = C_2/(C_1 + C_2) = 0.5498. It is concluded that the end of the holiday has no effect on the peak time or the peak value. When the contact number increases by 2, the final size increases by about 6 cases. When the contact number increases to 10, the mean final size reaches 208 cases, and the duration of the epidemic extends to the middle of March. So, as long as second-generation non-confirmed cases are managed well, the end of the holiday has little effect on the progress of the disease.
If the mean contact number per person per day is more than 6, the disease will not disappear in the short term and may cause another peak. In that case, more generations of cases need to be considered in the model.
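To make the role of the contact number concrete, the short sketch below evaluates the third-generation inflow term λ σ C_3 per second-generation non-confirmed case per day, using λ = 0.0149 and σ = 0.5498 from the text. The function name and the list of C_3 values are illustrative assumptions. The sketch only shows that this term scales linearly with C_3; it does not reproduce the threshold of 6 contacts, which depends on the full model and the confirmation delays.

```python
LAMBDA = 0.0149   # infection probability per contact, from the fitted value in the text
SIGMA = 0.5498    # proportion of non-family contacts, C_2 / (C_1 + C_2), from the text


def third_generation_rate(c3, lam=LAMBDA, sigma=SIGMA):
    """New third-generation cases per second-generation non-confirmed case per day."""
    return lam * sigma * c3


# Hypothetical range of daily contact numbers C_3.
for c3 in (2, 4, 6, 8, 10):
    print(c3, round(third_generation_rate(c3), 4))
```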

DISCUSSION AND CONCLUSION
In this section, we discuss the epidemic trend of Shanxi province under different conditions.
Case I: More than half of the first-generation cases from Wuhan arrived in Shanxi after January 21. If the first-level emergency response had been moved up from January 25 to January 22, and the individuals from Wuhan city had been completely isolated from family members and other people, the epidemic would have been reduced by about half (see Fig.10 and Table 4). In this scenario, our results indicate that the epidemic would disappear on January 31 with a final size of 85 cases.
Case II: If people from Wuhan city are allowed to live with their family members but are completely isolated from other people (C_2 = 0), the epidemic will disappear on February 20 with a mean final size of 124 cases (see Fig.11). If isolation measures are not enforced, first-generation non-confirmed cases have little awareness of isolation and freely enter public places. Assuming that the SARS-CoV-2 infection spreads only as far as the second generation, the trend of the disease for different numbers of contacts between first-generation cases and non-family members is shown in Fig.9. When the contact number increases by 2, the peak time changes little, the peak value increases by 1, and the mean final size of the outbreak increases by 20 to 30 cases. If the contact number increases to 10, that is, if the activity level of first-generation non-confirmed cases is 2.5 times the actual level, the mean final size will be approximately 270 cases and the duration of the epidemic will extend to the beginning of March.
Our results demonstrate that early detection, early reporting and early quarantine of second-generation cases are key to effective outbreak control. If second-generation non-confirmed cases are well managed and only third-generation cases appear, the effect of the end of the holiday is small. If fourth-generation cases also appear, contact tracing and isolation measures should be strictly enforced to keep the number of contacts per person per day below 6; otherwise COVID-19 cases will continue to be reported, resulting in another epidemic peak. Overall, our results can provide theoretical guidance for understanding the development and control of the disease. More importantly, our model provides insight into the impact of imported cases while accounting for the relatively small volume of data, and it can be extended to all provinces except Hubei province. Because asymptomatic infection after the incubation period is hard to estimate and its exact mechanism is not clear, it is not considered in the model.

DECLARATIONS
Ethics approval and consent to participate: Not applicable.
Availability of data and materials: Not applicable.

Conflicts of Interest:
The authors declare no conflict of interest.

Availability of data and materials:
The datasets used during the current study are available from the corresponding author on reasonable request.