A Fatality-Data-Based Optimized SEIR Model for Epidemics: A Study of Testing and Quarantining

Coronavirus disease 2019 (COVID-19), caused by the newly discovered severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was declared a global pandemic by the WHO on March 11, 2020. We aimed to provide a more accurate, object-oriented SEIR model for epidemics, as well as a thorough analysis of different intervention strategies and critical parameters. Following the object-oriented approach in computer science, we assigned properties to every individual patient, including age, likelihood of catching the virus, individual mortality rate, social activeness, and awareness of the danger of the virus. We also considered macro-level parameters, including the testing rate and the positive rate. Based on the fatality data obtained, we used the model to estimate the occurrence time of the first case in China (excluding Hubei), Russia, Germany, and the United States. Eight groups of virus-transmission simulations with different testing rates and quarantine ratios were conducted. The mortality data was found to be an effective and crucial parameter, while the number of cases can be decisive in reflecting the severity of the pandemic.


Introduction
Novel coronavirus pneumonia is an acute infectious disease that transmits rapidly from person to person and has a relatively high mortality rate. The initial symptoms are fatigue, fever, dry cough, and dyspnea. 1 Most patients develop antibodies and recover without medical intervention; however, 19·1% of patients with severe symptoms may develop acute respiratory distress syndrome, which may lead to death. 2 Soon after it was detected in Wuhan, China, in late December 2019, the number of cases grew exponentially worldwide and the outbreak became a global pandemic. 3 According to data from the World Health Organization, as of April 18, 2020, there were 2,160,207 confirmed COVID-19 cases worldwide, with 146,008 deaths caused by the virus. 4 Since the epidemic is growing worse as the virus spreads silently, it is vital to predict the future situation in the affected regions and to analyze the strategies taken, which would help to find more reasonable and effective interventions to reduce the damage caused by the virus. Based on the SEIR model, we constructed the optimized OO-SIR model, which considers the age distribution of patients, a dynamic mortality rate, the social activeness of individuals, the positive rate, and the testing rate. Some studies have already discussed containment strategies in the UK, Singapore, and Hong Kong. However, we believe a detailed discussion is required to analyze the significance of testing and quarantining patients. 5, 6, 7

Method
Parameter
Age
The age of COVID-19 patients and their mortality ratio demonstrate a strong gradient pattern. In the OO-SIR model, there are three parameters related to age: the patient age distribution of the target population, the age distribution of the target population, and the constant likelihood of patients in a certain age interval catching the virus.
This likelihood is a critical parameter in the model: it determines the predicted patient age distribution in the target population and thereby influences the mortality rate directly, so it has to be calculated from a reliable source of data.
China is the first country to have controlled community transmission of the coronavirus, and it has a relatively low calculated mortality rate of less than 2% (see Calculated mortality rate) and a large number of infections (over 12,000 reported cases). Therefore, it is reasonable to believe that the number of tests conducted in China is large enough to make its data reliable for calculating the likelihood. Thus, the data of China (excluding Hubei) are used to calculate it. The patient age distribution in China excluding Hubei, the population age distribution in China, and the likelihood of patients in a certain age interval catching the virus are shown (Table 1).
In particular, the relationship between these three quantities is shown (Equation (1)). After obtaining the likelihood and the age distribution of the target population, the patient age distribution of the target population can be obtained (Equation (2)).
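One plausible reading of Equations (1) and (2), whose exact forms are not reproduced here, is that the likelihood for an age interval is the patient share divided by the population share, and that the patient age distribution of the target population is the normalized product of this likelihood and the target population's age distribution. The sketch below assumes exactly that; the function names and the three-bin numbers are hypothetical placeholders, not values from Table 1.

```python
def infection_likelihood(patient_share, population_share):
    """Per-age-bin likelihood, assumed to be patient share / population share."""
    return [p / q for p, q in zip(patient_share, population_share)]


def patient_age_distribution(likelihood, target_population_share):
    """Assumed form: weight the target age distribution by likelihood, then normalize."""
    weighted = [l * q for l, q in zip(likelihood, target_population_share)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Hypothetical three-bin example (young, middle-aged, elderly); not data from the paper.
reference_patients = [0.10, 0.50, 0.40]
reference_population = [0.30, 0.50, 0.20]
target_population = [0.20, 0.50, 0.30]

likelihood = infection_likelihood(reference_patients, reference_population)
target_patients = patient_age_distribution(likelihood, target_population)
print(target_patients)
```

Under these assumptions, a target population that is older than the reference population yields a patient distribution weighted further towards the elderly bin.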

Mortality Rate
In conventional SIR models, the mortality rate is considered a constant for a given age distribution. 9 According to Rajgor, the mortality rate of COVID-19 is 1·38%. 10 Nevertheless, under normal circumstances, the calculated mortality rate of COVID-19 is much higher than 1·38%. Zhang revealed the relationship between the mortality rate of COVID-19 and the number of health workers: the mortality rates in Hubei and Wuhan demonstrated an exponential decay pattern as the number of medical workers increased. 11 Thus, in the OO-SIR model, the mortality rate is defined as a variable: the function relating the population mortality rate, the initial population mortality rate, and the medical system coefficient is shown (Equation (3)).

$$\text{population mortality rate} = \text{initial population mortality rate} \cdot \text{medical system coefficient} \qquad (3)$$

In the real-world scenario, the mortality rate is affected by many factors, including comorbidities, the immune system, random mutation, and medical intervention. In the OO-SIR model, the level of medical intervention available to a patient is described by the medical system coefficient. It is not easy to quantify the health condition and underlying diseases of every patient in mathematical modeling; however, a positive relationship between the age of a patient and the risk of death has been revealed. In other words, an older patient tends to have a poorer health condition and more comorbidities. A study from the Chinese Epidemiology Working Group for NCIP Epidemic Response revealed the relationship between age and mortality rate. 2 Thus, the OO-SIR model models the relationship between the age of a patient and the individual mortality rate, assuming age is the only factor affecting the risk of death, while considering the intervention of the medical system.
The individual mortality rate of a patient increases with age. By fitting the data with an exponential function, the relationship between the individual mortality rate and the age of the patient can be written (Equation (4)).
In the model, the initial population mortality rate is defined for the case in which the country has sufficient medical supply. Since the age distribution within a region or country is relatively stable during the spread of the virus, the initial population mortality rate is stable as well and is defined as a constant. The relationship between the initial population mortality rate, the initial individual mortality rate, and the age of the individual patient is shown (Equation (6)).
Zhang indicated that the population mortality rate decayed exponentially as more hospital beds and doctors (medical system capacity) flowed into Wuhan; conversely, the population mortality rate would grow exponentially as the medical system becomes overwhelmed. However, this growth is not unlimited. Because the mortality rate without medical intervention is fixed, the growth converges to the mortality rate with no medical intervention. Therefore, the medical system occupancy rate has a significant impact on the overall population mortality rate. The OO-SIR model describes this phenomenon mathematically. A function with such a pattern can be described using the error function. 12 The model defines the medical system coefficient to help calculate the theoretical mortality rate, which grows as more medical resources are occupied. The function relating the medical system coefficient, hospitalized cases, and medical system capacity is shown (Equation (7)).

= <=

Calculated mortality rate
It is almost impossible to detect every patient, which means there are bound to be untested, undocumented cases. This may cause the calculated mortality rate to be higher than the population mortality rate. The calculated mortality rate is defined as the mortality rate calculated from the data provided by the health officials of a country, province, or region. The relationship between the calculated mortality rate, confirmed cases, and confirmed deaths is defined (Equation (8)).

$$\text{calculated mortality rate} = \frac{\text{confirmed deaths}}{\text{confirmed cases}} \qquad (8)$$
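To illustrate why undocumented cases inflate the calculated mortality rate, the sketch below assumes that every death is confirmed while only a fraction of infections (the testing rate) is confirmed. Under that assumption, the calculated mortality rate equals the population mortality rate divided by the testing rate. The function name and the numbers are illustrative only.

```python
def calculated_mortality_rate(infected, population_mortality_rate, testing_rate):
    """Confirmed deaths / confirmed cases, assuming all deaths are confirmed
    but only `testing_rate` of infections are confirmed."""
    confirmed_deaths = infected * population_mortality_rate
    confirmed_cases = infected * testing_rate
    return confirmed_deaths / confirmed_cases


# Illustrative values only: 100,000 infections and a 1% population mortality rate.
for rate in (1.0, 0.75, 0.5, 0.25):
    print(rate, calculated_mortality_rate(100_000, 0.01, rate))
```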
Social activeness and public awareness
The spreading speed of the virus is proportional to the contact frequency between objects. 13 To better describe the contact frequency, the OO-SIR model defines the effective contact frequency. In the model, four critical parameters affect the effective contact frequency, including social activeness (Equation (10)), public awareness (Equation (11)), and the immunity coefficient (Equation (12)). The relationship between these quantities and the number of active cases is shown (Equation (9)).
In the model, social activeness is assigned using a normal distribution; an object with higher social activeness tends to make more contacts per day. The social activeness distribution function, in terms of a constant t and two coefficients, is written (Equation (10)).
Public awareness represents the population's awareness of the danger of the virus and the strength of government enforcement. Intuitively, people decrease their frequency of social contact and wear PPE intentionally when they are conscious of the virus. Thus, public awareness increases as more patients die of COVID-19 or as the strength of the city lockdown increases. As public awareness increases, the effective contact frequency decreases.
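As a rough illustration of how these parameters could interact, the sketch below draws each object's social activeness from a normal distribution and scales it by public awareness and the immunity coefficient. The multiplicative form, the parameter names, and the numeric values are assumptions made for this example; they are not the paper's Equations (9) to (12).

```python
import random


def sample_social_activeness(rng, mean=10.0, std=3.0):
    """Daily contacts for one object, drawn from a normal distribution (floored at 0)."""
    return max(0.0, rng.gauss(mean, std))


def effective_contact_frequency(activeness, awareness, immunity):
    """Assumed form: contacts shrink linearly with awareness and with population immunity.

    `awareness` and `immunity` are both expected to lie in [0, 1].
    """
    return activeness * (1.0 - awareness) * (1.0 - immunity)


rng = random.Random(0)  # fixed seed so the example is reproducible
activeness = [sample_social_activeness(rng) for _ in range(1000)]

low_awareness = [effective_contact_frequency(a, 0.1, 0.0) for a in activeness]
high_awareness = [effective_contact_frequency(a, 0.6, 0.0) for a in activeness]

# Higher awareness shifts the whole distribution toward fewer effective contacts.
print(sum(low_awareness) / len(low_awareness))
print(sum(high_awareness) / len(high_awareness))
```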
The distribution of the effective contact frequency of objects shifts when public awareness changes (Figure 1); it shifts to the left if public awareness increases.

Figure 1 The effect of social awareness on effective contact frequency

The immunity coefficient represents the proportion of people who have immunity within a population. 14 The relationship between the immunity coefficient, the number of infected patients, recovered cases, and the population is shown (Equation (12)).
Quarantining the patients
In practice, most confirmed coronavirus cases are quarantined; however, there are still unconfirmed and asymptomatic cases. To bring the model closer to the real-world scenario, the number of infected patients is divided into two groups: patients who are able to transmit the virus and isolated cases. The relationship between them is shown (Equation (13)).

$$\text{infected patients} = \text{patients able to transmit the virus} + \text{isolated cases} \qquad (13)$$

The model also includes the concept of the quarantine ratio, representing the proportion of isolated cases among all coronavirus cases. Because quarantined patients are isolated from the susceptibles, they are theoretically less likely to transmit the virus. The relationship between the quarantine ratio, isolated cases, and the number of infected patients is written (Equation (14)).

$$\text{quarantine ratio} = \frac{\text{isolated cases}}{\text{infected patients}} \qquad (14)$$

However, the quarantine ratio is extremely difficult to calculate directly because of the insufficient number of coronavirus tests conducted in some regions; in the model it is calculated using Equation (17).
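The split between transmitting and isolated patients described above amounts to a simple bookkeeping step, sketched below. The function and variable names are hypothetical, and the example numbers are illustrative rather than taken from the simulations.

```python
def split_infected(infected, quarantine_ratio):
    """Split infected patients into still-transmitting and isolated groups.

    Isolated cases are the quarantine ratio times the infected count;
    the remainder are treated as able to transmit the virus.
    """
    if not 0.0 <= quarantine_ratio <= 1.0:
        raise ValueError("quarantine_ratio must be between 0 and 1")
    isolated = quarantine_ratio * infected
    transmitting = infected - isolated
    return transmitting, isolated


transmitting, isolated = split_infected(infected=8_000, quarantine_ratio=0.75)
print(transmitting, isolated)  # 2000.0 6000.0
```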

Positive rate
The positive rate represents the ratio of confirmed cases to total tests conducted. Although the positive rate has no direct impact on the computation in the model, a region with a high positive rate tends to have a lower testing rate. A lower testing rate means fewer patients confirmed and fewer cases quarantined, resulting in a lower quarantine ratio. Therefore, the model assumes that the positive rate is inversely proportional to the quarantine ratio, as shown (Equation (15)).

$$\text{positive rate} \propto \frac{1}{\text{quarantine ratio}} \qquad (15)$$
Countries with a high positive rate tend to have a higher calculated mortality rate (Figure 2). This is potentially because a higher positive rate means a higher proportion of untested patients: with the same number of total tests conducted (when the number is large enough), more patients test positive and the positive rate is higher.
A higher number of patients tends to lead to a higher calculated mortality rate, so the model concludes that the calculated mortality rate follows an inverse proportionality, shown in Equation (16).

Testing rate
In the model, the testing rate is defined as the number of confirmed patients as a proportion of the number of patients who have symptoms and can be tested. This parameter reflects the accuracy and reliability of the confirmed-case data in a country. The relationship between the testing rate, tested and confirmed cases, and the number of infected patients is written (Equation (18)).

$$\text{testing rate} = \frac{\text{tested and confirmed cases}}{\text{infected patients}} \qquad (18)$$
Model
A computer model was constructed based on the model described above, and the simulation results were calculated with it. The required input data are shown (Table 2); all data were gathered from health officials and the Johns Hopkins University CSSE. 15 The OO-SIR model can be described by a set of equations, and its procedure is shown (Figure 3).

Figure 3 The flow chart of the OO-SIR model
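The object-oriented idea can be pictured as one record per patient carrying the properties listed in the abstract. The sketch below is a minimal illustration; the class name, field names, and the capping rule in `effective_mortality` are assumptions for this example, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    """One simulated individual; every field name here is hypothetical."""
    age: int
    infection_likelihood: float   # likelihood of catching the virus for this age interval
    individual_mortality: float   # baseline individual mortality rate
    social_activeness: float      # contacts per day before any awareness effect
    awareness: float              # 0 = unaware, 1 = fully cautious
    quarantined: bool = False

    def effective_mortality(self, medical_system_coefficient: float) -> float:
        """Scale the baseline rate by the medical system coefficient, capped at 1."""
        return min(1.0, self.individual_mortality * medical_system_coefficient)

    def can_transmit(self) -> bool:
        """Quarantined patients are treated as unable to transmit in this sketch."""
        return not self.quarantined


patient = Patient(age=67, infection_likelihood=1.4, individual_mortality=0.03,
                  social_activeness=8.0, awareness=0.5)
print(patient.effective_mortality(medical_system_coefficient=2.0))
```

Keeping these properties on the individual, rather than on the compartment, is what lets age and behavior vary from one patient to the next within the same simulation.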
By deploying machine learning and a PID algorithm, the model can determine its undetermined coefficients: testing rate, quarantine ratio, date of lockdown, etc. The model calculates the derivative and integral to perform the regression analysis, searches for the best-fitting scenario, and provides a reference for later modeling.
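The paper does not give the details of the fitting loop, so the following is only a sketch of how a PID-style update could tune a single coefficient until simulated deaths match observed deaths. The linear `toy_simulated_deaths` function stands in for a full simulation run, and the gains, names, and target value are all assumptions chosen for illustration.

```python
def toy_simulated_deaths(coefficient):
    """Stand-in for a full simulation run: deaths grow linearly with the coefficient."""
    return 1000.0 * coefficient


def calibrate_with_pid(observed_deaths, initial_guess, kp=0.5, ki=0.05, kd=0.1, steps=200):
    """Nudge one coefficient until simulated deaths match the observed deaths."""
    coefficient = initial_guess
    integral = 0.0
    previous_error = None
    for _ in range(steps):
        # Normalized error between observation and simulation.
        error = (observed_deaths - toy_simulated_deaths(coefficient)) / 1000.0
        integral += error
        derivative = 0.0 if previous_error is None else error - previous_error
        # Proportional, integral, and derivative terms all adjust the coefficient.
        coefficient += kp * error + ki * integral + kd * derivative
        previous_error = error
    return coefficient


fitted = calibrate_with_pid(observed_deaths=250.0, initial_guess=1.0)
print(round(fitted, 6))
```

With a real simulation in place of the toy function, the gains would need retuning, since the response of deaths to a coefficient such as the testing rate is generally nonlinear.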

Result
Simulation result
Based on the model discussed above, and using the data of each country and region as input, 15 the coronavirus transmission and death simulations of China (excluding Hubei), Russia, Germany, and the United States were conducted. ℛ0, among other parameters, was recalculated by the model and adjusted manually; the remaining parameters were entered manually (Table 3). Notably, Germany and the United States have relatively low testing rates, which may cause a higher calculated mortality rate (Figure 2); however, the model showed high uncertainty about the testing rate of Germany. After multiple trials, the model offered two potential explanations: the patient age distribution in Germany is skewed towards the elderly, or the testing rate in Germany is fairly low (0·60-0·80). In the United States, the number of cumulative cases grew linearly, which very likely means that the testing capacity has been reached. Therefore, the number of daily new cases would not change drastically unless the positive rate changes. The high calculated mortality rate of the United States is mainly due to its low testing rate.

Testing rate
The testing rate plays a crucial role in how health officials make decisions and respond to contain the virus, because it influences the number of cases reported. We analyzed the effect of the testing rate on overall virus transmission (death toll, cumulative cases, active cases, and reported cases) under different quarantine ratios.
Effect of testing rate on the number of cases and deaths with low quarantine ratio
By conducting more coronavirus tests, governments and health officials can acquire a more accurate view of the transmission of the virus. The model was used to investigate the effect of the testing rate on the overall number of deaths and cases. Four trials with 100%, 75%, 50%, and 25% testing rates were conducted; a city lockdown is triggered when 10,000 cases are confirmed, assuming the same speed of response. When no patients are quarantined, the curves of reported active cases converge even though the difference in the actual number of cases is large (up to 335·5%) (Figure 4). The four groups differ remarkably in the number of deaths, and the calculated mortality rate is inversely proportional to the testing rate, as predicted before. Thus, when the quarantine ratio is 0 (or negligible compared with the number of actual cases), a difference in testing rate does not result in a significant shift in the number of confirmed cases. The testing rate is proportional to the number of reported cases, but it cannot be estimated solely from the curves of reported cases and daily new cases, meaning that the severity of the epidemic cannot be estimated solely from the curve of reported cases.
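The gap between reported and actual cases can be made concrete with a small calculation. If the lockdown is triggered by a fixed number of confirmed cases, and confirmed cases are assumed to equal the testing rate times the actual cases, then a lower testing rate means more actual infections at the moment of the trigger. The helper below is a hypothetical illustration of that arithmetic, not part of the simulation itself.

```python
LOCKDOWN_THRESHOLD = 10_000  # confirmed cases that trigger the lockdown, as in the trials


def actual_cases_at_trigger(testing_rate, threshold=LOCKDOWN_THRESHOLD):
    """Actual infections when confirmed cases reach the threshold.

    Assumes confirmed cases = testing_rate * actual cases.
    """
    if not 0.0 < testing_rate <= 1.0:
        raise ValueError("testing_rate must be in (0, 1]")
    return threshold / testing_rate


for rate in (1.0, 0.75, 0.5, 0.25):
    print(f"testing rate {rate:.0%}: {actual_cases_at_trigger(rate):,.0f} actual cases")
```

Under this assumption every trial reports the same number of confirmed cases at the trigger, while the underlying outbreak differs by up to a factor of four.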
The calculated mortality rate rose as the spreading speed gradually decreased; however, in practice the calculated mortality rate may decrease as more tests are conducted, which is not anticipated by the model. This could be attributed to an improvement in testing capacity, which brings a higher testing rate and thus a lower calculated mortality rate.

Figure 4 Demonstrative trials: spreading speed of the virus (a), number of active cases (b), number of cumulative cases (c), number of reported active cases (d), calculated mortality rate (e), number of reported active cases (f), and convergence of reported active cases (g) with 100%, 75%, 50%, and 25% testing rate, without quarantine and with the same trigger of city lockdown
Effect of testing rate on the number of cases and deaths with high quarantine ratio
Another group of trials with a higher quarantine ratio was conducted to investigate the effect of the testing rate on the transmission of the virus. With a population of 10,000,000 and a quarantine ratio of 0.75 (relative to reported cases), the following four trials were conducted (Table 6). With a higher quarantine ratio (relative to reported cases), the convergence of the reported active cases was not observed (Figure 5). Furthermore, trials with a lower testing rate tend to take a shorter time to reach the peak, which may be because of an earlier triggering of the lockdown. In this case, a higher spreading speed of the virus would indicate a lower testing rate. Nevertheless, the calculated mortality rate does not change noticeably when the quarantine ratio is changed compared with the previous trials. Within the same population, the only factor affecting the calculated mortality rate is the testing rate.

According to our modeling, enforcing a city lockdown when 5% of the population is infected can decrease the final number of cases by 58·6% (95% CI 54%-63%), quarantining 75% of patients can decrease the final number of cases by 17·5% (95% CI 10%-26%), and using both can decrease the final number of cases by 93·0% (95% CI 92%-95%). With no intervention to mitigate the transmission of the virus, 96·5% of the population is infected, which is substantially higher than previous estimations. We believe this is due to a relatively high proportion of people infected despite the fact that ℛ0 is lower than 1. It is extremely important to pick the right approach to balance the economy and the curve; a fast and strict response is required to contain the virus and avoid fatalities.
We found that when the relative quarantine ratio is low, the number of reported cases converges regardless of significant differences in testing rate. The relationship between testing rate and calculated mortality rate was demonstrated, which may provide a new method for future strategy making and intervention. The model indicates that a decrease in daily new cases may be merely due to a decrease in the positive rate when the testing capacity is limited. The severity of the virus is therefore likely to be underestimated, and countries have to keep enforcing social distancing to protect the at-risk population.
We reserved interfaces for later optimizations of the model, which will help us keep advancing the model and improving its accuracy. We also provided extension classes for potential revisions.
Our study also has some limitations. The OO-SIR model, as a fatality-based model, assumes a certain distribution of patient age, which may deviate from the real-world scenario and cause the number of actual cases to be overestimated. The effect of social distancing may also be overestimated, since not everybody is wearing PPE at the current time. We also did not consider the difference in transmissibility between asymptomatic and symptomatic cases. Yu revealed that COVID-19 has airborne transmission, which would bring more uncertainty to transmission inside a condo or business establishment. 16 Because the population in the model is closed, travellers from other countries and regions are ignored; imported cases from other countries would make the scenario more complex and difficult to predict. In addition, the country is considered as a cohesive whole, which may cause the prediction to deviate as a higher proportion of the population becomes infected.

Figure 2 The trend of calculated mortality rate and positive rate in Israel, Australia, Canada, Germany, Italy, Sweden, the United States of America, Spain, and France
Figure 5 Demonstrative trials: spreading speed of the virus (a), number of active cases (b), number of cumulative cases (c), number of reported active cases (d), calculated mortality rate (e), and number of reported active cases (f) with 100%, 75%, 50%, and 25% testing rate, without quarantine and with the same trigger of city lockdown