A new method to calculate the instant case fatality rate of COVID-19 in Wuhan and Hubei of China

The areas of China worst hit by coronavirus disease 2019 (COVID-19) were Wuhan City and its affiliated Hubei Province, where the outbreak has been well controlled. The case fatality rate (CFR) is the most direct indicator of the hazard posed by an infectious disease. However, most reported CFRs for COVID-19 deviate substantially from reality. We aimed to establish a more accurate way to estimate the CFR of COVID-19 in Wuhan and Hubei and to compare it with the reality. Daily case notification data for COVID-19 from December 8, 2019, to May 1, 2020, in Wuhan and Hubei were collected from the bulletins of the Chinese authorities. The instant CFR of COVID-19 was calculated from the numbers of deaths and cured cases that shared the same estimated diagnosis date. The instant CFR of COVID-19 was 1.3%-9.4% in Wuhan and 1.2%-7.4% in Hubei from January 1 to May 1, 2020. It has stabilized at 7.69% in Wuhan and 6.62% in Hubei since early April. The cure rate was between 90.1% and 98.8% and finally stabilized at 92.3% in Wuhan and 93.5% in Hubei. The mortality rates were 34.5/100,000 in Wuhan and 7.61/100,000 in Hubei. In conclusion, this approach offers a way to calculate the CFR accurately, which may provide a basis for the prevention and control of infectious diseases.


Introduction
The novel coronavirus pneumonia (COVID-19) 1 initially appeared in Wuhan, China. As of June 29, 2020, the number of confirmed COVID-19 cases in China had reached 85,204, of whom 4,648 had died.
Wuhan City and its affiliated Hubei Province were the most affected and account for at least 90% of the cases in China. At present, the outbreak in China has largely been brought under control. However, cases of COVID-19 have increased rapidly all over the world, and the World Health Organization declared a pandemic on March 11, 2020 2-4 5 . Countries such as the United States, Spain, Italy, France, Germany and England are among the most affected. As of June 29, 2020, the total number of patients had risen sharply to over 10.1 million worldwide, and over 499,000 patients had died 6 . The global pandemic is getting worse.
Assessing the hazards of infectious diseases is a vital concern in epidemiology 7 . The case fatality rate (CFR), the percentage of deaths from a disease among the total number of infected patients during the entire outbreak, is the most direct measure of the severity of the disease. The accurate CFR of an infectious disease can be obtained after an outbreak is over on the basis of the numbers of confirmed cases and deaths. Given the severity of the COVID-19 outbreak throughout the world, it is important to estimate and predict the CFR of COVID-19. Hard outcomes such as the CFR play a crucial part in forming public health strategies at national and international levels. It is imperative that health-care leaders and policy makers are guided by estimates of mortality and case fatality 8 when setting public security strategies, allocating health resources and adjusting medical treatments. This information might also be of great help in government decision making and public understanding. However, the mortality or "CFR" of COVID-19 recently published in some articles and announced by authorities is neither mortality nor CFR 9, 10 . Wang D et al. reported that 6 of 138 confirmed cases had died and 47 had been discharged, resulting in a CFR of 4.3% 11 . Zhang YP analyzed the characteristics of 72,314 COVID-19 cases reported in China's Infectious Disease Information System. There were a total of 1,023 deaths out of 44,672 confirmed cases, giving a CFR of 2.3% 12 . The problem shared by these articles is that patients still in hospital were ignored. Hospitalized patients will either be cured or die in the future and should not be included in the denominator when calculating a CFR. In addition, cases confirmed on a given day end in death or cure only some days later, so the deaths reported on any given day belong to cases confirmed some time earlier. Comparing deaths and confirmed cases reported on the same day therefore increases the error in the estimated CFR. China is the first country to suffer and recover from a COVID-19 outbreak. An accurate estimate of the CFR in China may have a major impact on global epidemic prevention.
To avoid the above-mentioned errors, we established a method to estimate the instant CFR of COVID-19. We collected the daily case notification data of COVID-19 announced by Chinese officials and estimated the death time and cure time. Finally, the instant CFR of COVID-19 was estimated from the numbers of deaths and cured cases that shared the same estimated diagnosis date.

Research design
Confirmed COVID-19 patients eventually reach one of two endpoints: death or cure. Every patient passes through a death time or a cure time before reaching that endpoint. In this study, the death time and the cure time were defined as the period from the diagnosis date to the death date and from the diagnosis date to the cure date, respectively. First, the death time and cure time of COVID-19 patients were calculated by a new method described below under Data analysis. Then the estimated diagnosis dates of the declared deaths and cured cases were determined by moving the reported dates back by the death time or the cure time, respectively. The instant CFR and cure rate of COVID-19 were obtained from the numbers of deaths and cured cases that shared the same estimated diagnosis date.

Data collection
We collected the daily case notification data of COVID-19 in Wuhan City and its affiliated Hubei Province from December 8, 2019, to May 1, 2020 13 . The data included the daily numbers of confirmed cases, deaths and cured cases, and the daily cumulative numbers of confirmed cases, deaths and cured cases. Two researchers collected the data independently every day and then checked and corrected the data. The data were collected from the official website, and the study was therefore considered exempt from approval.

Data analysis
Daily variation histograms of confirmed cases, deaths and cured cases were plotted from the daily numbers of each. Trend curves were fitted to the histograms, and the date corresponding to the highest point of each trend curve was identified.
We took the time difference between the peak date of daily deaths and the peak date of daily confirmed cases in the trend curves as the mean death time (lag time from diagnosis to death) of the deceased patients. In the same way, the time difference between the peak date of daily cured cases and the peak date of daily confirmed cases in the trend curves was taken as the mean cure time (lag time from diagnosis to recovery) of the cured patients. The estimated initial diagnosis dates of the declared deaths and cured cases of COVID-19 were then determined from these mean death and cure times.
The estimated initial diagnosis date of deceased patients should be n days before the death date, where n is the mean death time. That is, the estimated diagnosis date (EDD) of the deaths is obtained from the date reported in the daily case notification data (DCND) of deaths minus the mean death time (DT). Similarly, the EDD of cured cases is obtained from the date reported in the daily case notification data of cured cases minus the mean cure time (CT) (Fig. 1).
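The peak-to-peak lag described above can be sketched as follows. This is only an illustration: a centered moving average stands in for the Fit spline/LOWESS smoothing used in the study, and the function names, the window size and the daily counts are all hypothetical.

```python
from datetime import date, timedelta


def moving_average(values, window=3):
    """Centered moving average; at the edges only available neighbours are used.
    NOTE: a simple stand-in for spline/LOWESS smoothing, not the study's method."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def peak_date(start, daily_counts, window=3):
    """Date of the highest point of the smoothed daily series."""
    smoothed = moving_average(daily_counts, window)
    peak_index = max(range(len(smoothed)), key=smoothed.__getitem__)
    return start + timedelta(days=peak_index)


# Hypothetical daily counts, both series starting on the same day.
start = date(2020, 2, 1)
daily_confirmed = [1, 2, 3, 4, 5, 4, 3, 2, 1]
daily_deaths = [0, 0, 1, 2, 3, 4, 5, 4, 3, 2, 1]

confirmed_peak = peak_date(start, daily_confirmed)
deaths_peak = peak_date(start, daily_deaths)

# Mean death time = lag between the two peaks, in days.
mean_death_time = (deaths_peak - confirmed_peak).days
```

The same call on the daily cured series would give the mean cure time. The choice of smoother and its bandwidth moves the peak, so the estimated lags depend on that choice.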

EDD of deaths = date on DCND - DT
EDD of cured cases = date on DCND - CT
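As a small illustration of this date arithmetic, the sketch below moves a reported date back by a lag. The function name is hypothetical; the lags of 6 and 20 days and the example dates are the values reported for Wuhan in the Results.

```python
from datetime import date, timedelta


def estimated_diagnosis_date(report_date, lag_days):
    """EDD = date on DCND minus the mean death time (DT) or cure time (CT)."""
    return report_date - timedelta(days=lag_days)


DT = 6   # mean death time in days (value reported in the Results)
CT = 20  # mean cure time in days (value reported in the Results)

# A death reported on February 15 and a cure reported on February 29, 2020
# are both mapped back to the same estimated diagnosis date.
edd_of_death = estimated_diagnosis_date(date(2020, 2, 15), DT)
edd_of_cure = estimated_diagnosis_date(date(2020, 2, 29), CT)
```

Both calls return February 9, 2020, so these two outcomes would be counted together when the rates are computed.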
The daily CFR (FR) is calculated as the number of cumulative deaths (NCD) divided by the sum of the number of cumulative deaths and the number of cumulative cured cases (NCCC) that share the same estimated diagnosis date.

FR = NCD / (NCD + NCCC) × 100%
The daily cure rate (CR) is calculated as the NCCC divided by the sum of the NCD and the NCCC that share the same estimated diagnosis date.

CR = NCCC / (NCD + NCCC) × 100%
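The two formulas can be combined with the date shift into one short sketch. It assumes the cumulative series are held as dictionaries keyed by reporting date; the function names and all counts are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from datetime import date, timedelta


def shift_to_edd(cumulative_by_report_date, lag_days):
    """Re-key a cumulative series from reporting date to estimated diagnosis date."""
    return {d - timedelta(days=lag_days): n
            for d, n in cumulative_by_report_date.items()}


def instant_rates(cum_deaths, cum_cured, death_time, cure_time):
    """Return {EDD: (FR, CR)} in percent for every EDD present in both series."""
    deaths_by_edd = shift_to_edd(cum_deaths, death_time)
    cured_by_edd = shift_to_edd(cum_cured, cure_time)
    rates = {}
    for edd in sorted(deaths_by_edd.keys() & cured_by_edd.keys()):
        ncd = deaths_by_edd[edd]
        nccc = cured_by_edd[edd]
        total = ncd + nccc
        if total == 0:
            continue  # no resolved cases yet, rate undefined
        rates[edd] = (100.0 * ncd / total, 100.0 * nccc / total)
    return rates


# Hypothetical cumulative counts keyed by reporting date.
cum_deaths = {date(2020, 2, 7): 8, date(2020, 2, 8): 10}
cum_cured = {date(2020, 2, 21): 72, date(2020, 2, 22): 90}

rates = instant_rates(cum_deaths, cum_cured, death_time=6, cure_time=20)
fr_feb1, cr_feb1 = rates[date(2020, 2, 1)]
```

Because FR and CR share one denominator, they always sum to 100% for a given estimated diagnosis date.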
The mortality rate was calculated based on the number of deaths over the population in Wuhan and Hubei.

Statistical analysis
GraphPad Prism 8 and Excel 2016 were used to record, calculate and analyze the data, and to draw the figures. The unit of time for data collection was one day. Fit spline/LOWESS was used to fit the trend curves of the histograms for daily deaths, daily cured cases and daily confirmed cases.

Daily new cases
The curves of daily new COVID-19 cases reported in Wuhan City and Hubei Province are shown in Figure 2. The first confirmed cases occurred on December 8, 2019. The number of daily newly confirmed cases started to increase in early January. The number increased markedly in late January 2020, peaked in mid-February, and then decreased gradually. The death toll began to increase from mid-January to late January, peaked in early February 2020 and then decreased gradually. The first COVID-19 patients were cured on January 10, 2020. The number of daily new cured patients began to increase in mid-January, increased clearly in early and mid-February, remained at a high level in late February 2020, and then decreased in the first half of March. The numbers of daily new confirmed cases, daily new deaths and daily new cured cases in Wuhan City were higher than those in Hubei Province. The characteristics in Hubei Province and Wuhan City were similar.

Daily cumulative cases
The daily variation curves of cumulative cases of COVID-19, including cumulative confirmed cases, cumulative deaths and cumulative cured cases in Wuhan City and Hubei Province, are shown in Figure 3. The number of cumulative confirmed cases began to increase in early January. The increase accelerated sharply in late January, and the number stabilized in early March. The cumulative death toll started to show an increasing trend in mid-January. It increased markedly in early February, and the increase slowed in late February. The number of cumulative cured cases showed an increasing tendency from mid-January. The increase accelerated visibly in February, and the number stabilized in April. On April 16, the numbers of confirmed cases, deaths and cured cases of COVID-19 in Wuhan were adjusted by the Chinese authorities. The number of confirmed cases increased from 50,008 to 50,333. The number of deaths increased from 2,579 to 3,869 and the number of cured cases decreased from 47,283 to 46,335. The cumulative cases before April 16 were adjusted in proportion to these increases or decreases.
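One possible reading of this proportional adjustment is sketched below, under the assumption that each earlier cumulative count was multiplied by the ratio of the revised total to the originally reported total. The exact procedure is not spelled out in the text, so the function name, the rounding and the two earlier counts (1,000 and 2,000) are hypothetical.

```python
def rescale_cumulative(series, reported_total, revised_total):
    """Scale earlier cumulative counts by revised_total / reported_total.
    ASSUMPTION: a uniform scaling factor applied to every earlier day."""
    factor = revised_total / reported_total
    return [round(value * factor) for value in series]


# Wuhan deaths were revised from 2,579 to 3,869 on April 16, 2020.
# The first two cumulative counts are hypothetical.
deaths_before = [1000, 2000, 2579]
deaths_adjusted = rescale_cumulative(deaths_before, 2579, 3869)

# Cured cases were revised downwards, from 47,283 to 46,335.
cured_adjusted = rescale_cumulative([47283], 47283, 46335)
```

With these inputs the death series becomes 1,500, 3,000 and 3,869, and the final cured count becomes 46,335.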

Death time and cure time
After treatment, the COVID-19 patients either died or were cured. The time from diagnosis to death is referred to as the death time, and the time from diagnosis to discharge as the cure time. Figure 4 shows the daily variation histograms of confirmed cases, deaths and cured cases in Wuhan and Hubei. Trend curves were fitted to the histograms, and the date corresponding to the highest point of each trend curve was identified. The peak dates of the trend curves of daily confirmed cases, deaths and cured cases were February 9, February 15 and February 29, respectively, in Wuhan City, and February 8, February 14 and February 28, respectively, in Hubei Province. The death time and the cure time were therefore 6 days and 20 days, respectively, in both Wuhan City and Hubei Province.
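The lags follow directly from the peak dates reported above, as this short check shows. The dictionary layout and function name are illustrative only.

```python
from datetime import date

# Peak dates of the fitted trend curves, as reported in the text (year 2020).
peaks = {
    "Wuhan": {"confirmed": date(2020, 2, 9),
              "deaths": date(2020, 2, 15),
              "cured": date(2020, 2, 29)},
    "Hubei": {"confirmed": date(2020, 2, 8),
              "deaths": date(2020, 2, 14),
              "cured": date(2020, 2, 28)},
}


def lags(region_peaks):
    """Return (death time, cure time) in days from the three peak dates."""
    death_time = (region_peaks["deaths"] - region_peaks["confirmed"]).days
    cure_time = (region_peaks["cured"] - region_peaks["confirmed"]).days
    return death_time, cure_time


wuhan_lags = lags(peaks["Wuhan"])
hubei_lags = lags(peaks["Hubei"])
```

Both regions give a death time of 6 days and a cure time of 20 days.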

Estimation of diagnosis dates of the deaths and cured cases
The estimated diagnosis dates of deaths and cured cases were determined from the mean death and cure times, respectively. We assumed that patients who died did so on day 6 after the diagnosis was confirmed, in both Wuhan and Hubei. The estimated diagnosis date of the deceased patients is therefore 6 days before the death date. We likewise assumed that cured patients were discharged on day 20 after diagnosis, so the estimated diagnosis date of the cured cases is 20 days before the announced cure date, in both Wuhan and Hubei. Figure 5 shows the curves of the numbers of cumulative deaths and cumulative cured cases of COVID-19 on the estimated diagnosis dates. The number of confirmed cases was equal to the sum of the number of cumulative deaths and the number of cumulative cured cases on the same estimated diagnosis date. Since the number of COVID-19 deaths on a given day was significantly smaller than the number of cured cases, the curve of estimated confirmed COVID-19 cases, which is the sum of the cumulative cured cases and the cumulative death toll, was close to the curve of cumulative cured cases. The shapes and trends of the curves in Wuhan City are similar to, but lower than, those in Hubei Province.

Instant CFRs
The instant CFR of COVID-19 in Wuhan City was less than 10% (1.3-9.4%), with an average of 7.5%, from January 1 to April 9, 2020. It was low in the first half of January, gradually increased and exceeded 9.0% at the end of January, then gradually stabilized at around 8.0% in late February. It was no more than 8% in March and has been stable at 7.69% since early April (Figure 6A). The instant CFR of COVID-19 in Hubei Province showed the same trend as that in Wuhan City, but the CFRs were lower than those in Wuhan City. The instant CFR of COVID-19 in Hubei Province did not exceed 7.4% (1.2-7.4%), with an average of 6.2%, from January 1 to April 10, 2020. It was lower than 4% in the first half of January, then increased to about 7.0% in late January, and has been stable at 6.62% since early April (Figure 6B). The curves showed that the CFRs were stable from April 2020 onwards.

Instant cure rates
The cure rate and the CFR are complementary figures, since cure and death are competing events. The instant cure rate of COVID-19 in Wuhan City was between 90.1% and 98.8%, and has been stable at 92.3% since early April. The cure rate in Hubei Province was between 92.6% and 98.7%. Overall, the cure rates did not fluctuate much, and finally stabilized at 92.3% in Wuhan City and at 93.5% in Hubei Province on April 9 (Figure 6).

The mortality rate
The numbers of deaths in Wuhan and Hubei were 3,869 and 4,512, respectively, on April 30, 2020. The populations of Wuhan City and Hubei Province are 11.21 million and 59.27 million, respectively. The mortality rates were therefore 34.5/100,000 in Wuhan City and 7.61/100,000 in Hubei Province.
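The arithmetic behind these two figures can be reproduced in a few lines. The function name is illustrative; the death counts and populations are those given in the text.

```python
def mortality_per_100k(deaths, population):
    """Deaths per 100,000 population."""
    return deaths / population * 100_000


wuhan_rate = mortality_per_100k(3_869, 11_210_000)
hubei_rate = mortality_per_100k(4_512, 59_270_000)

print(f"Wuhan: {wuhan_rate:.1f}/100,000")
print(f"Hubei: {hubei_rate:.2f}/100,000")
```

Unlike the CFR, this rate uses the whole population as the denominator, which is why it is far smaller than the instant CFR reported above.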

Discussion
In view of the misleading way in which reported CFRs have been calculated, we have, for the first time, established a method to calculate the instant CFR of COVID-19. This method resolves two problems: COVID-19 patients who are still hospitalized, and the time lag between diagnosis and outcome. Using this new calculation, we estimated the CFRs of COVID-19 in Wuhan City and its affiliated Hubei Province, where the epidemic was most severe within China. The results showed that the instant CFR of COVID-19 was 1.3%-9.4% in Wuhan and 1.2%-7.4% in Hubei from January 1 to May 1, 2020. The CFR has stabilized at 7.69% in Wuhan and 6.62% in Hubei since early April 2020.
The accurate CFR of an infectious disease can be obtained after the end of the outbreak. At present, the outbreak of COVID-19 is ongoing throughout the world. The numbers of confirmed cases, deaths and cured cases are constantly changing 14 . Estimating the CFR is therefore challenging. However, it is critical to balance the socioeconomic burden of infection control interventions against their potential benefit for mankind 10 . Zhang YP analyzed the COVID-19 cases reported in China's Infectious Disease Information System through February 11, 2020 12 . A total of 1,023 deaths occurred among 44,672 confirmed cases, for an overall CFR of 2.3%. However, he ignored the fact that 38,909 of the confirmed cases were still hospitalized. At present, on official websites, CFRs are calculated by dividing the number of known deaths by the number of confirmed cases. These figures reflect neither the CFR nor mortality and might be off by orders of magnitude 10 . The confirmed cases include deaths, cured patients and hospitalized patients. Some of the hospitalized patients may die in the future, and their outcomes do not yet contribute to the calculation of case fatality. In addition, the diagnosis of viral infection precedes death or recovery by days to weeks. The number of deaths should therefore be compared with the number of cases confirmed earlier; accounting for this delay increases the estimated CFR.
We assumed that the CFR of the whole outbreak could be regarded as a collection of many successive instant CFRs. The instant CFR is directly related to the various factors acting at a given time, so knowing it in advance makes it easier to analyze those factors and to take action to influence the progress of the disease. The instant CFR gradually approaches the final CFR as time goes on and the outbreak comes to an end. The extreme instant CFR can be used as an approximate value of the CFR within a certain period of time. This approach provides a way to calculate the CFR accurately without involving the data of hospitalized patients.
The instant unit of the CFR was set as one day, and daily CFRs were calculated. The daily CFR was equal to the daily cumulative number of deaths divided by the sum of the number of cumulative deaths and the number of cumulative cured patients. The number of daily confirmed cases is equal to the number of deaths plus the number of cured cases. Confirmed COVID-19 patients had to go through a certain period of time (the death time or the cure time) before reaching their final outcome: death or cure. The date of death minus the death time is the estimated diagnosis date of a deceased patient. Similarly, the cure date minus the cure time is the estimated diagnosis date of a cured patient. The daily CFR and cure rate were then calculated from the cumulative numbers of deaths and cured cases that shared the same estimated diagnosis date. This rate varies from day to day with disease-related factors.
The outbreak of COVID-19 first appeared in Wuhan, where the situation was most serious and the CFR was the highest in China. The CFR varied over time. In the first half of January, the CFR was less than 10% because the outbreak was not yet very serious or widespread, and it was easy for patients to see a doctor. By the beginning of February, however, the number of patients had increased greatly, and it was difficult for patients to receive medical care. The admission and diagnosis of COVID-19 patients were delayed, hospitals were overcrowded, and medical conditions deteriorated. As a result, the CFR rose sharply to 19.5%. In February, a large amount of domestic medical resources was sent to support Wuhan, and medical conditions gradually improved. Treatment concepts and methods were updated, the level of medical care improved, and treatment procedures became increasingly standardized. In addition, the virulence of SARS-CoV-2 may decline as the virus passes through generations. The proportion of severe cases decreased, and the CFR decreased to nearly 5.26% on April 1. We speculated that this CFR will be close to the final CFR. The trend in Hubei Province was similar to that in Wuhan, but the CFR was lower. The CFRs in the first half of January 2020 were less than 10%. From the second half of January to the beginning of February, the rate rose to 10%-16.6% and then decreased to 4.82% on April 1. We used this method to estimate the CFR of COVID-19 in China (5.7%); this estimate is close to the actual value (5.5%), which supports the reliability of the method. The COVID-19 CFR of China calculated with this method was higher than the CFR reported in many articles and by authorities, which suggests that the CFRs of COVID-19 in Wuhan City and Hubei Province were also underestimated.
For comparison, in the outbreak of severe acute respiratory syndrome (SARS) in 2003, 343 of 5,327 probable SARS cases in mainland China died, giving an overall CFR of 6.4% 16 . In Beijing, Tianjin, Guangdong, Shanxi and Inner Mongolia, the CFRs of SARS were 7.3%, 12.0%, 5.4%, 9.9% and 8.1%, respectively. In the present study, the CFRs were 7.69% and 6.62% in Wuhan City and Hubei Province, respectively. The CFRs of COVID-19 in Wuhan and Hubei are therefore at almost the same level as that of SARS in mainland China.
This study has some limitations. The major influencing factors were the determination of the mean death time and the mean cure time. Not every patient followed the mean death time or cure time, so the result may deviate from the actual situation. In addition, our study was based on official data. We believe that the mortality rate would be higher and the CFR lower if missed cases, false-negative nucleic acid test results, or asymptomatic infections were included. We suggest that governments disclose case data on infectious diseases so that these resources can be shared for scientific study. If accurate cure and death times are obtained, they may help to calculate the CFR correctly and provide a basis for the prevention and control of infectious diseases.

Declarations
Availability of data and materials: The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Figure 1
The relationship between the estimated diagnosis date (EDD) and the dates of deaths or cured cases in the daily case notification data (DCND).

Figure 2
The daily variation of new confirmed cases, new cured cases and new deaths of COVID-19 in Wuhan City (A) and Hubei Province (B).

Figure 3
The daily variation of cumulative confirmed cases, cumulative cured cases and cumulative deaths of COVID-19 in Wuhan City and Hubei Province.