A population-based retrospective study of the spread of COVID-19 from Wuhan to Henan province, China

The COVID-19 epidemic has now spread globally and affected over 110 countries. Using publicly available data and official news reports from Henan province, we tracked a total of 1272 cases as of Mar 10th and conducted a retrospective study to investigate factors related to the spread and control of COVID-19. We confirmed that 554 primary patients had a history of travel to or residence in Wuhan within the preceding 2 weeks. Secondary cases accounted for 77.9% (141/181) of all patients aged 61 or older, and contact with unconfirmed returnees from Wuhan was responsible for 27.0% (38/141) of these. The median incubation period was 7 (IQR, 4-10) days, based on time information from 469 cases. For 442 patients with discharge dates, the median duration from onset to cure was 19 (IQR, 15-23) days. The time from onset to seeking care at a hospital varied across age groups and differed between primary and secondary cases. The time from seeking care to cure differed among hospitals. Thus, our results show the spread of COVID-19 and the factors associated with patient outcomes in Henan province, which helps to understand the epidemiological features outside the epidemic area and to control the disease in other regions and countries.


Introduction
The epidemic of coronavirus disease 2019 (COVID-19), which first emerged in mid-December 2019 in the central Chinese city of Wuhan, Hubei province, is caused by a novel coronavirus (SARS-CoV-2) 1,2. While infections in the first case cluster were initially thought to be mostly due to wild-animal-to-human transmission from a local seafood market 3, the growth of case incidence in Wuhan after the closure of the market, and the exportation of cases in hospital and family settings in different regions, showed evidence of human-to-human transmission 4. Fueled by migration events, COVID-19 has gradually become a potential global public health threat 5.
Considering the large travel volume during the traditional Chinese New Year, the government of Hubei province imposed a lockdown on Wuhan City, closing airports, railway stations, and highways at 10:00 am on Jan 23rd, and nearly all of the other cities in the province followed later 6. The closure of Wuhan largely reduced the number of exported cases and achieved considerable results in slowing the spread of COVID-19. However, more than a million people had left Wuhan before Jan 23rd, 2020.
Considering the risk of the epidemic spreading worldwide, the WHO announced a global public health emergency on Jan 30th, 2020 7. Unfortunately, by Mar 11th, 2020, 118,322 cases of novel coronavirus infection had been reported, causing more than 4200 deaths worldwide. COVID-19 has therefore been assessed and characterized as a pandemic 8.
To help control the epidemic and cure infected patients, clinical and epidemiological studies of COVID-19 are encouraged. Early clinical studies focused on describing the clinical characteristics and treating the sick. Clinical research has shown that the outcomes of patients outside Wuhan differ from those initially reported in Wuhan 9,10. However, most clinical investigations have focused on patients in the epidemic area in Wuhan. Detailed information about patients outside Wuhan remains poorly understood despite the increasing number of confirmed cases [11][12][13]. Moreover, some important features, including the duration of the disease, have not been fully analyzed in epidemiological studies [14][15][16]. Thus, it is urgent to systematically study the outcomes and other epidemiological features of COVID-19 outside Wuhan 17.
Here, we collected COVID-19-related data published by the Provincial Health Commission of Henan, a province with a population of 100 million that is adjacent to Hubei. We integrated the data with official news reports from Jan 20th to Mar 10th, 2020, and analyzed the epidemic spread, the outcomes of patients, and the possible related factors. The main objectives of the present study were to present the actual dynamic changes in the spread of COVID-19 from Wuhan to a new region and thus to provide more information about outcomes and other epidemiological features outside Wuhan.

Results
Dynamic changes and the spread of the epidemic from Wuhan city to Henan province
On Jan 21st, the National Health Commission confirmed the first imported cases of pneumonia in Henan province (Figure 1A). On Feb 12th, there were a total of 1169 cumulative cases and a peak of 901 existing cases in the province (Figure 1B). It took 22 days from the first confirmed case in Henan province to reach the peak of existing confirmed cases. With the gradual decrease in the number of newly confirmed cases and the continuous increase in the number of cures, the cumulative number of cases in Henan province reached 1272 as of Mar 10th, of which, excluding 22 deaths, only 1 was still an existing case (Figure 1C, D). We compared the outcomes of patients in Henan province, Wuhan City, China excluding Wuhan City, China overall, and worldwide excluding China on Mar 10th. We found that the fatality rate was lower in the regions outside Wuhan than in Wuhan, while the cure rate was higher in the regions outside Wuhan (Figure 1E, 1F). These results showed that both the present state and the final outcomes of COVID-19 in Henan province, outside Wuhan, would be better than those in Wuhan. Therefore, the experience of Henan can inform efforts to control the spread of epidemics.
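The fatality and cure rates for Henan can be recomputed from the counts reported above. The following is a minimal sketch, assuming that every case that is neither a death nor an existing case was counted as cured; the variable names are illustrative and not taken from the study.

```python
# Counts reported for Henan province as of Mar 10th, 2020.
cumulative_cases = 1272
deaths = 22
existing_cases = 1

# Assumption: every case that is neither a death nor still existing is cured.
cured = cumulative_cases - deaths - existing_cases

fatality_rate = deaths / cumulative_cases
cure_rate = cured / cumulative_cases

print(f"cured: {cured}")
print(f"fatality rate: {fatality_rate:.2%}")
print(f"cure rate: {cure_rate:.2%}")
```

Under this assumption the cure rate comes out at roughly 98%, in line with the figure quoted in the Discussion.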
The correlation between outflow migration events and primary cases in Henan province
Migration events are one of the most important factors associated with the spread of the epidemic. The closure of Wuhan on Jan 23rd, 2020 is the largest attempted movement restriction or quarantine in human history. To evaluate the influence of population movement before the closure of Wuhan, the epidemic center, on the spread of COVID-19, we extracted the daily outflow index of Wuhan from Jan 10th (the start of the Lunar New Year travel rush) to Jan 23rd, 2020 (the lockdown of Wuhan) from the Baidu migration map. Our data showed that the proportion of the Wuhan-originated outflow population differed across 13 affected cities. For example, Xinyang had the highest outflow index among the cities and also the highest number of confirmed cases (Figure 2A). Next, we correlated the index of the 13 cities with the number of primary, secondary, or total confirmed cases in the corresponding city using Spearman's analysis. The results showed significant positive correlations between population input and the number of primary, secondary, or total confirmed cases (Figure 2B, 2C, 2D; Spearman's analysis, P<0.001).
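To illustrate the kind of rank correlation used here, the sketch below computes Spearman's rho as the Pearson correlation of rank vectors, using only the Python standard library. The city values are hypothetical and are not the study's data; the function names are also illustrative. The study itself reports running the analysis in SPSS.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for five cities; NOT the study's data.
outflow_index = [0.8, 2.1, 1.4, 5.6, 3.3]
confirmed_cases = [12, 35, 40, 270, 90]
rho = spearman_rho(outflow_index, confirmed_cases)
print(f"Spearman's rho = {rho:.2f}")
```

Because the statistic depends only on ranks, a single city with a very large case count does not dominate the result the way it could in a Pearson correlation on raw values.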

Personal characteristics of 1272 patients with COVID-19 in Henan province
To investigate the outcomes and other epidemiological features of the spread and control of COVID-19, we extracted publicly available data on confirmed cases from the Health Commissions of Henan province. As of Mar 10th, we tracked a total of 1272 cases, including information about age, sex, illness onset date, confirmation date, etc. The details of 442 cured and 15 deceased patients were also analyzed (Table). The data showed that 54 (4.25%) of the 1272 patients were aged 6 days-18 years, 475 (37.34%) were aged 18-40 years, 562 (44.18%) were aged 41-60 years, and 181 (14.23%) were aged 61 years and older. The median age was 44 years (interquartile range, 32-55 years). More than half of the 1272 patients (688, 54.09%) were male. Overall, 555 patients had a history of travel to or residence in Wuhan within 2 weeks, including 2 patients who had definite contact with the Huanan seafood market. Credible information for 469 patients provided the exact dates of leaving Wuhan or of close contact with confirmed or suspected cases, and of illness onset. The median incubation period from leaving Wuhan or from contact to symptoms was 7 (interquartile range, 4 to 10) days. Among 442 patients with exact dates of discharge, the median duration from illness onset to cure was 19 (interquartile range, 15 to 23) days. Among 15 patients with exact dates of both illness onset and death, the median interval from illness onset to death was 10 (interquartile range, 8 to 18) days, and their median age was 74 (interquartile range, 68-80) years.
Table note: Values are medians (interquartile ranges, IQR) or counts (percentages, %). Numbers* (n) do not total 100% owing to inclusion and exclusion criteria or missing data.

Age distribution of patients classified as primary or secondary cases of the COVID-19 epidemic
We analyzed the age distribution of patients after classifying them as primary or secondary cases in the spread of COVID-19. We found that nearly 90% (497/555) of primary cases were patients aged 19-60 years. Notably, the proportion of secondary cases differed across age groups (Figure 3A; Chi-square test, P<0.001); in particular, there were more than 3 times as many secondary cases as primary cases among elderly patients (>60 years old). Next, we further investigated the contact history of the elderly secondary cases, and the proportion who had contact with primary cases was similar to the proportion who had contact with secondary cases. Surprisingly, 27.14% of these elderly patients became infected after contact with their children or relatives who had returned from Wuhan but, according to the reports, had not been confirmed to have COVID-19 (Figure 3B).
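The contrast between elderly and younger patients can be checked with a simplified chi-square test. The sketch below collapses the age groups into a 2x2 table (aged 61 or older versus younger), using the counts reported in the text: 555 primary cases among 1272 patients, and 141 secondary cases among the 181 elderly patients. This collapse is an assumption made for illustration; the study's own test compared more age groups and was run in SPSS.

```python
from math import erfc, sqrt

# Counts taken from the text.
total, total_primary = 1272, 555
elderly, elderly_secondary = 181, 141

# Derive the remaining cells of the 2x2 table.
elderly_primary = elderly - elderly_secondary
younger_primary = total_primary - elderly_primary
younger_secondary = (total - elderly) - younger_primary

# Rows: elderly, younger. Columns: primary, secondary.
table = [
    [elderly_primary, elderly_secondary],
    [younger_primary, younger_secondary],
]

row_sums = [sum(row) for row in table]
col_sums = [sum(col) for col in zip(*table)]
n = sum(row_sums)

# Pearson chi-square statistic (no continuity correction).
chi2 = 0.0
for r, row in enumerate(table):
    for c, observed in enumerate(row):
        expected = row_sums[r] * col_sums[c] / n
        chi2 += (observed - expected) ** 2 / expected

# With 1 degree of freedom the tail probability has a closed form.
p_value = erfc(sqrt(chi2 / 2))

# Secondary-to-primary ratio among elderly patients.
ratio = elderly_secondary / elderly_primary

print(table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}, ratio = {ratio:.2f}")
```

The derived elderly counts (40 primary, 141 secondary) give a ratio above 3, matching the "more than 3 times" statement in the text.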

The duration and outcomes of the disease progression in different hospitals
Studies so far have not systematically analyzed the duration of the disease in cured patients. Here we showed that the time from illness onset to seeking care at a hospital varied among age groups (Figure 4A; Kruskal-Wallis test, P<0.01). Patients aged 61 years and older took a median of 4 days from illness onset to seeking care at a hospital. We also showed that the cure time varied among age groups (Figure 4B; Kruskal-Wallis test, P<0.01). Patients younger than 19 years took a median of 14 days from seeking care at a hospital to cure. Next, we analyzed the duration of COVID-19 in primary and secondary cases. The results showed that the difference between the two groups was mainly reflected in the duration from illness onset to seeking care at a hospital (Figure 4C; Mann-Whitney U test, P<0.05), while the time from seeking care at a hospital to cure did not differ significantly between the two groups (Figure 4D; Mann-Whitney U test, P>0.05). In addition, we found a significant difference in cure time among patients cured and discharged from the city hospital and several county hospitals in Xinyang (Figure 4E; Kruskal-Wallis test, P<0.001). Our results indicate that the medical environment is also one of the most important factors affecting the outcomes of COVID-19 patients.

Discussion
Although COVID-19 so far seems controllable in China outside Wuhan, it has gradually become a major health problem that threatens other countries through population movement. Using available data from the Provincial Health Commission of Henan and official news reports, our study described the dynamic changes and the latest status of COVID-19 infection in Henan province, covering a more complete pattern of the outbreak and the control process. The extensive analyses of factors such as population movement, age, primary and secondary case classification, and treatment in different hospitals supplement the epidemiological characteristics of the disease and provide reference information for epidemic control.
Many epidemiological features of the COVID-19 outbreak are similar to those of SARS in 2003 and MERS in 2013. However, the new coronavirus seems more contagious 18. It took only 30 days from early January for COVID-19 to spread from Wuhan to the whole country, and less than 3 months to affect more than 110 countries worldwide. In Henan province, a total of 1272 confirmed cases had been reported by Mar 10th. This pattern of spread provides further evidence of human-to-human transmission and highlights the effect of population movement on the expansion of the epidemic. It is encouraging that 98% of the cases in Henan had been cured as of Mar 10th, with no newly confirmed cases for several days. However, despite the lower case fatality rate, COVID-19 has so far resulted in more deaths than SARS and MERS combined 19.
Based on population-level data, our study showed that the spread of COVID-19 correlated significantly with population movements, and thus the results highlight the importance of assessing population outflows to evaluate the spread of the epidemic. Exploring the association between movement events and epidemic spread is also expected to help identify high-risk areas and guide the formulation of health strategies. Most importantly, the results indicate the essential role of preventing primary cases and the spread outward from epidemic areas. Taking the epidemic situation in Henan province as an example, alongside the efforts of the Wuhan government, which imposed a lockdown on the city on Jan 23rd, 2020, the government of Henan province also introduced strong measures to greatly reduce and restrict public transport. This cooperation demonstrated a strong capacity for emergency response and achieved considerable results in epidemic control.
Of note, the detailed case information published on the official websites of the Provincial Health Commission of Henan was released in a timely manner and fully shared from the beginning of the epidemic, which is important for raising public awareness and increasing participation in the fight against the epidemic. More importantly, detailed information about confirmed cases gives researchers the opportunity to further interpret the data, especially in the early phase of an outbreak when little information is available.
To analyze the cases in detail, we rigorously examined the case information. The incubation period was estimated as the time from when the patient had contact with confirmed or suspected persons, or returned from Wuhan, to symptom onset. The median incubation period in our study was 7 days, which is longer than that in recent reports 9,20. Because the incubation period rarely exceeded 14 days, our data provide guidance for those who need to self-monitor in order to prevent human-to-human transmission.
Our study also showed that the proportion of secondary cases varies across age groups. Although the number of primary patients aged 61 years and older is limited, the proportion of secondary cases among them is relatively high, more than 3 times that of primary cases. Except for those whose exposure history was difficult to trace or who did not provide accurate information, most of the secondary cases in the elderly were infected through contact with primary cases (e.g., their children or relatives) or local cases (e.g., friends, neighbors). It cannot be overlooked that part of the infections resulted from contact with returnees whose symptoms were not mentioned in the reports at the time. The explanation may be that some patients with the new coronavirus infection do not have a fever or radiologic abnormalities on initial presentation, or that the incubation period was long 21. Therefore, limiting the social contacts of these susceptible individuals was crucial for COVID-19 control.
As few studies have reported the duration of the disease, our study estimated it by screening information from 442 cured cases. The results showed that the median duration of the disease was 19 days, covering the time from illness onset, through seeking care at a hospital, to cure. Further analysis showed significant differences among age groups. Notably, the duration from illness onset to seeking care at a hospital was longer for secondary cases than for primary cases, which may be explained by primary cases being more readily detected and more aware of the epidemic. Additionally, the differences in duration among hospitals indicated that patient outcomes can be influenced by the medical environment and are not determined only by the characteristics of the disease and the patient's own condition.
However, our study has limitations. First, we did not count the numbers of severely or critically ill patients because of limited data availability. Previous studies have shown that the proportion of elderly people among severe cases and deaths is higher 22, in line with our result that the median age of the 15 deceased patients was 74 years. Second, other factors, such as coexisting conditions, may also contribute to case fatality or clinical outcome. Without detailed clinical data from individual patients, we were not able to examine these issues directly.
In summary, our study provides retrospective analyses of the spread of COVID-19 from Wuhan to Henan province. We confirmed that the spread of COVID-19 is greatly affected by population movements and showed detailed epidemiological features, such as the age distribution and the duration of the disease. The results may provide valuable information for understanding the spread of COVID-19 and help improve epidemic prevention and control.

Study design and Data sources
In this retrospective study, we used case data from the National or Provincial (City) Health Commissions in China. The data on confirmed cases, updated daily by the government, were collected and integrated from Jan 20th to Mar 10th, 2020. Detailed information was extracted from the data, including basic demographic information, exposure history, date of illness onset, date of seeking care at a hospital, date of cure or death, etc. Population movement data from Wuhan to 13 cities in Henan were obtained from the Baidu Migration Map (http://qianxi.baidu.com/).

Data compilation
Confirmed cases were categorized into two groups: primary cases (with a clear history of staying or traveling in Wuhan within 2 weeks) and secondary cases (those not known to be primary). We defined the total inflow index as the sum of the daily inflow index from Jan 10th to Jan 23rd, 2020. Information about the time from contact with confirmed or suspected persons, or from return from Wuhan, to the appearance of clinical symptoms was extracted from the confirmed cases to calculate the incubation period.
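The incubation period calculation described above can be sketched as follows. The date pairs are hypothetical and are not study records, and the variable names are illustrative. Quartile conventions differ between software packages, so the "inclusive" method used here is an assumption and may not reproduce SPSS output exactly.

```python
from datetime import date
from statistics import median, quantiles

# Hypothetical (exposure date, symptom onset date) pairs; NOT study records.
# "Exposure" stands for the date of contact or of return from Wuhan.
records = [
    (date(2020, 1, 18), date(2020, 1, 22)),
    (date(2020, 1, 20), date(2020, 1, 27)),
    (date(2020, 1, 21), date(2020, 1, 24)),
    (date(2020, 1, 22), date(2020, 2, 1)),
    (date(2020, 1, 19), date(2020, 1, 31)),
]

# Incubation period in days = onset date minus exposure date.
incubation_days = [(onset - exposure).days for exposure, onset in records]

med = median(incubation_days)
# quantiles(..., n=4) returns the three quartile cut points Q1, Q2, Q3.
q1, _, q3 = quantiles(incubation_days, n=4, method="inclusive")

print(f"median {med} days (IQR, {q1}-{q3})")
```

The same pattern applies to the other durations in the study, such as onset to seeking care or seeking care to cure, by swapping in the relevant pair of dates.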

Statistical analysis
Continuous variables were expressed as medians and interquartile ranges. Categorical variables were summarized as counts and percentages and compared by Chi-square test.
We used non-parametric tests to assess differences in duration (Mann-Whitney U test to compare two groups and Kruskal-Wallis test to compare three or more groups). The correlations between the number of total confirmed, primary, or secondary cases and the total inflow index were assessed with Spearman's correlation analysis. We did all analyses in SPSS (Statistical Package for the Social Sciences) version 20.0 software. We considered p values of less than 0.05 to be significant.
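As an illustration of the two-group comparison, the sketch below computes a Mann-Whitney U statistic with a two-sided p-value from the normal approximation, using only the Python standard library. The durations are hypothetical and are not study data. The approximation omits tie and continuity corrections, so it is rough for small samples and will not match SPSS output exactly.

```python
from math import erfc, sqrt


def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided p-value by normal approximation).

    No tie correction or continuity correction is applied in this sketch.
    """
    # U counts the pairs in which a value from a exceeds one from b;
    # tied pairs count as half.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u, p


# Hypothetical days from illness onset to seeking care; NOT study data.
primary_days = [1, 2, 2, 3, 4, 1, 3, 2]
secondary_days = [3, 5, 4, 6, 2, 7, 5, 4]

u_stat, p_value = mann_whitney_u(primary_days, secondary_days)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

A small U for the first sample means its values tend to be lower than those of the second, which is the direction reported for primary versus secondary cases in the time from onset to seeking care.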

Declarations
Figures

Figure 2 The spread of the epidemic is correlated with movement events. (A) Index of outflow from Wuhan to 13 cities in Henan province during the period from Jan 10th to Jan 23rd. (B-D) The correlation between the total index and the total number of primary cases (B), the total number of secondary cases (C), and the total number of confirmed COVID-19 cases (D). Statistical analysis by Spearman's correlation analysis (B-D). Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Figure 1 Spread

Figure 3 Age

Figure 4 The

Table. Epidemiological and demographic characteristics and clinical outcomes of COVID-19 patients in Henan province, China.