Savable but lost lives when ICU is overloaded: a model from 733 patients in epicenter Wuhan, China

Background. Coronavirus disease 2019 (COVID-19) can deteriorate suddenly at a certain point in its course, at which stage intensive care unit (ICU) intervention is needed. The resulting urgent and large demand for ICU beds poses great risks to the medical system. Estimating the mortality of critical in-patients who were not admitted to the ICU (MI-mortality) is valuable for optimizing the management and assignment of ICU resources.
Methods. This retrospective study included 733 in-patients diagnosed with COVID-19 at Huangpi Hospital of Traditional Chinese Medicine (Wuhan, China) as of March 18, 2020. It aimed to estimate the MI-mortality and to build a model that identifies critical in-patients. Demographic, clinical and laboratory results were collected and analyzed. The mortality of patients who did not receive ICU care and died was analyzed. To this end, the key prognostic factors for patients who may need ICU care were identified, and a prognostic classification model based on machine learning was built to identify patients who may need ICU care.
Results. Considering the shortage of ICU beds at the beginning of the outbreak, we defined MI-mortality as the mortality of patients who were predicted to need ICU treatment yet did not receive it, and ICU-mortality as the mortality of patients who entered the ICU. To estimate MI-mortality, a prognostic classification model was built to identify in-patients who may need ICU care based on the medical factors collected in hospital. Its predictive accuracies on the whole patient set (733 patients [25 ICU, 708 non-ICU]), the training set (586 [20, 566]) and the testing set (147 [5, 142]) were 0.8513, 0.8935 and 0.8288, with AUCs of 0.8844, 0.8941 and 0.9120, respectively. Our analysis showed that the MI-mortality was 41% and the ICU-mortality was 32%, implying the importance of having enough ICU beds to treat patients in critical condition.
Conclusions. Of the 733 patients, 25 in-patients were admitted to the ICU, of whom 8 died.

COVID-19 has caused an average mortality of 5.3% worldwide. Yet the reported mortality varies widely, from as high as 27.3% in Yemen to as low as 0.1% in Singapore and Qatar (updated on June 19) [16]. The reasons for such differences remain unknown. Plausible explanations include a low proportion of infected people in the whole population, a high standard of medical care, and a larger number of ICU beds per capita. When ICU beds are critically short, a very tough decision must be made to give high priority to seriously ill patients who have a hope of survival. Estimating the mortality of critical patients who failed to receive ICU care will further help to explain the differences in mortality across countries and to optimize the assignment of ICU resources.
In this study, data from 733 patients at Huangpi Hospital of Traditional Chinese Medicine (Wuhan, China) were collected and analyzed with benchmark machine learning methods. The patients were systematically reviewed and their disease progression was carefully quantified. The study aimed to estimate the mortality of critical patients who should have been admitted to the ICU early yet were not, for various reasons. To this end, a prognostic system was built to identify the patients who were more likely to need ICU care, thereby helping to estimate the number of ICU beds needed for early preparation.

Study Design and Participants
This retrospective cohort study comprised 733 patients diagnosed with COVID-19. The patients were admitted to Huangpi Hospital of Traditional Chinese Medicine (Wuhan, China) from January to March 2020 and were treated by the Guangxi Medical Team, which had joined the effort against COVID-19. Methods for laboratory confirmation of SARS-CoV-2 infection have been described elsewhere [17], [18]. Briefly, next-generation sequencing, real-time reverse-transcriptase polymerase chain reaction (RT-PCR), or Immunoglobulin M (IgM) and Immunoglobulin G (IgG) antibody tests can be used to diagnose patients with COVID-19 [18]. Throat-swab specimens were obtained from all patients and re-examined every other day during treatment.
This study was approved by the Ethics Committee of the First Affiliated Hospital of Guangxi Medical University, and the requirement for informed consent was waived.

Data Collection
The data were extracted from electronic medical records. For each patient, three types of factors were extracted: demographic, clinical and laboratory results. The demographic factors include medical history and census information, such as gender, age, presence or absence of comorbidities, time from onset to admission, time from admission to ICU care and to death, and main symptoms at admission. The clinical and laboratory examinations include chest radiographs or CT scans, treatment measures, and daily routine tests recorded in detail (12 factors such as pulse, respiration rate, blood pressure, body temperature, oxygen saturation, heart rate, etc.). The presenting symptoms refer to the first symptoms related to the main complaint, such as fever, cough, fatigue, diarrhea, etc. In total, 909 features were indexed for each patient, giving a comprehensive characterization of disease progression. All data were handled by computer professionals and checked by two physicians (HW and JZ).

Laboratory Procedures
Routine blood examinations include complete blood count, coagulation profile, serum biochemical tests (including liver function (twelve items), renal function and electrolytes (twelve items), blood lipid and blood glucose (three items), procalcitonin detection and fluorescence, glucose determination (various enzymatic methods), six sets of coagulation, and five categories of complete blood count + CRP), respiratory tract infection pathogen IgM (nine items) and influenza A/B virus antigen detection. Altogether, 173 examination indicators were extracted from the inpatients.

Study Definitions
Fever was defined as an axillary temperature of at least 37.3℃. The illness severity of COVID-19 was defined according to the Chinese management guide for COVID-19 (version 7.0) [4]. Critical patients are those who should be admitted to the ICU. The criteria for ICU admission were 1) respiratory failure requiring mechanical ventilation, 2) shock, and 3) failure of other organs. Because medical resources were limited, there was no guarantee that those who met the above three conditions could be admitted to the ICU. Critical patients who should have been admitted to the ICU yet were not, owing to the lack of ICU beds, are referred to here as Missing-ICU patients. All patients in the ICU met the aforementioned three conditions or were in an even more serious state. The mortality of the patients admitted to the ICU is referred to as ICU-mortality. Hepatorenal insufficiency indicated liver or kidney dysfunction, such as cirrhosis, hepatic carcinoma, renal cyst, etc. A CT scan showing double lung infection indicates abnormal CT manifestations, such as ground-glass opacity, consolidation, reversed halo sign, fibrosis, septal thickening, etc.
Continuous variables were quantified by six statistical measurements: median, mean, maximum, minimum, standard deviation, and interquartile range (IQR) [10]. The six measurements are sufficiently comprehensive for variables following a normal distribution. Categorical variables were expressed as 0 or 1. All 909 features were extracted from the demographic, clinical and laboratory results for modeling, analysis and forecasting. Of the underlying factors, 143 were continuous variables (858 features) and 51 were categorical variables (51 features).
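The feature construction described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `six_summaries`, the sample readings, the use of the sample standard deviation, and the quartile convention are all assumptions, since the text does not specify them.

```python
import statistics

def six_summaries(values):
    """Collapse one continuous factor (repeated measurements) into six features."""
    # Quartile convention ("inclusive") is an assumption; the paper does not state one.
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "median": statistics.median(values),
        "mean": statistics.fmean(values),
        "max": max(values),
        "min": min(values),
        "std": statistics.stdev(values),  # sample standard deviation (assumed)
        "iqr": q3 - q1,
    }

# Hypothetical daily body-temperature readings for one patient.
temps = [36.8, 37.4, 38.1, 37.9, 37.2]
features = six_summaries(temps)

# Feature count reported in the text: 143 continuous factors x 6 summaries
# plus 51 binary categorical factors.
n_features = 143 * 6 + 51
```

With the counts given in the text, `n_features` evaluates to 909, which matches the stated number of features.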
For each factor, the patients were dichotomized into two subgroups by thresholds. Accordingly, we calculated the true positive rate (TPR) and false positive rate (FPR) at each threshold and drew the receiver operating characteristic (ROC) curve. The area under the curve (AUC) was calculated to measure the prognostic power of each factor. The closer the value is to 1, the better the prognostic power. The ten factors with the largest AUCs were extracted to build a prognostic classification model.
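The per-factor screening can be illustrated with a short sketch. It assumes that higher values of a factor indicate a higher chance of needing ICU care and that every observed value is tried as a threshold; the function names and the toy data are hypothetical and are not taken from the study.

```python
def roc_points(values, labels):
    """Use every observed value as a threshold; return (FPR, TPR) points."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(values), reverse=True):
        # A patient is predicted positive when the factor is at or above t.
        tp = sum(1 for v, y in zip(values, labels) if v >= t and y == 1)
        fp = sum(1 for v, y in zip(values, labels) if v >= t and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def top_factors(table, labels, k=10):
    """Rank factors by single-factor AUC and keep the k best."""
    scored = {name: auc(roc_points(vals, labels)) for name, vals in table.items()}
    return sorted(scored, key=scored.get, reverse=True)[:k]

# Toy data: 1 = ICU-care, 0 = Non-ICU-care (values are made up).
labels = [1, 1, 0, 0, 0]
table = {
    "LDH": [420, 390, 250, 400, 210],
    "age": [70, 45, 50, 66, 30],
}
auc_ldh = auc(roc_points(table["LDH"], labels))
best = top_factors(table, labels, k=1)
```

A factor whose low values signal risk would score below 0.5 under this convention; how the study handled such factors is not stated.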

Statistical Analysis
The Mann-Whitney U test, t-test, χ² test, or Fisher's exact test was used, as appropriate, to compare the differences between the two identified subgroups. The comparison covered the ten factors with the largest AUC values. Boxplots were drawn to illustrate the statistical differences.
Estimating the MI-mortality for the patients who may survive
This study aimed to estimate the mortality of critical patients who should have been admitted to the ICU early yet were not, for various reasons. To this end, we first built a prognostic model to identify the critical patients, i.e., those who need ICU care. The study flow chart is shown in Fig. 1.
Building a prognostic model for identifying the critical in-patients who need ICU care. We included the patients who were first admitted to hospital and then received ICU care. Such patients were labeled "ICU-care". The in-hospital patients who were not admitted to the ICU before discharge were labeled "Non-ICU-care". For both types of patients, the clinical measures collected during the hospital stay were extracted. The whole sample was randomly divided into two datasets: one was used to build a classifier and the other to test the prognostic performance of the classifier. The training and testing datasets consisted of 586 patients (20 ICU-care, 566 Non-ICU-care) and 147 patients (5 ICU-care, 142 Non-ICU-care), respectively. We treated the prediction of whether a patient needs ICU care as a supervised learning problem. We first selected the ten factors with the largest AUCs when their prognostic power was evaluated individually. These ten factors were then used to build a composite classification model with the benchmark support vector machine (SVM) model [19]. Because the dataset was severely class-imbalanced, we employed balanced sampling with an ensemble learning strategy [20]. We divided the 566 Non-ICU-care samples into 29 groups, each of which was combined with the 20 ICU-care samples. The resulting 29 balanced training subsets were used to train 29 SVM classifiers. After training, 29 classifiers were thus obtained via the bootstrap sampling scheme. The 29 classifiers were applied to the test samples, and the predicted label of each sample was obtained by majority voting.
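The balanced-sampling ensemble with majority voting can be sketched as below. To stay dependency-free, the sketch swaps in a nearest-centroid rule as the base learner; it is a stand-in, not an SVM, and the study's actual kernel and parameters are not given in the text. The disjoint partition of the majority class, the function names and the synthetic two-feature data are likewise assumptions made for illustration.

```python
import random

def split_majority(majority, minority_size):
    """Partition the majority class into disjoint, near-equal groups."""
    n_groups = -(-len(majority) // minority_size)  # ceiling division
    return [majority[i::n_groups] for i in range(n_groups)]

def train_centroid(pos, neg):
    """Stand-in base learner (NOT an SVM): assign to the nearest class centroid."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    c_pos, c_neg = centroid(pos), centroid(neg)
    def predict(x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, c_pos))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, c_neg))
        return 1 if d_pos < d_neg else 0
    return predict

def train_ensemble(pos, neg):
    """Train one base learner per balanced subset (all positives + one group)."""
    return [train_centroid(pos, group) for group in split_majority(neg, len(pos))]

def vote(ensemble, x):
    """Majority vote over the ensemble; 1 = predicted to need ICU care."""
    votes = sum(clf(x) for clf in ensemble)
    return 1 if votes * 2 > len(ensemble) else 0

# Synthetic stand-in for the training set: 20 ICU-care, 566 Non-ICU-care.
random.seed(0)
icu = [[random.gauss(2, 0.5), random.gauss(2, 0.5)] for _ in range(20)]
non_icu = [[random.gauss(0, 0.5), random.gauss(0, 0.5)] for _ in range(566)]
ensemble = train_ensemble(icu, non_icu)
```

Splitting 566 samples by a minority size of 20 gives 29 groups of 19 or 20 samples each, consistent with the 29 classifiers reported. An odd number of voters also rules out ties.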
Estimating the MI-mortality for the patients who may survive. COVID-19 has caused an average mortality of 6.9% worldwide. When ICU beds are critically short, a very tough decision must be made to give high priority to the patients who can be saved. However, the mortality of patients who should have been treated in the ICU, as predicted in the first step, yet were not admitted for various reasons remains unknown. Given the high sensitivity and specificity of 1 and 0.8239 (Table 2) of the first-step classification model in predicting whether a patient should be admitted to the ICU, we reasoned that the predicted-positive patients did need ICU care. Consequently, we considered the patients who died after being classified as needing ICU care yet not receiving it. We defined the ratio of the number of such patients to the total number of deaths as Missing-ICU-mortality (MI-mortality). MI-mortality measures the necessity of ICU care for the selected patients in critical condition. It also measures the reliability of the model built in the first step. Furthermore, the mortality of the patients admitted to the ICU was also estimated in order to compare MI-mortality with ICU-mortality. This difference can help us not only to understand the differences in mortality between countries but also to plan ICU resources rationally in emergencies.
Fig. 2-A). Their corresponding boxplots for the two types of patients are shown in Additional file 2. Their p-values and performance measurements are summarized in Table 1. LDH and hs-CRP differed significantly between the two groups (p-value ≤ 0.001). One can observe that the mean values of hs-cTnI, Mb, D-Dimer, LDH, IgM, CK-MB and hs-CRP were higher in ICU-care patients than in Non-ICU-care patients. The fluctuations (variances) of hs-cTnI and Mb were larger. Age was also a significant factor: older patients tended to need ICU care more than younger patients (p-value ≤ 0.0001).
The statistics showed that patients older than 60 (more than half of the total) were more likely to be admitted to the ICU. Table 2 gives the numerical results, with an accuracy of 0.8299 and an AUC of 0.9120 for predicting whether inpatients will need ICU care. According to the confusion matrices (see Supplementary Table 1), 25 patients were predicted to need ICU care but in fact did not enter the ICU. We named this group of patients Missing-ICU. The reason was that ICU resources were limited, so it could not be guaranteed that all critical inpatients, even those satisfying the criteria for ICU care, could be admitted to the ICU.
The estimated MI-mortality is 41%. In the aforementioned step, we considered the patients who died without being admitted to the ICU and who were identified by the classifier as needing ICU care. We defined MI-mortality as the ratio of the number of such patients to the total number of deaths. We repeated the sampling and training scheme 100 times to ensure full coverage of the whole patient dataset. The mean and standard deviation of the MI-mortality were 0.41 and 0.30, respectively. The mean MI-mortality of 0.41 implies that patients who did not receive adequate ICU treatment faced a high mortality of 41%. The standard deviation of 0.3 demonstrates that the classifier built in the first step is relatively stable. The first-step model's recommendations on which patients should be admitted to the ICU were accurate.
On the whole, among the 16 non-survivors, the MI-mortality rate is 41%. The ROC results of the model built on the ten selected features outperform those of the model using all 909 features (as shown in Fig. 2-B).
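The two mortality ratios can be written out as a short worked example. The ICU-mortality figure uses the counts reported in this paper (8 deaths among 25 ICU admissions). The patient records, field names and per-run values below are hypothetical and only illustrate the definition and the averaging over repeated runs; they are not the study's data.

```python
import statistics

def mi_mortality(patients):
    """Deaths predicted to need ICU care but never admitted, over all deaths."""
    deaths = [p for p in patients if p["died"]]
    missed = [p for p in deaths if p["predicted_icu"] and not p["admitted_icu"]]
    return len(missed) / len(deaths)

# ICU-mortality from the counts reported in the text: 8 deaths / 25 admissions.
icu_mortality = 8 / 25

# Hypothetical records for one run of the sampling-and-training scheme.
toy_run = [
    {"died": True,  "predicted_icu": True,  "admitted_icu": False},
    {"died": True,  "predicted_icu": True,  "admitted_icu": False},
    {"died": True,  "predicted_icu": True,  "admitted_icu": True},
    {"died": True,  "predicted_icu": False, "admitted_icu": False},
    {"died": False, "predicted_icu": True,  "admitted_icu": False},
]
one_run = mi_mortality(toy_run)

# Hypothetical per-run values; the study repeats the scheme 100 times and
# reports the mean and standard deviation across runs.
runs = [0.25, 0.5, 0.75]
mean_mi = statistics.fmean(runs)
sd_mi = statistics.stdev(runs)  # sample standard deviation (assumed)
```

Note that, as defined, the denominator is the total number of deaths rather than the number of Missing-ICU patients.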

Discussion
This study aimed to estimate the mortality of critical patients who failed to receive ICU care by performing early prognosis with machine learning. With the epidemic continuing to spread in many countries, our strategy provides quantitative evidence and a method for estimating ICU admission and MI-mortality, so that as many patients with a hope of survival as possible can be rescued. It helps to explain the differences in mortality across countries and to optimize the assignment of ICU resources.
In the current study, our model identified the patients who should be admitted to the ICU. When the temporal changes of these indicators in the two groups of patients were tallied, the optimal thresholds could be obtained, as shown in Fig. 3. More concretely, once the value of IgM was 0 g/L, the patients were at risk and the immune system was under strain from the virus.
The study has some notable limitations. First, independent cross-institutional samples for model evaluation are missing. Owing to the chaos of the outbreak and other factors such as patient privacy, it was very difficult to collect such complete samples in a short time. Second, the positive samples are a tiny fraction of the total sample. The resulting data imbalance makes it difficult to train a model. To mitigate this problem, we used an effective and mature learning method.

Conclusion
In our cohort of 733 patients, the mortality of patients admitted to the ICU was 32%. There were 25 inpatients who were predicted by our model to need ICU care yet did not enter the ICU because of the shortage of ICU beds. The MI-mortality was 41%.
