Deep Neural Network Analysis of Clinical Variables Predicts Escalated Care in COVID-19 Patients


This study sought to identify the most important clinical variables that can be used early on to determine which COVID-19 patients will need escalated care, using deep-learning neural networks. Analysis was performed on COVID-19 patients hospitalized between February 7, 2020 and May 4, 2020 at Stony Brook Hospital. Demographics, comorbidities, laboratory tests, vital signs, and blood gases were collected. We compared data obtained at the time of emergency department admission and at the time of intensive care unit (ICU) upgrade for: i) COVID-19 patients admitted to the general floor (N=1203) versus those directly admitted to the ICU (N=104), and ii) patients not upgraded to the ICU (N=979) versus those upgraded to the ICU (N=224) from the general floor. A deep neural network algorithm was used to predict ICU admission, with 80% of the data used for training and 20% for testing. Prediction performance was assessed by the area under the curve (AUC) of the receiver operating characteristic (ROC) analysis. We found that C-reactive protein, lactate dehydrogenase, creatinine, white-blood cell count, D-dimer, and lymphocyte count showed temporal divergence between patients who were upgraded to the ICU and those who were not. The deep learning predictive model ranked essentially the same set of laboratory variables as important predictors of needing ICU care. The AUC for predicting ICU admission was 0.782±0.013 for the test dataset. Adding vital sign and blood-gas data improved the AUC (0.861±0.018). This study identified a few laboratory tests that were predictive of escalated care. This work could help frontline physicians anticipate downstream ICU needs and allocate healthcare resources more effectively.


Introduction
Since it was first reported in Wuhan, China in December 2019 (1,2), the coronavirus disease 2019 (COVID-19) has infected over 27 million people and killed more than 880,000 people worldwide (September 6, 2020) (3). There are recent spikes in COVID-19 cases and there will likely be second waves (4). To date, it is challenging for emergency room physicians to determine which patients need escalated care (i.e., ICU admission) or to anticipate downstream ICU needs for effective allocation of healthcare resources, in part because much is still unknown about this disease. Many studies have reported a large array of clinical variables associated with COVID-19, which include, but are not limited to, patient demographics, clinical presentations, comorbidities, imaging data, vital sign data, and laboratory blood tests (5)(6)(7). A few studies have attempted to predict the need for escalated care and mortality, typically using data obtained at admission to the emergency department (ED) (8)(9)(10)(11). Current results are inconsistent and there is no consensus as to which variables are good predictors of escalated care. This is in part because COVID-19 patients arrived at the emergency department at various stages of disease severity, which could confound the results. It may be more informative to study patients who were subsequently upgraded to the ICU from the general floor.
The goal of this study was to identify the most important clinical variables that can be used early on to determine which patients will need downstream ICU care. We compared patients who were not upgraded to the ICU from the general floor with those who were subsequently upgraded to the ICU, and contrasted this with a comparison between COVID-19 patients admitted to the general floor and those immediately admitted to the ICU. Clinical variables were obtained at the time of arrival at the emergency department as well as at the time of ICU upgrade. A deep neural-network algorithm was developed to identify the most important clinical variables that informed the need for escalated care, and these variables were used to predict ICU admission.
Hospital ED from February 7, 2020 to June 30, 2020. There were 2,892 COVID-19 positive patients as determined by real-time polymerase chain reaction (RT-PCR) for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), of whom 1430 were hospitalized. Patients <18 years old, patients who were still in the hospital at the time of this analysis, and patients who did not have full-code status were excluded.

Methods
The final sample sizes included 1203 patients admitted to the general floor ("general floor", Group A) and 104 directly admitted to the ICU from the ED ("direct ICU", Group B); 979 patients remained on the general floor ("no upgrade", Group C) and 224 were upgraded from the general floor to the ICU ("upgrade ICU", Group D) (Figure 1).
These clinical variables were collected for general floor admission (Group A) versus direct ICU (Group B) at ED admission. Data were collected for the no-upgrade versus upgrade group at ED admission to the general floor. Data were also collected one day prior to ICU upgrade (Group D) or three days after hospitalization for the no-upgrade group (Group C). The three-day time point was chosen for comparison because the median time for patients to be upgraded to the ICU from the general floor was 3 days.
Preprocessing and deep neural network prediction model: Bicarbonates, pCO2, pO2, pH, hematocrit and troponin were also not used in the machine learning analysis because invasive blood gas samples and troponin were not routinely obtained in our hospital on general floor patients. For the rest of the laboratory variables, missing data (<25%) were imputed using standard methods (12).
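The imputation method is cited (12) but not described here. As a minimal sketch only, the following assumes median imputation and assumes that a variable with 25% or more missing values would be dropped rather than imputed; the function name `impute_median` and the example values are hypothetical.

```python
from statistics import median

# Threshold taken from the text; treating it as a drop rule is an assumption.
MAX_MISSING_FRACTION = 0.25


def impute_median(column, max_missing=MAX_MISSING_FRACTION):
    """Fill None entries with the median of the observed values.

    Returns None when the column is empty, has no observed values, or has a
    missing fraction at or above max_missing, signalling that the variable
    should be excluded instead of imputed.
    """
    if not column:
        return None
    observed = [v for v in column if v is not None]
    missing_fraction = 1 - len(observed) / len(column)
    if not observed or missing_fraction >= max_missing:
        return None
    fill = median(observed)
    return [fill if v is None else v for v in column]


# Hypothetical CRP values with one of five measurements missing (20%).
crp = [4.2, None, 11.0, 7.5, 9.1]
crp_imputed = impute_median(crp)

# Hypothetical variable with half of its values missing: not imputed.
too_sparse = impute_median([1.0, None, None, 2.0])
```

Median imputation is used here because the laboratory variables are summarized by medians and interquartile ranges elsewhere in the paper; other standard methods (for example, multiple imputation) would fit the same description.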
Two deep neural network models were built: one using laboratory tests (excluding vitals and blood gases) and the other using laboratory tests, vitals and blood gases. Both were developed in Jupyter Notebook with TensorFlow and Keras, and were constructed using 2 fully connected dense layers. The inputs consisted of the clinical variables for no-ICU versus ICU patients: namely those of Group A (floor) at ED admission and Group C (no upgrade) at the corresponding time of upgrade versus Group B (direct ICU) at ED admission and Group D (upgrade) at the time of upgrade. The output was ICU admission. For both models, the dataset was randomly split into 80% training data and 20% testing data, and the model was trained for 50 epochs with a batch size of 6. For the model using laboratory tests, a learning rate of 0.001 was used, whereas for the model using laboratory tests, vitals, and blood gases, a learning rate of 0.0009 proved optimal. A softmax activation function was used in the output layer. The clinical variables were ranked using SHAP (SHapley Additive exPlanations), a Python package that explains the output of machine learning models based on game theory.
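The structure described above (a random 80/20 split, two fully connected layers, and a softmax output) can be illustrated without any deep learning framework. The sketch below is not the authors' Keras model: the hidden-layer size, the ReLU hidden activation, the weights, and all function names are assumptions chosen for illustration, and training (50 epochs, batch size 6) is not shown.

```python
import math
import random


def split_80_20(rows, seed=0):
    """Randomly shuffle rows and return (train, test) with an 80/20 split."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = (4 * len(shuffled)) // 5
    return shuffled[:cut], shuffled[cut:]


def dense(inputs, weights, biases):
    """One fully connected layer; weights[j] holds the weights of output unit j."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def softmax(logits):
    """Numerically stable softmax; outputs are positive and sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(features, hidden_w, hidden_b, out_w, out_b):
    """Two dense layers: a ReLU hidden layer (assumed) and a 2-class softmax."""
    hidden = relu(dense(features, hidden_w, hidden_b))
    return softmax(dense(hidden, out_w, out_b))


# Hypothetical, untrained weights for 3 standardized inputs and 2 hidden units.
hidden_w = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0], [-1.0, 1.0]]  # output rows: "no ICU", "ICU"
out_b = [0.0, 0.0]

probs = predict([1.2, 0.4, -0.7], hidden_w, hidden_b, out_w, out_b)
train, test = split_80_20(list(range(100)))
```

With a two-class softmax output, `probs[1]` plays the role of the predicted probability of ICU admission, which is the score an ROC analysis would threshold.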
Statistical analysis and performance evaluation: Statistical analysis was performed using SPSS v26 (IBM, Armonk, NY) and SAS v9.4 (SAS Institute, Cary, NC). Group comparisons of categorical variables in frequencies and percentages were performed using the Chi-squared test or Fisher exact test. Group comparison of continuous variables in medians and interquartile ranges (IQR) used the Mann-Whitney U test. For all analyses, a p value < 0.05 was considered to be statistically significant.
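The analyses above were run in SPSS and SAS. Purely as an illustration of the categorical comparison, the following sketch computes a Pearson chi-squared test for a 2x2 table without continuity correction; the function name and the counts are hypothetical and are not taken from Table 1.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column must have a nonzero total")
    statistic = n * (a * d - b * c) ** 2 / denominator
    # For 1 degree of freedom, the chi-squared survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are two patient groups, columns are
# "has the characteristic" and "does not have the characteristic".
statistic, p_value = chi_squared_2x2(10, 20, 20, 10)
significant = p_value < 0.05
```

When expected cell counts are small, the Fisher exact test mentioned above is the usual alternative to this approximation.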
For performance evaluation of the deep neural network, data were split 80% for training and 20% for testing. Prediction performance was evaluated by the area under the curve (AUC) of the receiver operating characteristic (ROC) curve for the test dataset. The average ROC curve and AUC, with standard deviations, were obtained from ten runs. A p value < 0.05 was taken to be statistically significant unless otherwise specified.
Table 1A summarizes the demographics and comorbidities for the general floor (Group A, N=1203) versus direct ICU (Group B, N=104). Compared to the general floor group, the direct ICU group had more males (p=0.005), smokers (p=0.008), diabetics (p=0.047) and patients with heart failure (p=0.016). Age, ethnicity, race, and prevalence of hypertension, asthma, COPD, coronary artery disease, cancer, immunosuppression and chronic kidney disease were not statistically different between groups (p>0.05).
Table 1B summarizes the demographics and comorbidities for the no-upgrade (Group C, N=979) versus upgrade group (Group D, N=224). Compared to the no-upgrade group, the upgrade ICU group had more males (p=0.005) and patients with asthma (p=0.008) but fewer patients with cancer (p=0.004). Race was different between groups. Age, ethnicity, and prevalence of smoking, hypertension, diabetes, COPD, coronary artery disease, heart failure, immunosuppression and chronic kidney disease were not statistically different between groups (p>0.05).

Results
Laboratory tests: Figure 2 plots the laboratory tests for general floor (Group A) versus direct ICU (Group B) at ED admission, and no-upgrade (Group C) versus upgrade (Group D) at ED admission and at the time of upgrade. WBC, LDH, CRP, TNT, and ferritin were significantly different between the general floor and the direct ICU group at ED admission (red bars). Lymph, WBC, LDH, CRP, AST, CRT, ferritin, and ALT were significantly different between the no-upgrade and upgrade group at the time of admission to the hospital (green bars). Lymph, WBC, and CRP were significantly different between the no-upgrade and upgrade group on the day prior to upgrade (blue bars). Table 2 presents the results of Figure 2 in a simplified format for comparison. LDH, CRP and ferritin were significantly different for the general floor versus direct ICU group at ED admission, the no-upgrade versus upgrade group at ED admission, and the no-upgrade versus upgrade group at the time of upgrade (Table 2, rows 1-3). WBC stood out in that it was different for the general floor versus direct ICU group at ED admission and for the no-upgrade versus upgrade group at the time of upgrade, but not for the no-upgrade versus upgrade group at ED admission. WBC and CRP significantly decreased in the no-upgrade group (Table 2, 4th row). WBC, LDH, and Cr increased while lymph decreased in the upgrade group (Table 2, 5th row).
Lymph, WBC, D-dimer, LDH, CRP, and Cr improved or did not deteriorate between the two time points in the no-upgrade group but deteriorated in the upgrade group (Table 2, 6th row). Ferritin, TNT, AST, BNP, procal, and ALT were not significantly different between the two time points in both the no-upgrade and upgrade group (Table 2, 7th row).
Vitals and blood gases: Figure 3 plots the vital signs and blood gases for general floor versus direct ICU at ED admission, and no-upgrade versus upgrade at ED admission and one day prior to upgrade. RR, SpO2, temperature, pO2, and pH were significantly different between the general floor and the direct ICU group (red bars). RR, HR, SpO2, temperature, pH, and pCO2 were significantly different between the no-upgrade and upgrade group at the time of admission to the hospital (green bars). HR, SpO2, DBP, SBP, and temperature were significantly different between the no-upgrade and upgrade group on the day prior to upgrade (blue bars). Table 3 simplifies the results of Figure 3. HR, SpO2, and temperature were significantly different for the general floor versus direct ICU group at ED admission, no-upgrade versus upgrade at ED admission, and no-upgrade versus upgrade at the time of ICU upgrade (Table 3, rows 1-3). pH stood out in that it was different for the general floor versus direct ICU group at admission and for no-upgrade versus upgrade at the time of upgrade, but not for no-upgrade versus upgrade at admission. For the no-upgrade group, RR, HR, DBP, and SBP significantly decreased and SpO2 and temperature increased (Table 3, 4th row), whereas for the upgrade group, HR and temperature decreased and SpO2 increased (Table 3, 5th row). Unlike the laboratory tests, none of the vitals and blood gases showed improvement in the no-upgrade group and deterioration in the upgrade group between the two time points (Table 3, 6th and 7th rows).

Predictors of ICU admission
The deep neural network model built using laboratory tests ranked CRP, LDH, Cr, WBC, D-dimer, and lymph (in order of importance) to be the top predictors of ICU admission. This model yielded an accuracy of 86±5% and AUC of 0.782±0.013 for the testing dataset.
The deep neural network model built using laboratory tests, vitals and blood gases ranked RR, LDH, CRP, DBP, procal, WBC, D-dimer, and O2 (in order of importance) to be the top predictors of ICU admission. This model yielded an accuracy of 88±7% and an AUC of 0.861±0.018 for the testing dataset. Adding vitals and blood-gas data improved prediction performance.
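For reference, a mean±SD AUC of the kind reported above can be computed as in the following sketch. It uses the rank-based definition of AUC (the probability that a randomly chosen ICU case receives a higher score than a randomly chosen non-ICU case) and the sample standard deviation across runs; whether the authors used the sample or population standard deviation is not stated, and all labels, scores, and per-run values below are hypothetical.

```python
from statistics import mean, stdev


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case outscores a random
    negative case, counting ties as one half. This equals the area under the
    ROC curve."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def summarize(aucs):
    """Mean and sample standard deviation across repeated runs."""
    return mean(aucs), stdev(aucs)


# Hypothetical test-set labels (1 = ICU) and predicted ICU probabilities.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)

# Hypothetical AUCs from three repeated random splits.
auc_mean, auc_sd = summarize([0.78, 0.80, 0.76])
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a test set of a few hundred cases; a sort-based implementation would be preferable for larger datasets.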

Discussion
This study investigated the clinical variables associated with direct ICU admission and upgrade to the ICU from the general floor. We found that lymphocyte count, white-blood cell count, D-dimer, lactate dehydrogenase, C-reactive protein, and creatinine (unranked) improved or did not deteriorate with time in patients who were not upgraded to the ICU, but deteriorated in patients who were upgraded to the ICU, showing temporal divergence. The deep learning predictive model using laboratory tests ranked C-reactive protein, lactate dehydrogenase, creatinine, white-blood cell count, D-dimer, and lymphocyte count (in order of importance), showing substantial overlap with the variables that exhibited temporal divergence. The predictive model using these top predictors yielded an AUC of 0.782±0.013 for predicting ICU admission on the test dataset. Adding vitals and blood-gas data further improved prediction performance (0.861±0.018).
Compared to the general floor group, the direct ICU group had significantly more males, smokers, diabetics and patients with heart failure. Compared to the no-upgrade group, the upgrade ICU group had more males and patients with asthma but fewer patients with cancer. Smokers, diabetics and patients with heart failure were more likely to receive escalated care at ED admission. Asthma was the only comorbidity associated with ICU upgrade. Some major comorbidities were important factors for ICU admission, especially at ED admission, but less so for ICU upgrade, suggesting that ED physicians might consider major comorbidities as a factor when deciding on escalated care.
Many laboratory tests showed worse disease severity in the direct or upgrade ICU group compared to the general floor and no-upgrade group. However, we found that these laboratory tests by themselves were inadequate to reliably determine which patients required ICU admission. Often, there were no appreciable differences between those directly admitted or upgraded to the ICU and those admitted to the general floor. For example, LDH, CRP and ferritin were significantly different for the general floor versus direct ICU group at ED admission, and for the no-upgrade versus upgrade group both at ED admission and at the time of ICU upgrade (Table 2, rows 1-3), suggesting they might not be useful to distinguish ICU upgrade despite being abnormal due to COVID-19. WBC stood out in that it was different for the general floor versus direct ICU group at ED admission and the no-upgrade versus upgrade group at the time of upgrade, but not for the no-upgrade versus upgrade group at ED admission, suggesting it is one of the most informative variables for ICU upgrade.
Our innovative approach was thus to identify the laboratory tests that showed improvement or plateau between the two time points in the no-upgrade group but deteriorated in the upgrade group. The laboratory tests that showed temporal divergence were lymphocyte count, white-blood cell count, D-dimer, lactate dehydrogenase, C-reactive protein, and creatinine (unranked). By contrast, most vitals and blood gases did not show such temporal divergence between groups, suggesting that vital signs and blood gases might be less important overall than laboratory tests. This appears counterintuitive because vitals are readily available and are often informative in emergency room situations. Possible explanations are: i) SpO2 might be affected by supplemental oxygen inhalation, ii) RR, HR, SBP and DBP could be highly variable, and iii) these vital signs were within normal physiological ranges although there were group differences. We concluded that vital signs and blood gases appear to be less informative overall in predicting ICU admission compared to laboratory tests.

Deep learning analysis
To further explore whether the above-mentioned laboratory variables are predictive of direct and upgrade ICU admission, we developed a deep-learning model, trained it on 80% of the data, and tested it independently on the remaining 20% of the data, which the model had not seen before. Our deep neural network model identified C-reactive protein, lactate dehydrogenase, creatinine, white-blood cell count, D-dimer, and lymphocyte count (in order of importance) as the top predictors of ICU admission. These variables showed substantial overlap with the variables exhibiting temporal divergence described above. The predictive model using these top predictors yielded an AUC of 0.782 for predicting ICU admission from the testing dataset. Adding vital and blood-gas data improved prediction performance, yielding an AUC of 0.861 for predicting ICU admission from the test dataset. It is worth noting that RR was one of the highly ranked variables. This is not surprising because COVID-19 patients usually exhibited respiratory distress. Taken together, there is corroborative evidence that a few laboratory tests and vital signs are amongst the most important predictors of severe illness that warrants escalated care.
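To make the notion of ranking variables "in order of importance" concrete, the sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over all feature orderings. This illustrates the game-theoretic quantity that SHAP approximates; it does not use the SHAP package or the authors' model. The function `toy_risk`, its coefficients, and the baseline are hypothetical, and ranking by absolute attribution for a single instance is a simplification of the common practice of averaging absolute values over many patients.

```python
from itertools import permutations


def shapley_values(model, instance, baseline):
    """Exact Shapley values by averaging each feature's marginal contribution
    over all feature orderings. Features not yet "revealed" keep their
    baseline value. Cost grows factorially, so this is for illustration only."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = model(current)
        for i in order:
            current[i] = instance[i]
            value = model(current)
            totals[i] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_risk(x):
    """Hypothetical risk score with an interaction between features 0 and 1."""
    return 2.0 * x[0] + 1.0 * x[1] + 0.5 * x[0] * x[1] + 0.1 * x[2]


instance = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_risk, instance, baseline)

# Rank features by the magnitude of their attribution, largest first.
ranking = sorted(range(len(phi)), key=lambda i: abs(phi[i]), reverse=True)
```

The attributions sum to the difference between the model output for the instance and for the baseline, and the interaction term is shared equally between the two features involved.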

Previous studies
A few studies have previously identified some clinical variables to be associated with disease severity or mortality in COVID-19 infection. A few studies have attempted to identify important clinical variables that predicted critical illness and mortality using data at ED admission. There is, however, no consensus as to which clinical variables are good predictors. Jiang et al. used supervised learning and found mildly elevated alanine aminotransferase, myalgias, and hemoglobin at presentation to be predictive of severe ARDS in COVID-19 with 70% to 80% accuracy. That study used a small, non-uniform, and heterogeneous set of clinical variables obtained from different hospitals (9). Ji et al. used logistic regression to predict stable versus progressive COVID-19 patients (N=208) based on whether their conditions worsened during hospitalization (10). They reported comorbidities, older age, lower lymphocyte count and higher lactate dehydrogenase at presentation to be independent high-risk factors for COVID-19 progression. Yan et al. utilized supervised machine learning to predict critical COVID-19 at ED admission using the presence of X-ray abnormality, cancer history, age, neutrophil/lymphocyte ratio, LDH, dyspnea, bilirubin, unconsciousness and a number of comorbidities (11). They reported an AUC of 0.88. By the time this paper is reviewed, more studies will have been published. Our study is innovative and unique in that we specifically addressed the need for escalated care of patients who were admitted to the general floor. Nonetheless, comparisons of different predictive models on the same datasets are warranted.

Limitations
This study has several limitations. This is a retrospective study carried out in a single hospital. These findings need to be replicated in a large, multi-institutional setting for generalizability. As in all observational studies, other residual confounders may exist that were not accounted for in our analysis.
Finally, it is important to note that the circumstances of the COVID-19 pandemic are unusual and evolving. Patient flow (e.g., to the ICU) may depend on an individual hospital's patient load, practice, and available resources, which also differ amongst countries.

Conclusions
This study provided corroborative evidence that WBC, lymphocyte count, D-dimer, lactate dehydrogenase, C-reactive protein, and creatinine are amongst the most important predictors of severe illness requiring ICU care. This work could help frontline physicians to better manage COVID-19 patients by anticipating downstream ICU needs to more effectively allocate healthcare resources.

Declarations
Author contributions statements
JL - collected data, analyzed data, drafted paper
BM - analyzed data and drafted paper

Figure 1
Patient selection flowchart. The final sample sizes included 1203 patients admitted to the general floor ("general floor", Group A) and 104 directly admitted to the ICU from the ED ("direct ICU", Group B); 979 patients remained on the general floor ("no upgrade", Group C) and 224 were upgraded from the general floor to the ICU ("upgrade ICU", Group D).

Figure 2
Laboratory tests for Group A (floor) and Group B (direct ICU) at ED admission, and Group C (no upgrade) and Group D (upgrade) at two time points (at ED admission and one day prior to upgrade or the equivalent time point). SI conversion factors: To convert alanine aminotransferase and lactate dehydrogenase to microkatal per liter, multiply by 0.0167; C-reactive protein to milligram per liter, multiply by 10; D-dimer to nanomole per liter, multiply by 0.0054; leukocytes to ×10^9 per liter, multiply by 0.001. Error bars are SEM. * p<0.05, ** p<0.01, *** p<0.005. Sample sizes for each bar are shown. Note that a lower lymphocyte count, and higher values of the other laboratory variables, are associated with worse prognosis.

Figure 3
Vital signs and blood gases for Group A (floor) and Group B (direct ICU) at ED admission, and Group C (no upgrade) and Group D (upgrade) at two time points (at ED admission and one day prior to upgrade or the equivalent time point). Error bars are SEM. * p<0.05, ** p<0.01, *** p<0.005. Sample sizes for each bar are shown.