Artificial Intelligence to Predict Mortality in Critically ill COVID-19 Patients Using Data from the First 24h: A Case Study from Lombardy Outbreak

Introduction: SARS-CoV-2 infection was first identified at the end of 2019 in China, and subsequently spread globally. COVID-19 disease frequently affects the lungs, leading to bilateral viral pneumonia that progresses in some cases to severe respiratory failure requiring ICU admission and mechanical ventilation. Risk stratification at ICU admission is fundamental for resource allocation and decision making, considering that baseline comorbidities, age, and patient conditions at admission have been associated with poorer outcomes. Supervised machine learning techniques are increasingly widespread in clinical medicine and can predict mortality and test associations with high predictive performance. We assessed the performance of a machine learning approach to predict mortality in COVID-19 patients admitted to ICU using data from the Lombardy ICU Network.
Methods: This is a secondary analysis of prospectively collected data from the Lombardy ICU network. To predict survival at 7, 14 and 28 days we built two different models: model A included patient demographics, medications before admission and comorbidities, while model B also included data from the first day after ICU admission. 10-fold cross-validation was repeated 2500 times to ensure optimal hyperparameter choice. The only constraint imposed on model optimization was the choice of logistic regression as the final layer, to increase clinical interpretability. Different imputation and over-sampling techniques were employed in model training.
Results: 1503 patients were included. Exploratory analysis and Kaplan-Meier curves demonstrated an association of mortality with age and gender. Models A and B reached the greatest predictive performance at 28 days (AUC 0.77 and 0.79), with lower performance at 14 days (AUC 0.72 and 0.74) and 7 days (AUC 0.68 and 0.71). Male gender, age and number of comorbidities were strongly associated with mortality in both models. Among comorbidities, chronic kidney disease and chronic obstructive pulmonary disease demonstrated an association with mortality. Mode of ventilatory assistance at ICU admission and fraction of inspired oxygen were associated with mortality in model B.
Conclusions: Supervised machine learning models demonstrated good performance in the prediction of 28-day mortality. The 7-day and 14-day predictions demonstrated lower performance. Machine learning techniques may be useful in emergency phases to reach higher predictive performance with reduced human supervision using complex data.


Introduction
Towards the end of 2019, a novel strain of coronavirus, named Severe Acute Respiratory Syndrome coronavirus-2 (SARS-CoV-2), was identified as the causative agent of an outbreak of bilateral pneumonia in the city of Wuhan in China. (1) The clinical picture associated with SARS-CoV-2 infection was subsequently named COVID-19 disease, and is frequently characterized by severe bilateral pneumonia. The epidemic spread outside mainland China to an increasing number of countries, and on March 11th, 2020 it was declared a pandemic. (2) Italy, and in particular Lombardy, was the epicenter of the first outbreak of COVID-19 in the Western World.
In Lombardy, the first cases were recognized at the end of February, and the number of Intensive Care Unit admissions rose substantially in the following weeks. (3) The outcomes of patients admitted to ICU for COVID-19 disease are severe, and comparable with those of patients with severe Acute Respiratory Distress Syndrome (ARDS), with mortality around 50% in patients requiring mechanical ventilation. (4)(5)(6) Several factors have been associated with a negative outcome, including age, male gender, previous comorbidities, and level of respiratory support at ICU admission. (4,7) Machine learning algorithms are increasingly employed in clinical medicine because they can analyse large amounts of information with reduced human supervision, resulting in high predictive performance. (8, 9) These models can similarly help to hasten data cleaning and fine-tuning of predictive models, a process which would draw large resources during an emergency. Increased predictive performance using artificial intelligence could be useful both for patients, to target the best therapeutic strategy with realistic goals, and for the healthcare system, to improve the allocation of resources. To assess the performance of a machine learning approach on operational data collected during the upsurge phase of the pandemic, we propose a supervised learning model to predict mortality in critically ill COVID-19 patients.

Methods
This is a secondary analysis of data collected during the COVID-19 Lombardy outbreak from February 2020 to April 2020 using operational and clinical data from the Lombardy ICU network, as described in previous studies. (3,4,10) The aim of this study is to predict survival at 7, 14 and 28 days from ICU admission, using a typical supervised learning framework.
Data on patient baseline characteristics, including medications, comorbidities and baseline ventilation parameters are included in the analysis. Data are described as mean (standard deviation) or frequency (percentage), as appropriate.
We conducted an initial exploratory analysis of the data, testing univariate and bivariate associations with the Chi-square test and the Mann-Whitney U test. Survival analysis was conducted by plotting Kaplan-Meier curves.
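The Kaplan-Meier estimate used for the survival curves can be illustrated with a minimal sketch. The function name `kaplan_meier` and the toy cohort below are hypothetical and are not study data; the sketch only shows how the product-limit estimate is accumulated over distinct event times.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct death time.

    durations: follow-up time (e.g. days from ICU admission) per patient
    events:    1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every patient leaves the risk set at their time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving time t
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort (illustrative only, not study data)
durations = [2, 3, 3, 5, 8, 8, 12, 20]
events = [1, 1, 0, 1, 0, 1, 1, 0]
curve = kaplan_meier(durations, events)
for t, s in curve:
    print(f"day {t}: estimated survival {s:.3f}")
```

Censored patients contribute to the risk set until their last follow-up time but never trigger a drop in the curve.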
We built two different models (A and B) to predict survival at three different timepoints: 7, 14 and 28 days from ICU admission. Model A included only baseline patient data (age, gender, home medications and comorbidities); Model B included baseline data and data from the first 24h in ICU. The only constraint imposed on the models concerned the last layer: the output had to be expressed through a logistic regression, so that interpretability was not lost and we could better understand the models' decision-making process.
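The interpretability gained from a logistic regression output layer comes from the fact that each coefficient is a change in log-odds, so exponentiating it gives an odds ratio. The following is a minimal sketch of that reading; the feature names and coefficient values are hypothetical and are not taken from the fitted models.

```python
import math


def odds_ratios(coefficients):
    """Convert logistic regression coefficients (log-odds) to odds ratios."""
    return {name: math.exp(beta) for name, beta in coefficients.items()}


def predicted_probability(intercept, coefficients, features):
    """Logistic output: sigmoid of the linear combination of the features."""
    z = intercept + sum(
        beta * features.get(name, 0.0) for name, beta in coefficients.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (illustrative only)
coefficients = {"age_per_decade": math.log(2.0), "male_sex": 0.0}
ratios = odds_ratios(coefficients)
for name, ratio in ratios.items():
    print(f"{name}: odds ratio {ratio:.2f}")
```

In this toy example a coefficient of log(2) corresponds to a doubling of the odds per unit of the feature, while a coefficient of 0 corresponds to an odds ratio of 1 (no association).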
Both models were trained using 10-fold cross-validation. Hyperparameters were optimized to maximize the out-of-fold area under the curve (AUC) on a randomized grid space. Cross-validation was repeated 2500 times for both models. To avoid selection bias during modeling, hyperparameter optimization was conducted with a high degree of freedom, with the only constraint of using a logistic regression as the final layer of the model, to retain some native interpretability. During training, the optimization process could choose among different over-sampling techniques (Synthetic Minority Over-sampling Technique [SMOTE], support-vector machine-Synthetic Minority Over-sampling Technique [SVM-SMOTE] or Adaptive synthetic sampling approach for imbalanced learning [ADASYN]), (11)(12)(13) as well as different imputation techniques.
We chose cross-validation rather than hold-out validation to account for the limited number of observations.
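The idea behind SMOTE-type over-sampling can be sketched as follows: a synthetic minority sample is placed at a random position on the segment between a minority sample and one of its nearest minority neighbours. This is a simplified illustration under stated assumptions, not the implementation used in the study; the function name `smote_like`, the parameter `k`, and the toy points are hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE-style)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the base point within the minority class
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


# Hypothetical two-feature minority class (illustrative only)
minority = [(1.0, 2.0), (2.0, 4.0), (3.0, 1.0), (5.0, 3.0)]
synthetic = smote_like(minority, n_new=5)
print(synthetic)
```

A common precaution, assumed here rather than stated in the text, is to apply over-sampling only to the training folds so that synthetic points never leak into the held-out fold used for scoring.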
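The out-of-fold AUC that guides model selection can be illustrated with a small, self-contained sketch: every patient is scored by a model that never saw them during fitting, and the AUC is then computed on the pooled scores. The stand-in model `fit_midpoint`, the helper names, and the toy data are hypothetical; the study's actual models and search procedure are not reproduced here.

```python
import random


def stratified_folds(y, k, seed=0):
    """Split indices into k folds while preserving the class proportions."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, v in enumerate(y) if v == label]
        rng.shuffle(idx)
        for j, i in enumerate(idx):
            folds[j % k].append(i)
    return folds


def auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties = 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def out_of_fold_scores(X, y, fit, k=10, seed=0):
    """Score each sample with a model fitted on the other k-1 folds."""
    scores = [None] * len(y)
    for fold in stratified_folds(y, k, seed):
        held_out = set(fold)
        train = [i for i in range(len(y)) if i not in held_out]
        predict = fit([X[i] for i in train], [y[i] for i in train])
        for i in fold:
            scores[i] = predict(X[i])
    return scores


def fit_midpoint(train_X, train_y):
    """Hypothetical stand-in model: score is the distance above the midpoint
    between the two class means of a single feature."""
    pos = [x for x, t in zip(train_X, train_y) if t == 1]
    neg = [x for x, t in zip(train_X, train_y) if t == 0]
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: x - threshold


# Hypothetical, perfectly separable toy data (illustrative only)
X = list(range(10)) + list(range(20, 30))
y = [0] * 10 + [1] * 10
oof = out_of_fold_scores(X, y, fit_midpoint, k=10)
oof_auc = auc(y, oof)
print(f"out-of-fold AUC: {oof_auc:.2f}")
```

Repeating this procedure with different seeds, as in repeated cross-validation, gives a distribution of out-of-fold AUC values rather than a single estimate.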

Results
We included a total of 1503 patients in the analysis. 28-day mortality was 51% (n = 766). Survivors were significantly younger and had fewer comorbidities at admission than non-survivors (p < 0.05). 44% of survivors had no pre-existing comorbidities, compared to 26% of non-survivors (Table 1). Male gender was associated with increased mortality (p < 0.05). Home medications and type of comorbidity also differed between survivors and non-survivors (p < 0.05), as reported in Fig. 1, with a higher proportion of patients affected by liver disease among survivors.
Table notes: precision is also known as positive predictive value and recall as sensitivity; F1 combines precision and recall. SD = standard deviation.
Kaplan-Meier curves reveal a progressive reduction in survival from 70% at 10 days to 55% at 20 days from ICU admission, reaching a plateau thereafter in both male and female populations (Supplemental Fig. 1 and 2). Age was strongly associated with mortality, with 7-day survival ranging from 64% in the oldest age group to 93% in the 30-40 years group, and with differences progressively increasing for 14-day and 28-day survival (Fig. 2). The greatest predictive performance was reached at 28 days, with AUC = 0.77 for model A and AUC = 0.79 for model B (Table II). Inter-class performance at 28 days displayed a good balance among all scores for both classes, with models A and B demonstrating similar precision and recall (0.7), and similar F1 scores (Table III). Table IV reports mean odds for 28-day survival from models A and B. Feature importance was broadly consistent between the two models, with chronic kidney disease and age strongly associated with mortality.
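The per-class scores discussed above (precision, recall and F1) can be computed from observed and predicted labels as in the following minimal sketch. The labels are hypothetical toy values and do not reproduce the study's results.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision (positive predictive value), recall (sensitivity) and F1
    for the class designated as `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1 is the harmonic mean of precision and recall
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical labels: 1 = died within 28 days, 0 = survived (illustrative only)
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

scores_died = precision_recall_f1(y_true, y_pred, positive=1)
scores_survived = precision_recall_f1(y_true, y_pred, positive=0)
print("non-survivors:", scores_died)
print("survivors:", scores_survived)
```

Reporting the scores separately for each class, as done here, makes it visible whether a model performs well on one outcome at the expense of the other.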

Discussion
In this study we built two different models to predict survival in critically ill patients with COVID-19 disease, using supervised machine learning techniques. Model A included baseline characteristics only, while model B included baseline characteristics and ICU admission data.
We assessed survival at 7, 14 and 28 days for both models, reaching the highest predictive performance with later outcomes, an effect probably related to the increased balance between classes with increasing ICU stay.
Model B performance was reduced at 7 and 14 days. As model A included only baseline data and model B included ventilatory parameters and other data at ICU admission, the loss in the predictive performance of model B can be attributed to a mixed effect (patients with worse respiratory status but fewer baseline comorbidities may recover and stabilize, while older and more comorbid patients can still be at risk of deterioration after ICU admission). More complex and deeper models should probably be employed to assess the effect of ventilatory parameters at ICU admission.
Both models confirm that age is one of the strongest predictors of ICU survival, with a probability of death reaching almost 100% in older patients. The strong association between age and mortality is a constant finding in COVID-19 literature (4,14). Male gender is a negative predictive factor, a finding confirmed by previous studies (15). An etiological justification might be linked to a difference between the sexes in cellular immunity, as males present poorer T-cell activation and an increase in proinflammatory cytokines (16).
Chronic kidney disease (CKD) is highly correlated with mortality in our data, and this may be related to several factors. First, CKD affects older patients (17) who, as already widely demonstrated (3) and confirmed by our models, are particularly fragile when hospitalized for COVID-19. Secondly, all stages of CKD are associated with an increased risk of premature mortality from all causes (18) and thirdly, CKD is associated in up to two-thirds of cases with diabetes and hypertension (19), a proxy for older, multimorbid patients. (4) COVID-19 disease is also associated with new-onset acute kidney injury, which may further worsen previous kidney dysfunction, leading to organ failure. (20,21) We found a strong association between chronic obstructive pulmonary disease (COPD) and mortality. COPD patients have both an increased risk of COVID-19 disease and a poorer prognosis, with higher rates of hospitalization and mortality. (22) COPD is an independent predictor of mortality in patients admitted to ICU for COVID-19 pneumonia. (4) Diabetes mellitus is associated with mortality in our results. Type-2 diabetes mellitus is more frequent in older patients and in males, and is part of the metabolic syndrome with hypertension and obesity, which was previously demonstrated to have a strong association with COVID-19 outcomes. (23) The association between diabetes and survival has been questioned by other studies, where the association was lost after controlling for other factors (4,24).
Chronic therapy with ACE inhibitors and ARBs was associated with higher mortality in this analysis.
Initial reports linked the possible pharmacodynamics of these classes of drugs to an up-regulation of ACE2 expression (25) and a consequent increase in the availability of target molecules for SARS-CoV-2 (26). This association was refuted by Mancia et al. (27), who performed a large population-based case-control study demonstrating that use of ACE inhibitors and ARBs was more frequent among COVID-19 patients due to their higher prevalence of cardiovascular disease, without evidence linking those drugs to a higher risk of infection by SARS-CoV-2.
Compared to Model A, Model B relies heavily on the type of ventilatory assistance at ICU admission (mechanical ventilation vs non-invasive ventilation vs spontaneous breathing), at the expense of a higher variance compared to model A. Among the other ventilatory variables in the first 24h, only the fraction of inspired oxygen (FiO2) administered during the first hours was demonstrated to be inversely associated with mortality in the final model B.

Conclusions
Supervised machine learning models demonstrated good performance in the prediction of 28-day mortality. Lower performance was demonstrated at 7 and 14 days. As the model improves with further data, it may be used as a score of severity before ICU admission to stratify patients according to survival probability. Machine learning techniques may be useful in emergency phases, to reach higher predictive performance with reduced human supervision using complex data.

Declarations
Ethics approval and consent to participate: The institutional ethics board of Fondazione IRCCS Ca' Granda Ospedale Maggiore Policlinico, Milan, approved this study and waived the need for informed consent from individual patients owing to the retrospective nature of the study. This study followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) reporting guideline.

Limitations
The study presents several limitations. First, it is an observational study based on operational data collected daily during an emergency by a regional coordination center, which cannot guarantee the data quality of a research-oriented database. Although this is a limitation, it also reflects one of the main goals of the study: to test the ability of machine learning models during the escalation phase of the spread of SARS-CoV-2, when a hold-out validation set cannot be obtained. Some variables that could increase the predictive performance of the model could not be collected, including data about comorbidities (e.g. CKD stage, hypertension severity) and other physiological parameters (weight, body mass index, more complex ventilatory data, patient frailty). Availability of more data could probably reduce class imbalance in this population.
The number of patients and the amount of data included in this study are not comparable with big data analyses, where machine learning techniques outperform classical statistical methods. However, with this study we were able to demonstrate that a machine learning approach may be used even with smaller datasets in an emergency setting and reach high predictive performance.