A Novel Prognostic Prediction Model for Sepsis with Acute Respiratory Failure: A Cohort Study from the MIMIC-IV Database


Objectives: Acute respiratory failure is significantly associated with increased short-term mortality in sepsis patients. We aimed to develop a novel prognostic model for predicting the risk of hospital mortality in sepsis patients with acute respiratory failure.
Methods: We queried the Medical Information Mart for Intensive Care (MIMIC)-IV database and built a matched cohort of adult sepsis patients with acute respiratory failure. After multivariate Cox regression, a nomogram was developed from the identified risk factors for mortality in the cohort. The discrimination of the nomogram in predicting individual hospital death was evaluated by the area under the receiver operating characteristic (ROC) curve.
Results: A total of 663 sepsis patients with acute respiratory failure were included in this study. Systolic blood pressure, white blood cell count, neutrophils, mechanical ventilation, PaO2 < 60 mmHg, abdominal cavity infection, Klebsiella pneumoniae, Acinetobacter baumannii, and immunosuppressive disease were independent predictors of mortality in sepsis patients with acute respiratory failure. The area under the ROC curve of the nomogram was 0.880 (95% CI: 0.851-0.908), which indicated significantly higher discrimination than the Simplified Acute Physiology Score II [0.656 (95% CI: 0.612-0.701)].
Conclusion: The model performs well in predicting the mortality risk of sepsis patients with acute respiratory failure, and it may be clinically useful for evaluating the short-term prognosis of critically ill patients with sepsis and acute respiratory failure.


Introduction
Sepsis and septic shock are globally important clinical problems in emergency and critical care medicine (1,2). In the United States, sepsis is the main cause of more than 750,000 deaths each year and accounts for about 10% of critically ill patients.
About 15% of sepsis cases progress to septic shock, and the mortality of septic shock is up to 50% in the intensive care unit (ICU) (1). However, the pathophysiology of sepsis-related death is complex and results from the interaction of multiple factors, among which acute respiratory failure may play an important role. Acute respiratory failure is closely related to sepsis and is one of its most important complications (3,4). It is also a leading cause of ICU admission and mortality in critical illness. Notably, sepsis is in turn the main cause of acute respiratory failure, accounting for close to 70% of all cases (5).
Although the age-standardized incidence and mortality of sepsis decreased by 37% and 52.8%, respectively, from 1990 to 2017, sepsis remains a major threat to health and a serious global problem (6). Sepsis and septic shock are usually accompanied by acute respiratory failure, and identifying prognostic factors for sepsis with acute respiratory failure in the ICU (7) may be crucial for reducing short-term mortality in sepsis. Moreover, acute respiratory failure and sepsis are among the most common prognostic factors in critical illness and are associated with poor outcomes in critically ill patients (8,9). Therefore, targeting the prognostic risk of sepsis with acute respiratory failure may be a priority for improving the outcome of ICU patients. However, because clinical reference tools for sepsis patients with acute respiratory failure are lacking, early detection of the risk of hospital mortality in such patients is a significant challenge.
Since the introduction of the International Guidelines of the Surviving Sepsis Campaign (1), the overall prognosis of sepsis patients has improved to a certain extent. Nevertheless, in some large-scale multicenter clinical studies (10,11), mortality has been shown to remain as high as 20-30%. Therefore, rapidly identifying modifiable mortality risk in sepsis, especially in sepsis-associated acute respiratory failure, is still necessary to further reduce the mortality of sepsis. Based on a large clinical database, the main objective of our study was to develop a prognostic model for predicting the probability of hospital death in sepsis patients with acute respiratory failure.

Database
The data were extracted from the Medical Information Mart for Intensive Care (MIMIC)-IV database (version 0.4). MIMIC-IV is a publicly available database of patients admitted to the Beth Israel Deaconess Medical Center between 2008 and 2019. It contains comprehensive information for each patient in the hospital, including laboratory measurements, medications administered, and documented vital signs. The database is organized into three modules (core, hosp, and icu) to highlight their intended use and provenance. Importantly, to protect patient privacy, all sources in the MIMIC-IV database are filtered to include only the medical record numbers in the main patient list. The project was approved by the institutional review boards of both the Massachusetts Institute of Technology and Beth Israel Deaconess Medical Center.

Patient population
Sepsis is diagnosed with an acute change in total SOFA score ≥ 2 consequent to the infection (12

Data extraction and management
The raw data, including demographics, laboratory tests, microbiology cultures, medication prescriptions, and vital signs, were extracted with Navicat for Structured Query Language (SQL) Server and further processed in R.
Laboratory parameters and the mean values of vital signs were collected within the first 24 h after ICU admission. Laboratory parameters included alanine aminotransferase, aspartate aminotransferase, creatinine, blood urea nitrogen, hemoglobin, platelet count, partial thromboplastin time, international normalized ratio, prothrombin time, white blood cell count, neutrophils, lymphocytes, lactate, creatine kinase isoenzyme-MB (CK-MB), arterial blood gas analysis (PaO2), PaO2/FiO2, and arterial oxygen saturation (SpO2). Vital signs included heart rate, systolic blood pressure, diastolic blood pressure, respiratory rate, and temperature. Demographic and outcome data included age, gender, length of hospital stay, and hospital mortality. Advanced cardiac life support measures included mechanical ventilation and renal replacement therapy. Severity scores for critical illness were evaluated, including the Simplified Acute Physiology Score (SAPS) II, the SOFA score, and the Glasgow Coma Scale (GCS). The comorbidity index at discharge according to ICD-9 and ICD-10 codes (Supplementary materials 1-6), the site of infection (Supplementary material 7), and the type of microbial infection were also a main research focus. The percentage of missing values for creatinine, blood urea nitrogen, and GCS was 0.15%. Missing data were examined with a matrix diagram in the Data Profiling Report (Supplementary material 8) and handled by multiple imputation.
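The 24-hour aggregation of vital signs described above can be illustrated with a minimal Python sketch. It is not the extraction code used in the study (which relied on SQL and R); the function name, the argument names `icu_intime` and `measurements`, the half-open window, and the blood pressure readings are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean


def first_24h_mean(icu_intime, measurements):
    """Average the values charted in [icu_intime, icu_intime + 24 h).

    measurements: iterable of (charttime, value) pairs; value may be None.
    Returns None when no usable value falls inside the window.
    The half-open window is an assumption, not taken from the study.
    """
    window_end = icu_intime + timedelta(hours=24)
    values = [
        value
        for charttime, value in measurements
        if value is not None and icu_intime <= charttime < window_end
    ]
    return mean(values) if values else None


# Hypothetical systolic blood pressure readings (mmHg) for one ICU stay.
admit = datetime(2150, 1, 1, 8, 0)
sbp = [
    (admit + timedelta(hours=1), 100),
    (admit + timedelta(hours=6), 110),
    (admit + timedelta(hours=23), 120),
    (admit + timedelta(hours=30), 90),   # after the 24 h window, ignored
    (admit - timedelta(hours=2), 150),   # before ICU admission, ignored
]
sbp_mean = first_24h_mean(admit, sbp)
```

Returning `None` for an empty window keeps missing values explicit, so that they can later be handled by an imputation step rather than silently replaced.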

Statistical Analysis
Statistical analysis was performed with R software (version 3.4.3). First, the distribution of each continuous variable was assessed with the Shapiro-Wilk test; continuous variables with normal and non-normal distributions were presented as the mean ± standard deviation (SD) and the median (interquartile range, IQR), respectively. Categorical variables were presented as frequencies and percentages. Non-parametric tests, including the Mann-Whitney U test and the Kruskal-Wallis test, were applied to data with non-normal distribution and heterogeneity, and the Pearson chi-squared test was applied to categorical variables. Second, propensity score matching (PSM) was used to develop a matched cohort. After matching on age and gender, the survival set was made up of cases randomly drawn from the total survival population, matched to the non-survival set in a ratio of 2:1. Third, statistically significant variables (P < 0.05) were identified by multivariate Cox regression analysis and then combined into a nomogram. Finally, the discrimination of the nomogram was evaluated by the area under the receiver operating characteristic (ROC) curve.
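The 2:1 matching step can be sketched as follows. This is a simplified illustration in Python, not the PSM procedure run in R for the study: it replaces the propensity score with exact matching on gender plus an age caliper, and the function name, the 2-year caliper, the random seed, and the toy records are all hypothetical.

```python
import random


def match_two_to_one(non_survivors, survivors, age_caliper=2, seed=0):
    """For each non-survivor, randomly draw up to two unused survivors of
    the same gender whose age differs by at most `age_caliper` years.

    Simplified stand-in for propensity score matching (an assumption).
    """
    rng = random.Random(seed)
    available = list(survivors)
    matched = {}
    for case in non_survivors:
        candidates = [
            s for s in available
            if s["gender"] == case["gender"]
            and abs(s["age"] - case["age"]) <= age_caliper
        ]
        chosen = rng.sample(candidates, min(2, len(candidates)))
        chosen_ids = {s["id"] for s in chosen}
        # Matching without replacement: remove the drawn survivors.
        available = [s for s in available if s["id"] not in chosen_ids]
        matched[case["id"]] = [s["id"] for s in chosen]
    return matched


# Hypothetical toy records, for illustration only.
non_survivors = [
    {"id": "d1", "age": 70, "gender": "M"},
    {"id": "d2", "age": 55, "gender": "F"},
]
survivors = [
    {"id": "s1", "age": 69, "gender": "M"},
    {"id": "s2", "age": 71, "gender": "M"},
    {"id": "s3", "age": 72, "gender": "M"},
    {"id": "s4", "age": 54, "gender": "F"},
    {"id": "s5", "age": 56, "gender": "F"},
    {"id": "s6", "age": 40, "gender": "F"},
]
matched = match_two_to_one(non_survivors, survivors)
```

Drawing without replacement ensures that each survivor contributes to at most one matched set, which is what makes the matched cohort three times the size of the non-survival group.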

Patient selection and baseline characteristics
A total of 19,658 patients meeting the Sepsis-3 criteria were selected from the MIMIC-IV database. After applying the exclusion criteria, 19,655 adult patients with sepsis were included in the subsequent analysis. Of these patients, 1,564 had acute respiratory failure; 221 of them died during hospitalization, while the remaining 1,343 survived, giving a hospital mortality of 14.13%. The surviving and non-surviving groups were matched 2:1 by age and gender, so that a total of 663 sepsis patients with acute respiratory failure were included. The flow chart of patient selection is shown in Fig. 1.
and 33.5% vs. 62.0%, respectively; p < 0.001 for both] of laboratory parameters were significantly different between the survival and non-survival groups. Interestingly, infections of the urinary tract (40.0% vs. 26.2%, p = 0.001) and of the skin and soft tissue (18.8% vs. 12.7%, p = 0.048) were more frequent in survivors than in non-survivors, and comorbid hypertension (57.0% vs. 43.3%, p = 0.001) showed the same trend. Survivors had lower incidences of infection with Acinetobacter baumannii and Klebsiella pneumoniae (3.6% vs. 39.8% and 14.5% vs. 43.9%, respectively; p < 0.001 for both) than non-survivors. In addition, all severity scores were better in the survival group than in the non-survival group (p < 0.001). Other major variables did not differ significantly between the two groups, except for mechanical ventilation (54.3% vs. 79.2%, p = 0.007) and length of hospital stay [3.5 (1.6-7.6) vs. 6.5 (2.5-11.2), p < 0.001]. The comparison of baseline characteristics is shown in
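The cohort counts reported in this section can be cross-checked with a few lines of arithmetic. The Python sketch below uses only the numbers stated in the text; the variable names are illustrative.

```python
# Counts reported in the text.
sepsis_arf = 1564   # sepsis patients with acute respiratory failure
deaths = 221        # died during hospitalization

survivors = sepsis_arf - deaths                      # remaining patients
mortality_pct = round(100 * deaths / sepsis_arf, 2)  # hospital mortality (%)

# Two survivors are matched to each non-survivor.
matched_survivors = 2 * deaths
matched_total = deaths + matched_survivors

print(survivors, mortality_pct, matched_survivors, matched_total)
```

The check reproduces the 1,343 survivors, the 14.13% hospital mortality, and the matched cohort of 663 patients (221 non-survivors and 442 survivors).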

Development of a prediction nomogram
The significant variables identified by multivariable Cox regression analysis (Table 2) were brought into the development of a visualization model. Each categorical variable (i.e., mechanical ventilation, immunosuppressive disease, Klebsiella pneumoniae, Acinetobacter baumannii infection, abdominal cavity infection, and PO2 < 60 mmHg) had two squares with different intervals according to its OR, and continuous variables (neutrophils, white blood cell count, and systolic blood pressure) were presented as curved area graphs. Most notably, Acinetobacter baumannii infection carried the greatest weight. The nomogram for predicting the probability of hospital mortality in sepsis patients with acute respiratory failure is shown in Fig. 2.

Discrimination of the nomogram
The nomogram was compared with SAPS II for predicting mortality in sepsis patients with acute respiratory failure. The ROC analysis showed that the nomogram had a sensitivity of 81.9%, a specificity of 79.9%, and an area under the curve (AUC) of 0.880 [95% confidence interval (CI): 0.851-0.908], whereas SAPS II had a sensitivity of 41.6%, a specificity of 81.9%, and an AUC of 0.656 (95% CI: 0.612-0.701). The ROC curves comparing the discrimination of the nomogram and SAPS II are shown in Fig. 3.
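The reported sensitivity and specificity can be condensed into a few standard summary measures. The Python sketch below derives Youden's J and the positive and negative likelihood ratios from the values above; these derived quantities are not reported in the study and are shown only to make the comparison concrete. The function name `summarize` is illustrative.

```python
def summarize(sensitivity, specificity):
    """Derive Youden's J and likelihood ratios from one operating point."""
    youden_j = sensitivity + specificity - 1
    lr_pos = sensitivity / (1 - specificity)   # positive likelihood ratio
    lr_neg = (1 - sensitivity) / specificity   # negative likelihood ratio
    return {
        "youden_j": round(youden_j, 3),
        "lr_pos": round(lr_pos, 2),
        "lr_neg": round(lr_neg, 2),
    }


# Operating points reported in the text.
nomogram = summarize(0.819, 0.799)
saps_ii = summarize(0.416, 0.819)

print("nomogram:", nomogram)
print("SAPS II: ", saps_ii)
```

At these operating points, Youden's J is 0.618 for the nomogram and 0.235 for SAPS II, which reflects the much lower sensitivity of SAPS II at a similar specificity.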

Discussion
In this study, we used multivariate Cox regression to identify the independent risk factors associated with hospital mortality in sepsis-related acute respiratory failure. Nine predictors (mechanical ventilation, immunosuppressive disease, Klebsiella pneumoniae, Acinetobacter baumannii, abdominal cavity infection, PO2 < 60 mmHg, neutrophils, white blood cell count, and systolic blood pressure) were integrated into a prediction nomogram presented as a visualization model, which showed good predictive performance. To the best of our knowledge, this is the first prediction model for mortality risk in sepsis patients with acute respiratory failure based on the MIMIC-IV database. Moreover, the model may be a clinically useful tool for predicting the short-term prognosis of such patients.
The lung is the organ most easily involved in severe infection. In sepsis, acute respiratory failure is a common sepsis-related organ injury (4,7,13) and leads to significant mortality, as high as 34-45% (14). For this reason, this study focuses directly on clinical characteristics relevant to modifiable mortality in sepsis-associated acute respiratory failure. Current approaches to the management of sepsis do not eliminate severe outcomes because of the complicated disease processes (1,2), and this complexity also confounds the prognosis of sepsis patients with acute respiratory failure (7); a prognostic risk model may offer one potential approach. Although approaches targeting sepsis have had some success (15,16), this study provides the first prognostic model for sepsis-related acute respiratory failure to complement such prediction systems. Specifically, our findings suggest that prognostic predictors focusing on the lung, as identified by our data-driven analysis of mortality risk, especially mechanical ventilation, Klebsiella pneumoniae, Acinetobacter baumannii infection, and PO2 < 60 mmHg, may identify septic individuals most likely to die from factors associated with acute respiratory failure. These findings on respiratory-related predictors are consistent with previous studies. Some reported that acute respiratory failure, a leading cause of ICU admission, often requires mechanical ventilation as a life-supporting intervention, which can itself lead to excess mortality (5,17). Studies on respiratory infection showed that Klebsiella pneumoniae contributed substantially to the mortality of nosocomial and ventilator-associated pneumonia in the ICU (18,19), and that sepsis patients with Acinetobacter baumannii pulmonary infection had a higher mortality rate than those with non-Acinetobacter baumannii infection (20).
In addition, Acinetobacter baumannii is one of the most common pathogens causing hospital-acquired pneumonia, especially in the ICU (21), indicating that it may be a common and important mortality-related biomarker. Most importantly, because Acinetobacter baumannii carried the largest weight in our model, Acinetobacter baumannii infection appears to be the most important predictor of hospital mortality in sepsis patients with acute respiratory failure. The role of PO2 < 60 mmHg as an important predictor of poor outcome is supported by an autopsy study, which showed that patients who died of severe acute respiratory distress syndrome were more likely to have refractory hypoxemia (22). Therefore, focusing on respiratory-related clinical characteristics may be a priority for predicting the prognosis of sepsis patients with acute respiratory failure.
We also analyzed non-pulmonary risk factors and found significant results. Although sepsis originates from infection, severe sepsis is often accompanied by impairment of other organs and may even evolve into multiple organ dysfunction syndrome, the mortality of which is extremely high (2,4). Therefore, the mortality of sepsis is clinically related to disease progression, and to comorbidity in particular. Interestingly, among comorbid diseases, only immunosuppressive disease was an independent predictor of hospital mortality in sepsis-related acute respiratory failure in this study. It has been confirmed that immune damage is the main pathophysiological manifestation of the occurrence and progression of sepsis and aggravates the inflammatory cascade (23,24). Consistently, our finding on comorbid immunosuppression highlights the important role of immunity in the short-term prognosis of sepsis patients. Because the inflammatory response induced by infection is the promoter of sepsis (2,24), we assume that inflammation-related markers may reflect the prognosis of sepsis. Our findings that white blood cell count and neutrophils from laboratory tests were independent prognostic biomarkers support this assumption. These biomarkers are common and necessary in clinical practice and can effectively reflect the response to and severity of bacterial infection (25).
Although we also found that systolic blood pressure, an important hemodynamic index, independently affected the mortality of sepsis patients with acute respiratory failure, it is difficult to identify the mortality risk specifically on the basis of clinical symptoms and signs alone. In this case, adding clinical biomarkers with predictive value may help to resolve this dilemma. Infection site is also an important part of infection-related mortality in sepsis. Although a review could not define the impact of infection site on hospital mortality in sepsis patients (26), our findings showed that abdominal cavity infection was a mortality risk factor in sepsis patients with acute respiratory failure. This finding supports the infection-related mortality theory. Surprisingly, however, a prior study reported that pulmonary and other sources of infection, but not abdominal sources, were predictive of outcome in sepsis patients with acute lung injury but not in those without acute lung injury (27). These seemingly contradictory results may be related to patient population, sample size, and other confounding factors. The differences between studies do not affect our understanding of the importance of infection site in sepsis, especially in sepsis-related acute respiratory failure, even though this heterogeneity may limit their clinical generalization.
SAPS II is a commonly used score for the severity of critical illness and a clinically useful tool for predicting the short-term prognosis in sepsis (12,28). Whether it applies to sepsis-associated acute respiratory failure, however, remains unclear. Therefore, we developed a novel prognostic prediction model and compared its predictive performance with SAPS II in this study. Our findings suggest that SAPS II did not discriminate sepsis-related acute respiratory failure patients at risk of hospital mortality as well as previously reported. This finding is nevertheless consistent with a study reporting that SAPS II and SOFA scores had no significant predictive value for sepsis mortality (29). In addition, this study showed that the prediction model had better discrimination than SAPS II for mortality risk. Compared with a previously developed model for predicting the mortality of patients with skin and soft tissue infections (ROC AUC of 0.84), our model still shows better discrimination (30). It follows that a predictive model based on a combination of independent risk factors may be a preferred option for evaluating the short-term prognosis of sepsis patients with acute respiratory failure in the ICU.
There are some limitations to this study. First, this study has the inherent shortcomings of a retrospective cohort study, although version 0.4 of the MIMIC-IV database was the latest at the time of the study and includes new datasets from 2008 to 2019. In addition, given the complexity of clinical data, unadjusted potential confounding factors may remain in this study. Second, one challenge of clinical research on sepsis is that there is no specific diagnostic method; diagnosis still relies on a rule-out definition, which may lead to relatively low sensitivity for the diagnosis of sepsis. Therefore, the model based on Sepsis-3 also faces this challenge and can be used only in sepsis-related acute respiratory failure diagnosed by exclusion. Third, when evaluating the survival of such patients, we cannot rely on the model alone; specific clinical support conditions, such as FiO2 and the use of vasoactive drugs, should also be considered. The model serves only as an aid to evaluation. Finally, the external performance of the model needs further evaluation because no prospective cohort was available to validate it.

Conclusion
This study provides a novel prediction model for hospital mortality in ICU patients with sepsis and acute respiratory failure, based on the MIMIC-IV database. In the cohort, mechanical ventilation, immunosuppressive disease, Klebsiella pneumoniae, Acinetobacter baumannii infection, abdominal cavity infection, PO2 < 60 mmHg, neutrophils, white blood cell count, and systolic blood pressure were independently associated with the mortality of sepsis patients with acute respiratory failure. In addition, Acinetobacter baumannii infection was identified as the most significant prognostic biomarker. Therefore, the model may be specifically beneficial for improving the short-term prognosis of sepsis patients with acute respiratory failure once preventive measures targeting the mortality-related risk factors are implemented.

Declarations
original data and foreseeing and minimizing obstacles to the sharing of data described in the work. All authors read and approved the final manuscript.

Figure 1 Flow chart of the patient selection process and study cohort. ICU, intensive care unit.
Figure 2 Besides, a vertical line from each variable upward to the points was also drawn. Finally, the specific points of each variable were added up to a total score, which corresponded to a prediction probability at the bottom of the nomogram.
Figure 3 Receiver operating characteristic curve of the nomogram and SAPS II. The nomogram showed a sensitivity and specificity of 81.9% and 79.9%, respectively, and an AUC of 0.880 (95% CI: 0.851-0.908). In contrast, the sensitivity and specificity of SAPS II were 41.6% and 81.9%, respectively, and its AUC was 0.656 (95% CI: 0.612-0.701). ROC, receiver operating characteristic curve; AUC, area under curve; CI, confidence interval.