Development and Validation of a Risk Factor-Based Nomogram to Early Predict in-Hospital Mortality in Adult Patients With Suspected Infection Admitted to the Intensive Care Unit: a Retrospective Cohort Study

Background The time-consuming data collection required by the Sequential Organ Failure Assessment (SOFA) score and the poor performance of the quickSOFA (qSOFA) score in predicting poor prognosis among patients with suspected infection in the intensive care unit (ICU) may delay the treatment of sepsis. The aim of this study was to develop a prediction model for the early identification of patients with suspected infection who are at high risk of in-hospital mortality in the ICU. Methods Patients with suspected infection were retrospectively retrieved from the Medical Information Mart for Intensive Care (MIMIC III) database. Objective variables whose results can be obtained within a short period of time were entered into univariate and multivariate logistic regression to screen for independent predictors of in-hospital mortality in ICU patients with suspected infection. The prediction nomogram was then constructed from these independent predictors in the training set and underwent internal validation and sensitivity analysis.


Introduction
The Third International Consensus Definitions Task Force defined sepsis as "life-threatening organ dysfunction due to a dysregulated host response to infection." [1] For clinical operationalization, the task force recommends using a change from baseline in the total Sequential Organ Failure Assessment (SOFA) score of 2 points or more to represent organ dysfunction in patients with suspected infection, and defines this cohort of patients as having sepsis. [2] However, because creatinine and bilirubin levels as well as urine output in the SOFA score cannot be captured promptly, calculating the SOFA score may delay the initiation of sepsis treatment. [3] To address this shortcoming, the task force developed a parsimonious clinical model called quickSOFA (qSOFA) to promptly identify adult patients with suspected infection who are likely to have poor outcomes, which can prompt clinicians to investigate further for organ dysfunction with the SOFA score and to initiate appropriate therapy if such actions have not already been undertaken. However, the qSOFA score has shown good predictive performance for in-hospital mortality in patients with suspected infection outside the intensive care unit (ICU) (AUROC = 0.81; 95% CI, 0.80-0.82) but poor performance among ICU encounters (AUROC = 0.66; 95% CI, 0.64-0.68). [4] This benchmark may therefore not be applicable to the prediction of poor outcome in patients with suspected infection who are admitted to the ICU directly or in patients who are suspected of having infection after ICU admission. Developing a model to predict poor outcome in patients with suspected infection in the ICU setting is therefore of great significance.
Therefore, in the present study, we aimed to develop and validate a prediction nomogram to promptly and accurately predict in-hospital mortality in ICU patients with suspected infection, and then to compare the predictive performance of this nomogram with that of the SOFA and qSOFA scores.

Data source
We conducted an observational study by retrieving data from the Medical Information Mart for Intensive Care (MIMIC III) database. Since all data in this database are de-identified to remove patients' information, individual patient consent is not required.

Study population and data extraction
Since Johnson AEW et al. have successfully extracted the suspected infection population from the MIMIC III database according to the Sepsis-3 task force definition, namely the combination of culture and antibiotic start time occurring within a specific time epoch, [4,7] we followed their structured query language (SQL) code to extract patients suspected of infection within 24 hours before or after ICU admission (the code is available online at https://github.com/alistairewj/sepsis3-mimic/tree/master/query). The following exclusion criteria were set: (1) age < 18 years; (2) bad data (missing values accounting for 50% or more); (3) admission to the ICU more than twice; (4) suspicion of infection more than one day before or after ICU admission. In addition, patients admitted to the cardiac surgical intensive care unit were excluded because their post-operative physiologic derangements do not translate to the same mortality risk as in other ICU patients. The baseline characteristics of the cohort extracted in this study and of the cohort of Johnson et al. are shown in Additional file 1: Table S1. The table shows that the study population we extracted was similar to that of Johnson et al., indicating that we successfully isolated the cohort with suspected infection according to the Sepsis-3 criteria. Patients from the Metavision system (one of the electronic health record systems in MIMIC III) were then randomly assigned to a training set and a validation set at a ratio of 7:3 to develop and validate the prediction nomogram, respectively. Sensitivity analysis was conducted in patients from the Carevue system (another electronic health record system in MIMIC III). The Metavision system contains clinical data from 2008 to 2012, while the Carevue system contains data from 2001 to 2008. The data extraction process is illustrated in Additional file 2: Figure S1.
For the final cohort, we retrospectively collected the following data: (1) demographic data including age, gender, ethnicity and body mass index (BMI); (2) the Elixhauser comorbidity index; [8] (3) in-hospital mortality; (4) length of ICU stay; (5) mean values of vital signs during the first hour of ICU stay; (6) the simplified acute physiology score (SAPS) to assess the severity of illness; [9] (7) SOFA and qSOFA scores, which were calculated using data from the first 24 hours of ICU stay. In addition, the first result after ICU admission was extracted for the following laboratory indexes: lactate, PCO2, PO2, pH, platelet count, red blood cell distribution width (RDW), white blood cell (WBC) count, lymphocyte count, neutrophil count, bicarbonate, chloride, and glucose. We included these laboratory indexes because their results can be obtained within a short time (normally within 1 hour), which is in line with the design concept of the prediction nomogram.
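The 7:3 random allocation described above can be illustrated with a minimal sketch. It is not the authors' code (the analyses were run in R); the function name, the seed and the patient identifiers are hypothetical and chosen only for illustration.

```python
import random

def split_train_validation(patient_ids, train_fraction=0.7, seed=2024):
    """Randomly partition patient IDs into a training and a validation set."""
    ids = sorted(set(patient_ids))   # de-duplicate so each patient lands in exactly one set
    rng = random.Random(seed)        # hypothetical fixed seed, used only for reproducibility
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]

# Hypothetical identifiers standing in for the Metavision cohort
train_ids, validation_ids = split_train_validation(range(1, 1001))
print(len(train_ids), len(validation_ids))
```

Splitting on patient identifiers rather than on individual records keeps all data from one patient in a single set, which avoids leakage between training and validation.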

Statistical analysis
The distribution of continuous variables was assessed with the Shapiro-Wilk test. Data were expressed as mean ± standard deviation (SD) for parametric continuous data and as median [interquartile range] for nonparametric data. Categorical data were expressed as number (percentage). Parametric and nonparametric variables were compared with the unpaired Student's t test and the Mann-Whitney U test, respectively. The Chi-squared test was used to compare categorical variables between groups.
Logistic regression analysis was conducted in the training set to identify independent risk factors for in-hospital mortality in patients with suspected infection. Specifically, variables related to in-hospital mortality in the univariate analysis (p < 0.1) were entered into a stepwise multivariate logistic regression analysis, in which both forward selection and backward elimination were used at each step to test whether variables should be included or excluded. The Akaike Information Criterion (AIC) was used as the selection criterion to eliminate predictors, [10] and variables with p < 0.05 were considered independent risk factors and used to create the nomogram. The following variables were entered into the logistic regression analysis: (1) patients' age; (2) Elixhauser index; (3) the vital signs and the laboratory indexes described above. Collinearity between these continuous variables was tested with the variance inflation factor (VIF), and a square root of VIF ≤ 2 was considered to indicate non-collinearity.
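For reference, the collinearity criterion can be written out explicitly. The expression below is the standard definition of the VIF rather than a formula given in the paper; the symbol R_j^2 (an assumption of notation) denotes the coefficient of determination obtained when predictor j is regressed on all the other predictors.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
\sqrt{\mathrm{VIF}_j} \le 2
\;\Longleftrightarrow\;
\mathrm{VIF}_j \le 4
\;\Longleftrightarrow\;
R_j^2 \le 0.75
```

Under this threshold, a predictor is treated as non-collinear as long as no more than 75% of its variance is explained by the remaining predictors.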
The predictive performance of the nomogram was evaluated and compared with that of the SOFA and qSOFA scores in both the training and validation sets. Predictive validity was assessed by comparing the area under the receiver operating characteristic curve (AUROC) of the nomogram with that of the SOFA or qSOFA score. Construct validity was determined by examining predictive agreement using Cronbach's α. [11] The calibration curve was generated by the bootstrap method with 1000 resamples to evaluate the calibration of the nomogram. In addition, the integrated discrimination improvement (IDI) index was calculated to compare discrimination slopes between the nomogram and the SOFA or qSOFA score. [12] The decision curve analysis (DCA) curve was generated to assess the net benefit of medical intervention guided by the nomogram and by the qSOFA score at different threshold probabilities in the training and validation sets. Sensitivity analysis was conducted in patients from the Carevue system to assess the robustness of the prediction nomogram.
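The two discrimination measures named above can be sketched as follows. This is an illustrative Python implementation of the textbook definitions, not the authors' R code; the toy data are invented, and the sketch assumes that each score has already been converted into a predicted probability of death (for the qSOFA score, for example, through a univariable logistic model).

```python
def auroc(scores, died):
    """AUROC as the probability that a random non-survivor is ranked above
    a random survivor, with ties counted as one half."""
    pos = [s for s, d in zip(scores, died) if d == 1]
    neg = [s for s, d in zip(scores, died) if d == 0]
    wins = sum((1.0 if p > n else (0.5 if p == n else 0.0)) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def discrimination_slope(probs, died):
    """Mean predicted risk in non-survivors minus mean predicted risk in survivors."""
    pos = [p for p, d in zip(probs, died) if d == 1]
    neg = [p for p, d in zip(probs, died) if d == 0]
    return sum(pos) / len(pos) - sum(neg) / len(neg)

def idi(new_probs, old_probs, died):
    """Integrated discrimination improvement: difference in discrimination slopes."""
    return discrimination_slope(new_probs, died) - discrimination_slope(old_probs, died)

# Toy data, invented for illustration only (1 = died in hospital)
died          = [1, 1, 1, 0, 0, 0, 0, 0]
nomogram_risk = [0.80, 0.60, 0.30, 0.40, 0.20, 0.10, 0.10, 0.05]
qsofa_risk    = [0.50, 0.30, 0.30, 0.30, 0.30, 0.20, 0.20, 0.20]

print(round(auroc(nomogram_risk, died), 3), round(auroc(qsofa_risk, died), 3))
print(round(idi(nomogram_risk, qsofa_risk, died), 3))
```

The pairwise AUROC is quadratic in the number of patients, which is acceptable for a sketch; rank-based formulas give the same value more efficiently on large cohorts.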
Statistical analyses were performed using R software (version 3.6.1, R Foundation for Statistical Computing, Vienna, Austria). Missing values were addressed with multiple imputation in the process of logistic regression and model construction. The imputation technique involves creating multiple copies of the data and replacing missing values with imputed values drawn as a suitable random sample from their predicted distribution. A two-tailed p value of < 0.05 was considered statistically significant. All analyses were reported according to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines. [13]

Results

Characteristics of participants
After screening by the inclusion and exclusion criteria, a total of 7000 patients with suspected infection were included in the final cohort. Baseline characteristics of all participants and of the hospital survivor and hospital non-survivor groups are shown in Table 1. Except for gender and BMI, all features were significantly different between the two groups. Non-survivors were older than survivors (73 [59, 82] vs 64 [51, 78], p < 0.001). Illness severity and organ failure were more severe in non-survivors, with p < 0.001 for both the simplified acute physiology score (SAPS) (51 [40, 63] vs 33 [25, 42]) and the SOFA score (7 [4, 10] vs 3 [2, 5]). The distribution of the qSOFA score differed between the two groups (p < 0.001), with a larger proportion of qSOFA ≤ 1 in survivors and of qSOFA ≥ 2 in non-survivors. In addition, non-survivors had a longer ICU stay than survivors (3.46 [1.63, 6.99] vs 2.21 [1.3, 4.58], p < 0.001).

Development of a prediction nomogram
The SOFA and qSOFA scores as well as the characteristics of the predictors entered into the logistic regression analysis in the training and validation sets are shown in Additional file 3: Table S2, which showed no statistically significant difference between the two sets for almost all variables.
As the nomogram was developed in the training set, Table 2 shows the characteristics of the training-set variables that were entered into the logistic regression. Except for the level of chloride (p = 0.688), all variables were statistically different between survivors and non-survivors. The VIF of the variables in Table 2 was calculated and none of them had a square root of VIF greater than 2, indicating that there was no collinearity between variables. After logistic regression, the risk factors independently associated with in-hospital mortality of ICU patients with suspected infection were identified and are shown in Table 3. A model integrating lactate, Elixhauser index, age, RDW, WBC, lymphocyte, pH, bicarbonate and the mean values of respiratory rate, temperature and SpO2 was then established, and a nomogram based on this model was plotted (Fig. 1). According to the nomogram, a patient with a total score of more than 225 points has a predicted probability of in-hospital death of more than 50% and is therefore considered to be at risk of in-hospital death.
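The link between the nomogram's total points and the predicted probability follows from the underlying logistic model. The notation below (β_0, β_i, x_i) is generic and not taken from the paper, and the fitted coefficients are not reproduced here; the index runs over the 11 predictors listed above.

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = \beta_0 + \sum_{i=1}^{11} \beta_i x_i,
\qquad
p = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{i=1}^{11} \beta_i x_i\right)\right]}
```

A predicted probability above 50% corresponds to a positive linear predictor. Because a nomogram's total points are a linear rescaling of that linear predictor, the 225-point cut-off reported above marks approximately the point at which the linear predictor crosses zero.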

Model performance
The predictive validity of the nomogram for in-hospital mortality in ICU patients with suspected infection was statistically greater than that of the qSOFA score (AUROC: 0.772 vs 0.611; p < 0.001) and the SOFA score (AUROC: 0.772 vs 0.743; p = 0.015) in the training set. In the validation set, the predictive validity of the nomogram was still statistically greater than that of the qSOFA score (AUROC: 0.764 vs 0.625; p < 0.001), but similar to that of the SOFA score (AUROC: 0.764 vs 0.739; p = 0.141) (Fig. 2). Compared with the qSOFA score, the predictive validity of the nomogram was improved by 12.5% and 11.3% in the training and validation sets, respectively (Table 4). Calibration curves were plotted for both the training and validation sets, with the bias-corrected line generated by the bootstrap method (Fig. 3). The figure showed that the apparent curve and the bias-corrected curve deviated slightly from the reference line, but good agreement between observation and prediction was still observed.

Clinical usefulness of the nomogram
To assess clinical usefulness, the DCA curve for the nomogram was plotted and compared with that of the qSOFA score (Fig. 4). The results indicated that medical intervention guided by the nomogram added more net benefit than intervention guided by the qSOFA score when the threshold probability (PT) was > 0.1 in both the training and validation sets.
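The net benefit plotted in a decision curve can be sketched as follows. This is a minimal Python illustration of the standard decision curve analysis formula, not the authors' R code; the toy data and function names are invented, and both models are assumed to output a predicted probability of death.

```python
def net_benefit(probs, died, threshold):
    """Net benefit of intervening in patients whose predicted risk is >= threshold:
    TP/n - FP/n * threshold / (1 - threshold)."""
    n = len(died)
    tp = sum(1 for p, d in zip(probs, died) if p >= threshold and d == 1)
    fp = sum(1 for p, d in zip(probs, died) if p >= threshold and d == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(died, threshold):
    """Reference strategy in which every patient receives the intervention."""
    prevalence = sum(died) / len(died)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data, invented for illustration only (1 = died in hospital)
died          = [1, 1, 1, 0, 0, 0, 0, 0]
nomogram_risk = [0.80, 0.60, 0.30, 0.40, 0.20, 0.10, 0.10, 0.05]
qsofa_risk    = [0.50, 0.30, 0.30, 0.30, 0.30, 0.20, 0.20, 0.20]

pt = 0.25  # hypothetical threshold probability
print(round(net_benefit(nomogram_risk, died, pt), 4))
print(round(net_benefit(qsofa_risk, died, pt), 4))
print(round(net_benefit_treat_all(died, pt), 4))
```

Evaluating these functions over a grid of threshold probabilities and plotting the results yields the decision curve; a model is preferred over another at a given threshold when its net benefit is higher.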

Stability of the model performance
Sensitivity analysis was conducted in the Carevue system. The predictive validity, predictive agreement, calibration and clinical usefulness of the nomogram were compared with those of the SOFA and qSOFA scores by using the AUROC and IDI index, Cronbach's α, the calibration curve, and the DCA curve, respectively. The results showed that the predictive validity of the nomogram was statistically greater than that of the qSOFA score.

Discussion
First, both the qSOFA score and the nomogram are designed for the early identification of adult patients with suspected infection who are likely to have poor outcomes. However, the nomogram performed better than the qSOFA score in predicting in-hospital mortality in patients with suspected infection in the ICU.
Considering that the qSOFA score had good predictive performance in the out-of-hospital, emergency department, and general ward settings but poor performance in the ICU, the nomogram we developed may effectively make up for this shortcoming of the qSOFA score. Combined use of the two models may provide timely and accurate prediction of poor outcome in patients with suspected infection in both non-ICU and ICU settings, and may then prompt clinicians to investigate further for organ dysfunction with the SOFA score and to initiate appropriate sepsis therapy.
Second, as can be seen from Cronbach's α, the nomogram and the SOFA score show high consistency in recognizing patients with suspected infection who are at high risk of in-hospital death, which indicates that the septic patients identified by the SOFA score from the suspected infection cohort are to a large extent consistent with the suspected infection patients with poor outcome identified by the nomogram. In addition, as can be seen from the AUROC, the predictive validity of the nomogram for poor outcome is similar to or even greater than that of the SOFA score. Together, these findings suggest that the nomogram is appropriate for the prediction of poor outcome in the potentially septic population. Moreover, in order to make timely and objective predictions, we included in the nomogram only laboratory indexes and other indicators that can be obtained within a short time, while time-consuming laboratory tests, such as hepatic and renal function, and subjective indicators, such as the Glasgow Coma Scale (GCS) and medical intervention, were excluded from the model construction process. Therefore, the predictive efficiency of the nomogram is higher than that of the SOFA score, which avoids delaying early fluid resuscitation and antibiotic administration in the treatment of sepsis. [14] However, it should be noted that the nomogram cannot be used for the diagnosis of sepsis, because it is designed for the prediction of in-hospital mortality and not for the evaluation of organ dysfunction. Consequently, sepsis should still be diagnosed according to the Sepsis-3 criteria. [1] The flow chart for the diagnosis of sepsis by the combined use of the nomogram and the qSOFA and SOFA scores is shown in Additional file 6: Figure S3.
Third, the Sepsis-3 Task Force did not include lactate as part of the diagnostic criteria for sepsis because their evidence showed that integrating lactate into the qSOFA did not improve the predictive validity for ICU mortality and/or prolonged ICU stay. However, in the logistic regression analysis, we found that lactate had the highest odds ratio (OR) among all the independent risk factors for in-hospital mortality in patients with suspected infection, which indicates that lactate is the most important predictor and has the strongest power for the prediction of in-hospital mortality in ICU patients with suspected infection. In support of this conclusion, the Sepsis-1 definition also included an increased lactate level as part of the severe sepsis criteria, which were equivalent to the sepsis criteria in Sepsis-3. [15] Therefore, although lactate was excluded from the diagnostic criteria of sepsis in Sepsis-3, its clinical significance should not be underestimated. Several studies have shown that increased lactate levels can be used to identify suspected infection patients with normal vital signs and without significant organ dysfunction who are at increased risk of death from sepsis. [16][17][18] Consequently, dynamic monitoring of blood lactate still plays an important role in the identification and management of sepsis, and may help to risk-stratify patients with suspected sepsis and facilitate aggressive early treatment. [19][20][21][22] In line with this view, the Surviving Sepsis Campaign issued an interim guideline after the publication of Sepsis-3 stating that the 3 h and 6 h bundle therapies should be continued for sepsis with lactate levels above 2.0 mmol/L. [3]
Fourth, apart from lactate, RDW had the highest OR in the multivariate logistic regression, indicating that it may significantly affect the prognosis of patients with suspected infection in the ICU. This finding is consistent with previous studies showing that RDW can be used to estimate the short-term mortality of nonhematologic diseases, such as sepsis-associated encephalopathy, [23] stroke, [24] cardiovascular diseases, [25,26] and liver diseases. [27,28] Studies have indicated that the inflammatory response and oxidative stress during sepsis may contribute to the adverse effect of RDW, [29][30][31][32] but the exact mechanisms underlying the relationship between RDW and in-hospital mortality of patients with suspected infection are still unclear. Therefore, more studies are necessary to clarify the specific mechanism of the positive relationship between RDW and short-term poor prognosis, which may provide a new target for improving the prognosis of patients with sepsis.
It should be noted that if patients suspected of infection have already stayed in the ICU for a period of time and data on creatinine, bilirubin and urine output have been obtained, the SOFA score should be calculated directly to diagnose sepsis and guide the relevant treatment. However, if such patients have just been admitted to the ICU or data on the above indicators are lacking, clinicians should first use the nomogram to identify patients who are likely to have poor outcomes, so as to initiate or escalate sepsis-related therapy as soon as possible.
The study has several limitations. First, because of the retrospective nature of this observational study, unidentified confounding factors might affect the results if they were added to the model. Second, as data in the MIMIC database are relatively old and both internal validation and sensitivity analysis were conducted only with data from this database, external validation based on more recent data and on data from different medical settings is necessary to further evaluate the performance of the developed prediction nomogram. Third, as predicting in-hospital mortality manually with a nomogram is inefficient, it is essential to integrate the nomogram into an electronic medical system, which would allow clinicians to make prompt predictions without increasing their workload and then to deliver appropriate treatment in a timely manner.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Authors' contributions
YY, QW and SL analyzed the data and wrote the paper. YFY and KG collected the data. FG, PW and WL checked the integrity of the data and the accuracy of the data analysis. LL and YZ designed the study and revised the paper. All authors read and approved the final manuscript.

Figure 1
The nomogram for the prediction of in-hospital mortality in patients with suspected infection in ICU.

Figure 2
The ROC curve of the prediction nomogram, SOFA score and qSOFA score in the training set (a) and validation set (b).

Figure 3
Calibration curves constructed by bootstrap approach in the training set (a) and validation set (b).

Figure 4
The DCA curve of medical intervention in patients with the nomogram and qSOFA score in the training set (a) and validation set (b).
