Development of a nomogram to predict in-hospital mortality of sepsis-associated encephalopathy: a retrospective cohort study

Introduction
Sepsis-associated encephalopathy (SAE) is brain dysfunction that develops during sepsis without evidence of central nervous system infection. It is characterized by deterioration of consciousness, behavior, memory and cognitive function, and imposes a heavy medical and financial burden on families and society [1][2][3]. More importantly, patients with sepsis complicated by encephalopathy tend to have higher short-term mortality than those with sepsis alone. A landmark study by Eidelman LA et al. demonstrated that in-hospital mortality rose with the severity of encephalopathy, from 16% when the Glasgow Coma Score (GCS) was 15 to 63% when GCS was 3-8 [4]. Consistent findings were reported in another high-quality study by Romain Sonneville et al., in which 30-day survival probability fell from 67% when GCS was 15 to 32% when GCS was 3-8, and even a mild change of consciousness (defined by a GCS of 13-14) was an independent risk factor for 30-day death, with a hazard ratio (HR) of 1.38 after adjustment for confounding factors [5]. The clinical significance of SAE is further supported by several other studies showing that SAE was responsible for increased short-term mortality, prolonged hospital stay, or excessive therapeutic activity leading to overconsumption of medical resources [6,7]. However, it is still difficult to identify SAE patients at higher risk of in-hospital death and to stratify them further. Therefore, the main objective of the present study, based on a large clinical database, was to develop a nomogram that predicts the individual probability of in-hospital death in SAE patients, thereby helping clinicians make timely treatment decisions and improve the prognosis of such patients.

Data source
We conducted an observational study by retrieving data from the Medical Information Mart for Intensive Care (MIMIC III) open-source clinical database, which contains de-identified health-related data of over forty thousand patients who received treatment in the critical care units of the Beth Israel Deaconess Medical Center between June 2001 and October 2012 [8, 9]. All data in MIMIC III are organized into 26 tables recording various individual information, such as demographic characteristics, treatment measures, nursing notes and laboratory tests. In addition, the database contains survival outcome data, obtained either from the hospital and laboratory health record systems, which report in-hospital mortality, or from the Social Security Administration Death Master File, which records out-of-hospital survival. The MIMIC III database can be used freely after successful application and ethical approval from the Institutional Review Boards of both Beth Israel Deaconess Medical Center (Boston, MA, USA) and the Massachusetts Institute of Technology (Cambridge, MA, USA). Since all data in the database are de-identified to remove patient information, individual patient consent was not required.

Study population and data extraction
PgAdmin (version 4.1, Bedford, USA) was used to run Structured Query Language (SQL) queries and extract data from the MIMIC III database. Six tables were used in our study: DIAGNOSES_ICD, ICUSTAYS, PATIENTS, LABEVENTS, MICROBIOLOGYEVENTS and PRESCRIPTIONS. We included adult patients (> 17 years of age) with a diagnosis of sepsis according to the Third International Consensus Definitions for Sepsis (Sepsis 3.0): (1) infection at ICU admission, and (2) a Sequential Organ Failure Assessment (SOFA) score ≥ 2 [10]. We excluded patients with (1) primary brain injury (traumatic brain injury, ischemic stroke, hemorrhagic stroke, epilepsy or intracranial infection); (2) preexisting liver or kidney failure affecting consciousness; (3) severe burns and trauma; (4) hypothermia or malignant hyperthermia; (5) chronic alcohol or drug abuse; (6) pre-existing mental illness, including schizophrenia, depression, anxiety, compulsion and dementia; (7) severe electrolyte imbalances or blood glucose disturbances, including hyponatremia (< 120 mmol/l), hypercapnia (PCO2 > 75 mmHg), hyperglycemia (> 180 mg/dl) or hypoglycemia (< 54 mg/dl); (8) death or discharge within 24 hours of ICU admission; (9) no GCS evaluation. Eligible patients were included in the final cohort for investigation (Additional file 1: Fig. S1). For the final cohort, we retrospectively collected the following data from the database: (1) demographic data and hospital outcome; (2) comorbid conditions as coded and defined in the International Classification of Diseases, Ninth Revision (ICD-9); (3) mean values of vital signs during the first 24 hours of ICU stay; (4) the first laboratory data after ICU admission; (5) site of infection and type of micro-organism; (6) use of antibiotic, sedative and analgesic agents. The severity of illness and organ failure was assessed by SOFA on the first day of ICU stay [11].

Sepsis-associated encephalopathy
As GCS has been shown to be an excellent tool for characterizing SAE and distinguishing it from sepsis without encephalopathy, we defined SAE in this study as sepsis accompanied by GCS < 15 on the first day of ICU admission [4]. For sedated or postoperative patients, we adopted the GCS measured before sedation or surgery. Sepsis with normal consciousness (GCS = 15) but complicated by delirium should also be defined as SAE. However, records of the Confusion Assessment Method for the ICU (CAM-ICU), which is used to diagnose delirium in ICU patients, are scarce in the database [12], so this subgroup of patients could not be included in the study. Nevertheless, such patients account for less than 10% of SAE and, according to a previous report, show no significant difference in short-term mortality compared with sepsis without encephalopathy [5]. Therefore, excluding this subgroup is unlikely to significantly compromise the reliability of our results.
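The case definition above can be written down as a small predicate. The following Python sketch is illustrative only: the record type `IcuStay` and its field names are hypothetical (they are not MIMIC III column names), the exclusion criteria are deliberately left out, and the original extraction was done with SQL rather than Python.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IcuStay:
    """Hypothetical per-admission record; field names are assumptions."""
    age: int
    infection_at_admission: bool
    sofa_day1: int
    gcs_day1: Optional[int]  # None when no GCS was recorded


def has_sepsis(stay: IcuStay) -> bool:
    # Sepsis 3.0 as applied in the text: adult, infection at ICU admission, SOFA >= 2.
    return stay.age > 17 and stay.infection_at_admission and stay.sofa_day1 >= 2


def has_sae(stay: IcuStay) -> bool:
    # SAE as defined in the text: sepsis with GCS < 15 on the first ICU day.
    # Stays without a GCS evaluation cannot be classified and are treated as non-SAE here.
    return has_sepsis(stay) and stay.gcs_day1 is not None and stay.gcs_day1 < 15


print(has_sae(IcuStay(age=70, infection_at_admission=True, sofa_day1=6, gcs_day1=13)))
```

Delirious patients with GCS = 15 would be returned as non-SAE by this predicate, which mirrors the limitation acknowledged in the text.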

Statistical analysis
The Shapiro-Wilk test was used to assess the distribution of variables. Continuous data were expressed as mean ± standard deviation (SD) when normally distributed and as median (interquartile range) otherwise. Categorical data were expressed as number (percentage).
To enhance the stability and reliability of the conclusions, the patients in the final cohort were randomly assigned, without replacement, to a training set and a validation set at a ratio of 7:3. Normally distributed continuous variables were compared between the training and validation sets using the unpaired Student's t test, and non-normally distributed continuous variables using the Mann-Whitney U test. The chi-squared test was used to assess differences in categorical variables between the sets.
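A 7:3 split without replacement amounts to shuffling the patient identifiers once and cutting the shuffled list. The Python sketch below is one way to do this under stated assumptions: the function name `split_7_3` and the seed are hypothetical, and the study itself was carried out in R, so this does not reproduce the authors' actual partition.

```python
import random


def split_7_3(patient_ids, seed=2020):
    """Shuffle once and cut at 70%; every patient lands in exactly one set."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed (an assumption) so the split is repeatable
    rng.shuffle(ids)
    n_train = round(0.7 * len(ids))
    return ids[:n_train], ids[n_train:]


train, valid = split_7_3(range(956))
print(len(train), len(valid))  # 669 287
```

With 956 patients this yields 669 training and 287 validation patients, the set sizes reported for the final cohort.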
Logistic regression analysis was used to identify risk factors independently associated with in-hospital mortality. Specifically, variables associated with in-hospital death in univariate analysis (p < 0.1) were entered into multivariate logistic regression analysis to calculate estimated odds ratios (OR) and 95% confidence intervals (95%CI), with the significance level for independent risk factors set at p < 0.05. To avoid multicollinearity, significant variables already incorporated into the SOFA score were not included in the multivariate analysis. A nomogram was then constructed in the training set according to Occam's razor, namely that the best model is the one that achieves the aim of the study with the fewest variables [13]. Missing values were handled with multiple imputation during logistic regression and model construction. The performance of the nomogram and the SOFA score in predicting the probability of in-hospital death was then evaluated in both the training and validation sets by the area under the receiver operating characteristic curve (AUROC). The calibration of the nomogram was evaluated by the bootstrap method with 1000 resamples, and the calibration results of the nomogram and the SOFA score were compared. In addition, the integrated discrimination improvement (IDI) was calculated to compare discrimination slopes, and the Brier score was calculated to evaluate model fit [14]. Decision curve analysis (DCA) was performed to evaluate the net benefit of medical intervention guided by each model at different threshold probabilities in the training and validation sets. To achieve the best sensitivity and specificity, X-tile analysis was used to find the cut-off values of the total score calculated by the nomogram. Kaplan-Meier analysis was conducted to visualize the probability of in-hospital survival grouped by the cut-off points in both the training and validation sets, and log-rank tests were used to identify between-group differences.
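The three performance measures named above have compact definitions, sketched here in plain Python for illustration. The function names and the five-patient toy data are hypothetical and are not study data; the authors' analysis was performed in R. AUROC is computed as the fraction of (non-survivor, survivor) pairs in which the non-survivor receives the higher predicted risk, the Brier score as the mean squared difference between predicted risk and outcome, and IDI as the difference in discrimination slopes between two models.

```python
def auroc(y, p):
    """Fraction of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [pi for yi, pi in zip(y, p) if yi == 1]
    neg = [pi for yi, pi in zip(y, p) if yi == 0]
    wins = sum((a > b) + 0.5 * (a == b) for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def brier(y, p):
    """Mean squared error between predicted probability and the 0/1 outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def discrimination_slope(y, p):
    """Mean predicted risk in events minus mean predicted risk in non-events."""
    ev = [pi for yi, pi in zip(y, p) if yi == 1]
    ne = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(ev) / len(ev) - sum(ne) / len(ne)


def idi(y, p_new, p_old):
    """Integrated discrimination improvement of the new model over the old one."""
    return discrimination_slope(y, p_new) - discrimination_slope(y, p_old)


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
y = [1, 1, 0, 0, 0]
p_old = [0.5, 0.3, 0.4, 0.2, 0.3]  # e.g. risk from a single score
p_new = [0.8, 0.6, 0.3, 0.2, 0.1]  # e.g. risk from a richer model
print(auroc(y, p_old), auroc(y, p_new))
print(brier(y, p_old), brier(y, p_new))
print(idi(y, p_new, p_old))
```

A lower Brier score and a positive IDI both favor the new model; confidence intervals such as those reported in the study would additionally require resampling, which is omitted here.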
Statistical analyses were performed using R software (version 3.6.1, R Foundation for Statistical Computing, Vienna, Austria). A two-tailed p value < 0.05 was considered statistically significant. All analyses were reported according to the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD) guidelines [15].

Characteristics of participants
A total of 46467 patients were admitted to the ICU for various causes, of whom 15847 patients with sepsis were included in the study cohort. After screening by the inclusion and exclusion criteria, a total of 956 patients were diagnosed with SAE by GCS < 15 and were included in the final cohort. Characteristics at baseline and upon ICU admission of the 956 participants (669 in the training set and 287 in the validation set) showed no statistically significant difference between the two sets in almost all variables (Tables 1 and 2). The median age in the two sets was 73 (60, 82) and 72 (58, 82) years, respectively. Males accounted for 52.62% and 56.45% of patients in the two sets. Median length of hospital stay was 12.33 (7.52, 21.68) days in the training set and 13.98 (8.03, 23.36) days in the validation set, and nearly half of the patients were admitted to the Medical Intensive Care Unit (MICU). In-hospital deaths in the two sets numbered 143 (21.38%) and 58 (20.21%), respectively, without a statistically significant difference (p = 0.75). On the first day of ICU admission, the SOFA score was 6 (4, 9) and 6 (4, 8) and the GCS was 13 (9, 14) and 13 (8, 14) in the training and validation sets, respectively.
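The mortality comparison between the two sets can be checked by hand from the reported counts with a chi-squared test on the 2×2 table. The Python sketch below is illustrative; the function name is hypothetical, and the use of Yates' continuity correction is an assumption, since the text does not say whether a correction was applied. Under that assumption, the reported counts give a p value of about 0.75.

```python
import math


def chi2_2x2(a, b, c, d, yates=True):
    """Chi-squared test for the 2x2 table [[a, b], [c, d]]; returns (chi2, p)."""
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        # Yates' continuity correction (assumed here, not stated in the text)
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-squared distribution with 1 degree of freedom
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Deaths and survivors: training set 143 of 669, validation set 58 of 287
chi2, p = chi2_2x2(143, 669 - 143, 58, 287 - 58)
print(round(chi2, 3), round(p, 2))
```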

Development of a prediction nomogram in the training set
Missing values were handled by multiple imputation before logistic regression analysis. We imputed the data 10 times and combined the imputed datasets into one by taking their mean values. Univariate and multivariate logistic regression results are shown in Table 3. Except for bilirubin, PO2, SAPS II, GCS and systolic pressure, which were included in the SOFA score, sixteen variables with p < 0.1 in univariate analysis were entered into multivariate analysis. Age, SOFA, red blood cell distribution width (RDW), neutrophil to lymphocyte ratio (NLR), heart rate, respiratory rate, temperature, lung infection and gram-positive bacterial (G+) infection were identified as independent risk factors for in-hospital death among patients with SAE. Lung infection and G+ infection were not included in model development since these two parameters depend on microbial culture, which takes more than 48 hours. Then, the performance of all combinations of the remaining seven risk factors was comprehensively evaluated, and NLR was excluded because of its minimal contribution to model discrimination. Finally, a model integrating age, SOFA, RDW, heart rate, respiratory rate and temperature was established because its discrimination was similar to that of the combined model integrating all the independent risk factors (Fig. 1A).
Based on this model, a nomogram was plotted to predict the probability of in-hospital death (Fig. 1B).
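The step in which the ten imputed datasets were combined by taking mean values can be pictured as a cell-by-cell average. The Python sketch below is a simplified illustration with a hypothetical function name and made-up numbers; it assumes all imputed datasets share the same shape and contain only numeric values, and it is not the authors' R code.

```python
def average_imputations(imputed_sets):
    """Cell-wise mean across m imputed datasets of identical shape."""
    m = len(imputed_sets)
    n_rows = len(imputed_sets[0])
    n_cols = len(imputed_sets[0][0])
    return [
        [sum(ds[i][j] for ds in imputed_sets) / m for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Two hypothetical imputations of the same 2x2 table; only cell (1, 1) was missing,
# so observed cells are identical and the imputed cell is averaged.
imp1 = [[73.0, 14.2], [60.0, 15.0]]
imp2 = [[73.0, 14.2], [60.0, 16.0]]
print(average_imputations([imp1, imp2]))
```

Observed values are unchanged by the averaging because they are identical in every imputed dataset; only the originally missing cells receive a pooled value.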

Validation of the predictive nomogram
The comparison between the nomogram and the SOFA score for predicting in-hospital mortality in SAE patients is shown in Table 4. In the training set, the AUROC of the nomogram was 0.774 (95%CI: 0.729-0.818), significantly higher than that of SOFA, 0.662 (95%CI: 0.61-0.714; p < 0.01) (Fig. 2A). Similarly, in the validation set, the AUROC of SOFA was 0.599 (95%CI: 0.524-0.674), which the nomogram improved to 0.741 (95%CI: 0.673-0.809; p < 0.01). These results indicate that the nomogram discriminated better than SOFA in predicting in-hospital mortality of patients with SAE: in about 77.4% and 74.1% of survivor/non-survivor pairs in the training and validation sets, respectively, the nomogram assigned the higher predicted risk to the patient who died. Calibration curves were plotted for both the training and validation sets, with the bias-corrected curve obtained by bootstrapping. In the training set, the apparent curve, the bias-corrected curve and the ideal reference line were closely aligned, demonstrating good calibration (Fig. 3A). In the validation set, the apparent and bias-corrected curves deviated slightly from the reference line, but good agreement between observation and prediction was still observed (Fig. 3B). The Brier scores of the nomogram and SOFA were 0.136 (95%CI: 0.12-0.153) and 0.155 (95%CI: 0.138-0.172), respectively, in the training set, and 0.168 (95%CI: 0.144-0.192) and 0.193 (95%CI: 0.169-0.216), respectively, in the validation set, indicating that the nomogram had better calibration than SOFA (Additional file 2: Fig. S2). Moreover, compared with SOFA, the IDI of the nomogram was 0.109 (95%CI: 0.081-0.136, p < 0.001) in the training set and 0.083 (95%CI: 0.049-0.117, p < 0.001) in the validation set, meaning that the nomogram increased the discrimination slope over SOFA by 10.9 and 8.3 percentage points in the two sets, respectively.

DCA analysis and performance of the nomogram in stratifying risk of patients
With regard to clinical use, DCA curves for the nomogram were plotted and compared with those for SOFA in both the training and validation sets. In the training set, medical intervention guided by the nomogram added more net benefit than SOFA when the threshold probability (PT) was between 0.1 and 0.8 (Fig. 4A). In the validation set, treatment guided by the nomogram gained more net benefit when PT was between 0.25 and 0.8 (Fig. 4B).
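The net benefit plotted in a decision curve follows the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability, PT / (1 - PT). The Python sketch below illustrates this for a single threshold; the function names and the toy data are hypothetical and unrelated to the study cohort.

```python
def net_benefit(y, p, pt):
    """Net benefit of treating patients whose predicted risk is >= pt."""
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= pt and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= pt and yi == 0)
    return tp / n - fp / n * pt / (1 - pt)


def net_benefit_treat_all(y, pt):
    """Reference strategy: intervene in every patient regardless of risk."""
    prevalence = sum(y) / len(y)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


# Toy data (hypothetical): 1 = died in hospital, 0 = survived.
y = [1, 1, 0, 0, 0]
p = [0.8, 0.6, 0.3, 0.2, 0.1]
print(net_benefit(y, p, pt=0.25), net_benefit_treat_all(y, pt=0.25))
```

Repeating the calculation over a grid of thresholds and plotting the results against the treat-all and treat-none (net benefit 0) strategies gives a decision curve; a model is preferred over the range of PT where its curve lies highest.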
The total scores calculated with the nomogram were then divided into three subgroups based on the cut-off values identified by X-tile analysis in the training set (Fig. 5A), namely 217 and 258 points. Kaplan-Meier curves in both the training and validation sets showed significant differences in in-hospital survival when SAE patients were stratified into a low-risk group (≤ 217), a middle-risk group (218-257) and a high-risk group (≥ 258) by the cut-off points (log-rank p < 0.001 and log-rank p = 0.0052, respectively) (Fig. 5B).
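The stratification rule stated above maps a nomogram total score to one of three groups. A minimal Python sketch follows; the function name is hypothetical, and the treatment of non-integer scores between 217 and 218 as middle risk is an assumption, since the text gives integer boundaries only.

```python
def risk_group(total_points):
    """Map a nomogram total score to the risk group defined by cut-offs 217 and 258."""
    if total_points >= 258:
        return "high"
    if total_points > 217:
        return "middle"
    return "low"


for points in (150, 217, 218, 257, 258, 300):
    print(points, risk_group(points))
```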

Discussion
In this retrospective analysis of the MIMIC III database, we used logistic regression analysis to identify independent risk factors for in-hospital death of SAE patients. The predictors age, SOFA, RDW, heart rate, respiratory rate and temperature were identified and integrated into a best-fit prediction model, visualized as a nomogram. To the best of our knowledge, this is the first study to evaluate the potentially modifiable factors contributing to in-hospital death in SAE and to develop a nomogram to predict its in-hospital mortality. The predictive performance of the nomogram was tested by discrimination and calibration in a training set and a validation set as well as by the bootstrap method, all showing acceptable and stable performance. Moreover, decision curve analysis was employed to account for both the benefits and the costs of nomogram-guided intervention in SAE patients and thus to validate its clinical usefulness. The decision curves showed that interventions guided by the nomogram add more net benefit than those guided by the SOFA score.
The SOFA system was first developed to better describe multiple organ failure or morbidity [16]. Since then, researchers have found that SOFA is not only a scoring system for evaluating the severity of organ failure but also a useful tool for predicting in-hospital mortality in cardiovascular disease, trauma and critical illness [17][18][19]. More recently, the Third International Consensus Definitions for Sepsis and Septic Shock (Sepsis-3) recommended the use of SOFA to diagnose sepsis because it is associated with a 10% higher in-hospital mortality in systemic infection, and recognition of this risk may prompt clinicians to give timely and appropriate medical intervention [10]. Seymour CW et al. further supported the consensus with the finding that the validity of SOFA in discriminating in-hospital mortality in systemic infection was acceptable, with an AUROC of 0.74, which was not statistically different from that of the more complex LODS score but was clearly greater than that of the qSOFA score [20]. These findings indicate that the SOFA score is a useful and simple tool for predicting in-hospital death of patients with systemic infection, but whether it is applicable to predicting in-hospital mortality of patients with SAE is still unclear. Thus, we evaluated the performance of SOFA in predicting in-hospital death both in patients with systemic infection and in those with SAE. In the 15847 patients with systemic infection, the AUROC was 0.724 (Additional file 3: Fig. S3), similar to the AUROC (0.74) in the study of Seymour CW, indicating good performance of the SOFA score in identifying patients with systemic infection at risk of in-hospital death. Nevertheless, the SOFA score performed poorly in identifying SAE patients at risk of in-hospital death, with an AUROC of 0.599-0.662. Therefore, we developed the current predictive model incorporating the SOFA score and clinical parameters, which showed better predictive performance than SOFA, with improved discrimination and calibration. Interestingly, SOFA carried the greatest weight in the nomogram, indicating that it is the most important predictor in the best-fit model and has the strongest power to predict in-hospital mortality in SAE patients.
RDW is a measure of the variability in size of circulating erythrocytes and is routinely used in the differential diagnosis of anemia. However, studies have revealed that RDW is also useful in estimating short-term mortality in non-hematologic diseases, such as cardiovascular diseases [21,22], stroke [23], liver diseases [24,25], and critical illness [26]. In patients with acute subarachnoid hemorrhage or acute heart failure, RDW is even associated with long-term mortality [27,28]. Consistently, our study demonstrated that RDW is an independent risk factor and potent predictor for in-hospital mortality in SAE. The mechanisms underlying the relationship between RDW and short-term mortality in SAE remain largely unknown, but several studies have shown that RDW is positively associated with inflammatory markers, such as C-reactive protein (CRP) and erythrocyte sedimentation rate (ESR), in unselected outpatients, patients with autoimmune diseases and healthy populations [29][30][31][32]. Thus, we hypothesized that the inflammatory response during sepsis may contribute to the adverse impact of RDW on the prognosis of SAE. In addition, oxidative stress may be another link between RDW and poor in-hospital outcome, because studies have indicated that oxidative stress can increase anisocytosis by disrupting erythropoiesis and altering the circulating half-life of red blood cells, ultimately leading to an increased RDW [33,34].
To further facilitate clinical use and treatment, patients with SAE were stratified into three risk groups based on the cut-off values of the total score calculated by the nomogram. The in-hospital mortality in the three risk groups was 7.4%, 28.71% and 45.57% in the training set and 11.49%, 26.14% and 44% in the validation set. Interestingly, the in-hospital mortality of the high-risk group was similar to that of patients with septic shock, 42.3% [10]. As septic shock is characterized by the need for vasopressors to maintain mean arterial pressure (MAP) ≥ 65 mmHg and an increased lactate level (> 2 mmol/L), we further compared the frequency of vasopressor use as well as the levels of MAP and lactate in the three groups.
Results showed a significantly higher lactate level and frequency of vasopressor use in the high-risk group than in the other two risk groups, while MAP was over 65 mmHg in all groups (Additional files 4-6: Fig. S4-S6), indicating that septic shock may be an important cause of in-hospital death in the high-risk group. On this basis, medical interventions for septic shock, including early investigation for and treatment of infection, fluid resuscitation within 15-30 minutes with repeated assessment of hemodynamics, use of vasopressors and corticosteroids, and provision of supportive care [35,36], may reduce the in-hospital mortality of SAE patients in the high-risk group. It is difficult to confirm the exact mechanism for the increased in-hospital mortality in the middle-risk group, but it was similar to that of septic patients with fluid-resistant hypotension requiring vasopressors but without hyperlactatemia (lactate < 2 mmol/L) [10]. Consistently, the blood lactate level was lower than 2 mmol/L in the low- and middle-risk groups and did not differ between them (Additional file 4: Fig. S4). We hypothesized that circulatory failure without obvious abnormality of cellular metabolism may be one reason for the increased in-hospital mortality in the middle-risk group. Consequently, fluid resuscitation and rational use of vasoactive drugs to improve circulatory function may help prevent in-hospital death of patients in the middle-risk group [37]. Nevertheless, more studies are needed to ascertain the exact causes of in-hospital mortality in the three risk groups so that targeted therapies can be applied or developed to effectively reduce in-hospital death of SAE patients.
Two points should be noted when using the nomogram. First, as the vital signs in our study are the mean values over the first 24 hours of each patient's ICU stay, the nomogram is not applicable to patients who die or leave within 24 hours of ICU admission. Second, the laboratory tests in the nomogram are the first results after ICU admission; thus, all laboratory tests included in the nomogram should be completed within the first 24 hours of ICU admission.
This study has some limitations. First, as specific therapy for SAE is lacking, the interventions considered in the DCA are treatments for sepsis and septic shock [36,37]. It is therefore urgent to develop specific treatments for encephalopathy during sepsis, which may further enhance the clinical usefulness of the nomogram and reduce in-hospital death of SAE patients. Second, given the retrospective nature of this observational study, unidentified confounding factors not included in the model may have affected the results. Third, as the database lacks data related to the CAM-ICU, septic patients with GCS = 15 but complicated by delirium were not identified in the study. Whether the nomogram is appropriate for this population therefore needs to be further verified. Fourth, as neuroimaging data were not included in the study, we could not assess the impact of organic brain lesions on in-hospital outcome. Studies based on brain MRI have revealed that impairments of cerebral white matter in patients with critical illness are not only related to sequelae of the central nervous system but also associated with increased mortality [38]. Fifth, one of the challenges in studying SAE is that, without a specific diagnostic method, it remains a diagnosis of exclusion, which may lead to high specificity but relatively low sensitivity for the diagnosis of SAE. Thus, the current nomogram can only be used in SAE diagnosed by exclusion and may require further modification once specific diagnostic methods are developed. Finally, we only conducted an internal validation in the study cohort from the MIMIC database; external validation should be performed in future studies to further validate the robustness and performance of the prediction model.

Conclusions
A prediction nomogram based on the SOFA score, together with the patient's age, RDW, and the mean values of heart rate, temperature and respiratory rate on the first day of ICU admission, can be conveniently used for accurate prognostic prediction of in-hospital mortality in SAE. This may be particularly beneficial in ...

References
Perlstein TS, Weuve J, Pfeffer MA, Beckman JA. Red blood cell distribution width and mortality risk in a community-based prospective cohort. Arch Intern Med. 2009; 169:588-594.
34. Weiss G, Goodnough LT. Anemia of chronic disease. N Engl J Med. 2005; 352(10):1011-23.
35. Peake SL, Delaney A, Bailey M, et al. Goal-directed resuscitation for patients with early septic shock.

Supplementary Files
FigureS1.tif, FigureS2.tif, FigureS3.tif, FigureS4.tif, FigureS5.tif, FigureS6.tif

Table 2
Patients' characteristics at ICU admission (a)
CCU, coronary care unit; CSRU, cardiac surgical intensive care unit; MICU, medical intensive care unit; SICU, surgical intensive care unit; TSICU, trauma/surgical intensive care unit; SOFA, sequential organ failure assessment; GCS, Glasgow coma score; RDW, red blood cell distribution width; NLR, neutrophil to lymphocyte ratio; MCV, mean corpuscular volume.
a Parametric continuous data are presented as mean ± standard deviation (SD), non-parametric continuous data are presented as median (interquartile range), and categorical data are presented as frequency (percentage).
b Severity score is calculated on the first day of each patient's ICU stay.
c Vital signs are calculated over the first 24 hours of each patient's ICU stay.
d Laboratory tests recorded the first result of each patient's ICU stay.
e ALT in the table is the value after logarithmic transformation.
f ALT in the table is the value after logarithmic transformation.

Table 3
Univariate and multivariate logistic regression analysis of the cohort.
a ALT in the table is the value after logarithmic transformation.
b ALT in the table is the value after logarithmic transformation.
SOFA, sequential organ failure assessment; RDW, red blood cell distribution width; NLR, neutrophil to lymphocyte ratio.

Table 4
of models for predicting in-hospital mortality in patients.