Development and Validation of a Prognostic Nomogram for Cerebral Infarction Patients in Intensive Care Units: a Retrospective Cohort Study

OBJECTIVES: Our study aimed to establish a practical risk prediction model for the prognosis of patients with cerebral infarction. BACKGROUND: Although a large number of studies have focused on prognostic risk factors in patients with cerebral infarction, there is still a lack of practical, visual risk prediction models for the in-hospital mortality of these patients. METHODS: This was a retrospective cohort study. The lasso regression model was used for data dimension reduction and feature selection. A model of in-hospital mortality in cerebral infarction patients was developed by multivariable logistic regression analysis. Calibration and discrimination were used to assess the performance of the nomogram. Decision curve analysis (DCA) was used to evaluate the clinical utility of the model. RESULTS: Overall, 1,564 patients with cerebral infarction (1,315 survivors and 249 deaths) from the MIMIC-IV database were included in our research. The incidence of in-hospital mortality was 15.9%. The lasso regression model verified that age, white blood cell count, anion gap (AG) and SOFA score were significantly correlated with hospital mortality. The risk prediction model demonstrated good discrimination, with an area under the ROC curve (AUC) of 0.789 (95% CI 0.752–0.826) in the training set and 0.829 (95% CI 0.791–0.867) in the test set. The calibration plot of predicted probabilities against observed death rates indicated excellent concordance. DCA showed that the model has good clinical benefit. CONCLUSION: We developed a nomogram that predicts hospital mortality in patients with cerebral infarction based on real-world data. The nomogram exhibited excellent discrimination and calibration capacity, favoring its clinical utility.
Figure 3(c): Calibration curve of the nomogram in the training set. … The solid line represents the performance of the nomogram; a closer fit to the diagonal dotted line represents a better prediction. The figure shows that the prediction model has good predictive ability.
Figure 3(d): Calibration curve of the nomogram in the test set. The x-axis represents the predicted probability of in-hospital mortality of patients with cerebral infarction. The y-axis represents the actual in-hospital mortality of patients with cerebral infarction. The diagonal dotted line represents a perfect prediction by an ideal model. The solid line represents the performance of the nomogram; a closer fit to the diagonal dotted line represents a better prediction. The figure shows that the prediction model has good predictive ability.
Figure 3(e): Decision curve analysis (DCA) of the nomogram for patients with cerebral infarction (training set). Solid line: no patient applies the nomogram, and the net benefit is zero; gray line: all patients apply the nomogram. The further the red solid line is from the dotted line, the greater the clinical application value.
Figure 3(f): Validation of the DCA curve of the nomogram for patients with cerebral infarction (test set). Solid line: no patient applies the nomogram, and the net benefit is zero; gray line: all patients apply the nomogram. The further the red solid line is from the dotted line, the greater the clinical application value.


Introduction
According to data released by the World Health Organization, between 2000 and 2012 stroke became the second leading cause of death in the world 1-2, behind only heart disease, with high rates of incidence, morbidity, recurrence, disability and mortality 1,3.
Cerebral infarction (CI, also known as ischemic stroke) is a common cerebrovascular disease [4][5]. It refers to a disorder of cerebral blood circulation in which blockage or severe stenosis of a cerebral vessel reduces cerebral blood perfusion and leads to the death of brain tissue in the supplied territory due to ischemia and hypoxia 6. Clinically, it presents as a sudden focal or diffuse neurological deficit, and new focal infarct lesions are shown on head computed tomography (CT) or magnetic resonance imaging (MRI) 7. The great majority of critically ill CI patients are treated in the intensive care unit (ICU) 8. However, not all of them benefit from ICU care. Risk stratification for these CI patients is urgently needed to support efficient decision-making.
Nomograms are useful prognostic tools that predict clinical events by integrating individual risk factors with patients' performance status. They have been widely used in tumor prognosis to predict long-term survival and recurrence 9.
A nomogram was recently applied to predict mortality in ischemic stroke patients 10. We hypothesized that a nomogram could be used for the risk stratification of critically ill patients with CI in the ICU. This study aimed to identify prognostic factors for mortality in critically ill patients with CI and to build a nomogram based on a multivariable logistic regression model. The performance and clinical benefit of the nomogram were assessed in a validation cohort. Overall, the nomogram could help identify high-risk patients and support clinical decision-making.

Data Source
The primary data of our study were derived from the MIMIC-IV database (version 1.0). MIMIC-IV is an extensive database containing all medical record numbers corresponding to patients admitted to an intensive care unit (ICU) or the emergency department of the Beth Israel Deaconess Medical Center (BIDMC) between 2008 and 2019 11. Version 1.0 is the latest version of the MIMIC-IV database. One of our authors (C.J, certification ID: 8979131) gained permission to access the database after completing online training at the National Institutes of Health (NIH). Our research was conducted entirely on publicly available, anonymized data; therefore, individual patient consent was not required. All methods were carried out in accordance with relevant guidelines to protect the privacy of patients.

Population selection
We included adult patients (aged >18 years) diagnosed with cerebral infarction at hospital admission, identified in the MIMIC-IV database by International Classification of Diseases version 9 and version 10 diagnosis codes. The exclusion criteria were: (I) incomplete or unobtainable documentation or other vital medical records; (II) pregnancy or the postpartum period; (III) missing blood biochemistry and blood gas analysis data; (IV) missing survival outcome data.

Clinical and laboratory data
Patients' baseline characteristics (age, height, weight) and comorbidities (diabetes, hypertension, chronic lung disease, myocardial infarction, heart failure, etc.) were collected. The first recorded vital signs and laboratory test results of cerebral infarction patients after hospital admission were extracted. Vital signs included systolic blood pressure (SBP), diastolic blood pressure (DBP), mean blood pressure (MBP), body temperature (T), heart rate (HR), respiratory rate (RR) and pulse oximetry derived oxygen saturation (SpO2). Laboratory tests included creatinine, blood urea nitrogen (BUN), anion gap, pH, lactate, chloride, glucose, hemoglobin, hematocrit, white blood cell count, platelet count, serum potassium, serum sodium, calcium and prothrombin time (PT). The sequential organ failure assessment (SOFA) score 12 was also calculated for each patient. The endpoint of our study was in-hospital mortality, which was defined by survival status at hospital discharge.

Statistical Analysis
Multiple imputation was used for variables with less than 20% missing data, and variables with more than 20% missing data were discarded, to form a new dataset 13. The new dataset was randomly divided in a 7:3 ratio into a training set (70%) and a test set (30%). The training set was used for model development (derivation cohort) and the test set for model validation (validation cohort). Continuous variables that did not conform to a normal distribution were reported as medians with upper and lower quartiles; otherwise, they were reported as mean ± standard deviation (SD). Group comparisons were performed using the t-test or Wilcoxon rank-sum test for continuous variables, and the chi-square test or Fisher's exact test for categorical variables.
First, lasso regression was used for preliminary screening of the predictors based on the whole study database, retaining the predictors with large regression coefficients. Second, multivariate regression analysis was applied to the screened predictors to identify independent risk factors in the training dataset. Third, multivariate regression analysis was applied again to these independent risk factors to establish the risk prediction model in the training dataset. The scores for predictors were calculated from the coefficients of the logistic regression variables in the model, and the model was visualized as a nomogram.
The discrimination of the risk prediction model for in-hospital mortality of cerebral infarction patients was assessed by receiver operating characteristic (ROC) curve analysis. An area under the ROC curve (AUC) greater than 0.7 was regarded as good discrimination. The goodness of fit of the prediction model was assessed by calibration curve analysis and tested with the Hosmer-Lemeshow test.
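As a minimal illustration of the 7:3 split and the discrimination measure described above, the following Python sketch divides 1,564 record identifiers into training and test sets and computes the AUC as the proportion of death/survivor pairs in which the death received the higher predicted risk (ties count as one half). The analyses in this study were run in R; Python is used here only for illustration, and the function names, the random seed and the toy data are assumptions that are not taken from the study.

```python
import random


def split_70_30(ids, seed=0):
    """Randomly split identifiers into a 70% training and 30% test set."""
    rng = random.Random(seed)  # the seed is an arbitrary assumption
    shuffled = list(ids)
    rng.shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def roc_auc(y_true, y_score):
    """AUC as the fraction of (death, survivor) pairs ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one death and one survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(pos) * len(neg))


# 1,564 hypothetical record identifiers, matching the cohort size only.
train_ids, test_ids = split_70_30(range(1564))

# Toy outcomes (1 = died in hospital) and toy predicted risks.
died = [0, 0, 1, 0, 1, 1]
risk = [0.05, 0.20, 0.30, 0.35, 0.60, 0.80]
auc = roc_auc(died, risk)
good_discrimination = auc > 0.7  # threshold stated in the text
```

The pairwise definition is quadratic in the number of patients, which is acceptable for a cohort of this size and keeps the link to the interpretation of the AUC explicit.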
Decision curve analysis (DCA) was conducted to evaluate the clinical utility of the nomogram by quantifying net benefits across a range of threshold probabilities. The capabilities of the model were validated in the test set. Results were expressed as odds ratios (ORs) with 95% confidence intervals (CIs). All tests were two-tailed, and p ≤ 0.05 was considered statistically significant. Statistical analyses were performed using R version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria).
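The net benefit used in decision curve analysis can be sketched as follows, assuming the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The sketch is in Python for illustration only (the study used R), and the function names and the toy data are assumptions rather than the study's code or results.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    weight = threshold / (1.0 - threshold)  # odds of the threshold probability
    return tp / n - fp / n * weight


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as high risk."""
    n = len(y_true)
    events = sum(y_true)
    weight = threshold / (1.0 - threshold)
    return events / n - (n - events) / n * weight


# Toy data: 10 patients, 4 deaths (1 = died in hospital).
died = [0, 0, 1, 0, 1, 1, 0, 0, 0, 1]
prob = [0.05, 0.10, 0.30, 0.35, 0.60, 0.80, 0.15, 0.08, 0.12, 0.15]

nb_model = net_benefit(died, prob, threshold=0.2)
nb_all = net_benefit_treat_all(died, threshold=0.2)
nb_none = 0.0  # treating nobody has zero net benefit by definition
```

Repeating the calculation over a grid of thresholds gives the decision curve; the model is useful at a given threshold when its net benefit exceeds both reference strategies.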

Results
The characteristics of study patients
In total, 1,564 eligible patients (734 males and 830 females) with an average age of 77.99±0.39 years were finally included in our study; more details about the data extraction process and missing data are shown in Supplementary Table S1 and Table S2. In the present study, 249 patients (116 males and 133 females) died during hospitalization, giving an incidence of in-hospital mortality of 15.9%. Patients who died in hospital tended to be older, more often had malignant cancer, and had higher levels of anion gap, BUN, glucose and WBC count (Table 1). After the whole sample was randomly divided into a training set and a test set in a 7:3 ratio, there were no significant differences in the observed clinical variables between the training set and the validation set (Table 2).
… P<0.001) and SOFA score (OR: 1.035; 95% CI: 1.028-1.042; P<0.001) as the independent predicting factors for in-hospital mortality of patients with cerebral infarction (Table 3).
Prediction model for predicting the risk of in-hospital mortality by nomogram
The prediction model included age, anion gap, WBC count and SOFA score, which were determined to be independent predictors of the risk of in-hospital mortality for patients with cerebral infarction. For example, consider an 80-year-old patient with the following clinical data at admission: WBC 15 × 10^9/L, anion gap 20 mEq/L and SOFA score 8 points. The corresponding points for age, WBC count, anion gap and SOFA score were 70, 30, 22.5 and 37.2, respectively. The total score was therefore about 160 points, corresponding to a risk of in-hospital mortality of 52% (Figure 2).
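The arithmetic in this section can be checked with a few lines of Python. The numbers are those reported above; the variable names are illustrative assumptions. The step from about 160 total points to a 52% risk is read from the nomogram axis and is not reproduced here.

```python
# Points reported for the example patient (age 80, WBC 15, anion gap 20, SOFA 8).
points = {"age": 70.0, "wbc": 30.0, "anion_gap": 22.5, "sofa": 37.2}
total_points = sum(points.values())  # reported in the text as "about 160"

# In-hospital mortality in the whole cohort.
deaths, patients = 249, 1564
incidence_percent = 100.0 * deaths / patients

# Consistency of the reported subgroup counts.
survivors = patients - deaths
males, females = 734, 830
```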
Performance evaluation and validation of prediction model
ROC curve analysis for the training set showed that our risk prediction model has good discrimination (AUC: 0.789; 95% CI 0.752-0.826; Figure 3(a)). We validated the prediction model in the test set, where it also showed good discrimination (AUC: 0.829; 95% CI 0.791-0.867; Figure 3(b)). Furthermore, we performed calibration curve analysis on the prediction model. In the training set, the calibration plot of predicted probabilities against observed death rates indicated excellent concordance (Figure 3(c)). In the test set, the calibration plot also showed excellent concordance between the predicted probabilities and the observed mortality (Figure 3(d)). In addition, we performed decision curve analysis to evaluate the clinical utility of the prediction model. The DCA curves revealed that the prediction model has good utility in clinical practice (Figure 3(e) and Figure 3(f)).
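The idea behind a calibration plot can be sketched as follows: patients are sorted by predicted risk, divided into equally sized groups, and the mean predicted risk of each group is compared with its observed death rate. This is a simplified Python illustration under assumed names and toy data; it is not the routine used in the study, and it does not compute the Hosmer-Lemeshow test statistic.

```python
def calibration_table(y_true, y_prob, n_bins=5):
    """Return (mean predicted risk, observed death rate, size) per risk group."""
    pairs = sorted(zip(y_prob, y_true))  # order patients by predicted risk
    n = len(pairs)
    rows = []
    for b in range(n_bins):
        group = pairs[b * n // n_bins:(b + 1) * n // n_bins]
        if not group:
            continue  # can happen when there are fewer patients than bins
        mean_pred = sum(p for p, _ in group) / len(group)
        obs_rate = sum(y for _, y in group) / len(group)
        rows.append((mean_pred, obs_rate, len(group)))
    return rows


# Toy data: 10 low-risk patients (1 death) and 10 high-risk patients (5 deaths).
prob = [0.1] * 10 + [0.5] * 10
died = [0] * 9 + [1] + [0, 1] * 5

table = calibration_table(died, prob, n_bins=2)
```

Plotting the first column against the second gives the calibration curve; points on the diagonal indicate that predicted and observed risks agree, as in the toy data here.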

Discussion
Our study collected clinical information on 1,564 ICU patients with cerebral infarction (1,315 survivors and 249 deaths) from the MIMIC-IV database. The nomogram performed well in both the primary and validation cohorts, as assessed by ROC curves, calibration curves and decision curve analysis. Our nomogram could therefore be applied in clinical practice. Nomograms predict an individual's probability of a clinical event from individual information and variables, and they have become a common prognostic model in oncology 14. This study provides, for the first time, an easy-to-use prognostic nomogram based on 4 clinical factors collected on the first day of admission for critically ill patients with CI. The nomogram could improve risk stratification and help prevent the death of critically ill patients with CI in time.
Cerebral infarction is a major disease that endangers health in modern society. Patients with cerebral infarction are likely to have sequelae if treatment is not appropriate, and the high incidence and mortality impose economic and health burdens on our country and its people. A substantial number of critically ill patients with CI are admitted to the ICU 8. However, not all CI patients benefit from ICU care. To stratify risk and support more efficient decisions for CI patients, we used a nomogram that integrates individual risk factors with performance status to forecast clinical events. We hypothesized that a nomogram based on a multivariable logistic regression model in a primary cohort could also be applied to the risk stratification of CI patients in the ICU.
Age is one of the most essential risk factors in cerebrovascular diseases [15][16], such as cerebral infarction 17, transient ischemic attack (TIA) 18, intracerebral hemorrhage (ICH) 19 and intracranial aneurysm 20. Our study also found that age was an independent predictor of prognosis in cerebral infarction patients in intensive care units. Generally, a rise in serum anion gap (AG) results from excessive accumulation of organic acids or excessive loss of anions 21. Excessive generation of lactate and pyruvate in serum is a common cause of AG elevation [22][23]. Serum AG can be used as a short-term prognostic indicator for patients with CI: a higher AG on the first day of admission was related to an increased risk of all-cause mortality, and a few patients in the ICU had a higher AG 24. WBC count is an important risk factor and is related to delayed cerebral ischemia 25. A high WBC count is also related to mortality and pneumonia after acute ischemic stroke, which might be induced by stress and the inflammatory response; a higher WBC count has been reported to be associated with mortality after acute stroke 26. The SOFA score is a sequential organ failure assessment scoring system applied to data collected within 24 hours of admission to the intensive care unit. It evaluates the respiratory, cardiovascular (blood pressure, vasoactive drug use), renal, hepatic, neurological and haematological (platelet count) systems [27][28]. Overall, in our study these 4 factors were reliable prognostic factors for mortality in critically ill CI patients in the ICU, and they could also contribute to clinical work.
In addition, we assessed the performance and clinical benefit of the nomogram to demonstrate its accuracy and utility. The nomogram can be applied easily in clinical practice to identify high-risk patients and guide decision-making. Timely prognostic assessment is essential because the treatment time window for CI is narrow. It is especially important to identify high-risk patients as early as possible so that further active interventions can be carried out for a better prognosis. Biomarkers currently attract much attention. A higher AG was related to an increased incidence of all-cause mortality, which can guide the monitoring of cerebral infarction and the formulation of secondary prevention strategies 24.
However, several problems remain to be solved. First, some previously reported risk factors for CI (transient ischemic attack, atrial fibrillation, smoking and alcohol use, blood lipids, blood glucose, blood homocysteine) 27,29-30 were not shown to be related to in-hospital death in our study, so the prognostic value of these factors for CI should be reconfirmed in future studies. Second, the National Institutes of Health Stroke Scale (NIHSS) score and modified Rankin Scale (mRS) score were not included in the present study because of the complexity of their scoring and the difficulty of obtaining them from the MIMIC-IV database; future studies can compare our nomogram with these two scoring models. Last, the nomogram still needs additional samples to confirm its applicability and reliability, and more external cohorts would further strengthen the reliability and significance of the model.

Conclusion
The nomogram we developed showed excellent discrimination and calibration capacity, which supports its utility, and could predict hospital mortality in patients with CI based on real-world data.
Figure 2: The nomogram for predicting the risk of in-hospital mortality in cerebral infarction patients in intensive care units. The top "Points" row provides a scale for each risk factor; the points for each predictor are obtained by drawing a straight line upward from the corresponding value to the "Points" line. The points from all predictors are then summed and the total is located on the "Total Points" axis. To obtain the patient's probability of in-hospital mortality, draw a straight line down to the corresponding "Risk of death" axis.
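The way a nomogram of this kind is read can be sketched in Python under one common construction: each predictor's contribution to the logistic linear predictor is rescaled linearly so that the predictor with the widest effect spans 0 to 100 points, and the total points are mapped back to a probability through the logistic function. The coefficients, ranges and intercept below are hypothetical values chosen only for illustration; they are not the fitted values of this study, so the resulting points and risk differ from those read from Figure 2.

```python
import math

# Hypothetical log-odds coefficients and axis ranges, chosen only for
# illustration; they are NOT the fitted values of the study's model.
MODEL = {
    "age": {"beta": 0.04, "lo": 20.0, "hi": 100.0},
    "wbc": {"beta": 0.03, "lo": 0.0, "hi": 40.0},
    "anion_gap": {"beta": 0.08, "lo": 5.0, "hi": 35.0},
    "sofa": {"beta": 0.15, "lo": 0.0, "hi": 20.0},
}
INTERCEPT = -6.0  # hypothetical


def points_per_logit(model):
    """Scale so that the predictor with the widest effect spans 0 to 100 points."""
    widest = max(abs(s["beta"]) * (s["hi"] - s["lo"]) for s in model.values())
    return 100.0 / widest


def reference_value(spec):
    """Value that scores 0 points: the lowest-risk end of the axis range."""
    return spec["lo"] if spec["beta"] >= 0 else spec["hi"]


def predictor_points(name, value, model=MODEL):
    """Points for one predictor (the line drawn up to the "Points" row)."""
    spec = model[name]
    return spec["beta"] * (value - reference_value(spec)) * points_per_logit(model)


def risk_from_total_points(total, model=MODEL, intercept=INTERCEPT):
    """Map total points to a probability (the line drawn down to the risk axis)."""
    baseline = intercept + sum(s["beta"] * reference_value(s) for s in model.values())
    linear_predictor = baseline + total / points_per_logit(model)
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# Example patient from the Results section, scored with the hypothetical model.
patient = {"age": 80, "wbc": 15, "anion_gap": 20, "sofa": 8}
total_points = sum(predictor_points(k, v) for k, v in patient.items())
risk = risk_from_total_points(total_points)
```

Because the rescaling is linear, the risk obtained from the total points equals the risk obtained by evaluating the logistic model directly, which is why the graphical reading and the regression equation agree.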