A New Scoring System for Predicting Short-term Outcomes in Critically-ill Acute Decompensated Heart Failure

Background: Acute decompensated heart failure (ADHF) accounts for millions of emergency department (ED) visits and is associated with high in-hospital mortality. The aim of this study was to develop and validate a multiparametric score for critically-ill ADHF patients. Methods: In this single-center, retrospective study, a total of 1268 ADHF patients were enrolled and divided into derivation (n=1014) and validation (n=254) cohorts. The primary endpoint was any in-hospital death, cardiac arrest or utilization of mechanical support devices. A logistic regression model was performed to identify risk factors and build the new scoring system. The point assigned to each parameter was determined according to its β coefficient. Discrimination was validated internally using the C statistic and calibration was evaluated by the Hosmer-Lemeshow goodness-of-fit test. Results: We constructed a predictive score based on six significant risk factors [systolic blood pressure (SBP), white blood cell (WBC) count, hematocrit (HCT), total bilirubin (TBIL), estimated glomerular filtration rate (eGFR) and NT-proBNP]. This new model was computed as (1×SBP<90mmHg)+(2×WBC>9.2×10⁹/L)+(1×HCT≤0.407)+(2×TBIL>34.2μmol/L)+(2×eGFR<15ml/min/1.73m²)+(1×NT-proBNP≥10728.9ng/ml). The C statistic for the new score was 0.758 (95%CI 0.677 to 0.838), higher than that of the APACHE, AHEAD and ADHERE scores. It also demonstrated good calibration for detecting high-risk patients in the validation cohort (χ²=6.681, p=0.463). Conclusions: The new score including SBP, WBC, HCT, TBIL, eGFR and NT-proBNP might be used to predict short-term prognosis of critically-ill ADHF patients.


Background
Heart failure (HF) is an advanced manifestation of various cardiovascular diseases and is responsible for several million hospitalizations worldwide, imposing a heavy economic burden [1]. The presence of HF generally implies a poor prognosis, especially in patients who are admitted for acute decompensated heart failure (ADHF). Recent data show that the in-hospital mortality for ADHF patients is 3% and the rehospitalization rate exceeds 50% within six months [2][3][4]. According to the latest American College of Cardiology (ACC) guidelines, the initial evaluation of the clinical trajectory of ADHF is important. The identification of a high-risk status at admission may help to allocate limited hospital resources and to discuss the appropriate goals of care [5]. Therefore, timely and accurate assessment of severity and risk can be beneficial for patients with ADHF [6].
Several risk stratification systems have been published previously. Unfortunately, these systems have several limitations. First, few of them focused on a contemporary intensive care unit (ICU) population with ADHF. Second, the existing risk assessment tools for inpatients with ADHF are often complex and are uniformly underutilized. Third, the tools emphasized clinical and laboratory variables but did not include echocardiographic measures and emerging plasma biomarkers [7]. Being universally available as a prime investigation for cardiac assessment, echocardiography plays a pivotal role in daily practice, particularly in managing ADHF patients. Finally, a recent study demonstrated that clinical care risk scores established to predict the prognosis in unselected ICU patients performed poorly in cardiac intensive care unit (CICU) patients with ADHF, emphasizing the need to develop improved tools for risk stratification among critically-ill ADHF patients [8]. The aim of the present study was to develop and validate a novel clinical scoring model to predict short-term adverse outcomes in a Chinese population of critically-ill patients with ADHF and to compare it with existing systems, such as the Acute Physiology and Chronic Health Evaluation (APACHE) system [9], the AHEAD score [10] and the ACUTE HF score [11].

Study population
Clinical data were collected from 1268 patients with ADHF who were admitted to the ICU from the emergency department (ED) at Fuwai hospital between January 2014 and December 2018. All participants met the most recent European guidelines for the diagnosis of acute heart failure [12]. Critically-ill ADHF was defined as exacerbation of chronic HF (CHF) with worsening symptoms sufficient to warrant admission to intensive care. Patients with a known diagnosis of malignancy were excluded. Patients with ST-segment elevation myocardial infarction and non-ST-segment elevation myocardial infarction were also excluded because the TIMI score is established and extensively used in these patients and reperfusion treatment itself plays an important role in the prognosis. However, patients with comorbid coronary heart disease and chronic HF who were hospitalized for exacerbation of CHF without indications for reperfusion therapy were included in this study. All data were retrospectively obtained from Fuwai hospital electronic medical records. The study was approved by the ethics committee of Fuwai Hospital and was conducted in accordance with the Declaration of Helsinki.

Data collection and endpoints
For each patient, baseline information on admission was obtained by reviewing the medical records, including demographic data, baseline health status, Glasgow coma scale (GCS), body mass index (BMI), vital signs and comorbidities. The presence of atrial fibrillation (AF) and bundle branch block (BBB) was determined with 12-lead electrocardiography and pleural effusion was determined by chest X-ray. Left ventricular ejection fraction (LVEF) and estimated pulmonary arterial systolic pressure (PASP) were assessed using echocardiography (General Electric, USA). The participant's worst values of blood laboratory tests during the initial 24 hours were recorded, including arterial pH, PaO₂, actual bicarbonate (AB), lactate concentration, serum sodium, serum potassium, white blood cell (WBC) count, hemoglobin (Hb) concentration, hematocrit, international normalized ratio (INR), D-dimer concentration, total bilirubin, serum creatinine, serum uric acid (SUA), high-sensitivity troponin I (hs-TNI) and N-terminal pro-B-type natriuretic peptide (NT-proBNP). Estimated glomerular filtration rate was calculated using the Chinese version of the MDRD equation [13].
The main outcome of this analysis was a composite endpoint defined as: (1) in-hospital mortality; (2) in-hospital cardiac arrest during the admission; (3) utilization of mechanical support devices during the hospitalization, which included intra-aortic balloon pumps (IABP) and extracorporeal membrane oxygenation (ECMO). We also collected information about patients who had been listed for heart transplantation (HTx).

Statistical analysis
For patients' background data, categorical variables were expressed as frequencies (percentages), and continuous variables were expressed as means ± standard deviations or medians with quartiles depending on their normality. Normality was assessed using the Shapiro-Wilk W-test.
Participants were divided into derivation (n = 1014) and validation (n = 254) cohorts according to the order of admission to ED. The comparison of the baseline data indicated that the distribution of age and occurrence of endpoint agreed well between the two cohorts but the validation cohort had marginally more female patients, more patients with AF and higher NT-proBNP concentration. Some thresholds for categorical variables were adopted as commonly used in clinical treatment including heart rate (HR), respiratory rate (RR), AB and PaO₂ whereas age, pH and hs-TNI were considered as continuous variables.
Participants were divided into different groups based on the optimal cut-off values of lactate level, serum sodium, WBC, HCT, TBIL, SUA, D-dimer and INR, each of which was determined by receiver-operating characteristic (ROC) curve analysis. Patients were defined as underweight by BMI < 18.5 kg/m², normal by 18.5 kg/m² ≤ BMI < 24 kg/m², overweight by 24 kg/m² ≤ BMI < 30 kg/m² and obese by BMI ≥ 30 kg/m². Serum potassium < 3.5 mmol/L was defined as hypokalemia and potassium > 5.5 mmol/L was defined as hyperkalemia. The cut-off levels for anemia were hemoglobin < 130 g/L in men and < 120 g/L in women, whereas those for NT-proBNP were determined by quartiles. PASP > 30 mmHg was recorded as increased pulmonary artery pressure. The thresholds for eGFR were in accordance with Kidney Outcomes Quality Initiative guidelines, which classified participants into five stages (eGFR ≥ 90, 60 ≤ eGFR < 90, 30 ≤ eGFR < 60, 15 ≤ eGFR < 30 and eGFR < 15 ml/min/1.73 m²). Three subgroups based on LVEF were identified: HF with reduced ejection fraction (HFrEF, LVEF < 40%), HF with mid-range ejection fraction (HFmrEF, LVEF 40%-49%) and HF with preserved ejection fraction (HFpEF, LVEF ≥ 50%). The predictive power of patients' characteristics for the short-term adverse outcomes was computed using univariate logistic regression and described by odds ratios (ORs) and their 95% confidence intervals. Then, the statistically significant predictors identified by univariate analysis were entered into the multivariate logistic regression model with a forward stepwise selection algorithm. Using a method of β-coefficient-based weights similar to that used for the Framingham risk score [14], the weight assigned to each predictor was determined according to its β coefficient in the multivariate logistic regression model to develop a novel scoring system.
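The categorical thresholds described above can be written down compactly. The following Python sketch is illustrative only: the function names are hypothetical, and treating LVEF values between 49% and 50% as HFmrEF is an assumption, since the text gives that range as 40%-49%.

```python
def egfr_stage(egfr):
    """Return the stage (1-5) for an eGFR given in ml/min/1.73 m^2."""
    if egfr >= 90:
        return 1
    if egfr >= 60:
        return 2
    if egfr >= 30:
        return 3
    if egfr >= 15:
        return 4
    return 5  # eGFR < 15: the stage that scores points in the new system


def lvef_subgroup(lvef_percent):
    """Return the HF subgroup for an LVEF given in percent."""
    if lvef_percent < 40:
        return "HFrEF"
    if lvef_percent < 50:  # assumption: 49 < LVEF < 50 is grouped with HFmrEF
        return "HFmrEF"
    return "HFpEF"


def potassium_status(potassium_mmol_per_l):
    """Classify serum potassium using the 3.5 and 5.5 mmol/L thresholds."""
    if potassium_mmol_per_l < 3.5:
        return "hypokalemia"
    if potassium_mmol_per_l > 5.5:
        return "hyperkalemia"
    return "normal"
```

Each helper maps one continuous measurement to the category that would enter the univariate logistic regression as a candidate predictor.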
Subsequently, to test the prognostic power of the new score, the ROC methodology was adopted in both the derivation and validation groups. The discriminative capacity of the new score was quantified with the C statistic, while calibration was evaluated graphically and by the Hosmer-Lemeshow goodness-of-fit test.
The software package SPSS version 25.0 (IBM Corporation, New York, NY, USA) was used for statistical analysis. All statistical tests were 2-tailed, with a p value < 0.05 considered statistically significant. Graphs were generated using GraphPad Prism 8.0.
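For readers less familiar with the C statistic, it can be read as the probability that a randomly chosen patient who experienced the endpoint has a higher score than a randomly chosen patient who did not, with ties counted as one half. The sketch below computes it by direct pairwise comparison in plain Python. It is not the SPSS procedure used in the study, and the function name and toy data are hypothetical.

```python
def c_statistic(scores, outcomes):
    """Concordance (area under the ROC curve) by pairwise comparison.

    scores:   risk scores, one per patient
    outcomes: 1 if the endpoint occurred, 0 otherwise
    """
    events = [s for s, y in zip(scores, outcomes) if y == 1]
    non_events = [s for s, y in zip(scores, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0   # event patient ranked higher
            elif e == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(events) * len(non_events))


# Toy data, not from the study: four patients, two with the endpoint.
toy_c = c_statistic([0, 2, 1, 3], [0, 0, 1, 1])
```

A value of 0.5 corresponds to no discrimination and 1.0 to perfect separation of patients with and without the endpoint.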

Baseline characteristics
The baseline characteristics of the derivation and validation cohorts with critically-ill ADHF are summarized in Table 1. For both groups, the gender, age distribution and risk of adverse outcomes were comparable without significant difference. Of the 1268 patients enrolled, 873 were male with a median age of 58 (± 17) years, among whom the elderly accounted for 17.9%. The proportion of HFrEF was 62.3%, 13.9% for HFmrEF and 23.8% for HFpEF. Coexisting atrial fibrillation was observed in 35.6% of patients and pleural effusion was identified in 31.9% of the participants. During hospitalization, the primary endpoint occurred in 181 patients (14.3%) with 117 deaths (9.2%). Heart transplantation occurred in 3.5% of the patients.

Logistic regression and model establishment
Univariate analysis was performed in the derivation cohort using the univariate logistic regression model and included the following 29 parameters: age, elderly, sex, BMI, GCS, temperature, SBP, heart rate, RR, arterial pH, PaO₂, AB, lactic acid, serum sodium, potassium, WBC, Hb, HCT, TBIL, SUA, eGFR, D-dimer, INR, NT-proBNP, hs-TNI, LVEF, PASP, existence of AF, pleural effusion and BBB. All variables except age, sex, temperature, RR, arterial pH, PaO₂, hs-TNI, LVEF, AF and BBB were found to be significantly associated with the incidence of short-term adverse outcomes.
Based on the results of univariate analysis, a forward stepwise method was adopted for the 19 parameters that showed significant relations with short-term outcomes. Low SBP, high WBC count, low HCT, high concentrations of TBIL and NT-proBNP, and coexistence of stage 5 chronic kidney disease (CKD) were identified as independent predictors. Using these six risk factors and taking into account the weighting of their respective β coefficients, we determined the points assigned to each parameter, which led to a new prognostic stratification system. Because the weight associated with HCT was the lowest among the parameters, we assigned low HCT 1 point, divided all weights by a factor of 1.07 and rounded them to the nearest integer. The novel scoring system was as follows: (1×SBP < 90 mmHg)+(2×WBC > 9.2×10⁹/L)+(1×HCT ≤ 0.407)+(2×TBIL > 34.2 µmol/L)+(2×stage 5 CKD)+(1×NT-proBNP ≥ 10728.9 ng/ml). The univariate and multivariate logistic analysis results are listed in Table 2.
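To make the point assignment and the scoring formula concrete, the following Python sketch shows both steps. The function names are hypothetical, and the β values in the example are placeholders chosen for illustration, not the coefficients reported in Table 2. Rounding half up is also an assumption, since the text only says "nearest integer".

```python
import math


def points_from_betas(betas, divisor):
    """Divide each beta coefficient by `divisor` and round half up."""
    return {name: math.floor(beta / divisor + 0.5)
            for name, beta in betas.items()}


def adhf_score(sbp, wbc, hct, tbil, egfr, nt_probnp):
    """Sum the points of the six binary risk factors (range 0 to 9).

    Units follow the text: SBP in mmHg, WBC in 10^9/L, HCT as a fraction,
    TBIL in umol/L, eGFR in ml/min/1.73 m^2 (eGFR < 15 is stage 5 CKD),
    and NT-proBNP in the unit reported with the 10728.9 threshold.
    """
    return (
        1 * (sbp < 90)
        + 2 * (wbc > 9.2)
        + 1 * (hct <= 0.407)
        + 2 * (tbil > 34.2)
        + 2 * (egfr < 15)
        + 1 * (nt_probnp >= 10728.9)
    )


# HYPOTHETICAL beta values, for illustration only (see Table 2 for real ones).
example_points = points_from_betas({"HCT": 1.07, "WBC": 2.0}, divisor=1.07)

# Hypothetical patient: hypotension and leukocytosis, other values normal.
example_score = adhf_score(sbp=85, wbc=10.0, hct=0.45, tbil=20.0,
                           egfr=60.0, nt_probnp=3000.0)
```

With these placeholder inputs, the hypothetical patient receives 1 point for SBP and 2 points for WBC, for a total of 3.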

Discrimination and calibration of the new score
In the derivation cohort, the C statistic of the new scoring system was 0.794 (95%CI 0.753 to 0.836, p < 0.001). Among the validation patients, since cases with scores of 5 or higher were limited, we combined them into one group for subsequent analysis. The incidence of adverse outcomes was 0% for a score of 0 and 7.5%, 8%, 20.4%, 10% and 45.7% for scores of 1, 2, 3, 4, and 5 points or higher, respectively. The scores of the validation cohort and the incidence of primary endpoint events are shown in Fig. 1.
Using receiver operating characteristic analysis, the C statistics were calculated to compare the discriminative power of the new score with that of other established systems. The C statistic for our new score was 0.758 (95%CI 0.677 to 0.838, p < 0.001), whereas that for APACHE was 0.598 (95%CI 0.496 to 0.700, p = 0.058), and those for the ADHERE risk tree [4] and AHEAD score were 0.631 (95%CI 0.529 to 0.733, p = 0.011) and 0.540 (95%CI 0.442 to 0.638, p = 0.439), respectively, demonstrating that our system had better predictive power for short-term outcomes in critically-ill ADHF patients. The comparison of these four scores is shown in Fig. 2. The calibration of the system was evaluated with the Hosmer-Lemeshow goodness-of-fit test. In the validation cohort, the new scoring system demonstrated good calibration (χ² = 6.681, p = 0.463) for detecting high-risk ADHF patients admitted to ED. The calibration plots are shown in Fig. 3.
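The pooling of scores of 5 or higher and the per-score event rates can be illustrated with a short Python sketch. The function name and the toy data are hypothetical and do not reproduce the validation cohort.

```python
from collections import defaultdict


def incidence_by_score(scores, outcomes, pool_from=5):
    """Observed endpoint rate per score, pooling scores >= pool_from."""
    counts = defaultdict(lambda: [0, 0])  # score -> [patients, events]
    for score, outcome in zip(scores, outcomes):
        key = min(score, pool_from)       # 5, 6, ... all fall into group 5
        counts[key][0] += 1
        counts[key][1] += outcome
    return {key: events / n for key, (n, events) in sorted(counts.items())}


# Toy data, not from the study: six patients, three with the endpoint.
toy_rates = incidence_by_score([0, 0, 1, 5, 6, 9], [0, 0, 1, 1, 0, 1])
```

Pooling sparse high-score groups in this way trades resolution at the top of the scale for more stable rate estimates.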
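As a reminder of what the Hosmer-Lemeshow statistic measures, the sketch below computes it from grouped data: for each risk group it compares the observed number of events with the number expected from the mean predicted probability. The function name and the toy groups are hypothetical. The degrees of freedom returned (number of groups minus 2) follow the usual convention for a model evaluated on its own fitting data, which is an assumption here because the text does not state the grouping or the degrees of freedom used.

```python
def hosmer_lemeshow(groups):
    """Hosmer-Lemeshow chi-square statistic from grouped data.

    groups: list of (n, observed_events, mean_predicted_probability)
    Returns (statistic, degrees_of_freedom).
    """
    statistic = 0.0
    for n, observed, p_mean in groups:
        expected = n * p_mean
        variance = n * p_mean * (1.0 - p_mean)
        if variance <= 0:
            raise ValueError("mean predicted probability must be in (0, 1)")
        statistic += (observed - expected) ** 2 / variance
    return statistic, len(groups) - 2  # df convention is an assumption


# Toy groups, not from the study.
toy_stat, toy_df = hosmer_lemeshow([
    (100, 12, 0.1),
    (100, 28, 0.3),
    (100, 50, 0.5),
])
```

A small statistic relative to its degrees of freedom, and hence a large p value from the chi-square distribution, indicates no evidence of miscalibration.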
Furthermore, we attempted to predict the occurrence of heart transplantation with the new system.

Discussion
In the present study of Chinese patients in a single heart center ICU setting, we developed and validated a predictive model based on physical examinations and laboratory testing within 24 hours after admission. We found that six parameters were significantly associated with poor short-term outcome: (A) low systolic blood pressure (SBP < 90 mmHg), (B) increased white blood cell count (WBC > 9.2×10⁹/L), (C) low hematocrit (HCT ≤ 0.407), (D) abnormal liver parameter (TBIL > 34.2 µmol/L), (E) NT-proBNP ≥ 10728.9 ng/ml and (F) stage 5 CKD (eGFR < 15 ml/min/1.73 m²). In comparison, several commonly used existing systems did not exhibit an adequate ability to predict in-hospital outcomes. The new risk score might aid in the identification of patients with ADHF at risk of in-hospital death, cardiac arrest or use of mechanical support devices.

The predictive elements for ADHF
Hypotension is significantly associated with increased mortality in AHF patients. The clinical importance of SBP has previously been considered and prognostic scores such as ADHERE [4], AHFI [15] and GWTG-HF [16] have been created. Gheorghiade et al. reported that a systolic pressure under 120 mmHg at the time of admission was associated with a poor prognosis compared with a systolic pressure over 120 mmHg [17]. Previously, a multicenter acute heart failure registry study also indicated that low admission SBP is an independent predictor of mortality in patients with AHF in both HFpEF and HFrEF, regardless of age [18]. Similarly, we found that hypotension (SBP < 90 mmHg) was independently associated with the incidence of short-term poor prognosis, and it was included in the new scoring system.
Elevated leukocyte count often reflects the existence of an inflammatory response. In the present study, it also emerged as an independent risk predictor and was one of the most important determinants of the short-term prognosis. There is evidence to suggest an association between leukocytes and HF development and outcomes. Elevated leukocyte level (9 × 10³ cells/mm³) has recently been identified as an independent predictor of in-hospital mortality in patients with acute non-ischemic HF and of long-term survival in patients with dilated cardiomyopathy [19,20]; these findings support our results.
Anemia is a frequent comorbidity in patients with acute heart failure. Several studies have reported that the presence or development of anemia is correlated with increased mortality and morbidity and with higher hospitalization rates, irrespective of age, gender, diabetes or the NYHA functional class in chronic HF patients [21,22]. Von Haehling et al. found, in a study of 627 AHF patients, that moderate or severe anemia predicts a significantly increased 12-month mortality after adjusting for other risk factors [22]. However, because most previous studies used the hemoglobin level as a marker of anemia, the relationship between hematocrit and ADHF remains to be elucidated. The present study showed that hematocrit had better predictive power than hemoglobin, and it was therefore employed in the new scoring system.
Abnormal liver function test results are often observed in heart failure. According to the China heart failure (China-HF) registry, elevated total bilirubin is an independent predictor of adjusted in-hospital mortality [23]. In the Acute Study of Clinical Effectiveness of Nesiritide in Decompensated Heart Failure (ASCEND-HF), compared with normal bilirubin levels, abnormal total bilirubin was associated with increased 30-day mortality or HF rehospitalization and 180-day mortality [24]. Nevertheless, few of the risk stratification systems previously investigated selected TBIL as a marker of impaired liver function. Our analysis showed a close correlation between TBIL on admission and short-term outcomes.
The interactions between heart and kidney are strong and complicated, such that acute or chronic dysfunction in one organ may induce dysfunction in the other. As previously shown, renal dysfunction is a strong prognostic predictor in AHF. For example, in the data from the AHEAD registry, patients with eGFR < 30 ml/min on admission had a poor prognosis during hospitalization and long-term follow-up [10]. In another retrospective study, with 104794 AHF admissions, abnormal eGFR on admission proved to be a significant predictor of mortality and readmission risk [25], in keeping with the fact that nearly all established scores took renal function into account [7]. For this score, we used eGFR calculated by the Chinese version of the MDRD equation instead of serum creatinine as the indicator of renal function, scoring 2 points if it was less than 15 ml/min/1.73 m² in both sexes.
In addition to the above clinical and laboratory results, the plasma level of N-terminal pro-B-type natriuretic peptide played a critical role in our analysis. The 2016 ESC guideline recommends it as an initial diagnostic test, especially for excluding the diagnosis of HF, given its high negative predictive value [12]. A meta-analysis of ADHF patients has confirmed that NT-proBNP is an independent predictor of both all-cause and cardiovascular death despite different cut points, time intervals and prognostic models [26]. More recently, Biljana Stojcevski et al. reported that discharge NT-proBNP should be assessed to detect AHF patients with a higher risk of short- and long-term death [27]. Although current studies employed different cut-off values for NT-proBNP, we used quartiles for multivariate analysis, which showed that only patients in the highest quartile of plasma NT-proBNP concentration had poorer in-hospital outcomes, scoring 1 point in the new system.

The unique potential value of the new score
To improve the prognosis for ADHF, it is crucial to identify high-risk patients as a first step. Several risk stratification systems have been published for AHF previously, such as the Acute Physiology and Chronic Health Evaluation (APACHE) system [9], AHEAD score [10], ADHERE, American Heart Association Get With the Guidelines-Heart Failure (GWTG-HF) [16], ACUTE HF score [11] and AHFI [15]. However, these systems have several limitations. Firstly, existing clinical predictive models for AHF were largely derived from North America or Europe, and the performance of scoring systems varied substantially across different world regions. A recent study indicated that region-specific recalibrations were needed for AHF patients [28]. Secondly, many scores were established more than ten years ago. With the development and application of new diagnostic techniques and emerging biomarkers, some important clinical indicators, such as NT-proBNP, hs-TNI and D-dimer, should be re-evaluated.
There are several noteworthy features of the present investigation. Because the exclusion criteria did not involve LVEF, the study was carried out in a cohort of ADHF patients containing not only HFrEF but also HFpEF and HFmrEF, which are often ignored in other studies. It also offered a relatively comprehensive system for evaluating in-hospital outcomes for critically-ill ADHF patients, owing to the complete analysis of clinical, biochemical, electrocardiographic and echocardiographic parameters. Considering the incompleteness and limited availability of past medical history in practical ED situations, we did not highlight past diseases in building the new scoring system. Also, we utilized logistic regression instead of regression tree analysis, and hence constructed a quantifiable tool to reach better predictive accuracy. Moreover, the final model, consisting of six easy-to-obtain indexes with a simple calculation method, was relatively convenient for identifying high-risk populations and may help determine whether an emergency ADHF patient should be admitted to the ICU or treated in the ED. For these reasons, our new score might represent a practical and efficient approach to the critically-ill patients commonly hospitalized for ADHF.

New score and heart transplantation
Although this new system showed satisfactory predictive power for the composite endpoint, it could not accurately predict HTx. First, the candidacy for HTx was assessed carefully in Fuwai hospital. Elderly and frail patients with ADHF who failed optimal medical management and mechanical circulatory support often suffered from malnutrition, immune dysfunction and multiple organ failure. They were clearly unsuitable for surgery, so it is understandable that the score did not parallel candidacy for HTx. Second, the selection for HTx was associated with economic conditions, social support and psychological condition.

Study limitations
This study presents a potential model for triaging emergency department ADHF patients for the intensive care unit. However, it has several limitations. First, our database consisted of a cohort of patients from a single cardiovascular hospital, and the study population included only Chinese patients. The participants evaluated were limited to patients admitted to the ICU, and ADHF patients who were admitted to other wards were not enrolled. Although an internal validation was performed by bootstrapping techniques in the same population, the results should be interpreted carefully when applied in external validation studies. Second, the composite endpoint of our study was in-hospital death, cardiac arrest or clinical application of mechanical support devices. The six selected parameters demonstrated good ability to distinguish patients at high risk of short-term adverse events. Owing to the lack of follow-up after discharge, the ability of our scoring system to predict post-discharge and long-term prognosis is still uncertain. Third, the individual clinical data were collected at the time of admission without accounting for the effects of pre-hospital management, such as the widely used inotropic drugs for ADHF, which may influence admission blood pressure and heart rate.

Conclusions
Existing predictive systems did not demonstrate sufficient ability to evaluate the incidence of short-term adverse events in critically-ill ADHF patients in a Chinese population. Our new scoring system including SBP, white blood cell count, hematocrit, total bilirubin, estimated glomerular filtration rate and NT-proBNP might provide a practical tool for daily risk stratification of ADHF patients, irrespective of etiology.
participated in statistical guidance and sample size calculation. Yan-min Yang, Yan Liang and Jun Zhu provided clinical advice on study design. All authors read and approved the final manuscript.