Head-to-Head Comparison of Clinical Scores for Predicting Stroke Associated Pneumonia after Intracerebral Hemorrhage

Background: Despite advances in medical knowledge, treatment for intracerebral hemorrhage (ICH) remains strictly supportive, with few evidence-based interventions currently available. Stroke associated pneumonia (SAP) is a common medical complication after stroke and has a significant impact on stroke outcomes. In this study, we aimed to systematically compare the discrimination and calibration of clinical scores for predicting in-hospital SAP after ICH. Methods: The validation cohort was derived from the Beijing Registration of Intracerebral Hemorrhage. SAP was diagnosed according to the Centers for Disease Control and Prevention criteria for hospital-acquired pneumonia. Five clinical scores were included in the study: the ICH-APS-A, ICH-APS-B, ISAN, ACCD4 and PASS scores. Discrimination was assessed by calculating the area under the receiver operating characteristic curve (AUROC). Pairwise AUROCs were compared using DeLong's method. Calibration was assessed with the Hosmer-Lemeshow goodness-of-fit test and a plot of observed versus predicted risk by deciles of predicted risk. The Cox and Snell R-square and Nagelkerke R-square of the Hosmer-Lemeshow goodness-of-fit test were calculated. Results: A total of 1964 patients were enrolled. The mean age was 56.8±14.4 years and 67.6% were male. The median admission NIHSS score was 11 (IQR: 3-21). The median length of stay (LOS) was 16 days (IQR: 8-22). A total of 575 (29.2%) patients were diagnosed with in-hospital SAP after ICH. The AUROC of the five clinical scores ranged from 0.732 to 0.800. In pairwise comparison, the ICH-APS-B (0.800, 95% CI=0.780-0.820, P<0.001) showed statistically better discrimination than the other risk models (all P<0.001). All clinical scores performed better among patients with LOS longer than 72 hours. In these patients, the ICH-APS-B (0.827, 95% CI=0.806-0.848, P<0.001) still showed statistically better discrimination than the other risk models. The ICH-APS-B also showed the maximum Youden Index.
Conclusion: In pairwise comparison, the ICH-APS-B (0.800, 95% CI = 0.780-0.820, P < 0.001) showed statistically better discrimination than the other risk models for in-hospital SAP after ICH (all P < 0.001).


Background
Spontaneous intracerebral hemorrhage (ICH) accounts for approximately 15-20% of all strokes and is one of the leading causes of mortality and morbidity worldwide [1,2]. Despite advances in medical knowledge, treatment for ICH remains strictly supportive, with few evidence-based interventions currently available [3,4].
Stroke associated pneumonia (SAP) is a common medical complication after stroke and has a significant impact on stroke outcomes. Evidence shows that SAP not only increases the length of hospital stay and medical cost, but is also an important risk factor for mortality and morbidity after stroke [5]. In addition, pneumonia has been found to increase the risk of several non-pneumonia medical complications, such as deep vein thrombosis, gastrointestinal bleeding and atrial fibrillation [6]. Meanwhile, previous studies showed that SAP was more common in patients with ICH than in those with acute ischemic stroke (AIS) [6,7]. These data point to the need for more aggressive SAP prophylaxis among patients with ICH.
Several clinical scores have been developed for predicting SAP after ICH, such as the VHA score [8], ICH-APS-A [9], ICH-APS-B [9], ISAN [10], ACCD4 [11] and PASS [12]. The ISAN [10], ACCD4 [11] and PASS [12] scores included ICH patients in their derivation cohorts, while the ICH-APS-A [9] and ICH-APS-B [9] scores were developed exclusively for ICH. Although some of these risk models have been internally or externally validated, none of them has been universally accepted or consistently used in routine clinical practice and clinical research. In addition, with many grading systems available, it is becoming increasingly difficult for clinicians and researchers to determine which risk model provides optimal predictability and reliability in clinical practice and clinical trials. Therefore, a head-to-head comparison of these models in an independent cohort is needed.
In this study, we aimed to systematically compare the discrimination and calibration of clinical scores for predicting in-hospital SAP after ICH, following the TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) guideline [13,14].

Validation cohort
The validation cohort was derived from the Beijing Registration of Intracerebral Hemorrhage, a multicenter, prospective, observational cohort study. A total of thirteen hospitals in the Beijing area participated in the study. To be eligible for the study, subjects had to meet the following criteria: (1) age 18 years or older; (2) hospitalized with a primary diagnosis of spontaneous ICH confirmed by brain CT or MRI; (3) direct admission to hospital from a physician's clinic or emergency department; (4) written informed consent from patients or their legal representatives. The study protocol was approved by the Institutional Review Board (IRB) of the Beijing Tiantan Hospital (KY2014-023-02).

Data Collection And Definition Of Variables
A standardized electronic case report form (eCRF) was used for data collection. Participating centers collected data and submitted it online to the coordinating center at Beijing Tiantan Hospital. For this study, the following candidate variables were analyzed: (1) demographics; (2) time from onset to hospital (hours); (3) stroke risk factors: hypertension, diabetes mellitus, dyslipidemia, atrial fibrillation, history of stroke/TIA, myocardial infarction, heart failure, current smoking and alcohol consumption; (4) pre-stroke modified Rankin Scale (mRS) score; (5) pre-admission antithrombotic medications; (6) admission stroke severity based on the National Institutes of Health Stroke Scale (NIHSS) score and the Glasgow Coma Scale (GCS) score; (7) admission blood pressure (mmHg); (8) admission laboratory tests; (9) neuroimaging variables: hematoma volume (measured using the ABC/2 method [15]), hematoma location (supratentorial or infratentorial ICH), intraventricular extension (presence or absence) and subarachnoid extension (presence or absence); (10) etiology diagnosis (primary or secondary ICH); (11) surgical treatment (craniotomy evacuation, minimally invasive surgical therapy or brain ventricle puncture and drainage); (12) withdrawal of medical care; and (13) length of hospital stay (LOS).
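For readers unfamiliar with the ABC/2 method mentioned above, the following is a minimal sketch of the calculation. It assumes the usual ellipsoid approximation, in which A is the greatest hematoma diameter on the CT slice with the largest hemorrhage, B is the diameter perpendicular to A on the same slice, and C is the vertical extent of the hematoma. The function name and the example measurements are illustrative only and are not taken from the registry.

```python
def abc2_volume_cm3(a_cm: float, b_cm: float, c_cm: float) -> float:
    """Approximate hematoma volume in cm^3 as A * B * C / 2.

    a_cm: greatest hematoma diameter on the slice with the largest hemorrhage
    b_cm: diameter perpendicular to A on the same slice
    c_cm: vertical extent of the hematoma (slices with hemorrhage x slice thickness)
    """
    if min(a_cm, b_cm, c_cm) < 0:
        raise ValueError("diameters must be non-negative")
    return a_cm * b_cm * c_cm / 2.0


# Hypothetical measurements: 4.0 cm x 3.0 cm on the largest slice, 2.5 cm tall.
print(abc2_volume_cm3(4.0, 3.0, 2.5))  # 15.0
```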

Diagnosis Of SAP
In-hospital SAP was diagnosed by the treating physician according to the Centers for Disease Control and Prevention criteria for hospital-acquired pneumonia [16,17], on the basis of clinical and laboratory indices of respiratory tract infection (fever, cough, new purulent sputum, auscultatory respiratory crackles) and supported by typical chest X-ray findings. Pneumonia that occurred before stroke onset was not considered in this study.

Statistical analysis
Categorical variables were summarized as proportions. Continuous variables were summarized with mean and standard deviation (SD) or median and interquartile range (IQR). Chi-square or Fisher exact test was used to compare categorical variables and Mann-Whitney test or independent t-test was employed to compare continuous variables between groups.
A systematic search identified six clinical scores that could be used to predict SAP after ICH. The VHA score [8] could not be validated in this study because we did not have information on "Found-down at symptom onset". Finally, five clinical scores were included in the study: the ICH-APS-A [9], ICH-APS-B [9], ISAN [10], ACCD4 [11] and PASS [12]. Discrimination was assessed by calculating the area under the receiver operating characteristic curve (AUROC) [13,14]. Pairwise AUROCs were compared using DeLong's method [18]. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were calculated at each risk model's maximum Youden Index. Calibration was assessed with the Hosmer-Lemeshow goodness-of-fit test and a plot of observed versus predicted risk by deciles of predicted risk. The Cox and Snell R-square and Nagelkerke R-square of the Hosmer-Lemeshow goodness-of-fit test were calculated [13,14]. Because the Hosmer-Lemeshow test has been shown to be overly sensitive [19], Pearson's correlation coefficient between observed and predicted risk was calculated as well.
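The discrimination measures described above can be illustrated with a short sketch. It is not the analysis code used in this study: it computes the AUROC through its rank interpretation (the probability that a randomly chosen patient with SAP has a higher score than a randomly chosen patient without SAP, with ties counted as one half) and then searches for the cutoff with the maximum Youden Index (sensitivity + specificity - 1). The function names and the toy scores and labels are assumptions made purely for illustration.

```python
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as P(score of a case > score of a non-case), ties counted as 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def best_youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cutoff, J) maximising J, where score >= cutoff is called positive."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one case and one non-case")
    best_cutoff, best_j = None, -1.0
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
        j = tp / n_pos + tn / n_neg - 1.0
        if j > best_j:  # keeps the lowest cutoff when several share the maximum
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy data (not study data): integer risk scores and SAP status (1 = SAP).
toy_scores = [1, 2, 2, 3, 4, 5, 5, 6]
toy_labels = [0, 0, 1, 0, 1, 0, 1, 1]
print(auroc(toy_scores, toy_labels))               # 0.75
print(best_youden_cutoff(toy_scores, toy_labels))  # (4, 0.5)
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a statistical package would normally be used for the AUROC confidence intervals and for DeLong's comparison of correlated curves.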

Patient characteristics
From December 2014 to September 2016, a total of 1964 patients were enrolled in the Beijing Registration of Intracerebral Hemorrhage. The clinical characteristics are shown in Table 1. The mean age was 56.8 ± 14.4 years and 67.6% were male. The median time from onset to hospital was 4.0 hours (IQR: 1.90-11.1). The median NIHSS and GCS scores on admission were 11 (IQR: 3-21) and 14 (IQR: 8-15), respectively. The median hematoma volume on CT was 15.8 cm³ (IQR: 6.0-38.6). The median length of hospital stay was 16 days (IQR: 8-22). A total of 575 (29.2%) patients were diagnosed with in-hospital SAP after ICH. Compared with patients without in-hospital SAP, those with in-hospital SAP after ICH were older; had higher proportions of dysphagia, dysarthria, intraventricular extension of the hematoma, subarachnoid extension and surgical treatment; had higher NIHSS score, blood pressure, blood glucose and hematoma volume; and had a longer hospital stay (Table 1).

Figure 1 shows the discrimination of the five clinical scores with regard to in-hospital SAP after ICH. The AUROC of the five clinical scores ranged from 0.732 to 0.800 (Table 3). The sensitivity, specificity, PPV, NPV and maximum Youden Index for predicting in-hospital SAP after ICH are shown in Table 2. The ICH-APS-B showed the maximum Youden Index. In pairwise comparison, the ICH-APS-B (0.800, 95% CI = 0.780-0.820, P < 0.001) showed statistically better discrimination than the other risk models for in-hospital SAP after ICH (all P < 0.001). All risk models showed much better discrimination in patients with LOS longer than 72 hours (Table 3). Among patients with LOS longer than 72 hours, the ICH-APS-B (0.827, 95% CI = 0.806-0.848, P < 0.001) still showed statistically better discrimination than the other risk models with regard to in-hospital SAP after ICH (all P < 0.001).
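As an illustration of how the quantities reported in Table 2 relate to one another, the sketch below derives sensitivity, specificity, PPV, NPV and the Youden Index from a 2x2 table at a single cutoff. The function name and the counts are hypothetical and do not correspond to any score or result in this study.

```python
from typing import Dict


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, NPV and Youden Index from 2x2 counts."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column of the 2x2 table must be non-empty")
    sensitivity = tp / (tp + fn)   # proportion of SAP patients flagged by the score
    specificity = tn / (tn + fp)   # proportion of non-SAP patients not flagged
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),     # probability of SAP given a positive score
        "npv": tn / (tn + fn),     # probability of no SAP given a negative score
        "youden": sensitivity + specificity - 1.0,
    }


# Hypothetical counts at one cutoff: 50 patients with SAP, 150 without.
for name, value in diagnostic_metrics(tp=40, fp=20, fn=10, tn=130).items():
    print(f"{name}: {value:.3f}")
```

Unlike sensitivity and specificity, PPV and NPV depend on how common SAP is in the cohort, so values obtained in one population do not transfer directly to a population with a different SAP rate.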

Comparison Of Model Calibration For In-hospital SAP
The predicted and observed risks of in-hospital SAP after ICH were plotted by deciles of predicted risk (supplementary Fig. 1). The results of the Hosmer-Lemeshow test are shown in Table 4. The ICH-APS-B had a Hosmer-Lemeshow significance level greater than 0.05 in the overall cohort, indicating that the observed values were not statistically different from the expected values. The ICH-APS-B had the largest Cox and Snell R-square. Similar results were found for patients with LOS longer than 72 hours. Because the Hosmer-Lemeshow test has been shown to be overly sensitive to trivial deviations from the ideal fit, the Pearson correlation coefficient between predicted and observed risk was calculated. The correlation coefficients of all five models were greater than 0.90 (all P < 0.001). The ICH-APS-B had the largest Pearson correlation coefficient.
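The calibration check described above can be sketched as follows. This is a simplified illustration rather than the software used in this study: it assumes that each patient already has a predicted probability of SAP, sorts patients by that probability, splits them into equal-sized groups, and sums (observed - expected)^2 / [expected x (1 - expected / group size)] over the groups. The P value is conventionally taken from a chi-square distribution with (number of groups - 2) degrees of freedom; the helper below uses the closed form that holds for even degrees of freedom, which covers the usual 10-group case (8 degrees of freedom). All names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def hosmer_lemeshow(pred: Sequence[float], obs: Sequence[int],
                    groups: int = 10) -> Tuple[float, List[Tuple[int, int, float]]]:
    """Return the Hosmer-Lemeshow statistic and per-group (n, observed, expected)."""
    if len(pred) != len(obs):
        raise ValueError("pred and obs must have the same length")
    n = len(pred)
    if n < groups:
        raise ValueError("need at least one patient per group")
    order = sorted(range(n), key=lambda i: pred[i])  # patients by ascending risk
    stat = 0.0
    table = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        n_g = len(idx)
        expected = sum(pred[i] for i in idx)   # expected number of events
        observed = sum(obs[i] for i in idx)    # observed number of events
        variance = expected * (1.0 - expected / n_g)
        if variance <= 0:
            raise ValueError("a group has expected risk of exactly 0 or 1")
        stat += (observed - expected) ** 2 / variance
        table.append((n_g, observed, expected))
    return stat, table


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = total = 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Toy data (not study data), two groups only to keep the arithmetic visible.
toy_pred = [0.2] * 5 + [0.8] * 5
toy_obs = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
stat, table = hosmer_lemeshow(toy_pred, toy_obs, groups=2)
print(round(stat, 3))  # 1.25
```

With 10 groups, the P value would be obtained as chi2_sf_even_df(stat, 8); a value above 0.05 is read as no detectable departure from good calibration.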

Discussion
In this study, we systematically compared the discrimination and calibration of five clinical scores with regard to in-hospital SAP after ICH. The AUROC ranged from 0.732 to 0.800 in the overall cohort. In pairwise comparison, the ICH-APS-B showed statistically better discrimination than the other risk models. All risk models showed much better discrimination among patients with LOS longer than 72 hours, and the ICH-APS-B still showed statistically better discrimination than the other risk models in this subgroup. The ICH-APS-B had the largest Cox and Snell R-square of the Hosmer-Lemeshow goodness-of-fit test for predicting in-hospital SAP after ICH.
Previous studies have shown that clinical scores for predicting SAP performed better in sensitivity analyses restricted to patients who survived beyond 48-72 hours after ICH [9,10]. To explore the potential reasons, we compared the baseline characteristics of patients with LOS shorter than 72 hours and those with LOS longer than 72 hours. Patients with shorter LOS had significantly more severe neurological deficit on admission, with higher NIHSS score, lower GCS score and larger hematoma volume. However, these patients did not have a correspondingly increased risk of in-hospital SAP after ICH: the rates of in-hospital SAP in the two groups were not statistically different. Further, patients with shorter LOS had significantly higher proportions of in-hospital mortality and withdrawal of medical care (Supplementary Table 1). We conjecture that the contradiction between neurological severity and risk of in-hospital SAP in patients with LOS shorter than 72 hours arose because these patients died or left hospital before pneumonia occurred. This might explain why these clinical scores performed less well for in-hospital pneumonia after ICH in this group of patients.
Despite advances in medical knowledge, treatment for ICH remains strictly supportive, with few evidence-based interventions currently available [3,4]. Evidence shows that SAP not only increases the length of hospital stay and medical cost, but is also an important risk factor for mortality and morbidity after stroke. In addition, a previous study showed that the rate of SAP was higher among patients with ICH than among those with AIS [6]. These data point to the need for more aggressive SAP prophylaxis among patients with ICH than among those with AIS. Several clinical scores have been developed for predicting SAP after ICH.
Although some risk models have been internally or externally validated, none of them has been universally accepted or consistently used in routine clinical practice and clinical research. In this large ICH cohort (n = 1964), the ICH-APS-B showed statistically better discrimination than the other risk models with regard to in-hospital SAP after ICH. Meanwhile, the ICH-APS-B showed the largest Cox and Snell R-square of the Hosmer-Lemeshow goodness-of-fit test for predicting in-hospital SAP after ICH.
Although these results are promising, caution is needed when interpreting them. First, the cohorts used to develop these clinical scores differ. The ISAN, ACCD4 and PASS scores included ICH patients in their derivation cohorts, while the ICH-APS-A and ICH-APS-B scores were developed exclusively in ICH. Second, the baseline characteristics of the study populations used for derivation and validation of these clinical scores differ. Finally, there might be complex genetic, social and economic factors, as well as regional management philosophies and preferences, that are difficult to account for when risk models are developed or applied to a distinct population. These clinical scores need to be further validated in more populations and larger samples.
In two recent large randomized trials (STROKE-INF [20] and PASS [21]), preventive antibiotic therapy did not improve functional outcome after stroke. These trials selected patients by symptoms and signs, and their prevention strategies were designed for the average patient (one size fits all), with little consideration for differences in SAP risk between individuals. As a result, such trials inevitably include patients whose risk of developing SAP is unbalanced, too high or too low. Validated clinical scores could be used to stratify patients by risk of SAP after stroke and then to test prophylactic antibiotics within each risk stratum. Trials conducted in this way would clarify more accurately in which risk strata prophylactic antibiotics are effective. Whether preventive antibiotic therapy in a high-risk subgroup could improve functional outcome after ICH warrants investigation.
Our study has limitations that deserve mention. First, as in all observational studies, we cannot rule out the possibility that additional baseline variables (unmeasured confounders) had some impact on the development of in-hospital SAP after ICH, such as use of angiotensin converting enzyme inhibitors [22], acid-suppressive medications [23], oral hygiene [24] and biomarkers for stroke-induced immunodepression [25,26]. Second, we did not have all the elements required for all risk models. For example, we did not have information about "Found-down at symptom onset", so the VHA score [8] could not be validated in this study. Third, our study included only hospitalized patients; patients who died in the emergency room or were treated in outpatient clinics were not included. Finally, the validation cohort originated from an Asian population, and these models need to be further validated in other populations.

Conclusion
Several risk models were externally validated as effective for risk stratification and outcome prediction with regard to in-hospital SAP after ICH; they would be useful tools for personalized care and for clinical trials on the prevention of SAP after ICH.
Dr. Jiaokun Jia reports no disclosures.

Abbreviations
Dr. Hao Feng reports no disclosures.
Dr. Jingjing Lu reports no disclosures.
Dr. Yi Ju reports no disclosures.
Dr. Xingquan Zhao reports no disclosures.

Figure 1 Predictive performance of clinical scores with regard to in-hospital SAP after ICH (n=1964)
