Repeated measurements are not better than initial measurement of Score for Neonatal Acute Physiology-II for prediction of in-hospital mortality in severely septic preterm neonates

The ability of serial versus single time application of ‘Score for Neonatal Acute Physiology, version II’ (SNAP-II) to predict mortality at day 14 in preterm neonates < 34 weeks with severe sepsis was studied prospectively over 1 year in a tertiary care neonatal unit. SNAP-II scores were recorded at the onset of severe sepsis (T0) and serially at 24 (T1), 48 (T2) and 72 (T3) hours later. Delta scores (Δ SNAP-II) were derived from the difference between any two SNAP-II scores. Seventy-one preterm neonates were enrolled. Baseline characteristics were similar in survivors (n = 53) and non-survivors (n = 18). Median SNAP-II scores at all four time points were significantly higher in non-survivors (p < 0.001). The Δ SNAP-II (T0 – T2) score was significantly different between non-survivors and survivors (mean difference: -14.7; 95% CI: -29, -0.9; p = 0.02), while the difference was not significant for T0 – T1 and T0 – T3. Initial SNAP-II score had a significantly better discriminating ability for day 14 mortality (AUC (95% C.I): 0.83 (0.70–0.93)) than Δ SNAP-II scores at various time points (AUC (95% C.I): 0.59 (0.41–0.75) for T0 – T1, 0.70 (0.50–0.87) for T0 – T2 and 0.64 (0.38–0.89) for T0 – T3). Conclusion: Initial SNAP-II is better than Δ SNAP-II scores in predicting 14-day mortality in severely septic preterm neonates. Non-survivors had significantly higher serial SNAP-II scores compared to survivors. Serial SNAP-II scores do not have additional value in predicting mortality of preterm neonates with severe sepsis.


Introduction
Severity of illness scores predict outcomes including in-hospital mortality in adults as well as in the pediatric population [1,2]. These scores were typically measured once at admission to the ICU. Over the last two decades, studies done in adults and children have reported that serial measurements of these severity scores potentially improve the prediction of outcomes, as a change in the severity score should reflect the change in the internal milieu better than a single time measurement. Examples in which changes in serial severity scores were used to predict various outcomes include an increase in APACHE-II score from day 1 to day 3 in a medical ICU predicting mortality in adults [3]; an increase in PELOD scores in a pediatric ICU from days 1 to 2 and 2 to 4 being associated with mortality [4]; highest as well as mean SOFA scores predicting ICU mortality [5]; maximum as well as delta scores of MODS, SOFA and LOD better predicting mortality [6]; and an increase in severity scores significantly and variably preceding mortality in pediatric oncology patients [7].
A similar score called SNAP-II with six predictor variables was validated for NICU admissions [8,9]. SNAP-II at admission to the NICU was reported to predict mortality in neonates with congenital diaphragmatic hernia [10]. Another study in severely septic preterm neonates reported that a SNAP-II of ≥ 40, measured at the onset of severe sepsis, had a positive predictive value and specificity of 88% and 86%, respectively, for mortality [11]. Serial measurements of SNAP-II have not lived up to expectations, although the available evidence is too meagre to draw firm conclusions. Meadow et al showed that serial SNAP-II scores in premature neonates became progressively less helpful in distinguishing neonates who either died in the NICU or survived with low Mental/Psychomotor Developmental Index scores [12]. In another study, in neonates requiring mechanical ventilation, serial SNAP scores did not predict mortality, but admission SNAP scores did [13]. Severe sepsis/septic shock is one of the major causes of mortality in preterm neonates. The ability of serial measurements of SNAP-II scores to predict mortality in this population is very important but has not been studied yet. Hence, the current study was done in preterm neonates of < 34 weeks' gestation with severe sepsis to assess whether a change in SNAP-II score over time can predict mortality by 14 days better than the initial SNAP-II score.

Materials And Methods
This prospective cohort study was conducted in the level III NICU of a tertiary care hospital in India. All consecutive inborn preterm neonates of < 34 weeks' gestation diagnosed to have septicemia with evidence of systemic inflammatory response syndrome (SIRS) and organ dysfunction (OD) were eligible for enrolment. Neonates with major congenital malformations, those who suffered severe perinatal asphyxia, those who were moribund and those whose parents refused consent were excluded from the study. The study protocol was approved by the Institute Ethics Committee. Informed written consent was obtained from one of the parents of every enrolled subject.
Septicemia was defined as presence of clinical signs of sepsis, with either blood culture or sepsis screen positive or with radiological evidence of pneumonia. Sepsis screen consisted of CRP, µESR, TLC, ANC and ITR of neutrophils. Sepsis screen was considered positive if > 2 factors were abnormal. CRP > 10 mg/L, µESR > 10 mm or 'age in days + 3' mm in the first 7 days, TLC < 5000 per mm³ and ITR above 20% were considered abnormal. ANC was assessed using Manroe's and Zipursky's charts [14,15].
SIRS was defined based on the criteria framed by Adams-Chapman, and the variables of temperature, respiratory rate, heart rate, blood pressure, and urine output were modified for preterm neonates [16]. Presence of at least 2 of the 4 criteria defined SIRS. Severe sepsis was defined as the presence of at least one OD in the 24-hour period preceding enrolment. OD criteria were adapted from a previous study done by Sundaram et al in severely septic preterm neonates [17].
SNAP-II score consisted of the following 6 parameters: 1) lowest mean arterial pressure, 2) worst ratio of partial pressure of oxygen (PaO2) to fraction of inspired oxygen, 3) lowest temperature (°F), 4) lowest serum pH, 5) occurrence of multiple seizures, and 6) urine output (< 1 mL/kg/hour). The first 12 hours from the onset of severe sepsis served as the data collection window (T0) and scoring was repeated at 24 (T1), 48 (T2) and 72 (T3) hours from the first score. Delta scores (Δ SNAP-II) were the difference in SNAP-II scores between two time points. Differences between T0 and T1 (Δ SNAP-II_0-24), T0 and T2 (Δ SNAP-II_0-48) and T0 and T3 (Δ SNAP-II_0-72) were used for analysis. The neonates were followed up to 14 days from the onset of severe sepsis or until remission from OD or death, whichever was earlier.
The study subjects were categorized into two groups: survivors by day 14 of enrolment and non-survivors.
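The delta scores described above can be illustrated with a short Python sketch. The scores are hypothetical, the function name `delta_scores` is ours, and the sign convention (baseline minus later score) is an assumption taken from the "T0 – T2" notation used in the abstract; the authors' own analysis was done in R.

```python
from typing import Dict, Optional

# Hypothetical SNAP-II scores for one neonate; None marks a missed measurement.
scores: Dict[str, Optional[int]] = {"T0": 43, "T1": 37, "T2": 21, "T3": None}


def delta_scores(s: Dict[str, Optional[int]]) -> Dict[str, Optional[int]]:
    """Return T0 minus each later score (assumed sign convention).

    A delta is None when either of the two scores is missing.
    """
    base = s.get("T0")
    out: Dict[str, Optional[int]] = {}
    for label, key in (("0-24", "T1"), ("0-48", "T2"), ("0-72", "T3")):
        later = s.get(key)
        out[label] = None if base is None or later is None else base - later
    return out


deltas = delta_scores(scores)
print(deltas)
```

Under this convention a positive delta means the score fell (the neonate improved) between the two time points.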
The ability of serial SNAP-II scores in predicting all-cause mortality by 14 days from the onset of severe sepsis was analyzed. We also compared the organ dysfunction status at the onset of severe sepsis and at 14 days or by death, whichever was earlier.

Sample size and statistical analysis
In the unpublished annual data of the unit, a total of 80 neonates of < 34 weeks' gestation developed severe sepsis and 60% of them died before discharge. With this data as baseline with an alpha error of 5%, confidence level of 95% and accepting a variability of 5%, a sample size of 65 was estimated. An additional 10% were recruited to account for attrition.
Normality of the numerical variables was assessed by the Kolmogorov-Smirnov test and Q-Q plots. Normally distributed numerical variables were compared by the Welch two-sample t-test whereas nonparametric variables were compared by the Wilcoxon Rank-Sum test. Categorical variables were compared for independence by Pearson's Chi-squared test or by Fisher's Exact test when the cell size of any one cell in the contingency table was < 5. All SNAP-II scores were compared between the non-survivors and survivors. Total SNAP-II (sum of all scores), mean SNAP-II (average of all scores) and maximum SNAP-II (highest of the four scores) were calculated and compared between non-survivors and survivors. Bonferroni correction was done for multiple pairwise comparisons. Receiver-operating characteristic curves were generated for SNAP-II_0, SNAP-II_0-24, SNAP-II_0-48, SNAP-II_0-72, total SNAP-II, mean SNAP-II and maximum SNAP-II with mortality by day 14 and the generated AUROC curves were compared.
Considering the non-parametric distribution of SNAP-II with missing measurements at some time points and due to the correlated nature of the data due to repeated measurements of SNAP-II in the same population at four different sequential time points, a linear mixed model analysis was carried out. In the model, 'SNAP-II' score was the response variable and 'time' (four time points of measurement) was the 'repeated measures' independent variable (model condition for within participant comparison) and outcome at day 14 (binomial categorical variable) was the independent variable. Both fixed effects (time and outcome at day 14) and random effects (subjects) were included in the model analysis. Contrasts were not used to break down the interactions as 'mortality by day 14' had only two levels. The effect sizes of the variables in the model were calculated as the product of their 't' value and 'df' (degrees of freedom).
Post-hoc analysis for multiple comparisons was done by the 'tukey' method. A P value of < 0.05 was considered significant and the 95% confidence intervals were calculated by the 'bootstrap' method with 2000 repeated samplings with replacement and adequate adjustment for the correlated nature of the data. The R program and R packages "tidyverse", "nlme", "pastecs", "multcomp" and "ggplot2" were used for modelling [18][19][20][21] and for generating and comparing the ROC curves [22].
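The formula behind the sample size is not stated. As an illustration only, the sketch below assumes the common single-proportion formula with a finite population correction, using the inputs given above (proportion 0.60, absolute precision 0.05, 95% confidence, population of 80). The function name `sample_size` and the choice of formula are our assumptions, not the authors' documented method.

```python
def sample_size(p: float, d: float, z: float = 1.96, population=None) -> float:
    """Sample size for estimating a single proportion (assumed formula).

    p: expected proportion, d: absolute precision, z: normal quantile
    for the chosen confidence level. If `population` is given, a finite
    population correction is applied.
    """
    n0 = z ** 2 * p * (1 - p) / d ** 2
    if population is None:
        return n0
    return n0 / (1 + (n0 - 1) / population)


n = sample_size(p=0.60, d=0.05, population=80)
n_with_attrition = n * 1.10  # additional 10% for attrition

print(round(n, 1), round(n_with_attrition, 1))
```

Under these assumptions the calculation gives a value of about 66, close to the reported estimate of 65; without the finite population correction the same inputs give a far larger number.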
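The summary scores and the AUROC can be sketched in plain Python as follows. All values are hypothetical, and the AUROC is computed through its Mann-Whitney interpretation (the probability that a randomly chosen non-survivor has a higher score than a randomly chosen survivor, ties counted as one half), which is an illustrative choice rather than the routine the authors ran in R.

```python
from statistics import mean


def summarize(scores):
    """Total, mean and maximum of the available (non-missing) scores."""
    obs = [s for s in scores if s is not None]
    return {"total": sum(obs), "mean": mean(obs), "max": max(obs)}


def auroc(pos, neg):
    """AUROC as P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical initial SNAP-II scores.
non_survivors = [43, 53, 36, 48]
survivors = [18, 16, 37, 21, 43]

summary = summarize([43, 37, 21, None])  # one neonate, T3 missing
auc = auroc(non_survivors, survivors)
print(summary, auc)
```

Note that the mean uses only the available scores, so a neonate with a missing measurement is averaged over fewer time points.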
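One way to write such a model is shown below. This is a sketch under assumptions: a random intercept per neonate, time treated as a four-level factor with T0 as reference, and a time-by-outcome interaction (suggested by the mention of interactions above). The symbols are ours and the exact specification fitted by the authors is not given.

```latex
y_{ij} = \beta_0
       + \sum_{k=1}^{3} \beta_k \,\mathbb{1}[t_j = k]
       + \gamma\, d_i
       + \sum_{k=1}^{3} \delta_k \,\mathbb{1}[t_j = k]\, d_i
       + u_i + \varepsilon_{ij},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
\varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
```

Here $y_{ij}$ is the SNAP-II score of neonate $i$ at time point $j$ (T0 to T3), $d_i = 1$ if neonate $i$ died by day 14 and 0 otherwise, $u_i$ is the subject-level random intercept and $\varepsilon_{ij}$ is the residual error.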
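A percentile bootstrap interval for an AUROC with 2000 resamples can be sketched as below. The data are hypothetical, resampling is done separately within survivors and non-survivors, and no adjustment for correlated measurements is made, so this is a simplified illustration and not the authors' R procedure. The names `auroc` and `bootstrap_ci` are ours.

```python
import random


def auroc(pos, neg):
    """AUROC as P(pos > neg) + 0.5 * P(pos == neg) over all pairs."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(pos, neg, n_boot=2000, alpha=0.05, seed=1):
    """Percentile bootstrap CI for the AUROC.

    Each group is resampled with replacement on its own, so both
    groups are always represented in every resample.
    """
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        p = rng.choices(pos, k=len(pos))
        n = rng.choices(neg, k=len(neg))
        stats.append(auroc(p, n))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical initial SNAP-II scores.
non_survivors = [43, 53, 36, 48, 40, 57]
survivors = [18, 16, 37, 21, 43, 24, 30, 12]

lo, hi = bootstrap_ci(non_survivors, survivors)
print(auroc(non_survivors, survivors), lo, hi)
```

Fixing the seed makes the interval reproducible; with small groups such as these the interval is expected to be wide.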

Results
Demographic and baseline variables were comparable between non-survivors and survivors (Table 1). Seizures, shock and metabolic acidosis were more frequently observed in non-survivors. Apart from cardiovascular system dysfunction, other organ dysfunctions were not different between the studied groups (Table 2).
Abbreviations: SD-standard deviation, RR-risk ratio, MD-mean difference, CI-confidence interval, PPROM-preterm premature rupture of membranes, OD-organ dysfunction, GNB-gram negative bacilli, CRP-C-reactive protein, PaO2-partial pressure of oxygen, PaCO2-partial pressure of carbon dioxide
The initial SNAP-II scores as well as subsequent SNAP-II scores measured at 24, 48 and 72 hours from the onset of severe sepsis were significantly higher in non-survivors (Table 3). The mean, maximum and total SNAP-II scores were also significantly higher in non-survivors (Table 3). The Δ SNAP-II_0-48 score was significantly different between non-survivors and survivors (MD (95% C.I): -14.7 (-29, -0.9); p = 0.02), whereas other Δ SNAP-II scores were similar (Table 3).
SNAP-II scores measured at all four time points had good discriminatory ability for the primary outcome of death by 14 days (... an excellent ability of these summarized SNAP-II parameters to discriminate non-survivors from survivors (Table 4). Discriminatory ability of Δ SNAP-II scores ranged from poor to average (Table 4).
On comparing the AUROC curves in a pairwise fashion with the initial SNAP-II as the primary comparator and after making necessary adjustments for the correlated nature of the comparisons, mean SNAP-II had an AUROC significantly better than the initial SNAP-II (Table 4).
Abbreviations: AUROC-area under the receiver operating characteristic, SNAP-II-score for neonatal acute physiology version 2, C.I-confidence intervals

Discussion
Serial measurements of severity of illness scores such as APACHE II, SOFA and MODS have been studied in adult and pediatric age groups, and these studies have shown that a change in the score between two or more time points preceded and predicted mortality [3][4][5]. Despite sepsis being one of the most common causes of neonatal mortality, such serial measurements of severity scores for a better prediction of risk of mortality in this population have not been studied until now. The key observations in the current study are: (i) SNAP-II scores measured at the onset of severe sepsis and at 24, 48, and 72 hours from the onset were significantly higher in non-survivors in comparison to survivors; (ii) the initial SNAP-II score was statistically better than the Δ SNAP-II scores in discriminating non-survivors from survivors; (iii) the trend of SNAP-II scores was significantly different without any overlap between survivors and non-survivors.
A previous study done by the same authors reported a significantly higher median SNAP-II at the onset of severe sepsis in neonates who died in comparison to those who survived: median (IQR) 43 (36, 53) vs 18 (16, 37), respectively, p < 0.001 [11]. An Egyptian study done in neonates with sepsis reported a significantly higher median SNAP-II score in those who died or developed OD [23]. Another study from Nepal reported that a SNAP-II score of ≥ 12 predicted mortality with a sensitivity of 76% and specificity of 73% amongst all neonates admitted to NICU [24]. Any severity scoring system should preferably include the time factor so as to encompass the sequential changes that take place in the organ dysfunction. This is important as the illness and associated organ dysfunctions are not static and evolve over time. None of the above studies repeated the SNAP-II scores to assess the change.
Change in the severity score (delta score) reflects disease progression as well as therapeutic response and facilitates therapeutic decision making. Adult studies in ICUs have demonstrated and validated the usefulness of delta scores in prognosticating mortality [25]. We observed that Δ SNAP-II_0-48, although significantly higher in non-survivors, had an average discriminative ability (AUROC: 0.70). This is similar to the observations made by Frain et al, where the ability of serial SNAP-II scores to predict mortality in neonates requiring mechanical ventilation decreased with time [13]. A systematic review on serial severity scores in neonates showed that in the majority of the studies, SNAP-II scores were done only at admission to ICU and were used to predict mortality on the first day of life. In the same analysis, 6 studies used them at later time points and 2 studies used the score in a prospective fashion. However, none of the studies included in the review tested the utility of serial measurements of SNAP-II scores, nor did the population include a significant number of sick preterm neonates [26].
Maximum SNAP-II, total SNAP-II and mean SNAP-II scores are all summarized forms of the score that ultimately reflect the cumulative severity of the illness. Total and mean scores capture the average degree of illness severity whereas the maximum score captures the worst point in the course of the illness. Mean scores have a built-in denominator component and hence may be influenced by the number of observations, while the maximum score has the advantage of not being much affected by missing data points. In the current study, initial, total and maximum SNAP-II scores had similar AUROC curves, although the AUROC curve of mean SNAP-II scores was statistically better than that of initial SNAP-II. Ferreira et al analyzed the association between initial, highest, total and mean SOFA scores and ICU mortality and reported that the mean SOFA score and highest SOFA score had the strongest association with mortality and that the highest SOFA score had the largest AUROC curve of 0.90 [5]. They also reported that the AUROC curve of highest SOFA was significantly larger in comparison to initial SOFA at admission to the ICU. Moreno et al reported that maximum SOFA score represented the cumulative organ dysfunction experienced by a patient and had a strong correlation with mortality outcome [27]. The findings in our study indicate that the initial SNAP-II score might perform similarly to various cumulative measures of SNAP-II and better than delta SNAP-II scores in the sick and septic preterm neonatal population. This observation of poor association between serial SNAP-II assessment and mortality needs further exploration in a larger population.
The current study has a few limitations. Firstly, we did not evaluate other important outcomes such as length of hospital stay and survival without major morbidities. Secondly, the proportional contribution of individual SNAP-II parameters to prediction and prognostication was not analyzed. Despite these limitations, this is the first study of its kind in preterm neonates with severe sepsis which has examined the utility of sequential assessment of SNAP-II in comparison to initial SNAP-II using a robust linear mixed model analysis.

Conclusions
Serial measurements of SNAP-II scores were significantly higher in non-survivors compared to survivors.
Initial SNAP-II score is better than Δ SNAP-II scores in discriminating non-survivors from survivors in this population. Mean SNAP-II score had a statistically better AUROC curve than the initial SNAP-II. However, cumulative scores have the disadvantage of the need for repeated assessments. Larger, multi-centric studies are needed to explore and answer the association between sequential SNAP-II assessments and mortality and its generalizability.

Participant flow chart