Derivation, validation, and transcriptomic assessment of pediatric septic shock phenotypes identified through latent profile analyses: Results from a prospective multi-center observational cohort

Background: Sepsis poses a grave threat, especially among children, but treatments are limited due to clinical and biological heterogeneity among patients. Thus, there is an urgent need for precise subclassification of patients to guide therapeutic interventions. Methods: We used clinical, laboratory, and biomarker data from a prospective multi-center pediatric septic shock cohort to derive phenotypes using latent profile analyses. Thereafter, we trained a support vector machine model to assign phenotypes in a hold-out validation set. We tested interactions between phenotypes and common sepsis therapies on clinical outcomes and conducted transcriptomic analyses to better understand the phenotype-specific biology. Finally, we assessed whether newly identified phenotypes overlapped with established gene-expression endotypes and tested the utility of an integrated subclassification scheme. Findings: Among 1,071 patients included, we identified two phenotypes, which we named ‘inflamed’ (19.5%) and ‘uninflamed’ (80.5%). The ‘inflamed’ phenotype had an over 4-fold risk of 28-day mortality relative to those ‘uninflamed’. Transcriptomic analysis revealed overexpression of genes implicated in the innate immune response and suggested an overabundance of developing neutrophils, pro-T/NK cells, and NK cells among those ‘inflamed’. There was no significant overlap between endotypes and phenotypes. However, an integrated subclassification scheme demonstrated varying survival probabilities when comparing endophenotypes. Interpretation: Our research underscores the reproducibility of latent profile analyses to identify clinically and biologically informative pediatric septic shock phenotypes with high prognostic relevance. Pending validation, an integrated subclassification scheme, reflective of the different facets of the host response, holds promise to inform targeted intervention among those critically ill.


Introduction
Sepsis is defined as life-threatening organ dysfunction caused by a dysregulated host response to an infection. It represents a major public health problem, especially among children, where it affects an estimated 20 million each year across the globe.(1) Moreover, sepsis is the leading cause of under-five mortality.(2) Yet, despite numerous clinical trials, sepsis care remains limited to early antibiotics and intensive organ support. This lack of therapeutic efficacy has been attributed, in part, to the heterogeneity among critically ill patients.(3) Thus, reproducible approaches that identify clinically and biologically relevant subclasses are necessary to facilitate targeted therapeutic approaches and ultimately to improve patient outcomes.(4) Gene-expression profiling of whole blood has been used to identify sepsis subclasses.(5-8) Among children, Wong and colleagues used a 100-gene expression panel to identify pediatric septic shock endotypes A and B with prognostic value; assignment to endotype A was associated with a nearly 3-fold increased risk of mortality, relative to those with endotype B.(9) Subsequently, these endotypes were shown to demonstrate a differential response to corticosteroids in observational studies, with patients classified as endotype A having a 4-fold increase in mortality with corticosteroid use, relative to patients with endotype B.(10) Similar strategies have been deployed among adults, yielding analogous results.(11) Of note, gene-expression-based endotyping is being tested in the ongoing Stress Hydrocortisone in Pediatric Septic Shock (SHIPPS, NCT03401398) trial and holds promise to demonstrate the feasibility of employing predictive enrichment strategies among critically ill children.
Concomitantly, a decade ago, Calfee et al. leveraged latent class analyses of clinical, laboratory, and biomarker data to identify two phenotypes of acute respiratory distress syndrome (ARDS). The hyperinflammatory group was characterized by worse outcomes, relative to those without this phenotype.(12) Of note, these phenotypes have demonstrated heterogeneity in treatment effect (HTE) in response to several interventions in secondary analyses of ARDS trials (12,13), and to corticosteroids among critically ill COVID-19 patients.(14) More recently, Dahmer et al. and others have shown reproducibility and prognostic utility of this approach among children with ARDS.(15,16) Lastly, using similar approaches, Sinha and colleagues recently published on molecular phenotypes among adults with sepsis.(17) To the best of our knowledge, no study has evaluated the reproducibility of latent profile phenotypes in pediatric sepsis.
In the current study, we sought to derive pediatric septic shock phenotypes using latent profile analyses and test their reproducibility in our longstanding multi-center prospective observational cohort based in the U.S. We sought to establish their prognostic value and to test for interactions between phenotypes and commonly used interventions against sepsis on clinically relevant outcomes. To establish their biological significance, we conducted transcriptomic analyses in a subset of the cohort to identify differentially expressed genes and infer cell populations linked to phenotypes. Lastly, we assessed the overlap between established gene-expression endotypes of pediatric septic shock and newly identified latent profile phenotypes. We tested the hypothesis that integrating endotype and phenotype assignments could provide a refined framework for the subclassification of critically ill children.

Study design and patient selection
Our ongoing prospective observational cohort study of pediatric septic shock has been extensively detailed previously.(10,18,19) The inclusion criterion for study enrollment was meeting consensus criteria for pediatric septic shock;(20) patients were recruited between 2003 and 2023 from 13 pediatric intensive care units (PICUs) in the U.S. Blood was collected from consenting participants within 24 hours of meeting enrollment criteria (day 1). Patients who did not require any vasoactive support were excluded. The primary outcomes of interest included 7- and 28-day mortality, and complicated course, a composite endpoint defined as death by day 7 or the presence of ≥ 2 organ dysfunctions on day 7 after study enrollment.

Derivation set
We randomly split patients in the cohort into derivation (60%) and hold-out validation (40%) sets. We used the R package "mclust" (v.6.0.0) to perform latent profile analyses, a Gaussian finite mixture modeling approach, using clinical, laboratory, and biomarker variables in the derivation set. Briefly, we included deviation of vital signs from the median values for age and sex during health. Laboratory data were obtained at the discretion of treating physicians. Biomarker data were previously measured using multiplex Luminex assays in serum collected on day 1. Additional details and selection of the number of latent profiles are provided in the Online Supplement.
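To make the idea of choosing the number of latent profiles concrete, the following is a minimal sketch, not the authors' mclust workflow. It fits one-profile and two-profile Gaussian mixtures to a single synthetic variable using expectation-maximization and compares them with the Bayesian information criterion (BIC). The data, the restriction to one variable, the starting values, and every function name are assumptions made for illustration; the study itself used many variables and the model selection described in its Online Supplement.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mean, var):
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)

def bic(loglik, n_params, n):
    # Conventional form used here: lower values indicate a better model.
    return n_params * math.log(n) - 2 * loglik

def fit_one_profile(xs):
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    loglik = sum(math.log(normal_pdf(x, mean, var)) for x in xs)
    return mean, var, loglik

def fit_two_profiles(xs, n_iter=300):
    """EM for a two-component univariate Gaussian mixture."""
    xs = sorted(xs)
    n, half = len(xs), len(xs) // 2
    # Deterministic start: means of the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (n - half)]
    pooled_var = fit_one_profile(xs)[1]
    variances = [pooled_var, pooled_var]
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability of each profile for each observation.
        resp = []
        for x in xs:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: update mixing weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var_k = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            variances[k] = max(var_k, 1e-6)  # guard against collapse
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in xs
    )
    return weights, means, variances, loglik

# Synthetic, deterministic data (not study data): 160 values spread like
# N(0, 1) and 40 values spread like N(6, 1), built from evenly spaced quantiles.
values = ([NormalDist(0, 1).inv_cdf((i + 0.5) / 160) for i in range(160)]
          + [NormalDist(6, 1).inv_cdf((i + 0.5) / 40) for i in range(40)])

ll_one = fit_one_profile(values)[2]
weights, means, variances, ll_two = fit_two_profiles(values)
bic_one = bic(ll_one, 2, len(values))   # mean + variance
bic_two = bic(ll_two, 5, len(values))   # 2 means + 2 variances + 1 weight
```

On data with two well-separated groups, the two-profile model should yield the lower BIC, which is the kind of evidence used to prefer a given number of profiles. Real analyses would also weigh class sizes and clinical interpretability.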

Validation set
The phenotype assignments in the derivation set were used to train a support vector machine (SVM) classifier, which was used to assign phenotypes in the validation set. We compared patient demographics, characteristics, outcomes, and biomarkers in the derivation and validation sets to test reproducibility and ensure clinical and biological relevance of assigned phenotypes.
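The train-then-assign step can be illustrated with a toy linear SVM. This is a hedged sketch only: the text does not state the kernel, software, or hyperparameters the authors used, so the Pegasos-style stochastic subgradient training, the two hypothetical standardized features, the labels (+1 for 'inflamed', -1 for 'uninflamed'), and all function names below are assumptions.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent on the hinge loss.

    The bias is handled by appending a constant 1.0 to every feature vector,
    so it is regularized together with the other weights.
    """
    rng = random.Random(seed)
    rows = [list(x) + [1.0] for x in X]
    w = [0.0] * len(rows[0])
    order = list(range(len(rows)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)  # decaying step size
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, rows[i]))
            # Shrink step from the L2 penalty.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step only for margin violations.
            if margin < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, rows[i])]
    return w

def predict(w, x):
    score = sum(wj * xj for wj, xj in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1

# Hypothetical derivation-set patients: two z-scored features per patient,
# labeled with phenotypes obtained from a prior clustering step.
X_derivation = [(1.5, 2.0), (2.5, 1.2), (2.0, 2.8), (3.0, 2.2), (1.2, 1.8),
                (-1.5, -2.2), (-2.4, -1.1), (-2.0, -2.6), (-2.9, -2.0), (-1.3, -1.7)]
y_derivation = [1, 1, 1, 1, 1, -1, -1, -1, -1, -1]

w = train_linear_svm(X_derivation, y_derivation)

# Hypothetical validation-set patients receive a phenotype from the classifier.
validation_assignments = [predict(w, x) for x in [(2.5, 2.5), (-2.5, -2.5)]]
```

The design point is that the validation set never enters the clustering step: phenotype labels are learned once in the derivation set and then transferred by the classifier.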

Transcriptomic analyses
Bulk messenger RNA sequencing data were available from day 1 biospecimens for a subset of the cohort recruited between 2019 and 2023. We used DESeq2 (v.1.38.3) to identify differentially expressed genes (DEGs) between the latent profile phenotypes. DEGs were selected based on an absolute log2 fold change ≥ 1 and an adjusted p-value cutoff of 0.05. We conducted Reactome pathway analyses with a Benjamini-Hochberg false discovery rate (FDR) < 0.05 to identify enriched biological pathways, and CIBERSORT analyses, a bulk deconvolution approach, to determine differences in cell subsets between phenotypes.
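The DEG selection rule can be written as a short filter. The sketch below is illustrative and does not reproduce DESeq2, which estimates fold changes and p-values with its own model and filtering. It applies a plain Benjamini-Hochberg adjustment to a list of raw p-values and keeps genes with an absolute log2 fold change of at least 1 and an adjusted p-value below 0.05. Treating the 0.05 cutoff as a strict inequality is an assumption, and the gene labels and numbers are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(genes, log2fc, pvalues, lfc_cutoff=1.0, alpha=0.05):
    """Keep genes passing both the fold-change and adjusted p-value cutoffs."""
    padj = benjamini_hochberg(pvalues)
    return [
        (gene, "over" if lfc > 0 else "under")
        for gene, lfc, q in zip(genes, log2fc, padj)
        if abs(lfc) >= lfc_cutoff and q < alpha
    ]

# Hypothetical results table (not study data).
genes = ["G1", "G2", "G3", "G4"]
log2fc = [2.1, -1.4, 0.3, 1.8]
pvals = [0.001, 0.01, 0.02, 0.5]

padj = benjamini_hochberg(pvals)
degs = select_degs(genes, log2fc, pvals)
```

Here G3 fails the fold-change cutoff and G4 fails the adjusted p-value cutoff, so only G1 (overexpressed) and G2 (underexpressed) are retained.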

Inference of cell types underlying phenotypes
We sought to gain granular insight at a single-cell level into immune cell subpopulations associated with latent profile phenotypes. To achieve this, we first integrated three single-cell RNA sequencing datasets, which included data on neutrophil subsets among critically ill adults, a vast majority of whom had COVID-19.(21-23) We calculated a composite gene score by subtracting the geometric mean of underexpressed genes from the geometric mean of overexpressed genes, identified through DEG analyses comparing phenotypes. We inferred differences in the abundance of cell subsets between phenotypes by referencing the composite gene score against the integrated single-cell dataset.
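The composite gene score described above reduces to one subtraction per cell or sample. The following sketch shows that arithmetic under stated assumptions: expression values are taken to be strictly positive (a geometric mean is undefined otherwise), genes absent from the reference are skipped, and the function name and numeric values are hypothetical. The gene symbols are borrowed from the Figure 2 legend purely as labels.

```python
from statistics import geometric_mean

def composite_gene_score(expression, over_genes, under_genes):
    """Geometric mean of overexpressed genes minus that of underexpressed genes.

    `expression` maps gene symbol -> strictly positive expression value for a
    single cell or sample. Genes missing from `expression` are skipped.
    """
    over = [expression[g] for g in over_genes if g in expression]
    under = [expression[g] for g in under_genes if g in expression]
    if not over or not under:
        raise ValueError("need at least one measured gene in each set")
    return geometric_mean(over) - geometric_mean(under)

# Hypothetical expression values for one cell (not study data).
cell = {"ELANE": 8.0, "PRTN3": 2.0, "CMTM2": 1.0, "PRLR": 4.0}
score = composite_gene_score(
    cell,
    over_genes=["ELANE", "PRTN3", "CTSG"],   # CTSG is not measured, so skipped
    under_genes=["CMTM2", "PRLR"],
)
```

A positive score indicates that the cell expresses the overexpressed set more strongly than the underexpressed set; projecting this score across cells is what allows cell subsets to be compared.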

Comparison with established gene-expression pediatric septic shock endotypes
A subset of patients in the cohort had existing assignments as endotypes A or B based on historical data using a 100-gene panel on the Nanostring nCounter platform. Briefly, image analysis of gene-expression mosaics was previously used to assign pediatric septic shock endotypes, with endotype A being characterized by a repressed adaptive immune response relative to endotype B.(10)

Statistical analyses
We assessed differences in demographic and clinical characteristics between groups by non-parametric Kruskal-Wallis tests for continuous variables and χ2 tests for categorical variables. Multivariate logistic regression models were used to assess the association between phenotype and outcomes of interest and adjusted for era of enrollment (2013-2023 vs. 2003-2012), patient age, pediatric risk of mortality score (PRISM III),(24) presence of comorbidity, and immunocompromised status. Interactions between phenotype and commonly used sepsis therapies on clinical outcomes were tested based on results of binary logistic regression models adjusted for age and PRISM III score. Pearson χ2 test was used to test the overlap between established gene-expression endotypes and latent profile phenotypes. Kaplan-Meier curves were used to estimate differences in survival comparing endotypes, phenotypes, and an integrated subclass assignment scheme where we considered outputs of both these approaches. The relative risk of 28-day mortality among subclasses was compared by Cox regression analyses. A two-tailed p-value < 0.05 was considered statistically significant.

Role of the funding source
The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH (U.S.).

Results
The overview of the study and analyses is detailed in Supplementary Fig. 1. A total of 1,395 patients met the inclusion criteria for the study, of whom we excluded 324 patients who did not receive any vasoactive support. The median age of the patients included in the study (n = 1,071) was 5.3 years (quartile 1: 1.7; quartile 3: 11.0 years). The derivation set comprised 646 patients and the validation set included 425 patients.
Latent profile analyses in the derivation set revealed two phenotypes. Differences in standardized variables between the two phenotypes are shown in Fig. 1. One of the phenotypes (n = 126, 19.5%) was characterized by high Angiopoietin-2/Tie-2 ratio, Angiopoietin-2, soluble thrombomodulin (sTM), interleukin 8 (IL-8), and intercellular adhesion molecule 1 (ICAM-1) and low Tie-2 and Angiopoietin-1, which we designated as the 'inflamed' phenotype. This group was characterized by a high serum creatinine, blood urea nitrogen (BUN), lactate, a high international normalized ratio (INR), and low platelet counts. We labeled the remaining patients (n = 520, 80.5%), characterized by the absence of such features, as the 'uninflamed' phenotype.
Table 1 shows the comparisons between phenotypes in the derivation and validation sets, the latter based on the assignments of our SVM classifier. There were no differences in age and sex comparing phenotypes. Although patients who were 'inflamed' were more likely to have had a history of oncologic disease or bone marrow transplantation than those 'uninflamed' in the derivation set, there were no statistically significant differences in the validation set. Patients with an 'inflamed' phenotype had a trend toward higher rates of positive blood cultures in the derivation set (26.2% vs. 19.2%, p = 0.08), which reached statistical significance in the validation set (33.8% vs. 20.6%, p = 0.016), relative to those 'uninflamed'. There were no significant differences in the type of pathogen. Patients with an 'inflamed' phenotype had higher baseline illness severity and significantly worse clinical outcomes in the derivation and validation sets. Finally, patients with an 'inflamed' phenotype were more likely to have been prescribed adjunctive corticosteroids by treating physicians, relative to those 'uninflamed'.
Patients with an 'inflamed' phenotype had over 5-fold higher odds of 7-day mortality (adj. OR 5.6, 95% CI: 3.6-8.6, p < 0.001), over 4-fold higher odds of 28-day mortality (adj. OR 4.4, 95% CI: 3.0-6.4, p < 0.001), and nearly 4-fold higher odds of complicated course (adj. OR 3.9, 95% CI: 2.8-5.5, p < 0.001) relative to those 'uninflamed'. Results of interactions between phenotypes and common sepsis therapies on patient outcomes are detailed in Table 2. Patients with an 'inflamed' phenotype were more likely to have received ≥ 100 ml/kg of fluid on day 1 of PICU admission, ≥ 2 vasoactive agents, and corticosteroids, and to have required intubation and continuous renal replacement therapy (CRRT) support, with commensurately worse outcomes, relative to those 'uninflamed'. We did not identify any significant interaction between phenotype and sepsis therapies on outcomes, with one exception. Patients with an 'inflamed' phenotype who received ≥ 2 antimicrobial therapies had a significantly higher rate of complicated course in comparison with those 'uninflamed' who received ≥ 2 antimicrobial therapies (65.5% vs. 26.6%, interaction p-value 0.021).
Transcriptomic data were available for 144 patients. We identified 44 differentially expressed genes (DEGs) when comparing patients with 'inflamed' (n = 17) vs. 'uninflamed' phenotype (n = 127), of which 25 genes were overexpressed and 19 were underexpressed. Biological pathways enriched among patients with an 'inflamed' phenotype relative to those 'uninflamed' corresponded to activation of the immune system, cytokine signaling, neutrophil degranulation, and antimicrobial peptides. CIBERSORT analyses identified that the proportion of neutrophils was lower among patients with an 'inflamed' phenotype relative to those 'uninflamed'. Expression data were available for 14 overexpressed and 5 underexpressed genes, identified through DEG analyses, in the integrated single-cell dataset. After correction for multiple comparisons, genes overexpressed among those with an 'inflamed' phenotype corresponded to those expressed by developing neutrophils, proliferating T lymphocytes/Natural Killer (NK) cells, and NK cells. In contrast, genes underexpressed among those with an 'inflamed' phenotype corresponded to those expressed by mature neutrophils. These data are shown in Fig. 2, with additional details presented in the Online Supplement.
A total of 233 patients in the study had data on established gene-expression endotype and latent profile phenotype assignments. There was no statistically significant association between endotypes and phenotypes in the cohort (Pearson χ2 test, p-value of 0.08). Figure 3 shows the Kaplan-Meier survival curves based on gene-expression endotype (A vs. B), latent profile phenotype ('inflamed' vs. 'uninflamed'), and an integrated scheme where we considered all four possible combinations of endotype and phenotype assignment. Patients classified as endotype B & 'uninflamed' had the lowest mortality risk. Relative to this group, those classified as endotype A & 'inflamed' had an over 12-fold (RR: 12.5, 95% CI: 3.8, 41.2, p < 0.001) higher relative risk of mortality; those with endotype B & 'inflamed' had a nearly 5-fold increase in mortality (RR: 4.8, 95% CI: 1.1, 20.1, p = 0.032); those with endotype A & 'uninflamed' had an over 3-fold increase in mortality (RR: 3.6, 95% CI: 1.2, 11.1, p = 0.024). There were no statistically significant differences in mortality between the latter two subclasses.
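For readers less familiar with the overlap test, a Pearson χ2 test on a 2×2 table of endotype (A or B) by phenotype can be computed as in the sketch below. The counts are hypothetical and are not the study's cross-tabulation, which is not given in the text. The sketch also assumes no continuity correction, which the text does not specify.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test for a 2x2 table [[a, b], [c, d]].

    Returns the statistic and the p-value for 1 degree of freedom,
    without continuity correction.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: rows are endotype A and B, columns are the two phenotypes.
chi2, p = pearson_chi2_2x2(30, 20, 20, 30)
```

A p-value at or above the 0.05 threshold, as reported for the cohort, means the data do not show that endotype and phenotype assignments track each other.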

Discussion
In this study, we derived and internally validated two pediatric septic shock phenotypes of high prognostic relevance, identified through latent profile analyses. With one exception, there was no evidence for heterogeneous responses to common sepsis treatments on clinical outcomes between phenotypes. Transcriptomic analyses revealed overexpression of genes implicated in innate immune response among those with an 'inflamed' phenotype. Our data suggest a high turnover of neutrophils among this high-risk subset of patients, with additional roles for proliferating T/NK and NK cells. We did not identify a significant overlap between established gene-expression endotypes and the newly derived latent profile phenotypes. Finally, we demonstrated the prognostic relevance of patient 'endophenotypes' based on an integrated subclassification scheme that considered both gene-expression-based endotypes and latent profile phenotypes.
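The phenotypes identified in our study share similarities with the hyper- and hypo-inflammatory phenotypes originally described by Calfee and colleagues among adults with ARDS,(12,13) and subsequently reproduced among pediatric patients;(15) the reactive and uninflamed phenotypes detailed by Heijnen et al. among mechanically ventilated adults;(25) molecular phenotypes of acute kidney injury detailed by Bhatraju et al. among adults;(26) and most recently those identified by Sinha et al. among septic adults.(17) Our data provide further support for the reproducibility of latent profile analyses as a methodologic approach to identify phenotypes, irrespective of assigned 'syndromic' diagnoses, across the spectrum of host developmental age.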
We provide evidence for the prognostic utility of latent profile phenotypes, with the 'inflamed' group being independently associated with significant risk of poor clinical outcomes upon adjusting for multiple potential confounders. Unlike previous studies, beyond the robust prognostic implications, we did not find evidence of HTE of common sepsis therapies on clinical outcomes among phenotypes. The exception to this was that those patients with an 'inflamed' phenotype who received ≥ 2 antimicrobial therapies had a significantly higher rate of complicated course than those with an 'uninflamed' phenotype. While this observation may merely reflect the fact that the 'inflamed' phenotype represented the sickest subset of patients, a few additional considerations are warranted: (a) a lack of appropriate source control, (b) an inability to achieve therapeutic drug levels of antimicrobials, and/or (c) an exaggerated host immune response, despite appropriate antimicrobial coverage, among those 'inflamed'. Of note, our findings mirror those of Sinha et al., where the authors identified that septic adults with a hyperinflammatory phenotype had higher rates of bacteremia than those without.(17) Pending validation, future studies are needed to determine whether precision antibiotic dosing, targeted use of extra-corporeal blood purification strategies, and/or modulation of the innate immune response can improve outcomes among patients with an 'inflamed' phenotype.
We did not identify a differential response to corticosteroids among phenotypes, unlike that observed among adults with COVID-19.(14) The explanations for this difference are likely multifactorial, including the relative homogeneity among patients with COVID-19 compared to the cohort studied, differences in pathogen type (viral- vs. bacterial-induced host response), and compartmentalized effects of corticosteroids based on primary cells affected (lung vs. peripheral blood). In addition, Sinha and colleagues demonstrated differential responses to recombinant activated protein C (rAPC) vs. placebo among phenotypes when re-examining results of the PROWESS-SHOCK trial data.(17) While we demonstrate evidence of a coagulopathy among those with an 'inflamed' phenotype, we cannot comment on whether latent profile phenotypes among children would be expected to have a similar biological response as with adults, given the developmental differences in host response.
Transcriptomic analyses revealed activation of neutrophil pathways consistent with gene-expression studies comparing phenotypes of adult ARDS and patients with sepsis as detailed by Bos et al.(27) Our data suggest a higher turnover of neutrophils among those with an 'inflamed' phenotype, as indicated by the signatures reflective of developing neutrophils relative to those 'uninflamed'. A recent prospective single-cell multi-omics study by Kwok et al. among septic adults corroborates our data, wherein patients with the worst clinical outcomes were characterized by emergency granulopoiesis and the presence of immature neutrophils.(28) Finally, our data suggest a preponderance of additional cell subsets including proliferating T/NK and NK cells among those with an 'inflamed' phenotype. While we cannot confidently speak to whether the phenotypes identified represent 'treatable traits',(29) our data indicate that the groups identified are biologically distinct. Future studies are necessary to determine the mechanistic link between cell subpopulations and phenotypes, and whether targeted modulation of cell subsets can be used as a novel therapeutic approach against sepsis.
We did not identify a significant overlap between established gene-expression-based endotypes and latent profile phenotypes. As such, our data indicate that, fundamentally, these two approaches are sampling different, albeit vitally important, biological facets of the host response in critical illness. While the former broadly reflects the adaptive arm of the host immune response, the latter informs the innate arm of the host response, including microvascular endothelial function. Therefore, we believe that the integrated classification scheme of 'endophenotypes' detailed in our study is of clinical and potential therapeutic relevance. For instance, patients classified as endotype A & 'inflamed' may represent an extreme endophenotype with a significantly increased risk of mortality. This is consistent with the observation that critically ill patients with an overactive innate and repressed adaptive immune response have been consistently associated with the worst clinical outcomes. As such, these patients would be expected to be poor candidates to receive corticosteroids based on their endotype. However, they may potentially benefit from targeted immunomodulation to quell the innate immune response based on their phenotypic assignment. Furthermore, although patients with endotype B & 'inflamed' and endotype A & 'uninflamed' endophenotypes had comparably elevated risk of mortality, the therapeutic implication of such subclass assignment is expected to be diametrically opposite between groups. Although speculative, pending validation in cohort studies and clinical trials, such an integrated subclassification scheme holds the potential to inform better alignment of interventions among those critically ill by providing a comprehensive understanding of patient pathobiology.(30)
Our study has several limitations: (1) the observational nature of the study precludes any inference of causality; (2) despite accounting for era of patient enrollment in our multivariate models, the long study period is a limitation; (3) latent profile phenotypes only considered day 1 data. However, given the temporal and dynamic nature of the host response, it is conceivable that these class assignments may be subject to change over time; (4) an external validation dataset to demonstrate the reproducibility of our SVM model was lacking. Moreover, we did not seek to develop a classifier that used a parsimonious set of predictor variables, as this is better achieved in external validation sets; (5) the number of patients with an 'inflamed' phenotype among whom transcriptomic data were available was limited, which may have contributed to fewer DEGs being identified; (6) the integrated single-cell data used as reference largely comprised samples obtained from adults with COVID-19 critical illness. Given that few single-cell studies to date have captured neutrophil signatures among septic patients, prospective studies that simultaneously capture phenotypic and single-cell transcriptomic data are necessary to directly identify cell subsets underlying phenotypes; (7) the number of patients in whom both established gene-expression endotype and latent profile phenotype class assignments were available was limited; (8) both endotype and phenotype assignments were based on data generated within 24 hours of meeting septic shock criteria and were assumed to reflect baseline differences in host response. However, a significant proportion of patients in the cohort received corticosteroids. It remains plausible that the biological differences in host response among subclasses may reflect those in response to corticosteroids, rather than baseline differences.

Conclusions
In this study, we demonstrate the existence of two phenotypes among children with septic shock, identified through latent profile analyses, with high prognostic value. We provide evidence of upregulated innate immune responses among those with an 'inflamed' phenotype, reflective of signatures of developing neutrophils, proliferating T/NK, and NK cells. The phenotypes did not show overlap with established gene-expression-based adaptive endotypes in pediatric septic shock, nor did they demonstrate a differential response to corticosteroids. We integrated these two promising classification schemes to delineate novel sepsis 'endophenotypes'. Pending validation, such an approach may allow for therapeutic drug selection informed by a comprehensive understanding of patient-level pathobiology.
biobank and conducted experiments, respectively. Drs. Geoffrey Allen (Children's Mercy Hospital, Kansas City, MO) and Jocelyn Grunwell (Children's Healthcare of Atlanta at Egleston, Atlanta, GA) contributed to patient recruitment but did not contribute to the manuscript. Transcriptomic data were made available through the SUBSPACE (Subtyping in Sepsis and Critical Illness) Consortium, funded and managed by Inflammatix, Inc.

Conflicts of interest
Cincinnati Children's Hospital Medical Center (CCHMC) and the estate of the late Dr. Hector R. Wong hold patents for gene-expression-based pediatric septic shock endotypes, reflective of the host adaptive immune system. M.R.A and R.K hold a provisional patent for gene-expression-based multiple organ dysfunction syndrome (MODS) subclass identification, reflective of the innate immune response. Inflammatix is a for-profit company focusing on the development and commercialization of best-in-class host-response diagnostic tests. Y.H.B and T.E.S are employees and/or stockholders of Inflammatix Inc. P.K is a stockholder of Inflammatix Inc.

Contribution of authors
Study conceptualization: M.R.A and R.K. Funding acquisition: M.R.A, N.S.P, and R.K. Data acquisition: M.R.A, J.C.F, N.Z.C, S.L.W, M.T.B, P.N.J, A.J.S, R.L. J.N, , N.J.T, M.Q, B.H, P.K, T.S. Project administration: A.J.L, N.L.S, S.W.S, J.K, and B.Z. Data curation and analyses: M.R.A, M.H, A.R.M, H.Z, Y.H.B, P.K and R.K. Draft of manuscript: M.R.A and M.H. Review and editing of manuscript: all authors. All authors approve the manuscript in its final version.

Figures

Figure 2 Inference
Figure 2 (c.) Composite gene score calculated by subtracting the geometric mean of underexpressed genes (PRLR, HCAR2, RAMP3, SHE, and CMTM2) from the geometric mean of overexpressed genes (CCL4, PRTN3, NEIL3, CENPU, ELANE, SKA3, CEP55, NCAPH, HBB, DEFA4, DEFA3, CTSG, CCL20, and MMP15) among patients with an 'inflamed' phenotype relative to those 'uninflamed', projected on the UMAP of the integrated single-cell dataset. The gene score was scaled as shown in the legend, with genes in red representing those overexpressed and those in blue showing those underexpressed. Bottom panel from left to right: (d.) Projection of overexpressed genes identified among patients with an 'inflamed' phenotype relative to those 'uninflamed' on the integrated single-cell dataset demonstrated that genes corresponded to signatures of developing neutrophils, proliferating T/NK cells, and NK cells. (e.) Projection of underexpressed genes identified among patients with an 'inflamed' phenotype relative to those 'uninflamed' on the integrated single-cell dataset demonstrated that genes corresponded to signatures of mature neutrophils.

Table 1. Demographics, patient characteristics, and clinical outcomes among pediatric septic shock latent profile phenotypes in the derivation and validation sets.

Table 2. Results of tests for interaction between pediatric septic shock latent profile phenotypes and common sepsis therapies on clinical outcomes.