Microarray sample selection for study inclusion
The GSE101702 microarray dataset consisted of whole-blood samples from 52 healthy controls and 107 patients with influenza (63 with mild-moderate and 44 with severe disease). The GSE21802 microarray dataset consisted of whole-blood samples from 4 healthy controls and 36 patients with influenza (20 with mild-moderate and 16 with severe disease).
Cross‐platform normalization
In total, 15,231 genes were detected by both microarray platforms across these two datasets. Prior to batch effect removal, samples clustered by batch along the top two principal component (PC) axes derived from non-normalized gene expression values (Fig 1A). A PCA performed after normalization showed that this batch-based clustering had been eliminated (Fig 1B), consistent with successful cross-platform normalization.
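The before/after check described above can be illustrated with a minimal sketch. The normalization method is not named in this section, so per-gene batch mean-centering is used here purely as a stand-in, and the data are simulated toy values rather than study data. All function names are hypothetical. The sketch projects samples onto the first PC and measures how far apart the two batches sit.

```python
import random
from statistics import mean, pstdev

def center_by_batch(expr, batches):
    """Subtract each batch's per-gene mean (a simple stand-in for batch correction)."""
    corrected = [row[:] for row in expr]
    for label in set(batches):
        idx = [i for i, b in enumerate(batches) if b == label]
        for g in range(len(expr[0])):
            m = mean(expr[i][g] for i in idx)
            for i in idx:
                corrected[i][g] = expr[i][g] - m
    return corrected

def first_pc_scores(expr, n_iter=200):
    """Sample scores on the first principal component, found by power iteration."""
    n, p = len(expr), len(expr[0])
    col_means = [mean(row[g] for row in expr) for g in range(p)]
    x = [[row[g] - col_means[g] for g in range(p)] for row in expr]
    v = [1.0 / p ** 0.5] * p  # starting direction in gene space
    for _ in range(n_iter):
        scores = [sum(r[g] * v[g] for g in range(p)) for r in x]
        w = [sum(x[i][g] * scores[i] for i in range(n)) for g in range(p)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0:
            break
        v = [c / norm for c in w]
    return [sum(r[g] * v[g] for g in range(p)) for r in x]

def batch_separation(scores, batches):
    """Gap between the two batch means on PC1, in units of within-batch spread."""
    labels = sorted(set(batches))
    groups = [[s for s, b in zip(scores, batches) if b == lab] for lab in labels]
    gap = abs(mean(groups[0]) - mean(groups[1]))
    spread = mean(pstdev(g) for g in groups)
    return gap / spread

# Simulated toy data: 8 samples per batch, 20 genes, batch B shifted by +3.
rng = random.Random(0)
batches = ["A"] * 8 + ["B"] * 8
expr = [[rng.gauss(0 if b == "A" else 3, 1) for _ in range(20)] for b in batches]

sep_before = batch_separation(first_pc_scores(expr), batches)
sep_after = batch_separation(first_pc_scores(center_by_batch(expr, batches)), batches)
print(f"PC1 batch separation before: {sep_before:.2f}, after: {sep_after:.2e}")
```

Mean-centering removes only additive batch shifts. Dedicated cross-platform methods also address scale differences and preserve biological covariates, which this sketch does not attempt.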
Influenza patient consensus clustering
Following the normalization of the selected gene expression data, a consensus clustering approach was used to assess transcriptomic profile similarity among these samples. The proportion of features sampled, the number of subsamples, and the proportion of items sampled were set to 100%, 10, and 80%, respectively. This approach classified samples into two patient groups exhibiting significant differences in patterns of gene expression (Fig 2A). Within each subgroup, however, patterns of gene expression were highly similar among samples, as confirmed by the consensus matrix (cluster-consensus score > 0.8), indicating that k = 2 was the optimal number of clusters for this patient subgrouping effort (Fig 2B).
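The resampling logic behind a consensus matrix can be sketched as follows, using the settings reported above (10 subsamples, 80% of items, all features, k = 2). The base clustering algorithm is not named in this section, so k-means with farthest-first initialization is an assumption made for illustration. The data are simulated and all function names are hypothetical.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Lloyd's k-means with farthest-first initialization; returns one label per point."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the previous center if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def consensus_matrix(data, k=2, reps=10, p_item=0.8, seed=0):
    """Fraction of subsamples in which two co-sampled items share a cluster."""
    rng = random.Random(seed)
    n = len(data)
    together = [[0] * n for _ in range(n)]
    sampled = [[0] * n for _ in range(n)]
    size = max(k, round(p_item * n))
    for _ in range(reps):
        idx = rng.sample(range(n), size)  # subsample items; all features are kept
        labels = kmeans([data[i] for i in idx], k)
        for a, i in enumerate(idx):
            for b, j in enumerate(idx):
                sampled[i][j] += 1
                together[i][j] += labels[a] == labels[b]
    return [[together[i][j] / sampled[i][j] if sampled[i][j] else 0.0
             for j in range(n)] for i in range(n)]

# Simulated toy data: two well-separated groups of 5 samples, 3 features each.
rng = random.Random(1)
group_a = [[rng.gauss(0, 0.5) for _ in range(3)] for _ in range(5)]
group_b = [[rng.gauss(10, 0.5) for _ in range(3)] for _ in range(5)]
cm = consensus_matrix(group_a + group_b, k=2, reps=10, p_item=0.8)

# Cluster consensus: mean pairwise consensus among members of one group.
pairs_a = [(i, j) for i in range(5) for j in range(i + 1, 5)]
cluster_consensus_a = sum(cm[i][j] for i, j in pairs_a) / len(pairs_a)
print(f"cluster consensus, group A: {cluster_consensus_a:.2f}")
```

In practice k is chosen by repeating this procedure over a range of k values and comparing the resulting consensus distributions; only a single k is shown here.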
The clinical characteristics of participants in these two subgroups were assessed by investigating the age, sex, and disease severity of patients included in the GSE101702 dataset. While no significant differences were observed between subgroups I and II with respect to patient age or sex, significantly more patients in subgroup II had severe influenza requiring IMV (Fig 3). ANOVA-based analyses further confirmed that this transcriptomic classification was an age-independent factor associated with influenza disease severity (Table 1).
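A comparison of severity proportions between two subgroups can be illustrated with a Pearson chi-square test on a 2 x 2 table. The specific test used for this comparison is not stated in this section, so the choice below is an assumption, and the counts are invented solely to exercise the code; they are not the study's data. No continuity correction is applied.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic and p-value (1 df) for a 2 x 2 count table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are subgroups, columns are (mild-moderate, severe).
hypothetical_table = [[40, 10],
                      [20, 30]]
chi2, p_value = chi_square_2x2(hypothetical_table)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2g}")
```

This sketch does not adjust for age. The age-independence reported above relies on the ANOVA-based analysis, which is not reproduced here.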
Gene co-expression module identification in influenza patient subgroups
To clarify which genes were differentially expressed in the influenza patient cohorts, gene expression levels were compared between healthy controls and influenza patients. GSEA showed that specific genes in each subgroup were expressed at higher levels relative to the control group (FDR < 0.05) (Fig 4A-B).
The unique transcriptomic signatures of the individual influenza patient subgroups were further examined by conducting a WGCNA-based assessment of genes that were differentially regulated in the established patient subgroups (Supplementary Figure 1A-C). Pairwise comparisons of gene expression in these two groups ultimately identified 2806 and 2466 genes exhibiting subgroup-specific regulatory patterns in subgroups I and II, respectively (Table 2). These subgroup-specific genes were then used to conduct a WGCNA, which yielded 8 modules (Fig 4C). Relationships among these WGCNA modules and influenza patient subgroups are detailed in Fig 4C and Table 2. Subgroup I was associated with the tan, red, magenta, and grey modules, whereas subgroup II was associated with the salmon, cyan, black, and blue modules (Fig 4C).
Subsequent GO and KEGG analyses revealed significant enrichment of the black, cyan, and salmon modules for genes associated with the immune response and inflammation, including the cytokine-mediated signaling pathway, leukocyte activation or migration, cytokine production, response to viral or bacterial infection, complement and coagulation cascades, and platelet activation. In contrast, the red and magenta modules were primarily associated with ribosome-related processes such as ribosome biogenesis, ribonucleoprotein complex biogenesis, and rRNA metabolic process (Fig 5A-B). For each of these modules, one pathway was selected at random to assess the directionality of its regulation in the identified patient subgroups. This revealed the upregulation of osteoclast differentiation, platelet activation, and cAMP signaling in subgroup II, as well as the upregulation of the ribosome pathway in subgroup I (Fig 5C).
Associations between WGCNA modules and patient clinical characteristics
To gain insight regarding the relationship between the identified WGCNA modules and patient clinical characteristics, correlations between age or influenza disease severity and module eigengenes were assessed. The tan, black, and salmon modules were found to be positively correlated with disease severity via this approach, whereas they were unrelated to patient age. In contrast, the red module was negatively correlated with influenza severity and age (Fig 5D).
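A module eigengene is conventionally defined as the first principal component of a module's standardized expression matrix, oriented to follow the module's average expression. The minimal sketch below computes one by power iteration and correlates it with a binary severity code. The data are simulated, severity coding as 0/1 is an assumption, and all function names are hypothetical.

```python
import random
from statistics import mean, pstdev

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den if den else 0.0

def module_eigengene(module_expr, n_iter=100):
    """First principal component across samples of a genes x samples module matrix."""
    z = []
    for gene in module_expr:
        m, s = mean(gene), pstdev(gene)
        if s > 0:  # constant genes carry no information and are dropped
            z.append([(v - m) / s for v in gene])
    n = len(z[0])
    avg = [mean(col) for col in zip(*z)]  # average standardized profile
    u = avg[:]  # start from the average profile (a uniform start maps to zero)
    for _ in range(n_iter):
        g = [sum(row[j] * u[j] for j in range(n)) for row in z]
        w = [sum(z[i][j] * g[i] for i in range(len(z))) for j in range(n)]
        norm = sum(c * c for c in w) ** 0.5
        if norm == 0:
            break
        u = [c / norm for c in w]
    if pearson(u, avg) < 0:  # orient the eigengene along average expression
        u = [-c for c in u]
    return u

# Simulated toy module: 6 genes, 10 samples, expression rising with severity.
rng = random.Random(2)
severity = [0] * 5 + [1] * 5  # assumed coding: 0 = mild-moderate, 1 = severe
module = [[2 * s + rng.gauss(0, 0.3) for s in severity] for _ in range(6)]

eigengene = module_eigengene(module)
r_severity = pearson(eigengene, severity)
print(f"eigengene vs severity: r = {r_severity:.2f}")
```

The same `pearson` call with patient age in place of `severity` would give the age correlation; significance testing of the correlation coefficients is omitted.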