1. WGCNA analysis identified key coexpressed gene modules
On the basis of prespecified criteria, 13547 DEGs were identified from the TCGA-BLCA & GTEx dataset. The heatmap was shown in Supplementary Fig 1, among which 644 immune-related DEGs were intersected and sent to enrichment analysis and WGCNA analysis. Fig 1a was the heatmap of the expressions of the 644 immune-related DEGs between tumor and normal samples. GO-BP enrichment of the immune-related DEGs showed enrichment of immune-related terms like cell chemotaxis, cellular response to tumor necrosis factor, response to tumor necrosis factor, positive regulation of cytokine production, regulation of chemotaxis, leukocyte chemotaxis, Fc receptor signaling pathway, positive regulation of defense response (Fig 1 b). KEGG enrichment of the immune-related DEGs also revealed enrichment of immune-related terms like cytokine-cytokine receptor interaction, T cell receptor signaling pathway, viral protein interaction with cytokine and cytokine receptor, chemokine signaling pathway, MAPK signaling pathway, B cell receptor signaling pathway, natural killer cell mediated cytotoxicity, and Kaposi sarcoma-associated herpesvirus infection (Fig 1 c).
WGCNA analysis of the 644 immune-related DEGs defined four distinct coexpression gene modules (Fig 1d-e), among which the yellow, turquoise, and brown modules were the most significantly coexpressed modules with the tumor and normal phenotypes, and the coexpression coefficients were 0.72, 0.58, and -0.23, with p-values of 1e-72, 2e-40, and 1e-06, respectively.
2. Survival analysis of the key coexpression modules’ genes
Within the TCGA & GTEx dataset, 48 genes in module yellow, 164 genes in module turquoise, and 53 genes in module brown were intersected and further employed in survival analysis. Log-rank test revealed that genes AGTR1, AHNAK, AKT3, ANXA6, CALR, CSRP1, CXCL12, DES, EDNRA, ELN, FGFR1, GREM1, HGF, HLA-C, IGF1, IKBKB, IL17RD, ILK, JUN, KITLG, LRP1, NFATC1, NFATC4, NFKBIZ, NFYA, NR3C2, NRP2, OGN, PDGFC, PGR, PI15, PPP3CB, PPP3CC, PSMC1, PSME2, PTGER3, PTPN6, RBP1, RORA, S100A10, SEMA3A, TEK, TGFB3, THBS1, THRA, TNFRSF6B, and VIM had prognostic values (Fig 2 a-u1). Among these prognostic genes, the low expressions of HLA-C, IKBKB, NFKBIZ, NFYA, PSME2, and PTPN6 were significantly associated with good outcomes, while the other genes were the opposite. In univariable Cox regression analysis, HLA-C, PSME2, NFYA, THBS1, CXCL12, S100A10, RBP1, PI15, LRP1, ELN, CSRP1, JUN, AHNAK, NFKBIZ, DES, HGF, ANXA6, VIM, PPP3CB, PPP3CC, NFATC1, NFATC4, AKT3, PTPN6, SEMA3A, EDNRA, GREM1, IGF1, KITLG, OGN, PDGFC, TGFB3, AGTR1, FGFR1, IL17RD, NR3C2, PTGER3, RORA, TEK, THRA, CALR, PSMC1, ILK, IKBKB, PGR, and TNFRSF6B were significantly associated with overall survival (OS) of bladder cancer (Fig 2 v1). The mutation rates of all these prognostic genes were displayed as an oncoplot (Fig 2 w1). AHNAK showed the highest mutation rates with 9% samples.
3. Construction and validation of IRGPI.
On multivariable Cox regression analysis, an IRGPI model which comprises 11 immune-related genes was built. The Riskscore = (-0.314)×HLA-C + (0.172)×RBP1 + (0.679)×AHNAK + (-0.296)×NFKBIZ + (0.551)×NFATC1 + (-0.314)×PTPN6 + (-0.465)×NRP2 + (0.538)×PTGER3 + (0.977)×CALR + (-0.411)×IKBKB + (40.744)×TNFRSF6B (Fig 3 a). A risk score was calculated for each sample of the training and validation cohort and according to the median of the score, all samples were classified as high- and low-risk subgroups. Log-rank test identified that the high-risk group patients had poorer survival than the low-risk group ones in both two cohorts, and the p-value in the training cohort and the validation cohort was <0.001 and 0.021, respectively (Fig 3 b-c). In the training cohort, GSEA analysis enriched seven immune-related KEGG terms, i.e. chemokine signaling pathway, cytokine cytokine receptor interaction, FC gamma R mediated phagocytosis, leukocyte transendothelial migration, systemic lupus erythematosus, TGF beta signaling pathway, and wnt signaling pathway (Fig 3 d). The percentages of dead and alive patients were 60% and 40% in the high-risk subgroup, and 28% and 72% in the low-risk subgroup, respectively (Fig 3 e). Survival analysis of the training cohort revealed that both male and female patients in the high-risk group gained a shorter survival than that of the low-risk group (Fig 3 f-g), with p-values < 0.001. The mutation rates of most top 20 highest mutational genes in the high-risk group were lower than those in the low-risk group, except TP53 (Fig 3 h-i). The expressions of several members of the HLA family genes like HLA-H, HLA-F, HLA-B, HLA-J, HLA-DMA, HLA-E, and HLA-C were downregulated in the high-risk group compared with the low-risk group (Fig 3 j). The expression of immune checkpoints genes like SIGLEC15 and TBX2 were lower expressed in the high-risk group than in the low-risk group, while HAVCR2 and PDCD1LG2 were the opposite (Fig 3 k). The fractions of some immune cells like plasma cells, T cells CD8, T cells CD4 memory activated, T cells follicular helper, T cells regulatory (Tregs), and dendritic cells activated were higher in the low-risk subgroup than in the high-risk group, while the fractions of macrophages M0, macrophages M2, and mast cells resting were reverse (Fig 3 l). Additionally, a lower score of the immune functions like APC_co_stimulation (APC, Antigen Presenting Cells), B_cells, CCR (chemokine receptors), macrophages, mast_cells, pDCs (plasmacytoid dendritic cells), and Treg was found in the low-risk group compared with the high-risk group, except NK_cells (Fig 3 m). The correlations of the 11 IRGPI genes with the 22 immune cell types revealed that macrophages M1 and HLA-C showed the strongest positive correlation, while the most negative correlation occurred between dendritic cells activated and PTGER3 (Fig 3 n). ‘Furthermore, TMB was significantly lower among the samples in the high-risk group vs the low-risk ones, p-value = 0.034 (Fig 3 o). TMB was negatively correlated with the risk score, for which the coefficient was -0.12 and the p-value was 0.02 (Fig 3 p). Samples were further divided into high-TMB or low-TMB subgroups based on the median. Log-rank test identified that the high-TMB samples confirmed longer survival than the low-TMB samples, p-value < 0.001 (Fig 3 q). More importantly, when investigating risk and TMB in combination, the high-TMB & low-risk subgroup showed the most significant survival advantage, followed by the low-TMB & low-risk subgroup, then the high-TMB & high-risk subgroup, and the low-TMB & high-risk subgroup had the worst prognosis of all groups (Fig 3r). The risk curve indicated increased deaths with increasing risk scores, and there was a negative correlation between the risk score and OS, for which the coefficient was -0.18 and the p-value was 2e-4 (Fig 3s-u). Finally, all the 1-year, 3-year, and 5-year calibration curves in the training cohort and the validation cohort showed good agreement between predicted and observed (the diagonal) outcomes (Fig 3 v, Supplementary Fig 1 a). These outcomes were successfully validated by the ROC curves in the way that the area under the curve (AUC) at 1-year, 3-year, and 5-year in the training cohort was 0.690, 0.722, and 0.738, respectively (Fig 3 w), while the AUC at 1-year, 3-year, and 5-year in the validation cohort was 0.588, 0.636, and 0.609, respectively (Supplementary Fig 1 b).
The clinicopathologic features of the TCGA cohort classified by risk score were displayed in Fig 4 a. The stages and immune subtypes were significantly statistically different between IRGPI-high and IRGPI-low subgroups, and both the p-values were 0.001 (Fig 4 b-c).
4. Survival analysis based on the immune cells and immune functions in the TCGA cohort
Log-rank test revealed that patients with a higher score of immune cells like dendritic cells activated, macrophages M1, NK cells activated, plasma cells, T cells CD4 memory activated, T cells CD8, and T cells follicular helper; immune functions like APC_co_inhhibition, CD8+_T_cells, check-point, cytolitic_activity, DCs, HLA, inflammation_promoting, MHC_class_I, NK_cells, T_cell_co_inhibition, T_cell_co_stimulation, T_helper_cells, Tfh, Th1_cells, Th2_cells, TIL, and Treg tended to have longer survivals than lower score ones. In contrast, a higher score of immune cells like eosinophils, macrophages M0, macrophages M2, monocytes, neutrophils, and T cells CD4 memory resting; immune functions like macrophages, mast_cells, and Type_II_IFN_Response were associated with poorer outcomes compared with lower score ones (Fig 5 a-g1). On univariable Cox regression analysis, immune cells like T cells CD8, T cells CD4 memory activated, and macrophages M0 (Fig 5 h1) and immune functions like mast_cells and NK_cells were significant predictors for OS (Fig 5 j1). On multivariable Cox regression analysis, only macrophages M0, mast_cells, and NK_cells were considered to be independent prognostic predictors (Fig 5 i1, Fig 5 k1).
5. Construction and validation of a clinicopathologic-IRGPI model
In the TCGA cohort, univariable Cox regression analysis revealed that age, stage, and IRGPI risk score model were significantly associated with OS (Fig 6 a). Further multivariable Cox regression analysis also confirmed these three factors were independent prognostic predictors for OS (Fig 6 b). Thus we developed the nomogram model which enrolled the IRGPI and several clinicopathologic features (Fig 6 c). Our nomogram model (AUC: 0.722) outperformed the TIDE model (AUC: 0.491) and the TIS (tumor inflammation signature) model[21-24] (AUC: 0.459) (Fig 6 d). The 1-year, 3-year, and 5-year calibration curves in the training cohort and the validation cohort indicated excellent calibrations of our model (Fig 6 e, Fig 6 g). The AUC of our model also represented superior performances of predictions. In the training cohort, 1-year, 3-year, and 5-year AUCs were 0.732, 0.749, and 0.752, respectively (Fig 6 f), while in the validation cohort, they were 0.860, 0.827, and 0.812, respectively (Fig 6 h).