Immune Microenvironment Landscape of ccRCC
After using the ssGSEA method to measure the immune function of The Cancer Genome Atlas (TCGA) ccRCC transcriptome, we obtained the corresponding ssGSEA score of each sample according to the 29 immune-related pathways, and immune cell infusion was used to evaluate the immune function of ccRCC, as shown in Supplementary Table 1.
Consensus Clustering According to Molecular Subtypes
The results of consensus clustering of the ssGSEA scores were visualized using an empirical cumulative distribution function (CDF) plot and a delta area plot (Figure 1A and C). K=5 was regarded as the best value to cut these clusters, and the consensus matrix heat map is shown in Figure 1B. Therefore, the total ccRCC dataset was divided into 5 clusters: C1, C2, C3, C4, and C5. By using the ESTIMATE algorithm, we obtained three scores: the immune score, stromal score and ESTIMATE score (shown in Supplementary Table 2). A heatmap of the 5 clusters with their ssGSEA and ESTIMATE scores is shown in Figure 1D. Cluster C3 showed obvious differences compared with the other clusters.
For different clusters, there was no statistical difference in age (Figure 2A, P=0.1181), sex (Figure 2B, P=0.3233) or N stage (Figure 2E, P=0.3277). However, there were significant differences in race (Figure 2C, P=0.0069), T stage (Figure 2D, P=0.0001), M stage (Figure 2F, P=0.0157) and grade (Figure 2H, P<0.0001). Regarding the prognosis of patients, there were significant differences between the five clusters. Cluster 4 had the best prognosis for both OS (Figure 2G) and PFS (Figure 2I). Moreover, the tumor microenvironment using the ESTIMATE algorithm was statistically significant between the five clusters (Figure 3A-C). However, concerning the stromal score, there was no significant difference between the C4 and C5 clusters (Figure 3A). For the three ICIs and PD1 and PDL1, only some clusters showed significant differences (Figure 3D and E). Regarding CTLA4, except for the expression of CTLA4 between cluster 1 and cluster 4, there were significant differences in the other clusters (Figure 3F). Notably, almost all HLA genes showed significant differences between the five clusters (Figure 3G).
WGCNA
A total of 4555 genes with great variation in different samples were subjected to WGCNA. After it was clear that there was no missing value, an outlier was deleted (Figure 4A). β = 3 was used to establish a proximity matrix to make our gene distribution conform to the scale-free network (Figure 4B). After clustering genes and using the dynamic cutting method to cut the tree into different modules and merge similar modules, we obtained a total of 8 modules (Figure 4C). Among them, the red module was positively correlated with our immune cluster classification (Figure 5D). The red module showed the highest GS and MM based on an intramodular analysis (Figure 5E) and was regarded as a significant module. Based on the cutoff value for hub genes (MM > 0.7 and cor GS > 0.4), we ultimately obtained 14 hub genes: CD79A, FCRLA, GPR174, GZMK, JCHAIN, LAG3, MZB1, PDCD1, PLA2G2D and POU2AF1 (supplementary table 3).
Functional Enrichment Analysis
In the GO analysis, the significant modules were mostly enriched in cell killing, the external side of the plasma membrane and receptor ligand activity (Figure 5A). In the KEGG analysis, the modules were mostly enriched in cytokine−cytokine receptor interactions (Figure 5B). The main enriched functions and pathways of the 14 hub genes were related to immunity. Examples included the regulation of regulatory T cell differentiation, the regulation of the T cell apoptotic process and the negative regulation of T cell activation (Figure 5C).
Identification of the prognostic factors of nomogram
Eleven prognostic factors, including three hub genes, MZB1, GZMK and LAG3 were identified by LASSO regression analysis(supplementary figure 1).Multivariate Cox regression analysis showed that the final independent prognostic factors were LAG3, GZMK, radiotherapy, age, T stage, M stage and grade. The results of the multivariate Cox regression analysis are shown in supplementary figure 2, and the nomograms are shown in Figure 6A. The areas under the curve (AUCs) of the time-dependent ROC curves were 0.839, 0.802 and 0.769 for 1-, 3-, and 5-year OS, respectively (Figure 6B). The calibration curves of the 1-, 3-, and 5-year survival rates also showed that the nomogram had good prediction ability (Figure 6C-E).