Identification of immune-related genes
A four-step analysis of 526 tumor samples revealed a total of 770 immune-related genes(fig 1 A). All the 526 patients were randomly divided into training (n=263) and test (n=263) sets. A summary of patients’ age, sex, and the TNM stage across the two sets is outlined in Table I. We found no statistical differences (P > 0.05) in all these variables. We used random forest analysis to select 38 genes. Moreover, LASSO analysis resulted in 13 genes(fig 1 B, C), of which 11, namely ADRB2, AGER, BIRC5, CDNF, CXCL2, ESR2, F2RL1, IFITM1, PTX3, SEMA3G, and TCF7L2, were confirmed to be immune-related genes associated with survival following univariate analysis (Table II).
Construction and validation of a prognostic prediction signature
Univariate Cox regression analysis allowed construction of an immune-related prognostic signature of the 11 hub genes. The formula used was as follows: Risk Score=-0.00926×ADRB2+0.00502× AGER+0.00787×BIRC5-0.00552×CDNF+0.000911× CXCL2+0.00167×ESR2-0.0109×F2RL1+0.0255×IFITM1+0.00840×PTX3-0.02760×SEMA3G -0.0433×TCF7L2. Thereafter, we used the Kaplan-Meier and ROC analyses to calculate risk scores in the training set. Results revealed significantly shorter overall survival time in the high-risk score group, after KM analysis (we divided 50% of the 263 patients with high-risk score into high-risk score group, P value <0.001, fig 2 A). ROC curves showed that the constructed signature had good accuracy, with AUC values of 0.71, 0.71 and 0.65 at 1-, 3- and 5-years, respectively (fig 2 B C). Moreover, the high-risk score group recorded a shorter survival time in the test set (P< 0.01, fig 2 D). We also analyzed the relationship between risk scores with different clinical information (tumor, lymph node, metastasis degrees, and grades) and found that the degree of metastasis was strongly correlated with risk scores(fig 2 E F). Furthermore, a combination of metastasis degrees and risk score generated a better cutoff value in ROC analysis(fig 2 G).
Functional and pathway enrichment analysis
GO analysis revealed enrichment of the following biology processes in ccRcc specimens; regulation of chemotaxis, regulation of symbiosis, and encompassing mutualism through parasitism(fig 3 A). On the other hand, KEGG analysis results showed that the 11 immune-related genes might be playing key roles in colorectal and breast cancers. Pathway analysis showed that hippo signaling and Kaposi sarcoma-associated herpesvirus infection were strongly associated with the 11 immune-related genes(fig 3 B). A further analysis of samples with different risk scores revealed an additional 4190 differentially expressed genes (P<0.01), while results of GSEA analysis revealed five pathways with different responses between two groups (P<0.01, fig 3 C).
Analysis of hub survival immune related genes
A comparison in expression levels in genes between training set (n=263) and normal tissue samples (n=71) revealed upregulation of 4 genes (IFITM1, CXCL2, AGER, and BIRC5) and downregulation of 3 genes (PTX3, F2RL1, and SEMA3G) in the tumor tissue samples (P<0.001, fig 4 A B). Conversely, 4 genes, namely ADRB2, CDNF, ESR2, and TCF7L2, did not exhibit any significant differences between tumor and normal tissues (P>0.05). A search of the 11 genes in the TCGA KIRC cohort (normal sample=72, and the tumor sample =533) in the web of Ualcan further revealed upregulation of 4 genes (IFITM1, CXCL2, AGER, and BIRC5) and downregulation of 5 others (PTX3, F2RL1, SEMA3G, CDNF, and ESR2) in tumor tissues. KM curves from the Ualcan database revealed 10 genes (ADRB2, AGER, BIRC5, CXCL2, ESR2, F2RL1, IFITM1, PTX3, SEMA3G, TCF7L2) which had statistical differences (P<0.05, fig 4 C). Notably, patients with upregulated ADRB2, F2RL1, SEMA3G, and TCF7L2, or downregulated AGER, BIRC5, CXCL2, ESR2, IFITM1, and PTX3, exhibited better survival rates. Furthermore, Expression levels of 7 genes were significantly correlated with levels of DNA methylation. Among the genes, TCF7L2, F2RL1, BIRC5, and CDNF significantly upregulated, whereas CXCL2, ESR2, and SEMA3G downregulated DNA methylation levels in the primary tumor tissues (P<0.05, fig 4 D).
Relationship between immune cell components with expression of immune-related genes and immunotherapy response
Results of ImmuCellAI analysis revealed that the high-risk score group of the training set had a higher percentage of 10 immune cells, namely Exhausted cells, Tr1 cells, nTreg cells, Th1 cells, Tfh cells, Effect memory cells, MAIT cells, Macrophage cells, CD8 T cells, and iTreg cells (fig 5 A). Conversely, a lower percentage was observed in 6 immune cells, namely CD4 native cells, CD8 native cells, Th17 cells, central memory cells, Neutrophil cells, and Gamma delta cells (fig 5 A, B).