Identification of anoikis-related genes in ccRCC
The flowchart of this study is shown in Figure 1. We obtained RNA transcriptome data and the corresponding clinical data of 607 ccRCC patients from the TCGA database. Among the 652 ARGs identified from the GeneCards database, 32 were significantly differentially expressed between tumor samples and noncancerous adjacent tissues (NATs), including 17 upregulated and 15 downregulated in tumors (FIGURE 2A, B).
The biological roles of anoikis-related genes in different subgroups of ccRCC and functional analysis
We identified 19 prognostic genes using univariate Cox regression analysis (FIGURE 3A). The regulatory relationships among these prognostic genes are shown in FIGURE 3B. Based on these prognostic genes, the ccRCC cohort was divided into three distinct groups, named Group A, Group B, and Group C, using the “ConsensusClusterPlus” package with K = 3 (FIGURE 3C). The PCA and tSNE score plots showed a clear separation among the three subgroups (FIGURE 3E, F). As shown in Figure 3D, patients in Group A had a significantly better prognosis, whereas those in Group B had the worst prognosis (p < 0.001). The relationship between the expression of the prognostic genes and clinicopathological parameters, such as age, gender, and TNM stage (T stage, N stage, and M stage), is shown in the heatmap (FIGURE 4B). PTHLH, UBE2C, MMP9, and BIRC5 were highly expressed in Group B but expressed at low levels in Group A and Group C (FIGURE 4A). In contrast, PLG and OCLN were highly expressed in Group A but expressed at low levels in Group B and Group C.
Immune cell infiltration in the three groups was then analyzed. As shown in Figure 4C, activated CD4 T cells and natural killer T cells were predominant in Group B, while type 2 T helper cells were predominant in Group A (p < 0.05).
To explore the underlying mechanisms, functional enrichment analysis was performed to reveal the potential biological functions associated with the ARGs used for clustering. GSVA showed that Group A was mainly enriched in several terms, such as Vibrio cholerae infection, aldosterone-regulated sodium reabsorption, tight junction, and fatty acid metabolism. In contrast, Group B, which had the worst prognosis, was mainly enriched in the JAK-STAT signaling pathway, cytokine-cytokine receptor interaction, and the Toll-like receptor signaling pathway (FIGURE 5A, B, C). GSEA likewise showed that Group B was significantly enriched in several terms, such as the chemokine signaling pathway, hematopoietic cell lineage, primary immunodeficiency, and the T cell receptor signaling pathway (FIGURE 5D, E, F). Taken together, these results suggest that these signaling pathways may be associated with poor prognosis in ccRCC patients.
Development and validation of the anoikis-related genes signature
To establish a signature for predicting the survival of ccRCC patients, a total of 518 patients who met the inclusion and exclusion criteria were randomly assigned to a training set (259 patients) and a testing set (259 patients) at a 1:1 ratio. In the training set, we performed univariate Cox regression on the differentially expressed ARGs and identified 19 genes associated with prognosis. Five optimal prognostic ARGs were further selected by multivariate Cox regression analysis and LASSO regression analysis (FIGURE 6A, B). Finally, a risk score model including the selected ARGs was developed in the training set: Risk score = Σ [coefficient(i) × expression of ARG(i)], where the sum runs over the selected ARGs (Table 1). Based on the median risk score, all patients were divided into high- and low-risk groups. We then drew a heatmap of the expression of the prognostic genes in the high- and low-risk groups. As shown in Figure 7D, BIRC5 and SLPI were highly expressed in the high-risk group, while EDA2R, PLG and SLPI were expressed at low levels in the high-risk group. The relationship between the cluster classification of ccRCC patients and the risk score is shown in Figure 7E; the risk score of Group B was higher than that of the other two groups. In the Sankey plot (FIGURE 7F), further delineation of the subgroups with respect to risk group and vital status revealed that most patients in Group B had a high risk score and a worse prognosis. Kaplan-Meier survival analysis showed that overall survival (OS) was significantly better for patients in the low-risk group than for those in the high-risk group in both the training and testing datasets (p < 0.001) (FIGURE 6C). To evaluate the predictive accuracy of the model, we plotted ROC curves for both the training and testing datasets (FIGURE 6E, F). The AUC values for OS at 1, 3, and 5 years were > 0.5 in both datasets, indicating that the prognostic model had predictive value for the survival of ccRCC patients.
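The risk-score construction and median split described above can be sketched as follows. This is a minimal illustration rather than the code used in this study: the coefficients and expression values are made-up placeholders (the actual coefficients are those listed in Table 1), only three of the signature genes are included, and the function names are hypothetical. Labelling a patient high-risk only when the score is strictly above the median is an assumed tie-breaking convention.

```python
from statistics import median

# Placeholder coefficients for illustration only; the real values are in Table 1.
COEFFICIENTS = {"BIRC5": 0.30, "PLG": -0.20, "EDA2R": -0.10}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression profiles for four patients.
patients = {
    "P1": {"BIRC5": 4.0, "PLG": 1.0, "EDA2R": 1.0},
    "P2": {"BIRC5": 1.0, "PLG": 5.0, "EDA2R": 2.0},
    "P3": {"BIRC5": 3.0, "PLG": 2.0, "EDA2R": 1.0},
    "P4": {"BIRC5": 0.5, "PLG": 4.0, "EDA2R": 3.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

With these placeholder values, the two patients with high BIRC5 and low PLG expression fall into the high-risk group, which matches the direction of the expression pattern reported for these genes.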
Univariate and multivariate Cox regression analyses were used to evaluate whether the ARG-based risk score was an independent prognostic predictor for ccRCC. Univariate Cox regression analysis showed that age (hazard ratio (HR) = 1.023, p = 0.011), T stage (HR = 1.934, p < 0.001), N stage (HR = 3.422, p < 0.001), M stage (HR = 4.326, p < 0.001), and risk score (HR = 1.139, p < 0.001) were significantly associated with OS in ccRCC (FIGURE 7A). Multivariate Cox regression analysis indicated that age (HR = 1.034, p < 0.001), T stage (HR = 1.369, p = 0.019), M stage (HR = 3.181, p < 0.001), and risk score (HR = 1.119, p < 0.001) were independent prognostic indicators for ccRCC (FIGURE 7B, C).
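As an aid to interpreting the hazard ratios above: in a Cox proportional hazards model the HR is the exponential of the regression coefficient, so an HR reported per unit of a continuous covariate compounds multiplicatively over larger differences. The sketch below uses the multivariate HR for the risk score reported above (1.119). The 5-unit difference is an arbitrary illustration and the helper name is hypothetical.

```python
import math


def hazard_ratio_for_difference(hr_per_unit, difference):
    """HR implied by a Cox model for a covariate difference of `difference` units.

    Because HR = exp(beta), a difference of d units gives exp(beta * d) = HR ** d.
    """
    beta = math.log(hr_per_unit)  # recover the Cox coefficient from the HR
    return math.exp(beta * difference)


hr_risk_score = 1.119  # multivariate HR for the risk score reported in the text
hr_five_units = hazard_ratio_for_difference(hr_risk_score, 5)  # illustrative 5-unit gap
```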
Construction and evaluation of the nomogram incorporating the anoikis-related gene signature
All variables that were significant in the multivariate Cox regression analysis were incorporated into a nomogram to predict 1-, 3-, and 5-year OS for ccRCC patients (FIGURE 8A). The calibration curves showed that the 1-, 3-, and 5-year overall survival rates predicted by the nomogram were largely consistent with the observed overall survival rates (FIGURE 8B). According to the total points generated from the nomogram, ccRCC patients in the entire dataset were divided into high- and low-points groups by the median total points. Figure 8C shows that patients in the high-points group had a higher cumulative hazard than those in the low-points group. Decision curve analysis (DCA) showed that using the nomogram to guide decisions yielded a higher net benefit than using the other factors (FIGURE 8D-F).
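The net benefit compared in a decision curve analysis can be sketched as below, using the standard definition: the true-positive rate in the whole cohort minus the false-positive rate weighted by the odds of the threshold probability. This is a simplified illustration, not the procedure used in this study: it assumes a binary outcome without censoring (time-to-event DCA needs additional handling), and the predicted risks, outcomes, and function names are hypothetical.

```python
def net_benefit(predicted_risk, event, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    n = len(event)
    pairs = list(zip(predicted_risk, event))
    true_pos = sum(1 for p, e in pairs if p >= threshold and e == 1)
    false_pos = sum(1 for p, e in pairs if p >= threshold and e == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(event, threshold):
    """Reference strategy: act on every patient regardless of predicted risk."""
    prevalence = sum(event) / len(event)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predicted risks and observed outcomes (1 = event, 0 = no event).
predicted = [0.9, 0.8, 0.3, 0.2, 0.7, 0.1]
observed = [1, 1, 0, 0, 0, 1]

nb_model = net_benefit(predicted, observed, threshold=0.5)
nb_all = net_benefit_treat_all(observed, threshold=0.5)
```

A decision curve plots these quantities over a range of threshold probabilities; a model is preferred at thresholds where its net benefit exceeds both the treat-all and treat-none (net benefit of zero) strategies.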
Differential analysis of immune cells and tumor microenvironment
To explore differences in immune cells, we applied the CIBERSORT algorithm to examine the relationship between infiltrating immune cells and risk group. Immune cell infiltration differed between the two risk groups: memory B cells and activated memory CD4 T cells were more abundant in the high-risk group, while resting dendritic cells and resting mast cells were more abundant in the low-risk group (P < 0.05) (FIGURE 9A, B). The correlation between the risk score and tumor-infiltrating immune cells is shown in Figure 4C. The risk score was significantly positively correlated with the infiltration of follicular helper T cells, activated memory CD4 T cells, and M0 macrophages, but significantly negatively correlated with resting mast cells and resting dendritic cells (P < 0.001) (FIGURE 10A-L). In addition, TIDE analysis showed that the high-risk group had a higher TIDE score than the low-risk group (FIGURE 9D).
We also plotted violin plots of the stromal score, immune score, and ESTIMATE score for the high- and low-risk groups and found that the high-risk group had higher stromal, immune, and ESTIMATE scores (FIGURE 9E).
Differential analysis of drug sensitivity
In the drug sensitivity analysis, nine anticancer drugs from the GDSC database showed significantly different sensitivity between the high- and low-risk groups (P < 0.001). As shown in Figure 11A-I, the high-risk group showed high sensitivity to 5-fluorouracil, bortezomib, dactolisib, paclitaxel, vinorelbine, and vinblastine.