Differentially expressed MRGs and functional enrichment analysis
We first compared the expression values of 944 MRGs between KIRP samples and normal control samples using the Wilcoxon signed-rank test in R. A total of 123 differentially expressed MRGs, including 60 up-regulated genes and 63 down-regulated genes, were eventually identified based on the criteria of |log2FC| > 2 and FDR < 0.05. Then, the volcano plot and heatmap were utilized to visualize the expression patterns of these differentially expressed MRGs in tumor samples and normal samples (Figure 1A and 1B).
Functional enrichment analysis was performed to explore potential molecular mechanisms of the differentially expressed MRGs. The most enriched GO terms in the biological processes (BP) category were small molecule catabolic process, cellular amino acid metabolic process and organic acid catabolic process. Significantly enriched GO terms related to cellular components (CC) category included microbody part, peroxisomal part and peroxisomal matrix. In the molecular function (MF) category, the differentially expressed MRGs were highly enriched in the terms of oxidoreductase activity, coenzyme binding and iron ion binding (Figure 1C). In addition, KEGG pathway analysis revealed that these genes were notably associated with pathways in retinol metabolism, metabolism of xenobiotics by cytochrome P450 and drug metabolism − cytochrome P450 (Figure 1D).
Identification of metabolism-related prognostic signature
To explored the prognostic value of MRGs in renal cancer progression, we performed univariate Cox regression analysis to examine the relationships between the expression levels of those 123 genes and OS. Results demonstrated that 15 genes were significantly associated with OS (P < 0.01) (Figure 2A). Among the 15 MRGs, P4HA3, IL4I1, RRM2, ITPKA, PSAT1, TYMP, HK3, PLCB2 and AANAT were considered as risk gene with HR > 1, while AGMAT, GATM, HAO2, FBP1, ADH6 and ALDH6A1 were considered as protective genes with HR < 1. We then used LASSO Cox regression on the above-mentioned 15 MRGs to identify the most optimal risk score model for predicting survival in KIRC patients (Figure 2B and 2C). Eventually, RRM2 and ALDH6A1 were retained as target genes, and the coefficient of each target gene was calculated to construct the metabolism-related prognostic signature.
Subsequently, we constructed the OS prognostic signature based on the expression of the 2 target genes and its prognostic coefficients using the following formula: Risk score= (0.0361 × expression level of RRM2) + (–0.0184 × expression level of ALDH6A1). According to the median risk score, a total of 253 and 254 KIRC patients were sorted into the high-risk group and low-risk group, respectively. The Kaplan-Meier curve displayed that a significant difference in OS between the high- and the low-risk groups, and patients with high-risk scores exhibited lower survival than patients with low-risk scores (5-year survival rate, 49.3% vs. 72.6%, P <0.001) (Figure 2D). We applied the ROC curve to evaluate the predictive accuracy of the signature, and the area under the curve of the ROC curve was 0.705, suggesting a moderate prognostic value (Figure 2E). In addition, the distribution of risk scores and the survival status of patients were ranked according to the risk scores (Figure 2F and 2G).
Determination of the prognostic signature as an independent prognostic factor
Furthermore, we performed the univariate and multivariate Cox regression analyses to further determine whether the prognostic signature could serve as an independent prognostic factor. Univariate analysis revealed that the age, grade, AJCC stage, T stage, M stage and risk score were significantly associated with OS (Figure 2H). Subsequent results showed that age (P < 0.001), grade (P = 0.024), AJCC stage (P = 0.043) and risk score (P < 0.001) were still significantly related with OS in multivariate analyses (Figure 2I). Therefore, the metabolism-related prognostic signature was an independent prognostic factor for KIRC patients.
Clinical utility of prognostic signature
We further explored the relationships between metabolism-related prognostic signature and various clinical parameters. The expression levels of the 2 identified target MRGs in high- and low-risk groups were demonstrated in the heatmap (Figure 3A). The results showed that RRM2 and ALDH6A1were represented high and low expression levels in the high-risk group, respectively. Additionally, we observed that there were significant differences between low- and high-risk groups in grade (P = 4.841e−08), AJCC stage (P = 1.648e−08), T stage (P = 3.826e−07), M stage (P = 5.918e−05) (Figure 3B-3E).
To better evaluate the survival outcomes and detect the broad applicability of the prognostic signature, we next performed survival analyses stratified by age, gender, tumor grade, AJCC stage, T stage and M stage. As shown in Figure 4, high-risk group had significantly shorter OS than those in the low-risk group for the cases with age ≤ 60 (P = 1.236e−02), age > 60 (P = 2.404e−05), female patients (P = 1.139e−04), male patients (P =3.673e−04), G1-2 (P = 3.163e−02), G3-4 (P = 2.236e−02), AJCC stage Ⅰ&Ⅱ (P =1.166e−02), T1-2 (P = 2.623e−04) and M0 (P = 5.778e−03). However, no significant differences were observed for OS between high- and low-risk groups for the KIRC patients with AJCC stage Ⅲ&Ⅳ (P = 7.808e−02), T3-4 (P = 1.673e−01) and M1 (P = 7.726e−02).
We then analyzed the relationship between the expression level of each target gene of the prognostic signature and clinicopathological features to assess the function of those 2 genes in disease progression. The results indicated that enhanced RRM2 expression was significantly associated with advanced tumor stage and high-grade tumor, suggesting that RRM2 was a poor prognostic factor. The highest expression in RRM2 was found in the most progressive clinicopathological stage, that is, G4 and stage IV, T4 and M1 (Figure 5A). On the contrary, ALDH6A1 expression is gradually decreasing in the progression of KIRC (Figure 5B), which suggested that ALDH6A1 was a protective factor for KIRC.
Construction and validation of the predictive nomogram
A predictive nomogram was constructed by incorporating prognostic signature and several clinical parameters to generate individual numerical probabilities of OS (Figure 6A). The C-index of the developed nomogram was 0.774. The DCA showed demonstrates that using this nomogram to predict OS had higher net benefit if the threshold probability was larger than 3% (Figure 6B). Additionally, there is an obviously higher net benefit occurred in the nomogram than the tumor grade and AJCC stage. The calibration curves indicated the nomogram performed well in predicting 1-, 3‐ and 5-year OS compared with the ideal model (Figure 6C-E). Taking together, these results showed that the reliability and predictability of the nomogram.
External validation the expression level and prognosis value of RRM2 and ALDH6A1
The expression level of RRM2 was significantly up-expressed, while ALDH6A1 was significantly down-expressed in KIRC samples compared with the normal samples in the Oncomine database (Figure 7A), TIMER database (Figure 7B) and GEPIA database (Figure 7C), which were consistent with our results. Besides, the aberrant expression of these 2 genes was found to be frequently observed in various types of cancer patients. Interestingly, RRM2 and ALDH6A1 consistently maintained over-expression and under-expression in various cancer, respectively. In addition, the prognostic values of the two genes were further confirmed by the Kaplan Meier plotter in the GEPIA database. The results indicated that RRM2 low-expression group and ALDH6A1 high-expression group had favorable prognosis (Figure 7D).
To determine the protein expression levels of the RRM2 and ALDH6A1, the present study referred to the HPA database. According to the results, we discovered that ALDH6A1 expression in kidney cancer tissues was significantly lower than that in normal tissues (Figure 7E). However, no significant difference was found in RRM2 between kidney cancer tissues and normal tissues. Taking together, the aberrant expression of the two genes was further validated in kidney cancer. Such evidence revealed the robustness and reliability of our results to some extent.
In addition, the expression level of RRM2 and ALDH6A1 were further verified with two independent cohorts (GSE53757, Figure 7F and GSE66270, Figure 7G) in GEO database. Results showed that the expression level of RRM2 was significantly up-expressed, while ALDH6A1 was significantly down-expressed in KIRC samples compared with the normal samples in multiple datasets, which was consistent with our previous result.
GSEA
The top 50 significant genes of positive and negative correlation were obtained by GSEA. Subsequently, we performed hallmark analysis for RRM2 and ALDH6A1. Results indicated that the most significant pathways of RRM2 included ANTIGEN PROCESSING AND PRESENTATION, ALLOGRAFT REJECTION, SYSTEMIC LUPUS ERYTHEMATOSUS, LEISHMANIA INFECTION and AUTOIMMUNE THYROID DISEASE (Figure 8A). In addition, the heatmap showed transcriptional expression profiles of the top 50 features for each phenotype in RRM2 (Figure 8B). Correspondingly, GSEA enrichment analysis showed that the significantly enriched pathways for ALDH6A1 included VALINE LEUCINE AND ISOLEUCINE DEGRADATION, FATTY ACID METABOLISM, PROPANOATE METABOLISM, CITRATE CYCLE TCA CYCLE and PYRUVATE METABOLISM (Figure 9A). In addition, the heatmap showed transcriptional expression profiles of the top 50 features for each phenotype in ALDH6A1 (Figure 9B).