3.1 Identification of metabolism-related DEGs between HCC samples and normal samples.
In order to identify metabolism-related DEGs between HCC and normal tissues, we conducted differential expression analysis in TCGA(LICH), ICGC(LIRI-JP) and GSE112790 datasets by using the limma R package (FDR < 0.05 and |log2 FC| > 1). We obtained 457 DEGs, of which 100 were related to metabolic reprogramming. (Supplementary Fig.S1 and Supplementary Table S1).
3.2 GO and KEGG analyses in metabolism-related genes.
To gain further insight into the function of the identified metabolism-related DEGs, we performed GO and KEGG analyses. The DEGs were mainly involved in biological processes associated response to xenobiotic stimulus, nucleoside phosphate biosynthetic process, carboxylic acid biosynthetic process and so on (Fig. 1a). Moreover, the DEGs were involved in multiple KEGG pathways, including retinol metabolism, drug metabolism-cytochrome P450, arachidonic acid metabolism, et al (Fig. 1b).
3.3 Construction and validation of a metabolism-related prognostic model in the TCGA and ICGC cohorts.
Considering the difference in the expression of metabolism related genes between HCC and normal tissues, we determined to analyze the clinical application value of the DEGs in depth. Among the 100 metabolic DEGs, 37 genes were demonstrated to be significantly associated with the overall survival status of HCC patients through the univariable Cox regression analysis (Supplementary Fig. S2a), and 9 of them were identified to have the maximum prognostic value through the LASSO Cox regression analysis (Supplementary Figs. S2b,c ). Finally, we applied these genes to establish a metabolism-related model in the TCGA cohort to estimate the prognostic value of metabolic reprogramming in HCC patients. And the identical formula was utilized in the ICGC cohort to determine whether the prognostic model was reliable in different populations (Figs. 2a-f). The formula for the prognostic model was that risk score = Σ Cox coefficient of gene Xi × scale expression value of gene Xi. Then, HCC patients were categorized into high- or low- risk groups with the optimal cut-off value of risk score. The high-risk score predicted poor prognosis for HCC patients in the TCGA cohort (P = 1.847e-07), as confirmed in the ICGC cohort (P = 7.243e-06). And we found that the risk score in TCGA cohort had excellent prognostic value in both univariate and multivariate Cox regression analyses, which was consistent with the results in ICGC cohort (Figs. 3a-d). The results showed that metabolic alterations had important clinical significance in the prognosis of HCC patients. And the risk score of the metabolism-related prognostic model can act as an independent prognostic factor for the TCGA cohort (HR (hazard ratio) 3.635, 95% CI (confidence interval)2.382–5.549) and the ICGC cohort (HR1.905, 95%CI 1.328–2.731) in HCC patients.
3.4 Evaluation of the metabolism-related prognostic model in the TCGA and ICGC HCC cohorts.
To further evaluate the prognostic accuracy of the metabolism-related prognostic model, we calculated the area under ROC curve (AUC) and the C-index value of the model (Figs. 4a-f). And we also constructed a nomogram with comprehensive consideration of risk score and clinical characteristics for clinical use (Fig. 4g). We found that the AUC of the prognostic model on OS status was 0.758 at 1 year, 0.742 at 2 years, 0.693 at 3 years, 0.7 at 4 years and 0.693 at 5 years in TCGA cohort, while 0.749 at 1 year, 0.748 at 2 years, 0.763 at 3 years, 0.83 at 4 years and 0.672 at 5 years in ICGC cohort (Figs. 4a,d). We also found the overall AUC (0.755) of risk score was superior to that of grade (0.478), stage (0.698), T (0.704), N (0.506), M (0.508) in TCGA cohort. And the overall AUC (0.749) of risk score in ICGC cohort was superior to that of priorMalignancy (0.526), indicating that the prognostic model was well constructed (Figs. 4b,e). In addition, through principal component analysis (PCA) of unsupervised learning, we found that HCC samples were well distinguished between high and low risk grouping in the predictive model (Figs. 4c,f). Furthermore, we observed that the C-index of risk score was 0.7260 in TCGA cohort and 0.7195 in ICGC cohort. And the C-index of nomogram was 0.7510 in TCGA cohort and 0.8240 in ICGC cohort, indicating higher predictive accuracy of nomogram than risk scores. Finally, we conducted the calibration curves at 3 years and at 5 years, to validate the nomogram’s performance and observed that the predictive curves worked well (Figs. 4h,i).
3.5 Clinical relevance of metabolism-related prognostic model and its constructed genes in TCGA cohort
We conducted a clinical correlation analysis for risk score of the prognostic model to further understand the influence of metabolic reprogramming on the clinical prognosis of HCC patients. The results showed that risk score had significant correlations with the status(p = 1.11e-06), grade(p = 0.003), stage(p = 0.014) and T(p = 0.013) in HCCs in TCGA cohort (Figs. 5a-d). The results suggested that Risk score correlates with the progression of hepatocellular carcinoma. And a higher risk score indicates a higher degree of malignancy in HCC. Therefore, metabolic alterations are closely related to the clinical prognosis of HCC patients.
3.6 The role of risk score in the prognosis of HCC patients stratified by clinicopathological variables
To explore the prognostic value of risk score in HCC patients stratified by different clinicopathological variables, we stratified HCC patients by age, sex, grade, stage and TNM stage. The results showed that the high-risk group had a poor prognosis among patients stratified by different clinical variables (Fig. 6). These results suggested that metabolism-related prognostic risk score was not limited by clinicopathological variables in predicting the prognosis of HCC patients.
3.7 Correlation between metabolism-related genes and tumor immunity in the TCGA HCC cohort.
Since our analysis found that there was a significant correlation between the expression of metabolism-related risk genes and immune-related genes, we further analyzed the metabolic alterations in the high and low immune-rating groups (Supplementary Figs. S4a,b). And we found that risk score was significantly higher in the low immune group than in the high immune group (Supplementary Fig. S4e). There were 23 metabolism-related genes overexpressed in the high immune score group, while 37 in the low immune score group (FDR < 0.05 and |log2 FC| > 1, Supplementary Figs.S4c,d). By GO and KEGG analyses, it was found that the metabolic genes in the high immune score group were mainly involved in the organophosphate biosensitization and arachidonic acid signaling pathway, et al. While in the low immune score group, the metabolic genes were mainly enriched in lipid biosensitization, glutathione metabolism and so on (Figs. 7a,b). We also found that there was a close relationship between metabolism-related genes and tumor immune infiltration, especially in the high immune score group (Supplementary Fig.S5). We speculated that metabolism-related genes may indirectly regulate tumor immunity through metabolic alterations to some extent, thus affecting the prognosis of HCC patients.