3.1 Differentially expressed ARGs between HCC and adjacent nontumor tissues
In this study, a total of 374 HCC and 50 non-tumor tissues with RNA sequencing data were analyzed. Eventually, 27 differentially expressed ARGs were identified, including 25 up-regulated genes (DDIT3, BAX, TSC1, HGS, BAK1, HSP90AB1, RAB24, CLN3, SQSTM1, PEA15, IKBKE, TP63, HSPB8, ITGA6, ITGA3, DAPK2, TMEM74, ITGB4, IRGM, NKX2-3, SPHK1, NRG2, TP73, CDKN2A, and BIRC5) and 2 down-regulated genes (FOS and DIRAS3) with |log2FC| > 1.5 (Fig. 1a-1b). Figure 1c illustrated these 27 differentially expressed ARGs in HCC and normal tissues.
3.2 Function enrichment analyses for ARGs
To investigate the biological functions and molecular mechanisms of the above 27 ARGs in HCC, GO enrichment analysis and KEGG pathway analysis were conducted (Fig. 2, Fig. 3). The GO enrichment analysis can be divided into three parts: biological processes (BP), cellular components (CC) and molecular function (MF). As showed in Fig. 2, the top enriched terms for BP included autophagy, process utilizing autophagic mechanism, and regulation of apoptotic signaling pathway. As for CC, the most significant terms involved in autophagosome, chaperone complex, and integrin complex. In MF group, these differentially expressed ARGs were mainly associated with BH domain binding, chaperone binding, and heat shock protein binding. Besides, KEGG enrichment analysis showed that the selected differentially expressed ARGs were mainly participated in pathways in cancer, human papillomavirus infection, apoptosis, measles, and platinum drug resistance (Fig. 3a-3b).
3.3 Prognostic gene signature for HCC cohorts
Using a univariate Cox regression analysis, totally 13 differentially expressed ARGs (BAX, SQSTM1, PEA15, CDKN2A, HSPB8, HGS, IKBKE, HSP90AB1, RAB24, BIRC5, DDIT3, BAK1, and TMEM74) were found to be markedly related to OS in HCC patients (Fig. 4a). All these 13 survival-related ARGs were regarded as risk factors (HRs, 1.154–1.715; P < 0.05) and their overexpression may cause worse prognosis. Then, we validated the prognostic roles of these 13 survival-related ARGs in Kaplan Meier-plotter website (http://kmplot.com/analysis/index.php). The results are consistent with the findings in TCGA dataset (Fig. S1; The OS curve of BAX presented similar trend to these of other genes, but not statistically significant). Finally, the 13 differentially expressed ARGs were entered into a Lasso regression analysis. Figure 4b presented the regression coefficient of these 13 ARGs in HCC. While 7 ARGs (SQSTM1, PEA15, CDKN2A, HSPB8, RAB24, BIRC5, and TMEM74) were included, the model reached the best performance (Fig. 4c). The biological functions and risk coefficients of 7 ARGs were listed in Table 1, which mainly associated with the formation of autophagosome, regulation of autophagy, as well as regulation of apoptosis.
Table 1
Biological functions and coefficient of 7 ARGs.
No | Gene symbol | Full name | Function | Risk coefficient |
1 | SQSTM1 | Sequestosome 1 | Functions as a bridge between polyubiquitinated cargo and autophagosomes | 0.76707221 |
2 | PEA15 | Proliferation And Apoptosis Adaptor Protein 15 | Functions as a regulator of apoptosis and autophagy | 0.24377181 |
3 | CDKN2A | Cyclin Dependent Kinase Inhibitor 2A | Involved in regulation of autophagy and caspase-independent cell death | 0.03399741 |
4 | HSPB8 | Heat Shock Protein Family B (Small) Member 8 | Functions as a regulator of macroautophagy | 0.19802769 |
5 | RAB24 | Ras-Related Protein Rab-24 | Involved in autophagy-related processes | 0.07147873 |
6 | BIRC5 | Baculoviral IAP Repeat Containing 5 | Functions as a regulator of apoptosis and autophagy | 0.50048513 |
7 | TMEM74 | Transmembrane Protein 74 | Plays an essential role in autophagy | 0.21098334 |
To explore the contributions of the 7 ARGs to hepatocellular carcinogenesis, the genetic alteration information of these genes were explored in the cBio Cancer Genomics database. The PanCancer Atlas dataset (353 samples) and Firehose Legacy dataset (366 samples) of HCC were both included. ARGs of interest are altered in 149 (41%) of 366 sequenced patients (TCGA, Firehose Legacy dataset) (Fig. S2), compared with that altered queried ARGs were detected in 110 (31%) of 353 sequenced patients (TCGA, PanCancer Atlas dataset) (Fig. S3a). Moreover, HCC patients in genes-altered group presented poorer PFS (progression-free survival) (Fig.S3b) and DFS (disease-free survival) (Fig.S3c) than these in genes-unaltered group. The OS curves presented similar trend to that of PFS, but not statistically significant (Fig.S3d). These results suggested that the 7 ARGs play a crucial role in hepatocellular carcinogenesis.
The 7 ARGs were finally analyzed by a multivariate Cox regression analysis, and 3 candidate genes (SQSTM1, HSPB8, and BIRC5) were selected as the prognostic markers for HCC patients (Table 2). Each patient received a risk score calculated as follows: risk score = (0.2578 × expression value of SQSTM1) + (0.1190 × expression value of HSPB8) + (0.3049 × expression value of BIRC5).
Table 2
Univariate and multivariate Cox regression analyses of OS in HCC patients.
Genes | HR(95% CI) | P | HR(95% CI) | P | Coef |
SQSTM1 | 1.3759(1.1682–1.6204) | 0.000132 | 1.2940(1.1007–1.5213) | 0.001792 | 0.257771 |
PEA15 | 1.3709(1.0956–1.7153) | 0.005811 | --- | --- | --- |
CDKN2A | 1.2434(1.0706–1.4441) | 0.004323 | --- | --- | --- |
HSPB8 | 1.1542(1.0371–1.2846) | 0.008595 | 1.1264(1.0079–1.2588) | 0.035779 | 0.119042 |
RAB24 | 1.7155(1.2278–2.3970) | 0.001566 | --- | --- | --- |
BIRC5 | 1.3523(1.1790–1.5510) | 1.60E-05 | 1.3565(1.1810–1.5581) | 1.61E-05 | 0.304895 |
TMEM74 | 1.5356(1.0999–2.1439) | 0.011769 | --- | --- | --- |
3.4 Identification of independent risk factors of OS for HCC patients
In order to screen the independent risk factors of OS for HCC patients, the univariate cox and multivariate cox regression analyses were carried out to explore the independent risk factors of OS. According to Fig. 5a, tumor stage, T (primary tumor), and risk score were closely related to OS (HR = 1.669(95% CI: 1.357–2.053), P < 0.001; HR = 1.649(95% CI: 1.354–2.009), P < 0.001, and HR = 1.755(95% CI: 1.511–2.039), P < 0.001, respectively). As shown in Fig. 5b, the results of multivariate cox regression indicated that M (metastasis) (HR = 1.394(95% CI: 1.065–1.824), P = 0.016) and risk score (HR = 1.769(95% CI: 1.478–2.116), P < 0.001) should be considered as the independent risk factors of OS.
3.5 Validation of the risk model
According to the median values of risk score, we divided 374 HCC cases into high-risk and low-risk groups. To validate the performance of risk model, we plotted Kaplan-Meier curves to compare the HCC survival in two groups. The results showed that HCC patients in low-risk group have better prognosis than those in high-risk group (3-year OS, 70.2% vs 53.7%; 5-year OS, 55.2% vs 42.0%; P = 4.478e-04) (Fig. 6a). The ROC curves were also plotted using the risk factors related to OS (age, gender, grade, stage, T (primary tumor), M (metastasis), N (lymph nodes), and risk score). We then evaluated the area under curve (AUC) values for each risk factor, and the results showed that risk score curve has a better feasibility in predicting the individuals’ survival with AUC of 0.756 (Fig. 6b). In addition, Figure. 6c-6e showed that the survival time of HCC patients decreases along with the rising of risk score.
The prognostic role of the three-gene risk model was then validated in the GEO validation cohort (GSE14520; n = 221). According to the median value of risk score, the patients were also divided into two groups (high risk group and low risk group). The Kaplan-Meier survival result showed that HCC patients in high risk group have a significantly worse OS compared to cases with low risk in the validation dataset (3-year OS, 57.7% vs 73.5%; 5-year OS, 43.2% vs 63.0%; P = 1.274e-03) (Fig. 7a). Moreover, Fig. 7b demonstrated that the risk score presents a well predictive ability with AUC of 0.672. Figure 7c-7e showed that patients in high risk group have shorter survival time than patients with low risk score in validation cohort. Taking together, the three-gene signature has good feasibility for predicting the prognosis in HCC.
3.6 Construction of a nomogram for predicting survival rate of HCC
The clinical nomogram was constructed to quantitatively assess the patients’ survival rate through combining several risk factors (age, gender, grade, stage, T (primary tumor), N (lymph nodes), M (metastasis), and risk score). As shown in Fig. 8a, the total points of risk factors were utilized to evaluate the individuals’ 3-year and 5-year survival rates. The concordance index (C-index) was 0.68 (95% CI: 0.63–0.73). In addition, calibration curves presented good concordance between actual survival and nomogram-predicted survival (Fig. 8b-8c), especially for the 3-year survival.
3.7 The relationships between ARGs and clinical factors
The Student’s t test was applied to investigate the relationships between the expression levels of 3 ARGs (SQSTM1, HSPB8, and BIRC5) and clinical factors. The results showed that the risk scores increased along with the T (primary tumor) and tumor grade (Fig. 9a-9b). In addition, the expression of SQSTM1 was higher in the groups of patients aged > 65, male, and patients with higher tumor grade (Fig. 9c-9e). We also found the level of BIRC5 was related to the grade, tumor stage and T (primary tumor) in HCC patients (Fig. 9f-9 h).
3.8 Validation of the expression of ARGs at protein level and mRNA level
The Human Protein Atlas database (https://www.proteinatlas.org/) was used to evaluate the protein levels of 3 prognostic ARGs (SQSTM1, HSPB8, and BIRC5) in HCC tissues with their expression in normal tissues (Fig. 10a-10c). As expected, the protein levels of 3 prognosis-related ARGs (SQSTM1, HSPB8, and BIRC5) were markedly higher in HCC tissues as compared with normal samples. Then, the TIMER database (https://cistrome.shinyapps.io/timer/) was applied to validate the mRNA expression levels of SQSTM1, HSPB8, and BIRC5. The results showed that the mRNA levels of these 3 ARGs were obviously higher in HCC compared to normal controls (Fig. S4).
3.9 Construction of TFs-genes networks and miRNA-genes networks for SQSTM1, HSPB8, and BIRC5
To better understand the contributions of SQSTM1, HSPB8, and BIRC5 to the development and progression of HCC. The TFs-genes networks and miRNA-genes networks for SQSTM1, HSPB8, and BIRC5 were established (Fig. 11a-11b). The numbers of TFs and miRNAs in the networks were 100 and 117, respectively. In the TFs-genes networks (Fig. 11a), ZNF394 was identified as the hub TF for these 3 target genes. Moreover, SQSTM1 and BIRC5 shared 10 TFs (ZNF394, ZBTB7A, SCRT1, ZNF644, SSRP1, MLX, PPARG, CTCF, MDX3, and ZNF501). In the miRNA-genes networks (Fig. 11b), miR-218-5p, miR-646, miR-93-5p, miR-16-5p, miR-484, miR-335-5p, and miR-1252-3p could regulate both SQSTM1 and BIRC5. In addition, SQSTM1 and HSPB8 were predicted as the targeted genes of miR-1226-3p. Taken together, the TFs-genes networks and miRNA-genes may provide new clues for the further studies on molecular mechanisms of HCC.