Differentially expressed ARGs
A total of 232 ARGs were identified using the HADb. A total of 370 patients with primary HCC had their clinical data and gene expression profiles available on the TCGA database (Table 1). Differential gene expression analysis identified 62 ARGs, including 58 upregulated and 4 down-regulated ARGs (Figure 1A-B). Figure 1C shows the expression profiles of the differentially expressed ARGs in HCC and non-tumor tissue samples.
Functional enrichment analysis
Enrichment analysis was used to identify functional GO terms and KEGG pathways associated with the 62 differentially expressed ARGs in HCC samples. The GO biological processes associated with these genes were “process utilizing autophagic mechanism”, and “macroautophagy”, while the GO molecular functions associated with these genes were “protein kinase regulator activity”, “cysteine-type endopeptidase activity”, and “heat shock protein binding”. Regarding cellular components, the top two enriched GO terms were “region” and “chaperone complex” (Figure 2A). On the other hand, enrichment analysis on showed that the differentially expressed ARGs were mainly associated with the following KEGG pathways: autophagy, apoptosis, platinum drug resistance, cellular senescence, p53 signaling pathway, IL-17 signaling pathway, and protein processing in endoplasmic reticulum. An enrichment z-score < 0 indicated that the relationship with the pathways could be reduced (Figure 2B). The heatmaps in Figure 2C show the relationship between the differentially expressed ARGs and the enriched pathways.
Identification of risk-associated ARGs
The correlation between expression levels of the 62 differentially expressed ARGs and OS was evaluated using the TCGA HCC data set. Univariate cox regression was first used to identify potential prognostic differentially expressed ARGs in the HCC patients. The analysis showed that 34 ARGs had their expression levels correlated with OS (Figure 3A). Multivariate cox proportional hazard regression analysis was then performed order to construct a prognostic model that can efficiently predict outcomes of HCC patients. Interestingly, only 5 prognosis-related ARGs (HDAC1, RHEB, ATIC, SPNS1 and SQSTM1) were identified as potential independent risk factors (Table 2). The K-M analysis of OS showed that the high levels of HDAC1 and ATIC were strongly correlated with shorter OS time (HR=2.11 and 2.04, respectively; 95% CI = [1.48 - 3.02] and [1.41 - 2.95], respectively; P<0.001 for both; Figures 3B, 3C). Similarly, high levels of SPNS1 and SQSTM1 were also associated with poor outcomes (HR=1.77 and 1.70, respectively; 95% CI= [1.22-2.58] and [1.20-2.40], respectively; P<0.01 for both; Figures 3D, 3E). Likewise, high expression of RHEB was associated with shorter OS time of HCC patients (HR=1.53, 95% CI= [1.08-2.16], P=0.015; Figure 3F). The two risk-associated ARGs (ATIC 1 and SPNS) associate with DFS in patients with HCC (p=0.013 and p=0.0018, respectively). The other 3 risk-associated ARGs (SQSTM1, RHEB and HDAC1) are not significantly associated with DFS (p=0.087, 0.061 and 0.12 respectively) (Supplementary Figure S1).
Construction of a prognostic model using ARG genes
A linear regression model for the calculation of prognostic risk scores using expression levels (expr) of 5 ARGs weighted by their cox regression coefficients. The risk score was calculated using the following linear formula: riskScore= 0.4216 × exprHDAC1) + (0.5443 × exprRHEB) + (0.6171 × exprATIC) + (1.3652 × exprSPNS1) + (0.2082 × exprSQSTM1). A riskScore was calculated for each patient, and patients were then stratified for OS analysis into high and low risk groups relative to the median riskScore of all patients (n = 185). The K-M curve, along with the log rank test, indicated that the low-risk group exhibited favorable outcomes, while the high-risk group was associated with unfavorable outcomes (p < 0.001; Figure 4A). The distribution and status of OS was then analyzed by ranking the risk scores (Figure 4B-C). Figure 4D shows the expression profiles of risk-associated ARGs in high-risk and low-risk HCC patient groups.
Significance of the ARG-based risk model as an independent risk factor
The correlation of the clinical characteristics of patients and the riskScore with OS was then analyzed using univariate and multivariate regression analysis. Univariate cox regression analysis showed that the pathological stage, the T stage, the M stage and the riskScore were correlated with OS of HCC patients (p < 0.05; Figure 5A). Of importance, multivariate cox regression analysis including clinical parameters and riskScores showed that only the riskScore was independently associated with OS of HCC patients (p < 0.001; Figure 5B).
The prognostic efficiency of the ARG-based risk model
A multi-target ROC curve was performed to evaluate the prognostic efficiency of the risk model in the prediction of clinical outcomes in HCC patients. As shown in Figure 6A, the AUC for the risk score was 0.747, which indicates a competitive performance.
The correlation between the risk score and clinical parameters was then analyzed. The results showed that the riskScore was higher in histological stages III-IV, compared with stages I-II (P = 0.040; Figure 6B). In addition, the riskScore was higher in T3-T4 stages, compared with T1-T2 stages (P=0.038; Figure 6C). On the other hand, no differences in the riskScore were observed between patients >65 and those ≤65 years old (P = 0.916), between male and female patients (P = 0.596), between grades G1-G2 and G3-G4 (P = 0.119), between stages N1 and N0 (P = 0.573), or between stages M1 and M0 (P = 0.348) (Figure 6D-H).
Verification of the ARG-based risk model in the testing group
We further evaluated prognostic efficiency of the ARG-based risk model by analyzing the patients in the different liver cancer cohorts from ICGC dataset (https://dcc.icgc.org/releases/current/Projects/LIRI-JP). For 232 LIRI-JP samples, patients in high-risk group had inferior OS than patients in low-risk group (p = 0.0045) (Figure 7A). The distribution and status of OS and expression profiles of risk-associated ARGs were also analyzed by ranking the risk scores in high-risk and low-risk HCC patient groups from ICGC dataset (Figure 4B-D). Overall, the accuracy of ARGs-based risk model was confirmed in the independent validation liver cancer cohorts.
The relationship of the drug sensitivity and risk ARGs
The relationship between drug sensitivity of 17 HCC cell lines and the relative expression levels of risk-associated ARGs was explored using data available from The Genomics of Drug Sensitivity of Cancer Database (GDSC). We further analyzed the correlation between the expression of HDAC1, RHEB and SQSTM1 with the IC50 of specific targeted drugs. We presumed that a positive correlation between the expression of these genes and the IC50 of the studied drugs would indicate a basis for developing drug resistance in HCC patients. In contrast, a negative correlation between risk-associated ARGs and IC50 would indicate higher drug sensitivity in HCC cell lines. High HDAC1 expression was associated with higher drug resistance (higher IC50) of HCC cell lines to Trametinib, 17-AAG, HG-5-113-01, Bleomycin, RDEA119, Nutlin-3a, PD-0325901, Elesclomol, CHIR-99021, Afatinib, Cetuximab and Selumetinib (p < 0.05), while it was associated with higher drug sensitivity (lower IC50) of HCC cell lines to Pyrimethamine and Methotrexate (p < 0.05; Figure 8A).
On the other hand, higher RHEB expression resistance to ha-793887 in HCC cell lines (p < 0.05), while it was associated with higher sensitivity to 17-AAG, Elesclomol, PD-0325901, Docetaxel, Trametinib, RDEA119 and Selumetinib (p < 0.05; Figure 8B). Furthermore, higher SQSTM1 expression was associated with higher resistance of HCC cell lines to GSK1070916 (p < 0.05), while it was associated with higher sensitivity to other drugs such as 17-AAG, Trametinib, RDEA119, PD-0325901, Selumetinib, Dasatinib, Docetaxel and Lapatinib (p < 0.05; Figure 8C).