Patients characteristics
As shown in Table 1, 418 primary tumors with both gene expression and clinical data were downloaded from TCGA (LIHC-CHOL) data in November 2019. The median age was 61 years old at diagnosis. Moreover, Histopathologic distribution of HCC included well-differentiated (14.8%), moderately differentiated (81.7%) and poorly differentiated (3.5%). The cancer status included 254 tumor-free (65.6%) and 133 with tumor (34.4%). Stage I disease was found in 194 patients (49.2%), stage II in 98 (24.9%), stage III in 90 (22.8%) and stage IV in 12 (3.1%). Grade I disease was found in 55 patients (14.8%), Grade II in 180 (48.4%), Grade III in 124 (33.3%) and Grade IV in 13 (3.5%).The Vascular invasion include 17 Macro (5.3%), 94 Micro (29.3%) and 210 NO (65.4%). Most tumors (87.8%, n=367) were of Hepatocellular Carcinoma (HCC), 9.8% (n=41) were Cholangiocarcinoma (CHOL), 1.7% (n=7) were Hepatocholangiocarcinoma (Mixed) and 0.7% (n=3) were Fibrolamellar Carcinoma (FLC). Tumor size (≤5cm, 74.9%, n=311) and (>5cm, 25.1%, n=104). Eight of 298 (2.7%) cases had lymph node metastases. Eight of 311 (2.6%) cases had distant metastases.
PSMD14 mRNA expression
Oncomine database analysis exhibits significantly up-regulation of PSMD14 in 2 Liver cancer data set compared to the normal liver tissues (Figure 1A). Complementary, PSMD14 mRNA expression was higher in most tumors, such as bladder cancer, brain and CNS, breast cancer, cervical, colorectal cancer, head and neck cancer, kidney cancer, liver cancer, lung cancer, lymphoma, myeloma, pancreatic cancer and sarcoma. To further assess PSMD14 expression of liver cancer, we verified PSMD14 mRNA expression by TIMER data set (Figure 1B). PSMD14 mRNA expression was significantly higher in Liver Hepatocellular Carcinoma (LIHC) compared with adjacent normal tissues.
PSMD14 expression and its clinical significance in HCC
To further assess the role of PSMD14 in HCC, we first analyzed four independent microarray datasets from Oncomine database. The median rank of PSMD14 in up-regulated genes of HCC was 1197.5 based on a meta analysis across the four datasets, including 4 analyses using the Oncomine algorithms (760 samples, P = 2.96E-5, Figure 2A) (15-17). In the meantime, we examined the protein expression of PSMD14 using the Human Protein Atlas (HPA). Images revealed a markedly high expression of PSMD14 could be observed compared to the normal Liver tissues using the HPA002114 antibody in Liver cancer (Figure 2B). Next, we evaluated PSMD14 mRNA expression compared with normal tissues in multiple HCC studies from TCGA and ICGC databases, and found that the mRNA level of PMSD14 was significantly higher in HCC patients than normal liver tissues (Figure 2C-D). These findings suggested that PSMD14 is highly expressed in HCC and elevated PSMD14 may predict a poor outcome for HCC patients.
We analyzed 377 HCC samples with PSMD14 expression and clinical characteristics from the TCGA database. As shown in Figure 3 (A-D). Significant increase in PSMD14 expression correlated with tumor histological grade (p = 0.046), histological stage (p < 0.001), surgical approach (p = 0.001) and T stage (p = 0.013). Categorical dependent variable using logistic regression, univariate analysis showed that PSMD14 expression was significantly associated with poor prognostic clinicopathologic characteristics (Table 2). Increased PSMD14 expression was significantly associated with grade (OR =1.9 for well vs. moderate), stage (OR = 2.5 for I vs. II, III), surgical approach (OR = 0.26 for HL vs. SSLR), tumor size (OR = 1.65 for ≤5cm vs. > 5cm) in HCC (all p-values < 0.05). The results suggested that HCC with high PSMD14 expression was progressed to a more advanced stage than those with low PSMD14 expression.
Survival and independent prognostic analysis
Kaplan-Meier survival analysis indicated that HCC with PSMD14-high had a poor prognosis than the PSMD14-low in TCGA and ICGC database in Figure 3 (E-F) (p < 0.001). The univariate analysis revealed that PSMD14-high correlated significantly with a poor OS (hazard ratio (HR): 2.17; 95% confidence interval (CI): 1.32-3.57; p = 0.002). Other clinicopathologic variables related to poor survival include status, stage, T stage and distant metastasis. Multivariate analysis with HR of 1.9 shown that the PSMD14 remained independently associated with overall survival (CI: 1.08-3.33, p = 0.026), along with status in TCGA (Table 3). Therefore, we first revealed that PSMD14 was associated with poor prognosis, as an independent prognostic factor for HCC survival.
Construction and validation of signature
We selected the genes signatures associated with PSMD14 in the TCGA-LIHC database. A total of 2478 PSMD14-associated genes (Pearson |R| > 0.4, p < 0.001) were chosen to generate prognosis gene signatures and the top 10 positively and negatively genes were selected for further analysis (Figure 4A). All genes were analyzed by univariate Cox regression. A total of 17 genes were significantly related to the OS in TCGA-LIHC database (Figure 4B). Then, the LASSO COX regression analysis and the regression coefficient was computed, the model achieved the best performance at 6 genes (Figure 4C). Finally, we constructed a risk signature for HCC using multivariate COX regression (Table 4). All patients were divided into high- and low-risk sets based on median risk score in the TCGA and ICGC database. Status, survival time and gene expression levels of patients were shown in TCGA (Figure 4D) and ICGC (Figure 4E).
The survival analysis indicated that the OS of the low-risk set was better than that of high-risk set in the TCGA database (P < 0.001) (Figure 5A). The results were consistent in the ICGC database (P < 0.001) (Figure 5B). The area under the ROC curve (AUC) for 1-, 3-, and 5-year OS were 0.723, 0.653, 0.645, 0.657, 0.705, 0.683 in the TCGA (Figure 5C) and ICGC (Figure 5D) cohorts, respectively. Together, Results revealed that the signature showed excellent performance for OS prediction.
Univariate and multivariate COX regression analysis
Univariate Cox regression indicated that stage, T stage, M stage and risk score in the TCGA database (stage: P < 0.001; T stage: P < 0.001; M stage: P = 0.026; risk score: P < 0.001; Figure 6A), and gender, stage, risk score in the ICGC database (gender: P =0.039; stage: P < 0.001; risk score: P < 0.001; Figure 6C) were predictors for OS. Moreover, multivariate Cox regression analysis verified that age (HR = 1.019, 95% CI (1.003- 1.034); P = 0.017) and risk score (HR = 1.545; 95% CI (1.354-1.763); P < 0.001; Figure 6B) were significant independent risk factors in the TCGA database. Multivariate Cox regression further showed that gender (HR = 0.372, 95% CI (0.196-0.707); P =0.003); stage (HR = 2.320, 95% CI (1.599-3.367); P < 0.001); priorMalignancy (HR = 2.500, 95% CI (1.065-5.868); P =0.035) and risk score (HR = 1.094; 95% CI (1.031-1.162); P =0.003; Figure 6D) was significant independent risk factors in the ICGC database. These data demonstrated that this signature was an independent risk factor of HCC.
Nomogram construction
Based on the prognostic signature and clinical factors, such as age, gender, vascular invasion, tumor status and stage, a nomogram was constructed (Figure 7A). The calibration curve was used to describe the prediction value of the nomogram and the 45-degree line indicates the actual survival outcomes. The results showed that the nomogram-predicted survival closely matched with the best prediction performance for predicting 1-, 3- and 5-year OS (Figure 7B). The 1-year AUC was 0.765 for nomogram, and 0.481 for age, 0.490 for grade, 0.425 for status, 0.711 for stage. The 3-year AUC was 0.697 for nomogram, and 0.508 for age, 0.525 for grade, 0.567 for status, 0.706 for stage. Moreover, the 5-year AUC was 0.715 for nomogram, and 0.594 for age, 0.508 for grade, 0.607 for status, 0.667 for stage (Figure 7C-E). These showed that compared with a single clinical factor, the nomogram had great predictive accuracy combined the signature and clinical factors.
Model gene verification
Consistent with our results, IVD and LCAT were found to be significantly downexpressed, while CCT6A and OLA1 were significantly overexpressed for liver cancer in the Oncomine (Figure 8A), TIMER (Figure 8B), TCGA (Figure 8C) and ICGC (Figure 8D) database. Though lack in the Oncomine database, the mRNA expression of PSMD1 and RBM45 were also found to be significantly overexpressed for liver cancer in the TIMER, TCGA and ICGC database. Taking together, we further verified aberrant expression of six prognostic genes and found that IVD and LCAT genes were significantly decreased, while CCT6A,OLA1,PSMD1 and RBM45 genes were increased in HCC tissues.
Gene set enrichment analyses
We carried on the Gene Set Enrichment Analysis (GSEA) between PSMD14 (high/low) expression and high/low risk model data sets in TCGA‐LIHC. GSEA revealed significant difference (FDR < 0.05, NOM p-val < 0.05) in enrichment of MSigDB Collection (c2.cp.kegg.v7.0.symbols.gmts). We selected the most significantly enriched signaling pathways based on their normalized enrichment score (NES). A great majority of the enriched pathways were metabolism related, such as the purine metabolism, pyrimidine metabolism and ubiquitin mediated proteolysis are differentially enriched in PSMD14 high expression and high‐risk group phenotype (Figure 9A). Besides, the glycine serine and threonine metabolism, primary bile acid biosynthesis and retinol metabolism were enriched in PSMD14 low expression and low-risk group phenotype (Figure 9B).
Seven genes expression is correlated with immune infiltration level
We investigated the expression of seven genes ( PSMD14, RBM45, PSMD1, OLA1, CCT6A, LCAT and IVD) in human HCC cell lines Hep3B and HepG2. Compared to the normal liver cell L02, IVD and LCAT were significantly downregulated, the PSMD14,PSMD1,RBM45, OLA1 and CCT6A gens were significantly upregulated in the Hep3B and HepG2 cells (Figure 10A).
In addition, we analyzed the correlation between seven prognostic genes (PSMD14, RBM45, PSMD1, OLA1, CCT6A , LCAT and IVD) expression and 6 types of infiltrating immune cells (B cells, CD4 T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells) in LIHC. The results showed that PSMD14 expression levels had a significantly positive correlation with infiltrating levels of B Cell (r = 0.245, P = 4.27e-06), CD8+ T cell (r = 0.205, P = 1.30e-04), CD4+ T cell (r = 0.266, P = 5.57e-07), Macrophage (r = 0.374, P = 9.16e-13), Neutrophils (r = 0.445, P = 3.30e-18) and Dendritic cells (r = 0.317, P = 2.33e-09). RBM45 expression levels had a significantly positive correlation with infiltrating levels of B Cell (r = 0.35, P = 2.39e-11), CD8+ T cell (r = 0.26, P = 1.07e-06), CD4+ T cell (r = 0.367, P = 2.12e-12), Macrophage (r = 0.461, P = 2.55e-19), Neutrophils (r = 0.471, P = 1.72e-20) and Dendritic cells (r = 0.419, P = 6.90e-16). PSMD1 expression levels had a significantly positive correlation with infiltrating levels of B Cell (r = 0.257, P = 1.31e-06), CD8+ T cell (r = 0.192, P = 3.50e-04), CD4+ T cell (r = 0.299, P = 1.55e-08), Macrophage (r = 0.393, P = 4.53e-14), Neutrophils (r = 0.447, P = 2.22e-18) and Dendritic cells (r = 0.355, P = 1.50e-11). OLA1 expression levels had a significantly positive correlation with infiltrating levels of B Cell (r = 0.359, P = 6.46e-12), CD8+ T cell (r = 0.27, P = 3.89e-07), CD4+ T cell (r = 0.328, P = 4.51e-10), Macrophage (r = 0.444, P = 6.18e-18), Neutrophils (r = 0.382, P = 1.98e-13) and Dendritic cells (r = 0.399, P = 1.93e-14). CCT6A expression levels had a significantly positive correlation with infiltrating levels of B Cell (r = 0.322, P = 9.24e-10), CD8+ T cell (r = 0.16, P = 3.06e-03), CD4+ T cell (r = 0.352, P = 1.72e-11), Macrophage (r = 0.462, P = 1.90e-19), Neutrophils (r = 0.367, P = 2.04e-12) and Dendritic cells (r = 0.313, P = 3.45e-09). None of the above five genes had no significant correlations with tumor purity ( P >0.05) (Figure 10B-G).
LCAT expression was significantly negatively related to tumor purity (r = -0.124, P = 2.12e-02) and had a significantly negative correlation with infiltrating levels of B Cell (r = -0.197, P = 2.35e-04), CD8+ T cell (r = -0.103, P = 5.59e-02), CD4+ T cell (r = -0.174, P = 1.16e-03), Macrophage (r = -0.244, P = 5.21e-06), Neutrophils (r = -0.202, P = 1.53e-04) and Dendritic cells (r = -0.194, P = 3.12e-04) (Figure 10F). IVD expression was significantly positive related to tumor purity (r = 0.121, P = 2.45e-02) and had a significantly negative correlation with infiltrating levels of B Cell (r = -0.173, P = 1.31e-03), CD8+ T cell (r = -0.097, P = 7.37e-02), CD4+ T cell (r = -0.135, P = 1.20e-02), Macrophage (r = -0.171, P = 1.49e-03) and Dendritic cells (r = -0.145, P = 7.53e-03) and no significant correlations with Neutrophils (r = -0.065, P = 2.271e-01) (Figure 10H).
Correlation of PSMD14 with the proportion ofTICs
To further confirm the correlation of PSMD14 expression with the proportion of tumor-infiltrating immune subsets was analyzed using CIBERSORT algorithm, and 21 kinds of immune cell profiles in LIHC samples were constructed. The results from the difference and correlation analyses showed that a total of six kinds of TICs were correlated with the expression of PSMD14 (Figure 11). Among them, three kinds of TICs were positively correlated with PSMD14 expression, including Macrophage M0, T cells CD4 memory activated and NK cells resting; three kinds of TICs were negatively correlated with PSMD14 expression, including B cells naïve, NK cells activated and T cells CD4 memory resting. These results further supported that the levels of PSMD14 affected the immune activity of TICs.