1. Identification of DEGs between normal and HCC tissues
The flowchart of this study was illustrated in Figure 1. The expression data of 59 CMGs, including 13 well-defined B7-CD28 family costimulatory molecules and 46 TNF family costimulatory molecules genes, were extracted from The Cancer Genome Atlas (TCGA) database after excluding TNFRSF6B for its low expression. The 59 costimulatory molecule-related genes expression levels were compared between HCC tumor and normal tissues, we identified 40 differentially expressed genes (DEGs) (P < 0.05). Among these DEGs, 11 genes (NGFR, TNFSF11, PDCD1LG2, CD274, TNFRSF1A, TNFRSF11B, TMIGD2, FAS, TNFRSF10D, TNFSF13 and CD86) were down-regulated while 29 genes (TNFRSF17, TNFRSF13B, CD276, TNFRSF12A, LTBR, TNFSF18, EDAR, TNFRSF14 , ICOSLG, RELT, CD28, ICOS, LTA, TNFRSF21, TNFRSF10C, VTCN1, TNFRSF11A, LTB, EDA2RC, TLA4, TNFSF9, TNFRSF25, PDCD1, CD70, TNFSF4, TNFRSF9, TNFRSF18, TNFSF15 and TNFRSF4) were up-regulated in tumor tissues (Figure 2A and Supplementary Table 1). The correlation among CMGs were analyzed. The relationships between each two of them were almost positively correlated, TNFRSF13C and TNFRSF13B were most correlated (Cor = 0.93) (Figure 2B). A protein–protein interaction (PPI) network was performed to further explore the interactions among these CMGs (Figure 2C). The minimum required interaction score was set at highest confidence 0.9. The resulted showed that TNF, CD28, CD40, CD80, CTLA4, LTA, TNFRSF10A and TNFSF13B were hub genes.
2. Consensus clustering of prognosis-related CMGs in HCC
A univariate Cox regression analysis was performed to primary selecting of the survival-related genes from 59 CMGs. A total of 11 CMGs were significantly linked to the prognosis of HCC patients (P < 0.05). Two genes were protective genes with hazard ratio (HR) < 1, while 9 genes were risk factors with HR > 1 among them (Figure 3A). To explore the associations between the expression of 11 prognosis-related CMGs and HCC subtypes, HCC patients were divided into two subgroups (Cluster 1: n=197, Cluster 2: n=173) according to consensus clustering analysis (Figure 3B and Supplementary Figure 1A). The PCA were analyzed to verify the reliability between different subgroups, and Cluster 1 and Cluster 2 could gather together and non-overlapped with each other (Figure 3C). We compared the OS between two subtypes to better understand the relationships between clustering results and survival outcomes, the Kaplan-Meier curves indicated that Cluster 1 had a better prognosis than Cluster 2 (P = 0.002, Figure 3D). The clinical features and two clusters were compared with a heatmap. The majority of 11 prognosis-related CMGs had higher expression in Cluster 2. These two clusters were different in grade (P < 0.01), but not with tumor stage, age and gender (Figure 3E). Furthermore, GSEA analysis showed that oncogenic pathways (apoptosis, G2M-checkpoint, IL2-STAT5-signaling, IL6-JAK-STAT3-signaling, inflammatory response, PI3K-AKT-MTOR-signaling, TNFA-signaling via NF-κB, unfolded protein response) were significantly enriched in Cluster 2 (Supplementary Figure 1B).
3. Construction and verification of costimulatory molecule-related risk signature
To narrow down candidate genes and construct the risk signature, the least absolute shrinkage and selection operator (LASSO) Cox regression analysis was performed in the training set, and 6 of 11 prognosis-related CMGs were identified (Supplementary Figure 2). The formula to calculate the risk score as follows: risk score = (0.25584 * TNFSF4) + (-0.29002 * TMIGD2) + (0.13379 * TNFRSF4) + (0.22009 * TNFRSF11B) + (0.40207 * TNFRSF11A) + (-0.78099 * CD40LG). We calculated the risk scores for every HCC patient in the training set according to the above formular. Patients in the training set were divided into high- and low-risk groups based on their median risk sore. A significant difference of OS was observed in different subgroups. High-risk patients had a poorer OS than low-risk groups (P < 0.001) (Figure 4D). Time-dependent receiver operating characteristic (ROC) analysis was used to evaluate the sensitivity and specificity of the risk signature. The areas under the curve (AUC) were 0.756 at 1-year survival, 0.791 at 3-year survival and 0.729 at 5-year survival (Figure 4E). We ranked the risk scores of patients and analyzed their distribution in the training set (Figure 4A). The survival status of HCC patients in the training set was showed on the dot plot (Figure 4B). The heatmap displayed the expressions of 6 prognosis-related CMGs between two risk groups (Figure 4C).
To determine the stability of the risk signature, we further verified the predictive capability in the test set and total set. The risk score was calculated for each patient in the test set and total set by the same formular obtained from the training set and the patients were classified into high- and low-risk groups. Similarly, Kaplan-Meier survival curve showed significantly difference in two risk groups among the test set. The OS of the high-risk groups was poorer than that of the low-risk groups (P=0.019) (Figure 4I). The 1-year AUC was 0.728, the 3-year AUC was 0.644 and the 5-year AUC was 0.654 (Figure 4J). The survival status, the distribution of the risk score and the expression heatmap of 6 prognosis-related CMGs in the test set were presented in Figure 4F-H.
The results in the total set were similar to the training set and test set. Patients in the high-risk group had a significantly shorter prognosis than patients in the low-risk group (P<0.001) (Figure 4N). In the total set, the AUC was 0.739 at 1 year, 0.708 at 3 years and 0.662 at 5years (Figure 4O). The distribution of the risk score, survival status and the expression patterns of 6 prognosis-related CMGs were showed in Figure 4K-M.
4. Independent prognostic value of the risk signature
We performed the univariate and multivariate Cox regression analyses to examine whether the risk score could act as an independent prognosis variable of HCC. Univariate Cox regression analysis showed that pathological tumor stage and risk score were significantly associated with the prognosis (Figure 5A, C, E). Multivariate Cox regression analyses further identified that the risk score was an independent prognostic factor for OS in the training set, test set and total set (Figure 5B, D, F).
5. Correlations between the risk signature and clinicopathological factors
The association between the risk model and clinical characteristics were analyzed. The heatmap displayed the expressions of 6 prognosis-related CMGs and the clinicopathological characteristics in high- and low-risk groups. The risk score was significantly correlated with histological grade, pathological T stage and clinical stage (Figure 6A). Patients were divided into subgroups according to the clinical variables. Differences in clinicopathological factors between high- and low-risk groups were showed in Figure 6B. The risk score was significantly higher in advanced grade, T stage and clinical stage. Meanwhile, the correlation between the risk score and clinicopathological features on OS was also explored. Survival analysis manifested that higher risk score were correlated with poor prognosis in most subgroups (Figure 6C).
6. Construction of a novel nomogram
We constructed a nomogram to predict the survival rates for HCC patients based on age, gender, histological grade, TNM stage, tumor stage and risk score (Figure 7A). The calibration curves of the nomogram indicated good consistency between the predicted survival rate and actual 1-, 3- and 5-year survival rate (Figure 7B). The AUCs of risk score and tumor stage were 0.739 and 0.671 in 1-year, 0.698 and 0.680 in 3-year, 0.638 and 0.663 in 5-year, respectively (Figure 7C). These findings suggested that the risk signature might be reliable to predict the OS for HCC patients.
7. Functional enrichment analyses based on the risk signature
To explore the potential biological processes for the prognostic risk signature, a total of 474 DEGs were obtained in high- and low-risk groups with the criteria FDR < 0.05 and |log2FC | ≥ 1. Among them, 63 genes were downregulated in high-risk group, while 411 genes were upregulated (Supplementary Table 2). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyses were carried out based on the DEGs. The results of GO analysis indicated that the DEGs were mainly related to nuclear division. KEGG analysis showed that DEGs were mostly enriched in cell cycle (Supplementary Figure 3).
8. Association between risk signature and tumor immune microenvironment
The differences of tumor-infiltrating immune cells between high- and low-risk groups were analyzed to explore the correlations between the prognostic risk signature and tumor immune microenvironment (TIME). Supplementary Figure 4 displayed the abundance of 22 immune cells between high- and low-risk subgroups. Among 22 immune cell types, memory B cells and macrophage M0 were positively correlated with the risk score, while the abundance of naïve B cells, plasma cells and regulatory T cells (Tregs) were significantly enriched in low-risk group (Figure 8A). Furthermore, the relative proportion of naïve B cells, resting memory CD4 T cells, activated memory CD4 T cells, regulatory T cells, gamma delta T cells, macrophage M1 and resting mast cells were significantly associated with OS (Figure 8B).
To further explore the relationship between the risk signature and immune status, we performed the expression profiles of 29 immune signature gens sets (16 types of immune cells and 13 immune-related pathways) in high- and low-risk groups using the single-sample gene set enrichment analysis (ssGSEA). The heatmap displayed the significant differences in immune status between high- and low-risk samples (Figure 9A). The low-risk subgroup showed higher levels of infiltration of immune cells and higher activity of immune-related pathways (Figure 9B). We found that the immune score and stromal score were higher in low-risk groups, while the tumor purity was significantly lower in low-risk subgroup (Figure 9C).
9. Differences in molecular characteristics between high- and low-risk groups
We evaluated the relationship between mutation characteristics and the risk signature in TCGA HCC patients with available somatic mutation data. TMB was higher in high-risk patients in spite of no significant difference (Figure 10B). We also identified the top 20 genes with the highest mutation rates in high- and low-risk subgroups (Figure 10A). Additionally, we explored the association between immunophenoscore (IPS) and risk signature to predict the potential clinical efficacy and the response to ICI therapy in HCC patient. The IPS, IPS-CTLA4, IPS-PD1-PD-L1- PD-L2, and IPS-PD1-PD-L1-PD-L2-CTLA4 blocker were significantly higher in low-risk group, implying that HCC patients with low-risk score could benefit more from ICI therapy than high-risk patients (Figure 10C).
10. Verification of prognostic CMGs
We verified the expression of the CMGs (TNFSF4, TNFRSF4, TMIGD2, TNFRSF11A, TNFRSF11B, CD40LG) in 43 pairs of tumorous and non-tumorous tissue specimens from patients with HCC using qRT-PCR analysis. The results of qRT-PCR showed that the expression of TNFSF4, TNFRSF4, TNFRSF11A and CD40LG was higher in HCC tissues compared to normal tissues. However, the mRNA expression of TNFRSF11B and TMIGD2 was higher in normal tissues, which was consistent with the results of bioinformatic analysis.