Research Schematic Diagram
Figure 1 displayed the overall flow sketch map of this work.
Differential Expression Definition and Functional Annotation of PRGs
DEPRGs in the TCGA-LIHC cohort were extracted complying with the filtering criterion (|log2FC| > 1, p < 0.01), the results indicated that 90 PRGs were up or down-regulated in tumor tissues compared to the normal specimens in the training set. More specifically, 15 PRGs were up-regulated in normal specimens, while the remaining 75 PRGs were raised in HCC tissues relatively (Figure 2A). PPI analysis with high confidence (0.7) was performed to describe the functional association network of these DEPRGs (Figure 2B). Additionally, the correlation coefficients among DEPRGs were calculated that satisfied the correlation threshold (cutoff > 0.3), and as Figure 2C exhibited, most PRGs possessed a positive regulatory relationship, except for a negative association among PLG, ORM1, SPP2, HRAS, PPIA, and ALB. Subsequently, to make a better recognition of the potential biological functions of these DEPRGs, GO enrichment analysis was exerted and the results implied that the GO terms observably enriched by PRGs were “platelet degranulation and activation” in BP, “platelet alpha granule” in CC, as well as “growth factor activity” in MF, respectively, implying that these differential genes were closely related to platelets (Figure 2D).
Identification of Prognostic PRGs
Based on 90 DEPRGs, we further conducted the uni-cox regression analysis (p < 0.01) to obtain prognosis-associated PRGs. As the forest plot displayed in Figure 3A, 29 PRGs with prognostic values were identified, and except for SPP2 (HR: 0.876, 95%CI: 0.817-0.939) and GNA14 (HR: 0.558, 95%CI: 0.406-0.767), the remaining PRGs were regarded as risk factors for HCC patients in the training set. And their correlation network was also displayed in Figure 3B. Furthermore, the influence of prognostic gene expression on OS of HCC patients was displayed with KM survival curves more intuitively (Figure 3C).
Clustering Analysis of HCC Subtypes Based on 29 Prognosis-related PRGs
Based on 29 prognostic PRGs, we further utilized the unsupervised clustering analysis to recognize different HCC subtypes. And two distinct clusters (cluster C1: 239 cases, cluster C2: 131 cases) were determined in the TCGA-LIHC cohort (Figure 4A). Both OS and PFS time demonstrated that there was an obvious survival discrepancy between the two HCC subtypes, a poorer prognosis was observed in patients belonging to cluster C2 compared with those in cluster C1 (Figure 4B). Whereafter, the intrinsic connection between HCC clusters and clinicopathological parameters was analyzed, and Figure 4C displayed that cases in cluster C2 were closely associated with the higher expression of most PRGs and worse clinical features including T stage, tumor stage, and pathological grade, which also confirmed the poor prognosis of this HCC subtype. To further explore the discrepancies in functional pathways and immune features between different clusters, we performed GSVA and ssGSEA analyses in R software. As Figure 4D displayed, compared to cluster C1, the C2 subtype mainly enriched in KEGG pathways like Cell cycle, Homologous recombination, DNA replication, cancer-associated (like Bladder, Pancreatic, and Renal cell carcinomas) pathways, and signaling transduction axis including NOD-like receptor and mTOR signaling pathways, implying that cluster C2 might own a close association with the occurrence and evolution of HCC. Furthermore, employing GO and KEGG analyses to make a functional annotation of the DEGs identified between two different subtypes (|log2FC| > 2, p < 0.01) and the top 10 enriched terms of both were displayed respectively (Figure 4E-F). Moreover, according to the immune infiltrating scores calculated by the ssGSEA algorithm, several immunocytes (including aDCs, iDCs, pDCs, Macrophages, Th1 and Th2 cells, as well as Tregs) were observed that had a higher infiltrating level in cluster C2, while Mast cells had a higher infiltrating score in C1 subtype (Figure 4G). Similarly, most immunological functions like APC co-inhibition and co-stimulation, CCR, Check-point, HLA, MHC class I, Para-inflammation, as well as T cell co-inhibition and co-stimulation, showed a higher score in cluster C2, whereas Type I/II IFN Response were mainly enriched in cluster C1 (Figure 4H). As well known, the tumor immune microenvironment is tightly associated with tumor development and immunotherapeutic response, therefore, the differences in infiltrating immunocytes and immunological functions between two distinct HCC subtypes might be a key element determining the effectiveness of immunotherapy and prognosis of patients with HCC.
Development and Corroboration of Platelet-associated Risk Signature
In this section, 29 candidate prognosis-related PRGs obtained previously were first applied to establish a risk model via the lasso cox regression analysis (Additional file 3: Figure S1). Then a 12 PRGs-contained risk signature was constructed according to their respective gene expression and regression coefficients as the formula indicated below: Risk Score = (0.029 * PRKCD) + (0.066 * HRAS) + (-0.021 * SPP2) + (0.050 * TUBA4A) + (-0.370 * GNA14) + (0.035 * EGF) +(0.066 * GNG4) + (0.109 * CFL1) + (0.028 * PPIA) + (0.231 * GNA12) + (0.150 * OLA1) + (0.080 * ANXA5) (Additional file 4: Table S3). HCC patients in the training set were subsequently classified into two distinct risk groups hinging on the median risk score (185 cases at high risk and 185 cases at low risk), KM survival curve illustrated that high-risk patients suffered a worse OS compared with low-risk ones (Figure 5A). Moreover, similar results were further corroborated in the external datasets that the OS of high-risk patients both in the GSE14520 and ICGC-LIRI cohorts was significantly inferior to those at low risk (Figure 5B-C). And the differences in risk scoring distribution, survival status, and risk gene expression schema between distinct risk groups within the respective cohorts were also described in Figure 5D-F. By the way, the representative IHC staining results of these risk genes both in the normal and HCC tissues obtained from the online HPA database also proved their expression patterns from the protein level (Additional file 5: Figure S2). Additionally, t-SNE as well as PCA analyses also indicated that our prognostic model could effectively distinguish patients from distinct risk groups both in the training and external validation cohorts (Figure 5G-I). To estimate the efficiency of the risk signature in prognosis prediction, we counted the AUCs in the 1-, 2-, and 3-year ROC curves respectively. The results showed that the year-standardized AUCs in the TCGA-LIHC cohort were 0.772, 0.731, and 0.717, which were 0.705, 0.672, and 0.664 in the GSE14520 cohort, as well as 0.720, 0.708, and 0.706 in the ICGC-LIRI cohort, respectively (Figure 6A), implying that our risk signature had good accuracy and stability in prognosis assessment. Meanwhile, we also made a comparison between our risk signature with other published prediction models, including Yi’s signature (12), Wang’s signature (13), Su’s signature (14), Lin’s signature (15), and Zhang’s signature (16). The concordance index (C-index) was used to appraise the predictive capability of distinct risk signatures, and the restricted mean survival (RMS) time was applied to compare survival differences among groups distinguished by different risk models. It was observed that our risk signature had a higher C-index and a more noteworthy survival difference compared to other signatures, indicating that our prediction model also had some advantages in horizontal comparison (Figure 6B-C). Besides, both the 1-, 3-, and 5-year ROC curves as well as survival curves of these contrastive risk signatures were also displayed respectively (Figure 6D-E).
Clinicopathological Parameters Correlation Analysis
Subsequently, to explore the prognostic value of our signature in patients with different clinical characteristics, we further analyzed whether there were significant connections between risk groups and clinical features. The results illustrated that the high-risk score was closely correlated with worse tumor stage (Stage II-III), T stage (T2-3), and pathological grade (G3) (Figure 7A). The levels of pathological grade and stage increased with the elevation of risk scores, meanwhile, a conspicuous connection was also observed between risk scores and HCC clusters that cluster C2 with poor clinical outcome was markedly associated with the high-risk score (Figure 7B), and the linear relationship among clusters, risk groups, and patients’ statuses was described with Sankey plot in Figure 7C. In addition, the survival curves of patients with different clinical features in both risk groups were shown in Figure S3 (Additional file 6). And the association between risk scores and clinicopathological parameters in the GSE14520 and ICGC-LIRI cohorts was also analyzed and displayed in Figure S4 (Additional file 7). Additionally, the correlation among these risk genes both in the training and the other two external cohorts was displayed in Figure 7D and Figure S5 (Additional file 8). Subsequently, both uni- and multi-cox analyses were performed to assess the independent prognostic efficacy of risk scores, uni-cox analysis in the training set disclosed that both risk score and several other variables (tumor stage, T /M stage) were hazard factors for the poor prognosis, and multi-cox analysis further proved the ability of the risk score as an independent hazard factor (p<0.001, HR = 4.331, 95% CI: 2.571-7.293). Meanwhile, in the GSE14520 cohort, both BCLC stage (p = 0.032, HR = 1.516, 95% CI: 1.037-2.216), as well as the risk score (p = 0.048, HR = 1.665, 95% CI: 1.005-2.759) were demonstrated to be independent hazard factors via the multi-cox analysis, the analogous results were also discovered in the ICGC-LIRI cohort that the risk score (p=0.046, HR = 2.007, 95% CI: 1.012-3.984), tumor stage (p=0.002, HR = 2.074, 95% CI: 1.316-3.270), and pathological grade (p=0.028, HR = 2.216, 95% CI: 1.088-4.513) were verified to be independent hazard factors, while gender (p=0.003, HR = 0.304, 95% CI: 0.140-0.660) was thought as a protective factor for patients’ OS (Figure 7E). In a word, these findings indicated that our 12-gene-included risk signature was closely associated with clinical characteristics, which also had a fine predictive capacity and was expected to act as a potential prognostic indicator for HCC patients.
Functional Analysis Based on Platelet-related Risk Signature
Considering the potential discrepancies in biological processes and functional pathways between distinct risk groups in the TCGA-LIHC cohort, we exerted both GSVA and GSEA analyses to make an individualized investigation. The GSVA (GO part) results illustrated that the conspicuously enriched GO terms in patients at high risk were “regulation of protein localization and folding”, “regulation of cell cycle G2-M phase transition”, and “mitotic process” in BP, “DNA polymerase binding” and “ubiquitin-like protein conjugating enzyme activity” in MF, and “anaphase-promoting complex” in CC, implying that high-risk score was tightly connected with cell cycle and mitotic process, which might be implicated in the mediation of tumorigenesis and progression. While the primary GO terms enriched in patients at low risk included “metabolic-related process”, “coagulation and fibrinolysis process” in BP, “enzymatic activity” in MF, and “platelet dense granule lumen” in CC, indicating that low-risk score was tightly connected with platelet and metabolic related processes. Moreover, several noteworthy KEGG pathways like “cell cycle”, “DNA replication”, “RNA degradation”, as well as “ubiquitin-mediated proteolysis” were significantly concentrated in the high-risk group, by contrast, multiple metabolic-associated processes like “nitrogen metabolism”, “amino acid and lipid-related metabolism”, and “drug metabolism cytochrome P450” were primarily concentrated in the low-risk one (Figure 8A). Subsequently, GSEA analysis was used to validate the biological annotation obtained above. It was observed that the major KEGG pathways concentrated in patients at high risk were “cytokine-cytokine receptor interaction”, “ECM receptor interaction”, and “neuroactive ligand-receptor interaction”, whereas “complement and coagulation cascades”, “drug metabolism cytochrome P450”, as well as “fatty acid metabolism” were mainly concentrated in those at low risk (Figure 8B). GO analysis indicated that the top terms correlated with high-risk scores were “humoral immune response”, “phagocytosis”, and “immunoglobulin complex”, while “alpha amino acid catabolic process”, “drug metabolic process”, and “epoxygenase P450 pathway” were the main terms correlated with low-risk scores (Figure 8C). Additionally, 328 DEGs were screened out between two risk groups with the threshold |log2FC| > 2 and p < 0.01, and GO and KEGG analyses were also employed according to these DEGs and the results were displayed in Figure 8D. Similarly, the results of both GSEA and GO/KEGG analyses in the GSE14520 and ICGC-LIRI cohorts were also shown in Figure S6 (Additional file 9).
Investigation of Immune Characteristics
According to the immunological landscape depicted by multiple immune algorithms, an observable discrepancy in infiltrating immunocytes between distinct risk groups was discovered that the infiltrating level of immunocytes was positively associated with the risk score (Figure 9A). And ssGSEA method-based immune analysis further confirmed that most immunocytes (including aDCs, iDCs, Macrophages, Tfh and Th2 cells, as well as Tregs) were calculated with a higher infiltrating score in patients at high risk, inversely, other immunocytes like Mast cells and NK cells displayed a higher infiltration level in those at low risk. With regard to immunological functions, such as “APC co-stimulation”, “CCR”, as well as “MHC class I” were linked to high-risk scores, while functions like “cytolytic activity”, and “Type I/II IFN Response” were linked to low-risk scores (Figure 9B). These findings were generally consistent with the consequences obtained in the previous clustering analysis, implicating that these observations acquired in both subgroup analyses were convincing to an extent. Additionally, the correlation coefficients between risk PRGs and immunocytes were calculated respectively and described in Figure 9C. And the discrepancies in infiltrating immunocytes or immunological functions in the other two validation cohorts were also investigated and exhibited in Figure S7 (Additional file 10). Furthermore, the discrepancies in the expression level of immune checkpoint genes, as well as HLA genes which were connected with antigen presentation and immune response, were also displayed in Figure 9D. Nearly all of these genes showed a higher expression level in the high-risk group, indicating that there might be a potential difference in immune status of two different groups distinguished by individual risk scores.
Genetic Mutation and Drug-sensitive Analysis
Considering the potential value of somatic mutations in tumor progression and individual clinical treatment, we analyzed and depicted the somatic mutation landscape of patients in the TCGA-LIHC cohort in whole and different risk groups, separately. Waterfall plots respectively displayed the somatic mutation feature of patients in the entire cohort (Figure 10A), as well as those in two distinct risk groups (Figure 10B). Briefly speaking, patients at high risk were demonstrated to be accompanied by a higher mutation frequency compared with low-risk ones, and missense mutation was the most frequent mutation pattern in two risk groups. Specifically, TP53 with missense mutation was markedly frequent in patients at high risk, whereas AXIN1 with frameshift insertion was relatively common in patients at low risk, respectively.
TIDE, represents tumor immune dysfunction and exclusion, is an algorithm usually employed to evaluate the possibility of immune escape and then make a prediction of immunotherapy response (17). Here, we explored the relationship between the risk score and TIDE score to investigate the potential significance of the risk model in immunotherapy prediction. The results indicated that there was an obvious negative association between two variables (R = -0.31, p = 7.1e-10), patients at low risk had a higher TIDE score, implying that our risk signature possessed the capacity of immunotherapy prediction and patients at low risk might be more susceptive to immunotherapeutic compared with high-risk patients (Figure 10C). Subsequently, the correlation between risk scores and IC50 values of frequently used chemotherapeutic medicaments was analyzed to explore the feasible sensitive agents. As Figure 10D showed, a lower IC50 value of gemcitabine, doxorubicin, bortezomib, bleomycin, bicalutamide, bexarotene, mitomycin C, and obatoclax mesylate was observed in the high-risk group, implicating that patients at high risk might benefit from these chemotherapeutic agents, while patients at low risk might profit from the following drugs, including erlotinib, docetaxel, metformin, cyclopamine, bosutinib, axitinib, nilotinib, and gefitinib.
In vitro Functional Verification
To further investigate the interaction between tumor cells and platelets in vitro, the platelets from human peripheral blood were first extracted and isolated and then co-cultured with HCC cells (HepG2 and MHCC97H) for 48 h. As Figure 11A exhibited, the consequences of RT-qPCR demonstrated that the mRNA expression levels of PRKCD, HRAS, TUBA4A, EGF, GNG4, CFL1, PPIA, GNA12, OLA1, and ANXA5 were distinctly elevated after platelet stimulation, whereas the expression of SPP2 as well as GNA14 displayed a significant decreasing trend, which was precisely consistent with the expression patterns of these risk genes in our prognostic signature. Meanwhile,
the expression of N-cadherin (N-ca), Vimentin (Vim), and Snail was also increased obviously, whereas the expression of E-cadherin (E-ca) displayed an opposite trend both in HepG2 and MHCC97H cells, implying that platelet stimulation could induce epithelial-mesenchymal transition (EMT) in HCC cells (Figure 11B). Subsequently, considering the research-supported close relationship between PRKCD and platelets, we chose PRKCD as the target molecule for further exploration (18-21). As Figure 12A-B displayed, both the mRNA and protein expression levels of PRKCD were first knocked down by si-PRKCD transfection, and after 24h transfection, platelets were added and co-cultured with tumor cells for another 48h. Subsequently, we investigated the impacts of distinct transfection treatments (including si-NC (Control), si-PRKCD-1, si-PRKCD-2, si-PRKCD-1 + PLT, si-PRKCD-2 + PLT) on HCC cells’ vitality and motility. Both the CCK-8 assay and colony formation test illustrated that the proliferative vitality of two HCC cells was significantly restrained by PRKCD silencing, and analogously, this inhibitory influence could be corrected by platelet stimulation (Figure 12C-D). Moreover, the transwell assay illustrated that the invasion and migration capabilities of HepG2 as well as MHCC97H cells were markedly suppressed by PRKCD silencing, which could be reversed by direct co-culture with platelets (Figure 12E). And the wound healing test further demonstrated that the reduction of PRKCD could obviously attenuate the migration capacity of HepG2 and MHCC97H cells and this inhibitory influence was apparently reversed by the supplement of platelets (Figure 12F). These findings indicated that platelet stimulation could up-regulate the expression of PRKCD, which was further confirmed to be involved in mediating tumor malignant phenotypes through in vitro experiments, implicating that PRKCD was involved in platelet-induced HCC progression. While as a positive feedback loop, whether PRKCD could affect platelet activation remained unclear. Here, we hypothesized that the expression level of PRKCD in HCC cells could affect the activation of platelets in vitro. To confirm our assumption, the expression level of cellular PRKCD was first suppressed by si-PRKCD transfection (si-PRKCD-1 with better inhibition performance was selected in this section), and the PRKCD-silencing HCC cells were then utilized to stimulate the isolated platelets by direct contact, eventually, the activation level of platelets was analyzed by flow cytometric assay. The results demonstrated that PRKCD-silencing HCC cells could effectively inhibit the activation level of platelets compared to the negative control (Figure 13), which was a confirmation of our conjecture. In summary, these results preliminarily proved that our risk gene PRKCD could either participate in the modulation of malignant biological behaviors of cancer cells as well as regulate the activation of platelets in vitro, implicating that PRKCD might act as a key molecular bridge in the cross-talk between HCC cells and platelets and also take an essential part in the platelet-induced HCC progression.