Identification of different cellular senescence-associated clusters
To comprehensively explore the expression patterns of cellular senescence-related genes in HCC, we downloaded the information of RNA sequencing samples and clinical information of 424 HCC patients from the TCGA database as the training set, and we downloaded the information of RNA sequencing samples and clinical information of 240 HCC patients from the ICGC database as the validation set. Based on the expression profiles of 281 cellular senescence-associated genes, we stratified 365 HCC patient samples in the training set into two different clusters (233 cases in Cluster 1 (C1), 132 cases in Cluster 2 (C2) by a nonnegative matrix factorization (NMF) algorithm(Figure 1A). In the training set, the survival of C1 patients was significantly better than that of C2 patients, and the survival curves are shown in Figure 1B. Subsequently, we further compared the differences between clusters in terms of basic clinical characteristics, and we found that C2 patients had a greater proportion of HCC patients with T3 and T4 stages compared with C1 patients, while patients with Grade classification G3 and G3 were significantly more represented in C1 than in C2 (Log-rank test,P<0.05)(Figure 1C- Figure 1F). The above results suggest that there is a relationship between the expression of genes related to cellular senescence and clinical characteristics T stage and Grade classification.
Differential gene analysis and functional enrichment analysis
To further investigate the differences in gene expression and biological processes involved between different clusters, we further performed differential gene analysis between different clusters, in which 123 genes were up-regulated in C1 and 2253 genes were up-regulated in C2(Figure 2A- Figure 2B). The GO enrichment analysis showed that the biological process (BP) of up-regulated genes in C1 was mainly enriched in small molecule catabolic process and carboxylic acid catabolic process; the most enriched cellular component (CC) was high-density lipoprotein particle and plasma lipoprotein particle. GO enrichment analysis revealed that the biological processes (BP) of upregulated genes in C2 were mainly enriched in cellular responses to copper ions and secondary metabolic processes; the most enriched cellular components (CC) were the multivesicular body membrane and apical plasma membrane(Figure 2C- Figure 2D). KEGG functional enrichment analysis showed that in cluster1 upregulated genes were mainly enriched in Chemical carcinogenesis - DNA adducts and Metabolism of xenobiotics by cytochrome P450 pathway, while in C2 .The upregulated genes in C2 were mainly enriched in Neuroactive ligand-receptor interaction, Protein digestion and absorption pathway(Figure 2E- Figure 2F). We then performed GSEA (KEGG) enrichment on the two clusters showing that in cluster1 the DRUG_METABOLISM_CYTOCHROME_P450 pathway and RETINOL_METABOLISM pathway were mainly enriched, while in C2 the OOCYTE_MEIOSIS pathway and PROGESTERONE_MEDIATED_OOCYTE_MATURATION pathway were mainly enriched(Figure 3A). Multiple enrichment analysis revealed that C2 is mainly involved in pathways associated with cellular senescence.
Differences in chemokine and energy metabolism-related genes among clusters
We further compared the expression differences of chemokine and energy metabolism-related genes in cluster1 and C2. we found that in C2, the expression of most chemokine-related genes was significantly higher than in C1(Figure 3B), especially CCL26, CXCL5, CXCL6, CXCL1(Figure 3C- Figure 3F). similarly, the we found that most of the energy metabolism-related genes were highly expressed in C2 (Figure 4A- Figure 4G). We further enriched the differential energy metabolism-related genes, and we found that these differential frontal energy metabolism-related genes were mainly enriched in xenobiotic metabolic process, cellular response to xenobiotic stimulus, and carbohydrate biosynthetic process pathways(Figure 5A- Figure 5B).The above results further suggest that there is a link between cellular senescence and tumor microenvironment, while cellular senescence can change the energy metabolic state to some extent.
Characterization of immune landscape in distinct cellular senescence clusters
Previous studies have demonstrated a relationship between cellular senescence and tumor immune infiltration in a variety of tumor types[18, 19]. We calculated the proportion of 21 immune cell species in each HCC sample using the R software CIBERSORT, while comparing the differences between immune cell components across clusters(Figure 5C). Particularly, we found that regulatory T cells (Tregs) was markedly elevated in C2. Subsequently, we applied the ssGSEA algorithm to determine the relative ratios of 28 immune cells and immune-related pathways in each HCC sample, comparing the differences in immune cell composition between clusters(Figure 5D- Figure 5E). We found that most immune cells were significantly enriched in C2, and interestingly, Myeloid-derived suppressor cells (MDSCs) cells (MDSCs) were significantly higher in C2 than in C1, which was concordant with previous observations linking C2 to an immunosuppressive phenotype. We further compared the effect of infiltrating immune cells on the survival of HCC patients, and we found that patients with highly infiltrated CD8T cells, M0 Macrophages, had a better prognosis; while highly infiltrated M2 Macrophages cells had the opposite result(Figure 5F- Figure 5H) .In addition, we calculated the StromalScore, ImmuneScore, ESTIMATEScore, and TumorPurity for each HCC sample using the ESTIMATE function of R software, and the ImmuneScore of C2 was significantly higher than that of C1, in agreement with the previous results(Figure 6A- Figure 6D). Overall, these results confirm that cellular senescence is associated with the tumor microenvironment, to some extent, contribute to the formation of an immunosuppressive microenvironment.
Landscape of tumor mutation and immune checkpoint-associated gene expression in different clusters
We compared the differences in somatic mutations and tumor mutational load (TMB) between clusters and showed that there was no significant difference in somatic mutation frequency and TMB between C1 and C2(Figure 7C), interestingly, C1 had the CTNNB1 gene as the major mutated gene while C2 had the TP53 gene as the major mutated gene(Figure 7A- Figure 7B). Immunomodulators play an important role in reshaping the tumor microenvironment. Therefore, to further explore the complex communication between immunomodulators, immune infiltration and cellular senescence, we explored the expression of immune checkpoint-associated genes among different clusters. We found that most immune checkpoint-related genes, including PD-L1, PD1, PD-L2, TIM-3were significantly upregulated in C2, further suggesting that the presence of an immunosuppressive microenvironment in C2(Figure 6E- Figure 6H). Finally, we used the TIDE score to assess the clinical effectiveness of immunotherapy across clusters. In our results, C2 had the lower TIDE score(Figure 7D- Figure 7F), implying that patients in C2 could benefit more from immunotherapy than C1.
Taken together, our comprehensive analysis showed that cellular senescence clusters are significantly associated with energy metabolism, chemokines, tumor microenvironment and patient prognosis , which may provide new insights into HCC classification system.
Construction of the cellular senescence score for overall survival in HCC patients
In order to better reflect the characteristics of C1 and C2, we constructed a risk score model to differentiate HCC patients. First, we performed differential gene analysis for C1 and C2 with the screening criteria of pvalue < 0.05 and logFC > 1 or logFC < -1. A total of 8341 differential genes were obtained. Subsequently, we selected 68 genes that overlapped with the set of genes associated with cellular senescence(Figure 7G). We identified 36 prognosis-associated cellular senescence genes using Cox univariate regression analysis(Figure 7H); we then used LASSO Cox regression analysis and multifactorial Cox regression analysis to build prognostic models(Figure 8A- Figure 8C). Four prognostic genes (CENPA, CXCL8, EZH2, and G6PD) were identified in the training set used to build the prognostic model. The prognostic risk score was calculated using the following formula: risk score = (0.22281 × CENPA gene expression) + (0.10830 × CXCL8 gene expression) + (0.19533 × EZH2 gene expression) + (0.18866 × G6PD gene expression). Patients in the training set were divided into a high-risk group and a low-risk group based on the median risk score. Patients in the low-risk group had significantly higher postoperative survival rates than those in the high-risk group(Figure 8F). We evaluated the predictive efficacy of risk scores on the prognosis of HCC patients using ROC curves(Figure 8D- Figure 8E). We further analyzed the clinical factors with risk score and found that risk score and T-stage were independent risk factors affecting the prognosis of HCC patients(Figure 8G- Figure 8H).We used the validation set to determine the model's robustness and prognostic value. On the basis of risk scores for four genes, the validation set's 240 patients were divided into a high-risk and a low-risk group. Kaplan-Meier survival curve analysis revealed that patients in the low-risk group survived significantly longer than those in the high-risk group (p<0.001) (Figure 9G- Figure 9H).
Meanwhile, we further explored the relationship between risk scores and different clusters, and we found that the high-risk score group was more responsive to the characteristics of C2; similarly, the low-risk score group responded to the characteristics of C1(Figure 9F). Therefore, we believe that the risk score of this model can well reflect the characteristics of cellular senescence in HCC.
Construction of integrated models to optimize risk stratification and survival prediction in HCC patients
To better predict the probability of survival in HCC patients, we created a predictive nomogram based on the integration of risk scores and other clinicopathological features(Figure9A). Age, gender, T-stage, N-stage, M-stage, and Grade stage were included. The calibration curves of the nomogram for 1, 3 and 5 year survival probabilities showed excellent agreement with the ideal performance(Figure 9B- Figure 9D). This indicates a high accuracy of our nomograms. In addition, the nomogram based on risk score showed the best ability to predict OS most strongly compared to other clinicopathological characteristics (age, gender, T stage, N stage, M stage and Grade classification) with a mean AUC higher than 0.6(Figure 9E).