Identification of the aging risk signature
Since the ESCA dataset from the TCGA has the largest sample size (N = 172) and complete clinical information, we, therefore, considered it as the discovery cohort to construct the aging risk signature. Univariate Cox regression analysis of the collected 1438 aging-relevant genes was performed with transcriptomic data of the TCGA dataset. Results demonstrated that 37 genes were linked with prognosis (all P < 0.05; Table S4). We then used the Lasso-Cox regression with 10-fold cross-validation to identify the aging genes that contributed most to the ESCA survival. The Lasso coefficient information between the log (λ) and the gene penal number was shown in Fig. 1A. The minimum deviance was observed when the gene number was selected as 22 (Fig. 1B). Finally, we chose 22 aging-relevant genes to establish a risk signature for ESCA survival assessment.
The determined 22 genes including SDCCAG3, ANXA5, ARHGEF18, VPREB3, MEOX1, RCAN3, EDAR, SYNE2, C22orf29, PCSK5, CHMP7, RPUSD4, MT1E, HLA-DOB, SREBF1, ATF3, LRRN3, HSPD1, HSPH1, KLRB1, GLA, and DYRK2. Their prognosis contribution values for ESCA patients were exhibited in Table S5. We constructed a risk signature to calculate the risk scores for each ESCA patient (Fig. 1C) according to the linear combination between the determined 22 gene expression levels and corresponding regression weights. The risk association plot of the calculated risk scores with survival times and statuses was shown in Fig. 1C. In addition, distinct expression distribution of the determined 22 genes in low- and high-risk groups was also presented with a heatmap (Fig. 1C).
To investigate the survival predictive capacity of the identified risk signature, we stratified ESCA patients from the discovery cohort into low-risk (N = 86) and high-risk (N = 86) groups. We found that patients of low-risk group had a significantly better overall survival than patients of high-risk group (Log-rank test P < 0.001; Fig. 1D). This link was still significant after controlling for age, sex, stage, grade, smoking status, and alcohol status in a multivariate Cox regression model (HR: 0.12 95% CI: 0.05–0.25, P < 0.001; Fig. 1E).
Corroboration of the aging risk signature
To validate the prognosis predictive roles of the constructed risk signature, we used the disease-specific survival (DSS) and progression-free survival (PFS) information from the TCGA ESCA cohort. Besides, two additional ESCA cohorts derived from the GEO were also used for validation. We observed that ESCA patients with a low-risk score exhibited both improved DSS and PSS as compared with high-risk patients in the TCGA cohort (Log-rank test both P < 0.001; Fig. 2A, C). The associations remained still significant when controlling for clinical confounding factors (multivariate Cox regression P < 0.001 and P = 0.003, respectively; Fig. 1B, D). In two GEO cohorts of GSE53624 and GSE53622, the significantly prolonged overall survival outcomes were also observed in patients with low-risk scores (Log-rank test P < 0.001 and P = 0.021, respectively; Fig. 2E, F).
The identified risk signature associated with preferable immunocyte infiltration and immune activity
Previous two studies have been demonstrated that aging and its relevant genomic traits were involved in the immune microenvironment and immunity [34, 35]. Hence, we speculated that the determined aging signature might regulate the immune cell infiltration and biological circuits associated with the immune activity. A heatmap according to ssGSEA algorithm was achieved to show the different infiltrated levels of 28 immunocyte types in low- versus high-risk groups in the TCGA ESCA cohort (Fig. 3A). Results revealed that increased infiltration of anti-tumor immune cell types, such as central memory CD4+ and CD8+ T cells, and effector memory CD8+ T cells were observed in ESCA patients of low-risk group (all P < 0.05). Moreover, the decreased infiltration of pro-tumor immunocytes, like neutrophils and regulatory T cells were noticed in the low-risk group (both P < 0.05). In addition, activated B cells and gamma delta T cells, which belong to the intermediate immune cell type were also markedly enriched in patients with low-risk scores (both P < 0.05). We also performed immunocyte infiltration analysis via the CIBERSORT method (Figure S1), and consistently, more infiltration of immune response cells (e.g., CD8 T cells, activated natural killer cells, and M1 macrophages) and less infiltration of immune suppressive cells (e.g., regulatory T cells and M2 macrophages) were observed in low-risk ESCA patients.
GSEA analysis against KEGG and HALLMARK databases was conducted to investigate the specific biological pathways involved in the aging signature. We found that immune response-relevant circuits in the KEGG database (e.g., T cell receptor signaling pathway and chemokine signaling pathway) and HALLMARK database (e.g., inflammatory response, interferon-gamma response, and IL2-STAT5 signaling) were markedly enriched in the ESCA low-risk subgroup (Fig. 3B, C).
ESCA patients of low-risk group had a significantly enhanced enrichment of T cell-inflamed signature, IFNγ signature, and cytolytic activity than those patients of high-risk group (Wilcoxon test all P < 0.05; Fig. 3D, E, F). Besides, the elevated enrichment of immune cell signature, immune signaling molecules, and cytokines/chemokines signature was also observed in low-risk patients (Wilcoxon test all P < 0.05; Figure S2).
The different expression levels of complete immune checkpoints between low- and high-risk subpopulations were calculated. Results showed that the common immune checkpoints (e.g., CD274, CD276, IDO1, LAG3, and PDCD1) were highly expressed in patients with a low-risk score (all P < 0.05; Figure S3).
TMB and SMGs linked with the determined aging signature
Tumor mutation burden (TMB) was revealed to be associated with tumor prognosis and immunotherapy response [36, 37]. We thus explored the connection between the aging signature and TMB. By using the somatic mutational profile of TCGA ESCA dataset, we calculated the TMB for each ESCA patient and noticed that patients with a low-risk score harbored a markedly enhanced TMB than patients with a high-risk score (Wilcoxon test, P < 0.001; Fig. 4A).
Mutational signatures are characterized by specific combination patterns of nucleotide substitutions and have been demonstrated to associate with TMB. Hence, we detected possible mutational signatures of ESCA samples by dividing the nucleotide substitution matrix under the NMF approach. Based on the cophenetic metric plot, four mutational signatures were determined (Fig. 4B). After annotating them with signatures from the COSMIC database (Fig. 4C), we identified these four mutational signatures as signature 1 (associated with age of cancer diagnosis), signature 4 (associated with smoking), signature 13 (associated with APOBEC enzyme activity), and signature 17 (the etiology of this signature remains unknown). Distinct mutational features of the above four signatures were exhibited in Fig. 4D. Mutational signature activity of all ESCA samples was calculated and illustrated in Fig. 4E and Table S6. Subsequent investigation indicated that low-risk scores were significantly associated with the elevated mutational activity of signature 1 (P = 0.036; Figure S4A) and decreased activity of signature 17 (P = 0.049; Figure S4B). No significant differences were found between two risk groups concerning signatures 4 and 13 (P = 0.829 and 0.779, respectively; Figure S4C, D).
To elucidate whether the enhanced TMB of low-risk group was affected by other clinical confounders, we added clinical features (i.e., age, sex, stage, smoking status, and alcohol status) and determined mutational signatures (i.e., signatures 1, 4, 13, and 17) into a multivariate logistic regression model. The connection between the low-risk scores and the elevated TMB was still remained (OR: 2.38, 95% CI: 1.13–5.69, P = 0.002; Fig. 4F).
Totaling 16 significantly mutated genes (SMGs) were determined by detecting the somatic mutation data of the TCGA cohort. Waterfall plot between low- and high-risk groups (Fig. 5) demonstrated a markedly different mutation frequency in TP53 [66 of 80 (82.5%) vs. 71 of 79 (89.9%); P = 0.038], NAV3 [3 of 80 (3.8%) vs. 11 of 79 (13.9%); P = 0.025], and FAT1 [2 of 80 (2.5%) vs. 6 of 79 (7.6%); P = 0.045].
Roles of the determined aging signature in assessing ICI treatment efficacy
The aforementioned results demonstrated that the established aging risk signature was connected with immune microenvironment and immunogenicity, we hypothesized that the aging signature may play vital roles in immune checkpoint inhibitor (ICI) treatment. Hence, a urothelial cancer anti-PD-L1 dataset (Imvigor210) with both gene expression profiles and immunotherapeutic information was used to investigate the connection between determined aging signature and ICI efficacy. Results revealed that patients of low-risk subgroup had a markedly better ICI prognosis as compared with patients of high-risk subgroup (Log-rank test P = 0.002; Fig. 6A). This association was remained even controlling for sex, baseline ECOG score, smoking status, immune phenotype, and platinum treatment in a multivariate Cox regression model (HR: 0.69, 95% CI: 0.53–0.91, P = 0.007; Fig. 6B). Subsequent analysis showed that a higher proportion of better ICI response status (i.e., CR and PR) was observed in the low-risk group (28.7% vs. 16.9%, P = 0.018; Fig. 6C). Besides, in this ICI-treated cohort, we also observed patients with low-risk scores exhibited an elevated TMB (Wilcoxon test P = 0.004; Fig. 6D). A trend of higher neoantigen burden was also noticed in the low-risk subgroup, although this association did not reach statistical significance (Wilcoxon test P = 0.083; Fig. 6E). Finally, we conducted immune infiltration analysis by using transcriptomic data of this immunotherapeutic cohort to compare the distinct infiltration levels between two risk subgroups. Consistently, the enhanced infiltration of immune response lymphocytes represented by CD8 T cells was significantly enriched in patients with low-risk signature scores (Figure S5).