The Landscape of Genetic and Transcriptional Alterations of DNA Methylation Regulators in CRC
As shown in Figure 1A, a total of 20 DNA methylation regulators, including 3 writers, 3 erasers, and 14 readers, were identified in this study. First, we summarized the incidence of somatic mutations and CNV in these 20 regulators in CRC. Overall, maftool analysis showed a low somatic mutation rate among the DNA methylation regulators. TET methylcytosine dioxygenases 1 and 3 exhibited the highest mutation rates (both 6%), whereas MBD3 exhibited the lowest mutation rate (0%) among the 20 regulators (Figure 1B). CNV was a ubiquitous phenomenon among the DNA methylation regulators in CRC, with most exhibiting CNV deletions, whereas DNMT3B exhibited an increased frequency of CNV amplifications. The CNV profiles of the 20 DNA methylation regulators are presented in Figure 1C and D. Significant synergy between mRNA expression and CNV alterations was observed among the DNA methylation regulators between normal and CRC samples. Compared with normal tissues, most of the DNA methylation regulators (except ZBTB38) with amplified CNV were expressed at markedly higher levels in CRC tissues (Figure 1E). Furthermore, the expression of all but one of the DNA methylation regulators (except MBD3) differed significantly among the distinct consensus molecular subgroups (CMSs) (Figure 1F). This analysis revealed that the expression of and epigenetic imbalance among DNA methylation regulators plays a pivotal role in the tumorigenesis and progression of CRC.
Prognosis and Immune Landscape of DNA Methylation Regulators in CRC
Kaplan-Meier survival curves demonstrated that 11 of the 20 identified regulators were associated with overall survival (OS) in the meta-GEO cohort (Figure S2A-K). Figure 2A summarizes the interactive relationship and prognostic significance of these 20 DNA methylation regulators in CRC. Our analyses revealed a significant cross-talk network existing not only between regulators of the same functional subtype but also between writers, erasers, and readers. In particular, the strong correlation observed between UHRF1 (reader) and DNMT1 (writer) indicated functional cooperation between these regulators.
We also investigated the relationship between DNA methylation regulators and the immune TME in CRC. Correlation analyses showed that the expression of many regulators (particularly DNMT3B) was significantly negatively correlated with the relative abundance of TIICs (Figure 2B). We therefore conducted a more detailed examination of the role of DNMT3B in immune-related processes, MSI status, and immunotherapy response.
Gene Set Enrichment Analysis results revealed that immune-related processes, including the T-cell receptor signaling pathways, were significantly enriched in the high-DNMT3B expression group (Figure 2C). A significant difference in OS between the high- and low-DNMT3B expression groups was observed in the meta-GEO cohort (P = 0.031, Figure 2D). Interestingly, we found that DNMT3B mRNA expression correlated with the mean methylation of promoter sites positively, not negatively in the traditional sense (Figure 2E). Furthermore, DNMT3B mRNA expression was significantly downregulated in the MSI-high (MSI-H) group (Figure 2F). DNMT3B mRNA expression also differed significantly among distinct CMSs, suggesting a role in promoting CRC heterogeneity (Figure 2G). In the IMvigor210 cohort, a greater proportion of CRC patients with high DNMT3B expression exhibited complete response (CR) or partial response (PR) than patients with low DNMT3B expression (31% vs. 17%, respectively, Figure 2H). Patients in the high-DNMT3B expression group exhibited a significant survival advantage after treatment with the PD-L1 inhibitor Atezolizumab (P = 0.034, Figure 2I). Overall, these results indicate a close relationship exists between the DNA methylation regulators, the TME, and immunotherapy response.
DNA Methylation Modification Patterns Mediated by the 20 Regulators in CRC
The above studies confirmed that DNA methylation is closely related to the prognosis and immune microenvironment of CRC patients. To build on these findings, we attempted to identify distinct DNA methylation patterns using consensus clustering based on the expression of those 20 regulators. Accordingly, three distinct modification pattern clusters, namely Cluster A (n = 481), Cluster B (n = 848), and Cluster C (n = 362), were identified in the meta-GEO and TCGA cohorts. The PCA plot of the 20 DNA methylation regulators revealed significant separation between the three pattern clusters (Figure 3A). Kaplan-Meier survival curves demonstrated a significant survival advantage for patients in Cluster B, whereas those in Cluster C exhibited a clearly poorer outcome in the meta-GEO cohort (P = 0.027, Figure 3B). Figure 3C shows a heatmap of the three DNA methylation modification pattern clusters annotated according to clinical and pathologic parameters.
GSVA was performed to explore how the three DNA methylation modification patterns affect biological behavior. Cluster B was markedly enriched in immune-related biological processes, such as the complement and coagulation cascades and cytokine-cytokine receptor interactions, whereas carcinogenic activation and stromal-related pathways, such as adherens junction and the MAPK signaling pathway, were significantly enriched in Cluster C (Figures 3D-F). We therefore further evaluated the relationship between DNA methylation modification patterns and immune phenotypes. Interestingly, despite increased infiltration of immune cells, particularly innate immune cells (e.g., eosinophils, myeloid-derived suppressor cells, macrophages, mast cells, monocytes, natural killer cells, and neutrophils), CRC patients in Cluster C did not exhibit a significant survival advantage (Figure 3G). A previous study showed that stromal activation in the TME can inhibit antitumor immune effectors (33). Single-sample gene set enrichment analysis confirmed that stromal-related processes, such as EMT1, EMT2, and the pan-fibroblast TGF- response signature, were significantly activated in Cluster C (Figure 3H). We also found that CD8+ T-effector cells and cytolytic activity were significantly enhanced in Cluster B, which was associated with higher antitumor activity and better prognosis. Collectively, these findings indicate that the three DNA methylation modification patterns are significantly associated with distinct clinical outcomes and immune landscapes. Cluster A represents an immune-desert phenotype characterized by low TIICs, whereas Cluster B represents an immune-inflamed phenotype marked by higher antitumor activity, and Cluster C can be considered an immune-excluded phenotype featuring stromal activation.
Construction of the DNA Methylation-Related Gene Clusters
Given the heterogeneity of DNA methylation modification, DEGs within the three clusters were identified to investigate the molecular behavior of each modification pattern in depth. A total of 1,183 DNA methylation modification pattern–related genes were identified within the three modification patterns (Figure 4A). Enrichment analysis demonstrated that these genes were primarily associated with RNA modification- and DNA methylation-related events (Figure 4B and C).
Univariate Cox regression analyses identified 270 genes with prognostic value. Based on the expression levels of these 270 genes, an unsupervised clustering analysis was conducted and identified three stable genomic subtypes in the meta-GEO and TCGA cohorts: gene.Cluster A (n = 591), gene.Cluster B (n = 845), and gene.Cluster C (n = 255). Survival analysis of the meta-GEO cohort revealed that gene.Cluster A and gene.Cluster B were associated with better prognosis, whereas gene.Cluster C was associated with the worst clinical outcome (Figure 4D). This clustering result was consistent with the DNA methylation modification patterns (Figure 4E).
In addition to MBD4, significant differences in expression between the three clusters were observed among all DNA methylation regulators (Figure 4F). Similar to the Cluster C modification pattern (immune-excluded phenotype), gene.Cluster C was associated with a wide range of both immune and stromal activation phenotypes (Figure 4G and H). These results further demonstrated that CRC is associated with three distinct DNA methylation modification patterns, which are represented by distinct clinical outcomes and immune landscapes.
Development of the DM_Score
Considering the individual complexity and heterogeneity of the DNA methylation modifications, we developed a scoring system to quantify the DNA methylation level of each CRC patient (i.e., the DM_Score). The CRC patients were divided into high- and low-score subgroups with optimal cutoff values of 0.915 and 1.985 in the meta-GEO and TCGA-CRC cohorts, respectively. Patients with a low DM_Score exhibited longer survival times than those in the high-DM_Score group, consistent with the meta-GEO (Figure 5A, P < 0.001) and TCGA-CRC (Figure 5B, P = 0.019) cohorts.
A Sankey diagram was generated to visualize the attribute changes in the DNA methylation modification patterns, gene clusters, DM_Score, and survival status for individual CRC patients (Figure 5C). Patients in Cluster C exhibited the highest median DM_Score, whereas those in Cluster A exhibited the lowest median DM_Score (Figure 5D, P < 2.2 × 10-16). Similarly, gene.Cluster C was associated with a significantly higher DM_Score than gene.Cluster A and gene.Cluster B (Figure 5E, P < 2.2 × 10-16). DM_Score was significantly associated with AJCC stage and survival status in the meta-GEO cohort (P < 0.01, Figure 5F and G). Becht et al. suggested that CMS2 represents immunologically “cold” tumors lacking significant lymphoid/myeloid infiltration and major histocompatibility complex expression (34). Among the four CMSs in CRC, CMS1 exhibited the highest DM_Score, whereas CMS2 exhibited the lowest DM_Score in TCGA-CRC cohort (Figure 5H).
To characterize the impact of DM_Score on the TME, we also investigated the correlation between DM_Score and TIICs. A heatmap was generated and showed that DM_Score was positively associated with vast majority of TIICs and stroma-related signatures in the meta-GEO cohort (Figure 5I). It is thus reasonable to conclude that a higher DM_Score represents a higher immune-activating phenotype, whereas a low DM_Score can be hypothesized to indicate an immune-desert phenotype.
Prognostic Value of the DM_Score
Univariate and multivariate Cox regression analyses confirmed that the DM_Score provided robust and independent prognostic value for predicting clinical outcomes in both the meta-GEO (Figure 5J and K) and TCGA-CRC (Figure 5L and M) cohorts. We then assessed the prognostic value of the DM_Score in terms of OS in CRC. Subgroup analyses revealed that the risk signature remained robust as an independent prognostic factor in terms of age (age ≥ 65 and < 65 years, Figure S3A and 4B), gender (male and female, Figure S3C and 3D), and AJCC stage (III + IV, Figure S3E). The OS of low-DM_Score patients was markedly superior to that of high-DM_Score patients in all clinicopathologic subgroups. These results further confirmed the utility of the DM_Score as a robust prognostic predictor in CRC.
Mutational Profile of TCGA-CRC Cohort DM_Score Groups
Accumulating evidence suggests that a close relationship exists between immunotherapeutic effect and somatic mutations (35, 36). We therefore contrasted the mutational landscapes between TCGA-CRC cohort high- and low-DM_Score groups using the R package maftools (Figure 6A and B) (37). We then analyzed the differences in mutation frequencies in common cancer driver genes between the high- and low-DM_Score groups. BRAF (20% vs. 2% [high vs. low DM_Score, respectively], P < 0.001, Figure 6C) and SMAD4 (18% vs. 8%, P = 0.003, Figure 6D) exhibited higher somatic mutation rates in the high-DM_Score group, whereas TP53 (50% vs. 77%, P < 0.001, Figure 6E) and APC (66% vs. 94%, P < 0.001, Figure 6F) exhibited a higher frequency of mutations in the low-DM_Score group. Pearson’s correlation analysis revealed that DM_Score was positively correlated with TMB (R=0.22, P = 8.7 ×10-7, Figure 6G), thus indirectly suggesting that patients with a high DM_Score would experience a durable clinical response to immunotherapy. These results provide more details about the relationship between DM_Score classification and genome somatic mutations and reveal the complexity of interactions between DNA methylation modification and individual somatic mutations.
Relationship between DM_Score grouping and classical immune and molecular subtypes
To further validate the relationship between DM_Score and the TME, we examined the protein expression levels of PD-L1, CD8A, and TGFB1 between high- and low-DM_Score patients in the Wuhan cohort. As shown in Figure 7A, compared with the low-DM_Score group, the high-DM_Score group exhibited higher PD-L1 and CD8A expression but lower TGFB1 expression.
The CMS classification describes the four molecular subgroups of CRC according to genetic and epigenetic perspectives: CMS1 (MSI Immune), CMS2 (Canonical), CMS3 (Metabolic), and CMS4 (Mesenchymal) (38). Figure 7B and C shows that the CMS2 subgroup was predominant in the low-DM_Score group and CMS4 in the high-DM_Score group in both the GSE39582 and TCGA-CRC cohorts (P < 0.001, χ2 test). CMS2 and CMS3 are reportedly correlated with reduced immune infiltration and immunoreactivity in CRC (39). The sum of the percentage of CMS2 and CMS3 was >80% in the low-DM_Score group, which again confirmed that CRC patients with a low-DM_Score can be classified as bearing immunologically “non-inflamed” tumors.
Thorsson et al. developed an immune subtype classification of 33 non-hematological cancer types using an integrated immunogenomics method (40). They identified and characterized six immune subtypes with potential guiding value for cancer management, particularly immunotherapy. Only four immune subtypes were identified in TCGA-CRC cohort, predominated by C1 (wound healing), which was almost evenly distributed among the two risk groups (Figure 7D). It was found that C2 was more highly infiltrated with activated CD8+, follicular helper T cells, regulatory T cells, and neutrophils, representing a “hot” TME immunological status (41). The proportions of C2 in the GSE39582 cohort were 28% and 11% in the high- and low-DM_Score subgroups, respectively (Figure 7E).
Role of the DM_Score in Predicting the Immunotherapy Response
Next, we investigated the suitability of the DM_Score as a biomarker to predict the efficacy of immunotherapies that have recently shown breakthroughs in cancer treatment, particularly anti-PD-1/PD-L1 therapy (42, 43). To date, TMB, MSI, and PD-L1/PD-1 mRNA have been the most widely used predictors of immunotherapy outcomes (44). Patients in the high-DM_Score group exhibited higher TMB in TCGA-CRC cohort, suggesting that they might benefit more from immunotherapy (P = 2 × 10-5, Figure 8A). IPS, a new machine learning-based marker, reportedly offers superior predictive capability for response to immune checkpoint blockade therapy in solid tumors (45). As we expected, the IPS was significantly higher in the high-DM_Score group, suggesting these patients might obtain greater benefit from anti-CTLA-4 and anti-PD-1 combination therapy (P = 0.011, Figure 8B).
The TIDE score was higher in the high-DM_Score group compared with the low-DM_Score group (P = 1 × 10-8, Figure 8C). Similarly, patients in the high-DM_Score group exhibited higher levels of PD-L1 mRNA both in TCGA-CRC (P = 8.3 × 10-16, Figure 8D) and meta-GEO (P = 6.8 × 10-9, Figure 8E) cohorts. In addition, CRC patients with MSI-H/deficient mismatch repair (dMMR) always exhibited a significantly higher DM_Score compared with CRC patients with microsatellite stable (MSS), MSI-low (MSI-L), or proficient mismatch repair (pMMR) status in both TCGA-CRC (P < 2.2 × 10-16, Figure 8F) and GEO GSE39582 (P = 2 × 10-8, Figure 8G) cohorts.
In the GSE78220 cohort, patients with a high DM_Score exhibited markedly prolonged survival after anti-PD-1 therapy (P < 0.001, Figure 8H). Meanwhile, all patients in the low-DM_Score group exhibited sustained disease progression after Pembrolizumab therapy, whereas 68% of patients in the high-DM_Score group achieved CR or PR (Figure 8I). In the IMvigor210 cohort, we calculated the DM_Score of each patient treated with the PD-L1 inhibitor Atezolizumab. Similar to the performance in the GSE78220 cohort, we verified longer survival expectation in the high-DM_Score subgroup (P = 0.018, Figure 8J). CRC patients achieving CR or PR exhibited a higher DM_Score than patients with stable disease or progressive disease (P = 0.0014, Figure 8K). As illustrated in a stacked bar plot, a higher percentage of CR or PR was observed in the high-DM_Score group than the low-DM_Score group (35% vs. 20%, Figure 8L). Taken together, the results from these analyses strongly suggest that DNA methylation modification pattern is correlated with immunotherapy response and that the DM_Score is useful for selecting patients who might benefit from anti-PD-L1/PD-1 immunotherapy.