Comprehensive Characterization of m6A Regulators-Mediated Methylation Modication Patterns and Tumor Microenvironment Signature in Breast Cancer

Recent studies have identied the epigenetic regulation in the tumor immune microenvironment (TME). Nonetheless, RNA N6-methyladenosine (m6A) modication mediated TME cell inltration in breast cancer is still unknown. Methods Herein, we systematically explored m6A modication TME inltration characteristics of 2249 breast cancer patients and comprehensively correlated the m6 A modication patterns with tumor immune landscape. The m6Ascore was established using principal component analysis. Multivariate Cox regression analysis was performed to evaluate the prognostic value of the m6Ascore. Results Three m6A modication clusters were identied. The TME cell inltration characteristics under m6A clusters were consistent with the tumor immune phenotypes (-excluded, -inamed and -desert). The assessment of m6A modication clusters was able to estimate tumor stages, stromal activity (angiogenesis, CD8+ T effector and EMT), clinical prognosis and genomic mutation. The genetic alternation mediated by m6Ascore was also associated with enhanced response to targeted therapy and immunotherapy. This study demonstrated that m6A regulator-mediated methylation modication plays an important role in TME modulation. Thus, comprehensive evaluation of m6A modication patterns may enhances our understanding of TME modulation and presents a new approach toward targeted therapy and immunotherapeutic strategies for breast patients. modication clusters. The white (consensus value = 0, samples never clustered together) and blue (consensus value = 1, samples always clustered together) heatmaps illustrate the sample consensus. Survival analyses of the three clusters based on 2249 patients with breast cancer: cluster A, 576 cases; cluster B, 839 cases; cluster C, 834 cases. Kaplan–Meier curves with log-rank P-value of 0.0082 indicate a signicant survival difference among the clusters. showed signicantly better OS than the other clusters. showing the activation states of biological pathways in the pathways;


Introduction
Breast cancer is the most commonly diagnosed cancer worldwide, and its incidence rates have increased rapidly in recent years [1]. Metastatic disease accounts for more than 90% of breast cancer-related deaths, and its molecular and clinical features are characterized by a high degree of heterogeneity, affecting disease progression and treatment response [2,3]. Therefore, further elucidation of the genetic and epigenetic aberrations underlying the intertumoral heterogeneity of breast cancer is still necessary.
The dynamic epigenetic modi cations on mRNAs are an important post-transcriptional regulation of gene expression [4,5]. N6-methyladenosine (m6A) methylation has been identi ed as one of the most common modi cation that mainly occurs in eukaryotic mRNAs [6]. m6A modi cation is mainly mediated by m6A methyltransferases ("writers"), binding proteins ("erasers"), and demethylases ("readers"), which is linked to RNA fate control by regulating pre-mRNA alternative splicing, RNA stability, and mRNA translation [7].
Accumulating evidence identi ed that m6A modi cation was associated with a broad impact on DNA damage response, circadian clock control, cell differentiation, and tumorigenesis [8,9,10,11]. However, the correlation between m6A regulators mediated methylation modi cation patterns and tumor progression and immunomodulatory abnormality in breast cancer remains unknown.
The tumor microenvironment (TME) comprises a complex milieu of non-malignant cells, including blood endothelial cells and lymphatic endothelial cells, mesenchymal cells, and immune cells, along with the extracellular matrix and in ammatory mediators they secrete. Tumor-in ltrating immune cells substantially in uence therapeutic responses and clinical outcomes [12,13,14]. Chang et al. reported that the m6A reader YTHDF3 promoted breast cancer brain metastasis by inducing the translation of m6A-enriched gene transcripts and controlling the interaction of cancers and the brain microenvironment [15]. The m6A RNA methyltransferases METTL3/14 regulate immune responses to anti-PD-1 therapy by decreasing cytotoxic tumor-in ltrating CD8 + T cells and increasing the secretion of IFN-γ, CXCL9, and CXCL10 in the TME [16]. The m6A demethylase ALKBH5 regulates anti-PD-1 therapy response by modulating lactate and suppressive immune cell accumulation in the TME [17]. However, the comprehensive role of TME cell in ltration mediated by multiple m6A regulators requires systematic clari cation.
In this study, we integrated the molecular and clinical information of 2249 breast cancer samples to comprehensively analyze the m6A modi cation patterns, and correlated the patterns with the TME cell in ltration characteristics. Three m6A modi cation clusters were identi ed, and we found that the TME characteristics under these three clusters were highly consistent with the immune phenotypes, indicating that m6A modi cation plays an important role in modulating TME characterization in breast cancer.
Moreover, we determined that the m6Ascore is an effective prognostic biomarker, and it appears useful for predicting the immunotherapy outcome in breast cancer.

Dataset acquisition and preprocessing
The gene expression data and clinical annotation of breast cancer were searched in Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) database. Patients without survival information were removed from further evaluation. In total, we used six eligible breast cancer cohorts (GSE20713, GSE42568, GSE20685, GSE25066, GSE124647, TCGA-BRCA [TCGA-Breast Invasive Carcinoma]) for further analysis. The raw CEL data of these six datasets were uniformly processed using the Robust Multichip Average algorithm for background correction and normalization [18]. Next, they were integrated into a meta-cohort using the ComBat function from the R sva package for batch removal [19].

Unsupervised consensus clustering
To identify m6A regulator-mediated m6A modi cation sub-clusters, tumor samples were clustered into sub-clusters with unsupervised consensus clustering based on the expression matrix of 21 m6A regulators using the R Consensus ClusterPlus package. To guarantee the stability of classi cation, the parameters were used for clustering as following: number of repetitions = 1000 bootstraps, pItem = 0.8, pFeature = 1.0, clustering algorithm = k-means method. The clustering that exhibited the most signi cant survival difference was considered.

Gene set variation analysis and gene set enrichment analysis
To investigate the biological processes difference between m6A modi cation sub-clusters, gene set variation analysis (GSVA) was performed to estimate the pathway enrichment scores using the R GSVA package [23]. The biological processes related to m6A regulators were further analyzed with gene set enrichment analysis (GSEA) using the R clusterPro ler package. The gene ontology (GO) gene sets of c2.cp.kegg.v7.1.symbols were downloaded from the MSigDB database. Adjusted P-values of <0.05 were considered signi cant.

TME estimation
The relative abundance of the various immune cells in the TME was quanti ed using the single-sample GSEA (ssGSEA) algorithm [24]. Stromal cell populations, including the angiogenesis information, CD8+ T effector signature, epithelial-mesenchymal transition (EMT), and pan-broblast TGF-β response (Pan-F-TBRS) signatures, in the TME were quanti ed as per the immune in ltration estimation (microenvironment cell populations' counter) method described by Becht et al [25].

Identi cation of differentially expressed genes
To identify m6A-related differentially expressed genes (DEGs), three distinct m6A modi cation clusters were classi ed based on the expression of the 21 m6A regulators. DEGs between the modi cation clusters were determined using the empirical Bayesian approach of the R limma package [26]. The signi cance criteria for determining DEGs were set as adjusted P-values of <0.05.

Construction of the m6A gene signatures
To quantify the m6A modi cation clusters of individual tumors, we constructed a set of scoring systems to evaluate the m6A modi cation clusters of individual patients with breast cancer: the m6A gene signature, also termed the m6Ascore. The m6A gene signatures were established as follows: The DEGs identi ed from the different m6A clusters were rst normalized, and the overlap genes were extracted. The patients were classi ed into groups with the unsupervised clustering method for deeper analysis of the overlap DEGs. The number of gene clusters and their stability were de ned using the consensus clustering algorithm. The prognostic analysis for each gene in the signature was evaluated using the univariate Cox regression model. The genes with signi cant prognosis were extracted for further analysis.
As per the method described by Zhang et al. and Teskey et al. [27,28], the m6A-relevant gene signatures were constructed using principal component analysis (PCA). After obtaining the prognostic value of each gene signature score, the m6Ascore of each patient was obtained with the gene expression grade index method: where i is the expression of m6A phenotype-related genes.

Estimation of immunotherapy sensitivity
Treatment sensitivity was estimated using the R pRRophetic package. Treatment sensitivity was determined by the concentration required for inhibiting 50% of cellular growth (IC50) [29,30]. The immunotherapy response was predicted based on the TIDE (tumor immune dysfunction and exclusion) database (http://tide.dfci.harvard.edu/).

Statistical analysis
The DEGs were analyzed using the R limma package. Correlation analysis between the TME-in ltrating immune cells and the expression of m6A regulators was conducted using the Spearman method.
Composition differences of >2 groups were calculated using one-way analysis of variance (ANOVA) and Kruskal-Wallis tests. The cut-off point of each dataset subgroup was determined using the R survminer package to construct the correlation between survival and m6Ascore. Further, patients were divided into high and low m6Ascore groups based on the maximally selected log-rank statistics. The survival curves for the prognostic analysis were generated via the Kaplan-Meier method, and the signi cance of differences was identi ed using log-rank tests. The hazard ratios (HR) for TME-in ltrating immune cells and patient survival were calculated using a univariate Cox regression model. Two-sided P < 0.05 was regarded as statistically signi cant. All data processing was done in R 3.6.1 software.
To assess the CNV alteration frequency of the 21 m6A regulators, a prevalent CNV alteration was analyzed, and showed that RBM15B, YTHDF2, WTAP, and ZC3H13 had the highest CNV ampli cation, with frequencies of 13.4%, 13.4%, 8.8%, and 8.3%, respectively. KIAA1429, IGF2BP1, YTHDF1, YTHDF3, and HNRNPC had a widespread CNV deletion at frequencies of 16.1%, 12.5%, 12.2%, 8.6%, and 8.0%, respectively (Fig. 1C). Analysis of the distribution of CNV alterations of the 21 regulators on 23 chromosomes showed that the distribution was irregular and diffused (Fig. 1D). Based on the expression of the 21 regulators, we were able to completely distinguish the breast cancer samples from the normal samples (Fig. 1E). To ascertain whether the above genetic variations in uenced the expression of the m6A regulators, we investigated the regulators' mRNA expression levels between the normal and cancer samples, and found that the CNV alterations could be the prominent factors resulting in the abnormal regulator expression, i.e., YTHDF2, WTAP, ZC3H13, IGF2BP1, YTHDF1, and HNRNPC (Fig. 1F). The above analyses reveal the high genetic and expression heterogeneity of m6A regulators in breast cancer. The abnormal expression of m6A regulators may play a crucial role in breast cancer tumorigenesis and progression.
Identi cation of m6A-related phenotypes in breast cancer Six independent BRCA datasets with available overall survival (OS) data and clinical information (GSE20713, GSE42568, GSE20685, GSE25066, GSE124647, TCGA-BRCA) were integrated into one metacohort (Additional le 1). We rst evaluated the m6A regulator interactions, m6A regulator connections, and prognostic values in 2049 breast cancer samples. Fig. 2A shows that HNRNPC, METTL3, and YTHDF2 were favorable factors of OS, while LRPPRC, YTHDC1, and YTHDF3 were risk factors of OS. The strongest positive correlation was observed for METTL14 and YTHDC1, and METTL3 and HNRNPA2B1 (P < 0.001). On the other hand, several erasers and writers showed negative correlations: KIAA1429 and METTL3 expression correlated negatively with YTHDC1 and METTL14. These results indicate that crosstalk among the 21 m6A regulators may play vital roles in the formation of m6A modi cation clusters and in the characterization of TME cell in ltration.
To classify patients with different m6A modi cation clusters based on writer, reader, and eraser expression (Additional le 2), three distinct modi cation clusters were identi ed with unsupervised clustering: cluster A (n = 576), cluster B (n = 839) and cluster C (n = 834) (Fig. 2B). Prognostic analysis of the three clusters indicated a particularly prominent survival advantage in cluster C (P = 0.0082) (Fig. 2C). TME cell in ltration characteristics in three m6A modi cation clusters We further conducted GSVA enrichment analysis to clarify the biological behaviors among the three m6A modi cation clusters. Fig. 2D shows that the stromal and carcinogenic activation pathways, i.e., the adherens junction, GAP junction, epithelial cell signaling, cell cycle, interaction, TGF-β signaling, cell adhesion, and p53 signaling pathways, were markedly enriched in cluster A vs. cluster B. Meanwhile, the focal adhesion, adherens junction, and cell cycle pathways were enriched in cluster A vs. cluster C. These results indicate that the three clusters are involved in the stromal and carcinogenic activation pathways.
To further know the effect of m6A regulators on TME cell in ltration, 23 TME-in ltrating cells among the three clusters were analyzed. Except monocytes, 22 immune in ltration cells were remarkably enriched (Fig. 3A). Univariate Cox regression analysis indicated that activated CD8 T cells (P = 0.019), CD56 dim natural killer cells (P = 0.0084), eosinophils (P = 0.00025), mast cells (P = 0.027), and natural killer cells (P = 0.027) were associated with the prognosis of the three clusters (Fig. 3B). The activation of stromal cells in the TME, such as cancer-associated broblasts, as well as the overexpression of immune checkpoint genes such as PD1/PDL1, can signi cantly dampen the anti-tumor immunity mediated by cytotoxic T cells [31,32]. Subsequent investigation con rmed that the m6A modi cation clusters were characterized by higher in ltration of stromal-related activities such as angiogenesis, EMT, CD8+ T effectors, and pan-broblast TGF-β (Fig. 3C). To determine hormone receptor expression, we analyzed the ER, PR, and HER2 statuses. m6A modi cation cluster C had a higher ER+ and PR+ rate than clusters B and C, and a higher percentage of triple-negative breast cancer (TNBC) (P < 0.001) (Additional le 3). Unsupervised clustering also revealed that the 21 regulators were signi cantly different among the three clusters (Fig. 3D). These results show that the DEGs in the m6A modi cation clusters may play a critical role in maintaining the TME.

Identi cation of m6A gene signatures
To investigate the potential biological behavior of each m6A modi cation cluster, 178 m6A phenotyperelated DEGs were overlapped using the R limma package (Additional le 4). GO enrichment analysis of the DEGs showed that the biological processes were remarkably associated with m6A modi cation and immunity, which further showed that m6A modi cation plays a non-negligible role in TME immune regulation ( Fig. 3E and F). To explore the potential mechanism, the 178 DEGs underwent unsupervised clustering analyses, which identi ed three distinct m6A modi cation genomic clusters: gene cluster A (n = 939), gene cluster B (n = 606), and gene cluster C (n = 704) (Fig. 4A). Prognostic analysis showed that the patients with breast cancer in gene cluster C had better prognoses, while patients in gene cluster B had poorer prognoses (Fig. 4B). We observed prominent differences in m6A regulator expression in the three gene clusters (**P < 0.01, ****P < 0.0001), which was similar to the results of the m6A methylation modi cation clusters (Fig. 4C).
To accurately predict the pattern of m6A methylation modi cation in individual patients, a set of scoring systems (m6Ascore) was used to quantify the m6A modi cation pattern of individual patients with breast cancer based on the phenotype-related genes. The attribute changes of the patients were illustrated using an alluvial diagram (Fig. 4D). To better illustrate the characteristics of the m6A signature, we also tested the correlation between the known signatures and the m6Ascore (Fig. 4E). Fig. 4F shows that gene cluster B had the highest median m6Ascore, while gene cluster C had the lowest median score (****P < 0.0001). These results indicate that a low m6Ascore could be closely linked to the immune activation-related signatures, whereas a high m6Ascore could be linked to the stromal activation-related signatures. More signi cantly, compared to the other m6A modi cation clusters and to m6A modi cation cluster C, which had the lowest median score, m6A cluster B had a signi cantly increased m6Ascore (****P < 0.0001) (Fig.  4G). Analysis of stroma-related pathway activity indicated that high scores were signi cantly associated with enhanced stromal pathway activation ( Figure 4H). These results reveal that a low m6Ascore is signi cantly associated with immune activation and that a high m6Ascore is associated with stromal activation. The m6Ascore was able to better discriminate the m6A modi cation patterns of breast cancer, and identify TME cell in ltration characterization.

Clinical characteristics of m6Ascore phenotypes
To determine the clinical characteristics of the m6Ascore, we rst analyzed the distribution of the m6Ascore and OS status of patients with breast cancer (Fig. 5A). The m6Ascore was calculated for all patients, and the m6Ascore score breakpoint was used to classify patients into high m6Ascore (n = 1124) and low m6Ascore groups (n = 1125). A high m6A score was associated with poor survival, and a low m6A score was associated with favorable survival (Fig. 5B). We evaluated the difference in m6Ascore between histological subtypes, and found that the ER+, PR+, HER2-, and non-TNBC subtypes were characterized by better prognosis and correlated signi cantly with lower m6Ascore, whereas the ER-, PR-, HER2+, and TNBC subtypes were correlated with a higher m6Ascore (Fig. 5C). To explore the clinical prognosis value, Kaplan-Meier curve analyses suggested that patients in the low m6Ascore group had significantly better OS probability than those in the high m6Ascore group in TCGA-BRCA (P = 0.0047), GSE42568 (P = 0.00064), GSE124647 (P = 0.00015), GSE25066 (P < 0.0001), and GSE20685 (P = 0.0026) (Fig. 5D-H), even though there were no survival differences between the two m6Ascore groups in GSE20713 (P = 0.21) (Fig. 5I).

Genomic mutations of the m6Ascore phenotypes
To determine the genetic and clinicopathological characteristics of patients in the m6Ascore groups, we rst compared the CNVs in the m6Ascore groups. Fig. 6A and B showed the distribution of the segment score across all chromosomes; there was a signi cant difference for chromosome 8, 11, and 17. We then compared the proportion of the genome signi cantly altered on the three chromosomes, and found that the high m6Ascore group showed increased levels of CNVs than the low m6Ascore group. Identi cation of the somatic mutation landscapes in the high and low m6Ascore groups (Fig. 6C and D) showed that the TP53-mutated group had more patients with high m6Ascore than patients with low m6Ascore. The patients with high m6Ascore had a signi cantly higher level of tumor mutation burden (TMB) than the patients with low m6Ascore, with the rate of the tenth most signi cant mutated gene being 8% versus 5%. As TMB is closely related to clinical targeted therapy and immunotherapy, it is important to identify the patient subgroup that could be more sensitive to certain drugs.
Targeted therapy and immunotherapy sensitivity between patients with high and low m6Ascore We predicted the treatment response to several commonly used targeted therapy drugs in the high-and low-m6Ascore patients with breast cancer. Fig. 6E and F showed that the high m6Ascore group in the breast cancer cohorts was more sensitive to lapatinib and sorafenib (****P < 0.0001). We investigated whether the m6A modi cation signatures could predict the patient response to immune checkpoint blockade therapy based on the TIDE prediction. The high m6Ascore group had lower TIDE expression, which indicated that the group is much more sensitive to immunotherapy (****P < 0.0001) (Fig. 6G). Further research implied that quantifying the m6A modi cation patterns is a potential and robust biomarker of prognosis and clinical response assessment of immunotherapy (Fig. 6H). This result could yield more clues for determining personalized treatment strategies for patients with breast cancer.

Discussion
The dysregulation of epigenetic modi cations is a common feature of most human cancers, including breast cancer [33,34,35]. Apart from DNA methylation and protein phosphorylation or histone modi cations, RNA m6A modi cation has been identi ed as another important layer of epigenetic modi cation in transcription and translation [36], which are involved in regulating RNA stability, RNA export, protein translation, and stem cell renewal [4,37]. The alteration of epigenetic m6A modi cation is often associated with tumorigenesis and cancer progression. However, recent studies have mostly focused on single TME cell types or single m6A modi cation regulators; the overall TME in ltration characterization mediated by the integrated roles of multiple m6A regulators is not comprehensively recognized.
In the present study, we comprehensively identi ed three distinct m6A methylation modi cation clusters in breast cancer based on 21 m6A modi cation regulators. We found that the three clusters had differential TME cell in ltration characteristics. Cluster A was identi ed as innate immunity and stromal activation, corresponding to the immune-excluded phenotype, which is signi cantly associated with stromal activation status, including the angiogenesis, EMT, and TGF-β pathways; cluster B was identi ed as the suppression of immunity, corresponding to the immune desert phenotype, which is signi cantly associated with immune tolerance and ignorance, and the lack of activated and priming T cells [38]; cluster C was identi ed as adaptive immunity activation, corresponding to the immune-in amed phenotype, which is associated with a large amount of immune cell in ltration in the TME [39,40]. Tumors lacking the in ltration of immune cells, particularly CD8 + T cells, are usually de ned as cold tumors, and such tumors are often associated with fewer bene ts from immune checkpoint inhibitors [41]. Prognostic analysis showed that, compared to cluster A and C, cluster B had the immune desert phenotype and poorer prognosis. This could provide novel checkpoints for m6A modi cation pattern mediation of the TME.
Biological pathway analysis of the DEGs among the three distinct m6A methylation modi cation clusters showed that the m6A-related gene signatures were associated with various activation of immune pathway. Similar to the clustering results of the m6A modi cation clusters, three genomic subtypes were identi ed based on the m6A gene signatures, which were also signi cantly correlated with stromal and immune activation. To reduce the heterogeneity of the three genomic subtypes, we established a scoring system based on the m6A-related genes to evaluate the m6A gene signatures. Higher m6Ascores exhibited an immune-excluded phenotype, and vice versa. Integrated analyses also revealed that the m6Ascore was an independent biomarker of poor prognosis in breast cancer. The ER + and PR + histological subtypes, i.e., sensitive to checkpoint endocrine and immunotherapy therapy, were signi cantly related to higher m6Ascore. Therapy sensitivity analysis showed that the lower m6Ascore group was more sensitive to targeted therapy and immunotherapy. Therefore, the ndings from the m6Ascore could provide more clues into personalized treatment for breast cancer.

Conclusions
In conclusion, we comprehensively evaluated the RNA methylation patterns and TME characterization of m6A regulators in breast cancer and highlighted the indispensable role of m6A modi cation in shaping the heterogeneity and complexity of the TME. Targeting m6A modi cations could be a promising strategy against breast cancer.

Availability of data and materials
The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.

Declarations
Availability of data and materials The datasets used and analyzed during the current study are available from the corresponding author on reasonable request.  and normal (blue) tissues. *P < 0.05; **P < 0.01; ***P < 0.001; ns, not signi cant. modi cation clusters. The white (consensus value = 0, samples never clustered together) and blue (consensus value = 1, samples always clustered together) heatmaps illustrate the sample consensus. (C) Survival analyses of the three clusters based on 2249 patients with breast cancer: cluster A, 576 cases; cluster B, 839 cases; cluster C, 834 cases. Kaplan-Meier curves with log-rank P-value of 0.0082 indicate a signi cant survival difference among the three clusters. Cluster C showed signi cantly better OS than the other two clusters. (D and E) GSVA enrichment analysis showing the activation states of biological pathways in the three clusters. The biological processes are visualized with the heatmap: red represents activated pathways; blue represents inhibited pathways. The breast cancer cohorts were used as sample annotations: m6A-cluster A vs. m6A-cluster B (D); and m6A-cluster A vs. m6A-cluster C (E). regulators in the three clusters. Red represents high regulator expression; blue represents low regulator