Genome-wide DNA methylation and expression patterns of microRNAs in relation to breast cancer subtypes among American women of African and European ancestry

Background: Aggressive high-grade, estrogen receptor negative (ER-) breast cancer is more common among American women of African ancestry (AA) than those of European ancestry (EA). The reasons remain largely unknown. Epigenetic mechanisms, particularly DNA methylation and altered microRNA (miRNA) expression, may contribute to racial differences in breast cancer. However, few studies have specically characterized genome-wide DNA methylation-based modications at the miRNA level in relation to ER+ and ER- breast cancer, and their functional role in the regulation of miRNA expression, especially among high risk AA women. Methods: In this study, genome-wide DNA methylation and miRNA expression proling was performed using the Illumina Innium HumanMethylation450 Bead Chip platform and miRNA sequencing (miRNA-seq) in breast tumors from both AA and EA women. Results: The genome-wide methylation screen identied a total of 7,191 unique CpGs mapped to 1,292 miRNA genes, which correspond to 2,035 unique mature miRNAs. Cluster analysis of these miRNA-associated methylation loci showed a clear pattern of ER-subtype differences. We identied differentially methylated loci (DMLs: (|delta β|)>0.10, FDR<0.05) between ER- and ER+ tumor subtypes, including 290 DMLs shared in both races, 317 and 136 were specic to AA and EA women, respectively. Integrated analysis of DNA methylation and corresponding miRNA expression identied certain DMLs whose methylation levels were signicantly correlated with the expression of relevant miRNAs, such as multiple CpGs within miR-190b and miR-135b highly negatively correlated with their expression. Further target prediction and pathway analysis showed that these DNA methylation-dysregulated miRNAs are involved in multiple cancer-related pathways, including cell cycle G1-S growth factor, cytoskeleton remodeling, angiogenesis, EMT, and others such as signal transduction TGF-β, Wnt, NOTCH, and ESR1-mediated signaling pathways. Conclusions: Our results suggest that DNA methylation changes within miRNA genes are associated with altered miRNA expression, which may contribute to the network of subtype- and race-related tumor biological differences in breast cancer. These ndings shed light on the epigenetic regulation of miRNA expression and provide insights into the relations of clinical-relevant miRNAs to their target genes and to serve as potential preventative and therapeutic targets.


Introduction
Breast cancer is a heterogeneous disease, comprised of different clinical subtypes linked to disparate prognosis. However, as proposed by Anderson and colleagues (1), there are two primary etiologic subtypes, each associated with a distinct set of risk factors and genetic pro les that essentially correspond to estrogen receptor positive (ER+) and negative (ER-) disease, as an important marker for treatment options and prognosis. Compared to women diagnosed with ER + breast cancer, those with ERtumors in general have a poor prognosis, partly because of their aggressive phenotype and the lack of targeted therapy. High-grade, ER-breast cancer is more common among American women of African ancestry (AA) than those of European ancestry (EA). However, much less is known about the underlying causes of increased risk of ER-breast cancer in AA women.
Changes in DNA methylation have been recognized as one of the most common molecular alterations in cancer (2). Aberrant methylation patterns have been frequently reported in breast cancer, with studies primarily focused on promoter regions of protein-coding genes (PCGs) (3)(4)(5)(6). Speci c methylation patterns have been associated with high tumor grade and ER negativity of breast cancer (4,(7)(8)(9). In our own studies and others, analyses restricted to CpG dinucleotides (CpGs) within and nearby PCGs, show differences in DNA methylation patterns in tumors by ER status and between AA and EA women (10)(11)(12)(13). These results suggest that aggressive breast cancer subtypes may be related to altered gene methylation, and that methylation patterns may differ by ancestry.
Although it is well known that more than 75% of the genome is actively transcribed, yielding a large number of non-coding RNAs (ncRNAs), the genome-wide methylation patterns of ncRNAs in breast cancer remain largely unknown. MicroRNAs (miRNAs) are a well-characterized class of small ncRNAs that are key regulators of PCG expression, through induction of mRNA degradation or interference with mRNA translation (14). It is estimated that miRNAs regulate more than 50 percent of human genes (15) and are found to play an essential role in the regulation of many biologic processes in the development of cancer, including cell differentiation, apoptosis, immunity, and proliferation (16). Speci cally in breast cancer, miRNA expression patterns in tumors have been linked to pathologic presentation and tumor subtypes (17)(18)(19), and speci c miRNAs have been found to play essential roles in breast cancer invasion and metastasis (18,20,21). Moreover, in a recent study, we observed differential miRNA expression patterns by ER status and between races (22).
As current research has largely focused on the regulatory roles of miRNAs in cancer, their own regulation is not fully understood. Recent studies have shown that, similar to PCGs, miRNA encoding genes can be regulated by aberrant DNA methylation, and hence contribute to carcinogenesis (23,24). Given that a single miRNA can regulate expression of multiple target genes, alteration of methylation patterns of miRNAs may have a much broader effect than at other loci. A number of studies have reported altered methylation of miRNA genes in breast cancer, however, most of these studies have been limited by one or more key factors, such as focusing on only a small number of candidate miRNA genes, few studies on human breast tumor tissues, and lack of data on direct correlation between methylation and expression levels (25). Furthermore, there are no studies that have speci cally evaluated genome-wide methylation patterns of miRNAs in relation to differences in ER + and ER-breast cancer tissues from both AA and EA women. Taken together, identi cation and quanti cation of DNA methylation-based characteristics at the miRNA level in ER + and ER-tumors may improve our understanding of the molecular basis for the phenotypic differences between these two primary subtypes of breast cancer. In addition, examining DNA methylation changes in aggressive ER-tumor commonly observed in AA women may provide insights into underlying causes of breast cancer racial disparities.
In the current study, we characterized genome-wide DNA methylation patterns of CpGs within loci encoding miRNAs in breast tissues from both AA and EA women. We examined DNA methylation alteration patterns in ER + and ER-tumors for both races and compared the variability within and across race. For a subgroup of tumors, we previously performed genome-wide miRNA expression pro ling using next-generation sequencing, which allowed us to directly integrate DNA methylation and miRNA expression data to examine whether DNA methylation was associated with altered gene expression of corresponding miRNAs.

Materials And Methods
Tissue samples and DNA methylation assay As described previously, we conducted genome-wide DNA methylation pro ling of fresh-frozen breast tumor samples from 58 AA and 80 EA women using the Illumina In nium HumanMethylation450 Bead Chip platform, with subsequent analyses restricted to CpG loci within or near PCGs (10). In the current analysis of these same samples, we focused on CpGs within loci encoding miRNAs. Brie y, breast tumor tissues were collected from patients who were treated at Roswell Park Comprehensive Cancer Center. Genomic DNA was isolated from banked specimens and stored at -80 °C until use. All samples were linked with detailed clinical information by the Roswell Park Biomedical Data Science Shared Resource, with tumor characteristics presented in Table 1. DNA methylation data processing and analysis Hybridized and processed arrays are scanned, and the raw intensity is then extracted and processed by the min R package. We applied rigorous sample and locus speci c quality control criteria, SWAN normalization, and correction for batch effects (26). We removed samples with poor detection p-values using the IMA package (27). Probes that were ambiguously mapped and those shown to contain SNPs were also excluded from the analysis (28,29), leaving the nal dataset contained 276,108 CpG loci.
A CpG locus to be associated with miRNA coding regions was de ned as those located within 5Kb window of a miRNA gene. The unsupervised consensus clustering was performed on all unique miRNAassociated probes to examine overall methylation patterns in breast tumors, using the ConsensusClusterPlus R package (Bioconductor) with partitioning around medoids algorithm. The nal consensus clustering was given by combining the ward.D linkage method and the Pearson correlationbased distance measurement. The Wilcoxon rank-sum test was used to evaluate the statistical signi cance for each probe in each of the comparisons. To adjust for multiple comparisons, the false discovery rate (FDR) was computed using the Benjamini and Hochberg approach. Differentially methylated loci (DML) were de ned as CpGs with an absolute mean β-value difference (|delta β|) at least 0.10 between subgroups and FDR-adjusted p < 0.05. miRNA expression pro ling and data processing Among breast tumors included in the DNA methylation assay, miRNA-seq was performed on a subset of tumors, including 58 tumors (29 AA and 29 EA women). As described previously (22), 1 µg RNA was used for small RNA cDNA library preparation and then samples were sequenced on the Illumina HiSeq 2500 platform using high output 50-cycle single read sequencing. The miRNA quanti cation data were generated including expression levels for the known pre-miRNAs and mature miRNAs for all samples. The Cancer Genome Atlas (TCGA) data processing and analysis To validate our ndings, TCGA DNA methylation and miRNA expression data were downloaded from GDAC Broad Institute (http://gdac.broadinstitute.org/runs/stddata 2016_01_28/data/BRCA/20160128/).
Data generated using the same platform as the current study were included in this analysis. Patient and breast cancer clinical features were retrieved, such as age, race, and ER status. There were 537 TCGA samples with DNA methylation and miRNA expression data for the current analysis, with tumor characteristics described in Supplementary Table S1. All data were processed and analyzed following the same pipeline as used in the current study (above). DNA methylation status and association with miRNA expression To examine whether DNA methylation status is associated with expression of corresponding miRNAs, we paired each CpG site with the corresponding miRNA, de ned as probes located within 5Kb of a miRNA gene, as one CpG-miR Pair. Unique CpG-miR pairs were identi ed, in which one miRNA can be paired with multiple CpG sites, and similarly one CpG site can be paired with multiple miRNAs. Spearman's nonparametric correlation analysis was used to compute the correlation e cient (rho) between methylation levels (beta values) and miRNA expression (log counts per million, logCPM) for each pair.
Correlation analyses were carried out separately for tumors from AA and in EA women.
miRNA target gene prediction and pathway analysis For several top candidate CpG methylation-correlated-miRNAs, we further performed target prediction and pathway analysis using MetaCore (Clarivate Analytics), an integrated knowledge-based platform for comprehensive functional analysis of OMICs data and gene list. This method is based on a high-quality, manually curated database of molecular interactions, molecular pathways, and gene-disease associations. Speci cally, in this analysis, we identi ed target genes of these miRNAs and their enriched biological processes and pathways within gene sets. MetaCore employs a hypergeometric model to determine signi cance of functional relationships. Functions with FDR-adjusted p < 0.05 were considered signi cant.

Results
DNA methylation alterations at miRNA-associated loci in breast tumor tissue from AA and EA women Based on the criterion that a CpG locus to be associated with a miRNA gene was de ned as those located within 5Kb window of a given miRNA locus, a total of 7,191 unique CpGs were identi ed. These CpGs were mapped to 1,292 miRNA genes, which correspond to 2,035 unique mature miRNAs based in the latest release of miRBase version 22 (http://www.mirbase.org/). As shown in Fig. 1, unsupervised consensus clustering analysis of the methylation levels at these miRNA-associated CpG loci identi ed two major clusters. Cluster 1 contains primarily ER + breast tumors; cluster 2 is enriched for ER-tumors.
As no known studies have speci cally examined genome-wide DNA methylation patterns of miRNA encoding genes in breast tumors from AA populations and there is a clear pattern of ER-subtype differences ( Fig. 1), in the current study, we focused on analyses to identify differentially methylated loci (DML) between ER-and ER + tumors in AA and EA women, separately. As shown in Fig. 2A hypo-methylated) were identi ed in both races, 317 were speci c to AAs (198 hyper-and 119 hypomethylated), and 136 were speci c to EAs (34 hyper-and 102 hypo-methylated). We further presented these DMLs in the volcano plot for AA women (Fig. 2C) and for EA women (Fig. 2D), which clearly demonstrated a signi cantly higher frequency of hypermethylated loci in tumors from AA women compared to EA women. The full list of these signi cant DMLs by ER subtype for each group is shown in Supplemental Table S2.
Validation of DMLs in the TCGA dataset We validated DNA methylation for DMLs identi ed above using data from TCGA, which included 141 AA and 396 EA cases in this analysis. Data on 537 (out of 607 DMLs in AAs) and 385 (out of 426 DMLs in EAs) CpGs were available in the TCGA dataset. As shown in Supplemental Figure S1, we observed a high concordance between DMLs detected in our data and those identi ed using the TCGA dataset, i.e., all probes showed the same direction of methylation change by ER status, although in TCGA dataset, methylation changes of 16 DMLs in AAs and the 3 DMLs in EAs did not reach the statistical signi cance (dots in red).

Impact of aberrant DNA methylation on miRNA expression
To examine whether these DMLs are associated with expression of their corresponding miRNAs, we paired each CpG with the corresponding miRNA de ned as within 5Kb window, with a total of 1,408 unique CpG-miR pairs identi ed in our data. In cancers from AA women, we identi ed 224 (94 negative and 130 positive) signi cant correlations between methylation and expression levels, involving 136 unique CpGs at which differential methylation was associated with expression of 124 mature miRNAs (p < 0.05) from 97 unique miRNA genes. In tumors from EA women, we found 97 (45 negative and 52 positive) correlations, corresponding to 70 CpGs and 64 mature miRNAs (p < 0.05) from 51 unique miRNA genes. The correlation coe cients for each of these signi cant CpG-miR pairs are shown in Fig. 3A, with detailed information on the CpG loci, mature miRNAs, and their correlation coe cients listed in Table S3.
Notably, signi cant negative correlations for multiple CpG loci were observed for miR-190b and miR-135b in both race groups, whereas positive correlations were observed for members of the miR-224/miR-452 cluster. As shown in Fig. 3B, the scatter plots showed the correlation patterns of several methylation loci with the corresponding miRNA expression on the miR-190b, miR-135b, and miR-452/-224. Results suggest that DNA methylation changes appeared to drive miRNA expression alterations and that these correlations were somewhat stronger in AAs than in EAs.
We further validated these ndings in TCGA. The majority of correlations identi ed in our study showed consistent directions in TCGA, with detailed information listed in supplemental Table S4. In Figure S2, the scatter plots showed signi cant correlations for multiple CpGs with their corresponding miRNAs, i.e., the miR-190b, miR-135b, miR-224/-452. As shown, the methylation level of a CpG mapped within the miR-190b locus was negatively correlated with expression levels of the mature miR-190b-5p, in which high methylation levels were associated with low expression levels in both our and TCGA data. Similar ndings were observed for multiple probes of miR-135b. These results further con rmed that DNA methylation modi cations in miRNA encoding genes play regulatory role of expression of these miRNAs. miRNA target genes and pathway analysis To investigate the biological relevance of these ndings, we performed target prediction and pathway enrichment analysis using MetaCore, focusing on several top candidate miRNAs at which DNA methylation was highly correlated with miRNA expression, including miR-135b-5p, miR-190b-5p, and miR-224-5p/miR-452-3p. Details on canonical pathways, molecular process networks, and targets for each miRNA are presented in supplemental Table S5. These miRNAs deregulated by DNA methylation in this setting are involved in multiple cancer-related pathways, indicating their role in various key molecular processes related to and beyond ER biology. As shown in Fig. 4, top enriched pathway and process networks related to miR-190b-5p and miR-135b-5p include cell cycle G1-S growth factor or interleukin regulations, cytoskeleton remodeling, angiogenesis, EMT, and others such as signal transduction TGF-β, Wnt, NOTCH, and ESR1-mediated signaling pathways. Targets from these process networks were identi ed, including various transcription factors and protein kinase, such as ESR1, GSK3β, MMP-2, MMP-9, TGF-β, mTOR, and PI3KCA.

Discussion
DNA methylation has been reported as a mechanism that may cause dysregulation of mature miRNAs and consequently associated with breast cancer risk and progression. However, few studies have speci cally examined this epigenetic modi cation at miRNA level in relation to ER + and ER-breast cancer, and its functional role in the regulation of miRNA expression, especially in AA women who are at high risk of aggressive, ER-disease. In this study, we evaluated methylation patterns of miRNA genes and their effect on miRNA expression in breast tumors from both AA and EA women. The genome-wide methylation screen identi ed a set of DNA methylation loci within 5Kb of miRNA genes that were differentially methylated between ER-and ER + tumor subtypes in tumors from both races, or speci c to AA or EA women. Integrated analysis of DNA methylation and miRNA expression further identi ed certain DMLs whose methylation levels were signi cantly correlated with the expression of relevant miRNAs. Thus, DNA methylation changes associated with altered miRNA expression may contribute to the network of molecular differences of ER subtypes of breast cancer and may differ by race.
We identi ed many DMLs that can signi cantly differentiate ER-and ER + tumor subtypes. Some of the top DMLs mapped to, or were in the vicinity of, miRNAs that are differentially expressed in ER-versus ER + tumors, such as multiple CpGs in miR-135b, -190b, and − 224, as shown in our previous study and in other studies (22,30). Many other DMLs were novel relative to ER subtypes, such as CpGs associated with miR-125b1, -2053, -3132, and − 4736. Some of these identi ed DMLs were common in both races and others were speci c to AA or EA women. We also noted that there were more DMLs by ER subtypes in AA tumors compared to EA tumors. Moreover, while ER-subtype DMLs common to both races showed similar proportions of hyper-and hypo-methylated loci, interestingly, we noted signi cant differences in proportions for race-speci c DMLs. We observed a higher frequency of hypermethylated miRNA genes in ER-compared to ER + tumors from AA women, but a lower frequency of hypermethylation in ER-versus ER + tumors from EA women. Consistent with our ndings for tumors from EA women, several studies focused on DNA methylation of coding genes observed a higher frequency of hypermethylated genes in ER + tumors (8,31,32). One explanation for this may be that ER-related signaling molecules may enhance the activity of DNA methylation transferases to catalyze the transfer of a methyl group to a DNA segment, thus lead to the speci c gain of DNA methylation in ER + tumors. By contrast, we observed in AA women that there were more DMLs associated with ER tumor subtypes and that there was a higher frequency of hypermethylated miRNA loci in ER-tumors, which warrant further investigations. Nevertheless, the majority of these DMLs were con rmed in TCGA data, suggesting a differential DNA methylation pattern of miRNA encoding genes relevant to breast cancer subtypes and race.
Another goal of our work was to identify DNA methylation-associated changes in miRNA expression that may be involved in phenotypic differences in breast cancer subtypes in both AA and EA women. Leveraging the availability of both genome-wide DNA methylation and miRNA expression data from the same patients, we were able to directly link both types of data and examined whether altered DNA methylation at speci c CpG loci was associated with the expression of neighboring miRNAs. As expected, we observed that altered methylation at certain DMLs was negatively correlated with expression of corresponding miRNAs in breast tumors of both AA and EA women. Several top correlated CpG-miR pairs include multiple loci within the miR-190b and miR-135b. MiR-190b has been reported to be highly upregulated in ER + compared to ER-breast cancers from several other studies including our own (22,(33)(34)(35). In the present study, our results provide evidence that differences in miR-190b expression in breast cancer subtypes is associated with methylation alterations. This observation was consistent with results from our analysis of TCGA dataset and from a recently published study (35). Results from studies also suggest that expression of miR-190b in primary tumors may play a role with breast cancer recurrence, tamoxifen resistance, breast cancer and overall survival, but its exact role in carcinogenesis remains to elucidate (33,34,36). Interestingly, as low expression is associated with poor survival, we observed the lowest miR-190b expression in ER-tumors from AA women among all tumor-race subgroups, which may contribute to the poor survival observed in this population. By contrast, we observed CpG promoter hypomethylation and upregulation of miR-135b in ER-versus ER + tumors. Consistent with these results, several studies have reported miR-135b was upregulated in ER-or triple negative breast cancers and its overexpression promotes breast cancer progression and metastasis through regulation of ERα, AR, and other genes such as LATS2, TGFβ, WNT, and HIF1AN (37)(38)(39)(40)(41). Taken together, con rmed in TCGA data, these consistent negative correlations between CpG methylation and miRNA expression support the notion that, at least in some cases, altered miRNA expression may occur through mechanisms involving loss or gain of promoter DNA methylation during early stages of carcinogenesis and contribute to the development of speci c subtypes of the disease. These DNA methylation-dysregulated network of miRNA genes could be clinically relevant predictive markers and served as potential new therapeutic targets of anticancer therapy. It may be possible to restore the functionality of the dysregulated miRNA network through epigenetic drugs. Future study is necessary to fully understand the mechanisms of underlying biological functions and their clinical implications.
Although most studies have primarily focused on the relationship of DNA methylation and their negative control of gene expression, we revealed positive correlations between methylation and expression for many CpG-miRNA pairs, similar to what has been consistently reported in cancer studies that investigate correlation patterns between CpG methylation and gene expression at the genome-wide scale (42). Notably, we found that DNA methylation of multiple loci within the miR-224/miR-452 cluster were positively associated with miRNA expression in both races. These two intronic miRNAs are located within a locus on chromosome Xq28 encompassing GABRE gene. We observed hypermethylation at these CpG loci and increased miR-224/miR-452 expression in ER-compared to ER + tumors. Studies have reported both up-and down-regulation of miR-224 in various cancers, suggesting its cell type-speci c expression and function (43)(44)(45). Speci cally, in breast cancer, and consistent with our study, one study reported increased miR-224 expression in both triple negative breast cancer cell lines and tumor tissues, and that its knockdown in aggressive triple negative breast cancer cells reduced proliferation, migration and invasion (46). These results may suggest more diverse mechanisms of epigenetic regulation, which may involve DNA methylation of insulator or repressor elements, transcription factor binding, chromatin and histone modi cations during carcinogenesis. Such complexity warrants further investigation of these miRNAs and ndings could provide implications for understanding the molecular basis for the phenotypic differences by breast cancer subtype and race.
We further performed target predictions and examined the major biological functions and signaling pathways of several signi cant methylation-dysregulated miRNAs. Various important metastasis-and cancer-related pathways were enriched, including cell cycle regulation, angiogenesis, EMT, and Wnt, TGFβ and ESR1-mediated signaling. Speci cally, we found that these miRNAs participate in these pathways through interactions with targets such as ESR1, GSK3β, MMP-2/-9, TGFβ, mTOR, SMAD4, c-Myc, HIF1α, and PI3KCA, some of which have been shown to regulate various signaling pathways and cellular functions in breast carcinogenesis (47)(48)(49)(50). These results suggest that these miRNAs target many key genes and work as a module, contributing to breast cancer phenotypic differences and aggressiveness.
Further experimental validation will be needed to con rm these ndings.

Conclusions
In summary, based on genome-wide pro ling of tumor DNA methylation and microRNA expression, our results suggest that DNA methylation patterns in miRNA encoding genes differ between breast cancers according to cancer subtype and race, and that this altered methylation may affect miRNA expression. Further pathway analysis identi ed their potential role in modulating cancer-related key biological processes. These ndings provide the basis for further functional analyses in experimental studies, which are likely to further inform the underpinning of subtype-and race-related tumor biological differences in breast cancer. Our study sheds light on the epigenetic regulation of miRNA expression and provide insights into the relations of clinical-relevant miRNAs to their target genes and to serve as potential preventative and therapeutic targets.  Unsupervised consensus cluster analysis of miRNA methylation patterns of breast tissues. Two major clusters were identi ed. Cluster 1 contains primarily ER+ breast tumors; cluster 2 is enriched for ERtumors.  hypomethylated DMLs. Dots plotted in grey are not DMLs in either AA or EA women. The mean methylation difference (|delta β|) in ER-versus ER+ tumors is on the X axis, and the Y axis shows the log 10 of FDR (adjusted P-value). For each CpG-miR pair, the methylation level (beta value) is on the X axis, and the expression level (log counts per million, logCPM) of corresponding miRNA is on Y axis.

Figure 4
Functional enrichment analysis on targets from miR-135b-5p and miR-190b-5p using MetaCore. (A) Target gene network and top ten pathway networks for miR-190b-5p. (B) Target gene network and top ten pathway networks for miR-135b-5p.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.