Integrative Analysis of Gene Expression and DNA Methylation Depicts the Impact of Obesity on Breast Cancer

Zhenchong Xiong, Sun Yat-sen University Cancer Center, https://orcid.org/0000-0002-1129-5663
Lin Yang, Southern Medical University Nanfang Hospital
Jianchang Fu, Sun Yat-sen University Cancer Center
Xing Li, Sun Yat-sen University Cancer Center
Xinhua Xie, Sun Yat-sen University Cancer Center
Yanan Kong, Sun Yat-sen University Cancer Center
Xiaoming Xie (xiexm@sysucc.org.cn), Sun Yat-sen University Cancer Center


Background
Breast cancer (BC) is the most commonly diagnosed cancer and the second leading cause of cancer death among women worldwide [2]. A previous study reported that obesity, which is characterized by excess adipose tissue, is a risk factor for BC [3]. In premenopausal women, obesity is associated with an increased risk of hormone receptor (HR)-negative BC, whereas in postmenopausal women it is associated with an increased risk of HR-positive BC [4]. Moreover, several studies have shown that obese BC patients exhibit more aggressive tumor features (such as larger tumor size, lymph node metastasis, shorter disease-free survival and greater risk of mortality) than non-obese patients [5,6]. Although previous studies have observed that the adipose tissue of obese individuals increasingly secretes adipokines (including hormones, growth factors, and cytokines), contributing to an environment that promotes cancer proliferation and metastasis [7], how obesity affects BC requires further study.
Body mass index (BMI), defined as a person's weight in kilograms divided by the square of their height in meters, is the most commonly used measure of obesity. However, it is a surrogate measure of body fatness, whereas obesity should be assessed by the excess accumulation of adipose tissue rather than by body mass [8]. Because the body distribution, function and composition of adipose tissue are heterogeneous among BC patients, a whole-body mass index is insufficient to evaluate the degree of obesity in local tissue. Moreover, BMI reflects only weight gain, not the pathophysiological changes that occur during the development of obesity [9]. Thus, new biomarkers that evaluate the obesity status of BC tissues would help to assess the impact of obesity on BC.
It is well known that obesity is affected by multiple factors, including environmental factors, genetic predisposition and individual lifestyle [10]. Recently, an increasing body of evidence has shown that DNA methylation is also involved in the development of obesity [11,12]. DNA methylation is an epigenetic mechanism that regulates gene expression through changes in chromatin structure. Likewise influenced by environmental factors, genetic predisposition and individual lifestyle, the level of gene methylation changes dynamically, setting up stable gene expression profiles that adapt to the development of obesity. A study analysing whole-genome methylation and gene expression in non-diseased breast showed that obesity is associated with genome-wide methylation changes in human tissue [13]. In addition, Brionna Y Hair et al. observed that obesity is significantly associated with genome-wide hypermethylation in ER-positive BCs [14]. Thus, changes in genome-wide DNA methylation could reflect the biological changes that occur in breast tissue during the development of obesity.
The goal of our study is to capture obesity-related genomic changes and to explore the impact of obesity on BC tissues. We developed a DNA-methylation-based BMI index (DM-BMI) to evaluate the degree of obesity in breast tissues. We validated the accuracy of DM-BMI in breast tissues from non-BC and BC populations. Further, we assessed the correlation of DM-BMI with adipose tissue content and with the expression of adipokines in BC tissues. Next, we identified the DM-BMI-related gene expression profile. Using the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases, we observed that the DM-BMI-related genes are significantly involved in cancer immunity. Using the ESTIMATE and CIBERSORT algorithms, we observed a positive correlation between DM-BMI and immune cell infiltration. Finally, we assessed the correlation between DM-BMI and biomarkers of response to immune checkpoint inhibitors (ICI) [1] and observed that DM-BMI positively correlated with ICI response in BC.

Data collection and processing
The training set comprised genome-wide methylation data of 221 normal breast tissues from GEO (GSE88883 and GSE101961). The validation sets comprised data of 44 normal breast tissues (validation set 1) and 70 tumor-adjacent breast tissues (validation set 2) from GEO (GSE67919 and GSE74214). These data sets were used to develop the DM-BMI score. The BMI data of these cases are listed in Supplementary materials 1-2. The DNA methylation and expression data of 76 cases with matched tumor and tumor-adjacent breast tissues and of 699 cases with unmatched tumor tissues were collected from The Cancer Genome Atlas (TCGA) database.
Genome-wide methylation data were profiled using the Illumina Infinium HumanMethylation450 BeadChip assay. For DNA methylation data, the β value, ranging from 0 (no DNA methylation) to 1 (complete DNA methylation), was used to define the methylation level of each probe. Probes with missing values in over 50% of samples were removed, while probes with missing values in less than 50% of samples were imputed with the k-nearest neighbors (kNN) imputation method. Probes located on the sex chromosomes and probes containing known single nucleotide polymorphisms (SNPs) were removed. Eventually, 301998 probes were included in this study. BMIQ normalization was performed to correct for type I and type II probe bias. Data from multiple studies were integrated and the ComBat algorithm was applied to remove batch effects. The above steps were carried out using the R package ChAMP.
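The missing-value filter described above can be sketched as follows. This is a minimal illustration in Python, not the ChAMP routine used in the study; the function name `filter_probes_by_missingness`, the dict-of-lists layout, the probe identifiers and the toy β values are all assumptions made for the example. Probes with exactly 50% missing values are kept here, which is also an assumption because the text does not specify that boundary. The kNN imputation step is not shown.

```python
from typing import Dict, List, Optional


def filter_probes_by_missingness(
    betas: Dict[str, List[Optional[float]]],
    max_missing_fraction: float = 0.5,
) -> Dict[str, List[Optional[float]]]:
    """Keep probes whose fraction of missing beta values is at most the cutoff.

    `betas` maps a probe ID to its beta values across samples; None marks a
    missing value. Probes above the cutoff are dropped; the rest would go on
    to imputation (not implemented in this sketch).
    """
    kept = {}
    for probe_id, values in betas.items():
        if not values:
            continue  # no samples for this probe, nothing to keep
        n_missing = sum(1 for v in values if v is None)
        if n_missing / len(values) <= max_missing_fraction:
            kept[probe_id] = values
    return kept


# Hypothetical toy data: three probes measured in four samples.
toy_betas = {
    "cg_a": [0.10, None, 0.30, 0.40],   # 25% missing, kept
    "cg_b": [None, None, None, 0.90],   # 75% missing, removed
    "cg_c": [0.50, 0.60, 0.70, 0.80],   # complete, kept
}
kept_probes = filter_probes_by_missingness(toy_betas)
```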
For gene expression data, background correction and normalization were carried out using the R package limma.

Calculation of DNA-methylation based BMI index
To improve the predictive accuracy of the model, the BMI value was transformed to F(BMI) before further analysis as follows: F(BMI) = log(BMI + 1) - log(healthy.BMI + 1) if BMI <= healthy.BMI; F(BMI) = (BMI - healthy.BMI)/(healthy.BMI + 1) if BMI > healthy.BMI. The parameter healthy.BMI was set to 25, the upper limit of BMI in the healthy population.
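The piecewise transformation can be written out as a short Python sketch. The function names are illustrative, and two points are assumptions rather than statements from the text: the logarithm is taken to be the natural logarithm, and the inverse function is derived here algebraically from the two branches (the text does not state how predictions on the F(BMI) scale are mapped back to BMI units).

```python
import math

HEALTHY_BMI = 25.0  # upper limit of BMI in the healthy population, as stated in the text


def transform_bmi(bmi: float, healthy_bmi: float = HEALTHY_BMI) -> float:
    """F(BMI): logarithmic at or below healthy_bmi, linear above it."""
    if bmi <= healthy_bmi:
        # Natural log is an assumption; the text only says "log".
        return math.log(bmi + 1) - math.log(healthy_bmi + 1)
    return (bmi - healthy_bmi) / (healthy_bmi + 1)


def inverse_transform_bmi(f_bmi: float, healthy_bmi: float = HEALTHY_BMI) -> float:
    """Map a value on the F(BMI) scale back to BMI units (derived, not from the text)."""
    if f_bmi <= 0:
        return (healthy_bmi + 1) * math.exp(f_bmi) - 1
    return f_bmi * (healthy_bmi + 1) + healthy_bmi


f_obese = transform_bmi(38.0)                      # linear branch: (38 - 25) / 26
round_trip = inverse_transform_bmi(transform_bmi(20.0))
```

Both branches evaluate to 0 at BMI = healthy.BMI, so the transformation is continuous at the threshold.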
A lasso regression of F(BMI) on the DNA methylation data (301998 probes) was used to derive DM-BMI. Forty-two probes were selected by the lasso regression model as BMI predictors according to the lambda.min value (Fig. S1A), and the coefficient of each probe is shown in Fig. S1B. The lasso regression analysis was carried out using the R package glmnet (alpha was set to 1, and the lambda value was identified by 10-fold cross-validation on the training data).
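For reference, the objective that glmnet minimizes for a Gaussian response with alpha = 1 (the lasso) can be written as below. The notation is introduced here for illustration and is not taken from the text: y_i denotes F(BMI) of sample i, x_{ij} the β value of probe j in sample i, N the number of training samples (221) and p the number of probes (301998).

```latex
\[
(\hat{\beta}_0, \hat{\beta}) \;=\; \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\Bigl(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\,\beta_j\Bigr)^{2}
\;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
```

The L1 penalty shrinks most coefficients to exactly zero, which is why only a small subset of probes (42 here) remains in the model at the cross-validated lambda.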

Analysis of intra-sample adipose tissue content
Adipose tissue accounts for a large proportion of breast tissue. We used a deconvolution algorithm to calculate the proportion of adipose tissue in breast and BC tissues based on DNA methylation. Andrew E. et al. provided a deconvolution algorithm to model cell subpopulations in breast tissues based on DNA methylation data. Illumina 450K DNA methylation data of human mammary epithelial cells (GSE40699) and adipose tissue (GSE48472) were used as reference profiles. Data were processed as described above, and probes with an absolute difference in β value > 0.7 between the human mammary epithelial cells and the averaged adipose tissue were selected for the evaluation of intra-sample adipose tissue content. Data on adipose tissue content are listed in Supplementary material 3.
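The reference-probe selection rule (absolute β difference > 0.7 between the epithelial profile and the averaged adipose profile) can be sketched as follows. This covers only the selection step, not the deconvolution itself; the function name, data layout, probe identifiers and β values are hypothetical.

```python
from typing import Dict, List


def select_reference_probes(
    epithelial: Dict[str, float],
    adipose_samples: List[Dict[str, float]],
    min_abs_diff: float = 0.7,
) -> List[str]:
    """Return probes whose beta value differs by more than min_abs_diff between
    the epithelial reference and the mean of the adipose reference samples."""
    selected = []
    for probe_id, epi_beta in epithelial.items():
        adipose_values = [s[probe_id] for s in adipose_samples if probe_id in s]
        if not adipose_values:
            continue  # probe absent from the adipose reference
        adipose_mean = sum(adipose_values) / len(adipose_values)
        if abs(epi_beta - adipose_mean) > min_abs_diff:
            selected.append(probe_id)
    return selected


# Hypothetical reference profiles (one epithelial profile, two adipose samples).
epithelial_ref = {"cg_x": 0.95, "cg_y": 0.50, "cg_z": 0.05}
adipose_refs = [
    {"cg_x": 0.10, "cg_y": 0.45, "cg_z": 0.90},
    {"cg_x": 0.20, "cg_y": 0.55, "cg_z": 0.80},
]
informative_probes = select_reference_probes(epithelial_ref, adipose_refs)
```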

Characteristics analyses of BMI predictors
Forty-two probes were selected by the lasso regression model as BMI predictors. The distribution of the 42 probes across chromosomes, CpG islands and TSS regions was assessed using the R package ChAMP. BMI predictors that were differentially methylated between BC tissues and tumor-adjacent breast tissues in the TCGA database were identified using the R package ChAMP. The correlation of BMI predictors with survival was assessed using TCGA BRCA data. The correlation between the methylation level of BMI predictors and DM-BMI was assessed.

Functional and clinical characteristics analysis of DM-BMI related gene profile
DM-BMI of TCGA BC tissues was calculated using DNA methylation data, and the Spearman correlation coefficient was used to assess the correlation between DM-BMI and clinical characteristics (menopause status, hormone status, copy number variation and gene mutation) in BC. The gene expression profile related to DM-BMI was identified, and functional analysis of the related genes was performed by GO and KEGG analysis. In addition, we performed gene set variation analysis (GSVA) to identify DM-BMI-related gene signatures using gene expression data in TCGA. The above procedures were performed using R.

Evaluation of correlations between DM-BMI and the immune microenvironment in BC.
Tumor mutation burden (TMB) was defined as the number of non-synonymous mutations per Mb of genome. As previously reported [15], the TMB of BC tissues in TCGA was calculated as (whole-exome non-synonymous mutations)/38 (Mb).
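The TMB calculation is a single division; a minimal Python sketch is given below. The function name and the example mutation count are illustrative only, while the 38 Mb exome size is the value stated in the text.

```python
EXOME_SIZE_MB = 38.0  # exome size used in the text's TMB definition


def tumor_mutation_burden(n_nonsynonymous: int, exome_size_mb: float = EXOME_SIZE_MB) -> float:
    """TMB in mutations per Mb: whole-exome non-synonymous mutation count / exome size."""
    if n_nonsynonymous < 0:
        raise ValueError("mutation count cannot be negative")
    return n_nonsynonymous / exome_size_mb


# Hypothetical sample carrying 76 non-synonymous mutations.
example_tmb = tumor_mutation_burden(76)
```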
The levels of tumor-infiltrating immune cells and stromal cells in each tissue were evaluated by the ESTIMATE algorithm [16]. The proportions of 22 immune cell types in each tissue were evaluated by the CIBERSORT algorithm (http://cibersort.stanford.edu/) [17]. Correlations between DM-BMI and ESTIMATE/CIBERSORT scores were calculated using the Spearman correlation coefficient. The data of the TMB, ESTIMATE and CIBERSORT analyses are listed in Supplementary material 4.
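As an illustration of the correlation measure used here, the Spearman coefficient can be computed as the Pearson correlation of the ranks, with tied values given their average rank. The sketch below is a plain-Python version written for this example (the study used R); the function names and the toy DM-BMI and immune-score values are hypothetical.

```python
from math import sqrt
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant input")
    return cov / sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for five tissues.
toy_dm_bmi = [22.1, 27.5, 31.0, 35.2, 24.8]
toy_immune_score = [0.10, 0.30, 0.25, 0.60, 0.15]
rho = spearman(toy_dm_bmi, toy_immune_score)
```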

Development and validation of DM-BMI in breast, tumor-adjacent and BC tissues.
A total of 221 breast tissues from non-BC cases (GEO database) were selected as the training cohort to develop the DNA-methylation-based BMI (DM-BMI) prediction model (Fig. 1). The median BMI and median age of the training cohort were 28.24 (6.07-53.74) and 37 .50 lasso regression models based on DNA methylation data of the training cohort (301998 probes per sample) were constructed and the model with the minimum mean-squared error was selected based on the lambda value (Fig. S1A). Forty-two probes were included; their coefficients are shown in Fig. S1B and Supplementary material 5.
We used the Spearman correlation coefficient and a paired t-test to assess the predictive accuracy of the DM-BMI model. DM-BMI showed a significant correlation with BMI (Fig. 2A), and the paired t-test revealed no significant difference between DM-BMI and BMI (t = -0.384, df = 220, p-value = 0.702). Using a deconvolution algorithm, we evaluated the breast tissue composition and observed that increased DM-BMI was significantly correlated with a higher proportion of adipose tissue (Fig. 2B). These results indicate the high accuracy of DM-BMI for BMI prediction.
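The paired t statistic reported above is computed from the per-sample differences between predicted and observed values; with 221 pairs the degrees of freedom are 220, matching the reported value. A minimal Python sketch follows. The function name and toy numbers are hypothetical, and the p-value is not computed because it requires the t distribution, which the Python standard library does not provide.

```python
from math import sqrt
from typing import Sequence, Tuple


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for a paired t-test of a against b."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(a)
    diffs = [x - y for x, y in zip(a, b)]
    mean_diff = sum(diffs) / n
    # Sample variance of the differences (n - 1 in the denominator).
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    if var_diff == 0:
        raise ValueError("t is undefined when all differences are identical")
    return mean_diff / sqrt(var_diff / n), n - 1


# Hypothetical predicted (DM-BMI) and observed (BMI) values for four samples.
toy_predicted = [24.0, 30.0, 27.0, 33.0]
toy_observed = [25.0, 29.0, 28.0, 31.0]
t_stat, dof = paired_t_statistic(toy_predicted, toy_observed)
```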
Further, we assessed the DM-BMI of paired tumor and tumor-adjacent breast tissues in the TCGA database. The tumor tissues exhibited a higher level of DM-BMI than their paired tumor-adjacent tissues (Fig. 2G). In BC tissues, DM-BMI positively correlated with adipose tissue proportion (Fig. 2H).

What is known about the 42 BMI predictors?
As DM-BMI was significantly correlated with obesity status, which has been suggested to be a risk factor for BC, we further assessed the relevance of the BMI predictors to BC. As hypermethylation of CpG islands at gene promoter regions often causes gene silencing, we first evaluated the distribution of BMI predictors. Among the 22 chromosome pairs (Chr), Chr1 and Chr16 harbored the most BMI predictors. 45.2% of BMI predictors were located in gene body regions, while only 23.8% were located in promoter regions (TSS1500 and TSS200). Further, we observed that only a small fraction of BMI predictors were located in CpG islands (Fig. 3A).
Next, we assessed the methylation variation of BMI predictors between tumor and tumor-adjacent breast tissues. Twenty-six differentially methylated probes (DMPs) were identified: 22 BMI predictors were hypermethylated in tumor tissues compared with tumor-adjacent breast tissues, and 4 were hypomethylated in tumor tissues (Fig. 3B). Three of the 42 BMI predictors correlated with better overall survival (OS) in BC patients, and 2 of them were located in gene-coding regions (Fig. 3C). In BC tissues, the correlation between the methylation level of each of the 42 BMI predictors and DM-BMI was evaluated, and 11 of them significantly correlated with DM-BMI (correlation coefficient > 0.3 or < -0.3; Fig. 3D-E). By integrative analysis of DNA methylation and expression, a negative correlation between methylation level and gene expression was observed for 22/42 BMI predictors (Fig. 3E).

Functional and clinical characteristics analysis of DM-BMI related gene profile in BC.
Further, we explored the biological significance of DM-BMI in breast cancer tissues. The correlation of DM-BMI with survival was evaluated in all BCs and in subgroups of BCs receiving cancer therapy (chemotherapy, endocrine therapy, anti-HER2 therapy and radiation therapy). DM-BMI was consistently correlated with higher mortality risk in the whole cohort of BC patients and in the subgroups of patients receiving chemotherapy, endocrine therapy or radiation therapy (Table 1). Tissues from postmenopausal patients and from patients with TP53 mutation exhibited a significantly higher level of DM-BMI (Fig. 4A-B).
In addition, an increasing level of DM-BMI was correlated with an increased proportion of ERBB2 and MYC amplification (Fig. 4C-D). Previous studies showed that adipokines produced by obese adipose tissue drive obesity-mediated inflammation and BC progression. In BC tissues, the proportion of adipose tissue was positively correlated with DM-BMI. Expression of 6 pro-inflammatory adipokines was positively correlated with DM-BMI, while expression of 2 anti-inflammatory adipokines was negatively correlated with DM-BMI (Fig. 4E). These results indicated that obesity increased the expression of pro-inflammatory adipokines in BC tissues.
Further, we assessed the DM-BMI (obesity)-related gene expression profile; the mRNA expression of 10032 genes significantly correlated with DM-BMI. To evaluate the biological effect of obesity on BC, we performed GSEA of DM-BMI-related genes using the KEGG and GO databases. GO analysis showed that genes positively correlated with DM-BMI were significantly involved in antigen processing and presentation, immune cell activation, MHC protein binding and immune receptor activity in BC (Fig. 4F). KEGG analysis consistently showed that DM-BMI-related genes were significantly enriched in tumor-immunity-related pathways (including antigen processing and presentation, NK cell mediated cytotoxicity, T cell differentiation, and the PD-1 checkpoint pathway) (Fig. 4G). These results indicated that the obesity-related gene profile is involved in the regulation of the immune response in BC.

DM-BMI correlated with T cell infiltration and ICI response markers in BC.
We evaluated the correlation between obesity and the immune response to BC. Gene mutations that change the protein-coding sequence and lead to the expression of abnormal proteins have been suggested to be a driving factor in cancer development. The abnormal proteins derived from tumor mutations might also elicit an immune response. In BC tissues, we observed a positive correlation between DM-BMI and TMB.
Using the ESTIMATE algorithm, we evaluated the degree of immune cell infiltration in TCGA BC tissues. Interestingly, we found a positive correlation between DM-BMI and the ESTIMATE immune score, while no significant correlation was observed between DM-BMI and the ESTIMATE stromal score (Fig. 5B). Next, we calculated the relative abundance of 22 immune cell types in TCGA BC tissues. Among them, the content of CD8 T cells, activated CD4 memory T cells, T follicular helper cells and regulatory T cells (Treg) was positively correlated with DM-BMI, indicating a more intense T cell mediated immune response in BC tissues with increased DM-BMI (Fig. 5C). As a representative immunotherapy, ICI therapy suppresses BC progression by activating the T cell mediated immune response. Thus, we examined whether DM-BMI predicted the tumor response to ICI. Five markers of ICI response and 4 markers of ICI resistance were selected to evaluate the tumor response. In BC tissues, DM-BMI positively correlated with IFNG, IFNG.GS, CD274, CD8 and APS, indicating that BC tissues with increased DM-BMI exhibited a better response to ICI (Fig. 5D). Moreover, DM-BMI was negatively correlated with two ICI resistance markers (CAFs and TAM-M2). All these results indicated that BC tissues in an obese state might exhibit a more intense response to ICI, based on markers of ICI response.

Discussion
In this study, we developed an obesity evaluation model (DM-BMI) based on DNA methylation patterns. Based on the DM-BMI model, we further identified the obesity-related gene expression profile. Although obesity has been known as a BC risk factor for many years, most studies focus on the correlation between obesity and clinical prognosis, while studies of the biological and genomic impact of obesity on breast cancer are limited. Adipose tissue, as the major agent mediating obesity-related biological effects, is an important starting point for studying the impact of obesity. In both breast and BC tissue, we observed a positive correlation between the proportion of adipose tissue and DM-BMI. A previous study reported that the expansion of adipose tissue, accompanied by dysregulation of the endocrine function (adipokine secretion) of adipose cells, is driven by an increase in the size of adipose cells or by the formation of new adipose cells. As DM-BMI increased, we observed increased expression of pro-inflammatory adipokines and decreased expression of anti-inflammatory adipokines, which might synergistically induce obesity-mediated inflammation through activation of the NF-κB pathway and create a pro-oncogenic environment.
In addition to the expansion of adipose tissue, we also observed a promoting effect of obesity on the immune response in BC tissue. In obese individuals, adipose tissue expands with an increasing demand for oxygen, which induces the development of a hypoxic environment. The activation of hypoxia signalling increases the expression of adipokines, especially the pro-inflammatory adipokines (including CCL2, CXCL8, CXCL10, IL-18, IL-1α and Oncostatin M), which are involved in the recruitment of tumor-associated immune cells. Moreover, a previous study showed that adipocytes could fuel immune cells by releasing exosome-sized, lipid-filled vesicles [23]. In BC tissues, we observed that DM-BMI positively correlated with the degree of M1 macrophage, activated dendritic cell and T cell infiltration, indicating increased activity of the tumor immune response. T cell exhaustion is critical for tumor immune escape, and a previous study indicated that T cell exhaustion could be reversed by immune checkpoint inhibition (such as targeting PD-1) and by replenishing activated T cells (such as CAR-T). Interestingly, in obese BC tissues, we found an increased content of CD4+, CD8+ and follicular helper T cells, which may be due to the increased secretion of immune chemokines in adipose tissue. Although we also observed a positive correlation between DM-BMI and regulatory T cells (Treg), a subset of immune cells with immunosuppressive activity, this can be interpreted as negative feedback regulation by the immune system to maintain the stability of the immune environment after activation of the immune response [24]. Further, our study revealed that DM-BMI positively correlated with ICI response markers in BC tissues. These results suggest that obese BC patients may benefit from ICI. Recently, Wang, Z. et al. consistently reported that obesity is associated with an increased response to PD-1/PD-L1 blockade in animal melanoma tumor models [25]. However, as our findings were mainly supported by database analysis, data from clinical samples treated with ICI are still required to validate the correlation between DM-BMI and ICI response.
With the increasing number of obese patients, the impact of obesity on the treatment of breast cancer has attracted more and more attention. We observed that increased DM-BMI was correlated with higher mortality risk in patients receiving chemotherapy, hormone therapy and radiation therapy. Although no evidence indicates that obesity induces drug resistance in cancer cells, the dose of chemotherapy and radiation might be routinely reduced in obese individuals because doctors usually cap body surface area at 2 m² to reduce toxicity when calculating the dose of chemotherapy [4,26]. Because it highly expresses aromatase, adipose tissue is an endocrine organ and an important site of estrogen biosynthesis, especially in postmenopausal women [27]. In obese BC patients, increased expression of aromatase might cause resistance to endocrine therapy.
Because of the limitations of BMI in obesity evaluation, several imaging methods have been developed for obesity evaluation (including bioimpedance analysis instruments, dual-energy X-ray absorptiometry, computed tomography and magnetic resonance imaging) [28,29]. Although these newer methods enable the precise quantification of adipose tissue, their operational complexity has limited their application [30]. Developing new methods to evaluate the degree of obesity is therefore of great value. Recently, an increasing number of studies have indicated that environmental factors (such as dietary pattern and lifestyle) induce changes in DNA methylation patterns that predispose to obesity, and that obesity, likewise, results in genome-wide methylation variation [11,31]. Moreover, biomarkers based on DNA methylation have been shown to be effective in obesity evaluation, although most of them have been applied only to blood samples [11]. Thus, we developed a DNA-methylation-based biomarker (DM-BMI) for obesity evaluation in breast tissue. In both normal breast and tumor-adjacent breast tissue, DM-BMI showed a significant correlation with both BMI and adipose tissue content. In addition, we observed that DM-BMI was positively correlated with the expression of pro-inflammatory adipokines and the degree of immune cell infiltration in BC tissues. All these data indicate that DM-BMI is an effective biomarker for evaluating the biological changes in tumor tissues of obese patients.
The identification of BMI predictors naturally raises the question of whether these CpGs are critical regulators of obesity. In our study, only 11 of the 42 BMI predictors significantly correlated with DM-BMI, while the others exhibited negligible correlation with DM-BMI. Although DNA methylation levels were negatively correlated with gene expression for over half of the BMI predictors, 45.2% of BMI predictors were located in gene body regions. How CpGs located in gene body regions regulate gene expression is unclear. As a previous study reported that variations in DNA methylation patterns are the consequence of adiposity rather than the cause [31], whether these BMI predictors are regulators of obesity or imprints of the biological process needs further investigation.

Conclusion
Collectively, we established a new biomarker for obesity evaluation and discovered that BC tissues of obese individuals exhibit an increased degree of immune cell infiltration, indicating that obese BC patients might be potential beneficiaries of ICI treatment.

Table 1. Survival analysis of DM-BMI in BC with systemic therapy