A Novel Group of Genes that Causes Endocrine Resistance in Breast Cancer Identi ed by Dynamic Gene Expression Analysis

Arvand Asghari University of Houston Katherine Wall The University of Texas Health Science Center at Houston Michael Gill The University of Texas Health Science Center at Houston Natascha Del Vecchio University of Chicago Farnaz Allahbakhsh University of Houston Jacky Wu University of Houston Nan Deng The University of Texas MD Anderson Cancer Center W. Jim Zheng The University of Texas Health Science Center at Houston Hulin Wu The University of Texas Health Science Center at Houston Michihisa Umetani (  mumetani@uh.edu ) University of Houston Vahed Maroufy The University of Texas Health Science Center at Houston


Introduction
Breast cancer (BC) is the most prevalent type of cancer in women and the leading cause of cancer-related deaths among US women aged 20 to 59 1 . More than 280 000 new US cases of breast cancer and 43 000 deaths are projected for 2021 1 . While current ve-year survival rates of BC have reached 90%, it is still the second leading cause of cancer-related deaths in women overall 1 . Treatment options for breast cancers typically involve surgical resection of the tumor followed by drug treatment based on the cancer type 2 . Breast cancers are categorized into subtypes based on hormone receptors (HR), namely estrogen receptor (ER) and progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) expression status. Nearly 70% of breast cancers express the estrogen receptor (ER + ) without overexpression of HER2 (ER + /HER2-negative BC, luminal A breast cancer) 2 . While 13% of breast cancers are HR + /HER2 + (luminal B), more than 13% are HR − /HER2 − , also referred to as triple-negative breast cancer (TNBC). While ve-year survival rates of luminal A breast cancer are around 90%, the ve-year survival rate of TNBC is much lower 2 .
ER exists in two isoforms, ERα (ESR1) and ERβ (ESR2), with ERα as the dominant form in BC. ER acts as a transcription factor mediating gene expression and also as a signaling molecule, inducing kinase pathways and regulating cell growth in cultured breast cancer cells 3 . Given that 80% of breast cancers express ER, ER-targeted endocrine therapies are a core component of systemic therapy. Endocrine therapies include selective ER modulators (SERMs) such as tamoxifen, selective ER down-regulators (SERDS) such as fulvestrant, and aromatase inhibitors (AIs) targeting estrogen biosynthesis. Endocrine therapies have been successful at improving cancer outcomes; however, the development of endocrine resistance, or resistance to inhibition of ER actions, remains a roadblock in breast cancer treatment. Many patients have intrinsic resistance to endocrine therapies. Only 30% of patients with metastatic disease saw initial regression with endocrine therapies, however, in almost all of them the resistance developed eventually, and the tumor recurred not long after that. Moreover, more than 20% of patients who present with early breast cancer will develop endocrine resistance throughout treatment 4,5 .
Resistance to endocrine therapies has been traced to mutations in ESR1, alterations in receptor tyrosine kinases such as HER2, and alterations in signaling pathways such as the MAPK pathway. Point mutations in the ligand-binding domain (LBD) of ESR1, have been shown to cause endocrine resistance 6-8 . The most common mutations, Y537 and D538, lead to constitutive ligand-independent activation of ERα 9 . These mutants are less sensitive to fulvestrant or tamoxifen 7 . LBD mutations in ESR1 have been found in around 20% of metastatic ER + cancers after endocrine therapy 7,10 . ESR1 mutations in circulating tumor DNA (ctDNA) have been found in 36% of patients with metastatic tumors following AI treatment 11 . Gene fusions of the ESR1 DNA binding domain to the C-terminus of other proteins, though rare, can promote ligand-independent ERα activity 12 .
HER2 mutations occur in 2.4% of primary tumors and 6.7% of metastatic tumors 9 . Activating mutations of HER2 can confer resistance to estrogen deprivation and fulvestrant treatment; fulvestrant sensitivity can be restored with HER2 inhibitors 13 . HER2 ampli cation is less common, occurring in 0.8% of primary and 2.1% of metastatic tumors 9 , Ampli cation of HER2 has been linked to tamoxifen resistance, through the hyperactivation of MAPK 14 . Currently, ER + /HER2 + breast cancers are treated with both antiestrogens and HER2 inhibitors 9 . However, for patients with endocrine-resistant breast cancers, treatments are limited as the development of endocrine resistance is not well understood. Existing scienti c literature focuses on the role of ESR1, receptor tyrosine kinases, and their signaling pathways; however, genetic mutation of these genes comprises a small percentage of cases. In at least 60% of cases, additional factors must play important roles in the resistance development process as those cases show intact ESR1 and no up-or down-regulation in other signaling pathways 10 .
In this study, we explored the dynamic behavior of the entire gene population to identify novel genes that play fundamental roles in the development and progression of endocrine-resistant breast cancer. We used the (i) time-course gene expression patterns in estrogen deprived MCF7 cells, which are widely accepted as undergoing the process of developing endocrine resistance over time, (ii) patient data from endocrineresistant tumors compared to endocrine-sensitive ones, and (iii) patient tumor data in TNBC compared to luminal A breast cancer 15,16 to compare the similarities among different types of BC and also to identify key candidate genes in such breast tumors.

Results
LTED Cells are the Cell-Based Model of Endocrine resistance Closest to Patients Data among Breast Cancer Cell Models First, we identi ed the best cell-based model of endocrine resistance based on their similarities to patients' tumor data. For this purpose, we compared the gene expression patterns from cell models of endocrine resistance before and after establishment, utilizing datasets GSE20361 and GSE111151, and gene expression patterns of resistant and sensitive tumors from patients using the GSE87411 dataset (16). The GSE20361 dataset is characterized by acquiring endocrine resistance. By culturing MCF7 cells in the estrogen-deprived medium for a long period, cells developed resistance to endocrine therapies 15 .
On the other hand, GSE111151 contains multiple models of tamoxifen resistance developed using several breast cancer cell lines (MCF7, T-47D, ZR-75-1, and BT-474) and the gene expression data is from before and after developing the resistance 17 . We used the whole gene expression patterns of these datasets to generate a heatmap and utilized hierarchical clustering to nd the distance or similarity between each of these datasets. Amongst the cell line models, we found that the gene expression patterns of the long-term estrogen-deprived (LTED) cells are the closest to the patient data, visualized by the shortest distance between these two datasets in the column's dendrogram of the heatmap (Figure 1). ZR-75-1 tamoxifen-resistant cell lines are next based on their closeness to patient data, followed by BT-474, MCF7, and T-47D tamoxifen-resistant cell lines. Therefore, we decided to use LTED cells as the most suitable cell line model for studying endocrine resistance in breast cancer.

Estrogen Deprived MCF7 Cells Show Four Dominant Differential Expression Patterns with Unique Cell Functions
To better understand the process of acquiring endocrine resistance and its underlying gene expression patterns, we used the GSE20361 dataset 15 . In this work, researchers collected the RNA samples from the MCF7 cells with estrogen depletion after 0, 3, 15, 30, 90, 120, 150, and 180d. We applied our recently developed statistical pipeline to the dataset to nd dynamically regulated genes employed in the process of endocrine-resistance development and progression 18 . The pipeline provides three main functions: First, statistical hypothesis testing determines a set of dynamic response genes (DRGs) that exhibit significant changes over time. Next, these DRGs are clustered into gene response modules (GRMs), subsets of DRGs with similar time-course expression patterns. Finally, the GRMs associations and regulatory effect are analyzed through a gene regulatory network built using ordinary differential equations. Exploring the modules showed that the top four GRMs account for 2 748 (92% of the top 3 000) DRGs ( Figure 2c). The 3 000 DRGs comprised 2 305 unique genes. The rst GRM was the largest model, containing 1 095 genes characterized by a gradual downregulation in expression from day 0 to 30, at which point the pattern shifts, revealing sharp upregulation from day 30 to 150. The second module, comprised of 812 DRGs, showed nearly opposite behavior, demonstrating gradual upregulation until day 30, after which sharp downregulation was observed. Genes from individual GRMs were submitted to the Metascape gene annotation tool for gene enrichment analysis. The rst module was signi cantly enriched for cell division, DNA replication, DNA-dependent DNA replication, and cell cycle pathways and ontologies. The interaction between these different gene ontologies is also visualized in a tree cluster in Figure 2e. The second GRM gene association included membrane tra cking, cellular protein catabolic processes, and macro-autophagy, while the third GRM enriched mostly for positive regulation of protein autophosphorylation and regulation of cellular ketone metabolic processes. The fourth GRM is mostly related to cell junction organization (Figure 2d, e). These results indicate that we identi ed the genes signi cantly changed during the development of endocrine resistance and found that each GRM encodes a unique cell function.

Dynamic Gene Expression Analysis of LTED Data Overlaps with Endocrine-Resistant Patient Data and also with Genes Regulated in TNBC
To further narrow down our search for identifying the candidate genes with active roles associated with endocrine resistance, we also analyzed GSE87411, an endocrine-resistant/sensitive patient dataset; the only patient dataset publicly available to the best of our knowledge 16 . A group of 109 breast cancer patients was treated with aromatase inhibitors for 2-4 weeks and their response to treatment was evaluated 16 . Following the evaluation of patient response to treatment, they were categorized into endocrine-resistant and sensitive patients. We used the gene expression pro le of these patients to nd the genes signi cantly regulated in endocrine-resistant breast cancer tumors as compared to endocrinesensitive breast cancer tumors. We compiled patient data, analyzed the microarrays as two groups, and compared the results to nd the signi cantly differentially expressed genes 20 . Among these genes, 984 genes were signi cantly upregulated in endocrine-resistant patients while 621 genes were signi cantly downregulated. We then compared these signi cantly regulated genes in the patient dataset to the DRGs found from our dynamic gene expression analysis and selected those with highly correlated expression patterns between the two datasets for further analysis, which returned 319 genes (Figure 3a).
TNBC accounts for around 15% of breast cancer cases, characterized as one of the most aggressive types of breast cancer with no established therapy options yet 2 . Similar to endocrine-resistant breast cancer patients, TNBC patients are also resistant to conventional endocrine therapies due to its nature of the lack of hormone receptors. To examine whether these two groups share certain gene expression patterns or not, we explored the gene expression similarities between the development of endocrine resistance and TNBC. First, we identi ed the 1 497 genes signi cantly and differentially expressed in TNBC compared to luminal A breast cancer. This list was compared to the 2 305 DRGs including 319 commonly shared genes between the DRG pipeline analysis and endocrine-resistant patient data.
Surprisingly, 97.8 % of genes signi cantly expressed in the TNBC vs. Luminal A gene set were included in the DRGs, and more than 80% of the genes commonly shared among the DRGs and endocrine-resistant patient data were also signi cant in the TNBC vs. Luminal A gene set. This common set between the three sets of analyses is comprised of 254 genes. We posit these genes are important to the development of endocrine resistance, estrogen deprivation, and TNBC ( Figure 3a).
Gene ontology analysis on the shared 254 genes showed signi cant enrichment for cell cycle, cell division, and DNA repair pathways, all signatures of genes affected in cancer cells (Figure 3b). Genes related to DNA repair pathways were included in both TNBC and endocrine-resistant breast cancer, indicating the importance of the DNA repair system to account for mutations in the DNA in their cancers 21 . Other important genes include ribonuclease H2 subunit A (RNASEH2A), a gene that is known to be upregulated in cancers. RNASEH2A is a mediator of the removal of lagging-strand Okazaki fragment RNA primers, thus it can be integral in the proliferation of both triple-negative and endocrine-resistant breast cancers 22 . Moreover, most of the common genes are regulated by transcription factors such as E2F1, MYCN, and TFDP1, which are all important transcription factors in TNBC ( gure 3c). This nding con rms the notion that TNBC tumors share signi cant similarities with endocrine-resistant breast cancer.
MCM family, RAD51, CAV1, and CCNE1 are Among the Top Candidate Genes Responsible for Endocrine-

Resistant Development
To further identify the genes important for the endocrine resistance development and progression, we utilized several bioinformatic approaches designated to ranking and prioritizing genes. We used several tools to generate gene-gene and protein-protein interaction networks based on the candidate genes in each of the modules 23-26 . These tools rank or prioritize genes according to the functional similarities of the genes, direct and indirect interactions amongst them, and their roles in the same pathway to cluster the genes or proteins together. For tools that require a training gene set, we gathered known genes for endocrine-resistance development from literature and submitted them as the training set. Using generated networks, we selected the genes with the highest number of neighbors in the network as the master regulators of genes, which will be our main candidates (Figure 4). These core genes may have the most essential roles in the development of endocrine resistance in LTED-MCF7 cells. Thereby, we were able to narrow the list of candidates to 34 genes from major four modules, mostly from modules 1 and 2 (n = 31) (Table S1).
Among the genes in module 1, we found PARP1 and E2F1, recently discovered important genes for endocrine resistance in breast cancer, indicating the validity of our approach. We also found minichromosome maintenance (MCM) family genes as important genes, namely MCM2 and MCM7, RAD51, and TCF3 (Figure 4), which have not been studied thoroughly [27][28][29] .MCM family genes (MCM2-7) form an MCM complex protein that functions as a DNA replication licensing factor and plays a central role in eukaryotic DNA replication 30 . Recent evidence suggests that blocking the expression of these genes can lead to the inhibition of the growth of tamoxifen-resistant cancer cells. This evidence perfectly matches the expression pattern of these genes in our dataset, as these genes are signi cantly upregulated during the endocrine-resistance process in breast cancer cells 31 . RAD51 overexpression, a key protein of homologous recombination, is also linked to overall poor survival and endocrine resistance in breast cancer, although the exact underlying signaling pathways are not well understood yet 32 . As for TCF3, while there is no study on its role in endocrine resistance, it is important for breast cancer differentiation, development, and prognosis 33 . All the genes in module two were slightly upregulated by day 30 and signi cantly downregulated afterward ( Figure 5). In line with our data, there are reports showing downregulation of CAV1 as an important step in breast cancer development and resistance to endocrine therapies 34 , and further studies are warranted. Some of the other important candidate genes were selected from modules 3 and 4, including ATG3, CCNE1, and MFAP4.

Long-term Estrogen Deprived MCF7 Cells Gained Resistance at Day 90
To validate the bioinformatic ndings, we established the LTED MCF7 cell model by culturing human breast cancer MCF7 cells in estrogen-depleted growth media for an extended period. This cell model recapitulated acquired resistance to aromatase inhibitors in postmenopausal women and the same experimental settings as the study on which we used for our gene expression analysis with an extended sample collection period 15 . After the initial quiescence state of around 30 days when the cell's growth rate was minimal, they started to adapt with estrogen depletion and regain their growth ability while losing their responsiveness to 17b-estradiol (E 2 ). At around day 90, LTED cells showed complete loss of response to E 2 , characterized by their loss of ability to respond to estrogen stimulation to increase their growth rate, while their basal growth rate reached similar levels as those of parental MCF7 cells cultured in estrogen-rich media (Figure 5a). We then continued to maintain the LTED cells for another 6 months.
After a period of super-sensitivity to estrogen at low doses (10 -11 to 10 -13 M), which happened around day 150, they became completely unresponsive to estrogen treatments at all concentrations by day 250 (Figure 5b). These results are consistent with the previous reports and follow a similar timeline 15,[35][36][37][38] .
To validate the gene expression pro les of the candidate genes, we collected samples of LTED cells at different time points (from day 0 to day 135) and measured the expression of some of the candidate genes using qRT-PCR (Figure 5c). The expression pro le of these genes closely mimicked their gene expression patterns from the microarray studies, which further con rms microarray results and our downstream bioinformatics analysis. Furthermore, as our qRT-PCR analysis contains much more data points compared to the previous study, the expression patterns of the genes are more detailed than the microarray results.

Discussion
In the present study, we utilized a newly developed statistical and computational pipeline to examine the process of endocrine resistance in breast cancer and found novel underlying gene associations. First, we compared the data from patients resistant and sensitive to the endocrine therapies with publicly available gene expression data from cell-based models. We found LTED MCF7 cell model is the closest to the patients' data among cell models analyzed based on their whole gene expression patterns. Next, we reanalyzed the gene expression in LTED cells during the process of acquiring endocrine resistance. Using our statistical and computational pipeline, which is designed for identifying dynamically signi cant genes and capable of comparing genes and gene clusters based on their time-resolved expression 18,19 , we grouped the genes into different modules, or DRGs, based on their expression patterns in LTED cells. Then, we compared the DRGs to the gene expressions of patients' tumor samples from endocrineresistant or sensitive tumors and found a group of 319 genes as potential drivers of endocrine-resistance development. Finally, using multiple bioinformatics approaches, we narrowed down the candidate genes list to 34 genes from major four modules, primarily from modules 1 and 2. These genes were selected based on gene-gene network analysis and literature analysis. Analyzing the gene-gene networks enabled us to select the genes with a higher probability of being more integral to the networks. These candidate genes are potential targets for developing potent therapies for endocrine-resistant breast cancer. The expression patterns of some of these genes were further validated in biological settings by developing an LTED cell line.
Twenty out of the 34 genes were from the rst module, which showed a slight decrease in expression by day 30, followed by signi cant upregulation afterward, which correlates to the growth rates of MCF7 cells under the development of endocrine resistance. As anticipated, the genes in module 1 were primarily associated with DNA replication and repair mechanisms, emphasizing the importance of these mechanisms in the replication and growth of cancer cells. Minichromosome maintenance complex genes MCM2, 3, 4, 6, and 7 were identi ed in Module 1. Overexpression of MCM2, MCM3, MCM4, and MCM6 is associated with luminal B, HER2 + , and triple-negative breast cancers 39,40 . RAD51, one of the genes in module 1 is known to be related to overall poor survival and endocrine resistance in breast cancer, while it is also linked to regulation of metastasis in TNBC 32,41 . Among other genes in module 1 is replication factor C subunit 3 (RFC3), which is essential for the homologous DNA pairing and strand exchange. Downregulation of RFC3 has been shown to attenuate cell proliferation, migration, and invasion in TNBC 42,43 . The cyclin-dependent kinases (CDKs) are critical regulatory enzymes governing cell cycle transitions and play an essential role in the development of endocrine resistance in some breast cancer cases 10,44 . While CDK 4/6 are the most studied members of the CDK family in relation to endocrine-resistant breast cancer, our analysis revealed another member of this family, CDK1 to be important for both endocrine resistance and TNBC. Notably, CDK1 is required for the initiation of mitosis and cell proliferation, and its inhibition in TNBC led to a decrease in cell viability and an increase in cell apoptosis 45 .
Fourteen genes are related to the rest of the modules, with 11 of them related to module 2. The majority of these genes are already known to be related to breast cancer in some way, yet their connection to endocrine resistance and TNBC has not been discovered yet and calls for further experiments. Among the genes in module 2, the network analysis suggested ELAVL1, GABARAPL2, and CAV1 as the main regulators of all other genes. Protein Phosphatase 1A (PPM1A) is also among the genes in module 2, which means that it was downregulated in endocrine-resistant breast cancer, as well as TNBC. This protein is a member of the protein phosphatase 2C family of Ser/Thr protein phosphatases and has been shown to regulate mitogen-activated protein kinase cellular signaling pathways as well as proliferation, cell invasion, and migration 46,47 . Recently, Mazumdar et al. showed that this protein is signi cantly downregulated in ER-negative breast cancers and that its upregulation suppresses in vitro and in vivo growth of TNBC cells 48 , which supports our ndings.
Interestingly, the genes we identi ed important for the resistance to endocrine therapies include most genes that are also signi cantly regulated in TNBC compared to luminal A breast cancer (Figure 3a), suggesting that these genes are both of great importance in the endocrine-resistance process as well as triple-negative breast cancer. We addressed the gap in the knowledge regarding the genetic factors underlying endocrine-resistant as well as triple-negative breast cancers. These two subtypes of breast cancer are the most fatal types of breast cancer with no known effective therapeutic approaches available to date. Thus, research on the underlying genetic factors of these cancers is of great importance. Here, we found a group of candidate genes that are signi cantly regulated in the process of endocrine resistance, and with further studies, they will be potential targets for the treatment of endocrineresistant breast cancers. Moreover, the majority of these genes are also signi cantly regulated in TNBC compared to luminal A breast cancer, suggesting that endocrine-resistant breast cancer and TNBC share signi cant similarities in their mechanisms. Therefore, these genes we identi ed can also be potential therapeutic targets for treating TNBC. Still, more than 1 200 genes were signi cantly regulated in TNBC and endocrine resistance development in LTED cells but not in endocrine-resistant breast cancer patients. This can be because data from endocrine-resistant patients only show the comparisons between resistant and sensitive tumors, and thereby, does not represent the genes that are important for establishing endocrine resistance. Since we cannot determine whether the endocrine-resistant patients had already gained resistance or not at the time of the endocrine therapy, we could not track the development of endocrine resistance from patients' datasets. In contrast, our approach analyzed the process of developing endocrine resistance, not just the endpoints of endocrine resistance. Our analysis found several novel genes with potential signi cance in endocrine-resistant breast cancer as well as TNBC and opened new doors toward designing new therapeutic approaches for endocrine-resistant breast cancer and TNBC.
Our novel statistical and computational analysis approach, combined with the use of multiple types of data from cell models to patients, revealed a new potential to develop effective therapeutic approaches toward various diseases. Further biological experiments are warranted to con rm a measurable degree of importance of these genes. While we did our best to use both cell line data and patient data to nd the genes with the most physiological and clinical signi cance, more datasets for both patient data and dynamic gene expression data will enhance the validation and increase the reliability of our ndings.

Data Acquisition and Processing
The microarray data of GSE20361, GSE87411, and GSE111151 were downloaded from GEO, the public functional genomics repository [15][16][17] . Data from GSE20361 were analyzed using a Matlab-based pipeline to identify DRGs described as below, while data from GSE111151 and GSE87411 were used in conjunction with Limma. Using default options, we identi ed the differentially expressed genes from both datasets for downstream analysis. To generate the heatmap, the log-fold-change values of the datasets and the "ComplexHeatmap" package in R were used. To identify the genes signi cantly regulated in TNBC compared to luminal A breast cancer, we used UALCAN 49 . Each gene was checked for its expression in the TNBC subtype compared to luminal A subtype, and those with signi cant differential expression, using a signi cance level of 0.05, were selected for further analysis. To generate the genegene and protein-protein interaction networks, gene lists were analyzed using NetworkAnalyst, ToppGene, ToppFun, and Funcoup, using the default parameters. Metascape tool was used for all the gene enrichment analysis throughout the paper using the default parameters.

Dynamic Gene Expression Analysis Pipeline
As part of the analysis, we used a pipeline tool we developed 18, 19 . One key assumption of the methodology is that a small fraction of the genes responds to external stimuli. Further, we assume that the larger fraction of the genes maintains constant expression over time. Thus, spline smoothing was applied to the expression patterns using a smoothing regularity parameter obtained by minimizing generalized cross-validation of the top 200 most-responsive probes and applied to all expression pro les. A subset of the genes that exhibit time course patterns that have relatively smooth trajectories was ranked by their interquartile range, and the top genes for the estimation subset were selected. Then, the DRGs with similar expression patterns over time are clustered into temporal GRMs using Iterative Hierarchal Clustering (IHC).

LTED Cell Line Development
MCF7 cells were cultured in estrogen-depleted growth media (phenol red-free RPMI1690 medium (Thermo Fisher Scienti c, Waltham, MA) supplemented with 5% charcoal-stripped serum (CSS) and 10 mg/ml of Insulin (Thermo Fisher Scienti c) for extended periods. Cell samples were collected every 7 days and used for downstream analysis.

Proliferation Assay
Both parental and LTED MCF7 cells were cultured in a 96-well plate. After 24 hours, the cells were treated in triplicates with E 2 in RPMI1690 plus 5% CSS. The treatment was repeated every other day and at the end of the seventh day, then the plates were frozen for the assay. FluoReporter blue uorometric dsDNA quanti cation kit (Invitrogen, Waltham, MA) was used per the manufacturer's protocol. The excitation/emission wavelength of 360/460 nm were read and normalized to the control wells.
qRT-PCR Analysis mRNA abundance was evaluated by qRT-PCR as described 19 . Relative mRNA levels were calculated by using the comparative CT method normalized to cyclophilin. The primers were designed using Primer Express Software (Applied Biosystems, Waltham, MA) as shown in Table S2.

Data Availability
All data are available in the main text or Supplementary Information. Figure 1 Heatmap of the global gene expression patterns between cell-based models of endocrine resistance and patient data. Data for the LTED cell model was obtained from GSE20361. Data for patient's responses to hormonal therapies was obtained from GSE87411. The rest of the cell model data was obtained from GSE111151. For cell models ZR75-1, BT474, and T47D the data consisted of two populations of tamoxifen-resistant cell lines both of which are included here in the heatmap. The dendrogram represents the closeness of the datasets, thus the LTED cell line is the most similar to the patient's data followed by ZR-75-1 tamoxifen-resistant cells.  Similar gene expression patterns between endocrine-resistant breast cancer and triple-negative breast cancer. a) Venn diagram depicts the shared signi cantly regulated genes between endocrine-resistant and triple-negative breast cancers and patient's data. b) Gene ontology analysis showing the signi cantly enriched ontologies and pathways for the common genes. c) Transcription factor (TF) analysis of the common genes shows the majority of them are regulated by E2F1, MYCN, and TP53 TFs.

Figure 4
Tissue-speci c protein-protein interaction network for modules 1 and 2 candidate genes. Candidate genes from modules 1 and 2 were fed to NetworkAnalyst and the generated interaction networks are shown below. Red and orange circles represent the candidate proteins while the yellow ones are other proteins that interact with our candidate proteins. The bigger the size of the circle, the more interactions it has. The scale of the module one and two gures is different as module 1 contains signi cantly more genes that made the visualization harder. Proliferation assay and qRT-PCR results con rm the bioinformatics ndings. a) Proliferation assay of the LTED cells at day 90 and normal MCF7 cells under different treatment doses of E2 (from 10-14 to 10-7 M) showed that LTED cells were not responsive to estrogen treatment anymore. b) Proliferation assay of the LTED cells at different days under several doses of E2 (from 10-15 to 10-7 M) showed that cells completely lost their responsiveness to E2 treatment by day 250. c) qRT-PCR results for MCM2, CDK1, and