Molecular characterization of ESR1 variants in breast cancer

Estrogen receptor 1 (ESR1) mutations and fusions typically arise in patients with hormone receptor-positive breast cancer after aromatase inhibitor therapy, whereby ESR1 is constitutively activated in a ligand-independent manner. These variants can impact treatment response. Herein, we characterize ESR1 variants among molecularly profiled advanced breast cancers. DNA next-generation sequencing (592-gene panel) data from 9860 breast cancer samples were retrospectively reviewed. Gene fusions were detected using the ArcherDx fusion assay or whole transcriptome sequencing (n = 344 and n = 4305, respectively). Statistical analyses included Chi-square and Fisher’s exact tests. An ESR1 ligand-binding domain (LBD) mutation was detected in 8.6% of tumors evaluated and a pathogenic ESR1 fusion was detected in 1.6%. Most ESR1 LBD mutations/fusions were from estrogen receptor (ER)-positive samples (20.1% and 4.9%, respectively). The most common ESR1 LBD mutations were D538G (3.3%), Y537S (2.3%), and E380Q (1.1%). Among biopsy sites, ESR1 LBD mutations were most frequently observed in liver metastases. Pathogenic ESR1 fusions were identified in 76 samples (1.6%) with 40 unique fusion partners. Evaluating co-alterations, ESR1 variant (mutation/fusion) samples more frequently expressed androgen receptor (78.0% vs 58.6%, P < 0.0001) and less frequently expressed immune checkpoint proteins than ESR1 wild-type samples (PD-1 20.0% vs 53.4%, P < 0.05; immune cell PD-L1 10.0% vs 30.2%, P < 0.0001). We have described one of the largest series of ESR1 fusions reported. ESR1 LBD mutations were commonly identified in ER-positive disease. Limited data exist regarding the clinical impact of ESR1 fusions, which could be an area for future therapeutic exploration.


Introduction
Most primary breast cancers are hormone receptor-positive [1], and targeting the estrogen receptor (ER) is a common treatment strategy [2]. Aromatase inhibitors, selective ER modulators (SERMs), and selective ER downregulators (SERDs) can successfully treat ER-positive disease [3]. However, for a significant number of patients, endocrine resistance eventually develops, leading to disease recurrence and metastases [4]. Delineating the mechanisms of resistance and developing strategies to overcome treatment-resistant disease are important focuses for improving breast cancer outcomes.
Alterations known to contribute to acquired endocrine resistance include somatic alterations to key cancer pathways, ER transcriptional regulators, and DNA repair genes [5][6][7]. Comparative analyses of tumor sequencing data for hormone therapy-naïve and post-hormone therapy tumors identified ESR1, ERBB2, and NF1 among the most frequently mutated genes in response to hormone therapy [6]. ESR1 encodes ERα, and mutations that result in its constitutive activation (mutations in the ligand-binding domain, LBD) are found almost exclusively in patients with endocrine therapy-resistant disease [8,9]. ESR1 LBD mutations [10], copy number gains [11], and gene fusions [12] are all associated with acquired resistance and metastatic disease. In an analysis of 541 patients with metastatic breast cancer previously treated with an aromatase inhibitor [10], Chandarlapaty and colleagues tested circulating tumor DNA (ctDNA) for two LBD hotspot mutations, Y537S and D538G, and almost 30% of patients had one of these mutations. In another targeted sequencing analysis (287-gene panel) of 11,616 breast cancer tumors, ESR1 mutations were identified in 10% of samples and were overwhelmingly enriched in metastatic samples; 78% of the samples in which ESR1 mutations were found were metastatic [13].
We sought to survey the landscape of ESR1 variants among a large genomic database of breast cancer cases. We analyzed 9860 breast cancer tumors, which included 5337 (54.1%) samples from distant metastatic sites, with a combination of large-panel sequencing (a 592-gene panel) and whole transcriptome sequencing. Herein, we report the results of our ESR1-focused analyses.

Patient samples
Formalin-fixed paraffin-embedded (FFPE) patient samples (n = 9860) were submitted to a commercial CLIA-certified laboratory (Caris Life Sciences, Phoenix, AZ). The present study was conducted in accordance with the guidelines of the Declaration of Helsinki, Belmont Report, and U.S. Common Rule. In compliance with policy 45 CFR 46.101(b), this study was conducted using retrospective, de-identified clinical data, and patient consent was not required.

Next-generation sequencing (NGS) for 592-gene panel
NGS was performed on genomic DNA isolated from FFPE tumor samples (n = 9860) using the NextSeq platform (Illumina, Inc., San Diego, CA). Matched normal tissue was not sequenced. A custom-designed SureSelect XT assay was used to enrich 592 whole-gene targets (Agilent Technologies, Santa Clara, CA). All variants were detected with > 99% confidence based on allele frequency and amplicon coverage, with an average sequencing depth of coverage of > 500 and an analytic sensitivity of 5%. Prior to molecular testing, tumor enrichment was achieved by harvesting targeted tissue using manual microdissection techniques. Genetic variants identified were interpreted by board-certified molecular geneticists and categorized as 'pathogenic,' 'likely pathogenic,' 'variant of unknown significance,' 'likely benign,' or 'benign,' according to the American College of Medical Genetics and Genomics standards. Alteration rates were calculated as the total number of samples harboring a 'pathogenic' or 'likely pathogenic' variant divided by the total number of samples scored.
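The alteration-rate definition above can be illustrated with a minimal Python sketch. The function name, the input layout (one list of classification labels per scored sample), and the toy cohort are assumptions made for illustration only; they are not the laboratory's pipeline or study data.

```python
from typing import Iterable, List

# Classification labels taken from the text; only these two count as alterations.
COUNTED = {"pathogenic", "likely pathogenic"}


def alteration_rate(sample_calls: Iterable[List[str]]) -> float:
    """Fraction of scored samples with at least one counted variant.

    Each element of sample_calls holds the classifications reported for one
    scored sample (an empty list means no variant was reported).
    """
    samples = list(sample_calls)
    if not samples:
        raise ValueError("no scored samples")
    altered = sum(1 for calls in samples if any(c in COUNTED for c in calls))
    return altered / len(samples)


# Hypothetical toy cohort of four scored samples (not study data).
toy_cohort = [
    ["pathogenic"],
    ["benign"],
    [],
    ["likely pathogenic", "likely benign"],
]
rate = alteration_rate(toy_cohort)
print(f"alteration rate: {rate:.1%}")
```

A sample is counted once regardless of how many qualifying variants it carries, which matches the per-sample wording of the definition.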

Copy number alteration (CNA)
The CNA of each exon was determined by calculating the average depth of the sample along with the sequencing depth of each exon and comparing this calculated result to a precalibrated value.

Immunohistochemistry (IHC)
IHC was performed on full FFPE sections of glass slides. These slides were stained using automated staining techniques, per the manufacturer's instructions, and optimized and validated per Clinical Laboratory Improvement Amendments/College of American Pathologists and International Organization for Standardization requirements. Staining was scored for intensity (0 = no staining; 1 + = weak staining; 2 + = moderate staining; 3 + = strong staining) and staining percentage (0-100%).

Tumor mutational burden (TMB)
TMB was measured by counting all non-synonymous missense, nonsense, in-frame insertion/deletion, and frameshift mutations found per tumor that had not been previously described as germline alterations in the dbSNP151 or Genome Aggregation Database (gnomAD) databases or as benign variants identified by Caris Life Sciences geneticists. A cutoff point of ≥ 10 mutations per megabase (mt/MB) was used based on the KEYNOTE-158 pembrolizumab trial [14], which showed that patients with a TMB of ≥ 10 mt/MB across several tumor types had higher response rates than patients with a TMB of < 10 mt/MB. Caris Life Sciences is a participant in the Friends of Cancer Research TMB Harmonization Project [15].
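As a rough illustration of the TMB definition and cutoff described above, the following Python sketch counts qualifying mutations and normalizes by the sequenced territory. The dictionary keys, the class labels, and the 1.5 Mb territory used in the example are hypothetical; the text does not state the size of the sequenced region, so it is passed in as a parameter.

```python
# Mutation classes counted toward TMB, as listed in the text (labels are illustrative).
COUNTED_CLASSES = {"missense", "nonsense", "inframe_indel", "frameshift"}
TMB_HIGH_CUTOFF = 10.0  # mutations per megabase, as stated in the text


def tmb_per_mb(mutations, territory_mb):
    """Return counted mutations per megabase.

    mutations: list of dicts with keys 'cls' (mutation class) and
        'germline_or_benign' (True if previously described as germline or benign).
    territory_mb: sequenced territory in megabases (assumed; not given in the text).
    """
    if territory_mb <= 0:
        raise ValueError("territory_mb must be positive")
    counted = sum(
        1
        for m in mutations
        if m["cls"] in COUNTED_CLASSES and not m["germline_or_benign"]
    )
    return counted / territory_mb


def is_tmb_high(tmb):
    """Apply the >= 10 mt/MB cutoff."""
    return tmb >= TMB_HIGH_CUTOFF


# Hypothetical tumor: 15 qualifying missense calls, one synonymous call,
# and one nonsense call already described as germline.
toy_calls = [{"cls": "missense", "germline_or_benign": False}] * 15
toy_calls += [{"cls": "synonymous", "germline_or_benign": False}]
toy_calls += [{"cls": "nonsense", "germline_or_benign": True}]

toy_tmb = tmb_per_mb(toy_calls, territory_mb=1.5)
print(toy_tmb, is_tmb_high(toy_tmb))
```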

Whole transcriptome sequencing (WTS)
WTS uses a hybrid-capture method to pull down the full transcriptome from FFPE tumor samples (n = 4305; the WTS platform was not available at the time of profiling for all samples) using the Agilent SureSelect Human All Exon V7 bait panel (Agilent Technologies, Santa Clara, CA) and the Illumina NovaSeq platform (Illumina, Inc., San Diego, CA). FFPE specimens underwent pathology review to determine percent tumor content and tumor size; a minimum of 10% tumor content in the area for microdissection was required to enable enrichment and extraction of tumor-specific RNA. The Qiagen RNA FFPE tissue extraction kit was used for extraction, and the RNA quality and quantity were determined using the Agilent TapeStation. Biotinylated RNA baits were hybridized to the synthesized and purified complementary DNA (cDNA) targets, and the bait-target complexes were amplified in a post-capture PCR reaction. The resultant libraries were quantified and normalized, and the pooled libraries were denatured, diluted, and sequenced. Raw data were demultiplexed using the Illumina DRAGEN FFPE accelerator. FASTQ files were aligned with the STAR aligner (Alex Dobin, release 2.7.4a, GitHub). A full 22,948-gene dataset of expression data was produced by Salmon, which provides fast and bias-aware quantification of transcript expression [16]. BAM files from the STAR aligner were further processed for RNA variants using a custom detection pipeline. The reference genome used was GRCh37/hg19, and analytical validation of this test demonstrated ≥ 97% Positive Percent Agreement (PPA), ≥ 99% Negative Percent Agreement (NPA), and ≥ 99% Overall Percent Agreement (OPA) with a validated comparator method.

Fusion detection by WTS
For samples tested February 2019 and later, gene fusion detection was performed on mRNA isolated from an FFPE tumor sample (n = 4305) using the Illumina NovaSeq platform (Illumina, Inc., San Diego, CA) and the Agilent SureSelect Human All Exon V7 bait panel (Agilent Technologies, Santa Clara, CA). FFPE specimens underwent pathology review to determine percent tumor content and tumor size; a minimum of 10% tumor content in the area for microdissection was required to enable enrichment and extraction of tumor-specific RNA. The Qiagen RNA FFPE tissue extraction kit was used for extraction, and the RNA quality and quantity were determined using the Agilent TapeStation. Biotinylated RNA baits were hybridized to the synthesized and purified cDNA targets, and the bait-target complexes were amplified in a post-capture PCR reaction. The resultant libraries were quantified and normalized, and the pooled libraries were denatured, diluted, and sequenced. The reference genome used was GRCh37/hg19, and analytical validation of this test demonstrated ≥ 97% Positive Percent Agreement (PPA), ≥ 99% NPA, and ≥ 99% OPA with a validated comparator method.

Fusion detection by Archer
For samples tested prior to February 2019, gene fusion detection was performed by targeted RNA sequencing using the ArcherDx fusion assay (Archer FusionPlex Solid Tumor panel). The FFPE tumor samples (n = 344) were microdissected to enrich the sample to ≥ 20% tumor nuclei, and mRNA was isolated and reverse transcribed into cDNA. Unidirectional gene-specific primers were used to enrich for target regions, followed by NGS (Illumina MiSeq platform). Targets included 52 genes, and the full list can be found at http://archerdx.com/fusionplex-assays/solid-tumor. We analyzed reads and contigs that were matched to a database of known fusions and other oncogenic isoforms (Quiver database, ArcherDx), as well as those novel isoforms or fusions with high reads (> 10% of total reads) and high confidence after bioinformatic filtering. Samples with < 4000 unique RNA reads were reported as indeterminate and excluded from analysis, and all the analyzed fusions were in-frame and were predicted to have kinase domains preserved. Fusions among the > 11,000 fusions known to be found in normal tissues were excluded [16]. The detection sensitivity of the assay allows for detection of a fusion that is present in at least 10% of the cells in the samples tested.
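The reporting rules described in this paragraph can be summarized as a small decision function. This Python sketch is only an illustration under stated assumptions: the function and argument names are hypothetical, the order in which the rules are applied is assumed, the denominator for the read fraction is taken as whatever total the assay reports, and the "high confidence after bioinformatic filtering" step is not modeled.

```python
MIN_UNIQUE_RNA_READS = 4000   # samples below this are indeterminate (from the text)
NOVEL_READ_FRACTION = 0.10    # novel fusions need > 10% of total reads (from the text)


def classify_fusion_call(unique_rna_reads, fusion_reads, total_reads,
                         in_known_database, in_normal_tissue_list):
    """Return one of 'indeterminate', 'excluded', 'reported', 'not reported'.

    Rule order is an assumption: sample-level quality first, then the
    normal-tissue exclusion list, then known-database or read-fraction support.
    """
    if unique_rna_reads < MIN_UNIQUE_RNA_READS:
        return "indeterminate"
    if in_normal_tissue_list:
        return "excluded"
    if in_known_database:
        return "reported"
    if total_reads > 0 and fusion_reads / total_reads > NOVEL_READ_FRACTION:
        return "reported"
    return "not reported"


# Hypothetical calls (not study data).
examples = [
    classify_fusion_call(3500, 900, 1000, True, False),   # too few unique reads
    classify_fusion_call(5000, 50, 1000, True, False),    # known fusion
    classify_fusion_call(5000, 150, 1000, False, False),  # novel, 15% of reads
    classify_fusion_call(5000, 100, 1000, False, False),  # novel, exactly 10%
    classify_fusion_call(5000, 500, 1000, True, True),    # seen in normal tissue
]
print(examples)
```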

Statistical analysis
All statistical analyses were performed with JMP V13.2.1 (SAS Institute) or R Version 3.6.1 (https://www.R-project.org). Categorical data were evaluated using Chi-square or Fisher's exact test, where appropriate.
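For readers unfamiliar with Fisher's exact test, the following self-contained Python sketch computes a two-sided P value for a 2 × 2 table from the hypergeometric distribution. It is an illustration only: the analyses in this study were run in JMP and R, and the table used here contains hypothetical counts, not study data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


# Hypothetical table: rows are variant vs wild-type, columns are marker
# positive vs negative.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```

The exact test is generally preferred over the Chi-square approximation when expected cell counts are small, which is the usual reading of "where appropriate".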

Results
We retrospectively reviewed the molecular profiles of a national cohort of 9860 tumor samples from 9545 unique breast cancer patients that were submitted to Caris Life Sciences for molecular testing. All samples were analyzed using targeted DNA NGS, and 4305 samples also had WTS performed. Most patients (74.7%) were aged 50 years or older at the time of molecular profiling, and 94 patients (1%) were male. A slight majority of samples (54.1%) were obtained from distant metastatic sites, 45.8% were from primary breast tissue or locoregional (LR) lymph nodes, and the remaining 0.1% were from lymph node sites that were not otherwise specified (Table 1). The most represented metastatic sites were liver (n = 1655, 31%) and bone (n = 733, 13.7%). Overall, an ESR1 LBD mutation was detected in 8.6% of all tumors evaluated and a pathogenic ESR1 fusion was detected in 1.6%. ESR1 variants (mutation or fusion) were enriched in ER-positive/HER2-negative tumors (14.5% with LBD mutations and 2.6% with fusions). ESR1 LBD mutations were exclusive to ER-positive tumors, whereas ESR1 fusions were also noted, although rarely, in ER-negative tumors. Breast tumors with an unclear receptor subtype (i.e., indeterminate IHC result for ER, PR, and/or HER2) accounted for 10.7% of the cohort.

ESR1 fusions
Our assays were capable of detecting gene fusions for 4649 tumor samples. We profiled 4305 samples by WTS and 344 samples by Archer panels. At least one pathogenic/likely pathogenic ESR1 fusion isoform was detected in 76 samples, which constitutes 1.6% of evaluable tumor samples. A total of 40 unique fusion partners were identified, with ESR1 exclusively observed as the upstream (5′) fusion partner. The majority of ESR1 fusion-positive samples lacked a concurrent ESR1 LBD mutation (n = 69, 91%). Of the ESR1 fusions with resolvable breakpoints (94%), 56.5% of downstream fusion partner sequences were in-frame with ESR1; five ESR1 fusion transcripts could not be classified because of low resolution across the breakpoint.
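The headline proportions in this paragraph follow directly from the stated counts, as the short Python check below shows. Variable names are illustrative; all numbers are taken from the text.

```python
# Counts stated in the text.
wts_samples = 4305
archer_samples = 344
fusion_positive = 76
fusion_without_lbd_mutation = 69

# Samples evaluable for fusions are those profiled by either assay.
evaluable = wts_samples + archer_samples

# Percentage of evaluable samples with a pathogenic/likely pathogenic ESR1 fusion.
fusion_rate_pct = round(100 * fusion_positive / evaluable, 1)

# Percentage of fusion-positive samples lacking a concurrent LBD mutation.
no_lbd_pct = round(100 * fusion_without_lbd_mutation / fusion_positive)

print(evaluable, fusion_rate_pct, no_lbd_pct)
```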

Subjects with multiple biopsies
Two hundred ninety-eight patients had multiple biopsy samples analyzed. We were particularly interested in the samples from 92 patients for whom both a primary and a distant metastatic biopsy sample were analyzed. Nine patients had an ESR1 variant detected in the metastatic sample only, one had a variant in the primary sample only, and one had an ESR1-D538G mutation in both the primary and metastatic samples. We analyzed expression patterns among 51 patients with an ESR1 variant detected in at least one biopsy (Fig. 5). For those who had ESR1-E380Q variants detected, these variants were more frequently observed across all of an individual's biopsy samples than unique to a subset of their biopsies (66.7%, 4 of 6 patients), whereas D538G (42.1%, 8 of 19 patients), Y537S (37.5%, 6 of 16 patients), and other LBD variants (33.3%, 4 of 12 patients) were more often unique to a subset of a patient's biopsies. Of the 30 patients with an ESR1 variant unique to a subset of their own biopsies, 86.7% (n = 26) did not harbor the variant in the initial biopsy, including nine patients whose initial biopsy was from the breast or LR lymph nodes. An ESR1 fusion was identified in three samples, although it is unclear whether fusions were present in the respective paired samples, as fusion detection was not performed at the time of tumor profiling.

Discussion
In this study of 9860 breast cancer tumors that underwent comprehensive molecular profiling, we have identified a broad range of ESR1 variants. ESR1 LBD mutations were detected in 8.6% of all tumors evaluated and a pathogenic ESR1 fusion was detected in 1.6%. ESR1 LBD mutations were observed in 14.5% of ER+/HER2− breast cancer samples. ESR1 LBD mutations were somewhat less common than previously reported [10,17]. This may be explained by characteristics of this cohort (a more heterogeneous population) and by the selection of a pretreatment specimen for molecular profiling in several instances. Inherent ESR1 mutations are rare [6]; therefore, this practice would lower the frequency of identified variants. Higher rates of ESR1 mutations (25% to 40%) have been reported in clinical trial settings following progression on endocrine therapy, often utilizing specimens collected at progression of disease [10,17]. However, in a similar study utilizing the Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (MSK-IMPACT) platform to analyze 929 breast cancers, ESR1 mutations were identified in 10% of samples [9]. In another study that evaluated the molecular profiles of 11,616 breast tumors, ESR1 mutations were detected in 10.2% and were enriched in metastatic samples [13]. ESR1 variants were more prevalent in liver samples than in primary breast samples (27.0% vs 2.3%, P < 0.0001), and liver samples accounted for 55.9% (n = 447/799) of all ESR1 variants identified in metastatic tissue. These findings are similar to prior reports [13], in which the highest ESR1 mutation rates were found in liver metastases (44%), followed by pleura (25%), lung (24%), and bone (20%), with a 5% rate of ESR1 mutations in breast tissue samples. In our study, patients with multiple biopsies more commonly harbored an ESR1 variant in a subsequent tissue sample than in the initial biopsy. Limited literature exists assessing the prevalence of ESR1 variants in paired tissue samples. In one previous study, matched samples were available for four patients, and the allele frequencies of ESR1 mutations were higher in biopsy samples obtained at progression [18].
The most common ESR1 LBD mutations in our study were D538G, Y537S, and E380Q. These mutations are known to be clinically important. In the BOLERO-2 phase III trial that assessed the combination of everolimus and exemestane in patients with endocrine therapy-resistant metastatic ER-positive breast cancer, both D538G and Y537S mutations identified from baseline cell-free DNA (cfDNA) were associated with decreased overall survival [10]. The SoFEA (Study of Faslodex Versus Exemestane With or Without Arimidex) phase III study identified ESR1 mutations in 39.1% of available baseline plasma ctDNA samples. Within the exemestane-treated arm, ESR1 mutation status was associated with worse progression-free survival (PFS) (HR 2.12, P = 0.01) [17]. In the PALOMA-3 phase III trial that evaluated the addition of palbociclib to second-line endocrine therapy, ESR1 mutations were identified in 25.3% of available baseline plasma ctDNA samples. Treatment with palbociclib significantly improved PFS regardless of ESR1 mutation status. Among patients who received fulvestrant and placebo, PFS was shorter for those with ESR1 mutations than for those without (3.6 months vs 5.4 months) [17]. In both the SoFEA and PALOMA-3 studies, the predominant ESR1 mutations were D538G, Y537S/N, and E380Q [17].
Targeting fusions for cancer treatment has made a significant impact in other cancer types, including EML4-ALK fusions in non-small cell lung cancer and NTRK fusions, for which targeted therapies are now available. These therapies have led to significant improvements in overall response rate and PFS in a biomarker-selected population [19,20]. Fusion transcripts that are in-frame are more often considered pathogenic and clinically relevant due to retained key functional domains (i.e., kinases, LBD) that can be activated by the fusion partner. Out-of-frame fusion products can have a more varied biology, as completely new sequences without preserved domains are being translated. We identified several ESR1 fusions that are likely to have clinical relevance, albeit rare, such as recurrent ESR1:CCDC170, ESR1:YAP1, ESR1:NCOA2, and ESR1:PLEKHG1 fusions, along with 36 other unique fusions. ESR1 fusion transcripts are of clinical importance as they may allow for constitutive activation of the estrogen receptor and can contribute to endocrine therapy resistance, predominantly by deleting the binding domain for traditional estrogen inhibitory therapies [21]. ESR1 fusions can therefore render breast cancer cells resistant to aromatase inhibitors, SERMs, and SERDs due to disruption of the LBD. Conversely, ESR1 LBD mutations may allow for some level of responsiveness to SERMs and SERDs [8,9,22]. In addition to causing endocrine therapy resistance, ESR1 fusion products may activate downstream signaling pathways [23] and support metastatic proliferation. In our analysis, the majority of the ESR1 fusion products retained the ERD and ZF domains in the setting of a dysfunctional LBD, suggesting that the ability to bind DNA and initiate downstream signaling was retained. ESR1:NCOA2 fusions are not well described in breast cancer but have been identified in uterine tumors [24]. NCOA2 is a transcriptional coactivator for nuclear hormone receptors, and this fusion product may utilize an active promoter region to dysregulate expression of the NCOA2 coactivator domain, thereby increasing signaling through proliferative pathways, including estrogen-mediated pathways. ESR1:CCDC170 fusions activate proliferation pathways involving HER2/HER3 [21,25]. ESR1:YAP1 upregulates an epithelial-to-mesenchymal transcriptional signature, thereby promoting metastasis [21], and exhibits estrogen-independent enrichment at regulatory regions of estrogen-responsive genes [21]. A fusion event could therefore provide multiple mechanisms for cancer growth escape.
The ESR1:CCDC170 fusion was the most common fusion we identified. It was out-of-frame in all instances; however, a previous study suggested that this fusion has biological relevance in ER-positive breast cancers. As previously described by Veeraraghavan et al. [25], ESR1:CCDC170 fusion transcripts likely generate N-terminally truncated CCDC170 proteins expressed under the ESR1 promoter, which could cause constitutive activation of the ER LBD.
Targeting ESR1 fusions could be a potential treatment strategy. In one study that evaluated proteins interacting with ESR1 fusion transcripts, enhanced recruitment of 26S proteasomal subunits was identified in tumors characterized by an ESR1:YAP1 fusion [26]. Following treatment with bortezomib, a 26S proteasome inhibitor, and fulvestrant, tumor growth was suppressed [27]. Blocking the downstream estrogen receptor kinases CDK4/6 with CDK4/6 inhibitors has also been evaluated [21]. In a patient-derived xenograft model harboring the ESR1:YAP1 fusion, treatment with palbociclib led to inhibited tumor growth, decreased Ki-67 levels, and reduced pRb. The sensitivity of ESR1 fusion-expressing breast cancer cells to concomitant HER2-targeted therapies has also been assayed in a preclinical setting; breast cancer cells harboring an ESR1:CCDC170 fusion treated with tamoxifen and lapatinib showed decreased growth [28].
The main limitation of our study is the lack of longitudinal outcomes data. We did not have access to clinical outcomes data to correlate ESR1 variants with treatment response and survival, and this is beyond the scope of the current study. Most patients were presumed to have stage IV disease, even if the tumor submitted was obtained from a breast biopsy or surgical specimen. However, it is unknown whether patients presented with de novo metastatic disease, and their histories of previous lines of therapy are also unknown. In addition, it is not possible to precisely define the clinical scenarios of patients at the time the specimen was collected, i.e., whether the biopsy was taken from a current site of residual disease or progression during therapy, or prior to any systemic therapy. Finally, the sequencing platforms available for tumor profiling varied over time, with fusion data unavailable for many samples.
Herein, we have described one of the largest series of ESR1 fusions reported, with 40 unique fusions identified. ESR1 LBD mutations were common, identified in 8.6% of all tumors evaluated and 14.5% of ER+/HER2− tumors. An improved understanding of how ESR1 variants affect ER signaling may ultimately guide treatment choices following progression on endocrine therapy. Future studies investigating the prognostic implications of ESR1 variants and how ESR1 variants affect responses to therapies beyond endocrine therapy are needed.
Acknowledgements No acknowledgements to report.
Author contributions ALH: corresponding/first author, material preparation, data analysis, manuscript writing and editing. AE: material preparation, data collection and analysis, manuscript writing and editing. RF: project conceptualization, data collection and analysis, manuscript editing. HFO': material preparation, data analysis, manuscript writing and editing. PRP: data analysis, manuscript editing. FL: data analysis, manuscript editing. SMS: data analysis, manuscript editing. MRN: data analysis, manuscript editing. DM: data collection and analysis, manuscript editing. MJO: data collection and analysis, manuscript editing. JS: data collection and analysis, manuscript editing. GV: data analysis, manuscript editing. CI: data analysis, manuscript editing. LS: data analysis, manuscript editing. WMK: data collection and analysis, manuscript editing. ART: project conceptualization, data analysis, manuscript editing.

Data availability
The datasets generated during and/or analyzed during the current study are not publicly available but may be available from the corresponding author upon reasonable request.