Concurrent mutations associated with trastuzumab-resistance revealed by single cell sequencing

HER2-positive breast cancer patients benefit from HER2-targeted therapies, among which the most commonly used is trastuzumab. However, acquired resistance typically happens within one year. The cellular heterogeneity of it is less clear. Here we generated trastuzumab-resistant cells in two HER2-positive breast cancer cell lines, SK-BR-3 and BT-474. Cells at different time points during the resistance induction were examined by exome sequencing to study changes of genomic alterations over time. Single cell-targeted sequencing was also used to identify resistance-associated concurrent mutations. We found a rapid increase of copy number variation (CNV) regions and gradual accumulation of single nucleotide variations (SNVs). On the pathway level, non-synonymous SNVs for SK-BR-3 cells were enriched in the MAPK signaling pathway, while for BT-474 cells they were enriched in mTOR and PI3K-Akt signaling pathways. However, all of the three signaling pathways were in the downstream of the HER2 kinase. Putative trastuzumab-resistance-associated SNVs included AIFM1 P548L and ERBB2 M833R in SK-BR-3 cells, and ADAMTS19 V451L, OR5M9 D230N, COL9A1 R627T, and ITGA7 H911Q in BT-474 cells. Single-cell-targeted sequencing identified several concurrent mutations. By validation, we found that concurrent mutations (AIFM1 P548L and IL1RAPL2 S546C in SK-BR-3 cells, MFSD11 L242I and ANAPC4 E16K in BT-474 cells) led to a decrease of trastuzumab sensitivity. Taken together, our study revealed a common pathway level trastuzumab-resistance mechanism for HER2-positive breast cancer cells. In addition, our identification of concurrent SNVs associated with trastuzumab-resistance may be indicative of potential targets for the treatment of trastuzumab-resistant breast cancer patients.


Introduction
HER2-positive breast cancer is defined as overexpression of the HER2 protein detected by immunohistochemistry or amplification of the ERBB2 gene detected by fluorescence in situ hybridization (FISH), taking account for around 15-20% of breast cancer [1,2]. HER2-targeted therapies have achieved satisfactory effects in HER2-positive breast cancer [3,4]. Trastuzumab, a recombinant humanized monoclonal antibody targeting the extracellular domain of the HER2 tyrosine kinase receptor, is commonly used in the first-line treatment for HER2-positive breast cancer, in combination with chemotherapies, such as paclitaxel and docetaxel, and/or another anti-HER2 drug pertuzumab [5][6][7]. However, 60-80% of the patients with an initial response to HER2-targeted therapy will acquire resistance within one year, leading to a great clinical challenge [8].
Understanding the molecular mechanism of trastuzumabresistance is critical to improve the management of HER2positive breast cancer patients.
Considering HER2 is a transmembrane receptor tyrosine kinase, proposed trastuzumab-resistance mechanisms include intrinsic HER2 alterations, activation of downstream pathways and bypass signaling by other kinases [9,10]. Among these the most commonly reported is activating alterations of the PI3K/AKT/mTOR pathway, which is in the downstream of HER2 signaling and activated by HER2 phosphorylation [11]. However, only a few specific mutated genes leading to trastuzumab-resistance in this pathway have been reported, including PIK3CA and PTEN genes [12][13][14], more potentially targetable mutation sites need to be discovered. What's more, acquired trastuzumab-resistance is a long-term accumulation process [15], which alteration occurs first and how different alterations accumulate are still unknown, requiring a sequential monitoring.
Although several genomic studies focused on molecular mechanism of trastuzumab-resistance have already been reported [16][17][18], cellular heterogeneity on the trastuzumab-resistance is not known. Especially whether two or more mutations can occur concurrently in the same cell or one mutation occurs in part of cells and the other occurs in other cells. Studying at single cell level can solve this problem. In 2012 a single cell whole genome amplification method called multiple annealing and looping-based amplification cycles (MALBAC) was developed [19]. MALBAC uses quasilinear pre-amplification instead of traditional PCR to reduce the accumulation of amplification biases caused by exponential amplification. By using MALBAC we can achieve up to 93% genome coverage and ~ 1% allele dropout rate, making it an ideal method for single nucleotide variation (SNV) detection.
In this study, we intended to first monitor changes of genomic alterations at different time points during the process of trastuzumab-resistance induction in two HER2-positive breast cancer cell lines (SK-BR-3 and BT-474) by bulk exome sequencing. We then chose putative trastuzumabresistance-associated SNVs as targets and used MALBAC to perform single-cell-targeted sequencing in both two cell lines to find concurrent SNVs. Our findings would shed lights on understanding the molecular mechanism of trastuzumab-resistance and guiding treatment for acquired trastuzumab-resistant breast cancer patients.
The trastuzumab-resistance of SK-BR-3 h and BT-474 h cells were confirmed by comparing survival rates of SK-BR-3 h and SK-BR-3 cells or BT-474 h and BT-474 cells under different concentrations of trastuzumab treatment. For SK-BR-3 and SK-BR-3 h cells, trastuzumab was added to the medium to final concentrations of 0, 4, 10, 50, 100, 200 and 300 μg/ml for 72 h. For BT-474 and BT-474 h cells, final concentrations were 0, 25, 50, 75, 100, 125, 175 and 200 μg/ml. The CellTiter 96 AQueous Non-Radioactive Cell Proliferation Assay (Promega, Madison, WI, USA) was used in accordance with manufacturer guidelines. All experiments were repeated three times and data were shown as mean ± s.e.m..

Bulk exome sequencing of genomic DNA from different time points during resistance induction
Cells after one, two and three months of trastuzumab treatment during the generation of trastuzumab-resistant cells and cells without trastuzumab treatment in both SK-BR-3 and BT-474 cell lines were collected for bulk exome sequencing. QIAamp DNA Micro Kit (QIAGEN, Germantown, MD, USA) was used for genomic DNA extraction. Exome libraries of genomic DNA were constructed using Agilent SureSelect Human All Exon V6 kit (Agilent, Santa 1 3 Clara, CA, USA) following manufacturer's recommendations. The quality of exome library was checked by qPCR and fragment analysis using LabChip GX Touch 24 Nucleic Acid Analyzer (PerkinElmer, Waltham, MA, USA). Qualified exome libraries were sequenced by Illumina NovaSeq 2 × 150 bp.

CNV calling based on bulk exome sequencing data
Quality control of raw sequencing data was performed by discarding reads containing adapter contamination, lowquality nucleotides and unrecognizable nucleotides. The clean data were then mapped to the reference human genome (UCSC hg19) by Burrows-Wheeler Aligner (BWA) software [21]. After that SAMtools [22] and Picard were used to sort the files and perform duplicate marking, local realignment and base quality recalibration. We used cells without trastuzumab treatment as control to call differential copy number variations (CNVs) by Control-FREEC [23].

SNV calling and analyses
Samtools mpileup and bcftools were used to do variant calling and the differential SNVs were detected by MuTect [24] using cells without trastuzumab treatment as control. ANNOVAR was used to perform functional annotation [25]. Non-synonymous differential SNVs were selected. Gene ontology (GO) analysis of non-synonymous SNVs was performed by Metascape on KEGG pathway with a P value cutoff of 0.05. We defined non-synonymous SNVs also existing in COSMIC or KEGG pathways in cancer / Erbb signaling pathway / PI3K-Akt signaling pathway as putative trastuzumab-resistance-associated SNVs. Interpretation of these SNVs with protein annotations was done by MutationMapper.

Single cell isolation and whole genome amplification of trastuzumab-resistant cells
SK-BR-3 h and BT-474 h cells were also used for single cell isolation by mouth pipetting under microscope. A total of 96 and 50 single cells in SK-BR-3 and BT-474 cells were isolated. After single cell lysis, whole genome amplification was performed by MALBAC following the published paper [19]. In detail, nine cycles of quasilinear amplification with random primers containing 27-nucleotide sequence and 8 variable nucleotides were first performed. Then 20 cycles of traditional PCR targeting the common sequences were done. The quality of amplified DNA for each single cell was detected by qPCR with eight pairs of primers targeting random copy number conservative regions across the genome. Single cells having no less than six sites with Ct value less than 30 passed the quality control. A total of 94 and 50 single cells in SK-BR-3 and BT-474 cell lines passed the quality control.

Single-cell-targeted sequencing
Targeted mutation sites were selected based on bulk exome sequencing data. Non-synonymous SNVs with mutation frequency ≥ 10% in both SK-BR-3 and BT-474 cell lines were used for functional impact prediction by three methods: Mutation Assessor, SIFT and Polyphen-2. SNVs predicted to be damaging by no less than two methods were chosen for single-cell-targeted sequencing. A total of 21 and 22 mutation sites were selected for SK-BR-3 and BT-474 cell lines, and the detailed mutation information and sequences of primers were shown in Supplementary Table S1 and S2. Genomic sites with mutation frequency ≥ 5% were defined as mutant, genomic sites with mutation frequency < 5% were defined as wild-type, and genomic sites with no reads cover were defined as no coverage. Hierarchical clustering of mutations and cells were performed by using pheatmap R package.

Functional experiments to validate detected concurrent mutations leading to trastuzumab-resistance
We first validated the function of concurrent mutations of AIFM1 P548L and IL1RAPL2 S546C in trastuzumabresistance in SK-BR-3 cells. Primers were designed at specific sites of plasmids pCMV-FLAG-AIFM1 and pCMV-FLAG-IL1RAPL2 to construct mutant plasmids. The primer sequences were shown in Supplementary Table S3. SK-BR-3 cells were transiently transfected with plasmids of pCMV-FLAG-AIFM1-WT, pCMV-FLAG-AIFM1-Mut, pCMV-FLAG-IL1RAPL2-WT, pCMV-FLAG-IL1RAPL2-Mut, both pCMV-FLAG-AIFM1-WT and pCMV-FLAG-IL1RAPL2-WT, and both pCMV-FLAG-AIFM1-Mut and pCMV-FLAG-IL1RAPL2-Mut using Lipofectamine 2000 (Invitrogen, Waltham, MA, USA), respectively. Trastuzumab was added to the medium to final concentrations of 0, 0.25, 0.5, 1 and 2 μg/ml for 72 h. Cell Counting Kit-8 (CCK8) (ZETA, Xi'an, China) was used in accordance with manufacturer guidelines. For AIFM1 and IL1RAPL2 mutations, experiments were repeated three times. And for concurrent mutations of AIFM1 and IL1RAPL2 genes, experiments were repeated twice. Mean values for the three or two experiments were shown.
The function of concurrent mutations of MFSD11 L242I and ANAPC4 E16K in trastuzumab-resistance in BT-474 cells was validated by similar methods. We used the data from SK-BR-3 cells without trastuzumab treatment as control to call the differential CNVs and SNVs of the three time points. In the aspect of CNVs, for the first two months there were only a few differential CNV regions, but at the third month the CNV regions increased rapidly ( Supplementary Fig. S3a). These observations suggested that, in addition to an important role in both carcinogenesis and cancer metastasis [26][27][28], CNVs may also have a crucial effect on drug resistance. In contrast, SNVs increased gradually across the whole process (Supplementary Fig. S3b).

Changes
For exome sequencing data, we could not get CNV information at a high resolution; therefore, we mainly focused on SNV analysis. We first listed all non-synonymous differential SNVs (a total of 105) and exhibited the mutation frequency changes of them across the three time points, as shown in Fig. 1a. The number of non-synonymous SNVs appeared at the third month (74 SNVs) was significantly higher than the first two months (first month: 14 SNVs, second month: 17 SNVs), further supporting that dramatic change of trastuzumab sensitivity may occur from the third month.
To find which signaling pathway these non-synonymous SNVs may affect, we performed GO analysis on KEGG pathways of them. As shown in Fig. 1b, the non-synonymous SNVs were mainly enriched in MAPK signaling pathway. We marked the positions of these mutated genes in MAPK signaling pathway in Fig. 1c.
Except the genes in the enriched pathway, mutations of other genes may also play important roles in trastuzumabresistance. We defined non-synonymous SNVs that also existed in COSMIC or KEGG pathways in cancer / Erbb signaling pathway / PI3K-Akt signaling pathway as putative trastuzumab-resistance-associated SNVs and demonstrated them with protein annotations (Fig. 1d). AIFM1 P548L belonged to COSMIC, with a mutation frequency of 2.47% in breast cancer and 8.49% in HER2-positive breast cancer. This mutation affected the apoptosis-inducing factor mitochondrion associated C-term domain of AIFM1 protein. ERBB2, MTOR and AGT genes existed in KEGG pathways in cancer. ERBB2 M833R affected the protein tyrosine kinase domain of ERBB2 protein. ERBB2 and MTOR genes existed in KEGG Erbb signaling pathway. MTOR and NTF3 genes existed in KEGG PI3K-Akt signaling pathway.

Concurrent mutations associated with trastuzumab-resistance of SK-BR-3
To distinguish if two SNVs occurred concurrently in the same cell or one SNV occurred in part of the cells and the other occurred in the other cells, we took advantage of MALBAC for single-cell-targeted sequencing of SK-BR-3 h cells. By functional impact prediction of 40 non-synonymous SNVs with mutation frequency ≥ 10% (Supplementary Fig. S4a), we selected 21 targeted mutation sites and 94 single cells were targeted sequenced.
The distribution of the 21 SNVs in the 94 single cells are shown in Fig. 2a. Among all of the 1,974 sites, only 28 sites were without coverage, taking account for 1.42%, which confirmed the high genome coverage of MALBAC method. We tried to find two SNVs that more frequently occurred concurrently in the same single cell. The top four pairs are shown in Fig. 2b: AIFM1 P548L and CHD7 G2228W with a concurrent frequency of 29.8%; AIFM1 P548L and UBR5 P2740S with a concurrent frequency of 29.8%; AIFM1 P548L and IL1RAPL2 S546C with a concurrent frequency of 26.6%; and AIFM1 P548L and URI1 P105L with a concurrent frequency of 24.5%. AIFM1 gene encoded protein was essential for nuclear disassembly in apoptotic cells and AIFM1 P548L belonged to COSMIC. CHD7 gene encoded a protein that contained several helicase family domains. UBR5 gene encoded protein belonged to the HECT family, which functioned as E3 ubiquitin-protein ligases, playing a role in regulation of cell proliferation or differentiation. IL1RAPL2 gene encoded protein was a member of the interleukin 1 receptor family, which played important roles in immunity and inflammation. URI1 gene encoded member of molecular chaperones and played a role in multiple malignancies.
Compared with single gene mutation, combined mutations of these genes may have more growth advantages under trastuzumab treatment and contribute more to trastuzumab-resistance. Although there was also possibility for more than two mutation combination associated with trastuzumab-resistance, considering the limited single cell sample size and number of targeted mutation sites, we did not perform this analysis.

Introduction of the concurrent mutations led to trastuzumab-resistance
Above we detected pairs of concurrent SNVs in SK-BR-3 h cells, but whether these detected concurrent mutations lead to trastuzumab-resistance still needed validation. Therefore, we chose one pair of concurrent mutations: AIFM1 P548L and IL1RAPL2 S546C (both of the two SNVs occurred at the first month of trastuzumab addition and the mutation frequencies of them increased over time) for functional experiments.
We transfected SK-BR-3 cells with either wild-type AIFM1 plasmid (AI WT), mutant AIFM1 plasmid (AI Mut), wild-type IL1RAPL2 plasmid (IL WT), mutant IL1RAPL2 plasmid (IL Mut), both wild-type AIFM1 and wild-type IL1RAPL2 plasmids (Double WT) and both mutant AIFM1 and mutant IL1RAPL2 plasmids (Double Mut), respectively. The transfection efficiency was confirmed by western blot in Supplementary Fig. S5a. As shown in Fig. 2c, in AI WT and AI Mut group there were no significant differences in

Changes of genomic alterations during the generation of trastuzumab-resistant BT-474 cell line
BT-474 is also a HER2-positive breast cancer cell line, but its molecular subtype is Luminal B (ER positive, PR positive, HER2 positive). Breast cancers with different molecular subtypes exhibited quite distinct biological and clinical phenotypes [29]. To investigate the similarities Differential CNVs and SNVs of the three time points were called. For CNVs, there were a few differential CNV regions at the first month, and the CNV regions increased rapidly at the second month ( Supplementary Fig. S3c). It indicated that the dramatic change of trastuzumab sensitivity for BT-474 cells may occur earlier than SK-BR-3 cells, from the second month of trastuzumab addition. And for SNVs, they also increased gradually across the whole process in BT-474 cells ( Supplementary Fig. S3d), as in SK-BR-3 cells.
Then we demonstrated the mutation frequency changes of all non-synonymous differential SNVs (a total of 197) across the three time points in BT-474 cell line in Fig. 3a The GO analysis on KEGG pathways showed distinctions from SK-BR-3 cells, the non-synonymous SNVs were mainly enriched in mTOR and PI3K-Akt signaling pathways (Fig. 3b), consistent with previous studies [12]. Positions of these mutated genes in mTOR or PI3K-Akt signaling pathway are marked in Fig. 3c.

Concurrent mutations associated with trastuzumab-resistance of BT-474
By functional impact prediction of 47 non-synonymous SNVs with mutation frequency ≥ 10% in BT-474 cells ( Supplementary Fig. S4b), we selected 22 mutation sites for single -cell-targeted sequencing. Except for one (USP35 E55K, all single cells showed no coverage in this position maybe due to the complex genomic organization making it difficult to be amplified by MALBAC method), the left 21 mutation sites were used and 50 single cells were targeted sequenced.
The distribution of the 21 SNVs in the 50 single cells are shown in Fig. 4a. Among all of the 1,050 sites, only 5 sites were without coverage, taking account for 0.48%. The top four pairs of concurrent SNVs are shown in Fig. 4b: KIAA1107 K771T and MFSD11 L242I with a concurrent frequency of 24%, KIAA1107 K771T and ANAPC4 E16K with a concurrent frequency of 22%; KIAA1107 K771T and CENPJ E318Q with a concurrent frequency of 22%; and MFSD11 L242I and ANAPC4 E16K with a concurrent frequency of 20%. KIAA1107 gene was also known as BTBD8 and the protein encoded by it contained BTB domain, which was an evolutionarily conserved domain found in developmentally regulated transcription factors. MFSD11 gene encoded protein contained major facilitator superfamily domain, this superfamily was a diverse group of secondary transporters. ANAPC4 gene encoded one subunit of the anaphase-promoting complex, which can promote metaphase-anaphase transition by ubiquitinating its specific substrates. CENPJ gene encoded protein belonged to the centromere protein family and played a structural role in the maintenance of centrosome integrity and normal spindle morphology.
To further validate these detected concurrent mutations could lead to trastuzumab-resistance, we introduced concurrent MFSD11 L242I and ANAPC4 E16K (we avoided KIAA1107 gene due to its low expression in adult tissues) in BT-474 cells. The plasmid transfection efficiency was shown in Supplementary Fig. S5b. Similar results as in SK-BR-3 cells were exhibited in Fig. 4c, in AN WT and AN Mut group there were no significant differences in trastuzumab sensitivities. In contrast, in MF WT and MF Mut group and Double WT and Double Mut group trastuzumab sensitivity significantly decreased in cells with mutant plasmid compared with cells with wild-type plasmid, especially for Double WT and Double Mut group (with 3.38-and 4.26-fold of IC 50 in mutant cells in MF and Double groups), confirming the function of the concurrent mutations in resistance in BT-474 cells.

Discussion
In this study, we investigate the molecular mechanism of acquired trastuzumab-resistance and how genomic alterations accumulate over time in two HER2-positive breast cancer cell lines. The non-synonymous differential SNV landscapes for trastuzumab-resistant cells in SK-BR-3 and BT-474 cell lines are quite different, sharing only two genes: ERBB2 and BICD1 (with distinct mutation types: ERBB2 M833R in SK-BR-3, ERBB2 S310F in BT-474 and BICD1 V603F in SK-BR-3, BICD1 E394Q in BT-474). However, when we compare the results on the pathway level, the two cell lines show similarity. Although the non-synonymous SNVs are enriched in different signaling pathways for the two cell lines, all of these signaling pathways are in the downstream of the HER2 kinase. This means that the molecular mechanism of trastuzumab-resistance for both SK-BR-3 and BT-474 cells is alterations of downstream signaling pathways of the HER2 kinase (Fig. 4d). Previous studies also have reported that pathway level alterations rather than mutations in single genes can predict response to HER2targeted therapies [30], implying the importance of pathway level alterations in trastuzumab-resistance.
The process of genomic alteration accumulation for the two cell lines are also distinctive. The differential CNV regions increase rapidly at the third month in SK-BR-3 cells, but earlier (at the second month) in BT-474 cells. We believe that the number of genomic aberrations may  be associated with the change of trastuzumab-resistance. Although we didn't have IC 50 values for each timepoint (due to difficulties for enlarging culture of cells at the first two time points), according to the change of genomic aberrations, we think that the resistance may accumulate gradually, but there is a period of time that the resistance may change more rapidly than any other time to finally get a stable resistant situation. The earlier time of rapid CNV increase suggest that BT-474 acquires trastuzumabresistance earlier than SK-BR-3 cells.
We know that although both of SK-BR-3 and BT-474 cell lines are HER2 positive, their molecular subtypes are different (SK-BR-3 with HER2 subtype, BT-474 with Luminal B subtype). Expression of EGFR in the SK-BR-3 cell line is three times of the BT-474 cell line, and BT-474 is more sensitive to trastuzumab compared with SK-BR-3 [31]. The lower trastuzumab sensitivity of SK-BR-3 may be due to its higher expression of EGFR, a bypass signaling of HER2 as compensation. This is consistent with our results that it is more difficult and may take longer time to gain trastuzumab-resistance for SK-BR-3 cells. To confirm whether the different genomic aberrations are really due to distinct molecular subtypes, we need to repeat the experiments in other cell lines belonging to HER2 or HER2positive luminal subtypes to find if the conclusion still works. What's more, to some extent cell lines can't fully represent behaviors of in vivo tumors, further sequencing experiments on HER2-positive tumor tissues from patients diagnosed with different molecular subtypes are still necessary.
We identify some putative trastuzumab-resistance-associated SNVs, such as AIFM1 P548L and ERBB2 M833R in SK-BR-3 cells, and ADAMTS19 V451L, OR5M9 D230N, COL9A1 R627T, and ITGA7 H911Q in BT-474 cells. What's more, taking advantage of single cell sequencing, we identify concurrent mutations associated with trastuzumab-resistance. Further we validate that concurrent AIFM1 P548L and IL1RAPL2 S546C or MFSD11 L242I and ANAPC4 E16K can indeed lead to trastuzumab-resistance in SK-BR-3 or BT-474 cells, confirming the significance of the detected mutations. These mutation sites have the potential to become drug targets for trastuzumab-resistant breast cancer patients.
In conclusion, we identify that the pathway level mechanism of trastuzumab-resistance is alterations of downstream signaling pathways of the HER2 kinase. We also detect a few putative trastuzumab-resistance mutations, especially concurrent mutations. These mutations are potential biomarkers of trastuzumab-resistance and putative targets for treatment of trastuzumab-resistant breast cancer patients.