Case selection
Frozen tumor tissues and matching healthy skin tissues from surgically resected specimens were retrieved from the archives of the Kanagawa Cancer Center Biospecimen Center. We analyzed five MBCs from four patients. The five MBCs comprised one primary LGASC (sample ID: LGASC), one high-grade MBC with a predominant metaplastic squamous cell carcinoma pattern (MSC) that progressed from the LGASC and metastasized to the axillary lymph node (sample ID: LNMSC), and three de novo MSCs (sample IDs: M2T, M3T, and M7T). The de novo MSCs were selected from frozen specimens for genetic analysis on the basis of sufficient tumor cellularity. The remaining surgically resected tissues were fixed in 10% buffered formalin, embedded in paraffin, and stained with hematoxylin and eosin. The histology and histological grade of these tumors were evaluated according to the fifth edition of the WHO Classification of Tumours1. Clinical data, including germline mutations of the breast cancer genes BRCA1 and BRCA2, stage, and outcomes, were retrieved from the medical records. The study was conducted according to the Declaration of Helsinki and was approved by the Ethics Committee of the Kanagawa Cancer Center (approval no. H28-240). Written informed consent for retrospective studies, including somatic and germline genetic analyses, was obtained from the patients involved in the form of broad comprehensive consent.
Immunohistochemistry
Immunohistochemical staining was performed on a section from the representative paraffin block of each tumor, with heat-induced antigen retrieval. The following primary antibodies were used: estrogen receptor (ER) (SP1, Roche, Basel, Switzerland, pre-diluted), progesterone receptor (PgR) (1E2, Roche, pre-diluted), human epidermal growth factor receptor 2 (HER2) (4B5, Roche, pre-diluted), p63 (4A4, Nichirei, Tokyo, Japan, pre-diluted), p40 (BC20, Nichirei, pre-diluted), SMA (1A4, Roche, pre-diluted), Ki-67 (MIB-1, Agilent, Santa Clara, CA, USA, pre-diluted), and SMAD4 (EP618Y, Abcam, Cambridge, UK, 1:100). All immunohistochemical procedures, except for p40 and SMAD4, were performed using the automatic staining machine VENTANA BenchMark ULTRA IHC/ISH system (Roche). ER, PgR, and HER2 were detected using the iVIEW DAB Detection Kit (Roche), and p63, SMA, and Ki-67 were detected using the ultraView Universal DAB Detection Kit (Roche). For p40, immunostaining was performed using HISTOSTAINER 48A (Nichirei), and its expression was detected using the Histofine Simple Stain MAX-PO kit (Nichirei). For SMAD4, immunostaining was performed using BOND-III (Leica Biosystems, Nussloch, Germany). Positive and negative controls were used. ER, PgR, and HER2 were evaluated according to the current American Society of Clinical Oncology/College of American Pathologists guidelines16,17.
Whole-genome sequencing
Genomic DNA was extracted from frozen tissues of each of the five MBCs and four matched healthy skin tissues using a standard method with proteinase K (PK) digestion. Briefly, the frozen tissues were minced and incubated at 65°C overnight in 1-mg/mL PK solution containing 10-mM Tris-HCl buffer (pH 7.8–8.0) (Thermo-Fisher Scientific, Inc., Waltham, MA), 0.4% (w/v) sodium dodecyl sulfate (NIPPON GENE CO., LTD, Tokyo, Japan), 150-mM NaCl, and 10-mM ethylenediaminetetraacetic acid (Thermo-Fisher Scientific, Inc.). The solution was extracted several times with 25:24:1 phenol/chloroform/isoamyl alcohol (phenol-CIAA) solution (FUJIFILM Wako Chemicals), and DNA was then precipitated from the aqueous phase with ethanol and collected by centrifugation. The DNA was dissolved in autoclaved distilled water (dH2O) and treated with 10 µg/mL of a DNase-free RNase (Pure Link RNase A, Thermo-Fisher Scientific) at 37°C for 30 min. The reaction mixture was extracted once with phenol-CIAA, and DNA was recovered as described above. The double-stranded DNA was quantified with Qubit 2 (Thermo-Fisher Scientific), and DNA purity was evaluated using NanoPhotometer (Implen, Munich, Germany) as the A260/A280 and A260/A230 optical density ratios. Whole-genome sequencing (WGS) and primary data analysis were outsourced to Genewiz (Shinagawa-ku, Tokyo, Japan). Briefly, sequencing libraries were constructed from 100 ng of each extracted DNA sample using the TruSeq Nano DNA Library Prep kit (Illumina, Hayward, CA) and sequenced using a standard 150-bp paired-end read protocol. The obtained sequence data were analyzed using the Genomon 2 DNA analysis pipeline (https://github.com/Genomon-Project) at the Human Genome Center, the Institute of Medical Science, the University of Tokyo (Tokyo, Japan).
Analysis of single-nucleotide variants and short insertions and deletions (indel)
Single-nucleotide variants (SNVs) with minimum depth ≥8, base quality ≥15, variant reads ≥4, P <0.01 (Fisher’s exact test), and variant allele frequency ≥0.02 in the tumor were first selected for analysis. SNVs in protein-coding regions that changed the amino acid sequence were further evaluated to identify pathogenic or likely pathogenic variants by querying the Catalogue of Somatic Mutations in Cancer (COSMIC) (https://cancer.sanger.ac.uk/cosmic), the 1000 Genomes Project (https://internationalgenome.org), dbSNP (NCBI, NIH; https://www.ncbi.nlm.nih.gov/snp/), Functional Analysis through Hidden Markov Models (v2.3) (http://fathmm.biocompute.org.uk), ClinVar (NCBI, NIH; https://www.ncbi.nlm.nih.gov/clinvar/), and OncoKB (Memorial Sloan Kettering Cancer Center; https://www.oncokb.org/) databases. Literature searches were also conducted to identify pathogenic or likely pathogenic SNVs. Breast cancer driver genes were determined according to the list reported by Michailidou et al.18.
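The read-level selection criteria above can be sketched as a simple filter. This is only an illustrative Python sketch, not the Genomon 2 implementation: the `SnvCall` fields are hypothetical names, base quality is summarized here as a single minimum value per call, and the Fisher’s exact test is assumed to compare variant and non-variant read counts between the tumor and the matched healthy tissue.

```python
from dataclasses import dataclass
from math import comb


@dataclass
class SnvCall:
    """Read-level summary of one candidate SNV (hypothetical field names)."""
    tumor_depth: int
    tumor_variant_reads: int
    normal_depth: int
    normal_variant_reads: int
    min_base_quality: int


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    total = sum(p for p in map(prob, range(lo, hi + 1)) if p <= observed * (1 + 1e-9))
    return min(1.0, total)


def passes_snv_filter(call: SnvCall) -> bool:
    """Apply the thresholds stated in the text to one candidate SNV."""
    vaf = call.tumor_variant_reads / call.tumor_depth if call.tumor_depth else 0.0
    # Assumption: the test contrasts tumor reads with matched-normal reads.
    p_value = fisher_exact_two_sided(
        call.tumor_variant_reads,
        call.tumor_depth - call.tumor_variant_reads,
        call.normal_variant_reads,
        call.normal_depth - call.normal_variant_reads,
    )
    return (
        call.tumor_depth >= 8
        and call.min_base_quality >= 15
        and call.tumor_variant_reads >= 4
        and p_value < 0.01
        and vaf >= 0.02
    )
```

For example, a call with 12 variant reads among 40 tumor reads and none among 35 matched-normal reads passes, whereas the same call with only 3 variant reads is rejected by the variant-read threshold.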
Estimation of structural variations and DNA copy number variants
Large insertions, deletions, tandem duplications, and inversions with tumor allele frequency ≥0.07, depth threshold ≥10, control depth ≥10, and inversion size threshold >1000 in the tumor were considered structural variation (SV) candidates. Cellularity was estimated using the R package Sequenza (https://sequenzatools.bitbucket.io). To examine the functional bias of the genes in which the SV breakpoints are located, a clustering analysis of the genes at the SV breakpoints was performed using DAVID 6.8 (https://david.ncifcrf.gov/home.jsp).
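As a rough illustration of how the SV candidate thresholds combine, the sketch below applies them to one record. The `SvCandidate` fields are hypothetical, and it is an assumption of this sketch that the size threshold applies only to inversions and that the two depth thresholds refer to the tumor and the matched control, respectively.

```python
from dataclasses import dataclass


@dataclass
class SvCandidate:
    """One structural variation call (hypothetical field names)."""
    sv_type: str  # "insertion", "deletion", "tandem_duplication", or "inversion"
    size: int  # distance between the two breakpoints
    tumor_allele_freq: float
    tumor_depth: int
    control_depth: int


def passes_sv_filter(sv: SvCandidate) -> bool:
    """Return True if the call meets the thresholds stated in the text."""
    if sv.tumor_allele_freq < 0.07:
        return False
    if sv.tumor_depth < 10 or sv.control_depth < 10:
        return False
    # Assumption: the size threshold is specific to inversions.
    if sv.sv_type == "inversion" and sv.size <= 1000:
        return False
    return True
```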
Copy number variant (CNV) profiling was performed using DNAcopy version 1.56.0 (https://bioconductor.org/packages/release/bioc/html/DNAcopy.html), an R/Bioconductor package. We identified significantly amplified or deleted regions of the genome using GISTIC 2.0 (ref. 19).
To detect chromothripsis from WGS data, we used ShatterSeek v1.1 (https://github.com/parklab/ShatterSeek), an R package, and applied previously established criteria20.
Evaluation of mutational patterns and signatures
MutationalPatterns version 2.0.0 (https://bioconductor.org/packages/release/bioc/html/MutationalPatterns.html), an R/Bioconductor-based package developed by Blokzijl et al.21, was used to characterize the patterns of base substitutions and to identify mutational signatures from the WGS data.
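Base substitution patterns are conventionally reported relative to the pyrimidine of the mutated base pair, giving six substitution types (C>A, C>G, C>T, T>A, T>C, and T>G), each of which can be further split by its flanking bases. The following Python sketch illustrates that normalization only; it is not the MutationalPatterns code, and the function names are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def substitution_class(context: str, alt: str) -> str:
    """Label an SNV by its pyrimidine-centered trinucleotide context.

    `context` is the reference trinucleotide with the mutated base in the
    middle; `alt` is the variant base on the same strand.
    """
    context, alt = context.upper(), alt.upper()
    if context[1] in "AG":
        # Purine reference base: describe the change on the opposite strand.
        context = reverse_complement(context)
        alt = COMPLEMENT[alt]
    return f"{context[0]}[{context[1]}>{alt}]{context[2]}"


def six_type_spectrum(snvs: list[tuple[str, str]]) -> Counter:
    """Count the six substitution types for (context, alt) pairs."""
    return Counter(substitution_class(ctx, alt)[2:5] for ctx, alt in snvs)
```

For instance, a G>A change in the context TGA is reported as T[C>T]A, the same class as a C>T change in the context TCA.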
Reverse transcription-polymerase chain reaction
The presence of the SMAD4–DCC fusion transcript was evaluated in LGASC and LNMSC using reverse transcription-polymerase chain reaction (RT-PCR). Briefly, total RNA was extracted from frozen tissues of LGASC and LNMSC using TRIzol Reagent (Invitrogen, Waltham, MA) according to the manufacturer’s instructions. Total RNA was then reverse transcribed into cDNA using SuperScript IV VILO Master Mix (Invitrogen) and subjected to PCR amplification using PrimeSTAR HS DNA polymerase (Takara, Kyoto, Japan). The forward PCR primer was set in exon 3 of SMAD4 with the nucleotide sequence 5'-CTACGAACGAGTTGTATCACC-3' (SMAD4-forward), and the reverse primer was set in exon 3 of DCC with the nucleotide sequence 5'-ACTTGAGTAGCACTGTGTCTC-3' (DCC-reverse). The amplified fragments that corresponded to the predicted molecular size were subcloned into the pCR®-TOPO® plasmid vector, and the nucleotide sequence was determined by Sanger sequencing using the ABI PRISM 3130xl Genetic Analyzer (Thermo-Fisher Scientific, Inc.).
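As a simple sanity check on the primer pair, the sketch below computes the length and GC fraction of each primer and the reverse complement of the DCC-reverse primer, which is the sequence expected on the sense strand of the fusion transcript. It is an illustrative Python sketch with hypothetical helper names and was not part of the study workflow.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Primer sequences as given in the text (5' to 3').
SMAD4_FORWARD = "CTACGAACGAGTTGTATCACC"
DCC_REVERSE = "ACTTGAGTAGCACTGTGTCTC"


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


if __name__ == "__main__":
    for name, primer in (("SMAD4-forward", SMAD4_FORWARD), ("DCC-reverse", DCC_REVERSE)):
        print(name, len(primer), "nt", f"GC fraction {gc_fraction(primer):.2f}")
    print("DCC-reverse site on the sense strand:", reverse_complement(DCC_REVERSE))
```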
Western blotting
To confirm the presence of the SMAD4–DCC chimeric protein, Western blotting was performed using the NuPAGE 4%–12% gradient Bis-Tris Protein Gel system (Thermo-Fisher Scientific, Inc.) with MOPS running buffer (Thermo-Fisher Scientific, Inc.). Proteins were extracted from frozen tissues of LGASC and LNMSC using Cell Lysis Buffer (Cell Signaling Technology, Inc., Danvers, MA). To detect the SMAD4–DCC fusion protein, we used the following antibodies: anti-SMAD4 rabbit monoclonal antibody (BLR133J, Bethyl Laboratories, Montgomery, TX, USA, 1:1,000) and anti-DCC mouse monoclonal antibody (A-1, Santa Cruz Biotechnology, Dallas, TX, USA, 1:1,000). Anti-vinculin (V9131, Sigma-Aldrich, Merck KGaA, Darmstadt, Germany, 1:10,000) was used as a protein loading control. The secondary antibody reaction was performed using peroxidase-conjugated anti-mouse IgG (NA931, Cytiva, Tokyo, Japan, 1:100,000) or anti-rabbit IgG (NA934, Cytiva, 1:100,000). Detection was performed using the ImmunoStar LD-enhanced chemiluminescence detection reagent (FUJIFILM Wako Chemicals, Osaka, Japan).
Statistical analysis
Genetic data were compared using the Mann–Whitney U-test. Cosine similarity was used to determine the relative contribution of the 30 COSMIC mutational signatures (Mutational Signatures, version 2, https://cancer.sanger.ac.uk/signatures/signatures_v2/) in each sample. P values <0.05 were considered statistically significant. All statistical analyses were performed using R, version 4.0.2.
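Cosine similarity between a sample's mutational profile and a reference signature is the dot product of the two vectors divided by the product of their Euclidean norms; for non-negative profiles it ranges from 0 (no shared pattern) to 1 (identical shape). The Python sketch below shows the calculation on toy vectors; the study itself used R, and the numbers here are made up for illustration rather than taken from the data.

```python
from math import sqrt
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # Convention chosen here for an all-zero profile.
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Toy six-channel example (real profiles have 96 trinucleotide channels).
    sample_profile = [12, 3, 40, 5, 8, 2]
    reference_signature = [0.15, 0.05, 0.55, 0.08, 0.12, 0.05]
    print(f"{cosine_similarity(sample_profile, reference_signature):.3f}")
```

Because the measure depends only on the shape of the profile, scaling a sample's mutation counts by a constant leaves its similarity to each signature unchanged.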