Ethics
All procedures involving human participants were performed in accordance with the ethical standards of the Peter MacCallum Cancer Centre Human Research and Ethics Committee (project number 10_83) and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.
Whole exome sequencing
One µg of germline DNA was obtained from peripheral leucocytes and fragmented using the Covaris S2 System (Covaris, Woburn, MA, USA). The SureSelect Human All Exon v1 kit (Agilent, Santa Clara, CA, USA) was used for exome enrichment according to the manufacturer’s protocol. Paired-end 100 base pair reads were sequenced on a HiSeq2000 instrument (Illumina Inc., San Diego, CA, USA). Both exomes passed sequencing quality control, with mean target base coverages of 129x and 121x for P1 and P2, respectively, and >95% of targeted bases covered at more than 10x.
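The two coverage metrics reported above can be illustrated with a short calculation. This is a minimal sketch, assuming per-base read depths over the targeted bases are already available as a list of integers. The function name and the toy depths are hypothetical and are not part of the study's pipeline.

```python
from statistics import mean

def coverage_summary(depths, min_depth=10):
    """Return (mean depth, fraction of bases covered at more than min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    # "covered at more than 10x" is read here as depth strictly greater than 10
    covered = sum(1 for d in depths if d > min_depth)
    return mean(depths), covered / len(depths)

# Toy per-base depths (hypothetical values, for illustration only)
toy_depths = [120, 135, 8, 150, 97, 110, 12, 140, 125, 130]
mean_depth, fraction_covered = coverage_summary(toy_depths)
print(f"mean depth: {mean_depth:.1f}x, covered >10x: {fraction_covered:.0%}")
```

In practice the depths would come from a per-base depth report restricted to the capture targets rather than from a hand-written list.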
Sequence alignment and variant calling
Raw sequence reads were quality checked with FastQC(13) and, where necessary, trimmed of low-quality bases and adaptor sequences using Cutadapt(14). Reads were aligned to the human genome (GRCh37 assembly) using BWA-MEM(15). Duplicate reads were marked using Picard(16), and the BAM files for both individuals were then merged. Local realignment around indels was performed on the merged BAM files using the Genome Analysis Toolkit (GATK) software v3.1(17), followed by base quality score recalibration, also using GATK. Single nucleotide variants (SNVs) and indels were identified using the GATK HaplotypeCaller and annotated with information from Ensembl release 73 using Ensembl's Perl API and Variant Effect Predictor (18, 19). Each variant was annotated with its frequency in the 1000 Genomes Project(20), the National Heart, Lung and Blood Institute (NHLBI) Grand Opportunity (GO) Exome Sequencing Project(21) and an in-house exome dataset of 147 familial breast cancer cases(22). The likely pathogenic consequence of each variant was predicted using PolyPhen(23), SIFT(24), and the Combined Annotation Dependent Depletion (CADD) scaled score(25).
Exome data analysis
For genes with multiple transcripts, transcripts were prioritised by (1) most to least deleterious predicted impact of the variant on protein function (Supplementary Data 1), and (2) RefSeq transcript. The highest-ranking transcript was taken forward for further analysis. Loss of function (LoF) variants (truncating frameshift, nonsense, essential splice site) and missense variants that met the following criteria were considered for further analysis: (1) Phred-scaled variant quality score >30, and (2) variant allele frequency between 0.15 and 0.8. To identify novel variants shared between P1 and P2, variants were excluded if they were present in any of the control cohorts: the 1000 Genomes Project, the NHLBI GO Exome Sequencing Project or an in-house cohort of 147 Australian familial breast cancer exomes. All LoF variants, and missense variants with a CADD scaled score ≥10, were manually checked in the Integrative Genomics Viewer (IGV)(26, 27).
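The filtering criteria in this section can be summarised as a small set of predicates. This is a minimal sketch, not the pipeline used in the study: the `Variant` record, its field names and the consequence labels are hypothetical, and the allele-frequency bounds are assumed to be inclusive, which the text does not specify.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical consequence labels standing in for the LoF classes named in the text
LOF_CONSEQUENCES = {"frameshift", "nonsense", "essential_splice_site"}

@dataclass
class Variant:
    consequence: str              # e.g. "missense", "nonsense"
    qual: float                   # Phred-scaled variant quality score
    vaf: float                    # variant allele frequency in the sample
    cadd_scaled: Optional[float]  # CADD scaled score, if available
    in_controls: bool             # seen in any of the three control cohorts

def passes_quality(v: Variant) -> bool:
    # Quality >30 and allele frequency between 0.15 and 0.8 (bounds assumed inclusive)
    return v.qual > 30 and 0.15 <= v.vaf <= 0.8

def is_novel_candidate(v: Variant) -> bool:
    is_lof_or_missense = v.consequence in LOF_CONSEQUENCES or v.consequence == "missense"
    return is_lof_or_missense and passes_quality(v) and not v.in_controls

def needs_igv_review(v: Variant) -> bool:
    # All LoF candidates, plus missense candidates with CADD scaled score >= 10
    if not is_novel_candidate(v):
        return False
    if v.consequence in LOF_CONSEQUENCES:
        return True
    return v.cadd_scaled is not None and v.cadd_scaled >= 10

# Toy variants (hypothetical values, for illustration only)
nonsense_ok = Variant("nonsense", 250.0, 0.48, 35.0, False)
missense_low_cadd = Variant("missense", 180.0, 0.52, 4.2, False)
missense_low_qual = Variant("missense", 25.0, 0.50, 22.0, False)
frameshift_in_controls = Variant("frameshift", 300.0, 0.45, None, True)
```

Keeping the quality, novelty and review rules as separate functions mirrors the order in which the criteria are described, so each threshold can be adjusted independently.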
Variants shared between P1 and P2 that were confirmed by Sanger sequencing were checked for population frequency in the Genome Aggregation Database (gnomAD) v2 dataset(28), which comprises exome and genome data from 125,748 and 15,708 unrelated individuals, respectively, to confirm that these variants were rare or novel. Variants with a frequency greater than 1 × 10⁻⁴ in the gnomAD dataset were considered too common to account for the development of AMTs and were excluded.
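To give a sense of scale for the frequency cut-off, the threshold can be converted into an approximate allele count. This is a minimal sketch, assuming an autosomal site at which every gnomAD v2 individual was genotyped (real sites usually have fewer called alleles); the function and constant names are hypothetical.

```python
import math

GNOMAD_EXOMES = 125_748   # individuals with exome data (from the text)
GNOMAD_GENOMES = 15_708   # individuals with genome data (from the text)
MAX_FREQUENCY = 1e-4      # variants above this frequency were excluded

def is_rare_or_novel(allele_frequency, max_frequency=MAX_FREQUENCY):
    """True if the variant is absent from gnomAD (None) or at or below the cut-off."""
    if allele_frequency is None:
        return True
    return allele_frequency <= max_frequency

# Two alleles per individual at an autosomal site (assumes complete genotyping)
total_alleles = 2 * (GNOMAD_EXOMES + GNOMAD_GENOMES)
max_allele_count = math.floor(total_alleles * MAX_FREQUENCY)
print(total_alleles, max_allele_count)
```

Under these assumptions the cut-off corresponds to a variant being observed on at most 28 of 282,912 alleles.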
Whole genome amplification and Sanger sequencing
Candidate variants were confirmed by Sanger sequencing using whole-genome amplified DNA from P2. Whole-genome amplification of genomic DNA was performed using the REPLI-g Midi Kit (Qiagen, Redwood City, CA, USA). PCR primers were designed using the Primer3 program v0.4.0(29, 30) and are listed in Supplementary Data 2. DNA fragments were amplified using HotStarTaq DNA Polymerase (Qiagen, Redwood City, CA, USA), purified using the ExoSAP-IT PCR Purification Kit (USB Corporation, Cleveland, OH, USA), and sequenced using the BigDye Terminator v3.1 kit (Applied Biosystems, Foster City, CA, USA). Sanger sequencing was performed on an ABI3130 Sequencer (Applied Biosystems), and the results were visualised in Geneious 5.6.2 software (BioMatters Ltd, Auckland, New Zealand).
Tumour micro-dissection and analysis
Both tumours were reviewed by a clinical pathologist with expertise in this area. Consecutive 10 μm sections were cut from the formalin-fixed paraffin-embedded PMP specimens with the highest tumour content, and stained with haematoxylin and eosin. Tumour cells were micro-dissected manually using a 23 gauge needle, and somatic DNA was extracted using the DNeasy Blood and Tissue Kit (Qiagen, Redwood City, CA, USA). Somatic copy number in the tumours was assayed using the OncoScan Molecular Inversion Probe assay (Affymetrix, Santa Clara, CA, USA) on 50-75 ng of somatic DNA, and the data were analysed using Nexus Copy Number™ software (Biodiscovery, Inc., El Segundo, CA, USA). No matched control copy number data were available for P2. The OncoScan assay comprises >220,000 single nucleotide polymorphisms and provides a copy number resolution of approximately 50-100 kb.
To assess whether a variant showed somatic loss of heterozygosity (LOH), Sanger sequencing was performed using unamplified tumour DNA extracted from the AMT (primers listed in Supplementary Data 3).
Sanger sequencing of candidate genes in an AMT/PMP validation cohort
Germline DNA from individuals with AMTs or PMP was obtained from the Victorian Cancer Biobank (VCB), the Australian Ovarian Cancer Study (AOCS) and Southampton, UK(31). Clinical details were extracted from de-identified histopathology reports. Histopathology reports for all PMP samples in the validation cohort were examined to ensure that they were not metastases of known ovarian origin.
Sanger sequencing of all exons of REEP5 was performed using germline DNA from individuals from the PMP validation cohort, using the same methods as described earlier. PCR primers for each exon were designed to include 40 base pairs flanking the intron-exon boundary of each exon (Supplementary Data 4).