Patients and Study Design
RUBY was a single-arm, open-label, multicenter, phase II study with a Simon's two-stage design (NCT02505048). All patients provided written informed consent. Eligible patients were women aged 18 years or older with progressive HER2-negative breast cancer previously treated with at least one line of chemotherapy in the metastatic setting. Patients had a genomic LOH-high score obtained from an available genome-wide human SNP array or a somatic BRCA1/2 (sBRCA1/2) mutation, without a known deleterious germline BRCA1/2 (gBRCA1/2) mutation. Patients harboring sBRCA1/2 mutations could have come from genomics-driven trials such as SAFIR02-Breast (NCT02299999), MOSCATO36 or PERMED (NCT02342158). Other inclusion criteria comprised measurable disease (according to Response Evaluation Criteria in Solid Tumors version 1.1 [RECIST v1.1]), an ECOG performance status of 0 or 1, and a 21-day washout period from the last chemotherapy or targeted therapy with resolution of all toxicities to grade ≤1. Main exclusion criteria included a known deleterious gBRCA1/2 mutation, contraindication to rucaparib treatment, previous treatment with a PARP inhibitor, less than 14 days since radiotherapy, spinal cord compression and/or symptomatic or progressive brain metastases, impaired intestinal absorption, severe or uncontrolled systemic disease, history of myelodysplastic syndrome, and impaired hematopoietic or organ function. The national ethics committee approved the study.
Procedure and assessments
Patients were treated with 600 mg oral rucaparib twice a day until disease progression, unacceptable toxicity, or initiation of another antineoplastic treatment. Toxicity management and dose reduction followed the recommendations of the summary of product characteristics and local standard practice. Clinical and laboratory examinations were performed every 4 weeks after treatment initiation. Safety was assessed and graded according to the National Cancer Institute Common Terminology Criteria for Adverse Events version 4.03 (NCI-CTCAE v4.03) every 4 weeks from treatment initiation until the end of treatment. Assessment of response to treatment for therapeutic decisions was based on investigator-reported measurements of target and non-target lesions and was done according to RECIST v1.1, with computed tomography (CT) scans or magnetic resonance imaging repeated every 8 weeks. For patients enrolled during the first stage of the study, and for patients enrolled in the second stage who responded to treatment, a central review was set up to confirm investigator-reported measurements.
Genomic LOH assessment
The RUBY trial was tightly connected to the SAFIR trials. At RUBY initiation, all living patients without a germline BRCA1/2 mutation who had a CytoScan HD or OncoScan CNV profile generated from a metastatic tumor sample in the SAFIR02-Breast (NCT02299999) or SAFIRTOR (NCT02444390) studies were screened for HRD. The SAFIR02-Breast/SAFIRTOR patient informed consent forms covered this screening phase. A local pathologist assessed biopsies from metastatic lesions to retain samples with more than 30% of cancer cells. DNA was extracted from 6 tissue sections (6 µm thick) using the Nucleospin® 8 Tissue kit (Macherey-Nagel, GmbH & Co. KG, Germany), and a 7th tissue section was stained with hematoxylin-eosin. The AllPrep DNA/RNA Mini kit (Qiagen, Hilden, Germany) was used to isolate DNA from frozen core biopsies, according to the manufacturer's protocol. The Qubit 2.0 Fluorometer (Quant-iT™ dsDNA BR Assay Kit; Thermo Fisher Scientific) was used for DNA quantification according to the manufacturer's instructions. When more than 400 ng of extracted DNA was obtained, 300 to 500 ng of DNA was used for DNA microarrays. If the DNA extraction procedure led to less than 400 ng of DNA, 10 to <300 ng of DNA was used for DNA microarray analysis. DNA microarray analyses were performed using Affymetrix SNP6.0 technology (Thermo Fisher Scientific): the OncoScan® FFPE Assay Kit (designed for degraded DNA) was used for FFPE tissue samples and the CytoScan™ HD Array Kit was used for fresh-frozen tissues. SNP probes were used in both the OncoScan and the CytoScan arrays to provide DNA copy number variations.
Before being sent to Clovis Oncology, the Affymetrix raw files (.cel) from SAFIR02/SAFIRTOR underwent a second pseudonymization procedure, in which the SAFIR02/SAFIRTOR patient ID code was replaced with a new patient ID. A hexadecimal editor (Frhed v1.6.0, http://frhed.sourceforge.net) was used to search for and replace the patient ID code. To exclude any corruption of the Affymetrix® file structure after this procedure, each recoded file was tested to validate its integrity.
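The recoding step can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which relied on the Frhed hexadecimal editor; it only shows the same idea of an in-place byte replacement followed by basic integrity checks. The function name, the ASCII encoding of the ID inside the file, and the requirement that both IDs have the same length (so that byte offsets and file size are unchanged) are assumptions made for illustration.

```python
def recode_patient_id(data: bytes, old_id: str, new_id: str) -> bytes:
    """Replace every occurrence of old_id with new_id in a binary buffer.

    Hypothetical helper: assumes the ID is stored as ASCII text and that
    both IDs have the same length, so no byte offset in the file moves.
    """
    old, new = old_id.encode("ascii"), new_id.encode("ascii")
    if len(old) != len(new):
        raise ValueError("old and new IDs must have the same length")
    if old not in data:
        raise ValueError("old ID not found in the buffer")

    recoded = data.replace(old, new)

    # Basic integrity checks: file size unchanged, old ID no longer present.
    if len(recoded) != len(data) or old in recoded:
        raise ValueError("recoding altered the buffer unexpectedly")
    return recoded
```

In practice the buffer would be read from and written back to the .cel file in binary mode, and the recoded file would additionally be opened with the array analysis software to confirm that it is still readable.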
Finally, the pseudonymized array data files were sent to Clovis Oncology for HRD assessment. Percentage genome-wide LOH was calculated using the method previously described25. For each sample from the TCGA, SAFIR and SAFIRTOR studies, LOH regions were inferred across the 22 autosomal chromosomes using the computed minor allele frequencies of the SNPs interrogated by the Affymetrix assays. LOH inference was based on Biodiscovery's implementation of the published ASCAT (allele-specific copy number analysis of tumors) methodology37,38. LOH regions spanning ≥90% of a whole chromosome or chromosome arm were excluded from the calculation because these LOH events are likely due to non-HRD mechanisms39.
Hence, for each tumor, the percentage of the genome with LOH was computed as 100 times the total length of non-excluded LOH regions divided by the total length of the interrogable genome, that is, the genome with SNP coverage minus the excluded LOH regions.
In equation form:
% genome with LOH = 100* ∑ (lengths of non-excluded LOH regions) / (total length of genome with SNP coverage − ∑ (lengths of excluded LOH regions))
We prespecified a cutoff of 18% or more to define a “LOH-high” genomic score for breast carcinoma. This cutoff was estimated to capture the top 25% of LOH scores based on a previous analysis of The Cancer Genome Atlas (TCGA)2, SAFIR01 and SAFIR02 microarray datasets (n=675). The results shown here are in part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.
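As an illustration of the calculation and cutoff described above, the following Python sketch computes the percentage of the genome with LOH for one tumor. It is a simplified sketch, not the Clovis Oncology implementation: the `LohRegion` type and function names are hypothetical, and each region is assumed to arrive already paired with the length of the chromosome or chromosome arm it is compared against.

```python
from typing import Iterable, NamedTuple


class LohRegion(NamedTuple):
    length: int       # length of the LOH region (bp)
    unit_length: int  # length of the chromosome or arm it is compared to (bp)


def percent_genome_loh(regions: Iterable[LohRegion],
                       covered_genome_length: int,
                       exclusion_fraction: float = 0.90) -> float:
    """100 * retained LOH length / (SNP-covered genome - excluded LOH length)."""
    retained = 0
    excluded = 0
    for region in regions:
        # Regions spanning >= 90% of a chromosome or arm are excluded.
        if region.length >= exclusion_fraction * region.unit_length:
            excluded += region.length
        else:
            retained += region.length
    return 100.0 * retained / (covered_genome_length - excluded)


def is_loh_high(percent_loh: float, cutoff: float = 18.0) -> bool:
    """Apply the prespecified cutoff (18% or more)."""
    return percent_loh >= cutoff
```

With toy lengths, two retained regions of 50 and 130 units and one excluded region of 95 units over a covered genome of 1095 units give 100 × 180 / 1000 = 18%, which is classified as LOH-high.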
Prognostic value of genomic LOH score
The proportion of high LOH scores was compared between early breast cancer and metastatic breast cancer (mBC) using the TCGA dataset (n = 1624) and data from the SAFIR02 trial (n = 620). Genomic LOH status was compared between TCGA and the SAFIR02 trial using the chi-square or Fisher's exact test. The prognostic value of high genomic LOH was assessed only in patients from the SAFIR02-Breast trial for whom outcome data were available. OS was defined in SAFIR02-Breast as the time from inclusion to death from any cause, and was estimated using the Kaplan-Meier method with 95% CI. Patients alive at the time of analysis were censored at their last follow-up date. Univariable and multivariable analyses were performed using the log-rank test and the Cox proportional hazards model, respectively. Factors significant in the univariable analysis (i.e., p ≤ 0.05) were included in the multivariable analysis.
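For readers unfamiliar with the Kaplan-Meier method, the estimator with right-censoring can be sketched in a few lines of Python. This is a didactic sketch and not the statistical software used for the analysis; the function name and input layout are assumptions, and confidence intervals are omitted.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up duration for each patient
    events : 1 if the patient died at that time, 0 if censored
             (alive at last follow-up)
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after t.
        at_risk -= leaving
    return curve
```

Patients censored at a given time are counted as at risk at that time, which is the usual convention.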
BRCA1 and BRCA2 deficiency prediction with HRDetect
As an exploratory analysis, whole-genome sequencing was performed in a subpopulation of RUBY patients, as previously described11,40.
Data pre-processing. Whole-genome sequence reads were mapped to the human genome build hg19 using the Burrows-Wheeler Aligner MEM algorithm (BWA v0.7.17)41 with the following parameter: -A 2. For compatibility with the Genome Analysis Toolkit (GATK)42, the read group ID was attached to every read in the resulting alignment file (BAM file) with the -R parameter, and shorter split hits were marked as secondary with -M. GATK FixMateInformation was run on a name-sorted BAM file to check mate-pair information and fix it if needed. Duplicate reads were tagged with GATK MarkDuplicates (GATK v4.1.4.0) on a position-sorted BAM file. Base quality score recalibration (BQSR) was performed with GATK BaseRecalibrator followed by GATK ApplyBQSR. In the first pass of BQSR, the --known-sites parameter was used to specify three databases of known polymorphic sites: 1000G_phase1.indels.hg19.sites.vcf, Mills_and_1000G_gold_standard.indels.hg19.sites.vcf (both downloaded from the GATK Resource Bundle) and dbSNP14440. The mean sequencing depth was assessed with deepTools plotCoverage (v3.3.1)43 on 50 million randomly chosen bases for each sample, with the following parameters: --numberOfSamples 50000000 --ignoreDuplicates.
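The order of the pre-processing steps above can be summarized by the following Python sketch, which only assembles the command lines as argument lists and does not run them. The flags named in the text are reproduced as given; all file names, the read group fields, the choice of input BAM for each step, and the function name are hypothetical, and the sorting steps between tools are not shown.

```python
def preprocessing_commands(sample, ref, fq1, fq2, known_sites):
    """Return argv lists for alignment, mate fixing, duplicate marking and BQSR.

    All file names are placeholders derived from `sample` (assumption).
    """
    # bwa mem expands the literal "\t" in the read group string.
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}"
    recal_table = f"{sample}.recal.table"

    base_recal = ["gatk", "BaseRecalibrator", "-R", ref,
                  "-I", f"{sample}.dedup.bam", "-O", recal_table]
    for vcf in known_sites:  # one --known-sites flag per database
        base_recal += ["--known-sites", vcf]

    return [
        ["bwa", "mem", "-A", "2", "-M", "-R", read_group, ref, fq1, fq2],
        ["gatk", "FixMateInformation",
         "-I", f"{sample}.namesorted.bam", "-O", f"{sample}.fixmate.bam"],
        ["gatk", "MarkDuplicates", "-I", f"{sample}.possorted.bam",
         "-O", f"{sample}.dedup.bam", "-M", f"{sample}.dup_metrics.txt"],
        base_recal,
        ["gatk", "ApplyBQSR", "-R", ref, "-I", f"{sample}.dedup.bam",
         "--bqsr-recal-file", recal_table, "-O", f"{sample}.recal.bam"],
        # Which BAM was used for the depth estimate is an assumption.
        ["plotCoverage", "-b", f"{sample}.recal.bam",
         "--numberOfSamples", "50000000", "--ignoreDuplicates",
         "--plotFile", f"{sample}.coverage.png"],
    ]
```

Each list could be passed to `subprocess.run`; keeping the commands as data makes it easy to log the exact parameters used for every sample.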
Somatic variant calling. The workflow described in the GATK Best Practices for somatic short variant discovery was followed. In brief, somatic short variants (SNVs and indels) were called with GATK Mutect2 in each tumor sample with its matched normal. The --f1r2-tar-gz argument was used to output F1R2 counts, which can be used to learn the orientation bias model. In addition, a germline resource file containing population allele frequencies of variants, af-only-gnomad.raw.sites.hg19.vcf (downloaded from the GATK Resource Bundle), was used during variant calling to filter common germline variants. Contamination in the tumor sample was estimated with the two-step method available in GATK. First, GetPileupSummaries was run with the file small_exac_common_3_b37.vcf (downloaded from the GATK Resource Bundle), which had been lifted over to hg19 with GATK LiftoverVcf. These genomic positions were passed to two parameters of GetPileupSummaries: --intervals and --variant. Second, CalculateContamination computed the contamination in the tumor from the GetPileupSummaries results. In our case, we also used the matched normal results to obtain more precise estimates. Finally, GATK FilterMutectCalls was applied to filter variants arising from contamination and/or strand/read orientation bias.
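The sequence of GATK tools in this step can be sketched as follows. The Python function below only builds the command lines; it is an outline under stated assumptions rather than the exact commands run in the study. File names and the function name are hypothetical, the normal sample name is assumed to match the SM tag of the normal BAM, and the LearnReadOrientationModel step is taken from the GATK Best Practices rather than from the text above.

```python
def somatic_calling_commands(tumor, normal, ref, germline_resource, common_sites):
    """Return argv lists for Mutect2 calling, contamination estimate and filtering."""
    unfiltered = f"{tumor}.unfiltered.vcf.gz"
    f1r2 = f"{tumor}.f1r2.tar.gz"
    orientation_model = f"{tumor}.read-orientation-model.tar.gz"
    contamination = f"{tumor}.contamination.table"

    cmds = [
        ["gatk", "Mutect2", "-R", ref,
         "-I", f"{tumor}.bam", "-I", f"{normal}.bam", "-normal", normal,
         "--germline-resource", germline_resource,
         "--f1r2-tar-gz", f1r2, "-O", unfiltered],
        # Learns the orientation bias model from the F1R2 counts.
        ["gatk", "LearnReadOrientationModel", "-I", f1r2, "-O", orientation_model],
    ]
    # Step 1 of contamination estimate: pileups at common sites, tumor and normal.
    for sample in (tumor, normal):
        cmds.append(["gatk", "GetPileupSummaries", "-I", f"{sample}.bam",
                     "--variant", common_sites, "--intervals", common_sites,
                     "-O", f"{sample}.pileups.table"])
    # Step 2: contamination in the tumor, refined with the matched normal.
    cmds.append(["gatk", "CalculateContamination",
                 "-I", f"{tumor}.pileups.table",
                 "-matched", f"{normal}.pileups.table",
                 "-O", contamination])
    cmds.append(["gatk", "FilterMutectCalls", "-R", ref, "-V", unfiltered,
                 "--contamination-table", contamination,
                 "--ob-priors", orientation_model,
                 "-O", f"{tumor}.filtered.vcf.gz"])
    return cmds
```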
Germline variant calling. The workflow described in the GATK Best Practices for germline short variant discovery was followed. In brief, germline short variants were called only in target genes with the GATK HaplotypeCaller tool, in GVCF mode per sample. These GVCF files were merged using GATK GenomicsDBImport. GATK GenotypeGVCFs was then used for joint genotyping of tumor and normal samples. Only variants with one alternate allele were considered (--max-alternate-alleles 1). We used the Variant Quality Score Recalibration (VQSR) method of GATK to filter germline insertions/deletions with --max-gaussians 4 and SNPs with --max-gaussians 6.
Variant annotation. The ANNOVAR program (v20191107)44 was used to annotate variants. In this tool, the relevant transcripts are reported based on the refGene database45. Population allele frequencies available in the 1000 Genomes Project46, Kaviar47, gnomAD48 and Complete Genomics 69 (cg69)49 databases were taken into account. The functional consequences of coding, non-coding and splicing SNVs were predicted with the FATHMM50 and dbscSNV51 algorithms. The clinical significance from the ClinVar database52 and the presence of the variant in COSMIC (https://cancer.sanger.ac.uk)53 and dbSNP15040 were also added.
Somatic copy number variations. The bioconda package cnv_facets (v0.15.0, https://github.com/dariober/cnv_facets/) was used to detect allele-specific copy number variants (CNVs) in each tumor sample compared with its matched normal sample. The core of this tool uses the Facets R package54 and predicts tumor purity, ploidy and clonal heterogeneity. The parameters used were: --snp-mapq 15 --snp-baq 20 --snp-count-orphans --depth 25 4000 --cval 25 400 --nbhd-snp auto --gbuild hg19 --rnd-seed 1234.
Structural rearrangements. We used Manta (v1.6.0)55 with default parameters to detect somatic structural variants. This algorithm relies on supporting paired-read and split-read evidence. Inversions are reported as breakends by default. The script convertInversion.py was applied to the resulting VCF files to reformat inversions into single inverted sequence junctions.
HRDetect analysis. The HRDetect pipeline was downloaded from the GitHub repository https://github.com/eyzhao/hrdetect-pipeline and modified for use in our study. The modified code is available at https://github.com/gustaveroussy/hrdetect-pipeline. The somatic short variants, copy number variations and structural rearrangements were used as input to the HRDetect pipeline. The HRDetect scores were then determined as described previously56.