Patient samples
The study included 460 liver metastases from 176 patients who underwent resection for CRLM at Oslo University Hospital, Oslo, Norway, between October 2013 and February 2018. Five patients were excluded due to mucinous tumor tissue, poor DNA quality, or suspicion of low tumor cell content. In total, 441 liver metastases from 171 patients were included for mutation analyses (Fig. 1), of which 102 patients had multiple lesions analyzed (median of 3 metastatic lesions per patient, range 2–9).
Fresh frozen tumor tissue samples (15–30 mg) were homogenized in liquid nitrogen and DNA was extracted using the AllPrep Universal DNA/RNA/miRNA protocol (Qiagen, Hilden, Germany). DNA quality and concentrations were assessed by NanoDrop 1000 spectrophotometer (version 3.7.1, Thermo Fisher Scientific, Waltham, Massachusetts, U.S.A.) and Qubit fluorometer (Thermo Fischer Scientific).
Mutation and microsatellite instability analyses
A total of 355 metastatic tumor samples from 103 patients have previously been analyzed for hotspot mutations in BRAF exon 15 and KRAS and NRAS exons 2–4 by Sanger sequencing (17). The remaining 86 tumor samples and 68 patients were analyzed in the present study.
All 441 tumor samples were also sequenced for all coding regions of TP53 (exons 2–11). In summary, three singleplex PCR reactions were used to analyze TP53 exons 2–4, 5–6 and 7–9, respectively, by amplifying 50 ng of DNA in a reaction mix containing 10x HotStar-buffer, dNTP, HotStar Taq polymerase (Qiagen), and the primers described in Supplementary Table 1. TP53 exons 10 and 11 were analyzed in a separate multiplex PCR reaction by amplification of 50 ng of DNA using the 2x Multiplex PCR-kit (Qiagen). PCR products were purified using Illustra ExoProStar 1-step (GE Healthcare, Chicago, Illinois, U.S.A.), and the Applied Biosystems BigDye Terminator v1.1 Cycle Sequencing Kit and Applied Biosystems 3730 DNA Analyzer were used for sequencing (both Thermo Fisher Scientific). DNA from the blood of two healthy donors was used as controls. The results were analyzed using Applied Biosystems Sequencing Analysis software v5.3.1 and SeqScape software v2.5 (Thermo Fisher Scientific) and scored independently by two investigators. All mutations were validated in independent PCR reactions, and synonymous mutations were not called as mutations.
Five cases with intra-patient mutation heterogeneity of TP53 among metastatic lesions were re-analyzed. Heterogeneous cases were first verified by a second Sanger sequencing, and two of the patients were subsequently re-analyzed also with ultra-deep targeted sequencing with the Illumina TruSight Tumor 15 gene panel as described in (17). High-sensitivity analyses disconfirmed TP53 mutation heterogeneity in one patient. The other patient had inter-metastatic heterogeneity in the specific loci affected by TP53 mutation, and displayed p.Asp184fs mutations in two lesions and p.Arg273His mutations in three lesions (all five lesions had the same KRAS mutation). The three patients for whom high-sensitivity data were not available were classified as mutated.
All tumors were analyzed for microsatellite instability (MSI) status using PCR-based marker analyses, either as previously described using BAT25/BAT26 (26), or using the five markers incorporated in the MSI Analysis System version 1.2 (Promega, Fitchburg, WI, USA). Uncertain cases after analyses of BAT25/BAT26 were re-analyzed with the MSI Analysis System. All patients had microsatellite stable (MSS) tumors except for one.
DNA copy number analyses
A total of 232 lesions from the first 67 patients with multiple metastases sampled were analyzed by genome-wide DNA copy number profiling using the Applied Biosystems CytoScanHD array (Thermo Fisher Scientific). The procedure was conducted according to the manufacturer’s instructions, following the CytoScan Assay Manual Protocol. Resulting raw-intensity CEL-files were pre-processed with the R package rawcopy (v1.1) (27), and subsequently segmented by ASCAT (v2.5) (28), with penalty parameter set to 25 and chromosomes X and Y excluded. A primary interest was to estimate the level of CNA heterogeneity among samples from the same patient. This estimate is highly sensitive to poor data quality, and strict quality control was therefore performed on the segmented data by careful visual inspection of copy number profiles and Sunrise plots produced by ASCAT. Samples with non-aberrant profiles (no/few CNAs) or poor Sunrise plots were excluded, retaining 192 lesions from 64 patients for further analyses. Copy number gain and loss was called for segments with ≥ 1 or ≤-1 copies relative to the median genome-wide copy number estimated by ASCAT, respectively. For comparison with the data processing approach used in our previous study (23), the pre-processed data from rawcopy was additionally segmented by the PCF algorithm implemented in the R package copynumber (29) with the penalty parameter (gamma) set to 100.
To enable analyses across samples, the sample-wise segmented data were further split into their smallest genomic regions of overlap by computationally introducing breakpoints at every unique breakpoint occurring in any sample in the total dataset.
The COSMIC Cancer Gene Census version 86 (30) was used to define cancer-critical genes, (both Tier 1 and Tier 2 genes considered). Of the 719 genes, 672 were covered in the CNA data.
A sample-wise estimate of the overall CNA burden was calculated as the fraction of the genome (per cent of base pairs) with aberrant copy number. For patients with multiple lesions, the mean CNA burden was used for patient-wise analyses. Estimates of ploidy were derived from ASCAT.
Unpublished DNA copy number data was available for three matching primary tumors for comparison of amplification status in the metastases.
Estimation of intra-patient inter-metastatic copy number heterogeneity
High-quality DNA copy number data were available for at least two metastatic lesions from 48 patients (Fig. 1), including a total of 176 tumors and a median of 4 tumors per patient (range 2–8). These were analyzed for intra-patient inter-metastatic DNA copy number heterogeneity by three different approaches. First, the genome-wide matrix of estimated copy numbers was used to perform pairwise comparisons among metastatic lesions from each patient based on Euclidean distances, using the dist function implemented in the R stats package. To obtain one heterogeneity measure per patient, the mean Euclidean distance of all pairwise comparisons was calculated, in accordance with the approach used in our previous study (23). Second, the pairwise distance was calculated as in the first approach but using Pearson correlation-based distance. Third, CNA heterogeneity was assessed by a gene-wise estimation (protein-coding genes from UCSC known genes) of the fraction of CNAs within a patient that were not common across the lesions, i.e. genes with aberrant copy number in one or more lesions but not in all. The heterogeneity-calling was more conservative with this approach, as only events exceeding the copy number gain/loss thresholds were considered heterogeneous, while genes consistently affected by gain (or loss) but with varying amplitudes were regarded as homogeneous CNA events.
The patient-wise CNA heterogeneity measure was categorized as high or low relative to the median across the patients.
For robustness, data segmented with the PCF algorithm was used to estimate copy number heterogeneity (distance-based) in the same manner as in our previous study (23), by calculating the average pair-wise Euclidean distance between DNA segments with a variance of > 0.03 among samples from each patient. The distance measured obtained from ASCAT and PCF showed good correlation (Spearman’s rho 0.58, p < 0.001; Supplementary Fig. 1a).
Statistical analyses
Pairwise comparisons of variables between groups were done by non-parametric Wilcoxon rank-sum tests for continuous variables and with Fisher’s exact test for categorical data, both implemented in the R stats package.
Survival analyses were performed with 5-year cancer-specific survival (5y-CSS) as the end point. Time to death from CRC was measured from start of treatment (either neoadjuvant systemic treatment or surgery) and deaths from other causes were censored (31). Only patients with MSS cancers and R0 or R1 status in the liver after resection were included in survival analyses (n = 165 of 171 patients in the full cohort, n = 62 of 64 patients in CNA burden analyses, n = 46 of 48 in CNA heterogeneity analyses). Kaplan-Meier estimates and log rank tests were used for comparisons of variables with only two groups, using the survdiff function in the R survival package. For comparisons of more than two groups, log rank tests for trend were performed using the comp function in the R survMisc package. All Kaplan-Meier plots were made with the ggsurvplot function in the R survminer package. Univariable and multivariable Cox regression analyses were performed with the coxph function in the R survival package.