Whole Genome Sequencing and Custom Liquid Biopsy Markers in Sarcomas


 Background: Sarcomas are rare tumours with heterogeneous clinical behaviour including varying rates of metastasis. Clinical treatment and follow-up rely on crude grading systems with uncertain accuracy for individual patients. Whole genome sequencing (WGS) detects both structural variation and single nucleotide variants and may thus add important diagnostic/prognostic information. Liquid biopsies (LB) may potentially identify hematogenous spread and treatment response rate but further evaluation in sarcomas is needed. Methods: In this study we explore the performance of individualized LB in four patients with different types of sarcomas. Fresh frozen tumour tissue was sequenced using WGS and whole transcriptome sequencing. Three putative driver variants or one fusion gene were selected per case and custom digital droplet PCR (ddPCR) assays were designed, evaluated on tumour DNA and used to assess levels of cell free tumour DNA in plasma taken prior to surgery. Results: In LB, ddPCR identified three variants in one patient with metastatic disease. The remaining three patients had negative LBs and were without disease at follow-up (>18 months after surgery). Conclusions: WGS is a powerful tool to detect all types of genetic changes in sarcoma and can facilitate clinical diagnosis/classification while custom LB may add prognostic information.


Background
Sarcomas constitute a group of rare and heterogeneous tumours originating in mesenchymal tissue, commonly divided into bone and soft tissue sarcomas. They comprise around 1% of all cancers and roughly 20% of paediatric solid malignancies (1). A significant proportion of histopathological entities can be genetically defined by specific fusion gene drivers caused by chromosomal rearrangements, e.g. the SYT-SSX fusion genes in synovial sarcoma or the FUS-DDIT3 fusion gene in myxoid liposarcoma. In other cases, however, sarcomas harbour numerous structural and numerical chromosomal abnormalities that may support the diagnosis of specific subtypes, e.g. the near-haploid genotype seen in inflammatory leiomyosarcomas or the supernumerary ring chromosomes seen in liposarcomas or dermatofibrosarcoma protuberans (2). Some high-grade osteosarcomas are defined by chromothripsis-like events, a single genetic catastrophe in which multifragmentation of chromosomes results in multiple rearrangements and copy number alterations (3,4). Compared to carcinomas, the overall mutational burden is regarded as quite low (5). Nonetheless, single nucleotide mutations may have important diagnostic and prognostic value in sarcomas, e.g. KIT mutations in GIST and STAG2 mutations in Ewing sarcoma, the latter carrying negative prognostic value, especially together with TP53 mutation (6, 7). Targeted panels using massively parallel DNA sequencing (MPS), designed to identify mutations commonly found in the major cancer types, have shown limited utility in sarcomas (8). However, gene panels specifically designed to target genetic alterations found in sarcoma have had greater success in the diagnostic setting (9,10).
Liquid biopsy (LB) is a rapidly developing field in oncology. The most common LB method is based on the assumption that genetic variants in circulating tumour DNA (ctDNA) can be identified and quantified in cell-free DNA (cfDNA) from plasma {Barbany, 2019 #66}. LB has proven effective for tracking therapy response and for detecting the acquisition of resistance mutations in both solid and haematological malignancies {Alimirzaie, 2019 #9;Hocking, 2016 #10}. Early reports suggest that LB can be utilized in specific subgroups of sarcomas, including fusion-gene related tumours or tyrosine kinase-driven tumours {Braig, 2019 #12;Nannini, 2014 #13}. However, the role of LB in a general and unselected group of sarcomas has not yet been investigated.
In this study, we wanted to test the concept of LB in a prospective group of sarcomas as presented in the routine clinical setting. Four patients were included, and their tumours were genetically characterized using both whole genome (WGS) and whole transcriptome sequencing (WTS). Putative markers for ctDNA tracking were selected and custom-made digital droplet PCR (ddPCR) assays were designed in a preoperative setting to evaluate the method.

Methods
Tumour specimen collection and nucleic acid extractions
All participants gave their written informed consent to participate in this study, which has been approved by the local ethical review board (approval code: 2016/2-31/1). Fresh frozen tumour tissue was isolated from surgical specimens by a clinical pathologist at the Karolinska University Hospital, Stockholm, Sweden. Tumour tissue was snap frozen and tumour representativity was confirmed using routine haematoxylin-eosin (HE)-stained sections, followed by a visual estimation of tumour cell proportion. Nucleic acids were manually extracted from tumour tissue using the DNeasy and RNeasy kits (Qiagen, Hilden, Germany) at the Clinical Molecular Pathology laboratory using accredited protocols (SWEDAC).
DNA and RNA isolates were quantified using a Qubit 3.0 Fluorometer (Thermo Fisher Scientific, Waltham, MA, USA) and quality controlled using a 4200 TapeStation (Agilent Technologies, Santa Clara, CA, USA).
Germline DNA was extracted from white blood cells using the EZ1 DNA Tissue Kit (Qiagen) and quantified using Qubit.
Whole genome sequencing
DNA from fresh-frozen tumour samples and white blood cells was converted to sequencing libraries using a PCR-free paired-end protocol (Illumina TruSeq, San Diego, CA, USA) and sequenced on the NovaSeq 6000 (Illumina) aiming at 30x coverage. The median coverage was 26-36x for tumour DNA (330-491 M read pairs) and 30-38x for germline DNA.
Whole transcriptome sequencing/Preparation of RNA libraries
RNA-seq was performed on total RNA from fresh frozen tumour tissue. Sequencing libraries were generated using a PCR-free paired-end protocol (Illumina TruSeq, San Diego, CA, USA) and sequenced on the NovaSeq 6000 (Illumina) platform aiming for at least 20 M reads per sample. Raw FASTQ reads were submitted for fusion gene analysis as described below.

SNVs
SNVs were called using Balsamic (https://github.com/Clinical-Genomics/BALSAMIC/), which uses Vardict, TNscope and TNHaplotyper for calling SNVs and small indels, and were visualised in SCOUT, a custom-developed decision support system. The following four criteria were applied for manual filtering of variants identified by the bioinformatic callers: 1) variants should be located in exonic or splice regions, 2) >5% allele frequency in the tumour sample, 3) <0.001% allele frequency in the normal sample, and 4) frequency in the Genome Aggregation Database (gnomAD, v.2.1.1) < 0.01. Next, all variants were manually inspected using the Integrative Genomics Viewer [39] and functionally interpreted. All prefiltered variants were also submitted to the Molecular Board Portal Karolinska [21] for functional annotation. Filtered variant lists are available in the supplementary data.
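The four manual filtering criteria can be expressed as a simple predicate. The following Python sketch is illustrative only: the `Variant` record, its field names and the example values are hypothetical and do not reflect the actual Balsamic or SCOUT data model. Allele frequencies are assumed to be stored as fractions, and variants absent from gnomAD are assumed to pass criterion 4.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical record type; field names are illustrative assumptions.
@dataclass
class Variant:
    gene: str
    region: str                 # e.g. "exonic", "splicing", "intronic"
    tumour_af: float            # allele fraction in tumour (0-1)
    normal_af: float            # allele fraction in matched normal (0-1)
    gnomad_af: Optional[float]  # population frequency, None if not in gnomAD

CODING_REGIONS = {"exonic", "splicing"}

def passes_manual_filter(v: Variant) -> bool:
    """Apply the four filtering criteria described in the text."""
    in_region = v.region in CODING_REGIONS                  # criterion 1
    tumour_ok = v.tumour_af > 0.05                          # criterion 2: >5%
    normal_ok = v.normal_af < 0.00001                       # criterion 3: <0.001%
    # Criterion 4; treating "absent from gnomAD" as passing is an assumption.
    gnomad_ok = v.gnomad_af is None or v.gnomad_af < 0.01
    return in_region and tumour_ok and normal_ok and gnomad_ok

# Hypothetical example variants
candidates = [
    Variant("GENE_A", "exonic", 0.31, 0.0, None),
    Variant("GENE_B", "intronic", 0.40, 0.0, None),
    Variant("GENE_C", "exonic", 0.03, 0.0, 0.0001),
]
kept = [v.gene for v in candidates if passes_manual_filter(v)]
print(kept)
```

Variants passing such a filter would still require the manual inspection and functional interpretation described above.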

Fusions
We used FusionCatcher (version 1.20 with standard settings [40]) to screen the transcriptome for fusion gene transcripts in all cases. Putative in-frame fusion transcripts were filtered for blacklisted and banned variants, genes with common mapping reads, adjacent genes as well as limited anchorage sequences (short reads). Lastly, all fusion transcripts were compared to rearrangements identified in the WGS data. SVs from WGS data were identified using the FindSV pipeline (https://github.com/J35P312/FindSV) incorporated into MIP (https://mip-api.readthedocs.io/en/latest/) and visualised in the SCOUT interface (https://github.com/Clinical-Genomics/scout).

Copy Number Alterations
Copy number alterations were detected using the FindSV pipeline. FindSV performs variant detection using CNVnator V0.3.2 with 1 kb bins [41] and TIDDIT V2.0.0 [42], as well as annotation using the Variant Effect Predictor release 92 (McLaren et al. 2016) and SVDB [42]. Next, the variants were visualized using IGV [39] and vcf2cytosure [43]. In addition, copy number analysis was performed and Manhattan plots were generated with CNVpytor. Samples were analysed with the read depth option and a bin size of 100 kb [44]. Selected CNVs were also verified in the SCOUT interface (https://github.com/Clinical-Genomics/scout).

Isolation of cfDNA
Peripheral blood samples were collected from all patients before surgery (0-24 days prior) in Streck Cell-Free DNA BCT tubes and stored at room temperature for a maximum of 5 days before plasma isolation.
Blood volumes ranged from 6 to 20 ml/patient. After centrifugation at 1600 x g (10 min, 4°C), the plasma supernatant was removed and centrifuged at 16 000 x g (10 min, 4°C). Cell-free plasma volumes ranged from 3 to 9 ml/patient and were frozen at -80°C in cryovials. Plasma from blood donors was collected as reference plasma.
Samples were thawed at room temperature and cfDNA extraction was performed on 3 ml aliquots of plasma (one or multiple per patient depending on total plasma volume) using the QIAamp Circulating Nucleic Acid Kit (Qiagen, Manchester, UK). cfDNA was eluted in 40 µL AVE buffer and stored at -20°C.

ddPCR design and analysis
Custom ddPCR assays containing sequence-specific primers and FAM/HEX labelled probes were designed based on WGS results according to the Rare Mutation Detection Best Practices Guidelines (Bio-Rad, Hercules, California, USA) and ordered from Integrated DNA Technologies (IDT, Coralville, Iowa, USA). For SNV assays, the HEX probe was designed for the wild type allele. For the SV assay, a commercial non-mutated reference gene was used (ABCC9). Amplicon lengths were 65-91 bp. For sequence details see Supplementary table 1. All assays were tested on a gradient ddPCR to define the optimal annealing temperature. Gradients were run on normal control gDNA (10 ng/well), NTC and positive control (1 ng patient tumour DNA in a background of 9 ng normal control gDNA). No false positives were observed.
Before ddPCR analysis, all cfDNA aliquots from the same patient were pooled. ddPCR reaction mixes were then prepared with 10 µL 2x ddPCR Supermix for Probes (No dUTP, Bio-Rad), 11 µL cfDNA sample and 1 µL custom assay (final primer/probe concentrations of 900 nM/250 nM, respectively). Reactions were run in triplicate alongside triplicates of non-template, wild type (10 ng gDNA) and positive controls (1 ng gDNA from tumour in 9 ng wt gDNA). Additionally, a minimum of 9 wells of cfDNA from healthy plasma donors were run as background control. ddPCR reactions were run on the QX200 AutoDG Droplet Digital PCR System and QX200 Droplet Reader (Bio-Rad) according to the manufacturer's instructions. The following cycling conditions were used: 1 cycle at 95°C for 10 min, 40 cycles at 94°C for 30 s and the assay-specific annealing temperature for 1 min, 1 cycle at 98°C for 10 min, and 1 cycle at 8°C (infinite hold), all at a ramp rate of 2°C/s. Data were analysed using the QuantaSoftPro Software (Bio-Rad) and results were manually reviewed. The threshold for positive droplets was based on the control samples run together with patient samples in each assay.
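For orientation, ddPCR quantification rests on a Poisson correction of the fraction of positive droplets. The Python sketch below illustrates that standard calculation; it is not the QuantaSoft implementation, and the nominal droplet volume of 0.85 nL, the function names and the example droplet counts are assumptions made for illustration.

```python
import math

DROPLET_VOLUME_UL = 0.00085  # assumption: nominal droplet volume of 0.85 nL

def copies_per_droplet(positive: int, total: int) -> float:
    """Mean target copies per droplet, Poisson-corrected from the positive fraction."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need total > 0 and 0 <= positive < total")
    return -math.log(1.0 - positive / total)

def copies_per_ul(positive: int, total: int,
                  droplet_volume_ul: float = DROPLET_VOLUME_UL) -> float:
    """Target concentration in copies per microlitre of reaction."""
    return copies_per_droplet(positive, total) / droplet_volume_ul

def fractional_abundance(mutant_conc: float, wildtype_conc: float) -> float:
    """Mutant fraction among all target copies (mutant + wild type)."""
    total = mutant_conc + wildtype_conc
    return mutant_conc / total if total > 0 else 0.0

# Hypothetical droplet counts for a single well (FAM = mutant, HEX = wild type)
mut = copies_per_ul(positive=2, total=15000)
wt = copies_per_ul(positive=1400, total=15000)
print(f"mutant: {mut:.3f} copies/uL, wild type: {wt:.1f} copies/uL")
print(f"fractional abundance: {fractional_abundance(mut, wt):.4%}")
```

At the very low positive-droplet counts expected for ctDNA, the Poisson correction changes little, which is why running triplicates and healthy-donor background wells matters more for a reliable call than the correction itself.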

Patient 1
Clinical history
A 70-year-old female presented with pain in the distal femur. Imaging revealed a 13 x 2.5 x 2 cm localized tumour with cartilaginous appearance indicating a chondrosarcoma. Fine-needle aspiration cytology and imaging also suggested a conventional chondrosarcoma. The primary tumour was resected, and histopathology classified the tumour as a high-grade chondrosarcoma (grade 3) removed with wide margins (Figure 1A). After 60 months the patient developed a bone metastasis to the clavicle.
Liquid biopsy was performed, after which the patient received localized radiotherapy to the metastatic site (surgical indication) followed by resection with marginal margins. Histopathology revealed a 7 x 5 x 4 cm dedifferentiated chondrosarcoma metastasis with adjacent intravascular growth. After an additional 13 months, the patient developed brain metastases and passed away.

Somatic variants and fusion gene analysis
After manual filtering of variants identified by whole genome sequencing, 40 coding variants remained (Supplementary Table 2A). These included a truncating variant in a known tumour suppressor gene (FANCA) with variant allele frequency (VAF) 31%, a missense variant in a known oncogene (CARD11) with VAF 47%, variants in other well-known cancer genes (e.g. TP53, RNF43), and truncating variants in genes known to be associated with chondrosarcoma, COL2A1 (VAF 60%) and ADAMTSL1 (VAF 39%), or with sarcomas in general, PIK3C3 (VAF 68%) (Figure 1B). The 33 remaining variants were classified as variants of uncertain significance (VUS). The tumour also harboured a known non-coding hotspot mutation in the TERT promoter on chromosome 5:1295228 with a VAF of 31%. Based on the variants, gene background and tumour allele frequency, we selected the variants in TP53 (VAF 45%), PIK3C3 (VAF 68%) and COL2A1 (VAF 60%) for ctDNA screening. The custom ddPCR assays had 100% specificity on fresh frozen tumour DNA and a theoretical limit of detection (LOD) of 0.03% given that the patient had 95.6 ng total cfDNA in 9 ml of plasma (10.6 ng/ml plasma) (Figure 1E). Each assay was run on cfDNA from 3 ml of plasma. The patient was positive for all variants, with an estimated combined variant allele frequency of 0.14%, corresponding to 4.4 mutant molecules per millilitre plasma (Supplementary figure 2, supplementary table 4). PIK3C3, which had the highest tumour VAF, also had the highest number of signals, with an average of 2 mutant copies/well corresponding to approximately 22 copies in total.
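The relation between cfDNA yield, variant allele frequency and mutant molecules per millilitre can be checked with a short calculation. This Python sketch assumes roughly 3.3 pg of DNA per haploid human genome, a commonly used approximation that is not stated in the text; the function names are illustrative and the sketch is not presented as the calculation actually used in the study.

```python
HAPLOID_GENOME_NG = 0.0033  # assumption: ~3.3 pg DNA per haploid human genome

def genome_equivalents(cfdna_ng: float) -> float:
    """Approximate number of haploid genome copies in a given mass of cfDNA."""
    return cfdna_ng / HAPLOID_GENOME_NG

def mutant_copies_per_ml(cfdna_ng_per_ml: float, vaf: float) -> float:
    """Expected mutant molecules per ml plasma at a given variant allele fraction."""
    return genome_equivalents(cfdna_ng_per_ml) * vaf

# Values reported for patient 1
total_cfdna_ng = 95.6
plasma_ml = 9.0
combined_vaf = 0.0014  # 0.14%

conc = total_cfdna_ng / plasma_ml
print(f"cfDNA concentration: {conc:.1f} ng/ml")
print(f"estimated mutant copies per ml: {mutant_copies_per_ml(conc, combined_vaf):.1f}")
```

With these inputs the estimate comes out at about 4.5 mutant copies per millilitre, close to the reported 4.4; the small difference presumably reflects rounding or a slightly different conversion factor.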

Patient 2
Clinical history
A 64-year-old male presented with a deeply situated mass in the thigh. MRI showed a large soft tissue tumour indicative of a sarcoma, confirmed by fine-needle aspiration cytology. A liquid biopsy was performed, and the tumour was surgically resected. The histopathological examination classified the tumour as a myxofibrosarcoma measuring 3.4 x 3.2 x 3 cm, with less than 10% necrosis, corresponding to grade 2 according to the FNCLCC system [16]. The tumour was resected with wide margins (Figure 2A). No adjuvant therapy was given. The patient was alive without disease manifestation at follow-up (19 months after surgery).

Somatic variants and fusion gene analysis
After manual filtering of variants identified by whole genome sequencing, 18 variants remained (Supplementary Table 2B), including a frameshift in TP53. Three variants were found in oncogenes (KMT2A, BCL11A and SETBP1), but no additional variants were found in bona fide tumour suppressor genes. For ctDNA screening, we selected a deletion of 32 bp in TP53 (VAF 27%) and two SNVs in KMT2A (VAF 26%) and CYP4F22 (VAF 44%) (Figure 2B). The custom ddPCR assays had 100% specificity on fresh frozen tumour DNA (Figure 2E) and a theoretical LOD of 0.07% given that the patient had 29 ng total cfDNA in 8 ml of plasma (3.6 ng/ml plasma). The patient was negative for all variants on ddPCR (Supplementary Table 4).

Patient 3
Clinical history
A 71-year-old female presented with an 8 x 5 x 4 cm soft tissue tumour in the trunk. MRI and fine-needle aspiration cytology confirmed a diagnosis of soft tissue sarcoma. Liquid biopsy was performed, and the tumour was surgically removed. Histopathology showed a spindle cell sarcoma of unclear subtype (Figure 3A), corresponding to a grade 2 sarcoma according to FNCLCC criteria. Areas of nerve sheath-like cellular morphology and diffuse loss of H3K27me3 immunoreactivity favoured a diagnosis of malignant peripheral nerve sheath tumour (MPNST). The tumour was resected with wide margins and no adjuvant therapy was given. The patient was alive without disease at follow-up (20 months after surgery).

Somatic variants and fusion gene analysis
After manual filtering of variants identified by whole genome sequencing, 37 coding variants remained (Supplementary Table 2C). These included one variant in a tumour suppressor (CPEB3; VAF 33%) and variants in three oncogenes (KDR, WAS and CCNA1; VAF of 20%, 20% and 40% respectively).
The CNA analysis demonstrated an extremely complex karyotype with a large number of CNAs and gains and losses on all chromosomes (Figure 3C, Supplementary Figure 1C). There was no deletion of NF1 or TP53, although there was a sub-clonal deletion of the short arm of chromosome 9 including CDKN2A/B and an amplification on 17p13.1-p12 which included the MYOCD gene (Figure 3D). WGS and WTS did not identify any expressed fusion gene (data not shown).

ctDNA analysis
In this patient we prioritized putative driver variants with higher allele frequency, selecting CR2 (AF 71%) (Figure 3B), ADAM23 (AF 41%) and CCNA1 (AF 40%) for ctDNA screening. The custom ddPCR primers showed 100% specificity on fresh frozen tumour DNA (Figure 3E) and a theoretical LOD of 0.08% given that the patient had 24.2 ng total cfDNA in 6 ml of plasma (4.0 ng/ml plasma). The patient was negative for all variants on ddPCR (Supplementary Table 4).

Patient 4
Clinical history
A 55-year-old female presented with a subcutaneous tumour in the thigh. The MRI showed a 4.5 x 3 x 2 cm tumour most consistent with a soft tissue sarcoma. Liquid biopsy was performed, and the tumour was surgically removed. Histopathology classified the tumour as a low-grade myxoid liposarcoma with <1 mitosis/10 HPF, removed with marginal margins (Figure 4A), and routine molecular pathology identified a FUS-DDIT3 fusion gene using qRT-PCR. Postoperative radiotherapy was administered. The patient was alive without sarcoma at follow-up (26 months after surgery), although she later developed breast cancer.

Fusion gene screening (Whole transcriptome sequencing)
Since this case had a previously identified FUS-DDIT3 fusion gene, the genetic investigation was performed to map this variant in detail. WTS was able to detect the FUS-DDIT3 chimeric transcript (supplementary table 3). WGS of tumour DNA showed that the fusion gene resulted from a three-break balanced rearrangement, t(X;12;16)(p11.21;q13.3;p11.2) (Figure 4B). Trisomy 8 was the only additional chromosomal aberration (Figure 4C and D, Supplementary figure 1D).

ctDNA analysis
Based on the genomic rearrangement, ddPCR primers were designed to identify the FUS-DDIT3 breakpoint on the genomic level. The custom ddPCR assay showed 100% specificity on fresh frozen tumour DNA (Figure 4E) and a theoretical LOD of 0.15% given that the patient had 7.6 ng total cfDNA in 3 ml of plasma (2.5 ng/ml plasma). The patient was negative on ddPCR (Supplementary Table 4).

Discussion
In recent years liquid biopsies have been utilized to detect treatment predictive mutations in ctDNA in cancer patients, without the hazard of invasive biopsies. The method has also been used to develop predictive biomarkers for tumour burden and as a quantitative measurement of tumour clone responsiveness during treatment. Similar applications have been foreseen in the treatment of sarcoma patients (11). Since several histological subtypes of sarcoma are characterized by specific genetic events, several studies have evaluated LB based on such variants. Results from these studies suggest that detection of fusion genes in ctDNA may be a very sensitive method to detect tumour burden (12,13). However, detection of the amplified MDM2 gene in liposarcoma was not sensitive enough to be utilized in routine diagnostics (12). Since MDM2 amplification is often substantial (30+ copies) in liposarcoma, this suggests that smaller copy number gains may be difficult to quantify in cfDNA (assuming that liposarcomas produce ctDNA at all).
Here, we wanted to investigate the concept of customized LB in a prospective clinical setting. First, we can conclude that WGS comprehensively detected both putative driver single nucleotide and structural variants in all included cases, thus providing important additional information compared to a targeted gene panel. This is important since sarcomas are very genetically diverse and thus difficult to target using standard gene panels: in fact, up to half of the SNVs in our patients would not have been detected by standard clinical pan-cancer panels.
Three of the SNVs in the chondrosarcoma were found in genes previously reported to be commonly mutated in chondrosarcomas (TP53, COL2A1 and TERT) (14,15). The tumour also harboured a variant in PIK3C3, which is more commonly mutated in primary bone malignancies than in other tumour types (16). Similarly, of the SNVs in patient 2, only TP53 mutations have previously been reported in myxofibrosarcoma (17,18). In patient 3, none of the common MPNST genes (NF1, TP53, CDKN2A/B) harboured SNVs, although there was a deletion encompassing the CDKN2A/B locus on 9p, known to be present in >60% of all MPNST (19), and an amplification on 17p that included the MYOCD gene, reported to be important in sarcoma (20). This demonstrates the importance of combining these types of analyses.
The patterns of CNAs in our samples are similar to those previously reported. For instance, the dedifferentiated chondrosarcoma (patient 1), the myxofibrosarcoma (patient 2) and the suspected MPNST (patient 3) all had very complex karyotypes, as expected in these tumour types (18, 19,21,22). Of note, patients 1 and 2 both harboured somatic TP53 SNVs and had focal complex events with multiple deletions and duplications on selected chromosomes, as seen in chromoanasynthesis (a chromothripsis-like event). Such events are often associated with TP53 mutation (3,23) and are more common in sarcoma than in other tumour types, especially in myxofibrosarcoma (37% of 36 tested tumours) (24). In contrast, the low-grade myxoid liposarcoma (patient 4) only had a diagnostic FUS-DDIT3 fusion and a trisomy 8, one of the most common additional findings (25,26).
In addition, our data support that short-read WGS can be used to detect translocations and to characterize the breakpoints of fusion genes; knowledge required for design of targeted assays. For example, in the case of patient 4, the FUS-DDIT3 was created by a fusion of exon 5 in FUS to exon 2 of DDIT3 (i.e. type 2 (27)) as a result of a three-way translocation involving chromosome X and with a small deletion at the breakpoint on chromosome 12 (FUS).
In a therapeutic setting, WGS can also be used to detect variants that could aid in diagnosis and perhaps be targetable. For instance, the chondrosarcoma harboured a heterozygous mutation in FANCA, which is a biomarker for olaparib sensitivity in prostate cancer (28), so olaparib could have been a potential treatment alternative for this patient. The variant was classified as Tier III according to AMP criteria (29), and heterozygous carriers have been reported to respond to olaparib (e.g. (30)). The tumour also carried a PIK3C3 mutation. A novel PI3K inhibitor, alpelisib, has recently been approved for specific subtypes of breast cancer (31), and this mutation could potentially have been a second therapeutic target detected by liquid biopsy. However, the cost of WGS is still substantial and it requires fresh frozen tumour tissue. In many cases, standard gene panels targeting sarcoma might still be a feasible method to identify target single nucleotide variants to follow in LB.
For ddPCR, variants in genes known to be mutated in cancer according to the cBioPortal were selected for analysis. Methodologically, we were able to produce ddPCR assays for the included tumours within a clinically relevant time frame (a couple of weeks). Furthermore, the three patients with localised disease had low levels of total cfDNA and negative ddPCR assays. This is in line with previous studies that have shown that 80% of all patients with non-metastatic soft-tissue sarcoma have undetectable ctDNA (32). This is also in agreement with previous publications (12,13,33), which suggest that quantitative applications, such as dynamic changes during therapy, may be difficult in sarcomas. However, it is interesting that the only patient with metastatic disease had both higher levels of total cfDNA and detectable ddPCR markers in plasma. Detection of ctDNA has been associated with poorer prognosis in Ewing sarcoma and osteosarcoma (33); however, there are no larger studies on soft tissue sarcomas. Increased levels of ctDNA can precede radiological detection of recurrence in sarcoma (32), but unfortunately we did not have additional follow-up samples from our patients to evaluate this further.

Conclusion
While previous studies on liquid biopsies in translocation-associated sarcomas show good potential for clinical implementation, there is still a great need to investigate which methods can be used for the majority of sarcomas without known fusion driver events. We show that whole genome sequencing can detect diverse driver events (SNVs, indels, CNAs and fusion genes) and that targeted analysis of these variants using ddPCR on cell-free DNA from plasma can be performed in a clinically relevant time frame. Especially in low-grade sarcomas, whose metastatic rate is 10-15% over a 10-year period, LB could potentially reduce the need for imaging follow-up. The natural next step is to continue prospective collection of patients and monitor disease progression for several years, with the aim of determining in which settings LB makes sense in routine diagnostics and follow-up for sarcoma patients.

Declarations
Ethics approval and consent to participate
The study was conducted according to the guidelines of the Declaration of Helsinki, and approved by the local Ethics Committee of Stockholm (protocol code 2016/2-31/1 and date of approval: 9 Mar 2016 with additions 2018/1472-32/1 and 2019-01222). Oral and written informed consent was obtained from all subjects involved in the study.

Consent for publication
Informed consent for publication has been obtained from all the study participants.

Availability of data and materials
Whole genome sequencing data are considered sensitive and cannot be openly shared in public databases according to Swedish law and the ethical permission. All data from this study may be made available from the corresponding author (emma.tham@ki.se) on reasonable request and under a formalized legally binding agreement.
Supplementary Files