3.1. Optimization of long-range PCR for mtDNA enrichment
The quality of amplicons was assessed from the definition of bands on agarose gel: the aim was clearly defined bands of the expected size, without smears and/or non-specific products. At the initial stage, Platinum PCR SuperMix yielded either poor bands or large smears (Fig. S2a), while LA Taq Hot Start and PrimeSTAR GXL both yielded distinguishable bands of the expected size with primers for the longer, more challenging 11.2 kb fragment (Fig. S2b and S2c, respectively). Thus, LA Taq Hot Start and PrimeSTAR GXL entered stage II, in which the optimal DNA input and number of amplification cycles were tested (Fig. S3). PrimeSTAR GXL outperformed LA Taq Hot Start for the 11.2 kb amplicon, producing clearer, better-defined bands as well as the yield required for downstream library preparation (Fig. S3a and S3b). Therefore, only PrimeSTAR GXL proceeded to stages III and IV, in which optimal conditions for the 9.1 kb amplicon were established. As shown in Figs. S4 and S5, the optimal genomic DNA input was 1 ng (in a 12.5 μL reaction volume), while 25 amplification cycles provided a good balance between yield and specificity for both buccal epithelium and blood sample types. Increasing the annealing temperature to 60°C improved specificity, producing visually better results than the 55°C annealing temperature (Fig. S5 and S4, respectively). To confirm the optimal PCR conditions, we amplified both mtDNA fragments in different samples and sample types (Fig. S6). The 60°C annealing temperature was retained for the 9.1 kb amplicon in a 3-step PCR, while 2-step PCR conditions were applied for the 11.2 kb amplicon. The final optimized long-range PCR conditions for PrimeSTAR GXL DNA polymerase are shown in Table S1. PrimeSTAR GXL has previously been reported as the best-performing long-range DNA polymerase in a comparison with five other DNA polymerases, specifically for obtaining long PCR products for sequencing on the MiSeq instrument [14]. 
Even though that publication used different long-range PCR targets, this study corroborates the superior performance of PrimeSTAR GXL, while also extending its application to long-range PCR of mtDNA amplicons for MPS. Additionally, PrimeSTAR GXL provided accurate, repeatable and reproducible results in our previous study [11], further confirming its reliability for long-range PCR.
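The logic of the two-fragment design can be illustrated with a short coverage check: two overlapping long-range amplicons (9.1 kb + 11.2 kb) together exceed the length of the circular human mtDNA genome (rCRS, 16,569 bp), so with suitable primer placement they tile it completely. This is a minimal sketch in Python; the amplicon coordinates below are hypothetical placeholders, not the study's actual primer positions.

```python
# Sketch: check that two long-range amplicons jointly cover the circular
# human mtDNA genome (rCRS, 16,569 bp). Coordinates are hypothetical
# placeholders -- the real primer positions are not reproduced here.

MTDNA_LEN = 16_569

def covered_positions(start, end, genome_len=MTDNA_LEN):
    """Return the set of 1-based positions spanned by an amplicon on a
    circular genome; end < start means the amplicon wraps the origin."""
    if start <= end:
        return set(range(start, end + 1))
    return set(range(start, genome_len + 1)) | set(range(1, end + 1))

# Hypothetical ~9.1 kb and ~11.2 kb fragments with overlapping ends
frag_9kb = covered_positions(301, 9400)      # 9,100 bp
frag_11kb = covered_positions(9001, 3700)    # ~11.2 kb, wraps the origin

union = frag_9kb | frag_11kb
print(f"Covered: {len(union)}/{MTDNA_LEN} positions")
print(f"Overlap: {len(frag_9kb & frag_11kb)} positions")
```

With these placeholder coordinates the union spans every position of the genome, and the overlapping ends provide redundant read depth at the amplicon junctions.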
3.2. Evaluation of limited-cycle PCR step
We noticed that, while the quality metrics of sequencing runs were satisfactory and within the ranges given in the manufacturer’s specifications, the yield of generated data did not reach its full capacity. Moreover, libraries that produced low-quantity electropherograms on LabChip also produced lower cluster density (hence fewer clusters and, consequently, less data) in sequencing, regardless of the loading concentration. The Nextera XT assay is known to produce uneven read depth profiles [2-4, 6, 15], with the risk that some regions receive very low read depth. Thus, to maximize the data yield – primarily to ensure that each sample receives sufficient read depth at all positions of the mtDNA genome – the limited-cycle PCR step, in which index adapters are added and libraries are amplified, was increased to 15 cycles. Molar concentrations of libraries amplified with 15 cycles were substantially higher than those of libraries that underwent 12-cycle PCR (Table S3), as expected, and LabChip electropherograms showed correspondingly larger quantities of library fragments (Fig. 1). All variant calls were concordant between the 12-cycle and 15-cycle libraries of corresponding samples, including occurrences of point heteroplasmy, as well as insertions and deletions (Table S4).
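The expected effect of the three extra cycles can be put into rough numbers: under ideal amplification each cycle doubles the library, so yield should increase by up to 2³ = 8-fold, and less in practice as per-cycle efficiency falls below 1. A minimal sketch (the efficiency values are hypothetical, not measured):

```python
# Sketch: theoretical fold-increase in library yield from raising the
# limited-cycle PCR from 12 to 15 cycles. Real gains are smaller because
# amplification efficiency drops below 1.0 as reagents are consumed.

def fold_increase(extra_cycles, efficiency=1.0):
    """Yield multiplier after `extra_cycles`, where `efficiency` is the
    fraction of molecules duplicated per cycle (0..1)."""
    return (1 + efficiency) ** extra_cycles

print(fold_increase(3))        # ideal doubling: 8.0
print(fold_increase(3, 0.8))   # hypothetical 80% efficiency: ~5.8
```

This back-of-the-envelope estimate is consistent with the substantial (but not strictly 8-fold) concentration increases one would expect in Table S3.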
However, it was necessary to exclude the possibility that prolonged amplification of indexed libraries affected sequencing results in any way (e.g. by elevating the level of noise or introducing sequence errors). For that reason, negative controls (NC-EX, NC-PCR and NC-LIB) from runs containing libraries prepared with 12 and 15 cycles were analysed as previously described [11], to assess the level of noise and exogenous signals detected in sequencing. Cumulatively, an average of 7,370 positions with reads was detected in the 15-cycle NCs, higher than the average of 6,543 detected in the 12-cycle NCs (Table S5). Nonetheless, the average read depth was only slightly elevated (6 reads vs. 5 reads for 15-cycle and 12-cycle NCs, respectively), while the maximum read depth (Table S5) detected in any NC was well below the established minimum read depth threshold of 220 reads [11]. Noise in samples was also evaluated by analysing signals of alternative bases (bases differing from the sample haplotype, excluding positions with point heteroplasmy): the average read depth of alternative signals was 44 reads for 12-cycle libraries and 52 reads for 15-cycle libraries. Although signals exceeding the 220-read threshold were detected in both cases, these would not impact variant calling and interpretation, since all such signals were either below the 3% analysis threshold [11], exhibited poor strand balance and/or displayed a low quality score; thus, they would be excluded from final variant calling. Increasing the number of amplification cycles produced larger library quantities, which ultimately improved the yield of sequencing data by enabling maximal use of the sequencing chemistry capacity, while still maintaining good run quality metrics (Table S6). Since the modified PCR conditions affected neither the level of noise nor variant calling at the established analysis and interpretation thresholds, they were deemed safe and were applied in subsequent sequencing runs.
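The exclusion logic described above can be sketched as a simple filter over alternative-base signals. The 220-read and 3% thresholds follow the text [11]; the strand-balance and quality cutoffs below are illustrative placeholders, not the study's actual values.

```python
# Sketch of the filtering applied to alternative-base signals: a signal is
# retained only if it clears the minimum read depth, the 3% analysis
# threshold, and strand-balance / base-quality checks. MIN_DEPTH and
# MIN_FRACTION follow the text [11]; the last two cutoffs are placeholders.

MIN_DEPTH = 220            # minimum read depth threshold [11]
MIN_FRACTION = 0.03        # 3% analysis threshold [11]
MIN_STRAND_BALANCE = 0.25  # placeholder: min share of reads on weaker strand
MIN_QUALITY = 30           # placeholder: min mean base quality score

def passes_filters(alt_reads, total_reads, fwd_reads, mean_quality):
    """Return True if an alternative-base signal survives all filters."""
    if alt_reads < MIN_DEPTH:
        return False
    if alt_reads / total_reads < MIN_FRACTION:
        return False
    strand_balance = min(fwd_reads, alt_reads - fwd_reads) / alt_reads
    if strand_balance < MIN_STRAND_BALANCE:
        return False
    return mean_quality >= MIN_QUALITY

# A signal above 220 reads but below 3% of total depth is still excluded:
print(passes_filters(alt_reads=300, total_reads=15000,
                     fwd_reads=150, mean_quality=35))
```

The point of the cascade is that raw read depth alone never promotes a signal: it must also clear the relative-frequency, strand, and quality checks before entering variant calling.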
3.3. Comparison of library normalization methods
Sequencing metrics such as cluster density, clusters passing filter and base quality depend mostly on the loading concentration of pooled libraries (which is, in turn, most influenced by the accuracy of library quantification [16]), and are thus not directly dependent on the normalization method. However, the chosen method of library normalization may greatly impact the proportion of reads per sample (expressed as “% reads identified”), i.e. the uniformity of sample representation. Naturally, greater uniformity between samples means a more even distribution of reads per sample and, consequently, sufficient read depth across the sequenced targets. The greatest risk of a low proportion of reads in a sample is losing valuable information from regions that receive very few or no reads, making detection and interpretation of variants in those regions increasingly difficult, and eventually requiring repeated library preparation and sequencing – with additional costs of reagents and consumables.
Two library normalization methods were compared: magnetic bead-based normalization versus “standard” normalization (i.e. quantification of libraries followed by individual normalization). From the standard deviation of % reads identified, with corresponding coefficients of variation (Table S7), it is evident that magnetic bead normalization introduced greater variation into the distribution of reads per library (Fig. 2). This observation is concordant with previous reports [4], and is likely caused by the sensitivity of magnetic beads to the numerous handling steps (dependent on the accuracy, precision, speed and dexterity of the particular analyst). Even though normalization beads are included in the Nextera XT library preparation kit and incur no additional expense, LabChip quantification and individual, library-by-library normalization allow faster processing of larger sample batches. The latter method also requires less hands-on time (thereby reducing the risk of cross-contamination), and provides both concentration/molarity and fragment-distribution information for each library, so the risk of (costly) repeated sequencing is greatly diminished. Additionally, it enables more flexible regulation of run plexity, which is particularly relevant for achieving the desired read depth for certain applications (e.g. detection of novel variants requires a higher read depth than population studies, which means sequencing fewer samples per run).
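The uniformity comparison reduces to a few lines of arithmetic: the coefficient of variation of % reads identified per library, as summarized in Table S7. A minimal sketch; the two example distributions below are illustrative, not the values measured in this study.

```python
# Sketch: comparing uniformity of read distribution between two
# normalization methods via the coefficient of variation (CV) of
# "% reads identified". Example values are hypothetical.

from statistics import mean, stdev

def cv_percent(values):
    """Coefficient of variation (%) = sample SD / mean * 100."""
    return stdev(values) / mean(values) * 100

beads_norm = [2.1, 6.8, 3.0, 5.5, 1.4, 4.9]      # hypothetical % reads/library
standard_norm = [3.8, 4.3, 4.0, 4.1, 3.9, 4.2]   # hypothetical % reads/library

print(f"beads CV:    {cv_percent(beads_norm):.1f}%")
print(f"standard CV: {cv_percent(standard_norm):.1f}%")
```

A higher CV means some libraries capture a disproportionate share of reads, which is exactly the failure mode that forces costly repeat sequencing of under-represented samples.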