Study design
In order to test the sensitivity and coverage of clinical WGS, Genome in a Bottle (GIAB) sample HG001 (NA12878), HG005 (NA24631)/HG006 (NA24694)/HG007 (NA24695) (known as the Chinese son/father/mother), and YH cell line (a human lymphoblastoid cell line from first Asian genome donor) [30] were collected and sequenced using MGISEQ-2000 platform. All the samples used in this study are listed in Table 1.
Table 1
Sample Name | Sequencing | Platform | DNA input | PCR/PCR-free | Read length | Mean depth |
NA12878-1 | WGS | MGISEQ-2000 | 1 µg | PCR | PE100 | 197 |
NA12878-2 | WES | MGISEQ-2000 | 300 ng | PCR | PE100 | 220 |
YH | WGS | MGISEQ-2000 | 1 µg | PCR | PE100 | 151 |
NA24631 | WGS | MGISEQ-2000 | 1 µg | PCR | PE100 | 111 |
NA24694 | WGS | MGISEQ-2000 | 1 µg | PCR | PE100 | 115 |
NA24695 | WGS | MGISEQ-2000 | 1 µg | PCR | PE100 | 117 |
Figure 1 shows the overall design of this study. First, in order to assess the recommended depth for proband-only WGS in clinical diagnostics, the analysis of the sensitivity and positive predictive value (PPV) of high-confidence SNPs/indels, the sensitivity of CNV detection, and the depth and breadth of coverage for disease-associated genes and CNVs were performed using down-sampling samples of NA12878-1. The results of sensitivity and PPV from a single genome may be difficult to generalize to a range of samples [31]. So in this part, similar analyses were also performed for down-sampling samples of another high depth sequencing sample of YH. After determining the recommended mean DP for proband-only WGS, the Chinese trios (NA24631, NA24694 and NA24695) with recommended mean DP were used to test the sensitivity and PPV of trio-based WGS when taking advantage of the family-based trio design in clinical WGS. Using down-sampling samples of NA12878-1 and NA12878-2, we also compared the performance of WES and WGS using the recommended mean DP. Finally, we analyzed 8 clinical cases with known disease causing variants using our WGS pipeline (Fig. 1).
Sensitivity And Positive Predictive Value Of Variant Calls
To evaluate the performance of variant calls in detecting true genotypes, the high-confidence calls (SNPs and indels) for GIAB sample HG001 (NA12878) and HG005/HG006/HG007 (Chinese son/father/mother) were considered as true-positive calls (v3.3.2) [35]. We restricted the calculation of sensitivity (high-confidence calls detected by our method/all the high-confidence calls in GIAB) and PPV (high-confidence calls detected by our method/all the variants detected by our method) of variant calls to the high confidence region (v3.3.2) [35]. High-confidence variant calls and regions tend to include a subset of variants and regions that are easier to detect [35]. The sensitivity and PPV of variant calls in YH sample were also evaluated. The alleles validated by the Illumina 1M BeadChip were considered as “true-positive” calls for YH [30]. Genotype quality (GQ) and DP were used to filter out variants with erroneous variant calls.
Sensitivity Of Cnv Detection
In order to detect the sensitivity of proband-only WGS for CNV detection. The 12 down-sampling samples of NA12878-1 (10X-150X) with increasing mean DP were evaluated. CNVnator (read depth) [32], BreakDancer (read pair) [33] and LUMPY (read depth and read pair) [34] were used for the detection of CNVs for the 12 down-sampling samples. In this study, a total of 3 CNV call datasets were assessed. The overall sensitivity of CNVnator, BreakDancer and LUMPY of the 12 down-sampling samples for CNV call set 1 (BreakDancer only included the detection of deletions in CNV call set 1), CNV call set 2 and CNV call set 3 was shown in Fig. 3. In general, CNV calling is reliable with increasing DP (Fig. 3a-c). At increasing sequencing depths, the trends of the sensitivity curves for the 3 CNV tools were different from one another. CNVnator showed a wide range in the sensitivity with varying DP, and the sensitivity visibly increased with mean depth, which indicates that the sensitivity of CNV detection is positively correlated with sequencing depth.
We also observed that the size of CNVs may influence the sensitivity of CNV tools. The performance of each tool varied along with the size of CNVs (Supplementary Fig. 1–9). Taking the deletions in CNV call set 1 as an example, the widely-used tool CNVnator may not be suitable for CNV detection when the size of CNV is less than 1 kb. When the CNV size is less than 1 kb, the sensitivity significantly increased with sequencing depth from 10X to 30X but reached a plateau at a ~ 40X depth (Supplementary Fig. 10). This indicated that when the CNV size is less than 1 kb, the detection rate is greatly influenced by sequencing depth, which is less obvious when the CNV size is between 6–70 kb. While, BreakDancer provided a better performance for deletion detection in all size of CNVs. These results suggested that, clinical scientists should pay more attention to the selection of CNV tools when focusing on different CNV sizes.
The selection of CNV call set and CNV detection tools may influence the sensitivity of CNV detection, making the assessment of the recommended depth for CNV detection of proband-only WGS difficult. In order to investigate the minimum requirement of mean DP for CNV detection in proband-only WGS, using 3 CNV call sets (CNV call set 1, 2, 3) and the detection results of 3 CNV tools (CNVnator, BreakDancer and LUMPY), we defined a “miss detection index” (MDI) value in this study. The MDI value for a specific mean DP is defined as the frequency when the specific mean DP shows the “lowest” sensitivity for different CNV size in a CNV call set. Without regard to selection of CNV call set and CNV detection tools, MID can be used to evaluate the recommended depth for CNV detection of proband-only WGS.
In the formula, M means the number of times when mean depth i shows the “lowest” sensitivity of CNV detection, N means the total number of times for all the depth that shows the “lowest” sensitivity of CNV detection. To obtain qualified CNV sizes in a CNV call set for evaluation, some criteria need to be fulfilled for a CNV size (Supplementary material). Detailed criteria and examples of the calculation of MDI can be found in Supplementary material.
As a result, the MDI value of a depth of 10X, 20X, 30X and 40X ranked first in the down-sampling samples of NA12878-1 (Fig. 3d-f). 10X-40X account for more than 71.98% of the total depth. Taking together the sensitivity of detecting CNVs and sequencing costs, a sequencing depth of ~ 40X provides the best value for CNV detection, as is indicated by the trends in the sensitivity curves (Fig. 3a-c).
Depth and breadth of coverage for disease-associated genes and CNVs
Although WGS is better than WES for variation detection in patients with genetic disorders, the coverage of coding exons in key disease-associated genes of WGS has not been fully evaluated. In order to investigate the breadth of coverage of proband-only WGS for disease-associated genes, the breadth of coverage of 6 gene sets for the 12 down-sampling samples of NA12878-1 (10X-150X) and YH with increasing mean depth were evaluated. For each exons of the coding genes, we calculated the percent of exonic bases covered at more than 10X depth, which was reported to provide 95% sensitivity for heterozygous SNVs in WES [4]. None of the 12 down-sampling samples of NA12878-1 and YH covered 100% of the coding exons in the 6 gene sets except for the ACMG59 gene set (Supplementary Table 2). The results of down-sampling samples of NA12878-1 seems slightly better than down-sampling samples of YH, probably caused by the total sequencing depth of NA12878-1 (~ 197) and YH (~ 151). Across the 6 gene sets, limited range of variation was found in the down-sampling samples when the mean depth is more than 40X (Supplementary Table 2).
As for the ACMG 59 genes, we also observed a range of variation in the breadth of coverage for the 12 down-sampling samples. As such, a mean depth of more than 70X for YH and 90X for NA12878-1 covered 100% for all the ACMG 59 genes. Mean depth of 30X to 50X were most widely used for WGS [24, 25]. The proportion of genes in ACMG59 gene set covered 100% at > = 10X was 93.22%, 98.31% and 96.61% for NA12878-1_30X, NA12878-1_40X and NA12878-1_50X (with mean depths of ~ 30X, ~ 40X and ~ 50X). For mean depth of ~ 30X, ~ 40X and ~ 50X for YH, we observed that 86.44%, 93.22% and 98.31% of genes were covered 100% at > = 10X. The breadths of coverage are significantly better when the average sequencing depth is more than 40X (Fig. 4a). Sites of all genes are covered more than 99.9% when the sequencing depth is ~ 40X. Interestingly, we also observed poorly covered RYR1 and TGFBR1 gene in this study (Fig. 4a) comparing to a previously published paper for measuring sensitivity and coverage of clinical WES [4], indicating that the poor coverage of RYR1 and TGFBR1 might be caused by the features of the gene regions, not the sequencing methods used. Clinical scientists need to pay more attention to these genes when performing clinical WGS. Considering the cost of sequencing, a sequencing depth of ~ 40X provides the best value for the coverage of the ACMG 59 gene set, as is indicated by the trends in breadth of coverage value of the ACMG 59 genes. We observed similar patterns for down-sampling samples of YH (Supplementary Table 3). We also examined the percentage of genes at > = 20X coverage in the ACMG 59 gene set, which could provide 99% sensitivity for heterozygous SNVs [4]. 81.36% and 59.32% of genes were covered 100% for NA12878-1 and YH when the mean DP is ~ 40X.
CNVs are another major part of human genetic variation, which are often found to be associated with human diseases. For the CNVs in the DECIPHER database, most CNVs can be well covered (more than 95% coverage) at > 10X depth when the sequencing depth is 40X-50X for NA12878-1 (Fig. 4b). Clinical scientists should pay more attention to 22q11.2 distal deletion syndrome, Cat-Eye Syndrome Type I and Miller-Dieker syndrome, which showed a coverage of 88.96% and 4.54% even with sequencing depth of 100X. Similar patterns were also observed for down-sampling samples of YH (Supplementary Table 4).
Sensitivity And Ppv Among Trios
The purpose of the analysis of trios, is to test the sensitivity and PPV when taking advantage of the family-based trio information in clinical WGS. Although the sequencing depth of the Chinese trio is more than 100, here we used down-sampling samples with a mean depth of 44.00X, 43.77X and 43.05X for NA24631, NA24694 and NA24695. The reason why we concentrate on the depth of ~ 40X is that, ~ 40X depth is the most widely used and recommended depth for WGS, which is also consistent with some of our previous results, especially for CNV detection.
In this study, we took advantage of the family-based trio design (Fig. 5a) to calculate the sensitivity and PPV in high confidence regions (NISTv3.3.2/GRCh37) [35]. The analysis was restricted to variants with DP > = 10X and GQ > = 20 (Fig. 5a). We focused on the loci where one parent was homozygous for the alt allele and the other was homozygous for the reference allele. For these variants, the offspring should be heterozygous. This could provide a new “gold standard set” for NA24631 (the offspring). Comparing to the high-confidence calls of NA24631 provided by NIST (the high-confidence call set), the new gold standard set could be used to test the sensitivity and PPV for trio-based WGS. Figure 5b showed the results of the sensitivity and PPV of the “gold standard set” and the high-confidence call set for NA24631. As a result, the sensitivity of the “gold standard set” for SNP and indel detection was > 99.48% and > 96.36% respectively, and PPVs were 99.86% and 97.93%. Trio-based analysis showed great improvement for the PPV of indel detection (Fig. 5b), PPV for indel detection improved from 89.68% (using the high-confidence call set) to 97.93% (using the “gold standard set”).
Wgs And Wes
Here, we evaluated the performance of WES (MGIEasy Exome FS Library Prep Set) and WGS (MGIEasy PCR-Free DNA Library Prep Set) on DNA samples of NA12878. NA12878-1_40X and NA12878-2_120X were used for evaluation. NA12878-1_40X was the down-sampling sample of NA12878-1 with a mean DP of 40X. NA12878-2_120X was an down-sampling sample of NA12878-2 with a mean DP of 120X, which is typical and current standard for clinical WES [36]. Two quality parameters for variation detection (DP and GQ), sensitivity for SNV/indels detection, breadth of coverage of the list of 8,394 putative disease-associated genes and CNVs were compared in this part.
DP and GQ are two main parameters assessing quality of variant calls, which are often used to filter out variants with erroneous variant calls [30, 37]. First, we investigated the GQ and DP distribution of NA12878-1_40X and NA12878-2_120X in the regions of the human genome covered by WES (59,082,036 bp). We found similar results to those obtained in a previous WES study [8]. The distribution of DP for the variants ranged more widely in NA12878-2_120X than in NA12878-1_40X (Supplementary Fig. 11), with a median at 94X depth but a mode at 63X and 58X for SNPs and indels, indicating low levels of coverage for a substantial proportion of variants. While the distribution of DP was normal-like for NA12878-1_40X, with a median at 42X and coinciding mode at 41X for both SNPs and indels (Supplementary Fig. 11). The vast majority of variants called by NA12878-2_120X had a GQ close to 100, and fluctuated along with the GQ scores. The distribution of GQ for variants in NA12878-1_40X showed a mode in low GQ area, probably caused by insufficient variant calls in the WES regions.
For the detection of SNVs and indels, “true positive” calls were further restricted to the regions of the human genome covered by WES (59,082,036 bp). A total of 44,726 variants were used for evaluation. DP and GQ filtering were not used in this part. In general, the sensitivity and PPV of WGS (NA12878-1_40X) is higher than WES (NA12878-2_120X) for the 44,726 variants (Supplementary Table 5), except for the PPV for homozygous indel detection. For 90.39% of gold SNPs, WES and WGS yielded the same genotype. More than 63.41% of these concordant SNVs were identified as heterozygous, which was similar to those obtained in previous WES studies [4, 15, 25].
Then, we investigated the breadth of coverage of NA12878-1_40X and NA12878-2_120X in the 8,394 putative disease-associated genes and CNVs (DECIPHER database). As a result, WGS showed better coverage of both putative disease-associated genes and CNVs. More than 99.77% sites of the exon regions of the 8,394 putative disease-associated genes were covered with a depth of > = 10 for NA12878-1_40X, while NA12878-2_120X covered 99.46% of the exon regions. NA12878-2_120X was poorly covered for CNVs in the DECIPHER database (Supplementary Table 6). More than 69.69% of the CNVs showed a coverage of less than 10%.
Analysis Of 8 Clinical Cases With Known Disease-causing Variants
In this study, samples of 8 clinical cases with known variants of various types were recruited and reanalyzed using our WGS pipeline. All the 12 variants were validated previously by methods other than MPS technology. Focusing on SNVs, indels and CNVs, we applied our method to 8 clinical cases using singleton WGS. Variants were manually assessed for quality and interpreted according to the American College of Medical Genetics and Genomics (ACMG) guidelines for variant classification [27, 28]. All the previously validated variants (Supplementary Table 7) were successfully detected by our WGS pipeline, which further demonstrated the sensitivity of the method.
Based the sequencing DP, the average cost for the WGS approach was ~$490 per sample for the 8 clinical cases in current study. The overall cost (from DNA extraction to reporting) for a single case with a mean sequencing depth of ~ 40X is approximately $600, including ~$280 for chemicals (DNA extraction, library construction and sequencing), ~$220 for labor, and ~$100 for depreciation expenses.