This study evaluated the sensitivity and specificity of, and the concordance between, automatic identification using the Ion S5 system and manual identification using the MiSeq system. In the first phase, we compared the sensitivity and specificity of the two systems using karyotypically defined cell line mixtures. In the second phase, we calculated the concordance between the two systems in 107 clinical trophoblast biopsies. The sensitivity and specificity of the two systems were comparable. Concordance per embryo and per chromosome pair was high when the same calling criteria were applied.
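As an illustration of the metrics named above, the following is a minimal sketch of how sensitivity, specificity, and concordance could be computed from paired calls. The function names and the toy inputs are hypothetical and are not taken from the study data; calls are assumed to be reduced to one label per embryo (or per chromosome pair).

```python
def sensitivity_specificity(truth, calls):
    """Return (sensitivity, specificity) for boolean aneuploidy calls.

    truth: True where the sample is known to be aneuploid (e.g. by karyotype).
    calls: True where the platform called the sample aneuploid.
    """
    pairs = list(zip(truth, calls))
    tp = sum(1 for t, c in pairs if t and c)          # aneuploid, called aneuploid
    fn = sum(1 for t, c in pairs if t and not c)      # aneuploid, missed
    tn = sum(1 for t, c in pairs if not t and not c)  # euploid, called euploid
    fp = sum(1 for t, c in pairs if not t and c)      # euploid, called aneuploid
    return tp / (tp + fn), tn / (tn + fp)


def concordance(calls_a, calls_b):
    """Fraction of units (embryos or chromosome pairs) with identical calls."""
    if len(calls_a) != len(calls_b):
        raise ValueError("both platforms must report the same units")
    matches = sum(1 for a, b in zip(calls_a, calls_b) if a == b)
    return matches / len(calls_a)


# Hypothetical toy data, for illustration only.
truth = [True, True, False, False]
platform_calls = [True, False, False, False]
sens, spec = sensitivity_specificity(truth, platform_calls)

per_embryo = concordance(
    ["euploid", "aneuploid", "aneuploid"],
    ["euploid", "aneuploid", "euploid"],
)
```

The same `concordance` helper applies per embryo or per chromosome pair; only the unit that each list element represents changes.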
NGS technology has been shown to be consistent with many other CCS platforms used in PGT-A [10, 15, 23, 24]. Early assessments of NGS for PGT-A examined the concordance between NGS systems and 24-chromosome aCGH [10, 23]. In those assessments, the sensitivity and specificity of NGS were high, and the broader dynamic range of CNV status generated by the NGS interpretation software simplified the identification of chromosome ploidy. Subsequently, investigators studied segmental or mosaic aneuploidy using NGS and validated these observations on a third platform, such as FISH or SNP arrays [3, 13]. These articles demonstrated that segmental aneuploidy and diploid/aneuploid mosaicism could be identified using NGS, but that not every observed variation was reliable [17]. WGA artifacts, the algorithms selected for calculation, or the identification approach can lead to false positives [16]. The present study focused on two distinct identification approaches: automatic calling using the Ion Reporter software in the Ion S5 system, and manual calling using the BlueFuse Multi software in the MiSeq system. Although WGA in both systems was based on the modified Rubicon PicoPLEX kit (Takara Bio, Kyoto, Japan), their library preparation procedures differed. We therefore split the pre-amplified products into two aliquots for the parallel comparison. Although this reduced the initial amount of DNA available to each system, it allowed the performance of the libraries to be evaluated independently on each system.
With default sequencing settings, individual read lengths ranged from 100 to 150 bp on the Ion S5, whereas the MiSeq generated uniform 36 bp reads. Although the read lengths differed, the distribution of read counts within unit intervals (set as 1 Mb) across a particular region displayed almost the same pattern on the two systems (Fig. 3). Their quality metrics were fundamentally similar but not identical, owing to the specific sequencing principles underlying each system [16]. For CNV region assessment, the Ion Reporter applied a hidden Markov model (HMM) to predict CNV and whole-number ploidy status, while the BlueFuse Multi used its own algorithm. To measure background noise in individual samples, the Ion Reporter displayed the median of the absolute values of all pairwise differences (MAPD), while the BlueFuse Multi reported the derivative log2 ratio (DLR), which reflects the spread of the difference in CNV between all bins within a chromosome (Detection of aneuploidy in a single cell using the Ion ReproSeq PGS View Kit, Application Note, Thermo Fisher Scientific; BlueFuse Multi v4.5, Software Guide, Illumina).
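To make the two noise metrics described above concrete, the following is a minimal sketch under stated assumptions. It assumes that "pairwise differences" are taken between adjacent bins of log2 ratios, and it uses one common formulation of DLR spread (standard deviation of adjacent differences divided by the square root of 2). Neither is confirmed here as the exact calculation performed by Ion Reporter or BlueFuse Multi, and the function names are hypothetical.

```python
import math
import statistics


def log2_ratios(sample_counts, reference_counts):
    """Per-bin log2 ratio of sample read counts to reference read counts.

    Assumes both lists cover the same bins (e.g. 1 Mb intervals) and that
    all counts are positive.
    """
    return [math.log2(s / r) for s, r in zip(sample_counts, reference_counts)]


def mapd(ratios):
    """Median of absolute differences between adjacent bins (assumed pairing)."""
    diffs = [abs(b - a) for a, b in zip(ratios, ratios[1:])]
    return statistics.median(diffs)


def dlr_spread(ratios):
    """Spread of adjacent-bin differences, scaled by 1/sqrt(2).

    This is one common formulation of a derivative log ratio spread; it is an
    assumption, not the vendor's documented formula.
    """
    diffs = [b - a for a, b in zip(ratios, ratios[1:])]
    return statistics.pstdev(diffs) / math.sqrt(2)


# Hypothetical log2 ratios for four adjacent bins.
ratios = [0.0, 0.1, -0.1, 0.0]
noise_mapd = mapd(ratios)
noise_dlr = dlr_spread(ratios)
```

Both metrics are built on bin-to-bin differences rather than on the ratios themselves, so a true whole-chromosome gain or loss shifts the baseline without inflating the noise estimate.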
Common calling criteria, determined from the four cell line models, were applied in the parallel comparison of this study. With these criteria, high concordance between automatic and manual identification was obtained in the clinical samples. However, some differences remained between the two approaches even though common criteria were used. First, the Ion Reporter provided a tunable analysis workflow followed by automatic identification within that framework, whereas the BlueFuse Multi allowed operators to make manual calls based on its own default settings, which cannot be changed (Ion Reporter™ 5.10 User Guide, Thermo Fisher Scientific; BlueFuse Multi v4.5, Software Guide, Illumina). Therefore, some parameters could not be completely synchronized between the two software packages, such as the transition penalty, which governs how readily a change in ploidy status is called between two adjacent data points. Second, manual intervention is not required during automatic identification by the Ion Reporter, whereas it is necessary in the BlueFuse Multi when the technician observes a deviation from the default line representing a copy number of two. In some samples with ambiguous patterns, both masking by automatic identification and subjective conclusions from manual identification may occur if the result is not validated using a second methodology. It is also important to note that the fidelity of samples with borderline aneuploid levels could have been affected by the separation of pre-amplified products in the present study.
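The effect of a transition penalty, as discussed above, can be illustrated with a simplified dynamic-programming segmentation. This is a sketch only and is not the Ion Reporter HMM: the squared-error cost, the candidate copy number states, and the function name are all assumptions made for illustration.

```python
def segment_copy_number(values, states=(1, 2, 3), transition_penalty=1.0):
    """Assign one copy number state per bin by minimising total cost.

    Cost = squared distance between each observed value and its state,
    plus transition_penalty for every change of state between adjacent bins.
    """
    if not values:
        return []
    # cost[s]: minimal cost of any path ending in state s at the current bin
    cost = {s: (values[0] - s) ** 2 for s in states}
    back = []  # back[i][s]: best predecessor of state s at bin i + 1
    for v in values[1:]:
        new_cost, pointers = {}, {}
        for s in states:
            def step(p, s=s):
                # cost of arriving in s from predecessor p
                return cost[p] + (transition_penalty if p != s else 0.0)
            best_prev = min(states, key=step)
            new_cost[s] = step(best_prev) + (v - s) ** 2
            pointers[s] = best_prev
        back.append(pointers)
        cost = new_cost
    # Trace back from the cheapest final state.
    state = min(states, key=lambda s: cost[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical per-bin copy number estimates with a short two-bin gain.
bins = [2.0, 2.0, 2.9, 3.0, 2.0, 2.0]
sensitive_calls = segment_copy_number(bins, transition_penalty=0.1)
conservative_calls = segment_copy_number(bins, transition_penalty=2.0)
```

With the low penalty the short gain is called as a separate segment, while the high penalty absorbs it into the surrounding copy number two baseline. This is why a penalty that is tunable in one software package but fixed in the other can produce different calls from similar data.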
In general, batch-to-batch automatic identification was a faster and more standardized approach, but it was sometimes less flexible for individual samples. The major advantage of the automatic workflow provided by the Ion S5 is the reduction in manipulation time and reporting time, and thus in turnaround time. In terms of manipulation time, the Ion S5 system combines the WGA and library procedures and leaves the remaining steps to the automated Ion Chef instrument. In contrast, the MiSeq system takes nearly twice the manipulation time because WGA and library preparation are separate procedures. In terms of reporting time, automatic identification on the Ion S5 system quickly completes typical ploidy calling for a batch, although additional manual rechecks may be required for some ambiguous results, whereas manual identification on the MiSeq system requires each sample to be checked individually and therefore takes longer.
In conclusion, this is the first study to compare automatic and manual identification on the Ion S5 and MiSeq NGS systems for PGT-A. The sensitivity and specificity of the two systems were comparable, and the concordance in the clinical samples was high. Automatic identification provides a faster and more standardized approach and thus represents a good option for high-throughput laboratories.