High Concordance Between Automatic Identification in Ion S5 and Manual Identification in Miseq During Next-Generation Sequencing for Preimplantation Genetic Testing of Aneuploidy

The Ion S5 (Thermo Fisher Scientific) and Miseq (Illumina) NGS systems are both widely used in clinical laboratories conducting PGT-A. The two systems employ different library preparation steps, sequencing principles, and data processing algorithms. Automatic interpretation via the Ion Reporter software (Thermo Fisher Scientific) and manual interpretation via the BlueFuse Multi software (Illumina) of chromosomal copy number variation (CNV) represent very different reporting approaches. It is therefore of interest to compare their ploidy detection ability as PGT-A/NGS systems. In the present study, four aneuploid cell lines were individually mixed with a diploid cell line at aneuploid ratios of 0% (0:5), 10% (1:9), 20% (1:4), 40% (2:3), 50% (3:3), 60% (3:2), 80% (4:1) and 100% (5:0) to assess the sensitivity and specificity for whole-chromosome and segmental aneuploidy detection. Clinical biopsies of 107 blastocysts from 46 IVF/PGT-A cycles recruited between December 2019 and February 2020 were used to calculate the concordance. Initially, the pre-amplified products were divided into two aliquots for the different library preparation procedures of each system. Applying the same calling criteria, automatic identification was achieved through the Ion Reporter, while well-trained technicians manually identified each sample through the BlueFuse Multi. The results showed that both systems reliably distinguished the chromosomal CNV of mixtures with at least 10% aneuploidy from karyotypically normal samples ([Ion S5] whole-chromosomal duplication: 2.14 vs. 2.05, p-value=0.009, segmental deletion: 1.88 vs. 2.05, p-value=0.003; [Miseq] whole-chromosomal duplication: 2.12 vs. 2.03, p-value=0.047, segmental deletion: 1.82 vs. 2.03, p-value=0.002). The sensitivity and specificity were comparable between the Ion S5 and Miseq ([sensitivity] 93% vs. 90%, p-value=0.78; [specificity] 100% vs. 100%, p-value=1.0).
In the 107 clinical biopsies, three displayed chaotic patterns (2.8%) for which the ploidy could not be interpreted. The ploidy concordance was 99.04% (103/104) per embryo and 99.47% (2265/2277) per chromosome pair. Since the detection abilities of the two systems proved similar, automatic identification on the Ion S5 system offers comparatively faster and more standardized performance. This study evaluated the sensitivity, specificity, and concordance between automatic identification using the Ion S5 system and manual identification using the Miseq system. In the first phase, we compared the sensitivity and specificity of the two systems via karyotypically defined cell line mixtures. In the second phase, we calculated the concordance between the two systems in 107 clinical trophoblast biopsies. The sensitivity and specificity of the two systems were comparable. The concordance per embryo and per chromosome pair was high using the same calling criteria.


Introduction
Preimplantation genetic testing for aneuploidy (PGT-A) was developed to detect imbalanced chromosome numbers in early-stage embryos during IVF and has been continuously improved. It is used to identify chromosomal copy number variation (CNV) in a single embryo biopsy via different comprehensive chromosome screening (CCS) tools 1. Initially, the fluorescence in situ hybridization (FISH) technique was employed, but it led to poor clinical outcomes because only a few chromosome pairs could be examined 2. Following its emergence, array comparative genomic hybridization (aCGH) soon became widely used for 24-chromosome CNV analysis 3. Simultaneously, trophectoderm (TE) biopsy of blastocysts was gradually developed and shown to overcome the mosaicism issue in biopsies of day 3 cleavage-stage embryos 4,5. Therefore, analysis of TE biopsies using aCGH was commonly accepted by laboratories conducting PGT-A. In addition to aCGH, quantitative polymerase chain reaction (qPCR) and single nucleotide polymorphism microarrays (SNP arrays) have also been demonstrated as CCS methodologies for embryo ploidy 6,7,8.
In 2013, next-generation sequencing (NGS) for chromosomal CNV analysis was introduced in PGT-A 9. It offers the advantages of high throughput and increased flexibility of data analysis, and has therefore efficiently reduced costs and enhanced sensitivity 10. To analyze samples containing only 5 to 10 cells, the biopsies must first undergo whole-genome amplification (WGA). The amplified DNAs are then pooled to create a library for massive sequencing. In massive sequencing, the number of reads generated determines how much information can be obtained from each individual sample and how many samples can be tested simultaneously in a single run 11. To calculate CNV, each chromosome is divided into several intervals of appropriate unit length, so-called 'bins' or 'tiles,' and the reads that pass quality assurance metrics are mapped to the human reference genome according to these intervals. Then, the bin count data are calculated, corrected, and smoothed using commercialized algorithms specific to the different interpretation software. Chromosomal CNV is distinguished by its deviation from the default copy number of two 10.
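The binning step described above can be illustrated with a minimal sketch. It is not the proprietary algorithm of either the Ion Reporter or the BlueFuse Multi; the function names, the normalization against a euploid reference sample, and the median-centering to copy number two are all assumptions made for illustration, and the correction and smoothing steps are omitted.

```python
from collections import Counter
from statistics import median

BIN_SIZE = 1_000_000  # 1 Mb unit length, as used in this study


def bin_counts(read_positions, bin_size=BIN_SIZE):
    """Count mapped reads per (chromosome, bin index).

    read_positions: iterable of (chromosome, 0-based start position) pairs
    for reads that already passed quality filtering.
    """
    counts = Counter()
    for chrom, pos in read_positions:
        counts[(chrom, pos // bin_size)] += 1
    return counts


def copy_number_per_bin(sample_counts, reference_counts):
    """Estimate a copy number per bin (hypothetical normalization).

    Each bin's sample/reference ratio is scaled so that the median bin
    corresponds to the default copy number of two.
    """
    ratios = {
        b: sample_counts.get(b, 0) / ref
        for b, ref in reference_counts.items()
        if ref > 0
    }
    centre = median(ratios.values())
    if centre == 0:
        raise ValueError("sample has no usable coverage")
    return {b: 2.0 * r / centre for b, r in ratios.items()}


# Toy example: four euploid bins on chr1 and one duplicated bin on chr2.
reference = {("chr1", i): 100 for i in range(4)}
reference[("chr2", 0)] = 100
sample = dict(reference)
sample[("chr2", 0)] = 150
copy_numbers = copy_number_per_bin(sample, reference)
print(copy_numbers[("chr2", 0)])
```

In this toy input the chr2 bin carries 1.5 times the reference read count, so it is reported at a copy number of three while the chr1 bins stay at two.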
Apart from detecting whole-chromosome aneuploidy, NGS technologies can also identify segmental or mosaic aneuploidies 12,13,14. Based on validation against karyotypically defined samples, NGS was proven able to detect these aneuploidies, though the sensitivity and specificity were highly dependent on the calling conditions 15. In addition to the applied criteria, technical variation due to WGA artifacts and masking effects conferred by the algorithms can also be sources of bias that reduce the accuracy of PGT-A 16,17. The Ion S5 system (Thermo Fisher Scientific, Waltham, MA, USA) and the Miseq system (Illumina, San Diego, CA, USA) are two major NGS platforms in PGT-A. The two systems use different library preparation protocols, sequencing principles, and commercial analysis software. The WGA and library preparation procedures are combined in the Ion S5 system, while they are separate in the Miseq system. The Ion S5 system conducts emulsion PCR amplification for library templating using the Ion Chef automatic machine, followed by hydrogen ion-detecting sequencing, whereas the Miseq system employs parallel bridge amplification for optics-based sequencing 18. In terms of throughput, the Ion S5 system can accommodate 16 to 96 samples per run depending on the chip applied, while the maximum is 24 samples per run with the commercial VeriSeq PGS kit on the Miseq system. In terms of data analysis, the sequences generated by the Ion S5 and Miseq undergo their own commercialized quality assurance metrics and are then interpreted using the Ion Reporter software (Thermo Fisher Scientific) and the BlueFuse Multi software (Illumina), respectively. The Ion Reporter performs aneuploid calling automatically, which can be tuned by a customized analysis workflow, while the BlueFuse Multi requires the operator to conduct manual, observational identification.
Since the Ion S5 and Miseq systems differ substantially in their identification approach, it is of interest to compare the ploidy detection ability of the two NGS systems in PGT-A. In this study, karyotypically defined cell lines were mixed to evaluate the sensitivity and specificity. Clinical trophoblast samples were then used to calculate the concordance per embryo and per chromosome pair.

Study design
This study was approved by the Ethics Review Committee of National Taiwan University Hospital. In the first phase, we employed a mixing experiment with five karyotypically defined cell lines to compare the sensitivity and specificity between the Ion S5 system (Thermo Fisher Scientific) and Miseq system

Clinical subjects
All couples involved in the study were initially counselled by the reproductive consultants. A complete explanation of the IVF/PGT-A process, including published values for the sensitivity and specificity of the Ion S5 and Miseq, the in-house percentage of failed amplification, and the inconclusive and aneuploid rates, was provided by the conducting laboratory for their consideration. Each enrolled couple signed the consent form for the study, which was previously approved by the Institutional Review Board of National Taiwan University, before entering the personalized controlled ovarian stimulation program 20.
Informed consent was obtained from all the participants, and all methods were performed in accordance with the relevant guidelines.
Retrieved metaphase II (MII) oocytes were fertilized by intracytoplasmic sperm injection (ICSI) and cultured to the blastocyst stage. Once the inner cell mass (ICM) of the blastocyst was graded above B according to the Gardner and Schoolcraft system 21 and a distinct cellular TE was evident, biopsy was performed using pipetting shearing (Origio, Måløv, Denmark).

Library preparation
Biopsied samples were thawed, lysed, and randomly fragmented using the extraction and preamplification master mix of the Ion SingleSeq® kit (Thermo Fisher Scientific). The total volume of 15 µL of fragmented products was separated into two aliquots: 7.5 µL for the combined WGA plus library preparation procedure on the Ion S5 system, and the other 7.5 µL for the separate WGA and library preparation procedures on the Miseq system.
To prepare the library for the Ion S5 system, the individual barcodes and amplification master mix of the Ion SingleSeq® kit were added to the pre-amplified products. A PCR program for both WGA and barcode ligation was then performed. The library amplicons were pooled and purified using AMPure® XP beads (Beckman Coulter, Pasadena, CA, USA), quantified using the high-sensitivity (HS) Assay Kit (Qubit®, Life Technologies, Waltham, MA, USA), and then diluted for templating on the Ion Chef® automatic machine (Thermo Fisher Scientific). The templated chip was sequenced using the Ion S5 (Ion ReproSeq PGS Kits-Ion S5 System User Guide).
To prepare the library for the Miseq system, the pre-amplified products were subjected to WGA using the amplification master mix of the Sureplex® Amplification kit (Illumina). The amplified products were quantified using the high-sensitivity (HS) Assay Kit (Qubit) and then diluted for library preparation. The amplicons underwent tagmentation, index ligation, purification with AMPure XP beads, and normalization, and were finally pooled for Miseq sequencing (VeriSeq PGS Library Prep Reference Guide).

NGS and CNV analysis
Data generated by the Ion S5 system were aligned to the human reference genome and passed through quality assurance metrics to remove low-quality and duplicate reads using Torrent Suite® (Thermo Fisher Scientific). The available reads were then analyzed using the Ion Reporter software (Thermo Fisher Scientific) to calculate CNV. The length of a single tile was set to 1 Mb, corresponding to the default unit length in the BlueFuse Multi software (Illumina). Aneuploid calling was initially accomplished by the Ion Reporter with a customized analysis workflow, followed by a proprietary in-house program for additional tuning.
Data generated by the Miseq system were processed and analyzed using the BlueFuse Multi software. The reads went through a series of quality assurance metrics that were similar, but not identical, to those of the Ion S5 system. To calculate CNV, every aligned read was assigned to a bin with the default length of 1 Mb. Aneuploid calling was conducted manually by well-trained technicians using BlueFuse Multi, based on the deviation from the default line at copy number two.

Assessment of sensitivity and specificity
The sensitivity and specificity were calculated using the karyotypically defined cell line mixtures. Sensitivity was defined as the number of tested samples containing aneuploid cells with positive aneuploid calls divided by the total number of tested samples containing aneuploid cells. Specificity was defined as the number of tested samples containing only diploid cells without positive aneuploid calls divided by the total number of tested samples containing only diploid cells.
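The two definitions above translate directly into a short calculation. The sketch below is illustrative only: the function name and the sample list are hypothetical and do not reproduce the study data.

```python
def sensitivity_specificity(samples):
    """Compute (sensitivity, specificity) from per-sample results.

    samples: iterable of (contains_aneuploid_cells, called_aneuploid)
    boolean pairs, one pair per tested cell line mixture.
    """
    samples = list(samples)
    true_pos = sum(1 for truth, call in samples if truth and call)
    false_neg = sum(1 for truth, call in samples if truth and not call)
    true_neg = sum(1 for truth, call in samples if not truth and not call)
    false_pos = sum(1 for truth, call in samples if not truth and call)

    positives = true_pos + false_neg   # mixtures containing aneuploid cells
    negatives = true_neg + false_pos   # mixtures containing only diploid cells
    sensitivity = true_pos / positives if positives else float("nan")
    specificity = true_neg / negatives if negatives else float("nan")
    return sensitivity, specificity


# Hypothetical example: 10 aneuploid-containing mixtures (9 called),
# 5 purely diploid mixtures (none called).
example = [(True, True)] * 9 + [(True, False)] + [(False, False)] * 5
print(sensitivity_specificity(example))
```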

Concordance analysis
Concordance was calculated per embryo and per chromosome. First, we analyzed the concordance between the two systems based on the ploidy conclusions for the embryos (euploid or aneuploid). Second, we analyzed the concordance for individual chromosomes (diploid or aneuploid). A chromosome called abnormal on both systems but with different aneuploid percentages was counted as concordant, since the aneuploid percentage can be affected by several technical factors, such as the efficiency of WGA and masking by the data processing procedure.
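The concordance rule described above can be sketched as follows, assuming each system's calls are stored as a mapping from chromosome to a category label, with `None` marking an inconclusive result. The labels, the data layout, and the function names are assumptions for illustration.

```python
def is_abnormal(call):
    """Any call other than 'diploid' counts as mosaic/aneuploid."""
    return call != "diploid"


def concordance(calls_a, calls_b):
    """Return (concordant, conclusive) counts for two systems.

    calls_a, calls_b: dict mapping chromosome -> category label,
    or None when the result is inconclusive on that system.
    Two calls agree when both are diploid or both are abnormal,
    regardless of the aneuploid percentage behind the label.
    """
    conclusive = [
        chrom for chrom in calls_a
        if chrom in calls_b
        and calls_a[chrom] is not None
        and calls_b[chrom] is not None
    ]
    concordant = sum(
        1 for chrom in conclusive
        if is_abnormal(calls_a[chrom]) == is_abnormal(calls_b[chrom])
    )
    return concordant, len(conclusive)


# Hypothetical calls for four chromosomes of one embryo.
system_a = {"1": "diploid", "16": "aneuploid", "21": "low-rate mosaic", "X": None}
system_b = {"1": "diploid", "16": "high-rate mosaic", "21": "diploid", "X": "diploid"}
print(concordance(system_a, system_b))
```

Here chromosome 16 is concordant despite the different labels, chromosome 21 is discordant, and the inconclusive X call is excluded from the denominator.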

Statistical analysis
Count data are displayed as percentages, and continuous data as means and standard deviations (SD). Groups were compared using the Chi-square or Fisher's exact test. Significance was defined as a p-value less than 0.05. All analyses were conducted using GraphPad software (Prism, GraphPad Software, La Jolla, CA, USA).
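For readers who want to reproduce this kind of comparison without GraphPad, a two-sided Fisher's exact test on a 2×2 table can be written with the Python standard library alone. This is a sketch of the textbook test, not the authors' analysis; the function name is hypothetical and the counts in the example are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Probability that the top-left cell equals k given fixed margins.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    p_value = sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical counts: detected vs. missed samples on two systems.
print(fisher_exact_two_sided(3, 1, 1, 3))
# Identical proportions on both systems give a p-value of 1.
print(fisher_exact_two_sided(5, 0, 5, 0))
```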

Patient profiles
Forty couples undergoing IVF/PGT-A were enrolled in this study (mean maternal age: 37.2 years, SD: 4.3 years), including 32 couples using their own oocytes (mean maternal age: 37.2 years, SD: 4.1 years) and 8 couples using donated oocytes (mean donor age: 23.8 years, SD: 2.0 years). In terms of indications, 20 couples had advanced maternal age (> 36 years) (50%); 6 couples had severe male factors (15%); 6 couples had a history of repeated implantation failure (15%); and 8 couples using donated oocytes (20%) chose to undergo PGT-A for single embryo transfer (SET). The detailed numbers of retrieved MII oocytes, normally fertilized oocytes (two pronuclei, 2PN), derived blastocysts for biopsy, and biopsied samples for NGS testing are displayed in Table I. One hundred and eight blastocysts were biopsied. One biopsy failed to be amplified (0.9%, 1/108), and a total of 107 clinical biopsies were subjected to NGS analysis.

Assessment of sensitivity and specificity
Four karyotypically defined aneuploid cell lines were individually mixed with a diploid cell line to simulate mosaic samples with different types and levels of aneuploidy: 0%, 10%, 20%, 40%, 50%, 60%, 80%, and 100%. Figures 1a and 1b display the correlation between the aneuploid percentage generated by the mixing experiment and the calculated copy numbers of the affected aneuploid regions determined by the Ion S5 system (based on automatic identification using the Ion Reporter) and by the Miseq system (based on manual identification using the BlueFuse Multi). Since detection of segmental deletion was more challenging than that of whole-chromosome duplication, the standard deviation displayed a wider range of variation in the box plots. The overall sensitivity of the two NGS systems at different aneuploid percentages is displayed in the bar chart of Fig. 2a and did not differ significantly between the Ion S5 and Miseq (93% vs. 90%, p-value = 0.78). The overall specificity was 100% on both systems (Fig. 2b), and no significant difference was observed, either (p-value = 1.00).
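The relationship between the mixing ratio and the expected copy number of the affected region can be sketched as a simple linear mixture. This assumes that every cell contributes equally to the amplified DNA and that the aneuploid line carries one extra copy (duplication) or one missing copy (deletion) of the region; both assumptions and the function names are illustrative rather than taken from the study.

```python
def aneuploid_fraction(aneuploid_parts, diploid_parts):
    """Convert a mixing ratio such as 1:9 into an aneuploid fraction."""
    return aneuploid_parts / (aneuploid_parts + diploid_parts)


def expected_copy_number(fraction, aneuploid_copies):
    """Expected copy number of a region in a diploid/aneuploid mixture.

    fraction: proportion of aneuploid cells (0.0 to 1.0)
    aneuploid_copies: copies of the region in the aneuploid cells
                      (3 for a duplication, 1 for a deletion)
    """
    return 2.0 + fraction * (aneuploid_copies - 2)


# Mixing ratios used in the study, as (aneuploid parts, diploid parts).
ratios = [(0, 5), (1, 9), (1, 4), (2, 3), (3, 3), (3, 2), (4, 1), (5, 0)]
for aneuploid, diploid in ratios:
    f = aneuploid_fraction(aneuploid, diploid)
    gain = expected_copy_number(f, 3)
    loss = expected_copy_number(f, 1)
    print(f"{f:.0%}: duplication {gain:.2f}, deletion {loss:.2f}")
```

Under these assumptions a 10% mixture is expected at about 2.10 for a duplication and 1.90 for a deletion, which is close to the values reported for the 10% mixtures on the Ion S5 (2.14 and 1.88) and the Miseq (2.12 and 1.82).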

Concordance per embryo
The concordance between the two NGS systems based on embryo ploidy is presented in Table II. A total of 107 samples were subjected to sequencing. In the chromosomal CNV analysis, three samples displayed chaotic patterns for which the ploidy could not be interpreted (2.8%, 3/107). Ploidy was classified as euploid (below 20% aneuploidy), low-rate mosaic (20%-50% aneuploidy), high-rate mosaic (50%-80% aneuploidy), or aneuploid (exceeding 80% aneuploidy). Concordance was calculated as the number of samples identified as euploid or as mosaic/aneuploid on both systems divided by the total number of samples with conclusive results. Concordant results were obtained for a total of 103 samples, and thus the concordance rate per embryo was 99.04% (103/104).
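The classification thresholds above can be written as a small function. The text does not state which class a value of exactly 50% belongs to, so assigning it to high-rate mosaic here is an assumption, as is the function name.

```python
def classify_ploidy(aneuploid_percent):
    """Map an aneuploid percentage to the ploidy classes used in the text.

    Boundary handling follows the wording where it is explicit
    ("below 20%" is euploid, "exceeding 80%" is aneuploid).
    Placing exactly 50% in high-rate mosaic is an assumption.
    """
    if not 0 <= aneuploid_percent <= 100:
        raise ValueError("aneuploid_percent must be between 0 and 100")
    if aneuploid_percent < 20:
        return "euploid"
    if aneuploid_percent < 50:
        return "low-rate mosaic"
    if aneuploid_percent <= 80:
        return "high-rate mosaic"
    return "aneuploid"


for value in (10, 20, 40, 60, 80, 100):
    print(value, classify_ploidy(value))
```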

Concordance per chromosome pair
Furthermore, concordance per chromosome pair between the two NGS systems was calculated (Table III). A total of 2392 chromosome pairs were assessed (52 male embryos and 52 female embryos). One hundred and fifteen chromosome pairs exhibited chaotic mosaicism (2 male embryos and 3 female embryos), and the individual chromosomes affected could not be clearly identified (4.8%, 115/2392). Thus, the remaining 2277 chromosome pairs were categorized as diploid, low-rate mosaic, high-rate mosaic, or aneuploid. Concordance was calculated as the number of chromosome pairs identified as diploid or as mosaic/aneuploid on both systems divided by the total number of chromosome pairs with conclusive results. The same mosaic or aneuploid chromosomes with different aneuploid percentages in a particular sample between the two systems were also counted as concordant. Concordant results were obtained for a total of 2265 chromosome pairs, and thus the concordance rate per chromosome pair was 99.47% (2265/2277).
Table III footnotes:
c Chromosome pairs identified as diploid or as mosaic/aneuploid on both NGS systems. Mosaic/aneuploid chromosome pairs with different aneuploid percentages on each system are also included.
d Chromosome pairs identified as diploid on only one system, and as mosaic/aneuploid on the other system.
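The per-chromosome-pair figures above follow from simple arithmetic, which the sketch below retraces. The value of 23 chromosome pairs per embryo (22 autosome pairs plus the sex chromosome pair) is inferred from 2392 / 104 and is not stated explicitly in the text.

```python
PAIRS_PER_EMBRYO = 23  # assumption: 22 autosome pairs + 1 sex chromosome pair

embryos_assessed = 52 + 52   # male + female embryos
embryos_chaotic = 2 + 3      # male + female embryos with chaotic mosaicism

total_pairs = embryos_assessed * PAIRS_PER_EMBRYO
chaotic_pairs = embryos_chaotic * PAIRS_PER_EMBRYO
conclusive_pairs = total_pairs - chaotic_pairs

concordant_pairs = 2265      # reported in the text
discordant_pairs = conclusive_pairs - concordant_pairs
concordance_rate = concordant_pairs / conclusive_pairs

print(total_pairs, chaotic_pairs, conclusive_pairs, discordant_pairs)
print(f"chaotic: {chaotic_pairs / total_pairs:.1%}")
print(f"concordance per chromosome pair: {concordance_rate:.2%}")
```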

Discussion
This study evaluated the sensitivity, specificity, and concordance between automatic identification using the Ion S5 system and manual identification using the Miseq system. In the first phase, we compared the sensitivity and specificity of the two systems via karyotypically defined cell line mixtures. In the second phase, we calculated the concordance between the two systems in 107 clinical trophoblast biopsies. The sensitivity and specificity of the two systems were comparable. The concordance per embryo and per chromosome pair was high using the same calling criteria.
NGS technology has been proven to be consistent with many other CCS platforms used in PGT-A 10,15,23,24. In early assessments of NGS for PGT-A, the concordance between NGS systems and 24-chromosome aCGH was assessed 10,23. In those studies, the sensitivity and specificity of NGS were high, and the broader dynamic range of CNV status generated by the NGS interpretation software simplified the identification of chromosome ploidy. Subsequently, investigators studied segmental or mosaic aneuploidy using NGS and validated these observations through a third platform, such as FISH or SNP arrays 3,13. These articles demonstrated that segmental aneuploidy and diploid/aneuploid mosaicism could be identified using NGS, but that not every variation observed was reliable 17. WGA artifacts, the algorithms selected for calculation, or the identification approach can lead to false positives 16. The present study focused on two distinct identification approaches, evaluating automatic calling using the Ion Reporter software in the Ion S5 system and manual calling using the BlueFuse Multi software in the Miseq system. Although the WGA of both systems is based on the modified Rubicon PicoPLEX kit (Takara Bio, Kyoto, Japan), their library preparation procedures differ. Thus, we separated the pre-amplified products into two aliquots for the parallel comparison. Although this reduced the initial amount of DNA, the performance of the libraries could be evaluated independently on each system.
With the default sequencing settings, individual read lengths ranged from 100 to 150 bp on the Ion S5, while the Miseq generated uniform 36 bp reads. Although the read lengths differ, the distribution of read counts within unit intervals (set as 1 Mb) across a particular region displayed almost the same pattern on the two systems (Fig. 3). Additionally, their quality metrics are fundamentally similar but not identical, owing to the specific sequencing principles. Common calling criteria, determined from the four cell line models described above, were applied in the parallel comparison of this study. Eventually, high concordance between the automatic and manual identifications was obtained in the clinical samples. However, some differences still existed between the two approaches even though common criteria were used. First, the Ion Reporter provided a tunable analysis workflow followed by automatic identification within this framework, whereas the BlueFuse Multi allowed operators to make manual calls based on its own default settings, which cannot be changed (Ion Reporter™ 5.10 User Guide, Thermo Fisher Scientific; BlueFuse Multi v4.5 Software Guide, Illumina). Therefore, some parameters could not be completely synchronized between the two software packages, such as the transition penalty, which represents the sensitivity to a change in ploidy status between two adjacent data points. Second, manual intervention is not required during automatic identification by the Ion Reporter, while it is necessary for the BlueFuse Multi whenever the technician observes a deviation from the default line representing copy number two. In some samples with ambiguous patterns, both masking by automatic identification and subjective conclusions from manual identification may occur in the absence of validation by a second methodology. Additionally, it is important to note that the fidelity of samples with borderline aneuploid levels could have been affected by the separation of the pre-amplified products in the present study.
In general, batch-to-batch automatic identification was a faster and more standardized approach, but sometimes less flexible for individual samples. The major advantage of the automatic workflow provided by the Ion S5 is the reduction in manipulation time and reporting time, and thus in turnaround time. Regarding manipulation time, the Ion S5 system combines the WGA and library procedures and leaves the remaining steps to the Ion Chef automatic machine. In contrast, the Miseq system takes nearly twice the manipulation time because of its separate WGA and library preparation procedures. In terms of reporting time, the automatic identification of the Ion S5 system quickly accomplishes typical ploidy calling in a batch, though additional manual rechecks may be required for some ambiguous results, whereas the manual identification of the Miseq system requires individual checking of each sample and thus takes longer.
In conclusion, this is the first study to compare the automatic and manual identifications of the Ion S5 and Miseq NGS systems for PGT-A. The sensitivity and specificity of both systems were comparable, and the concordance in the clinical samples was high. Automatic identification provides a faster and more standardized approach, and thus represents a good option for laboratories with high throughput.

Figure 1
Cell lines are mixed to create multiple levels of aneuploidy. The calculated copy number at the affected aneuploid region correlates with the aneuploid percentage using automatic identification via the Ion Reporter on the Ion S5 system (1a), and using manual identification via the BlueFuse Multi on the Miseq system (1b). As the number of aneuploid cells in the mixtures increases, the copy number of the regions with segmental deletion decreases and that of the regions with whole-chromosome duplication increases on both NGS systems.

Figure 2
Overall sensitivity of the Ion S5 and Miseq at different aneuploid levels is displayed in the bar chart, and the table lists individual sensitivity for segmental deletion and whole-chromosome duplication (2a). Overall specificity of the Ion S5 and Miseq is shown (2b). Neither the sensitivity nor the specificity differs significantly between the two systems.