The number of SARS-CoV-2 genome sequences in public databases is growing rapidly. This exponential growth is attributable to rapid genome sequencing, the development of data analysis workflows, and data sharing by researchers worldwide [11-13]. Currently, most sequencing workflows have been created for the Oxford Nanopore and Illumina sequencing platforms [16-18]. The ARTIC protocol is one of the most widely used sequencing methods for the Oxford Nanopore platform. In some cases, researchers have used both the Oxford Nanopore and Illumina platforms to generate consensus genome sequences. The Ion Torrent sequencing platform, although popular and extensively used for viral genome sequencing [19-21], has not been widely used for SARS-CoV-2 sequencing. Several Ion Torrent-based SARS-CoV-2 sequencing workflows have been reported [12, 22, 23], but they have not been widely adopted by researchers. Unlike the ARTIC protocol, the published protocols for some of these in-house Ion Torrent-based assays are not detailed enough to be replicated in other laboratory settings [12, 23]. The recently launched Ion AmpliSeq™ SARS‑CoV‑2 Research Panel user guide contains a detailed and optimized sequencing protocol for the Ion S5 sequencing platform (MAN0019277 Rev.A.O). In the current study, we adopted and modified the protocol from the Ion AmpliSeq™ SARS‑CoV‑2 Research Panel user guide and applied it to the Ion PGM sequencing platform. When adopting or establishing a new protocol, it is critical to harmonize the steps written in the user manual with the existing Standard Operating Procedures (SOPs) of the laboratory to ensure high-quality data and results. If a platform or reagents different from those in the user manual are used, the specific SOP should be carefully optimized and assessed to maximize the output and data reproducibility of the newly established protocol.
In a single sequencing run with an Ion 318 chip, five complete and one near-complete SARS-CoV-2 genome sequences were generated from RNA extracted directly from human nasopharyngeal swabs. The manufacturer's protocol recommends 1 million sequencing reads per sample (MAN0019277 Rev.A.O). Our sequencing run, however, generated approximately 180,000 to 1,500,000 reads per sample, suggesting that fewer reads are sufficient to generate a complete genome sequence. Hence, the number of multiplexed samples can be increased to reduce the sequencing cost on the Ion PGM.
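The multiplexing argument is simple arithmetic and can be sketched as follows. This is an illustration only: the usable chip output (`chip_reads`) and the lower per-sample target of 200,000 reads are hypothetical placeholders rather than values established in this study, and the calculation assumes that reads are distributed evenly across barcoded samples, which is rarely exactly true in practice.

```python
def max_samples_per_run(chip_reads: int, reads_per_sample: int) -> int:
    """Return how many samples fit on one chip at a given per-sample read target.

    Assumes reads are split evenly across samples (an idealization).
    """
    if chip_reads < 0:
        raise ValueError("chip_reads must be non-negative")
    if reads_per_sample <= 0:
        raise ValueError("reads_per_sample must be positive")
    return chip_reads // reads_per_sample


# Hypothetical usable output of one chip; replace with the value from your own runs.
chip_reads = 4_000_000

# 1,000,000 is the manufacturer's recommendation cited in the text;
# 200,000 is an illustrative lower target, not a validated threshold.
for target in (1_000_000, 200_000):
    n = max_samples_per_run(chip_reads, target)
    print(f"{target:>9,} reads/sample -> up to {n} samples per run")
```

In practice, the per-sample target should include a safety margin for uneven barcode representation and for samples with low viral load.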
For NGS, an ideal protocol should generate results with high on-target specificity and read coverage uniformity. In this study, 99.9% of the reads generated for all six isolates mapped to SARS-CoV-2 genomes, with more than 96% coverage uniformity. Uneven read distribution is a common issue, and intrinsic factors affect NGS data quality. In fact, uneven read distribution was also reported in the previous version of the SARS-CoV-2 tiling PCR amplification method. Two target regions (r1_1.14.786182 and r1_1.25.388943) of the Ion AmpliSeq™ SARS‑CoV‑2 Research Panel consistently showed low read coverage in most samples, whether isolated from humans or from infected cell culture supernatant (unpublished data). Clearly, this problem is neither random nor sample type dependent. Generally, increasing the sequencing output is the easiest way to improve coverage in low read depth regions. However, isolate 4Apr20-64-Hu, with 216,151 sequencing reads, had good coverage (>500X) for both regions, whereas neither the sample with fewer reads (21Apr20-209-Hu) nor the samples with more reads showed good coverage at these two regions. A higher number of reads would only lead to over-sequencing of the adequately covered regions and raise sequencing costs. Therefore, increasing the overall number of sequencing reads is not a suitable solution to the read depth problem for r1_1.14.786182 and r1_1.25.388943. Other factors, such as genetic variation and variability in GC content, commonly affect the efficiency of target enrichment. We observed genetic variations between 4Apr20-64-Hu (>500X read coverage at both regions) and the other isolates at positions 13730 and 23929, which lie at the potential primer binding sites for the two target regions, respectively.
Hence, the low coverage regions reported herein could be genome dependent, and the specific genetic variations may lead to inefficient primer annealing during multiplex amplification. An improved version of the Ion AmpliSeq™ SARS‑CoV‑2 Research Panel that addresses these two low coverage regions, as well as other genetic variations present in circulating SARS-CoV-2 strains, should be considered.
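The coverage metrics discussed above can be illustrated with a minimal sketch. It assumes a commonly used definition of uniformity, namely the percentage of positions covered at no less than 0.2 times the mean depth; the text does not state how uniformity was computed in this study, so this definition is an assumption. The function names, amplicon names, and depth values are hypothetical, and the 500X cut-off simply mirrors the ">500X" figure quoted for good coverage.

```python
from statistics import mean


def coverage_uniformity(depths, fraction=0.2):
    """Percentage of positions with depth >= fraction * mean depth.

    The 0.2 x mean threshold is an assumed, commonly used definition.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    threshold = fraction * mean(depths)
    covered = sum(1 for d in depths if d >= threshold)
    return 100.0 * covered / len(depths)


def low_coverage_amplicons(amplicon_depths, min_depth):
    """Return the sorted names of amplicons whose mean depth is below min_depth."""
    return sorted(name for name, d in amplicon_depths.items() if d < min_depth)


# Made-up per-amplicon mean depths, for illustration only.
example = {
    "amplicon_A": 1800.0,
    "amplicon_B": 2100.0,
    "amplicon_C": 35.0,
    "amplicon_D": 1650.0,
}

print(f"uniformity: {coverage_uniformity(list(example.values())):.1f}%")
print("below 500X:", low_coverage_amplicons(example, min_depth=500))
```

Flagging amplicons against a fixed depth cut-off across several samples, rather than within a single sample, helps to separate systematic dropouts such as the two regions described here from sample-specific noise.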
Taken together, we report a rapid complete genome sequencing protocol for SARS-CoV-2 for use with the Ion PGM. Far fewer sequencing reads than the recommended 1 million were sufficient to produce a complete SARS-CoV-2 genome sequence. Six or more samples can be included in a single sequencing run using the Ion 318 chip. Our findings nonetheless revealed that two potential dropout regions occur with the Ion AmpliSeq™ SARS‑CoV‑2 Research Panel, and that increasing the number of sequencing reads does not resolve them. An improved version of the Ion AmpliSeq™ SARS‑CoV‑2 Research Panel that addresses the potential genetic variations at the primer binding sites could improve the read coverage uniformity and usability of the sequencing data.