Residual adapters
Compared to MiSeq, iSeq raw reads contained significantly more adapters for all three viral datasets analyzed (p≤1.35×10⁻³) (Fig. 1 and Table S3). After trimming, residual adapters were still detected in AdapterRemoval, FastP, and SeqPurge-trimmed single and Skewer-trimmed paired reads, with FastP retaining the most adapters for poliovirus (0.038−12.54%), SC2 (0.043−13.06%), and norovirus trimmed reads (0.32−3.51%) (Fig. 1). AdapterRemoval left more adapters in MiSeq than in iSeq poliovirus and SC2 trimmed reads (p<0.015). SeqPurge left detectable adapters only in SC2 single reads.
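As an illustration of the adapter-content percentages reported above, the following minimal sketch counts the share of reads that contain the start of an adapter sequence. The function name, the exact-substring matching rule, the 12-base seed length, and the example adapter prefix are all assumptions made for illustration; they are not the detection method or adapter set used in this study.

```python
def percent_reads_with_adapter(reads, adapter, min_match=12):
    """Return the percentage of reads containing the first `min_match`
    bases of `adapter` as an exact substring (assumed matching rule)."""
    if not reads:
        return 0.0
    seed = adapter[:min_match]
    hits = sum(1 for read in reads if seed in read)
    return 100.0 * hits / len(reads)


# Hypothetical toy input: two of the four reads carry the adapter prefix.
example_adapter = "AGATCGGAAGAGC"  # illustrative adapter start, an assumption
example_reads = [
    "ACGTAGATCGGAAGAGCACAC",
    "ACGTACGTACGT",
    "TTTT",
    "AGATCGGAAGAGC",
]
adapter_pct = percent_reads_with_adapter(example_reads, example_adapter)
```

Exact matching undercounts adapters that contain sequencing errors or that are truncated at the read end, so a production check would usually allow mismatches and partial overlaps.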
Differences in raw versus trimmed read statistics
Overall, iSeq and MiSeq raw reads showed similar mean total read (paired and single) counts, paired read counts, base counts, and read lengths, except that MiSeq generated more SC2 raw reads and bases (p=0.035, Table S4). The iSeq generated more high-quality raw reads for poliovirus and SC2 than MiSeq (p≤1.09×10⁻³), while no differences were observed for noroviruses.
After trimming, all trimmers output similar counts of total reads, read pairs, and bases for poliovirus, SC2, and norovirus trimmed reads (Tables S5−S7), except BBDuk, which had significantly fewer bases for SC2 (p<0.028, Table S6). BBDuk also retained the shortest trimmed reads for all viruses compared to other trimmers (p≤3.12×10⁻⁵, Fig. 2, Tables S5−S7). SeqPurge and Skewer consistently output longer trimmed reads than Trimmomatic, AdapterRemoval, and FastP across viruses and sequencers (Fig. S8−S10, panels D and J).
The iSeq poliovirus and SC2 trimmed datasets had significantly fewer paired reads compared to the raw datasets (p<0.012, Tables S4−S6, Fig. S8B and S9B), with Trimmomatic, AdapterRemoval, FastP, and BBDuk consistently retaining fewer trimmed read pairs than raw reads (p<0.027) for both poliovirus and SC2. Also, poliovirus and SC2 trimmed datasets had significantly fewer bases compared to raw datasets (p<5.44×10⁻⁴). Overall, trimmed reads were shorter but had higher-quality bases (82.41−96.2% with Q≥30) than raw reads (77.74−93.61%) for poliovirus, SC2, and noroviruses (p<3.75×10⁻³, Tables S5−S7, Fig. S8−S10, panels E, F, K and L). Additionally, trimmers preserved longer MiSeq poliovirus and SC2 trimmed reads than iSeq (p≤5.59×10⁻³, Fig. 2, Table S4), and more high-quality iSeq than MiSeq trimmed reads for all three viruses (p≤0.035) (Table S4).
Differences in trimmed read quality
Overall, AdapterRemoval, Trimmomatic, and FastP consistently produced trimmed reads with a higher percentage of quality bases (Q≥30, 93.15−96.7%) than SeqPurge, BBDuk, and Skewer (87.73−95.72%) (Tables S5−S7 and S11, Fig. S8−S10, panels E, F, K and L). Specifically, BBDuk, SeqPurge, and Skewer retained significantly fewer quality trimmed iSeq reads across all viruses (p<7.9×10⁻³) and MiSeq norovirus reads (p<0.024) compared to other trimmers. For MiSeq SC2 trimmed reads, only AdapterRemoval retained significantly more quality reads than BBDuk and SeqPurge (p<0.016), and no quality differences were observed for MiSeq poliovirus trimmed reads (p>0.088).
Overall, trimmers output more high-quality (Q≥30) iSeq than MiSeq SC2 and norovirus trimmed reads (p<0.035), with no platform differences for poliovirus trimmed reads (Table S4).
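For readers unfamiliar with the Q≥30 metric used throughout this section, the sketch below shows one way the percentage of bases with Phred quality of at least 30 could be computed from FASTQ quality strings. It assumes the standard Phred+33 encoding; the function name and the toy input are hypothetical and do not reproduce the tool used to generate the reported values.

```python
def percent_q30(quality_strings, threshold=30, offset=33):
    """Return the percentage of bases whose Phred score is >= threshold.

    Each quality string is a FASTQ quality line; scores are decoded as
    ord(character) - offset (Phred+33 encoding is assumed).
    """
    total = 0
    passing = 0
    for qual in quality_strings:
        for ch in qual:
            total += 1
            if ord(ch) - offset >= threshold:
                passing += 1
    return 100.0 * passing / total if total else 0.0


# Hypothetical toy input: 'I' = Q40, '?' = Q30, '5' = Q20, '#' = Q2.
q30_pct = percent_q30(["II?5", "##II"])
```

In this toy input, five of the eight bases meet the threshold. Trimming low-quality read ends removes bases such as the `#` and `5` positions, which is why trimmed reads tend to be shorter but show a higher Q≥30 percentage.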
De novo assembly statistics
All trimmers except BBDuk improved N50 and maxContig for assemblies across viral datasets compared to raw reads. After trimming, the most pronounced differences in assembly statistics were observed for poliovirus and SC2 datasets. Notably, assemblies of BBDuk-trimmed poliovirus and SC2 reads had the lowest N50 (p<0.037, Table S12) and maxContig (p<7.83×10⁻³, Table S13), achieving only 8−39.9% genome coverage compared to raw reads (8.8−87.5%) and other trimmers (54.8−98.9%) (Table 1). Trimmed poliovirus reads assembled into long contigs, significantly improving genome coverage compared to raw read assemblies, from 35.7% to 98.9% for iSeq FastP-trimmed reads and from 87.5% to 95.6% for MiSeq AdapterRemoval-trimmed reads (Table 1). Assemblies from norovirus trimmed reads showed no significant differences.
MiSeq and iSeq showed comparable mean N50 and maxContig for SC2 and norovirus trimmed reads. However, FastP-trimmed iSeq poliovirus reads assembled longer contigs than MiSeq reads (p=0.014, Table S14).
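The two assembly statistics compared above can be sketched as follows. N50 is taken here in its usual sense, the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length, and maxContig as the longest contig. The function names and contig lengths are hypothetical; the assembler's own reporting may differ in detail.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return 0


def max_contig(contig_lengths):
    """Return the length of the longest contig (0 for an empty list)."""
    return max(contig_lengths, default=0)


# Hypothetical contig lengths (bp) from a fragmented and a contiguous assembly.
fragmented = [100, 200, 300, 400, 500]
contiguous = [7000, 300, 200]
```

For the fragmented example the total is 1,500 bp, so the two longest contigs (500 + 400 bp) are needed to reach half of it and the N50 is 400; for the contiguous example a single 7,000 bp contig already exceeds half, so N50 and maxContig coincide.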
Single nucleotide polymorphism (SNP) quality and concordance
There were no differences in SNP quality for SC2 and norovirus datasets across the trimmers. However, for poliovirus datasets, BBDuk-trimmed read assemblies had lower mean SNP quality compared to other trimmers (Table S15).
Illumina iSeq and MiSeq read assemblies identified SNPs with similar quality, ranging from 3 to 228 for all viruses (Table S14). SNP concordance across trimmers was high (97.7−100%) for both iSeq and MiSeq viral datasets; however, BBDuk-trimmed read assemblies had 2−8 unique SNPs relative to other trimmers (Fig. S16).
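The concordance and unique-SNP comparisons can be illustrated with simple set operations. This is a minimal sketch under stated assumptions: each SNP is represented as a (position, reference, alternate) tuple, and concordance is defined as shared SNPs divided by the union of SNPs from two call sets. The text does not specify how concordance was calculated, so that definition, the function names, and the toy data are all assumptions.

```python
def snp_concordance(snps_a, snps_b):
    """Percentage of SNPs shared by two call sets, relative to their union
    (an assumed definition of concordance)."""
    a, b = set(snps_a), set(snps_b)
    union = a | b
    if not union:
        return 100.0  # two empty call sets agree trivially
    return 100.0 * len(a & b) / len(union)


def unique_snps(target, others):
    """SNPs present in `target` but absent from every call set in `others`."""
    seen_elsewhere = set().union(*others)
    return set(target) - seen_elsewhere


# Hypothetical call sets as (position, ref, alt) tuples.
trimmer_x = [(100, "A", "G"), (200, "C", "T"), (300, "G", "A")]
trimmer_y = [(100, "A", "G"), (200, "C", "T")]
trimmer_z = [(100, "A", "G"), (200, "C", "T")]
```

With these toy sets, `unique_snps(trimmer_x, [trimmer_y, trimmer_z])` isolates the single call at position 300, which mirrors how trimmer-specific SNPs would be identified.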