RT-qPCR is currently a technique employed by the FDA to detect the presence of HAV and HuNoV in foods. A positive RT-qPCR result is then followed by Sanger-based sequencing for confirmation of virus presence and virus genotyping. With the advent, advances, and popularity of whole genome sequencing (WGS), we wanted to investigate its utility as a method to confirm RT-qPCR positives and to genotype the virus, particularly at very low levels of contamination.
Sequencing technologies have been applied to foodborne virus studies by many groups using various approaches and different sequencing platforms (Yang et al., 2017) (Chen et al., 2019) (Raymond et al., 2022) (Buytaers et al., 2022) (Aw et al., 2016). Bartsch et al. applied a metagenomics approach to frozen strawberries involved in a norovirus outbreak using the Illumina HiSeq platform (Bartsch et al., 2018). Only 2 out of 29 million sequencing reads matched the norovirus sequence, mainly because of the high abundance of nucleic acids from other sources. Aw et al. obtained rotavirus and picobirnavirus sequences from field-harvested and retail lettuce samples after sequence-independent amplification (Aw et al., 2016). Buytaers et al. sequenced norovirus-spiked raspberries using Oxford Nanopore technologies (Buytaers et al., 2022). They showed that a norovirus genome could be obtained with shotgun metagenomics if the virus is present at a sufficiently high contamination load, and with hybrid capture at lower contamination loads. These studies showed that sequencing technologies can be applied to foodborne virus investigation and demonstrated that enrichment of viral targets, either by specific capture strategies or by pre-amplification methods, can increase the number of viral sequencing reads and thus improve the sequencing of viruses in food samples. However, few studies have applied sequencing to food samples containing very low amounts of virus (e.g., Ct values close to or around 40), which are the levels most frequently reported for virus-contaminated food items.
SISPA and SPIA are two sequence-independent pre-amplification approaches that are frequently coupled with high throughput sequencing to generate viral reads from various samples (Kapusinszky et al., 2017) (Chen et al., 2018) (Chen et al., 2019) (Blomstrom et al., 2010). Myrmel et al. compared the efficiency of these two amplification methods combined with sequencing to recover the genomes of bovine coronavirus (BCoV) and bovine rhinitis virus (BRBV) from nostril specimens (Blomstrom et al., 2010). Their data showed that the SPIA approach generated a higher number and a higher percentage of viral reads for both a high copy number of BCoV input (4.1 × 10⁵ genome copies) and a low copy number of BRBV (700 genome copies), which indicated a high efficiency of SPIA for amplification of viral RNA in comparison to SISPA. We reasoned that using pre-amplification prior to WGS for virus-contaminated food samples would increase the sensitivity and improve the ability of WGS to confirm RT-qPCR positives at low viral contamination levels. To this end we used either a SPIA or a SISPA pre-amplification method prior to sequencing of HAV and HuNoV from berry samples. A serial dilution of virus ranging from 10⁵ to 10⁻¹ genome copies was used to ensure coverage of low viral quantities. Our data showed that either SPIA or SISPA coupled with WGS could recover enough reads of HAV or HuNoV from samples in Group I for confirmation and genotyping. For samples in Group II or III, which had lower amounts of virus input, both methods could confirm some but not all RT-qPCR positive HAV and HuNoV samples. In addition, due to the limited number of sample replicates (especially in Group III), a comparison of efficiency, as well as of the limit of detection of SPIA-WGS and SISPA-WGS for confirmation of HAV and HuNoV, was not performed. Studies specifically designed to determine the limit of detection for confirmation and genotyping are warranted.
Three types of berry samples containing either HuNoV or HAV were included in this study. For Type 1 samples, HAV- or HuNoV-RNA transcripts were directly added to raspberry RNA extracts and provided an ideal model to examine virus detection by WGS. This model allowed us to use a pre-determined number of viral RNA copies for RT-qPCR and WGS without the need to consider virus recovery yield, intact virus, and viral genome integrity. Our data showed that the positivity of HAV transcripts could be consistently confirmed using either SPIA-WGS or SISPA-WGS when the Ct values were less than 40.
When the Ct values were above 40, 9 out of 13 (69%) PCR positive samples could be confirmed by SPIA-WGS, while 9 out of 9 (100%) were confirmed by SISPA-WGS (Table 1). Notably, both pre-amplification-WGS strategies were able to detect the presence of viral RNA at 0.1 copies/3 µL in samples (4 out of 5 samples and 1 out of 2 samples for SPIA and SISPA, respectively) that had previously tested negative by RT-qPCR (Supplementary Table 1a and 1b). This might be due to a lack of viral RNA transcript in the 3 µL of sample used in the RT-qPCR reaction. In the case of frozen raspberry samples spiked with HuNoV transcripts (Type 1), all 20 PCR-positive samples could be confirmed by SPIA-WGS. However, only 3 out of 5 (Ct between 35 and 40) and 2 out of 5 (Ct > 40) samples were detected by SISPA-WGS, suggesting that SPIA-WGS might provide better performance for detecting HuNoV transcripts in samples with higher Ct values.
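The sampling explanation above can be made concrete with a short calculation. The sketch below assumes that transcript copies are randomly and independently distributed in the extract, so that the number of copies drawn into an aliquot follows a Poisson distribution; this model and the function name `prob_at_least_one_copy` are illustrative assumptions, not part of the study's analysis. Under that assumption, an aliquot with a nominal mean of 0.1 copies would contain at least one template copy only about one time in ten.

```python
import math

def prob_at_least_one_copy(mean_copies: float) -> float:
    """P(X >= 1) for X ~ Poisson(mean_copies).

    Assumption: template copies are randomly distributed in the extract,
    so the count in one aliquot is Poisson with the nominal mean.
    """
    return 1.0 - math.exp(-mean_copies)

# Nominal mean copies per 3 µL aliquot (illustrative values only).
for mean in (0.1, 1.0, 10.0):
    p = prob_at_least_one_copy(mean)
    print(f"{mean:>5} copies per aliquot -> P(at least one copy) = {p:.3f}")
```

At such low nominal inputs, a single negative RT-qPCR well is therefore the expected outcome rather than evidence of absence, which is consistent with the explanation offered in the text.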
The Type 2 samples contained viral RNA derived from either a HuNoV positive stool sample (a natural model of virus contamination) or HAV from cell culture spiked onto frozen strawberries. In these samples, HAV and HuNoV at low levels (Ct values close to 40) could be detected by both SISPA-WGS (Supplementary Table 1f and 1h) and SPIA-WGS (Table 3). However, a 1:10000 HAV dilution (Supplementary Table 1f, Spiking 2) was identified at the genotype level by SISPA-WGS but was not detected by SPIA-WGS (Table 3). For the HuNoV spiked samples (Table 4), 18 out of 24 and 16 out of 20 PCR positive samples could be confirmed with SPIA-WGS and SISPA-WGS, respectively. For the samples with Ct values higher than 40, 1 out of 4 could be confirmed by SPIA-WGS and 3 out of 4 could be confirmed by SISPA-WGS. With the limited number of samples, it is difficult to determine whether SISPA-WGS performed better than SPIA-WGS for confirmation of human norovirus samples with higher Ct values.
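To illustrate why four samples per method are too few to separate the two approaches, the counts for Ct > 40 (1 of 4 confirmed by SPIA-WGS, 3 of 4 by SISPA-WGS) can be put through a two-sided Fisher exact test. This is a sketch only and not an analysis reported in the study: it treats the two groups as independent, which may not hold if the same RNA samples were used for both methods, and the function `fisher_exact_two_sided` is a hypothetical helper written for this example.

```python
from math import comb

def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables (with the same
    margins) that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Rows: SPIA-WGS, SISPA-WGS; columns: confirmed, not confirmed (Ct > 40).
p_value = fisher_exact_two_sided(1, 3, 3, 1)
print(f"two-sided p = {p_value:.3f}")
```

Under these assumptions the p-value is far above conventional significance thresholds, so the apparent difference between 1 of 4 and 3 of 4 is compatible with chance.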
For the naturally contaminated blackberry (Type 3) sample, two out of three bag samples were determined to be GII positive at high Ct values (46.5 and 43.4) prior to our receipt of the samples. Despite using the same isolation/detection protocol (the BAM protocol), we could not reproduce a positive result for any of the 12 × 50 g samplings from the 3 bags. This could be attributed to three possibilities: first, virus contamination is unevenly distributed in the samples; second, viral RNA was absent in the 3 µL volume used for the PCR reactions due to its low concentration; or third, the original RT-qPCR results were false positives. To address these possibilities, the remaining four berry concentrates derived from the same bag were combined and concentrated prior to RNA isolation for each of the three bags. RT-qPCR results showed that 2 out of 3 bags were HuNoV GII positive after pooling and concentration of the concentrates, although only one out of three PCR replicates was positive, with a Ct of 41.95 and 42.65 for bag A and B, respectively (Table 5). These two PCR positive RNA samples were subsequently used for sequencing. HuNoV reads from bag A were recovered and assigned as HuNoV GII by SPIA-WGS but not SISPA-WGS, while HuNoV reads from bag B were unassigned by both SPIA-WGS and SISPA-WGS.
Our results indicate that RNA concentration could be one of the options to improve the capability of the current BAM detection method. Similar results were also observed with the HuNoV-spiked strawberries (spiking 4, Supplementary Table 1g and 1h). Instead of concentrating berry concentrates as above, isolated virus RNA was combined and concentrated for the RT-qPCR assay. Ct values dropped from 34.6 and 36.4 to 32.1 for the samples spiked at the 1:1000 dilution. Similarly, Ct values dropped from 37.2 and 38.8 to 35.2 for the samples spiked at the 1:10000 dilution. These concentrated RNA samples were detected by both SPIA-WGS and SISPA-WGS with a higher percentage of HuNoV reads in comparison with their unconcentrated counterparts (spiking 4, Supplementary Table 1g).
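The Ct shifts reported above can be translated into an approximate fold increase in template. The sketch below assumes ideal amplification, in which template doubles every cycle (an efficiency factor of 2); the efficiency of the assays is not given in this section, so the resulting figures are rough illustrations rather than measured enrichment values. The function name `fold_change_from_ct` is hypothetical.

```python
def fold_change_from_ct(ct_before: float, ct_after: float, efficiency: float = 2.0) -> float:
    """Approximate fold increase in template implied by a drop in Ct.

    Assumption: each cycle multiplies template by `efficiency`
    (2.0 corresponds to ideal, 100% efficient amplification).
    """
    return efficiency ** (ct_before - ct_after)

# Ct values before and after RNA concentration, as reported in the text.
reported = {
    "1:1000 dilution": ([34.6, 36.4], 32.1),
    "1:10000 dilution": ([37.2, 38.8], 35.2),
}

for label, (before_values, after) in reported.items():
    for before in before_values:
        fold = fold_change_from_ct(before, after)
        print(f"{label}: Ct {before} -> {after}, about {fold:.1f}-fold more template")
```

A lower assumed efficiency gives smaller fold changes for the same Ct shift, so the default of 2.0 should be read as an upper-end approximation.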
In contrast to the strawberry extract spiked with HAV transcripts, from which nearly full-length HAV genomic sequences still could be recovered for some of the samples with Ct above 40, only partial viral genomic sequences could be generated for most of the HuNoV transcripts-spiked samples having Ct above 30; this was also true for all HAV or HuNoV spiked strawberries by WGS. Our data also showed that a larger percentage of target viral reads and longer viral sequences were obtained with more virus input and very few reads were obtained from samples with high Ct (e.g., close to or above 40). Thus, if the recovered partial viral sequences were from regions outside of the genotyping location, the samples could be detected at species level but not identified at genotype level.
Our data show that extremely low levels of viral RNA in samples that were negative according to RT-qPCR could sometimes be detected using WGS techniques. We took several steps to discriminate true positives from cross contamination: all sample preparation and RNA work was performed in separate areas; a negative control was run for both the RT-qPCR and WGS assays; the sequencer was washed stringently with Tween 20 between runs; stringent QC was applied to remove reads with low quality scores; and the recovered sequences were aligned against a sequence database of in-house samples to exclude cross contamination.
Finally, data from the concentrated RNA samples showed both an improved sensitivity of RT-qPCR and an increase of WGS viral reads. Thus, it may be useful to consider how to optimize the protocols by adding a concentration step to improve the sensitivity of existing detection and confirmation methods.