The narcissus (Narcissus L. spp.) is a bulbous plant from the family Amaryllidaceae, comprising numerous species. Although originating from the Mediterranean regions [1], Narcissus species are now grown worldwide. In Japan, narcissus bulbs imported from abroad are inspected during the post-entry quarantine. Over 20 plant viruses are known to infect the narcissus [1], and these viruses include several from the genus Potyvirus: cyrtanthus elatus virus A, lily mottle virus, narcissus degeneration virus, narcissus late season yellows virus (NLSYV), narcissus yellow stripe virus, and ornithogalum mosaic virus (OrMV) [1-4].
The Potyvirus vallota mosaic virus (ValMV) [5] was originally identified from Vallota speciosa in the Netherlands, with infection causing flower colour breaking and chlorotic leaf lesions [6]. This virus could be transmitted by Myzus persicae, and its experimental host range included Chenopodium amaranticolor, C. quinoa, Gomphrena globosa, Hyoscyamus niger, Nicotiana clevelandii, Spinacia oleracea, and Tetragonia expansa [6]. Three distinct ValMV isolates have been reported from Cyrtanthus elatus (a synonym of V. speciosa [7]) in the United States (accession no. EF441726.1: direct submission), Nerine sarniensis in the United Kingdom (accession no. EF507688 [8]), and Nerine sp. in New Zealand (accession no. FJ618540.1 [9]). However, no study has analysed their sequences from the 5′ terminus to the NIb region, nor has the complete genome of ValMV been sequenced. This study provides the complete ValMV genome sequence.
Narcissus bulbs imported from the United States were cultivated for inspection at one of Japan’s post-entry quarantine stations. One plant (N. albidus) exhibited leaf chlorosis (Supplementary Fig. 1) in February 2020. To confirm viral infection, next-generation sequencing (NGS) of the symptomatic leaf sample was performed. Total RNA was extracted using a Spectrum™ Plant Total RNA Kit (Sigma-Aldrich, St. Louis, MO, USA). Reverse transcription of extracted RNA and cDNA amplification were conducted with a TransPlex® Complete Whole Transcriptome Amplification (WTA) Kit (Sigma-Aldrich) and ExTaq DNA polymerase (Takara Bio, Shiga, Japan), following Yanagisawa et al. [10]. An Ion Xpress Plus Fragment Library Kit (Thermo Fisher Scientific, Tokyo, Japan) was used for NGS library preparation from WTA products. The library was sequenced on an Ion Personal Genome Machine (PGM) system (Thermo Fisher Scientific). Sequence data analysis and de novo assembly were conducted in CLC Genomics Workbench version 20.0 (QIAGEN, Hilden, Germany), again following Yanagisawa et al. [10].
To validate assembled contig sequences from NGS and determine the complete ValMV genome, we performed Sanger sequencing of products from RT-PCR and rapid amplification of cDNA ends (RACE). New primers were designed based on one contig showing high sequence identity with known ValMV isolates, determined using BLASTn (Fig. 1 and Supplementary Table 1). Reverse transcription of extracted RNA was performed using PrimeScript™ Reverse Transcriptase (Takara Bio) with the random primer (N)6 and an oligo(dT)15 primer (Takara Bio). PCR was conducted using the novel primers and KOD-Plus-Neo (TOYOBO, Osaka, Japan). The 5′- and 3′-terminal ends of ValMV were amplified using a SMARTer® RACE 5′/3′ Kit (Takara Bio) and a 3′-Full RACE Core Set (Takara Bio), respectively. Amplicons were directly sequenced using a SeqStudio Genetic Analyzer (Thermo Fisher Scientific). Acquired paired-end sequence data were assembled in MEGA X [11].
The open reading frame (ORF) of ValMV was determined with ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/). To predict putative cleavage sites of the viral polyprotein, we compared the amino acid (aa) sequences of ValMV and other known potyviruses [5]. The PIPO protein was identified from the highly conserved G1-2A6-7 motif (one or two G residues followed by six or seven A residues) [12].
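The motif search described above can be illustrated with a short regular-expression scan. This is only a minimal sketch, not the procedure used in this study: the function name `find_slippage_motifs` and the toy sequence are hypothetical, and the pattern simply encodes "one or two G followed by six or seven A".

```python
import re

# G(1-2)A(6-7): one or two G residues followed by six or seven A residues.
MOTIF = re.compile(r"G{1,2}A{6,7}")


def find_slippage_motifs(genome):
    """Return (start, end, motif) tuples with 1-based inclusive coordinates.

    `genome` is a nucleotide string; case is ignored. The motif contains only
    G and A, so RNA (U) versus cDNA (T) notation does not affect the result.
    """
    seq = genome.upper()
    return [(m.start() + 1, m.end(), m.group()) for m in MOTIF.finditer(seq)]


# Toy sequence (hypothetical, NOT the ValMV genome): motif placed after 8 nt.
toy = "ACGTACGT" + "GAAAAAAA" + "CCT"
hits = find_slippage_motifs(toy)
```

A hit in a real genome would still need to be checked for its position within the P3 coding region and for the reading frame of the downstream PIPO sequence.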
We predicted the phylogenetic relationship of ValMV with other potyviruses using aa sequences of the coat protein (CP) and the polyprotein. The analysis was performed in MEGA X, using the maximum-likelihood method with the Jones-Taylor-Thornton (JTT) model and 1,000 bootstrap replications. With BLAST, we performed pairwise sequence comparisons of the complete genome, viral polyprotein, and 10 putative mature proteins between ValMV and potyviruses in the same cluster of the CP-based phylogenetic tree (Fig. 2A).
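For readers unfamiliar with how a pairwise identity value is derived, the following sketch computes percent identity from two already-aligned sequences. It is an illustration only and is not the BLAST implementation used here; the function name `percent_identity` and the toy alignments are hypothetical, and the sketch assumes identity is counted over the full alignment length, gap columns included.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two aligned sequences of equal length.

    Gaps are written as '-'. A column counts as a match only when both
    residues are identical and neither is a gap. The denominator is the
    alignment length (gap columns included), which is an assumption here.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignments (hypothetical sequences, for illustration only).
nt_example = percent_identity("ACGTACGTAC", "ACGTACGTAT")  # 9 of 10 columns match
gap_example = percent_identity("AC-GT", "ACAGT")           # 4 of 5 columns match
```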
Fifty contigs longer than 500 nucleotides (nt) were assembled from 659,956 NGS reads. The BLAST analysis identified a 9,339 nt contig with high sequence identity (99.0, 98.9, and 98.8%) to the nt sequences of three known ValMV isolates (accession nos. FJ618540.1, EF441726.1, and EF507688, respectively). The novel contig consisted of 5,273 NGS reads (mean coverage, 70.15). Five other contigs (749, 912, 1,058, 1,181, and 2,152 nt) showed high sequence identity (94.8, 92.9, 89.9, 93.7, and 94.6%, respectively) to the NLSYV nt sequence (accession no. JQ326210.1). Furthermore, 370,856 NGS reads mapped to the genome sequence (accession no. JQ326210.1) of NLSYV (mean coverage, 4,586.71). The remaining 44 contigs (518–2,394 nt) were not related to plant virus sequences.
The complete genome of ValMV (isolate ValMV-Nar) had 9,451 nt (ORF: 135–9,236 nt and 3,033 aa) (accession no. LC658681) excluding the 3′ poly(A) tail (Fig. 1). The genome structure was typical of potyviruses, and we identified PIPO via the GA7 motif at positions 2,808–2,815 nt (Fig. 1). Additionally, the NGS contig sequence of 9,339 nt matched the complete ValMV-Nar genome.
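The reported coordinates can be cross-checked with simple arithmetic. The sketch below assumes that the ORF range (135–9,236 nt) includes the stop codon, which is the reading under which the stated polyprotein length of 3,033 aa is recovered; the untranslated-region lengths it returns are derived under that assumption and are not values stated in the text. The function name `orf_summary` is hypothetical.

```python
def orf_summary(genome_len, orf_start, orf_end):
    """Derive region lengths from 1-based inclusive ORF coordinates.

    Assumes the ORF range includes the terminal stop codon, so the
    polyprotein length is (ORF length / 3) - 1.
    """
    orf_nt = orf_end - orf_start + 1
    if orf_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return {
        "five_prime_utr_nt": orf_start - 1,
        "orf_nt": orf_nt,
        "polyprotein_aa": orf_nt // 3 - 1,  # stop codon excluded
        "three_prime_utr_nt": genome_len - orf_end,
    }


# Values reported for ValMV-Nar: 9,451 nt genome, ORF at 135-9,236 nt.
summary = orf_summary(9451, 135, 9236)

# The motif span 2,808-2,815 nt covers 8 nt, i.e. one G plus seven A.
motif_len = 2815 - 2808 + 1
```

Under this assumption the ORF spans 9,102 nt, which corresponds to 3,034 codons including the stop codon and therefore to the reported 3,033 aa.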
Phylogenetic analysis based on the CP coding region showed that ValMV-Nar was in the same clade as known ValMV isolates and was closely related to OrMV (Fig. 2A). Phylogeny based on polyprotein sequences confirmed the close relationship between ValMV-Nar and OrMV (Fig. 2B).
Pairwise sequence comparisons (Supplementary Table 2) showed that the CP coding region of ValMV-Nar had >98% nt and aa sequence identity with known ValMV isolates, but shared low identity (61.5–71.9% nt and 57.5–76.8% aa sequence identities) with other potyviruses. The CP aa sequences were highly conserved between ValMV-Nar and known ValMV isolates, although the aa at position 59 was asparagine in ValMV-Nar only and threonine in the other isolates. The ValMV-Nar polyprotein shared 58.1–65.9% nt and 45.3–66.3% aa sequence identity with other potyviruses. Moreover, ValMV-Nar sequence identity with other potyviruses was 58.1–65.4% nt for the whole genome, 49.5–53.4% nt for the P1 coding region, and 51.9–68.9% nt for the other eight mature protein regions. For all 10 mature protein regions and the polyprotein, ValMV-Nar had the highest sequence identity with OrMV.
This study characterized ValMV based on the complete genome sequence, whereas prior studies had used only partial sequences [8, 9]. Species demarcation criteria in Potyviridae are as follows: (i) <76% nt and <82% aa sequence identity in the viral polyprotein, (ii) <76–77% nt and <80% aa sequence identity in the CP coding region, or (iii) <58% nt sequence identity in the P1 coding region and <74–78% nt sequence identity in other protein coding regions [5]. Therefore, nt and aa sequence identities are above the species demarcation threshold when examining CP coding regions of ValMV-Nar and other isolates [8, 9], but below that threshold when examining complete genomes and individual protein regions of ValMV-Nar and other potyviruses (Supplementary Table 2). Moreover, phylogenetic analysis showed that ValMV-Nar was grouped with other potyviruses (Fig. 2). The available data indicate that ValMV-Nar belongs to the same species as existing ValMV isolates, and that ValMV is a distinct species in Potyvirus.
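The polyprotein and CP criteria quoted above can be expressed as a simple threshold check. The following is a minimal sketch rather than an official classification tool: the function names are hypothetical, and where the criteria give a range (for example <76–77% nt), the lower bound is used as an assumption. The value 98.0 stands in for the reported ">98%".

```python
# Thresholds taken from the criteria quoted in the text [5]. Where a range
# is given (e.g. "<76-77%"), the lower bound is used here as an assumption.
POLYPROTEIN_NT, POLYPROTEIN_AA = 76.0, 82.0
CP_NT, CP_AA = 76.0, 80.0


def below_polyprotein_threshold(nt_identity, aa_identity):
    """True if both identities fall below the polyprotein thresholds."""
    return nt_identity < POLYPROTEIN_NT and aa_identity < POLYPROTEIN_AA


def below_cp_threshold(nt_identity, aa_identity):
    """True if both identities fall below the CP thresholds."""
    return nt_identity < CP_NT and aa_identity < CP_AA


# Highest identities reported between ValMV-Nar and other potyviruses.
polyprotein_vs_others = below_polyprotein_threshold(65.9, 66.3)
cp_vs_others = below_cp_threshold(71.9, 76.8)

# CP identity with known ValMV isolates (98.0 stands in for ">98%").
cp_vs_valmv = below_cp_threshold(98.0, 98.0)
```

With the reported values, comparisons against other potyviruses fall below the thresholds (distinct species), whereas the CP comparison against known ValMV isolates does not (same species).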
This is the first report of the complete genome sequence of ValMV, and the first to detect ValMV in a Narcissus sp. However, because the analysed plant was co-infected with NLSYV, we cannot ascertain which symptoms were caused by ValMV infection alone. We therefore recommend further research into the specific disease symptoms attributable to ValMV in the narcissus plant.