In China, wheat (Triticum aestivum L.) is mainly grown in the north, including Shandong, Anhui, Henan, and Jiangsu provinces [3, 15]. More than 50 viruses infect wheat, causing significant yield losses worldwide [12]. Barley yellow dwarf viruses (BYDVs), wheat dwarf virus (WDV), wheat yellow mosaic virus (WYMV), and Chinese wheat mosaic virus (CWMV) are the dominant viruses that damage Chinese wheat production [5, 7, 10]. Recent studies have reported several new insect-borne viruses that infect wheat, such as wheat leaf yellowing-associated virus (WLYaV), wheat yellow dwarf virus (WYDV), wheat yellow stunt-associated betaflexivirus (WYSaBV), and wheat yellow striate virus (WYSV) [4, 6, 9, 14].
In April 2021, wheat plants (cultivar Jimai 22) with yellow stripes on their leaves were observed in Tai’an City, Shandong province, China (Fig 1A). According to a field survey, similar symptoms were present on approximately 6% (n ≈ 150,000) of wheat plants. To identify the causal agent, leaf tissue showing typical symptoms was collected and analyzed by high-throughput sequencing (Novogene Bio, Tianjin, China). Total RNA was extracted with TRIzol (Invitrogen, Carlsbad, USA), reverse transcribed to cDNA, and made into a sequencing library using a TruSeq RNA Sample Preparation Kit (Illumina Inc., San Diego, CA, USA). The library was sequenced on the Illumina NovaSeq platform (Novogene Bioinformatic Technology). De novo assembly of 52,095,565 clean paired-end reads generated 206,511 contigs (357 to 27,483 bp). After removal of wheat genome sequences, BLASTn and tBLASTx were used to identify contigs corresponding to plant virus sequences in the GenBank plant virus database. Two assembled contigs (774 nt and 2,782 nt) shared 97.3–98.3% identity with wheat leaf yellowing-associated virus (WLYaV, genus Polerovirus), aligning to nt 1,268–2,038 and nt 2,983–5,761 of its genome [14]. In addition, one assembled contig (5,614 nt) showed 86.4% identity to the genomic sequence of WYDV, suggesting the presence of a distinct polerovirus. None of the remaining contigs showed sequence similarity to the genomes of other plant viruses. Together, these RNA-seq results indicated coinfection of the wheat sample with WLYaV and an unidentified polerovirus.
To verify the presence of the two poleroviruses and to obtain their full-length sequences, we designed primer sets (Supplementary Table S1) to amplify two overlapping regions of each viral RNA, followed by cloning and assembly of the sequences. Reverse transcription PCR (RT-PCR) and rapid amplification of cDNA ends (RACE)-PCR were performed to acquire the complete nucleotide sequences. The complete sequences of WLYaV-TA (GenBank accession no. OM829808) and WYSaV-SD (GenBank accession no. OM829809) were deposited in GenBank. The WLYaV-TA genome comprises 5,763 nucleotides and shares 97.3% sequence identity with the WLYaV isolate JN-U3 (KY605226) [14].
The genomic sequence of the unknown virus, tentatively named “wheat yellow stripe-associated virus” (WYSaV), is 5,595 nt in length, with a 5′ untranslated region (UTR) of 58 nt and a 3′ UTR of 137 nt. A multiple sequence alignment showed that the complete WYSaV genome is 87.3% identical to that of WYDV isolate Henan (OK216142) [6]. According to ORF finder and comparison with other polerovirus sequences, the genome of WYSaV contains six open reading frames (ORFs) initiated by AUG and one ORF initiated by AUA (Fig 1B). ORF0 (nt 59–820) encodes a predicted 253-amino acid (aa), 28.5-kDa protein (P0), a viral suppressor of posttranscriptional gene silencing [1]. BLASTp analysis showed that WYSaV P0 has 61.3% sequence identity to WYDV P0. ORF1 (nt 204–2,174) encodes a predicted 656-aa, 71.6-kDa protein (P1). P1 contains a Peptidase S39 domain (CL0124) at positions 209–409 and has 82.3% sequence identity to WYDV P1. ORF1 and ORF2 (nt 1,586–3,448) encode a P1-P2 fusion protein (RdRp) via a -1 ribosomal frameshift [11]; this protein carries a Viral RNA-directed RNA-polymerase (RdRp4, CL0027) domain. ORF3a (nt 3,517–3,654, AUA start codon) is located between ORF2 and ORF4 and encodes a predicted 45-aa protein involved in long-distance viral movement, with 97.8% sequence identity to P3a of WYDV. ORF3 (nt 3,635–4,246) encodes a predicted 21.9-kDa protein (P3) with 91.1% sequence identity to P3 of WYDV. ORF4 (nt 3,657–4,118) encodes a predicted 17.4-kDa protein (P4) that contains a conserved Luteovirus putative VPg genome-linked protein domain (aa 52–145) and shares 87.6% sequence identity with WYDV P4. ORF3 and ORF5 (nt 4,244–5,458) encode a predicted 608-aa, 67.1-kDa P3-P5 fusion protein, which contains a putative readthrough protein (RTD) domain (aa 214–428) possibly involved in virus transmission [2] and has 89.0% aa sequence identity with that of WYDV.
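The relationship between the reported ORF coordinates, protein lengths, and UTR lengths can be checked with simple arithmetic. The following Python sketch is illustrative only and is not part of the original analysis. It assumes that coordinates are 1-based and inclusive and that each reported ORF end position includes the stop codon (an assumption that is consistent with the lengths given for P0, P1, and P3a). The function names `protein_length` and `utr_lengths` are hypothetical helpers introduced here.

```python
# Coordinates (1-based, inclusive) taken from the text for WYSaV.
GENOME_LENGTH = 5595
ORFS = {
    "ORF0": (59, 820),
    "ORF1": (204, 2174),
    "ORF3a": (3517, 3654),
}


def protein_length(start, end):
    """Return the number of amino acids encoded by an ORF.

    Assumption: the span start..end includes the stop codon,
    so one codon is subtracted from the total.
    """
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a multiple of three")
    return span // 3 - 1


def utr_lengths(genome_length, first_coding_nt, last_coding_nt):
    """Return (5' UTR length, 3' UTR length) in nucleotides."""
    return first_coding_nt - 1, genome_length - last_coding_nt


for name, (start, end) in ORFS.items():
    print(name, protein_length(start, end), "aa")

# First coding nucleotide is the ORF0 start; last is the ORF5 end.
print(utr_lengths(GENOME_LENGTH, 59, 5458))
```

Under these assumptions, the computed values agree with the 253-aa, 656-aa, and 45-aa predictions and with the 58-nt and 137-nt UTRs stated above.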
A phylogenetic tree was constructed from the complete genome sequences of representative polerovirus members using the neighbor-joining method implemented in MEGA7 with 1,000 bootstrap replicates [8]. The tree showed that WYSaV was closest to WYDV and clustered with CYDV-RPV, BYDV-GPV, and CYDV-RPS to form a distinct subgroup (Fig 2). Phylogenetic analysis also showed that the WYSaV RdRp and coat protein (CP) are most closely related to those of WYDV (Fig 2).
In summary, the amino acid sequence identities between WYSaV and WYDV isolate Henan were 61.3% for ORF0, 82.3% for ORF1, 88.7% for ORF1-2, 91.1% for ORF3, 97.8% for ORF3a, 87.6% for ORF4, and 89.0% for the fused ORF3-5 (Fig 1C). Therefore, with the exception of ORF3 and ORF3a, the identities of the deduced protein sequences, as well as the highest nucleotide identity for the genomic sequence (86.4%), between WYSaV and other poleroviruses are below the species demarcation cutoff (90%) for the genus Polerovirus [13]. Based on the distinctive polerovirus-like genome organization of WYSaV and its close relationship to WYDV, we conclude that WYSaV is a novel virus in the genus Polerovirus.
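The comparison against the species cutoff can be expressed as a short filter. The Python sketch below is only an illustration of that comparison, using the identity values and the 90% cutoff reported in the text. The names `IDENTITY_TO_WYDV` and `below_cutoff` are hypothetical and do not come from the original analysis.

```python
# Amino acid identities (%) between WYSaV and WYDV isolate Henan,
# as listed in the text.
SPECIES_CUTOFF = 90.0
IDENTITY_TO_WYDV = {
    "ORF0": 61.3,
    "ORF1": 82.3,
    "ORF1-2": 88.7,
    "ORF3": 91.1,
    "ORF3a": 97.8,
    "ORF4": 87.6,
    "ORF3-5": 89.0,
}


def below_cutoff(identities, cutoff=SPECIES_CUTOFF):
    """Return the gene products whose identity is below the cutoff."""
    return [name for name, pct in identities.items() if pct < cutoff]


below = below_cutoff(IDENTITY_TO_WYDV)
above = [name for name in IDENTITY_TO_WYDV if name not in below]
print("Below cutoff:", below)
print("At or above cutoff:", above)
```

With these values, only ORF3 and ORF3a are at or above the cutoff, which matches the exception noted in the summary.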