Pearl millet (Pennisetum glaucum (L.) R. Br.) is a staple food in Africa. In the face of climate change, this dryland cereal is expected to play a major role in food security in the near future. However, crop intensification often leads to the emergence of viral diseases. During the 1990s and 2000s, nearly ten viruses were reported on pearl millet: Johnsongrass mosaic virus, Maize dwarf mosaic virus and Sugarcane mosaic virus (Potyvirus, Potyviridae), Wheat streak mosaic virus (Tritimovirus, Potyviridae), Maize streak virus (Mastrevirus, Geminiviridae), Panicum mosaic virus (Panicovirus, Tombusviridae) and its associated Panicum mosaic satellite virus, Indian peanut clump virus (Pecluvirus, Virgaviridae) and Rice black streaked dwarf virus (Fijivirus, Reoviridae) [1, 2], but current epidemiological data and yield losses are poorly documented.
Here, we identified and characterized the full genome sequence of a novel marafivirus infecting pearl millet in Burkina Faso. In August 2018, pearl millet leaves showing small brown dots were collected in a farmer’s field at Nouna, Diébégou districts, Burkina Faso. Calcium chloride dihydrate (CaCl2) was used to dry the leaves. After collection, the leaves were wrapped in absorbent paper and placed in a 50 ml Falcon tube containing about 20 g of CaCl2. The tubes were immediately stored at 4°C for 48 hours. Finally, the samples were kept at room temperature until analysis with the virion-associated nucleic acid (VANA)-based metagenomic approach, as previously described [3]. Purified DNA libraries were sequenced on an Illumina HiSeq platform as 2 × 150 bp paired-end reads (Genewiz, South Plainfield, NJ, USA). Illumina reads were analyzed with a bioinformatic pipeline described previously [3]. Contigs produced by read assembly were used as queries in BLASTn and BLASTx searches [4] against the NCBI database (https://blast.ncbi.nlm.nih.gov/Blast.cgi). These searches revealed several Marafivirus-like contigs (Supplementary Table 1). Total RNA was then extracted using the GeneJET Plant RNA Purification Kit (Thermo Fisher Scientific, Waltham, MA, USA). cDNAs were synthesized with an oligo(dT) reverse primer and the MMLV-RT enzyme kit (200 u/µl) (Promega, Madison, WI, USA) following the manufacturer’s recommendations. Internal and terminal regions of the genome were reconstructed and validated using primer pairs designed from the Illumina sequences: Marafi-1662F/Marafi-2175R (~514 bp), Marafi-5974F/Marafi-polyT (~315 bp) and Marafi-1F/Marafi-411R (~411 bp), in independent PCR amplifications with Takara enzyme (5 u/µl) (Supplementary Table 1). To obtain the 3’ end sequence, a 3’ RACE PCR was performed as described in [5] using the primer combination Marafi-5974F/M4T/M4.
The resulting amplicon was then sequenced using primers Marafi-6110F, Marafi-6237F and M4 (Supplementary Table 1). The 5’ RACE PCR was performed using the Nanopore MinION cDNA-PCR sequencing method according to the Oxford Nanopore Technologies instructions (version: PCS_9085_v109_revK_14Aug2019). cDNAs were synthesized from total RNA using the reverse primer Marafi-411R in a first step and the Nanopore strand-switching primer SSP in a second step (Supplementary Table 1). Two independent PCRs were performed with primer pairs PR2/Marafi-159R (~159 bp) and PR2/Marafi-198R (~200 bp) (Supplementary Table 1). PCR products were directly Sanger sequenced using primers Marafi-159R and Marafi-198R, respectively. All Sanger sequences were edited and assembled with the Illumina contigs in Geneious v5.5.9 (https://www.geneious.com) to reconstruct the complete marafivirus genome. Open reading frames (ORFs) were detected using the NCBI ORF Finder tool (http://www.ncbi.nlm.nih.gov/gorf/gorf.html). Genome-wide pairwise identity comparisons between the pearl millet-derived marafivirus and representative sequences of the family Tymoviridae were performed using SDT v.1.2 [6]. Complete genome sequences and coat protein sequences were aligned using MUSCLE 3.7 with default settings [7], and phylogenetic trees were inferred using PhyML 3.1 [8] implemented in MEGAX software [9], with the GTR + G + I model for complete genome sequences and the LG + G + F model for coat protein sequences. Branch support was assessed with one thousand bootstrap replicates.
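The pairwise identity comparison mentioned above can be illustrated with a minimal Python sketch. It assumes two sequences that have already been aligned to equal length and scores identity only over columns where neither sequence has a gap. The function name and the toy sequences are hypothetical, and the scoring details of SDT itself may differ from this simplified version.

```python
def pairwise_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Illustrative only: a simplified stand-in, not the SDT implementation.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns with a residue in both sequences
    matches = 0   # compared columns where the residues agree
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # gapped columns are not scored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment (made-up sequences): 5 ungapped columns, 4 of them identical.
toy_identity = pairwise_identity("ACGT-A", "ACCTTA")
```

Excluding gapped columns keeps the score from being dominated by length differences between genomes, at the cost of ignoring indels entirely.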
The complete genome of the novel pearl millet virus was 6364 nt long, excluding the poly(A) tail. A large ORF was detected from nt 156 to 6218 (6063 nt), encoding a 2020-aa polyprotein (224.2 kDa) (Fig. 1A). Several domains were identified, including methyltransferase (MTR, 36–317 aa), papain-like protease (PRO, 722–817 aa), helicase (HEL, 913–1144 aa), RNA-dependent RNA polymerase (RdRp, 1448–1664 aa) and two forms of the same coat protein (CP, 1847–2019 aa) (Fig. 1A). The novel genome also contains a conserved “marafibox”, a 16-nt consensus sequence present in all known marafiviruses, from nt 5463 to 5478 (Fig. 1A and 1B). This motif differs from the marafiboxes of other Poaceae-infecting marafiviruses at its fifteenth nucleotide, which is a “C”, and differs from the “tymobox” at four nucleotide sites (Fig. 1B) [10, 11]. Furthermore, the conserved adenine (nt 5488) that marks the initiation site of CP sgRNA transcription was detected ~10 nt downstream of the marafibox, within the conserved CAA triplet (nt 5487–5489) (Fig. 1B) [11–13]. Two methionine codons were detected downstream of the marafibox at positions 5523 and 5631; they could be the putative start codons of a minor 25-kDa coat protein (CP1: 1790–2020 aa, 24.5 kDa) and a major 21-kDa coat protein (CP2: 1826–2020 aa, 20.8 kDa), respectively. Pairwise nucleotide comparisons showed that the complete genome shared 68.5% sequence identity with sorghum bicolor marafivirus USA (MN100128) [14] (Supplementary Table 3). The CP amino acid sequence shared 58.5% identity with oat blue dwarf virus (OBDV, U87832) and 56% identity with sorghum bicolor marafivirus USA (MN100128) (Supplementary Table 3). Phylogenetic trees confirmed that both the complete genome and the CP sequences of the novel virus isolated from pearl millet grouped with Marafivirus species (Fig. 2A and 2B).
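The coordinates reported above can be cross-checked with simple arithmetic, sketched here in Python. The sketch assumes 1-based nucleotide positions and that the ORF end position (nt 6218) includes the stop codon, which is an inference from the reported lengths rather than something stated in the text. All variable and function names are hypothetical.

```python
ORF_START = 156  # first nt of the polyprotein ORF (from the text)
ORF_END = 6218   # last nt of the ORF; assumed to include the stop codon


def codon_index(nt_pos: int, orf_start: int = ORF_START) -> int:
    """Return the 1-based amino acid position of the codon starting at nt_pos."""
    offset = nt_pos - orf_start
    if offset < 0 or offset % 3 != 0:
        raise ValueError("position is not an in-frame codon start")
    return offset // 3 + 1


orf_length_nt = ORF_END - ORF_START + 1  # 6063 nt
polyprotein_aa = orf_length_nt // 3 - 1  # 2020 aa once the stop codon is excluded
cp1_start_aa = codon_index(5523)         # putative CP1 start codon -> aa 1790
cp2_start_aa = codon_index(5631)         # putative CP2 start codon -> aa 1826
marafibox_len = 5478 - 5463 + 1          # 16 nt
```

Under these assumptions, both methionine codons are in frame with the polyprotein ORF, and their positions map to the CP1 and CP2 start residues given in the text.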
According to the ninth report of the International Committee on Taxonomy of Viruses (ICTV), a new virus species may be created in the genus Marafivirus if the overall sequence identity is less than 80% and the capsid protein sequences are less than 90% identical [10]. Our results are consistent with these demarcation criteria, since the identities reported above (68.5% for the complete genome and 56–58.5% for the CP) fall below both thresholds. In conclusion, the new virus, tentatively named pennisetum glaucum marafivirus (PGMV), could be considered a putative new species belonging to the genus Marafivirus, family Tymoviridae.
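The demarcation rule cited above reduces to a two-threshold test, sketched below in Python. The thresholds come from the text; the function name is hypothetical, and the sketch assumes that the identity values reported in the text (68.5% for the genome, 58.5% for the CP) are the highest observed against any known marafivirus.

```python
GENOME_IDENTITY_THRESHOLD = 80.0  # percent, overall sequence identity
CP_IDENTITY_THRESHOLD = 90.0      # percent, capsid protein identity


def meets_species_criteria(genome_identity: float, cp_identity: float) -> bool:
    """True if both identities fall below the demarcation thresholds."""
    return (genome_identity < GENOME_IDENTITY_THRESHOLD
            and cp_identity < CP_IDENTITY_THRESHOLD)


# Values reported in the text, assumed to be the closest matches.
pgmv_is_candidate = meets_species_criteria(genome_identity=68.5, cp_identity=58.5)
```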