Complete genome sequence of mimosa mosaic virus, a new sobemovirus infecting Mimosa sensitiva L.

A new sobemovirus, which we have named “mimosa mosaic virus” (MimMV), was found by high-throughput sequencing and isolated from a mimosa (Mimosa sensitiva L.) plant. The genome sequence was confirmed by Sanger sequencing and comprises 4595 nucleotides. Comparison of the predicted amino acid (aa) sequences of the P2b protein (encoded by ORF2b) and the coat protein showed 52.7% and 31.8% aa sequence identity, respectively, to those of blueberry shoestring virus, the closest relative of MimMV. The complete genome sequence of MimMV was less than 47% identical to those of other sobemoviruses. These data suggest that MimMV is a member of a new species in the genus Sobemovirus, for which the binomial name “Sobemovirus mimosae” is proposed.

Mimosa (Mimosa sensitiva L.) is a wild plant belonging to the family Fabaceae, subfamily Mimosoideae. In July 2015, we collected mimosa leaf samples showing virus-like symptoms (Fig. 1B) in Santo Antônio do Tauá in the state of Pará, Brazil. The collected samples were analyzed by high-throughput sequencing (HTS). Samples were first processed using a viral semipurification procedure [1], and total RNA was extracted from the semipurified preparation using a ZR Plant RNA MiniPrep Kit (Zymo Research, Irvine, USA). The RNA extracted from mimosa plants was added to a pooled sample as part of a larger effort to find new viruses in several plant species. (In addition to mimosa, this pooled sample contained RNA from semipurified virus preparations from cowpea, black pepper, and patchouli.) rRNA was then removed from the pooled sample using a Ribo-Zero kit for plants (Illumina, San Diego, USA), and a cDNA library was prepared and sequenced with 100-bp paired-end reads on a NovaSeq system (Illumina) at Macrogen Inc. (Seoul, Korea).
The HTS reads were trimmed, and contigs were obtained by de novo assembly using SPAdes 4.2 [2]. The resulting contigs were analyzed with tBLASTx against the virus genome database using Geneious v.8.1.9 (Biomatters Ltd, Auckland, New Zealand). Starting from the contigs related to sobemovirus sequences, longer contigs were assembled by read mapping in Geneious. The longest assembled contig contained 4,448 nucleotides (nt), to which 4,999,563 reads (out of 35,191,002 total reads) were mapped, with a mean coverage of 105,684.7. The specific primers SobemoR and SobemoF (Supplementary Table S1) were designed to amplify the RNA-dependent RNA polymerase (RdRp) gene region in order to detect sobemovirus sequences in the individual mimosa samples used for the HTS analysis. To sequence the whole genome of the sobemovirus by the Sanger method, cDNA fragments were amplified by RT-PCR from RNA extracted from a single mimosa plant, using specific primers (Supplementary Table S1) designed based on the HTS analysis. 5′- and 3′-RACE were performed as described previously [3]. The cDNA fragments were sequenced by the Sanger method.
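As a rough plausibility check on the coverage figure above, mean depth can be approximated as the total number of mapped bases divided by the contig length. The sketch below is not part of the original analysis (Geneious reports coverage directly); the function name is hypothetical, and the nominal read length of 100 nt is an assumption that ignores trimming, so the first value is an upper bound of about 112,000. Working backwards from the reported mean coverage gives an implied mean mapped read length of about 94 nt, which is consistent with trimmed 100-nt reads.

```python
def estimate_mean_depth(mapped_reads: int, mean_read_len: float, contig_len: int) -> float:
    """Approximate mean depth as total mapped bases divided by contig length."""
    if contig_len <= 0:
        raise ValueError("contig_len must be positive")
    return mapped_reads * mean_read_len / contig_len


# Figures taken from the text.
MAPPED_READS = 4_999_563
CONTIG_LEN = 4_448
REPORTED_MEAN_COVERAGE = 105_684.7

# Assumption: every mapped read is a full-length (untrimmed) 100-nt read.
upper_bound = estimate_mean_depth(MAPPED_READS, 100, CONTIG_LEN)

# Mean mapped read length implied by the reported coverage.
implied_read_len = REPORTED_MEAN_COVERAGE * CONTIG_LEN / MAPPED_READS

print(f"upper-bound depth: {upper_bound:,.1f}")
print(f"implied mean mapped read length: {implied_read_len:.1f} nt")
```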
The complete genome of the sobemovirus, which we have named “mimosa mosaic virus” (MimMV), contains 4595 nt (GenBank accession no. OP456085) as determined by Sanger sequencing. The genomic organization of MimMV is typical of members of the genus Sobemovirus.
Handling Editor: Ralf Georg Dietzgen.
In other members of the genus, P2ab is expressed through a (-1) frameshift at a slippery sequence, UUU AAA C, which is followed by a stem-loop located seven nucleotides downstream of this site [9]. The frameshift site of MimMV is likely located at position 1909 in its genome. Using RNAfold [10], a stem-loop was predicted four nucleotides downstream of the frameshift site of MimMV (Fig. 1A).
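Locating a candidate frameshift site amounts to scanning the genome for the slippery heptanucleotide and then extracting the downstream region that would be submitted to a structure predictor such as RNAfold. The following is a minimal sketch under stated assumptions: the function names and the 60-nt window are hypothetical, the toy sequence is made up and is not the MimMV genome, and positions are reported 1-based to match the convention used in the text.

```python
import re

SLIPPERY_MOTIF = "UUUAAAC"  # slippery sequence described in the text


def find_slippery_sites(genome: str, motif: str = SLIPPERY_MOTIF) -> list[int]:
    """Return 1-based start positions of every motif occurrence (overlaps allowed)."""
    rna = genome.upper().replace("T", "U")  # accept DNA or RNA input
    lookahead = re.compile(f"(?={re.escape(motif)})")
    return [m.start() + 1 for m in lookahead.finditer(rna)]


def downstream_region(genome: str, site: int, motif_len: int = 7, window: int = 60) -> str:
    """Return up to `window` nt immediately after the motif that starts at 1-based `site`."""
    rna = genome.upper().replace("T", "U")
    start = site - 1 + motif_len
    return rna[start:start + window]


# Made-up toy sequence for illustration only.
toy_genome = "GGAUUUAAACGGGCCCAUGC"
sites = find_slippery_sites(toy_genome)
for site in sites:
    print(site, downstream_region(toy_genome, site))
```

The extracted downstream region could then be passed to a folding program to look for a stem-loop; that step is not shown here.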
P2b is a putative RdRp and contains a conserved GDD motif (SGSYCTSSTNX19–35GDD). ORF3 encodes a putative coat protein (CP). The motif ACAAA is located at position 3397 in the genome and may constitute the start site of the subgenomic RNA [4]. For recombination analysis, the complete genome sequence of MimMV and those of 21 sobemoviruses (20 of which have been assigned to a species and one for which a new species has been proposed) were analyzed using the RDP4 program [11], but no evidence of recombination was found for this isolate.
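The conserved RdRp motif can be expressed as a regular expression and searched for in a translated ORF. This sketch assumes that "X19–35" in the motif notation denotes a spacer of 19 to 35 arbitrary residues; the function name is hypothetical and the protein sequences are made up for illustration.

```python
import re

# SGSYCTSSTN, then a spacer of 19-35 residues (assumed reading of "X19-35"), then GDD.
RDRP_MOTIF = re.compile(r"SGSYCTSSTN([A-Z]{19,35}?)GDD")


def find_rdrp_motif(protein: str):
    """Return (1-based start, spacer length) of the first motif match, or None."""
    match = RDRP_MOTIF.search(protein.upper())
    if match is None:
        return None
    return match.start() + 1, len(match.group(1))


# Made-up sequences: one with a 20-residue spacer, one with a spacer that is too short.
toy_hit = "MK" + "SGSYCTSSTN" + "A" * 20 + "GDD" + "LL"
toy_miss = "MK" + "SGSYCTSSTN" + "A" * 10 + "GDD" + "LL"

print(find_rdrp_motif(toy_hit))
print(find_rdrp_motif(toy_miss))
```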
The MimMV genome sequence was compared to those of other sobemoviruses obtained from RefSeq. The sequences of the P2b and CP regions of the sobemoviruses were aligned using MAFFT version 7 (https://mafft.cbrc.jp/alignment/server), and the alignments were used for phylogenetic analysis. Phylogenetic trees were inferred using IQ-TREE2 [12] with the model Q.pfam+F+R4 for P2b and LG+F+I+G4 for CP, which were chosen using ModelFinder [13]. Phylogenetic analysis showed that blueberry shoestring virus (BBSSV) was the closest relative of MimMV when the P2b region was used (Fig. 2a), and a rubus isolate of sowbane mosaic virus (SoMV) was the closest when the CP region was used (Fig. 2b).
Pairwise sequence identity to sobemovirus sequences obtained from RefSeq was calculated using SDT v1.2. In the P2b region, BBSSV and velvet tobacco mottle virus (VTMoV) showed 52.7% and 43.3% aa sequence identity, respectively, to MimMV. In the CP region, BBSSV and the rubus isolate of SoMV showed 31.8% and 26.3% aa sequence identity, respectively, to MimMV. When complete genome sequences were compared, BBSSV was the closest relative of MimMV, with 46.8% nt sequence identity (Fig. 2c). Thus, even though BBSSV is the closest relative of MimMV within the genus Sobemovirus, the sequence identity between the two viruses is low at both the amino acid level (52.7% in P2b and 31.8% in CP) and the nucleotide level (46.8% for the whole genome). The demarcation criterion for establishing a new species in the genus Sobemovirus is <75% nucleotide sequence identity in the complete genome sequence [14]. In conclusion, our results show that MimMV should be considered a member of a new species in the genus Sobemovirus, family Solemoviridae, and the binomial name “Sobemovirus mimosae” is suggested for this species.
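The species demarcation argument reduces to a pairwise identity calculation followed by a threshold test. The sketch below is illustrative only and does not reproduce the SDT implementation: it assumes two already-aligned sequences of equal length and uses one common convention, counting identity only over columns in which neither sequence has a gap. The function names and the toy alignment are hypothetical; the 75% threshold and the 46.8% value are taken from the text.

```python
SPECIES_THRESHOLD = 75.0  # % nt identity over the complete genome, from the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def below_species_threshold(best_genome_identity: float,
                            threshold: float = SPECIES_THRESHOLD) -> bool:
    """True if the closest relative falls below the species demarcation threshold."""
    return best_genome_identity < threshold


# Made-up toy alignment: 4 matches over 5 ungapped columns.
toy_identity = percent_identity("ACGU-A", "ACGAUA")
print(toy_identity)

# Genome-wide identity to BBSSV reported in the text.
print(below_species_threshold(46.8))
```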