Identification and Molecular Characterization of a Novel Non-typical Carlavirus Infecting Rose Plants (Rosa chinensis Jacq.)


In the present study, systematic screening of public transcriptomic data, followed by analysis of a shotgun assembly from Rosa chinensis Jacq., revealed a sequence of 8,332 nucleotides (nt) representing a potential novel virus, tentatively named rose virus C (RVC). The incidence of RVC in the rose plants surveyed was low (5.4%, 5 out of 92). The complete nucleotide sequence of RVC was determined by Sanger sequencing. The genomic RNA of RVC consists of 8,386 nt, excluding the 3′-poly(A) tail, and contains five definite open reading frames (ORFs). BLASTN analysis revealed that RVC had 67.3-71.3% nt identity with carlaviruses, with a maximum coverage of 19%. Phylogenetic analysis showed that RVC clustered with the carlaviruses and had 48.3-50.0% nt sequence identity with them based on its full-length genome. The replication protein showed 48.7-51.0% nt sequence identity with other carlaviruses, while the coat protein showed 39.5-45.6% nt identity, well below the species demarcation criterion of 72%. Together, these results indicate that RVC is a distinct carlavirus. Overall, our data suggest that RVC is a novel non-typical virus species of the genus Carlavirus.


Background
Rose (Rosa chinensis Jacq.) is an important ornamental plant that is highly popular in the cut-flower industry. Because roses are propagated vegetatively, mainly by grafting, they have accumulated many viruses over millennia.
More than 24 viruses have been reported to infect rose plants worldwide; of these, only six occur in China: apple mosaic virus, arabis mosaic virus, apple stem grooving virus, blackberry chlorotic ringspot virus, prunus necrotic ringspot virus, and rose leaf rosette-associated virus.
Viruses in the genus Carlavirus, family Betaflexiviridae, contain a single positive-sense linear RNA molecule with six open reading frames (ORFs). ORF1 encodes a polypeptide assumed to be the viral replicase. ORFs 2-4 encode the triple gene block (TGB) proteins, which mediate cell-to-cell and long-distance movement. The coat protein (CP) is encoded by ORF5, while ORF6 encodes a cysteine-rich RNA-binding protein (RBP) that may facilitate aphid transmission or be involved in host gene transcription/silencing and/or viral RNA replication (Adams et al. 2004; King et al. 2012).
RNA-seq is now widely used to identify differentially expressed genes in higher plants, which has made large volumes of data readily available in public databases (Wang et al., 2009). These datasets can therefore be mined to identify viruses and viroids using bioinformatics tools. In this paper, we identified a new carlavirus, rose virus C (RVC), in rose plants (Rosa chinensis Jacq.) using systematic BLAST-based data mining of public plant transcriptome data and Sanger sequencing. The complete genome sequence of RVC and its phylogenetic relationships are described.
To identify potential novel viruses/viroids, a systematic search for virus/viroid sequences in public plant transcriptome data was conducted using BLASTN. The sequence of the type member of each recognized virus/viroid species was used as a query against all transcriptome shotgun assemblies (TSA) in GenBank. Using this method, a contig of 8,332 nucleotides (nt) with distant homology to carlaviruses was identified in the TSA from RNA-Seq data of rose plants in China (BioProject PRJNA546486).
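The filtering part of such a screen can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the BLASTN search was run separately and its hits saved in the standard 12-column tabular format (`-outfmt 6`). The names `Hit`, `parse_outfmt6`, and `filter_hits`, the thresholds, and the example rows are all hypothetical.

```python
from typing import List, NamedTuple


class Hit(NamedTuple):
    query: str       # reference virus/viroid sequence used as query
    subject: str     # TSA contig identifier
    identity: float  # percent identity over the aligned region
    length: int      # alignment length in nt
    evalue: float


def parse_outfmt6(text: str) -> List[Hit]:
    """Parse BLAST tabular output with the 12 default columns of -outfmt 6."""
    hits = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        # Columns: qseqid sseqid pident length ... evalue (index 10) bitscore
        hits.append(Hit(cols[0], cols[1], float(cols[2]), int(cols[3]), float(cols[10])))
    return hits


def filter_hits(hits: List[Hit], max_evalue: float = 1e-5, min_length: int = 100) -> List[Hit]:
    """Keep hits passing illustrative (assumed) thresholds, best e-value first."""
    kept = [h for h in hits if h.evalue <= max_evalue and h.length >= min_length]
    return sorted(kept, key=lambda h: h.evalue)


# Made-up rows for illustration only; they are not results from the study.
EXAMPLE = "\n".join([
    "\t".join(["virusA", "contig_1", "70.1", "1600", "400", "20",
               "1", "1600", "10", "1609", "1e-50", "300"]),
    "\t".join(["virusB", "contig_2", "85.0", "40", "6", "0",
               "1", "40", "5", "44", "0.5", "30"]),
])

candidates = filter_hits(parse_outfmt6(EXAMPLE))
```

Contigs that survive such a filter would still need manual inspection, since a distant hit covering a small part of a contig (as with the 19% coverage reported for RVC) is only a starting point for confirmation by sequencing.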
Based on the contig sequence, two specific primer sets, RVC-586F/-586R and RVC-699F/-699R, were designed to confirm the presence of RVC under natural conditions. Primers RVC-624R/-416R and RVC-7668F/-7938F were designed for 5′ RACE and 3′ RACE, respectively, using the SMARTer RACE 5′/3′ Kit (Clontech, USA) according to the kit instructions. The specific primer set RVC-F1/-R1 was used to amplify the nearly full-length genome of RVC from the rose plant sample (Supplementary Fig. 1). All primers used in this study are listed in the supplementary material.
In the present study, a total of 92 rose plants were analyzed, of which 5 (5.4%) tested positive for RVC. However, no association was found between RVC and virus-like symptoms in rose plants. The RVC genomic RNA isolated from the rose plant was 8,386 nt long, excluding the 3′-poly(A) tail (Fig. 1). The RVC genome contains five definite ORFs, encoding the replication protein, the TGB proteins, and the CP, whereas no RBP was predicted (Fig. 1). In genome organization, RVC therefore appears closer to a foveavirus or robigovirus than to a typical carlavirus, which has six ORFs. The 5′ and 3′ untranslated regions (UTRs) are 57 nt and 94 nt long, respectively (Fig. 1).
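How ORFs can be located in a genomic sequence is illustrated by the minimal sketch below. The ORF prediction tool used in the study is not stated, so this is only an assumed, simplified approach: it scans the three forward reading frames for an ATG followed by an in-frame stop codon of the standard genetic code. The function name `find_orfs` and the `min_codons` cutoff are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def find_orfs(seq: str, min_codons: int = 30):
    """Return (start, end, frame) tuples for ATG...stop ORFs on the forward strand.

    Coordinates are 0-based and end-exclusive; the end includes the stop codon.
    min_codons counts codons from ATG up to, but not including, the stop codon.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens an ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


# Toy sequence for illustration only (not part of the RVC genome).
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
toy_orfs = find_orfs(toy, min_codons=3)
```

As a consistency check on the reported figures, a genome of 8,386 nt with a 57-nt 5′ UTR and a 94-nt 3′ UTR leaves 8,386 − 57 − 94 = 8,235 nt between the two UTRs for the five ORFs and any intergenic regions.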
Phylogenetic analysis showed that the full genome of RVC clustered with the carlaviruses, with 48.3-50.0% nt sequence identity, as determined with MEGA version 6 (Tamura et al. 2013) using the neighbor-joining method (Saitou and Nei 1987) with 1,000 bootstrap replicates (Fig. 2). This clustering was confirmed when the replication protein was analyzed (Supplementary Fig. 2). On the other hand, RVC might be considered to represent a new genus, since it formed a separate group when the CP was used in the analysis (Supplementary Fig. 3). Here (Martelli et al., 2007). The CP (867 nt, 288 aa, 31.7 kDa) showed relatively low nt sequence identity (39.5-45.6%) with known carlaviruses; the highest identity was with red clover carlavirus 1 (45.6%, accession no. MG596239) (Table 1). The sequence identities of the replication protein and the CP with other carlaviruses were much lower than the species demarcation criterion of 72% set by the ninth report of the International Committee on Taxonomy of Viruses (King et al., 2011). Overall, our data suggest that RVC is an atypical novel virus species of the genus Carlavirus.
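The comparison against the demarcation criterion can be made concrete with a short sketch. It does not reproduce the MEGA 6 calculation used in the study; it assumes two sequences that have already been aligned to equal length and skips any column containing a gap, which is one common convention among several. The names `percent_identity` and `below_species_threshold` are hypothetical.

```python
SPECIES_THRESHOLD = 72.0  # nt identity criterion (%) cited in the text


def percent_identity(a: str, b: str) -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped (an assumed
    convention; other tools may count gapped columns as mismatches).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def below_species_threshold(identity: float) -> bool:
    """True if the identity falls below the 72% species demarcation criterion."""
    return identity < SPECIES_THRESHOLD


# Toy alignment for illustration only (not RVC data): 4 matches in 5 columns.
toy_identity = percent_identity("ACGT-A", "ACCTTA")
```

Applied to the values reported above, the highest CP identity (45.6%) and the highest replication protein identity (51.0%) both fall below the 72% criterion.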
