Complete genome sequence of a tentative novel capillovirus isolated from Gerbera jamesonii

A virus tentatively named gerbera virus A (GeVA) has been shown to be a novel capillovirus with a complete genome of 6929 nucleotides (nt) (GenBank accession no. OM525829.1). GeVA was detected in Gerbera jamesonii using high-throughput RNA sequencing analysis. The GeVA genome is a single linear RNA with two open reading frames (ORFs), similar to those of other capilloviruses. The larger ORF encodes a polyprotein containing four domains, while the smaller ORF encodes a movement protein. The complete genome had 41.0–54.9% nt sequence identity to those of other capilloviruses, while the polyprotein and the movement protein had 26.5–36.4% and 13.1–32.2% amino acid (aa) sequence identity, respectively. Two UUAGGU promoters for subgenomic RNA (sgRNA) transcription were also identified in this study. BLAST analysis demonstrated that the GeVA genome shared the highest sequence similarity with rubber tree capillovirus 1 (MN047299.1) (complete nucleotide sequence identity, 68.54%; polyprotein amino acid sequence identity, 44.53%). Phylogenetic analysis based on complete genome and replication protein sequences placed GeVA alongside other members of the genus Capillovirus in the family Betaflexiviridae. These data suggest that GeVA is a new member of the genus Capillovirus.

Gerbera plants (Gerbera spp.) are exceptionally important cut flowers in the current global floricultural trade. They are also widely grown in many countries as flowering potted or garden plants [1]. Viruses of several species have been reported in gerbera plants worldwide, including chrysanthemum stem necrosis virus (CSNV), cucumber mosaic virus (CMV), impatiens necrotic spot virus (INSV), tobacco rattle virus (TRV), tobacco ringspot virus (TRSV), and tomato spotted wilt virus (TSWV) [2][3][4][5][6][7]. These viruses are known to inhibit growth in gerbera, with symptoms such as deformity, necrosis, and ringspot on leaves. CMV has also been reported to cause color breaking and deformation in flowers. However, capilloviruses have not previously been identified in gerbera plants.
Capilloviruses, which are members of the family Betaflexiviridae, possess flexuous filamentous particles 640-700 nm in length and a 6.5- to 7.4-kb linear positive-sense ssRNA genome. The genomic RNA has two open reading frames (ORFs) encoding a large replication-associated protein fused with the coat protein (CP) and a putative movement protein (MP), and it is polyadenylated at its 3' end [8]. Currently, the genus Capillovirus includes five virus species (Apple stem grooving virus, Cherry virus A, Currant virus A, Mume virus A, and Capillovirus TRV1). Members of this genus have mainly been found in woody plants, but a strain of apple stem grooving virus (ASGV) has been reported in lily plants [9]. So far, no vectors of any species have been reported for viruses of the genus Capillovirus [10].
In August 2019, leaf samples were collected from 32 gerbera plants in seven greenhouses in Gyeongsangbuk-do, South Korea. In each greenhouse, the plants were visually inspected, and 2-7 samples showing viral symptoms such as chlorosis, chlorotic mottle, ringspot, leaf distortion, and yellowing were collected. To identify . About 200 million total reads were generated from about 20 Gbp of raw data. The reads were assembled de novo into transcript contigs using Trinity software as described previously [11]. As a result, 163,814 contigs were assembled and subjected to BLASTx searches against the non-redundant NCBI protein database. These revealed four TSWV-like contigs, a badnavirus-like contig, and two partitivirus-like contigs in addition to a tentative novel capillovirus-like contig of 6908 nt, assembled from 5286 reads, that shared 44.62% amino acid (aa) sequence identity (with 93% query coverage) with rubber tree virus 1 (RTV1) (GenBank accession no. QGR26011). To confirm the presence of the virus-like contigs, total RNA was extracted from each sample (n = 32) using an easy-spin™ Total RNA Extraction Kit (iNtRON Bio, Seongnam, Korea). RT-PCR was then performed using primer pairs specifically designed based on the acquired contigs. TSWV was detected in 11 of the 32 samples, which showed ringspot and leaf distortion, but badnaviruses and partitiviruses were not detected (data not shown). The novel capillovirus was detected without other viruses in a leaf sample showing chlorotic mottle (Fig. 1). Total RNA from this sample was used for synthesis of cDNA using Oligo dT(18) as a primer, after which PCR was performed using nine pairs of primers designed based on the contig sequence (  , were sap inoculated in triplicate, using the capillovirus-infected gerbera leaf as inoculum. All of the plants were observed for 28 days after inoculation, but no distinct symptoms, including chlorotic mottle, were observed.
RT-PCR assays using the upper leaf of each inoculated plant showed that the gerbera plant was the only indicator plant that gave a positive reaction. All replicates gave the same result. The PCR product from the positive plant was sequenced and confirmed to be identical to the corresponding sequence from the novel capillovirus. Therefore, it was confirmed that the novel capillovirus originated from the collected sample, has gerbera as a host, and is mechanically transmissible to gerbera. Although it could not be concluded that the capillovirus was the direct cause of chlorotic mottle, chlorotic mottle was observed in a sample in which monoinfection with the novel capillovirus was confirmed by RT-PCR and HTS, but not in other gerberas in the same environment. Taken together, these results suggest that chlorotic mottle occurs
The complete genome of the virus consists of 6929 nucleotides (nt) with 40.68% G/C content (Fig. 2). Using NCBI ORF Finder, it was predicted to contain two ORFs, like other known capilloviruses [8]. The 5' and 3' untranslated regions (UTRs) are 37 and 357 nt in length, respectively. The complete genome shares 68.54% nt sequence identity (with 22% query coverage) with rubber tree capillovirus 1 (MN047299.1). ORF1 (nt 38-6532) encodes a large replication-associated protein fused to a coat protein, which is 2164 aa long. It shares 44.53% and 43.25% aa sequence identity (99% and 76% query coverage) with the polyproteins of rubber tree capillovirus 1 (QGR26011.1) and currant virus A (YP_009229912.1), respectively. ORF2 encodes a putative movement protein, 425 aa in length, which shares 40.00% and 41.85% aa sequence identity (97% and 52% query coverage) with the movement proteins of rubber tree capillovirus 1 (QGR26012.1) and mume virus A (QIM55854.1), respectively.
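The relationship between the reported ORF1 coordinates, the 5' UTR length, and the polyprotein length can be checked with simple arithmetic. The following is a minimal sketch, assuming 1-based inclusive coordinates and an ORF that ends with a stop codon; the function names are illustrative and are not taken from the study, and the short sequence passed to the G/C helper would be a stand-in for the real genome (OM525829.1), which is not reproduced here.

```python
def orf_protein_length(start: int, end: int) -> int:
    """Return the number of amino acids encoded by an ORF.

    Coordinates are 1-based and inclusive. The ORF is assumed to end with a
    stop codon, which does not contribute an amino acid.
    """
    n_nt = end - start + 1
    if n_nt <= 0 or n_nt % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    return n_nt // 3 - 1


def gc_content(seq: str) -> float:
    """Return the G/C percentage of a nucleotide sequence (DNA or RNA)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


# ORF1 coordinates as reported in the text (nt 38-6532).
ORF1_START, ORF1_END = 38, 6532

orf1_aa = orf_protein_length(ORF1_START, ORF1_END)  # 2164 aa
five_prime_utr = ORF1_START - 1                      # 37 nt before the ORF1 start
```

With the coordinates given in the text, this reproduces the reported polyprotein length of 2164 aa and the 37-nt 5' UTR.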
A Simple Modular Architecture Research Tool (SMART) analysis revealed that the ORF1-encoded polyprotein contains four domains: methyltransferase (Mtr, Pfam01660; nt 161-1141), helicase (Hel, Pfam01443; nt 2669-3154), RNA-dependent RNA polymerase (RdRP, Pfam00978; nt 3527-4579), and coat protein (CP, Pfam05892; nt 5978-6520). In contrast, ORF2 contains a single domain: movement protein (MP, Pfam01107; nt 4918-5475). The CP cistron of members of the genus Capillovirus is located at the C-terminal end of ORF1, and ORF2 (MP) is nested within ORF1 [8]. In this virus, the CP cistron is also located at the C-terminal end of ORF1, and it shares 62.33% aa sequence identity (at 99% query coverage) with the CP cistron of rubber tree capillovirus 1 (QGR26011.1). Putative promoter sequences (UUAGGU) resembling those that direct subgenomic RNA (sgRNA) transcription of other capilloviruses [12] were found at the 5' termini of the MP and CP domains (nt 4841 and nt 5810, respectively). Thus, the MP and CP of this virus are expected to be expressed through sgRNA transcription, as has been observed with other capilloviruses [13].
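A search for the putative sgRNA promoter motif can be illustrated with a short script. This is only a sketch of one way to locate UUAGGU hexamers, not the procedure used in the study; the function name and the toy sequence are hypothetical, and the real genome sequence (OM525829.1) would have to be supplied by the reader.

```python
import re


def find_motif(genome: str, motif: str = "UUAGGU") -> list[int]:
    """Return the 1-based start position of every motif occurrence.

    Both inputs are upper-cased and converted to RNA (T -> U) so that cDNA
    and RNA sequences are treated alike. A lookahead is used so that
    overlapping occurrences are reported as well.
    """
    genome = genome.upper().replace("T", "U")
    motif = motif.upper().replace("T", "U")
    pattern = re.compile(f"(?={re.escape(motif)})")
    return [match.start() + 1 for match in pattern.finditer(genome)]


# Toy sequence for illustration only; it is not part of the GeVA genome.
toy_sequence = "ACGUUAGGUACCTTAGGTAA"
positions = find_motif(toy_sequence)  # [4, 13]
```

Applied to the full genome, the returned positions could then be compared with the start coordinates of the MP and CP domains to see which hits lie immediately upstream of them.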
A comparison of the novel capillovirus, which was tentatively named "gerbera virus A" (GeVA), and 41 members of 10 genera in the family Betaflexiviridae (Supplementary Table S2) was performed using Clustal Omega 1.2.4 (https://www.ebi.ac.uk/Tools/msa/clustalo/). The analysis revealed a nucleotide (nt) sequence identity range of 41.0-54.9% for the complete genome, 41.9-55.0% for the replicase (polyprotein) coding region, and 35.7-54.1% for the movement protein (MP) coding region. The replicase showed an amino acid (aa) sequence identity range of 26.5-36.4%, while the MP showed an aa sequence identity range of 13.1-32.2%. A phylogenetic tree constructed based on these sequences showed that GeVA clustered with members of the genus Capillovirus (Fig. 3). According to the International Committee on Taxonomy of Viruses (ICTV) species demarcation criteria for the genus Capillovirus, a new species classification is made when there is a difference in natural host range or serological specificity or when there is less than 72% nt sequence identity in the CP or polymerase genes or 80% aa sequence identity in the encoded proteins [8]. Recently, new species demarcation criteria were proposed for the family Betaflexiviridae [14], including the genus Capillovirus, to reduce ambiguity in species classification for betaflexiviruses in cases where species boundaries are unclear. Based on these criteria, GeVA is proposed to be a new member of the genus Capillovirus because it has less than 80% aa sequence identity to other capilloviruses in the replicase protein, shares the same genome organization with other capilloviruses, and is nested among them in phylogenetic analysis.
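The sequence-based part of the demarcation criteria quoted above amounts to a threshold comparison, sketched below. The thresholds (72% nt, 80% aa) and the identity values are those given in the text; the function and variable names are illustrative assumptions. Host range, serology, genome organization, and phylogenetic placement are also part of the criteria and are not captured by this check.

```python
NT_THRESHOLD = 72.0  # % nt identity in the CP or polymerase genes (from the text)
AA_THRESHOLD = 80.0  # % aa identity in the encoded proteins (from the text)


def is_below_threshold(max_identity: float, threshold: float) -> bool:
    """Return True if the highest identity to any known species is below the threshold."""
    if not 0.0 <= max_identity <= 100.0:
        raise ValueError("identity must be a percentage between 0 and 100")
    return max_identity < threshold


# Highest aa identities reported in the text for GeVA.
replicase_aa_max = 36.4  # upper end of the 26.5-36.4% range (Clustal Omega)
cp_aa_max = 62.33        # CP vs. rubber tree capillovirus 1 (BLAST)

replicase_is_distinct = is_below_threshold(replicase_aa_max, AA_THRESHOLD)
cp_is_distinct = is_below_threshold(cp_aa_max, AA_THRESHOLD)
```

Both reported values fall below the 80% aa threshold, which is consistent with the proposal that GeVA represents a distinct species.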
Fig. 3 Phylogenetic analysis of gerbera virus A (GeVA) and 41 members of the family Betaflexiviridae. The phylogenetic tree was constructed using amino acid sequences of the complete polyprotein. The analysis was performed by the maximum-likelihood method in MEGA software version 11.0.10, using the Jones-Taylor-Thornton (JTT) model and 1000 bootstrap replicates. All viral sequences were obtained from the NCBI database, and GeVA is indicated by a red arrow.
Author contributions All authors contributed to the study's conception and design. Material preparation, data collection, and analysis were performed by Sangmin Bak, San Yeong Kim, Minhui Kim, and Wonyoung Jeong. The first draft of the manuscript was written by Sangmin Bak, and all authors provided comments on this and subsequent drafts. San Yeong Kim supervised the research and all related work. All of the authors have read and approved the final manuscript.
Funding This work was supported by the Korea Institute of Planning and Evaluation for Technology in Food, Agriculture and Forestry (IPET) through the Advanced Production Technology Development Program, funded by the Ministry of Agriculture, Food and Rural Affairs (MAFRA) (grant no. 315002-5).

Data availability
The complete genome sequence of gerbera virus A (GeVA) was deposited in the GenBank database under the accession number OM525829.1.