Fig mosaic disease (FMD) is the most harmful disease of fig trees worldwide. It was first described in California in the early 1930s [1]. FMD is caused by multiple viruses and produces severe symptoms, including chlorotic mottling, leaf puckering, vein banding, and ring spots [2]. To date, more than a dozen viruses and viroids belonging to six virus families have been found in figs [3]. Badnaviruses are among the major pathogens of FMD. Badnavirus virions are bacilliform particles ranging in length from 60 to 900 nm. Virions contain a single molecule of non-covalently closed circular dsDNA of about 7.2–9.2 kb, and each strand of the genome has a single discontinuity. The taxonomy of badnaviruses is based on the sequence of the polymerase gene (RT + RNase H), and 80% identity is recognized as the species demarcation threshold [4]. A more recent study showed that FMD is widespread in China's Xinjiang Uygur Autonomous Region [5], but the pathogens causing FMD in the region are unknown. In the present study, we used high-throughput sequencing to identify a novel virus, Fig badnavirus-2 (FBV-2), in fig trees in Xinjiang, China, and cloned its full-length sequence using RT-PCR and PCR.
Severe symptoms such as ringspots, necrotic spots, vein banding, chlorotic mottling, malformation, and fruit spots were observed in the surveyed fig orchards in Kashgar, China (Fig. 1A–F). Leaves showing ringspot symptoms were collected from 35-year-old fig trees in Kashgar. Total RNA was extracted from 0.1 g of leaf material using an RNAprep Pure Plant Kit (Tiangen, Beijing, China) according to the manufacturer's instructions. A small RNA sequencing library was constructed using the TruSeq Small RNA Sample Prep kit and sequenced on an Illumina HiSeq platform (Personalbio, Shanghai, China). Low-quality reads were filtered out using CLC Genomics Workbench 9.5 (QIAGEN, USA), and a total of 34,994,359 clean reads were obtained after filtering. Small RNAs (sRNAs) of 18–26 nt were selected from the clean reads for subsequent analysis; sRNAs of 21, 22, and 24 nt accounted for 30.8%, 18.48%, and 18.02% of the total sRNA, respectively. Contigs aligning to virus sequences were assembled using Velvet software, and de novo assembly generated 9 contigs of FBV-2 with a total length of 1108 nt. For transcriptome sequencing, mRNA with poly(A) tails was enriched using oligo(dT) magnetic beads and fragmented in fragmentation buffer, and double-stranded cDNA was synthesized from the mRNA fragments. cDNA libraries were then constructed and sequenced on an Illumina HiSeq platform (Beijing, China). A total of 70,752,624 clean reads were obtained after filtering, of which 632,222 aligned to viral sequences, accounting for 0.9% of the clean reads. Contigs aligning to virus sequences were assembled using Trinity software and were mainly 200–600 nt in length; de novo assembly generated 2 contigs of FBV-2 with a total length of 2525 nt. All contigs showed the highest sequence identity (79.7–93.3%) to GBV-1.
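The read proportions quoted above can be re-derived from the reported counts. The short Python sketch below is only an arithmetic check; the function name `percent` is illustrative and is not part of the original analysis pipeline.

```python
def percent(part: int, total: int) -> float:
    """Return part/total expressed as a percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * part / total


# Counts reported for the transcriptome library.
clean_reads = 70_752_624
viral_reads = 632_222
viral_pct = percent(viral_reads, clean_reads)

# Reported shares of 21-, 22- and 24-nt sRNAs in the small RNA library.
srna_shares = {21: 30.8, 22: 18.48, 24: 18.02}
combined_share = sum(srna_shares.values())

print(f"viral reads: {viral_pct:.2f}% of clean reads")
print(f"21/22/24-nt sRNAs combined: {combined_share:.2f}% of total sRNA")
```

The viral fraction comes out at roughly 0.89%, consistent with the rounded value of 0.9%, and the three dominant sRNA size classes together make up about 67.3% of the total sRNA.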
The 11 contigs were assembled against the GBV-1 genome using DNAMAN, and seven primer pairs were designed based on the positions of the contigs to allow amplification of the whole genome by PCR and RT-PCR. Total DNA was extracted from 0.1 g of leaf tissue using a Plant Genomic DNA Extraction Kit (Tiangen, Beijing, China) according to the manufacturer's instructions. Total RNA was extracted from 0.1 g of leaf tissue using an RNAprep Pure Plant Kit, and complementary DNA (cDNA) was synthesized from 2 µL (573 ng/µL) of total RNA using the PrimeScript™ RT reagent kit (TaKaRa).
PCRs were performed using 2× Taq PCR Master Mix (Tiangen, Beijing, China). Each 25 µL reaction contained 12.5 µL of 2× Taq PCR Master Mix, 3 µL of cDNA or DNA, 1 µL of upstream and downstream primer (10 µmol/L), and ddH2O to volume. The cycling program was: pre-denaturation at 94 °C for 5 min; 30–35 cycles of denaturation at 94 °C for 30 s, annealing at 49–55 °C for 30 s, and extension at 72 °C for 1 min/kb; and a final extension at 72 °C for 7 min. A 6 µL aliquot of each PCR product was analyzed by 1.2% agarose gel electrophoresis, and the target fragment was purified using a gel DNA extraction kit (Tiangen, Beijing, China).
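As a worked example of the reaction setup, the following Python sketch computes the volume of ddH2O needed to reach 25 µL and the extension time implied by the 1 min/kb rule. It assumes that 1 µL of each primer is added (the text does not state this explicitly), and the 1.5 kb amplicon is a hypothetical size chosen only for illustration; the function names are likewise illustrative.

```python
def water_to_volume(total_ul: float, components_ul: dict) -> float:
    """Volume of water needed to bring the listed components to total_ul."""
    used = sum(components_ul.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return total_ul - used


def extension_time_s(amplicon_bp: int, s_per_kb: float = 60.0) -> float:
    """Extension time in seconds at a rate of s_per_kb seconds per kilobase."""
    return amplicon_bp / 1000 * s_per_kb


reaction = {
    "2x Taq PCR Master Mix": 12.5,
    "template (cDNA or DNA)": 3.0,
    "upstream primer (assumed 1 uL)": 1.0,    # assumption: 1 uL per primer
    "downstream primer (assumed 1 uL)": 1.0,  # assumption: 1 uL per primer
}

water = water_to_volume(25.0, reaction)
print(f"ddH2O: {water} uL")

# Hypothetical 1.5 kb amplicon, for illustration only.
print(f"extension: {extension_time_s(1500)} s")
```

Under these assumptions the reaction needs 7.5 µL of ddH2O, and a 1.5 kb fragment would be extended for 90 s per cycle.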
PCR products were cloned into the pEASY-T5 Zero vector and transformed into Trans1-T1 Phage Resistant Chemically Competent Cells (Quanshejin, Beijing, China). For each fragment, three positive clones were sent for bidirectional Sanger sequencing at Shenggong (Shanghai, China), and the three clones of each fragment showed 100% sequence identity. The sequences were assembled using DNAMAN, yielding the complete nucleotide sequences of three FBV-2 isolates from Kashgar, Atushi, and Korla in Xinjiang. The NCBI ORF Finder (https://www.ncbi.nlm.nih.gov/orffinder/) and the Conserved Domain tool (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) were used to identify the ORF organization and conserved domain structure of the FBV-2 genome. Sequence comparisons were conducted using BLAST and DNAMAN 8, and phylogenetic analysis was performed in MEGA X using the maximum-likelihood (ML) method with 1000 bootstrap replications.
The viral genome is a circular dsDNA molecule of 7233 bp with a G + C content of 46.1%. The sequences were deposited in the NCBI GenBank database under accession numbers MW842908 (Kashgar), MW842909 (Korla), and MW842910 (Atushi); the three isolates share more than 99.9% sequence similarity. ORF prediction showed that the genome contains four ORFs. The virus has typical badnavirus characteristics, with overlapping stop/start codons (ATGA) between ORF1 (716–718 nt) and ORF2 (715–717 nt) and between ORF2 (1120–1122 nt) and ORF3 (1119–1121 nt). A sequence complementary to the 3′ end of methionine tRNA (5′-TGGTATCAGAGCTAGTTT-3′) serves as the primer binding site for first-strand DNA synthesis and was used as the starting point for annotation of the circular genomic sequence (Fig. 1G). In genome-wide comparisons, FBV-2 showed the highest nucleotide sequence identity to GBV-1 (MF781082.1; 83.07%), followed by Dracaena mottle virus (DQ473478.1; 73.75%), Banana streak CA virus (KJ013511.1; 69.73%), FBV-1 (KT809305.1; 68.92%), and Grapevine roditis leaf discoloration-associated virus (GRLDaV; KT965859.1; 68.66%).
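The overlapping stop/start arrangement can be made concrete with a small sketch. In the four-nucleotide motif ATGA, the first three bases form the start codon (ATG) of the downstream ORF and the last three form the stop codon (TGA) of the upstream ORF, so the two ORFs lie in different reading frames. The helper `overlapping_codons` below is a hypothetical illustration that maps the motif onto 1-based genome coordinates; it does not read the deposited sequence.

```python
JUNCTION = "ATGA"  # overlapping start/stop motif reported for FBV-2


def overlapping_codons(junction_start: int) -> dict:
    """Return codons and 1-based coordinates for an ATGA junction.

    junction_start is the genome position of the first 'A' of ATGA.
    """
    start_codon = JUNCTION[0:3]  # ATG: start of the downstream ORF
    stop_codon = JUNCTION[1:4]   # TGA: stop of the upstream ORF
    return {
        "start": (start_codon, junction_start, junction_start + 2),
        "stop": (stop_codon, junction_start + 1, junction_start + 3),
    }


# Junction positions taken from the coordinates given in the text.
for name, pos in (("ORF1/ORF2", 715), ("ORF2/ORF3", 1119)):
    codons = overlapping_codons(pos)
    print(name, "start:", codons["start"], "stop:", codons["stop"])
```

For the ORF1/ORF2 junction this reproduces the reported coordinates: the ORF2 start codon at 715–717 nt and the ORF1 stop codon at 716–718 nt.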
ORF1 (positions 287–718 nt) encodes a hypothetical protein (143 aa) of approximately 16.48 kDa that includes a domain of unknown function (DUF1319), which is restricted to members of the genus Badnavirus. Compared with those of other members of the genus, ORF1 showed the highest similarity to GBV-1 (MF781082.1; 85.01% nt sequence identity), GRLDaV (AWD78009.1; 68.04% nt and 64.34% aa sequence identity), and FBV-1 (YP_006273073.1; 66.83% nt and 64.75% aa sequence identity). ORF2 (positions 715–1122 nt) also encodes a hypothetical protein of unknown function (135 aa) of approximately 14.86 kDa; it has 86.86% amino acid sequence identity to GBV-1 (ATV81253.1) and 60.74% to FBV-1 (AOI28215.1). ORF3 (positions 1119–6722 nt) encodes a putative polyprotein (1867 aa) of approximately 214.8 kDa. Conserved domain prediction showed that ORF3 includes zinc finger, aspartic protease, reverse transcriptase long terminal repeat (RT_LTR; positions 5052–5612 nt), and ribonuclease H-like (RNase_H_like; positions 5898–6284 nt) superfamily domains. The nucleotide sequence similarity between ORF3 and GBV-1 (MF781082.1), Cacao swollen shoot virus (MN179342.1), Dioscorea bacilliform virus (KY827394.1), Banana streak CA virus (KJ013511.1) are 83.43%, 80.21%, 73.3%, 70.96%, 75.1%, respectively. The nucleotide sequence identities between the RT + RNase H conserved region (positions 5052–6284 nt) of FBV-2 and the corresponding regions of GBV-1 (MF781082.1), FBV-1 (MK348055.1), and GRLDaV (KT965859.1) were 84.46%, 78.82%, and 77.03%, respectively. ORF4 (positions 6302–6751 nt) encodes a putative protein of unknown function (149 aa) of approximately 16.85 kDa. ORF4 has 79.38% nucleotide sequence identity to GBV-1 (ATV81253.1), and 69.13%, 34.56%, and 37.31% amino acid sequence identity to GBV-1 (ATV81253.1), FBV-1 (MK348055.1), and GRLDaV (KT965859.1), respectively.
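The ORF coordinates and protein lengths given above can be cross-checked with simple arithmetic: an ORF spanning positions start to end (inclusive) covers end - start + 1 nucleotides, and the number of codons minus one (the stop codon) gives the number of amino acids. The sketch below applies this to the four reported ORFs; `orf_summary` is an illustrative helper name, not a function from any tool used in the study.

```python
def orf_summary(start: int, end: int) -> dict:
    """Length in nt, codon count, and protein length for an ORF.

    Coordinates are 1-based and inclusive, and the ORF is assumed to
    end with a stop codon that is not translated.
    """
    length_nt = end - start + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = length_nt // 3
    return {"length_nt": length_nt, "codons": codons, "protein_aa": codons - 1}


# (start, end, reported protein length in aa) as given in the text.
reported_orfs = {
    "ORF1": (287, 718, 143),
    "ORF2": (715, 1122, 135),
    "ORF3": (1119, 6722, 1867),
    "ORF4": (6302, 6751, 149),
}

for name, (start, end, reported_aa) in reported_orfs.items():
    summary = orf_summary(start, end)
    agrees = summary["protein_aa"] == reported_aa
    print(f"{name}: {summary['length_nt']} nt, "
          f"{summary['protein_aa']} aa, agrees with text: {agrees}")
```

All four reported protein lengths agree with the coordinates: 432, 408, 5604, and 450 nt correspond to 143, 135, 1867, and 149 aa, respectively.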
Sequence analysis showed that ORF3 and ORF4 each contain an internal stop codon (TGA), at positions 2172–2174 nt and 6410–6412 nt, respectively. Each stop codon divides the ORF into two fragments, which seriously interferes with the continuous coding of ORF3 and ORF4. To verify the existence of these stop codons, 11 samples (including the 3 samples used in this study) were sequenced. All 11 samples contained the stop codons, indicating that the sequencing results were accurate and that the stop codons at positions 2172–2174 nt and 6410–6412 nt are genuine features of the full-length FBV-2 sequence. They may have arisen from mutations in the viral sequence during long-term evolution.
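Whether a TGA triplet interrupts an ORF depends on whether it lies in the same reading frame as the ORF's start codon. The sketch below checks this for the two reported positions, using the ORF start coordinates given in the text (1119 nt for ORF3 and 6302 nt for ORF4). The helper `internal_stop` is hypothetical and works on coordinates only, not on the deposited sequence.

```python
def internal_stop(orf_start: int, stop_start: int) -> dict:
    """Check whether a codon at stop_start is in frame with orf_start.

    Both positions are 1-based coordinates of the first base of a codon.
    """
    offset = stop_start - orf_start
    if offset < 0:
        raise ValueError("stop codon lies upstream of the ORF start")
    in_frame = offset % 3 == 0
    return {
        "in_frame": in_frame,
        # 1-based index of the stop codon within the ORF, if in frame.
        "codon_index": offset // 3 + 1 if in_frame else None,
        # Number of codons translated before the stop is reached.
        "aa_before_stop": offset // 3 if in_frame else None,
    }


orf3 = internal_stop(orf_start=1119, stop_start=2172)
orf4 = internal_stop(orf_start=6302, stop_start=6410)
print("ORF3:", orf3)
print("ORF4:", orf4)
```

Both TGA triplets are in frame with their ORFs: translation of ORF3 would stop after 351 codons and translation of ORF4 after 36 codons, which is consistent with each ORF being split into two fragments.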
Phylogenetic relationships between the three FBV-2 isolates and ten genera of the family Caulimoviridae were estimated using 36 members of the family and 46 members of the genus Badnavirus, with Rice tungro bacilliform virus (RTBV) included as an outgroup (Fig. 2). Phylogenetic trees were constructed from the complete genome nucleotide sequences (Fig. 2A) and the full-length ORF3 amino acid sequences (Fig. 2B) by the maximum-likelihood (ML) method with 1,000 bootstrap replications in MEGA X. In the tree inferred from the complete genome sequences, the three FBV-2 isolates clustered together in a separate group that forms a branch within the badnaviruses (Fig. 2A). In the tree based on the full-length ORF3 amino acid sequences, the three FBV-2 isolates formed a branch with GBV-1, FBV-1, and GRLDaV, among which FBV-2 was most closely related to GBV-1, a new badnavirus reported from grape, further indicating that FBV-2 is a member of the genus Badnavirus.
In this study, we detected several previously reported fig viruses (such as Fig badnavirus 1, Fig mosaic virus, Fig fleck-associated virus 2, and Fig leaf mottle-associated virus 1) in fig trees by next-generation sequencing (NGS) and also discovered a new virus, FBV-2. The full-length genome sequences of the new virus were obtained from three different regions of Xinjiang. Interestingly, the three isolates share high sequence similarity (more than 99.9%). Notably, the genome of FBV-2 (7233 bp) is 93 nt and 84 nt longer than those of FBV-1 and GBV-1, respectively, and its genome structure differs from that of FBV-1 [6, 7]. Although we found a new badnavirus in fig trees, we did not confirm the virus's role in producing symptoms because of technical and financial limitations. In addition, ORF3 and ORF4 each contain an internal stop codon (TGA) that seriously disrupts the encoded protein, so the role of FBV-2 in the development of FMD remains unclear. Therefore, additional research is needed to resolve the etiology of fig mosaic disease symptoms. Genome sequence analysis revealed that FBV-2 contains four ORFs, like FBV-1 and one more than GBV-1, and that its nucleotide sequence similarity in the RT + RNase H conserved region is less than 80%. Genome organization and phylogenetic analysis support the conclusion that this novel virus is a new member of the genus Badnavirus. The present findings contribute to the identification of FMD pathogens.