The complete genome of CgPmV1 consists of nine segments. The full-length sequences of RNA1–9 are 2,444, 2,255, 2,012, 1,297, 1,082, 1,055, 1,004, 991 and 699 bp, with GC contents of 58%, 59%, 59%, 57%, 53%, 54%, 61%, 54% and 58%, respectively. RNA9 consists of the 518 bp at the 5'-terminus and the 181 bp at the 3'-terminus of RNA2, lacking the intermediate fragment of 1,556 bp (nt 519 to nt 2,074). Except for RNA5, which encodes two ORFs, each of the other eight segments encodes a single ORF (Fig. 1B).
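The coordinates given for RNA9 can be checked against the length of RNA2 with a few lines of arithmetic. The sketch below is only an illustration: the variable names are hypothetical, and it assumes 1-based, inclusive nucleotide positions, as the text implies.

```python
# Values taken from the text; positions are assumed to be 1-based and inclusive.
RNA2_LENGTH = 2255
DELETION_START = 519
DELETION_END = 2074

# Portion of RNA2 retained upstream of the deletion (nt 1 to 518).
five_prime_part = DELETION_START - 1
# Portion of RNA2 retained downstream of the deletion (nt 2,075 to 2,255).
three_prime_part = RNA2_LENGTH - DELETION_END
# Length of the missing internal fragment.
deleted_fragment = DELETION_END - DELETION_START + 1
# Expected length of the defective segment.
rna9_length = five_prime_part + three_prime_part

print(five_prime_part, three_prime_part, deleted_fragment, rna9_length)
```

Under these assumptions the retained parts are 518 and 181 nt, the deleted fragment is 1,556 nt, and the sum of the retained parts matches the reported RNA9 length of 699 bp.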
Multiple alignment of the RNA1–8 sequences revealed a conserved sequence (CGATAATAA) at the 5'-termini, whereas the corresponding sequence of RNA9 differed by only one nucleotide (Fig. 2A). The 3'-termini of the cDNA sequences of dsRNAs 1–9 show no strict sequence conservation: the tetranucleotide CCCC occurs only in dsRNA2 and dsRNAs 7–9, the sequence (CN)4CGNC2GCGNG2CGNC2NC is shared by dsRNAs 1 and 3, and the sequence CTN3GN2TN2CTN3GNG2N2T is shared only by dsRNAs 4 and 6 (Fig. 2B).
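The compact motif notation used above can be expanded into a regular expression for searching terminal sequences. The following is a minimal sketch, not the method used in this study. It assumes that N stands for any of A, C, G or T, that a trailing number repeats the preceding base or parenthesised group, and that the hypothetical helper `motif_to_regex` and the synthetic test sequence are illustrative only.

```python
import re

def motif_to_regex(motif: str) -> str:
    """Expand notation such as '(CN)4CGNC2' into a regular expression."""
    parts = []
    # Each token is either a parenthesised group or a single base,
    # optionally followed by a repeat count.
    for group, base, count in re.findall(r"(?:\(([A-Z]+)\)|([A-Z]))(\d*)", motif):
        unit = group or base
        body = "".join("[ACGT]" if ch == "N" else ch for ch in unit)
        if group:
            body = f"(?:{body})"
        if count:
            body += f"{{{count}}}"
        parts.append(body)
    return "".join(parts)

pattern_1_3 = motif_to_regex("(CN)4CGNC2GCGNG2CGNC2NC")
pattern_4_6 = motif_to_regex("CTN3GN2TN2CTN3GNG2N2T")

# Synthetic sequence built to fit the first motif; it is not viral sequence data.
synthetic = "CACACACACGACCGCGAGGCGACCAC"
print(pattern_1_3)
print(bool(re.fullmatch(pattern_1_3, synthetic)))
```

A pattern built this way could be applied with `re.search` to the 3'-terminal region of each cDNA sequence to see which segments carry which motif.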
The 5'-untranslated regions (UTRs) upstream of ORFs 1–9 are 33, 71, 59, 102, 183, 159, 90, 165 and 71 nt long, and the 3'-UTRs downstream of ORFs 1–9 are 95, 105, 96, 319, 98, 266, 179, 217 and 157 nt long, respectively. Moreover, the interval between ORF5-1 and ORF5-2 is 84 nt. The proteins encoded by ORFs 1–9 are termed P1–9, and their molecular masses are 84.8, 74.2, 66.1, 31.2, 21.6 (P5-1), 4.8 (P5-2), 22.9, 25.8, 22.1 and 16.6 kDa, respectively.
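The segment and UTR lengths reported above determine the length of each coding region, which gives a quick consistency check. The sketch below is illustrative only. It assumes that each UTR extends from the segment terminus to the start codon or from the stop codon to the terminus, that ORF lengths include the stop codon, and that the 84-nt interval on RNA5 is non-coding. The names used are hypothetical.

```python
# Values from the text, listed in order for RNA1 to RNA9.
SEGMENT_LENGTH = [2444, 2255, 2012, 1297, 1082, 1055, 1004, 991, 699]
UTR5 = [33, 71, 59, 102, 183, 159, 90, 165, 71]
UTR3 = [95, 105, 96, 319, 98, 266, 179, 217, 157]
RNA5_INTERVAL = 84  # spacer between ORF5-1 and ORF5-2

def coding_lengths():
    """Return the total coding length (nt) per segment."""
    result = {}
    for i, (total, u5, u3) in enumerate(zip(SEGMENT_LENGTH, UTR5, UTR3), start=1):
        coding = total - u5 - u3
        if i == 5:
            # RNA5 carries two ORFs; remove the spacer between them.
            coding -= RNA5_INTERVAL
        result[f"RNA{i}"] = coding
    return result

lengths = coding_lengths()
for name, nt in lengths.items():
    status = "multiple of 3" if nt % 3 == 0 else "not a multiple of 3"
    print(f"{name}: {nt} nt ({status})")
```

Under these assumptions every coding length is a multiple of three (for example, 2,316 nt for RNA1 and a combined 717 nt for the two ORFs of RNA5), as expected for intact reading frames.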
BLASTp analysis showed that P1–P9 share significant sequence similarity with the corresponding proteins encoded by CcFV-1. P1, P3 and P4 have 87.6% (coverage 100%, E-value 0.0), 83.3% (coverage 100%, E-value 0.0) and 86.6% (coverage 100%, E-value 0.0) sequence similarity with the RdRp, methyltransferase and PASrp encoded by CcFV-1, respectively. P2 and P9 showed 86.6% (coverage 100%, E-value 0.0) and 84.3% (coverage 98%, E-value 6e-70) amino acid sequence similarity to the same hypothetical protein encoded by CcFV-1. P5-1 showed 82.5% (coverage 94%, E-value 9e-108) sequence similarity to a protein of CcFV-1, while P5-2 showed no similarity to any known protein. It is worth mentioning that P6–P8 matched only the corresponding hypothetical proteins of CcFV-1, with 89.0% (coverage 100%, E-value 9e-138), 45.7% (coverage 99%, E-value 4e-44) and 95.5% (coverage 99%, E-value 3e-138) sequence similarity, respectively, and no other hits were obtained.
To examine the phylogenetic relationship between CgPmV1 and other polymycoviruses, a phylogenetic tree was constructed from the RdRp sequences of CgPmV1 and other polymycoviruses using the maximum-likelihood method in MEGA 7.0 with 1,000 bootstrap replicates. Hadaka virus 1 (HadV1), a positive-sense single-stranded RNA (+ssRNA) virus that shares three conserved segments with known polymycoviruses but lacks the segment encoding PASrp [25], was selected as the outgroup. The phylogenetic tree showed that CgPmV1 is most closely related to CcFV-1 (Fig. 3A). Multiple sequence alignment of the RdRps confirmed that CgPmV1 possesses the three conserved motifs that are ubiquitous in members of the family Polymycoviridae (Fig. 3B).
In conclusion, this novel polymycovirus with a defective RNA is the first to be found in C. gloeosporioides. Although defective RNAs are widespread among RNA viruses, they are not common in polymycoviruses [4]. Whether dsRNA9, as a defective RNA, affects the proliferation of CgPmV1 remains unknown. In addition, as CgPmV1 is the closest phylogenetic relative of CcFV-1, it remains to be determined whether it forms unique virus particles and influences the virulence of its host. These questions require further study.