Cotton (Gossypium hirsutum) is one of the most socioeconomically important crops cultivated worldwide, serving as the main source of fiber for the textile industry [1–3]. Cotton crops are constantly challenged by several insect pest species [4]. The cotton boll weevil (CBW), Anthonomus grandis (Coleoptera: Curculionidae), is considered the major insect pest of cotton in South and North America and reaches its highest incidence during the transition from flowering to fructification [5, 6]. CBW adults feed on and lay eggs within the cotton reproductive structures, often causing flower bud abortion [7, 8]. Because CBW is endophytic, its larvae can damage flower buds that are not aborted, reducing fiber quality [6, 9]. Its high reproductive capacity, plasticity, and genetic variability, together with the persistence of crop residues or stumps in cotton fields, have helped to increase the incidence, density, and geographic distribution of CBW worldwide [6, 10–12]. To date, no conventional or transgenic cotton cultivar with satisfactory resistance to CBW is available to cotton producers. Consequently, numerous insecticide applications are made annually for its management [13]. Unfortunately, CBW populations with reduced susceptibility to insecticides and failures in chemical control have already been reported in Brazilian cotton crops [14, 15]. In this context, the identification of new viruses infecting CBW may provide information that could lead to the development of molecular or biological tools for its effective control.
Herein, we used an RNA-sequencing approach based on next-generation sequencing (NGS) to investigate the presence of viruses and coding viral RNA in native, apparently healthy adult CBW insects collected in September 2020 in a cotton field located in Serra da Petrovina (16°47′53″S, 54°07′53″W), Pedra Preta, Mato Grosso state, Brazil. Pooled CBW insects (200 mg) were macerated with a mortar and pestle in SM buffer (100 mM NaCl, 8 mM MgSO4, 50 mM Tris-Cl, pH 7.5). The homogenate was filtered once through cheesecloth and centrifuged three times at 4,000 × g for 10 min to clarify the supernatant. The clarified supernatant was then used for RNA extraction with the QIAamp Viral RNA Mini Kit (QIAGEN, Hilden, Germany) according to the manufacturer’s instructions. The total RNA sample was depleted of rRNA using the Ribo-Zero rRNA removal kit (Illumina, San Diego, CA, USA), and a cDNA library was constructed using the TruSeq RNA library preparation kit (Illumina, San Diego, CA, USA). The cDNA library was sequenced at Macrogen (Gangnam-gu, Seoul, Republic of Korea) on an Illumina HiSeq 2000 platform in paired-end mode. The raw reads were quality trimmed and assembled de novo using MEGAHIT [16]. Contigs closely related to viruses were retrieved using BLASTx against an in-house viral RefSeq database. To extend the assembled sequences as far as possible, the trimmed reads were mapped back to the respective viral genomes using Geneious 11.1.5 [17]. Genome annotation was also performed in Geneious 11.1.5, and open reading frames (ORFs) were confirmed by BLASTx searches against the NCBI non-redundant protein database (08/2021).
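The contig-retrieval step described above can be illustrated with a minimal Python sketch. It assumes that the BLASTx results were exported in the standard BLAST+ tabular layout (-outfmt 6), which the text does not state; the function name, the e-value cutoff, and the toy rows are hypothetical and serve only to show how virus-related contigs might be selected by their best hit.

```python
# Default column order of BLAST+ tabular output (-outfmt 6).
BLAST6_COLUMNS = (
    "qseqid sseqid pident length mismatch gapopen "
    "qstart qend sstart send evalue bitscore"
).split()


def best_viral_hits(lines, max_evalue=1e-5):
    """Keep, for each contig, the highest-bitscore hit passing the e-value cutoff.

    The cutoff of 1e-5 is a hypothetical choice, not a value from the study.
    """
    best = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        if len(fields) < len(BLAST6_COLUMNS):
            continue  # skip malformed rows
        hit = dict(zip(BLAST6_COLUMNS, fields))
        for key in ("pident", "evalue", "bitscore"):
            hit[key] = float(hit[key])
        if hit["evalue"] > max_evalue:
            continue
        current = best.get(hit["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[hit["qseqid"]] = hit
    return best


# Invented rows for illustration only (not data from the study).
toy_rows = [
    "contig_1\tsubject_A\t40.0\t900\t540\t6\t100\t2799\t1\t900\t0.0\t700",
    "contig_1\tsubject_B\t30.0\t300\t210\t4\t500\t1399\t10\t309\t1e-60\t240",
    "contig_2\tsubject_C\t25.0\t60\t45\t0\t1\t180\t5\t64\t0.5\t30.2",
]
hits = best_viral_hits(toy_rows)
for contig, hit in hits.items():
    print(contig, hit["sseqid"], hit["pident"], hit["evalue"])
```

In this toy input, contig_2 is discarded because its only hit exceeds the e-value cutoff, whereas contig_1 is retained with its higher-scoring subject.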
NGS yielded 56,210,608 total reads, of which 138,798 were identified as virus-related sequences. De novo assembly of these viral reads generated a consensus sequence 10,632 nucleotides in length (GenBank accession number OK413669, Supplementary Material S1), with a genome coverage of 1,440×. A single ORF of 8,913 nucleotides encoding a large polyprotein was predicted, flanked by a 5’-UTR of 1,158 nucleotides and a 3’-UTR of 561 nucleotides. The 5’ and 3’ ends of the viral genome were confirmed by rapid amplification of cDNA ends (RACE) using the 5’ and 3’ RACE System for Rapid Amplification of cDNA Ends, version 2.0 (Thermo Fisher Scientific), according to the manufacturer’s protocol (data not shown). The amplified 5’ and 3’ products were sequenced on the MinION platform with the Rapid RBK110.96 kit, following the manufacturer’s instructions (Oxford Nanopore Technologies), and the sequences were analyzed using Geneious 11.1.5. By sequence alignment, we identified functional regions of the polyprotein flanked by putative proteolytic sites, corresponding to the structural and non-structural proteins (Fig. 1). Based on these data, the genome organization of the virus resembles that of other members of the family Iflaviridae [18, 19].
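As a consistency check on the reported genome organization, the short Python sketch below derives 1-based coordinates for each region from the lengths given above. It assumes that the 5’-UTR, ORF, and 3’-UTR are contiguous and non-overlapping; the function name is illustrative, and whether the ORF length includes the stop codon is not stated in the text.

```python
# Lengths reported in the text (nucleotides).
GENOME_LENGTH = 10_632
UTR5_LENGTH = 1_158
ORF_LENGTH = 8_913
UTR3_LENGTH = 561


def genome_layout(utr5, orf, utr3):
    """Return 1-based inclusive (start, end) coordinates for each region,
    assuming the three regions are contiguous and non-overlapping."""
    regions = {}
    start = 1
    for name, length in (("5'-UTR", utr5), ("ORF", orf), ("3'-UTR", utr3)):
        end = start + length - 1
        regions[name] = (start, end)
        start = end + 1
    return regions


layout = genome_layout(UTR5_LENGTH, ORF_LENGTH, UTR3_LENGTH)

# The three regions should account for the whole consensus sequence.
total = UTR5_LENGTH + ORF_LENGTH + UTR3_LENGTH
assert total == GENOME_LENGTH

# The ORF length must be a multiple of three to encode a polyprotein in frame.
assert ORF_LENGTH % 3 == 0
codons = ORF_LENGTH // 3  # includes the stop codon if it is counted in the ORF

for name, (start, end) in layout.items():
    print(f"{name}: {start}..{end}")
print("codons:", codons)
```

Under these assumptions the ORF spans positions 1,159 to 10,071 and comprises 2,971 codons, so the three regions together account for all 10,632 nucleotides.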
In an amino acid sequence alignment performed with MAFFT [20], the putative polyprotein showed 32.13% identity with a sequence of a putative iflavirus (QKN89051.1) found in samples from wild zoo birds in China. According to the International Committee on Taxonomy of Viruses (ICTV) [19], two demarcation criteria are used to recognize new species within the genus Iflavirus: (1) natural host range and (2) amino acid sequence identity of the capsid proteins below 90% [21, 22]. We therefore conclude that the new picorna-like virus described here meets the ICTV requirements for recognition as a new species in the genus Iflavirus, and we have tentatively named it Anthonomus grandis Iflavirus 1 (AgIV-1).
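For readers unfamiliar with how such identity values are obtained, the following Python sketch computes percent identity from two already aligned sequences and compares it with the 90% capsid threshold. It is an illustration only: the denominator used here (all alignment columns except those where both sequences have a gap) is one common convention and may differ from the one applied by the software used in this study, and the function names and toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues between two aligned sequences.

    Columns where both sequences carry a gap are ignored; columns with a gap
    in only one sequence count as mismatches (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


def below_species_threshold(capsid_identity, threshold=90.0):
    """True if capsid amino acid identity is below the demarcation threshold."""
    return capsid_identity < threshold


# Toy aligned fragments, invented for illustration (not AgIV-1 sequences).
toy_a = "MKT-LLVA"
toy_b = "MRTALL-A"
identity = percent_identity(toy_a, toy_b)
print(f"{identity:.2f}% identity; below threshold: "
      f"{below_species_threshold(identity)}")
```

In the toy alignment, five of the eight compared columns are identical, so the pair would fall below the 90% threshold.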
For phylogenetic analysis, the genome sequence of this putative novel virus was compared with those of other iflaviruses (Table S1). The complete polyprotein-coding sequences with the highest identity were aligned using MAFFT [20] (Fig. 2). A maximum-likelihood tree was inferred using FastTree as implemented in Geneious 11.1.5 [17], and branch support was estimated with a Shimodaira-Hasegawa-like test. According to Silva et al. (2015), iflaviruses do not cluster into single clades according to the order of their insect hosts, suggesting that they did not follow the same evolutionary path as their hosts at the order level [23, 24]. Our results corroborate this observation: AgIV-1 occupies a basal position within a clade containing three iflaviruses found in lepidopteran hosts (Fig. 2).
Thus, the identification of new viruses in this insect pest increases our knowledge of their diversity and evolution and provides new information for the development of biotechnological tools for its control. To date, few iflaviruses have been associated with symptoms or acute disease [25, 26], a notable exception being deformed wing virus of honey bees [27]. Future studies of this new picorna-like virus identified in CBW will focus on virus prevalence, infectivity, and its possible use as a viral vector to deliver RNA interference strategies for CBW biological control.