De novo transcriptome assembly of Premnotrypes vorax (Coleoptera: Curculionidae)

Objective: Premnotrypes vorax (P. vorax) is an insect pest that causes significant losses to potato crops in Colombia. Currently, control of the insect relies mainly on highly toxic chemical insecticides, and there are no reports of any commercial biological control strategy against this pest. Hence, the objective of this study was to characterize gene expression in the insect in order to search for genes that could encode Bacillus thuringiensis Cry toxin receptors. Using an RNA-seq approach, we sequenced mRNA from insect tissue, performed a de novo assembly and analyzed the reconstructed transcriptome of P. vorax. To our knowledge, this is the first genetic report on this endemic insect, and it sets the basis for a possible biological control strategy.

Results: The transcriptome data were obtained from dissected midgut tissue samples of P. vorax larvae. Total RNA was isolated and sequenced on the Illumina HiSeq platform with a 2x150 bp read configuration. A total of 383,552,246 reads were obtained, and quality assessment and read cleaning were subsequently performed with FastQC and Trimmomatic, respectively. A de novo assembly was produced with Trinity, yielding a transcriptome assembly with 25,631 genes that showed at least one annotation record, resulting in 74,984 transcript isoforms.


Introduction
Premnotrypes vorax (Hustache) (Coleoptera: Curculionidae) is one of the 15 insect species that make up the complex known as the "Andean potato weevil" (Figure 1). P. vorax is distributed in South America, principally in Colombia, Ecuador, Venezuela and Peru, where it is recorded as one of the most important pests of potato (Solanum tuberosum) crops (Pérez et al., 2009). Adult insects feed on the plants and cause damage along the edges of the leaves, while the larvae bore tunnel-shaped lesions in the tubers, causing externally visible damage (Pérez-Álvarez et al., 2010). This insect can cause commercial losses, with up to 80% of tubers damaged, or the complete destruction of the potato crop, especially when larval populations are high (ICA, 2011).
Historically, insect control by indigenous farmers has involved extended crop rotation, special separation between fields and chemical control with highly toxic insecticides applied at planting. These insecticides usually fail to penetrate the soil, which is where the tuber pest of interest is able to survive and grow. [...] Coleoptera insect pests such as Tenebrio molitor (Fabrick et al., 2009) and Leptinotarsa decemlineata (Park et al., 2009; van Frankenhuyzen, 2009), suggesting that it could be used as a bioinsecticide against P. vorax. The objective of this work was to produce a de novo transcriptome assembly of P. vorax (Coleoptera: Curculionidae) that would set the basis for the future development of biological control strategies.

Methods
Total RNA extraction, sequencing library preparation and sequencing
The RNA was extracted from larval midgut tissue with no biological replicates, since the pests were collected from the wild. A total of 20 larvae were used, from which 20 mg of midgut tissue was extracted and divided between two Eppendorf tubes. Total RNA was extracted using the Agencourt RNAdvance Cell v2 kit (Beckman Coulter) and quantified using the Nanodrop 2000 and Qubit 2.0 systems. The RNA integrity number (RIN) of each sample was calculated using the Bioanalyzer 2100 (Agilent) system. All samples presented a RIN >= 7 (8.5 and 8.8), indicating sufficient quality and integrity for library preparation. Illumina sequencing libraries were prepared using the TruSeq Stranded mRNA kit following the vendor's protocol, obtaining an average library fragment size of 500 bp. RNA-seq libraries were sequenced on the HiSeq platform with a configuration of 300 cycles to generate paired-end reads of 150 bp. General statistics for the sequencing can be found in Table 1.
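
As a rough cross-check of the sequencing configuration, the short Python sketch below works through the arithmetic implied by the figures in the text: 300 cycles split into two 150 bp reads, and the total read count reported in the abstract. It assumes that the reported figure of 383,552,246 counts individual reads (not pairs) and that every read is the full 150 bp; neither assumption is confirmed by the text, and the function names are illustrative only.

```python
# Back-of-the-envelope sequencing yield from figures quoted in the text.
# ASSUMPTIONS (not confirmed by the source): the read count refers to
# individual reads, and every read has the full nominal length.

TOTAL_READS = 383_552_246  # total reads reported in the abstract
CYCLES = 300               # sequencing cycles reported in the Methods


def read_length_from_cycles(cycles: int) -> int:
    """Paired-end sequencing splits the cycles evenly between the two mates."""
    return cycles // 2


def read_pairs(total_reads: int) -> int:
    """Number of sequenced fragments if every read has a mate."""
    return total_reads // 2


def total_bases(total_reads: int, read_length: int) -> int:
    """Upper bound on raw yield, before any trimming."""
    return total_reads * read_length


read_length = read_length_from_cycles(CYCLES)
print("read length (bp):", read_length)
print("read pairs:", read_pairs(TOTAL_READS))
print("raw yield (Gb):", total_bases(TOTAL_READS, read_length) / 1e9)
```

Under these assumptions the run corresponds to roughly 57.5 Gb of raw sequence; the authoritative values are those in Table 1.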

Preprocessing, de novo Assembly, Annotation and Mapping
For all samples, the quality of the sequencing results was analyzed with FastQC using default parameters for FASTQ input. Owing to the excellent quality, no reads were removed from the dataset, and the reads were assembled using Trinity v2.6.5 (Grabherr et al., 2011) under default parameters. The statistics for the generated transcriptome are presented in Table 1.
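
Summary statistics of the kind usually reported for an assembly (number of contigs, total assembled bases, N50) can be computed directly from the assembled FASTA file. The following Python sketch is illustrative only: it is not the procedure used to produce Table 1, the helper names are hypothetical, and the toy sequences stand in for a real Trinity output file.

```python
from typing import Dict, Iterable, List


def read_fasta_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {sequence_id: length} for FASTA text given as lines."""
    lengths: Dict[str, int] = {}
    name = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The ID is the first whitespace-delimited token of the header.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            # Sequences may be wrapped over several lines.
            lengths[name] += len(line)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


# Toy input standing in for an assembly FASTA (hypothetical sequences).
toy_fasta = """>contig1 len=6
ACG
TAC
>contig2
ACGTA
>contig3
ACGT
>contig4
ACG
>contig5
AC
""".splitlines()

contig_lengths = read_fasta_lengths(toy_fasta)
print("contigs:", len(contig_lengths))
print("total bases:", sum(contig_lengths.values()))
print("N50:", n50(list(contig_lengths.values())))
```

For a real assembly, the list of lines would come from opening the FASTA file written by the assembler instead of the inline toy string.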

Results
We used an RNA-seq approach to characterize the transcriptome of P. vorax (Hustache) (Coleoptera: Curculionidae). Total RNA was processed and sequenced using Illumina technology, and the sequencing statistics can be found in Table 1. Quality control of the sequencing data was performed using FastQC, and the reconstruction and annotation of the transcriptome were achieved with the Trinity and Trinotate pipelines, respectively. Raw and processed data were deposited at NCBI public repositories under the accession number PRJNA506951.
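
Trinity encodes the gene-to-isoform relationship in its transcript identifiers, which in recent releases take the form TRINITY_DN1000_c115_g5_i1, where the prefix before the final "_i" suffix names the gene. The following Python sketch, which assumes that naming convention and uses made-up identifiers, shows how isoforms per gene could be tallied from a list of transcript IDs; it is an illustration, not the analysis performed in this study.

```python
import re
from collections import Counter
from typing import Iterable

# ASSUMPTION: IDs follow the Trinity pattern <gene_id>_i<N>,
# e.g. TRINITY_DN1000_c115_g5_i1 belongs to gene TRINITY_DN1000_c115_g5.
_ISOFORM_SUFFIX = re.compile(r"_i\d+$")


def gene_id(transcript_id: str) -> str:
    """Strip the trailing isoform suffix to recover the gene identifier."""
    return _ISOFORM_SUFFIX.sub("", transcript_id)


def isoforms_per_gene(transcript_ids: Iterable[str]) -> Counter:
    """Count how many transcript isoforms were assembled for each gene."""
    return Counter(gene_id(t) for t in transcript_ids)


# Hypothetical identifiers, for illustration only.
example_ids = [
    "TRINITY_DN1000_c115_g5_i1",
    "TRINITY_DN1000_c115_g5_i2",
    "TRINITY_DN1000_c115_g5_i3",
    "TRINITY_DN2001_c0_g1_i1",
]

counts = isoforms_per_gene(example_ids)
print("genes:", len(counts))
print("isoforms:", sum(counts.values()))
print("mean isoforms per gene:", sum(counts.values()) / len(counts))
```

Applied to the totals given in the abstract (74,984 isoforms for 25,631 annotated genes), the same ratio comes to roughly 2.9 isoforms per gene.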

Discussion
To our knowledge, this is the first characterization of the gut transcriptome profile of P. vorax (Hustache) (Coleoptera: Curculionidae). We present a de novo assembly and annotation of the transcriptome, achieved with the Trinity and Trinotate pipelines, respectively. All raw and processed data have been deposited at NCBI public repositories (PRJNA506951), where they are publicly accessible for reuse. There is an urgent need to develop effective control strategies, since this pest is a severe problem in Colombia, which currently ranks 36th out of the 183 countries that produce potatoes worldwide, with 60 varieties. Since potato is the third most important crop and average consumption in the country is 60 kg per person per year (Fedepapa, 2018), an effective, eco-friendly pest control strategy that is harmless to humans is needed. These are the first transcriptome data for this insect pest; they could contribute to the understanding of gene expression in this organism and guide research towards finding targets for pest control.

Limitations
The data presented here are a first approach to this species. Annotation quality depends on the availability of closely related species, of which there are currently few. Additional sequencing of replicates of midgut tissue and of other organs will help refine bona fide transcript isoforms. Experimental work currently under way will also help validate the expression of the transcripts of interest for biological control.