The Complete Mitogenome of Curculio chinensis (Coleoptera: Curculionidae: Curculioninae): Structural Characterization and Phylogenetic Implication

To explore the phylogenetic position of Curculio chinensis Chevrolat, 1878 and the phylogenetic relationships among major lineages of the family Curculionidae, we sequenced and annotated its mitogenome. The mitogenome is 18,680 bp in length and includes the 37 typical mitochondrial genes and a large control region (length: 1,997 bp). Mitogenome organization, nucleotide composition, and codon usage are similar to those of most previously sequenced Curculioninae mitogenomes. All 13 protein-coding genes use ATN or TTG as the start codon and end with TAA/G or an incomplete stop codon (single T-). Twenty-one transfer RNA genes have the typical clover-leaf structure, while the dihydrouridine (DHU) arm of trnS1 is missing. In Curculioninae mitogenomes, the size and number of tandem repeats in the control region are highly variable. Both ML and BI analyses based on the 13 PCGs and two rRNAs from 91 species of Coleoptera strongly supported the monophyly of Curculionidae and three of the included subfamilies (Platypodinae, Dryophthorinae, and Cryptorhynchinae), as well as the sister relationship between Platypodinae and Dryophthorinae. Additionally, the monophyly of the genus Curculio was recovered with strong support.


Introduction
The typical insect mitogenome is a circular double-stranded DNA molecule 15-18 kb in length that encodes 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs), and 22 transfer RNA genes (tRNAs), and also includes a large non-coding region (the control region) [1,2]. In insects, the mitogenome has been widely used as a molecular marker to explore population genetics, phylogeny, and evolution [2][3][4].
The camellia weevil, Curculio chinensis Chevrolat, 1878, belongs to the subfamily Curculioninae (Coleoptera: Curculionidae), and is widely distributed in most of China's Camellia spp. (tea) producing areas [5]. It is one of the most serious pests of tea and causes huge economic losses [5,6]. Further understanding of the phylogenetic status of C. chinensis is of great interest to the management of economic plant pests.
In this study, we sequenced and annotated the mitogenome of C. chinensis and analyzed its characteristics. Phylogenetic relationships were reconstructed from nucleotide sequence data of the mitogenomes of 91 species, which enabled us to investigate the phylogenetic position of C. chinensis and provided insight into the phylogenetic relationships among the major subfamilies of Curculionidae.

Sample collection and DNA extraction
Adult specimens of C. chinensis were collected from Yunguanshan forest farm, Guiyang City, Guizhou Province, China (26.48208727° N, 106.75480714° E, July 2020). All fresh specimens were preserved in 100% ethyl alcohol and stored in a -20℃ freezer at the laboratory of Guizhou Academy of Forestry, Guiyang. Specimens were identified using the morphological characters in Chao and Chen [7].
Whole genomic DNA was extracted from thorax muscle tissues using the Biospin Insect Genomic DNA Extraction Kit (BioFlux) following the manufacturer's instructions. Voucher specimens are stored in the insect herbarium of Guizhou Academy of Forestry.

Molecular phylogenetic analysis
A total of 91 mitogenomes from three families of Coleoptera were used for the phylogenetic analyses (Table S1). Of these, 87 species belong to Curculionidae (the ingroup), while the remaining four species from two families (Anthribidae and Brentidae) were chosen as outgroups. Nucleotide sequences (without stop codons) of the 13 PCGs were aligned using MAFFT 7 [15] with the G-INS-i (accurate) strategy and codon alignment mode (code table: invertebrate mitochondrial genetic code). The rRNA genes (rrnL and rrnS) were aligned using MAFFT 7 [15] with the Q-INS-i algorithm, which takes the secondary structure of the rRNA genes into account. Ambiguously aligned regions were removed from each alignment using Gblocks v0.91b [16]. Gene alignments were concatenated using PhyloSuite v1.2.2 [14]. The partitioning scheme and nucleotide substitution models for the maximum likelihood (ML) and Bayesian inference (BI) phylogenetic analyses were selected with PartitionFinder2 [17] using the Bayesian information criterion (BIC) (Tables S2-S3). ML trees were reconstructed with IQ-TREE [18] using the ultrafast bootstrap (UFB) approximation [19] with 10,000 replicates. BI analyses were performed using MrBayes 3.2.6 [20] in the CIPRES Science Gateway [21] with four chains (one cold chain and three heated chains). Two independent runs of 30,000,000 generations were carried out with sampling every 1,000 generations. The first 25% of trees were discarded as burn-in. Stationarity was assumed once the average standard deviation of split frequencies fell below 0.01.
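The concatenation step can be illustrated with a minimal Python sketch. It is not PhyloSuite's implementation: the function name `concatenate_alignments`, the toy sequences, and the choice to pad a missing gene with gap characters are assumptions made for illustration only. The sketch joins per-gene alignments into one supermatrix and records the 1-based coordinates of each gene, which is the kind of block definition a partitioning analysis starts from.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence for one gene


def concatenate_alignments(
    alignments: Dict[str, Alignment],
) -> Tuple[Dict[str, str], List[Tuple[str, int, int]]]:
    """Join per-gene alignments and report each gene's span in the supermatrix.

    Taxa missing from a gene are padded with '-' (an assumption of this sketch).
    """
    taxa = sorted({taxon for aln in alignments.values() for taxon in aln})
    pieces: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[str, int, int]] = []
    start = 1
    for gene, aln in alignments.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to one length")
        length = lengths.pop()
        for taxon in taxa:
            pieces[taxon].append(aln.get(taxon, "-" * length))
        partitions.append((gene, start, start + length - 1))
        start += length
    matrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return matrix, partitions


if __name__ == "__main__":
    # Toy data, not sequences from this study.
    toy = {
        "cox1": {"taxonA": "ATGAAA", "taxonB": "ATGAAG"},
        "rrnL": {"taxonA": "TTAA"},
    }
    matrix, partitions = concatenate_alignments(toy)
    for gene, first, last in partitions:
        print(f"{gene} = {first}-{last}")
```

Recording coordinates at concatenation time keeps the partition blocks in step with the supermatrix, so the two cannot drift apart when a gene alignment is trimmed and re-concatenated.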

Mitogenome organization and nucleotide composition
The mitogenome of C. chinensis is a double-stranded circular DNA molecule containing the 37 typical mitochondrial genes (13 PCGs, 22 tRNAs, and two rRNAs) and a large control region (Table 1, Fig. 1), a gene content that is common in bilaterian animals [2]. The newly sequenced mitogenome (length: 18,680 bp) is medium-sized compared to other Curculioninae mitogenomes (ranging from 16,852 bp in Curculio davidi, GenBank accession NC_034293, to 19,216 bp in Curculio sp., GenBank accession MG728095) [9]. Variation in the size of the control region is the main source of length variation among Curculioninae mitogenomes.
The mitogenome of C. chinensis has the same gene order as previously sequenced Curculioninae species [9,22,23]. A total of 71 overlapping nucleotides were found in ten pairs of neighboring genes; the longest overlap (23 bp) was identified between trnL1 and rrnL. Furthermore, 2,033 intergenic nucleotides are dispersed across 14 gene boundaries, and the longest intergenic region (1,882 bp) is located between trnI and trnQ.
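Overlap and intergenic totals of this kind follow directly from the annotated gene coordinates. The Python sketch below shows one way to tally them; the function name `boundary_summary` and the coordinates in the usage example are hypothetical and are not taken from Table 1. For simplicity it treats the annotation as a linear list and does not examine the wrap-around boundary of the circular molecule.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end), 1-based inclusive coordinates
Boundary = Tuple[str, str, int]  # (upstream gene, downstream gene, length in bp)


def boundary_summary(genes: List[Gene]) -> Tuple[List[Boundary], List[Boundary]]:
    """Return (overlaps, spacers) between neighboring genes sorted by start."""
    ordered = sorted(genes, key=lambda gene: gene[1])
    overlaps: List[Boundary] = []
    spacers: List[Boundary] = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(ordered, ordered[1:]):
        gap = start_b - end_a - 1  # negative: overlap, positive: intergenic spacer
        if gap < 0:
            overlaps.append((name_a, name_b, -gap))
        elif gap > 0:
            spacers.append((name_a, name_b, gap))
    return overlaps, spacers


if __name__ == "__main__":
    # Hypothetical coordinates for illustration only.
    toy_genes = [("trnI", 1, 65), ("trnQ", 70, 138), ("trnM", 138, 206)]
    overlaps, spacers = boundary_summary(toy_genes)
    print("overlapping nt:", sum(length for _, _, length in overlaps))
    print("intergenic nt:", sum(length for _, _, length in spacers))
```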

Protein-coding genes
The total size of all 13 PCGs of C. chinensis is 11,160 bp, accounting for 59.74% of the entire mitogenome (Table 1). Among the 13 PCGs, nad2, cox1, cox2, atp8, atp6, cox3, nad3, nad5, nad4, nad4L, nad6, and cob use ATN (ATA/T/G/C) as the start codon, while nad1 is initiated by TTG, which is common in Curculioninae mitogenomes [9,22,23]. All PCGs terminate with TAA/G or with the incomplete stop codon, a single T-. The incomplete termination codon can be completed by post-transcriptional polyadenylation [28]. The AT-skews of all PCGs among Curculioninae range from -0.13 (A. rectirostris and E. kamerunicus) [22,23] to -0.146 (C. davidi) [9], indicating a bias toward T over A. The relative synonymous codon usage (RSCU) of the C. chinensis mitogenome is presented in Fig. 2, indicating that Leu, Phe, and Ile are the three most frequently used amino acids. In the new mitogenome, the four most frequently utilized codons are UUA-Leu, UUU-Phe, AUU-Ile, and AUA-Met. The most frequently used codons are composed of A or U nucleotides, which reflects the high AT content of the PCGs.
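As an illustration of the two statistics discussed here, the following Python sketch computes AT-skew with the commonly used definition (A - T) / (A + T) and RSCU as the observed count of a codon divided by the count expected if all synonymous codons were used equally. It is a sketch under stated assumptions, not the software used in this study: the function names are hypothetical, sequences are given in the DNA alphabet (T rather than U), the codon table is the invertebrate mitochondrial code (NCBI translation table 5), and all codons of one amino acid are pooled into a single family, whereas published RSCU figures often split Leu and Ser into two families each.

```python
from collections import Counter
from itertools import product
from typing import Dict

BASES = "TCAG"
# Invertebrate mitochondrial code (NCBI table 5); codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSSSSVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def at_skew(seq: str) -> float:
    """AT-skew = (A - T) / (A + T); negative values mean more T than A."""
    seq = seq.upper()
    a, t = seq.count("A"), seq.count("T")
    if a + t == 0:
        raise ValueError("sequence contains no A or T")
    return (a - t) / (a + t)


def rscu(cds: str) -> Dict[str, float]:
    """RSCU per sense codon for an in-frame coding sequence (stop codons ignored)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop a trailing incomplete codon such as T-
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Codons that are stops or contain ambiguous bases are skipped.
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")
    families: Dict[str, list] = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)
    values: Dict[str, float] = {}
    for family in families.values():
        total = sum(counts[codon] for codon in family)
        for codon in family:
            values[codon] = counts[codon] * len(family) / total if total else 0.0
    return values


if __name__ == "__main__":
    toy_cds = "ATTTTATTATTGCTTTTT"  # toy sequence, not from this study
    print("AT-skew:", round(at_skew(toy_cds), 3))
    print("RSCU(TTA):", rscu(toy_cds)["TTA"])
```

An RSCU above 1 marks a codon used more often than expected under equal usage within its family, and a value below 1 marks one used less often.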

Transfer and ribosomal RNA genes
The typical set of 22 tRNAs was identified, with sizes ranging from 62 bp (trnR) to 71 bp (trnK) (Table 1). The AT content of the tRNAs (74.6%-78.3%) was slightly higher than that of the PCGs (72.5%-77.5%) (Tables S4-S12). Most tRNAs have clover-leaf secondary structures, except for trnS1, in which the dihydrouridine (DHU) arm is reduced to a simple loop (Fig. 3). This feature is common in metazoan mitogenomes [29]. A total of 30 mismatched base pairs belonging to six types (U-G, U-U, A-C, A-G, U-C, and A-A) were found in the arm structures of the 22 tRNAs.

Control region
The control region regulates the replication and transcription of mtDNA [1,2]. In all sequenced Curculioninae mitogenomes, the control regions are located between rrnS and trnI. Tandem Repeats Finder analysis [13] found different numbers of tandem repeat units in the nine complete Curculioninae mitogenomes (Fig. 4). Two types of tandem repeats were discovered in the A. pomorum control region

Phylogenetic relationships
Based on ML and BI analyses of nucleotide data from the 13 PCGs and two rRNAs, we reconstructed the phylogenetic relationships of 91 species of Coleoptera. The trees from the two analyses have largely congruent topologies, with most branches strongly supported (Figs. 5-6). Furthermore, the relationships recovered in our analyses are similar to those found by Song et al. [24]. As in recent phylogenetic analyses of Curculionidae [32,33], deep internal nodes within the family are inconsistent between analyses and receive weak support. For instance, in the BI tree (Fig. 6), Cryptorhynchinae is sister to the clade comprising Ceutorhynchinae, Curculioninae, Molytinae, and Alcidinae, while in the ML tree (Fig. 5)