Complete mitochondrial genome of Cultellus attenuatus and its phylogenetic implications

The mitochondrial genomes of three species in Solenoidea of Heterodonta have been reported, but the mitochondrial genes and phylogenetic relationships of Cultellus attenuatus, which also belongs to this superfamily and has high economic value, are unknown. The complete mitochondrial genome of C. attenuatus was sequenced and compared with the mitogenomes of seven species of Heterodonta bivalve mollusks in GenBank. The mitochondrial genome of C. attenuatus has a length of 16,888 bp and contains 36 genes, including 12 protein-coding genes, 2 ribosomal RNAs and 22 transfer RNAs. In comparison with C. attenuatus, the mitochondrial genes of Sinonovacula constricta from the same family were not rearranged, but those of six other species from different families were rearranged to different degrees. The location, size, and composition of the largest noncoding regions in the eight species suggested a closer relationship between C. attenuatus and S. constricta. The phylogenetic analysis showed that C. attenuatus and S. constricta, both belonging to Cultellidae, clustered into one branch and that two species of Solenidae (Solen grandis and Solen strictus) clustered as their sister taxa. Overall, we used mitochondrial genome data to demonstrate that C. attenuatus and S. constricta exhibit the closest relationship in Heterodonta. These data and analyses provide new insights into the phylogenetic relationships in Heterodonta.


Introduction
The mitochondrial DNA of most animals is a closed circular molecule independent of nuclear DNA that ranges in size from 14 to 17 kb. This molecule contains 37 genes, which include 13 protein-coding genes, 2 ribosomal RNA genes, and 22 transfer RNA genes [1]. The mitochondrial genes exhibit quite conserved replication and feature matrilineal inheritance, no rearrangement and high substitution rates [2]. These characteristics make mitochondrial genes valuable, and these genes have thus become a powerful source of data for genetic evolution

Sample collection and DNA extraction
Living C. attenuatus were collected from Bohai Bay (Dongying City, Shandong Province, China). Total genomic DNA was extracted from the adductor muscle of an individual C. attenuatus using the TIANamp Marine Animals DNA Kit (Tiangen).

PCR and DNA sequencing
For the design of Long-PCR primers, partial cox1 and rrnL sequences were first obtained. PCR amplification of the cox1 and rrnL genes was performed using the primer sets LCO1490/HCO2198 and rrnLAR-L/rrnLAR-H [13], respectively. Based on these two sequences, two sets of Long-PCR primers, LB-L/LB-H and LC-L/LC-L, were used for Long-PCR amplification [13]. The PCR products were sequenced by Personal Biotechnology Co. Ltd. (Qingdao).
due to its nutritional composition, which consists of low fat and high protein, and is thus becoming increasingly popular among consumers [15]. Due to its high market value, the catching of C. attenuatus is gradually increasing, whereas the quantity of wild resources is decreasing [14]. Artificial breeding of C. attenuatus has been performed in an attempt to restore the diminishing wild resources by introducing hatchery-produced seeds. The study of mitochondrial genes can be applied to produce genetic markers that can monitor the restoration of aquatic animal stocks, which is quite helpful for the restoration and conservation of wild populations [3]. Therefore, obtaining the mitochondrial gene sequence of C. attenuatus is necessary, but no relevant studies have been reported thus far. In this study, we obtained and analyzed the entire mitochondrial genome of C. attenuatus and compared it with the mitochondrial genomes of species from other families in Heterodonta. The data not only contribute to our understanding of the phylogenetic status of this species but also serve as a source for the development of useful genetic markers for aquaculture.
Fig. 1 Map of the mitogenome of Cultellus attenuatus (a), mitochondrial gene arrangement of eight species of Heterodonta (b) and phylogenetic trees of 18 species of Heterodonta (c). In the mitogenome map, from inside to outside, the first circle represents the scale; the second circle represents the GC skew; the third circle represents the GC content; and the fourth and fifth circles represent the arrangement of protein-coding genes, tRNA genes and rRNA genes in the genome. With respect to gene arrangement, the bars indicate identical gene blocks. The phylogenetic trees were derived from the maximum likelihood (ML) analysis of concatenated amino acid sequences of 12 protein-coding genes.
obtained by stitching was uploaded to the MITOS2 web server (http://mitos2.bioinf.uni-leipzig.de/index.py) for annotations.
The protein-coding and ribosomal RNA genes were identified based on their similarity (e-value < 10^-5) to published gene sequences through BLAST searches (http://www.ncbi.nlm.nih.gov/BLAST/). Transfer RNA genes were identified through tRNAscan-SE v.1.21 [18] and DOGMA [19] using the invertebrate mitochondrial genetic code. CGView was used to map the whole mitochondrial genome circle [20].
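The e-value filtering step described above can be illustrated with a minimal Python sketch. It assumes the hits were exported in the standard 12-column tabular BLAST+ layout (`-outfmt 6`), in which the e-value is the 11th column; the text does not state the output format actually used, and the query and subject identifiers below are hypothetical.

```python
# Sketch: keep BLAST hits below an e-value cutoff.
# Assumes the 12-column tabular layout of BLAST+ "-outfmt 6":
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
E_VALUE_CUTOFF = 1e-5  # threshold stated in the text

# Hypothetical example rows; identifiers and numbers are made up for illustration.
EXAMPLE_HITS = (
    "orf_01\tref_cox1\t88.5\t510\t58\t1\t1\t510\t1\t510\t1e-150\t540\n"
    "orf_02\tref_nad3\t61.0\t100\t39\t0\t5\t104\t3\t102\t2e-04\t45.1\n"
    "orf_03\tref_rrnL\t79.2\t1200\t250\t4\t1\t1200\t10\t1205\t0.0\t980\n"
)


def significant_hits(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return (query, subject, e-value) for every hit with e-value < cutoff."""
    hits = []
    for line in tabular_text.splitlines():
        # Skip blank lines and comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        evalue = float(fields[10])  # 11th column holds the e-value
        if evalue < cutoff:
            hits.append((fields[0], fields[1], evalue))
    return hits


if __name__ == "__main__":
    for query, subject, evalue in significant_hits(EXAMPLE_HITS):
        print(query, subject, evalue)
```

In this toy input, the second row (e-value 2e-04) is discarded because it does not fall below the 10^-5 threshold.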
Libraries were constructed using a whole-genome shotgun strategy and then sequenced on the Illumina NovaSeq next-generation sequencing platform.

Analysis of sequence data
The sequencing data were assembled using A5-miseq v20150522 [16] and SPAdes v3.9.0 [17]. The sequences with a high sequencing depth were compared with the Nt (Nucleotide Sequence Database) library on NCBI by BLASTN (BLAST v2.2.31+), and the mitochondrial sequences of each assembly were selected. The final mitochondrial sequence was obtained after the assembly results were integrated. The complete mitochondrial genome sequence
genes exceeded 60%, which indicated that the protein-coding genes prefer AT (Table 1). For the entire mitochondrial genome, the AT skew and GC skew were −0.291 and 0.368, respectively, which suggested that the entire mitochondrial genome was biased toward T and G, a phenomenon that was also found in some other bivalves [7,8,23]. All 12 protein-coding genes were similar to the whole genome, with a negative AT skew and a positive GC skew, which revealed a bias toward T and G (Table 1).
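The skew values quoted above are conventionally defined as AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C), so a negative AT skew indicates more T than A and a positive GC skew indicates more G than C. The text does not state the formula used, so this definition is an assumption. The following Python sketch computes A + T content and both skews for a single strand; the toy sequence is hypothetical and is not taken from the C. attenuatus mitogenome.

```python
from collections import Counter


def base_counts(seq):
    """Count A, C, G and T in a sequence (case-insensitive; other symbols ignored)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}


def at_content(seq):
    """A + T content as a percentage of all A/C/G/T bases."""
    c = base_counts(seq)
    total = sum(c.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * (c["A"] + c["T"]) / total


def at_skew(seq):
    """AT skew = (A - T) / (A + T); negative values mean T outnumbers A."""
    c = base_counts(seq)
    if c["A"] + c["T"] == 0:
        raise ValueError("sequence contains no A or T")
    return (c["A"] - c["T"]) / (c["A"] + c["T"])


def gc_skew(seq):
    """GC skew = (G - C) / (G + C); positive values mean G outnumbers C."""
    c = base_counts(seq)
    if c["G"] + c["C"] == 0:
        raise ValueError("sequence contains no G or C")
    return (c["G"] - c["C"]) / (c["G"] + c["C"])


if __name__ == "__main__":
    toy = "ATTTGGGCTA"  # hypothetical sequence: A=2, T=4, G=3, C=1
    print(at_content(toy), at_skew(toy), gc_skew(toy))
```

For the toy sequence, the A + T content is 60%, the AT skew is negative (T-biased) and the GC skew is positive (G-biased), the same direction of bias as reported for the whole mitogenome.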

Ribosomal RNAs and transfer RNAs
Similar to most bivalves, the mitochondrial genome of C. attenuatus contained 22 transfer RNA genes and 2 ribosomal RNA genes. The size of the 22 tRNA genes varied from 63 to 67 bp, and all of these genes can fold into typical secondary structures. The two ribosomal RNA genes were rrnL and rrnS: the former had a length of 1237 bp and was located between nad6 and atp6, and the latter was 827 bp and located between trnM and cox3. The A + T content of rrnL was 69.68%, which was the highest percentage found among all protein-coding genes and rRNA genes. The A + T content of rrnS was 66.38%, which was slightly lower than the average level found for the whole mitochondrial genome. The values found for the AT skew and GC skew of rrnL were −0.063 and 0.285, respectively, which are similar to those found for protein-coding genes and indicated a bias toward T and G. The values for the AT skew and GC skew of rrnS were 0.056 and 0.209, respectively, showing a bias toward A and G, which was different from the results found for other genes in the whole mitochondrial genome (Table 1).

Noncoding regions (NCRs)
The whole mitochondrial genome of C. attenuatus contained 24 NCRs, which ranged in size from 2 to 1173 bp and had a total length of 1917 bp, accounting for 11.35% of the entire mitochondrial genome. Four of the 24 NCRs had sizes greater than or equal to 100 bp: trnP-nad4, 103 bp; trnF-cox1, 149 bp; trnM-rrnS, 100 bp; and nad2-trnK, 1173 bp. The largest NCR had a length of 1173 bp and an A + T content of 68.24% and was located between nad2 and trnK. In addition, a sequence (15,884 bp-15,974 bp) with a low A + T content (53.85%) was found in the largest NCR, and this sequence is considered the origin of L-strand replication. The remaining sequence of the largest NCR was 914 bp in length, had an A + T content of 69.80%, and was identified as a putative control region. Compared with the largest NCR of C. attenuatus, the largest NCR of S. constricta, which belongs to the same family, is located at

Phylogenetic analysis
To clearly show the phylogenetic relationships within Heterodonta, the mitochondrial genome sequences of 18 species from Heterodonta were obtained from GenBank. Chlamys farreri and Mimachlamys nobilis were used as outgroups. The amino acid sequences of 12 protein-coding genes (all except atp8) were aligned using MEGA 7.0 [21] and used for phylogenetic analysis. The phylogenetic relationships among heterodont bivalves were reconstructed by the maximum likelihood (ML) method using MEGA 7.0 [21]. The ML tree was inferred with the LG + G + I + F substitution model (selected as the model with the lowest Bayesian Information Criterion score), and 1000 bootstrap replicates were used to estimate node support.
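Model selection by the Bayesian Information Criterion can be sketched as follows, using the standard definition BIC = k ln(n) − 2 ln(L), where k is the number of free parameters, n is the number of alignment sites and ln(L) is the maximized log-likelihood. This is only an illustration of the criterion and not MEGA's implementation; the log-likelihoods, parameter counts and alignment length below are hypothetical values chosen for the example.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L). Lower is better."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates, n_sites):
    """Return the name of the candidate model with the lowest BIC score.

    candidates maps a model name to (log_likelihood, n_params).
    """
    return min(
        candidates,
        key=lambda name: bic(candidates[name][0], candidates[name][1], n_sites),
    )


# Hypothetical candidates: (maximized log-likelihood, number of free parameters).
# These numbers are made up and do not come from the study.
CANDIDATES = {
    "LG": (-52000.0, 40),
    "LG+G": (-51000.0, 41),
    "LG+G+I+F": (-50800.0, 61),
}
N_SITES = 3500  # hypothetical alignment length in amino acid sites

if __name__ == "__main__":
    for name, (lnl, k) in CANDIDATES.items():
        print(name, round(bic(lnl, k, N_SITES), 1))
    print("selected:", best_model(CANDIDATES, N_SITES))
```

The k ln(n) term penalizes extra parameters, so a richer model is selected only when its gain in likelihood outweighs that penalty, as it does for the most complex model in this toy example.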

Genome organization and nucleotide composition
The mitochondrial genome of C. attenuatus has a length of 16,888 bp and contains 12 protein-coding genes, 22 transfer RNA genes, and 2 ribosomal RNA genes (Fig. 1a). Compared with the genome sizes of the sequenced mollusk mtDNAs, the mitochondrial genome size was within the normal range. In addition, four overlaps with sizes of 1, 6, 1 and 3 bp were detected in the mitochondrial genome, and among these, the largest overlap was found between trnE and trnS2 (Table 1). The A + T content of the whole mitogenome of C. attenuatus was 66.46%, which is comparable to those found for S. constricta (67.00%) [13] and Solen grandis (64.84%) [10]. In addition, the mitochondrial genome of C. attenuatus was 376 bp shorter than that of S. constricta (in the same family) [22] and 104 bp longer than that of S. grandis (in a different family) [10].

Protein-coding genes and codon usage
The mitochondrial genome of C. attenuatus contains 12 protein-coding genes, which range in size from 291 to 1761 bp and have a total length of 11,470 bp, and these genes account for 67.92% of the total length of the mitochondrial genome. Three start codons were found in these 12 protein-coding genes: ATG in 7 protein-coding genes (cox2, cox3, cob, atp6, nad2, nad3 and nad6), ATT in 3 protein-coding genes (cox1, nad4 and nad5) and ATA in 2 protein-coding genes (nad4l and nad1). These findings differed from the start codon composition of the protein-coding genes in the mtDNA of S. grandis.
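The proportions reported in the text can be checked with simple arithmetic. The short Python sketch below uses only figures stated in this paper (genome length 16,888 bp, protein-coding total 11,470 bp, noncoding total 1917 bp, and the start codon tallies); the function and variable names are illustrative. The 67.92% figure is reproduced when the denominator is the full genome length.

```python
GENOME_LENGTH_BP = 16_888  # total mitogenome length reported in the text
PCG_TOTAL_BP = 11_470      # combined length of the 12 protein-coding genes
NCR_TOTAL_BP = 1_917       # combined length of the 24 noncoding regions

# Start codon usage reported in the text: codon -> number of genes.
START_CODONS = {"ATG": 7, "ATT": 3, "ATA": 2}


def percent_of_genome(feature_bp, genome_bp=GENOME_LENGTH_BP):
    """Share of the genome occupied by a feature class, in percent (2 decimals)."""
    return round(100.0 * feature_bp / genome_bp, 2)


if __name__ == "__main__":
    print("protein-coding genes:", percent_of_genome(PCG_TOTAL_BP), "%")
    print("noncoding regions:", percent_of_genome(NCR_TOTAL_BP), "%")
    print("genes with a start codon:", sum(START_CODONS.values()))
```

The start codon tallies sum to 12, which matches the number of protein-coding genes.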
With the exception of cox2, the other 11 protein-coding genes had complete termination codons, namely TAA (N = 5) and TAG (N = 6). The A + T content of all the protein-coding
of heterodonts is currently available [10,11]. In this study, a phylogenetic tree was reconstructed based on the amino acid sequences of the mitochondrial protein-coding genes of heterodont bivalves (Fig. 1c). Two species of Lucinidae clustered into a single clade, and the other species clustered into a large clade consisting of three small clades, similar to the results obtained in previous studies [11,26]. Paphia euglypta, Venerupis philippinarum, Meretrix meretrix and Meretrix petechialis clustered together, which supported their genetic relationship within Veneridae [26]. Moerella iridescens, Sanguinolaria diphos, Sanguinolaria olivacea, Semele scabra and Solecurtus divaricatus, which belong to the superfamily Tellinoidea, clustered together, which was consistent with the results from previous studies [22]. The novel finding obtained in this study is that C. attenuatus exhibited the closest relationship with S. constricta, which further confirmed that C. attenuatus and S. constricta belong to the same family (Cultellidae) [12]. S. grandis and Solen strictus have a close relationship, and C. attenuatus and S. constricta were identified as sister taxa.
In conclusion, based on a comparison of gene arrangement, the characteristics of NCR and a phylogenetic analysis of species in Heterodonta, C. attenuatus and S. constricta were identified as sister species, and were found to belong to Cultellidae.

Gene arrangement
One species was selected from each of six families and one superfamily of Heterodonta, and the arrangement of its mitochondrial genes was compared with that of C. attenuatus (Fig. 1b). The mitochondrial genes of S. constricta and C. attenuatus were arranged in exactly the same order. The mitochondrial genes of species in different families exhibit a large degree of rearrangement, whereas species in the same family tend to show a small degree of rearrangement, and species in the same genus show almost no rearrangement [8,24]. The mitochondrial genes of S. constricta and C. attenuatus were not rearranged (Fig. 1b), which further confirmed the close relationship between the two species [12]. This comparison of the mitochondrial gene arrangement revealed the same phenomenon. Only three gene blocks, namely, nad5-cob, rrnL-atp6-rrnS-cox3 and nad4l-nad4, were shared between S. grandis and C. attenuatus. However, the remaining five species from different families shared fewer gene blocks, which indicated a greater rearrangement. This enormous gene rearrangement in bivalves is associated with single-stranded coding, because double-stranded coding tends to inhibit gene rearrangement compared with single-stranded coding [25].
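The notion of a shared gene block can be made concrete with a small Python sketch that finds maximal runs of two or more genes appearing contiguously and in the same order in two gene arrangements. It is a simplified illustration, not the method used to produce Fig. 1b: it treats each arrangement as linear rather than circular, assumes every gene occurs once per genome, and ignores coding strand. The two gene orders below are hypothetical toy lists built so that the shared blocks match the three named in the text; they are not the real gene orders of C. attenuatus or S. grandis.

```python
def shared_blocks(order_a, order_b, min_len=2):
    """Return maximal gene runs that are contiguous and same-ordered in both lists.

    Assumes each gene name occurs at most once in each order (linear, unsigned).
    """
    position_in_b = {gene: index for index, gene in enumerate(order_b)}
    blocks = []
    i = 0
    while i < len(order_a):
        gene = order_a[i]
        if gene not in position_in_b:
            i += 1
            continue
        j = position_in_b[gene]
        length = 1
        # Extend the run while the next genes in both orders still match.
        while (
            i + length < len(order_a)
            and j + length < len(order_b)
            and order_a[i + length] == order_b[j + length]
        ):
            length += 1
        if length >= min_len:
            blocks.append(tuple(order_a[i:i + length]))
        i += length
    return blocks


# Hypothetical toy gene orders (NOT the real arrangements of any species).
TOY_ORDER_A = ["cox1", "nad5", "cob", "trnK", "rrnL", "atp6",
               "rrnS", "cox3", "nad2", "nad4l", "nad4"]
TOY_ORDER_B = ["nad4l", "nad4", "cox1", "trnK", "nad5", "cob",
               "nad2", "rrnL", "atp6", "rrnS", "cox3"]

if __name__ == "__main__":
    for block in shared_blocks(TOY_ORDER_A, TOY_ORDER_B):
        print("-".join(block))
```

Fewer and shorter shared blocks between two arrangements correspond to a greater degree of rearrangement, which is the comparison made in the text.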

Phylogenetic analysis
With the development of mitochondrial genome research, an increasing number of researchers have performed phylogenetic studies of species based on mitochondrial genes, but relatively little information on the mitochondrial genome

Availability of Data and Material
The mitochondrial genome sequence is available from GenBank (MW653805).

Code Availability
There is no relevant code.

Compliance with ethical standards
Conflicts of Interest/Competing Interest
The authors declare that they have no conflicts of interest.

Consent for Publication
All authors read and approved the final manuscript.

Ethics Approval
The present study was performed according to the standard operation procedures (SOPs) of the Guide for the Use of Experimental Animals of the Ocean University of China. All animal care and use procedures were approved by the Institutional Animal Care and Use Committee of Ocean University of China.

Consent to Participate
This article does not describe any studies with humans.