Genome structure and composition
The complete mitogenome sequence of Aythya marila is a typical closed-circular molecule with a size of 16,617 bp and has been deposited in GenBank under the accession number OP326610. The mitogenome contents of Aythya marila are the same as most other published Aythya, which includes 37 genes [16], 13 PCGs, 22 tRNAs, two rRNAs (rrnl and rrns), and a control region (D-loop) (Fig. 1). Among the 37 fragment genes, one PCG (ND6) and eight tRNA genes (tRNA-Ala, tRNA-Cys, tRNA-Glu, tRNA-Gln, tRNA-Asn, tRNA-Pro, tRNA-Ser2, and tRNA-Tyr) were located on the light strand (L-strand), whereas the CR and the remaining 28 genes were located on the heavy strand (H-strand). The gene order was the same as that in most bird mitogenomes [17, 18] (Table 1). The total length of protein-coding, tRNA, and rRNA genes comprised 68.65%, 9.29%, and 15.57% of the entire mt genome, respectively. The base composition of the Aythya marila genome was 29.42% A, 22.25% T, 15.49% G, and 32.84% C, and the A + T content (51.67%) was higher than the G + C content (48.33%), suggesting a strong A + T bias. A relatively high AT content has also been reported in other Aythya species [16]. In addition, AT-skews and GC-skews are often used to assess nucleotide-compositional differences in mitochondrial genomes [19]. We found that the Aythya marila mitogenome AT-skew (0.139) was positive, but the GC-skew (-0.359) was negative, indicating more A than T, and more C than G (Table 2).
Genes encoded by the heavy strand are shown inside the circle, whilst those encoded by the light strand are shown outside the circle.
Table 1
Organization of the A. marila mitochondrial genome.
Gene | Strand | Location | Size (bp) | Intergenic length | Anticodon | Start codon | Stop codon |
tRNA-Phe | H | 1–70 | 70 | 0 | GAA | - | - |
12S rRNA | H | 71-1054 | 984 | 0 | - | - | - |
tRNA-Val | H | 1055–1125 | 71 | 0 | TAC | - | - |
16S rRNA | H | 1126–2728 | 1603 | 0 | - | - | - |
tRNA-Leu2 | H | 2729–2802 | 74 | 4 | TAA | - | - |
ND1 | H | 2807–3784 | 978 | -2 | - | ATG | AGG |
tRNA-Ile | H | 3783–3854 | 72 | 8 | GAT | - | - |
tRNA-Gln | L | 3863–3933 | 71 | -1 | TGG | - | - |
tRNA-Met | H | 3933–4001 | 69 | 0 | CAT | - | - |
ND2 | H | 4002–5040 | 1039 | 0 | - | ATG | T-- |
tRNA-Trp | H | 5041–5116 | 76 | 3 | TCA | - | - |
tRNA-Ala | L | 5120–5188 | 69 | 2 | TGC | - | - |
tRNA-Asn | L | 5191–5263 | 73 | 0 | GTT | - | - |
tRNA-Cys | L | 5264–5329 | 66 | 0 | GCA | - | - |
tRNA-Tyr | L | 5330–5399 | 70 | 1 | GTA | - | - |
COI | H | 5401–6951 | 1551 | -9 | - | GTG | AGG |
tRNA-Ser2 | L | 6943–7015 | 73 | 2 | TGA | - | - |
tRNA-Asp | H | 7018–7086 | 69 | 1 | GTC | - | - |
COII | H | 7088–7774 | 687 | 1 | - | GTG | TAA |
tRNA-Lys | H | 7776–7843 | 68 | 1 | TTT | - | - |
ATP8 | H | 7845–8012 | 168 | -10 | - | ATG | TAA |
ATP6 | H | 8003–8686 | 684 | -1 | - | ATG | TAA |
COIII | H | 8686–9469 | 784 | 0 | - | ATG | T-- |
tRNA-Gly | H | 9470–9538 | 69 | 0 | TCC | - | - |
ND3 | H | 9539–9890 | 351 | 1 | - | ATG | TAA |
tRNA-Arg | H | 9892–9961 | 70 | 0 | TCG | - | - |
ND4L | H | 9962–10258 | 297 | -7 | - | ATG | TAA |
ND4 | H | 10252–11629 | 1378 | 0 | - | ATG | T-- |
tRNA-His | H | 11630–11698 | 69 | 0 | GTG | - | - |
tRNA-Ser1 | H | 11699–11764 | 66 | -1 | GCT | - | - |
tRNA-Leu1 | H | 11764–11834 | 71 | 0 | TAG | - | - |
ND5 | H | 11835–13658 | 1824 | -1 | - | GTG | TAA |
Cytb | H | 13658–14800 | 1143 | 2 | - | ATG | TAA |
tRNA-Thr | H | 14803–14871 | 69 | 10 | TGT | - | - |
tRNA-Pro | L | 14882–14951 | 70 | 10 | TGG | - | - |
ND6 | L | 14962–15483 | 522 | 0 | - | ATG | TAG |
tRNA-Glu | L | 15484–15551 | 68 | 0 | TTC | - | - |
D-loop | H | 15552–16617 | 1066 | 0 | - | - | - |
Table 2
Nucleotide composition and skewness of A. marila mitochondrial genome.
A. marila | T% | C% | A% | G% | (A + T)% | AT-skew | GC-skew | Length (bp) |
Mitogenome | 22.25 | 32.84 | 29.42 | 15.49 | 51.67 | 0.139 | -0.359 | 16617 |
PCGs | 23.56 | 33.98 | 27.02 | 15.44 | 50.58 | 0.068 | -0.375 | 11407 |
COI | 24.11 | 32.95 | 25.98 | 16.96 | 50.09 | 0.037 | -0.320 | 1551 |
COII | 22.12 | 33.19 | 27.66 | 17.03 | 49.75 | 0.111 | -0.315 | 687 |
ATP8 | 17.86 | 40.47 | 33.93 | 7.74 | 51.79 | 0.310 | -0.679 | 168 |
ATP6 | 23.39 | 37.99 | 27.92 | 11.70 | 51.31 | 0.088 | -0.540 | 684 |
COIII | 23.09 | 33.67 | 26.91 | 16.33 | 50.00 | 0.076 | -0.347 | 784 |
ND3 | 23.55 | 34.42 | 26.81 | 15.22 | 50.36 | 0.065 | -0.387 | 351 |
ND5 | 21.32 | 35.09 | 29.99 | 13.60 | 51.31 | 0.169 | -0.441 | 1824 |
ND4 | 22.71 | 36.00 | 29.32 | 11.97 | 52.03 | 0.127 | -0.501 | 1378 |
ND4L | 24.58 | 33.67 | 25.59 | 16.16 | 50.17 | 0.020 | -0.351 | 297 |
ND6 | 39.46 | 11.88 | 10.15 | 38.51 | 49.61 | -0.591 | 0.528 | 522 |
Cytb | 23.01 | 36.31 | 26.33 | 14.35 | 49.34 | 0.067 | -0.433 | 1143 |
ND1 | 24.34 | 34.66 | 25.36 | 15.64 | 49.70 | 0.021 | -0.378 | 978 |
ND2 | 21.75 | 36.48 | 29.55 | 12.22 | 51.30 | 0.152 | -0.498 | 1039 |
D-loop | 24.67 | 32.18 | 27.30 | 15.85 | 51.97 | 0.051 | -0.340 | 1066 |
tRNAs | 26.64 | 22.03 | 30.01 | 21.32 | 56.65 | 0.059 | -0.016 | 1543 |
12S rRNA | 19.21 | 28.25 | 32.62 | 20.02 | 51.83 | 0.259 | -0.171 | 984 |
16S rRNA | 19.90 | 25.76 | 34.56 | 19.78 | 54.46 | 0.269 | -0.131 | 1603 |
Protein-coding genes and codon usage
The mitogenome of Aythya marila has 13 coding genes with a total length of 11,407 bp and encodes cytochrome b, two ATPases, three cytochrome c oxidase subunits, and six NADH dehydrogenase subunits. The lengths of the 13 PCGs ranged from 168–1824 bp. Among the 13 PCGs, 10 used ATG as the initiation codon, whereas COX I, COX II, and ND5 were initiated by the start codon GTG. Seven protein genes (ATP6, ATP8, COX II, ND3, ND4L, ND5, and Cytb) used TAA as the stop codon. For COX I and ND1, the stop codon was AGG, whereas for ND6, it was TAG. ND2, ND4, and COX III ended with the incomplete stop codon T (Table 1).
The average A + T content of PCGs in the mitogenome of Aythya marila was 50.58%, ranging from 49.34% (Cytb) to 52.03% (ND4). Except for ND6, the base compositions and skewness of the remaining 12 PCGs were relatively similar. Only the percentages of ND6 T and G were much higher than those of the other genes, with a positive GC skew (Table 2). The usage of amino acids and RSCU values in the PCGs of Aythya marila are summarized in Fig. 2 and Table 3. The most frequently used codons (amino acids) were CUA (317), AUC (208), GCC (173), UUC (167), CUC (166), and ACC (156). Apart from Trp and Met, all amino acids have their preferred codons, and this is different from other Aythya species that have been determined so far [16].
Table 3
The codon number and relative synonymous codon usage in the mitochondrial genome of Aythya marila.
Codon | Count | RSCU | | Codon | Count | RSCU | | Codon | Count | RSCU | | Codon | Count | RSCU |
UUU(F) | 52 | 0.475 | | UCU(S) | 23 | 0.491 | | UAU(Y) | 18 | 0.340 | | UGU(C) | 2 | 0.148 |
UUC(F) | 167 | 1.525 | | UCC(S) | 100 | 2.135 | | UAC(Y) | 88 | 1.660 | | UGC(C) | 27 | 1.852 |
UUA(L) | 41 | 0.373 | | UCA(S) | 87 | 1.858 | | UAA(*) | 7 | - | | UGA(*) | 87 | - |
UUG(L) | 13 | 0.118 | | UCG(S) | 13 | 0.278 | | UAG(*) | 1 | - | | UGG(W) | 19 | 1 |
CUU(L) | 46 | 0.419 | | CCU(P) | 25 | 0.435 | | CAU(H) | 12 | 0.216 | | CGU(R) | 1 | 0.081 |
CUC(L) | 166 | 1.511 | | CCC(P) | 94 | 1.635 | | CAC(H) | 99 | 1.784 | | CGC(R) | 16 | 0.297 |
CUA(L) | 317 | 2.887 | | CCA(P) | 102 | 1.774 | | CAA(Q) | 67 | 1.489 | | CGA(R) | 45 | 3.649 |
CUG(L) | 76 | 0.692 | | CCG(P) | 9 | 0.156 | | CAG(Q) | 23 | 0.511 | | CGG(R) | 10 | 0.811 |
AUU(I) | 65 | 0.508 | | ACU(T) | 31 | 0.392 | | AAU(N) | 16 | 0.264 | | AGU(S) | 7 | 0.241 |
AUC(I) | 208 | 1.625 | | ACC(T) | 156 | 1.975 | | AAC(N) | 105 | 1.736 | | AGC(S) | 51 | 1.759 |
AUA(I) | 111 | 0.867 | | ACA(T) | 115 | 1.456 | | AAA(K) | 77 | 1.730 | | AGA(R) | 0 | 0 |
AUG(M) | 58 | 1 | | ACG(T) | 14 | 0.177 | | AAG(K) | 12 | 0.270 | | AGG(R) | 2 | 0.162 |
GUU(V) | 37 | 0.751 | | GCU(A) | 40 | 0.559 | | GAU(D) | 8 | 0.267 | | GGU(G) | 11 | 0.204 |
GUC(V) | 61 | 1.239 | | GCC(A) | 173 | 2.420 | | GAC(D) | 52 | 1.733 | | GGC(G) | 77 | 1.426 |
GUA(V) | 72 | 1.462 | | GCA(A) | 91 | 1.273 | | GAA(E) | 72 | 1.532 | | GGA(G) | 74 | 1.370 |
GUG(V) | 27 | 0.548 | | GCG(A) | 12 | 0.168 | | GAG(E) | 22 | 0.468 | | GGG(G) | 54 | 1 |
Ribosomal RNA and transfer RNA genes
Similarly to other Aythya species, the Aythya marila mitogenome contained two rRNAs with a total length of 2587 bp, which are on the H strand. The 12S rRNA gene and 16S rRNA are located between tRNA-Phe and tRNA-Leu, which are typically separated by tRNA-Val. The 12S rRNA gene had the base composition of A = 32.62%, T = 19.21%, C = 28.25% and G = 20.02%, whilst the 16S rRNA gene had the base composition of A = 34.56%, T = 19.90%, C = 25.76% and G = 19.78%. Both the rRNAs had a slight AT bias (Table 2).
The Aythya marila mitogenome contained 22 tRNAs ranging in size from 66 bp to 76 bp, with an obvious AT bias (56.65%). Fourteen of these were located on the H-strand and eight on the L-strand (Table 1). The secondary structure of tRNA is shown in Fig. 3. In many vertebrate mitogenomes, the tRNA-Ser (GCT) gene had an unusual secondary structure owing to the lack of a dihydrouridine arm. However, some previous studies have shown that this does not affect tRNA function [20–23]. Other than tRNA-Ser (GCT), tRNA genes were able to form the classical cloverleaf secondary structure, and the anticodon arms contained a relatively conserved region [20].
Control Region And Repetitive Sequences
A control region was identified in the mitogenome of Aythya marila. This region is located between tRNA-Glu and tRNA-Phe, and had a length of 1066 bp. The AT content (51.97%) and GC content (48.03%) of this region were similar to those of the whole mitochondrial genome (Table 2). We also found that the control region AT-skew (0.051) was positive, but the GC-skew (-0.340) was negative, which indicated that the A content was higher than the T content, and that the G content was lower than the C content. A total of 24 repeat sequences were detected in the Aythya marila mitogenome, which contained 22 forward repeats and two palindromic repeats (Fig. 4). The repeat sequences ranged from 20–23 bp with a total size of 497 bp, constituting 2.99% of the mitogenome.
Synonymous and nonsynonymous substitution rate
The Ka/Ks ratio can be used to determine whether the PCGs were under selection pressure [24, 25]. To detect the pressure on Aythya marila mitochondrial PCGs, we calculated the Ka/Ks ratio of the 13 PCGs. As shown in Fig. 5, the Ka/Ks ratio did not exceed 0.105 for any genes, indicating that they underwent intense purification selection. Stabilizing the normal function of mitochondria may be due to the important role of these genes in purification selection [26]. ATP8 had the highest Ka/Ks value (Ka/Ks = 0.105), indicating that ATP8 experienced the least selective pressure and evolved the fastest. However, the Ka/Ks value of COI was the lowest (Ka/Ks = 0.003), indicating that COI had the highest selective pressure and had evolved the slowest.
Population genetic diversity and differentiation
The complete mitogenome sequence was used to estimate the genetic diversity and differentiation of the Aythya population. A total of 28 haplotypes were identified from 28 individuals, of which four were from A. nyroca with the other four Aythya each having six haplotypes. The haplotype diversity of the five populations was 1, whilst the nucleotide diversity (π) ranged from 0.00080 to 0.01109 per population (Table 4). These results showed that our target species had the highest genetic diversity, whilst the critically endangered Aythya baeri had the lowest genetic diversity. In addition, there was obvious differentiation between the five populations, except that no significant genetic differentiation was found between A. baeri and A. nyroca (FST = -0.09740), suggesting that genetic exchange between the two populations is common (Table 5).
Table 4
Genetic diversity of five Aythya species.
Species | NH | N | h | π |
Aythya marila | 6 | 6 | 1 | 0.01109 |
Aythya ferina | 6 | 6 | 1 | 0.00139 |
Aythya nyroca | 4 | 4 | 1 | 0.00093 |
Aythya baeri | 6 | 6 | 1 | 0.00080 |
Aythya fuligula | 6 | 6 | 1 | 0.00087 |
NH, number of haplotypes; N, number of individuals; h, haplotype diversity; π, nucleotide diversity.
Table 5
Fixation index (FST) of five Aythya species.
Species | Aythya fuligula | Aythya marila | Aythya ferina | Aythya nyroca |
Aythya marila | 0.55425 | | | |
Aythya ferina | 0.96521 | 0.78815 | | |
Aythya nyroca | 0.97498 | 0.82212 | 0.96901 | |
Aythya baeri | 0.97628 | 0.82312 | 0.97023 | -0.09740 |
Phylogenetic Analysis
We conducted a phylogenetic analysis of the mitogenomes of all Anatidae species in GenBank using both BI and ML methods, all of which exhibited high bootstrap support values and Bayesian posterior probabilities. The results indicated that the BI and ML phylogenetic trees had the same topologies, but the tree based on BI had higher support values. Hence, we only present the BI tree in this study (Fig. 6). In the phylogenetic tree, there were four major clades among the Anatidae: Dendrocygninae, Oxyurinae, Anserinae, and Anatinae. The first branch, Dendrocygninae, contained only Dendrocygna javanica. D. javanica is an early diverging lineage representing one of the most distinctive genera of Anatidae. Because of their erect posture, their elongated necks and legs, and tree-perching habits distinguish them from most other waterfowl [27, 28]. The second is Oxyurinae, which consists of only Oxyura jamaicensis. The phylogenetic relationship of Oxyurinae is also considered controversial [29]. According to early morphological studies, Oxyurinae shares a common ancestor with geese and swans [29, 30], leading us to consider Oxyurinae as the group most closely related to Anserinae whilst not being within Anatinae. In previous studies, Oxyurinae was shown to have diverged earlier than Dendrocygninae in Anatidae [31] and our results also support Oxyurinae as an independent subfamily. The third is Anserinae, which includes Cygnini and Anserini. Anserini contains the genera Anser and Branta, which are closely related species and sisters to the Cygnini assemblage, which is consistent with morphological data [32]. Meigini, Anatini, Aythyini, and Cairinini form the fourth branch of the Anatinae. Within Anatinae, Aythyini is monophyletic, Aythyini and Anatini have the closest phylogenetic relationships, and Aythyini diverged earlier than Anatini based on morphological studies and phylogenetic analysis [29, 33]. Within the genera Aythya, A. marila and A. fuligula form a branch, whilst A. americana and A. ferina were clustered, and A. baeri and A. nyroca were clustered into one branch, with all branches being supported by good statistical values. In general, our findings support those of the previous molecular phylogenetic studies.
To further elucidate the phylogenetic interrelationships between the five Aythya species (Aythya ferina, Aythya nyroca, Aythya baeri, Aythya fuligula, and Aythya marila), we analyzed the phylogenetic tree species based on complete mitochondrial genomes, with Anas platyrhynchos was selected as the outgroup (Fig. 7). The results showed that A. baeri and A. nyroca had obvious mixed clustering patterns, whilst A. marila and A. fuligula were interwoven, suggesting possible interbreeding between them. The topological structures of phylogenetic trees with different data all supported A. baeri being closely related to A. nyroca, and that A. marila and A. fuligula species were relatively closely related.