Genome organization and base composition
The mitochondrial genomes of M. sinensis and L. exotica are circular, double-stranded DNA molecules of 16,018 and 14,978 bp, respectively (Figs. 1 & 2). The complete mitogenome of M. sinensis contains 34 mt genes: 13 protein-coding genes (PCGs), 19 tRNA genes, and two rRNA genes (Fig. 1). It lacks three tRNA genes: trnA, trnE, and trnL1. The average A + T content of the M. sinensis mt genome is approximately 75.32%, which is higher than that of other isopods (typical range: 54.4%-71.2%). The genome contains 18 overlapping regions and 13 intergenic regions (Table 1). Nucleotide frequencies of all mt genes of M. sinensis are listed in Table 2.
Table 1
Gene content of the Mongoloniscus sinensis mitogenome
Feature | Strand^a | Position | Length (bp) | Initiation codon | Stop codon | Anticodon | Intergenic nucleotides |
cob | - | 1–1,107 | 1107 | ATG | TAA | | 64 |
nad5 | + | 1,172-2,884 | 1713 | ATT | TAA | | * |
trn-F | + | 2,885-2,952 | 68 | | | GAA | -33b |
trn-H | - | 2,920-2,984 | 65 | | | GTG | 5 |
nad4 | - | 2,990-4,333 | 1344 | ATG | TAA | | * |
nad4l | - | 4,334-4,612 | 279 | ATT | TAA | | -5 |
trn-P | - | 4,608-4,665 | 58 | | | TGG | -4 |
nad6 | + | 4,662-5,153 | 492 | ATT | TAA | | -2 |
trn-S2 | + | 5,152-5,202 | 51 | | | TGA | -9 |
rrnL | - | 5,194-6,336 | 1143 | | | | 21 |
trn-Q | - | 6,358-6,424 | 67 | | | TTG | 47 |
trn-M | + | 6,472-6,529 | 58 | | | CAT | -16 |
nad2 | + | 6,514-7,518 | 1005 | ATG | TAG | | -10 |
trn-C | - | 7,509-7,562 | 54 | | | GCA | -18 |
trn-Y | - | 7,545-7,600 | 56 | | | GTA | 4 |
cox1 | + | 7,605-9,140 | 1536 | ATG | TAA | | 1 |
trn-L2 | + | 9,142-9,208 | 67 | | | TAA | -10 |
cox2 | + | 9,199-9,870 | 672 | ATC | TAA | | -2 |
trn-K | + | 9,869-9,926 | 58 | | | TTT | -13 |
trn-D | + | 9,914-9,975 | 62 | | | GTC | -8 |
atp8 | + | 9,968 − 10,123 | 156 | ATT | TAA | | -13 |
atp6 | + | 10,111 − 10,785 | 675 | ATG | TAA | | * |
cox3 | + | 10,786 − 11,613 | 828 | ATG | TAA | | -41 |
trn-R | + | 11,573 − 11,633 | 61 | | | TCG | 21 |
nad3 | + | 11,655 − 12,011 | 357 | ATT | TAA | | -41 |
trn-V | + | 11,971 − 12,025 | 55 | | | TAC | -9 |
nad1 | - | 12,017 − 12,940 | 924 | ATA | TAA | | 1 |
trn-N | + | 12,942 − 13,008 | 67 | | | GTT | -2 |
rrnS | + | 13,007–13,723 | 717 | | | | -17 |
trn-I | + | 13,707-13,768 | 62 | | | TCA | 4 |
trn-W | + | 13,773 − 13,826 | 54 | | | TCA | 292 |
trn-G | + | 14,119 − 14,176 | 58 | | | TCC | 485 |
trn-T | + | 14,662 − 14,725 | 64 | | | TGT | 879 |
trn-S1 | + | 15,605 − 15,677 | 73 | | | TCT | 341 |
* Gene borders are defined based on borders with adjacent genes.
a Plus strand (+)/minus strand (−).
b Negative values represent overlapping nucleotides.
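The "Intergenic nucleotides" column of Table 1 can be reproduced from the position column alone. The following is a minimal sketch, assuming 1-based inclusive coordinates as used in the table; the function names `feature_length` and `spacer` are illustrative and do not come from the annotation pipeline used in this study.

```python
def feature_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def spacer(prev_end, next_start, genome_len=0):
    """Nucleotides between two consecutive features.

    Positive values are intergenic nucleotides, negative values are overlaps.
    Pass genome_len only for the last -> first pair of a circular genome.
    """
    return next_start + genome_len - prev_end - 1


# (name, start, end) for the first four rows of Table 1
features = [
    ("cob", 1, 1107),
    ("nad5", 1172, 2884),
    ("trnF", 2885, 2952),
    ("trnH", 2920, 2984),
]

# Spacer after each feature, relative to the next feature in the list
gaps = [spacer(a[2], b[1]) for a, b in zip(features, features[1:])]

# Wrap-around spacer between trnS1 (ends at 15,677) and cob (starts at 1)
# in the 16,018 bp circular genome
wrap_gap = spacer(15677, 1, genome_len=16018)

print(gaps, wrap_gap)
```

The wrap-around case adds the genome length to the start coordinate of the first gene, which is how the spacer after the last listed feature of a circular molecule can be obtained.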
Table 2
Base composition of the whole genome, protein-coding genes, and rRNA genes of M. sinensis
Region (strand) | A% | C% | G% | T% | A + T% | G + C% | AT skew | GC skew | +/- strand in Isopoda ground pattern |
Whole genome | 37.13 | 10.68 | 14.00 | 38.19 | 75.32 | 24.68 | -0.014 | 0.134 | |
cob (-) | 30.17 | 15.90 | 11.47 | 42.46 | 72.63 | 27.37 | -0.169 | -0.162 | - |
nad5 (+) | 33.80 | 7.82 | 15.47 | 42.91 | 76.71 | 23.29 | -0.119 | 0.328 | + |
nad4 (-) | 30.21 | 15.85 | 8.78 | 45.16 | 75.37 | 24.63 | -0.198 | -0.287 | - |
nad4l (-) | 34.41 | 9.32 | 9.68 | 46.59 | 81.00 | 19.00 | -0.150 | 0.019 | - |
nad6 (+) | 33.74 | 6.71 | 10.98 | 48.58 | 82.32 | 17.68 | -0.180 | 0.241 | + |
rrnL(-) | 40.24 | 9.97 | 10.24 | 39.55 | 79.79 | 20.21 | 0.009 | 0.013 | - |
nad2 (+) | 36.42 | 8.06 | 13.53 | 41.99 | 78.41 | 21.59 | -0.071 | 0.230 | + |
cox1 (+) | 27.93 | 14.52 | 17.19 | 40.36 | 68.29 | 31.71 | -0.182 | 0.084 | + |
cox2 (+) | 33.33 | 13.10 | 14.73 | 38.84 | 72.17 | 27.83 | -0.076 | 0.059 | + |
atp8 (+) | 37.18 | 10.90 | 8.97 | 42.95 | 80.13 | 19.87 | -0.072 | -0.097 | + |
atp6 (+) | 32.00 | 11.26 | 14.81 | 41.93 | 73.93 | 26.07 | -0.134 | 0.136 | + |
cox3 (+) | 27.78 | 14.86 | 16.18 | 41.18 | 68.96 | 31.04 | -0.194 | 0.043 | + |
nad3 (+) | 31.37 | 9.24 | 14.85 | 44.54 | 75.91 | 24.09 | -0.173 | 0.233 | + |
nad1 (-) | 30.30 | 13.10 | 12.88 | 43.72 | 74.03 | 25.97 | -0.181 | -0.008 | - |
rrnS (+) | 36.12 | 13.39 | 18.13 | 32.36 | 68.48 | 31.52 | 0.055 | 0.150 | + |
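The skew columns in Table 2 are consistent with the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C). The sketch below assumes those definitions (the text does not state them explicitly) and uses the whole-genome row as input; `composition_stats` is a hypothetical helper name.

```python
def composition_stats(a, c, g, t):
    """A+T and G+C content and strand skews from base percentages."""
    at = a + t
    gc = g + c
    return {
        "AT": at,
        "GC": gc,
        "AT_skew": (a - t) / at,  # negative: T exceeds A
        "GC_skew": (g - c) / gc,  # positive: G exceeds C
    }


# Whole-genome row of Table 2 (A%, C%, G%, T%)
stats = composition_stats(37.13, 10.68, 14.00, 38.19)
print({k: round(v, 3) for k, v in stats.items()})
```

Values recomputed from rounded percentages can differ from the tabulated ones in the third decimal place, since the table was presumably derived from raw base counts.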
The circular mitogenome of Ligia exotica comprises 13 PCGs, 21 tRNA genes, two rRNA genes, and one non-coding region; it lacks only the trnG gene (Fig. 2). The average A + T content of the L. exotica mt genome is approximately 59.13%. The genome contains 17 overlapping regions and 13 intergenic regions (Table 3). Nucleotide frequencies of all mt genes of L. exotica are listed in Table 4.
Table 3
Gene content of the Ligia exotica mitogenome
Feature | Strand^a | Position | Length (bp) | Initiation codon | Stop codon | Anticodon | Intergenic nucleotides |
trnE | + | 663–724 | 62 | | | TTC | |
trnS1 | + | 725–787 | 63 | | | TCT | 17 |
cob | - | 805-1,938 | 1134 | ATA | TAA | | * |
trnT | - | 1,939-1,997 | 59 | | | TGT | -8b |
nad5 | + | 1,990-3,717 | 1728 | ATT | TAG | | -8 |
trnF | + | 3,710-3,768 | 59 | | | GAA | -2 |
trnH | - | 3,767-3,828 | 62 | | | GTG | * |
nad4 | - | 3,829-5,158 | 1330 | ATG | T | | -7 |
nad4l | - | 5,152-5,448 | 297 | ATA | TAA | | 6 |
trnP | - | 5,455-5,516 | 62 | | | TGG | 1 |
nad6 | + | 5,518-6,024 | 507 | ATT | TAG | | -2 |
trnS2 | + | 6,023 − 6,084 | 62 | | | TGA | * |
rrnL | - | 6,085 − 7,268 | 1184 | | | | -7 |
trnV | - | 7,262-7,320 | 59 | | | TAC | 2 |
trnQ | - | 7,323-7,377 | 55 | | | TTG | 4 |
trnM | + | 7,382-7,445 | 64 | | | CAT | -21 |
nad2 | + | 7,425-8,441 | 1017 | ATG | TAG | | -15 |
trnC | - | 8,427-8,479 | 53 | | | GCA | -1 |
trnY | - | 8,479-8,540 | 62 | | | GTA | 6 |
cox1 | + | 8,547 − 10,079 | 1533 | CGA | TAA | | -5 |
trnL2 | + | 10,075 − 10,136 | 62 | | | TAA | * |
cox2 | + | 10,137 − 10,820 | 684 | ATA | TAG | | -2 |
trnK | + | 10,819 − 10,880 | 62 | | | TTT | -2 |
trnD | + | 10,879 − 10,938 | 60 | | | GTC | 9 |
atp8 | + | 10,948 − 11,097 | 150 | ATA | TAA | | -7 |
atp6 | + | 11,091 − 11,762 | 672 | ATG | TAA | | -1 |
cox3 | + | 11,762 − 12,565 | 804 | ATG | TAA | | -17 |
trnR | + | 12,549 − 12,608 | 60 | | | TCG | 9 |
nad3 | + | 12,618 − 12,962 | 345 | ATT | TAG | | -2 |
trnA | + | 12,961 − 13,021 | 61 | | | GCA | 24 |
nad1 | - | 13,046 − 13,957 | 912 | ATC | TTA | | 18 |
trnL1 | - | 13,976 − 14,035 | 60 | | | TAG | -4 |
rrnS | + | 14,096 − 14,794 | 699 | | | | 2 |
trnI | + | 14,797 − 14,860 | 64 | | | GAT | 14 |
trnW | + | 14,875 − 14,938 | 64 | | | TCA | 39 |
* Gene borders are defined based on borders with adjacent genes.
a Plus strand (+)/minus strand (−).
b Negative values represent overlapping nucleotides.
Table 4
Base composition of the whole genome, protein-coding genes, and rRNA genes of L. exotica
Region (strand) | A% | C% | G% | T% | A + T% | G + C% | AT skew | GC skew | +/- strand in Isopoda ground pattern |
Whole genome | 28.29 | 18.04 | 22.83 | 30.85 | 59.13 | 40.87 | -0.043 | 0.117 | |
cob (-) | 24.34 | 24.96 | 17.02 | 33.69 | 58.02 | 41.98 | -0.161 | -0.189 | - |
nad5 (+) | 25.87 | 15.39 | 24.59 | 34.14 | 60.01 | 39.99 | -0.138 | 0.230 | + |
nad4 (-) | 24.59 | 25.34 | 17.22 | 32.86 | 57.44 | 42.56 | -0.144 | -0.191 | - |
nad4l (-) | 21.55 | 23.23 | 19.87 | 35.35 | 56.90 | 43.10 | -0.243 | -0.078 | - |
nad6 (+) | 23.67 | 14.40 | 24.06 | 37.87 | 61.54 | 38.46 | -0.231 | 0.251 | + |
rrnL(-) | 32.18 | 19.51 | 17.91 | 30.41 | 62.58 | 37.42 | 0.028 | -0.043 | - |
nad2 (+) | 22.91 | 17.11 | 25.07 | 34.91 | 57.82 | 42.18 | -0.207 | 0.189 | + |
cox1 (+) | 22.96 | 19.70 | 22.37 | 34.96 | 57.93 | 42.07 | -0.207 | 0.064 | + |
cox2 (+) | 24.71 | 19.44 | 24.42 | 31.43 | 56.14 | 43.86 | -0.120 | 0.113 | + |
atp8 (+) | 30.00 | 12.67 | 22.00 | 35.33 | 65.33 | 34.67 | -0.082 | 0.269 | + |
atp6 (+) | 24.85 | 19.79 | 21.43 | 33.93 | 58.78 | 41.22 | -0.154 | 0.040 | + |
cox3 (+) | 22.01 | 21.14 | 24.13 | 32.72 | 54.73 | 45.27 | -0.195 | 0.066 | + |
nad3 (+) | 22.32 | 16.52 | 26.09 | 35.07 | 57.39 | 42.61 | -0.222 | 0.224 | + |
nad1 (-) | 22.59 | 22.59 | 20.29 | 34.54 | 57.13 | 42.87 | -0.209 | -0.054 | - |
rrnS (+) | 31.62 | 20.03 | 22.03 | 26.32 | 57.94 | 42.06 | 0.091 | 0.048 | + |
Protein-coding genes and codon usage
The PCGs of M. sinensis total 11,088 bp, with an A + T content of 74.18%, the highest among all known Oniscidea. ATG is the most common start codon (six PCGs), followed by ATT (nad3, nad4l, nad5, nad6, and atp8); ATA is found only in nad1 and ATC only in cox2. As for stop codons, TAA is found in 12 genes and TAG only in nad2 (Table 1).
The PCGs of L. exotica total 11,113 bp, with an A + T content of 58.06%, much lower than that of M. sinensis. ATA is the start codon for four PCGs (cob, nad4l, cox2, atp8), ATG for four (nad2, nad4, atp6, cox3), ATT for three (nad3, nad5, nad6), and ATC and CGA for one each (nad1 and cox1, respectively). As for stop codons, TAA is found in seven genes, TAG in five (cox2, nad2, nad3, nad5, nad6), and an incomplete T only in nad4 (Table 3). The incomplete T stop codon is not counted separately, as we presumed that it would be completed to TAA by post-transcriptional polyadenylation (Ojala et al., 1981; Schuster & Stern, 2009).
The two species share the same strand distribution of protein-coding genes: nine genes (cox1-3, atp8, atp6, nad2-3, and nad5-6) are encoded on the plus strand and four genes (cob, nad1, nad4, and nad4l) on the minus strand (Table 2 and Table 4). Codon usage, relative synonymous codon usage (RSCU), and codon family proportions (corresponding to amino acid usage) of M. sinensis and L. exotica were investigated (Suppl. Materials 2). The four most abundant codon families of M. sinensis (Phe, Ile, Leu, and Ser) encompass 48.75% of all codon families; the four most abundant codon families of L. exotica (Gly, Leu, Ser, and Val) encompass 43.17%. Within these codon families, A + T-rich codons are favored over synonymous codons with lower A + T content in both species. The high frequency of these nonpolar hydrophobic and polar neutral amino acids is consistent with most proteins encoded in the mitochondrial genome being transmembrane proteins.
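RSCU is conventionally the observed count of a codon divided by the count expected if all synonymous codons of its family were used equally. The sketch below illustrates that calculation under this standard definition; the codon counts are hypothetical toy values rather than data from Suppl. Materials 2, and `rscu` is an illustrative function name.

```python
def rscu(codon_counts, synonymous_families):
    """Relative synonymous codon usage for each codon.

    codon_counts: dict mapping codon -> observed count
    synonymous_families: dict mapping amino acid -> list of its codons
    """
    values = {}
    for codons in synonymous_families.values():
        total = sum(codon_counts.get(c, 0) for c in codons)
        if total == 0:
            continue  # amino acid not used; RSCU undefined
        expected = total / len(codons)  # count under equal usage
        for c in codons:
            values[c] = codon_counts.get(c, 0) / expected
    return values


# Hypothetical counts for two codon families (not from this study)
families = {"Phe": ["TTT", "TTC"], "Gly": ["GGA", "GGC", "GGG", "GGT"]}
counts = {"TTT": 30, "TTC": 10, "GGA": 20, "GGC": 5, "GGG": 10, "GGT": 5}

values = rscu(counts, families)
print(values)
```

An RSCU above 1 marks a codon used more often than expected under equal usage, which is how a preference for A + T-rich codons within a family would show up.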
Transfer and ribosomal RNA genes
The two rRNA genes of M. sinensis, rrnL and rrnS, are 1,143 and 717 bp in size, with A + T contents of 79.79% and 68.48%, respectively. The 19 tRNA genes identified in the mitochondrial genome of M. sinensis range from 51 bp (trnS2) to 73 bp (trnS1) and total 1,158 bp. In L. exotica, rrnL and rrnS are 1,184 and 699 bp in size, with A + T contents of 62.58% and 57.94%, respectively. The 21 tRNA genes identified in the mitochondrial genome of L. exotica range from 53 bp (trnC) to 64 bp (trnI, trnM, trnW) and total 1,278 bp. tRNA genes are distributed throughout both mitogenomes and are found on both strands. The putative secondary structures of all identified tRNAs are shown in Fig. 3 and Fig. 4. Most tRNAs have the common T-shaped or cloverleaf secondary structure. In M. sinensis, the exceptions are trnC, which lacks the DHU arm, and trnG, trnK, trnP, trnS2, trnW, and trnV, which lack the TΨC arm. In L. exotica, all secondary structures (predicted by MITOS and ARWEN) show the conventional cloverleaf structure except trnC and trnV, which lack the DHU arm, and trnF, trnP, trnM, trnL2, trnK, trnD, and trnR, which lack the TΨC arm.
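The tRNA size range and combined length reported for M. sinensis can be checked directly against the length column of Table 1. The following sketch simply tallies those 19 values; the dictionary is transcribed from Table 1, with gene names written without the hyphen used there.

```python
# tRNA gene lengths (bp) for M. sinensis, transcribed from Table 1
trna_lengths = {
    "trnF": 68, "trnH": 65, "trnP": 58, "trnS2": 51, "trnQ": 67,
    "trnM": 58, "trnC": 54, "trnY": 56, "trnL2": 67, "trnK": 58,
    "trnD": 62, "trnR": 61, "trnV": 55, "trnN": 67, "trnI": 62,
    "trnW": 54, "trnG": 58, "trnT": 64, "trnS1": 73,
}

total = sum(trna_lengths.values())
shortest = min(trna_lengths, key=trna_lengths.get)
longest = max(trna_lengths, key=trna_lengths.get)

print(len(trna_lengths), total, shortest, longest)
```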
Non-coding regions
In M. sinensis, the largest non-coding region (879 bp) is located between trnT and trnS1. Three other large non-coding regions, of 292, 485, and 341 bp, lie in the span from trnW to cob (Table 1). We did not detect a hairpin structure in the mt control region of M. sinensis (Fig. 1). There are six discontinuous repeats, each 53 bp long. In L. exotica, a single non-coding region (NCR) of 662 bp is located between trnE and trnW. No tandem repeats were found in this region, which includes a GC-rich region containing the putative hairpin structure (Fig. 5).
Phylogeny and gene order
The phylogenies produced using Bayesian inference (BI) and maximum likelihood (ML) methods show concordant topologies; however, support values differ between the approaches. The BI analysis produced very high statistical support, whereas the ML topology showed mostly high but several lower support values (Fig. 7 and Fig. 8). Neither analysis recovered the monophyly of Oniscidea. M. sinensis formed a clade with three species of Porcellionidae Brandt & Ratzeburg, 1831 (Oniscidea), which was placed within a larger clade containing all other oniscidean species included in this study except the Ligia species. In contrast, L. exotica formed a clade with L. oceanica, and this clade of two species of the family Ligiidae was not placed within the main clade comprising the other Oniscidea.
The gene orders of the mt monomers of M. sinensis and L. exotica are shown in Fig. 6. Comparison with the putative isopod ground pattern mt genome [11] revealed four gene rearrangements in M. sinensis and three in L. exotica. In M. sinensis, the first rearrangement involves trnR, which lies between cox3 and nad3 but between nad3 and nad1 in the isopod ground pattern. The second involves trnV, which lies between nad3 and nad1 but between 16S rRNA and nad3 in the ground pattern. The third involves trnT, which lies between 12S rRNA and cob but between cob and nad5 in the ground pattern. In a further rearrangement, trnT was translocated to a position near trnS1 and cob.
In L. exotica, the first rearrangement involves trnR, which lies between cox3 and nad3 but between nad3 and nad1 in the isopod ground pattern; the second and third rearrangements are an exchange of positions between trnW and trnE. In summary, compared with the isopod ground pattern, both species show gene translocations, as do other known isopod mitochondrial genomes. Missing tRNA genes are found in all oniscideans studied, including the two present species, and appear to be a universal phenomenon in the group. Thus, mitochondrial gene order in Oniscidea is only weakly conserved.
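One simple way to make such gene-order comparisons explicit is to list the gene adjacencies of a reference order that are absent from an observed order (breakpoints). The sketch below is an illustration only, not the method used in this study: it ignores strand, and the two four-gene fragments are toy orders built solely from the trnR description in the text (trnR between nad3 and nad1 in the ground pattern, between cox3 and nad3 in both species).

```python
def adjacencies(order, circular=True):
    """Set of unordered neighbouring gene pairs in a gene order."""
    n = len(order)
    last = n if circular else n - 1
    return {frozenset((order[i], order[(i + 1) % n])) for i in range(last)}


def breakpoints(reference, observed, circular=True):
    """Adjacencies of the reference order that are lost in the observed order.

    Only genes present in both orders are compared, so missing tRNA
    genes do not count as rearrangements.
    """
    shared = set(reference) & set(observed)
    ref = [g for g in reference if g in shared]
    obs = [g for g in observed if g in shared]
    return adjacencies(ref, circular) - adjacencies(obs, circular)


# Toy linear fragments (hypothetical simplification of the trnR case)
ground_fragment = ["cox3", "nad3", "trnR", "nad1"]
observed_fragment = ["cox3", "trnR", "nad3", "nad1"]

lost = breakpoints(ground_fragment, observed_fragment, circular=False)
print(sorted(tuple(sorted(pair)) for pair in lost))
```

For complete mitogenomes the same functions can be called with `circular=True`, so that the adjacency between the last and first listed genes is included.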