3.1. Genome structure, organization and composition
The complete mitogenome of W. compar (GenBank accession No. MW059037) was a 14,373 bp closed-circular molecule (Fig. 1) encoding a set of 36 genes: 12 PCGs (cox1-3, nad1-6, nad4l, atp6, and cytb), 22 tRNA genes, and 2 rRNA genes (s-rRNA and l-rRNA), together with a non-coding region (NCR) (Table 2). This gene set differs from that of Enoplea nematodes (such as Trichuris suis), which also have an atp8 gene (Liu et al. 2012). All genes were transcribed in the same direction on the N-strand. The gene order was identical to that of S. obvelata (Wang et al. 2016) and W. siamensis (Park et al. 2011). The overall base composition of the mt genome of W. compar was A: 25.7%; T: 53.0%; G: 16.9%; C: 4.4%, giving a G + C content of 21.3% and an A + T content of 78.7%. The overall AT and GC skew values were -0.346 and 0.586, respectively (Table 3). Intergenic spacers totaled 613 bp across 28 regions ranging in size from 1 to 147 bp. Only one overlap was found, a 2 bp overlap between trnH and rrnS.
Table 2
Characterization of the mt genome of W. compar.
Feature | Strand | Position | Length (bp) | Initiation codon | Stop codon | Anticodon | Intergenic nucleotide |
cox1 | N | 1-1,558 | 1558 | ATA | T(AA) | | 6 |
nad1 | N | 1,565-2,425 | 861 | ATT | TAG | | 7 |
atp6 | N | 2,433-3,033 | 601 | TTG | T(AA) | | |
trnL2 | N | 3,034-3,089 | 56 | | | TTA | 10 |
trnS1 | N | 3,100-3,152 | 53 | | | AGA | 19 |
nad2 | N | 3,172-3,994 | 823 | TTG | T(AA) | | |
trnI | N | 3,995-4,058 | 64 | | | ATC | |
trnY | N | 4,059-4,115 | 57 | | | TAC | 1 |
trnR | N | 4,117-4,170 | 54 | | | CGT | 16 |
trnQ | N | 4,187-4,240 | 54 | | | CAA | 2 |
trnC | N | 4,243-4,296 | 54 | | | TGC | 2 |
rrnL | N | 4,299-5,257 | 959 | | | | 1 |
trnM | N | 5,259-5,318 | 60 | | | ATG | 53 |
nad6 | N | 5,372-5,806 | 435 | TTG | TAA | | 6 |
trnV | N | 5,813-5,868 | 56 | | | GTA | 8 |
trnW | N | 5,877-5,935 | 59 | | | TGA | 2 |
trnF | N | 5,938-5,994 | 57 | | | TTC | 2 |
cytb | N | 5,997-7,070 | 1074 | TTG | TAG | | 51 |
cox3 | N | 7,122-7,866 | 745 | TTG | T(AA) | | |
trnP | N | 7,867-7,924 | 58 | | | CCT | 3 |
trnT | N | 7,928-7,991 | 64 | | | ACA | 5 |
nad4 | N | 7,997-9,220 | 1224 | ATT | TAA | | 4 |
trnG | N | 9,225-9,280 | 56 | | | GGA | |
nad4l | N | 9,281-9,514 | 234 | TTG | TAA | | 17 |
trnK | N | 9,532-9,590 | 59 | | | AAA | |
nad3 | N | 9,591-9,924 | 334 | TTG | T(AA) | | |
nad5 | N | 9,925-11,508 | 1584 | TTG | TAG | | 5 |
trnL1 | N | 11,514-11,568 | 55 | | | CTA | 30 |
trnE | N | 11,599-11,657 | 59 | | | GAA | 17 |
trnD | N | 11,675-11,730 | 56 | | | GAC | 3 |
cox2 | N | 11,734-12,525 | 792 | ATT | T(AA) | | 1 |
trnH | N | 12,527-12,582 | 56 | | | CAC | -2 |
rrnS | N | 12,581-13,312 | 732 | | | | 1 |
trnA | N | 13,314-13,374 | 61 | | | GCT | 147 |
trnS2 | N | 13,522-13,575 | 54 | | | TCA | 54 |
NCR | N | 13,630-14,159 | 530 | | | | 140 |
trnN | N | 14,300-14,357 | 58 | | | | |
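The intergenic spacer summary given in the text can be re-derived from the coordinates in Table 2. The following is a minimal sketch, not the authors' annotation pipeline: the `features` list is transcribed from Table 2, the variable names are illustrative, and the gap between two adjacent features is assumed to be `next_start - previous_end - 1`, so that a negative value marks an overlap. As in Table 2, the stretch between the NCR and trnN is counted as a spacer.

```python
# (name, start, end) for each feature, transcribed from Table 2 in genome order.
features = [
    ("cox1", 1, 1558), ("nad1", 1565, 2425), ("atp6", 2433, 3033),
    ("trnL2", 3034, 3089), ("trnS1", 3100, 3152), ("nad2", 3172, 3994),
    ("trnI", 3995, 4058), ("trnY", 4059, 4115), ("trnR", 4117, 4170),
    ("trnQ", 4187, 4240), ("trnC", 4243, 4296), ("rrnL", 4299, 5257),
    ("trnM", 5259, 5318), ("nad6", 5372, 5806), ("trnV", 5813, 5868),
    ("trnW", 5877, 5935), ("trnF", 5938, 5994), ("cytb", 5997, 7070),
    ("cox3", 7122, 7866), ("trnP", 7867, 7924), ("trnT", 7928, 7991),
    ("nad4", 7997, 9220), ("trnG", 9225, 9280), ("nad4l", 9281, 9514),
    ("trnK", 9532, 9590), ("nad3", 9591, 9924), ("nad5", 9925, 11508),
    ("trnL1", 11514, 11568), ("trnE", 11599, 11657), ("trnD", 11675, 11730),
    ("cox2", 11734, 12525), ("trnH", 12527, 12582), ("rrnS", 12581, 13312),
    ("trnA", 13314, 13374), ("trnS2", 13522, 13575), ("NCR", 13630, 14159),
    ("trnN", 14300, 14357),
]

# Gap between each pair of adjacent features; negative values are overlaps.
gaps = [
    (left[0], right[0], right[1] - left[2] - 1)
    for left, right in zip(features, features[1:])
]

spacers = [size for _, _, size in gaps if size > 0]
overlaps = [(a, b, size) for a, b, size in gaps if size < 0]

print(f"spacer regions: {len(spacers)}, total: {sum(spacers)} bp")
print(f"spacer size range: {min(spacers)}-{max(spacers)} bp")
print(f"overlaps: {overlaps}")
```

With the coordinates as transcribed, this gives 28 spacer regions totaling 613 bp, ranging from 1 to 147 bp, and a single 2 bp overlap between trnH and rrnS, matching the values reported in the text.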
Table 3
Base composition of the complete mt genome, PCGs, and rRNA genes of W. compar.
Region | A% | C% | G% | T(U)% | A + T% | G + C% | AT skew | GC skew |
Whole genome | 25.7 | 4.4 | 16.9 | 53.0 | 78.7 | 21.3 | -0.346 | 0.586 |
PCGs | 21.7 | 4.8 | 17.8 | 55.8 | 77.5 | 22.6 | -0.440 | 0.576 |
atp6 | 18.6 | 4.7 | 19.8 | 56.9 | 75.5 | 24.5 | -0.507 | 0.619 |
cox1 | 22.3 | 7.5 | 19.5 | 50.6 | 72.9 | 27.0 | -0.388 | 0.444 |
cox2 | 27.9 | 4.7 | 18.6 | 48.9 | 76.8 | 23.3 | -0.273 | 0.598 |
cox3 | 18.4 | 4.2 | 18.9 | 58.5 | 76.9 | 23.1 | -0.522 | 0.640 |
cytb | 24.4 | 5.0 | 14.6 | 56.0 | 80.4 | 19.6 | -0.393 | 0.488 |
nad1 | 19.3 | 5.7 | 20.2 | 54.8 | 74.1 | 25.9 | -0.480 | 0.561 |
nad2 | 23.7 | 2.7 | 15.7 | 58.0 | 81.7 | 18.4 | -0.420 | 0.709 |
nad3 | 21.3 | 1.8 | 16.5 | 60.5 | 81.8 | 18.3 | -0.480 | 0.803 |
nad4 | 20.8 | 4.8 | 16.2 | 58.2 | 79.0 | 21.0 | -0.473 | 0.541 |
nad4l | 20.1 | 2.6 | 18.4 | 59.0 | 79.1 | 21.0 | -0.492 | 0.755 |
nad5 | 20.1 | 4.2 | 19.1 | 56.6 | 76.7 | 23.3 | -0.477 | 0.642 |
nad6 | 21.1 | 3.4 | 12.0 | 63.4 | 84.5 | 15.4 | -0.500 | 0.552 |
l-rRNA | 31.7 | 3.4 | 15.6 | 49.2 | 80.9 | 19.0 | -0.216 | 0.639 |
s-rRNA | 34.2 | 5.3 | 18.2 | 42.3 | 76.5 | 23.5 | -0.107 | 0.547 |
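The skew values in Table 3 are consistent with the commonly used definitions AT skew = (A - T)/(A + T) and GC skew = (G - C)/(G + C). The sketch below applies these formulas to the whole-genome percentages from Table 3. It is an illustration under that assumption, not the authors' script, and the function and variable names are hypothetical. Because the inputs are rounded percentages, the results may differ from the tabulated values in the third decimal place.

```python
def skew(x: float, y: float) -> float:
    """Return the compositional skew (x - y) / (x + y)."""
    return (x - y) / (x + y)


# Whole-genome base percentages copied from Table 3.
composition = {"A": 25.7, "C": 4.4, "G": 16.9, "T": 53.0}

at_content = composition["A"] + composition["T"]
at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"A+T = {at_content:.1f}%")
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

A negative AT skew indicates more T than A, and a positive GC skew indicates more G than C, which is the pattern reported for every row of Table 3.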
3.2. Gene annotation and analysis
The mt genome of W. compar encodes 12 PCGs, which contain 3420 codons with a total length of 10,260 bp. The A + T content of the PCGs was 77.5%, far higher than the G + C content. The nucleotides in metazoan mitogenomes are not randomly distributed, and this nucleotide bias is often linked to the unequal usage of codons (Herbeck et al. 2003). The codon usage of the 12 PCGs in the mt genome of W. compar is shown in Table 4. The most frequently used codons were TTT (phenylalanine, 19.2%), TTA (leucine, 8.2%), and ATT (isoleucine, 7.5%), consistent with the bias of the PCGs towards A and T. Most of the PCGs had the start codon TTG, except for cox1, which uses ATA, and three other genes (nad1, nad4, and cox2), which use ATT (Table 2). Three types of stop codons were used: TAG (nad1, cytb, and nad5), TAA (nad6, nad4, and nad4l), and an abbreviated stop codon T (cox1, atp6, nad2, cox3, nad3, and cox2), which is an incomplete TAA stop codon that is completed by the addition of 3′ A residues to the mRNA (Table 2).
The W. compar mtDNA contains 22 tRNA genes, which range from 53 bp (trnS1) to 64 bp (trnI and trnT). The two rRNA genes, rrnL and rrnS, were 959 bp and 732 bp in size, respectively (Table 2). rrnL is located between trnC and trnM, and rrnS is situated between trnH and trnA. The A + T content of rrnL and rrnS was 80.9% and 76.5%, respectively (Table 3).
The NCR, located between trnS2 and trnN, was 530 bp in length, with an A + T content (97.2%) higher than that of any other region of the mitochondrial genome.
Table 4
Codon usage of the 12 PCGs of the mt genome of W. compar.
Codon | Count | Codon | Count | Codon | Count | Codon | Count |
TTT(F) | 657 | TCT(S2) | 100 | TAT(Y) | 170 | TGT(C) | 80 |
TTC(F) | 0 | TCC(S2) | 0 | TAC(Y) | 1 | TGC(C) | 0 |
TTA(L2) | 281 | TCA(S2) | 11 | TAA(*) | 3 | TGA(W) | 51 |
TTG(L2) | 220 | TCG(S2) | 4 | TAG(*) | 4 | TGG(W) | 21 |
CTT(L1) | 10 | CCT(P) | 54 | CAT(H) | 51 | CGT(R) | 35 |
CTC(L1) | 0 | CCC(P) | 0 | CAC(H) | 0 | CGC(R) | 0 |
CTA(L1) | 1 | CCA(P) | 3 | CAA(Q) | 19 | CGA(R) | 0 |
CTG(L1) | 0 | CCG(P) | 3 | CAG(Q) | 17 | CGG(R) | 1 |
ATT(I) | 255 | ACT(T) | 51 | AAT(N) | 120 | AGT(S1) | 121 |
ATC(I) | 0 | ACC(T) | 0 | AAC(N) | 1 | AGC(S1) | 1 |
ATA(M) | 97 | ACA(T) | 5 | AAA(K) | 50 | AGA(S1) | 57 |
ATG(M) | 68 | ACG(T) | 3 | AAG(K) | 37 | AGG(S1) | 20 |
GTT(V) | 243 | GCT(A) | 48 | GAT(D) | 70 | GGT(G) | 139 |
GTC(V) | 2 | GCC(A) | 1 | GAC(D) | 0 | GGC(G) | 0 |
GTA(V) | 45 | GCA(A) | 3 | GAA(E) | 46 | GGA(G) | 54 |
GTG(V) | 38 | GCG(A) | 4 | GAG(E) | 23 | GGG(G) | 21 |
The letters in parentheses are single-letter amino acid abbreviations (F, L2, L1, I, M, V, S1, P, T, A, Y, H, Q, N, K, D, E, C, W, R, S2, G), with leucine (L1, L2) and serine (S1, S2) each split into two codon families; *, stop codon.
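The codon frequencies quoted in the text follow directly from the counts in Table 4. The sketch below is illustrative only: the counts are transcribed from Table 4, the names `codon_counts` and `frequency` are hypothetical, and the zero-count threonine codon in the second column is read as ACC, which is an assumption about the intended entry. Stop codons are included in the total, as in the table.

```python
# Codon counts transcribed from Table 4 (the zero-count threonine codon is assumed to be ACC).
codon_counts = {
    "TTT": 657, "TTC": 0, "TTA": 281, "TTG": 220,
    "CTT": 10, "CTC": 0, "CTA": 1, "CTG": 0,
    "ATT": 255, "ATC": 0, "ATA": 97, "ATG": 68,
    "GTT": 243, "GTC": 2, "GTA": 45, "GTG": 38,
    "TCT": 100, "TCC": 0, "TCA": 11, "TCG": 4,
    "CCT": 54, "CCC": 0, "CCA": 3, "CCG": 3,
    "ACT": 51, "ACC": 0, "ACA": 5, "ACG": 3,
    "GCT": 48, "GCC": 1, "GCA": 3, "GCG": 4,
    "TAT": 170, "TAC": 1, "TAA": 3, "TAG": 4,
    "CAT": 51, "CAC": 0, "CAA": 19, "CAG": 17,
    "AAT": 120, "AAC": 1, "AAA": 50, "AAG": 37,
    "GAT": 70, "GAC": 0, "GAA": 46, "GAG": 23,
    "TGT": 80, "TGC": 0, "TGA": 51, "TGG": 21,
    "CGT": 35, "CGC": 0, "CGA": 0, "CGG": 1,
    "AGT": 121, "AGC": 1, "AGA": 57, "AGG": 20,
    "GGT": 139, "GGC": 0, "GGA": 54, "GGG": 21,
}

total_codons = sum(codon_counts.values())


def frequency(codon: str) -> float:
    """Return the percentage of all codons represented by `codon`."""
    return 100.0 * codon_counts[codon] / total_codons


# Three most frequently used codons, by count.
top_three = sorted(codon_counts, key=codon_counts.get, reverse=True)[:3]

print(f"total codons: {total_codons}")
for codon in top_three:
    print(f"{codon}: {frequency(codon):.1f}%")
```

The counts sum to 3420 codons, and the three most frequent codons are TTT, TTA, and ATT at 19.2%, 8.2%, and 7.5%, in agreement with the text.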
3.3. Comparative mitochondrial genomic analysis
The infraorder Oxyuridomorpha comprises seven families: Thelastomatidae, Travassosinematidae, Hystrignathidae, Protrelloididae, Oxyuridae, Pharyngodonidae, and Heteroxynematidae (De Ley et al. 2002). To date, most Oxyuridomorpha lineages remain underrepresented or unrepresented in terms of mt genomes; only Oxyuridae and Heteroxynematidae have been sampled. Mt genomes are available for seven species of Oxyuridae and Heteroxynematidae (including W. compar), and we therefore compared the mitochondrial genomes of these seven species.
Gene rearrangement is mainly caused by mutations in the mitochondria (San et al. 2006), for which the tandem duplication-random loss (TDRL) model is possibly the most widely accepted explanation (Chen et al. 2021; Xu et al. 2021). The mitochondrial gene arrangements of the seven species were not identical (Fig. 2): that of W. compar was consistent with those of W. siamensis and S. obvelata but differed from those of O. equi, E. vermicularis, A. tetraptera, and P. ambiguus. The main differences were transposition events (a position change of trnI in O. equi and E. vermicularis; a position change of trnY in A. tetraptera) and the number of NCRs. trnI was inserted between NCR and trnN, and trnY was inserted between trnQ and trnC. P. ambiguus has two NCRs, consistent with other nematodes (Liu et al. 2012; Liu et al. 2013; Gao et al. 2019; Sun et al. 2016), with the extra NCR located between trnA and trnS2.
The complete mtDNA sequence of W. compar (14,373 bp) was the longest among the Oxyuridomorpha species, followed by S. obvelata (14,235 bp), W. siamensis (14,128 bp), P. ambiguus (14,023 bp), E. vermicularis (14,010 bp), A. tetraptera (13,700 bp), and O. equi (13,641 bp). The size differences were mainly due to the non-coding region (NCR) and the intergenic nucleotides (Table 5). The nucleotide composition of the entire mtDNA sequence of W. compar was biased towards A and T, with T being the most favored nucleotide and C the least favored, in accordance with the majority of nematode mt genomes (Park et al. 2011; Liu et al. 2016; Kang et al. 2009; Kim et al. 2014). The overall A + T content of the W. compar genome sequence was 78.7%, the highest among the Oxyuridomorpha species (Fig. 3-c) (77.9% for W. siamensis, 71.2% for E. vermicularis, 70.2% for A. tetraptera, 74.1% for S. obvelata, 67.8% for O. equi, and 71.6% for P. ambiguus). The AT and GC skews of each mt PCG of the Oxyuridomorpha species are plotted in Fig. 3-a,b. All AT-skew values were negative and all GC-skew values were positive. Oxyuridomorpha thus shares a similar pattern with most other Spirurina species (references are listed in Table 1). Nucleotide composition is generally regarded as a potential indicator of gene direction and selective pressure during replication and transcription (Perna et al. 1995).
Table 5
Comparison of the size and sequence identity of mitochondrial genomes in seven species.
Region | W.C | W.S | E.V | A.T | S.O | O.E | P.A | Identity |
Full genome | 14373 | 14128 | 14010 | 13700 | 14231 | 13641 | 14023 | |
PCGs | 10260 | 10236 | 10326 | 10380 | 10284 | 10287 | 10206 | |
tRNAs | 1260 | 1265 | 1232 | 1246 | 1257 | 1226 | 1209 | |
rRNAs | 1691 | 1685 | 1622 | 1614 | 1705 | 1618 | 1642 | |
intergenic nucleotide | 613 | 430 | 172 | 198 | 192 | 245 | 231 | |
NCR | 530 | 510 | 675 | 230 | 815 | 250 | 740 | 63.70/44.23/33.21/41.45/32.45/13.94 |
atp6 | 601 | 601 | 637 | 622 | 642 | 601 | 639 | 93.18/56.49/59.29/56.39/58.97/59.72 |
cox1 | 1558 | 1558 | 1578 | 1560 | 1566 | 1563 | 1560 | 93.77/77.25/80.26/80.28/78.94/77.71 |
cox2 | 792 | 751 | 721 | 791 | 735 | 751 | 730 | 87.75/66.79/74.59/71.66/69.69/70.71 |
cox3 | 745 | 762 | 762 | 774 | 768 | 768 | 696 | 89.67/69.55/61.33/66.67/62.80/68.05 |
cytb | 1074 | 1074 | 1117 | 1111 | 1083 | 1083 | 1080 | 92.55/69.91/64.17/68.26/64.95/71.48 |
nad1 | 861 | 861 | 864 | 861 | 864 | 864 | 855 | 91.64/79.33/77.66/79.77/78.01/79.26 |
nad2 | 823 | 823 | 847 | 829 | 823 | 847 | 837 | 91.37/60.21/69.82/72.65/64.97/71.31 |
nad3 | 334 | 334 | 334 | 334 | 330 | 334 | 334 | 86.23/71.26/69.85/71.26/69.85/74.85 |
nad4 | 1224 | 1224 | 1225 | 1252 | 1224 | 1231 | 1230 | 91.67/71.64/70.53/73.98/72.26/75.08 |
nad4L | 234 | 234 | 234 | 234 | 234 | 234 | 234 | 92.31/66.24/72.65/74.36/73.93/76.50 |
nad5 | 1584 | 1584 | 1582 | 1581 | 1581 | 1583 | 1578 | 90.53/71.20/72.75/73.77/71.59/75.76 |
nad6 | 435 | 435 | 432 | 438 | 435 | 435 | 435 | 94.02/67.28/72.15/72.15/70.80/72.87 |
rrnL | 959 | 953 | 944 | 942 | 942 | 945 | 928 | 91.03/67.42/71.60/67.61/67.48/70.35 |
rrnS | 732 | 732 | 678 | 672 | 763 | 673 | 714 | 90.48/66.76/71.58/66.29/70.75/72.52 |
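W.C, W. compar; W.S, W. siamensis; E.V, E. vermicularis; A.T, A. tetraptera; S.O, S. obvelata; O.E, O. equi; P.A, P. ambiguus. Sizes are in bp. Identity (%): W. compar vs. W. siamensis / E. vermicularis / A. tetraptera / S. obvelata / O. equi / P. ambiguus.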
All seven mt genomes showed similar characteristics, such as nad4l being the smallest and nad5 the largest of the PCGs. Although the mitochondrial genomes were similar in size, they differed in sequence identity (Table 5). In a comparison of the sequence identities of the 12 PCGs between W. compar and the six other pinworms, the highest identity was found with W. siamensis. Across the seven species, nad1 and cox1 showed the highest sequence identity.
Summaries of the relative synonymous codon usage (RSCU) and number of amino acids in the 12 PCGs were calculated for the seven mitogenomes, as shown in Fig. 4 and Fig. 5. The amino acid compositions and RSCUs of these mitogenomes are largely similar. Phenylalanine, leucine-2, and valine are the most frequent amino acids, whereas arginine, glutamine, and histidine are rare. Leucine-1 has the smallest RSCU, whereas serine-1 and leucine-2 have the largest. The mitogenomes also differ in some respects. The RSCU value of leucine-1 in the genus Wellcomia is much smaller than in the other genera because it uses almost only the CUU codon, with CUC, CUG, and CUA rare or absent. The bias of serine-1 towards codon AGU is consistent across the seven species. In the genus Wellcomia, the RSCU value of the leucine-2 codon UUA exceeds that of the leucine-2 codon UUG (W. compar: 3.29 > 2.58; W. siamensis: 3.46 > 2.39). Therefore, leucine-2 is more biased towards codon UUA in the genus Wellcomia, whereas it is more biased towards codon UUG in the other five species.
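As an illustration of how the RSCU values above relate to the codon counts, the sketch below computes RSCU for the leucine codons of W. compar from Table 4. It assumes the standard definition, RSCU = observed count × number of synonymous codons / total count for the amino acid, and it assumes that all six leucine codons (L1 and L2) are treated as a single synonymous family; under that assumption the values reported for UUA and UUG are reproduced. The function name `rscu` is hypothetical, and codons are written in RNA notation to match the text.

```python
# Leucine codon counts for W. compar from Table 4, written as RNA codons.
leucine_counts = {
    "UUA": 281, "UUG": 220,                       # leucine-2 (L2)
    "CUU": 10, "CUC": 0, "CUA": 1, "CUG": 0,      # leucine-1 (L1)
}


def rscu(counts: dict) -> dict:
    """Return RSCU per codon: count * n_codons / total count of the family."""
    n_codons = len(counts)
    total = sum(counts.values())
    return {codon: count * n_codons / total for codon, count in counts.items()}


leucine_rscu = rscu(leucine_counts)

for codon, value in leucine_rscu.items():
    print(f"{codon}: {value:.2f}")
```

Under these assumptions, UUA and UUG come out at 3.29 and 2.58, while the four CUN codons together account for only a small fraction of leucine usage, which is why leucine-1 has the smallest RSCU.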
We measured the selective pressure acting upon amino acid replacement mutations using the dN/dS ratio (ω) for all 12 PCGs of W. compar against those of W. siamensis; the ratios ranged from 0.034 to 0.212 (Fig. 6). All PCGs were under negative (purifying) selection (ω < 1), indicating the existence of functional constraints affecting the evolution of these genes. The functional constraint on the nad3 gene was the most relaxed, whereas the nad4L gene was evolving under the strongest purifying selection pressure (Fig. 6).
3.4. Phylogenetic analyses
Phylogenetic analyses of W. compar and selected Spirurina nematodes were performed using maximum likelihood (ML) and Bayesian inference (BI; MrBayes) methods based on the concatenated mitochondrial CDS of the 12 PCGs (BI, Fig. 7; ML, Fig. 8). Mt genome sequences may provide reliable genetic markers for examining the taxonomic status of nematodes, particularly when PCG sequences are used as markers for comparative analyses (Hwang et al. 2001; Liu et al. 2013; Park et al. 2011; Wang et al. 2016; Liu et al. 2016). Because some superfamilies were represented by a single species in our study, the topology should be interpreted with caution. The BI and ML trees both divided Spirurina into two clades: Spiruromorpha formed one clade, and Oxyuridomorpha + Rhigonematomorpha + Gnathostomatomorpha + Dracunculoidea + Ascaridomorpha formed the other. The second clade was further subdivided into two clades, Rhigonematomorpha + Gnathostomatomorpha + Dracunculoidea + Ascaridomorpha and Oxyuridomorpha. These results are consistent with a recent study on Spirurina (Chen et al. 2021).
Within Spiruromorpha, many nodes were well supported. Interestingly, our results indicate that T. grusi is sister to S. lupi. Although S. lupi and T. callipaeda belong to the superfamily Thelazioidea, T. callipaeda did not cluster with S. lupi but instead occupied an earlier-diverging position. These results are consistent with those previously reported (Gao et al. 2021; Liu et al. 2015). However, in contrast to previous studies (Ahmed et al. 2022), our results showed that Dracunculoidea was more closely related to Ascaridomorpha than to Spiruromorpha. Within Ascaridomorpha, the superfamily Ascaridoidea forms a well-supported clade that is sister to Rhigonematoidea + Heterakoidea, and together they have a sister-group relationship with Seuratoidea. Ascaridoidea, Heterakoidea, and Seuratoidea all belong to the same infraorder, Ascaridomorpha. The topology of this clade indicates that Rhigonematoidea is closely related to these superfamilies, and we therefore suggest that it may belong to the infraorder Ascaridomorpha.
The phylogenetic trees obtained using the two analytical methods were very similar in topology; the only difference was within Oxyuridomorpha. Our results revealed that W. compar was sister to W. siamensis, with high statistical support (BI = 1.00; ML bootstrap = 100). Within Oxyuridomorpha, W. compar and W. siamensis clustered together and were more closely related to S. obvelata than to E. vermicularis in both trees, which is further supported by the comparison of gene orders in this study. A. tetraptera and S. obvelata clustered together, and E. vermicularis showed an affinity with pinworms other than W. siamensis. A previous study (Wang et al. 2016) demonstrated that W. siamensis was more closely related to P. ambiguus than to E. vermicularis, as stated in another prior study (Liu et al. 2016). A recent study (Chen et al. 2021) found P. ambiguus to be sister to A. tetraptera. These findings differ from our results, which may be due to the scarcity of complete pinworm mt genomes. Although pinworms are nematodes of human and animal health significance, the mitogenomes of only seven species have been sequenced and published. Therefore, expanded taxon sampling is still necessary for future phylogenetic studies of this infraorder based on mtDNA genomic datasets.