Using Long-PCR and NGS to obtain complete mt genomes of individual ticks
A previous study [7] classified tick mt genomes into three types according to gene orders (Figure 1A): (1) type I for Argasidae (soft ticks) and non-Australasian Prostriata (“other Ixodes”); (2) type II for Australasian Prostriata (“Australasian Ixodes”); and (3) type III for Metastriata (all other hard ticks). The nomenclature “other Ixodes” and “Australasian Ixodes” is from [8]. The present study focused on the genus Dermacentor belonging to Metastriata using ticks from four species (D. silvarum, D. nuttalli, D. marginatus and D. niveus). The type III mt genomes of individual ticks (Figure 1B) were obtained using long-PCR amplification coupled with NGS (Methods). All the reference genomes of tick mitochondria read in the 5'→3' direction as the major coding strand (J-strand). Using specific primers (Table 1), each entire mt genome was amplified in two large segments: large segment 1 (L1) and large segment 2 (L2) or large segment 3 (L3) and large segment 4 (L4). L1 and L2 contain Control Region 1 (CR1) and Control Region 2 (CR2), respectively, whereas L3 and L4 contain tandem Repeat 1 (R1) and tandem Repeat 2 (R2), respectively (Figure 1A). Using ~4 Gbp 2 × 150 DNA-seq data for each genome, the complete mt genomes of D. silvarum, D. nuttalli and D. marginatus were obtained by assembling L3 and L4 separately then merging L3 and L4 (Figure 1B). In addition, CR1 and CR2 on L4 were validated using PCR amplification coupled with Sanger sequencing, separately, as CR1 and CR2 share an identical segment (Figure 2A). Using ~4 Gbp 2 × 150 DNA-seq data, the complete mt genome of D. silvarum was also obtained by assembling L1 and L2 separately then merging L1 and L2. Furthermore, R1 and R2 on L1 were validated using PCR amplification coupled with Sanger sequencing, separately, as the repeat units of R1 are the reverse complements of the repeat units of R2 (Figure 3). Comparison of the D. silvarum mt genomes obtained by sequencing L3 and L4 with those obtained by sequencing L1 and L2 improved the accuracy of the DNA sequence. Since both R1 and R2 were longer than 150 bp, we also used ~4 Gbp 2 × 250 bp DNA-seq data to obtain full-length sequences of R1 and R2 for genome polishing. In total, 12.7 Gbp DNA-seq data were generated to cover ~848,069 × (12.72 Gbp/1.5 Kbp) of the D. silvarum mt genome (GenBank: MN347015) which was used as a reference for precise annotation in the following studies.
Table 1. PCR primers for the Dermacentor mt genomes
Forward primer
|
Reverse primer
|
Segment
|
Size(bp)
|
TCAGTCATTTTACCGCGATGA
|
GCTCAAATTCCATTCTCTGC
|
L1
|
9,580
|
AGCTGTTACTAACGTTGAGG
|
AGGATGTTGATGGATCGAAA
|
L2
|
8,156
|
GCTAKTGGGTTCATACCCCAA
|
CGACCTCGATGTTGGATTAGGA
|
L3
|
7,155
|
CCAACCTGATTCWCATCGGTCT
|
TCATCGCGGTAAAATGACTGA
|
L4
|
9,187
|
TGCTGCTGGCACAAATTTAGC
|
CAAGATGACCCTAAATTCAGGCA
|
CR1
|
483
|
GGAGCTATACCAATTGAATATCCC
|
TTGGGGTATGAACCCAATAGC
|
CR2
|
645
|
TGCATTCAGTTTCGGCCTGA
|
CCGGCTGTCTCATCTATTGAC
|
R2
|
3,616
|
CTATTCCGGCATAGTAAAATGCCTG
|
CAAGCTTATGCACCCTTTTCAATAC
|
R1
|
570
|
These primers were designed to amplify large segments (L1, L2, L3 and L4) and short segments (CR1, CR2, R1 and R2) in the mt genomes of the genus Dermacentor. Their PCR reaction conditions can be seen in the Methods. Based on the results using 100 individual ticks from four species, the primers for L3 and L4 were optimized to amplify more species of the genus Dermacentor than those of L1 and L2. The R2 segment spanned tRNAArg, tRNAAsn, tRNASer, tRNAGlu, ND1, tRNALeu, 16S rRNA, tRNAVal, 12S rRNA and CR1. The R1 segment spanned tRNAIle, tRNAGln, R1, tRNAPhe and ND5. The segment sizes were estimated using the D. silvarum mt genome (GenBank: MN347015).
Comparison of D. silvarum, D. nuttalli and D. marginatus mt genomes showed that they have the same gene order (type III) and high sequence identities (> 95%) (tandem repeats were not part of these calculations). Preliminary analysis showed two significant features in these tick mt genomes that are also possible in ticks of Metastriata (Figure 1A): (1) the mt genomes of D. silvarum, D. nuttalli and D. marginatus contained two tandem repeats (R1 and R2); and (2) these tick mt genomes contained multiple Short Tandem Repeats (STRs) with very short repeat units (1 or 2 bp). STRs, widely used by forensic geneticists and in studies of genealogy, are often referred to as Simple Sequence Repeats (SSRs) by plant geneticists or microsatellites by oncologists. Found widely in animal mt genomes, STRs follow a pattern in which one or more nucleotides (repeat unit) are repeated and the repeat units are directly adjacent to each other, allowing for very rare Single Nucleotide Polymorphisms (SNPs) in the repeat units. The minimum length of the repeat units of STRs is obviously 1 bp; we call this this type of STR a polynucleotide. PolyAs and polyTs occur frequently in tick and insect mt genomes; indeed, they contribute substantially to the high AT content of many of these mt genomes. Polynucleotides and tandem repeats R1 and R2 had the same pattern of variation in the D. silvarum mt genome (below). This suggested that a unified study should be performed on the CNV of STRs with longer and shorter repeat units, particularly polynucleotides that were usually overlooked in previous studies. To describe a tandem repeat, we use the repeat unit and its copy number. STRs can be classified by their repeat unit length (m) and copy number (n), thus briefly noted as m×n STR. For example, the STR ATATATATAT is noted as [AT]5 and classified as 2×5 STR. In this way, a polynucleotide is classified as 1×n STR.
Precise annotation of the Dermacentor silvarum mt genome
Our D. silvarum mt genome shares a sequence identity of 97.47% with the publicly available D. silvarum mt genome NC_026552.1 in the NCBI RefSeq database. We performed precise annotation of the complete D. silvarum mt genome (Table 2) using sRNA-seq data and confirmed these annotations using the PacBio full-length transcriptome data (Methods). Although most of the new annotations were consistent with those of NC_026552.1, we corrected many errors in NC_026552.1, particularly in tRNAs, rRNAs, CR1, CR2, R1 and R2. D. silvarum transcribes both entire strands of its mt genome to produce primary transcripts covering CR1 and CR2, predicted to be non-coding and non-transcriptional regions in a previous study [7]. CR1 with a length of 309 bp and CR2 with a length of 307 bp shared a 263-bp identical segment (Figure 2A). CR1 and R1 were annotated as full-length RNAs cleaved from the minor coding strand (N-strand) primary transcript, whereas CR2 and R2 were annotated as DNA regions (Table 2) covered by four transient RNAs. However, the Transcription Initiation Termination Sites (TISs) and the Transcription Termination Sites (TTSs) of the mt primary transcripts of ticks are still not determined due to insufficient data available.
Table 2. Precise annotations of the Dermacentor silvarum mt genome
Gene
|
Strand
|
Start
|
End
|
Length
|
tRNA-Met
|
(+)
|
1
|
67
|
67
|
ND2
|
(+)
|
68
|
1028
|
961
|
tRNA-Trp
|
(+)
|
1029
|
1091
|
63
|
tRNA-TyrAS
|
(+)
|
1092
|
1149
|
58
|
COI
|
(+)
|
1150
|
2686
|
1537
|
COII
|
(+)
|
2687
|
3359
|
673
|
tRNA-Lys
|
(+)
|
3360
|
3429
|
70
|
tRNA-Asp
|
(+)
|
3430
|
3491
|
62
|
ATP8/6
|
(+)
|
3492
|
4326
|
835
|
COIII
|
(+)
|
4327
|
5104
|
778
|
tRNA-Gly
|
(+)
|
5105
|
5171
|
67
|
ND3
|
(+)
|
5172
|
5511
|
340
|
tRNA-Ala
|
(+)
|
5512
|
5576
|
65
|
Intergenic
|
(+)
|
5577
|
5578
|
2
|
tRNA-Arg
|
(+)
|
5579
|
5640
|
62
|
tRNA-Asn
|
(+)
|
5641
|
5708
|
68
|
Intergenic
|
(+)
|
5709
|
5712
|
4
|
tRNA-Ser
|
(+)
|
5713
|
5769
|
57
|
tRNA-Glu
|
(+)
|
5770
|
5834
|
65
|
R2
|
*
|
5825
|
5987
|
163
|
HAS1
|
(+)
|
5835
|
9258
|
3424
|
tRNA-Ile
|
(+)
|
9259
|
9322
|
64
|
HAS2
|
(+)
|
9323
|
12930
|
3608
|
tRNA-Thr
|
(+)
|
12931
|
12991
|
61
|
tRNA-ProAS/ND6
|
(+)
|
12992
|
13503
|
512
|
Cytb
|
(+)
|
13504
|
14585
|
1082
|
tRNA-Ser
|
(+)
|
14586
|
14652
|
67
|
tRNA-LeuAS/CR2
|
(+)
|
14653
|
15023
|
371
|
CR2
|
*
|
14717
|
15023
|
307
|
tRNA-Cys
|
(+)
|
15024
|
15086
|
63
|
Intergenic
|
(+)
|
15087
|
15094
|
8
|
tRNA-Tyr
|
(-)
|
1090
|
1151
|
62
|
LAS1
|
(-)
|
1152
|
5984
|
4833
|
ND1
|
(-)
|
5985
|
6903
|
919
|
tRNA-Leu
|
(-)
|
6904
|
6964
|
61
|
16S rRNA
|
(-)
|
6965
|
8185
|
1221
|
tRNA-Val
|
(-)
|
8186
|
8247
|
62
|
12S rRNA
|
(-)
|
8248
|
8949
|
702
|
CR1
|
(-)
|
8950
|
9258
|
309
|
tRNA-IleAS
|
(-)
|
9259
|
9323
|
65
|
tRNA-Gln
|
(-)
|
9324
|
9390
|
67
|
R1
|
(-)
|
9391
|
9559
|
169
|
tRNA-Phe
|
(-)
|
9560
|
9620
|
61
|
ND5
|
(-)
|
9621
|
11275
|
1655
|
tRNA-His
|
(-)
|
11276
|
11341
|
66
|
ND4/4L
|
(-)
|
11342
|
12928
|
1587
|
tRNA-ThrAS
|
(-)
|
12929
|
12991
|
63
|
tRNA-Pro
|
(-)
|
12992
|
13055
|
64
|
LAS2
|
(-)
|
13056
|
14654
|
1599
|
tRNA-Leu
|
(-)
|
14655
|
14716
|
62
|
LAS3
|
(-)
|
14717
|
1089
|
1467
|
This reference sequence is available at the NCBI GenBank database under the accession number MN347015. J(+) and N(-) represent the major and minor coding strands of the mt genome, respectively. Control Region 1 (CR1) and tandem Repeat 1 (R1) were annotated as full-length RNAs cleaved from the minor coding strand (N-strand) primary transcript, whereas CR2 and R2 were annotated as DNA regions (*). The “AS” suffix represents antisense. H-strand Antisense Segment 1 (HAS 1) represents R2/ND1AS/tRNALeuAS/(16S rRNA)AS/tRNAValAS/(12S rRNA)AS/CR1. HAS2 represents tRNAGlnAS/R1/tRNAPheAS/ND5AS/tRNAHisAS/(ND4/4L)AS. L-strand Antisense Segment 1 (LAS1) represents COIAS/COIIAS/tRNALysAS/tRNA-AspAS/(ATP8/6)AS/COIIIAS/tRNAGlyAS/ND3AS/tRNAAlaAS/tRNAArgAS/tRNAAsnAS/tRNASerAS/tRNAGluAS/R2.
LAS2 represents ND6AS/CytbAS/tRNASerAS. LAS3 represents CR2/tRNACysAS/tRNAMetAS/ND2AS/tRNATrpAS.
Using precise annotations, we obtained two new findings about the D. silvarum mt tRNAs. The first involved six mt tRNA genes, from which atypical tRNAs with no D-arm or an unstable T-arm were inferred [9]. One of tRNASers(mtDNA: 5713:5769) and tRNACys (Figure 2B) had no D-arms, whereas tRNAAla (Figure 2B), tRNAGlu, tRNATyr and tRNAPhe had unstable T-arms. Another new finding was that the intergenic regions between tick mt tRNA genes are longer than those in mammalian except a novel 31-nt ncRNA [4], which was generated in the gene order rearrangement of mammalian mt tRNA genes. Although these intergenic regions in ticks were cleaved between their neighbouring tRNAs to form small RNAs (sRNAs) shorter than 10 bp, they are not likely to have biological functions, in our view. One typical example of a sRNA was A[U]7, between tRNACys and tRNAMet (Figure 2B). Based on these two findings, we found that 1×n STRs involved both intergenic regions (e.g., A[U]7) and atypical mt tRNAs (e.g., [A]5 in tRNACys). Comparison of tRNASer and tRNACys suggested that tRNACys (Figure 2B) with no D-arm had an [A]5 insertion that formed a large loop. Given that the tRNACys DNA sequence had too little evolutionary conservation to allow for a STR insertion, it proved a long-standing hypothesis that atypical tRNAs do not have biological functions.
R1 and R2 (Figure 3) were predicted to be two non-coding and non-transcriptional regions in the previous study [7]. In the present study, however, they were proven to be transcribed on two strands. The repeat units in R1 were reverse complements of those in R2 (Figure 3B). Our DNA-seq data showed that the copy numbers of R1 and R2 exhibited great diversity within an individual, which confirmed the finding from our previous study of the E. fullo mt genome [5]. Since repeat units in R1 and R2 were reverse complements, we used PCR amplification (Table 1) coupled with Sanger sequencing to further investigate R1 sequences in more than 100 individual ticks from four species (D. silvarum, D. nuttalli, D. marginatus and D. niveus) and obtained the following results: (1) for each individual tick, the R1 sequence obtained using Sanger sequencing is actually a consensus sequence of a large number of heterogeneous sequences; (2) copy numbers were distributed between 2 and 5 for all studied repeat units, with one partial repeat unit counted as 1; (3) in total, three types of repeat units of R1 with lengths of 28, 34 and 44 bp (types 1, 2 and 3, respectively) were identified (Figure 3B) and noted as R28, R34 and R44; (4) in general, R1 sequences from individual ticks of one species comprised repeat units of one type and R1 sequences from individual ticks of the same species from different places could have different copy numbers; and (5) of the four species of ticks we studied, D. nuttalli and D. niveus had a R1 which was composed of type 3 units, whereas D. silvarum had a R1 which was composed of type 2 or type 3 units. As for D. marginatus, most of the R1s were composed of the type 3 units; however, a few had R1s composed of the types 1 and 2 hybrid units, noted as [R34]l-[R28]m-[R34]n, where l, m and n represent the copy numbers. The discovery of these “hybrid” units suggested to us that mt DNA recombination may occur within an individual tick, resulting in the insertion of [R28]m into [R34]l+n. This confirmed our proposal of DNA-recombination events in a previous study of the E. fullo mt genome [5]. In that previous study, the insertion of segments A and B into STR [R87] l+m+n resulted in [R87]l-A-[R87]m-B-[R87]n.
Discovery of a transposon-like element
In a previous study, the repeat unit was conceived as the “tick box”—a degenerate 17-bp sequence motif that may be involved in the 3′ formation of ND1 and tRNAGlu transcripts in all major tick lineages [7] [10] [11]. A large translocated segment (LT1) spanning from ND1 to tRNAGln was first reported in 1998 [12] [13] and the presence of the “tick box” motif at both ends of this LT1 indicated its involvement in recombination events that are responsible for known Metastriata ticks [12] [13] [14]. Metastriata genome rearrangements have been found in all Metastriata ticks studied [8] [10] [11] [12] [13] [14] [15] [16] [17] [18] (Figure 1A). In the present study, LT1 was corrected to span R2, ND1, tRNALeu, 16S rRNA, tRNAVal, 12S rRNA, CR1, tRNAIle, tRNAGln and R1 (Figure 3A) in the reference genome using precise annotations. Given that nearly half of the human genome is various types of transposable elements that contain repetitive DNA sequences [19], we hypothesized that LT1 is a transposon, with R1 and R2 as invert repeats (IRs) and genes from ND1 to tRNAGln as insert sequences (ISs). To test our hypothesis, we sought structural variation (Methods) in the D. silvarum mt genome to determine the occurrence of LT1 translocation events. The results proved the occurrence of LT1 inversions within an individual tick (Figure 3A).
Since LT1 inversions were rare, 4.1 Gbp DNA-seq data were generated to cover ~427,247× (4.09 Gbp/9.58 Kbp) of L1 in the D. silvarum mt genome to detect the LT1 inversions. As the dominant copy number is five for both R1 and R2, we used 34×5 STR to represent R1 and R2 in the D. silvarum mt genome (Figure 3C). Thus, R1 and R2 in D. silvarum are ~170-bp long (34×5), which is longer than the reads in the 2×150 bp DNA-seq data. We had to sequence the same library using 2×250 bp sequencing to validate the reference genome and the LT1 inversion (Methods). The substantial diversity in R1 and R2 copy numbers within an individual tick rendered great diversity in LT1. However, we did not obtain full-length sequences of LT1 due to sequence-length limitations in the DNA-seq data. Therefore, we were unable to determine whether R1 and R2 had the same copy numbers within one LT1.
Copy number variation of STRs in the mt genomes within an individual animal
By mapping DNA-seq data to the D. silvarum mt genome, variation detection (Methods) was performed to report two types of DNA variation—SNPs and small insertions/deletions (InDels). Almost all the detected DNA variation within a D. silvarum tick were Copy Number Variation (CNV) of STRs caused by InDels of one or more entire repeat units, whereas SNPs were not detected. We defined the STR position as the genomic position of the first nucleotide of the reference STR. For example, [G]8 was designated as the reference STR at position 1810, because it occurred most frequently in mtDNAs within one individual tick (Table 3); the alternative alleles of [G]8 included [G]6, [G]7, [G]9 and [G]10. Importantly, it was found that almost all of the STRs had multiple variants, particularly those with copy numbers greater than 5. The detection of CNV of STRs was reliable, based on the following reasons: (1) PCR amplification and deep DNA sequencing produces a high signal-to-noise ratio in the detection of DNA variation; (2) the Illumina sequencer generates very rare InDel errors, i.e. few per million bases [20]; (3) it is impossible for sequencing or alignment errors to result in 2-bp InDels in 2×n STR (e.g., [TA]9); (4) the alternative allele ratios (Methods) at some positions were significantly higher and the highest ratio reached was ~32% at position 3441 (Table 3); (5) the same CNV of STRs caused diversity within and between individual ticks, e.g., [TA]9 in our genome (Table 3) and [TA]5 in NC_026552.1; (6) CNV of STRs was detected in other animal species, including E. fullo [5], mouse; and (7) Almost all the detected DNA variation within a single human cell were CNV of STRs, whereas SNPs were not detected using a colon cancer single cell RNA-seq (scRNA-seq) dataset (SRA: SRP113436).
Using a very strict parameter (Methods), we selected 20 STR positions in the D. silvarum mt genome for further study (Table 3). STRs at 18 positions were 1×n STR and those at the other 2 positions were 2×n STR. Almost all of the STRs were composed of A or T, except [G]8 at position 1810. All of the reference STRs and their variants had copy numbers greater than 5. Among all of the variants, 1×n STR occurred much more frequently than m×n STR (m > 1). Thirteen STR positions in the protein-coding genes had identical sequences between individuals, exhibiting a high degree of evolutionary conservation; however, the other seven STR positions in the tRNA and rRNA genes exhibited variation. For example, [A]10, [TA]9, [T]9 and [T]10 at positions 5524, 7324, 8042 and 8243 in our mt genome changed to [A]8, [TA]5, [T]8 and [T]6 at positions 5524, 7324, 8042 and 8243 in NC_026552.1 (Table 3). The CNV of STRs in the protein-coding genes resulted in frameshift mutations in the proteins, which may be deleterious [21]. For example, the COI gene has a 1524-bp Coding Sequence (CDS) for a 308-aa protein. The variant [G]9 at genomic position 1810 (Table 3) results in a 765-bp CDS for a 255-aa truncated COI protein. This finding inspired us to investigate if animal cells have mechanisms to remove mitochondria containing deleterious mutations or inhibit the expression of the deleterious variants, as they can cause loss of function or diseases. We did not, however, obtain deep RNA-seq data for D. silvarum. We had to compare the STR variants in the E. fullo mt genome [5] at the DNA level and RNA levels, using the DNA-seq and RNA-seq data (SRA: SRP174926). However, we found no significant differences between them. This suggested that deleterious STR mutations can irreversibly change the proteins made from their mRNAs and that mitochondria containing deleterious mutations can accumulate in cells.
Table 3. Mitochondrial CNV of STRs within a D. silvarum individual
Position
|
Gene
|
Ref
|
Allele
|
Depth
|
1810
|
COI
|
[G]8
|
Ref/+1G/-1G/+2GG/T/G/C/+1T/*/-2GG
|
352526/6177/3116/150/32/25/19/19/10/6
|
2573
|
COI
|
[A]9
|
Ref/+1A/-1A/T/+2AA/A/+3AAA/-2AA
|
277857/7636/1788/223/170/13/9/8
|
3441
|
tRNA-Asp
|
[A]8
|
Ref/+1A/+2AA/-1A/+1C/+3AAA/+1G/G/C/A
|
438644/195894/3402/1011/69/58/17/10/7/5
|
3619
|
ATP8/6
|
[A]10
|
Ref/+1A/-1A/+2AA/-2AA/T/+3AAA/A/C
|
400525/23182/9717/997/185/137/49/14/11
|
4592
|
COIII
|
[T]11
|
Ref/+1T/-1T/+2TT/-2TT/+3TTT/T/+1A
|
107531/10859/6363/1027/140/110/20/10
|
5524
|
tRNA-Ala
|
[A]10
|
Ref/+1A/-1A/+2AA/-2AA/A/+3AAA
|
97659/3329/2622/131/27/19/9
|
6361
|
ND1
|
[A]10
|
Ref/+1A/-1A/+2AA/-2AA/+3AAA/C
|
94957/4984/3410/313/18/17/8
|
7324
|
16S rRNA
|
[TA]9
|
Ref/-2TA/+2TA
|
93454/4152/3088
|
7737
|
16S rRNA
|
[A]9
|
Ref/+1A/-1A/+2AA/C/A
|
89380/1277/890/32/6/5
|
8042
|
16S rRNA
|
[T]9
|
Ref/+1T/-1T/+2TT/-2TT/G
|
96225/1795/1552/40/8/5
|
8243
|
12S/tRNA-Val
|
[T]10
|
Ref/+1T/-1T/+2TT/-2TT/+3TTT/G
|
99234/3993/2326/151/31/8/7
|
8885
|
12S rRNA
|
[T]10
|
Ref/+1T/-1T/+2TT/-2TT/T/+3TTT
|
86875/4767/3630/287/38/36/18
|
10026
|
ND5
|
[A]10
|
Ref/-1A/+1A/+2AA/-2AA/+3AAA/C
|
99166/5465/3630/134/82/6/5
|
10298
|
ND5
|
[A]13
|
Ref/-1A/+1A/-2AA/+2AA/-3AAA/+3AAA/C
|
91464/17434/7221/2402/721/241/61/5
|
11216
|
ND5
|
[A]17
|
Ref/-1A/+1A/-2AA/+2AA/-3AAA/+3AAA/A/*/+2TA/+1C
|
138074/26622/18595/4443/2859/542/413/96/24/5/5
|
11483
|
ND4/4L
|
[TA]6
|
Ref/-2TA/+2TA/A/T
|
200829/3596/2031/19/7
|
11962
|
ND4/4L
|
[A]11
|
Ref/-1A/+1A/-2AA/+2AA/C/+3AAA/A/-3AAA
|
203481/18678/10550/646/496/28/20/11/10
|
12082
|
ND4/4L
|
[A]8
|
Ref/-1A/+1A/T/+2AA/A
|
215342/1501/821/48/12/7
|
12505
|
ND4/4L
|
[A]9
|
Ref/-1A/+1A/+2AA/C/-2AA
|
227869/3067/2631/47/20/9
|
12697
|
ND4/4L
|
[A]8
|
Ref/+1A/-1A/+2AA/C/A/G
|
230254/1922/497/23/22/6/6
|
This reference sequence (GenBank: MN347015) uses the consensus DNA sequence of one Dermacentor silvarum individual collected from Qiangyang, Gansu province of China. All the STR variants were selected using a very strict parameter (Methods). Ref was designated as the STR reference, as it occured the most frequently in mtDNAs within one individual tick. In the allele column, insertions and deletions were noted as “+” and “-” following the specifications in pileup format [32]. The numbers in the depth column are one-to-one corresponding to the variants in the allele column. For example, the STR position 1810 had a reference sequence [G]8 and a variant [G]9 noted by “+G”, which had a depth 6177.