In total, 5,706,546 reads comprising 1,470,273,091 bp (Q30 = 91.69%) were generated from the O. stimpsoni samples. The raw sequences were assembled into contigs. This process eliminates repetitive sequences and creates longer reads, which might increase the probability of detecting microsatellite repeats and suitable primers within a read [23]. The high-quality filtered reads were assembled into 68,969 contigs with a total length of 54,165,912 bp (average, 785 bp), an N50 scaffold size of 748 bp, and a GC content of 38.7% (Table 1). Of the 238,776 identified SSR sequences, 25.69% (61,331), 66.08% (157,788), 7.1% (17,090), 0.9% (2,251), and 0.1% (316) featured di-, tri-, tetra-, penta-, and hexa-nucleotide repeats, respectively. Meanwhile, 13,195 unique sequences containing pure/compound microsatellite regions and primer-designable flanking regions were selected.
Table 1
Summary of Illumina MiSeq sequencing
| Description | Value |
| --- | --- |
| Total number of bases | 1,470.27 Mb |
| Average read length | 257 nucleotides |
| Number of reads | 5,706,546 |
| Number of contigs | 68,969 |
| Total contig length | 54,165,912 nucleotides |
| Average contig length | 785 nucleotides |
| Max. contig length/Min. contig length | 16,572/310 nucleotides |
| N50 | 784 nucleotides |
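For readers unfamiliar with the N50 statistic reported in Table 1, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not taken from the O. stimpsoni assembly or from the assembler used in this study.

```python
def n50(lengths):
    """Return the N50 of a set of contig lengths.

    N50 is the length L such that contigs of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical values, for illustration only)
toy_contigs = [310, 600, 700, 800, 900, 1500]
print(n50(toy_contigs))  # 800
```

Because the N50 is weighted by contig length, it can differ from the arithmetic mean contig length even when the two values are close.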
Among these sequences, those with a minimum of seven tri- or tetra-nucleotide repeat motifs were used to develop MS primers. To design the primers, sequences of adequate length (more than 100 bp) with unique sequences flanking the MS array (minimum of 100 bases) were selected. Thus, 63 MS loci (32 tri- and 31 tetra-nucleotides) were selected for subsequent polymorphism screening. Of these 63 MS loci, 26 (8 tri- and 18 tetra-nucleotides) were amplified successfully in the initial evaluation of the MS primers. The remaining 37 primers did not generate the desired amplification products in all four tested individuals. Additionally, 17 loci displayed faint or inconsistent bands, which might have been attributable to nonspecific PCR amplification. Further screening revealed that 17 (58.3%) loci were polymorphic and 2 loci were monomorphic in the four O. stimpsoni samples. The primer sequences, repeat motifs, fluorescent labels, and GenBank accession numbers for the 17 novel polymorphic microsatellite loci are summarized in Table 2. For efficient genotyping, the PCR products of three or four loci were pooled and run in a single track, with the loci distinguished by product size. Running the pooled PCR products in the same track significantly reduced the analysis time and cost compared with the conventional method (no pooling).
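The selection criteria described above (at least seven tandem copies of a tri- or tetra-nucleotide motif, with at least 100 bases of flanking sequence) can be sketched as a simple regular-expression screen. This is an illustrative sketch only: the function `candidate_loci` and its parameters are hypothetical, the 100-base requirement is assumed to apply to each side of the repeat array, and the uniqueness of the flanking sequence is not checked. It is not the software used in this study.

```python
import re


def candidate_loci(seq, min_repeats=7, min_flank=100):
    """Find tri-/tetra-nucleotide SSRs with enough flanking sequence.

    Returns a list of (motif, repeat_count, start, end) tuples, where
    start/end are 0-based, end-exclusive coordinates of the repeat array.
    """
    seq = seq.upper()
    # A 3- or 4-base unit followed by at least (min_repeats - 1) more copies.
    pattern = re.compile(r"([ACGT]{3,4})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # Skip units that are really mono- or di-nucleotide repeats
        # (e.g. AAA, or ACAC, which is AC repeated twice).
        if len(set(unit)) == 1 or (len(unit) == 4 and unit[:2] == unit[2:]):
            continue
        left_flank = match.start()
        right_flank = len(seq) - match.end()
        if left_flank >= min_flank and right_flank >= min_flank:
            repeats = len(match.group(0)) // len(unit)
            hits.append((unit, repeats, match.start(), match.end()))
    return hits


# Toy sequence: a (TAT)17 array between two 104-bp flanks (hypothetical data)
flank = "ACGTTGCA" * 13
print(candidate_loci(flank + "TAT" * 17 + flank))
# [('TAT', 17, 104, 155)]
```

In practice, primer design tools additionally check melting temperature, product size, and flank uniqueness, none of which this sketch attempts.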
Table 2
Characteristics of the 17 microsatellite loci identified for Ocypode stimpsoni
| Locus (GenBank accession no.) | Primer sequence (5′-3′) | Repeat motif | Size range (bp) | NA | Ho | He | PIC |
| --- | --- | --- | --- | --- | --- | --- | --- |
| OS10 | ACTACTGCTACTACTACCGT-6fam | (TAT)17 | 188–291 | 16 | 0.897 | 0.899 | 0.884 |
| (MF461429) | CCCCTGATAACCTGTCGACG | | | | | | |
| OS11 | GGCGTTATTAGCACTGCTGC-hex | (ACT)11 | 103–148 | 26 | 0.948 | 0.928 | 0.918 |
| (MF461431) | TGTTCCTTTCCTTTTTCACTTT | | | | | | |
| OS12 | GGCCAGCACAGGTAGAGAAA | (AGT)21 | 209–281 | 24 | 0.896 | 0.928 | 0.918 |
| (MF461432) | AGCAGCATCAGGAACAACAA | | | | | | |
| OS13 | AGCAGCAGTAATAGTAGTAGCAGT-6fam | (CAG)8 | 209–278 | 18 | 0.79 | 0.855 | 0.84 |
| (MF461433) | GTGTTTCCTTCGCTCAGTGC | | | | | | |
| OS34* | TGTCTGTCTGTCTTTTGGTCT-pet | (TGTC)11 | 120–168 | 14 | 0.96 | 0.825 | 0.802 |
| (MF461434) | TGACCCGAGGCTTAAAACCC | | | | | | |
| OS36* | CCCTAACCCCCTCTCTGTCT-hex | (CTAT)8 | 142–186 | 11 | 0.95 | 0.773 | 0.736 |
| (MF461435) | AACGTGGCAATGCATAACCG | | | | | | |
| OS41* | GGAACTCTCTCTCAGCACAGG-6fam | (TATT)7 | 206–234 | 8 | 0.92 | 0.778 | 0.743 |
| (MF461436) | GAAACACCTGTGCAGCAGTG | | | | | | |
| OS42 | TTCCTCTCTTCAGCAGCACC-6fam | (TCCT)7 | 162–186 | 7 | 0.788 | 0.783 | 0.746 |
| (MF461438) | TGGGGATGACAAGAGAGCTG | | | | | | |
| OS43 | CTTACGAAGGGGAGAGCGAG-pet | (AGAC)8 | 225–281 | 14 | 0.85 | 0.873 | 0.855 |
| (MF461440) | TGTAATCTACCGTGCCCGAG | | | | | | |
| OS46 | GGAAGGCAGGTATGGAGAGC-pet | (AGGA)7 | 270–338 | 17 | 0.896 | 0.863 | 0.846 |
| (MF461442) | AATCGAAACCAAGCCCTCGT | | | | | | |
| OS47 | CGGCGGGTGATTGTAGCTA-ned | (GTTA)7 | 269–297 | 8 | 0.9 | 0.77 | 0.731 |
| (MF461443) | GAGCTTTGTCAAGAAGCTGCA | | | | | | |
| OS54* | TTGCGACTCCAGAAGGTCAC-6fam | (ATAC)10 | 207–235 | 8 | 0.919 | 0.789 | 0.756 |
| (MF461444) | GCTCCAAGGGCAGAGGTATT | | | | | | |
| OS55* | TGGTGGGGATTCGAATAGCG-hex | (TAAA)7 | 192–208 | 4 | 0.868 | 0.616 | 0.533 |
| (MF461445) | TGCACCATCCACCCTCATTT | | | | | | |
| OS56 | ACCACCCATTCGTCATGTGT-hex | (CATA)8 | 343–393 | 14 | 0.99 | 0.861 | 0.84 |
| (MF461447) | GATGATGGACGGGTCGGTTA | | | | | | |
| OS57 | GGTCAGGACGGTAATGGCAT-pet | (GTAT)9 | 280–328 | 14 | 0.89 | 0.865 | 0.846 |
| (MF461449) | ACGATGAAAACGGCAAAAGTG | | | | | | |
| OS59* | CTGACCTGCTGGCTGGTAAA-ned | (GGCT)8 | 113–149 | 19 | 0.979 | 0.854 | 0.833 |
| (MF461450) | CACCCCAGCTCAAAGACTCA | | | | | | |
| OS63 | CGCAACCTACACAACAGCTG-ned | (TAAA)7 | 228–296 | 18 | 0.773 | 0.935 | 0.925 |
| (MF461453) | GAGTGCTAGGTAGACATGCACA | | | | | | |
NA, number of alleles per locus; Ho, observed heterozygosity; He, expected heterozygosity; PIC, polymorphism information content. Calculations assume that individuals with one microsatellite band are homozygous for the allele. * Indicates significant deviation from Hardy–Weinberg equilibrium after the application of Bonferroni’s correction (p < 0.0029; initial α = 0.05, adjusted α = 0.05/17 = 0.0029).
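As an illustration of the quantities defined in the Table 2 footnote, the sketch below computes NA, Ho, He, and PIC for a single locus from diploid genotypes, together with the Bonferroni-adjusted threshold. It assumes the textbook definitions He = 1 − Σp_i² and the Botstein et al. form of PIC (He minus the sum of 2p_i²p_j² over allele pairs); the software and exact estimators used in this study are not stated here and may differ (for example, by applying a sample-size correction). The function name `locus_stats` and the toy genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Compute NA, Ho, He and PIC for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    A single-band individual is entered as a homozygote, e.g. (200, 200).
    """
    n = len(genotypes)
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [count / (2 * n) for count in counts.values()]
    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(a != b for a, b in genotypes) / n
    # Expected heterozygosity under Hardy-Weinberg proportions (uncorrected).
    he = 1 - sum(p * p for p in freqs)
    # PIC: He minus the correction term over all unordered allele pairs.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return {"NA": len(counts), "Ho": ho, "He": he, "PIC": pic}


# Bonferroni-adjusted significance threshold for 17 loci
adjusted_alpha = 0.05 / 17

# Toy genotypes (allele sizes in bp; hypothetical values)
toy = [(200, 200), (200, 204), (204, 204), (200, 204)]
print(locus_stats(toy))
print(round(adjusted_alpha, 4))  # 0.0029
```

For the toy data, both alleles have a frequency of 0.5, so Ho and He are both 0.5 and PIC is 0.375.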
One hundred O. stimpsoni individuals were screened for variation at the 17 novel polymorphic microsatellite loci. The statistical results for these microsatellite loci are summarized in Table 2. A homology search using BLAST showed that none of these sequences was similar to any sequence in GenBank. The 17 newly identified microsatellite markers of O. stimpsoni were relatively highly polymorphic, with the exception of one locus (OS-55). Understanding the genetic diversity of O. stimpsoni populations is vital for stock abundance recovery and protection of this species.
In total, 240 alleles were observed for the 17 loci. The NA per locus varied from 4 at OS-55 to 26 at OS-11 (mean, 14.12; Table 2). Ho ranged from 0.773 at OS-63 to 0.99 at OS-56 (mean, 0.894), whereas He varied from 0.616 at OS-55 to 0.935 at OS-63 (mean, 0.835; Table 2). All 17 loci had high PIC (> 0.5), and rare alleles with a frequency of < 5% were detected at most loci. There was no evidence of genotyping errors or allele dropouts attributable to stuttering, which would have affected the allele scoring. Samples that failed to amplify after the rerun were excluded, and thus, the likelihood that poor DNA quality affected the results was low. Deviation from HWE (p < 0.0029) was evident at six loci with null alleles (OS-34, OS-36, OS-41, OS-54, OS-55, and OS-59).
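The summary figures in this paragraph can be cross-checked against the per-locus values in Table 2. The sketch below transcribes the NA, Ho, He, and PIC columns and recomputes the total allele count and the loci holding the minimum and maximum of each statistic. The names `TABLE2` and `extremes` are illustrative; means recomputed from the rounded table values may differ in the last digit from means calculated on unrounded data.

```python
# (NA, Ho, He, PIC) per locus, transcribed from Table 2
TABLE2 = {
    "OS10": (16, 0.897, 0.899, 0.884),
    "OS11": (26, 0.948, 0.928, 0.918),
    "OS12": (24, 0.896, 0.928, 0.918),
    "OS13": (18, 0.790, 0.855, 0.840),
    "OS34": (14, 0.960, 0.825, 0.802),
    "OS36": (11, 0.950, 0.773, 0.736),
    "OS41": (8, 0.920, 0.778, 0.743),
    "OS42": (7, 0.788, 0.783, 0.746),
    "OS43": (14, 0.850, 0.873, 0.855),
    "OS46": (17, 0.896, 0.863, 0.846),
    "OS47": (8, 0.900, 0.770, 0.731),
    "OS54": (8, 0.919, 0.789, 0.756),
    "OS55": (4, 0.868, 0.616, 0.533),
    "OS56": (14, 0.990, 0.861, 0.840),
    "OS57": (14, 0.890, 0.865, 0.846),
    "OS59": (19, 0.979, 0.854, 0.833),
    "OS63": (18, 0.773, 0.935, 0.925),
}


def extremes(column):
    """Return (locus_with_min, locus_with_max) for one statistic column.

    column: 0 = NA, 1 = Ho, 2 = He, 3 = PIC.
    """
    lowest = min(TABLE2, key=lambda locus: TABLE2[locus][column])
    highest = max(TABLE2, key=lambda locus: TABLE2[locus][column])
    return lowest, highest


total_alleles = sum(row[0] for row in TABLE2.values())
mean_na = total_alleles / len(TABLE2)
mean_he = sum(row[2] for row in TABLE2.values()) / len(TABLE2)
all_pic_informative = all(row[3] > 0.5 for row in TABLE2.values())

print(total_alleles)          # 240
print(extremes(0))            # ('OS55', 'OS11')
print(extremes(1))            # ('OS63', 'OS56')
print(extremes(2))            # ('OS55', 'OS63')
print(all_pic_informative)    # True
```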
With the rapid development of sequencing technology during the past decade, NGS technologies now allow for the large-scale identification of microsatellite markers [24]. Microsatellites with larger numbers of repeats are usually highly polymorphic and more stable, and they display more clearly than shorter tracts [25]. The 17 polymorphic microsatellite loci identified in this study consisted of 4 tri-nucleotide repeats and 13 tetra-nucleotide repeats.
Generally, heterozygosity deficiency is caused by allelic dropout, limited sample size, size homoplasy, or the presence of null alleles [26, 27]. Despite the small number of studied individuals, most loci were clearly associated with the presence of null alleles. These 17 highly polymorphic microsatellite markers could be useful for studies on the genetic diversity and conservation genetics of O. stimpsoni and for supporting the effective management of this species.