Development of 84 single nucleotide polymorphism (SNP) markers for the three-spot swimming crab (Portunus sanguinolentus) using the RAD approach
Guidong Miao1,2,3, Feng Wang1,2,3, Jihua Guo1,2,3, Pei Zhang1,2,3, Hongyu Ma4


The three-spot swimming crab (Portunus sanguinolentus) is one of the most important large economic crabs in China. In this study, we isolated and characterized, for the first time, a set of 84 SNP loci in P. sanguinolentus. PCR products from 10 primer pairs were sequenced, and a total of 3181 bp of high-quality DNA sequence was obtained, from which 84 polymorphic SNP loci were identified and accurately genotyped. The average frequency of SNP loci was one locus every 38 base pairs in the P. sanguinolentus genome. Each of these 84 SNP loci was biallelic, with the minor allele frequency ranging from 0.0167 to 0.5000. The observed heterozygosity varied from 0.0333 to 0.7143, while the expected heterozygosity ranged from 0.0333 to 0.5085 per locus. Fifty-one loci showed low variation (HO ≤ 0.3) and fourteen loci showed high variation (HO ≥ 0.5). Among the 84 SNP loci, 11 showed significant deviation from Hardy–Weinberg equilibrium. The SNP markers developed herein will provide valuable information for elucidating the population genetic diversity, population dynamics, and conservation genetics of this germplasm resource and of related crab species.

at depths of 40 to 80 m (Campbell and Fielder 1986), and in the reproductive season, berried females migrate to deeper waters for spawning. Because of its large size, good flavor, high nutritional value, and affordable price, this resource is of considerable socioeconomic importance to coastal villages along the south coast of China. P. sanguinolentus is becoming a main candidate for marine aquaculture because of its high market demand (Williams et al. 2001). There have been numerous studies of P. sanguinolentus taxonomy (Dai and Yang 1991), maturation (Sumpton et al. 1989; Jacob et al. 1990; Rasheed and Mustaquim 2010), and reproduction (Campbell and Fielder 1986). However, little information is available about P. sanguinolentus in the waters of China (Huang 1993), and even less about its population genetic structure. RAPD markers and mtDNA have been used to study the phylogenetic relationships of marine crabs (Jin et al. 2004; Ma et al. 2016; Meng et al. 2016), and the mtDNA COI gene has been used to evaluate population structure (Ren et al. 2017). Furthermore, sequence information for candidate microsatellite and SNP markers has been reported from transcriptome analysis of this species, but no validated marker primers have been used for population genetic analysis.
Single nucleotide polymorphisms (SNPs), a novel type of genetic marker, are considered an ideal molecular tool for population genetics and molecular phylogenetics because of their high abundance and wide distribution in the genome (Yang et al. 2020), and they have been widely used in genetic diversity analyses of aquaculture species (Feng et al. 2014; Cui et al. 2015; Miao et al. 2017). To date, no SNP data have been available for P. sanguinolentus, which has limited studies of its molecular genetics and population resources.
An understanding of the population genetics of P. sanguinolentus is urgently needed. Owing to the lack of effective molecular markers, a genetic stock assessment of this commercial crab has never been carried out in Chinese waters (Lee and Hsu 2003). The purpose of this research was to isolate and characterize SNP markers for P. sanguinolentus using restriction site-associated DNA sequencing (RAD-seq), so as to facilitate studies of the population genetic diversity and conservation genetics of this species.
The restriction site-associated DNA (RAD) library of P. sanguinolentus was constructed in our laboratory. We sequenced muscle samples from 20 individuals collected in Shantou (Guangdong, China) on the Illumina HiSeq PE150 platform. SAMtools was used to sort the sequence alignment results and remove duplicate sequences, and its mpileup function was then used to detect variants across all samples. In the RAD library population, a variant locus was output and treated as a putative SNP locus as long as it appeared in at least one sample. BCFtools and vcfutils.pl with default parameter settings were used to filter all the variant loci. Finally, 1,633,225 putative SNP loci were obtained, including 360,251 INDEL loci.
Subsequently, a subset of the SNP loci was isolated and identified by Sanger sequencing. From the sequences containing putative SNP loci, 30 sequences (without INDEL loci) with a high abundance of putative SNPs were randomly selected for primer design using Primer Premier 5.0. Thirty wild P. sanguinolentus individuals were randomly collected from the inshore waters of Shantou City, China, and the designed primer pairs were examined for amplification efficiency using all 30 individuals. Genomic DNA was extracted using the TIANamp Marine Animals DNA Kit (TIANGEN, China) following the manufacturer's protocol. PCR amplification was performed on ABI 9700 PCR thermocyclers (Life Technologies, New York, USA). The amplification reaction volume was 30 µL, containing 100 ng template DNA, 3.0 µL of 10× reaction buffer, 1.5 µL of 25 mmol/L MgCl2, 1.4 µL of 2.5 mmol/L dNTPs, 1 µL of each primer (10 µmol/L), 2 units of EasyTaq DNA Polymerase (TransGen Biotech, China), and DNase-/RNase-free deionized water. The thermal cycling conditions were as follows: initial denaturation at 95 °C for 4 min; 32 cycles of denaturation at 95 °C for 30 s, annealing at 48-61 °C for 50 s, and extension at 72 °C for 50 s; and a final extension at 72 °C for 10 min. Fifteen of the 30 primer pairs yielded PCR products. The PCR products were purified using a Gel Extraction Kit (Tiangen, Beijing, China) and directly sequenced in both directions on an ABI 3730 DNA Analyzer (Applied Biosystems, CA, USA). Sequences were assembled using Lasergene software and manually corrected by eye. The genotypes at each locus were scored manually from the chromatogram peaks and their colors.
After sequencing of the PCR products, 15 good-quality sequences were acquired, 10 of which contained polymorphic SNP loci; these 10 sequences were used for assembly. Finally, 3181 bp of high-quality DNA sequence was obtained, and 84 SNP loci were identified. The number of SNP loci ranged from four to sixteen per contig. Crustacean genomes are thought to be complex and highly variable in size (Jeffery 2012). In the mud crab (Scylla paramamosain) and the blue swimming crab (P. pelagicus), the SNP density was reported to be one locus every 146 bp and 93 bp in genomic DNA (Feng et al. 2014; Miao et al. 2017) and one locus every 338 bp in functional gene sequences (Ma et al. 2011), respectively. In this study, the SNP density was one locus every 38 bp, which was higher than the densities reported in S. paramamosain and P. pelagicus. Sequencing of the PCR products from five primer pairs did not reveal polymorphic SNP loci, so the bases of these five sequences were not counted in the total sequence length. This exclusion, together with the selection of primers targeting sequences rich in putative SNPs, contributed to the high SNP density, which may also indicate a complex genome in P. sanguinolentus.
Of the 84 SNP loci identified (Table 1), 61 substitutions were transitions (32 A/G and 29 C/T) and 23 were transversions (6 A/C, 8 A/T, 6 C/G, and 3 G/T). The transition:transversion ratio was 1:0.38, which was equal to that found in functional genes and genomic DNA of P. pelagicus but double that in genomic DNA of S. paramamosain (Ma et al. 2011; Feng et al. 2014). Further, because sequences containing insertion or deletion loci were excluded during primer design, no insertions or deletions were observed in P. sanguinolentus.
All 84 SNP loci were biallelic, with the minor allele frequency ranging from 0.0167 to 0.5000. A similarly low minor allele frequency (0.006) was reported in Physeter macrocephalus (Morin et al. 2007).
The low minor allele frequencies may be due to the small sample size, the sampling strategy, and the presence of selective pressure.
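The SNP density and the transition:transversion ratio reported above follow directly from the stated counts. The short Python sketch below recomputes them as a sanity check; the function and variable names are illustrative only and are not taken from the study.

```python
# Substitution counts as reported in the text for the 84 SNP loci.
SUBSTITUTION_COUNTS = {"A/G": 32, "C/T": 29, "A/C": 6, "A/T": 8, "C/G": 6, "G/T": 3}

# Transitions swap purine for purine (A/G) or pyrimidine for pyrimidine (C/T);
# every other substitution class is a transversion.
TRANSITIONS = {"A/G", "C/T"}


def summarize_snps(counts, total_bp):
    """Return SNP totals, the transversion-per-transition ratio, and bp per SNP."""
    ts = sum(n for kind, n in counts.items() if kind in TRANSITIONS)
    tv = sum(n for kind, n in counts.items() if kind not in TRANSITIONS)
    n_snps = ts + tv
    return {
        "n_snps": n_snps,
        "transitions": ts,
        "transversions": tv,
        "tv_per_ts": tv / ts,             # the "0.38" in the ratio 1:0.38
        "bp_per_snp": total_bp / n_snps,  # average spacing between SNP loci
    }


# 3181 bp is the total high-quality sequence length reported in the text.
summary = summarize_snps(SUBSTITUTION_COUNTS, total_bp=3181)
print(summary)
```

With the reported counts, the sketch gives 61 transitions, 23 transversions, a ratio of about 1:0.38, and roughly one SNP every 38 bp, in agreement with the text.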
A total of 164 alleles were found at the 84 SNP loci among the 30 individuals. The observed heterozygosity varied from 0.0333 to 0.7143, while the expected heterozygosity ranged from 0.0333 to 0.5085 per locus. Fifty-one loci showed low variation (HO ≤ 0.3) and fourteen loci showed high variation (HO ≥ 0.5).
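As an illustration of how such per-locus statistics can be derived from genotype counts, the sketch below computes the minor allele frequency, the observed heterozygosity, and the expected heterozygosity for a biallelic locus. The text does not state which estimator or software was used; the sketch assumes Nei's unbiased estimator of expected heterozygosity, and the genotype counts in the examples are hypothetical.

```python
def locus_stats(n_AA, n_Aa, n_aa):
    """Return (MAF, observed heterozygosity, expected heterozygosity).

    Arguments are genotype counts at one biallelic locus. Expected
    heterozygosity uses Nei's unbiased estimator (an assumption here):
    He = 2n / (2n - 1) * (1 - p^2 - q^2).
    """
    n = n_AA + n_Aa + n_aa              # number of genotyped individuals
    p = (2 * n_AA + n_Aa) / (2 * n)     # frequency of allele A
    q = 1.0 - p                         # frequency of allele a
    maf = min(p, q)
    ho = n_Aa / n                       # proportion of heterozygotes
    he = (2 * n / (2 * n - 1)) * (1.0 - p * p - q * q)
    return maf, ho, he


# Hypothetical locus: a single heterozygote among 30 individuals.
rare = locus_stats(29, 1, 0)
# Hypothetical locus: both alleles equally frequent among 30 individuals.
balanced = locus_stats(8, 14, 8)

print([round(x, 4) for x in rare])
print([round(x, 4) for x in balanced])
```

Under this assumption, a single heterozygote among 30 individuals gives MAF = 0.0167 and HO = HE = 0.0333, and a locus with equal allele frequencies gives HE = 0.5085, which matches the endpoints of the ranges reported above.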
Among the 84 SNP loci, 11 showed significant deviation from Hardy-Weinberg equilibrium after Bonferroni correction (adjusted P < 0.00060) (Table 1); this might result from the limited number of specimens used for SNP genotyping. This study revealed rich genetic variation in the P. sanguinolentus population, similar to that found in previous studies (Feng et al. 2014; Cui et al. 2015; Ren et al. 2016; Miao et al. 2017; Lu et al. 2021). These SNP markers will be helpful for studies of the population genetic diversity, population dynamics, and conservation genetics of marine crab species.
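The adjusted significance threshold quoted above is consistent with a Bonferroni correction of α = 0.05 across 84 tests (0.05/84 ≈ 0.00060). The sketch below shows that calculation together with one possible per-locus test. The text does not say which Hardy-Weinberg test was applied, so the chi-square goodness-of-fit test with one degree of freedom is an assumption, and the genotype counts are hypothetical.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_tests


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus.

    Returns (chi2, p_value) with 1 degree of freedom. This is an assumed
    test; an exact test may be preferable for small samples.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)
    q = 1.0 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


threshold = bonferroni_threshold(0.05, 84)

# Hypothetical locus with no heterozygotes among 30 individuals.
chi2, p_value = hwe_chi_square(15, 0, 15)
print(round(threshold, 5), chi2, p_value < threshold)
```

With only 30 individuals, the expected genotype counts at low-frequency loci are small, so the chi-square approximation can be unreliable. This is in line with the caution above that the observed deviations may reflect the limited number of specimens.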