Development and identification of 131 SNP markers in Sthenoteuthis pteropus (Steenstrup)

The orangeback flying squid, Sthenoteuthis pteropus, is a species of significant potential value that is widely distributed in the tropical and temperate waters of the Atlantic Ocean. There have been no reports on the population genetics of this species or on effective molecular markers for it, owing to a lack of reliable information on its genetic structure, its high individual variability, and its complex and changeable life history. The development of molecular markers would therefore contribute to the development, sustainable utilization, and protection of the species. In this study, 131 novel single nucleotide polymorphism (SNP) markers were developed by double digest restriction-site associated DNA sequencing (ddRAD-Seq). The observed heterozygosity (Ho) and expected heterozygosity (He) ranged from 0.00 to 0.80 and 0.18 to 0.50, respectively. The polymorphism information content (PIC) value ranged from 0.18 to 0.50. None of the loci deviated significantly from Hardy–Weinberg equilibrium (p > 0.05) after a Bonferroni correction. These polymorphic SNPs will be important for further analysis of the population genetics of S. pteropus and for its scientific management.

Sthenoteuthis pteropus is an economically important species that is widely distributed in the eastern Atlantic Ocean (from the Madeira Islands to the western Gulf of Guinea) and the western Atlantic Ocean (from the Nova Scotia Peninsula to the Gulf of Mexico and the Caribbean) (Merten et al. 2017; Chen et al. 2009). The instantaneous biomass of the species is in the range of 4.2-6.5 million tons, and the annual total biological yield is estimated to be 34-52 million tons, but a dedicated fishery has not yet been established (Merten 2016). Existing studies have mainly focused on the microelement content, age, and growth of S. pteropus (Lischka et al. 2018; Laptikhovsky et al. 1993; Arkhipkin and Mikheev 1992). There have been few studies of the population genetics of the species, and corresponding molecular markers have not been developed. It is therefore necessary to study the Atlantic population of S. pteropus, especially with regard to its population genetics and the corresponding molecular markers.
In recent years, with the development of double digest restriction-site associated DNA sequencing (ddRAD-Seq), single nucleotide polymorphism (SNP) markers have been widely used in population genetics due to their wide genome coverage, representativeness, and ease of automated analysis (Xu et al. 2020; Li et al. 2021). Because few population genetic studies of S. pteropus exist, it is necessary to develop and screen a set of readily available SNP loci, which will contribute to the sustainable development and utilization of the species and the conservation of its population diversity. In this study, a number of SNP markers of S. pteropus were developed and effectively identified by ddRAD-Seq.
Muscle samples of 55 S. pteropus individuals were collected in the central eastern Atlantic Ocean (1°12′ W-8°00′ W, 1°16′ S-5°01′ S). Genomic DNA was extracted by the phenol-chloroform method (Russell and Sambrook 2001). DNA quality was checked by 0.8% agarose gel electrophoresis, and DNA was quantified using a UV spectrophotometer. Five S. pteropus individuals were randomly selected for ddRAD-Seq library construction. The libraries were prepared by digesting double-stranded genomic DNA with two restriction enzymes (DpnII and BfaI; New England Biolabs, NEB, USA) to obtain fragments in the size range of 500-600 bp. A sample-specific adapter was ligated to the ends of the digested fragments, which were then pooled. After filtration, fragments of about 500 bp remained in the pooled mixture.
The target ddRAD-Seq tags were amplified with specific primers to produce a final product of 220-450 bp for sequencing library construction (Wang et al. 2010), and sequencing was performed on the Illumina NovaSeq™ platform in PE150 mode. A total of 16.92 Gb of raw data was obtained by paired-end sequencing, with an average of 3.38 Gb per sample. The proportion of bases with a quality score of Q30 or above was 91.37%. SNPs were then screened using a minor allele frequency (MAF) of 0.05-0.10, a missing rate of ≤ 0.05, and a minimum sequencing depth of 5, which yielded a total of 1896 SNPs. To improve reliability, we screened SNP loci from 70 to 90 bp and retained 131 SNP markers. The 131 SNP markers were genotyped by multiplex polymerase chain reaction (PCR) in 50 S. pteropus individuals. The PCR amplification primers and single base extension primers were designed using the MassARRAY Assay Design software (Sequenom, San Diego, CA, USA). The PCR reaction system contained 1 μL DNA (10-30 ng/μL), 0.25 μL of each primer, and 1.1 μL PCR mix, with ddH2O added to a final volume of 5 μL. The PCR conditions were as follows: 94 °C for 3 min; 40 cycles of denaturation at 94 °C for 30 s, annealing at 56 °C for 25 s, and extension at 72 °C for 30 s; and a final extension at 72 °C for 3 min, after which the reaction was held at 4 °C. The amplified products were analyzed on a MassARRAY Analyzer Compact mass spectrometer (Sequenom), the results were analyzed using TYPER software, and SNP markers were called from the genotyping results. For the validated SNPs, observed heterozygosity (Ho) and expected heterozygosity (He) were calculated using the populations program of the Stacks pipeline (Catchen et al. 2013). The polymorphism information content (PIC) was estimated and Hardy-Weinberg equilibrium was tested using PowerMarker V3.25 (Liu and Muse 2005).
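The screening thresholds described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the data layout (one list of per-sample calls per site), the function name `passes_filters`, and the choice to treat calls below the depth threshold as missing are assumptions made for illustration, and only the lower MAF bound of 0.05 is applied.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical layout: one (genotype, depth) pair per sample, where genotype
# is the count of the alternate allele (0, 1 or 2) or None if uncalled.
Call = Tuple[Optional[int], int]

MIN_MAF = 0.05      # lower MAF bound quoted in the text
MAX_MISSING = 0.05  # maximum missing rate quoted in the text
MIN_DEPTH = 5       # minimum sequencing depth quoted in the text


def passes_filters(calls: Sequence[Call]) -> bool:
    """Return True if a biallelic site passes the depth, missing-rate and MAF filters."""
    if not calls:
        return False
    # Assumption: calls below the depth threshold are treated as missing.
    kept = [g for g, depth in calls if g is not None and depth >= MIN_DEPTH]
    missing_rate = 1 - len(kept) / len(calls)
    if missing_rate > MAX_MISSING:
        return False
    alt_freq = sum(kept) / (2 * len(kept))
    maf = min(alt_freq, 1 - alt_freq)
    return maf >= MIN_MAF
```

With five library samples, a missing rate of ≤ 0.05 means in practice that every sample must carry a call at the site.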
Linkage disequilibrium (LD) was calculated using PLINK 2.0 alpha (Chang et al. 2015).
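For reference, the diversity statistics named above can be computed for a single biallelic locus from its genotype counts. The sketch below uses the standard textbook definitions (Ho as the proportion of heterozygotes, He = 2pq, and the PIC formula of Botstein et al. reduced to two alleles). It is an illustration under those assumptions, not a reimplementation of Stacks or PowerMarker, which may apply sample-size corrections; the function name `diversity_stats` is hypothetical.

```python
from typing import Tuple


def diversity_stats(n_aa: int, n_ab: int, n_bb: int) -> Tuple[float, float, float]:
    """Return (Ho, He, PIC) for one biallelic locus from genotype counts."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals at this locus")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    ho = n_ab / n                    # observed heterozygosity
    he = 2.0 * p * q                 # expected heterozygosity (uncorrected)
    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2, here with two alleles
    pic = 1.0 - p * p - q * q - 2.0 * p * p * q * q
    return ho, he, pic


# Example with invented counts for 50 individuals (not data from this study)
ho, he, pic = diversity_stats(10, 20, 20)
print(f"Ho={ho:.2f} He={he:.2f} PIC={pic:.4f}")
```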
All 131 SNP loci were successfully genotyped in the 50 S. pteropus individuals, and all 131 candidate SNP markers were polymorphic and biallelic (Supplementary Table 1). The genetic diversity statistics are shown in Table 1. Ho ranged from 0.00 to 0.80 and He from 0.18 to 0.50. The PIC ranged from 0.18 to 0.50. After a Bonferroni correction, all loci conformed to Hardy-Weinberg equilibrium. No significant LD was detected. To the best of our knowledge, these are the first SNP markers developed for S. pteropus. These new SNP markers will enable further population genetic analyses and a better understanding of this important species, which will support the utilization of S. pteropus resources and the protection of its populations.
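The Bonferroni-corrected equilibrium test reported above can be sketched as follows. The text does not state which test PowerMarker applied, so a chi-square goodness-of-fit test with one degree of freedom is assumed here; the function names are hypothetical and the genotype counts in the example are invented for illustration.

```python
import math


def hwe_chi2_pvalue(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotyped individuals at this locus")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2.0 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    if min(expected) == 0.0:
        return 1.0  # monomorphic locus: the test is not informative
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_alpha(0.05, 131)  # 131 loci tested
p_value = hwe_chi2_pvalue(8, 24, 18)     # invented counts, 50 individuals
print(p_value >= threshold)              # True means no significant deviation
```

Under this scheme a locus is flagged as deviating from equilibrium only if its p-value falls below 0.05/131, which is far stricter than the uncorrected 0.05 level.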
Author contributions
YZ carried out all the experiments; CW and HX designed the methods and experiments, interpreted the results, and wrote the discussion. BL and GL were responsible for overall supervision and participated in coordination. All authors read and approved the final manuscript.

Conflict of interest
The authors declare that they have no conflict of interest.

Ethical approval
Experimental protocols involving live animals were approved by the Ethics Committee for the Use of Animal Subjects of Shanghai Ocean University.