Intraspecic Genetic Distance and Phylogenetic Evolution of Schizothorax Plagiostomus Inferred from Mitochondrial D-Loop Sequences

The sh in the genus Schizothorax from the Cyprinidae family live in high-altitude Rivers andstreams, are threatened by various anthropogenic stressors. This study aims to characterize S. plagiostomus across Pakistan and throughout the world available on NCBI using the mitochondrial D-loop region, and in particular, to assess the degree of intra-specic pairwise distance among these sequences, as well as to establish their phylogenetic relationships. The percent overall nucleotide composition was 32.6% (A), 33.6% (T), 19.8% (C), and 14.0% (G), which infers that S. plagiostomus control regions is AT-rich (66.2%) and poor in G contents. The mean pair-wise intra-specic nucleotide diversity (Pi)of all the S. plagiostomus was 0.022. While, the inter-specic nucleotide diversity of all the Schizothorax species was 0.049. D-loop sequences for intra-specic variations revealed 765 sites were invariable and 10 were variable, 8 parsimony informative sites and only 2 were singletons. The overall transition/transversion ratio is R = 7.135. Three domains in S. plagiostomus were observed, namely, the termination associated sequence (TAS) domain, the central conserved sequence block (CSB) domain, and the conserved sequence block (CSB) domain. No substitution saturation was detected as an Iss value was signicantly ( 𝑃 < 0.001) lower than the Iss.c in all cases indicating the suitability of the data for phylogenetic analysis. This study signies the importance of the control region for the genetic analysis of S. plagiostomus and also provides a hypothesis of their phylogenetic relationships.


Introduction
The Cyprinids sh include in the genus of Schizothorax encompasses more than 200 species with a world-wide distribution [1,2,3]. Schizothorax species are economically important as fetching high prices in local market and inhabit cold water bodies including the Neelum and Jhelum Rivers in Azad Jammu and Kashmir [4,5]. Over-shing, demolition of their habitat due to road and dam construction and excessive quantity of heavy metals can be considered as the key factors behind the declines of the Schizothorax populations [6,7]. In addition, the deterioration of catchment areas due to unseemly agrarian practices, deforestation, and contamination is lessening water quality, declining these cold-water adapted shes in some water bodies [2][3]8]. However, the genetic diversity, evolutionary history and a liations of these sh species are poorly known [9] as are also their taxonomy and biogeography [10].
The information based on genetic diversity of sh species are applied in genetic improvement programmes and as well as to develop a suitable base population for sustainable use [11]. Preservation of genetic variation within and among populations is an important aspect in management and conservation of biodiversity. Conservation genetics become more important in recent decades [12,13] and species associated to conservation programs should be genetically characterized and compared with other populations of the same species to set up an appropriate conservation strategies [14,15]. It is notable that a decrease in genetic variation diminishes the capacity of a population to adjust to the natural changes and in this way diminishes its drawn-out endurance [16]. The success of breeding and conservation programs as well as effectiveness for management policies would bene t from better knowledge of intra-and interspeci c genetic diversity and divergence. Different approaches have been utilized to examine the genetic differentiation, phylogenetic a liations, biogeographical patterns and taxonomic description of sh and higher vertebrate species [17,18]. In vertebrates, most of the mitochondrial genome variation con ned to a non-coding control (D-loop) region [19] therefore, generally considered as the highly variable in mitochondrial genome [20]. The mitochondrial D-loop region is used to study the genetic structure and phylogeography in many animal species [21,22] This study constitutes the rst attempt to investigate the intraspeci c genetic distance and phylogenetic relationships among the S. plagiostomus distributedin allied region.

Materials And Methods
For the current study the sh samples (n = 60) were randomly collected from the Jhelum and Neelum rivers (from Ghori to Kohala) with cast and gill nets during 2013-2014. Only the sixteen samples were ampli ed and their information is shown in Table ( 1). The collected sh were anesthetized by immersion in 1% benzocaine in water, and euthanized with over dose of benzocaine. Following the analysis, these samples were stored in 90% ethanol and deposited at UAJK Museum. Approximately 0.1g of tissue was sterilized with ethanol, and then washed three times with distilled water. The total DNA was extracted through a standard phenol-chloroform extraction method of Sambrook et al [23]. The region was ampli ed with the newly designed primers: D Loop-F (5′-CAT ATA TGT ATT ATC ACC ATT-3′), D Loop-R (5′-GTT TGA CAA GGA TAA CAG GA-3′) by using the Primer-3 program (https://www.ncbi.nlm.nih.gov/tools/primer-blast/). Sequencing was done for the D-Loop region in Genetic Analyser (ABI Prism 3100, USA). The BioEdit program (http://www.mbio.ncsu.edu/BioEdit) was used for sequence editing to determine nucleotide variants. These sequences were deposited to Genbank (KX364221 to KX364236). To study the population of Jammu and Kashmir India, the D-Loop sequences were retrieved from NCBI GenBank resources ( Fig. 1 and Table 1). Sequence analyses were carried out using MEGA X [24] software. DNASP 5.0 program [25] was used to calculate no. of haplotypes, haplotype diversity (Hd), and nucleotide diversity (Pi). The Median Joining Network (MJN) was used for haplotype construction through Network 10.2 software. DAMBE (v7.2.14) was applied to calculate the entropybased substitution saturation and its critical value [26].The Barbus barbus (NC008654) was used as outgroup to con rm the monophyly of the S. plagiostomus. The evolutionary divergence was calculated by ML method based on the general time reversible with the gamma distribution shape parameter (GTR + G) model. In addition, the demographic history was inferred by analysing mismatch distributions through pairwise differences between all individuals of each population by using Fu's Fs [27] and by departure from mutation-drift equilibrium with Tajima D test.

Control Region Sequence Structure
The percent overall nucleotide composition was 32.6% (A), 33.6% (T), 19.8% (C), and 14.0% (G), which infers that S. plagiostomus CR is AT rich (66.2%) and poor in Gcontents. The D-loop region was AT-rich, and the 5′ end was more conserved than the 3′ end. The mean pair-wise genetic distance of all the studied samples calculated using the Kimura 2-parameter model (K2P) was 0.003 while, the average number of nucleotide differences (k) was 2.16. D-loop sequences for intraspeci c variations revealed 765 sites were invariable in our sampleand 10 were variables, 8 parsimony informative sites and only 2 were singletons. The transition/transversion rate ratios are k 1 = 6.087 (purines) and k 2 = 22.446 (pyrimidines).
We have observed three domains in S. plagiostomus, namely, the termination associated sequence (TAS) domain, the central conserved sequence block (CSB) domain and the conserved sequence block (CSB) domain. In these sequences, we detected three TAS motif -TACAT -in 5′ region and its reverse complement (RC) 'ATGTA' was also found near the 5`end. After TAS, three central CSB blocks (CSB-F, CSB-E and CSB-D) were also observed. The conserved sequence of CSB-F was ATGTAGTAAGAAACCACCAA, which distinguished the central CSB block domain from the TAS domain. CSB-E was positioned after CSB-F, whose conserved sequence was AGGGACAACAACTGTGGGGG. CSB-D was located downstream to CSB-F with its conserved sequence TACTGGCATCTGGTTCCT (Fig. 2). The CSB-F and CSB-D were highly conserved as compared to CSB-E with the few intraspeci c variation-transitions (G to A-2nd residue, C to T − 10th and 13th residues).
Generally, these key sequences were highly conserved and easily documented. In the CSB domain, three conserved sequence blocks (CSBs): CSB-1, CSB-2, and CSB-3 of the S. plagiostomus were found at the 3′end of D-Loop. The length of these CSBs was 22, 18, and 14 nucleotides, respectively. Base composition was extremely speci c to each CSB as follows: CSB-D, T rich; CSB-I, AT rich; CSB-II, C rich; and CSB-III, AC rich. Moreover, pyrimidine block (TTTTTTCCTTTTTTC) consist of 15 bps was also detected between the TAS-4 and CSB-1 regions (Fig. 2).

Intra-speci c D-loop Sequence Variations
The comparative analyses of the current study with all the available sequences of S. plagiostomus Using DAMBE the substitution saturation was assessed for D-Loop region. In these sequences no saturation was observed as shown a linear correlation when the transitions and transversions plotted against genetic distance (Fig. 3). It was also con rmed from a signi cantly higher ( < 0.001) Iss.c value of both of symmetrical (0.742) and asymmetrical (0.497) as compared to Iss values (0.027). These results depicted the suitability of the data for phylogenetic study. Also it was observed that transitions were outnumbering transversions. Phylogenetic analysis was used to estimate relationships among the studied and reference sequences (retrieved from NCBI) of S. plagiostomus to assess historical information of mitochondrial D-Loop region. If we consider the time divergence in mya (million years ago), then we might conclude that S. plagiostomus were radiated from other Schizothorax species about 0.91 mya while this species itself radiate in 0.05 mya due to the geological event that causes the uplifting of Himalaya as shown in Fig. (4a-b).

Population Wise D-loop Sequence Variations
All the studied samples of S. plagiostomus were categories into population 1 (KX364221-KX364226), population 2 (KX364227-KX364236) and compared with the population (population 3) of Jammu and Kashmir (JX101885.1, JX101883.1, JX885847.1, JX885848.1, JX978536.1, KF928796.1). Average length of D-loop sequences was 867 bps. While, the number of sites (excluding sites with gaps / missing data) were 685. Among these, the 70 were segregating sites in which parsimony informative sites were 28 while 42 were singleton. The total number of 17 haplotypes were observed in Schizothorax species and their haplotype diversity (Hd) was 0.970. The haplotype network constructed for D-loop sequences of three Schizothorax population was presented in Fig. 5. Hap_8, Hap_9 and Hap_10 were shared among the population 1 and population 2. This haplotypes sharing seems to be the result of hybridization and de cient taxonomy. As shown in Discussions D-Loopis highly mutable and a speci c non-coding region (compared with nDNA) in the mtDNA genome due to its fast rate of evolution [28]. The D-loop region (~ 770 bps) was identi ed by comparing with mitochondrial reference gene sequences from NCBI as per Lalitha and Chandavar [29]. All the D-Loop sequences were subjected to nucleotide BLAST with S. plagiostomus, showed the maximum similarity (Evalue is less than or equal to 0) authenticating to rule out the risks of numts (nuclear copies of mitochondrial origin). The numts are actually shu ing of mtDNA fragments into nuclear genome [30]. Sorenson and Quinn [30] also suggested to apply newly designed primers using reference sequences available instead of using universal primers. Accordingly, we designed and apply the new primers from the reference sequences of S. plagiostomus.
The sequences of D-Loop region of S. plagiostomus were conserved with the no deletion/insertion however 1.28% of variable sites were found. The overall nucleotide composition was 32.6% (A), 33.6% (T), and 19.8% (C), and 14.0% (G), emphasizing AT-rich contents of animal mitochondrial genome. The best-t ML model for control region was found to be T92 based on lowest AIC (Akaike Information Criterion, corrected) criterion values [31]. Althoughthe control region is highly mutable and rapidly evolving portion of mtDNA [32], structurally, it consist of three domains like, TAS domain, central CSB domain and CSB domain, as found in freshwater turtles of order-Geoemydidae [32] and Trionychidae [33]. The TAS domain with the sequence of -TACAT-and its RC sequence -ATGTA-near 5` end were also reported in Cryptodiran and Pleurodiran turtles [33]. These sequences were also observed in current study.
Brzuzan and Ciesielski [34], also reported these TAS motifs in coregonid species and involved in termination of replication process.  3) were also identi ed in S. plagiostomus. CSB-F was used toseparate the central CSB domain from the TAS domain. The relative position of these regions has been reported also in some other vertebrates [37,38]. The consensus sequence of CSB-F, CSB-E and CSB-D in S. plagiostomus was highly conserved and consistent with those described in other shes studied [39,40]. The CSBs and TAS were also identi ed in S. esocinus of Pakistan [41]. A GTGGG-box (common to euteleosts), next to CSB-D was also identi ed, and also reported by Syed et al [42] in Indian Schizothoracinae. Moreover, a pyrimidine block (TTTTTTCCTTTTTTC) consists of 15 bps was also detected between the TAS-4 and CSB-1 regions. This pyrimidine motif is similarly described in Indian Schizothoracinae [42].
In the primitive Schizothoracinae, the genetic divergance time was estimated through mitochondrial genome. Li et al [43] reported that speci cally Schizothorax species radiated from Early Pleistocene to Late Miocene (1.0-10.2 Ma) notably the time periodof uplifting of Plateau [44]. Possibly the ancestors of Schizothorax species were separated through this tectonic unrest, and cause the successive speciation. Present study reported that S. plagiostomus were radiated from other Schizothorax species about 0.91 mya while this species itself radiate in 0.05 mya due to the geological event that causes the uplifting of Himalaya.
In this study, a relatively high haplotype diversity (0.961) and low nucleotide diversity (0.013) were observed. The combination of high haplotype diversity and low nucleotide diversity also reported from previous studies [45,46,47,48]. This is likely due to rapid demographic expansion from a small effective population size [49,50]. Most of the haplotypes were shared between population 1 and population 2. The haplotype sharing and its connection with other lowest frequencies indicated that the population undergone a series of expansion event in recent time [51].
The deviation from neutrality estimates through the neutrality tests with Tajima's D and Fu's Fs statistics, that is based on the expectation of a constant population size at mutation-drift equilibrium. Here, a negative Tajima's D (-1.566) indicates an excess of low rate polymorphisms relative to expectation, indicating population size expansion or positive selection [52]. The Tajima's D was also used to estimates of selective neutrality, population bottlenecks and range expansion. The overall Tajima's D value was negative with an insigni cant p-value, indicating deviation from evolutionary neutrality. Similarly, the Fu's Fs test, indicating the rare mutations in the populations compared to what is expected under a neutral model of evolution. The signi cant negative Fu's Fs statistical value provides strong support for previous population expansion, and exclude the possibility of background selection and evolutionary forces that fabricate a pattern to population expansion [27,48].
Many sh species are threatened by different factors like, introduction of invasive-exotic species and human activities [53]. These constitute serious challenges which are threatening viability of many sh species, including those of endemic species. The phylogenetic, systematic and taxonomic a nities of cyprinids species of Pakistan, and those of indigenous taxa in particular, are still poorly resolved and highly fragmentary. Due to lack of reliable management plan in this region, natural populations of this species are exposed to over shing by shermen. It is almost impossible to bring them back when they are lost. This is the rst study to report genetic data of S. Plagiostomus from AJK state, where there is a need for devise conservation and management plans for the exploited cold-water sh species. We report the genetic data and phylogenetic relationships among cyprinids, and especially of the S. plagiostomus. It is mandatory to prevent over shing, particularly to prohibit shing throughout reproductive season.

Declarations
Acknowledgements The authors gratefully appreciate the extensive review of the manuscript by Prof. Juha Merila of University of Helsinki. We thank Muhammad Mubarak Ali for expert technical assistance and sh collection. Con ict of interest The authors report no con icts of interest.

Compliance with ethical standards
Ethics approval The Board of Advanced Studies and Research at the University of Azad Jammu and Kashmir in Muzaffarabad, Pakistan provided the permit to conduct this study in the Jhelum and Neelum rivers. No speci c permission was required for the collection sites.
Consent to participate All individual participants included in the study consent to this manuscript to participate.
Consent for publication All individual participants included in the study consent to this manuscript for publication Disclosure Statement The authors alone are responsible for the content and writing of the paper.
Financial interests All authors certify that they have no a liations with or involvement in any organization or entity with any nancial interest or non-nancial interest in the subject matter or materials discussed in this manuscript.