Universal Fragment of mtDNA for Cervidae and Other Ungulate Species Identification

Objective
The extensive literature on the identification of free-living animal species has helped to identify specific nucleotide marker sequences. In many cases, the best marker for distinguishing species is the mitochondrial genome (mtDNA) or one of its fragments. In molecular analyses of Cervidae, biological material such as blood or muscle is easily obtained, as nearly all of these species are game animals. Here we present a case study of successful species identification based on degraded bone samples, using short mtDNA fragments. We obtained a partial sequence of the mitochondrial cytochrome b (Cytb) gene for Capreolus capreolus, Dama dama, and Cervus elaphus that can be used for species affiliation. The proposed methodology is helpful as a routine identification procedure for a variety of tissue sources, even when the samples are degraded. The new sequences have been deposited in GenBank, enriching the existing Cervidae mtDNA database.


Introduction
The current state of knowledge in molecular biology has led to the widespread use of mitochondrial DNA (mtDNA) as a marker for species-specific identification in animals [1][2][3][4][5]. For intraspecific detection of unrelated individuals, sequences with high variability are recommended, e.g. certain nuclear genes [6].
For species identification within Cervidae, we chose conserved sequences that are shared among the animals but contain species-specific variable sites, because this approach gives the best results [7][8][9].
Mitochondrial DNA is known to be an effective molecular marker in phylogenetic analyses [10,11]. This is due to the high polymorphism of the control region, the lack of recombination, very good isolation efficiency even from small amounts of biological tissue, and the resistance of mtDNA to degradation. The analysis of species-specific variation using the homologous cytochrome b (Cytb) gene is characterized by high reproducibility and sensitivity [11][12][13][14][15][16]. To distinguish closely related species, selected mtDNA fragments with very high specificity are needed. Conserved protein-coding gene sequences are often used in studies of interspecies diversity [17], while the control region provides a reliable source of knowledge about intraspecific variability [7,10,17].
Cytochrome b provides excellent phylogenetic information on the taxonomic position of various vertebrates; it can be used in the analysis of live specimens or for forensic identification purposes [18][19][20][21]. In addition, this gene is often considered when determining the origin of samples from difficult biological materials, i.e. hair, feathers, tooth fragments or other bones, for which analyses mainly rely on mitochondrial DNA polymorphisms [18]. Irwin et al. [22] determined the rate of evolutionary change for the genera of some species in different components of cytochrome b amino acid sequences based on fossil DNA analyses. Several recent studies show that when the DNA template is derived from bone material, a 300-500 bp Cytb fragment is suitable for mammalian species identification [19,[23][24][25][26].
We have found the mitochondrial cytochrome b gene to be useful in identifying species of wild animals using bone material (mandible, frontal bone). The aim of our research was to develop a short universal fragment of mtDNA that could be used for species identification in various deer populations.

Sampling DNA
Skull bone samples were obtained from wild populations of 3 ungulate species in 2016-2018. DNA was isolated using a column-based method with the GeneMatrix Bond DNA Purification Kit (Eurx). The purity and concentration of DNA from the bone material were determined using a NanoDrop 2000c spectrophotometer (Thermo Scientific).
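Spectrophotometric purity and concentration readings of this kind can be interpreted with a short calculation. The sketch below is illustrative only: it assumes the common conversion of 50 ng/µL of double-stranded DNA per absorbance unit at 260 nm (1 cm path length) and treats an A260/A280 ratio near 1.8 as typical for pure DNA. The function name, thresholds, and example readings are hypothetical and are not taken from the study.

```python
def assess_dna_isolate(a260, a280, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/uL) and purity from absorbance readings."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    # Assumption: 1 absorbance unit at 260 nm ~ 50 ng/uL dsDNA (1 cm path length).
    concentration = a260 * 50.0 * dilution_factor
    ratio = a260 / a280
    # Assumed rule-of-thumb thresholds; laboratories may use different cut-offs.
    if ratio < 1.7:
        note = "possible protein or phenol contamination"
    elif ratio > 2.0:
        note = "possible RNA contamination"
    else:
        note = "within the usual range for DNA"
    return concentration, ratio, note


# Hypothetical readings, for illustration only.
conc, ratio, note = assess_dna_isolate(a260=0.42, a280=0.20)
print(f"{conc:.1f} ng/uL, A260/A280 = {ratio:.2f}: {note}")
```

A ratio well above 2.0, as in this made-up reading, is the pattern usually associated with co-purified RNA.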

Mitochondrial DNA analysis
The following primer pair was used for PCR amplification:
The thermal reaction profile used to amplify the Cytb regions was as follows: initial denaturation at 95 °C for 2 min, followed by 35 cycles of denaturation at 95 °C for 30 s, annealing at 58 °C for 30 s, and primer extension at 72 °C for 30 s, with a final extension at 72 °C for 7 min. PCR products were checked by electrophoresis in a 1.5% agarose gel containing ethidium bromide in TBE buffer (pH 8.0); the gels were visualized under UV light and archived using the GeneSys V.1.3.5.0 software (Syngene). The sequences reported in this paper have been deposited in the GenBank nucleotide sequence database under the accession numbers listed in Table 2.
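As a quick sanity check on the thermal profile, the programmed hold time can be added up. The sketch below encodes the steps stated in the text as plain tuples; the variable and function names are illustrative, and ramping time between temperatures is ignored, so the real run would take longer.

```python
# Thermal profile from the text: (step name, temperature in deg C, hold time in seconds).
INITIAL_DENATURATION = ("initial denaturation", 95, 120)
CYCLE_STEPS = [
    ("denaturation", 95, 30),
    ("annealing", 58, 30),
    ("extension", 72, 30),
]
N_CYCLES = 35
FINAL_EXTENSION = ("final extension", 72, 420)


def total_hold_seconds():
    """Sum the programmed hold times (ramp times between steps are not included)."""
    per_cycle = sum(seconds for _, _, seconds in CYCLE_STEPS)
    return INITIAL_DENATURATION[2] + N_CYCLES * per_cycle + FINAL_EXTENSION[2]


total = total_hold_seconds()
print(f"{total} s = {total / 60:.1f} min")  # 3690 s = 61.5 min
```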

Sequence Analysis
First, the forward and reverse sequences were edited, and consensus sequences were obtained using the Basic Local Alignment Search Tool (BLAST). ClustalW and Mega7.1 software were used to perform multiple sequence alignments [27]. Substitution patterns and rates were estimated under the Kimura (1980) 2-parameter model [28].
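For orientation, the Kimura 2-parameter distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The following is a minimal sketch of that formula, not the MEGA implementation; the function name is hypothetical, and positions containing gaps or ambiguous bases are simply skipped, which is an assumption of this example.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent for the model.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: one transition in ten sites.
print(round(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"), 4))
```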
The genetic variability of haplotypes was characterized by the total alignment length (bp), the number of monomorphic sites, the number of polymorphic sites, the number of parsimony-informative sites (PIC), the number of haplotypes, and the average G+C content in each region, using DnaSP 6.10.01 [29].
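These summary statistics can be illustrated on a toy alignment. The sketch below is not the DnaSP implementation; it assumes a gap-free alignment of equal-length sequences, counts a site as parsimony-informative when at least two character states each occur in at least two sequences, and uses hypothetical names and made-up sequences throughout.

```python
from collections import Counter


def summarize_alignment(seqs):
    """Basic variability statistics for a gap-free alignment (list of strings)."""
    seqs = [s.upper() for s in seqs]
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    polymorphic = informative = 0
    for column in zip(*seqs):
        counts = Counter(column)
        if len(counts) > 1:
            polymorphic += 1
            # Informative: at least two states, each seen in at least two sequences.
            if sum(1 for c in counts.values() if c >= 2) >= 2:
                informative += 1
    gc = sum(s.count("G") + s.count("C") for s in seqs) / (length * len(seqs))
    return {
        "length": length,
        "monomorphic": length - polymorphic,
        "polymorphic": polymorphic,
        "parsimony_informative": informative,
        "haplotypes": len(set(seqs)),
        "gc_content": gc,
    }


# Made-up sequences, for illustration only.
toy = ["ACGTAC", "ACGTAC", "ACCTAT", "ACCTAC", "TCCTAC"]
stats = summarize_alignment(toy)
print(stats)
```

In this toy alignment, sites 1 and 6 vary only in a single sequence, so they count as polymorphic but not parsimony-informative.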

Species identification
To determine the species of each analysed sample, we performed phylogeny reconstruction using the Bayesian approach. Seven Cytb sequences for Cervus elaphus, two for Dama dama and nine for Capreolus capreolus were grouped together with 115 Cytb sequences (Table 2) of the three species from GenBank, as well as two outgroup sequences (Antidorcas marsupialis, Beatragus hunteri) for comparison. Next, all sequences were aligned with the Muscle algorithm [30] and trimmed in Seaview [31] to obtain the final alignment set. The best-fit substitution model was chosen using jModelTest 2.10 [32].
Finally, the tree was constructed with MrBayes 3.2.6 [33] using two independent, randomly started runs, each carried out for 20,000,000 generations of Markov chain steps. A consensus tree was constructed from the set of trees collected after both runs had converged, i.e. when the standard deviation between the two runs was well below 0.01.

Table 1 shows the spectrophotometer readings for the DNA isolates, which gave OD 260/280 ratios ranging from 1.

Species identification of analysed DNA sample
PCR and DNA sequencing of the collected deer samples yielded 18 sequences of the Cytb gene, which allowed the identification of each species belonging to the Cervidae family. The frequencies for each nucleotide were as follows: A = 25%, T/U = 25%, C = 25%, and G = 25%. This analysis involved 18 nucleotide sequences with a total of 207 positions in the final dataset. The average GC content was 50%. The Cytb region was characterized by a high level of monomorphism, with 163 monomorphic sites, 44 polymorphic sites, and 2 parsimony-informative sites (PIC). Over the whole length of the sequenced Cytb fragment, a total of 5 haplotypes were detected, with a haplotype diversity (Hd) of 0.771. The most frequent haplotype was Hap_3, which comprised 7 sequences. The frequencies of these haplotypes among Cervidae in north-western Poland were as follows: Hap_1 5.6% and Hap_2 1.1% for Cervus elaphus, Hap_3 7.8% and Hap_4 3.3% for Capreolus capreolus, and Hap_5 2.2% for Dama dama.
Figure 1 shows that the obtained phylogenetic tree was resolved into three distinct clades consisting of representatives of the three analysed species. Samples were grouped together with the representatives of each species with a high probability (100%) of assignment, indicating clear species identification.
Within the clades we found substantial polytomy, which results from the lack of informative sequence variation at the intraspecific level.
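Haplotype diversity is commonly computed with Nei's estimator, Hd = n/(n - 1) × (1 - Σ p_i²), where n is the number of sequences and p_i is the frequency of haplotype i. The sketch below is a minimal illustration of that estimator, not the DnaSP code. The haplotype counts are an assumption: the text does not list them, and they were chosen only to match the ratios of the reported frequencies across 18 sequences.

```python
def haplotype_diversity(counts):
    """Nei's haplotype diversity from a list of per-haplotype sequence counts."""
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sequences are required")
    sum_sq = sum((c / n) ** 2 for c in counts)
    return n / (n - 1) * (1.0 - sum_sq)


# Assumed counts for Hap_1..Hap_5 (not stated in the text; see lead-in).
assumed_counts = [5, 1, 7, 3, 2]
hd = haplotype_diversity(assumed_counts)
print(round(hd, 3))  # 0.771
```

Under these assumed counts the estimator reproduces the reported Hd of 0.771, which suggests the value was computed over the 18 sequenced samples.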

Discussion
It should first be noted that DNA analysis of biological samples has become standard practice in animal identification at various taxonomic levels. Different types of tissue, such as bone, blood, hair (fur), feathers, skin, meat (muscle samples), faeces, and others, are routinely studied in DNA analysis laboratories [14,34].
The compilation of known DNA markers influenced the construction of the genetic map of Cervus elaphus [35][36][37]. This genetic map comprises 621 sites (length of 2532 cM, with average intervals of 5.7 cM), and it integrates modern technologies and research methods, including comparative genomics and orthologous alleles of DNA markers derived from ruminants and other mammals (i.e. Père David's deer, Elaphurus davidianus, and red deer, C. elaphus) [38]. The genetic map of deer has been used as an annotation for further research, such as studies of the origin and evolution of ruminant genomes [36], QTL scanning [37], whole-genome SNP analyses [39,40], and whole-genome sequencing, as well as the annotation and assembly of pseudochromosomes [38].
The total length analysed for all tested individuals was 207 bp, owing to the removal of the last nucleotides in the sequences. The different sequence lengths obtained were probably due to inhibition of the sequencing reactions by individual templates. Similar results were obtained by Kumar Gupta [24], who worked on stool samples and also obtained short Cytb sequence fragments of 366, 374 and 503 bp [14,34,46].
We show that, for bone tissue, the primers used in this work to amplify the Cytb gene fragment perform well: first, they differentiate closely related species, and second, they have the additional advantage that they can be used for many other mammalian species. Our findings are confirmed by many studies, not only for the family Cervidae, but also by other works on the identification of wild mammalian species [1,5,23,34,47,48].
Researchers differ on which of the markers, COI or Cytb, provides more reliable and reproducible results for DNA barcoding analysis. In 2010, a group of researchers led by Tobe [1] assessed intraspecific genetic variability based on COI and Cytb sequences from 217 mammalian species. The results showed that the discriminatory power was higher for the Cytb gene, i.e. there was a higher probability that two random individuals from a given population would have sequence differences at the marker locus than for the COI sequence. Research carried out by Wilson-Wilde [49] demonstrated that identification based on the COI gene sequence is suitable for genetically distant species, while in the case of closely related species it is no longer unambiguous and requires additional tests. However, COI, Cytb, and the mitochondrial control region (mt-CR) are more often used for this purpose [1, 12-16, 47, 50-52].
The phylogenetic analysis grouped the analysed sequences within individual species with 100% probability (Figure 1). The Cytb fragment analysed in this study allows correct species identification; however, the lack of intraspecific polymorphism means that it cannot be used in population studies. This is clearly shown in the phylogenetic tree obtained (Figure 1), where most monospecific nodes are polytomous. The lack of node resolution (polytomy) is in this case the result of a lack of genetic information in the analysed DNA sequences (soft polytomy). Our results suggest that the intraspecific genetic polymorphism is low for all mammalian species. Similar results were obtained in earlier studies [1,53].

Conclusion
Despite the challenging nature of bone tissue as a biological material, the Cytb gene was successfully used to identify closely related ungulate species from bone DNA using PCR analysis and Sanger DNA sequencing. Our results contribute to the study of ungulate mitochondrial DNA by providing new reference sequences that are available in a public database. The method reported here could readily be adapted to discriminate other mammalian species from bone DNA.

Limitations
The efficiency of the applied DNA isolation method varied. The resulting DNA concentration values showed an over hundredfold difference between the lowest and highest concentration. The study revealed that one of the most important steps in the DNA extraction process was the preliminary preparation of the bone material. Identification of bone samples depends on the quality and quantity of DNA present in the sample. Another major limitation of the study was that the high OD 260/280 ratios for the isolates indicated that they were contaminated with RNA.

Abbreviations
Not applicable.

Ethics approval and consent to participate
The study material consisted of selected parts of the skeleton (mandible or skull bones) of three representatives of the Cervidae family: red deer, Cervus elaphus; roe deer, Capreolus capreolus; and fallow deer, Dama dama; provided to us by commercial hunters from land managed in north-western Poland. All animals were harvested in accordance with normal recreational and commercial hunting

Tables
Table 1. List of the study materials and results of DNA isolations.

Figure 1. Bayesian phylogenetic tree showing species identification of analysed DNA samples (samples are indicated with a star). Sequences of Antidorcas marsupialis and Beatragus hunteri were used for rooting. Numbers along the nodes are the posterior probability values of the nodes.