The Complete Mitochondrial Genome of Critically Endangered Hangul (Cervus hanglu hanglu) and Its Comparison With the Other Red Deer

Background: Hangul (Cervus hanglu hanglu), or Kashmiri stag, belongs to the family Cervidae and is the only surviving red deer in the Indian subcontinent. No complete mitogenome sequence of Hangul has been available in open databases for phylogenetic inference. Methods and results: We sequenced and characterized the first complete mitogenome of Hangul, which was 16,354 bp in length, and compared it with those of other red deer subspecies. We observed eight pairs of overlapping genes and 15 intergenic spacers between the mitochondrial regions. The 13 protein-coding genes (PCGs) of Hangul comprised 3597 codons (excluding stop codons), for which we calculated relative synonymous codon usage (RSCU). Leucine was the most frequent amino acid (11.75%) and tryptophan the least frequent (1.12%) in the 13 PCGs of Hangul. All the tRNA genes showed a typical cloverleaf secondary structure except tRNA-Ser, in which the dihydrouridine arm did not form a stable structure. Conclusions: The Bayesian inference phylogenetic tree indicated that Hangul clustered within the Tarim deer group (C. h. yarkandensis) and close to C. e. hippelaphus, together forming the western clade. The subspecies of C. nippon and C. canadensis clustered together and formed an eastern clade. This finding was supported by the mean pairwise genetic distances based on both the complete mitogenome and the 13 PCGs. The comparative study of the Hangul mitogenome with those of other red deer provides crucial information for understanding their evolutionary relationships. It also offers a valuable resource for conserving this critically endangered cervid, which has a limited distribution range.


Introduction
Red deer (genus Cervus) are among the most widespread deer in the world [1]. At present, 22 subspecies of red deer are reported [1][2][3]. Hangul (Cervus hanglu hanglu), or Kashmiri stag, belongs to the family Cervidae and is the only surviving red deer in the Indian subcontinent [4][5][6]. The Hangul is restricted to the Dachigam National Park, Srinagar, Jammu & Kashmir (J&K), India [5,6]. Historically, Hangul was widely distributed in the Kashmir Mountains and some parts of Chamba District in adjoining Himachal Pradesh [7]. The Hangul population is at risk of extinction because of intense anthropogenic pressures such as human settlements, livestock interference resulting in excessive grazing, illegal poaching, fragmentation, and loss of corridors [8,9]. According to the latest survey, in 2019, the population is small, with 237 Hangul in Dachigam National Park [10]. Considering this, the IUCN has categorized Hangul as 'Critically Endangered' [11], and it is listed in Appendix-I of the Convention on International Trade in Endangered Species of Wild Fauna and Flora (CITES). It is also listed under Schedule-I of the Wild Life (Protection) Act, 1972 (WPA) of India to protect it from illegal trade. The phylogenetic relationships of red deer subspecies have been described [1][2][3]. However, the phylogenetic position of Hangul among the red deer has long been debated and remains the least studied [12][13][14][15]. Presently, the red deer are differentiated into two groups: the eastern or wapiti group (eastern Asia and North America) and the western group (Europe, North Africa, the Middle East, and central Asia). The western group comprises the red deer populations of C. h. bactrianus, C. h. yarkandensis and C. h. hanglu from central Asia, considered Tarim red deer, and C. elaphus from Europe [16]. A recent nomenclatural revision changed the taxonomic status of Hangul from Cervus elaphus hanglu to a valid species, Cervus hanglu, which belongs to the Tarim red deer group [16].
Moreover, the wapiti deer were previously described under C. elaphus, but on the basis of molecular analyses they are now considered a separate species and placed under C. canadensis [1][2][3][16]. Hence, understanding the phylogenetic relationships of red deer subspecies using robust sequence data is crucial for effective conservation and management planning. A complete mitogenome sequence of Hangul has been lacking in the database. Therefore, we aimed to generate a novel complete mitochondrial genome sequence of C. h. hanglu and to compare it with the available sequences of other red deer subspecies. The novel mitogenome of Hangul provides insight into phylogenetic resolution and evolutionary patterns, and further study will improve understanding of the genetic relationships of Tarim red deer and support their conservation.

Methods And Material
We used two tissue samples of Hangul collected from naturally dead animals in Dachigam National Park, Jammu & Kashmir (J&K), India. Total genomic DNA was extracted in a volume of 80 µl using the DNeasy Blood & Tissue Kit (Qiagen, Hilden, Germany). DNA quality was checked on a 0.8% agarose gel stained with green stain dye, and the DNA was quantified using a QIAxpert system spectrophotometer. Extracted DNA was diluted to a final concentration of 40 ng/µl for PCR amplification.

PCR amplification and sequencing
The complete mitogenome of Hangul was amplified and sequenced using 22 different primer sets (Supplementary Table ST1) [17]. Of these, one primer set (C1U897/C2L15) failed to amplify. Hence, we used an additional primer set to amplify the complete cytochrome c oxidase I (CO I) region (F1: 5'-TACAGTCTAATGCTTCACTCAGCCA-3' / R3: 5'-GGGGGTTCGATTCCTTCCTTTC-3') [18]. PCR reactions were performed in 20 µl reaction volumes with PCR buffer (10 mM Tris-HCl, pH 8.3, and 50 mM KCl), 1.5 mM MgCl2, 0.2 mM of each dNTP, 2 pmol of each primer, 5.0 U of Taq DNA polymerase (Thermo Fisher Scientific) and 1 µl of template DNA. All reactions were run along with negative controls. The PCR conditions were 95°C for 5 min, followed by 32 cycles of 95°C for 45 s, annealing at 55°C for 45 s and extension at 72°C for 1.30 min, with a final extension at 72°C for 10 min. The effectiveness and consistency of the PCR reactions were monitored using positive controls. The PCR amplicons were visualized under UV light on a 1.8% agarose gel stained with green stain dye. The amplified PCR products were treated with Exonuclease I (EXO-I) and shrimp alkaline phosphatase (FastAP) (Thermo Fisher Scientific) for 15 minutes each at 37°C and 80°C, respectively, to eliminate any residual primer. The amplified PCR products were sequenced bi-directionally using the BigDye® Terminator Cycle Sequencing Kit v3.1 (Thermo Fisher Scientific) on an ABI 3500XL Genetic Analyzer (Applied Biosystems). The quality of the generated sequences was checked in SeqA v6 (Applied Biosystems).

Hangul mitogenome characterization and annotation
We generated the complete mitogenome of Hangul by aligning the overlapping fragments of DNA sequences using Sequencher® v5.4.6 (Gene Codes Corporation, Ann Arbor, MI, USA). The mitogenome was annotated using the MITOS Web Server [19]. The complete mtDNA gene map of Hangul was generated using OGDRAW v1.3.1 [20]. For the comparative analysis, we included complete mtDNA sequences of other red deer subspecies from GenBank: C. h. yarkandensis (Accession number: GU457435), C. e. hippelaphus (KT290948), C. c. xanthopygus (GU457434), C. c. kansuensis (NC039923), C. c. songaricus (KJ025072), C. canadensis (MT534583), and C. c. nannodes (MT430939). MEGA X was used to calculate the base composition under the mtDNA genetic code [21]. Bias in nucleotide composition among the complete mitogenomes was estimated using skew analysis, where AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) [22]. The relative synonymous codon usage (RSCU) and amino acid composition of the mitochondrial protein-coding genes (PCGs) were calculated using MEGA X [21]. The typical cloverleaf secondary structure of the transfer RNA genes (tRNAs) was predicted using tRNAscan-SE v2.0 [23]. The intergenic spacers and overlapping regions between genes of the complete mitogenome were identified manually.
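The skew formulas above can be illustrated with a short Python sketch. It is not the MEGA X workflow used in this study; the function name `base_skews` and the toy sequence are hypothetical, and ambiguous bases are simply ignored as an assumption of the sketch.

```python
from collections import Counter


def base_skews(sequence: str) -> dict:
    """Return AT skew, GC skew and AT content (%) for a DNA sequence."""
    counts = Counter(sequence.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # assumption: ambiguous bases such as N are ignored
    if a + t == 0 or g + c == 0:
        raise ValueError("skew is undefined when A+T or G+C is zero")
    return {
        "at_skew": (a - t) / (a + t),   # AT skew = (A - T)/(A + T)
        "gc_skew": (g - c) / (g + c),   # GC skew = (G - C)/(G + C)
        "at_percent": 100.0 * (a + t) / total,
    }


# Hypothetical 10-base fragment, not Hangul data: A=4, T=2, G=1, C=3
toy = "AAAATTGCCC"
result = base_skews(toy)
print(result)
```

The same formula applied to percentages gives the same answer as counts. For example, the control-region composition reported in this study (A 29.23%, T 31.73%) yields (29.23 − 31.73)/(29.23 + 31.73) ≈ −0.041, which agrees with the AT skew given for that region.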

Genetic differentiation and phylogenetic analysis
The mean pairwise genetic distance between the red deer was calculated under the Tamura-Nei model (TN93) in MEGA X [21]. The phylogenetic relationships of Tarim red deer were inferred with the addition of 34 complete mitogenome sequences from the NCBI database, comprising Cervini (21), Muntiacini (2), Alceini (1), Caprini (6), Bovini (2), Boselaphini (1) and one sequence of Sus scrofa as an outgroup. A Markov chain Monte Carlo (MCMC) based Bayesian consensus tree was constructed using BEAST v1.7 [24]. We ran the Bayesian inference analysis with MCMC chains for 10 million generations, sampling one tree every 1000 generations, with a burn-in of 5000 generations. The resulting phylogenetic trees were visualized in FigTree v1.4.0 [25].
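For readers unfamiliar with the TN93 distance, the following Python sketch computes it for one pair of aligned sequences using the standard closed-form estimator. It is only an illustration: MEGA X may treat gaps, ambiguous bases and base frequencies differently, and the function name `tn93_distance` and the toy sequences are hypothetical.

```python
import math

PURINES = {"A", "G"}


def tn93_distance(seq1: str, seq2: str) -> float:
    """Tamura-Nei (1993) distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption of this sketch: sites with gaps or ambiguous bases are skipped
    pairs = [(x, y) for x, y in zip(seq1.upper(), seq2.upper())
             if x in "ACGT" and y in "ACGT"]
    n = len(pairs)
    if n == 0:
        raise ValueError("no comparable sites")

    # Base frequencies pooled over both sequences
    freq = {base: 0.0 for base in "ACGT"}
    for x, y in pairs:
        freq[x] += 1
        freq[y] += 1
    for base in freq:
        freq[base] /= 2 * n
    if min(freq.values()) == 0:
        raise ValueError("all four bases must be present")

    # Proportions of A<->G transitions, C<->T transitions and transversions
    p_ag = sum(1 for x, y in pairs if {x, y} == {"A", "G"}) / n
    p_ct = sum(1 for x, y in pairs if {x, y} == {"C", "T"}) / n
    q = sum(1 for x, y in pairs if (x in PURINES) != (y in PURINES)) / n

    g_r = freq["A"] + freq["G"]  # purine frequency
    g_y = freq["C"] + freq["T"]  # pyrimidine frequency
    k1 = 2 * freq["A"] * freq["G"] / g_r
    k2 = 2 * freq["C"] * freq["T"] / g_y
    k3 = 2 * (g_r * g_y
              - freq["A"] * freq["G"] * g_y / g_r
              - freq["C"] * freq["T"] * g_r / g_y)

    w1 = 1 - p_ag / k1 - q / (2 * g_r)
    w2 = 1 - p_ct / k2 - q / (2 * g_y)
    w3 = 1 - q / (2 * g_r * g_y)
    if min(w1, w2, w3) <= 0:
        raise ValueError("sequences too divergent for the TN93 estimator")
    return -k1 * math.log(w1) - k2 * math.log(w2) - k3 * math.log(w3)


# Hypothetical toy alignment: two A<->G and two C<->T differences in 16 sites
d = tn93_distance("AAAACCCCGGGGTTTT", "AAAGCCCTGGGATTTC")
print(round(d, 4))
```

With equal base frequencies and equal rates for the two transition types, as in the toy alignment, the TN93 estimate reduces to the Kimura two-parameter distance, which gives a simple check of the implementation.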

Mitogenome organization
Both Hangul mitogenomes were 16,354 bp in length, and we submitted them to NCBI GenBank (Accession numbers: MW430050 and MW430051). The two sequences were 100% identical. The map of the complete mitogenome of Hangul is shown in Fig. 1. The mitogenome consisted of 22 transfer RNA genes, 13 protein-coding genes (PCGs), two ribosomal RNA genes, and a non-coding control region (D-loop region) (Fig. 1 and Table 1). The arrangement and distribution of the mtDNA genes were similar to those of the other Tarim and western red deer [26,27]. The total nucleotide composition of Hangul mtDNA was A (33.26%), T (28.75%), C (24.51%), and G (13.49%) (Table 2). Most of the genes were encoded on the H-strand, except for the ND6 gene (13557-14084) and eight tRNA genes (tRNA-Gln, tRNA-Ala, tRNA-Asn, tRNA-Cys, tRNA-Tyr, tRNA-Ser, tRNA-Glu, tRNA-Pro). The control region was located between tRNA-Pro and tRNA-Phe (Table 1). We observed eight pairs of overlapping genes among tRNA-Val/16S rRNA, tRNA-Ile/tRNA-Gln, COI/tRNA-Ser, ATP8/ATP6, ATP6/COIII, ND4L/ND4, ND5/ND6, and [...] (Supplementary Table ST2). The base composition of the PCGs was A = 31.38%, T = 30.54%, G = 13.22% and C = 24.87% (Table 2). The Hangul PCGs comprised 12 majority-strand (H-strand) genes (NADH dehydrogenases: ND1, ND2, ND3, ND4, ND5, and ND4L; three cytochrome c oxidases: COI, COII, and COIII; two ATPases: ATP6 and ATP8; and one cytochrome b: Cyt b) and one minority-strand (L-strand) gene (NADH dehydrogenase: ND6) (Fig. 1 and Table 1), as is common in other vertebrate species [30,31]. AT content (61.9%) was higher than GC content (38.1%). We examined base skews across the red deer subspecies to understand the nucleotide distribution in the PCGs. The average AT and GC skew values for the Hangul PCGs were 0.014 and −0.306, respectively.
We also observed positive AT skew in the other red deer, indicating that adenine occurs more frequently than thymine, while the GC skew values were negative, indicating a C-biased nucleotide composition (Supplementary Table ST2). Of the 13 PCGs, ND5 (1821 bp) was the longest and ATP8 (201 bp) the shortest. All 13 PCGs started with ATG or ATA, as in other red deer species [27]. Seven of the thirteen PCGs ended with the complete stop codon TAA and Cyt b ended with AGA, whereas ND1-ND4 and COIII used the incomplete stop codons TA- or T- (Table 1). The incomplete stop codons are completed by post-transcriptional polyadenylation during mRNA maturation. The 13 PCGs of Hangul comprised 3597 codons (excluding stop codons), for which relative synonymous codon usage (RSCU) was calculated (Fig. 2). Leucine was the most frequent amino acid (11.75%) and tryptophan the least frequent (1.12%) in the PCGs of Hangul and the other red deer (Fig. 3).
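To make the RSCU calculation concrete, here is a minimal Python sketch. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch assumes the vertebrate mitochondrial genetic code (NCBI translation table 2) and groups codons by amino acid; it is not the MEGA X implementation, and the function name `rscu` and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict

BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons in the order
# TTT, TTC, TTA, TTG, TCT, ...; '*' marks stop codons (TAA, TAG, AGA, AGG)
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(coding_sequence: str) -> dict:
    """Return RSCU values for every codon of each amino acid that occurs."""
    seq = coding_sequence.upper()
    # Trailing partial codons (e.g. incomplete stop codons T- or TA-) are dropped
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    # Count sense codons only; stops and codons with ambiguous bases are skipped
    counts = Counter(c for c in codons if CODON_TABLE.get(c, "*") != "*")

    families = defaultdict(list)  # amino acid -> its synonymous codons
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != "*":
            families[amino_acid].append(codon)

    values = {}
    for amino_acid, members in families.items():
        total = sum(counts[m] for m in members)
        if total == 0:
            continue  # amino acid absent from the sequence
        expected = total / len(members)  # count under equal codon usage
        for m in members:
            values[m] = counts[m] / expected
    return values


# Hypothetical toy sequence, not Hangul data: four leucine codons (CTA x3, TTA x1)
values = rscu("CTACTACTATTA")
print(values["CTA"], values["TTA"])
```

An RSCU above 1 indicates a codon used more often than expected under equal usage, and the values within one amino acid family sum to the number of synonymous codons in that family.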

Ribosomal RNA and transfer RNA genes
We identified two ribosomal RNA and 22 tRNA genes in the complete mitogenome of Hangul, as is typical of other mammalian species [32,33]. The 12S rRNA gene was 957 bp long and the 16S rRNA gene 1572 bp. The 12S rRNA and 16S rRNA genes were located between tRNA-Phe and tRNA-Val and between tRNA-Val and tRNA-Leu, respectively (Table 1 and Fig. 1). The total nucleotide composition of the two rRNAs was A (37.72%), T (24.24%), C (20.88%), and G (17.16%) (Table 2). The combined length of the two rRNA genes of Hangul was 2529 bp, which accounted for 15.46% of the complete mitogenome; it varied from 2516 to 2529 bp in the other red deer subspecies (Supplementary Table ST2). The total AT content of the two rRNAs was 61.96%, similar to that of the other red deer subspecies. The AT and GC skews of the two rRNAs of Hangul were 0.217 and −0.097, respectively (Supplementary Table ST2). The 22 tRNA genes were distributed throughout the mitogenome and varied in size from 60 bp (tRNA-Ser) to 75 bp (tRNA-Leu). Of these 22 tRNA genes, 14 were located on the H-strand and eight on the L-strand (Fig. 1 and Table 1). The combined length of the 22 tRNAs was 1514 bp, and their nucleotide composition was A (35.6%), T (28.2%), C (20.61%), and G (15.59%). The tRNAs were AT-biased, with AT and GC contents of 63.8% and 36.2%, respectively. We observed a positive AT skew (0.116) and a negative GC skew (−0.138) (Table 2). The anticodons of the 22 tRNAs of Hangul are provided in Table 1.
Twenty-one of the tRNA genes showed a typical cloverleaf secondary structure; the exception was tRNA-Ser, in which the dihydrouridine arm did not form a stable structure (Fig. 4).

Mitochondrial D-loop
The mitochondrial D-loop/control region (mtCR) is a non-coding, hyper-variable region that plays an important role in regulating replication and transcription of the mitochondrial genome [34]. The mtCR of C. h. hanglu was 917 bp long and was positioned between tRNA-Pro and tRNA-Phe (Table 1 and Fig. 1). The mtCR was shortest (916 bp) in C. h. yarkandensis and longest (994 bp) in C. c. xanthopygus and C. c. songaricus. Thus, the Tarim red deer (C. h. hanglu and C. h. yarkandensis) have mtCRs of almost equal size. The variation in CR length might be due to insertions and deletions (INDELs), which have also been reported in previous studies [35,36]. The nucleotide composition of the CR was A (29.23%), T (31.73%), C (23.56%), and G (15.49%). The AT content (60.95%) was higher than the GC content (39.05%). The AT and GC skews were both negative, at −0.041 and −0.206, respectively (Table 2).

Phylogenetic analysis and genetic distance
The phylogenetic position of Hangul was assessed together with other red deer, sika deer (C. nippon), eight species of Cervini, two species of Muntiacini, one species of Alceini, six species of Caprini, two species of Bovini, and one species of Boselaphini using the 13 PCGs. The Bayesian inference phylogenetic tree indicated that Hangul formed a sister relationship with C. h. yarkandensis and was close to C. e. hippelaphus; together they formed a Western clade with high posterior probability (PP ~ 1) (Fig. 5). Interestingly, the subspecies of C. canadensis clustered with sika deer (C. nippon) and formed an Eastern clade. The Bayesian result supports the assignment of Tarim red deer (C. h. hanglu and C. h. yarkandensis) to the western clade, as suggested by Lorenzini and Garofalo, 2015 using both the complete mt Cyt b gene and the CR [16]. Moreover, both the sika and red deer clustered within the Cervini group (Fig. 5). In contrast, earlier studies of the phylogenetic position of Tarim deer were not congruent: one showed a closer relationship with Eastern red deer [14], while in another it formed a clade separate from both the Eastern and Western red deer clades [15].
Longer coverage of mtDNA provides better phylogenetic resolution and insight into taxonomic position than short fragments [37]. Moreover, the clustering pattern of the other deer species was similar to that described by Gilbert et al., 2006 [38]. We estimated pairwise genetic distances between the red deer subspecies based on the 13 PCGs and the complete mitogenome (Table 3 and Supplementary Table ST3). Hangul was closest to Tarim red deer (C. h. yarkandensis), with a low genetic distance (0.028). We also observed that Hangul was closer to western red deer (C. e. hippelaphus) (0.038) than to eastern red deer (0.057 to 0.072). The highest genetic distance was observed between Hangul and Manchurian wapiti (C. c. xanthopygus), and the findings based on the complete mitogenome were similar (Supplementary Table ST3). The gene-wise comparison of the red deer subspecies showed that Hangul was closer to Tarim red deer based on 12S, 16S, ND1, ND2, COI, ATP8, ATP6, ND4, ND5, ND6, Cyt b, and CR. However, a few genes (COII, COIII, ND3, and ND4L) showed that it was closer to C. e. hippelaphus, which belongs to the western red deer clade (Fig. 6).

Conservation Implications
The Hangul deserves special attention because it is the last red deer population of the Indian subcontinent, surviving only in the Dachigam National Park, Srinagar, J&K. Previous studies revealed low genetic variability and high inbreeding in Hangul [9,14]. The phylogenetic position of Hangul among the red deer subspecies is important for its management and has long been debated. Previous phylogenetic studies suggested either that Hangul belongs to the eastern red deer clade [14] or that it represents an evolutionary line distinct from both the eastern and western red deer groups [15]. However, a recent study suggested that Hangul clusters within the western red deer group [16]. To provide better insight into the taxonomic status and lineage identification of Hangul, we generated and characterized its first complete mitogenome, which was 16,354 bp in length. The results of this study indicate that Hangul is closely related to Tarim red deer within the western clade. The existing populations of Hangul are completely isolated and small, and are therefore more vulnerable to genetic erosion than large populations. Thus, the mtDNA information will help to establish whether the populations possess a significant level of genetic variation that may warrant separate conservation efforts. Moreover, this comparative study of red deer subspecies based on complete mitogenomes provides baseline information for genetic monitoring, phylogenetic analysis, spatial distribution ranges, and evolutionary relationships. The novel mitogenome of Hangul will also assist in identifying confiscated biological samples when tracking wildlife crime cases.

Declarations
Acknowledgments
This study was funded by the Wildlife Institute of India. We thank Dr. Dhananjai Mohan, Director; Dr. Y.V. Jhala, Dean, WII, for their support. We thank the State Forest Departments of Jammu and Kashmir for sending the biological samples to WII.

Funding
This study was funded by the Wildlife Institute of India.

Disclosure statement
The authors declare no conflict of interest.

Availability of data and material
The sequence data is submitted to GenBank under Accession No. MW430050 and MW430051.

Ethical statement
The study was conducted using tissue collected from naturally dead animals; therefore, no Institutional Animal Ethics Committee approval was required.

Competing Interests
The author(s) declare no competing interests.
Author contribution
S.K.G. developed the concept, designed the framework, and acquired resources. P.Y., A.K., and N.Y. carried out sequencing and data analysis. P.Y., A.K., and S.K.G. wrote the paper. All the authors read and approved the final version of the manuscript.

Consent to Participate (Ethics)
Since the experiment was not conducted on humans, no consent to participate was required.

Consent to Publish (Ethics)
This study has not used any secondary data for publication, hence no consent to publish was required.

Fig. 2 Relative synonymous codon usage (RSCU) of the mitochondrial protein-coding genes of the Cervus h. hanglu mitochondrial genome. Codon count numbers are provided on the X-axis.