Relationships among North American deer based on mitochondrial DNA and ultraconserved elements, with comments on mito-nuclear discordance

Despite their economic, cultural, and ecological signi�cance, the phylogenetic relationships among North American deer remain uncertain, due in part to discordance between phylogenies built from mitochondrial DNA (mtDNA) and nuclear markers. However, the data from these two genomic regions have heretofore been analyzed in isolation. We compared phylogenies built from mtDNA Cyt b, and single nucleotide polymorphisms (SNPs) from the mitogenome and nuclear (ultraconserved elements, UCEs) markers from the same individuals to investigate mito-nuclear discordance within and between taxa in the genus Odocoileus. A Cyt b tree shows haplotype sharing between O. hemonius and O. virginianus. Mitochondrial DNA SNPs separated O. hemionus and O. virginianus, whereas nuclear SNPs separated O. hemonius, O. virginianus, O. v. couesi, O. v. clavium and O. h. sitkensis plus O. h. columbianus. We found less support for O. h. columbianus as a distinct taxon, which had signs of introgression with nominate O. h. hemionus. The well-established paraphyly of mtDNA haplotypes from O. virginianus and O. hemonius is con�rmed with comparisons of mtDNA and nuclear-encoded SNPs from the same individuals. A possible reason for mito-nuclear discordance is that the evolutionary splits are relatively recent, the mtDNA results are in�uenced by genome capture via ancient hybridization, or ancestral lineage sorting; we think our UCE data favor the latter explanation. Niche models suggested allopatric refugia at the Last Glacial maximum for these taxa except for a parapatric or sympatric distribution estimated for mule deer and black-tailed deer, which might explain the modern hybrid zone.


Introduction
Molecular data revolutionized evolutionary studies in the late 20th Century, providing novel insights into topics ranging from higher level phylogenies to phylogeography.Paradoxically, the evolution of charismatic megafauna such as bears (genus Ursus; Cronin et al. 2014), elephants (genus Loxodonta; Roca et al. 2005) and North American canids (genus Canis; Hailer and Leonard 2008) has proven di cult to elucidate with molecular data due to hybridization, molecular marker bias, and mito-nuclear incompatibilities, which obscure species limits and complicate conservation strategies.Deer of the genus Odocoileus represent another such case.
Despite their economic, cultural, and ecological importance, relationships among the Odocoileus deer have proven di cult to parse.White-tailed deer (O.virginianus) range widely through North America and include up to 65 described subspecies (Smith 1991, Heffel nger 2011), two of which are noteworthy: Coues deer (O.v. couesi) and the endangered Florida Key deer (O.v. clavium).Mule deer (O.hemionus) occur in western North America and have been divided into as many as 11 subspecies, including the Columbian black-tailed deer (O.h.columbianus) and Sitka black-tailed deer (O.h.sitkensis).How these taxa are related to each other, and species vs subspecies limits, are open questions.
Mule deer and white-tailed deer are generally distinguishable by morphology, although they are known to hybridize (Cathey et al. 1998, Bradley et al. 2003).Early attempts to distinguish species using mitochondrial DNA (mtDNA) restriction sites in a hybrid zone revealed asymmetric gene ow, with female white-tailed deer more commonly mating with male mule deer than the reverse cross (Carr et al. 1986).Hopken et al. (2015) found that mtDNA control region sequences separated most, but not all, white-tailed deer and mule deer in a small area of the Paci c Northwest.A phylogeny of genera and species of American deer based on mtDNA Cyt b sequences could not separate mule and white-tailed deer, which shared several mtDNA haplotypes (Gutiérrez et al. 2017).However, the extent of geographic sampling is of interest.Analysis of markers from the nuclear genome have con rmed broad genomic differentiation between mule deer and white-tailed deer with evidence for hybridization where their ranges overlap (Combe et al. 2022).Using Y-linked gene sequences, Cathey et al.
(1998) found a better correlation between nuclear DNA and morphology, nding black tailed deer and mule deer as sister taxa, indicating that maternal gene ow from white-tailed to mule deer had biased phylogenetic inference.Latch et al. (2009Latch et al. ( , 2014) ) analyzed 10 microsatellite loci in 1900 individuals sampled across the range of black-tailed deer and mule deer.Their results con rmed the phylogeographic and taxonomic split between mule deer and black-tailed deer, although they (Latch et al. 2011) also reported a hybrid swarm between mule deer and Columbian black-tailed deer.Based on PRNP gene sequences, Vázquez-Miranda and Zink (2020) found three synonymous and diagnostic single nucleotide polymorphism (SNP) substitutions between mule deer and white-tailed deer, and Zink et al. (2020) reported three F1 hybrids between mule deer and white-tailed deer from Nebraska.Villanova et al. (2017) found that Key deer and those from the neighboring county of Collier in south Florida had mtDNA pro les that were distinct from mainland whitetails, although their STRUCTURE plot based on 12 microsatellite loci did not show complete separation, which might re ect the longer coalescence times of nuclear loci.Lastly, a population study restricted to western Kansas where mule deer and white-tailed deer overlap employed genome-wide SNPs and was able to sort unequivocally both species despite hybridization (Combe et al. 2022).Wright et al. (2022) analyzed mtDNA Cyt b sequences, including many of those used by Gutiérrez et al. (2017), and concluded that there had been at least two instances of mtDNA genome capture between white-tailed and mule deer, which could explain the lack of species distinctiveness between the two species in mtDNA gene trees; they discounted hybridization and recent speciation.Wright et al. (2022) concluded that the most recent putative capture event occurred 1.32 million years ago.We are concerned whether a single mitochondrial DNA gene such as Cyt b can provide valid divergence dates as ascribed by Wright et al. (2022) given rate heterogeneity over time (Nakamoto et al. 2021) and an incomplete fossil record.Heffel nger and Latch (2023) concluded that mtDNA genotype sharing between mule deer and white-tailed deer was as yet unexplained.
In this paper, we analyzed mtDNA sequences, and Single Nucleotide Polymorphisms (SNPs) derived from Ultraconserved Elements (UCEs) from multiple localities across North America to determine the best hypothesis for relationships among these cervid taxa.
Our goal is to describe relationships among the major taxa, not to provide de nitive taxonomic resolution at the subspecies level or below.To explore putative mito-nuclear con icts reported by earlier studies and the hypothesis of mtDNA genome capture (Wright et al. 2022), we provide an explicit comparison of SNPs obtained from the same individuals for the mitogenome and nuclear genome.
In addition, we use niche modeling (Peterson 2001) to explore whether major taxa were allopatric at the Last Glacial Maximum, which would suggest that evolutionary divergences predated and were maintained across this time period.We computed the average genetic distance among major taxa using DnaSP (Rozas et al. 2017).We extracted the mitochondrial genome from the UCE data by aligning to O. hemionus mitochondrial genome (NCBI reference NC_020729.1)with bcftools (Li et al. 2009).We concatenated 659 mtDNA SNPs and used PAUP* 4.0a169 (Swofford 2023) to infer a tree using the SVDQuartets routine using the multispecies coalescent with 200 bootstrap replicates.To evaluate further the differentiation between taxa and the hypothesis of mitochondrial DNA capture (Wright et al. 2022), we used IQTREE (as above) to construct a maximum likelihood tree from their Cyt b data after converting sequences into amino acids to determine if phylogenetic haplotype mixing was a result of synonymous nucleotide changes.

Ultraconserved Elements, Population and Phylogenetic Analyses
Our samples included white-tailed deer from Minnesota (n = 5), New York (n = 5), Nebraska (N = 17), mule deer from Nebraska (n = 11) and California (n = 1), Coues deer from Arizona (n = 5), Key deer (n = 5), black-tailed deer from California (n = 7), and black-tailed deer from Alaska (n = 4).We extracted the tissue samples from hunter-harvested animals using a Qiagen kit, whereas frozen museum samples (Coues, Key, Alaskan black-tails; see acknowledgments) were extracted using phenol-chloroform to ensure comparable DNA concentrations from tissue loans.We sent puri ed DNA in the desired concentration to RAPiD Genomics where library preparation using the UCE Tetrapods 5Kv1 kit (targeting 5060 loci; Faircloth et al. 2012) was performed for Illumina sequencing using their high-throughput work ow with proprietary chemistry.Brie y, DNA was sheared to a mean fragment length of 400bp, fragments were end-repaired, followed by incorporation of unique dual-indexed Illumina adapters and PCR enrichment.Samples were pooled equimolar and sequenced 2x150bp.
We checked read depth and quality with FastQC and removed two samples with low coverage.We removed adapter sequences and assembled reads (forwards and reverse) using PEAR (Zhang et al. 2014) and then aligned samples with to a deer reference genome (ovis_v1.0)with BWA (Li and Durbin 2009).We converted BAM les with bcftools (Li et al. 2009) to VCF les and then ltered SNPs by quality (minQ = 20), percent missing data (--max-missing 0.5), and minor allele frequency (--maf 0.05).
We recovered over 125,877 total SNPs from the nuclear genome.We thinned nuclear SNPs to 10% and removed sites with high linkage disequilibrium, resulting in a set of 12,587 SNPs for analysis.We concatenated all SNPs into one sequence.To examine any bias that may have been introduced by down sampling the SNPs, we generated a random dataset of the same size (12,587) and the complete dataset of ~ 126K SNPs with Discriminant Analysis of Principal Components (DAPC; Jombart et al. 2010).The DAPC analyses of different datasets did not con ict.
We used STRUCTURE v2.3.4 (Pritchard et al. 2000) to assess genetic clustering among individuals.We initially evaluated genetic structure from 58 samples.For this, we completed 10 independent runs for models that ranged in population values (K) from 1-7.Each independent run was for 700,000 generations with the rst 200,000 generations used as burn-in.After we evaluated the likelihood scores for the population models and the ΔK (Evanno et al. 2005) calculation we determined that it was appropriate to split the dataset into eastern (O.virginianus, O. v. clavium, O. v. couesi) and western samples (O.h.columbianus and O. h.sitkensis).The eastern (N = 36) and western (N = 22) datasets were each evaluated with a predetermined number of populations (K) from 1-5.Each population model was run independently 10 times for 700,000 generations, 200,000 as burn-in.

Comparison of phylogenetic patterns in SNP from mitochondrial and nuclear genomes
To compare phylogenetic hypotheses based on SNPs from the mitogenome and nuclear genome, we input the mtDNA SNP data into IQTree and PAUP4 and compared the likelihoods of the two trees using the Shimodaira-Hasegawa (2000) test.We also did the reciprocal analysis (genomic SNP data on the two trees).

Niche models
To ascertain if O. hemionus and O. virginianus and their component subspecies groups were allopatric at the Last Glacial Maximum (LGM) we constructed ecological niche models for each of the taxa.In brief, niche models show where the climate conditions used by a species today existed at some other time (e.g., LGM), assuming that the species would occur in these areas (Peterson 2001, Peterson et al. 2011).We extracted locality information from the Global Biodiversity Information Facility (https://www.gbif.org)into Maxent (Elith et al. 2011;Phillips et al. 2006) to build a climatic niche model that was then projected onto the 19 LGM climate layers at the LGM (Hijmans et al. 2005; CCSM model); we used default parameters with the exception that we used 1000 iterations to assist model convergence.Point samples per taxon were: O. virginianus (2499), O. v. clavium (436), O. v. couesi (916), O. h.hemionus (2494), O. h.columbianus (1999), and O. h.sitkensis (421).We estimated the receiver operating characteristic (ROC) for each model using 15% of the data for testing.For each model, we selected climate variables from an initial run with greater than 5% contribution to the model, and then repeated the Maxent analysis with 10 replicates.We visualized estimated average distributions of the 10 replicates with DIVA-GIS using the 10% probability threshold to depict presence or absence (Hijmans et al. 2012).Our goal in niche modeling was to discover potential LGM refugia, reasoning that irrespective of when divergence among taxa occurred, maintenance of genetic differences between taxa would require allopatric refugia at the LGM.

MtDNA
The average sequence divergence (p) for Cyt b between all white-tailed deer and mule deer was 0.026.IQtree selected the HKY + F + I + G4 substitution model, calculated the proportion of invariable sites as 0.5393, and estimated the gamma shape alpha as 1.048.Samples of O. hemionus and O. virginianus are intermingled on the Cyt b tree (Fig. 1).A grouping of individuals representing O. v. couesi included an individual identi ed as O. h.crooki (Genbank: FJ188885) apparently from Texas (Latch et  The SVDquartets tree (Fig. 2) for the 659 concatenated SNPs (469 parsimony informative) from the mitogenome with 100 bootstrap replicates, basically separated only white-tailed deer and mule deer, although not perfectly.The maximum likelihood tree (Supplementary Figure S1) for Cyt b amino acids separated the sequences into several apparent groups (none with signi cant bootstrap support) including a paraphyletic cluster of white-tailed deer, a group of mule/black-tailed deer that included two whitetailed deer, and a second cluster of mule/black-tailed deer that included seven white-tailed deer (red branches in gure).

Ultraconserved Elements
A phylogenetic hypothesis based on UCE data separates O. hemionus and O. virginianus and shows clades including O. v. couesi and O. v. clavium (Fig. 3).Samples of O. h.sitkensis are reciprocally monophyletic and are sister to O. h.columbianus (from California), whereas other O. h.columbianus occur elsewhere on the O. hemionus side of the tree.A STRUCTURE plot separates white-tailed deer and mule deer (Fig. 4a).When the data are analyzed within these two species, there is incomplete separation between mule deer, Sitka black-tailed deer, and Columbian black-tailed deer from California (Fig. 4b), whereas O. v. couesi, O. v. clavium, and O. h.sitkensis are independent (Fig. 4c).

Discussion
The UCE tree (Fig. 3) supports the distinctiveness of O. v. virginianus, O. v. couesi, O. v. clavium, O. h.sitkensis, O. h.columbianus (from California), and O. h.hemionus, in contrast with the mtDNA tree (Fig. 1, Supplementary Figure S1).The STRUCTURE analyses (Fig. 4), however, show incomplete separation of mule deer and black-tailed deer, which is consistent with the hybrid swarm noted by Latch et al. (2011); our lack of mule deer outside of Nebraska prevents de nitive conclusions about range-wide patterns in this taxon.Combe et al. (2022) reported a hybridization rate of nearly 10% between white-tailed deer and mule deer in western Kansas (eight of 92 individuals), the state immediately to the south of Nebraska (see Russell et al. 2021, Wright et al. 2022).However, our data suggest a lower frequency of hybrids in Nebraska (e.g., Zink et al. 2020).
Translocations of deer by game managers likely affected the genetic structure of many game species, such as wild turkey (Meleagris gallopavo) and mallards (Anas platyrhynchos) (Mock et al. 2004, Schummer et al. 2023).Regarding white-tailed deer, Cha n et al. (2021) commented that "an unintended consequence was that natural patterns of gene ow became obscured and pretranslocation signatures of population structure were replaced."This suggests that documenting the history of white-tailed deer populations will require examination of historical museum specimens and a thorough sampling of modern deer populations with dense genomic data.
Mito-nuclear discordance.--Thelack of species distinctiveness between mule deer and white-tailed deer in the mtDNA tree (Fig. 1; Supplementary Figure S2) con icts with the monophyly of these two species in the UCE tree (Fig. 3) and a PRNP gene tree (Zink et al. 2020).Because previous assessments of mito-nuclear discordance were based on data sets including different individuals, we con rmed mito-nuclear discordance by analyzing SNPs from the mitogenome and nuclear genome for the same individuals.
Because white-tailed deer and mule deer are closely related, a mtDNA gene tree with its four times more rapid coalescence time should capture the species split relative to a nuclear gene tree (Zink and Barrowclough 2008), although it does not.This mitonuclear mismatch could have several causes, including ongoing introgression or retention of ancestral alleles.Wright et al. (2022) suggest that white-tailed deer diverged from black-tailed deer, with mule deer later splitting from black-tailed deer.They concluded that a history of hybridization led to at least one genome capture of white-tailed deer mtDNA by mule deer at 1.32 mybp.Their samples are more widely distributed geographically than ours, which precludes a strong test of their genome-capture hypothesis.In contrast, Heffel nger and Latch (2023) considered hypotheses of dispersal, hybrid origin, and isolation in glacial refugia as causes of the mito-nuclear discord between mule deer, white-tailed deer and black-tailed deer and noted that none of these could be ruled out, other than noting that a common mechanism across hypotheses is ancestral lineage sorting.Our nuclear SNP tree (Fig. 3) is not consistent with a genome-capture event (Wright et al. 2022) and suggests that white-tailed deer and mule deer split rst, with black-tailed deer subsequently diverging from mule deer.In addition, given the 1.32 my since the putative mtDNA capture, gene ow ought to have spread the mis-matched mtDNA genotypes much farther than observed by Wright et al. (2022).Hence, we suggest that the mito-nuclear mismatch between mule deer and white-tailed deer is best explained as ancestral lineage sorting that has yet to be completed, although ongoing hybridization will result in new mismatches.
Lineages and glacial history.--Ourniche models (Online Resource 1,2) suggest that the distinctiveness of the taxa in O. virginianus was at least present and maintained through the LGM via largely allopatric refugia.Our UCE data are insu ciently dense to capture the hybrid zone between O. h.columbianus and O. h.hemionus (Latch et al. 2011).However, a hybrid zone could explain the apparent overlapping or parapatric LGM distribution (Fig. S2a,b) of the latter two taxa and could explain their intermingled mtDNA genetics (Fig. 1) discussed above.Taxonomic implications.--Weconclude that the best estimate of the species tree (see Heckeberg 2020) is best based on the UCE data (Figs.3,4).Most authors recognize white-tailed deer and mule deer as distinct species (Bradley e al. 2014; Ramírez-Pulido et al. 2014; Caire et al. 2019), which is consistent with our results.Key deer are distinct in both mtDNA and nuDNA trees.Hence, Key deer could be recognized as a separate species based on their reciprocal monophyly in the UCE tree, although they are not entirely distinct at a few microsatellite loci (Villanova et al. 2017).Similarly, Coues deer appears to be distinct and worthy of consideration for species status.Bradley et al. (2014) considered black-tailed deer and mule deer to be conspeci c, but UCE data and the SNPs from the mitogenome suggest they are evolving independently.Latch and Heffel nger (2022) found genetic support for two blacktailed deer subspecies (O.h.columbianus, O. h.sitkensis) and mainland O. h.hemionus and the two island subspecies, (O.h.cerrosensis on Cedros Island and O. h.sheldoni on Tibur´on Island).Our data also indicate that Sitka black-tailed deer are distinct and reciprocally monophyletic, whereas we found less support for O. h.columbianus.We suggest that none of the other samples from named subspecies diverge to the degree of these taxa, suggesting that the remaining subspecies nomenclature of both mule deer and white-tailed deer does not re ect evolutionary diversity.We acknowledge that greater sampling is required to determine if discrete taxonomic boundaries exist, although if the hypothesis of mtDNA genome capture is correct (Wright et al. 2022), mtDNA could obfuscate taxonomic decisions.

Declarations Figures
Gutiérrez et al. (2017) used samples of O. hemionus represented a broad part of the range and included eight subspecies (hemionus, crooki, sheldoni, fuliginatus, inyoensis, peninsulae, californicus, eremicus).In contrast, the samples of O. virginianus used by Gutiérrez et al. (2017) represented a relatively small part of the range, with one sample of O. v. couesi from Chihuahua and no samples of O. v. clavium.Their mtDNA tree separated most black-tailed deer (O.h.columbianus and O. h.sitkensis) from mule deer similar to Cathey et al. (1998), but samples of O. h.columbianus from Oregon, and O. virginianus from Texas and the District of Columbia were included with mule deer.The single individual of Coues deer was sister to a single O. h.crooki from Texas.The remaining O. h.columbianus and O. h.sitkensis were shown as sister to Mazama pandora (Yucatan brown brocket deer), instead of O. virginianus; however, the low bootstrap support on their tree suggests this relationship is uncertain.

Condensed
maximum likelihood phylogeny based on 959 bp of cytochrome b.Red = mule deer, blue = white-tailed deer, yellow = individuals of both mule deer and white-tailed deer, black = Coues deer.Triangular terminal taxa with no labels signify multiple individuals corresponding to the color code.No nodes had bootstrap values that exceeded 75%.Node labels preceded by a letter start Genbank sequence identi ers, and the others are either in Gutiérrez et al. (2017) or the appendix.

Figure 3 SVDquartets
Figure 3 al. 2009) that might be misidenti ed.The single individual of O. v. clavium (representing 34 total individuals with an identical haplotype; see Villanova et al. 2017) is embedded within a clade of O. virginianus.A clade of O. h.columbianus includes interspersed individuals identi ed as O. h.sitkensis.A clade of O. hemionus included individuals from Alberta, Arizona, British Columbia, Utah, Nevada, Wyoming, Idaho and Oregon.One clade included O. h.hemionus, O. h.columbianus, and O. virginianus (from Nebraska).Thus, apart from O. v. couesi and O. v. clavium, the mtDNA gene tree does not re ect current taxonomy.