Whole-genome sequencing of reindeer (Rangifer tarandus) populations reveals independent origins of dwarf ecotypes and potential molecular mechanisms underpinning cold adaptation

doi:10.21203/rs.3.rs-3619721/v1


Research Article


This work is licensed under a CC BY 4.0 License


Background

Reindeer (Rangifer tarandus) are iconic mammals that inhabit the Arctic and sub-Arctic regions. In these areas, reindeer not only play a vital ecological role, but they also hold cultural and economic significance for indigenous communities. In order to thrive in the harsh conditions of the northernmost areas of the world, reindeer have developed an array of phenotypic adaptations, especially in the ecotypes living in the High Arctic. Therefore, a thorough understanding of population structure, history, and genetic diversity of reindeer is useful for their sustainable management and to guide long-term conservation efforts.

Results

We conducted whole-genome sequencing of a male R. t. tarandus specimen from Norway's Hardangervidda region, generating a highly continuous and complete genome assembly that can be used as a reference genome for genetic analyses focusing on the Fennoscandian reindeer. We also sequenced reindeer ecotypes from across the globe and generated de novo sequences from two ancient samples. Our analysis suggests an independent evolution of small-sized phenotypes in specific high-arctic ecotypes, such as the Svalbard reindeer (R. t. platyrhynchus) and Peary caribou (R. t. pearyi). We describe how the demographic bottleneck that affected the reindeer in the Svalbard archipelago resulted in reduced genetic variability compared to mainland Norway reindeer. Our data suggests that these two distinct ecotypes were likely independent populations before the last glaciation. Finally, we also observe an enriched number of genes associated with cilium motility and cilium assembly presenting missense variants between these two ecotypes, potentially linked to adaptations in the extreme arctic environment. For instance, some of these genes play a role in respiratory cilia movement, potentially improving respiratory function in cold environments.

Conclusions

Our findings provide new insights into the genetic basis of small body size adaptations in reindeer ecotypes and highlight the impact of environmental constraints on their populations. Our high-quality reference genome and associated resources will aid in addressing epidemiological, conservation, and management challenges faced by reindeer populations in a rapidly changing world.

Keywords: reindeer, Rangifer tarandus, genome reference, Svalbard, population genetics, size adaptation, cold adaptation

Reindeer (Rangifer tarandus) are large herbivorous mammals that inhabit the Arctic and sub-Arctic regions, including the northern parts of Eurasia, Greenland and North America, where for historical reasons they are referred to as caribou [1]. As the only truly domestic cervid species, reindeer hold cultural and economic significance for indigenous communities, while also playing a vital ecological role in their ecosystems. Despite their cultural, ecological, and economic relevance, the origin of the modern-day distribution of reindeer populations is still not fully understood. Until now, work has been largely limited to morphological comparison and the analysis of short stretches of microsatellites and mitochondrial DNA for select groups [2–5], and there is an ongoing debate regarding the taxonomic status of both living and extinct taxa within the genus Rangifer [6]. Moreover, reindeer exhibit multiple adaptations that have allowed them to thrive in challenging arctic landscapes. However, the molecular basis of these adaptations remains poorly understood, further highlighting the need for more in-depth genetic analyses.

Interestingly, past work revealed that both caribou and Eurasian reindeer harbour extremely small-sized ecotypes in the High Arctic: the Svalbard reindeer (R. t. platyrhynchus, endemic to the Svalbard archipelago) and the Peary caribou (R. t. pearyi, native to the Canadian Arctic archipelago) [1, 7]. A third small-sized ecotype existed in Eastern Greenland (R. t. eogroenlandicus), but it has been reported extinct since 1900 [8]. Some remote populations of reindeer in Northwest Greenland have also been reported to exhibit a comparatively small stature [1, 9, 10], hinting at a possibly close relationship with one of the established small-sized ecotypes. In Svalbard, the small size is particularly pronounced, with males being roughly 20% smaller than their mainland counterparts [11], as well as having considerably shorter legs and a distinctly rounded, smaller head (Fig. 1a). Reduced body surface to volume ratios, leading to a stout stature, have been previously proposed as a common evolutionary trajectory for species moving into colder climates, a process known as “Allen’s rule” [12]. Another important morphological characteristic of these reindeer is an increased volume of the nasal cavity [13, 14], which reduces heat and water loss from the respiratory tract. It has been argued that these features contribute to weathering the extremely harsh and cold [15] conditions in the High Arctic.

Herein, we present a highly continuous and complete genome assembly of a male R. t. tarandus individual (normal size ecotype) from the Hardangervidda National Park (Norway). In addition, we carry out whole-genome sequencing (WGS) of reindeer ecotypes covering much of the modern global distribution of the animals. Interestingly, this dataset includes the first highly covered genomes of late glacial reindeer to date, obtained from samples recovered from northern Germany. We use this information to reconstruct the evolutionary history of reindeer and confirm that the small size phenotype has evolved independently in Svalbard reindeer, on the one hand, and Peary caribou and their relatives in Greenland on the other. Finally, a closer look at the genetic diversity present in Svalbard reindeer indicates a highly differentiated genomic landscape and a significantly increased number of missense variants accumulating in a group of genes involved in cilium motility and assembly. Interestingly, defects in these genes have been shown to affect skeletal development and/or lead to reduced appendage and body size (stature), as well as reduced respiratory capacity in humans and mouse models.

Genome reference and global analysis of reindeer ecotypes

A reference genome was generated from a five-year-old male specimen of the subspecies R. t. tarandus from the region of Hardangervidda, Norway (Fig. 1b). We used Chromium-based linked-read sequencing (see Methods) and obtained a final assembly size of 2.49 Gb, with a scaffold N50 of 20.75 Mb (Supplementary Table 1). This assembly size compares favourably to that of the Mongolian reindeer sequenced by Li et al. [16] (2.76 Gb), with notably greater contiguity (20.75 Mb vs. 1.01 Mb; Supplementary Table 1). Recently, a high-quality chromosome-level assembly of the Canadian R. t. caribou has been published by Poisson et al. [17] (2.58 Gb), with a very large scaffold N50 (54.36 Mb) (Supplementary Table 1). However, this assembly still has large gap regions (Supplementary Table 1). A gene space coverage analysis of expected mammalian BUSCO [18] genes showed that our Chromium-based assembly is the most complete among the three currently available R. tarandus genome assemblies (95.3% of complete mammalian BUSCO genes present, with an additional 1.2% being present but fragmented and only 3.5% missing; Supplementary Table 2). We assessed the accuracy of our assembly and compared it to the other available deer assemblies by mapping the original genomic reads to each assembly and measuring the number of inaccuracies. The resulting feature response curve (FRC) analysis further showed that both our assembly and the caribou assembly have an equivalent and much higher accuracy than all other compared genome assemblies, as they present very few accumulated errors (Supplementary Fig. 1). These results confirm that the overall quality of our reindeer reference is on par with the recently published chromosome-level assembly of the caribou. Additionally, chromosome-level comparison to the cow genome confirmed strong concordance and completeness (Supplementary Figs. 2 and 3). Finally, de novo assembled species-specific transcripts mapped to our assembly at a rate of 99.96%, indicating a highly complete assembly that can be confidently used for the genomic analyses described below.

To estimate the genetic variation across reindeer populations, we performed low coverage WGS on samples from their circumpolar distribution (Fig. 2a, Supplementary Table 3) as well as from two ancient samples from the Ahrensburg tunnel valley near Hamburg, Germany (14,800 − 14,000 calBP; throughout the present text, the expression calBP is used to indicate calendar years before 1950 to avoid confusion with uncalibrated ¹⁴C ages).

Our dataset includes 10 individuals from the Svalbard archipelago (R. t. platyrhynchus), as well as from surrounding regions – including 10 individuals from three different regions in mainland Norway (Snohetta (3 individuals), Hardangervidda (3 individuals) and Finnmark (4 individuals)), Northwestern Greenland (2 individuals), Northern Scandinavia (10 individuals), Novaya Zemlya (2 individuals) and Belyi Island (2 individuals); and from the other extant “small” ecotype, namely the Peary caribou (R. t. pearyi) from Bathurst Island and Cornwallis Island in the Canadian High Arctic (1 individual of each) (Supplementary Table 3). While the classification of R. t. platyrhynchus and R. t. pearyi into distinct subspecies directly corresponds to their individual ecological habitats (Svalbard and Canadian Arctic Archipelago, respectively), the classification of reindeer living in Greenland is not straightforward. At least three reindeer ecotypes are described as living along the western coast of Greenland, in herds that are isolated from each other [1, 9]. The Greenlandic samples used in this study belong to an uncharacterized herd of reindeer found in Northwest Greenland [19] (75° 33’N, 58° 05’W), but our phylogenetic analysis (see below) together with on-site morphological observation (Supplementary Fig. 4) suggests that they constitute another smaller-sized ecotype descended from Peary caribou. Finally, to complement our analysis, we added publicly available data from both Chinese reindeer [16] and Canadian caribou [20], down-sampled to a coverage of ~6.5x, equivalent to that of our in-house produced WGS data (rhombus symbols in Fig. 2a, Supplementary Fig. 5). Mapping rate against our genome reference was high across all samples and mean sequencing coverage across samples ranged from 3.91x to 14.18x (Supplementary Table 4 and Supplementary Fig. 5).

Principal components analysis (PCA) based on genotype likelihoods of these mapped data reveals the greatest genetic differentiation between Svalbard and Northwestern Greenland: the samples from these two populations align along two roughly perpendicular imaginary axes (Fig. 2b). Close to the intersection of these axes, we found the populations that presumably gave rise to them: Novaya Zemlya reindeer (for Svalbard) and the Peary caribou from the Canadian Archipelago (for Northwestern Greenland). All other Eurasian reindeer, including the ancient reindeer samples from central Europe, form a group that is genetically distinct from the aforementioned populations.

This pattern is compatible with the phylogenetic tree (Fig. 2c), in which the Svalbard samples form a monophyletic clade with 100% bootstrap support. The two Novaya Zemlya specimens appear paraphyletic with respect to the Svalbard clade, and together they form a monophyletic clade with 87% bootstrap support. The other taxa of interest, the reindeer from Northwestern Greenland and the Peary caribou, also form a monophyletic clade, with 96% bootstrap support. The whole tree displays a strong phylogeographic structure, although without a unipolar orientation. Interestingly, the Hamburg ancient samples are placed as a stem group of all other reindeer, suggesting little participation of Pleistocene European reindeer in the post-glacial colonization of Arctic islands [21, 22].

Genomic signatures in Svalbard reindeer

R. t. platyrhynchus from Svalbard is the smallest reindeer subspecies in the world [1, 11]. Its population has been geographically isolated for thousands of years in the northernmost archipelago following the retreat of circumpolar ice sheets at the end of the last ice age, and multiple adaptations that help them withstand the extreme arctic environment where they live have been described [7]. To investigate the genetic landscape underlying these characteristics, we sequenced 6 individuals from this ecotype, as well as 6 individuals from mainland Norway, more deeply to reach an average 20x coverage (Supplementary Fig. 6). From these data, we generated a filtered set of high-quality SNPs to use in our comparison of both populations.

Multiple sequentially Markovian coalescent (MSMC) analysis [23] inferred a steep demographic bottleneck in the Svalbard population that started during the Late Pleistocene, some 50 thousand years ago (TYA), and reached its minimum (some 1,000 individuals) around the mid-Holocene, showing steady population growth thereafter (Fig. 3). The Norwegian population, on the other hand, experienced a more gradual demographic decrease that reached its lowest point (< 10,000 individuals) around the early Holocene and recovered sharply around the mid-Holocene (5 TYA). The demographic histories of both populations (Svalbard and Norway) decoupled around 200 TYA, suggesting that the split of those populations from a common ancestor occurred around that time. This further suggests that, during the last glacial cycle, the two groups could have occupied distributions and refugia that were partially or fully disjoint. This separation between the evolutionary histories of reindeer from Svalbard and Norway translated into deep differences in their genomic variation. For instance, the runs of homozygosity (RoHs), both small (< 0.1 Mb) and medium (0.1–1.0 Mb) sized, were at least one order of magnitude more numerous in Svalbard than in Norway (Supplementary Fig. 7).

In an effort to identify possible selective sweeps in the genome of Svalbard reindeer, we searched for regions with extreme allele frequencies using the population branch statistic (PBS). However, the PBS distribution over 5 kb sliding windows did not present clear outlier regions that could reliably distinguish real traces of selection from a neutral distribution (Supplementary Fig. 8). The underlying extremely high levels of genome-wide differentiation are likely also caused by the founder effect affecting this ecotype.

Given this high genetic differentiation, we next sought to describe what type of genetic variants have accumulated in the genome of the Svalbard reindeer in comparison to their mainland counterparts (Supplementary Fig. 9). We could identify a total of 1,305 annotated genes with a higher number of non-synonymous than synonymous variants. We identified reliable functionally annotated orthologs in the cow reference for 756 of these genes, with 700 having at least one associated GO term in the BiomaRt database [24] (Supplementary Table 5).
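The gene selection step described above can be illustrated with a minimal sketch. It assumes that each coding variant has already been annotated with a gene identifier and an effect class; the input layout, the function name and the gene names (geneA, geneB, geneC) are hypothetical and are not taken from our pipeline.

```python
from collections import Counter


def genes_with_excess_nonsynonymous(variants):
    """Return genes carrying more non-synonymous than synonymous variants.

    `variants` is an iterable of (gene_id, effect) tuples, where effect is
    either "nonsynonymous" or "synonymous" (hypothetical input layout).
    """
    nonsyn = Counter()
    syn = Counter()
    for gene, effect in variants:
        if effect == "nonsynonymous":
            nonsyn[gene] += 1
        elif effect == "synonymous":
            syn[gene] += 1
    # A gene is kept only if its non-synonymous count is strictly larger.
    return sorted(gene for gene in nonsyn if nonsyn[gene] > syn[gene])


# Toy input with hypothetical gene names.
example = [
    ("geneA", "nonsynonymous"),
    ("geneA", "nonsynonymous"),
    ("geneA", "synonymous"),
    ("geneB", "nonsynonymous"),
    ("geneB", "synonymous"),
    ("geneC", "nonsynonymous"),
]
print(genes_with_excess_nonsynonymous(example))  # ['geneA', 'geneC']
```

In this toy input, geneB is excluded because its counts are tied.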

Gene Ontology (GO) enrichment analysis (Supplementary Table 6) found that many of these genes are G protein-coupled odorant receptors, since the most enriched, non-redundant biological processes were ‘detection of chemical stimulus’ (GO:0009593, P = 4.5×10^−7), ‘G protein-coupled receptor signaling pathway’ (GO:0007186, P = 5.4×10^−7) and ‘sensory perception of smell’ (GO:0007608, P = 1.1×10^−6). The combined list of genes with these GO terms added up to 39 odorant receptors with non-synonymous variants in Svalbard reindeer (Supplementary Table 7). The next significantly enriched, non-redundant terms were related to cilium and motility, such as ‘cilium movement’ (GO:0003341, P = 0.00014), ‘microtubule-based process’ (GO:0007017, P = 0.00032), ‘cilium movement involved in cell motility’ (GO:0060294, P = 0.00039), ‘flagellated sperm motility’ (GO:0030317, P = 0.00076) and ‘cilium assembly’ (GO:0060271, P = 0.00110). The list of all cilium-related genes with accumulated non-synonymous variants added up to 23 genes (Supplementary Table 8). A literature survey of these 23 ciliary genes revealed that human mutations in at least 9 of them (CEP162, TALPID3, INTU, CEP97, MNR, CENPJ, DYNC2I1, CROCC and CEP120) have been implicated in ciliopathies and other related human diseases, all of which give rise to body malformations and height reduction (Supplementary Table 9), among other phenotypes. The proteins encoded by these 9 genes are all part of the cilium assembly machinery and interact with each other in the process of centriole elongation during mitosis (Fig. 4, Supplementary Fig. 10). Another 5 of the identified ciliary genes (GASL2L, SPAG17, DNAH11, DNAH5 and CFAP221) are regulators of the motility and function of respiratory cilia, and human mutations in all of them have been shown to cause ciliary dyskinesia, a disease that causes chronic inflammation of the respiratory tract (Fig. 4, Supplementary Table 9).
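To make the notion of term enrichment concrete, the sketch below computes a one-sided hypergeometric P value, a common choice for GO over-representation tests. The statistical test actually used for Supplementary Table 6 is not specified in this section, so the choice of test is an assumption, and all counts in the example call are hypothetical placeholders rather than values from our data.

```python
from math import comb


def hypergeom_upper_tail(hits, sample_size, annotated, population):
    """P(X >= hits) for a hypergeometric draw.

    population  : number of background genes
    annotated   : background genes carrying the GO term
    sample_size : number of genes in the candidate list
    hits        : candidate genes carrying the GO term
    """
    upper = min(sample_size, annotated)
    total = comb(population, sample_size)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, sample_size - i)
        for i in range(hits, upper + 1)
    )
    return favourable / total


# Hypothetical counts, for illustration only.
p_value = hypergeom_upper_tail(hits=23, sample_size=700,
                               annotated=150, population=20000)
print(f"P = {p_value:.3g}")
```

Exact integer arithmetic is used for the binomial coefficients, so small tail probabilities do not suffer from intermediate rounding. In practice such P values would still need correction for the number of GO terms tested.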

In this study, we employed whole-genome sequencing to disentangle the origin of smaller-sized reindeer ecotypes living in isolated arctic environments, with a particular focus on the Svalbard reindeer in the High-Arctic Svalbard Archipelago. The historical migration path that gave rise to the modern-day distribution of reindeer has been a subject of debate for some time [1, 3–5, 25], and the origin of reindeer in Svalbard in particular has remained unresolved [2, 4, 5, 26]. An initial scenario proposed that reindeer reached Svalbard through Greenland, similar to species such as the arctic fox (Alopex lagopus) [27], which seemed supported by the similarly small stature found in reindeer on both Svalbard and parts of Greenland and Canada. More recent work comparing mitochondrial DNA suggested an immigration route from Eurasia, via Novaya Zemlya and Franz Joseph Land [4] (in the latter, reindeer are currently extinct [28]). However, these results were based on the maternal mitochondrial lineage or a few genetic positions and did not include outgroup samples from Greenland or North America. With the much greater statistical power provided by genome-wide variants of ancient and modern individuals, our analysis supports the Eurasian route (Fig. 5), and suggests that the ancestors of current-day Svalbard reindeer likely found refuge in the Beringia region rather than present-day Europe during the last glaciation. The Svalbard archipelago and Novaya Zemlya, including the Franz Joseph Land archipelago in between, were largely covered by the Weichselian ice sheet during the Last Glacial Maximum (~22,700 calBP) and were uninhabitable until at least 15,000 calBP [29]. By the start of the Younger Dryas stadial (12,900 calBP), the ice between Novaya Zemlya and neighbouring islands had completely melted, establishing a hard upper limit for the time of the split between Svalbard and Novaya Zemlya populations.

Our results further suggest that the reindeer that reached Svalbard – and were probably already present on Franz Joseph Land – suffered extended reductions of their effective population size, most probably due to food and habitat constraints. Together with the environmental conditions in the High Arctic, this provides a rationale for the emergence (and fixation) of small body size as a beneficial adaptation, together with other phenotypic adaptations to their extremely cold environment. We have identified the G protein-coupled odorant receptor gene family as the one with the most enrichment of genes with coding differences between Svalbard and Norwegian reindeer. Olfactory sense is crucial to finding food, potential predators and mating partners, and it plays an important role in evolutionary success of any animal species [30], including reindeer [31]. Accordingly, the odorant receptor gene family has been identified as a recurrent target of selection in the process of ecological adaptation [32, 33]. For instance, it has also been described in polar bears, also adapted to the extreme arctic environment, with these animals showing a reduced and more specific odorant receptor repertoire compared to brown bears [34]. However, it has been shown that the highly polymorphic nature of the odorant receptor gene family is likely to cause mapping and assembly errors that generate false-positive signals. Therefore, it has been recommended that they be excluded from selection and enrichment analyses [35].

More unexpectedly, we have detected an accumulation of mostly benign missense changes on a significantly large number of genes involved in ciliogenesis and cilia motility. On the one hand, five of these genes affect the movement of the cilia in the respiratory tract. Previous work suggests that a faster ciliary movement in the epithelial cells of the nasal cavity can improve the respiratory function of humans [36, 37], and indeed it has been shown that other arctic mammalian species present signals of positive selection in genes related to “regulation of cilium beat frequency” [38]. On the other hand, nine of these cilium related genes are key players in the assembly and growth of the centrioles. Interestingly, defects in all these nine genes have been reported to give rise to different types of ciliopathies in humans (Supplementary Table 9), diseases which manifest through morphological alterations such as growth retardation, facial abnormalities, smaller rib cage or dwarfism [39].

We speculate that as initial adaptations related to cilia motility genes could have allowed an increased air circulation in the nasal cavity, small adaptive variations in other proteins of the same machinery could have morphological consequences on the body of these animals as a side effect. However, to conclusively test whether selection has acted on these genes and confirm the underlying hypothesis of adaptation, much greater sample sizes will be needed from both the Svalbard population as well as other smaller-sized high-arctic reindeer ecotypes.

Past work has proposed a narrative in which life in isolated environments in the High Arctic has exerted particular selective pressures on independent populations of reindeer. These changes were thought to have manifested in specific adjustments to body size as well as physiology – analogous to findings from several other mammals [40]. Our detailed genomic survey provides evidence that in reindeer, this adaptation has happened at least two times independently, and we provide evolutionary and biological insights into these adaptations at a genetic level. Our results thus form another critical building block towards understanding more general evolutionary principles underpinning genetic adaptations to life in extreme environments. However, more research on a diverse set of “small” ecotypes is needed to address this question more conclusively.

Sample collection, DNA extraction and sequencing

For de novo genome sequencing, blood from a 3-year-old male reindeer from the region of Hardangervidda (Norway, 60° 0’16’’N, 7° 34’24’’E) was extracted by puncture of the heart immediately after killing and placed into EDTA tubes. Blood was frozen in a paraffin-driven freezer and shipped on ice to the IKMB Sequencing Center in Kiel (Germany). High-molecular-weight genomic DNA was extracted using the MagAttract HMW DNA Kit (Qiagen) following the manufacturer’s instructions, and a fragment length > 50 kb was confirmed using an Agilent TapeStation 4200. DNA was sent to the NGI/SNP&SEQ Center in Uppsala, where one 10x Genomics Chromium library [41] was prepared. Sequencing was performed using an Illumina HiSeqX (Illumina, San Diego, CA, USA) and 941 million 2 x 150 bp reads were obtained.

Samples used for WGS are described in Supplementary Table 3, as well as the methods utilized for genomic DNA extraction. All samples not taken from already available museum specimens were collected from dead animals during industrial slaughter or subsistence hunting. All samples were sent to the IKMB Sequencing Center in Kiel (Germany) for sequencing. DNA quality was inspected using an Agilent TapeStation 4200 and quantified using a Qubit 2.0 Fluorometer, after which Nextera DNA Flex libraries were prepared. Samples were sequenced in 2 x 150 bp reads on an Illumina HiSeq4000 or Illumina NovaSeq6000 sequencer (Supplementary Fig. 11). Each sample was sequenced to ~6x coverage (~50 million paired-end reads) and samples were barcoded, pooled and demultiplexed accordingly. Additionally, 6 libraries from Svalbard samples and 6 libraries from mainland Norway samples were sequenced a second time (~100 million additional paired-end reads per sample) to increase coverage (up to a total of ~20x per sample). Access and utilization of the genetic material in this study followed Nagoya Protocol obligations, with no operational requirements according to Norwegian law.
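The relationship between read counts and the nominal coverage values quoted above can be checked with simple arithmetic. The sketch below assumes that “paired-end reads” refers to read pairs, that the genome size equals the 2.49 Gb assembly size, and that no bases are lost to trimming, duplicates or unmapped reads; the function name is illustrative.

```python
def expected_coverage(read_pairs, read_length_bp, genome_size_bp):
    """Nominal sequencing depth: total sequenced bases divided by genome size."""
    return read_pairs * 2 * read_length_bp / genome_size_bp


GENOME_SIZE_BP = 2_490_000_000  # assembly size reported in the Results (assumed genome size)

# ~50 million pairs of 2 x 150 bp reads (first sequencing round)
print(round(expected_coverage(50_000_000, 150, GENOME_SIZE_BP), 2))   # 6.02

# ~150 million pairs in total after the second sequencing round
print(round(expected_coverage(150_000_000, 150, GENOME_SIZE_BP), 2))  # 18.07
```

Under these assumptions the first round yields about 6x and the combined rounds about 18x, consistent with the approximate figures of ~6x and ~20x given in the text.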

Ancient reindeer samples

Late Pleistocene hunter-gatherer groups were heavily dependent on reindeer hunting for subsistence. In particular, the locations and the landscape around the archaeological sites in the Ahrensburg tunnel valley were repeatedly used by these humans for seasonal hunting of reindeer [42, 43]. We sampled petrous bones of two adult reindeer, which were directly dated to 13,152 − 12,416 calBC (KIA-53518: 12,520 ± 55 BP) and 13,046 − 12,293 calBC (KIA-53525: 12,460 ± 55 BP) using the radiocarbon method following standard protocols at the Leibniz Laboratory for AMS Dating and Isotope Research, Kiel. Here, “calBC” denotes real calibrated ¹⁴C dates.

Bleach was used to remove surface contaminants from petrous bones. DNA extraction and subsequent partial uracil-DNA-glycosylase treated sequencing libraries were prepared from bone powder following previously established protocols [44]. All steps, including sampling, DNA extraction and the preparation of sequencing libraries, were performed in clean-room facilities of the Ancient DNA Laboratory in Kiel. Negative controls were taken along for the DNA extraction and library generation steps. The libraries were paired-end sequenced using 2 x 75 cycles on an Illumina HiSeq 4000. Illumina sequencing adapters were removed and paired-end reads were merged if they overlapped by at least 11 bp using ClipAndMerge, a part of the EAGER pipeline [45]. Merged reads were filtered for a minimum length of 30 bp. Both samples show the expected degradation patterns, increased C > T and G > A substitutions at the 5’ and 3’ ends of the reads (Supplementary Fig. 12).

Genome assembly

Supernova v2.1.0 [46] was used to assemble the 10x barcoded FASTQ reads. A total of 2,458 scaffolds were generated, with an N50 value of 15,262,525 bp. Leading and trailing Ns were removed using a custom script. To further scaffold the assembly, we used ARKS v1.0.2 [47] with default parameters and increased the scaffold N50 to 20,747,807 bp in 2,302 scaffolds. To generate an assembly of the mitochondrial genome, Trimmomatic v0.33 [48] was first used to trim the 10x barcodes using HEADCROP:23 for forward reads and HEADCROP:1 for reverse reads. A single lane of reads (150 million reads) was used as input for MitoZ v2.2 [49] with the option “--genetic_code 2” to generate the mitochondrial assembly (Supplementary Fig. 13).
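For readers unfamiliar with the N50 statistic reported above, the following minimal sketch shows how it is commonly computed from a list of scaffold lengths: the scaffolds are sorted from longest to shortest, and N50 is the length of the scaffold at which the running total first reaches half of the assembly size. The function name and the toy lengths are illustrative and not part of our pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first scaffold where the cumulative length
        # covers at least half of the total assembly length.
        if running * 2 >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy example: total = 54, half = 27; 10 + 9 + 8 = 27, so N50 = 8.
print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))  # 8
```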

To assess the quality of our reindeer assembly, we downloaded the following available deer genomes (together with one sequencing lane of raw reads in each case): Chinese reindeer genome [16] was downloaded from http://gigadb.org/dataset/100370, caribou genome [17] was downloaded from NCBI (GCA_019903745.2), white-tailed deer (Odocoileus virginianus) genome was downloaded from NCBI (GCF_002102435.1, Ovir.te v1.0), hog deer (Axis porcinus) genome [50] was downloaded from NCBI (GCA_003798545.1, ASM379854 v1) and red deer (Cervus elaphus) genome [51] was downloaded from NCBI (GCA_002197005.1, CerEla1.0). The quality of our Hardangervidda reindeer assembly was compared to these other deer assemblies by FRC analysis. Briefly, we mapped one lane of the original genomic reads to the corresponding assembly using BWA v0.7 [52] and used FRCbam to detect mapping errors [53]. Results were plotted to determine overall congruency between assembly and raw data. Each assembly was also analysed for the presence of known mammalian single-copy orthologs (BUSCO v3.0.2) [18] to determine gene space coverage as a proxy for overall completeness.

One available reindeer RNAseq dataset was downloaded from NCBI (SRR5647658), trimmed using Trimgalore v0.4.4 [54] with the options “--length 36 -q 5 --stringency 1 -e 0.1” and assembled into a transcriptome using Trinity v2.4.0 [55]. This transcriptome was mapped against our assembly using minimap2 v2.9 [56] with “-ax splice” to further assess genome completeness.

Genome annotation

The genome was annotated using an in-house developed pipeline (https://github.com/ikmb/esga) based on ab initio gene model prediction and model hints from different sources. In short, the genome sequence is first repeat-masked using RepeatMasker v4.0.8 [57], followed by generation of 3 types of hint features: 1) reviewed metazoan proteins were downloaded from UniProt [58] and aligned using Exonerate [59] in protein2genome mode, 2) trimmed raw RNAseq reads (see previous section) were aligned using Hisat2 [60], and 3) assembled transcriptome sequences (see previous section) were aligned using Exonerate in est2genome mode. Resulting alignments were used as extrinsic hints to run Augustus v3.2.3 [61] specifying “--species=human --UTR=off --alternatives-from-evidence=false”. Finally, UTRs and alternative isoforms were incorporated by mapping the assembled transcripts using PASA [62], producing 28,985 protein-coding gene models. InterProScan v5.19 [63] was used to perform functional annotation, with protein domains being assigned to 20,402 genes and GO terms to 13,329 genes.

Evolutionary analysis of reindeer ecotypes

Sequencing reads were first trimmed with Trimgalore v0.4.4 [54] using the options “--paired --retain_unpaired --length 36 -q 30”. Read quality of trimmed reads was analyzed with FastQC v0.11.9. Trimmed reads were mapped to the genome reference using the MEM algorithm from the BWA v0.7 [52] package. Mapped reads were sorted with Samtools v1.9 [64] and duplicate reads were flagged with Picard v2.17.8 [65]. Genome-wide coverage was measured using the genomeCoverageBed script from BEDtools v2.25.0 [66]. Reads from white-tailed deer (used as outgroup), caribou and Chinese reindeer were downloaded from NCBI (SRR4069819, SRR9332022 and SRR5763127, respectively) and processed in the same way as described above.

For PCA analysis, the ANGSD v0.921 [67] package, which takes the uncertainty of low-coverage sequencing data into account, was used. First, genotype likelihoods were calculated from the mapping information collected in the bam files (one per individual) using ANGSD with parameters “-uniqueOnly 1 -remove_bads 1 -only_proper_pairs 1 -trim 0 -baq 1 -minMapQ 15 -minQ 15 -setMinDepth 60 -setMaxDepth 400 -doCounts 1 -GL 1 -doMajorMinor 1 -doMaf 1 -skipTriallelic 1 -SNP_pval 1e-3 -doGeno 32 -doPost 1”. Next, the covariance matrix between individuals was computed using the ngsCovar tool from the ngsTools suite [68]. Eigenvectors were calculated with the “eigen” function implemented in R [69] and the first two principal components were plotted using a custom R script.

In order to infer the relationships between all sequenced individuals, variants were detected in all samples using FreeBayes v1.1.0 [70] and afterwards filtered with the vcffilter script from the VCFlib library [71] to retain only variants with quality > 20 and a coverage of at least 10 reads across all samples (to discard variants unique to a single sample), and with at least 5 reads in each specific sample (-f “QUAL > 20 & DP > 10” -g “DP > 5”). Variant information was collected in a single VCF file and allele frequencies were calculated using PLINK v1.90b6.16 [72]. Frequency information for the resulting 24,756,418 positions was converted to TreeMix format using a custom script. One hundred bootstrap replicates of TreeMix v1.13 [73] were run, fixing white-tailed deer as outgroup and re-sampling blocks of 500 SNPs (“-root White_tailed_deer -noss -bootstrap -k 500”). The 100 bootstrap trees were concatenated and the consensus script from the Phylip package v3.696 [74] was used to identify the tree with highest support, which was visualized using FigTree v1.4.4 [75].
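The conversion to TreeMix format is described only as a custom script. The following Python sketch shows one way such a conversion could look, assuming per-population allele counts are already available. The function name `to_treemix_lines`, the population labels other than White_tailed_deer and the counts are hypothetical. The TreeMix input layout (a header of population names followed by one line per SNP with comma-separated allele counts per population) follows the TreeMix documentation.

```python
def to_treemix_lines(populations, snp_counts):
    """Build the text lines of a TreeMix input file.

    populations: ordered list of population names (header line).
    snp_counts:  iterable of dicts mapping population name to a
                 (count_allele_1, count_allele_2) tuple for one SNP.
    """
    lines = [" ".join(populations)]
    for counts in snp_counts:
        fields = [f"{counts[pop][0]},{counts[pop][1]}" for pop in populations]
        lines.append(" ".join(fields))
    return lines

# Hypothetical allele counts for two SNPs in three populations
pops = ["Svalbard", "Mainland", "White_tailed_deer"]
snps = [
    {"Svalbard": (12, 0), "Mainland": (8, 4), "White_tailed_deer": (0, 2)},
    {"Svalbard": (6, 6), "Mainland": (12, 0), "White_tailed_deer": (2, 0)},
]
lines = to_treemix_lines(pops, snps)
print("\n".join(lines))
```

TreeMix reads gzip-compressed input, so in practice these lines would be written to a file and compressed before running the bootstrap replicates.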

Genomic comparison of Svalbard and mainland Norway ecotypes

In order to study the Svalbard ecotype in more detail, only the samples that had been sequenced more deeply (Svalbard n = 6, Norway mainland n = 6; ~20x coverage) were used in the analyses described in this section. Variant calling was performed using FreeBayes and VCFlib in the same manner as described previously.

We employed the Population Branch Statistic (PBS) to scan the genome of Svalbard reindeer for possible selective sweeps [76]. PBS values were calculated using the following formula:

$$PBS = \frac{\log\left(1 - F_{ST}^{RO}\right) - \log\left(1 - F_{ST}^{TR}\right) - \log\left(1 - F_{ST}^{TO}\right)}{2}$$

Where F_ST is the fixation index between populations, RO is F_ST between the reference (Norway mainland) and the outgroup (Hamburgian) population, TR is F_ST between the target (Svalbard) and the reference population, and TO is F_ST between the target and the outgroup population. A high PBS value indicates a large allele frequency change at a given locus along the branch leading to the target population, that is, a region that is highly differentiated in the target compared with both the reference and the outgroup. We chose PBS over pairwise F_ST because PBS is directional and can therefore detect selection signals that arose only in Svalbard reindeer.
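As a worked illustration of the formula, the sketch below computes PBS from three pairwise F_ST values in plain Python. It is not the scikit-allel code used in the study, and the function names and example values are assumptions. Each F_ST is first converted to a branch length T = −log(1 − F_ST), so that PBS = (T_TR + T_TO − T_RO) / 2, which is algebraically identical to the expression above.

```python
import math

def branch_length(fst):
    """Log-transformed divergence T = -log(1 - F_ST); requires F_ST < 1."""
    return -math.log(1.0 - fst)

def pbs(fst_tr, fst_to, fst_ro):
    """Population Branch Statistic for the target population.

    fst_tr: F_ST between target (Svalbard) and reference (mainland)
    fst_to: F_ST between target and outgroup (Hamburgian)
    fst_ro: F_ST between reference and outgroup
    """
    return (branch_length(fst_tr) + branch_length(fst_to)
            - branch_length(fst_ro)) / 2.0

# Hypothetical locus: the target differs strongly from both other
# populations, while reference and outgroup are identical
print(pbs(fst_tr=0.5, fst_to=0.5, fst_ro=0.0))

# Hypothetical locus with no differentiation at all
print(pbs(fst_tr=0.0, fst_to=0.0, fst_ro=0.0))
```

The first example gives a long target branch because all differentiation is attributable to the target lineage, which is the directional signal that pairwise F_ST alone cannot provide.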

We used the Svalbard population as the target population, the mainland population as the reference population and the ancient population as the outgroup. We kept only SNPs that were genotyped in all samples, were bi-allelic and were not fixed (at least one copy of the minor allele present in the dataset). All indels were removed from the analysis. Filtering was done with BCFtools [77] using the “bcftools view -g ^miss -m2 -M2 -v snps -c 01” command.
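The filtering criteria can be restated as a small Python predicate. This is a sketch of the rules as described in the text, not a re-implementation of the BCFtools command. The function name `keep_snp` and the genotype encoding (tuples of allele indices, with None for a missing allele) are assumptions.

```python
def keep_snp(ref, alts, genotypes):
    """Return True if a site passes the filters described in the text.

    ref:       reference allele string
    alts:      list of alternative allele strings
    genotypes: list of per-sample tuples of allele indices (0 = ref,
               1 = alt); None marks a missing allele (hypothetical encoding)
    """
    # Bi-allelic SNP only: exactly one ALT allele and single-base alleles
    if len(alts) != 1 or len(ref) != 1 or len(alts[0]) != 1:
        return False
    alleles = [allele for gt in genotypes for allele in gt]
    # The site must be genotyped in all samples
    if any(allele is None for allele in alleles):
        return False
    # Not fixed: at least one copy of the minor allele
    alt_count = sum(alleles)
    minor_count = min(alt_count, len(alleles) - alt_count)
    return minor_count >= 1

print(keep_snp("A", ["G"], [(0, 0), (0, 1), (1, 1)]))      # polymorphic SNP
print(keep_snp("A", ["G"], [(1, 1), (1, 1), (1, 1)]))      # fixed site
print(keep_snp("A", ["AG"], [(0, 1), (0, 0), (0, 0)]))     # indel
```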

PBS was calculated using scikit-allel [78]. After calculating PBS for every SNP, we averaged the per-SNP values in 50 kb windows sliding in 5 kb steps, keeping only windows with at least 10 SNPs. Scaffolds shorter than 50 kb were removed from further analysis. The plots were generated using matplotlib [79].
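The windowing scheme can be sketched in plain Python as follows. This is an illustrative assumption about how the averaging could be organised, not the scikit-allel code used in the study. The function name `windowed_mean`, the 0-based half-open coordinates and the toy data are hypothetical.

```python
def windowed_mean(positions, values, scaffold_length,
                  size=50_000, step=5_000, min_snps=10):
    """Average per-SNP values in sliding windows along one scaffold.

    positions: 0-based SNP coordinates on the scaffold
    values:    per-SNP statistic (for example PBS), same order as positions
    Returns a list of (start, stop, mean) for windows with enough SNPs.
    Scaffolds shorter than one window yield an empty list.
    """
    if scaffold_length < size:
        return []
    results = []
    start = 0
    while start + size <= scaffold_length:
        stop = start + size
        # Simple linear scan; adequate for a sketch, slow for whole genomes
        in_window = [v for p, v in zip(positions, values) if start <= p < stop]
        if len(in_window) >= min_snps:
            results.append((start, stop, sum(in_window) / len(in_window)))
        start += step
    return results

# Toy scaffold of 60 kb with one SNP every 1 kb, all with value 1.0
toy_positions = list(range(0, 60_000, 1_000))
toy_values = [1.0] * len(toy_positions)
windows = windowed_mean(toy_positions, toy_values, scaffold_length=60_000)
print(len(windows), windows[0])
```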

To infer the population size of the two ecotypes over time, the multiple sequentially Markovian coalescent method was used, as implemented in MSMC2 v2.1.1 [23], for the three largest genomic sequences (scaffolds 1, 5 and 8; 237 Mbp in total). The time segment pattern “-p 1*2+15*1+1*2” was used to reduce overfitting when analysing single scaffolds. Results were scaled using a generation time of eight years and a rate of 1.6 × 10^−8 mutations per generation, as estimated with a phylogenetic analysis incorporating the ancient samples with BEAST v2.6.1 [80].
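The scaling of MSMC2 output into years and effective population size can be sketched as below, using the generation time and mutation rate given in the text. The conversion (time in years = scaled time / μ × generation time; N_e = (1 / λ) / (2μ)) follows the MSMC documentation. Treating the rate as per site per generation, the function name `scale_msmc` and the example input values are assumptions.

```python
MUTATION_RATE = 1.6e-8   # per generation, as given in the text
GENERATION_TIME = 8.0    # years, as given in the text

def scale_msmc(time_boundary, coalescence_rate,
               mu=MUTATION_RATE, gen=GENERATION_TIME):
    """Convert one MSMC2 time segment into real units.

    time_boundary:    scaled time from the MSMC2 output
    coalescence_rate: scaled coalescence rate (lambda) of the segment
    Returns (years_ago, effective_population_size).
    """
    years = time_boundary / mu * gen
    ne = (1.0 / coalescence_rate) / (2.0 * mu)
    return years, ne

# Hypothetical segment values, chosen only to make the arithmetic easy
years, ne = scale_msmc(time_boundary=1.6e-5, coalescence_rate=3125.0)
print(f"{years:.0f} years ago, Ne = {ne:.0f}")
```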

Heterozygosity values were estimated using the software ROHan [81]. Long runs of homozygosity (RoHs) were inferred using non-overlapping 1 Mb windows. To achieve a higher resolution of the homozygous segments, we also assessed the RoHs with BCFtools/RoH [82]. The RoHs were quality-filtered, keeping segments with an average fwd-bwd phred score > 40. The segments were then classified into two length classes: small (< 0.1 Mb) and medium (0.1 to 1.0 Mb). These intervals have been considered representative of inbreeding levels corresponding to identity-by-descent from > 500 and 50 to 500 generations, respectively [83].
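The quality filter and length classification of RoH segments can be expressed as a short Python sketch. The function name `classify_roh` and the example segments are assumptions. Segments longer than 1.0 Mb fall outside the two classes described in the text and are therefore left unclassified here.

```python
def classify_roh(length_bp, quality, min_quality=40.0):
    """Assign a RoH segment to a length class, or None if it is dropped.

    length_bp: segment length in base pairs
    quality:   average fwd-bwd phred score reported for the segment
    """
    if quality <= min_quality:
        return None                      # fails the quality filter
    if length_bp < 100_000:
        return "small"                   # < 0.1 Mb, > 500 generations
    if length_bp <= 1_000_000:
        return "medium"                  # 0.1 to 1.0 Mb, 50 to 500 generations
    return None                          # outside the classes in the text

# Hypothetical segments as (length in bp, average quality)
segments = [(50_000, 60.0), (400_000, 85.0), (400_000, 30.0), (2_000_000, 90.0)]
classes = [classify_roh(length, qual) for length, qual in segments]
print(classes)
```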

To study the effect of the genetic variants of the Svalbard ecotype in relation to the mainland ecotype, we first filtered the VCF files using VCFtools with the parameters “--minQ 30 --minDP 10 --maxDP 80”. Next, we generated a VCF with the 6 Svalbard samples and used a custom script to select only the positions where at least 5 of the 6 samples were homozygous for the alternative allele (genotype = 1/1). In parallel, to discard positions at which the Svalbard samples differed from the reference individual but not from other Norway mainland samples, we generated a VCF file with the 6 mainland Norway samples and selected all positions where at least two individuals carried the alternative allele, either in the homozygous or in the heterozygous state (genotype = 1/1 or genotype = 0/1). The variants present in this latter VCF were subtracted from the Svalbard VCF file. Variant Effect Predictor (VEP) v106.1 [84] was run on the resulting VCF file, which contained Svalbard-specific homozygous variants, together with the genome annotation described above. Only genes with coding variants and with more non-synonymous than synonymous substitutions were selected. OrthoMCL v2.0.9 [85] was used to identify the 1:1 orthologs of all the annotated reindeer genes in the B. taurus reference, since this is the most closely related species with a curated biomaRt database. The biomaRt package [24] was used to map the B. taurus GO terms to orthologs of the selected reindeer genes, and the topGO package [86] was used to run a Fisher's exact test to identify significantly enriched biological processes in this set of genes. Following the recommendation of the topGO authors, we did not correct for multiple testing. Functional evaluation of variant effects was performed with PolyPhen-2 [87].
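The selection of Svalbard-specific homozygous variants is described only as a custom script. The sketch below shows one possible reading of that logic in Python, operating on genotype strings that have already been parsed from the two VCF files. The function names, the site identifiers and the example genotypes are hypothetical.

```python
def is_hom_alt(genotype):
    """True for a homozygous alternative genotype (1/1 or phased 1|1)."""
    return genotype in ("1/1", "1|1")

def carries_alt(genotype):
    """True if at least one allele of the genotype is the alternative."""
    return "1" in genotype.replace("|", "/").split("/")

def svalbard_specific(sites, min_hom_alt=5, min_mainland_carriers=2):
    """Select sites that are (nearly) fixed in Svalbard and rare on the mainland.

    sites: iterable of (site_id, svalbard_genotypes, mainland_genotypes)
    A site is kept when at least `min_hom_alt` Svalbard samples are 1/1 and
    fewer than `min_mainland_carriers` mainland samples carry the ALT allele.
    """
    selected = []
    for site_id, svalbard_gts, mainland_gts in sites:
        n_hom_alt = sum(is_hom_alt(gt) for gt in svalbard_gts)
        n_carriers = sum(carries_alt(gt) for gt in mainland_gts)
        if n_hom_alt >= min_hom_alt and n_carriers < min_mainland_carriers:
            selected.append(site_id)
    return selected

# Hypothetical sites with 6 Svalbard and 6 mainland genotypes each
example_sites = [
    ("scaffold1:100", ["1/1"] * 6, ["0/0"] * 6),                      # kept
    ("scaffold1:200", ["1/1"] * 5 + ["0/1"], ["0/1"] + ["0/0"] * 5),  # kept
    ("scaffold1:300", ["1/1"] * 6, ["0/1", "1/1"] + ["0/0"] * 4),     # shared
    ("scaffold1:400", ["1/1"] * 4 + ["0/1"] * 2, ["0/0"] * 6),        # too few 1/1
]
kept = svalbard_specific(example_sites)
print(kept)
```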

Ethics approval and consent to participate

The Hardangervidda reindeer sample that was used to generate the reference genome was collected from eligible Norwegian hunters with an active license for hunting the respective animal. The other samples from Snøhetta, Finnmark and Hardangervidda were sampled in connection with the wild reindeer hunt coordinated by the Norwegian Wild Reindeer Center in cooperation with hunters with active licenses. Tissue samples of reindeer from Novaya Zemlya and Belyi Island were collected from dead animals via subsistence hunting. The sampling was conducted by authorized managers of these populations, according to the regulations stated by the Ministry of Nature Protection of the Russian Federation. Tissue samples from domestic reindeer on Yamal Peninsula and Popigay were collected from dead animals during industrial slaughter and required no specific permits. Svalbard reindeer jaws were collected from local hunters as part of the mandatory hunting report. The Governor of Svalbard gives permission to local hunters for hunting reindeer. Sample collection from Greenland was approved by the Greenland Institute of Natural Resources (GINR). All methods performed and described are reported in accordance with the ARRIVE guidelines.

Consent for publication

Not applicable.

Availability of data and materials

The genome of the Hardangervidda reindeer is available at the European Nucleotide Archive (ENA) under the study ID PRJEB35834 and accession GCA_902712895. Raw reads are also available under the accession numbers ERR3764966 (S1), ERR3764967 (S2), ERR3764968 (S3) and ERR3764969 (S4). WGS reads of all reindeer ecotypes used in this study are available at NCBI’s Sequence Read Archive (SRA) under the project ID PRJNA613573. Clipped and merged data from the Hamburgian ancient reindeer are available at the European Nucleotide Archive under the project ID PRJEB37436.

The datasets supporting the conclusions of this article are included within the article (and its additional files).

Competing interests

The authors declare no competing interests.

Funding

M. T.-O. acknowledges the German Network for Bioinformatics Infrastructure (de.NBI), for which the author offers genomes-as-a-service, and Kiel Life Science (KLS) Young Scientist Programme for financial support. P. Arnold acknowledges Z3 of CRC877 for access to the CLSM. Ancient DNA work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) through project number 2901391021 SFB 1266 and Germany's Excellence Strategy - EXC 2150. K. S. K. and K. R. acknowledge support from ReiGN “Reindeer Husbandry in a Globalizing North – Resilience, Adaptations and Pathways for Actions” (NordForsk-funded project number 76915). None of the above-mentioned funding bodies influenced or otherwise contributed to the contents presented in this manuscript.

Authors’ contribution

M.T.‑O. supervised sample collection and sequencing, performed bioinformatics analysis, submitted the data and conceived the project. A.F. and T.H.K. conceived and initiated the project. A.F., T.H.K. and M.P.H jointly supervised the project and contributed to sample collection. J.A.A. supervised sample collection. T.H.K. collected the blood of the reindeer used for genome assembly. P.Arnold, B.K.C., W.L. and R.L. provided infrastructure and performed preliminary histology preparation and analysis. J.M. performed functional variant analysis. G.M. analyzed protein structures. M.T.‑O., I.B., E.S.‑C. and M.M. analyzed genomic differentiation. Å.Ø.P. provided samples from Svalbard. Ø.W., Ø.K.A and P.Arnesen collected Norwegian reindeer samples. E.W.B. provided samples from Greenland. B.V.E., J.S. and B.K‑K. provided ancient reindeer samples. B.L.G.T., K.S.K., K.R. and I.M. collected reindeer samples and extracted genomic DNA. A.L. contributed to sample collection. M.T.‑O., M.P.H and E.S.-C. wrote the manuscript. All authors revised and edited the manuscript for critical content and approved of the final version.

Acknowledgements

We want to acknowledge the help from Helge Norheim Sund, Karl Jordal and Lars Bjune during biomaterial collection at Hardangervidda; hunters in Longyearbyen hunting association for help with collecting Svalbard reindeer mandibles; Taras Sipko, David Anderson and Dimitri V. Arzyutov for help collecting samples from Russia; and Stensaas Reinsdyrslakteri and Mikal Jacob Hole for collecting samples from Norway. We also want to acknowledge the technical assistance of Liv Wenche Thorbjørnsen and Tim Steiert during DNA extraction. We thank the Museum of Archaeology within the Foundation Schleswig-Holstein State Museums for access to sampling of the ancient reindeer samples. Linked-read sequencing of the male reindeer reference was performed by the SNP&SEQ platform of the Science for Life Laboratories Uppsala, Sweden. Design of Figs. 1c, 6a and 7 is work of Kari C. Toverud. We finally want to thank Prof. Eigil Reimers and Frank E. Zachos for critical reading of the manuscript.

References

Banfield AWF. A revision of the reindeer and caribou, genus Rangifer. National Museum Of Canada. 1961;77:1–137.
Flagstad Ø, Røed KH. Refugial origins of reindeer (Rangifer tarandus L.) inferred from mitochondrial DNA sequences. Evolution (N Y). 2003;57(3):658–70.
Røed KH, Flagstad O, Nieminen M, Holand O, Dwyer MJ, Røv N, et al. Genetic analyses reveal independent domestication origins of Eurasian reindeer. Proc Biol Sci. 2008 Aug 22;275(1645):1849–55.
Kvie KS, Heggenes J, Anderson DG, Kholodova M V., Sipko T, Mizin I, et al. Colonizing the high arctic: Mitochondrial DNA reveals common origin of Eurasian archipelagic reindeer (Rangifer tarandus). PLoS One. 2016;11(11):1–15.
Gravlund P, Meldgaard M, Pääbo S, Arctander P. Polyphyletic Origin of the Small-Bodied, High-Arctic Subspecies of Tundra Reindeer (Rangifer tarandus). Mol Phylogenet Evol. 1998;10(2):151–9.
Harding LE. Available names for Rangifer (Mammalia, Artiodactyla, Cervidae) species and subspecies. Zookeys [Internet]. 2022 Aug 26;1119:117–51. Available from: https://doi.org/10.3897/zookeys.1119.80233
Hakala AVK, Staaland H, Pulliainen E, Røed KH. Taxonomy and history of arctic island reindeer with special reference to Svalbard reindeer - A preliminary report. Rangifer. 1986;(1):360.
Degerbøl M. The extinct reindeer of East-Greenland: Rangifer tarandus eogroenlandicus, subsp. nov.: compared with reindeer from other Arctic regions. Acta Arctica. 1957;10.
Roby DD, Thing H, Brink KL. History, Status, and Taxonomic Identity of Caribou (Rangifer tarandus) in Northwest Greenland. Arctic. 1984;37(1):23–30.
Landa A, Gravlund P, Cuyler C, Jeremiassen SR. Er rensdyrene på Inglefield Land mest beslægtet med de vestgrønlandske rener eller Peary rener? Pinngortitaleriffik, Grønlands Naturinstitut Teknisk rapport. 2000;33:20.
Wollebaek A. The Spitsbergen reindeer (Rangifer tarandus spetsbergensis). Norske Videnskaps-Akademi i Oslo Resultater av de norske statsunderstøttede Spitsbergenekspeditioner. 1926;1(4).
Allen JA. The influence of physical conditions in the genesis of species. Radical Review. 1877;1:108–140.
Croitor R. Plio-Pleistocene Deer of Western Palearctic: Taxonomy, Systematics, Phylogeny [Internet]. Toderaș I, editor. Institute of Zoology of the Academy of Sciences of Moldova; 2018. Available from: https://hal.science/hal-01737207
Johnsen HK, Blix AS, Jørgensen L, Mercer JB. Vascular basis for regulation of nasal heat exchange in reindeer. Am J Physiol. 1985 Nov;249(5 Pt 2):R617-23.
Flerov KK. Fauna of USSR : mammals : vol. 1/no. 2 Musk Deer and Deer. Moscow and Leningrad, USSR: Academy of Sciences. 1952;1(2):222–247.
Li Z, Lin Z, Ba H, Chen L, Yang Y, Wang K, et al. Draft genome of the Reindeer (Rangifer tarandus). Gigascience. 2017;6(November):1–5.
Poisson W, Prunier J, Carrier A, Gilbert I, Mastromonaco G, Albert V, et al. Chromosome-level assembly of the Rangifer tarandus genome and validation of cervid and bovid evolution insights. BMC Genomics [Internet]. 2023;24(1):142. Available from: https://doi.org/10.1186/s12864-023-09189-5
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva E V., Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015 Oct 1;31(19):3210–2.
Born EW, Laidre K, Wiig Ø. Polar Bear Studies in Baffin Bay (BB) and Kane Basin (KB), 2013. Interim Report of Field Work 2013. 2013;
Taylor RS, Horn RL, Zhang X, Golding GB, Manseau M, Wilson PJ. The Caribou (Rangifer tarandus) Genome. Genes (Basel). 2019;10(7):540.
Anderson DG, Kvie KS, Davydov VN, Røed KH. Maintaining genetic integrity of coexisting wild and domestic populations: Genetic differentiation between wild and domestic Rangifer with long traditions of intentional interbreeding. Ecol Evol. 2017 Jul 26;7(17):6790–802.
Røed KH, Bjørklund I, Olsen BJ. From wild to domestic reindeer – Genetic evidence of a non-native origin of reindeer pastoralism in northern Fennoscandia. J Archaeol Sci Rep. 2018;19:279–86.
Schiffels S, Durbin R. Inferring human population size and separation history from multiple genome sequences. Nat Genet [Internet]. 2014;46(8):919–25. Available from: http://dx.doi.org/10.1038/ng.3015
Durinck S, Spellman PT, Birney E, Huber W. Mapping Identifiers for the Integration of Genomic Datasets with the R/Bioconductor package biomaRt. Nat Protoc. 2009;4(8):1184–91.
Røed KH. Refugial origin and postglacial colonization of holarctic reindeer and caribou. Rangifer. 2005;25(1):19–30.
Cronin MA, Patton JC, Balmysheva N, MacNeil MD. Genetic variation in caribou and reindeer (Rangifer tarandus). Anim Genet. 2003 Feb 1;34(1):33–41.
Dalén L, Fuglei EVA, Hersteinsson P, Kapel CMO, Roth JD, Samelius G, et al. Population history and genetic structure of a circumpolar species : the arctic fox. Biological Journal of the Linnean Society. 2005;84:79–89.
Bruce W, Clarke WE. The mammalia and birds of Franz Joseph Land. Proc Roy Phys Soc Edinbourg. 1899;16:502–21.
Patton H, Hubbard A, Andreassen K, Auriac A, Whitehouse PL, Stroeven AP, et al. Deglaciation of the Eurasian ice sheet complex. Quat Sci Rev. 2017;169:148–72.
Ache BW, Young JM. Olfaction: Diverse Species, Conserved Principles. Neuron [Internet]. 2005;48(3):417–30. Available from: https://www.sciencedirect.com/science/article/pii/S0896627305008949
Hansen BB, Aanes R, Sæther BE. Feeding-crater selection by high-arctic reindeer facing ice-blocked pastures. Can J Zool [Internet]. 2010 Jan 30;88(2):170–7. Available from: https://doi.org/10.1139/Z09-130
Adipietro KA, Mainland JD, Matsunami H. Functional Evolution of Mammalian Odorant Receptors. PLoS Genet [Internet]. 2012;8(7):1–14. Available from: https://doi.org/10.1371/journal.pgen.1002821
Gilad Y, Man O, Pääbo S, Lancet D. Human specific loss of olfactory receptor genes. Proc Natl Acad Sci U S A. 2003 Mar;100(6):3324–7.
Rinker DC, Specian NK, Zhao S, Gibbons JG. Polar bear evolution is marked by rapid changes in gene copy number in response to dietary shift. Proc Natl Acad Sci U S A. 2019 Jul;116(27):13446–51.
Fuentes Fajardo K V, Adams D, Program NCS, Mason CE, Sincan M, Tifft C, et al. Detecting false-positive signals in exome sequencing. Hum Mutat [Internet]. 2012 Apr 1;33(4):609–13. Available from: https://doi.org/10.1002/humu.22033
Sedaghat MH, Shahmardan MM, Norouzi M, Heydari M. Effect of Cilia Beat Frequency on Muco-ciliary Clearance. J Biomed Phys Eng. 2016 Dec 1;6(4):265–78.
Grosse-Onnebrink J, Werner C, Loges NT, Hörmann K, Blum A, Schmidt R, et al. Effect of TH2 cytokines and interferon gamma on beat frequency of human respiratory cilia. Pediatr Res. 2016 May;79(5):731–5.
Yudin NS, Larkin DM, Ignatieva E V. A compendium and functional characterization of mammalian genes involved in adaptation to Arctic or Antarctic environments. BMC Genet. 2017 Dec 28;18(Suppl 1):111.
Focșa IO, Budișteanu M, Bălgrădean M. Clinical and genetic heterogeneity of primary ciliopathies (Review). Int J Mol Med. 2021 Sep;48(3).
Foster JB. Evolution of mammals on islands. Nature. 1964;202:234–5.
Zheng GXY, Lau BT, Schnall-Levin M, Jarosz M, Bell JM, Hindson CM, et al. Haplotyping germline and cancer genomes with high-throughput linked-read sequencing. Nat Biotechnol. 2016;34(3):303–11.
Bratlund B. Hunting strategies in the Late Glacial of northern Europe: A survey of the faunal evidence. Journal World Prehist. 1996;10:1–48.
Bratlund B. A survey of the subsistence and settlement pattern of the Hamburgian Culture in Schleswig-Holstein. Jahrb Röm-Germ Zentralmus. 1994;41:59–93.
Krause-Kyora B, Nutsua M, Boehme L, Pierini F, Pedersen DD, Kornell SC, et al. Ancient DNA study reveals HLA susceptibility locus for leprosy in medieval Europeans. Nat Commun. 2018 May;9(1):1569.
Peltzer A, Jäger G, Herbig A, Seitz A, Kniep C, Krause J, et al. EAGER: efficient ancient genome reconstruction. Genome Biol. 2016;17(1):60.
Weisenfeld NI, Kumar V, Shah P, Church DM, Jaffe DB. Direct determination of diploid genome sequences. Genome Res. 2017;27(5):757–67.
Coombe L, Zhang J, Vandervalk BP, Chu J, Jackman SD, Birol I, et al. ARKS: chromosome-scale scaffolding of human genome drafts with linked read kmers. BMC Bioinformatics. 2018 Dec 20;19(1):234.
Bolger AM, Lohse M, Usadel B. Trimmomatic: A flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Meng G, Li Y, Yang C, Liu S. MitoZ: a toolkit for animal mitochondrial genome assembly, annotation and visualization. Nucleic Acids Res. 2019;47(11):e63.
Wang W, Yan HJ, Chen SY, Li ZZ, Yi J, Niu LL, et al. Data descriptor: The sequence and de novo assembly of hog deer genome. Sci Data. 2019;6:4–11.
Bana NA, Nyiri A, Nagy J, Frank K, Nagy T, Steger V, et al. The Red Deer Cervus elaphus Genome CerEla1.0: Sequencing, Annotating, Genes, Chromosomes. Hereditary Genetics. 2018;07(01):1000191.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009 Jul;25(14):1754–60.
Vezzi F, Narzisi G, Mishra B. Reevaluating Assembly Evaluations with Feature Response Curves: GAGE and Assemblathons. PLoS One. 2012;7(12):1–11.
https://github.com/FelixKrueger/TrimGalore.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nat Biotechnol. 2013;29(7):644–52.
Li H. Minimap2: Pairwise alignment for nucleotide sequences. Bioinformatics. 2018;34(18):3094–100.
Smit AF, Hubley R, Green P. RepeatMasker Open-4.0. 2013-2015 <http://www.repeatmasker.org>.
Bateman A, Martin MJ, O’Donovan C, Magrane M, Alpi E, Antunes R, et al. UniProt: the universal protein knowledgebase. Nucleic Acids Res. 2017 Jan 4;45(D1):D158–69.
Slater GS, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005 Feb;6(1):31.
Kim D, Langmead B, Salzberg SL. HISAT: a fast spliced aligner with low memory requirements. Nat Methods. 2015 Apr 9;12(4):357–60.
Stanke M, Schöffmann O, Morgenstern B, Waack S. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics. 2006 Feb 9;7:62.
Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK, Hannick LI, et al. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Res. 2003;31(19):5654–66.
Jones P, Binns D, Chang HY, Fraser M, Li W, McAnulla C, et al. InterProScan 5: genome-scale protein function classification. Bioinformatics. 2014 May 1;30(9):1236–40.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug;25(16):2078–9.
Broad Institute. Picard. http://broadinstitute.github.io/picard/. 2016;
Quinlan AR, Hall IM. BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26(6):841–2.
Korneliussen TS, Albrechtsen A, Nielsen R. ANGSD: Analysis of Next Generation Sequencing Data. BMC Bioinformatics. 2014;15(1):1–13.
Fumagalli M, Vieira FG, Linderoth T, Nielsen R. ngsTools: methods for population genetics analyses from next-generation sequencing data. Bioinformatics. 2014 May;30(10):1486–7.
R Core Team. R: A Language and Environment for Statistical Computing. Vienna, Austria; 2019.
Garrison E, Marth G. Haplotype-based variant detection from short-read sequencing. 2012;1–9.
Garrison E. Vcflib, a simple C++ library for parsing and manipulating VCF files. https://github.com/vcflib/vcflib. 2016;
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, et al. PLINK: A tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet. 2007;81(3):559–75.
Pickrell JK, Pritchard JK. Inference of Population Splits and Mixtures from Genome-Wide Allele Frequency Data. PLoS Genet. 2012;8(11).
Felsenstein J. PHYLIP (Phylogeny Inference Package) version 3.6. Distributed by the author. Department of Genome Sciences, University of Washington, Seattle. 2005;
Rambaut A. FigTree v1.4. Institute of Evolutionary Biology, University of Edinburgh, Edinburgh. http://tree.bio.ed.ac.uk/software/figtree/. 2018;
Yi X, Liang Y, Huerta-Sanchez E, Jin X, Cuo ZXP, Pool JE, et al. Sequencing of 50 human exomes reveals adaptation to high altitude. Science. 2010 Jul;329(5987):75–8.
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, et al. Twelve years of SAMtools and BCFtools. Gigascience [Internet]. 2021 Feb 1;10(2):giab008. Available from: https://doi.org/10.1093/gigascience/giab008
Miles A, pyup.io bot, R. M, Ralph P, Kelleher J, Pisupati R, et al. cggh/scikit-allel: v1.3.6 [Internet]. Zenodo; 2023. Available from: https://doi.org/10.5281/zenodo.7946569
Hunter JD. Matplotlib: A 2D Graphics Environment. Comput Sci Eng. 2007;9(3):90–5.
Bouckaert R, Heled J, Kühnert D, Vaughan T, Wu CH, Xie D, et al. BEAST 2: A Software Platform for Bayesian Evolutionary Analysis. PLoS Comput Biol. 2014 Apr 10;10(4):e1003537.
Renaud G, Hanghøj K, Korneliussen TS, Willerslev E, Orlando L. Joint Estimates of Heterozygosity and Runs of Homozygosity for Modern and Ancient Samples. Genetics. 2019;212(3):587–614.
Narasimhan V, Danecek P, Scally A, Xue Y, Tyler-Smith C, Durbin R. BCFtools/RoH: A hidden Markov model approach for detecting autozygosity from next-generation sequencing data. Bioinformatics. 2016;32(11):1749–51.
Curik I, Ferenčaković M, Sölkner J. Inbreeding and runs of homozygosity: A possible solution to an old problem. Livest Sci. 2014;166(1):26–34.
McLaren W, Gil L, Hunt SE, Riat HS, Ritchie GRS, Thormann A, et al. The Ensembl Variant Effect Predictor. Genome Biol [Internet]. 2016;17(1):122. Available from: https://doi.org/10.1186/s13059-016-0974-4
Li L, Stoeckert CJ Jr, Roos DS. OrthoMCL: Identification of Ortholog Groups for Eukaryotic Genomes. Genome Res. 2003;13(9):2178–89.
Alexa A, Rahnenfuhrer J. topGO: Enrichment Analysis for Gene Ontology. R package version 2.38.1. 2019;
Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7(4):248–9.
Szklarczyk D, Gable AL, Lyon D, Junge A, Wyder S, Huerta-Cepas J, et al. STRING v11: Protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 2019;47(D1):D607–13.
Bustamante-Marin XM, Yin WN, Sears PR, Werner ME, Brotslaw EJ, Mitchell BJ, et al. Lack of GAS2L2 Causes PCD by Impairing Cilia Orientation and Mucociliary Clearance. The American Journal of Human Genetics [Internet]. 2019;104(2):229–45. Available from: https://www.sciencedirect.com/science/article/pii/S0002929718304609
Dougherty GW, Loges NT, Klinkenbusch JA, Olbrich H, Pennekamp P, Menchen T, et al. DNAH11 Localization in the Proximal Region of Respiratory Cilia Defines Distinct Outer Dynein Arm Complexes. Am J Respir Cell Mol Biol. 2016 Aug;55(2):213–24.
Hornef N, Olbrich H, Horvath J, Zariwala MA, Fliegauf M, Loges NT, et al. DNAH5 mutations are a common cause of primary ciliary dyskinesia with outer dynein arm defects. Am J Respir Crit Care Med. 2006 Jul;174(2):120–6.
Lee L, Campagna DR, Pinkus JL, Mulhern H, Wyatt TA, Sisson JH, et al. Primary ciliary dyskinesia in mice lacking the novel ciliary protein Pcdp1. Mol Cell Biol. 2008 Feb;28(3):949–57.
Teves ME, Zhang Z, Costanzo RM, Henderson SC, Corwin FD, Zweit J, et al. Sperm-associated antigen-17 gene is essential for motile cilia function and neonatal survival. Am J Respir Cell Mol Biol. 2013 Jun;48(6):765–72.
Nechipurenko IV, Olivier-Mason A, Kazatskaya A, Kennedy J, McLachlan IG, Heiman MG, et al. A Conserved Role for Girdin in Basal Body Positioning and Ciliogenesis. Dev Cell [Internet]. 2016;38(5):493–506. Available from: https://www.sciencedirect.com/science/article/pii/S1534580716305020
Spektor A, Tsang WY, Khoo D, Dynlacht BD. Cep97 and CP110 Suppress a Cilia Assembly Program. Cell [Internet]. 2007;130(4):678–90. Available from: https://www.sciencedirect.com/science/article/pii/S0092867407007945
Wang WJ, Tay HG, Soni R, Perumal GS, Goll MG, Macaluso FP, et al. CEP162 is an axoneme-recognition protein promoting ciliary transition zone assembly at the cilia base. Nat Cell Biol [Internet]. 2013;15(6):591–601. Available from: https://doi.org/10.1038/ncb2739
Kodani A, Yu TW, Johnson JR, Jayaraman D, Johnson TL, Al-Gazali L, et al. Centriolar satellites assemble centrosomal microcephaly proteins to recruit CDK2 and promote centriole duplication. Nelson WJ, editor. Elife [Internet]. 2015;4:e07519. Available from: https://doi.org/10.7554/eLife.07519
Kobayashi T, Kim S, Lin YC, Inoue T, Dynlacht BD. The CP110-interacting proteins Talpid3 and Cep290 play overlapping and distinct roles in cilia assembly. Journal of Cell Biology [Internet]. 2014 Jan 13;204(2):215–29. Available from: https://doi.org/10.1083/jcb.201304153
Hamada Y, Tsurumi Y, Nozaki S, Katoh Y, Nakayama K. Interaction of WDR60 intermediate chain with TCTEX1D2 light chain of the dynein-2 complex is crucial for ciliary protein trafficking. Mol Biol Cell [Internet]. 2018 May 9;29(13):1628–39. Available from: https://doi.org/10.1091/mbc.E18-03-0173
Zeng H, Hoover AN, Liu A. PCP effector gene Inturned is an important regulator of cilia formation and embryonic development in mammals. Dev Biol [Internet]. 2010;339(2):418–28. Available from: https://www.sciencedirect.com/science/article/pii/S0012160610000102
Tsai JJ, Hsu WB, Liu JH, Chang CW, Tang TK. CEP120 interacts with C2CD3 and Talpid3 and is required for centriole appendage assembly and ciliogenesis. Sci Rep. 2019 Apr;9(1):6037.
Sharma A, Aher A, Dynes NJ, Frey D, Katrukha EA, Jaussi R, et al. Centriolar CPAP/SAS-4 Imparts Slow Processive Microtubule Growth. Dev Cell [Internet]. 2016;37(4):362–76. Available from: https://www.sciencedirect.com/science/article/pii/S1534580716302441


Editorial decision: Revision requested
10 Apr, 2024
Reviews received at journal
09 Apr, 2024
Reviews received at journal
19 Mar, 2024
Reviewers agreed at journal
18 Mar, 2024
Reviews received at journal
22 Feb, 2024
Reviewers agreed at journal
19 Feb, 2024
Reviewers agreed at journal
15 Feb, 2024
Reviewers invited by journal
25 Nov, 2023
Editor assigned by journal
19 Nov, 2023
Editor invited by journal
17 Nov, 2023
Submission checks completed at journal
17 Nov, 2023
First submitted to journal
16 Nov, 2023
