Deep genome sequencing provides potential novel insights into plateau adaptations in domestication of goats to extreme environments

Background: Environmental conditions and extensive rearing impose a myriad of selective pressures on domesticated animals living at high altitude. These pressures have selected for the characteristics of strong adaptability for life in the alpine pastoral area. While it is accepted that various natural and anthropogenic factors, such as environmental pressures, human migration, and socioeconomic practices, have shaped the genomic structure of goats, little is known regarding the resulting genetic adaptations. The goat (Capra hircus) have become well adapted to certain extreme environments (e.g., plateaus and deserts). Through whole-genome sequencing, it was hypothesized here that the immune system and olfactory perception genetic regulatory pathways may contribute to the defense related adaptations of the goat population. As such, CNGA4 and Camk 2b have been identied as key candidate genes that drive these adaptations during goat domestication, and they may enhance immunity and defenses against external adverse factors. This study provides a new insight into the unique regional defense adaptation of Qaidam cashmere goats during the generations long domestication process, in order to provide a theoretical basis for the adaptation of cashmere goats to a commercial environment. Results: More than 3,000 candidate genes have been found that may have been selected for through both natural and articial processes during domestication. These candidate genes require validation with additional samples in the future. novel immune and olfactory perception genetic regulatory pathways were identied, which may correspond to defensive adaptations to the extreme environments. Conclusion: CNGA4 and Camk 2b may enhance immunity and defense against external adverse factors during domestication.


Background
Goats are one of the most important livestock resources of local breeders that was domesticated approximately 10,000 years ago 1 , The domestication of goats transformed natural reproduction into a controlled process, which has turned out to be a critical development in the history of modern human civilization. Nevertheless, the origins of caprine domestication are controversial 2 . A great diversity of goat breeds has been created through selective breeding, adaptation, isolation, and genetic drift. Domesticated goats retain many of their natural characteristics, and are an important genetic resource. There are numerous local goat breeds in China, which have a strong tness and foraging capability under a wide range of habitats, including the dry, cold, and harsh Qinghai-Tibet Plateau and the warm and humid lands of South China 3 .
Arti cial selection during domestication and production-oriented breeding has greatly shaped the level of genomic variability in goats 4 . High-throughput whole genome sequencing can be used to characterize adaptive genetic variations in harsh or extreme environments 5,6 . Wang et al. detect selection signals in 8 goat breeds by analyzing the sequences of 5 domestic goats and 3 introduced breeds 4 , Benjelloun et al. found a number of candidate genes in which mutations allowed for the adaptation of metabolic pathways or biological processes associated with local environmental conditions by screening 3 indigenous goat breeds 7 . Selective sweep analyses identi ed multiple regions in which genetic variations have been described in duck 8 , pandas 9 , pig 10 , sheep 11 , dog 12 and goat 13 . These populations have evolved numerous adaptations, and several genomic studies have been conducted to identify the genes that drive these adaptations. High-altitude environments exert strong selective pressures, and human and animal populations have evolved in convergent ways to cope with a chronica lack of oxygen 14 . Different environmental adaptations are thought to be the result of favorable mutations and selection pressures.
This provides interesting opportunities to identify the selective signatures associated with adaptation genes. However, little is known about the many diverse genetic mechanisms by which have allowed goats to successfully adapt to extreme environments, such as the olfactory sense and the immune system.
To elucidate the potential genes or variations in signal characteristics and phenotypes of goats adapted to local extreme climatic conditions, whole-genome sequencing was performed on samples from 10 indigenous Chinese Qinghai-Tibetan Plateau goat breeds (altitude >4,000 m above sea level). The resulting data was compared with previously published genome sequences of 9 low altitude goat breeds.
The aim of this study was to develop a basic understanding of the genetic variations associated with the acclimation characteristics of goats that are in uenced by both arti cial and natural selection. Taken together, this information will provide a new perspective for the research on the mechanisms of acclimation and adaptation in Chinese goat breeds.

Ethic statement
This study followed the recommendations of the "Regulations for the Management of Affairs Concerning Experimental Animals" (Ministry of Science and Technology, China, revised in June 2004). The study were approved by the Animal Care and Use Committees of the Northwest Institute of Plateau Biology, Chinese Academy of Sciences. The animals are not harmed during sample collection. The study is in accordance with the ARRIVE Guidelines.
Animals and whole-genome sequencing Ear tissue samples were collected from 10 Qaidam Cashmere goats (CMDG) in the Qaidam area, Haixi prefecture, Qinghai Province (altitude>3,000 m above sea level). For a population sample from a contrasting climate, ear tissue samples were collected from 9 Chengdu Brown goats (CDBG) of the Chengdu plain (altitude<1000 m above sea level). Total genomic DNA was extracted from samples and a minimum of 3 g genomic DNA was used to construct paired-end libraries with an insert size of 500 bp using the Paired-End DNA Sample Prep kit (Illumina Inc., San Diego, CA, USA). These libraries were sequenced using the novaseq6000 (Illumina Inc., San Diego, CA, USA) NGS platform at Gene denovo company (Guangzhou, China).

Short insertions and deletions (InDels) and Structural Variants (SVs)
The Burrows-Wheeler Aligner (BWA) was used to align the clean reads from each sample against the reference genome (https://www.ncbi.nlm.nih.gov/assembly/GCF_001704415.1) with the default parameters. Variant calling was performed for all samples using the GATK's Uni ed Genotyper. Both SNPs and InDels were ltered using GATK's Variant Filtration with proper standards (-Window 4 -lter "QD < 2.0 || FS > 60.0 || MQ < 40.0 " -G_ lter "GQ < 20"). Those exhibiting segregation distortion or sequencing errors were discarded. In order to determine the physical positions of each SNP, the software tool ANNOVAR 15 , was used to align and annotate SNPs or InDels. Structural Variation (SV) types include translocations, inversions and insertion events, which were determined the breakdancer software.
Phylogenetic and population structure analysis Next, a phylogenetic tree was constructed using the neighbor joining method in the PHYLIP software program 16 (version 3.69). The bootstrapped con dence interval was based on 1000 replicates. The population subdivision pattern was preliminarily classi ed in the principal component analysis by the GCTA program 17 . The admixture model-based software Admixture 18 V1.3.0 was further used to estimate population structure (Q matrix). The tested K was set from 1 to 10, and the optimal K was determined as that which resulted in the lowest cross-validation error. The kinship matrix (K matrix) was calculated using the method of Loiselle 19 et al by the software of SPAGeDi 20 V1.5.

Analysis of Nucleotide diversity and Selective Sweep Regions
To ensure accuracy of the analysis, missing SNPs at a rate of greater than 20% were excluded from further analysis. Multiple indices related to population genetics, including Nucleotide Diversity 21 (π) and Fst, were calculated with Pop Genome using the sliding window approach. The window size was set to 100 kb, and the step size was set to 10kb. Selective sweep regions were selected according to the interception of 2 indices, which were Fst and π ratio with a top 5% threshold level. All related graphs were drawn by R scripts. Candidate genes within the sweep regions were extracted for further analysis.

Results And Discussion
Whole-Genome Sequencing and Genetic Variation Genome sequencing yielded a total of 438.46 billion reads of raw paired-end reads with a length of 150bp from 19 goats in the present study. Stringent quality ltering yielded 430.32 billion reads or 64,548 Gb of genome data (approximately 147× effective sequence depth). Over 98% of the total clean reads were mapped to the annotated goat reference genome with a coverage of ~99.66% (Supplementary Table S1 Table S2). Next, the 18.58 million SNPs were divided into 3 categories: 9.86 million SNPs shared between CDMG and CDBG, as well as 8.57 million and 0.15 million SNPs unique to CDMG and CDBG ( Fig.1), respectively. Next, 18.58 million SNPs were measured for genome-wide variation and the frequency spectrum so as to accurately detect genomic footprints left by selection. The numbers of SNPs in the exonic, intronic, and intergenic regions of CDBG were lower than was observed in CDMG. Furthermore, higher genomic diversity was observed in CDMG relative to CDBG (Supplementary Table   S3). These data indicate that the high altitude goat breed underwent intense natural and arti cial selection in extreme weather conditions.
The sequence diversity (π) was estimated to be 1.35 10 -3 in CDBG and 2.05 10 -3 in CDMG, which were slightly lower than the native sheep 22 (2.28 10 -3 ). Moreover, the goats exhibited a nucleotide diversity that was higher than estimates for human populations 23 (1.0 10 -3 ), and lower than that of Chinese cattle breeds 24 (Leiqiong cattle,4.2 10 -3 ). The level of genetic differentiation between them, Fst, was estimated to be 0.03, which is lower than that between Tibetan and Hu sheep 25 (about 0.249). The high proportion of shared SNPs and the low Fst revealed a small genetic divergence between plateau and plain goats. However, the principal component analysis (PCA) was signi cantly separated along the rst principal component between plateau and plain goats (Fig.2), which accounts for the greatest proportion of genetic variability. The neighbor-joining tree also grouped the plateau and plain goats into two distinct groups.

Short insertions and deletions (InDels) and Structural Variants (SVs)
As efforts turn to whole genome sequencing as an approach for identifying adaptive variation, it will become even more critical to detect InDels fully and accurately 26 . Furthermore, InDels and SVs are important types of genetic variations that contribute signi cantly to genomic diversity 27 . A total of 861,995 insertions and 1,114,290 deletions were identi ed. Within all InDels, 1,129,583 were observed in both domestic goat, 388,539 were speci c to CDBG and 741,044 were speci c to CDMG. The length of InDels ranged from 0 to 10 bp, with the large majority consisting of single nucleotide deletions (Fig.3a). Observation of a high number of unique genetic variations suggests a bene t to the animals in protecting genetic diversity.
A total of 31,483 reliable structure variants were detected. Among these, 4,941 were insertions, 16,785 were deletions, 292 were inversions, 7,888 were chromosomal translocations, and 1,577 were tandem duplications. Within these SVs, 1,366 were speci c to CDBG and the other 11,915 were speci c to CDMG (Fig.3b). No signi cant differences were observed in the length of these types of SVs. Apparently, more variation was retained under selection pressure for CDMG in this environment. This phenomenon could be the result of the life-sustaining adaptions to high altitude, low temperature, and low oxygen conditions. The discovery range of InDels overlapped with that of structural variation. Thus, breed-speci c variants could be useful for further studies of breed characterization, while overlapping variants may re ect closer relations between breeds 28 .

Signatures of Positive Selection in goat Genome
Genome variations were compared to both nucleotide diversity and allele frequency in response to natural and arti cial selection. The ratio of the genetic diversity in CDBG versus CDMG goats (πCDBG/πCDMG) across the goat genome along the autosome using the sliding window method was measured. The 5% of windows with a Fst>=0, π ratio (CDBG compared to CDMG=0.775), in comparison to CDBG, CDMG was observed to possess lower levels of polymorphisms, suggesting that these genomic regions were strongly selected for in the extreme environment. These selected regions also exhibited signi cant differences (Tajima.D= -0.313). This result quantitatively re ects that the extreme climatic conditions had a strong impact on the emergence of adaptive genetic changes in highland goats. In addition, although the overall goat genome is similar, two divergent clusters were observed that may be the result of adaptations to the extreme environmental conditions in which they lived (Fig.2). Moreover, the combination of Fst values and π ratios were more reliable indicators of goat genome for the selected genomic regions from the plateau and high-altitude regions. A total of 1277, were identi ed as high leves of differentiation regions (Supplementary Table S4). Using the top 5% of Fst>=0 and π ratio>=0.78 cutoffs for the CDBG/CDMG (Fig.4b). A total of 3184 overlapping selective candidate genes were observed to be shared between the Fst values and π ration values (Fig.4a). The combined analysis of these methods provided further support that the candidate genes were in fact a result directly related to the treatment.

GO and pathway enrichment analysis of selection signals
It is reasonable to expect that speci c genes and biological pathways be involved in adaptation to extreme environments. To gain insights into the molecular functions of the candidate genes in the selected region and to provide clues for further screening validation, a gene ontology enrichment analysis was conducted. The analysis identi ed signi cant enrichment of 1867 speci c candidate genes embedded in the selective regions in 175 GO terms (P < 0.05, Supplementary Table S5). These genes were signi cantly enriched in detection of stimulus involved in sensory perception (GO:0050906), neurological system process (GO:0050877), positive regulation of phospholipase activity (GO:0010518), detection of stimulus (GO:0051606). These pathways re ect the likely adaptations of cashmere goats to the low temperatures on the plateau. Also enriched were cardiac conduction (GO:0061337), pulmonary valve morphogenesis (GO:0003184), positive regulation of potassium ion transmembrane transporter activity (GO:1901018). These pathways are associated with adaptations to activity under the hypoxic environmental conditions associated with life at high altitude. Moreover, some immune activities were also signi cantly enriched. For example, positive regulation of T cell mediated immunity (GO:0002711), regulation of humoral immune response (GO:0002920), this would probably have been the result of a rapid response to new pathogens associated with frequent contact with humans during domestication 29 . Multiple candidate genes involved in production traits. For example, coat color genes included OSTM1 (ncbi_102189757), KITLG (ncbi_100860807), POMC (ncbi_102182514), APC (ncbi_102184724), CDK5 (ncbi_102178040), body size included BMP2K (ncbi_102186290), FOSL2 (ncbi_102171522), IGF2 (ncbi_100861277), BMP1 (ncbi_102179476), ACSL3 (ncbi_102185694), POMC (ncbi_102182514), CKM (ncbi_102170262), MYF5 (ncbi_100861252), cashmere traits included PHF14 (ncbi_102173711), FGF5 (ncbi_102171226), KRT74 (ncbi_102176789). In addition, some speci c selective genes that were expected to be mutated to allow for environmental adaptation were also identi ed. These included genes involved in adaptability to hypoxia, such as GNB1 (ncbi_102187070), HIF1A (ncbi_100861391), RHOG (ncbi_102191156), PPARA (ncbi_100861393), TMTC2 (ncbi_102190760), EPO (ncbi_102180175), TH (ncbi_102179849), PPARA (ncbi_100861393), ACER1 (ncbi_102179093), HSP70.1 (ncbi_100860849). These ndings substantially enriched the catalog of genetic variation and provided new insights into the unique characteristics of Chinese native goat breeds.
Different genes coordinate with each other to perform their biological functions, and pathway-based analysis is helpful to further understand the biological functions of genes. Selected gene were annotated using KEGG pathways. The analysis identi ed 21 genes that were signi cantly enriched (P 0.05) (Fig.5, Supplementary Table S6). Many of these pathways that were identi ed are involved in in ammatory responses in the lung tissue and immune system. For example some of the enriched pathways included Asthma pathway, Pertussis pathway, Antigen processing and presentation pathway. This may be because in the hypoxic environment, lung tissue and pulmonary blood vessels are damaged, so that the pathway of organ in ammatory response is in a state of inhibition, and by activating a large number of immunerelated biological pathways, the in ammatory response caused by blood vessels in the remodeling process is reduced 30 . This is probably a defensive adaptation mechanism of the cardiovascular system to the environment of high altitude hypoxia. The immune regulatory cytokine interleukin 10 (IL-10) is an important regulatory cytokine with an overall immunosuppressive effect, which may represent an evolutionary adaptation that permits reactivation of the IL-10 gene under appropriate physiological or pathological conditions 31 . There are also candidate gene clusters associated with neural and automatic immune responses. Immune related genes are expected to be a main target for selection because they play a key role in immunity and defense, for example interleukin 1A (IL1A), interleukin 1B (IL1B), interleukin 10 (IL10), interleukin 9 (IL9), interleukin 4 (IL4), interleukin 13 (IL13), interleukin 18 (IL18) 32,33,34 . This re ects the effect of natural selection pressure on plateau goats in the harsh habitat of high altitude areas. There may be general patterns of recent evolutionary adaptation of the human immune system to particular regions and that these adaptations can produce differences in disease susceptibility 35 . Taken together, these results suggest that regulatory variations affecting immune-related genes, particularly those associated with viruses, may confer a selective advantage on the populations.
In addition to the environmental challenges posed by low oxygen and high UV radiation, smell plays a key role in the survival of animals through a series of core behaviors, such as sexual selection, alarm signals, foraging, and defensive awareness 36 . As the data presented here indicated, the Olfactory transduction signaling pathway was highly enriched (Q< 0.05) (Fig.5a). The ability of prey to innately recognize the odor of a potential predator provides a strong selective advantage. These neural mechanisms enable chemical surveillance against other species, interpretation of the cues, and initiation of a defensive response 37,38 . Signals that induce fearful behavior, secreted by one individual and received by a second individual of the same species, in which they initiate a speci c reaction, are cues transmitted between species that selectively disadvantage the sender and advantage the recipient 39 . Pheromones are chemical signals between two members of the same species which are known to be found mainly in animal communication. Although these models have not provided insight into the organization of the neural response in the receiving animals 40 , subsets of sensory neurons are thought to be genetically determined and modulate innate behavior.
However, the evolution of olfactory perception during goat domestication has previously received little attention. Olfactory perception acuity is less important for survival of domesticated goats living in the predictable farm environment than those living in the wild. The three genes enriched in the olfactory perception signaling pathway were Cyclic nucleotide gated channel alpha 4(CNGA4), Adenylate cyclase 3 (ADCY3), cGMP-dependent 3',5'-cyclic phosphodiesterase (PDE2A). Heteromultimeric cyclic nucleotidegated (CNG) channels play a central role in the transduction of odorant signals and subsequent adaptation, Cyclic nucleotide gated channel alpha 4 (CNGA4)subunit rapid olfactory adaptation allows an animal to continuously assess changes in the odor environment and the intensity of odor 41 . The CNGA4 subunit in Olfactory sensory neurons (OSNs) was a functionally important mechanism for sensory adaptation of the olfactory system in vivo 42 . This volatile "alarm signal" produces and releases defensive chemicals that play an important role in animals adapting to their natural lives, as originally described by Darwinian natural selection in extreme environments. In this study, the CNGA4 gene revealed a higher level of diversity in CDMG (Fig.6a, 6b). Further screening of this candidate gene for nonsynonymous mutations in exon, which may be representative of putative functional variants, revealed 4 missense SNPs (exon3: c.G178A/p.D60N, exon4: c.T835C/p.F279L, exon4: c.G893A/p.R298Q and exon6: c.C1489A:p.L497M) (Fig.6c) in the domestic goats. These results suggest that variations may be an important adaptation of the olfactory defense system. Moreover, ADCY3 gene encodes the adenylate cyclase involved in olfactory signal transduction. Animals carrying functional knockouts of ADCY3 are unable to smell, as olfactory signaling across the main olfactory epithelium is mediated via receptor stimulation of type 3 adenylyl cyclase 43 . An indispensable role of ADCY3 in postnatal maturation of OSNs and the maintenance of olfactory ciliary ultrastructure in mice through adulthood 44 . The PDE2A gene was expressed in olfactory sensory neurons of olfactory epithelium 45 . As a marker for a unique set of olfactory sensory neurons, PDE2A is now de ned as the "necklace olfactory" 46 . In the present study, 2 remarkable selection signals were detected in the ADCY3 (Fst=0 and π ratio=0.87) and PDE2A (Fst=0 and π ratio=0.79). These observations suggest that these selection signals may result in an enhancement of the olfactory perception defense system.
Interestingly, KEGG enrichment analysis showed that oocyte meiosis pathways were also involved in the adaptations to the environment. Calcium/calmodulin-dependent protein kinaseII (Camk 2b) might be related the hypoxia-induced factors-1 (HIF-1) pathway, or may possibly even function through the HIF-1 signal pathway 47 . The HIF pathway is a known complex hypoxic regulatory system involved in homeostasis, embryogenesis, and development, and can stimulate hundreds of down-regulated genes in response to hypoxia. In addition, the anaphase-promoting complex subunit 13 (ANAPC13) gene in this pathway is speci cally involved in oocyte maturation 48 . In hypoxic environment, through the transmission of multiple collective signals, cellular metabolism may be converted from aerobic to glycolytic metabolism, which is conducive to the maturation of granulosa cells in the process of follicle maturation. This would allow for the cells to adapt their energy requirements using pathways with reduced oxygen requirements in hypoxic environments. Therefore, after a period of Darwinian natural selection, the relevant adaptation mechanism of a species may be the result of interactions between multiple genes. This indicates the complexity of selecting important adaptively mutated genes in signaling regions, and the associated regulatory pathways of these genes.
Some of the enriched signaling pathways were not believed to be related to the adaptations of indigenous goats to domestication in the present study, such as the Ribo avin metabolism pathway, Lipoic acid metabolism pathway, Linoleic acid metabolism pathway of metabolism system and prolactin signaling pathway of endocrine system. However, these pathways are involved in the basic regulation of physiological metabolic and endocrine processes. Therefore, it can be speculated that the selected genes associated with these pathways could be involved in responses resulting from domestication, possibly providing new clues to the functions of such signaling pathways.

Conclusion
To our knowledge, more than 3,000 candidate genes have been found that may have been selected for through both natural and arti cial processes during domestication. These candidate genes require validation with additional samples in the future. In addition, novel immune and olfactory perception genetic regulatory pathways were identi ed, which may correspond to defensive adaptations to the extreme environments in which Chinese native goat breeds were forced to survive during the process of domestication. CNGA4 and Camk 2b may enhance immunity and defense against external adverse factors during domestication. Furthermore, these results shed new light on the eld of evolution and provide a novel basis for the identi cation of new genes affecting goat domestication and evolution.
Authors ' contributions DQ and KZ designed the experiments and revised the manuscript. BZ conducted the RNA-seq and qRT-PCR analyses. DT analyzed the data and wrote the paper. FT conducted WGCNA analysis. SL and BH prepared the DNA samples for SNP identi cation and genotyping and analyzed the data. All authors read and approved the nal manuscript.
Ethics approval and consent to participate All animal experiments were approved by the Animal Care and Use Committees of the Northwest Institute of Plateau Biology, Chinese Academy of Sciences. We obtained written permission from the owner of the sheep farm to take samples.

Consent for publication
Not Applicable.