Genome-wide Association Studies Identify Quantitative Trait Loci Affecting Cattle Temperament

Cattle Here, we performed genome-wide association studies for a series of temperament traits, assessed in an open eld and novel object test, using autosomal SNPs derived from the whole-genome sequence. We identied 37 and 29 genome-wide signicant loci in an open eld and novel object test, respectively. Gene set analysis implicated the neuroactive ligand-receptor interaction pathway. Analysis in tissue specic expressions showed enrichment in the brain. While some candidate genes were involved in psychiatric and neurodegenerative diseases in humans. The rst principal component explained the largest variance in the data of open eld and novel object test, and the most signicant loci were assigned to SORCS3 and SESTD1, respectively.

pre-calving, calving and two weeks post-calving at the time of temperament assessment. In the feeding regime, the experimental animals comprised pen-feeding individuals and free-grazing individuals. The pen-feeding individuals were fed a total mixed ration composed of 65% grass silage and 35% concentrate on a dry matter basis. The free-grazing individuals ate grass in the pasture during summer and autumn (from June to November) every year and were properly fed above-mentioned TMR during winter and spring (from December to May) every year.

Temperament traits assessment
We combined open eld test, novel object test and heart rate variability into a set of experiment procedures ( Figure 1). Our assessment referenced to previous studies (14,21) and performed for each animal in an open eld area. 48 Brahman cattle (9 pen-feeding individuals and 39 free-grazing individuals) and 186 Yunling cattle (58 pen-feeding individuals and 128 free-grazing individuals) were selected in the behavioral experiments. Before the test, the selected individual was encouraged through a single le raceway to a squeeze crush to wear a heart monitor system (Polar V800, Polar Electro, Oy, Finland). Then, the animal walked along a single le raceway into the open eld area for acclimatization within 10 mins, which was referred to as the period of open eld test; subsequently, for the novel object test, a yellow duck doll was placed in the center of the area for 5 mins. The dimensions of the open eld area and the duck doll were shown in S1 Figure. During the test, the animal's behavior was recorded by a video camera. Total time for contact with duck doll calculated by a stopwatch. The R-R data series was transformed into a computer and corrected with default parameters using gHRV (22). Eventually, the HRV data consisted of LHO, UFO Figure 1 and S1 Table. Owing to some objective conditions, such as bad temperament, four Brahman cattle (two pen-feeding individuals and two free-grazing individuals) and 10 Yunling cattle (seven pen-feeding individuals and three free-grazing individuals) did not complete the novel object test.

Phenotypic analysis
We performed a general linear model to estimate the effect of breed on temperament traits with consideration of the effect of feeding regime using R project (23) To investigate the relationship among temperament traits, and between temperament traits and body measurement traits, we calculated the partial Pearson's correlation adjusting for feeding regime and breeds using the ppcor package (24) in R. Principal component analysis, conducted with princomp function of R project, was used to condense several correlated measures into a small number of principal components. The Kaiser-Meyer-Olkin (KMO) test of the measure of sampling adequacy was used to estimate the appropriateness of conducting PCA using REdaS package (25). The KMO test provided a value of 0.72 for the open eld test and 0.66 for the novel object test. Because the correlation matrix with KMO > 0.6 is considered tolerable to explain the correlations between the variables (26), our data therefore appropriate for PCA analysis.

Sample collection and genome sequencing
After completing all behavioral assessments, we encouraged the test individual to the squeeze crush to collect ear tissue, whole blood and body measurement traits. Genomic DNA was extracted from the ear tissues using the standard phenol-chloroform protocol (27). A total of 158 paired-end libraries with an insert size of 350 bp were constructed and sequenced using Illumina NovaSeq (S2 Table). The length of the reads was 150 bp. The sequence data used in this paper was obtained from published papers where detailed information about sampling and sequencing was available (28). The aims of the whole blood and body measurement traits have been reported by previous study (29).

Reads mapping and SNP calling
Firstly, we mapped clean reads to the cattle reference assembly GCF_002263795.1 using BWA-MEM with default parameters (30). Duplicate reads were ltered using the "REMOVE_DUPLICATES=true" option of Picard tools. The average alignment rate and coverage were 99.54% and 5.61×, respectively. The "HaplotypeCaller", "GenotypeGVCFs" and "SelectVariants" argument of Genome analysis toolkit 3.8 (GATK) (31) were used for calling raw SNPs. The argument "VariantFiltration" of the same software was applied to all raw SNPs with following options: QD < 2, FS > 60, MQRankSum < -12.5, ReadPosRankSum < -8.0 and mean sequence depth (for all individuals) < 1/3× and > 3x. In addition, the haplotype-phase inference and missing alleles imputation were produced using Beagle (32) to carry out GWAS further. Based on ~41 M autosomal SNPs, we estimated the eigenvectors using smartPCA of EIGENSOFT v5.0 package (33) to adjust population structure in GWAS.  30 Brahman genomes (5 pen-feeding individuals and 25 free-grazing individuals) and 128 Yunling genomes (29 pen-feeding individuals and 99 freegrazing individuals). For GWAS in the novel object test, there were 29 Brahman genomes (5 pen-feeding individuals and 24 free-grazing individuals) and 122 Yunling genomes (25 pen-feeding individuals and 97 free-grazing individuals). We carried out a GWAS with mixed linear model using genome-wide e cient mixed-model association (GEMMA) software package (34). For two-breed GWAS, the PC1 of autosomal SNPs, feeding regime and breed information were de ned as the xed effect. For within-Yunling-cattle GWAS, the PC1 of autosomal SNPs and feeding regime were de ned as the xed effect. The kinship matrix was de ned as the random effect. Because the sample size of Brahman cattle is smaller, we did not carry out within-Brahman-cattle GWAS.
For multiple testing correction, because the effective number of independent SNPs for GWAS in open eld test and novel object test calculated using PLINK (35) with the parameters (--indep-pairwise 50 5 0.2) were 750,367 and 737545, respectively, the P-value signi cant and suggestive thresholds were approximately 5 × 10-8 (0.05/750,367) and 1 × 10-6 (1/750,367), respectively. In fact, the thresholds were widely used by numerous studies.
Proportion of variance explained (PVE) was de ned as follows (36): Where is effect size for SNP maker, MAF is minor allele frequency for SNP marker, se is standard error of effect size for SNP marker and N is the sample size.
Functional annotation in the GWAS associated loci To reveal important candidate genes, we used the following strategy to narrow down our ndings. Firstly, we estimated the pairwise LD relation between associated SNPs. Borders of the associated loci were de ned as the associated SNPs that are in approximate linkage equilibrium with each other at r 2 > 0.6 supplied by PLINK (35). To obtain important GWAS signals, the associated loci with the number of associated SNPs < 3 or the length < 1000 bp was excluded. Secondly, we also merged the loci in which the signi cant SNPs were associated with two or more temperament traits. In each independent locus, the SNP with the smallest P-value was called the leading SNP. Third, functional annotation of associated SNPs was carried out according to the Bos taurus reference genome (ARS-UCD1.2) in package ANNOVAR (37).

Pathway and QTL enrichment analysis
According to the functional annotation supplied by ANNOVAR, protein-coding genes located within 500 kb on either side of the associated SNPs were de ned as candidate genes. For two-breed GWAS, there were 801 and 666 candidate genes for open eld test and novel object test, respectively, whereas 89 candidate genes were overlapped. For each test, functional enrichment analysis was carried out on the list of candidate genes using the KEGG database supplied by KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/) (38).
Moreover, we performed a chi-squared test using chisq.test function of R (39) to compare the observed and expected number of the independent loci overlapped with QTLs of each traits in Cattle QTL Database (https://www.animalgenome.org/cgi-bin/QTLdb/BT/index) (40).

Tissue expression analysis
A total of 85 RNA-seq experiments in Bos taurus were download from 5 studies of SRA database (PRJEB25677, PRJEB35127, PRJNA251439, PRJNA263600, PRJNA522422). All clean reads for each sample were aligned to the Bos taurus reference genome assembly using STAR (41) and hisat2 (42). Next, the transcript was assembly from alignment reads using Stringtie (43). Transcript FPKM was de ned as the transcript expression levels using Ballgown (44). To identify the brain-speci cally expressed genes, we calculated tissue speci city indices (τ) as described earlier (45). Firstly, we ltered out weekly expressed genes in which the maximum expression level in all tissues was less than 1 FPKM. Second, τ is de ned as follows: Third, we classi ed the genes as tissue-speci cally expressed genes when τ > 0.8.

Variation of temperament traits
To evaluate the difference in temperament, we measured 15 and 14 traits in an open eld and novel object test, respectively, in 48 Brahman cattle and 186 Yunling cattle ( Fig. 1) (S1 Table). The abbreviations and de nitions of traits are shown in Fig. 1. In terms of feeding regime, Brahman cattle consisted of 9 pen-feeding individuals and 39 free-grazing individuals, and Yunling cattle consisted of 58 pen-feeding individuals and 128 free-grazing individuals. Diverse phenotypic variations of these traits are shown in the S3 Table for open eld test and S4 Table for novel object test, respectively. It was observed that the coe cient of variation ranged from 2.89-248.26% for open eld test and from 5.32-1300.90% for novel object test, respectively. The distributions of most temperament traits followed unimodal distribution, except for LAT (S3 Figure). We used a general linear model with consideration of the effect of feeding regime to test the effect of breed on temperament traits. Seven traits in Brahman cattle (FDO, APO, HSO, RMO, PNO, HSN and PNN) were signi cantly higher than those in Yunling cattle (P < 0.05) ( Table 1), supporting previous observations that the docile Bos taurus was easier to handle than the relatively ighty Bos indicus (2,20). Breed differences in temperament traits have also been reported in other studies (1,14). These results suggested that the differences in temperament traits are, at least at the level of the breed, under some genetic control. To clarify the interrelationship of these 29 traits, we calculated partial correlation adjusting for feeding regime and breeds. A total of 256 pairs correlations became signi cant in 406 pairs of traits (0.143-0.988) (S4 Figure). In general, traits within each group mirrored one another and were tightly correlated.   Figure.) (S7 Table), which altogether implicated 801 candidate genes. Gene set enrichment analysis of Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathways revealed that the most signi cant pathway was neuroactive ligand-receptor interaction (corrected P = 2.42 × 10 − 3 ) (S6A Figure.), which included many types of neuroreceptors genes, such as dopamine, serotonin, gamma-aminobutyric acid (GABA), and glutamate receptors. In fact, there were many neuroreceptors genes involved in cattle behavior, such as dopamine receptor D4 (45), glutamate ionotropic receptor AMPA 2 (46) and serotonin receptor 2A (47). Moreover, a previous study has demonstrated that genes interacting with anti-psychotic drugs for schizophrenia characterized by abnormality of emotional expression are overrepresented in the neuroactive ligand-receptor interaction pathway (48). Therefore, we inferred that the neuroactive ligand-receptor interaction pathway is critical to emotion control in cattle.
In order to provide more information about the functional impact of candidate genes on temperament, we utilized 85 RNA-seq experiments containing 18 main classes and 36 subclasses as described in Fig. 2A and 3D from SRA database in Bos taurus to identify brain-speci cally expressed genes (S2 File) ( Fig. 2A).
We performed a series of linkage disequilibrium (LD) analyses for associated SNPs to count the number of LD blocks, leading to 37 signi cantly and 34 suggestively independent loci (S8 table). Among these independent loci, 29 loci were repeatedly observed in at least two traits. In addition, 12 out of 15 temperament traits had at least two associated loci, except for ACT, STE and MHO, which mainly supported a polygenic determinism (2).
In order to establish the relationship between these traits and other quantitative traits, we investigate the enrichment of QTLs derived from Cattle QTL Database in the independent loci found in open led test. Of the 71 independent loci, 13, 14, 26 were overlapped with the QTLs of health, meat and carcass, and production traits, respectively, more than expected by chance (S9 Table). The results suggested that our temperament traits might be related with health, meat and carcass, and production traits.
For PC1, the most signi cant association locus was located in the rst intron of SORCS3 (Fig. 3A, 3B and 3C), a gene encoding sortilin related VPS10 domain-containing receptor 3. Moreover, we also found the SORCS3 locus was associated with other ve temperament traits (APO, HFO, LFO, POO and UFO) in open eld test (S7 Table). In addition, the SORCS3 region was the locus associated with the second largest number of traits in open eld test after the LZTS1 region. SORCS3 is a neuronal receptor whose transcript is expressed at a prominent level in the cerebral cortex (68). In tissue expression pro le, we also observed that the mRNA expression of SORCS3 in the brain was higher than those in other tissues and SORCS3 was a brain-speci cally expressed gene (τ = 0.88). (Fig. 3D). The previous study reported that the emotional-processing areas of the cerebral cortex could modulate the activity of the autonomic nervous system (69). Moreover, the experiment of SORCS3-de cient mice demonstrated that SORCS3 is a postsynaptic modulator of synaptic depression and fear extinction (70). Besides, sortilin de ciency can prevent the age-dependent degeneration of sympathetic neurons (71). In our study, we combined an open eld test and heart rate variability into an experimental procedure exploring the activity of the autonomic nervous system (sympathetic and parasympathetic) in emotional control. We thus concluded that this gene was a strong candidate gene contributing to emotional control in cattle.

Two-breed genome-wide association studies for novel object test
We also carried out a two-breed GWAS of 14 temperament traits and PC1 in the novel object test. We identi ed 154 signi cant SNPs and 847 suggestive SNPs (S5 Figure) (S10 Table), which altogether implicated 666 candidate genes. Gene set enrichment analysis of KEGG pathways revealed that the most signi cant pathway was axon guidance (corrected P = 1.79 × 10 − 4 ) (S6B Figure). Meantime, neuroactive ligand-receptor interaction (corrected P = 0.016) was also detected. Several studies have shown that the neuroactive ligand-receptor interaction pathway plays an important role in cognition-related diseases, such as Parkinson's disease (72) and Alzheimer's disease (73). Therefore, we inferred that neuroactive ligand-receptor interaction pathway is critical to cognitive function in cattle. We also found that 116 candidate genes were brain-speci cally expressed genes, more than expected by chance (χ 2 test, P = 1.450 × 10 − 9 ) (Fig. 4B), suggesting that these candidate genes might work on central nervous system and further contribute to difference in temperament.
After calculating the linkage disequilibrium of associated SNPs to obtain independent loci, a total of 29 signi cant loci and 26 suggestive loci (S11 table) were associated with 13 temperament traits, and there were no associated loci for UFN. Among these independent loci, 11 loci were repeatedly observed in at least two traits. Moreover, 11 out of 13 temperament traits had at least two associated loci, except for HSN, MHN, suggesting that most of the temperament traits were quantitative traits controlled by polygenes. In addition, 14, 12, 22 of the 55 independent loci were overlapped with the QTLs of health, meat and carcass, and production traits, respectively, more than expected by chance (S12 table), which indicates that our temperament traits might be related with health, meat and carcass, and production traits.
After surveying published literature about 20 genes closest to the signi cant loci, a total of six and four genes were found to be associated with neurodegenerative and psychiatric diseases, respectively. Among these genes, the most famous gene is APP (amyloid-β precursor protein), in which a mutation protected against Alzheimer's disease and age-related cognitive decline (74). In our study, APP locus was signi cantly associated with the total time at licking or sni ng the yellow duck doll, suggesting that APP was a strong candidate gene contributing to cattle cognition or recognition. The other ve candidate genes were also involved in neurodegenerative disease in humans, including Alzheimer's disease (NFATC2, CBLN4, CBFA2T3) (75-77), cerebellar ataxia (GRID2) (78), and mental retardation (RNF180) (79). Only four candidate genes were involved in psychiatric disease, including bipolar disorder (SESTD1, NR3C1) (80,81), obsessive-compulsive disorder (SLITRK1) (82), and mood disorder (KCTD12) (83) in humans. From the above literatures, we could hypothesize that genes predisposing to neurodegenerative disease might play crucial roles in cognition or recognition in domestic cattle.
For PC1, the most signi cant association locus was located in the rst intron of SESTD1 (Fig. 4A, 4B and 4C), a gene encoding SEC14 and spectrin domain containing 1. Moreover, we also found SESTD1 locus to be associated with four other temperament traits (APN, HFN, FDN, RMN) in open eld test. In addition, the SESTD1 region was the locus associated with the largest number of traits in the novel object test (S9  table). The SESTD1, a developmentally dynamic synapse protein (84), is involved in lithium response (80). Interestingly, due to the neuroprotective effects of lithium, it has been regarded as a candidate drug in the disease-modi cation of neurodegenerative disorders, such as Alzheimer's disease and amyotrophic lateral sclerosis (85). Moreover, a previous study found that knockdown of SESTD1 reduced dendritic spine density and excitatory synaptic transmission in hippocampal neurons (86). Numerous studies have demonstrated that the hippocampus plays a vital role in cognitive function (87,88). Tissue expression analysis revealed that the mRNA expression of SESTD1 in the brain was higher than those in duodenum, kidney, liver, spleen and muscle tissue (Fig. 4D). In our study, we combined the novel object test and heart rate variability into an experimental procedure detecting performance on cognition test. These results imply that SESTD1 is a strong candidate gene contributing to cattle cognition or recognition. through the two-breeds GWAS. It was to be observed that SORCS3 locus and SESTD1 locus identi ed by two-breeds GWAS were also detected in the within-Yunling-cattle GWAS. Moreover, similar to the ndings from a previous study (89), we also found the length of quantitative trait loci (QTL) identi ed by twobreed GWAS was shorter than that identi ed by within-Yunling-GWAS (t-test, P = 0.03 in open eld test and P = 0.08 in novel object test), indicating that the two-breed approach provided smaller con dence intervals of QTL than within-breed analyses. This could explain by the utilization of the historical recombination events that have occurred in each breed, resulting in less linkage disequilibrium and better resolution (90).

Discussion
Along with the establishment of a correlation between temperament traits and economically important traits (e.g. production traits), cattle breeders lay more emphasis on docility in breeding programs in the future. In this study, open eld test and novel object test were integrated into a set of experimental procedures. It is worth mentioning that we measured heart rate variability during the experimental procedure. Therefore, our experimental procedure could re ect the activity of the autonomous nervous system in emotional control and recognition of a novel object.
Based on temperament traits in open eld and novel object test, we used ~ 13M SNPs from wholegenome sequence data to clarify the relationship between genotype and phenotype through two-breed GWAS, leading to 71 suggestive association loci (including 37 signi cant loci) in open eld test and 55 suggestive association loci (including 29 signi cant loci) in novel object test, respectively. Although larger sample sizes are usually required for GWAS, the variants with more signi cant effects and high frequency could be identi ed using a smaller sample size. GWAS analysis with larger samples will identify additional variants with small effects and low frequency in the future.
Since WGS contain more variants compared with SNP array, our GWAS for temperament traits has a higher resolution and our candidate genes would be more plausible compared with previous studies. Moreover, although further functional experiments would allow us to validate our GWAS loci, most of the associated genes (e.g., neuroreceptors genes, brain-speci cally expressed gene, psychiatric and neurodegenerative risk genes) have biologically plausible links to temperament traits. In addition, some strong candidate genes (e.g., SORCS3, SESTD1 and APP) deserve more speci c studies to con rm their putative role in modulating the emotional expression or cognitive processes.

Conclusions
In conclusion, in the present study, we collected a large number of temperament traits and associated genetic datasets. These results provide a theoretical basis for molecular-marker selection in the breeding and genetic manipulation of cattle temperament improvement to meet the increasing demand for better docility. Further explorations of causal genes underlying temperament will be necessary to perform genomic selection in domestic cattle and precision medicine in humans.

Availability of data and materials
The raw whole genome sequencing data were reported in our previous study (28)