Ethics statement
Ethics approval for all animal experiments was granted by the Institutional Animal Care and Use Committee of Northwest A&F University following the recommendation of the Regulations for the Administration of Affairs Concerning Experimental Animals of China.
Animal population
Individuals used for the detection of temperament traits comprised Brahman cattle and Yunling cattle. Yunling cattle were bred by Yunnan Academy of Grassland and Animal Science and all experimental animals were from its core breeding farm. All individuals were female, multiparous, and not at two weeks pre-calving, calving and two weeks post-calving at the time of temperament assessment. In the feeding regime, the experimental animals comprised pen-feeding individuals and free-grazing individuals. The pen-feeding individuals were fed a total mixed ration composed of 65% grass silage and 35% concentrate on a dry matter basis. The free-grazing individuals ate grass in the pasture during summer and autumn (from June to November) every year and were properly fed above-mentioned TMR during winter and spring (from December to May) every year.
Temperament traits assessment
We combined open field test, novel object test and heart rate variability into a set of experiment procedures (Figure 1). Our assessment referenced to previous studies (14, 21) and performed for each animal in an open field area. 48 Brahman cattle (9 pen-feeding individuals and 39 free-grazing individuals) and 186 Yunling cattle (58 pen-feeding individuals and 128 free-grazing individuals) were selected in the behavioral experiments. Before the test, the selected individual was encouraged through a single file raceway to a squeeze crush to wear a heart monitor system (Polar V800, Polar Electro, Oy, Finland). Then, the animal walked along a single file raceway into the open field area for acclimatization within 10 mins, which was referred to as the period of open field test; subsequently, for the novel object test, a yellow duck doll was placed in the center of the area for 5 mins. The dimensions of the open field area and the duck doll were shown in S1 Figure. During the test, the animal’s behavior was recorded by a video camera. Total time for contact with duck doll calculated by a stopwatch. The R-R data series was transformed into a computer and corrected with default parameters using gHRV (22). Eventually, the HRV data consisted of LHO, UFO, VFO, LFO, HFO, POO, MHO, HSO, PNO, RMO, APO, FDO, LHN, UFN, VFN, LFN, HFN, PON, MHN, HSN, PNN, RMN, APN and FDN. The other five traits consisted of STE, ACT, VOC, LAT and DUR. The full name, definition and significance of the 29 traits were shown in Figure 1 and S1 Table. Owing to some objective conditions, such as bad temperament, four Brahman cattle (two pen-feeding individuals and two free-grazing individuals) and 10 Yunling cattle (seven pen-feeding individuals and three free-grazing individuals) did not complete the novel object test.
Phenotypic analysis
We performed a general linear model to estimate the effect of breed on temperament traits with consideration of the effect of feeding regime using R project (23) To investigate the relationship among temperament traits, and between temperament traits and body measurement traits, we calculated the partial Pearson’s correlation adjusting for feeding regime and breeds using the ppcor package (24) in R. Principal component analysis, conducted with princomp function of R project, was used to condense several correlated measures into a small number of principal components. The Kaiser-Meyer-Olkin (KMO) test of the measure of sampling adequacy was used to estimate the appropriateness of conducting PCA using REdaS package (25). The KMO test provided a value of 0.72 for the open field test and 0.66 for the novel object test. Because the correlation matrix with KMO > 0.6 is considered tolerable to explain the correlations between the variables (26), our data therefore appropriate for PCA analysis.
Sample collection and genome sequencing
After completing all behavioral assessments, we encouraged the test individual to the squeeze crush to collect ear tissue, whole blood and body measurement traits. Genomic DNA was extracted from the ear tissues using the standard phenol-chloroform protocol (27). A total of 158 paired-end libraries with an insert size of 350 bp were constructed and sequenced using Illumina NovaSeq (S2 Table). The length of the reads was 150 bp. The sequence data used in this paper was obtained from published papers where detailed information about sampling and sequencing was available (28). The aims of the whole blood and body measurement traits have been reported by previous study (29).
Reads mapping and SNP calling
Firstly, we mapped clean reads to the cattle reference assembly GCF_002263795.1 using BWA-MEM with default parameters (30). Duplicate reads were filtered using the “REMOVE_DUPLICATES=true” option of Picard tools. The average alignment rate and coverage were 99.54% and 5.61×, respectively. The “HaplotypeCaller”, “GenotypeGVCFs” and “SelectVariants” argument of Genome analysis toolkit 3.8 (GATK) (31) were used for calling raw SNPs. The argument “VariantFiltration” of the same software was applied to all raw SNPs with following options: QD < 2, FS > 60, MQRankSum < -12.5, ReadPosRankSum < -8.0 and mean sequence depth (for all individuals) < 1/3× and > 3x. In addition, the haplotype-phase inference and missing alleles imputation were produced using Beagle (32) to carry out GWAS further. Based on ~41 M autosomal SNPs, we estimated the eigenvectors using smartPCA of EIGENSOFT v5.0 package (33) to adjust population structure in GWAS. The principal component 1 based on the genotype matrix could separate Brahman cattle from Yunling cattle (S2 Fig), which has been presented by a previous study (28).
GWAS analysis
A total of 13,006,271 SNPs and 12,948,424 SNPs were used in GWAS for 15 traits in open field test and 14 traits in novel object test, respectively. Meanwhile, the PC1 of each test was also acted as a phenotype for GWAS. For GWAS in the open field test, there were 30 Brahman genomes (5 pen-feeding individuals and 25 free-grazing individuals) and 128 Yunling genomes (29 pen-feeding individuals and 99 free-grazing individuals). For GWAS in the novel object test, there were 29 Brahman genomes (5 pen-feeding individuals and 24 free-grazing individuals) and 122 Yunling genomes (25 pen-feeding individuals and 97 free-grazing individuals). We carried out a GWAS with mixed linear model using genome-wide efficient mixed-model association (GEMMA) software package (34). For two-breed GWAS, the PC1 of autosomal SNPs, feeding regime and breed information were defined as the fixed effect. For within-Yunling-cattle GWAS, the PC1 of autosomal SNPs and feeding regime were defined as the fixed effect. The kinship matrix was defined as the random effect. Because the sample size of Brahman cattle is smaller, we did not carry out within-Brahman-cattle GWAS.
For multiple testing correction, because the effective number of independent SNPs for GWAS in open field test and novel object test calculated using PLINK (35) with the parameters (--indep-pairwise 50 5 0.2) were 750,367 and 737545, respectively, the P-value significant and suggestive thresholds were approximately 5 × 10-8 (0.05/750,367) and 1 × 10-6 (1/750,367), respectively. In fact, the thresholds were widely used by numerous studies.
Proportion of variance explained (PVE) was defined as follows (36):
Where is effect size for SNP maker, MAF is minor allele frequency for SNP marker, se is standard error of effect size for SNP marker and N is the sample size.
Functional annotation in the GWAS associated loci
To reveal important candidate genes, we used the following strategy to narrow down our findings. Firstly, we estimated the pairwise LD relation between associated SNPs. Borders of the associated loci were defined as the associated SNPs that are in approximate linkage equilibrium with each other at r2 > 0.6 supplied by PLINK (35). To obtain important GWAS signals, the associated loci with the number of associated SNPs < 3 or the length < 1000 bp was excluded. Secondly, we also merged the loci in which the significant SNPs were associated with two or more temperament traits. In each independent locus, the SNP with the smallest P-value was called the leading SNP. Third, functional annotation of associated SNPs was carried out according to the Bos taurus reference genome (ARS-UCD1.2) in package ANNOVAR (37).
Pathway and QTL enrichment analysis
According to the functional annotation supplied by ANNOVAR, protein-coding genes located within 500 kb on either side of the associated SNPs were defined as candidate genes. For two-breed GWAS, there were 801 and 666 candidate genes for open field test and novel object test, respectively, whereas 89 candidate genes were overlapped. For each test, functional enrichment analysis was carried out on the list of candidate genes using the KEGG database supplied by KOBAS 3.0 (http://kobas.cbi.pku.edu.cn/) (38). Moreover, we performed a chi-squared test using chisq.test function of R (39) to compare the observed and expected number of the independent loci overlapped with QTLs of each traits in Cattle QTL Database (https://www.animalgenome.org/cgi-bin/QTLdb/BT/index) (40).
Tissue expression analysis
A total of 85 RNA-seq experiments in Bos taurus were download from 5 studies of SRA database (PRJEB25677, PRJEB35127, PRJNA251439, PRJNA263600, PRJNA522422). All clean reads for each sample were aligned to the Bos taurus reference genome assembly using STAR (41) and hisat2 (42). Next, the transcript was assembly from alignment reads using Stringtie (43). Transcript FPKM was defined as the transcript expression levels using Ballgown (44). To identify the brain-specifically expressed genes, we calculated tissue specificity indices (τ) as described earlier (45). Firstly, we filtered out weekly expressed genes in which the maximum expression level in all tissues was less than 1 FPKM. Second, τ is defined as follows:
Third, we classified the genes as tissue-specifically expressed genes when τ > 0.8.