Animals and phenotypic measurements
A total of 722 animals from 2 farms (298 animals from farm A, 424 animals from farm B) were scored for udder traits in February and March 2020. The New Zealand goat herd is a mixed breed population comprised of approximately 85% Saanen, 5% Toggenburg, 5% Nubian/Anglo-Nubian, 4.5% Alpine and 0.5% Sable (17). The udder traits measured included udder depth (UD), fore udder attachment (FUA), rear udder attachment (RUA), udder furrow (UF), teat placement (TP) and teat angle (TA). Udder traits were scored on a point system from 1 to 9 described in Table 1 with definition of traits as used in McLaren et al. (18). Additionally, the following milk production traits, milk volume, fat yield and protein yield were measured for a subset of 336 animals out of the 722-animal cohort (141 animals from farm A, 195 animals from farm B) as part of standard herd testing procedures using Fourier transform infrared spectroscopy. Milk production traits were modelled by extrapolation to a 305-day lactation length period, restricted to animals with lactation length >200 days.
Table 1 Goat udder traits definitions
Trait
|
Description
|
Scoring system
|
Desired scores
|
Udder depth
|
Depth of udder measured relative to the hocks according to age
|
1 (deep) - 9 (shallow)
|
4,5,6
|
Rear udder attachment
|
Width at the top of the milk secreting tissue
|
1 (narrow) - 9 (wide)
|
9
|
Fore udder attachment
|
Attachment of udder to body wall
|
1 (weak) - 9 (strong)
|
9
|
Udder furrow
|
Strength and definition of medial suspensory ligament of udder
|
1 (weak) - 9 (strong)
|
7,8
|
Teat placement
|
Location of where teat attaches to udder
|
1 (wide) - 9 (close)
|
6,7,8
|
Teat angle
|
Teat angle from side view
|
1 (poor) - 9 (good)
|
8,9
|
Genotypes
The 722 phenotyped animals were genotyped with the Illumina Caprine 50K BeadChip with 54,241 SNP markers (Illumina Inc, San Diego, CA). Markers with ambiguous or missing genotypes in more than 5% of goats, or with minor allele frequency <0.01 were omitted. After filtering, 51,477 markers were retained.
Genome-wide association analysis
Genome wide association mapping with an applied mixed linear model accounting for differences in population structure was performed using genome-wide complex trait analysis tool (GCTA) v1.93.2 (19, 20). The following mixed linear model was used in our study:
y is the phenotype of interest, a is the mean term, b is the fixed (additive) effect of the SNP, x is the SNP genotype, g – is the effect of population structure estimated by calculating genetic relationship matrices (GRM) between all animals (20). mixed linear model association analysis, leave one chromosome out (MLMA-LOCO) approach was used where the chromosome on which the significant candidate SNP is located are excluded from the GRM calculations to avoid double fitting of candidate variants. e is the residual effect.
A SNP was considered significantly associated to a trait of interest at the whole genome level if their -log10 p-values exceeded -log10(5x10-8). SNP marker trait association plots against the ARS1 goat reference were constructed using the ggplot R package (21).
Calculation of allelic effect sizes for marker 19:26,610,610
Allelic effect sizes for marker 19:266,610,610 were calculated for production and udder traits in the 722 animals Genome Wide Association Study (GWAS) population using a linear regression model adjusted for age and farm origin. A one-way analysis of variance (ANOVA) test between the means (α = 0.05) was performed to characterise allelic effect differences between 19:26,610,610 genotype groups presented in figure 2. The Shapiro-Wilk normality assumption test was performed on all data presented. Multiple pairwise comparisons adjusted for false discovery rate were utilised to highlight statistically significant differences in the data presented.
Whole genome sequencing for variant identification
Whole genome sequencing was performed on 302 goats (including 48 animals included in the cohort used for GWAS) on the Illumina HiSeq XTen platform (Illumina Inc., San Diego, CA). For each animal, 700-1000 million paired end reads with read pair lengths of 150 nucleotides were generated. Read sequence quality was assessed using FastQC (FastQC v0.11.7). Reads were mapped to the ARS1 goat reference genome assembly using Burrows Wheeler aligner (22). Duplicate read pairs were marked using picard MarkDuplicates (Picard v2.1.0., https://broadinstitute.github.io/picard/). Reads aligned around small insertions/deletions were realigned using the Genome Analysis Toolkit (GATK v 3.8, RealignerTargetCreator and IndelRealigner tools) (23, 24). Probabilities for single nucleotide and indel variants were discovered individually in each alignment using GATK’s HaplotypeCaller tool (25) and genotyped jointly across the entire cohort with GATK’s GenotypeGVCFs tool. Pairwise linkage disequilibrium R2 values between variants were calculated using PLINK software. The Ensembl Variant Effect Predictor (VEP v90.3) software (https://www.ensembl.org/vep) was used to annotate variants with their predicted effect on protein sequence.