Evaluation of GBLUP and Bayes-Alphabet Based on Different Marker Density For Genomic Prediction in Alpine Merino Sheep

The marker density, the heritability level of trait and the statistical models adopted are critical to the 32 accuracy of genomic prediction (GP) or genomic selection (GS). The studies on the impact of the 33 above factors on accuracy of GP are usually focused on the comparison and discussion of simulated 34 datasets. If the potential of GS is to be fully utilized to optimize the effect of breeding and selection, 35 it is essential to incorporate these factors into real data for understanding their impact on GP 36 accuracy, more clearly and intuitively. Herein, we studied the genomic prediction of six wool traits 37 of sheep by two different models, including genomic best linear unbiased prediction (GBLUP), and 38 Bayes-Alphabet. We adopted 5-fold cross-validation to perform the accuracy evaluation based on 39 the genotyping data of Alpine Merino sheep (n=821). models for GP which would assist in further exploring the optimization of GP. been applied to the domesticated Alpine Merino sheep populations. We have observed that for traits with low heritability (SS and FER), increasing the density of markers could improves the GP accuracy, but it has little impact on traits with high heritability (SL), and even decreases the accuracy (FD). The accuracy of the GBLUP model is generally higher than that of the Bayes-Alphabet model for SS and FER, while with the improvement of heritability, the advantage of GBLUP is no longer Therefore, from this study, we conclude that GBLUP is more suitable for traits with lower heritability (FER and FBS), and Bayes-alphabet, especially BayesB and Bayesion LASSO, have better GP effects for traits with high heritability (FD and SL), different GP models are applicable to

models for GP which would assist in further exploring the optimization of GP. 53 Keywords: Genomic prediction; Alpine Merino sheep; Wool traits; GBLUP; Bayes-Alphabet; 54 Marker density 55 Background 56 The advancement in the field of quantitative genetics and molecular biology has improved the 57

Statistics and processing of phenotypic data 120
A total of 6 wool traits were collected and the descriptive statistics of individual wool phenotype 121 data was presented in Table 1, including the abbreviation of each trait, the corresponding standard 122 error (S.E), the average value (represented by mean ± S.D), and the number of individuals that were 123 effectively recorded (Numbers). For the wool traits, the standard deviation (S.D) ranged from 2.11 124 (FD) to 13.16 (SL), and the standard error (SE) ranged from 0.07 (FD) to 0.46 (SL). 125 The polygenic heritability and the GP accuracy 126 The phenotypic variance and the additive variance of the 6 wool traits based on L-and H-datasets 127 were estimated to calculate polygenic heritability (h 2 ). For L-datasets, heritability ranged from 0.37 128 (FER) to 0.70 (SL); and for H-datasets, heritability ranged from 0.29 (FER) to 0.68 (SL). The 129 estimated results of heritability (expressed as the proportion of additive variance in phenotypic 130 variance) shown in Table 3, states that SL was the highest and the FER was the lowest irrespective 131 of the L-or H-datasets. Moreover, the heritability estimated by L-datasets was slightly higher than 132 that of H-datasets for these 6 wool traits. 133 The GP accuracy was calculated using 5 methods based on two marker density datasets (Table 4). 134 For L-datasets, the GP accuracy of SL was the highest (0.59 for Bayesion LASSO model); and the 135 GP accuracy of FER was the lowest (0.28 for BayesA model). Correspondingly, for H-datasets, the 136 trait with the highest GP accuracy was also SL (0.58 for BayesA, BayesB, and Bayesion LASSO 137 model), and the trait with the lowest GP accuracy was FER (0.31 for BayesA model). 138

Genomic information and individual relationship matrix 140
The analyses involved in this study are all based on genomic information obtained from genotyping through microarrays, GP has replaced the traditional phenotype and pedigree information with the 142 dense markers, providing a new method to estimate genetic variance, which improves the accuracy 143 of prediction and selection [31]. Genomic information is not only suitable for a population with 144 pedigree information, but can also be applied to populations without pedigree information or 145 incorrect, incomplete and even missing genealogical records [32,33]. In the GBLUP model, the 146 traditional individual relationship matrix constructed by pedigree was replaced by the genome 147 matrix , which represents the relationship between individuals more accurately, as it is based on 148 a dense genome-wide marker. More importantly, this may capture the genetic connections from 149 unknown common ancestors, because it represents confirmed gene sharing, and has advantages over 150 presumed or conceptualized ancestral sharing [4]. In GBLUP model, it was assumed that each SNP 151 has an effect, and the cumulative effect of SNPs obey a normal distribution [34], the assumption 152 might only be applicable to certain specific groups or traits. According to the hypothesis of Habier 153 et al., for some traits, only a few markers have a larger effect, while most markers have little or no 154 effect [23,35]. Therefore, GBLUP may not be suitable for such trait, in other words, the GP accuracy 155 of GBLUP will be lower than other models, like the FD trait in current study, the GP accuracy (0.56 156 based on L-datasets) of the Bayesion LASSO model was higher than that (0.52 based on L-datasets) 157 of the GBLUP model. From the above results, GBLUP may not be applicable to FD traits and its 158 predictive ability may not achieve satisfactory results. Hence, it is necessary to adopt different GP 159 models. In the Bayes-Alphabet method, models such as BayesB and BayesCπ assume that most of 160 the SNPs in the genome are located in regions without quantitative trait locus (QTL) and have no 161 effect [24]. while a small number of other SNPs existed in linkage disequilibrium (LD) together 162 with QTL, and accounts for most of the effect [34,36]. According to reports, different Bayes-Alphabet methods put forward a variety of prior hypotheses on the distribution of SNP effects (Table  164 2) [34]. In the current study, in addition to the GBLUP method, 4 typical Bayes-Alphabet methods 165 (BayesA, BayesB, BayesCπ and Bayesion LASSO) were also used to compare the GP accuracy of 166 the 6 wool traits. 167 In most cases, GP suffers limitations while adopting the high-density or low-density SNP genomic 168 information, i.e., the number of marker effects that need to be estimated is often greater than the 169 number of individuals to be recorded. In this study, both the L-and the H-datasets showed that the 170 number (35,379 and 460,656) of markers was much larger than the number (821) of individuals. 171 Although many advanced statistical methods [37,38] have been proposed to overcome this 172 challenge, the true distribution of QTL and SNP effects were unclear for many quantitative traits 173 [34]. Moreover, in contrast to L-datasets, the H-datasets microarrays contain more genomic 174 information, but it also involves more complex matrices and larger computation, which will 175 undoubtedly increase the cost of time and economy [36]. 176

Phenotypic statistics and estimation of heritability 177
In the current study, the collected phenotypic statistics of wool traits were compared with the results 178

GP results and accuracy of prediction 203
If breeding scientists are to effectively apply genomic selection in their breeding programs, they 204 need to have a full understanding of the factors that affect the accuracy of the dataset predictions. 205 For effective application of GS and GP on sheep breeding programs, there should be a thorough 206 understanding of the factors affecting the accuracy of the dataset predictions. We collected 821 207 samples from the breeding program to investigate the influence and interaction of marker density 208 and GP on the accuracy of prediction. Previous studies suggested that the density of markers has an 209 essential impact on the accuracy of GP [44,45]. Solberg and his collaborators (2008) adopted the 210 simulated data to analyze the correlation between accuracy and marker density, their results showed 211 that increasing the density of SNPs from 1 to 8 per centimorgan (cM) could improve the accuracy 212 of GP by 25 % [46]. but this did not mean that the accuracy could always improve with the increase 213 of marker density, in other words, there is a limit to this improvement. Heffner et al. (2011) 214 conducted a study using a wheat dataset and showed that with the increased density from 192 to 215 1,158 markers, the accuracy of GP could be improved by 10 %. However, when the marker density 216 increased from 192 to 384,it caused only a small increase in accuracy [47]. Most of the 10 % 217 improvement mentioned above occurred in the interval from 192 to 384 markers, and the increase 218 of the remaining markers did not significantly affect the accuracy. These results indicate that marker 219 density has a positive effect on the accuracy of GP, while the response of accuracy to density will 220 eventually stabilize [48]. 221 Herein, we adopted the genome datasets based on the level of 50K and 630K microarray, 222 respectively. Table 1 shows that with the marker density increases, the improved accuracy of GP for 223 most traits, especially in FBS and FER, model Bayesion LASSO and BayesA increased by 12 and 224 11 %, respectively, while in other traits the accuracy was not significantly improved, such as CFWR 225 and FD_CV, the accuracy of GBLUP and BayesB increased only by 1 %; FBS and FER benefited 226 more from the increase in marker density than other traits, which could be explained by the fact that 227 quantitative genetic characteristics require more markers to accurately estimate their many small 228 effects of QTL [49]. Interestingly, there are exceptions in this study, for some traits, the accuracy may even decrease: in FD trait, the accuracy of BayesA and Bayesion LASSO models were reduced 230 by 3 % and 5 %, respectively. Two reasons that may explain why increasing number of markers on 231 each chromosome led to a decrease in GP accuracy. Firstly, the number of markers in the microarray 232 is much larger than the number of samples, which may be due to excessively high density of markers 233 leading to the model overfitting [50]. Secondly, the increases in the number of markers will lead to 234 the addition of more unknown variables (marker effects) and a lack of accurate estimation. The is closely related to the trait itself. For traits with low heritability levels (FER and FBS), a small part 239 of the phenotypic variation was explained by additive effects [52], and the increase of marker density 240 may improve the accuracy of GP more obviously; correspondingly, for those traits with high 241 heritability levels (CFWR and FD), increasing the marker density has little benefit on the GP 242 accuracy, sometimes even has a negative impact on accuracy. 243 Among the 6 wool traits studied here, SL and FD_CV had the highest heritability (h 2 =0.53 and 244 h 2 =0.58, respectively), and their corresponding accuracy of GP was also the highest, which ranged 245 from 0.53 to 0.60 and 0.45 to 0.55, respectively. While for two traits with the lowest heritability, 246 FBS (h 2 =0.33) and FER (h 2 =0.28), the accuracy was 0.29 to 0.38 and 0.28 to 0.36, respectively, 247 which was lower than SL and FD_CV. For those traits with lower heritability, the correlation 248 between phenotypic value and genetic value will be lower, the effect value of markers distributed 249 across the genome may be estimated with lower accuracy [23], it suggested that higher heritability 250 has a positive effect on the accuracy of GP. Bolormaa et al. (2013) also reported that the prediction of the trait with the highest heritability was more accurate [53], and also several studies have shown 252 that the accuracy of GP increases with the improved heritability [54,55], the results of the current 253 study agreed with them. In addition, we found that for traits with low heritability, GBLUP had a 254 better prediction effect, whether it is adopting L-or H-datasets, but with the increase of heritability, 255 the advantage of GBLUP is not obvious. From Table 4, it could be observed that for the trait SL 256 with high heritability, the estimation accuracy of BayesB (0.58-0.60) and Bayesion LASSO (0.58-257 0.59) models performed better, this may indicate that for some traits with high heritability, BayesB 258 and Bayesion LASSO assumes more reasonable distribution in marker effect, which leads to higher 259 prediction accuracy. Similar results were obtained in the study of Honarvar and his coworkers, based 260 on the simulation data of three different levels of heritability, they compared the accuracy of the 261 RRBLUP and bayesion-LASSO models, and the results showed that the GP accuracy of the 262 bayesion-LASSO model is higher than that of the RRBLUP model for these traits, but the former 263 has a more obvious advantage in traits with high heritability [56], and it should be noted that GBLUP 264 was equivalent to RRBLUP. In addition, the accuracy of GP was also related to the size and structure 265 of the reference group [57,58]. We will collect and organize a larger dataset in future and try to take 266 the above factors into consideration in subsequent studies for better conclusive results. 267

Conclusions 268
To summarize, this study was based on two different densities of microarray genotyping data, using 269 Bayes-Alphabet (including BayesA, BayesB, BayesCπ, Bayesion LASSO) and GBLUP model to 270 perform the GP. The heritability of 6 wool traits of Alpine Merino sheep was estimated, and the 271 accuracy of the BVs prediction of these traits under different conditions was evaluated through five-272 fold cross-validation. To the best of our knowledge, this was the first study of optimization of GP 273 has been applied to the domesticated Alpine Merino sheep populations. We have observed that for 274 traits with low heritability (SS and FER), increasing the density of markers could improves the GP 275 accuracy, but it has little impact on traits with high heritability (SL), and even decreases the accuracy 276 (FD). The accuracy of the GBLUP model is generally higher than that of the Bayes-Alphabet model 277 for SS and FER, while with the improvement of heritability, the advantage of GBLUP is no longer 278 obvious. Therefore, from this study, we conclude that GBLUP is more suitable for traits with lower  Table 1. 305

Genotypic data and quality control 306
The customized Affymetrix HD 630K microarray was employed as the datasets for the genotype of 307 high-density SNP genotypes (H-datasets) for the Alpine Merino sheep. The genotyping platform for 308 analysis was based on the array plate processing workflow of GeneTitan system (Santa Clara, 309 California, USA) from Thermo Fisher (Affymetrix). The sites in the Illumina Ovine SNP 50K 310 microarray were screened out from the Affymetrix HD 630K microarray and used as the datasets of 311 low-density SNP genotypes (L-datasets). The H-and L-datasets were pre-processed using PLINK 312 v1.9b4 software prior to the statistical analysis and variance component estimation [60]. The SNPs 313 were eliminated with call rate (geno) below 95 %, minor allele frequency (MAF) below 0.01, which 314 seriously deviated from the Hardy Weinberg Equilibrium with a P-value below 10 -6 . Here, the X, Y 315 chromosomes and mitochondrial chromosomes were excluded from the analysis. In addition, Beagle 316 software (version number; 12Jul19.0df) was used to impute the missing SNPs [61]. After quality control and impute, a total of 821 individuals with 460,656 autosomal SNPs were retained for H-318 datasets, and a total of 821 individuals with 35,379 autosomal SNPs for L-datasets. 319

Statistical methods for GP 320
We explored the application of SNP datasets of different densities in genome evaluation and further 321 compared the accuracy of GP adopting 5 different models, including Bayes-Alphabet (BayesA, 322 BayesB, BayesCπ, Bayesian LASSO) and GBLUP. Six wool traits from 821 samples were used to 323 first, estimate the variance of each component, including the additive and residual variance; second, 324 five different models were adopted to perform GP, and its accuracy was compared via 5-fold cross-325 validation, and all these models were evaluated in SNP datasets of H-and L-datasets. Replicate 326 measurements were not available for the individuals so that the effects of permanent environmental 327 were not modeled. The samples involved were from different herds and genders. These factors 328 altered the phenotype in a fixed pattern, and hence the system environmental effects were added to 329 the framework. 330 The statistical methods of Bayes-Alphabet involved can be written as: 331 Here, represents the corrected phenotypic value of individuals, refers to a fixed term, and 333 contains a vector of 3 effects, including herds, genders, and mean of population.
represents 334 the genotype of individual at site , and represents the effect value of site , and therefore 335 ∑ refers to the BV corresponding to individual , to the vector of residual effects. 336 According to the method from Meuwissen et al. and Habier et al [2,23], adopted the R package  Table 2. 340 The methods of GBLUP involved in the current study confirms to a linear model. 341 In Bayes-Alphabet model, in equation 2, , , and represent the same parameters as those 343

Accuracy of GP by K-fold cross-validation 353
Five-fold cross-validation was performed to compare the accuracy of different methods of GP. 354 During K-fold cross validation, the population should be divided randomly [34]. The datasets 355 consisting of 821 individuals were divided into five approximately equally-sized subgroups (each 356 subgroup contained around 165 individuals). For 5-fold cross-validation, four subgroups which 357 retain the phenotype and genotype, were regarded as training population (reference population) to 358 estimate the parameters. The remaining subgroup i.e., candidate population was used to verify the 359 samples, and correspondingly, the phenotype of this group of samples was set as missing (Not 360 applicable, NA).

Funding 384
This research: including experimental design, sample collection, data analysis, and manuscript 385

Availability of data and materials 390
All analysis results data generated during this study are included in this manuscript. Requests for 391 the raw data should be made to the corresponding authors. 392

Ethics approval and consent to participate 393
All animal work carried out in the current study was performed per the guidelines for the care and

Consent to publish 399
Not applicable 400

Competing interests 401
The authors declare that they have no competing interests.   Tables 578 Table 1. Descriptive statistics of phenotypic values of traits. 1 S.E, standard error; 2 S.D, standard 579 deviation. 580 Table 2. Different GS methods and effects distribution. 581 Table 3. Estimates of additive and residual components of variance obtained adopting 'BGLR' for 582 different datasets. a CFWR: clean fleece weight rate; SS: staple strength; FER: fleece extension 583 rate; FD: mean fiber diameter; FD_CV: Coefficient of variation of FD; SL: staple length; b 584 Polygenic heritability, the proportion of the additive effect variance to the total phenotypic 585 variance. 586 Table 4. Comparison of prediction accuracies of 6 traits based on 2 datasets via 5 models. a 587 Abbreviations of traits explained in Table 3 Comparison of GP accuracy based on different density genotype datasets. The six traits were clean eece weight rate (CFWR); staple strength (SS); eece extension rate (FER); mean ber diameter (FD); Coe cient of variation of FD (FD_CV); staple length (SL).

Figure 2
Based on genotype datasets of different densities, the GP accuracy of 5 models in different heritability level. On the left is the result for the H-datasets, and on the right is the result for the L-datasets. The six traits were clean eece weight rate (CFWR); staple strength (SS); eece extension rate (FER); mean ber diameter (FD); Coe cient of variation of FD (FD_CV); staple length (SL). The ve models were: BayesA (BA); BayesB (BB); BayesCπ (BC); Bayesion LASSO (BL); and GBLUP (GB).

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.