Updating Knowledge in the Estimation of Genetic Parameters: Multi-trait and Multi-Environment Bayesian Analysis in Rice (Oryza sativa L.)

Among the multi-trait models used to jointly study several traits and environments, the Bayesian framework has been the preferred tool because it accommodates more complex and biologically realistic models. In most cases, non-informative prior distributions are adopted in studies using the Bayesian approach. Still, the Bayesian approach tends to yield more accurate estimates when it uses informative prior distributions. The present study was developed to evaluate the efficiency and applicability of multi-trait multi-environment (MTME) models under a Bayesian framework, using a strategy for eliciting informative prior distributions from previous rice data. The study involved data on rice genotypes in three environments and five agricultural years (2010/2011 to 2014/2015) for the following traits: grain yield (GY), days to flowering (FLOR) and plant height (PH). Variance components and genetic and non-genetic parameters were estimated by the Bayesian method. In general, the informative prior distributions in Bayesian MTME models provided higher estimates of heritability and variance components, as well as shorter highest posterior density (HPD) intervals, compared to their respective non-informative prior analyses. The use of more informative prior distributions makes it possible to detect genetic correlations between traits, which cannot be achieved with non-informative prior distributions. Therefore, the knowledge-updating mechanism presented here for eliciting an informative prior distribution can be efficiently applied in rice genetic selection.

Rice is one of the most important sources of the global population's daily caloric and nutritional requirements (FAO, 2020). The global population is increasing, but the available area of suitable wetland is decreasing (Ray et al., 2013). Therefore, the need to increase crop productivity rather than expand agricultural land has grown over the years (Lobell et al., 2011; Phalan et al., 2011; Ray et al., 2013). It is estimated that by 2050 the agricultural production of rice should increase by 60 to 110% (Hunter et al.,

In general, in plant breeding programs that aim to identify genetically superior genotypes, selection is based on only one trait (Suela et al., 2019; Parimala et al., 2020; Sabri et al., 2020). While useful, this approach can cause problems if the performance of the selected genotypes in the other desired traits is not evaluated (Cruz et al., 2014). The genetic evaluation of multiple traits is relevant because superior varieties combine optimal attributes for several traits simultaneously (Torres et al., 2018). In these cases, selection can be made indirectly, based on secondary traits that are little influenced by the environment, easy to measure and genetically correlated with the target trait, which is an attractive alternative for maximizing accuracy (Santos et al., 2018).

[...] frequentist when it uses informative prior distributions (van de Schoot et al., 2021). Thus, informative prior distributions should be preferred for breeding purposes to improve the accuracy of the selection process.
Silva et al. (2013) and Azevedo et al. (2022) presented systems for updating knowledge about the hyperparameters of the prior distributions of the variance components in univariate analyses in maize and white oat breeding, respectively, using phenotypic data collected in different years.
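The idea of updating hyperparameters year by year can be illustrated with a deliberately simplified sketch. It is not the procedure of Silva et al. (2013) or Azevedo et al. (2022): it swaps in a textbook conjugate model (a single variance with an inverse-gamma prior and zero-mean normal observations), and the residual values and the function names `update_inverse_gamma` and `inverse_gamma_mean` are hypothetical. It only shows how the posterior hyperparameters from one year can serve as the prior hyperparameters for the next.

```python
def update_inverse_gamma(shape, scale, residuals):
    """Conjugate update of an inverse-gamma(shape, scale) prior on a variance,
    given zero-mean normal observations (toy model, assumed for illustration)."""
    n = len(residuals)
    sum_of_squares = sum(r * r for r in residuals)
    return shape + n / 2.0, scale + sum_of_squares / 2.0


def inverse_gamma_mean(shape, scale):
    """Mean of an inverse-gamma distribution; only defined for shape > 1."""
    return scale / (shape - 1.0)


# Hypothetical zero-mean deviations, one list per agricultural year.
yearly_residuals = [
    [1.2, -0.8, 0.5, -1.9],
    [0.3, 2.1, -1.4, 0.9],
    [-0.6, 1.1, -2.2, 0.4],
]

# Start from a vague prior; each year's posterior becomes next year's prior.
shape, scale = 0.001, 0.001
for year, residuals in enumerate(yearly_residuals, start=1):
    shape, scale = update_inverse_gamma(shape, scale, residuals)
    print(year, round(shape, 3), round(scale, 3))
```

In this toy setting the shape hyperparameter grows with every year of data, which is one way to see why a prior elicited from previous years is more informative than a vague one.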

However, these procedures for eliciting informative prior distributions have not yet been presented for multi-trait analysis. Furthermore, although multi-trait and multi-environment studies in rice have already been reported in the literature (Bhandari et al., 2019; Yu et al., 2019; Ahmadi et al., 2021), the combination of multi-trait and multi-environment models under a Bayesian framework with informative priors has not been investigated so far.

Thus, the present study aimed to evaluate different strategies for eliciting informative prior distributions using previous data from rice. To this end, phenotypic data on four traits of eighteen rice genotypes evaluated in five agricultural years were used.

Experimental data

The useful area consisted of 4 m of the three internal rows (4 × 0.9 m, 3.60 m²). The experiments were conducted on floodplain soils with continuous flood irrigation. Cultural practices were carried out according to the recommendations for irrigated rice cultivation in the evaluated regions (Soares et al., 2005).

The fitted multi-trait statistical model was given by:

$$y = X\beta + Z_1 b + Z_2 g + e$$

Which can be rewritten as: [...]

where $\beta_{ij}$ is the vector of systematic effects of the j-th environment in the i-th trait, $b_{ij}$ is the vector of block effects of the i-th trait in the j-th environment and $e_{ij}$ is the residual vector of the i-th trait in the j-th environment. $X$ is the incidence matrix of systematic effects, $Z_1$ is the incidence matrix of block effects and $Z_2$ is the incidence matrix of genotype effects.

The prior distributions for the parameters of the model were given by: [...]

where $I$ is the identity matrix, and $\Sigma_\beta$, $\Sigma_b$, $\Sigma_g$ and $\Sigma_e$ are the (co)variance matrices, with prior distributions given by:

The relative variation index is the ratio of the coefficient of genotypic variation to the coefficient of residual variation, that is, $CV_r = CV_g / CV_e$.
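As a minimal sketch of this ratio, the snippet below computes both coefficients of variation from variance estimates and a trait mean. The function names and the numerical values are hypothetical and are not taken from the tables of this study; each coefficient is assumed to be the standard deviation expressed as a percentage of the trait mean.

```python
import math


def coefficient_of_variation(variance, mean):
    """Coefficient of variation, in percent, from a variance and the trait mean."""
    return 100.0 * math.sqrt(variance) / mean


def relative_variation_index(genotypic_variance, residual_variance, mean):
    """Ratio of the genotypic to the residual coefficient of variation."""
    cv_genotypic = coefficient_of_variation(genotypic_variance, mean)
    cv_residual = coefficient_of_variation(residual_variance, mean)
    return cv_genotypic / cv_residual


# Hypothetical values, for illustration only.
print(relative_variation_index(genotypic_variance=4.0, residual_variance=16.0, mean=50.0))
```

Because both coefficients share the same mean, the index reduces to the square root of the ratio between the genotypic and residual variances.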

The informativeness of the prior distribution is associated with the values of the hyperparameters and,

[...] (2022) demonstrated in univariate analyses with ten years of data that the procedure for updating knowledge [...]

The following parameters were calculated to assess the impact of incorporating prior knowledge: (i) the posterior coefficient of variation (CV) of the estimates of the variance components, heritability, genetic correlation and additive genetic values; (ii) the length of the highest posterior density (HPD) intervals of the parameter estimates; (iii) the deviance information criterion (DIC), when possible, since the quality of fit can only be compared using the DIC when the models use the same data; (iv) the agreement between the genetic estimates obtained with non-informative and informative prior distributions, considering a selection differential of 30% (6 genotypes in total).
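Three of these summaries (posterior CV, HPD interval and selection agreement) can be sketched from MCMC output as follows. This is an illustrative sketch, not the code used in the study: the function names are hypothetical, the HPD interval is taken as the shortest interval containing the requested share of sorted draws, and agreement is taken as the percentage of genotypes selected under both rankings.

```python
import math
import statistics


def posterior_cv(samples):
    """Posterior coefficient of variation, in percent, of a list of MCMC draws."""
    return 100.0 * statistics.stdev(samples) / statistics.mean(samples)


def hpd_interval(samples, prob=0.95):
    """Shortest interval that contains a share `prob` of the sorted draws."""
    ordered = sorted(samples)
    n = len(ordered)
    n_inside = max(1, math.ceil(prob * n))  # number of draws inside the interval
    candidates = [
        (ordered[start + n_inside - 1] - ordered[start], start)
        for start in range(n - n_inside + 1)
    ]
    _, best_start = min(candidates)
    return ordered[best_start], ordered[best_start + n_inside - 1]


def selection_agreement(values_a, values_b, n_selected):
    """Percentage of genotypes ranked in the top `n_selected` under both analyses.

    `values_a` and `values_b` map genotype labels to estimated genetic values.
    """
    top_a = set(sorted(values_a, key=values_a.get, reverse=True)[:n_selected])
    top_b = set(sorted(values_b, key=values_b.get, reverse=True)[:n_selected])
    return 100.0 * len(top_a & top_b) / n_selected


# Hypothetical genetic values for four genotypes under two prior settings.
non_informative = {"G1": 5.0, "G2": 4.0, "G3": 3.0, "G4": 2.0}
informative = {"G1": 5.0, "G2": 1.0, "G3": 4.0, "G4": 2.0}
print(selection_agreement(non_informative, informative, n_selected=2))
```

With eighteen genotypes and 6 selected, `n_selected=6` would correspond to the setting described above.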

All computational implementations of the analysis were performed using the R software program

Insert Table 1

For all parameters, the p-values of Geweke's Z statistic were greater than the 1% significance level (Tables 2 and 3), which indicates that convergence was achieved and that inferences can be made.
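The logic of this convergence check can be sketched as below. The sketch is a simplification and not the routine used in the study: Geweke's diagnostic estimates the variance of each segment mean from the spectral density of the chain, whereas this version assumes the draws are thinned enough to be treated as approximately independent and uses the plain sample variance. The function names are hypothetical; the first 10% and last 50% of the chain are the conventional segment fractions.

```python
import math
import statistics


def geweke_z(chain, first=0.1, last=0.5):
    """Simplified Geweke Z score comparing the start and the end of a chain.

    Assumes approximately independent draws (naive variance of the mean),
    which is a simplification of the spectral-density-based diagnostic.
    """
    n = len(chain)
    head = chain[: int(first * n)]
    tail = chain[n - int(last * n):]
    var_head = statistics.variance(head) / len(head)
    var_tail = statistics.variance(tail) / len(tail)
    return (statistics.mean(head) - statistics.mean(tail)) / math.sqrt(var_head + var_tail)


def two_sided_p_value(z):
    """Two-sided p-value of a standard normal Z score."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical chain whose first 100 draws are shifted, i.e. not yet converged.
chain = [10.0 + (i % 2) for i in range(100)] + [float(i % 2) for i in range(900)]
p_value = two_sided_p_value(geweke_z(chain))
print(p_value < 0.01)
```

A p-value above the chosen significance level gives no evidence that the means of the two segments differ, which is the sense in which convergence is accepted.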

Smaller posterior coefficient of variation (CV) values for the genetic variance and heritability (Table 2) and for the genetic correlation (Table 3) were observed when the informative prior was used in the estimation process. In this approach, the hyperparameters of the prior distributions were obtained by analyzing the previous year. Consequently, the HPD intervals were also shorter, owing to the higher precision provided by this informative prior (Tables 2 and 3

The use of the informative prior increased the additive genetic variance and heritability relative to the non-informative prior distribution for all traits, except for PH in Lambari and FLOR in Janaúba (Table 2). Among the 18 rice genotypes evaluated, GY in Janaúba had the highest additive genetic variance, while the smallest value was found for PH in Lambari. The highest heritability, 0.79, was observed for PH in Janaúba, and the lowest, 0.14, for GY in Leopoldina.

The GY, FLOR and PH traits presented coefficients of variation from 0.95% to 2.58%,

These can be considered adequate when compared to the method for the classification of coefficients of variation [...] is more influential than residual variation (Torres et al., 2018). This was observed in this study between environments (Table 5). The genetic correlations between environments were positive for all traits.

Leopoldina was the environment that presented the highest correlations with the other locations. Considering genetic correlations below 0.30 as low and above 0.60 as high (Oliveira et al., 2020), the estimates suggest, respectively, the occurrence of high (0.14-0.22) and moderate (0.34-0.47) G × E interaction, i.e., the performance of the genotypes changed across environments.

Insert Table 5

The percentage of agreement considering a selection differential of 30% was calculated to compare the ranking of genotypes between the three environments for each trait, as described above (

We demonstrated the feasibility of the proposed multi-trait multi-environment Bayesian model for plant breeding involving a small number of genotypes evaluated for multiple traits across a range of environments. In addition, we presented a knowledge-updating mechanism for eliciting an informative prior distribution. The use of more informative prior distributions made it possible to detect genetic correlations between traits, which was not feasible with non-informative prior distributions.