Mega-environment analysis to assess adaptability, stability, and genomic predictions in grain sorghum hybrids

Multi-environment trials (MET) are fundamental for assessing genotype-by-environment interaction (GxE) effects, adaptability and stability of genotypes and provide valuable information about target regions. As such, a MET involving grain sorghum hybrid combinations derived from elite inbred lines adapted to diverse sorghum production regions was developed to assess agronomic performance, stability, and genomic-enabled prediction accuracies within mega-environments (ME). Ten females and ten males from the Texas A&M and Kansas State sorghum breeding programs were crossed following a factorial mating scheme to generate 100 hybrids. Grain yield, plant height, and days to anthesis were assessed in a MET consisting of ten environments across Texas and Kansas over two years. Genotype plus Genotype-by-block-of-environment biplot (GGB) assessed ME, while the "mean-vs-stability" view of the biplot and the Bayesian Finlay–Wilkinson regression evaluated hybrid adaptability and stability. A genomic prediction model including the GxE effect was applied within ME to assess prediction accuracy. Results suggest that grain sorghum hybrid combinations involving lines adapted to different target regions can produce superior hybrids. GGB confirmed distinct regions of sorghum adaption in the U.S. Further, genomic predictions within ME reported inconsistent results, suggesting that additional effects rather than the correlations between environments are influencing genomic prediction of grain sorghum hybrids.


Introduction
Sorghum [Sorghum bicolor (L.) Moench] is the fifth most important cereal crop in the world, following corn (Zea mays L.), wheat (Triticum aestivum L.), rice (Oryza sativa L.), and barley (Hordeum vulgare L.) with 57.8 million tons of global annual production in 2019 (FAOSTAT 2021). While Africa is responsible for almost 50% of grain sorghum production worldwide, grain yields are significantly higher in developed countries. For example, the average sorghum yield in 2019 was 4.58-versus 1.23-tons ha −1 in the U.S. and Nigeria, respectively (FAOSTAT 2021). Within the U.S. states, Kansas and Texas account for 82% of total U.S. production, with average yields of 5.72-and 5.38-tons ha −1 , respectively (NASS 2020).
Grain sorghum production is predominantly cultivated in agricultural areas where water is limited (Monk et al. 2014). Sorghum is favored in these regions because the species possesses various drought-tolerance mechanisms, including stay-green and pre-flowering drought-tolerance (Rooney 2016). Drought tolerance in sorghum provides greater yield stability than other crops, such as maize (Zea mays), which are more susceptible to water-stress conditions. Because sorghum production targets stressprone environments, developing improved grain sorghum genotypes requires special considerations. For instance, sorghum hybrids that are highly responsive to farm inputs are best grown in non-limiting environments, but these do not reflect the typical sorghum production conditions. Hence, assessing the stability of grain sorghum hybrids across target environments becomes particularly important for the effective deployment of sorghum hybrids.
Multi-environment trials (MET) are commonly used to assess genotype-by-environment interaction (GxE) effects. When MET detect GxE that do not involve crossover interaction (COI), GxE may be ignored since the best genotypes are consistent across environments. However, GxE is often associated with COI, and judicious hybrid deployment requires breeders to investigate the adaptability and stability of specific genotypes (Crossa et al. 2004;Burgueño et al. 2008). Many stability tests contemplate univariate measures of stability (Wricke 1962;Finlay and Wilkinson 1963;Shukla 1972), while others have proposed the use of multivariate measures based on biplot technique (Gabriel 1971) (i.e., additive main effects and multiplicative interaction, AMMI, Gauch and Zobel 1997;genotype + GxE, GGE, Yan et al. 2000), and nonparametric measures (Nassar and Huhn 1987;Huehn 1990). Depending on the test, stability can be classified as static or dynamic (Becker 1981;Becker and Leon 1988).
As MET consist of a continuous series of testing genotypes across target environments, not all genotypes are evaluated in each environment. Consequently, MET are usually unbalanced, which challenges the ability to assess the stability of a genotype. A common practice to circumvent unbalanced data and assess stability involves removing genotypes missing in some environments and performing stability tests via classical models, assuming genotype and environment effects as fixed (Finlay and Wilkinson 1963;Gauch 1988;Yan et al. 2000). A suitable alternative to manage unbalanced data involves applying linear mixed models that accommodate random effects and include (co)variance structure to model genetic and environmental effects.
Among the models that assume effects as random, factor analytics models are commonly applied to MET (Piepho 1994;Smith et al. 2001Smith et al. , 2005Piepho et al. 2008). Linear mixed models permit better prediction of those missing data by incorporating the genetic relationship matrix of tested and untested genotypes (Jarquin et al. 2020). Many studies have reported on the benefits of using linear mixed models to assess MET (Piepho 1997(Piepho , 1998Smith et al. 2001Smith et al. , 2019Resende and Thompson 2004;Piepho and Möhring 2005;Burgueño et al. 2008;Balestre et al. 2010;Hu 2015;Jarquin et al. 2020). Bayesian models have also demonstrated their advantages for analyzing genotypic data (Sorensen and Gianola 2007;Crossa et al. 2010;Alves et al. 2019;Fonseca et al. 2021a). For instance, Lian and de los Campos (2016) reported the stability of wheat varieties using a novel approach of the Finlay and Wilkinson stability test under a Bayesian framework. The stability of grain sorghum hybrids based on the Bayesian framework is lacking.
In addition to the unbalanced genotypic effect in the MET, the number of genotypes and environments vary depending on the stage (i.e., early vs. advanced cycles) during the breeding process. Conventionally, many genotypes are tested in limited environments during the initial stages of selection, whereas advanced stages evaluate a few genotypes across many environments. Although this approach is generally applicable, breeding pipelines are being modified to accommodate predictive breeding technologies (Crossa et al. 2021). For instance, genomicenabled prediction is most effective if it replaces the first phase of hybrid trials in grain sorghum breeding schemes (Fonseca et al. 2021a). Further, MET permit the development of genomic-enabled prediction models to include GxE, and results indicate better prediction accuracy (Burgueño et al. 2012;Lopez-Cruz et al. 2015;Lado et al. 2016;Cuevas et al. 2016; Page 3 of 17 128 Vol.: (0123456789) Acosta-Pech et al. 2017;Basnet et al. 2019;Costa-Neto et al. 2020).
Although MET primarily focus on identifying superior genotypes for target environments, established breeding programs also apply MET to generate relevant information about tested environments (Gauch and Zobel 1997;Yan et al. 2000;Yan and Kang 2002;Laffont et al. 2013;Gauch 2013;Yan 2015Yan , 2016. This analysis can be useful to identify mega-environments (ME) across a wider production area (Rakshit et al. 2012;Nielsen and Vigil 2018;Dalló et al. 2019;Sharma et al. 2020;Ansarifard et al. 2020). By definition, ME reduce GxE and allow breeders to extract performance information about genotypes more effectively within a ME (González-Barrios et al. 2019). Lado et al. (2016) indicated that wheat line performance was more predictable when ME information is considered. Alves et al. (2021) showed that the prediction accuracy of hybrids in weakly-correlated environments can be increased when including additional phenotypic information from other correlated trials. Lopez-Cruz et al. (2015) presented a marker-byenvironment genomic best linear unbiased prediction model (MxE GBLUP) to account for GxE effects in genomic-enabled prediction models. The MxE model borrows genetic information across environments, thus increasing prediction accuracies for untested genotypes in tested environments. Acosta-Pech et al. (2017) further expanded the MxE GBLUP model to include combining abilities, environment, and combining abilities-by-environment interactions and reported higher prediction accuracies in maize. Fonseca et al. (2021a) recently applied the Acosta-Pech approach to sorghum, and prediction accuracies were promising. Most genomic-enabled prediction models that include GxE effects assume the GxE (co) variance matrix to be positive semi-definite (Cuevas et al. 2016;Crossa et al. 2019). Thus, higher correlations between environments are expected to yield higher prediction accuracies. The combination of the Acosta-Pech model under ME may generate good prediction accuracies as ME are expected to increase the correlation between environments, causing a reduction in the residual variance, and potentially capturing greater genetic variation.
A common breeding approach for multinational commercial breeding programs is to cross elite germplasm adapted to diverse target environments (Cooper et al. 2014). Studies have demonstrated the potential advantages of utilizing this strategy in hybrid crops (Podlich and Cooper 1998;Technow et al. 2020). This strategy is less common in public institutions (Fonseca et al. 2021b) because these programs have more limited target areas with fewer genetic resources. Nevertheless, the exchange of germplasm between distinct breeding programs could foster a collaborative system between public institutions and mitigate the decrease in the number and funding of public breeding programs (Shelton and Tracy 2017;Coe et al. 2020;Fonseca et al. 2021b).
The present research aimed to develop ME for grain sorghum production regions and assess the adaptability and stability of grain sorghum hybrids derived from geographically distinct U.S. public breeding programs within ME. Further, genomic predictions within ME were investigated. Finally, a comparison between genomic predictions based on ME and without ME is presented.

Genetic material
The grain sorghum hybrids utilized in this study are as previously described by Fonseca et al. (2021a). In brief, 100 hybrids were generated from crossing ten female with ten male lines in a factorial mating scheme (Comstock and Robinson 1952). The elite inbreds originated from the sorghum breeding programs at Texas A&M AgriLife Research (College Station, Texas) and Kansas State Research Center (Hays, Kansas). Each program provided five females (A-lines) and five males (R-lines) developed for their respective target environments and that produced agronomically acceptable hybrids ( Table 1). The 100 hybrids were subdivided into four groups to represent hybrid combinations derived from elite lines within each breeding programs and between breeding programs ( Fig. 1).

Experimental design
Hybrids were planted in a randomized complete block design (RCBD) with sets in replicate adjustment; each set was composed of one of the four groups of 25 hybrids (Fig. 1), with randomization occurring within and between sets. In 2018 trials, each location had three replicates, while in 2019 each location had two replicates. A plot consisted of two adjacent rows, approximately 5.3 m in length, with row spacing that ranged from 0.76 to 1.0 m, depending on the production practices in each environment. In 2018, trials were grown in Monte Alto, Victoria and College Station in Texas, and Garden City and Colby in Kansas. In 2019, the trials were in Taft, Victoria and College Station in Texas and Hays and Colby in Kansas. These Texas and Kansas locations represent distinct adaptation zones and, in each test, agronomic practices standard to the location were followed (Table 2) (Fonseca et al. 2021a). GPS coordinates of each environment were used to collect weather data from the NASA POWER database (NASA 2021).
Three agronomic traits were measured including days to anthesis (DA), plant height (PHT), and grain yield (GY). DA were recorded as the number of days from planting to when 50% of the plants in the plot were at mid-anthesis. In Kansas locations, DA was collected on one replicate only. Just prior to harvest, PHT was measured in cm using a representative plant in each plot at the length from the soil to the tip of the panicle. Plots were harvested using plot combines fitted with a Harvest Master GrainGage System (Juniper Systems), to measure grain weight and moisture content. After adjusting yield to a 14% moisture, GY was converted to tons ha −1 (Fonseca et al. 2021a).

Statistical analysis
For each environment, data were analyzed following a linear mixed model (Henderson 1984): where y is the vector of phenotypes; is the intercept; h is the random effect of hybrid, h ∼ N 0, 2 h I , r is the random effect of replicates, r ∼ N 0, 2 r I ; e is the vector of residuals, e ∼ N 0, 2 e I ; 1 is a vector of ones; Z 1 and Z 2 are incidence matrixes; 2 h , 2 r and 2 e are variance components for hybrids, replicates and residuals, respectively. Model (1) was extended to incorporate environment and GxE effects for performing a combined analysis as followed: where s is the vector of environmental effects, s ∼ N 0, 2 s I ; hs is the vector of the GxE effect hs ∼ N 0, 2 hs I , r(s) is the vector of replicate effect nested within environment, r(s) ∼ N 0, 2 r(s) I ; Z 1 , Z 2 , Z 3 , Z 4 , are incidence matrixes for the corresponding effect.
Variance components were estimated via restricted maximum likelihood (REML) method (Patterson and Thompson 1971) using the lmer function of the lme4 R package (Bates et al. 2015), and its significance assessed by the likelihood ratio test (LRT) using the ranova function of the lmerTest R package (Kuznetsova et al. 2017). All analysis were conducted in R   (1) assuming hybrids as fixed. BLUEs were used to develop the training set in genomic prediction models.
Mega-environments (ME) were developed using the gge function of the gge R package (Wright and Laffont 2020). Genotype plus genotype × environment (GGE) biplot and which-won-where polygon were used to assess grain sorghum hybrid performance, while genotype plus genotype × block of environments (GGB) biplots were used to define ME. Biplots were scaled and environment-centered. For GGE biplot, stability was assessed using loading projections on the average environment coordinate (AEC) at each ME. For Bayesian Finlay-Wilkinson regression (BFW-Lian and De Los Campos 2016), static and dynamic stability was assessed based on the slope of genotypes across environment. Regression lines presenting slopes close to 0 classified static stability while slopes close to 1 defined dynamic stability. For BFW, genomic information of hybrids was included in the model as (co)variance matrix of the genotypic effect. Posterior distribution was calculated based on 10,000 interactions, burn-in of 5000 and a thin of 5. Genomic relationship of hybrids were developed in silico by the Kronecker product of the parental marker matrix. For details see Fonseca et al. (2021a).
Genotyping, genomic prediction model, and cross validation scheme DNA extraction, genotyping-by-sequencing (GBS), and development of molecular markers were described in Fonseca et al. (2021a). In brief, seed from the parental lines were germinated in a greenhouse and DNA extracted from ∼ 14-day-old leaf tissue. NgoMIV restriction enzyme was used to digest each DNA. After single nucleotide polymorphisms (SNPs) were called and processed to represent high quality markers, a total of 35,546 SNP were available for further analysis. The genomic-enabled prediction model used in this study was first introduced by Acosta-Pech et al. (2017) and first applied in grain sorghum hybrids by Fonseca et al. (2021a). Herein, the model was applied for each ME separately and included combining abilities, environment and GxE effects to predict the performance of sorghum hybrids as follows: where y = y 1 , ⋯ , y j , ⋯ , y s H ; u f is a random vector to include the interaction between GCA of females and environment, u f ∼ N 0, 2 fs V f ; u m is a random vector that models the interaction between males and environment, u m ∼ N 0, 2 ms V m ; u h is a vector that takes into account the interaction between hybrids and environment, u h ∼ N 0, 2 hs V h ; e is a vector of residuals, e ∼ N 0, 2 e j I ; Z s , Z 1 , Z 2 , and Z 3 are incidence matrixes for environment, female, male, and hybrid, respectively. G f , G m , and H are the genomic relationship of matrix for females, males, and hybrids, respectively. The elements of matrix H were calculated in silico by the Kronecker product of G f and G m as H = G f ⊗ G m (Technow et al. 2014;Acosta-Pech et al. 2017), which ⊗ represents the Kronecker product. V f , V m , and V H are variance-covariance matrices that associate each respective genomicmain-effect-by-environment variance component, i.e., 2 fs , 2 ms , and 2 hs . The variance-covariance matrix was calculated as To validate models and estimate the predictability of parents, a simple leave-one-out cross-validation (CV) scheme was applied following Fonseca et al.
Page 7 of 17 128 Vol.: (0123456789) (2021a). Briefly, seed parents in hybrid combinations were removed, and models trained with the remaining records where that seed parent was absent. The same process was applied for males. Thus, the CV scheme led to prediction of hybrids when a female or a male parent were removed. The prediction accuracy (r) at each ME was estimated as the Pearson's correlation between observed and predicted values of hybrids.
Genomic prediction model was fit using Bayesian methods implemented in the BGLR statistical package (Pérez and de los Campos 2014) available for R. Inferences were based on 10,000 Gibbs sampler iterations with a burn-in of 5000, and a thin of 5.

Analysis by location
Genetic variation was significant for all traits in all environments (Table 3). Broad-sense heritability estimates were generally high, indicating that most of the variation within a location was associated with genetic effects. GY ranged from 4.31 to 8.34 ton ha −1 , PHT ranged from 121 to 145 cm, and DA ranged from 59 to 80 days across all environments. The lowest environmental mean GY occurred in 2018 Monte Alto (18RF), TX, while the greatest was in 2019 Colby (19COL), KS (Table 3). In 2020, the National Agricultural Statistics Service reported grain sorghum average yields of 4.23 and 5.71 ton ha −1 for Texas (NASS 2021a) and Kansas (NASS 2021b), respectively. Such grain sorghum averages indicate that hybrids presented in this study generate yields similar to those of commercial hybrids, thus meeting the expectation of high-yielding germplasm.

Combined analysis and development of mega-environments
The likelihood ratio test (LRT) for the combined analysis detected highly significant effects due to genetics, environments, and GxE for all traits (Table 4). Environment effects explained most of the variation for GY (48.9%) and DA (76.7%), while genetic components were the largest for PHT (40.5%). For GY,   the GxE variance component accounted for almost 12% of the total variation, exceeding the proportion of the genetic variance (9.8%). Yan et al. (2000) suggested the existence of mega environments when GxE effects explain proportionally more than genotype effects; thus, the likelihood of finding a single best sorghum hybrid for all test environments is unlikely. Hence, mega-environment 1 (ME1) and mega-environment 2 (ME2) were developed (Table 4). The GGE biplot analysis for GY explained more than 50% of the variation with the first two principal components (PC) (Fig. 2). The which-won-where polygon suggested that three ME exist. Nonetheless, the relative lack of representativeness of the environment 19HAY combined with empirical knowledge about sorghum production regions led to the establishment of only two ME (Fig. 3 and Table 4). These two ME are consistent with previous grain sorghum studies that define production regions as subtropical and temperate environments (Monk et al. 2014).
Adaptability and stability of grain sorghum hybrids derived from public breeding programs Several grain sorghum hybrids from both programs were highest yielding and highly stable (Figs. 4, 5; Table 5). For instance, within the subtropical environment (i.e., ME1), the highest yielding hybrid was a Kansas/Texas combination (KF1/TM3) (Fig. 4). More important, the KF1/TM3 hybrid exhibited good stability across environments (Fig. 4). According to Finlay-Wilkinson (1963), linear regressions of genotype yield parallel to the population mean yield regression categorize 'dynamic stability'. Further, an "ideal genotype" outperforms other genotypes in both low-and high-yielding environments. Based on the BFW regression method applied herein, the KF1/ TM3 hybrid exhibits dynamic stability and represents a potential "ideal genotype" (Fig. 4). For instance, the KF1/TM3 hybrid was among the highest yielding in the 18RF and 19CS environments (Fig. 4). Therefore, KF1/TM3 exemplifies a crucial characteristic that determines high-performing and stable grain sorghum hybrids: hybrids that are adapted to marginal areas and responsive to more favorable environments (Monk et al. 2014). Within the temperate ME (ME2), the top two performers resulted from crosses involving Texas inbred lines (TF2/TM2 and TF2/TM1) (Fig. 6). That implies that elite lines developed outside a particular target environment can produce good grain sorghum hybrids in other sub-tropical ME. Within U.S. sorghum improvement programs it is generally easier to move germplasm from lower to higher latitudes (i.e., from South to North in the U.S.) than in the opposite direction. Results presented herein support such a statement as Texas elite sorghum lines performed well in hybrid combination in temperate Kansas environments.
It is interesting to note that the top performers within ME1 exhibited above-average PHT and below-average DA across environments (Fig. 5). This emphasizes the positive correlation between PHT and GY and further highlights the relationship between earlier hybrids and their avoidance of biotic and abiotic stresses (i.e., midge damage and post-anthesis drought tolerance). However, for ME2, the top hybrids were shorter than the average PHT across environments (Fig. 7). In ME2, taller plants might be adversely affected by abiotic stress associated with this ME, such as wind and drought that can lead to lodging. For those environments, shorter plants are likely to suffer less from lodging and stalk breakage problems, which could explain the yield advantage and justify their selection for these environments.
Genomic predictions modeling GxE within ME Genomic predictions of grain sorghum hybrid performance within ME produced inconsistent results

Fig. 3
Mosaic plot and genotype plus genotype × block of environments (GGB) biplot of the multi-environment trial visualizing the two-way partitioning of the total sums of squares (TSS) into genotype (G), genotype × blocks of environments (GB), and residuals (R) along each GGB biplot axis for yield. The two mega-environments are denoted 1 (ME1) and 2 (ME2). ME1 included Texas environments while ME2 included Kansas environments as defined in Table 2. Hybrids are labeled according to each female x male combination. TF and TM codes for female and males derived from Texas A&M sorghum breeding programs, respectively, whereas females and males derived from Kansas breeding programs are identified as KF and KM respectively. Number following the inbred line origin identifies different parents  (Table 6). When compared to the prediction without ME, the leave-one-out female CV scheme had higher prediction accuracies within ME1, while the leave-one-out male CV scheme had better predictions in ME2 for GY and PHT. Overall, there was a slight increase in predicting GY and DA within ME1. These results indicate that closely related environments can but do not necessarily increase prediction accuracies (Cuevas et al. 2016;Crossa et al. 2019).

Fig. 4
The "mean vs stability" representation of the genotype plus genotype × environment (GGE) biplot based on megaenvironment 1 (left), and Bayesian Finlay-Wilkinson (BFW) regression (right) for grain yield. For GGE, data were environment-centered and not scaled; the biplot was focused on singular value partitioning using nipals algorithm and grain yield scores projections on the average environment coordinate displayed the stability of each hybrid. Underlined hybrids in the GGE biplot indicated the top and bottom two and an intermediate less stable hybrid. The underlined hybrids were assessed using BFW regression to further investigate static and dynamic stability The accuracy of genomic prediction models is influenced by many effects, including but not limited to the quality of the phenotypic data, the heritability of the trait, and the size and structure of the training population. For example, predictions across all environments consider the entire data set, and predictions within a ME only include information from a specific ME. While the ME division can increase the Fig. 6 The "mean vs stability" representation of the genotype plus genotype × environment (GGE) biplot based on mega-environment 2 (left), and Bayesian Finlay-Wilkinson (BFW) regression (right) for grain yield. For GGE, data were environment-centered and not scaled. The biplot was genotypefocused singular value partitioning using nipals algorithm.
Grain yield score projections on the average environment coordinate displayed the stability of each hybrid. Underlined hybrids in the GGE biplot indicated the top and bottom two hybrids and an intermediate less stable hybrid. The underlined hybrids were assessed using BFW regression to further investigate static and dynamic stability correlations between environments located within the ME, the advantages for predicting untested genotypes may be limited due to the concomitant reduction in the number of hybrids present in the training set. As such, developing prediction models for specific ME is recommended when the target environment is well-defined and sufficiently evaluated to build robust prediction models (Table 6). It is also possible to increase the prediction accuracy by including phenotypic records derived from additional sets of correlated environments (Alves et al. 2021). This is observed when prediction accuracies are compared between ME2 and without ME (Table 6). In situations where predictions of untested hybrids involve new environments, reaction norm principles and envirotypic-assisted genomic prediction models indicate promising results as both genomic and environmental information are included in the model (Jarquín et al. 2014;Li et al. 2018;Costa-Neto et al. 2020;Buntaran et al. 2021).

Conclusion
The identification of ME reduces GxE effects, provides better information about MET, and continues to be a common practice among plant breeders interested in both genotype and environment assessment. A MET for grain sorghum hybrids grown across Texas and Kansas confirmed the existence of two distinct ME that coincide with historically different regions of adaptation generally designated as subtropical and temperate environments. Hybrid combinations between elite inbred lines from Texas and Kansas public sorghum breeding programs indicate a potential benefit of exchanging geographically-distinct germplasm to generate high-performing and stable grain sorghum hybrids across temperate and subtropical environments. Based on the present results, additional public (and commercial) sorghum breeding programs are encouraged to explore elite germplasm exchanges to increase genetic diversity within and across programs with the ultimate goal of enhancing genetic gain and hybrid performance. This is especially critical in crops in which investment in public breeding programs is anticipated to continue to decline and thereby exploit the shared resources of multiple programs. Genomic-enabled prediction models were inconsistent in their ability to predict hybrid performance within ME, yet highly correlated environments can generate greater prediction accuracies with smaller training sets. Further research is warranted to identify additional factors affecting the accuracy of genomic prediction models within ME grain sorghum hybrids trials. Finally, the inclusion of non-genetic effects to model the correlation among environments should be considered in future studies to assess their merit to increase prediction accuracies in MET. Data availability The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Conflict of interest
The authors declare that there is no conflict of interest.