Understanding G × E is an important initial step to developing strategies for a breeding program in the target environment(s). Our results showed that G × E patterns differed between safflower traits. The identification of site-shared and site-specific MTAs in GWAS provides knowledge to broaden our understanding of the genetic basis of G × E interactions for important safflower traits.
Different G × E interaction patterns were observed for safflower traits
The seasonal rainfall in the Victoria Wimmera region (Horsham and Wonwondah) had an impact on safflower agronomic traits. Safflower is normally sown in Winter in Australia to maximize the usage of the available water from Winter and early Spring rain (Wachsmann 2008). In our study, we observed that water stress during flowering decreased safflower grain yield substantially. Further, insufficient Spring rain heavily reduced safflower production via poor biomass accumulation. Similar grain yield instability induced by rainfall patterns has also been reported in other crops (Sadras and Dreccer 2015). Besides grain yield, a 1% decrease of OL was observed under differing water stress, consistent with previous studies that oil content decreased under drought stresses (Ebrahimi et al. 2017b; Joshan Y. 2019). The positive correlation between OL and PR (0.14 to ~0.46 across sites), which was also reported in a previous study ( r=0.476) (Oz 2016), indicated that artificial selection for OL in safflower has a limited impact on seed protein compared with soybean (Leamy et al. 2017b). The negative correlation between SW with PR and OL across all four sites suggests a negative relationship between carbohydrate accumulation and protein and oil accumulation, which could be similar to the competition in cereal crops (Bjarnason and Vasal 1992; Pasam et al. 2012).
The overall G × E interactions were significant for all the traits studied. However, there were different levels of G × E for the different traits. The heterogeneous model further revealed detailed G × E patterns for each trait, which indicated the rank changes of accessions between sites. The high G × E observed for YP across sites was consistent with studies in other crops (He et al. 2019; Tolessa et al. 2013). Low to moderate pairwise genetic correlation indicated re-ranking for YP was high among sites. Only 4 accessions showed yield stability through presence in the top 50 high yield accessions across all sites, which could be used for the future breeding program. The cultivars and breeding lines performed well at site 1 (19 accessions out of the 50 top YP accessions) but not at the other three sites, suggesting that introgression of water stress tolerance from landraces could improve safflower yield stability. The low level of G × E for OL with limited re-ranking across sites observed in our study was also indicated in a soybean study (Sudarić et al. 2006). According to the BLUEs for each site, about half of the top 30 accessions with high OL were cultivars and breeding lines, reflecting breeding efforts to improve OL in safflower cultivars. Although a moderate rGoverall was observed for DF and PH, the pairwise genetic correlation showed that lines were reranked strongly for DF at site 2 compared with other sites. The genetic divergence of DF among the accessions in response to water stress at flowering implied DF is important in developing drought tolerant varieties (Bhandari et al. 2020).
GWAS identified MTAs for safflower traits
GWAS has been widely used to study the genetic basis of the important agronomy traits with diverse germplasm in crops (Liu and Yan 2019). Multi-environment trials normally were combined to present the overall phenotypic variation for GWAS to detecting the associations between markers and traits (Landers and Stapleton 2014; Leamy et al. 2017c). However, with diverse germplasm, the phenotypic variation displayed under differed environments can be used to measure the plasticity of the traits or trait G × E level with proper statistical models (Des Marais et al. 2013; Malosetti et al. 2013). Environmental stable and environmental specific MTAs can help our understanding of the genetic basis of trait G × E, and it also will enrich our knowledge of the genetic architecture of the important agronomy traits (Li et al. 2018b; Timpson et al. 2018). In our study, GWAS was carried out with a globally diverse safflower collection for six agronomic traits in four field trials that differed with water availability. MTAs shared across sites were identified for traits with low G × E, and site-specific MTAs were discovered for all traits with more site-specific MTAs than shared MTAs identified for moderate overall G × E traits.
A high number of significant MTAs were identified for seed oil content (OL) by all three GWAS approaches, of which 18 were shared across sites, and 12 were site-specific, indicating the complex genetic control of this trait. Studies with canola showed that 24 candidate genes were involved in fatty acid biosynthesis (Qu et al. 2017a). In safflower, a transcriptome study showed that a significant number of transcription factors were involved in oil accumulation in safflower seeds (Li et al. 2021). The six MTAs shared across four sites will be of interest to safflower breeders and geneticists as sources of genetic variation to improve the seed oil content in safflower under different growing conditions. Similarly, numerous MTA (total 20) were identified for seed weight (SW), of which 11 were shared across sites. Three MTAs explaining more than 10-20% phenotypic variance across sites will provide useful information for breeders to modify SW in safflower (Supplementary Table s9).
The molecular basis for G × E interactions could be due to site-specific QTLs, gene expression or differences in the magnitude of expression across environments (Des Marais et al. 2013; Li et al. 2018b). In our study, all site-specific MTAs showed differing allelic effects across sites for each trait (Supplementary Table s6-s11), however, the effects were significant in some environments but not in other environments. We observed moderate overall G × E for PH and DF with a higher number of site-specific MTAs. Markers associated with PH and DF under drought conditions in safflower have been reported (Ebrahimi et al. 2017a). Only one MTA was identified for DF at site 2, which could be related to the narrow phenotypic variation observed at site 2 (Stich and Melchinger 2010). Few MTAs for PR and YP were detected by all three GWAS methodologies. However, those that were identified explained a high proportion of the phenotypic variation for each trait, indicating their potential importance for genetic improvement.
Correlations between traits can be caused by pleiotropy or a close linkage of loci associated with the traits (Chen and Lübberstedt 2010). Shared major genes or QTL for flowering time and plant height have been reported in soybean (Cober and Morrison 2010; Fang et al. 2017). In our study, Locus 5628 was associated with DF at site 1 and PH at site 2 and 3, suggesting the MTA is likely tightly linked to different QTL affecting both traits, rather than being a single QTL with pleiotropic effects. In canola, a QTL affecting both OL and PR in repulsion was reported, suggesting the PR and OL biosynthesis pathways interfere and/or compete with one another (Chao et al. 2017). In maize, high OL and high PR were achieved using the opaque2 modifier genes. However, a yield reduction was noted (Vanous et al. 2019). In our study, we identified four MTAs affecting three traits, one MTA influencing both SW and OL, one MTA associated with PR and OL, and two MTAs interfering SW, OL, and PR. The allelic effects of those MTAs were consistent with the correlation observed in the field among the three traits that PR and OL are positively correlated, and both traits are negatively correlated with SW. This suggested that safflower breeding for PR and OL may differ from canola and maize. However, balancing seed weight and seed quality (OL and PR) would be a challenge. There were other strong phenotypic correlations, such as YP with PH and YP with PR, but associated markers were not identified. The reason could be the low number of significant SNPs that were observed for grain yield.
The interplay of GWAS results and genetic architectures
SW and OL are known as highly heritable traits in many crops, while yield is more quantitative in nature (Ward et al. 2019; Xiao et al. 2019). The number of MTAs identified by the three GWAS methods did not fully reflect the complexity of the trait genetic architecture. One reason for this could be the thresholds used by the three methods. The p-value used in our single locus MLM-GWAS was relaxed, and a significant number of candidate MTAs were observed for all traits. With meta-GWAS, we reported significant SNPs number with -log10P value at 3 instead of 2, which detected more significant SNPs for each trait indicating the increased power (Supplementary Table s12). However, multi-locus BayesR, which can improve association mapping resolution by removing multiple SNPs being in LD with the same QTL, could detect SNPs with larger effects (Kemper et al. 2015; Pasam et al. 2017). We observed fewer MTAs for all traits with the BayesR methods with the arbitrary threshold of 0.7 posterior probability of a SNP having an effect. This threshold may have been too stringent for polygenic traits such as YP and PR. Only 10 MTAs associated with YP, and 5 MTAs associated with PR were detected with BayesR, which explained 5-28% of the phenotypic variance.
Heterogenous model fit the data better
Mixed linear models are widely used for G × E analysis in crop research (Smith et al. 2005). Falconer & Mackay (1996) suggested that the same trait measured in different environments should be considered as different (but correlated) traits. In our study, the homogenous model 1 combined the four sites together and estimated the overall GXE pattern with only three parameters. However, the heterogeneous model 2 treated each site as an independent environment, and a total of 26 parameters were estimated. The increased number of parameters allowed dissection of G × E among individual environments to reveal hidden patterns of genetic correlation between sites. Furthermore, the AIC, BIC, and Logl were improved for all traits, indicating model 2 was better able to fit the data (Hirotugu. 1998). These findings agreed with Malosetti et al. (2013), who compared different models to study G × E interactions, and concluded that sophisticated mixed models are necessary to allow for heterogeneity of genetic variances and correlations across environments.
In conclusion, two mixed linear models were applied to analyse the G × E pattern for grain yield (YP), days to flowering (DF), plant height (PH), 500 seed weight (SW), seed oil content (OL), and seed protein content (PR) in a globally diverse safflower collection grown in four field trials. The heterogenous mixed linear model (MLM) fitted data better and provided a detailed estimation of the G × E pattern. We observed that different water stress conditions impacted the performance of each of these traits differently, with low overall G × E observed for OL and SW and high overall G × E for YP. Ninety-two MTAs were identified with large effects MTAs detected for OL, SW, and PR across all sites. Site-specific MTAs were detected for all traits with differed allelic effects, suggesting these MTAs could be associated with trait G × E. Five MTAs were associated with multiple traits. The uniform GWAS thresholds used in the study could have impacted the number of significant SNP identified for complex traits. This study has provided new insights into the genetic architecture of the traits studied, and it presents opportunities to exploit the MTA identified in breeding programs to increase yield stability and local adaptation in safflower.