The recent improvements in genotyping technologies, together with the efforts to sequence and assemble the genomes of several crop plants, have greatly increased the throughput and ability to characterize genetic diversity and population structure in germplasm collections using thousands of genome-wide molecular markers (Metzker 2010; Li et al. 2014; Unamba et al. 2015; Nkhata et al. 2020; Wu et al. 2020; Delfini et al. 2021). This progress could greatly facilitate the pre-screening of large germplasm collections and accelerate the ability to exploit the mostly untapped resources available in public gene banks repositories (Kilian and Graner 2012; Ray and Satya 2014; Ariani et al. 2018). The improvements in sequencing technologies are fostering the development of novel conceptual frameworks for characterizing genetic diversity and understanding environmental adaptation in plant natural populations by integrating the availability of ecological and climatic data with genetic information (Schoville et al. 2012; Bragg et al. 2015; Ariani et al. 2018; Shim et al. 2021).
A Brazilian common bean diversity panel, including 91 landraces and 97 elite lines, was recently characterized using ~ 6,000 genetic markers obtained with DArT-Seq technology by Valdisser et al. (2017), who observed the usual divergence between Andean and Mesoamerican entries (Kwak and Gepts 2009) and a further diversification in the Mesoamerican gene pool among different grain types. More recently, two additional publications describe SNP diversity in specific Brazilian diversity panels of common bean. Delfini et al. (2021), analyzing the genetic diversity of some 50,000 SNPs obtained by GBS (CviAII) among – mostly Mesoamerican − 219 cultivars and breeding lines, observed the usual (Lobaton et al. 2018) strong divergence among Andean and Mesoamerican accessions. Furthermore, they observed a differentiation among groups with different seed coat colors. In a study of a sample of 110 entries consisting predominantly of landraces from more arid, northeastern Brazil, Elias et al. (2021) used GBS (CviAII) and observed that most accessions in the Mesoamerican group clustered according to grain types although hybridizations among clusters within this group were also notable, confirming the results of Burle et al. (2010).
In the current study, we analyzed a larger, more representative collection of Brazilian landraces (274 accessions) using GBS (CviAII)-based SNP markers, which can allow a more precise genetic characterization of this resource compared to SSR markers (Burle et al. 2010, 2011). In contrast, however, with Elias et al. (2021), possible correlations with climatic data were analyzed with bioclimatic variables relevant to the putative growing seasons, i.e., TempWet (BIO8 - Mean temperature of wettest quarter); PrecWet (BIO16 - Precipitation of wettest quarter); TempWarm (BIO10 - Mean temperature of warmest quarter); and PrecWarm (BIO18 - Precipitation of warmest quarter). Although bean growing seasons do not coincide precisely with these quarters, our modified bioclimatic variables provide a more precise approximation of the actual growing seasons. In addition, the integration of genotypic and climatic data for this collection allowed the identification of abiotic stress-resistant market classes and relevant molecular markers for potential improvement by breeding of this crop.
The current analysis of population structure showed a clear separation between the Mesoamerican and Andean gene pools, with great admixture among individuals of the Mesoamerican gene pool. A similar pattern was observed in this same collection previously using microsatellite markers (Burle et al., 2010), but also in other collections of Brazilian germplasm using different marker types (Blair et al., 2013; Valdisser et al., 2017; Delfini et al. 2021). The identification of five sub-populations in the present study agrees with the results of Burle et al. (2011), studying the same bean sample. The absence of admixed individuals between the Mesoamerican and Andean gene pools of common bean could be due to hybrid-weakness observed in F1 crosses between these two gene pools in both wild and domesticated individuals of this species (Gepts and Bliss, 1985; Koinange and Gepts, 1992; Hannah et al., 2007). The F1 hybrid weakness is the result of two gene-pool-specific loci that are carried mostly by race Mesoamerica, in the Mesoamerican gene pool, and race Nueva Granada, in the Andean pool (Singh et al., 1991). These two eco-geographic races are the main common bean races in Brazil (Burle et al., 2011). The low admixture level between Mesoamerican and Andean genotypes in Brazil could directly affect common bean breeding improvements especially for biotic stress resistance, since it hinders the ability to transfer pathogen resistance genes between the two gene pools, a common strategy used for this purpose (Miklas et al., 2006).
LD decay is a frequently used measure for detecting signatures of selection in specific genomic areas (Barton, 2011). This analysis showed a rapid LD decay for most chromosomes in the Brazilian collection analyzed, possibly facilitating the introgression of novel traits in selected individuals of this collection. However, chromosomes Pv09 and Pv11 showed extensive LD with a half-decay distance of more than 10 Mbp, suggesting a possible selection pressure on these two chromosomes. Indeed, the observed selection sweep could be the result of farmer selection for the most productive plants under standard growing condition in Brazil. One of the major constraints of agricultural production in tropical areas is soil acidity and the ensuing aluminum (Al) toxicity, which highly affects plant productivity by reducing root growth (Panda et al., 2009). Interestingly, more than half of the agricultural areas of Brazil are affected by Al toxicity (Olmos and Camargo, 1976); thus, Brazilian landraces could constitute a useful source of resistance for this stress. The possibility that genomic segments from chromosomes Pv09 and Pv11 could harbor resistance to Al toxicity is supported by a previous study using a bi-parental population of common bean that identified several QTLs affecting root traits in response to Al stress on these chromosomes (López-Marín et al., 2009). An additional possibility is that these chromosomes harbor factors for other traits relevant to Brazilian growing conditions. Elias et al. (2021) identified a major factor for putative drought stress tolerance on Pv09 with, as candidate gene, a Platz transcription factor potentially involved in drought stress tolerance in maize (Zenda et al. 2019).
sPCA is a multivariate test that focuses on the spatial pattern of genetic variability by integrating genetic and geographical data (Jombart et al. 2008). The application of this test can facilitate the identification of long- and short-range genetic patterns in georeferenced plant collection and can allow the correction for population structure in association analysis in the presence of spatial genetic structures (Pyhäjärvi et al. 2013). Previous analysis on wild and domesticated common beans across their native range in the Americas showed a high correlation between geographic and genetic distances, highlighting a significant global structure with physically close genotypes showing high similarities at the genetic level (Papa and Gepts 2003; Rodriguez et al. 2016; Ariani et al. 2018). In contrast, the Brazilian diversity panel analyzed in this study showed extensive local structure with genetically different genotypes sharing similar climatic characteristics (i.e., phenotypes). This phenomenon can reduce the ability to identify variants involved in environmental adaptation, thus drastically reducing the number of significantly associated SNPs. The high local structure observed in the current collection could indeed be one of the reasons why only a few SNPs were significantly associated with environmental data. However, this can also suggest a broad adaptation of the genotypes analyzed, and a high level of hybridization between genetically different varieties. Both results could be beneficial for breeding improvement of local Brazilian landraces and confirm previous results regarding genetic diversity and population structure into this collection (Burle et al. 2010).
Association analysis between genotypic and the environmental data estimated for the cultivation period showed few significantly associated SNPs when correcting for population structure. These results, nevertheless, indicate that signatures of selection from agricultural practices or environmental variations during growing season could be detected using landscape genomic methods also in domesticated common bean collections.
Association analysis with PrecWet identified two SNPs significantly associated with this bio-climatic variable, one located in Pv02 and the other in Pv08, tagging 19 genes in total. Among the gene models identified in Pv02, there are several that could be associated with water stress response based on their annotations (Table 2). As an example, Phvul.002G325900 (annotated as a poly(A) polymerase homologous of Arabidopsis: PAPS1) has been implicated in regulation of plant growth and flowering time in Arabidopsis (Vi et al. 2013; Czesnick and Lenhard 2016). Other interesting genes identified by this analysis include Phvul.002G326000, annotated as a homolog of Arabidopsis CC1, a gene associated with plant growth (mostly through cellulose biosynthesis) and stress resistance (Endler et al. 2015). The Phvul.002G326800 gene model, annotated as a cyclin-dependent kinase, may be related to regulation of cell cycle and plant growth (Umeda et al. 2005). Since the ability to finely regulate flowering time and plant growth is an essential feature for drought resistance or avoidance in this species (Beebe et al. 2013), these genes can constitute interesting candidates for further genetic studies and also for improving these traits in domesticated common bean.
Among the other gene models associated with PrecWet, we identified Phvul.002G326900, a homolog of Arabidopsis gene PMI1, and Phvul.002G326600, an ACC oxidase involved in ethylene biosynthesis (ACC oxidase) and a homolog of Arabidopsis ACO4. The Arabidopsis PMI1 is a gene involved in chloroplast movement in response to environmental signals (especially light-related), a mechanism that is essential for plant adaptive response to stress (Suetsugu et al. 2015) and has also been related to ABA signaling in response to water deficit (Rojas-Pierce et al. 2014). ACC oxidase is the last and rate-limiting enzyme for ethylene biosynthesis (Ruduś et al. 2013), a phytohormone essential in several processes related to plant growth and development, as well as in the response to environmental stresses (Wang et al. 2013; Iqbal et al. 2017). Due to their relevance in important plant physiological processes and stress response, the two genes identified by this analysis could represent interesting molecular markers for improving stress resistance in domesticated common bean.
Association analysis with PrecWarm identified two SNPs associated with this bioclimatic variable, located in Pv01 and Pv08, tagging in total 14 genes. Among the genes identified in Pv01, several of them could be related with responses to environmental variables. As example, the gene model Phvul.001G068300 is a homolog of Arabidopsis SPT4-2, which has been associated with the regulation of auxin response and transcriptional regulation of auxin-related genes in Arabidopsis (Dürr et al. 2014). Auxin is an essential phytohormone involved in multiple developmental and environmental responses (Bielach, Hrtyan, and Tognetti 2017). Other genes associated with PrecWarm identified in this study were Phvul.001G068400, a homolog of the Arabidopsis CYTOCHROME C OXIDASE DEFICIENT 1 (COD1) gene, and Phvul.001G068600, annotated as the Arabidopsis receptor-like kinase LRK10. COD1 is a PPR gene involved in RNA-editing in mitochondrion and is essential for the correct development of the respiratory complex in plants (Dahan et al. 2014), while LRK10 has been related to ABA signaling and drought response in that species (Lim et al. 2015). Due to their involvement in essential cellular processes and drought response, these two genes might also represent interesting candidates for further studies in this species.
Association analysis with TempWet identified two SNPs significantly associated with this bio-climatic variable tagging in total 15 genes, one of them located in Pv04 and the others in Pv09. Based on their annotations, some of the genes identified could be involved in a response to environmental stresses, in particular to temperature and light variations. One of the genes identified (Phvul.009G085800) is annotated as a peroxidase protein. Members of this gene family are involved in several biological processes, mostly as antioxidant enzymes for hydrogen peroxide degradation (Veitch 2004; Pandey et al. 2017). The other genes identified by this association analysis are involved in the regulation of chloroplast development at molecular level. In particular, Phvul.004G106800, a homolog of Arabidopsis ARC11, is involved in chloroplast division (Fujiwara et al. 2008) while the homolog of Phvul.009G086500 (PGR3) is a nuclear gene involved in transcriptional regulation of chloroplast genes (Yamazaki et al. 2004). The other gene model identified by this association analysis (Phvul.009G086000) is homologous to Arabidopsis TIC55-II, a gene involved in chloroplast protein import and chlorophyll degradation in that plant (Boij et al., 2009; Hauenstein et al., 2016). Due to the central role of chloroplasts in response and adaptation to external stimuli in plants, especially light- and water-related signals (Spetea et al. 2014; Dutta et al. 2017; Tamburino et al. 2017), the genes identified by this association analysis would require further investigations for evaluating their role in stress response in common bean.
Association analysis with TempWarm identified two significantly associated SNPs located on Pv03, tagging in total 10 genes. Among these genes, some of them could be related to environmental stress responses, like heat stress. As an example, the gene model Phvul.003G154500, a homolog of Arabidopsis SHOOT REDIFFERENTIATION DEFECTIVE 2 (SRD2), is involved in cell proliferation and in lateral root development in Arabidopsis (Ohtani and Sugiyama 2005; Ohtani, Demura, and Sugiyama 2010); a knockdown mutation of this gene causes aberrant growth phenotypes under high temperatures in this species (Ohtani and Sugiyama 2005). Another gene model identified in this genomic interval is Phvul.003G154600, homolog of Arabidopsis CESA7, a gene involved in cell wall formation in that species (Bosca et al. 2006). Due to the importance of plant cell walls in heat stress response (Wu et al. 2018), this gene could represent an interesting marker for improving heat stress in common bean. Among the additional genes identified in this interval there are Phvul.003G154700, homolog of Arabidopsis SUPPRESSOR OF VARIEGATION 9 (SVR9), and Phvul.003G155400, homolog of Arabidopsis BRI1 SUPPRESSOR 1 (BRS1). SVR9 is essential for chloroplast development and plant survival; it can also affect leaf development in Arabidopsis (Zheng et al. 2016). BRS1 is a suppressor of the Brassinosteroid-Insensitive 1 (BRI1) gene, an essential gene involved in plant growth and development in this species (Li et al. 2001). Due to their involvement in plant growth and development in Arabidopsis, the two common bean homologs identified in this study could represent interesting candidate genes for further genetic studies and validation.
The integration of environmental and genotypic data can facilitate the identification of both useful molecular markers for breeding improvement and suitable donor germplasm of stress resistance that can be used as parents for varietal improvement (Schoville et al. 2012; Henry and Nevo 2014; Berny Mier y Teran et al. 2020). The comparison of climatic variables among Brazilian market classes showed that the landraces of Mulatinho type were cultivated in areas with significantly lower rainfall and higher temperature in Brazil, confirming earlier results obtained with a similar approach using microsatellite markers (Burle 2008). We hypothesize that initial growers of Mulatinho types may have chosen this type because of its relative drought resistance regardless of the seed color. Subsequently, a consumer preference developed for the specific Mulatinho seed color and became the main reason for cultivating this bean type regardless of its drought adaptation. Yet, selection for seed color may have entailed indirect selection for heat or drought tolerance because of linkage disequilibrium, further enhanced by the predominantly autogamous reproductive system of common bean (Papa and Gepts 2003; Papa et al. 2005; Kwak and Gepts 2009). Seed color and color pattern are conditioned by some 7–10 genes distributed over most of the bean chromosomes (Bassett 2007; McClean et al. 2002). Selection for seed color in common bean by farmers is effective as demonstrated by the tremendous biocultural diversity displayed by domesticated common bean (e.g., Kuzay et al. 2020) and shown recently in places as diverse as Mexico (Worthington et al. 2012) and Uganda (Wilkus et al. 2019). The current results confirm the extensive levels of LD in the bean genome identified previously (Kwak and Gepts 2009). These results suggest that some of the SNPs identified by GWAS analysis could be related to environmental adaptation or that there can be linkage between loci differentiating market classes and those controlling mechanisms of environmental adaptation. Nevertheless, caution should be exercised because spurious associations can occur during GWAS analysis (Shen and Carlborg 2013).