Genome-wide association study for grain mineral content in a Brazilian common bean diversity panel

QTNs significantly associated to nine mineral content in grains of common bean were identified. The accumulation of favorable alleles was associated with a gradually increasing nutrient content in the grain. Biofortification is one of the strategies developed to address malnutrition in developing countries, the aim of which is to improve the nutritional content of crops. The common bean (Phaseolus vulgaris L.), a staple food in several African and Latin American countries, has excellent nutritional attributes and is considered a strong candidate for biofortification. The objective of this study was to identify genomic regions associated with nutritional content in common bean grains using 178 Mesoamerican accessions belonging to a Brazilian Diversity Panel (BDP) and 25,011 good-quality single nucleotide polymorphisms. The BDP was phenotyped in three environments for nine nutrients (phosphorus, potassium, calcium, magnesium, copper, manganese, sulfur, zinc, and iron) using four genome-wide association multi-locus methods. To obtain more accurate results, only quantitative trait nucleotides (QTNs) that showed repeatability (i.e., those detected at least twice using different methods or environments) were considered. Forty-eight QTNs detected for the nine minerals showed repeatability and were considered reliable. Pleiotropic QTNs and overlapping genomic regions surrounding the QTNs were identified, demonstrating the possible association between the deposition mechanisms of different nutrients in grains. The accumulation of favorable alleles in the same accession was associated with a gradually increasing nutrient content in the grain. The BDP proved to be a valuable source for association studies. The investigation of different methods and environments showed the reliability of markers associated with minerals. The loci identified in this study will potentially contribute to the improvement of Mesoamerican common beans, particularly carioca and black beans, the main groups consumed in Brazil.


Introduction
Of the leguminous plants consumed by humans, the common bean (Phaseolus vulgaris L.) has the highest worldwide demand and is widely cultivated, accounting for approximately 41.712 million hectares annually (RAWAL and NAV-ARRO 2019). It is considered a staple food for the populations of numerous countries, particularly in Latin America and Eastern and Southern Africa (Broughton et al. 2003). Per capita consumption varies depending on consumer preferences, countries, and regions, reaching up to 66 kg capita −1 year −1 in some African and American countries and ranging from 5 to greater than 10 kg capita −1 year −1 in the United States and Brazil, respectively (Blair 2013).
Common beans have high carbohydrate content and are known to be an excellent source of protein and micronutrients such as iron (Fe), zinc (Zn), thiamine, and folic acid (Petry et al. 2015). Although a cup of common beans can Communicated by Elena Bitocchi. supply approximately 25% and 15% of the daily reference values of Fe and Zn, respectively, genetic improvement can increase the contents of these elements in common bean grains by two-or even threefold (Cichy et al. 2009).
The process of improving nutrient contents in crops is referred to as biofortification, which is known as a sustainable and economic strategy to address malnutrition in developing countries, considering its focus on improving the staple foods consumed on a daily basis (Dwivedi et al. 2012). Biofortification can be performed using two strategies: genetic biofortification, which is based on traditional methods of plant breeding or genetic engineering, and agronomic biofortification, which is based on the optimized use of fertilizers (Cakmak 2008). Genetic biofortification is considered the most effective option in terms of costs and efficiency, given that as inbred lines with higher nutrient contents are developed, they can be used on a large scale without requiring further investment (Cu et al. 2020).
Several studies have shown a wide genetic variability in the nutritional characteristics of common beans, thereby indicating a considerable potential for increasing the content of Fe, Zn, and other minerals in this crop (Beebe et al. 2000;Islam et al. 2002;Pinheiro et al. 2010;Silva et al. 2012;Ribeiro et al. 2014;McClean et al. 2017;Delfini et al. 2020). Moreover, genetic mapping studies on biparental populations have identified certain quantitative trait locus (QTL) regions associated with different nutritional characteristics, notably Fe and Zn contents. Andean and Middle American intra-and inter-gene pool populations were explored in these studies. For Andean intra-gene pool populations, Cichy et al. (2009) reported 23 QTLs for seed Fe and Zn colocalized on three linkage groups, being two QTLs, one for Fe and other for Zn, found at the same interval, explaining 36 (Fe) and 39% (Zn) of the phenotypic variation. Blair et al. (2011) also exploring an Andean population found 9 QTLs for Fe and Zn on five linkage groups, with 18.6-27.8% of the variance explained. For a Mesoamerican population (Blair et al. (2010) found mostly novel QTLs compared to previous studies that explained up to 21.3% for Fe and 38.4% for Zn of the phenotypic variance, while for an inter-gene pool population, Blair et al. (2009) found QTLs explaining up to 48% of phenotypic variance. However, few genome-wide association studies (GWAS) have analyzed this subject.
GWAS are an extension of QTL association mapping studies between characteristics of interest and molecular markers in large population samples. Moreover, statistical genotype and phenotype associations are identified in unrelated individuals and are only detected when the marker and QTL are in strong linkage disequilibrium (LD). This method requires an existing linkage map or a reference genome of the investigated species to facilitate marker positioning (Stapley et al. 2010). In addition, a large number of markers are required for good genome coverage, which has been made possible by the recent emergence of next-generation sequencing (NGS) (Edwards and Batley 2010).
Several techniques have been developed to generate large sets of single nucleotide polymorphisms (SNPs), such as high-density SNP arrays available for several crops, including common beans. However, these arrays are often designed based on a limited number of elite genotypes and can produce biased data when used to characterize non-elite genotypes (Rasheed et al. 2017;Raggi et al. 2019). The NGS technique is a promising alternative that produces a large number of SNPs, is not biased, and can be performed at a low cost. Genotyping-by-sequencing (GBS) is a technically direct and multiplexed approach that can be used to sequence subsets of a genome based on restriction enzymes for rapid, specific, and reproducible results, and the large number of SNPs generated provides a deeper understanding of population structure and genetic diversity, in addition to being suitable for GWAS and even for the discovery of candidate genes (Stansell et al. 2018).
Another important aspect of GWAS concerns the mathematical models used to associate the marker with the target trait. Advances in quantitative molecular genetics enabled the development of a large number of association mapping methods for the genetic dissection of complex plant characteristics. However, most of the previous studies on common beans used single-locus GWAS, such as general linear (GLM) and mixed linear (MLM) models. These models assess the significance of the marker-trait association considering one marker at a time, resulting in significant associations based on rigorous multiple test correction [e.g., Bonferroni and false discovery rate (FDR)]. Due to the high significance required, these methods detect QTNs that have a large effect only; they are unable to identify polygenes with small effects for complex characteristics (Lan et al. 2020).
To solve this problem, alternative multi-locus methods, including mrMLM (Wang et al. 2016), FASTmrMLM (Tamba and Zhang 2018), pLARmEB , and ISIS-EM-BLASSO (Tamba et al. 2017), have been proposed, with the advantage that Bonferroni correction for multiple tests is not necessary. These methods differ from other multi-locus methods in that they comprise two steps. The first step considers the SNP effect as random, with all potentially associated markers being selected using a random-SNP-effect MLM with a modified Bonferroni correction for the significance test. In the second step, all markers are placed in a single model, and all effects other than zero are detected based on a likelihood ratio test to identify QTNs (Chang et al. 2018).
Given that the tools used in GWAS can identify, characterize, and develop molecular markers related to the allelic variation of nutritional characteristics of common beans and considering the socioeconomic importance of the crop, studies investigating the genetic architecture of these traits are 1 3 fundamental (Katuuramu et al. 2018). Thus, the objective of the study was to identify regions related to the nutritional content of common bean grains using multi-locus methods in a Brazilian Diversity Panel (BDP) comprising Mesoamerican accessions adapted to tropical conditions. These findings will provide useful genetic information to improve the biofortification of common bean crops.

Genetic material and experimental design
The Mesoamerican common bean diversity panel used in this study was established in order to group accessions adapted to tropical conditions and that represented the variability of the Mesoamerican domesticated gene pool of the species present in Brazil (Delfini et al. 2021). This panel includes cultivars, landraces, and inbred lines maintained in the germplasm bank of the Instituto de Desenvolvimento Rural do Paraná (IDR-Paraná). For the purposes of the present study, a BDP sub-set comprising 178 Mesoamerican common bean accessions was evaluated (Table S1).
The experiments were conducted during the 2018 rainy season (sowing between September and October) at different IDR-Paraná Research Stations in the cities of Londrina (LDA) (23º22′8″S; 51º9′48″O), Ponta Grossa (PG) (25º09′11″S; 50º09′22″O), and Guarapuava (GUA) (25º23′51″S; 51º32′36″O), all located in the state of Paraná, Brazil. The experiments were based on an incomplete block design with replicates in sets. Five sets with two replicates were used, each with 50 treatments (46 accessions and four controls). The field experiment was design for accommodate 230 accessions, but, due to field losses or accessions removed by minimum genotype quality, only 178 were effectively used in this study. For each plot, common beans were planted in four 2.00-m-long rows, with a between-row spacing of 0.50 m and a density of 12 plants per linear meter. Fertilization and pest, disease, and invasive plant control were performed according to the technical recommendations for the crop.

Phenotyping for nutrient content in grains and statistical analysis
After physiological maturity, 100 g samples of disease-free seeds with no visible physical or insect-related damage from each experimental plot were collected. Before laboratory analyses, the samples were washed sequentially with running water, 0.01 M HCl solution, and distilled water to prevent contamination by soil particles. Subsequently, the seeds were oven-dried at 60 °C for 24 h and then ground to a fine powder. The flour obtained was packed in plastic bags with hermetic insulation. Phosphorus (P), potassium (K), calcium (Ca), magnesium (Mg), copper (Cu), manganese (Mn), sulfur (S), Zn, and Fe contents in the flour were determined using the method described by Miyazawa (1999). Initially, 0.4 g samples of flour were subjected to nitroperchloric digestion with a 3:1 solution of HNO 3 / HClO 4 in 80-mL digester tubes, and mineral concentrations were subsequently determined using an Inductively Coupled Plasma Optical Emission Spectrometer (ICP-OES, Optima 83,000, PerkinElmer, Waltham).
An analysis of variance (ANOVA) of the phenotypic data was performed using the PROC GLM function of the SAS software (SAS Institute 2000) according to the following statistical model: Yijkl = μ + Ai + Sj + ASij + R/ ASkij + G/Slj + AG/Smlj + eijklm, where μ is the mean; Ai is the fixed effect of the i-th environment; Sj is the effect of the j-th "set"; ASij is the effect of the interaction between environments and "sets"; R/ASkij is the effect of the k-th repetition within the interaction between the i-th environment and the j-th set; G/Slj is the random effect of the l-th genotype within the j-th "set"; AG/Smlj is the effect of the interaction of environments and accessions within the j-th "set"; and eijklm is the experimental error (Hallauer and Miranda Filho 1988). The means adjusted for each accession in each of the environments and across the environments were also obtained using the LSmeans option in the GLM procedure of the SAS software. Heritability (h 2 ) was estimated using the equation h 2 = σ 2 G /σ 2 P , in which the genotypic (σ 2 G ) and phenotypic (σ 2 P ) variances were estimated using the following equations: σ 2 G = (QM G -QM E )/ ra and σ 2 P = QM G /ra, where QM G is the mean square of genotype within "sets," QME is the mean square of the error, r is the number of replicates, and a is the number of environments. Descriptive analysis was performed using the PROC UNIVARIATE function of the SAS software (SAS Institute 2000). A graphic representation of Pearson correlations was obtained using the corrplot package implemented in the R software (Wei and Simko 2017; R Core Team 2020).

Genotyping-by-sequencing
Genotyping was performed based on the GBS technique using the enzyme CviAII (Ariani et al. 2016). Data processing was performed as described in Delfini et al. (2021), removing individuals containing less than 30% genotyped positions and SNPs with a minor allele frequency lower than 0.05, heterozygous SNPs were treated as missing data. The data were imputed using the Beagle software v.5 (Browning et al. 2018), and only SNPs anchored to chromosomes in the common bean reference genome (Schmutz et al. 2014) were used.

Population structure and linkage disequilibrium
SNPs were filtered for LD using the indep-pairwise function of the PLINK software (Purcell et al. 2007), and only the SNPs with an LD value ≤ 0.2 were maintained. Analyses of population structure, principal components, and clustering were performed using the filtered data. The population structure was inferred using the Bayesian clustering algorithm in Structure v2.3.4 software (Pritchard et al. 2000). The model comprised an admixture with 100,000 burn-ins, 100,000 Monte Carlo Markov chains, and 10 replicates for hypothetical numbers of subpopulations (K) between 1 and 10. The statistical parameter ΔK (Evanno et al. 2005) was used to determine the number of groups. Only accessions with a membership coefficient ≥ 0.6 were assigned to a genetic group, whereas those with coefficients < 0.6 were placed in a group designated as a mixture. Principal component analysis (PCA) was performed using the snpgdsPCA function of the SNPRelate package (Zheng et al. 2012) of the R software. A neighbor-joining tree was constructed using the TASSEL 5.0 software (Bradbury et al. 2007).
The LD between SNPs was estimated using the LdcorSV package (Desrousseaux et al. 2017) of the R software, which can be used to correct LD biases caused by the population structure and relationship. In addition to the conventional r 2 , r 2 corrected for population structure (r 2 s ), r 2 including relationship (r 2 v ) and r 2 including population structure and relationship (r 2 vs ) were calculated. The results obtained using the STRU CTU RE software for K = 2 were used as population structure, and a kinship matrix was calculated for relationship using the rrBLUP package (Endelman 2011) of the R software. LD decay was calculated using a nonlinear method (Hill and Weir 1988) and adjusted using the nls function in the R software.

Genome-wide association studies
A total of four multi-locus GWAS implemented in the mrMLM.GUI 4.0 software (Ya-Wen et al. 2019) were used to detect significant QTNs for the target traits: mrMLM, FASTmrMLM, pLARmEB, and ISIS-EM-BLASSO. The population structure (Q) and the kinship matrix (K) were included in the model to reduce the occurrence of false positives and improve analytical power. The STRU CTU RE software results (K = 2) were used for Q, and a kinship matrix was calculated using the mrMLM.GUI 4.0 software. For all methods, the parameters used were the standards, and a logarithm of the odd (LOD) score ≥ 3 was used as the critical value for significant associations. The phenotypic data used for GWAS were the adjusted means of each of the three environments and the overall adjusted mean (LDA, PG, GUA, and LSmeans). To obtain more accurate results, only QTNs showing repeatability (i.e., detected at least twice using different methods or in different environments) were considered significant and were used in searches for favorable alleles and candidate genes.

Identification of favorable alleles
For each QTN, all BDP accessions were initially divided into two groups based on the QTN genotype, and alleles associated with a positive effect on the phenotype (i.e., those associated with an increased mineral content) were identified. Subsequently, the number of favorable alleles for each accession was identified, and the association between the accumulation of these superior alleles in the same accession and nutritional contents was verified. Boxplots and a heatmap generated using the pheatmap package (Kolde 2019) of the R software were produced for better result visualization.

Search for candidate genes
The search for potential candidate genes focused on the QTNs detected using multiple methods or in multiple environments. The search radius (physical distance) was determined according to LD half-decay corrected for the population structure and relationship (r 2 vs ). Genes present in the association region were identified based on the annotation of the Phaseolus vulgaris v.2 common bean reference genome published on the Phytozome v10.3 website (http:// phyto zome. net). Subsequently, genes with known putative functions based on the Gene Ontology annotation (GO, http:// www. geneo ntolo gy. org/) related to the traits of interest were selected as candidate genes.

Nutritional characterization
The contents of nine minerals in the grains of common bean accessions from the BDP cultivated in three environments were determined. The descriptive statistics (mean, maximum, minimum, standard deviation, asymmetry, and kurtosis) and heritability values (h 2 ) are shown in Table 1, and the frequency distributions of the contents of these minerals in each of the three environments are presented in Fig. 1.
The ANOVA (Table S2) revealed a significant effect in all characteristics evaluated with regard to genotype and environmental variation; however, no Genotype × Environment (GxE) interactions for the evaluated traits were detected. Coefficients of variance (CV) (  The mean P, K, Ca, S, Zn, and Fe contents were higher in common beans cultivated in LDA, whereas Mn and Mg were detected at higher concentrations in PG and GUA, respectively. Although the mean Cu content showed no substantial variation between the three locations, it was the mineral with the greatest variation between the minimum and maximum values among the accessions (up to 2.22-fold). Comparatively, the mean variation found for P, K, Mg, S, and Zn contents ranged between 1.55-and 1.78-fold, whereas Ca, Mn, and Fe contents had an approximately two-fold variation. No marked differences in the minerals content were observed between the two principal commercial groups present in this study, black and carioca, as already reported in our previous study (Delfini et al. 2020).
Correlation analysis showed significant effects for most minerals. A positive correlation was detected between Fe and Cu (0.67, P ≤ 0.05); however, they were negatively correlated with P, K, Mg, and S. Similarly, positive correlations were detected among P, K, Ca, Mg, Mn, and S contents, ranging from 0.31 (Ca-P) to 0.79 (P-S) (Fig. S1).

Genotyping, population structure and linkage disequilibrium
After filtering the SNPs obtained by GBS, 25,011 high-quality SNPs were obtained. These SNPs were used for subsequent GWAS, and their distribution is shown in Fig. 2. After removing high-LD SNPs, a total of 707 remained for use in population structure studies. Population structure analysis performed using the ADMIXTURE method indicated a K value of 2, corroborating PCA and clustering analysis findings based on the neighbor-joining method (Fig. S2). Thus, two highly distinct groups were identified: one composed predominantly of carioca common bean commercial accessions and the other composed of black bean accessions. The third group included 12% of the accessions, for which the membership coefficient in either of the aforementioned two groups was less than 0.6, and was classified as a mixture group.
LD corrected for relationship (r 2 v ) and population structure and relationship (r 2 vs ) showed half-decay values well below those obtained for conventional r 2 and those corrected only for population structure (r 2 s ). The LD half-decay values for r 2 , r 2 s , r 2 v , and r 2 vs were 1,414.41, 1,223.41, 296.76, and 296.54 kb, respectively. Thus, the search distance for candidate genes was set according to the distance determined based on r 2 vs (Fig. S3).

Genome-wide association studies
Altogether, 238 unique QTNs were significant for the nine minerals studied, of which 48 were detected at least twice by one of the four multi-locus methods used or in the different environments analyzed (LDA, PG, GUA, and LSmeans). These QTNs were considered promising and were subjected to a more thorough analysis (Fig. 3, Fig.   Fig. 2 Density distribution of single nucleotide polymorphisms (SNPs) identified by genotyping-by-sequencing (GBS) from the Brazilian Diversity Panel (BDP) along the common bean genome at a 1-Mb window size 1 3 S4 and S5). The QTNs presenting repeatability are shown in Table 2. Results revealed differences in the numbers of QTNs associated with the different minerals; 7, 4, 4, 11, 2, 6, 5, 7, and 2 QTNs were detected for P, K, Ca, Mg, Cu, Mn, S, Zn, and Fe, respectively. Moreover, four pleiotropic QTNs were detected (i.e., they showed a significant association at least two times for more than one trait) with one QTN shared between P and Mg, one between Ca and Mn, and two between K and Mg. In addition to the pleiotropic QTNs, 12 QTNs overlapped considering the genomic region around the QTN defined according to LD (± 296 kb), with the two of them associated with Cu and P located on chromosomes Pv3 and Pv4, respectively, which overlapped with the two pleiotropic QTNs identified for K-Mg. The other overlapping QTNs were associated  with Mg-Zn, Mg-S, Mn-Fe, P-Cu, and Zn-Zn located on chromosomes Pv1, Pv8, Pv9, Pv9, and Pv11, respectively. QTNs were detected on all 11 common bean chromosomes, with the largest number located on Pv09 and Pv07, accounting for 10 and 7 QTNs, respectively. The phenotypic variation explained (PVE) by the different QTNs varied from very low values close to 0 to 13.49%. Although values close to zero were also observed for QTN effects, the effects varied from -4.78 to 3.15. Different PVE and QTN effects results were obtained for the same QTN using different methods, with more significant results for PVE. For example, some QTNs detected using the FASTmrMLM method showed an effect and PVE close to zero.
For all QTNs detected using the four methods, the pLARmEB method identified the largest number of significant SNPs, followed by the ISIS-EM-BLASSO, mrMLM, and FASTmrMLM methods. However, considering only the QTNs that showed repeatability, the ISIS-EM-BLASSO method stood out, followed by FASTmrMLM, pLARmEB, and mrMLM. The FASTmrMLM method showed 100% efficiency (i.e., all SNPs detected using this method showed repeatability), whereas the mrMLM, ISIS-EM-BLASSO, and pLARmEB methods showed 73%, 57%, and 19% efficiency, respectively. Although the pLARmEB method detected the highest initial number of SNPs, many of these SNPs were discarded at subsequent stages of the study. The GUA environment detected the highest initial number of significant SNPs, followed by PG, LSmeans, and LDA. However, PG presented more stable SNPs, followed by LSmeans, LDA, and GUA. Some environments showed no significant SNPs for certain characteristics (Fig. 3).

Favorable alleles
Favorable alleles were identified for each of the QTNs that showed repeatability (i.e., alleles associated with increased mineral content), followed by an analysis of whether the accumulation of these alleles reflected an increasing mineral content in the genotypes (Fig. 4). Results showed a gradual increase in P, K, Ca, Mg, Mn, S, Zn, and Fe contents in common bean grains according to the number of favorable alleles present in the genotype. Figure 5 shows a clustering of BDP accessions according to the contents of the nine minerals studied, with lower nutrient content groups presenting fewer favorable alleles (< 20) and higher nutrient content groups presenting more favorable alleles (> 25). Forty-three accessions had 50% or more favorable alleles, of which 25 accessions were common beans from the carioca commercial group and 11 were from the black commercial group; the IAPAR 16 cultivar was characterized by the highest percentage (62%) of favorable alleles. The number of favorable alleles in the 31 inbred lines developed by IDR-Paraná ranged between 13 and 28, with nine showing a favorable allele percentage of at least 50%.

Potential candidate genes
Initially, a search was performed for genes in the genomic regions identified around QTNs. Using GO annotation, the genes were grouped into three functional categories related to cellular components, biological processes, and molecular function. In the cellular component category, the main functions identified were related to the membrane and its integral components, whereas in the biological processes category, the functions were related to oxido-reduction processes, protein phosphorylation, transcription regulation, metabolic processes, transmembrane transport, and metal ion transport. In the molecular function category, protein binding, ATP binding, DNA binding, nucleic acid binding, heme binding, zinc biding, iron binding, protein kinase activity, and metal ion binding were some of the main functions of the identified genes.
With the exception of K and Mn, all minerals had at least one candidate gene with a nutrient-related function identified (Table 3). The largest number of candidate genes was identified for P and Zn (13 and 10, respectively), whereas only a single candidate gene was associated with S. For some of the identified candidate genes, the QTN was located within the gene itself, whereas others were detected at distances of between 6.6 and 287 kbp from the QTN.

Discussion
The discovery of genes or genomic regions associated with the nutritional contents of common bean grains may accelerate the development of new biofortified cultivars. In this context, common bean germplasms preserved worldwide represent an invaluable genetic resource for identifying genomic regions associated with nutrient accumulation in grains based on associative mapping techniques. In the present study, GWAS were conducted for nine nutrients based on the screening of a Mesoamerican common bean diversity panel, including accessions adapted to Brazilian climatic conditions and that represent the most widely consumed e Environment: 1-LDA, 2-PG, 3-GUA, 4-LSmeans f Methods: 1-FASTmrMLM, 2-ISIS-EM-BLASSO, 3-mrMLM, 4-pLARmEB; pleiotropic QTNs are shown in bold commercial groups in the country, the black and carioca groups.
The ANOVA conducted to assess the relationships between common bean accessions and environments with regard to the nine target minerals showed that the BDP can be used in GWAS and that different QTNs can be identified in different environments. The heritability values (h 2 ) obtained tend to indicate a high influence of the environment on these traits. However, heritability and genetic models may vary among different studies, as relevant calculations depend on the study population, experimental design, and environmental conditions (Lynch and Walsh 1998).
The distinction of two groups broadly corresponding to the commercial groups of black and carioca common beans was already used in several studies (Valdisser et al. 2016;Gioia et al. 2019;Delfini et al. 2021). In contrast to conventional QTL mapping, association mapping is based on unstructured populations; consequently, it is fundamental to consider the population structure and relationship between individuals to prevent false associations due to the confounding effects of population mixture (Oraguzie et al. 2007). This consideration may be particularly applicable in the case of diversity panels, which are assembled from germplasm collections, inbred lines obtained from breeding, and cultivars already released on the market. Thus, selecting the appropriate association method is important and must consider the population structure and the relationship between individuals (Shi et al. 2011).
Regarding to the distribution of the SNPs obtained by GBS, it was observed a higher density toward the ends of the chromosomes, and similar results was obtained in other bean materials (Ariani et al. 2016;Garcia et al. 2021). Mapping resolution and statistical power are the main aspects considered in GWAS, with the former being strongly influenced by LD. Under high-LD conditions, a lower density of markers is necessary in target regions with a high potential for detecting markers strongly associated with polymorphisms of the genes of interest, even if physically distant (Shi et al. 2011). In this regard, a high LD is observed in predominantly autogamous crops (Li et al. 2016) which enables GWAS in common bean populations. In the present study, the LD half-decay corrected for the population structure and kinship (r 2 vs ) obtained was 296.5 kb, which is relatively higher than the one previously obtained (249 kb) in studies conducted with all Mesoamerican BDP individuals (Delfini et al. 2021), thereby indicating that LD can vary depending on the study population. Considering the LD decay distance in the study population, regions within a distance of 296.5 kb on either side of the detected QTNs were searched for candidate genes.
A comparison of the four multi-locus methods used indicated that the ISIS-EM-BLASSO method identified the highest number of co-detected QTNs, corroborating the findings of previous studies (Cui et al. 2018;Ma et al. 2018;Misra et al. 2018;Zhang et al. 2018b;Fang et al. 2020). Although using the pLARmEB method initially detected the greatest number of QTNs, few of these contributed significantly to heritability, as previously reported by Li et al. (2018) and Lü et al. (2018). Moreover, this method was the least efficient for detecting reliable QTNs. Furthermore, although the FASTmrMLM method presented 100% efficiency (all detected SNPs showed repeatability), the associated PVE and QTN effects were close to zero. Although all four assessed methods involve a combined two-step approach, the different number of QTNs identified may be due to different screening and estimation models associated with each method (Zhang et al. 2018a). Therefore, to obtain more reliable results, only the QTNs identified with more than one method or in more than one cultivation environment were considered.
Several QTNs detected at least twice using different methods or in different environments were identified and were thus considered reliable. The co-detection of these QTNs indicates the reliability and complementarity of using different methods and environments and also reveals that certain QTNs show stable behavior in different environments. Moreover, the combined use of multiple statistical methods has the advantage of facilitating the identification of smalleffect QTNs for characteristics with a complex genetic basis and low heritability (He et al. 2019a).
The detection of pleiotropic QTNs or overlapping genomic regions for different nutrients reinforces the results obtained based on correlation analysis, which revealed positive correlations between most minerals, thereby indicating that the accumulation or transport of different nutrients in grains can be associated with common mechanisms and/or genetic factors. These findings corroborate the findings of previous studies that reported positive correlations among minerals in other crops, including wheat (Cu et al. 2020) and millet (Jaiswal et al. 2019), and in common beans (Delfini et al. 2020).
The large number of QTNs identified may confirm the hypothesis that nutrient accumulation and transport in grains are complex characteristics controlled by many genes or gene families that each have a small effect and that can be detected by the multi-locus models used. Moreover, the results indicate that it is possible to detect QTNs in diversity panels comprising accessions that are routinely developed and used in breeding programs, thereby facilitating the incorporation of these QTNs during the improvement stages (Shi et al. 2011).
In addition to being useful for the identification of genes, the implementation of GWAS can also facilitate the identification of genomic regions associated with the traits of interest, and marker information can be used in breeding programs to efficiently incorporate specific loci in elite germplasms (Collard et al. 2005). The QTNs identified in the present study could play a role in this regard, since the accumulation of favorable alleles is associated with a gradual increase in content of the nutrients studied. Results show that these favorable alleles probably have an additive effect on nutrient accumulation in grains (Zhang et al. 2018b). Increased Fe and Cu contents were not detected, which could be attributed to the fact that few associated QTNs were identified for these elements. While in this study two QTNs were found to be associated with Fe, Gunjača et al. (2021) didn't found any QTN associated with this element. Silva et al. (2013) reported that for Zn, additive allele interaction alone explains the content variation, whereas for Fe the occurrence of dominance is also important, and the same may be true for Cu. Thus, it seems that the nutrient content of grains can be improved by increasing the number of favorable alleles in inbred lines and cultivars that already show a good agronomic performance, and marker-assisted selection (MAS) can serve as a useful tool for this purpose.
Although some of the QTNs identified in the present study were characterized as having a small effect, these may prove useful for genomic selection (GS) (He et al. 2019a). In GS models, instead of using a complete panel of randomized SNPs, the use of markers associated with traits of interest may decrease the number of markers and, consequently, the costs of genotyping large breeding populations (Lan et al. 2020). Moreover, the use of specific trait-associated markers in GS models can improve prediction accuracy by reducing background noise in model construction (He et al. 2019b;Ali et al. 2020).
The availability of a genome sequence and gene annotation for common beans enabled the search for genes located in genomic regions around the QTNs that may be associated with nutrient content variation in grains. Potential candidate genes were identified for all nutrients, except K and Mn, with more than one gene identified in most cases and with genes distributed in more than one chromosome for each trait. These results reinforce the quantitative inheritance of these traits, which is well documented for Fe and Zn, elements that tend to be the main focus of studies investigating nutrient contents in common beans ). Phvul.004G000800 Magnesium ion binding S08_59042075 Phvul.008G243200 Magnesium ion binding S11_1201121 Phvul.011G013600 Magnesium ion binding Cu S03_1013759 Phvul.003G012600 Electron carrier activity-Cupredoxins-blue copper proteins S09_36738755 Phvul.009G247600 Copper transport protein ATOX1-Related S S01_1213244 Phvul.001G014532 Iron-sulfur cluster binding Zn S01_46657405 Phvul.001G208400 Zinc ion binding Phvul.001G212400 Zinc ion binding S05_39601221 Phvul.005G166300 Zinc ion binding S09_29892463 Phvul.009G197800 Zinc ion binding Phvul.009G198300 Zinc ion binding S10_818442 Phvul.010G005300 Zinc ion binding S11_18198447 Phvul.011G115800 zinc ion binding Phvul.011G115900 zinc ion binding S11_50117148 Phvul.011G187200 Zinc ion binding S11_50159956 Phvul.011G190200 Zinc ion binding Fe S08_2828543 Phvul.008G034400 Iron ion binding Phvul.008G034500 Iron ion binding S09_13611567 Phvul.009G081633 Iron ion binding Phvul.009G081700 Iron ion binding Many of the genes detected in the present study were identified as having the molecular function of protein binding, which may be associated with transduction signaling and the transcription factors that modulate gene expression (Villordo-Pineda et al. 2015). Moreover, other genes not directly related to the nutrient with which the QTN was associated were identified, possibly indicating correlations in the mechanisms of absorption or transport between the minerals analyzed. Examples of detected functions include calmodium binding, phospholipid binding, Cu ion transmembrane transport activity, and 4 Fe-4 S cluster binding. With respect to K, for which none of the detected QTNs were associated with genes directly related to the nutrient, two genes associated with potassium ion transmembrane transport and potassium ion binding functions were identified on chromosomes Pv9 and Pv4, in genomic regions near the QTNs detected for P and Mg, respectively.
Although common beans are rich in multiple minerals, Fe and Zn are the main nutrients studied in this crop, for which numerous QTLs were already described. However, most previous studies focused on common beans of Andean origin or inter-gene-pool populations obtained by crossing Andean and Mesoamerican genotypes; to date, few studies have exclusively focused on panels of Mesoamerican origin. In a QTL study of a Mesoamerican population, Blair et al. (2010) identified new sites related to Fe and Zn that had not been detected in previous studies on Andean or intergenic populations. Similarly, the sites identified in the present study differ from those found previously by Katuuramu et al. (2018), who conducted GWAS for protein, Zn, and Ca contents and the bioavailability of Fe using an Andean diversity panel, as well as from those identified by Erdogmus et al. (2020), who evaluated an intergenic panel with regards to Ca and Mn contents. Similarly no common QTNs were identified between the present study and the recently study developed by Gunjača et al. (2021), who performed a GWAS for eight minerals in a Croatian common bean panel, composed by Mesoamerican and Andean accessions. The cited authors also didn't find common QTNs with previous works and together with Blair and Izquierdo (2012) concluded that QTLs can be gene pool specific. Thus, the loci identified in the present study will make an important contribution to the biofortification of Mesoamerican common beans and more specifically, the black and carioca common beans, the primary types consumed in Brazil.
This study shows that the BDP is efficient for association studies and enables the identification of high variability between different quantified minerals. Moreover, the evaluation of different multi-locus methods and cultivation environments established the reliability of markers associated with minerals. Further studies should validate the importance and functionality of candidate genes associated with the accumulation of nutrients in grains. However, the identified QTNs showed promising potential to be used in biofortification breeding programs focused on the pyramiding of favorable alleles, which could be monitored through MAS. In addition, these loci can be incorporated into SNP panels related to traits of interest in GS.