The Brazilian lima bean landraces: conservation and breeding


Brazil is one of the countries with the greatest genetic diversity of lima bean (Phaseolus lunatus L.), which has been maintained both on farm and in germplasm banks. Knowledge of this diversity is essential for developing strategies for its use and conservation. The objective of this study was to characterize lima bean landrace accessions from different regions of Brazil. Twenty-two accessions conserved in the Phaseolus Germplasm Bank of UFPI (Piauí, Brazil) were characterized with 37 agro-morphological descriptors and 15 microsatellite markers. In the agro-morphological characterization, the maximum genetic divergence was obtained for the pair UFPI-262 and UFPI-252 (D = 88.74). UPGMA clustering formed four groups, and Tocher's optimization method formed 10 groups. In the molecular characterization, 10 loci were polymorphic, and the number of alleles per locus varied from two to seven. The polymorphic information content (PIC) varied from 0.0767 to 0.7240. The loci GATS91 and PVat001 were highly informative and can be recommended for further studies involving lima bean. The genetic diversity found (He = 0.316) was higher than that reported for the Yucatán Peninsula, a region regarded as a center of diversity for lima bean. Thus, the agro-morphological and molecular characterizations were efficient in quantifying the genetic divergence between the studied accessions. The data from this research provide a valuable resource for geneticists and can support breeding programs involving lima bean.


Introduction
Lima bean (Phaseolus lunatus L.) is the second most important legume species of the genus Phaseolus and originated in Mesoamerica about 10,000 years ago (Kaplan and Lynch 1999). Its wild forms are widely distributed from northern Mexico to northern Argentina, while its domesticated forms are distributed from the southern United States to eastern Brazil (Andueza-Noh et al. 2013). In South America, wild populations of lima bean have been found in Colombia, Venezuela, Ecuador, Peru, Argentina, Bolivia, and Brazil (Andueza-Noh et al. 2015). Recent molecular studies have indicated that lima bean comprises three major gene pools: the Andean (AI), found in South America; the Mesoamerican I (MI), found in western Mexico; and the Mesoamerican II (MII), found in southern Mexico and extending through the Caribbean and South America (Andueza-Noh et al. 2013; Andueza-Noh et al. 2015; Martínez-Castillo et al. 2014; Serrano-Serrano et al. 2012). Previous studies have not assigned a wide collection of Brazilian germplasm to these gene pools, despite the wide presence and importance of lima bean to Brazilian smallholders, mainly in northeastern Brazil. Previously, Chacón hypothesized that Brazil could be a center of lima bean domestication, after finding two lima bean varieties that share haplotypes belonging to the MII gene pool, which could mean that a third domestication event occurred in Brazil.
Genetic characterization is conducted using both agro-morphological and molecular tools and is the basis for plant breeding programs and cultivar development (Arriel et al., 2006). Agro-morphological characterization is essential for identifying superior and genetically dissimilar genotypes and for indicating parents, contributing to the selection of genotypes with agriculturally important traits of potential interest to producers (Ron et al., 2018).
Among the main molecular methods used to characterize plants, microsatellite markers, also known as simple sequence repeats (SSRs), are among the most widely used to study DNA sequence polymorphism in the genus Phaseolus (Matondo et al., 2017; Litt and Luty, 1989). These markers detect polymorphism in genomic regions consisting of tandem repeats of one to six nucleotides (Oliveira et al., 2004). Despite their high frequency and wide distribution in eukaryotic genomes, these markers have not been used intensively to reveal polymorphism in P. lunatus.
Since there is no information on the genetic characterization of lima bean genotypes from Brazil, we hypothesized that (1) there would be variability between the lima bean genotypes and a distinct distribution of diversity, and (2) this variability could be greater than that observed in other regions.
Therefore, the objective of this study was to assess the genetic diversity of lima bean landrace germplasm from different regions of Brazil, conserved in the Phaseolus Germplasm Bank of the Federal University of Piauí (PGB-UFPI), Brazil, using agro-morphological and molecular markers, and to evaluate the particularity of this germplasm and its importance as a genetic resource.

Agro-morphological characterization
Plant material and experimental methodology
The twenty-five lima bean accessions were characterized (Table 1, Figure 1), twenty-one of which were received from the Bank of Germplasm and Plants of the Universidade Federal de Viçosa (BGH-UFV), in Viçosa, Brazil, one of the oldest and most representative collections of lima bean in Brazil. The accession UFPI-720, from Piauí, was added to the group for its economic relevance to the Northeast region of Brazil. The experiment was carried out in a greenhouse of the Plant Science Department of UFPI (72.7 m, 05°05'05'' S and 42°05' W), in the city of Teresina-PI, Brazil. The experimental design was completely randomized, with four replications. The agro-morphological characterization of the accessions was based on 37 descriptors related to the leaf, flower, fruit, and seeds of the lima bean, comprising eleven continuous quantitative descriptors and twenty-six multi-categorical qualitative descriptors (IPGRI, 2001). The eleven quantitative descriptors analyzed were: LW (leaf width), LL (leaf length), NDF (number of days to the start of flowering), PL (pod length), NLP (number of locules per pod), WP (width of the pod), LS (length of the seed), WS (width of the seed), W100S (weight of 100 seeds), NSP (number of seeds per pod), and TS (thickness of seed).

Analysis of agro-morphological data
A univariate analysis of variance was used to estimate the genetic variability among accessions for the quantitative data. Means were grouped by the Scott-Knott test at the 5% probability level. The accessions were clustered using the hierarchical unweighted pair group method with arithmetic mean (UPGMA), with the Mahalanobis distance (D) as the dissimilarity measure (Cruz and Carneiro 2003).
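The dissimilarity measure named above can be illustrated with a short sketch. It computes the squared generalized Mahalanobis distance between the trait means of two accessions for a two-trait case. The function name `mahalanobis_d2`, the trait means, and the pooled residual covariance matrix are hypothetical; they are not values from this study, and the sketch is not the routine implemented in GENES.

```python
def mahalanobis_d2(mean_a, mean_b, cov):
    """Squared Mahalanobis distance between two accessions (two traits).

    mean_a, mean_b: trait means of each accession, e.g. (PL, W100S).
    cov: 2x2 pooled residual covariance matrix of the two traits.
    All numbers passed in the example below are hypothetical.
    """
    (s11, s12), (s21, s22) = cov
    det = s11 * s22 - s12 * s21
    if det == 0:
        raise ValueError("covariance matrix is singular")
    # Closed-form inverse of a 2x2 matrix.
    inv = ((s22 / det, -s12 / det), (-s21 / det, s11 / det))
    d0 = mean_a[0] - mean_b[0]
    d1 = mean_a[1] - mean_b[1]
    # D^2 = d' * inv(cov) * d, written out for two traits.
    return (d0 * (inv[0][0] * d0 + inv[0][1] * d1)
            + d1 * (inv[1][0] * d0 + inv[1][1] * d1))


if __name__ == "__main__":
    cov = ((4.0, 0.0), (0.0, 1.0))   # hypothetical variances, zero covariance
    d2 = mahalanobis_d2((10.0, 5.0), (8.0, 4.0), cov)
    print(d2)          # squared distance
    print(d2 ** 0.5)   # distance on the original scale
```

Computing this quantity for every pair of accessions would give the dissimilarity matrix that a UPGMA routine takes as input. With more than two traits the same formula applies, but the matrix inverse would normally come from a linear algebra library.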
Qualitative traits were analyzed as multi-categorical variables, generating a dissimilarity matrix. To obtain this matrix, the mode of each variable per accession was used, without replication. Tocher's optimization method, as cited by Rao (1952), was then used for cluster analysis. The analyses were carried out in the program GENES (Cruz 2008).
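The two steps described above, a dissimilarity matrix from modal categories followed by Tocher clustering, can be sketched as follows. This is a simplified illustration under stated assumptions, not the GENES implementation. The simple-matching coefficient is an assumed choice of dissimilarity, the function names and example categories are hypothetical, and the handling of leftover items that are farther apart than the threshold is a sketch-level decision.

```python
def simple_matching(a, b):
    """Share of descriptors whose modal category differs between two accessions."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def tocher(dist):
    """Group items 0..n-1 (n >= 2) from a symmetric dissimilarity matrix."""
    n = len(dist)
    # Threshold: the largest of the nearest-neighbour distances.
    theta = max(min(dist[i][j] for j in range(n) if j != i) for i in range(n))
    remaining = set(range(n))
    groups = []
    while remaining:
        if len(remaining) == 1:
            groups.append([remaining.pop()])
            break
        # Seed a new group with the closest remaining pair (ties: lowest indices).
        i, j = min(((a, b) for a in remaining for b in remaining if a < b),
                   key=lambda p: (dist[p[0]][p[1]], p))
        if dist[i][j] > theta:
            # Sketch-level choice: no remaining pair is close enough to share a group.
            groups.extend([k] for k in sorted(remaining))
            break
        group = [i, j]
        remaining -= {i, j}
        while remaining:
            # Candidate with the smallest mean distance to the current group.
            cand = min(sorted(remaining),
                       key=lambda k: sum(dist[k][g] for g in group))
            if sum(dist[cand][g] for g in group) / len(group) > theta:
                break
            group.append(cand)
            remaining.remove(cand)
        groups.append(group)
    return groups


if __name__ == "__main__":
    # Hypothetical modal categories (seed colour, seed coat pattern, pod colour).
    modes = [("white", "absent", "brown"),
             ("white", "absent", "brown"),
             ("white", "striped", "brown"),
             ("red", "mottled", "black"),
             ("red", "mottled", "brown")]
    dist = [[simple_matching(a, b) for b in modes] for a in modes]
    print(tocher(dist))
```

Because the threshold is derived from the data, the number of groups is not fixed in advance, which is one reason this method can yield more groups than a dendrogram cut at a chosen height.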

Molecular characterization
Plant material and DNA extraction
Twenty-four lima bean accessions were characterized (Table 1, Figure 1) in the Laboratory of Genetic Diversity and Plant Breeding, Escola Superior de Agricultura "Luiz de Queiroz" (ESALQ), Department of Genetics, University of São Paulo, in Piracicaba-SP, Brazil.
Genomic DNA was extracted from the accessions using the CTAB-based protocol described by Doyle and Doyle (1990). The concentration and integrity of the extracted DNA were verified by quantifying DNA aliquots of each accession on 1% (w/v) agarose gels stained with SYBR Safe and comparing them with standard phage lambda DNA. The DNA of each sample was diluted to a final concentration of 10 ng μL⁻¹.

SSR genotyping
For the study of genetic diversity, 15 microsatellite markers (Gaitán-Solís et al. 2002 and Yu et al. 2000) isolated and optimized for Phaseolus vulgaris L. were used in the lima bean accessions (Table 2). Amplification reactions for the 15 loci were performed with 20 ng of DNA, 1 U of Taq DNA polymerase, 2.0 mM of magnesium chloride (MgCl2), 0.2 mM of each dNTP, 0.1 μM of each primer (forward and reverse), and 1X PCR reaction buffer in a final volume of 20 μL. The amplification conditions were as follows: 1) 94 °C for 2 min; 2) 94 °C for 15 s; 3) annealing temperature (Ta specific for each SSR) for 15 s; 4) 72 °C for 15 s; 5) repetition of steps 2-4 30 times; and 6) 72 °C for 10 min. The amplification products were separated under denaturing conditions in 7% polyacrylamide gel with 7 M urea, at a constant power of 70 W. A 10 bp DNA ladder (Invitrogen™) was used as the size standard. After electrophoresis, the gel was stained with silver nitrate.

Genetic diversity analysis
Polymorphic information content (PIC), as described by Botstein et al. (1980), was calculated as a function of the number of alleles detected and their frequency distribution in the set of accessions. The PIC value per locus was determined by
$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} p_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 p_i^2 p_j^2,$$
where $n$ is the number of alleles at the locus and $p_i$ and $p_j$ are the frequencies of alleles $i$ and $j$ in the set of accessions. The calculation is thus based on the number of alleles detected at a given locus and the relative frequency of each allele in the total set of accessions.
From the genotyping of the gels, estimates of genetic diversity (allele frequencies, number of alleles per locus, observed heterozygosity Ho, and expected heterozygosity He) were obtained using MSTOOLS (Park 2001).
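As a rough illustration of the per-locus statistics listed above, the sketch below computes the number of alleles, PIC, He, and Ho from diploid genotypes. The function name `locus_stats`, the allele sizes, and the genotype list are hypothetical, and the use of Nei's unbiased estimator for He is an assumption about how the estimate was obtained, not something stated in the text.

```python
from collections import Counter
from itertools import combinations


def locus_stats(genotypes):
    """Summarize one SSR locus from diploid genotypes given as (allele, allele)."""
    n_ind = len(genotypes)
    n_copies = 2 * n_ind
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / n_copies for c in counts.values()]
    sum_sq = sum(p * p for p in freqs)
    # PIC = 1 - sum(p_i^2) - sum over allele pairs of 2 * p_i^2 * p_j^2
    pic = 1 - sum_sq - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    # Assumption: Nei's unbiased estimator of expected heterozygosity.
    he = n_copies / (n_copies - 1) * (1 - sum_sq)
    # Observed heterozygosity: share of accessions carrying two different alleles.
    ho = sum(a != b for a, b in genotypes) / n_ind
    return {"alleles": len(counts), "PIC": pic, "He": he, "Ho": ho}


if __name__ == "__main__":
    # Hypothetical locus: 23 accessions homozygous for a 160 bp allele and
    # one accession homozygous for a 164 bp allele.
    genotypes = [(160, 160)] * 23 + [(164, 164)]
    for name, value in locus_stats(genotypes).items():
        print(name, round(value, 4))
```

For these hypothetical genotypes the sketch returns a PIC of about 0.0767 and an He of about 0.0816 with Ho equal to zero. These are the same magnitudes as the lowest values reported for the studied loci, but the actual genotypes are not given in the text, so the agreement is illustrative only.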
A cluster analysis was performed using the UPGMA method, based on Rogers' distance (1972) as modified by Wright (1978), to determine the genetic relationships between accessions using the program NTSYS (Rohlf 1989). The stability of the clusters was tested with a bootstrap procedure based on 10,000 resamples using the program TFPGA (Miller 1997).
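The distance used for this clustering can be sketched in a few lines. The version below follows the usual statement of Rogers' distance as modified by Wright, the square root of the summed squared allele-frequency differences divided by twice the number of loci. The function name `modified_rogers` and the example frequencies are hypothetical, and the sketch does not reproduce NTSYS or TFPGA.

```python
from math import sqrt


def modified_rogers(freqs_a, freqs_b):
    """Modified Rogers' distance between two accessions.

    freqs_a, freqs_b: one dict per locus mapping allele -> frequency,
    with loci listed in the same order for both accessions.
    """
    n_loci = len(freqs_a)
    total = 0.0
    for locus_a, locus_b in zip(freqs_a, freqs_b):
        # Alleles absent from one accession count as frequency zero there.
        for allele in set(locus_a) | set(locus_b):
            diff = locus_a.get(allele, 0.0) - locus_b.get(allele, 0.0)
            total += diff * diff
    return sqrt(total / (2 * n_loci))


if __name__ == "__main__":
    # Hypothetical two-locus genotypes expressed as within-accession frequencies:
    # a homozygote has frequency 1.0, a heterozygote 0.5 for each allele.
    acc_1 = [{"160": 1.0}, {"210": 0.5, "214": 0.5}]
    acc_2 = [{"164": 1.0}, {"210": 1.0}]
    print(modified_rogers(acc_1, acc_2))
```

The distance ranges from 0 for identical allele frequencies to 1 when two accessions are fixed for different alleles at every locus, which makes a threshold on the dendrogram scale easy to interpret.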

Results and Discussion
Agro-morphological characterization
The F test (Table 3) showed significant differences (P<0.01 or P<0.05) between the accessions for the evaluated characters, except for leaf width and leaf length, indicating genetic variability between them. Therefore, the phenotypic variability found between the lima bean landrace accessions offers possibilities to enhance the variation within breeding programs and, consequently, to develop cultivars that can meet the current or future needs of the bean consumer market (Lopes et al. 2010).
The coefficient of variation (CV%) of the eleven characters varied from 4.66% for WS to 26.28% for LL (Table 3). Most characters presented low to medium values, indicating good experimental precision. According to Pimentel-Gomes (2009), experimental precision is considered high when the coefficient of variation is below 10%, moderate when it is between 10% and 20%, and low when it is between 20% and 30%. Any precision with a coefficient above 30% is considered very low. However, this classification does not consider the particularities of the crop studied here and does not differentiate with regard to the nature of the character under study (Costa et al. 2002).
The Scott-Knott test confirmed the genetic variability between the accessions for NDF, PL, NLP, WP, LS, WS, W100S, NSP, and TS (Table 4). The characters NDF and CS presented the highest variability between the accessions, forming four different groups, followed by PL, WP, WS, and TS, which formed three different groups. The traits related to the pod are important, since larger pods contain larger seeds, which are preferred by lima bean consumers (Silva et al. 2017). In addition, green pods of lima bean can be used for food (Melo 2011).
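The precision classes quoted above can be written down as a small helper. The sketch assumes the usual definition of the experimental coefficient of variation, 100 times the square root of the residual mean square divided by the grand mean. The function names and the ANOVA numbers in the example are hypothetical, and the treatment of values falling exactly on a class boundary is an assumption.

```python
from math import sqrt


def experimental_cv(residual_mean_square, grand_mean):
    """CV% = 100 * sqrt(residual mean square) / grand mean."""
    return 100 * sqrt(residual_mean_square) / grand_mean


def precision_class(cv_percent):
    """Precision classes as summarized in the text from Pimentel-Gomes (2009)."""
    if cv_percent < 10:
        return "high"
    if cv_percent <= 20:
        return "moderate"
    if cv_percent <= 30:
        return "low"
    return "very low"


if __name__ == "__main__":
    # Hypothetical ANOVA output for one trait.
    cv = experimental_cv(residual_mean_square=4.0, grand_mean=20.0)
    print(cv, precision_class(cv))
    # The two extreme CV values reported in the text.
    print(precision_class(4.66), precision_class(26.28))
```

Applied to the reported extremes, 4.66% falls in the high-precision class and 26.28% in the low-precision class.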
According to Vargas et al. (2003), the study of variation in seed traits is one of the main criteria to explain the origin and the genetic diversity of lima beans. Silva et al. (2017), when analyzing the genetic diversity of lima bean accessions from Brazil using agro-morphological and phenological characters, found that the traits that most contributed to the genetic variability were the length and width of the seeds. Their results also showed traits common to the Mesoamerican and Andean gene pools, and another group with traits intermediate between these two gene pools, indicating that Brazil could be an important center of diversity for this crop.
The measures of genetic dissimilarity (Table 4), estimated through the generalized Mahalanobis distance (D), showed that the accessions UFPI-262 and UFPI-252 (D = 88.74) were the most divergent, while UFPI-218 and UFPI-250 (D = 2.67) were the most similar. The most appropriate recommendation is to cross genetically distant parents that perform well for the desirable traits, so that complementarity and transgressive segregation can produce an outstanding base for selection. Brito-Silva et al. (2015), when characterizing accessions of PGB-UFPI through agro-morphological markers, indicated the cross UFPI-220 x UFPI-468 for breeding programs, since both accessions have high levels of dissimilarity and average performance, with important traits for productivity components. Crossing similar accessions, on the other hand, is not recommended, as the genetic variability and genetic gain would be restricted (Correa and Gonçalves 2012).
The UPGMA dendrogram (Figure 2) formed four groups at a distance of approximately 58%. Group I included the accessions UFPI-225, UFPI-264, UFPI-217, UFPI-216, UFPI-221, UFPI-224, UFPI-245, UFPI-230, UFPI-236, UFPI-228, UFPI-232, UFPI-242, UFPI-250, UFPI-218 and UFPI-220. These accessions presented the lowest means for NDF and are thus the earliest among those studied. This is important because most Brazilian germplasm has a long cycle, nearly seven months, and shortening it would allow more than one crop per year or a lower production cost in a single shorter season. Short-cycle accessions can also reduce the risk of pests and diseases, in addition to avoiding periods of "veranicos", hot periods similar to those known as Indian summers.
Group II included the accessions UFPI-239, UFPI-229 and UFPI-247, which also presented low means for NDF. The accessions UFPI-247 and UFPI-229 presented the highest means for the characters NLP and NSP. According to Singh (2001), the number of pods per plant, number of seeds per pod, and weight of 100 seeds are important characters for selecting plants to increase grain production. Group III, in turn, included only the accession UFPI-262. In addition to being one of the most divergent, this accession also presented the highest means for most characters.
Group IV was formed by the accessions UFPI-252, UFPI-251 and UFPI-261, which presented the highest means for NDF and were thus the latest to flower. The accessions UFPI-261 and UFPI-251 had the highest means for W100S. According to Akande and Balogun (2007), 100-seed weight is a better indicator of seed yield in lima bean and could therefore be used as a selection criterion for higher seed yield.
In the evaluation of the qualitative characters of lima bean using Tocher's optimization method, ten groups were formed (Table 5). Of the 26 qualitative characters used to evaluate the accessions, some did not contribute to differentiating them: color of the keel, size of the flower bud, position of the racemes, orientation of the pods with regard to the racemes, germination of the seeds in the pods, texture of the seed testa, color of the cotyledons, and plant growth pattern. Most of the evaluated accessions presented white flower wings, parallel wing opening, short pod apex, straight pod curvature, and brown pod color. The variation between them was related to seed color and seed coat pattern. These two traits deserve emphasis because the Brazilian consumer market prefers bicolored grains. Some traditional dishes of the Brazilian Northeast are made with lima beans and require large grains with different shapes and testa pigments. Differences in the visual patterns of the grains are therefore extremely important for breeders who aim to improve the crop to meet consumer preferences. Bria, Suharyanto and Purnomo (2019), in a study on the genetic diversity between lima bean accessions using agro-morphological characters, also found that seed color and seed color pattern presented very high variation. According to Gepts (2014), the domestication of Phaseolus resulted in several changes, such as reduced seed dormancy, seed dispersal, and photoperiod sensitivity, and an increased variety of shapes and colors of pods and seeds. Nobre et al. (2012) found that the biometry and color of lima bean seeds help in differentiating between varieties. Guimarães et al. (2007) state that seed color can contribute to better commercialization of the product, depending on consumer preferences in different regions.
In Teresina, the capital of Piauí, Brazil, for example, there is a preference for seeds with a white tegument, while in the Midsouth region of the state people prefer white seeds with a striped tegument (Lopes et al. 2010). According to Santos et al. (2002), studies on the morphology of lima bean varieties are of paramount importance for recording identifying traits, making it possible to access this material when seeking plants with good productivity and adaptation to different environmental conditions.

Considering the results found, the UFPI-262 accession stands out for its good performance for most of the evaluated characters, especially the descriptors related to production. It can therefore be recommended for a breeding program of the crop, more specifically for artificial crosses.

Molecular characterization
Of the 15 microsatellite loci tested in the 24 accessions of PGB-UFPI, 10 presented a polymorphic pattern. This result demonstrates the successful transferability of these primers, considering that they were developed for P. vulgaris. The number of alleles per locus ranged from two to seven, with the majority of loci presenting two alleles. The loci with the greatest allelic variation were PVat001, with seven alleles, and GATS91, with five, while those with the lowest variation were BM 140, BM 141, BM 154, BM 156, and BM 183, with two alleles each.
The Polymorphic Information Content (PIC), calculated to estimate the discriminatory power of each locus, ranged from 0.0767 for BM 140 and BM 183 to 0.7240 for PVat001, with a mean of 0.2825.
According to the classification proposed by Botstein et al. (1980), markers with PIC values above 0.5 are considered very informative, those with values between 0.25 and 0.50 are moderately informative, and those below 0.25 are slightly informative. This value serves as the basis for classifying and selecting primers that are efficient in discriminating among individuals. Therefore, the results show that the loci PVat001 and GATS91 are very informative and can be recommended for molecular studies with lima bean.
The expected heterozygosity (He) ranged from 0.0816 for the loci BM 140 and BM 183 to 0.7775 for the locus PVat001, with a mean of 0.3160 (Table 6). The observed heterozygosity (Ho) was lower than expected. This result may be related to the agricultural practices of small farmers: they plant different landrace varieties with a bicolored seed pattern for their own consumption and, at the same time, plant genotypes with white seeds to supply the large consumer centers, which may lead to genetic homogeneity of landrace varieties and to the simultaneous fixation of alleles (Martínez-Castillo et al. 2004). Another probable cause of the lower-than-expected Ho is the reproductive system of the species, which, despite being mixed, is predominantly self-fertilizing (Penha et al., 2017). Therefore, knowledge of the genetic diversity of a germplasm bank is necessary for an adequate selection of accessions, depending on the objectives of the breeding program (Perseguini et al., 2011), and the combination of agronomic and molecular information can be used to find new alleles and to breed for agronomic traits. The development of new cultivars with alleles of interest is thus a promising strategy to take advantage of genetic diversity for agricultural benefit.
Previous studies show that the Brazilian germplasm of lima bean presents considerable genetic diversity, indicating that the diversity of lima bean germplasm is not restricted to its traditionally recognized centers of origin and diversity. This work showed that the accessions evaluated from the PGB-UFPI have high genetic diversity compared with that found in the Yucatán Peninsula (Martínez-Castillo et al. 2008), a region regarded as a center of diversity for lima bean. However, it should be considered that the studies used different methodologies, from the size of the samples to their origins.
Furthermore, the high diversity found in Brazil, when compared to other regions and similar studies, suggests that Brazil is a center of genetic diversity for lima bean, reinforcing the hypothesis that the country is a possible center of domestication of the species. In the near future, it is hoped that independent analyses of lima bean genotypes from Brazil can be integrated through a multi-institutional effort to systematically explore the genetic diversity of ex situ collections, with a view to expanding the genetic base and reinforcing the creation of breeding programs for the crop.
In addition to allelic variation, another important piece of information is the genetic distance, which allows the evaluation of redundancy in germplasm banks. From the analysis of the molecular data in this study, a genetic distance matrix was generated using Rogers' distance (1972), which was used to group individuals by constructing a dendrogram with UPGMA (Figure 3). Considering an arbitrary threshold of 0.40 for Rogers' distance, the dendrogram showed the formation of six groups. In general, there was no relation between the origin of an accession and its allocation to a group, as most of the accessions from the state of Minas Gerais were dispersed among the groups. This can be explained by the probable exchange of seeds between small farmers, due to the proximity of the regions, and also by the reproductive system of the lima bean. Penha et al. (2017) reported outcrossing rates of 38.1% when analyzing 14 accessions of PGB-UFPI with ten SSR loci. According to Baudoin (1988), the projection of the stigma and style out of the perianth and the receptivity of the stigma for many hours can be responsible for these high outcrossing rates. Gomes et al. (2020) analyzed the genetic diversity of lima bean accessions with SSR markers to build a core collection and also found that the groups into which the accessions were separated did not reflect their geographic origins. Guimarães et al. (2007), using a dendrogram constructed from 22 lima bean accessions analyzed with RAPD markers, formed two main groups whose accessions were not divided according to their origin. The authors stated that this happens frequently in crops in which the cultivation of landrace varieties predominates, such as lima bean, where seed exchange occurs between farmers.
The groups I, IV, V, and VI, formed by the accessions UFPI-216, UFPI-262, UFPI-239, and UFPI-720, respectively, were the most divergent. The UFPI-720 accession was very isolated from the others, which can be explained by its origin: it is the only accession from Piauí, a state with edaphoclimatic conditions different from those of Minas Gerais. The calculated cophenetic correlation was 0.92, indicating that the UPGMA dendrogram represents the original dissimilarity matrix well.
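The cophenetic value mentioned above measures how closely the dendrogram heights track the original distances. A minimal sketch of that check follows: it builds an UPGMA tree, records the height at which each pair of items is first joined, and correlates those heights with the input distances. The function names and the small distance matrix are hypothetical, and the code is a plain-Python illustration rather than the routine used in NTSYS.

```python
from itertools import combinations


def upgma_cophenetic(dist):
    """Return the cophenetic matrix of an UPGMA tree built from dist."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}
    coph = [[0.0] * n for _ in range(n)]

    def avg(a, b):
        # Unweighted average of all between-cluster pairwise distances.
        pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
        return sum(dist[i][j] for i, j in pairs) / len(pairs)

    while len(clusters) > 1:
        a, b = min(combinations(sorted(clusters), 2), key=lambda p: avg(*p))
        height = avg(a, b)
        # Every pair split across the two merged clusters joins at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[i][j] = coph[j][i] = height
        clusters[a] = clusters[a] + clusters.pop(b)
    return coph


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cophenetic_correlation(dist):
    """Pearson correlation between original and cophenetic distances."""
    coph = upgma_cophenetic(dist)
    pairs = list(combinations(range(len(dist)), 2))
    return pearson([dist[i][j] for i, j in pairs],
                   [coph[i][j] for i, j in pairs])


if __name__ == "__main__":
    # Hypothetical distances among three accessions.
    dist = [[0.0, 1.0, 4.0],
            [1.0, 0.0, 6.0],
            [4.0, 6.0, 0.0]]
    print(round(cophenetic_correlation(dist), 4))
```

A value of 1 would mean the tree reproduces the distances exactly; values closer to 1 indicate less distortion introduced by the clustering.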
Studies on the characterization of accessions conserved in germplasm banks are crucial to maintaining genetic resources in diversity centers (Maxted et al., 2012). The agro-morphological and molecular characterization of lima bean landrace accessions from PGB-UFPI showed that there is genetic variability between them, and that the variability between the landrace accessions from Brazil is larger than that found in other regions of the world. Furthermore, the characterization of the germplasm provided useful genetic material for the development of improved lines, for which the cross between the genotypes UFPI-720 and UFPI-216 is recommended. Another important finding was that the organization and distribution of genetic diversity are not related to the geographic origins of the accessions. Therefore, the conservation of landrace accessions and their use in future breeding studies can reduce the vulnerability of agricultural crops to pathogens, supporting their agronomic performance and strengthening food security for small farmers.
In general, only a small portion of the genetic diversity of lima bean landrace germplasm from Brazil is used in breeding, since only one lima bean breeding program has been started in Brazil, at the Federal University of Piauí, Teresina. The lack of genetic and phenotypic data contributes to the limited use of this germplasm. Therefore, this characterization should encourage and facilitate the use of lima beans by farmers and breeders. These results should also help optimize resources and improve experimental projects on the structure and genetic diversity of lima beans, encouraging the use of the germplasm by farmers, breeders, and other researchers around the world, and they mark agro-morphological and molecular characterization as the first step in maintaining the PGB-UFPI according to the genetic profiles of the accessions.

Implications for the breeding and conservation of lima beans in Brazil
Characterization studies have contributed to the knowledge of genetic variability among landrace accessions conserved in germplasm banks and to the detection of traits of interest for later introduction into pre-breeding and crop breeding programs. They can also identify possible duplicates in the collection and avoid unnecessary expenditure of time, labor, and conservation costs.
Both types of markers used in this research were reliable for evaluating the genetic diversity of lima bean landrace germplasm, helping to identify genotypes with desirable traits. Therefore, these results provide opportunities to carry out directed field trials, mainly focused on the most divergent individuals, to start breeding programs (UFPI-720 and UFPI-216). The information found is extremely important for understanding the genetic diversity of lima beans in Brazil, since the country has been mentioned as a possible center of domestication of the species.
Conserving germplasm banks is important, since they are repositories of genes that support breeding programs and will be needed for the development of future agricultural activities. Therefore, considering the economic, social, and cultural importance of lima bean, this work shows the relevance of studying this crop in Brazil and may increase interest in expanding research on a crop that is so diverse and about which much remains to be elucidated.

Acknowledgements
To the College of Agriculture "Luiz de Queiroz" (ESALQ), especially its Genetics Department, for making its facilities available for this work. To the National Council for Scientific and Technological Development (CNPq), for funding this project.

Declarations
Availability of data and materials
The geographical coordinates, the accessions evaluated, and the SSR readings are stored in our database and can be made available upon request.
Compliance with ethical standards
Conflict of interest: The authors declare that they have no conflict of interest.
Figure 1 Geographical distribution of the lima bean accessions studied.
Dendrogram generated by the UPGMA method, based on Rogers' genetic distance matrix, as modified by Wright, using ten microsatellite loci evaluated in 24 lima bean accessions.