Phaseolus lunatus L.: pulse seeds phenotype image analysis

The locally cultivated creole varieties of Phaseolus lunatus are adapted to specific climatic and environmental conditions. Family farmers and local communities preserve and multiply their seeds over generations, promoting genetic diversity, food and nutritional security, and agricultural sustainability. This species has great geno-phenotypic diversity, which can be harnessed in breeding programs if accurately characterized. We evaluated the phenotypic variations of P. lunatus seeds from 13 varieties in three states (Sergipe, Bahia, and Alagoas) using image analysis. We estimated the weight of 100 seeds using a precision analytical balance and obtained morphometric measurements, including area, maximum diameter, and minimum diameter, using Groundeye (TBit®) imaging equipment and software. We also recorded dominant color and RGB color system descriptors. The morphometric variables underwent variance analysis using the F-test, and the means were clustered using the Scott-Knott test at the 5% significance level. The data underwent Pearson correlation analysis (Student's t-test at 5%), were grouped based on dissimilarity using the UPGMA method, and were represented in a dendrogram. We also performed Principal Component Analysis on the evaluated characteristics. Orange was the dominant seed color in nine varieties. The morphometric variables showed positive and significant correlations with one another. The dendrogram revealed two homogeneous and distinct groups, and the first two principal components accounted for 86.80% of the total variation. Therefore, high-resolution images for phenotypic characterization of creole lima bean seeds are a promising non-destructive tool for selection purposes.


Introduction
The legume Phaseolus lunatus, also known as the lima bean, is an economically significant crop. It is considered the second most important species in the Phaseolus genus due to its agronomic characteristics, including grain production for human consumption and its tolerance to unfavorable soil and climatic conditions (Noh et al. 2015). Phylogenetic analysis of P. lunatus has shown a close relationship with other bean species, such as P. vulgaris, Vigna unguiculata, and Vigna radiata (Tian et al. 2021).
The primary method of conservation for lima beans is on-farm preservation by small farmers who store their seeds, thereby conserving genetic diversity. On-farm conservation is an alternative approach for diversifying agriculture, but it requires support from the government and private sectors to promote and enhance agricultural interactions among small-scale farmers (Soares et al. 2022). Fava bean or lima bean is mainly grown in the Northeast region of Brazil, primarily by family farmers, due to its ability to adapt to water scarcity, high temperatures, and low-fertility soils (Soares et al. 2022, Li et al. 2015).
Fava, a legume commonly sold at street markets, is an important food source for many communities. However, there is considerable heterogeneity in the characteristics of the grains produced in these communities, which receive different names depending on the region (Felix et al. 2014). The species presents high phenotypic variability regarding growth habit (determinate or indeterminate), flower color (white, yellowish-white, pink, purple, and violet, with intermediate tones), and seed integument color (white, gray, yellow, pink, brown, black, among others). Additionally, there are variations in seed shape (rhomboid, round, or reniform), size, weight, and other morphometric characteristics (Lopes et al. 2015).
However, these traits need to be further explored by farmers, as there are few studies on selecting desirable traits in participatory breeding programs. As a result, the materials available for small-scale farmers are typically a varietal mixture, which has low productivity and inconsistent characteristics such as seed integument color and size, making it unsuitable for commercial grain production (Silva et al. 2015).
The genetic diversity of creole varieties was evaluated at the Phaseolus Germplasm Bank of the Federal University of Piauí (PGB-UFPI, Brazil) using 26 qualitative and 11 quantitative descriptors based on the criteria of Bioversity International. Simple Sequence Repeat (SSR) molecular markers were also used to determine genetic polymorphism and differentiation. The study revealed significant genotypic and phenotypic diversity among the accessions, particularly for traits such as number of pods per plant, number of seeds per pod, and weight of 100 seeds (Pires et al. 2021). In another study, the results suggested a recent genetic bottleneck for the genotypes, which may be related to the management and production techniques used by farmers in response to the increasing market demand for improved lima bean varieties in recent years (Lustosa-Silva et al. 2022).
Morphometric characterization studies are crucial for selecting economically valuable traits based on consumer market demand. Moreover, using image analysis equipment enhances the precision of evaluated traits, leading to genetic progress in the selection. Thus, the aim of this study was to assess the morphometry and seed coat color of P. lunatus creole bean seeds using image analysis.

Material and methods
The seeds used in this study were obtained from three Brazilian states: Sergipe-SE (SE1, SE2, SE3, SE4, SE5, SE6, SE7, and SE8), Bahia-BA (BA9 and BA10), and Alagoas-AL (AL11, AL12, and AL13), and were collected from family farmers in Frei Paulo-SE and Pedro Alexandre-BA. The seeds were harvested manually when the pods were ripe and still attached to the plant, then sun-dried for 3 to 5 days and stored in bottles at 8 °C. Thus, it is suggested that the seeds were stored under similar physiological conditions, which did not significantly affect seed color evaluation.
As these are non-commercial materials, farmers cultivate P. lunatus varieties as if they were a single variety, without distinctions in the field. Separation of the varieties occurs after harvesting, based on the color of the seed coat and morphometric characteristics such as size and shape. Seeds from the Federal University of Alagoas (UFAL), Rio Largo-AL, were donated and kept in a cold chamber at 8 °C ± 2 °C. Initially, the weight of 100 seeds (WHS) was determined by weighing 100 seeds of each variety using a precision scale, following the recommendation of the International Plant Genetic Resources Institute-IPGRI (Carvalho et al. 2006). The objective of this study was to evaluate the morphometry and color of the tegument of P. lunatus creole bean seeds by image analysis, which is an essential strategy for selecting economically relevant traits based on consumer market demand. Using image analysis equipment results in greater accuracy of the evaluated characteristics, allowing for more significant genetic progress in the selection process.
The morphometric analysis used four replicates of 25 seeds per variety and was performed using Groundeye (TBit®) equipment, which was calibrated prior to image capture. From the captured images, the variables area (mm²), maximum diameter (mm), and minimum diameter (mm) were obtained using the equipment's software, which also provides a detailed classification of seed color. Dominant color and RGB color system descriptors (red, green, and blue) were also recorded.
The morphometric data were tested for normality and homoscedasticity. Treatment significance was assessed using the F-test, and means were clustered using the Scott-Knott test (5% level). The degree of dependence among morphometric characteristics was evaluated using Pearson's correlation with a t-test at a 5% probability level. The Unweighted Pair Group Method using Arithmetic averages (UPGMA) was used to assess the dissimilarities among all varieties. The resulting groups were represented in a dendrogram, and a column chart was plotted using Excel® software to display the dominant color and RGB system.
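As an illustration of the F-test mentioned above, the following minimal sketch computes the one-way ANOVA F statistic for replicate values grouped by variety. It is not the software used in this study; the function name, the variety labels (V1 to V3), and the replicate values are hypothetical and serve only to show the calculation.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a variety label to its list of replicate values.
    """
    all_values = [v for reps in groups.values() for v in reps]
    grand_mean = mean(all_values)
    k = len(groups)       # number of varieties (treatments)
    n = len(all_values)   # total number of observations

    # Variation of the variety means around the grand mean.
    ss_between = sum(
        len(reps) * (mean(reps) - grand_mean) ** 2 for reps in groups.values()
    )
    # Variation of the replicates around their own variety mean.
    ss_within = sum(
        (v - mean(reps)) ** 2 for reps in groups.values() for v in reps
    )

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical seed-area values (mm²), four replicates per variety.
example = {
    "V1": [90.0, 92.0, 94.0, 96.0],
    "V2": [64.0, 66.0, 62.0, 64.0],
    "V3": [78.0, 80.0, 82.0, 80.0],
}
f_stat, df_b, df_w = one_way_anova_f(example)
print(f"F = {f_stat:.2f} with {df_b} and {df_w} degrees of freedom")
```

The resulting F value would then be compared with the critical value of the F distribution for the same degrees of freedom at the chosen significance level.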

Results and discussion
The high-resolution images obtained using the Groundeye equipment (TBit®) are shown in Fig. 1.
The image capture enabled visual verification of the high phenotypic variation in the tegument color among the varieties. SE6 showed small black spots on the seeds, suggesting variable expressivity and incomplete penetrance.
Based on the analysis performed by the Groundeye equipment (TBit®) (Fig. 2), the seeds from SE2, SE3, SE5, SE6, SE7, SE8, BA9, AL12, and AL13 presented a predominance of orange color, while SE1 and BA10 had a predominance of yellow color. AL11 presented black seeds, and SE4 presented red seeds. The camera allowed us to capture the predominant color of the seeds of each variety, which, in most cases, was distinct from the color observed by the human eye. The difference between the human eye and Groundeye is that the equipment uses complex algorithms to differentiate colors, a distinction the human eye cannot make.
The use of image analysis in seed phenotyping has enabled the conversion of qualitative characteristics into quantitative ones. This transformation provides valuable information for the characterization, registration, and identification of variations (Lopes et al. 2010), eliminating the subjectivity of color assessment. The color of lima bean seeds is directly linked to their commercialization, with a preference for large white seeds for cooking, which exhibit a yellow dominance in Groundeye. Color also plays a crucial role in the empirical selection of seed varieties by farmers (Soares et al. 2022). In the Phaseolus genus, most of the yellow and white genotypes cooked faster and had higher Fe bioavailability than the dark red kidney and red mottled accessions (Katuuramu et al. 2020).
Fava seeds come in a wide variety of colors, ranging from white and brown to gray, yellow, pink, red, purple, black, stained, and mottled (Moraes et al. 2017). This qualitative characteristic holds great economic importance since it is associated with the acceptance of the grains by consumers. People tend to associate lighter coloration with a less bitter taste, as well as recent harvest, making it easier to cook (Lopes et al. 2010, 2015).
Hence, it is vital to explore diverse strategies that aid in the conservation of creole lima bean varieties (Soares et al. 2022) and implement initiatives that guide farmers towards producing seeds that can be marketed effectively.
Using image analysis, we suggest that the combined share of yellow and orange as dominant colors should exceed 90% when aiming to obtain large seeds with a light seed coat, which are more desirable for cooking. The RGB color system software facilitates accurate classification of seed color based on the red (R), green (G), and blue (B) components, as illustrated in Fig. 3.
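The 90% criterion stated at the start of this paragraph can be expressed as a simple check on the dominant-color percentages of a seed lot. The sketch below is only an illustration; the function name, the dictionary layout, and the two example lots are hypothetical and are not Groundeye output.

```python
def meets_light_seed_rule(dominant_color_pct, threshold=90.0):
    """Return True when yellow + orange dominance exceeds `threshold` (%)."""
    combined = (
        dominant_color_pct.get("yellow", 0.0)
        + dominant_color_pct.get("orange", 0.0)
    )
    return combined > threshold


# Hypothetical dominant-color percentages for two seed lots.
lot_light = {"yellow": 55.0, "orange": 40.0, "NC": 5.0}
lot_dark = {"orange": 48.0, "black": 45.0, "NC": 7.0}

print(meets_light_seed_rule(lot_light))  # yellow + orange = 95%
print(meets_light_seed_rule(lot_dark))   # yellow + orange = 48%
```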
The R, G, and B components correspond to the intensity of red, green, and blue light, respectively, with values ranging from 0 to 255 (Tsuchiyama and Matsushima 2017). Figure 3 shows that G1 exhibited the highest RGB values (200.82, 147.193, and 193.26), whereas G11 had the lowest (45.77, 30.25, and 40.86). A value of 0 represents darkness, while the maximum value indicates high intensity or clarity. If all color components have the minimum value, the resulting color is black. Conversely, if all components have the maximum value, the color is white. When all values are similar or close, the resulting tone is gray. However, as the values diverge, the resulting tone becomes increasingly vivid.
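The interpretation of RGB values described above can be sketched as a small function that summarizes a triplet by its mean intensity and by the spread among its components. The function name and the tolerance of 15 units used to decide when components count as "similar" are assumptions made for illustration only; they are not part of the Groundeye software.

```python
def describe_rgb(r, g, b, gray_tolerance=15):
    """Summarize an 8-bit RGB triplet by brightness and channel spread."""
    for value in (r, g, b):
        if not 0 <= value <= 255:
            raise ValueError("RGB components must lie between 0 and 255")

    brightness = (r + g + b) / 3           # 0 = dark, 255 = maximum intensity
    spread = max(r, g, b) - min(r, g, b)   # 0 = neutral; larger = more vivid

    if spread <= gray_tolerance:
        # Components are similar: the tone is neutral (black, gray, or white).
        if brightness <= gray_tolerance:
            tone = "black"
        elif brightness >= 255 - gray_tolerance:
            tone = "white"
        else:
            tone = "gray"
    else:
        # Components diverge: the tone is chromatic.
        tone = "vivid"
    return {"brightness": brightness, "spread": spread, "tone": tone}


print(describe_rgb(0, 0, 0))        # all components at the minimum
print(describe_rgb(255, 255, 255))  # all components at the maximum
print(describe_rgb(128, 130, 126))  # similar components
print(describe_rgb(200, 60, 30))    # divergent components
```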
SE1 and BA10, with dominant yellow color, had RGB values close to white, while SE4 and AL11, with dominant red and black color, respectively, approached black tones. BA9, AL12, and AL13 exhibited similar RGB values, resulting in a gray color for the seed coat. Although SE3 and SE7 appeared gray visually, they displayed significant variations in RGB values due to stains on the seed coat.
Using the RGB color system, the seeds were categorized as either light or dark colored, with SE1, SE3, SE5, SE7, SE8, BA9, BA10, AL12, and AL13 classified as having light-colored seed coats, while SE2, SE4, SE6, and AL11 exhibited predominantly dark colors (Sunoj et al. 2018). Light-colored seed coat varieties were more prevalent, which could be attributed to the farmers' selection process, as the consumer market tends to favor materials with lighter coloration. Consequently, varieties with dark seed coats were eliminated (Soares et al. 2022). Statistically significant differences (p < 0.01) were observed for the morphometric variables area (AR) and WHS among the 13 seed varieties (Table 1).
Fig. 2 The dominant color (%) of the Phaseolus lunatus L. seed tegument analyzed by GroundEye (TBit®). *Colors with a frequency ≤ 4% were included in non-categorized (NC)
As shown in Table 1, the mean area values for the 13 seed varieties ranged from 63.87 mm² (AL13) to 93.65 mm² (SE7). The mean values for the maximum diameter ranged from 10.42 mm (AL11) to 12.69 mm (SE7), whereas the minimum diameter values varied between 7.46 mm (AL13) and 9.23 mm (SE7), with no significant differences observed for either diameter variable. The WHS values ranged from 29.55 g (BA10) to 45.33 g (SE1).
The significant difference in seed weight is a distinctive characteristic of this species and is commonly reported in the literature (Felix et al. 2014). This variable is regarded as one of the most important quantitative characteristics for analyzing genetic divergence in studies conducted with this species, ranging from 70% (Meza et al. 2012) to 97.32% (Lopes et al. 2010). Variety AL13 exhibited the lowest mean values for most parameters, suggesting that it mainly comprises smaller seeds. Conversely, SE7 had the highest values, indicating a predominance of seeds with larger dimensions.
In a genetic study that included 300 P. lunatus accessions from various regions of Honduras, provided by farmers, Meza et al. (2012) reported substantial diversity in variety and seed dimensions. Other studies have also documented high variability in seed size, color, and shape, such as the studies of Guimarães et al. (2007) and Li et al. (2010).
The simple correlation analysis (Pearson) revealed a positive and significant association among all morphometric variables (Fig. 4).
Correlations among agronomic traits are of significant interest for breeding programs as they facilitate the assessment of the impact of one trait on another, enabling indirect selection. In the case of simple correlation, the evaluation is based on the degree of association between two numerical variables, one dependent (Yi) and one independent (Xi).
The results (Fig. 4) indicate that there is potential for indirect selection of the weight of 100 seeds (WHS) by evaluating AR, D.Max, and D.Min, irrespective of the P. lunatus variety. The findings reveal a strong (0.6 < r ≤ 0.9) to very strong (0.9 < r < 1) linear correlation (Callegari-Jacques 2009). This phenomenon may be linked to pleiotropy, whereby a single gene influences two or more traits, or to gene linkage. In this case, the effect is positive, since some genes promote the increase of both traits (Falconer and Mackay 1996).
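The correlation analysis described above can be sketched as follows: Pearson's r, the associated t statistic with n − 2 degrees of freedom, and the strength classes quoted in the text. The function names and the paired values are hypothetical and do not reproduce the data in Fig. 4.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's linear correlation coefficient between two samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have the same length (at least 3)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: no linear correlation (n - 2 degrees of freedom)."""
    return r * sqrt((n - 2) / (1 - r ** 2))


def strength(r):
    """Classify |r| using the two classes quoted in the text."""
    magnitude = abs(r)
    if 0.9 < magnitude < 1:
        return "very strong"
    if 0.6 < magnitude <= 0.9:
        return "strong"
    return "outside the classes quoted in the text"


# Hypothetical paired measurements of two traits.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

r = pearson_r(x, y)
t = t_statistic(r, len(x))
print(f"r = {r:.4f}, t = {t:.4f}, class: {strength(r)}")
```

The t value would be compared with the critical value of Student's t distribution for n − 2 degrees of freedom at the 5% level.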
From a genetic breeding perspective, the positive correlation among the traits AR, D.Max, D.Min, and WHS suggests that selection can focus on a single trait to improve the others. This approach helps reduce the number of traits to be evaluated, thereby optimizing the breeder's work and time (Nogueira et al. 2012). The dissimilarity dendrogram, based on morphometric characteristics, comprised two groups of accessions according to the Mojena method (Mojena 1977) with k = 1.25 (Fig. 5).
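The clustering and cut-off rule named above can be sketched in a few lines. The example builds a UPGMA hierarchy from Euclidean distances and then applies a Mojena-type cut, in which the threshold is the mean fusion height plus k times the standard deviation of the fusion heights. The use of the sample standard deviation, the function names, and the five two-dimensional points (labelled V1 to V5) are assumptions made for illustration; they are not the study data.

```python
from math import dist
from statistics import mean, stdev


def upgma(points):
    """Cluster labelled points by UPGMA; return merges as (a, b, height)."""
    clusters = [[label] for label in points]
    merges = []

    def average_distance(a, b):
        # UPGMA: mean of all pairwise Euclidean distances between members.
        return mean(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        candidates = [
            (average_distance(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        height, i, j = min(candidates)  # closest pair of clusters
        merges.append((clusters[i], clusters[j], height))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return merges


def mojena_threshold(merges, k=1.25):
    """Cut-off height: mean fusion height + k * standard deviation."""
    heights = [h for _, _, h in merges]
    return mean(heights) + k * stdev(heights)


def groups_at(points, merges, threshold):
    """Replay the merges whose height does not exceed the threshold."""
    groups = {label: frozenset([label]) for label in points}
    for a, b, height in merges:
        if height <= threshold:
            merged = frozenset(a) | frozenset(b)
            for label in merged:
                groups[label] = merged
    return sorted(sorted(g) for g in set(groups.values()))


# Hypothetical standardized trait values for five varieties.
points = {
    "V1": (0.0, 0.0),
    "V2": (0.0, 1.0),
    "V3": (1.0, 0.0),
    "V4": (10.0, 10.0),
    "V5": (10.0, 11.0),
}
merges = upgma(points)
threshold = mojena_threshold(merges, k=1.25)
groups = groups_at(points, merges, threshold)
print(groups)
```

In this toy example only the final fusion exceeds the threshold, so the hierarchy is cut into two groups.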
Group B is the most divergent and significant group regarding the dissimilarity of the accessions, as it contains the varieties with higher means of AR, D.Max, D.Min, and WHS. The observed divergences among the P. lunatus varieties are substantial, indicating a strong genetic diversity among the accessions. It is therefore important to use these materials in plant breeding programs to obtain materials with the characteristics desired by the consumer market.
Dissimilarity measures are of great importance in studies of genetic diversity, especially when identifying genotypes for hybridization programs. This technique enables the identification of homogeneous groups through a scheme that groups the varieties with the aim of achieving high homogeneity within groups and heterogeneity among them (Cruz et al. 2011).
Table 2 Correlation coefficients with the two principal components (PCs), eigenvalues, total variance, cumulative eigenvalues, and cumulative total in percentage (%) for the thirteen varieties of Phaseolus lunatus L., based on morphological and color characteristics
Table 2 presents the correlation coefficients with the two principal components (PCs), eigenvalues, total variance, cumulative eigenvalues, cumulative total, and associated eigenvectors for each principal component in the Principal Component Analysis (PCA). The variable that had the most substantial impact on PC1 was AR (0.92), whereas for PC2, it was R (0.72).
The contribution of each principal component can be assessed by analyzing the proportion of the total variance explained by that component (Regazzi 2000). In this study, the first two principal components (PCs) accounted for 86.80% of the total variation in the morphological and staining variables of the different P. lunatus seed groups. As a result, these two PCs effectively summarize the total sample variance and can be utilized to analyze the dataset.
The outcome aligns with the criteria proposed by Rencher (2002), which stipulate that the first and second principal components should account for at least 70% of the total variance. In this study, PC1 accounted for 58.41%, and PC2 accounted for 28.39% of the variation in the data. Morphological characteristic variables demonstrated a strong correlation with PC1, while those associated with the RGB color system exhibited a robust correlation with PC2 (Callegari-Jacques 2009).
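The arithmetic behind the 70% criterion can be checked with a short sketch that accumulates the percentage of variance explained by each component. Only the two percentages reported above (58.41% and 28.39%) are taken from the study; the function names are hypothetical.

```python
def cumulative_variance(percentages):
    """Running total of explained variance (%), rounded to two decimals."""
    total = 0.0
    running = []
    for pct in percentages:
        total += pct
        running.append(round(total, 2))
    return running


def components_needed(percentages, minimum=70.0):
    """Number of leading components needed to reach `minimum` (%), or None."""
    for count, cumulative in enumerate(cumulative_variance(percentages), start=1):
        if cumulative >= minimum:
            return count
    return None


reported = [58.41, 28.39]  # PC1 and PC2, as reported in the text
print(cumulative_variance(reported))
print(components_needed(reported))
```

PC1 alone falls short of the 70% minimum, whereas PC1 and PC2 together reach 86.80%, which is why two components are retained.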
In a study with 47 cowpea varieties (Vigna unguiculata L.) from different origins, divergence was analyzed by combining various characteristics, such as the weight of 100 seeds, grain shape, and grain color, among others, with three principal components accounting for 80.32% of the total variance. Implementing this tool enabled the identification of 13 varieties with superior characteristics in production, precocity, and grain quality (Bertini et al. 2010).
Based on the biplot graph, the variables associated with morphological and RGB characteristics are situated to the right of PC1. Varieties SE1, SE3, SE7, SE8, BA9, and AL12 (PC1) possess the potential for higher morphometric values, more vivid hues, and lighter colors. Concerning PC2, varieties SE1, SE2, SE3, SE5, BA9, BA10, and AL13 display higher values for RGB and more vibrant tones.
Fig. 6 Biplot with the projection of the variables of the first two principal components of the 13 varieties of Phaseolus lunatus L., obtained by Principal Component Analysis, from the morphological characteristics and color
Principal Component Analysis has been widely employed in the characterization and selection of common bean varieties (P. vulgaris) (Arevalo et al. 2020; Brito et al. 2020; Domingues et al. 2013) and has also been adopted in lima bean cultivation (Silva et al. 2019). This technique contributes to the identification of desirable characteristics and has been analyzed in previous studies that demonstrate its contribution to variability (Carbonari 2021, Silva et al. 2019). Therefore, it is possible to eliminate variables with low contributions in discriminating the studied genotypes, which is crucial for reducing costs, time, and labor in breeding programs (Cruz et al. 2013).
Several factors must be taken into account for a successful breeding program. In the case of P. lunatus, the consumer market is of utmost importance: for a variety to be considered interesting, it must have large seeds with lighter coloration. In the Northeast region of Brazil, consumers prefer brown and white materials (Soares et al. 2022), making it necessary to consider market demands when selecting and breeding new varieties.
Based on the results of the principal component analysis, it can be inferred that varieties SE1, SE3, and BA9 have desirable morphological and visual characteristics, such as large seeds and light teguments, that meet the demands of consumers. These varieties can be utilized in plant breeding programs to develop new materials with improved agronomic traits.
With high and accurate discriminatory power, Groundeye image analysis based on computer vision is an important method for obtaining data on seed coat color, as well as other important morphological traits for the classification of bean landraces such as height and width of the seed sample. The entire process can also be automated, allowing for the analysis of a large number of samples in a short amount of time (Varga et al. 2019).

Conclusion
The utilization of high-resolution images to assess the morphological characteristics and color of creole P. lunatus seed tegument has proven to be a promising approach for analyzing phenotypic traits. The multivariate analysis of the data obtained through image processing has allowed us to infer that the seeds of SE1, SE3, and BA9 varieties possess morphological and visual features, such as lighter colors and larger sizes, that meet the necessary criteria for use and selection in plant breeding programs.
Therefore, using image analysis, it can be suggested that the dominant colors with a sum of yellow and orange should be greater than 90% when aiming to obtain large seeds with a light seed coat, which are more desirable for cooking.

Fig. 5 Dissimilarity dendrogram among varieties (G) of Phaseolus lunatus L., based on the Euclidean distance and clustered by UPGMA