Number of experiments that should be considered in the cluster analysis of common bean genotypes for plant architecture and grain yield traits

The number of experiments that allows the choice of parents to be used in controlled crossings in a more assertive way in cluster analysis is unknown for plant architecture and grain yield traits in common bean. Therefore, the objective of this work was to determine the number of experiments that should be considered in Tocher's and the unweighted pair group method with arithmetic mean (UPGMA) cluster analyses to identify promising common bean parents for several plant architecture and grain yield traits. Four experiments were carried out in different years and growing seasons at the same site. A randomized block design was used, and 17 common bean genotypes with carioca (beige seed coat with brown streaks) and black grains were evaluated for 12 traits related to plant architecture and five traits related to grain yield. Statistical analyses were performed with data obtained from individual and combined experiments. Significant genotype × experiment interaction was observed for most of the evaluated traits. When Tocher's and UPGMA cluster analyses were performed with data obtained in individual experiments, different groups were formed. The use of data obtained in two, three or four experiments allowed greater reliability in the formation of groups. Three and two experiments are sufficient in Tocher's and UPGMA cluster analyses, respectively, to identify promising carioca and black common bean parents for several plant architecture and grain yield traits in a more assertive way.


Introduction
The breeding of beans (Phaseolus vulgaris L.) in Brazil began with the creation of the Agronomic Station of Campinas in 1887, which years later was renamed the Agronomic Institute of Campinas (Carbonell et al. 2012). Since then, the development of high-yielding common bean cultivars has been one of the main objectives of breeding programs. In recent decades, more effort has been concentrated on selection for upright plant architecture, which allows harvesting to be carried out manually or with a harvester with less grain loss. Plant architecture in common bean has been evaluated by lodging, insertion of the first pod, plant height and other traits described by Moura et al. (2013), Soltani et al. (2016) and Nadeem et al. (2020).

To increase the chances of success in the development of new common bean cultivars with high grain yield and upright plant architecture, it is necessary to know the genetic divergence of the superior parents to be used in controlled crossings. Cluster analyses, especially Tocher and UPGMA, have been shown to be efficient for the identification of superior common bean parents for agronomic (Gonçalves et al. 2016; dos Santos et al. 2019) and seed morphological (Gonçalves et al. 2014) traits. However, most studies carried out to date on the genetic divergence of common bean genotypes based on agronomic traits consider data obtained in a single experiment (Correa and Gonçalves 2012; de Lima et al. 2012; Bertoldo et al. 2014; Gonçalves et al. 2016; dos Santos et al. 2019).
Therefore, when choosing the parents to be used in the crossing blocks, these studies do not consider the effects of the genotype × experiment (environment) interaction that have been reported for most agronomic traits evaluated in common bean genotypes (Moura et al. 2013; Boros et al. 2014; Soltani et al. 2016; Delfini et al. 2017; Nadeem et al. 2020).
When significant genotype × experiment interaction is observed for agronomic traits in common bean, the cluster analysis must consider the environmental variability between years and growing seasons at the site where the experiments are conducted. Accordingly, Cargnelutti Filho et al. (2009) reported that data obtained in six and seven experiments are sufficient to identify divergent common bean cultivars for grain yield, phenological and morphological traits in Tocher's and Ward cluster analyses, respectively. However, these authors evaluated few traits related to plant architecture. Additionally, using data obtained in six or seven experiments to identify superior parents for a breeding program based on several plant architecture and grain yield traits is not viable, because the long evaluation time would make this step extremely expensive and laborious.
The number of experiments that allows the identification of promising common bean parents for several plant architecture and grain yield traits in a more assertive way in cluster analysis should allow greater efficiency, accuracy and speed in the selection of genotypes that will be used for controlled crosses. This information is not known and constitutes an important innovation for common-bean breeding programs. Therefore, the objective of this work was to determine the number of experiments that should be considered in the Tocher's and the UPGMA cluster analyses to identify promising common bean parents for several plant architecture and grain yield traits.

Conducting experiments
Four experiments were carried out in the area of the Common-Bean Breeding Program of the Federal University of Santa Maria (UFSM), Santa Maria, Rio Grande do Sul, Brazil (29° 42ʹ S latitude, 53° 49ʹ W longitude and 95 m altitude). The region's climate is classified as humid subtropical (Alvares et al. 2013), which allows the cultivation of common bean in rainy and dry season crops. The rainy season experiments (2016 and 2017) were sown in October and the dry season experiments (2017 and 2018) in February. The experimental area was kept under a green cover of black oats between July and September.
A randomized block design was used with three replicates. Each experimental plot consisted of four 4-m-long rows spaced 0.5 m apart, but only the two central rows were considered as useful area (4 m²). The treatments consisted of 17 common bean genotypes: four cultivars (Pérola, Carioca, BRS Valente and Guapo Brilhante) and 13 lines obtained by different research institutions that participated in the Value of Cultivation and Use (VCU) experiments. These genotypes have carioca (beige seed coat with brown streaks) or black grains, both from the Mesoamerican gene pool, and are representative of the most produced grain types in Brazil.
The soil of the experimental area is classified as typic alitic Argisol, Hapludalf, and was prepared with plowing and harrowing, which corresponds to conventional cultivation. Fertilization, weed and insect control, and irrigation were performed in accordance with technical recommendations for the cultivation of common bean in the Southern region of Brazil (Ctsbf 2012). Disease control was not carried out, in compliance with the rules established for conducting common bean VCU experiments in Brazil (Brasil 2006).

Traits related to plant architecture and grain yield
The plant architecture was evaluated at maturation stage (R9) by 12 traits. The two qualitative traits, lodging and general adaptation score, were determined in the useful area of the plots using score scales ranging from 1 to 9. For lodging, score 1 characterized upright plants and score 9 was assigned to prostrate plants. For general adaptation score, score 1 defined upright plants, with a large number of pods per plant and without symptoms of disease in the pods and score 9 represented prostrate plants, with few pods per plant and high severity of disease symptoms in the pods.
The other evaluations of plant architecture included quantitative traits that were measured in 10 plants harvested at random from the useful area. The following traits were measured with a measuring tape and expressed in cm: insertion of the first pod, insertion of the last pod, plant height, first-internode length, second-internode length, third-internode length, fourth-internode length and fifth-internode length. The hypocotyl and epicotyl diameters were measured with a digital caliper, in mm, 1 cm below and 1 cm above the cotyledon node, respectively.
The five traits related to grain yield were number of pods per plant, number of grains per plant, number of grains per pod, mass of 100 grains and grain yield. These traits were evaluated in the 10 plants collected at random in the useful area, except for grain yield, which was calculated from the data obtained in all plants harvested in the useful area. Grain moisture was standardized at 13% to obtain the mass of 100 grains (g) and grain yield (kg ha⁻¹).
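The moisture standardization and the conversion of plot yield to kg ha⁻¹ can be illustrated with a minimal sketch. It assumes moisture is expressed on a wet basis and uses the usual dry-matter correction; the function names and the example values are hypothetical and are not taken from the original analysis.

```python
def standardize_to_moisture(mass, moisture_pct, target_pct=13.0):
    """Rescale a grain mass measured at moisture_pct (wet basis, assumed)
    to the mass it would have at target_pct moisture."""
    return mass * (100.0 - moisture_pct) / (100.0 - target_pct)


def plot_yield_kg_ha(grain_g, moisture_pct, useful_area_m2=4.0):
    """Convert grams of grain harvested in the useful area (4 m2 in this
    study) to kg per hectare at 13% moisture."""
    grain_g_13 = standardize_to_moisture(grain_g, moisture_pct)
    # g per m2 -> kg per ha: x 10,000 m2/ha and / 1,000 g/kg, i.e. x 10
    return grain_g_13 / useful_area_m2 * 10.0
```

For example, under these assumptions 1,000 g of grain at 13% moisture harvested from the 4 m² useful area corresponds to 2,500 kg ha⁻¹.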

Statistical analyses
The data obtained were subjected to individual analysis of variance considering the following experiments: 2016 rainy (I), 2017 dry (II), 2017 rainy (III) and 2018 dry (IV) season crops. Hartley's maximum F-test was applied to verify the homogeneity of the residual variances.
The combined analyses of variance were performed considering the data obtained in experiments I and II (2016 rainy and 2017 dry season crops), I, II and III (2016 rainy, 2017 dry and 2017 rainy season crops) and I, II, III and IV (2016 rainy, 2017 dry, 2017 rainy and 2018 dry season crops). In these analyses, the genotype effect was considered fixed and the other effects (experiment, genotype × experiment interaction and error) were considered random. Since the ratio between the highest and lowest residual mean squares was greater than seven, it was necessary to correct the degrees of freedom of the error and of the genotype × experiment interaction (Cruz 2016), and this allowed homogeneous residual variances to be obtained for all evaluated traits.
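The homogeneity check described above reduces to a ratio of residual mean squares. The sketch below is only an illustration of that rule: the function names are hypothetical, and the threshold of seven is the one stated in the text.

```python
def max_min_ratio(residual_mean_squares):
    """Ratio between the highest and lowest residual mean squares
    of the individual experiments."""
    values = list(residual_mean_squares)
    if not values or min(values) <= 0:
        raise ValueError("residual mean squares must be positive")
    return max(values) / min(values)


def needs_df_correction(residual_mean_squares, threshold=7.0):
    """True when the ratio exceeds the threshold, i.e. when the degrees of
    freedom would need correcting before the combined analysis."""
    return max_min_ratio(residual_mean_squares) > threshold
```

With hypothetical residual mean squares of 1.0, 2.0 and 8.0, the ratio is 8 and the correction would be required.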
Multicollinearity diagnosis was performed with the phenotypic correlation matrix obtained in the combined analysis of variance of the four experiments (I, II, III and IV). The condition number (CN), which corresponds to the ratio between the highest and lowest eigenvalues of the matrix, was evaluated according to the collinearity classes established by Montgomery et al. (2012). Highly correlated traits with greater weight in the last eigenvectors were excluded to obtain weak collinearity, that is, CN ≤ 100, before performing the cluster analyses.
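As an illustration of the condition number diagnosis, the following sketch computes CN from a correlation matrix with NumPy. The function names are hypothetical. The boundary CN ≤ 100 for weak collinearity comes from the text; the boundary of 1,000 between the moderate-to-strong and severe classes is an assumption based on the commonly cited Montgomery classes.

```python
import numpy as np


def condition_number(corr):
    """Ratio between the largest and smallest eigenvalues of a symmetric
    correlation matrix."""
    corr = np.asarray(corr, dtype=float)
    eigvals = np.linalg.eigvalsh(corr)  # ascending order for symmetric input
    return eigvals[-1] / eigvals[0]


def collinearity_class(cn):
    """Classify CN; the 1,000 boundary is an assumed threshold."""
    if cn <= 100:
        return "weak"
    if cn <= 1000:
        return "moderate to strong"
    return "severe"
```

Applied to the values reported in this study, a CN of 3,273.23 falls in the severe class and a CN of 75.26 in the weak class.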
The cluster analyses considered the data obtained for individual (I, II, III and IV) and combined (I and II; I, II and III; and I, II, III and IV) experiments. For this, the residual variance and covariance matrices obtained in the analyses of variance of these experiments were used to generate the genetic dissimilarity matrices between the common bean genotypes using Mahalanobis' generalized distance with standardized means. The Mahalanobis' generalized distance analysis was also applied to identify the traits with the greatest contribution to genetic divergence considering individual and combined experiments.
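A minimal sketch of the dissimilarity matrix follows, assuming the usual form of Mahalanobis' generalized distance, D²(i, j) = (m_i − m_j)' E⁻¹ (m_i − m_j), where m_i is the vector of trait means of genotype i and E is the residual variance and covariance matrix. The function name and the array layout are hypothetical; this is not the routine implemented in the Genes program.

```python
import numpy as np


def mahalanobis_d2(means, resid_cov):
    """Pairwise Mahalanobis generalized distances (D2).

    means: array of shape (genotypes, traits) with genotype means.
    resid_cov: residual variance-covariance matrix, shape (traits, traits).
    """
    means = np.asarray(means, dtype=float)
    inv_cov = np.linalg.inv(np.asarray(resid_cov, dtype=float))
    # diff[i, j, :] holds the difference between genotypes i and j
    diff = means[:, None, :] - means[None, :, :]
    # quadratic form diff' * inv_cov * diff for every pair (i, j)
    return np.einsum("ijk,kl,ijl->ij", diff, inv_cov, diff)
```

With an identity residual matrix the result reduces to the squared Euclidean distance, which gives a quick sanity check for the function.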
The following cluster analyses were performed for individual (I, II, III and IV) and combined (I and II; I, II and III; and I, II, III and IV) experiments: Tocher's optimization and hierarchical unweighted pair group method with arithmetic mean (UPGMA). The cophenetic correlation coefficient (CCC) was established from Pearson's linear correlation between the elements of the cophenetic matrix and the elements of the dissimilarity matrix to verify the consistency of the clustering pattern. All statistical analyses were performed using the Genes program (Cruz 2016).
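Tocher's optimization can be sketched as follows. This is a simplified version under stated assumptions, not the Genes implementation: the limit θ is taken as the largest of the minimum distances of each genotype, a group starts with the closest remaining pair, and a genotype joins a group while its average distance to the members does not exceed θ. The function name `tocher` is hypothetical.

```python
import numpy as np


def tocher(dist):
    """Simplified Tocher grouping on a symmetric dissimilarity matrix.

    Returns (groups, theta), where groups is a list of lists of indices.
    """
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    off = d + np.diag(np.full(n, np.inf))  # mask self-distances
    theta = off.min(axis=1).max()          # largest of the minimum distances
    remaining = list(range(n))
    groups = []
    while remaining:
        if len(remaining) == 1:
            groups.append([remaining.pop()])
            break
        sub = off[np.ix_(remaining, remaining)]
        a, b = np.unravel_index(np.argmin(sub), sub.shape)
        if sub[a, b] > theta:
            # no remaining pair is close enough: each forms its own group
            groups.extend([g] for g in remaining)
            break
        group = [remaining[a], remaining[b]]
        remaining = [g for g in remaining if g not in group]
        while remaining:
            # average distance of each candidate to the current group
            avg = [d[g, group].mean() for g in remaining]
            k = int(np.argmin(avg))
            if avg[k] > theta:
                break
            group.append(remaining.pop(k))
        groups.append(group)
    return groups, theta
```

Because θ depends on the whole distance matrix, a single very distant genotype raises the limit and tends to concentrate most genotypes in the first group.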

Results and discussion
Individual and combined analyses of variance
Significant genotype × experiment interaction was observed for most of the traits evaluated in one or more combined analyses of variance (Table 1). Therefore, the agronomic performance of the common bean genotypes was not constant in the different years and growing seasons. Similarly, significant genotype × experiment (environment) interaction has been described for most of the plant architecture and grain yield traits evaluated in common bean genotypes (Moura et al. 2013; Boros et al. 2014; Maziero et al. 2015; Soltani et al. 2016; Delfini et al. 2017; Ribeiro et al. 2018; Nadeem et al. 2020). When a significant genotype × experiment interaction is observed for several agronomic traits in common bean, it is not recommended that cluster analysis be performed based on data obtained in a single experiment, because this strategy does not consider the environmental variability between years and growing seasons at the site where the experiments are conducted (Cargnelutti Filho et al. 2009).
There was a significant genotype effect for most of the evaluated traits in individual and combined experiments. Therefore, there is genetic variability for the plant architecture and grain yield traits in common bean genotypes, and this allowed the study of genetic divergence. However, no significant genotype × experiment interaction or genotype effects were observed for first-internode length, fourth-internode length and fifth-internode length in the combined experiments (I and II; I, II and III; and I, II, III and IV). Therefore, these traits were not included in the cluster analyses.
The diagnosis of multicollinearity revealed CN = 3,273.23, which corresponds to the severe collinearity class according to the criteria proposed by Montgomery et al. (2012). Therefore, it was necessary to exclude the traits with high correlation and with greater weight in the last eigenvectors: third-internode length, number of grains per plant, plant height and epicotyl diameter. The removal of these four traits resulted in weak collinearity (CN = 75.26) and this prevented multicollinear variables from implicitly receiving greater weights in cluster analyses (Cruz and Carneiro 2006), allowing the correct interpretation of the results obtained in the cluster analyses.
Table 1 Results of the F test of analysis of variance for the traits of lodging (LDG), general adaptation score (GAS), insertion of the first pod (IFP, cm), insertion of the last pod (ILP, cm), plant height (PH, cm), first-internode length (1st IL, cm), second-internode length (2nd IL, cm), third-internode length (3rd IL, cm), fourth-internode length (4th IL, cm), fifth-internode length (5th IL, cm), hypocotyl diameter (HD, mm), epicotyl diameter (ED, mm), number of pods per plant (NPP), number of grains per plant (NGP), number of grains per pod (NGPOD), mass of 100 grains (M100G, g), and grain yield (YIELD, kg ha⁻¹) obtained for 17 common bean genotypes evaluated in the 2016 rainy (I), 2017 dry (II), 2017 rainy (III) and 2018 dry (IV) seasons and in the combined experiments I and II; I, II and III; and I, II, III and IV

Tocher's cluster analysis
The results obtained with Mahalanobis' generalized distance showed that the order of the three traits that contributed most to the differentiation of common bean genotypes differed among the individual experiments (I, II, III and IV) (Table 2). However, the mass of 100 grains was the trait that most contributed to the differentiation between common bean genotypes in the combined experiments I and II (29.36%), I, II and III (29.44%) and I, II, III and IV (31.51%). Similarly, Coelho et al. (2010) found that the mass of 100 grains exhibited the greatest relative contribution to the separation of common bean genotypes when Mahalanobis' generalized distance was used, although the order of the other agronomic traits important for the recognition of these differences varied in each evaluation year. The mass of 100 grains has also been described as the most important agronomic trait to assess the genetic dissimilarity of common bean genotypes based on data obtained in one (Correa and Gonçalves 2012) or three (Cabral et al. 2011) experiments.
In the present study, the mass of 100 grains made the greatest contribution to the formation of different groups in the cluster analyses when the results obtained in two, three or four experiments were used. Therefore, data obtained in two experiments were sufficient to recognize differences between carioca and black bean genotypes, based on plant architecture and grain yield traits, in the Mahalanobis' generalized distance analysis.
When Tocher's cluster analysis was performed with data obtained in individual experiments, the number of groups formed and the composition of these groups were different (Table 3). This can be explained by the fact that most of the plant architecture and grain yield traits showed a significant genotype × experiment interaction effect (Table 1). When this happens, the selection of superior parents for agronomic traits will be different in each experiment, which creates difficulties for common-bean breeding programs. Previous studies in which significant genotype × experiment interaction was found for most of the agronomic traits also showed that the groups formed in Tocher's cluster analysis were not the same in each evaluated experiment (environment) (Ceolin et al. 2007; Coelho et al. 2010). Therefore, when a significant genotype × experiment interaction is observed for most agronomic traits, presenting the results of Tocher's analysis for each growing environment is not the best option for the breeding program, because the identification of superior or redundant genotypes will vary with the growing environment.
Therefore, the environmental variability between growing seasons and years at the location where the experiments are conducted must be considered in the cluster analysis. Cargnelutti Filho et al. (2009) reported that data obtained from six experiments were sufficient to identify divergent cultivars by Tocher's cluster analysis based on grain yield, phenology and morphology traits. However, using data obtained from six experiments to identify superior parents or duplicate accessions in a breeding program increases the time needed to assess genetic divergence and considerably delays decisions on the genotypes that can be used in the crossing blocks.
Data from two experiments (I and II) resulted in the division of common bean genotypes into two groups (Table 3), making it possible to differentiate the line TB 02-19 (group 2) from the other evaluated genotypes (group 1). However, when considering the data obtained in three (I, II and III) or four (I, II, III and IV) experiments, it was possible to differentiate three groups with identical composition. Group 1 included 15 carioca and black bean genotypes, corresponding to 88.23% of the evaluated genotypes; group 2 consisted of the cultivar Pérola (carioca beans); and group 3 was composed of line TB 02-19 (black beans). Tocher's cluster analysis has been efficient in differentiating common bean genotypes for agronomic traits, despite the fact that group 1 normally concentrates the largest number of evaluated genotypes (de Lima et al. 2012; Gonçalves et al. 2016; Pereira et al. 2019; dos Santos et al. 2019). In the present study, the use of data obtained in three or four experiments allowed greater reliability in the formation of groups in Tocher's cluster analysis.
The data obtained in three or four experiments allowed recognition of the differences between the common bean genotypes grouped in each of the three groups. Group 1 included the common bean genotypes of upright plant architecture, characterized by the lowest lodging and general adaptation score values and the largest hypocotyl diameter (Table 4). These genotypes also had the highest number of pods per plant, number of grains per pod and grain yield values among the three groups formed. Groups 2 and 3 were characterized by common bean genotypes of upright plant architecture, that is, higher second-internode length value; however, they showed low grain yield. All evaluated genotypes had medium-sized grains (25 to 40 g), which meet the market demand for carioca and black beans (Carbonell et al. 2010). The results obtained showed that the common bean genotypes in group 1 have a greater number of traits that confer an upright plant architecture and high grain yield, with great potential for use in controlled crosses.
Table 4 Means of the traits of lodging (LDG), general adaptation score (GAS), insertion of the first pod (IFP, cm), insertion of the last pod (ILP, cm), second-internode length (2nd IL, cm), hypocotyl diameter (HD, mm), number of pods per plant (NPP), number of grains per pod (NGPOD), mass of 100 grains (M100G, g), and grain yield (YIELD, kg ha⁻¹) obtained in each of the three groups established by Tocher's optimization method, from Mahalanobis' generalized distance, in the combined experiment I, II, III and IV (2016 rainy, 2017 dry, 2017 rainy and 2018 dry season crops)
In the present study, the groups formed in Tocher's cluster analysis were identical when considering data obtained in three or four experiments. Therefore, data obtained in three experiments were sufficient for studies of genetic divergence in Tocher's cluster analysis, allowing the identification of carioca and black common bean parents with a greater number of traits that confer upright plant architecture and high grain yield.

UPGMA cluster analysis
The lowest CCC was obtained in the UPGMA cluster analysis performed with data obtained in the 2016 rainy season crop (0.5591) and the highest CCC was found with data obtained in the 2017 dry season crop (0.9335) (Fig. 1), all of which were significant at 1% probability by the t test. A similar range of variation was observed for CCC values obtained in the UPGMA cluster analysis, considering agronomic and/or morphological traits evaluated in one (Gonçalves et al. 2014, 2016), two (Arteaga et al. 2019) and three (Cabral et al. 2011) experiments with different common bean genotypes. The closer the CCC value is to one, the better the fit between the cophenetic matrix and the dissimilarity matrix based on Mahalanobis' generalized distance, resulting in greater cluster reliability (Cabral et al. 2011). In the present study, the highest CCC values (≥ 0.8746) were obtained in the 2017 dry season (Fig. 1) and in the combined experiments I and II; I, II and III; and I, II, III and IV (Fig. 2), indicating greater reliability in the representation of the groups formed in these UPGMA cluster analyses.
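The relationship between the UPGMA dendrogram and the CCC can be illustrated with SciPy, where average linkage corresponds to UPGMA. This is a sketch for illustration only; the wrapper name `upgma_ccc` is hypothetical and the analyses in this study were run in the Genes program.

```python
import numpy as np
from scipy.cluster.hierarchy import cophenet, linkage
from scipy.spatial.distance import squareform


def upgma_ccc(dist):
    """Build a UPGMA tree from a symmetric dissimilarity matrix and return
    (linkage matrix, cophenetic correlation coefficient)."""
    # squareform converts the square matrix to the condensed vector form
    condensed = squareform(np.asarray(dist, dtype=float), checks=True)
    tree = linkage(condensed, method="average")  # average linkage = UPGMA
    # Pearson correlation between cophenetic and original distances
    ccc, _ = cophenet(tree, condensed)
    return tree, ccc
```

A CCC close to one indicates that the dendrogram distorts the original dissimilarities very little, which is the sense in which the values above are read as cluster reliability.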
The groups formed in the 2016 rainy, 2017 dry, 2017 rainy and 2018 dry season crops were different in the dendrograms obtained in the UPGMA cluster analysis (Fig. 1), confirming the results observed in Tocher's cluster analysis (Table 3). This is because when cluster analysis was performed based on data obtained in a single experiment, the environmental variability between years and growing seasons at the site where the experiments were conducted was not considered. In the present study, it was possible to verify that when the UPGMA cluster analysis was performed with data from one experiment, there was no repeatability in characterizing the genetic divergence of carioca and black bean genotypes for plant architecture and grain yield traits.
However, the dendrograms generated in the UPGMA cluster analysis, considering data from two, three and four experiments, formed two identical groups when 70% similarity was adopted as the criterion for defining the groups (Fig. 2). Group 1 contained the line TB 02-19 and group 2 was composed of the other carioca and black common bean lines and cultivars. When the UPGMA cluster analysis was applied to morphological traits evaluated in two experiments with common bean genotypes, it was also possible to group the genotypes into just two groups (Guidoti et al. 2018; Arteaga et al. 2019). In the present study, it was not possible to separate carioca and black common bean genotypes into different groups for the plant architecture and grain yield traits. The difficulty of separating carioca and black common bean genotypes into different groups by cluster analysis was also reported for agronomic (Pereira et al. 2019) and molecular (Veloso et al. 2015) traits. Because crosses between parents with both types of grains were carried out in the process of developing new carioca and black common bean cultivars, this resulted in genetic similarity (Veloso et al. 2015). For this reason, carioca and black common bean lines and cultivars have a narrow genetic basis, making it difficult to differentiate common bean genotypes of different grain types based on agronomic traits (Delfini et al. 2017; Pereira et al. 2019).
The results obtained in the UPGMA cluster analysis showed that the inclusion of data obtained in three or four experiments did not change the clustering pattern of common bean genotypes in relation to the analysis with data from two experiments. Therefore, data from two experiments were sufficient in the UPGMA cluster analysis to obtain a dendrogram with high reliability in the formation of the groups. This allows the identification of promising carioca and black common bean parents for plant architecture and grain yield traits in a more assertive manner.

Conclusions
Data obtained from three and two experiments are sufficient in Tocher's and UPGMA cluster analyses, respectively, to identify promising carioca and black common bean parents for several plant architecture and grain yield traits in a more assertive manner.