Mega-Environment Analysis and Identify Stable and High-Yielding of New Promising Black Soybean Lines in Indonesia

Stable and high-yielding are the major goals of black soybean breeding. Testing new lines in a mega-environment is one of the development processes in black soybean breeding. The aims of the research were (i) to identify the effects of genotype, environment, and GEIs on the grain yield of soybean lines in Java Island; (ii) to select stable and high yielding soybean lines; and (iii) to determine the discriminative environments, and (iv) to determine the concept of stability measurements on black soybean grain yields. Field trials of 10 new F8 promising lines and three check varieties were conducted under eight different environments during four years (2016–2019). The measurement results showed that the grain yield was inuenced by genotype (8.35%), environment (59.49%), and GEIs (32.16%). Grain yield stability measurements showed that the four newly lines was identied had high yields and stable in eight environments, they were A-5A-PSJ (S2), DB-96-CTY (S5), UP 161 (S6), and UP 162 (S7). The Ngawi (2017) followed by Bogor (2019) and Banyuwangi (2016) has the strongest interactive capabilities, and was suitable for used as a trial environments. Grain yield (Y) was identied as having a positive and signicant correlation (p < 0.05) with S (3) , S (6) , NP (2) , NP (3) , NP (4) , KR, and YSI stability measurements, which indicated that they were included in the concept of dynamic stability measurement. high yield stability and high yielding in suitable locations for recommended lines. Based on this description, the objectives of this study were (i) to identify the effects of genotype, environment, and GEIs on the grain yield of soybean lines in Java Island; (ii) to select stable and high yielding soybean lines; and (iii) to determine the discriminative environments, and (iv) to determine the concept of stability measurements on black soybean grain yields. The results information was useful in the development of soybean lines in the future.


Introduction
Black Soybeans contain many nutrients that are bene cial for humans. The current research of black soybean leading to the results of secondary metabolites that are bene cial to health. There were many secondary metabolites of black soybeans that are bene cial to humans, one of which is iso avones.
Iso avones are a group of avonoid compounds that produce natural antioxidants 1 . The bene ts of iso avones for humans include preventing cancer, heart attack, and climateric syndrome 2,3 . Iso avones in black soybeans were found in all plant organs, especially soybean seeds. Sumardi et al. 4 reported that the largest types of iso avones contained in black soybean seeds were daedzein and genistein. The important role of black soybeans for food and human health, makes this commodity increasingly needed by the community so that its demand was predicted to increase.
In Indonesia, black soybeans were still needed as raw materials for production in soy sauce industries. The advantage as a raw material for soy sauce was to improved the quality of the color to dark chocolate. Soy sauce made from black soybeans "Merapi" has a protein content of 3.20%, higher than the soy sauce from the yellow soybean "Argomulyo" (2.37%) 5 . The large role of black soybeans for functional foods, makes the demand for black soybeans increase.
Triyanti 6 reported that soybean production in 2020 had increased to 632,333 tons from 2019 which produced 424,190 tons. However, this amount was still not able to meet the higher demand for black soybeans, so it was predicted that the import value of s black oybeans to meet domestic needs was still high.
One of the efforts that can be done is by assembling new superior black soybean that have high grain yields. The stages that must be passed in the assembly of promising lines are testing in a variety of environments (multilocation trial). Multi-location trials was carried out to obtain information on stable and adaptive lines in all environments as well as speci c adapted lines with high yields. The more environments used for testing (mega-environment), the higher the accuracy of stable lines 7 . According to Srivastava et al. 8 , mega-environment analysis using the GGE biplot provides information on suitable sites for extensive adaptation and suitable sites for speci c adaptation testing. Silva et al. 9 reported that mega-environment trials could lead to complex GEIs and obtained information about the relationship between trial environments. Information obtained from mega-environment trials was very helpful for plant breeders in manages of GEIs and extracting them into the same agro-climatic concept 10 . According to Zdiarski et al. 11 , mega-environment trial provides information on speci c adapted lines based on regional targets and regional climate. In addition, based on these environment groupings, lines that show high yield stability and high yielding in suitable locations for recommended lines. Based on this description, the objectives of this study were (i) to identify the effects of genotype, environment, and GEIs on the grain yield of soybean lines in Java Island; (ii) to select stable and high yielding soybean lines; and (iii) to determine the discriminative environments, and (iv) to determine the concept of stability measurements on black soybean grain yields. The results information was useful in the development of soybean lines in the future.

Result
Combined Analysis of Variance (ANOVA) to identify GEIs AMMI ANOVA (Table 1) revealed that the effect of genotype, environment, and their interactions (GEIs) had a signi cant effect (P < 0.05) on soybean grain yield variations. The main effect of genotype accounted for 8.35% of the total variation, while environment and GEIs accounted for 59.49% and 32.16%, respectively. These results indicated that grain yields were signi cantly affected by environmental changes, followed by effects of GEIs and genotypes (Table   1). Highly signi cant environmental effects and high variance components can be attributed to large differences between test environments such as elevation, and differences in the amount and distribution of annual rainfall ( Table 2). The GEIs were further divided into seven IPCAs, of which the rst two IPCAs showed signi cant contributions. IPCA1 contributed 42.66% to the number of squared GEIs. The rst two IPCAs accounted for 69.30% of the total variation in GEIs indicating the accuracy of the data. The magnitude of the GEIs effects was also be seen in the box plot ( Figure 1) and the GEIs heat map ( Figure 2). In Figure 1 the distribution of the average grain yield in each environment showed high variation. The GEIs heatmap ( Figure 2) showed the relationship between each environment and the average yield. In Figure 2, the environment was divided into three main clusters, namely environmental groups that have high average grain yields (E4 and E5), low grain yield (E2, E6, and E7), and medium grain yield (E1, E3, and E8). This data showed that the in uence of the environment was very strong on soybean grain yields.
Parametric and non-parametric stability measurements on soybean grain yields The stability of soybean grain yields in multi-environmental trials based on parametric and non-parametric measurements was presented in Table 3. Linear regression measurements on environmental indexed showed that grain yields of the two stability parameters bi and S 2 di were inconsistent in assessing the line response to various environmental conditions. Almost all lines showed a regression coe cient value (bi) which was not signi cantly different from one (1), on the contrary, some lines also showed a signi cant deviation from the regression value (S 2 di) greater than zero (Table 3). Based on the regression coe cients, all lines had an average response in all test environments. Thus, the lines UP161 (S6), UP 162 (S7), and UP 163 (S8) were declared the most stable according to this measurements. The other lines, Detam 1 (S11) and Detam 3 (S12), had a regression deviation (S 2 di) which was signi cantly greater than zero and a bi value that was not different from one, indicating that these two lines were better adapted to environments that produce high yields (favorable). The Wi 2 , σ² , dan θ stability measurements, showed the same results in determining the stability of soybean grain yields. Based on these three stability measurements, the A-5A-PSJ (S2) was declared the most stable line, followed by the UP 161(S6), DB-96-CTY (S5), and UP163 (S8) lines. CVi measurement indicated A-8A-PSJ (S4) as the most stable, followed by A-4A-PSJ (S1) and UP 162 (S7). θ stability measurement indicated the Detam 3 (S12) line as the most stable, followed by A-7A-PSJ (S3) and GH Detam 5 (S13). These three lines have a high average yield. ASV measurement revealed UP 161 (S6), UP 162 (S7), and A-5A-PSJ (S2) lines as the most stable. YSI measurement indicated UP 162 (S7) as the most stable line, followed by DB-96-CTY (S5), and A-5A-PSJ (S2). All three lines had high average grain yields across the eight test environments. In non-parametric stabiility measurements, DB-96-CTY (S5) was declared the most stable by S (3) and KR measurements. The UP 161 (S6) line was declared the most stable by NP(1) measurement. The UP 162 (S7) line was declared the most stable by S ¹ , S , NP ² , NP ³ , and NP ⁴ measurements. The GH Detam 5 (S13) line was declared the most stable by S (1) and S (2) measurements.
The grouping of soybean lines tested in this study was presented in Figure 3. The grouping was the output of Hierarchical Clustering Analysis (HCA) on the stability rank's of the tested lines based on parametric and non-parametric measurements. In the Figure 3, soybean lines were divided into three main groups, namely the unstable low yield cluster containing the A-4A-PSJ (S1), UP 165 (S10), A-8A-PSJ (S4), and GH Detam 5 (S13) lines; unstable high yield cluster containing Detam 1 (S11), Detam 3(S12), A-7A-PSJ(S3), and UP 164 (S9) lines; and a stable high yield cluster containing DB-96-CTY (S5), UP 162 (S7), UP 161 (S6), UP 163 (S8), and A-5A-PSJ (S2). The third group was preferred because it has stable and high grain yields than the overall average.
The relationship between parametric and non-parametric stability measurements with soybean line grain yields The relationship between stability measurements was estimated using Spearmans correlation to the stability rank's of each measurement. The results of the correlation analysis was presented in Table 4. The grain yield (Y) was identi ed as having a positive and signi cant correlation (p<0.05) with the S (3) , S (6) , NP (2) , NP (3) , NP (4) , KR, and YSI stability measurements. Another positive and signi cant correlation was shown by S (1) with S (2) (p<0.01); NP (1) with bi (p<0.01); KR with Wi 2 , σ² , S2di, θ , ASV, YSI (p<0.01), and θ (p<0.05); and CVi with ASV (p<0.05). To classify the relationship between each measurement, Principal Component Analysis (PCA) was used on the grain yield stability rank's. The results of the PCA measurement showed that the rst four PCs was identi ed produced an eigenvalue > 1, with an effect of 93.14% of the total variation (Table 5). PCA biplots regarding the relationship between stability measurements and grain yield were drawn based on the PC1 and PC2 values ( Figure 4). Based on Figure 4, the grain yield (Y) was in one group with NP (2) , NP (3) , NP (4) , S (3) , S (6) , YSI, and KR measurements. The S (1) grouped with S (2) . NP (1) grouped with bi. ASV grouped with S 2 di, θ , σ² , and Wi 2 . The CVi measurement was weakly correlated with the last group, while the θ measurement was not correlated with all other measurements.

AMMI biplot analysis of soybean grain yield
The rst two principal components (IPCA1 and IPCA2), accounted for 69.2% of the total variation in GEIs ( Figure 5). The length of the environment vector from the center of the biplot was proportional to the number of GEIs. An environment with a longer vector has a strong interaction, while an environment with a shorter vector has a weak interaction. The Ngawi (2017) environment has the strongest interactive power followed by Bogor (2019) and Banyuwangi (2016).
The rst IPCA (IPCA1) was plotted against the line and environmental mean ( Figure 6). The lines were distributed below and above the mean IPCA1 values of -40 to +60. The environments and lines on the left hand side of the vertical line have grain yields below the overall average, while those on the right side of the origin have grain yields above the overall average. The environments of Ngawi (2016), Ngawi (2017) and Probolinggo (2018) have large positive IPCA1 scores and have above-average grain yields. The Ngawi (2017) environment was classi ed as having the highest average yield compared to other environments, and the lines that had the same IPCA sign and close to Ngawi (2017) were S5 and S7. The Malang (2018) environment was identi ed as having the lowest average grain yield and a small positive IPCA value close to zero, while Banyuwangi (2017) had a small negative IPCA value close to zero with an average grain yield less than the overall average. Lines that are close to the horizontal line have stable grain yields. Overall, of the 13 lines tested, six lines (46%) produced grain yields above the overall average grain yield with three lines being close to the horizontal line, and seven lines (54%) produced grain yields below the overall average grain yields.

Mega-environments analysis of 13 soybean lines in eight environments
The genetic correlations between the eight environments was presented in Table 5 and re ected in the GGE biplot ( Figure 7). In the GGE biplot analysis in 2016-2019, PC1 explained 46.28% of the total variation and PC2 explained 23.56% of the total variation, accounting for 67.84% of the total variation for grain yields (Figure 7). Biplots based on average data during four years showed that the eight test environments had ve sectors with different winning lines (vertex). The vertex lines were A-7A-PSJ (S3), UP 164 (S9), GH Detam 5 (S13), A-8A-PSJ (S4), and Detam 3 (S12).  (2017) were the two individual mega-environments (ME-III and ME-IV) with A-8A-PSJ (S4) and UP 164 (S9) as vertex lines, respectively. Vertex line GH Detam 5 (S13), without the environment in its sector, so it's not a line that has a high average grain yield in all environment.

Discussion
The development of stable and high yielding black soybean lines is one of the goals of the current breeding program. Effects of GEIs are frequent for many quantitative traits including grain yield 12 , reproductive tness, harvest time, and biotic and abiotic resistance 13 . The emergence of GEIs effects in multienvironment experiments makes it di cult to select the superior line 12,14 . In this study, the combined ANOVA showed that the effects of environment (E) and GEIs explained 59.49 and 32.16% of the total variability of the grain yield, respectively ( Table 1). The box plot (Fig. 1) and heatmap GEIs (Fig. 2) also showed that the tested environments and lines had high grain yield variations. Since combined ANOVA cannot explain the effect of GEIs, other statistical models such as AMMI are more useful. Therefore, parametric and non-parametric measurements were used to estimate the adaptability of the lines to a wide range of environments. The magnitude of the environmental and GEIs effects showed that the test environment had a signi cant effect on grain yields. Differences in rainfall, humidity, and light in each environment during the experiment had a signi cant effect on soybean grain yields 15 . In this study, the Ngawi (2017) had the highest average grain yield (2.54 t/ha) compared to other environments. The average daily rainfall during the experiment in Ngawi (2017) was 15.31 mm (Table 2). According to Mandic et al. 16 , rainfall was related to soybean grain yields. In other studies, rainfall also affects in sweet potato yields 17 . In addition, the humidity during the experiment also showed suitable conditions for growth and development of soybean.
Stability and adaptability analyzes were used when the results of the combined ANOVA indicated the presence of GEIs. In this case, we used various stability measurements, namely parametric, non-parametric, AMMI, and GGE biplot. The use of various measurements aims to increase accuracy in selecting stable and high yielding lines tested 12,18,19 . The parametric and non-parametric measurements presented in Table 4, showed that each measurement selects a different stable line. This is because each measurement has different assumptions in terms of calculating stability and adaptability. However, there were several measurements showing uniform output, namely Wi 2 , σ² , and θ . In other studies, it was also reported that there were parametric measurements that had similar results, including Vaezi et al. 12 on barley in iran. According to Karuniawan et al. 17 , if there were stability measurements that have similar outputs, one of them can be chosen to estimate the stability of the yield. Ruswandi et al. 20 showed different things, all parametric and non-parametric measurements showed differences in selecting stable maize. This difference in output indicates that another approach was needed to classify the stable lines based on parametric and non-parametric measurements. The approach commonly used in cases like this was Hierarchical Cluster Analysis (HCA) 12,17 . Based on the HCA results, the tested lines were divided into three main groups, namely unstable low yield, unstable high yield, and stable high yield groups (Fig. 3). Lines belonging to the unstable low yield group include A-4A-PSJ (S1), UP 165 (S10), A-8A-PSJ (S4), and GH Detam 5 (S13). This rst group was less favored because of low grain yields, which will result in low economic income. The second group, namely unstable high yield, consists of the Detam 1 (S11), Detam 3 (S12), A-7A-PSJ (S3), and UP 164 (S9) lines. The second group can be used for speci c adapt. Meanwhile, the third group, namely stable high yields, consists of DB-96-CTY (S5), UP 162 (S7), UP 161 (S6), UP 163 (S8), and A-5A-PSJ (S2). The third group was the expected ideal group. According to 21 , a stable and high yielding lines is one of the targets for soybean breeding. In addition, lines with high and stable yields are expected to increase farmers' income 19 . Thus, this group was strategically developed for a further sustainable soybean breeding program.
To classify parametric and non-parametric stability measurements, the principal component analysis (PCA) was used. The results of PCA (Fig. 4) and correlation analysis (Table 4) showed that the grain yield (Y) in the same group with NP (2) , NP (3) , NP (4) , S (3) , S (6) , KR, and YSI. This group belongs to the concept of dynamic stability 12,18 . Another group that correlates with each other were S (1) with S (2) ; bi with NP (1) ; S 2 di with ASV, θ , σ² , and Wi 2 ; While CVi and θ were not grouped with other measurements. These groups were included in the concept of static stability 12 . Thus, dynamic stability groups can be used to select lines in favorable environments 18 . While the static stability group can be used to select lines in a unfavorable environment.
AMMI biplots showed that the rst two principal components (IPCA1 and IPCA2), explained 69.2% of the total variation of GEIs (Fig. 5). In In the AMMI biplot PC1 vs Yield (Fig. 6), the rst IPCA biplot (IPCA1) was plotted against the line and environmental mean. The environments and lines tested on the left side of the vertical line had grain yields below the overall average grain yield, while those on the right had above the overall average grain yield 19,23 . Ngawi (2017) environment was classi ed as had the highest average yield compared to other environments, and the lines that close to this environment were DB-96-CTY (S5) and UP 162 (S7). The Malang (2018) environment was identi ed as had the lowest average yield and a small positive IPCA value close to zero while Banyuwangi (2017) has a small negative IPCA value that was close to zero with an average grain yield of less than the overall average grain yield. These two environments were not suitable for testing and development of soybeans. Lines that were close to the horizontal line have stable grain yields 24 . Overall, of the 13 lines tested, six lines (46%) produced grain yields above the overall average grain yield, namely A-5A-PSJ (S2), A-7A-PSJ (S3), DB-96-CTY (S5), UP 162 (S7), UP 164 (S9), and Detam 3 (S12), with three lines that were close to the horizontal line (stable), namely A-5A-PSJ (S2), DB-96-CTY (S5), and UP 162 (S7). While the other seven lines (54%) had grain yields below the overall average grain yield. According to several researchers, the lines that were on the right side and close to the horizontal line were potential lines and can provide economic income 19,23 .
The mega-environment trials uses the GGE biplot "which won where" (Fig. 7). The results of the GGE biplot analysis provided an opportunity to detect differences between environments with different characteristics related to the performance of the tested lines in these environments 12,25 . The line at the top of the sector (vertex) has the highest yield for the environment in that sector. One of the important features of this biplot was the clustering of environments, which suggests the possibility of different mega-environments 7 . Our results showed that during 2016-2019, the environments fell into different clusters with varied environmental clustering patterns. The rst two PCs accounted for 67.84% of the total variability attributable to the effects of genotype (G), environment (E), and their interactions (GEIs) over 4 years (Fig. 7). Based on 4 years of average data, we found four mega-environments with different vertex lines. This suggests a speci c adaptation of the line to the mega-environment and positive exploitation of GEIs 12,25 . Our ndings revealed that some of the new lines were better adapted to different environments on the Java island (stable) than other lines and check varieties. Based on our results, both numerical (parametric and non-parametric) and graphical (AMMI and GGE biplot) measurements resulted the same pattern for the identi cation of stable soybean lines.  Table 2. The eld experiment used a randomized completed block design which was repeated four times. Each line was planted on a plot 3 x 5 m 2 with a spacing of 20 cm x 30 cm.

Grain yield measures
A total of 13 black soybean lines were observed for their grain yield following the standard Descriptor for Soyabean 29 . Grain yield was harvested when the pods are brown. The grain yield data per plot was converted to ton/ ha.

Statistical Analysis
The combined ANOVA statistical model to estimate GEIs follows the equation Y opqr = µ + G o + E p + GE op + R q(p) + B r(q) + ε opqr where Y opqr is the value of line o in plot r, and the value in location p of each replication q; µ is the grand mean; G o is the effect of line o; E p is the effect of the environment p; GE op is the effect of genotype by environment interactions on line o and environment p; R q(p) is the effect of replicate q on location p; B r(q) is the effect of replication q on plot r; and ε opqr is the error effects from line o in plot r and repeat q of location p, respectively. To calculate the combined ANOVA used Genstat 12th .
Grain yield stability was estimated using parametric and non-parametric measurements (Table 6). To calculate grain yield stability based on parametric and non-parametric measurements used STABILITYSOFT 30 .
Identi cation of stable lines among the selected genotypes was conducted using AMMI following the study of 31 : where: Yijk is the yield in location j from line i of replication k, μ is the average of grand yield, G i is the in uence of line i, Ej is the in uence of the location j, λ k is the value of primer component k, α ik and γ jk were the vector score for the line i and location j to component k, ρijr is a mistake from line i and location j While ASV was estimated following the study of 32 : ss IPCA1, ss IPCA2 were the sum of square in IPCA 1 and 2, which shows the score of the main component because of the high contribution in genotype by location interactions. IPCA1 and IPCA2 were the rst and second from IPCA scores for each genotype from the AMMI analysis.