Correlation analysis of lodging score and related traits
Box plots of the phenotypic data for each trait (Fig. 1) indicate minimal variation across years. The mean value of each trait was therefore used in the correlation analysis (Table 1). In the GB population, lodging score showed highly significant positive correlations with six traits, with coefficients ranging from 0.457 to 0.783: plant height (0.783, the highest), number of main stem nodes (0.749), flowering time (0.698), internode length (0.564), maturity time (0.483), and stem diameter (0.457). In contrast, lodging score was not significantly correlated with grain weight per plant (0.050). These correlations between soybean lodging and other agronomic traits can serve as a reference when selecting high-yield varieties in field-based breeding programs.
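The calculation behind a coefficient such as those in Table 1 can be sketched as follows. This is a minimal illustration only: it assumes each line's values are averaged across environments and then compared with Pearson's r, which the text does not state explicitly. The function names and all data values are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def line_means(values_by_env):
    """Average each line's value across environments (one row per environment)."""
    return [sum(col) / len(col) for col in zip(*values_by_env)]

# Hypothetical data: 2 environments x 5 lines (NOT taken from the study).
lodging = [[1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 2.5, 2.5, 4.5, 4.0]]
height = [[50.0, 60.0, 65.0, 80.0, 90.0], [52.0, 58.0, 70.0, 78.0, 95.0]]

r = pearson_r(line_means(lodging), line_means(height))
print(f"r = {r:.3f}")
```

In practice a statistics package would also report the p-value used to judge significance; the sketch shows only the coefficient itself.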
Descriptive statistical analysis of soybean lodging related traits and estimates of broad-sense heritability
Descriptive statistical analysis was conducted on the phenotypic data (Table 2). Both parental lines exhibited mild lodging scores, whereas the RIL population displayed wide variation among families, with coefficients of variation ranging from 28.44% to 51.94%. This wide variation encompasses extreme phenotypes, providing a solid foundation for QTL mapping of lodging. Similar patterns were observed for the other lodging-related traits in the RIL population, indicating parental segregation and laying the groundwork for QTL mapping. The coefficients of variation for these traits were as follows: 11.34% to 13.00% for flowering time, 5.99% to 7.00% for maturity time, 20.20% to 28.51% for plant height, 13.51% to 21.35% for the number of main stem nodes, 10.80% to 14.45% for stem diameter, 13.31% to 17.17% for internode length, and 18.25% to 24.42% for grain weight per plant.
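For reference, a coefficient of variation like those reported above is the standard deviation expressed as a percentage of the mean. The sketch below assumes the sample standard deviation is used, which the text does not specify. The scores are invented for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    return stdev(values) / m * 100.0

# Hypothetical lodging scores for five RILs in one environment (NOT study data).
scores = [1.0, 1.5, 2.0, 3.0, 4.5]
cv = coefficient_of_variation(scores)
print(f"CV = {cv:.2f}%")
```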
Several traits, including lodging score, flowering time, maturity time, and plant height, had absolute kurtosis and skewness values greater than 1, suggesting that these traits are affected by numerous factors. Their segregation in the RILs is governed by multiple genes, consistent with quantitative inheritance. Conversely, the absolute values of kurtosis and skewness for the number of main stem nodes, stem diameter, internode length, and grain weight per plant were all less than 1, indicating that these traits follow a normal or approximately normal distribution, as expected for quantitative traits. In addition, the frequency distributions (Fig. 2) show continuous variation in lodging score and its related traits. Taken together, these results indicate that lodging score and its associated traits in the GB population follow a normal or partially normal distribution, consistent with the characteristics of a RIL population, and can be classified as quantitative traits.
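The "absolute value less than 1" rule of thumb used above can be illustrated with a short sketch. It assumes moment-based skewness and excess kurtosis; the estimator actually used in the study is not stated, and other software may apply bias corrections that give slightly different values. The function names and the threshold parameter are hypothetical.

```python
def skewness_and_excess_kurtosis(values):
    """Moment-based skewness (m3 / m2^1.5) and excess kurtosis (m4 / m2^2 - 3)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("all values are identical; moments are undefined")
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

def approximately_normal(values, threshold=1.0):
    """Rule of thumb: |skewness| and |excess kurtosis| both below the threshold."""
    skew, kurt = skewness_and_excess_kurtosis(values)
    return abs(skew) < threshold and abs(kurt) < threshold

# Hypothetical right-skewed sample (NOT study data).
sample = [1.0, 1.0, 1.0, 1.0, 10.0]
skew, kurt = skewness_and_excess_kurtosis(sample)
print(f"skewness = {skew:.2f}, excess kurtosis = {kurt:.2f}")
print("approximately normal:", approximately_normal(sample))
```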
ANOVA of the GB RIL population across five natural environments showed significant effects of genotype, environment, and genotype-by-environment interaction on lodging score and related traits (Table 3). Lodging score had a high heritability estimate (h²) of 93.14%, indicating that the lodging phenotype in soybean is primarily determined by genotype.
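The text does not state how heritability was estimated. One commonly used entry-mean estimator for multi-environment trials combines the genotypic, genotype-by-environment, and residual variance components from ANOVA. The sketch below shows that estimator under this assumption. The function name, parameter names, and variance components are hypothetical and are not taken from Table 3.

```python
def broad_sense_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Entry-mean heritability: var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep)).

    var_g  : genotypic variance
    var_ge : genotype-by-environment interaction variance
    var_e  : residual (error) variance
    n_env  : number of environments
    n_rep  : number of replicates per environment
    """
    if n_env <= 0 or n_rep <= 0:
        raise ValueError("n_env and n_rep must be positive")
    phenotypic = var_g + var_ge / n_env + var_e / (n_env * n_rep)
    if phenotypic <= 0:
        raise ValueError("phenotypic variance must be positive")
    return var_g / phenotypic

# Hypothetical variance components (NOT from the study).
h2 = broad_sense_heritability(var_g=8.0, var_ge=5.0, var_e=6.0, n_env=5, n_rep=3)
print(f"heritability = {h2 * 100:.2f}%")
```

Under this estimator, heritability rises as the genotypic variance comes to dominate the interaction and error terms, which matches the interpretation given in the text.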
Identification of QTLs for lodging score and related traits
In total, 84 QTLs were identified across six environments, each explaining 1.26% to 66.87% of the phenotypic variation. Of these, 20, 11, 11, 12, 9, 10, 6, and 5 QTLs were detected for lodging score, flowering time, maturity time, plant height, number of main stem nodes, stem diameter, internode length, and grain weight per plant, respectively (Fig. 3; Table S2). All of these QTLs had LOD values exceeding 2.5. A major and stable QTL for lodging score, named qLD-4-1, was identified within the physical interval 3513907-5769624 bp on chromosome 4, between markers bin15 and bin39. This QTL was detected in all six environments and explained 15.38% to 38.68% of the phenotypic variation, with LOD values ranging from 10.36 to 34.70. In addition, nine of the ten primary QTLs for the related traits (qFT-4, qMT-4-1, qMT-4-2, qPH-4, qPH-19-2, qNMSN-4, qSD-4-1, qSD-4-2, qIL-4-1, qIL-4-2) were located within the physical region of qLD-4-1 (Table 4). These QTLs explained 55.93% to 66.87% of the phenotypic variation for flowering time, 31.58% to 48.70% for maturity time, 41.41% to 51.32% for plant height, 20.46% to 48.39% for the number of main stem nodes, 12.10% to 29.12% for stem diameter, and 13.60% to 30.07% for internode length. These stable QTLs provide a basis for identifying genes that regulate soybean lodging and related agronomic traits.
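Whether a QTL for another trait falls within the qLD-4-1 region comes down to an interval comparison on the same chromosome. The sketch below uses the qLD-4-1 coordinates given in the text. The other two intervals and their names are hypothetical placeholders, since the positions of the related-trait QTLs are reported only in Table 4.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    chrom: int
    start: int  # bp, inclusive
    end: int    # bp, inclusive

def overlaps(a, b):
    """True if two intervals lie on the same chromosome and share at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end

# Coordinates of qLD-4-1 as stated in the text.
QLD_4_1 = Interval(chrom=4, start=3_513_907, end=5_769_624)

# Hypothetical intervals for illustration only (NOT the positions in Table 4).
other_qtls = {
    "qExample-4": Interval(chrom=4, start=4_000_000, end=4_500_000),
    "qExample-19": Interval(chrom=19, start=4_000_000, end=4_500_000),
}

colocated = sorted(name for name, iv in other_qtls.items() if overlaps(iv, QLD_4_1))
print("co-located with qLD-4-1:", colocated)
print("qLD-4-1 span (bp):", QLD_4_1.end - QLD_4_1.start)
```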
Candidate gene prediction within stable and major QTL interval
To identify candidate genes for lodging within the qLD-4-1 region, the 271 gene models located in this interval were retrieved. Subsequently, 225 of the genes in qLD-4-1 were GO-annotated and categorized by GO enrichment analysis into cellular component, biological process, and molecular function terms (Fig. 4). Most genes within qLD-4-1 were associated with terms such as regulation of DNA-templated transcription, plasma membrane, chloroplast, membrane, and ATP binding. To narrow down the list of candidate genes, gene expression was compared across soybean tissues (Fig. 5), focusing on three tissues: shoot apex, hypocotyl, and stem. Through GO enrichment analysis, followed by expression screening and functional annotation, a total of 13 candidate genes were identified (Table 5) as potentially involved in processes governing soybean lodging. In the high-quality resequencing data, seven of the 13 genes exhibited structural variation between the parental lines of the RIL population (Guizao 1 and B13). These genes are Glyma.04g050200, Glyma.04g050800, Glyma.04g051300, Glyma.04g052100, Glyma.04g053600, Glyma.04g056200, and Glyma.04g063800 (Table 6).
Expression analysis of candidate genes
The expression levels of the candidate genes were analyzed in root and stem tissues of the two parental lines. qRT-PCR analysis showed that the genes were differentially expressed in the stems and leaves of the two parents (Fig. 6). Among these genes, Glyma.04g051300, Glyma.04g056200, and Glyma.04g063800 showed highly significant differences in expression between Guizao 1 and B13 in both root and stem tissues. These findings suggest that these three genes are the primary candidates for regulating soybean lodging.
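The text does not say how relative expression was derived from the qRT-PCR data. A widely used approach is the 2^-ΔΔCt method, sketched below on that assumption. The function name, the choice of one parent as the calibrator, and all Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method.

    dCt  = Ct(target gene) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(control)
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)

# Hypothetical Ct values (NOT study data): one parent is treated as the
# sample and the other as the control (calibrator).
fold_change = relative_expression(
    ct_target_sample=24.0, ct_ref_sample=18.0,
    ct_target_control=26.0, ct_ref_control=18.0,
)
print(f"fold change = {fold_change:.1f}")
```

A fold change above 1 means higher expression in the sample than in the calibrator. Significance would still need to be tested across biological replicates.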