SNP genetic map constructed based on A-RIL populations
Soybean seed oil content was an important quantitative trait, construction of genetic maps and QTLs localization have been applied in molecular MAS program with the development of molecular markers. In many previous studies, genetic maps of restriction fragment length polymorphism (RFLP) [29], amplified fragment length polymorphisms (AFLP), and/or simple sequence repeat (SSR) etc. markers [7, 30-31] were used to map target traits. However, the coverage of these genetic maps was low, which will reduce the accuracy of QTL mapping. With the development of genome sequencing technology, millions of single nucleotide polymorphisms (SNPs) and insertion/deletions were identified. Choi et al. (2007)[32] discovered a total of 5,551 SNPs markers by the resequencing of sequence-tagged sites (STSs) developed from expressed sequence tag (EST) sequence using three mapping populations including the Minsoy × Noir 1, Minsoy × Archer as well as the Evans × PI 209332. Hyten et al. (2010)[33] established a high-throughput soybean QTL mapping with 1,536 SNP markers using the same populations as Choi et al. (2007)[32]. Wang et al. (2016) constructed two linkage maps containing 5,728 and 4,354 bins based on 89,680 and 80,995 SNP markers, spanning a total genetic distance of 2,204.6 cM and 2,136.7 cM, with an average distance of 0.4 and 0.5 cM between neighboring bins in NJRINP and NJRI4P, respectively. Yao et al. (2020)[34] constructed a high-density genetic linkage map comprising 11,398 SNP markers on 20 linkage groups (LGs) using the specific-locus amplified fragment sequencing (SLAF-seq) method, a total of 181 recombinant inbred lines (RILs) derived from a cross between wild soybean ZYD00463 and cultivated WDD01514 were genotyped. These SNP markers encompassed 2913.78 cM of the soybean genome, with a mean distance of 0.26 cM between markers. Abundant molecular markers facilitate genome-wide analysis of soybean [35, 36], and the high-density occurrence of SNPs in the genome was more suitable for QTL localization. In this study, a 2,212 SNP markers map distributing on 20 chromosomes was established, with a length of 5718.01 cM and an average distance of 2.61 cM between adjacent markers based on A-RIL populations (RIL3613 and RIL6013) with common parent, which contributed to increased genetic diversity and identify epistatic interactions of QTLs between multiple genetic backgrounds [37]. In addition, multi-populations broken through the limitation of less recombination events in single population evaluation, using the same SNP marker could improve the accuracy of the identified QTLs.
The additive QTLs variation
Eight SNP sites within the QTL physical interval located in the SSR map and nineteen additive common QTLs were detected by two methods and in multiple backgrounds (Figure 2, Table 4 and Supplementary Table S3). Comparison of the previous studies with Soybase Datadase, 19 of 23 QTLs for the oil content had been mapped overlap in soybean with previous researches [7-9, 14, 29, 37-51] (Table 4 and Supplementary Figure S2). Among them, the same QTL that was located simultaneously in two RIL populations, qOil-5-1 (marker interval, Gm05_4062384- Gm05_5158657) showed similar position the consensus QTL detected by Wang et al (2012)[49] in cross SD02-4-59 × A02-381100. This further confirmed the accuracy of QTLs identification in this study. The rest four QTLs (qOil-13-1, qOil-13-4, qOil-13-5 and qOil-17-2) are new found.
Genetic basement of epistasis QTL and interactions with environments
Epistasis, or the interaction of a pair of loci may play an important role in the genetic of complex quantitative traits [18, 52]. If epistasis is ignored, many individual loci could not be detected, which will weaken the detection power of QTL[53]. In present study, PVE of A and AE of 64 additive QTL ranged from 1.29% to 10.75% and 0.86% to 10.66%, respectively, while those of AA and AAE effect of 12 pairs of epistatic QTL ranged from 1.61% to 3.31% (Figure 3 and Table 5). By the comparison, AA and AAE effect were less than A and AE effect, this is similar to the study of Teng et al. (2017)[22]. Epistasis may also play an essential role in trait improvement even if epistatic variance components are low [17, 54]. In this research, two individual QTL of two pairs of epistasis effects QTLs were constructed by the intervals from the same chromosome 9 and 15, respectively, including Gm09_19759328-Gm09_19958039 interacted with Gm09_683799-Gm09_1859079, Gm15_8908864-Gm15_9850704 interacted with Gm15_2753022-Gm15_29807975, and other pairs epistasis effects QTLs were mapped on different chromosomes. One QTL also could interact with multiple QTLs. In the RIL6013 population, six QTLs interacted with more than one QTL, Gm01_41350513-Gm01_42850670, Gm09_585590-Gm09_1201285, Gm09_683799-Gm09_1859079, Gm11_18740411-Gm11_34453671, Gm12_7885195-Gm12_20168509, Gm15_8908864-Gm15_9850704 interacted with each other or other QTLs. Twelve pairs of epistatic QTLs included 14 sites, among which 11 sites contained, overlapped or crossed regions with the previous QTLs (Supplementary Figure S3) [9, 10, 29, 37-39, 47-49, 51, 55-59]. It also further illustrates the stability of these locus, and the results were similar to Yang et al. (2013)[53]. The other three sites (Gm01_4091303-Gm01_5085864, Gm01_41350513-Gm01_42850670 and Gm18_47550661-Gm18_48024037) have not been reported and can be considered as the newly discovered QTLs in this study.
Boer et al. (2002)[60] believed that the QTL interactions with environments could be regarded as a specific expression of QTL caused by year, location, temperature and other factors. Therefore, multi-environment joint analysis can be used to identify the stability of QTL and estimate the AAE effect. In this study, the PVE value by AAE effects ranged from 0.09 to 1.26% (Figure 3 and Table 5), only a pair AAE effects QTL (Gm01_4091303-Gm01_5085864~Gm19_41753854-Gm19_42163159), the PVE value of AAE effects was less than the PVE value of AA effects in the RIL3613 population, meaning that the environment played an important role, while the impact of environment were weak in remaining 11 pairs QTLs. The results showed that AA and AAE effects were important genetic effects in QTL localization [12, 61].
Relationship between additive effect QTLs and epistatic QTLs
In this study, the additive effect QTL also had epistasis effects [53, 62, 63], which should be carefully analyzed and evaluated in MAS to improve the seed oil content in soybean. Two additive effect QTLs qOil-9-2 (marker interval, Gm09_585590-Gm09_1201285) and qOil-15-1 (marker interval, Gm15_8908864-Gm15_9850704) showed interactions with other two QTLs (Table 4 and Table 5), such as Gm09_585590-Gm09_1201285 interacted with two QTLs Gm01_41350513-Gm01_42850670 and Gm11_18740411-Gm11_34453671, Gm15_8908864-Gm15_9850704 interacted with Gm01_41350513-Gm01_42850670, Gm09_19759328-Gm09_19958039, Gm11_18740411-Gm11_34453671, and Gm12_20168509-Gm12_7885195 interacted with Gm15_2753022-Gm15_29807975. The results showed that, should not only consider the effect of one site but also the interaction of multiple sites, not only consider the main effect of QTL, but also the additive effect, AA effects and QTL interactions with environments in the application of MAS breeding.
Oil content candidate genes prediction
We identified 3 genes that may be involved in oil anabolism through 14 additive QTLs (qOil-1-1, qOil-1-2, qOil-2-3, qOil-5-1, qOil-9-2, qOil-13-1, qOil-13-4, qOil-15-1, qOil-15-2, qOil-15-3, qOil-15-4, qOil-15-5, qOil-17-2 and qOil-19-5) with the physical marker distance of less than 1 Mb which based on pathway annotations (Table 6 and Supplementary Table S3, bold text). Glyma.05G049500.1 was predicted to lysophospholipid acyltransferase, which was an important enzyme participating in membrane and storage lipid biosynthesis and participated in glycerolipid and glycerophospholipid metabolism pathway [64]. Glyma.05G053300.1 was predicted to plant 4,4-dimethylsterol C-4alpha-methyl-monooxygenase, which involved in the steroid biosynthesis pathway [65], we believe that it indirectly affects the biosynthesis of fatty acids. Glyma.17G067400.1 was predicted to diacylglycerol kinase (ATP), which is key signaling enzymes for phosphorylated diacylglycerin to produce phosphatidylic acid, and participates in the glycerophospholipid metabolism pathway, it also affect the formation of fatty acids[66]. In addition, We identified 2 genes that may be involved in oil anabolism through 7 epistatic QTLs (marker interval, Gm01_4091303-Gm01_5085864, Gm09_585590-Gm09_1201285, Gm09_19759328-Gm09_19958039, Gm10_42150935-Gm10_42851788, Gm15_8908864-Gm15_9850704, Gm18_47550661-Gm18_48024037, Gm19_41753854-Gm19_42163159) with the physical marker distance of less than 1 Mb which based on pathway annotations (Table 5 and 6). Glyma.18G200300.1 was predicted to alcohol dehydrogenase, it is the key enzyme in alcohol fermentation, which is a flooding-response specific soybean gene expressed in root tissue, and affects soybean yield [67]. Glyma.10G189900.1 was predicted to peroxygenase, Hanano et al. (2006)[68] study showed that peroxygenase had been purified from oat microsomes and lipid droplets, whose pathway also constitutes one branch of the so-called lipoxygenase pathway, which catalyzed the oxidation of unsaturated fatty acids (c18:2, c18:3, or c16:3) to produce corresponding fatty acid hydroperoxides. Further tests are needed to determine which of the 5 genes have a significant effect on soybean oil content, and candidate genes are targeted for further study.