A quantitative genomic map of QTLs controlling seed yield, seed components, hormone level and disease related traits
A total of 4555 QTLs for 128 agronomic and disease-related traits, identified in 79 populations of three ecotypes grown in 12 countries (Additional file 3: Table S1), were gathered and combined into a single map (Fig. 1). A total distance of 978.4 Mb on the physical map of Darmor-bzh was covered (Fig. 2A, Additional file 1: Figure S1, Additional file 4: Table S2). Of these QTLs, 2695 were located on the A genome and 1860 on the C genome. Chromosomes A3 and C3 contained the highest numbers of QTLs (430 and 399, respectively), whereas A10 and C4 had the longest coverage distances (159.33 Mb and 122.42 Mb, respectively). Chromosomes A9, A3 and C3 carried the highest numbers of traits (80, 71 and 69, respectively) (Fig. 2A). Thus, most QTLs for seed components, seed yield, hormone level and disease-related traits were found on the A genome rather than on the C genome.
It is crucial to locate the regions of the genome where the most traits overlap. The 128 traits were therefore subdivided into five categories: abiotic factor (A), biotic factor (B), hormone level (H), seed components (S) and yield-related traits (Y). The total numbers of QTLs per category were 349 (A), 334 (B), 42 (H), 1392 (S) and 2438 (Y). Each region of the Darmor-bzh genome was examined to detect regions where QTLs of more than one category overlapped, i.e. regions in which two, three, four or five categories of traits were present simultaneously. A total of 517 regions hosting overlapping QTLs were observed (Fig. 2B, Additional file 1: Figure S1). The regions of overlapping QTLs on each chromosome, the numbers of QTLs and the categories of traits are summarized in Additional file 5: Table S3. First, eight regions harbored all five categories of traits (A, B, H, S and Y) (Additional file 6: Table S4). These eight regions were located on six chromosomes: one region on each of A1 (1.71–1.71 Mb, with 40 QTLs), A2 (2.31–2.31 Mb, with 20 QTLs), A10 (11.78–11.87 Mb, with 14 QTLs) and C3 (5.09–5.33 Mb, with 11 QTLs), and two regions on each of A6 (21.68–21.95 Mb with 15 QTLs and 22–22.30 Mb with 13 QTLs) and A9 (8.12–9.87 Mb and 20.76–22.51 Mb, with 34 QTLs each). Second, 107 regions containing four categories of traits were found on all 19 chromosomes. The numbers of regions per chromosome were 11 on C3, nine on each of A6 and A9, eight on A7, seven on each of A2 and A8, six on each of A6, A10 and C4, five on each of C1 and C2, four on each of A1, A3, A4 and C9, three on each of C5 and C8, and one on C6. For example, 28 QTLs of four categories of traits (1A, 12B, 5S, 10Y) overlapped on A2 (1.49–2.31 Mb).
Note that the region on A2 (1.71–22.04 Mb) included 288 overlapping QTLs (12A, 22B, 63S, 191Y) and was the richest region of overlapping QTLs in the B. napus genome. Third, 225 regions on all 19 chromosomes had overlapping QTLs involving three categories of traits: 22 on C3, 20 on C4, 18 on A7, 16 on C5, 15 on C9, 14 on each of A6, C2 and C8, 13 on each of C1 and C7, 11 on A8, ten on A3, nine on A9, eight on A5, seven on each of A2 and C6, six on each of A4 and A10, and two on A1. For instance, in a region of A5 (3.49–5.29 Mb), 40 QTLs of three categories of traits (5B, 12S, 23Y) overlapped. Fourth, 177 regions contained overlapping QTLs involving two categories of traits: 20 on C4, 17 on C5, 16 on C9, 15 on C8, 14 on each of C6 and C7, 13 on A7, 11 on C1, ten on each of C2 and C3, seven on each of A6 and A8, five on each of A4 and A5, four on A3, three on each of A1 and A9, two on A10, and one on A2. For example, 13 QTLs of two categories of traits (10B, 3Y) overlapped on C4 (20.66–20.70 Mb).
Note that some QTLs may overlap several times with other QTLs in different regions because of their extended length. For example, a QTL for C16:0 located on A1 (2.25–19.86 Mb) could overlap twice, with QTLs in a region involving five categories (1.71–1.71 Mb) and in a region involving four categories (1.71–22.04 Mb). The S and Y categories were the most abundant and the most frequently overlapping: they were found in 403 of the 517 regions of overlapping QTLs detected in this study. The H category was rarely found in overlapping regions because few QTLs for hormone level have been published so far (42 QTLs); it occurred in 39 of the 517 regions. Regions of overlapping QTLs that involved a single environment or a single population were also examined. No region was specific to a single population. Regions specific to a single environment were found only for China, in 11 areas of the genome: four areas on C3 (36.94–37.27 Mb, 37.27–38.94 Mb, 39.94–40.21 Mb and 41.40–46.52 Mb), two areas on A8 (16.87–17.37 Mb and 17.37–18.00 Mb), and one area on each of A7 (17.48–18.48 Mb), A10 (0.14–1.64 Mb), C4 (42.73–44.22 Mb), C6 (8.43–9.43 Mb) and C7 (24.87–25.45 Mb) (Additional file 5: Table S3). For instance, the region on A8 (16.87–17.37 Mb) had four QTLs (3S, 1Y), all of which were detected in Chinese field experiments. In summary, the rapeseed genome was finely dissected to reveal regions that harbor multiple traits simultaneously. Coupling these findings with the identification of the genes located within those regions would help to understand the influence of those genes on those traits.
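The search for regions where QTLs of several categories overlap can be illustrated with a short sketch. It is not the procedure used in this study, only one plausible way to do it: the chromosome is cut at every QTL boundary, and each resulting segment is kept if the QTLs spanning it belong to at least two categories. The function name `overlap_regions` and the toy coordinates are hypothetical, and zero-length regions (such as 1.71–1.71 Mb) are not handled by this simplified version.

```python
from collections import Counter


def overlap_regions(qtls):
    """Report segments of one chromosome covered by QTLs of >= 2 categories.

    qtls: list of (start_mb, end_mb, category), category being one of "ABHSY".
    Returns a list of (start_mb, end_mb, Counter of categories in the segment).
    """
    # Every QTL boundary is a potential breakpoint between segments.
    points = sorted({p for start, end, _ in qtls for p in (start, end)})
    regions = []
    for left, right in zip(points, points[1:]):
        # A QTL contributes to a segment only if it spans the whole segment.
        cats = Counter(c for s, e, c in qtls if s <= left and e >= right)
        if len(cats) >= 2:
            regions.append((left, right, cats))
    return regions


# Hypothetical QTLs on a single chromosome (positions in Mb), not study data.
toy = [
    (1.0, 5.0, "S"),
    (2.0, 6.0, "Y"),
    (4.0, 8.0, "B"),
    (9.0, 10.0, "A"),
]

for start, end, cats in overlap_regions(toy):
    n_qtls = sum(cats.values())
    print(f"{start}-{end} Mb: {len(cats)} categories, {n_qtls} QTLs")
```

In this toy example the segment 4.0–5.0 Mb is covered by three categories (B, S and Y), while 2.0–4.0 Mb and 5.0–6.0 Mb are each covered by two.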
Candidate genes identified within regions of overlapping QTLs, and their interaction network
In total, 3181 genes associated with oil biosynthesis, yield and disease-related traits were aligned to the physical map of Darmor-bzh, and 2744 candidate genes were found within overlapping QTLs of two to five categories of traits (Fig. 3A, Additional file 7: Table S5). Of these, 26 (1%), 729 (27%), 1289 (47%) and 700 (25%) were found for five, four, three and two categories of traits, respectively (Fig. 3B).
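As a quick arithmetic check, the share of each class can be recomputed from the gene counts itemized per chromosome in this section (26, 729, 1289 and 700 genes for five, four, three and two categories). The variable names below are illustrative only.

```python
# Candidate-gene counts keyed by the number of overlapping trait categories,
# taken from the per-chromosome breakdown reported in this section.
genes_by_categories = {5: 26, 4: 729, 3: 1289, 2: 700}

total = sum(genes_by_categories.values())

# Share of each class, in percent, rounded to one decimal place.
shares = {k: round(100 * n / total, 1) for k, n in genes_by_categories.items()}

print(total)
print(shares)
```

The four classes add up to the 2744 candidate genes, and the three-category class is the largest.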
Eight regions of overlapping QTLs of five categories of traits (A, B, H, S, Y) were found on six chromosomes (A1, A2, A6, A9, A10 and C3). A total of 26 candidate genes were found on three of those six chromosomes: seven genes on A6 (four at 21.68–21.95 Mb and three at 22–22.30 Mb), 18 genes on A9 (six at 8.12–9.87 Mb and 12 at 20.76–22.51 Mb) and one gene on C3 (5.09–5.33 Mb). For example, the three candidates found on A6 (22–22.30 Mb) were DHLAT-BnaA06g33300D, RLK-BnaA06g33320D and AAPPT-BnaA06g33540D. Meanwhile, 729 candidate genes were found within overlapping QTLs of four categories of traits on all 19 chromosomes, distributed as follows: 129 (A1), 120 (A9), 71 (A8), 58 (A10), 46 (C3), 44 (A5), 43 (A3), 38 (A6), 37 (A2), 28 (A7 and C1, each), 22 (A4), 15 (C5), 12 (C9), 11 (C4 and C7, each), 10 (C2), four (C6) and two (C8). For example, three candidate genes (CCT-BnaA03g14860D, RLK-BnaA03g15210D and KAT2-BnaA03g15290D) on A3 (6.84–7.12 Mb) were found within 19 overlapping QTLs (2B, 1H, 5S, 11Y). Moreover, 1289 candidate genes were located within overlapping QTLs of three categories of traits, on all 19 chromosomes: 169 (C3), 129 (A3), 121 (A2), 104 (C2), 77 (A5), 74 (C4), 70 (C8), 66 (A6 and C6, each), 61 (A7), 56 (A4), 53 (C1), 48 (C9), 47 (A9), 44 (C7), 40 (A10), 37 (C5), 24 (A8) and three (A1). For instance, two candidate genes (ADC2-BnaC01g03710D and FAE-BnaC01g04130D) were found on C1 (1.93–2.16 Mb) within 13 overlapping QTLs (4B, 1S, 8Y). Finally, overlapping QTLs of two categories of traits contained 700 candidate genes on 18 chromosomes (all except A2): 110 (C9), 75 (C5), 61 (C4), 59 (C7), 51 (A7), 48 (C2), 42 (C3), 41 (C1), 40 (C6), 31 (A3), 29 (A6), 28 (C8), 21 (A5), 19 (A8), 16 (A4), 15 (A10), nine (A1) and five (A9). For example, two candidate genes (RLK-BnaC07g13860D and RN-BnaC07g14020D) were found on C7 (19.60–19.79 Mb) within two overlapping QTLs (1A, 1B).
Taken together, these findings identified important genes located within regions of overlapping QTLs for multiple traits. Such genes might influence more than one category of traits, and they could be selected according to the desired improvement of two or more traits.
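Placing candidate genes within regions of overlapping QTLs amounts to an interval-containment test on the physical map. The sketch below is a minimal illustration under stated assumptions: the two region coordinates are taken from the text, whereas the gene identifiers and gene positions are hypothetical, and assigning a gene to the highest category count among the regions that contain it is an assumption rather than the rule used in the study.

```python
def genes_in_regions(genes, regions):
    """Assign each gene to the overlapping-QTL regions that contain it.

    genes:   list of (gene_id, chromosome, start_mb, end_mb)
    regions: list of (chromosome, start_mb, end_mb, n_categories)
    Returns {gene_id: highest n_categories among regions containing the gene}.
    Genes that fall in no region are left out of the result.
    """
    hits = {}
    for gene_id, chrom, g_start, g_end in genes:
        for r_chrom, r_start, r_end, n_cat in regions:
            inside = chrom == r_chrom and r_start <= g_start and g_end <= r_end
            if inside:
                hits[gene_id] = max(n_cat, hits.get(gene_id, 0))
    return hits


# Region coordinates as reported in the text (Mb) with their category counts.
regions = [
    ("A6", 22.00, 22.30, 5),
    ("A3", 6.84, 7.12, 4),
]

# Hypothetical genes and positions, for illustration only.
genes = [
    ("gene_x", "A6", 22.10, 22.11),
    ("gene_y", "A3", 7.00, 7.01),
    ("gene_z", "A3", 9.00, 9.01),
]

assigned = genes_in_regions(genes, regions)
print(assigned)
```

Here `gene_z` lies outside both regions and is therefore not reported as a candidate.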
Interaction network analysis of the 2744 candidate genes was performed with their 1555 orthologous genes in A. thaliana because B. napus is not available in the STRING database. Gene ontology (GO) analysis indicated that the 1555 genes could be classified into 16 categories according to the PANTHER GO-slim biological process classification (Additional file 8: Table S6): cellular process (GO:0009987), biological phase (GO:0044848), reproductive process (GO:0022414), multi-organism process (GO:0051704), localization (GO:0051179), interspecies interaction between organisms (GO:0044419), reproduction (GO:0000003), biological regulation (GO:0065007), response to stimulus (GO:0050896), signaling (GO:0023052), developmental process (GO:0032502), rhythmic process (GO:0048511), multicellular organismal process (GO:0032501), metabolic process (GO:0008152), growth (GO:0040007) and immune system process (GO:0002376). Genes that did not fit into any of those categories were classified as “Others”.
The interaction network was visualized with Cytoscape and comprised 1271 nodes and 10101 edges (Fig. 4). In this network, 11 genes might be more influential than the others (degree layout, DL ≥ 70): AP2 (DL = 103), FT (DL = 100), AUX1 (DL = 90), KASIII (DL = 89), CO (DL = 80), MCAT (DL = 79), KASI (DL = 78), AGL20 (DL = 74), PHYA (DL = 72), COP1 (DL = 71) and ACP (DL = 70). These genes belonged to the GO categories of metabolic process (KASI and AGL20), multicellular organismal process (CO) and the "Others" category (AP2, FT, AUX1, KASIII, MCAT, PHYA, COP1 and ACP4). The most influential genes had functions related to oil biosynthesis and yield-related traits. ACP, MCAT, KASI and KASIII are all plastidial genes involved in fatty acid biosynthesis: ACP acts as a carrier of acyl intermediates, MCAT catalyzes the synthesis of malonyl-ACP and CoA from malonyl-CoA and ACP, and KASI and KASIII both act on fatty acid elongation. The other seven influential genes were related to yield traits: AGL20 promotes flowering and inflorescence meristem identity in Arabidopsis; AP2 is involved in floral organ specification, floral meristem establishment, and ovule and seed coat development; AUX1 is an auxin transporter that influences lateral root initiation and positioning; CO regulates flowering under long days; COP1 acts as a suppressor of photomorphogenesis and stimulates skotomorphogenesis in the dark; FT, like AGL20, promotes flowering; and PHYA regulates photomorphogenesis. These genes have different functions and are involved in different metabolic pathways, yet they might have a stronger effect on other genes. This suggests that the simultaneous control of multiple categories of traits might be exerted through different metabolic pathways.
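The degree of a node is the number of distinct partners it is connected to, and the most connected genes are those whose degree reaches a chosen threshold (70 in the network above). The following sketch shows the computation on a toy edge list; the node names and edges are hypothetical, and the code does not reproduce Cytoscape's own calculation.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count, for each node, the number of distinct partners it is linked to."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hubs(edges, min_degree):
    """Return (node, degree) pairs with degree >= min_degree, highest first."""
    degrees = node_degrees(edges)
    ranked = sorted(degrees.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(node, deg) for node, deg in ranked if deg >= min_degree]


# Hypothetical undirected interaction edges, for illustration only.
toy_edges = [
    ("g1", "g2"),
    ("g1", "g3"),
    ("g1", "g4"),
    ("g2", "g3"),
    ("g4", "g5"),
]

print(hubs(toy_edges, min_degree=3))
```

With the toy edges, only `g1` (three partners) passes a threshold of 3; on the real network the same filter would be applied with `min_degree=70`.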
Gene expression and structural variation of candidate genes in eight rapeseed accessions
One copy of the B. napus homolog of each of the 11 most influential genes of the interaction network was selected for the gene expression and structural variation analyses. The 26 candidate genes found within QTLs of five categories (A, B, H, S and Y) were also added to the analyses. Thus, a total of 37 genes from the Darmor-bzh genome were searched in eight rapeseed accessions using the BnPIR database [25]: two winter types (Quinta and Tapidor), two spring types (Westar and No2127) and four semi-winter types (ZS11, Zheyou7, Shengli and Gangan) (Additional file 9: Table S7). The tissues used to determine gene expression in the BnPIR database were collected from the eight accessions throughout the flowering process, at five time points after sowing: T0, 24 days; T1, 54 days; T2, 82 days; T3, 115 days; and T4, 147 days. Gene expression is displayed in Additional file 2: Figure S2.
In particular, six genes displayed the most pronounced differences in expression, considering both the difference in FPKM value and the expression profile across the sampling time points: ACP4, AGL20, FBR12, LTA2, PP2A and RLP46. The six homologous genes were located on four chromosomes in Darmor-bzh (A1, A5, A6 and A9), but on five chromosomes (A1, A6, A9, C3 and C4) in the other eight rapeseed varieties (Fig. 5, Additional file 10: Table S8). The expression of these six genes is illustrated in Fig. 6 and nucleotide sequence identity is shown in Additional file 11: Table S9. ACP and AGL20 were among the most influential genes in the interaction network, and their roles are described above. The four remaining genes were located in regions of overlapping QTLs of five categories of traits: FBR12 encodes a translation initiation factor involved in apoptosis in response to infection by Pseudomonas syringae; LTA2 is involved in embryo formation and development; PP2A modulates protein phosphoregulation in several cellular processes, such as growth and development and hormone and environmental responses; and RLP46 is a trans-membrane receptor involved in cellular signaling.
ACP4 expression differed among accessions at T1, with the highest expression in Gangan, followed by No2127, Westar, Zheyou7 and ZS11, whereas expression in Quinta, Shengli and Tapidor was lower; the other stages displayed comparable profiles. Comparison of nucleotide sequences showed that ACP4 shared 100% identity within two groups of accessions: Gangan, Shengli, Westar, Zheyou7 and ZS11 on the one hand, and Quinta and Tapidor on the other. Similarly, expression of FBR12 in Shengli, Tapidor, Quinta and ZS11, which shared 100% identity, showed a high peak at T1, whereas at T3 the expression of Westar FBR12 was slightly elevated compared with the others. Westar FBR12 shared 100% identity with FBR12 of the other genomes, except for Zheyou7 FBR12, with which it shared 92% identity. AGL20 of No2127 displayed notably high expression levels at T2, T3 and T4, while the other accessions showed relatively similar expression. No2127 AGL20 shared 100% identity with AGL20 of Shengli and Zheyou7, 97% identity with Gangan, Westar, Tapidor and ZS11, and 95% identity with Quinta. Shengli LTA2 expression went from the highest at T0 to the lowest at T1 relative to LTA2 of the other genomes. Zheyou7 and ZS11 LTA2 also showed a succession of increases and decreases in expression across the five sampling stages, although they shared only 96% identity, whereas Shengli LTA2 shared 100% identity with Quinta, Tapidor, Westar and ZS11 LTA2, 98% identity with No2127 LTA2 and 96% identity with Gangan and Zheyou7 LTA2. PP2A in Shengli and No2127 displayed alternating increases and decreases in expression, but in opposite directions (i.e. when one increased, the other decreased). Nevertheless, the two shared 100% identity with each other and with PP2A of the other genomes, except for Westar PP2A, which shared 99% identity with them. Westar PP2A expression remained low from T1 to T4.
Finally, RLP46 showed a peak of high expression at T3 in Tapidor and Quinta, whereas in Westar, Zheyou7 and ZS11 it peaked at T2. Tapidor and Quinta RLP46 shared 100% identity with each other and with Gangan, Westar and ZS11, and 99% identity with No2127, Shengli and Zheyou7. It was noticeable that, across these genes, Zheyou7 and ZS11 always had the same expression level, even when the nucleotide sequence identity was less than 100%. These findings suggest that similar gene sequences may display different expression profiles and, conversely, that different gene sequences may show similar expression profiles.
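For readers unfamiliar with the identity percentages quoted above, the following sketch shows one common way to compute percent identity between two sequences that have already been aligned. The definition used here (identical positions divided by aligned columns, counting a gap opposite a base as a mismatch) is an assumption; the text does not state which definition underlies the reported values, and the sequences below are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of aligned columns that carry the same base in both sequences.

    Both inputs must already be aligned, i.e. have the same length, with "-"
    marking gaps. Columns where both sequences have a gap are skipped; a gap
    facing a base counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments differing at the last position.
print(percent_identity("ATGCGTACGT", "ATGCGTACGA"))
```

Two sequences with 100% identity under this definition can still differ in expression, which is the point made by the comparison above.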