Phenotypic evaluations
A continuous and broad range of reactions to inoculation with S. sclerotiorum was observed among the 187 B. napus accessions in this study (Table 1, Fig. 1a-d). The main stems LL of the genotypes at 7 dpi varied from 2.3 to 9.0 cm and LW ranged from 19.3 to 81.4% among the studied environments (Table 1, Fig. 1b-c). In the CARR_19, the main stems LL ranged from 3.0-7.3 cm with the mean of 5.1 cm, whereas LANG_19, CARR_20, and OSN_20 had a range (mean) of 2.8-8.7 cm (5.6 cm), 2.5-7.3 cm (5.0 cm), 2.3-9.0 cm (5.9 cm), respectively (Table 1, Fig. 1b). In the case of stem LW, the CARR_19, LANG_19, CARR_20, and OSN_20 had a range (mean) of 26.9-70.3% (47.7%), 26.7-81.4% (55.1%), 21.9-68.0% (48.1%), and 19.3-78.6% (53.6%), respectively (Table 1, Fig. 1c). The mean LL (5.9 cm) was the highest in (OSN_20), and the lowest (5.0 cm) was recorded in (CARR_20), whereas the highest overall mean LW (55.1%) was observed in (LANG_19) followed by the lowest mean (47.7%) in both (CARR_19) environments (Table 1, Fig. 1b-c). The resulting BLUEs for PM_14D and PM_21D for SSR scores across all environments ranged from 2 to 63% with an average of 33%, and from 16 to 94% with a mean of 66%, respectively (Table 1, Fig. 1d). The diversity of phenotypic responses observed in this study is consistent with observations made by other researchers18–20,22,24 and reinforce the notion that resistance to sclerotinia infections is quantitatively inherited and controlled by multiple genes. The combENV BLUEs of the top five promising source of resistance ranged between 2.6 to 4.2 cm for LL, from 23.7 to 40.6% for LW, and had between 2 to 10% PM_14D, and between 16 to 37% PM_21D. These ranges were smaller than the respective phenotypic responses observed on the resistant checks “Pioneer 45S51” 5.2 cm, 51.9%, 32%, and 66% for LL, LW, PM_14D, and PM_21D, respectively, and “Pioneer 45S56” 5.2 cm, 49.5%, 31%, and 62% for LL, LW, PM_14D, and PM_21D, respectively, and susceptible check “Westar” 6.8 cm, 65%, 63%, and 94% for LL, LW, PM_14D, and PM_21D, respectively (Supplementary Table S1). Therefore, these promising genotypes will serve as a valuable resource to transfer resistance gene into the elite canola cultivars to develop SSR resistant cultivars for the growers. A two-way analysis of variance (ANOVA) indicated that genotype, environment, and their interaction had significant effects (P ≤ 0.001) on both LL and LW for stem resistance. Similar results were obtained from the ANOVA analysis of the combENV sets for PM_14D, and PM_21D with the exception of the interaction between genotype and environment on PM_21D, which was not-significant (P ≤ 0.05) (Supplementary Table S2). The family-based mean heritability of LL ranged between 0.64 and 0.78, while heritability for LW ranged between 0.56 and 0.72 for all environments. Combined across all environments, high broad-sense heritability 0.88 and 0.86 was observed for LL and LW, respectively (Table 1).
Plant phenotypic variables measured on the 187 genotypes in all environments displayed a wide variation. DF values ranged from 37 to 88 days and had a coefficient of variation (CV) ranging from 3.7 to 6.4 (Table 1). IL varied from 7.7 to 20.4 cm with CV between 17.9 to 28.3, whereas SD ranged from 4.0 to 12.9 mm with a CV ranging from 21.9 to 31.2 (Table 1). The frequency distributions of phenotypic values for each trait are presented in the Supplementary Fig. S1.
Correlation among internode length, stem diameter, lesion length, and lesion width
To determine the effects of IL and SD on the reaction of genotypes to SSR stem resistance in respect to stem LL, LW, PM_14D and PM_21D, efforts were made to maintain homogenous plant densities in every row. We found that combENV BLUEs of LL was negatively correlated with SD [r = -0.34, P= < 0.0001] (Fig. 2). Similarly, significant negative correlations were also observed between combENV data set of LW and SD (r = -0.44, P= < 0.0001), LW and PM_14D (r = -0.45, P= < 0.0001), LW and PM_21D (r = -0.44, P= < 0.0001) (Fig. 2). The regression analyses showed that the stem LL and LW were also significantly and negatively associated with SD (R2=0.11, P= 1.7 x 10-6 for LL; R2=0.19, P= 1.5 x 10-10 for LW) (Fig. 3b, Supplementary Fig. S2). However, stem LL and IL had significant positive correlation (r =0.49, P= < 0.0001), and similarly a significant positive correlation was also found to be associated between stem LW and IL (r =0.42, P= < 0.0001), LW and PM_14D (r =0.40, P= < 0.0001), LW and PM_21D (r =0.47, P= < 0.0001) (Fig. 2). The regression analyses between stem LL, stem LW with IL also showed positive correlation (R2=0.24, P= 6.9 x 10-13 for LL, and R2=0.17, P= 1.9 x 10-9 for LW) (Fig. 3a, Supplementary Fig. S2). Regression analyses among the stem LW, PM_14D, PM_21D with IL, and SD were presented in the Supplementary Fig. S2. Interestingly, a highly significant positive correlation was observed among stem LL, LW, PM_14D, and PM_21D for stem resistance across all the studied environments and combENV analyses (Fig. 2). The correlation between stem LL and LW was strong and positive in CARR_19 (r = 0.91), LANG_19 (r = 0.90, P= < 0.0001), CARR_20 (r = 0.91, P= < 0.0001), OSN_20 (r = 0.90, P= < 0.0001), and combined (r = 0.94, P= < 0.0001) environments. Highly significant correlations were also reported between stem LL and PM_14D (r = 0.83, P= < 0.0001), LL and PM_21D (r = 0.75, P= < 0.0001) (Fig. 2). These results suggest that stem LW, PM_14D, and PM_21D could serve as proxies for LL during assessment of stem resistance to S. sclerotiorum in rapeseed/canola.
Correlation between days to flowering and sclerotinia stem rot resistance
The DF was significantly and negatively associated with combENV BLUEs of LL, LW, PM_14D, PM_21D (r = - 0.39, P= < 0.0001 for LL; r = - 0.44, P= < 0.0001 for LW; r = - 0.49, P= < 0.0001 for PM_14D; and r = - 0.59, P= < 0.0001 for PM_21D) (Fig. 2). The regression analyses showed that DF were negatively and significantly associated with stem LL, LW, PM_14D, and PM_21D (R2 = 0.14, P= 4.0 x 10-8 for LL; R2 = 0.19, P=1.7 x 10-10 for LW; R2 = 0.24, P= 5.9 x 10-13 for PM_14D; R2 = 0.34, P= 2.2 x 10-16 for PM_21D) (Fig. 3c, Supplementary Fig. S2). These negative correlation results further confirmed that there is a connection between the DF and SSR resistance in B. napus, indicating that early flowering genotypes tend to be more vulnerable to the S. sclerotiorum attack with increased stem LL, LW, and plant mortality than the late maturing genotypes.
Genotypic Data and Principal Component Analysis
After eliminating markers with missing data greater than 25%, a total of 25,809 polymorphic SNPs with minor allele frequency (MAF) greater than 5% were obtained and employed for association analysis. The highest proportions of SNPs had MAF between 0.10 and 0.15 (20%) and between 0.05 and 0.10 (20%) (Supplementary Fig. S3). The other seven MAF classes represent between 3 to 14% each of the total markers. To scan the population stratification of the association panel, principal component analysis and kinship matrix were performed on the genotypes based on 25,809 SNPs. The first and second PCA accounted for 9.1 and 5.8 % of the variance, respectively. The first 4 PCA accounted for 22% of the variance, and at PC4 the inflection point occurred, so we used four PCs in association mapping which could dominate the population structure. The model-based cluster analysis using the first four PCs suggested that there were 5 subgroups within the genotypes (Fig. 4).
Marker-trait-association (MTA) analysis
Association analyses using the phenotypic data and SNP marker data were conducted for each phenotypic trait for stem resistance (LL, LW and PM) in each year and with the combENV BLUEs data to identify best MTAs. To reduce false positive or false negative associations i.e the chance of committing Type I and Type II errors, three different well known renowned GWAS algorithms i.e FarmCPU, MLM, and GEMMA-MLM, were used to identify the true MTAs. The population structures using four PCA as fixed effect and familial relatedness with kinship matrix as random effects were incorporated in all the implemented GWAS models to control pseudo associations. Therefore, the SNPs detected in any two GWAS models were considered reliable and declared as significant SNP for the studied trait. Significant MTAs were determined on the basis of modified Bonferroni correction by calculating the effective number of independent tests (loci) from the tested 25,809 SNPs by Li and Ji58. The Q-Q plots generated from all GWAS analyses models of the analyzed phenotypic traits showed a sharp deviation from the expected P value distribution in the tail area, indicating that population structure and familial relatedness were well controlled and false positive associations were reduced (Fig 5a-g).
Stem lesion length (LL)
Association analysis was performed using BLUEs of LL of all four environments and combENV BLUEs separately (Fig. 5a-e, Supplementary Fig. S4). A total of 64 significant SNPs corresponding to 62 loci were identified at the level of [−log10 (P) ≥ 3.4; P ≤ 0.0004] by at least two of the GWAS models, and thus were regarded as more reliable. These SNPs were unevenly distributed among the B. napus chromosomes (Fig. 5a-e, Supplementary Fig. S4, Supplementary Table S3, 6). The majority of significant SNPs were detected on chromosomes A01 (5), A03 (5), A09 (6), C03 (8), and C06 (12). Significant SNPs that were present in the LD block on the same chromosome were regarded as single locus. Among these, thirty-eight significant SNPs were detected in two or more environments and in at least two of the GWAS tested models and explained phenotypic variance of the SNPs ranged from 4.5 to 9.9%. Allelic effects of these identified SNPs varied from -0.84 to 0.83% (Supplementary Table S3, 6).
Stem lesion width (LW)
GWAS analyses using LW BLUEs of all environments and combENV detected a total of 70 significant SNPs in 66 loci in at least one of the four environments and combENV datasets (Supplementary Fig. S5, Supplementary Table S4, 6). Out of these 70 significant SNPs, a total of 30 were found in at least two or more environments of two GWAS models out of three GWAS models. The estimated allelic effects of those 70 significant SNPs varied from -6.83 to 7.55. The phenotypic variation accounted for by these SNP markers varied between 4.9 to 12.1% (Supplementary Table S4, 6). Manhattan and Q-Q plots summarizing the analysis of stem LW for SSR resistance by FarmCPU, MLM, and GEMMA-MLM are shown in (Supplementary Fig. S5).
Plant mortality (PM)
CombENV BLUEs value of 14 and 21 dpi plant mortality were used to perform the GWAS analysis. Marker-trait-association analyses identified a total of 21 and 30 significant markers for PM_14D and PM_21D, respectively, which were commonly identified in at least two of the GWAS analysis models (Fig 5f-g, Supplementary Fig. S6, Supplementary Table S5, 6). A total of 11 significant SNP markers were commonly found both in PM_14D and PM_21D (Supplementary Table S5). About 3.6 to 7.7% of the phenotypic variation were explained by these significant SNP markers. The estimated allelic effects were ranged between -11.29 to 9.37 (Supplementary Table S5, 6). The MTAs resulting from FarmCPU, MLM, GEMMA-MLM for PM_14D_BLUP and PM_21D_BLUP were presented in the Manhattan and Q-Q plots (Supplementary Fig. S6).
Candidate Genes
The significant SNPs those were detected in at least two environments (four environments and combENV) were used to search for the candidate genes for sclerotinia stem rot resistance using “ZS11” reference genome sequence47. A total of 69 candidate genes with known functions associated with plant disease resistance mechanisms were identified within ± 50 kb of the respective significant SNPs. A list of these genes, their biological functions based on TAIR 10, Uniport-KB, annotations and corresponding details is provided (Supplementary Table S7). The candidate genes are involved in the biological process of defense response, defense response to fungus, programmed cell death, response to molecule of fungal origin, response to salicylic acid, indole glucosinolate biosynthetic process, induced systemic resistance, response to chitin, jasmonic acid mediated signaling pathway, ethylene-dependent systemic resistance, systemic acquired resistance, camalexin biosynthetic process, pattern recognition receptor signaling pathway, response to wounding, response to nematode, response to oxidative stress, toxin catabolic process, immune response, reactive oxygen species metabolic process, brassinosteroid mediated signaling pathway and other biological processes which might play key role in SSR resistance in rapeseed/canola (Supplementary Table S7).
Genomic Prediction
The prediction accuracies applying genome-wide markers for stem LL, and LW traits varied from 0.15-0.51 and 0.18-0.52, respectively, for four individual environments (Fig. 6a). The LANG_19 environment showed the highest prediction accuracy for both LL (0.51) and LW (0.52), whereas the lowest prediction accuracy was observed 0.15 and 0.18 for the CARR_20 environment (Fig. 6a). However, the average correlation between observed resistance to SSR and predicted SSR by GP model were 0.43, 0.44, 0.49, and 0.63 for combENV LL, LW, PM_14D, and PM_21D, respectively, through the implementation of genome-wide markers (Fig. 6b-c, Supplementary Table S8). The estimated prediction accuracies by applying the GWAS identified significant markers based on the certain P value, resulted different significant marker densities. All SNP markers subset (25,809 SNPs) showed the lowest prediction accuracy before thinning of the markers based on the GWAS based P-value thresholds for LL, LW, PM_14D, and PM_21D traits. Prediction accuracy was increased after implementing the GP model with the selected SNPs from the GWAS-based thresholds. The average prediction accuracies showed an increasing trend as the GWAS-based P-value threshold decreases for combENV LL, LW, and PM _14D traits and their accuracies ranged from 0.55-0.84, 0.68-0.87, and 0.74-0.86, respectively (Fig. 6b-c, Supplementary Table S8). However, in the case of PM_21D trait, the highest average prediction accuracy (0.88) were observed on GWAS threshold [-log10 (P) < 0.001, LOD=3.0] followed by 0.82 [-log10 (P) < 0.01, LOD=2.0], and 0.86 [-log10 (P) < 0.0004, LOD=3.4] (Fig. 6c, Supplementary Table S8). Genomic predictions were stronger when plant mortality data were used rather than the lesion length and width.