Phenotypic variations for Sclerotinia sclerotiorum reactions and correlation among phenotypic traits
SSR disease reactions can be variable under field environments; therefore, the germplasm collection was phenotyped against S. sclerotiorum in the greenhouse under a controlled environment. A continuous and broad range of phenotypic variation was observed for the days to wilt (DW) and lesion phenotype (LP) traits among the genotypes in the study (Fig. 1a-f; Table 1). The BLUP values for DW varied from 3.5 to 9.9 days, with an overall mean of 5.4 days and a standard deviation (SD) of 0.87 (coefficient of variation, 30.9%). The LP scores at 3, 4, and 7 dpi ranged (mean ± SD) from 2.0 to 4.3 (2.8 ± 0.50), 2.6 to 4.8 (3.8 ± 0.45), and 3.7 to 5.0 (4.8 ± 0.16), respectively (Fig. S1). The coefficient of variation (CV) of the LP scores of the association population at the different time points varied from 6.7 to 19.2% (Table 1). Based on the phenotypic data, a few genotypes that performed better than the resistant check cultivars used in this study were identified as promising sources of resistance to SSR at the seedling stage. The BLUP values of the top five promising resistant genotypes ranged from 7.1 to 9.9 for DW, 2.0 to 2.2 for LP_3dpi, 2.6 to 2.8 for LP_4dpi, and 3.7 to 4.4 for LP_7dpi. By comparison, the observed phenotypic responses for DW, LP_3dpi, LP_4dpi, and LP_7dpi were 4.7, 3.3, 4.2, and 4.9, respectively, for the resistant check ‘Pioneer 45S51’ and 5.3, 2.9, 3.8, and 4.9, respectively, for the resistant check ‘Pioneer 45S56’. The susceptible check cultivar ‘Westar’ scored 3.5, 4.3, 4.8, and 5.0 for the same traits (Table 1; Suppl. Table S1). Analysis of variance (ANOVA) for SSR reaction in terms of DW and LP scores on the different days revealed significant differences (P ≤ 0.001) among the genotypes and for the genotype-by-experiment interaction, with the exception of LP at 7 dpi, for which the interaction was non-significant (Suppl. Table S2).
Highly significant correlations were observed among the phenotypic traits for SSR reaction. For instance, significant negative associations were found for DW with LP_3dpi (r = -0.84), LP_4dpi (r = -0.94), and LP_7dpi (r = -0.87) at P ≤ 0.001 (Fig. S2). The estimated broad-sense heritabilities of SSR resistance on an entry-mean basis across the two experiments were 0.71, 0.69, 0.70, and 0.62 for DW, LP_3dpi, LP_4dpi, and LP_7dpi, respectively (Table 1). The medium to high heritability of the SSR resistance traits indicated that the phenotypic data were suitable for further genetic analyses.
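As an illustration of how an entry-mean heritability of this kind is commonly computed, the following is a minimal sketch. It assumes the widely used formula H² = σ²g / (σ²g + σ²ge/e + σ²ε/(e·r)), where e is the number of experiments and r the number of replicates per experiment; the source does not state which formula was used, and the variance components and replicate number in the example are hypothetical values chosen only to show the arithmetic.

```python
def entry_mean_heritability(var_g, var_ge, var_e, n_exp, n_rep):
    """Broad-sense heritability on an entry-mean basis.

    Assumed formula (a common convention, not confirmed by the source):
        H2 = var_g / (var_g + var_ge / n_exp + var_e / (n_exp * n_rep))
    var_g  : genotypic variance
    var_ge : genotype-by-experiment interaction variance
    var_e  : residual variance
    """
    if n_exp <= 0 or n_rep <= 0:
        raise ValueError("n_exp and n_rep must be positive")
    # Phenotypic variance of an entry mean across experiments and replicates
    var_p = var_g + var_ge / n_exp + var_e / (n_exp * n_rep)
    return var_g / var_p


# Hypothetical variance components, two experiments, three replicates each
h2_example = entry_mean_heritability(
    var_g=0.5, var_ge=0.2, var_e=0.6, n_exp=2, n_rep=3
)
print(round(h2_example, 2))
```

Dividing the interaction and residual variances by the number of experiments and plots reflects that heritability is expressed for genotype means rather than for single plots.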
SNP distribution and population structure analysis
After quality filtering and removal of markers with MAF < 5%, a total of 27,282 high-quality SNPs were used in the current study. These SNPs span 854.3 Mb of genome sequence, representing 75.6% coverage of the B. napus genome (~ 1130 Mb). The SNPs were unevenly distributed among the 19 chromosomes, ranging from 714 to 2386 SNPs per chromosome with an average of 1436; chromosomes 4 and 13 had the lowest (714 SNPs) and highest (2386 SNPs) numbers, respectively. The mean SNP density was approximately one SNP per 31.3 kb (Fig. 2a). Based on the 27,282 markers, principal component analysis (PCA) and kinship analyses were performed to identify the underlying genetic differences among the genotypes. The first three principal components (PCs) explained 22.2% of the genotypic variation and were included in the GWA mapping model to control the confounding effect of population stratification. Furthermore, model-based clustering analysis using the first three PCs identified five subgroups within the genotypes based on three ecotypes (Fig. 2b).
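The summary figures above follow directly from the marker count, the covered sequence length, and the genome size quoted in the text. The short sketch below reproduces that arithmetic; the function and variable names are illustrative only, and the density is assumed to be computed over the covered span with 1 Mb taken as 1000 kb.

```python
# Figures quoted in the text
N_SNPS = 27_282
N_CHROMOSOMES = 19
COVERED_MB = 854.3
GENOME_MB = 1130.0  # approximate B. napus genome size quoted in the text


def snp_summary(n_snps, n_chromosomes, covered_mb, genome_mb):
    """Return genome coverage (%), mean spacing (kb per SNP) and mean SNPs per chromosome."""
    return {
        "coverage_pct": 100.0 * covered_mb / genome_mb,
        "kb_per_snp": 1000.0 * covered_mb / n_snps,
        "mean_snps_per_chromosome": n_snps / n_chromosomes,
    }


summary = snp_summary(N_SNPS, N_CHROMOSOMES, COVERED_MB, GENOME_MB)
for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```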
Marker-trait-associations detected for SSR resistance by single-locus GWA analyses
The single-locus (SL) GWA analyses were performed with the GEMMA-MLM model, which included the first three PCs as fixed effects and the genetic relatedness matrix as a random effect. The SL GWA results for DW and LP scores at 3, 4, and 7 dpi are presented in Suppl. Table S3. Based on the method developed by Li and Ji (2005), the significance threshold was P ≤ 2.40E-04; LOD ≥ 3. A total of 35 SNPs were identified for the SSR resistance phenotypic traits. The SNPs were detected on chromosomes A01, A03, A04, A05, A06, A08, A09, C01, C02, C03, C04, C05, C06, C08, and C09. The majority of the significant SNPs were located on chromosomes C08 (5), A09 (4), A04 (3), A05 (3), A06 (3), C02 (3), and C03 (3). The highest number of significant SNPs (n=15) was identified for DW, while the lowest number (n=11) was identified for the LP_3dpi trait. Among these, 18 significant SNPs were detected for two or more of the SSR resistance traits (Suppl. Tables S3, S4).
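The Li and Ji (2005) approach derives a multiple-testing threshold from the effective number of independent tests, estimated from the eigenvalues of the marker correlation matrix. The sketch below illustrates the idea under stated assumptions: the eigenvalues are taken as already computed, a Šidák-type correction is applied, and the family-wise α of 0.05 is assumed here because the source does not report it. The example eigenvalues are hypothetical.

```python
import math


def effective_number_of_tests(eigenvalues):
    """Li and Ji (2005) estimate of the effective number of independent tests.

    Each eigenvalue x of the marker correlation matrix contributes
    I(|x| >= 1) + (|x| - floor(|x|)).
    """
    m_eff = 0.0
    for lam in eigenvalues:
        x = abs(lam)
        m_eff += (1.0 if x >= 1.0 else 0.0) + (x - math.floor(x))
    return m_eff


def sidak_threshold(alpha, m_eff):
    """Per-test P value threshold that keeps the family-wise error rate at alpha."""
    return 1.0 - (1.0 - alpha) ** (1.0 / m_eff)


# Hypothetical eigenvalues of a 4 x 4 correlation matrix (they sum to 4)
m_eff_example = effective_number_of_tests([2.5, 1.0, 0.5, 0.0])
threshold_example = sidak_threshold(0.05, m_eff_example)
print(m_eff_example, threshold_example)
```

Under the assumed α of 0.05, a threshold close to 2.40E-04 would correspond to roughly 214 effective tests, which is far fewer than the 27,282 markers because linked SNPs are correlated.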
Marker-trait-associations detected for SSR resistance by multi-locus GWA analyses
Three multi-locus (ML) GWA algorithms (MLMM, FarmCPU, and mrMLM) detected a total of 219 SNPs corresponding to 216 loci across all 19 chromosomes of the B. napus genome [-log10 (P) = 3.0-12.3] (Suppl. Table S3). The number of SNPs detected by the three ML-GWAS methods ranged from 10 to 48. The highest number (48 SNPs) was detected for DW by FarmCPU, whereas the lowest number (10 SNPs) was found for LP_7dpi by mrMLM. A total of 44 of the 219 SNPs were identified simultaneously in at least two of the phenotyped SSR resistance traits by two or more ML methods for any of the traits. The estimated allelic effects ranged from -0.54 to 0.63, -0.29 to 0.27, -0.21 to 0.19, and -0.14 to 0.12 for DW, LP_3dpi, LP_4dpi, and LP_7dpi, respectively. The phenotypic variation explained by the significant SNPs ranged from 2.0-9.30%, 1.60-11.90%, 1.35-13.30%, and 2.48-9.52% for DW, LP_3dpi, LP_4dpi, and LP_7dpi, respectively (Suppl. Tables S3, S4).
Commonly identified marker-trait-associations among the SSR resistance traits, among and between single-locus and multi-locus GWAS methods
Of the 35 SNPs detected by the SL GEMMA-MLM method, 15 were associated with DW, 14 each with LP_4dpi and LP_7dpi, and 11 with LP_3dpi (Fig. S3). Seven SNPs were shared between DW and LP_4dpi and between DW and LP_7dpi, followed by 5 SNPs between LP_3dpi and LP_4dpi, and only a single SNP between DW and LP_3dpi (Suppl. Tables S3, S4). Moreover, only a single SNP, SCM002771.2_77997199 on chromosome C03, was co-localized by the SL method for the DW, LP_3dpi, and LP_4dpi traits. All of the QTNs detected with the SL method were also detected for the SSR resistance traits by at least one of the ML-GWAS models. In addition to the 35 QTNs identified by SL, the ML methods detected a further 184 SNPs associated with the SSR phenotypic traits. The number of QTNs identified by all the ML models combined varied from 54 to 88 per SSR resistance trait, whereas the number identified by each individual ML model ranged from 10 to 48. Among the three ML models, the highest number of QTNs (48) was detected by FarmCPU for DW and the lowest (10) by mrMLM for LP_7dpi (Fig. S3). Comparison of the ML models demonstrated that each model was able to detect QTNs independently of the others, and a few QTNs (1 to 9) were detected by all three models for each trait. However, no SNP was identified by all three ML models for all SSR resistance traits. The number of commonly detected SNPs varied between and among the studied SSR resistance phenotypic traits: DW & LP_4dpi (20) > LP_3dpi & LP_4dpi (14) > DW & LP_7dpi (13) > DW & LP_3dpi (8) > DW, LP_3dpi & LP_4dpi (5) > LP_4dpi & LP_7dpi (3) > DW, LP_4dpi & LP_7dpi (2) (Suppl. Tables S3, S4). However, to obtain more reliable results, only the SNPs that were simultaneously detected by a combination of SL and any of the ML methods, by at least two of the ML methods, or for at least two traits were considered significant QTNs.
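The retention rule described above can be expressed as a simple set-based filter. The sketch below is one possible reading of that rule, not the authors' script: it pools methods across traits for each SNP, and the record layout, function name, and SNP identifiers in the example are hypothetical.

```python
from collections import defaultdict

SL_METHODS = {"GEMMA-MLM"}
ML_METHODS = {"MLMM", "FarmCPU", "mrMLM"}


def select_reliable_qtns(hits):
    """Keep SNPs found by SL plus any ML method, by >= 2 ML methods, or for >= 2 traits.

    hits: iterable of (snp_id, trait, method) tuples, one per significant association.
    """
    methods_by_snp = defaultdict(set)
    traits_by_snp = defaultdict(set)
    for snp_id, trait, method in hits:
        methods_by_snp[snp_id].add(method)
        traits_by_snp[snp_id].add(trait)

    selected = set()
    for snp_id, methods in methods_by_snp.items():
        found_by_sl = bool(methods & SL_METHODS)
        ml_found = methods & ML_METHODS
        if (found_by_sl and ml_found) or len(ml_found) >= 2 or len(traits_by_snp[snp_id]) >= 2:
            selected.add(snp_id)
    return selected


# Hypothetical associations used only to exercise the rule
example_hits = [
    ("snp_a", "DW", "GEMMA-MLM"), ("snp_a", "DW", "FarmCPU"),      # SL + ML
    ("snp_b", "LP_3dpi", "MLMM"), ("snp_b", "LP_3dpi", "mrMLM"),   # two ML methods
    ("snp_c", "DW", "FarmCPU"), ("snp_c", "LP_4dpi", "FarmCPU"),   # two traits
    ("snp_d", "LP_7dpi", "mrMLM"),                                 # single hit, dropped
]
reliable = select_reliable_qtns(example_hits)
print(sorted(reliable))
```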
Thus, a total of 71 QTNs controlling SSR resistance traits were obtained (Suppl. Table S4). These QTNs will serve as a valuable resource and could facilitate cost-effective MAS breeding for SSR resistance. Manhattan and Q-Q plots summarizing the GWA results of all the phenotypic traits for SSR resistance by the SL (GEMMA-MLM) and ML (MLMM, FarmCPU, mrMLM) algorithms are presented in Fig. 3a-d and Fig. S4, S5, S6, and S7. All GWA models were compared for the studied phenotypic traits to determine whether they controlled false positives and false negatives. The Q-Q plot depicts the observed negative log10 (P) values versus the expected negative log10 (P) values across all markers. The Q-Q plots of the GEMMA-MLM and MLMM models followed a straight line with a slightly deviated tail, which indicated that these two models reduced false positives (Fig. 3a-c). However, most of the SNPs were close to the straight line or slightly deflated below it, indicating that some true associations might have been missed as false negatives (Fig. 3a-c). In contrast, the Q-Q plots of the FarmCPU and mrMLM models showed a sharp upward deviation from the expected P value distribution in the tail area, indicating that these models controlled both false positives and false negatives (Fig. 3a-c).
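To make the Q-Q comparison concrete, the sketch below computes the coordinates of such a plot from a list of P values. It assumes the expected quantiles are taken from a uniform distribution using the (rank - 0.5)/n plotting position, which is one common convention; the source does not say which was used, and the P values in the example are hypothetical.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, largest values first.

    Expected quantiles assume uniform P values under the null hypothesis,
    using the (rank - 0.5) / n plotting position.
    """
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P values must lie in (0, 1]")
    n = len(p_values)
    observed = sorted((-math.log10(p) for p in p_values), reverse=True)
    expected = [-math.log10((rank - 0.5) / n) for rank in range(1, n + 1)]
    return list(zip(expected, observed))


# Hypothetical P values: one strong association on an otherwise null background
example_points = qq_points([1e-8, 0.375, 0.625, 0.875])
for expected, observed in example_points:
    print(f"expected={expected:.2f} observed={observed:.2f}")
```

Points that track the diagonal indicate a well-calibrated model, while points rising above it only in the upper tail suggest true associations rather than general inflation.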
Candidate gene prediction
To identify potential candidate genes for SSR resistance, the significant SNPs detected in at least two traits or with two or more GWA models were used for candidate gene mining using the “ZS11” reference genome sequence database (Sun et al. 2017). With this criterion, 81 putative candidate genes with known functions associated with plant disease resistance mechanisms were identified. The protein sequences of the candidate genes were used as queries against the UniProt database (https://www.uniprot.org/uniprot/) to identify their putative biological functions (Suppl. Table S5). The detected candidate genes were involved in biological processes including defense response, defense response to fungus, response to a molecule of fungal origin, response to chitin, programmed cell death, callose deposition in cell wall, response to salicylic acid, indole glucosinolate biosynthetic process, induced systemic resistance, jasmonic acid mediated signaling pathway, ethylene-dependent systemic resistance, systemic acquired resistance, pattern recognition receptor signaling pathway, response to wounding, protein kinase activity, response to oxidative stress, toxin catabolic process, immune response, reactive oxygen species metabolic process, brassinosteroid mediated signaling pathway, and other biological processes that might play a key role in early-stage SSR resistance in rapeseed/canola (Suppl. Table S5).
Genomic prediction (GP)
The mean predictive ability and prediction accuracy of the six GS models are shown in Fig. 4a-d. There was little difference in predictive ability among the six GS statistical models for any of the analyzed SSR resistance traits. The average predictive ability, i.e., the correlation between the observed resistance to SSR and the predicted values (GEBVs), was 0.60-0.62 for DW, 0.67-0.68 for LP_3dpi, 0.63 for LP_4dpi, and 0.45-0.48 for LP_7dpi. Thus, the highest predictive ability (~0.67-0.68) was observed for LP_3dpi, whereas the lowest (~0.45-0.48) was recorded for LP_7dpi. Predictive abilities estimated from the various models differed by 0 to 0.03, depending on the trait. No model consistently resulted in a higher predictive ability across the traits. For example, for LP_3dpi, a predictive ability of 0.68 was recorded with rrBLUP, whereas all the Bayesian models yielded 0.67. In contrast, the predictive ability for LP_7dpi was 0.45 with the rrBLUP and BL methods, 0.03 lower than the 0.48 estimated by the BB and BRR models. For LP_4dpi, all the implemented GS models resulted in a predictive ability of ~0.63.
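Predictive ability as defined above is a Pearson correlation between observed values and GEBVs. The sketch below shows that calculation together with one commonly used definition of prediction accuracy, predictive ability divided by the square root of heritability; this definition is an assumption, since the source does not say how accuracy was derived. The observed and predicted vectors are hypothetical, while 0.60 and 0.71 are the DW predictive ability and heritability reported in the text.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def prediction_accuracy(predictive_ability, heritability):
    """Assumed convention: accuracy = predictive ability / sqrt(H2)."""
    return predictive_ability / math.sqrt(heritability)


# Hypothetical observed phenotypes and GEBVs from one validation fold
observed = [4.1, 5.0, 5.6, 6.3, 7.2]
predicted = [4.6, 4.9, 5.4, 6.0, 6.5]
fold_predictive_ability = pearson_r(observed, predicted)

# Values reported in the text for DW
dw_accuracy = prediction_accuracy(0.60, 0.71)
print(round(fold_predictive_ability, 2), round(dw_accuracy, 2))
```

In a cross-validation scheme, the correlation would be computed for each validation fold and then averaged to give the mean predictive ability of a model.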