Before the analysis of variance for stripe rust resistance, the data were transformed using the arcsine square-root method. In both years, the Shapiro-Wilk normality test on the untransformed data was highly significant (p = 2.116e-08 and 4.129e-08), confirming that the untransformed data were not normally distributed. The transformed data were closer to a normal distribution than the untransformed data (Supplementary Figures 1 and 2). The analysis of variance revealed highly significant differences among the genotypes for stripe rust resistance (Table 1). The coefficient of infection (CI) ranged from 0 to 100% in 2019 and from 0 to 95% in 2020 (Figure 1). Highly significant differences were also found between the years and for the genotype x year interaction. Broad-sense heritability was high across the two years (H2B = 0.73).
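As a minimal sketch of the transformation step, assuming CI is recorded as a percentage between 0 and 100, the arcsine square-root transformation can be written as below. The function name `arcsine_sqrt` and the example values are illustrative assumptions and are not taken from the original analysis.

```python
import math

def arcsine_sqrt(ci_percent):
    """Arcsine square-root transform of a percentage (0-100); result in radians."""
    if not 0.0 <= ci_percent <= 100.0:
        raise ValueError("CI must lie between 0 and 100")
    proportion = ci_percent / 100.0
    return math.asin(math.sqrt(proportion))

# Illustrative CI values only (not data from this study)
example_ci = [0.0, 4.0, 50.0, 100.0]
transformed = [arcsine_sqrt(ci) for ci in example_ci]
```

The transformed values run from 0 to pi/2 and stretch the two ends of the percentage scale, which is why this transformation is commonly applied to proportion data before an analysis of variance.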
Based on CI, 19 and eight genotypes showed 0% CI in 2019 and 2020, respectively. In addition, six and 13 genotypes were resistant with a CI of 4% or less in 2019 and 2020, respectively. The total number of resistant genotypes (CI of 0-4%) was 24 in 2019 and 19 in 2020. Only eight genotypes were resistant to stripe rust in both years (Figure 2 and Table 2): one Canadian genotype (PI_556465), one Saudi Arabian genotype (PI_574347), two Iranian genotypes (PI_243679 and PI_625253), two Kenyan genotypes (PI_237655 and PI_237658), and two Egyptian genotypes (Misr_1 and Beni Sweif_4).
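The selection of genotypes that were resistant in both years can be sketched as a set intersection, assuming the 0-4% CI threshold stated above. The genotype names and CI values in this example are hypothetical.

```python
RESISTANCE_THRESHOLD = 4.0  # CI (%) at or below which a genotype is called resistant

def resistant_in_both_years(ci_2019, ci_2020, threshold=RESISTANCE_THRESHOLD):
    """Return the sorted genotypes whose CI is <= threshold in both years."""
    resistant_2019 = {g for g, ci in ci_2019.items() if ci <= threshold}
    resistant_2020 = {g for g, ci in ci_2020.items() if ci <= threshold}
    return sorted(resistant_2019 & resistant_2020)

# Hypothetical genotype names and CI values, for illustration only
ci_2019 = {"G1": 0.0, "G2": 3.5, "G3": 40.0, "G4": 2.0}
ci_2020 = {"G1": 4.0, "G2": 10.0, "G3": 60.0, "G4": 0.0}
stable = resistant_in_both_years(ci_2019, ci_2020)
```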
Association mapping for stripe rust resistance
The GBS-SNP markers and population structure
GBS generated a set of 36,720 SNPs after filtering for minor allele frequency (MAF) > 0.05, maximum missing sites per SNP < 20%, and maximum missing sites per genotype < 20% [23, 24, 28]. Heterozygous loci were then marked as missing and the filtering was repeated, which yielded a set of 26,703 SNP markers for 102 genotypes. This set was used in the GWAS analysis. The retained SNPs were distributed across all wheat chromosomes, which increases the possibility of QTL detection.
The population structure of the current panel was studied extensively in our previous manuscript [29]. In summary, the 103 genotypes were classified into three subpopulations. The eight resistant genotypes were distributed across the three subpopulations, indicating that hybridization among these genotypes should be very effective (Table 2).
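The filtering criteria above can be illustrated with a short sketch. It assumes biallelic SNPs stored as two-letter calls (for example "AA" or "AG") with `None` for missing data; the data layout and function names are assumptions made for this example and do not describe the software used in the study.

```python
def mask_heterozygous(calls):
    """Replace heterozygous calls (e.g. 'AG') with None (missing)."""
    return [c if c is not None and c[0] == c[1] else None for c in calls]

def missing_rate(calls):
    """Fraction of missing (None) calls."""
    return sum(c is None for c in calls) / len(calls)

def minor_allele_frequency(calls):
    """MAF from the non-missing homozygous calls of a biallelic SNP."""
    counts = {}
    for c in calls:
        if c is not None:
            counts[c[0]] = counts.get(c[0], 0) + 1
    if len(counts) < 2:  # monomorphic or entirely missing
        return 0.0
    return min(counts.values()) / sum(counts.values())

def filter_snps(matrix, min_maf=0.05, max_missing=0.20):
    """matrix maps SNP id -> list of calls (one per genotype, same order).
    Returns ids of SNPs passing both filters after heterozygotes are masked."""
    kept = []
    for snp_id, calls in matrix.items():
        masked = mask_heterozygous(calls)
        if missing_rate(masked) < max_missing and minor_allele_frequency(masked) > min_maf:
            kept.append(snp_id)
    return kept

def filter_genotypes(matrix, max_missing=0.20):
    """Return column indices of genotypes with < max_missing missing calls."""
    n_snps = len(matrix)
    n_genotypes = len(next(iter(matrix.values())))
    return [
        j for j in range(n_genotypes)
        if sum(calls[j] is None for calls in matrix.values()) / n_snps < max_missing
    ]

# Hypothetical toy matrix: 3 SNPs x 10 genotypes
toy = {
    "snp1": ["AA", "AA", "GG", "GG", "AA", "GG", "AA", "GG", "AA", "GG"],
    "snp2": ["AA"] * 10,  # monomorphic, fails the MAF filter
    "snp3": ["CC", "CT", "CT", "CT", "TT", "CC", "CC", "TT", "CC", "TT"],  # 30% missing after masking
}
kept = filter_snps(toy)
```

In this toy example only `snp1` is retained, which mirrors how masking heterozygous calls can push a marker above the missing-data threshold on the second filtering pass.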
Genome-wide association study (GWAS) and linkage disequilibrium (LD) between the significant SNPs
Population structure in the studied genotypes can cause spurious associations, so two models were tested: a mixed linear model with kinship (MLM+K) and a general linear model with population structure (GLM+PC). The QQ-plots of both models are presented in Figure 3. The QQ-plots for the MLM+K model fell below the reference line in both 2019 and 2020, indicating that the kinship matrix overcorrected the model (Figure 3a and b). The GWAS was therefore performed using GLM+PC, for which the QQ-plots followed the reference line closely in both the 2019 and 2020 experiments, indicating efficient correction (Figure 3c and d).
Based on the GLM+PC model, 14 and 56 SNPs were associated with stripe rust resistance in 2019 and 2020, respectively (Figure 4a). The significant SNPs identified in the 2019 experiment were located on chromosomes 1A, 1B, 2A, 4A, and 5A and on unknown chromosomes (Figure 5 and Supplementary Table 2), whereas those of the 2020 experiment were located on chromosomes 1A, 1B, 1D, 2A, 2B, 3A, 3B, 4A, 4B, 4D, 5A, 5B, 6A, 6B, 7A, and 7B and on an unknown chromosome (Figure 5 and Supplementary Table 3). Five SNPs were common to the 2019 and 2020 trials (Figure 4a). These five SNPs were located on chromosome 2A (three SNPs) and chromosome 4A (two SNPs) (Table 3). The phenotypic variation explained by each significant SNP (R2) ranged from 11.03% for S4A_658402828 to 23.24% for S2A_9121999 in 2019, and from 14.26% for S2A_16881495 to 30.75% for S2A_9121999 in 2020. The SNP marker S2A_9121999 therefore had the highest R2 in both years. Allele A of SNP S4A_658402828 had the largest allele effect, decreasing stripe rust symptoms by 55.96% and 65.02% in 2019 and 2020, respectively, whereas allele T of SNP S2A_9121999 had the smallest allele effect, increasing stripe rust resistance by 37.43% and 32.78% in 2019 and 2020, respectively (Table 3).
The linkage disequilibrium (r2) between each pair of significant SNPs located on the same chromosome was calculated. No significant LD was found among the three significant SNPs on chromosome 2A, whereas incomplete LD was found between the two significant SNPs on chromosome 4A (r2 = 0.51) (Supplementary Table 6 and Figure 6).
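To make the QQ-plot comparison concrete, the sketch below computes the plotted points from a list of GWAS p-values, assuming the usual uniform null distribution for the expected quantiles. The function name and the toy p-values are illustrative; the plotting itself and the software used by the study are not reproduced here.

```python
import math

def qq_points(p_values):
    """Return (expected, observed) -log10(p) pairs for a QQ plot.

    Expected p-values follow the uniform null: (rank - 0.5) / n.
    All p-values must be strictly greater than zero.
    """
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected_p = (rank - 0.5) / n
        points.append((-math.log10(expected_p), -math.log10(p)))
    return points

# Toy p-values that match the null expectation exactly (illustration only)
toy_p = [0.5, 0.1, 0.9, 0.3, 0.7]
points = qq_points(toy_p)
```

Points whose observed value lies consistently below the expected value fall under the reference line, which is the pattern interpreted above as overcorrection; points close to the line indicate that the p-values behave as expected under the null for most markers.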
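A minimal sketch of the pairwise r2 calculation follows. It assumes homozygous lines with alleles coded 0/1 (heterozygous calls having been set to missing, as described for the SNP filtering), so that r2 equals the squared Pearson correlation between the two markers. The function name and toy vectors are hypothetical, and no significance test is included.

```python
def ld_r2(snp_a, snp_b):
    """Squared correlation between two SNPs coded 0/1 per genotype.

    Genotypes with a missing call (None) at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(snp_a, snp_b) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no genotypes with calls at both SNPs")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs) / n
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs) / n
    var_b = sum((b - mean_b) ** 2 for _, b in pairs) / n
    if var_a == 0 or var_b == 0:
        return 0.0  # r2 is undefined for a monomorphic SNP; reported as 0 here
    return cov * cov / (var_a * var_b)

# Toy examples (illustration only)
complete_ld = ld_r2([0, 0, 1, 1], [0, 0, 1, 1])
no_ld = ld_r2([0, 0, 1, 1], [0, 1, 0, 1])
```

An r2 of 1 corresponds to complete LD and 0 to no LD, so the value of 0.51 reported for the two SNPs on chromosome 4A sits between these extremes.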
Genes underlying significant SNPs and their validation
To further understand the genetic association between the significant SNPs and stripe rust resistance in wheat, the annotation of the gene models underlying the significant SNPs was investigated using the IWGSC v1.0 GFF3 files. Of the three significant SNPs on chromosome 2A, two SNPs, S2A_16067928 and S2A_16881495, were located within the gene models TraesCS2A01G038300 and TraesCS2A01G042100.1, respectively (Table 4). One gene model, TraesCS4A01G380100.1, was found to underlie one of the significant SNPs on chromosome 4A.
To validate the association between the identified gene models and stripe rust resistance, the functional annotation of these gene models was investigated. The gene model TraesCS2A01G038300 encodes beta-glucosidase, an enzyme that contributes to defense against bacteria, fungi, and insects in many plant species [30, 31]. Phosphoglycerate mutase, encoded by TraesCS2A01G042100.1 (S2A_16881495), was found to improve plant adaptation to stresses, mainly abiotic stresses such as drought [32]. The coiled-coil domain protein encoded by TraesCS4A01G380100.1 was found to contribute to resistance against fungal diseases such as powdery mildew in wheat [33, 34].
Additionally, to strengthen the evidence for the association between the significant markers and stripe rust resistance, the expression of the three identified gene models was compared under control and disease conditions at three plant growth stages: seedling, vegetative, and reproductive (Figure 7). All three genes had higher expression under disease conditions than under control conditions at the vegetative growth stage. The two gene models located on chromosome 2A also had higher expression under disease conditions than under control conditions at the seedling growth stage.
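The lookup of gene models underlying a SNP can be sketched as an interval search over the `gene` features of a GFF3 file. The example below relies only on the standard nine-column, tab-separated GFF3 layout; the chromosome label, coordinates, and gene identifier in the sample line are made up for illustration and are not values from the IWGSC annotation.

```python
def parse_gff3_genes(lines):
    """Yield (chrom, start, end, gene_id) for 'gene' features in GFF3 lines."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9 or fields[2] != "gene":
            continue
        attributes = dict(
            item.split("=", 1) for item in fields[8].split(";") if "=" in item
        )
        yield fields[0], int(fields[3]), int(fields[4]), attributes.get("ID", "")

def genes_at(chrom, position, genes):
    """Return ids of genes whose [start, end] interval contains the position."""
    return [gid for c, start, end, gid in genes if c == chrom and start <= position <= end]

# Hypothetical GFF3 content (coordinates and ids are invented for this example)
sample_lines = [
    "##gff-version 3",
    "chr2A\tdemo\tgene\t1000\t5000\t.\t+\t.\tID=gene_demo_1;Name=demo",
    "chr2A\tdemo\tmRNA\t1000\t5000\t.\t+\t.\tID=gene_demo_1.1;Parent=gene_demo_1",
    "chr4A\tdemo\tgene\t200\t900\t.\t-\t.\tID=gene_demo_2",
]
genes = list(parse_gff3_genes(sample_lines))
hits = genes_at("chr2A", 2500, genes)
```

A SNP that returns an empty list lies outside every annotated gene interval, which would correspond to the significant SNPs above for which no underlying gene model was reported.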
Single marker analysis (SMA) of stripe rust resistance using DArT marker
SMA identified 13 and 22 significant DArT markers associated with stripe rust resistance in 2019 and 2020, respectively (p-value < 0.05) (Figure 4b and Supplementary Tables 4 and 5). Of these significant markers, only three were significantly associated with resistance in both 2019 and 2020 (Table 5). These common significant DArT markers were located on chromosomes 1D, 4A, and 7D. The allele effects indicated that two of the three markers, WPT-665480 and WPT-0493, were associated with decreased stripe rust resistance, by 21 and 27.7% in 2019 and by 19.5 and 17.3% in 2020, respectively. The third marker, WPT-5857, decreased the symptoms by 22.6% and 19.94% in 2019 and 2020, respectively. The phenotypic variation (R2) explained by each marker ranged from 5% for marker WPT-665480 to 9% for marker WPT-5857 in 2019 and from 7% for marker WPT-0493 to 16% for marker WPT-5857 in 2020. The LD between the three significant DArT markers and the significant SNPs located on the same chromosome was investigated, and no significant LD was found (Figure 6b).
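As an illustration of what a single marker analysis estimates, the sketch below computes the allele effect and R2 for one presence/absence marker, assuming a one-way comparison of phenotype means between the two marker classes. This is a generic formulation with hypothetical names and data; the exact SMA procedure and the significance test used in the study are not reproduced.

```python
def single_marker_stats(marker, phenotype):
    """Return (allele_effect, r_squared) for a marker coded 0/1.

    allele_effect: mean phenotype of class 1 minus mean of class 0.
    r_squared: between-class sum of squares / total sum of squares.
    Genotypes with a missing marker score (None) are ignored.
    """
    group0 = [y for m, y in zip(marker, phenotype) if m == 0]
    group1 = [y for m, y in zip(marker, phenotype) if m == 1]
    if not group0 or not group1:
        raise ValueError("both marker classes must be present")
    values = group0 + group1
    grand_mean = sum(values) / len(values)
    mean0 = sum(group0) / len(group0)
    mean1 = sum(group1) / len(group1)
    ss_total = sum((y - grand_mean) ** 2 for y in values)
    ss_between = (len(group0) * (mean0 - grand_mean) ** 2
                  + len(group1) * (mean1 - grand_mean) ** 2)
    r_squared = ss_between / ss_total if ss_total > 0 else 0.0
    return mean1 - mean0, r_squared

# Hypothetical marker scores and CI values (illustration only)
effect, r2 = single_marker_stats([0, 0, 1, 1], [10.0, 20.0, 30.0, 40.0])
```

With these toy values the marker class 1 has a mean CI 20 units higher than class 0 and the marker accounts for 80% of the phenotypic variation; a negative effect would indicate that the allele is associated with fewer symptoms.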
Selection of superior genotypes for stripe rust resistance in the tested materials
To genetically confirm the superior resistance of the promising genotypes presented in Table 2, the number of target alleles across all significant SNPs and DArT markers (significant in 2019, in 2020, or in both years) was determined for each selected genotype (Figure 8). The Canadian genotype PI_556465 contained a higher number of significant DArT markers (12 markers) than the other selected genotypes. However, four of the eight selected genotypes had no DArT genotypic data available, so the DArT data could not be relied on to select the best genotypes. For the SNP data, the highest number of target alleles was found in the Kenyan genotype PI_237655 (48 alleles), followed by the Iranian genotype PI_243679 (45 alleles). The lowest number of target alleles (35 alleles) was found in the Saudi Arabian genotype PI_574347. The two resistant Egyptian genotypes contained an intermediate number of target alleles, with 42 and 38 alleles for Misr_1 and Beni Sweif_4, respectively.
To further assess the possibility of improving stripe rust resistance in the Egyptian genotypes using the selected genotypes, the number of target SNPs at which each pair of the eight genotypes carried different alleles was determined (Table 6). The number of different alleles ranged from one, between the Saudi Arabian and Iranian genotypes PI_574347 and PI_625253, to 15, between the Iranian genotype PI_243679 and both the Egyptian genotype Misr_1 and the Kenyan genotype PI_237658. When the two Egyptian genotypes were compared with the remaining selected genotypes, the highest number of different alleles was found with the Iranian genotype PI_243679, with 15 and 12 different alleles for Misr_1 and Beni Sweif_4, respectively (Table 6). In addition, the genetic distance between the eight selected genotypes was calculated to assess the likely success of crosses between these genotypes. It ranged from 0.5922 between PI_625253 and PI_243679 to 0.7410 between PI_574347 and Beni Sweif_4 (Table 6).
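Counting target alleles per genotype amounts to comparing each genotype's call at every significant marker with the allele associated with resistance. The sketch below assumes calls and target alleles are stored as dictionaries keyed by marker name; the marker names, alleles, and genotype labels are hypothetical.

```python
def count_target_alleles(genotype_calls, target_alleles):
    """Count markers at which a genotype carries the target (resistance) allele.

    Markers missing from genotype_calls, or scored as None, are not counted.
    """
    return sum(
        1 for marker, target in target_alleles.items()
        if genotype_calls.get(marker) == target
    )

# Hypothetical target alleles and genotype calls (illustration only)
target_alleles = {"snp_a": "A", "snp_b": "T", "snp_c": "G"}
panel = {
    "line_1": {"snp_a": "A", "snp_b": "T", "snp_c": "C"},
    "line_2": {"snp_a": "A", "snp_b": None, "snp_c": "G"},
    "line_3": {"snp_a": "G", "snp_b": "C"},
}
counts = {name: count_target_alleles(calls, target_alleles) for name, calls in panel.items()}
```

Because missing calls are simply not counted, genotypes with incomplete marker data will tend to show lower totals, which is the same limitation noted above for the DArT data.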
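The pairwise comparison of alleles can be sketched as follows, assuming each genotype's calls at the target SNPs are stored in a dictionary and that markers with a missing call in either genotype are skipped. All names and calls are hypothetical; the genetic distance measure reported in Table 6 is not specified here and is not reproduced.

```python
from itertools import combinations

def allele_differences(calls_1, calls_2, markers):
    """Number of markers at which two genotypes carry different alleles."""
    differences = 0
    for marker in markers:
        a, b = calls_1.get(marker), calls_2.get(marker)
        if a is None or b is None:
            continue  # skip markers with a missing call in either genotype
        if a != b:
            differences += 1
    return differences

def pairwise_differences(panel, markers):
    """Map each unordered pair of genotype names to its count of differing alleles."""
    return {
        (g1, g2): allele_differences(panel[g1], panel[g2], markers)
        for g1, g2 in combinations(sorted(panel), 2)
    }

# Hypothetical calls at three target SNPs (illustration only)
markers = ["snp_a", "snp_b", "snp_c"]
panel = {
    "line_1": {"snp_a": "A", "snp_b": "T", "snp_c": "C"},
    "line_2": {"snp_a": "A", "snp_b": "C", "snp_c": "G"},
    "line_3": {"snp_a": "G", "snp_b": None, "snp_c": "G"},
}
differences = pairwise_differences(panel, markers)
```

Pairs with larger counts carry complementary alleles at more of the target SNPs, which is the rationale given above for choosing crossing parents.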