Meta-QTL Analysis for Stripe Rust Resistance in Wheat


 Stripe rust caused by Puccinia striiformis f. sp. tritici Eriks. & E. Henn (Pst) is one of the most prevalent wheat diseases causing upto 70% yield losses worldwide. The present study was conducted in wheat for the first time to identify important meta-QTL (MQTL) regions for their use in developing stripe rust resistant wheat cultivars and to understand the genetic architecture of stripe rust resistance in wheat. For this purpose, a dense consensus map consisting of 76,753 markers was constructed and 353 QTLs from earlier studies were projected on this consensus map. As many as 61 MQTLs were identified using 184 (out of 353) original QTLs. Ten important genomic regions including six breeders’ MQTLs (PVE >20%) and four MQTL hotspots were selected to be used by wheat breeders. As many as 409 important candidate genes (CGs) were also identified, which either encoded known R proteins (265) or showed differential expression (144) due to stripe rust infection. These included genes encoding the following proteins: NBS-LRR, WRKY domains, ankyrin repeat domains, sugar transporters, etc. Overall, the present study provided robust MQTLs and underlying CGs which may be potential targets for molecular breeding for development of stripe rust resistant wheat cultivars or may be the target for future molecular studies to understand the mechanism of stripe rust resistance.


Introduction
Due to its wide adaptability, wheat is grown on > 200 mha globally and it provides 20% of the daily protein for the growing world population (Shiferaw et al. 2013;Tesfaye et al. 2021). In production, globally wheat is the second most important crop after maize (FAO-2021). Rust diseases (leaf rust, stem rust and stripe rust) which occur world-wide are among the major biotic stresses severely affecting wheat yield. Among the three rusts, stripe rust (also known as yellow rust or Yr) caused by Puccinia striiformis f. sp. tritici Eriks. & E. Henn (Pst) is the most devastating and widely prevalent disease in major wheat growing countries around the world. There are at least 140 stripe rust pathotypes known globally, of these, more than 28 pathotypes occurring in India alone ( Stripe rust causes signi cant yield losses in almost every part of the world, where cool and humid conditions persist during crop season. The yield losses can be up to 70% under severe epidemic conditions, adversely affecting grain-lling duration, thus leading to poor grain yield as well as grain quality (Vikram et al. 2021). Under severe conditions, the yield losses may go up to 100% when the infection occurs at the seedling stage and environment conditions conducive to the pathogen persist until maturity (Afzal et al. 2007; Pradhan et al. 2020). In India, it is a major disease in the North West Plain Zones (NWPZ) especially in the sub-mountainous parts of Punjab, Haryana, and Western Uttar Pradesh, which are the major wheat growing areas. In fact, a major outbreak of this disease was witnessed in NWPZ during 2006 and in the Northern Hills Zone (NHZ) during 2012-13, leading to severe yield losses (Prashar et al. 2007; Saharan et al. 2013; Pradhan et al. 2020). As many as 83 Yr genes have already been identi ed globally, which are distributed on all the 21 wheat chromosomes (McIntosh et al. 2014). Out of these 83 Yr genes, more than 15 genes have been derived from alien wheat species (Park 2016). As many as 9 genes (Yr5/YrSP, Yr7, Yr10, Yr15, Yr36, Yr18, YrU1 and Yr46) have also been cloned and were shown to encode a variety of proteins including those with the nucleotide binding site leucine rich repeats (NBS-LRR) domain, kinase like domains, ankyrin repeats, Stripe rust resistance has also been treated as a quantitative trait (QT) and > 300 QTLs for this trait have been reported using interval mapping (Wang et al. 2017a). Several LD-based genome wide association studies (GWAS) for stripe rust resistance have also been conducted during the last two decades, leading to identi cation of a number of marker trait associations (MTAs). The activity involving quantitative genetic studies for this trait increased so extensively in recent years that ve of these GWA studies were However, no attempts have been made so far to conduct MQTL analysis and to validate the robustness of QTLs for stripe rust resistance, identi ed across different environments and using different mapping populations differing in their genetic make-up. This is despite the fact that MQTL analysis in wheat has been conducted for a number of other traits including those for tolerance to abiotic stresses such as involving MQTL analysis is the rst of its kind which was planned to identify MQTLs for stripe rust resistance, using the QTL information already available. The CGs for MQTLs were also identi ed. The MQTLs, MQTL hotspots and the CGs identi ed during the present study should serve as an important resource for future molecular breeding programs for stripe rust resistance in bread wheat. The results of this study should also allow identi cation of the consensus and robust QTLs and for re ning the QTL positions on a consensus map (Go net and Gerber 2000). Bibliographic search and input le preparation   For conducting MQTL analysis for stripe rust resistance, bibliographic search was done to collect  literature on QTL mapping studies published during past 20 years (2000 to 2020) (Table S1). From the literature, detailed information on types of mapping populations and their parents, population size, pathotypes used for phenotyping, methods of QTL mapping, position of QTL and markers anking the QTL, logarithm of odds (LOD) value, and R 2 values of the QTLs was collected. Only QTLs with complete information required for meta-analysis were retained for nal analysis. The two types of input data text les including the genetic map le and QTL information le were prepared from each study following the instructions provided in the BioMercator v3/v4 manual (Sosnoswki and Joets 2012). individual QTL studies used for MQTL analysis were also included for construction of a consensus genetic map.

Projection of QTLs on consensus map
The original QTLs were projected onto the above developed consensus map using the QTL projection (QTL Proj) tool available in BioMercator v4.2; QTLs which could not be projected onto the consensus map were excluded. For the projection of QTLs on the consensus map, a scaling rule between the marker interval of the original QTL and the corresponding interval on the consensus chromosome was used (Veyrieras et al. 2007). The new con dence interval (CI) for MQTLs on the consensus linkage group was computed using Gaussian distribution (Veyrieras et al. 2007). considered (AIC value estimates the relative amount of data lost by different statistical models). Quality of the model depended on the assumption that lesser is the information loss, higher will be the quality of that model (Akaike 1998). The MQTLs located in the same physical interval and genetic/con dence interval <15 cM were considered as MQTL hotspots. Identi cation of known Yr genes co-located with MQTLs

Meta-analysis of
The known Yr genes that are co-located with the MQTLs in the present study were also identi ed. For this purpose, the sequences of the markers linked to the Yr genes were blasted against the wheat reference genome version 49 (Triticum_aestivum IWGSC_Ensembl_Release 49) available in the Ensembl Plants database and the physical coordinates of the respective markers were extracted. These physical intervals of the markers linked to Yr genes were compared with the physical coordinates of the MQTL regions; an individual Yr gene falling within a speci c MQTL region was treated as the MQTL region co-located with the corresponding Yr gene. Gene ontology (GO) and in silico expression analysis of CGs GO analysis was conducted using Biomart tool available in Ensemble Plants. In silico expression of CGs was also conducted using the tool expVIP available online (wheat-expression.com; Ramirez-Gonzalez et al. 2018) to select the important genes based on their expression values. Expression data on three different experiments involving two different studies related to stripe rust resistance that were available in expVIP database were selected for in silico expression analysis. Following is the brief summary of these two expression studies. In the rst study conducted by Zhang et al. (2014), transcriptome divergence and overlap in response to stripe rust and powdery mildew was studied using the seven-day old seedlings of the variety N9134 that was inoculated using Pst race CYR31. The inoculated leaves were harvested at 0, The data on expression was available as log2 transformed tpm (transcripts per million) values. Only those genes showing FC≥2 or FC≤-2 (on comparison of tpm values under stress vs. control) were considered as differentially expressed. The results of such differentially expressed genes were depicted in the form of heatmaps generated using online tool Morpheus (https://software.broadinstitute.org/morpheus/).

QTLs and their distribution on wheat chromosomes
Based on the bibliographic search, a total of 68 studies (66 studies involving common wheat and two studies involving durum wheat) were found to contain all the required QTL related information and were used for the MQTL analysis (Table 1; for details see Table S1). Salient features of these studies include the following: (i) The mapping populations consisting of RILs, DH or F 3 ranged in size from 78 to 1020 plants/lines (Table1 ; Table S1). (ii) A total of 353 QTLs were identi ed from these 68 studies and the information on the anking marker, phenotypic variation explained (PVE) % of each QTL along with their con dence intervals (CI) are presented in Table S1.

Consensus map and projection of QTLs on consensus map
The consensus map had 76,753 markers, including 3526 DArT, 65459 SNP, 3975 SSR, and 3793 AFLP, STS, and TRAP etc. The total length of the consensus map was 5774 cM; the size of the 21 individual linkage groups ranged from 98 cM to 462 cM ( Figure 2). The marker densities ranged from 5 markers per cM in case of 6D to 28 markers per cM for 1A ( Figure 2). The average marker densities of the seven linkage groups was the lowest for D sub-genome (11.85 markers per cM) followed by B sub-genome (14.28 markers per cM) and A sub-genome (15.42 markers per cM) ( Figure 2).
Out of the total 353 QTLs collected from 68 studies, only 214 QTLs could be projected on to the consensus map; only these were used for MQTL analysis.

MQTLs, MQTL hotspots and their distribution on wheat chromosomes
Out of the 214 QTLs projected on the consensus map, 30 QTLs were found to be singletons and were therefore removed from further analysis. The remaining 184 QTLs with 8 QTLs overlapping two adjacent MQTLs were grouped into 61 MQTLs following the lowest AIC values. These MQTLs were located on 18 out of the 21 chromosomes (except 5D, 6D and 7D) with 1 to 8 MQTLs per chromosome ( Figure 3). Further, all original QTLs clustered in each of the 60 out of the 61 MQTL regions (except MQTL57) were identi ed using different types of mapping populations reported in different studies. In case of MQTL57 which was represented by 2 initial QTLs, both these QTLs were reported from the same population in a single study (Table 2). For 59 out of the 61 MQTLs, the information of stripe rust pathotypes that were utilized during the original studies was also available. The number of individual QTLs per MQTL ranged from 2 to 10 ( Table 2). Mean R 2 (PVE %) of the MQTLs ranged from 1.9% to 48.1% and their CI ranged from 0.02 cM (MQTL51) to 11.47 cM (MQTL40). Out of the 61 MQTLs, maximum (36) MQTLs were found on the B sub-genome followed by 19 MQTLs on A sub-genome and 6 MQTLs on D sub-genome. The genetic position, genetic interval and other related information for each MQTL are listed in Table S2.

Co-located Yr genes with the MQTLs
Twenty-two (22) known major Yr genes were found to be co-located (on the basis of physical position of MQTLs as well as the physical position of Yr genes) within the genomic regions of the following 10 MQTLs; MQTL3, 10, 15, 17, 18, 19, 20, 21, 24 and 29; 9 Yr genes were co-located with MQTL3 alone (Figure 3). Further, since the physical intervals of the MQTL15, MQTL17 and MQTL21 overlapped with each other, the same six Yr genes were co-located with these three MQTLs. In case of MQTL17, one more gene was co-located in addition to these six genes.
Identi cation of CGs, GO terms and in-silico expression analysis A total of 1581 genes were available in the genomic regions de ned by 60 out of the 61 MQTLs. Of the above 1581 genes, total 409 (265+144) important CGs were identi ed based on two different criteria which included (i) 265 important R genes which follow six of the 9 different mechanisms listed by Kourellis and van der Hoorn (2018) (Figure 4a and Table S3) and (ii) 144 signi cant differential expression based on in silico expression analysis (Table S4). These 265 genes belonged to 48 out of the 61 MQTLs. Twenty four (24) of the above 144 differentially expressed genes mentioned in category 1 were also commonly identi ed in the 265 genes belonging to second category.
GO analysis of the above 409 (144+265) CGs revealed a number of GO terms out of which some of the important and most abundant GO terms include those involved in biological processes like phosphorylation, protein ubiquitination, proteolysis, transmembrane transport, oxidation-reduction processes, etc. Similarly, important GO terms in molecular functions category included those involved in catalytic activity, ATP binding, protein binding, heme binding, iron ion binding, metal ion binding, transmembrane transporter activity, oxidoreductase activity, etc.
The above 144 differentially expressed CGs identi ed through in silico expression analysis (FC≥2 or ≤-2) belonged to 44 out of 60 MQTLs and the number of differentially expressed genes per MQTL ranged from 1 to 8 (Table 3 and Table S4). Further, these differentially expressed CGs encoded proteins for different categories like R-domain containing proteins, transcription factors (Zn nger binding proteins, SANT/Myb domains, NAC domain, BTF3), transporters (mitochondrial carrier domains, sugar-phosphate transporter domain, sodium/carbon exchanger domain, SLC26A/SulP transporter), different protein kinases, genes involved in calcium signaling, a number of peptidases, genes involved in oxidative stress (cytochrome P450) and domains like S1/P1 nuclease, Six-bladed beta-propeller/Strictosidine synthase, Alphamannosyltransferase and Peptidase T2 (Table S4). A representative heat map of 29 important differentially expressed CGs which included 14 CGs encoding R domain containing proteins, 9 CGs known to be involved in disease resistance signaling pathways and 6 CGs having very high expression (FC≥ 5 or ≤-5) shown in Figure 4b.

Discussion
The QTL interval mapping studies for stripe rust in wheat started in mid-1990s (Line et al. 1996). Since then, more than 70 reports have been published, leading to identi cation of >350 QTLs, so that stripe rust resistance is now treated both as a quantitative trait involving quantitative resistance loci (QRLs) and a qualitative trait controlled by a number of Yr genes, the latter following a gene-for-gene relationship with Avr genes in the pathogen. The relationship of QRLs in the host and the corresponding QTLs for virulence in the pathogen has not been worked out so far, although quantitative nature of virulence in pathogen has been worked out in some cases including wheat pathogens like Zymoseptoria tritici where several large and small effect QTLs were identi ed for virulence (Stewart et al. 2018) The 61 MQTLs identi ed during the present study were derived from 184 QTLs, which indicated roughly three-times reduction in redundancy for the genomic regions controlling stripe rust resistance in wheat genome. Earlier, while conducting meta-QTL analysis in wheat for fusarium head blight, roughly ve-fold reduction in redundancy was reported ( Sixty-one (61) MQTLs (including four MQTLs derived from the QTLs belonging to durum wheat) is a fairly large number indicating a high degree of redundancy of QTLs which agrees with a large number of Yr genes for stripe rust resistance reported in wheat genome. Such a redundancy of genes/MQTLs is a requirement for providing resistance against large number of ever-evolving races of stripe rust, distributed in different wheat growing regions of the world (Pradhan et al. 2020). Earlier, for resistance against fusarium head blight also, 65 MQTLs were identi ed. However, the number of MQTLs identi ed in this study far exceeds the number of MQTLs identi ed for leaf rust resistance (35) (Soriano and Royo 2015). Perhaps this is due to relatively fewer QTL studies (19) available for meta-analysis in case of leaf rust resistance.
Most of the MQTLs identi ed in the present study controlled more than one parameters/traits ( Table 2;  Table S2). This probably indicated either a tight linkage of genes for different traits, or occurrence of pleotropic genes or a bias due to the use of related traits measuring the same resistance component by different means as also reported earlier in case of MQTL analysis for leaf rust resistance in wheat (Soriano and Royo 2015). In the present study, ve MQTLs (MQTL3-1B, MQTL10-2A, MQTL19-2B, MQTL20-2B and MQTL24-2D) also showed co-localization of 1 to 9 different known Yr genes including four cloned Yr genes. These genomic regions may be involved in controlling both, qualitative and quantitative resistance and thus may be more important. Earlier, a number of Yr genes (out of the 82 reported Yr genes) have been deployed in commercial wheat cultivars; however only few are still effective.
For instance, some of the Yr genes which are still effective in India include Yr5, Yr10, Yr15, Yrsp, Yr47, Yr57 and Yr63 (Prasad 2020; Sharma et al. 2020). Four of these Yr genes (Yr5, Yr10, Yr15, Yrsp) that were found to be co-located with MQTLs ( Figure 3 Some of the MQTLs also overlap Yr genes that have already been cloned. For instance, MQTL20-2B was co-localized with two cloned Yr genes (Yr5/Yrsp and Yr7) whereas MQTL1-1A was co-localized with cloned gene Yr10. Co-localization of Yr5/Yrsp and Yr7 in the same MQTL region is perhaps due to the allelic nature of both the genes which were earlier shown to be closely linked ( Efforts were also made during the present study to identify MQTLs and MQTL hot spots that may prove useful for breeding; we describe these MQTLs as breeders' MQTL. For selecting these breeders' MQTLs, we utilized a number of criteria including the following two criteria suggested in an earlier study (Lo er et al. 2009): (i) the low CI and high average PVE of the MQTLs and (ii) the number of QTLs carried by individual MQTL. Additional criteria were also used in the present study for prioritizing and selecting breeders' MQTLs and MQTL hotspots. For instance, the relationship between MQTLs and the pathotypes occurring in speci c wheat growing regions may be an important criterion. While doing this we also have to keep in mind that virulence can also be quantitative in nature as mentioned earlier. MQTLs showing resistance against more than one pathotypes may also be important for achieving broad spectrum resistance. Such important MQTLs showing resistance against multiple pathogen races were also identi ed in a recent study on MQTL analysis reported for tan spot resistance in wheat (Liu et al. 2020).
MQTLs identi ed in the present study consisted of original QTLs which showed resistance at either APR, HTAP (high temperature adult plant resistance), SR (seedling resistance) or all stage resistance (ASR). Also, almost all the MQTLs (except MQTL36-4A) showed resistance against more than one pathotypes indicating that these MQTLs exhibit race non-speci c resistance and may be containing a number of novel genes which may be involved in providing resistance against broad spectrum of pathotypes. Keeping in view the different criteria listed above, a number of breeders' MQTLs were identi ed, which are listed in Table 4.
Some of the Yr genes have been shown to overlap the QRLs/MQTLs and also the CGs that were identi ed during the present study. The 409 CGs identi ed during the present study have been shown to encode a variety of proteins; at least some of them are known to be involved in disease resistance ( Table 3 CGs identi ed during the present study also deserve to be discussed. In wheat, CGs underlying the MQTLs were also identi ed for several traits including drought tolerance (Kumar et al. 2020), tan spot resistance (Liu et al. 2020) and fusarium head blight resistance (Venske et al. 2019). However, the strategy used by us was novel and not used in any of these earlier reports. For instance, in most of the earlier reports, the complete physical interval of the MQTL regions was considered for identi cation of CGs. However, in the present study, we calculated the exact physical position of the MQTL based on the MQTL peak position available from the BioMercator software. The 1 Mb interval on either sides of the MQTL peak was considered for identi cation of genes, which were used for identi cation of CGs responsible for stripe rust resistance.
Important differentially expressed CGs identi ed in the present study are presented in Figure 4. Fifty-nine (59) CGs out of the total 409 CGs were available in breeder's MQTL indicating that these CGs are more important. Out of the 59 CGs, 32 CGs also showed differential expression and encoded important R genes, S/TPK, SLC transporter, Mitogen-activated protein (MAP) kinase, UDP-glucosyltransferases, S1/P1 nuclease, etc. The role of some of the important CGs (shown in Figure 4) during disease resistance can be summarised as follows: (i) Earlier, STPK-V, a member of Pm1 gene was reported to confer powdery mildew resistance in wheat (Cao et al. 2011). (ii) NBS-LRR domain containing genes are the protein products of the cloned Yr genes like Yr10, Yr5, etc. as mentioned earlier (Liu et al. 2014;Marchal et al. 2018). (iii) TaMAPK4, a type of MAPK gene is reported to act as a positive regulator of stripe rust resistance in wheat (Wang et al. 2018). (iv) The above transporters may also possibly encode Yr genes similar to Yr46 which was shown to encode for hexose transporter (Moore et al 2015). (v) UDPglucosyltransferases were earlier reported to show differential expression due to stripe rust infection in wheat genotypes indicating their role in Yr39 mediated stripe rust resistance (Coram et al. 2008). Some other important CGs like those encoding for WRKY domains, Ankyrin repeat and F-box domain containing genes were also identi ed in different MQTLs, although the expression data was not available for these genes. WRKY and Ankyrin repeat domain containing genes were recently found to encode for proteins of cloned YrU gene . Similarly, F-box domain containing gene was identi ed as a CGs underlying the YrR39 locus in wheat and it was shown to upregulate due to stripe rust infection (Yin et al. 2018).
In summary, the present study allowed us to identify 6 MQTLs and 4 MQTL hotspots to be used by breeders particularly for high yielding wheat cultivars which are susceptible for stripe rust ( Table 4). Four of these 10 genomic regions also showed co-localization with known Yr genes. Some of the important CGs which were identi ed during the present study may be further validated/edited using approaches like gene editing, overexpression, gene knockout strategies or CG based association mapping (CGAM). Reports are available where some of these strategies have been used for validation of genes for their role in stripe rust resistance. For instance, overexpression of TaWRKY62 provided high temperature seedling plant resistance to stripe rust by activating other genes encoding for PR proteins, salicylic and jasmonic acid responsive genes and ROS associated genes (Wang et al. 2017b). Similarly, in another study overexpression of TaLHY (a type pf MYB TF) in leaf blade and sheath reduced the negative impacts of stripe rust on wheat plant (Zhang et al. 2015). This knowledge may prove to be useful for validating similar CGs identi ed in this study.
Gene editing or mutation is still unexplored in case of stripe rust resistance except a single study where the function of Yr15 gene (encoding wheat tandem kinase 1 or WKS1) in stripe rust resistance was validated using mutation analysis (Klymiuk et al. 2020). EMS mutations were created in this gene in resistant wheat lines to develop susceptible lines. The resulting susceptible lines showed point mutations in the three amino acids,i.e. Gly54, Ala149 and Ala460 leading to disruption in gene function, thereby validating the role of WSK1 in resistance. Therefore, a similar strategy may be certainly explored for at least three cloned genes (Yr5/Yrsp, Yr7 and Yr10) which are co-located in two important MQTL regions mentioned in Table 4 as well the important CGs shown in Figures 4a and 4b. Similarly, techniques involving CRISPR/Cas9 or base editing may also be employed for the above cloned Yr genes as well as the CGs after the identi cation of causal SNPs involved in providing stripe rust resistance through CGAM approach. Similar report involving CRISPR/Cas9 for fusarium head blight are available where successful editing (using CRISPR-Cas9) of three genes, TaABCC6, TaNFXL1, and TansLTP9

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.