Identification and fine-mapping of a major QTL (qRtsc8-1) conferring resistance to maize tar spot complex and validation of production markers in breeding lines


Tar spot complex (TSC) is a major foliar disease of maize in many Central and Latin American countries and leads to severe yield loss. To dissect the genetic architecture of TSC resistance, a genome-wide association study (GWAS) panel and a bi-parental doubled haploid population were used for GWAS and selective genotyping analysis, respectively. A total of 115 SNPs in bin 8.03 were detected by GWAS, and three QTL in bins 6.05, 6.07, and 8.03 were detected by selective genotyping. The major QTL qRtsc8-1, located in bin 8.03, was detected by both analyses and explained 14.97% of the phenotypic variance. To fine-map qRtsc8-1, a recombinant-derived progeny test was implemented. Recombinants in each generation were backcrossed, and the backcross progenies were individually genotyped with Kompetitive Allele Specific PCR (KASP) markers and phenotyped for TSC resistance. Significance tests comparing TSC resistance between the two classes of progenies, with and without the resistant allele, were used for fine-mapping. In the BC5 generation, qRtsc8-1 was fine-mapped to an interval of ~721 kb flanked by markers KASP81160138 and KASP81881276. In this interval, the candidate genes GRMZM2G063511 and GRMZM2G073884 were identified, which encode an integral membrane protein-like protein and a leucine-rich repeat receptor-like protein kinase, respectively. Both genes are involved in maize disease resistance responses. Two production markers, KASP81160138 and KASP81160155, were verified in 471 breeding lines. This study provides valuable information for cloning the resistance gene and will facilitate the routine implementation of marker-assisted selection in the breeding pipeline for improving TSC resistance.

TSC was first reported in Mexico and is most prevalent in moderately cool and humid tropical and subtropical areas. TSC has long been associated with at least three fungal pathogens: Phyllachora maydis, Monographella maydis, and Coniothyrium phyllachorae (Hock et al. 1992). Phyllachora maydis is the most important pathogen involved in TSC and can by itself cause tar spots and severe maize yield losses.
Appropriate crop management practices, including early or on-time sowing, lower planting densities, fungicidal control, and use of TSC resistant varieties, are the traditional approaches to control TSC. Breeding TSC resistant varieties is the most cost-effective, environmentally friendly, and long-term approach to reduce the economic impact caused by TSC, and it depends on the collection and identification of resistant germplasm. In the past 30 years, the International Maize and Wheat Improvement Center (CIMMYT) has developed and distributed many TSC resistant maize varieties (Mottaleb et al. 2019). A large number of lines were identified for TSC resistance under natural disease screening conditions, and some lines showed high levels of resistance to TSC. These lines are valuable donors for breeding TSC resistant varieties and dissecting the genetic architecture of TSC resistance. Several studies have been conducted to identify quantitative trait loci (QTL) conferring TSC resistance (Mahuku et al. 2016; Cao et al. 2017).
Genome-wide association study (GWAS), linkage mapping, and selective genotyping are powerful methods in genetic research (Cao et al. 2017; Gowda et al. 2021). GWAS based on linkage disequilibrium (LD) can effectively detect genetic variants associated with the target trait, but it is hampered by a high rate of false positive associations (Yuan et al. 2019; Liu et al. 2021). Linkage mapping based on recombination events is a powerful method for QTL mapping of complex traits, but its mapping resolution is low (Cao et al. 2017). Selective genotyping of individuals with extreme phenotypic values from one or both tails is a cost-effective strategy, which may provide roughly equivalent power to complete QTL mapping (Lebowitz et al. 1987; Lee et al. 2014). The combined use of different mapping methods can compensate for the limitations of each and has been successfully applied in revealing the genetic basis of several major diseases in maize (Guo et al. 2020; Ren et al. 2021). A major QTL, qRtsc8-1, on chromosome 8 was detected and verified in different genetic populations by the combined use of GWAS and linkage mapping in previous studies (Mahuku et al. 2016; Cao et al. 2017). Fine mapping of qRtsc8-1 and development of production markers that are tightly linked to this major QTL will improve the application of marker-assisted selection (MAS) for TSC resistance.
The recombinant-derived progeny test strategy is a powerful and widely used method for QTL fine mapping, which can narrow down the genomic region of the target QTL through trait-marker association testing in recombinant-derived progenies (Ding et al. 2012; Liu et al. 2016). Using the recombinant-derived progeny test, a QTL qMrdd8 associated with maize rough dwarf disease resistance was fine mapped to an interval of 347 kb, and two candidate genes, CG1 and CG2, were identified (Liu et al. 2016). In addition, a major QTL RppCML496 conferring resistance to Puccinia polysora in maize was fine mapped to an interval of 128 kb, and an NBS-LRR gene was the most likely candidate gene (Lv et al. 2021).
The development of functional markers within the finely mapped interval of the major QTL will enable the introgression of resistant alleles into elite breeding material through MAS, after the marker effects are verified in breeding materials. Compared to conventional breeding, MAS breeding is an indirect selection technique, where the selection is based on markers tightly linked to the genomic regions regulating the target trait, rather than the trait itself (Badu-Apraku and Fakorede 2017). Subsequently, the favorable alleles are transferred from the donor lines to the recipient lines for the improvement of the target trait. Several maize breeding programs have reported improved selection efficiency upon implementation of MAS (Nair et al. 2015; Xu et al. 2020; Prasanna et al. 2020a, 2021). At CIMMYT, MAS is being routinely deployed to enrich the favorable alleles of large effect QTL for maize lethal necrosis, maize streak virus (MSV), and provitamin A content in tropical maize breeding populations (Prasanna et al. 2020a, 2020b).
Fine mapping the major QTL for TSC resistance and verification of the effects of production markers in breeding lines are essential to accelerate the development of TSC resistant germplasm via MAS. Several large effect QTL conferring TSC resistance have been detected, but none of these QTL has been fine mapped, and therefore production markers for routinely implementing MAS are still unavailable. The objectives of this study were to (1) identify the major QTL conferring TSC resistance by the combined use of GWAS and selective genotyping, (2) fine map the major QTL qRtsc8-1 by subjecting the BC1, BC3, and BC5 progenies to recombinant-derived progeny testing, and reveal the candidate genes in the fine-mapped interval, and (3) develop and validate the production markers in breeding lines for the routine deployment of MAS to enrich the favorable alleles of qRtsc8-1 in tropical maize breeding populations.

Plant materials
A total of 652 diverse maize inbred lines, including two association mapping panels of Drought Tolerant Maize for Africa (DTMA) and CIMMYT maize lines (CMLs), were used for GWAS. The DTMA panel and the CMLs panel, broadly representing the genetic diversity of tropical/subtropical maize, consisted of 282 and 364 inbred lines, respectively. A bi-parental doubled haploid (DH) population consisting of 201 lines was used for selective genotyping to confirm the genomic region of the major QTL qRtsc8-1 identified by GWAS. It was derived from the F1 cross between the TSC resistant line CML495 and the TSC susceptible line La Posta Sequia C7 F64-2-6-2-2-B-B-B.
Fine mapping was performed in the DH population (Fig. 1). Resistant DH lines were selected and crossed to the susceptible DH lines to generate F1 populations. Molecular markers within the qRtsc8-1 region were used to identify recombinants, which were backcrossed to the recurrent susceptible DH lines to generate the BC1 progenies. The BC1 progenies were planted and evaluated for resistance to TSC. This process was repeated to develop a series of advanced backcross populations, including BC2, BC3, BC4, and BC5, to fine-map the QTL qRtsc8-1. A breeding population consisting of 471 breeding lines was used to validate the fine mapping results and verify the effects of production markers.

Disease evaluations in the field
Disease resistance was evaluated as described by Mahuku et al. (2016) under natural disease screening conditions. Disease severity was assessed three times at weekly intervals, starting two weeks after flowering. Disease severity was scored using a 1-5 rating scale, where 1 corresponds to highly resistant with no visible disease symptoms and nearly 0% of the leaf area infected; 2 corresponds to resistant with 1-30% of the leaf area infected; 3 corresponds to moderately susceptible with 31-50% of the leaf area infected; 4 corresponds to susceptible with 51-75% of the leaf area infected; and 5 corresponds to highly susceptible with 76-100% of the leaf area infected. The highest of the three scores was used for further analyses. For initial QTL mapping, disease severity was scored on a whole-plot basis. For fine mapping, the severity of each plant was scored individually, and the TSC score within each genotypic class was calculated by the following formula: TSC score = ∑ (severity scale × number of plants per scale) / total number of plants.
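The TSC score formula above is a plant-count-weighted mean of the severity scale. The following is a minimal sketch of that calculation; the function name `tsc_score` and the example scores are illustrative assumptions, not data from this study.

```python
from collections import Counter


def tsc_score(plant_scores):
    """Mean TSC score of one genotypic class.

    Implements: sum(severity scale * number of plants per scale) / total plants.
    `plant_scores` holds one 1-5 severity score per individually scored plant.
    """
    if not plant_scores:
        raise ValueError("at least one scored plant is required")
    counts = Counter(plant_scores)  # number of plants per severity scale
    for scale in counts:
        if scale not in (1, 2, 3, 4, 5):
            raise ValueError(f"severity scale must be 1-5, got {scale!r}")
    weighted_sum = sum(scale * n for scale, n in counts.items())
    return weighted_sum / len(plant_scores)


# Hypothetical genotypic class of 10 plants (not data from the study).
example_scores = [1, 1, 2, 2, 2, 3, 3, 4, 2, 1]
print(round(tsc_score(example_scores), 2))
```

Because the numerator sums scale × count, the result is identical to the arithmetic mean of the individual plant scores; the count-based form simply mirrors how the formula is written.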

Phenotypic data analysis
Phenotypic data analysis was performed using META-R Version 6.04 (Alvarado et al. 2020). Best linear unbiased predictions (BLUPs) and variance components were estimated by a mixed linear model (MLM):
$$Y_{ijk} = \mu + G_i + E_j + R(E)_{kj} + GE_{ij} + \varepsilon_{ijk}$$
where $Y_{ijk}$ is the phenotypic value of the i-th genotype at the j-th environment in the k-th replication, $\mu$ is the overall mean, $G_i$ is the effect of the i-th genotype, $E_j$ is the effect of the j-th environment, $R(E)_{kj}$ is the effect of the k-th replication at the j-th environment, $GE_{ij}$ is the effect of the i-th genotype by j-th environment interaction, and $\varepsilon_{ijk}$ is the residual. All the factors were set as random effects. Heritability was calculated on an entry-

GWAS analysis
For the imputed SNP dataset, SNPs were filtered with missing rate < 20%, heterozygosity rate < 5%, and minor allele frequency > 0.05. A total of 248,482 high-quality SNPs evenly distributed on the ten maize chromosomes were retained for further analysis. Analyses of linkage disequilibrium (LD) and GWAS were performed with 652 maize inbred lines in TASSEL 5.0. The LD decay was estimated as the squared Pearson correlation coefficient (r²) calculated between adjacent SNPs. The threshold r² = 0.1 was used. The distance of LD decay was 5.04 kb across the ten chromosomes and ranged from 1.92 kb on chromosome 6 to 7.94 kb on chromosome 4. The MLM incorporating the kinship (K) matrix and principal component analysis (PCA) was applied for GWAS analysis. The K matrix was estimated with the default centered identity-by-state method in TASSEL 5.0. The first three principal components were calculated to control the population structure. The threshold P-value of 2.01 × 10⁻⁷ was determined by a Bonferroni correction method to avoid false positives. The GWAS results were used to generate Manhattan and quantile-quantile (q-q) plots with the qqman package in R software (R Core Team 2019).
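As a quick check on the Bonferroni threshold, the sketch below divides a family-wise significance level by the number of SNPs tested. The significance level of 0.05 is an assumption (it is not stated explicitly for the correction), although it reproduces the reported threshold; the function name is illustrative.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test P-value threshold that keeps the family-wise error rate at alpha."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 248,482 SNPs were retained for GWAS; alpha = 0.05 is an assumed level.
gwas_threshold = bonferroni_threshold(0.05, 248_482)
print(f"{gwas_threshold:.2e}")  # 2.01e-07

# Height of the corresponding significance line on a -log10(P) Manhattan plot.
print(-math.log10(gwas_threshold))
```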

Selective genotyping
In the DH population, 20 DH lines with the highest TSC scores (top 10% susceptible tail) and 20 DH lines with the lowest TSC scores (top 10% resistant tail) were selected to detect QTL by selective genotyping. For the unimputed SNP dataset, 34,317 SNPs with missing rate < 20%, heterozygosity rate < 5%, and minor allele frequency > 0.10 were used for selective genotyping analysis. A Chi-square test with a 2 × 2 contingency table was used to compare allele frequencies between the resistant and susceptible groups. A SNP showing a significant difference (P < 0.05) in allele frequency between the two tails indicates the presence of a resistance QTL near this SNP. The threshold P-value (1.46 × 10⁻⁶) was determined by a Bonferroni correction method. The qqman package in R software was used to visualize the selective genotyping results.
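The per-SNP test described above can be sketched as a Pearson Chi-square test on a 2 × 2 table of allele counts in the two tails. The version below uses no continuity correction, which is an assumption because the text does not say whether one was applied; the allele counts are hypothetical, and the Bonferroni level of 0.05 is likewise assumed.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson Chi-square test (no continuity correction) for [[a, b], [c, d]].

    Rows: resistant tail, susceptible tail. Columns: allele 1, allele 2.
    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column must have a non-zero total")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Bonferroni threshold for 34,317 SNPs, assuming a family-wise level of 0.05.
threshold = 0.05 / 34_317

# Hypothetical SNP 1: alleles partly shared between the two tails of 20 lines.
stat1, p1 = chi_square_2x2(18, 2, 3, 17)
# Hypothetical SNP 2: alleles completely separated between the two tails.
stat2, p2 = chi_square_2x2(20, 0, 0, 20)

for name, p in (("SNP 1", p1), ("SNP 2", p2)):
    print(name, f"P = {p:.2e}", "significant" if p < threshold else "not significant")
```

With only 20 lines per tail, a SNP has to separate the two tails almost perfectly to pass a Bonferroni threshold of this size, which is why only strongly associated regions are expected to be detected.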

Development of KASP markers
KASP markers were used to fine-map qRtsc8-1. Sequences of KASP markers whose names start with PZA or PHM were obtained from the maize KASP assays developed for CIMMYT's Global Maize Program and the Generation Challenge Programme (https://www.biosearchtech.com/products/pcr-kits-andreagents/genotyping-assays/kasp-genotyping-chemistry/kasp-snp-libraries/maize-genotyping-library). Polymorphic SNPs between CML495 and La Posta Sequia C7 F64-2-6-2-2-B-B-B were selected based on the GBS dataset and used to develop new KASP markers. A total of 19 KASP markers within the physical interval identified by GWAS and selective genotyping were developed and given names starting with "KASP" (Table S1).

Fine mapping strategy of qRtsc8-1
A recombinant-derived progeny testing strategy was used for fine mapping of qRtsc8-1 (Fig. 1). Recombinants identified from all mapping populations (F1, BC2, and BC4) were backcrossed to the corresponding susceptible DH lines to produce backcross progenies (BC1, BC3, and BC5). Individuals derived from each recombinant-derived backcross progeny were planted, evaluated for resistance to TSC, and genotyped with appropriate markers using KASP assays (LGC Genomics). They were classified into two genotypic classes in the qRtsc8-1 region: homozygous La Posta Sequia C7 F64-2-6-2-2-B-B-B and heterozygous La Posta Sequia C7 F64-2-6-2-2-B-B-B/CML495. A two-way ANOVA was used to compare disease scores between the two genotypic classes. A significant difference (P < 0.05) between the TSC resistance scores of the two genotypic classes indicated that the resistance QTL was present in the heterozygous region. Otherwise, it indicated that the resistance QTL was absent from the heterozygous region.
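The inference rule of the progeny test (a significant difference places the QTL inside the heterozygous donor segment, a non-significant one places it outside) can be sketched as interval arithmetic. This is a simplified illustration rather than the procedure used in the study: the `Recombinant` class, the `narrow_interval` function, and all coordinates are hypothetical, and donor segments are treated as continuous intervals rather than being bounded by discrete markers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recombinant:
    name: str
    donor_start: int    # bp, start of the heterozygous donor segment
    donor_end: int      # bp, end of the heterozygous donor segment
    significant: bool   # True if the progeny test gave P < 0.05


def narrow_interval(lo, hi, recombinants):
    """Shrink the candidate interval (lo, hi) using progeny-test outcomes."""
    # Pass 1: a significant test means the QTL lies inside the donor segment.
    for r in recombinants:
        if r.significant:
            lo, hi = max(lo, r.donor_start), min(hi, r.donor_end)
    if lo >= hi:
        raise ValueError("significant recombinants do not share a common segment")
    # Pass 2: a non-significant test means the QTL lies outside the donor segment.
    for r in recombinants:
        if r.significant:
            continue
        if r.donor_end <= lo or r.donor_start >= hi:
            continue  # no overlap with the current candidate interval
        if r.donor_start <= lo and r.donor_end >= hi:
            raise ValueError(f"{r.name} excludes the whole candidate interval")
        if r.donor_start <= lo:
            lo = r.donor_end    # trim the left side
        elif r.donor_end >= hi:
            hi = r.donor_start  # trim the right side
        # A segment strictly inside the interval cannot trim a single interval.
    return lo, hi


# Hypothetical recombinant types and coordinates (not from the study).
recombinants = [
    Recombinant("RA", 70_000_000, 85_000_000, True),
    Recombinant("RB", 78_000_000, 90_000_000, True),
    Recombinant("RC", 70_000_000, 80_000_000, False),
    Recombinant("RD", 83_000_000, 90_000_000, False),
]
print(narrow_interval(70_000_000, 90_000_000, recombinants))
```

Repeating this step in each backcross generation with newly identified recombinants and denser markers is what progressively shortens the candidate interval.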

Candidate gene analysis
The B73 sequence of the fine mapping interval was obtained through MaizeGDB (Portwood et al. 2019). Candidate genes were retrieved and annotated. Genetic variation analysis of candidate genes was conducted using resequencing data of the two parental lines (data not published). Sequences of each gene and 2 kb upstream of the transcription start sites were used to perform genetic variation analysis.

Haplotype analysis and verification of production markers in breeding lines
Ten genotyping assays were developed at LGC Genomics for haplotype analysis and verification of the production markers in 471 breeding lines. In total, 8 of the 10 genotyping assays passed the technical validation process in the breeding lines and were used as the production markers for further haplotype and verification analyses. To determine the functional markers for deploying MAS routinely in breeding populations, a stepwise regression of TSC resistance on marker genotype was carried out with the R MASS package. LD and haplotype analyses were conducted with Haploview 4.2 (Barrett et al. 2005). The standardized disequilibrium coefficient (D') was used to evaluate the LD between markers and generate the LD heatmap. Haplotype blocks were detected based on LD using the confidence intervals method in Haploview 4.2 (Gabriel et al. 2002).
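For readers unfamiliar with D', the sketch below computes it for two biallelic markers from one haplotype frequency and the two allele frequencies, using the standard definition D' = |D| / Dmax. It is not Haploview's implementation, which also estimates haplotype frequencies and confidence bounds, and the frequencies in the example are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Standardized disequilibrium coefficient |D'| for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2.
    p_a, p_b: frequencies of allele A and allele B.
    """
    for p in (p_a, p_b):
        if not 0 < p < 1:
            raise ValueError("both loci must be polymorphic (0 < p < 1)")
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max


# Hypothetical haplotype frequencies: AB = 0.40, Ab = 0.10, aB = 0.00, ab = 0.50.
# Only three of the four possible haplotypes are present.
print(d_prime(p_ab=0.40, p_a=0.50, p_b=0.40))

# Linkage equilibrium: haplotype frequency equals the product of allele frequencies.
print(d_prime(p_ab=0.20, p_a=0.50, p_b=0.40))
```

When one of the four possible haplotypes is absent, D' equals 1, which is the pattern that typically places two markers in the same haplotype block.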

Evaluation of the resistance to TSC
The descriptive statistics for resistance to TSC in the DTMA panel, CMLs panel, the DH population, and the breeding lines are presented in Table 1.

The GWAS results are shown in Fig. 2 and Table S2. GWAS revealed that 115 SNPs were significantly associated with TSC resistance. All SNPs were located in bin 8.03 but distributed in two genomic regions. The q-q plot showed that population structure was well controlled using the MLM (PCA + K) method in TASSEL 5.0.

QTL detected by selective genotyping for TSC resistance
The selective genotyping results are shown in Fig. 3 and Table S3. A total of 298 SNPs distributed on two chromosomes showed a significant correlation between genotype and TSC resistance. Two SNPs, S6_123687641 (P = 2.44 × 10⁻⁸) and S6_165635560 (P = 1.10 × 10⁻⁶), were located on chromosome 6 (bins 6.05 and 6.07).

Fine mapping of the major QTL qRtsc8-1
To narrow down the region of qRtsc8-1, two flanking markers, PZA00379_2 and PZA01972_14, were used to identify recombinants from 20 different crosses between resistant DH lines and susceptible DH lines. The recombinants were further genotyped with five KASP markers (PHM11114_7, PZA02683_1, PHM3978_104, PZA03135_1, and PHM4134_8) (Table S1) within the qRtsc8-1 region. Three types of recombinants (R1 to R3) were detected and backcrossed to the corresponding susceptible DH lines to generate BC1 progenies for fine mapping (Fig. 4). In the winter season from January to May 2014, all 362 BC1 progenies were individually scored for TSC resistance and genotyped with all seven KASP markers described above. A two-way ANOVA was performed for the progeny testing to compare TSC scores between the two genotypic classes: homozygous La Posta Sequia C7 F64-2-6-2-2-B-B-B and heterozygous La Posta Sequia C7 F64-2-6-2-2-B-B-B/CML495. Resistance to TSC was significantly different between the two genotypic classes in recombinant types R1 and R2, indicating that the CML495 donor region harbored the resistance QTL qRtsc8-1. There was no significant difference in resistance to TSC between the two genotypic classes in recombinant type R3, indicating that its CML495 donor region did not harbor qRtsc8-1. QTL analysis of the recombinants narrowed down qRtsc8-1 to the region between markers PZA00379_2 and PHM3978_104, with a physical distance of 33.80 Mb. qRtsc8-1 explained 5.35-9.93% of the total phenotypic variation based on the analysis of recombinants R1 and R2.
To further fine-map qRtsc8-1, additional recombination events were identified. A total of 512 plants from BC2 populations were planted in the summer season of 2014 and genotyped with two flanking markers, PZA00379_2 and PHM3978_104, to identify recombinants, which were further investigated with nine KASP markers (PHM11114_7, KASP76522592, KASP82826985, KASP84607800, KASP86029983, KASP90189319, PZA02683_1, KASP8262199, and KASP101176111) (Table S1). Ten new types of recombinants (R4 to R13) were identified and backcrossed to susceptible DH lines to produce 684 BC3 progenies (Fig. 4). In the winter season from January to May 2015, all the progenies were individually phenotyped and genotyped. The same progeny testing was conducted and narrowed down qRtsc8-1 to an interval of ~6.30 Mb between markers KASP76522592 and KASP82826985. Recombinant types R6 and R12 confirmed the left boundary of the fine mapping region, and recombinant type R7 confirmed the right boundary. The phenotypic variation explained (PVE) value of qRtsc8-1 ranged from 17.65 to 61.22% based on recombinants R7-R12.
Eight new types of recombinants (R14 to R21) were detected with seven KASP markers (KASP76522592, KASP79341449, KASP81160138, KASP81881276, KASP82493295, KASP83335716, and KASP84607800) (Table S1) in BC4 populations (3515 plants) and backcrossed to susceptible DH lines to produce BC5 progenies. In the winter season from January to May 2016, all 1817 BC5 progenies were evaluated for resistance to TSC and genotyped using the seven markers (Fig. 4). Recombinant types R15 and R20 were deduced to be susceptible by the progeny testing, indicating that qRtsc8-1 was downstream of KASP81160138 and upstream of KASP81881276. Finally, qRtsc8-1 was mapped to a physical distance of 721.14 kb between markers KASP81160138 and KASP81881276. The PVE value of qRtsc8-1 detected in recombinants R16 to R19 ranged from 12.57 to 25.36%.

Identification of candidate genes in the fine mapping interval of qRtsc8-1
Based on the annotation information of the maize B73 reference genome obtained from MaizeGDB, five genes, including three encoding putative uncharacterized proteins and two with predicted functions, were identified within the fine mapping interval of qRtsc8-1 (Table S4). GRMZM2G071228, GRMZM5G879762, and GRMZM5G869967 encode putative uncharacterized proteins whose functions are still unknown. GRMZM2G063511 encodes an integral membrane protein-like protein and harbors the most significant SNP, S8_81160155, detected by GWAS. GRMZM2G073884 encodes a leucine-rich repeat receptor-like protein kinase (LRR-RLK). Both GRMZM2G063511 and GRMZM2G073884 may play important roles in disease resistance.

Validation of the production markers in breeding lines
Before deploying MAS, genotyping assays flanking the fine mapping interval need to be validated in breeding lines. In total, 8 of the 10 genotyping assays passed the technical validation process in the 471 breeding lines and were used as the production markers for further haplotype and verification analyses (Table 2). Two genotyping assays, including KASP81881276, did not pass the technical validation process because of low SNP calling success rates in the breeding lines (data not shown).

Table 2 notes: a marker name such as KASP81160138 indicates that the marker is located at physical position 81,160,138 bp on chromosome 8; ** P < 0.01.

The stepwise regression results for the eight validated genotyping assays are shown in Table 2. KASP81160155, with a full model P-value of 0.0036, was identified as the most important production marker, explaining 23.07% of the phenotypic variance. This SNP had two alleles, "A" and "C", in the breeding lines. The favorable allele in the breeding lines was "A", with a MAF of 0.34. The average TSC scores of the breeding lines carrying the alleles "A" and "C" were 2.09 and 2.65, respectively. The favorable allele "A" improved TSC resistance by 15.02% compared to the average TSC score.
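One way to read the reported allele effect is as the relative reduction in TSC score (lower scores mean more resistance) of the lines carrying the favorable allele, compared with the overall mean. The sketch below reconstructs that figure from the values given in the text. It assumes that the overall mean can be approximated by the frequency-weighted mean of the two allele classes; this is an assumption, not a calculation described by the authors, and the function name is illustrative.

```python
def percent_improvement(class_mean, overall_mean):
    """Relative reduction in TSC score versus the overall mean, in percent."""
    return (overall_mean - class_mean) / overall_mean * 100


# Values reported in the text for KASP81160155.
freq_a = 0.34   # frequency of the favorable allele "A"
mean_a = 2.09   # mean TSC score of lines carrying "A"
mean_c = 2.65   # mean TSC score of lines carrying "C"

# Assumption: overall mean = frequency-weighted mean of the two allele classes.
overall_mean = freq_a * mean_a + (1 - freq_a) * mean_c

print(round(overall_mean, 2))                               # 2.46
print(round(percent_improvement(mean_a, overall_mean), 1))  # 15.0
```

Under this assumption the result is about 15.0%, close to the 15.02% reported; the small difference is expected because the inputs are rounded to two decimals.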
The LD analysis of the eight markers in the 471 breeding lines revealed that KASP81160138 was located in the same haplotype block as KASP81160155 (Fig. 5). The favorable allele "T" of KASP81160138, a minor allele with a MAF of 0.39, improved TSC resistance by 12.82% compared to the average TSC score of all the breeding lines. Both KASP81160138 and KASP81160155 were located in GRMZM2G063511. Three haplotypes involving the two markers, "TA", "TC", and "CC", were identified in the breeding lines (Table 3). Haplotype H1 ("TA") had a frequency of 0.36 and showed the largest effect on improving TSC resistance. It improved TSC resistance by 13.95% compared to the average TSC score across all the breeding lines. The remaining two haplotypes, H2 ("TC") and H3 ("CC"), reduced TSC resistance by 7.93% and 8.90%, respectively, compared to the average TSC score across all the breeding lines. The frequency of haplotype H2 was only 0.02. These two markers, KASP81160138 and KASP81160155, were identified as the production markers in breeding lines and can be used for routine deployment of MAS to enrich the favorable alleles of qRtsc8-1 in tropical maize breeding populations.

resolution. Two minor QTL in bins 6.05 and 6.07 were also detected by selective genotyping. These results revealed that resistance to TSC in maize is controlled by a major QTL on chromosome 8 coupled with several minor QTL. The major QTL qRtsc8-1 was previously detected by Cao et al. (2017) and Mahuku et al. (2016), which shows that qRtsc8-1 is stable across different genetic backgrounds and environments. Moreover, this major QTL was further verified with a fine mapping strategy and validated using production markers in breeding lines. The fine mapping results and the production markers developed in the present study will facilitate MAS for TSC improvement.
High-density markers in the fine mapping region are essential for narrowing down the QTL region (Ren et al. 2017). Simple sequence repeats (SSRs), cleaved amplified polymorphic sequences (CAPSs), and insertion-deletions (InDels) have been widely used in QTL fine mapping in maize. However, they are laborious and time consuming. KASP is a homogeneous, fluorescence-based, single-step SNP genotyping assay characterized by high throughput, high accuracy, low cost, and breeder friendliness.

Posta Sequia C7 F64-2-6-2-2-B-B-B. The frameshift variation leads to different encoded amino acids after amino acid 172. Sanger sequencing is required to verify the variation. More studies are required to clone the gene responsible for TSC resistance in this fine mapping interval and to understand the molecular mechanisms underlying TSC resistance.
The development of markers tightly linked to qRtsc8-1 is essential for deploying MAS for improving TSC resistance. Two production markers, KASP81160138 and KASP81160155, were verified in breeding lines.
The favorable allele of both markers was a minor allele, indicating that increasing the frequency of the favorable allele in breeding programs is valuable. The two production markers can be used to enrich the favorable allele of qRtsc8-1 in early generations (F2, F3, BC1, BC2), which allows breeders to focus on fewer lines in subsequent generations. Since CIMMYT has already developed and routinely employs a large number of markers for MAS against diseases and related traits, such as MSV, maize lethal necrosis, and maize aflatoxin, the two markers reported here can be used together with those markers to develop maize germplasm with multiple disease resistance.
Genomic selection, an extension of MAS, has been reported to improve TSC resistance effectively by Cao et al. (2017, 2021), where moderate to high prediction accuracies were obtained in different populations.
Genomic prediction analysis with significantly associated markers has the potential to improve prediction accuracy. Incorporating the production markers KASP81160138 and KASP81160155 into genomic prediction has the potential to improve TSC resistance in breeding programs.
Accurate phenotypic evaluation with reliable artificial inoculation methods is crucial for genetic dissection of TSC resistance and development of TSC resistant varieties. However, artificial inoculation methods are still not available. First, the main pathogen, P. maydis, is biotrophic and therefore cannot be cultured for mass production in the laboratory. Further, the complexity of the interaction between the three TSC causal pathogens is not well understood. In the present study, all the results are based on phenotypic data obtained from multiple environment trials under natural screening conditions.
Reliable artificial inoculation methods need to be explored in further research.

Author contribution statement XZ and JC initiated and designed the overall study. GH, AEA, TD, and FSV performed and coordinated the field experiments and phenotypic data collection. XZ, JC, GH, MO, and BMP contributed to the genotypic data generation. JR, PW, AZ, JQ, YL, and HZ carried out the data analysis. JR, PW, JC, and XZ interpreted the results and wrote the manuscript. All authors contributed to manuscript editing.

Declarations
Conflict of interest We declare that we do not have any commercial or associative interest that represents a conflict of interest in connection with the work submitted.
Ethical approval The experiments comply with the current laws of the countries in which they were performed.

Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request. Supplementary data associated with this article can be found in the online version.