A Wheat 660 K SNP array‑based high‑density genetic map facilitates QTL mapping of flag leaf‑related traits in wheat

Key message A high-density genetic map containing 122,620 SNP markers was constructed, which facilitated the identification of eight major flag leaf-related QTL in relatively narrow intervals. Abstract The flag leaf plays an important role in photosynthetic capacity and yield potential in wheat. In this study, we used a recombinant inbred line population containing 188 lines derived from a cross between ‘Lankao86’ (LK86) and ‘Ermangmai’ to construct a genetic map using the Wheat 660 K single-nucleotide polymorphism (SNP) array. The high-density genetic map contains 122,620 SNP markers spanning 5185.06 cM. It shows good collinearity with the physical map of Chinese Spring and anchors multiple sequences of previously unplaced scaffolds onto chromosomes. Based on the high-density genetic map, we identified seven, twelve, and eight quantitative trait loci (QTL) for flag leaf length (FLL), width (FLW), and area (FLA) across eight environments, respectively. Among them, three, one, and four QTL for FLL, FLW, and FLA are major and stably express in more than four environments. The physical distance between the flanking markers for QFll. igdb-3B / QFlw.igdb-3B / QFla.igdb-3B is only 444 kb containing eight high confidence genes. These results suggested that we could directly map the candidate genes in a relatively small region by the high-density genetic map constructed with the Wheat 660 K array. Furthermore, the identification of environmentally stable QTL for flag leaf morphology laid a foundation for the following gene cloning and flag leaf morphology improvement.


Introduction
Common wheat (Triticum aestivum L.) is an important staple food crop, which provides calories and proteins for nearly 35% of the human population in the world (Cao et al. 2020). Due to the rapid growth rate of world population and limited arable land, wheat breeders have to keep genetic gains at a rate of 2.4% per year to keep up with the global demands (Molero et al. 2019). To ensure this genetic gain, molecular breeding, such as marker-assisted selection (MAS), was developed to improve breeding efficiency (Kumar et al. 2011;Soto-Cerda et al. 2015). The construction of genetic maps and identification of QTLs are critical for MAS because the practice of MAS mainly depends on the quantitative trait loci (QTL) and genes (Liu et al. 2018a).
The construction of wheat genetic map is much easier now than before because a number of high-quality reference genomes were released recently including Chinese Spring (CS), Zang1817, and newly released 10+ reference genomes and they provide robust anchor information for marker grouping and ordering (Appels et al. 2018;Guo et al. 2020 Walkowiak et al. 2020;Zhu et al. 2021;Athiyannan et al. 2022;). However, the physical maps from these reference genomes could not totally replace genetic maps as they are insufficient for representing all variations of wheat which contains large scale of structural variations and introgressions (Cheng et al. 2019;He et al. 2019). In addition, genetic maps can validate physical maps, improve de novo genome assemblies, provide genetic distances, provide insights for recombination, chromosome arrangements, and translocations, and include the markers that cannot be localized on physical maps (Yadav et al. 2019;Zheng et al. 2019). Most importantly, genetic maps can facilitate QTL mapping in plant and animal breeding programs (Paterson et al. 1988).
Obviously, a higher-density genetic map can provide more information about recombination among the populations and delimit the QTL in a smaller interval (Cui et al. 2017;Liu et al. 2018a). Single-nucleotide polymorphism (SNP) is the most abundant type of molecular markers in wheat, and several wheat SNP arrays have been developed, for example Wheat 9 K, 55 K, 90 K, 660 K, and 820 K arrays (Cavanagh et al. 2013;Wang et al. 2014;Winfield et al. 2016;Wen et al. 2017;Ren et al. 2018;Sun et al. 2020). Due to their relative low cost, Wheat 55 K and 90 K arrays are widely used to construct genetic maps. For example, genetic maps containing about 8 ~ 12 K polymorphic markers spanning more than 3,000 cM across 21 wheat chromosomes were constructed using Wheat 55 K SNP array (Liu et al. 2018a;Ren et al. 2018;Huang et al. 2019). A total of 46,977 SNPs from Wheat 90 K array were genetically mapped by the combination of eight mapping populations (Wang et al. 2014). Wheat 660 K array has the advantages of reliable physical positions, high-density and cost-effective, which can be a substitute for other arrays, and has shown vast of possible applications (Sun et al. 2020). Cui et al. (2017) constructed a high-density genetic map containing 119,001 markers spanning 4424.4 cM using Wheat 660 K array and mapped a major stable QTL into a small interval (0.7 cM, 3.23 Mb). Although the Wheat 660 K array showed great potential for targeted genotyping and QTL mapping, it was used in limited QTL studies (Sun et al. 2020). In wheat, the flag leaf is the primary photosynthesis organ, which contributes about 50% photosynthetic activity and 41-43% of the carbohydrates for wheat grain filling (Duncan 1971;Xu et al. 1995). The size, posture, and angle are closely related to some agronomic traits. For instance, the size of flag leaf was proven to be positively related to thousand kernel weight, kernel number per spike, and grain yield per plant (Fan et al. 2015;Wu et al. 2015). As a key factor for photosynthetic efficiency in the later growth stage of wheat, the flag leaf morphology including size and posture plays an important role in providing carbohydrates for grain (Zhao et al. 2018;Tu et al. 2021). Optimizing flag size and posture of crops is expected to improve planting density and photosynthetic efficiency for higher yield. The flag leaf morphology, such as flag leaf length (FLL), flag leaf area (FLA), and flag leaf width (FLW), is a complex quantitative trait, which is controlled by multiple genes and significantly influenced by environment (Simón 1999;Coleman et al. 2001). As its importance in determining wheat yield, enormous efforts have been made to explore the genetic base in wheat. In wheat, TaWOX5 not only can improve plant genetic transformation efficiency but also can increase flag leaf width (Wang et al. 2022). TaSPL8 modulates leaf angle via auxin and brassinosteroid signaling . Additionally, many QTL related to FLL, FLW, and FLA were identified by biparental QTL mapping using the markers of simple sequence repeat (SSR), amplified fragment length polymorphism (AFLP), or/and diversity array technology (DArT), which were located on all 21 chromosomes of wheat (Xue et al. 2013;Fan et al. 2015;Wu et al. 2015;Liu et al. 2018b;Zhao et al. 2018;Farokhzadeh et al. 2019;Jin et al. 2020;Yan et al. 2020). Using the Wheat 55 K or 90 K array, a number of other QTL were identified (Hu et al. 2020;Tu et al. 2021;Ma et al. 2020). Furthermore, six and two peaks were found to be significantly associated with FLL and FLW by GWAS using Wheat 660 K array (Chen et al. 2021). However, only a few of them were major and stably expressed. For example, four major stable QTL (QFll. sau-SY-5B/QFll.sicau-5B.3, QFll.sau-SY-2B, QFlw.sau-SY-2D/QFlw.sicau-2D, and QFlw.sau-SY-2B) associated with FLL and FLW were identified in multiple environments and validated in different genetic backgrounds using kompetitive allele-specific PCR (KASP; Ma et al. 2020;Tu et al. 2021). In addition, there were twelve QTL (QFll.sicau-2D, QFll. sicau-4D, QFll.sicau-5B, QFLL-6A,  identified in more than four environments (Fan et al. 2015;Wu et al. 2015;Liu et al. 2017Liu et al. , 2018bHu et al. 2020). However, most of them were delimited into relatively large intervals, and none of them was cloned. Identification of QTL/genes controlling these traits will provide gene resource for wheat improvement in the future.
A high-density genetic map can greatly promote the identification of QTL underlying important agronomic traits and delimit the QTL in relatively small intervals. The aims of this study were (1) to construct a high-density genetic map and analyze the relationship between genetic map and physical map; (2) to comprehensively evaluate the performance of the flag leaf and identify related QTL underlying FLL, FLW, and FLA. These results should provide valuable information for wheat genetic study and plant architecture improvement.

Plant materials and field trials
The QTL mapping population used in this study was constructed using a cross between 'Lankao86' (LK86) and 'Ermangmai' (EMM). It was advanced to F 8 generation using single seed descent and contained 188 recombinant inbred lines (RILs). Parent LK86 is an advanced wheat breeding line with long and wide flag leaf, while EMM is a landrace with slender and short leaf. The RIL panel together with two parents was planted in Beijing (39°57′ N, 116°17′ E) and Zhaoxian in Hebei Province (38°05′ N, 114°52′ E) in 2016-2021. The planting environments are listed in Supplementary Table 1. For convenience, we used "E" and numbers to represent different environments. E1, E2, E3, E5, and E8 were normal conditions with agronomic practices regarding fertilizer and pest control in accordance with local practices. Compared to the normal environment, E4 were treated with 70% phosphatic fertilizer, while E6 and E7 were treated with 70% and 55% nitrogenous fertilizer, respectively. All the field trials were conducted in a randomized complete block design with two or three replications. In each environment, seeds were sown in two-row blocks with 1 m long row spaced 25 cm apart with a sowing rate of 11 seeds per row.

Phenotype evaluation and statistical analysis
The phenotype assessments of the RIL population were conducted across eight environments. After anthesis, five plants in the middle of each row were selected to measure FLL and FLW. FLL was the distance from leaf bottom to leaf tip. FLW was length of the widest part of the flag leaf. FLA was calculated by FLL*FLW*0.75 (Spagnoletti et al. 1990). Basic statistical analyses including means, skewness, kurtosis, and Shapiro-Wilk tests were conducted in R package of "pastecs" (https:// github. com/ phgro sjean/ paste cs). Analysis of variance (ANOVA) was performed using "aov" function in R following the formula: Y ~ Line + Env + Env:Line + (Rep% in %Env), where Y, Line, Env, and Rep are the phenotypic value, genotype, environment, and replicate, respectively. The best linear unbiased prediction (BLUP) values was calculated in R package of "lme4." The broad-sense heritability ( H 2 b ) for FLL, FLW, and FLA were calculated following the formula H b 2 = 2 g ∕ 2 g + 2 ge ∕n + 2 ∕nr , where 2 g represents the genotypic effect, 2 ge stands for the genotype by environmental effect, σ 2 is the residual error, n represents the number of environments, and r is the number of replicates (Holland et al. 2003).

Genotyping and genetic map construction
The young leaves of 188 RILs and parents were harvested and grinded into powder. High-quality DNA was extracted using CTAB (hexadecyl trimethyl ammonium bromide) method (Allen et al. 2006). Genotype analysis was conducted using the Wheat600 K SNP array in the company of CapitalBio Technology (Beijing). A SNP was rejected if it did not pass any of the following criterions: 1) Call rate ≥ 96%; 2) Ploy high resolution; 3) Minor allele frequency (MAF) ≥ 0.25; 4) Heterozygous rate ≤ 5.32%; 5) Polymorphic between LK86 and EMM; 6) Unique mapped against the reference genome of CS (IWGSC RefSeq v2.0). After stringent filtering, 147,883 high-quality SNPs were retained to construct the genetic map. All these markers were analyzed using the "BIN" function in IciMapping 4.2 (Meng et al. 2015). Only one marker with the lowest missing rate from each bin was selected as the bin marker to construct the genetic map. Further, the bin markers were grouped and ordered using the "MAP" function in IciMapping. Then, the redundant markers were integrated into the genetic map according to the bin information. R package "LinkageM-apView" (https:// github. com/ louel lette/ Linka geMap View) was used to visualize the genetic map.

Comparison of genetic position and physical position
The flanking sequences of each probe in the Wheat660 K SNP array were blasted against the reference genome of CS (IWGSC RefSeq v1.0, v2.0 and v2.1) to identify physical positions of probes. SNPs with all 70 bp (excluding the alternative SNP in the middle of probes) best mapped to the reference genome were regarded as unique mapped SNPs (E value ≤ 10 -5 ). Comparative analyses of each SNP were then conducted between genetic positions and physical positions. Genes in the QTL interval were obtained on the WheatOmics website (Ma et al. 2021).

QTL analysis
QTL analysis was performed using inclusive composite interval mapping (ICIM) function in IciMapping v4.2. The running parameters were set as probability in stepwise regression (PIN) = 0.001, logarithm of odds (LOD) ≥ 3.5 with 1 cM step. QTL identified in more than two environments were reported. Further, QTL detected in more than four environments were regarded as environmentally stable QTL. QTLs related to the same trait in different environments with overlapped interval were considered as identical QTL. Based on the rules of genetic nomenclature, QTL were named (McIntosh et al. 2017). IGDB represents "Institute of Genetic and Developmental Biology."

Construction of a high-density wheat genetic map
As our previous report, an F 8 RIL population including 188 lines derived from a cross between LK86 and EMM was constructed (Niu et al. 2020). In this study, the RIL panel and the two parental lines were genotyped by the Wheat 660 K SNP array. There were 97.51% (614,842/630517) of SNPs showed high call rate (CR ≥ 96%), and 39.62% (249,807/630517) of SNPs were Ploy High Resolution (PHR), which suggested good quality of our genotype data ( Supplementary Fig. 1). The PHR SNPs with CR ≥ 96% were used for further filter steps. The SNPs with MAF ≥ 0.25, heterozygous rate ≤ 5.32%, polymorphic between the two parents and unique mapped were retained. Finally, we obtained 147,883 robust SNPs for linkage analysis and genetic map construction. All the SNPs were grouped into 4,923 bins using the "BIN" function in the IciMapping software. Only one SNP with the least missing rate in each bin was selected to construct the chromosome frame.
Finally, 4,887 bin markers were mapped on the genetic map by linkage analysis. About 55.18% of bins contained only one SNP, and other bins included an average of 54.76 SNPs in per bin. The largest bin contained 6,285 SNPs that were across the centromere of the chromosome 3B (114-416 Mb), which potentially suggested low recombination rate in this region. According to the bin information, the redundant SNP markers were integrated into the genetic map, which resulted in a genetic map containing 122,620 SNP markers spanning 5,185.06 cM ( Fig. 1 and Supplementary Table 2).

Overview of the genetic map
The genetic map contained 22 linkage groups that were corresponded to 21 wheat chromosomes with chr6B split into two parts. Due to the presence of 1BL/1RS translocation in LK86, the SNPs on chromosome 1RS or 1BS showed co-segregation and distorted segregation were discarded, resulting in only 1BL remaining in the genetic map. For mapped SNP markers (including bin and redundant markers), there were 52.02% and 40.45% on A and B sub-genomes, while only 7.53% on the D sub-genome (Table 1 and Supplementary Fig. 2a). Chromosome 7A had the most mapped markers (15,437), and 6D had the least markers (392), with an average of 5839.05 markers for per chromosome. A, B, and D sub-genomes contained 41.03%, 37.18%, and 21.79% of bin markers, respectively (Table 1 and Supplementary Fig. 2b). The number of bin marker ranged from 77 (6D) to 403 (2B), with a mean number of 231.71 bin markers for each chromosome. The distribution characteristic of bin markers was less in the middle and more in the end of chromosomes, suggesting high probabilities of recombination rates in the end of chromosomes ( Supplementary Fig. 3). The average genetic distance between adjacent bin markers ranged from 0.87 cM for 4B-1.81 cM for 6B-1, with an average distance of 1.06 cM. The chromosome 2B with 362.34 cM showed the longest genetic distance, and 1B showed the shortest genetic distance (Table 1). About 98.05% of genetic gaps were less than 5 cM in this study, and only two gaps showed more than 20 cM (Supplementary Fig. 4

The genetic and physical map showed good collinearity
Theoretically, the genetic map should show good collinearity with the physical map. However, some factors, such as poor genome assembly, translocations, and inversions, can disorder the collinearity between genetic and physical map. To perform comparative analysis of genetic and physical map, we aligned the flank sequences of probes in Wheat 660 K array to the three CS reference genomes (IWGSC RefSeq v1.0, RefSeq v2.0, and RefSeq v2.1) to obtain its physical position. Based on the IWGSC RefSeq v1.0, we found that 87.93% of the bin markers showed consistent physical and genetic positions, and 6.81% were grouped on their homologous chromosomes. Only 5.26% were mapped to ungrouped or non-homologous chromosomes (Supplementary Table 3  and Supplementary Table 4). These results suggest that the genetic position of bin markers was in good agreement with that in the physical map (Fig. 2).
In addition, a genetic map could also improve the quality of genome assembly including accuracy, continuity, and completeness. Among 122,620 mapped markers in the genetic map, 1,028 SNP markers were located on the unknown chromosome based on the reference of CS (IWGSC RefSeq v1.0; Supplementary Table 5; Appels et al. 2018). These SNPs were dispersed in all 22 linkage groups in the genetic map (Supplementary Table 6). About 65.56% (674/1,028) of them were assembled onto the chromosomes in the newly released reference genome (IWGSC RefSeq v2.1), and 91.25% (615/674) were mapped on the same chromosome between genetic and physical map. For example, 109 markers spanning ~ 673 kb on the unknown chromosome of IWGSC RefSeq v1.0 were mapped onto the linkage group 2A (Fig. 3 and Supplementary Table 7). Simultaneously, these markers were assembled on the chromosome 2A of IWGSC RefSeq v2.1 and showed good collinearity with our genetic map. Furthermore, a region (774.81-778.63 Mb) containing these markers in IWGSC RefSeq v1.0 showed an inversion with IWGSC RefSeq v2.1 and genetic map ( Fig. 3 and Supplementary Table 7). In addition, we also anchored markers that are on the unknown chromosome in both IWGSC RefSeq v1.0 and v2.1 onto the genetic map. For example, a region containing 44 SNP markers located on the unknown chromosome in the first and second versions of reference (IWGSC RefSeq v1.0 and v2.1) was mapped onto the chromosome 2D ( Fig. 4 and Supplementary Table 8). The known broad-spectrum leaf-rust resistance gene, Lr22a, was located in this region, and our genetic map showed good collinearity with the published data (Thind et al. 2017). All of these results verified the robustness and accuracy of our genetic map. Fig. 1 The genetic density map for the RIL population derived from a cross between LK86 and EMM 51 Page 6 of 12

Phenotypic variation of leaf morphology
The parental line LK86 exhibited longer, larger, and wider leaf morphology compared with EMM (Table S9; two-tailed t-test, P < 0.01). FLL, FLW, and FLA showed transgressive segregation in bidirections beyond the parental lines. The skewness, kurtosis, and Shapiro-Wilk test analyses showed that the phenotype distribution of FLL, FLW, and FLA appeared approximately a normal distribution (Supplementary Table 9 and Supplementary Fig. 5). ANOVA analysis indicated significant differences among RIL genotypes, environments, and their interactions (P < 0.001; Supplementary  Table 10). Furthermore, the broad-sense heritability of the three traits was more than 0.93, suggesting the three traits were mainly determined by the genetic factors (Supplementary Table 10). However, significant effects on both genetic and environment were also detected for the three traits in the population. The correlation coefficient between FLL and FLW was 0.53 (data not shown), which implied that the genetic factors related to the two traits probably partial overlapped.

The characteristics of the high-density genetic map
To the best of our knowledge, we constructed a genetic map with the highest density in common wheat in this study. It contains 122,620 SNP markers spanning 5185.06 cM, which is slightly longer (4424.4 cM) and contains a little more markers (119,566) than previously reported by Cui et al. (2017). The slightly longer map length in this study may be due to high-density polymorphic markers between LK86 and EMM, which covers more recombination and extends the genetic map. Using high-density polymorphic markers, we grouped the markers into 22 linkage groups with relatively small gaps, which are nearly corresponding to the 21 wheat chromosomes. However, the chromosome 6B still split into two linkage groups. This phenomenon is likely due to low coverage of polymorphic markers in 6B, which is consistent with the results that only 615 markers in this linkage group. The second reason is probably due to low recombination rate in the region. Another reason is probably due to high sequence identity between 6A, 6B and 6D, which is consistent with the results that the markers on these three chromosomes tend to "disordered." In addition, we also found that most of the genetic gaps longer than 10 cM are located on chromosome 6A, 6D and 7D, which probably accounts to rare polymorphic markers in the gap regions on the chromosomes.

QTL Comparison and candidate gene prediction
In this study, we identified three, one, and four environmentally stable QTL for FLL, FLW, and FLA, respectively. QFll.igdb-2B/QFla.igdb-2B for FLL and FLA was mapped to a region flanking by markers AX-111777248 and AX-110912708 and physically located between 634.4 Mb and 649.8 Mb spanning 15.4 Mb on chromosome 2BL (IWGSC RefSeq v1.0). It was colocated with previously reported QFll.sau-SY-2B that located in a larger interval 29.9 Mb) compared with QFll.igdb-2B/QFla.igdb-2B in this study (Tu et al. 2021). This result validated the accuracy of our results and suggested that we could narrow down QTL in relatively small regions using a high-density genetic map. There are 129 high-confidence genes in the intervals of QFll.igdb-2B/QFla.igdb-2B (Supplementary Table 12). Considering relatively numerous genes in this regions, prediction of candidate genes is aimless and more studies are needed to fine map the QTL.
QFll.igdb-3B/QFlw.igdb-3B/QFla.igdb-3B for FLL, FLW, and FLA was mapped in a 444 kb interval (579.62-580.06 Mb) flanking by markers AX-110954005 and AX-108787461 on chr3B. Comparison of physical position of reported QTL for FLL showed that QFll.igdb-3B may be a new locus. There were only eight high-confidence genes in this region (Supplementary Table 12). Considering this QTL explained relatively small phenotypic variance (24.43% in E8 and less than 6.5% in other environments), we thought it might be a minor QTL and had various effect in different environments. Based on this, we predicted the candidate genes majorly based on the functional annotation and expression pattern of these genes. Among them, TraesCS3B02G368000 encodes an F-box family protein and expresses constitutively ( Supplementary Fig. 6). Previous studies showed that some members of this family are involved in regulation of leaf size (Baute et al. 2017). Therefore, this gene may be a candidate gene underlying QFll. igdb-3B/QFlw.igdb-3B/QFla.igdb-3B.
QFlw.igdb-7D.2 for FLW flanking by markers AX-109909429 and AX-111559194 was on the chromosome 7D. The physical position of AX-111559194 was on the chromosome 7D (501.3 Mb), while AX-109909429 was on the chromosome 6D (467.5 Mb). Therefore, it is difficult to evaluate the physical size of QFlw.igdb-7D.2. It is far away from QFlw.cau-7D (364-451.0 Mb) (Wu et al. 2015), which suggests that it was a novel QTL for FLW. The same situation was also found for QFla.igdb-5B/QFll.igdn-5B, and it is difficult to evaluate the size of physical interval.

A high-density genetic map can narrow the intervals for QTL mapping
As we all know, population size, recombination rate, and marker density are major factors influencing intervals size for QTL mapping. In this study, we mapped QFll.igdb-3B/QFlw.igdb-3B/QFla.igdb-3B in a 444 kb region containing only eight annotated high-confidence genes, which suggested that we could directly map QTL in relatively small intervals using saturated genetic map. For QFll.igdb-2B/QFla.igdb-2B, we mapped them into a 15.4 Mb interval , which significantly smaller than that of previously reported (634.4-663.3 Mb) (Tu et al. 2021). In summary, we developed a high-density genetic map with Wheat 660 K SNP array for a RIL population in wheat, and applied it for QTL analysis of two flag leaf traits and identified one known and three novel QTL. One QTL contained only eight candidate genes. All these results showed the necessities and advantages of high-density genetic map in wheat study.