Phenotype analysis
A description of the GPC phenotype in the RIL and natural populations is shown in Table 1. In the RIL population, the parent Ningmai 9 showed a lower GPC than Yangmai 158 in all environments, which was consistent with their production performance. The max–min range and coefficient of variation (CV) were above 5%, except in E1. GPC in the natural population also varied widely, with CV ranging from 6.28% to 10.40% across environments. GPC showed significant positive correlations among almost all environments in both the RIL and natural populations (Table 2), and heritability reached 0.56 and 0.74, respectively (Table 3). Both genotype and environment had significant effects on GPC in the RIL and natural populations; their interaction also had a significant effect on GPC in the natural population, but not in the RIL population (Table 3).
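The variation statistics above follow standard definitions: the CV is the sample standard deviation divided by the mean, expressed as a percentage, and max–min is the difference between the largest and smallest values. The Python sketch below illustrates both; the function name and the GPC values are hypothetical and are not taken from Table 1.

```python
import statistics

def coefficient_of_variation(values):
    """Return the CV in percent: sample standard deviation / mean * 100."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 denominator)
    return sd / mean * 100

# Hypothetical GPC values (%) for a few lines in one environment.
# Illustrative only; these are NOT data from Table 1.
gpc_example = [12.1, 13.4, 12.8, 14.0, 13.1, 11.9]

cv = coefficient_of_variation(gpc_example)
gpc_range = max(gpc_example) - min(gpc_example)  # the max-min statistic
print(f"CV = {cv:.2f}%, max-min = {gpc_range:.2f}")
```

Whether the authors used the sample or the population standard deviation is not stated; the sample form is assumed here.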
QTL mapping
Using QTL mapping, we identified 17 QTL on 10 chromosomes: 1A (1), 1B (1), 2B (1), 2D (1), 3A (2), 3B (3), 4A (1), 5A (2), 5B (3), and 7A (2). These QTL explained 3.94%–9.21% of the phenotypic variance (Table 4 and Fig. 1). The additive effects of eight QTL (Qgpc-3B.1, Qgpc-3B.3, Qgpc-4A, Qgpc-5A.1, Qgpc-5A.2, Qgpc-5B.3, Qgpc-7A.1, and Qgpc-7A.2) were derived from Ningmai 9, and those of the other nine QTL from Yangmai 158. Two QTL, Qgpc-2D and Qgpc-3B.2, were identified in two environments, whereas the others were each identified in only a single environment.
Association mapping
The natural population was genotyped using the Affymetrix 50 K assay. After quality control, 36360 SNPs were retained for further analysis. The A, B, and D genomes carried 11415, 13133, and 11202 SNPs, respectively, and the positions of the other 610 SNPs were uncertain. The number of SNPs per chromosome ranged from 875 (chromosome 4D) to 2698 (chromosome 3B) (Supplemental Table S3). According to the population structure analysis, the population was divided into three distinct subgroups (ΔK = 3) (Fig. 2A), and a similar result was obtained from the kinship matrix analysis (Fig. 2C). Approximately one fifth of the materials were assigned to different subgroups by the two programs because of the different statistical methods used (Supplemental Table S1).
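The genome-level SNP counts can be checked against the retained total with simple arithmetic. The short Python sketch below does this using only the numbers reported above; the variable names are illustrative.

```python
# SNP counts per genome, as reported in the text.
snps_by_genome = {"A": 11415, "B": 13133, "D": 11202}
unplaced = 610          # SNPs whose position was uncertain
total_retained = 36360  # SNPs retained after quality control

# The placed and unplaced SNPs should add up to the retained total.
placed = sum(snps_by_genome.values())
assert placed + unplaced == total_retained

# Share of all retained SNPs on each genome, in percent.
shares = {g: round(n / total_retained * 100, 1) for g, n in snps_by_genome.items()}
print(placed, shares)
```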
Using association mapping, we obtained 17 chromosome intervals containing markers significantly associated with GPC, distributed on 14 chromosomes: 1B (1), 1D (1), 2A (1), 2B (1), 2D (1), 3A (2), 3B (1), 3D (2), 4A (1), 4B (1), 5A (2), 6B (1), 7A (1), and 7D (1) (Table 5 and Fig. 3). We then divided the intervals into three types for further analysis: Type A, P values below 10⁻⁴ in multiple environments; Type B, P values below 10⁻⁴ in a single environment; and Type C, P values below 10⁻³ in multiple environments.
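The three interval types can be expressed as a simple rule on per-environment P values. The Python sketch below is one possible reading of the classification, not the authors' implementation. It assumes that each interval is summarized by its most significant P value in each environment and that, when an interval could qualify as both Type B and Type C, Type B takes precedence. The function name is hypothetical.

```python
def classify_interval(p_values):
    """Classify an associated interval as "A", "B", or "C".

    p_values: the most significant P value of the interval in each environment.
    Returns None if the interval meets none of the three definitions.
    """
    strict = sum(p < 1e-4 for p in p_values)   # environments passing 10^-4
    relaxed = sum(p < 1e-3 for p in p_values)  # environments passing 10^-3
    if strict >= 2:
        return "A"  # P < 10^-4 in multiple environments
    if strict == 1:
        return "B"  # P < 10^-4 in a single environment (assumed to outrank C)
    if relaxed >= 2:
        return "C"  # P < 10^-3 in multiple environments
    return None

# Hypothetical P values, for illustration only.
print(classify_interval([5e-5, 2e-5, 0.2]))  # strict threshold met twice
print(classify_interval([5e-4, 8e-4]))       # only the relaxed threshold met
```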
Development and validation of KASP markers for candidate intervals
To make further use of the QTL and associated markers, we attempted to convert them into user-friendly KASP markers. First, nine intervals were selected according to the following three criteria (Table 6): (1) QTL detected in multiple environments; (2) Type A associated intervals; and (3) regions detected by both QTL mapping and association mapping. Second, loci with low homology within these intervals were chosen for marker development (Supplemental Table S2). Finally, KASP genotyping was performed on materials from the RIL and natural populations to test the developed markers.
Association mapping based on GLM was performed in a large breeding population (1163 F4 lines) with the nine successfully developed KASP markers (Fig. 4). We then compared the GPC of lines carrying one to nine GPC-increasing alleles with that of lines carrying the corresponding GPC-decreasing alleles, adding markers in order of P value from lowest to highest (Fig. 5). The difference between the two groups increased as the number of markers increased, and remained relatively stable when the three markers with the lowest P values were used in selection. Therefore, the three markers Kgpc-2B, Kgpc-2D, and Kgpc-4A, which had low P values (<10⁻¹⁰), were applicable for GPC selection, and their combination was more effective.
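The marker-stacking comparison described above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' code. It assumes that genotypes are coded 1 for the GPC-increasing allele and 0 for the GPC-decreasing allele, that lines with any other code (for example heterozygous or missing calls) are excluded, and that markers are supplied already sorted by P value. The function name, marker names, line names, and values are all hypothetical.

```python
from statistics import mean

def stacked_allele_differences(genotypes, gpc, ordered_markers):
    """For k = 1..n markers, return mean GPC of lines carrying the increasing
    allele at all of the first k markers minus mean GPC of lines carrying the
    decreasing allele at all of the first k markers.

    genotypes: dict mapping line -> dict mapping marker -> 1 or 0
    gpc: dict mapping line -> GPC value
    ordered_markers: marker names sorted from lowest to highest P value
    """
    diffs = []
    for k in range(1, len(ordered_markers) + 1):
        subset = ordered_markers[:k]
        high = [gpc[line] for line, g in genotypes.items()
                if all(g[m] == 1 for m in subset)]
        low = [gpc[line] for line, g in genotypes.items()
               if all(g[m] == 0 for m in subset)]
        # Record None when either group is empty and no comparison is possible.
        diffs.append(mean(high) - mean(low) if high and low else None)
    return diffs

# Hypothetical toy data: four lines, two markers.
toy_genotypes = {
    "line1": {"M1": 1, "M2": 1},
    "line2": {"M1": 1, "M2": 0},
    "line3": {"M1": 0, "M2": 1},
    "line4": {"M1": 0, "M2": 0},
}
toy_gpc = {"line1": 15.0, "line2": 14.0, "line3": 13.5, "line4": 12.5}

print(stacked_allele_differences(toy_genotypes, toy_gpc, ["M1", "M2"]))
```

In this toy example the difference grows when the second marker is added, which mirrors the pattern reported for the first few markers; real data would also need a significance test for each comparison.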
Application of significant KASP markers
Further, we used the three markers to test 164 F6 lines. The 15 lines carrying GPC-increasing alleles showed an average GPC of 14.85%, significantly higher than the 13.15% of the eight lines carrying GPC-decreasing alleles (Fig. 6 and Supplemental Table S4), indicating a good selection effect.
Exploration of candidate genes in the significant intervals
The published IWGSC reference sequence provided detailed information for the identification of candidate genes: 607, 42, and 235 high-confidence (HC) genes were identified in the intervals of Kgpc-2B, Kgpc-2D, and Kgpc-4A, respectively (Supplemental Table S5). We also searched for homologs of Gpc-B1, and one homologous gene, TraesCS4A01G242700, was found in the interval of Kgpc-4A. TraesCS4A01G242700 was then sequenced in the parents, Ningmai 9 and Yangmai 158 (Supplemental Table S6); however, no sequence differences were found. Expression data for TraesCS4A01G242700 extracted from WheatEXP indicated that the gene was expressed in the grain, leaf, spike, and roots, with the highest expression level in roots (Supplemental Table S7). In addition, 23 of the HC genes in the interval of Kgpc-2D were expressed in different tissues at the reproductive stage, and 17 that were expressed in the grain were analyzed (Supplemental Table S8).