Plant materials
An association panel of 171 wheat cultivars was used for SNP genotyping and two years of FHB resistance phenotyping. Three cultivars originated from Italy, Mexico and Japan, and the other 168 were collected from 8 provinces in the winter wheat region of Northern China and 9 provinces in Southern China (Table S1). The panel was planted at Wanfu Experimental Station, Institute of Agricultural Sciences of the Lixiahe, Yangzhou, Jiangsu Province, China (altitude 8 m, latitude 32.24°N, annual rainfall about 1000 mm, growing season from early November to the following May) in the 2016–2017 (17YZ) and 2017–2018 (18YZ) growing seasons. Field experiments followed a randomized complete block design with two replicates per environment. In each replicate, every cultivar was sown in two rows of 133×25 cm with 40 seeds per row. The field trials were managed following local practices.
Phenotyping
All cultivars were inoculated with four F. graminearum strains (F4, F15, F34, and F0609), kindly provided by Prof. Huaigu Chen of the Jiangsu Academy of Agricultural Sciences, Nanjing, China. Ten spikes per row were inoculated at the late-heading stage by injecting 10 µL of macroconidial suspension (1.0 × 10⁵ conidia/mL) into a single floret in the middle of each spike. After inoculation, the disease nursery was mist-irrigated for 5 min every 30 min from 7:00 am to 6:00 pm each day to maintain the high humidity favorable for FHB infection [27]. FHB severity was recorded 25 days after inoculation as the number of symptomatic spikelets per infected spike, and the percentage of symptomatic spikelets (PSS) was calculated as the measure of FHB severity; the mean value was used for further analysis. All tested accessions were assigned to one of four classes based on PSS: resistant (0 < PSS ≤ 25%), moderately resistant (25% < PSS ≤ 50%), moderately susceptible (50% < PSS ≤ 75%) and susceptible (75% < PSS ≤ 100%) [28].
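The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and it assumes that the total number of spikelets per spike was counted alongside the symptomatic ones so that a percentage can be formed. A PSS of exactly 0 falls outside the four published classes, so the sketch rejects it rather than guessing a class.

```python
from statistics import mean


def pss(symptomatic, total):
    """Percentage of symptomatic spikelets (PSS) for a single spike."""
    if total <= 0 or not 0 <= symptomatic <= total:
        raise ValueError("need total > 0 and 0 <= symptomatic <= total")
    return 100.0 * symptomatic / total


def mean_pss(spikes):
    """Mean PSS over (symptomatic, total) pairs, e.g. the spikes of one row."""
    return mean(pss(s, t) for s, t in spikes)


def classify_fhb(pss_percent):
    """Map a PSS value to the four classes stated in the text."""
    if not 0 < pss_percent <= 100:
        # The published classes cover 0 < PSS <= 100 only.
        raise ValueError("PSS must satisfy 0 < PSS <= 100")
    if pss_percent <= 25:
        return "resistant"
    if pss_percent <= 50:
        return "moderately resistant"
    if pss_percent <= 75:
        return "moderately susceptible"
    return "susceptible"
```

For example, a cultivar with spikes scored as (6, 20) and (10, 20) has a mean PSS of 40% and would be called moderately resistant under these thresholds.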
Genotyping and SNP calling
Genomic DNA was extracted from fresh seedling leaves using the CTAB method [29]. The association mapping population was genotyped with the wheat Illumina 90K iSelect array, which carries 81,587 SNPs (Wang et al. 2014), at the Biotechnology Center, Department of Plant Sciences, University of California, USA, using the Illumina SNP genotyping platform and BeadArray Microbead Chip [30]. To avoid spurious marker-trait associations (MTAs), SNP markers with a minor allele frequency (MAF) < 0.05 or with > 10% missing data were excluded from subsequent analyses. The physical positions of the SNP markers were obtained from the Chinese Spring reference genome sequence at the International Wheat Genome Sequencing Consortium website (IWGSC, http://www.wheatgenome.org/).
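The marker filter can be illustrated with a minimal sketch. The encoding is an assumption made for the example, not something stated in the text: each SNP is a list of allele dosages (0, 1 or 2 copies of one allele) with None for a missing call, and a marker is dropped if it fails either criterion. The function names are hypothetical.

```python
def snp_stats(calls):
    """Return (minor allele frequency, missing rate) for one SNP.

    calls: list of allele dosages 0/1/2, with None for a missing call
    (an assumed encoding for this illustration).
    """
    called = [c for c in calls if c is not None]
    missing_rate = (len(calls) - len(called)) / len(calls)
    if not called:
        return 0.0, missing_rate
    # Frequency of the counted allele among all called chromosomes.
    p = sum(called) / (2 * len(called))
    return min(p, 1 - p), missing_rate


def passes_filter(calls, min_maf=0.05, max_missing=0.10):
    """Keep a SNP only if MAF >= 0.05 and missing data <= 10%."""
    maf, missing_rate = snp_stats(calls)
    return maf >= min_maf and missing_rate <= max_missing
```

Whether the original pipeline computed MAF before or after removing missing calls is not stated; the sketch computes it on called genotypes only.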
Population structure analysis
Population structure was estimated by Bayesian cluster analysis in Structure 2.3.4 [31], using 1,676 polymorphic SNP markers with r² < 0.2 distributed across all 21 wheat chromosomes. Six runs of Structure were performed with K ranging from 1 to 11, using the admixture model with 100,000 replicates each for the burn-in and the MCMC. The optimal K value was determined using the ∆K method [32].
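As a rough illustration of the ∆K criterion, the sketch below computes ∆K from the log-probabilities of the data, ln P(D), reported by replicate Structure runs. It assumes that several runs are available at each K and uses the common simplification of taking the second difference of the mean ln P(D) divided by the standard deviation across runs at that K. The function names and the input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(ln_prob_runs):
    """Compute delta K for every K that has both neighbours.

    ln_prob_runs: dict mapping K to a list of ln P(D) values,
    one per replicate run (at least two runs per K).
    """
    avg = {k: mean(v) for k, v in ln_prob_runs.items()}
    result = {}
    for k in sorted(ln_prob_runs):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # the second difference needs K-1 and K+1
        sd = stdev(ln_prob_runs[k])
        if sd == 0:
            continue  # delta K is undefined without run-to-run variation
        second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
        result[k] = second_diff / sd
    return result


def best_k(ln_prob_runs):
    """Return the K with the largest delta K."""
    dk = delta_k(ln_prob_runs)
    return max(dk, key=dk.get)
```

Note that ∆K cannot be evaluated at the smallest or largest K tested, so with K from 1 to 11 the candidates are K = 2 to 10.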
Linkage disequilibrium
Linkage disequilibrium among the filtered SNP markers was calculated using the full matrix and sliding window options in Tassel v5.0. Pairwise LD was measured as the squared allele frequency correlation (r²), and its significance (P-value) was assessed in Tassel v5.0 with 1000 permutations. The r² values were plotted against physical distance, and a LOESS curve was fitted to show how LD decays with physical distance. The critical value of r², beyond which LD was likely due to genetic linkage, was set at the 95th percentile of the distribution of r² among unlinked loci. The intersection of the fitted curve with this threshold was taken as an estimate of the extent of LD [33].
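The three quantities in this procedure (pairwise r², the 95th-percentile threshold, and the distance at which the fitted curve crosses it) can be sketched as follows. This is not Tassel's implementation: it assumes homozygous lines coded 0/1 at each marker, uses a simple linear-interpolation percentile, and takes the fitted curve as a precomputed list of (distance, r²) points instead of fitting LOESS. All function names are hypothetical.

```python
def r_squared(a, b):
    """Squared allele frequency correlation between two 0/1-coded markers."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    pa = sum(x for x, _ in pairs) / n
    pb = sum(y for _, y in pairs) / n
    pab = sum(1 for x, y in pairs if x == 1 and y == 1) / n
    denom = pa * (1 - pa) * pb * (1 - pb)
    if denom == 0:
        return float("nan")  # at least one marker is monomorphic
    d = pab - pa * pb  # disequilibrium coefficient D
    return d * d / denom


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between ranks."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)


def decay_distance(curve, threshold):
    """First distance at which a fitted (distance, r2) curve drops below threshold."""
    for (d0, r0), (d1, r1) in zip(curve, curve[1:]):
        if r0 >= threshold > r1:
            return d0 + (r0 - threshold) * (d1 - d0) / (r0 - r1)
    return None  # the curve never crosses the threshold
```

In use, `percentile(unlinked_r2, 95)` would give the critical r², where `unlinked_r2` holds r² values for marker pairs on different chromosomes, and `decay_distance` would then be applied to the curve fitted to same-chromosome pairs.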
GWAS for FHB resistance
Associations between genotypic and phenotypic data were analyzed with a Mixed Linear Model (MLM) in Tassel v5.0, which accounts for kinship and population structure to control background variation and reduce spurious MTAs. In the MLM, the kinship matrix (K matrix) was treated as a random effect and the subpopulation membership (Q matrix) as a fixed effect [34]. The K matrix was calculated in Tassel v5.0, and the Q matrix was inferred in Structure v2.3.4. For each SNP marker, the P-value for association with the trait and the R² indicating the phenotypic variation explained by the marker were recorded [35]. Markers with an adjusted -log10(P-value) ≥ 3.0 were regarded as significantly associated with FHB resistance. Significant SNP markers located within one LD decay distance of each other on the same chromosome were considered to represent one locus. Haplotype analyses of the significant SNPs were performed with Haploview v.4.2 [36].
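The post-processing of MLM output (thresholding and merging neighbouring hits into loci) might look like the sketch below. Several points are assumptions: the input is a list of dictionaries with hypothetical keys `snp`, `chrom`, `pos` and `p`; the threshold is applied to the P-value as given, since the adjustment used is not described; and significant SNPs are chained into one locus whenever consecutive markers on a chromosome lie within the LD decay distance.

```python
import math


def significant(results, threshold=3.0):
    """Keep markers with -log10(P) at or above the threshold."""
    return [r for r in results if -math.log10(r["p"]) >= threshold]


def group_into_loci(snps, ld_distance):
    """Merge significant SNPs on the same chromosome into loci.

    A new locus starts whenever the gap to the previous SNP exceeds
    ld_distance (same units as the 'pos' field).
    """
    loci = []
    ordered = sorted(snps, key=lambda r: (r["chrom"], r["pos"]))
    for snp in ordered:
        last = loci[-1][-1] if loci else None
        same_locus = (
            last is not None
            and last["chrom"] == snp["chrom"]
            and snp["pos"] - last["pos"] <= ld_distance
        )
        if same_locus:
            loci[-1].append(snp)
        else:
            loci.append([snp])
    return loci
```

Chaining by adjacent gaps is one of several reasonable readings of "one locus"; anchoring each locus on its most significant SNP would be an alternative with slightly different boundaries.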
Identification of candidate genes
To identify candidate genes linked to significant SNPs, the physical positions of the markers, preceded by the chromosome name, were queried in Ensembl, and the genes at those positions were considered. The intervals were then explored for predicted genes and annotations. For genes lacking IWGSC annotations, we evaluated orthologous genes (proteins) with known or predicted functions in related species using the comparative genomics tool in Ensembl. The closest species, Triticum urartu (A-genome donor) and Aegilops tauschii (D-genome donor), were considered first, followed by more distant species including barley (Hordeum vulgare), Brachypodium (Brachypodium distachyon), rice (Oryza sativa), maize (Zea mays), foxtail millet (Setaria italica), thale cress (Arabidopsis thaliana) and banana (Musa acuminata). When a gene had only low-similarity orthologs (< 70%) in the annotated genomes of related species in Ensembl, the T. aestivum gene sequence was submitted to a basic local alignment search tool (BLAST) search at the National Center for Biotechnology Information (NCBI) (http://blast.ncbi.nlm.nih.gov/Blast.cgi) to find highly similar sequences (megablast). This search also covered gene predictions for species available in GenBank but not in Ensembl. We also examined the T. aestivum gene transcripts and their domains available in Ensembl (using the transcript table link).