Generation of RNA-seq data
In this study, we constructed 24 RNA-seq libraries of the 8 treatments, WHd-A2.0, MHd-A2.0, WHd-L2.0, MHd-L2.0, WHd-A3.5, MHd-A3.5, WHd-L3.5 and MHd-L3.5, and each treatment included 3 biological repeats. After high-throughput sequencing, each sample contained 40 – 59.4 million reads; the data size of each sample was between 6 – 8.9 Gb, the Q30 value of each sequencing sample exceeded 87%, the GC content distribution was 50 – 56%, and sequencing alignment analysis results showed that 89.35 – 97.79% of the reads could be mapped to IWGSC RefSeq v1.1 (Table S1). Principal component analysis (PCA) of the eight raw sequencing datasets indicated that the samples could be clustered into four groups according to their genotype, which showed good repeatability between samples that could be used for subsequent analysis (Fig. S2).
Identification of differentially expressed genes
To identify genes that differed significantly between the early and late heading genotypes during the development of young panicles, we analyzed the differentially expressed genes (DEGs) with strict quality control. The DEGs of L2.0, L3.5, A2.0, and A3.5 were compared between the genotypes with an early heading time (WHd) and a late heading time (MHd). Finally, we identified 18,325 unique differentially expressed genes (Fig. 1a, Additional Table S2). Several reported flowering time genes were differentially expressed, including Vrn1-5A [3], Vrn3-7B [5], Ppd-1D [9, 10], and WSOC1 [14]. In addition, we discovered that when the growth cone developed to the W2.0 stage, there were 12,941 DEGs in the apical meristem between WHd-A2.0 and MHd-A2.0, of which 9,990 DEGs were upregulated and 2,951 DEGs were downregulated (Fig. S3a). When the growth cone developed to the W3.5 stage, in the apical meristem, there were 1,557 DEGs between WHd-A3.5 and MHd-A3.5, of which 1,406 were upregulated and 151 were downregulated (Fig. S3b). In the leaf tissue, there were 6,390 DEGs between WHd-L2.0 and MHd-L2.0, of which 2,735 genes were upregulated and 3,655 genes were downregulated (Fig. S3c). In leaf tissue at the W3.5 stage, there were 1,040 DEGs, of which 764 were upregulated and 276 were downregulated in WHd-L3.5 vs. MHd-L3.5 (Fig. S3d). We screened the common differentially expressed genes at different developmental stages in the same tissue and found that 1,084 genes were common DEGs between WHd-A2.0 and MHd-A2.0 and between WHd-A3.5 and MHd-A3.5 (Fig. 1b). In addition, 215 genes were common DEGs between WHd-L2.0 and MHd-L2.0 and betweenWHd-L3.5 and MHd-L3.5 (Fig. 1c). We speculated that these genes play an important role in spikelet development at the W2.0 and W3.5 stages. Additionally, we explored the distribution of differentially expressed genes along the three wheat subgenomes, and the DEGs were mainly distributed at the end of the chromosome arm (Fig. S4).
GO enrichment and MapMan metabolic pathway analysis of differentially expressed genes
To explore the regulatory pathways of DEGs, GO enrichment and MapMan metabolic pathway analysis were conducted between the early heading date and the late heading date accessions. We finally identified 657 significant GO terms (Table S3) involved in biological process (BP), molecular function (MF), and cellular component (CC) (Fig. 2; Fig. S5; Fig. S6; Fig. S7). There were 224, 94, 210, and 129 significant GO terms in A2.0, A3.5, L2.0, and L3.5, respectively (Table S3). Among them, we discovered several flowering time-related GO terms, including GO:0005975 (carbohydrate metabolism) [20], GO:0005991 (trehalose metabolic process) [21], and GO:0019684 (photosynthesis, light reaction) [22]. The metabolic pathways of four groups of DEGs were visualized by MapMan software (Fig. 3; Fig. S8; Fig. S9; Fig. S10), and a total of 328 significant metabolic pathways were finally identified. There were 224, 94, 210, and 129 significant MapMan metabolic pathway terms in A2.0, A3.5, L2.0, and L3.5, respectively (Table S4). Additionally, we found 5 common GO terms, i.e., GO:0003700 (transcription factor activity, sequence-specific DNA binding), GO:0001071 (nucleic acid binding transcription factor activity), GO:0003824 (catalytic activity), GO:0043565 (sequence-specific DNA binding), and GO:0016491 (oxidoreductase activity), and 214 common metabolic pathways, including bin:1.1.1 (light reaction), bin:11.8 (lipid metabolism), and bin:13.2.3.5 (amino acid metabolism) (Fig. 4). We suggest that these common regulatory pathways can provide ideas for further exploring the variation in the molecular mechanisms of wild and mutant wheat.
Transcriptome factor classification and identification
To identify the transcription factors (TFs) involved in the heading time development process, the DEGs were subjected to transcription factor analysis using iTAK software. In our study, a total of 1,225 transcription factors were classified into 56 transcription factor families (Table S5). A large number of differentially expressed genes were bHLH (129, 10.51%), WRKY (109, 8.88%), NAC (91, 7.41%), AP2/ERF-ERF (88, 7.17%), and MYB (64, 5.21%) transcription factors (Table S5). Among them, we also identified several transcription factor families involved in the plant flowering process, including three LFY transcription factors [23], eighteen SBP transcription factors, thirty-six MADS-MIKC transcription factors and ten MADS-M-type transcription factors (Table S5). Gene expression level analysis showed that transcription factors were expressed at the critical period that determines flowering time. For example, we found that the three LFY transcription factor genes (TraesCS2A02G443100, TraesCS2B02G464200, TraesCS2D02G442200) were highly expressed in the apical meristem, especially in WHd-A3.5 and WHd-A2.0 (Fig. 5a). Most of the SBP transcription factor genes were highly expressed in WHd-A2.0 and WHd-A3.5 (Fig. 5b). Additionally, in the MADS-box gene family, we discovered that the TraesCS5A02G391700 (Vrn1-5A) gene always showed a high expression level in both wild and mutant wheat, TraesCS3B02G612600 was highly expressed in leaf tissue, and two genes (TraesCS3D02G284200, TraesCS3A02G284400) were highly expressed in the apical meristem (Fig. 5c). GO enrichment analysis of LFY transcription factor genes revealed that the genes were mainly significantly involved in the GO:0006355 pathway (regulation of transcription, DNA-template, p = 6.60 × 10-23), and the SBP transcription factor genes significantly participated in the GO:0003677 pathway (DNA binding, p = 9.70 ×10-20). Additionally, the MADS-box transcription factors were mainly involved in the GO:0019219 pathway (regulation of nucleobase-containing compound metabolic process, p = 4.10 ×10-36).
Construction of the flowering gene regulatory network
In our study, a total of 18,352 DEGs were used to construct the coexpression network. In selecting the soft power value, we chose β = 14 to build the coexpression network (Fig. S11a; Fig. S11b). We calculated the average gene connectivity under different soft-thresholding powers and found that when β = 14, the coexpression network had scale-free characteristics (Fig. S11c; Fig. S11d). Finally, a total of 17 coexpression modules were obtained by the “dynamic tree cut” method. Each branch in the cluster tree represents a gene set, and different modules were distinguished by different colors (Fig. S11e). The correlation value between coexpression modules and samples was also calculated (Fig. S12). The reported flowering time genes Vrn1-5A (TraesCS5A02G391700), Vrn3-7B (TraesCS7B02G013100), Ppd-1D (TraesCS2D02G079600), and WSOC1 (TraesCS4D02G341700) were selected to build a gene coexpression network, and genes connected to the reported flowering time genes were extracted from the coexpression modules. We finally detected 16, 336, 446 and 124 DEGs with biological connections to Vrn1-5A (Fig. 6a), Vrn3-7B (Fig. S13a), Ppd-1D (Fig. S14a), and WSOC1 (Fig. S15a), respectively. The completed gene list related to the reported flowering time genes is summarized in Table S6.
Analysis of gene expression patterns coexpressed with flowering genes
To further study the gene expression pattern of the differentially expressed genes coexpressed with the wheat heading date gene, we analyzed the 16 genes that had biological connections with Vrn1-5A. We found the genes that were coexpressed with Vrn1-5A could be divided into three patterns according to their gene expression level. First, TraesCS2D02G181400 is a MADS-MIKC transcription factor that was highly expressed in WHd-A3.5, MHd-A3.5, MHd-L3.5, and WHd-A2.0 but expressed at low levels in WHd-L2.0, MHd-L2.0, MHd-A2.0, and WHd-L3.5 (Fig. 6b). TraesCS2D02G181400 was upregulated between WHd-A2.0 and MHd-A2.0 and downregulated between WHd-L3.5 and MHd-L3.5 (Fig. 6c). Additionally, the other five genes (TraesCS1D02G339300, TraesCS2A02G562100, TraesCS3A02G007200, TraesCS3B02G361000, TraesCS4D02G337000) were expressed at low levels at all stages in both the early and late heading date accessions (Fig. 6b). In addition, the remaining ten differentially expressed genes (TraesCS3A02G231100, TraesCS3B02G152300, TraesCS3B02G213900, TraesCS3B02G260400, TraesCS5D02G386500, TraesCS6A02G146500, TraesCS7A02G439300, TraesCS7A02G489000, TraesCS7D02G429000, TraesCS7D02G475300) showed higher expression levels in WHd-L2.0, WHd-L3.5, and MHd-L2.0 but presented lower expression in WHd-A2.0, WHd-A3.5, MHd-A2.0, MHd-A3.5, and MHd-L3.5 (Fig. 6b). Additionally, the genes that were highly related to Vrn3-7B were mainly highly expressed in the leaf tissues (Fig. S13b). Genes connected with Ppd-1D tended to be expressed in the leaf (Fig. S14b). Some genes that were coexpressed with WSOC1 were preferentially expressed in leaf tissues (Fig. S15b). Furthermore, by gene function annotation, we found some vital candidate flowering time genes in the coexpression network, such as TraesCS1A02G220300, TraesCS2D02G181400, TraesCS3A02G143100, and TraesCS5A02G166100 (Table 1).