The seed coat area of hybrid and its parents was significantly different at 8 DAP.
The seed coat area and kernel weight from 0 to 15 DAP in the hybrid and parental lines were measured continuously (Fig.1A). At 0 DAP, the seed coat area of the hybrid was lower than that of both the mid-parental value (MPV) and the maternal line. At 6 DAP, the hybrid values began to exceed the MPV but were still lower than those of the maternal line. With the expansion of the seed coat, the hybrid value continued to be higher than the MPV at 7 DAP and slightly exceeded the value of the maternal line. At 8 DAP, there was a significant difference (P value = 0.034) between the hybrid and the MPV regarding seed coat area (Fig. 1B). From 9 to 15 DAP, the seed coat area of the hybrid was increasingly different from the MPV and paternal line values (Table S1). In conclusion, 8 DAP is the inflection point and the entry point of heterosis in the seed coat of maize. The kernel weight of the hybrid was significantly different from that of the MPV after 10 DAP (Fig. 1D). These results indicated that the heterosis of the seed coat area occurred prior to changes in seed weight.
In the seed coat area, the relative growth rate of the hybrid and paternal line peaked at 3 DAP, and the maternal line grew most rapidly at 2 DAP (Fig. 1C). In terms of the relative increasing rate of kernel weight, the value of the hybrid was the highest at 3 DAP, while it was highest at 2 DAP for the maternal line and at 4 DAP for the paternal line (Fig. 1E). In general, seed coat area and kernel weight showed a large increase at 2-4 DAP. More importantly, 3 DAP was found to be a critical time point for hybrid seed coat development.
Relative cell numbers in the seed coat
To understand whether the increased seed coat area in the hybrid was due to changes in cell number or cell size, we counted the number of cells in the seed coat per unit area (0.1 mm2) (Fig. 2). The middle longitudinal section of seed embryos collected at 8 DAP was obtained (Fig. 2A-C). The number of cells per unit area for the paternal line (319 cells per 0.1 mm2) was higher than that for the hybrid (149.7 cells per 0.1 mm2), the maternal line (138 cells per 0.1 mm2) and the MPV (228.5 cells per 0.1 mm2) at 8 DAP (Fig. 2D). Although there was no difference in the number of cells per unit area between the hybrid and maternal line at 8 DAP, there was a significant difference in seed coat surface area (Fig. 2D, E). Thus, compared with the maternal line, the cell number of the hybrid was determined by its larger seed coat area. The differences in cell number and seed coat area between the hybrid and MPV were significant (Fig. 2D), suggesting that the difference in seed coat size between the hybrid and parents might be the result of the coordination between cell number and cell size.
Analysis of expressed gene in seed coat
To investigate the expression of genes before and after the time points at which the seed coat of the hybrid line and its parents showed significant differences, we performed RNA-seq on 18 samples at 6 time points (2-4 DAP and 8-10 DAP) from the hybrid and parental lines, with 3 biological replicates for each sample. In total, 54 libraries and 2.4 billion high-quality reads were generated. The reads were then mapped to the maize B73 reference genome (RefGen_V4 [26]) by Hisat [27]. Approximately (87%) of reads were uniquely mapped (Table S2) and used to calculate the normalized gene expression level as fragments per kb of transcript per million mapped reads (FPKM). There was a high correlation among the three biological replicates at each time point (average R2=0.93) (Table S3). All the above results indicated that the sampling quality of this survey was high, and that the RNA-seq sequencing data were accurate and reliable. Thus, we took the average FPKM value of the three replicates as the expression level for the sample at each time point. The Principal Component Analysis (PCA) revealed that the 18 samples at six time points from three genotypes were assigned to two stages, and that 2-4 DAP were clearly distinguished from 8-10 DAP (Fig. S1). Moreover, the RNA-seq data of 2-4 DAP were overlapping (Fig. S1).
There were 39005 annotated genes in the B73 AGP_v4 genome[26]. The transcriptomes of both the hybrid and parents exhibited very similar distributions in the number of expressed genes at each time point. In detail, the total numbers of genes expressed by the hybrid, maternal and paternal lines at six time points were 23516, 23596 and 23247, respectively. A total of 17691, 18328 and 17604 genes overlapped at these 6 time points, accounting for 75.23%, 77.67% and 75.73% of the total expressed genes (Fig. 3A-C), respectively. Taken together, the results of overlapping expressed genes in the hybrid and its parental lines showed that the growth and development of the seed coat was mainly due to gene expression level rather than the change in gene expression.
Differentially expressed gene identification
It was suggested that the hybrid showed significant differences from its parents at 8 DAP, while the relative growth rate of the hybrid seed coat area and kernel weight peaked at 3 DAP. Therefore, we focused on RNA-seq data at 3 DAP and 8 DAP for subsequent analysis. In DEG analysis, the gene expression levels in the hybrid were compared one-by-one with its parental lines, with a difference in the fold change (FC) ≥ 2 and (FC) ≤ -2 denoted as upregulated expression and downregulated expression, respectively. Some DEGs were highly or weakly expressed in either the hybrid or parental lines. At 3 DAP, a total of 3915 DEGs varied between the hybrid and maternal line (2134 (54.50%) upregulated and 1781 (45.50%) downregulated), and 2647 DEGs varied between the hybrid and paternal line (1248 (47.15%) upregulated and 1399 (52.85%) downregulated). Compared with that at 3 DAP, the proportion of DEGs upregulated at 8 DAP increased sharply. A total of 2182 DEGs varied between the hybrid and maternal line (1499 (68.70%) upregulated and 683 (31.30%) downregulated), and 2724 DEGs varied between the hybrid and paternal line (1847 (67.80%) upregulated and 877 (32.20%) downregulated) (Fig. 4A). The differences in the upregulation and/or downregulation of gene expression between the hybrid and its parents varied across comparisons at 3 DAP and 8 DAP. Some DEGs had genotype-specific expression and were thus identified as genotype-specific unigenes (Fig. 4B). Venn diagram analysis indicated that 1156 genes were DEGs (highlighted as black bold font) between the hybrid and its parents at both time points (Fig. 4B). Out of these 1156 genes, 834 DEGs (sum of bold font numerals) were commonly observed between M vs. H and P vs. H at 3 DAP. However, 430 DEGs (sum of bold font numerals) were commonly observed between M vs. H and P vs. H at 8 DAP (Fig. 4B). In the hybrids at both time points, 108 DEGs were found to be commonly differentially expressed (Fig. 4B).
Expression patterns of DEGs
To explore and categorize the trends for changes in expression, DEGs in 3 DAP and 8 DAP samples were divided into 13 possible hybrid and parental expression patterns (Table 1). Gene expression patterns were divided into additive expression and nonadditive expression. We found no significant changes in the number of additively expressed DEGs between the two time points. The expression pattern of most genes in the hybrid at the two time points was additive (85.18% at 3 DAP and 86.45% at 8 DAP). However, the effects of nonadditive DEGs were quite different, especially the dominantly expressed DEGs (classes 5, 7, 10 and 12). That is, the gene expression in the hybrid was similar to that of one parent, which accounted for more than 13% of all DEGs at both time points. At 3 DAP, the number of dominantly expressed genes with higher paternal expression accounted for the highest proportion in the nonadditive classes, while at 8 DAP, the number of dominantly expressed genes with higher maternal expression was the most abundant. The number of overdominantly expressed genes with transitional downregulation (0.35% at 3 DAP and 0.14% at 8 DAP) was relatively higher than that with transitional upregulation (0.10% at 3 DAP and 0.08% at 8 DAP).
We summarized 13 possible hybrid and parental expression patterns into mid-parent, high-parent and low-parent categories to analyse the expression changes at both time points. By comparing the number of DEGs in three expression patterns at both time points, it was found that the number of low-parent DEGs changed the most, followed by the number of high-parent DEGs, while the number of mid-parents DEGs changed the least (Table S4). According to the analysis of the fold changes at both time points, it was suggested that the change in the number of low-parent and high-parent DEGs may play a role in hybrid seed coat heterosis.
Functional classification by Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis
The DEGs with high and low parental expression at 3 DAP and 8 DAP were annotated by GO. The enrichment of GO terms with low-parental expression at 3 DAP showed that a large number of DEG categories related to stress were significantly enriched in biological processes, followed by cellular components, and categories related to the plasma membrane (Fig. 5A). The highly expressed DEG GO enrichment results at 3 DAP suggested that categories related to chromosome assembly and processing were enriched in cellular components, and categories related to the cell cycle and DNA replication were enriched in biological processes (Fig. 5B). The GO enrichment results for low-parental expression at 8 DAP demonstrated that the categories related to cytoskeleton and microtubule were enriched in cellular component, and the categories enriched in biological process were mainly related to cell cycle, nuclear division and microtubule (Fig. 5C). The GO enrichment results for highly expressed DEGs at 8 DAP indicated that categories related to metabolism were enriched (Fig. 5D).
The KEGG pathway enrichment analysis was carried out to further understand the biological function of genes and their interactions. We performed KEGG enrichment analysis on DEGs with high and low-parental expression at 3 DAP and 8 DAP, respectively (Fig. 6). Among the 3 DAP low-parental expressed DEGs, metabolic pathways, biosynthesis of secondary metabolites, protein processing in the endoplasmic reticulum and glutathione metabolism were significantly enriched (Fig. 6A). For the 3 DAP high-parental expressed DEGs, the biosynthesis of secondary metals, DNA replication and photosynthesis were significantly enriched (Fig. 6B). Metabolic pathways, the biosynthesis of secondary metabolites and alpha linolenic acid metabolism were significantly enriched in 8 DAP low-parental expressed DEGs (Fig. 6C). In contrast to DEGs with high parental expression at 3 DAP, DNA replication was also enriched in the low-parental expression DEGs at 8 DAP. For the high-parental expression DEGs at 8 DAP, circadian rhythms, the biosynthesis of secondary metals, glycolysis/gluconeogenesis, thiamine metabolism, metabolic pathways, galactose metabolism and other pathways were significantly enriched (Fig. 6D).
Weighted gene co-expression network analysis (WGCNA)
To identify the specific genes highly related to seed coat area, we constructed co-expression networks by using the transcriptome data from comparisons of the hybrid and its parents at 6 time points, and associated the co-expression module with seed coat area traits. The results showed that these genes should be divided into 41 modules (Fig. S2). By observing the correlation between modules and samples, it was found that the absolute value of the correlation coefficient between the MEyellow (0.86) and MEsienna3 (0.82) modules was the highest (Fig. S2). In other words, these two modules have the highest correlation with seed coat area. In these two modules, the genes with the top 100 weight values were screened to construct the regulation network diagram (Fig. 7, Fig. S3). First, the module MEyellow revealed that Zm00001d019306 was a hub gene (Fig. 7). However, in module MEsienna3, the regulatory network was not as simple and clear as that in MEyellow. We chose genes with a degree value of more than 10 as hub genes. Therefore, the hub genes in module MEsienna3 were Zm00001d006330, Zm00001d043289, Zm00001d007181, Zm00001d043290, Zm00001d033510, Zm00001d024732, Zm00001d015025, Zm00001d006331 and Zm00001d047796 (Fig. S3, Table S5).
Validation of DEGs by quantitative real-time PCR
The expression patterns of nine DEGs were further confirmed by quantitative real-time PCR (qRT‒PCR). We compared the transcriptional profiles of each gene with the seed coat RNA samples from the hybrid and its parents at different time points. The data confirmed that the expression patterns of all 9 DEGs were consistent with the expression levels obtained by RNA-seq analysis (Fig. 8, Fig. S4)