Differential expression analysis of yield negative regulators in wheat parents and F1 hybrids
Owing to their polyploid genetic architecture, bread wheat cultivars show significant quantitative variations that are being influenced by various sets of homeologs in the A, B, and D subgenomes. The interaction of these homeologs determines the expression pattern of a gene that can either confer buffering effect due to the functional redundancy of homeologs or a dominance effect due to genetic variation in the homeolog sequence, eventually leading to phenotypic diversity. Therefore, identification of the dynamic transcriptional expression patterns of various genes in different cultivars and their hybrids can decipher their interactions that might influence the growth of a plant. Four syntenic triads, namely TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 (accessions are mentioned in Supplementary Data 2 Table 1) in polyploid wheat, were assessed for differential expression analysis at six different growth stages such as one shoot, tillering, jointing, booting, flowering, and maturing (Fig. 1A), in three elite cultivars i.e., TD-1 (TD), Auqaab-2000 (AUQ), and GA-2002 (GAp), and at flowering and maturing stages in their F1 hybrids, namely TG (TD-1 x GA-2002), TA (TD-1 x Auqaab-2000), GT (GA-2002 x TD-1), GA (GA-2002 x Auqaab-2000), AT (Auqaab-2000 x TD-1), and AG (Auqaab-2000 x GA-2002), by retrieving whole tissue RNA from fresh leaves at each stage (Fig. 1B, 1C). The data indicated that the expression patterns vary from stage to stage and cultivar to cultivar (Fig. 2). In TD cultivar, the expression level of all the genes was high at one shoot stage; however, a low expression level was observed in AUQ and GAp cultivars (Fig. 2A, 2B). At the tillering stage, the expression level of TaD27 (a tillering negative regulator) was low in TD which can be correlated with the phenotypic data as more tillering was observed in TD compared to AUQ and GAp (Fig. 1E). Similarly, at the jointing stage the expression level of CKX family genes such as TaCKX2.1 and TaCKX2.2 (negative regulators of cytokinin) was low in Gap (Fig. 2B), consequently, an increased phenomenon of jointing was observed in GAp cultivar as compared to TD and AUQ cultivars (Fig. 1F). At the booting stage, the expression levels of all genes showed an enhanced expression pattern compared to other growth stages. However, a similar expression pattern can be observed in all the cultivars at flowering stage. Interestingly, in the flag leaf tissues, the expression level of all genes decreased at the maturing stage in TD and AUQ whereas their expression level was maximum in GAp compared to all other stages. Moreover, the cluster analysis showed a positive correlation in the expression pattern of all the genes at one shoot, jointing, and booting stages (Fig. 2C). A similar yet more strong positive correlation can be observed at tillering, flowering, and maturing stages which represent the stages that highly impact the grain yield. These results also indicate that dynamic transcriptional expression patterns can also be associated with the phenotypic outcome for the agronomically significant traits.
Subsequently, a similar analysis was conducted for the F1 hybrids, generated as a result of reciprocal crosses of these commercially cultivated cultivars (Fig. 1B, 1C), at two significant growth stages i.e., at the flowering and maturing stages to explore whether the expression patterns of these genes stay the similar to their parents or not. In all F1 hybrids (AG, AT, GA, GT, TA, and TG), the expression dynamics of the TaD27 gene stayed consistent at the flowering stage, however, at maturing stage TG hybrid showed higher expression of the TaD27 gene as compared to other hybrids (Fig. 2D, 2E). The expression patterns of the TaGW2 gene demonstrated an increasing notion from flowering to the maturing stage for the AG, AT, and TG hybrids. However, a decreasing notion was observed for the GA, GT, and TA hybrids where the expression of the TaGW2 gene decreased at the maturing stage. The expression of the TaCKX2.1 and TaCKX2.2 genes drastically decreased from the flowering to the maturing stage in the GA hybrid. The cluster analysis indicated a strong positive correlation between the expression patterns of GA and AT hybrids for all the genes in contrast to a slightly less positive correlation that can be observed in GT, TG, and TA hybrids (Fig. 2F). To develop a consensus for transcriptional suppression or dominance and track the behavior of expression patterns for a gene between parents and their hybrids, the expression dynamics of both parents and the hybrids were compared at flowering and maturing stages (Fig. 4). In the case of AG and GA hybrids, the expression of TaD27 gene was same at both stages and significantly different from both parents i.e., AUQ and GAp (Fig. 4A). Moreover, a general trend of enhanced expression of all the genes can be observed at the flowering stage as compared to both parents. Similarly, a general trend of decreased expression in all the genes can be observed in all hybrids from GA, involving GA-2002 as one of the parents, at maturing stage (Fig. 4A, 4C). Interestingly, in the case of TA and AT hybrids, the AT hybrid showed the transcriptional expression dynamics more inclined towards the TD parent at both growth stages where TD served as the male parent (Fig. 4B, 4E). Likewise, an expression pattern similar to the male parent can be reflected for all the genes in TG and GT hybrids (Fig. 4C).
Identification of preferential expression patterns of homeologs
To predict the preferential expression patterns of homeologs of yield-related negative regulators and identify a high-confidence single-homeolog dominance that might be influencing the expression patterns in different cultivars, we decided to access the relative abundance of transcripts expressed from each subgenome (A, B, and D) for TaCKX2.1, TaCKX2.2, and TaTEF1 genes at all six growth stages in all parents (Fig. 3A, 3B). The expression profiles of the TaTEF1 syntenic triad showed that the relative expression of its homeologs changes at different growth stages in all cultivars, however, in general, a higher relative expression of TaTEF1-7A homeolog was calculated. In GA cultivar, the expression of TaTEF1-7A homeolog was higher than TaTEF1-7B and TaTEF1-7D homeologs at one shoot, booting, and maturing stages. At the jointing stage, all three homeologs showed decreased expression in GA cultivar. Interestingly, at the jointing stage the expression of TaTEF1-7A only plummeted in GA where the expression was very high in AUQ and TD cultivars (Fig. 3B). Contrary to parents’ data, in all F1 hybrids, the expression level of the TaTEF1-7B homeolog was higher at maturing stage except in the TA hybrid where the TaTEF1-7A homeolog showed higher expression (Fig. 3D, 3E). The stage-dependent suppression and dominance of homeologs in a syntenic triad in different cultivars and their hybrids depict the unbiased expressional dynamics of the subgenomes.
To validate the resemblance and/or difference in the expression pattern of homeologs from the CKX family, the expression level of TaCKX2.1 and TaCKX2.2 homeologs was analyzed in parents at all six growth stages. Interestingly, all the homeologs of the TaCKX2.1 gene demonstrated similar patterns of expression at different growth stages in all the cultivars with an enhanced expression level at the booting and maturing stages (Fig. 3A). Likewise, a similar expression pattern was observed for all the homeologs of TaCKX2.2 gene at different growth stages (Fig. 3B). Furthermore, the cluster analysis showed a strong positive correlation in the expression patterns of all the studied homeologs at one shoot, tillering, booting, and flowering stages (Fig. 3C). In the case of F1 hybrids, the expression dynamics of the homeologs of the TaCKX2.1 gene showed an enhanced expression level of the TaCKX2.1A homeolog in all the hybrids at the flowering stage except for the AG hybrid where TaCKX2.1B demonstrated eminently higher expression level (Fig. 3E). The cluster analysis demonstrated a strong positive correlation among AT, GT, and GA hybrids (Fig. 3F). These inconsistent and different expression patterns in hybrids also consolidate the fact that a homolog-expression bias can be overcome by crossing different cultivars. For instance, an enhanced expression level of TaTEF1-7B homeologs can be observed in AG, GA, TG, and GT hybrids in contrast to the GA-2002 parent (GAp) (Fig. 4D, 4F).
Domain analysis of yield-related negative regulators
To predict the interaction of cis-regulatory elements that might be influencing the dynamic transcriptional expression patterns of the genes, we generated the in silico data using online platforms to access the structural and functional distinctions among the yield-related negative regulators and their homeologs. We retrieved three homeologs for each gene i.e., TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 from A, B, and D subgenomes of wheat. Interestingly, we found five homeologs of the TaCKX2.2 gene, namely TaCKX2.2A, TaCKX2.2B, TaCKX2.2D1, TaCKX2.2D2, and TaCKX2.2D3 as shown in Supplementary Data 2 Table S1. To identify their functional characteristics, we used their protein sequences to find out the functional domains (Fig. 5A). The results indicate that the TaGW2 genes have a domain of the Ring U-box superfamily that functions as an E3 ubiquitin ligase and involves specific post-translational modification of proteins known as ubiquitination 20,21. The TaD27 genes contain the DUF4033 superfamily domain that has been reported in rice, chrysanthemum, and Arabidopsis thaliana and proven to be involved in shoot branching through biosynthesis of strigolactones 22,23, while TaCKX2.1 and TaCKX2.2 genes possess cytokinin-binding superfamily domains and involved in the inactivation of plant growth hormone- cytokinin 12. Different domains indicate that these genes are related to different gene families and engaged in different molecular mechanisms related to wheat growth and yield. Moreover, the WoLF PSORT server was accessed to forecast the subcellular localization of genes to identify the main site of their activity 24 (Supplementary Data 2 Table 1).
Phylogenetic analysis of homeologs
To determine the evolutionary pathway, we identified and retrieved the paralogous genes of GW2, D27, CKX2.1, and CKX2.2 genes from Triticum dicoccoides, Triticum spelta, Triticum turgidum, Triticum urartu, and Arabidopsis thaliana through https://plants.ensembl.org/index.html. According to the results, we divided the phylogenetic tree into three groups (Fig. 5B). The D27, GW2, CKX2.1, and CKX2.2 genes from all the above-mentioned species were classified into three different groups (I, II, and III) based on their evolutionary relationships. Interestingly, five homeologs of the TaCKX2.2 genes were obtained through protein sequence-based analysis (Supplementary Data 2 Fig. S1).
Gene structure and motif analysis
According to gene structure analysis, all homeologs of TaCKX2.1 and TaCKX2.2 genes contain three exons and two introns, and the homeologs of TaD27 genes consist of six exons and five introns, while all the homeologs of TaGW2 genes contain eight exons and seven introns (Fig. 5C). Furthermore, we identified conserved motifs of TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 genes by accessing MEME platform 25. Results indicated that all TaGW2 genes have eight motifs distributed on N and C terminals. The TaD27 contains a maximum of six and a minimum of five motifs while TaCKX2.1 and TaCKX2.2 genes have a maximum of 13 motifs, eight on the N terminal and six on the C terminal (Fig. 5D). Motif analyses showed that these structurally different genes possessed five conserved motifs. Moreover, these structural and motif similarities in the homeologs of different genes explain their similar molecular and biological functions.
Functional gene ontology annotation analysis
To enhance our understanding of the functional performance of these genes, we used gene ontology (GO) enrichment analysis based on cellular component (CC), molecular function (MF), and biological process (BP) classes to estimate the roles of the TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 genes in polyploid Triticum aestivum. These concepts enabled us to forecast the molecular actions of these genes in the growth and development of the plant. Figure 6A demonstrates comprehensive information on the findings of the annotation.
The GO-CC enrichment annotation analysis detected three enriched terms, nucleus (GO:0005634), chloroplast (GO:0009507), and extracellular space (GO:0005615). The GO-BP enrichment detected a total of seven terms such as ligase activity (GO:0016874), ubiquitin-protein ligase activity (GO:0061630), macromolecule metabolic process (GO:0043170), primary metabolic process (GO:0044238), secondary shoot formation (GO:0010223), strigolactone biosynthetic process (GO:1901601), and cytokinin metabolic process (GO:0009690). While GO-MF annotation analysis identified a total of 12 enrichment terms, macromolecule metabolic process (GO:0043170), primary metabolic process (GO:0044238), regulation of organ growth (GO:0046620), regulation of seed growth (GO:0080113), iron ion binding (GO:0005506), isomerase activity (GO:0016853), cis-trans isomerase activity (GO:0016859), catalytic activity (GO:0003824), oxidoreductase activity (GO:0016491), cytokinin dehydrogenase activity (GO:0019139), flavin adenine dinucleotide binding (GO:0050660), and FAD binding (GO:0071949) (Fig. 6A, Supplementary Data 1 Table S1). The TaCKX2.1 and TaCKX2.2 genes demonstrated their involvement in catalytic activity and cytokinin metabolic processes encompassing their function as negative regulators of cytokinin hormones. The TaD27 gene showed its involvement in the strigolactone biosynthetic process and secondary shoot formation. Similarly, the TaGW2 gene showed its contribution to the regulation of seed growth.
Identification of putative miRNA targets
Various plants’ reactions to biotic and abiotic stress as well as phytohormones have associations with microRNA (miRNA)-directed regulators. Therefore, we decided to fetch bioinformatics analysis based on miRNA-gene sequence interaction through the psRNATarget database. We discovered many miRNAs that target the genes we were interested in to further expand our understanding of the miRNAs regulating these genes in Triticum aestivum. Figure 6B shows comprehensive information about targeting miRNAs. The TaGW2 genes are targeted by a maximum number of miRNAs as 15 different miRNAs are targeting the homeolog TaGW2D. Despite being the homeologs of the same gene (TaD27), only two miRNAs (TaemiRN4410 and TaemiRN4300) and one miRNA (TaemiRN4300) target the TaD27B and TaD27D gene, respectively. Interestingly, the miRNA target data also emphasize that homeologs of the same gene have varying numbers of targeting miRNAs and the same miRNA can target homeologs of different genes (Supplementary Data 1 Table S2). This observation also explains the impact of targeting miRNAs that might result in the variable expression pattern of these genes and/or homeologs at different growth and developmental stages of the plant.
Identification of transcription factors binding sites
Transcriptional regulations also play a pivotal role in the coordinated expression of genes and these regulations encompass the activation and suppression of genes 26. Such events are controlled through cis-acting elements (CREs) present in the promoter region located upstream of the gene 27. The transcriptional machinery consists of transcriptional factors and co-activators, bears the tendency to bind with transcription factor binding sites (TFBS), and decides the fate of the gene’s expression pattern 26. To understand a landscape of differential expression patterns of the TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 genes, we decided to predict the molecular map of CREs present in their promoter regions.
Although all the genes belong to different gene families and conduct distinct molecular and biological functions, we were able to find six common CREs, namely AP2/ERF, GATA/tify, GATA, Dof, bZIP, and MYB/SANT, in the promoter region of all the genes (Fig. 7). Among these CREs, the AP2/ERF (APETALA2/ethylene-responsive element binding factors) transcription factors are a large group of plant transcription factors that are highly conserved throughout the plant kingdom and regulate several biological and physiological processes such as plant-responsive mechanisms against various stresses, morphogenesis, metabolite regulation, and hormone signal transduction in plants 28. Interestingly, the TaCKX2.1 and TaCKX2.2 genes belonging to the CKX family only showed one common TFBS for SHI-related sequence (SRS) transcription factor that is reported to be involved in growth and development, phytohormone biosynthesis, and stress response in plants 29. However, the TaD27 gene showed the highest number (74) of TFBS while the TaCKX2.2D2 gene showed merely seven TFBS (Fig. 4B, Supplementary Data 1 Table S3, S4, S5). Like targeting miRNAs, the inconsistency in the number of CREs present in the homeologs of the same gene endorses the fact that they hold the tendency to influence the expression pattern of these homeologs at different growth stages.
Protein structure analyses
The advancement of the graphical display of genetic data has enabled the comparative study of proteins and describe their properties using 3D structures because proteins possessing similar domains frequently have similar functions despite their structural differences. To explore the architectures of the protein, the protein structures were predicted using the c2khoA template at a 90–95 percent confidence level. Based on the principles of protein structure guide design, binding ligands that are required to bind with either exterior or internal residues can be found. The TaGW2, TaD27, TaCKX2.1, and TaCKX2.2 genes exhibit the same monomer structure with alpha (α) helices and beta (β) sheets (Supplementary Data 2 Fig. S1). These genes are also anticipated to have physical connections, independent divergence sequences, and conserved biological traits because of the tertiary structures of proteins. Moreover, no domains or motifs were observed that warrant the interactions among the proteins encoded by these genes with each other that may hinder their normal functionality. Despite having variable gene structures, the protein structures of all the homeologs are similar to each other which explains their capacity to carry out the assigned task in a similar fashion, apart from their expression profile that shows which homeolog is being expressed at a higher level than others.
Morphological trait analysis of F1 hybrids and F2 plants
Harnessing the potential of integrated approaches involving genetics, breeding, and statistical tools enables the fast-track introduction of elite cultivars in hybrid development programs with prior knowledge about their phenotypic performance especially related to yield. To exploit our genomic and transcriptomic analyses in the breeding program, phenotypically distinct cultivars were crossed in a reciprocal manner to observe their morphological behavior in F1 progeny. The data of agronomically significant traits such as yield were collected from parents, the F1 hybrids developed through reciprocal crosses, and F2 segregating population (Fig. 8). The data showed significant variations in parents and F1 hybrids that were grown in the pots with a basic dose of fertilizers and consistent agronomic practices throughout the experiment. An impeccable hybrid vigor was observed with a 17% increase in the average number of grains and a 50% increase in average thousand grains weight in the F1 hybrids of GA (GAp x AUQ) and TA (TD x AUQ), respectively (Fig. 1D, 1E, 1F). To assess the correlation among selected yield-related traits principal component analysis (PCA) was conducted (Fig. 8). The scree plot indicated that the first two principal components explained more than 60% of the total variability for both years, based on which PCA biplot was constructed to analyze the variability (Fig. 8A). During 1st year (F1 hybrid progeny), plant height (PH) and spike length (SL), non-productive (NP)-tillers vectors lie close to each other and have greater than 90° with flag leaf length (LL), flag leaf area (LA), and tillers which indicates that both sets of traits have an antagonistic relationship with each other (Fig. 8B, 8C, 8D). During the 2nd year, the growth of segregating population of F2 wheat plants, sown in the field, showed a mixed response from declined to enhanced yield notion compared to parents. The number of tillers, total grains per plant (TGP), thousand grains weight (TGW), LA, and SL were higher for all the genotypes except the cross between AUQ and TD (AT) which performed comparatively better in the F1 progeny in the previous year (Fig. 8E, 8F, 8G, 8H). Moreover, LL and NP-tillers function synergistically but lie opposite to TGW which indicates that TGW will be less with the plants having high LL and NP-tillers (Fig. 8F). Correlation analysis also showed that during 1st-year the number of tillers was positively correlated with LA, TGW, TGP, and spikelet, and negatively correlated with PH and NP-tillers (Fig. 8C, 8D). During 2nd year, the number of tillers was positively correlated with flag leaf width (LW), SL, and TGP, and negatively correlated with LL and NP-tillers whereas, TGW was negatively correlated with LL and positively correlated with a number of tillers (Fig. 8G, 8H). During the 1st year, cluster analysis grouped the GAp and TD parents together, and hybrids TG, TA, GT, GA, AT, and AG were put into separate groups based on their performance similarity (Fig. 8C). However, during the 2nd year, the TD and AUQ parents behaved similarly and were put in one group whereas, among traits, TGW and tiller were grouped in one group which showed that they have a synergistic relationship with each other (Fig. 8G).