Identification of T. aestivum FAD genes
In the current study, 82 TaFAD (coding by 68 genes) have been identified. All identified FAD genes were named based on a phylogenetic dendrogram, and their numbering was according to the location of genes on chromosomes. The physicochemical study of the identified genes was carried out using the ProtParam tool. Based on the results, these genes are different in the number of amino acids, molecular weight (MW), and isoelectric point (pI). The protein sequence encoded by these 68 FAD genes ranged in length from 281 amino acids of TaFAD4.2 to 518 amino acids of TaFAD6.2. The theoretical molecular weights of these proteins ranged from 29.74 to 59.63 kDa, with the isoelectric points varied from 5.42 to 9.72 (Additional file 1: Table S1). The results of cellular localization of proteins revealed that they are active in chloroplast, mitochondria, endoplasmic reticulum, and plasma membrane. The spatial diversity of these genes is likely related to the diverse functional roles of these genes in different cell processes.
Phylogenetic analysis of the wheat FAD gene family
We carried out a phylogenetic tree using the Neighbor-joining method to investigate relationships of wheat FAD proteins. According to figure 2, wheat FAD proteins were divided into five groups. The first group is FAB2, which is called stearoyl-ACP desaturase (SADs) in plants. It can introduce a double bond into the stearoyl-ACP at the delta-9 position. Based on the phylogenetic tree, the FAB2 subfamily has been noticeably separated from other subfamilies. All members of wheat and rice FAB2 were clustered together, whereas FAB2 of Arabidopsis and soybean (as dicot plants) were grouped. FAD4 group introduces a double bond into a saturated acyl chain at the delta-3 position. Likewise, all FAD4 proteins of monocot plants were clustered together and separated from dicot plants clade. In FAD2/FAD6 (Delta-12 desaturase, omega 6) subfamily, FAD2 and FAD6 were divided into two separate clades due to their subcellular localization (Additional file 1: Table S1). FAD2 is endoplasmic reticulum-type omega 6, while FAD6 is chloroplast-type omega 6. In FAD3/FAD7/FAD8 (Delta-15 desaturase, omega 3) subfamily, the FAD7 clade is closer to the FAD8 clade in wheat, which may be due to the high similarity of their sequences. The subcellular localization of FAD7 and FAD8 is the chloroplast, while the FAD3 is endoplasmic reticulum-type omega 3.
Gene location on a chromosome, gene duplication, and selection pressure of FAD genes
To assess the distribution of FAD family genes, a chromosome map has been constructed, and an uneven distribution of 68 FAD genes on wheat chromosomes was determined (Fig. 3). Chromosomes (Chr) 2A, 2D, 2B, 5A, and 5B had the highest number of genes while there was no gene of this family on chr1A-1D and 7D. Only one gene has been found on chr7A and chr7B related to TaFAB2.33 and TaFAB2.34, respectively. Different members of the FAB2 subfamily are located on chr2A-7B except for ch4B and ch4D. In SLD and DES subfamilies, the genes were determined on chr5A-5D and chr2A-2D, respectively. FAD3, FAD6, FAD7, and FAD8 subfamilies have been found on chr4A-4D, chr4A-5D, chr2A-2D, and chr2B-2D, respectively.
Two types of tandem and segmental duplication were observed in the wheat FAD gene family. However, according to the higher frequency of segmental duplication, its role in the expansion of the FAD gene family is much greater than tandem duplication. Gene duplication is one of the important mechanisms to increase genetic diversity and generate new genes in plants. In the current study, it was found that three genes on chr6A (TaFAD2.1-3), four genes on chr6B (TaFAD2.4-7), and four genes on chr6D (TaFAD2.8-11) are the result of tandem duplication. Likewise, threes genes on chr2A (TaFAB2.1-3), chr2D (TaFAB2.9-11), chr5B (TaFAB2.26-28), and chr5A (TaFAB2.23-25) are the result of tandem duplication. We investigated Ka, Ks, and Ka/Ks parameters for 153 paired genes (Additional file 1: Table S2) to reveal a functional selection pressure between duplicated genes. In general, Ka/Ks >1, Ka/Ks=1, and Ka/Ks <1 indicate positive, neutral, and negative selections, respectively [26]. The Ka/Ks ratio was less than one for most of the paired genes. This negative selection is to maintain the function of the FAD gene family in wheat plants, and they were under a slow evolutionary process, and almost their role in evolution has been maintained. However, the Ka/Ks ratio for seven paired genes (TaFAD2.4/TaFAD2.7, TaFAD2.6/TaFAD2.5, TaFAD2.2/TaFAD2.6, TaFADB2.1/TaFADB2.2, TaFADB2.9/TaFADB2.10, TaFADB2.15/TaFADB2.20, and TaFADB2.1/TaFADB2.11) was greater than one, indicated the aforementioned paired genes were under the positive selection during evolution and they have different functions due to the mutations that have been occurred during their evolution. The divergence time of duplications was estimated at 1.98-47.75 Myra.
Exon-intron structures and conserved motifs
Based on the analysis of the exon-intron structure of the FAD gene family, they had 0 to 10 introns with high structural diversity (Fig. 4). 21.95% of wheat FAD genes are intronless. The longest intron is related to the TaFAD2.7 gene. Three intron splicing phases have been observed for the wheat FAD gene family, including phase zero, splicing after the third nucleotide of the codon; phase one, splicing after the first nucleotide of the codon; and phase two, splicing after the second nucleotide [27]. Most of the genes in this family have phases zero and two, whereas TaFAD6.1-2 and TaFAB2.4 genes have all three intron splicing phases. In TaFAB2.13, TaFAB2.18, TaFAB2.22a, TaFAB2.15, and TaDES.1-3 genes only phase zero have been observed. Likewise, TaFAB2.1-3, TaFAB2.23-25, TaFAB2.27-28, TaFAB2.6-7, TaFAB2.9-11, TaFAB2.33-34, and TaFAB2.31 genes demonstrated only phase two. The genes of the same subfamily were more similar in exon-intron structure compared with the genes of the other subfamilies. The FAD3/7/8 and DES subfamilies have eight and two exons, respectively. However, the FAB2 subfamily has 2-3 exons except for TaFAB2.4b with 10 exons. The SLD, FAD2, and FAD4 subfamilies have one exon except for TaFAD2.2, TaFAD2.5b, TaFAD2.6c, and TaFAD2.7 with two exons. The FAD6 subfamily contains 10 exons, while TaFAD6.1b has seven exons. A comparison of gene structure and phylogenetic tree demonstrated that the gene structure and intron splicing phases were similar in each gene cluster except TaFAB2.4. (Fig. 4).
MEME tool has been applied to detect conserved motifs in protein sequences of the FAD family (Additional file 1: Table S3 and S4). Based on the results, 30 and 10 conserved motifs in the FAD and FAB2 subfamilies have been identified, respectively (Fig. 5 and 6). Evaluation of motifs with Pfam showed that motifs 1, 2, 8, 9, and 10 in the FAD subfamily are related to the FAD gene family. The motifs 1-5 in the FAB2 subfamily are also related to the fatty acid desaturases. Conserved motif 13 (Cytochrome b5-like Heme/Steroid binding domain) was presented only in the TaSLD1-3 (Additional file 1: Table S4). Motif 18 (AAAARADSGEA) with no function was found in all the members of the FAD subfamily. The lowest number of motifs was related to TaFAD4.1-3 with two motifs (Motifs 18 and 10). Motifs 1, 2, 4, and 9 were common in all the members of the FAB2 subfamily except TaFAB2.4. TaFAB2.4 has only motifs 8 and 10 with no function. All members of the FAD subfamily contain three conserved histidine boxes (Additional file 1: Table S5). The amino acid composition of His-boxes is highly conserved in the same subfamilies. The third His-box with consensus sequence HX2HH exists in all TaFAD except in the TaSLD1-3 that the first amino acid is glutamine instead of histidine, which is vital for its enzymatic activity. All members of the FAB2 subfamily contain two conserved histidine boxes. The first His-box with consensus sequence EENRHG exists in all TaFAB2 except TaFAB2.8 with the consensus sequence of EENRML, whereas the sequence of the second His-box (DEKRHE) is identical in all FAB2 subfamily members.
Detection of cis-acting elements in TaFAD promoters
To better understanding the TaFAD genes expression regulation mechanisms, the search of cis-acting elements 1500 bp upstream of starting codon (ATG) of TaFAD genes was performed using the PlantCare database. In total, 82 motifs were identified in this family that the abundance of these elements was highly varied for each gene (Additional file 1: Table S6). The highest frequency of motifs was related to STRE (95.58%), MYC (94.11%), MYB (89.70%), TGACG-motif (88.23%), CGTCA-motif (88.23%), as-1 (88.23%), ABRE (85.29%), and G-box (83.82%). Likewise, the lowest frequency of motifs was related to BoxIII (only in TaFAB2.21), AT1-motif (only in TaFAD2.11), xhs-MA1a (only in TaFAB2.24), L-box (only in TaFAD3.2), LAMP-element (only in TaFAD8.1), AAAC-motif (only in TaFAB2.10), F-box (only in TaFAB2.3), Plant_AP_2_like (only in TaFAB2.15), re2f-1 (only in TaFAB2.23), and OCT (only in TaFAB2.28).
Simple sequence repeats (SSRs) in TaFAD genes and TaFAD-targeted miRNAs prediction
72 SSRs have been identified in 36 of the 68 TaFAD genes including 33, 25, 11, 2, and 1 SSRS in FAD3/FAD7/FAD8, FAB2, FAD2/FAD6, DES/SLD, and FAD4 subfamilies, respectively (Additional file 1: Table S7). Most genes had a single SSR except FAB2.22 (5SSRs), FAD8.1 (5 SSRs), FAD8.3 (5 SSRs), FAD2.7 (5 SSRs), FAD3.1 (4 SSRs), FAD3.3 (4 SSRs), FAD3.2 (3 SSRs), FAD3.4 (3 SSRs), FAD3.6 (2 SSRs), FAD8.2 (3 SSRs), FAB2.15 (3 SSRs), FAB2.20 (2 SSRs), FAB2.34 (3 SSRs), FAB2.31 (2 SSRs), and FAD2.8 (2 SSRs). The highest frequency was related to tri-nucleotide repeats (36 SSRs) followed by tetra-nucleotide repeats (6 SSRs), penta-nucleotide repeats (13 SSRs), hexa-nucleotide repeats (4 SSRs), and di-nucleotide repeats (3 SSRs). 91 miRNAs for 47 TaFADs possible targets have been identified (Additional file 1: Table S8). The relationship between miRNAs and their targets was not one by one and many miRNAs had a common target. For instance, TaSLD3 transcript was co-targeted by 5 miRNAs named tae-miR9659-3p, tae-miR1137a, tae-miR395a, tae-miR395b, and tae-miR9677b. On the contrary, one miRNA had multiple transcript targets such as tae-miR5384-3p can suppress the expression of TaFAB2.1, TaFAB2.9, TaFAB2.11, TaFAB2.21, TaFAB2.32, TaFAB2.33, and TaFAB2.34.
Expression analysis of TaFAD genes
To understand the functional role of TaFAD genes, their expression patterns of different developmental stages have been studied in the wheat root, shoot, leave, grain, and spike tissues (Fig. 7) (Additional file 1: Table S9). The members of the TaFAD family showed different expression patterns. All members of DES/SLD subfamily revealed low transcript level at all developmental stages and tissues except TaSLD3 (reduced expression in leaves and shoots of seedling and growth cycles, high expression at the vegetative stage in roots and spikes), TaSLD1 (moderate expression at seedling stage in roots), and TaSLD2 (reduced expression at the vegetative stage in spikes). In the FAD2/FAD6 subfamily, all TaFAD2 genes had low expression except TaFAD2.1, TaFAD2.6, and TaFAD2.8 with a high level of transcripts in all stages and tissues. However, TaFAD6.1-2 genes had a high level of expression in shoots, leaves at the all studied developmental stages, and spike at the reproductive phase. Based on RNA-seq data analysis of the TaFAD4 subfamily, all members demonstrated a low level of transcripts in all stages and tissues. In the FAD3/FAD7/FAD8 subfamily, the low level of transcript expression of TaFAD7.1-3 genes has been observed in vegetative, reproductive, and seedling tissues, while high abundance expression of them has been detected in roots and vegetative phase of spikes. TaFAD8.1-3 genes had a high level of expression in leaves and shoots of all wheat developmental stages. However, TaFAD8.1, TaFAD8.3, and TaFAD8.2 genes had low, moderate, and high expressions at the reproductive stage in spikes, respectively. The expression patterns of TaFAD3.1-3 genes were similar and low in all stages and tissues except TaFAD3.1 with moderate expression in roots at the seedling stage. However, TaFAD3.4-6 genes showed a high level of expression in all tissues at vegetative and spiked at reproductive stages. In FAB2 subfamily, TaFAB2.1-3, TaFAB2.6-7, TaFAB2.9-11, TaFAB2.14, TaFAB2.16, TaFAB2.18-19, and TaFAB2.21-34 revealed a low level of transcripts in all stages and tissues whereas TaFAB2.4-5, TaFAB2.12, TaFAB2.15, TaFAB2.17, and TaFAB2.20 had a high abundance of transcripts in all developmental stages except TaFAB2.4 with low expression in roots and spike at growth phase, roots at seedling, roots, and grains at reproductive stage. Likewise, TaFAB2.8 had high expression in all conditions except seedling phases with moderate expressions. Finally, TaFAB2.13 showed a low level of transcripts in all conditions except roots at seedling and growth cycles with high expression and leaves/shoots at the growth cycle with moderate expression. Besides, we investigated the expression patterns of TaFAD genes to predict their roles responding to biotic and abiotic stresses (Fig. 8 and 9) (Additional file 1: Table S9). In response to powdery mildew pathogen, we observed the up-regulated expression of TaFAB2.4 after 24 hours of treatment. In this observation, TaFAD8.1-3, TaFAB2.1, TaFAB2.15, and TaFAB2.17 revealed high expressions during the stress. On the other hand, the expression of the TaFAD3.4, TaFAD3.5, TaFAD3.6, TaFAD3.8, TaSLD1-3, and TAFAB2.20 (moderate expression) remained unchanged and it has been down-regulated in TaFAB2.5 and TaFAB2.12 at 72 hours compared with 24 hours. The expression of TaFAD8.1-3, TaFAD4.2, TaFAD4.3, and TaFAD3.5-6 has been increased in response to the stripe rust pathogen CYR31. Although, the transcript level of TaFAD2.1, TaFAD2.6, TaFAD2.8, TaFAB2.1, TaFAB2.12, TaFAB2.17 (high expression) and TaSLD1-3, TaFAD6.1-2, TaFAB2.20 (moderate expression) remained unchanged. After 6 hours of drought stress, the expression of all TaFAD genes has been down-regulated, obviously except TaFAB2.5, TaFAB2.8, and TaFAB2.12 that showed up-regulation. The TaFAD2.1, TaFAD2.6, TaFAD2.8, TaFAB2.5, TaFAB2.8, TaFAB2.12, TaFAB2.15, TaFAB2.17, and TaFAB2.20 could be up-regulated by heat stress after 6 hours. Under drought and heat treatment, the expression of TaFAD7.1-3 and TaFAD8.1-3 has been decreased whereas the expression of TaFAD2.1, TaFAD2.6, TaFAD2.8, TaFAB2.5, TaFAB2.8, TaFAB2.12 TaFAB2.15, TaFAB2.17, and TaFAB2.20 have been induced more significantly at 6 hours. The expression of 11 TaFAD genes has been up-regulated under cold stress including TaFAD2.6, TaFAD2.8, TaFAD6.1-2, TaFAD8.1-3, TaFAB2.15, TaFAB2.17, and TaFAB2.20 while the transcript level of the TaFAB2.4, TaFAB2.5, TaFAB2.8, TaFAB2.12, TaFAD3.4-6, TaFAD4.2-3, and TaSLD1-3 was moderate.
TaFAD2.6 and TaFAD2.8 structural modeling and docking studies
The FAD2 members have been shown to play important roles in wheat response to environmental stresses. Microsomal omega-six fatty acid desaturases (FAD2) introduce a double bond to the oleic acid (C18:1) carbon chain resulting in linoleic acid (C18:2) [28]. Linoleic acid is a precursor of other polyunsaturated fatty acids, and it is essential for the growth of all eukaryotes [29]. On the other hand, among the FAD2 members, TaFAD2.6 and TaFAD2.8 proteins had the highest expression in all studied stresses. Therefore, in the present study, their molecular structure and interaction of ligand-enzyme have been evaluated. Three-dimensional structures of TaFAD2.6 and TaFAD2.8 proteins have been predicted and refined by I-TASSER and ModRefinder servers, respectively. According to the results of the Ramachandran analysis of primary and refined models, the residue count increased in favored regions from 68.4% to 86.0% and from 69.1% to 87.0% in TaFAD2.6, and TaFAD2.8, respectively, confirming that the refined models are of good qualities. The modeled structure for TaFAD2.6 has 18 α-helices and 19 loops (Fig. 10b) while TaFAD2.8 structure has 19 α-helices and 20 loops (Fig. 11b). Most of the predictor programs predicted that the TaFAD2.6 and TaFAD2.8 proteins have six transmembrane (TM) α-helices (Fig. 10a and 11a) with a consensus domains of TM1 (Val86-Val109), TM2 (Ala114-Ile132), TM3 (Ser145-Trp158), TM4 (Arg208-Val213), TM5 (Gln254-Val273), TM6 (Trp280-Gln303) for TaFAD2.6 and TM1 (Asp65-Val86), TM2 (ALA91-Ile109), TM3 (Ser122-Trp135), TM4 (Trp197-Asn204), TM5 (Asp236-Ser251), TM6 (Phe255-Thr282) for TaFAD2.8. Three conserved histidine boxes have been found in the TaFAD2.6 and TaFAD2.8. Based on the reports in other plant species, the histidine boxes interact with two irons at the catalytic sites of the FAD2 enzymes [30, 31]. The predicted structures of TaFAD2.6 contains three conserved histidine boxes with eight histidine residues of His134, His138 at position loop 7, His170, His173, His174 at positions α-helix 8, and His345, His348, His349 at position α-helix 16 of the protein structure (Fig. 10d). Likewise, the TaFAD2.8 structure possesses three conserved histidine boxes with eight histidine residues of His111, His115 at position α-helix 4, His147, His150, His151 at positions α-helix 6, and His322, His325, His326 at position α-helix 17 (Fig. 11d). The histidine residues revealed essential catalytic features in plant FADs [32].
Delta-12 desaturase introduces a double bond at the delta-12 position of oleic acid to produce linoleic acid. We carried out docking studies of oleic acid on the refined model structure to assess the ligand specificity of wheat omega-6 desaturases using AutoDock 4.2. Based on the docking simulation with the ligand-enzyme binding energy of -5.94 kcal/mol, Val129, His134, Leu160, Val161, Ser165, Trp166, Ser169, His170, His173, Pro188, Lys189, Gln190, Lys191, Glu192, Ala193, Ile212, Val213, Leu216, Leu296, Val297, Ile299, Thr300, and His345 of TaFDA2.6 formed closed contacts with the docked oleic acid, while Arg29, Val106, Ala110, His111, Trp143, Ser146, His147, Arg149, His150, His151, Asn153, Gln192, Ala202, Pro211, Leu273, Ile276, Thr277, His281, His322, and His326 were involved in ligand-TaFAD2.8 binding with -4.75 kcal/mol docking energy. The oleic acid formed a hydrogen band with His170 and His147 in TaFAD2.6 and TaFAD2.8, respectively (Fig. 10c and 11c). Based on the figure 10d, in TaFDA2.6, His134, His138, His170, His173, His174, His345, His348, and His349 residues involved substrate-binding are part of HECGH, HRRHH, and HVAHH histidine boxes. In the TaFAD2.8, His111, His115, His147, His 150, His151, His322, His325, and His326 residues involved in the active site are also part of three conserved histidine boxes (Additional file 1: Table S5). These results suggest that the conserved histidine boxes play critical roles in the binding of the ligand to the active site. The results mentioned above may be useful for future site-directed mutagenesis studies to increase the catalytic efficiency of the FAD2 enzymes and subsequently enhance wheat tolerance to different stresses.