The genetics of drought/heat tolerance and breeding for drought tolerance have been areas of active research in crop plants. Several reviews on drought tolerance in crops have also been written, providing updated information on the subject (Hirschmann et al. 2014; Abuelsoud et al. 2016; Bapela et al. 2022; Adel et al. 2023). As described in these reviews, genes associated with a variety of metabolic pathways, including photosynthesis, respiration, nitrogen metabolism and sulphur metabolism, are involved in the response to drought in crop plants. A number of traits, including canopy temperature depression, root shrinkage and stomatal closure, have also been utilized as marker traits for selection of drought tolerant genotypes (Ashfaq et al. 2023). Thus, a large number of genes (numbering in the hundreds), many of them occurring as multigene families, seem to be involved in drought tolerance, making the development of drought tolerant cultivars a challenge. Some major genes/QTLs, QTL hot spots and meta-QTLs have also been identified in some crops (Gupta et al. 2012, 2017; Barmukh et al. 2022); in some cases, these QTLs/genes have been utilised in developing drought tolerant cultivars, as done in chickpea (Bharadwaj et al. 2021). Transcription factors and meta-QTLs for drought tolerance have also been identified (Gahlaut et al. 2016; Kumar et al. 2021; Tanin et al. 2022). Recently, in Argentina, a sunflower gene Hahb4 encoding a transcription factor has been utilized for developing drought tolerant transgenic HB4 wheats that have been approved for consumption as feed and food; in Argentina and Brazil, HB4 wheat has also been approved for commercial cultivation (Gupta, 2023a).
Despite the above developments, the search for genes responding to drought continues unabated. The present study is another such effort, involving identification of drought responsive SOT genes in wheat and related species (Hirschmann et al. 2014; Abuelsoud et al. 2016). In wheat, the present study is the first systematic analysis of the family of SOT genes and SOT proteins. Our results involving gene and protein structure, gene distribution and protein classification will facilitate further investigations on the functions of TaSOT genes in T. aestivum. Hopefully, these results will also provide a useful reference for the study of SOT genes in other crops. Our data should also promote further studies on the evolution of SOT genes/proteins in polyploid genomes and the genetic improvement of bread wheat for drought tolerance.
In wheat, the copy number of individual genes ranges from one gene per genome to dozens of genes per genome, such that many genes are present as multigene families. In several cases examined so far in our own laboratory, more than one gene occurs in the same genome (for a review, see Gupta et al. 2023). Following are some examples (number of genes in parentheses): (i) RuvBL genes (9); (ii) RWP-PK genes (37); (iii) SET domain genes = SDGs (166); (iv) 20S proteasome genes (67); (v) NAAT genes (24); (vi) VMT genes (6); (vii) SWEET genes (108).
Like some of the above examples, the number of SOT genes varies from two SOT genes in the genus Flaveria (an ornamental genus of Asteraceae) to 107 SOT genes in bread wheat (the present study). Among other species examined, 18 genes were reported in Arabidopsis thaliana (Klein et al. 2006), 35 in rice (Oryza sativa; Chen et al. 2012), 77 in cotton (G. hirsutum; Wang et al. 2019), 56 in Chinese cabbage (Brassica rapa L.; Jin et al. 2019), an average of 45 in 11 species of the algal genus Caulerpa (Landi et al. 2020) and 29 in potato (Solanum tuberosum; Faraji et al. 2021). Given the hexaploid nature of the wheat genome and the presence of 35 SOT genes in the diploid rice genome, it is not surprising that 107 genes were identified in wheat in the present study, assuming that for each rice gene there would be three genes in hexaploid wheat (3 × 35 = 105). However, not all seven homoeologous groups carried TaSOT genes, and not all genes had three copies, one on each of the three chromosomes of the same homoeologous group. For instance, chromosome 4D did not carry any TaSOT gene. This type of uneven distribution was also noticed earlier for the SOT family in rice, where the 35 SOT genes are unevenly distributed and are absent from chromosomes 3 and 5 (Chen et al. 2012). Similar results were obtained in cotton, Chinese cabbage and potato (Wang et al. 2019; Jin et al. 2019; Faraji et al. 2021). In earlier studies also, the D genome was found to be different (Marcussen et al. 2014; IWGSC 2018).
The structure of TaSOT genes obtained in the present study can also be related to the function and evolution of these genes. It may be recalled that 34 of the 107 TaSOT genes identified during the present study were split genes, where the number of introns in an individual gene varied from 1 to 5. This means that the majority of TaSOT genes [73 (68%) out of 107] were intronless. Similar results are available for SOT genes in other plant species studied so far. For instance, in Arabidopsis 7 of the 21 AtSOT genes (Klein et al. 2004), in potato only four of the 29 StSOT genes (Faraji et al. 2021), in Chinese cabbage (B. rapa) 18 of the 56 BraSOT genes (Jin et al. 2019) and in rice 19 of the 35 OsSOT genes were split genes, suggesting that intronless genes form the majority in most of these species (Klein et al. 2004; Chen et al. 2012); also, among rice genes, OsSOT9, which provides tolerance to heat and drought stress, is an intronless gene (Cao et al. 2016). In the present study, only 23 genes had high expression under drought; of these 23 genes, 19 are intronless and only 4 are split genes, suggesting that in general many more intronless genes are associated with high expression under drought. The occurrence of miRNAs and lncRNAs among split genes and intronless genes was also examined; it was found that of 16 miRNAs, 4 occurred in 4 intronless genes (the remaining 12 miRNAs occurred in 8 split genes). Based on this distribution, we may conclude as follows: (i) many more genes are intronless; (ii) a larger proportion of intronless genes (19 of 73; 26%) than of split genes (4 of 34; 12%) had high expression under drought.
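The proportions quoted above follow directly from the gene counts. The short Python sketch below recomputes them; the counts are taken from the text, while the variable names and the helper `pct` are illustrative only.

```python
# Gene counts as stated in the text for the TaSOT family.
total_genes = 107
split_genes = 34
intronless_genes = total_genes - split_genes  # genes without introns

# Genes with high expression under drought (23 in total).
high_expr_intronless = 19
high_expr_split = 4


def pct(part, whole):
    """Return part/whole as a percentage (illustrative helper)."""
    return 100.0 * part / whole


share_intronless = pct(intronless_genes, total_genes)
share_high_intronless = pct(high_expr_intronless, intronless_genes)
share_high_split = pct(high_expr_split, split_genes)

print(f"intronless genes: {intronless_genes} ({share_intronless:.1f}%)")
print(f"intronless genes with high drought expression: {share_high_intronless:.1f}%")
print(f"split genes with high drought expression: {share_high_split:.1f}%")
```

Expressing the drought-responsive genes as a fraction of each class, rather than as raw counts, makes the comparison independent of the fact that intronless genes are more numerous to begin with.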
The occurrence of relatively fewer split genes among plant SOT genes may have some significance. This may be examined keeping in view the function of introns in split genes, which are known to carry a variety of regulatory sequences, including non-coding RNAs like miRNAs and lncRNAs. In the present study, only four of the 34 split genes showed high expression under drought. The association of miRNAs and lncRNAs with split genes was also examined; it was found that among 12 genes carrying miRNAs, eight were split genes and the remaining four were intronless. In contrast, the nine lncRNAs were associated with six TaSOT genes, of which only two are split genes (TaSOT2-2B, TaSOT36-7A). This suggests that miRNAs are more frequent in split genes and lncRNAs are more frequent in intronless genes. These results also suggest that the majority of TaSOT genes involved in the response to drought/heat were intronless, and that the presence of miRNAs and lncRNAs may be negatively associated with drought and heat tolerance.
Among the intronless genes, TaSOT5b-2A is the only gene that has a target miRNA and the sequence for a lncRNA and is also associated with high expression under heat and drought. The relative distribution of miRNAs and lncRNAs in split genes can also be examined, showing that the majority of genes involved in drought stress are intronless. This may indirectly support earlier reports, where small nucleolar RNAs, small nuclear RNAs, and circular RNAs were associated with gene regulation (Kumari et al. 2022; Vakirlis et al. 2022). In the present study, miRNAs were found in 12 genes, of which 8 are split genes with introns, and lncRNAs were found in 6 TaSOT genes, of which only 2 are split genes.
The occurrence of 7 tandem and 26 segmental duplications in TaSOT genes observed in the present study is not unique, since similar duplications have been reported in several other crops (potato, rice, cotton, cabbage) and in the model plant species Arabidopsis (Faraji et al. 2021; Chen et al. 2012; Appels et al. 2018). It has been advocated that gene duplications provide new genetic material for mutation, drift, and selection to act upon, which can result in specialized or new gene functions. Without gene duplication, the plasticity of a genome or species in adapting to changing environments would be severely limited. The distribution of duplications in genes can also be utilized to find out the evolutionary stage at which these duplications originated.
Orthologous relationships of TaSOT genes with SOT genes of rice and six other related species (T. urartu, Ae. tauschii, T. turgidum, H. vulgare, B. distachyon and Z. mays) were analysed, and clear orthology of TaSOT genes was observed with these species. No orthology of TaSOT genes was seen with Arabidopsis, which is in accordance with the results for OsSOT genes, indicating their independent evolution (Chen et al. 2012). In our analysis, clear gene duplication can be observed in monocots (15 genes in maize, 35 in rice and 107 in wheat), which shows that some homologous genes were common to the ancestral species of wheat, suggesting that these genes have been highly conserved during evolution and may have similar functions. The divergence between homologous genes is due to deletion and duplication, indicating that some of them may have lost function or acquired new functions in the course of evolution (Li et al. 2020).
Synteny analysis showed that a majority of TaSOT genes had syntenic relationships with their orthologs in T. turgidum (20–90%), whereas the smallest number, five TaSOT genes, were syntenic with Z. mays (40–54%). This suggests that in the course of evolution and with polyploidization, the number of syntenic SOTs increased by gene complementation to confer adaptive plasticity (Otto et al. 2000). In a previous study also, approximately 35% of StSOT genes showed syntenic relationships with tomato and approximately 32% with Arabidopsis (Faraji et al. 2021).
Promoter analysis of TaSOT genes showed that these genes contain cis-regulatory elements (CREs) responsive to abiotic stresses. Of the 48 CREs identified in TaSOT genes, 11 were responsive to abiotic stress, which agrees with the previous study of OsSOTs, where 21 types of CREs were involved in abiotic stress (Chen et al. 2012). The largest share of CREs in the promoter regions of TaSOT genes belonged to ABRE, which is the key regulatory element in the response to heat and drought stress (Batra et al. 2019; Chaudhary et al. 2023). These findings suggest potential cross talk between hormones in the expression of ABA receptors.
In our analysis, the SSRs found in TaSOT genes ranged from trinucleotide repeats (the most frequent) to hexanucleotide repeats (the least frequent). The abundance of trinucleotide repeat SSRs in TaSOT genes is in agreement with earlier reports in wheat (Varshney et al. 2005). No SSR analysis has been reported for SOT genes in any other plant species.
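As an illustration of how tri- to hexanucleotide SSRs can be located in a gene sequence, a minimal Python sketch based on regular expressions is given below. It is not the tool used in the present study; the function name `find_ssrs`, the minimum repeat thresholds and the example sequence are all assumptions made for illustration.

```python
import re

# Hypothetical minimum repeat numbers per motif length (assumed values,
# not the thresholds used in the study).
DEFAULT_MIN_REPEATS = {3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return a sorted list of (start, motif, repeat_count) for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-nt motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. AAA, or ATATAT counted as a hexanucleotide motif).
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Made-up example: six copies of CAG flanked by TT on both sides.
example = "TT" + "CAG" * 6 + "TT"
print(find_ssrs(example))
```

Counting the hits by motif length over all gene sequences would give the kind of frequency distribution summarised above; only perfect repeats are detected by this sketch.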
In the protein analysis, TaSOT proteins showed a high aliphatic index (71–83), with negative GRAVY values for all TaSOT proteins. A high aliphatic index suggests high thermostability (Ikai 1980; Chaudhary et al. 2023). A protein with a negative GRAVY value is hydrophilic in nature (Bhattacharya et al. 2018). Most of the TaSOT proteins are cytoplasmic, but the proteins of three genes are transmembrane proteins and are also found in mitochondria and chloroplasts. These results agree with those for AtSOTs, which were also found to be localised in the cytoplasm as well as in plastids (Klein et al. 2004), but contrast with those for StSOTs, where all proteins were localised only in the cytoplasm, with no transmembrane proteins (TMPs) (Faraji et al. 2021). This indicates that TaSOT proteins are likely to have distinct functions related to membrane-associated processes and organelle-specific functions.
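For readers unfamiliar with the two indices, the following Python sketch shows how they are conventionally computed: GRAVY as the mean Kyte-Doolittle hydropathy of all residues, and the aliphatic index (Ikai 1980) as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. The function names and the example peptide are hypothetical, and this is not necessarily the software used in the present study.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from mole percentages of Ala, Val, Ile and Leu."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa):
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Hypothetical peptide, used only to show the calls.
peptide = "MKDSEELLRKAG"
print(f"GRAVY = {gravy(peptide):.3f}")
print(f"Aliphatic index = {aliphatic_index(peptide):.2f}")
```

A negative GRAVY value means that hydrophilic residues outweigh hydrophobic ones on average, while the aliphatic index is a unitless score that can exceed 100 for proteins rich in Val, Ile and Leu.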
Analysis of post-translational phosphorylation of TaSOT proteins revealed a wide variety of phosphorylated residues; in potato also, highly phosphorylated StSOTs were noticed (Faraji et al. 2021). Since protein phosphorylation can mediate multiple biological processes, like plant development and responses to stimuli (Heidari et al. 2019; Rezaee et al. 2020), it may be inferred that highly phosphorylated TaSOTs play an important role during the wheat life cycle.
The SOT domain (PF00685) is found in all TaSOT proteins; this is true of other plant species also, where SOT genes have been characterized (Faraji et al. 2021; Chen et al. 2012; Wang et al. 2019). SOT family members are characterized by four conserved regions (I to IV) in their protein sequences (Varin et al. 1997), of which regions I and IV are highly conserved and are involved in PAPS binding (Klein et al. 2004; Zang et al. 2009). This was also observed in all the previous studies of SOTs (Faraji et al. 2021; Chen et al. 2012; Wang et al. 2019).
The specific amino acid residues responsible for SOT catalytic activity in these motifs are in accordance with those reported by Faraji et al. (2021). The conserved 3' PB motif located in the C-terminal region of the TaSOTs acts as the site of interaction with the 3'-phosphate group of PAPS, and it selectively modulates the binding of PAPS (Klaassen et al. 1997; Faraji et al. 2021). The structural similarities observed in this motif may indicate a significant resemblance in expression patterns and regulatory functions within the cell (Faraji et al. 2021).
The results of protein-protein interactions involving SOT proteins suggested that different gene families, along with the SOT domain family, are co-expressed and co-regulated, and that all these families are connected in a pathway with the TaSOT gene family, which has transferase and catalytic activity. Sulfur is a crucial element that contributes significantly to the structure, regulation, and catalytic functions of proteins. Previous research (Faraji et al. 2021) has demonstrated the involvement of the SOT gene family in the biosynthesis of sulfur compounds, as well as in the regulation of flavonoid and brassinosteroid metabolic processes. It is likely that TaSOT proteins work in collaboration with proteins from iron-sulfur complexes and amino acid metabolism, thereby playing a role in regulating plant responses to external stimuli (Faraji et al. 2021; Heidari et al. 2020).
The 3D structures of three representative TaSOT proteins showed a high proportion of amino acid residues (> 90%) falling in the most favoured regions, suggesting that the predicted models are reliable (Batra et al. 2017; Kumar et al. 2018; Chaudhary et al. 2023). Generally, our predicted 3D models were in good agreement with the parameters related to typical SOT proteins and can be utilized for peptide-ligand studies and docking assays. We found either 10 or 11 channels in these proteins; previously, 11–13 channels were identified in StSOT proteins (Faraji et al. 2021). In protein structures, channels and cavities modulate protein function and can determine binding specificity (Fukao et al. 2012; Prasad et al. 2023).
Our docking assay revealed that the predicted residues in the docked complexes provide stability and maintain the interaction between the three TaSOTs and the A3P molecules. Most of these functional regions were included in the sulphotransfer_1 domain. The same result was observed in StSOTs, where the ligand-binding active sites comprise proline, glycine, serine, and lysine (Faraji et al. 2021). The time-dependent fluctuation of Rg (radius of gyration) was estimated to establish the influence of ligand binding on the compactness and structural integrity of the complexes between the ligand A3P and the TaSOT proteins. TaSOT5b-2D showed a higher Rg than the other two proteins; a higher Rg indicates a more labile complex, whereas a lower Rg profile indicates a more contracted complex (Tiwari et al. 2022).
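The radius of gyration used above is the mass-weighted root-mean-square distance of the atoms from their centre of mass, Rg = sqrt(sum_i m_i |r_i - r_com|^2 / sum_i m_i). The minimal Python sketch below computes it for a single frame; the function name and the toy coordinates are illustrative assumptions, and in practice the value is computed per frame of a molecular dynamics trajectory by the simulation package.

```python
import math


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration for a list of (x, y, z) coordinates."""
    if masses is None:
        masses = [1.0] * len(coords)  # assume equal masses if none are given
    total_mass = sum(masses)
    # Centre of mass along each axis.
    com = [sum(m * c[i] for m, c in zip(masses, coords)) / total_mass
           for i in range(3)]
    # Mass-weighted sum of squared distances from the centre of mass.
    weighted_sq = sum(m * sum((c[i] - com[i]) ** 2 for i in range(3))
                      for m, c in zip(masses, coords))
    return math.sqrt(weighted_sq / total_mass)


# Toy two-atom "complexes": the second is twice as extended as the first.
compact = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
extended = [(-2.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
print(radius_of_gyration(compact), radius_of_gyration(extended))
```

A structure whose atoms lie farther from the centre of mass gives a larger Rg, which is why a higher Rg profile over time is read as a less compact complex.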
The differential expression of TaSOT genes, observed during in silico expression analysis, was also confirmed in ten chosen TaSOT genes using qRT-PCR. In earlier studies in Brassica also, expression of SOTs increased under abiotic stresses (Marsolais et al. 2003; Michele et al. 1999), which was later confirmed using a transgenic approach for OsSOT9 (Cao et al. 2016). Similar results were also reported in rice (Chen et al. 2012), cotton (Wang et al. 2019), Chinese cabbage (Jin et al. 2019), Caulerpa (Landi et al. 2020) and potato (Faraji et al. 2021). The expression and regulation of TaSOT genes in response to abiotic stresses indicate their participation in various cellular pathways. To gain a comprehensive understanding of the functions of TaSOT genes, further investigations in wheat are warranted. Moreover, exploring the allelic variation within these genes could be valuable for the development of thermotolerant wheat cultivars.