The RuvBL genes in eukaryotes resemble the RuvB helicase gene of E. coli and have therefore been described as RuvB-like (RuvBL) genes. They encode proteins that have both DNA-dependent ATPase and DNA helicase activities. These genes belong to the family encoding AAA+ ATPases (ATPases associated with diverse cellular activities). The RuvBL proteins are associated with several multi-subunit transcriptional complexes and are also involved in histone modification. In humans, the RuvBLs have been shown to play important roles in essential signalling pathways such as the c-Myc and Wnt pathways, which are involved in chromatin remodelling, regulation of transcription and development, mitosis, DNA repair and apoptosis (Matias et al. 2006). RuvBL genes have also been shown to be involved in human cancers. In particular, the RuvBL1/RuvBL2 ATPase activity drives maturation of the PAQosome (Particle for Arrangement of Quaternary Structure, a large multisubunit chaperone complex), DNA replication and radio-resistance in lung cancer (Yenerall et al. 2019). RuvBL1 is also an epigenetic factor that promotes proliferation and inhibits the differentiation program in head and neck squamous cancers (Lin et al. 2020). The RUVBL1/2-TTT complex has also been implicated in mTORC1-hyperactive cancer cells (Shin et al. 2020). Other names for RuvBL1 proteins include RVB1, TIH1, ECP54, TIP49, ECP-54, INO80H, NMP238, PONTIN, TIP49A, NMP 238, and Pontin52.
Among higher plants, RuvBL genes have been characterized mainly in two species, namely Arabidopsis (Holt et al. 2002; Schořová et al. 2019) and rice (Saifi et al. 2017). In pigeonpea, the RuvBL genes have also been used for developing transgenics to examine the role of these genes in providing tolerance against salt stress (Singh et al. 2020). DNA helicases, like RuvBL, are crucial for nearly all DNA metabolic processes in plants, including pre-mRNA splicing. They act as molecular motor proteins that are crucial for numerous cellular functions (Tuteja and Tuteja 2004a). Mutations in, or complete absence of, RuvBL genes have been shown to be lethal in yeast and Arabidopsis (Holt et al. 2002; Jonsson et al. 2001). The expression of these genes is necessary for the normal development of the shoot meristem, as shown in Arabidopsis (Holt et al. 2002). It is also known that NTP-dependent transcription activators are crucial for abiotic stress responses (Tuteja and Tuteja 2004b).
TaRuvBL genes in wheat and related species
The nine TaRuvBL genes identified in wheat during the present study differed among themselves (Fig. 2). Given the hexaploid nature of the wheat genome and the presence of four RuvBL genes in the rice genome, one would normally expect 12 genes in wheat, assuming three wheat homoeologues for each rice gene. However, only nine genes were found in the present study; this is not surprising, since one of the four rice genes (OsRuvBL2b) is incomplete, suggesting that it is either undergoing degeneration/deletion or is the result of a duplication (see later). Each of the seven plant species included in the present study had three RuvBL genes (T. aestivum and T. turgidum had three genes on each genome) except rice and maize, each with four genes (Table 2), suggesting deletion/neofunctionalization of the RuvBL2b gene during evolution.
The nine TaRuvBL genes in wheat showed variation in the number of exons (Fig. 3), which ranged from 2 to 12. Rather surprisingly, homoeologues also differed in the number of exons. For example, TaRuvBL1a-4A and TaRuvBL1a-4B contain 11 exons each, while TaRuvBL1a-4D contains 12 exons. It seems that either the first exon of the genes on the A and B genomes split into two exons in the D genome copy during evolution, or two exons fused into one in the A and B genome copies. Variation in the first exon among homoeologues may represent early stages of sub-functionalization and neo-functionalization of RuvBL genes (Bartos et al. 2012). The presence of more than one splice variant in each of the following four genes suggests that alternative splicing of their transcripts resulted in multiple transcript variants (available at RefSeq, Jan 2016): TaRuvBL1a-4A, TaRuvBL1a-4A, TaRuvBL1a-4A, TaRuvBL1b-3A.
The three RuvBL genes on each of the three wheat genomes (A, B, and D) occupy almost the same positions (terminal and subterminal regions of the short arm) on the three chromosomes of the same homoeologous group (Fig. 2), as is also the case in each of its progenitors. This suggests that no duplication/deletion events for RuvBL genes occurred during the evolution of hexaploid wheat (Table 2) and that TaRuvBL genes in wheat are largely conserved. The only exception was the gene TaRuvBL1a-4A, located on the long arm of chromosome 4A, which may be attributed to a pericentric inversion (4AS; 4AL) in chromosome 4A (Dvorak et al. 2018). This hypothesis is supported by earlier observations, which suggested that chromosome 4A in hexaploid wheat was rearranged following allopolyploidization (Devos et al. 1995; Dvorak et al. 2018; Zhou et al. 2020).
Orthologues of OsRuvBL2b were present only in the maize and Brachypodium genomes (Fig. 14) and were absent in the wheat genome as well as in the genomes of its progenitors, including T. urartu and Ae. tauschii. The gene was also absent in the related diploid species H. vulgare. This suggests that RuvBL2b must have been lost during evolution soon after the divergence of rice and wheat from a common ancestor, but not before the divergence of rice and maize from a common ancestor (Fig. 14).
In the present study, we also observed that no RuvBL gene was present on homoeologous chromosomes of rice and wheat, suggesting divergent evolution. For example, TaRuvBL1a genes are present on each of the three chromosomes of homoeologous group 4 (4A, 4B, and 4D); these three genes are orthologues of the rice gene OsRuvBL1a located on rice chromosome 1, which is not homoeologous to the group 4 chromosomes of wheat (Ahn et al. 1993; Sorrells et al. 2003). Similarly, RuvBL1b genes are located on chromosomes 3A, 3B, and 3D of wheat and on an unrelated rice chromosome 7. TaRuvBL2a genes are located on wheat chromosomes 2A, 2B and 2D, whereas the corresponding rice gene is located on rice chromosome 6. No satisfactory explanation for this lack of conservation between the two crops is available, although structural variations (SVs) could be one possible reason (also see next section on synteny).
Evolution of RuvBL genes (synteny and phylogeny)
Synteny analysis gives information about the conservation of genes on the orthologous chromosomes of wheat and rice (Sorrells et al. 2003). In the present study, five of the nine TaRuvBL genes exhibited a syntenic relationship with genes in O. sativa, Z. mays, B. distachyon, and H. vulgare (Fig. 4; Supplementary Table S2). The three TaRuvBL genes present on homoeologous group 2 (2A, 2B, and 2D) of wheat show synteny with all four plant species (Z. mays, O. sativa, B. distachyon, H. vulgare). However, the two TaRuvBL genes present on chromosomes 4B and 4D of wheat show synteny only with H. vulgare. Two genes of maize (ZmRuvBL2a and ZmRuvBL2b) each show synteny with only one homoeologous group (group 2) of wheat genes (TaRuvBL2a-2A, TaRuvBL2a-2B, TaRuvBL2a-2D). This might be due to a duplication of the RuvBL2a gene in maize, which is supported by the high sequence similarity (96.9%) between ZmRuvBL2a and ZmRuvBL2b. In rice also, a duplication of OsRuvBL2a has occurred; however, only a truncated copy is present. Therefore, it is possible that in rice and maize, gene duplication took place after their divergence from the common ancestor of wheat, rice, and maize. Thus, this homoeologous group of wheat might have played a crucial role in the evolution of RuvBL genes. These results suggest that TaRuvBL genes in wheat might have originated from other related species instead of orthologous genes from rice and maize.
The above results are also supported by the results of phylogeny. In the phylogenetic tree, all the RuvBL proteins (encoded by 34 RuvBL genes) belonging to eight plant species are distributed in two subfamilies (RuvBL1 and RuvBL2), which is in agreement with earlier studies carried out in plant and animal systems (Wang et al. 2011; Saifi et al. 2017; Schořová et al. 2019). In the phylogenetic tree also, O. sativa and Z. mays proteins were clustered in separate groups from the proteins of T. aestivum, T. urartu, Ae. tauschii, and H. vulgare, showing their different ancestry. Therefore, the results of the synteny analysis and the phylogenetic analysis are in agreement. These results suggest that the RuvBL genes of rice and maize and those of wheat and its progenitors followed different routes of evolution after separating from a common ancestor in the remote past.
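The similarity value quoted above for ZmRuvBL2a and ZmRuvBL2b is the kind of figure that is derived from a pairwise alignment. As a minimal sketch of one common way to compute such a value, the Python function below calculates percent identity from two pre-aligned sequences. The function name, the choice of denominator (all aligned columns except those that are gaps in both sequences), and the toy sequences are assumptions made for illustration; the text reports similarity rather than identity, and the tool and definition used in the study are not stated here.

```python
def percent_identity(aln_a: str, aln_b: str, gap: str = "-") -> float:
    """Percent identity between two pre-aligned sequences of equal length.

    Assumption: the denominator is the number of aligned columns,
    excluding columns that are gaps in both sequences. Other
    definitions (e.g. shorter sequence length) give different values.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep every column except those where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aln_a, aln_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    # A column counts as identical only if both residues match and are not gaps.
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy (hypothetical) aligned fragments, not real RuvBL sequences.
print(percent_identity("MKIEEVKSTT", "MKIEEVRSTT"))
```

Because identity values depend on the denominator chosen, any comparison between studies should confirm that the same definition was used.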
Cis-regulatory elements (CREs) in promoter regions
Promoter analysis revealed the presence of a variety of CREs in the nine TaRuvBL genes. These CREs included elements specific for response to development (tissue-specific), light (low temperature), and phytohormone, etc. (Supplementary Table S5). The light-responsive elements included Sp1, Box1, AE-box, I-box, and MRE. The phytohormone response elements included ABRE, ERE, as1, CGTCA, GARE, and TGA elements, of which ABRE was the most abundant. Stress-responsive elements included LTR, DRE, MYC, MYB, STRE, ARE, I-Box, MRE, WRE3, W-Box, MBS, and F-Box; among these, MYB and MYC occurred at the highest frequency. Development-related CREs were relatively fewer in number. It appears that phytohormones control the expression of the majority of TaRuvBL genes; among these, auxin and strigolactone are known to form a dynamic feedback loop and modulate the distribution of each other (Hayward et al. 2009). It is also known that certain phytohormone-responsive elements, such as ABRE, contribute to the majority of responses to heat and drought stress (Joo et al. 2013). Thus, ABRE elements, along with other cis-acting elements available in TaRuvBL promoters, respond to phytohormones (e.g., MeJA, ABA, SA, GA, etc.), which may regulate the expression of TaRuvBL genes under abiotic stress. Similarly, other response elements may have a role in regulating the expression of TaRuvBL genes under the following conditions: (i) abiotic stress (low temperature), (ii) tissue-specific response, (iii) response to wounding, and (iv) some physiological processes. Because of the lack of information on the regulation of expression of TaRuvBL genes in wheat, a more detailed discussion of our results on the occurrence of CREs in TaRuvBL promoters is not possible at this stage.
Transposable elements (TEs) within genes
In TaRuvBL genes, 83 TEs were identified; the most abundant were DNA/hAT, followed by DNA/EnSpm/CACTA, LTR/gypsy, DNA/MuDR, LTR/copia, DNA/Mariner, NonLTR/L1, DNA/Helitron, and NonLTR/Penelope/Athena. In one of our earlier studies involving the identification and characterization of TaSDG genes, the TEs were found to differ in type and number (Batra et al. 2020). Therefore, it is apparent that even related genes may harbour different types and numbers of TEs (Lisch 2013). TEs can be activated by different environmental cues in plants, e.g., heat and drought stress (Cavrak et al. 2014; De Felice et al. 2009).
In the maize genome, four to nine TE families have been reported, including all major class I superfamilies (Gypsy, Copia, long interspersed nuclear elements (LINEs)), and DNA transposons. The promoters of many maize genes have been shown to be enriched (more than two-fold) with these TEs, and the genes carrying these TEs have been shown to be up-regulated in response to different abiotic stresses (Makarevitch et al. 2015).
Another important class of TEs is the hAT superfamily, identified in fungi, animals, and plants (Calvi et al. 1991; Rubin et al. 2001). In Arabidopsis, a hAT-like transposon binds to the DNA repair gene Ku70, and it is essential for development (Bundock and Hooykaas 2005). It is also highly conserved among monocots and eudicots, suggesting early specialization (Jesus et al. 2012). Previous studies have also reported that hAT-like elements are transcriptionally active in rice, grapevine, and sugarcane (Jiao and Deng 2007; Benjak et al. 2008; Jesus et al. 2012). CACTA is another type of TE identified during this study. In wheat, the percentage of CACTA TEs is the highest (15.5%) among plant species (Appels et al. 2018). In rapeseed, a CACTA element acts as an enhancer to increase silique length and seed weight (Shi et al. 2019). In maize, insertion of a CACTA-like transposon in the promoter of ZmCCT10 contributed to flowering-time adaptation (Huang et al. 2018). In wheat, insertions of CACTAs into the third intron of TaMFT and the first intron of TaVrn1 have been shown to increase seed dormancy (Wang et al. 2022).
miRNA sequences in genes
The present study identified only one miRNA (tae-miR164), with target sites in two TaRuvBL genes, namely TaRuvBL1a-4A and TaRuvBL1a-4B (Supplementary Table S4). It is well known that the miR164 family of miRNAs is highly conserved, and that its members regulate conserved targets belonging to the NAC family of transcription factors. There are at least two earlier studies in which targets of miR164 have been identified and their interaction has been studied. (i) In Populus euphratica, peu-miR164a–e and its target gene POPTR 0007s08420 participate in abiotic stress response. It was also reported that overexpressing PeNAC070 in Arabidopsis promoted the growth of lateral roots, delayed stem elongation, and increased the sensitivity of transgenic plants to drought and salt stresses. This study helped in understanding the adaptability of P. euphratica to extreme drought and salt environments by analysing tissue-specific expression patterns of miR164-regulated and specific promoter-regulated PeNAC genes. (ii) In an earlier study in wheat, the target of tae-miR164 was found to be a gene encoding the transcription factor TaNAC21/22 of the NAM sub-family, which is involved in negative regulation of resistance against stripe rust, suggesting that the miRNA provides resistance by controlling the expression of TaNAC21/22 (Feng et al. 2014); in that study, no effort was made to examine the role of tae-miR164 in tolerance to abiotic stresses like heat and drought. The present results, however, seem to have a parallel in Populus.
In the present study, only two of the nine genes, namely TaRuvBL1a-4A and TaRuvBL1a-4B, were identified as targets of a solitary miRNA. tae-miR164 is expected to reduce the expression of these two TaRuvBL genes, but how inhibition of these two target genes influences tolerance to heat and drought is a subject for future study, particularly because in qRT-PCR analysis the expression of three RuvBL genes, namely TaRuvBL1b-3B, TaRuvBL1a-4A, and TaRuvBL2a-2D, was higher in the tolerant genotype (Giza) than in the sensitive genotype (HD2329). Since peu-miR164a–e has been shown to have a role in tolerance against abiotic stresses in Populus, the role of tae-miR164 in heat/drought response through inhibition of the two TaRuvBL genes will be an interesting subject for future study.
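Target-site identification of the kind discussed above rests on complementarity between the miRNA and a site in the transcript. As a minimal sketch of that idea only, the Python function below counts positions that are not Watson-Crick paired when a miRNA (5' to 3') is aligned antiparallel to a candidate site (5' to 3') of the same length. The function name and the short toy sequences are hypothetical, G:U wobble pairs are counted as mismatches for simplicity, and this is not the scoring scheme of any particular target-prediction tool or of the present study.

```python
# Watson-Crick partners for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(mirna: str, site: str) -> int:
    """Count non-Watson-Crick positions between a miRNA and a target site.

    Both sequences are given 5'->3'. The miRNA pairs antiparallel to the
    site, so miRNA position i is compared with site position (n - 1 - i).
    Simplification: G:U wobble pairs are counted as mismatches.
    """
    mirna = mirna.upper().replace("T", "U")
    site = site.upper().replace("T", "U")
    if len(mirna) != len(site):
        raise ValueError("miRNA and site must have the same length")
    reversed_site = site[::-1]
    return sum(1 for m, s in zip(mirna, reversed_site) if COMPLEMENT[m] != s)


# Toy (hypothetical) 8-nt sequences, not the real tae-miR164 or its targets.
print(count_mismatches("UGGAGAAG", "CUUCUCCA"))  # perfectly complementary site
print(count_mismatches("UGGAGAAG", "CUUCUCCG"))  # one non-Watson-Crick position
```

Dedicated plant target-prediction tools additionally weight mismatch position and allow wobble pairs and bulges, so a raw mismatch count is only a first filter.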
Structure and function of proteins
TaRuvBL proteins (455 aa to 491 aa) were found to be longer than OsRuvBL proteins (455 aa to 476 aa), which are more conserved. The variation in domain length also supports this observation, since the domain length varies from 351 to 387 aa in TaRuvBL proteins and from 351 to 356 aa in rice. This indicates evolutionary changes in wheat RuvBL proteins after the divergence of wheat and rice from a common ancestor. The negative Grand Average of Hydropathy (GRAVY) values (-0.045 to -0.193) suggested a hydrophilic nature of the RuvBL proteins. These proteins function in the cytoplasm, and most proteins functioning in the cytoplasm are hydrophilic (Batra et al. 2017; Batra et al. 2019). This feature and the high aliphatic index of these proteins (98.8 to 133.8) are indicative of high thermostability (Ikai 1980).
It may also be recalled that the presence of a solitary domain (TIP49/P-loop) in each of the TaRuvBL proteins is a characteristic feature of all the RuvBL proteins of other species examined so far, including yeast, Arabidopsis, and rice (Fritsch et al. 2004; Holt et al. 2002; Gribun et al. 2008). This domain imparts nucleoside triphosphate hydrolase (P-loop NTPase) activity and exhibits characteristic ATPase and DNA unwinding activities (Saifi et al. 2017). ATPase activity is involved in several cellular processes that initially protect cells from stress but eventually lead to cell death under continuous stress conditions (Saifi et al. 2019; Tuteja et al. 2015). Seven motifs (1, 2, 3, 4, 5, 6, and 8) were also annotated with ATP binding and ATP-dependent activity (GO: 0008094). Similar motifs were also reported in rice RuvBL proteins, which likewise exhibit the characteristic ATPase and DNA unwinding activities (Saifi et al. 2017). ATPase activity of RuvBL1 homologues from other systems was also reported in the presence of single-stranded as well as double-stranded DNA (Gribun et al. 2008; Ahmad and Tuteja 2012).
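The two indices discussed above can be computed directly from an amino acid sequence. The following Python sketch assumes the standard definitions: GRAVY as the mean Kyte-Doolittle hydropathy value over all residues, and the aliphatic index of Ikai (1980) as X(Ala) + 2.9 X(Val) + 3.9 [X(Ile) + X(Leu)], where X is the mole percent of each residue. The function names and the toy sequence are hypothetical, and the sketch is not necessarily the tool used in the present study.

```python
# Kyte-Doolittle hydropathy scale for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value per residue.

    Negative values indicate an overall hydrophilic protein.
    """
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq: str) -> float:
    """Aliphatic index (Ikai 1980) from mole percentages of A, V, I and L."""
    seq = seq.upper()
    n = len(seq)

    def mole_percent(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


# Toy (hypothetical) peptide, not a real TaRuvBL sequence.
toy = "MKIEEVKSTTKTQRIAAHSH"
print(round(gravy(toy), 3), round(aliphatic_index(toy), 1))
```

A higher aliphatic index reflects a larger relative volume occupied by aliphatic side chains, which is the basis for its use as an indicator of thermostability.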
Multiple sequence alignment suggested that all nine TaRuvBL proteins carry some conserved amino acid sequences in their solitary characteristic domain (e.g., Walker A, Walker B, and Sensor1; Supplementary Fig. 1). Walker A contains the phosphate-binding loop (P-loop) for nucleotides, and Walker B is involved in metal (Mg2+) binding and nucleotide hydrolysis (Snider et al. 2008). Motifs 2, 4, and 6 are also associated with the NuA4 histone acetyltransferase complex (GO: 0035267), a complex having histone acetylase activity on chromatin, as well as ATPase, DNA helicase, and structural DNA binding activities. The complex is thought to be involved in transcription and DNA break repair (Utley et al. 2005).
The INO80 chromatin-remodelling complex regulates the abundance and positioning of nucleosomes and the DNA damage response (Morrison et al. 2017). The R2TP complex (GO: 0097255) is involved in RNA pol II assembly and apoptosis (Ahmad et al. 2013). The remaining motifs, namely 7, 9 and 10–16, are novel; their molecular characterization needs to be explored in future studies (Supplementary Figure S6, Supplementary Table S13).
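Conserved motifs such as Walker A and Walker B can be located in a protein sequence with simple pattern matching. The Python sketch below assumes the commonly cited consensus patterns, G-x(4)-G-K-[ST] for Walker A and four hydrophobic residues followed by D-E for Walker B (the form typical of AAA+ proteins). The regular expressions, the set of residues treated as hydrophobic, the function name, and the toy sequence are all assumptions made for illustration; they are not taken from the present study, and real motifs are better confirmed against a multiple sequence alignment.

```python
import re

# Assumed consensus patterns (illustrative, not from the study).
WALKER_A = re.compile(r"G.{4}GK[ST]")        # G-x(4)-G-K-[ST], the P-loop
WALKER_B = re.compile(r"[AVILMFYW]{4}DE")    # hhhh-D-E, h = hydrophobic


def find_motif(pattern: re.Pattern, seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]


# Toy (hypothetical) sequence, not a real TaRuvBL protein.
toy = "MKIEAVLLAGPPGTGKTALAQELGSKVPFLFIDEVHMLD"
print("Walker A:", find_motif(WALKER_A, toy))
print("Walker B:", find_motif(WALKER_B, toy))
```

Short consensus patterns like these can match by chance in long sequences, so hits are best interpreted together with their position inside the annotated TIP49/P-loop domain.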
The 3D structures of the nine TaRuvBL proteins showed a high proportion of amino acid residues (> 90%) falling in the most favoured regions of the Ramachandran plot, suggesting that the predicted models are reliable (Batra et al. 2017; Kumar et al. 2018). We also found that the 3D structures of the six TaRuvBL1 proteins (TaRuvBL1a-4A, TaRuvBL1a-4B, TaRuvBL1a-4D, TaRuvBL1b-3A, TaRuvBL1b-3B, TaRuvBL1b-3D) were homo-hexameric, whereas the three TaRuvBL2 proteins (TaRuvBL2a-2A, TaRuvBL2a-2B, TaRuvBL2a-2D) were monomeric (Supplementary Figure S4). Earlier studies have emphasized that RuvBL1 is the eukaryotic homologue of the bacterial DNA-dependent ATPase and helicase RuvB (Putnam et al. 2001; Yamada et al. 2001), which assembles into functional homo-hexameric rings and is the motor that drives branch migration of the Holliday junction in the presence of RuvA and RuvC during homologous recombination (Tsaneva et al. 1993). The crystal structure of the human AAA+ protein RuvBL1 was also shown experimentally to be a hexameric ring formed of ADP-bound RuvBL1 monomers (Matias et al. 2006). AAA+ proteins generally form hexameric ring structures and contain conserved motifs for ATP binding and hydrolysis, such as the Walker A (P-loop) and Walker B boxes (Gorbalenya et al. 1989; Schmid et al. 1992), the Arg finger, and sensor residues. In the present study also, the TaRuvBL1 proteins were predicted to form homohexameric rings, each made up of six monomers. The 3D structures of the OsRuvBL reference proteins were also examined (Supplementary Figure S4); OsRuvBL1a and OsRuvBL1b were homohexameric, whereas OsRuvBL2a was monomeric. This suggests that the protein structure of the RuvBL family is conserved among species. Thus, the predicted 3D structures of these proteins provide a basis for studying the molecular functions of the RuvBL helicase protein family.
Though monomeric subunits of RuvBL1b and RuvBL2a proteins are fully capable of performing ATPase and DNA helicase activities separately, the hetero-oligomeric protein complex (heterododecamer) of these proteins may provide stability to the RuvBL proteins and may allow DNA helicase activity at several positions simultaneously. Interactions between RuvBL1 and RuvBL2 have also been reported earlier, suggesting the formation of a heterohexameric ring structure (Abrahao et al. 2021; Cheung et al. 2010).
Protein-protein interactions
Several studies have demonstrated that RuvBL proteins interact with other proteins to regulate diverse processes. For instance, A. thaliana mutants for these genes showed reduced telomerase activity, suggesting the involvement of AtRuvBL proteins in plant telomerase biogenesis; these proteins also interact with AtTERT, a high-molecular-weight protein (Schořová et al. 2019). Human RuvBL proteins have also been shown to interact with the transcription factor MYC, which is required for the expression of many genes involved in cell-cycle transition events and proliferation (Wood et al. 2000).
The protein interactions of the representative protein TaRuvBL1a-4A are shown in Fig. 10 (Supplementary Table S14). These interaction results can provide valuable information for predicting the pathways in which TaRuvBL genes operate and the proteins involved in them, and for further functional characterization of TaRuvBL genes.
In-silico expression and qRT-PCR
Five of the nine TaRuvBL genes exhibited high expression in root, spike, stem, leaf, and grain at different developmental stages (Fig. 12a). Expression analysis of three representative genes was conducted using qRT-PCR. Two genes (TaRuvBL2a-2D, TaRuvBL1b-3B) were found to exhibit differential expression (upregulation) in qRT-PCR (Fig. 13). In an earlier study also, expression of OsRuvBL1a differed under abiotic stresses (cold, heat and salinity; Saifi et al. 2017), which was later confirmed using a transgenic approach (Saifi et al. 2021). Similar results were also reported in Arabidopsis (Holt et al. 2002) and pigeonpea (Singh et al. 2020). The expression of TaRuvBL genes and the upregulation of their transcripts under abiotic stress conditions suggest their involvement in multiple cellular pathways. Further studies in wheat could be conducted to fully understand the role of TaRuvBL genes and to utilize allelic variation in these genes for imparting heat stress tolerance to wheat cultivars.