Genome-wide identification of serine acetyltransferase (SAT) gene family in rice (Oryza sativa) and their expressions under salt stress

Assimilation of sulfur into cysteine (Cys) occurs in the presence of serine acetyltransferase (SAT). Drought and salt stresses are known to be regulated by abscisic acid (ABA), whose biosynthesis is limited by Cys. Cys is formed by the cysteine synthase complex, which depends on the SAT and OASTL enzymes. The functions of some SAT genes have been identified in Arabidopsis; however, it is not known how SAT genes are regulated in rice (Oryza sativa) under salt stress. Sequence, protein domain, gene structure, nucleotide, phylogenetic, selection, gene duplication, motif, synteny, digital expression and co-expression, secondary and tertiary protein structure, and binding site analyses were conducted. The wet-lab expression of OsSAT genes was also tested under salt stress. OsSATs have undergone purifying selection. Segmental and tandem duplications may be the driving force of the structural and functional divergence of OsSATs. The digital expression analyses of OsSATs showed that jasmonic acid (JA) was the only hormone inducing the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2, whereas auxin and ABA triggered only OsSAT1;1 expression. Leaf blade is the only plant organ in which all OsSATs except OsSAT1;1 were expressed. Wet-lab expression of OsSATs indicated that OsSAT1;1, OsSAT1;2, and OsSAT1;3 were upregulated at different exposure times of salt stress. OsSAT1;1, which is highly expressed in rice roots, may be a hub gene regulated by cross-talk of the JA, ABA, and auxin hormones. The cross-talk of these hormones and the structural variations of OsSAT proteins may also explain the different responses of OsSATs to salt stress.


Introduction
Sulfur (S) is a vital macronutrient for plants and is involved in many biochemical reactions. S is the fourth most important plant nutrient after N, P, and K. It is found in amino acids (cysteine and methionine), vitamins and cofactors, glutathione, and phytochelatins [1]. Plants can take up sulfate (SO₄²⁻) and reduce it to sulfide (S²⁻). Sulfide is used to integrate S into cysteine (Cys) in the cytosol. Cys biosynthesis is a basic process because Cys serves as either a precursor or a donor for key S compounds in plants. While Cys synthesis is observed in the cytosol, plastids, and mitochondria, the sulfate reduction pathway is localized in the plastids of plant cells [2,3]. Cys synthesis takes place in the presence of two enzymes: serine acetyltransferase (SAT or SERAT) and O-acetylserine (thiol) lyase (OASTL). The former synthesizes the intermediate product O-acetylserine (OAS) from acetyl-CoA and serine. The latter uses sulfide and OAS to produce Cys in the presence of pyridoxal-5′-phosphate as a cofactor. Together, these two enzymes form the hetero-oligomeric cysteine synthase complex (CSC) [4][5][6].
The ancestral SAT gene was of host origin and did not evolve from the cyanobacterial endosymbiont [7]. A total of five serine acetyltransferases (SAT or SERAT; EC 2.3.1.30) were identified in Arabidopsis thaliana [8]. AtSAT1, AtSAT3, and AtSAT5 are localized in plastids, mitochondria, and cytoplasm, respectively [2]. The mitochondrion is identified as the most important compartment in Arabidopsis for the synthesis of O-acetylserine (OAS), the precursor of Cys [3].
Similarly, the subcellular distribution of SAT activity in pea (Pisum sativum) leaves is highest in mitochondria, followed by chloroplasts and cytoplasm [2]. The other two AtSATs, AtSAT2 and AtSAT4, are localized in the cytoplasm, but their protein sequences differ from those of the other SATs and they are expressed at a low level [9]. Overexpression of AtSAT1 in maize does not cause any negative effect on plant growth, and it enhances the 10-kDa γ-zein storage protein during endosperm development [10].
Drought and salt stresses are known to be regulated by abscisic acid (ABA), whose biosynthesis is limited by Cys. Cys is formed by the cysteine synthase complex, which depends on the SAT and OASTL enzymes. It is reported that AtSAT2;1 was induced when Arabidopsis seedlings were treated with 50 µM ABA [11]. Also, SAT1 in the chloroplast of Arabidopsis was significantly upregulated under salt and light stresses [12]. However, it is not known how SAT genes are regulated in rice under salt stress. Therefore, in the present study, rice (Oryza sativa) SAT (OsSAT) genes and proteins were identified at the genome-wide level, and bioinformatics analyses (sequence, motif, phylogenetic, expression, and protein modeling) were then performed to determine the structural and functional divergence of SATs in rice. In addition, the responses of OsSAT genes to salt stress were tested to offer more insight into their regulation.

Nucleotide and phylogenetic analyses
The maximum likelihood estimate of the transition/transversion bias (R) was obtained with Kimura's method [21] using the two-parameter model. The transition/transversion bias and G+C content were calculated using DnaSP v6.12.01 [22]. Molecular phylogenetic analysis was performed with the maximum likelihood (ML) method using the JTT matrix-based model [23]. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and the topology with the superior log likelihood value was then selected. The analysis involved 11 amino acid sequences with 266 positions after all positions containing gaps and missing data were eliminated. Evolutionary analyses were conducted using MEGA7 software [24].
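As an illustration of the two nucleotide statistics named above, the following minimal Python sketch computes G+C content for a single sequence and counts transitions and transversions between two aligned sequences. It is a simple count-based illustration, not the maximum likelihood estimate of R under the Kimura two-parameter model and not the DnaSP implementation; the function names `gc_content` and `count_ts_tv` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of a sequence."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    return sum(b in "GC" for b in bases) / len(bases)


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences.

    Columns with a gap or an ambiguous base in either sequence are skipped.
    Returns a tuple (transitions, transversions).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A<->G and C<->T stay within one chemical class: transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions
```

Dividing the first count by the second gives a raw transition/transversion ratio for one pair; model-based estimates such as the one reported in this study additionally correct for multiple substitutions at the same site.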

Selection, gene duplication and synteny analyses
Nucleotide sequences of rice SAT genes were analyzed using DnaSP v6.12.01 [22] in terms of polymorphic sites, DNA polymorphism, genetic variation level (π and θ), and Tajima's D [25]. Ka and Ks values were calculated for duplicated gene pairs using DnaSP v6.12.01 [22]. Gene duplication analyses were conducted according to the following two criteria: if the alignment of the coding nucleotide sequences covers at least 70% of the longer gene and the amino acid identity between the sequences is ≥ 70%, the gene of interest is assumed to be duplicated [26].
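The two duplication criteria can be expressed as a short check. The sketch below is a minimal Python illustration under the assumption that coverage is measured as aligned length divided by the length of the longer coding sequence; the function name `is_duplicated_pair` and its parameter names are hypothetical and are not taken from any tool used in the study.

```python
def is_duplicated_pair(aligned_len, cds_len_a, cds_len_b, aa_identity,
                       min_coverage=0.70, min_identity=0.70):
    """Apply the two duplication criteria to one gene pair.

    aligned_len  : length of the aligned coding region (nucleotides)
    cds_len_a/b  : coding sequence lengths of the two genes (nucleotides)
    aa_identity  : amino acid identity between the two proteins, 0 to 1
    """
    longer = max(cds_len_a, cds_len_b)
    if longer <= 0:
        raise ValueError("coding sequence lengths must be positive")
    coverage = aligned_len / longer
    # Both conditions must hold for the pair to be called duplicated.
    return coverage >= min_coverage and aa_identity >= min_identity
```

For example, a pair with 900 aligned nucleotides, coding lengths of 1000 and 950, and an identity of 0.914 passes both thresholds, whereas the same pair with only 600 aligned nucleotides fails the coverage criterion.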
Synteny analyses were conducted on the Circoletto platform [27]. For this purpose, processed genes of Arabidopsis thaliana, Brachypodium distachyon, Solanum lycopersicum, and Zea mays were downloaded from the Plant Genome Duplication Database [28]. A local BLAST database was then constructed in BioEdit v7.2.5 [16], and the BLAST results were fed to the Circoletto server.

Plant material and growth conditions
The local Karacadag variety was used as plant material. Rice seeds were sown in 10-cm-square pots containing 40% perlite and 60% peat. The plants were grown in a growth cabinet at an average humidity of 50% with the following cycle: an 8-h dark period at 25 °C and a 16-h light period at 30 °C. Four to five plants were grown for each experimental group. Rice plants at the third leaf stage were exposed to 200 mM NaCl, and leaf samples were collected at the 3rd, 12th, and 24th hour. Leaf samples were immediately stored in RNAlater stabilization solution (Invitrogen, Cat No.: AM7021).

RNA isolation and gene expression analysis
RNA isolation was conducted using the TRIzol reagent method (Ambion, Ref: 15596026). The quality and quantity of the isolated RNA were measured using a 2% agarose gel and a nano-drop spectrophotometer (Maestrogen, MN-913). A total of 12 samples (i.e., three biological replicates per experimental group) were used in the real-time qPCR studies. Genomic DNA was eliminated from these 12 samples following the protocol of the DNase I kit (Cat: EN0521). RNAs were then converted to cDNA according to the kit's protocol. qPCR studies were performed with the QIAGEN Rotor-Gene Q 5-Plex instrument [36]. The expression of OsSATs was measured using the AMPIGENE qPCR Green Mix (Cat No: ENZ-NUC104-10000) commercial kit. The OsActinII gene, reported as a reference gene for salt stress in rice, was used as the reference gene [37].
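This section does not state how relative expression was computed from the qPCR data. One widely used approach for reference-gene-normalized qPCR is the 2^-ΔΔCt method, sketched below in Python purely as an illustration and not as the procedure followed in this study. The function name `relative_expression`, its parameters, and the Ct values in the usage note are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    Each argument is a threshold cycle (Ct) value, for the target gene or the
    reference gene, in the treated or the control sample.
    Assumes roughly 100% amplification efficiency for both primer pairs.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    # Compare treated against control.
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values of 24.0 (target) and 18.0 (reference) in a treated sample and 26.0 and 18.0 in the control, the function returns 4.0, meaning a fourfold upregulation; values below 1 would indicate downregulation.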

Genome-wide identification and sequence analyses of SATs
A total of six non-redundant SAT genes were identified in the rice genome using five AtSAT protein sequences as references: LOC_Os01g52260 (OsSAT1;1), LOC_Os02g10830 (OsSAT1;2), LOC_Os03g04140 (OsSAT3), LOC_Os03g08660 (OsSAT2;1), LOC_Os03g10050 (OsSAT2;2), and LOC_Os05g45710 (OsSAT1;3) (Table 2). The protein lengths of the SATs ranged from 298 to 391 amino acid residues, and their molecular weights were between 30.44 and 42.72 kDa. All SATs are predicted to be acidic (pI ≤ 7) except for AtSAT3. Exon numbers range from one to 10. AtSAT2 and AtSAT4 (SAT3 family) and OsSAT3 have 10 exons. In terms of gene structure, OsSAT1;1, OsSAT1;3, and OsSAT2;2 are similar to one another, whereas OsSAT1;2 and OsSAT2;1 are alike (Supp. Fig. 1). AtSATs are more diverse, and AtSAT1 and AtSAT3 are more similar to each other. All genes contain upstream and downstream regions except for OsSAT1;2 and OsSAT2;1. All SAT proteins in the  Fig. 2). According to domain analyses of SAT proteins, the WIKM-LEEAKSDVKQEPILSNYYYASITSHRSLESALAHILS-VKLSNLNLPSNTLFELFISVLEESPEIIESTKQDLIAV sequence was the conserved serine acetyltransferase domain, whereas the GVVIGETAVVGDNVSILHGVTLG and IGDGVLIGAGSCILGNITIGEGAKIGSGSV sequences were the hexapeptide domains with a 100% match score (Supp. Fig. 1). Domain boundaries and the number of hexapeptide domains varied among the sequences. The SATs shown with flickr pink filled ellipses have only the second bacterial transferase hexapeptide domain (Supp. Fig. 2).
To provide more insight into the protein sequence diversity of SATs, a sequence identity matrix was constructed (Supp. Table 1). The lowest identity value was 0.325, between AtSAT2 and AtSAT3, whilst the highest value was 0.914, between OsSAT2;1 and OsSAT2;2. The mean identity value of the SAT proteins was 0.476.
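A sequence identity matrix of this kind can be sketched from a multiple alignment as follows. This minimal Python example assumes that identity is the fraction of identical residues over alignment columns where at least one of the two sequences has a residue; other tools may use a different denominator, so values need not match those in Supp. Table 1. The function names are hypothetical.

```python
def pairwise_identity(seq_a, seq_b, gap="-"):
    """Fraction of identical residues between two aligned protein sequences.

    Columns where both sequences have a gap are ignored; a residue facing a
    gap counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return matches / compared


def identity_matrix(aligned):
    """Build a {(name_a, name_b): identity} table from a {name: sequence} dict."""
    names = list(aligned)
    return {(m, n): pairwise_identity(aligned[m], aligned[n])
            for m in names for n in names}
```

The minimum, maximum, and mean of the off-diagonal entries of such a table correspond to the kind of summary values reported above.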

Conserved motif analysis
In total, 10 conserved motifs were detected (Table 3 and Supp. Fig. 3). Motif 3 was related to PF00132 (bacterial transferase hexapeptide). Motifs 4 and 5 were related to the PF06426 (serine acetyltransferase) domain structure. Motifs 1, 2, 3, 4, and 5 were present in all SATs, whereas the other motifs were found in varying numbers.

Selection, gene duplication and synteny analyses
Nucleotide variations of the six OsSAT genes were analyzed using two selection analyses: Tajima's D and Ka/Ks tests. In Tajima's D test, 516 polymorphic (segregating) sites were identified in the OsSAT genes, of which 47.1% (243/516) were singleton variable sites and 52.9% (273/516) were parsimony informative sites. A total of 306 sites were invariable (monomorphic), and nucleotide diversity was 0.33 and 0.27 for the π and θ parameters, respectively. Tajima's D was 1.22, indicating purifying (negative) selection.
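For readers who want to see how Tajima's D relates the quantities reported above, the following Python sketch implements the standard formula from Tajima (1989). It is an illustration, not the DnaSP implementation. The usage note assumes n = 6 sequences and an alignment of 516 + 306 = 822 sites, and it converts the per-site π of 0.33 into a mean number of pairwise differences; these assumptions are inferred from the numbers in the text rather than stated by the authors.

```python
import math


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites, and mean pairwise differences.

    n                  : number of sequences (at least 4, otherwise the variance is zero)
    seg_sites          : number of segregating sites S
    mean_pairwise_diff : average number of nucleotide differences per pair (k)
    """
    if n < 4 or seg_sites <= 0:
        raise ValueError("need n >= 4 and at least one segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # D compares the two diversity estimators, scaled by the standard
    # deviation of their difference.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / math.sqrt(variance)
```

Under the stated assumptions, `tajimas_d(6, 516, 0.33 * 822)` gives a value of about 1.3, in the same range as the reported 1.22; the difference is plausibly due to π being rounded to two decimals. The Watterson estimate per site, 516 / a1 / 822, comes to about 0.27, which agrees with the reported θ.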
In the second selection test, nonsynonymous (Ka) and synonymous (Ks) substitution rates between the duplicated gene pairs were calculated. Ka/Ks values for all OsSAT genes were less than one, supporting the previous finding that OsSAT genes were subjected to purifying selection. Moreover, gene duplication analyses indicated that three segmental duplications and one tandem duplication occurred, suggesting that these duplications are the major force for SAT gene expansion (Table 4).
To understand the gene duplication dynamics of OsSAT genes, four comparative syntenic maps of rice, each associated with one of four representative species, were generated (Fig. 2). The six OsSAT genes had a syntenic relationship with five genes in Arabidopsis, followed by four genes in B. distachyon, four genes in tomato, and three genes in maize.

Distinct expression profiles of OsSAT genes
To better understand the functions of OsSAT genes in cell metabolism, gene expression levels in different tissues and organs were displayed as a heatmap in Fig. 3.
For digital expression analysis, data for five OsSAT genes were obtained from the Rice Expression Profile Database (RiceXPro); the OsSAT1;2 (LOC_Os02g10830) gene was not available. In general, OsSAT genes were expressed at a low level in stem, inflorescence, anther, pistil, lemma, palea, and embryo tissues/organs. In particular, OsSAT2;1 and OsSAT2;2 were downregulated about threefold. These two genes were also downregulated in inflorescence and anther tissues. In leaf blade and sheath, OsSAT genes were commonly upregulated. OsSAT2;1 and OsSAT2;2 were upregulated about threefold in leaf blade. In the second step, the expression profiles of OsSAT genes were evaluated at five different time points under six types of plant hormone application (abscisic acid, gibberellin, auxin, brassinosteroid, cytokinin, and jasmonic acid) (Fig. 4). Jasmonic acid (JA) treatment in particular caused the upregulation of OsSAT genes. OsSAT1;1 was upregulated nearly fourfold by JA, and the optimal time point for this increase in expression was 1 h under JA treatment. OsSAT1;1 was also upregulated by the auxin and ABA treatments at the third hour of exposure, although these responses were not as strong as the response to JA. OsSAT2;1 and OsSAT2;2 were upregulated at the first hour of exposure to JA. Overall, the expression levels under hormone treatments indicated that OsSAT1;1, OsSAT2;1, and OsSAT2;2 were positively regulated by hormone treatments.

Co-expression analyses of OsSAT genes
The co-expression network of OsSAT genes was constructed using the RiceFREND database. The network showed that Os04g0577500, Os11g0524300, Os06g0167400, Os06g0690700, Os12g0641300, Os04g0488700, and Os07g0589000 were the seven first-neighbor genes co-expressed with OsSATs (Fig. 5). Os04g0488700 (similar to PHY3, AGC kinase) was co-expressed with the OsSAT1;1 gene. The AGC kinase family is one of seven kinase families and is conserved in all eukaryotic genomes. AGC kinases in plants play roles in the modulation of kinase activity by external stimuli [40]. Os12g0641300 (similar to Zn-dependent hydrolases of the beta-lactamase fold) was identified as a gene co-expressed with OsSAT1;3. Os07g0589000 (lateral organ boundaries, LOB domain containing protein) was co-expressed with OsSAT2;1 and OsSAT2;2. Lateral organ boundaries domain (LBD) proteins contain a lateral organ boundaries (LOB) domain and are key regulators of plant organ development, such as photomorphogenesis, plant regeneration, and pollen development [41]. OsSAT3 was co-expressed with Os04g0577500 (TatD-related deoxyribonuclease family protein), Os11g0524300 (protein of unknown function DUF1001 family protein), Os06g0167400 (ditrans-poly-cis-decaprenylcistransferase family protein), and Os06g0690700 (similar to potential cadmium/zinc-transporting ATPase HMA1). TatD is a conserved protein found in all living organisms and participates in DNA fragmentation during apoptosis in eukaryotic cells [42]. Heavy metal pumps (P1B-ATPases) are vital for cellular heavy metal homeostasis. Arabidopsis thaliana contains eight P1B-ATPase genes (heavy metal ATPases 1-8, HMA1-HMA8) [43].

Annotations of SAT proteins
The gene ontology (GO) analyses of OsSAT proteins were performed using the PANNZER server in terms of biological process, molecular function, and cellular component (Supp. Fig. 4). Sulfur amino acid biosynthetic process (GO:0000097), L-serine metabolic process (GO:0006563), cysteine biosynthetic process from serine (GO:0006535), cellular amino acid biosynthetic process (GO:0008652), sulfate assimilation (GO:0000103), and response to sulfate starvation (GO:0009970) were identified as the biological processes in which OsSATs are involved (Supp. Fig. 4A). Serine O-acetyltransferase activity (GO:0009001), terpene synthase activity (GO:0010333), magnesium ion binding (GO:0000287), zinc ion binding (GO:0008270), and protein binding (GO:0005515) were the predicted molecular functions carried out by OsSATs (Supp.   Fig. 4C). All things considered, amino acid synthesis was clearly the most prominent biological process for OsSATs.  Table 2). The predicted 3D structures of OsSAT proteins (Fig. 6) were found to be reliable based on their Ramachandran values, which ranged from 96 to 99% in core and allowed regions. Overall, structural differences among SATs may indicate functional differences. The 3D structural similarities (%) were identified using six rice, five Arabidopsis, and one soybean SAT proteins on the CLICK structure comparison server (

Predicted active sites of OsSATs
The identification of the catalytic residues of enzymes is an indispensable step in understanding their functions [44]. In this study, active site predictions of OsSATs were performed using the InterPro 74.0 server (Table 6). In particular, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) residues were conserved at different positions in all OsSATs; in contrast, some residues, such as 248M (Met), 249Q (Gln), and 292A (Ala), were identified only in the LOC_Os03g04140 (OsSAT3) protein, suggesting functional divergence of SAT3 in rice. In summary, similar amino acid residues were present in the predicted active binding sites.

The expressions of OsSATs under salt stress
In this study, the responses of OsSATs were investigated under 3-, 12-, and 24-h salt treatments (Fig. 7). OsSAT2;1, OsSAT2;2, and OsSAT3 were downregulated at all exposure times. The expression of OsSAT2;1 in response to all salt treatments was the lowest among the OsSATs. On the other hand, OsSAT2;2 and OsSAT3 responded to the salt exposure times in a similar way. The expression of OsSAT1;1, OsSAT1;2, and OsSAT1;3 increased under the 3-, 12-, and 24-h NaCl treatments. OsSAT1;2 was generally expressed at the highest level at all exposure times. Lastly, the responses of OsSAT1;2 and OsSAT1;3 were highest at the 24-h NaCl treatment. Overall, OsSAT1;1, OsSAT1;2, and OsSAT1;3 are responsive to different salt exposure times, and OsSAT1;2 and OsSAT1;3 were particularly upregulated by the 24-h salt treatment.

Discussion
Sequence, nucleotide, and phylogenetic analyses of Arabidopsis and rice proteins showed that there are sequence and phylogenetic divergences among SATs [9]. AtSATs and OsSATs were not completely separated from each other in the phylogenetic analysis because they share the same number of exons and relatively high identity scores. For example, the AtSAT2/4 and OsSAT3 genes have 10 exons with identity values above 50%, unlike the rest of the SATs. In terms of identity values, OsSATs have more similar protein sequences (0.541) than AtSATs (0.450). Nonetheless, the selection analyses of OsSAT genes showed that SATs are subject to purifying selection. Purifying selection, also known as background selection, reduces genetic diversity and shapes it in natural populations [45]. Consequently, it may be suggested that the genetic diversity of OsSAT genes decreased as a result of purifying selection. The predicted 3D model of Arabidopsis and rice SAT proteins R value, the ratio of transitions to transversions, was estimated for DNA sequence evolution and phylogeny reconstruction [46]. In any genome, transitions (T↔C and A↔G) are observed at higher frequencies than transversions (T↔A, T↔G, C↔A, and C↔G). In this study, the estimated transition/transversion bias (R) was 0.71, indicating genetic variation. Consistent with this result, the GC contents of OsSATs also showed considerable variation. Genomic DNA base composition (GC content) affects gene function and the adaptation of a species to its environment, and it may play roles in complex gene regulation [47]. Consequently, it can be concluded that the action of purifying selection may increase the specificity and selectivity of SATs in rice metabolism, leading to variations in G+C content. In addition, we found that segmental and tandem duplications are a driving force of OsSAT evolution. It is known that gene duplication is one type of genomic change that can lead to evolutionary change.
Duplicated genes can contribute to the evolution of novel functions, including adaptation to stress, induction of disease resistance, production of floral structures [48], and expansion of gene families [49]. Consistent with the findings of this study, the SAT2;1/SAT2;2 and SAT3;1/SAT3;2 isoforms in Arabidopsis were also reported to be duplicated gene pairs [50]. Although all SATs contain the serine acetyltransferase N-terminal domain structure (SATase_N, PF06426) and the bacterial transferase hexapeptide (PF00132) as their common motifs, there are still variations in motif structures. The presence of protein motifs may play important roles in protein function. The motifs in the active sites of proteins are well conserved [51], suggesting that these variations in motif structures may be connected with the functional diversity of SATs in plants.
Homologous proteins from different organisms can be recognized by sequence comparison because amino acid substitutions at particular positions are prevented by strong selective constraints [52]. The structure of proteins affects their functions [9]. SAT proteins showed structural variations, indicating functional flexibility. Therefore, SATs may be involved in more specific processes on the same pathway.
The similarities in expression among OsSATs may originate from conserved residues in their active sites that affect their functions. As stated earlier, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) are the conserved residues in the active sites of OsSATs. In soybean, His169, Asp154, His18, Arg203 and His204, Lys230, and Arg253 were identified as active residues involved in functions such as catalysis, the oxyanion reaction intermediate, serine binding, and CoA binding, according to crystal structures and analysis of site-directed mutation data [53]. In this study, similar residues were identified in the predicted active sites of OsSAT proteins.
Expression patterns of OsSATs differed according to tissues, organs, and treatments. This result is supported by the co-expression maps of OsSAT genes, which show their involvement in various metabolic pathways and their association with gene families with different functions. When the expression levels of OsSATs across organs are taken into consideration, OsSAT2;1 and OsSAT2;2 showed more dynamic profiles. All SATs except OsSAT1;1 were expressed in leaf blade. OsSAT1;1 and OsSAT2;1 were particularly induced in roots. JA is the only hormone activating the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2. In addition, OsSAT1;1 was the only OsSAT induced by the auxin, JA, and ABA hormones. The wet-lab expression of OsSATs under different salt treatments in this study showed that OsSAT1;1, OsSAT1;2, and OsSAT1;3 were generally upregulated depending on exposure time. OsSAT1;2 showed the highest expression levels when exposed to the 3-, 12-, and 24-h salt treatments. Conversely, OsSAT2;2, OsSAT3, and particularly OsSAT2;1 were downregulated depending on exposure time. Watanabe et al. [50] stated that three cytosolic isoforms in Arabidopsis, SAT1;1, SAT3;1, and SAT3;2, contribute to seed development and that the SAT gene family plays essential roles in plant survival. Also, the SAT3;1 and SAT3;2 isoforms of Arabidopsis play roles in plant development. In Arabidopsis, AtSAT2 and AtSAT4 are transcribed 10-100 times less than the major expressed SAT isoenzymes such as AtSAT1, 3, and 5 [9,54]. The varied expression patterns of OsSATs under various conditions show that they are dynamically regulated and that OsSAT1;1 is the only gene induced by the ABA, JA, and auxin hormones. Interestingly, Wang et al. [55] reported that ABA stimulates JA synthesis in rice seedlings through the SAPK10 (osmotic stress/ABA-activated protein kinase) bZIP72-AOC pathway. Conversely, JA is also reported to activate ABA, ethylene, and salicylic acid.
JA is known to increase the salt tolerance of plants. Exogenous JA application also increases ABA and proline contents in cells, along with antioxidant enzyme activity, rendering them more tolerant to salt stress [56]. Similarly, auxin, ethylene, salicylic acid, brassinosteroids, and gibberellin are also triggered by JA [57]. Consequently, the findings showed that SAT enzymes (EC 2.3.1.30) in rice may be regulated by cross-talk of the ABA, JA, and auxin hormones.

Conclusion
Sulfur is a macronutrient necessary for plant growth and development. Cys, one of the sulfur-containing amino acids, is used in biological systems to synthesize sulfur-containing compounds. The SAT gene family plays specific roles in plant metabolism, particularly in the sulfur assimilation pathway. In this study, a total of six OsSAT genes were identified in the rice genome, and variations in gene and protein structures were identified using bioinformatics approaches. It was found that purifying selection, along with segmental and tandem duplications, led OsSATs to have more specific and selective roles in metabolic pathways, which may affect the responses of the rice plant to abiotic and biotic stress conditions. SATs appear to be induced by cross-talk of several hormones. More explicitly, JA induced the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2, whereas auxin and ABA induced only OsSAT1;1. OsSAT1;1 was the only gene induced by ABA, JA, and auxin in the digital expression analysis. As is known, salt stress also activates ABA synthesis and ABA-related gene expression. The wet-lab data in the present study showed that OsSAT1;1, OsSAT1;2, and OsSAT1;3 were upregulated when exposed to different durations of salt stress. Consequently, OsSAT1;1 may be a hub gene regulated by cross-talk of the mentioned hormones. Interestingly, OsSAT1;1 is the only SAT gene that is lowly expressed in leaf blade but highly expressed in root cells. The cross-talk of hormones and the structural variations of OsSAT proteins may also explain the different responses of OsSATs to salt stress. In this regard, the effect of hormone cross-talk and the fine-tuning of OsSAT expression under abiotic stress conditions may be considered as research topics for future studies.