Bioinformatics Analyses of Serine Acetyltransferase (SAT) Gene Family in Rice (Oryza sativa) and their Expressions under Salt Stress

The assimilation of sulfur into cysteine requires serine acetyltransferase (SAT). In this study, SAT genes in rice (Oryza sativa) were identified and analyzed using bioinformatics approaches, and their expression was tested under salt stress. OsSATs share two common motifs, bacterial transferase hexapeptide and acetyltransferase, and have undergone purifying selection. They have more similar protein sequences compared to Arabidopsis SATs. However, there is structural and functional divergence among OsSATs, which may be driven by segmental and tandem duplications. Purifying selection and gene duplications may also have contributed to variation in the specificity and selectivity of OsSATs. In this regard, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) were identified as well-conserved residues in their active sites, which are indicative of their functions. The expression of OsSATs in different tissues and organs and under hormone treatments showed that jasmonic acid was the main hormone inducing the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2, whereas auxin and abscisic acid triggered only OsSAT1;1 expression. In addition, the wet-lab expression analysis in this study indicated that OsSAT1;1, OsSAT1;2, and OsSAT1;3 were upregulated under different exposure times of salt stress. OsSAT1;1 is the only OsSAT induced by various stimuli. The findings can be used by plant breeders and genetic engineers to develop new rice varieties with optimal growth and stress tolerance.


Introduction
Sulfur (S) is a vital macronutrient for plants and is involved in many biochemical reactions. S is the fourth most important plant nutrient after N, P, and K. It is found in amino acids (cysteine and methionine), vitamins and cofactors, glutathione, and phytochelatins [1]. Plants take up sulfate (SO₄²⁻) and reduce it to sulfide (S²⁻). Sulfide is used to incorporate S into cysteine (Cys) in the cytosol. Cys biosynthesis is a basic process because Cys serves as a precursor or donor of key S compounds in plants. While Cys synthesis is observed in the cytosol, plastids, and mitochondria, the sulfate reduction pathway is localized in plastids [2,3]. Cys synthesis requires two enzymes: serine acetyltransferase (SAT or SERAT) and O-acetylserine(thiol)lyase (OASTL). The former synthesizes the intermediate O-acetylserine (OAS) from acetyl-CoA and serine. The latter uses sulfide and OAS to produce Cys, with pyridoxal-5'-phosphate as a cofactor. Together, these two enzymes form the hetero-oligomeric cysteine synthase complex (CSC) [4][5][6].
The ancestral SAT gene was of host origin and did not evolve from the cyanobacterial endosymbiont [7]. A total of five serine acetyltransferases (SAT or SERAT; EC 2.3.1.30) were identified in Arabidopsis thaliana [8]. AtSAT1, AtSAT3, and AtSAT5 are localized in plastids, mitochondria, and cytoplasm, respectively [2]. Additionally, the mitochondrion was identified as the most important compartment in Arabidopsis for the synthesis of O-acetylserine (OAS), the precursor of Cys [3]. Similarly, SAT activity in pea (Pisum sativum) leaves is highest in mitochondria, followed by chloroplasts and cytoplasm [2]. The other two AtSATs, AtSAT2 and AtSAT4, are localized in the cytoplasm, but their protein sequences differ from those of other SATs and they are expressed at low levels [9]. Overexpression of AtSAT1 in maize does not negatively affect plant growth, and it enhances the 10-kDa γ-zein storage protein during endosperm development [10].
In this study, rice (Oryza sativa) SAT (OsSAT) genes/proteins were identified at genome-wide scale, and bioinformatics analyses (sequence, motif, phylogenetic, expression, and protein modeling) were then performed to reveal the structural and functional divergence of SATs in rice. In addition, the responses of OsSAT genes to salt stress were tested to offer more insight into their regulation.

Nucleotide and phylogenetic analyses
The maximum likelihood estimate of the transition/transversion bias (R) was obtained under the Kimura two-parameter model [17]. The transition/transversion bias and G+C content were calculated using DnaSP v6.12.01 [18]. Molecular phylogenetic analysis was performed with the maximum likelihood (ML) method using the JTT matrix-based model [19]. Initial tree(s) for the heuristic search were obtained automatically by applying the Neighbor-Join and BioNJ algorithms to a matrix of pairwise distances estimated using a JTT model, and the topology with the superior log likelihood value was then selected. The analysis involved 11 amino acid sequences with 266 positions after all positions containing gaps and missing data were eliminated. Evolutionary analyses were conducted in MEGA7 [20].
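To make the notion of transitions and transversions concrete, the following minimal sketch counts both classes of substitution between two aligned nucleotide sequences. It is only an illustration: the function name and the toy sequences are hypothetical, and a raw count ratio is not the same quantity as the maximum likelihood estimate of R produced by the software cited above.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a, seq_b):
    """Count transitions and transversions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = 0
    transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        # Skip identical sites, gaps, and ambiguous bases.
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        # A substitution within purines or within pyrimidines is a transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy aligned sequences (hypothetical, not taken from the OsSAT data).
ts, tv = count_ts_tv("ATGCCGTA", "ATACTGTT")
print(ts, tv)
```

Gapped and ambiguous positions are skipped here, which mirrors the complete-deletion treatment described for the phylogenetic analysis but is an assumption for this sketch.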

Selection, gene duplication and synteny analyses
Nucleotide sequences of rice SAT genes were analyzed using DnaSP v6.12.01 [18] in terms of polymorphic sites, DNA polymorphism, genetic variation level (π and θ), and Tajima's D [21]. Ka and Ks values were calculated for duplicated gene pairs using DnaSP v6.12.01 [18]. Gene duplication analyses were conducted according to the following two criteria: if the alignment of the coding nucleotide sequences covers 70% of the longer gene and the amino acid identity between the sequences is ≥70%, the genes of interest are assumed to be duplicated [22].
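The two duplication criteria can be expressed as a simple check. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the coverage threshold is treated as "at least 70%", and the example numbers are invented for illustration.

```python
def is_duplicated_pair(aligned_len, cds_len_a, cds_len_b, aa_identity,
                       min_coverage=0.70, min_identity=0.70):
    """Apply the coverage and identity criteria to one candidate gene pair.

    aligned_len  -- length of the aligned coding region (nucleotides)
    cds_len_a/b  -- coding sequence lengths of the two genes (nucleotides)
    aa_identity  -- amino acid identity between the proteins, as a fraction
    """
    longer = max(cds_len_a, cds_len_b)
    coverage = aligned_len / longer  # fraction of the longer gene covered
    return coverage >= min_coverage and aa_identity >= min_identity


# Hypothetical pair: 900 nt aligned, genes of 1000 and 1100 nt, 91% identity.
print(is_duplicated_pair(900, 1000, 1100, 0.91))
```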

Plant material and growth conditions
The salt-tolerant local Karacadag variety was used as plant material. Rice seeds were sown in 10-cm-square pots containing 40% perlite and 60% peat. The

RNA isolation and gene expression analysis
RNA was isolated using the TRIzol reagent method (Ambion, Ref: 15596026). The quality and quantity of the isolated RNA were assessed using a 2% agarose gel and a nano-drop spectrophotometer (Maestrogen, MN-913). A total of 12 samples (i.e. three biological replicates from each sample) were used in real-time qPCR studies. Genomic DNA was eliminated from these 12 samples following the protocol of the DNase I kit (Cat: EN0521). RNAs were then converted to cDNA according to the kit's protocol. qPCR was performed with the QIAGEN Rotor-Gene Q 5-Plex instrument (Bustin et al., 2009). OsSAT expression was assayed using the AMPIGENE qPCR Green Mix (Cat No: ENZ-NUC104-10000) commercial kit. The OsActinII gene, identified for salt stress in rice, was used as the reference gene [33].
The relative quantification 2^(-ΔΔCT) method (Livak and Schmittgen 2001) was used to compare gene expression. Gene expression data were analyzed with Rotor-Gene Q 2.3.5 software. Primers used in the study (Table 1) were designed using Primer3.v. 0.
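The 2^(-ΔΔCT) calculation can be sketched as follows. The function name and the CT values are hypothetical and serve only to show the arithmetic: the target gene is first normalized to the reference gene within each condition (ΔCT), and the treated condition is then compared with the control (ΔΔCT).

```python
from statistics import mean


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Return the fold change of a target gene by the 2^(-ddCT) method.

    Each argument is a list of CT values from biological replicates.
    """
    # Normalize the target gene to the reference gene in each condition.
    d_ct_treated = mean(ct_target_treated) - mean(ct_ref_treated)
    d_ct_control = mean(ct_target_control) - mean(ct_ref_control)
    # Compare the treated condition with the control.
    dd_ct = d_ct_treated - d_ct_control
    return 2 ** (-dd_ct)


# Invented CT values: the target amplifies two cycles earlier after treatment.
fold = relative_expression([24.0, 24.0, 24.0], [18.0, 18.0, 18.0],
                           [26.0, 26.0, 26.0], [18.0, 18.0, 18.0])
print(fold)  # 4.0, i.e. a four-fold upregulation
```

Averaging CT values before subtraction is one common convention; computing ΔCT per replicate and averaging afterwards is an equally valid choice and gives the same result when replicates are paired.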

Genome-wide identi cation and sequence analyses of SATs
A total of six non-redundant SAT genes were identified in the rice genome using five AtSAT protein sequences as references: LOC_Os01g52260 (OsSAT1;1), LOC_Os02g10830 (OsSAT1;2), LOC_Os03g04140 (OsSAT3), LOC_Os03g08660 (OsSAT2;1), LOC_Os03g10050 (OsSAT2;2), and LOC_Os05g45710 (OsSAT1;3) (Table 2). The protein lengths of the SATs ranged from 298 to 391 amino acid residues, and their molecular weights were between 30.44 and 42.72 kDa. All SATs are predicted to be acidic (pI≤7) except for AtSAT3. Exon numbers range from one to 10. AtSAT2 and AtSAT4 (SAT3 family) and OsSAT3 have 10 exons. All SAT proteins in the study contain the serine acetyltransferase N-terminal domain structure (SATase_N, PF06426). In addition, the bacterial transferase hexapeptide (PF00132) domain was identified as one or two repeats. To provide more insight into the protein sequence diversity of SATs, a sequence identity matrix was constructed (Table 3). The lowest identity value, 0.325, was found between AtSAT2 and AtSAT3, whilst the highest value, 0.914, was found between OsSAT2;1 and OsSAT2;2. The mean identity value of the SAT proteins was 0.476.
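A sequence identity matrix of this kind can be sketched from an existing multiple alignment. The tool and the exact identity definition used for Table 3 are not stated here, so the sketch assumes identity is the fraction of identical residues over columns where neither sequence has a gap; the function names and toy sequences are hypothetical.

```python
def pairwise_identity(seq_a, seq_b):
    """Fraction of identical residues over ungapped aligned columns."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # ignore columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    return matches / compared if compared else 0.0


def identity_matrix(aligned):
    """Build a {(name_1, name_2): identity} mapping for all sequence pairs."""
    names = list(aligned)
    return {(p, q): pairwise_identity(aligned[p], aligned[q])
            for p in names for q in names}


# Toy alignment (hypothetical sequences, not the real SAT proteins).
toy = {"seqA": "MKT-L", "seqB": "MKSAL", "seqC": "MKT-L"}
matrix = identity_matrix(toy)
print(matrix[("seqA", "seqB")])  # 0.75
```

Other definitions divide by the full alignment length or by the shorter sequence length, so values from different tools are not directly comparable.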

Conserved motif analysis
Regarding conserved motifs, 10 conserved motifs were detected (Table 4 and Fig. 1). Motif 3 was related to PF00132 (bacterial transferase hexapeptide). Motifs 4 and 5 were related to the PF06426 (serine acetyltransferase) domain structure. Motifs 1, 2, 3, 4, and 5 were present in all SATs, whereas the other motifs were found in different numbers.

Nucleotide and phylogenetic analyses
The R value, the ratio of transitions to transversions, was estimated to provide insight into DNA sequence evolution and phylogeny reconstruction. In addition, the G+C contents of OsSATs were calculated to predict probable functional variations of the genes, which are important for an organism's adaptation to its environment. The estimated transition/transversion bias (R) was 0.71, indicating genetic variations. The G+C contents were 71.82%, 75.25%, 51.92%, 72.63%, 72.47%, and 68.68% for OsSAT1;1, OsSAT1;2, OsSAT3, OsSAT2;1, OsSAT2;2, and OsSAT1;3, respectively.
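For reference, G+C content is simply the percentage of G and C bases among the unambiguous bases of a sequence. The sketch below uses a hypothetical function name and a toy sequence; it does not reproduce the values reported for the OsSAT genes.

```python
def gc_content(seq):
    """Return the G+C percentage of a nucleotide sequence.

    Gaps and ambiguous symbols (anything other than A, C, G, T) are ignored.
    """
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Toy sequence (hypothetical): 4 of 6 bases are G or C.
print(round(gc_content("ATGCGC"), 2))  # 66.67
```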
Phylogenetic analysis indicated that AtSATs and OsSATs split into two major clades (Group A and B). Group A was further divided into two subclades (Fig. 2).

Selection, gene duplication and synteny analyses
Nucleotide variations of the six OsSAT genes were analyzed using two selection analyses: Tajima's D and Ka/Ks tests. In Tajima's D test, the number of polymorphic (segregating) sites in OsSAT genes was 516, of which 47.1% (243/516) were singleton variable sites and 52.9% (273/516) were parsimony informative sites. A total of 306 sites were invariable (monomorphic), and nucleotide diversity was 0.33 and 0.27 for the π and θ parameters, respectively. Tajima's D was found 1.22, indicating purifying (negative) selection.
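The relationship between the reported quantities can be illustrated with the standard Tajima's D formula. This is a sketch, not the DnaSP implementation: the function name is hypothetical, the total number of sites is assumed to be the sum of the segregating and invariable sites given above (516 + 306), and the per-site π is converted to a mean number of pairwise differences by multiplying by that total. Because π is rounded to two decimals, the output should be close to, but need not equal, the reported value.

```python
from math import sqrt


def tajimas_d(n, seg_sites, mean_pairwise_diff):
    """Tajima's D from sample size, segregating sites, and mean pairwise differences."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its
    # estimated standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (mean_pairwise_diff - seg_sites / a1) / sqrt(variance)


total_sites = 516 + 306           # assumption: segregating + invariable sites
k = 0.33 * total_sites            # per-site pi scaled to absolute differences
d = tajimas_d(n=6, seg_sites=516, mean_pairwise_diff=k)
print(round(d, 2))
```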
In the second selection test, the nonsynonymous (Ka) and synonymous (Ks) substitution rates between the duplicated gene pairs were calculated. Ka/Ks values for all OsSAT genes were less than one, supporting the previous finding that OsSAT genes were subjected to purifying selection. Moreover, gene duplication analyses indicated that three segmental duplications and one tandem duplication occurred, suggesting that these duplications are the major force for SAT gene expansion (Table 5). To understand the gene duplication dynamics of OsSAT genes, four comparative syntenic maps of rice, associated with four representative species, were generated (Figure 3). Six OsSAT genes had a syntenic relationship with five genes in Arabidopsis, followed by four genes in B. distachyon, four genes in tomato, and three genes in maize.
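The usual reading of the Ka/Ks ratio can be summarized in a few lines. The function name, the tolerance, and the example rates below are hypothetical; the values in Table 5 are not reproduced here.

```python
def selection_mode(ka, ks, tol=1e-9):
    """Classify a gene pair by its Ka/Ks ratio.

    Returns (ratio, label) where label is 'purifying' for Ka/Ks < 1,
    'neutral' for Ka/Ks == 1 (within tol), and 'positive' for Ka/Ks > 1.
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio")
    ratio = ka / ks
    if abs(ratio - 1.0) <= tol:
        label = "neutral"
    elif ratio < 1.0:
        label = "purifying"
    else:
        label = "positive"
    return ratio, label


# Invented substitution rates for one duplicated pair.
print(selection_mode(0.05, 0.50))
```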

Distinct expression pro les of OsSAT genes
To better understand the functions of OsSAT genes in cell metabolism, gene expression levels in different tissues and organs were displayed as a heatmap in Fig. 4. For digital expression analysis, data for five OsSAT genes were obtained from the Rice Expression Profile Database (RiceXPro); the exception was OsSAT1;2 (LOC_Os02g10830). In general, OsSAT genes were expressed at low levels in stem, inflorescence, anther, pistil, lemma, palea, and embryo tissues/organs. In particular, OsSAT2;1 and OsSAT2;2 were downregulated by about three-fold. These two genes were also downregulated in inflorescence and anther tissues. In leaf blade and sheath, OsSAT genes were commonly upregulated. OsSAT2;1 and OsSAT2;2 were upregulated about three-fold in leaf blade.
In the second step, the expression profiles of OsSAT genes were evaluated at five time points under six types of plant hormone application (abscisic acid, gibberellin, auxin, brassinosteroid, cytokinin, and jasmonic acid) (Fig. 5). Jasmonic acid treatment in particular caused upregulation of OsSAT genes. OsSAT1;1 was upregulated nearly four-fold by jasmonic acid, and its expression increase peaked at one hour of jasmonic acid treatment. OsSAT1;1 also responded to the auxin and abscisic acid treatments at the third hour of exposure, although these responses were not as strong as the response to jasmonic acid. OsSAT2;1 and OsSAT2;2 were upregulated at the first hour of exposure to jasmonic acid. Overall, the gene expression levels under hormone treatments indicated that OsSAT1;1, OsSAT2;1, and OsSAT2;2 were positively regulated by hormone treatments.

Co-expression analysis of OsSAT genes
The co-expression network of OsSAT genes was constructed using the RiceFREND database. The network showed that Os04g0577500, Os11g0524300, Os06g0167400, Os06g0690700, Os12g0641300, Os04g0488700, and Os07g0589000 were the seven first-neighbor genes co-expressed with OsSATs (Fig. 6). Os04g0488700 (similar to PHY3, AGC kinase) was co-expressed with OsSAT1;1. The AGC kinase family is one of seven kinase families and is conserved in all eukaryotic genomes. AGC kinases in plants play roles in the modulation of kinase activity by external stimuli [34]. Os12g0641300 (similar to Zn-dependent hydrolases of the beta-lactamase fold) was identified as a gene co-expressed with OsSAT1;3. Os07g0589000 (lateral organ boundaries, LOB domain containing protein) was co-expressed with OsSAT2;1 and OsSAT2;2. Lateral organ boundaries domain (LBD) proteins contain a lateral organ boundaries (LOB) domain and are key regulators of plant organ development, including photomorphogenesis, plant regeneration, and pollen development [35]. OsSAT3 was co-expressed with Os04g0577500 (TatD-related deoxyribonuclease family protein), Os11g0524300 (protein of unknown function DUF1001 family protein), Os06g0167400 (di-trans-poly-cis-decaprenylcistransferase family protein), and Os06g0690700 (similar to potential cadmium/zinc-transporting ATPase HMA1). TatD is a conserved protein found in all living organisms and participates in DNA fragmentation during apoptosis in eukaryotic cells [36]. Heavy metal pumps (P1B-ATPases) are vital for cellular heavy metal homeostasis. Arabidopsis thaliana contains eight P1B-ATPase genes, the heavy metal ATPases 1-8 (HMA1-HMA8) [37].

Secondary and tertiary structure analyses of OsSATs
According to the secondary structure analyses, there are structural variations among OsSAT proteins. The alpha helix, extended strand, beta turn, and random coil percentages (%) were 36.94-42.62, 17.46-21.19, 7.38-10.45, and 28.25-34.65, respectively (Supp. Table 1). The predicted 3D structures of OsSAT proteins (Fig. 8) were considered reliable because their Ramachandran values ranged from 96% to 99% in core and allowed regions.
Homologous proteins from different organisms can be recognized using sequence comparison because amino acid substitutions in particular positions are prevented by strong selective constraints [38]. These structural variations at the secondary and tertiary levels may be associated with the functional flexibility of SAT proteins.

Predicted active sites of OsSATs
The identification of catalytic residues is an indispensable step for understanding the functions of enzymes [39]. In this study, active site predictions of OsSATs were performed using the InterPro 74.0 server (Table 7). In particular, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) residues were conserved at different positions in all OsSATs; in contrast, some residues such as 248M (Met), 249Q (Gln), and 292A (Ala) were identified only in the LOC_Os03g04140 (OsSAT3) protein, suggesting functional divergence of the SAT3 protein in rice. In general, similar amino acid residues were present in the predicted active binding sites.

The expression of OsSATs under salt stress
In this study, the responses of OsSATs to 3-, 12-, and 24-h salt treatments were investigated (Fig. 9). OsSAT2;1, OsSAT2;2, and OsSAT3 were downregulated under all exposure times. OsSAT2;1 showed the lowest expression of all OsSATs under every salt treatment. OsSAT2;2 and OsSAT3, on the other hand, responded to the exposure times in a similar way. The expression of OsSAT1;1, OsSAT1;2, and OsSAT1;3 increased under the 3-, 12-, and 24-hour NaCl treatments. OsSAT1;2 was generally expressed at the highest level under all exposure times. Lastly, the responses of OsSAT1;2 and OsSAT1;3 were highest at the 24-h NaCl treatment. Overall, OsSAT1;1, OsSAT1;2, and OsSAT1;3 are responsive to different salt exposure times, and OsSAT1;2 and OsSAT1;3 were particularly upregulated by the 24-hour salt treatment.

Discussion
Sequence, nucleotide, and phylogenetic analyses of Arabidopsis and rice proteins showed that there are sequence and phylogenetic divergences among SATs (Kawashima et al. 2005). AtSATs and OsSATs were not completely separated from each other in the phylogenetic analysis because some of them share the same number of exons and relatively high identity scores. For example, the AtSAT2/4 and OsSAT3 genes have 10 exons and identity values above 50%, unlike the rest of the SATs. In terms of mean identity values, OsSATs have more similar protein sequences (0.541) than AtSATs (0.450). Nonetheless, the selection analyses of OsSAT genes showed that SATs are subjected to purifying selection. Purifying selection, also known as background selection, reduces genetic diversity and shapes it in natural populations [40]. Consequently, it may be suggested that the genetic diversity of OsSAT genes decreased as a result of purifying selection.
The R value, the ratio of transitions to transversions, was estimated for DNA sequence evolution and phylogeny reconstruction [41]. In any genome, transitions (T↔C and A↔G) are observed at higher frequencies than transversions (T↔A, T↔G, C↔A, and C↔G). In this study, the estimated transition/transversion bias (R) was 0.71, indicating genetic variations. Consistent with this result, the GC contents of OsSATs also showed considerable variation. Genomic DNA base composition (GC content) affects gene function and the adaptation of a species to its environment, and it may play roles in complex gene regulation [42]. Consequently, it can be concluded that the action of purifying selection may increase the specificity and selectivity of SATs in rice metabolism, leading to variations in G+C content. In addition, we found that segmental and tandem duplications are a driving force of OsSAT evolution. Gene duplication is known to be one type of genomic change that can lead to evolutionary change. Duplicated genes can contribute to the evolution of novel functions, including adaptation to stress, induction of disease resistance, production of floral structures [43], and expansion of gene families [44]. In line with the findings of this study, the SAT2;1/SAT2;2 and SAT3;1/SAT3;2 isoforms in Arabidopsis have also been reported to be duplicated gene pairs [45].
Although all SATs contain the serine acetyltransferase N-terminal domain structure (SATase_N, PF06426) and the bacterial transferase hexapeptide (PF00132) as their common motifs, there are still variations in motif structures. Protein motifs may play important roles in protein function. The motifs in the active sites of proteins are well conserved [46], suggesting that these variations in motif structures may be connected with the functional diversity of SATs in plants.
The expression patterns of OsSATs differed according to tissue, organ, and treatment. This result is supported by the co-expression maps of OsSAT genes, which show their involvement in various metabolic pathways and their association with gene families with different functions. When expression levels by organ are considered, OsSAT2;1 and OsSAT2;2 showed more dynamic profiles. Jasmonic acid is the only hormone activating the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2. In addition, OsSAT1;1 was the only OsSAT induced by auxin and abscisic acid. The salt treatments conducted in this study showed that OsSAT1;1, OsSAT1;2, and OsSAT1;3 were generally upregulated depending on salt exposure time, with OsSAT1;2 showing the highest expression levels under the 3-, 12-, and 24-hour salt treatments. Conversely, OsSAT2;2, OsSAT3, and particularly OsSAT2;1 were downregulated depending on salt exposure time. Watanabe et al. [45] stated that three cytosolic isoforms in Arabidopsis, SAT1;1, SAT3;1, and SAT3;2, contribute to seed development and that the SAT gene family plays essential roles in plant survival. The SAT3;1 and SAT3;2 isoforms of Arabidopsis also play roles in plant development. In Arabidopsis, AtSAT2 and AtSAT4 are transcribed 10-100 times less than the major expressed SAT isoenzymes such as AtSAT1, 3, and 5 [9,47]. All things considered, the varied expression patterns of OsSATs under various conditions show that they are dynamically regulated and that OsSAT1;1 is the only OsSAT induced by various stimuli.
The similarities in expression among OsSATs may originate from the conserved residues in their active sites, which affect their functions. As stated earlier, Asp (D), His (H), Gly (G), Thr (T), Arg (R), Ala (A), and Leu (L) are the conserved residues in the active sites of OsSATs. In soybean, His169, Asp154, His18, Arg203 and His204, Lys230, and Arg253 were identified as active residues involved in catalysis, the oxyanion reaction intermediate, serine binding, and CoA binding, according to crystal structures and analysis of site-directed mutation data [48]. In this study, similar residues were identified in the predicted active sites of OsSAT proteins.

Conclusion
A total of six OsSAT genes were identified in the rice genome, and variations in gene and protein structures were identified using bioinformatics approaches. Jasmonic acid induced the expression of OsSAT1;1, OsSAT2;1, and OsSAT2;2, whereas auxin and abscisic acid induced only OsSAT1;1. In addition, OsSAT1;1, OsSAT1;2, and OsSAT1;3 were upregulated under different exposure times of salt stress. OsSAT1;1 is the only OsSAT gene induced by various stimuli.
The SAT gene family plays specific roles in plant metabolism, particularly in the sulfur assimilation pathway, shaped by purifying selection, which has decreased its genetic diversity. In addition to purifying selection, segmental and tandem duplications may lead OsSATs to have more specific and selective roles in metabolic pathways, which may affect the plant's responses to abiotic and biotic stress conditions. Therefore, the findings can be used by plant breeders and genetic engineers to develop new rice varieties with optimal growth and stress tolerance.

Declarations Figures
Block diagram of conserved motifs in 11 SAT protein sequences from rice and Arabidopsis using the MEME server. Each color represents a distinct motif structure.
Phylogenetic tree of SAT protein family based on SAT amino acid sequences of rice and Arabidopsis.
Global OsSAT gene expression profile in response to plant hormones. The expression data were retrieved from the Rice Expression Profile Database (RiceXPro).
Co-expression networks of OsSAT genes (blue-circled). Each network was generated using data from the RiceFREND gene co-expression database.
plants were grown in a growth cabinet under the following cycle: an average humidity of 50%, with 25 °C during the 8-hour dark stage and 30 °C during the 16-hour light stage. Four to five plants were grown for each experimental group. The third true leaves of the rice plants were exposed to 200 mM NaCl, and leaf samples were collected at the 3rd, 12th, and 24th hour. Leaf samples were immediately stored in RNAlater stabilization solution (Invitrogen, Cat No: AM7021).

Figure 7 GO
Figure 7

Table 1
Primers used for RT-qPCR analysis of SAT genes.

Table 2
Properties of SAT members in Arabidopsis and rice.

Table 3
Sequence identity matrix of SATs in Arabidopsis and rice

Table 5
Segmental and tandem duplications of SAT paralogous pairs in the rice genome. Non-synonymous (Ka) and synonymous (Ks) indicate the substitution rates. Ka/Ks is the non-synonymous/synonymous substitution ratio.

Table 6
The 3D structure overlap (%) of rice, Arabidopsis, and soybean SATs using CLICK structure comparison server

Table 7
Predicted active sites of OsSAT proteins