Genome-wide analysis of gibberellin-dioxygenases genes in rice

Background Gibberellins (GAs), pivotal plant hormones, play fundamental roles in plants. In rice, gibberellin-dioxygenases (GAoxes), namely OsGA20ox, OsGA3ox and OsGA2ox, are involved in the biosynthesis and deactivation of gibberellins. However, a comprehensive genome-wide analysis of gibberellin-dioxygenase genes is still lacking.
Results In this study, a total of 95 candidate OsGAox genes were found and 19 OsGAox genes were further analyzed. The phylogenetic tree showed that the GAox genes in Arabidopsis and rice were divided into four subgroups and shared some common features. Furthermore, analysis of gene structure and conserved motifs revealed that splicing phase and motifs were well conserved during the evolution of GAox genes in Arabidopsis and rice, although some specific motifs still need further study. Expression profiles indicated that most OsGAox genes exhibited tissue-specific expression patterns, implying complex functions. Moreover, the expression patterns of these genes under GA3 and PAC treatment were investigated, and the results showed that some genes, OsGA2ox3, OsGA2ox5, OsGA2ox7, OsGA2ox9, OsGA20ox2 and OsGA20ox4, may play a major role in GA homeostasis in response to exogenous GA.
Conclusions Our study provides a comprehensive analysis of the OsGAox gene family. Splicing characteristics and conserved motifs indicate evolutionarily conserved functions in plants. Expression profiles indicate that each OsGAox gene has complex and specific functions. Although many GAoxes are involved in endogenous GA metabolism, only some of them responded to exogenous GA treatment, which provides useful information for researchers seeking to use chemical GAs to improve plant architecture and production.

The GA metabolism pathway has been extensively studied [8]. This pathway mainly involves three stages of reactions according to the reaction site and the enzyme activity. In the first stage, geranylgeranyl diphosphate (GGDP), a common C20 precursor for diterpenoids, is converted to the tetracyclic hydrocarbon intermediate ent-kaurene by two kinds of terpene synthases (TPSs) in the plastids, ent-copalyl diphosphate synthase (CPS) and ent-kaurene synthase (KS). In the second stage, GA12 and GA53 are synthesized from ent-kaurene by two types of cytochrome P450 monooxygenases (P450s) at the endoplasmic reticulum, ent-kaurene oxidase (KO) and ent-kaurenoic acid oxidase (KAO). Finally, bioactive GA synthesis is catalyzed in the cytosol by two kinds of soluble 2-oxoglutarate-dependent dioxygenases (2ODDs), GA 20-oxidase (GA20ox) and GA 3-oxidase (GA3ox), and the bioactive GAs or their immediate precursors are inactivated by a third 2ODD, GA 2-oxidase (GA2ox), which includes C19-GA2oxs and C20-GA2oxs [9].
The GA20ox, GA3ox and GA2ox belong to the 2-ODDs superfamily and are each encoded by a multigene family [10], the members of which have different expression patterns and thus regulate GA metabolism in different plant developmental processes. Some GAox gene family members have been cloned and identified, and their biological functions have also been studied in various plant species [11]. By manipulating the expression of GAox genes the levels of endogenous active GAs can be regulated in various plants. For example, the deficiency of a rice semi-dwarfing gene (sd-1/OsGA20ox2) known as the "green revolution gene" causes reduction in endogenous GAs, thus affecting the plant height in rice [12]. Overexpression of GA2ox genes in switchgrass is a feasible strategy to improve plant architecture and reduce biomass recalcitrance for biofuel [13]. MtGA2ox10 plays an important role in the rhizobial infection and the development of root nodules through fine catabolic tuning of GA in M. truncatula [14]. Therefore, it is extremely important to identify and exploit GAox genes in all kinds of plants.
Rice (Oryza sativa L.), one of the most important global food crops, is a primary source of food for over half of the world's population [15]. Historically, the first "green revolution (GR)", which resulted from the utilization of GAox genes and fertilizer, greatly increased crop production, especially in rice and wheat [3]. However, considering the deterioration of the environment and the growing population, food security is becoming one of the most important issues in the world. Improving rice production scientifically is an efficient strategy to address the problem. Meanwhile, grain yield in rice is a complex trait affected by multiple factors, and major progress in increasing rice yield depends on the development of high-yield varieties. Many studies have shown that GA not only regulates plant growth and development on its own, but also acts in coordination with other plant hormones. For example, crosstalk between GA, ABA and auxin can regulate cell expansion during organ growth in Gerbera hybrida [16]. Exogenous auxin represses seed germination in soybean by mediating ABA and GA biosynthesis [17]. Therefore, given the key role that GA plays in regulating rice growth and development, many high-yield varieties have been developed by regulating genes involved in the GA biosynthetic pathways in rice [18][19][20]. To better manipulate OsGAox genes to obtain high-yield varieties, further study of the OsGAox gene family is needed. Although some GAoxs in the GA pathway of rice have been studied previously, the results are mainly limited to gene identification and biological functions [7].
Here, we conducted a comprehensive genome-wide analysis of GA oxidases in rice and further studied 19 GAox genes, including analysis of their phylogenetic relationships, gene structures and conserved motifs. Furthermore, we investigated the gene expression profiles in different tissues and the expression patterns of these genes under GA3 and PAC treatment. These data suggested that OsGAox genes had specific expression patterns in various tissues and played different roles under GA treatment and PAC treatment in rice. This study will serve as a foundation for a comprehensive and in-depth study of OsGAox genes, so that plant architecture and production can be improved by manipulating these genes in the future.

Identification and analysis of the GAox family genes in rice
A previous study reported about 21 GAox genes in rice [21], but here we identified a total of 95 candidate OsGAox genes based on genome and transcriptome databases (Additional file 2: Table S1).
We analyzed these genes, and the results showed that the 95 candidate OsGAox genes included 19 of the 21 reported genes; the exceptions were GA2ox2 and GA2ox10 (Fig. 1a), which lacked the common domain DIOX_N (PF14226) according to the alignment results. Therefore, we further analyzed the 19 genes (Table 1).
To better understand the distribution of rice GAoxs on chromosomes, a map of the genes across chromosomes was created with Mapchart. The 19 genes were unevenly distributed on 7 chromosomes. Chromosome 1 and chromosome 4 each carried four GAox genes, and chromosome 5 carried five. Chromosome 3 and chromosome 7 each carried two GAox genes, while chromosome 2 and chromosome 8 each carried only one (Fig. 1b). The gene information, including names, entry ID, number of deduced amino acids, molecular weights, predicted subcellular localization, group classification and theoretical pI, is summarized in Table 1. The length of the identified OsGAox proteins ranged from 301 (GA20ox6) to 446 (GA20ox4) amino acids (aa), with an average of 363 aa. The molecular mass ranged from 32.10 to 47.63 kDa, and the pI ranged from 5.25 (GA20ox8) to 7.44 (GA2ox6). Most OsGAox proteins were predicted by WoLFPSORT [22] and TargetP [23] to be located in the nucleus and cytoplasm, which is consistent with previous studies [24].
Results also suggested that some of them can be transported into mitochondria or chloroplasts, implying that these organelles might also be involved in GA metabolism in plants.
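The chromosome distribution reported above can be checked with a short tally. The sketch below is illustrative only: the per-chromosome counts are copied from the text, and the variable names are our own. It confirms that the counts sum to the 19 analyzed genes on 7 chromosomes and lists which of the 12 rice chromosomes carry none of them.

```python
# Gene counts per chromosome as reported in the text (Fig. 1b).
genes_per_chromosome = {1: 4, 2: 1, 3: 2, 4: 4, 5: 5, 7: 2, 8: 1}

total_genes = sum(genes_per_chromosome.values())
n_chromosomes = len(genes_per_chromosome)

# Rice has 12 chromosomes; list those without any analyzed GAox gene.
empty_chromosomes = [c for c in range(1, 13) if c not in genes_per_chromosome]

print(total_genes, n_chromosomes, empty_chromosomes)
```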

Phylogenetic analysis of the OsGAox gene family
In Arabidopsis, sixteen GAox genes (seven GA2oxes, four GA3oxes and five GA20oxes) have been identified previously [9]. To determine the evolutionary relationships of GAoxes in rice and Arabidopsis, a phylogenetic tree was constructed using the neighbor-joining (NJ) method from alignments of the complete protein sequences of 16 AtGAoxes and 19 OsGAoxes (Fig. 2). The tree contained four distinct subgroups and also revealed that the representation of Arabidopsis and rice GAox proteins in each subgroup was quite different. Among the 35 proteins, 5 OsGAox and 5 AtGAox belonged to the C19GA2ox subfamily, 8 OsGAox and 5 AtGAox to the GA20ox subfamily, 2 OsGAox and 4 AtGAox to the GA3ox subfamily, and 4 OsGAox and 2 AtGAox to the C20GA2ox subfamily. The presence of two subgroups of putative GA2ox (C19 or C20 GA classes) was also confirmed by the split of C20GA2ox from C19GA2ox in the phylogenetic tree. The four subfamilies (GA20ox, GA3ox, C19GA2ox and C20GA2ox) were shared by both species, suggesting that they might be widespread in plant GA metabolism. GA20oxes were more diverse in rice, while GA3oxes were more diverse in Arabidopsis, which might contribute to differences in GA metabolism between monocots and dicots. Furthermore, GA20ox and GA2ox genes outnumbered GA3ox genes in both species, suggesting that GA20ox and GA2ox had undergone a more dynamic evolutionary route than GA3ox, which may have resulted in more functional redundancy. Overall, the GAox genes shared some common evolutionary characteristics in monocots and dicots, so studies in the two groups could inform and advance each other.
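The subfamily counts given above can be cross-checked against the species totals. The following is a minimal sketch, with the counts taken from the text and the helper names (`species_total`, `ga2ox_total`) introduced here purely for illustration.

```python
# Subfamily membership counts as stated in the text (Fig. 2).
SUBFAMILY_COUNTS = {
    "C19GA2ox": {"rice": 5, "arabidopsis": 5},
    "GA20ox": {"rice": 8, "arabidopsis": 5},
    "GA3ox": {"rice": 2, "arabidopsis": 4},
    "C20GA2ox": {"rice": 4, "arabidopsis": 2},
}


def species_total(species):
    """Total number of GAox proteins of one species across all subfamilies."""
    return sum(counts[species] for counts in SUBFAMILY_COUNTS.values())


def ga2ox_total(species):
    """Combined size of the two GA2ox subfamilies (C19 and C20)."""
    return sum(
        counts[species]
        for name, counts in SUBFAMILY_COUNTS.items()
        if name.endswith("GA2ox")
    )


for species in ("rice", "arabidopsis"):
    print(species, species_total(species), ga2ox_total(species))
```

The totals agree with the 19 rice and 16 Arabidopsis proteins used for the tree, and the seven Arabidopsis GA2oxes correspond to five C19GA2ox plus two C20GA2ox members.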

Gene structure and conserved motif analysis of GAox genes
To support the phylogenetic analysis, we performed gene structure analysis of GAox family members from Arabidopsis and rice. As shown in Figure 3, the number of exons was conserved, ranging from 1 to 3 exons in AtGAox and OsGAox genes. We also investigated intron phases with respect to codons.
The first intron was in most cases a phase 2 intron, meaning that splicing occurred after the second nucleotide of a codon. The second intron was generally a phase 0 intron, meaning that splicing occurred after the third nucleotide, that is, between two codons. This result revealed that the splicing phase was also highly conserved during the evolution of GAox genes in both Arabidopsis and rice. However, intron phases in rice were biased toward phase 0 compared to Arabidopsis, which is consistent with the view that ancient introns were predominantly of phase 0 and which may have influenced intron average length and the evolution of GA oxidase genes in rice through shuffling [25][26][27]. To characterize the structures of GAox proteins in rice, we further analyzed conserved motifs in detail using the MEME motif search tool. The sequence logos and E values of ten motifs are presented, with the motifs named 1 to 10 in order of E value (Fig 4a).
The results also showed that these conserved motifs were in different positions and had different widths, implying an expansion of functions during the evolutionary process in rice. Subsequently, we analyzed the distribution of these motifs in the proteins. Motifs 1, 2, 3, 4, 5, 6 and 8 were shared by most of the GAox proteins, while other motifs were absent in specific subfamilies (Fig 4b). In the C19GA2ox subfamily, motif 7 was absent in all the members except for GA2ox8, while motif 9 was found only in this subfamily, implying that motifs 7 and 9 may have special functions in the C19GA2ox subfamily (Fig 4b). Moreover, motif 10 was shared by all C20GA2ox and GA3ox proteins, implying conserved functions of C20GA2ox and GA3ox (Fig 4b). Differences among motif distributions might explain the sources of functional divergence in GA oxidases during evolution. Consequently, the detailed functions of these motifs need to be further explored.
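The intron-phase convention used in the gene structure analysis above can be illustrated with a short sketch. It assumes the standard definition, in which the phase of an intron is the number of coding nucleotides preceding it modulo 3 (phase 0 falls between codons, phase 2 after the second nucleotide of a codon). The exon lengths below are hypothetical and do not describe any particular OsGAox gene.

```python
def intron_phases(coding_exon_lengths):
    """Return the phase of each intron from the coding lengths of consecutive exons.

    The phase is the number of coding nucleotides upstream of the intron,
    modulo 3. There is no intron after the last exon.
    """
    phases = []
    coding_so_far = 0
    for length in coding_exon_lengths[:-1]:
        coding_so_far += length
        phases.append(coding_so_far % 3)
    return phases


# Hypothetical three-exon gene (coding lengths in nucleotides).
example_exons = [500, 322, 300]
print(intron_phases(example_exons))  # [2, 0]
```

In this made-up example the first intron is phase 2 and the second is phase 0, the same pattern the text reports for most GAox genes. A single-exon gene returns an empty list.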

Expression patterns of OsGAoxes
To provide clues for functional studies of OsGAoxes, we used FPKM values to represent their expression profiles in different tissues of rice, and some of them were validated by qRT-PCR in this study [28]. Because no corresponding probe was available for GA3ox2, we analyzed the expression patterns of the other 18 genes (Additional file 2: Table S2). Our results showed that almost all the GAox genes were expressed and displayed different expression levels in various tissues (Fig. 5). It is worth noting that some GAox genes, especially GA2ox3, GA2ox7, GA2ox8 and GA20ox6, were highly expressed in panicle. This result indicated that these four genes may play key roles in panicle development. Furthermore, both GA2ox7 and GA20ox6 were highly expressed in all tissues, suggesting that they might play an important role throughout development in rice. In contrast, GA20ox8 was expressed at a relatively low level in all tissues, implying that it might be functionally redundant with other genes. Based on the above results, we next verified the expression patterns of these five genes by qRT-PCR. These 5 genes exhibited the same expression patterns as in the RNA-seq results (Additional file 1: Figure S1). Overall, these results showed that each GAox gene had a specific expression pattern in various tissues. Therefore, the potential functions of these genes in different developmental stages in rice need to be studied in the future.

In this study, to explore how these 19 GAoxes respond to exogenous GA, we treated 2-week-old rice seedlings with GA3 and PAC (paclobutrazol, a biosynthetic inhibitor of endogenous gibberellin) and analyzed the expression profiles of all 19 OsGAox genes by qRT-PCR. As expected, the seedlings treated with GA3 were taller than the non-treated seedlings, and the seedlings treated with PAC were shorter than the non-treated seedlings (Additional file 1: Figure S2). This result showed that GAs indeed promoted plant growth. We further investigated the 19 OsGAox genes to gain insight into the response. Only six of them, GA2ox3, GA2ox5, GA2ox7, GA2ox9, GA20ox2 and GA20ox4, exhibited obviously altered expression under GA3 and PAC treatment (Fig. 6a). The expression of GA2ox3, GA2ox5, GA2ox7 and GA2ox9 was dramatically increased under GA3 treatment and reduced under PAC treatment, indicating that most genes of the GA2ox subfamily may be involved in GA deactivation in response to exogenous GA. Interestingly, only two genes of the GA20ox and GA3ox subfamilies, GA20ox2 and GA20ox4, responded strongly to PAC treatment, suggesting that perhaps only a few of the genes involved in bioactive GA synthesis play roles in regulating GA levels under GA treatment.
However, apart from these six genes, the other genes showed no obvious up-regulation or down-regulation (Fig. 6b). We speculated that this result may be specific to our experimental material. Overall, these results indicated that GA had a great effect on the height of rice, and that not all OsGAox genes were involved in gibberellin homeostasis. Some genes are the main regulatory genes, whereas genes not involved in gibberellin homeostasis may play a role in other aspects of rice development. Once the function of a given gene has been determined, our results can be used to up-regulate or down-regulate that gene by GA4 or PAC treatment to control the growth of rice.

GAoxes display both conserved and diverse characters in plants
The 2-ODDs superfamily is a large family that has been identified in many land plants, especially in crops such as cucumber, soybean and rice [21,29]. Previous research identified 21 GAox genes in rice [21], but few of them were well characterized. Here, we identified 95 candidate GAox genes in the rice genome and further analyzed 19 OsGAox genes containing both the 2OG-FeII_Oxy (PF03171) and DIOX_N (PF14226) domains. Apart from the 19 genes reported in this study, the functions of the proteins encoded by the remaining genes still need to be studied in the future. Our analyses included phylogenetic tree construction, analysis of gene structure and conserved motifs, investigation of expression patterns and exploration of their function in regulating the level of bioactive GA. Our results showed that although these OsGAox genes shared some common characteristics, the diversity of their expression could contribute to the homeostasis of GA in rice.
In our evolutionary analysis, the GAox protein families in Arabidopsis and rice could both be divided into four subfamilies based on their protein sequences. Phylogenetic analysis also revealed distinct differences between the two species, such as the number of members in each subfamily. Some studies have reported another subfamily, GA7ox, in some species [29,30]. GA7ox, which oxidizes GA12-aldehyde to GA12 and possesses mono-oxygenase 7-oxidase activity, was reported in pumpkin and cucumber but has not been found in other species. So far, although the three subfamilies GA2ox, GA3ox and GA20ox have been found in some plant species, GAox genes, including GA7ox or other subfamilies, still need to be identified in other plant species. This would help us to better understand the evolutionary relationships of GAox genes and to utilize them in future agricultural production.
In this study, gene structure analysis revealed that OsGAox and AtGAox gene structures were conserved and that the ancient introns were predominantly of phase 0, which may favor intron average length and the evolution of GA oxidase genes in rice through shuffling. Conserved motif analysis of the OsGAox proteins revealed that most motifs were present in all OsGAox proteins, while a few motifs were found only in a certain subfamily. To better understand the functions of GAox genes, the biological functions of these special motifs need to be characterized.

The orchestrated GA homeostasis is quite complicated
The expression patterns of 19 OsGAox genes were investigated, and the data showed that expression levels varied greatly. Four genes, GA2ox3, GA2ox7, GA2ox8 and GA20ox6, were highly expressed in panicle, indicating that they may play a key role in panicle development. GA20ox6 and GA20ox8 had high and low expression levels in all tissues, respectively, suggesting that they may play distinct roles in different developmental stages of rice. In general, gene expression profiles provide fundamental insights into the biological functions of genes. The GA20ox6 gene, for example, is essential for reproductive development in rice, including anther dehiscence, pollen fertility and seed initiation, which is consistent with its expression pattern [31]. Comprehensive expression analysis of Arabidopsis GA2-oxidase genes provides a valuable resource for further elucidating the roles of GA2ox genes during different stages of development [32]. Therefore, the different expression patterns of certain OsGAox genes in rice indicated that they might play important roles in plant development and have unique functions in specific developmental stages.
Therefore, genome-wide analysis of the gibberellin dioxygenase genes in rice will benefit research on the regulation of GA homeostasis.
The levels of bioactive GAs in plants are maintained via feedback and feedforward regulation of GA metabolism, and the expression levels of some GAoxes change in response to exogenous GA3 [33]. This feedback regulation has also been demonstrated in Arabidopsis. For example, AtGA3ox1 of the AtGA3ox family and some genes of the AtGA20ox family are under GA-negative feedback regulation [34,35]. At the same time, the expression of some genes of the AtGA2ox family is up-regulated by GA3 treatment [36]. To figure out how OsGAox genes orchestrate GA metabolism, we examined changes in OsGAox gene expression in rice seedlings under GA3 and PAC treatment. Notably, GA2ox3, GA2ox5, GA2ox7, GA2ox9, GA20ox2 and GA20ox4 showed distinct changes under these treatments. However, the other genes presented unexpected expression patterns, which greatly broaden our understanding of GA feedback and feedforward regulation. Since chemical GA is widely used in rice production, how the OsGAox genes combine and coordinate to regulate the bioactive GA level is an important question to address in the future for application. Taken together, considering the feasibility of genetic improvement of rice yield by manipulating the expression of GAox genes [37], our results can be used to up-regulate or down-regulate a gene by GA4 or PAC treatment to control the growth of rice once the function of that gene has been determined.

Conclusions
In this study, we comprehensively analyzed 19 GAox genes in rice, which can be divided into four subgroups according to the phylogenetic tree. Gene structure and conserved motif analysis showed that most GAox genes were conserved between two model plants, the dicot Arabidopsis and the monocot rice. We also analyzed their expression profiles in different tissues in rice, and the results suggested that various GAox genes played different roles in rice developmental stages, which lays a foundation for exploring the functions of these genes. In addition, the expression patterns of these genes under GA3 and PAC treatment were investigated, and the data showed that six genes were mainly involved in regulating GA homeostasis to cope with exogenous GA. Taken together, our data will provide insight for further study of GAox genes in rice and a reference for exploiting particular OsGAox genes to improve plant architecture and rice production.

Plant materials and treatments
Rice (Oryza sativa L. japonica cv. Nipponbare) seeds from our lab were grown in sterile water in containers. After 14 days, the seedlings were transferred into a paddy at Wuhan University under natural conditions. Plant materials for qRT-PCR were roots, stems, leaves and panicles at different developmental stages. For the GA3 and PAC treatment, the seeds were dehusked, sterilized with 3% NaClO solution and washed three times with sterile distilled water.

Sequence retrieval and identification and analysis of GAox genes in rice
All sequences were downloaded from two databases, RGAP (Rice Genome Annotation Project Database, http://www.rice.plantbiology.msu.edu/) and TAIR (the Arabidopsis Information Resource, http://www.arabidopsis.org/) [38], as described in a previous report [19]. Then, using previously cloned rice GAox family genes as query sequences, Blastp searches were performed and the two domains, 2OG-FeII_Oxy (PF03171) and DIOX_N (PF14226), were extracted [39]. Information on the 95 candidate OsGAox genes, including putative functions, is listed in Additional file 2: Table S1. Finally, Pfam (http://pfam.sanger.ac.uk/search) and SMART (http://smart.embl-heidelberg.de/) were employed to confirm the presence of both the 2OG-FeII_Oxy (PF03171) and DIOX_N (PF14226) domains in all identified OsGAox proteins. The map of the 19 genes across chromosomes was created with Mapchart software.
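The two-domain criterion described above amounts to a simple set-membership filter. The sketch below is only an illustration of that filtering logic, not the authors' pipeline: the candidate names and their domain hits are hypothetical placeholders, and `PF99999` is a made-up accession standing in for any unrelated domain. Only the two required Pfam accessions come from the text.

```python
# The two Pfam domains required in the text: 2OG-FeII_Oxy and DIOX_N.
REQUIRED_DOMAINS = frozenset({"PF03171", "PF14226"})


def has_both_domains(domain_hits):
    """True if a protein's Pfam hits include both required domains."""
    return REQUIRED_DOMAINS <= set(domain_hits)


# Hypothetical candidates and domain hits (placeholders, not real annotations).
candidates = {
    "candidate_A": {"PF03171", "PF14226"},
    "candidate_B": {"PF03171"},                        # lacks DIOX_N
    "candidate_C": {"PF14226", "PF03171", "PF99999"},  # extra domain is fine
    "candidate_D": set(),
}

retained = sorted(name for name, hits in candidates.items() if has_both_domains(hits))
print(retained)  # ['candidate_A', 'candidate_C']
```

A candidate such as `candidate_B`, which carries 2OG-FeII_Oxy but not DIOX_N, is dropped. This mirrors the reason given in the text for excluding GA2ox2 and GA2ox10.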

Analysis of phylogenetic relationship
Multiple sequence alignments of 19 GAox proteins from rice and 16 GAox proteins from Arabidopsis were performed using the ClustalX program with default parameters [40]. A phylogenetic tree was constructed using the neighbor-joining (NJ) method with 1000 bootstrap replicates in MEGA6.0 [41]. For gene structure analysis, the genomic DNA sequences were aligned with the corresponding cDNA sequences from the RGAP and RAP-DB databases, and the exon and intron structures of individual OsGAox genes were displayed via the Gene Structure Display Server (GSDS; http://gsds.cbi.pku.edu.cn/) [42]. Conserved motif analysis was performed by entering the full-length amino acid sequences of OsGAox proteins into the MEME analysis tool with the maximum number of motifs set to 10 [43].
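Neighbor-joining operates on a pairwise distance matrix computed from the alignment. The sketch below shows one common distance, the p-distance (the proportion of differing sites among positions where neither sequence has a gap). This is an assumption made for illustration: the text does not state which substitution model or gap treatment was used in MEGA6.0, and the aligned sequences here are short invented strings, not GAox proteins.

```python
from itertools import combinations


def p_distance(seq_a, seq_b, gap="-"):
    """Proportion of differing sites, skipping positions with a gap in either sequence."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return differing / compared


# Hypothetical aligned protein fragments (illustrative only).
aligned = {
    "seqA": "MKT-LLVA",
    "seqB": "MKTALLVG",
    "seqC": "MRT-LIVA",
}

distances = {
    (m, n): p_distance(aligned[m], aligned[n]) for m, n in combinations(aligned, 2)
}
for pair, dist in distances.items():
    print(pair, round(dist, 3))
```

A matrix of this kind is the input that an NJ implementation clusters into a tree. Bootstrap support is then obtained by resampling alignment columns and repeating the procedure.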

Analysis of GA oxidase family gene expression patterns
To study the expression patterns of OsGAox genes, public RNA-seq data covering a wide range of rice developmental stages were downloaded from RiceGE (Rice Functional Genomic Express Database, http://signal.salk.edu/cgi-bin/RiceGE). The RNA-seq data were then reanalyzed, and a heat map was created from the log-transformed values with HemI software [44].
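The log transformation applied before drawing the heat map can be sketched as follows. The text only says "log-transformed", so the base (2) and the pseudocount (1, added so that an FPKM of zero remains defined) are assumptions made for this example, and the FPKM values are invented.

```python
import math


def log2_fpkm(fpkm, pseudocount=1.0):
    """Return log2(FPKM + pseudocount); base and pseudocount are assumptions."""
    if fpkm < 0:
        raise ValueError("FPKM cannot be negative")
    return math.log2(fpkm + pseudocount)


# Hypothetical FPKM values for one gene across three tissues.
profile = {"root": 0.0, "leaf": 3.0, "panicle": 63.0}
transformed = {tissue: log2_fpkm(value) for tissue, value in profile.items()}
print(transformed)  # {'root': 0.0, 'leaf': 2.0, 'panicle': 6.0}
```

Compressing the range in this way keeps a few highly expressed genes from dominating the color scale of the heat map.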

RNA extraction, qRT-PCR and RNA seq
Total RNA was isolated from collected samples using TRIzol reagent (Takara, Japan) and then treated with DNase I (New England Biolabs, Beijing, China) according to a previous study [45]. The RNA concentration was measured using a Nanodrop2000 (Thermo Scientific, USA). Approximately 5 μg of total RNA was reverse-transcribed using a first-strand cDNA synthesis kit (Invitrogen). Reverse transcription polymerase chain reaction (RT-PCR) was performed using the M-MLV one-step RT-PCR system according to the manufacturer's instructions (Invitrogen). Specific primers (Additional file 2: Table S3) for qRT-PCR were designed using Primer Premier 5 and synthesized by TSINGKE biotechnology company (Wuhan, China). The amplicon length for each gene was restricted to 80-250 bp to ensure efficient amplification. Quantitative real-time PCR (qRT-PCR) was performed in a 10 μl reaction containing 0.5 μl cDNA, 0.4 μM gene-specific primers and 5 μl 2 × mix (TaKaRa), made up to volume with water, in a Bio-Rad iQ5 Real Time PCR machine according to the manufacturer's instructions. The qRT-PCR program was as follows: 95℃ for 30 s, followed by 40 cycles of 95℃ for 5 s and 60℃ for 30 s. Three replicates were carried out for each sample, and three biological replicates were also performed for each sample. The relative gene expression levels were calculated using the 2^(-ΔΔCt) method, and a melting curve was generated for each PCR product to check for nonspecific amplification [46]. The rice gene actin (LOC_Os03g50885) was used as an internal control to normalize the expression of related genes involved in GA biosynthesis.
Table 1 The ... cm; 10-15 cm; 15-22 cm; S1-S4 represent the different stages of seed after seed germination. S1-S4: 0-2 days; 3-4; 5-10; 11-12.
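The 2^(-ΔΔCt) relative quantification named in the qRT-PCR methods above can be sketched as follows. The function and the Ct values are illustrative assumptions, not the authors' script or data. The reference gene plays the role of actin in the text, and the method assumes roughly equal, near-100% amplification efficiency for target and reference.

```python
def mean(values):
    """Arithmetic mean of replicate Ct values."""
    return sum(values) / len(values)


def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene (treated vs. control) by the 2^(-ddCt) method."""
    delta_treated = ct_target_treated - ct_ref_treated   # dCt in treated sample
    delta_control = ct_target_control - ct_ref_control   # dCt in control sample
    delta_delta = delta_treated - delta_control          # ddCt
    return 2 ** (-delta_delta)


# Hypothetical Ct values from three replicates per condition.
target_treated = mean([22.1, 21.9, 22.0])
ref_treated = mean([18.0, 18.1, 17.9])
target_control = mean([24.0, 24.1, 23.9])
ref_control = mean([18.0, 17.9, 18.1])

fold_change = relative_expression(target_treated, ref_treated,
                                  target_control, ref_control)
print(round(fold_change, 2))
```

With these invented values the target reaches threshold two cycles earlier in the treated sample relative to the reference, which corresponds to about a fourfold increase in expression.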

Supplementary Files
Supplementary files associated with this preprint: Table S1.xlsx, Table S2.xlsx