Identification of RPD3 genes in nine species
In this study, a total of 108 RPD3 protein sequences from nine species were determined after eliminating redundant sequences and they are named by the position on the chromosome. The corresponding relationship between gene ID number and their names was shown in Additional file 1: Table S1. A total of 18 genes (GhHDA1-GhHDA18) containing Hist_deacetyl (PF00850) domains were identified from G. hirsutum, among which 9 genes were located on the At genome and 9 genes were mapped on the Dt genome. Further, 18 genes (GbHDA1-GbHDA18) from G. barbadense, 9 genes (GaHDA1-GaHDA9) from G. aboreum and 9 genes (GrHDA1-GrHDA9) from G. raimondii were detected, repectively. Tetraploid cotton possessed twice as many RPD3 genes as diploid cotton, indicating that no RPD3 cotton gene was lost in the process of polyploidy. The numbers of RPD3 genes in the other five species were 10 (Arabiposis), 14 (Oryza satica L.), 11 (Populus trichocarpa), 8 (Theobroma cacao) and 11 (Zea mays L.), respectively. The GhRPD3 proteins length ranged from 232 to 635 aa with an average of 459aa. The physicochemical parameters analysis showed that the isoelectric point (pl) of GhRPD3 proteins varied from 4.47 to 8.65 with an average value of 5.68 and the molecular weight of GhRPD3 proteins varied from 25.79 to 73.01 KDa with an average value of 51.21 KDa. The subcellular localization results indicated that most of the GhRPD3 genes were located in cytoplasmic (10) and nuclear (8), suggesting that GhRPD3 genes might possess multiple regulatory functions (Table 1). The predicted length, pI, Mw and subcellular localization of the RPD3 proteins in other eight species were shown in Additional file 1: Table S1.
Table 1
Physicochemical parameters of 18 RPD3 genes in G. hirsutum
Name | Gnen ID | Protein Length | Protein pI | Protein MW(kD) | subcellular localization |
GhHDA1 | Ghir_A01G001410.1 | 499 | 4.9676 | 56.18 | Nuclear |
GhHDA2 | Ghir_A03G007210.1 | 471 | 5.076 | 53.09 | Nuclear |
GhHDA3 | Ghir_A03G008200.1 | 655 | 5.325 | 73.01 | Cytoplasmic |
GhHDA4 | Ghir_A03G018610.1 | 351 | 4.4737 | 39.55 | Cytoplasmic/Nuclear |
GhHDA5 | Ghir_A05G039610.1 | 449 | 6.9085 | 48.66 | Mitochondrial/Chloroplast |
GhHDA6 | Ghir_A09G010210.1 | 429 | 4.8969 | 49.08 | Cytoplasmic/Nuclear |
GhHDA7 | Ghir_A12G027820.1 | 574 | 6.3108 | 63.26 | Cytoplasmic |
GhHDA8 | Ghir_A13G019980.1 | 232 | 6.5919 | 25.79 | Plasma Membrane |
GhHDA9 | Ghir_A13G023460.1 | 368 | 5.3373 | 40.37 | Cytoplasmic/Chloroplast |
GhHDA10 | Ghir_D01G001410.1 | 499 | 4.9676 | 56.26 | Nuclear |
GhHDA11 | Ghir_D02G019970.1 | 465 | 5.1309 | 52.65 | Nuclear |
GhHDA12 | Ghir_D03G010660.1 | 635 | 4.8889 | 71.02 | Cytoplasmic |
GhHDA13 | Ghir_D03G011510.1 | 471 | 5.1489 | 53.06 | Nuclear |
GhHDA14 | Ghir_D04G003510.1 | 443 | 6.9591 | 47.95 | Chloroplast/Mitochondrial |
GhHDA15 | Ghir_D09G009940.1 | 429 | 4.8371 | 49.11 | Cytoplasmic/Nuclear |
GhHDA16 | Ghir_D12G027930.1 | 579 | 6.1788 | 63.80 | Cytoplasmic |
GhHDA17 | Ghir_D13G020760.1 | 331 | 8.6517 | 37.28 | Plasma Membrane |
GhHDA18 | Ghir_D13G024090.1 | 380 | 5.648 | 41.63 | Cytoplasmic |
Phylogenetic analysis of the RPD3 gene family
A total of identified 108 RPD3 protein sequences from G. raimondii (9), G. arboreum (9), G. hirsutum (18), G. barbadense (18), A. thaliana (10), T. cacao (8), Oryza sativa (14), Zea Mays (11) and P. trichocarpa (11) were employed to construct an unrooted phylogenetic tree using the neighbor-joining method for investigating the evolutionary relationships of RPD3 proteins. The RPD3 proteins were phylogenetically classified into 4 subfamilies (Class I, Class II, Class III and unclassified) according to the formulated subfamilies in Arabidopsis [7]. The Class Ⅰ subgroup was the largest subfamily with 49 RPD3 genes, while the Class Ⅲ subgroup has the fewest members, only containing one gene in seven diploid species genomes and including two genes in two tetraploid cotton genomes (Fig. 1). Among these four classes, each subfamily contained all nine species RPD3 genes, indicating this gene family was relatively conserved in different species during evolutionary process.
Exon-intron structure and conserved motif analysis
The domains of RPD3 sequences in cotton were investigated and exhibited according to the results of SMART database using TBtools, showing that all cotton RPD3 genes contained a Hist_deacetyl domain (Additional file 2: Table S2 and Additional file 3: Figure S1). An unrooted phylogenetic tree with the predicted cotton RPD3 genes was constructed (Fig. 2a), then exon-intron structure (Fig. 2b) and conserved motifs (Fig. 2c) were analyzed to better understand the similarity and difference of cotton RPD3 members. The results showed that the gene length of RPD3 cotton genes was relatively conserved in Class Ⅰ and Class Ⅲ, but there were twelve longer sequences existing in Class Ⅱ and unclassified group. The RPD3 cotton genes included lots of exons varying from 3 to 17 and most RPD3 genes (48/54) contained more than five exons (Additional file 4: Table S3), which might be associated with diversification of their functions. In terms of distributions of motifs, most RPD3 cotton genes belonging to the same subfamily performed a similar motif composition except for the unclassified group (Fig. 2c). Most Class Ⅰsubfamily members contained 9 motifs in addition to GrHDA5 and GhHDA4, only including 4 and 6 motifs respectively. Class Ⅲ subfamily genes had three or four motifs and most Class Ⅱ subfamily members possessed 7 motifs except for GhHDA12, one motif less than other genes. There were differences existing in the exon-intron structure and motif arrangement among four categories, while they were highly conserved on the same branches, indicating that the RPD3 members classified into the same branch might perform a relatively conserved function in cotton growth and development.
Chromosomal distribution, gene duplication and selection pressure
The chromosomal distributions of GrRPD3, GaRPD3, GbRPD3 and GhRPD3 genes were exhibited according to the genomic position of 54 cotton RPD3 genes (Additional file 5: Table S4 and Fig. 3). In G. hirsutum, 18 GhRPD3 genes were unevenly mapped on 13 chromosomes. A03 contained the most GhRPD3 genes (3), while the other 12 chromosomes only contained one or two GhRPD3 genes (Fig. 3a). The distribution of 18 GbRPD3 genes on chromosomes in G. barbadense was similar with that of GhRPD3 genes in G. hirsutum (Fig. 3b). In G. arboreum, 9 GaRPD3 genes were unevenly located on 6 chromosomes. Chr01 and Chr13 contained three and two GaRPD3 genes, respectively, while the other 4 chromosomes contained only one GaRPD3 genes (Fig. 3c). In G. raimondii, the chromosomal distribution of 9 GrRPD3 genes were highly consistent with the corresponding D sub-genome of G. hirsutum (Fig. 3d), showing conserved numbers and chromosomal distribution of RPD3 genes between diploid and tetraploid cotton species. In addition, the lopsided chromosomal distribution of the cotton RPD3 genes indicated that genetic variation occured during the progress of evolution. Notably, most of the RPD3 genes were distributed on the opposite ends of the chromosomes in four cotton species (Fig. 3).
In general, tandem and segmental duplication were two main reasons of gene family generation in the evolutionary process of gene family [21]. The analysis of gene duplication indicated that all RPD3 family members were amplified only through segmental duplication (Additional file 6: Table S5), suggesting that segmental duplication played a vital role in the evolution of the RPD3 gene family. The homologous gene pairs obtained by collinearity analysis among RPD3 genes in G. arboretum, G. raimondii, and G. hirsutum was visualized using circular maps (Fig. 4). The Ka/Ks ratios of most homologous gene pairs were lower than one, indicating that purified selection was essential during the evolution of cotton RPD3 genes, while the Ka/Ks ratios of gene pairs (GhHDA2/GaHDA3 and GhHDA6/GaHDA6) were more than 1, suggesting that these two gene pairs might have experienced positive selection pressure. The study also predicted the occurrence time of segmentally duplicated RPD3 gene pairs by the formula “ t = Ks/2r” (r = 2.6 × 10− 9) [22]. Except for gene pair (GhHDA6/GaHDA6), the other segmental duplication events of three cotton species might have occurred 0.6 to 144.44 million years ago with an average time of 18.39 million years (Additional file 6: Table S5).
Analysis of cis-elements in predicted promoter regions of GhRPD3
In order to explore the possible regulatory functions of GhRPD3 genes under various environmental stresses and hormone regulation pathway, the 2000 bp promoter regions of 18 GhRPD3 genes were employed to the PlantCARE database for identification of putative stress-associated and plant hormone-related cis-elements. A total of 9 kinds of elements related to plant hormone: AuxRR-core (auxin), TGA-element (auxin), P-box (gibberellin), TATC-box (gibberellin), GARE-motif (gibberellin), CGTAC-motif (MeJA), TGACG-motif (MeJA), TCA-element (SA) and ABRE (ABA), and 4 kinds of elements responding to stresses: TC-rich repeats (defense and stress responsiveness), MBS(drought), WUN-motif (wound) and LTR (cold stress), were detected in the predicted promoters of GhRPD3 genes. As shown in Fig. 5, the promoters of some GhRPD3 genes contained various hormone-responsive and stress-responsive components, such as GhHDA2 (2 MBS, 2 LTR, 2 TC-rich repeats, 1 GARE-motif, 2 ABRE, 1 TGACG-motif) and GhHDA13 (1 MBS, 1 LTR, 1 TC-rich repeats, 1 AuxRR-core, 2 GARE-motif, 1 TCA-element, 4 ABRE, 2 TGACG-motif). Among all the 18 GhRPD3 genes, there are large numbers of light-responsive elements distributing in their promoter regions (Additional file 7: Table S6), in addition, MeJA-responsive and ABA-responsive elements are more common than other hormone-related elements (Additional file 8: Figure S2). The results revealed that GhRPD3 genes might be involved in MeJA and ABA hormone signaling pathways as well as response to environmental stresses.
Expression profiles of GhRPD3 genes in different tissues and under different abiotic stresses
To understand the potential functions of GhPRD3 genes in the growth and development of cotton, we studied their expression in various cotton tissues containing anther, pistil, bract, sepal, petal, filament, torus, root, leaf, stem, ovules and fibers using publicly available transcriptomic data provided by Hu et al. [23]. Transcripts of all the GhRPD3 genes were detected in at least three tissues with fragments per kilobase million (FPKM) ≥ 1. Furthermore, ten genes exhibited high expression in all the selected tissues (Additional file 9: Table S7). The results indicated that GhRPD3 genes were widely expressed in both reproductive organs and vegetative organs and might have multiple biological functions. After log2-conversion of FPKM values, the expression profiles of GhRPD3 genes in different tissues were shown in Fig. 6a. Seven GhRPD3 genes exhibited relative high expression in at least eight tissues (log2-transformed FPKM value ≥ 2.6), especially a pair of homologous genes (GhHDA1/GhHDA10), performing higher expression level at all the tissues and showing the same expression pattern. Nevertheless, three GhRPD3 genes (GhHDA4, GhHDA14, GhHDA18) revealed relative low expression in at least eight tissues (log2-transformed FPKM value < 1), of which GhHDA14 showed the lower expression at all tissues except for pistil. These homologous gene pairs (GhHDA1/GhHDA10, GhHDA4/GhHDA11, GhHDA2/GhHDA13, GhHDA6/GhHDA15, GhHDA7/GhHDA16 and GhHDA9/GhHDA18) were located on At and Dt subgenomes, and exhibited similar expression patterns. For example, homologous gene pairs (GhHDA4/GhHDA11 and GhHDA9/GhHDA18) showed relative low expression at all these twelve tissues. Gene pair (GhHDA2/GhHDA13) exhibited relative high expression in torus and ovule, while showed relatively low expression in petal (Fig. 6a).
Based on the analysis of cis-elements in promoter regions and previous reports on RPD3 genes in other plants, GhRPD3 gens might respond to abiotic stresses. To verify this hypothesis, we investigated the expression characteristics of 18 GhRPD3 genes under cold, heat, PEG and salt treatments using available transcriptomic data [23] (Fig. 6b). The results indicated that most GhRPD3 genes could be induced by all four stresses to varying degrees, especially these six genes (GhHDA1, GhHDA2, GhHDA6, GhHDA10, GhHDA12 and GhHDA18), significantly up-regulated at every stage of the treatment. However, only one gene (GhHDA4) exhibited markedly down-regulated expression under four abiotic stresses. Some genes were induced by a particular treatment, for example, GhHDA13 and GhHDA16 was significantly induced by PEG treatment because of containing drought-responsive elements in their predicted promotors. Four genes (GhHDA7, GhHDA17, GhHDA11, GhHDA5) showed up-regulated expression under heat treatment. The expression of GhHDA9 was significantly up-regulated under cold and salt treatments. According to the results, we can make a conclusion that GhRPD3 genes play an essential role in response to abiotic stresses.
Characterization of GhRPD3 genes expression during flower bud differentiation period
To explore the expression difference of GhRPD3 genes between the early-maturity and late-maturity cottons in flower bud differentiation period, we selected nine genes showing relatively high expression in tissues (anther, pistil, bract, sepal, petal, filament and torus) that made up the floral organs for qRt-PCR. The buds of early-maturity variety (CCRI50) and late-maturity variety (GX11) from one-leaf stage to five-leaf stage were used for qRt-PCR respectively (Fig. 7). The results revealed that more than half of these genes (5/9) possessed relatively higher expression in early-maturity cotton than that in late-maturity cotton during flower bud differentiation period. GhHDA5 showed marked difference at two-leaf stage and three-leaf stage and these two stages were regarded as the important period of flower bud differentiation. A homologous gene pair (GhHDA6/GhHDA15) located on At and Dt respectively showed same expression trend that both of them presented the highest expression at three-leaf stage and then exhibited down-regulated expression at next two stages in CCRI50. In addition, all the nine genes performed relatively high expression at two-leaf stage or three-leaf stage in CCRI50 compared with GX11. The results showed that GhRPD3 genes were associated with early maturity of cotton.
Responses of GhRPD3 genes to MeJA and ABA treatment
MeJA and ABA play important roles in plant stress resistance. To further explore the possible functions of GhRPD3 genes, we selected the GhRPD3 genes containing MeJA and ABA responsiveness elements in the predicted promotors to analyze the expression characteristics of those genes under MeJA and ABA treatment by qRT-PCR (Fig. 8 and Fig. 9). Most GhRPD3 genes (8/13) were markedly up-regulated at 9 h after MeJA treatment. Three genes (GhHDA7, GhHDA13 and GhHDA18) exhibited significantly up-regulated expression at least three time points, while four genes (GhHDA2, GhHDA8, GhHDA9 and GhHDA11) showed marked transcriptional down-regulation at least three time points after MeJA treatment (Fig. 8). More than half of GhRPD3 genes (6/11) were significantly up-regulated at 9 h after ABA treatment. Three GhRPD3 genes (GhHDA14, GhHDA15 and GhHDA18) performed relatively high expression at least three time points, while three GhRPD3 genes (GhHDA10, GhHDA11, GhHDA17) showed early down-regulated and then up-regulated expression patterns under ABA treatment (Fig. 9). The results showed that exogenous application of MeJA and ABA significantly induced the transcriptional levels of most GhRPD3 genes containing MeJA-responsiveness and ABRE elements in their promoter regions.