HAT: p300/CBP/HAC family
To determine the evolutionary relationship of DcHATs with those from rice, and Arabidopsis, total protein sequences from these organisms were used to construct a neighbour-joining phylogenetic tree (Fig. 1). We observed one subgroup related to CBP, TAFII250 and MYST, and three subgroups for GNAT. Eight DcHATs can be divided into four families, CBP (DcHAC1a/1b), GNAT (DcHAG1/2/3), TAFII250 (DcHAF1), and MYST (DcHAM1/2). HAT/HDAC members in D. catenatum usually have extremely longer genomic sequences than those in Arabidopsis and rice (Fig. 2), largely owing to existence of overlength introns.
The CBP family contains 5 members in Arabidopsis, 3 in rice and 2 in D. catenatum (Fig. 1). HAC-like proteins are usually characteristic of several successive domain combinations: an N-terminal ZnF-TAZ domain, a PHD-finger domain, a CBP-type HAT domain, a ZnF-ZZ domain and a C-terminal ZnF-TAZ domain, except AtHAC2 and OsHAC5 lacking the N-terminal ZnF-TAZ domain. Phylogenetic analysis (Fig. 1) revealed that DcHAC1a/1b paralog pairs were grouped together with AtHAC1/12 and OsHAC1a/1b pairs, consistently, DcHAC1a and DcHAC1b have 74.1% protein sequence identity, both display 51.6% and 50.5% identity with AtHAC1, respectively. No counterparts in D. catenatum were classified into AtHAC4/5, OsHAC5 or AtHAC2 branches, indicating CBP family in D. catenatum might have experienced less gene duplication events during evolution. In Arabidopsis, AtHAC1 participates in pleiotropic developmental programs, and its mutation causes delayed flowering, decreased primary root length, and poor fertility [42]. AtHAC5 and AtHAC12 have redundant functions with AtHAC1 in regulate flowering time by repressing the expression of a major floral repressor FLC [43]. AtHAC1/5 are required for salicylic acid (SA)-triggered immunity and PR induction. AtHAC1/5 together with the master immune regulator NPR1 are recruited to PR chromatin through NPR1 interacting with transcription factor TGA upon SA signal, resulting in the formation of HAC-NPR1-TGA complex that activates PR transcription by histone acetylation-mediated epigenetic reprogramming [44]. So, DcHAC1a/1b might undertake most of HAC roles in development regulation and stress response of D. catenatum.
HAT: GNAT/HAG family
HAG family was divided into three distinct clades (HAG1/GCN5, HAG2 and HAG3/ELP3), each of which contains a single member in examined species. All HAG family proteins possess a landmark GNAT-type HAT domain (Acetyltransf_1, PF00583), moreover, each clade has additional distinct domains, such as HAG1/GCN5 with a C-terminal bromodomain, HAG2 with Hat1_N domain (PF10394) and HAG3 with a central Elp3 domain (IPR006638). Bromodomain found in many chromatin-associated proteins can interact specifically with acetylated lysine and mediate the interaction between histone acetyltransferases and histone proteins [45]. In HAG1 clade, Arabidopsis AtGCN5/HAG1 is a major HAT and usually acts as the catalytic subunit of several multicomponent HAT complexes that acetylate lysine residues of histone H3 [46]. AtGCN5/HAG1 plays an essential role in diverse plant development processes, such as meristem activity, cell differentiation, plant architecture, root, leaf and floral organogenesis, and de novo shoot regeneration [21, 22, 24, 46, 47]. AtGCN5/HAG1 is also involved in nutrient balance and fatty acid metabolism. AtGCN5 contributes to iron homeostasis by directly affecting the H3K9 and H3K14 acetylation level of the citrate efflux protein FRD3. Mutation of AtGCN5 impairs iron translocation from the root to the shoot [48]. AtGCN5 modulates fatty acid biosynthesis by affecting the H3K9/14 acetylation levels of FAD3; loss of AtGCN5 causes decreased ratio of α-linolenic acid to linoleic acid in seed oil [49]. AtGCN5 regulates stem cuticular wax biosynthesis by modulating the H3K9/14 acetylation of CER3; AtGCN5 disruption impairs cuticular wax accumulation [50]. In addition, AtGCN5/HAG1 is involved in response to environmental stresses. AtGCN5 is required for thermotolerance and salt tolerance through H3K9/14 acetylation and activation of the key stress-responsive genes [51, 52]. DcHAG1 displays same domain organization with other plant GCN5 proteins, shares 62.5% and 63.4% sequence identity with AtHAG1/GCN5 and OsHAG1, respectively. But the GNAT-type HAT domain of DcHAG1 is 88.3% sequence identical to that of AtHAG1/GCN5. Considering the significance of HAG1 clade, it is requisite to investigate the regulatory function of DcHAG1 in D. catenatum. In HAG3 clade, Arabidopsis HAG3 participates in UV-B-induced DNA damage repair and signaling. Knockdown of HAG3 leads to reduction of DNA damage and growth inhibition by UV-B [31]. DcHAG3 has 88.4% and 90.4% sequence identity with AtHAG3 and OsHAG3, respectively, indicating DcHAG3 might have similar role in UV-proof.
HAT: MYST/HAM family
D. catenatum genome has two MYST-like genes, DcHAM1/2, but so far both only have partial sequences available (Fig. 1). Based on known full-length MYST sequences in other species, MYST family members usually share the conserved domain combinations: an N-terminal chromodomain, a central C2H2-ZnF domain and a C-terminal MYST-type HAT domain MOZ-SOS. Chromodomain is a methyl-specific histone binding module [56], indicating that MYST-catalyzed histone acetylation might coordinate with histone methylation to perform the functions. In Arabidopsis, HAM1 and HAM2 act redundantly to control female and male gametophyte development as well as flowering time. Knockout of HAM1 and HAM2 resulted in lethality; in ham sesquimutants ham1/+; ham2/ham2 and ham1/ham1;ham2/+, half of the ovules aborted due to an arrest of mitotic cell cycle at one-nucleate stage of megagametogenesis [57]. On the other hand, HAM1 and HAM2 regulate flowering time by deposition of histone H4 lysine 5 acetylation (H4K5ac) within FLC and MAF3/4 chromatins; knockdown of HAM1/2 by amiRNA or T-DNA insertion results in early flowering and reduced fertility, whereas overexpression of HAM1 caused late flowering [58].
HDAC: PRD3/HDA1 family
To determine the evolutionary relationship of DcHDACs with those from rice, and Arabidopsis, total protein sequences from these organisms were used to construct a neighbour-joining phylogenetic tree (Fig. 3). HDACs were grouped into three families, SIR2, HD2, and RPD3/HDA1, which is further divided into Class-I, II, and IV. For the members in each family or clade, the closer the genetic relationship, the more similar the gene structure (Fig. 4). All Dendrobium RPD3/HDA1 members contain the conserved histone deacetylase domain (Hist_deacetyl) (Fig. 3), but their length varies from 356 aa to 685 aa, except DcHDA2b and DcHDA5b with partial sequences. In addition, DcHDA9 has another transmembrane domain at N-terminus, whereas DcHDA15 possesses another ZnF_RBZ domain, which was found in a nucleoporin protein RanBP2 located on the cytoplasmic side of the nuclear pore complex functioning in nuclear protein import [59].
Class-I of RPD3/HDA1 family possesses 10 members in D. catenatum, 12 in Arabidopsis, and 14 in rice; and they can be divided into four subclasses: I-1 (HDA19), I-2 (HDA6/7), I-3 (HDA9/10/17), and rice-specific I-4 (HDA701/709). Subclass I-1 compresses a single member each in D. catenatum and Arabidopsis, and 3 homologs in rice. In Arabidopsis, down-regulation of HDA19 (AtHD1) leads to pleiotropic defects including early senescence, serrated leaves, aerial rosettes formation, delayed flowering, and defects in floral organ identity [60, 61]. Deplete of HDA19 results in markable tolerance to abiotic stresses including salinity, drought, and heat [62, 63]. In rice, knockdown of HDA703 (OsHDAC3) reduces rice peduncle elongation and fertility, knockdown of HDA710 (OsHDAC2) affects vegetative growth, and down-regulation of HDT702 (OsHDAC1) causes the production of narrowed leaves and stems [7]. overexpression of HDA702 enhances growth rate, alters plant architecture [64], and increases root growth through deacetylating OsNAC6 locus [65]. HDA702 implicates in formation of IDS1-TPR1-HDA1 transcriptional repression complex, contributing to negative regulation of salt tolerance [66]. These data suggest that HDA702/703/710 in rice may have divergent developmental functions compared with closely related homologs in Arabidopsis. DcHDA19 has 77.6% and 77.4% sequence identity with AtHDA19 and HDA702, respectively. It is interesting to explore whether the function of DcHDA19 is more near to AtHDA19 or HDA702. Subclass I-2 harbors a single copy each in D. catenatum (DcHDA6) and rice (HDA705), and two homologs in Arabidopsis (AtHDA6/7). Arabidopsis HDA6 is involved in leaf development [67], cellular patterning of root epidermis [68], flowering time [69], repression of somatic embryogenesis [23], circadian rhythms [55], and light intensity adaption [70]. AtHDA6 is also required for repressing pathogen defense response; loss of HDA6 resulted in enhanced resistance to hemibiotrophic bacterial pathogen Pst DC3000 and constitutively activated expression of pathogen-responsive genes [71]. AtHDA7 is crucial for female gametophyte development and embryogenesis, and silencing of AtHDA7 causes reduced seed set [72]. In rice, overexpression of HDA705 in rice decreases the resistance to ABA and salt stress during seed germination, but enhances the resistance to osmotic stress during the seedling stage [73]. DcHDA6 has 67.2% and 66.6% sequence identity with AtHDA6 and HDA705, respectively, but only 49.9% with AtHDA7, indicating the function of DcHDA6 is more near to AtHDA6 than AtHDA7. Subclass I-3 consists of a single member each in D. catenatum and rice, and 3 members in Arabidopsis. Arabidopsis HDA9 is involved in preventing precocious flowering under short day [74, 75], suppressing seed germination [76], and promoting the onset of leaf senescence [77]. Moreover, the hda9 mutant is insensitive to salt and polyethylene glycol treatments [78]. DcHDA9 has 78.5% sequence identity with AtHDA9, indicating there might be functional similarity to a large extent.
Class-II of RPD3/HDA1 family contains 5 members each in D. catenatum, Arabidopsis and rice; and they can be divided into four subclasses: II-1 (HDA8), II-2 (HDA14), II-3 (HDA15), and II-4 (HDA5). Among them, subclass II-3 possesses a single member in D. catenatum, Arabidopsis, and rice. DcHDA15 contains an additional ZnF_RBZ domain (SMART, SM00547), and shares the sequence identity of 53.2% and 55% with AtHDA15 and OsHDA704, respectively. DcHDA15 gene, like rice HDA714, consists of 16 exons, but is around 112 kb in length, far larger than HDA714 (~ 20 kb). In Arabidopsis, HDA15 is a negative component of light-regulated seed germination. PHYTOCHROME INTERACTING FACTOR3 (PIF3) recruits HDA15 to repress chlorophyll biosynthesis and photosynthesis gene expression in etiolated seedlings [79]. HDA15-PIF1 acts as a key repression module directing the transcription network of seed germination [80]. NF-YC1/3/4/9 transcriptional co-repressors interact with HDA15 to inhibit hypocotyl elongation in the light partially via histone deacetylation [81]. Collected date suggested that DcHDA15 might have a role in photomorphogenesis and skotomorphogenesis. Subclass II-4 has a pair of paralogs in D. catenatum (DcHDA5a/b) or Arabidopsis (HDA5/18), and one member in rice (HDA713). DcHDA5a has the sequence identity of 53.8% and 59.4% with AtHDA5 and OsHDA713, respectively. In Arabidopsis, HDA5 regulates flowering time through targeting the chromatin of flowering repressor genes FLC and MAF1 as well as interacting with FVE, FLD and HDA6 to form a complex. The hda5 mutant is late-flowering due to up-regulated expression of FLC and MAF1 associated with increased deposition of activation markers, histone H3 acetylation and H3K4 trimethylation (H3K4me3) [82]. HDA18-mediated histone acetylation affects cellular patterning in root epidermis, and loss or overexpression of HDA8 lead to cells at the N position having H fate [83].
Class-IV of RPD3/HDA1 family contains 2 copies in D. catenatum, but only a single member in Arabidopsis and rice, respectively. DcHDA2a displays protein sequence identity of 61.7% and 71.1% with AtHDA2 and OsHDA706, respectively. No related function was reported so far.
HDAC: HD2 family
HD2 family is specific to plant, and firstly appeared in Streptophyta green algae Charophyta, but not in Chlorophyta [84]. Here we identified two HD2 homologs in D. catenatum (DcHDT1/2), compared with 2 members in rice (HDT701/702) and 4 members in Arabidopsis (AtHD2A-D). HD2-like genes usually compress 7 ~ 9 exons and about 1 kb in length (Fig. 4) but they have significant sequence divergence during evolution. For instance, there are only 50.6% protein sequence identity between DcHDT1 and DcHDT2, which both also share 17.9% and 45.9% sequence identity with OsHDT701, 24.5% and 7.2% with OsHDT702, respectively. DcHDT1/2 are characterized of the conserved C2H2-type zinc finger domain at C-terminus, but this domain is loss in Arabidopsis AtHD2A/2C and rice HDT702. HD2 proteins lacking zinc finger domain (Gr2 HD2s) evolved from those containing zinc finger domain (Gr1 HD2s) during angiosperm diversification, and Gr2 HD2s might fulfill complementary functions with the corresponding Gr1 HD2s [84]. In Arabidopsis, HD2 histone deacetylases have been demonstrated to participate in repressing gene expression [85]. HDT1 (HD2A) and HDT2 (HD2B) paralogs are involved in regulating root meristem maintenance partially through fine-tuning gibberellin metabolism via deacetylation and repression of GA2ox2 [86]. HD2B was identified as a distinguished factor associated with seed dormancy by genome-wide association mapping of 113 wild-type accessions with natural variation; inactivation of HD2B contributes to maintenance of seed dormancy in dormant accessions [87]. In addition, HD2 family is involved in the response to biotic and abiotic stresses. HD2B is implicated in plant innate immune defense in the form of MPK3-HD2B regulatory module [88]. Loss of HD2C results in hypersensitivity to ABA, NaCl and salt stress [89]. Overexpression of HDT701 in rice reduces resistance to ABA, salt and osmotic stress during seed germination, but enhances resistance to salt and osmotic stress during the seedling stage [90]. Further work is required to exploit why Gr2 HD2s is loss in D. catenatum during evolution.
HDAC: SIR2 family
SIR2 family contains 2 members in D. catenatum, Arabidopsis, and rice, respectively. All these proteins possess a single copy of SIR2-type HDAC domain, whose catalytic activity requires NAD as a cofactor [91]. SIR2 family was divided into two clades, SRT1 clade including DcSRT1/AtSRT1/OsSRT701 and SRT2 clade including DcSRT2/AtSRT2/OsSRT702, indicating SRT1/2 diverged before the split of dicot and monocot. Consistently, the members in SRT1 or SRT2 branch have similar exon-intron gene structure architecture, respectively. Protein alignment showed DcSRT1 shares sequence identity of 60.3% and 62% to AtSRT1 and OsSRT701, respectively. DcSRT2 shares 66.6% and 61.4% sequence identity with AtSRT2 and OsSRT702, respectively. In Arabidopsis, SRT2 but not SRT1 is localized at the inner mitochondrial membrane and interacts with some protein complexes, mainly involved in energy metabolism and metabolite transport. Mutation of SRT2 results in no growth phenotype but rather a metabolic disorder with altered levels in sugars, amino acids, and ADP contents [36]. AtSRT1 negatively regulates plant tolerance to stress and glycolysis but stimulates mitochondrial respiration through both epigenetic regulation and modulation of AtMBP-1 transcriptional activity [92]. The reduction and loss of AtSRT1 result in significant resistance to ABA, while overexpression of AtSRT1 leads to hypersensitivity to ABA, salt and osmotic stresses. SRT1/2 are involved in mediating transcriptional repression of ethylene signaling by decreasing the levels of H3K9 acetylation [93]. OsSRT1 participates in maintenance of genome stability and defense against DNA damage. Knockdown of OsSRT1 triggers increased H3K9ac and decreased H3K9me2, resulting in oxidative burst, DNA fragmentation, and cell death, whereas overexpression of OsSRT1 enhances tolerance to oxidative stress [94]. OsSRT1 represses glycolysis by both epigenetic regulation and deacetylating glycolytic GAPDH to inhibit its nuclear accumulation and moonlighting function as a transcriptional activator of glycolytic genes [95]. Taken together, D. catenatum SIR2 type HDACs would have a predominant role in energy metabolism.
Subcellular localization prediction
To investigate the functional compartmentation of DcHATs/DcHDACs, we predicted their subcellular localization through different software (Table 2). SLP-Local predictor showed the potential subcellular localization sites for most of DcHATs/DcHDACs with low reliability index (RI < 3), except for DcHAF1, which might be localized in the nucleus or cytosol with higher reliability (RI = 6). However, Target P detected potential localization for nearly one half of these proteins with higher reliability (RC < 2), and all of them are localized in nucleus or cytosol. Uniprot program generated prediction results for only half of DcHATs and DcHDACs, including a SIR2-type HDAC, DcSRT2 with mitochondria localization as well as DcHAC1a/1b, DcHAG2, and DcHDA5a/5b/6/8/9/15/19 with nucleus localization. NES predictor (NetNES) detected the potential leucine-rich nuclear export signal (NES) for DcHATs and DcHDACs with a threshold value of 0.5. NES was found in most of DcHAT members except for DcHAG1, but only found in some DcHDAC members, such as DcHDA2b/6/9 and DcSRT2. The presence of the NES in 6 DcHATs and 4 DcHDACs indicates that they might perform the functions in both nucleus and cytosol. WoLF PSORT displayed low frequency values for DcHATs and DcHDACs; most of them were predicted with a major localization in the nucleus or cytosol, however, DcHAF1 generated the relatively close localization scores for nucleus (7) and cytosol (6), consistent with the dual localization predicted by other software (Table 2).
Promoter cis-element analysis
To detect the cis acting regulatory elements in DcHAT and DcHDAC gene promoter regions that determine the gene expression profiles, 2 kb sequences upstream of the ATG initiation codon were assessed via PlantCARE database [96]. Some phytohormone and stress-responsive cis-acting elements were selected to display (Fig. 5). According to function classification, these elements were mainly divided into two groups: ① phytohormone response elements including MeJA-responsive (82), abscisic acid responsive (27), gibberellin-responsive (16), auxin-responsive (14) and salicylic acid-responsive (10); ② stress response elements including anaerobic induction (28), low-temperature responsive (21), drought-inducibility (15), defense and stress responsive (10), anoxic specific inducibility (2) and wound-responsive (1). Overall, the four elements with the highest number were MeJA-responsive (82), anaerobic induction (28), abscisic acid responsive elements (27), and low-temperature responsive (21). Among these genes, DcHDA2b promoter only had a single predicted element, while DcHAG2 had maximal 18 elements. MeJA-responsive element (16/21) and anaerobic induction element (16/21) were found in most of DcHAT and DcHDAC promoter regions. Although experimental evidence is required to further check the correlation between cis-elements and gene expression response to different conditions, collected data revealed the involvement of DcHAT and DcHDAC genes in stress responses.
Expression profilings of DcHATs and DcHDACs in tissues and organs
To explore the tissue and organ-specificity of DcHATs and DcHDACs, their transcript levels were monitored in different plant tissues and organs, including leaf, root, green root tip, white part of root, stem, flower bud, sepal, labellum, pollinia, and gynostemium by RNA-seq data [97]. In DcHAT superfamily (Fig. 6A and Additional file 2), CBP members DcHAC1a/1b displayed similar expression pattern in vegetative organs and floral buds, but showed different patterns in mature floral organs, such as high expression of DcHAC1a and DcHAC1b in pollinia and root, respectively. Both MYST members DcHAM1/2 had distinct expression profiles, such as DcHAM1 with intermediated expression but DcHAM2 lacking expression in detected tissues and organs. GNAT members DcHAG1/2/3 and TAFII250 member DcHAF1 displayed high expression in almost all the tissues and organs. In DcHDAC superfamily (Fig. 6B and Additional file 2), Class-I RPD3/HDA1 members had similar and intermediated expression patterns, except for DcHDA6 with the highest expression in pollinia where DcHDA19/9 were absent. Class-IV members DcHDA2a/2b had similar but weak expression profiles. Class-II members displayed diverse expression patterns but had uniformly low expression in pollinia; DcHDA8 harbored intermediated expression in almost all the organs; DcHDA14, DcHDA15, DcHDA5a seemingly had distinct expression levels in leaf, floral bud, and root/stem, respectively; DcHDA5b transcripts were absent in detected organs, indicating functional divergence of the DcHDA5a/5b genes. SIR2 members DcSRT1/2 were also weakly expressed in all the organs except for intermediated expression of DcSRT1 and DcSRT12 in pollinia and labellum, respectively. On the other hand, HD2 members DcHDT1/2 were strongly expressed in all the organs except for DcHDT1 not in pollinia. These results indicated that DcHATs/DcHDACs have undergone the functional conservation and divergence during evolution.
Expression of DcHATs and DcHDACs in response to environmental stresses
To examine the responses of DcHATs and DcHDACs to cold stress, the expression levels of DcHATs and DcHDACs were evaluated through analyzing the RNA-seq data from the leaves of D. catenatum seedlings grown at 20 °C (control) and 0 °C for 20 h, respectively [98](Fig. 7 and Additional file 3). In DcHAT superfamily, the expression DcHAG1 transcription was markedly induced, but that of DcHAM1 was repressed by cold exposure. In DcHDAC superfamily, the transcription level of DcHDA14 was increased, but that of DcHDT1/2 was obviously decreased in responding to cold exposure.
To examine the responses of DcHATs and DcHDACs to drought stress, the transcriptional dynamics of DcHATs and DcHDACs were evaluated by analyzing the RNA-seq data from the leaves of D. catenatum seedlings under drought-recovery treatment [99] (Fig. 8 and Additional file 4). Briefly, the plants were irrigated on the 1st day, and kept dry for next 6 days, and rehydrated on the 8th day. Leaves were sampled at 18:30 on the 2nd (DR8), 7th (DR10), 8th (DR11), and 9th (DR15) days, respectively. The results suggested that 5-day drought treatment decreased the expressions of DcHAG1, DcHDA14, and DcHDA9, but increased the expression of DcHDA5a, and then rehydration restored the transcription of DcHAG1, DcHDA14, and DcHDA9 to the normal level.
To explore the response of DcHATs and DcHDACs to heat shock, the expression profiles of DcHATs and DcHDACs in the leaves of D. catenatum seedlings treated at 35 °C for different times were monitored by real-time quantitative RT-PCR (RT-qPCR) (Fig. 9). The results show that the expression levels of 4 DcHAT genes and 10 DcHDAC genes were significantly influenced by heat treatment, including DcHAM2 in MYST family, DcHAG1/2/3 in GNAT family, DcHDA6 /19/9 (Class I), DcHDA2a/2b (Class IV), DcHDA5a/5b (Class II) in RPD3/HDA1 family, and DcHDT1/2 in HD2 family. At 3 h after treatment (HAT), 8 genes were involved in early response to heat shock, including 5 upregulated genes and 3 downregulated genes. At 6 HAT, 7 genes were involved (6 upregulated genes and 1 downregulated genes). At 12 HAT, 5 genes were evidently induced, but no downregulated genes were detected. These indicate that the expression of DcHATs and DcHDACs are generally in positive response to heat shock. Notably, DcHAM2 and DcHDA5b transcripts were highly induced upon exposure to heat shock, but not detected in the expression profiles of different tissues and organs examined, indicating these two genes are mainly involved in heat shock response.