Structure determination of the Rpd3S-nucleosome complex
To investigate how Rpd3S recognizes its nucleosomal substrate, we purified the recombinant five-subunit yeast Rpd3S complex from insect cells and prepared a di-nucleosome harbouring the H3K36me3 mimic in vitro (Methods). Because Rpd3S has been reported to preferentially bind di-nucleosome substrates13,14, a di-nucleosome with a 30 bp DNA linker was reconstituted and incubated with Rpd3S to form a complex. The resulting complex was purified and subjected to cryo-EM analysis (Extended data Fig. 1). As shown in the two-dimensional classification of initial particles, Rpd3S-bound di-nucleosome particles represent only a minor population and display high conformational flexibility (Extended data Fig. 1c), which hampers the particle alignment needed for high-resolution reconstruction. Indeed, the Rpd3S-di-nucleosome complex could only be reconstructed at a low resolution of 10.5 Å (Extended data Fig. 1d). Therefore, we focused on the structure determination of the dominant mono-nucleosome-bound Rpd3S complex, which was reconstructed at an overall resolution of 3.1 Å (Extended data Fig. 1e and 2, Supplementary Video 1). Principal component analysis (PCA) of the reconstruction revealed different modes of relative movement of Rpd3S around the nucleosome (Supplementary Video 2), indicating some degree of plasticity of the complex. Focused refinement of Rpd3S yielded a 3D reconstruction with an overall resolution of 3.0 Å, from which we could build an atomic model of the Rpd3S complex de novo (Fig. 1a and Extended data Fig. 1e and 2). All the subunits of Rpd3S other than Ume1 could be reliably assigned to the density using models of individual domains predicted by AlphaFold16, followed by manual model building (Methods). Another 3D reconstruction, of the Eaf3 CHD-bound mono-nucleosome, was resolved at an overall resolution of 3.4 Å (Extended data Fig. 1e and 2). All the resulting models were refined in real space, leading to good stereochemistry (Extended data Table 1).
Overall structure of the Rpd3S-nucleosome complex
Rpd3S in the context of the nucleosome resembles an arrowhead hovering over the mono-nucleosome near the nucleosomal DNA site ranging from superhelical location (SHL) − 2.5 to 0 and contacting the linker DNA on the opposite side (Fig. 1b, Supplementary Video 1). Our final Rpd3S model contains one copy each of Sin3 and Rpd3, and two copies of the Eaf3-Rco1 heterodimer, with one copy of the Eaf3 CHD binding to the H3K36me3 modification protruding between SHL − 7 and + 1 (Figs. 1 and 2a,b). Although the presence of two copies of Rco1 in Rpd3S had been reported previously12, we show here for the first time that Rpd3S also contains two copies of Eaf3, each of which forms a heterodimer with Rco1 through its MRG domain (Fig. 2b). The two Rco1 copies are not resolved equally in our EM map (Fig. 2a). One is resolved with all the defined domains (PHD1, AID, SID and PHD2) and several portions of the unannotated N- and C-terminal regions (hereafter referred to as Rco1L) (Fig. 2a and Extended data Fig. 3a), whereas the other is visible only in the PHD1, AID and SID domains (hereafter referred to as Rco1S) that interact with the MRG domain of Eaf3 (Fig. 2b). The deacetylase domain (DAC) of Rpd3 is located at the centre of the arrowhead and is embraced by Sin3 and the Eaf3-Rco1L heterodimer (Fig. 1a). The Eaf3-Rco1S heterodimer forms the tail of the arrowhead (Fig. 1a, Supplementary Video 1).
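For readers less familiar with the SHL notation used above, the following minimal sketch converts between a base-pair offset from the nucleosome dyad and an approximate superhelical location. It assumes a uniform helical repeat of about 10 bp per turn, which is only an approximation because the repeat varies along nucleosomal DNA. The constant and function names are illustrative and are not part of the analysis reported here.

```python
# Minimal sketch: approximate conversion between base-pair offsets from the
# nucleosome dyad and superhelical locations (SHL).
# ASSUMPTION: one SHL unit corresponds to ~10 bp (one helical turn); the real
# helical repeat on a nucleosome is not perfectly uniform.
BP_PER_TURN = 10.0


def bp_offset_to_shl(offset_bp: float) -> float:
    """Return the approximate SHL for a position offset_bp from the dyad (SHL 0)."""
    return offset_bp / BP_PER_TURN


def shl_to_bp_offset(shl: float) -> float:
    """Return the approximate base-pair offset from the dyad for a given SHL."""
    return shl * BP_PER_TURN


def shl_range_to_bp(shl_a: float, shl_b: float) -> tuple:
    """Return the (low, high) base-pair offsets spanned by two SHL values."""
    low, high = sorted((shl_a, shl_b))
    return (shl_to_bp_offset(low), shl_to_bp_offset(high))


if __name__ == "__main__":
    # The region from SHL -2.5 to 0 spans roughly 25 bp on one side of the dyad.
    print(shl_range_to_bp(-2.5, 0.0))
```

Under this approximation, a 147 bp nucleosome core spans roughly SHL − 7 to + 7, with the dyad at SHL 0.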
Sin3 is a scaffold protein holding all the other subunits together
Consistent with previous biochemical data12, the largest subunit, Sin3, serves as a scaffold protein in the Rpd3S complex. The PAH3 (paired amphipathic helix 3), HID (HDAC-interaction domain) and PAH4 regions and the N-terminal part of the HCR (highly conserved region) could be traced in the EM map (Fig. 2a,b and Extended data Fig. 3b). Sin3 HID adopts an extended conformation featuring two helical modules, namely the N- and C-terminal modules (N- and C-module), which are connected by a long loop hovering near the active site of Rpd3 (Fig. 2c and Extended data Fig. 3b). The C-module of HID is sandwiched between Rpd3 and the N-terminal domain of Rco1L, mainly through electrostatic and hydrophobic interactions, respectively (Fig. 2b and Extended data Fig. 3c,d, Supplementary Video 1), and closely resembles the structure of human Sin3A HID17 (Extended data Fig. 3e). Meanwhile, the previously uncharacterized N-module of HID, composed of three loosely packed short helices, wedges into the space between Rpd3 and the Rco1L-associated MRG domain of Eaf3 via hydrophobic and polar interactions, respectively (Fig. 2d and Extended data Fig. 3f,g). PAH3, located in front of HID, has a structure almost identical to that of the equivalent component in Sin3A18 (Extended data Fig. 3h). Our structure additionally indicates that PAH3 is involved in the dimerization of Rco1 in the complex, since it is positioned just above the dimerization interface (Fig. 2e). Following HID is PAH4, which comprises three helices (α18-α20) and ends with a long flexible loop (Extended data Fig. 3b). PAH4 together with the resolved N-terminal portion of HCR assembles into a compact structure consisting of 13 α-helices and 4 anti-parallel β-strands (Fig. 2b and Extended data Fig. 3b). It sits beside the C-module of HID and PAH3, creating a hollow centre in Sin3 that accommodates PHD2 of Rco1L, which in turn associates with Rpd3 through multiple hydrogen bonds (Fig. 2b and Extended data Fig. 3i,j).
Rco1 orchestrates the assembly of Rpd3S
Rco1 SID has previously been shown to be essential for the incorporation of Eaf3 into Rpd3S, since deletion of SID dissociated Eaf3 from the deacetylase complex15. Consistent with this observation, the MRG domain of Eaf3 interacts extensively with the region of Rco1 spanning PHD1 to SID in each Eaf3-Rco1 heterodimer (Fig. 2b,f and Extended data Fig. 4). PHD1 of Rco1 associates with the α4 helix of the MRG domain through several polar interactions and a single hydrophobic interaction mediated by two leucine (L285 and L304) side chains (Extended data Fig. 4a). In comparison, Rco1 SID binds to MRG in two areas that are separated by the second helical hairpin (α5/6) (Fig. 2f). The first area lies beside the first helical hairpin (α2/3) of MRG, against which α6 of SID packs in parallel, mainly via hydrophobic contacts (Extended data Fig. 4b). The second area is created by two helices (α1 and α5) of MRG, where the C-terminal turn of SID binds through several hydrogen bonds and a hydrophobic chain formed by three residues, W236, L353 and L232 (Extended data Fig. 4c).
Compared with Rco1S, Rco1L was built more completely in our complex model. It extends to almost every corner of the complex and associates with all the other resolved subunits (Fig. 2a,b), implying that it works as a “molecular glue” that sticks the other subunits to the scaffold protein Sin3. Notably, the unannotated N-terminal region of Rco1 is partially resolved in Rco1L; these resolved parts, referred to as the N-terminal domain (NTD) and the HDAC-interaction region (HIR) (Fig. 2a), hug both the Sin3 HID C-module and Rpd3 through extensive contacts (Fig. 2b,c and Extended data Fig. 2c,3d), suggesting a regulatory role of Rco1L in locking Rpd3 in the right position for catalysis. In addition, part of the Rco1L C-terminal domain (CTD) could be traced in our EM map (Fig. 2a). It consists of two β-sheets surrounded by extensive loop regions and a long α-helix at the C-terminus, and is located diagonally to PHD1 and right next to PHD2 (Extended data Fig. 3a). Rco1S dimerizes with Rco1L through the same C-terminal long α-helix, primarily via hydrophobic interactions (Fig. 2e). This dimerization interface is further stabilized by Sin3 PAH3 via hydrophobic effects (Fig. 2e, Supplementary Video 1). It has been shown that both copies of Rco1 are necessary for the full functionality of Rpd3S12. However, our Rpd3S structure does not reveal why Rco1S is essential; this might only become clear when a high-resolution structure of Rpd3S in complex with the di-nucleosome is available.
Rpd3 is captured with a histone peptide in the active site
The structure of the deacetylase subunit Rpd3 solved in this study is similar to the previously determined human HDAC3 and HDAC8 crystal structures19,20 (Extended data Fig. 5a,b), featuring a compact α/β domain with a central eight-stranded parallel β-sheet and eleven α-helices (Fig. 2b). Interestingly, extra density is observed on the entry side of the Rpd3 active site in our EM map (Extended data Fig. 2c). This density snugly fits a short lysine-containing peptide that might be attributed to a histone tail. The lysine side chain points toward the catalytic zinc ion (Zn2+), likely forming hydrogen bonds with the carbonyl oxygen of G159 and the side chain of Y313 (Extended data Fig. 5c), suggesting that the peptide is a histone product of a deacetylation reaction. Indeed, superimposition of our Rpd3 structure on the inhibitor-bound HDAC8 crystal structure20 indicates an overlap between the lysine side chain and the part of the inhibitor that inserts into the active site and interacts with the catalytic residues (Extended data Fig. 5c,d). In addition, the side chain of Y313 of Rpd3 adopts an “in” conformation, as the equivalent residue (Y306) does in the HDAC8 structure20 (Extended data Fig. 5c,d), which is a conformation proposed to accommodate substrate and support catalysis21. Since Rpd3S has a broad substrate specificity, we cannot distinguish from which histone the peptide in the Rpd3 active site originates in our structure.
Activation mechanism of Rpd3 by Sin3 HID
As mentioned above, the HID of Sin3 closely associates with the deacetylase Rpd3 via a concerted action of both its N- and C-modules using two different types of interactions (Fig. 2c and Extended data Fig. 3c,g), indicating that it may play a regulatory role in Rpd3 deacetylase activity. Previous studies on HDAC1 and HDAC2 in complex with their cognate co-repressors suggested that the enzymatic activity of class I HDACs is allosterically regulated by inositol phosphates19,22,23. Since Rpd3 is a prototype of class I HDACs, we asked whether it shares a similar activation mechanism. Superimposition of our Sin3 HID-Rpd3 complex structure on the structure of HDAC3 bound to the deacetylase activation domain (DAD) from the human SMRT co-repressor19 (Extended data Fig. 5e,f) reveals that the site corresponding to where inositol-1,4,5,6-tetrakisphosphate (Ins(1,4,5,6)P4) binds in the HDAC3 complex is occupied by α8 of Sin3 HID, which at the same time occupies the binding interface of SMRT-DAD, indicating that Sin3 HID plays a role in activating Rpd3 deacetylase activity. Indeed, the side chain of E812 on α8 replaces one of the phosphate groups of Ins(1,4,5,6)P4 in coordinating the side chain of an equivalent arginine residue (R280) sitting on the L6 loop (Extended data Fig. 5e,f), a loop that was proposed to gate the active site of HDAC319.
Multiple contacts with DNA and histone H3 orient Rpd3S on nucleosome
How Rpd3S engages with its nucleosomal substrates has been a long-standing question in structural biology, and progress has been largely hindered by the lack of a high-resolution structure of the Rpd3S-nucleosome complex. The Rpd3S-nucleosome complex structure determined in our study at near-atomic resolution uncovers three novel DNA contacts between Rpd3S and the mono-nucleosome (Fig. 3, Supplementary Video 1).
Two of the three DNA contacts are mediated by Rco1L through its two helical components (Fig. 3a-c). First, three lysine residues (K320, K321 and K328) located on AID or SID point towards the linker DNA (Fig. 3b), likely contacting the DNA backbone through electrostatic interactions, which explains the previous observations that the linker DNA and Rco1 SID are required for Rpd3S to bind its nucleosomal substrates10,12. Since our structure shows PHD1 to be essential for the binding of SID to the Eaf3 MRG domain (Fig. 2f and Extended data Fig. 4a), it makes sense that disruption of PHD1 also negatively affects the association of Rpd3S with chromatin11. The second DNA contact mediated by Rco1L is achieved via the side chain of R384 on the α8 helix that resides between SID and PHD2 (Extended data Fig. 3a), which interacts with the nucleosomal DNA located at SHL − 1.5 (Fig. 3c). Rco1L PHD2 in our Rpd3S structure fills the hollow centre formed by Sin3 and sits right next to the α8 helix (Fig. 2b and Extended data Fig. 3a,i). Disturbance of PHD2 might influence its association with Sin3, which in turn could disturb the correct positioning of the α8 helix and thus its interaction with the nucleosome11. The third DNA contact occurs around SHL − 2 and involves three α-helices of Sin3 (α13, α22 and α23) (Fig. 3d). This contact likely involves polar interactions mediated by two lysine (K940 and K1244) and two glutamine (Q937 and Q1222) residues.
Apart from that, Rco1L makes polar contacts with histone H3 at residues T80 and D81 through its K122 and R125 residues from the α2 helix of NTD. Rco1L NTD, in the meantime, associates with the C-module of Sin3 HID, indicating that Rco1L promotes the engagement of Sin3 to the nucleosomal DNA. Consequently, Rco1L in the deacetylase complex not only serves as a “molecular glue”, but also provides a platform for nucleosomal engagement.
All these nucleosomal contacts orient the Rpd3S complex at a specific location above the nucleosome disc region, in close vicinity to the resolved N-terminal tails of H3 and H4, which lie approximately 54 and 31 Å from the active centre of Rpd3, respectively. The way that Rpd3S utilizes multiple weak interactions to engage the nucleosome may be essential for Rpd3S to rapidly deacetylate the coding region while it travels with elongating RNA polymerase II24.
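The kind of geometric reasoning behind the contacts and distances described in this section can be illustrated with a minimal sketch that lists residues lying within a distance cutoff of DNA backbone atoms. All coordinates below are hypothetical placeholders rather than values from our model, and the 4.0 Å cutoff and function names are illustrative assumptions, not the criteria used in this study.

```python
import math

# ASSUMPTION: 4.0 Å is an illustrative cutoff, not the threshold used in the paper.
CONTACT_CUTOFF = 4.0


def min_distance(point, others):
    """Return the shortest Euclidean distance (Å) from point to any point in others."""
    return min(math.dist(point, other) for other in others)


def residues_in_contact(residue_atoms, dna_atoms, cutoff=CONTACT_CUTOFF):
    """Return sorted residue labels whose listed atom lies within cutoff Å of any DNA atom."""
    return sorted(
        label
        for label, xyz in residue_atoms.items()
        if min_distance(xyz, dna_atoms) <= cutoff
    )


# HYPOTHETICAL coordinates (Å), chosen only to make the example easy to follow.
residue_atoms = {
    "K320": (0.0, 0.0, 0.0),
    "K321": (3.0, 0.0, 0.0),
    "R384": (20.0, 0.0, 0.0),
}
dna_phosphates = [(0.0, 3.0, 0.0), (3.0, 0.0, 3.5)]

if __name__ == "__main__":
    print(residues_in_contact(residue_atoms, dna_phosphates))
```

With these placeholder coordinates, only the two residues placed within 4.0 Å of a phosphate are reported. The same `min_distance` helper could be applied to a point in the active centre and a histone tail residue to obtain a separation of the kind quoted above.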
Recognition mechanism of H3K36me3 mark by the chromodomain (CHD) of Eaf3
It is well-documented that Rpd3S utilizes the CHD of its Eaf3 subunit to recognize H3K36 methylated nucleosome8,15. In our structure, the CHD of Eaf3 is well resolved together with the H3K36me3 mimic (Fig. 4a and Extended data Fig. 2c), which is accommodated in the aromatic cage formed by residues Y23, Y81, W84 and W88 (Fig. 4b, Supplementary Video 1).
The recognition of H3KC36me3 by CHD is assisted by contacts with nucleosomal and linker DNA. Specifically, the side chain of K26 electrostatically interacts with the nucleosomal DNA at SHL + 1.5, while the side chain of K85 inserts into the interspace between two DNA gyres at SHLs − 7 and + 1 associating with the phosphodiester backbones from both sides (Fig. 4c). Contacts with the linker DNA by CHD are mediated by the C-terminal α-helix (Fig. 4c), which is consistent with the observation that the linker DNA is essential for Rpd3S to bind H3K36-methylated nucleosome10. This multivalent nucleosome recognition mode by CHD is reminiscent of the binding mode of the PWWP domain which recognizes the same histone modification25.
As mentioned above, two copies of Eaf3 are identified in the Rpd3S-nucleosome complex together with Rco1. This makes it difficult to determine which copy the observed CHD belongs to. Since the loop connecting the CHD and the MRG domain is long (more than 80 residues) and predicted to be flexible, the visible CHD could in principle stem from either Eaf3 copy. In our current model, the resolved CHD is presumed to be connected to the MRG domain associated with Rco1L, because this MRG domain lies closer to the CHD than the other copy does. Based on a low-resolution 3D reconstruction of Rpd3S bound to the di-nucleosome, we propose a model in which the two CHDs of Rpd3S simultaneously bind the two mono-nucleosome units within the di-nucleosome, one CHD per unit (Fig. 4d). Such dual association of the CHDs with the di-nucleosome may facilitate the transition of Rpd3S between neighbouring nucleosomes.