The N-terminal tail of histone H2A is cleaved during mouse embryonic stem cell differentiation
Prior studies have reported that histone H3 is cleaved upon cellular differentiation14. To determine if differentiation involves the cleavage of other histones, we treated mouse embryonic stem cells (mESCs) with retinoic acid (RA) to induce their differentiation into embryoid bodies (EBs, Fig. 1A). Using immunoblot analysis, we found that histone H2A is cleaved upon differentiation at days one and four (Fig. 1B), whereas no change was observed for H2B. Next, we asked whether cleaved H2A (cH2A) still associates with chromatin by isolating mono-nucleosomes from both undifferentiated mESCs and EBs using micrococcal nuclease (MNase) digestion. As shown in Fig. 1C, cH2A is present only in mono-nucleosomes derived from EBs and not in those from undifferentiated cells (consistent with our results in Fig. 1B), suggesting that cH2A is chromatin associated.
Next, we used quantitative mass spectrometry (MS) to identify the cleavage site(s) on H2A and quantify the levels of cH2A upon differentiation. Histones from differentiated and undifferentiated mESCs were extracted with acid and then further purified by reverse-phase high-performance liquid chromatography (RP-HPLC). Characteristic fractions corresponding to canonical H2A.119 (Supplementary Fig. 1A) were analyzed by liquid chromatography coupled to MS (LC-MS). In agreement with our immunoblot data, we detected cleavage of H2A at day two of differentiation, which increased significantly by day four of RA treatment. The most abundant cleavage site was at L23, with ~1% of all H2A being cleaved at this site (Fig. 1D and E). Other cleavage sites that increased through differentiation were also identified at F25, V27 and G37. Although additional cleavage sites were detected along the H2A tail (Supplementary Fig. 1B-D), their abundance did not change during differentiation, indicating that cleavage at these sites is not regulated during differentiation.
Numerous variants of H2A exist, including macroH2A, H2A.Z, and H2A.X, which have different roles in cellular differentiation20. Although addressing whether H2A variants undergo cleavage during development is beyond the scope of this work, sequence alignment shows that the cleavage motif is conserved across the majority of the variants (Supplementary Fig. 1E). It remains to be determined if these variants are being cleaved during differentiation or other biological processes.
Cathepsin L facilitates H2A proteolysis upon cellular differentiation
Previously, we reported that cathepsin L (CTSL) cleaved H2A in vitro 21, and other studies have similarly found CTSL-mediated cleavage of histones in vitro using reconstituted nucleosomes22. Analysis of the cleavage sites of CTSL substrates has revealed a motif consisting of glutamine at the P1 position and aromatic residues at the P2 position23. The H3 cleavage motif Q/LAT (/ indicates cleavage site)14 is similar to the site where H2A is cleaved (G/LQF), as both contain hydrophobic and aromatic amino acids (Fig. 2A). Together, these observations led us to hypothesize that CTSL serves as the H2A protease.
To investigate this further, we knocked down the expression of CTSL using shRNA (shCTSL) in mESCs (shSC was used as control) (Fig. 2B). Next, we differentiated mESCs (shCTSL and shSC) for two days, acid-extracted histones and analyzed them by immunoblot. Although shCTSL mESCs still differentiated into embryoid bodies, cleavage of H2A was reduced compared to the control (Fig. 2C). Similarly, when we blotted for total H3, we found less cleavage of histone H3 (cH3) in shCTSL cells than in control cells (Fig. 2C). To quantify the levels of cH2A upon CTSL knockdown, we employed top-down MS to monitor intact H2A as well as the cleavage products. Again, we observed a significant reduction of cH2A when the expression of CTSL was reduced (Fig. 2D and Supplementary Fig. 2A and B). Proteolysis in the control cells was observed not only at L23 but also at other residues, including V27, suggesting that CTSL cleaves H2A at multiple sites. Nevertheless, the product of cleavage at L23 was three-fold more abundant than that of cleavage at V27, indicating that L23 is most likely the primary cleavage site for CTSL in vivo (Fig. 2D).
To understand how cleavage of H2A by CTSL affects post-translational modification (PTM) patterns, we analyzed acetylation levels at H2AK5 and H2AK9, as these H2A N-terminal marks are associated with gene activation24. Our mass spectrometry data showed that the acetylation levels of H2AK5 and H2AK9 change during normal ESC differentiation. In undifferentiated cells, the combined level of all acetylated forms of H2A (K5ac, K9ac and dually acetylated K5acK9ac) is approximately 10% (Supplementary Fig. 2C-E). Two days after RA treatment, acetylation levels increase significantly to 19%. Finally, after four days of RA treatment, acetylation levels decrease to 7%. To interrogate how CTSL proteolysis affects the levels of acetylated H2A during differentiation (Fig. 2E), we quantified the levels of acetylated H2A in control and shCTSL cells after RA treatment by spiking in synthetic peptides corresponding to H2AK5ac, H2AK9ac, H2AK5acK9ac and the unmodified H2A peptide. As shown in Fig. 2F, the levels of H2AK5ac decreased significantly after four days of RA treatment in control but not in shCTSL cells. In fact, the levels of H2AK5ac in shCTSL cells were relatively unchanged. Similarly, H2AK9ac levels did not decrease in shCTSL cells. Although the levels of H2AK9ac and H2AK5acK9ac seemed lower in shCTSL cells than in control cells at day two, acetylation levels in shCTSL cells likewise did not change from day two to day four. Together, our data indicate that CTSL-mediated proteolysis serves to rapidly remove multiple acetyl marks on the H2A N-terminus, thereby potentially regulating gene expression during differentiation.
Knockdown of CTSL leads to genome-wide redistribution of acetylated H2A in stem cells
Given that histone tails, specifically those of H3 and H4, have been shown to be dynamically modified during stem cell differentiation 25, and that our observations indicate that H2A is also dynamically acetylated during differentiation, we asked how proteolysis regulates the genome-wide localization of H2A acetylation. We performed ChIP-seq for H2AK9ac in shSC and shCTSL mESCs as well as in differentiated embryoid bodies (Fig. 3A). Alterations in the total levels and genomic distribution of H2AK9ac in undifferentiated CTSL KD cells compared to control cells were not apparent by MS or ChIP-seq (Supplementary Fig. 3A-B). In agreement with our MS data, we found a genome-wide decrease in H2AK9ac on day two of treatment when CTSL was knocked down, but not at day four (Fig. 3B). Moreover, when determining the genomic distribution of H2AK9ac, we found a 4% increase at promoters in CTSL KD EBs (Supplementary Fig. 3D). Next, we performed differential binding analysis and found that ~1.1% of peaks changed significantly between day two and day four in the control (Log2 fold change > 1 (up) or < -1 (down), FDR < 0.05), which was not observed upon knockdown of CTSL (Fig. 3D). This led to the question of whether the expression of the associated genes is altered upon CTSL knockdown. To answer this question, we performed RNA-seq in shSC and shCTSL cells.
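The thresholds quoted above for the differential binding analysis (Log2 fold change > 1 or < -1, FDR < 0.05) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study; the `Peak` record, the function names and the example values are hypothetical and serve only to make the classification rule explicit.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Peak:
    """Hypothetical record for one H2AK9ac peak (day four vs. day two)."""
    peak_id: str
    log2_fold_change: float
    fdr: float


def classify_peak(peak: Peak, lfc_cutoff: float = 1.0, fdr_cutoff: float = 0.05) -> str:
    """Label a peak 'up', 'down' or 'unchanged' using the cutoffs from the text."""
    if peak.fdr >= fdr_cutoff:
        return "unchanged"  # not significant, regardless of effect size
    if peak.log2_fold_change > lfc_cutoff:
        return "up"
    if peak.log2_fold_change < -lfc_cutoff:
        return "down"
    return "unchanged"  # significant, but the change is too small


def fraction_changed(peaks: Iterable[Peak]) -> float:
    """Fraction of peaks labelled 'up' or 'down' (0.0 for an empty input)."""
    peaks = list(peaks)
    if not peaks:
        return 0.0
    changed = sum(1 for p in peaks if classify_peak(p) != "unchanged")
    return changed / len(peaks)


# Illustrative values only; these are not data from this study.
example_peaks = [
    Peak("peak_1", 1.5, 0.01),
    Peak("peak_2", -2.0, 0.01),
    Peak("peak_3", 3.0, 0.20),
    Peak("peak_4", 0.4, 0.001),
]
print(fraction_changed(example_peaks))
```

Both conditions must hold for a peak to count as changed, so a large fold change with a high FDR, or a significant but small change, is reported as unchanged.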
We hypothesized that genes that lose H2AK9ac after four days of RA treatment would also have lower expression at day four than at day two. As demonstrated in Fig. 3D, genes (n = 347) with significantly less H2AK9ac have lower expression after four days of differentiation in the control but not upon CTSL knockdown. Interestingly, gene ontology (GO) analysis showed that genes with significantly decreased H2AK9ac in WT cells (n = 347) at day four of RA treatment are involved in cellular differentiation and nervous system development (Supplementary Fig. 3C). Taken together, these data suggest that H2A proteolysis by CTSL aids in gene regulation by silencing genes involved in pluripotency while activating genes to promote cell lineage commitment.
Acetylated H2A is recognized by the PBAF complex in mESCs
Changes in H2A acetylation patterns upon loss of CTSL expression likely alter the recruitment of regulatory proteins to chromatin. In addition to H2AK9ac, our MS data showed increased levels of H2AK5ac in differentiated cells upon CTSL KD (Fig. 2F and Supplementary Fig. 2C). This mark is associated with gene activation24,26, but its reader protein remains to be identified. Thus, to identify potential readers, we used synthetic peptides corresponding to H2AK5ac, H2AK9ac and H2AK5acK9ac as bait in peptide pulldown assays. Isolated proteins were characterized by LC-MS/MS with further validation by immunoblot (Fig. 4A). As a negative control, we used an H2A peptide with only N-terminal acetylation, since H2A is known to be co-translationally acetylated at the N-terminal serine residue by NatD27,28. All other H2A peptides also include this N-terminal acetylation.
The H2AK5ac and H2AK9ac peptide baits captured members of the Polybromo-1 BRG1 Associating Factor (PBAF) chromatin remodeler complex, including Brd7 and Pbrm1 (Fig. 4B, C). Similarly, the dually acetylated H2AK5acK9ac was also able to pull down PBAF members, such as Smarcd1, Phf10, Pbrm1 and Brd7. Enrichment of PBAF proteins was more prominent when H2A was acetylated at K5 and K9 simultaneously (Fig. 4D, Supplementary Fig. 4A). Similarly, Brd7 preferentially binds to dually acetylated H2A (Fig. 4F). However, recombinant Brd7 appears to bind better to H2AK5ac alone in vitro (Supplementary Fig. 4B), suggesting that additional proteins may enhance recognition of dually acetylated H2A. Furthermore, we found that Brg1 showed similar affinity for all acetylated peptides (Fig. 4D and Supplementary Fig. 4A). Since histone H4 is also known to be acetylated at its N-terminal tail, which resembles the sequence of H2A, we reasoned that acetylated H4 may also interact with the PBAF complex. Repeating our affinity pulldown experiments using unmodified and dually acetylated (K5acK8ac) H4 peptides, we observed enrichment of PBAF by the acetylated peptide compared to the unmodified control (Fig. 4F and Supplementary Fig. 4B). However, as expected, the main reader for H4K5acK8ac, Brd4 25,29, attained greater enrichment than Brg1 and Brd7 (Supplementary Fig. 4B-C). Pbrm1 has been previously reported to bind H3K14ac30, which we likewise observed in our experiments. However, we found that Pbrm1 also binds to H2AK5acK9ac (Fig. 4F). Additionally, we observed that TBP binds acetylated H2A and H4, highlighting its positive correlation with gene activation (Supplementary Fig. 4D).
We next performed ChIP-Seq for PBAF-specific proteins (Pbrm1 and Arid2) to identify genes that are co-occupied by H2AK9ac and PBAF in mESCs. Our data showed that 54% of all Arid2 peaks contain H2AK9ac (n = 9257). Similarly, 55% of all Pbrm1 peaks are also marked with H2AK9ac (n = 8316) (Fig. 4G and Supplementary Fig. 4E and F). When we compared the PBAF-occupied regions without H2AK9ac to those where the two coexisted, we found a marked increase of PBAF at gene promoters with acetylated H2A present (Fig. 4H). For instance, 7.6% of peaks exclusive to Arid2 are found at promoters, whereas 13.3% of peaks correspond to promoter regions when acetylated H2A is present. Likewise, we noticed a 4% increase of Pbrm1 localization at promoters in the presence of acetylated H2A, suggesting that H2AK9ac facilitates PBAF localization to promoters. To relate H2AK9ac and PBAF co-occupancy to transcription, we compared the expression of genes containing both H2AK9ac and PBAF at their promoters to that of genes containing PBAF alone. As shown in Fig. 4I, genes that are co-occupied by PBAF and acetylated H2A have significantly higher expression than those marked only by PBAF (P-value < 0.0001), suggesting that potential recruitment of PBAF by H2AK9ac promotes gene expression.
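The co-occupancy percentages above are, in essence, the fraction of peaks in one set that overlap at least one peak in another. The Python sketch below shows that calculation under simple assumptions: peaks are half-open `(chromosome, start, end)` intervals, any overlap of one base pair or more counts, and a naive pairwise comparison is acceptable. The function names and coordinates are hypothetical and are not taken from this study's analysis.

```python
from typing import List, Tuple

# A peak as (chromosome, start, end), with half-open coordinates [start, end).
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals on the same chromosome share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def fraction_overlapping(query: List[Interval], reference: List[Interval]) -> float:
    """Fraction of query peaks overlapping any reference peak (0.0 if no query peaks).

    Naive O(len(query) * len(reference)) scan, adequate for a small example;
    genome-scale peak sets would normally use a sorted or indexed structure.
    """
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Illustrative coordinates only; these are not peaks from this study.
arid2_peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200), ("chr2", 900, 950)]
h2ak9ac_peaks = [("chr1", 150, 250), ("chr2", 940, 1000), ("chr3", 100, 200)]
print(fraction_overlapping(arid2_peaks, h2ak9ac_peaks))
```

The same function applied to promoter intervals as the reference would give the promoter fractions compared in the text, once for PBAF-only peaks and once for peaks shared with H2AK9ac.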
H2A proteolysis prevents PBAF recognition of acetylated H2A
Since modifications on histone tails can regulate the recruitment of proteins to chromatin, proteolytic severing of the tails could preclude the interactions normally mediated by those modifications. We hypothesized that removal of the H2A tail, and therefore of H2A tail acetylation, would disrupt binding of PBAF to H2A. To test this, we performed co-immunoprecipitation followed by MS (IP-MS) analysis using mESCs expressing either FLAG-tagged full-length H2A (FL-H2A) or N-terminally truncated H2A, representing cleavage at L23 (cH2A). Comparison of the interactomes of FL-H2A and cH2A showed that 96% of the proteins interacted with both cH2A and FL-H2A (Fig. 5A). However, in agreement with our previous findings, members of the PBAF chromatin remodeler complex (Smarcd1 and Actl6a) were found to interact only with FL-H2A and not with cH2A. Additionally, we found that Importin-9 (Ipo9) was enriched in the cH2A sample compared to FL-H2A (Fig. 5B and C). Ipo9 is known to translocate H2A-H2B dimers from the cytosol to the cell nucleus 31. We also found that Npl1 (nucleosome assembly protein 1, also known as Nap1), which is known to exchange H2A/H2B dimers 32, was enriched by cH2A over FL-H2A. Interestingly, we found that Cbx1, Cbx3 and Cbx5 preferentially interact with FL-H2A (Fig. 5B and C). Gene ontology analysis showed that the proteins enriched by cH2A are typically associated with RNA splicing and gene expression, while those enriched with FL-H2A are involved in nucleosome assembly and DNA packaging (Supplementary Fig. 5A-B).
We next validated our IP-MS results by co-immunoprecipitation assays followed by immunoblotting. As shown in Fig. 5D and Supplementary Fig. 5C, Pbrm1 co-precipitated with FL-H2A to a greater extent than with cH2A. Additionally, Pbrm1 but not Smarcc2 (Baf170) co-precipitated with FL-H2A, suggesting that Pbrm1 interacts with acetylated H2A through its bromodomains, which bind to acetylated lysine residues. Given that proteolysis removes the binding substrate for PBAF, we hypothesized that CTSL KD cells should have higher PBAF occupancy at H2AK9ac sites. To test this hypothesis, we focused on day four, when cH2A abundance is at its maximum. We performed ChIP-Seq for Pbrm1 and Arid2 in shSC and shCTSL cells at day four of RA treatment. When we analyzed the occupancy of PBAF at H2AK9ac peaks, we noticed an increased occupancy of both Arid2 and Pbrm1 in shCTSL cells (Fig. 5E-F), indicating that PBAF remains bound to acetylated H2A when CTSL levels are reduced. Taken together, these results indicate that the PBAF protein complex recognizes acetylated H2A and that H2A proteolysis abrogates this recognition (Fig. 5G).
cH2A is associated with marks of active transcription and fast turnover
Our IP-MS results revealed a reduced interaction between CBX proteins and cH2A. Cbx1, also known as heterochromatin protein 1 β, plays an important role in gene silencing through its interaction with methylated H3K933. This suggests that nucleosomes enriched with cH2A are depleted of histone H3K9me marks. To examine this possibility, we purified mononucleosomes containing cH2A and analyzed their PTMs by MS (Supplementary Fig. 6A). As expected, nucleosomes with cH2A were mostly depleted of mono-, di- and tri-methylation at H3K9 compared to those with FL-H2A (Fig. 6A, B). Instead, we found that cH2A-containing nucleosomes had higher levels of histone marks associated with gene expression, such as H3K14ac and H3K36me3 (Fig. 6A). Taken together, these data suggest that FL-H2A can coexist with H3K9me in certain parts of the genome, while cH2A is most likely to be temporarily associated with accessible chromatin, because its precursor, FL-H2A, is hyperacetylated on promoters of active genes.
Aside from affecting histone modifications, cleavage of H2A by CTSL could also affect nucleosome structure and stability. Our recent publication using hydrogen-deuterium exchange coupled with mass spectrometry (HDX-MS) showed that, in a nucleosome context, the N-terminal tail of H2A is protected from exchange34, which is consistent with the finding that histone tail proteolysis destabilizes the nucleosome in vitro 35. The loss of the N-terminal tail of H2A likely disrupts DNA-histone interactions that play important roles in the maintenance of nucleosome structure. Indeed, during cellular processes such as replication and transcription, the nucleosome undergoes conformational changes to allow access to DNA. Interestingly, we found that cH2A preferentially interacts with Npl1 (Fig. 5B and C), a protein known to exchange H2A/H2B dimers. Thus, we hypothesized that cH2A-containing nucleosomes are more readily exchanged or evicted than FL-H2A-containing nucleosomes. To measure the stability of cH2A in cells, we blocked protein synthesis with cycloheximide. We found that while FL-H2A protein levels remain stable for at least eight hours, cH2A undergoes rapid degradation (Fig. 6C). Notably, in our IP-MS experiment, we found that cH2A but not FL-H2A interacted with Trim21 (Fig. 5A), a ubiquitin ligase known to target proteins for proteasomal degradation 36. To investigate whether cH2A was being degraded by the proteasome, cells were treated with the proteasome inhibitor bortezomib and cH2A stability was subsequently monitored for eight hours. As shown in Fig. 6D, the levels of cH2A still decrease after bortezomib treatment. This was further confirmed by treating cells simultaneously with cycloheximide and bortezomib (Fig. 6E), indicating that cH2A may be degraded by another cellular degradation pathway.
Overall, our data suggest that histone proteolysis occurs in open chromatin regions to remove key histone PTM sites and disrupt binding interactions, thus facilitating nucleosome destabilization and eviction and promoting silencing of pluripotency-related genes.