Identification and preliminary characterization of peptides which induce polyubiquitination of proteins when expressed in HEK293T cells: A series of peptides was designed that induced polyubiquitination when overexpressed in HEK293T cells. As shown in Fig. 1a, the peptides targeted 3 distinct functional classes of proteins: 1) 4 members of a family of enzymes (VB-001 through VB-004); 2) MAPK1 (VB-012) or TRIM28, an E3 ligase; or 3) 5 additional ligandable E3 ligases (VB-009 through VB-011 and VB-013 through VB-015).
Each peptide induced polyubiquitination of multiple proteins, producing a smear of proteins larger than 100 kd when overexpressed in HEK293T cells and detected by western blotting with an anti-ubiquitin antibody (Fig. 1b). As shown in Fig. 1c, the smear of ubiquitin-conjugated proteins was markedly larger and brighter in peptide-treated cells than in the control (pCMV vector), indicating the presence of multiple ubiquitin-conjugated proteins. Moreover, each peptide produced a characteristic pattern of bands within the smear, suggesting some specificity as to which proteins resided in the polyubiquitinated fractions. VB-002 and VB-003 produced markedly larger smears, which likely stems from the co-chaperone function of their target proteins and may involve extensive interactions with other proteins and/or E3 ligase substrates as well. These initial observations suggested that the peptides shown in Fig. 1a consistently give rise to polyubiquitinated proteins in cells.
Polyubiquitinated proteins have been described in the literature, for instance, in neurodegenerative diseases (Johnston et al., 1998; Dehvari et al., 2012; Maynard et al., 2009), with mutant protein expression (Chadchankar et al., 2009), during oxidative stress or apoptosis (Reeg et al., 2015; Canu et al., 2000), and also during viral infections (Kobayashi et al., 2020). So, while a serendipitous observation, polyubiquitination was, in itself, not surprising.
That the polyubiquitinating property can be localized to regions as small as 25 – 30 amino acids, and that a number of peptides consistently show this effect, suggest an underlying mechanism, which needs to be elucidated. While polyubiquitin-conjugated proteins do not normally accumulate in healthy cells, being rapidly transported to the proteasome for degradation, the peptide-induced polyubiquitinated proteins are stable. The peptides therefore likely also interfere with the delivery of polyubiquitinated proteins to the proteasome for degradation, which might explain their accumulation in cells. For this reason, the peptides were termed PINTACs (Proteasome Inhibiting Targeting Constructs).
These initial observations indicated a role for PINTACs in ubiquitin modification of substrates, and suggested that designing appropriate PINTACs may be a strategy for harnessing E3 ligases and for developing assays for use in targeted protein degradation (TPD) drug discovery research. To realize those opportunities, it was important to determine the fidelity of ubiquitination induced by the peptides by associating each PINTAC with the distinct set of proteins it polyubiquitinates, regardless of whether or not the ubiquitinated proteins are natural substrates of the E3 ligase targeted by the PINTAC.
Targeted ubiquitination is highly selective towards an E3 ligase substrate: The specificity of a PINTAC to differentially target an E3 ligase and/or its substrate was next investigated. Two PINTACs were designed, one directed to MAPK1 (VB-012, 42 kd native size) and the other to the SUMO/E3 ligase TRIM28 (VB-016), which likely targets MAPK1 for degradation (VB-015) in HEK293T cells (unpublished). As shown in Fig. 2a, the two PINTACs produced an identical pattern of ubiquitinated MAPK1, with discrete bands larger than 100 kd. Moreover, there was a near complete disappearance of the native MAPK1 protein, indicating extensive ubiquitination. In the same samples, however, TRIM28 was neither significantly altered nor ubiquitinated, and many other proteins related to MAPK1 function were not conjugated with ubiquitin (Fig. 2c). Only PIK3CD showed a band of slightly lower mobility with VB-012, which is likely not a ubiquitinated form but may be an alternatively spliced isoform. In sharp contrast to VB-012, the VB-015 sample showed a large smear of high molecular weight proteins when detected with an anti-ubiquitin antibody, similar to that in Fig. 1c, indicating the presence of multiple proteins in this fraction.
Taken together, these observations indicated that each of the PINTACs possesses the ability to induce polyubiquitination. It appears that a PINTAC directed to the substrate of an E3 ligase, in this case MAPK1, predominantly modifies the substrate, but a PINTAC directed to an E3 ligase modifies multiple proteins via ubiquitination.
Differential targeting of MEK1/2 by PINTACs directed to each of 4 E3 ligases: Next, an attempt was made to apply the serendipitous observation of polyubiquitination induced by PINTACs to purposefully target E3 ligases. Accordingly, PINTACs were designed to target three different E3 ligases (VB-009, -010, -011, -013 and -014). Polyubiquitinated proteins from the transfected HEK293T cells were observed as a smear of high molecular weight proteins as described above.
The samples were further characterized by western blotting with a panel of antibodies to detect specific proteins, and by challenging the transfected cells with a panel of compounds to observe any alterations to the composition of polyubiquitinated proteins. As expected, most proteins remained unchanged with any of the 5 PINTAC treatments. However, an antibody to MAP2K1 (MEK1) and MAP2K2 (MEK2) readily detected MEK1 and/or MEK2 in the polyubiquitinated fractions with VB-009, -010, and -011 PINTAC treatments, as shown in Fig. 4a. Neither MAPK1/Erk2 nor B-Raf was altered in these samples, nor were many other proteins detected by the panel of antibodies used (not shown). To confirm the ubiquitination, total proteins from each of the samples, plus a control PINTAC which did not polyubiquitinate MEK1 or MEK2 (VB-014, pCMV), were fractionated on a column having a molecular weight cutoff of 100 kd, so as to exclude unconjugated MEK1 (43 kd) and MEK2 (44 kd). The partially purified 100 kd supernatants from the column fractionation were immunoprecipitated with the anti-ubiquitin antibody, and probed by western blotting with the MEK1/2 antibody. As shown in Fig. 4b, the antibody detected MEK1/2 bands of markedly higher molecular weights than the native proteins. MAPK1 was not detectable in the same supernatants, indicating that the column fractionation completely removed MAPK1 and unconjugated MEK1/2. Importantly, this procedure also provided more direct evidence of ubiquitin conjugation of MEK1/2.
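The logic of the 100 kd cutoff can be illustrated with a back-of-envelope calculation. The sketch below is not part of the reported method: it assumes a ubiquitin monomer of roughly 8.6 kd (a textbook value, not stated above), treats conjugate mass as simply additive, and ignores the fact that column retention also depends on protein shape and the nominal nature of cutoffs. The function name is hypothetical.

```python
import math

# Assumption (not from the text): one ubiquitin monomer is about 8.6 kd.
UBIQUITIN_KD = 8.6


def min_ubiquitins_to_exceed(native_kd, cutoff_kd=100.0, ub_kd=UBIQUITIN_KD):
    """Smallest number of ubiquitin moieties n such that
    native_kd + n * ub_kd is strictly greater than cutoff_kd."""
    if native_kd > cutoff_kd:
        return 0  # already above the cutoff without any conjugation
    return math.floor((cutoff_kd - native_kd) / ub_kd) + 1


# Native sizes quoted in the text: MEK1 43 kd, MEK2 44 kd, MAPK1 42 kd.
for name, size in [("MEK1", 43.0), ("MEK2", 44.0), ("MAPK1", 42.0)]:
    n = min_ubiquitins_to_exceed(size)
    print(f"{name}: at least {n} ubiquitins, ~{size + n * UBIQUITIN_KD:.1f} kd")
```

Under these assumptions, a species of about 43 kd would need on the order of seven ubiquitin moieties to be retained above a 100 kd cutoff, which is consistent with interpreting the retained MEK1/2 signal as polyubiquitinated rather than mono-ubiquitinated material.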
Lenalidomide alters the pattern of proteins polyubiquitinated with PINTACs: Lenalidomide, which is actively being applied in the clinic as a molecular glue, is particularly effective in multiple myeloma. The compound, and other thalidomide analogs, target cereblon (CRBN), a substrate receptor for a CRL4-type E3 ligase complex that was originally identified as a gene associated with mild intellectual disability (reviewed in: Ito et al 2021). Upon binding to lenalidomide, CRBN mediates its pharmacological activities by engaging over a dozen neosubstrates and targeting them for degradation.
Several PINTAC-transfected HEK293T cells were treated with lenalidomide. As shown in Fig. 3, lenalidomide markedly decreased polyubiquitination by the MAPK1 PINTAC, VB-012, and also by VB-015, but enhanced the effect with VB-009 and VB-010, both of which are designed to target the same E3 ligase (not cereblon). The pronounced effects of lenalidomide on VB-009 and VB-010 are highly intriguing, and need to be fully understood. Competition for ubiquitin reagents or enzymes, alteration of protein networks necessary for CRL4 function, or even direct crosstalk between the E3 ligase targeted by the PINTACs and the CRL4 E3 ligase may account for the observed effects.
Since PINTACs polyubiquitinate many substrates, it would be highly advantageous if they could be applied for targeting CRL E3 ligase subunits, cullins, or other essential proteins necessary for their function. If so, it may become possible to apply the combined effects of the PINTACs and lenalidomide in the clinic for improving selectivity towards neotargets, finding new substrates, or overcoming drug resistance in multiple myeloma, such as that arising from aberrant Wnt signaling (van Andel et al, 2019). Enhancement of VB-009 and VB-010 polyubiquitination by lenalidomide may suggest crosstalk between the E3 ligase targeted by them and CRL4, but the mechanism needs to be fully understood. Characterizing the polyubiquitinated fractions from these samples with or without lenalidomide treatment may identify the proteins responsible, and the pathways involved. Also, these studies need to be extended with additional thalidomide analogs to understand their specificities.
From the foregoing, it is clear that PINTACs can be used to deliberately interfere with the functioning of the UPS, bear some selectivity towards the proteins which are polyubiquitinated by each, and can be further modulated with compound treatments.
Mass spectrometric detection of proteins in the polyubiquitinated fraction: Identifying the proteins which reside in the polyubiquitinated fraction can help understand the mechanism and substrate specificity of the PINTAC, identify E3 ligase substrates, help design assays for drug discovery research, and screen for drug candidates with therapeutic applications. While some ubiquitinated proteins are expected in the 100 kd supernatants even from normal cells, the stimulation of ubiquitination provided by PINTAC treatments substantially increases their relative levels, and enables their detection over the cellular background. Toward this goal, the 100 kd supernatants from VB-011 PINTAC and two additional control PINTACs (directed to other E3 ligases) were subjected to differential mass spectrometry.
A total of 3,084 proteins were identified across the three samples, represented by 37,593 peptides from VB-009 sample, and 32,065 and 5,599 peptides, respectively, for two control samples. The mass spectrometry output from VB-009 sample was queried for proteins with a native molecular weight under 100 kd, represented by at least 20 peptides per protein, and, with the number of peptides being in 2-fold excess (or higher) when compared with at least one of the control samples (evidence of stimulation). Such proteins were considered specifically polyubiquitinated in response to VB-009 expression and selected for further analysis. This identified 40 proteins (Table I) which were substantially enriched in the VB-009 sample. Consistent with the western blot data in Fig. 4, MEK1 and MEK2 were identified in the mass spectrometry data from VB-009 and VB-009, but not in the control sample which did not detect MEK1/2 by western blotting.
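The three selection criteria described above (native molecular weight under 100 kd, at least 20 peptides, and a 2-fold or greater excess over at least one control) can be sketched as a simple filter. This is an illustrative sketch only, not the analysis pipeline actually used: the class and function names are hypothetical, the protein names are placeholders, and the peptide counts are invented for demonstration.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ProteinHit:
    name: str
    native_mw_kd: float                # native (unconjugated) molecular weight
    peptides_target: int               # peptide count in the VB-009 sample
    peptides_controls: Tuple[int, ...]  # peptide counts in the control samples


def is_enriched(hit, max_mw_kd=100.0, min_peptides=20, min_fold=2.0):
    """Apply the three criteria stated in the text."""
    if hit.native_mw_kd >= max_mw_kd:
        return False  # native size must be under the cutoff
    if hit.peptides_target < min_peptides:
        return False  # needs at least 20 peptides
    # 2-fold or higher excess over at least one control sample;
    # a control count of zero trivially satisfies this.
    return any(hit.peptides_target >= min_fold * c for c in hit.peptides_controls)


def select_enriched(hits, **thresholds):
    return [h.name for h in hits if is_enriched(h, **thresholds)]


# Invented example values, for illustration only.
hits = [
    ProteinHit("PROT_A", 43.0, 40, (10, 35)),   # passes: 40 >= 2 * 10
    ProteinHit("PROT_B", 120.0, 60, (5, 5)),    # fails: native size >= 100 kd
    ProteinHit("PROT_C", 30.0, 12, (1, 2)),     # fails: fewer than 20 peptides
    ProteinHit("PROT_D", 55.0, 25, (20, 15)),   # fails: under 2-fold vs both controls
]
print(select_enriched(hits))
```

Because the two control samples differ widely in total peptide counts (32,065 versus 5,599), a raw-count ratio of this kind is sensitive to sample depth; normalizing counts per sample before applying the fold threshold would be a reasonable variation.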
A number of proteins involved in the ubiquitination process were also enriched with VB-009. These include the E2 enzymes UBE2D2, UBE2K, and UBE2V2, all of which are known interactors of the intended E3 ligase target of VB-009. Among the ubiquitin proteins likely conjugated to the proteins in this study, UBA1, RPS27A, SAE1, and the ubiquitin fold-containing protein GABARAPL2 were highly enriched, and UBL4A was represented. The deubiquitinases OTUD6B and UCHL1 were enriched in the VB-009 sample compared to the control samples.
As shown in Table I, 18 proteins possess either an RNA binding property or are components of the splicing or translation machinery. Five additional proteins associate with DNA. Curiously, 15 proteins are known to associate with stress granules (SGs), which are cytosolic membraneless organelles involved in RNA metabolism, post-transcriptional regulation, and translational control [Reviewed in: Youn et al, 2019]. Believed to form through phase separation enabled by a combination of interactions among different molecular entities, SGs exhibit a very large number of inter-molecular interactions, including RNA-RNA interactions (Van Treeck and Parker, 2018), protein-protein interactions, and RNA-protein interactions.
SGs have been implicated in neurodegenerative diseases: Cellular ubiquitination processes are involved in the maintenance of SGs, and may be dysregulated in Amyotrophic Lateral Sclerosis (ALS) and Frontotemporal dementia (FTD) (Maxwell et al, 2021; Farrawell et al, 2020). Aberrant SG dynamics and a growing number of RNA binding proteins are being investigated as candidates in both diseases (Olney, N.T et al, 2017). Besides FUS, low levels of TDP43, EWSR1, and SMN1 were also identified in the VB-009 samples. TAR DNA-binding protein 43 (TDP43), FUS, EWS RNA Binding Protein 1 (EWSR1), TAF15, hnRNPA1, hnRNPA2B1, ATXN2, and TIA1 are the prime candidates which cause or influence disease (Baradaran-Heravi et al, 2019).
RNA binding proteins are frequently mutated in ALS and FTD: FUS and TAR DNA-binding protein 43 (TDP-43) rank 1st and 10th among the disease candidates forming cytoplasmic inclusions in the degenerating motor neurons of ALS patients, and mutations in TDP-43 and FUS cause familial ALS. The finding of both these proteins (including RPL3 [Tar-RNA binding protein]) among the ubiquitin-conjugated candidates with VB-009 therefore suggests a potential link from these proteins to the ubiquitin system, and likely to an E3 ligase which may affect their degradation.
FUS and TDP43 mutation spectrum in ALS and FTD:
To date, more than 50 different FUS mutations have been described in patients with ALS (Deng et al., 2014), of which many disrupt the nuclear localization signal and result in mislocalization of FUS to the cytoplasm (Dormann et al., 2010; Lagier-Tourenne et al., 2010). The expression in neuronal-like cells of either mutant TDP-43 (M337V) or mutant FUS (R495X) led to UPS dysfunction, suggesting dysregulation of the UPS as an additional feature of ALS pathology (Farrawell et al, 2020). TDP-43 is depleted from the nucleus and found as hyperphosphorylated, aggregated cytoplasmic inclusions in ∼97% of ALS and ∼50% of FTD patients (Giordana et al., 2010). Most of the ALS-associated mutations appear in exon 6, which encodes the C-terminal glycine-rich region of TDP-43. N-terminal mutations are rare, but the missense mutations A90V and D169G are causative in ALS as well as FTD.
Splicing factor genes are mutated in myeloid malignancies: LUC7L2, SRSF2, and U2AF1 are among the proteins mutated at frequencies ranging between 40% and 85% in different subtypes of myelodysplastic syndrome (MDS) (Visconte et al, 2019). Mutations in U2AF1 at codon S34 and Q157 are found in about 11% of patients with MDS. Likewise, the expression of the L166P mutated form of PARK7 leads to enhanced degradation through the ubiquitin-proteasome system.
Identification of peptide signatures or ‘motifs’ in FUS shared with other proteins in the VB-009 polyubiquitinated fraction: It is highly intriguing that several of the proteins ubiquitinated with VB-009 expression are frequently mutated in neurological diseases and myeloid malignancies, and may be candidates for targeted degradation in the clinic. Since a peptide directed against a single E3 ligase was used to stimulate polyubiquitination of the 40 proteins identified in this study, there exists the possibility that they may interact with the UPS, or a component of it, in a common manner. Therefore, the protein sequences were investigated further to explore if some common theme emerges, such as, structural or functional motifs that may be shared by some proteins, which may enable targeting them individually or as a group.
E3 ligases recognize their targets through specific motifs referred to as degrons, which may either be a stretch of linear amino acids (physical degrons), or be comprised of discontinuous sequences brought into close proximity by the folding of the protein (structural degrons). Degrons have been identified for some E3 ligases, such as SCF-FBXL17, APC/C, SCF-βTrCP, and SPOP (reviewed in: Jevtic et al, 2021). Similarly, the substrate proteins contain conserved sequences, or motifs, which are recognized by the cognate E3 ligase for target engagement prior to ubiquitination. For substrate engagement, degrons may require posttranslational modifications, such as phosphorylation (Winston et al., 1999), acetylation (Shemorry et al., 2013), hydroxylation (Ivan et al., 2001; Jaakkola et al., 2001), ADP ribosylation (Zhang et al., 2011), or arginylation (Yoo et al., 2018), or may be inactivated by oxidation (Manford et al., 2020).
A variety of approaches have been effectively employed for identifying motifs in E3 ligases, including protein interactions (House, 2003; Venables, 2004; and Buchwald, 2013), structural studies (Santelli et al, 2005), and miRNA knockdown (Murphy Schafer et al, 2020) with the E3 ligase Siah1. The polyubiquitinated proteins reported here represent a signature of the action of one or more E3 ligases within the complex cellular environment. Accessory factors, protein modifications, proximity, protein network alterations or other factors may determine the range or substrate specificities, and many of these interactions can be motif-based as well. To search for such sequences in the output from VB-009, protein sequences of the candidates in Table I were aligned using COBALT and refined to shorter stretches of ~100 amino acids which exhibited maximal homology. Only contiguous sequence homologies were considered; bipartite sequences were not searched. Additionally, the signatures of proteins within any group may be structurally similar in native proteins or after modifications (structural degrons); such similarities were not searched.
As shown in Table I, 33 proteins from the polyubiquitinated fraction with VB-009 could be assigned to one of 5 distinct consensus sequences. Group 1, consisting of 7 proteins, is rich in Arg and Gly residues, with some Ser residues as well. Only FUS contains the Tri-RGG motif within the homology region (RGG(X0–4)RGG(X0–4)RGG), while SRSF1 contains the di-RG motif (RG(X0–4)RG) (Thandapani et al). Group 1 proteins contain RRM domains in their structure [FUS, RBM26, EIF4B, EIF4H, EIF5A, SRSF1, and SRSF6]; however, the regions of maximal sequence homology among these proteins did not consistently map to the RRM domains. Further work is needed to ascertain if the RGG motifs constitute degrons or are involved in interacting with an E3 ligase.
Evidence from the literature suggests that the Group 1 signature may possess biological function. For instance, several of the amino acids in the FUS homology region are located within the nuclear localization signal (NLS) and are frequently mutated in Amyotrophic Lateral Sclerosis (ALS) [Deng and Jankovich]. The GRG triplet (residues 486-488), DRG (502 to 505), G (507), S (513), and RP (524-525) amino acids share homologies with other proteins in Group 1 [Chong et al]. Some clinically significant FUS mutations are truncated at G466X, R495X, and G456vfsx, or may alter the secondary structure of the motif (G472X, or R521G, R521L, R521C). These peptide signatures are being tested to determine whether they can serve as an E3 ligase interaction site, more specifically for the E3 ligase targeted by VB-009.
The Group 2 signature sequence was not as striking as that of Group 1, but was generally rich in charged or modifiable amino acids (K, R, Q, N, or T) with interspersed serine, glycine or alanine. Importantly, the amino acid region of PARK7 (DJ-1) exhibiting homology with other members of this group is known from the literature to contain mutations which cause autosomal recessive forms of Parkinson’s disease (PD). A107, E113, and P158 are frequently mutated in PD, and I105, L116, L122 and T154 are located within the homologous regions of Group 2 (Hering et al, 2004). Earlier studies of the E64D variant in fibroblasts from a patient bearing the homozygous mutation showed that levels of the protein are decreased. Likewise, the E163K mutation reduces the stability of the protein in vitro, and the P158del variant is unstable when expressed in cells. Besides HMGB1, this group also includes HMGB2 and HMGB3, which were identified in the sample but not included in the homology search on account of the high level of sequence conservation with HMGB1. This group also includes FABP5, in which the G114R and N124S polymorphisms have been implicated in schizophrenia and autism (Shimamoto et al, 2014).
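The Tri-RGG and di-RG patterns quoted above translate directly into regular expressions, where X0–4 is read as "any 0 to 4 residues". The sketch below is illustrative only and is not the search actually performed in this study; the function name is hypothetical and the test sequence is synthetic, not a real protein sequence.

```python
import re

# RGG(X0-4)RGG(X0-4)RGG and RG(X0-4)RG, with X = any residue (one-letter code).
TRI_RGG = re.compile(r"RGG[A-Z]{0,4}RGG[A-Z]{0,4}RGG")
DI_RG = re.compile(r"RG[A-Z]{0,4}RG")


def find_motifs(sequence):
    """Return 1-based start positions and matched text for each motif.

    Note: finditer reports non-overlapping matches only, so overlapping
    motif instances in repeat-rich regions would be under-counted.
    """
    seq = sequence.upper()
    return {
        "tri_RGG": [(m.start() + 1, m.group()) for m in TRI_RGG.finditer(seq)],
        "di_RG": [(m.start() + 1, m.group()) for m in DI_RG.finditer(seq)],
    }


# Synthetic sequence for demonstration (not taken from any protein in Table I).
example = "AARGGSRGGDRGGKK"
print(find_motifs(example))
```

Applied to the ~100 amino acid homology regions rather than to full-length proteins, a scan of this kind would distinguish regions carrying a Tri-RGG motif from those carrying only a di-RG motif, as described for FUS and SRSF1.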
The Group 3 signature sequence exhibited a high degree of sequence conservation, and may involve a peroxiredoxin fold. The remaining homology groups also consisted of at least one protein mutated (or possessing a causative polymorphism) in hematological or neurological diseases. This includes the S34 and Q157 mutations in U2AF1 found in about 11% of patients with myelodysplastic syndrome. The significance of these mutations in relation to any biological activity of the homologies identified in this study needs to be fully understood, particularly in the context of the ubiquitin proteasome system function.
Finally, it remains to be determined why several putative motifs were observed in this study, although the PINTAC was directed to a single E3 ligase. The typical protein interaction motif is around 6-12 amino acids in length or shorter. So, the PINTACs may contain a few tandem or overlapping motifs in their sequence (currently 30 amino acids), each with the ability of recruiting distinct sets of proteins. Additional proteins, such as ubiquitin accessory factors or chaperones may partially determine interaction specificity, and may have recruited some proteins. G3BP2, DNAJC8, TBCA, SUGT1, STIP1, or U2AF1 are likely candidates and additional proteins not identified in this study may also be involved. Many other RNA binding proteins and chaperones were identified in this study but are not presented here since they did not meet the selection threshold of at least 20 peptides represented in the mass spectrometry data. Optimization of PINTAC sequence, deeper sequencing of the polyubiquitinated fraction with mass spectrometry, computational analysis of the candidate motifs, and comparisons of the ubiquitinated fraction across more samples (directed towards different E3 ligases) may aid their identification, as well as the search for motifs.