A novel binding site on the cryptic intervening domain is a motif-dependent regulator of O-GlcNAc transferase

The modification of intracellular proteins with O-linked β-N-acetylglucosamine (O-GlcNAc) moieties is a highly dynamic process that spatiotemporally regulates nearly every important cellular program. Despite its significance, little is known about the substrate recognition and regulation modes of O-GlcNAc transferase (OGT), the primary enzyme responsible for O-GlcNAc addition. In this study, we have identified the intervening domain (Int-D), a poorly understood protein fold found only in metazoan OGTs, as a specific regulator of OGT protein-protein interactions and substrate modification. Utilizing an innovative proteomic peptide phage display (ProP-PD) coupled with structural, biochemical, and cellular characterizations, we discovered a novel peptide motif employed by the Int-D to facilitate specific O-GlcNAcylation. We further show that disruption of Int-D binding dysregulates important cellular programs including nutrient stress response and glucose metabolism. These findings illustrate a novel mode of OGT substrate recognition and offer the first insights into the biological roles of this unique domain.


Introduction
Regulation of intracellular primary metabolites is a delicate balance between excess and deficit, each of which can have deleterious effects on the health of a cell. One major signaling corridor for cellular nutrient sensing is the hexosamine biosynthetic pathway (HBP), which uses amino acids, nucleotides, carbohydrates, and fatty acids to produce uridine diphosphate N-acetylglucosamine (UDP-GlcNAc) 1. This allows the level of each essential metabolite to influence HBP flux and cellular concentration of the nucleotide sugar. In turn, UDP-GlcNAc serves as the activated sugar donor of O-GlcNAc transferase (OGT), the single enzyme responsible for most mono-O-linked β-N-acetylglucosamine addition (O-GlcNAcylation) on serine and threonine residues of intracellular proteins across all major cellular processes 2. Since O-GlcNAcylation dynamically and sensitively regulates numerous highly coordinated processes including the cell cycle 3, gene expression 4, and proteasomal degradation 5, this modification has been proposed to be a central nutrient sensing system 6,7. Unlike many other enzymes, OGT does not recognize an apparent sequence motif at the modification site, and little is known about this enzyme's substrate recognition and regulation modes.
Coordination of the highly spatiotemporally dynamic cellular environment requires enzymes to be in the right place at the right time to interact with their targets. For many enzymes, particularly those that recognize a wide breadth of protein substrates, this is achieved by association with larger protein machinery or binding to adaptor proteins through specific protein-protein interactions (PPIs) 8. Often these interactions are facilitated through scaffolding domains or discrete binding sites, outside of the enzyme catalytic pocket, that recognize short linear motifs (SLiMs) [9][10][11]. However, these binding interactions can be difficult to map as they are typically low affinity and facilitated by poorly defined shallow grooves. It is therefore necessary to apply new technology to both identify and understand the complex regulatory networks that direct essential cellular proteins.
OGT is a multidomain enzyme consisting of 13.5 tetratricopeptide repeats (TPRs) and a catalytic region (Fig. 1a) [12][13][14]. Tandem TPR motifs are commonly found in protein binding scaffolding, directing the focus of many studies toward identifying the TPR domain's role in substrate recognition and protein binding. Our group and others have identified key TPR residues, including parallel aspartate and asparagine ladders, that facilitate OGT substrate binding [15][16][17][18][19]. Specific protein substrates have also been studied to map interacting residues on the TPR domain, showing that even TPRs distal to the OGT catalytic site may be involved in substrate recognition 18,19. Despite these efforts, a full model of OGT substrate recognition and its connection to the O-GlcNAc regulatory network remains obscure.
The OGT catalytic region is composed of twin N- and C-catalytic lobes (N-Cat and C-Cat) separated by an intervening domain (Int-D) (Fig. 1a). Both catalytic lobes have Rossmann-like structures, typical of the GT-B superfamily of glycosyltransferases 12. However, human OGT and its homologs in metazoans are the only known proteins to contain an Int-D. Since its identification over a decade ago, the function of the Int-D has remained mysterious. To advance our understanding of this essential enzyme and its roles in the cell, we employed an innovative proteomic peptide phage display (ProP-PD) to profile PPI sites across the surface of OGT 20. Using X-ray crystallography and biochemical approaches, we identified a novel binding site in the OGT Int-D that was revealed to be a motif-dependent regulator of OGT binding and substrate-specific O-GlcNAcylation. This is the first evidence of SLiM-based OGT substrate recognition and the first discrete binding site identified in the unique Int-D. Moreover, we uncovered a mode of indirect posttranslational modification (PTM) crosstalk through tyrosine phosphorylation of our novel motif.
Revealing the specific binding site and mode of PTM crosstalk gives the first structurally justifiable mechanism of cellular communication between phosphorylation and O-GlcNAcylation. We further demonstrate the surprising roles of this binding site in regulating the O-GlcNAcylation response to nutrient stress as well as lactate production. In addition to current hypotheses that propose HBP flux as a key driver of O-GlcNAcylation nutrient sensing 21, we demonstrate that mutation of the Int-D can significantly delay the O-GlcNAcylation response to nutrient starvation. These findings reveal important roles of the OGT Int-D, which has garnered interest for its unique character but has until now remained cryptic.

ProP-PD identified a novel OGT binding motif
To directly identify biologically relevant OGT peptide binders, we employed an innovative phage display platform. The unique Proteomic Peptide Phage Display (ProP-PD) library of overlapping 16-mer peptides is designed around intrinsically disordered regions of the human proteome 20,22. Compared to commonly used phage display platforms, ProP-PD features higher peptide copy number (~200 per virion) through display on the major coat protein P8 of the filamentous M13 phage, which enables enrichment of moderate to low affinity binders through avidity 23. The targeted design also allows us to screen only biologically relevant peptides that map to human proteins. This leads to identification of SLiMs that facilitate protein-protein interactions at amino acid resolution, which is difficult to obtain using traditional experimental approaches. We screened the ProP-PD library against two OGT constructs, full-length OGT (OGT) and the protein crystallization construct OGT 4.5 (Fig. 1a). When recombinantly expressed and purified, these protein constructs are well folded and highly stable, allowing us to evaluate whether peptides bind to the TPR or catalytic region based on their enrichment by one or both constructs. Enriched phage pools were sequenced with next-generation sequencing (NGS) and peptides were considered as hits if they appeared in multiple replicates or if two overlapping sequences were identified. Remarkably, both OGT constructs yielded a similar peptide population with a novel strongly enriched motif PxYx[I/L] (Fig. 1b) identified by the SLiMFinder algorithm 24,25, which to our knowledge has not previously been described 11. Of the 24 most highly enriched peptides in the NGS dataset, 16 contain this exact motif or a single amino acid variant. Enrichment of a single motif by both OGT and OGT 4.5 suggests this is the dominant peptide binding site and that it is localized to the OGT catalytic region. However, the absence of Ser/Thr in the motif implies that this is not a direct active site binder. Seven natural peptides and one consensus peptide (CP37) were selected for synthesis and further characterization (Extended data Fig. 1a).
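As a minimal sketch of what the PxYx[I/L] pattern means in practice, the motif can be written as a regular expression and scanned across a peptide. The function name and the example peptides are illustrative assumptions: only the core PVYQI corresponds to the SMG9 motif residues (P145 to I149) reported in this paper, and the flanking residues are made up. This is not the SLiMFinder algorithm, which discovers motifs de novo rather than matching a known one.

```python
import re

# PxYx[I/L]: Pro, any residue, Tyr, any residue, then Ile or Leu.
MOTIF = re.compile(r"P.Y.[IL]")


def find_motif(peptide):
    """Return (start_index, matched_window) for each motif occurrence.

    A sliding 5-residue window is used so overlapping occurrences are kept.
    """
    hits = []
    for start in range(len(peptide) - 4):
        window = peptide[start:start + 5]
        if MOTIF.fullmatch(window):
            hits.append((start, window))
    return hits


# Hypothetical peptides for illustration only (not sequences from the screen).
examples = ["AATAPVYQIGG", "AAPVFQIAA", "PAYALPGYGL"]
for pep in examples:
    print(pep, find_motif(pep))
```

The second example replaces the tyrosine with phenylalanine and is therefore not reported as a match, mirroring the strict requirement for Y at the central motif position.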
We began assessing peptide interactions with OGT using the thermal shift assay (TSA), a rapid label-free binding assay that has been widely used to evaluate target-ligand interactions by measuring changes in protein denaturation temperature (Tm) 26. All but one peptide showed a significant Tm increase with both OGT and OGT 4.5 (Extended data Fig. 1b). We note that some of our identified peptides mapped to proteins with well-characterized biological functions or known links to O-GlcNAcylation. For instance, SMG9 protein is a critical component of cellular nonsense-mediated mRNA decay (NMD), a process that targets transcripts with premature stop codons for degradation 27. Notably, SMG9 has a single major O-GlcNAcylation site at residue T114, located 30 amino acids upstream of the identified motif, making SMG9 an interesting target to investigate further 28,29. Applying a fluorescently labeled 5-FAM-SMG9 peptide, we measured its binding affinity to OGT by microscale thermophoresis (MST) 30. This binding assay showed a clear dose-response relationship and produced a Kd of 13.1 µM (Fig. 1c), which would be considered a moderately strong SLiM PPI. As our synthesized peptides contain a single conserved motif, we hypothesized that they were all binding to a single site on OGT. The remaining unlabeled peptides were evaluated by competitive fluorescence polarization (FP) assay with fluorescently labeled 5-FAM-SMG9 peptide (Extended data Fig. 1 Fig. 2a-d). Strikingly, unambiguous electron densities clearly show that all three peptides bind to the OGT Int-D in a highly conserved conformation (Fig. 2a, c and Extended data Fig. 2b, d). We note that the peptide binding did not induce significant change to the OGT 4.5:UDP-GlcNAc structure (Extended data Fig. 2e). Given the high similarity of these peptide binding conformations, we focus on the OGT 4.5:UDP-GlcNAc:SMG9 structure in our following discussions.
In the complex structure, SMG9 peptide binds in an elongated conformation in a shallow Int-D surface groove, displaying favorable shape complementarity (Fig. 2a, c). Peptide binding leads to 1,247 Å² buried surface area, which is in the typical range for transient PPIs 31,32. We observed strong electron density for the SMG9 peptide backbone and most of its side chains, allowing us to analyze this interaction with high confidence. The binding conformation of the SMG9 peptide is mainly supported by two types of interactions: hydrophobic (Fig. 2b) and hydrogen bonding (Fig. 2c). OGT Int-D site binding specificity is likely driven by hydrophobic interactions. The motif isoleucine side chain of SMG9 peptide (corresponding to the I149 residue of SMG9 protein) reaches into a small hydrophobic pocket formed by highly conserved OGT residues I734, I787 and F723. Additionally, the SMG9 motif proline (P145) residue interacts with hydrophobic I790, lining the Int-D surface groove, allowing appropriate positioning of the peptide motif. Besides these hydrophobic features, OGT Int-D N791 also plays critical roles in stabilizing peptide binding through polar interactions. The carboxamide side chain of N791 makes bidentate interactions with the backbone of SMG9 I149. The N791 backbone carbonyl oxygen further accepts a hydrogen bond from the carboxamide side chain of Q148 in the SMG9 peptide. Another indispensable element in this interaction is the SMG9 motif tyrosine (Y147) residue. The side-chain hydroxyl of SMG9 Y147 forms double hydrogen bonds with OGT L837 and E839 backbones, while the Y147 backbone makes two additional hydrogen bonds to OGT S833. Moreover, the interactions between backbones of SMG9 V146 and OGT S833/Q834 further stabilize the peptide binding conformation. Outside of the motif, the SMG9 T143 side chain mediates a hydrogen bond with the backbone carbonyl of OGT G793.
Collectively, these hydrophobic and hydrogen bonding interactions form a tangled network stabilizing the binding of SMG9 peptide in the OGT Int-D.
To confirm the SMG9 peptide binding mode in our crystal structure, we mutated each of the two primary binding features, the hydrophobic pocket (F723, I734, and I787) and the bidentate donor N791, and measured the resulting changes in binding affinity for SMG9 peptide. Each of the hydrophobic OGT residues was mutated to a glutamate or arginine, while the asparagine bidentate interaction was disrupted by mutation to alanine. Each mutated OGT construct was recombinantly expressed in E. coli and purified at high concentration for FP saturation binding assays with fluorescent SMG9 peptide. Insertion of charged residues into the hydrophobic pocket (F723E, I734R, I787E, and I787R) as well as disruption of the bidentate backbone interaction (N791A) each shifted EC50 values by > 10-fold (Fig. 2d, Extended data Fig. 3a), showing that these mutations substantially disrupt SMG9 peptide binding. TSA of each OGT mutant showed no or slight changes in protein stability (Extended data Fig. 3b). These results strongly support the peptide binding mode illustrated in our crystal structure and offer useful mutants for interrogation of this binding site's role in cells.

SMG9 Y147 Phosphorylation Disrupts Peptide Binding To OGT
On the other side of this interaction, the SMG9 Y147 residue is highly conserved in our motif and makes substantial interactions with OGT Int-D (Fig. 2c), signaling its importance for binding. To further confirm the SMG9 binding mode observed in our crystal structure, we introduced a Y147F mutation to the SMG9 peptide. Using the same FP competition assay as before, SMG9 Y147F demonstrated a > 10-fold shift in EC50 (Fig. 2e), supporting the significant role of the motif Y side chain in OGT interactions. Interestingly, Y147 is one of the major phosphorylation sites on SMG9 protein. Previous studies have linked phosphorylation at this site to transcriptional programs in epidermal growth factor signaling 29. Structural analysis of this modification on Y147 suggests that it would not be tolerated by the Int-D binding site. Using our FP competition assay, we tested the interaction of phosphorylated SMG9 peptide (pY147) with OGT. SMG9 pY147 significantly disrupted peptide binding, to a similar degree as Y147F (Fig. 2e). Tyrosine phosphorylation (pTyr) is a fundamental mechanism of cellular regulation 33. Compared to the much more prevalent Ser/Thr phosphorylation, pTyr is relatively rare, occurring on only 2% of cellular proteins. Interestingly, nearly 70% of O-GlcNAcylated proteins contain at least one pTyr site 34. O-GlcNAcylation and pTyr share many commonalities: two primary functions of both modifications are signal transduction in response to extracellular stress and stimuli, and regulation of protein localization and complex formation across various cellular processes. Dysregulation of pTyr can lead to a variety of diseases including cancer, which has made tyrosine kinases and phosphatases promising targets for therapeutic interventions 35. More interestingly, pTyr and the OGT Int-D appear to share similar evolutionary paths, as pTyr primarily evolved in metazoans while the Int-D is only found in metazoan OGTs 36. These parallels have long suggested cross communication between O-GlcNAcylation and tyrosine phosphorylation, but few studies have made progress in defining their interaction mechanisms.

OGT Int-D Is A Motif-dependent Facilitator Of Protein Interaction And O-GlcNAcylation
To begin investigating the biological roles of the OGT Int-D, we turned our attention to full-length protein interactions. Two primary functions facilitated by noncatalytic binding sites of enzymes are substrate recognition and non-substrate protein binding. First, we demonstrated that the OGT-SMG9 association occurs at the protein level by co-immunoprecipitation (co-IP) analysis of Flag-tagged OGT and cMyc-tagged SMG9 co-expressed in human embryonic kidney 293 cells (HEK293) (Fig. 3a and b). We then selected the SMG9-Y147F mutant as well as three OGT mutants (I734R, I787E, N791A) that most effectively disrupted SMG9 peptide binding for co-IP (Fig. 3a and b, Extended data Fig. 4a and b). Compared to wild-type (WT) SMG9, the SMG9-Y147F mutant substantially decreased association with OGT, further supporting the importance of the Y147 side chain in their PPI. In addition, each of the OGT mutants significantly reduced the level of SMG9 association, though none of the individual mutants completely abolished SMG9 binding. A double mutant OGT (I787E-N791A) was used to co-IP SMG9 and produced similar results (Extended data Fig. 4c). This suggests that interaction via the Int-D site is the major facilitator of OGT-SMG9 association, but may not be the only interaction between the full-length proteins. As mentioned above, our SMG9 peptide crystal structure shows that OGT residue N791 makes a bidentate interaction with the peptide backbone while residues I734 and I787 interact with hydrophobic side chains in the peptide motif. Since all three of our tested mutants disrupt SMG9 association similarly well, we decided to continue evaluating the Int-D site with what we theorize to be a more general disrupter of OGT Int-D site binding, N791A. To evaluate the impact of Int-D site mutation on OGT's intrinsic activity, we performed a radiolabeled activity assay. Recombinantly purified OGT or OGT-N791A was incubated with UDP-³H-GlcNAc and CKII peptide, one of the best OGT peptide substrates. We detected no significant difference in radio-ligand incorporation between OGT and OGT-N791A (Extended data Fig. 3c Fig. 4d), supporting the Int-D site as a specific motif-dependent regulator of OGT protein binding.
A recent study has reported a single O-GlcNAcylation site at T114 of SMG9, 30 amino acids upstream of our identified motif 28. The proximity of this modification site to our Int-D motif implies a coordinated substrate binding model (Extended data Fig. 2f). To evaluate changes in SMG9 O-GlcNAcylation upon disruption of Int-D site interaction, we generated TRex-293 stable cell lines expressing Flag-tagged WT OGT or N791A mutant with endogenous OGT knockdown. TRex-293 cells enable single-copy integration of exogenous constructs into an engineered genomic locus, allowing us to tightly control WT and mutant OGT expression through an inducible tet-on system. Co-expression of cMyc-SMG9 or cMyc-SMG9-Y147F with each OGT construct, followed by anti-cMyc immunoprecipitation and biotinylation of O-GlcNAc by GalT assay 37 (Fig. 5b). Surprisingly, N791A mutation appeared to desensitize the O-GlcNAcylation response to nutrient stress. A follow-up time course study revealed that N791A expression delayed the stress response (Fig. 5c). Cell growth assay showed no significant difference between the growth rate of WT OGT or N791A overexpressing cells in either nutrient condition (Fig. 5d). These results implicate the Int-D site as a temporal regulator of cellular nutrient stress signaling.
To determine if this delayed response to nutrient deprivation can be generalized to other cell lines, we stably expressed WT OGT and N791A in the cervical cancer cell line HeLa. Unlike TRex-293 cells, these cell lines only required 24 hours of low nutrient treatment to show considerable O-GlcNAc downregulation, illustrating the variability of response speed across different cell types (Extended data Fig. 6). As in our TRex-293 cells, N791A expression delayed the low nutrient stress response, relative to WT OGT expressing cells. This striking alteration in global O-GlcNAcylation response to low nutrient stress by a single OGT point mutation has not been observed in previous studies. Cellular nutrient sensing is regulated by different systems at multiple levels across the cell, explaining the eventual reduction of O-GlcNAcylation level we observed in N791A cells. It is interesting that this effect is only seen when both free glucose and serum levels are reduced, suggesting that a combination of primary metabolites and growth factors is responsible. The complex composition of serum precludes us from specifically identifying the affected signaling pathways at this point, though these results already support the Int-D binding site as an important regulator of nutrient sensing. This also supports a model of cooperative regulation between OGT and the HBP.
Response to external cues and nutrient status is only one side of OGT's nutrient sensing role. Regulation of downstream metabolic programs, including the balance of glycolysis and oxidative phosphorylation, is a critical function of OGT 46. Under normal conditions, healthy cells will preferentially utilize the oxidative phosphorylation pathway to metabolize glucose. However, metabolically aberrant cancer cells drive glucose consumption through both the citric acid cycle and glycolysis pathways (the Warburg effect), resulting in increased lactate production 47. OGT overexpression has been identified in a range of clinical cancers and associated with increased glycolysis 46. Throughout culture of our stable cell lines, we noticed rapid media acidification of WT OGT cell lines but not N791A. To evaluate differences in metabolic activity, we measured media lactate concentration in a time course using the LactateGlo assay and normalized the results to cell growth rate. Samples expressing WT OGT showed significantly higher lactate concentrations than N791A expressing cells at 48 and 72 hours (Fig. 5e), implicating the Int-D as a regulator of cellular metabolism and the Warburg effect. These simple tests of lactate production and O-GlcNAcylation response to nutrient stress have broad implications for the important roles of Int-D in health and disease, which require further investigation.

Discussion
Previous studies of OGT substrate regulation and protein binding have focused on the TPRs of OGT, as they are well-known structural scaffolds found in many proteins across the cell. Our study has taken a fundamentally different route, employing the unique ProP-PD library to profile OGT surface binding sites. Our agnostic approach led to the surprising discovery of a novel motif-based binding site in the OGT Int-D. Since its initial structural characterization over a decade ago, the role of the Int-D has remained mysterious. Despite the prevalence of OGT homologs throughout the kingdoms of life, the unique fold of Int-D that intersects an otherwise typical GT-B enzyme catalytic domain is only found in metazoan OGTs. This is unlike the TPR domain, whose structure and length are highly conserved across OGT homologs in prokaryotic and eukaryotic species. This may suggest that the Int-D arose with increased organismal structural and functional complexity. We have identified a unique peptide motif PxYx[I/L] that binds to the OGT Int-D site. Our observations that motif-containing proteins partake in processes typically associated with evolutionarily complex functions of multicellular organisms (e.g., chromosomal structure, cell-cell adhesion, environmental sensing, etc.) support this premise. As mentioned above, tyrosine phosphorylation is also typically associated with multicellular organisms and regulation of evolutionarily complex functions. Sequence alignment of OGT Int-Ds reveals a further line of genetic delineation, as the Int-D binding site residues identified in this study (e.g., F723, I734, I787, N791) are only highly conserved in vertebrates 12. This is interesting because knockout of ogt in vertebrate systems (e.g., mouse, human) is embryonic lethal, whereas ogt knockout is tolerated in the invertebrate C. elegans 48,49. Our work leads to a hypothesis that the Int-D site is an important piece of OGT's essential cellular functions and may be exploited for therapeutic interventions.
Hallmark phenotypes of cancer and other deleterious diseases include dysregulation of nutrient sensing and metabolism. Aberrant OGT and O-GlcNAcylation have been intimately linked to the development of cancer and other metabolic disorders. The wide reach of O-GlcNAcylation across important cellular systems makes it a valuable target for therapeutic intervention; however, disruption of these processes can also present problems. Current OGT inhibitors primarily target the active site, triggering global disruption of O-GlcNAcylation and spurring concern over unanticipated side effects. Nearly as interesting as the processes impacted by the Int-D are those that are not. Apparent maintenance of normal O-GlcNAcylation upon Int-D site mutation indicates that this binding site plays explicit roles in the cell. The structural uniqueness of the Int-D, coupled with its specific sequence recognition, offers great opportunity to target OGT with exceptional specificity. Our findings for the Int-D's role in low nutrient response and lactate production warrant further investigation of this domain's role in cancer and metabolic disorders.
This study unifies multiple lines of thinking that have been pervasive in the field but have lacked concrete evidence: 1) major regulatory mode(s) of OGT utilize its non-catalytic regions to coordinate interactions with specific substrates; and 2) due to its evolutionary novelty, the Int-D plays one or more roles in OGT that make it distinct from other glycosyltransferases or even non-eukaryotic OGT homologs. Through multiple cellular assays we have identified the Int-D binding site as a motif-dependent regulator of protein association and O-GlcNAcylation. We further linked this site to tyrosine phosphorylation crosstalk, nutrient stress response, and metabolic regulation. Along with other important roles implicated by our bioinformatic analysis, these findings have radically changed our view of the Int-D and its importance to OGT. This opens exciting pathways to specifically dissect OGT's regulation and functions in health and disease, creating a deluge of new opportunities in the field.
During the preparation of this manuscript, a study utilized mRNA display to screen for peptide inhibitors of OGT 50. This screen enriched different groups of peptides including some with a similar PxYx[I/L] motif as we identified. However, the team did not investigate the binding mode of these motif-containing peptides as they did not strongly inhibit OGT's intrinsic activity, corroborating our findings and supporting the Int-D as a novel regulatory site on OGT.

Methods
before adding the HD2 phage library (10¹¹ phages in 100 µL PBS per well), first to the GST-coated wells (1 h, 4 ˚C) to remove non-specific binders, and then to the bait protein-coated plates (2 h, 4 ˚C). Unbound phages were removed and the bound phages were eluted (100 µL log phase E. coli OmniMAX, 30 min, 37 ˚C). M13 helper phages were added (10⁹ M13KO7 helper phages per well, 45 min at 37 ˚C) before transferring the bacteria to 1 mL 2xYT supplemented with 100 µg carbenicillin (Carb), 30 µg Kanamycin and 0.3 mM isopropyl-β-D-1-thiogalactopyranoside (IPTG). Bacteria were grown at 37 ˚C for 18 h, before harvesting the phages (2,000 x g for 10 min). The phage supernatants were pH adjusted (using 1/10 volume 10x PBS) and used as in-phage for the next round of selection.
The peptide-coding regions of the naïve ProP-PD library and the binding-enriched phage pools (5 µL) were PCR-amplified and barcoded using Phusion High-Fidelity polymerase (Thermo Scientific) for 22 cycles. PCR products were confirmed by 2% agarose gel electrophoresis stained with GelRed using a 50 bp marker (BioRad). PCR products were normalized using Mag-bind Total Pure NGS, pooled and purified from a 2% agarose gel (QIAquick Gel Extraction Kit), and analyzed using Illumina MiSeq v3 (1x150 bp read setup, 20% PhiX). Results were processed using in-house Python scripts. Reads were demultiplexed, adapter and barcode regions were trimmed, and sequences were translated into peptide sequences.
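The read-processing steps described above (demultiplexing, trimming, translation) can be sketched as follows. This is not the authors' in-house script: the barcode, the flanking adapter sequences, and all function names are hypothetical placeholders, and the sketch simply rejects any insert containing a stop codon, which may differ from how a real phage display pipeline treats them.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def demultiplex(read, barcodes):
    """Return (sample, remainder) if the read starts with a known barcode, else None."""
    for barcode, sample in barcodes.items():
        if read.startswith(barcode):
            return sample, read[len(barcode):]
    return None


def extract_insert(read, five_prime, three_prime):
    """Return the sequence between the two flanking adapters, or None if either is missing."""
    start = read.find(five_prime)
    if start == -1:
        return None
    start += len(five_prime)
    end = read.find(three_prime, start)
    if end == -1:
        return None
    return read[start:end]


def translate(dna):
    """Translate an in-frame insert; return None for frameshifts, stops, or unknown bases."""
    if len(dna) % 3 != 0:
        return None
    peptide = "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna), 3))
    if "*" in peptide or "X" in peptide:
        return None
    return peptide


# Hypothetical read: barcode + 5' adapter + insert + 3' adapter + trailing bases.
barcodes = {"ACGT": "sample_A"}
read = "ACGT" + "GGATCC" + "CCGGTGTATCAGATT" + "GAATTC" + "AA"
sample, rest = demultiplex(read, barcodes)
print(sample, translate(extract_insert(rest, "GGATCC", "GAATTC")))
```

Counting identical translated peptides per sample (for example with `collections.Counter`) would then give the per-peptide read counts used for enrichment analysis.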
Peptides were annotated using PepTools and assigned confidence levels based on four different criteria: occurrence in replicate selections, identification of overlapping peptide sequences, high counts, and occurrence of sequences matching consensus motifs determined from the generated data set. For a stringent analysis we focused on the medium/high confidence peptides that fulfill at least three of these criteria.
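The four-criteria confidence assignment can be illustrated with a small scoring sketch. The field names, the replicate threshold, and the read-count cutoff below are assumptions made for illustration; the text does not state the numeric thresholds, and this is not the PepTools implementation.

```python
from dataclasses import dataclass


@dataclass
class PeptideEvidence:
    replicate_hits: int      # number of replicate selections containing the peptide
    has_overlap: bool        # an overlapping peptide from the same region was also found
    read_count: int          # NGS read count for the peptide
    matches_consensus: bool  # sequence matches a consensus motif from the data set


def criteria_met(ev, min_replicates=2, min_count=50):
    """Count how many of the four criteria are satisfied.

    min_replicates and min_count are hypothetical thresholds.
    """
    return sum([
        ev.replicate_hits >= min_replicates,
        ev.has_overlap,
        ev.read_count >= min_count,
        ev.matches_consensus,
    ])


def passes_stringent_filter(ev, required=3):
    """Keep peptides fulfilling at least `required` of the four criteria."""
    return criteria_met(ev) >= required


# Hypothetical evidence records.
strong = PeptideEvidence(replicate_hits=3, has_overlap=True, read_count=10, matches_consensus=True)
weak = PeptideEvidence(replicate_hits=1, has_overlap=False, read_count=500, matches_consensus=True)
print(passes_stringent_filter(strong), passes_stringent_filter(weak))
```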

Protein Expression And Purification
Human OGT 4.5, OGT, and mutants described above were expressed and purified as described previously 12. Briefly, a pET24b plasmid carrying the OGT 4.5 gene (a kind gift from S. Walker's lab) was transformed into BL21(DE3) E. coli for protein expression. The bacteria were cultured in LB medium supplemented with 50 µg/mL kanamycin at 37°C, 250 rpm. After the culture reached an OD600 of 0.6-0.8, 0.3 mM IPTG was added to induce protein expression at 16°C, 220 rpm overnight. Bacterial cells were collected and resuspended in TBS buffer (150 mM NaCl, 20 mM Tris, pH 8.0) supplemented with 1 mM phenylmethylsulfonyl fluoride. The suspension was lysed by pressure cell homogenizer, clarified by centrifugation, and subjected to Ni-NTA affinity chromatography. The eluted OGT 4.5 protein was incubated with HRV-3C protease overnight to cleave the N-terminal His6-tag. The protein sample was subsequently purified by a size exclusion chromatography column (Superdex 200 increase 10/300; Cytiva) on an AKTA FPLC system in TBS buffer with 0.5 mM tris(3-hydroxypropyl)phosphine. OGT 4.5 protein was concentrated to 8 mg/mL for crystallization.

Crystallization
All peptides for crystallization were prepared by solid phase peptide synthesis (Ontores, ≥ 95% purity, HPLC). Hanging drop vapor diffusion method was used for co-crystallization.

Cell Growth Assay
Cell viability was assessed using CellTiterGlo 2.0 reagent (Promega) according to the manufacturer's instructions. Briefly, cells were seeded in 96-well tissue culture plates, with at least four biological replicates for each condition and time point. Cells were grown in the indicated nutrient condition for the specified amount of time before an equal volume of CellTiterGlo 2.0 reagent was added directly to the well and incubated at room temperature for 10 min in the dark. Luminescence was measured on a BioTek Synergy H1 plate reader.

Lactate Production Assay
Lactate production was measured using the LactateGlo assay kit (Promega) following the manufacturer's instructions. Briefly, cells were seeded in 96-well tissue culture plates with at least four biological replicates for each time point. At the specified time point, a sample of culture media was taken and diluted 1,000-fold in PBS. A lactate standard curve from 0.2-200 µM in PBS was also prepared to quantify media lactate concentration. Samples were mixed with LactateGlo detection solution in low volume black 384-well plates and incubated at room temperature in the dark for 1 hour. Luminescence was measured on a BioTek Synergy H1 plate reader. Lactate concentration was normalized to cell number using the above cell growth assay protocol.
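The quantification described above can be sketched as a standard-curve interpolation followed by dilution correction and normalization. The sketch assumes a linear luminescence response across the standard range and uses the cell growth luminescence as the normalization denominator; both are assumptions, as are the function names and the example readings, which are synthetic.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def media_lactate_um(sample_lum, std_conc_um, std_lum, dilution=1000):
    """Convert a diluted sample's luminescence to undiluted media lactate (µM)."""
    slope, intercept = fit_line(std_conc_um, std_lum)
    diluted_um = (sample_lum - intercept) / slope
    return diluted_um * dilution


def normalized_lactate(lactate_um, growth_signal):
    """Normalize lactate to a cell growth readout (for example viability luminescence)."""
    return lactate_um / growth_signal


# Synthetic standards spanning 0.2-200 µM (values invented for illustration).
std_conc = [0.2, 2.0, 20.0, 200.0]
std_lum = [110.0, 200.0, 1100.0, 10100.0]
lactate = media_lactate_um(1100.0, std_conc, std_lum)
print(round(lactate), normalized_lactate(lactate, growth_signal=4000.0))
```

Samples whose diluted concentration falls outside the 0.2-200 µM standard range would need a different dilution rather than extrapolation.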

Statistical analysis
All of the data shown are mean values with error bars representing standard deviation. Statistical significance was determined using two-tailed Student's t-test.
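As a minimal worked example of these summary statistics, the sketch below computes the mean, the sample standard deviation, and a two-sample Student's t statistic using only the Python standard library. It assumes an unpaired comparison with pooled (equal) variance, which the text does not specify, and the replicate values are invented.

```python
import math
from statistics import mean, stdev, variance


def summarize(values):
    """Return (mean, sample standard deviation) for one group of replicates."""
    return mean(values), stdev(values)


def student_t(a, b):
    """Unpaired two-sample Student's t statistic with pooled variance.

    Returns (t, degrees_of_freedom). The equal-variance, unpaired design
    is an assumption of this sketch.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


# Invented replicate values for two conditions.
group_a = [1.0, 2.0, 3.0]
group_b = [2.0, 3.0, 4.0]
print(summarize(group_a), student_t(group_a, group_b))
```

The two-tailed p-value follows from the t distribution with the returned degrees of freedom; in practice a statistics library such as SciPy (`scipy.stats.ttest_ind`) would typically be used to obtain it.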

Data Availability
Atomic coordinates of the OGT 4.

Supplementary Files
This is a list of supplementary files associated with this preprint: oatimage12.jpeg, ExtendedData.docx