Non-enzymatic Stereoselective S-glycosylation of Polypeptides and Proteins

Glycosylation (1-4) is an essential and powerful technique that Nature employs to regulate the properties and functions of proteins and polypeptides. Our capacity to emulate Nature’s power, however, is limited by the methods available (5) to perform glycosylation on these complex biomolecules. So far, very few glycosylation reactions operate under the conditions tolerated by biomolecules (e.g., aqueous media, mild pH, and ambient temperature), and the need to install glycosyl groups in a stereocontrolled fashion poses additional, significant challenges. Here we report a non-enzymatic glycosylation reaction that builds axial S-glycosidic bonds under biorelevant conditions. Our strategy exploits the exceptional functional group tolerance of radical processes, and is enabled by the design and use of allyl glycosyl sulfones as precursors to glycosyl radicals. Our method introduces a variety of glycosyl units onto the cysteine residues of polypeptides in a highly selective fashion. The power of this method is further demonstrated in the direct glycosylation of bioexpressed proteins. Computational and experimental studies provide insights into the reaction mechanism.


Introduction
Glycosylation of proteins plays critical roles in a wide range of biological processes, including, for example, viral infection, immune response, and cancer metastasis. 1-4 In a natural glycoprotein, the carbohydrate moieties are usually linked to the peptide skeleton through the hydroxyl group of serine or threonine (O-glycans) or the amido group of asparagine (N-glycans). In comparison, S-glycosylation on cysteine residues (S-glycans) is less common but is also found in living organisms. 6-8 Due to the similarity between oxygen and sulfur atoms, S-glycans often exhibit structural features, and hence biological activities, similar to those of their (more frequently encountered) O-glycan analogues. On the other hand, S-glycans are considerably more resistant toward chemical hydrolysis and enzymatic degradation, and thus have longer lifetimes 9 in biological systems. In part for these reasons, S-glycans have attracted considerable attention as pharmaceutical agents and as mimetics of the natural O-glycans in biological research. 9, 10 Rapid and robust methods to prepare S-linked glycoproteins and glycopeptides are therefore of general interest.
Directly installing a glycosyl unit onto the thiol group of cysteine residues represents a conceptually straightforward approach 5,11-13 to access S-linked glycopeptides, but remains underexplored. Cysteine has served as an extremely useful handle for modification of proteins because of its high nucleophilicity. 14 Various types of electrophiles (Fig. 1a) have been identified to react selectively with thiols, including Michael acceptors, α-halo carbonyls, fluorinated arenes, 15 and organometallic complexes. 15 Yet, the introduction of glycosyl electrophiles to form S-linked glycopeptides under biorelevant conditions (e.g., aqueous media, mild pH, and ambient temperature) has lagged behind. This could be because few of the conventional glycosyl electrophiles 11,16 are compatible with or soluble in aqueous media. Moreover, controlling stereoselectivity during this step poses additional, significant complications. The Miller group 17 reported an elegant glycosylation in water that yields β-glucosides, in which Ca(OH)2 was employed as the promoter. The Davies group 18 disclosed a unique phosphine-mediated desulfurization reaction that converts disulfide-linked glycopeptides to the corresponding S-linked glycosides. The authors noted that this reaction would erode the stereopurity of the cysteine unit, since it proceeds via the intermediacy of dehydroalanine. Enzymatic approaches 19-21 have demonstrated great potential for installing glycosyl units, but the requisite glycosyltransferases are sometimes substrate specific or difficult to obtain. Non-enzymatic methods usually show broader substrate scope and employ more readily available reagents.
Thus, the development of chemical glycosylation methods that produce fully unprotected, stereodefined S-linked glycopeptides/glycoproteins remains a compelling challenge.

Main Text
Reaction Design: The inherent challenges associated with handling glycosyl electrophiles in aqueous media prompted us to consider using other types of reactive intermediates to build S-glycosidic bonds.
Radical reactions 22,23 exhibit outstanding functional group tolerance, and possess unique utility in the synthesis and modification of complex (bio)molecules. 5,14,24-25 Interestingly, the use of glycosyl radicals to build glycosyl-peptide bonds remains an underdeveloped field, presumably due to the lack of methods 16 to generate glycosyl radicals under biorelevant conditions. In this context, we envisioned a radical-based reaction to make S-linked glycopeptides that features the design and use of bench-stable allyl glycosyl sulfones as the glycosylating agents (Fig. 1b). We reasoned that the free thiol group in cysteine residues could be converted to disulfides (6 to 7) with excellent efficiency. Disulfide 7 could serve dual purposes: both to initiate the generation of glycosyl radicals and to trap them. Homolysis of the disulfide bond 26 in 7 readily affords thiyl radical 8. Addition of 8 to the double bond in allyl glycosyl sulfone 1 leads to alkyl radical 2, which would then release 27-28 allyl sulfide 3 and extrude SO2 to yield the key glycosyl radical 4. Trapping of glycosyl radical 4 by disulfide 7 regenerates thiyl radical 8 and furnishes the desired S-glycoside 5. Given the established reactivity profiles of glycosyl radicals, 29 we anticipated that the final step would occur with high axial selectivity. In this work, we demonstrate the validity and potential of this radical-based S-glycosylation reaction. The reaction proceeds smoothly under biorelevant conditions, and is amenable to the direct glycosylation of unprotected polypeptides and proteins.
Proof-of-Principle Establishment: We started by investigating a model reaction between allyl glycosyl sulfone 9 and cysteine-derived disulfides 10a/b (Fig. 2a). Sulfone 9 can be readily prepared from commercially available peracetylated glucose (see SI), and is shelf-stable for over one year without precautions to exclude air or moisture. We established that S-glycoside 11 could be produced from 9 and 10a/b with high efficiency and with almost complete axial selectivity under photocatalytic conditions. Alkene 12a/b was formed as a byproduct. Either 10a or 10b could be used; perturbation of the electronic properties of the disulfide reactant is therefore tolerated. The inexpensive organic dye Eosin Y could be employed as the photocatalyst instead of Ir[dF(CF3)ppy]2(dtbbpy)PF6. Moreover, this reaction is compatible with protic (co)solvents, including H2O (see Fig. S3-S5 for condition optimization).
We next examined this reaction in the synthesis of fully unprotected S-glycopeptides (13 + 14 to 15, Fig. 2b). Toward this end, we prepared glycosyl sulfone 13, with all of its hydroxyls unprotected. Thanks to the intrinsic stability of sulfones toward polar nucleophilic functional groups, 13 is highly stable in aqueous media and convenient to handle. Eosin Y was used as the photocatalyst in order to avoid potential transition metal contamination of the products. Under the standard conditions, sulfone 13 reacted with various oligopeptide derivatives to give the corresponding glycopeptides with good efficiencies. Excellent functional group compatibility was observed. Peptides with side chains that bear amide, hydroxyl, carboxylic acid, guanidine, primary amine, sulfide, and indole groups all underwent this transformation smoothly (15a-f). Some of the peptides employed are of biological relevance. For example, the pentapeptide sequence in product 15e is identical to that surrounding the S-linked D-glucose unit in the naturally occurring antimicrobial sublancin. 6 Cyclo(RGDfC), employed to synthesize 15f, is a potent αvβ3 integrin-binding peptide.
We next investigated the scope of glycosyl donors, using disulfide 16 derived from glutathione (GSH) as the substrate (16 + 17 to 18, Fig. 2c). Various glycosylated GSH derivatives were generated with high selectivities. We showed that a glucosyl (18a), 2-aza-2-deoxy galactosyl (18b), arabinopyranosyl (18c), fucosyl (18d), or maltosyl group (18e) could be introduced. Importantly, the azide-containing GalNAz unit, which finds broad applications in metabolic labeling, 30 was tolerated and incorporated by our method as well (18f). Notably, all of the glycosyl donors employed here are readily prepared and bench-stable. Moreover, these results suggest that the reaction is not sensitive to the properties of the glycosyl units installed, a key advantage of chemical derivatization methods.
Glycosylation of Polypeptides and Proteins: With our reaction design validated, we proceeded to develop an operationally simpler protocol that could directly convert free thiols in polypeptides to S-glycosides.
Toward this end, we identified isothiazolone 20 as a thiolating agent, 31 and established the sequence shown in Fig. 3 (19 to 22 via 21). S-thiolation of thiol 19 by 20 to form disulfide 21 is quantitative and rapid (complete in less than five minutes). More importantly, isolation/purification of intermediate 21 is unnecessary, since no byproduct or potentially interfering functional groups are formed during this step. Thus, addition of thiol 19 to an aqueous solution that contains 20, glycosyl donor 17, and Eosin Y, followed by irradiation with green or blue light, yields the glycosylated peptides 22. Essentially, the conversion from free thiols to S-glycosides is conducted in one operation. Various bioactive polypeptides were glycosylated using this "second generation" method.

Mucin 1 (MUC1) is an O-linked glycoprotein containing tandem repeats of 20 amino acids (cf. 26), with glycosyl residues mounted on serines or threonines via 1,2-cis-linkages. 33 The MUC1 backbone is covered with complex oligosaccharides in normal cells. In tumor cells, however, the glycosylation is incomplete, and therefore many MUC1-associated epitopes such as Tn/TF-antigens are revealed to the immune system. MUC1-derived glycopeptides have been extensively studied for the development of antitumor vaccines. 34 Interestingly, compared with their natural O-linked counterparts, some artificial S-linked Tn-antigens demonstrated enhanced immunogenicity. 10 We thus custom-synthesized oligopeptide 27, in which a serine residue in 26 was replaced by a cysteine, with the aim of preparing the corresponding 1,2-cis-S-linked Tn-antigen mimetics using our method. Accordingly, 27 was subjected to our conditions, and S-linked glycopeptides were obtained with high axial selectivities. Simply by switching the glycosyl sulfone donor, a 2-aza-2-deoxy galactosyl (28a), a glucosyl (28b), or a maltosyl group (28c) could be installed. The resulting glycopeptides may aid investigation of the impact of sugar moieties on immunogenic processes.
S-glycosylation of Cys-containing proteins was also realized. Affibodies are small proteins that can bind to target proteins or peptides with high affinity. 35 As mimetics of monoclonal antibodies, affibodies have seen extensive diagnostic and therapeutic applications. We expressed a mutant affibody 29 that bears a cysteine residue. Incubating 29 (150 mM) in our "glycosylation kit" that contains glycosyl donor 13 (30 mM), thiolating agent 20 (45 mM), and Eosin Y (0.3 mM) for ca. 5 min, followed by irradiation with a blue LED for 1 h, gave the corresponding glycosylated protein 30 with good efficiency. The site of glycosylation was confirmed by tandem mass spectrometry. The His-tag tail in 29 was tolerated by our conditions. Non-structural protein 7 (nsp7) is a cofactor of SARS-CoV nsp12 polymerase. The nsp7 (31) of SARS-CoV-2 contains three cysteine residues. Following the above protocol, we could install glycosyl groups onto this protein as well, affording mono-, di-, and triglycosylated products as a mixture.
Mechanistic Studies: To understand the reaction mechanism, we investigated the kinetic profiles 36 of the reaction between 9 and disulfide 10b. We conducted three parallel experiments that differed only in the initial concentrations of 9 and 10b, and monitored the concentration of 10b as a function of time (Fig. 4a, left). We found that 1) the reaction rate is not affected by [9], since the experiments with different initial concentrations of 9 showed essentially overlapping progress (cf. red and blue lines); and 2) the reaction shows a first-order dependence on [10b], since ln[10b] decreases almost linearly with time (see SI for a more detailed discussion). We next conducted another set of experiments in which the loading of photocatalyst was varied from 0.25% to 1.0% (Fig. 4a, right). We noted that the reaction rate increased less than linearly with the loading of photocatalyst (see Fig. S6). With the above results, the reaction could be described by a rate law (equation 1) that is zero order in [9], first order in [10b], and less than first order in photocatalyst loading. From equation 1 and our proposed reaction pathway (Fig. 1b), we reason that the rate-determining step of this reaction is likely the photocatalyst-mediated homolytic cleavage of the disulfide to generate thiyl radicals. 26
We further conducted calculations [B3LYP/6-31G(d)] on the model reaction between sulfone 32 and disulfide 33 (Fig. 4b). Conversion of 32 to the anomeric radical 37 passes through a series of low-barrier steps initiated by thiyl radical 34, via alkyl radical 35 and sulfonyl radical 36. In principle, radical 37 could attack disulfide 33 to give either the equatorial product 38 or the axial product 39. Yet, the process that generates 39 is kinetically more facile by 6.4 kcal/mol, fully consistent with the exclusive axial selectivity 29 we observed.
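To make the two quantitative arguments above concrete, the following is a minimal Python sketch, not the authors' analysis. It assumes that the 6.4 kcal/mol difference can be treated as a difference in activation free energies with equal prefactors (simple transition-state theory), and that the temperature is 298.15 K; neither assumption is stated in the text. The function names and the synthetic concentration data are hypothetical and serve only as an illustration.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def selectivity_ratio(ddg_kcal_per_mol, temperature_k=298.15):
    """Rate-constant ratio k_fast / k_slow for two competing steps whose
    barriers differ by ddg_kcal_per_mol (assumes equal prefactors)."""
    return math.exp(ddg_kcal_per_mol / (R_KCAL * temperature_k))


def first_order_rate_constant(times, concentrations):
    """Least-squares slope of ln(c) versus t; returns k_obs = -slope.
    A straight line in this plot indicates first-order decay."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    sxx = sum((t - t_mean) ** 2 for t in times)
    return -sxy / sxx


if __name__ == "__main__":
    # Barrier difference taken from the text; temperature is an assumption.
    ratio = selectivity_ratio(6.4)
    print(f"axial:equatorial rate ratio (assumed 298.15 K): {ratio:.2e}")

    # Synthetic first-order decay, NOT experimental data from the paper.
    k_true = 0.05  # hypothetical rate constant, 1/min
    times = [0, 10, 20, 30, 40]  # min
    conc = [0.10 * math.exp(-k_true * t) for t in times]  # mol/L
    print(f"fitted k_obs: {first_order_rate_constant(times, conc):.4f} 1/min")
```

Under these assumptions, a 6.4 kcal/mol barrier difference corresponds to a rate ratio on the order of 10^4 in favor of the axial pathway, which is in line with the observation that no equatorial product was detected.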

Discussion
A non-enzymatic glycosylation method was established to build S-glycosyl-cysteine linkages under biorelevant conditions. We anticipate that this reaction, with its generality and operational simplicity, will be rapidly adopted in the preparation of S-linked glycopeptides and glycoproteins that find routine applications in chemistry, biology, drug discovery, and materials science.
References And Notes