Site-selective, stereocontrolled glycosylation of minimally protected sugars

The identification of general and efficient methods for the construction of oligosaccharides stands as one of the great challenges for the field of synthetic chemistry1,2. Selective glycosylation of unprotected sugars and other polyhydroxylated nucleophiles is a particularly significant goal, requiring not only control over the stereochemistry of the forming bond but also differentiation between similarly reactive nucleophilic sites in stereochemically complex contexts3,4. Chemists have generally relied on multi-step protecting-group strategies to achieve site control in glycosylations, but practical inefficiencies arise directly from the application of such approaches5–7. Here we describe a strategy for small-molecule-catalyst-controlled, highly stereo- and site-selective glycosylations of unprotected or minimally protected mono- and disaccharides using precisely designed bis-thiourea small-molecule catalysts. Stereo- and site-selective galactosylations and mannosylations of a wide assortment of polyfunctional nucleophiles are thereby achieved. Kinetic and computational studies provide evidence that site-selectivity arises from stabilizing C–H/π interactions between the catalyst and the nucleophile, analogous to those documented in sugar-binding proteins. This work demonstrates that highly selective glycosylation reactions can be achieved through control of stabilizing non-covalent interactions, a potentially general strategy for selective functionalization of carbohydrates.

The challenge of distinguishing similarly reactive sites in molecules lies at the heart of organic synthesis and is illustrated particularly dramatically by the coupling of polyhydroxylated partners underlying the construction of oligosaccharides8,9. Although enzymes have evolved precise, complex machineries to achieve site-selectivity in many molecular contexts, including glycosylation10,11, laboratory chemists have traditionally relied on the use of protecting groups to effectively circumvent the problem (Fig. 1a). Protecting-group strategies have been advanced to a sophisticated level and have long stood as a pillar of laboratory carbohydrate synthesis, enabling site control and often influencing stereocontrol in a wide range of chemical glycosylations5,6,12. However, the installation and removal of specific protecting groups requires multi-step synthetic sequences, and often results in steric and electronic deactivation of the unprotected hydroxyl groups that are the targets of reaction13,14. Owing to these intrinsic inefficiencies, protecting-group-free carbohydrate synthesis that is both highly site- and stereo-selective and broad in scope remains an important goal for the field of carbohydrate chemistry15.
There have been several important efforts to achieve non-enzymatic catalyst-controlled approaches to site-selectivity in glycosylation reactions. One successful approach relies on the ability of certain Lewis acids to form cyclic covalent adducts with vicinal diols while selectively activating one of the transiently protected oxygens towards reaction16–20. In this manner, cis-1,2-diols can be converted to cyclic adducts and induced to undergo selective glycosylations at the equatorial oxygen. α-Selective glycosylation of trans-diols has also been demonstrated recently with a diboron catalyst, with site-selectivity controlled by the size of the protecting groups on the nucleophiles rather than the catalyst21. Despite these successes, a general method for catalyst-controlled glycosylation of trans-diols to generate β-glycosides is still lacking.
We considered a different catalyst-controlled approach, one that is not dictated by the inherent stereochemical properties of the substrates but would instead take advantage of non-covalent interactions to activate a specific hydroxyl group (Fig. 1a). Carbohydrates are known to engage in various types of attractive non-covalent interaction, with extensive evidence for C-H/π interactions between carbohydrates and electron-rich aromatic side chains in carbohydrate-protein complexes22–24. Such interactions are pivotal to saccharide recognition and have been exploited to design synthetic receptors for carbohydrate binding25,26. Precisely tailored catalysts capable of harnessing such interactions could enable site-selective glycosylation at hydroxyl groups previously inaccessible by strategies that rely on the catalytic formation of cyclic adducts. Although attractive non-covalent interactions have been utilized in site-selective protection of carbohydrates3,27–30, examples of catalyst-controlled glycosylation where both stereo- and site-selectivity are controlled by non-covalent interactions remain scarce and are generally only moderately selective31,32.
The amide groups of catalysts such as cat-1 have been proposed to act as the general base responsible for nucleophile activation in glycosylation reactions, and variation of the arylpyrrolidino groups has been shown to affect the degree of stereospecificity in reactions of protected sugar nucleophiles33–36. We predicted that altering the arylpyrrolidino amide components of the catalyst might also influence the site-selectivity of glycosylation; therefore, we systematically evaluated the effect of the aryl substituents on (1,2)-selectivity. The selectivity of galactosylation with β-3a was indeed found to be highly responsive to changes in the 'northern' arylpyrrolidine amides (Fig. 1b). Whereas an unsubstituted pyrrolidine catalyst lacking the 'northern' arene (cat-2) promoted unselective galactosylation, catalysts bearing different 'northern' arenes (cat-1, 3-6) induced site-selectivity that correlated with the electron density of the arene. The 5-N-methyl indole catalyst (cat-6) was thus identified as the optimal catalyst for site-selective galactosylation ((1,2):(1,3) = 17:1). In contrast, variation of the 'southern' arene (cat-1, 7 and 8) had little influence on site-selectivity, suggesting that only the 'northern' arylpyrrolidino amide is involved in nucleophile recognition. As discussed in greater detail below, the positive correlation between site-selectivity and the electron density of the 'northern' arene suggests that attractive non-covalent interactions influenced […]
We adopted an analogous approach towards the development of catalysts for site-selective mannosylations (Fig. 1b). In contrast to the trend observed in galactosylation, site-selectivity was observed to be most responsive to alterations of the 'southern' arene (cat-1, 7 and 8) rather than the 'northern' arene (cat-1, 5 and 6). 
The catalyst framework thus appears to promote glycosylation by engaging either of the two amido arylpyrrolidines as general bases depending on the identity of the electrophilic coupling partner. As observed in galactosylation, (1,2)-selectivity was sensitive to the electronic properties of the arene, with more electron-rich arenes affording higher selectivity. Incorporation of an electron-rich arene (2-naphthyl) on both the 'northern' and the 'southern' arylpyrrolidines in an attempt to develop a universal catalyst for both galactosylation and mannosylation resulted in similar site-selectivity but lower reactivity compared with cat-7 (30% versus 60% conversion; Supplementary Fig. 6). Thus, catalysts bearing specifically tuned 'northern' and 'southern' arylpyrrolidines afforded the best combination of site-selectivity and reactivity, and cat-6 and cat-8 were selected for further studies of galactosylation and mannosylation, respectively.
With precisely tailored bis-thiourea catalysts thus identified, the scope of β-selective and site-selective galactosylation and mannosylation was explored (Fig. 2). β-Monosaccharides such as β-3a-3c and minimally protected disaccharides containing a β-galactose or β-glucose motif (3d-3g and 3l-3m) underwent glycosylation with selectivity for the C2 hydroxyl, with no observable over-glycosylation. β-Glucoside-containing pharmaceuticals and natural products (3h-3i and 3n) also underwent glycosylation with synthetically useful levels of (1,2)-selectivity. The insensitivity of site-selectivity to the identity of the β-anomeric group suggests that the anomeric substituent projects away from the key interactions with the catalyst that are responsible for imparting selectivity. However, the anomeric configuration of nucleophiles was found to have a profound impact on site-selectivity, as α-monosaccharides (α-3a and α-3b) showed significantly lower site-selectivity (2.1:1 and 1:1.7 (1,2):(1,3), respectively).
We studied the divergent behaviour of α-3a and β-3a with the hope of gleaning insight into the origins of catalyst-controlled site-selectivity. Site-selectivity for the galactosylation of α-3a and β-3a was found to respond very differently to electronic perturbation of catalyst arenes (Fig. 3a). The site-selectivity in the galactosylation of β-3a increased as the catalyst arenes became more electron rich, whereas little variation in site-selectivity was observed in the galactosylation of α-3a. We predicted that the contrasting behaviours observed with α-3a and β-3a could result from their differing ability to effectively engage in C-H/π interactions with catalyst arenes in the glycosylation event. The strength and facial preference of such interactions are known to depend strongly on the sugar configuration: experimental and theoretical studies have shown that stacking preferentially occurs on the face presenting multiple axially oriented C-H bonds22,23,37,38. 
We compared the experimental ΔΔG‡ derived from the (1,2):(1,3) ratio (defined as −RT ln(r.r.) and reflecting the difference in transition state energies leading to the two regioisomeric products) with computed interaction energies between β-galactose and different arenes. A good correlation was observed in the case of β-3a, whereas no statistically significant correlation was seen with α-3a. These observations support the hypothesis that, when not precluded owing to steric effects, attractive C-H/π interactions can have a critical role in controlling site-selectivity.
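The conversion from a regioisomeric ratio to ΔΔG‡ described above can be illustrated with a minimal sketch. The function names and the default temperature of 298.15 K are assumptions made for illustration only; the reaction temperature is not stated in this section, so the absolute values should be read as indicative. The ratios are those quoted in the text.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def ddg_from_ratio(ratio_12_to_13, temperature_k=298.15):
    """Return ddG‡ = -RT ln(r.r.) in kcal/mol for a (1,2):(1,3) ratio.

    A negative value means the (1,2)-pathway is favoured.
    The default temperature is an assumption, not a value from the text.
    """
    if ratio_12_to_13 <= 0:
        raise ValueError("the ratio must be positive")
    return -R_KCAL * temperature_k * math.log(ratio_12_to_13)


def ratio_from_ddg(ddg_kcal, temperature_k=298.15):
    """Inverse relation: recover the (1,2):(1,3) ratio from ddG‡."""
    return math.exp(-ddg_kcal / (R_KCAL * temperature_k))


# (1,2):(1,3) ratios quoted in the text, written as a single number.
RATIOS = {
    "beta-3a, cat-6 (17:1)": 17.0,
    "alpha-3a (2.1:1)": 2.1,
    "alpha-3b (1:1.7)": 1.0 / 1.7,
    "alpha-3a, cat-9 (19:1)": 19.0,
}

if __name__ == "__main__":
    for label, rr in RATIOS.items():
        print(f"{label}: ddG = {ddg_from_ratio(rr):+.2f} kcal/mol")
```

Because the relation is logarithmic, a ratio near 1:1 corresponds to a ΔΔG‡ close to zero, and a ratio of about 17:1 corresponds to a difference of well under 2 kcal/mol at ambient temperature, which is why modest changes in arene interaction energy can translate into large changes in site-selectivity.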
We explored the possibility of exploiting other catalyst features to enforce site control in the galactosylation of α-3a. Expansion of the catalyst 'northern' aryl group to 1-naphthyl and replacement of the tert-leucine residue with alanine (cat-9) enabled the highly (1,2)-selective galactosylation of α-3a (19:1 (1,2):(1,3); Fig. 3b). Although the mechanisms of catalyst control are clearly distinct for the reactions of α-3a and β-3a, these intriguing results demonstrate the potential generality of a catalyst-controlled approach for achieving site-selectivity and suggest that non-covalent interactions other than C-H/π interactions can be harnessed to impart site-selectivity in glycosylation.
We sought to understand the mechanism by which C-H/π interactions in the selectivity-determining step lead to enhanced site selectivities. In theory, site-selectivity could arise either from increased binding of a pre-reactive pro-(1,2) complex or from stabilization of the transition state leading to (1,2)-product. Previous kinetic analyses of […] (Fig. 4a). Therefore, the enhanced site-selectivity cannot be attributed to the ability of cat-6 to form more stable ternary complexes: instead, the faster rate constant (k_cat) observed for cat-6 indicates that the site-selectivity induced by cat-6 can be attributed to acceleration of the glycosylation step. The higher k_cat(1,2) and lower k_cat(1,3) for cat-6 relative to cat-2 indicate that higher site-selectivity is achieved by accelerating the major (1,2)-pathway and decelerating the minor (1,3)-pathway. In separate studies, it was found that the analogue of β-3b bearing benzyl protecting groups at all but the C2 hydroxyl position was about ten times less reactive than β-3b itself under otherwise identical catalytic conditions (Supplementary Table 6). Taken together, these observations highlight key advantages of non-covalent catalyst control relative to traditional protecting-group approaches: whereas neighbouring protecting groups typically result in reduced reaction rates owing to steric congestion and electronic deactivation, non-covalent activation of the targeted site on unprotected nucleophiles relies on rate acceleration relative to the uncatalysed pathway.
Density-functional-theory calculations on the galactosylation of β-galactose were carried out to further investigate the origins of site-selectivity. Both (1,2)- and (1,3)-transition states were located and found to feature an asynchronous SN2-like mechanism involving 4H activation of the diphenylphosphate group and amide-mediated nucleophile activation (Fig. 4b). 
The computed general-base activation mechanism was consistent with both previous proposals and the experimental observation that replacing the 'northern' amide with a thioamide, a weaker general base, resulted in diminished site-selectivity (Supplementary Fig. 11). However, the predicted sense of site-selectivity was opposite from that observed experimentally. We reasoned that the disagreement with experimental results might be caused by poor modelling of solvation: although unprotected sugars are likely to engage in explicit hydrogen-bonding interactions with the ethereal solvent, these interactions will be poorly described by implicit solvent models39,40. Indeed, semi-empirical molecular dynamics simulations in explicit solvent suggest that the C4 hydroxyl on the nucleophile engages in strong hydrogen-bonding interactions with the solvent. Replacing the poorly modelled C4 hydroxyl with a methoxy resulted in a computed preference for reaction at the C2 hydroxyl, in line with experiment. Further explicit solvent calculations were used to construct a two-dimensional free-energy surface for nucleophile binding to the 'northern' amide-phosphate complex (Supplementary Fig. 18), and revealed the existence of three roughly isoenergetic minima corresponding to binding of the C2 hydroxyl, the C3 hydroxyl or both to the amide group. This supports the conclusion drawn from kinetic studies that the observed (1,2)-selectivity arises from transition-state stabilization and not from preferential binding of one hydroxyl group in the nucleophile-bound ternary complex. Analysis of the calculated transition-state structures reveals that the (1,2)-transition state features markedly closer C-H/π contacts than the (1,3)-transition state, consistent with the increase in (1,2)-selectivity induced by electron-rich aryl groups (Fig. 4b). 
The differences can be understood intuitively by considering how general-base activation of the nucleophile affects the C-H/π interaction between the arene and the axial C-H bonds at C1, C3 and C5. Activation of the C3 hydroxyl group by the arylpyrrolidine amide results in pulling the nucleophile away from the arene and weakening the interaction with the C3 methine. In contrast, amide-mediated activation of the C2 hydroxyl better preserves the C-H/π interactions between the nucleophile and the arene.
We have developed highly (1,2)-selective galactosylations and mannosylations of β-carbohydrates using bis-thioureas bearing electron-rich arenes. Studies of the structure-selectivity relationship demonstrate the importance of C-H/π interactions between nucleophiles and catalyst arenes in controlling the site-selectivity of glycosylation. Kinetic and computational analyses point to selective stabilization of the major (1,2)-pathway through attractive carbohydrate C-H/aromatic interactions. This work supports the notion that carbohydrate-aromatic interactions can be leveraged productively in glycosylation reactions, and more broadly showcases the feasibility of exploiting attractive non-covalent interactions to achieve high stereo- and site control in small-molecule-catalysed glycosylations.

Online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-022-04958-w.