Development of a versatile and efficient C–N lyase platform for asymmetric hydroamination via computational enzyme redesign

Although C–N bonds are ubiquitous in natural products, pharmaceuticals and agrochemicals, biocatalysts forging these bonds with high atom-efficiency and enantioselectivity have been limited to a few select enzymes. In particular, ammonia lyases have emerged as powerful catalysts to access C–N bond formation via hydroamination. However, the use of ammonia lyases is rather restricted due to their narrow synthetic scope. Herein, we report the computational redesign of aspartase, a highly specific ammonia lyase, to yield C–N lyases with cross-compatibility of non-native nucleophiles and electrophiles. A wide range of non-canonical amino acids (ncAAs) are afforded with excellent conversion (up to 99%), regioselectivity >99% and enantioselectivity >99%. The process is scalable under industrially relevant protocols (exemplified in kilogram-scale synthesis) and can be facilely integrated in cascade reactions (demonstrated in the synthesis of β-lactams with N-1 and C-4 substitutions). This versatile and efficient C–N lyase platform supports the preparation of ncAAs and their derivatives, and will present opportunities in synthetic biology. Ammonia lyases are powerful catalysts to access C–N bond formation via hydroamination, but show a narrow synthetic scope. Now, by computational redesign of an aspartase, a C–N lyase is developed that shows cross-compatibility of non-native nucleophiles and electrophiles expanding the synthetic scope.

Outside the 20 proteinogenic amino acids that serve as the foundational building blocks of life, there are manifold non-canonical amino acids (ncAAs) that exhibit diverse physiological functions and are extensively used as intermediates for bioactive products 1. An analysis of FDA-approved medicines reveals that approximately 12% of the 200 top-grossing drugs contain at least one ncAA building block 2. The implementation of ncAAs in synthetic pipelines can bypass many difficulties in installing challenging pharmaceutical functionalities, such as chiral amines and unprotected polar groups. Moreover, the physical and chemical properties of synthetic peptides and proteins can be selectively manipulated through the incorporation of ncAAs, which has contributed to understanding biological macromolecules 3 and the development of new therapeutics 4 and high-performing biocatalysts 5,6. In a broader sense, the biotechnological application of ncAAs would benefit synthetic biology's central goal: to create new life forms and functions. However, despite the impressively wide perspectives of ncAAs, their synthesis remains a challenge because stringent stereoselectivity and functional-group compatibility are required 7. The full potential of this class of compounds has not been realized. Thus, simple, sustainable, cost-effective and scalable processes for ncAAs are highly desirable 8.
Given the apparent advantages in terms of selectivity, sustainability and evolvability, biocatalytic preparation of ncAAs has attracted increasing interest 9,10 . From a retrosynthetic perspective, the use of coupling reactions imparts excellent benefits as it introduces a few atom-economical reactions that allow multistep procedures and purifications to be avoided to provide a target product 11 . In this respect, chiral ncAAs can be obtained through the hydroamination of carbon-carbon double bonds, one of the top aspirational reactions in synthetic chemistry 12 . Asymmetric C-N addition enables an enantioselective and atom-economic synthesis in which easily accessible prochiral compounds are conjugated to form an optically pure ncAA with 100% theoretical yield, albeit typically with a requirement for tedious protection-deprotection steps and expensive catalysts or chiral auxiliaries via metal catalysis or organocatalysis 13,14 . These problems compound when ncAAs contain additional reactive functional groups and need to fit the stringent criteria for industrial applications. In this context, nature circumvents such challenges by using ammonia lyases to catalyse reversible C-N bond cleavage and formation [15][16][17][18] (Fig. 1a). Native reactions mediated by aspartase and phenylalanine ammonia lyase (PAL) exemplify the enzymes' advantages perfectly, which have been exploited to synthesize l-aspartic acid and l-phenylalanine on the thousand-ton scale since the 1980s (ref. 18 ) (Fig. 1b). More recently, a rapid increase in C-N lyase applications in biocatalytic and therapeutic fields has benefited from the advancement of discovery and engineering technologies to improve these enzymes' functional properties and offer expanded synthetic scopes [19][20][21][22][23][24][25][26][27][28] (Fig. 1c,d). 
The growing toolbox of C-N lyases enables syntheses of a range of high-value-added products, with prominent examples including cypermethrin and a key intermediate of olodanrigan (EMA401) by engineered PALs 19,22 , a series of valuable aspartic acid derivatives by two methylaspartate ammonia lyase variants 24 , and the fungal natural product aspergillomarasmine A by ethylenediamine-N,N′-disuccinic acid lyase 27 .
Although inspiring advances have been made, C-N lyases do not yet tap all of the catalytic potential within their scaffold. The cross addition of non-native nucleophilic amines to substituted alkene partners remains elusive, with no success in simultaneously diversifying the electrophile and nucleophile substrate scope of C-N lyases. To endow C-N lyases with broader synthetic use, there are several issues to be considered in addition to the inherent difficulties in asymmetric hydroamination, such as asymmetric induction and compatibility problems (Fig. 1e). First, such a transformation requires that both amines and alkenes can access the enzyme active site and are stabilized at suitable positions to maintain enantioselectivity. Second, the high regio- and enantioselectivities achieved by C-N lyases rely mainly on the precise hydrogen-bonding network or aromatic microenvironment of these enzymes, wherein the residues that stabilize nucleophiles and electrophiles through hydrogen bonds are interlaced, which inherently limits the substrate scope. Third, within a compact active site (an average volume of 1,072 Å³ calculated from the Catalytic Site Atlas database 29), approximately ten key residues (the van der Waals volumes of proteinogenic amino acids range from 67 to 163 Å³, ref. 30) need to be screened to cover the whole active site, and there are 10! ≈ 3.6 × 10⁶ possible paths linking these single mutations. Most mutational pathways to new properties would be tortuous in the presence of epistatic effects, and the challenge lies in identifying an efficient path to the desired function along the rugged fitness landscape.
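The size of this search problem can be checked with a short calculation. The sketch below assumes, as in the text, roughly ten positions with one substitution each, and counts the orders in which those single mutations could be introduced one at a time; the variable names are illustrative only.

```python
import math

# Assumption taken from the text: about ten active-site positions,
# each carrying one target substitution.
n_positions = 10

# Each ordering of the ten single mutations is one stepwise path
# through the fitness landscape, so the path count is n!.
n_paths = math.factorial(n_positions)

print(n_paths)           # 3628800
print(f"{n_paths:.1e}")  # 3.6e+06
```

The count grows factorially, which is why stepwise exploration becomes impractical once many positions must change together.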
In this study, encouraged by our recent success in redesigning an aspartase from Bacillus sp. YM55-1 (AspB) 23, we attempted to tackle the longstanding challenge of creating C-N lyases with cross-compatibility of non-native substrates by applying computational protocols that allow large jumps in function to traverse inactive sequence space along the fitness landscape. We envisioned that the AspB variants with a thoroughly reshaped active site could achieve cross addition of a large variety of nucleophilic amines to unsaturated acids. Empowered by computational design, we constructed a well-defined library that allowed a diverse range of ncAAs to be prepared with high activity, excellent conversion and enantioselectivity. Moreover, a subsequent two-step chemoenzymatic route efficiently furnished the desired β-lactam compounds. In addition to the synthetic use, our work may provide a general strategy for expanding the reactivity repertoire of biocatalytic asymmetric hydroamination and highlight a C-N lyase platform that possesses unique and attractive properties for the biocatalytic preparation of a myriad of practically relevant ncAAs and their demand-tailored derivatives.

Results
Computational redesign of AspB. We commenced our study by defining calculation criteria based on the detailed enzymatic mechanism. Although thermodynamically feasible, hydroamination reactions generally have a high activation barrier due to the repulsion of the electron-rich π system of the alkene substrate and the electron pair on the amine nitrogen atom 31. However, the associated entropic penalty can conceivably be overcome by preorganization in an enzyme reactive site 15. The reaction mechanism of AspB involves the abstraction of the pro-R proton from the Cβ atom of substrate aspartate by the general base Ser318 within the SS loop 32. The carboxylate group of the formed enolate anion intermediate is stabilized through a network of hydrogen bonds including residues Thr101, Ser140, Thr141 and Ser319, with the amino group being easily accommodated in the nucleophile binding pocket by hydrogen-bonding interactions of the side chains of Thr101, Asn142 and His188 (ref. 33). The original α-carboxylate binding pocket consists of residues Thr187, Met321, Lys324 and Asn326, which were substituted to target different electrophiles in our previous work 23. Initial experiments on the asymmetric hydroamination reaction of amine groups revealed that Asn142 served as a dispensable residue, whereas Thr101 and His188 functioned as crucial residues; that is, substitutions of Thr101 and His188 to alanine provoked dramatic reductions in catalytic activities. From the structural point of view, residues Ala99, Leu358 and Glu362 create a hindered environment that prevents larger nucleophiles from adopting their ideal reactive pose. Motivated by these findings, we used Rosetta Enzyme Design to expand the nucleophile spectrum of AspB. Specifically, Ala99, Asn142, Leu358 and Glu362 were envisioned to be mutated into less-bulky residues while the global folding of the enzyme was maintained.
An ensemble of different starting points was generated for Rosetta Enzyme Design by molecular dynamics (MD) simulations. Subsequently, to improve the quality of the design library and reduce the screening effort, the predicted designs obtained by Rosetta Enzyme Design were computationally screened by multiple independent MD simulations, which increase the conformational space sampled. The fraction of time that the enzyme-substrate complex maintains the hydrogen-bonding interaction network of the β-carboxylate was quantified by using geometric criteria for near-attack conformations (NACs), which are conformations that approach the transition state structure. Finally, a small number of designs were selected for experimental characterization on the basis of the total energy scores, the penalty scores of the constraints and NAC frequencies (Fig. 2).
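The NAC frequency described above is simply the fraction of MD snapshots that satisfy a set of geometric cut-offs. The following is a minimal sketch of that bookkeeping, not the authors' implementation: the `Snapshot` fields and the distance and angle thresholds are hypothetical placeholders, since the actual criteria are not given in this section.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Snapshot:
    attack_distance: float  # Å, nucleophile-to-carbon distance (hypothetical descriptor)
    attack_angle: float     # degrees, angle of approach (hypothetical descriptor)


# Placeholder cut-offs for illustration only; not the values used in the study.
MAX_ATTACK_DISTANCE = 3.4
ANGLE_RANGE = (90.0, 120.0)


def is_nac(s: Snapshot) -> bool:
    """Return True if a snapshot meets every geometric criterion."""
    low, high = ANGLE_RANGE
    return s.attack_distance <= MAX_ATTACK_DISTANCE and low <= s.attack_angle <= high


def nac_frequency(snapshots: Iterable[Snapshot]) -> float:
    """Fraction of snapshots that count as near-attack conformations."""
    frames = list(snapshots)
    if not frames:
        return 0.0
    return sum(is_nac(f) for f in frames) / len(frames)


# Made-up trajectory: two of the four frames satisfy both criteria.
trajectory = [
    Snapshot(3.1, 105.0),  # NAC
    Snapshot(3.9, 110.0),  # distance too long
    Snapshot(3.3, 95.0),   # NAC
    Snapshot(3.2, 140.0),  # angle out of range
]
print(nac_frequency(trajectory))  # 0.5
```

In practice the snapshots from several independent simulations of one design would be pooled before the frequency is computed.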

Conjugate addition of a matrix of nucleophiles to electrophiles.
We first evaluated the asymmetric hydroamination of crotonic acid (1) with amine derivatives. As the benchmark reactions for computational design, substrates that provide an exceptional opportunity for late-stage diversification (allylamine, h) or bear relatively larger substituents (cyclopropylamine, j) were chosen. A mutant library of the B19 enzyme 23 was generated in silico by simultaneous substitutions of Ala99, Asn142, Leu358 and Glu362 to less-bulky residues (A99 to G, N142 to GAVSTC, L358 to GAVIMSTCDNH and E362 to GAVLIMSTCDNH). Experimental validation of a small set of 22 designs for 1h and 16 designs for 1j resulted in the identification of 37 mutants, among which the design referred to as BA15 (containing A99G-N142S-T187C-M321I-K324L-N326A-L358V-E362M mutations) showed the highest specific activity for syntheses of both 1h and 1j as ascertained by HPLC with a chemically prepared authentic standard. Excellent conversions (>99%) were achieved within 2 h, and control experiments showed that the amines did not react with crotonic acid in the absence of the enzymes. The successful design encouraged the evaluation of a broad spectrum of compounds (a-n), in which most of the amines were efficiently converted by BA15 to afford the respective optically pure products (>99% enantiomeric excess (e.e.)) with >96% conversions (Fig. 3c) at a substrate loading up to 150 g l⁻¹. Notably, sterically hindered amines (c and e), which were problematic substrates for previously reported enzymatic hydroamination reactions, were successfully incorporated. For the strong nucleophile ethylenediamine (m), a weak spontaneous reaction occurred, which lowered the product enantioselectivity slightly to 90% e.e. Structural analysis showed that the A99G-N142S-L358V-E362M mutations would afford an enlarged amine binding pocket that would retain the van der Waals interactions and permit the binding of bulkier amine groups in different orientations (Fig. 3a and Supplementary Fig. 2a).
Hence, non-native amines with charged or large substituents might be accepted in addition to simple ammonia. More importantly, the conformational change of the nucleophilic pocket was considered to not conflict with the electrophilic pocket, which raised the possibility of direct combination of the nucleophilic and electrophilic pockets without multiple rounds of design.
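To give a sense of the sequence space covered by the in silico library, the allowed substitutions listed above can be enumerated directly. The sketch below is an illustration only: it counts the full combinatorial space defined by the four residue lists, not the subset that the design software actually proposed, and the dictionary and function names are hypothetical.

```python
from itertools import product

# Allowed replacement residues per position, copied from the text.
allowed = {
    "A99": "G",
    "N142": "GAVSTC",
    "L358": "GAVIMSTCDNH",
    "E362": "GAVLIMSTCDNH",
}


def enumerate_library(options):
    """Yield every combination of one allowed residue per position."""
    positions = list(options)
    for combo in product(*(options[p] for p in positions)):
        yield "-".join(f"{pos}{aa}" for pos, aa in zip(positions, combo))


library = list(enumerate_library(allowed))
print(len(library))                         # 792
print("A99G-N142S-L358V-E362M" in library)  # True
```

The 1 × 6 × 11 × 12 = 792 combinations include the nucleophile-pocket substitutions found in BA15, which indicates how far the computational ranking narrowed the candidates before the 22 and 16 designs were tested.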
To evaluate the compatibility of the nucleophilic pocket and designed electrophilic pockets, the redesigned nucleophile binding pocket was introduced into the AspB wild type (yielding the AA15 design, containing A99G-N142S-L358V-E362M mutations, Supplementary Fig. 2b) and its engineered enzymes P1 (yielding the PA15 design, containing A99G-N142S-T187C-M321I-K324L-N326C-L358V-E362M mutations, Supplementary Fig. 2c) and F29 (ref. 23; yielding the FA15 design, Supplementary Fig. 2d) to catalyse the conjugate addition of a matrix of diverse nucleophilic donors to electrophilic acceptors. As anticipated, the unsaturated amino acids bearing an ethyl group efficiently underwent a hydroamination reaction with the evaluated amines to afford the corresponding products in excellent conversions (>94%) and enantioselectivities (>99% e.e., except for 2m). The substrate tolerances peaked at concentrations up to 100 g l⁻¹. For the charged substrate fumaric acid, most of the substituted amines with aliphatic, unsaturated or charged groups were efficiently processed, providing the desired products with >90% conversions and excellent stereoselectivity (>99% e.e., except for 3m) at a high substrate loading of 80-130 g l⁻¹, except for isopropylamine, which gave moderate conversion (88%). For aromatic substrates, a lower concentration (7.5 g l⁻¹) had to be used due to their low solubility. Amines with relatively small substituents exhibited low conversions. Nonetheless, cinnamic acid proved to be a competent coupling partner with methoxyamine, providing the product with 97% conversion. Aromatic acrylates bearing electron-withdrawing/-donating groups also afforded the desired products with satisfactory conversions, which demonstrates the ability of the redesigned enzyme to tolerate functionalized groups typically encountered in pharmaceutical agents.
Fig. 3 | Computationally redesigned AspB-catalysed amine (a-n) addition to unsaturated acids (1-9). a, Structural comparison of the Rosetta-predicted model of BA15 bound to (R)-β-(cyclopropylamino)butanoic acid with the wild-type enzyme bound to aspartate highlights a drastic reshaping of the active site. Redesigned residues are coloured in purple. b, Unsaturated acid electrophiles (1-9) and amine nucleophiles (a-n). c, Enzymatic synthesis of ncAAs by using heat-purified BA15 (1a-n), PA15 (2a-n), AA15 (3a-n) and FA15 (4b-9n). The reactions were carried out at 50 °C, pH 9.0 (BA15 and PA15), 37 °C, pH 9.0 (AA15) or 37 °C, pH 8.5 (FA15). Unsaturated acid concentrations, reaction times, conversions and e.e. are listed. Generally, 1.5-2.0 equiv. (1a-3n) or 14-23 equiv. (4b-9n) of amines and 0.12-0.60 mM enzymes were added in the reaction mixtures (5 ml). Conversions and e.e. values were determined by HPLC analysis. n.d., not determined. ᵃ Stereoselectivity was determined at only the β-carbon atom. ᵇ The formation of the β-amino acids was identified by ESI-MS and nuclear magnetic resonance, and e.e. was not determined due to the lack of racemic standard compounds. ᶜ e.e. was not determined because the racemic standard β-amino acids could not be distinguished by HPLC analysis after numerous attempts. ᵈ The formation of the β-amino acids was identified by ESI-MS, and e.e. was not determined due to a lack of racemic standard compounds. Data are from one independent experiment.
For a few ncAAs bearing valuable scaffolds or providing opportunities for further functionalization, preparative reactions were performed. As summarized in Fig. 4, various ncAA products were successfully synthesized by the corresponding redesigned AspBs on ten- to hundred-gram scales to afford the products in good to excellent isolated yields (74-93%). Propargylamine was efficiently converted to give the corresponding product 1i in excellent isolated yield (93%, 131 g), leaving the alkynyl group available for potential downstream synthetic manipulation. With a 1.5-fold molar equiv. of propargylamine over crotonic acid, the reactions were complete within 1 h at 50 °C, providing a space-time yield of 131 g l⁻¹ h⁻¹, which is a notably high value for C-N lyases. Within the rather broad substrate spectrum, we next examined substrates with long aliphatic chains that are precursors of aspartame derivatives. Derivatization of the artificial dipeptide sweetener aspartame with N-alkyl groups can generate even sweeter compounds, such as the approved food additive neotame, which is 7,000-13,000 times sweeter than sucrose 34. In this study, N-butyl-l-aspartic acid (3f), which is a precursor to a neotame analogue, was synthesized on a kilogram scale using whole cells fermented from merely 2 l of medium with excellent conversion (>97%), isolated yield (92%, 1.4 kg) and stereoselectivity (>99% e.e.) (Fig. 4b), demonstrating the great potential of the redesigned AspBs to offer alternative synthetic options for the industrial preparation of valuable ncAA products.
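The space-time yield quoted for product 1i follows from the usual definition, product mass divided by reaction volume and reaction time. The sketch below reproduces the figure under one explicit assumption: a reaction volume of 1 l, which is not stated in this section and is inferred only because it makes the 131 g isolated mass and the 1 h reaction time consistent with the reported value.

```python
def space_time_yield(product_mass_g: float, volume_l: float, time_h: float) -> float:
    """Space-time yield in g per litre per hour."""
    if volume_l <= 0 or time_h <= 0:
        raise ValueError("volume and time must be positive")
    return product_mass_g / (volume_l * time_h)


# 131 g of 1i within 1 h (from the text); the 1 l volume is an assumption.
sty = space_time_yield(product_mass_g=131.0, volume_l=1.0, time_h=1.0)
print(sty)  # 131.0
```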
Chemoenzymatic synthesis of β-lactam compounds. The broader substrate scope of the redesigned C-N lyases has also raised the possibility of building entirely new synthetic pathways for valuable precursors to pharmaceuticals. By harnessing the elegance of biocatalysis and the robustness of chemical catalysis, efficient routes towards the β-lactam heterocycle, one of the most acclaimed pharmacophoric moieties 35, from simple starting materials were examined. With the enzymatically prepared ncAAs in hand, we subsequently performed the cyclization reaction in the same pot without purification, achieving full conversion in the second (cyclization) step. The substrate cyclopropylamine (j), which could be efficiently converted by BA15 with 99% conversion, was chosen for our initial investigation. The chemoenzymatically prepared β-lactam product 1jc was isolated with moderate overall yield (63%) and excellent optical purity (e.e. >99%), and without racemization of the potentially sensitive Cβ stereogenic centre (Fig. 5a). To further demonstrate the synthetic usefulness of this strategy, 1g, 1i, 1k, 2h, 2i and 2k, which were well-accepted substrates of the redesigned enzymes, were chosen and provided good overall isolated yields (46-71%). The results demonstrated a simplified practical procedure towards optically pure β-lactam heterocycles through a chemoenzymatic synthesis route.
It is noteworthy that alkenes and alkynes are highly versatile synthetic handles suitable for elaboration into a wide variety of useful functional groups. The modification of the alkyne or alkene tag in β-lactam heterocycles via well-established click chemistry allows the introduction of bulky reporter groups (for example, rhodamine). Therefore, the corresponding probes are useful tools for studying enzyme activity, function and assembly 36 . To our delight, the use of 2ic as the alkyne donor was feasible and the corresponding azole product was obtained in 3 h as ascertained by electrospray ionization-mass spectrometry (ESI-MS) (Fig. 5b). The successful attempt to synthesize chemoenzymatic β-lactam heterocycles via click chemistry may further provide access to tailor-made enzyme inhibitors and provide potential molecular probes to unravel the activity and function of proteins.

Discussion
Through evolution, nature has fashioned a plethora of enzymes to catalyse diverse reactivities that make life possible. The conventional views on enzyme limitations in synthetic applications often assume that an enzyme's exquisite activity comes at the cost of strict selectivity for accepted substrates 37 . Nevertheless, a majority fraction of enzymes do not make use of all the possible chemistries that are accessible by their scaffold 38 , which has fuelled efforts to tailor the performance of existing enzymes and, more ambitiously, to create new reactions. Exploring a fitness landscape via traditional laboratory evolution relies on the presence of generalists as starting points 39 . It is an arduous task that does not involve knowledge of what determines new activity, and the quality of the said starting point is unknown since iterative rounds of mutagenesis target sites scattered throughout the global protein structure 40 . An alternative avenue in engineering existing enzymes has proved successful in enhancing our ability to recognize the origin of enzymes' remarkable performances using theoretical and computational methods [41][42][43][44][45][46] . By exploring a small fraction of the vast sequence space, successful examples of studies focused on reshaping active sites that accommodate individual substrates have been reported [47][48][49][50] . However, a fantastic array of biocatalysts, such as C-N lyases, imine reductases and aldolases, catalyse the cross-coupling of multiple substrates. Consequently, the question is raised as to whether rational computational modelling could be applicable for more challenging tasks, where manipulations of enzymes are no longer limited to a small set of residues but are expanded to deal with collective mutations synchronously lining the whole active site.
Here, we addressed this possibility by dramatically transforming the substrate recognition pattern of the extremely specific enzyme AspB. This goal was achieved using mechanism-based computational protocols, dissecting nearby interactions relevant for binding and catalysis, retrieving essential catalytic geometric criteria from structural analyses and MD simulations, and allowing large jumps in sequence space while accommodating interactions between multiple simultaneous mutations and substrate. The notable advantage of this strategy lies in minimizing experimental efforts while maintaining the exploration of synergisms between mutations. As such, desired enzymes possessing up to eight mutations at spatially adjacent positions in the exquisite active site were obtained by the screening of only a handful of variants. Without the assistance of computation, such extensive sculpting of the enzyme's active site would be either formidable by rational inspection or extremely labour intensive via experimental molecular evolution. The redesigned enzymes successfully permitted a wide range of aliphatic, aromatic and charged ncAAs to be prepared. This enzymatic route also highlights the power of the redesigned C-N lyase to precisely access the desired stereoisomer of the product with excellent regioselectivity. These non-natural biotransformations can be exceptionally efficient, even able to fulfil industrial requirements for substrate loading, product yield and profile, and scalability. Further advantages are that the redesigned C-N lyases were suitable for cascade reactions, enabling sequential chemoenzymatic transformations that produce other high-value-added products, such as β-lactam heterocycles.
As the computational redesign workflow is well established, we expect that this protein scaffold may be further exploited to tackle more challenging transformations, such as the exploration of unactivated alkenes without carboxyl groups, the anti-Markovnikov hydroamination of terminal alkenes to generate linear aliphatic amines and the synthesis of ncAAs with double carbon stereocentres.
In summary, we presented a versatile C-N lyase platform tuned by computation to directly conjugate a matrix of nucleophilic amines and unsaturated acids through asymmetric hydroamination. The results provide convincing support for the notion that computational tools that efficiently navigate large regions of sequence space to propose beneficial mutants hold promise for tackling the challenges in biocatalysis in general. We anticipate that further development of this effective biocatalyst platform may open up new opportunities to allow stepwise economic connections of structurally diverse building blocks through C-N links for the synthesis of ncAAs and their derivatives, not only providing uses in synthetic and medicinal chemistry but also laying molecular building blocks for future synthetic biology development.

Instruments. HPLC analysis was carried out with two sets of instruments: an Agilent set (1200 Series) and a Shimadzu set (LC-2030C). Absorbance was measured in mAU on the Agilent set or in μV on the Shimadzu set. Liquid chromatography-MS (ESI+) analysis was conducted on an Agilent set (1200 Series + G1946D).

Reagents
Computational protein redesign. An ensemble of different conformations of the substrate was generated by enumeration in Yasara. Substrate rotamers were sampled around the canonical minimum dihedral angles (60°, −60° and 180°) with 5° intervals over 35° around the minima (for example, 42.5° to 77.5°). The Rosetta Enzyme Design 51 application positions substrates optimally for catalysis by applying forces between the bound substrate and catalytically important groups in the enzyme. The substrate geometry corresponded to NAC criteria and was based on published QM/MM calculations 33 (Table 1). Rosetta Enzyme Design oriented the substrate optimally for deamination by applying these forces in silico.
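The rotamer sampling scheme described above can be written out as a small grid generator. This is a sketch under stated assumptions rather than the Yasara procedure itself: it assumes both window end points are included (as the 42.5° to 77.5° example suggests), and the function name and defaults are illustrative.

```python
def dihedral_grid(minima=(60.0, -60.0, 180.0), window=35.0, step=5.0):
    """Return sampled dihedral angles: a window centred on each minimum."""
    half = window / 2.0
    n_points = int(round(window / step)) + 1  # both end points included
    grid = []
    for centre in minima:
        start = centre - half
        grid.extend(start + i * step for i in range(n_points))
    return grid


angles = dihedral_grid()
print(angles[:8])   # [42.5, 47.5, 52.5, 57.5, 62.5, 67.5, 72.5, 77.5]
print(len(angles))  # 24
```

Angles around the 180° minimum run from 162.5° to 197.5° here and are not wrapped into the −180° to 180° range; a real workflow would apply whatever convention the modelling package expects.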
To generate an ensemble of different starting points for Rosetta Enzyme Design, 40 MD simulations of each complex structure were performed. The complex structures were positioned in a rectangular simulation cell with at least 7.5 Å between the protein and the periodic boundary of the simulation cell. The salt ions were positioned at electrostatically favourable positions by an algorithm implemented in Yasara. An energy minimization was carried out before the MD simulation. MD simulation was carried out in Yasara with a leapfrog algorithm, with a time-step of 1.33 femtoseconds and a Berendsen thermostat to preserve constant pressure and temperature. The LINCS and SETTLE algorithms were used to constrain hydrogen atoms. The simulations were carried out using the Yamber3 force field. The temperature was increased from 5 to 298 K over 3 ps, followed by equilibration (2 ps) and production (5 ps). For each complex structure, we ran Rosetta Enzyme Design ten times for each of the 40 starting points generated by MD simulations and 100 times for the initial complex structure, resulting in a total of 500 designs. Rosetta Enzyme Design uses a Monte Carlo algorithm in which it selects mutations and structural changes that decrease overall energy to generate 3D structures of designs. The following command line options 23 were used for Rosetta Enzyme Design: -enzdes:: -cst_predock -cst_design -detect_design_interface -cut1 0.0 -cut2 0.0 -cut3 8.0 -cut4 10.0 -cst_min -chi_min -bb_min -cst_opt -include_catres_in_interface_detection -packing:: -use_input_sc -soft_rep_design -extrachi_cutoff 1 -design_min_cycles 3 -ex1:level 4 -ex2:level 4 -ex1aro:level 4 -ex2aro:level 4.
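The total of 500 designs per complex follows directly from the run counts given above. As a quick check, using illustrative variable names:

```python
# Run counts taken from the text.
md_starting_points = 40          # MD-derived starting structures per complex
runs_per_md_start = 10           # design runs on each MD-derived structure
runs_on_initial_structure = 100  # design runs on the unsimulated complex

total_designs = md_starting_points * runs_per_md_start + runs_on_initial_structure
print(total_designs)  # 500
```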
The in silico predicted designs were subsequently screened with 5 ps MD simulations. The frequencies of snapshots conforming to the NAC criteria in MD simulations were calculated. Finally, designs were selected from the 500 obtained for experimental characterization on the basis of the following guidelines: (1) the overall energy should be negative, (2) the sum of the penalty energies for the above constraints should not exceed 20 REU and (3) the NAC frequencies should be larger than 50%. In addition, the active site must be organized and maintain the original β-carboxylate hydrogen-bonding network.
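The three numerical selection guidelines amount to a simple filter over the scored designs. The sketch below illustrates that filter with invented data: the `Design` record, its field names and all example values are hypothetical, and the final structural check on the active site is not modelled because it requires inspecting each structure.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    total_energy: float        # overall energy score, REU
    constraint_penalty: float  # summed constraint penalty, REU
    nac_frequency: float       # fraction of MD snapshots meeting NAC criteria


def passes_guidelines(d: Design) -> bool:
    """Apply guidelines (1)-(3) from the text."""
    return (
        d.total_energy < 0.0             # (1) negative overall energy
        and d.constraint_penalty <= 20.0  # (2) penalty not above 20 REU
        and d.nac_frequency > 0.5         # (3) NAC frequency above 50%
    )


# Made-up scores for illustration only.
designs = [
    Design("design_001", -512.3, 4.1, 0.82),
    Design("design_002", -498.7, 25.0, 0.91),  # fails guideline (2)
    Design("design_003", -505.0, 12.6, 0.37),  # fails guideline (3)
    Design("design_004", 15.2, 2.0, 0.75),     # fails guideline (1)
    Design("design_005", -520.9, 19.9, 0.64),
]

selected = sorted(
    (d for d in designs if passes_guidelines(d)),
    key=lambda d: d.nac_frequency,
    reverse=True,
)
print([d.name for d in selected])  # ['design_001', 'design_005']
```

Sorting by NAC frequency is one possible way to rank the survivors; the text does not say how the final ordering was made.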
Construction of AspB variants. For BA1-BA22 and BC1-BC16, plasmid pET-21a(+), containing the AspB sequence, was used as a template for QuikChange mutagenesis with Q5 PCR MasterMix (NEB). PCR was performed in 0.5 ml microcentrifuge tubes and the DpnI-treated PCR products were transformed into E. coli TOP10. Incorporation of the mutations was confirmed by DNA sequencing.