Structural characteristics of ShGdmF
To gain insight into the structural properties of the S. hygroscopicus amide synthase GdmF and its catalytic mechanism, we overexpressed full-length ShGdmF and purified the protein to homogeneity (supporting information, Figure S1a). We optimized the buffer conditions to obtain maximum stability of the protein using a thermal shift assay. This revealed an optimal pH of 7.5 and a positive stabilizing effect of D-glucose, glycerol, DTT, and DMSO, while the addition of salts, EDTA, or urea appeared to play a minor role in protein stability (supporting information, Figures S1b + S1c).
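The buffer screen above compares melting temperatures obtained from thermal shift curves. As a minimal sketch of one common way to read a melting temperature off such a curve, the Python below takes the temperature at which the fluorescence rises most steeply (maximum of dF/dT). The function names and the synthetic sigmoid data are illustrative assumptions; they are not the analysis procedure or the data of this study.

```python
import math


def melting_temperature(temps, fluorescence):
    """Return the temperature with the largest central-difference slope dF/dT."""
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equally long series with at least 3 points")
    best_temp, best_slope = None, -math.inf
    for i in range(1, len(temps) - 1):
        slope = (fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp


def synthetic_curve(tm, width=2.0, start=25, stop=95):
    """Hypothetical two-state unfolding curve (sigmoid) sampled every 1 degree C."""
    temps = [float(t) for t in range(start, stop + 1)]
    signal = [1.0 / (1.0 + math.exp((tm - t) / width)) for t in temps]
    return temps, signal


# Compare a reference buffer with a hypothetical stabilizing additive.
ref_t, ref_f = synthetic_curve(tm=55.0)
add_t, add_f = synthetic_curve(tm=58.0)
delta_tm = melting_temperature(add_t, add_f) - melting_temperature(ref_t, ref_f)
```

A positive `delta_tm` would correspond to a stabilizing additive in this simplified picture; real curves usually need baseline correction and smoothing before differentiation.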
Crystals of the ShGdmF apo-enzyme grew in rhombohedral form and diffracted to a resolution of 1.4 Å. The crystals belonged to the orthorhombic space group C222₁, and the structure was solved by molecular replacement using the homologous crystal structure of arylamine N-acetyltransferase 1 from M. loti (NAT1, pdb: 2bsz, sequence identity: 30%)13, with one monomer in the asymmetric unit (data collection and refinement statistics are given in the supporting information, Table S1). Despite the low sequence identity of < 36%, the refined ShGdmF structure shows high structural homology to prokaryotic and eukaryotic NATs, with a root mean square deviation (rmsd) of 1.5 Å for all backbone atoms compared to the structure of MlNAT1. The overall fold of ShGdmF largely resembles the common architecture of its NAT analogues, revealing a three-domain architecture with an N-terminal α-helical bundle (domain I, residues 1–88), a β-barrel (domain II, residues 89–185), and a C-terminal α/β-lid (domain III, residues 211–257), the latter being connected to domain II via an interdomain region (residues 186–210) (Fig. 2a). The α-helical bundle consists of five α-helices and a short β-strand between α2 and α3. The second domain forms a β-barrel with eight β-strands. Two short α-helices lead to the interdomain region, which appears to be mostly unstructured in ShGdmF. Finally, the α/β-lid comprises a β-sheet of four antiparallel β-strands and a C-terminal α-helix. As observed for NATs, the active site of ShGdmF is formed by a catalytic triad17 (Cys73-His111-Asp126 in ShGdmF) (Fig. 2b), which is embedded in a deep, primarily hydrophobic cleft21,25 (Fig. 2c). In NAT proteins, the arginine residue four amino acids upstream of the catalytic cysteine 73 was suggested to stabilize the active conformation of NATs by salt-bridge interactions with a conserved glutamate of the α3-helix17, with mutation R64Q in HsNAT2 leading to slow acetylation26.
ShGdmF has a histidine (His69) at this position, which shows no interactions with α3. This might reflect that ShGdmF binds a larger native substrate and that its catalytic mechanism involves a different conformation of the active site cleft. The previously discovered highly conserved motif (I/V)(P/A)FENLx, present in all NAT proteins27 and adjacent to the catalytic triad, is not found in ShGdmF and is replaced by a VPYDNST motif (Supporting information Figure S1d).
As has been previously observed for prokaryotic NATs, and in contrast to eukaryotic NATs25, the C-terminal α/β-lid terminates in an α-helix positioned away from the deep active site cleft. Thus, ShGdmF is shortened by 20 amino acids compared with MlNAT1 (Supporting Information: Figure S2a). The low crystallographic B-factors of the β-barrel and α/β-lid indicate fairly rigid domains with average values of 24 Å². The catalytic triad (Cys73-His111-Asp126 in ShGdmF) (Fig. 2b) is highly conserved with an rmsd of 0.04 Å compared to the catalytic domain of MlNAT1 and is buried deep in the large active site cleft between domains II and III. Consistent with other prokaryotic and eukaryotic NATs, the catalytic triad is surrounded predominantly by aliphatic and aromatic amino acids.25 However, differences in the amino acid composition and the sidechain positions, particularly in residues 72 and 74, are found in close proximity to the catalytically important cysteine 73 (Fig. 2c). As a result, the active site cleft of ShGdmF is enlarged two- to threefold, with a volume of approximately 700 Å³ (Fig. 2d and Supporting Information, Figures S2a and S2b).
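The rmsd values quoted in this section compare equivalent atoms of two structures. As a minimal sketch of the underlying arithmetic, the Python below computes the root mean square deviation between two coordinate lists that are assumed to be already superposed and matched atom by atom; the superposition step itself (for example a least-squares fit) is not shown. The function name and the example coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation (same length unit as the input, e.g. angstrom).

    Assumes both lists hold (x, y, z) tuples for equivalent atoms in the same
    order and that the two structures were superposed beforehand.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Hypothetical three-atom example: every atom displaced by 1 angstrom along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
value = rmsd(reference, shifted)
```

Because the deviation is averaged over the selected atoms, a small, well-conserved selection such as a catalytic triad can give a much lower value than the full backbone of the same pair of structures.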
Despite the high resolution of the crystallographic data, no electron density is apparent for a major part of the interdomain region (residues 194–206), suggesting a highly flexible, mostly unstructured loop. In NAT structures, the interdomain region has a largely α-helical structure and closes the deep active site cleft and the catalytic triad. This altered loop region in ShGdmF appears to be a unique feature of amide synthases and contributes significantly to the widened active site cleft. The structure therefore reveals a significantly altered substrate-binding region compared to existing homology models of GdmF, which assumed identical active sites in all NATs. At the same time, these results support previous predictions that suggested an unstructured interdomain loop in the related amide synthase AmRifF.19 Given that the native seco-substrate of ShGdmF is much larger, the wider active site cleft and flexible interdomain region may play a critical role in substrate binding and/or catalysis during macrolactamization. In addition to these structural studies, we carried out functional investigations, for which suitable substrates had to be synthesized.
Synthesis of substrate analogues
Synthesis of substrate fragments and seco-progeldanamycin SNAC-thioester 27b and ethylthioester 29
A series of smaller fragments of the seco-acid of geldanamycin, representing both the amino moiety and the carboxyl terminus activated either as N-acetylcysteamine or as pantetheine thioesters, was prepared (Fig. 3A and B). In addition, we carried out a total synthesis of a thioactivated seco-progeldanamycin derivative. We chose 8-demethyl-seco-progeldanamycin 27b because it differs only slightly in structure from 1 but is likely more accessible synthetically. Importantly, the loss of the methyl group eliminates the 1,3-allylic strain in the C8-C10 region, resulting in greater conformational flexibility compared to seco-progeldanamycin (1) (Fig. 3C).
The SNAC esters 11a-c were prepared by amide coupling from N-acetylcysteamine and the corresponding carboxylic acids and, quite analogously, the pantetheine thioesters 13a-c were obtained using the protected pantetheine dimethyl ketal 12 as the starting point.28 In this case, the protecting group was cleaved using InCl3 under mild conditions.29
Briefly, we dissected the target molecule into two major fragments, 19 and 24, which were to be joined by cross-olefin metathesis. The synthesis of the aromatic moiety 19 commenced with the known benzyl alcohol 14,30 which was converted to the oxazolidinone 15 in three steps utilizing the Evans alkylation protocol. From there, standard transformations that included a Wittig olefination step led to the ethyl ester 16, which was subsequently converted to the epoxy alcohol 17, using the Sharpless epoxidation to diastereoselectively introduce two stereogenic centers. DIBAL-H reduction was used to regioselectively open the oxirane ring, followed by a sequence of O-silylation, O-methylation, and desilylation without cleavage of the silyl protection at the phenol group. The resulting primary alcohol 18 was converted to the corresponding aldehyde using the Dess-Martin reagent, and the aldehyde was subsequently subjected to a diastereocontrolled Roush crotylation, yielding the desired diastereomer 19 as the major product (d.r. = 10:1).
The synthesis of the second fragment 24 utilized L-glutamic acid as the chiral starting building block, which was transformed into the γ-lactone 22. From there, the aldehyde 23 was generated via a sequence of standard steps and this was subjected to a diastereocontrolled vinylation to afford the desired alkene 24 (syn:anti = 3:1). The absolute configuration of the newly formed stereogenic center for the main diastereomer was determined by Mosher ester analysis (see supplemental information).31
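The selectivities reported for the crotylation (d.r. = 10:1) and the vinylation (syn:anti = 3:1) can be converted into the share of the major diastereomer with simple arithmetic. The short Python sketch below does this conversion; the function names are illustrative, and the calculation assumes that only the two listed diastereomers are formed.

```python
def major_fraction(major, minor):
    """Fraction of the major diastereomer for a ratio major:minor."""
    return major / (major + minor)


def diastereomeric_excess(major, minor):
    """Diastereomeric excess (as a fraction) for a ratio major:minor."""
    return (major - minor) / (major + minor)


crotylation_major = major_fraction(10, 1)   # d.r. = 10:1
vinylation_major = major_fraction(3, 1)     # syn:anti = 3:1
crotylation_de = diastereomeric_excess(10, 1)
```

Under this assumption, a 10:1 ratio corresponds to roughly 91% of the major diastereomer and a 3:1 ratio to 75%.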
With both building blocks in hand, the two alkenes 19 and 24 were coupled by cross-metathesis using the Hoveyda-Grubbs Ru-complex as a precatalyst to yield product 25. This reaction preferentially afforded the (E)-configured alkene in moderate yield. Product 25 was further processed in four steps, which included protecting group and functional group manipulations, to give the lactol 26. The lactol was then subjected to a Wittig olefination protocol with P-ylide 28a, which already carries the SNAC ester, to furnish seco-acid derivative 27a. This was finally deprotected to afford the SNAC ester 27b in good yield. In addition, the cross-metathesis product 25 was further processed to give the ethylthioester-containing seco-acid derivative 29.
Macrolactamization of seco-progeldanamycin derivative 27 with amide synthase ShGdmF
Subsequently, ShGdmF was incubated with SNAC ester 27b. Small amounts of progeldanamycin derivative 30 were formed alongside the main product, the hydrolyzed seco-acid 31 (Fig. 3D). The formation of 30 was evidenced by HRMS/MS and by comparison with the cyclization product obtained in parallel by silver nitrate-promoted macrolactamization32 of ethylthioester 29 followed by desilylation. Importantly, under the same incubation conditions but in the absence of ShGdmF, the macrocyclization product could not be detected; only the hydrolysis product 31 was formed.
These results demonstrate that ShGdmF is able to accept and cyclize SNAC esters of seco-progeldanamycin, albeit with a low degree of conversion and with hydrolysis predominating. This can be rationalized by the fact that SNAC thioesters are only simplified models of the larger pantetheine thioesters or CoA esters.
ShGdmF binds truncated co-substrates
While most known NATs utilize coenzyme A (CoA) as the acetyl-carrying co-substrate, the activated form of seco-substrates for amide synthases is not yet known. Based on the size of CoA and seco-ketide substrates, as well as the low sequence conservation of the active site cleft (Supporting Information, Figure S1d), it was hypothesized that, before macrolactamization occurs, seco-substrates equipped with shorter co-substrates would be transported to the binding site.20,23 Thus, ShGdmF was crystallized in the presence of thioesters 11a-c and 13a-c and m-aminobenzoic acids 31 and 32, and the substrate-bound crystal structures of ShGdmF were solved to resolutions ranging from 1.28 to 1.82 Å (statistics for data collection and refinement are given in Supporting Information Table 1). The substrate-bound protein structures agree well with the apo structure of ShGdmF, with rmsd values between 0.05 and 0.137 Å for all backbone atoms, indicating only minor spatial changes within the subdomains upon substrate binding. To our surprise, we could not find electron density for the different thioester groups in any of these structures; instead, we found density for the cleaved SNAC or pantetheine groups (Figs. 4a and 4b). The SNAC and pantetheine groups are deeply embedded in the wide active site cleft, with the sulfur atom of the sulfhydryl group only about 2 Å from the reactive cysteine Cys73. ShGdmF therefore appears to be capable of directly binding SNAC and pantetheine thioesters and catalyzing the first step of macrolactamization, i.e., nucleophilic attack on the activated ester carbon by the thiol group of Cys73 and thus cleavage of the thioester in the absence of arylamines. Accordingly, the absence of thioester units in the structures could be rationalized by assuming undesired hydrolysis of thioesters, as the large active site allows easy access of water molecules.
We hypothesize that the unstructured interdomain region could transform into a helical structure in the presence of the long ketide chain of native seco-substrate 1, which then protects the active site from the surrounding water.
The bound SNAC and pantetheine co-substrates show conserved binding positions within the active site aligned with the β-strands of domains II and III (Fig. 4b). Two stabilizing interactions of the sulfur atom in the thioesters are observed with Tyr37 and His111. Although the orientation of the SNAC units within the hydrophobic active site cleft is the same for the different thioester derivatives, two alternative conformations of the acetamido group were observed, one of which forms an additional hydrogen bond to Gly110. This flexibility reflects the predominantly hydrophobic nature of the interactions with aliphatic and aromatic residues in the active site (Val72, Tyr74, Val98, Phe129, Pro130, and Phe210), and is consistent with mostly hydrophobic amino acids surrounding the catalytic triad in known NATs25.
The pantetheine co-substrate forms additional hydrogen bonds with backbone atoms of the putative P-loop (residue Gly131) and the β1/β2 turn (residues Arg99 and Ala101), as well as various van der Waals contacts to hydrophobic regions in the active site cleft (Val72, Tyr74, Val98, Gln100, Phe129, Pro130, Phe210, and Ile223). In conclusion, the binding position of the SNAC and pantetheine co-substrates suggests that the polyketide chain of the native seco-substrate binds either to the groove between the Tyr74 and Glu110 residues or towards the solvent and the unresolved interdomain region.
The diffraction data collected for ShGdmF crystallized in the presence of acetyl-CoA showed no electron density for acetyl-CoA in the active site, supporting the view that acetyl-CoA 5 does not act as a natural co-substrate for seco-progeldanamycin in ShGdmF. These results are consistent with reports by Sinclair et al.17 and Sim et al.33, who postulated that amide synthases do not use acetyl-CoA 5 as a co-substrate for their corresponding seco-ketides, since critical amino acids that are highly conserved in NATs and required to bind CoA are not present in amide synthases.33 These critical residues comprise the phosphate-binding P-loop in NATs. In GdmF, a putative P-loop can also be identified, starting with Gly131 (GPSY); however, we found that the pantetheine co-substrate interacts with the P-loop directly via Gly131. When the crystal structure of ShGdmF complexed with pantetheine is compared to the CoA-bound MlNAT1 structure (pdb: 4nv7)34, both the P-loop and the C-terminal β-sheet are shifted by 3 to 4 Å, thereby blocking the crystallographically determined binding site for the diphosphates and 3'-phospho-adenosine of CoA (Supporting Information, Figure S2c). In addition, molecular dynamics simulations show a significant decrease in the flexibility of the P-loop in ShGdmF compared with simulations of MlNAT1 (Supporting Information, Figure S2d). However, coenzyme A (CoA) has previously been found to bind to NATs in various orientations.21,25 This was evidenced by the crystal structure of human NAT2, which revealed binding of CoA in a bent conformation to a deep groove formed by the α-helical interdomain region of NAT2 and the β-barrel subdomain.25 Given the missing electron density for CoA and the interdomain region in our data sets, we cannot completely rule out the possibility of CoA binding to the interdomain region of ShGdmF. Nevertheless, the data suggest a preference of ShGdmF for co-substrates shorter than CoA when mediating binding of the polyketide substrate.
To elucidate the binding modes of the acyl acceptors during formation of the macrolactam, we solved the crystal structures of ShGdmF complexed with 3-aminophenol (32) and 3-amino-5-methylphenol (33), respectively, to a resolution of 1.26 Å (statistics for data acquisition and refinement are given in the supplemental material, Table 1). The hydrophobic binding site of the aminophenols in the active site cleft overlaps with the binding site of the co-substrates SNAC and pantetheine (Fig. 3a-d). The overlap of the acyl donor and acyl acceptor binding sites suggests that ShGdmF follows a sequential catalytic mechanism similar to that described for NATs35–37: (1) initial binding of the activated acyl substrate (in the form of a CoA ester for NATs) and acylation of the catalytic cysteine residue, followed by (2) displacement of the co-substrate by an arylamine acyl acceptor and transfer of the acyl group onto the arylamine to form the product amide. The crystal structures indicate some flexibility of aminophenols 32 and 33 when bound to ShGdmF, as the two acceptor molecules were found to adopt slightly different binding orientations (Fig. 4c-d). Notably, the binding position is consistent with the crystallographically determined positions of the aromatic antituberculosis agent isoniazid and the antihypertensive agent hydralazine in prokaryotic NAT crystal structures.38,39
ShGdmF shows broad substrate specificity
ShGdmF shows broad substrate specificity and, in contrast to NATs, accepts acyl substrates with truncated co-substrates, according to our steady-state enzyme assays with aminophenols 32 and 33 and thioesters 11a-c or 13a-c using Ellman's reagent (5,5′-dithiobis-2-nitrobenzoic acid, DTNB).41,42 The enzyme is able to catalyze amide formation from SNAC and pantetheine thioesters directly in vitro, in the absence of the cellular environment (Fig. 4e + 4f). The KM values of ShGdmF for substrate conversion range from submillimolar to millimolar, with turnover numbers (kcat) of 0.001 to 0.02 s⁻¹. The determined kcat/KM values indicate that ShGdmF exhibits the highest activity with SNAC thioester 13b for amide coupling of both aminophenols 32 and 33 (supplemental Table 2). For the SNAC thioesters, an increase in kcat/KM with increasing chain length was observed, whereas for the pantetheine thioesters, catalytic efficiency decreased with longer acyl substrates. In terms of kcat values, the activity of ShGdmF is similar to that of the related MsNAT.
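To make the relationship between a DTNB readout and the kinetic parameters above concrete, the following Python sketch converts an absorbance slope into a thiol release rate via the Beer-Lambert law and evaluates the Michaelis-Menten equation. The extinction coefficient (14,150 M⁻¹ cm⁻¹, a commonly cited value for the TNB anion at 412 nm), the 1 cm path length, the function names, and all example numbers are assumptions for illustration; none of them is taken from the assay conditions of this study.

```python
TNB_EXTINCTION = 14150.0  # M^-1 cm^-1 at 412 nm; assumed literature value


def thiol_release_rate(abs_slope_per_s, path_cm=1.0, extinction=TNB_EXTINCTION):
    """Beer-Lambert conversion of dA/dt (1/s) into a rate in mol/L per second."""
    return abs_slope_per_s / (extinction * path_cm)


def michaelis_menten_rate(substrate, kcat, km, enzyme):
    """Initial rate v = kcat * [E] * [S] / (KM + [S]); concentrations in mol/L."""
    return kcat * enzyme * substrate / (km + substrate)


def catalytic_efficiency(kcat, km):
    """kcat/KM in M^-1 s^-1."""
    return kcat / km


# Hypothetical example: kcat = 0.02 s^-1, KM = 1 mM, 1 uM enzyme, [S] = KM.
rate_at_km = michaelis_menten_rate(substrate=1e-3, kcat=0.02, km=1e-3, enzyme=1e-6)
efficiency = catalytic_efficiency(kcat=0.02, km=1e-3)
```

At a substrate concentration equal to KM the rate is half of kcat·[E], which is why KM values in the millimolar range imply that high thioester concentrations are needed to approach the maximal rate.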
The interdomain region might assist substrate binding
Interpretable electron density for the interdomain region could be found neither in the high-resolution substrate-bound crystal structures nor in the apo structure of ShGdmF, indicating a highly flexible substructure. We computationally modelled the missing 13 amino acids into the apo ShGdmF structure and performed all-atom molecular dynamics simulations in explicit water to illuminate the potential function of the interdomain region. During the simulations, the interdomain region shifted by 15 Å toward the β-barrel, resulting in attractive and specific interactions between the two regions and thereby closing the large active site cleft (Figs. 5a and 5b). Accordingly, the interdomain region appears to act as a dynamic lid that shields the catalytic triad from the surrounding solvent and transforms the active site cleft into a substrate-binding tunnel. Closure of the cleft is mediated primarily by the formation of a salt bridge between Asp194 and Arg109 and a hydrogen bond between Gln100 and Gln201, as well as by a series of hydrophobic interactions of nonpolar residues. In this conformation, the P-loop binds to the interdomain region, thereby affecting the active site. However, it remains unresolved whether the predicted refolding of the interdomain region in ShGdmF is a consequence of the empty cleft in the active site or alternatively represents a gating function of the interdomain region.
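Contacts such as the Asp194-Arg109 salt bridge are typically followed along a trajectory by monitoring the distance between the charged groups. As a minimal sketch, the Python below computes the fraction of frames in which two atoms lie within a distance cutoff. The 4 Å cutoff, the choice of one representative atom per residue, the function name, and the example coordinates are all assumptions for illustration and do not reproduce the analysis used for the simulations described here.

```python
import math


def contact_fraction(positions_a, positions_b, cutoff=4.0):
    """Fraction of frames in which atom A and atom B are within `cutoff` angstrom.

    positions_a / positions_b: per-frame (x, y, z) coordinates of one atom each,
    e.g. a carboxylate oxygen and a guanidinium nitrogen (assumed selection).
    """
    if len(positions_a) != len(positions_b) or not positions_a:
        raise ValueError("need the same, non-zero number of frames for both atoms")
    formed = sum(
        1 for a, b in zip(positions_a, positions_b) if math.dist(a, b) <= cutoff
    )
    return formed / len(positions_a)


# Hypothetical four-frame trajectory: contact formed in frames 1 and 3 only.
asp_oxygen = [(0.0, 0.0, 0.0)] * 4
arg_nitrogen = [(3.0, 0.0, 0.0), (10.0, 0.0, 0.0), (0.0, 3.5, 0.0), (0.0, 0.0, 15.0)]
occupancy = contact_fraction(asp_oxygen, arg_nitrogen)
```

An occupancy close to 1 over the production run would indicate a persistent contact, whereas intermediate values would point to a lid that opens and closes repeatedly.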
We therefore covalently docked seco-progeldanamycin (1, bound as a thioester to Cys73), resulting in a predicted binding mode of 1 in a pocket between the interdomain region and the β-barrel. In this conformation, the substrate does not occupy the crystallographically determined co-substrate binding site but does allow simultaneous binding of the co-substrate and the acyl acceptor. Subsequent all-atom molecular dynamics simulations showed that the interdomain region does not move toward the β-barrel in the presence of the substrate seco-progeldanamycin. Instead, in contrast to the apo simulations described above, the interdomain region appears to undergo a conformational change and forms a helical structure (Fig. 5c). This conformational change leads to a rearrangement of the binding position of the substrate (Fig. 5d). seco-Progeldanamycin thus interacts with the restructured interdomain region (residues Tyr199 to Ala206), the α-helical bundle (residues Tyr37 to Leu46), and the β-barrel (Val72, Tyr74, Glu110, and His111). The aminophenol moiety occupies a distant site that allows nucleophilic attack on the thioester carbon. These results suggest the possible presence of a novel conformation of the interdomain region required for the catalysis of amide synthases and the formation of macrolactams.
To further test the assumption that CoA esters are not substrates for ShGdmF, we developed a microarray-based targeted displacement assay using fluorescently labeled geldanamycin-FITC. However, this approach did not provide further information on this question (details on how the assay was performed and the results obtained can be found in the supplemental information).