Microbially guided discovery and biosynthesis of biologically active natural products

The design of small molecules that inhibit disease-relevant proteins represents a longstanding challenge of medicinal chemistry. Here, we describe an approach for encoding this challenge (the inhibition of a human drug target) into a microbial host and using it to guide the discovery and biosynthesis of targeted, biologically active natural products. This approach identified two previously unknown terpenoid inhibitors of protein tyrosine phosphatase 1B (PTP1B), an elusive therapeutic target for the treatment of diabetes and cancer. Both inhibitors target an allosteric site, which confers unusual selectivity, and can inhibit PTP1B in living cells. A screen of 24 uncharacterized terpene synthases from a pool of 4,464 genes uncovered additional hits, demonstrating a scalable discovery approach, and the incorporation of different PTPs into the microbial host yielded alternative PTP-specific detection systems. Our findings illustrate the potential for using microbes to discover and build natural products that exhibit precisely defined biochemical activities yet possess unanticipated structures and/or binding sites.


Introduction
Despite advances in structural biology and computational chemistry, the design of small molecules that bind tightly and selectively to disease-relevant proteins remains exceptionally difficult 1. The free energetic contributions of rearrangements in the molecules of water that solvate binding partners and structural changes in the binding partners themselves are particularly challenging to predict and, thus, to incorporate into molecular design 2,3. Drug development, as a result, often begins with screens of large compound libraries 4.
Nature has endowed living systems with the catalytic machinery to build an enormous variety of biologically active molecules: a diverse natural library 5. These molecules evolved to carry out important metabolic and ecological functions (e.g., the phytochemical recruitment of predators of herbivorous insects 6) but often also exhibit useful medicinal properties. Over the years, screens of environmental extracts and natural product libraries, augmented on occasion with combinatorial (bio)chemistry 7-9, have uncovered a diverse set of therapeutics, from aspirin to paclitaxel 10. Unfortunately, these screens tend to be resource intensive 11, limited by low natural titers 12, and largely subject to serendipity 13.
Bioinformatic tools, in turn, have permitted the identification of biosynthetic gene clusters 14,15, where colocalized resistance genes can reveal the biochemical function of their products 16,17. The therapeutic applications of many natural products, however, differ from their native functions 18, and many biosynthetic pathways can, when appropriately reconfigured, produce entirely new and, perhaps, more effective therapeutic molecules 19,20. Methods for efficiently identifying and building natural products that inhibit specific disease-relevant proteins remain largely undeveloped.
Protein tyrosine phosphatases (PTPs) are an important class of drug targets that could benefit from new approaches to inhibitor discovery. These enzymes catalyze the hydrolytic dephosphorylation of tyrosine residues and, together with protein tyrosine kinases (PTKs), contribute to an enormous number of diseases (e.g., cancer, autoimmune disorders, and heart disease) 21,22. The last several decades have witnessed the construction of many potent inhibitors of PTKs, which are targets for over 30 approved drugs 23. Therapeutic inhibitors of PTPs, by contrast, have proven difficult to develop. These enzymes possess well conserved, positively charged active sites that make them difficult to inhibit with selective, membrane-permeable molecules 24; they lack targeted therapeutics of any kind.
In this study, we describe an approach for using microbial systems to find natural products that inhibit difficult-to-drug proteins. We focused on protein tyrosine phosphatase 1B (PTP1B), a therapeutic target for the treatment of type 2 diabetes, obesity, and HER2-positive breast cancer 25. PTP1B possesses structural characteristics that are generally representative of the PTP family 26 and regulates a diverse set of physiological processes (e.g., energy expenditure 27, inflammation 28, and neural specification in embryonic stem cells 29). In brief, we assembled a strain of Escherichia coli with two genetic modules: (i) one that links cell survival to the inhibition of PTP1B and (ii) one that enables the biosynthesis of structurally varied terpenoids. In a study of five well-characterized terpene synthases, this strain identified two previously unknown terpenoid inhibitors of PTP1B. Both inhibitors were selective for PTP1B, exhibited distinct binding mechanisms, and increased insulin receptor phosphorylation in mammalian cells. A screen of 24 uncharacterized terpene synthases from eight phylogenetically diverse clades uncovered additional hits, demonstrating a scalable approach for finding inhibitor-synthesizing genes. A simple exchange of PTP genes, in turn, permitted the facile extension of our genetically encoded detection system to new targets. Our findings illustrate a versatile approach for using microbial systems to find targeted, readily synthesizable inhibitors of disease-relevant enzymes.

Results
Development of a genetically encoded objective
E. coli is a versatile platform for building natural products from unculturable or low-yielding organisms 30,31. We hypothesized that a strain of E. coli programmed to detect the inactivation of PTP1B (i.e., a genetically encoded objective) might enable the discovery of natural products that inhibit it (i.e., molecular solutions to the objective). To program such a strain, we assembled a bacterial two-hybrid (B2H) system in which PTP1B and Src kinase control gene expression (Fig. 1a). In this system, Src phosphorylates a substrate domain, enabling a protein-protein interaction that activates transcription of a gene of interest (GOI). PTP1B dephosphorylates the substrate domain, preventing that interaction, and the inactivation of PTP1B re-enables it. E. coli is a particularly good host for this detection system because its proteome is sufficiently orthogonal to the proteome of H. sapiens to minimize off-target growth defects that can result from the regulatory activities of Src and PTP1B (SI Note 1) 32.
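The detection logic described above can be summarized as a small truth table. The following is a minimal sketch under a strong simplifying assumption: every component is treated as all-or-none, whereas the real system produces a graded transcriptional response. The function names are hypothetical and serve only as illustration.

```python
# Idealized Boolean model of the B2H detection logic described in the text.
# All names are hypothetical; real expression levels are graded, not binary.

def ptp1b_is_active(ptp1b_present: bool, inhibitor_present: bool) -> bool:
    """PTP1B acts on the substrate only if it is expressed and not inhibited."""
    return ptp1b_present and not inhibitor_present


def goi_is_expressed(src_present: bool, ptp1b_present: bool,
                     inhibitor_present: bool) -> bool:
    """Return True when the substrate domain stays phosphorylated.

    Src phosphorylates the substrate domain; active PTP1B removes the
    phosphate and so blocks the interaction that activates transcription.
    """
    return src_present and not ptp1b_is_active(ptp1b_present,
                                               inhibitor_present)


if __name__ == "__main__":
    print("Src    PTP1B  inhibitor -> GOI expressed")
    for src in (False, True):
        for ptp in (False, True):
            for inh in (False, True):
                print(src, ptp, inh, "->", goi_is_expressed(src, ptp, inh))
```

In this idealized picture, a strain carrying both Src and PTP1B expresses the GOI only when an inhibitor is present, which is the condition that the growth-coupled assay selects for.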
We carried out B2H development in several steps. To begin, we assembled a luminescent "base" system in which Src modulates the binding of a substrate domain to an Src homology 2 (SH2) domain (Fig. 1b); this system, which includes a chaperone that helps Src to fold (Cdc37) 33, is similar to other B2H designs that detect protein-protein binding 34. Unfortunately, our initial system did not yield a phosphorylation-dependent transcriptional response, so we complemented it with inducible plasmids, each harboring a different system component, to identify proteins with suboptimal expression levels (Fig. 1b). Interestingly, secondary induction of Src increased luminescence, an indication that insufficient substrate phosphorylation and/or weak substrate-SH2 binding depressed GOI expression in our base system. We modified this system by swapping in different substrate domains, by adding mutations to the SH2 domain that enhance its affinity for phosphopeptides 35, and by removing the gene for Src, a modification that allowed us to control its expression exclusively from a second plasmid. With this configuration, induction of Src increased luminescence most prominently for the MidT substrate (Fig. 1c), and simultaneous induction of both Src and PTP1B prevented that increase, an indication of intracellular PTP1B activity (Fig. 1d). We finalized the MidT system by incorporating genes for PTP1B and Src, by adjusting promoters and ribosome binding sites to amplify its transcriptional response further (Fig. 1d; [1][2]), and by adding a gene for spectinomycin resistance (SpecR) as the GOI. The final plasmid-borne detection system required the inactivation of PTP1B to permit growth at high concentrations of antibiotic (Fig. 1e).

Biosynthesis of PTP1B inhibitors
To search for inhibitors of PTP1B that bind outside of its active site, we coupled the B2H system with metabolic pathways for terpenoids, a structurally diverse class of secondary metabolites with largely nonpolar structures (Fig. 2a). Terpenoids include over 80,000 known compounds and represent nearly one-third of all characterized natural products 36 (the basis of approximately 50% of clinically approved drugs 37). To begin, we focused on a handful of structurally diverse terpenoids without established inhibitory effects (Fig. 2b): amorphadiene, γ-humulene, α-bisabolene, abietadiene, and taxadiene. Each terpenoid pathway consisted of two plasmid-borne modules: (i) the mevalonate-dependent isoprenoid pathway from S. cerevisiae (optimized for expression in E. coli 38) and (ii) a terpene synthase supplemented, when necessary for diterpenoid production, with a geranylgeranyl diphosphate synthase. These modules generated terpenoids at titers of 0.3-18 mg/L in E. coli (Supplementary Fig. 3a).
We screened each pathway for its ability to produce inhibitors of PTP1B by transforming E. coli with plasmids harboring both the pathway of interest and the B2H system (Fig. 2c). To our surprise, pathways for amorphadiene and α-bisabolene permitted survival at high concentrations of antibiotic. Critically, GC-MS traces confirmed that all pathways generated terpenoids in the presence of the B2H system (Fig. 2d, Supplementary Fig. 3b,c), and maximal resistance of the amorphadiene- and α-bisabolene-producing strains required both an active terpene synthase and a functional B2H system (Supplementary Fig. 3d).
We confirmed the inhibitory effects of purified terpenoids by examining their influence on PTP1B-catalyzed hydrolysis of p-nitrophenyl phosphate (pNPP; Fig. 2e, Supplementary Table 13). The IC50 values for amorphadiene and α-bisabolene were 53 ± 8 μM and 13 ± 2 μM, respectively, in 10% DMSO (Fig. 2f). These IC50 values are surprisingly strong for unfunctionalized hydrocarbons (i.e., they are similar to the IC50 values of inhibitors that form hydrogen bonds and other stabilizing interactions with PTP1B 21,39) and, importantly, resemble terpenoid concentrations in liquid culture (Fig. 2g), a finding consistent with in vivo inhibition. We note that our estimates of potency and concentration are probably conservative. DMSO, which we used as a cosolvent for kinetic assays, increases the solubility of nonpolar inhibitors but tends to reduce their apparent potency (likely by increasing the free energetic cost of desolvation 40), and terpenoids tend to accumulate intracellularly, where their concentrations can exceed extracellular levels by an order of magnitude 41. Our growth-coupled assays, kinetic assays, and production measurements, taken together, indicate that amorphadiene and α-bisabolene activate the B2H system by inhibiting PTP1B inside the cell.
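To make the comparison between IC50 values and terpenoid concentrations concrete, the sketch below estimates the fraction of PTP1B activity that remains at a given inhibitor concentration. It assumes a simple dose-response curve with a Hill slope of 1; the source does not state the model used to fit the kinetic data, so this form and the function name are illustrative assumptions only.

```python
# Fractional PTP1B activity under a simple dose-response model.
# Assumption (not from the source): v/v0 = 1 / (1 + ([I]/IC50)**h), with h = 1.

def fractional_activity(inhibitor_um: float, ic50_um: float,
                        hill_slope: float = 1.0) -> float:
    """Fraction of uninhibited activity remaining at a given inhibitor level."""
    if inhibitor_um < 0 or ic50_um <= 0:
        raise ValueError("concentration must be >= 0 and IC50 must be > 0")
    return 1.0 / (1.0 + (inhibitor_um / ic50_um) ** hill_slope)


# IC50 values reported in the text (measured in 10% DMSO).
IC50_UM = {"amorphadiene": 53.0, "alpha-bisabolene": 13.0}

if __name__ == "__main__":
    for name, ic50 in IC50_UM.items():
        for conc in (ic50 / 10, ic50, ic50 * 10):
            remaining = fractional_activity(conc, ic50)
            print(f"{name}: {conc:7.1f} uM -> {remaining:.2f} of full activity")
```

Under this model, a concentration equal to the IC50 leaves half of the activity, and a ten-fold higher concentration (the order of magnitude cited for intracellular accumulation) leaves roughly 9%.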
A scalable approach to molecular discovery
Our microbial strain provides a powerful tool for screening genes for their ability to generate novel PTP1B inhibitors. Most terpenoids, as a case study, are not commercially available, and even when their metabolic pathways are known, their biosynthesis, purification, and in vitro analysis form a resource-intensive process that is difficult to parallelize with existing methods 42. Our B2H system offers a potential solution: It can identify inhibitor-synthesizing genes with a simple growth-coupled assay. We explored its application to discovery efforts by using it to screen a diverse set of uncharacterized biosynthetic genes.
In brief, we carried out a bioinformatic analysis of the largest terpene synthase family (PF03936) by building and annotating a cladogram of its 4,464 constituent members (Supplementary Figs. 4, 5); from here, we synthesized three uncharacterized genes from each of eight clades: six with no characterized genes and two with some characterized genes (Fig. 3a). We reasoned that these 24 phylogenetically diverse genes might encode enzymes with different product profiles.
Guided by our initial screen, we searched for sesquiterpene inhibitors by pairing each of the uncharacterized genes with the FPP pathway. To our surprise, six genes conferred a significant survival advantage (Fig. 3b), and maximal resistance required an active B2H system (Supplementary Fig. 6). Each hit generated a distinct product profile (Supplementary Fig. 7); we focused our analysis on A0A0C9VSL7, which produced (+)-1(10),4-cadinadiene as its major product (Fig. 3c, d). This terpenoid is a structural analog of amorphadiene but has a weaker potency (IC50 = 165 ± 33 μM); a titer of 33 ± 18 μM suggests that intracellular accumulation may allow it to inhibit PTP1B inside the cell. Our ability to detect a weak inhibitor suggests that the B2H system can capture a broad set of scaffolds in molecular discovery efforts. The purification and analysis of additional hits, the incorporation of isoprenoid substrates of different sizes (e.g., GGPP), and the inclusion of more uncharacterized genes could expand the scope of such efforts.

Biophysical analysis of PTP1B inhibitors
Allosteric inhibitors of PTPs are valuable starting points for drug development. These molecules bind outside of the well conserved, positively charged active sites of PTPs and tend to have improved selectivities and membrane permeabilities over substrate analogs 21. Motivated by these considerations, an early screen identified a benzbromarone derivative that inhibited PTP1B weakly (IC50 = 350 μM) without competing with substrates; subsequent optimization of this compound led to two improved inhibitors (IC50 values of 8 and 22 μM) that bind to an allosteric site 39 (Fig. 4a). In the 15 years since, efforts to find new inhibitors that bind to this site, or to other allosteric regions on the catalytic domain, have been largely unsuccessful 43. Benzbromarone derivatives are the only allosteric inhibitors with crystallographically verified binding sites. (An allosteric inhibitor that binds to a disordered region of the full-length protein, however, has been characterized with NMR 25.) New approaches for finding allosteric inhibitors are clearly needed.
Our microbial system could grant access to new compounds that bind in unexpected ways. Amorphadiene and α-bisabolene provide examples. They are highly nonpolar and, thus, incapable of engaging in the hydrogen bonds and electrostatic interactions on which most other PTP inhibitors rely 21,39. To examine their binding mechanisms in detail, we collected X-ray crystal structures of PTP1B bound to amorphadiene and α-bisabolol, a soluble analogue of α-bisabolene that can be soaked into protein crystals. Intriguingly, both molecules bind to the allosteric site targeted by benzbromarone derivatives (Fig. 4a). Amorphadiene, however, causes the α7 helix to reorganize to create a hydrophobic cleft (Fig. 4b); this type of reorganization is interesting because it is typically slow (micro- to millisecond) 44 and difficult to incorporate into computational ligand design 45. By contrast, α-bisabolol wraps around F280 with an orientation orthogonal to that of the crystallized benzbromarone derivatives (Fig. 4c). We note that α-bisabolol is a ~20-fold weaker inhibitor than α-bisabolene and may adopt a different bound pose (Supplementary Fig. 9m); unfortunately, the low solubility of α-bisabolene hindered direct structural studies.
Our crystal structures suggest that both terpenoids adopt multiple bound conformations (i.e., the electron density indicates regions of disorder; Supplementary Fig. 8a,b). Molecular dynamics simulations provide additional support for this behavior (Supplementary Fig. 8 c-4e), which, we note, is difficult to incorporate into computational screens 46. To probe the binding of amorphadiene and α-bisabolene further, we carried out several additional analyses. First, we examined the inhibition of PTP1B by dihydroartemisinic acid. This structural analogue of amorphadiene has a carboxyl group that, according to our crystal structures, should interfere with binding to the hydrophobic cleft created by the α7 helix (Fig. 4d). The IC50 of this molecule was eight-fold higher than that of amorphadiene, a reduction in potency consistent with its crystallographic pose (Fig. 4e, Supplementary Fig. 9l). Second, we assessed the inhibitory effects of amorphadiene and α-bisabolene against TC-PTP, the most closely related phosphatase to PTP1B. These molecules inhibited TC-PTP five- to six-fold less potently than PTP1B (Fig. 4f, Supplementary Fig. 9a-9k), a finding consistent with binding to the poorly conserved allosteric site. This selectivity may seem modest, but it matches or exceeds the selectivities of most pre-optimized inhibitors (including benzbromarone derivatives) and is exceedingly rare for unfunctionalized hydrocarbons (particularly in light of their comparatively modest molecular weights) 47. We assessed the contribution of the α7 helix to selectivity by removing the equivalent region from both PTP1B and TC-PTP (Fig. 4f). This modification caused a four-fold reduction in the selectivity of amorphadiene, but not α-bisabolene, an effect consistent with the unique involvement of the α7 helix in the binding of amorphadiene.
Amorphadiene and α-bisabolene are lipophilic molecules that could be valuable for their ability to pass through the membranes of mammalian cells. To examine the biological activity of these molecules, we incubated them with HEK293T/17 cells and used an enzyme-linked immunosorbent assay (ELISA) to measure shifts in insulin receptor (IR) phosphorylation. IR is a receptor tyrosine kinase that undergoes PTP1B-mediated dephosphorylation from the cytosolic side of the plasma membrane (PTP1B, in turn, localizes to the endoplasmic reticulum of the cell). Both molecules increased IR phosphorylation over a negative control (Fig. 4g, Supplementary Fig. 10). We checked for off-target contributions to this signal, in turn, by repeating the ELISA with equivalent concentrations of dihydroartemisinic acid and α-bisabolol. To our satisfaction, both molecules led to a reduction in signal consistent with their reduced potencies.

Design of alternative PTP-specific objectives
We explored the versatility of our B2H system by assessing its ability to detect the inactivation of several other disease-relevant PTPs. In short, we swapped out the gene for PTP1B with genes for PTPN2, PTPN6, or PTPN12; these enzymes are targets for immunotherapeutic enhancement 48, the treatment of ovarian cancer 49, and acute myocardial infarction 50, respectively. Their catalytic domains share 31-65% sequence identity with the catalytic domain of PTP1B. Interestingly, the new B2H systems were immediately functional; PTP inactivation permitted growth at high concentrations of spectinomycin (Fig. 5a). This finding suggests that our detection system can be easily extended to other members of the PTP family.
PTP-specific B2H systems could facilitate the identification of natural products that selectively inhibit one PTP over another. We explored this application by comparing the antibiotic resistance conferred by PTP1B- and TC-PTP-specific systems in response to metabolic pathways for amorphadiene and α-bisabolene (Fig. 5b). As expected, the PTP1B-specific system permitted growth at higher concentrations of antibiotic, a result consistent with the selectivity of both terpenoids for PTP1B. Indistinguishable terpenoid titers between the two strains suggest that this survival advantage does not result from differences in intracellular concentration (Fig. 5c). Our findings thus indicate that a side-by-side comparison of B2H systems, a potential secondary screen, offers a simple approach for evaluating the selectivity of PTP-inhibiting gene products. Notably, high concentrations of inhibitors in both strains could mask selective effects; in such cases, terpenoid levels could be reduced with lower mevalonate concentrations.

Discussion
This study addresses an important challenge of medicinal chemistry, the design of molecular structures that inhibit disease-relevant enzymes, by using a desired biochemical activity (i.e., an objective) as a genetically encoded constraint to guide molecular biosynthesis. This approach enabled the identification of two selective, biologically active inhibitors of PTP1B, an elusive drug target 54. These molecules are not drugs, but they are promising scaffolds for lead development. Their mechanisms of modulation, which elicit allosteric conformational changes yet appear to rely on loose, conformationally flexible binding, are unusual (and computationally elusive 55) and demonstrate the ability of microbial systems to find new solutions to difficult challenges in molecular design. Our identification of unusual inhibitors in relatively small libraries, in turn, suggests that microbial systems can access a rich molecular landscape that is not efficiently explored by existing approaches to molecular discovery.
The B2H system at the core of our approach is a valuable tool for identifying biologically active natural products, which are structurally complex, difficult to synthesize, and often hidden in cryptic gene clusters 56. It has several key advantages over contemporary approaches to inhibitor discovery: (i) It incorporates synthesizability, an important attribute of drug leads 57, as a search criterion. (ii) It is scalable. We used a growth-coupled assay to screen 24 uncharacterized terpene synthases; this type of assay is compatible with very large libraries (e.g., 10^10) 58. (iii) It can use cellular machinery to stabilize proteins (e.g., CDC37 for Src); this capability could facilitate the integration of unstable and/or disordered targets. Future efforts to exploit these advantages by incorporating large libraries of mutated and/or reconfigured pathways, alternative biosynthetic enzymes (e.g., cytochromes P450, halogenases, and methyltransferases), or new classes of disease-relevant enzymes would be informative.
The B2H system also has important limits. When used alongside metabolic pathways, it links survival not only to the potency of metabolites, but also to their titers, off-target effects, and pathway toxicities. These limitations can be beneficial; they bias the discovery process toward potent, readily synthesizable inhibitors and could, thus, facilitate post-discovery efforts to improve the titers of interesting molecules 59.
Nonetheless, they will exclude some types of structurally complex molecules that are difficult to synthesize in E. coli. The use of similar activity-based screens in other organisms (e.g., Streptomyces) could be interesting.
The compatibility of our discovery approach with different PTPs is valuable in light of their increasingly well validated potential as a rich (and essentially untapped) source of new therapeutic targets 60. We anticipate that some PTPs will require the use of chaperones and/or transcriptional adjustments to be incorporated into B2H systems. Our systematic optimization of the PTP1B-based system provides an experimental framework for exploring these modifications. Side-by-side comparisons of B2H systems, in turn, offer a promising strategy for evaluating inhibitor selectivity in secondary screens. In future work, new varieties of objectives (e.g., B2H systems or genetic circuits that detect the selective inhibition, or perhaps activation, of one PTP over another) could facilitate the discovery of molecules with sophisticated mechanisms of modulation in primary screens. The versatility of genetically encoded objectives highlights the power of using microbial systems to find targeted, biologically active molecules.

Methods
Bacterial strains. We used E. coli DH10B, chemically competent NEB Turbo, or electrocompetent One Shot Top10 (Invitrogen) to carry out molecular cloning and to perform preliminary analyses of terpenoid production; we used E. coli BL21(DE3) to express proteins for in vitro studies; and we used E. coli s1030 61 for our luminescence studies and for all experiments involving terpenoid-mediated growth (i.e., evolution studies).
For all strains, we generated chemically competent cells by carrying out the following steps: (i) We plated each strain on LB agar plates with the required antibiotics. (ii) We used one colony of each strain to inoculate 1 mL of LB media (25 g/L LB with appropriate antibiotics listed in
We generated electrocompetent cells by following an approach similar to the one above. In step iv, however, we resuspended the cells in 50 mL of ice cold MilliQ water and repeated this step twice: first with 50 mL of 20% sterile glycerol (ice cold) and, then, with 1 mL of 20% sterile glycerol (ice cold). We froze the pellets as before.
Materials. We purchased methyl abietate from Santa Cruz Biotechnology; trans-caryophyllene, tris(2-carboxyethyl)phosphine (TCEP), bovine serum albumin (BSA), M9 minimal salts, phenylmethylsulfonyl fluoride (PMSF), and DMSO (dimethyl sulfoxide) from Millipore Sigma; glycerol, bacterial protein extraction reagent II (B-PERII), and lysozyme from VWR; cloning reagents from New England Biolabs; amorphadiene from Ambeed, Inc.; and all other reagents (e.g., antibiotics and media components) from Thermo Fisher. Taxadiene was a kind gift from Phil Baran of The Scripps Research Institute. We prepared mevalonate by mixing 1 volume of 2 M DL-mevalonolactone with 1.05 volumes of 2 M KOH and incubating this mixture at 37°C for 30 minutes.
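The mevalonate preparation above fixes the concentration of the resulting stock, which in turn sets the volume needed to reach the 20 mM used for terpenoid production in 10-mL cultures. The following is a minimal sketch of that arithmetic, assuming complete hydrolysis of the lactone and additive volumes; the function names are hypothetical.

```python
# Concentration of the mevalonate stock described in the text.
# Assumptions: lactone hydrolysis is complete and volumes are additive.

LACTONE_M = 2.0      # DL-mevalonolactone stock, mol/L (from the text)
KOH_VOLUMES = 1.05   # volumes of 2 M KOH added per volume of lactone


def mevalonate_stock_molar(lactone_molar: float = LACTONE_M,
                           koh_volumes: float = KOH_VOLUMES) -> float:
    """Mevalonate concentration after mixing 1 volume of lactone with KOH."""
    return lactone_molar / (1.0 + koh_volumes)


def stock_volume_ul(target_mm: float, culture_ml: float,
                    stock_molar: float) -> float:
    """Stock volume (uL) for a target concentration, via C1*V1 = C2*V2.

    Ignores the small volume that the stock itself adds to the culture.
    Units: mM * mL / M = uL.
    """
    return target_mm * culture_ml / stock_molar


if __name__ == "__main__":
    stock = mevalonate_stock_molar()
    print(f"stock: {stock:.3f} M")
    print(f"for 20 mM in 10 mL: {stock_volume_ul(20.0, 10.0, stock):.0f} uL")
```

With these assumptions, the stock is just under 1 M, so about 205 μL brings a 10-mL culture to 20 mM.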
Cloning and molecular biology. We constructed all plasmids by using standard methods (i.e., restriction digest and ligation, Golden Gate and Gibson assembly, QuikChange mutagenesis, and circular polymerase extension cloning). Table S1 describes the source of each gene; Tables S2 and S3 describe the composition of all final plasmids.
We began construction of the B2H system by integrating the gene for HA4-RpoZ from pAB094a into pAB078d and by replacing the ampicillin resistance marker of pAB078d with a kanamycin resistance marker (Gibson assembly). We modified the resulting "combined" plasmid, in turn, by replacing the HA4 and SH2 domains with kinase substrate and substrate recognition (i.e., SH2) domains, respectively (Gibson assembly), and by integrating genes for Src kinase, CDC37, and PTP1B in various combinations (Gibson assembly). We finalized the functional B2H system by modifying the SH2 domain with several mutations known to enhance its affinity for phosphopeptides (K15L, T8V, and C10A, numbered as in Kaneko et al. 35), by exchanging the GOI for luminescence (LuxAB) with one for spectinomycin resistance (SpecR), and by toggling promoters and ribosome binding sites to enhance the transcriptional response (Gibson assembly and QuikChange mutagenesis, Agilent Inc.). For the last step, we also converted Pro1 to ProD by using the QuikChange protocol. When necessary, we constructed plasmids with arabinose-inducible components by cloning a single component from the B2H system into pBAD (Golden Gate assembly). Tables S4-S6 list the primers and DNA fragments used to construct each plasmid.
We assembled pathways for terpenoid biosynthesis by purchasing plasmids encoding the first module (pMBIS) and various sesquiterpene synthases (ADS or GHS in pTRC99a) from Addgene, and by building the remaining plasmids. We replaced the tetracycline resistance in pMBIS with a gene for chloramphenicol resistance to create pMBIS CmR. We integrated genes for ABS, TXS, ABA, and GGPPS into pTRC99t (i.e., pTRC99a without BsaI sites). Tables S4-S6 list the primers and DNA fragments used to construct each plasmid.
Luminescence assays. We characterized preliminary B2H systems (which contained LuxAB as the GOI) with luminescence assays. In brief, we transformed necessary plasmids into E. coli s1030 (Table S2), plated the transformed cells onto LB agar plates (20 g/L agar, 10 g/L tryptone, 10 g/L sodium chloride, and 5 g/L yeast extract with antibiotics described in Table S2), and incubated all plates overnight at 37°C. We used individual colonies to inoculate 1 ml of terrific broth (TB at 2%, or 12 g/L tryptone, 24 g/L yeast extract, 12 mL/L 100% glycerol, 2.28 g/L KH2PO4, 12.53 g/L K2HPO4, pH = 7.3, and antibiotics described in Table S2), and we incubated these cultures overnight (37°C and 225 RPM). The following morning, we diluted each culture by 100-fold into 1 ml of TB media (above), and we incubated these cultures in individual wells of a deep 96-well plate for 5.5 hours (37°C, 225 RPM). (When pBAD was present, we supplemented the TB media with 0-0.02% w/v arabinose.) We transferred 100 μL of each culture into a single well of a standard 96-well clear plate and measured both OD600 and luminescence on a Biotek Synergy plate reader (gain: 135, integration time: 1 second, read height: 1 mm). Analogous measurements of cell-free media allowed us to measure background signals, which we subtracted from each measurement prior to calculating OD-normalized luminescence (i.e., Lum / OD600).
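The background subtraction and normalization described above can be sketched as follows. This is an illustrative reading in which the cell-free blank is subtracted from both the luminescence and the OD600 reading before taking the ratio; the function name and the example readings are hypothetical.

```python
# OD-normalized luminescence with blank subtraction (illustrative sketch).
# The blank values come from wells containing cell-free media.

def od_normalized_luminescence(lum: float, od600: float,
                               lum_blank: float, od600_blank: float) -> float:
    """Return (Lum - blank) / (OD600 - blank) for a single well."""
    lum_corrected = lum - lum_blank
    od_corrected = od600 - od600_blank
    if od_corrected <= 0:
        raise ValueError("blank-corrected OD600 must be positive")
    return lum_corrected / od_corrected


if __name__ == "__main__":
    # Hypothetical plate-reader values, not data from the study.
    value = od_normalized_luminescence(lum=1200.0, od600=0.65,
                                       lum_blank=200.0, od600_blank=0.05)
    print(f"OD-normalized luminescence: {value:.1f}")
```

Raising an error for a non-positive corrected OD600 guards against dividing by the noise of an empty or failed well.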
Analysis of antibiotic resistance. We evaluated the spectinomycin resistance conferred by various B2H systems in the absence of terpenoid pathways by carrying out the following steps: (i) We transformed E. coli with the necessary plasmids (Table S2) and plated the transformed cells onto LB agar plates (20 g/L agar, 10 g/L tryptone, 10 g/L sodium chloride, 5 g/L yeast extract, 50 μg/ml kanamycin, 10 μg/ml tetracycline). (ii) We used individual colonies to inoculate 1-2 ml of TB media (12 g/L tryptone, 24 g/L yeast extract, 12 mL/L 100% glycerol, 2.28 g/L KH2PO4, 12.53 g/L K2HPO4, 50 μg/ml kanamycin, 10 μg/ml tetracycline, pH = 7.3), and we incubated these cultures overnight (37°C, 225 RPM). (iii) In the morning, we diluted each culture by 100-fold into 4 ml of TB media (as above) with 0-500 μg/ml spectinomycin (we used spectinomycin in the liquid culture only for Figure S2), and we incubated these cultures in deep 24-well plates until wells containing 0 μg/ml spectinomycin reached an OD600 of 0.9-1.1. (iv) We diluted each 4-ml culture by 10-fold into TB media with no antibiotics and plated 10-μL drops of the diluted culture onto agar plates with various concentrations of spectinomycin. (v) We incubated plates overnight (37°C) and photographed them the following day.
Terpenoid biosynthesis. We prepared E. coli for terpenoid production by transforming cells with plasmids harboring requisite pathway components (Table S2) and plating them onto LB agar plates (20 g/L agar, 10 g/L tryptone, 10 g/L sodium chloride, and 5 g/L yeast extract with antibiotics described in Table S2). We used one colony from each strain to inoculate 2 ml of TB (12 g/L tryptone, 24 g/L yeast extract, 12 mL/L 100% glycerol, 2.28 g/L KH2PO4, 12.53 g/L K2HPO4, pH = 7.0, and antibiotics described in Table S2) in a glass culture tube, which we incubated for ~16 hours (37°C and 225 RPM). We diluted these cultures by 75-fold into 10 ml of TB media and incubated the new cultures in 125-mL glass shake flasks (37°C and 225 RPM). At an OD600 of 0.3-0.6, we added 500 μM IPTG and 20 mM mevalonate. After 72-88 hours of growth (22°C and 225 RPM), we extracted terpenoids from each culture as outlined below. Table S9 lists exact sample sizes, culture volumes, and fermentation times.
Extraction and purification of terpenoids. We used hexane to extract terpenoids generated in liquid culture. For 10-mL cultures, we added 14 mL of hexane to 10 mL of culture broth in 125-mL glass shake flasks, shook the mixture (100 RPM) for 30 minutes, centrifuged it (4000 × g), and withdrew 10 mL of the hexane layer for further analysis. For 4-mL cultures, we added 600 μL of hexane to 1 mL of culture broth in a microcentrifuge tube, vortexed the tubes for 3 minutes, centrifuged the tubes for 1 minute (17000 × g), and saved 300-400 μL of the hexane layer for further analysis.
To purify amorphadiene, α-bisabolene, and (+)-1(10),4-cadinadiene, we supplemented 500-1000 mL of culture broth with hexane (16.7% v/v), shook the mixture for 30 minutes (100 RPM), isolated the hexane layer with a separatory funnel, centrifuged the isolated organic phase (4000 × g), and withdrew the hexane layer. To concentrate the terpenoid products, we evaporated excess hexane in a rotary evaporator to bring the final volume to 500 μL, and we passed the resulting mixture over silica gel 1-3 times (Sigma-Aldrich; high purity grade, 60 Å pore size, 230-400 mesh particle size). We analyzed elution fractions (100% hexane) on the GC/MS and pooled fractions with the compound of interest (e.g., amorphadiene). Once purified, we dried pooled fractions under a gentle stream of air, resuspended the terpenoid solids in DMSO, and quantified the final samples as outlined below. We repeated the purification process until samples (in DMSO) were >95% pure by GC/MS unless otherwise noted.
GC-MS analysis of terpenoids. We measured terpenoids generated in liquid culture with a gas chromatograph / mass spectrometer (GC-MS; a Trace 1310 GC fitted with a TG5-SilMS column and an ISQ 7000 MS; Thermo Fisher Scientific). We prepared all samples in hexane (directly or through a 1:100 dilution of DMSO) with 20 μg/ml of caryophyllene or methyl abietate as an internal standard. We diluted highly concentrated samples 10-20x prior to preparation to bring concentrations within the detection range of the MS. When the peak area of an internal standard deviated by more than ±40% from the average area of all samples containing that standard, we re-analyzed the corresponding samples. For all runs, we used the following GC method: hold at 80°C (3 min), increase to 250°C (15°C/min), hold at 250°C (6 min), increase to 280°C (30°C/min), and hold at 280°C (3 min). To identify various analytes, we scanned m/z ratios from 50 to 550.
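The internal-standard check described above can be illustrated with a short sketch. This is not the authors' code; the function name, sample names, and peak areas are hypothetical, and the tolerance of 0.40 simply encodes the ±40% criterion stated in the text.

```python
from statistics import mean

def flag_for_reanalysis(std_areas, tolerance=0.40):
    """Return the names of samples whose internal-standard peak area
    deviates from the mean area of all samples by more than `tolerance`
    (expressed as a fraction of the mean)."""
    avg = mean(std_areas.values())
    return sorted(
        name for name, area in std_areas.items()
        if abs(area - avg) > tolerance * avg
    )

# Hypothetical internal-standard peak areas (arbitrary units).
areas = {"s1": 1.00e6, "s2": 1.10e6, "s3": 0.95e6, "s4": 0.40e6}
print(flag_for_reanalysis(areas))  # samples to run again
```

In this illustration, only samples that share the same internal standard should be passed to the function together, because the criterion is defined relative to the average area for that standard.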
We examined sesquiterpenes generated by variants of ADS by using select ion mode (SIM) to scan for the molecular ion (m/z = 204). For quantification, we used Eq. 1, where A_i is the area of the peak produced by analyte i, A_std is the area of the peak produced by C_std of caryophyllene in the sample, and R is the ratio of response factors for caryophyllene and amorphadiene in a reference sample. Tables S12-14 provide the concentrations of all standards and reference compounds used in this analysis.
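As an illustration of internal-standard quantification with the symbols defined above, the sketch below assumes the standard form C_i = R × (A_i / A_std) × C_std, with R taken as the response factor (peak area per unit concentration) of the standard divided by that of the reference analyte. This form is an assumption that is consistent with the stated definitions, not a transcription of Eq. 1; the function names and numerical values are hypothetical.

```python
def response_ratio(area_std_ref, conc_std_ref, area_analyte_ref, conc_analyte_ref):
    """R: response factor of the internal standard divided by that of the
    reference analyte, both measured in a reference sample."""
    rf_std = area_std_ref / conc_std_ref
    rf_analyte = area_analyte_ref / conc_analyte_ref
    return rf_std / rf_analyte

def analyte_concentration(area_analyte, area_std, conc_std, ratio):
    """Assumed form: C_i = R * (A_i / A_std) * C_std."""
    if area_std <= 0:
        raise ValueError("internal-standard peak area must be positive")
    return ratio * (area_analyte / area_std) * conc_std

# Hypothetical reference sample: both compounds at 20 ug/ml.
R = response_ratio(2.0e6, 20.0, 1.0e6, 20.0)        # standard responds 2x as strongly
# Hypothetical culture extract spiked with 20 ug/ml of internal standard.
c_i = analyte_concentration(5.0e5, 2.0e6, 20.0, R)  # ug/ml
print(R, c_i)
```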
We quantified sesquiterpenes generated by variants of GHS by using the aforementioned procedure with several modifications: We used methyl abietate as an internal standard (several mutants of GHS generate caryophyllene as a product); we scanned for both m/z = 204 and m/z = 121, an ion common to sesquiterpenes and methyl abietate; we used a ratio of response factors for amorphadiene and methyl abietate at m/z = 121 for R; and we calculated peak areas at m/z = 121. We focused our analysis on peaks with areas that exceeded 1% of the total area of all peaks at m/z = 204. We quantified diterpenoids by, once again, accompanying our general procedure with several modifications: We scanned for a different molecular ion (m/z = 272) and an ion common to both diterpenoids and caryophyllene (m/z = 93); we used a ratio of response factors for pure taxadiene (a kind gift from Phil Baran) and caryophyllene at m/z = 93; and we calculated peak areas at m/z = 93. For all analyses, we examined only peaks with areas that exceeded 1% of the total area of all peaks at m/z = 272. We identified molecules by using the NIST MS library and, when necessary, confirmed this identification with analytical standards or mass spectra reported in the literature. We note: The assumption of a constant response factor for different terpenoids (that is, the assumption that all sesquiterpenes and diterpenes ionize like amorphadiene and taxadiene, respectively) can introduce error into estimates of their concentrations; our analyses, which are consistent with those of other studies of terpenoid production in microbial systems 62,63 , supply rough estimates of concentrations for all compounds except amorphadiene and taxadiene (which had analytical standards).
Bioinformatics. We used a bioinformatic analysis to identify a phylogenetically diverse set of terpene synthases. Briefly, we downloaded (i) all constituent genes of PF03936 (the largest terpene synthase family grouped by a C-terminal domain) from the Pfam database and (ii) all enzymes with an Enzyme Commission (EC) number of 4.2.3.# from the UniProt database; this string, which defines carbon-oxygen lyases that act on phosphates, includes terpene synthases. We cleaned both datasets in Excel (i.e., we ensured that every identifier had only one row), and we used a custom R script to designate each PF03936 member as characterized (i.e., in possession of a UniProt-based EC number) or uncharacterized.
Finally, we used FastTree 64 to create a phylogenetic tree of the PF03936 family and the R-package ggtree 65 to visualize the resulting tree and function data as a cladogram and heatmap.
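The labeling step described above (one row per identifier, then characterized versus uncharacterized) amounts to a set-membership test. The following sketch shows that logic in Python rather than R; it is not the custom script used in this study, and the identifiers are hypothetical placeholders.

```python
def label_family(family_ids, ids_with_ec):
    """Label each family member as 'characterized' if its identifier also
    appears in the set of enzymes with an EC 4.2.3.# annotation.
    Duplicate identifiers are collapsed so each appears only once."""
    unique_ids = list(dict.fromkeys(family_ids))  # de-duplicate, keep order
    return {
        pid: "characterized" if pid in ids_with_ec else "uncharacterized"
        for pid in unique_ids
    }

# Hypothetical identifiers standing in for the two downloaded datasets.
pf03936_members = ["TS_A", "TS_B", "TS_A", "TS_C"]
ec_4_2_3_members = {"TS_A", "TS_X"}

labels = label_family(pf03936_members, ec_4_2_3_members)
print(labels)
```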
After annotating the cladogram by hand, we selected three genes from each of eight clades: six with no characterized genes and two with some characterized genes. We avoided clades proximal to known monoterpene synthases or diterpene synthases known to act on GGPP isomers absent in our system (e.g., ent-copalyl diphosphate); these enzymes are unlikely to act on FPP, the primary product of pMBIS CmR . When selecting enzymes within clades, we biased our choice towards bacterial/fungal species and selected genes with a minimal number of common ancestors within the clade. The selected genes were synthesized and cloned into the pTrc99a vector by Twist Bioscience and assayed for antibiotic resistance as described above.
Enzyme kinetics. To examine terpenoid-mediated inhibition, we measured PTP1B- or TC-PTP-catalyzed hydrolysis of p-nitrophenyl phosphate (pNPP) in the presence of various concentrations of terpenoids. [...] µg/ml BSA), and DMSO at 10% v/v. We monitored the formation of p-nitrophenol by measuring absorbance at 405 nm every 10 seconds for 5 minutes on a SpectraMax M2 plate reader. We report exact sample sizes (i.e., the number of independently prepared reactions) in Supplementary Table 10.
We used a custom MATLAB script to process all raw kinetic data. This script removed all concentration values that fell outside of either (i) the range of our standard curve (absorbance vs. μM; Supplementary Fig. 18) or (ii) the initial-rate regime (i.e., values exceeding 10% of the pNPP concentration used in the assay). When this step reduced a kinetic dataset to fewer than ten points, we re-measured that dataset to collect at least ten. We fit final datasets, in turn, with a linear regression model (using MATLAB's backslash operator).
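The filtering and fitting steps above can be sketched as follows. This is an illustrative Python version, not the MATLAB script used in this study; the function name, the standard-curve range, and the simulated time course are hypothetical, and ordinary least squares stands in for the backslash operator.

```python
def initial_rate(times_s, product_uM, curve_range_uM, substrate_uM,
                 max_conversion=0.10, min_points=10):
    """Keep product concentrations that lie inside the standard-curve range
    and below `max_conversion` of the starting substrate, then fit a line
    (concentration vs. time) by ordinary least squares."""
    lo, hi = curve_range_uM
    limit = max_conversion * substrate_uM
    kept = [(t, p) for t, p in zip(times_s, product_uM)
            if lo <= p <= hi and p <= limit]
    if len(kept) < min_points:
        raise ValueError("fewer than %d usable points; re-measure" % min_points)
    n = len(kept)
    mean_t = sum(t for t, _ in kept) / n
    mean_p = sum(p for _, p in kept) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in kept)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in kept)
    slope = sxy / sxx                      # initial rate, uM per second
    intercept = mean_p - slope * mean_t
    return slope, intercept, n

# Hypothetical time course: one read every 10 s for 5 min, 1000 uM substrate.
times = list(range(0, 301, 10))
product = [2.0 + 0.5 * t for t in times]
rate, offset, n_used = initial_rate(times, product, (1.0, 200.0), 1000.0)
print(rate, offset, n_used)
```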
We evaluated kinetic models in three steps: (i) We fit initial-rate measurements collected in the absence and presence of inhibitors to Michaelis-Menten and inhibition models, respectively (here, we used the nlinfit and fminsearch functions from MATLAB; Supplementary Table 13). (ii) We used an F-test to compare the mixed model to the single-parameter model with the least sum squared error (here, we used the fcdf function from MATLAB to assign p-values), and we accepted the mixed model when p < 0.05. (iii) We used Akaike's Information Criterion (AIC) to compare the best-fit single-parameter model to each alternative single-parameter model, and we accepted the "best-fit" model when the difference in AIC (Δ i ) exceeded 5 for all comparisons 66 . We note: For amorphadiene, α-bisabolene, and (+)-1(10),4-cadinadiene, this criterion was not met; both noncompetitive and uncompetitive models, however, yielded indistinguishable IC50 values.
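The model-comparison logic in steps (ii) and (iii) can be sketched as below. The sketch assumes the standard extra-sum-of-squares F statistic for nested models and the least-squares form of AIC, n ln(SSE/n) + 2k; the exact expressions used in this study are not given in the text, and all names and numbers are hypothetical. Converting the F statistic to a p-value requires an F-distribution CDF (fcdf in MATLAB), which is omitted here.

```python
import math

def f_statistic(sse_simple, k_simple, sse_full, k_full, n):
    """Extra-sum-of-squares F statistic for nested least-squares models."""
    numerator = (sse_simple - sse_full) / (k_full - k_simple)
    denominator = sse_full / (n - k_full)
    return numerator / denominator

def aic_least_squares(sse, k, n):
    """AIC for a least-squares fit with k fitted parameters and n points."""
    return n * math.log(sse / n) + 2 * k

def accept_best(aic_by_model, threshold=5.0):
    """Return the lowest-AIC model and whether it beats every alternative
    by more than `threshold` AIC units."""
    best = min(aic_by_model, key=aic_by_model.get)
    accepted = all(a - aic_by_model[best] > threshold
                   for m, a in aic_by_model.items() if m != best)
    return best, accepted

# Hypothetical sums of squared error for n = 40 initial rates.
n = 40
sse = {"competitive": 4.0, "noncompetitive": 2.0, "uncompetitive": 2.1}
aic = {m: aic_least_squares(s, 3, n) for m, s in sse.items()}
best, accepted = accept_best(aic)
F = f_statistic(sse[best], 3, 1.8, 4, n)  # mixed model: SSE 1.8, 4 parameters
print(best, accepted, F)
```

In this made-up example the noncompetitive and uncompetitive fits differ by fewer than 5 AIC units, so neither is accepted as uniquely best, which mirrors the situation noted above for the three terpenoids.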
We estimated the half maximal inhibitory concentration (IC50) of inhibitors by using the best-fit kinetic models to determine the concentration of inhibitor required to reduce initial rates of PTP-catalyzed hydrolysis of 15 mM of pNPP by 50%. We used the MATLAB function "nlparci" to determine the confidence intervals of kinetic parameters, and we propagated those intervals to estimate corresponding confidence intervals for each IC50.
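For the common reversible inhibition models, the IC50 at a fixed substrate concentration follows in closed form from the rate law. The sketch below assumes the textbook mixed-inhibition rate law, v = Vmax·S / (Km(1 + I/Kic) + S(1 + I/Kiu)), with competitive, uncompetitive, and noncompetitive inhibition as special cases; the parameterization used in this study may differ, and the parameter values shown are hypothetical.

```python
import math

def rate(s, i, vmax, km, kic=math.inf, kiu=math.inf):
    """Mixed-inhibition rate law. kic = inf gives uncompetitive inhibition,
    kiu = inf gives competitive, and kic == kiu gives noncompetitive."""
    return vmax * s / (km * (1 + i / kic) + s * (1 + i / kiu))

def ic50(s, km, kic=math.inf, kiu=math.inf):
    """Inhibitor concentration that halves the rate at substrate level s.
    Setting rate(s, I) = rate(s, 0) / 2 gives
    I = (Km + S) / (Km / Kic + S / Kiu)."""
    return (km + s) / (km / kic + s / kiu)

# Hypothetical parameters: S and Km in mM, inhibition constants in uM.
S, KM = 15.0, 5.0
noncompetitive = ic50(S, KM, kic=50.0, kiu=50.0)  # equals Ki
competitive = ic50(S, KM, kic=50.0)               # equals Ki * (1 + S/Km)
print(noncompetitive, competitive)
```

Under these assumptions, the IC50 of a noncompetitive inhibitor is independent of substrate concentration, whereas the IC50 of a competitive inhibitor grows with S/Km.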
X-ray crystallography. We prepared crystals of PTP1B by using hanging drop vapor diffusion. In brief, we added 2 μL of PTP1B (~600 μM PTP1B, 50 mM HEPES, pH 7.3) to 6 μL of crystallization solution (100 mM HEPES, 200 mM magnesium acetate, and 14% polyethylene glycol 8000, pH 7.5) and incubated the resulting droplets over crystallization solution for one week at 4°C (EasyXtal CrystalSupport, Qiagen). We soaked crystals with ligand by transferring them to droplets formed with 6 μL of crystallization solution and 1 μL of ligand solution (10 mM in DMSO), which we incubated for 2-5 days at 4°C. We prepared all crystals for freezing by soaking them in cryoprotectant formed from a 70/30 (v/v) mixture of buffer (100 mM HEPES, 200 mM magnesium acetate, and 25% polyethylene glycol 8000, pH 7.5) and glycerol.
We collected X-ray diffraction data through the Collaborative Crystallography Program at Lawrence Berkeley National Lab (ALS ENABLE, beamline 8.2.1, 100 K, 1.00003 Å). We performed integration, scaling, and merging of X-ray diffraction data using the xia2 software package 67 , and we carried out molecular replacement and structure refinement with the PHENIX graphical interface 68 , supplemented with manual model adjustment in COOT 69 and one round of PDB-REDO 70 (the latter, only for the PTP1B-amorphadiene complex).
Molecular dynamics (MD) simulations. We performed MD simulations using GROMACS 2020 71 . Briefly, we used the CHARMM36m protein force field 72 , a CHARMM-modified TIP3P water model 73 , and ligand parameters generated by CGenFF 74,75 . We solvated each PTP1B-ligand complex (initialized from the corresponding crystal structure) in a dodecahedral box with edges positioned ≥ 10 Å from the surface of the complex, and we added sodium ions (three for amorphadiene and one for α-bisabolol) to neutralize each system. We used the LINCS algorithm 76 to constrain all bonds involving hydrogen atoms, the Verlet leapfrog algorithm to numerically integrate equations of motion with a 2-fs time step, and the particle-mesh Ewald summation 77 (cubic interpolation with a grid spacing of 0.16 nm) to calculate long-range electrostatic interactions; we used a cutoff of 1.2 nm, in turn, for short-range electrostatic and Lennard-Jones interactions. We independently coupled the protein-ligand complex and solvent molecules to a temperature bath (300 K) using a modified Berendsen thermostat 78 with a relaxation time of 0.1 ps, and we fixed pressure coupling to 1 bar using the Parrinello-Rahman algorithm 79 with a relaxation time of 2 ps and isothermal compressibility of 4.5 × 10⁻⁵ bar⁻¹.
For each system, we carried out 30 independent MD simulations to reduce sampling bias. For each MD trajectory, we minimized energy using the steepest descent method followed by 100-ps solvent relaxation in the NVT ensemble and 100-ps solvent relaxation in the NPT ensemble. After an additional 1-ns NPT equilibration, we carried out production runs for 1 ns in the NPT ensemble and registered coordinate data every 10 ps.
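The run lengths stated above translate into integrator steps and saved frames as in the following arithmetic sketch. It assumes the 2-fs time step given earlier and that a frame is also written at time zero; the helper names are hypothetical.

```python
def n_steps(duration_ps, dt_fs):
    """Number of integration steps needed to cover `duration_ps`."""
    return round(duration_ps * 1000 / dt_fs)

def n_frames(duration_ps, stride_ps, include_start=True):
    """Number of coordinate frames saved at a fixed stride.
    Assumes the starting frame (t = 0) is written when include_start is True."""
    return int(duration_ps // stride_ps) + (1 if include_start else 0)

production_ps = 1000      # 1 ns production run
steps = n_steps(production_ps, dt_fs=2)
frames = n_frames(production_ps, stride_ps=10)
replicas = 30
print(steps, frames, replicas * frames)
```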
Analysis of PTP1B inhibition in HEK293T cells. We prepared HEK293T/17 cells for an enzyme-linked immunosorbent assay (ELISA) by growing them in 75 cm² culture flasks (Corning) with DMEM media supplemented with 10% FBS, 100 units/ml penicillin, and 100 units/ml streptomycin. We replaced the media every day for 3-5 days until the cells reached 80-100% confluency.
We measured the influence of inhibitors on insulin receptor (IR) phosphorylation by using an IR-specific ELISA (Supplementary Fig. 10a). Briefly, we starved cells for 48 hours in FBS-free media and incubated them with inhibitors (all at 3% DMSO) for 10 minutes. After incubation, we lysed cells with lysis buffer (9803, Cell Signaling Technology) supplemented with 1X Halt phosphatase inhibitor cocktail and 1X Halt protease inhibitor cocktail (Thermo Fisher Scientific) for 10 min, pelleted the cell debris, and used the lysis buffer to dilute each sample to 60 mg/ml total protein. We measured IR phosphorylation in subsequent dilutions of the 60 mg/ml samples with the PathScan® Phospho-Insulin Receptor β (panTyr) Sandwich ELISA Kit (Cell Signaling Technology; #7082). We note: To identify biologically active concentrations of α-bisabolene and amorphadiene, we screened several concentrations and chose those that gave the highest signal (405 μM for α-bisabolene and 930 μM for amorphadiene); similar concentrations of weak inhibitors did not yield a detectable signal (Supplementary Fig. 10b,c).
Statistical analysis and reproducibility. We determined statistical significance (Fig. 3g) with a two-tailed Student's t-test (details in Supplementary Tables 11 and 15), and we used an F-test to compare one- and two-parameter models of inhibition (Supplementary Table 13).
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this data.
Data availability. The plasmids generated in this study are available on Addgene (https://www.addgene.org/) or from the authors. All code generated for data analysis is available upon request. Source data for our figures are available as follows: Supplementary Table 7 (Fig. 1b-d, Supplementary Fig. 1 [...] Fig. 4g; Supplementary Fig. 6b-c). The crystal structures determined in this study are available from the RCSB Protein Data Bank (PDB entries 6W30 and 6W31). [...] Table 14

Development of a bacterial two-hybrid system that links the inhibition of PTP1B to antibiotic resistance. a, A bacterial two-hybrid (B2H) system in which a phosphorylation-dependent protein-protein interaction modulates transcription of a gene of interest (GOI, black). Major components include (i) a substrate domain fused to the omega subunit of RNA polymerase (yellow), (ii) an SH2 domain fused to the 434 phage cI repressor (light blue), (iii) Src kinase and PTP1B, (iv) an operator for 434cI (dark green), (v) a binding site for RNA polymerase (purple), and (vi) a gene of interest (GOI, black). b, The luminescence generated by a B2H system with a p130cas substrate, LuxAB as the GOI, and no PTP1B. We used an inducible plasmid to increase expression of specific components. c, The luminescence generated by B2H systems with an SH2 domain that exhibits enhanced affinity for phosphopeptides (SH2*), one of four substrate domains, LuxAB as the GOI, and no Src or PTP1B. We used an inducible plasmid to control the expression of Src. d, The B2H system from c with either p130cas or MidT substrates. We used a second plasmid to control the expression of Src and an active or inactive (C215) variant of PTP1B. Right: Two optimized single-plasmid systems. e, The final B2H system. Inactivation of PTP1B enabled a strain of E. coli harboring this system to survive at high concentrations of spectinomycin (> 250 μg/ml). Error bars in b-d denote standard error with n = 3 biological replicates. Supplementary Tables 2, 7, and 8 detail the plasmids used in each B2H version.

[...] inhibition of PTP1B by (+)-1(10),4-cadinadiene (85% purity, 10% DMSO). Lines show the best-fit kinetic models of inhibition (Supplementary Table 13).

Figure 4
Biophysical analysis of terpenoid-mediated inhibition. a, Aligned X-ray crystal structures of PTP1B bound to TCS401, a competitive inhibitor (yellow protein, orange highlights, and green spheres; PDB entry 5K9W), and BBR, an allosteric inhibitor (gray protein, blue highlights, and light blue spheres; PDB entry 1T4J). b-c, Aligned structures of PTP1B bound to BBR (white protein and light blue ligand) and (b) amorphadiene (cyan protein and dark blue ligand, PDB entry 6W30) or (c) α-bisabolol (pink protein and red ligand, PDB entry 6W31), a soluble analogue of α-bisabolene. d, Dihydroartemisinic acid (DHA), a structural analogue of amorphadiene with a carboxyl group likely to disrupt binding to the hydrophobic cleft. e, DHA is eightfold less potent than amorphadiene. Lines show the best-fit kinetic models of inhibition (Supplementary Table 13). Error bars denote standard error for n = 3 independent measurements with a 95% confidence interval for the IC50. f, Both amorphadiene and α-bisabolene inhibit PTP1B much more potently than TC-PTP; the removal of the α7 helix (or equivalent) from both enzymes reduces the selectivity of AD, but not AB. Error bars show propagated 95% confidence intervals estimated from n ≥ 3 independent measurements at each condition (exact sample sizes are reported in Supplementary Table 10). [...]

Extension to other disease-related PTPs. a, The spectinomycin resistance of strains harboring B2H systems modified to detect the inactivation of different disease-relevant PTPs. Inactivating mutations 51-53 confer survival at high concentrations of antibiotic. b, A comparison of the resistance conferred by PTP1B- and TC-PTP-specific B2H systems in the presence of metabolic pathways for amorphadiene and α-bisabolene (i.e., pMBIS CmR + ADS or ABA). The PTP1B-specific system exhibits a prominent survival advantage, a finding consistent with the selectivity of both terpenoids for this enzyme. c, The titers of AD and AB in strains harboring both the B2H systems and associated metabolic pathways are indistinguishable between strains.
