Successes, surprises and pitfalls in modular polyketide synthase engineering: generation of ring-contracted stambomycins

The modular organization of the type I polyketide synthases (PKSs) would seem propitious for rational engineering of desirable analogues. However, despite decades of effort, such experiments remain largely inefficient. Here, we combined multiple state-of-the-art approaches, including modification of docking domains, use of modules of varying domain composition, alternative interdomain fusion sites, and targeted adaptation of key domain-domain interfaces, to reprogram the stambomycin PKS by deleting seven internal modules – the most substantial modification to an intact system reported to date. One such system produced the target 37-membered mini-stambomycin metabolites, a reduction in chain length of 14 carbons relative to the 51-membered parental compounds, but also substantial quantities of shunt metabolites released from the multienzyme subunit upstream of the newly-installed junction. Our data also provide evidence for an unprecedented off-loading mechanism of such stalled intermediates involving the C-terminal thioesterase domain acting on chains located four modules upstream. The yields of all metabolites were substantially reduced compared to the wild type compounds, likely reflecting the poor tolerance to the non-native substrates of the modules downstream of the introduced interfaces. Taken together, our data demonstrate that even 'optimized' PKS engineering strategies remain inadequate for efficient production of target polyketide derivatives, and highlight several areas for future investigation.
Circular dichroism. CD spectra were recorded on a Chirascan CD (Applied Photophysics, United [...]) (IBS-Lor Plateforme Biophysique Biologie Structurale) at 0.5 nm intervals in the wavelength range of 180–260 nm at 20°C, using a temperature-controlled chamber. A 0.01 cm quartz cuvette containing 30 µL of docking domain at 100 µM, a 0.1 cm cuvette with 200 µL of sample at 10 µM, and a 1 cm cuvette containing 1.5 mL of sample at 1 µM were used for all the measurements.
All measurements were performed at least in triplicate, and sample spectra were corrected for buffer background by subtracting the average spectrum of buffer alone. The CD spectra were deconvoluted using the deconvolution software CDNN2.1 65 to estimate the secondary structure present in the docking domains.
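The buffer correction described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic (average the replicates, then subtract the averaged buffer spectrum point by point); the function names and the toy numbers are assumptions, not part of the reported analysis, and the secondary-structure deconvolution itself (CDNN) is not reproduced.

```python
from statistics import mean

def average_spectra(spectra):
    """Average replicate spectra point by point (all on one wavelength grid)."""
    if not spectra:
        raise ValueError("need at least one spectrum")
    n_points = len(spectra[0])
    if any(len(s) != n_points for s in spectra):
        raise ValueError("all spectra must have the same number of points")
    return [mean(values) for values in zip(*spectra)]

def buffer_corrected(sample_replicates, buffer_replicates):
    """Subtract the average buffer spectrum from the average sample spectrum."""
    sample_avg = average_spectra(sample_replicates)
    buffer_avg = average_spectra(buffer_replicates)
    if len(sample_avg) != len(buffer_avg):
        raise ValueError("sample and buffer must share one wavelength grid")
    return [s - b for s, b in zip(sample_avg, buffer_avg)]

# Wavelength grid matching the stated acquisition: 180-260 nm in 0.5 nm steps.
wavelengths = [180 + 0.5 * i for i in range(161)]

# Toy two-point spectra (hypothetical values, for illustration only).
corrected = buffer_corrected([[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]])
```

Averaging before subtraction is equivalent to subtracting first and averaging afterwards when every replicate uses the same grid, so the order chosen here is a matter of convenience.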


Introduction
For almost thirty years, efforts have been made to leverage the modular genetic architecture of the type I polyketide synthases (PKSs) to generate novel derivatives, typically by modifying individual catalytic domains. Despite enormous progress in establishing domain structure-function relationships 1,2, such genetic manipulation remains inefficient 3. Insight into factors potentially contributing to low product yields was provided by cryo-electron microscopy analysis of a model PKS module at multiple stages of its catalytic cycle 4,5. This work revealed that interdomain contacts are critical for establishing the various functional states of the module, and that transitions between such states rely on evolving interfaces between the domains, as well as the intervening 'linker' regions. In short, PKS modules appear to be highly integrated units, thus explaining why exchange of catalytic domains for heterologous counterparts is often detrimental 6. Collectively, these observations motivate future approaches in which modules or multi-modular subunits are employed as the basic building blocks for engineering the assembly lines 7–13.
However, an alternative definition was recently suggested by the finding that KS domains in certain PKSs co-evolve with the tailoring domains located upstream in the assembly lines 14,15. Accordingly, modules begin with the modifying domains and the associated AT, and terminate with the KS that is classically assigned to the downstream module (Fig. 2). Even before a module redefinition was suggested, engineering efforts revealed that maintaining the key ACP n /KS n+1 interface can, in certain cases, be critical for the function of a hybrid PKS 7,10,16. Recently, we have carried out module swapping based on both of these definitions, by covalently tethering heterologous modules to a common donor module within a bimodular mini-PKS 17. Overall, our data demonstrated that both module definitions led to functional hybrid PKSs, and that which boundaries worked best depended on the source module 18. Indeed, regardless of which extremities are employed, module exchange results in non-native interdomain interactions (ACP n /KS n+1 or KS n+1 /ACP n+1 ), and in the case of classical module boundaries, potential incompatibilities in terms of KS substrate specificity (Fig. 2), both of which have been shown to reduce activity via detailed studies in vitro 19–21.
In this work, we aimed to investigate the generality of these findings for efforts to create non-native intermodular junctions when the modules are located on distinct subunits 10. In such cases, the resulting non-covalent interactions are mediated by short sequences at the extreme C- and N-termini of the subunits called docking domains (DDs) 22 (Figs. 1 and 2). Matched pairs of DDs form specific complexes at intersubunit interfaces, enforcing a strict subunit ordering within the PKS system. As a test case, we aimed to genetically engineer the biosynthesis of substantially smaller derivatives of the stambomycin family of polyketides in Streptomyces ambofaciens ATCC23877 23. The stambomycins 1 (Fig. 1b) are glycosylated macrolides which show promising anti-cancer activity against multiple human cancer cell lines 23. The six characterized family members (A–F) differ in the alkyl functionality at the C-26 position (Fig. 1a), which directly impacts their potency 23,24, but have in common modification by trans-acting cytochrome P450 hydroxylases at C-28 and C-50. Notably, at 51 members, the macrolactone ring is among the largest of all known polyketides. Thus, the stambomycins represent an attractive model system for establishing PKS engineering as a means to access structurally-simplified analogues (minimal pharmacophores 25) for biological evaluation, as a complement to traditional chemical synthesis.
Here we report a comprehensive series of experiments aiming to generate 37-membered ring stambomycin analogues, based on both the classical and revised module definitions. The target mini-stambomycins were detected successfully, albeit in low yields, and the identification of the derivatives allowed us to clarify the relative timing of the two cytochrome P450-catalyzed hydroxylation reactions in the pathway. Attempts to boost titers by ACP/KS interface engineering 19,20,26 were unsuccessful, but led in certain cases to a surprising increase in the liberation of linear shunt metabolites – only the second report, to our knowledge, of inter-subunit crosstalk resulting in thioesterase (TE)-mediated chain release.
Taken together with recent work by others 10, our data reinforce the idea that in order to boost efficacy, strategies based on modifying PKS intersubunit interfaces must take into account the function of the modules acting downstream from the newly-established junctions.

Results
Design of engineering experiments based on classical modular boundaries. The stambomycin PKS comprises 25 modules distributed among 9 polypeptides (Pks1–9) 23 (Fig. 1a) (Note: throughout the text, the stambomycin genes have been numbered in accordance with ref. 23). To access abridged derivatives using the classical module boundaries, we reasoned that we could engineer novel intersubunit interfaces by suitable manipulation of docking domains. Encouragingly, the extreme C- and N-termini of all subunits (with the exception of the N-terminus of Pks1 and the C-terminus of Pks9) contain sequences with convincing homology to previously-identified DDs 22,27 (the C-terminal DDs are referred to hereafter as C DDs and their partner N-terminal DDs as N DDs). By bioinformatics analysis, we were able to confidently assign the DDs acting at 6 of the 8 interfaces to the type '1a' class 22, and the remaining two sets of DDs as type '1b' 27 (Supplementary Fig. 1). In both cases, docking occurs between an α-helical C DD and a coiled-coil formed by the N DD, with specificity achieved via strategically-placed charge:charge interactions at the complex interface (Supplementary Fig. 1) 22,27.
Among the type 1a junctions, there were notably two which appeared compatible in terms of the translocated substrate: Pks 3/4 + 7/8 and Pks 4/5 + 8/9 (Supplementary Fig. 2). Specifically, the functional groups at the critical α- and β-positions 14,28 of the transferred chains are identical at these junctions, and correspondingly, the downstream KSs show similarities across several sequence motifs previously correlated with substrate specificity 14,20,29 (Supplementary Fig. 2). Targeting such interfaces thus allowed us, at least in principle, to overcome the functional block to the engineered systems represented by poor recognition of the incoming substrate by the directly downstream KS domain 21.
Ultimately, we targeted a new interface between Pks subunits 4 and 9 for two principal reasons. Firstly, Pks4 is at the origin of the structural variation between the stambomycin family members, and thus we anticipated that maintaining the subunit within the hybrid system would give rise to a corresponding series of truncated analogues, providing important evidence for their identity. Secondly, it was genetically more practical to modify the second set of interfaces due to splitting of the PKS subunits between two loci (Fig. 1).
To establish the novel junction, we initially modified the C DD of Pks4 ( C DD 4 ) to match that of Pks8 (the natural partner of the N DD of Pks9 ( N DD 9 )), either by site-directed mutagenesis of residues previously identified as key mediators of interaction specificity (construct C DD 4 SDM; Supplementary Fig. 3), or by exchanging its docking helix for that of C DD 8 (construct C DD 4 helix swap) 30. Modifying the C DD 4 specificity 'code' to match that of C DD 8 required mutation of 3 residues, while for the C DD 4 helix swap, the terminal 16 amino acids of C DD 4 were exchanged for the corresponding 15 residues of C DD 8 (Supplementary Fig. 3 and Supplementary Table 1). The genetic alterations were carried out in two distinct PKS contexts: (i) in the presence of the intervening subunits 5–8, which allowed for the possibility of competitive interactions between modified Pks4 and both Pks5 and Pks9; and (ii) with the intervening subunits 5–8 removed, thus eliminating competition for binding of Pks4 by Pks5, and of Pks9 by Pks8 (Supplementary Fig. 3). We further generated a mutant in which Pks subunits 5–8 were deleted but no modification was made to C DD 4, in order to judge the intrinsic capacity of Pks4 and Pks9 to interact. Furthermore, genetic engineering was carried out in parallel by both PCR-targeting 31 and CRISPR-Cas9 32, in order to directly compare the efficacy of these two approaches, as well as to evaluate the effect of the short scar sequence remaining in the chromosome following PCR-targeting.
Engineering the stambomycin PKS based on the classical module definition. The C DD 4 SDM and C DD 4 helix swap sequences were introduced in parallel into the S. ambofaciens genome using PCR-targeting and CRISPR-Cas9 (full experimental details are provided in the Supplementary Methods). As discussed previously, the modifications were made both in the presence of the intervening subunits Pks5–8 and in their absence (Supplementary Fig. 3). As previous work has shown that production from the stambomycin biosynthetic gene cluster requires activation by constitutive overexpression of a pathway-specific LAL (Large ATP-binding regulators of the LuxR family) regulator 23 [...] Fig. 4). Construct K7N6 was assembled specifically to test the effect of this region, without any further modification to C DD 4 and the intervening pks5–pks8 genes.
With the exception of K7N3, CPN4 and CPN5, extracts of the engineered mutant strains harboring pOE484 were analyzed by high performance liquid chromatography heated electrospray ionization high-resolution mass spectrometry (HPLC-HESI-HRMS) on a Dionex UltiMate 3000 HPLC coupled to a Q Exactive™ Hybrid Quadrupole-Orbitrap™ Mass Spectrometer, and compared to extracts of the control strain containing pIB139 33 as well as the wild type S. ambofaciens containing pOE484, using SIEVE 2.0 screening software. K7N3, CPN4 and CPN5 were analyzed subsequently, and the data inspected manually. Yield quantification was carried out with reference to a calibration curve generated with purified stambomycins A/B 1 (the limit of detection was found to be between 1 and 10 µg L−1, and so any yields < 10 µg L−1 must be considered an estimate). Novel metabolites not present in the control strains, and for which we obtained reliable exact masses, are listed in Table 1 and Supplementary Fig. 5.
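The quantification step can be illustrated with a minimal sketch of an external calibration curve. The linear model, the standard concentrations and the peak areas below are hypothetical placeholders (the actual calibration data are not given in the text); only the 10 µg L−1 threshold below which yields are treated as estimates comes from the description above.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def quantify(peak_area, slope, intercept, reliable_above=10.0):
    """Return (concentration in ug/L, is_estimate) for a measured peak area."""
    concentration = (peak_area - intercept) / slope
    return concentration, concentration < reliable_above

# Hypothetical standards of purified stambomycins A/B (ug/L) and peak areas.
standard_conc = [10.0, 100.0, 1000.0, 10000.0]
standard_area = [2.0 * c + 5.0 for c in standard_conc]  # invented linear response

slope, intercept = fit_line(standard_conc, standard_area)
low_conc, low_is_estimate = quantify(15.0, slope, intercept)
high_conc, high_is_estimate = quantify(2005.0, slope, intercept)
```

Returning the flag alongside the value keeps sub-threshold yields visible in downstream tables while marking them as approximate, which mirrors how such values are treated in the text.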
The first notable result is that the K7N6/OE484 mutant yielded a similar metabolic profile to S. ambofaciens wt (22 ± 3 mg L−1, 73% relative yield (Supplementary Table 5)), showing that the scar sequence negatively impacted stambomycin production, but not dramatically (Fig. 3). By contrast, as anticipated, no stambomycins were observed in any of the constructs in which Pks5–Pks8 had been removed (K7N1–3; CPN1,2) (Fig. 3). Stambomycins 1 were present, however, in strains K7N4 and CPN4 harboring C DD 4 site-directed mutations and in the C DD 4 helix swap strain CPN5, all of which still contained Pks5–Pks8, albeit at reduced amounts relative to the wild type (18%, 23% and 14% of wt, respectively) (Fig. 3 and Supplementary Table 5). (Surprisingly, the metabolic profile of K7N5 reproducibly differed from that of CPN5, as no stambomycin-related metabolites were detected (Fig. 3)). These data suggested that while the mutations introduced into C DD 4 reduced interaction with N DD 5, they were not sufficient to disrupt natural chain transfer between Pks4 and Pks5, arguing that DD engineering to alter partner choice should be accompanied by removal of competing intersubunit interactions.
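As a rough guide to what these percentages mean in absolute terms, the relative yields can be converted back into approximate titers. The sketch below assumes that 22 mg L−1 is the K7N6 titer and that all percentages refer to the same wild-type reference; both are readings of the text rather than stated facts, and the resulting numbers are back-calculations, not reported measurements.

```python
# Relative yields (% of wild type) quoted in the text.
relative_yield_percent = {"K7N6": 73, "K7N4": 18, "CPN4": 23, "CPN5": 14}

# Assumption: the 22 mg/L figure is the titer of K7N6 itself.
k7n6_titer_mg_per_l = 22.0

# Implied wild-type titer, then implied titers for the other strains.
implied_wt_titer = k7n6_titer_mg_per_l / (relative_yield_percent["K7N6"] / 100)
implied_titers = {
    strain: implied_wt_titer * percent / 100
    for strain, percent in relative_yield_percent.items()
}

for strain, titer in implied_titers.items():
    print(f"{strain}: ~{titer:.1f} mg/L (back-calculated)")
```

Given the ± 3 mg L−1 uncertainty on the reference value, these figures are indicative only.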
We did not find any evidence in the DD engineering experiments for any of the target 37-membered metabolites (Supplementary Figs. 3 and 5). However, all strains in which stambomycin production was abolished (Table 1) exhibited four new peaks in common (Fig. 3b and Supplementary Fig. 5) (other peaks corresponding to potentially novel compounds were observed, but none were shared between multiple strains). The determined exact masses and mass spectra (as exemplified by strain CPN2/OE484, Fig. 3b) correspond to truncated derivatives of stambomycins A/B and C/D respectively, following premature release from modules 13 and 12 of Pks4 (compounds 2–5, Fig. 3d and Supplementary Fig. 5; ca. 8-fold greater yield of the module 13 products (Supplementary Table 5)). Further support for the identity of these shunt compounds was obtained by grafting the chain-terminating (type I) thioesterase (TE) domain from the C-terminal end of Pks9 to the C-terminus of Pks4 in order to force chain release at this stage. Indeed, identical compounds were produced, but at 17-fold increased yield relative to CPN2/OE484, consistent with active off-loading of the chains (Fig. 3c, Supplementary Fig. 6 and Supplementary Table 5).
Based on their masses, both sets of shunt metabolites were hydroxylated on a single carbon, while none were found to bear the β-mycaminose of the mature stambomycins, consistent with the absence of the tetrahydropyran moiety to which it is normally tethered. To determine the location of the hydroxylation and therefore the hydroxylase responsible, we inactivated in mutant CPN2/OE484 the genes samR0478 and samR0479 encoding, respectively, the stambomycin C-28 and C-50 cytochrome P450 hydroxylases 34. While extracts of CPN2/OE484/Δ478 were unchanged relative to CPN2/OE484 (i.e. the hydroxyl group was still present), the CPN2/OE484/Δ479 mutant exhibited four new peaks with masses corresponding to the dehydroxylated shunt products (Fig. 3, Supplementary Fig. 7 (compounds 6–9) and Supplementary Table 5). Taken together, these data show that the unusual on-line modification catalyzed by SamR0479 34, which is necessary for macrocyclization, occurs prior to chain extension by Pks5. While SamR0478 has also been speculated to act during chain assembly 34, hydroxylation evidently occurs downstream of Pks4, at least. The intriguing substrate structural and/or protein-protein recognition features controlling the timing of hydroxylation by these P450 enzymes remain to be elucidated.
Role of TE domains in release of the shunt metabolites. We attributed the observed shunt metabolites 2–5 to the lack of productive chain translocation between Pks4 and Pks9, causing intermediates to accumulate on ACPs 12 and 13. To evaluate whether these were released by spontaneous hydrolysis or enzymatically, we further investigated the role of Pks9 TEI 34 in chain release, as well as that of SamR0485, a proof-reading type II TE 35 located in the cluster. Both TEs were disabled by site-directed mutagenesis of the active site serines (Ser → Ala) (Supplementary Fig. 6).
Interestingly, individual inactivation of the type I and type II TEs reduced the yields of shunt products 2–5 relative to the parental strain CPN2/OE484 (by 66% and 27%, respectively; average of duplicate experiments) (Supplementary Fig. 6 and Supplementary Table 5). These data clearly show that premature release of the chains is catalyzed, at least in part, by both TEs in the cluster, although spontaneous liberation also occurs. While type II TEs typically interact with acyl-ACPs in trans to release blocked chains 35, the effect of the Pks9 TEI is less readily explained. One possibility is that the new productive docking interaction between Pks4 and Pks9 allows Pks9 to adopt an alternative conformation from which the TE can off-load intermediates bound to ACPs 12 and 13 of Pks4 (Supplementary Fig. 6). Although this mechanism is reminiscent of that used by the pikromycin PKS to generate both 12- and 14-membered rings 36, the pikromycin TEI is separated from its alternative ACP target by a single module, while Pks9 TEI is located five or four modules downstream from ACPs 12 and 13, respectively, in the engineered system, which would seem to necessitate substantial inter-subunit acrobatics.
[...] Table 4). Analysis of the individual C DDs by circular dichroism (CD) confirmed their expected high α-helical content ( C DD 4 wt (100 µM): 58%; C DD 8 wt (100 µM): 49%), and showed no evident effect of the introduced mutations on secondary structure (Supplementary Fig. 8). All of the constructs were further confirmed to be homodimeric by size exclusion chromatography-multi-angle light scattering (SEC-MALS) (Supplementary Fig. 8).
Based on the higher affinity of the interaction, we could identify the N DD 9 Val as the physiologically relevant construct. The observed binding stoichiometry (1 homodimeric C DD:2 monomeric N DDs) is consistent with the known structure of a type 1a complex in which two monomers of each DD are present (Fig. 1, Supplementary Fig. 1) 22. As expected, no non-specific interaction was detected between native C DD 4 and N DD 9, explaining the lack of productive interaction between unmodified subunits Pks4 and Pks9 when the intervening multienzymes are deleted (strain K7N3) (Fig. 3a).
Analysis by ITC of binding between C DD 4 SDM or C DD 4 helix swap and N DD 5 revealed the complete absence of interaction (Supplementary Fig. 8), and therefore that the introduced modifications were sufficient to disrupt communication between the native pair. Thus, the continued production of stambomycins 1 by K7N4, CPN4 and CPN5 harboring Pks5–Pks8 must be due to additional contacts between Pks4 and Pks5 beyond the docking domains, likely including the compatible ACP 13 /KS 14 interface. On the other hand, no interaction was detected between C DD 4 SDM and N DD 9, showing that this limited number of mutations was inadequate to induce productive contacts. This result is fully in accord with the absence of the expected mini-stambomycin products from these strains (K7N1/CPN1, Fig. 3a).
By contrast, the C DD 4 helix swap exhibited essentially the same binding to N DD 9 Val as C DD 8 (K d = 21.0 ± 0.3 µM), demonstrating that exchange of just this helix is sufficient to redirect docking specificity 30. Thus, inefficient docking is not at the origin of the failure of the C DD 4 helix swaps to yield chain-extended products in vivo (strains K7N2/CPN2, Fig. 3a). We could therefore conclude at this stage that the problem arose from the non-native interface generated between ACP 13 and KS 21, poor acceptance by KS 21 of the incoming substrate during chain transfer and/or chain extension, and/or low activity towards the modified chain of domains/modules acting downstream.
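To put the measured dissociation constant in perspective, it can be converted into a standard binding free energy and a fractional occupancy. The sketch below uses the textbook relations ΔG° = RT ln(K d / 1 M) and, for a single independent site with the partner in excess, fraction bound = [L] / (K d + [L]). The temperature (298.15 K) and the simple one-site model are assumptions made for illustration; the ITC temperature is not stated in this section and the actual complex has 2:2 stoichiometry.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1

def binding_free_energy_kj(kd_molar, temperature_k=298.15):
    """Standard binding free energy dG = RT ln(Kd / 1 M), in kJ/mol."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return GAS_CONSTANT * temperature_k * math.log(kd_molar) / 1000.0

def fraction_bound(free_partner_molar, kd_molar):
    """Occupancy of one independent site at a given free partner concentration."""
    return free_partner_molar / (kd_molar + free_partner_molar)

kd = 21.0e-6  # Kd from the text, in mol/L
delta_g = binding_free_energy_kj(kd)           # assumed 298.15 K
half_occupancy = fraction_bound(21.0e-6, kd)   # partner concentration equal to Kd
```

A micromolar K d of this kind corresponds to a modest binding energy, in line with the idea that docking alone does not guarantee productive chain transfer between subunits.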
Attempted optimization of the stambomycin DD mutants. We aimed next to improve the novel Pks4/Pks9 intersubunit interface in strain CPN2 ( C DD 4 helix swap + deletion of Pks5–8) by targeting helix αI of ACP 13, as the first 10 residues of this helix have been implicated previously in governing the interaction with the downstream KS domain at hybrid junctions 26. Notably, multiple sequence alignment of all ACPs in the stambomycin PKS located at intersubunit junctions revealed a unique sequence for each ACP in the helix αI region, consistent with a recognition 'code' for the KS partner, and the idea that mismatching these contacts might hamper productive chain transfer (Supplementary Fig. 9). Indeed, as mentioned previously, even when docking is interrupted, contacts between ACP 13 and KS 14 are apparently sufficient to enable chain translocation between Pks4 and Pks5 (Fig. 3a). In addition, an analogous strategy of optimizing the ACP n /KS n+1 chain transfer interface was shown recently to substantially improve interaction between an ACP (JamC) derived from the jamaicamide B biosynthetic pathway, and the first chain extension module of the lipomycin PKS (LipPKS1) 41.
In our case, the first six residues of ACP 13 helix αI were modified using CRISPR-Cas9 (EADQRR → PSERRQ), so that the full 10-residue recognition sequence matched that of ACP 20, the natural partner of KS 21 (Supplementary Fig. 9). Analysis of extracts of the resulting strain CPN2/OE484/ACP13 SDM by HPLC-MS revealed at best minute amounts (highest yield of 0.5 µg mL−1) of target cyclic mini-stambomycins A/B (11), lacking the hydroxyl group introduced by SamR0478 (Fig. 4 and Supplementary Fig. 9). Thus, while this experiment finally yielded the first evidence for successful chain transfer between Pks4 and Pks9 followed by subsequent chain extension by Pks9 and TE-catalyzed release, the overall efficiency of the system remained poor. Interestingly, however, the yields of the four shunt metabolites 2–5 were as much as 48-fold higher from the ACP 13 helix swap mutant than from CPN2/OE484, showing that improved interactions between ACP 13 and KS 21 facilitated release of the stalled intermediates from ACPs 12 and 13, presumably via remote action by the TEI domain.
[...] Supplementary Fig. 10). Both of these modifications were introduced into S. ambofaciens using CRISPR-Cas9, while simultaneously removing Pks5–Pks8, yielding respectively, after co-transformation with pOE484 and the control plasmid pIB139, strains ATCC/OE484/hy59_S1, ATCC/pIB139/hy59_S1, ATCC/OE484/hy59_S2, and ATCC/pIB139/hy59_S2. Analysis of culture extracts revealed the presence in both ATCC/OE484/hy59_S1 and ATCC/OE484/hy59_S2, relative to the controls, of a novel series of 37-membered metabolites (Fig. 4). The measured masses were consistent with the desired mini-stambomycins either as their free acids or in cyclic form (metabolites 10–12, Fig. 4).
Signals corresponding to the A/B and C/D derivatives of all metabolites were detected, providing important evidence for their identities, as well as both the C-14 hydroxylated 12 and non-hydroxylated 11 forms of the cyclic mini-stambomycins (C-14 corresponds to C-28 in the parental compounds (Fig. 1)). It is not surprising that the corresponding E and F forms were not detected, as their yields even from the wild type are much lower than the A–D derivatives (Fig. 3a). The observation of non-hydroxylated 11 shows notably that internal hydroxylation by SamR0478 is not an absolute prerequisite for TE-catalyzed macrolactonization, and argues that hydroxylation of the mini-stambomycins only takes place on the macrocyclic compound. Although compounds 11 and 12 incorporate the tetrahydropyran moiety of the parental stambomycins 1 which undergoes glycosylation, derivatives bearing β-mycaminose were not observed, presumably due to poor recognition of the overall modified macrocycle by glycosyl transferase SamR0481 23.
As observed previously, the strains also produced substantial quantities of the shunt products 2–5 (inactivation of samR0479 led correspondingly to the dehydroxy versions of these compounds 6–9 (Supplementary Figs. 7 and 10)). The yields were ca. 80-fold higher than those of the corresponding mini-stambomycins, with the highest titers observed in the strain incorporating the hybrid KS 14 /KS 21. The amount of shunt metabolites was also approximately 123-fold higher than from strain CPN2/OE484 (which incorporates an ACP 13 -C DD 4 swap/ N DD 9 -KS 21 interface) (Figs. 3a and 4, Supplementary Table 5).
Thus, contrary to expectation, although using the KS as a fusion site improved communication between Pks4 and Pks9, it also substantially boosted TEI-mediated off-loading of stalled upstream intermediates.
In principle, such stalling could result from a slow rate of chain extension in the now hybrid acceptor module (for example, in the full KS swap construct, KS 14 and ACP 21 are completely mismatched for chain extension). To evaluate this idea, we modified ACP 21 within ATCC/OE484/hy59_S1 incorporating the full-length KS 14, targeting a sequence region previously identified as mediating intramodular communication between the KS and ACP during chain extension (Supplementary Fig. 11) 19,26. Specifically, we exchanged loop 1 and the initial portion of helix αII of ACP 21 for the corresponding sequence of ACP 14, using CRISPR-Cas9 (Supplementary Fig. 11). As we anticipated that creation of this substantially hybrid ACP might engender structural perturbation, we also engineered a minimal mutant of ACP 21 in which only one of the two most critical residues in the recognition motif was mutated to the corresponding amino acid in ACP 14 (G 1499 of Pks9 → D; the second residue, R, of the motif is already common to the two ACPs) (Supplementary Fig. 10). Analysis of the loop/helix αII swap by HPLC-MS showed that all mini-stambomycin production had been abolished (Supplementary Fig. 10), consistent with the anticipated disruption to ACP 21 structure. Production by the ACP site-directed mutant was not any better than by the full KS swap construct (Fig. 4 and Supplementary Fig. 10), as only metabolite 11 remained above the limits of detection.
In principle, the hybrid KS 14 [...] (Fig. 4, Supplementary Fig. 10) (and correspondingly, 15, the dehydroxylated analogue of 14, was detected in the SamR0479 mutant (Supplementary Fig. 10)). The same metabolite 14 was identified from the ACP 21 G → D mutant (Fig. 4 and Supplementary Fig. 11), consistent with interrupted chain transfer to KS 22. Taken together, these data confirm module 22 as a new blockage point in the engineered systems.
Relative efficacy of PKS engineering using PCR-targeting and CRISPR-Cas9. As several of our core constructs were generated by both PCR-targeting and CRISPR-Cas9, we were able to directly compare the efficiency of the two techniques (Fig. 3 and Supplementary Fig. 4). Globally, our results confirm that both approaches can be employed to introduce large-scale modifications to PKS biosynthetic genes (i.e. deletions of single or multi-gene regions) 32,44–46. We have also demonstrated, for only the second time to our knowledge, that CRISPR-Cas9 can be leveraged to specifically modify modular PKS domains 47. Of the two methods, CRISPR-Cas9 was the more rapid, as the corresponding constructs were engineered in approximately half of the time. In addition, while CRISPR-Cas9 allowed for direct modification of the host genome, PCR-targeting relied on the availability of suitable cosmids housing the target genes, and resulted in a 33 bp attB-like 'scar' sequence in the genome (Supplementary Fig. 4) 48. In addition to hampering iterative use of the approach, the scar apparently provoked a moderate reduction in stambomycin yields in mutant K7N6 compared to the wild type, an effect also noted upon comparison of several analogous mutant strains (e.g. K7N4 vs. CPN4, Fig. 3). Nonetheless, we did encounter certain difficulties with use of CRISPR-Cas9 (i.e. failure to obtain construct CPN3, occasional reversions to wild type, etc.), observations motivating ongoing efforts in other laboratories to further enhance the suitability of CRISPR-Cas9 for editing PKS pathways 47,49–54.

Discussion
In this work, we have utilized an approach based on the state-of-the-art in PKS engineering to modify the stambomycin PKS (Fig. 5). Specifically, we aimed to remove the four PKS subunits between Pks4 and Pks9 in the assembly line which together house seven chain extension modules, to generate a series of 37-membered 'mini-stambomycins'. While in principle such a change might have been possible by directly fusing Pks4 and Pks9 via a suitable intermodular linker, this approach would have resulted in a heptamodular subunit whose size is far in excess of the tetramodular multienzymes present in the system. We have also demonstrated recently the low efficacy of this strategy when the module downstream of the linker is N-terminal in its native subunit context (as with module 21 of Pks9) 18.
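The relationship between the number of deleted modules and the size of the resulting macrocycle can be checked with simple arithmetic, since each chain extension module contributes a two-carbon unit to the polyketide backbone. The sketch below is an illustrative check only; the function names are hypothetical, and it assumes that ring closure occurs at the equivalent hydroxyl group in the contracted product.

```python
CARBONS_PER_EXTENSION_MODULE = 2  # each extension adds a C2 unit to the backbone

def contracted_ring_size(parent_ring_size, modules_removed):
    """Ring size after removing chain extension modules from the assembly line."""
    return parent_ring_size - CARBONS_PER_EXTENSION_MODULE * modules_removed

def renumbered_position(parent_position, modules_removed):
    """New number of a backbone carbon installed upstream of the deleted modules."""
    return parent_position - CARBONS_PER_EXTENSION_MODULE * modules_removed

mini_ring = contracted_ring_size(51, 7)        # Pks5-8 house seven modules
mini_hydroxyl = renumbered_position(28, 7)     # C-28 of the parental compounds
print(mini_ring, mini_hydroxyl)
```

The same bookkeeping accounts for the correspondence noted in the Results between C-14 of the mini-stambomycins and C-28 of the parental compounds.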
As an initial approach (Fig. 5), we modified C DD 4 to render it compatible with N DD 9, with the aim of inducing productive communication between Pks4 and Pks9, while leaving all modular units intact. This modified PKS relied for function on both a non-native chain transfer interface (ACP 13 /KS 21 ), and the intrinsic tolerance of the downstream KS/modules to the incoming substrate. We were optimistic this experiment might work given the structural similarities between the native substrates of KS 14 and KS 21, at least directly adjacent to the acyl terminus, as well as the fact that the stambomycin PKS generates a small family of metabolites, and therefore must exhibit some intrinsic tolerance to structural variation.
Although we showed in vitro with recombinant DD pairs that a docking helix-swapped mutant of C DD 4 communicated effectively with N DD 9, chain transfer across the engineered interface did not occur in vivo, as evidenced by the accumulation of multiple shunt products. While our attempt to render the ACP 13 /KS 21 junction more native by site-directed mutagenesis did result in certain target metabolites, the most significant effect was to increase the yields of the truncated chains.
Having narrowed down the biosynthetic block to events occurring downstream of the engineered junction, we next carried out interface engineering based on proposed alternative module boundaries, leveraging fusion points within the KS domain (Fig. 5). In this case, sites were selected to either maintain essentially the whole of KS 14, or to create a hybrid KS 14 [...]
Our results also showcase the intrinsically high tolerance of the Pks9 TEI domain towards shorter substrates. Indeed, the data also demonstrate that this TEI domain participates in off-loading the shunt metabolites from the upstream subunit, and that this activity interferes with passage of the chain to subsequent modules. Unfortunately, our attempts to boost yields of the mini-stambomycins by engineering the condensation interface between KS 14 and ACP 21 were unsuccessful, both when the full ACP 21 recognition loop/helix αII region was swapped for that of ACP 14, and when a single site-directed mutation was made at a putatively critical position (Fig. 5). This result is surprising in light of the beneficial effects reported in vitro of both of these modifications on chain extension carried out by mismatched KS and ACP domains sourced from the erythromycin PKS (DEBS) 26. Apparently, the introduced changes were not sufficient to ensure effective communication between the KS 14 and ACP 21 domains (or were in fact deleterious to function), and/or any benefit was masked by the poor tolerance of the downstream modules to the modified intermediates.
To fully judge the efficacy of this work, it is instructive to compare it to the other two examples in the literature in which full biosynthetic systems have been re-engineered to remove multiple internal modules 10. In the first, recently-reported case, the neoaureothin (Nor) hexamodular PKS was 'morphed' into the evolutionarily-related aureothin (Aur) tetramodular PKS by removing the second bimodular subunit, NorA′. As in our work, the authors initially attempted to engineer a new interaction between the monomodular subunits NorA and NorB flanking NorA′ using compatible docking domains, by exchanging the type 1b N DD of NorB for the type 1a N DD of NorA′ (the natural partner of NorA). When the target metabolite was not obtained, they relocated the fusion site to the KS-AT linker downstream of the conserved KS region in NorB, thereby maintaining the native NorA ACP-C DD/ N DD-KS NorA′ junction.
Ultimately, several linker variants had to be evaluated before a functional sequence was identified, in part by serendipity (indeed it is one residue longer than the native linker). Overall, the yields of the targeted chain-shortened metabolites were reduced approximately 18-fold compared to the parental neoaureothin (to ca. 2.5 mg L−1), a much less significant penalty than engendered by our engineering strategy. Presumably, the superior titers obtained in this experiment reflect the much higher intrinsic amenability of the Nor PKS to conversion into an Aur PKS, as the Nor PKS likely evolved from an Aur PKS by subunit insertion 10.
Nevertheless, the newly-created NorA/NorB interface was also only partially functional, as product corresponding to the intermediate generated by iterative action of the upstream subunit NorA was still obtained.
The second relevant investigation concerns the accelerated evolution (AE) of the RAPS PKS, based on spontaneous induced homologous recombination between its component modules 43. As mentioned earlier, several of the resulting systems incorporated intermodular fusion sites essentially at the mid-point of the respective KS domains, and so can be compared to our best performing construct hy59_S2. Notably, yields from the hybrid RAPS PKSs from which either 3 or 6 modules were removed were reduced by a maximum of 33% relative to that of the parental compound. We propose two explanations for the higher functionality of these systems relative to hy59_S2. First of all, in every case, the module downstream of the newly-formed junctions in the contracted RAPS PKSs was internal to its respective
Taken together, this set of results shows that contracting PKS systems represents a viable approach to accessing truncated polyketide derivatives of variable length, including macrocycles. Whether such systems are generated rationally or using an AE process, the most efficient hybrids will likely result: i) from PKSs whose modules (and in particular KS domains) exhibit a substantial degree of mutual sequence identity and thus intrinsically high substrate tolerance (or which can be adapted by mutagenesis to broaden their specificity 29); and, ii) when novel junctions are created with downstream modules which are situated at internal positions within their subunits. The data also reinforce the idea that in cases where communication at modified interfaces occurs via non-covalent protein-protein interactions, at least a portion of the KS downstream from the docking domains should be included to boost efficiency 10,11,16. Finally, our work has identified an increase in TEI-mediated proof-reading provoked by such interface engineering. Elucidating the mechanism underlying this unexpected intersubunit release activity, and thus how to effectively suppress it, should be a profitable avenue for further boosting product titers.

Methods
Bioinformatics analysis. To underpin the interface engineering strategy, the extremities of all the stambomycin PKS subunits were analyzed to identify the boundaries of the most C-terminal and N-terminal functional domains (ACP and KS, respectively), and thus the regions potentially containing docking domains (DDs). The resulting sequences were compared by multiple sequence alignment using Clustal Omega (https://www.ebi.ac.uk/Tools/msa/clustalo) 57, to bona fide and putative DD sequences from multiple DD classes, including those present at the DEBS 2/DEBS 3 interface (type 1a, PDB IDs: 1PZQ, 1PZR 22) and the PikAIII/PikAIV junction (type 1b, PDB ID: 3F5H 27), to allow for type classification. To identify suitable boundaries for DD heterologous expression in E. coli, the secondary structure of the putative DD regions was predicted using PSIPRED 4.0 (http://bioinf.cs.ucl.ac.uk/psipred/) 37. Analysis for potential specificity-conferring residues in the stambomycin PKS ketosynthase (KS) domains was carried out by multiple sequence alignment against model KS domains 14,20,29, using Clustal Omega 57.
General methods. All reagents and chemicals were obtained from Sigma-Aldrich, except the following: BD (tryptone, yeast extract, TSB powder), Thermo Fisher Scientific (Tris), VWR (glycerol, NaCl, NaNO 3), ADM, France (NutriSoy flour), and New England Biolabs (T4 DNA ligase, restriction enzymes). Oligonucleotide primers and two additional synthetic DNA fragments for CPN4 and CPN5 constructs were synthesized by Sigma-Aldrich (Supplementary Table 2). The docking domains N DD 9 Val and N DD 9 Met (Supplementary Table 1) were obtained as synthetic peptides from GeneCust. DNA sequencing of PCR products was performed by Sigma-Aldrich and Eurofins.
PCR reactions were performed with Taq DNA polymerase (Thermo Fisher Scientific) or Phusion High-Fidelity DNA polymerase (Thermo Fisher Scientific) when higher fidelity was required. Isolation of DNA fragments from agarose gel, purification of PCR products and extraction of plasmids were carried out using the NucleoSpin® Gel and PCR Cleanup or NucleoSpin® Plasmid DNA kits (Macherey-Nagel, Hoerdt, France).
Strains and media. E. coli BL21 strains were obtained from Novagen. Unless otherwise specified, all E. coli strains were cultured in LB medium (yeast extract 10 g, tryptone 5 g, NaCl 10 g, distilled water up to 1 L, pH 7.0) 58 or on LB agar plates (LB medium supplemented with 20 g L−1 agar) at 37°C. Streptomyces ambofaciens ATCC23877 and the derived mutants were grown in TSB (TSB powder 30 g (tryptone 17 g, soy 3 g, NaCl 5 g, K 2 HPO 4 2.5 g, glucose 2.5 g), distilled water up to 1 L, pH 7.3) or on TSA plates (TSB medium supplemented with 20 g L−1 agar), and sporulated on SFM 59 agar plates (NutriSoy flour 20 g, D-mannitol 20 g, agar 20 g, tap water up to 1 L) at 30°C. All strains were maintained in 20% (v/v) glycerol in 2 mL Eppendorf tubes and stored at −80°C.
For fermentation of S. ambofaciens ATCC23877 and its mutants, spores were streaked on TSA with appropriate antibiotics and, after incubation for 48 h at 30°C, a loop of mycelium was used to inoculate 7 mL of MP5 medium (yeast extract 7 g, NaCl 5 g, NaNO 3 1 g, glycerol 36 mL, MOPS 20.9 g, distilled water up to 1 L, pH 7.4) supplemented with selective antibiotics and sterile glass beads, followed by incubation at 30°C and 200 rpm for 24–48 h. Finally, the seed culture was centrifuged and resuspended in 2 mL fresh MP5 before being inoculated into 50 mL MP5 medium in a 250 mL Erlenmeyer flask, and cultivated at 200 rpm and 30°C for 4 days.
The two systems differ in the way in which Cas9 is expressed; in the case of pCRISPomyces-2, the nuclease is expressed constitutively, while in the pCRISPR-Cas9 system, its expression is under inductive control by thiostrepton (Tsr). The crRNA sequence was selected to match the DNA segment which contains NGG at its 3' end (N is any nucleotide, and the NGG corresponds to the protospacer-adjacent motif (PAM)). The annealed crRNA fragment and two homologous arms (HAL and HAR, flanking the target region) were sequentially inserted into the delivery plasmid pCRISPomyces-2 using the respective restriction sites BbsI and XbaI, to afford the specific recombinant plasmid for each mutant (Supplementary Fig. 4). Correspondingly, an sgRNA cassette (tracrRNA + sgRNA) and two homologous arms were inserted into the plasmid pCRISPR-Cas9 using sites NcoI, SnaBI and StuI, respectively (Supplementary Fig. 10). In addition, the crRNA was designed to be located within the region to be deleted (Supplementary Fig. 4) to avoid Cas9-catalyzed cleavage occurring in the genome of the resulting mutant. In the case of site-directed mutants, additional DNA fragments containing the targeted mutations were inserted between the two homologous arms. In addition, the DNA sequence with the fragments identical to the crRNA was modified, so as to avoid subsequent Cas9-catalyzed cleavage of the obtained mutants (Supplementary Figs. 9 and 11).
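The PAM-based selection rule described above can be illustrated with a minimal sketch. It assumes a 20-nt protospacer (a length not stated in the text), scans only the strand given, and uses hypothetical function names; it is not the procedure the authors used to pick their crRNA sequences, only an illustration of requiring NGG at the 3' end and of restricting candidates to the region to be deleted.

```python
import re

def find_protospacers(seq, length=20):
    """Return (start, protospacer, pam) for every site on the given strand
    where `length` nucleotides are immediately followed by an NGG PAM.
    The 20-nt default is an assumption, not a value taken from the text."""
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidate sites be reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

def sites_inside_region(seq, region_start, region_end, length=20):
    """Keep only sites whose protospacer plus PAM lie fully inside
    [region_start, region_end), e.g. the segment that will be deleted,
    so that the edited genome no longer carries the target site."""
    return [hit for hit in find_protospacers(seq, length)
            if hit[0] >= region_start and hit[0] + length + 3 <= region_end]

if __name__ == "__main__":
    toy = "A" * 20 + "TGG" + "CCCC"  # toy sequence, purely illustrative
    for start, spacer, pam in find_protospacers(toy):
        print(start, spacer, pam)
```

A complete search would also scan the reverse complement; that step is omitted here to keep the sketch short.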
Overexpression and purification of docking domains. The wild-type docking domains (C DD 4, N DD 5, C DD 8, N DD 9 Val and N DD 9 Met) and mutant docking domains (C DD 4 SDM, C DD 4 helix swap) were amplified from genomic DNA of S. ambofaciens wild type and the relevant mutants, using forward and reverse primers incorporating BamHI and HindIII restriction sites, respectively (Supplementary Table 2). The PCR amplicons were digested using FD BamHI and FD HindIII, and then ligated into the equivalent sites of vector pBG-102 (Center for Structural Biology, Vanderbilt University). In the case of all C DDs which lacked aromatic residues, a tyrosine residue (codon TAT incorporated in the forward primer, Supplementary Table 1) was added at the N-terminal ends (so as not to interfere with docking with the N DD partner) to allow efficient monitoring by UV-Vis during the purification, as well as reliable measurement of protein concentration necessary for binding studies by ITC.
The resulting constructs pBG102-N DD 5, pBG102-C DD 8 and pBG102-N DD 9 were used to transform E. coli BL21 (DE3). The constructs for C DD 4 and its mutants were transformed into Rosetta™ 2(DE3), as they contain 8 codons rarely used in E. coli. Positive transformants were selected on LB agar supplemented with kanamycin (50 µg mL−1) (25 µg mL−1 chloramphenicol was also added for expression in Rosetta™ 2(DE3)). A single colony was transferred to LB (10 mL) supplemented with antibiotics, and the culture grown at 37°C and 200 rpm overnight. A 1 mL aliquot of the overnight culture was used to inoculate LB medium (1 L) supplemented with appropriate antibiotics, and then incubated at 37°C and 200 rpm to an optical density of 0.8, at which point protein synthesis was induced by the addition of IPTG (final concentration 0.1 mM). After incubation at 18°C and 200 rpm for 18 h, cells were collected by centrifugation at 8000g for 30 min, resuspended in 40 mL protein purification buffer A (50 mM Tris-HCl, 400 mM NaCl, 10 mM imidazole, pH 8.0), and lysed by sonication. Following centrifugation at 20000g and filtration using a 0.45 µm membrane, the soluble cell lysates were loaded onto 2 × 5 mL HisTrap HP (GE Healthcare) columns (two 5 mL columns in series) equilibrated in buffer A, and purified by preparative chromatography using an ÄKTA Avant system. The following program was applied: sample loading, 1 mL min−1; washing, 2 mL min−1, 10 column volumes of buffer A; elution, 2 mL min−1, 5 column volumes of buffer B (50 mM Tris-HCl, 400 mM NaCl, 250 mM imidazole, pH 8.0); elution, 2 mL min−1, 2 column volumes of buffer C (50 mM Tris-HCl, 400 mM NaCl, 500 mM imidazole, pH 8.0).
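As a rough planning aid, the buffer volumes and run times implied by the chromatography program above can be worked out as follows. This is only a sketch: it assumes that one column volume corresponds to the combined 10 mL bed of the two 5 mL columns in series (the text does not say whether a column volume refers to 5 mL or 10 mL), and the names used are hypothetical.

```python
# Assumption: two 5 mL HisTrap HP columns in series behave as one 10 mL bed.
COLUMN_VOLUME_ML = 2 * 5.0

# (step, flow rate in mL/min, number of column volumes), as listed in the program.
PROGRAM = [
    ("wash, buffer A", 2.0, 10),
    ("elution, buffer B", 2.0, 5),
    ("elution, buffer C", 2.0, 2),
]

def step_volume_and_time(flow_ml_min, column_volumes, cv_ml=COLUMN_VOLUME_ML):
    """Return (buffer volume in mL, duration in min) for one program step."""
    volume = column_volumes * cv_ml
    return volume, volume / flow_ml_min

def total_program(program, cv_ml=COLUMN_VOLUME_ML):
    """Sum buffer volume and duration over all steps (sample loading excluded)."""
    volumes_times = [step_volume_and_time(flow, cvs, cv_ml) for _, flow, cvs in program]
    return (sum(v for v, _ in volumes_times), sum(t for _, t in volumes_times))

if __name__ == "__main__":
    for name, flow, cvs in PROGRAM:
        vol, minutes = step_volume_and_time(flow, cvs)
        print(f"{name}: {vol:.0f} mL over {minutes:.0f} min")
    print("total (mL, min):", total_program(PROGRAM))
```

If a column volume is instead taken as 5 mL, passing `cv_ml=5.0` halves every volume and duration.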
All His 6 -SUMO-tagged proteins were collected (fractions containing the protein of interest were selected based on the UV chromatogram and SDS-PAGE gel), and transferred into a dialysis bag containing His 6 -tagged human rhinovirus 3C protease (H3C) (1–2 µM). The dialysis bag was then placed into a container filled with buffer D (50 mM Tris-HCl, 400 mM NaCl, pH 8.0), and the cleavage allowed to proceed at 4°C overnight. The resulting proteins, which incorporated a non-native N-terminal GPGS sequence, were then separated from the remaining His 6 -tagged SUMO and His 6 -tagged human rhinovirus 3C protease by reloading onto the 2 × 5 mL HisTrap HP columns pre-equilibrated in buffer A. Purification was then carried out with the following program: sample loading, 1 mL min−1; washing, 2 mL min−1, 4 column volumes of buffer A; elution, 2 mL min−1, 2 column volumes of buffer B; elution, 2 mL min−1, 2 column volumes of buffer C. The untagged docking domains passed through the column during the washing step, and were collected and concentrated to 5–7 mL using an Amicon Ultra 3000 MWCO centrifuge filter (Millipore
in the syringe. In the case of the binding experiments between N DD 9 Met and C DD 8, the C DD 8 (700 µM) was added to the N DD 9 Met (80 µM in the cell), while for the binding between N DD 9 Val and the C DDs (C DD 8, C DD 4 wild type, C DD 4 SDM and C DD 4 helix swap), the C DDs (700 µM) were added to N DD 9 Val in the cell (120 µM). The ITC experiments were then carried out as follows: initial waiting time 120 s, initial injection of 0.5 µL over 1 s followed by 19 serial injections of 2 µL over 4 s, separated by an interval of 120 s. For each experiment, the reference power was set to 5 µcal s−1, stirring speed to 750 rpm, and the high feedback mode was selected. Two independent titrations were performed for each combination of DDs.
The heat of reaction per injection (µcal s−1) was determined by integration of the peak areas using the Origin 7.0 (OriginLab) software, assuming a one-site binding model (consistent with the solved structures of the types of DDs 22,27), yielding the best-fit values for the heat of binding (ΔH), the stoichiometry of binding (N) and the dissociation constant (K_d). The heats of dilution of the DDs were determined by injecting them into the cell containing buffer only, and these were subtracted from the corresponding binding data prior to curve fitting.
In some cases, when a plateau (binding saturation) was not reached at the final titration step, and the problem could not be solved by increasing the concentration of DD in the syringe, we initially placed the C DD/ N DD complex in the ITC cell (at the concentration of the two partners reached in the previous titration), filled the syringe with additional DD, and performed a second titration experiment. This procedure was then repeated until binding saturation was reached. To fit the data, the MicroCal ITC concatenation software was used to combine the two ITC data files. Most importantly, the critical dimensionless parameter (C-value) was calculated as follows:
$C = K_a \cdot [M]_T \cdot N$
where K_a is the binding constant, [M]_T is the total macromolecular concentration in the cell, and N is the stoichiometry of interaction. A reliable ITC binding isotherm is evidenced by ITC data with C-values > 1 (the optimal range is 5 < C < 500) 64
HPLC-MS analysis of fermentation metabolites and purified docking domains. The fermentation broth of Streptomyces was centrifuged at 4000g for 10 min. As described previously 23, the stambomycins and their derivatives were then extracted from the mycelia, by first resuspending the cells in 40 mL distilled water, followed by centrifugation (4000g, 10 min, repeated 3×) to remove water-soluble components. After decanting the water, the cell pellets were weighed and extracted with methanol by shaking at 150 rpm for 2 h at room temperature. Thereafter, the methanol extracts were filtered to remove the cell debris, followed by rotary evaporation to dryness. The obtained extracts were then dissolved in methanol, the volume of which was determined according to the initial weight of the mycelia (70 µL methanol per 1 g of initial cell pellet).
V, −28 V and −6 V, respectively.
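Returning to the ITC analysis, the C-value criterion can be checked with a short calculation. The sketch below uses the standard relation K_a = 1/K_d together with the thresholds quoted in the text (C > 1 for a reliable isotherm, 5 < C < 500 as the optimal range). The K_d values in the usage example are hypothetical placeholders, not measured results, and the function names and category labels are illustrative only.

```python
def c_value(kd_molar, cell_conc_molar, n=1.0):
    """Dimensionless C = K_a * [M]_T * N, with K_a = 1 / K_d (all in molar units)."""
    if kd_molar <= 0:
        raise ValueError("K_d must be positive")
    return n * cell_conc_molar / kd_molar

def classify_c(c):
    """Label a C-value using the thresholds quoted in the text.
    The label strings themselves are illustrative, not the authors' terms."""
    if c <= 1:
        return "unreliable"
    if 5 < c < 500:
        return "optimal"
    return "reliable"

if __name__ == "__main__":
    # 120 µM in the cell (as used for N DD 9 Val); K_d of 10 µM is hypothetical.
    c = c_value(kd_molar=10e-6, cell_conc_molar=120e-6)
    print(round(c, 2), classify_c(c))
    # 80 µM in the cell (as used for N DD 9 Met); K_d of 100 µM is hypothetical.
    c = c_value(kd_molar=100e-6, cell_conc_molar=80e-6)
    print(round(c, 2), classify_c(c))
```

For weak interactions the calculation makes clear why a higher cell concentration, or the sequential titration procedure described above, is needed to bring C above 1.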
Due to the much lower sensitivity of the Orbitrap LTQ XL relative to the Orbitrap ID-X Tribrid, as evidenced by comparative analysis of identical samples on the two instruments, we applied a 10× correction factor to the yields determined using the Orbitrap LTQ XL (Supplementary Table 5).
The purified docking domains in buffer GF were diluted with Milli-Q water to a concentration of 50 µM and injected onto an Alltima™ C18 column (2.1 × 150 mm, 5 µm particle size). Analysis was carried out with Milli-Q water containing 0.1% TFA (A) and acetonitrile containing 0.1% TFA (B), using the elution
Table 6) were identified using SIEVE 2.0 screening software, applying the default settings for component extraction of small molecules, except that of the base peak minimum intensity, which was set to 5000000.
Quantification of metabolites. To quantify the yields of native stambomycins and the newly-generated derivatives by HPLC-MS, we generated a calibration curve with previously-purified stambomycin A/B 1 as the standard (using a concentration range between 1 µg L−1 and 50 mg L−1). This approach yielded a linear correlation between the quantity of 1 and the respective peak area in the extracted ion chromatogram (EIC) (the areas of all peaks corresponding to the parental ions were used) (Supplementary Fig. 12, Supplementary Table 5). The titers of all stambomycin derivatives were then determined using this calibration curve, based on the peak areas corresponding to the parental ions in their respective EICs (Supplementary Table 5). As the limit of detection in these experiments lay between 1 and 10 µg L−1, yields in this range must be viewed as estimates.
is inter-modular, while the ACP/KS chain transfer contacts (right-pointing arrows) are intra-modular.
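The calibration-based quantification described under "Quantification of metabolites" can be sketched as an ordinary least-squares line relating standard concentration to EIC peak area, which is then inverted to convert a sample's peak area into a titer. The standard concentrations and peak areas below are synthetic placeholders chosen only to make the example run; they are not data from this work, and the function names are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def titer_from_area(area, slope, intercept):
    """Invert the calibration line: concentration corresponding to a peak area."""
    return (area - intercept) / slope

def is_estimate(titer_ug_per_l, lod_upper_ug_per_l=10.0):
    """Flag titers at or below the upper bound of the stated limit-of-detection
    range (1 to 10 µg/L), which the text says must be viewed as estimates."""
    return titer_ug_per_l <= lod_upper_ug_per_l

if __name__ == "__main__":
    # Synthetic standards (µg/L) and peak areas following area = 2000 * conc + 500.
    conc = [10.0, 100.0, 1000.0, 10000.0]
    area = [2000.0 * c + 500.0 for c in conc]
    slope, intercept = fit_line(conc, area)
    sample_titer = titer_from_area(100500.0, slope, intercept)
    print(round(sample_titer, 3), is_estimate(sample_titer))
```

Summing the areas of all peaks assigned to the parental ions before calling `titer_from_area` would mirror the way peak areas are pooled in the text.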

Figure 3
Analysis of metabolites derived from PKS engineering based on the classical module definition. a HPLC-PDA analysis at λmax 238 nm of stambomycins 1 from the wild type strain and various mutants. Quantification of all derivatives (Supplementary Table 5) was based on comparison to the yields of the wild type stambomycins 1 as determined using a calibration curve (Supplementary Fig. 13) (30 mg L−1 total yield of stambomycins A/B and C/D, set to 100%). b LC-ESI-HRMS analysis of mutants in which 1 was absent revealed a series of shunt products (2-5) (the average yields (two measurements) relative to 1 in the wt are indicated). Shown are the extracted ion chromatograms (EICs) of 2-5, using the calculated m/z values shown in Supplementary Fig. 5. c LC-ESI-HRMS analysis of several CPN2-derived mutants (the yields of shunt products 2-5 are shown relative to 1 in the wt (average of four measurements)). Notably, the combined yield of 2-5 in mutant ATCC/OE484/Pks4+TEI was 17-fold higher than that from CPN2/OE484. A series of new compounds 6-9 was generated in strain CPN2/OE484 in which the gene samR0479 was deleted. d Chemical structures of shunts 2-9. The structural differences among them are highlighted (green = R group; red = hydroxyl). Shunt products 2, 4, 6 and 8 correspond to stambomycin C/D derivatives, and 3, 5, 7 and 9 to stambomycin A/B derivatives. M12 and M13 indicate shunt compounds released from modules 12 and 13, respectively.

Figure 4
Engineering of functional mini-stambomycin PKSs. The various strategies used in each case are represented schematically, along with the obtained products and their maximum yields (estimated for <10 mg L−1) (full analysis of all constructs is provided in Supplementary Table 5). The engineering starting
Summary of the engineering strategies applied in this work to the stambomycin PKS. Inset are the six distinct approaches used, and the structures of the resulting metabolites are shown. The strategies giving rise to the target mini-stambomycins 10−12 are indicated in red. The hydroxyl group shown in pink is introduced by the P450-hydroxylase SamR0478, and that in red, by SamR0479.
