Development of solubility biosensor for PKS. In our previous AT domain swapping effort, we observed a correlation between the solubilities and in vitro activities of hybrid PKSs6. We reasoned that engineered PKSs that maintain a stable conformation, thereby avoiding aggregation, have a higher probability of exhibiting the expected activities. To test this hypothesis, we sought to develop E. coli biosensor strains that could detect protein misfolding.
Several methods have already been developed for assaying in vivo protein stability with fluorometric outputs20–23. However, these methods have not been tested with large, multidomain proteins such as PKSs. Heat-shock genes ibpA and fxs are highly expressed when misfolded proteins accumulate inside E. coli24. The promoters of those genes (Pibp20,25 and Pfxs26) were used to drive expression of the green fluorescent protein (GFP) gene (Pibp alone or Pibp and Pfxs in tandem) and integrated into the genome of E. coli BL21 (DE3) in the arsB gene, thereby creating ΔarsB::Pibp GFP and ΔarsB::Pibpfxs GFP. The arsB site encodes an arsenic membrane pump, which should be a neutral site under standard laboratory conditions. In parallel, we made ΔibpA::GFP by integrating the GFP gene in frame of the ibpA gene, consequently knocking out the native gene and appropriating the promoter. To assess how these biosensors react to PKSs with different levels of solubility, we used the sixth module of the erythromycin PKS (6-DeoxyErythronolide B Synthase) with the neighboring TE domain (DEBSM6) and two engineered versions, D0 and D1, that contained an AT from the epothilone PKS module 4 (EposM4) in place of the native AT with different swap junctions; DEBSM6 was previously shown to be more soluble than D1, which was more soluble than D06.
The three PKS variants, together with an empty vector control, were expressed from pET plasmids in the E. coli strains, and fluorescence was measured (Fig. 1a). The ibpA promoter integrated in the arsB locus was not induced by the highly soluble DEBSM6, whereas it was induced to a similar degree by D0 and D1. The tandem promoter Pibpfxs was more sensitive and was induced by DEBSM6, but to a lower degree than by the other two PKSs. Integration of GFP into the ibpA locus gave high background GFP fluorescence with the empty vector, but this background was still lower than the signal with DEBSM6, which in turn was lower than the signals with both D0 and D1.
Next, we investigated the use of fluorescent fusion tags to normalize biosensor activation to the amount of heterologous protein produced. Previous results from non-PKS proteins showed that the folding of a fluorescent fusion protein is affected by the folded state of the protein to which it is attached21. To see the response of fusion tags with multidomain proteins like PKSs, mCherry was attached to the C-terminus of the three reference PKSs, thereby creating DEBSM6 mCherry, D0 mCherry, and D1 mCherry. Although DEBSM6 mCherry had higher fluorescence than the other proteins, measuring cellular abundance of the expressed PKSs by SDS-PAGE quantification revealed that this was due to higher amounts of protein and not to differences in solubility (Fig. 1b). To investigate why the less soluble D0 had proportionally the same fluorescence as DEBSM6, we separated soluble and insoluble fractions and measured the amounts of expressed PKSs using SDS-PAGE and fluorescence (Fig. 1c, d). We observed that fusing mCherry to DEBSM6 did not change its solubility: the protein was still more than 95% soluble, whereas D0 mCherry had more than half of its protein content in the insoluble fraction. D1 mCherry was 75% soluble. The fluorescence in the different fractions mirrored the protein content, indicating that the chromophore still fluoresces even when attached to an insoluble protein and showing that PKSs offer a unique set of challenges unlike smaller proteins22,27.
To test biosensor activation across varying levels of expressed protein, we combined the mCherry-tagged PKSs with the E. coli strain that harbors the ΔarsB::Pibp GFP biosensor and induced protein expression by adding different concentrations of IPTG. Even at high IPTG concentrations, DEBSM6 mCherry only weakly induced the biosensor, while D0 mCherry gave high induction even at 50 µM (Fig. 1e). D0 mCherry showed higher GFP fluorescence than D1 mCherry, indicating that the biosensor can discriminate between the two proteins when appropriate IPTG concentrations are used. To simplify presentation of the solubility data, we define the solubility coefficient as mCherry fluorescence divided by GFP fluorescence, which reflects both the solubility of the PKS and how well it is expressed (Fig. 1f and Supplementary Fig. 1–4).
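The definition above can be sketched in a few lines of Python. This is an illustrative sketch only, not the analysis code used in this work: the function name, the optional background-subtraction arguments, and the example readings are assumptions. The text specifies only the ratio of mCherry to GFP fluorescence.

```python
def solubility_coefficient(mcherry, gfp, mcherry_bg=0.0, gfp_bg=0.0):
    """Return mCherry fluorescence divided by GFP fluorescence.

    mcherry_bg and gfp_bg are optional background values (an assumption,
    not described in the text); with the defaults of 0.0 the function is
    exactly the ratio defined above.
    """
    m = mcherry - mcherry_bg
    g = gfp - gfp_bg
    if g <= 0:
        raise ValueError("GFP signal must be greater than background")
    return m / g


# Hypothetical readings in arbitrary fluorescence units (not measured data).
well_folded = solubility_coefficient(mcherry=200.0, gfp=50.0)
misfolded = solubility_coefficient(mcherry=200.0, gfp=400.0)
print(well_folded, misfolded)
```

With equal mCherry signal, the sample that triggers less GFP from the misfolding biosensor receives the higher coefficient, which is the behavior the definition is meant to capture.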
Constructing an AT swapped PKS library. We created a library of engineered DEBSM6 with EposM4 AT variants carrying randomly assigned swap junctions in the KS-AT and the post-AT linker regions. The EposM4 AT natively uses either malonyl-CoA or methylmalonyl-CoA, unlike the native DEBSM6 AT which uses only methylmalonyl-CoA. To get a structural understanding of the linker regions around the DEBSM6 AT, we used the newly released AlphaFold28 to generate a homodimeric structure model without the TE domain (Fig. 2a). The predicted DEBSM6 structure showed a high degree of similarity with the experimentally solved structures of KS-AT didomains from DEBSM3 and DEBSM5 (56% and 57% sequence identity with DEBSM6 KS-AT)29,30 as well as the PKS module structures recently reported14,31. The predicted KS-AT linker consists of a disordered region starting from the KS which then forms three alpha helices surrounding three beta strands (Fig. 2b, d). The post-AT linker on the other hand wraps itself around the residues of the KS-AT linker and continues along the KS domain, interacting with several residues until it reaches the structural subdomain ψKR32 (Fig. 2c, e).
Next, we developed an in vitro method for creating a randomized mutagenesis library where each AT-swapped DEBSM6 variant had one random upstream and downstream junction. This was done using oligo pools, up to 350 base pair long oligonucleotides designed to each carry the swap junction at a different amino acid position (Fig. 3a). In total, 72 unique swap junction oligos (duplicate oligos in homology regions were removed) were designed for the KS-AT linker and 73 oligos were created for the post-AT linker, thereby making 5,256 possible combinations when randomly combining an upstream and downstream junction.
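The library size follows from pairing every upstream junction with every downstream junction. A minimal sketch of that arithmetic is given below; the oligo labels are placeholders for illustration and do not correspond to the actual amino acid positions of the designed oligos.

```python
from itertools import product

# Placeholder labels (assumption): one entry per designed swap-junction oligo.
upstream_oligos = [f"KS-AT_oligo_{i}" for i in range(1, 73)]      # 72 oligos
downstream_oligos = [f"post-AT_oligo_{i}" for i in range(1, 74)]  # 73 oligos

# Each library member carries one upstream and one downstream junction.
library = list(product(upstream_oligos, downstream_oligos))

print(len(upstream_oligos), len(downstream_oligos), len(library))
```

The product 72 × 73 gives the 5,256 combinations stated above.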
The oligo pools were used to amplify the EposM4 AT sequence using PCR and the resulting fragment mix was cloned into the AT position of DEBSM6 mCherry. Many of the resulting colonies of the library carried small deletions in the linker regions, possibly due to synthesis errors caused by the length of the oligos. Roughly 40% of colonies were visibly red indicating the presence and in-frame expression of mCherry.
Biosensor-guided screening of soluble PKSs. Around 800 colonies with a visible red color were induced for protein expression, and fluorescence was measured using flow cytometry (Fig. 3b). Forty colonies with high solubility were selected, and the corresponding plasmids were purified and sequenced to determine which positions were enriched in the high-solubility set. The highly soluble PKSs had swap junctions at amino acids us10-45 and us82-86 in the KS-AT linker and at ds18-65 in the post-AT linker (Fig. 3c, Supplementary Table 1). The gap at us46-81 was predicted by AlphaFold to contain alpha helix 2 (α2) and beta strand 2 (β2), both of which are deeply embedded within the KS-AT linker structure (Fig. 2b). AlphaFold also predicted that junctions at amino acids ds1-18 in the post-AT linker would lie inside the AT structure (Fig. 3d) and that those at ds65-90 would lie inside the ψKR domain, both of which would cause the PKS to be insoluble (Fig. 2c).
The 40 library colonies that showed high solubility were remeasured in triplicate, and a subgroup of five was selected to assess protein solubility by SDS-PAGE and enzyme activity: o4 (us85/ds21), o8 (us13/ds43), o15 (us21/ds64), o17 (us28/ds62) and o33 (us17/ds25) (Fig. 4a). The most soluble of these, o33, had close to the wild-type DEBSM6 solubility. To investigate the solubility of these proteins in the absence of mCherry, the C-terminal mCherry was removed from each variant, and the protein amounts in the soluble and insoluble fractions were quantified by SDS-PAGE (Fig. 4b). The insoluble fractions of the five selected variants (o4 10.4%, o8 8.6%, o15 13.5%, o17 13.1%, o33 13.7%) were all lower than those of the references (D0 63.6%, D1 19.4%) but still higher than that of the wild-type DEBSM6 (7.5%). These data confirm that the selected variants have improved solubility.
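The comparison can be laid out explicitly using the insoluble fractions reported above. The following is a small illustrative sketch rather than part of the original analysis; it assumes that the soluble and insoluble fractions sum to 100%.

```python
# Insoluble fraction (%) of each untagged protein, as reported in the text (Fig. 4b).
insoluble_pct = {
    "DEBSM6": 7.5,
    "o4": 10.4,
    "o8": 8.6,
    "o15": 13.5,
    "o17": 13.1,
    "o33": 13.7,
    "D1": 19.4,
    "D0": 63.6,
}

# Soluble fraction taken as the complement (assumption: the two fractions sum to 100%).
soluble_pct = {name: round(100.0 - pct, 1) for name, pct in insoluble_pct.items()}

# Order the proteins from the smallest to the largest insoluble fraction.
ranked = sorted(insoluble_pct, key=insoluble_pct.get)

for name in ranked:
    print(f"{name:7s} insoluble {insoluble_pct[name]:5.1f}%  soluble {soluble_pct[name]:5.1f}%")
```

Ordered this way, all five selected variants fall between the wild-type DEBSM6 and the reference D1.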
In vitro activity assay for highly soluble AT-swapped PKSs. The five highly soluble variants, DEBSM6 and D1 (us1/ds41) were purified using nickel affinity resins (Supplementary Fig. 5a). In vitro enzymatic activity was measured using a synthetic starter substrate and either malonyl-CoA or methylmalonyl-CoA as extension substrates, resulting in a desmethyl or methyl triketide lactone (TKL), respectively. Three variants, o4, o8 and o33, as well as D1, produced methyl TKL at a turnover rate similar to that of the wild-type DEBSM6, indicating that the protein structures of o4, o8 and o33 are not destabilized even with a heterologous AT domain (Fig. 4c). The upstream junctions of these domain-swapped mutants were at the beginning of the KS-AT linker (β0-α0) for o8 (us13) and o33 (us17), or in the middle of α3 for o4 (us85). These junctions therefore either included nearly the entire KS-AT linker from EposM4 or retained the counterpart of the parental PKS. The downstream junctions in the post-AT linker had the swap boundaries at the end of the AT domain for o4 and o33 (ds21 and ds25, respectively) or just before the residues interacting with the KS domain for o8 (ds43). Both o15 and o17 had high solubility but showed significantly lower activities. These non-functional variants had the downstream junction at ds64 and ds62, respectively, meaning that the KS-interacting residues in the post-AT linker (ds44-56 in the AlphaFold prediction) carried the heterologous EposM4 sequence. This part of the linker is known to interact tightly with the KS in DEBS29,30 and is critical for the KS condensation reaction34, indicating that the heterologous linker sequence in the o15 and o17 variants is unable to complement that function and that the quaternary structure is likely not retained35.
When malonyl-CoA was used as a substrate, no product was observed with DEBSM6, as expected, since the native AT cannot accept malonyl-CoA6 (Fig. 4d). In contrast, o4, o8, o33 and D1 showed product formation with malonyl-CoA, albeit in lower amounts than for methyl TKL production. This is consistent with previous results and is thought to be due to the substrate preference of the DEBSM6 KS6.
Investigating swap positions in KS-AT linker. Next, we selected eight swap junctions in the KS-AT linker and three in the post-AT linker and constructed all 24 combinations. Fluorescence measurements showed that positions us85, us28 and us17 always led to the highest relative solubility, regardless of whether the downstream junction was in a position that contributed to high (ds25), medium (ds62) or low solubility (ds81) (Fig. 5a). These results indicate that the upstream and downstream swap junctions influence protein stability independently of each other. It should therefore be possible to assess an upstream junction’s general influence on stability independently of the paired downstream junction.
To do so, we selected the post-AT ds25 junction from the most soluble o33 variant and combined it with each possible upstream position using EposM4 AT. In total, 72 constructs were made, and solubility was assessed using the solubility coefficient. The measurements show that the α2 and β2 structure regions are not appropriate for recombination (Fig. 5b, c), which is consistent with our randomized library data (Fig. 3c). Surprisingly, most other variants showed relatively high solubilities; a notable exception was the variant with one proline residue in β1 (us27), which destabilized the protein, while its neighboring positions did not.
To further investigate the solubility-activity relationship, four variant pairs were selected with nearby swap junctions but with differences in solubility: us1/ds25, us3/ds25, us26/ds25, us27/ds25, us48/ds25, us52/ds25, us79/ds25, and us84/ds25. These PKSs were purified without the mCherry fusion tag, and all proteins were isolated with high purity except us52/ds25 (Supplementary Fig. 5b). In vitro activity analysis of desmethyl and methyl TKL production showed that the less soluble mutant of each pair was also less active in all cases but one, in which the activities were equal (Fig. 6e, f). Because all tested variants have the same post-AT linker junction (ds25), the structural destabilization and low activities should be attributed to an unfavorable junction in the KS-AT linker. The ACP probably docks with the KS-AT linker surface while interacting with the KS for chain elongation36, and disruption of this docking may explain the observed reduction in activity.
To see how generalizable these results are, we repeated the experiment with a phylogenetically distant AT, the ethylmalonyl-CoA–specific AT from the tiacumicin PKS module 4 (TiasM4)37. In this case, 69 unique constructs were made to cover the same linker region. The fluorescence measurements showed that the solubility patterns of the TiasM4 AT-swapped PKSs agreed well with the corresponding EposM4 AT results (Fig. 6b, d). Again, upstream junctions within α2 and β2 appear to destabilize the protein when ds25 is used as the downstream junction.