Recombinant proteins produced in E. coli lack important post-translational modifications of mammalian proteins (most notably glycosylation of secreted proteins). However, using E. coli as an expression host is simpler and often more productive compared to mammalian cells. Because of its low cost, production in E. coli is the method of choice if the downstream applications do not require eukaryote-specific post-translational modifications. E. coli is also sometimes used in the manufacturing of biologicals (e.g. for insulin 32). However, not all mammalian proteins fold correctly in E. coli. Cystine knot proteins, which are virtually unknown in the bacterial kingdom, but have developed in fungi, plants, and animals, are often problematic, and the production of extracellular cystine knot proteins in the E. coli cytoplasm leads mostly to protein aggregation and the formation of inclusion bodies 21.
The vascular endothelial growth factor (VEGF) family of growth factors is interesting because its members are targets for inhibition, e.g. of tumor-induced blood vessel growth 33–36, and promotion, e.g. of lymphatic vessel growth in lymphedema 37. Inhibition has been mostly achieved by antibodies, which are still almost exclusively produced in mammalian cells for both research and therapeutic purposes, and successful attempts to produce full antibodies in E. coli are sparse 38. However, most members of the VEGF growth factors have been produced successfully in E. coli 10,11,39−42 but the exact refolding conditions have not been reported in some of these works 12,15. Notably, the successful production of the lymphangiogenic VEGFs (VEGF-C and VEGF-D) has not been reported in E. coli to date. The protein production for structural and functional studies on VEGF-C and VEGF-D has been performed for the last 20 years in insect or mammalian cells 43–46, 6. Some commercial suppliers offer VEGF-C produced in E. coli from a truncated cDNA, but the reported biological activities are significantly lower compared to corresponding proteins produced by mammalian hosts 47. When VEGF-C expression is performed from a truncated cDNA, an odd 9th cysteine in the VEGF homology domain interferes with correct disulfide bond formation 17,48. This cysteine is present in both VEGF-C and VEGF-D but absent from all other VEGFs.
Counter-intuitively, commercially available growth media for lymphatic endothelial cells (LECs) are mostly not supplemented with the lymphangiogenic VEGF-C, but with VEGF-A, which is the primary growth factor for blood vascular endothelial cells (BECs). Unlike VEGF-C, VEGF-A cannot activate the primary lymphangiogenic growth factor receptor VEGFR-3 1,49. Hence, a readily available source of biologically active VEGF-C for lymphatic endothelial cell culture is of interest.
We noticed that in mammalian cells, producing VEGF-C from a full-length cDNA is slower compared to a truncated cDNA, most likely due to the vastly increased possibilities for disulfide bond formation. But irrespective of whether we used a full-length or a truncated cDNA, our initial attempts to produce VEGF-C in bacteria led to inclusion body formation, a typical result for many other cystine knot growth factors 50–52. In eukaryotic cells, protein disulfide isomerases (PDIs) are deployed to ensure rapid isomerization into the correct bonding pattern. While many secreted proteins with disulfide bonds have been successfully co-expressed in E. coli with the CyDisCo expression system, we failed to generate bioactive VEGF-C with this strategy.
However, even in the Kringle-2 serine protease fragment vtPA 20,53, which is used as the acid test for disulfide bond formation in E. coli, only 5.1% of the amino acid residues are cysteines, while in the various mature VEGF-C forms the cysteine content is between 7.2% and 8.6%. Of the 18 cysteine residues in the dimer, two remain unpaired 48,16,54, and correct inter- and intramolecular disulfide bond formation in VEGF-C is likely a rare event compared to aggregation, even with PDI assistance. The large number of different PDI family members (21, and an even larger number of thioredoxin domain-containing proteins) suggests that specialization has occurred among these proteins. Possibly, with S. cerevisiae ERV1 and H. sapiens PDI of the CyDisCo system, we simply tested the wrong enzymes. The same might have been the case in a previous attempt to improve the quality of VEGF-C by co-expressing CALR, CDC37, PH4B, PDIA3, or PPIB in insect cells 55. It is also possible that VEGF-C folding benefits from the assistance of a specialized chaperone, as was reported for the closely related VEGF-A 56. Moreover, even though the redox environment of the bacterial cytoplasm does not per se prevent the formation of disulfide bonds, the typical environment of most PDIs is the eukaryotic endoplasmic reticulum, which features a different redox environment compared to the cytoplasm 57.
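To give a rough sense of why correct disulfide pairing may be a rare event, the following minimal sketch counts how many distinct pairing patterns are combinatorially possible for the 18 cysteines of the dimer when two remain unpaired. It rests on simplifying assumptions that are not part of our experiments: all cysteines are treated as distinguishable and equally likely to pair, and the symmetry of the homodimer and any steric constraints are ignored, so the number is an illustrative upper bound rather than a measured quantity. The function names and the toy sequence in the usage lines are hypothetical.

```python
from math import comb


def pairings(n_pairs: int) -> int:
    """Ways to split 2*n_pairs labelled cysteines into unordered pairs.

    This is the double factorial (2*n_pairs - 1)!! = 1 * 3 * 5 * ...
    """
    result = 1
    for k in range(1, 2 * n_pairs, 2):
        result *= k
    return result


def count_disulfide_patterns(n_cys: int, n_free: int) -> int:
    """Count pairing patterns for n_cys cysteines with n_free left unpaired.

    Assumption: every cysteine is distinguishable and may pair with any other.
    """
    paired = n_cys - n_free
    if paired < 0 or paired % 2 != 0:
        raise ValueError("the number of paired cysteines must be even and non-negative")
    # Choose which cysteines stay free, then pair up the rest.
    return comb(n_cys, n_free) * pairings(paired // 2)


def cysteine_fraction(sequence: str) -> float:
    """Fraction of residues that are cysteine in a one-letter protein sequence."""
    if not sequence:
        raise ValueError("empty sequence")
    return sequence.upper().count("C") / len(sequence)


if __name__ == "__main__":
    # 18 cysteines in the dimer, two of which remain unpaired.
    print(count_disulfide_patterns(18, 2))
    # Toy sequence for illustration only; not a VEGF-C sequence.
    print(f"{cysteine_fraction('ACDC'):.1%}")
```

Under these assumptions, only one of several hundred million patterns corresponds to the native arrangement, which is consistent with the view that unassisted or poorly matched PDI-assisted folding would rarely reach the correct structure.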
We chose two parallel strategies to produce bioactive VEGF-C: Folding from solubilized inclusion bodies and preventing the formation of inclusion bodies by fusing VEGF-C to the solubility-enhancing maltose-binding protein 30. The first approach led to the identification of suitable folding conditions, which were similar to conditions previously reported to work for other VEGF family members.
Fusing VEGF-C to the solubility-enhancing MBP was unsuccessful on its own in the E. coli strain BL21, but when combined with the modified cytoplasmic redox environment of the Origami E. coli strain, it led to the generation of bioactive VEGF-C. Compared to the soluble VEGF-C in the Origami strain, the insoluble, aggregated VEGF-C in the BL21 strain was largely protected from proteolytic degradation. Proteolytic degradation is at least partly responsible for the low yield of VEGF-C after purification (compare Fig. 3a with Fig. 3c, Fig. 6, and Fig. 7). Western blotting with pentahis-antibody detection indicates that proteolytic processing removes the hexahistidine tag from the MBP-VEGF-C fusion protein. Nevertheless, VEGF-C binds to the Ni excel resin, similar to what has been shown previously for VEGF-A 58. Yet, this binding is not efficient, and the majority of active VEGF-C remains in the flowthrough during this step (Fig. 6b, c and d). VEGF-C recovery could be increased to 0.6 mg/l of bacterial culture using amylose resin, which binds to the MBP tag. However, due to proteolytic processing (see Fig. 7c and e) and/or improper folding of the MBP tag in the Origami strain, a large amount of the active VEGF-C was still present in the flowthrough.
We were surprised to see that bioactive VEGF-C could also be produced in the E. coli Origami strain without the need for the MBP moiety. However, a side-by-side comparison of quantified proteins showed that without the MBP moiety, the activities that we could recover were minimal compared to the MBP-tagged VEGF-C. Depending on the sensitivity of the bioassay, such activities were sometimes even undetectable in our experiments. Interestingly, after Ni sepharose affinity chromatography, the major biological activity was recovered from the fraction containing VEGF-C fused with a partially proteolytically processed MBP. In addition, MBP-tagged VEGF-C eluted from the amylose resin had a very similar level of bioactivity in the Ba/F3-VEGFR-3 assay compared to VEGF-C from which the MBP moiety had been removed by TEV cleavage. This indicates that removal of the MBP tag is not required to recover biological activity of VEGF-C. The X-ray structure of VEGF-C complexed with VEGFR-2 (or VEGFR-3) 16,59 supports the notion that in a VEGF-C/VEGFR-3 complex, an MBP domain N-terminal to the VEGF homology domain (VHD) would point away from the receptor and thus not interfere with receptor binding and activation. On the other hand, domains C-terminal to the VHD might point towards the cell surface and thus be more likely to interfere with receptor binding and/or activation, similar to the silk-homology domain, which keeps pro-VEGF-C inactive 5,31,60. In the most straightforward scenario, these domains would be located between the extracellular domains 4 and 5 of VEGFR-3, which are instrumental in propagating the dimerization that is initiated by domains 1–3 towards the cell surface and interior 59.
The total yield of VEGF-C remained modest (about 0.1 and 0.6 mg of active VEGF-C per liter of E. coli culture from Ni sepharose and amylose affinity chromatography, respectively). The yield was limited by incomplete binding of VEGF-C to both resins, caused by partial or complete cleavage of the N-terminal hexahistidine or MBP tag; as a result, the flowthrough showed almost the same level of activity in the Ba/F3-VEGFR-3 assay as the input lysate (Fig. 6d and 7e). Switching from Ni sepharose to amylose affinity chromatography increased the yield 6-fold. We expect that alternative chromatographic methods, such as the use of immobilized VEGFR-3 49, could further increase the yield. Together with optimization of the production parameters, the use of E. coli could therefore be a viable replacement for the eukaryotic expression of VEGF-C. Even larger improvements could perhaps be achieved by deploying PDIs in the Origami strain using the MBP-VEGF-C fusion protein, a combination that was not tested in our study.
In addition to producing mature, active VEGF-C from a truncated cDNA, we also attempted to produce full-length VEGF-C (i.e., VEGF-C together with its N- and C-terminal propeptides). However, none of our attempts were successful. We speculate that the repetitive cysteine-rich Balbiani Ring 3 protein (BR3) motif in the C-terminal domain of VEGF-C precludes the generation of any meaningful amounts of correctly folded protein. Even if correctly folded, such full-length protein would not be active unless concertedly cleaved by a proprotein convertase such as furin and a specific activating enzyme (ADAMTS3, KLK3, cathepsin D, plasmin, thrombin). Because both cleavages can happen in the supernatant of 293T cells 2, we exposed the bacterial full-length VEGF-C to conditioned 293T cell supernatant but could not detect any activation.
VEGF-C is the central molecule required for lymphatic growth. It has potential applications in vascular biology research as a supplement in lymphatic endothelial cell culture, and as a pharmacological agent in the treatment of lymphedema. VEGF-C has already been used – in the form of the adenoviral vector AdVEGF-C – in clinical studies to treat lymphedema 37. While bacterial VEGF-C could be a cost-effective recombinant source of the growth factor for many in-vitro applications, glycosylated VEGF-C remains the first choice for in-vivo applications. Although mammalian-type N-glycosylation has been engineered into E. coli 61, the routine delivery of glycosylated therapeutic proteins is currently achieved using viral vectors, mRNA drugs, or recombinant protein derived from a mammalian cell line.