mCherry Fusion Proteins Facilitate Production of Recombinant, Cysteine-Rich Leptospira interrogans Proteins in Escherichia coli

Background: Recombinant fluorescent fusion proteins are fundamental to advancing many aspects of protein science. Such proteins are typically used to enable the visualization of functional proteins in experimental systems, particularly in cell biology. An important problem in biotechnology is the production of functional, soluble proteins. Here we report the use of mCherry fusions of soluble, cysteine-rich, Leptospira-secreted exotoxins in the PF07598 gene family, the so-called virulence modifying (VM) proteins.

Results: The mCherry fusion proteins facilitated the production of the VM proteins (LA3490 and LA1402) by enabling the visual detection of pink colonies and allowing the proteins to be followed through lysis and sequential chromatography steps. CD-spectroscopy analysis confirmed the stability and robustness of the mCherry-fusion protein, with a structure comparable to AlphaFold structural predictions. LA0591, a unique member of the PF07598 gene family that lacks N-terminal ricin B-like domains, was produced as a tagless protein, which supports the generality of the recombinant protein production protocol. The current study provides approaches for the synthesis of 50–125 kDa soluble, cysteine-rich, high-quality, mCherry-tagged or tagless protein purified by fast protein liquid chromatography (FPLC).

Conclusions: The use of mCherry-fusion proteins enables a streamlined, efficient process of protein production and supports qualitative and quantitative downstream analytical and functional studies. Approaches for troubleshooting and optimization were systematically evaluated to overcome difficulties in recombinant protein expression and purification, demonstrating the biotechnological utility of this approach in accelerating recombinant protein production.

Life science research revolves around understanding the relationship of protein structure to function, and proper three-dimensional folding of recombinant proteins is essential for determining protein function (1, 2). Escherichia coli is an ideal organism for this work given its defined genome, short generation time, high yield at low cost, and convenience for producing fluorescent protein-tagged recombinant proteins (3)(4)(5). Flexibility in expression conditions, such as a low-temperature (16°C) shift upon IPTG induction, can be exploited in E. coli expression systems to enable proper protein folding, enhance protein yield, limit aggregation, and produce soluble, functional proteins (2, 6). Expression of cysteine-rich proteins as functional toxins can be challenging because disulfide bonding of heterologous over-expressed proteins can lead to inclusion body formation (1, 7-10).

The discovery of fluorescent tags facilitates real-time monitoring of the function of proteins, nucleic acids, organelles, and organisms without cytotoxic effects (5, 11). mCherry is a broadly used monomeric (28.8 kDa) red fluorescent protein derived from the Indo-Pacific sea anemone, Discosoma sp. (12). mCherry is photostable, emits light at a wavelength of 610 nm, and is stable in high concentrations of urea, making it suitable as a fusion partner for downstream applications and facilitating the production of soluble, functional recombinant proteins (12,13).

We recently discovered the novel paralogous PF07598 gene family (VM proteins), found only in group I pathogenic Leptospira species and expanded in the most virulent species, L. interrogans, L. kirschneri, and L. noguchii, whose genomes encode a repertoire of 12 distinct paralogs (14)(15)(16). VM proteins all have encoded signal peptides that suggest they are secreted; this has been confirmed experimentally (17,18).
Not only are VM proteins secreted exotoxins, but they also comprise bona fide N-terminal tandemly repeated ricin B-like lectin domains and a C-terminal DNase/toxin domain. Experimental work has shown the leptospiral VM proteins to be potential vaccine candidates (18,19). VM protein genes are upregulated in vivo in the acute hamster model, consistent with the hypothesis that VM proteins are involved in mediating leptospirosis pathogenesis (14-16). The full-length VM proteins are predicted to have twelve cysteines and five disulfide bonds, and their expression as soluble, functional proteins has been challenging.

This study aimed to optimize the production of soluble, functional VM protein-mCherry fusion proteins for real-time visualization of the expression and purification processes. The protocol is accessible, reproducible, and potentially applicable to the production of bacterial exotoxins as either multi-globular or individual domains.

Cysteine-rich multi-globular domain soluble leptospiral VM proteins
Real-time monitoring of the expression and purification of VM protein was made feasible by tagging with mCherry fluorescent protein, which is visualized by its pink color (Fig. 1-3) (20,21). Efficient mCherry-fusion protein expression was seen with 1 mM IPTG at 16°C (Fig. 2). Here, we designed the construct with a thioredoxin fusion tag, which is highly soluble and thermostable with robust folding characteristics and therefore enhances the solubility of the protein (Fig. 1) (22)(23)(24). A polyHis (His6) tag at both N and C-terminals for affinity Ni-NTA chromatography (Fig. 1)

Functional assessment infers stable mCherry-fusion proteins
The mCherry-LA3490 protein, containing two tandemly repeated N-terminal ricin B domains (RBL1 and RBL2) and a C-terminal domain with DNase activity (CTD), was purified as a stable multi-domain globular protein (Fig. 3). The fusion protein reacted with anti-His6 monoclonal antibodies, showing that the His6 residues were accessible for reactivity and that the fusion protein was produced in its native condition (Fig. 3d)

Purification of full-length LA1402 and LA0591 (CTD) supports reproducibility
To examine the reproducibility of the purification protocol, two paralogous members of the PF07598 gene family, LA1402 and LA0591, were selected. High-level expression and successful production of mCherry-LA1402 and tagless LA0591 VM proteins verified the protocol (Fig. 4). Based on protein sequences, LA1402 shows 60% identity with LA3490 and LA0591 shows 56% identity with LA3490; both show 99% coverage with LA3490.
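Identity and coverage values like those reported above are typically read off a pairwise alignment. The following is a minimal sketch of that arithmetic, assuming BLAST-style definitions (identity = identical columns divided by alignment length; coverage = aligned query residues divided by full query length). The source does not state which tool or definitions were used. The function name and the toy sequences are hypothetical and are not the LA3490, LA1402, or LA0591 sequences.

```python
def identity_and_coverage(aligned_query, aligned_subject, query_length):
    """Return (percent identity, percent query coverage) for a gapped alignment.

    aligned_query / aligned_subject: equal-length strings with '-' for gaps.
    query_length: length of the full, unaligned query sequence.
    Definitions are an assumption (BLAST-style), not taken from the paper.
    """
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    alignment_length = len(aligned_query)
    if alignment_length == 0 or query_length <= 0:
        raise ValueError("alignment and query must be non-empty")

    matches = 0          # columns where both residues are identical
    query_residues = 0   # query positions covered by the alignment
    for q, s in zip(aligned_query, aligned_subject):
        if q != "-":
            query_residues += 1
            if q == s:
                matches += 1

    identity = 100.0 * matches / alignment_length
    coverage = 100.0 * query_residues / query_length
    return identity, coverage


if __name__ == "__main__":
    # Toy alignment: one mismatch, one gap in the query, and one query
    # residue left outside the alignment (full query length = 8).
    ident, cov = identity_and_coverage("MKT-LLCA", "MRTGLLCA", 8)
    print(f"identity = {ident:.1f}%, coverage = {cov:.1f}%")
```

Note that identity is computed over the aligned region only, so a high coverage value is needed before an identity percentage can be read as a whole-protein comparison.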

Purified soluble VM proteins are physicochemically stable
The robustness of the predicted VM protein structures was validated by examining the secondary structure content of mCherry-LA1402 using CD-spectroscopy (Fig. 5a, Table 1). Previously, we showed the stability of secondary structure content (α-helix and β-sheets) in the paralogs LA1400 and LA0591 by CD-spectroscopy. The CD-spectroscopy results were comparable with AlphaFold-derived 3D structures (29).

Spectrofluorometric analyses suggest that LA1402 is intact with mCherry; the fluorescence intensity gradually increased as the concentration of mCherry-LA1402 increased (Fig. 5b). The fusion protein was unstable at 80°C with increasing time intervals (Fig. 5c). Urea had minimal effect on mCherry-LA1402 stability, even at 8 M urea, under which conditions the pink color remained visible and the fluorescence intensity could still be determined (Fig. 5d

Strategy and design of recombinant constructs
Synthetic, E. coli codon-optimized, post-signal VM gene sequences la3490 (UniProt ID: Q8F0K3), la1402 (Q8F6A7), or la0591 (Q8F8G6), linked to mCherry (X5DSL3) via a glycine-serine hinge (GGGGSGGGGSGGGGS), were synthesized and cloned into pET32b (+) (Gene Universal Inc., USA) between enterokinase cleavage sites for convenient removal of the mCherry fluorescent tag. Two histidine (His6) tags were added, one at each of the N- and C-terminals, for nickel-NTA purification. Before use, the sequence and orientation of the genes in the constructs were verified by restriction digestion and sequencing (Fig. 1).
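The construct logic described above (a fluorescent tag joined to a cysteine-rich target through a repeated glycine-serine hinge) can be illustrated with a short sketch. Only the hinge sequence comes from the text. The helper names, the tag-hinge-target order, and the toy sequences are hypothetical illustrations and do not represent the actual mCherry or VM protein sequences or the full pET32b layout.

```python
LINKER_UNIT = "GGGGS"  # repeating unit of the glycine-serine hinge


def build_linker(repeats=3):
    """Return the hinge as `repeats` copies of GGGGS (3 copies = 15 residues)."""
    return LINKER_UNIT * repeats


def count_cysteines(sequence):
    """Count cysteine residues (one-letter code 'C') in a protein sequence."""
    return sequence.upper().count("C")


def max_disulfide_bonds(sequence):
    """Upper bound on disulfide bonds: each bond consumes two cysteines."""
    return count_cysteines(sequence) // 2


def assemble_fusion(tag, target, linker):
    """Concatenate tag, hinge, and target (order is an assumption)."""
    return tag + linker + target


if __name__ == "__main__":
    toy_tag = "MVSKGEE"      # placeholder, not the mCherry sequence
    toy_target = "ACDCKLCQ"  # placeholder, not a VM protein sequence
    fusion = assemble_fusion(toy_tag, toy_target, build_linker())
    print(fusion)
    print("cysteines:", count_cysteines(fusion))
    print("max disulfide bonds:", max_disulfide_bonds(fusion))
```

By this counting, a protein with twelve cysteines can form at most six disulfide bonds. The five bonds predicted for the full-length VM proteins sit below that bound, which would leave two cysteines unpaired.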

Soluble recombinant fluorescent protein expression and purification
The cysteine-rich LA3490 or LA1402 or LA0591 VM proteins were expressed in T7 SHuffle®

Aliquots of 50 μL of T7 SHuffle® E. coli culture expressing mCherry VM proteins were directly mounted onto a glass slide and air-dried. The cells were visualized using an Olympus IX81 fluorescence microscope (Tokyo, Japan). The DAPI filter (excitation/emission peak at 358/461 nm) was used to detect nuclear signals, and the TRITC filter (excitation/emission peak at 544/570 nm) was used to detect mCherry fluorescence. Representative images were captured using Micromanager software.

DISCUSSION
Here we demonstrate the efficient production of mCherry fusions of cysteine-rich leptospiral VM exotoxins, both full-length (including the N-terminal ricin B domains) and using a tagless construct harboring the la0591 gene. LA0591 is a unique VM protein that lacks ricin B-like domains (RBLs) and is therefore secreted intracellularly (18,19). The functionality of the mCherry fusions was verified by immunoblotting, asialofetuin binding assay, CD-spectroscopy, and physicochemical study. The mCherry-fusion proteins were obtained as soluble, visibly pink protein from the lysis-optimized protocol. Among the various detergents, CelLytic™ B (Cell Lysis Reagent; Sigma-Aldrich, USA) buffer gave the best result compared with the other solubilization buffers, such as 1% or 0.1% Triton X-100, 0.1% Triton X-100 with CHAPS, 2% or 5% Sarkosyl, or 8 M urea (31) (Fig. 2-3, Fig. S1).

The mCherry tag had the important advantage of enabling real-time monitoring of the pink fusion protein and of saving downstream processing time when the fusion protein does not turn pink. In addition, polyhistidine tags are an excellent choice for affinity chromatography. These His6 tags are positioned at the N-terminal, the C-terminal, or both ends to overcome potential inaccessibility of the tag caused by obstruction from protein folding (25, 32). Both tags contributed to the successful production of fusion proteins.

VM exotoxins conform to the AB toxin paradigm.
Although their unique architecture is encoded by a single gene, they self-assemble into a multi-domain holotoxin, which differentiates them from the other classical AB toxins such as diphtheria toxin, pertussis toxin, Shiga toxin, or ricin toxin, which are typically encoded by two or more genes (29)

The production of inclusion bodies or aggregated proteins could be due to: 1) a high rate of protein expression, with aggregates formed through specific intermolecular interactions, 2) an unbalanced protein homeostasis equilibrium, 3) a rate of heterologous recombinant protein expression that surpasses the capacity of the host cells to manage protein post-translational modification, or 4) environmental conditions (37, 40). However, the aggregation process is reversible (41).

To circumvent inclusion body formation, the VM gene constructs were designed in the pET32 expression vector, to be co-expressed with a thioredoxin tag in SHuffle® T7 competent E. coli cells (New England Biolabs, USA). VM protein expression was seen in both inclusion bodies and soluble fractions when recombinant protein expression was IPTG-induced at a low temperature (16°C for 24 h) (18,19). Nonetheless, the active protein was produced from the soluble fraction, which was advantageous over refolding.

The refolding of inclusion bodies into bioactive forms is extremely cumbersome, producing either low yields or inactive proteins. Refolding potentially involves buffer exchange, which reduces protein-protein interaction and minimizes the aggregation of proteins (37, 42), and which can be achieved by size exclusion chromatography and adsorption chromatography.

buffer 100 mM NaH2PO4, 10 mM Tris-HCl pH 7.4 containing 8 M urea followed by sonication and separation of the pellet (IP) and supernatant (IS) fractions. In addition, the pellet was solubilized in lysis buffer 20 mM Tris-HCl + 150 mM NaCl containing 2% or 5% sarkosyl. The induced soluble fraction solubilized with 5% sarkosyl was subjected to purification using ÄKTA pure, and eluted fractions were analyzed on 4-12% SDS-PAGE (right panel).