Structural insights into the binding of nanobodies LaM2 and LaM4 to the red fluorescent protein mCherry

Red fluorescent proteins (RFPs) are powerful tools used in molecular biology research. Although RFP can be easily monitored in vivo, manipulation of RFP by suitable nanobodies binding to different epitopes of RFP is still desired. Thus, it is crucial to obtain structural information on how the different nanobodies interact with RFP. Here, we determined the crystal structures of the LaM2‐mCherry and LaM4‐mCherry complexes at 1.4 and 1.9 Å resolution. Our results showed that LaM2 binds to the side of the mCherry β‐barrel, while LaM4 binds to the bottom of the β‐barrel. The distinct binding sites of LaM2 and LaM4 were further verified by isothermal titration calorimetry, fluorescence‐based size exclusion chromatography, and dynamic light scattering assays. Mutation of the residues at the LaM2 or LaM4 binding interface to mCherry significantly decreased the binding affinity of the nanobody to mCherry. Our results also showed that LaM2 and LaM4 can bind to mCherry simultaneously, which is crucial for recruiting multiple operation elements to the RFP. The binding of LaM2 or LaM4 did not significantly change the chromophore environment of mCherry, which is important for fluorescence quantification assays, while several GFP nanobodies significantly altered the fluorescence. Our results provide atomic resolution interaction information on the binding of nanobodies LaM2 and LaM4 with mCherry, which is important for developing detection and manipulation methods for RFP‐based biotechnology.


| INTRODUCTION
Fluorescent proteins (FPs) are the most extensively studied and widely used genetic tools in molecular biology research. FPs can be easily expressed in almost all kinds of cells, and the fusion of FPs generally does not affect the function of other proteins. [1][2][3] Compared to jellyfish-derived green FPs (GFPs), red FPs (RFPs) have several advantages when applied in imaging due to their long-wavelength excitation, lower light scattering, and decreased autofluorescence. [4][5][6][7][8][9] Although many genetically encoded RFP animal strains have been established to facilitate live observation, the manipulation of RFPs is still desired, which may be improved by the development of RFP-specific nanobodies. 10
Nanobodies, first discovered by Hamers-Casterman in 1993, 11 are single domain antibodies derived from the heavy chain variable regions (VHH) of Camelidae atypical immunoglobulins. Nanobodies are the smallest functional fragments derived from a naturally occurring immunoglobulin. Unlike monoclonal antibodies, nanobodies can be easily produced in prokaryotic expression systems. Because of their small size (12-15 kDa) and high stability and solubility, nanobodies are widely used for industrial, 12 in vitro diagnostic, and clinical applications. 13 The small size also allows nanobodies to be genetically encoded as chimera proteins and delivered to cells by fusion plasmids. Typically, the long CDR3 region enables nanobodies to bind to antigens with high specificity and affinity similar to those of traditional antibodies. 14,15 The small size of nanobodies also gives them advantages over traditional IgG antibodies in several specific applications, including binding to the smooth PD-L1 protein surface, 16 inserting into canyons on the HIV envelope that are not accessible to IgG 17 to neutralize a broad range of HIV-1 strains, and effectively blocking the entry of SARS-CoV-2 spike protein. [18][19][20]
Kirchhofer et al. first developed a series of GFP nanobodies that can induce subtle opposing changes in the chromophore environment. 21 The GFP-specific nanobodies GBP1 (GFP enhancer) and GBP4 (GFP minimizer) were suitable for monitoring protein expression, subcellular localization, and translocation. Our previous work also showed that the chimeric GFP nanobody GFP-enhancer-(GGGGS)4-LaG16 bound GFP with increased affinity and was suitable for GFP-tagged target protein purification. 22 Tang et al. developed a GFP nanobody-based system for the selective manipulation of diverse GFP-labeled cells across transgenic lines. 23 Later, Tang et al. achieved direct optogenetic control of GFP expression in neurons by Cre/loxP recombination through the binding of the GFP-specific nanobody Cre chimera protein to GFP. 24 Herce et al. designed a cell-permeable nanobody system to label and manipulate intracellular antigens in living cells. 25 Simpson performed PROTAC degradation of a GFP fusion protein with an anti-GFP nanobody conjugated to the Halo-tag. 26 Prole and Taylor developed methods to visualize and manipulate intracellular signaling through GFP and GFP nanobodies. 27 In the GFP-nanobody complex structures solved to date, most nanobodies bind different epitopes of GFP. GFP-enhancer, 21 GBP-minimizer, 21 and Sb44 28 bind to different epitopes surrounding GFP's β-barrel. Although Nb2 29 and LaG16 22 bind to the same epitope of GFP, their CDR sequences are only 29.7% identical. These complex structures provided important structural information for the further development of GFP manipulation tools.
Although many GFP nanobody-related protein visualization and manipulation applications have been introduced, few RFP nanobodies have been reported. Fridy et al. generated a series of nanobodies (named LaMs) that bind specifically to mCherry through a high-throughput screening method. 10 Developing an in vivo RFP manipulation system requires two or more nanobodies that bind different epitopes on the RFP surface at the same time, each fused to other manipulating components. However, the lack of structural information on the detailed interaction interfaces between RFP and specific nanobodies hinders the design and application of manipulation of RFP or RFP fusion proteins by high-affinity antibodies. Here, we determined the crystal structures of the LaM2-mCherry and LaM4-mCherry complexes and clarified the details of the binding of these two nanobodies to mCherry. We also verified the simultaneous binding of LaM2 and LaM4 to RFP by a series of orthogonal molecular biology assays. Our results provide crucial atomic resolution interaction information for the further development of methods to manipulate RFP or RFP fusion proteins in vivo.

| RESULTS
| The overall structure of the LaM2-mCherry and LaM4-mCherry complexes
To gain insight into the binding sites of nanobodies to RFPs, we purified recombinant LaM2, LaM4 and RFP mCherry and then determined the crystal structures of the LaM2-mCherry and LaM4-mCherry complexes. Each complex crystal contains mCherry and LaM2 or LaM4 at a 1:1 stoichiometry. The overall structure of LaM2-mCherry was refined to 1.39 Å resolution and that of LaM4-mCherry was refined to 1.92 Å resolution. The crystallographic data are shown in Table 1. The binding interface of CDRs 1-3 of LaM2/LaM4 and mCherry was well defined. LaM2-mCherry crystallized in the space group P2₁2₁2₁, and the asymmetric unit contained one LaM2 nanobody and one mCherry molecule. The Matthews coefficient was approximately 2.11 Å³/Da, and the solvent content was 41.58%. LaM4-mCherry crystallized in space group C121, and the asymmetric unit contained one LaM4 nanobody and one mCherry molecule. The Matthews coefficient was approximately 2.14 Å³/Da, and the solvent content was 42.42%. Figure 1a shows the overall structure of the LaM2-mCherry complex, and Figure 1b shows the overall structure of the LaM4-mCherry complex. The binding sites of LaM2 and LaM4 on mCherry were different. Figure 1c shows the superposed structures of LaM2-mCherry and LaM4-mCherry. LaM2 binds to the side of the β-barrel (the fourth and fifth of the 11 β-strands), while LaM4 binds to the bottom of the β-barrel (both the amino and carboxyl termini of RFP are at the bottom). The binding modes of the nanobodies are also very different. Figure 1d compares the binding of LaM2 and LaM4. Although the constant domains of the nanobodies are similar, the CDRs are totally different.
The CDR3 of nanobodies is longer than that in IgG. Therefore, whereas in IgG typically only a loop interacts with the antigen, an α-helix may also form in the nanobody CDRs and provide an additional interaction mode with the antigen. CDR3 and CDR1 of LaM2 each contain an α-helix: residues 123-126 (Ser-Glu-Asn-Asp) in CDR3 and residues 42-45 (Thr-Phe-Ser-Asp) in CDR1. CDR3 of LaM4 contains an α-helix consisting of residues 109-111 (Gln-Arg-Leu). Additionally, the surface potentials of LaM2 and LaM4 are quite different; LaM2 has a large negative patch in CDR1 that contributes to ionic interactions with mCherry, while the binding of LaM4 to mCherry does not involve similar ionic interactions (Figure 1e).
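The Matthews coefficients and solvent contents quoted above are linked by a widely used empirical relation, solvent fraction ≈ 1 − 1.23/V_M. The sketch below applies it to the reported V_M values as a consistency check. The constant 1.23 assumes a typical protein partial specific volume of about 0.74 cm³/g; the function and variable names are illustrative and are not taken from the original work.

```python
# Assumption: empirical constant for a typical protein partial specific
# volume (~0.74 cm^3/g); not a value stated in the paper.
PROTEIN_VOLUME_FACTOR = 1.23


def solvent_fraction(v_m: float) -> float:
    """Estimate the fractional solvent content from a Matthews coefficient (A^3/Da)."""
    if v_m <= 0:
        raise ValueError("Matthews coefficient must be positive")
    return 1.0 - PROTEIN_VOLUME_FACTOR / v_m


# Matthews coefficients as reported in the text (rounded values).
REPORTED_VM = {"LaM2-mCherry": 2.11, "LaM4-mCherry": 2.14}

for name, v_m in REPORTED_VM.items():
    percent = 100.0 * solvent_fraction(v_m)
    print(f"{name}: V_M = {v_m:.2f} A^3/Da -> solvent content ~ {percent:.1f}%")
```

With the rounded coefficients this gives roughly 41.7% and 42.5%, close to the reported 41.58% and 42.42%. The small differences presumably come from rounding of V_M or a slightly different constant in the software used.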

| Details of the binding sites of LaM2/LaM4 to mCherry
Because both nanobody-mCherry complex structures were determined at high resolution, the binding sites between LaM2/LaM4 and mCherry were clearly defined. The detailed interaction interfaces of LaM2 and LaM4 with mCherry are shown in Figure 2. In the LaM2-mCherry complex, all three CDRs (CDR1-3) of LaM2 contributed to the binding to mCherry.

| Validation of the thermodynamics and binding affinity of the nanobody to mCherry by site-directed mutagenesis
To further clarify the detailed driving forces of the binding between the nanobodies and mCherry, we performed structure-guided site-directed mutagenesis and studied the binding affinity of the mutated nanobodies to mCherry. We first used isothermal titration calorimetry (ITC) to measure the binding affinity and thermodynamic parameters because it is a label-free, in-solution method and is regarded as the gold standard for protein-protein interactions (Figure 3, Table 2). Both LaM2 and LaM4 showed high binding affinity to mCherry; the KD of LaM2-mCherry was 3.02 nM and that of LaM4-mCherry was 22.5 nM (Figure 3a,b). Then, we mutated some of the nanobody residues that contributed to the binding of mCherry. When the two LaM2 residues that form hydrogen bonds with mCherry, Ser44 in CDR1 and Ser68 in CDR2, were individually replaced by Ala, the binding affinity with mCherry was only slightly reduced (Figure 3a). The side chain of LaM2 Trp67 inserts into a hydrophobic pocket in mCherry, and the W67A mutation abolished this hydrophobic interaction and significantly reduced the binding affinity to mCherry. When Trp119 and Tyr120 in CDR3 were replaced by Ala simultaneously, the binding with mCherry was totally abolished (Figure 3a), indicating that this region was crucial for mCherry binding.
For LaM4, the surface of N103 seemed to be complementary to the surface near mCherry Lys84, and a hydrogen bond seemed to form between N108 and mCherry's Glu10. To confirm which interaction was dominant, we constructed LaM4 N103D, N103K, N108D, and N108K point mutation nanobodies and tested their binding affinity to mCherry by ITC. The N103D, N108D, and N103K mutations all abolished the interaction with mCherry, while N108K still had high binding affinity (Figure 3b). These results suggest that Asn103 binds mCherry mainly through van der Waals contacts arising from surface complementarity: if Asn103 interacted with mCherry mainly through hydrogen bonds or salt bridges, the binding affinity should remain strong when it is mutated to Asp; however, the LaM4 N103D mutation totally abolished the interaction, similar to the N103K mutation. When Asn108 was mutated to Lys, the interaction was only slightly weakened, while the Asp mutation totally abolished the interaction with mCherry, indicating that the interaction between N108 and mCherry's Glu10 was mainly through the specific hydrogen bond.
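To put the reported dissociation constants on an energy scale, they can be converted to standard binding free energies with ΔG = RT ln(KD). The sketch below does this for the two wild-type nanobodies. The temperature of 298.15 K is an assumption (the ITC temperature is not given in this section), KD is taken relative to a 1 M standard state, and the function name is illustrative.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
ASSUMED_TEMPERATURE_K = 298.15   # assumption: 25 C; not stated for the ITC runs


def binding_free_energy(kd_molar: float,
                        temperature_k: float = ASSUMED_TEMPERATURE_K) -> float:
    """Return the standard binding free energy (kJ/mol) for a KD given in mol/L."""
    if kd_molar <= 0:
        raise ValueError("KD must be positive")
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_molar)


# Wild-type dissociation constants reported in the text, in nM.
REPORTED_KD_NM = {"LaM2-mCherry": 3.02, "LaM4-mCherry": 22.5}

for name, kd_nm in REPORTED_KD_NM.items():
    dg = binding_free_energy(kd_nm * 1e-9)  # convert nM to M
    print(f"{name}: KD = {kd_nm} nM -> dG ~ {dg:.1f} kJ/mol")
```

Under these assumptions the roughly 7-fold difference in KD between LaM2 and LaM4 corresponds to a difference of about 5 kJ/mol in binding free energy.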

| Validation of the simultaneous binding of LaM2 and LaM4 to mCherry
The crystal structures of the LaM2-mCherry and LaM4-mCherry complexes showed that the binding regions of LaM2 and LaM4 on mCherry did not overlap, so we assumed that LaM2 and LaM4 could bind to mCherry simultaneously. We confirmed this assumption by ternary ITC and F-SEC experiments. The KD of LaM2 titrated into the LaM4-mCherry complex (purified by gel filtration) was 8.33 nM, similar to that obtained for LaM2 directly titrated into mCherry (Figure 4a, Table 3), indicating that the binding of LaM4 did not significantly affect LaM2. The KD of LaM4 titrated into the LaM2-mCherry complex was 261 nM, an approximately 10-fold weaker affinity than for titration into mCherry alone (Figure 4b, Table 3), suggesting that the binding of LaM2 induces an allosteric change in mCherry's binding interface with LaM4. Analysis of the crystal structures supported this interpretation: the binding of LaM2 slightly shifted the positions of the two large bottom loops of the mCherry β-barrel, which are crucial for the binding to LaM4's CDR1 (around Arg28) and CDR3 (around Leu101).
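The effect of a prebound nanobody on the second binding event can be summarized as a fold change in KD and the corresponding free-energy penalty, ΔΔG = RT ln(KD,complex / KD,free). The sketch below uses the KD values quoted in the text. The temperature (298.15 K) is an assumption, and the names are illustrative rather than taken from the original analysis.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol*K)
ASSUMED_TEMPERATURE_K = 298.15   # assumption: not stated for these titrations

# (KD into free mCherry, KD into the preformed complex), in nM, from the text.
TITRATIONS_NM = {
    "LaM2 into LaM4-mCherry": (3.02, 8.33),
    "LaM4 into LaM2-mCherry": (22.5, 261.0),
}


def kd_fold_change(kd_free: float, kd_complex: float) -> float:
    """Ratio of KD values; values above 1 mean weaker binding to the complex."""
    return kd_complex / kd_free


def delta_delta_g(kd_free: float, kd_complex: float,
                  temperature_k: float = ASSUMED_TEMPERATURE_K) -> float:
    """Free-energy penalty (kJ/mol) for binding the complex versus free mCherry."""
    return R_KJ_PER_MOL_K * temperature_k * math.log(kd_complex / kd_free)


for label, (kd_free, kd_complex) in TITRATIONS_NM.items():
    fold = kd_fold_change(kd_free, kd_complex)
    ddg = delta_delta_g(kd_free, kd_complex)
    print(f"{label}: {fold:.1f}-fold weaker, ddG ~ {ddg:.1f} kJ/mol")
```

The ratios are about 2.8-fold for LaM2 and 11.6-fold for LaM4, which matches the description of a minor effect on LaM2 and a roughly 10-fold loss of affinity for LaM4.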
We also observed the formation of a ternary complex of LaM2-mCherry-LaM4 by fluorescence-based size exclusion chromatography (F-SEC), 30 which can directly show the size of the biological macromolecule complex under physiological conditions. The F-SEC results (Figure 4c) also confirmed that stable 1:1:1 LaM2-mCherry-LaM4, 1:1 LaM2-mCherry, and 1:1 LaM4-mCherry complexes formed when these proteins were mixed in the proper ratios. It is worth noting that although LaM2 and LaM4 are similar in size, the peak positions of their mCherry complexes differed somewhat, which may be due to the different 3D shapes of the LaM2-mCherry and LaM4-mCherry complexes. We determined the size distributions of mCherry alone, LaM2-mCherry, LaM4-mCherry, and LaM2-mCherry-LaM4 by dynamic light scattering (DLS) experiments. The results showed that mCherry alone was highly uniform, with a size distribution centered at approximately 6 nm, and when complexes were formed with the respective nanobodies, the size increased to approximately 10-11 nm (Figure 5a).
In contrast to some GFP nanobodies (GFP enhancer and minimizer), 21 the binding of LaM2 and LaM4 did not significantly affect the chromophore environment of mCherry, resulting in few changes in mCherry fluorescence properties (Figures 5b and S1). This feature ensured that the optical activity of mCherry would not change significantly with the binding of LaM2/LaM4. Thus, quantification by RFP fluorescence remains accurate when mCherry is manipulated through the binding of LaM2/LaM4 chimeric operators.

| DISCUSSION
While the molecular weight of a nanobody is only approximately one tenth that of IgG, nanobodies still provide a relatively large binding interaction interface. Using PISA, we calculated and compared the buried surface areas of LaM2 and LaM4 with mCherry, of five GFP nanobodies with GFP (GBP1 enhancer, PDB ID: 3K1K; GBP4 minimizer, PDB ID: 3G9A; LaG16, PDB ID: 6LR7; Nb2, PDB ID: 7E53; and Sb44, PDB ID: 6LZ2), and of KN035 (PDB ID: 5JDS), a representative PD-L1 nanobody that has entered clinical trials 16 (Table 4). All of these complexes have similar buried surface areas of approximately 600-850 Å², which is comparable to that of IgG and provides high affinity and specificity. We also compared the buried surface areas of two hapten nanobodies (CorNb-Cortisone, PDB ID: 6ITQ 31 and MTX Nb-MTX, PDB ID: 3QXV 32 ). Since these hapten antigens are relatively small and cannot provide a large surface for binding, the buried surface areas are relatively small, between 300 and 400 Å²; however, whereas only a small fraction of a protein antigen's surface is buried, over 50% of the hapten total surface is buried, showing the effectiveness of the nanobodies' interactions with these specific antigens.
Besides being delivered as nanobody-encoding plasmids, nanobodies, unlike IgG, can readily cross the cell membrane through a nonendocytic delivery system using a poly-Arg tag 25 and thus may have additional advantages over IgG-based chimeric manipulation systems.
Simulations based on the crystal structures show that LaM4 can bind to the DsRed tetramer, and the binding sites are not on the DsRed self-multimerization interface; thus, the binding of LaM4 does not affect tetramerization. Therefore, it is possible to design chimeric proteins linking functional operation components with LaM4 and develop a self-assembling macromolecular machine based on the RFP tetramer.

| CONCLUSION
In summary, we have obtained the details of how nanobodies LaM2 and LaM4 bind to mCherry's different epitopes at atomic resolution via structural biology techniques. Additionally, our thermodynamic and molecular biology assays verified the crucial residues for the nanobody-RFP interaction. The binding of LaM2 or LaM4 did not significantly change the fluorescence of mCherry, which is important for fluorescence quantification assays. LaM2 and LaM4 can bind simultaneously to mCherry, which is crucial for recruiting multiple operation elements to the RFP. These results provide important basic information for the development of a LaM2/LaM4-based RFP manipulation system and provide strategies to further optimize the binding affinity of nanobodies to RFP.

| Protein expression, purification, and characterization
The coding sequences of LaM2 and LaM4 were optimized based on favored codon usage in Escherichia coli and were synthesized by Genewiz (Suzhou, China). For crystallization and binding assays, DNA encoding LaM2 and LaM4 was subcloned into the pET28a-SUMO vector with an N-terminal 6xHis tag followed by a SUMO tag or a pET21a-derived vector with an N-terminal 10xHis tag, respectively. The plasmids were transformed into E. coli strain BL21 (DE3) for expression. The bacteria were cultured in LB medium at 37 °C until the OD600 reached 0.8. Recombinant protein expression was induced by the addition of 0.2 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) and incubation for an additional 18 hr at 18 °C. The cells were harvested and resuspended in NiA buffer containing 20 mM imidazole, 5% glycerol, 150 mM NaCl, and 100 mM Tris-HCl, pH 7.5. The His10-tagged recombinant LaM2/LaM4 and their respective mutants were initially purified by Ni-NTA affinity purification using a HisTrap HP column (Qiagen) and eluted with NiB buffer containing 300 mM imidazole, 5% glycerol, 150 mM NaCl, and 100 mM Tris, pH 7.5. For crystallization, the His-SUMO tag was removed by incubation with recombinant ULP1 overnight at 4 °C. The cleaved tag fragment and ULP1 were removed by passing the sample through a HisTrap HP column. LaM2/LaM4 was further purified by SEC on a Superdex75 Increase column (Cytiva), and the buffer was exchanged to gel filtration buffer: 10 mM HEPES, pH 7.4 and 100 mM NaCl. The purity and molecular weight of the target proteins were verified by SDS-PAGE.

| Site-directed mutagenesis
Site-directed mutagenesis was carried out using a PCR-based site-directed mutagenesis method (2x Phanta Master Mix, Vazyme Biotech Co., Ltd.) with His10-LaM2 and His10-LaM4 as the templates. The sequences of the primers used to generate these mutants are displayed in Table S1. All site-directed mutagenesis constructs were confirmed by DNA sequencing (RuiDi, Shanghai, China).
Cryoprotection was performed by adding glycerol to the reservoir buffer at a 20% concentration. X-ray diffraction data were collected at 100 K on beamlines BL17U1 33 and BL19U1, 34 Shanghai Synchrotron Radiation Facility, Chinese Academy of Sciences.

| Determination and refinement of protein structure
Diffraction images were indexed and processed by HKL2000. 35 The structures of LaM2-mCherry and LaM4-mCherry were obtained by molecular replacement using the Phaser program from the CCP4 crystallography package 36 with mCherry (PDB ID: 2H5Q 5 ) and a GFP nanobody (PDB ID: 3K1K 21 ) as the search models. Structure refinement was performed by Refmac 37 and Phenix. 38 The model was manually adjusted in COOT. 39 The crystallographic parameters of LaM2-mCherry (PDB ID: 6IR2; 1.39 Å) and LaM4-mCherry (PDB ID: 6IR1; 1.92 Å) are listed in Table 1. The related figures were drawn by PyMOL. 40

| Isothermal titration calorimetry
The thermodynamic parameters of the binding of LaM2/LaM4 and their respective mutants to mCherry were determined by ITC using a VP-ITC or ITC200 calorimeter (MicroCal, Malvern). In a typical experiment, each titration was performed by injecting a 12 μl aliquot of protein sample into the cell containing the other reactant (detailed concentration information is listed in Table S2) at a time interval of 120 s to ensure that the titration peak returned to baseline. Altogether, 23 aliquots were titrated in each individual experiment. The stoichiometry of binding (n), the association constant Ka, and the binding enthalpy ΔH were evaluated using MicroCal Origin 7.0 software with a one-site binding model.
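The fitted parameters (n, Ka, ΔH) determine the remaining thermodynamic quantities through standard relations, summarized below for reference. These equations are general thermodynamics and are not reproduced from the original methods; R is the gas constant, T is the absolute temperature of the experiment (not stated in this section), and the constants are taken relative to a 1 M standard state.

```latex
\begin{align}
K_D &= \frac{1}{K_a}, \\
\Delta G &= -RT \ln K_a = RT \ln K_D, \\
\Delta G &= \Delta H - T\Delta S
  \quad\Longrightarrow\quad -T\Delta S = \Delta G - \Delta H .
\end{align}
```

In this scheme the entropic term is not fitted independently; it follows from the fitted Ka and ΔH.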

| Fluorescence-based SEC
The oligomeric state of the tested samples in buffer was assessed by F-SEC. We used 100 μl of 0.1 mg/ml mCherry as a control. For the LaM2-mCherry and LaM4-mCherry complexes, 50 μl of 0.2 mg/ml mCherry (approximately 7 μM) was mixed with an equal volume of 0.2 mg/ml LaM2 or LaM4 (approximately 14 μM), giving final concentrations of 3.5 μM mCherry and 7 μM LaM2/LaM4, and incubated on ice for 1 hr. For the LaM2-LaM4-mCherry complex, 50 μl of 0.2 mg/ml mCherry (approximately 7 μM) was mixed with 25 μl of 0.4 mg/ml LaM2 and 25 μl of 0.4 mg/ml LaM4 and incubated on ice for 1 hr. After high-speed centrifugation, the supernatants were loaded onto a Superdex 200 Increase size-exclusion column (Cytiva) equilibrated with SEC buffer (20 mM HEPES pH 7.0, 150 mM NaCl). The fluorescence of each sample was recorded by a fluorometer (excitation, 587 nm; emission, 610 nm for mCherry fluorescence). The data were processed and normalized by FSEC plotter software.
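The mixing scheme above can be checked with a short calculation that converts mass concentrations to molar concentrations and applies the dilution on mixing. This is a minimal sketch: the molecular weights (about 28.6 kDa for mCherry and 14.3 kDa for each nanobody) are assumptions back-calculated from the approximate molarities quoted in the text, not measured values, and the function names are illustrative.

```python
# Assumed molecular weights (Da), back-calculated from "0.2 mg/ml ~ 7 uM"
# and "0.2 mg/ml ~ 14 uM" in the text; not values stated by the authors.
MCHERRY_MW_DA = 28_600
NANOBODY_MW_DA = 14_300


def mg_per_ml_to_micromolar(conc_mg_per_ml: float, mw_da: float) -> float:
    """Convert mg/ml (equal to g/L) to micromolar for a given molecular weight."""
    return conc_mg_per_ml / mw_da * 1e6


def final_concentrations(components):
    """components: list of (name, volume_ul, stock_uM); returns final uM per name."""
    total_volume = sum(volume for _, volume, _ in components)
    return {name: stock * volume / total_volume for name, volume, stock in components}


mcherry_stock = mg_per_ml_to_micromolar(0.2, MCHERRY_MW_DA)
binary = final_concentrations([
    ("mCherry", 50, mcherry_stock),
    ("LaM2", 50, mg_per_ml_to_micromolar(0.2, NANOBODY_MW_DA)),
])
ternary = final_concentrations([
    ("mCherry", 50, mcherry_stock),
    ("LaM2", 25, mg_per_ml_to_micromolar(0.4, NANOBODY_MW_DA)),
    ("LaM4", 25, mg_per_ml_to_micromolar(0.4, NANOBODY_MW_DA)),
])

for label, mix in (("binary", binary), ("ternary", ternary)):
    summary = ", ".join(f"{name} {conc:.1f} uM" for name, conc in mix.items())
    print(f"{label}: {summary}")
```

Under these assumptions both schemes give about 3.5 μM mCherry with about 7 μM of each nanobody, that is, a twofold molar excess of nanobody over mCherry in every sample.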

| Emission spectrum measurements
The emission spectra of mCherry (0.1 mg/ml) and LaM2/LaM4-mCherry (0.1 mg/ml mCherry with excess nanobodies) were recorded using a fluorescence spectrophotometer (Varian Cary Eclipse). The excitation wavelength was 587 nm. The emission spectrum was recorded between 550 and 700 nm. The spectral data were analyzed with Origin.

| DLS assay
The particle size distributions of mCherry, the LaM2-mCherry complex, the LaM4-mCherry complex, and the LaM2-LaM4-mCherry complex were measured by a Nano-size-Zeta potential analyzer (Malvern Instruments, ZS90-2026). The test temperature was 25 °C, and the test angle was 90°.