De novo design of energy transfer proteins housing excitonically coupled chlorophyll special pairs

Natural photosystems couple light harvesting to charge separation using a “special pair” of chlorophyll molecules that accepts excitation energy from the antenna and initiates an electron-transfer cascade. To investigate the photophysics of special pairs independent of complexities of native photosynthetic proteins, and as a first step towards synthetic photosystems for new energy conversion technologies, we designed C2-symmetric proteins that precisely position chlorophyll dimers. X-ray crystallography shows that one designed protein binds two chlorophylls in a binding orientation matching native special pairs, while a second positions them in a previously unseen geometry. Spectroscopy reveals excitonic coupling, and fluorescence lifetime imaging demonstrates energy transfer. We designed special pair proteins to assemble into 24-chlorophyll octahedral nanocages; the design model and cryo-EM structure are nearly identical. The design accuracy and energy transfer function of these special pair proteins suggest that de novo design of artificial photosynthetic systems is within reach of current computational methods.


Introduction
Photosynthetic proteins manipulate the distances and angles between chlorophyll (Chl) molecules to tune excitonic coupling and control absorption and fluorescence spectra, excited state dynamics, energy transfer, and electron tunneling (Croce & (Sener et al., 2002; Wraight & Clayton, 1974). Natural photosynthesis can guide the development of synthetic biology for renewable fuel production, but only if we can determine the structure-function relationships required for efficient solar-to-fuel energy conversion and build new structures that exploit this knowledge. Chl special pairs have attracted great interest as primary electron donors, but the complexity of natural photosystems makes it difficult to study these Chls directly. Model protein systems such as the water-soluble Chl protein (WSCP) and the B820 dimer of LH1 allow the investigation of excitonic interactions between Chls or between bacteriochlorophylls (BChls) without spectral congestion from other pigments (Bednarczyk et Wasielewski et al., 1976), which have provided valuable insights, but these can be labor-intensive to synthesize, overlook the role of protein matrix effects which are important in native special pairs (Gorka et al., 2021), and lack the fine control over Chl-Chl distances and orientations needed to reproduce the precise geometries of native special pairs. De novo-designed Zn-tetrapyrrole monomer binding proteins (Ennist et  experimentally. Systematic methods of assembling Chl dimers with predefined geometries are lacking, making it difficult to correlate structure and function. Despite decades of active research, there has been no generalizable strategy to assemble Chl dimers that precisely match special pair geometries. We reasoned that recent advances in computational protein design could enable the creation of stable, water-soluble proteins that assemble Chl dimers with predefined geometries and which can be built into extensive protein assemblies.
Binding a small molecule as a dimer is a computational challenge, because the binding interface involves not just the protein but also the second small molecule, which has an independent set of rotational and translational degrees of freedom. To control these degrees of freedom, we sought to design homodimers with perfect two-fold cyclic (C2) symmetry, which bind a C2-symmetric Chl pair such that the C2 symmetry axes of the protein and chromophore are coincident, similar to native reaction centers, which can have true C2 symmetry (Chen et al., 2020; Gisriel et al., 2017) or pseudo-C2 symmetry (Figure 1a). C2 symmetry ensures that the two bound Chl molecules will have near-degenerate site energies, improving the resonance between pigment transitions necessary to create delocalized states (Reppert, 2023). For Chl dimer protein scaffolds, we chose hyperstable C2-symmetric repeat protein dimers containing symmetric pockets with tunable sizes and geometries (Brunette et al., 2015, 2020; Doyle et al., 2015; Fallas et al., 2017; Hicks et al., 2022). In this dimeric repeat protein architecture (Figure 1c), the hydrophobic core is independent from the small molecule binding site, enabling full customization for binding with little impact on the overall protein structure. Several thousand C2-symmetric homodimers that sample a wide range of superhelical curvature, rise, and radius parameters have been generated (Fallas et al., 2017; Hicks et al., 2022).
To probe the effect of geometry on Chl-Chl coupling, we set out to design a range of C2-symmetric dimers that hold two closely interacting Chl molecules in varied geometries including the arrangement found in native special pairs. In native proteins, (B)Chls typically have a pentacoordinate central Mg(II) or Zn(II) ion with a histidine (His) Nε atom as the axial ligand. For each chosen special pair geometry, we built a His rotamer interaction field and stored the possible His-Chl interaction geometries in a hash table (Figure 1c; see Methods for details). For each geometrically compatible C2 scaffold, we cycled through His-Chl rotamers from the hash table, aligned them to the scaffold C2-symmetry axis, and searched for matches of the His N-Cα-C backbone atoms to the backbone atoms of the residues lining the binding cavity.
Scaffolds for which the His N-Cα-C backbone atoms aligned with corresponding atoms in the protein backbone, and which could accommodate the Chl dimer without clashes, were redesigned using symmetric Rosetta FastDesign to optimize hydrophobic packing and hydrogen bonding around the Chls (Figure 1c). Designs were filtered based on the Rosetta full-atom energy, the solvent-accessible surface area of the Chl dimer (DSasa), His rotamers, and His Nε-metal ligation geometry. We selected 43 designs based upon 13 unique scaffolds for experimental characterization (see Supplementary Table 1 for amino acid sequences). We also characterized an additional 5 variants generated based on structural information, as described below for SP3. The protein monomer sizes range from 20.6 to 28.4 kDa (179 to 261 amino acids). We refer to these 48 designs as Chl Special Pair proteins, or SP for brevity.
Following SP protein expression in E. coli, SDS-PAGE gels showed that all 48 designs were present in the soluble fractions of lysates. Proteins were purified by Ni-NTA and size-exclusion chromatography (SEC) (Supplementary Figure 1). All SEC traces exhibited protein absorption at the elution volume expected for homodimer formation. Of 20 designs investigated by Small Angle X-ray Scattering (SAXS) in the apo-state, 15 had SAXS profiles suggesting a 3-dimensional shape consistent with the design model ( slightly lower predicted radius of gyration (Rg) value compared to experimental SAXS data is likely due to a dense hydration shell around the highly charged SP proteins (Kim et al., 2016; Svergun et al., 1998). The far ultraviolet (UV) circular dichroism (CD) spectra of three SP proteins that expressed in high yield (≥140 mg/L) show the proteins are highly α-helical with and without the synthetic Chl a derivative, Zn pheophorbide a methyl ester (ZnPPaM). Thermal denaturation curves monitored by the CD signal at 222 nm indicate that all three proteins are highly thermostable in the apo- and holo-states (Figure 2).
At longer wavelengths in the UV/visible/near-infrared (UV/vis/NIR), CD spectra can serve as a convenient probe of excitonic interactions between Chls. Monomeric Chls including Chl a and ZnPPaM exhibit asymmetric negative CD signals in the Qy region near 670 nm (Supplementary Figure 3) (Lindorfer et al., 2017). When Chl dimers are arranged in chiral protein environments, however, excitonic interactions can produce delocalized transitions with chiral character, yielding CD signals that are stronger and conservative (i.e., composed of a bisignate doublet that integrates to zero). Figure 2 shows that ZnPPaM bound to the SP1, SP2, and SP3 proteins has bisignate CD features in the Qy region (in the red part of the spectrum), consistent with excitonic coupling between the Chls. As shown in Supplementary Table 3, the Qy CD features of SP2 and SP3 are substantially stronger relative to their Qy absorption bands than is the Qy CD signal of monomeric ZnPPaM in organic solvent. ZnPPaM binding titrations of SP2 and SP3 monitored by CD in the Qy region show that the CD doublets are attributable to the binding of ZnPPaM dimers. Curve fitting of the CD titrations yields SP2-ZnPPaM dissociation constants (KDs) of 300 nM for KD1 and 2.5 μM for KD2, and SP3-ZnPPaM KDs of 800 nM for KD1 and 1.0 μM for KD2 (Supplementary Figures 1-6), we selected promising candidates for X-ray crystallographic structure determination. We solved the crystal structures of 3 designs, SP1, SP2, and SP3x (a close relative of SP3), and found that all three had protein backbone conformations that matched the corresponding design models to within 1.7 Å Cα RMSD (Figure 3). The X-ray crystal structure of SP1 was solved in the ZnPPaM-bound state to 2.0 Å resolution, revealing a special pair geometry closely matching that of purple photosynthetic bacteria (Figure 3a-f).
The rotameric state of the Zn-ligating His121 is identical to that in the design model, and several hydrophobic and T-stacking interactions form as designed. Hydrogen bonds to the ring E ketone group, shown to be important for modulating special pair redox potentials (Lin et al., 1994) SP2 was intended to assemble a ZnPPaM dimer with a conformation significantly different from native special pairs in order to investigate the effect of dimer geometry. The SP2 crystal structure was solved in both the apo-state and the ZnPPaM-bound state to 2.4 and 2.5 Å resolution, respectively. The apo- and holo-state amino acid backbones both agree with the SP2 design model to within 1.4 Å RMSD (Figure 3g). The holo-state crystal structure has two copies of the SP2 dimer in the asymmetric unit; alignment of the two ZnPPaM dimers shows their binding geometries are equivalent, with an RMSD of 0.22 Å over the tetrapyrrole rings. The ZnPPaM molecules are ligated by His178 as in the SP2 design model. After alignment of the crystal structure and design model protein backbones, the corresponding tetrapyrrole rings are approximately coplanar. Despite the accuracy of the protein backbone design, the crystal structure shows the ZnPPaM molecules are rotated and translated relative to the design model (Figure 3h). Compared to the apo-state crystal structure, the SP2 binding cavity widens by ~1.6 Å in the presence of ZnPPaM; this expansion provides the extra volume needed for the ZnPPaM molecules to adopt their unexpected conformation. While the ZnPPaM dimer in SP2 differs from the design model, the crystal structure nevertheless satisfies the objective of creating a non-native dimer geometry.
An apo-state crystal structure of SP3x, which shares 94% sequence identity with SP3, was solved to 3.05 Å resolution. The SP3x homodimeric design model agrees with the crystal structure to 1.61 Å Cα RMSD (Figure 3i). The crystallized SP3x protein fails to bind a ZnPPaM dimer; CD studies indicate that it can only bind one ZnPPaM molecule per protein dimer. While the design model largely matches the crystal structure, subtle discrepancies in the shape of the binding pocket and side chain rotameric states may contribute to sub-stoichiometric ZnPPaM binding. With the high-resolution SP3x structure in hand, we redesigned the binding cleft, allowing Rosetta FastRelax to select new binding site residues and make small adjustments to the dimer geometry. We tested a new set of five mutants, one of which, SP3, exhibits high-affinity assembly of a ZnPPaM dimer, as assayed by CD titration (Supplementary Figure 4). SP3 is designed to assemble a Chl dimer similar in structure to the P700 special pair of Photosystem I. Alignment of the SP3 design model to the P700 special pair from the crystal structure of Photosystem I (PDB ID: 1JB0) (Jordan et al., 2001) gives an RMSD of 0.68 Å across tetrapyrrole ring atoms. We grew green holo-state crystals of SP3 with ZnPPaM bound but were unable to solve the structure.
The absorption and fluorescence spectra of native special pairs are shifted compared to monomeric (B)Chls, in part due to excitonic coupling between the (B)Chls, which enables them to act as exciton traps (Gorka et al., 2021; Taylor & Kassal, 2019; van Amerongen et al., 2000). The SP2-ZnPPaM dimer absorbance spectrum presents a red-shifted shoulder in solution. Analysis of SP2-ZnPPaM absorbance binding titrations (Figure 4a and Supplementary Figure 5) shows that whereas the Qy transition of monomeric ZnPPaM in SP2 has an absorbance maximum at 669 nm with an extinction coefficient (ε669nm) of 49,900 M⁻¹ cm⁻¹, the SP2-ZnPPaM dimer spectrum has its Qy maximum slightly shifted to 668 nm with a decreased ε668nm of 38,200 M⁻¹ cm⁻¹. While the monomer has no discernible spectroscopic feature at 690 nm (its ε690nm is 9,400 M⁻¹ cm⁻¹), the SP2-ZnPPaM dimer spectrum has a distinct shoulder with ε690nm of 17,700 M⁻¹ cm⁻¹.
To rule out the possibility that the SP2-ZnPPaM bands at 668 and 690 nm may represent two populations of distinct ZnPPaM oligomers, we investigated the SP2-ZnPPaM spectral features using low-temperature spectroscopy in a sucrose/trehalose matrix (Figure 4b and Supplementary Figure 7; see Methods for details). If the two Qy bands originate from the same ZnPPaM dimer species, decreasing temperature should increase the fluorescence emission intensity at the lower energy transition relative to the higher energy transition due to relaxation of the electronic excitation within the dimer. We prepared a dimer sample with 2.0 molar equivalents of ZnPPaM per SP2 protein dimer and a monomer sample with only 0.3 molar equivalents per protein dimer, both in sucrose/trehalose films. The monomer sample lacked a red-shifted shoulder, and its Qy fluorescence maximum at 672 nm intensified with decreasing temperature from room temperature (RT) to 75 K (Figure 4b and Supplementary Figure 7). This behavior is expected for monomeric chromophores, because lower temperatures decrease homogeneous line broadening. In contrast, in the SP2-ZnPPaM dimer sample, two emission bands were observed at 673 and 692 nm, and lowering the temperature decreased the intensity of dimer fluorescence emission at 673 nm while increasing emission at 692 nm. In effect, the temperature dependence of the fluorescence intensity near 672 nm is reversed in the dimer compared to the monomer (Figure 4b and Supplementary Figure 7j). This indicates that the two spectroscopic features in the SP2-ZnPPaM dimer are coupled; lower temperatures disfavor thermal promotion from the lower energy 692 nm state to the higher energy state, causing 692 nm fluorescence emission to intensify while 673 nm emission weakens.
We calculated theoretical CD spectra based on the ZnPPaM dimer geometries of design models and crystal structures and compared them to experimental CD traces. We found that the signs of the CD Cotton effects predicted from the crystal structures of SP1 and SP2 are consistent with the experimental signs in the Qy region. In the SP3 dimer, the calculated spectrum based on the design model agrees with the experimentally measured signs of the CD Cotton effects, suggesting that the P700-like ZnPPaM dimer in SP3 assembles as designed (see Methods and Supplementary Figures 8-10 for details of CD spectral calculations). The excitonic coupling strengths of the SP1- and SP2-ZnPPaM dimer crystal structures were calculated to be 87 and 241 cm⁻¹, respectively. The experimental absorbance spectrum of SP2-ZnPPaM has features at 668 and 690 nm, corresponding to a Qy peak splitting of 477 cm⁻¹, or a coupling of 239 cm⁻¹, consistent with the calculated coupling strength. In addition, calculations based on the SP2-ZnPPaM crystal structure predict an oscillator strength redistribution towards the high-energy excitonic component (Supplementary Figure 11). This results in a band centered at ~690 nm with a lower intensity than the high-energy component at ~668 nm, in good agreement with the observed absorbance spectrum in which the 690 nm band appears as a weaker shoulder (Figure 4a).
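The conversion from the two observed band positions to a splitting and a coupling can be checked with a short calculation. The sketch below assumes the simplest two-level picture of a symmetric dimer with degenerate site energies, in which the splitting between the exciton bands equals twice the coupling; the function names are illustrative and do not come from the authors' analysis code.

```python
def wavelength_nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength in nm to a wavenumber in cm^-1 (1 cm = 1e7 nm)."""
    return 1.0e7 / wavelength_nm


def dimer_coupling_from_bands(high_energy_nm: float, low_energy_nm: float):
    """Return (splitting, coupling) in cm^-1 for two exciton bands.

    Assumption: degenerate site energies, so splitting = 2 * |coupling|.
    """
    splitting = (wavelength_nm_to_wavenumber(high_energy_nm)
                 - wavelength_nm_to_wavenumber(low_energy_nm))
    return splitting, splitting / 2.0


# Band positions quoted in the text for the SP2-ZnPPaM absorbance spectrum.
splitting, coupling = dimer_coupling_from_bands(668.0, 690.0)
print(f"splitting = {splitting:.0f} cm^-1, coupling = {coupling:.0f} cm^-1")
```

With the 668 and 690 nm band positions this gives a splitting of about 477 cm⁻¹ and a coupling of about 239 cm⁻¹, matching the values quoted in the text.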
Native special pairs play a critical role as energy transfer acceptors for antenna proteins. To test whether our designed SP proteins participate in energy transfer with natural light-harvesting proteins, we analyzed excitation energy transfer in 2D surface arrays using Fluorescence Lifetime Imaging Microscopy (FLIM) in combination with nanoimprint lithography (Huang et al., 2020) (Figure 5). We selected the cyanobacterial antenna protein CpcA with attached phycoerythrobilin (PEB) as the energy transfer donor. CpcA-PEB was purified from E. coli, and gives a strong fluorescence emission maximum at 568 nm and emission extending past 630 nm (Barnett et al., 2017), overlapping with the excitation spectrum of Zn pheophorbide a (ZnPPa) when bound to SP2. Notably, Qy peak splitting was not observed in SP2 assembled with ZnPPa instead of ZnPPaM (Figure 5b), suggesting that the peripheral substituents of the chlorin play a significant role in excitonic coupling. To monitor energy transfer, ~5 μm-wide linear arrays of CpcA-PEB and perpendicular linear arrays of SP2-ZnPPa were applied to a poly-L-lysine functionalized glass surface by contact printing, creating intersection points where CpcA-PEB and SP2-ZnPPa interact and other locations in which only one of the proteins was present. Wide-field epifluorescence imaging with excitation at 450 nm was used to analyze the surface attachment; filtering emission at 620 nm preferentially displays regions of CpcA-PEB, while 680 nm preferentially shows SP2-ZnPPa. At the intersections between the lines, donor CpcA-PEB emission (620 nm filter) was decreased and SP2-ZnPPa acceptor emission (680 nm filter) was increased, indicating energy transfer from donor to acceptor (Figure 5c).
To quantify the strength of the interaction between CpcA-PEB and SP2-ZnPPa, we used time-resolved single photon counting. The surface was illuminated with a 485 nm picosecond laser filtered at 620 nm (donor emission), and photons were counted for individual pixels (surface resolution approximately 300 nm) over time, allowing both total fluorescence intensity and lifetime to be analyzed (Figure 5d). In regions with only CpcA-PEB, fluorescence intensity was above 4000 arbitrary units (a.u.) and the lifetime was over 2 ns (τav = 2058 ps). Regions with both CpcA-PEB and SP2-ZnPPa show reduced fluorescence intensity (<1000 a.u.) and lifetimes under 0.9 ns (τav = 839 ps). We estimate the energy transfer efficiency of CpcA-PEB to SP2-ZnPPa in 2D arrays to be 59%, similar to figures seen for natural fluorescent proteins For efficient solar energy conversion, nature organizes photosynthetic machinery into specialized compartments such as thylakoids in plants or chromatophore vesicles in purple photosynthetic bacteria (Singharoy et al., 2019). As a first step towards such structures, we sought to incorporate a Chl-binding SP protein into a two-component supercomplex with octahedral symmetry (King et al., 2014). The C2 symmetry axes of 12 copies of the SP2 dimer and the C3 axes of 8 copies of a C3-symmetric homotrimer (Boyken et al., 2016, 2019) were aligned with the C2 and C3 axes of an octahedron. We sampled rotations and translations along these axes to generate a closely packed octahedral model with the C2 dimers on the edges and the C3 trimers on the vertices. Interface residues were redesigned to create binding surfaces between the SP2 dimers and the trimers. Twenty-one designs were experimentally characterized, and one was found to assemble into octahedral structures by negative-stain EM (Supplementary Figures 13-14).
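The 59% energy transfer efficiency quoted above for CpcA-PEB to SP2-ZnPPa is consistent with the two average donor lifetimes reported in the same paragraph. The text does not state which formula was used, so the sketch below assumes the standard lifetime-based relation E = 1 − τ(donor with acceptor) / τ(donor alone); the function name is illustrative.

```python
def transfer_efficiency(tau_donor_ps: float, tau_donor_acceptor_ps: float) -> float:
    """Energy transfer efficiency from donor lifetimes.

    Assumption: E = 1 - tau_DA / tau_D, the usual lifetime-based estimate;
    the source does not spell out the formula it used.
    """
    if tau_donor_ps <= 0:
        raise ValueError("donor-only lifetime must be positive")
    return 1.0 - tau_donor_acceptor_ps / tau_donor_ps


# Average lifetimes quoted in the text: donor alone and donor with acceptor.
efficiency = transfer_efficiency(2058.0, 839.0)
print(f"energy transfer efficiency = {efficiency:.0%}")
```

Under this assumption the lifetimes of 2058 ps and 839 ps give an efficiency of roughly 0.59, in line with the reported estimate.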
In this nanocage, the SP2-like component shares 87% sequence identity with the original SP2 design, and its absorbance spectrum has a red-shifted shoulder in the Qy region, similar to the original. While the resolution is not sufficient to confidently determine the orientations of the Chls, the cryo-EM density map is consistent with the Chl dimer geometry in the SP2 crystal structure (Figure 6c,d).

Conclusion
We describe the first designed proteins which hold Chl dimers in precisely defined, closely juxtaposed geometries. We obtain crystal structures of two holo-state designs: the first, SP1, reproduces the binding geometry of the native purple bacterial reaction center special pair with sub-angstrom precision, and the second, SP2, has a distinct geometry with the Chls closer together. Our use of symmetry reduces the complexity of the design procedure while ensuring equivalent site energies for the two bound Chls to strengthen excitonic coupling. Symmetry also enables integration of the de novo special pair proteins into larger supercomplexes. Our octahedral nanocage incorporating 12 ZnPPaM dimers is a first step towards de novo design of photosynthetic compartments analogous to thylakoids or chromatophores.
SP2 exhibits spectroscopic hallmarks of native special pairs including Cotton effects by CD, shifting of absorption and fluorescence bands, and energy transfer activity when paired with the native antenna protein CpcA. SP1 exhibits weaker excitonic coupling than SP2 and the BChl special pair of purple photosynthetic bacteria despite its close structural similarity to the latter. The stronger coupling of SP2 relative to SP1 may reflect the closer spacing of the ZnPPaM molecules in SP2, whereas the stronger coupling of the purple bacterial special pair is due to the stronger Qy transition dipole moment of bacteriochlorins as compared to chlorins (Knox & Spring, 2003). Prediction of the spectroscopic properties of a Chl dimer in a protein is complicated by the fact that Chl-Chl coupling energies are typically similar in magnitude to the available thermal energy, Franck-Condon active vibrational and phonon reorganization energies, and local Chl vibrational frequencies (Reppert, 2023). Accurate optical predictions require benchmarking of theoretical methods using robust model systems. Native photosynthetic proteins can be difficult to isolate and typically contain many interacting pigment molecules, creating spectral congestion. The highly thermostable, water-soluble Chl dimer proteins described herein avoid the complexity of native pigment-protein complexes and provide a testbed to investigate structure-spectrum relationships.
Studies of photosynthetic light harvesting and charge separation indicate that natural photosynthesis leaves room for efficiency improvements ( The conformer generation was achieved using the NeRF algorithm (available on GitHub at https://github.com/atom-moyer/nerf) which translates internal molecular coordinates to global molecular coordinates. Various conformers were generated by varying the internal coordinates such as the relative positioning of the Chl groups, the dihedral of ligation by the histidine residue, and the rotamer of the histidine sidechain. The full complex was duplicated along the C2 axis to create the symmetric complex. If the relative orientations of the Chls were varied, clashes between the rings and their substitutions were evaluated and filtered. The full process of de novo motif generation was repeated for ligation with the epsilon and delta nitrogen of the imidazole ring.
Once the de novo conformers were generated, the 6-D transformation that defines the relative orientation of the N-CA-C atoms of the ligating histidine residues was hashed using a method described previously (Fallas et al., 2017; Yao et al., 2022). The hashed 6-D transformation was used as a key in a multi-value hash table (https://github.com/atom-moyer/getpy), and the associated value was a vector that defined the information necessary to rebuild the histidine-Chl complex, including which histidine nitrogen was used for ligation and the internal coordinates of the histidine rotamer.
During evaluation of design scaffolds, the 6-D transformation of each symmetric residue pair across chains was evaluated and hashed using the same method used to hash the de novo conformers described above. That allowed the identification of symmetric residue pairs which have similar 6-D transformations to the potentially acceptable ligation geometries. If a matching 6-D transformation was found, the histidine-Chl complex was rebuilt from the associated value in the hash table, and the complex was evaluated in the context of the protein. If the Chls did not clash with the backbone atoms of the protein, the placement was accepted and passed into the protein design process.
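The store-then-look-up pattern described above can be illustrated with a toy multi-value hash table. This is only a sketch under simplifying assumptions: it bins a translation and three Euler angles on a fixed grid, which is not the hashing scheme of the cited method, and it uses a plain Python dictionary rather than the getpy library. All function names, bin sizes, and stored values are hypothetical.

```python
import math
from collections import defaultdict


def hash_transform(translation, euler_deg, cart_res=1.0, ang_res=15.0):
    """Bin a 6-D rigid-body transform into an integer tuple key.

    Hypothetical scheme: translations are binned every `cart_res` angstroms
    and Euler angles every `ang_res` degrees.
    """
    t_bins = tuple(math.floor(c / cart_res) for c in translation)
    a_bins = tuple(math.floor((a % 360.0) / ang_res) for a in euler_deg)
    return t_bins + a_bins


# Multi-value table: one key may map to several stored conformers.
table = defaultdict(list)


def store(translation, euler_deg, conformer):
    """Add a conformer record under the key of its 6-D transform."""
    table[hash_transform(translation, euler_deg)].append(conformer)


def lookup(translation, euler_deg):
    """Return every stored conformer whose transform falls in the same bin."""
    return table.get(hash_transform(translation, euler_deg), [])


# Illustrative records: ligating nitrogen (NE2 = epsilon, ND1 = delta)
# plus made-up histidine chi angles.
store((4.2, -1.3, 7.8), (30.0, 95.0, 200.0), {"ligating_n": "NE2", "chi": (-65.0, 80.0)})
store((4.6, -1.1, 7.1), (31.0, 99.0, 205.0), {"ligating_n": "ND1", "chi": (180.0, 75.0)})

# A scaffold residue pair with a similar transform retrieves both records;
# a dissimilar one retrieves nothing.
hits = lookup((4.4, -1.2, 7.5), (33.0, 92.0, 198.0))
miss = lookup((9.0, 0.0, 0.0), (0.0, 0.0, 0.0))
print(len(hits), len(miss))
```

A fixed grid like this misses near neighbours that fall just across a bin boundary, so a practical implementation would need a hashing scheme designed for rigid-body transforms.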
A Python package and example scripts which generate the de novo hash tables and place the histidine-Chl complexes into symmetric proteins can be found here: https://github.com/atom-moyer/stapler.
Protein expression and purification: Synthetic genes with N-terminal His6 tags followed by TEV protease cleavage sites were purchased in pET29b expression vectors from Integrated DNA Technologies, Inc. Plasmids were transformed into Lemo21(DE3) Competent E. coli (New England Biolabs). For each protein, a single E. coli colony was grown in a culture of 5 mL of LB with 100 μg/mL kanamycin overnight at 37°C. Overnight cultures were used to inoculate 50-500 mL cultures of auto-induction media (Studier, 2005). Bacteria were grown in auto-induction media at 37°C with shaking for 4 hours, then incubated shaking overnight at 18°C. Bacteria were harvested and resuspended in 300 mM NaCl, 30 mM imidazole, 25 mM Tris buffer at pH 8, ~0.01 mg/mL DNase (Sigma-Aldrich), ~0.1 mg/mL lysozyme (Sigma-Aldrich), and Pierce™ Protease Inhibitor Tablets (Thermo Fisher Scientific). Bacteria were lysed by sonication and centrifuged at ~18,000 g for 30 minutes. Soluble fractions were purified by Immobilized Metal Affinity Chromatography (IMAC) gravity columns packed with Ni-NTA agarose resin (Qiagen) at room temperature. Columns were washed with a buffer containing 20 mM imidazole and proteins were eluted with a 300 mM imidazole buffer. Samples were digested with His-tagged TEV protease in the presence of 0.5 mM dithiothreitol for 1-2 days at room temperature. Digested proteins were buffer exchanged into 20 mM imidazole buffer, 300 mM NaCl, and 25 mM Tris buffer at pH 8 and applied to IMAC columns to remove TEV protease and uncleaved protein. Proteins were further purified by size-exclusion chromatography (SEC) using an ÄKTA FPLC with a Superdex 200 Increase 10/300 GL column (GE Healthcare Life Sciences).
Protein and Chl molecular weights were verified by reverse-phase liquid chromatography/mass spectrometry (LC/MS) with an Agilent G6230B TOF instrument using an AdvanceBio RP-Desalting column. Mass spectra were deconvoluted in BioConfirm using a total entropy algorithm (Supplementary Figures 17-19).
Protein-chlorophyll sample preparation: Zn pheophorbide a methyl ester (ZnPPaM) was purchased from Frontier Scientific, Inc. ZnPPaM stock solutions were prepared in dimethyl sulfoxide (DMSO) or methanol to concentrations between 200 μM and 1 mM. ZnPPaM concentrations were determined using mass measurements and by using the known absorptivity of Zn pheophytin a, which has a similar absorbance spectrum and an extinction coefficient ε659nm of 77,300 M⁻¹ cm⁻¹ in 80% acetone/20% deionized water (Jones et al., 1976). Ultraviolet/visible (UV/vis) absorbance spectra were collected using a Jasco V-750 spectrophotometer with a 1 nm bandwidth and 400 nm/min scanning speed. Protein-ZnPPaM complexes were prepared by slowly adding freshly prepared ZnPPaM stock solution to protein solution in aqueous buffer at room temperature and incubating samples for several hours. Unbound ZnPPaM was removed by centrifugation to pellet precipitated ZnPPaM, sterile filtration using a 0.22 μm syringe filter, and/or purification on a PD-10 desalting column (Sephadex™ G-25 M resin, Cytiva Life Sciences).
Circular Dichroism (CD) spectroscopy: CD spectra were collected using a Jasco J-1500 spectrophotometer. For protein secondary structure assays, spectra were measured on samples of 0.
Fluorescence quantum yield measurements: Fluorescence spectra displayed in Supplementary Figure 6 were recorded on a Fluorolog Horiba Jobin Yvon spectrofluorimeter equipped with a Xenon lamp, a double monochromator and a photomultiplier detector. The experiments were carried out in right angle (RA) configuration. Each baseline-subtracted fluorescence spectrum was corrected for spectral sensitivity of the fluorimeter and re-absorption by assuming the middle of the cuvette is the origin of emission. Relative quantum yields were estimated using Chl a in diethyl ether as a reference (Weber & Teale, 1957).
Low-temperature absorbance and fluorescence spectroscopy in a sucrose/trehalose film: Solutions of SP2-ZnPPaM were mixed with a saturated sugar solution made by dissolving a 50:50 (w/w) sucrose/trehalose mixture in distilled water as described previously (Caram et al., 2016). A 100 μL sample of SP2 at 34 mg/mL (ZnPPaM dimer) or 156 mg/mL (ZnPPaM monomer) was added dropwise to 100 μL of the sugar solution and gently mixed. The sugar/protein mixture was dropped onto a 0.1 mm quartz cuvette from Starna Cells Inc. and kept under vacuum in the dark for 24 hr. The sample was then loaded into a Janis ST-100 cryostat using a custom-built copper cuvette holder and cooled with liquid nitrogen. A Lakeshore 330 Autotuning Temperature Controller was used to control the temperature. An Agilent Cary-60 spectrometer was used to collect all absorbance spectra across temperatures. For the temperature-dependent fluorescence emission spectra, a home-built setup equipped with a Thorlabs 405 nm laser head (CPS405) was used. The collected emission was fiber-coupled into a Flame Ocean Optics spectrometer.
Lifetimes were recorded using a homebuilt, all-reflective epifluorescence setup. The samples were excited via a pulsed laser output from a 405 nm pulsed diode laser (LDH-P-C-405, PicoQuant) with a 10 MHz repetition rate. The emission was subsequently filtered with a 420 nm longpass dichroic beam splitter (DMLP425R, Thorlabs) and 420 nm longpass filter (10CGA-420, Newport). As starting structures, either design models or crystal structures were employed. Protonation states were determined with the H++ webserver, at pH 8 (using default parameters) (Anandakrishnan et al., 2012; Gordon et al., 2005; Myers et al., 2006). Topology and geometry files were generated with LEaP, using an isometric truncated-octahedron shape for the periodic box, with a minimum distance between the protein and the edges of the box of 1.5 nm. Protein charges were neutralized with Na⁺ and Cl⁻ ions.
Minimization and initial equilibration steps were performed following a recently developed protocol (Roe & Brooks, 2020). Briefly, it consists of nine sequential energy minimizations and short MD runs, which sum 4000 steps of minimization and 40000 MD steps (totalling 30 ps), followed by a final MD equilibration (500000 steps, 1000 ps). Then, after discarding the first 200 ns, production runs were done in the NPT ensemble at 300.0 K, with a time step of 2 fs, and constraining bonds involving hydrogen atoms via the SHAKE algorithm. Constant temperature and pressure were ensured with the Langevin thermostat (collision frequency: 2 ps⁻¹) and Monte Carlo barostat, respectively. Long-range electrostatics were considered via the Particle Mesh Ewald (PME) model, setting the direct space sum cut-off to 1.0 nm.
Calculation of Chl dimer excitonic coupling and spectra: Calculations involving excited states were performed on the chromophore geometries of the design models and on the crystal structures. In the latter case, hydrogen atoms were added using UCSF Chimera 1.11 (Pettersen et al., 2004). Electronic couplings were calculated using the EET (electronic energy transfer) module from Gaussian, at the CAM-B3LYP/6-31G* level. The effect of the environment was considered through the polarizable continuum model (PCM) (Iozzi et al., 2004;Tomasi et al., 2005), choosing n-octanol as representative of the protein dielectric behavior. The EET analysis considered six singlet excited states per chromophore.
To obtain circular dichroism spectra we used the results of the EET calculations and the Excitonic Analysis Tool (EXAT) program (Jurinovich et al., 2014, 2015, 2018). Rotatory strengths were calculated by considering both electric and magnetic dipoles in the velocity formulation. Spectral lineshapes were simulated as Gaussians, with a full-width at half-maximum of 350 cm⁻¹ for the Qy and Qx transitions and
X-ray crystallography for SP1 and SP2: Crystals of SP1 and SP2 were grown using protein purified as described above. Protein samples dispensed in 1 μL drops at purification concentrations were mixed with an equal volume of a crystallization solution and set in hanging drops (refer to Supplementary Table 4 Table 4) extend to 2.0 Å, 2.4 Å, and 2.5 Å resolution for SP1-ZnPPaM, apo-state SP2, and SP2-ZnPPaM, respectively. The asymmetric units of the SP1-ZnPPaM and apo-state SP2 structures each contained one complete dimer (two copies of a protein subunit), and the SP2-ZnPPaM structure had two dimers in the asymmetric unit.
The placement of subunits was determined using the molecular replacement algorithm in the program PHENIX (Adams et al., 2010). Local rebuilding of all constructs was performed using the program COOT (Emsley et al., 2010), followed by refinement in PHENIX (Adams et al., 2010). For the ZnPPaM-bound structures, the protein was built and refined completely with waters (excluding waters from the binding site) and other chemicals before manually fitting ZnPPaM into the density that remained. ZnPPaM
Corresponding atoms were aligned in PyMOL and the RMSD over all 48 atom pairs was calculated. Native BChl a special pairs used for comparison to the SP1 protein came from five different species of purple photosynthetic bacteria, including Rhodobacter sphaeroides, Rhodopseudomonas palustris, Thermochromatium tepidum, Gemmatimonas phototrophica, and Thiorhodovibrio strain 970. The PDB IDs of the nine X-ray crystal and cryo-EM structures containing the native special pairs used for comparison to SP1 were: 7PIL, 7VNY, 6Z27, 6Z02, 6Z5S, 3WMM, 5Y5S, 7O0U, and 7C9R (Cao et
utilizes a hierarchical sampling strategy to search for interfaces with high shape complementarity based on residue pair transform scoring. The top 10 scored docking configurations for each scaffold were subsequently sequence designed by symmetric RosettaDesign calculations, using a previously reported protocol (King et al., 2014) to carry out two-component protein-protein interface design. Briefly, we aimed to design low-energy, well-packed hydrophobic protein-protein interfaces in which protein building blocks are treated as rigid backbones and only side chain rotamers of interface residues are packed, with layer design restrictions. The beta_nov16 or a clash-fixed score function was used during the design.
Finally, all cage designs were filtered based on shape complementarity (> 0.6), interface surface area (solvent-accessible surface area, 1000 < sasa < 1600), predicted binding energy (ddG < -20 kcal/mol), buried unsatisfied hydrogen bonds (uhb < 3), and clash check (< 3). All Rosetta scripts used are available upon request.
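The filtering step above can be sketched as a simple threshold check. This is a minimal illustration assuming the design metrics have already been exported from Rosetta into plain records; the class, field names, and example values are hypothetical and are not taken from the scripts used in this work. Only the threshold values come from the text.

```python
from dataclasses import dataclass


@dataclass
class DesignMetrics:
    name: str
    shape_complementarity: float
    sasa: float         # interface solvent-accessible surface area
    ddg: float          # predicted binding energy, kcal/mol
    unsat_hbonds: int   # buried unsatisfied hydrogen bonds
    clashes: int        # clash-check count


def passes_filters(d: DesignMetrics) -> bool:
    """Apply the thresholds quoted in the text (field names are hypothetical)."""
    return (
        d.shape_complementarity > 0.6
        and 1000 < d.sasa < 1600
        and d.ddg < -20
        and d.unsat_hbonds < 3
        and d.clashes < 3
    )


# Made-up example records, not data from this study.
designs = [
    DesignMetrics("design_a", 0.68, 1250.0, -27.5, 1, 0),
    DesignMetrics("design_b", 0.55, 1300.0, -30.0, 0, 0),  # fails shape complementarity
    DesignMetrics("design_c", 0.70, 1750.0, -25.0, 2, 1),  # fails interface area
]
kept = [d.name for d in designs if passes_filters(d)]
print(kept)
```

Because all criteria are combined with a logical AND, a design is discarded as soon as any single metric falls outside its window.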
Transmission negative-stain electron microscopy (nsEM) and image processing: SEC-purified cage fractions were diluted to about 0.5 µM (monomeric component concentration) for negative-stain EM characterization. Briefly, 6 μL of protein sample was drop-cast for 2 min on a glow-discharged formvar/carbon supported 400-mesh copper grid (Ted Pella, Inc.). The grid was blotted and stained with 3 μL of 2% uranyl formate, blotted again, and stained with 3 μL of uranyl formate for 20 s before final blotting. Micrographs of stained samples were taken on a 120 kV Talos L120C transmission electron microscope. All nsEM datasets were collected using the EPU software and processed with CryoSPARC (Punjani et al., 2017) with contrast transfer function (CTF) correction. All the particle picks were 2D classified for 20 iterations into 50 classes. Particles from selected classes were used for building the ab initio initial model. The initial model was homogeneously refined using C1 and the corresponding O symmetry.
Cryo-EM grid preparation and data collection: Grids (QUANTIFOIL® R 2/2 on Cu 300 mesh grids + 2 nm C) were vitrified using a Vitrobot Mark IV with the chamber maintained at 22°C and 100% humidity. Grids were plunge-frozen into liquid ethane directly following application of 3.5 μL of the ZnPPaM-loaded nanocage to the glow-discharged (for 5 s) surface of the grid. Grids were screened at the NYU Cryo-EM core facility using a Talos Arctica microscope operated at 200 kV with a Gatan K3 camera. Data were then collected on a Titan Krios microscope operated at 300 kV with a Gatan K3 camera and a BioQuantum imaging filter ("Krios 2" at the New York Structural Biology Center). Data were acquired from duplicate grids using Leginon (Suloway et al., 2005) and pre-processed (2× binned and motion-corrected with MotionCor2 (Zheng et al., 2017)) within Appion (Lander et al., 2009). Full data collection parameters are shown in Supplementary Table 5.
Cryo-EM data processing and model building: Aligned and dose-weighted micrographs were imported to CryoSPARC v.3 (Punjani et al., 2017) and processed using the workflow shown in Supplementary Figure 15.
During data collection, we noted a high proportion of damaged (compressed or fragmented) nanocage particles in areas of ice with a reported ALS thickness below 40 nm. Curation of micrographs to exclude those with the thinnest ice, and those with a CTF fit resolution worse than ~6 Å, facilitated picking of intact nanocage particles. 2D classification was performed on manually picked particles to generate templates representing diverse views of the nanocage, but subsequent template-based picking tended to exclude rare particle views. Recovery of these rare views was improved by picking with a single template representing the view most often missed in prior template-based picking efforts (see Supplementary Table 6). Compared to picking with multiple templates, using a single, rare-view template improved the recovery of diverse particle views, which were then used as a training set for Topaz (Bepler et al., 2019).
Picking with Topaz yielded diverse, well-centered nanocage particles. Data from each of the two grids imaged were picked with Topaz separately, and the curated particles were then combined and further curated in 2D. This larger set of curated particles was used to retrain Topaz (204,039 vs. 19,355 in the initial Topaz training set) on the full set of micrographs from both grids. Particles picked using this Topaz model were then curated by 2D classification, micrograph curation by ice thickness and CTF fit values, and removal of duplicates.
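As an illustration of the duplicate-removal step mentioned above, the sketch below drops picks that fall within a minimum separation of a higher-scoring pick on the same micrograph. This is a generic greedy scheme written for clarity, not the procedure implemented in CryoSPARC; the function name, the coordinate and score format, and the separation value are all assumptions.

```python
import math


def remove_duplicate_picks(picks, min_separation):
    """Greedy de-duplication of particle picks on one micrograph.

    picks: list of (x, y, score); higher-scoring picks are kept preferentially.
    min_separation: distance (same units as x and y) below which two picks
    are treated as the same particle. Both inputs are hypothetical.
    """
    kept = []
    for x, y, score in sorted(picks, key=lambda p: p[2], reverse=True):
        # Keep a pick only if it is far enough from every pick already kept.
        if all(math.hypot(x - kx, y - ky) >= min_separation for kx, ky, _ in kept):
            kept.append((x, y, score))
    return kept


# Made-up coordinates and scores for illustration only.
picks = [(100.0, 100.0, 0.9), (103.0, 101.0, 0.7), (400.0, 250.0, 0.8)]
unique = remove_duplicate_picks(picks, min_separation=50.0)
print(unique)
```

In this toy input the second pick lies only a few pixels from the first and has a lower score, so it is discarded, while the distant third pick is retained.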
A 200-micrograph subset from a single grid was used to generate an ab initio 3D reconstruction. Following iterative rounds of homogeneous and heterogeneous refinement, this map served as the initial 3D model for processing of the full particle set from both grids. 3D refinement and classification yielded a map of the full nanocage with an average reported resolution of ~6.5 Å (as calculated in CryoSPARC using the gold-standard FSC cutoff of 0.143). O symmetry was imposed during the final round of refinement. Continuous conformational heterogeneity likely limited the resolution of the full nanocage map, as discrete states were not readily separable by further 3D classification. Multiple modes of flexibility were visualized using CryoSPARC's 3D Variability Analysis (Punjani & Fleet, 2021), supporting the notion that the nanocage particles used in refinement were subject to compression/deformation (see Supplementary Information for movies of protein breathing motions). We then used partial signal subtraction and focused refinement to improve resolution in the ligand-binding region of the cage (region enclosed in yellow mask, Supplementary Figure 15 inset). Prior to partial signal subtraction, particles were expanded with T symmetry (the highest-order symmetry containing a complete Chl-binding dimer). The symmetry-expanded, partially subtracted particle set was then refined in C1 using Local Refinement in CryoSPARC.
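The choice of T rather than O for symmetry expansion can be illustrated with a small group-theory sketch. It builds the rotation groups O and T from integer rotation matrices and counts the distinct positions generated for a site on an octahedral two-fold axis. The placement of the dimer on the (1, 1, 0) axis is an assumption made for illustration, and the code is unrelated to the symmetry expansion implemented in CryoSPARC.

```python
def matmul(a, b):
    """Multiply two 3x3 integer matrices stored as tuples of tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def generate_group(generators):
    """Close a set of rotation matrices under multiplication."""
    identity = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for h in generators:
            product = matmul(g, h)
            if product not in group:
                group.add(product)
                frontier.append(product)
    return group


def apply(m, v):
    """Rotate the integer vector v by the matrix m."""
    return tuple(sum(m[i][k] * v[k] for k in range(3)) for i in range(3))


ROT_Z_90 = ((0, -1, 0), (1, 0, 0), (0, 0, 1))    # 4-fold rotation about z
ROT_Z_180 = ((-1, 0, 0), (0, -1, 0), (0, 0, 1))  # 2-fold rotation about z
ROT_111_120 = ((0, 0, 1), (1, 0, 0), (0, 1, 0))  # 3-fold rotation about (1, 1, 1)

O_GROUP = generate_group([ROT_Z_90, ROT_111_120])
T_GROUP = generate_group([ROT_Z_180, ROT_111_120])

# Assumption: a C2-symmetric dimer sits on an octahedral 2-fold axis, here (1, 1, 0).
dimer_site = (1, 1, 0)
sites_from_O = {apply(m, dimer_site) for m in O_GROUP}
sites_from_T = {apply(m, dimer_site) for m in T_GROUP}

print(len(O_GROUP), len(T_GROUP))            # group orders
print(len(sites_from_O), len(sites_from_T))  # distinct dimer positions
```

Under these assumptions the groups have 24 (O) and 12 (T) operations, and both generate the same 12 distinct sites. O therefore maps each site onto itself through its own two-fold axis and visits it twice, whereas T visits each site exactly once, which is consistent with T expansion yielding one copy of every complete dimer per particle.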
The cryo-EM map of the full nanocage was used for real-space refinement of a model in Phenix (Echols et al., 2012). Due to the intermediate map resolution, all residues were modeled as alanine and restraints (secondary structure, Ramachandran, and non-crystallographic symmetry constraints) were imposed during refinement. The designed nanocage model was used as a starting point for refinement, and individual chains were docked into cryo-EM maps using Chimera (Pettersen et al., 2004) before hydrogen removal and truncation to polyalanine using phenix.pdbtools. Stubbed, docked models were then subjected to restrained real-space refinement in Phenix. We observed a notable difference between the design model and the cryo-EM density in the angle between each trimeric interface helix and its attached DHR "arm". To generate a starting model for restrained refinement of the full nanocage, we first performed rigid-body refinement, with each trimer subunit modeled as two rigid bodies (corresponding to the interface helix and "arm" regions; residues 259-337 and 1-258, respectively). Cryo-EM model statistics are listed in Supplementary Table 7.
