Programming complex regulation mechanisms through simple molecular assembly


 What are the advantages and disadvantages of building nanosystems using one or multiple components? More than 55% of all proteins found in living organisms are multimeric and likely exploit molecular assembly to create new functional entities. However, the specific contribution of molecular assembly to the creation of novel functions remains relatively unexplored at the thermodynamic, kinetic and molecular levels. Here, we use theory and a simple experimental model to determine the design rules for engineering efficient self-assembled, self-regulated nanosystems. Using these rules, we have rationally designed and implemented various regulation mechanisms (e.g., cooperative and anticooperative assembly, self-inhibition, molecular timer) into two model trimeric nanosystems including a complex artificial catalyst. These simple strategies based on molecular assembly have been extensively exploited by natural biosystems and are expected to play a crucial role in the development of future self-regulated nanotechnologies.

Finely regulated self-assembled molecular systems, aptly called nanomachines, are central to life and are increasingly important in nanotechnology 1,2. In living organisms, molecular systems have evolved to respond precisely to specific variations in stimuli such as temperature, pressure, light, pH, osmolarity, small molecules, or macromolecules [3][4][5]. These nanosystems typically self-assemble via the formation of multiple noncovalent interactions, either through intramolecular folding or through the intermolecular association of two or more molecular components 6,7. The tetrameric protein hemoglobin, for example, contains four oxygen-carrying components finely regulated by variation in pH, carbon dioxide, and temperature and by the allosteric effector 2,3-bisphosphoglyceric acid (2,3-DPG) 8. Inspired by such sophisticated nanosystems, chemists and engineers aspire to develop similar self-regulated systems for various nanotechnological applications, including biosensing, drug delivery and chemical computing 9-11. Different mechanisms have been exploited by nature to create finely regulated molecular systems [12][13][14]. Allostery, for example, regulates the activity of biological macromolecules through structural changes caused by the binding of an effector molecule at a location often distal to the active site [15][16][17]. An improved thermodynamic understanding of allosteric mechanisms has recently provided powerful strategies to optimize the performance of artificial nanosystems, such as their dynamic range, sensitivity, and cooperativity [17][18][19][20][21]. Another potentially important strategy to introduce regulation into molecular systems is to generate functional entities from the spontaneous assembly of multiple molecular components 12,13.
Although this strategy has been exploited extensively by nature, e.g., ribosome self-assembly 22,23, its specific and detailed contribution to creating novel functionalities remains relatively unexplored at the thermodynamic, kinetic and molecular levels [24][25][26]. To explore the potential of molecular assembly to create new functionalities, we designed and characterized a model artificial biomolecular system. Using theory and experiments, we demonstrate how simple self-assembled molecular systems can be readily programmed to produce complex regulation mechanisms by simply changing their number of components, their concentrations, and their thermodynamics and kinetics of assembly (Fig. 1A). We also apply this knowledge to engineer numerous regulation mechanisms into an artificially selected catalytic nanosystem.

Figure. 1 | Designing molecular assemblies using one (black, 1c), two (blue, 2c) or three (green, 3c) molecular components.
A) Molecular assemblies with multiple components enable the creation of various programmable assembly profiles (e.g., cooperative, anti-cooperative and self-inhibited assembly) with temporally controlled activation (e.g., molecular timer). B) A simple DNA-based self-assembled "3-way junction" nanosystem containing three 10-base-pair (5AT/5GC) arms. All assemblies were monitored using fluorescently labeled DNA strands, and the data were normalized accordingly (see Methods). C) Top: Urea (and temperature, Fig. S5) denaturation curves reveal that the free energy of assembly (ΔG°Ass) increases with the number of [...] Table S1). Bottom: The 1c system assembles and disassembles faster than the 2c and 3c systems (see also Fig. S6-S8 for raw data). E) Increasing the number of components increases the ability to inhibit the assembly process by using a complementary "inhibitor" strand.

To establish the thermodynamic and kinetic rules of molecular assembly, we designed a simple self-assembled "3-way junction" nanosystem that can be readily built using 1, 2 or 3 components (Fig. 1B). This motif, which occurs widely in natural and artificial ribozymes and DNAzymes 27, is also an important building block in DNA nanotechnology 28,29. The 3-way junction contains three arms, each separated by a short 2-thymine spacer (see Fig. S1 for linker design). Each arm is made from 10 base pairs (5AT/5GC) and has a predicted folding free energy of -10 kcal/mol (KD ~ 100 nM), which enables assembly within a range of concentrations ideally suited for fluorescence measurements 30.
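The link between the predicted arm free energy and its dissociation constant can be checked with the standard relation KD = exp(ΔG/RT). The snippet below is a minimal sketch: the function name and the default temperature of 37 °C are assumptions made for illustration, since the text does not state the temperature used for the prediction.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def kd_from_dg(dg_kcal_per_mol: float, temp_k: float = 310.15) -> float:
    """Return the dissociation constant (in M, 1 M standard state)
    for a binding/folding free energy given in kcal/mol.

    temp_k defaults to 37 degrees C, which is an assumption for this
    example and not a value taken from the text.
    """
    return math.exp(dg_kcal_per_mol / (R_KCAL * temp_k))


if __name__ == "__main__":
    kd = kd_from_dg(-10.0)
    print(f"KD for a -10 kcal/mol arm: {kd * 1e9:.0f} nM")
```

Under this assumption, a -10 kcal/mol arm falls in the tens-to-hundreds of nanomolar range, the same order of magnitude as the KD quoted above.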
We first compared the stability of all 3-way junctions by assessing the difference in energy between the assembled and disassembled conformations, otherwise known as ΔG, using chemical denaturation procedures (see also SI for thermal denaturation procedure). Urea titration curves typically provide two important parameters: 1) an estimation of ΔG in the absence of urea (ΔG°Ass) and 2) the m-value, which correlates with the amount of surface area exposed to solvent upon disassembly (see Fig. 1C, Top) 31,32. Surprisingly, we found that the one-component system (1c) is significantly less stable than the 2c and 3c systems (-5.9 ± 0.6 kcal/mol versus -15.8 ± 0.2 and -21.8 ± 0.9 kcal/mol, respectively). We also found that the 1c system exposes much less of its surface upon disassembly than the 2c and 3c systems (m-value = 0.67 ± 0.06 kcal/mol·M, 1.61 ± 0.05 kcal/mol·M, and 2.1 ± 0.5 kcal/mol·M, respectively) (Fig. S4). This is likely because the 1c system is already preorganized in its disassembled state. To verify this hypothesis, we characterized the isolated hairpins and found that they remain folded even above 10 M urea (Fig. S2). The ΔG°Ass and the m-values of the different systems are indeed proportional to the number of base pairs formed during the assembly/disassembly transition (Fig. 1B, Bottom and Fig. S4).
Thermal denaturation of these nanosystems also revealed that the assembly of the 1c system showed both a reduction in enthalpy (54 ± 9%) and in entropy (49 ± 10%) compared to the 3c system, consistent with the idea that two of the three arms remain organized in the disassembled state of 1c (Fig. S5).
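The two parameters extracted from a urea titration can be illustrated with the usual linear-extrapolation form of a two-state transition, ΔG([urea]) = ΔG°Ass + m·[urea]. The sketch below applies it to the unimolecular 1c values reported above. The function names and the 25 °C temperature are assumptions, the fitting equation actually used (Eq. 1) is not reproduced in this text, and the concentration-dependent expressions needed for the 2c and 3c systems are not covered.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fraction_assembled(urea_molar, dg0, m_value, temp_k=298.15):
    """Fraction of a unimolecular two-state system in its assembled state.

    dg0     : free energy of assembly without urea (kcal/mol, negative)
    m_value : urea dependence of the free energy (kcal/mol/M, positive)
    temp_k  : temperature in K (25 degrees C is an assumed default)
    """
    dg = dg0 + m_value * urea_molar          # linear extrapolation model
    return 1.0 / (1.0 + math.exp(dg / (R_KCAL * temp_k)))


def midpoint_urea(dg0, m_value):
    """Urea concentration (M) at which half of the system is assembled."""
    return -dg0 / m_value


if __name__ == "__main__":
    dg0, m = -5.9, 0.67                      # 1c values quoted in the text
    print(f"midpoint: {midpoint_urea(dg0, m):.1f} M urea")
    for urea in (0.0, 4.0, 8.0, 10.0):
        print(urea, round(fraction_assembled(urea, dg0, m), 3))
```

A small m-value together with a modest ΔG°Ass places the transition at high urea concentration, which is the signature of a system that buries little additional surface when it assembles.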
We then determined the kinetics of assembly of these nanosystems and found that their rate of assembly decreases drastically when the number of components is increased (Fig. 1D).
For example, while 50% of the 1c system folds within 1 ms (t1/2 = 0.5 ± 0.1 ms), the 2c (t1/2 = 32 ± 9 s) and the 3c (t1/2 = 1287 ± 378 s) systems assemble five and seven orders of magnitude more slowly, respectively (Fig. 1D, inset). We also determined the half-life of all 3-way junctions and found that the 3c and 2c systems remained assembled up to four orders of magnitude longer than the 1c system (23 ± 9 h, 22 ± 8 h, and 21 ± 11 s, respectively) (Fig. 1D, Bottom). The ΔG°Ass and m-values estimated from the kinetics of assembly and disassembly are within the experimental error of the values determined using equilibrium experiments, suggesting that each system assembles and disassembles via a two-state mechanism (see Fig. S4 and Table S1) 33.
Overall, we found that despite its faster assembly rate, the unimolecular nanosystem is also more likely to form preorganized structures (e.g., hairpins) that do not contribute to the overall stability of the assembly. These preorganized structures remain formed even when the system is disassembled. In contrast, building nanosystems using many components reduces the level of preorganized structures, thus ensuring that the assembly mechanism maximizes the number of newly formed interactions. Preorganized structures in the disassembled state also impact the ability of the system to be regulated by simple allosteric mechanisms. For example, we found that the 1c system cannot be inhibited by a classic complementary DNA "inhibitor" due to the low accessibility of its nucleotides locked in the hairpins. In contrast, the 2c and 3c systems are increasingly sensitive to the presence of the inhibitor and are therefore better suited to the development of additional regulation mechanisms (Fig. 1E).
We then determined the impact of varying the number of components on the mechanism of assembly of these nanosystems. The assembly mechanism of unimolecular nanosystems, such as the 1c system, is generally hard to tune. These systems typically fold rapidly and become active as soon as they are synthesized 34. Their activity, therefore, varies linearly with their concentration and cannot be regulated without the help of an external molecule, such as an allosteric effector (see Fig. S9) 35,36. In contrast, the assembly of the dimeric 2c system can be regulated by tuning the concentration of one of its components (here called A). This results in a classic dose-response behavior (see Fig. 2A) with two programmable parameters: 1) the midpoint or [A]50%, i.e., the concentration at which 50% of the system is assembled, and 2) the cooperativity of the response or dynamic range (DR), i.e., the broadness of the transition, defined as the change in [A] required to provide a change in response from 10% to 90% (DR = [A]90%/[A]10%). [...] 37. In such situations, the observed transition becomes more "cooperative", and the dynamic range is reduced by up to 9-fold (Fig. 2C, right, blue curve).
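The two dose-response parameters of a dimeric system can be made concrete with the mass balance of a simple A + B ⇌ AB equilibrium. The following is a minimal sketch under stated assumptions: the monitored signal is taken to be the fraction of strand B found in the dimer, the function names are hypothetical, and the KD of 100 nM is chosen only for illustration. When the total concentration of B is far below KD, the 10% to 90% transition spans close to 81-fold in [A]; when it is far above KD (saturation regime), the span approaches 9-fold and the midpoint approaches half of the total B concentration.

```python
def total_a_for_fraction(f, kd, b_total):
    """Total [A] needed so that a fraction f of strand B is in the AB dimer.

    From KD = [A][B]/[AB] with [AB] = f*b_total and [B] = (1 - f)*b_total,
    free [A] = KD*f/(1 - f), and total [A] = free [A] + [AB].
    All concentrations share the same unit (e.g. nM).
    """
    free_a = kd * f / (1.0 - f)
    return free_a + f * b_total


def midpoint(kd, b_total):
    """[A]50%: total [A] at which half of B is assembled."""
    return total_a_for_fraction(0.5, kd, b_total)


def dynamic_range(kd, b_total):
    """DR = [A]90% / [A]10%."""
    return (total_a_for_fraction(0.9, kd, b_total)
            / total_a_for_fraction(0.1, kd, b_total))


if __name__ == "__main__":
    kd = 100.0  # nM, illustrative value only
    for b_total in (1.0, 100.0, 10_000.0):
        print(b_total, round(midpoint(kd, b_total), 1),
              round(dynamic_range(kd, b_total), 1))
```

In this simple picture, raising the concentration of the fixed component both shifts the midpoint upward and narrows the transition, which is the qualitative trend described for the 2c system.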
The assembly of the trimeric 3c system displays even more programmability by enabling assembly over a wider range of concentrations. For instance, by varying the concentration of components B and C at levels lower or higher than KD BC (41 ± 23 nM, see Fig. S10), one can tune their level of preorganization (i.e., [BC]). For example, when using a concentration of B and C higher than KD BC, the dimer preforms in a state similar to that of the 2c system, and consequently, the 3c system behaves similarly to the 2c system (Fig. 2A-B, darker curves and [...]

We can also conveniently tune the [A]50% and the dynamic range of the 3c system by simply changing the affinity between the components and the ratio between their respective concentrations. For example, one can decrease (increase) the [A]50% of the 3c system by simply decreasing (increasing) the temperature (Fig. 2D, left) or by increasing (decreasing) the number of Watson-Crick base pairs involved in the arms (Fig. 2D, middle). We also found that the anticooperative behavior observed at a low concentration of components B and C is highly dependent on the concentration ratio between these components (R = [C]/[B]). As this ratio increases above 1, the anticooperativity rapidly disappears, and the system displays a typical 81-fold dynamic range (Fig. 2D, right). For example, a 2-fold increase in concentration of one component, from R = 1 to R = 2, decreases the dynamic range of assembly by up to 6-fold, from DR = 383 to DR = 57, while keeping the [A]50% relatively unchanged, from 18 nM to 11 nM. This provides a useful strategy to specifically program either the [A]50% or the dynamic range independently. Overall, these results demonstrate that the assembly process of nanostructures made from many components can be programmed to provide much more diverse regulation profiles (e.g., from an anticooperative to a highly "cooperative" assembly profile).
Figure. 2 | [...] is increased, the [A]50% of the 2c system increases from 5.6 nM to 516 nM (left), while its dynamic range decreases from 81- to 9-fold (right). When the concentrations of both strands B and C are increased, the [A]50% of the 3c system displays a more complex relationship (changing from 92 nM to 18 nM to 372 nM, left), while its dynamic range decreases from 729- to 9-fold (right). D) The assembly profile of 3-component systems can be readily programmed by: 1) [...]

Another way to program the assembly of a 3c system is by changing its propensity to form the final trimeric assembly (ΔΔGTri-Dim or |ΔGTri - ΔGDim|) by tuning the affinity of A towards the preorganized system BC (Fig. 3A). To explore and characterize this effect, we designed a set of three different 3-way junctions displaying similar ΔGDim (same arms) but different ΔGTri values. In an attempt to modulate only ΔGTri, we altered the stability of the junction by varying the linker length between the arms (0, 2, 4 thymines) (Fig. 1A) 38,39. Using temperature denaturation curves, we found that all of our designs display similar dimeric affinities (approximately -10.6 ± 0.4 kcal/mol, Fig. S1) but different trimeric affinities ranging from -12.0 ± 0.7 kcal/mol (0T spacer) to -18.3 ± 0.3 kcal/mol (2T spacer) (Fig. 3B and Fig. S1D). As shown above, when using a concentration of B and C below KD BC, we found that the most stable 3c system (2T linker, ΔΔGTri-Dim = 7.3 ± 0.5 kcal/mol) assembles efficiently with a high level of "cooperativity" (Fig. 3C, DR = 20 ± 5). When ΔΔGTri-Dim is significantly reduced (e.g., 0T linker, ΔΔGTri-Dim = 2.0 ± 1.1 kcal/mol), the assembly becomes self-inhibited ("none-all-none" mechanism, 24) at high [A] through a mechanism that favors the formation of dimers over the trimer (Fig. 3E).
For example, the percentage of assembled trimer increases from 10% to 70% when [A] is changed from 5 nM to 200 nM and goes back down to 10% when [A] is further increased to 10 µM (DR = 2000-fold; see also how to program this self-inhibited dynamic range in Fig. S14). This result highlights that if ΔΔGTri-Dim is too small, an increase in [A] will not drive complex assembly further but rather inhibit it by favoring dimer formation (Fig. S14). Finally, the longer 4T linker system, which produces a trimer of intermediate affinity (ΔΔGTri-Dim = 4.0 ± 0.8 kcal/mol), displays less "cooperativity" of assembly than the more stable 2T system (Fig. 3C, DR = 75 ± 18) without displaying the self-inhibition mechanism of the least stable 0T system. In contrast, similar mutations in the dimeric 2c system provided no diversity in the regulation mechanism (Fig. 3D).
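The self-inhibition described above can be reproduced qualitatively with a small equilibrium model. The sketch below is not the model used in this work: it assumes that A is in large excess (free [A] ≈ total [A]), that B and C have equal total concentrations, that all three pairwise dimers share a single dissociation constant (`kd_dimer`), and that the third strand leaves the trimer with a dissociation constant `kd_third`. All parameter names and numerical values are hypothetical. Under these assumptions the free concentration of B (equal to that of C) is the positive root of a quadratic mass-balance equation.

```python
import math


def trimer_fraction(a, b_total, kd_dimer, kd_third):
    """Fraction of strand B found in the ABC trimer at equilibrium.

    Assumptions (illustrative only): [A] is in large excess and constant,
    [B]total = [C]total, all dimers (AB, AC, BC) have the same kd_dimer,
    and ABC <-> dimer + third strand has dissociation constant kd_third.
    Mass balance on B with x = free [B] = free [C]:
        b_total = x*(1 + a/kd_dimer) + x**2 * (1 + a/kd_third) / kd_dimer
    """
    p = 1.0 + a / kd_dimer
    q = (1.0 + a / kd_third) / kd_dimer
    x = (-p + math.sqrt(p * p + 4.0 * q * b_total)) / (2.0 * q)
    trimer = a * x * x / (kd_dimer * kd_third)
    return trimer / b_total


if __name__ == "__main__":
    b_total, kd_dimer = 10.0, 20.0            # nM, hypothetical values
    for kd_third in (1.0, 0.001):             # weak vs. strong trimer
        row = [round(trimer_fraction(a, b_total, kd_dimer, kd_third), 3)
               for a in (100.0, 1000.0, 10000.0)]
        print(f"kd_third = {kd_third} nM:", row)
```

With a weakly stabilized trimer, the trimer fraction falls as [A] rises because excess A sequesters B and C into AB and AC dimers. With a strongly stabilized trimer, the decline is pushed to much higher [A], in line with the role attributed to ΔΔGTri-Dim.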

Figure. 3 | Programming trimer assembly by increasing the difference in energy between the dimer and trimer (ΔΔGTri-Dim).
A) Thermodynamic scheme of the assembly of the 3c systems. B) Changing the thymine spacer length (0T black, 2T green and 4T blue, see Fig. 1A) creates trimers with different ΔΔGTri-Dim values (see inset). C) Increasing ΔΔGTri-Dim narrows the dynamic range of assembly (as seen with the more cooperative 2T system), while low ΔΔGTri-Dim (0T) creates a self-inhibited trimeric system, which disassembles into two dimers (AB and AC) at higher [A]. D) Similar modifications on the 2c system provide no difference in assembly. E) Polyacrylamide gel electrophoresis of the 0T spacer trimeric assembly supports that the decrease in trimer occurs through sequestration into dimers (also see Fig. S15).
The regulation mechanisms discussed above all take place at equilibrium. However, molecular assemblies can also be under kinetic control [40][41][42]. In the case of the 3c system, for example, mixing an excess of component A (100 to 600 nM) with smaller concentrations of B and C (10 nM) traps the latter into nonfunctional AB and AC dimers (Fig. 4A). Dissociation of these dimers is then required to enable trimer formation by the slow association of the formerly sequestered components B and C, thus resulting in biphasic kinetics (Fig. 4B and Fig. S16). We confirmed that the fastest phase, kDim, represents the formation of dimers AB and AC given their linear dependency on the concentration of monomer A (Fig. 4C blue). In contrast, the slowest phase, kTrim, represents the formation of the trimer and is rate-limited by the formation of dimer BC, thus explaining its insensitivity to the concentration of component A (Fig. 4C green and Fig. S17). This kinetically controlled mechanism of assembly provides interesting time-dependent formation/dissociation profiles for the dimer and the activation of the trimer. Experimental [...] and therefore their level of preorganization, one can increase the rate of assembly of the trimer (Fig. 4E and Fig. S17). Notably, increasing the rate of trimer formation also decreased the percentage of dimers transiently formed. These results demonstrate how a trimeric molecular assembly can be easily programmed to activate and deactivate within specific time ranges, thus acting as a molecular timer. Such time-dependent nanosystems are important in various biochemical processes, such as signal transduction, protein synthesis and the cell cycle, and are likely to provide promising applications in future self-regulated nanotechnologies [43][44][45].
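The biphasic behavior described above can be illustrated with a small kinetic simulation. This is a minimal sketch and not the fitting model used in this work (Eq. 8 is not reproduced here). It assumes that A is in constant excess, that AB and AC are dead-end dimers, that the trimer forms only when A captures a BC dimer, and that trimer formation is irreversible on the simulated timescale. The rate constants `k_on` and `k_off` are hypothetical values, and a fixed-step explicit Euler integrator is used for simplicity.

```python
def simulate_timer(a=100.0, b0=10.0, c0=10.0, k_on=1e-3, k_off=1e-2,
                   t_end=20000.0, dt=0.05, sample_every=2000):
    """Integrate the trap-and-release scheme with explicit Euler steps.

    Units: nM and s (k_on in 1/(nM*s), k_off in 1/s); all values are
    hypothetical. Returns a list of (time, [AB], [ABC]) samples.
    """
    b, c, ab, ac, bc, abc = b0, c0, 0.0, 0.0, 0.0, 0.0
    trace = []
    steps = round(t_end / dt)
    for i in range(steps):
        if i % sample_every == 0:
            trace.append((i * dt, ab, abc))
        r_ab = k_on * a * b - k_off * ab     # A + B <-> AB (trap)
        r_ac = k_on * a * c - k_off * ac     # A + C <-> AC (trap)
        r_bc = k_on * b * c - k_off * bc     # B + C <-> BC (productive)
        r_abc = k_on * a * bc                # A + BC -> ABC (irreversible)
        b += dt * (-r_ab - r_bc)
        c += dt * (-r_ac - r_bc)
        ab += dt * r_ab
        ac += dt * r_ac
        bc += dt * (r_bc - r_abc)
        abc += dt * r_abc
    trace.append((t_end, ab, abc))
    return trace


if __name__ == "__main__":
    for t, ab, abc in simulate_timer()[::20]:
        print(f"t = {t:7.0f} s   [AB] = {ab:5.2f} nM   [ABC] = {abc:5.2f} nM")
```

In this sketch the dimer population rises within seconds and then decays slowly as B and C are released and captured into the trimer, giving the two well-separated phases that underlie the timer behavior.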

Figure. 4 | Programming time-dependent assembly.
A-B) At low concentrations of monomers B and C (mainly unbound), the addition of an excess of monomer A produces slower biphasic kinetics due to the sequestration of strands B and C into AB and AC dimers. Trimer formation then proceeds through a slow strand-exchange mechanism limited by the association of the formerly sequestered components B and C (see Fig. S17 and S20). C) Kinetic trace of the dimeric (blue) and trimeric (green) assemblies derived from the raw kinetic data (black). D) By tuning the concentration of A, one can program the rate of formation (activation) and the percentage of formed dimer without affecting the rate of trimeric assembly. E) By increasing the concentrations of [B] and [C] and therefore their level of preorganization, one can increase the rate of assembly of the trimer and decrease the percentage of transiently formed dimers. See Fig. S22 for raw data of panels D and E.
In this study, our 3-way junction served as a convenient synthetic toolkit to quantitatively test simple molecular assembly and the extent to which it can be harnessed to create novel regulation mechanisms. To test the generality and predictability of these findings, we used these rules to program the catalytic activity of NaA43, a sodium-specific RNA-cleaving DNAzyme previously used as a sensor to monitor the sodium concentration inside cells 46. We first measured the apparent activity of NaA43 and found that it displays an apparent KM of 16 ± 1 nM, corresponding to half of the concentration of the DNAzyme used in our assay (30 nM), and a dynamic range of 9 ± 1 (see Fig. 5A). We also estimated KD DNAzyme, the dissociation constant between the substrate and the DNAzyme, to be in the fM range (Fig. S23), which explains why this DNAzyme/substrate system operates in a saturation regime ([DNAzyme] > KD DNAzyme, see also Fig. 2A). To explore the effect of molecular assembly on the regulation of this DNAzyme, we engineered a dimeric DNAzyme by moving the loop far from the catalytic site, thus creating a three-component system. This modification still provides a functional DNAzyme despite a 49 ± 4% reduction in catalytic activity (Fig. S24). At a high concentration of DNAzyme components (1 µM), which ensures high preorganization, we found that [A]50% is still approximately half the concentration of the DNAzyme (729 ± 29 nM) with a "cooperative" dynamic range (DR = 7.7 ± 0.3). In contrast, as predicted by our model (Fig. 4A-B), using a low concentration of DNAzyme components (i.e., 30 nM) produces a kinetic trap. This kinetic trap can be used to program a substrate inhibition regulation mechanism (i.e., "none-all-none" regulation), where both DNAzyme components become sequestered into nonfunctional dimers at high substrate concentrations (see Fig. S25).
We then programmed the DNAzyme into a molecular timer to allow its activity to be finely regulated over time. To do this, we created a novel trimeric assembly by adding (instead of cutting, see Fig. 5B) an extra component to the existing system (Fig. 5C). We designed this extra "controller" component to hybridize to a variant of the DNAzyme (see Fig. S27 for design), forming either an inactive trimer or an active dimer (Fig. 5C top and Fig. S28). In contrast to the native DNAzyme, which deactivates only when running out of substrate, our molecular assembly strategy enables tuning of both the catalytic rate and the deactivation time of the DNAzyme (Fig. 5C bottom). As depicted, by further increasing the substrate concentration, we favor dimer sequestration and formation of the active DNAzyme, thus increasing the catalytic rate (Fig. 5D).
This dimer deactivates at a controlled time t1/2 = 29 ± 8 min upon binding to the controller strand (forming the inactive trimer). In contrast, by increasing the concentration of DNAzyme and controller (Fig. 5E), we favor a faster assembly of the trimer, leading to faster deactivation of the DNAzyme without drastically changing the amount of dimer generated (i.e., with a similar catalytic rate). These results exemplify how simple molecular assembly strategies can be applied to introduce finely programmed regulation mechanisms into complex nanosystems such as a DNAzyme.

Here, we have demonstrated the versatility of simple molecular assembly to achieve a wide range of regulatory mechanisms in two different model nanosystems. We have shown that despite assembling at a slower rate, nanosystems built using multiple components may also lead to assemblies that are significantly more stable. They do so by employing smaller components that contain fewer preorganized structures, thereby leading to more interactions being formed during the assembly process (Fig. 1). Smaller, less preorganized components also display more potential to form interactions with other molecular effectors, thus creating novel avenues for regulating their assembly (e.g., complementary DNA inhibitor, Fig. 1E). Another advantage of three-component systems over two-component systems is that they permit assembly using both a "cooperative" and an anticooperative process (Fig. 2). We also showed that three-component systems can be tuned to exhibit self-inhibition mechanisms (Fig. 3) as well as time-dependent activation/deactivation mechanisms (Fig. 4). All these complex regulation profiles (at equilibrium or over time) are achievable using only a simple molecular assembly strategy and can be applied to systems of increasing complexity (e.g., catalytic nanosystem, Fig. 5). This illustrates the simplicity of this approach compared to the more complex allosteric regulation mechanisms often employed by nature to produce similar regulation profiles 19,47,48.
Engineering complex self-regulation mechanisms using simple molecular assembly provides a quantitative and programmable chemical strategy to develop and optimize nanosystems with applications ranging from biosensing 49 to chemical computing 50,51 and drug delivery 52,53. For example, current strategies to extend the dynamic range of sensors consist of combining two or more sensors with different affinities [54][55][56]. In contrast, here, we have illustrated how a three-component sensing system can be programmed to display either a narrow or an extended dynamic range. This ability to tune the dynamic range can also be useful to program and optimize the response of molecular logic gates. For example, a narrow dynamic range creates a more efficient all-or-none response, while a self-inhibited nanosystem (Fig. 3C) produces a "bandpass" filter (i.e., none-all-none response), a regulatory mechanism observed in many cellular functions 57. A three-component system could also help maintain drug concentration within a specific therapeutic window. Nature, for example, employs various substrate inhibition strategies to maintain the level of crucial product metabolites despite large variations in substrate concentration 58,59. One could envisage a simple three-component nanosystem to control and maintain the level of an active drug following its activation through a regulated catalytic system (Fig. 5B). Finally, a three-component system with programmed kinetic traps can enable time-specific activation/deactivation of various active biomolecules, leading to flexible and custom disease treatment strategies (Fig. 5C) 60,61.
In addition to providing new strategies to develop complex self-regulated nanosystems, elucidating the thermodynamic and kinetic basis of molecular assembly will likely contribute to a better understanding of protein complex evolution. We have demonstrated how two common natural mechanisms to create novel protein assemblies, fission (i.e., cut in half) and fusion (i.e., controller strand) 62,63, have enabled us to engineer novel functionalities in a DNAzyme. Given that more than 55% of all proteins in living organisms are multimeric 26, it will be interesting to explore whether the functional gains of these proteins, e.g., complex regulation mechanisms, have emerged from the advantages derived from simple molecular assembly strategies 64. It remains challenging to answer this question, however, given that multimeric proteins have evolved and diverged over billions of years 26,65. In conclusion, simple molecular assembly provides an efficient, programmable strategy to improve nanosystem functionality (e.g., by enabling optimized regulation mechanisms). We believe that the simplicity with which this strategy can be implemented in any nanosystem will greatly impact the development of future self-regulated nanotechnologies 40,66.

Fluorescence experiments.
Urea titration curves. Urea titration curves were performed following a method developed by our lab 31. We started with a 900 µL solution of the DNA-based system of interest in 10 M urea buffered solution (10 mM NaH2PO4, 40 mM NaCl, pH = 7.00). We then sequentially diluted this solution with a buffered solution containing the same concentration of DNA-based system but with no urea. Each sample was equilibrated for 2 minutes before recording its fluorescence (Cary Eclipse, Agilent Technologies). Unimolecular system titration (ADiss ⇌ AAss) was performed at a concentration of 10 nM of fluorescent DNA strand and fitted using Eq. 1.
The trimeric association follows this reaction: A + B + C ⇌ ABC, where A, B, and C are single-stranded DNA and ABC is the three-way junction formed from these strands. When measuring the assembly, we use at least a 10-fold excess of molecule A in buffer that is rapidly mixed using a stopped-flow instrument coupled with a fluorimeter (SX20, Applied Photophysics) with a solution of B and C in water (to avoid pre-association of B and C). Data are fitted using a combination of a pseudo-first-order kinetic and a second-order kinetic (Eq. 8).

Appropriate dilutions of unlabeled DNA solutions were made such that the concentration of strands B and C is kept at 1 µM and the concentration of strand A is changed from 30 nM to 100 µM.
Solutions are then mixed in a 5:1 ratio with the 6x loading buffer (2.5 mg/mL bromothymol blue, 2.5 mg/mL xylene cyanol FF and 30% glycerol in water). A 15% polyacrylamide gel is hand-cast following the Bio-Rad protocol and incubated in the running buffer (0.5x TBE buffer containing 5 mM of MgCl2) for 1 h. Samples (10 µL) are run for 90 min at 120 V using the Mini-PROTEAN Tetra cell electrophoresis unit (Bio-Rad) and the Bio-Rad PowerPac Basic power supply. Gels are stained with a 0.5x solution of GelRed™ (Biotium) for 10 minutes and analyzed on an imaging system (ChemiDoc™ XRS+, Bio-Rad). The integration of band intensity is then performed to evaluate the amount of assembled DNA-based systems.