In vitro characterisation of the MS2 RNA polymerase complex reveals novel host factors that modulate leviviral replicase activity

The coliphage MS2 is a well-established model organism that has helped to reveal a variety of fundamental concepts for RNA-based translational control and viral RNA packaging. Despite the comprehensive characterisation of the MS2 life cycle, the macromolecular composition of its RNA replicase remained obscure. Here, we sought to identify the missing host proteins required for the assembly of the active replicase complex. By combining a purified, inactive MS2 replicase sub-complex with selected modules of an in vitro translation system, we confirmed that the three suspected host factors EF-Ts, EF-Tu and ribosomal protein S1 form an active core enzyme with the viral replicase subunit. Unexpectedly, we also found that the translation initiation factors IF1 and IF3 directly modulate MS2 replicase activity. While IF1 enhances replicase activity in a template-independent manner, IF3 acts as an inhibitor that prevents polymerase initiation and/or elongation. Both observations suggest a previously unknown role of these host proteins during the phage life cycle. Finally, we demonstrate the in vitro formation of small RNAs that contain minimal motifs required for MS2 replicase-dependent amplification. Our work sheds new light on the architecture of the MS2 replication machinery while also providing the basis for a new cell-free evolution platform.


Introduction
The leviviral bacteriophage MS2 is one of the smallest viral pathogens known; its single-stranded 3.6 kilobase (kb) RNA genome encodes only four proteins (1,2). Due to its simple architecture, MS2 finds extensive use as a model organism in various biological research fields and has helped to reveal a large number of fundamental biological processes such as translational control, feedback inhibition and key aspects of viral life cycles (3)(4)(5)(6). Moreover, MS2 is used as a surrogate for human pathogenic RNA viruses to study the properties, stability and detectability of RNA viruses under different environmental conditions (7)(8)(9). Apart from their relevance to RNA biochemistry and molecular biology, protein and RNA components of ssRNA coliphages such as Qβ and MS2 serve as attractive platforms for RNA imaging (10)(11)(12), RNA packaging and delivery (13)(14)(15), and molecular evolution and gene expression in cell-free systems (16)(17)(18)(19)(20).
Infection of male Escherichia coli cells by MS2 is dependent on the maturation protein (mp), which is present as a single copy in the MS2 capsid and enables binding of MS2 virions to the host's F-pili (6). Following infection, the host translation machinery immediately begins synthesizing the MS2 coat protein (cp), whose main function is to encapsulate newly synthesised viral (+) strands. Ribosomal readthrough events of this gene provide a brief opportunity for initiation of translation of the replicase (rep) gene, whose ribosome-binding site is otherwise sequestered in an "operator" hairpin RNA structure (21) and inhibited by long-range RNA-RNA interactions (22). Similarly, the timing and level of mp expression are controlled by the RNA folding kinetics of an untranslated leader sequence (23). To initiate genome replication, the MS2rep subunit forms an active holocomplex, which likely consists of the E. coli elongation factors EF-Tu and EF-Ts, ribosomal protein S1 and an as yet undefined host protein (Fig. 1) (1). This complex synthesises new genomic (+) strands using antigenomic (-) strand intermediates as templates. At later stages of the life cycle, cp dimers bind and stabilize the operator hairpin, which ultimately leads to complete suppression of rep translation (21). Finally, expression of the lysis gene (lys) leads to the release of mature virions from the infected host, enabling new cycles of infection (6). Leviviral replicases are amongst the most ancient RNA-dependent polymerase enzymes and share key characteristics with other viral RNA polymerases (24), which makes them attractive model systems for drug discovery and mutational analyses. However, although MS2 is one of the main model systems for RNA viruses, fundamental aspects of its genome replication remain unclear to date. In particular, a host factor that was postulated almost 50 years ago (25,26) and is required for RNA replicase activity remains unknown.
Here, using our previously established reporter system for MS2 replicase activity in PURE systems (20), we identified new host factors that modulate the activity of the phage enzyme.
Through stepwise reduction of PURE proteins, we identified initiation factor 1 (IF1) as a co-factor that stimulates the activity of the MS2 replicase core complex consisting of MS2rep, EF-Tu, EF-Ts, and ribosomal protein S1. Furthermore, we found that initiation factor 3 (IF3) acts as an inhibitor of replicase activity, which has potential implications for the viral life cycle. Finally, we demonstrate that the MS2 replicase complex can evolve small replicable RNA species during the in vitro replication of the full-length genome. These new RNA scaffolds may be well-suited for the generation of robust self-replicating synthetic RNA systems in cell-free in vitro systems.

Preparation of RNAs and plasmids
The preparation of all RNAs and the cloning of all plasmids are described in detail in the Supplementary Methods. Supplementary Table 1 contains all primers used for RNA preparation and cDNA synthesis.
Supplementary Table 2 lists all DNA templates used for IVT. Supplementary Table 3 contains the sequences of all plasmids used in this study.

Preparation of Proteins
The preparation of proteins followed an adapted version of the protocol described by Shepherd et al. (27). In short, Top10 E. coli cells (Thermo Fisher Scientific) were transformed with each plasmid by electroporation. Proteins were overexpressed at 16°C overnight in Lysogeny Broth (LB Lennox) with 0.2 % L-arabinose for all pBAD33-based constructs or 1 mM IPTG for pLD1-3, respectively. Subsequently, proteins were purified over HisPur™ Ni-NTA resin (Thermo Fisher Scientific) using HEPES buffer (50 mM HEPES·KOH pH 7.5, 250 nM NH4Cl, 10 mM MgCl2, 5 mM DTT) and stored in HEPES/glycerol buffer (50 mM HEPES·KOH pH 7.5, 100 mM KCl, 10 mM MgCl2, 7 mM DTT, 30 % glycerol). Preparations from pLD1, pLD2 and pLD3 were adjusted to ten-fold stocks, with the final protein content per reaction equalling the concentrations described previously (27). A more detailed protocol is included in the Supplementary Methods.

Real time fluorescence measurements
All readout constructs are based on designs by Weise et al. (20). For experiments using the PURExpress® In Vitro Protein Synthesis Kit (NEB), standard reactions were supplemented with 10 µM DFHBI-1T, MS2rep and F30-Bro(-) or MS2-Bro(+), respectively, as described for the individual experiments. Fluorescence signals were recorded every 60 s over a total of four hours or every 120 s for six hours for experiments involving F30-Bro(+|-) or MS2-Bro(+|-), respectively. Unless stated otherwise, all experiments were performed in independent triplicates assembled from the same stock solutions.

Serial transfer and sequencing
For serial transfer experiments, reactions were set up by mixing components in the following composition: 15 µM EF-Tu, 15 µM EF-Ts, 1.5 µM S1, 1 µM MS2rep, 0.5 mM each of ATP, GTP, CTP and UTP, 10 mM DTT and 0.5 U/µL RNase inhibitor (moloX). IF1 was supplemented at a final concentration of 15 µM. The final concentration of MgCl2 was 6 mM, with HEPES, KCl and glycerol supplemented at 50 mM, 100 mM and 18 %, respectively.
For the first reaction, MS2wt RNA was provided at 50 nM final concentration.
Reactions were incubated at 37°C for three hours, frozen in liquid nitrogen, and stored at -80°C. New reactions were mixed as described above, except that 1/5 of the reaction volume (corresponding to MS2wt RNA in H2O) was replaced with an aliquot from the previous reaction.
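The dilution implied by this transfer scheme can be made explicit with a short calculation. The following Python sketch is illustrative only: it assumes that the input RNA is neither replicated nor degraded, so it gives the passive carry-over of the original 50 nM MS2wt RNA when 1/5 of each reaction is passed on. The function name and the number of transfers shown are choices made for this illustration.

```python
def carryover_nM(initial_nM: float, transfer_fraction: float, n_transfers: int) -> float:
    """Concentration of the original input RNA left after n serial transfers.

    Assumes pure dilution: no replication and no degradation (illustrative only).
    """
    if n_transfers < 0:
        raise ValueError("n_transfers must be non-negative")
    return initial_nM * transfer_fraction ** n_transfers


# 50 nM input and a 1/5 volume transfer, as stated in the text.
for n in range(6):
    print(f"transfer {n}: {carryover_nM(50.0, 1 / 5, n):.3f} nM")
```

Under these assumptions, any RNA species that persists at a roughly constant level across transfers must be amplified about five-fold per round to offset the dilution.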
To obtain sequencing data, RNAs from serial transfer experiments were purified using the Monarch® RNA Cleanup Kit (NEB), then polyadenylated using E. coli Poly(A) Polymerase (NEB) and re-purified with the same clean-up kit. Double-stranded cDNAs were synthesized using Template Switching RT Enzyme Mix (NEB) in combination with primers CDSII-T24VN and TSO-CDSII (Supplementary Table 1).

Preparation of gel electrophoresis samples
All samples for gel electrophoresis were prepared by mixing equal volumes of reaction mixture and 2x RNA Gel loading dye (Thermo Fisher Scientific). Samples were heated to 70°C for five minutes and then slowly cooled to room temperature. This allowed bound proteins to denature and dissociate from RNAs, and complementary RNA strands to denature and re-anneal.

Results
Purified MS2 replicase is functional in recombinant in vitro translation systems
To characterise the activity requirements of the MS2 replicase complex, we overexpressed the MS2rep subunit in E. coli and purified it by single-step immobilized metal ion affinity chromatography (IMAC). In the one-step purification protocol, MS2rep co-eluted with the two potential host factors ribosomal protein S1 and the translation factor (TF) EF-Ts, both of which are components of the related Qβ replicase complex (Supplementary Fig. 1). Surprisingly, we noticed that EF-Tu, which is an essential and tightly binding host factor for the Qβ replicase complex that readily co-purifies with Qβrep, did not co-elute with the His-tagged MS2rep subunit. Thus, the assembly properties of the replicase core complexes of the genera Allolevivirus (Qβ) and Levivirus (MS2) show unexpected differences.
Next, we sought to explore whether the MS2rep·S1·EF-Ts heterocomplex could be used to initiate transcription of a genuine MS2 template. To this end, we made use of our previously established MS2 RNA polymerase assay for the detection of MS2 replicase activity in recombinant in vitro transcription-translation (PURE) systems (20,28). In this assay, a fluorescence readout is produced by (+) strand synthesis of the broccoli aptamer from a (-) strand template that is fused to the 3'-end of the genomic MS2 (-) strand (F30-Bro(-)), in the presence of the fluorogen DFHBI-1T (29) (Fig. 2A). In agreement with literature reports for in situ expressed MS2rep (20), we observed a strong fluorogenic readout when both F30-Bro(-) and MS2rep·S1·EF-Ts were incubated in the commercially available PURExpress® system. We also observed F30-Bro(+) synthesis, albeit at a lower level, in a homemade PURE system (PURE 3.0), which was prepared according to a previously established protocol (27) (Fig. 2B). Based on this result, we inferred that the PURE systems contained the host factors required to assemble the active MS2 holoenzyme.
To further narrow down the range of possible E. coli proteins that are required for MS2 replicase activity, we investigated the activity in the presence of different PURE protein fractions. In PURE 3.0, 30 of the 31 E. coli TFs are obtained after co-expression and purification of the TF genes from three large expression plasmids, resulting in three protein fractions (LD1, LD2 and LD3) (27). An additional enzyme mix contains 70S ribosomes as well as the elongation factor EF-Tu and the enzymes necessary to reconstitute an NTP regeneration system based on creatine phosphate (Supplementary Table 4). Initially, we tested MS2 replicase activity in reduced PURE 3.0 reactions (PUREred) based on the LD1-LD3 protein fractions. We also used a simplified enzyme mix containing only ribosomal protein S1 and EF-Tu because, based on the genetic similarities between MS2 and Qβ (30), we did not expect ribosomes or kinase enzymes to be the missing co-factors. As anticipated, MS2 showed transcription of F30-Bro(+) in the PUREred environment (Fig. 2C). Next, we sought to narrow down the range of potential host factors by further omitting protein components from the PUREred setup. In this way, we confirmed that the TF EF-Tu and ribosomal protein S1 are critical co-factors required for full MS2 replicase activity, similarly to Qβ, as depletion of both proteins led to a drastic loss in F30-Bro(+) synthesis (Fig. 2C). While depletion of EF-Tu alone had a drastic effect on F30-Bro(+) synthesis, we still observed significant transcription in the absence of added S1 protein, presumably due to the presence of S1 protein in the purified complex (Supplementary Fig. 1).
In addition to the expected dependency of the replicase on S1 and EF-Tu, we also observed an unforeseen impact on F30-Bro(+) synthesis upon depletion of the individual LD protein fractions. While depletion of both LD1 and LD3 appeared to weakly stimulate MS2rep activity, omission of LD2 caused a drastic loss of F30-Bro(+) transcription (Fig. 2C). LD2 contains eight proteins: four tRNA synthetases (AlaRS, AsnRS, IleRS, PheRS1 + 2), E. coli methionyl-tRNA formyltransferase (MTF), the translation elongation factor EF-Ts (which co-purifies with the MS2rep subunit), and the two translation initiation factors IF1 and IF3 (27).
To identify the TFs responsible for this marked effect on replicase activity, we performed selective depletion experiments with all LD2 proteins (Fig. 3A). Omitting the added EF-Ts led to a complete loss of activity, revealing that the excess amount of EF-Ts in the LD2 fraction is essential for transcription of the unnatural F30-Bro(-) template. We further observed an unexpected alteration of transcription activity upon omitting the initiation factors IF1 and IF3. While depletion of IF1 caused a ~50% reduction in F30-Bro(+) synthesis, the omission of IF3 led to a more than 400% increase in transcription activity compared to the positive control reaction containing the full 1x LD2 protein fraction. These findings indicated that IF1 stimulates MS2 replicase activity, whereas IF3 acts as an inhibitor. This hypothesis was further corroborated in additional experiments, in which an excess of 5 µM of each of the eight individually purified LD2 proteins was added to the reaction mixture (Fig. 3B). Whereas excess IF1 led to increased transcription activity, adding an excess of IF3 completely abolished F30-Bro(+) synthesis. In contrast, the omission or supplementation of the four tRNA synthetases had no impact on MS2 replicase activity. Both the inhibitory effect of IF3 and the stimulating effect of IF1 showed clear dose dependencies, with observable effects already at low micromolar concentrations (Fig. 3C), supporting the notion that they are based on a direct functional interaction with the MS2 replicase core complex. In contrast, control experiments using MTF or PEG8000 at increasing concentrations showed that neither non-specific protein-protein interactions nor excluded volume effects are responsible for the observed modulation of replicase activity induced by IF1 and IF3.
Furthermore, we found evidence that inhibition by IF3 is based on a direct competition between IF3 and the replicase for RNA binding as only an excess of the MS2rep complex could rescue transcriptional activity ( Supplementary Fig. 2).
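Because the depletion effects are quoted as percent changes relative to the positive control, it can help to restate them as fold-of-control values. The Python sketch below is a minimal illustration; the function name is hypothetical, and the IF3 figure is a lower bound because the text states "more than 400%".

```python
def fold_of_control(percent_change: float) -> float:
    """Convert a percent change relative to a control into a fold-of-control value."""
    return 1.0 + percent_change / 100.0


# Percent changes quoted in the text for the LD2 depletion experiments.
print(fold_of_control(-50.0))  # IF1 omitted: about 0.5x of control
print(fold_of_control(400.0))  # IF3 omitted: at least 5.0x of control
```

Read this way, removing IF3 raises activity to at least five times the control level, whereas removing IF1 roughly halves it.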

IF1 stimulates synthesis of the full-length MS2 genome
While F30-Bro(+) synthesis enables monitoring of (+) strand synthesis from an artificial (-) template, it provides no information on the complete replication cycle of the natural ~3600 nt MS2 genome. To probe full-length genome replication by the in vitro reconstituted MS2 replicase complex, we integrated the broccoli aptamer into the (+) strand of the MS2 wild type (MS2wt) genome at an amenable site downstream of the open reading frame for the maturation protein (MS2(+)Bro(+)) (Fig. 4A), where its insertion should only minimally interfere with replication (31)(32)(33). Using this construct, we were able to observe a continuous increase in DFHBI-1T fluorescence when the MS2rep·S1·EF-Ts complex was incubated in the PURExpress system, which suggests processive genome replication (Fig. 4B). Comparison with reference inputs of MS2(+)Bro(+) showed an estimated sixfold amplification, corresponding to an increase of MS2(+)Bro(+) from 15 nM to approximately 90 nM over a 6-hour time course.
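The amplification estimate can be reproduced from the two concentrations given. The Python sketch below is illustrative: it computes the fold change and the mean net synthesis rate over the time course, and also expresses the fold change as an equivalent number of doublings. The latter is purely descriptive, since the text does not state that the kinetics are exponential; all function names are choices made for this example.

```python
import math


def fold_amplification(start_nM: float, end_nM: float) -> float:
    """Fold change in template concentration over the reaction."""
    return end_nM / start_nM


def mean_net_rate_nM_per_h(start_nM: float, end_nM: float, hours: float) -> float:
    """Average net synthesis rate over the whole time course."""
    return (end_nM - start_nM) / hours


def equivalent_doublings(fold: float) -> float:
    """Number of doublings that would give the same fold change (descriptive only)."""
    return math.log2(fold)


fold = fold_amplification(15.0, 90.0)
print(f"fold amplification: {fold:.1f}")
print(f"mean net rate: {mean_net_rate_nM_per_h(15.0, 90.0, 6.0):.1f} nM/h")
print(f"equivalent doublings: {equivalent_doublings(fold):.2f}")
```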
Having shown that the MS2 replicase complex can replicate genomic MS2 RNA in the PURExpress system, we further sought to dissect the influence of the individual co-factors on the ability of the replicase to synthesise the genomic (+) and (-) strands (Fig. 5A, B). Synthesis of genomic (+) and (-) strands from the corresponding MS2-Bro(-) template produced only a weak fluorescence output compared to synthesis of F30-Bro(+) from the F30-Bro(-) template (Fig. 5C, D). Notably, the omission of an excess of S1 in these experiments did not significantly affect genome synthesis, unlike for the shorter unnatural F30-Bro construct used previously (Fig. 2). This finding indicates that the bound S1 present in the purified complex is sufficient for effective replication of the natural replicase substrate.
The fluorescence output of the broccoli aptamer domain during genomic (-) strand synthesis was seemingly not affected by supplementation of IF1 (Fig. 5E). At the same time, however, an in-gel fluorescence analysis revealed a drastic increase in RNA synthesis in the presence of the co-factor, with the majority of product migrating at the expected size of a full-length duplex (Supplementary Fig. 3). This suggests that only very little newly synthesised (-) strand RNA was present as single strand. In contrast, a stimulation of both overall RNA synthesis and broccoli fluorescence was observed during synthesis of MS2(+)Bro(+) from the MS2(-)Bro(+) template in the presence of IF1 (Supplementary Fig. 3, Fig. 5F). Thus, IF1 stimulated MS2 replicase activity independently of the polarity of the template. In this minimal in vitro environment, the protein either caused a direct reduction of non-fluorescent inert duplex product during genomic (+) strand synthesis, or enhanced folding of the aptamer reporter domain.

Spontaneous formation of replicable RNA species
The purified replicase complex of phage Qβ is well known for its spontaneous in vitro synthesis of rapidly amplifying RNA species of different length and nucleotide sequence, even in the absence of externally added template molecules (34)(35)(36). To test whether MS2rep is capable of a similar spontaneous generation of short amplifiable RNA species, we compared the activity of both purified Qβ and reconstituted MS2 core complex in template-free reactions supplemented with NTPs and SYBR Green nucleic acid stain. As expected, we observed a rapid increase in fluorescence after a brief lag phase of ~5-10 min in the presence of the Qβ heterotetramer (Fig. 6A), suggestive of the rapid formation of small amplifying RNA species ("RNA parasites") described in previous studies (36). We verified the efficient formation of small replicable RNAs by the Qβ replicase by gel electrophoresis (Supplementary Fig. 4). Surprisingly, we observed no such spontaneous formation of RNAs when the MS2 replicase complex in the presence of IF1 was incubated for 75 min under the same conditions (Fig. 6A). As the MS2 enzyme did not show a similarly strong background activity as the Qβ core complex, we asked whether the enzyme was able to produce short amplifying RNAs as by-products during MS2 genome replication. To test this hypothesis, we performed serial transfer experiments with the MS2 replicase core complex (MS2rep·S1·EF-Ts·EF-Tu) in the absence and presence of IF1 and MS2(+) RNA (Fig. 6B). In reactions with the full-length genome, we observed a rapid degeneration of the ~3600 nt RNA molecule during the first two dilutions, concomitant with the emergence of smaller RNA species with a broad size distribution and a dominant RNA band migrating at ~200 nt. The emergence of aberrant RNA products was strongly increased when the MS2 core complex was further supplemented with IF1.
Intriguingly, in the presence of IF1, the small RNA species emerged even in the absence of input MS2(+) after the first serial transfer (Fig. 6B). To obtain more information about the sequence properties of the newly evolved RNA replicators, we reverse transcribed, sequenced, and analysed the reaction products. Notably, we obtained only a single clonal sequence from these experiments (MSRP-22), which showed an almost perfect homology with the first 118 nt of the 5' UTR and 105 nt of the 3' UTR of MS2wt (Fig. 7A, B, Supplementary Table 5). We confirmed that MSRP-22 is a genuine RNA template for MS2 replicase since it was specifically amplified in an input concentration-dependent manner in batch reactions (Fig. 7C). In contrast, MS2rep was not able to amplify RQ135, a typical RNA parasite of Qβrep (37), within the same time window. This finding confirmed the differential template requirements of both phage replicases.
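The architecture described for MSRP-22, a 5'-terminal segment joined directly to a 3'-terminal segment, can be sketched in a few lines of Python. This is an illustration under stated assumptions: the genome used here is a random placeholder of 3600 nt, not the MS2wt sequence; the 105 nt are taken to be the 3'-terminal 105 nt; and the small deviations from perfect homology reported for MSRP-22 (Supplementary Table 5) are ignored. The function name is hypothetical.

```python
import random


def terminal_chimera(genome: str, n5: int, n3: int) -> str:
    """Join the first n5 nt and the last n3 nt of a genome into one molecule."""
    if n5 < 0 or n3 < 0 or n5 + n3 > len(genome):
        raise ValueError("segment lengths must be non-negative and fit within the genome")
    # Index from the 5' end so that n3 == 0 yields an empty 3' segment.
    return genome[:n5] + genome[len(genome) - n3:]


# Random placeholder sequence standing in for the ~3600 nt genome (NOT the MS2 sequence).
rng = random.Random(0)
placeholder_genome = "".join(rng.choice("ACGU") for _ in range(3600))

chimera = terminal_chimera(placeholder_genome, n5=118, n3=105)
print(len(chimera))  # 118 + 105 = 223 nt
```

A product of 223 nt is in line with the dominant band migrating at ~200 nt in the serial transfer experiments.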

Discussion
Previous attempts to characterise leviviral replicases in vitro were hampered by the general lability of the enzymes and the dependency of replicase activity on an unknown, yet easily dissociable, host factor (25,26). Using the recombinant heterotrimeric MS2rep·S1·EF-Ts complex as a starting point, we elucidated the enigmatic properties of this archetypical class of RNA polymerase enzymes. Compared to the stable Qβrep·EF-Ts·EF-Tu·S1 heterotetramer, which is capable of replicating a number of non-genomic RNAs as well as genomic Qβ(+) RNA in vitro (38), the MS2 replicase complex shows a number of different characteristics. Firstly, EF-Tu, which is an integral part of the active complex for both replicases, binds only weakly to the stable MS2rep·S1·EF-Ts core trimer. Secondly, the previously unknown host factor required for full activity was identified as the translation initiation factor IF1. IF1 is an S1 domain protein that contains an oligonucleotide/oligosaccharide-binding (OB) fold found in a variety of RNA chaperones (39). Indeed, it has been reported that IF1 acts as a transcription anti-termination factor, which can destabilize strong secondary structure elements and thereby facilitate RNA polymerase read-through (40). Furthermore, IF1 showed RNA chaperone activity during trans-splicing assays both in vivo and in vitro (41). These properties suggest that IF1 may support MS2 genome replication by destabilizing secondary structure elements of single-stranded templates and exposing terminal and internal binding sites for the MS2 replicase and/or by facilitating replication initiation and product release. The influence of IF1 on RNA synthesis resembles the role of the OB-fold proteins S1 and Hfq in (-) strand synthesis during the Qβ replication cycle. Here, Hfq facilitates the access of the replicase to the 3'-end of the genomic (+) strand (42,43), while S1 appears to contribute to both termination of replication and re-initiation after product release (44).
Thus, a similar yet MS2-speci c role of IF1 for the remodelling of MS2 templates seems plausible.
The strong inhibitory effect of IF3 on MS2 replicase activity seems counterintuitive at first and rather indicative of an in vitro artefact. However, early studies reported a specific interaction of IF3 with the 3'-terminus of MS2 (45)(46)(47). Our results suggest a direct competition for the 3'-end between the replicase and IF3 because increasing concentrations of the replicase compensated for the inhibitory effect of IF3. Notably, MS2 gene expression was also shown to be completely independent of IF3, unlike that of E. coli host proteins (48). Thus, replication and translation of MS2 genomes seem to be well adapted to conditions under which IF3 levels are minimal, as was reported for E. coli cells that have reached stationary phase (49). While phage virions are still produced under these conditions, protein synthesis rates are usually insufficient to trigger cell lysis and phage release (50). Therefore, we speculate that the ability of the MS2 replication system to continuously produce infectious but host-contained virions in slowly growing cells may enable phage persistence within starved bacterial populations. This state may last until growth conditions, and therefore a lytic reproduction cycle, can be restored. Inhibition of genome replication by IF3 might also play a role in the well-orchestrated infection cycle of MS2, a possibility that requires further in vivo studies.
While the isolated MS2 replicase holocomplex synthesizes genomic RNA from both (+) and (-) strand templates in batch reactions, both products readily anneal to form double-stranded RNA, which can no longer be used as a template for replication (51,52). In contrast, replication of the full-length genome in PURE systems enabled sustained synthesis of the genomic (+) strand. Under these more in vivo-like conditions, ribosome binding to and translation of the MS2(+) RNA might counteract direct annealing with the newly synthesised (-) strand, thereby supporting continuous replication similar to that described for Qβ (53).
Qβ replicase is notorious for the spontaneous generation of short, exponentially amplifying "RNA parasites" from trace amounts of contaminating RNA even in single batch reactions (34,36). In contrast, the reconstituted MS2 replicase core complex required additional serial transfers before replicative RNA species were enriched in significant amounts. While the emerging RNA population was dominated by a small sequence migrating at ~200 nt in agarose gels, it retained an overall broad size distribution even after five serial transfers. The lack of convergence to a discrete number of small, dominant replicators (as seen for Qβ replicase, Supplementary Fig. 4) implies either that initiation, rather than elongation, is critical for sequence replication, or that longer sequences are generated by mechanisms such as template switching or non-templated terminal transferase activity (54).
The reason for the generally lower tendency of MS2 replicase to rapidly generate and amplify non-genomic RNA species in comparison with Qβ replicase remains unclear. One explanation is an overall lower in vitro activity and/or lifetime of the holoenzyme, similar to that noted in pioneering reports that used enriched protein fractions for activity assays (55). Alternatively, MS2 replicase might be less promiscuous than Qβ replicase towards non-genomic templates and therefore less prone to generate short parasitic sequences. In agreement with this conjecture, the clonal sequence derived from the serial transfer experiment consisted exclusively of major parts of the 5' and 3' UTRs of the wild-type genome without additional sequence elements, suggesting that these motifs form the core of replicable units. Thus, both UTR elements may enable the design of replicable mRNA species, which could be used for dynamic in vitro evolution studies (18,19,34,35) or the generation of self-amplifying mRNAs under in vivo conditions (56).

Supplementary Files
This is a list of supplementary files associated with this preprint. WagneretalSI.docx