Single-step discovery of high-affinity RNA ligands by UltraSelex

doi:10.21203/rs.3.rs-2713959/v1

Download PDF

Article

Single-step discovery of high-affinity RNA ligands by UltraSelex

https://doi.org/10.21203/rs.3.rs-2713959/v1

This work is licensed under a CC BY 4.0 License

Version 1

posted

You are reading this latest preprint version

Aptamers, nucleic acid ligands against specific targets, have emerged as drug candidates, sensors, imaging tools, and nanotechnology building blocks. The most successful method for their development has been SELEX (Systematic Evolution of Ligands by EXponential Enrichment), an iterative procedure that is labor- and time-intensive and often enriches candidates for criteria other than those desired. Here we present UltraSelex, a non-iterative method that combines biochemical partitioning, high-throughput sequencing, and computational background minimization through statistical rank modeling. This approach avoids the common bias for abundant sequences and selects high-affinity ligands, even if they are extremely scarce. In six independent UltraSelex experiments (three towards each target), we discovered high-affinity aptamers for a fluorogenic silicon rhodamine dye, and a protein target, the SARS-CoV-2 RNA-dependent RNA polymerase. These aptamers enabled live-cell RNA imaging and efficient enzyme inhibition, respectively. The wet-lab partitioning part of UltraSelex can be completed in a few hours, and including sequencing and rank modeling via a public web server, the identification of lead candidates can be accomplished in about one day. UltraSelex provides a rapid route to novel drug candidates and diagnostic tools with greatly improved performance.

Biological sciences/Biotechnology/Nucleic-acid therapeutics

Biological sciences/Biotechnology/Nanobiotechnology/Biosensors

Biological sciences/Biochemistry/RNA

Biological sciences/Biological techniques/High-throughput screening

Biological sciences/Chemical biology/Combinatorial libraries

UltraSelex combines wet-lab partitioning with statistical rank modeling to enable aptamer discovery in a single day.

The COVID-19 pandemic has highlighted the importance of RNA biology to a broad public ^1–4. Different types of RNAs act as information carriers, regulators, and catalysts. Many RNAs form defined three-dimensional structures and interact with other cellular (macro)molecules, thereby mutually modulating their functions ^{5, 6}. These highly specific, structure-dependent interactions inspired the development of artificial RNA ligands, so-called aptamers, which can bind to a wide variety of targets. So far, aptamers found numerous analytical, biotechnological, and therapeutic applications ^7–13 and laid the foundation for a billion-dollar industry ¹⁴. For these applications, methods for the rapid and precise identification of target-specific binders are crucial. Moreover, if sequence-specific target binding data could be generated in a consistent high-throughput manner, these datasets might enable future AI-based predictions of RNA-target interactions ^15–18. In the early 1990s, the pioneering aptamer screening technology SELEX (Systematic Evolution of Ligands by EXponential Enrichment) was developed, an iterative in vitro method for enriching RNA aptamers from random RNA libraries, containing up to 10¹⁶ unique sequences ^19–21. Conventional SELEX is, however, time-consuming as it involves many iterations, traditionally 10 to 15 selection rounds, and carries a high proportion of false-positive identifications, even in the last round’s high-throughput sequencing (HTS) data. More recent variants such as microfluidic and capillary electrophoresis SELEX are only applicable to macromolecular targets and are difficult to scale up ²². Generally, conventional SELEX is sensitive to amplification bias (i.e., parasitic well-amplifying sequences often dominate ²³), and the computational hit identification strategies rely on the abundance of enriched sequences (or fold-changes thereof) rather than on binding strength and can therefore be similarly affected by initially abundant sequences.

Here, we introduce UltraSelex, a non-iterative aptamer discovery method without intermittent enzymatic amplification. In the experimental (wet-lab) part, a sequential partitioning system precisely segregates the binders from the waste fractions that are discarded in classical SELEX and similar approaches. All partitioning fractions are subjected to HTS, monitoring the relative enrichment of millions of RNA sequences over the entire differential partitioning process, which exposes the target-binder complex to gradually increasing volumes of buffer, performed in a single step. In the computational part, the HTS data is processed leveraging the Systematic Gradient Reproduction of Enriching Ligands (SGREELI) algorithm (available at https://www.ultrarnalab.com). This algorithm greatly improves the detection of binders with an ultra-low signal from the surrounding background noise, ranking ligands by their enrichment gradient rather than by absolute abundance. The whole process is fast (~ 24 h in total), accurate (> 90% true positive hits), and versatile. We applied UltraSelex to both, small-molecule (a fluorogenic silicon rhodamine dye, molecular weight (MW) ~ 0.6 kDa) and macromolecular protein targets (the SARS Cov-2 polymerase nsp12, MW ~ 110 kDa) and rapidly identified high-affinity RNA aptamers that compare favorably with aptamers, obtained by conventional SELEX.

Principles and limitations of SELEX

When screening large nucleic acid libraries, detection of unspecific background sequences is inevitable (e.g., due to non-specific binding to targets and immobilization matrix, or incomplete supernatant removal). Moreover, the sequencing depth of the current HTS platforms (~ 10^8–9 reads per run) does not allow the detection of differences in sequence abundance (e.g., between binders and background) in a random pool of ~ 10^14–15 molecular species. This combination has so far prevented the development of non-iterative methods for the discovery of high-affinity nucleic acid ligands. In SELEX, binders are iteratively amplified to distinguish them from the background by their sheer abundance, as determined based on HTS datasets (Extended Data Fig. 1a). However, the abundance (or its change between two iterations) is a useful indicator for the strength of target binding only if it can be ruled out that other parameters (such as substrate acceptance by the three polymerases used) have a strong influence on the composition of the library. Due to the iterative nature of SELEX, even small differences in amplifiability may translate into vastly different copy numbers, which will purge much of the initial pool (including excellent binders with poor amplifiability), and the final, enriched pool may consist of mediocre binders with good amplifiability. Another weakness is the constant switching between bead-based and solution-based steps, which makes automation and parallelization of high-affinity binder discovery more difficult. This last point has been addressed by improved technologies such as capillary SELEX or microfluidic SELEX, but they have their own limitations ²².

Design of UltraSelex

In contrast, UltraSelex depletes the background while retaining the binders without intermittent amplifications. Due to the much lower abundance of individual sequences, identifying high-affinity binders requires a sensitive trend analysis. UltraSelex involves both experimental and computational sections that complement each other, enabling the single-step discovery of high-affinity binders (Fig. 1).

(1) The initial RNA pool (containing 10^13–14 unique sequences, Extended Data Fig. 1b) is dissolved in buffer and incubated with the target immobilized on magnetic beads. The beads are subjected to (e.g., 10) consecutive partitioning steps in which they are equilibrated with increasing volumes of buffer individually. In the final step, the remaining bead-bound RNAs are recovered by chemical or enzymatic release. Serial partitioning should sequentially dilute the background non-binders, leading to a monotonic decrease in their abundance (black species in Fig. 1a), while the binders (blue and golden) peak in relative abundance at characteristic partitioning steps related to their binding affinities and initial abundance.

(2) RNA from each partitioning (group 1–10) and the final release (group 11) step are separately collected, reverse-transcribed and amplified individually by PCR for HTS library construction (Extended Data Fig. 1b-e).

(3) All HTS libraries are multiplexed and sequenced. The quality-controlled reads (Supplementary Table 1) are combined and their abundance information is added to a single data frame, consisting of 10^7–8 rows (one for each sequence) and 12 columns (sequence ID, read counts n in g1-g11).

(4) The HTS reads can be used as a proxy for the identification of high-affinity binders only after several adjustments: As initial sequence abundancies in randomized RNA pools differ (Extended Data Fig. 2a) ^{23, 24}, for quantitative comparison within a partitioning group, read numbers in g2-g11 are normalized by division by read numbers in g1, which generates the fold-change (fc) data frame. Quantitative comparison between partitioning groups requires consideration of systematic changes in pool composition across the partitioning process. With each successive partitioning, the amount of RNA released into the supernatant decreases, requiring more PCR cycles for HTS library preparation, resulting in more reads per unique sequence (Extended Data Fig. 2d). Furthermore, the composition of the groups (non-binders vs. weak vs. intermediate vs. high-affinity binders) is expected to change across fractions. To account for these two effects, the fc values of each group g2 - g11 must be corrected with a different factor. As all groups contained at least 1% sequences with ≥ 2 reads (gray dashed line in Extended Data Fig. 2b), to distinguish non-binders from weak binders we selected the RNA species representing the boundary of the top 1% fraction and noted its fc value fc_Ref.. Because strong binders are expected to elute in late groups, increasing weighting factors γ are assigned to the fractions, using a linear γ-function as a default (Extended Data Fig. 2c). fc_Ref. and γ are combined to obtain a normalization factor by which the fc values of all sequences in this partitioning fraction are divided. In this way, a data frame is obtained with the intra- and intergroup-adjusted fold change (afc) values of 10^7–8 sequences in groups g2 – g11. Extended Data Fig. 2k illustrates this data treatment which we refer to as SGREELI. The line plot in Fig. 1b shows the afc trend for three RNAs with different binding characteristics.

Furthermore, to distinguish a binder from background, a sequence must either occur very frequently in one late group or appear in multiple groups. To favor multiple appearances, the afc values for each sequence are integrated, using the γ values as the x-parameters. In this way, a single auc (area under the curve) value is obtained for each species. The auc values are added to the data frame and the sequences sorted by auc. High auc values indicate strong enrichment in later fractions, a feature that is thought to reflect strong binding.

(5) Optionally, the top percentile value, as well as the slope and shape of the gamma function, can be optimized (See Supplementary Note for details).

(6) The sequences with the highest rank are then selected for K_D determination and more detailed functional characterization.

Serial partitioning enriches known aptamers

We first tested the wet-lab partitioning procedure on the identification of silicon rhodamine-binding aptamers (See Extended Data Fig. 1 for pool design). For UltraSelex experiment SiR-A, 20 µg RNA (52 randomized positions, ~2x10¹⁴ RNA molecules) were subjected to the partitioning procedure and all fractions were analyzed by HTS. This analysis revealed that in each partitioning step, the diversity decreased, whereas the proportion of frequently occurring sequences increased (Extended Data Fig. 2b, d). Still, even in the g10 library, ~ 90% of the sequences occurred only once, indicating that the complexity is still larger than the sequencing depth.

To test whether this wet-lab partitioning procedure can uncover strong binders, we monitored the relative abundance (as reads per million reads – RPM) of 3 known silicon rhodamine binding aptamer sequences that had been discovered previously from a different aliquot of the same RNA library by conventional SELEX ⁹. Two of them (SiR2, SiR3) started appearing in the HTS raw reads in g5 and reached their peak relative abundance in g10 (Fig. 2a). Notably, the relative abundance in the release fraction g11 was lower than in the peak partitioning fractions. The third aptamer (SiR1) showed a hardly detectable peak at g10. Out of the 42 aptamers identified by Sanger sequencing in our previous SELEX study ⁹, 39 were detected in the UltraSelex SiR-A dataset, most of them with low abundancies (Extended Data Fig. 2e), illustrating the need for a ranking system that does not only rely on read abundance.

Two additional UltraSelex experiments were carried out against the same target in which unspecific binding of target RNA libraries to the beads was reduced by the prior addition of passivating, as well as partially competing, RNA species (tRNAs in experiment SiR-B, and a SELEX-derived SiR-aptamer in SiR-C). A similar distribution of the known SELEX-derived aptamers was observed in the three experiments, however, with slight shifts in the peak locations (Extended Data Fig. 2e,f) but similar auc values (Extended Data Fig. 2g), attesting to the high reproducibility of the partitioning process.

To investigate whether the libraries created in the partitioning process converge away from randomness (defined as the composition of g1) towards a specific composition (defined as the composition of g11), we analyzed the distribution of all 4096 6mers throughout the randomized part of the library for the top 1% abundant RNA species for each partitioning group and for the bead-bound group. This analysis revealed rapid convergence, as judged by the Pearson correlation coefficient (Pearson’s r). (Fig. 2b, Extended Data Fig. 2f-g). Collectively, these results indicate that non-iterative serial partitioning can identify known high-affinity aptamers. Simple read counting, however, finds only a few, very abundant, aptamers, such as SiR2 and SiR3.

The SGREELI algorithm allows for sensitive scoring and ranking of ligands

We applied the SGREELI intra- and intergroup normalization to the UltraSelex dataset SiR-A, assigning auc values to ~ 4 × 10⁷ RNA sequences. The plot of the adjusted enrichment factors of SiR1, SiR2 and SiR3 clearly identifies SiR1 as a binder, but with a lower auc value than the other two aptamers (Fig. 2c). Out of the 39 known aptamers that were detected in the dataset, 13 were among the top 0.01% group by auc ranking (Fig. 2d). In comparison, only 6 were in this top group when ranked by abundance in the final (bead-bound) fraction, g11.

Of note, none of these 39 sequences was at the very top by auc; the best-ranking known aptamer ranked 140 (Extended Data Fig. 2l). Similarly, 37 of the known sequences were found in UltraSelex dataset SiR-B (top score 75) and 38 in SiR-C (top score 32). Consistently, those sequences in which a fluorescence turn-on was reported after addition of a SiR-quencher conjugate as confirmation of SiR binding ranked much better than those without a turn-on (likely representing false-positive non-binders), confirming the relative enrichment of true binders in UltraSelex (Fig. 2d and Extended Data Fig. 2m). These results raised the question of what the sequences at the very top (by auc) are, and how their performance compares with the previously known aptamers.

UltraSelex enriches high-affinity SiR-binding aptamers for live-cell imaging

Comparison of the top sequences (by auc) from the three UltraSelex experiments revealed significant overlap (Fig. 3a,b). 6 sequences occurred among the top 25 of all 3 data sets, and ~ 300 (60%) among the top 500. The top 25 RNAs from the 3 experiments were individually synthesized and assayed for fluorescence turn-on (Fig. 3c, Extended Data Fig. 3a). 52%, 100%, and 68% of these sequences showed turn-on in SiR-A, SiR-B, and SiR-C. The SiR-B set (tRNA background passivation) not only showed the highest percentage of binders, but also the highest average turn-on (3.5-fold) and the two highest individual turn-on values (9 and 10-fold). For these two promising candidates, SiR-B7 and SiR-B15, the binding dissociation constants K_D were determined by titration with SiR-PEG-NH₂, to be ~ 250 and ~ 195 nM, respectively (Fig. 3d). Both light-up and K_D values compare favorably with the top aptamer previously identified by conventional SELEX (7-fold and 490 nM) ⁹.

Sequences SiR-B7 and SiR-B15 were then cloned into a vector containing the Tornado scaffold ²⁵ and transfected into human embryonic kidney (HEK-293T) cells. Upon addition of cell-permeable SiR-PEG-NH₂ to the cells, aptamer-expressing cells showed strong, mostly cytosolic fluorescence and compared favorably with the SELEX-derived control aptamer-expressing cells (Fig. 3e). These findings highlight the potential of UltraSelex-derived aptamers for live-cell imaging.

Most UltraSelex-identified aptamers are not found by other means

The unique ranking system of UltraSelex identifies many promising aptamers that are ranked very low in other approaches. E.g., of the ~ 5000 sequences in the 0.01% top group of the SiR-B dataset by auc ranking, only one-tenth (500) belong in the same top group when sorted by read abundance in the bead-bound group. (Fig. 3g, top panel). The Sankey plot illustrates that the top auc group derives from different read abundance groups. Looking only at the top 25 sequences, the ratio is better, but 15 of those sequences are ranked much lower by abundance than by auc (Fig. 3g, bottom). Similar relations were observed in the SiR-A and SiR-C datasets (Extended Data Fig. 3b).

Among the top 25 aptamers from SiR-A, SiR-B and SiR-C (47 unique sequences), only 5 could be detected in the HTS data set of the final round of conventional SELEX for SiR-binding aptamers ⁹. There, they ranked between 3rd and 484th. (Extended Data Fig. 3c).

UltraSelex identifies aptamers that bind to and inhibit the SARS-CoV-2 nsp12 polymerase with picomolar affinity

To test the general applicability of UltraSelex, we applied it to a much larger (around 200-fold MW) macromolecular target, the enzymatic RNA polymerase subunit of SARS-CoV-2, known as non-structural protein 12 (nsp12), which is a central component of the RNA-dependent RNA polymerase (RdRp) complex that replicates the viral single-stranded, positive-sense RNA genome and transcribes its genes ²⁶. Aptamers that bind to this protein target with high affinities could, in principle, provide new tools and therapeutics that disrupt the replication cycle of the virus.

Based on this rationale, we used three aliquots of the same starting library as in the SiR selections, and carried out UltraSelex experiments with bead-bound nsp12 protein. In the experimental setup Nsp-A, no additional background binding reduction measures were taken, while in Nsp-B, the beads were pretreated with tRNA, and in Nsp-C with an RNA hairpin reported to be a substrate for the RdRp complex ²⁷. The protein-derivatized beads were subjected to the same serial partitioning procedure as detailed above, and all fractions were sequenced and analyzed computationally.

Overall, the read counts per unique sequence were much lower than in the SiR experiments, which made binder identification more difficult. However, the SGREELI algorithm consistently ranked the sequences by auc, regardless of the absolute number of reads (For examples, see Extended Data Fig. 4b) and resulted in substantial overlap between the three data sets, demonstrating robust performance of UltraSelex even for more complex, macromolecular targets. Notably, some of the top-ranked sequences had as few as 5 reads in all 11 fractions combined (Supplementary Table 3). 16 sequences were in the top 25 of all three data sets (64%), with a decreasing trend for lower-ranked sequences (Fig. 4a,b). The Sankey plots (Extended Data Fig. 4a) indicate that only a very small portion of the 0.001% auc-ranking fraction also ranked first by read abundance in the last fraction (g11), although there was greater agreement within the top 25 group.

84 sequences that belonged (with 3 exceptions) to the top 250 group in all 3 data sets (Fig. 4e) were individually prepared by in vitro transcription and assayed for the binding of the nsp12 protein by bio-layer interferometry (BLI). 76 showed potentially relevant binding affinities (K_D < 1 µM, Fig. 4d) with the majority of the aptamers exhibiting nanomolar and 10 aptamers even picomolar K_D’s. Within this group of top-ranked sequences, there was no correlation between rank and K_D. The binder with the highest affinity, Nsp-B113, had a K_D of ~ 58 pM, which is one of the lowest values for protein-binding aptamers reported to date (Fig. 4c, Extended Data Fig. 4c,d).

We then screened 50 of these nsp12-binding aptamers for their potential to inhibit the RNA polymerase activity of the RdRp, of which nsp12 is one component, using an assay that monitors the extension of an RNA hairpin template, in the presence of the accessory factors nsp7 and nsp8 ²⁷. In this experiment, 7 aptamers showed a notable inhibitory effect (Extended Data Fig. 4g). To preclude the aptamers from being extended by the polymerase, which would result in heterogeneous mixtures that would be difficult to analyze, we destroyed their 3’-ends by a brief periodate oxidation step, rendering them unextendable. This treatment had only a minor effect on the K_D (Extended Data Fig. 4e), but markedly improved the reproducibility of the assay and the inhibitory effect of 3 of the aptamers. 6 of the 3’-blocked aptamers inhibited the extension by more than 90%, based on single-time point measurements (Fig. 4f, g). Kinetic analysis revealed that aptamers Nsp-B30 and Nsp-B68 showed the most pronounced inhibitory effects (Fig. 4h, i), which were found to be stoichiometry-dependent (Extended Data Fig. 4i). These experiments highlight the potential of UltraSelex for the discovery of protein-binding aptamers with conceivably therapeutic potential.

SGREELI ranking reveals high-performance aptamers at default parameters

In our analysis, we used default values assuming 1% as a suitable cut-off for discriminating between weak binders and non-binders, and a linear enrichment trend (θ = 1). Using multi-parameter optimization with the f1 score as a criterion, we determined the extent to which the choice of θ and the top percentile affected the ranking (see Supplementary Note for details). This analysis revealed a negligible effect of the top percentile and different optimal θ values for the SiR and nsp12 selections. Comparison of the rankings for each UltraSelex dataset calculated once with the default and once with the optimized hyper-parameter settings showed a large overlap (e.g., 67–80% among the top 100 of the three SiR selections and > 94% in the three nsp12 datasets). Similar results were obtained when we examined the agreement of the auc-based rankings with the binding affinity of the nsp12-aptamers using the Normalized Discounted Cumulative Gain (NDCG) score (see Supplementary Note). This high level of agreement indicates that the rankings based on the default parameters produce high-performance aptamer candidates and should be suitable for most applications, and additionally highlights the robustness of this computational approach.

In this work, UltraSelex was introduced and proved applicable for facile RNA aptamer discovery against small molecules and macromolecular protein targets in just one day. UltraSelex is based on a single-step of RNA selection, which corresponds in time and effort to one round of conventional SELEX approaches. Furthermore, as no sophisticated instrumentation is required and a public web server facilitates the end-to-end analysis of HTS libraries UltraSelex is highly accessible. UltraSelex identifies high-affinity aptamers using information contained in wash fractions that are typically discarded in the conventional SELEX process, thereby increasing the chances of discovering low-abundance aptamers. In addition, single-round selection can significantly reduce the systematic bias caused by, for example, repeated copying by different polymerases in the iterative SELEX process. As the entire partitioning part is bead-based, it can be easily parallelized and likely automated.

The computational part, i.e., the here-developed SGREELI algorithm, allows the discovery of very rare aptamers that disappear in background noise in other identification methods. In contrast to considering the abundance of aptamers alone, SGREELI constructs an accurate ranking model that includes both intra- and inter-group normalization with fold-change and gamma simulation of the enrichment trend. For this analysis, a public web server (https://www.ultrarnalab.com) was set up to which users can upload demultiplexed HTS datasets. Processing of typical data sets (11 groups with 5–10 million reads each) takes ~ 30 minutes, after which the user receives the SGREELI ranking. The rate-limiting steps of UltraSelex are the sequencing time and the upload of the sequence data sets.

Similar to SELEX and other high-throughput screening techniques, the requirement of target immobilization in UltraSelex is a limitation that may lead to aptamers that do not bind the free target equally well. This effect may be the cause for the lacking correlation between the SGREELI rank and measured K_D values within the top group. Nevertheless, in both applications, UltraSelex identified a group of high-affinity target binders with sufficient size to proceed with secondary screens and downstream applications. In the selection against an immobilized silicon rhodamine dye, aptamers that had higher affinity and fluorescence turn-on than those generated by conventional SELEX were generated in a partitioning process that took less than four hours. This is all the more remarkable because various measures to improve aptamer performance were used in SELEX, which extended over several months (such as increasing selection pressure or counter-selection), and not in UltraSelex. In the selection against the polymerase nsp12 of the SARS-CoV-2 virus, the most remarkable aspect is that many of the top-ranked candidates were extremely rare (e.g., 5 reads combined in all 11 partitioning groups, Supplementary Table 4), which would never have been detected in abundance-based ranking systems, highlighting, in particular, the power of the SGREELI algorithm. The aptamers against both targets performed well in downstream applications (live-cell imaging, polymerase inhibition) without extensive optimizations.

In conclusion, UltraSelex has the potential to greatly speed up and improve the ligand discovery process and thereby enable new applications of aptamers in biotechnology, medicinal diagnostics, and as therapeutics.

Data Availability

NGS raw data that support the findings of this study are available for academic use in the online repository (https://www.ultrarnalab.com). UltraSelex library construction and the aptamer sequences selected for downstream applications are compiled in the Supplementary Materials.

Code Availability

The web server for UltraSelex deep sequencing data analysis is freely available and will be routinely updated on the site (https://www.ultrarnalab.com). Other data analysis scripts used in this study are available upon reasonable request from the corresponding authors.

Acknowledgements

We thank Jäschke Lab members for discussions; R. Wirth (Heidelberg University) for the cDNA resource of published work and SiR dye; M. Sunbul (Heidelberg University) and R. Völk (Heidelberg University) for RNA resources; M. Knop (Heidelberg University) for access to TECAN; W. Nickel (Heidelberg University) for access to BLI-Octet; V. Benes (EMBL) for access to HTS; SDS@hd for data storage and bwHPC (bwForCluster MLS&WISO and bwUniCluster) for cluster computation resources.

Contributions

Y.Z. and A.J. conceptualized the study. Y.Z designed and synthesized the UltraSelex and other NGS libraries and developed and carried out modeling analysis. Y.Z., Y.J., and S.F. synthesized and purified RNA. Y.Z., Y.J., S.F., and K.W. measured ligand affinities. D.K., Y.Z., and Y.J. conducted protein purification and inhibition activity assays. W.H., Y.Z., Q.Y., and Z.T. contributed to the web server construction and code review. S.F. and J.Z. measured the fluorescence turn-on and S.F. imaged RNA in live cell. D. I. multiplexed and supported sequencing. Y.Z. and A.J. wrote the original draft. Y.Z. and A.J. supervised this study. I.S., A.A.H., and other authors reviewed and edited the draft.

Competing interests

A patent application has been filed relating to this work. A.A.H. is a founder of Dewpoint Therapeutics and a member of the board as well as a shareholder in Caraway Therapeutics. The authors declare no other competing interests.

Hogan, M.J. & Pardi, N. mRNA Vaccines in the COVID-19 Pandemic and Beyond. Annu Rev Med 73, 17-39 (2022).
Polack, F.P. et al. Safety and Efficacy of the BNT162b2 mRNA Covid-19 Vaccine. N Engl J Med 383, 2603-2615 (2020).
Schultz, D.C. et al. Pyrimidine inhibitors synergize with nucleoside analogues to block SARS-CoV-2. Nature 604, 134-140 (2022).
Yan, L. et al. A mechanism for SARS-CoV-2 RNA capping and its inhibition by nucleotide analog inhibitors. Cell 185, 4347-4360 e4317 (2022).
Zhu, Y. & Hart, G.W. Dual-specificity RNA aptamers enable manipulation of target-specific O-GlcNAcylation and unveil functions of O-GlcNAc on beta-catenin. Cell 186, 428-445 e427 (2023).
Hentze, M.W., Castello, A., Schwarzl, T. & Preiss, T. A brave new world of RNA-binding proteins. Nat Rev Mol Cell Biol 19, 327-341 (2018).
Sunbul, M. et al. Super-resolution RNA imaging using a rhodamine-binding aptamer with fast exchange kinetics. Nat Biotechnol 39, 686-690 (2021).
Valero, J. et al. A serum-stable RNA aptamer specific for SARS-CoV-2 neutralizes viral entry. Proc Natl Acad Sci U S A 118, 50 e2112942118 (2021).
Wirth, R., Gao, P., Nienhaus, G.U., Sunbul, M. & Jäschke, A. SiRA: A Silicon Rhodamine-Binding Aptamer for Live-Cell Super-Resolution RNA Imaging. J Am Chem Soc 141, 7562-7571 (2019).
Woo, C.H., Jang, S., Shin, G., Jung, G.Y. & Lee, J.W. Sensitive fluorescence detection of SARS-CoV-2 RNA in clinical samples via one-pot isothermal ligation and transcription. Nat Biomed Eng 4, 1168-1179 (2020).
Zhou, J. & Rossi, J. Aptamers as targeted therapeutics: current potential and challenges. Nat Rev Drug Discov 16, 181-202 (2017).
Nakatsuka, N. et al. Aptamer-field-effect transistors overcome Debye length limitations for small-molecule sensing. Science 362, 319-324 (2018).
Xu, Y. & Zhu, T.F. Mirror-image T7 transcription of chirally inverted ribosomal and functional RNAs. Science 378, 405-412 (2022).
Grand View Research. Aptamers Market Size, Share & Trends Analysis Report By Type (Nucleic Acid (DNA, RNA, XNA), Peptide), By Application (Diagnostics, Therapeutics, Research & Developments, Others), By Technology, by End-user, By Region, And Segment Forecasts, 2022 – 2030. www.grandviewresearch.com/industry-analysis/aptamers-industry (2022).
Bashir, A. et al. Machine learning guided aptamer refinement and discovery. Nat Commun 12, 2366 (2021).
Iwano, N., Adachi, T., Aoki, K., Nakamura, Y. & Hamada, M. Generative aptamer discovery using RaptGen. Nature Computational Science 2, 378-386 (2022).
Rube, H.T. et al. Prediction of protein-ligand binding affinity from sequencing data with interpretable machine learning. Nat Biotechnol 40, 1520-1527 (2022).
Chen, J.C. et al. Generating experimentally unrelated target molecule-binding highly functionalized nucleic-acid polymers using machine learning. Nat Commun 13, 4541 (2022).
Ellington, A.D. & Szostak, J.W. In vitro selection of RNA molecules that bind specific ligands. Nature 346, 818-822 (1990).
Robertson, D.L. & Joyce, G.F. Selection in vitro of an RNA enzyme that specifically cleaves single-stranded DNA. Nature 344, 467-468 (1990).
Tuerk, C. & Gold, L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 249, 505-510 (1990).
Zhuo, Z. et al. Recent Advances in SELEX Technology and Aptamer Applications in Biomedicine. Int J Mol Sci 18, 2142 (2017).
Takahashi, M. et al. High throughput sequencing analysis of RNA libraries reveals the influences of initial library and PCR methods on SELEX efficiency. Sci Rep 6, 33697 (2016).
Meiser, L.C. et al. DNA synthesis for true random number generation. Nat Commun 11, 5869 (2020).
Litke, J.L. & Jaffrey, S.R. Highly efficient expression of circular RNA aptamers in cells using autocatalytic transcripts. Nat Biotechnol 37, 667-675 (2019).
Gao, Y. et al. Structure of the RNA-dependent RNA polymerase from COVID-19 virus. Science 368, 779-782 (2020).
Hillen, H.S. et al. Structure of replicating SARS-CoV-2 polymerase. Nature 584, 154-156 (2020).

Yes there is potential Competing Interest. A patent application has been filed relating to this work. A.A.H. is a founder of Dewpoint Therapeutics and a member of the board as well as a shareholder in Caraway Therapeutics. The authors declare no other competing interests.

AuxiliarySupplementaryfile.pdf
Access information for reviewers
nreditorialpolicychecklistYZ.pdf
Editorial policy checklist
UltraSelexmanuscript20230311SI.pdf
Supplementary Information
nrreportingsummaryYZ.pdf
Reporting summary

Download PDF

Version 1

posted

You are reading this latest preprint version

Single-step discovery of high-affinity RNA ligands by UltraSelex

Status:

Version 1

Abstract

Figures

One Sentence Summary

Introduction

Results

Principles and limitations of SELEX

Design of UltraSelex

Serial partitioning enriches known aptamers

The SGREELI algorithm allows for sensitive scoring and ranking of ligands

UltraSelex enriches high-affinity SiR-binding aptamers for live-cell imaging

Most UltraSelex-identified aptamers are not found by other means

UltraSelex identifies aptamers that bind to and inhibit the SARS-CoV-2 nsp12 polymerase with picomolar affinity

SGREELI ranking reveals high-performance aptamers at default parameters

Discussion

Declarations

Data Availability

Code Availability

Acknowledgements

Contributions

Competing interests

References

Additional Declarations

Supplementary Files

Status:

Version 1