Recent advances in next-generation sequencing enable rapid identification of oncogenic transcripts in individual patients within actionable time frames1–3. However, many such driver mutations cannot be targeted due to the lack of specific inhibitory molecules4,5. Numerous fusion genes generated by chromosomal translocations demonstrate potent oncogenic activity, but also remain largely ‘undruggable’ with conventional therapeutics6. Personalized targeting of these fusion structural variants at the protein level has proven challenging7, and in the rare cases where small-molecule inhibitors are available, treatments often result in rapid development of drug resistance and disease relapse8. Alternatively, the unique chimeric sequence at the breakpoint of the two genes at the transcript level represents a currently unexplored, tractable, and impactful target for sequence-specific silencing with programmable RNA nucleases.
The type VI CRISPR (clustered regularly interspaced short palindromic repeats) effectors termed CRISPR-Cas13 (Cas13) are programmable RNA-guided targeting enzymes that exclusively degrade single-stranded RNAs (ssRNAs) with high efficacy and specificity9,10. Recent studies have deployed Cas13 systems in a variety of targeted RNA manipulations, including nucleic-acid detection11–13, precise RNA base editing14,15, and viral suppression16–18. The efficiency and reversibility of RNA targeting with Cas13 represent a promising modality to specifically edit oncogenes without risking permanent alteration of the genome in somatic and germline cells, an inherent limitation of DNA-editing CRISPR enzymes19–21. Therefore, Cas13 is highly attractive for targeting aberrant fusion transcripts that drive various human genetic diseases including cancer. The PspCas13b ortholog appears to possess high silencing efficiency and specificity, underscoring its suitability for targeted gene silencing in human cells22. However, the poor understanding of the molecular principles governing PspCas13b target recognition and cleavage limits the development of this tool for exclusive targeting of the breakpoint of any fusion transcript.
crRNA silencing efficiency is highly variable.
To elucidate PspCas13b crRNA design principles, we developed a quantitative fluorescence-based silencing assay, in which we targeted the transcript of the mCherry reporter gene. To achieve this, we co-transfected HEK 293T cells with three plasmids encoding mCherry, PspCas13b-BFP, and either non-targeting (NT) or mCherry-targeting crRNA (Fig. 1a). Fluorescence microscopy analysis of cells transfected with mCherry targeting crRNAs showed pronounced silencing activity, contrasted with no appreciable silencing in cells expressing NT crRNAs (Fig. 1b).
Next, we questioned whether parameters such as efficiency of crRNA transcription, crRNA loading, spacer nucleotide composition, target accessibility, and the presence of a potential protospacer-flanking sequence (PFS) may influence the efficiency of PspCas13b and could lead to variability in the silencing profiles of various crRNAs. We designed 16 crRNAs with spacer sequences that fully basepair with the coding sequence of the mCherry mRNA at various positions (Extended Data Fig. 1a). To accurately determine the silencing efficacy of each crRNA in this cohort, we performed crRNA dose-dependent silencing assays in which cells were transfected with 0, 1, 5, and 20 ng of each of the 16 mCherry-targeting crRNAs, and quantified silencing efficiency. mCherry-targeting crRNAs demonstrated dose-dependent silencing. However, we noticed marked differences in the silencing efficacy of the various crRNAs, even when they were designed to target neighbouring RNA locations (Fig. 1b; Extended Data Fig. 1b-1d). This finding suggested that there are key determinants of PspCas13b efficacy beyond target accessibility. Identifying such determinants is crucial for efficient reprogramming.
Single-base tiled crRNAs reveal hidden parameters.
To further understand the spectrum of crRNA potency, we investigated the silencing activity of PspCas13b across a defined targeted region, reasoning that silencing efficiency is likely intrinsic to the spatial characteristics of the crRNA sequence and binding sites. We focused our study on crRNA12 (binding position #455) and crRNA16 (#655), which exhibited high and moderate silencing, respectively (Extended Data Fig. 1a-1d). We designed 3-nucleotide resolution tiled crRNAs spanning a 30-nucleotide target region surrounding the crRNA12 and crRNA16 binding positions (Fig. 1c; Extended Data Fig. 2a). In this tiled design, adjacent crRNAs are spaced 3 nucleotides apart, thus silencing profiles should reveal the relationship between efficacy, the sequence of the spacer-target, and target accessibility. We again observed considerable heterogeneity in the potency of these tiled crRNAs despite their physical proximity, with some adjacent crRNAs demonstrating contrasting silencing efficacy. These data indicated that physical barriers such as RNA binding proteins or structured RNA motifs are unlikely to explain the fluctuation in silencing between spatially adjacent crRNAs that are separated by just 3 nucleotides (Fig. 1c; Extended Data Fig. 2a-2c).
To further enhance our understanding, we maximised the spatial resolution of this approach by designing 61 tiled crRNAs with single-base incremental targeting of the region between nucleotide positions 424 and 485 of the mCherry coding sequence (Fig. 1d). Consistent with previous data, we again observed markedly diverse silencing profiles of neighbouring crRNAs. For instance, crRNA13 achieved silencing exceeding 95% efficiency, but shifting the targeted region by only 1 nucleotide (crRNA14) dramatically reduced efficiency to ~30%. Similarly, crRNA51 yielded ~99% silencing efficiency while its adjacent crRNA52 did not show any appreciable silencing activity (Fig. 1d).
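The tiling designs described above (3-nucleotide and single-base steps) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure used in the study: the helper names `reverse_complement` and `tile_spacers` are hypothetical, the 45-nt target is an arbitrary toy sequence, and each spacer is simply taken as the reverse complement of a 30-nt target window so that it fully basepairs with it.

```python
# Hypothetical helpers for illustration; the toy target is an arbitrary sequence.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna):
    """Return the reverse complement of an RNA sequence, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def tile_spacers(target, spacer_len=30, step=1):
    """Yield (start, spacer) for each window of spacer_len along the target.

    start is the 1-based position of the first target nucleotide in the window;
    spacer is written 5'->3' and is fully complementary to that window.
    """
    target = target.upper().replace("T", "U")
    for offset in range(0, len(target) - spacer_len + 1, step):
        window = target[offset:offset + spacer_len]
        yield offset + 1, reverse_complement(window)


toy_target = "GCUAGGCAUCGAUUCGGAACCUGAUCGGAUACCGUUAGCAUGCAA"
single_base = list(tile_spacers(toy_target, step=1))  # one spacer per position
three_base = list(tile_spacers(toy_target, step=3))   # 3-nucleotide resolution
```

Neighbouring spacers from the single-base series share all but one nucleotide, which is what makes large potency differences between them informative about sequence features rather than target accessibility.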
These data strengthen our contention that silencing efficacy is unlikely to be solely dependent on the target accessibility, and that other factors including specific nucleotide positions within the spacer or target, and a possible PFS, may all influence key steps of target silencing such as crRNA transcription, loading, and target recognition.
In silico analysis of 201 crRNAs revealed key design principles.
In an effort to uncover fundamental principles that dictate PspCas13b silencing efficiency, we expanded our dataset by analysing the silencing profiles of 201 individual crRNAs targeting various transcripts16. First, we questioned whether the folding of crRNA or target, spacer-target stability, and spacer nucleotide content correlate with PspCas13b potency. The data suggest that the folding of the crRNA and the targeted sequence into complex secondary structures can only moderately limit PspCas13b silencing efficiency, possibly by perturbing crRNA loading or target accessibility (Extended Data Fig. 3). In contrast, enrichment of C nucleotides in spacers showed a significant negative correlation with crRNA potency (r = −0.30; p < 0.0001) (Extended Data Fig. 3 & 4).
Next, we pooled these 201 crRNAs and ranked them by silencing efficiency. crRNAs that achieved > 90% silencing efficiency were designated as potent crRNAs and those with less than 50% efficiency were considered ineffective crRNAs. crRNAs with ambiguous silencing profiles (efficiencies ranging from 50 to 90%) were excluded from the analysis. We sought to identify molecular features capable of differentiating the potent and ineffective crRNA cohorts (Fig. 1e). Many CRISPR variants possess an upstream or downstream protospacer flanking sequence (PFS) that restricts targeting activity and prevents degradation of their own nucleic acids23. To investigate the existence of a PFS that could constrain PspCas13b silencing, we generated weight matrix plots that analyse nucleotide composition at each of the four positions upstream and downstream of the targeted sequence in the highly potent and ineffective cohorts of crRNAs. There was no detectable bias in nucleotide composition at the various target flanking sites, suggesting that PspCas13b activity is not subject to PFS motifs in mammalian cells (Fig. 1f).
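The cohort assignment described above reduces to two thresholds. The following Python sketch illustrates it; the function name `classify_crrna` and the example values are hypothetical, and the boundary values of exactly 50% and 90% are assumed to fall in the excluded group.

```python
def classify_crrna(silencing_pct):
    """Assign a crRNA to a cohort from its silencing efficiency (in percent)."""
    if silencing_pct > 90:
        return "potent"
    if silencing_pct < 50:
        return "ineffective"
    return "excluded"  # ambiguous profile (50 to 90%), left out of the analysis


# Hypothetical efficiencies for illustration only (not data from the study).
example = {"crRNA_a": 97.0, "crRNA_b": 72.5, "crRNA_c": 12.0}
cohorts = {name: classify_crrna(pct) for name, pct in example.items()}
```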
Last, we questioned whether the nucleotide composition of the spacer could influence PspCas13b silencing efficiency. Nucleotide content analysis of the filtered crRNA cohorts revealed an enrichment of G bases in the potent group, and enrichment of C bases in the ineffective crRNA cohort (Extended Data Fig. 4a-4e), indicating that G-enriched spacers are associated with higher potency, whereas C-enriched spacers are associated with low potency.
To reveal the relevance of G and C bases at specific positions within the spacer sequence, we conducted unbiased analyses of nucleotide composition at all 30 positions of the spacer in the highly potent and ineffective crRNA cohorts. We used weight matrix plots and Delta probability analysis to compare spacer nucleotide composition at all positions between filtered and unfiltered samples (Fig. 1g-1h; Extended Data Fig. 2d-2f), and revealed marked differences in nucleotide positions between the two crRNA cohorts. We show that G bases at the 5’ end, particularly a GG sequence at the first and second positions, were strongly associated with highly potent crRNAs. Conversely, G nucleotides were depleted and C bases were enriched at the 5’ end of spacers in the ineffective crRNA cohort. In addition to this C-rich motif at the 5’ end of ineffective crRNAs, we also identified a significant enrichment of C bases at positions 11, 12, 15, 16, and 17 (Fig. 1g-1h; Extended Data Fig. 2d-2f). These data revealed key nucleotide positions that may determine the potency of crRNAs, which could serve as predictive parameters of crRNA potency.
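A position-wise comparison of this kind can be sketched as follows. This is an illustration only: it assumes that the Delta probability at a position is the base frequency in a filtered cohort minus the frequency in the unfiltered pool, which is one plausible reading of the analysis rather than a confirmed definition. The function names are hypothetical and the 4-nt spacers are toy data chosen to keep the example short.

```python
from collections import Counter

BASES = "ACGU"


def position_frequencies(spacers):
    """Return one {base: frequency} dict per position for equal-length spacers."""
    length = len(spacers[0])
    freqs = []
    for i in range(length):
        counts = Counter(spacer[i] for spacer in spacers)
        freqs.append({base: counts[base] / len(spacers) for base in BASES})
    return freqs


def delta_probability(cohort, background):
    """Per-position frequency in the cohort minus frequency in the background."""
    cohort_f = position_frequencies(cohort)
    background_f = position_frequencies(background)
    return [
        {base: c[base] - b[base] for base in BASES}
        for c, b in zip(cohort_f, background_f)
    ]


# Toy 4-nt "spacers" for illustration only.
potent = ["GGAU", "GGCA"]
ineffective = ["CCAU", "CUGA"]
unfiltered = potent + ineffective

delta_potent = delta_probability(potent, unfiltered)
delta_ineffective = delta_probability(ineffective, unfiltered)
```

A positive value marks a base that is enriched in the cohort at that position, and a negative value marks a depleted base.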
Functional validation of crRNA prediction and design.
The above in silico analysis enabled us to generate a formula to predict potent and ineffective crRNAs. We postulated that potent crRNAs should include a GG sequence at the first and second positions of the spacer and should lack C bases at positions 11, 12, 15, 16, and 17 (GGNNNNNNNNDDNNDDDNNNNNNNNNNNNN; D is a G, U, or A nucleotide; N is any nucleotide). We also hypothesised that crRNAs containing C at spacer positions 1, 2, 3, 4, 11, 12, 15, 16, and 17 would yield poor silencing efficiency (CCCCNNNNNNCCNNCCCNNNNNNNNNNNNN).
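The two motifs can be checked mechanically. The sketch below builds one regular expression per motif from the position lists given above; it is a strict reading in which every listed position must match, which is an assumption, and a practical tool may instead score partial matches. The names `build_pattern` and `predict` are hypothetical.

```python
import re

# 1-based spacer positions taken from the motifs described in the text.
POTENT_NO_C = (11, 12, 15, 16, 17)
INEFFECTIVE_C = (1, 2, 3, 4, 11, 12, 15, 16, 17)


def build_pattern(fixed, length=30):
    """Compile a regex with one character class per spacer position (1-based)."""
    classes = [fixed.get(pos, "[ACGU]") for pos in range(1, length + 1)]
    return re.compile("".join(classes))


potent_rules = {1: "G", 2: "G"}
potent_rules.update({pos: "[AGU]" for pos in POTENT_NO_C})  # D = A, G, or U
POTENT = build_pattern(potent_rules)
INEFFECTIVE = build_pattern({pos: "C" for pos in INEFFECTIVE_C})


def predict(spacer):
    """Label a 30-nt spacer (5'->3') according to the strict motif match."""
    spacer = spacer.upper().replace("T", "U")
    if POTENT.fullmatch(spacer):
        return "predicted potent"
    if INEFFECTIVE.fullmatch(spacer):
        return "predicted ineffective"
    return "no prediction"
```

For example, `predict("GG" + "A" * 28)` returns `"predicted potent"`, whereas a spacer with a C at position 11 falls outside the potent motif.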
We tested the predictive accuracy of these spacer-based formulas through prospective unbiased design of crRNAs targeting EGFP and TagBFP, two mRNA targets we had not investigated previously. Notably, out of 21 predicted potent crRNAs, 20 achieved very high silencing efficiency of either EGFP or TagBFP mRNA. Conversely, the majority of predicted ineffective crRNAs failed to efficiently silence EGFP and TagBFP transcripts (Fig. 2a-2f). By formulating our prediction from a pre-existing dataset, and validating its accuracy in heretofore untargeted transcripts, these data demonstrate our spacer nucleotide-based formula to be both accurate and generalisable, and demonstrate its utility in crRNA design for silencing any transcript of interest.
Next, we compared the efficiency of our design to the benchmark crRNA design tool that is available for RfxCas13d (Fig. 2g). We selected the 10 top predicted potent crRNAs for RfxCas13d targeting mCherry and probed their silencing efficiency, which averaged 80.7% (Fig. 2h). Our PspCas13b design of potent crRNAs showed ~87.8% average silencing efficiency (EGFP and TagBFP together; Fig. 2c & 2f) and outperformed the RfxCas13d design, further validating the accuracy of our prediction tool.
To further investigate the enrichment of a G-rich motif at the 5’ end of potent crRNAs and C bases at the 5’ end of ineffective crRNAs, we hypothesized that altering these sequences in a bona fide spacer sequence may either worsen or improve silencing efficiency. First, we selected 11 crRNAs that possess a GG sequence at the 1st and 2nd positions of the spacer, which we altered to CC by spacer mutagenesis. The data showed a substantial compromise in the silencing efficiency of the majority of these crRNAs (Extended Data Fig. 5a). We also mutated 3, 2, or 1 G base(s) at the 5’ end of the spacer to C and found that the substitution of 3 or 2 C bases at the 5’ end of the spacer reduced silencing by > 99% and ~70%, respectively, while the introduction of a single C base at spacer position 1, 2, or 3 had no significant effect on the potency of the crRNA (Extended Data Fig. 5b-5c).
Next, we selected ineffective crRNAs lacking a GG sequence at their 5’ end, and then modified them either by inserting an additional G at the first position, substituting the 1st nucleotide with a G, or substituting the 1st and 2nd nucleotides with GG (Fig. 2i-2o). Importantly, the data demonstrated that G sequences at the 5’ end of the spacer greatly increase the potency of crRNAs despite the introduction of a spacer-target mismatch (Fig. 2i-2o). We questioned whether the improvement in silencing efficiency of crRNAs harbouring a G-rich motif at their 5’ end could be secondary to changes in crRNA abundance. We quantified the expression levels of the original crRNA or mutated crRNAs harbouring 5’ end G motifs using quantitative real-time PCR (RT-PCR). We observed an increase in crRNA abundance when a G-rich motif was present at the 5’ end, although this increase was not statistically significant (Extended Data Fig. 6).
In addition to mCherry, we also show that C to G substitutions in key spacer positions (1, 2, 11, 15, 16, 17) can further improve the silencing efficiency of crRNAs targeting other transcripts (Extended Data Fig. 7a-7l). Indeed, when crRNA design choices are restricted, de novo design of crRNAs incorporating mismatched G bases at these key positions can substantially increase their potency despite introducing nucleotide mismatches with the target.
To facilitate the use of our optimized and validated spacer nucleotide-based formula for potent crRNA design, we created a user-friendly webpage (https://cas13b.github.io/) to assist the community with their silencing assays. This in-silico tool requires only the targeted sequence as input to create single-base tiled spacer sequences and rank them based on their predicted potency (see Methods).
Comprehensive mutagenesis of spacer-target interaction.
Understanding PspCas13b specificity, off-targeting potential, and its capability to discriminate between two transcripts that share extensive sequence homology is extremely important for evaluating the potential and the limitations of PspCas13b-based RNA silencing. To study crRNA spacer promiscuity and the consequent PspCas13b targeting resolution, we conducted a comprehensive spacer mutagenesis to introduce mismatches with the target at various spacer positions. First, we introduced 3, 6, 9, 12, 15, 18, 21, 24, 27, and 30-nt successive mismatches starting from the 3’ and 5’ ends of the spacer (Fig. 3a & 3b). 3-nt mismatches at the 3’ end of the spacer (positions 28–30) did not affect the silencing efficiency of this crRNA, whereas mismatches greater than 3 nt completely abrogated its silencing (Fig. 3a). In contrast to the 3’ end, all 5’ end mismatches resulted in complete loss of silencing, including 3-nt mismatches at the 5’ end (Fig. 3b). Based on our earlier findings (Extended Data Fig. 5), silencing loss consequent to the introduction of a 3-nt mutation at the 5’ end is likely attributable to the substitution of a GGG motif by a CCC sequence rather than the spacer-target mismatch itself, thus reaffirming the importance of a G-rich motif at the 5’ end of potent crRNAs as previously described (Fig. 1 & Fig. 2).
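The successive-mismatch series from either end of the spacer can be generated programmatically, as in the sketch below. It assumes that each mismatch is introduced by swapping a base for its complement, which is consistent with the GGG to CCC change noted above but is otherwise an assumption; the function names and the toy spacer are hypothetical.

```python
# Assumed substitution rule: each mutated base is replaced by its complement.
SWAP = {"A": "U", "U": "A", "G": "C", "C": "G"}


def mutate_end(spacer, n, end):
    """Replace n consecutive bases at the 5' or 3' end of the spacer."""
    if end == "5'":
        return "".join(SWAP[b] for b in spacer[:n]) + spacer[n:]
    if end == "3'":
        keep = len(spacer) - n
        return spacer[:keep] + "".join(SWAP[b] for b in spacer[keep:])
    raise ValueError("end must be \"5'\" or \"3'\"")


def count_changed(original, mutated):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(original, mutated))


toy_spacer = "GGG" + "AUC" * 9  # 30-nt toy spacer, illustration only
sizes = range(3, 31, 3)         # 3, 6, ..., 30 mismatches
series_3p = {n: mutate_end(toy_spacer, n, "3'") for n in sizes}
series_5p = {n: mutate_end(toy_spacer, n, "5'") for n in sizes}
```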
We also created crRNA constructs harbouring 6-nt, 5-nt, 4-nt, and 3-nt mismatches at different spacer positions and probed their silencing efficiency in live cells (Fig. 3c-3f). Overall, 6-nt mismatches largely compromised the efficiency of PspCas13b regardless of mismatch position (Fig. 3c). 5-nt mismatches at positions 6–10, 11–15, and 26–30 exhibited a partial loss of silencing, while mismatches at positions 1–5, 16–20, and 21–25 led to a near complete or complete loss of silencing (Fig. 3d). 4-nt mismatches at positions 9–12, 13–16, and 17–20 retained partial silencing activity, whereas mismatches at positions 1–4, 5–8, 21–24, and 25–28 yielded a complete loss of silencing (Fig. 3e). Notably, crRNA constructs harbouring 3-nt mismatches at various spacer positions were well tolerated and yielded no or minor loss of silencing, except for mutations at position 1–3 that, as anticipated, led to a total loss of silencing likely due to 5’end GGG removal (Fig. 3f).
Whilst the preceding experiments established the tolerance for consecutive spacer-target mismatches, we questioned whether the silencing profile of non-consecutive mismatches may differ. We destabilized the spacer-target interaction by introducing 2, 3, 4, 5, 6, 7, 10, and 15 non-consecutive mismatches spread throughout the spacer (Fig. 3g). We noticed that 2, 3, and 4 non-consecutive mismatches were tolerated and led to negligible loss of silencing. However, more than 4 non-consecutive mismatches led to a substantial or complete loss of silencing. Likewise, multiple successive 2- or 3-nucleotide mismatches spread throughout the spacer sequence also completely abolished its silencing activity (Fig. 3g). These data revealed the targeting resolution of PspCas13b and suggest that > 4 non-consecutive mismatches critically destabilise the spacer-target interaction and compromise PspCas13b activity. In addition, the data also suggest that endogenous targets with partial sequence homology are unlikely to be impacted by off-target silencing due to the required minimum of ~25 consecutive or non-consecutive basepaired nucleotides. These mutagenesis data provide further evidence that highly effective crRNAs can be readily designed with minimal or no off-target effects.
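As a rough illustration of how this tolerance could be used to screen candidate off-target sites, the sketch below counts Watson-Crick mismatches between a spacer and an equal-length site and compares the count with a threshold. It is a simplification under stated assumptions: G-U wobble pairing, mismatch position, and runs of consecutive mismatches are ignored, the default threshold of 4 is taken from the mCherry observations above and is exposed as a parameter because it may differ between crRNAs, and all names and sequences are hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def count_mismatches(spacer, site):
    """Count spacer positions that cannot Watson-Crick pair with the site.

    Both sequences are written 5'->3', so spacer position i faces the site
    read from its 3' end.
    """
    if len(spacer) != len(site):
        raise ValueError("spacer and site must have the same length")
    return sum(COMPLEMENT[s] != t for s, t in zip(spacer, reversed(site)))


def may_be_silenced(spacer, site, max_tolerated=4):
    """True if the mismatch count is within the assumed tolerated number."""
    return count_mismatches(spacer, site) <= max_tolerated


def with_mismatches(site, positions):
    """Copy of site with bases at the given 0-based positions complemented."""
    return "".join(
        COMPLEMENT[b] if i in positions else b for i, b in enumerate(site)
    )


site = "GCUAGGCAUCGAUUCGGAACCUGAUCGGAU"  # toy 30-nt target site
spacer = "".join(COMPLEMENT[b] for b in reversed(site))
related_4 = with_mismatches(site, {2, 9, 16, 23})
related_5 = with_mismatches(site, {2, 9, 16, 23, 28})
```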
Efficient silencing of oncogenic fusion drivers.
Having revealed key design principles of PspCas13b, we questioned whether we could reprogram this CRISPR enzyme to silence major oncogenic gene fusion transcripts. The breakpoint at the interface between the two genes offers a unique targetable sequence at the RNA level, which remains largely unexplored. We designed 21 tiled crRNAs (3-nucleotide resolution) targeting the breakpoint of 3 oncogenic gene fusions, BCR-ABL1, SFPQ-ABL1, and SNX2-ABL1, that are established drivers of acute lymphoblastic leukemias (ALL)24–26. The gene fusions were each cloned into an IRES-GFP vector that produces a gene-of-interest-IRES-GFP transcript, which is subsequently translated into separate proteins due to the presence of the IRES sequence. Therefore, efficient targeting of the gene fusion transcript by PspCas13b is anticipated to lead to loss of GFP fluorescence due to sequence-specific recognition, cleavage, and degradation of the fusion-GFP transcript. Overall, microscopy data from 3-nucleotide resolution tiled crRNAs showed high silencing efficiency of all 3 gene fusions, although once more the silencing efficiency varied depending on the position of the crRNA (Fig. 4a-4c). Analysis of mRNA levels of gene transcripts by RT-qPCR confirmed high silencing efficiency with numerous crRNAs, although the magnitude of variance between crRNAs was less pronounced than suggested by the microscopy assay (Fig. 4d-4f), possibly due to additional Cas13-mediated regulation of protein translation. Western blot analysis of BCR-ABL1 protein expression further confirmed high silencing of BCR-ABL1 at the protein level, which, consistent with the microscopy data, was dependent on the position of the crRNAs tested, with the −12, −9, and +12 crRNAs exhibiting the highest silencing efficiencies (Fig. 4g). Analysis of STAT5 and ERK phosphorylation, a hallmark of BCR-ABL1 dependent oncogenic signalling (Fig. 4h), confirmed that potent crRNAs can efficiently suppress BCR-ABL1 and its downstream oncogenic networks (Fig. 4i). Imatinib, a small inhibitory molecule that blocks the tyrosine kinase domain of ABL1 (Fig. 4h), inhibited BCR-ABL1 mediated phosphorylation of STAT5 and ERK without altering the expression levels of BCR-ABL1 protein, whereas PspCas13b crRNAs efficiently silenced BCR-ABL1 protein expression and the downstream phosphorylation of STAT5 and ERK (Fig. 4i). Interestingly, the most potent crRNA, +12, showed greater suppression of STAT5 phosphorylation than Imatinib, consistent with its high efficacy in depleting the BCR-ABL1 protein through mRNA silencing (Fig. 4i).
Next, we sought to investigate whether single-nucleotide tiled crRNAs targeting BCR-ABL1 would also show highly variable levels of silencing. We cloned and deployed 41 individual tiled crRNAs across the breakpoint of BCR-ABL1 (Fig. 4j). Again, we observed that the silencing efficiency varied widely even between neighbouring crRNAs. For instance, despite 96.6% sequence homology and only a single nucleotide position shift, crRNA14 achieved > 90% silencing while crRNA15 exhibited no silencing, with consistent results evident in both quantitative microscopy and Western blot analyses (Fig. 4j & 4k). The potent crRNA +14 also exhibited higher silencing of downstream STAT5 phosphorylation (Fig. 4k). The contrasting silencing activity obtained with single-base resolved crRNAs within the same targeted region confirms the presence of key RNA sequences or features that profoundly influence PspCas13b activity.
Taken together, these data demonstrate the utility of PspCas13b as a versatile tool to efficiently silence tumour drivers such as fusion transcripts and alter their oncogenic signalling networks, and highlight the importance of rational design for maximum crRNA potency.
Absolute discrimination between fusion and wild type RNAs.
Previous spacer mutagenesis experiments indicated that PspCas13b can discriminate between two RNAs that share extensive sequence homology (Fig. 3). We questioned whether this discriminatory resolution is generalisable to cancer-specific oncogenic RNAs. To determine this, we introduced 3, 4, 5, 6, 7, 10, and 14 non-consecutive mismatches between the spacer of the BCR-ABL1 crRNA (crBCR-ABL1) and the targeted breakpoint sequence (Fig. 5a). The data revealed that 3 non-consecutive nucleotide mismatches were well tolerated, whereas 4 or more non-consecutive mismatches drastically impaired crRNA silencing efficiency (Fig. 5a). Three consecutive nucleotide mismatches at various positions did not affect the silencing of BCR-ABL1. Six consecutive nucleotide mismatches at the 3’ end (25–30) or in the central region (12–17) led to a notable loss of silencing, while 9 consecutive nucleotide mismatches dramatically curtailed silencing irrespective of position (Fig. 5b). Western blot analysis of BCR-ABL1 protein expression confirmed these data and showed that 3-nucleotide mismatches are well tolerated, while mismatches of 4 nucleotides or more led to substantial or complete loss of silencing (Fig. 5c). Overall, the data highlight the specificity of PspCas13b and its potential to discriminate between transcripts despite extensive sequence homology.
To confirm this specificity, we tested crRNAs targeting BCR-ABL1 fusion against wild type non-translocated BCR and ABL1 transcripts which are expressed in normal tissues. We cloned constructs encoding partial mRNA sequences of the BCR-ABL1 fusion, ABL1 alone, and BCR alone in frame with mCherry, eGFP, or TagBFP fluorescent reporters, respectively (Fig. 5d-5f). We designed 3 crRNAs targeting the BCR-ABL1 breakpoint sequence (crBCR-ABL1), BCR sequence (crBCR), or ABL1 sequence (crABL1) that we tested against the aforementioned constructs. The fluorescence signals from mCherry, eGFP, and TagBFP enable accurate quantification of on-target and off-target silencing with these crRNAs. As anticipated, all 3 crRNAs silenced the bona fide BCR-ABL1 transcript as this mRNA possesses full-length spacer binding sites for all three crRNAs (Fig. 5d). However, ABL1 and BCR transcripts were silenced only by their cognate crABL1 and crBCR crRNAs (Fig. 5e-5f). Notably, crBCR-ABL1 targeting the breakpoint sequence had no effect on either BCR or ABL1 wildtype transcripts despite 15-nucleotide sequence basepairing (Fig. 5e-5f). Western blot analysis confirmed these data (Fig. 5d-5f), demonstrating the high-resolution capability of PspCas13b and its utility to specifically silence oncogenic gene fusion drivers at the RNA level while sparing non-translocated wild type transcripts.
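The 15-nucleotide figure follows from the geometry of a 30-nt spacer centred on the junction: half of the target window comes from each fusion partner, so neither wild-type transcript offers more than 15 contiguous basepairs. The sketch below makes this explicit. The function name, the `offset` parameter (a shift of the window relative to the junction, which is not necessarily the same convention as the crRNA position labels in the text), and the placeholder sequences are all hypothetical.

```python
def breakpoint_window(upstream, downstream, offset=0, spacer_len=30):
    """Return the target window spanning a fusion junction and its split.

    upstream/downstream are the 5' and 3' partner sequences flanking the
    junction; offset shifts the window (negative = toward the 5' partner).
    Returns (window, nt_from_upstream, nt_from_downstream).
    """
    fusion = upstream + downstream
    junction = len(upstream)
    start = junction - spacer_len // 2 + offset
    if start < 0 or start + spacer_len > len(fusion):
        raise ValueError("window falls outside the supplied sequence")
    window = fusion[start:start + spacer_len]
    from_upstream = min(max(junction - start, 0), spacer_len)
    return window, from_upstream, spacer_len - from_upstream


# Placeholder homopolymers stand in for the two partner genes.
partner_5p = "A" * 40
partner_3p = "C" * 40
centred = breakpoint_window(partner_5p, partner_3p)
shifted = breakpoint_window(partner_5p, partner_3p, offset=-12)
```

With the window centred on the junction, the split is 15 and 15 nucleotides; shifting it by 12 nucleotides toward the 5' partner gives a 27 and 3 split, which would leave far more contiguous pairing with that wild-type transcript.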
High potency against drug-resistant point-mutated fusion transcripts.
Acquired drug resistance to all approved ABL1 kinase inhibitors through secondary mutations remains a major challenge in the treatment of BCR-ABL1 driven leukemias27. For instance, the BCR-ABL1 kinase domain mutation Thr315Ile (T315I) confers resistance to Imatinib and drives tumour relapse28. We hypothesised that, unlike Imatinib, targeting the breakpoint of the BCR-ABL1 transcript with potent crRNAs would remain effective against both BCR-ABL1 variants, as the mutation is located outside the targeted sequences at the breakpoint. We tested the potency of either Imatinib or three PspCas13b crRNAs targeting the breakpoint of BCR-ABL1. As anticipated, Imatinib efficiently inhibited the oncogenic signalling of ancestral BCR-ABL1 but failed to effectively suppress T315I BCR-ABL1 downstream signalling (Fig. 5g). Notably, all three PspCas13b crRNAs we tested largely inhibited the expression of ancestral and T315I BCR-ABL1 proteins and their downstream oncogenic signalling, as exemplified by phospho-STAT5 and phospho-ERK inhibition. Consistent with previous data, crRNA −12 and crRNA +12 achieved the highest inhibitory effect due to higher silencing potency (Fig. 5g). These data demonstrate that targeting the breakpoint of the BCR-ABL1 transcript can overcome drug resistance commonly observed in recurrent leukemia.