Construction of cjCBEmax and cjABE8e
We used improved variants of spCas9-based BEs, BE4MAX and ABE8e, to generate highly active cjCas9-based BEs, cjCBEmax and cjABE8e. Both improved base editors have been reported previously 15,19; BE4MAX contains codon-optimized APOBEC1 cytidine deaminase and UGI domains, and ABE8e has an evolved TadA deoxyadenosine deaminase. We cloned these domains into the pCMV-cjCas9-D8A construct to generate cjCBEmax and cjABE8e (Figure 1A). Compared with the original spCas9-based BEs, the coding sequences of cjCBEmax and cjABE8e were approximately 1.3-fold smaller, and their expression was confirmed by western blot (Supplementary Figure 1). We transfected these constructs into HEK293T cells with AAVS1-2 and AAVS1-8 gRNAs, which were previously shown to allow cjCas9 to induce indels at a high frequency 8, and analyzed the mutation frequencies by targeted deep sequencing.
The conventional active window of base editors spans positions 4–9 of the spacer sequence; however, because the active window of the cjCas9-based BEs was unknown, we analyzed base editing frequencies over a broader range that included 15 nt outside the spacer sequence. At the AAVS1-2 and AAVS1-8 target sites, cjABE8e induced A:T to G:C conversions at frequencies of up to 55.7% and 49.3%, and cjCBEmax introduced C:G to T:A conversions at frequencies of up to 18.1% and 19.1%, respectively (Figure 1B). Interestingly, both cjCBEmax and cjABE8e introduced base substitutions outside the conventional active window (Figures 1C and 1D). cjABE8e induced A:T to G:C conversions at frequencies of up to 32.0% at the A(15) position of the AAVS1-2 target site and 43.0% at the A(11) position of the AAVS1-8 target site.
Although there was no cytosine in the conventional active window, cjCBEmax induced C:G to T:A conversions at both the AAVS1-2 and AAVS1-8 target sites, reaching up to 11.3% at the C(-9) position of the AAVS1-8 target site. At these two sites, cjABE8e did not efficiently edit adenines located outside the spacer sequence, whereas cjCBEmax converted cytosines located outside the spacer sequence. Taken together, these results show that cjCBEmax and cjABE8e can induce base conversions at endogenous target sites and suggest that their active windows might be much wider than those of conventional BEs.
Characterization of cjCBEmax and cjABE8e
To further characterize cjCBEmax and cjABE8e, we cloned gRNAs for 11 additional target sites and transfected them into HEK293T cells with cjCas9, cjCBEmax, and cjABE8e. As shown in Figure 2A, cjCBEmax induced C:G to T:A conversions at frequencies of up to 43.5% at the HPD-1 target site, and cjABE8e induced A:T to G:C conversions at frequencies of up to 54.3% at the HPD-2 target site. cjABE8e had higher activity than cjCBEmax at all target sites, with average mutation frequencies of 35.4% for cjABE8e and 17.4% for cjCBEmax across the 13 endogenous target sites. To characterize the active windows of cjCBEmax and cjABE8e, we analyzed the substitution frequencies of individual cytosines and adenines in a 50-nt window (Figure 2B and Supplementary Figure 2). In line with the results for AAVS1-2 and AAVS1-8 (Figures 1C and 1D), cjCBEmax and cjABE8e had wider active windows than spCas9-based BEs. cjCBEmax edited the C(-12) position at up to 1.9% in the EPAS1-1 target site and the C(19) position at up to 0.7% in the HIF1A-1 target site. Notably, at the SERPINC1 target site, cjCBEmax induced base substitutions at up to 13.5% at the C(13) position, a much higher frequency than that at C(8), which lies within the conventional active window. Compared with cjCBEmax, cjABE8e showed a narrower active window across the 13 target sites, inducing base substitutions at up to 9.4% at the A(-3) position of the ANGPT2 target site and 7.1% at the A(18) position of the EPAS1-1 target site.
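The per-position analysis described above can be illustrated with a minimal Python sketch. It assumes reads that are already aligned to the reference amplicon without indels, and it assumes a numbering scheme in which the first spacer nucleotide is position 1 and the nucleotide immediately upstream is position -1 (no position 0). The function name, its parameters, and the toy sequences are hypothetical and do not reproduce the analysis pipeline used in this study.

```python
def substitution_frequencies(reference, reads, spacer_start,
                             ref_base="C", alt_base="T"):
    """Percent of reads carrying alt_base at every ref_base in the reference.

    Assumptions (not from the study): reads are pre-aligned, have the same
    length as the reference, and contain no indels. Positions are numbered
    relative to the spacer: index spacer_start is position 1 and the index
    just before it is position -1 (there is no position 0).
    """
    if not reads:
        raise ValueError("at least one read is required")
    frequencies = {}
    for index, base in enumerate(reference):
        if base != ref_base:
            continue
        offset = index - spacer_start
        position = offset + 1 if offset >= 0 else offset
        edited = sum(1 for read in reads if read[index] == alt_base)
        frequencies[position] = 100.0 * edited / len(reads)
    return frequencies


# Toy data only: a 6-nt "amplicon" whose spacer starts at index 2.
toy_reference = "ACGTCA"
toy_reads = ["ATGTCA", "ACGTTA", "ACGTTA", "ACGTCA"]
print(substitution_frequencies(toy_reference, toy_reads, spacer_start=2))
```

For adenine base editing, the same function can be called with ref_base="A" and alt_base="G".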
Next, we examined whether the sequence context around adenines and cytosines affected the base editing activities of cjCBEmax and cjABE8e. For cjABE8e, the AA sequence context had an adverse effect on adenine base editing activity, whereas the TA sequence context tended to enhance it (Figure 2C). cjCBEmax exhibited relatively high cytosine editing activity in a TC sequence context compared with a GC sequence context (Figure 2D). Song et al. previously analyzed the correlation between the efficiency of spCas9-based BEs and sequence context 23, and we found that cjCBEmax and cjABE8e showed trends similar to those of the spCas9-based BEs. Overall, we demonstrated that cjCBEmax and cjABE8e can induce substitutions at various endogenous sites with a wider active window than that of spCas9-based BEs, and that the sequence contexts affecting their activity were similar to those of the spCas9-based BEs.
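One way to summarize such context effects is to group editing frequencies by the dinucleotide formed by the 5' neighbor and the target base (for example, TC versus GC) and compare the group means. The sketch below is only an illustration under that assumption; the function name and the input values are hypothetical and are not data from this study.

```python
from collections import defaultdict
from statistics import mean


def mean_frequency_by_context(sites, target_base):
    """Average editing frequency per dinucleotide context.

    sites is an iterable of (sequence, index, frequency_percent) tuples,
    where index points at the edited base. The context is the 5' neighbor
    followed by the target base. Bases without a 5' neighbor are skipped.
    """
    grouped = defaultdict(list)
    for sequence, index, frequency in sites:
        if index == 0 or sequence[index] != target_base:
            continue
        context = sequence[index - 1:index + 1]
        grouped[context].append(frequency)
    return {context: mean(values) for context, values in grouped.items()}


# Illustrative values only, not measurements from the paper.
toy_sites = [("GTCA", 2, 30.0), ("AGCA", 2, 10.0), ("TTCG", 2, 20.0)]
print(mean_frequency_by_context(toy_sites, target_base="C"))
```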
Specificity of cjCBEmax and cjABE8e
Next, we assessed the tolerance of cjCBEmax and cjABE8e for mismatched gRNAs. A total of 22 gRNAs with one or two mismatches to the target sequences were constructed and transfected into HEK293T cells with cjCas9, cjCBEmax, and cjABE8e (Supplementary Table 1). We analyzed the indel and base substitution frequencies by targeted deep sequencing and compared the tolerance for mismatched gRNAs (Figure 2E). In most cases, cjCBEmax and cjABE8e were more tolerant of mismatches in the PAM-distal region than in the PAM-proximal region, whereas the HPD-2-M1-1 and HPD-2-M1-2 mismatched gRNAs showed a different trend. For example, cjCBEmax showed 44.2% base editing activity with the HPD-2-M1-2 mismatched gRNA, which contains a 1-bp mismatch at position 20 of the spacer sequence; this was comparable to the base editing activity with the HPD-2 gRNA. Because the mismatch tolerance for HPD-2-M1-1 and HPD-2-M1-2, which have a 1-bp mismatch at the positions closest to the PAM, was also observed with cjCas9, we speculated that this might be a characteristic of cjCas9 or a target-specific trait.
For most gRNAs with 2-bp mismatches, cjCas9 was not tolerant, whereas cjCBEmax and cjABE8e had modest tolerance that was proportional to the distance from the PAM region. Notably, with the HPD-2-M2-11 mismatched gRNA, which contains 2-bp mismatches at positions 1 and 2, cjCBEmax and cjABE8e induced substitutions at frequencies of up to 15.0% and 35.6%, respectively, whereas cjCas9 showed an indel frequency of 0.9%. These results suggest that cjCBEmax and cjABE8e have slightly lower specificity than cjCas9.
We also sought to identify endogenous off-target effects of cjCBEmax and cjABE8e. The potential off-target sites of each gRNA were analyzed in silico using Cas-OFFinder 24, and we selected AAVS1-8-OT1, a potential off-target site containing 2-bp mismatches relative to the target spacer sequence (Supplementary Table 2). We analyzed the endogenous mutations by targeted deep sequencing and found that cjCBEmax, cjABE8e, and cjCas9 all showed no detectable endogenous off-target mutations (Figure 2F).
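Whether a candidate site counts as a 1-bp or 2-bp mismatched site comes down to a position-wise comparison between the spacer and the candidate sequence. The following sketch shows that comparison only; it is not the Cas-OFFinder implementation or its interface, it assumes positions numbered from 1 at the 5' end of the spacer, and the sequences are hypothetical.

```python
def mismatch_positions(spacer, candidate):
    """Return the 1-based positions at which two equal-length sequences differ.

    Position 1 is taken to be the 5'-most nucleotide of the spacer; this
    numbering is an assumption made for illustration.
    """
    if len(spacer) != len(candidate):
        raise ValueError("spacer and candidate must have the same length")
    pairs = zip(spacer.upper(), candidate.upper())
    return [i + 1 for i, (a, b) in enumerate(pairs) if a != b]


# Hypothetical sequences, not target sites from this study.
toy_spacer = "ACGTACGT"
toy_candidate = "ACTTACGA"
positions = mismatch_positions(toy_spacer, toy_candidate)
print(len(positions), positions)
```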
Improvement of cjCBEmax and cjABE8e
To improve the base editing efficiency of cjCBEmax and cjABE8e, we engineered the gRNA scaffold sequence following a previous study on spCas9 gRNA engineering. As shown in Figure 3A, we removed a putative terminator motif of four consecutive uracils by a single A:U to G:C conversion to avoid premature termination of gRNA transcription, and we truncated the tetraloop to shorten the gRNA; we named the engineered scaffold the “e-scaffold”. We cloned gRNAs with the e-scaffold for five endogenous target sites and compared their mutation frequencies with those of the wild-type scaffold by targeted deep sequencing. As shown in Figure 3B, the e-scaffold improved the mutation frequencies of cjCas9, cjCBEmax, and cjABE8e at all five target sites. In particular, at the EPAS1-2 target site, gRNAs with the e-scaffold enhanced the base editing frequency of cjCBEmax by 3.1-fold.
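A run of four or more consecutive uracils (thymines in the DNA template) is the putative terminator motif referred to above. A minimal sketch for locating such runs in a scaffold sequence is given here; the function name and the example sequence are hypothetical, and the sequence is not the cjCas9 scaffold.

```python
import re


def find_poly_t_runs(scaffold, min_length=4):
    """Return (start_index, run_length) for each run of >= min_length T/U.

    Accepts RNA or DNA notation; U is treated as T. Indices are 0-based.
    """
    dna = scaffold.upper().replace("U", "T")
    pattern = re.compile("T{%d,}" % min_length)
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(dna)]


# Made-up sequence for illustration only.
toy_scaffold = "ACGUUUUGCAUCGUUUUUAC"
print(find_poly_t_runs(toy_scaffold))
```

A scaffold in which every such run has been interrupted returns an empty list.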
A recent study showed that an L58Y/D900K double mutation in cjCas9 (encjCas9) can improve the activity of cjCas9 21. To determine whether the L58Y/D900K double mutation was synergistic with the e-scaffold, we first compared the indel frequencies of cjCas9 and encjCas9 combined with gRNAs bearing the wild-type scaffold or the e-scaffold (Figure 3C). encjCas9 had improved activity compared with cjCas9 and acted synergistically with the e-scaffold across five target sites; in particular, at the HIF1A-2 target site, the combination of encjCas9 and the e-scaffold enhanced the indel frequency by 7.6-fold (from 4.0% to 30.3%). We then introduced the L58Y/D900K double mutation into cjABE8e (encjABE8e) and tested its activity with gRNAs bearing the e-scaffold (Figure 3D). At five target sites, encjABE8e showed improved base editing activity compared with cjABE8e, and this improvement was synergistic with the e-scaffold.
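The fold enhancement quoted above is the ratio of the two indel frequencies: 30.3% divided by 4.0% is approximately 7.6. A short sketch of that arithmetic follows; the function name is hypothetical.

```python
def fold_change(baseline_percent, improved_percent):
    """Ratio of an improved frequency to its baseline frequency."""
    if baseline_percent <= 0:
        raise ValueError("baseline frequency must be positive")
    return improved_percent / baseline_percent


# Indel frequencies reported for the HIF1A-2 target site (4.0% and 30.3%).
print(f"{fold_change(4.0, 30.3):.1f}-fold")
```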
AAV vector of cjABE8e for base editing
We next examined whether cjABE8e could be packaged into an AAV vector. Because of the limited packaging capacity of AAV, spCas9-based BEs are difficult to deliver with a single AAV vector system. Since cjABE8e was small enough to be packaged into an AAV vector, we speculated that cjABE8e and two tandem arrays of gRNA might be integrated into a single AAV vector (Figure 4A). To further reduce the size of the construct, we investigated whether a previously known synthetic polyadenylation (polyA) sequence was compatible with cjABE8e. Because the synthetic polyA sequence is 49 bp long, much shorter than the 225-bp bovine growth hormone (BGH) polyA sequence, it leaves more space for AAV packaging.
We first cloned single-pAAV-cjABE8e constructs containing the BGH polyA or the synthetic polyA sequence and transfected them into HEK293T cells (Figure 4B). Targeted deep sequencing showed that single-pAAV-cjABE8e constructs with the synthetic polyA and BGH polyA sequences had similar base editing frequencies across five target sites. We then constructed a dual-pAAV-cjABE8e vector containing two gRNAs targeting ANGPT2 and HPD-1 and compared its base editing efficiency with those of the single-pAAV-cjABE8e vectors. The dual-pAAV-cjABE8e vector induced base substitutions at frequencies of 68.4% and 82.9% at the ANGPT2 and HPD-1 target sites, respectively, which were comparable with those of the single-pAAV-cjABE8e vectors (Figure 4C). Subsequently, we produced AAV particles and used them to infect HEK293T cells; the base editing frequencies increased in a dose-dependent manner, reaching up to 24.0% and 91.9% at the ANGPT2 and HPD-1 target sites, respectively (Figure 4D). We also identified potential off-target sites of the ANGPT2 and HPD-1 gRNAs in silico and measured mutations at these sites by targeted deep sequencing in AAV-infected HEK293T cells, but did not find detectable off-target mutations (Figure 4E and Supplementary Table 2).