ABE is an effective gene editing tool that can convert an A/T pair to a G/C pair without generating a DNA double-strand break (DSB) or requiring a donor DNA template. An initial practical version of ABE (ABE7.10) was composed of three fused elements: a partially inactive Cas nuclease (Cas nickase or nCas) and a pair of adenosine deaminases [a wild-type tRNA-specific adenosine deaminase, TadA from Escherichia coli (wtTadA), and an engineered TadA (eTadA), TadA7.10, which was created by directed evolution so that it would operate on DNA instead of RNA]1. Recently, several groups have reported ABE shortcomings2, including genome-wide single-guide RNA (sgRNA)-dependent off-target DNA editing3,4, transcriptome-wide sgRNA-independent off-target RNA editing5-7, and ABE-mediated cytosine deamination at the on-target site8. The first problem is caused by the imperfect target specificity of Cas nucleases, but the others result from the DNA/RNA-binding properties of adenosine deaminases. Therefore, to reduce the adenosine deaminase-mediated effects, further engineering of TadA7.10 is required.
To date, a few groups have provided new versions of ABE with additional mutations to alleviate its off-target effects in RNA. The Yang group determined that the addition of an F148A mutation in both wtTadA and TadA7.10 decreased the promiscuous RNA deamination activity7. The Liu group and the Joung group independently found that the wtTadA in ABE7.10 is mainly responsible for the RNA deamination activity, whereas the absence of wtTadA does not affect the DNA editing activity5,6. Therefore, the Liu group deactivated wtTadA by introducing an E59A mutation and found that a further mutation (V106W) in TadA7.10 reduced the RNA off-target effects without reducing the DNA on-target activity5. The Joung group eliminated wtTadA entirely and added several mutations to TadA7.10 (K20A/R21A or V82G), resulting in a decrease of the RNA off-target effect6. However, the ABE-mediated cytosine catalysis effect has not been seriously addressed thus far. Because the eTadA enzyme has a common catalytic site for both adenine and cytosine, it is proposed that additional mutations in and around the active site may decrease the cytosine conversion activity, without reducing the adenine conversion activity. Here, by rationally designing and testing tens of TadA variants, we identified key mutations that either eliminate or enhance cytosine catalysis activities.
Previously, we showed that ABE7.10 catalyzed cytosine deamination in a narrow editing window (positions 5 to 7 from the 5’ end of the Cas9 target spacer) at a preferred motif (TC*N) (Figure 1a) and that not only ABE7.10 but also its predecessors (ABE6.3, ABE7.8, and ABE7.9) and a further optimized version (ABEmax) exhibited conserved cytosine catalysis activities8. Here, we investigated whether cytosine deamination activities are also displayed by newly developed ABE variants, including the versions that were generated with the aim of reducing ABE-mediated RNA off-target effects (i.e., ABEmax-F148A, ABEmax-AW, and SECURE-ABEs)5-7, versions that contain TadA8e variants, which show increased deamination kinetics (i.e., ABE8e and ABE8e-V106W)9, and a version that contains TadA8s, which displays enhanced editing activity (i.e., ABE8.17-m)10. We tested all ABE variants at two representative endogenous targets (in FANCF and RNF2), which contain both an adenine residue and a cytosine target motif within the editing window, in human HEK293T cells, and drew several conclusions based on high-throughput sequencing data from bulk cell populations (Supplementary Tables 1 and 2). First, all tested variants still induced cytosine editing in addition to adenine editing (Figure 1b), but we observed slightly decreased cytosine editing rates for ABEmax-F148A and ABEmax-AW, supporting our hypothesis that further engineering of eTadA could eliminate or minimize its cytosine deamination activity. Second, although all ABE8 variants showed adenine editing rates that were greatly increased compared with the rates of previous versions, their cytosine conversion rates were also increased. Nevertheless, it is notable that ABE8e-V106W showed mitigated cytosine editing compared to ABE8e. Third, we confirmed that deactivation or elimination of wtTadA did not hinder DNA editing activities; rather, a version of ABEmax that lacks wtTadA (ABEmax-m) showed higher DNA editing activities than intact ABEmax (Supplementary Figure 1).
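As an illustration of the target-selection criterion described above (a cytosine in a TC*N motif within positions 5 to 7 of the spacer), the following minimal Python sketch lists the qualifying cytosines in a spacer sequence. The function name and the example spacer are hypothetical and are not taken from this study; the window is a parameter so that other windows can be tried.

```python
def motif_cytosines(spacer, window=(5, 7)):
    """Return 1-based spacer positions of cytosines that are preceded by T
    (the TC*N motif) and fall inside the given editing window.

    Positions are counted from the 5' end of the spacer, as in the text.
    """
    spacer = spacer.upper()
    first, last = window
    hits = []
    # A motif cytosine needs an upstream base, so position 1 can never qualify.
    for pos in range(max(first, 2), min(last, len(spacer)) + 1):
        if spacer[pos - 1] == "C" and spacer[pos - 2] == "T":
            hits.append(pos)
    return hits


# Hypothetical 20-nt spacer used only for illustration.
example_spacer = "GAGTATCAGCTTGACCGTAA"
print(motif_cytosines(example_spacer))
```

A spacer for which this list is non-empty and which also carries an adenine in the editing window would correspond to the kind of site used here to compare adenine and cytosine editing side by side.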
To identify key mutations in TadA that would promote discrimination between adenine and cytosine, we first examined the amino acid sequences of TadA orthologs from various species, because some of these orthologs may have already evolved to avoid cytosine editing. Based on an amino acid sequence alignment of TadA orthologs (Figure 1c) and the structure of Staphylococcus aureus TadA (saTadA) bound to a fragment of tRNA (Figure 1d), we found that the identities of several residues in and around the active site vary substantially among the orthologs. For example, P48 in E. coli wtTadA is substituted by arginine in the majority of TadA orthologs, and D108 is changed to asparagine, glutamate, or serine in other orthologs (Figure 1c and 1d). In addition, the saTadA structure provided insight into what structural change in the RNA substrate is required for the deamination of cytosine, which is smaller than adenine. For adenine deamination activity, the hexagonal ring of adenine should be located deep inside the adenine-binding pocket, similar to what is shown in the saTadA structure with a purine base bound to the pocket. However, for cytosine deamination, its pyrimidine ring needs to be at the same position as that of the hexagonal ring of the purine base in the structure, and this consequently requires a shift of the sugar-phosphate backbone toward the rim of the pocket. Therefore, we chose to substitute P48 and D108 with bulkier residues, which may prevent the DNA backbone from approaching the pocket rim. We also mutated V30 and F84, located in the adenine-binding pocket, into isoleucine and leucine, respectively, which are found in the corresponding positions of many TadA orthologs. In addition, we introduced mutations that have been previously tested and shown to incompletely reduce RNA editing activities, such as an R47Q mutation, which maintains DNA on-target editing activity5, and a D53E mutation, which reduced RNA editing activity in vitro7,11.
We next introduced each candidate mutation into TadA7.10 in either ABEmax or ABEmax-m (Supplementary Table 3) and tested the nucleotide conversion activity of each resulting ABE variant at the target sites in the FANCF and RNF2 genes in HEK293T cells. When we normalized the adenine and cytosine conversion rates of the ABE variants to those of ABEmax, we found that most mutations increased or decreased both the adenine and cytosine editing activities in concert (Figure 1e). Nevertheless, we identified four mutations (i.e., V106W, D108Q, F148A, and F149A) that substantially lowered the cytosine editing activity but maintained or only slightly decreased the adenine editing activity, resulting in higher specificity for adenine editing (Figure 1f). Therefore, we further tested the activities of the four ABE variants at sites in two more endogenous genes and concluded that ABEmax-m containing TadA7.10-D108Q (hereafter ABEmaxQ-m) showed the highest specificity for adenine editing (Supplementary Figure 2 and Supplementary Table 1). In addition, we unexpectedly found that the P48R mutation in TadA7.10 of ABEmax-m substantially reduced adenine editing rates but increased cytosine editing rates, resulting in high specificity for cytosine editing (Figure 1e). In summary, we identified two TadA7.10 variants, TadA7.10-D108Q and TadA7.10-P48R, which showed enhanced selectivity for adenine or cytosine editing, respectively.
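The normalization and specificity comparison described above can be sketched as follows. This is a minimal illustration that assumes specificity is summarized as the ratio of the adenine to the cytosine conversion rate; the exact metric plotted in Figure 1f is not restated here, and the variant name and rates in the example are placeholders, not measured values.

```python
def normalize_to_reference(rates, reference="ABEmax"):
    """Divide each variant's A and C conversion rates by those of the reference editor."""
    ref = rates[reference]
    return {
        name: {"A": r["A"] / ref["A"], "C": r["C"] / ref["C"]}
        for name, r in rates.items()
    }


def adenine_specificity(rates):
    """Assumed metric: adenine conversion rate divided by cytosine conversion rate."""
    return {
        name: (r["A"] / r["C"] if r["C"] > 0 else float("inf"))
        for name, r in rates.items()
    }


# Placeholder conversion rates (%), for illustration only.
rates = {
    "ABEmax": {"A": 40.0, "C": 10.0},
    "variant_x": {"A": 30.0, "C": 2.0},  # hypothetical variant
}
print(normalize_to_reference(rates))
print(adenine_specificity(rates))
```

Under this assumed metric, a variant whose normalized cytosine rate falls much further than its normalized adenine rate gains adenine specificity, which is the pattern reported for the V106W, D108Q, F148A, and F149A mutations.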
Although ABEmaxQ-m exhibited greatly reduced cytosine conversion activity, its adenine conversion activity was also reduced compared to that of ABEmax. To compensate for this low editing activity, we next adapted TadA8e and TadA8.17, which were developed with the aim of enhancing editing activities9,10, and tested them in place of TadA7.10. We introduced the D108Q mutation into ABE8e, ABE8e-V106W, and ABE8.17-m, thereby generating ABE8e-D108Q, ABE8e-V106W/D108Q, and ABE8.17-D108Q-m, which are referred to hereafter as ABE8eQ, ABE8eWQ, and ABE8sQ, respectively. We tested all ABE variants at a total of four endogenous sites (in FANCF, RNF2, ABLIM3, and CSRNP3). High-throughput sequencing results showed that the ABE8eQ, ABE8eWQ, and ABE8sQ variants all exhibited enhanced adenine editing activities (Figure 2a) with reduced cytosine editing activities (Figure 2b). In particular, ABE8eWQ was determined to be the most optimized version for both editing activity and specificity (Figure 2c), indicating that the V106W and D108Q mutations show synergistic effects (Figure 2d).
We next evaluated the ABE-mediated RNA off-target editing activity of all of the ABE variants. We transfected each ABE variant into HEK293T cells and measured the A-to-I conversion frequencies in four representative RNA transcripts (CCNB1IP1, AARS1, PERP, and TOPRS)6,7. High-throughput sequencing results revealed that the previously developed ABE8e and ABE8.17-m showed increased RNA off-target effects compared to ABEmax, and that ABE8e-V106W showed reduced RNA off-target effects compared to ABE8e (Figure 2e), as previously reported9,10. To our surprise, ABE8eQ, ABE8eWQ, and ABE8sQ showed markedly decreased RNA off-target effects, indicating that the D108Q mutation also effectively reduces the RNA deamination activity. D108Q could decrease the binding affinity of TadA8e for RNA, but not for DNA, because the carboxyl group of D108 forms a hydrogen bond with a 2’ hydroxyl group of the bound RNA in the saTadA-RNA structure. Taken together, our results indicate that D108 is a key residue whose mutation to glutamine decreases both cytosine editing and RNA deamination activities, so that ABE8eQ, ABE8eWQ, and ABE8sQ are optimized versions of ABE; of these, ABE8eWQ is the best version.
Next, we sought to develop TC sequence-specific base editing tools by using the TadA7.10-P48R variant, which has increased cytosine editing activity with highly decreased adenine editing activity. To this end, we linked two copies of uracil DNA glycosylase inhibitor (UGI) to the C-terminus of ABEmax-P48R as had been done in AncBE4max, an optimized cytosine base editor (CBE)12. The addition of UGI may increase the C-to-T editing ratios rather than the C-to-G editing ratios, or vice versa. We ultimately prepared six base editing tools: AncBE4max and AncBE4max(ΔUGI) as CBEs, ABEmax and ABEmax-UGI as ABEs, and ABEmax-P48R and ABEmax-P48R-UGI as TC-specific base editors (Figure 3a). We tested all of them at a target site in the CSRNP3 gene in HEK293T cells. High-throughput sequencing results showed that CBE and CBE(ΔUGI) converted all Cs (i.e., C3, C6, and C7) within the editing window with the highest rates and that ABE and ABE-UGI converted all As (i.e., A4 and A8) as well as C6, whereas ABE-P48R and ABE-P48R-UGI dominantly converted C6 (Figure 3b). We further tested these tools at three additional endogenous sites (in FANCF, RNF2, and ABLIM3). High-throughput sequencing results showed that the editing activity tendencies were consistent with the results at the CSRNP3 site and that ABE-P48R dominantly converted C to G, whereas ABE-P48R-UGI dominantly converted C to T, as expected (Figure 3c). These results suggest that ABE-P48R and ABE-P48R-UGI can function as specific cytosine editing tools for TC-to-TG and TC-to-TT editing, respectively, without significant bystander editing effects.
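The distinction between C-to-T and C-to-G outcomes at a given position can be quantified from sequencing reads along the following lines. This is a simplified sketch, not the analysis pipeline used in this study: it assumes reads that are already aligned to the reference amplicon without insertions or deletions, and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def outcome_fractions(reads, reference, position):
    """Fraction of reads carrying each non-reference base at a 1-based position.

    Assumes every usable read has the same length as the reference (no indels);
    reads of any other length are skipped.
    """
    ref_base = reference[position - 1]
    counts = Counter(
        read[position - 1] for read in reads if len(read) == len(reference)
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        f"{ref_base}>{base}": n / total
        for base, n in counts.items()
        if base != ref_base
    }


# Toy data for illustration only.
reference = "ATCG"
reads = ["ATCG", "ATTG", "ATTG", "ATGG"]
print(outcome_fractions(reads, reference, position=3))
```

Comparing the C>T and C>G fractions at the target cytosine, and the fractions at neighboring cytosines and adenines, gives the product distribution and bystander editing for each editor.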
To show the potential of ABE-P48R and ABE-P48R-UGI for treating genetic diseases, we inspected all targetable variations registered in the ClinVar database in silico. A total of 36,153 T>C mutations causing pathological phenotypes in the database are located in a canonical cytosine base editing window (4th ~ 8th) (Figure 3d). Among them, 3,874 mutations are associated with a cytosine target motif within the ABE-P48R-UGI editing window. In addition, 3,248 of the 23,237 G>C mutations in the database can be targeted by ABE-P48R. In the case of a missense mutation in the TUBB6 gene (causing an F394S change in the protein) that is associated with congenital facial palsy, bilateral ptosis, and velopharyngeal dysfunction13, a TC sequence should be corrected to TT. We first established a cell line containing the appropriate mutation in the genome to mimic the disease situation (Supplementary Figure 3), after which we transfected a CBE (AncBE4max) or ABE-P48R-UGI into the cell line. High-throughput sequencing results showed that relative to ABE-P48R-UGI, CBE generated higher total cytosine conversion rates but lower rates of exact corrections (<1% compared to 4.1% for ABE-P48R-UGI) (Figure 3e). It is notable that ABE-P48R-UGI generated negligible bystander effects, whereas CBE generated abundant bystander effects, which suggests that ABE-P48R-UGI would be a useful TC-to-TT editing tool.
On the other hand, for a missense mutation in the TPO gene (causing a Q660E change) that is found in both nontoxic and toxic goiter patients14, a TC sequence should be corrected to TG. As before, we generated a cell line with the appropriate genomic mutation to mimic the disease condition (Supplementary Figure 3), and then transfected the CBE (AncBE4max) or ABE-P48R into the cell line. High-throughput sequencing results showed that compared to ABE-P48R, CBE generated higher total cytosine conversion rates but lower rates of exact corrections (<1% compared to 3.1% for ABE-P48R) (Figure 3f). It is notable that like ABE-P48R-UGI, ABE-P48R generated negligible bystander effects, whereas CBE generated abundant bystander effects, which suggests that ABE-P48R could be a useful TC-to-TG editing tool.
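The in silico screen for targetable mutations described above can be approximated with a check like the one below. It is only a sketch under explicit assumptions: a 20-nt protospacer followed by an NGG PAM (typical for SpCas9, but not stated in this section), a single strand examined, and a window of positions 4 to 8. The function name and the example sequence are hypothetical.

```python
def targetable_positions(seq, target_index, window=(4, 8), spacer_len=20):
    """Return the 1-based protospacer positions at which a TC-motif cytosine
    can be placed inside the window with an NGG PAM directly 3' of the spacer.

    seq          : sequence of the strand that carries the target cytosine
    target_index : 0-based index of that cytosine in seq
    Assumptions  : NGG PAM, 20-nt spacer, only the given strand is checked.
    """
    seq = seq.upper()
    # The target must be a cytosine preceded by a thymine (TC motif).
    if target_index == 0 or seq[target_index] != "C" or seq[target_index - 1] != "T":
        return []
    hits = []
    for pos in range(window[0], window[1] + 1):
        start = target_index - (pos - 1)   # index of protospacer position 1
        pam_start = start + spacer_len     # index of the N in NGG
        if start < 0 or pam_start + 3 > len(seq):
            continue
        if seq[pam_start + 1:pam_start + 3] == "GG":
            hits.append(pos)
    return hits


# Hypothetical sequence: the target C sits at protospacer position 6, followed by a TGG PAM.
example = "AAAATC" + "A" * 14 + "TGG"
print(targetable_positions(example, target_index=5))
```

Applying such a check to both strands around each ClinVar variant, with the window set to match the editor in question, would yield counts of the kind reported above; the specific numbers in the text come from the analysis in this study, not from this sketch.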
In this study, through rational design we determined a key mutation in eTadA, D108Q, that is responsible for decreasing cytosine catalysis activity. Interestingly, in contrast, D108F and D108W mutations dramatically decreased both adenine and cytosine editing activities (Figure 1e), suggesting that amino acids larger than glutamine may limit the accessibility of the sugar-phosphate backbone. D108E completely abolished the activity, probably due to strong charge repulsion between the carboxyl group of glutamate and the backbone phosphate. However, D108K caused only a mild decrease, suggesting that the lysine-phosphate interaction may not have a significant effect on the conformational dynamics of the DNA backbone. Although the size of methionine is similar to that of glutamine, D108M decreased the editing rates much more than D108Q, suggesting that direct polar interactions or water-mediated interactions between glutamine and the sugar-phosphate backbone may be crucial for the catalytic activity.
Conversely, we further identified a key eTadA mutation, P48R, that increases cytosine catalysis activity while reducing adenine editing activity. Because other mutations that affected P48 also similarly decreased the ratio of adenine editing to cytosine editing (Figure 1e), the cytosine specificity can likely be attributed to a change in the conformation of the main chain that consequently affects the orientations of neighboring residues. For example, a slight conformational change of N46 could have a dramatic effect on the editing activity because the residue closely contacts the purine hexagonal ring in the saTadA-RNA structure. Fortunately, in the case of P48K and P48R, the editing activity seemed to be restored by the interactions of the residue at position 48 with the backbone phosphate; the arginine-phosphate interaction might have an additional effect on the cytosine specificity by stabilizing a backbone conformation favorable to cytosine binding.
Recently, novel forms of DNA base editing, such as C>G editing15-17 and simultaneous C and A editing18-21, have been reported, which should substantially expand the utility of DNA base editors. Along with the intense efforts to improve DNA base editors, the tools presented here, namely high-fidelity ABE variants that exhibit minimized cytosine catalysis and reduced off-target RNA editing, and TC-specific base editors with negligible bystander effects, should make DNA base editing a more attractive option for gene editing in many research areas, such as disease therapy development, gene regulation, and plant transformation.