Different HDR Strategies: Cut-Site Insertion vs. Coding Sequence (CDS) Replacement and Adjusting Homology Arm Length
RAG2-SCID is caused by mutations scattered throughout the CDS of the RAG2 gene4. Therefore, a universal correction technique that would suit all RAG2-SCID patients requires the delivery of an intact copy of the complete RAG2 CDS. While KI of an intact CDS at the endogenous locus would achieve this, this strategy could interfere with the 3D chromatin architecture and critical endogenous gene regulation by moving regulatory elements further downstream from the transgene. This could potentially disrupt spatial cross-talk between functional elements upstream and downstream of the RAG2 gene, such as promoter and/or enhancer sequences12,14,17,18 (Figure S1A-D). Hence, we hypothesized that a donor DNA with a left homology arm (LHA) upstream adjacent to the cut site, and a right homology arm (RHA) distanced from the cut site, downstream of the RAG2 stop codon would ensure preserving RAG2 regulatory elements by replacing the entire RAG2 CDS. To examine the feasibility of our hypothesis, we produced two rAAV6 vectors that would integrate a GFP expression cassette under the regulation of a spleen focus-forming virus (SFFV) promoter and BGHpA sequence after delivery of a chemically modified RAG2 sgRNA/Cas9 ribonucleoprotein (RNP) complex via electroporation into CD34+ HSPCs. In previous studies, we demonstrated this sgRNA’s high on-target editing efficiency and accuracy50,63,64. The first donor56, herein CSI_GFP-BGHpA_400x400, uses 400bp homology arms immediately flanking the Cas9-induced cut site for donor insertion (hereafter termed a cut-site-insertion [CSI] vector), while the second donor, herein CDSR_GFP-BGHpA_400x400, uses a 400bp LHA spanning the immediate sequence upstream to the Cas9 cut site and a 400bp RHA spanning the immediate sequence downstream to the RAG2 stop codon, to replace the entire RAG2 CDS with the DNA donor (herein a CDS-replacement [CDSR] vector) (Fig. 1A and Table S1). 
Two days post-editing, flow cytometry showed that the HDR efficiency of the CSI_GFP-BGHpA_400x400 vector was significantly higher than that of the CDSR_GFP-BGHpA_400x400 vector (21.8% and 9.1%, respectively) (Fig. 1B and S1E). To improve the HDR efficiency of the CDSR technique, we designed two additional rAAV6 donors with RHAs extended from 400bp to 800bp and 1,600bp, spanning the region immediately downstream of the RAG2 stop codon (herein CDSR_GFP-BGHpA_400x800 and CDSR_GFP-BGHpA_400x1600, respectively [Figure 1A and Table S1]). While elongation to 800bp produced significantly higher HDR efficiency (14.8%) than the CDSR_GFP-BGHpA_400x400 donor, only after elongation to 1,600bp did we observe HDR efficiency (25.2%) comparable to that of the CSI_GFP-BGHpA_400x400 donor (Fig. 1B and Figure S1E). Using a uniform pair of primers and probe for droplet digital PCR (ddPCR), we confirmed that the HDR efficiencies determined by flow cytometry were accurate and locus-specific (Fig. 1C). Interestingly, flow cytometry also revealed a significantly higher mean fluorescence intensity (MFI) of GFP-expressing cells after integration of the CDSR vectors compared to the CSI vectors (Fig. 1D-E), highlighting that different integration strategies have distinct effects on the genomic locus and on subsequent transgene expression.
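As a quick illustration of the homology-arm comparison, the reported mean HDR efficiencies can be expressed as fold changes relative to the CDSR_GFP-BGHpA_400x400 donor. The snippet below is only arithmetic on the four percentages quoted in the text; the variable names are illustrative, and it does not reproduce the statistical testing behind the significance statements.

```python
# Mean HDR efficiencies (% GFP+ cells by flow cytometry) as quoted in the text.
hdr_percent = {
    "CSI_GFP-BGHpA_400x400": 21.8,
    "CDSR_GFP-BGHpA_400x400": 9.1,
    "CDSR_GFP-BGHpA_400x800": 14.8,
    "CDSR_GFP-BGHpA_400x1600": 25.2,
}

# Reference donor: the CDSR design with the shortest right homology arm.
baseline = "CDSR_GFP-BGHpA_400x400"

# Fold change of each donor's HDR efficiency relative to the reference donor.
fold_change = {
    name: value / hdr_percent[baseline] for name, value in hdr_percent.items()
}

for name, fc in fold_change.items():
    print(f"{name}: {hdr_percent[name]:.1f}% HDR, {fc:.2f}x vs {baseline}")
```

On these values, extending the RHA to 800bp corresponds to roughly a 1.6-fold increase and extending it to 1,600bp to roughly a 2.8-fold increase over the 400bp RHA design.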
Synthetic polyA Sequences and/or Cis-acting PREs Affect Transgene Expression
To modulate transgene expression further, we aimed to test the impact of synthetic 3’ regulatory elements on transgene expression. Thus, we designed two CDSR vectors (herein CDSR_GFP-WPRE-BGHpA_400x1600 and CDSR_GFP-NoBGHpA_400x1600 [Figure 2A and Table S1]) with a homology arm pattern similar to that of the CDSR_GFP-BGHpA_400x1600 donor. However, whereas the CDSR_GFP-BGHpA_400x1600 vector contained a BGHpA sequence alone and the CDSR_GFP-WPRE-BGHpA_400x1600 contained a WPRE-BGHpA sequence, the CDSR_GFP-NoBGHpA_400x1600 vector lacked both regulatory elements, thus allowing GFP expression to be controlled by the endogenous RAG2 3’-UTR. While the CDSR_GFP-BGHpA_400x1600 and CDSR_GFP-WPRE-BGHpA_400x1600 donors produced comparable HDR efficiencies, the CDSR_GFP-NoBGHpA_400x1600 donor induced lower HDR as observed by flow cytometry and confirmed by ddPCR (Fig. 2B-C and S2). Interestingly, the three donors produced significantly different MFI levels, with CDSR_GFP-BGHpA_400x1600 (2.8×10⁶) being the highest, followed by CDSR_GFP-WPRE-BGHpA_400x1600 and CDSR_GFP-NoBGHpA_400x1600 (1.6×10⁶ and 0.3×10⁶, respectively) (Fig. 2D-E), highlighting the strength of the synthetic 3’-UTRs in boosting expression and the significance of preserving the endogenous RAG2 regulatory elements.
KI-KO Genotype Engineering in HD-derived HSPCs Using Two-part Enrichment Strategy
Since RAG2 gene regulation is critical, we aimed to fine-tune our previously published RAG2-correction strategy56 by using the CDSR method. We hypothesized that replacing the entire CDS would allow transgene expression to be driven by the RAG2 endogenous promoter and 3’-UTR, so that the expression pattern of the transgenic dcoRAG2 cDNA would most closely resemble that of endogenous RAG2. Additionally, by replacing the entire CDS (~1.5kb), as opposed to pushing the sequence ~4kb downstream in the case of HDR via insertion, we are able to more closely maintain the proximity of the RAG genes, conserving the ability to form the chromatin hub super enhancer necessary for proper expression. The dcoRAG2 cDNA produces a protein identical to WT RAG2, while the wobble-position changes reduce its similarity to the genomic sequence. This prevents Cas9 from re-cutting the inserted sequence and prevents the inserted sequence from acting as a homology arm, which would cause premature cessation of HDR. We constructed a CDSR correction donor (herein CDSR_Corr_Endo3’UTR [Figure 3A and Table S1]) with a 400x800bp homology arm pattern for KI of the dcoRAG2 cDNA. To track the expression of dcoRAG2 cDNA and enrich for cells with successful integration, the dcoRAG2 stop codon was eliminated and replaced with a T2A self-cleaving peptide sequence followed by a truncated nerve growth factor receptor (tNGFR) reporter gene, placing the two sequences (dcoRAG2 cDNA and tNGFR) in frame on a single transcript. Following translation of the fusion protein, the T2A self-cleaves, producing two proteins (RAG2 and tNGFR) at a 1:1 ratio in the cell. The use of tNGFR is particularly advantageous since it enables tracking and enrichment of corrected cells and has been approved for clinical applications65.
Additionally, we constructed two donors with synthetic 3’-UTRs following the tNGFR, one with WPRE-BGHpA sequences and one with only the BGHpA sequence, each with a 400x800bp homology arm pattern (herein CDSR_Corr_WPRE-BGHpA and CDSR_Corr_BGHpA, respectively [Figure 3A and Table S1]). While we observed the highest HDR for CDSR donors with an RHA of 1,600bp (Fig. 1B-C), we could not design correction donors with a 400x1,600bp homology arm pattern due to the limited carrying capacity (~4.8kb) of rAAV6 vectors66. For comparative purposes, we utilized our previously published CSI KI donor, which contained dcoRAG2 cDNA followed by the RAG2 endogenous 3’-UTR sequence, along with a tNGFR reporter gene cassette under the regulation of a constitutive phosphoglycerate kinase (PGK) promoter and BGHpA sequence, between 400bp homology arms (herein CSI_Corr)56 (Figure S3A and Table S1). We tested these four correction donors individually and observed highly effective locus-specific HDR for all of them, as determined by ddPCR (Figure S3B).
We aimed to utilize the KI-KO strategy to engineer genotypes via multiplex HDR in HD-derived CD34+ HSPCs to simulate the therapeutic outcome of RAG2-SCID single-allelic correction following a gene-editing-based treatment. This strategy has two main advantages over more extensive editing methodologies: 1) in contrast to induced pluripotent stem cells (iPSCs)67–70, HD-derived CD34+ HSPCs are biologically authentic since they are the same cells used in HSCT71; and 2) it avoids lengthy culturing protocols, which are unsuitable because CD34+ HSPCs lose their regenerative ability as well as their engraftment potential after prolonged culture72. Thus, we chose to apply our KI-KO strategy in HD-derived CD34+ HSPCs by utilizing multiplex HDR to obtain a cell population with one allele targeted with one of the four aforementioned correction donors and the other allele with a KO template (Fig. 1A [CDSR_GFP-BGHpA_400x800 was paired with the three CDSR correction donors and CSI_GFP-BGHpA_400x400 was paired with the CSI_Corr]).
For the CSI_Corr donor, enrichment of KI-KO CD34+ HSPCs was achieved by sorting for biallelic double-positive tNGFR+/GFP+ expression two days post-electroporation (herein day 0) and immediately seeding the cells into the in-vitro T-cell differentiation (IVTD) system (Figure S3C-E). However, with CDSR correction donors, since tNGFR expression is under the regulation of the RAG2 promoter (and the RAG2 expression window occurs later in the T-cell developmental process), there is no expression of tNGFR on day 0. Therefore, a novel enrichment strategy was required to isolate KI-KO cells. On day 0, cells edited with the CDSR donors were sorted only for GFP expression from the KO template, and the GFP+ cells were immediately seeded into the IVTD system. On day 14 of IVTD, when RAG2 is highly expressed (and tNGFR is therefore expressed [Figure S3F]), tNGFR+ cells were sorted and seeded back into the IVTD system (Fig. 3B and S3G-H). Additionally, all samples were sorted for CD7 expression to enrich for cells that had begun to differentiate; that is, only CD7+ cells were subjected to days 14–28 of IVTD (Figure S3G). ddPCR was performed on gDNA from the KI-KO populations to test whether the two-step enrichment method led to a cell population with ~100% edited alleles. Indeed, in all four correction donor multiplex HDR combinations, ~100% of targeted alleles were found to be positive (Fig. 3C). To confirm that the integration of the donors occurred as expected, we conducted an 'in-out' PCR with one primer located on the tNGFR sequence and one primer downstream of the RHA (Figure S3I). The observed amplified bands were consistent with the expected integration patterns and corroborated effective HDR (Figure S3J).
Additionally, since the CDSR donors produce a fusion protein separated by a self-cleaving T2A sequence resulting in a 1:1 ratio between transgenic RAG2 and tNGFR, tNGFR MFI measurement was used as a proxy measurement for transgenic dcoRAG2 expression levels. As expected, we observed that the MFI for CDSR_Corr_WPRE-BGHpA was 2x that of CDSR_Corr_Endo3’UTR and 1.4x greater than CDSR_Corr_BGHpA on day 28 of IVTD with a similar trend on day 14 (Fig. 3D) indicating higher expression of the dcoRAG2.
KI-KO HSPCs Produce CD3+TCRγδ+ and CD3+TCRαβ+ T Cells in the IVTD System
With a robust method to isolate cells with the KI-KO genotype, we aimed to validate that specifically the expression of dcoRAG2 enabled the KI-KO cells to differentiate into CD3+ T cells with diverse TCR repertoires, to present a proof-of-concept for gene correction. Quantitative real-time PCR (qRT-PCR) using transcript-specific primer pairs (Figure S4A) revealed that the expression of endogenous RAG2 was ostensibly eliminated in the KI-KO populations while robust dcoRAG2 cDNA expression was found exclusively in the KI-KO engineered cells (Fig. 4A and S4B). Additionally, when we directly compared the total RAG2 mRNA levels between all groups, we found that expression of the dcoRAG2 transgenes does not exceed that of the Mock samples indicating that the transcription is still tightly controlled and that the gene is not being overexpressed (Fig. 4B).
Importantly, the expression of the dcoRAG2 cDNA indeed facilitated T-cell development highlighted by the successful differentiation of RAG2 KI-KO cells into CD7+, CD5+, and CD1a+ pre-T cells on day 14 and CD3+ T cells by day 28 (Fig. 4C-D and S4C-D). Additionally, robust TCRγδ expression in the CD3+ population was observed by flow cytometry on day 28 with the CD3+TCRγδ− cells presumed to be CD3+TCRαβ+ T cells (Fig. 4E). Lastly, PCR amplification using primers flanking the V-J regions of the TRG locus highlighted the successful recombination of KI-KO cells comparable to that of the Mock cells on day 28 (Figure S4E).
Expression of dcoRAG2 cDNA Induces Normal TCR Repertoire
Deep-sequencing analysis of TRB and TRG recombination on day 28 revealed diverse V(D)J rearrangement repertoires in the RAG2 KI-KO populations following expression of the dcoRAG2 cDNA (Fig. 5A), with no significant differences in either TRB or TRG clonotypes between the Mock and RAG2 KI-KO populations as calculated by Shannon’s H and Simpson’s 1-D diversity indices (Fig. 5B-C). Lastly, the frequency distribution of complementarity determining region 3 (CDR3) lengths was comparable in all RAG2 KI-KO and Mock populations for both the TRB and TRG sequencing (Figure S5). CDR3 is the region of the TCR responsible for recognizing processed antigen peptides, and its sequence and length vary from one clone to another. Thus, sequencing the CDR3 regions of a cell population is used as a measurement of TCR diversity73. Together, these data indicate that KI and expression of the dcoRAG2 cDNA promote successful V(D)J recombination, subsequent differentiation into CD3+TCRαβ+ and CD3+TCRγδ+ T cells, and the development of highly diverse TRB and TRG repertoires.
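For readers unfamiliar with the two diversity indices, the following is a minimal sketch of how Shannon's H and Simpson's 1-D can be computed from clonotype counts. It assumes the common definitions H = -Σ p_i ln p_i and 1-D = 1 - Σ p_i², where p_i is the fraction of reads assigned to clonotype i; the exact software, logarithm base, and any bias correction used for Fig. 5B-C are not stated here, so those are assumptions. The function names and the toy CDR3 labels are hypothetical and are not data from this study.

```python
import math
from collections import Counter


def clonotype_frequencies(clonotypes):
    """Convert a list of per-read clonotype labels into relative frequencies."""
    counts = Counter(clonotypes)
    total = sum(counts.values())
    return [count / total for count in counts.values()]


def shannon_h(freqs):
    """Shannon's H with the natural logarithm (assumed base)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)


def simpson_1_minus_d(freqs):
    """Simpson's 1-D: probability that two random reads differ in clonotype."""
    return 1.0 - sum(p * p for p in freqs)


# Hypothetical toy repertoires (placeholder labels, not real sequencing data).
even_repertoire = ["clone_A", "clone_B", "clone_C", "clone_D"] * 25
skewed_repertoire = ["clone_A"] * 97 + ["clone_B", "clone_C", "clone_D"]

for label, reads in [("even", even_repertoire), ("skewed", skewed_repertoire)]:
    freqs = clonotype_frequencies(reads)
    print(f"{label}: H = {shannon_h(freqs):.3f}, 1-D = {simpson_1_minus_d(freqs):.3f}")
```

Both indices increase as reads are spread more evenly across more clonotypes, so similar values between Mock and KI-KO samples are consistent with comparably diverse repertoires.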