Development of CRISPR/Cas9-assisted recombination system (CARS)
The CARS constructed in this study is a two-plasmid system that consists of five elements: a Cas9-expressing cassette induced by L-arabinose; an sgRNA-expressing cassette induced by L-arabinose; a λ-Red recombination system induced by isopropyl-β-D-thiogalactopyranoside (IPTG); a donor DNA-generation system; and a plasmid curing system for eliminating the two plasmids independently or together from cells (Fig. 1).
Specifically, Cas9 protein and the λ-Red recombinases (Gam, Beta, and Exo) were expressed from the p15A-ParaB-Cas9-PT5-Redγβα plasmid (plasmid#1), which contained a p15A replication origin and a kanamycin-resistance (KanR) gene. The targeting sgRNA was expressed from the pSC101-ParaB-sgRNA-Donor plasmid (plasmid#2), which contained a pSC101 replication origin and an ampicillin-resistance (AmpR) gene (Fig. 1). Plasmid#2 exists in two variants, one carrying two sgRNA-expressing cassettes and the other carrying a single cassette; which variant is used depends on the type of genomic editing. The araB promoter, which is tightly regulated and induced by L-arabinose, controlled Cas9 and sgRNA expression so that DNA cleavage was initiated only when the inducer was present. The T5 promoter, which is strong and induced by IPTG, controlled λ-Red recombinase expression to ensure that homologous recombination took place promptly after DNA cleavage. Donor DNA, which served as a template to introduce sequence deletions, insertions, or replacements, was constructed and integrated into plasmid#2 (Fig. 1). The plasmid-borne donor DNA was protected from nuclease attack and was copied along with plasmid#2 during replication. The genomic target site (N20 + PAM) was added to plasmid#2 on both flanks of the donor DNA, so that the donor DNA could be excised from plasmid#2 during genomic editing. This generated linear donor DNA that participated in homologous recombination with the cleaved genomic DNA (Fig. 1). At an appropriate concentration of L-arabinose, the expression levels of Cas9 and sgRNA were sufficient to cleave the single-copy genome but insufficient to cleave all copies of plasmid#2 (about five copies). Therefore, cells retained resistance to Amp. To construct the plasmid curing system, we used the temperature-sensitive pSC101 replication origin for plasmid#2 and added the sacB gene, which confers sucrose sensitivity, to plasmid#1 as a counter-selection marker (Fig. 1).
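The donor design described above can be illustrated with a minimal Python sketch. It is not the authors' design pipeline: it assumes an SpCas9-style "NGG" PAM (the text says only "N20 + PAM"), scans a single strand, and uses a made-up toy sequence. The function names find_target_sites and flank_donor are hypothetical, as is the orientation of the flanking sites.

```python
import re

# Assumption: SpCas9-style PAM "NGG"; the text only says "N20 + PAM".
PAM_PATTERN = "[ACGT]GG"


def find_target_sites(seq: str, pam_pattern: str = PAM_PATTERN):
    """Return (start, n20, pam) for each N20 + PAM site on the given strand."""
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidate sites be reported.
    site = re.compile(r"(?=([ACGT]{20})(" + pam_pattern + r"))")
    return [(m.start(), m.group(1), m.group(2)) for m in site.finditer(seq)]


def flank_donor(donor: str, n20: str, pam: str) -> str:
    """Place the same target site on both sides of the donor.

    The orientation of the two flanking sites is an assumption of this sketch.
    """
    target = n20 + pam
    return target + donor + target


# Toy example with a made-up sequence (not the lacZ target used in the study).
toy_genome = "A" * 20 + "TGG" + "CCCC"
sites = find_target_sites(toy_genome)
start, n20, pam = sites[0]
cassette = flank_donor("TTTT", n20, pam)
```

Because the same N20 + PAM appears on the genome and on both sides of the donor, a single sgRNA can, in principle, direct cleavage of the chromosome and release the linear donor from plasmid#2.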
Each cycle of editing started with the transformation of plasmid#2 into cells containing plasmid#1 (Fig. 1 and Figure S1). We then cultivated correct transformants containing both plasmids to allow cell propagation before adding the inducers to trigger DNA cleavage and DSB repair. In principle, the sgRNA guides Cas9 to recognize and cleave the target DNA, generating DSBs in the genome and in plasmid#2. The λ-Red recombinases then mediate homologous recombination between the broken genome and the linear donor DNA. This transfers the desired mutation from the donor DNA to the genome and destroys the target site (Fig. 1 and Figure S1). Cells that acquire the desired mutation survive, whereas cells with an unrepaired genome die. Thus, plating liquid cultures on agar medium containing Kan and Amp allowed the selection of the desired clones. Colonies growing on the plates were further verified by PCR and sequencing. Correct mutants were then cultivated at 40 °C in medium containing only Kan to eliminate plasmid#2 (Figure S2a). The cultures were inoculated into fresh medium to prepare competent cells for a new round of editing (Figure S1). Each cycle of editing required only three days. After the final round of editing, plasmid#1 and plasmid#2 were eliminated by incubating the correct clones at 40 °C in antibiotic-free medium and plating the cultures on agar medium containing sucrose (Figure S1 and Figure S2).
CARS-mediated long fragment integration
To evaluate the ability of CARS to mediate long fragment integration, we attempted to insert fragments of different lengths (3 kb, 6 kb, 9 kb, and 12 kb) into the lacZ gene of E. coli strain MG1655 (Fig. 2a). We constructed four versions of plasmid#2, each harboring the corresponding donor DNA and expressing the same sgRNA targeting the lacZ gene. The four inserted fragments came from the F plasmid of E. coli strain XL1-Blue and had no homology with the MG1655 genome. Insertion of these fragments inactivates the lacZ gene, which encodes β-galactosidase, so edited and unedited colonies could be distinguished by blue-white screening: edited colonies were white on Luria-Bertani (LB) plates containing IPTG and X-gal, whereas unedited colonies were blue. We also identified edited clones through PCR. One pair of primers (F1/R1) was designed to verify the 3-kb insertion (Fig. 2a), and correct clones yielded much larger PCR products than the control (Figure S3a). Two pairs of primers were designed to verify the 6-kb, 9-kb, and 12-kb insertions (Fig. 2a). Correct clones yielded the desired PCR products with both F1/R2 and F2-X/R1 (X = 1, 2, 3), whereas the control did not (Figure S3b–d). The PCR products were further verified by sequencing. Based on the results of blue-white screening, PCR, and sequencing, we determined the editing efficiencies and positive rates. The editing efficiencies in the four insertion experiments were 1.2 × 10⁻³, 1.2 × 10⁻³, 9.6 × 10⁻⁴, and 7.2 × 10⁻⁴, respectively (Fig. 2b). The positive rates were 97.3%, 98.3%, 96.7%, and 98.3%, respectively (Fig. 2b). These results indicated that both Cas9-mediated DNA cleavage and λ-Red-mediated DSB repair were efficient in our experiments. We found that the small proportion of negative colonies (<5%), commonly called “escapers” [27, 28], came from two sources.
More than half of the “escapers” had not been cleaved by Cas9, probably because of the limited duration and strength of L-arabinose induction. The remaining “escapers” had acquired deletions of unknown length at the target site, likely as a result of A-EJ repair [29, 30]. We also tried to insert a 15-kb fragment into the lacZ gene but failed, because the corresponding plasmid#2, which was over 20 kb in size, was difficult to construct. Nevertheless, a 12-kb insertion is sufficient for many applications in metabolic engineering. To highlight the advantages of our method, we compared CARS with three representative methods that performed relatively well in long fragment insertion, using data from published articles [28, 31, 32]. Our method performed much better than the others in terms of both the largest insertion length and the positive rate (Fig. 2c).
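The two metrics reported above can be written as simple ratios. The following Python sketch is illustrative only: this section does not define the metrics, so the definitions used here (positive rate as correct clones over screened clones, editing efficiency as colonies recovered after editing over total viable cells plated) are assumptions, and all counts are hypothetical.

```python
def positive_rate(correct_clones: int, screened_clones: int) -> float:
    """Fraction of screened colonies that carry the desired edit (assumed definition)."""
    if screened_clones <= 0:
        raise ValueError("screened_clones must be positive")
    return correct_clones / screened_clones


def editing_efficiency(edited_cfu: float, total_cfu: float) -> float:
    """Colonies recovered after editing per viable cell plated (assumed definition)."""
    if total_cfu <= 0:
        raise ValueError("total_cfu must be positive")
    return edited_cfu / total_cfu


# Hypothetical counts chosen only to show the arithmetic.
rate_percent = round(positive_rate(59, 60) * 100, 1)   # 59 correct of 60 screened
efficiency = editing_efficiency(12, 10_000)            # 12 colonies per 10,000 cells
```

Under these assumed definitions, a positive rate above 95% means that fewer than one in twenty screened colonies is an escaper, while an efficiency on the order of 10⁻³ reflects how few cells survive cleavage and repair.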
CARS-mediated long fragment knockout
First, we successfully deleted a 99.9-kb fragment, spanning positions 565,156 to 665,088, in the MG1655 genome (Fig. 3a). To determine the relationship between editing performance and the length of the deleted fragment, we selected seven fragments of different lengths within the 99.9-kb fragment for individual deletion. The lengths of these fragments were 9.1 kb, 21.5 kb, 30.6 kb, 39.4 kb, 59.8 kb, 79.8 kb, and 99.9 kb (Fig. 3a). To delete these fragments, we constructed seven versions of plasmid#2, each harboring two sgRNA-expressing cassettes. One sgRNA targets the same site (TS1) in the genome in every version, and the other targets a different site in each version (TS2-1–TS2-7) (Fig. 3a). Based on the results of PCR and sequencing, we determined the editing efficiencies and positive rates (Fig. 3b). All positive rates were over 95%, similar to the results of the long fragment insertion experiments. Deletion of the 9.1-kb, 21.5-kb, 30.6-kb, 39.4-kb, 59.8-kb, and 79.8-kb fragments resulted in similar editing efficiencies, whereas deletion of the 99.9-kb fragment resulted in a lower editing efficiency (Fig. 3b). We found that the 99.9-kb fragment knockout strain grew much more slowly than MG1655, while the 79.8-kb fragment knockout strain had a growth rate similar to that of MG1655 (Figure S4a and S4d). This implied that the terminal region of the 99.9-kb fragment contained genetic information that was important, but not essential, for cell survival. The decrease in editing efficiency in the 99.9-kb deletion experiment was probably due to the lower viability of the edited cells. In this study, we also successfully deleted other long fragments in the genome (Fig. 4d). To highlight the advantages of our method, we compared CARS with four representative methods that performed relatively well in long fragment deletion. The data came from published articles [28, 33–35].
Compared with these methods, our method performed much better in terms of both the largest deletion length and the positive rate (Fig. 3c).
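As a quick consistency check, the length of the largest deletion can be recomputed from the genome coordinates given above. This Python sketch assumes the coordinates are 1-based and inclusive; the text does not state the convention, and either choice rounds to the same value in kb. The function name span_kb is hypothetical.

```python
def span_kb(start: int, end: int, inclusive: bool = True) -> float:
    """Length in kb of a genomic interval given its start and end coordinates."""
    length_bp = end - start + (1 if inclusive else 0)
    return length_bp / 1000


# Coordinates of the largest deletion as stated in the text.
deleted_kb = round(span_kb(565_156, 665_088), 1)
deleted_kb_exclusive = round(span_kb(565_156, 665_088, inclusive=False), 1)
```

Both conventions give 99.9 kb, in agreement with the fragment length reported in the text.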
Identification of nonessential sequence and chromosomal simplification
According to previous reports, the MG1655 chromosome harbors 4497 genes, comprising 4296 protein-encoding genes and 201 RNA-encoding genes [36, 37]. Researchers at Keio University determined the essentiality of all protein-encoding genes in E. coli K-12 by single-gene deletion, generating the Keio collection [38, 39]. This provided important information for identifying potentially nonessential long fragments in the MG1655 genome. To delete a long fragment, we constructed a plasmid#2 that expressed a pair of sgRNAs targeting the two flanks of the fragment and harbored the corresponding donor DNA (Fig. 4a). To delete a long fragment harboring a limited number of essential genes, we added these genes to the corresponding plasmid#2 between the two homology arms. The essential genes therefore remained in the chromosome after genomic editing, and the edited cells survived (Fig. 4b and 4c). For each long fragment deletion, we designed two pairs of primers for PCR verification. The first primer pair targets DNA sequences within the long fragment, and the second targets the adjacent sequences outside the two homology arms (Fig. 4d and Figure S5). Correct clones yielded no PCR product with the first primer pair but yielded the expected PCR product with the second. Conversely, the unedited control clone yielded the expected PCR product with the first primer pair but none with the second (Fig. 4e and Figure S6).
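The two-primer-pair verification logic described above amounts to a small decision table, sketched here in Python. The function name classify_clone and the "ambiguous" label for unexpected band patterns are assumptions added for illustration; the text describes only the edited and unedited outcomes.

```python
def classify_clone(inner_band: bool, outer_band: bool) -> str:
    """Interpret PCR results for one clone.

    inner_band: product obtained with the primer pair inside the deleted fragment.
    outer_band: product obtained with the primer pair outside the homology arms.
    """
    if not inner_band and outer_band:
        return "edited"      # fragment gone, flanks now close enough to amplify
    if inner_band and not outer_band:
        return "unedited"    # fragment present, flanks too far apart to amplify
    return "ambiguous"       # hypothetical label: pattern not covered by the text


results = {
    "correct clone": classify_clone(inner_band=False, outer_band=True),
    "control": classify_clone(inner_band=True, outer_band=False),
}
```

Using both primer pairs guards against misreading a failed PCR as a successful deletion, because an edited clone must give a positive signal with the outer pair rather than only a missing band.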
Altogether, we successfully deleted twelve long nonessential fragments in the MG1655 genome (Table 1), including the 99.9-kb fragment (No. 3) mentioned in the previous section. These fragments are located in different regions of the genome, and their lengths range from 52.0 to 186.7 kb. Among the twelve fragments, No. 3, No. 8, and No. 11 each harbor one essential gene; No. 1 and No. 4 each harbor two essential genes; and No. 9 harbors three essential genes (Table 1). Based on the results of PCR and sequencing, we determined the editing efficiencies and positive rates (Fig. 4f). All positive rates were over 95%, and the editing efficiencies ranged from 2.3 × 10⁻⁴ to 1.3 × 10⁻³. Deletion of fragments No. 3, No. 4, and No. 7 gave much lower editing efficiencies than deletion of the other fragments. By measuring growth curves of the twelve knockout strains, we found that the No. 3, No. 4, and No. 7 knockout strains grew much more slowly than the other knockout strains, with the No. 4 knockout strain growing the most slowly (Figure S4). This slow growth may explain the lower editing efficiencies observed for fragments No. 3, No. 4, and No. 7. The results indicated that these fragments were important, but not essential, for cell growth.
After deleting the twelve long fragments individually, we tried to construct cumulative deletion mutants. Here, we use MG1655-ΔNo. X to denote the MG1655 mutant lacking fragment No. X (X = 1, 2, 3, …, 12). As No. 1 was the longest fragment deleted in this study (Table 1), we chose to construct cumulative deletion mutants on the basis of strain MG1655-ΔNo. 1. Through iterative editing, we successfully deleted fragment No. 9 from MG1655-ΔNo. 1, generating strain MG1655-ΔNo. 1/ΔNo. 9, which lost a total of 270.7 kb of DNA sequence containing 268 open reading frames (ORFs) (Fig. 4g). We then tried to delete a third fragment on the basis of MG1655-ΔNo. 1/ΔNo. 9. According to the growth curves of the single deletion mutants, knockout of fragment No. 2, No. 5, No. 6, No. 8, No. 10, or No. 12 had no apparent influence on cell growth (Figure S4). Therefore, we attempted to delete these fragments individually in MG1655-ΔNo. 1/ΔNo. 9. As a result, we successfully obtained strains MG1655-ΔNo. 1/ΔNo. 9/ΔNo. 2, MG1655-ΔNo. 1/ΔNo. 9/ΔNo. 5, and MG1655-ΔNo. 1/ΔNo. 9/ΔNo. 6. These three knockout strains lost a total of 324.1 kb, 370.6 kb, and 368.7 kb of DNA sequence containing 315, 364, and 368 ORFs, respectively (Fig. 4g). We failed to knock out fragments No. 8, No. 10, and No. 12 in MG1655-ΔNo. 1/ΔNo. 9 despite repeating the experiments several times, suggesting that each of these fragments was essential for the survival of MG1655-ΔNo. 1/ΔNo. 9.
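The cumulative totals above also imply how much each third deletion contributed. The following Python sketch simply subtracts the double-deletion baseline from the triple-deletion totals reported in the text. The derived per-fragment values are inferences from those totals, not figures taken from Table 1, and the variable names are hypothetical.

```python
# Totals reported in the text for MG1655-ΔNo. 1/ΔNo. 9 and its three derivatives.
BASE_KB, BASE_ORFS = 270.7, 268
TRIPLE_TOTALS = {
    "No. 2": (324.1, 315),
    "No. 5": (370.6, 364),
    "No. 6": (368.7, 368),
}

# Contribution of the third fragment = triple total minus double-deletion baseline.
added = {
    name: (round(total_kb - BASE_KB, 1), total_orfs - BASE_ORFS)
    for name, (total_kb, total_orfs) in TRIPLE_TOTALS.items()
}
```

The inferred 99.9 kb for fragment No. 5 agrees with the length stated for that fragment in the isobutanol section, which supports reading the reported totals as simple sums of the deleted fragments.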
Metabolic engineering of E. coli for producing isobutanol
Higher alcohols such as isobutanol and n-butanol are promising next-generation biofuels because of their higher energy density, lower vapor pressure, and relatively low hygroscopicity [40, 41]. To illustrate the potential application of CARS in metabolic engineering, we used the system to modify the E. coli chromosome for isobutanol production. First, we constructed a chassis strain named JW74 from MG1655 through six rounds of genomic editing (Fig. 5a). The transformation competency of JW74 was 170-fold that of MG1655, making it much easier to introduce exogenous DNA. We then built a 7.9-kb operon and integrated it into the JW74 chromosome in place of fragment No. 5 (Fig. 5a), generating strain SH258. Fragment No. 5 was 99.9 kb in length, and the corresponding knockout strain grew slightly faster than its parental strain (Figure S4f). The operon consists of five structural genes flanked by 5′ and 3′ untranslated regions (UTRs). The 5′ UTR contains a strong bacterial ribosome-binding site [42] and a T7 promoter, which is naturally recognized by bacteriophage T7 RNA polymerase [43]; the 3′ UTR contains a T7 terminator. The five structural genes are alsS, ilvC, ilvD, kivD, and adhA (Fig. 5a). Of these, ilvC and ilvD came from E. coli, alsS came from Bacillus subtilis [44], and kivD and adhA came from Lactococcus lactis [45] (Fig. 5b). To initiate transcription of the operon, we introduced the T7 RNA polymerase-encoding gene under the control of the T5 promoter [46] into the SH258 genome, generating strain SH274 (Fig. 5a). Although the T5 promoter is a strong inducible promoter repressed by LacI, it served here as a strong constitutive promoter because SH274 is a lacI-defective strain. In traditional metabolic engineering, introducing a high-copy-number fermentation plasmid is a commonly used strategy to overexpress the enzymes needed for the target product.
Therefore, we constructed the pColE1-PT5-alsS-ilvC-ilvD-kivD-adhA plasmid and transformed it into JW74, generating strain SH279.
We used strains SH274 and SH279 to conduct micro-aerobic fermentation in shake flasks containing 20 mL of M9 medium. Briefly, acetolactate synthase (AlsS) converts pyruvate, an intermediate product of glycolysis, into 2-acetolactate. This is then transformed into 2,3-dihydroxy-isovalerate by ketol-acid reductoisomerase (IlvC). Dihydroxyacid dehydratase (IlvD) converts 2,3-dihydroxy-isovalerate into 2-ketoisovalerate, which is transformed into isobutyraldehyde by 2-ketoisovalerate decarboxylase (KivD). Finally, isobutyraldehyde is reduced to isobutanol by alcohol dehydrogenase (AdhA) (Fig. 5b). During fermentation, samples were taken every 12 hours to measure the OD600 value and the isobutanol titer (Fig. 5c). With SH274, isobutanol reached a maximum titer of 1.3 g/L after 48 hours of fermentation (Fig. 5c). To our knowledge, this was the first attempt to produce isobutanol without introducing a high-copy-number fermentation plasmid, and the titer was higher than that in several reports using such a plasmid [47, 48]. With SH279, isobutanol reached a maximum titer of 5.5 g/L after 48 hours (Fig. 5d). This is 4.2-fold that of SH274, indicating that the SH274 strain has considerable room for improvement. In future work, we therefore plan to increase the copy number of the operon PT7-alsS-ilvC-ilvD-kivD-adhA-TT7 in the SH274 genome to strengthen the expression of the related enzymes.
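The five-step pathway and the titer comparison above can be summarized in a short Python sketch. It encodes each enzyme with its substrate and product as named in the text, checks that the steps chain from pyruvate to isobutanol, and recomputes the fold difference between the two reported titers. The function name final_product is hypothetical, and the sketch does not model kinetics or cofactors.

```python
# (enzyme, substrate, product) for each step, as described in the text.
PATHWAY = [
    ("AlsS", "pyruvate", "2-acetolactate"),
    ("IlvC", "2-acetolactate", "2,3-dihydroxy-isovalerate"),
    ("IlvD", "2,3-dihydroxy-isovalerate", "2-ketoisovalerate"),
    ("KivD", "2-ketoisovalerate", "isobutyraldehyde"),
    ("AdhA", "isobutyraldehyde", "isobutanol"),
]


def final_product(start: str, steps=PATHWAY) -> str:
    """Follow the steps in order, checking each substrate matches the previous product."""
    current = start
    for enzyme, substrate, product in steps:
        if substrate != current:
            raise ValueError(f"{enzyme} expects {substrate}, got {current}")
        current = product
    return current


end_product = final_product("pyruvate")

# Maximum titers reported in the text (g/L after 48 hours).
TITER_SH274, TITER_SH279 = 1.3, 5.5
fold = round(TITER_SH279 / TITER_SH274, 1)
```

The ratio 5.5 / 1.3 rounds to 4.2, matching the fold difference stated in the text between the plasmid-bearing strain SH279 and the plasmid-free strain SH274.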