Identification and activity test of the Type I-B CRISPR/Cas system in SCUT27
Recently, Type II CRISPR/Cas9 tools have been reported for thermophiles, based on thermostable Cas9 proteins derived from Geobacillus stearothermophilus, Acidothermus cellulolyticus, and Geobacillus thermodenitrificans T12 [9, 10]. For genome engineering in SCUT27, we first tried to construct the plasmids pLY2_ldh-HA12 and pLY2_ldh-HA12 (Table 1), which contained the thermostable cas9 gene from G. stearothermophilus (under the control of the constitutive promoter PadhE and the riboswitch-controlled inducible promoter Pkan-RSpbuE, respectively), an sgRNA expression module targeting ldh, and a repair template. However, no transformants were obtained by electro-transformation despite several attempts. This indicated that genome editing with the CRISPR/Cas9 system was difficult to apply in our strain, probably because of the high toxicity of the Cas9 RNP (the heterologous nuclease Cas9 alone was non-lethal, as pLY1 could be transformed into SCUT27) and the limited transformation efficiency with pIKM1. Even with the inducible promoter Pkan-RSpbuE, leaky expression of Cas9 together with the sgRNA appeared to cause cell death. Therefore, we turned to the endogenous CRISPR/Cas system for genome editing in SCUT27.
Table 1 Strain and plasmids used in this study.
CRISPRCasFinder, a web tool that identifies clustered regularly interspaced short palindromic repeats and the presence of Cas genes, was used to mine the native CRISPR/Cas system on the chromosome of SCUT27 (GCF_000512105.1 Thermoanaerobacterium aotearoense SCUT27 assembly genomic.fna). The results showed that SCUT27 possesses a native CRISPR/Cas system that was mainly assigned to Class 1 Type I-B. Seven major CRISPR loci were found in the SCUT27 genome sequence (Table S2), but intact Cas genes (v518_0414-0422) were identified only upstream and downstream of the largest CRISPR array, which contains 55 spacers (Fig. 2a; Table S2). The other CRISPR arrays, which were not coupled with Cas genes, were defined as orphan CRISPR arrays. This suggests that the longest array, created by numerous spacer acquisition events and co-expressed with a complete functional set of Cas genes, might constitute an active CRISPR functional unit within the bacterial genome. Consequently, this longest CRISPR array (position: 14334–18020 in NZ_AYSN01000014.1), comprising 55 spacers and the direct repeats (DR), was used for further research.
For the CRISPR/Cas system to recognize a protospacer, a functional PAM flanking the 5’ end of the protospacer is necessary. The primary approach to PAM identification was to find the putative original sequences of the invading elements that ended up as CRISPR spacers during the ‘adaptation’ stage in SCUT27. The search program CRISPRTarget was used to analyze the CRISPR spacers of the strain by aligning them against existing sequences in various databases (phage, plasmid and so on), as described by Pyne. We analyzed in silico the spacer matches whose scores were greater than 20 points. However, no obvious PAM pattern emerged from the 55 spacers of the largest CRISPR array in SCUT27 alone. We therefore turned to the most closely related strain, T. thermosaccharolyticum M0795, which possesses a Cas operon (Thethe_02655-02664) similar to that of SCUT27 and up to 266 spacers in its largest CRISPR array (with the same DR motif as SCUT27: GTTTTTAGCCTACCTATAAGGAATTGAAAC), to identify the PAM sequence. The spacer-protospacer matching analysis in M0795 revealed a high-frequency PAM motif, ‘TTA’, at the 5’ end of the protospacers (Fig. 2b, 2c). In addition, the PAM sequence ‘TNA’ or ‘TNA’ has been predicted for Thermoanaerobacter sp. in several reports using various in silico tools [22–24]. In some closely related Clostridium species, such as C. pasteurianum, C. difficile, C. tyrobutyricum and C. thermocellum, the PAM site ‘TNA’ has also been defined for the reported subtype I-B systems [9, 15, 16, 25]. Based on the above analysis, ‘TTA’ was taken as the most confident PAM site in SCUT27 and was used in the following experiments.
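The PAM inference described above amounts to tallying the short motif immediately 5’ of each matched protospacer. The following is a minimal sketch of that counting step only; it does not reproduce CRISPRTarget, and the function name `upstream_motifs` and the toy hit sequences are hypothetical illustrations rather than data from this study.

```python
from collections import Counter

def upstream_motifs(hits, pam_len=3):
    """Count the pam_len nucleotides directly 5' of each protospacer.

    hits: list of (target_sequence, protospacer_start) tuples, where the
    target sequence is written on the protospacer strand (hypothetical format).
    """
    counts = Counter()
    for target, start in hits:
        if start >= pam_len:  # skip hits that lack enough 5' flanking sequence
            counts[target[start - pam_len:start]] += 1
    return counts

# Toy hits for illustration only (not real spacer-protospacer matches).
hits = [
    ("GGTTAACGTACGT", 5),  # 5' flank is "TTA"
    ("ATCAGGGCCC", 4),     # 5' flank is "TCA"
    ("TTACCCGGG", 3),      # 5' flank is "TTA"
    ("ACGGG", 1),          # too close to the sequence start, ignored
]

counts = upstream_motifs(hits)
most_frequent = counts.most_common(1)[0]
print(counts, most_frequent)
```

In practice the counts would be turned into a sequence logo or frequency table (as in Fig. 2b, 2c) rather than reduced to a single most frequent motif.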
A toxicity assay was then performed to evaluate the activity of the native CRISPR system. A plasmid carrying a CRISPR mini-array cassette was designed (Fig. S1b, c) so that the expressed sgRNA would form a ribonucleoprotein (RNP) complex at the target site and mediate a single-strand DNA break. The artificial mini-array was composed of the 30 bp DR from the largest CRISPR array of SCUT27 and a 37 bp spacer (corresponding to the target protospacer with a functional PAM) (Fig. S1a, b). Like most bacteria, SCUT27 does not encode the proteins responsible for non-homologous end joining. Thus, if the CRISPR/Cas system is functional, the DNA cut made by the active RNP complex cannot be repaired and no transformants should be observed. Here, the mini-array on the plasmid was designed to target the pyrF gene (V518_1373) in SCUT27 (Fig. S1). The 921 bp pyrF ORF contains a total of 44 potential PAMs (TTA) on the two DNA strands. Two spacers (37 nt downstream of ‘TTA’) were chosen to target the pyrF locus, and one non-target spacer (with ‘GGC’ as the PAM motif) was designed as a negative control (Fig. S1a). With the strong promoter Pkan driving sgRNA expression, we observed a 21-fold decrease in the electro-transformation efficiency of SCUT27 (from ~48.3 to 2.3 CFU/µg DNA), indicating that CRISPR-mediated interference was occurring, with around 95% cell killing (cutting efficiency) (Table 2, Fig. S1d). The promoters upstream of the Cas operon and the sgRNA were therefore strong enough to mediate a single-strand cut. The few background colonies (escape mutants) that carried the killing plasmids in the experimental groups may be due to spontaneous mutations in the Cas operon (specifically Cas3), which would be expected to inactivate the encoded functional protein [9, 26].
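The spacer selection step described above (locating ‘TTA’ on both strands and taking the 37 nt immediately downstream) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the procedure used in this study: the function names are hypothetical, and the input is a toy sequence rather than the pyrF ORF.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_spacers(seq, pam="TTA", spacer_len=37):
    """List (strand, pam_start, spacer) for every PAM followed by a full-length spacer.

    Positions on the '-' strand are indexed along the reverse complement.
    """
    results = []
    for strand, s in (("+", seq), ("-", revcomp(seq))):
        start = s.find(pam)
        while start != -1:
            begin = start + len(pam)
            spacer = s[begin:begin + spacer_len]
            if len(spacer) == spacer_len:  # drop PAMs too close to the sequence end
                results.append((strand, start, spacer))
            start = s.find(pam, start + 1)  # allow overlapping PAM occurrences
    return results

# Toy sequence for illustration only: one usable PAM on the '+' strand.
toy = "TTA" + "ACGT" * 10
candidates = find_spacers(toy)
print(candidates)
```

Applied to a real ORF, such a scan would return the candidate list from which targeting spacers are then chosen by additional criteria (for example position in the gene), which this sketch does not model.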
Table 2 Effect of spacers of different genes on transformation efficiency.
Spacer sequence in sgRNA (protospacer)
Transformation efficiency (CFU/µg DNA)
2.5 ± 0.9
2.1 ± 0.7
48.3 ± 11.5
6.2 ± 4.4
3.1 ± 1.9
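For reference, the fold decrease and killing percentage quoted in the text for the toxicity assay follow directly from the two transformation efficiencies. The short calculation below is a sketch using the approximate values stated in the text (~48.3 and ~2.3 CFU/µg DNA); the function names are hypothetical.

```python
def fold_decrease(control, targeted):
    """Ratio of control to targeted transformation efficiency."""
    return control / targeted

def killing_fraction(control, targeted):
    """Fraction of transformants lost relative to the non-target control."""
    return 1 - targeted / control

control_cfu = 48.3   # CFU/µg DNA, non-target spacer (value from the text)
targeted_cfu = 2.3   # CFU/µg DNA, pyrF-targeting spacers (value from the text)

fold = fold_decrease(control_cfu, targeted_cfu)
killing = killing_fraction(control_cfu, targeted_cfu)
print(f"{fold:.1f}-fold decrease, {killing:.1%} killing")
```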
Endogenous Type I-B CRISPR-based deletion of ldh in SCUT27
After verifying the activity of the native CRISPR system, we constructed the gene editing plasmid pKQ1_ldh-HA12 (Fig. 3a) to knock out the gene ldh (V518_0188). This target was chosen because the resulting mutant can easily be identified by the absence of lactate among its products and, according to our previous report, shows an obvious advantage in growth and ethanol production. A spacer targeting the chromosomal ldh locus was chosen by the same process as described above. In addition, a repair template for CRISPR/Cas-mediated homology-directed repair was introduced downstream of the CRISPR mini-array. It consisted of two homologous arms (HAs) of ~750 bp each and a KpnI digestion site that replaced 805 bp of the ldh locus to inactivate its function (Fig. S2a).
Transformation of pKQ1_ldh-HA12 into SCUT27 resulted in 12-65 colonies on MTCK plates (~6.2 CFU/µg DNA). Colony PCR was carried out on randomly picked colonies to screen for ldh deletion mutants. Twelve colonies were picked and verified by PCR. As shown in Fig. 3b, three kinds of PCR products were observed: nine of the 12 colonies (75%) had a clean deletion genotype (Δldh) with a 1616 bp product, two colonies had the wild-type genotype with a 2421 bp product, and one colony showed a mixed genotype. For the wild-type colonies, we suppose that point mutations in the key Cas proteins rendered the editing system ineffective. For the mixed genotype, no clear explanation is available yet. Overall, with the help of the native CRISPR system, the desired mutants could be reproducibly obtained at a high editing rate, varying from 58.3 to 100%, within 2-4 days after transformation. The colony PCR product was sent for DNA sequencing (Fig. S2a), which showed that the target locus had been edited as designed. We also cultured the mutant and analyzed fermentation samples by HPLC; no lactate was produced, and the mutant had the same characteristics as SCUT27/Δldh::ermR from our lab (Table S5).
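The screening figures above can be checked with simple arithmetic. The sketch below recomputes the editing rate from the colony counts and compares the two PCR product sizes with the 805 bp replaced region; the function name is hypothetical, the lower bound of 7 edited colonies out of 12 is an assumption chosen because it reproduces the reported 58.3%, and the length of the inserted KpnI site is not taken into account.

```python
def editing_rate(edited, screened):
    """Percentage of screened colonies carrying the clean deletion."""
    return 100 * edited / screened

rate_this_round = editing_rate(9, 12)   # nine clean deletions among 12 colonies
rate_lower_bound = editing_rate(7, 12)  # assumed count matching the reported 58.3%

wild_type_bp = 2421   # PCR product of the wild-type locus
deletion_bp = 1616    # PCR product of the Δldh locus
size_difference = wild_type_bp - deletion_bp

print(rate_this_round, round(rate_lower_bound, 1), size_difference)
```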
Promotion of endogenous CRISPR system with various promoters for sgRNA expression
Two metrics were considered to evaluate the usefulness of the genome editing tool. One is the total number of transformants (the transformation efficiency); the other is the fraction of correct transformants (the true-editing efficiency). The number of correct transformants is the product of these two metrics. Although the Type I-B system in SCUT27 was efficient at CRISPR-mediated killing and a high editing rate was observed repeatedly, few transformants were obtained after transformation of SCUT27 with the editing plasmids. This low transformation efficiency is possibly due to the high toxicity of CRISPR-induced DNA cutting when the sgRNA is expressed under Pkan. To further explore the relationship between CRISPR-mediated DNA cleavage and transformation efficiency, we tested three other constitutive promoters (PadhE, Pclo1313_1194 and Pcat1) for sgRNA expression to see which expression level was best for editing. The strength of each promoter was characterized by the activity of an enzyme expressed downstream of it in overexpression vectors (Fig. S3a).
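Because the yield of correct transformants is the product of the two metrics, it can be compared across constructs with a one-line calculation. The sketch below uses the approximate values reported in this section for Pkan (~6.2 CFU/µg DNA, 75%) and PadhE (~5.5 CFU/µg DNA, ~75%); the function name and the dictionary layout are hypothetical.

```python
def correct_transformants(transformation_efficiency, editing_fraction):
    """Expected correctly edited colonies per µg DNA (product of the two metrics)."""
    return transformation_efficiency * editing_fraction

# Approximate values taken from the text: (CFU/µg DNA, editing fraction).
promoters = {
    "Pkan": (6.2, 0.75),
    "PadhE": (5.5, 0.75),
}

expected = {
    name: correct_transformants(te, ef) for name, (te, ef) in promoters.items()
}
for name, value in expected.items():
    print(f"{name}: {value:.2f} correct transformants per µg DNA")
```

Such a combined figure makes explicit that a promoter which raises one metric while lowering the other may bring no net gain.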
Pclo1313_1194, a strong promoter in C. thermocellum, did not work in SCUT27 (Fig. 4a). PadhE was weaker than Pkan, yet its transformation efficiency (~5.5 CFU/µg DNA) was very close to that of Pkan, and its editing efficiency was similar (~75.0%) (Fig. 4a, Table S3). Pcat1 was clearly stronger than Pkan; however, both its transformation efficiency and its editing efficiency were lower than those of Pkan (Fig. 4a, Table S3). We speculated that stronger sgRNA expression caused stronger cleavage and higher toxicity to the cell, especially in the initial period after electro-transformation, possibly increasing the proportion of escape mutants with the wild-type phenotype.
Notably, the enzyme activity tests showed that the activity of ldh complemented from pIKM1 (Pldh-ldh) was nearly equal to that of the wild type. We therefore speculated that the plasmid pIKM1 is mostly present as a single copy in our strain, which is consistent with Walker’s resequencing data in C. thermocellum, although the plasmid was originally reported to have a copy number of 10-1000 in C. thermocellum. This could explain the low number of colonies on MTCK plates after electro-transformation. As shown in Fig. 5, when a constitutive promoter drives sgRNA expression, the cell dies from chromosome breakage if no homology-directed repair occurs, no matter how many plasmids were transformed into it. If homology-directed repair does occur but only one plasmid was transformed into the cell, the cell also dies because of plasmid breakage (loss of kanamycin resistance); only cells that received several plasmids survive (the SCUT27/Δldh we obtained belongs to this case).
Thus, we looked for an inducible promoter to control sgRNA expression and restrict cutting by the CRISPR system during the initial stage of transformation. As shown in Fig. 5, with an inducible promoter, a cell containing one plasmid would survive on MTCK plates without inducer because sgRNA expression is repressed. The inducer is then added to activate cleavage. Colonies that started with one plasmid and underwent homology-directed repair could be picked on non-selective plates, whereas the others would die because no homology-directed repair had occurred. In this case, the mutants could in theory be obtained already cured of the plasmid.
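The reasoning of the two preceding paragraphs (summarized in Fig. 5) can be written out as a small decision function. This is a deliberately simplified sketch of the qualitative model as we read it from the text; the function name, argument names and the all-or-nothing outcomes are assumptions made for illustration, not measured behaviour.

```python
def survives(promoter, plasmid_copies, hdr, induced=False):
    """Predicted survival of a transformed cell under the qualitative model.

    promoter: "constitutive" or "inducible" (controls sgRNA expression)
    plasmid_copies: number of plasmids that entered the cell
    hdr: True if homology-directed repair took place
    induced: True once the inducer has been added (inducible promoter only)
    """
    if plasmid_copies < 1:
        return False                 # no plasmid, no kanamycin resistance
    cutting = promoter == "constitutive" or induced
    if not cutting:
        return True                  # sgRNA repressed, chromosome stays intact
    if not hdr:
        return False                 # chromosome cut and not repaired
    if promoter == "constitutive":
        return plasmid_copies > 1    # a single plasmid is lost under selection
    return True                      # induced cells are picked on non-selective plates

# Enumerate the cases discussed in the text.
cases = {
    "constitutive, 3 copies, no HDR": survives("constitutive", 3, hdr=False),
    "constitutive, 1 copy, HDR": survives("constitutive", 1, hdr=True),
    "constitutive, 3 copies, HDR": survives("constitutive", 3, hdr=True),
    "inducible, 1 copy, uninduced": survives("inducible", 1, hdr=False),
    "inducible, 1 copy, induced, HDR": survives("inducible", 1, hdr=True, induced=True),
    "inducible, 1 copy, induced, no HDR": survives("inducible", 1, hdr=False, induced=True),
}
for label, alive in cases.items():
    print(f"{label}: {'survives' if alive else 'dies'}")
```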
To test this hypothesis, we used a thermostable riboswitch element and constructed the adenine-controlled inducible promoter Pkan-RSpbuE to control sgRNA expression and thereby avoid cutting at the initial stage after transformation. Enzyme activity assays showed that the riboswitch-mediated inducible promoter was strictly controlled by the inducer adenine (Fig. 4a). No enzyme activity was observed when no extra adenine was added to the medium, whereas 0.04 U/mg enzyme activity was observed with 1 mM adenine. With sgRNA expression under Pkan-RSpbuE, the number of transformants on MTCK plates increased markedly (approximately 42.0 CFU/µg DNA), and colonies were picked and cultured in medium containing 1 mM adenine for induction. However, the clones on MTCA plates were mostly false positives and showed lower editing efficiency (Fig. 4a), even after five passages for enrichment. We speculated that the induced promoter driving the CRISPR mini-array might not be strong enough to produce the ssDNA break, leading to an increased number of wild-type cells or escape mutants. Moreover, the mutants obtained in this way still contained the plasmid, indicating that they had initially received several plasmids and that cells with a single plasmid may be difficult to edit with a weak induced promoter. Therefore, a thermostable and strong inducible promoter for controlling sgRNA expression appears to be key to this genetic modification tool.
Other attempts to improve the efficiency of transformation and editing with the plasmids
The highly efficient genome editing system (cutting of ssDNA) resulted in few surviving colonies on plates, which was likely due to the low rate of homologous recombination. For CRISPR/Cas genome editing in Clostridial organisms, it has been noted that serial transfer (1:20 dilution) and sub-culturing in liquid medium can enrich the desired homologous recombinants and increase the probability of gene editing. Because our previous protocol included no extra enrichment, we speculated that such a step might also be required in SCUT27 owing to insufficient homology-directed repair. Thus, after electroporation, we cultivated the cell suspension and transferred it several times to increase the opportunity for homologous repair. After 5, 10 and 20 rounds of serial passaging in MTCK medium, the editing efficiency was between 15.0 and 58.3%, lower than that without serial transfer, although a large number of true-edited mutants could be obtained. It appeared that escape mutants increased more rapidly than true-positive mutants during serial transfer.
In this study, we also varied the amount of plasmid from 5 µg to 100 µg in 300 mL of electroporation buffer, with no prominent improvement (data not shown). In addition, we explored the effect of HA length on the transformation and editing efficiency of the endogenous CRISPR/Cas system in our strain. The length of the HAs in the plasmid was set to 250 bp, 500 bp, 750 bp and 1000 bp. As shown in Fig. 4b, the transformation efficiency improved as the length of the HAs increased. The highest editing efficiency (~75% correctly edited) was observed when the HAs were 750 or 1000 bp long. We also added 5 or 20 µg of 1 kb HAs (purified PCR products) mixed with 20 µg of plasmid to the buffer during electro-transformation in order to strengthen homologous recombination. The addition of repair template (HAs) had a clearly positive effect on transformation efficiency, but this was offset by a relatively low editing efficiency among the transformants (Fig. 4b).
Plasmid curing and argR deletion in mutant Δldh for higher ethanol production
As described above, single gene deletion was achieved with high efficiency using the endogenous CRISPR/Cas system, and we further explored this system for multiplex genome editing in SCUT27. Given the limited number of both selective markers and thermostable shuttle vectors available in SCUT27, plasmid curing was essential for successive rounds of editing. The editing plasmid pKQ1_ldh-H12 in SCUT27/Δldh needed to be eliminated to obtain marker-free mutants. The mutants were therefore first transferred in medium without kanamycin for 5-20 generations so that the plasmid would be lost spontaneously. However, no plasmid-free colony was picked, indicating that the plasmid was difficult to eliminate, probably because of its stable replication region. Chemical and physical methods were then applied for plasmid curing. To weaken the cell wall or damage the cell membrane, curing agents such as isonicotinic acid hydrazide at 4 µg/mL or 0.002% sodium dodecyl sulfate (SDS) (the maximum non-lethal concentration for SCUT27 in our tests) were added to cultures in early exponential phase [29, 30]. However, almost all of the picked colonies retained the plasmid even after several generations. Physical treatments such as repeated freeze-thawing, sublethal temperature, electroporation, or combinations of several methods were also tried, but none worked efficiently.
Here we developed a novel way to efficiently screen for plasmid-cured cells and to recycle the editing plasmid for multi-gene editing, based on thymidine kinase (tdk) as a negative selection marker. 5-Fluoro-2’-deoxyuridine (FUDR) is the agent for negative selection, as shown in Fig. S4a. The toxicity of FUDR to SCUT27 and SCUT27/Δtdk was tested. As shown in Fig. S4b, at a concentration of 50 µg/mL FUDR, no wild-type colonies grew while many Δtdk colonies were observed, indicating that tdk could be used as a selectable marker. In addition, deletion of tdk had no influence on the growth or fermentation properties of SCUT27 (Fig. S4d, Table S4). Thus, SCUT27/Δtdk could be used as a starting strain for engineering, without the need to reintroduce tdk afterwards. As shown in Fig. 6, SCUT27/Δtdk was first obtained by homologous recombination and verified by PCR (Fig. S4c), and the mutant Δtdk/Δldh was then obtained via electro-transformation with pKQ1_Δldh-H12::tdk. Through successive transfers (three generations here) and spreading on MTCF plates, plasmid-free colonies of Δtdk/Δldh were picked with a positive rate of around 25% (Fig. S5), a significant improvement over the previous blind screening. The plasmid-cured mutant was then ready for the next round of editing.
Here, the arginine repressor (argR), which had been studied in previous research in our lab, was chosen as the target gene for the second round of editing with pKQ1_ΔargR-H12::tdk. The mutant with inactivated argR performed well in growth, ethanol yield and energy level, and its ability to utilize xylose and lignocellulosic hydrolysates was enhanced [4, 5]. Thus, editing argR in the SCUT27/Δldh background was worth exploring for better performance. Using the same procedure as above, the mutant SCUT27/Δtdk/Δldh/ΔargR (termed Δldh/ΔargR below) was successfully obtained (Fig. S2b, S6), indicating that this approach to multiplex genome editing is feasible.
Analysis on glucose and xylose utilization ability of the mutants in serum bottles
In most lignocellulose hydrolysates, glucose and xylose are the two major fermentable sugars. Here, three glucose/xylose ratios (1:0, 2:1, 0:1) were investigated by flask fermentation with the wild type, Δldh, ΔargR, and Δldh/ΔargR. As shown in Fig. 7 and Table S5, compared with the wild-type strain under the different carbon sources, the maximum OD600 of Δldh, ΔargR and Δldh/ΔargR increased by 95.80-129.44%, 37.64-81.60% and 130.63-135.44%, respectively; ethanol production increased by 178.74-218.14%, 102.97-211.54% and 272.96-319.97%, respectively; and acetic acid production increased by 215.07-242.21%, 73.80-146.87% and 182.98-195.90%, respectively. In addition, the sugar uptake rates of SCUT27/Δldh, SCUT27/ΔargR and SCUT27/Δldh/ΔargR increased by 79.62-98.68%, 57.02-66.55% and 94.31-147.56%, respectively. These results suggest that knocking out both ldh and argR strengthened substrate utilization and growth of the strain and increased ethanol production. The carbon flux of SCUT27/Δldh/ΔargR was notably directed to ethanol, with the ethanol yield (g/g sugar) increased by 69.61-121.62%, 11.42-26.74% and 0.83-31.33% compared with the wild-type strain, SCUT27/Δldh and SCUT27/ΔargR, respectively, under the different carbon sources. With xylose in particular, SCUT27/Δldh/ΔargR outperformed the single-gene deletion mutants in sugar utilization and clearly shortened the lag phase of xylose utilization caused by carbon catabolite repression under mixed sugars, suggesting that SCUT27/Δldh/ΔargR could be a candidate for ethanol production from various pretreated lignocellulosic hydrolysates.
In addition, as shown in Table S5, the SCUT27/Δldh obtained in this study has characteristics similar to those of the previous mutant SCUT27/Δldh::ErmR from our lab (2), indicating that editing with the CRISPR/Cas system is as reliable as the traditional method of homologous recombination with a selective marker.
Fermentation kinetics of Δldh/ΔargR with various lignocellulosic hydrolysates
An ideal microorganism for lignocellulosic bioethanol production should not only utilize various sugars efficiently but also tolerate high temperature (at least 40 °C) and inhibitors. Saccharomyces cerevisiae, a well-known ethanol fermentation strain, cannot grow well above 40 °C. Although some thermotolerant yeasts such as Kluyveromyces marxianus and Ogataea polymorpha have been found, their fermentation traits are far inferior to those of S. cerevisiae. Moreover, yeasts perform unsatisfactorily in ethanol fermentation with xylose as the sole carbon source, owing either to cofactor imbalance in the cell [35, 36] or to the inefficiency of the XI pathway, which is thermodynamically limited by the unfavorable equilibrium between xylose and xylulose. Thus, there is an urgent need for a suitable microorganism for ethanol production, especially from xylose. Here, we focused on ethanol fermentation from xylose with SCUT27/Δtdk/Δldh/ΔargR, a candidate for ethanol fermentation with xylose-rich lignocellulosic hydrolysates.
Hydrolysates of rice straw (RSH), sorghum straw (SSH), peanut straw (PSH), wheat straw (WSH) and soybean straw (OSH) pretreated with dilute H2SO4 were chosen to assess ethanol production by Δldh/ΔargR. Xylose was the main sugar in the hydrolysates of most of the lignocellulosic biomass, along with some glucose and cellobiose. Each hydrolysate was diluted with water to an initial xylose concentration of about 15 g/L. As shown in Fig. 8, and as expected, Δldh/ΔargR gave improved ethanol production together with improved sugar utilization with all hydrolysates. Compared with the wild type, the ethanol production and yield of the mutant Δldh/ΔargR improved by about 147.42–739.40% and 112.67–267.89%, respectively. The ethanol titers from RSH, SSH, PSH, WSH and OSH reached 9.84 g/L, 10.25 g/L, 9.27 g/L, 10.45 g/L and 9.70 g/L in serum bottles, respectively. The corresponding ethanol yields were 0.59 g/g, 0.53 g/g, 0.57 g/g, 0.61 g/g and 0.60 g/g, all of which were higher than the 0.35-0.40 g/g ethanol yield obtained with pure sugars and also above the theoretical maximum ethanol yield of 0.51 g/g glucose or xylose. We speculated that trace amounts of other substrates such as cellobiose, arabinose and some protein might be present in the hydrolysates and were not included in the yield calculation [38, 39]. Nearly all the xylose and glucose in the hydrolysates, except RSH, could be exhausted by SCUT27/Δldh/ΔargR, which might be attributed to the enhanced activity of xylose isomerase and xylulokinase as well as the higher cellular energy level available for xylose transport. Both the wild-type strain and Δldh/ΔargR tolerated the lignocellulose-derived inhibitors in the hydrolysates, such as weak acids, furan derivatives and phenolic compounds, and the tolerance of Δldh/ΔargR was much stronger, especially in the rice and peanut straw hydrolysates (Fig. 8). This is possibly due to the simultaneous up-regulation of the DnaK-DnaJ-GrpE system and the GroEL-GroES chaperonin, a potentially improved ability to eliminate ROS generated by the inhibitors, and an improved ATP level for synthesizing heat shock proteins, pumping out cytoplasmic protons and transforming inhibitors. These results suggest that Δldh/ΔargR is a promising bioethanol producer with prominent advantages in coping with the harsh environment of various hydrolysates.
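The theoretical maximum of 0.51 g ethanol per g sugar quoted above follows from fermentation stoichiometry (one glucose gives two ethanol and two CO2; three xylose give five ethanol and five CO2). The sketch below recomputes that limit from standard molar masses and compares it with the reported yields. The variable names are hypothetical, and the "implied sugar" value is simply titer divided by yield, not a measured quantity.

```python
# Standard molar masses in g/mol.
M_ETHANOL = 46.07
M_GLUCOSE = 180.16
M_XYLOSE = 150.13

# C6H12O6 -> 2 C2H5OH + 2 CO2 ; 3 C5H10O5 -> 5 C2H5OH + 5 CO2
max_yield_glucose = 2 * M_ETHANOL / M_GLUCOSE
max_yield_xylose = 5 * M_ETHANOL / (3 * M_XYLOSE)

# Reported values from the text: hydrolysate -> (titer g/L, yield g/g).
reported = {
    "RSH": (9.84, 0.59),
    "SSH": (10.25, 0.53),
    "PSH": (9.27, 0.57),
    "WSH": (10.45, 0.61),
    "OSH": (9.70, 0.60),
}

above_limit = [name for name, (_, y) in reported.items() if y > max_yield_xylose]
implied_sugar = {name: titer / y for name, (titer, y) in reported.items()}

print(round(max_yield_glucose, 3), round(max_yield_xylose, 3))
print(above_limit)
print({name: round(s, 1) for name, s in implied_sugar.items()})
```

All five reported yields exceed the stoichiometric limit, which is consistent with the explanation given in the text that substrates other than the quantified sugars contributed to ethanol formation.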