Internal promoters of the epothilone biosynthetic gene cluster and their activation in Myxococcus xanthus by CRISPR-dCas9

Multiple genes involved in a complex pathway are often clustered into a giant operon with no transcription terminator before the end, which makes the transcriptional process fragile and makes it laborious to engineer the transcription of operon genes. Internal promoters may occur within operons to coordinate the transcription of individual genes, but their effects on the transcription of operon genes have rarely been investigated.


Results
Epothilones are polyketides synthesized by seven multifunctional enzymes, which are encoded by a 56-kb operon of the myxobacterium Sorangium cellulosum. In this study, we determined that the epothilone operon contains multiple internal promoters. These promoters could be activated by the CRISPRa technique, and the yields of epothilones increased accordingly. However, the activation efficiencies of the promoters differed greatly between their operon and separate forms. Further, we found that the transcriptional levels of the epothilone genes always increased to a greater extent than the epothilone yields, which suggests that the transcriptional activation of single genes probably has only a weak effect on the final epothilone yield, and that a higher yield requires an overall transcriptional increase of the multiple operon genes. Finally, we combined the activation of the starting promoter PepoA with that of internal promoters in different epothilone-producing strains, and obtained a maximum 15-fold increase in epothilone yield in Myxococcus xanthus ZE5.

Conclusions
This is the first report of internal promoters in the epothilone gene cluster in Myxococcus xanthus and the first assay of the activation effects of these internal promoters by CRISPR-dCas9. Our results highlight that tuning internal promoter activities is critical for controlling the transcription of operon genes and the production efficiency of microbial secondary metabolites.

Background
Operons, a well-known feature of prokaryotic genomes, are clusters of co-regulated genes with related functions [1]. Large operons, such as those encoding pathways for the biosynthesis of secondary metabolites, may contain multiple genes with no transcription terminator before the end. The transcription of multiple genes in an operon is initiated at the starting promoter, forming a single polycistronic mRNA. However, the transcription of large mRNA molecules is easily subject to various influences in cells, which makes the transcriptional process fragile and makes it laborious to engineer the transcription of operon genes. Internal promoters have been observed in operons for many years.
According to experiments and genome prediction, internal promoters are universal in operons of different bacteria [2][3][4][5][6][7]. For example, Ma et al. demonstrated that there are at least three weak internal promoters, P2, P3 and P4, in addition to two strong promoters, PL and P1, in the E. coli rpoBC operon, which encodes four ribosomal proteins and the β and β′ subunits of RNA polymerase [2]. In the 57-kb jamaicamide operon from the marine cyanobacterium Lyngbya majuscula, 17 genes are co-transcribed from the starting promoter, and six internal promoters are present in intergenic regions, which potentially assist in managing toxin production in various environments [3].
Similarly, seven internal promoters have been observed in the microcystin operon of Microcystis aeruginosa [4]. However, the effects of internal promoters on the transcription of operon genes have rarely been investigated.
Epothilones are antitumor polyketide compounds with microtubule-stabilizing activity, which were originally discovered in the extracts of some Sorangium cellulosum strains [8]. In 2000, Molnár and Julien reported the sequence and organization of the epothilone gene cluster, which is approximately 56 kb and contains seven open reading frames (ORFs) in the same transcriptional direction [9,10]. These ORFs encode polyketide synthases and a non-ribosomal peptide synthetase with multiple functional domains for the elongation, modification and release of epothilones in a sequential mode [9]. This gene cluster contains no terminator between ORFs before the end, thus forming a giant operon.
In previous work, we integrated the epothilone biosynthetic gene cluster from S. cellulosum So0157-2 [11] into different sites of the Myxococcus xanthus genome, and produced dozens of epothilone producers [12]. We found that the transcriptional levels of operon genes varied greatly in both M. xanthus and the original S. cellulosum producers. We constructed a CRISPRa (CRISPR-dCas9-mediated transcription activation) system in M. xanthus and improved the production of epothilones by activating the starting promoter of the operon [13]. However, the transcriptional levels of operon genes still varied. In fact, uneven expression of operon genes was observed many years ago [14], and genome-wide transcriptomics studies have also revealed varied transcriptional levels of consecutive genes within operons [15], which suggests that operon genes are subject to complex regulation rather than simply being transcribed from the single starting promoter. Internal promoters are suggested to coordinate the transcription of operon genes, but their effects remain unclear. In this study, we determined the existence of multiple internal promoters in the epothilone biosynthetic gene cluster. We employed the established CRISPRa technique to promote transcription from the internal promoters, and found that the activation efficiency of promoters within the operon was distinct from that of the same promoters in separate form. We combined the activation of the starting promoter PepoA with that of internal promoters in different epothilone-producing strains, and obtained a maximum 15-fold increase in epothilone yield in Myxococcus xanthus ZE5. Our results highlight that coordinating the transcriptional activities of internal promoters is critical for the transcription of operon genes and the production efficiency of microbial secondary metabolites.

The epothilone gene cluster is a big operon containing multiple internal promoters
The epothilone gene cluster contains seven ORFs (epoA to epoF) in the same transcriptional direction, which are separated by short distances (15 to 147 bp) or overlap slightly (Fig. 1a). No terminator structure exists in the intergenic regions of these ORFs, which means that the epothilone gene cluster is a huge operon. The continuous transcription of the gene cluster was also confirmed by co-transcriptional detection of each pair of adjacent ORFs (Figure S1).
However, these operon ORFs exhibited significant differences in their transcriptional levels. For example, as determined in M. xanthus ZE9, an epothilone producer constructed with the whole gene cluster of S. cellulosum So0157-2 [12], the last ORF epoF showed the highest expression level, approximately 2.5 times that of the first ORF epoA (Fig. 1b). We hypothesized that internal promoters might be present in this operon to tune the transcription of individual ORFs.
We predicted the existence of internal promoters in the epothilone operon derived from S. cellulosum So0157-2 using the "Neural Network Promoter Prediction" program (https://www.fruitfly.org/seq_tools/promoter.html), and found ten potential promoters (shown as red arrows in Fig. 1a). Prediction with "BPROM" (http://www.softberry.com/berry.phtml) gave similar results but identified an additional promoter upstream of epoD (shown as green arrows in Fig. 1a). Detailed information on the predicted promoters is shown in Table S1. These promoters mostly appeared in the junctional regions between ORFs. To date, the epothilone biosynthetic gene clusters of four S. cellulosum strains have been reported, with GenBank accession numbers AF210843.1 (SMP44) [9], AF217189.1 (So ce90) [10], GU063811.1 (KYC3013) [16] and EU414841.1 (So0157-2) [11]. The sequences of these four epothilone gene clusters exhibit more than 98.6% similarity. In the epothilone operons from the other three S. cellulosum producers, internal promoters were similarly present but with some differences (Figure S2; detailed information is provided in Tables S2 to S4).
To analyze the activities of the internal promoters, we selected six regions located at the junctions between two ORFs in the So0157-2 epothilone operon. These regions were 1000 bp in length, comprising an 800-bp fragment of the upstream gene and a 200-bp fragment of the downstream gene, and were amplified from M. xanthus ZE9 using the corresponding primer pairs (listed in Table S5). Each region, containing at least one predicted internal promoter, was cloned into the pKK232-8 plasmid to control the expression of chloramphenicol acetyltransferase (CAT) (Figure S3). The aphII promoter was placed upstream of the CAT reporter gene as a positive control. According to the CAT activities, assayed with the CAT ELISA kit, all six regions exhibited transcriptional activity in E. coli: the PepoP promoter showed the highest activity, which was close to that of the aphII promoter, while the PepoB and PepoE promoters showed weak activities, slightly higher than that of the negative control (without a promoter before the CAT gene) (Fig. 1c).
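The layout of each 1000-bp test region can be sketched as a coordinate calculation. This is only an illustration: the function names are hypothetical, and the junction is assumed here to be the first base of the downstream ORF, which ignores the short intergenic gaps and overlaps that the real ORF boundaries have.

```python
def junction_region(junction, upstream_len=800, downstream_len=200):
    """Return 0-based, half-open (start, end) coordinates of a junction region.

    `junction` is assumed to be the index of the first base of the
    downstream ORF (an assumption made for this sketch only).
    """
    if junction < upstream_len:
        raise ValueError("not enough sequence upstream of the junction")
    return junction - upstream_len, junction + downstream_len


def extract_region(sequence, junction, upstream_len=800, downstream_len=200):
    """Slice the junction region out of a cluster sequence."""
    start, end = junction_region(junction, upstream_len, downstream_len)
    if end > len(sequence):
        raise ValueError("not enough sequence downstream of the junction")
    return sequence[start:end]
```

With the default lengths the returned fragment is always 1000 bp, matching the 800-bp plus 200-bp design described above.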
In parallel, we analyzed the promoter activities of the 1-kb sequence upstream of the translation initiation codon (ATG or GTG) of each ORF in M. xanthus. The six fragments were each placed in front of the EGFP (Enhanced Green Fluorescent Protein) reporter gene in pZJY41, an autonomously replicating plasmid in M. xanthus [17] (Figure S4). The identified starting promoter PepoA in front of the epothilone operon [18] was used as the positive control. We introduced these plasmids into M. xanthus DZ2 and assayed the green fluorescence values of the mutants after 24 h, 48 h and 72 h of incubation. The results showed that these fragments, even the sequence upstream of epoP, where no promoter had been predicted, exhibited promoter activities; PepoA always had the highest activity, and the activity of PepoP was the lowest (Fig. 1d). Notably, the detected promoter activities, in either E. coli or M. xanthus, were clearly inconsistent with the transcriptional levels of the genes in the epothilone operon. For instance, we did not find a strong promoter upstream of epoF, which nevertheless displayed the highest transcriptional level among the ORFs in the epothilone operon (Fig. 1b). These results suggest that the transcriptional activities of these internal promoters are subject to complex regulation within the operon.

Operon and separate internal promoters exhibit different transcriptional activities
The CRISPR (clustered regularly interspaced short palindromic repeat)-Cas system is an RNA-mediated immune system present in many bacteria and archaea that protects cells from invasion by foreign DNA [19][20][21]. Peng et al. constructed a CRISPRa system in M. xanthus and successfully activated the PepoA promoter of the epothilone operon, which increased the transcriptional levels of operon genes and improved the yield of epothilones by 230% [13]. A schematic diagram of the CRISPRa construction in M. xanthus is shown in Fig. 2a.
We performed CRISPRa-based activation of the separate promoters in E. coli. To achieve activation, we introduced three plasmids into the HB101 strain: the pSWcuomxdCas9-ω plasmid carrying the dCas9 protein and the transcription activator Omega (ω) [13], a pZJY41-sgRNA series plasmid carrying the sgRNA, and a pKK232 series plasmid carrying a Pepo promoter and the CAT reporter gene (Figure S5). To construct the pZJY41-sgRNA series plasmids, we employed one spacer for each promoter (see Table 1), designed using the online software "Cas-OT" [22]. We found that the transcription activities of these separate promoters were all significantly improved: weak promoters were more easily activated by CRISPRa, and the promoters with high transcription activities were also activated but to a lesser extent (Fig. 2b), which is consistent with a previous report [23]. For example, the transcription activity of the weakest promoter PepoB was increased by nearly 33-fold, while the strongest promoter PepoA was activated by approximately 1.6-fold. We further performed in situ activation of the internal promoters within the epothilone operon in M. xanthus ZE9, using the same spacer sequences for each of the six internal promoters (Table 1). The spacer sequences were each cloned into the pZJY41 plasmid (Figure S6), producing six series of plasmids (Table S6), which were separately introduced into the CuOm strain by electro-transformation. The CuOm strain was constructed by introducing the pSWcuomxdCas9-ω plasmid into ZE9 (a diagrammatic sketch of the construction is shown in Figure S7). The results showed that the transcriptional levels of most of the activated genes were significantly increased, except that the increase in epoD expression in CuOm-D2 was not significant (t-test, p > 0.05) (Fig. 2c). The best activation in M. xanthus was achieved on the epoE gene, which was activated by about 5-fold.
Notably, these results were markedly different from those obtained with the separate promoters, although the same spacer sequences were used. Thus, the transcription activities of internal promoters are influenced by their operon context.
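As a rough illustration of what spacer selection involves, the sketch below lists candidate protospacers next to a PAM on the forward strand of a promoter fragment. It is not the Cas-OT algorithm used in this study: it performs no off-target scoring, it ignores the reverse strand, and it assumes an NGG PAM and a 20-nt spacer, neither of which is stated in the text. The function name is hypothetical.

```python
import re


def find_protospacers(seq, spacer_len=20):
    """List (start, protospacer, pam) for every NGG PAM on the forward strand.

    Assumptions for this sketch: NGG PAM, 20-nt spacer, forward strand only.
    """
    seq = seq.upper()
    hits = []
    # A lookahead is used so that overlapping PAM sites are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start >= spacer_len:
            start = pam_start - spacer_len
            hits.append((start, seq[start:pam_start], match.group(1)))
    return hits
```

Candidates found this way would still need to be filtered for genome-wide specificity and for position relative to the promoter, which is the part that dedicated tools handle.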
In addition, we found that epothilone production in these CRISPRa-promoted mutants was significantly increased, even in CuOm-D2, in which the expression of epoD was not significantly increased by the CRISPRa technique (Fig. 2d). We found that the transcriptional levels of the epothilone genes always increased to a greater extent than the epothilone yields. For example, the highest gene expression was found in CuOm-E1 (increased by 5-fold), but the epothilone yield increased by only 1.7-fold. We thus checked the transcription of each ORF in the epothilone operon upon activation of single internal promoters. The results showed that the activation of a front promoter (PepoP, PepoB or PepoC) normally increased the transcription of the front genes epoA, epoP, epoB and epoC, but did not change or even decreased the transcription of the hind genes epoD, epoE and epoF, whether or not the genes were specifically activated (Figure S8). Similarly, the activation of a hind promoter (PepoD, PepoE or PepoF) increased the transcription of the hind genes, and sometimes of the upstream epoB and epoC genes, but had no effect on epoA or epoP. These results suggest that the transcriptional activation of single genes probably has only a weak effect on the final epothilone yield, and that a higher yield requires an overall transcriptional increase of the multiple operon genes.
Tuning the activity of multiple promoters to increase the epothilone yield
To strengthen the activation effects and increase the yield of epothilones, we combined the activation of the starting promoter PepoA with that of an internal promoter (PepoP, PepoB or PepoD), forming the AP, AB and AD combinations. We also combined the activation of three promoters, i.e., APB, APD and ABD. The combined sgRNA sequences were cloned into pZJY41 plasmids, which were each introduced into the CuOm strain (Figure S9). After the CRISPRa-based activation, the production of epothilones increased in each combination, and the highest yields were obtained in the mutants CuOm-AP (11.17 mg/L) and CuOm-ABD (11.69 mg/L), both of which increased approximately 2.4-fold compared with the 4.95 mg/L of the initial strain ZE9 (Fig. 3a). However, the combined activation of two or three promoters did not produce a cumulative effect relative to the activation of single promoters.
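The fold changes quoted for the combined activations are simple ratios of the reported titers. A minimal sketch of that arithmetic follows; the function and variable names are hypothetical, and the only data used are the three yields given in the text.

```python
def fold_change(treated, baseline):
    """Ratio of a treated value to a strictly positive baseline value."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return treated / baseline


# Epothilone yields in mg/L, as reported in the text (Fig. 3a).
yields_mg_per_l = {"ZE9": 4.95, "CuOm-AP": 11.17, "CuOm-ABD": 11.69}

for strain in ("CuOm-AP", "CuOm-ABD"):
    ratio = fold_change(yields_mg_per_l[strain], yields_mg_per_l["ZE9"])
    print(f"{strain}: {ratio:.2f}-fold relative to ZE9")
```

The same helper applies to the transcript-level comparisons, provided the baseline strain and normalization are the same for both values.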
We analyzed the transcription of each operon ORF under the different activation combinations (Figs. 3b and 3c). The results showed that the transcriptional levels of the front operon genes epoA and epoP were, in most cases, significantly increased by the combined activation, even when PepoP was not specifically activated (in CuOm-AB, CuOm-AD or CuOm-ABD). However, the transcription of epoD, as well as of the genes behind it, was not increased in CuOm-AD, CuOm-ABD or CuOm-APD, even though PepoD was specifically activated in these strains. It appears that the genes close to the starting promoter PepoA were more easily activated than the hind operon genes.
Based on the above results, we applied combined CRISPRa-mediated promoter activation to M. xanthus strains with different transcription levels of the epothilone genes. We previously constructed dozens of epothilone-producing M. xanthus strains, in which the same epothilone biosynthetic gene cluster was inserted at different sites of the DZ2 genome, resulting in varied epothilone production [12]. We chose four strains, ZE9, ZE5, ZE10 and ZE14, to assay the transcriptional efficiency using the CRISPRa technique. Among these four strains, ZE9 had the highest epothilone production, followed by ZE10 and then ZE14, and ZE5 exhibited the lowest epothilone yield (Fig. 4a). Consistently, the transcriptional levels of the seven operon ORFs in ZE9 were higher than those in the other three strains (Fig. 4b). In ZE14, the front operon genes were transcribed at higher levels, but the levels of epoD, epoE and epoF were significantly lower than those in ZE10. Similarly, the transcriptional levels of epoD, epoE and epoF were extremely low in ZE5, causing the strain to produce the lowest yield of epothilones among the four strains.
We combined the activation of either the front promoters (PepoA, PepoP and PepoB) or the hind promoters (PepoD, PepoE and PepoF) in these epothilone-producing strains. As expected, the epothilone yields increased in all of these strains, and the highest increase, 15-fold, was obtained in ZE5 with the DEF promoter activation (Fig. 4a). Consistent with the yields of epothilones, the highest activation efficiency also occurred in the ZE5 strain. In ZE5-DEF, the transcriptional levels of the three activated genes epoD, epoE and epoF were increased by 9.6, 3.1 and 51.7 times, respectively, and those of epoP and epoB were also increased slightly (Fig. 4c). However, the transcriptional changes of the operon genes suggested that the interference between operon promoters is very complex. For example, in the ZE5-APB strain, the transcriptional levels of the three activated genes epoA, epoP and epoB were all increased, but to different extents; the transcriptional levels of the four hind genes (epoC to epoF), which were not specifically activated, were also mostly increased.

Discussion
Transcription regulation is always a topic of concern. Operons are clusters of co-regulated genes with related functions. Bacteria have established multiple mechanisms to ensure the relative expressional levels of individual genes in operon to meet the requirements of cell and environment [24,25]. Transcriptional interference between tandem promoters is recognized as a potentially widespread mechanism to regulate gene expression [26,27].
There are many studies on how single internal promoters regulate the expression of operon genes, but only a few studies have addressed transcriptional interference between multiple operon promoters, and these have reached no clear conclusions. For example, the 14-kb CAP1 gene cluster in Staphylococcus aureus is transcriptionally controlled by a strong upstream promoter and five weak internal promoters, and the internal promoters showed significant activity only after removal of the primary promoter [28]. In the cyanobacterium Anabaena sp. strain PCC 7120, a zinc-responsive operon contains four distinct promoters, which were induced by metal depletion and were constitutively derepressed in a zur mutant, even though the two downstream promoters are not direct targets of this regulator [29]. In this study, we demonstrated that the large epothilone operon contains multiple internal promoters, and that transcription from these internal promoters may interfere with each other in intricate ways.
Interference between tandem promoters may be generated by dislodgement of slow-to-assemble pre-initiation complexes and transcription factors, or prolonged occlusion by paused RNA polymerases (RNAPs) [29,30,31].
One transcriptional process may also directly suppress another in cis, because an RNAP transcribing from one promoter can affect the supercoiling state of neighboring promoters [26,32,33]. A study of the human and mouse genomes concluded that RNAP collisions were the primary mechanism of interaction between transcripts, based on transcript abundance decreasing with increasing overlap length, to almost zero when the overlap exceeded 2,000 nucleotides [34]. Mathematical modeling also shows that the probability that an RNAP avoids colliding with an RNAP from a convergent promoter decreases exponentially with the firing rate of the interfering promoter and with the inter-promoter distance [35]. However, transcriptional interference between tandem promoters is often not satisfactorily explained by RNAP collisions or by occlusion by elongating RNAP [27], and other, unknown regulatory patterns exist [29].
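The exponential dependence mentioned for the collision model can be illustrated with a toy calculation. The form used below, P = exp(-k d / v), is an assumption chosen only because it decays exponentially in both the firing rate k and the distance d; it is not taken from the model in [35]. The elongation speed default and all names are likewise hypothetical.

```python
import math


def collision_free_probability(firing_rate_per_s, distance_nt,
                               elongation_nt_per_s=40.0):
    """Toy estimate of the chance that an RNAP crosses a region unopposed.

    Assumes the interfering promoter fires as a Poisson process at rate k
    and that crossing `distance_nt` takes distance / speed seconds, so
    P = exp(-k * d / v). Illustrative only.
    """
    if firing_rate_per_s < 0 or distance_nt < 0 or elongation_nt_per_s <= 0:
        raise ValueError("rates and distances must be non-negative, speed positive")
    transit_time_s = distance_nt / elongation_nt_per_s
    return math.exp(-firing_rate_per_s * transit_time_s)
```

Under this toy form, doubling either the firing rate or the inter-promoter distance squares the survival probability, which is the qualitative behavior described in the text.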

Conclusions
Inconsistency in the transcriptional levels of operon genes often limits the yield of secondary metabolites. The inconsistent expression levels of operon genes observed in different bacterial species not only challenge the concept of operons [36,37], but also impede engineering work to control the transcription of operon genes. The results presented in this study indicate that multiple internal promoters are present in the epothilone gene cluster. Although little is known about the underlying mechanism, the regulation of internal promoters is likely to be crucial for biosynthetic pathways of secondary metabolites that are encoded by a large operon. Tuning the transcriptional activities of operon promoters, for example with the CRISPRa technique, can efficiently improve metabolite yields.

Strains and culture conditions
Strains used in this study are listed in Table S7.
Escherichia coli DH5α and HB101 were used for routine transformations and sub-cloning. The E. coli strains were grown routinely in Luria Broth (LB) medium (10 g/L peptone, 5 g/L yeast extract, and 5 g/L NaCl, pH 7.2). Myxococcus xanthus strains were grown in CYE medium [ respectively. In experimental groups, the cDNAs were used as templates.

Prediction of internal promoters in the epothilone gene cluster
We used the promoter prediction software "Neural Network Promoter Prediction" (https://www.fruitfly.org/seq_tools/promoter.html) to predict internal promoters in the epothilone gene clusters derived from Sorangium cellulosum strains So ce90, SMP44, So0157-2 and KYC3013. The threshold was set to 0.8. In addition, we used another online promoter prediction tool, "BPROM" (http://www.softberry.com/berry.phtml?topic=bprom&group=programs&subgroup=gfindb), to check the predictions. The -35 and -10 binding regions (shown in bold) were predicted by comparison with the σ70 consensus -35 (TTGACA) and -10 (TATAAT) promoter regions of E. coli. We then selected the predictions with a score greater than 0.8.
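The comparison against the σ70 consensus hexamers can be illustrated with a simple match-counting scan. This sketch is not the Neural Network Promoter Prediction or BPROM algorithm and does not produce their scores; the spacer range of 15 to 19 bp, the minimum score of 9 matches out of 12, and all names are assumptions made for illustration.

```python
MINUS35 = "TTGACA"  # sigma70 -35 consensus, as given in the text
MINUS10 = "TATAAT"  # sigma70 -10 consensus, as given in the text


def count_matches(a, b):
    """Number of positions at which two equal-length strings agree."""
    return sum(x == y for x, y in zip(a, b))


def scan_sigma70(seq, min_spacer=15, max_spacer=19, min_score=9):
    """Return (pos35, pos10, spacer, score) for hexamer pairs near consensus.

    The score is the number of matches to the two consensus hexamers
    (maximum 12). Spacer range and cutoff are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        score35 = count_matches(seq[i:i + 6], MINUS35)
        for spacer in range(min_spacer, max_spacer + 1):
            j = i + 6 + spacer
            if j + 6 > len(seq):
                break
            score = score35 + count_matches(seq[j:j + 6], MINUS10)
            if score >= min_score:
                hits.append((i, j, spacer, score))
    return hits
```

A scan of this kind is expected to over-predict in a GC-rich genome such as that of S. cellulosum, which is one reason to cross-check candidates with dedicated predictors and with reporter assays.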

Construction of plasmids
The plasmids and primers used in this study are provided in Tables S5 and S6. pkk-232-PepoP ~ pkk-232-PepoF were used as promoter activity reporter vectors in E. coli. PepoP ~ PepoF were obtained by PCR with primers PepoP-F/R ~ PepoF-F/R from the epothilone gene cluster of So0157-2. pkk-232-PepoP ~ pkk-232-PepoF were constructed by inserting PepoP ~ PepoF into the HindIII/BamHI sites of pKK232-8. pKK-232-aph was constructed by inserting aph into the HindIII/BamHI sites of pKK232-8.
pZJY41-Ap-EGFP ~ pZJY41-Fp-EGFP were used as promoter activity reporter vectors in Myxococcus xanthus. PepoA was obtained by PCR with primers Ap-F/R from the So0157-2 genome, the reporter gene EGFP was amplified with primers EGFP-F/R, and overlap PCR with primers PepoA-F and EGFP-R was then used to obtain PepoA-EGFP.
Subsequently, PepoA-EGFP was inserted into the BamHI/KpnI sites of pZJY41 to construct the reporter vector pZJY41-Ap-EGFP. The other six promoter activity reporter vectors were constructed in the same way.

Activity detection of internal promoters in E. coli
We constructed plasmids to detect the activity of internal promoters from the epothilone operon in E. coli. Promoter activity was characterized by measuring the activity of the reporter gene chloramphenicol acetyltransferase (CAT). The promoter sequences and the reporter gene sequence were cloned into plasmid pKK232-8 by digestion with BamHI/HindIII followed by ligation with T4 DNA ligase. The activity of the CAT reporter was measured with the CAT ELISA Kit. The aphII promoter was used as a positive control. The CAT ELISA Kit was purchased from Roche and used according to the manufacturer's instructions (https://www.sigmaaldrich.com/catalog/product/roche/11363727001?lang=zh&region=CN).

Activity detection of internal promoters in Myxococcus xanthus
We constructed plasmids to detect the activity of internal promoters from the epothilone operon in M. xanthus. Promoter activity was characterized by measuring the fluorescence intensity of the green fluorescence reporter gene EGFP. The promoter sequences and the EGFP gene sequence were seamlessly joined by fusion PCR, and finally cloned into the plasmid pZJY41 by digestion with BamHI/KpnI followed by ligation with T4 DNA ligase.
Related primer information is shown in Table S5. The fluorescence intensity of the green reporter was measured at 485 nm/528 nm, and three different incubation times (24 h, 36 h and 72 h) were selected for detection.

Construction of CRISPRa-dCas9 system in E. coli
We transformed three plasmids into competent E. coli HB101 cells at the same time: the pSWcuomxdCas9-ω plasmid [13] carrying the mxdCas9 protein and the transcription activator Omega (ω); a pZJY41-sgRNA series plasmid carrying sgRNA with different spacer sequences; and a pKK232 series plasmid carrying an internal promoter and the CAT reporter gene. Related sequences and plasmid information are shown in Tables S5 and S6.