A Small Natural Light-induced Bidirectional Promoter of Rapeseed (Brassica napus)

A bidirectional promoter is an intergenic region located between a pair of adjacent, oppositely transcribed genes (‘head-to-head’ genes) that concurrently drives the expression of both genes. In the genome of Brassica napus, we identified a bidirectional promoter 265 bp long, named PBn265. PBn265 is located between the translation initiation codons (ATG) of two genes that encode the homeodomain protein SHH2 and the chloroplast group II intron splicing factor CFM3. Its bidirectional promoter activity was verified by transient expression of PBn265-F::EGFP and PBn265-R::mCherry in Nicotiana benthamiana leaf tissue via Agrobacterium-mediated transformation. Expression of both reporter genes, EGFP linked to one end and GUS to the other end of the PBn265 sequence, was observed in various tissues of transgenic Arabidopsis thaliana using histochemical staining and fluorescence microscopy. Furthermore, we found that the promoter activity of this sequence is regulated by illumination. Considering its short length and light-inducible regulation, this promoter likely has application potential in bioengineering and agricultural molecular breeding.


Introduction
A promoter, as an important cis-element, is a special DNA sequence located upstream of its associated gene. The promoter sequence contains the core regulatory information of gene expression, such as cis-elements like the TATA-box, and defines transcription initiation, transcription efficiency, and the temporal-spatial specificity of transcription [1]. Promoters can be roughly divided into three types according to their function: constitutive promoters [2], inducible promoters [3] and tissue-specific promoters [4]. A single promoter can also have two of these characteristics at the same time [5]. Inducible promoters respond to environmental factors and can be divided into biological-stress-induced, physical-stress-induced, and chemical-stress-induced promoters [6]. Inducible promoters can avoid the excessive consumption of plant energy that results from continuous target gene expression, and can eliminate damage to the plant caused by the accumulation of gene products [7]. Therefore, the identification of inducible promoters has recently become a research priority in plant genetic engineering [8].
A bidirectional promoter is an intergenic region between two adjacent genes that are transcribed in opposite directions, often with less than 1,000 bp between the transcription initiation sites [9]. The two genes are called a bidirectional gene pair or a "head-to-head" gene pair. The first plant seed-specific bidirectional promoter, that of the oleosin gene in Brassica napus, included an expression module that was ABA-inducible in the forward orientation and ethylene-inducible in the reverse orientation [10]. Because a bidirectional promoter can drive two genes at the same time, it has great potential in genetic engineering and metabolic engineering research involving multi-gene expression. Furthermore, when multiple genes are expressed, repeated use of constitutive promoters can easily lead to gene silencing. The application of bidirectional promoters can avoid this unwanted effect [11].
In recent years, bioinformatics studies have identified a large number of bidirectional gene pairs in humans [12], Arabidopsis [13], rice [14] and Populus [15]. Many intergenic regions between "head-to-head" genes are potentially active bidirectional promoters [16]. Currently, only a few bidirectional promoters in Arabidopsis and rice have been verified by biological experiments [17].
These include the Arabidopsis chlorophyll a/b binding protein gene promoter [18], antioxidant protein B gene promoter [19], rice chymotrypsin inhibitor gene promoter [20], and heat shock protein gene promoter [21]. The Arabidopsis chlorophyll a/b binding protein gene promoter has light-inducible properties, while the rice chymotrypsin inhibitor gene promoter was induced by drought [18]. Both have already exhibited application value in the field of genetic engineering [21].
In this study, we identified LOC106347029 and LOC106347039 as a pair of "head-to-head" genes in the Brassica napus genome. The gene annotation from the NCBI website revealed that the protein encoded by LOC106347029 is a chloroplast group II intron splicing factor (CFM3), and that LOC106347039 encodes the protein SHH2. The CFM3 and SHH2 genes are adjacent and arranged in opposite orientations on the two complementary DNA strands. The distance between their translation initiation codons is only 265 bp. Therefore, we hypothesized that the intergenic region between CFM3 and SHH2 may be a bidirectional promoter that can simultaneously drive the expression of the genes on both sides. To test this hypothesis, we analyzed the promoter sequence using the PlantCARE web tool and conducted a series of characterization experiments to test the bidirectional activity of this sequence, the expression level of the native gene pair (CFM3 and SHH2), and that of the reporter gene pair (EGFP and GUS). Here we report what is, to our knowledge, one of the shortest natural bidirectional, light-induced promoters of Brassica napus. The promoter therefore has broad application prospects in plant genetic improvement and genetic engineering [22,23].
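The search for "head-to-head" pairs described above can be illustrated with a minimal sketch. It assumes a simple list of gene records with a strand and coordinates; the `Gene` class, the function name and all coordinates are hypothetical and chosen only so that the CFM3/SHH2 gap equals 265 bp. Assigning CFM3 to the minus strand and SHH2 to the plus strand is likewise an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based leftmost coordinate of the chosen landmark
    end: int     # 1-based rightmost coordinate, inclusive
    strand: str  # "+" or "-"


def head_to_head_pairs(genes, max_gap=1000):
    """Return (minus_gene, plus_gene, gap_bp) for adjacent divergent genes.

    Two neighbours are divergent ("head-to-head") when the left gene lies
    on the minus strand and the right gene on the plus strand, so that
    both are transcribed away from the shared intergenic region.
    """
    ordered = sorted(genes, key=lambda g: g.start)
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        gap = right.start - left.end - 1  # bases strictly between the genes
        if left.strand == "-" and right.strand == "+" and 0 <= gap <= max_gap:
            pairs.append((left.name, right.name, gap))
    return pairs


# Hypothetical coordinates, not taken from the ZS11 genome.
genes = [
    Gene("geneA", 1000, 3000, "+"),
    Gene("CFM3", 5000, 8000, "-"),
    Gene("SHH2", 8266, 9500, "+"),
    Gene("geneB", 9800, 12000, "+"),
]

print(head_to_head_pairs(genes))
```

Only the CFM3/SHH2 pair is reported here, because the other neighbouring genes are either convergent or transcribed in the same direction.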

Sequence analysis and PCR amplification
The arrangement of the genes on chromosome A1 of Brassica napus (ZS11) was examined using the NCBI genome data viewer (https://www.ncbi.nlm.nih.gov/). The DNA sequence between the "head-to-head" genes was downloaded from GenBank as a candidate bidirectional promoter. The cis-elements were analyzed with PlantCARE, and CpG islands were predicted with MethPrimer. Specific primers were designed according to the sequences at both ends. The sequence (named PBn265) was amplified from the genomic DNA of Brassica napus (KeleYou) with the following PCR program: 95℃ for 3 min; 34 cycles of 95℃ for 10 s, 55℃ for 10 s and 72℃ for 15 s; and a final extension at 72℃ for 5 min.

Construction of vectors
In this study, five recombinant plasmids were constructed, namely pBI121-PBn265F::EGFP, pEG-PBn265R::mCherry, pDX2181-PBn265 (the pDX2181 vector was provided by Yongjun Lin [24]), pBI221-SHH2 and pBI221-CFM3. Plasmid pBI121-PBn265F::EGFP contained the promoter in the forward orientation, driving expression of the EGFP reporter gene. Plasmid pEG-PBn265R::mCherry contained the promoter in the reverse orientation, driving expression of the mCherry reporter gene. Vector pDX2181-PBn265 contained a pair of reporter genes, GUS and EGFP, arranged as "head-to-head" genes (Fig. 4a), with the promoter PBn265 inserted between them to drive both genes simultaneously: the forward orientation drove the expression of GUS and the reverse orientation drove the expression of EGFP. pBI221-SHH2 and pBI221-CFM3 were used for subcellular localization of SHH2 and CFM3; the two genes were ligated into the vector pBI221 upstream of the EGFP reporter gene and driven by the CaMV35S promoter. All restriction sites and primers are listed in Supplementary Table S1.

Transient expression assay in tobacco
We constructed the recombinant plant binary expression vectors pBI121-EGFP and pEG-mCherry, which contained PBn265-F::EGFP and PBn265-R::mCherry respectively (Fig. 1b,c), and transformed them into Agrobacterium tumefaciens strain GV3101. We then selected positive clones on LB medium containing 50 µM kanamycin and 20 µM rifampicin after cultivation for 36 h at 28°C, and identified them by PCR using specific primers (Supplementary Table S1). Positive clones were inoculated into 20 mL LB liquid medium containing the same antibiotics and grown at 28°C for 16 h at 200 rpm in an incubator shaker. When the OD600 reached 0.8, we centrifuged the culture at 2500×g for 10 min in a differential centrifuge at room temperature, and then resuspended the bacteria to an OD600 of 1.0 in Agromix Buffer

Subcellular localization assay
N. benthamiana protoplasts were transfected by the PEG-mediated method [25]. We gently mixed 10 µL recombinant plasmid DNA into 100 µL protoplast solution in a 2 mL round-bottom tube, then added and gently mixed 110 µL freshly prepared PEG solution (PEG4000, 0.2 M mannitol, 0.1 M CaCl2). The mixture was left at room temperature for 15 min, after which W5 solution [5 mM KCl, 125 mM CaCl2, 154 mM NaCl, and 2 mM MES (pH 5.7)] was added and the tube was inverted several times to completely terminate the transfection. After centrifugation at 100×g for 3 min, the supernatant was removed, and 1 mL WI solution [4 mM MES (pH 5.7), 20 mM KCl, and 0.5 M mannitol] was added to resuspend the protoplasts. Finally, we cultured the transfected protoplasts in darkness for 16 h at room temperature, and the EGFP signals were observed and photographed under a fluorescence microscope (Leica Microsystems DM4 B).

Arabidopsis transformation and selection of the transformants.
To explore the efficiency of PBn265, the recombinant vector pDX2181-PBn265, carrying the GUS and EGFP reporter genes, was transformed into Arabidopsis thaliana ecotype Col-0 by the floral-dip method. The transformants were selected on 1/2 MS solid medium with 20 µg/mL hygromycin. DNA was extracted from all T1 transformants, and the transgene was identified by PCR using the PBn265-F/R primers (Supplementary Table S1). The T2 transgenic seedlings were selected on 1/2 MS solid medium with 30 µg/mL hygromycin. We repeated the previous steps until homozygous transformants of the T3 generation were obtained.
The GUS assay was performed by immersing tissues in GUS staining solution (Solarbio) for 4 h in the dark at 37℃, then soaking them in 70% ethanol and observing them under a microscope (Leica MZFLIII). In addition, the fluorescence of different tissues was detected with a fluorescence microscope (Leica Microsystems DM4 B).

Synthesis of cDNA and Real-Time qPCR analysis.
We extracted RNA from the roots, stems, leaves, flowers, and siliques of transgenic Arabidopsis thaliana and synthesized cDNA, then quantitatively analyzed the expression of PBn265::GUS and PBn265::EGFP by RT-qPCR. We then divided the transgenic plants into two groups, a treatment group and a control group. The transgenic Arabidopsis thaliana seedlings in the treatment group were maintained in continuous darkness for 12, 24, 36 and 48 h, and then returned to light for 12, 24, 36 and 48 h, while the control group remained under its regular light conditions (16 h light/8 h darkness). Tissues from the roots, stems, and leaves were collected every 12 h for RNA extraction. Quantitative analysis of PBn265-F::GUS and PBn265-R::EGFP activity was conducted three times; each treatment had three biological replicates, and each replicate was a pool of 5 Arabidopsis plants. All data were expressed as the ratio of GUS and EGFP expression in the treatment group to that of its counterpart in the control group over the treatment time.
RNA was extracted using the Plant Total RNA Isolation Kit (FOREGENE). First-strand cDNA was then obtained by reverse transcription using One-Step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech). The At.actin primers (Supplementary Table S1) were used to check the quality of the cDNA and as an endogenous control to normalize the variance between samples. RT-qPCR was performed with qPCR SYBR Green Master Mix (YEASEN Biotech). The cycle parameters were as follows: 95℃ for 30 s; 39 cycles of 95℃ for 5 s and 57℃ for 15 s; and a final melting curve from 65℃ to 95℃ in increments of 0.5℃.
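The text does not state how relative expression was derived from the Ct values. A common choice, assumed here purely for illustration, is the 2^-ΔΔCt method with At.actin as the reference gene and a calibrator sample (for example the control group). The function name and all Ct values below are hypothetical, and the calculation assumes close to 100% amplification efficiency for both primer pairs.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the 2^-ddCt method (an assumed method).

    ct_target / ct_reference: Ct of the reporter (e.g. GUS or EGFP) and of
    the reference gene (e.g. actin) in the sample of interest.
    ct_*_calibrator: the same two Ct values in the calibrator sample.
    """
    delta_sample = ct_target - ct_reference
    delta_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: the reporter crosses the threshold two cycles
# earlier (relative to actin) in the sample than in the calibrator.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)
```

A difference of two cycles corresponds to a fourfold difference in template amount under the efficiency assumption stated above.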

Sequence analysis of promoter PBn265 from Brassica napus
According to the reference genome of Brassica napus (ZS11), we identified an intergenic region between the initiation codons (ATG) of the "head-to-head" gene pair (Fig. 1a), SAWADEE HOMEODOMAIN HOMOLOG 2 (SHH2) and Containing Factor of CRM domain (CFM3). It was only 265 bp long and was named PBn265. The primers PBn265-F/R were designed (Supplementary Table S1) to amplify the complete 265 bp sequence from the Brassica napus (Keleyou) genome. We defined the forward orientation as the one that controls transcription of SHH2 and the reverse orientation as the one that controls CFM3.
The sequence of PBn265 was analyzed with PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/), and we located approximately 10 cis-elements (Fig. 2), including AE-box, ATCT-motif, CAAT-box, DRE1, G-box, GCN4-motif, MRE, MYC, SP1 and TATA-box. Some of these cis-elements are in the forward orientation, such as SP1 (GGGCGG, a light-responsive element), GCN4-motif (TGAGTCA, a cis-regulatory element involved in endosperm expression), MRE (AACCTAA, a MYB binding site involved in light responsiveness) and two TATA-boxes (TA-rich sequences, the core promoter element around −30 of the transcription start). Others are in the reverse orientation, including G-box (TCCACATGGCA, a cis-acting regulatory element involved in light responsiveness), ATCT-motif (AATCTAATCC, part of a conserved DNA module involved in light responsiveness) and AE-box (AGAAACAA, part of a module for light response). Given the identity of these cis-elements, we hypothesized that PBn265 is most likely affected by light.
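How elements are assigned to the forward or reverse orientation can be sketched as an exact-match scan of both strands. This is a simplification and not a description of PlantCARE, which matches against its own curated database. The motif strings are those quoted above; the toy sequence and the function names are hypothetical, since the PBn265 sequence itself is not reproduced in the text.

```python
# Motif sequences as quoted in the text (5'->3').
MOTIFS = {
    "SP1": "GGGCGG",
    "GCN4-motif": "TGAGTCA",
    "MRE": "AACCTAA",
    "G-box": "TCCACATGGCA",
    "ATCT-motif": "AATCTAATCC",
    "AE-box": "AGAAACAA",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq, motif):
    """Return 0-based start indices of all (possibly overlapping) matches."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def scan(seq, motifs=MOTIFS):
    """Return (name, strand, 1-based position) tuples sorted by position.

    "+" means the motif reads 5'->3' on the given strand; "-" means its
    reverse complement was found, i.e. the motif lies on the other strand.
    """
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        hits += [(name, "+", i + 1) for i in find_all(seq, motif)]
        rc = reverse_complement(motif)
        if rc != motif:  # skip palindromic motifs to avoid double counting
            hits += [(name, "-", i + 1) for i in find_all(seq, rc)]
    return sorted(hits, key=lambda h: h[2])


# Hypothetical toy sequence, NOT the PBn265 sequence.
TOY_SEQ = "ACGGGCGGTTTGAGTCAACTTGTTTCTAG"
print(scan(TOY_SEQ))
```

In the toy sequence, SP1 and the GCN4-motif are found on the given strand, while the AE-box is found only as its reverse complement (TTGTTTCT) and is therefore reported on the opposite strand.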

The function of promoter PBn265 was preliminarily verified by transient expression assays.
To verify whether PBn265 has promoter activity in plant tissue, we conducted transient expression experiments in tobacco by introducing the promoter PBn265 (PBn265-F::EGFP and PBn265-R::mCherry) into N. benthamiana leaf tissue by Agrobacterium-mediated transformation. The results revealed that both the forward and the reverse promoter could drive the expression of the corresponding reporter gene, and green (EGFP) or red (mCherry) fluorescence was observed in transformed tobacco leaves (Fig. 1b, c). We therefore conclude that the PBn265 sequence from rapeseed has bidirectional promoter activity in tobacco.

Cis-elements and subcellular localization of SHH2 and CFM3.
The SAWADEE domain has been verified as a novel chromatin-binding module, which is used to probe the methylation state of the histone H3 tail [26]. SHH protein can also interact with Pol IV (RNA polymerase IV) and is important for the localization of Pol IV at some target sites [27]. SHH2 is therefore expected to be located in the nucleus, and the promoter that regulates its transcription, PBn265-F, is expected to be a constitutive promoter. CRM domain proteins belong to the group II intron splicing factors that function in chloroplasts [28]. However, many of them have lost their capacity for self-splicing, resulting in some pseudogenes that are not transcribed and translated into proteins [29], such as LOC106451789.
In accordance with the characterizations and functions of SHH2 and CFM3 mentioned above, the results of subcellular localization of SHH2 and CFM3 revealed that SHH2 was located in the nucleus, and CFM3 was most likely to be located in chloroplasts (Fig. 3). These results were also consistent with previous research reports and the predictions of cis-acting elements of this promoter (Fig. 2).

GUS and EGFP gene expression analysis in transgenic plants
The reporter genes, GUS and EGFP, were introduced into Arabidopsis thaliana (Fig. 4a). RT-qPCR analysis of PBn265-F::GUS and PBn265-R::EGFP in the transformants showed that PBn265-F::GUS was expressed in root, stem, leaf, flower and silique (Fig. 4b), and GUS staining was also observed in these tissues (Fig. 4c). Measurement of GUS staining intensity supported the RT-qPCR results. However, fluorescence microscopy images indicated that PBn265-R::EGFP was almost exclusively expressed in stem, leaf, flower and silique, with its highest expression level in green tissues such as the leaf and stem, while it was barely detectable in the root (Fig. 4d). These results were similar to those of the RT-qPCR analysis (Fig. 4b).
RT-qPCR of PBn265-F::GUS and PBn265-R::EGFP in light-treated transformants verified that PBn265 is induced by light.
The different expression patterns of the reporter genes driven by the same promoter in different orientations indicated that the forward and reverse orientations of the PBn265 promoter were regulated by different factors. To verify this, we conducted an illumination experiment. Transformants were grown in darkness for 64 h and then shifted into a light room for an additional 64 h.
Quantitative analysis revealed that the relative expression level of PBn265-R::EGFP in the root was significantly lower than in the other tissues (Fig. 4b), and that the relative expression level of PBn265-R::EGFP in the root, stem and leaf decreased significantly under constant darkness. On the other hand, the relative expression level of PBn265-F::GUS in the root did not show a regular change. In the stem, however, the relative expression level of PBn265-F::GUS decreased significantly after dark treatment for 36 h. Changes of PBn265-F::GUS in the leaf were similar to those in the stem (Fig. 5a). After resumption of light, the relative expression of both PBn265-F::GUS and PBn265-R::EGFP was elevated in the stem and leaf. Interestingly, the relative expression level of PBn265-F::GUS in the root continued to decrease (Fig. 5b).
To obtain a more accurate trend of reporter gene expression in transgenic plants under changing light, we recalculated the quantitative data of the treatment group relative to the control group (as described in the section "Synthesis of cDNA and Real-Time qPCR analysis"). In the processed data, the trend of PBn265-R::EGFP in the root under illumination treatment was not regular, while PBn265-F::GUS expression showed an overall downward trend, with no obvious recovery of gene expression even after restoration of light. In the stem and leaf, the expression of both PBn265-F::GUS and PBn265-R::EGFP decreased as darkness continued and then gradually recovered after restoration of light (Fig. 5c).
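The recalculation amounts to dividing, at each time point, the relative expression in the treatment group by that of its counterpart in the control group. A minimal sketch follows; the function name and all numbers are hypothetical and are not data from this study.

```python
def ratio_to_control(treatment, control):
    """Return {time_h: treatment / control} for time points present in both.

    Both arguments map a sampling time in hours to a relative expression
    value. A ratio below 1 means lower expression than in the control
    plants kept under the regular light regime.
    """
    shared = sorted(set(treatment) & set(control))
    return {t: treatment[t] / control[t] for t in shared}


# Hypothetical relative expression values during the dark treatment.
treatment = {12: 0.9, 24: 0.6, 36: 0.3, 48: 0.2}
control = {12: 1.0, 24: 1.2, 36: 1.0, 48: 0.8}

ratios = ratio_to_control(treatment, control)
for hours, ratio in ratios.items():
    print(f"{hours} h: {ratio:.2f}")
```

Expressing each time point relative to its own control removes changes that occur in both groups, such as developmental or circadian variation, so that the remaining trend reflects the light treatment.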

Discussion
In this study, we identified a potential bidirectional promoter in the genome of Brassica napus (Fig. 1a). Through the cloning and functional analysis of the promoter PBn265 of Brassica napus, we have drawn the following preliminary conclusions: the promoter has transcriptional activity in both directions (Fig. 1b,c); the promoter may be a bidirectional promoter containing a variety of cis-acting elements related to light response (Fig. 2); and PBn265-F is active in the root, stem, leaf, flower and silique of transgenic Arabidopsis thaliana (Fig. 4), so SHH2 may also be expressed in all tissues of Brassica napus. On the other hand, the expression level of PBn265-R was relatively high in green tissues such as stem and leaf, but significantly lower or even absent in the roots. In stem and leaf tissues, the expression levels of EGFP and GUS decreased significantly in the absence of light. After resumption of light, the expression level rose gradually and continued to increase as the light period was prolonged. Similar patterns were not observed in the root. Therefore, we hypothesized that CFM3 is mainly expressed in chloroplast-containing tissues, as there are fewer chloroplasts in roots. Based on these results, when PBn265 simultaneously drives the expression of both genes, PBn265-R may affect the function of PBn265-F (Fig. 5).
The leaf is the photosynthetic organ of the plant and plays an important role in energy fixation and utilization. The study of light-induced promoters has important theoretical significance and application value for the further study of genes whose expression is regulated by light [30]. Light-induced promoters with a bidirectional driving function have potentially broad application in plant genetic improvement and genetic engineering. In seven species of Brassicaceae, the arrangement "CFM3-bidirectional promoter-SHH2" shows a high degree of conserved synteny (Supplementary Table S2). This evolutionarily conserved arrangement may have some unknown biological significance, and these promoters may all be bidirectional promoters. Two genes that share a short promoter can potentially save resources such as time, space, energy, and RNA polymerase, which is highly conducive to an organism's growth and development. Therefore, the evolutionary origin of this promoter deserves further study.
It is worth noting that, in previous studies on bidirectional promoters, only 9% of them have a TATA-box [31]. It was speculated that the regulation mechanism of bidirectional promoters without a TATA-box may differ from that of unidirectional promoters [32]. However, this short (265 bp) sequence contains two TATA-boxes, while its GC content is low. Among the promoters randomly selected from the human genome, the average GC content is 53% and about 70.8% of them have a GC content higher than 60%, significantly higher than average [33]. We found that PBn265 has only 42% GC content and does not contain CpG islands (Supplementary Fig. S1). This may be a newly observed characteristic of bidirectional promoters, indicating that there may be new regulation mechanisms for bidirectional promoters, and it may help future studies of their regulatory mechanism.
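The two quantities compared above, GC content and the presence of CpG islands, can be computed with a short sketch. The thresholds used here (length of at least 200 bp, GC content of at least 50%, observed/expected CpG ratio of at least 0.6) are commonly used criteria and are an assumption of this example; the settings actually used in MethPrimer are not given in the text. The toy sequence and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (n_c * n_g)


def meets_island_criteria(length, gc, obs_exp,
                          min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Check the assumed CpG-island thresholds (not the paper's settings)."""
    return length >= min_length and gc >= min_gc and obs_exp >= min_obs_exp


# Hypothetical toy sequence, NOT the PBn265 sequence.
toy = "ACGTACGTAA"
print(gc_content(toy), cpg_obs_exp(toy))

# With the values reported in the text (265 bp, 42% GC), the GC threshold
# alone is not met, whatever the observed/expected CpG ratio is.
print(meets_island_criteria(265, 0.42, 1.0))
```

Under these assumed thresholds, a 265 bp sequence with 42% GC content cannot qualify as a CpG island, which is consistent with the statement that PBn265 contains none.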
In summary, the light-induced bidirectional promoter found in this study is, as far as we know, the shortest among the natural inducible bidirectional promoters. It also has characteristics that differ from other promoters: it contains two TATA-boxes, has low GC content and lacks CpG islands. In addition, considering its promoter activity and its applicability in heterologous species, the promoter has broad application potential in plant genetic improvement and genetic engineering.

Fig. 5 The relative expression level and changing trend of GUS and EGFP in transgenic plants after darkness and relight treatment. a Relative expression level of GUS and EGFP in the root, stem and leaf of the transgenic line under continuous darkness. The EGFP/GUS relative expression level was standardized to the EGFP/GUS expression in the root treated for 12 h. Actin2 was used as an internal control. Significance analysis was conducted by independent-samples t-test. * p < 0.05; ** p < 0.01. Error bars indicate standard error based on three independent biological replicates. b Relative expression level of GUS and EGFP in the root, stem and leaf of the transgenic line after resumption of light. The EGFP/GUS relative expression level was standardized to the EGFP/GUS expression in the root treated for 12 h. Actin2 was used as an internal control. Significance analysis was conducted by independent-samples t-test. * p < 0.05; ** p < 0.01. Error bars indicate standard error based on three independent biological replicates. c The change trend of EGFP and GUS in the transgenic line after darkness and relight treatment. Continuous darkness treatment for 0, 12, 24, 36 and 48 h was followed by relight for 12, 24, 36 and 48 h. All data were normalized as the ratio of the GUS and EGFP expression level in the treatment group to that of the counterpart in the control group along the treatment time.