Unprecedented evolution and transcriptional reprogramming of CYP81E subfamily in Carthamus tinctorius during flavonoid biosynthesis

Cytochrome P450s are an important class of enzymes involved in diverse metabolic reactions that support both primary and secondary metabolism in plants. Recent advances in genome sequencing of new plant species have greatly influenced our knowledge of the evolution of gene families. Herein, we present an extensive genome-wide identification study and early experimental groundwork for the CtCYP81E subfamily extracted from the safflower genome. The evolutionary divergence and several other molecular aspects of CtCYP81E enzymes were described with the help of phylogenetic reconstruction and robust in silico analysis. A total of 15 CtCYP81E candidate enzymes were identified and clustered together with the A-type CYP71 clan of the model plant. A detailed overview of their gene structure organization, conserved signature motifs, cis-regulatory elements, GO functional categorization and protein-protein interaction network suggested novel insights into their physiological and biosynthetic implications. Further functional validation of CtCYP81E8 was performed using multiple recombinant DNA approaches combined with GFP fusion, heterologous expression, and analysis of its transcriptional regulation under normal and fluctuating environments. The transient expression system using onion epidermal cells localized the candidate protein to the cell membrane. Similarly, the biochemical assay of recombinant CtCYP81E8 protein, effectively produced during heterologous expression, verified 2,4-dimethylphenol activity over different time periods. Moreover, RNA-seq transcriptomic data and qRT-PCR analysis of the 15 CtCYP81Es at different flowering stages indicated differential expression levels, defining their potential roles during safflower metabolite biosynthesis. Consequently, the transcriptional regulation of CtCYP81E examined under various stress conditions indicated considerable susceptibility to these environmental shifts.
Furthermore, the correlation analysis of CtCYP81E8 transcription and metabolite accumulation pattern in wild and mutant safflower lines also suggested positive outcomes during flower development. These results may be helpful in determining the fundamental idea of transcriptional regulation channels that strategically turn on the secondary metabolic pool of the plant system in response to acute environmental changes.
The reverse-transcribed cDNA templates were used for subsequent qRT-PCR analysis under different stress conditions. The qRT-PCR reactions were all carried out using the Fast Real-Time PCR Method (Applied Biosystems, CA, USA) with a 20 µL final reaction volume containing 10 µL SYBR Premix Ex Taq (Tli RNaseH Plus) (TaKaRa).


Introduction
Carthamus tinctorius L., also known as safflower, is one of the economically important plants globally. The dried petals of safflower are widely used in traditional Chinese medicine against various diseases such as coronary heart disease, hypertension, gynaecological diseases, and disorders of cerebral blood flow and cerebrovascular diseases [1,2]. It is also a rich source of flavonoids, fatty acids, various phenolic compounds, and lignin products [3]. Safflower contains a striking variety of secondary metabolites, in particular flavonoids, which include carthamin chalcone glycoside, kaempferol glucosides, hydroxysafflor yellow A and B, and quercetin glucosides [4][5][6][7][8]. The economic value of safflower and its massive genomic diversity motivate genome-wide studies of gene families related to flavonoid biosynthesis [9]. However, knowledge about P450 gene clusters and their associated subfamilies involved, directly or otherwise, in flavonoid biosynthesis in safflower is still scarce.
Cytochrome P450s constitute a diverse multigene family with hundreds of genes per genome identified in 50 different plant species [10]. The molecular biology and biochemistry of cytochrome P450 have been crucial in understanding the complex relationship between enzyme structure and function, gene expression and regulation, and other catalytic reactions during plant adaptation to various environmental constraints through secondary metabolism.
They are mainly engaged in essential biosynthetic reactions producing secondary metabolites, hormones, and fatty acid conjugates, as well as in oxidative detoxification pathways in plants [11,12]. Recent studies have shown their involvement in various biosynthetic pathways including flavonoid biosynthesis, the lignan metabolic pathway, and alkaloid metabolism [13,14]. In addition, cytochrome P450 genes were found to be crucial candidates in the metabolism of, and stress resistance to, various allelochemicals in plants [15]. Although the functional identification of multiple subfamily genes of the cytochrome P450 superfamily has been successfully reported in Arabidopsis, wheat, ginseng and other plants [16][17][18][19], the comprehensive genome-wide identification and functional characterization of the CYP81E subfamily in safflower has remained unexplored. It is therefore essential to explore the complex genome diversity of safflower with a particular focus on the identification and characterization of candidate cytochrome P450 gene families involved in flavonoid biosynthesis.
In this study, the draft genome sequence of safflower available online (PRJNA399628; posted publicly to NCBI on August 23, 2017) was used to characterize CtCYP81E subfamily genes by revealing their molecular evolution, structural and functional diversity, conserved patterns of molecular regulatory factors, and diverse expression patterns under normal and stress conditions. In addition, a putative CtCYP81E8 was further characterized through multiple functional analyses, including molecular cloning, subcellular localization, prokaryotic expression and differential expression analysis during the flowering stages of safflower under fluctuating conditions. Our results not only provide a practical basis for understanding the biosynthesis, regulation and metabolic network of flavonoid metabolism in safflower, but also present fundamental experimental groundwork for further studies related to secondary metabolism and regulation. These findings may also facilitate new molecular breeding programmes for safflower varieties with high metabolite content.

Sequence retrieval and characterization of CtCYP81E subfamily in safflower
The genome assembly of Carthamus tinctorius from Jilin Agricultural University (PRJNA399628; submitted on August 23, 2017) was retrieved from the NCBI website (https://www.ncbi.nlm.nih.gov). The 246 full-length P450s of Arabidopsis, along with 26 pseudogenes, available on the TAIR website (https://www.Arabidopsis.org/), were used as queries to identify candidate CtCYP81Es in the genome assembly of Carthamus tinctorius with the local BLAST tool in BioEdit software (Ibis Biosciences, Carlsbad, CA, USA). The identified CtCYP81Es were further verified with a Pfam database search (http://pfam.xfam.org). Various physicochemical properties, including molecular weight (MW) and isoelectric point (pI), of each identified CtCYP81E protein were determined with the online webserver of ExPASy (http://www.expasy.org/). Furthermore, signal peptide analysis was performed using the SignalP 4.1 program. The amino acid sequences were aligned with DNAMAN software (Vers. 7; Lynnon Corporation, Quebec, Canada) using the preset parameters.
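As an illustration of the kind of sequence-derived descriptor that ProtParam reports, the sketch below computes the grand average of hydropathicity (GRAVY) index, which is used later in the Results, as the mean Kyte-Doolittle hydropathy value over all residues. The function name `gravy` and the example peptide are hypothetical and are not taken from the study; this is a minimal sketch, not the ExPASy implementation.

```python
# Kyte-Doolittle hydropathy scale (Kyte & Doolittle, 1982), one value per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein_seq: str) -> float:
    """Return the GRAVY index: mean hydropathy over all residues.

    Negative values indicate an overall hydrophilic protein,
    positive values an overall hydrophobic one.
    """
    seq = protein_seq.strip().upper()
    if not seq:
        raise ValueError("empty sequence")
    unknown = sorted(set(seq) - set(KYTE_DOOLITTLE))
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    example_peptide = "MKTLLVAGS"  # hypothetical peptide, for illustration only
    print(round(gravy(example_peptide), 3))
```

A negative result would classify a protein as hydrophilic, which is the sense in which the GRAVY index is interpreted for the CtCYP81E proteins.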

Phylogenetic analysis
The 15 full-length amino acid sequences of the CtCYP81E proteins obtained from the C. tinctorius genome were subjected to multiple sequence alignment using Clustal W (2.0). To analyze the evolutionary divergence and sequence homology of CtCYP81Es in comparison with the 246 full-length P450s of Arabidopsis and 26 identifiable pseudogenes, a neighbour-joining phylogenetic tree with 1000 bootstrap replicates was generated using MEGA 5 software version 4.1 (http://www.megasoftware.net/) [20]. The classification and divergence of the CtCYP81E family in safflower relative to the Arabidopsis CYPome were examined, and the sequences were clustered into several clans based on the structural and functional properties of their subfamilies.
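Distance-based tree building such as neighbour-joining starts from a matrix of pairwise distances between aligned sequences. The sketch below shows the simplest such measure, the p-distance (proportion of differing sites), which the Results mention for cluster division. Skipping gapped columns pair by pair is an assumption made here, as are the function names and the toy alignment; MEGA's own options may differ.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are ignored
    (pairwise deletion, assumed here for illustration).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def pairwise_p_distances(alignment: dict) -> dict:
    """Return {(name1, name2): p-distance} for every pair in the alignment."""
    return {
        (n1, n2): p_distance(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    # Hypothetical toy alignment, not real CtCYP81E sequences.
    toy = {"seqA": "MKTA-LV", "seqB": "MKSAGLV", "seqC": "MRSAGLI"}
    for pair, dist in pairwise_p_distances(toy).items():
        print(pair, round(dist, 3))
```

The resulting matrix is what a neighbour-joining routine would take as input; bootstrap support is then obtained by resampling alignment columns and repeating the procedure.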

Gene structure, protein motifs and promoter analysis
The exon/intron organization along the CtCYP81E genomic sequences was examined using the CDS and genomic sequences of the CtCYP81E genes with the help of GSDS (Gene Structure Display Server) (http://gsds.cbi.pku.edu.cn/index.php) according to the instructions given by [21]. The conserved protein motifs within the 15 CtCYP81E protein sequences were identified by submitting them to the MEME web server (Version 4.8.1; available at http://meme.nbcr.net/meme/cgi-bin/meme.cgi) using the default values. The graphical representation of protein motifs alongside the ML phylogenetic trees was edited in EvolView v.2 (http://www.evolgenius.info/). To investigate the cis-regulatory elements of the promoter region in the selected CtCYP81E genes of C. tinctorius, the 2 kb upstream 5′ UTR flanking sequence of each putative gene was analyzed at PLACE (https://sogo.dna.affrc.go.jp/).

Gene term enrichment
Gene ontology annotation is basically a sequence-homology-based phylogenetic tool commonly used to functionally classify a set of genes in silico. GO term analysis for the C. tinctorius CtCYP81E subfamily was performed with the help of Blast2GO (https://www.blast2go.com/) [22]. For this purpose, the full-length amino acid sequences of CtCYP81E proteins were submitted to Blast2GO for an initial BLAST search followed by mapping and annotation. The GO term annotation was classified into three classes: biological process, cellular component and molecular function.

Protein interaction network prediction
The functional protein interaction network of the putative CtCYP81E proteins was manually predicted using the online web server of the STRING database version 10 (https://string-db.org/). The online hierarchical network of interactor proteins showed the prediction of a variable group of experimental and hypothetical proteins that interact with CtCYP81E proteins during upstream and downstream regulation.

Experimental materials, vectors, and strains
The "Jihong No.1" cultivar seeds of C. tinctorius were purchased from Tacheng, Xinjiang province of China. The seeds were grown in the greenhouse station of the Engineering Research Center of Jilin Agricultural University, Changchun, China, until harvesting. Agrobacterium tumefaciens strain EHA105, E. coli BL21 and E. coli DH5α cells were used, and the prokaryotic expression vector (pET28a+-CtCYP81E8) and the subcellular localization vector (pCAMBIA1302-CtCYP81E8-GFP) were constructed and stored in a refrigerator at -80 °C with 75% glycerol until the next use.

Transcriptomic profiling and expression analysis
The transcriptomic profiling and expression analysis of CtCYP81E subfamily genes in safflower was determined using RNA-seq data (whole transcriptome shotgun sequencing) from five different tissues/organs (root, stem, seed, flower, and leaf tissue). A heatmap was created using the reads per kilobase of exon model per million mapped reads (RPKM) method. For semi-quantitative real-time PCR analysis, total RNA was extracted from the aforesaid tissues of the 4-month-old Jihong No.1 cultivar of C. tinctorius. The first-strand cDNA templates were synthesised using the reverse transcription system. The quantitative real-time PCR assay was carried out to determine the transcription levels of the 15 CtCYP81E genes using SYBR® Premix Ex Taq™ (TaKaRa). The Stratagene Mx3000P system (Stratagene, CA, USA) was employed for the semi-quantitative qRT-PCR analysis. The expression level was normalised to the 18S ribosomal RNA gene from C. tinctorius as an internal reference gene. The relative expression level of CtCYP81Es in each tissue was calculated according to the 2^(-ΔΔCT) method [23]. Each experiment was repeated in three independent biological replicates. The primers are listed in Table S1.
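The 2^(-ΔΔCT) calculation referred to above can be sketched as follows. The target Ct is first normalised to the reference gene (here the 18S rRNA gene) within each condition, and the two normalised values are then compared. The function name, argument names and Ct values are hypothetical, and the sketch assumes roughly 100% amplification efficiency for both target and reference, which is the standard assumption of this method.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of replicate Ct values. The calibrator is the
    condition against which fold change is expressed (for example a control
    tissue or the 0 h time point).
    """
    # dCt: target normalised to the reference gene within each condition.
    d_ct_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    d_ct_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    # ddCt: sample relative to calibrator.
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values from three replicates, for illustration only.
    fold = relative_expression(
        ct_target_sample=[22.1, 21.9, 22.0],
        ct_ref_sample=[15.0, 15.1, 14.9],
        ct_target_calibrator=[24.0, 24.1, 23.9],
        ct_ref_calibrator=[15.0, 15.0, 15.0],
    )
    print(round(fold, 2))
```

A value above 1 indicates higher expression in the sample than in the calibrator, and a value below 1 indicates lower expression.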

Cloning, expression analysis and subcellular localization of CtCYP81E8
Total RNA was obtained from the flower petals of the JH1 cultivar of safflower using an RNA Isoplus reagent (TIANGEN Biotech, China). The first-strand cDNA templates were prepared with the PrimeScript™ RT Reagent Kit and gDNA Eraser (TaKaRa, China). The full-length cDNA sequence of CtCYP81E8 was amplified using the primer pair CtCYP81-F1 (5′CCCATGGGATGATGAGGATGATTAGTGG3′) and CtCYP81-R1.

Correlation analysis of CtCYP81E8 transcription and total metabolite accumulation in wild and mutant safflower
A comparative analysis between wild and mutant safflower lines was used to unveil the relationship of CtCYP81E8 mRNA transcription with the accumulation of total metabolite content in red-type and yellow-type flowers. For this purpose, the flower petals of both red-type and yellow-type flowers were collected at four different flowering stages (bud, initial, full, and fade stage) in the wild safflower line under similar conditions. However, the mutant line was only tested at the bud, initial, and full flowering stages, as it lacks the fading flower phenotype due to mutation. The petals of each replicate were transferred to liquid nitrogen immediately after collection in separate pre-labelled tubes. The experimental material of each flower type and safflower line was simultaneously subjected to RNA extraction and total metabolite extraction. After total RNA extraction, cDNA templates were synthesized using reverse transcription PCR. The qRT-PCR assays were conducted with the previously described system to determine the transcription level of CtCYP81E8 for each flower type and developmental phase of the wild and mutant lines. All experiments were conducted in three independent replicates at each growth stage, and the results were analyzed according to the 2^(-ΔΔCt) method. The 18S ribosomal RNA gene (GenBank accession: AY703484.1) was used as a reference.
At the same time and under the same conditions, the remaining homogeneous mixture of the flower petals was immersed in 14 mL water-alcohol solution for ultrasonic extraction of the metabolites under controlled conditions: an extraction temperature of 60 °C, two extraction periods of 30 min, and centrifugation at 5000 rpm for 10 min. A 0.5 mL specimen (1 mg/mL) was mixed with 10% aluminium chloride and 1 M potassium acetate together with 80% methanol solution.
The absorbance was measured spectrophotometrically at a wavelength of 415 nm. The total flavonoid content (TFC) was expressed as milligrams of TFC per 100 grams of fresh weight or dry weight. Three biological replicates (n = 3) were used to minimize the risk of possible error.

Prokaryotic expression and in vitro DMP activity of CtCYP81E8
The full-length CtCYP81E8 cDNA was amplified with Pfu DNA polymerase (Takara) using a different set of primers: CtCYP81PE-F (5′CGGATCCGATGATGAGGATGATTAGTGG3′) with an added BamHI (GGATCC) site and CtCYP81PE-R (5′TCAAAGATGCGATAATAGATTTGGAATTCC3′) with an added EcoRI (GAATTC) site. The recombinant vector was constructed by double restriction digestion of CtCYP81E8 and the pET28a+ vector. Subsequently, CtCYP81E8 was ligated into the corresponding BamHI and EcoRI restriction sites of the empty pET28a+ vector with T4 ligase. The recombinant vector (pET28a+-CtCYP81E8) was transformed into E. coli BL21 cells. The CtCYP81E8 protein was effectively induced and then expressed. The BL21 cells transformed with pET28a+-CtCYP81E8 were grown in LB medium (500 mL) supplemented with 50 mg/L kanamycin. The bacterial cells were harvested at a density of 2 × 10^8 cells/mL (A600 = 1.0), followed by sonication with 0.4 mM isopropyl β-D-1-thiogalactopyranoside (IPTG) under controlled conditions. The supernatants were discarded, and only the pellet was resuspended in PBS buffer, followed by centrifugation at 12,000 × g for 10 min at 4 °C. Lysis buffer was used to collect the bacterial cells. Ultrasonication was then performed in three intervals of 15 s until soluble fractions were obtained. The soluble CtCYP81E8 protein product was separated on 12% SDS-PAGE, the expected bands were stained using Coomassie brilliant blue, and the expected protein product was purified using the western blot hybridization method. Three independent biological and two technical replicates were analyzed for all measurements. In addition, a 2,4-dimethylphenol (DMP) activity test was employed to check the in vitro activity of CtCYP81E8 by measuring the dissolved oxygen concentration of the reaction mixture at various time periods.
A mixture of 5 µL hydrogen peroxide, 30 µL enzyme solution, 20 µL of 100 mM DMP and 145 µL citrate buffer solution was combined in the enzyme label strip and allowed to react fully for 5 minutes, after which the OD278 values were measured. The other conditions included a shaker incubator fixed at 28 °C, time points of 0, 12, 24, 36, 48, 60, and 72 h, and centrifugation at 10,000 rpm at 4 °C for 5 minutes, repeated twice.
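Because directional cloning depends on which restriction site each primer carries, it can be useful to scan the primer sequences for the recognition sequences of the two enzymes named above. The sketch below does this with plain string search; the helper name `find_sites` is hypothetical. Both GGATCC and GAATTC are palindromic, so searching one strand is sufficient.

```python
# Recognition sequences of the two enzymes named in the cloning strategy.
RECOGNITION_SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# Primer sequences as printed in the text (5' to 3').
CTCYP81PE_F = "CGGATCCGATGATGAGGATGATTAGTGG"
CTCYP81PE_R = "TCAAAGATGCGATAATAGATTTGGAATTCC"


def find_sites(primer: str) -> dict:
    """Return {enzyme: [0-based start positions]} for every site found."""
    seq = primer.upper()
    hits = {}
    for enzyme, site in RECOGNITION_SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            # Continue one base further so overlapping sites are not missed.
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


if __name__ == "__main__":
    for name, primer in (("CtCYP81PE-F", CTCYP81PE_F),
                         ("CtCYP81PE-R", CTCYP81PE_R)):
        print(name, find_sites(primer))
```

Run on the two primers as printed, the scan finds GGATCC (BamHI) in CtCYP81PE-F and GAATTC (EcoRI) in CtCYP81PE-R.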

Identification and physicochemical properties of CtCYP81E subfamily in safflower
In total, 15 full-length safflower CtCYP81E genes with C. tinctorius location markers were extracted from the draft genome database, selected on the basis of the P450 Pfam00067 domain. In order to further classify CtCYP81E genes into subfamilies, we carried out an extensive comparative analysis with all P450 genes of the Arabidopsis genome using Phytozome 4.0. A total of 246 P450s and 26 pseudogenes were retrieved for further characterization. In addition, the physicochemical properties were determined with the help of the ProtParam online tool (Table 1). The size of the 15 CtCYP81E-encoded proteins ranged from 115 to 516 amino acids. The theoretical molecular weight ranged from 12.97 kDa (CtCYP81E9) to 59.01 kDa (CtCYP81E14), with an average value of 45.59 kDa. The isoelectric points (pI) ranged from 4.74 (CtCYP81E9) to 9.35 (CtCYP81E10). Furthermore, the grand average of hydropathicity (GRAVY) index revealed that most of the CtCYP81E proteins were hydrophilic in nature. Of all proteins, the most stable was CtCYP81E4, with a stability index of 40.66.
Table 1 The physicochemical properties of safflower CYP81E subfamily proteins. The data were collected using the online tool of ExPASy (available online: http://web.expasy.org/protparam/).
(Table S2). The division of cluster groups between these two plant species by means of the p-distance method determined their phylogenetic origin using the MEGA-X package [25]. The P450 genes were initially divided into two widely known classes, A-type and non-A-type P450 sequences. These two clades were further subdivided into nine different clans: clan71, clan51, clan710, clan85, clan711, clan86, clan97, clan72, and clan74 (Fig. 1).
The CYP71 clan was found to be the largest A-type class, containing 131 genes (48.51%) divided into 10 further subfamilies such as CYP71AH, CYP71AT, CYP71AU, CYP71AX, CYP71D, CYP71BE, CYP71BG, CYP71BL, CYP71BN and CYP71BP. Most of these CYP71 clan subfamilies represent plant-specific enzymes that are involved in secondary metabolic reactions, mainly in flavonoid biosynthesis. Our phylogenetic analysis indicated that the CtCYP81E gene family of safflower consistently clustered with the CYP71 clan of Arabidopsis. Hence, it is possible that the CtCYP81E subfamily in safflower is involved in flavonoid biosynthesis, as found for the CYP71 clan of the model plant [26,27].

Analysis of CtCYP81E gene structure, motifs and promoter
A detailed analysis of the CtCYP81E subfamily, including gene organization, signature motifs and cis-regulatory elements, was carried out. As shown in Fig. 2, most of the CtCYP81E genes (Table S3) shared a common exon/intron organization. Nonetheless, a small number of CtCYP81E genes did not show the basic gene structure. For instance, CtCYP81E6 and CtCYP81E1 both consist of a short exon and a long intron when compared with the other candidate genes in the same family. Our findings were consistent with the CYP450 organization in A. thaliana [28], in which the number of exons in the CYP71 clan ranged between 2 and 5. In addition, the shortest exon (27 bp) in the CYP71 clan (CYP71B32) of Arabidopsis was longer than the shortest exon of CtCYP81E6 (16 bp) in safflower. Safflower CYP81E genes contain 1-3 exons, as shown in Fig. 2B. Most of these genes had two exons (60%, 9/15), followed by three (26.6%, 4/15) and one (13.3%, 2/15). The length of the introns in safflower P450 genes ranged from 126 to 5946 bp, in line with the results for A. thaliana [28] and C. elegans [29].
Next, all 15 CtCYP81E genes from safflower were analyzed to identify naturally conserved protein motifs. The multiple sequence alignment of these proteins demonstrated that almost all 15 candidate safflower P450 genes contained the basic signature motifs of the P450 family, such as the heme-binding region, PERF region, K-helix region, and I-helix motif (Fig. S1). A total of 10 conserved motifs of CtCYP81E proteins were found to be consistent with those of the Arabidopsis P450s in the MEME online investigation. The output implied that nearly all candidate CtCYP81E proteins contained these conserved motifs, excluding CtCYP81E4, CtCYP81E8, CtCYP81E9 and CtCYP81E11, which showed a different conservation pattern. In addition, the position of a few conserved motifs in CtCYP81E was not aligned with that in the Arabidopsis P450s. The sequences and organization of these signature motifs of CtCYP81E proteins are shown in Fig. 2C. These results suggested that CtCYP81Es in safflower inherited basic structural and functional domains during the process of evolution. Moreover, the organization of cis-regulatory elements of these CtCYP81Es, specifically in the 2 kb 5′ flanking region upstream of the start codon (Table S4), was thoroughly analyzed. Altogether, six major types of cis-elements were found in the 2 kb 5′ flanking region of the promoter (Fig. 2D). Among these gene promoters, a few CtCYP81E members contain endosperm
The functional categorization of the 15 CtCYP81E transcripts in safflower was carried out using GO analysis. The results of the in silico classification revealed that all 15 CtCYP81E transcripts were allocated to one or more GO terms. These CtCYP81E transcripts were found in all three fundamental functional categories: biological process (BP), molecular function (MF), and cellular component (CC).
Moreover, eight functional subcategories were identified at the next level: two CC subcategories (integral component of membrane and membrane); five MF subcategories (oxidoreductase activity, iron ion binding, heme binding, metal binding and isoflavone 2'-hydroxylase activity); and one BP subcategory (oxidation-reduction process) (Fig. 3). Despite the fact that CtCYP81E genes belong to type A, as previously reported, there is no significant difference in the functional annotation of type A and non-type A P450 sequences [30].

Protein Clustering Networks
The monooxygenases (cytochrome P450) play important roles in xenobiotic metabolism and in the biosynthesis of internal nutrients such as flavonoids, vitamins, steroids, hormones, and fatty acids. The capability of P450-encoded enzymes to catalyze important substrates involves interaction with their redox protein counterparts. This biochemical catalysis can be altered in association with the membrane-bound heme protein cytochrome b5 [31]. With the help of AtCYP81E orthologs, we systematically predicted the PPI network of the CtCYP81E subfamily in safflower. We identified 10 widely spread proteins co-associated with these CtCYP81Es, which include translocation (2), membrane lipoprotein (1), aquaporin-like (1), ABC transporter (1) and protein kinase (1) (Fig. 4).
The independent interactor protein networks indicated that CtCYP81E proteins (1, 2, 3, 5, 6, 7, 9 and 15) interact with the UGT74E2 protein, which is mainly involved in the biosynthesis of IBA (indole-3-butyric acid) and directly influences auxin homeostasis. Additionally, CtCYP81E proteins (1, 2, 3, 5, 6, 7, 9 and 15) work together with the AT5G25930 protein, which is largely associated with protein amino acid phosphorylation. The CtCYP81E proteins (4, 8 and 11) interact with the AT5G48605 protein and can enhance the plant defense mechanism. Besides this, CtCYP81E proteins (4, 8 and 11) interact with the AT1G59660 protein, which acts as a key regulator in the water channel. Notably, we found that the other CtCYP81E orthologs, except CtCYP81E proteins (4, 8 and 11), interact with ABCD1 proteins and may be involved in the transportation mechanism. The PPI network of CtCYP81E orthologs highlights their potential roles in several physiological and biosynthetic processes that occur simultaneously in plant systems.

Expression analysis and functional annotation of CtCYP81E subfamily genes
Expression levels of P450 variants in safflower were initially determined with the help of RNA-seq data (whole transcriptome shotgun sequencing) from five selected tissues/organs (root, stem, seed, flower and leaf). The expression level was calculated with the reads per kilobase of exon model per million mapped reads (RPKM) method, following the instructions given by Mortazavi et al. (2008). The RNA-seq data were obtained from the safflower genome database (PRJNA399628; posted to NCBI on August 23, 2017). In general, the expression signals of almost all selected safflower P450 genes were detected in all organs, but with different patterns. As shown in Fig. 5A, the expressed P450 genes in safflower were clustered into five groups, G1 (6.6%, 1/15), G2 (6.6%, 1/15), G3 (33.3%, 5/15), G4 (33.3%, 5/15), and G5 (13.3%, 2/15), which were preferentially expressed in the leaves, stems, seeds, flowers, and roots, respectively. Furthermore, to validate the transcript abundance of CtCYP81E genes and their correlation with biosynthetic processes, we carried out qRT-PCR analysis of these 15 genes at different flowering stages (bud, initial, flower and fading stage).
Across the CtCYP81E subfamily, the transcripts of CtCYP81E2, CtCYP81E8, and CtCYP81E15 were abundantly detected at the flower stage, indicating that there might be a strong link between the transcriptional regulation of CtCYP81E genes and cellular metabolism in safflower. In addition, the transcripts of CtCYP81E1, CtCYP81E2, CtCYP81E5, and CtCYP81E7 were highly expressed at the fading stage of flowering, suggesting transcriptional regulation of these genes at a later flower developmental period. The transcripts of CtCYP81E14 and CtCYP81E15 showed high expression levels at the initial flowering stage of safflower (Fig. 5B). Altogether, the qRT-PCR assay suggested a differential expression pattern and fold-change values of the selected CtCYP81E subfamily genes, highlighting their decisive roles in plant defense systems and developmental processes.
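The RPKM normalisation used for the RNA-seq heatmap in this section scales a gene's read count by both transcript length and sequencing depth. A minimal sketch of the formula follows; the function name and the example numbers are hypothetical and do not come from the safflower dataset.

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon model per million mapped reads.

    RPKM = reads * 10^9 / (exon length in bp * total mapped reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and library size must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


if __name__ == "__main__":
    # Hypothetical gene: 500 reads on a 2000 bp exon model,
    # in a library of 10 million mapped reads.
    print(rpkm(500, 2000, 10_000_000))
```

Because RPKM corrects for gene length, values for short and long CtCYP81E transcripts can be compared within a tissue, and the depth correction allows comparison between tissues sequenced to different depths.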

Subcellular localization and transcriptional regulation of CtCYP81E8
Based on our previous study on CtCYP82G24 [32], we aimed to investigate the correlation between the quantitative expression trend of the CtCYP81E8 gene and metabolite accumulation at different flowering stages of safflower. As shown in Fig. 6A and B, the expression level of CtCYP81E8 was consistent with the accumulation rate of total metabolite content in safflower petals. These findings provide a practical basis for the functional characterization of CtCYP81E8, which could be a crucial modulator in the flavonoid biosynthetic pathway in safflower. Taking into consideration the functional importance and differential expression pattern of the CtCYP81E8 gene, we cloned the full-length sequence of CtCYP81E8 from safflower (Fig. 6C) and then constructed a fusion vector of CtCYP81E8 and the GFP gene under the control of the 35S promoter (pCAMBIA1302-CtCYP81E8-GFP-35S) in order to determine the subcellular localization experimentally. After construction of the plant overexpression fusion vector, the recombinant vector was transiently transformed into onion epidermal cells through an Agrobacterium-mediated transformation system. Fluorescence imaging of onion epidermal cells expressing CtCYP81E8-GFP showed cell membrane localization (Fig. 6D). These findings provide evidence to support the assumption that CtCYP81E is able to catalyze cell-based biological reactions occurring in Carthamus tinctorius.

The induction of CtCYP81E8 transcription under variable stress conditions
The transcriptional regulatory network of CtCYP81E8 mRNA under variable stress environments has been demonstrated to con rm the underline notion of Cytochrome P450s involvement during a variety of plant secondary metabolites biosynthesis (Mizutani) Ohata 2010; Nelson, Werck-Reichhart 2011). By exploiting the temporal transcriptional regulatory channels of CtCYP81E8 under arti cial environmental switches, we demonstrated a multiregulation control system using qRT-PCR assays. The treatment group with methyl jasmonate at 0-12 hours, compared with the control group, the expression level of CtCYP81E8 gene showed an upward trend, among which the expression of CtCYP81E8 showed a unique increase at 8 h where, the transcription level was reached to its maximum. In contrast, at 12 h timepoint, the expression decreased signi cantly (Fig. 7A). Under drought stress conditions, the CtCYP81E8 gene expression was signi cantly induced at 4-8 h than the control plants. The expression level was reached to its maximum at 8 h timepoint under PEG induced stress however, the transcription of CtCYP81E8 was down-regulated at 12 h treatment times (Fig. 7B). Under strong light irradiation, the gene expression level of CtCYP81E8 at different treatment times 12-60 h was surprisingly down-regulated compared with control plants. The down-regulation was most signi cant at 36 h treatment time indicating intense susceptibility towards light stress (Fig. 7C). The transcription level of this gene after dark treatment was expectedly upregulated in all treatment times reaching to the maximum at 36 h. In general, the expression level was consistently rising from 12-36 h, and suddenly drops sharply after 48 h, but the overall expression level is up-regulated compared to the control group (Fig. 7D). 
Taken together, these findings reveal a multi-dimensional, periodic regulatory network controlling CtCYP81E8 transcription under different stress conditions, and highlight crucial features of the molecular regulation underlying plant adaptation to biotic and abiotic stress.
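The text reports relative expression from qRT-PCR assays but does not state how it was calculated. As a minimal sketch, assuming the widely used 2^-ΔΔCt approach with a single reference gene (an assumption, not something confirmed by this study), fold change and the peak time point could be computed as follows. All Ct values and function names are hypothetical placeholders, not measured data.

```python
def fold_change(ct_target, ct_reference, ct_target_ctrl, ct_reference_ctrl):
    """Relative expression by the 2^-ddCt method (assumed here, not stated in the source)."""
    delta_ct_treated = ct_target - ct_reference            # normalize treated sample
    delta_ct_control = ct_target_ctrl - ct_reference_ctrl  # normalize control sample
    return 2.0 ** -(delta_ct_treated - delta_ct_control)


def peak_timepoint(fold_changes):
    """Return the time point (h) with the highest fold change."""
    return max(fold_changes, key=fold_changes.get)


# Placeholder Ct values (target gene, reference gene) per time point; NOT measured data.
treated = {4: (23.0, 18.0), 8: (21.5, 18.0), 12: (24.0, 18.0)}
control = (24.0, 18.0)

fold_changes = {
    hours: fold_change(ct_t, ct_r, control[0], control[1])
    for hours, (ct_t, ct_r) in treated.items()
}
print(fold_changes, "peak at", peak_timepoint(fold_changes), "h")
```

With real data, each Ct would normally be the mean of technical replicates, and the control would be sampled at the same time point as the treated plants.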

Transcriptional regulation of CtCYP81E8 overlapping with flavonoid accumulation in wild and mutant safflower
The correlation between the transcription level of CtCYP81E8 and the accumulation pattern of total metabolite content across multiple flower developmental stages of wild and mutant safflower varieties was investigated using qRT-PCR assays. In parallel, the accumulated total metabolite content was measured at the same stages in the two naturally occurring flower types of safflower, red and yellow.
Interestingly, the expression profile of CtCYP81E8 followed a corresponding programmed pattern across the flowering developmental phases in both wild and mutant safflower (Fig. 8A and B). Across the red flowering developmental stages, except at the bud flowering phase (R1), the increasing trend of the CtCYP81E8 transcript coincided with the accumulation of total metabolite content in red-type wild safflower. In the same way, yellow-type wild safflower showed a consistent increase in metabolite content with increasing CtCYP81E8 expression, excluding the bud flowering stage (Y1). This pattern was further examined by a similar analysis in the mutant safflower line, which suggested a broadly similar correlation; as in wild-type safflower, an opposite trend was also found, but at a different flower developmental stage. The accumulation of safflower metabolites increased significantly with the increase in the transcription level of CtCYP81E8 during all three flowering stages of the red-type mutant safflower. Nonetheless, a discontinuous pattern of metabolite accumulation was observed in the white-type mutant safflower at the bud flowering (M5) and full flowering (M7) stages, with the exception of the initial flowering phase (M6), indicating a reverse order in comparison with the corresponding flower stages. As mentioned earlier, the expression level of CtCYP81E8 responded significantly to different stress conditions, consistent with the concept of secondary metabolic activation in plants under variable abiotic stress. In conclusion, these results suggest that the transcriptional regulation of CtCYP81E8 is related to the metabolite accumulation profile of different safflower varieties.
Although tentative, these findings could be crucial for understanding the molecular regulatory signals that switch on secondary metabolic flux through a series of genetic and, in particular, transcriptional events that help ensure plant survival under acute environmental change.
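The relationship described above between transcript level and metabolite content across flowering stages can be quantified with a correlation coefficient. The following is a minimal sketch using Pearson's r; the choice of statistic, the function name, and the values are assumptions for illustration only and are not taken from Fig. 8.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return covariance / math.sqrt(var_x * var_y)


# Placeholder values for three flowering stages; NOT data from this study.
relative_expression = [1.0, 2.1, 3.9]   # e.g. bud, initial, full flowering
metabolite_content = [0.8, 1.9, 4.2]    # arbitrary units

print("r =", round(pearson_r(relative_expression, metabolite_content), 3))
```

With only three stages per line, such a coefficient is descriptive rather than statistically conclusive, which matches the cautious interpretation given in the text.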

Heterologous Expression and in vitro enzymatic assay of CtCYP81E8
To validate the potential function of CtCYP81E8 in vitro, the full-length cDNA of CtCYP81E8 was cloned into the prokaryotic expression vector pET28a(+), transformed into E. coli BL21(DE3) cells by heat-shock and electroporation, and then induced with different concentrations of IPTG. Bacterial cultures without IPTG induction and cultures carrying the empty vector were used as controls. The heterologous expression of recombinant CtCYP81E8 protein was detected by Coomassie blue-stained SDS-PAGE and Western blot hybridization. SDS-PAGE analysis showed that the recombinant CtCYP81E8 protein was expressed at approximately 36.4 kDa, but the concentration of IPTG did not appear to affect protein expression (Fig. 9A). We then purified the recombinant CtCYP81E8 protein and further examined its expression at different IPTG concentrations by Western blot hybridization. As shown in Fig. 9B, we found a single target band and a change in protein expression with the concentration of IPTG used for induction. From these findings, we deduced that the CtCYP81E8 target protein was stably detected by SDS-PAGE and by Western blot hybridization on nylon membrane. Moreover, the product size was consistent with the theoretical molecular weight of the CtCYP81E8 protein (36.4 kDa), suggesting that the target protein was efficiently expressed in the prokaryotic system, although different concentrations of IPTG could potentially affect its expression level [16].
The DMP (2,4-dimethylphenol) assay was designed to explore the consumption of oxygen by the recombinant CtCYP81E8 enzyme during the oxidation of 100 mM DMP, measured by the direct absorbance method over different time periods. In the present study, with hydrogen peroxide added to the reaction, the highest removal of 100 mM DMP in the CtCYP81E8 batch (60%) was detected at 24 h, followed by 48 h, compared with the control group (Fig. 10). At 12 h and 72 h the removal of 100 mM DMP was lower, in the range of 15-40%, and after 36 h of reaction the removal of DMP reached almost 40%. These findings suggest a link between insufficient oxygen uptake in a uniform reaction batch and the poor removal efficiency. The results also suggested that the reaction rate was independent of dissolved oxygen at the start of the reaction, as long as enough oxygen was present. Once the dissolved oxygen had been largely consumed, however, the reaction became dependent on the addition of exogenous oxygen. Hence, further work is needed to achieve more efficient removal of DMP in the in vitro activity assay.
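The removal percentages reported above come from a direct absorbance measurement. As a hedged sketch, assuming that absorbance is proportional to the remaining DMP concentration and that removal is expressed relative to the starting absorbance (the exact formula is not given in the text), the calculation could look like this. The absorbance readings are hypothetical placeholders chosen only to illustrate the arithmetic.

```python
def percent_removal(initial_absorbance, absorbance_at_t):
    """Percentage of substrate removed, assuming absorbance scales with concentration."""
    if initial_absorbance <= 0:
        raise ValueError("initial absorbance must be positive")
    return (initial_absorbance - absorbance_at_t) / initial_absorbance * 100.0


# Placeholder readings (time in h -> absorbance); NOT measured values from Fig. 10.
initial = 1.00
readings = {12: 0.85, 24: 0.40, 36: 0.60, 48: 0.50, 72: 0.75}

removal = {hours: percent_removal(initial, a) for hours, a in readings.items()}
best_time = max(removal, key=removal.get)
print({h: round(r, 1) for h, r in removal.items()}, "highest removal at", best_time, "h")
```

In practice the same calculation would be applied to the control batch so that non-enzymatic loss of DMP can be subtracted.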

Evolutionary classification of CtCYP81E genes in safflower
The complete genome sequencing of the model plant Arabidopsis (Arabidopsis Genome Initiative, 2000) [33] has broadened genome-wide identification studies of functionally important gene families. Among these, a striking one already characterized in Arabidopsis is the cytochrome P450 monooxygenase family, the largest enzyme-encoding gene family.
In this study, we also conducted a comparative genome-wide study of the putative CtCYP81E subfamily in safflower with Arabidopsis P450s to investigate their ancestral relationship, based on the annotation of similar clans of related P450 families. The 15 CtCYP81E enzyme-encoding sequences clustered within CYP71, the largest A-type clan, supporting our hypothesis, as most CYP71 subfamilies represent enzymes involved in secondary metabolite biosynthesis. The CYP71 clan was the largest A-type class, containing 131 genes (48.51%) subdivided into 9 groups and, with the inclusion of the new safflower CtCYP81E sequences, 10 groups in total, such as CYP71AH, CYP71AT, CYP71AU, CYP71AX, CYP71D, CYP71BE, CYP71BG, CYP71BL, CYP71BN and CYP71BP (Fig. 1). Our phylogenetic reconstructions revealed that four clans, namely clan51, clan710, clan711 and clan74, are single-family clans, whereas the remaining five clans cover various other families of P450 genes [34,35]. The CYP72 clan contains eight subfamilies and is the largest non-A family, comprising 20 genes (13.70%). The CYP72 clan was further categorized into two subgroups, CYP72A and CYP72D. The non-A-type clans account for 139 of the 270 sequences and include enzymes participating in the biosynthesis of primary and secondary metabolic compounds in plants, for example sterols, fatty acids, hormones and other signaling molecules [36]. Lastly, the CYP74 clan contains four different subgroups and was designated as the outgroup in our phylogenetic tree because it is an atypical plant P450 clan that does not have monooxygenase activity.

Hierarchy of gene structure, conserved motifs and cis-regulatory classification of CtCYP81E genes in safflower
Intron-exon organization and the pattern of intron gain and loss highlight the evolutionary mechanism of gene families that fall within the same phylogenetic clade. Conserved intron organization can provide ancient markers for understanding groups of similar genes involved in multiple physiological processes of plants [37]. In our analysis, the safflower CtCYP81E subfamily genes mostly contained 1-3 exons and two introns, comparable with the clan71 gene families of the Arabidopsis and mulberry genomes [28,38] (Fig. 2B). In addition, the two conserved introns were not detected in the non-A-type Arabidopsis P450 gene families, in contrast to the A-type clan71 gene families, indicating a significant course of intron evolution during the organization of gene families [39,40]. Cytochrome P450s show comparatively variable patterns of amino acid conservation [41]; however, the universal topology of the secondary and tertiary structures and the basic signature regions/motifs remain consistent throughout the plant kingdom. In this study, the safflower CtCYP81E subfamily also revealed the five well-known P450 motifs: the heme-binding region (PFxxGxRxCxG/A), the C-helix (WxxxR), the PERF motif (PxxFxPE/DR), the K-helix (ExLR), and the I-helix motif (GxE/DTT/S) [36,42].
Furthermore, specific amino acids were consistently conserved throughout the CtCYP81E subfamily in safflower (Fig. 2C): tryptophan (W) and arginine (R) in the C-helix motif; glycine (G) and threonine (T) in the I-helix motif; phenylalanine (F), glycine (G), arginine (R) and cysteine (C) within the heme-binding signature; glutamic acid (E) and arginine (R) in the widespread K-helix region; and proline (P) residues within the PERF motif.
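The five signature motifs listed above can be searched for in a protein sequence with regular expressions. The sketch below transcribes the consensus patterns exactly as written in the text, assuming that "x" stands for any residue and that "A/B" means either residue at that position. The function name and the test sequence are hypothetical; the sequence is synthetic and is not a real CtCYP81E protein.

```python
import re

# Consensus patterns from the text, rewritten as regular expressions
# ("x" -> ".", "G/A" -> "[GA]", "E/D" -> "[ED]", "T/S" -> "[TS]").
P450_MOTIFS = {
    "heme-binding": r"PF..G.R.C.[GA]",   # PFxxGxRxCxG/A
    "C-helix": r"W...R",                 # WxxxR
    "PERF": r"P..F.P[ED]R",              # PxxFxPE/DR
    "K-helix": r"E.LR",                  # ExLR
    "I-helix": r"G.[ED]T[TS]",           # GxE/DTT/S
}


def find_motifs(protein_sequence):
    """Return {motif name: [(1-based start, matched residues), ...]}."""
    sequence = protein_sequence.upper()
    hits = {}
    for name, pattern in P450_MOTIFS.items():
        # finditer reports non-overlapping matches, scanning left to right
        hits[name] = [(m.start() + 1, m.group()) for m in re.finditer(pattern, sequence)]
    return hits


# Synthetic sequence built to contain one copy of each motif; NOT a real protein.
demo_sequence = "MAWKLLRAAGAETTSSESLRKKPDDFRPERAAPFGAGRRICPGL"

for motif_name, matches in find_motifs(demo_sequence).items():
    print(motif_name, matches)
```

Short patterns such as WxxxR and ExLR will match by chance in many proteins, so hits are only meaningful when they fall in the expected order and region of the sequence.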
Moreover, the conservation and distribution of cis-regulatory elements in CtCYP81E promoters revealed widely known stress-responsive regulatory units, including low-temperature responsive elements (P-box; TATC-box), the abscisic acid-responsive element (ABRE) [43], the dehydration/drought-responsive element (DRE) [44], hormone-responsive elements (methyl jasmonate, MeJA), and the C-repeat [45] (Fig. 2D). The conservation of such important cis-regulatory units within the promoter regions of several gene clusters, such as the NAM, ATAF, and CUC (NAC) genes [46], has suggested their potential for stress tolerance under extreme climatic changes. The RNA-seq results for CtCYP81E expression showed a slightly up-regulated pattern in flower petals, and the promoter regions contained overrepresented endosperm expression elements (AACA_motif; GCN4_motif) and TCP transcription factor elements, which are well studied in flower development and in different hormone biosynthesis reactions [47][48][49]. Therefore, CtCYP81E subfamily genes could be crucial candidate genes in safflower for floral development and secondary metabolism. In conclusion, the overall survey of gene structure, together with the widely identified conserved motifs and common cis-regulatory units of the CtCYP81E candidate subfamily of safflower, demonstrated a unique evolutionary pattern, which further emphasizes the functional dynamics of these genes during plant adaptation to various stress responses and in other crucial secondary biosynthetic pathways.

Transcriptional regulation of CtCYP81E genes during floral development in safflower
Gene expression at the transcriptional level depends strongly on the phase of plant growth and development, age, environmental influences, degree of expression, tissue variability, and various biotic and abiotic stress responses. In the present study, we also investigated the underlying molecular regulatory network of CtCYP81E genes at the transcriptional level. The RPKM data obtained from RNA sequencing suggested distinct expression profiles of CtCYP450 genes in safflower, which clustered into five main groups, G1 (6.6%, 1/15), G2 (6.6%, 1/15), G3 (33.3%, 5/15), G4 (33.3%, 5/15), and G5 (13.3%, 2/15), detected in various tissues/organs including leaves, stem, seed, flowers, and root, respectively (Fig. 5A). These results are compatible with those of Vasu et al. (2019), who described that 31.33% of Solanum lycopersicum P450 genes show differential expression profiles across the development of different tissues/organs. Our findings are also close to those reported for soybean (31.92%) [19], mulberry (23.6%) [38], and rice (49.81%) [50] P450 genes, in which five major groups of expression patterns were found. Furthermore, the transcriptional regulation of CtCYP81E8 at different flowering stages of safflower under normal and challenging climate conditions was examined to uncover its molecular regulatory channels during plant adaptation through the activation of secondary metabolic signals. Our results show the regulation of CtCYP81E8 mRNA abundance through multiple flower developmental phases in safflower during temporal exposure to hormonal (MeJA) treatment and various abiotic stress conditions, including drought, light and dark environments (Fig. 7). The specific outcome of this study is a multifaceted and periodic regulation of CtCYP81E8 transcription during the developmental phases of flower tissues encountering a variety of climate shifts, emphasizing essential elements of plant adaptation to biotic and abiotic stimuli.
These findings are supported by our recent study on CtCYP82G24 [32], which outlined the correlation between the quantitative gene expression trend and the metabolite accumulation pattern under abiotic stress in transgenic Arabidopsis. In addition, we also reported the up-regulation of CtCYP82C1 transcription under artificial hormonal and abiotic stress exposure [51]. Although these findings provide a practical basis for the functional characterization of CtCYP81E8, with a particular focus on flavonoid biosynthesis in safflower, more robust approaches are required to screen appropriate candidate genes from the large repertoire of the cytochrome P450 supergene family in safflower for extended functional characterization studies.

Functional dynamics of CtCYP81E8 and heterologous expression
Cytochrome P450 is the largest monooxygenase superfamily and is found in all forms of life; in plants, however, this multigene family is mainly associated with distinct metabolic pathways across different developmental stages. So far, the primary functional characterization of the CYP71 clan subfamilies has linked them to the shikimate biosynthetic pathway [52][53][54]. In particular, P450 gene clusters involved in flavonoid and alkaloid biosynthetic pathways have also been investigated in other plants, for instance the CYP73A and CYP93 subfamily genes [55]. The catalytic activity of P450s shares a common oxidative and reductive mechanism. However, to fully understand the wide range of substrate specificity of novel P450 enzyme-encoding sequences, the development of a stable heterologous expression system remains challenging. The advantages of bacterial P450 expression systems over others, which provide fast and robust expression, have been comprehensively summarized in [56]. This study likewise established an efficient and stable expression system for the in vitro production of the candidate CtCYP81E8 protein using the prokaryotic machinery of a bacterial system (Fig. 9). Our findings suggested efficient heterologous expression of the putative CtCYP81E8 protein, resulting in a protein product of 36.4 kDa. Induction of the putative CtCYP81E8 protein with IPTG was stable; however, variable IPTG concentrations could potentially influence the expression level of the target protein [16]. In addition, the DMP in vitro activity assay was employed to relate the biochemical aspects of CtCYP81E8 reactions to their actual physiological and biosynthetic pathways, which have not yet been revealed (Fig. 10).

Conclusion
Since the identification of the cytochrome P450 superfamily, considerable knowledge has been reported about the unique biology of this special category of hemoprotein; however, it remains essential to learn more about the molecular properties of newly identified P450s, particularly their structural and functional differences. Such properties can alter the functional dynamics of specific P450s and may explain new roles that P450s play beyond their typical functions. In this study, we proposed a comprehensive structural and functional model of a putative member of the CtCYP81E subfamily in safflower, providing crucial insights into the regulation of the adaptive mechanism of plant secondary metabolites through a number of genetic, transcriptional and biochemical events. Although tentative, these findings contribute a considerable amount of information towards the partial functional identification of the putative CtCYP81E8 member in safflower. Further efforts are, however, still required to understand the overall biological nature of the CtCYP81E subfamily with respect to its natural and functional diversity.

Declarations
Ethics approval and consent to participate
Not applicable

Consent for publication
Not applicable

Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.