Transcriptomic and Targeted Metabolomic Analysis Identifies Genes Controlling for Early Bolting and Flowering in Angelica sinensis

Background: The root of the perennial herb Angelica sinensis is a widely used source for traditional Chinese medicines. While the plant thrives in cool, moist regions of western China, early bolting and flowering (EBF) of young plants significantly reduces root quality and yield. Approaches to inhibit EBF through changes in physiology during the vernalization process have been investigated; however, the mechanism that activates EBF has not been identified. Here, transcript profiles for bolted and unbolted plants (BP and UBP, respectively) are compared. Results: A total of over 72,000 unigenes were detected, with ca. 2,600 differentially expressed genes (DEGs) observed in the BP compared with the UBP. While various signaling pathways participate in flower induction, it is genes associated with floral development and the sucrose pathway that are observed to be coordinated in EBF plants, coherently up- and down-regulating genes that activate and inhibit flowering, respectively. Accumulation of downstream signals, including gibberellic acids and sucrose metabolites, was also monitored by HPLC-MS/MS in EBF plants. Conclusions: The signature transcript pattern for the developmental pathways that drive flowering provides insight into the molecular signals that activate plant EBF.

The transition from vegetative growth to flowering involves multiple transcriptionally regulated signaling pathways, including the photoperiodic, autonomous/vernalization, sucrose, and gibberellin (GA) pathways. All pathways converge by increasing the expression of two meristem identity genes: SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1 (SOC1), also known as AGAMOUS-LIKE 20 (AGL20), and LEAFY (LFY). SOC1 and LFY, in turn, regulate the floral homeotic genes to produce the floral organs [20,21]. The photoperiodic pathway is initiated by phytochromes and cryptochromes. The interaction of photoreceptors with a circadian clock activates the expression of the gene CONSTANS (CO), which encodes a zinc-finger transcription factor that promotes flowering. In the dual autonomous/vernalization pathway, flowering occurs either in response to internal signals (the production of a fixed number of leaves) or in response to low temperatures that reduce the expression of the flowering repressor gene FLOWERING LOCUS C (FLC). The sucrose pathway reflects the metabolic state of the plant, and sucrose stimulates flowering by increasing LFY expression. Lastly, the GA pathway can participate in early flowering and in flowering under non-inductive short days.
During A. sinensis seedling vernalization with winter storage, levels of soluble sugars, amino acids and organic acids increase, as does nitrate reductase activity [27]. In contrast, during the photoperiodic stage of plant growth, soluble sugar and protein levels are reduced. The amounts of amino acids, GA3, zeatin riboside and polyamines, as well as the activities of peroxidase and polyphenoloxidase, increase in bolting plants (BP) compared with unbolted plants (UBP) [28]. Although ca. 5,100 genes were found to be differentially expressed in the apical meristem during vegetative growth compared with flower buds during early flowering, and 13 DEGs were involved in the photoperiodic, vernalization, sucrose and GA pathways [29], early bolting-dependent changes that affect gene expression and GA metabolism have not been mapped and identified. In this study, functional leaves and lateral roots of BP and UBP were subjected to transcriptomic analysis, and 40 DEGs associated with EBF were mapped onto pathways associated with flower control. Gene expression levels were validated by qRT-PCR, and downstream GA metabolites were profiled by HPLC-MS/MS.

Global gene analysis
A robust data set was collected (Fig. S2). After data filtering, 60.7 and 52.4 million high-quality reads were obtained for the BP and UBP, respectively; 44.7 and 37.4 million unique reads as well as 7.8 and 6.4 million multiple reads were mapped. The 72,502 compiled genes were annotated against databases including NR, SwissProt, KEGG, KOG, and GO (Table 1 and Table 2 [...] (167), translation (119), transport (233), and stress response (102) (Fig. 1). Based on flower-driving genes characterized in higher plants [21], 40 DEGs (29 UR and 11 DR) were identified as potential regulatory genes for EBF (Fig. 1).

Table footnotes: 1 Reads with a quality score < 30 and length < 60 bp were excluded; 2 Mapping ratio = (Unique mapped reads + Multiple mapped reads) / Filtered reads.

[...] (Table 4) were transcriptionally regulated so as to favor flowering in BPs. The RELs were consistent with RPKM values: the INV Inh gene was down-regulated 0.3-fold, and the other 10 genes were up-regulated 1.3- to 6.1-fold, in the BP compared to the UBP (Fig. 2B).

[...] (Fig. 4A). Since GA1 and GA4 exhibit higher floral induction activity than other GAs that are produced in plants [21], an elevated level of GA1 and GA4 may promote EBF. In contrast, an almost 2-fold decrease in soluble sugars in the BP was unexpected, as elevated sugar is usually a driver of flowering (Fig. 4B) [28].
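The mapping ratio defined in the table footnote can be checked against the read counts quoted in this section. The following is a minimal Python sketch, not part of the original analysis; the function name `mapping_ratio` is illustrative, and because the inputs are the rounded counts (in millions) given in the text, the results are approximate.

```python
def mapping_ratio(unique_mapped, multiple_mapped, filtered):
    """Mapping ratio = (unique + multiple mapped reads) / filtered reads."""
    if filtered <= 0:
        raise ValueError("filtered read count must be positive")
    return (unique_mapped + multiple_mapped) / filtered

# Rounded read counts (in millions) as quoted in the text above.
bp_ratio = mapping_ratio(44.7, 7.8, 60.7)
ubp_ratio = mapping_ratio(37.4, 6.4, 52.4)
print(f"BP: {bp_ratio:.1%}, UBP: {ubp_ratio:.1%}")
```

With these rounded inputs, the ratios come out at roughly 86% for the BP and 84% for the UBP.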

Discussion
The SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1 (SOC1) can integrate signals from the photoperiodic, vernalization, sucrose and GA pathways and, when associated with other MADS box genes, regulate the expression of LFY, which links floral induction and floral development [30]. MADS box proteins regulate different developmental processes, including flowering time, floral meristem identity and floral organ development [31]. MADS8, which is structurally related to the AGL2 family, is involved in controlling flowering time [32]. AGL8 promotes early floral meristem identity in synergy with AP1 and CAULIFLOWER [33]. AGL12 acts as a promoter of the flowering transition through up-regulation of SOC, FT and LFY [34]. DEFICIENS (DEFA) is involved in the genetic control of floral development [35]. APETALA1 (AP1) and AP2 are required for the transition of an inflorescence meristem into a floral meristem and promote early floral meristem identity, with AP1 positively regulating AG in cooperation with LFY, while AP2 represses AG by recruiting the transcriptional corepressors TPL and HDA19 [36,37]. AINTEGUMENTA (ANT), a member of the AP2-like family, is involved in flower organ initiation and development and mediates AG down-regulation [38,39]. Positive regulators of flowering in the floral development pathway were observed to be up-regulated in EBF plants, while genes that disfavor flowering (AP2 and ANT) were down-regulated, suggesting that transcriptional regulation of these genes may well be a driver of A. sinensis EBF.
Suc and its cleavage products glucose (Glc) and fructose (Fru) are central molecules for cellular biosynthesis and signal transduction throughout a plant's life cycle [40]. In this study, Suc synthases (SUSs), encoded by the three genes SUS1, SUS3 and SUS7, catalyze the reversible conversion of Suc and UDP to UDP-Glc and Fru [41,42]; alkaline/neutral invertases (INVs), encoded by the three genes INVA, INVB and INVE, catalyze the irreversible hydrolysis of Suc to Glc and Fru [43-45]; and the invertase inhibitor (INV Inh) inhibits INV activity by forming a complex with INV [46]. Two kinds of amylase enzymes, α-amylase (AMY) and β-amylase (BAM), produce α-maltose and β-maltose, respectively, through the hydrolysis of amylopectin and amylose [47]. In this study, four DEGs encode amylase enzymes: AMY1.1, which can increase enzyme activity via accessory binding sites on the protein surface; BAM1 and BAM3, which play important roles in starch degradation and maltose metabolism; and BAM9, which is inactive because it lacks the conserved Glu active site [47-49]. Since the genes that favor flowering (SUS1, SUS3, SUS7, INVA, INVB, INVE, AMY1.1, BAM1, BAM3 and BAM9) were up-regulated and the INV Inh gene, which disfavors flowering, was down-regulated, transcriptional regulation of the sucrose pathway is consistent with EBF.
While genes associated with GA biosynthesis and GA-mediated signaling were differentially regulated in BP versus UBP, the genes did not exhibit transcriptional regulation coherent with EBF, suggesting that transcriptional regulation of GA-mediated genes is not a driver of early bolting. For example, in GA-mediated signaling, the DELLA proteins GA-INSENSITIVE (GAI) and GAIP function as inhibitors by interacting in large multiprotein complexes that repress transcription of GA-inducible genes [50-52]. Inconsistent with promoting flowering, these genes are transcriptionally up-regulated in BP versus UBP. Inconsistency is also observed in genes that encode GA biosynthesis enzymes. A subset of genes is up-regulated, such as KO, which catalyzes the conversion of ent-kaurene to kaurenoic acid early in the biosynthetic pathway [53], and GA20OX1, which converts GA12/GA53 to GA9/GA20 [54] later in the pathway (Fig. S13). However, GA2OX8, which catalyzes 2-beta-hydroxylation of GA precursors and thereby renders them unable to be converted to active GAs, is also up-regulated under the same condition that promotes flowering (BP). This incoherent transcriptional regulation of GA biosynthesis and signaling for EBF suggests that early bolting may be regulated by events downstream of flowering signaling, such as GA and/or sugar accumulation.
While CONSTANS-LIKE (COL) genes are regulators of the photoperiod pathway and flowering, transcripts in this pathway were also inconsistently induced, providing an incoherent signal for plant flowering. For example, while both CO3 and COL3 function as floral activators, the two genes were transcriptionally up- and down-regulated, respectively, when comparing BP with UBP. Specifically, CO3 up-regulates the expression of Heading date 3a (HD3A) and FLOWERING LOCUS T-LIKE (FTL) under LD conditions [55,56]. FT-interacting protein 1 (FTIP1) is an essential regulator required for the export of FT protein from the phloem companion cells to sieve elements through the plasmodesmata under LD conditions [57] and was observed to be up-regulated in BP. The FT protein acts as a long-distance signal to induce flowering [58], and FLOWERING LOCUS D (FD) interacts with FT protein to activate the downstream floral meristem identity gene AP1 to initiate floral development [59,60]. While this is consistent with the flower induction that is observed in BP, there are several transcriptional responses that are not down-regulated as expected. For example, CDF2, a transcriptional repressor that delays flowering by repressing CO transcription under LD conditions [25], was found to be up-regulated almost 4-fold in BP compared with UBP. MIP1A and MIP1B, which repress flowering by forming heterodimeric complexes that sequester CO and COL proteins into non-functional complexes [26], were also found to be up-regulated in BP. Another transcriptional response inconsistent with flowering is the up-regulation of HEADING DATE REPRESSOR 1 (HDR1), a flowering suppressor that up-regulates HD1 in LD conditions [61]. Again, inconsistent regulation of photoperiod pathway transcripts associated with flowering in BP suggests the involvement of downstream signaling in early bolting.
Flowering is a process in which plants transition from vegetative to reproductive growth via a complex pathway of signaling networks. The DEGs observed when comparing BP and UBP suggest transcription-based regulation of EBF. Specifically, genes associated with floral development and sucrose signaling are transcriptionally correlated with bolting (Fig. 5). For floral development, SOC1 can integrate signals from the photoperiodic, GA and sucrose pathways to initiate early floral meristem identity by regulating the over-expression of LFY; meanwhile, AP1, in synergy with MADS, AGL8 and AGL12, which are repressed by AP2 and ANT, promotes early floral meristem identity. Early floral meristem identity, in turn, induces early bolting and flowering of A. sinensis plants. For sugar signaling, over-expression of the genes AMY1.1, BAM1 and BAM3 enhances starch degradation, while differential expression of SUSs, INVs and INV Inh promotes the cleavage of Suc to Glc and Fru, which can also promote SOC1 expression.

Conclusions
The DEGs observed when comparing BP and UBP suggest transcription-based regulation of EBF. This transcriptomic analysis focuses on four pathways that can mediate a transition from vegetative to reproductive growth: photoperiodic, GA signaling, autonomous and floral development. While genes associated with EBF have been identified and mapped here, establishing a causative role for these genes in activating and/or regulating EBF will require knocking out specific genes via a CRISPR-Cas9 system.

Sequence filtration, assembly and unigene expression analysis
Raw reads obtained from the Illumina sequencing were further filtered to obtain high-quality clean reads by removing reads containing adapters, reads with more than 10% unknown nucleotides and reads with more than 40% low-quality (Q-value ≤ 10) bases. De novo assembly of clean reads was carried out using Trinity software [62], which combines three components: Inchworm, Chrysalis and Butterfly, respectively for assembling a collection of linear contigs, building graphs for each cluster of related contigs, and outputting one linear sequence for each alternatively spliced isoform and transcript. The expression level of each transcript was calculated and normalized to reads per kb per million reads (RPKM) [63]. In this study, DEGs between BP and UBP were identified using the criteria |log2(fold-change)| ≥ 1 and p value ≤ 0.05.
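The RPKM normalization and the DEG thresholds described above can be illustrated with a short Python sketch. This is an illustration under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the p value is taken as an input because the statistical test that produced it is not described here, and transcripts with zero expression are rejected instead of being handled with a pseudocount.

```python
import math

def rpkm(read_count, transcript_length_bp, total_mapped_reads):
    """Reads per kb of transcript per million mapped reads."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)

def log2_fold_change(rpkm_bp, rpkm_ubp):
    """log2(BP / UBP); both values must be positive (assumption: no pseudocount)."""
    if rpkm_bp <= 0 or rpkm_ubp <= 0:
        raise ValueError("RPKM values must be positive")
    return math.log2(rpkm_bp / rpkm_ubp)

def is_deg(rpkm_bp, rpkm_ubp, p_value, lfc_cutoff=1.0, p_cutoff=0.05):
    """Apply the criteria |log2(fold-change)| >= 1 and p value <= 0.05."""
    lfc = log2_fold_change(rpkm_bp, rpkm_ubp)
    return abs(lfc) >= lfc_cutoff and p_value <= p_cutoff

# Hypothetical transcript: 500 reads, 2,000 bp long, 50 million mapped reads.
example_rpkm = rpkm(500, 2000, 50_000_000)
print(example_rpkm)
print(is_deg(8.0, 2.0, 0.01))
```

A transcript passes only when both thresholds are met, so a 4-fold change with p = 0.2 or a 1.5-fold change with p = 0.01 would both be excluded.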

Basic annotation of DEGs and gene cluster analysis
Unigenes were annotated against databases including NCBI non-redundant protein (NR), Swiss-Prot protein, Kyoto Encyclopedia of Genes and Genomes (KEGG), euKaryotic Orthologous Groups of proteins (KOG), and Gene Ontology (GO) using a BLASTx procedure with an e-value ≤ 10^-5 [64]. Molecular Evolutionary Genetics Analysis (MEGA) 7.0 was used for the gene cluster analysis (Fig. S11).

qRT-PCR validation
Total RNA was extracted from samples of the BP and UBP plants using a plant RNA kit. Primer sequences for the 40 DEGs (Table 4) were designed with the Primer-BLAST tool in NCBI. First-strand cDNA was synthesized using a FastKing RT kit with one cycle at 42°C for 15 min and then 95°C for 3 min. PCR amplification was carried out using a SuperReal PreMix with one cycle at 95°C for 15 min, followed by 40 cycles at 95°C for 10 s, 60°C for 20 s and 72°C for 30 s. Melting curves were analyzed after an incubation at 72°C for 34 s. Actin was used as an internal standard, and the relative expression level (REL) of each gene was calculated based on the 2^-ΔΔCt method [65]. GA levels were calculated based on calibration curves (Table S1).
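The REL calculation for the qRT-PCR data follows the standard 2^-ΔΔCt arithmetic, which can be sketched in Python as follows. The function name and the Ct values are hypothetical and chosen only to make the arithmetic easy to follow; the sketch assumes Actin as the reference gene and the UBP as the calibrator sample, in line with the BP versus UBP comparison used in this study.

```python
def relative_expression(ct_target_bp, ct_actin_bp, ct_target_ubp, ct_actin_ubp):
    """Relative expression level (REL) of a target gene in BP versus UBP.

    delta Ct       = Ct(target) - Ct(Actin), computed per sample
    delta delta Ct = delta Ct(BP) - delta Ct(UBP)
    REL            = 2 ** (-delta delta Ct)
    """
    delta_ct_bp = ct_target_bp - ct_actin_bp
    delta_ct_ubp = ct_target_ubp - ct_actin_ubp
    delta_delta_ct = delta_ct_bp - delta_ct_ubp
    return 2 ** (-delta_delta_ct)

# Hypothetical Ct values: the target amplifies two cycles earlier in BP,
# while Actin is unchanged, giving a 4-fold higher REL in BP.
rel = relative_expression(22.0, 18.0, 24.0, 18.0)
print(rel)
```

A REL above 1 indicates higher expression in the BP than in the UBP, and a REL below 1 indicates lower expression.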

Soluble sugar measurement
Soluble sugar was measured using a sulfuric acid-phenol protocol [66]. Dried powder (1.0 g) was soaked in 10% EtOH (25 mL) for 72 h at 22°C and then centrifuged (4°C, 8000 r/min, 10 min). Extracts (30 µL) were added to 9% phenol reagent (1 mL); sulfuric acid (3 mL) was added after oscillation, and the mixture was then left to react at 22°C for 30 min. Absorbance was measured at 485 nm, and soluble sugar content was evaluated based on mg of Suc.

Statistical analysis
All the measurements were performed using three replicates. A t-test for independent samples was