Effects of different light qualities on plant growth
At the final harvest (day 60), the leaf area was significantly lower in the RL and BL treatments than in the WL treatment (Fig. 1C), while plant height was significantly higher and lower in the RL and BL treatments, respectively, compared to the WL (Fig. 1D). Finally, on day 60, the stem diameter of samples in the WL and BL treatments was significantly larger than those in RL treatment (Fig. 1E). All statistical data were shown in Additional file 1: Table S1.
Illumina sequencing, de novo assembly, and annotation of the reference transcriptome
In total, 1,025 million raw reads were generated from the 15 cDNA libraries with a total of 1,009 million high-quality clean reads. To be specific, each library produced about 55.9–92.7 million raw reads (CRA002486). The relevant parameters of de novo assembly were shown in Additional file 2: Table S2. The results of the functional annotation from the seven public databases showed that 46.9% of unigenes were annotated in at least one database, while only 3.0% of unigenes were predicted from all the databases (Additional file 3: Table S3). A total of 33,626 unigenes were categorized into 56 functional groups among three categories: biological process, cellular component and molecular function (Additional file 4: Fig. S1).
DEGs analysis
To identify DEGs in the 15 samples, a pairwise comparison was performed and filed with padj < 0.05 and |log2FoldChange| > 1. The results showed that 861 (BY vs RY), 378 (BY vs WY), 47 (RY vs WY), 10,033 (WG vs WY), 7917 (WJ vs WG), and 6379 (WJ vs WY) unigenes were significantly expressed.
The enrichment analysis was performed using the GO ontology analysis (Additional file 5: Fig. S2a-e). A large number of DEGs were predominant in five terms, namely: metabolic process, single-organism metabolic process, catalytic activity, ion binding, and transferase activity. Meanwhile, all DEGs were mapped in the KEGG database (Additional file 6: Fig. S3a-f). Interestingly, in almost every group (apart from RY vs WY) the DEGs were enriched in the biosynthesis of phenylpropanoids (including flavonoids). The phenylpropanoid-derived compounds are just originated from this biosynthetic pathway. Therefore, the follow-up experiment and analysis were performed to describe this regulatory mechanism, based on transcriptome and mebabolomic profiling.
Metabolism analysis
Based on results obtained from the UPLC-MS/MS platform and self-built database, a total of 454 metabolites were detected, including various classes of lignans and coumarins, flavonoids, phenolic acids and other, and the majority of them are phenylpropanoid-derived compounds (Table 1 and Additional file 7: Table S4). Among them, fraxetin, scopolin, isofraxidin, fraxidin and esculin from the lignans and coumarins class and rosmarinic acid, rosmarinyl glucoside, chlorogenic acid, esculetin from the phenolic acids category were isolated from the samples, which are the major ingredients of S. glabra.
Table 1
Classes of the detected metabolites in S. glabra
Metabolite class | Number of metabolites | Metabolic class | Number of metabolites |
Flavonoids | 90 | Organic acids | 32 |
Alkaloids | 39 | Lipids | 60 |
Phenolic acids | 66 | Tannins | 13 |
Lignans and Coumarins | 11 | Terpenoids | 10 |
Amino acids and derivatives | 52 | Others | 53 |
Nucleotides and derivatives | 28 | | |
PCA (Fig. 2A) and HCA (Fig. 2B) analyses aimed to evaluate the temporal accumulation patterns of metabolites identified in our samples. The PCA results showed that the samples tested were significantly different and that replication was generally consistent with that of others in the same group. The HCA presented metabolic profiling and distributed them into 10 classes in all of the samples. As shown in Fig. 2B, most flavonoids from BY were higher than in WY and RY. In contrast to WY, the contents of tannins in BY and RY were significantly different. However, the chemicals distribution was significantly different between the root (WG) and leaves (WY, RY, and BY). Leaf tissues contained a higher abundance of compounds such as flavonoids, phenolic acids, and tannins, while lipids, amino acids and derivative alkaloids, organic acids, nucleotides and derivatives, lignans and coumarins, terpenoids were much more abundant in the root tissue.
The results of the comparative analysis are shown in Table 2. In the WY vs RY group, compared with WY, 11 flavonoids and six phenolic acids were significantly down-regulated in RY. Among them, only three flavonols (kaempferol, kaempferin, and cynaroside) were annotated in the flavonoid biosynthesis. In the WY vs BY group, 40 flavonoids, 11 phenolic acids, two lignans and coumarins, and three terpenoids were significantly differences. Only eight out of 40 flavonoids were mapped in the flavonoid biosynthesis, six of which increased by 3.2- to 6.2-fold in BY, but the content of cynaroside and protocatechuic aldehyde was approximately reduced by 4.5-fold in BY. For phenolic acids, 7 metabolites were down-regulated, including caffeic acid, esculetin, and sinapinaldehyde, while others increased by 2.5- to 3.3-fold. As for the two lignans and coumarins, daphnetin production decreased by 9.5-fold, while that of oxypeucedanin increased by 2.1-fold in BY. In the RY vs BY group, the flavonoids (49 compounds) showed similar accumulation patterns to those of the WY vs BY group. For phenolic acids (21 compounds), the production of 12 compounds in BY was higher than that of RY by 2- to 5-fold, while that of the remaining compounds significantly decreased. Especially, in the WY vs WG group, compounds derived from flavonoid biosynthesis mainly accumulated in WY compared to WG. In contrast, some coumarins (esculin, isofraxidin, and fraxidin) were more abundant in WG, but other coumarins (scopolin and oxypeucedanin) and most phenolic acids were significantly more abundant in WY. Furthermore, we selected the main active constituents and compared the metabolic profiles of different samples (Fig. 3). It should be noted that BL inhibited the production of rosmarinic acid, esculetin, isofraxidin, scopolin, scopoletin, fraxidin, fumaric acid, and maleic acid, while RL stimulated the accumulation of esculetin, fraxetin, esculin, scopoletin, fumaric acid, and maleic acid in leaves.
Table 2
The quantitative statistics for the differential analysis of identified metabolites in different groups
Group name | Identified Metabolites | Identified Metabolites (significant) | Up (Down)- regulated | Annotated in KEGG | Annotated in KEGG (significant) |
WY vs RY | 450 | 44 | 29 (15) | 203 | 14 |
WY vs BY | 450 | 87 | 34 (53) | 203 | 40 |
RY vs BY | 449 | 114 | 45 (69) | 202 | 51 |
WY vs WG | 454 | 296 | 128 (168) | 203 | 123 |
Candidate genes involved in flavonoid, coumarin, and phenolic acid biosynthesis
We have identified 46 candidate unigenes encoding most of enzymes involved in the flavonoids, coumarins, and phenolic acid biosynthesis (Table 3 and Additional file 8: Table S5). Moreover, five unigenes relating to rosmarinic acid biosynthesis were also detected. All of these compounds are related in one way or another to the core phenylpropanoid pathway, comprising the three committed steps catalyzed by phenylalanine ammonia lyase (PAL), cinnamate 4-hydroxylase (C4H), and 4-coumaroyl CoA ligase (4CL) enzymes. In this study, four PAL unigenes, four 4CL unigenes, and one C4H unigenes were identified, of which the PAL unigene (Cluster-22883.50216), 4CL unigene (Cluster-22883.47381), C4H unigene (Cluster-22883.50271) exhibited the highest expression level compared to other members. These also showed higher expression in BY and RY treatments. Five unigenes encoding hydroxycinnamoyltransferases were found, and the expression of unigene (Cluster-22883.48487) was detected in all samples, but it was most highly expressed in BY and RY. Furthermore, the unigene (Cluster-22883.21708) was only highly expressed in WG compared to other treatments. The scopoletion biosynthesis is derived from a key precursor - feruloyl CoA, produced by the action of caffeic acid 3-O-methyltransferase (COMT) and 4CL on caffeic acid or by Caffeoyl CoA 3-O-methyltransferase (CCoAOMT) reacting wit caffeoyl CoA. In this pathway, only one COMT unigene (Cluster-22883.56077) and one CCoAOMT unigene (Cluster-22883.54879) were detected, which were both significantly up-regulated in WG tissues and down-regulated in RY and BY compared to WY. Furthermore, five unigenes belonging to 2-oxoglutarate and Fe(II)-dependent dioxygenases were characterized. The unrooted phylogenetic tree showed that these were highly similar to feruloyl-CoA 6´-Hydroxylase (F6´H) family proteins. Among them, Cluster-22883.51443 and Scopoletin 8-hydroxylase protein (S8H, XP_030962790.1) from Quercus lobata showed maximum similarity, Cluster-22883.11702 and Cluster-22883.33018 clustered with other F6´H1 enzymes that catalyzes feruloyl CoA to scopoletin (Additional file 9: Fig. S4). The pathway to the product of rosmarinic acid consists of four consecutive enzyme reactions, starting from the precursor (tyrosion) and three enzymes [tyrosine aminotransferase (TAT), hydroxyphenylpyruvate reductase (HPPR), and Cytochrome P450 reductase (CYP)] encoded by their respective unigenes (Cluster-22883.49301, Cluster-22883.50853, and Cluster-22883.50532). As for the rosmarinate synthase (RAS) enzyme, it is thought to be encoded by the unigene (Cluster-22883.29589) although this has not been conclusively proven (Additional file 10: Fig. S5). The unigene of Cluster-22883.49301 showed no significant differential expression in WY, RY, and BY, but Cluster-22883.50853 were highly expressed in BY. Cluster-22883.50532 had a significantly higher expression in WG, whereas it was significantly lower in BY, leading to a lower production of rosmarinic acid.
Table 3
Summary of identified unigenes in the transcriptome of S. glabra
Gene | Number of unigenes | Gene | Number of unigenes | Gene | Number of unigenes |
PAL | 4 | CHS | 2 | ANR | 1 |
4CL | 4 | CHI | 2 | ANS | 1 |
C4H | 1 | F3H | 2 | TAT | 1 |
C3H | 2 | F3´H | 1 | HPPR | 1 |
COMT | 1 | FLS | 1 | HPPD | 1 |
CCoAOMT | 1 | UFGT | 1 | RAS | 1 |
F6´H | 4 | RT | 2 | CPR | 1 |
S8H | 1 | DFR | 1 | Hydroxycinnamoyl- transferase | 5 |
SGTF | 1 | LAR | 1 | O-MT | 2 |
The flavonoids biosynthesis shared the core pathway incorporating chalcone synthase (CHS), chalcone isomerase (CHI), flavanone 3-hydroxylase (F3H) enzymes. The expression level of the unigenes encoding the above mentioned enzymes was higher in RY and BY than in WG. When the intermediate compound (dihydrokaempferol) was produced by the core enzymatic reactions, it would be flowed into several branches to generate anthocyanins, proanthocyanidins, flavonols, etc. The biosynthesis of kaempferol is catalyzed by flavonol synthase (FLS) (Cluster-22883.47725) enzyme on dihydrokaempferol, and its production varied significantly with changing gene expression level under different LED lights and in different tissues. Rutin also originates from the precursor of dihydrokaempferol through the enzymatic reactions by flavonoid 3'-hydroxylase (F3´H, Cluster-22883.55303), FLS (Cluster-22883.47725), UDPG flavonoid O-glucosyltransferase (UFGT, Cluster-22883.49005) and rhamnosyl transferase (RT, Cluster-22883.85233 and Cluster-22883.49258). These unigenes (except Cluster-22883.85233) had similar expression patterns that BL had the potential to induce the higher expression quantity, while FLS was the limiting factor resulting in a lower production of flavonol compounds in the RL environment. Interestingly, the unigene of Cluster-22883.85233 was mainly expressed in root tissues to synthesize rutin, but Cluster-22883.49258 showed higher expression levels in leaves and stem tissues. A series of enzymes [dihydroflavonol 4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanin synthase (ANS), and anthocyanidin reductase (ANR)] are responsible for the production of catechin and epicatechin, which eventually synthesize the product of proanthocyanidins (PAs), and the unigenes encoding these enzymes had low expression levels in WG compared to other samples. Additionally, catechin was produced in large amounts in the leaf tissues, catalyzed by the combined action of these highly-expressed structural genes under the BL and RL environment. Overall, the BL and RL cultivation increased the expression level in a majority of structural genes, such as some isoforms in the PAL gene family, 4CL gene family, CHS, CHI, F3H, and C4H. The phenylalanine metabolic pathways were shown in Fig. 4A, and the heat map diagram of the aforementioned unigenes with the FPKM values was presented in Fig. 4B.
Candidate transcription factors of MYB and bHLH
A total of 142 raw MYB (MYB and MYB-related) unigenes and 61 raw bHLH unigenes were collected. For MYB proteins, after screening, 92 MYB unigenes had different numbers of highly conserved DNA-binding domains and were divided into 4 classes: 1R-MYB, R2R3-MYB, 3R-MYB, and 4R-MYB proteins including 35, 53, 3, and 1 unigenes, respectively. Since the Arabidopsis genome was published, it has provided the first insight into the description and classification of plant MYB TFs. In the plants, R2R3-MYB proteins are the largest class in MYB family. They are involved in an extensively set of primary and secondary metabolism and have been divided into subgroups in Arabidopsis based on the conservation of the DNA-binding domain and of amino acid motifs in the C terminal domain [32]. Here, we have identified all the primary structures of 53 R2R3-MYB domains in S. glabra (Additional file 11: Fig. S6). We also selected 53 unigenes encoding R2R3-MYB in S. glabra to construct a phylogenetic tree with 125 full-length R2R3-MYB amino acid sequences from Arabidopsis thaliana (Fig. 5A). Numerous studies have shown that some subgroups of R2R3-MYB proteins have been involved in the regulation of flavonoids biosynthesis to date, namely S4, S5, S6, and S7. In subgroup 7, AtMYB11, AtMYB12, and AtMYB111 were closely related to flavonols biosynthesis. The unigene SgMYB38 was clustered with AtMYB111. The unigene of SgMYB53 showed a high degree of functional similarity within the S6 (AtMYB75/90/113/114), regulating the anthocyanin pigments pathway. A subset of six unigenes of SgMYB8/14/28/43/27/39 was in the same branch of the PAs biosynthetic pathway with AtMYB123 in subgroup 5. Moreover, we identified two unigenes: SgMYB6 with a high amino acid sequence similarity to AtMYB38 (RAX2, S14), and SgMYB32 clustered with members of subgroup 16. The heat map showed the expression patterns of the 10 unigenes mentioned earlier according to the FPKM values (Fig. 5C). Based on the transcriptome results, the relative expression quantity of SgMYB38 was up-regulated in BY. In subgroup 5, the six unigenes had a significantly different expression pattern among different samples. SgMYB28, SgMYB8, and SgMYB39 exhibited the highest expression level in WY, compared to RY and BY. Their expression was also significantly lower in the WJ and WG. Meanwhile, SgMYB43 displayed a dramatic increase in expression in BY. Remarkably, SgMYB27 presented relatively higher expression in WJ, while both of SgMYB6 and SgMYB32 were strongly expressed in WG.
The bHLH proteins are another ancient and functionally diverse superfamily of TFs, which has been widely investigated in the past. In this study, we have identified 53 unigenes encoding bHLH proteins that have the distinguishable signature domain. We constructed the phylogenetic tree with these unigenes in S. glabra and 182 full-length AtbHLH protein sequences. The results of multiple sequences alignment of SgbHLH domains are displayed in Additional file 12: Fig. S7. As shown in Fig. 5B, all the bHLH proteins were divided into 21 subclasses indicating that the node values between different subclasses were low, but values within each subclass were high, suggesting that the latter have strong evolutionary relationships. A previous study classified bHLH proteins into 32 subfamilies [43], with subfamilies 2, 5, and 24 (corresponding to subclasses of 8, 7, and 15 of AtbHLH proteins) involved in the regulation of flavonoid biosynthesis. In a subclass of 8, we have identified a group of unigenes named Sg(bHLH) 22/24/25/43/45, which showed high sequence similarity with other AtbHLH protein members. Compared to the similar gene-expression change between the WY and RY groups, the relative expression level of these five unigenes was significantly lower in BY. Particularly, Sg45 was significantly up-regulated in WJ and WG. Within the subfamily of 7, Sg13 and Sg12 were clustered with the subset of At1g63650 (EGL), At5g41315 (GL3), At4g00480 (MYC1), and At4g09820 (TT8) that associated with some R2R3-MYB members to participate in the regulation of flavonoids metabolism, trichome formation, epidermal cell fate specification, and in the formation of hair and non-hair cells in root epidermis cells [43]. Furthermore, three unigenes of Sg(bHLH)4/27/37 encoding SPATULA (SPT), PIF7, and PIF1 belonged to subfamily of 15. The expression level of Sg4 was slightly reduced in BY and RY in comparison with WY, but Sg27 showed a lower level in BY, WJ, and WG (Fig. 5D).
Verification of the transcript expression using qRT-PCR
To confirm the accuracy of the RNA-seq results, a total of 24 unigenes (five R2R3-MYB unigenes, five bHLH unigenes, and another 14 structural genes related to flavonoids biosynthesis) were selected (Additional file 8: Table S5). The qRT-PCR results for all unigenes were tested analyzed, revealing that the expression levels were mostly consistent with those obtained by RNA-sEq. Furthermore, the correlation coefficient (R) for each unigene between RT-qPCR and RNA-seq for each unigene showed that RNA-seq data were accurate and hence can be used in future experiments (Fig. 6).