Mass Conditions
The metabolic analysis of Rheum Officinales Bails (ROB) and Rheum Palmatum L. (RPL) leaves were optimized by comparing the performance of several candidate elution systems, based on previous references data[21]. The acidity of the elution system is the most important factor for the analysis of metabolism. Therefore, different elution acidities with 0.1%, 0.2%, 0.5%, and 0.8% formic acid were evaluated by comparing the obtained target compounds and resolution of all the metabolism. Finally, 0.2% formic acid exhibited the best performance. The optimized HPLC and electrospray ionization-QqQ-MS (ESI-QqQ-MS) conditions were as follows: the solvent system, eluent A was milli-Q water containing 0.2% formic acid, and eluent B which was acetonitrile. All samples were analyzed using an Eclipse-Plus C18 column (2.1×50 mm, 1.8 μm) at 30oC with a linear elution gradient protocol of 0-1.5 min, 95% B, 1.5-3 min, 5-24% B, 3-5min, 24-25% B, 5-11min, 25-66% B; 11-16 min, 66% B; 16-17 min, and 66-100% B, with a flow rate of 0.2 mL/min. Under these conditions, the metabolites of rhubarb leaves were completely eluted with resolving powder best target (Additional file 1).
Identification of rhubarb leaves metabolites based on ESI-Q-TOF-MS/MS
The identification of metabolites in conjunction with the ESI-Q-TOF-MS, both positive and negative ESI modes, were used to determine fragment ion information of metabolites. Information including mass spectra (in PI, NI, NI-MS2, and PI-MS2), ultraviolet-vis spectra and retention time on C18 column are list in table 1. Based on the dissociation rules of different metabolites in rhubarb leaves, the precise molecular weights were confirmed after eliminating false positive rates. Firstly, the MS/MS behavior was resolved by fragment characteristics. Based on the combination of data available on the mass-bank and previous reference data [1, 22], 62 compounds were identified including 21 anthraquinones, 17 flavonoids, 6 trimethyltin, 12 gallates, 3 tannic acid, and 3 others. Anthraquinones and flavonoids are the main pharmacologically active ingredients of rhubarb. The detected metabolites are modified at one or several positions by methylation, glycosylation, finally endow them with different activities. The metabolic network diagram of rhubarb flavonoids and anthraquinone is well constructed, as showed in Figure 1. The recovery index of characteristic compounds was used to measure the accuracy of the method, showing good repeatability.
A Large-scale Detection, Identification and Quantification of Target Metabolites using dMRM-MS/MS
A large-scale quantification method was carried out constructing a tandem mass spectrometry (MS2) spectral tag library from detection of target metabolites using paired ions. A dynamic multiple reactions monitoring (MRM) mode under unit mass-resolution was used for simultaneous quantitative measured all the detected metabolites. It is the first time, simultaneous quantitatively identified and qualitatively assessed 62 metabolic compounds in rhubarb, only gallic acid and harrington-O-β-glucoside did not show significant differences between the two species. The remaining 60 compounds exhibited significant differences (P < 0.05). The contents of aloe emodin, rhein, and emodin-O-(6-O-acetyl)-glucoside in RPL leaves were significantly higher than those observed in the ROB leaves. In addition, the contents of chrysophanol, emodin and methyl ether were significantly higher than those reported in ROB. Moreover, the contents of chrysophanol-(6-O-acetyl)-glucoside and 43 other substances in the RPL were significantly lower than those measured in the ROB.
The contents of 8-O-methyl chrysophanol, chrysophanol, and physcion in the RPL were significantly lower than those observed in the ROB. Furthermore, the contents of three free anthraquinones in the RPL were significantly higher than those reported in ROB. Of the 14 binding anthraquinones, the content in the RPL was significantly lower than that detected in ROB, whereas the content of the other six in the ROB was significantly higher than in the RPL.
Principal component analysis was used to examine the quantitative data of the anthraquinone compounds in two kinds of rhubarb. The variance contribution rate of the first eigen-value was 92.61%, the second eigen-value was 3.62%, and the accumulative contribution rate of the first two factors was 96.23%. The six replicates of ROB and RPL leaves were clustered on the left and right sides of PC1, respectively, indicating a good correlation between biological duplication and a significant difference between the two kinds of rhubarb (Figure 2A). There was a significant difference in the content of anthraquinones between the two rhubarb species (Figure 2B).
Method validation
The calibration curves showed good linearity for all the available standards. The standard solutions were detected through chromatography until the signal-to-noise ratio corresponded to 3 and 10. The corresponding concentrations at these signal-to-noise ratios were defined as the limit of detection and limit of quantitation, respectively. The precision of metabolic quantification was studied by examining the repeatability and intermediate precision for all the compounds.
Sequencing and de novo transcriptome assembly
RNA was sequenced using Illumina paired-end sequencing technology. A total of 193,873,248 Illumina raw reads were obtained (Table 2). After removing adaptor sequences, ambiguous nucleotides and low-quality sequences, approximately 185 million clean reads were kept for the subsequent analysis. These reads were then assembled de novo using Trinity [23]. After eliminating repeated sequences and sequences < 200 bp, the assembly of clean reads resulted in 180, 689 transcripts and 90,242 unigenes. These unigenes ranges from 201 to 14,669 bp, with a mean length of 1,052 bp, median length of 706 bp and N50 length of 1,556 bp (Additional file 2).
Annotation of unigenes
To access the quality of these assembled unigenes, a homology search in Nr database was conducted for all unigenes using the BLAST program [24] and a cutoff E-value <10−5. The best aligning results were selected to annotate the unigenes. The results showed that 56% of the aligned sequences (50,558) exhibited significant homology with entries in the Nr database (E-value < 10−5), while > 75% of these matched sequences showed an E-value < 10-15 (Additional file 3). Based on the BLAST similarity distribution, 32.5% of the aligned sequences exhibited alignment identities > 80% (Additional file 3). We screened these 90,242 non-redundant unigenes for similarity in seven public databases (i.e., Nr, Nt, Swiss-Prot, KEGG, GO, KOG, and Pfam). The annotation results showed that 56.05% (50,588 unigenes) had hits in the Nr database, followed by the Swiss-Prot database (45.15%, 40,753 unigenes). In total, there were 57,239 unigenes (63.42%) annotated in at least one of these seven databases, with 8,819 unigenes (9.77%) appearing in all seven databases including annotated in Nr, Nt, KEGG Orthology, SwissProt, Pfam, GO, Databases, at least one database, as well as total unigens.
Functional classifications of unigenes
GO (Gene Ontology) was used to classify the functions of the predicted rhubarb unigenes [25]. In total, 39,567 unigenes with BLAST matches to known proteins were classified into 56 functional groups in three main categories (i.e., biological process, cellular component, and molecular function) using 1660 functional terms (Figure 3A). The majority of the unigenes were assigned to the categories of biological processes (488,916), followed by cellular components (145,217) and molecular functions (98,389). Within biological processes, metabolic process (242,663) and cellular process (108,192) were the most dominant terms. Under the category of cellular components, cell part (44,594) and organelle (33,420) had the highest representation. For molecular functions, the most represented were binding (72,789) and catalytic activities (18,096).
All unigenes were aligned with the (KOG) database in which orthologous gene products were classified to classify possible functions. A total of 16,395 unigenes were clustered into 25 KOG classifications (Figure 3B, Table 2), in which the cluster of translation, ribosomal structure and biogenesis (2766) was the largest groups, followed by post-translational modification, protein turnover, and chaperon (2,112) and general function prediction of function only (1,801).
KEGG pathway database is a knowledge base for the systematic analysis of gene functions in terms of networks of genes and molecules in cells [26]. Based on KEGG, 21691 unigenes were assigned to 130 pathways (Table 2, Figure 3). The pathways involving the largest number of translations were carbohydrate metabolism (3632, 16.74%) and translation (3117, 14.37%). In contrast, membrane transport (59, 0.27%) was the smallest group (Figure 3C).
Identification of DEGs
To reveal the molecular mechanism involved in flavone, flavanol, and flavonoid biosynthesis, the DEGs between ROB and RPL were identified. Pearson’s correlation coefficient (r2) between samples was relatively high in ROB_1 vs ROB_2 (r2 = 0.880) and RPL_1 vs RPL_2 (r2 = 0.867), whereas it was low in ROB vs. RPL (from 0.527 to 0.576). Thus, we expected to obtain a large number of DEGs between ROB and RPL. Read counts of every gene obtained through Hi-seq in all four RNA-sequencing samples (two biological replicates in ROB and RPL, respectively) were normalized and analyzed via DE-Seq package, finally 19,242 genes were detected as significantly differentially expressed between ROB and RPL.
Combined analysis of RNA-sequencing and metabolomics data
Emodin was methylated by methyl-transferase, followed by methoxy to form physcion. According to the results of the metabolomics analysis, the content of emodin in ROB is slightly lower than that observed in RPL. In contrast, the content of physcion in ROB was approximately 2.4-fold higher than that reported in RPL (Figure 4B). Since the amount of physcion is different in ROB and RPL, we then searched possible enzyme genes which may contribute to this difference (Figure 4A). We mined 121 genes involved in flavonoid biosynthesis, of which 38 were detected as DEGs. Among 38 DEGs, 7 genes (cluster-14354.36512, cluster-14354.38156, cluster-14354.46426, cluster-14354.26346, cluster-14354.3016, cluster-14354.37142, and cluster-14354.3014) were annotated with the activity of caffeoyl coenzyme methyltransferase and their expression levels are significantly different between leaves of ROB and RPL (Figure 4C). And we further found that four genes (cluster-14354.38156, cluster-14354.26346, cluster-14354.3016 and cluster-14354.3014) were expressed in both ROB and RPL (Figure 4C).
We used RT-qPCR to quantify the four genes obtained from the above analysis. The results showed that the expression of the cluster-14354.38156 in the ROB was approximately 14-fold higher than that observed in the RPL. This finding was consistent with the difference detected by RNA-seq gene expression. Therefore, cluster-14354.38156 may regulate the metabolic reaction of emodin 3 methylation to physcion (Figure 4D).