Integrated transcriptome, small RNA and degradome sequencing approaches proffer insights into chlorogenic acid (CGA) biosynthesis in leafy sweet potato

Background Phenolic compounds play key roles in health protection, and leafy sweet potato is an excellent source of total phenolics (TP). The chlorogenic acid (CGA) family, which includes caffeoylquinic acid (CQA), 3,4-O-dicaffeoylquinic acid (3,4-diCQA), 3,5-O-dicaffeoylquinic acid (3,5-diCQA) and 4,5-O-dicaffeoylquinic acid (4,5-diCQA), constitutes the major component of phenolic compounds in leafy sweet potato. However, the mechanism of CGA biosynthesis in leafy sweet potato is unclear. The objective of the present study was to dissect the mechanisms of CGA biosynthesis by using transcriptome, small RNA and degradome sequencing. Results Transcriptome sequencing of twelve samples (three replicates) from one low-CGA content genotype and one high-CGA content genotype at two stages (65 and 85 days after planting) identified a total of 2333 common differentially expressed genes (DEGs). The enriched DEGs were related to photosynthesis, starch and sucrose metabolism and phenylpropanoid biosynthesis. In this study, the functional genes CCR, CCoAOMT and HCT in the CGA biosynthetic pathway were uniformly downregulated, indicating that the route to lignin was altered. Small RNA sequencing of the samples resulted in the identification of 171 microRNAs (miRNAs), including 149 known and 22 novel miRNAs. A total of 38 miRNAs were differentially expressed (DE). Using in silico approaches, 1799 targets were predicted for the 38 DE miRNAs. The target genes were enriched in lignin and phenylpropanoid catabolic processes. Transcription factors (TFs) such as APETALA2/ethylene response factor (AP2/ERF) and Squamosa promoter binding protein-like (SPL) predicted in silico were validated by degradome sequencing. The association analysis of DE miRNAs and transcriptome datasets identified that MIR156 family targeted DHQ/SDH (3-dehydroquinate the could leafy sweet

sequencing results. Conclusions This study established comprehensive functional genomic resources for CGA biosynthesis and provided insights into the molecular mechanisms involved in this process. The results also enabled the first insights into the regulatory roles of mRNAs and miRNAs and offered candidate genes for leafy sweet potato improvement.

Background
Sweet potato (Ipomoea batatas (L.) Lam.) is the seventh most important food crop in the world due to its wide adaptability, high nutrition and high productivity [1]. In the past, the tuberous roots of sweet potato were the main organs harvested. However, in recent years, the tender stems and leaves of certain sweet potato varieties as fresh vegetables have become increasingly popular in many regions [2]. In central and southern China, leafy sweet potato plays an important role in summer. Its yield exceeds 75,000 kg/ha each year with the price about 0.59 USD/kg, making the total output value reach as much as 44,117 USD/ha [3]. Thus, planting leafy sweet potato is a commercially viable venture.
The nutritional attributes of leafy sweet potato are increasingly being recognized. It is rich in vitamins, minerals, dietary fibres, phenolics and proteins. These characteristics make it a candidate vegetable for reducing malnutrition [4,5]. Among its nutritional components, phenolics have attracted particular attention because they can reduce the risks of serious human afflictions, such as cancer and cardiovascular diseases, and protect the human body from oxidative stress which causes fatigue and aging [6][7][8][9][10]. Phenolics are the most abundant type of secondary metabolites produced from the general phenylpropanoid pathway in sweet potato [2,9,10].
Gene studies showed that functional genes and TFs in the pathway were closely related to the biosynthesis of CGAs. A transesterification reaction between caffeoyl D-glucose and D-quinic acid was discovered in the CGA biosynthetic pathway of sweet potato roots via the isotope tracer method [19]. In addition, HCGQT extracted from sweet potato roots was found to catalyse the formation of CGAs in in vitro experiments [20]. A high level of HCT expression increased CGA accumulation in Solanaceous species [25]. In Lonicera japonica, the HQT gene was found to positively regulate CGA biosynthesis [26].
Overexpression of the HQT gene isolated from Cynara cardunculus var. scolymus in Nicotiana tabacum led to rechannelling of the phenylpropanoid pathway [27]. Some TFs have also been reported to regulate the biosynthesis of CGA. MYB1 was an important transcriptional activator of phenylalanine ammonia-lyase 1 (PAL1), while MYB3 and MYB5 were found to act on the promoter of PAL3 in carrots [28]. ZIP8 bound specifically to the G-box element in the 5'-UTR of PAL2 in Lonicera japonica, and its overexpression led to a decrease in CGA content [26]. The biosynthesis of many phenolic compounds was also reported to be regulated by the WRKY family; for example, WRKYs 38, 45, 60, 89 and 93 acted as activators of HCT2 in poplar [29]. In Salvia miltiorrhiza, AP2/ERF1 was reported to increase the phenolic acid level [30]. However, despite the abundance of CGA compounds in sweet potato and the growing recognition of their importance to human health, there are few data in the literature concerning genes involved in the CGA biosynthetic pathway in leafy sweet potato.
In addition, small non-coding RNAs have been extensively studied for their participation in epigenetic regulation through altering gene expression. Small RNAs such as miRNAs constitute a class of endogenous small non-coding RNAs that range from approximately 20 to 24 nt in length. They negatively regulate the expression of their target genes at the post-transcriptional and translational levels and play crucial roles in diverse biological processes, including plant growth and development, viral defence, metabolism and apoptosis [31]. Although much progress has been made in miRNA research in plants, including a few studies in sweet potato [32][33][34][35], the mechanisms by which miRNAs regulate CGA biosynthesis in leafy sweet potato remain unclear.
To better understand the basis of the high phenolic levels of leafy sweet potato and elucidate the global expression patterns of genes and miRNAs involved in the CGA biosynthetic pathway, the present study employed transcriptome, small RNA and degradome sequencing approaches using two leafy sweet potato genotypes. These genotypes comprised one high-CGA content genotype and one low-CGA content genotype.
The comprehensive and integrated analysis of different datasets identified DE mRNAs, DE miRNAs, and DE miRNA targets in CGA biosynthesis of leafy sweet potato.

Plant materials
Two outstanding leafy sweet potato genotypes, the variety Fushu No. 7-6 (F) and the line EC16 (E), were chosen for further study based on previous selection (data not shown).
Variety Fushu No. 7-6 was bred by Fujian Academy of Agricultural Sciences and introduced into Hubei Academy of Agricultural Sciences as a resource. EC16 was one of the progenies of Fushu No. 7-6. Both were kept in the plant nursery of the Food Crops Institute of Hubei Academy of Agricultural Sciences, planted in potting soil on May 2nd and grown under standard production practices. The leaves of the two genotypes were sampled at two stages: 65 days (S1) and 85 days (S2) after planting. Each sample was pooled from the leaves of three individual plants, and three biological replicates were collected. Part of each sample was immediately frozen in liquid nitrogen and stored at -80°C for transcriptomic analysis. The remaining samples were rinsed gently and dried in a blast drier (Shangce, Wuhan, China) at 70°C. After being powdered in a blender, the dehydrated samples were passed through a 60-mesh sieve, placed in sealed plastic bags and kept in a freezer at -20°C for TP and CGA measurement.

Determination of TPC
TP was determined following Xu et al. [12] with a few modifications. The sample powder was extracted with a 25-fold volume (w/v) of 70% ethanol for 40 min in an 80°C water bath.
After the solution was centrifuged at 5000 × g for 10 min, the residue was re-extracted with 70% ethanol as described above. The supernatants were combined, concentrated in a rotary evaporator and filtered. The crude solution was diluted with distilled water to 100 ml. One ml of the prepared solution was mixed with 1 ml of Folin-Ciocalteu reagent (Guoyao, Shanghai, China), 3 ml of 7.5% Na2CO3 and 5.0 ml of distilled water in a test tube and allowed to react at 45°C for 1.5 h in a water bath. The absorbance was measured at 756 nm using a UV-2880 spectrophotometer (UNICO, Shanghai, China). A calibration curve of gallic acid (Guoyao, Shanghai, China) ranging from 0 to 0.05 mg/ml was prepared, and the TPC was expressed as mg gallic acid equivalent (GAE) per gram of dry weight (DW).
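The conversion from absorbance to TPC described above can be sketched as follows. This is a minimal illustration rather than the authors' procedure: the standard absorbances, the sample absorbance and the 0.5 g sample mass are made-up values, and the function names `fit_line` and `tpc_mg_gae_per_g` are hypothetical. It assumes a linear calibration and that the calibration concentration refers to the 100 ml diluted extract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def tpc_mg_gae_per_g(absorbance, slope, intercept, final_volume_ml, sample_mass_g):
    """Convert an absorbance reading to mg GAE per g dry weight."""
    conc_mg_per_ml = (absorbance - intercept) / slope  # read off the calibration line
    return conc_mg_per_ml * final_volume_ml / sample_mass_g


# Gallic acid standards spanning 0 to 0.05 mg/ml (range from the text);
# the absorbances are invented for illustration only.
standards_mg_per_ml = [0.00, 0.01, 0.02, 0.03, 0.04, 0.05]
absorbances_756nm = [0.02, 0.14, 0.26, 0.38, 0.50, 0.62]

slope, intercept = fit_line(standards_mg_per_ml, absorbances_756nm)
# Hypothetical sample: absorbance 0.38, extract diluted to 100 ml, 0.5 g powder.
tpc = tpc_mg_gae_per_g(0.38, slope, intercept, final_volume_ml=100.0, sample_mass_g=0.5)
```

Readings outside the 0 to 0.05 mg/ml range of the standards would require further dilution before this conversion is meaningful.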

Determination of CGA contents by HPLC
The sample powder was extracted with a 50-fold volume (w/v) of 70% ethanol for 40 min at 80°C. The solution was centrifuged at 5000 × g for 10 min, and the residue was re-extracted twice with 70% ethanol as described above. The supernatant was filtered through a cellulose acetate membrane filter (0.2 µm, Advantec, Japan) and used for analysis. A 20 µl portion of the filtrate was injected into the HPLC system and eluted as described below.
The HPLC system was an Agilent 1260 system (Agilent Technologies Inc., USA), and the column was a ZORBAX Eclipse Plus C18 column (4.6 × 250 mm, 5 µm HPLC column, Agilent Technologies Inc., USA). The column oven temperature was set at 40°C. The mobile phase

Expression analysis and annotation
Raw RNA-seq reads were first processed with SeqPrep (https://github.com/jstjohn/SeqPrep) and Sickle (https://github.com/najoshi/sickle) to remove low-quality reads and reads containing poly-N and adaptor sequences. The clean RNA-seq reads were mapped to the Ipomoea triloba (NSP323.v3) genome using HISAT [37]. The mapped reads were assembled by StringTie. Gene expression levels were estimated as FPKM (fragments per kilobase of transcript per million fragments mapped) values calculated with RSEM [52]. An mRNA was considered a DE mRNA when it exhibited a two-fold or higher expression change and an FDR below 0.01 between the respective stages, as determined with the DESeq2 R package [53]. To functionally characterize the pathways and expression clusters, the BLAST algorithm was used to annotate DE mRNAs against the COG, KEGG, KOG, Swiss-Prot, TrEMBL and Nr databases.
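As an illustration of the two quantities used above, the sketch below computes FPKM from its textbook definition and applies the stated DE thresholds (fold change of at least two, FDR below 0.01). It is not the RSEM or DESeq2 implementation, both of which use statistical models rather than these closed-form expressions; the function names and the example numbers are hypothetical.

```python
import math


def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def is_de_mrna(log2_fold_change, fdr, min_fold=2.0, max_fdr=0.01):
    """Apply the thresholds stated in the text: >= two-fold change, FDR < 0.01."""
    return abs(log2_fold_change) >= math.log2(min_fold) and fdr < max_fdr


# Hypothetical gene: 500 fragments, 2 kb transcript, 25 million mapped fragments.
example_fpkm = fpkm(500, 2000, 25_000_000)

# Hypothetical test results for three genes: (log2 fold change, FDR).
calls = [is_de_mrna(1.0, 0.005), is_de_mrna(-1.2, 0.005), is_de_mrna(0.8, 0.001)]
```

The fold-change test is applied to the absolute log2 value so that upregulated and downregulated genes are treated symmetrically.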
For sRNAs, the clean reads were aligned to the Ipomoea triloba genome [36] via Bowtie [54], and the reads aligned to the reference genome were then searched against miRBase and Rfam to detect known miRNAs. Precursors of novel miRNAs were predicted with miRDeep2 [40], and only candidates whose precursors (pre-miRNAs) had an MFEI above 0.85 were considered novel. Moreover, the normalized copy number of a novel miRNA was required to be ≥ 10 in at least one small RNA library to avoid potential false positives. The expression levels of miRNAs in each sample were calculated and normalized by the transcripts per million (TPM) algorithm. Differential expression analyses were carried out using the DESeq R package (1.10.1). miRNAs with |log2(Fold Change)| ≥ 1 and FDR ≤ 0.05 were considered DE miRNAs.
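The normalization and novel-miRNA filters above can be sketched as follows. The TPM expression used here (read count scaled to one million mapped reads) is the form commonly used for small RNA libraries and is an assumption, since the text does not spell it out; the function names and example values are hypothetical.

```python
def tpm(read_count, total_mapped_reads):
    """Read count scaled to one million mapped reads (assumed TPM definition)."""
    return read_count * 1_000_000 / total_mapped_reads


def keep_novel_mirna(mfei, tpm_by_library, min_mfei=0.85, min_tpm=10.0):
    """Keep a candidate if MFEI > 0.85 and TPM >= 10 in at least one library."""
    return mfei > min_mfei and any(value >= min_tpm for value in tpm_by_library)


def is_de_mirna(log2_fold_change, fdr):
    """Thresholds from the text: |log2(Fold Change)| >= 1 and FDR <= 0.05."""
    return abs(log2_fold_change) >= 1.0 and fdr <= 0.05


# Hypothetical candidate: 150 reads in a library of 15 million mapped reads.
example_tpm = tpm(150, 15_000_000)
kept = keep_novel_mirna(0.90, [2.0, example_tpm])
```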

Degradome sequencing
Degradome library construction was conducted with the method previously described by German et al. [55], with some modifications. Poly(A)+ RNA was isolated and annealed with Biotinylated Random Primers. The annealed products containing 5'-monophosphates were ligated to a 5' adaptor and used to generate first-strand cDNA. Single-end sequencing using the 5' adapter was performed on an Illumina HiSeq2500 at the BioMarker company (Beijing, China).

qRT-PCR
qRT-PCR analyses were carried out to determine the reliability of the RNA-seq results for expression profile analysis. All primers were designed according to the mRNA sequences and mature miRNA sequences and were synthesized commercially (Tianyi Huiyuan, Wuhan). The primer sequences are presented in Additional file 12: Table S12. qRT-PCR for mRNAs and miRNAs was carried out using 2.0 μl of cDNA product, 10 μl of 2 × qPCR Mix, and the forward and reverse primers at 2.5 μM each in a 20 μl reaction. The reactions were incubated in a real-time thermocycler for 10 min at 95°C, followed by 40 cycles of 15 s at 95°C and 60 s at 60°C. All reactions were run in three replicates, and β-actin served as the endogenous reference gene. The 2^-ΔΔCT method was employed to analyse the relative changes of the genes and miRNAs [56]. A t-test was used to analyse the data generated from qRT-PCR.
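For reference, the 2^-ΔΔCT calculation can be written out as below. The CT values are invented for illustration, and the function name is hypothetical; the sketch assumes roughly 100% amplification efficiency for both the target and the β-actin reference, which the method requires.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Relative expression of a target by the 2^-ddCT method."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCT in the test sample
    delta_control = ct_target_control - ct_ref_control   # dCT in the calibrator
    return 2 ** -(delta_sample - delta_control)


# Hypothetical mean CT values: target 24 vs. reference 18 in the test sample,
# target 26 vs. reference 18 in the calibrator sample.
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
```

A value above 1 indicates higher expression in the test sample than in the calibrator, and a value below 1 indicates lower expression.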
Under the same management conditions, the TP and CGA contents of E were significantly higher than those of F, and the contents at S1 were notably higher than those at S2.

Transcriptome sequencing and analyses
The RNA-seq reads for two genotypes at two stages (three replicates) included 1675.7 million reads, with individual libraries containing 128.4 to 185.7 million reads (Table 1).
Reads from each sample were mapped to the reference genome [36] using HISAT [37].
BLAST mapping [38] revealed 29834 (91.84%) genes with homology to protein sequences in the Nr database. The expression level distributions of the expressed genes are shown in Fig. 3a. Correlation analysis showed that sample L10 had a low correlation with the other two replicates, and it was therefore removed from further DE mRNA analysis (Fig. 3b).
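One simple way to screen replicates in the manner described above is to compute pairwise Pearson correlations of expression values and flag a sample that correlates poorly with every other replicate. The sketch below assumes this rule; the 0.8 cut-off, the sample names and the expression values are hypothetical, as the text does not state the criterion that was applied.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def flag_outliers(samples, threshold):
    """Return names of samples whose correlation with every other sample is below threshold."""
    flagged = []
    for name, values in samples.items():
        others = [pearson(values, v) for other, v in samples.items() if other != name]
        if others and max(others) < threshold:
            flagged.append(name)
    return flagged


# Invented expression values for three replicates of one group.
replicates = {
    "rep1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "rep2": [1.1, 2.0, 2.9, 4.2, 5.1],
    "rep3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
outliers = flag_outliers(replicates, threshold=0.8)
```

Using the maximum rather than the mean correlation keeps the two concordant replicates from being penalized for their poor agreement with the discordant one.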

Differentially expressed gene analyses and annotation
The TP and CGA contents increased across the four pairwise combinations (FS2 vs. FS1; ES2 vs. ES1; FS2 vs. ES2; FS1 vs. ES1); thus, DEGs co-regulated across the combinations indicated the pivotal steps in the CGA biosynthetic pathway. In total, 6961 DEGs were found across the four combinations (Additional file 1: Table S1). The number of DEGs ranged from 1315 (690 upregulated; 625 downregulated) for FS1 vs. ES1 to 4482 (2196 upregulated; 2286 downregulated) for FS2 vs. FS1 (Fig. 4a). A total of 1685 DEGs exhibited common expression patterns between FS2 vs. FS1 and ES2 vs. ES1, and 711 between FS1 vs. ES1 and FS2 vs. ES2; an overlap of 63 common DEGs was found across all four combinations (Fig. 4b). Across the stage-specific and genotype-specific groups, 2333 commonly expressed DEGs were present in at least two comparisons and were therefore considered for further analysis.
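The count of 2333 follows from the reported overlaps by inclusion-exclusion: 1685 + 711 - 63 = 2333, because the 63 DEGs shared by all four combinations are counted in both groups. The set operations can be sketched as follows; the gene identifiers are toy placeholders, not genes from this study, and the function name is hypothetical.

```python
def common_degs(fs2_fs1, es2_es1, fs2_es2, fs1_es1):
    """Return stage-common, genotype-common and combined DEG sets."""
    stage_common = fs2_fs1 & es2_es1        # shared by both stage comparisons
    genotype_common = fs2_es2 & fs1_es1     # shared by both genotype comparisons
    return stage_common, genotype_common, stage_common | genotype_common


# Toy DEG sets (placeholder identifiers).
stage, genotype, combined = common_degs(
    fs2_fs1={"g1", "g2", "g3", "g4"},
    es2_es1={"g2", "g3", "g4", "g5"},
    fs2_es2={"g4", "g6", "g7"},
    fs1_es1={"g4", "g7", "g8"},
)

# Inclusion-exclusion with the counts reported in the text.
reported_total = 1685 + 711 - 63
```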
To functionally characterize the expressed genes, we first used the BLAST algorithm to  Table S2. Out of these DEGs, the 2333 commonly expressed ones were assigned to 47 GO terms (Additional file 3: Table S3). The GO terms for DEGs were assigned to each of the biological process (20), molecular function (14) Table S4). Further, pathway analysis of the commonly expressed DEGs was carried out using the KEGG database to understand the molecular mechanism. The DEGs were found to represent 288 pathways (Additional file 5: Table S5). The enrichment analysis suggested that photosynthesis-antenna proteins (map00196), starch and sucrose metabolism (map00500), drug metabolism-cytochrome P450 (map00982) and phenylpropanoid biosynthesis (map00940) were among the most enriched pathways (Fig. 5b). A total of 134 transcription factors (TFs) belonging to 26 families were identified as differentially expressed. Among them, C2C2 (18), AP2/ERF (16), MYB-related (11) and bHLH (11) were the most overrepresented TF families (Fig. 5c).

Metabolic pathway and gene analysis during CGA accumulation
To provide a global view of leafy sweet potato secondary metabolism, commonly expressed genes with different map IDs were submitted for analysis via the online Interactive Pathways Explorer (iPath) v2 (Fig. 6) [39]. The red lines in Fig. 6 indicate pathways in which the genes involved were enhanced. Pathways such as the pentose phosphate pathway (Fig. 6a), phenylalanine biosynthesis (Fig. 6b), the CGA biosynthetic pathway (Fig. 6c), flavonoid biosynthesis, carotenoid biosynthesis and brassinosteroid biosynthesis were enhanced, in accordance with the results of the GO analysis. As pentose phosphate metabolism, phenylalanine biosynthesis and the CGA biosynthetic pathway were vital steps for CGA biosynthesis, the genes involved in these pathways were illustrated in full.
In this work, genes participating in CGA biosynthesis showed a distinctive expression pattern.

High-throughput small RNA sequencing
The small RNA sequencing resulted in 248.6 million clean reads, with 14.4 to 30.0 million reads per library. Reads with lengths > 17 nt and < 33 nt were kept, followed by the removal of ribosomal RNA (rRNA), transfer RNA (tRNA), small nucleolar RNA (snoRNA) and repetitive sequences (Table 1). The length distribution patterns of the sRNAs were similar in the eleven libraries. They ranged from 18 to 30 nt, of which 24 nt was the most abundant size (Fig. 8a). To identify known miRNAs, the filtered reads were searched against the miRNAs in miRBase. A total of 149 known miRNAs were obtained.
As some of the known miRNAs aligned to more than one pre-miRNA, the detailed information of all aligned miRNAs is listed in Table S6. The length distribution of known miRNAs exhibited a peak at 21 nt, similar to the results reported previously in sweet potato and other species (Fig. 8b) [32][33][34][35]. Reads that could not be mapped to miRBase were subjected to novel miRNA prediction by miRDeep2 [40]; the most frequent length of novel miRNAs was 24 nt, followed by 21 nt (Fig. 8c). A total of 22 novel miRNAs were identified and are listed in Table S6. The folding free energies of the hairpin structures of novel miRNAs ranged from -68.37 to -26.52 kcal mol^-1, with an average of -43.47 kcal mol^-1. The minimal folding free energy index (MFEI) of novel miRNAs ranged from 0.86 to 1.69, with an average of 1.13.
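The MFEI values reported above relate the folding free energy of a precursor to its length and GC content. The text does not give the formula, so the sketch below assumes the widely used definition, MFEI = (|MFE| / length × 100) / GC%; the precursor sequence is an artificial placeholder, and only the -43.47 kcal mol^-1 value is taken from the text.

```python
def gc_percent(sequence):
    """Percentage of G and C bases in an RNA or DNA sequence."""
    seq = sequence.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def mfei(mfe_kcal_mol, sequence):
    """Minimal folding free energy index under the assumed definition."""
    adjusted_mfe = abs(mfe_kcal_mol) / len(sequence) * 100.0  # AMFE per 100 nt
    return adjusted_mfe / gc_percent(sequence)


# Artificial 100-nt precursor with 40% GC content (placeholder, not a real hairpin).
precursor = "GC" * 20 + "AU" * 30
index = mfei(-43.47, precursor)
passes_threshold = index > 0.85  # MFEI cut-off stated in the methods
```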

DE miRNA expression during CGA accumulation
miRNAs were considered DE miRNAs if they had |log2(Fold Change)| ≥ 1 and a false discovery rate (FDR) ≤ 0.05. A total of 9, 7, 18 and 9 miRNAs were identified as DE miRNAs in the four combinations, and 5 miRNAs were commonly expressed (Additional file 7: Table S7). The majority of DE miRNAs showed a trend of downregulation during CGA accumulation (Fig. 8d). miR156, miR166, miR167 and miR858, which have been reported to be involved in the phenylpropanoid pathway [41], were found in different combinations.

Target prediction via in silico and degradome approaches
To explore the function of the miRNAs, a computational program was used to predict their target genes. All 171 identified miRNAs were predicted to have 1799 targets via TargetFinder with a score value < 4 [42]. The 1799 miRNA targets were annotated based on the GO, KEGG, COG, Nr, Swiss-Prot and Pfam databases (Additional file 8: Table S8). The targets were assigned to 20 biological processes, 14 cellular components and 11 molecular functions. The 20 most abundant GO terms are shown in Fig. 9a. The significantly enriched GO terms, such as lignin catabolic process (GO:0046274), phenylpropanoid catabolic process (GO:0046271), lignin metabolic process (GO:0009808) and phenylpropanoid metabolic process (GO:0009698), are listed in Additional file 9: Table S9, and they were all involved in the CGA accumulation pathway.
Further, KEGG annotation was carried out to explore the pathways in which the identified targets were involved. A total of 220 pathways were identified, indicating the highly diverse functions of the targets. Phenylpropanoid biosynthesis (map00940), a pathway related to CGA accumulation, was among the 20 most abundant pathways (Fig. 9b).
Using degradome sequencing, a total of 21.94 Mb clean tags and 7,892,630 cluster tags were obtained. The cluster tags were aligned to the transcriptome and the Rfam database for cleavage site analysis. After processing and analysis with CleaveLand [43], a total of 158 miRNA-mRNA pairs were identified (Additional file 10: Table S10). Target analysis showed that many of the genes cleaved by miRNAs were TF genes, including AP2/ERF, bZIP, TCP, MYB and SPL. Some miRNAs had more than one target gene; for example, miR530a targeted microtubule-associated protein 70-1-like and bHLH130-like genes. Conversely, the same gene could be targeted by more than one miRNA; for instance, miR394c and miR384-5p shared the same F-box target. TFs such as AP2/ERF (itb14g16290) and SPL (itb01g24030) predicted in silico were validated by degradome sequencing.

Validation of differential gene and miRNA expression
qRT-PCR analysis was carried out to validate the expression patterns of genes and miRNAs obtained from the RNA and small RNA sequencing. The expression of enzyme-encoding genes (HCT, CCoAOMT, CCR) in the CGA biosynthetic pathway, two synonymous G6PD genes (itb03g00300, itb02g05910) in the pentose phosphate pathway and one phenylalanine biosynthesis-related gene, DHQ/SDH, was validated via qRT-PCR (Fig. 10; Additional file 12: Table S12). The qRT-PCR results were consistent with the mRNA sequencing results, except that the change in CCR in FS2 vs. FS1 was not as significant as in the sequencing result. In addition, six miRNAs, namely Nov-m2294-5p, Nov-m3917-3p, Nov-m4613-3p, sly-miR168a-5p, stu-miR156e and tcc-miR530a, were also validated by qRT-PCR (Fig. 10; Additional file 12: Table S12). Similar expression trends (upregulated or downregulated) were observed between the qRT-PCR analysis and the sRNA sequencing results.

Discussion
Leafy sweet potato is extremely popular among consumers in China because it is beneficial to health. CGA in leafy sweet potato is not only the key attribute for the health benefits of fresh consumption, but also has potential applications in the food and pharmaceutical industries. The major objective of this research was therefore to comprehensively study CGA metabolism and investigate the molecular basis of this pathway in leafy sweet potato. The availability of diverse germplasm resources and high-throughput approaches, namely transcriptome, small RNA and degradome sequencing, provides an opportunity to dissect the mechanism. We investigated two well-characterized genotypes at two stages for their different CGA accumulation (Fig. 2a-c). Although a few genes involved in CGA biosynthesis have been reported in other plant species [14][22][23][24][25][26][27][28][29][30], the underlying molecular mechanisms in leafy sweet potato remain largely unknown. In the present study, the reference-guided transcriptome analysis of two genotypes at two stages generated a total of 29834 genes. The sequences for all genes in routes 2 and 3 were assembled. Therefore, it was reasonable to conclude that routes 2 and 3 were the main pathways for the biosynthesis of CGA in leafy sweet potato. This result was not in accordance with the report by Kojima and Uritani [20], in which the biosynthesis of CGA was assumed to proceed by route 1. We speculate that the mechanism of CGA biosynthesis in the leaves of sweet potato differs from that in the root. There were 4426, 2289, 2678 and 1273 DEGs annotated for FS2 vs. FS1, ES2 vs. ES1, FS2 vs. ES2 and FS1 vs. ES1, respectively. As the DEGs overlapping between FS2 vs. FS1 and ES2 vs. ES1, and between FS2 vs. ES2 and FS1 vs. ES1, were potential DEGs explaining stage-specific and genotype-specific CGA differences, the 2333 commonly expressed DEGs identified in at least two comparisons were considered for further analysis.
It has been acknowledged that the accumulation of CGAs is a multifaceted process that can be traced to the pentose phosphate pathway, where the precursor phosphoenolpyruvic acid (PEP). Following this process was the shikimic acid pathway, which was the main pathway leading to the production of phenylalanine, and then CGAs were produced by the For HCT, all homologous genes were upregulated except the most highly expressed one (itb03g29460). Although it has been shown to synthesize caffeoylquinate in vitro [18], HCT is involved both upstream and downstream of the 3-hydroxylation step (Fig. 1). Its inhibition could hinder the conversion of the predominant caffeoylquinate into caffeoyl-CoA, which leads to lignin biosynthesis, and CGA would therefore accumulate. The same phenomenon was reported by Hoffmann et al. [17]. The CGA mechanism also involved a number of TFs, such as C2C2, AP2/ERF, MYB-related and bHLH. miRNAs have emerged as master modulators of gene expression and are promising tools for crop improvement [45]. A few studies in sweet potato have reported the genome-wide discovery of miRNAs [32][33][34][35], but no study had yet characterized the roles of miRNAs in CGA biosynthesis. In the present study, a total of 149 known and 22 novel miRNAs were identified. Analysis of the expression patterns of these miRNAs showed that more miRNAs were downregulated than upregulated across the four combinations (Fig. 8d), and thirty-eight miRNAs were recognized as DE miRNAs. Most of the DE miRNAs were known miRNAs.
By target prediction, 1799 miRNA targets were annotated based on the GO, KEGG, COG, Nr, Swiss-Prot and Pfam databases (Additional file 8: Table S8). GO enrichment and KEGG analysis showed that the genes were enriched in lignin and phenylpropanoid catabolic processes. Degradome sequencing, which has been successfully applied to identify miRNA targets in many plant species [49,50], was employed to verify the prediction results. In the degradome sequencing analysis, the majority of target genes were transcription factors, including SPL, HD-ZIP and MYB genes. These TFs have all been reported to be related to the phenylpropanoid pathway. For instance, SPLs play important roles in plant growth and development. The miR156/SPL module was reported to participate in the biosynthesis of phenylpropanoids by destabilizing the MYB-bHLH-WD (MBW) complex and directly preventing the expression of anthocyanin biosynthetic genes in Litchi chinensis [51]. As HD-ZIP TFs play crucial roles in the shoot apical meristem and organ polarity, the blockage of miRNA165/166 caused the upregulation of HD-ZIP TFs and increased IAA content, accompanied by enhanced anthocyanin [41]. In this research, the degradome sequencing analysis demonstrated that SPLs were targeted by miR156, HD-ZIP by miR166 and MYB by miR159. These results suggest that miR156, miR166 and miR159 might be involved in the CGA biosynthetic pathway. SPL (itb01g24030), predicted in silico, was also validated by degradome sequencing.
ES2 and its target MYB (itb12g01510) was downregulated. All these target genes were noted to be associated with CGA biosynthesis.

Conclusions
In summary, the present study integrated mRNA and miRNA expression data along with degradome analysis to identify key factors in CGA biosynthesis in leafy sweet potato. The study revealed a complex mechanism, in which pentose metabolism and lignin biosynthesis were both related to CGA biosynthesis. A set of genes and miRNAs were identified as playing roles in CGA biosynthesis. They could serve as targets for further research on gene functions. This study provided a foundation for studying the CGA biosynthetic system in leafy sweet potato, and the results could be used to improve leafy sweet potato varieties for both consumer health benefits and pharmaceutical use in the future.
measurements from various samples, analyzed the data and prepared the manuscript under the supervision of WZ and XY. LW, JL, SC, and CJ provided manuscript revision advice. We also thank all the fellows in Dr. Yang's lab. All authors have read and approved the manuscript.

Figure 2
Measurements of TP and CGA contents in genotypes E and F at stages S1 and S2. a Measurements of TPC in genotypes E and F at stages S1 and S2. b Measurements of CGA monomers in genotypes E and F at S1. c Measurements of CGA monomers in genotypes E and F at S2. Error bars indicate SD (n = 3). ** P < 0.01 (t-test).