Combined Transcriptomic and Metabolic Analysis Reveals the Potential Mechanism for Fruit Development and Quality Control of Rubus chingii Hu

Background: Rubus chingii Hu (Chinese raspberry) is an important dual functional food with nutraceutical and pharmaceutical values. Comprehensive understanding of fruit development and bioactive components synthesis and regulation could accelerate genetic analysis and molecular breeding for the unique species. Results: A combined transcriptomic and metabolic analysis of R. chingii fruits from different developmental stages was conducted in this study. A total of 89,188 unigenes was generated and 57,545 unigenes (64.52%) were got annotated. Differential expression genes (DEGs) and differential ions mainly involved in biosynthesis of secondary metabolites. Phenolic acids and avonol glycosides syntheses were strongly activated at earlier stages, while amino acids, linolenic acid metabolism and anthocyanins synthesis were dominant at later stages. The core genes participated in biosynthesis of ellagic acid (EA) and kaempferol-3-O-rutinoside (K-3-R) and their corresponding metabolites were elaborately characterized. And some probable MYB and bHLH transcription factors controlling avonoids synthesis were also identied. Conclusion: Combined transcriptomic and metabolic analysis initially reveals molecular and chemical mechanism of fruit development of R. chingii Hu fruit. The fruit launched the biosynthesis of phenolic acids and avonols at the very beginning of fruit set and then coordinately accumulated and converted. And it was tightly regulated by expressions of the related genes and transcription factors. The results provide a solid foundation for genetic analysis, functional genes isolation, fruit quality improvement and modiable breeding. was synthesized by reverse transcriptase with random hexamer primers, and the second strand cDNA was synthesized using DNA polymerase I and RNase H. After purication, the cDNA fragments were connected with adapters and those with suitable were selected for PCR amplication. Library quality and quantity were assessed by Agilent 2100 Bioanaylzer and ABI StepOnePlus Real-Time PCR System. Finally, the library was sequenced using Illumina (HiSeq X-Ten) platform. pathway analysis was performed to help understanding the functions of metabolites.

Rubus L. is a large and diverse genus with a total of 900-1000 species attributed to the frequent intraspeci c and interspeci c hybridization, and is divided into 12 subgenera [24].Until now, a wide spectrum of wild species and germplasms keep unexploited.The most economically important and popular species of Rubus are red raspberry, black raspberry, blackberry, and their hybrids, but their planting areas (50-70°N) are limited due to the strict chilling requirement.Nevertheless, only in recent years has the genomic information of the genus been unveiled.VanBuren et al. [25,26] revealed the near complete genome of black raspberry with V1 of 243 Mb sequences using next generation sequencing (NGS)-based approach, or modi ed V3 of 290 Mb sequences using sing molecule real-time PacBio sequencing (SMRT) and Hi-C genome scaffolding.In the case of transcriptomic analysis, large-scale sequencing data and genes information of red raspberry (R. idaeus cv.Nova) [27], black raspberry (R. occidentalis) grew in Korea [28,29], blackberry (Rubus spp.Var.Lochness) [30,31], black raspberry (R. occidentalis) 'Jewel' [25,26], R. coreanus [32] and Himalayan raspberry (R. ellipticus) [33] have been identi ed, and differentially expressed genes during fruit ripening, especially genes involved in anthocyanin accumulations, were generally veri ed [27,28,32].However, the fruit ripening process is a comprehensive network of metabolites changes relying on related genes expressions and enzymes activities.In this perspective, combined transcriptomics and metabolomics approaches will provide a powerful dissection tool for better comprehension of biochemical, physiological, and organoleptic changes in the reproductive organs.Hyun et al. [28] made rst attempt to establish a preliminarily integrated understanding of black raspberry fruit development by combining these two omics analysis and facilitated the fundamental cognition of gene-metabolite relationships.What's more, some transcription factors, such as MYB, bHLH, WRKY and MADS, involving in anthocyanin synthesis were also tested in berries [28,34].Nonetheless, different berries had speci c biosynthetic routes and distributions of secondary metabolites that regulated by large disparate genetic expression patterns [35,36].For instance, mandarin melonberry enriches genistein and its derivatives while Korean black raspberry contains a relatively higher proportion of avonoid-and anthocyaninsrutinoside forms.Up to now, there's still no genetic, transcriptomic and metabolic information for R. chingii.And synthesis of bioactive components in Rubus remains unclear.
With the increased demands from consumers for varied, nutritious, healthy foods and locally grown fruits in China under the warmer climate condition (20-40°N), the planting area of R. chingii rapidly expanded.Our groups have tracked phenological periods of R. chingii for more than 5 years and screened nearly 20 different germplasm resources with special traits, such as thornless, big fruit, sweet fruit, etc, from thousands of seedlings.Among these, L7 was selected in this study with good taste and highly-contained medicinal ingredients.And the gene expressions and metabolites syntheses, especially the phenolic compounds and their regulators, in the process of R. chingii fruit development, were analyzed by combined transcriptome and metabolome analysis.The results provide a solid foundation for genetic analysis, functional genes isolation, fruit quality improvement and Rubus breeding.

Sample collection
Rubus chingii Hu is widely distributed in the eastern and southern China, especially in Provinces of Jiangxi, Zhejiang, Anhui, Jiangsu, Fujian and Guangxi.In last decade, numerous wild seedlings from local hills were domesticated and commercially planting in local areas due to its concerned values for human health.More than 12,000 wild and cultivated plantlets were domesticated or bought, and cultivated from 2011, in the farm of Huahan Raspberry Professional Cooperative on a rental basis, which is located at the foot of a hill (28°73'39"N, 121°09'11"E) in Linhai city, Eastern China.Seedlings were all formal identi ed by taxonomists Pro.Moshun Chen.Actually, there are abundant variations in intra-species of R. chingii, but no national variety has been certi ed yet.We have screened nearly 20 different germplasm and homogeneous L7 plants with good taste and highly-contained bioactive components were selected in this study.
Fruit set of L7 occurs in late-March and fruits mature in early-May in this area.Eight stages of fruit growth and development were designated according to anthesis, fruit size and color, including small green (SG, 7 days postanthesis, 7 DPA), medium green (MG, 14 DPA), big green I (BGI, 21 DPA), big green II (BGII, 28 DPA), big green III (BGIII, 35 DPA), green-to-yellow (GY, 42 DPA), yellow-to-orange (YO, 48 DPA) and red (Re, 54 DPA) (Fig. 1).For R. chingii cultivation, high-yield period arrives at the third year.So after planting 2 years, a total of about 50 fruits at each stage from different individuals of the same germplasm (L7) were harvested in 2017.Fruit samples were frozen immediately in liquid nitrogen and then kept at -80 °C until use.Three biological repetitions for RNA preparation and six biological repetitions for metabolites extractions were carried out in this study.
Meanwhile, another group of more than 50 fruits at every stage from the same conditions was harvested for measurements of fruit quality, including size, fresh weight, dry weight, and content of total soluble solid, Vc, total phenols, total terpenoids, ellagic acid and kaempferol-3-O-rutinoside. Soluble solid of mature fruit was assessed by refractometer PAL-1 (ATAGO, Japan).Vc was measured using molybdenum blue colorimetry [37].RNA preparation, cDNA library construction and sequencing Total RNAs of the four representative stages (BGI, GY, YO and Re) fruits was isolated using modi ed cetyltrimethyl ammonium bromide (CTAB) method [38].RNA-seq library construction and sequencing was performed by the Beijing Genomics Institute (BGI) (Wuhan, China).mRNA was enriched with Oligo (dT) method and fragmented using divalent cations under elevated temperature.Then rst strand cDNA was synthesized by reverse transcriptase with random hexamer primers, and the second strand cDNA was synthesized using DNA polymerase I and RNase H.After puri cation, the cDNA fragments were connected with adapters and those with suitable size were selected for PCR ampli cation.Library quality and quantity were assessed by Agilent 2100 Bioanaylzer and ABI StepOnePlus Real-Time PCR System.Finally, the library was sequenced using Illumina (HiSeq X-Ten) platform.

Tanscriptome de novo assembly, annotation, and expression analysis
The raw reads were rstly cleaned by removing adaptor-polluted reads, low-quality (with more than 20% Q < 15 bases) and high content of unknown base (N > 5%) reads.Then tanscriptome de novo assembly of clean reads was performed using Trinity (v2.0.6) [39].The harvested transcripts were clustered into nal unigenes using Tgicl (v2.0.6) [40].The unigenes were divided into two types, one type was cluster, which the pre x was CL with the cluster ID behind it (in one cluster, there were several unigenes that the similarity between them was more than 70%), and another type was singleton with the pre x of Unigene.
For gene expression analysis, we mapped clean reads to unigenes using Bowtie2 [46] and then calculated the gene expression level by RSEM (v1.2.12) with the formula: Fragments Per Kilobase per Million mapped fragments (FPKM) = 10 6 C/(NL/10 3 ) [47].We performed hierarchical clustering analysis with hclust function and PCA analysis with princomp function of R. Software.Differential expression genes (DEGs) were detected with DEseq2 as the request described by Michael et al. [48] Hierarchical clustering for DEGs was carried out using pheatmap.We classi ed the DEGs based on the GO and KEGG annotation results and o cial classi cation, and also mapped functional enrichment using a phyper.Hypergenometric test was adopted, and the terms which false discovery rate (FDR) was no longer than 0.01 were de ned as signi cant enrichment.

Metabolites extraction
Fruit sample (25 mg) from BGI, GY, YO or Re stage was extracted with frozen solution of 800 µL methanol: water (1:1, v/v) using Tissue-Lyser at 60 Hz for 4 min.Then the mixture was kept at -20 °C for 2 h and centrifuged at 4 °C under 30 000 g for 20 min.500 µL supernatant was transferred to a new Ep tube for UPLC-Q-TOF-MS analysis.

UPLC-Q-TOF-MS analysis
All samples were acquired by the LC-MS system followed machine orders at the Beijing Genomics Institute (BGI) (Shenzhen, China).Firstly, all chromatographic separations were performed using an ultra performance liquid chromatography (UPLC) system (Waters, UK).An ACQUITY UPLC BEH C18 column (100 mm × 2.1 mm, 1.7 µm, Waters, UK) was used for the reversed phase separation.The column oven was maintained at 50 °C.The mobile phase consisted of solvent A (water + 0.1% formic acid) and solvent B (methanol + 0.1% formic acid) with the ow rate of 0.4 mL/min.Gradient elution conditions were set as follows: 100% phase A from 0 to 2 min, 0-100% phase B from 2 to 11 min, 100% phase B from 11 to13 min, and 0-100% phase A from 13 to 15 min.The injection volume for each sample was 10 µL.
Then the eluted metabolites were detected by a high-solution tandem mass spectrometer Xevo G2-XS QTOF (Waters, UK) operated in both positive and negative ion modes.For positive ion mode, the capillary and sampling cone voltages were set at 1.0 kV and 40.0 V, respectively, while for negative ion mode, they were changed to 3.0 kV and 40.0 V.The mass spectrometry data were acquired in Centroid MSE mode.The TOF mass range was from 50 to 1200 Da with a 0.2 s scan accumulation time.Moreover, for the MS/MS detection, all precursors were fragmented using 20-40 eV, and the scan time was 0.2 s.During the acquisition, the LE signal was monitored to calibrate the mass accuracy.Furthermore, a quality control sample (pool of all samples) was collected after every 10 samples, in order to evaluate the stability of the LC-MS during the whole acquisition.
The raw data were pretreated using metaX to lter low quality ion [49], and the ions of RSD ≤ 30% were kept for next analysis by Progenesis QI (v2.2), including baseline correction, peak detection and matching, and retention time alignment.Major metabolites were identi ed using Progenesis QI (v2.2) and KEGG database by comparing both the mass spectra and retention time.The multivariate statistical analysis was performed using metaX, including Principal Component Analysis (PCA) and PLS-DA, and metabolites with a Variable Important for the Projection (VIP) value (≥ 1.0) were selected as suitable products.Finally, metabolites of signi cantly different expression were screened based on VIP value and Fold change (FC) analysis with FC ≥ 1.2 or ≤ 0.8333 and q-value < 0.05.Then these differential ions were identi ed using Progenesis QI (version 2.2).Heatmap was generated to visualize metabolite pro le and KEGG pathway analysis was performed to help understanding the functions of metabolites.
Real-time quantitative PCR assay Total RNAs from eight independent stages were isolated respectively using RNAprep Pure Plant Plus Kit (Polysaccharides & Polyphenolics-rich, DP441, Tiangen Biotech, China) according to the manufacturer's instruction.
After concentration and purity detections by NanoDrop2000 (Thermo, USA), the RNA was reverse-transcribed to cDNA using PrimeScript™ RT Master Mix Kit (RR036A, TaKaRa, Japan).The quantitative expressions of enzyme and TFs encoding genes involved in synthesis of phenolic acid compounds and avonoids were determined by qPCR in triplicate with TB Green® Premix Ex Taq™ (RR420A, TaKaRa, Japan) using CFX96 real-time PCR detection system (Bio-Rad, USA).The parameter was set as follows: 95 °C for 10 min, followed by 40 cycles of 95 °C for 10 s, 55 °C for 30 s, and 72 °C for 20 s, and then melting carve was made.Actin was chosen as the control.PCR primers were listed in Table S1.The relative expression level was calculated using 2 −ΔΔCt method [13].

Measurement of total phenols and terpenoids
The fully dried fruits were ground to powder and sieved through 60-mesh.Total phenolic (TP) contents were determined by the Folin-Ciocalteau method with a few modi cations [19,31].1.0 g of each sample was re uxed and condensed three times with 30 mL 70% (v/v) ethanol (T = 45 °C, t = 30 min).After ltration, the extract was concentrated by rotary evaporation and then diluted with pure deionized water to the nal volume of 100 mL.Finally it was thoroughly extracted with petroleum ether.For chromogenic reaction, 1 mL of each extraction was mixed with 1 mL Folin-Ciocalteu solvent in 10 ml volumetric ask with a vortex for 30 s, and then 3 mL of 20% Na 2 CO 3 was added to the mixture, following deionized water up to the constant volume.After being kept in dark for 60 min at room temperature, the optical density was detected at wavelength of 765 nm.The standard calibration curve was plotted with 0 ~ 0.08 mg/mL gallic acid.
For total terpenoids (TT) assessment, colorimetric method was applied with 5% vanillin-acetic acid and perchloric acid as the developer [50].First, 0.1 g fruit powder was extracted by 25 mL 55% ethanol with ultrasonic treatment for 30 min.After ltration, 1 mL extraction was kept at 70 °C water bath for 15 min, and then 0.2 mL 5% vanillin-acetic acid and 0.4 mL perchloric acid were added.The mixture was heated at water bath of 70 °C for 15 min and then 5 mL acetic acid was supplemented.Fifteen minutes later, the absorbance at 480 nm was recorded.Oleanolic acid (0 ~ 0.04 mg/mL) was used as standard substance.

HPLC analysis of ellagic acid and kaempferol-3-O -rutinoside
The principal medical ingredients of R. chingii Hu, EA and K-3-R, were analyzed by HPLC in terms of the standards described in Pharmacopoeia of the People's Republic of China [5].Brie y, for ellagic acid determination, 0.5 g fruit powder was re uxed with 50 mL 70% (v/v) methanol for 60 min and then made up for the weight.After ltration and suitable dilution, sample of 20 µL was injected into HPLC system (LC-20A, Shimadzu, Japan) with C18 column (25 °C) and the mobile phase of acetonitrile − 0.2% phosphate (20:80) at the ow rate of 1 mL/min.The wavelength was set at 254 nm.
For kaempferol-3-O-rutinoside measurement, fruit powder of 1 g was re uxed with 50 mL 70% (v/v) methanol for 60 min and the de cient weight was replenished by 70% methanol.After ltration, 25 mL extracting solution was dried by distillation and the residue was re-dissolved with 20 mL water.Then the solution was extracted by 20 ml petroleum ether for three times.After that, the supernatant was extracted by 20 mL water-saturated butanol solvent for another three times.Finally, the butanol was dried by distillation and the residue was re-dissolved with 5 mL methanol.20 µL of the nal extracting solution was analyzed by HPLC with the same mobile phase and ow rate as EA analysis.The wavelength was set at 344 nm and the temperature for C18 column was 30 °C.

Results
Fruit development of R. chingii R. chingii Hu is an erect shrub with biennial canes that sprout from the perennial root system (Fig. 1a).The axillary mixed buds were initiated from summer of the rst year of growth, differentiated in autumn, and then followed by dormancy in winter.At early-March of the next year, new leaves and ower buds emerge.One week later seedlings began to blossom and their orescence usually lasted for 20 d.The whole process of fruit development could be divided into eight stages: small green (SG, 7 DPA, March 24), medium green (MG, 14 DPA, March 31), big green I (BGI, 21 DPA, April 7), big green II (BGII, 28 DPA, April 14), big green III (BGIII, 35 DPA, April 21), green-to-yellow (GY, 42 DPA, April 28), yellow-to-orange (YO, 48 DPA, May 4) and red (Re, 54 DPA, May 10) (Fig. 1b).During the two weeks of the big green stages, there were no obvious changes of appearance.The fresh weight, dry weight, vertical diameter and transverse diameter of fruits were increased rapidly at early stages (7 ~ 21 DPA), then remained stable with big green appearance at middle stages (21 ~ 35 DPA), and nally sharp increased at last stages (42 ~ 54 DPA) until full maturity (Fig. 1c and d).The yield of medicinal fresh fruit at GY stage reached about 36.5 Kg per hectare.The content of water and soluble solid of red fruit were 83.11% and 15.8%,respectively.The Vc content of R. chingii L7 reached 66.09 mg/100 gFw.

Transcriptome sequencing and de novo assembly
In our project, about 80.42 Gb bases in total were generated on Illumina HiSeq sequencing platform.After removing low-quality reads and trimming adapter sequences, approximately 639.41 million total raw reads with high quality were obtained from all samples of the four representative developmental stages of R. chingii fruits (Table S2).Then de novo assembly was carried out with clean reads using Trinity to construct the full-length transcript.After ltering the abundance by Tgicl, a unigene database for R. chingii containing 89,188 unigenes was established.The total length, average length, N50, and GC content of these unigenes were 96,361,653 bp, 1,080 bp, 1,916 bp and 41.72%, respectively (Table 1).contribute to the special characteristics of R. chingii.These results suggest that de novo assembled transcripts covered a wide range of genetic information of R. chingii, and provided an invaluable resource for facilitating the identi cation of novel genes implicated in speci c developmental and metabolic processes.
In the case of Eukaryotic Orthologous Groups (KOG) assignment, 25 functional groups were classi ed.Among these, except for the "general function prediction only'', the cluster for "Signal transduction mechanisms" (6,552 unigenes) represented the largest group, followed by "Posttranslational modi cation, protein turnover, chaperones" (4,772 unigenes), "Transcription" (3,554 unigenes), "Carbohydrate transport and metabolism" (2,705 unigenes) and "RNA processing and modi cation" (2,509 unigenes), suggesting the active metabolisms in the procedure of fruit development of R. chingii (Figure S1).Further, to identify the precise biochemical pathways, the sequences were mapped against the KEGG pathway.As a result, a total of 37,833 unigenes were involved in different pathways (Figure S2).And most of them were participated in metabolisms, including "Carbohydrate metabolism (3,144 unigenes)", "Amino acid metabolism (1,719 unigenes)", "Lipid metabolism (1,533 unigenes)", "Biosynthesis of other secondary metabolism (1,307 unigenes)", "Metabolism of terpenoids and polyketides (788 unigenes)", and so on.All these results were in accordance with the accumulations of fruit nutrients and vigorous secondary metabolic synthesis for medicine constituents in R. chingii Hu.
Metabolite pro ling of R. chingii fruits at different ripening stages Mass spectrometry based on metabolite pro ling of R. chingii fruits at four stages was also performed to investigate the changes of metabolic compositions.In total, 8,352 ions were captured in positive ion additive mode (pos).After ltering low quality and RSD > 30% ions, 7,611 ions (92.61%) were harvested.The identi cation numbers at MS and MS2 levels were 4730 and 2999, respectively.Simultaneously, the number of total ions, RSD ≤ 30% ions, MS ions and MS2 ions in negative ion additive mode (neg) were 6276, 4683 (76.15%), 2070 and 1130, respectively.Multivariated analyses of Principal Component Analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed to support pattern recognition of the metabolic differences in R. chingii samples at different stages.In pos, the score of PC1 and PC2 were 40.48% and 30.76%, respectively (Fig. 3).Stages were relatively separated from each other, especially the BG and Re stage at PC1 axis.And the same trend was found in neg.
Reliable PLS-DA models were also established between every two stages with high values of R2 and Q2: 0.998 and 0.991 from GY:BG (pos), 0.994 and 0.950 from YO:GY (pos), 0.994 and 0.965 from Re:YO (pos), and similar in neg mode.Then based on the VIP value (≥ 1) by PLS-DA analysis, fold-change (≥ 1.2 or ≤ 0.8333) by univariate analysis and q-value < 0.05, differential ions were screened and identi ed (Table 2, Figure S3).Differential expression genes and their products during the four stages of R. chingii Fruit Differential expression genes (DEGs) were detected between every two stages with the parameters: Fold Change ≥ 2.0 and Adjusted Pvalue ≤ 0.05.As a result, 8,792 genes were differentially expressed between the green-to-yellow turning stage (GY) and big green I stage (BGI).Among these genes, expressions of 4,054 genes were upregulated while 4,738 genes were downregulated.Meanwhile 6,439 genes were annotated in KEGG pathway and 134 pathways were involved.The top enrichment pathways were as follows: "Metabolic pathways" with 1,648 (25.59%)DEGs, "Biosynthesis of secondary metabolites" with 1,033 (16.04%)DEGs, "Phenylpropanoid biosynthesis" with 226 (3.51%) DEGs, "Starch and sucrose metabolism" with 164 (2.55%) DEGs, "Flavonoid biosynthesis" with 93 (1.44%) DEGs, and "Phenylalanine metabolism" with 67 (1.04%) DEGs, and so on (Table S3a).Correspondingly, 2008 differential ions (Diff ions) in pos or 1243 Diff ions in neg were detected between GY and BG stages (Table 2).The top enrichment pathways were consistent with that of gene expressions changes.In pos, 415 (20.67%)Diff ions were found in "Metabolic pathways", followed by 332 (16.53%)Diff ions in "Biosynthesis of secondary metabolites", 54 (2.69%) Diff ions in "Flavonoid biosynthesis", 49 (2.44%) Diff ions in "Diterpenoid biosynthesis", 48 (2.39%) Diff ions in "Tyrosine metabolism", and 39 (1.94%) Diff ions in "Phenylpropanoid biosynthesis".The similar trends were found between GY vs YO or YO vs Re stages (Table S3b and c).More visualized results of DEGs in pathways were illustrated by bubble diagrams (Fig. 4a, b, c).Eleven pathways were enriched across the whole fruit development process, mainly focusing on those genes involved in secondary metabolites, such as "Phenylpropanoid biosynthesis", "Flavonoid biosynthesis", "Cutin, suberine and wax biosynthesis", and "Stilbenoid, diarylheptanoid and gingerol biosynthesis", among others.Interestingly, among these four stages towards maturation, DEGs in pathways of "Phenylalanine, tyrosine and tryptophan biosynthesis", "Flavone and avonol biosynthesis", "Diterpenoid biosynthesis" and "Tropane, piperidine and pyridine alkaloid biosynthesis" were detected speci cally in the early stages (Fig. 4a, b, c).And the corresponding products in these pathways might act on plant defenses against biotic and abiotic stresses under wild environments for fruits protection and reproduction, and now exactly contribute to the bioactive roles for human health. 31Therefore it con rms that the unripe fruit of R. chingii is the perfect resource for medicinal material.In contrast, activities of "Alpha-linolenic acid metabolism", "Biosynthesis of amino acids", "C5-branched dibasic acid metabolism", "Thiamine metabolism", "Valine, leucine and isoleucine biosynthesis", as well as "Carotenoid biosynthesis" were signi cantly enhanced in the later stages (Fig. 4a, b, c), just ready for maturation of the edible, delicious and nutritive red fruits.
X axis represents enrichment factor and Y axis represents pathway name.The color indicates the q value (low: blue, high: white) and the lower q value refer to the more signi cant enrichment.Point size represents DEG number (the bigger dots is, the larger amount get).Rich Factor represents the value of enrichment factor, which is the quotient of the number of DEGs and total Gene amount.The larger value accounts for the more signi cant enrichment.

Biosynthesis and regulation of phenolic acids and avonol derivatives
More than 235 chemical constituents, especially 15 diterpenoids, 15 triterpenoids, 18 avonoids, 5 coumarins, 9 steroids and 56 organic acids (including phenolic acids, fatty acids, tannins, citric acid, and others) have been identi ed in R. chingii through phytochemical approaches [1].However, their biosynthetic and metabolic mechanisms remain unclear.Changes of total terpenoids (TT) and total phenols (TP) contents were rst determined and the results showed that both of TT and TP contents were gradually decreased with the development of fruits, whereas the accumulations of TT and TP kept at relative stable levels (Table 3).It was coincident with the changes of differential accumulated metabolites and differential expressions of the related core genes in shikimate, phenylpropanoids and avonoids synthetic pathways, determined by metabolic and transcriptomic analyses.Among these, ellagic acid (EA, a type of phenolic acid) and kaempferol-3-O-rutinoside (K-3-R, one of the avonol glycosides) are the two key bioactive components and their extraction and content standards are described in Chinese Pharmacopoeia [5].So it prompted us to characterize their molecular mechanism of synthesis and regulation.
As shown in Fig. 5a, the EA synthesis is primarily derived from the shikimic acid pathway, which starts with the condensation of erythrose .Finally, EA is a dimeric derivative of GA and can be polymerized to form ellagitannins (ETs).In this pathway, seven unigenes encoding DHAPS, DHQS and DQD/SDH were found from R. chingii transcriptome library, including two putative DAHPS genes, one putative DHQS gene and four putative DHD/SDH genes (Fig. 5 and Table S4).Intriguingly, the relative expression levels of DHQS, DQD/SDH1, DQD/SDH2, and DQD/SDH4 reached extremely high at SG stage, decreased sharply at the subsequent stage and then kept at relative stable levels (Fig. 5a and 6), indicating that R. chingii triggered the strong expressions of these genes at the very beginning of fruit set.Exceptionally, DQD/SDH3 expression remained low levels at all stages.Among the four representative stages, DAHPS1 and DAHPS2 transcripts were notably more abundant than DHQS and DHD/SDHs and reached the maximum values at BGI stage, decreased at GY stage, and then had a certain recover at YO stage.Finally, the expression of DAHPS1 and DAHPS2 reduced to the lowest levels at Re stage.The length of DAHPS1 and DAHPS2 transcripts were 2498 and 2454 bp, respectively, with the deduced proteins of 532 and 539 amino acids (Table S4).
The sequence of R. chingii DAHPS1 had similarities of 89.6% and 89.9% with the DAHPS1 from Rosa chinensis and F. vesca, respectively.And the sequence of R. chingii DAHPS2 exhibited similarities of 89.7% and 87.3% with DAHPS2 from R. chinensis and F. vesca.Analogously, the genes information of DHQS and DQD/SDH1-4 were list in Table S4.Correspondingly, UPLC-Q-TOF-MS analysis showed the contents of substrates and products in this pathway, including E4P, DHQ, 3-Dehydroshikimate, GA and EA, were all reached the maximum levels at BGI stage and reduced to the lowest levels at Re stage (Fig. 5a and Table S5).And GA and EA were remarkably enriched, con rming the functional role of R. chingii fruits.Furthermore, HPLC results showed that the EA percentage contents gradually declined as ripening progressed, while the EA accumulations in each fruit reached highest at BGI and BGIII stages, and kept relative stable levels from BGIII stage to Re stage (Table 3).

Discussion
Recently, consumption as well as planting of the Chinese raspberry, Rubus chingii Hu, has rapidly increased for its nutraceutical and pharmaceutical bene ts.However, the synthesis and regulation of the bioactive constituents are still in blank.In this study, we performed transcriptomics and metabolomics analyses of R. chingii Hu fruits from different development stages.Total of 89,188 unigenes were generated and 57,545 unigenes were got annotated.
Blast result showed 64.37% similarity between R. chingii and F. vesca subsp.vesca.The two species belong to the same sub-family Rosoideae of Rosaceae and have the same basal chromosome number of x = 7 [56,57].As previous reports, red raspberry R. idaeus 'Nova' fruit transcriptome exhibited 68% and 77% similarities with F. versa and R. occidentalis [27], and 73.5% of blackberry R. occidentalis 'Lochness' annotated genes matched similarity with F. versa [30].These ndings proved the high similarities and certain divergences among the Rubus species, and the information of sequence homology and difference will accelerate genetic analysis and molecular breeding of R. chingii.
Furthermore, differential expression genes (DEGs) and differential ions (Diff ions) examinations revealed the molecular and chemical mechanism of fruit development in R. chingii.And KEGG pathway analysis con rmed the corresponding tendency in metabolites changes and gene expressions.Metabolic pathways, especially secondary metabolites biosynthesis thrived across the whole fruit development process.It was also found in blackberry (Rubus sp.Var.Lochness) fruits [30].What's more, avone and avonol biosynthesis and diterpenoid biosynthesis were relatively higher in the early stages while amino acids and linolenic acid metabolism were prominent in the later stages.It is in agreement with the ndings in R. ellipticus that the concentrations of polyphenolic compounds reached maximum in unripened stages S-1 and S-2, and then decreased signi cantly as the fruits proceeding towards maturation [58].The phenols are produced by plants for defensive purposes as "natural pesticides" and then would be oxidized and switch to anthocyanins biosynthesis for fruits maturation [32,35,38].Similarity, in R. coreanus, degradation of aromatic compounds, and starch and sucrose metabolism were enriched in the two early stages, while polycyclic aromatic hydrocarbon degradation and bisphenol degradation were detected specially in the later stages, leading to the formation of special aroma and avor of ripening Rubus fruits [32].Another obvious sign of ripe and soft fruits is the degradation of cell wall components and cuticular/wax layer.So DEGs for cutin, suberin and wax biosynthesis were signi cantly enriched across R. coreanus fruit ripening [32], and this case was also found in R. chingii fruit in our study.Altogether, these results will be bene cial for de ning the syntheses and conversions of metabolites for fruit qualities and provide the basic data for modi able led managements for R. chingii.
The ellagitannins (ETs) and ellagic acid (EA) are abundantly existed in berries, such as strawberry, raspberry, blackberry, as well as in tea and some nuts [52,59].Their main bioactivities are antioxidant, anti-in ammatory, antiglycation, antidiabetic activity, hepatoprotective and anticancer. 11,59A total of 25 ETs were lately identi ed by HPLC-QTOF-MS/MS in R. chingii and 5 ETs were isolated, including monomer and dimer ellagitannins [11].Nonetheless, the EA was the main component among the ETs according to the highest peak value by HPLC detection and a series of pharmacological tests deeply revealed that EA had the highest activities for ABTS•+ scavenging, antiglycation than other ETs [59].The paper suggested that polymerization of EA might lead to the change of action mode.Hence, the decreased EA content at later stages in our study might attribute to the polymerization of EA or the metabolic pathways transfer to the biosynthesis of anthocyanins as the ripening progressed.In addition, Chen et al.
[59] found ETs contents in the YF (yellow fruit) group were signi cantly higher than that in GF (green fruit) group, indicating the fruit's color was closely correlated with the ETs content and composition.However, the "GF" and "YF" meant the color of the dried unripe fruits from different area, not the color of the fresh fruit.So the color may be affect by the drying process.What's more, we found that there were extensive diversities among the R. chingii seedlings and the chemical compositions and contents were remarkably varied among the different germplasms (data no shown).Therefore, the fruit development from the homogeneous seedlings was rst investigated in this study.The results preferably illuminated the developmental law of R. chingii fruit and metabolites synthesis mechanisms.
More speci cally, the genes involving in EA synthesis pathway have not been elaborately investigated.In this paper, we identi ed seven unigenes encoding DHAPS, DHQS and DQD/SDH and assessed their expression patterns during the development of R. chingii fruit.As the rst committed enzyme in shikimic acid pathway, DAHPS controls carbon ow.DAHPS genes in some species, including Arabidopsis thaliana, rice (Oryza sativa), tomato (Lycopersicon esculentum), potato (Solanum tuberosum), and Gossypium hirsutum have been isolated, and these species all have at least two putative genes, DAHPS1 and DAHPS2 [60].Expression patterns of these two genes had distinct difference and AtDAHPS1 performed a strong role in environmental resistances, while tomato two DAHPS genes showed different tissue-speci c patterns of expression.In this study, expressions of DAHPS1 and DAHPS2 in R. chingii fruits exhibited the same pattern for fruit development.DQD/SDH genes have been functionally characterized in some species and only single DQD/SDH gene was identi ed in Arabidopis with the N-terminal DQD domain (1-316 aa) and C-terminal SDH domain (328-588 aa) [61].In Populus trichocarpa, ve putative DQD/SDH genes have been found with different activities [51].In tea (Camellia sinensis), CsDQD/SDH2 and CsDQD/SDH3 had a similar expression pattern and were synergistic with each other, while CsDQD/SDH1 exhibited an opposite expression pattern and negatively regulated the other two genes in same family [62].In our study, four putative DQD/SDH genes (encoding 530-534 aa) were identi ed with similar expression patterns during fruit development.DQD/SDH1, DQD/SDH2 or DQD/SDH4 may play the dominant roles and their interaction need to be further investigated.Therefore, the identi cations of key genes and their expression patterns would be potential to R. chingii breeding for higher medicinal quality.
The avonoid biosynthetic pathway and regulation from transcription factors (TFs) have been widely reported.The avonols metabolism in R. chingii fruits was rstly investigated in this study.The results revealed that R. chingii launched avonols synthesis at the very beginning of fruit set, and then coordinately accumulated and converted.A similar phenomenon also occurred in strawberry receptacle, avonols (rutin, quercetin glucuronide, kaempferol glucuronide, and others) prominently accumulated at early to middle stages, while signi cantly declined at red stage [35].Kaempferol is the most common avonol and presents in different glycosidic forms.Yu et al. [1] summarized that there were 5 types of kaempferol glucoside in R. chingii, including kaempferol-3-O-β-D-glucuronic acid methyl ester, kaempferol-7-O-α-L-rhamnoside, kaempferol-3-O-hexoside, kaempferol-3-glucuronide, and kaempferol-3-O-β-Drutinoside.In this study, kaempferol, kaempferol-3-O-glucoside, kaempferol-3-O-β-D-glucosylgalactoside, and kaempferol-3-sophorotrioside were identi ed by UPLC-Q-TOF-MS analysis and kaempferol-3-O-β-D-rutinoside was determined by HPLC.Kaempferol and kaempferol-3-O-glucoside maintained at high levels within all the fruit developmental stages, while the others mainly accumulated at BG stages.These avonols accumulated to high levels and provide immature fruits protections against pathogenic attacks [35].Pharmacological researches have proved that they have strong activities of anti-in ammation, analgesia, hepatic protection and wound healing [63,64].Hence, the unripe fruits of R. chingii are the perfect resources of avonols for health care and pharmaceutical applications.
Biosynthesis of avonols derived from pathways of shikimic acid, phenylpropanoid biosynthesis, and avonoid biosynthesis.PAL is the rst committed enzyme in phenylpropanoid biosynthesis catalyzing the formation of transcinnamic acid from Phe.The PAL gene family in A. thaliana comprises four genes, and AtPAL1 and AtPAL2 play the principle roles in avonoids synthesis [65].RiPAL1 in raspberry was associated with early fruit ripening: its expression reached maximum level at Fruits I stage (green and cell expansion), then declined at Fruits II (green but reached mature size) and Fruits III (yellow) stages, and certainly recovered at Fruits IV (ripe) stage [66].However, expression of RiPAL2 correlated more with later stages of fruit development.In addition, RiPAL2 transcripts were 5.6fold more abundant than those of RiPAL2.All these results are strongly consistent with our ndings.So we hypothesize that the two PAL genes might be controlled by different regulatory mechanism.Blastx results showed that R. chingii PAL1 and PAL2 were more than 96% identical with RiPAL1 (710 amino acids) and RiPAL2 (730 amino acids) (Table S4), indicating the conservative evolution of PAL in Rubus.The information of C4H, C4L, CHS, CHI, F3H, F3'H, FLS, DFR, and ANS genes from R. chingii was all clari ed in this study (Table S4).What's more, glycosylation is essential for stable accumulation of avonols.Arabidopsis UGT78D2 is one of the avonoid 3-O-glucosyltransferases (FGTs), belonging to family 1 glycosyltransferases (UGTs), and involves in both of avonol and anthocyanin glucosylation [65].And Routaboul et al. [67] has veri ed that UGT78D2 solely catalyzes the addition of glucose moiety on aglycone kaempferol or quercetin.The sequence of UGT78D2 (Unigene7678_All) in R. chingii was identi ed in this study and the expression pattern was consistent with PAL genes.The blast result also showed high similarity (85.1%) between Unigene7678_All and Rubus hybrid cultivar 'Arapaho' UDP glucosyltransferase (UGT78H2).Chen et al. [68] observed most transcripts of UGT78H2 were abundant in the immature fruits of blackberry 'Arapaho', and then transcription levels dramatically decreased with fruit maturation.These results con rmed the role of avonol 3-O-glucosyltransferase in kaempferol glucosylation.Subsequently, for synthesis of kaempferol-3-O-β-D-rutinoside, rhamnosyltransferase might be needed for the addition of rhamnose moiety.Two clusters of unigenes, CL2419.Contig1/3/4/5_All and Unigene428_All, encoding putative UDPrhamnose:rhamnosyltransferase 1, were found in transcriptome library, but their expression pattern was on the opposite side of UGT78D2/78H2 and strongly expressed at red stage.So the integrated view of kaempferol-3-O-β-Drutinoside biosynthesis still needs to be further tested.
[54] performed MYB-binding sites (MBSs) searches using FUZZNUC, MEME, and DREME and screened 141 candidate genes as potential direct targets of SlMYB12, including DAHPS1, PAL2, 4CL3, CHS1, CHS2, F3H, F3'H, among others.In our study, RucC1, RucMYB12-1, and RucMYB12-2 showed initially high expression levels and coincident changes with key genes, such as DAHPS1, DAHPS2, PAL1, PAL2, C4L1 ~ 3, CHS, CHI, FLS1 and UGT78D2, in shikimate, phenylpropanoids and avonols synthetic pathways, con rming the potential in uence of the avonolspeci c activators in R. chingii.In general, the MYB proteins are characterized by the conserved sequences at Nterminal DNA-binding domains, called MYB repeats (R), and the variable C-terminal region controls the regulatory activity [55].Hence, we carried out sequence analysis and found R2 to R3 domain of the predicted RucC1 protein has 75.86% similarity with C1 protein in Amborella trichopoda, indicating the sequence and functional homology between the C1 proteins.And the deduced protein sequence of RucMYB12-1 was closely related to FaMYB11 in Fragaria × ananassa and MdMYB12 × 1 in Malus domestica, with 97.87% and 87.34% identities in the R2R3 domain (14 to 61 amino aicd and 67 to 112 amino acid), whereas the complete protein sequence showed only 77% and 52% identities.On the other hand, the amino acids of RucMYB12-2 were highly homologous with Fv WER-like (87%) and PaMYB114-like in Prunus avium.WEREWOLF (WER), an initial regulator of root hair pattern formation, also controlled the owering time in A. thaliana [72].MYB114 was identi ed to involve in anthocyanin biosynthesis [55,73].Therefore, RucMYB12-1 and RucMYB12-2 may act in different ways and their functions need to be further veri ed.What's more, MYB1, MYB4, MYB5 and MYB6 are homologous regulatory genes and they have been found to play positive or repressed regulations in avonoids biosynthesis in different species [31,74].In this study, RucMYB4, RucMYB5 and RucMYB1R1-1 might be positively related to avonols production, while RucMYB6 and RucMYB1R1-2 (R3-type) were mainly in correlation with the anthocyanins synthesis.In F. vesca, MYB39 and MYB86 were downregulated in yellow fruit, in comparison with red fruit, re ecting their functional involvement in avonoids biosynthesis [75].Our results revealed the strong activations of RucMYB86-1, RucMYB86-2 and RucMYB86-3 at BG stages in R. chingii.Unexpectedly, the transcripts of MYB44, MYB46, MYB308 and MYB308-like were also abundant in R. chingii fruits at BG stage and then changed under the same trend as the expressions of the core genes in avonols biosynthesis pathway.In addition, the action of MYB was independent or in combination with bHLH and WD40, as the MBW complex [55,73].The regulation of AtMYB12 in avonol synthesis did not depend on bHLH coactivators [71].However, in red-eshed apple (Malus sieversii f. niedzwetzkyana), MYB12 interacted with bHLH3 and bHLH33 and essentially control proanthocyanidin synthesis, while MYB22 independently activated avonol pathway by direct combination with FLS [76].PsCHS expression in tree peony could be directly activated by the BMW complex of PsMYB12, bHLH, and WD40 protein [77].In this study, bHLH13, bHLH63, bHLH93 and bHLH137 might participate in the synthesis of avonols, whereas bHLH62, bHLH77, bHLH96 and bHLH130 might regulate anthocynins biosynthesis.bHLH62 have been reported to be related to the anthocyanin biosynthesis in eggplant [78].However, the roles of the other bHLH-type TFs for fruit development were rst found in this paper and their action modes should be further studied.Nonetheless, the regulations of transcription factors were tightly associated with environmental and internal conditions, such as light, temperature, plant hormones, among others.The discoveries of the genes and regulators will provide vital supports to harness maximum nutraceutical and medicinal potential of R. chingii via agricultural management strategies and genetic engineering.
In summary, combined transcriptomic and metabolic analysis initially reveals molecular and chemical mechanism of fruit development of Rubus chingii Hu fruit.The fruit launched the biosynthesis of phenolic acids and avonols at the very beginning of fruit set and then coordinately accumulated and converted.And it was tightly regulated by expressions of the related genes and transcription factors.Unveiling the genes and metabolites information will accelerate genetic analysis and molecular breeding for the unique species and then harness maximum nutraceutical and medicinal potential via improved agricultural management.

Figure 2 Distribution
Figure 2

Figure 5 The
Figure 5
Functional annotation of R. chingii transcriptomeFor functional annotation, the unigenes were rst searched using Blastn or Blastx against NCBI database with a cutoff E-value of 10 − 5 .Out of all 89,188 unigenes, 47,957 (53.77%) and 49,755 (55.79%) had BLAST hits to known nucleotides in NT database and proteins in NR database, respectively.

Table 3
Changes of the main medicinal components contents during fruit development in R. chingii┴