Multi-omics Analysis Reveals Whether and how Naphthylacetic Acid Affects the Quality of Rehmannia Glutinosa


 Background: Rehmannia glutinosa (R.glutinosa) is an important medicinal plant. The tuberous root of R.glutinosa is often used as herbal medicine. Naphthylacetic acid (NAA) as expansin can improve its yield, but knowledge about gene regulation and metabolome in its root is limited.Results: Full-length transcriptome, next generation transcriptome(NGS), small RNA and degradome sequencing and metabolomics were used to elucidate whether and how NAA affected its quality.30 differential expression metabolites (DEMs) (11 upregulated, 19downregulated) were identified, but catalpol and Rehmannioside D as quality standards were unchanged in its tuberous roots under control and NAA conditions (CKs and NTs); Their NGS identified 1,113 differentially expressed transcripts (DETs) (596 upregulated, 517downregulated) verified by RT-qPCR; Small RNA sequencing identified 78miRNAs (11known, 67 novel), of which 3 were differentially expressed miRNAs (1upregulated, 2downregulated). Among them, 274 differentially expressed miRNAs target transcripts (DEMTs) were predicted found and then validated by degradome sequencing; DETs and DEMTs were mainly related to metabolism. 4 miRNA-mRNA interaction pairs that regulates 4 metabolites (2 negatively correlated, 2 positively correlated) were identified; DETs, DEMs, differentially expressed miRNAs and DEMTs involved in phenylpropanoid biosynthesis regulated metabolites.Conclusions: The identification of DETs, DEMs, differentially expressed miRNAs and DEMTs could help to elucidate the regulatory networks and molecular mechanisms important for NAA-improving root quality of R.glutinosa.

transcriptome sequencing with metabolomics has increased the power of bioinformatics analysis, and has been used to explore the important genes participating in the regulatory pathways of speci c metabolites and to elucidated the molecular mechanisms of melatonin's protective effects in plants [22,23]; Integration of transcriptome, degradome and smalls sequencing have provides a useful platform for some molecular mechanisms in plants [24]. However, their integration analysis is not used in R. glutinosa until now. Therefore, there is a need for comprehensive identi cation of the DETs, DEMs, differentially expressed miRNAs and their targets regulated by NAA in R. glutinosa.
In our current study, based on our NT and CK yield measurement of R. glutinosa in advance, we integrated transcriptomic, small RNAs, degradome and metabolomics analyses to compare the gene expression, miRNA and metabolomic pro les between NTs and CKs. The aims of this study were (1) to provide abundant transcriptome resources for a transcript isoform catalog, (2) to detect NAA-regulated metabolites and (3) to systematically identify potential NAA-responsive DETs, DEMs, miRNAs and their targets for providing valuable information regarding NAA-improving quality of R. glutinosa Libosh and its the mechanisms.

Results
Full-length transcriptome sequencing data and full length sequence statistics 26.61 GB clean reads from 1 kb to 6 kb were generated via Full-length transcriptome sequencing of Sample F01 on PacBio Sequel platform. Using " full passes ≧ 3 and sequence accuracy > 0.9" as criteria, 357,961circular consensus reads were extracted from these clean reads, which contain 775,062,045bases and share a mean read length of 2,165 bases. when mean number of passes is 43, these circular consensus reads were divided into full length sequences with 5' primers and 3' primers and poly(A)n tails, and non-full length sequences without any primers and tails. After those primers and poly(A)n tails were removed, insertion sequences were obtained for the construction of database. When the database was constructed, synthesis directions of their strands were ascertained based on the differences between their two ends. 307,475 full length non-chimeric clean reads with a percentage of 85.9% were obtained; Among these full length non-chimeric reads, 99,929 consensus isoforms with a read length of 1,969bases were identi ed. They were polished for the production of 96,548 polished high-quality consensus isoforms with a percentage of 96.62% and 3,001 polished low-quality consensus isoforms. 61,262 non-redundant high quality consensus isoforms were generated after redundant high quality consensus isoforms were removed using CD-HIT software (identity > 0.99) [25]; In addition, 46,412 coding sequences,4,241 lncRNAs,3,070 alternative splicing,34,013SSRs and 5,562 transcription factors were predicted. 56,633full length transcripts were annotated to 8 databases such as COG, GO,KEGG, KOG, Pfam, Swissport, eggNOG and Nr; After removing redundancy, take OrthoDB as the database, transcriptome integrity were assessed by BUSCO [26] as follows: Complete coverage was 1129, of which single copy and duplicated (multi copies) was 532 and 597, respectively. Incomplete coverage was 70. Missing isoforms (no alignment result) was 241. Single copy gene clusters of related species and their genes number was 1440.
The identi cation and KEGG classi cation of DETs 42.91 Gb clean data from six samples were presented in Table S1, and aligned to nonredundant transcripts using STAR software and 3rd G nonredundant transcripts as references (Table S2). Based on the positional information of mapped clean reads in 3rd G nonredundant transcripts, the transcript expressions were quanti ed by RSEM software [27]. 1,113DET sets between CK and NT were obtained by DESeq software [28] using Fold change ≧ 2 and False discovery rate < 0.01 as criteria, including 596 up-regulated DETs and 517 down-regulated DETs in NT compared to CK (Table  S3). The rst 25up-regulated DETs and 25down-regulated DETs for heatmap were grouped (Fig. 1). To validate our DETs (Table S3), we selected 10 upregulated DETs and 11 down-regulated DETs for RT-qPCR analyses. Among 10 up-regulated DETs, 9 (90%) were con rmed by RT-qPCR. Among 11 down-regulated DETs, 10 (90.91%) were con rmed by RT-qPCR (Table 1). These results revealed that our sequenced data were highly reliable.  The statistical results of KEGG classi cation of DETs were shown in Fig. 2A. 271of 454DETs were annotated to 50KEGG pathways 387times, which were classi ed into ve groups such as cellular process with 20annotated DETs (7.38%), environmental information process (plant hormone transduction) (23,8.49%), genetic information processing (57,21.07%), metabolism (156,57.89%) and organismal systems (14,5017%).
The identi cation of differential miRNAs and their targets and KEGG classi cation 110.67 Mb clean reads were generated by small RNA sequencing of 6 samples (Table S4), and mapped to several databases such as Silva, GtRNAdb database, Rfam database and Repbase database by Bowtie software to lter out sequence repeats and ncRNAs, including rRNAs, tRNAs, snRNAs, snoRNAs and unannotated reads with miRNAs (Table S5). The rest were aligned to Reference database for mapped reads (Table S6). 78miRNAs, including 11 known miRNAs and 67 novel miRNAs, were identi ed in NTs and CKs (Table S7). These known miRNAs were 20nt to 24nt, while these novel miRNAs were 18nt or 20nt to 24nt. Based on their expression levels, 3 differential novel miRNAs were screened out, including upregulated novel_miR_57 and downregulated novel_miR_65 and novel_miR_23.
According to the gene sequence information of known miRNAs and novel miRNAs-corresponding species, their DEMTs (5077) were predicted by Target Finder software (Table S8), of which 4861were annotated to the following databases including NR, Swiss-Prot, GO, COG, KEGG, KOG, eggNOG and Pfam by BLAST software. For KEGG annotation (Fig. 2B), 42 DEMTs were annotated to 19KEGG pathways, including cellular process with 8annotated DEMTs (18.96%), environmental information process (other than plant hormone transduction) (3,7.14%), genetic information processing (21,49.94%), metabolism (8,19.04%) and organismal systems(2, 4.76%). KEGG pathways enrichment of these DEMTs was shown as in Table S9. miRNA target prediction and validation To further predict and identify miRNA targets based on miRNA sequences, we performed degradome sequencing, generated 20.67M clean tags. We detected 543 degraded DEMTs from these clean tags, which contain 557 cleaved sites. Among them, 247 targets of novel miR23, novel miR57 and novel miR65 were predicted using Target Finder software. Degradome sequencing just identi ed 12 cleavage DEMTs of novel miR23 and 13 of novel miR65 ( Table 2). See Table S10 for more details, indicating that 25 of 247 miRNA DEMTs were cleaved. 83637 precursor molecules were obtained from CKs and NTs by LC-MS technique, which included 46364 in the positive ion mode and 37273 in the negative ion mode; Based on the screening standards such as VIP ≧ 1, FC ≧ 1.5 or ≦ 0.667 and p-value ≦ 0.05, 6125differential metabolites between CKs and NTs were screened out of these precursor molecules, including 2,690 up-regulated and 3,435 down-regulated in NTs; After the exact masses of metabolites were con rmed according to mass error < 15 ppm, 30DEMs were analyzed (Fig. 3A) and identi ed (Fig. 3B) (Table S11) including 11upregulated (Z-score > 0) and 19down-regulated (Z-score < 0) in NTs (Fig. 3C); In their heat map constructed with R (v3.1.3) (Fig. 3D), the closer to 1 correlation coe cient, the more positively signi cant the correlation between any two metabolites. the closer to -1 correlation coe cient, the more negatively signi cant the correlation between any two metabolites; KEGG pathways of differential metabolites were presented in Table S12.
Integrated analysis of NGS and small RNA Sequencing based on KEGG pathway classi cation KEGG pathway classi cations of both DETs and DEMTs were all divided into Cellular process, Environmental information process, Genetic information processing, Metabolism and Organismal systems (Fig. 2). Moreover, both had some common KEGG pathways such as endocytosis in Cellular process, splicesome and RNA degradation in Genetic information processing, biosynthesis of amino acids, glycerolipids, starch and sucrose metabolism and phenylpropanoid biosynthesis in Metabolism, plant-pathogen interaction in Organismal systems. These results indicated that DETs and DEMTs regulated common KEGG pathways to improve R. glutinosa.

Integrated analysis of NGS and metabolomics based on KEGG pathway
Correlation of DEMs and DETs was presented in Fig. 4. It was seen from it that 22 DEMs were correlated to 3846 DETs (Table S13). Among them, 1,357 transcripts were negatively correlated to DEMs, while 2,309 transcripts were positively correlated to DEMs. According to KEGG pathways, the relations of DEMs and DETs screened by correlation were mapped. One DEM was correlated to 1 ~ 16DETs in one KEGG pathway. On the other hand, according to the results of differential metabolite analysis and differential transcripts analysis, the differential transcripts and differential metabolites in the same group were mapped to the same KEGG pathways at the same time for us to better understand the relationship between genes and metabolites.
There were 24KEGG pathways simultaneously mapped by DETs and DEMs (Table S14). According to p-value < 0.05, 6KEGG pathways were commonly enriched, which included avone and avonol biosynthesis, avonoid biosynthesis, phenylpropaniod biosynthesis, biosynthesis of unsaturated fatty acids, phenylalanine metabolism and purine metabolism (Fig. 4). Among them, 3KEGG pathways, such as avone and avonol biosynthesis, avonoid biosynthesis and phenylpropaniod biosynthesis, were extremely signi cant (p-value < 0.01).These results suggested that these DEMs were correlated to a core set of DETs.

Discussions
Previous studies indicated that NAA as one expansin could improve the yield of R. glutinosa by inhibiting the growth of its shoots and promote the transfer of nutrients to roots [3]. However, the effect of NAA on its quality and the genetic mechanism underlying its effect remain unknown. R. glutinosa is a non-model plant species without a reference of known genome, so its genetic basis is still narrow. In the current study, both were investigated using LC-MS technology, the full-length transcriptome sequencing technology, small RNA Illumina sequencing and degradome sequencing.
The effect of NAA on the metabolites in the mature tuberous roots of R. glutinosa Chemical composition of R. glutinosa is an important part of its quality, of which catalpol, acteoside (ever) and Rehmannioside D are its index components for quality control. In this study, untargeted metabolomics was used to evaluate the effect of NAA on its metabolites. 6,125 differential metabolites between CK and NT were identi ed. Among them, 2,690 were up-regulated, while 3,435 were down-regulated in NTs, of which 30 DETs were matched in the databases and further identi ed; According to Z-score, 30DETs were divided into 11 up-regulated DETs (Z-score > 0), such as melibiose, thromboxane B2, nicotinate D-ribonucleoside and so on, and 19 down-regulated DETs (Z-score< 0), including rosmarinic acid, lipoxin A4, linoleic acid and so on in NTs. Therefore, NAA treatment did not change the kinds of metabolites, but change the contents of some metabolites in the mature tuberous roots of R. glutinosa. However, Catalpol and acteoside were not differential metabolites between CK and NT. This result of ours is not consistent with the previous report, indicating the increased catalpol and acteoside contents in CPPU-treated R. glutinosa cultivar Wen 85-5 [7], even though both are a type of expasins. The inconsistency could be caused by different cultivars of R. glutinosa or /and different chemical composition between NAA and CPPU. What is worse, Rehmannioside D was not identi ed. Anyway, NAA treatment could affect the contents of some metabolites in R. glutinosa, suggesting some change in its quality. On the other hand, these result of our current study also indicated that LC-MS technology had a potential to identify many metabolites in R. glutinosa as stated by previous reports [1,2], but could not identify any metabolite in R. glutinosa.
Full-length transcriptome sequencing and NGS-based analyses RNA-Sequencing has been widely used in plant gene identi cation, expression and regulation. One of its most popular applications is for differential expression analysis where one identi es genes expressed at different levels between two classes of samples [29,30]. So far, the NGS technology has produced most of the plant transcriptomes, but the short reads from NGS sequencers cause incompletely assembled transcripts to be lack of some important information, for example, alternative splicing, limiting better understanding of transcriptome data. Based on the single-molecule real-time (SMRT) sequencing technology, the PacBio platform does not need to break mRNA randomly, can sequence the transcript from 5' end to 3' UTR region at once, generate longer and even full-length transcripts from observations of single molecules without assembly. The full-length transcripts can be used to investigate alternative splicing, alternative polyadenylation, novel genes, non-coding RNAs and fusion transcripts [31], Based on these advantages of full-length transcriptome sequencing, we conducted the full-length transcriptome sequencing of R. glutinosa, a non-model plant, and constructed its full-length transcriptome database for the rst time, which would be used as a reference database for other studies. Its database capacity of 26.61Gb clean data with cDNA size of 1-6kb is bigger than its NGS transcriptome databases such as 42.91Gb clean data in this study and one of its previous studies [20]. 46,412 coding sequences with complete ORF, obtained based on this full-length transcriptome database, made its gene cloning faster and more e cient and easier than homology cloning and electronic cloning [32,33]. By using the full-length transcriptome as reference, we performed Illumina sequencing to analyze the differential gene expression between NTs and CKs, found 596 upregulated DETs and 517 downregulated DETs in NT, and validated some of them by qRT-PCR. Based on their annotation results to four databases such as GO, KEGG, COG and eggNOG, a large number of transcripts related to metabolism, signal transduction, posttranslational modi cation and response to stimulus were signi cantly upregulated. On the other hand, according to GO annotation analysis, some transcripts related to cellular component organization, reproduction, growth and structural molecule activity were signi cantly downregulated. These ndings showed that NAA improved the metabolites contents, the quality R. glutinosa by regulating some DETs mentioned above.

Identi cation and functions of miRNAs and their target genes
After 110.67 Mb clean data from 6 samples generated by small RNA sequencing were mapped to the full length transcriptome reference, 11 known miRNAs and 67 novel miRNAs were obtained, while no nat-siRNA and phasiRNA were identi ed. Among them, only novel_miR_57 was upregulated, while novel_miR_65 and novel_miR_23 were downregulated in NT. 18nt miRNAs were also discovered in other plants [34]. These miRNAs enlarged its known miRNA database and their categories. Before our work, Yang et al. [35] identi ed 25 conserved miRNA families from its cultivar Wen85-5 subjected to continuous cropping; Li et al. [36] identi ed 598 conserved miRNA families from the same cultivar again. However, when our identi ed miRNAs were aligned to their miRNAs, it was found that many miRNAs between theirs and ours were incomparable. The possible reasons leading to this incomparability are the disadvantage of their con rmation by bioinformatics method, the loss of some potential miRNAs during data analysis, the low annotation rate percent of 18.09-30.94, different cultivar and stress, of which the rst three reasons were in accordance with the report [37]. 5,077DEMTs, including 2,285 of 11 known miRNAs and 3,179 of novel miRNAs, were predicted, of which 4,861 were annotated to 8databases. Among 4,861, 159 differential expressed genes were annotated to 8databases.In terms of KEGG annotation, 63DEMTs were annotated, of which 42 were annotated into 19types of KEGG pathways, dealing with cellular processes, environmental and genetic information processing, metabolism and organismal systems. 21(50%) DEMTs were related to genetic information processing such as RNA degradation and transport, spliceosome, ribosome and protein processing in endoplasmic reticulum and base excision repair; 8(19.05%) associated with metabolism were phenylpropanoid biosynthes, biosynthesis of amino acids, starch and sucrose metabolism, carotenoid biosynthesis glycerolipid metabolism and so on. These miRNAs downregulated their target genes in order to regulate metabolism and metabolites accumulation. In addition, 247 DEMTs of novel_miR_23, novel_miR_57 and novel_miR_65 were further veri ed by degradome sequencing. We found that 25 of 247 DEMTs (10.12%) were cleaved, and that one miRNA could regulating many target genes. This phenomenon of one miRNA relative to multiple targets was also observed in other plant species.
NAA regulated genes and metabolites involved in phenylpropanoid biosynthesis to improve the quality of R. glutinosa We found some gene and metabolites were regulated by NAA. In terms of phenylpropanoid pathway in which they took part, the general phenylpropanoid metabolism generates an enormous array of secondary metabolites [38]. The phenylpropanoid pathway is responsible for the synthesis of numerous compounds important for plant growth and responses to all aspects of plant responses towards biotic and abiotic stimuli [38,39] and might have potential implications in manipulating the tea quality in tea plants [40], which is consistent with ours. Phenylpropanoids a group of plant secondary metabolites derived from phenylalanine, exert bene cial pharmacological effects on human health such as anticancer, antiin ammatory, antiviral, and antibacterial activity, and aid in wound healing [41], as well as have a wide variety of functions both as structural and signaling molecules [42]. We identi ed 14DETs taking part in Phenylpropanoid biosynthesis. Among them, 10genes such as (1) For beta-glucosidase, it responds to drought stress [43], dehydration and NaCl stress [44], freezing stress [45], induces the accumulation of higher abscisic acid (ABA) levels, which results in the formation of dwarf creeping plant [46], has high glycosyl hydrolase activity and plays a role in fruit ripening via the rapid production of free ABA [47], increases biomass yields and convert lignocellulose [48] and deals with stomatal development [49]; (2) For 4CL, phenylpropanoid pathway provides precursors for numerous secondary metabolites in plants. In this pathway, 4CL is the main branch point enzyme which generates activated thioesters. We identi ed one F01_transcript_33286 (4CL). Products of 4CL are subsequently used by various oxygenases, reductases and transferases for biosynthesis of lignin, avonoids, anthocyanins, aurones, stilbenes, coumarins, suberin, cutin, sporopollenin, etc [38]. 4CL plays multiple important roles in plant growth and development by catalyzing the formation of CoA ester [50], regulates the biosynthesis of lignin, avonoids and and wall-bound phenolics [51,52] plays a compensatory role in normal development [35], mediates the rice blast resistance, oret development and lignin biosynthesis [53] and affects plant phenotype and resulted in dwarfed plants with a "bonsai treelike" appearance [54]; (3) For peroxidase, peroxidase possesses salt tolerance and antioxidant reaction [55], tolerance to chilling [56] and drought resistance [57], and promotes stomatal closure [42], and related to ROS and NO [58] and modulates ascorbic acid metabolism [59]; (4) For CCoAOMT, it is a key enzyme in the lignin biosynthesis pathway [60], and involved in lignin biosynthesis [61] and vanillin biosynthesis [62], and affects the accumulation of phenolic acids [14], and increases biomass production, Cd uptake and translocation [63]. In a word, these four genes were associated with phenylpropanoid biosynthesis, lignin biosynthesis, biomass, stress tolerance, growth, development and so on.

Integrated analysis of multi-omics
A combined transcriptomic and metabolic analysis can better explain the transcriptional regulation of metabolic pathways. To identify the main pathways involved in the roots of naphthylacetic acid-improving R. glutinosa, we simultaneously mapped DETs and DEMs from the same treatment groups to KEGG pathways to clarify the relationships between DETs and DEMs (Fig. 4). As summarized in results, 30 signi cantly differentially expressed metabolites were assigned to 12 KEGG pathways, including phenylpropanoid biosynthesis (ko00940), biosynthesis of biosynthesis of unsaturated fatty acids (ko01040), avone and avonol biosynthesis (ko00944), and cysteine and methionine metabolism (ko00270), glyoxylate and dicarboxylate metabolism (ko00630), serine and threonine metabolism(ko00260), fatty acid biosynthesis(ko00061), alpha-Linolenic acid metabolism(ko00592), nicotinate and nicotinamide metabolism(ko00760), arginine and proline metabolism(ko00330), steroid biosynthesis(ko00100), galactose metabolism(ko00052). Among these pathways, phenylpropanoid biosynthesis (ko00940) included seven genes (F01_transcript_45042, F01_transcript_79209, F01_transcript_88383, F01_transcript_48883, F01_transcript_57503, F01_transcript_54233, F01_transcript_83451) and two metabolites (Sinapoyl aldehyde and (E)-3-(4-Hydroxyphenyl)-2-propenal), the seven genes were also enriched for alkaloid biosynthetic process (GO:0009821), lignin biosynthetic process (GO:0009809), carbohydrate metabolic process (GO:0005975), response to hormone (GO:0009725) and glucosinolate catabolic process (GO:0019762); steroid biosynthesis(ko00100) included one genes(F01_transcript_18023) and one metabolite (beta-Sitosterol), the genes was enriched for lipid catabolic process (GO:0016042); In addition, cysteine and methionine metabolism (ko00270) included one genes (F01_transcript_65625) and one metabolite (S-Adenosylmethionine), the genes was enriched for methionine biosynthetic process (GO:0009086) and homocysteine metabolic process (GO:0050667). For further study, A total of 78 miRNAs were identi ed by sRNA sequencing, including 11 known miRNAs and 67 new miRNAs, and 3 miRNAs were differentially expressed under NAA treatment. Through the joint analysis of miRNA and its target genes, 4 miRNA-mRNA pairs were identi ed, which were novel miRNA23, novel miRNA65, F01_transcript_81522 and F01_transcript_11999 annotated to phenylpropanoid biosynthesis (ko00940), respectively. Taken together, a coexpression regulatory network was constructed based on the pro les of the differentially expressed miRNA-mRNA-metabolite pairs, suggesting that NAA could improve the quality of R. glutinosa through multiple mechanisms (Fig. 5 it should not change the quality of R. glutinosa. Multi-omics analysis showed that NAA may regulate some metabolites contents of R. glutinosa via coexpression of DETs, DEMs, differentially expressed miRNAs and DEMTs linked to phenylpropaniod biosynthesis. These results enlarge R. glutinosa transcriptome and miRNA databeses, provide novel insights into NAA improvement in its quality. The candidate genes and metabolites presented here will lay the foundation for some metabolite accumulation and biosynthesis mechanism elucidation, for example, acteoside, in R. glutinosa.

Plant materials
A total of 12 samples of Rehmannia glutinosa, divided into CK and NT groups because of the different treatment conditions. Materials were allowed and provided by Wen County Agricultural Science institute and were identi ed as Rehmannia glutinosa by Professor Yanqing ZHOU of Henan Normal University and Agronomist Cuihong Lu of Wen County Agricultural Science Institute. Group1 was not treated without any NAA (CK), while Group2 was treated three times with NAA with foliar spraying method: (1)  called Sample F01 for PacBio Full length transcriptome sequencing using sequencing by synthesis. Moreover, the yield of NT is 10% higher than that of CK.

Full-length transcriptome sequencing
Paci c Biosciences Long Read Processing Raw reads was processed into error corrected reads of insert (ROIs) using Iso-seq pipeline with minFullPass=3 and minPredicted Accuracy=0.9. Next, full-length and non-chemiric (FLNC) transcripts were determined by searching for the polyA tail signal and the 5' and 3' cDNA primers in ROIs. ICE (Iterative Clustering for Error Correction) was used to obtain consensus isoforms and full-length (FL) consensus sequences from ICE was polished using Quiver [64]. High quality FL transcripts were classi ed with the criteria post-correction accuracy above 99%.
Next generation transcripome sequencing

Data Processing
Raw data (raw reads) of fastq format were rstly processed through in-house perl scripts. In this step, clean data(clean reads) were obtained by removing reads containing adapter, reads containing ploy-N and low quality reads from raw data. At the same time, Q20, Q30, GC-content and sequence duplication level of the clean data were calculated. All the downstream analyses were based on clean data with high quality [14]. These clean reads were then mapped to the Pacbio reference genome sequence. Only reads with a perfect match or one mismatch were further analyzed and annotated based on the reference genome. Hisat2 tools soft were used to map with reference genome.
Quanti cation of gene expression levels and differential expression analysis Gene expression levels were estimated by fragments per kilobase of transcript per million fragments mapped. For the samples with biological replicates: Prior to differential gene expression analysis, for each sequenced library, the read counts were adjusted by edgeR program package through one scaling normalized factor. Differential expression analysis of two samples was performed using the EBSeq R package [2]. The resulting FDR (false discovery rate) were adjusted using the PPDE(posterior probability of being DE). The FDR < 0.05 and |log2(foldchange)| ≥1 was set as the threshold for signi cantly differential expression.

Validation of gene expression using RT-qPCR
The RT-qPCR was performed on RNA extracted from CK and NT using the GAPDH gene as the internal reference. Speci c primer pairs of 22 selected genes were designed using the Primer Premier 5.0 (Table 1). Data were presented as relative transcript level based on the 2 -∆∆Ct method [23].

Functional enrichment analysis
Gene Ontology (GO) enrichment analysis of the DETs was implemented by the GOseq R packages based Wallenius non-central hyper-geometric distribution [66], which can adjust for gene length bias in DETs; KEGG [67] is a database resource for understanding high-level functions and utilities of the biological system, such as the cell, the organism and the ecosystem, from molecular-level information, especially large-scale molecular datasets generated by genome sequencing and other high-throughput experimental technologies (http://www.genome.jp/kegg/). We used KOBAS software [68] to test the statistical enrichment of differential expression genes in KEGG pathways.

Small RNA sequencing and data analyses
The small RNAs from CK and NT were sequenced and their data were analyzed as stated by Wang et al. [14].

Metabolomics detection
Metabolites extraction [69][70][71] Accurately weigh 50mg (±1%) of each sample in 2 mL EP tube, and add 0.6 mL 2-chlorophenylalanine (4 ppm) methanol (-20 ℃), vortex for 30 seconds; Add 100 mg glass beads and grind the samples by a high-throughput tissue grinder for 90 seconds at 60 Hz; Ultrasonic treatment for 15 minutes at room temperature; Centrifuge at 4 ℃ for 10 min at 14000 rpm, and the supernatant was ltered through 0.22 µm membrane to obtain the prepared samples for LC-MS; Take 30 µL from each sample to the quality control (QC) samples (These QC samples were used to monitor deviations of the analytical results from these pool mixtures and compare them to the errors caused by the analytical instrument itself) [72]; Use the rest of the samples for LC-MS detection. Screening and identi cation of differential metabolites Based on the screening criteria for differential metabolites : p-value ≦0.05 and VIP≥1 [76,77], nal differential metabolites were screened out. The exact masses of metabolites were con rmed according to mass error < 15ppm. Relative contents statistical analyses of differential metabolites Z-score is a value based on the relative content conversion of metabolites, which is used to measure the relative content of metabolites on the same level [78]. The calculation of Z-score is based on the mean and standard deviation of reference data set (control group). The speci c formula is Z = (xμ) / σ, in which x is a speci c fraction, μ is the average, σ is the standard deviation [72].

LC-MS test
Correlation heat map of differential metabolites When the linear relationship between the two metabolites is enhanced, the correlation coe cient tends to 1 or -1: positive correlation tends to 1, negative correlation tends to -1. R (v3.1.3) cor.test function was used to test the correlation analysis of metabolites, p-value ≤ 0.05 had signi cant correlation [79].

KEGG pathways of differential metabolites
MetPA database is a part of metaanalyst (www metaboanalyst.ca) .Its analyses are mainly based on the metabolic pathway of KEGG [80]. MetPA database can be used to analyze the metabolic pathways related to two groups of different metabolites. MetPA database identi es the metabolic pathways that may be disturbed by organisms through metabolic pathway concentration and topological analysis, and then analyzes the metabolic pathways of metabolites [72].
Combined analysis of transcriptome and metabolome Correlation of differential genes and metabolites was analyzed as stated by Liu et al. [81]. Both KEGG pathways were analyzed as mentioned above. Both common enriched KEGG pathways were analyzed based on p-value< 0.05.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and analyzed in the current study are available from the Supplemental Materials.

Competing interests
No con icts of interest declared.    Commonly enriched KEGG pathways of differential metabolites and differential genes. Abscissa: All KEGG pathways enriched. Ordinate: Pvalue of DEMs and DETs.