Transcriptomes Analysis Reveals the Regulatory Roles of Noncoding RNA in Tanshinones Synthesis Pathway of Salvia Miltiorrhiza

Background: Long noncoding RNAs (lncRNAs), circular RNAs (circRNAs), and microRNAs (miRNAs) have been shown to play fundamental roles in plant development. However, the information of these noncoding RNAs (ncRNAs) in Salvia miltiorrhiza remains largely unexplored. In this study, the expression pattern of ncRNAs in six tissues from the same strain of S. miltiorrhiza was analyzed to study the biological function of ncRNAs on active ingredients synthesis. Methods: Analysis of tanshinone content differences of two root simples was carried out on high-performance liquid chromatography (HPLC). RNA sequencing, GO and KEGG enrichment analysis were applied to analyzing the targets of diferentially expressed ncRNAs in different organs. Results: A total of 6,929 lncRNAs, 6,239 circRNAs, and 360 miRNAs were identied. Forty-eight lncRNAs, 70 miRNAs, and 26 circRNAs expressed differentially between red and white root tissues with signicantly different tanshinone content. GO and KEGG pathway analysis of target genes of differently expressed ncRNAs indicated that some target genes are involved in the synthesis pathway of terpene, including diterpene and sesquiterpene. We also found many target genes related to secondary metabolites, including 2-C-Methyl-d-erythritol 2,4-cyclodiphosphate Synthase (SmMCS) and several CYP450s. Furthermore, most target genes may be related to the resistance of pathogens, such as receptor kinases, disease-resistant proteins, and pentatricopeptide repeat-containing proteins. Conclusions: The present study exhibited the tissue-specic expression patterns of ncRNAs preliminarily in S. miltiorrhiza, which may reect that the formation of white root or red root is related to regulation by ncRNAs. It would provide a basis for further research about the regulation mechanism in the tanshinone synthesis process.

relieving effects. It has been widely applied in treating cardio-cerebrovascular diseases, such as angina, coronary heart disease, myocardial infarction in China and other Asian countries for thousands of years [30,31]. The main active components of dry, red root in S. miltiorrhiza are lipid-soluble tanshinones and water-soluble phenolic acids compounds [32,33]. Tanshinones are a kind of diterpenoid quinone and also a pigment component. About 40 kinds of tanshinones have been isolated and identi ed in S. miltiorrhiza, such as tanshinone IIA ( aky orange crystal), tanshinone I (reddish-brown columnar crystal), tanshinone IIB (purple acicular crystal), cryptotanshinone (tangerine acicular crystal), dihydrotanshinone I (red columnar crystal), and so on [34][35][36].
Tanshinones are mainly biosynthesized and accumulated in the periderm of S. miltiorrhiza root. The content of tanshinone IIA in the periderm is ~ 185 times higher than that in the phloem and ~ 17 times higher than that in the xylem of S. miltiorrhiza root [37]. In general, the color of periderm in S. miltiorrhiza root gradually turned red because of the accumulation of tanshinone [38,39]. Aboveground organs (such as leaves) also contained signi cantly lower tanshinone, which implies that genes related to tanshinone synthesis may be preferentially expressed in the periderm of roots [40]. Recent studies con rmed that three essential genes (SmCPS1, SmKSL1, SmCYP76AH1) were involved in the downstream pathway of tanshinone synthesis. Several essential genes were involved in the upstream pathway were mainly expressed in the periderm tissue [41].
In order to interpret the function of ncRNAs in the development and accumulation of active components in S. miltiorrhiza, red and white roots, stems, leaves, and ower tissues were used as materials to conduct noncoding RNA sequencing analysis in the present study. The results would bene t from predicting and screening potential ncRNA involved in the synthesis of active components and exploring its function in the growth and development of S.miltiorrhiza.

Plant materials and growth conditions
The S. miltiorrhiza line DSS3 used in this experiment were grown in a glass bottle with diameter of 10 cm and height of 40 cm in the State Key Laboratory of Crop Biology, Shandong Agricultural University, Taian, Shandong province, China. Plantlets were harvested after grown for 6 months, washed with distilled water, divided into 6 types of tissues including leaf, stem, ower, phloem of white root, phloem of red root, and xylem of red root. Those tissues were immediately frozen in liquid nitrogen and stored at -80℃ for further experiment.

HPLC analysis of tanshinone content
Analysis of tanshinone content differences of two root simples was carried out on high-performance liquid chromatography (HPLC). The methods of extraction and detection used in this study were established by our laboratory [42] and mainly used for quantitation of known components. Chromatographic separations were carried out in a reverse-phase C18 column (250 × 4.6 mm, ve µm particle size; Thermo, USA), and a 20.0 × 4.6 mm guard column connected to a Waters 600E HPLC System which equipped with an auto-injector, UV detector, and Empowers software (Waters Associates, Milford, MA, USA).
Methods of RNA extraction, detection, and profound sequence of ncRNAs The RNA samples were extracted with Trizol. RNA integrity was assessed using the RNA Nano 6000 Assay Kit of the Agilent Bioanalyzer 2100 system (Agilent Technologies, CA, USA) to ensure the use of quali ed samples for sequencing. RNA degradation and contamination, especially DNA contamination, was monitored on 1.5% agarose gels. RNA concentration and purity were measured using the NanoDrop 2000 Spectrophotometer (Thermo Fisher Scienti c, Wilmington, DE).
Library preparation for sRNA sequencing A total of 2.5 ng RNA per sample was used as input material for the RNA sample preparations.
Sequencing libraries were generated using NEB Next Ultra-small RNA Sample Library Prep Kit for Illumina (NEB, USA) following the manufacturer's recommendations. Index codes were added to attribute sequences to each sample. First of all, ligated the 3′SR Adaptor. The mixture system included Mixed 3′SR Adaptor, RNA, and Nuclease-Free Water, which were incubated for 2 minutes at 70 degrees in a preheated thermal cycler. The tube was transferred to ice. Then, add 3′Ligation Reaction Buffer (2X) and 3′Ligation Enzyme Mix ligate the 3′SR Adaptor. They were incubated for 1 hour at 25°C in a thermal cycler. To prevent adaptor-dimer formation, the SR RT Primer hybridizes to the excess of 3′SR Adaptor (that remains free after the 3′ligation reaction). It transforms the single-stranded DNA adaptor into a double-stranded DNA molecule. DsDNAs are not substrates for ligation-mediated. Then, reverse transcription synthetic rst chain. After that, PCR ampli cation and Size Selection. PAGE gel was used for electrophoresis fragment screening purposes, rubber cutting recycling as the pieces get small RNA libraries. At last, PCR products were puri ed (AMPure XP system), and library quality was assessed on the Agilent Bioanalyzer 2100 system.
Library preparation for lncRNA-Seq and circRNA-Seq A total amount of 1.5 µg RNA per sample was used as input material for rRNA removal using the Ribo-Zero rRNA Removal Kit (Epicentre, Madison, WI, USA). Sequencing libraries were generated using NEBNext R Ultra™ Directional RNA Library Prep Kit for Illumina (NEB, USA). The manufacturer's recommendations and index codes were added to attribute sequences to each sample. Brie y, Fragmentation was carried out using divalent cations under elevated temperature in NEBNext First Strand Synthesis Reaction Buffer (5X). First-strand cDNA was synthesized using a random hexamer primer and Reverse Transcriptase. Second-strand cDNA synthesis was subsequently performed using DNA Polymerase I and RNase H. Remaining overhangs were converted into blunt ends via exonuclease/polymerase activities. After adenylation of 3' ends of DNA fragments, NEBNext Adaptor with hairpin loop structure was ligated to prepare for hybridization. In order to select insert fragments of preferentially 150 ~ 200 bp in length, the library fragments were puri ed with AMPure XP beads (Beckman Coulter, Beverly, USA). Then three µl USER Enzyme (NEB, USA) was used with size-selected, adaptorligated cDNA at 37°C for 15 min before PCR. Then PCR was performed with Phusion High-Fidelity DNA polymerase, Universal PCR primers, and Index(X) Primer. At last, PCR products were puri ed (AMPure XP system), and library quality was assessed on the Agilent Bioanalyzer 2100 and qPCR.
Clustering, sequencing, and Quality control According to the manufacturer's instructions, the clustering of the index-coded samples was performed on a cBot Cluster Generation System using TruSeq PE Cluster Kit v4-cBot-HS (Illumia). After cluster generation, the library preparations were sequenced on an Illumina Hiseq Xten platform, and paired-end reads were generated. Raw data (raw reads) of fastq format were rstly processed through in-house Perl scripts. In this step, clean data (clean reads) were obtained by removing reads containing adapter, reads containing ploy-N, and low-quality reads from raw data. The miRNA reads were trimmed and cleaned by removing the sequences smaller than 18 nt or longer than 30 nt. At the same time, Q20, Q30, GC-content, and sequence duplication levels of the clean data were calculated. All the downstream analyses were based on clean data with high quality.

Identi cation and Analysis of lncRNA and CircRNA
The transcriptome was assembled using the StringTie (https://ccb.jhu.edu/software/stringtie/index.shtml) based on the reads mapped to the reference genome. Then the assembled transcripts were annotated using the gffcompare program. The unknown transcripts were used to screen for putative lncRNAs. Three computational approaches include CPC (http://cpc.cbi.pku.edu.cn/), CNCI (http://www.ncbi.nlm.nih.gov/pubmed/23892401), Pfam (http://pfam.xfam.org/), CPAT (http://lilab.research.bcm.edu/cpat/) were combined to sort non-protein coding RNA candidates from putative protein-coding RNAs in the unknown transcripts. Putative proteincoding RNAs were ltered out using a minimum length and exon number threshold. Transcripts with lengths more than 200 nt and have more than two exons were selected as lncRNA candidates and further screened using CPC/CNCI/Pfam/CPAT that have the power to distinguish the protein-coding genes from the noncoding genes. As well as the different types of lncRNAs include lincRNA, intronic lncRNA, antisense lncRNA, sense lncRNA, were selected using cuffcompare. CircRNA were identi ed by nd_circ software.

Differential expression analysis
For the samples with biological replicates, differential expression analysis of two conditions/groups was performed using the DESeq R package (1.10.1, http://www.bioconductor.org/packages/release/bioc/html/DESeq.html). DESeq provides statistical routines for determining differential expression in digital miRNA/gene expression data using a model based on the negative binomial distribution. The resulting P values were adjusted using Benjamini and Hochberg's approach for controlling the false discovery rate. MiRNA/genes with an adjusted p < 0.01 found by DESeq were assigned as differentially expressed. For the samples without biological replicates, and about miRNAs, before differential gene expression analysis, for each sequenced library, differential expression analysis of two samples was performed using the IDEG6 [43]. P-value was adjusted using q value [44]. Qvalue < 0.01 & |log 2 (Fold Change) |≥1 was set as the threshold for signi cantly differential expression.
For lncRNA and circRNA, the edgeR program package adjusted the read counts through one scaling normalized factor. Differential expression analysis of two samples was performed using the EBseq (2010, https://www.biostat.wisc.edu/~kendzior/EBSEQ/) R package. The resulting FDR (false discovery rate) was adjusted using the PPDE (posterior probability of being DE). The FDR < 0.05 & |log2(Fold Change)| ≥1 was set as the threshold for signi cantly differential expression. P-value was used as the key index for screening differentially expressed circRNA.
Validation by quantitative real-time polymerase chain reaction (qRT-PCR)

Target prediction
We used TargetFinder software to predict the target of known miRNAs and novel miRNAs. The nearest mRNAs at the range of 100kb upstream and downstream of lncRNAs were identi ed as cis-target genes by Perl. We investigated the complementary sequence between lncRNA and mRNA by the LncTar tool and then calculated and standardized the free energy of the matching sites. Those mRNA with standardized free energy threshold < -0.1 were considered as the target genes of lncRNAs. The mRNA with the highest matching degree with circRNA was selected as the parental genes, and the circRNA was classi ed according to their position on the genome. Miranda (v3.3a) software was used to predict the target miRNA of circRNA, and TargetFinder software was used to predict the target mRNA of target miRNA.

Target functional annotation
Gene function was annotated based on the following databases: Nr (NCBI non-redundant protein sequences);Nt (NCBI non-redundant nucleotide sequences) Pfam (Protein family) KOG/COG (Clusters of Orthologous Groups of proteins) Swiss-Prot (A manually annotated and reviewed protein sequence database) KO (KEGG Ortholog database) GO (Gene Ontology).

GO enrichment analysis
Gene Ontology (GO) enrichment analysis of the differentially expressed genes (DEGs) was implemented by the GOseq R packages based on Wallenius non-central hypergeometric distribution for targets of differential expression miRNAs. For targets of differential expression lncRNAs and circRNAs, Gene Ontology (GO) enrichment analysis of the differentially expressed genes (DEGs) was implemented by the topGO R packages.

KEGG pathway enrichment analysis
We used KOBAS software to test the statistical enrichment of differential expression targets of differential expression ncRNAs in KEGG pathways.

Results
Effective constituents contents in white root and red root S. miltiorrhiza is an important medicinal plant, and the tanshinone and salvianolic acid in its roots have a remarkable curative effect on cardiovascular and cerebrovascular diseases [45,46]. In the designed experiment, these red roots and white roots are located in different parts of the same root. The roots close to the base of stem and near the ground surface show white roots, while the roots far away from the ground surface show red roots. This experiment is specially designed to produce the difference between white roots and red roots, resulting in the difference of dark and dim light (This experiment will be specially reported). The HPLC results showed that the four main tanshinone components in the phloem and periderm tissues of red roots, including tanshinone IIA and tanshinone I, dihydrotanshinone I and cryptotanshinone, were signi cantly higher than those of white roots (Table 1). In order to explore the molecular mechanism of tanshinone synthesis, differentially expressed pro les of miRNAs, lncRNAs, and circRNAs of these two kinds of roots were analyzed in this study.
High-throughput ncRNAs sequencing data from S. miltiorrhiza Six different organs (periderm and phloem in the red root, xylem in the red root, periderm and phloem in white root, leaf, stem, ower) were collected and sequenced by next-generation sequencing to acquire the spatial distribution pattern of ncRNAs.
One hundred thirty-two known miRNAs were identi ed, and 228 novel miRNAs were forecasted by applying miRDeep2 core algorithms with modi cations for plant miRNAs (Table S1). The length of the novel mature miRNA was mainly in the range of 20 nt to 24 nt (Fig. 1a). We further analyzed the 5 'end rst base of novel miRNAs and showed the AU content of the precursor miRNA was higher than the GC content, and Uracil was the predominant base in the mature miRNA sequences (Fig. 1b). Based on the sequence similarity, these miRNAs could be grouped into 72 miRNA families (Table S1).
The cDNA library was sequenced through the Illumina HiSeq Xten platform, and 222.53 Gb clean data were obtained for bioinformatics analysis. Reads mapped to the reference genomes of S. miltiorrhiza were assembled and spliced by StringTie software to identify new transcripts from previously unannotated transcriptional regions. Afterward, transcripts with characteristics (length ≥ 200 bp, exon count ≥ 2, and FPKM ≥ 0.1) were initially selected and further analyzed by four computational approaches include CPC, CNCI, Pfam, and CPAT. Eventually, 6,929 candidate lncRNAs without proteincoding potential were identi ed in all the samples (Fig. 2a). Most of the lncRNAs were lincRNAs (5503, 79.4%), followed by antisense-lncRNAs (617, 8.9%), sense lncRNAs (546, 7.9%), and intronic-lncRNAs (263, 3.8%) (Fig. 2b). The distribution of lncRNAs on chromosomes was shown in the circular diagram, which describes the distribution of different kinds of lncRNAs on chromosomes (Fig. 2c).
Furthermore, 6,239 circRNAs were detected via nd_circ software, of which 5,633 (90.3%) were exonic circRNAs (ecircRNAs) generated from exons of a single protein-coding gene. The majority of circRNAs were between 200 to 1000nt and derived from exons, while the other circRNAs were over 3000 nt and from intergenic regions. We compared the locations of circRNAs and their source genes on the reference genome and then named circRNA ID after their source genes ( Fig. 2d and 2e).

Comparison of mRNAs and lncRNAs features
In order to understand the differences in features between mRNAs and lncRNAs, the terms of length, exon number, ORF of mRNA, and lncRNA were compared. The results showed that the number of mRNAs with a length of 400nt was the highest, followed by those with a length ≥ of 3000nt. For lncRNAs, the number of lncRNAs with a length of 400nt was the largest, while the number of lncRNAs showed a decreasing trend with the increase of length (Fig. 3a). The number of exons of lncRNAs ranges from 2 to 8, and most lncRNAs have two exons. In contrast, most mRNAs contain one or two exons. The number of mRNA decreases gradually with the increase of exon (Fig. 3b). We predicted that the length of ORF of mRNA was mainly in the range of 100 ~ 500, while that of lncRNAs was mainly in the range of 50 ~ 200nt (Fig. 3c). By comparing the number of variable shear isomers of mRNA and screened lncRNAs, it was concluded that most mRNA and lncRNAs had one variable shear isomer (Fig. 3d).

Expression pro les of ncRNAs in different tissues
Tissue-speci c expression patterns of ncRNAs would explain its spatiotemporal distribution and its role in gene regulation of different tissues. FPKM value was used as an indicator to measure the expression level of the transcript. A total of 321 miRNAs, 1969 lncRNA, and 270 circRNAs were found to be differentially expressed with the FDR < 0.05 (False Discovery Rate, FDR) and FC ≥ 2(Fold Change) (Table  S2). Hierarchical clustering analyses were conducted on the differentially expressed ncRNAs; the results of differentially expressed ncRNA are exhibited in Fig. S1. Given the signi cant difference of tanshinones content between red and white roots, we focused on the expression pro les of ncRNAs between the phloem and periderm of the red and white root. After that, 70 miRNAs, 48 lncRNAs, and 26 circRNAs were discovered to be differentially expressed. Among them, 40 miRNAs, 22 lncRNAs, and 14 circRNAs were down-regulated in the phloem and periderm of red root than that of white root, 30 miRNAs, 26 lncRNAs, and 12 circRNAs were up-regulated, respectively ( Fig. 4; Table 2).

Functional annotation of target genes of different expression ncRNAs
In total, 1232 target genes of DE-miRNAs, 304 cis-target genes and 11 trans-target genes of DE-lncRNAs, and 12 source genes DE-circRNAs between the phloem and periderm of white and red root were annotated. To explore the potential functions of these targets, GO and KEGG Pathway analysis was performed. The function annotation was integrated based on multiple databases, such as NR, Swiss-Prot, COG, KOG, Pfam (Table 3; Table S3).
GO enrichment analysis revealed that 473 target genes of DE-miRNAs could be classi ed into 19 biological processes, 15 molecular functions, and 11 cellular component terms. For biological processes, the "lignin catabolic process," "regulation of transcription, DNA-dependent," and "auxin-activated signaling pathway" were the three most dominant GO categories. As for molecular functions, "hydroquinone: oxygen oxidoreductase activity" and "copper ion binding" were the two most signi cant GO terms. The three most dominant GO terms in cellular components were "apoplast," "chloroplast stroma," and "coated vesicle membrane" (Fig. 5a). Based on the KEGG pathway analysis, 162 DE-miRNAs targets gene are mainly involved in 58 metabolic pathways. Among them, the "Plant-pathogen interaction pathway," "Biosynthesis of amino acids," and "diterpenoid biosynthesis" pathway contain the most DEGs. These results indicate that miRNA plays an essential regulatory role in the growth, development, and metabolism in S. miltiorrhiza ( Fig. 5b; Table S4). For the terpenoid biosynthesis pathways, seven targets matched the "Diterpenoid biosynthesis" pathway, such as the sly-miR164b-3p target Ent-kaurenoic acid oxidase (EVM0005756 and EVM0001334), aly-miR166g-5p target leucoanthocyanidin dioxygenase-like (Salvia_newGene_57657), unconservative_Lachesis_group0_16803 target kaurene synthase-like (Salvia_newGene_54337 and EVM0017060), unconservative_Lachesis_group5_24210 target ent-kaurene oxidase (EVM0019420 and EVM0013257). Besides, one target gene also matches the "Sesquiterpenoid and triterpenoid biosynthesis" pathway, aly-miR172c-5p target premnaspirodiene oxygenase (EVM0008863). These target genes are likely to play a signi cant role in the biosynthesis of terpenoids in S. miltiorrhiza.
Both the cis and trans targets of the DE-lncRNAs were analyzed. GO enrichment analysis displayed that 165 cis-regulated targets were signi cantly enriched in biological processes, cellular components, and molecular function. The GO terms associated with biological processes that were most abundant contained "protein retention in ER lumen," "regulation of G2/M transition of the mitotic cell cycle," and "lipid metabolic process." The most relevant GO terms related to molecular functions were "chromatin binding," "oxidoreductase activity," and "monodehydroascorbate reductase (NADH) activity." In the cellular component group, targets were enriched in terms of "plant-type vacuole," "peroxisome," and "anchored component of membrane" (Fig. 6a). The six trans-targets were correlated with 34 GO terms in three main classi cations, contained "membrane," "organelle," "binding and cellular process," and so on (Fig. 6c). KEGG analysis revealed the enrichment pathway of 111 cis-regulated targets mainly included "ribosome, carbon metabolism, plant hormone signal transduction, glycerophospholipid metabolism, protein processing in the endoplasmic reticulum," and so on. The Ribosome pathway is the most signi cant enrichment pathway (Fig. 6b; Table S4). There were only two KEGG pathways for two trans-regulated targets of DE-lncRNAs, "Proteasome" and "Ubiquinone and another terpenoid-quinone biosynthesis" (Fig. 6d; Table S4). Annotation results for other databases also shown that target genes of DE-lncRNAs encoded some receptor protein kinase, pentatricopeptide repeat-containing protein, peroxidase, membrane protein, transporter protein, zinc nger protein, hormone synthesis, and signal transduction related protein.
As a result of KEGG analysis, the target gene of DEcircRNA was signi cantly enriched in only one pathway. The unigene in the alpha-Linolenic acid metabolism pathway was analyzed between white phloem and red phloem samples ( Fig. 7b; Table S4).
In addition, the 201 targets of 59 miRNAs combined with DE-circRNAs also were Analysis. Target genes were signi cantly enriched in biological process contained "defense response to the bacterium," "cellular biogenic amine metabolic process," "organic cyclic compound biosynthetic process," "purine nucleotide biosynthetic process" "heterocycle biosynthetic process." The enriched term related to molecular function included "sequence-speci c DNA binding transcript," "ligase activity," "symporter activity," "oxidoreductase activity," and "oxidoreductase" (Fig. 7c). The function of the DE targets gene is mainly involved in 15 metabolic pathways, such as "Plant-pathogen interaction, Endocytosis, Phenylpropanoid biosynthesis, Porphyrin, and chlorophyll metabolism, Galactose metabolism." Thereinto, the "Sesquiterpenoid and triterpenoid biosynthesis" pathway had been paid close attention. Only premnaspirodiene oxygenase (EVM0021656) was enriched in the pathway targeted by unconservative_Lachesis_group2_8821 ( Fig. 7d; Table S4).

Validation of differential expression ncRNAs
To verify the validity of RNA-Seq data, we picked six DE-mRNAs, ve DE-lncRNAs at random and validated their expression by quantitative RT-PCR (Fig. 8). Experiment results indicated that qRT-PCR expression pro les of these ncRNAs were consistent with the results of RNA-Seq, as that con rmed the effectiveness of RNA-Seq data. Therefore, present results might serve as an e cient reference for researching the molecular mechanism of these ncRNAs in the future.

A large number of ncRNAs were identi ed in Danshen
Most ncRNAs in plants, which have preliminarily obtained functional clues, regulate multiple aspects of development and response to environmental stress-speci c in tissue cells and exist in complex regulatory networks [47][48][49]. Previous studies on ncRNA are still preliminary. In this study, six tissues, including stem, leaf, ower, red and white roots derived from the same plant, were collected for deep sequencing of non-coding RNA and preliminarily explored this regulatory mechanism in tanshinone biosynthesis at the transcriptional and post-transcriptional levels. A total of 360 miRNAs, 6,929 lncRNAs, and 6,239 circRNAs were identi ed in all the samples. Differential expression analysis showed that a total of 321 miRNAs, 1,969 lncRNAs, and 270 circRNAs were screened, indicating the tissue expression characteristics of noncoding RNAs. Based on the difference in tanshinone content, we focused on analyzing the differentially expressed ncRNAs between the red and white roots. A total of 70 miRNAs, 48 lncRNAs, and 26 circRNAs were found to be differentially expressed, which indicated their potential regulatory roles in biosynthesis of diterpenoid tanshinone. The number of differentially expressed ncRNAs were less than found in previous studies [50,51], which might result from the similar period of development and same genetic background between red and white root.

NcRNAs expressed differentially in root tissues and their target genes
In order to explore the potential regulatory functions of these differentially expressed ncRNAs, we carried out GO classi cation and KEGG enrichment analysis of their target genes. We mined the target genes related to terpenoid synthesis. The results showed that seven miRNAs were enriched in the "diterpene biosynthesis pathway," and their target genes were mainly annotated as terpenes synthase and oxidase including ent-kaurenoic acid oxidase, kaurene synthase. A few miRNAs were enriched in "sesquiterpene and triterpene biosynthesis pathways". Their target genes are also oxygenases. Combining with previous studies [10][11][12] and the test results, suggestion was proposed that the miRNAs generally involved in secondary metabolic pathways, and the study of regulatory function has important signi cance in the future.
Studies have shown that the expression patterns of mRNA-like noncoding RNAs (mlncRNAs) exhibited tissue speci city, suggesting that mlncRNAs may be involved in regulating the growth and development of S.miltiorrhiza. In addition, many mlncRNAs were responsive to the treatment of elicitors (yeast extract, Ag + and MeJA) which lead to the production of bioactive compounds in S. miltiorrhiza, suggesting that lncRNAs might be involved in the production of active substances [52]. In this study,analysis of cisand transtarget genes of differentially expressed lncRNAs revealed that most of the target genes were involved in the primary metabolism, such as the ribosome, carbon metabolism, plant hormone signal transduction. Only one target gene was involved in the biosynthesis of ubiquinone and other terpene quinones. Our results indicated that lncRNAs may not directly target some key enzyme genes for secondary metabolite synthesis, but indirectly in uence the synthesis of active substances by responding to the external environment.
The source genes of the circRNA and the interacting miRNAs were also analyzed. CircRNA source genes may be involved in some primary metabolic processes, such as the Acetyl-Coa Metabolic process; target genes of miRNAs interacting with them participate in defense response to bacterium process and plant interaction pathway. At present, the functions of circRNAs in most plants are still unknown, especially in the regulation of secondary metabolites. Our results provide some references for future studies.
Combined with previous research results [12,52], tanshinone synthesis may be generally regulated by non-coding RNA. In addition, based on the expression patterns of the three non-coding RNAs in different tissues, miRNA is more commonly and directly involved in the regulation of secondary metabolite synthesis than lncRNA and circRNA.
In addition, there were still most target genes of these DE ncRNAs encoded some receptor protein kinase, zinc nger protein, disease-resistant proteins, F-box Protein, hormone synthesis, and signal transduction related protein. This means that these genes may be involved in plant and pathogenic interactions, which may be related to the abundance of microbial ora in the soil [59,60]. The pericarp of the root approached the soil directly and closely is stimulated by microorganisms, which initiates the defense system of the root system [61]. It is worth mentioning that lots of genes are differentially expressed in the red and white roots, which indicates that the production of secondary metabolites, primarily diterpenes, may be related to the interaction of plant pathogens. Some studies suggest that secondary metabolites produced by plants are natural antimicrobial substances [62,63]. The potential regulatory relationships between the production of secondary metabolites and the interactions among plants and pathogens would focus on future study. The study also provides new insight into the regulation mechanisms in the terpene synthesis process.

Conclusion
In this study, six tissues of the same S.miltiorrhiza plant, including stem, leaf, ower, red and white root, were collected for deep sequencing of non-coding RNA and preliminarily explored the regulatory mechanism of tanshinone biosynthesis at the transcriptional level. In view of the difference in tanshinone content between red and white roots, we conducted an in-depth analysis of the target genes of differentially expressed ncRNAs between them.The results showed that seven miRNAs targeted terpene synthase and oxidase in the diterpenoid synthesis pathway, suggesting that miRNAs may be directly involved in the regulation of terpenoid synthesis. The target genes of lncRNAs and circRNAs are mainly involved in the primary metabolic process, suggesting that they indirectly affect the synthesis of active substances by responding to the environment. In addition, we found that a large number of target genes of differentially expressed ncRNAs are involved in the plant-pathogen interaction pathway, suggesting that the production of secondary metabolites (mainly diterpenoids) may be related to the interaction of plant pathogens. In conclusion, the data provided in this study will provide a reference for the study on the regulation of tanshinone synthesis.

Consent for publication
Not applicable.

Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional les.

Competing interests
The authors declare that they have no competing interests.     Differential expression levels of miRNAs(a), lncRNAs(b) and circRNAs (c) in two root tissues of S. miltiorrhiza Figure 5 Page 25/28 The gene ontology (GO, a) and the Kyoto encyclopedia of genes and genomes (KEGG, b) analysis of DE miRNAs targets in two root tissues of S. miltiorrhiza Expression of lncRNAs (Right) and miRNAs (Left) in phloem and periderm of red roots (RZP, black) and white roots (WZP, white) of a S. miltiorrhiza plants. Expression levels were quanti ed by qRT-PCR. The level of transcripts in RZP was arbitrarily set to 1, and the level in WZP was given relative to this. Smactin was used as the internal control gene, and three biological replicates were used.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.