3. 1. Overview of small RNA sequencing
To identify miRNAs, two sRNA libraries were constructed from leaves of C. asiatica and sequenced by Illumina HiSeq 2500 analyzer with 50 single end reads. A total of 58,861,370 raw reads were generated from the sRNA libraries. After removing adapter contaminations and low-quality reads, clean reads of 12,112,674 were obtained. Subsequently selection of size and removal of other non-coding RNA sequences yielded 8,121,083 reads. The size distribution pattern of small RNAs (Fig. 1) showed higher abundance of 21 nt. 24 nt sequences was the next largest fraction followed by 20 and 22 nt. After mapping with the Centella transcriptome (accession no: SRX2572577), 8,123,381 unique reads were obtained. The small RNA sequence data was submitted to the NCBI SRA archive under the accession numbers SRX6411842 and SRX6411843. Among the total unique reads 2,182 unique sequences were identified as potential miRNAs, showing perfect or near perfect matches to the known miRNAs from viridiplantae.
3. 2. Identification Of Conserved And Novel Mirnas
Thirty-seven conserved miRNA families having high sequence similarity to the currently known plant mature miRNAs have been identified from C. asiatica. Comparing with miRNAs from miRBase database, miRNAs present in C. asiatica which are conserved in other plant species were identified. Mapping of filtered reads against miRBase 22.1 identified 227 miRNAs matching against conserved miRNAs. Apparent expression level of miRNA in C. asiatica leaf tissue was observed by analyzing the read count for each miRNA. The largest number of members were identified from miR156 family containing 20 members while 18 miRNA families (miR2118, miR482, miR3630, miR1436, miR827, miR165, miR2673, miR6167, miR2655, miR408, miR174, miR1171, miR530, miR1128, miR6138, miR5174, miR845, miR5653) contained one member each (Fig. 2). The most abundantly expressed miRNAs in both the libraries were miR159, miR396, miR156, miR160, miR172 and miR398. It was possible to detect 25 3p and 22 5p miRNA sequences in the libraries. Every member showed different expression levels within each miRNA family as evident from their reads which further clarifies the fact that different miRNAs play distinct role in plant growth and development. The conserved miRNA population of C. asiatica had high, moderate and lowly conserved miRNAs (Supplementary Table: 1) which can be classified according to the number of plant families containing identified miRNAs [53].
The remaining 7,373,800 reads were retained for novel miRNA prediction and mapped to C. asiatica transcriptome sequence (SRX2572577) using software package miRDeep-P which is specifically designed to characterize miRNA transcriptome in plants. We could identify 109 miRNAs with their precursor sequences (pre-miRNAs) (Supplementary Table: 2) which formed canonical stem-loop hairpin secondary structures as predicted by RNAfold (Fig. 3). The length of the novel mature miRNAs varied from 21 to 25 nt, with the majority being 23 nt. Length of these novel miRNA precursors ranged from 45-200, with an average of 83, which was in compliance with the commonly observed length of plant miRNA precursors [54]. The minimal folding energy (MFE) values of these novel miRNA precursors ranged from -38 to -84.3 with an average of -56.71 Kcal mol−1 which was similar to plant miRNAs [55]. Among the novel miRNAs predicted, all the candidates were found to have complementary miRNA* in the sRNA sequence data.
3. 3. Identification Of Mirna Targets
To comprehend the functions of the miRNA families identified in C. asiatica, we predicted the possible targets of them by psRNATarget. The total numbers of potential target genes supported by psRNATarget were 3313 and 3635 for A. thaliana and D. carota respectively (Supplementary Table: 3). Majority of the target genes (95.51%) were predicted to be regulated through transcript cleavage whereas the remaining targets are regulated by translational repression. Most of the predicted targets were orthologs of known conserved miRNAs targets which are involved in wide range of biological processes including metabolism, plant reproduction, and regulation of development. Moreover, miRNA 396 family had highest number of target genes targeting GRF TFs such as GRF3, GRF4 and GRF12, TATA-binding protein-associated factor BTAF1 etc. The major targets of the miRNA156 included genes encoding squamosa promoter binding like (SPL) proteins such as SPL2, SPL3, SPL6, SPL7, SPL9 and SPL13 as well as F-box and LRR family proteins and NAC domain containing proteins, all of which are transcription factor (TFs) families. A less conserved miRNA such as miR2673 was found to have a variety of targets such as chloroplastic heat shock protein 90-5, nucleosome assembly protein 1, kinase-interacting family proteins, bHLH30, DNA topoisomerase 3-beta isoform etc. In addition, auxin response factor - like TFs, homeobox-leucine zipper protein, scarecrow-like protein, AP2-like ethylene-responsive transcription factor mostly by miR160, miR166, miR171 and miR172 respectively. Thus, target genes were functionally diverse which included enzymes, transcription factors and functional as well as regulatory proteins. A transcription factor-based classification of the targets performed using PlantTFDB illustrated distribution of various TFs including families which were over-represented such as C2H2, WD40-like, HB-PHD, Myb, NAC etc. (Fig. 4).
Among 109 of novel miRNAs detected from C. asiatica, gene targets were predicted for 92 of them (Supplementary Table: 4). C.as_miR91 was found to have maximum number of targets including various types of mRNAs encoding heavy metal transport/detoxification superfamily, AP2/B3-like transcriptional factor family, homeobox 16, cytochrome p450, drought-responsive family, leucine-rich receptor-like protein kinase family etc. On the other hand, phytochrome interacting factor 3-like protein family was predicted to be targeted by C.as_miR106 while C.as_miR16, C.as_miR100, and C.as_miR107 were found to target F-box family proteins, which are considered to be broadly regulated by microRNA-mediated gene silencing [56]. The miRNA-target network of genes encoding secondary metabolites was created using Cytoscape 3.8.1 is depicted in Supplementary Figure 1. Further experiments that elucidate miRNA-target interaction of the predicted targets and their regulatory mechanisms in biological processes are essential.
3. 4. Go And Kegg Analysis Of Targets
Gene ontology (GO) classification annotation analysis of the miRNA targets reflected diverse biological characteristics (Fig. 5). GO analysis of both conserved and novel miRNAs targets could be classified into 14 biological processes, 8 molecular functions and 6 cellular components. For biological process, “metabolic process” (GO: 0008152) and “cellular process” (GO: 0009987) were the two most represented GO categories. Considering molecular functions, the most dominant GO terms were catalytic activity (GO: 0003824) and binding (GO: 0005488), and the two most dominant GO terms in cellular compartments were cell (GO: 0005623) and organelle (GO: 0043226).
There were several conserved miRNA target genes (14.29%) that had no functional annotation. The most frequently represented protein class among the miRNA targets was nucleic acid binding (PC00171) (17%) followed by hydrolase (PC00121) (15.2%) and transferase (PC00220) (13.9%). KEGG analysis using BlastKOALA which is an internal annotation tool for KEGG genes was performed wherein 44.1% of the targets were assigned with KO (KEGG ORTHOLOGY) IDs. Seventeen different pathways related to metabolism were found (Fig. 6), and the most frequently represented pathways were related to “genetic information processing” and “carbohydrate metabolism” after screening out unannotated and repeated sequences. As seen in Fig. 6, 50 genes were related to metabolism of “terpenoids and polyketides” and 58 genes related to “biosynthesis of other secondary metabolites” were found.
3. 5. Expression profiles of conserved miRNAs in leaf, petiole and root
To investigate the gene expression pattern of conserved miRNAs in different tissues of the plant, differential expression of eleven conserved miRNAs were validated by qRT-PCR in the leaf blade, petiole and root. We selected eleven highly abundant conserved miRNAs according to the high-throughput sequencing data namely miR171b, miR171g, miR171f, miR398b, miR396a, miR399b, miR159e, miR159a, miR156d, miR156k and miR2673.
Leaf tissue was considered as reference to calculate fold change expression. In the study most of the miRNAs were up-regulated in leaf tissue compared to both petiole and root such as miR171g, miR156k, miR159a, miR159e, miR398b, miR399b and miR2673. Other miRNAs such as miR156d, miR171a, miR171f and miR396a were found to be up-regulated in both leaf and root compared to petiole. The results (Fig. 7) suggest that the miRNAs profiled from leaves of C. asiatica through high throughput sequencing were considerably abundant in leaf compared to the other tissues, petiole and root.
3. 6. SA responsive triterpenoid pathway genes and miRNAs in in-vitro grown multiple shoot cultures of C. asiatica
To uncover the expression profiles of mRNAs involved in triterpenoid pathway and miRNAs in a controlled condition, we established an in-vitro multiple shoot culture system of the plant. To set up an environment where the triterpenoid pathway genes are essentially up-regulated which will further lead on to increased production of the centellosides, the multiple shoot cultures were subjected to SA acid (150 µM) treatment over a period of 3 weeks with its control (devoid of SA). Genes encoding enzymes which participate in the production of centellosides such as β-amyrin synthase (βAS), squalene epoxidase (SQE), squalene synthase (SQS), farnesyl diphosphate synthase (FPS), 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DXR), hydroxyl methyl glutaryl-CoA reductase (HMGR), and glucosyl transferase (GT) were used for the study (Fig. 8). βas, the enzyme which catalyze synthesis of β-amyrin which acts as direct precursor for the production of centellosides was found to increase up-to 14-fold within 3 weeks of SA (150 µM) treatment compared to its control. Other enzymes such as SQE, SQS, FPS, DXR, HMGR and GT were also found to get up-regulated approximately 2.5 to 5-fold, reinforcing the fact that SA has the potential to induce expression of triterpenoid pathway genes. Similarly, twelve miRNAs belonging to eight miRNA families such as miR171 (miR171b, miR171g, miR171f), miR398b, miR399b, miR396g, miR2673, miR156 (miR156i, miR156k, miR156d), miR159a and miR1171 were analyzed for their expression in SA (150 µM) elicited as well as non-elicited multiple shoots in vitro-grown cultures (Fig. 9). All members of the miR156 family viz. miR156i, miR156k and miR156d as well as miR159a and miR1171 showed down regulation in SA elicited samples compared to its control. MiR156i was down regulated approximately up to 5.5-fold while miR156k, miR156d, miR159a and miR1171 were down-regulated up to 0.5-fold. No considerable change was observed for miRNA family miR171, miR398, miR399 and miR396g with SA treatment compared to its control while miR2673 was up-regulated up to 2-fold compared to its control.
3. 7. Validation Of Mirna-guided Cleavage On Mrnas
One of the distinctive features of miRNA mediated mRNA cleavage as compared to other mRNA degradation method is that cleavage of mRNA takes place between the 10th and 11th nucleotides from the 5ʹ end of the complimentary miRNA. [57]. In this study, two C. asiatica miRNA target gene sequences were verified as targets of miRNAs through 5ʹ RLM-RACE. Sequencing of the 5ʹ ends revealed that the gene SPL6 and MYB54 was cleaved between 10 and 11th nucleotide complementary region to miR156 and miR398 respectively (Fig. 10). The results confirmed the predicted cleavage sites of these miRNA targets.