Genome-wide identication of microRNAs associated with the somatic embryogenesis in Eucalyptus

Background: MicroRNAs (miRNAs) are a class of small noncoding RNAs with 18-24 nucleotides in length and function in many biological processes in plant. Although Eucalyptus trees are widely planted across the world, our understanding of the miRNA regulation in the somatic embryogenesis of Eucalyptus is still poor. Here we reported for the rst time the miRNA proles of differentiated and dedifferentiated tissues of two Eucalyptus cultivars and identied miRNAs involved in the somatic embryogenesis of Eucalyptus. Results: Stem and tissue-culture induced callus were obtained from the subculture seedlings of E. camaldulensis and E. grandis x urophylla, and were used as differentiated and dedifferentiated samples, respectively. We generated 346.4 million reads for 12 samples (n=3) and identied 888 miRNA precursors (197 known and 691 novel) which can produce 1,067 mature miRNAs. These miRNAs were mainly distributed in chromosomes Chr03, Chr05 and Chr08, and can produce 46 miRNA clusters. In these samples we detected 998 miRNAs with TPM (transcripts per million reads) > 5 and found that highly expressed miRNAs varied across samples. We identied 327 and 343 differentially expressed miRNAs in the dedifferentiation process of E. camaldulensis and E. grandis x urophylla, respectively. Dysregulated miRNAs shared by the two cultivars might be involved in the development of embryonic callus of Eucalyptus, such as MIR156, MIR159, MIR160, MIR164, MIR166, MIR169, MIR171, MIR399 and MIR482. We also identied 81 up-regulated (e.g., miR159c-3p, miR167a-5p, miR397a-3p, miR397c-5p, miR397d-3p, miR397d-5p, N-miR1-5p and N-miR5-5p) and 67 down-regulated (e.g., miR482b-3p, N-miR3-3p, miR156a-3p, N-miR40-3p and N-miR18-5p) miRNAs specic to E. camaldulensis. Target prediction and functional analysis showed they might be involved in longevity regulating and plant hormone signal transduction pathways. Then, the expression patterns of these miRNAs were conrmed by qRT-PCR. Conclusions: This is the rst time to study the miRNAs proles in the dedifferentiation process of Eucalyptus and it will provide a valuable resource for future studies. More importantly, our ndings will improve our understanding of miRNA regulation and molecular mechanisms during the somatic embryogenesis of Eucalyptus, and the output of this study will benet the Eucalyptus breeding program.


Background
Eucalyptus, a highly diverse genus of the Myrtaceae family, is the most widely planted hardwood due to its increasing importance for ber and energy in the word. Their remarkable adaptability coupled with fast growth and superior wood properties has driven their rapid adoption for plantation forestry in South America, Africa, Asia, Spain, Portugal, Middle Eastern countries and North America [1]. This genus consists of about 700 species, of which Eucalyptus globulus (E. globulus) and Eucalyptus grandis (E. grandis) are currently the most extensively planted species [2]. Natural regeneration of Eucalyptus mainly relies on seed and the breeding process is generally slow because of the length of the juvenile phase before owering. Although natural hybridization of Eucalyptus has often been reported to gain some characteristics (e.g., cold resistance), arti cial hybridization is still a means of producing a limited number of seeds and the selected hybrids must then be multiplied by vegetative propagation.
Several means of vegetative propagation for Eucalyptus have been reported, such as air layering, grafting, stem cutting and tissue culture [1]. During the in vitro tissue culture-induced organogenesis or somatic embryogenesis, cells of explants are activated to dedifferentiate and re-enter the cell cycle and in some cases shoots or roots are also regenerated after callus formation [3,4]. The rst step of somatic embryogenesis is the callus induction, which includes a number of characteristic events: dedifferentiation of cells, activation of cell division and reprogramming of cell physiology, metabolism and gene expression patterns [5]. The second step of somatic embryogenesis is the plant regeneration, which is related to the ability to regenerate plant (also called as embryogenic potential) [6]. Some studies have been demonstrated to study the molecular mechanisms of somatic embryogenesis in other plants, such as maize [7], rice [8], banana [9] and Catalpa bungei [10]. However, large is not clear about the molecular mechanisms in the somatic embryogenesis of Eucalyptus.
MicroRNAs (miRNAs) are a class of small (~21 nucleotide) noncoding RNAs that function in regulating gene expression at post-transcription level [2,11]. They have been reported to function in multiple biological processes in plant, such as development, owering, architecture and response to biotic and abiotic stresses [12]. Some studies have been demonstrated to identify miRNAs in Eucalyptus. For example, Levy found no mutual relationship between alternations in miR156 and miR172 expression in the adventitious root induction [13]; Pappas reported the variable patterns of conservation and diversity with miRNAs across Myrtaceae species [2]; and Lin identi ed 386 novel miRNAs in E. grandis and predicted their target genes [14]. These studies provide a basis of understanding the miRNAs regulatory network in Eucalyptus. However, little is known about the miRNA regulation in the somatic embryogenesis of Eucalyptus.
Small RNA sequencing has been used to identify known and novel miRNAs in animals and plants. In Eucalyptus, it has been used to characterize the miRNAs and its targets in some studies [2,14]. In this study, we aimed to identify the miRNA pro les of differentiated (stem) and dedifferentiated (callus) tissues in Eucalyptus camaldulensis Dehnh. (E. camaldulensis, high embryogenic potential) and E. grandis x urophylla (low embryogenic potential). Also, we aimed to characterize differentially expressed miRNAs between differentiated and dedifferentiated tissues, which might be associated with the embryogenic potential of Eucalyptus. Our study will provide a valuable resource for future studies and improve our understanding towards the molecular mechanisms during the somatic embryogenesis and propagation development of Eucalyptus. Our results will bene t the breeding program of Eucalyptus in this eld.

Global miRNA identi cation in Eucalyptus
Due to the unavailable access of Eucalyptus miRNAs, we rst identi ed miRNAs in the E. grandis genome using the small RNA sequencing data of differentiated (stem) and dedifferentiated (tissue-culture induced callus) of two Eucalyptus cultivars (E. camaldulensis stem: A1; E. camaldulensis callus: A2; E. grandis x urophylla stem: B1; E. grandis x urophylla callus: B2). Initially, we obtained a total of 346.4 million raw reads (average 28.9 million reads) and 304.2 million clean reads (average 25.4 million reads) after data cleaning. Using the clean data, we identi ed 888 miRNA precursors (producing 1,067 mature miRNAs, Additional le 1), including 691 novel and 197 known miRNAs. Length distribution ( Figure 1A) showed that 786 Eucalyptus mature miRNAs were in the length of 21 nt, which took 73.7% of the total miRNAs. Figure 1A also showed that the rst nucleic acid of the miRNAs was enriched with uracil (U), which agrees with the base bias of plant miRNAs in miRBase [15]. Next, we analyzed the chromosome distribution of the Eucalyptus miRNAs. Figure 1B showed that chromosome Chr03 can produce the most miRNAs (142 miRNA precursors), followed by chromosomes Chr05 (129 miRNA precursors) and Chr08 (92 miRNA precursors). Based on the miRNA locations in the Eucalyptus genome, we identi ed 46 miRNA clusters whose members are located within 10 kb (Additional le 1) and have the potential of being transcribed together. We also found that 55, 52 and 782 miRNA precursors were located in the exon, intron and intergenic regions, respectively. The redundant one was egd-MIR169aj, which can be produced by two chromosomes -Chr03 and Chr09 (Additional le 1). These miRNAs were used as the reference for downstream analyses, including the miRNA expression pro ling and target prediction. miRNA expression pro les We next pro led the miRNA expression in differentiated and dedifferentiated tissues of E. camaldulensis and E. grandis x urophylla. After lowly expressed miRNAs (average TPM < 5) were ltered, we identi ed 998 miRNAs in all samples ( Figure 1C, Additional le 2), of which 500, 547, 505 and 620 distributed in A1, A2, B1 and B2, respectively. Venn diagram ( Figure 1C) showed that 230 miRNAs were detected in all samples while 81, 79, 60 and 163 miRNAs were speci cally identi ed in A1, A2, B1 and B2, respectively. The distribution of miRNAs in different expression levels revealed that more than 60% of the miRNAs were less than 5 TPM ( Figure 1D) while 31~38 miRNAs were expressed more than 1,000 TPM in these samples. We showed in Table 1 the top 10 highly expressed miRNAs in these samples. It is interesting that the only the top 3 highly expressed miRNAs were shared by all four samples and that MIR166 family members were highly expressed in the dedifferentiated tissues of both E. camaldulensis and E. grandis x urophylla.
qRT-PCR Next, we used tailed miRNA qRT-PCR strategy to validate the differential expression of seven randomly selected miRNAs in the differentiated and dedifferentiated tissues of E. camaldulensis and E. grandis x urophylla. The primer sequences for these miRNAs can be found in Additional le 5. Comparison of small RNA sequencing and qRT-PCR results is shown in Figure 4. Overall, 12 (85.7%) out of 14 events were agreed by both small RNA sequencing and qRT-PCR. Using deep sequencing, egd-miR482b-3p, egd-miR156a-3p and egd-N-miR1-5p were found with down-regulation only in DH201-2 while egd-miR390a-3p, egd-miR396d-3p and egd-N-miR6-3p were found with dysregulation only in GL9. The speci c regulations of these miRNAs were con rmed qRT-PCR. High agreement of miRNA expression patterns in small RNA sequencing and qRT-PCR indicate that the miRNAs identi ed in this study might be associated with the somatic embryogenesis and embryogenetic potential of Eucalyptus, which requires more experiments.

Discussion
Although some studies have been demonstrated to study the Eucalyptus miRNAs [2,13,14], the miRNA identities and sequences of Eucalyptus are still missing in miRbase (v22). In this study, we conducted the small RNA sequencing and identi ed a total of 888 miRNA precursors and 1,067 mature miRNAs in the Eucalyptus genome (Additional le 1). Among them, 197 precursors were aligned to other known miRNAs while 691 could be speci c to the Eucalyptus. The miRNAs identi ed in this study can provide a valuable source to study the regulation mechanisms of noncoding RNAs in the somatic embryogenesis of Eucalyptus. Comparison of miRNAs identi ed this study ( Figure 1C) revealed 104 miRNAs speci c to the callus tissues of Eucalyptus, including six highly expressed miRNAs (e.g., egd-miR482a-3p, egd-miR482a-5p, egd-miR397a-5p, egd-miR397b-5p, egd-miR159b-3p, egd-miR159a-3p) ( Table 1). While 79 miRNAs speci cally identi ed in the callus tissue of DH201-2 ( Figure 1C, Additional le 2) might be related to the high potential of embryogenesis.
Among the top 10 highly expressed miRNAs (Table 1), miR482a, miR397a/b, and miR159a/b were shared by the dedifferentiated tissues of both E. camaldulensis and E. grandis x urophylla. In the callus-derived protoplast, both miR482a and miR482b were up-regulated [16], and they were supposed to regulate the disease resistance pathways and the expression of plant protoplast totipotency [17]. miR482 family has been proved to target mRNAs for NBS-LRR disease resistance proteins using transient expression in Nicotiana benthamiana [18]. In this study, no NBS-LRR gene was predicted to be regulated by miR482a, probably due to the limited annotation of Eucalyptus genes. However, we found that miR482a was predicted to target 30 genes encoding disease resistance proteins, such as Eucgr.A00735 (putative disease resistance protein At4g11170), Eucgr.A02363 (RGA1), Eucgr.C02142 (RPS6), Eucgr.H03707 (RML1B), Eucgr.H03723 (TAO1) and 14 genes encoding TMV resistance protein N. Interestingly, MIR397 has been reported to be particularly expressed in undifferentiated tissue and to be overexpressed in undifferentiated rice callus [19,20], which coincide with our results. MIR397 was reported to target a family of copper-containing oxidase enzymes -laccase, which is proposed to function in the formation of lignin, an integral part of the secondary cell walls of plants [19]. We found a total of 112 laccase genes regulated by MIR397 in Eucalyptus. Also, miR397b-3p was up-regulated in the callus of both E. camaldulensis and E. grandis x urophylla, and miR397a-3p was up-regulated only in the dedifferentiated tissue of E. camaldulensis (Additional le 3). Like MIR397, MIR159 has been studied to regulate the development of callus in multiple plants, such as Dimocarpus longan [21,22], Moringa oleifera [23] and Larix leptolepis [24]. miR159 has been shown to be induced by ABA and to regulate MYB33 during the seed germination of Arabidopsis [25]. Although in the present study miR159 was not differentially expressed in the dedifferentiated tissue, its high expression indicates a potential role in the dedifferentiation and the callus development.
In addition, we identi ed 81 up-regulated and 67 down-regulated miRNAs speci cally in the dedifferentiated tissue of E. camaldulensis compared to E. grandis x urophylla ( Figure 3A). Among them, some miRNAs were detected with relative high expression in the stem or callus, such as miR156a-3p, miR167a-5p, miR482b-3p, miR3488-3p and miR535a-3p (Additional le 2). miR156a and miR482b were down-regulated in the callus of E. camaldulensis (Additional le 3). In Arabidopsis, the repression of miR156a by MYB33, a target of miR159, has the capacity of accelerate the vegetative development through its target SPL9 [26]. In E. grandis and in E. brachyphylla the expression of miR156 was shown to have no relationship with the rooting ability [13]. Thus, the relationship between miR156 and somatic embryogenesis in E. grandis requires further experiments to be explored. The expression of miR167a was found to be up-regulated during the development of rice callus and its target ARF8 was found to be downregulated [27]. miR535-3p is the passenger miRNA of miR535, which was relevant from their expression (Additional le 2). miR535-5p was down-regulated in both E. camaldulensis and E. grandis x urophylla (Additional le 3). In rice, SPL14, the target of miR535, remains at a low level during the vegetative growth and can strengthen the ability of disease resistance in plant [28]. miR3488 was identi ed due to its similarity to an animal miRNA. More experiments are required to characterize its presence in Eucalyptus and the association with the somatic embryogenesis.
We also identi ed a number of novel miRNAs for Eucalyptus (Additional le 1) and some of them were highly expressed in the differentiated and dedifferentiated tissues, such as egd-N-miR1-3p, egd-N-miR2-3p, egd-N-miR4-3p, egd-N-miR5-5p and egd-N-miR7-5p (Table 1, Additional le 2). The abundance of these miRNAs support that they might be speci c to the stem or the callus of Eucalyptus. Further, some of them were found to be differentially expressed in this study, such as egd-N-miR1-3p, egd-N-miR2-3p, and egd-N-miR5-5p (Additional le 3). Among them, 99 Eucalyptus novel miRNAs were identi ed with dysregulation speci c in E. camaldulensis, such as egd-N-miR3-3p and egd-N-miR-5p (Additional le 3). Target prediction identi ed Eucgr.B01315 ((-)-germacrene D synthase) is the target of egd-N-miR3-3p while egd-N-miR5-5p can regulate the expression of Eucgr.C03404 (myosin-17) and Eucgr.E03882 (microtubuleassociated protein 70-5). Although it is not clear about their functions in the somatic embryogenesis, our results provide a valuable resource for future studies in Eucalyptus.

Conclusions
In this study, we analyzed the small RNA pro les of differentiated and dedifferentiated tissues of two Eucalyptus cultivars -E. camaldulensis and E. grandis x urophylla. In total, we identi ed 691 novel and 197 known miRNAs for Eucalyptus, most of which distributed in chromosome Chr03 and Chr05. In the stem and callus we detected 998 miRNAs with TPM > 5, of which 230 shared by all samples. Further, we identi ed 143 up-regulated and 184 down-regulated miRNAs in the dedifferentiation process of E. camaldulensis, and 128 up-regulated and 215 down-regulated miRNAs in E. grandis x urophylla. Comparison of the differentially expressed miRNAs in dedifferentiated tissues revealed 62 up-regulated and 117 down-regulated miRNAs shared by both E. camaldulensis and E. grandis x urophylla, which might be related to the somatic embryogenesis of Eucalyptus. Further, the up-regulation of 81 miRNAs and down-regulation of 67 miRNAs were speci cally identi ed in E. camaldulensis. These miRNAs were shown to function in the regulation of pathways like "Longevity regulating pathway -multiple species" and "Plant hormone signal transduction". The expression of miRNAs was con rmed by qRT-PCR and their relationship with somatic embryogenesis was agreed with previous studies in other plants. This is the rst time to study the association of miRNAs and somatic embryogenesis in Eucalyptus. The output of this study will provide a valuable resource for future studies in Eucalyptus. More importantly, our ndings will bene t the molecular research in this eld and the breeding program of Eucalyptus.

Plants and tissue culture
The plants of E. camaldulensis (high regenerative ability, voucher ID: c0009) and E. grandis x urophylla (low regenerative ability, voucher ID: j0017) were maintained in the experimental elds of Guangxi Forestry Research Institute. The second generation of in vitro tissue-culture induced seedlings of E. camaldulensis and E. grandis x urophylla were maintained on the MS medium supplemented with 20mg/L Ca(NO 3 ) 2 , 0.5 mg/L 6-BA and 0.1 mg/L IAA until 2 to 3 cm height for this project. The second to the third stems from the stem tip of the seedlings were obtained and cut into 0.3 ~ 0.5 cm segments. About 60 segments of each Eucalyptus species were then transferred onto the induction MS medium (supplemented with 20mg/L Ca(NO 3 ) 2 , 1 mg/L KT and 0.5 mg/L 2,4-D) and maintained in dark at 28±2°C for 10 days. Successful callus samples were used as dedifferentiated tissues. The induction experiment was replicated three times.

RNA isolation and quality control
Total RNA was isolated from the plant tissues (100 mg, in triplicates) using the TRIzol reagent (Invitrogen) according to the manufacturer's protocol, as previously described [11]. The quantity and quality of total RNA was evaluated by Agilent Bioanalyzer 2100 (Agilent Technologies).

Library preparation of small RNAs
Total RNA (1 µg) of each sample was used to construct the small RNA libraries. Initially, total RNA samples were fractionated on a 15% urea-PAGE gel electrophoresis and a band corresponding to small RNAs (18 -30 bp) were excised. After small RNAs extracted by centrifugation, they were ligated with the adenylated 3′ adapter. Then, RT primer with barcode was used to anneal the 3' adenylated adapter, followed by the ligation of 5' adapter and the reverse transcript reaction. After rst strand cDNA synthesis, we ampli ed the product by 15 cycles and carried out another size selection (103 -115 bp) from the gel. After gel puri cation, the PCR product was quanti ed by Qubit (Invitrogen) and a single strand DNA circle (ssDNA circle), which gave the nal miRNA library, was made by pooling multiple samples together.

Small RNA deep sequencing
We rst generated the DNA nanoballs (DNBs) with ssDNA circle by rolling circle replication (RCR) to enlarge the uorescent signals at the sequencing process, as previously described [29]. Then, the DNBs were loaded into the patterned nanoarrays and single-end read of 50 bp were read through on the BGISEQ-500 platform.

Data analysis and miRNA identi cation
The raw reads were cleaned by SOAPnuke, as previously described [30]. In order to identify miRNAs in the ooded Eucalyptus genome, we merged all the clean reads into one le and ltered reads less than 50 copies. The clean data was aligned to the Eucalyptus genome (http://plantgenie.org) using SOAP2 [31] and potential miRNA precursors were predicted using MIREAP. Then, potential miRNAs were searched against all the plant miRNA sequences in miRBase (v22.1). The family information of Eucalyptus miRNAs was manually identi ed using their similarities to known miRNAs in other species. miRNA clusters were identi ed using a 10,000 bp window screening in the genome. miRNAs not aligned to other known miRNAs were considered as novel miRNAs.
Expression pro le of small RNAs Before miRNA expression was pro led, we annotated other kinds of small RNAs in each sample. First, rRNA, tRNA, scRNA, snRNA, snoRNA and srpRNA were annotated by mapping the clean reads to NCBI GenBank and Rfam databases. Then, mRNA fragments were identi ed by mapping the clean reads to the Eucalyptus mRNA sequences. Next, we aligned the clean reads of each sample to all the miRNA precursors identi ed before using BLAST (no mismatch). Reads aligned to each mature miRNA were counted. Normalization of miRNA expression was performed using the TPM (transcripts per million reads) method and total clean reads were used as background.

Differential expression analysis
We performed differential expression analysis of miRNAs using edgeR, as previously described [32]. Raw counts of all miRNAs were used as input for edgeR. Some cut-offs were applied to the expression levels and statistical values to select differentially expressed miRNAs, such as TPM > 5, log2 fold change (log2fc) > 1 or log2fc < -1, p-value < 0.05 and false discovery rate (FDR) <0.1. miRNA target prediction miRNA target prediction was performed using the psRobot software [33].

Functional analysis
We rst annotated the Eucalyptus genes by mapping them to the KEGG and GO databases. Then, enriched GO terms and pathways were identi ed using two statistical values -p-value (calculated by Fisher's exact test, < 0.05) and q-value (calculated by the R package 'qvalue', < 0.05), as previously described [34].
qRT-PCR Quantitative real-time PCR (qRT-PCR) was used to con rm the expression levels of miRNAs identi ed in this study. We randomly selected 7 miRNAs and used the actin gene as internal control.

Declarations
Ethics approval and consent to participate No speci c permits were required for the described eld studies. The location is not privately-owned or protected in any way, and the eld studies did not involve endangered or protected species.

Consent for publication
Not applicable.