Computational Discovery of a miRNA and its Putative Target Genes in Ziziphus Jujuba using Genome-wide Expressed Sequence Tags

Background Ziziphus jujuba is an important fruit crop which is increasingly becoming popular among consumers due to its medicinal properties. Increasing worldwide demand for the fruit poses new challenges to the industry which includes the need for accelerated cultivar development of jujubes. To embark on cultivar development with improved traits such as high yield and disease resistance, molecular and conventional breeding, and genetic engineering become imperative. But inadequate trait-enhancing alleles or gene pleiotropism limit the direct use of several identi�ed genes. To overcome these issues, microRNAs (miRNAs) can be utilized in breeding of jujubes as genetic modulators to ne-tune the regulation of gene expression, thus the discovery of miRNAs becomes important. Methods and results In this study using a computational approach, we identi�ed one potential miRNA (zju-miR-215-3p) from 2904 expressed sequence tags. The miRNA showed down regulation of �ve target proteins (AP-2 complex subunit alpha, C2H2-type domain-containing protein, sentrin-specic protease 1, hydrolase_4 domain-containing protein and putative alpha-ketoglutarate-dependent dioxygenase) and their suppression appears to be helpful to the plant to overcome stress conditions. Conclusion The miRNA identi�ed in this study is associated with �ve potential target proteins, most of which are implicated in metabolic and developmental processes associated with plant growth and reproduction. Future studies are necessary to validate the miRNA by RNA sequencing and to con�rm the molecular functions of the down regulations of target proteins.


Introduction
The jujube (Ziziphus jujuba Mill.), which is also known as Chinese date, is an important species in Rhamnaceae (buckthorn family) and regarded as a superfruit for the future due to its excellent survival mechanisms and adaptability to withstand water stress and drought conditions [1,2].The jujubes diverged from Cannabaceae and Musaceae families nearly 79.9 million years ago [3].It is indigenous to the lower and middle reaches of the Yellow River in China but widely distributed in Australia, Europe and south and east Asia [1,4].Jujube trees were found in northern China about 24 million years ago, but the domestication occurred at least 7240 years ago [5,6].The jujube is one of the oldest cultivated fruit trees in the world [1] and used as a fruit and a medicinal plant due to the presence of complex chemical substances with many curative properties [7].China has as many as 4000 years history of the use of jujube and is the largest exporter of this fruit in the world [4].At present, the extent of jujube cultivation in China is over 1.5 million hectares [8] and fresh jujube fruit production is as much as eight million tons [1].
According to a report of the United Nation's Food and Agriculture Organization, the China's export volume of jujube exceeds 4700 tons of dry fruit per annum [9].
The jujube is an excellent source of biologically active phytochemical compounds with nutritional and nutraceutical properties [2].These compounds are found in many parts of the tree including bark, fruits, leaves and seeds [10].Thus, different plant parts of jujubes are used in alternative and complementary medicine to treat different ailments including fever, diarrhea, insomnia, wounds and ulcers, to purify blood and help in digestion [11].It is also used in traditional Chinese medicine to treat hysteria in women and anorexia, fatigue and diarrhea due to spleen-de ciency syndrome [12].While dry jujubes are generally used as food, food additives and avoring agents, the fruits made into juice, confectionery, paste and puree are consumed to help in digestion and for general wellbeing [4,13].
The jujube industry has well developed over the last 70 years in China and later in many different countries including Australia, Iran, South Korea and Japan, mainly due the application of modern scienti c technology and innovations for crop production, protection and improvement [1,14].However, the industry is currently facing a series of new challenges due to increasing demands for the fruit worldwide.For example, fundamental research on genetic control of economically important traits is insu cient while modern and highly e cient breeding techniques are yet to be used for accelerated cultivar development of jujubes [1].Further, crop improvement practices by introducing single or multiple genes through marker assisted selection and conventional breeding or genetic engineering also present new challenges mainly due to the constraints in the direct use of many identi ed genes because of limited trait-enhancing alleles or gene pleiotropism [15,16].There is a growing body of evidence suggesting that miRNA act as genetic modulators which can ne-tune the regulation of gene expression at the post-transcriptional level by causing the cleavage of messenger RNAs or directing translational inhibition [17,18].The validated targets of most miRNAs are regulatory proteins, most of which are transcription factors that are crucial regulators of plant development process, metabolism and stress response [19][20][21][22].Therefore, identi cation of miRNAs in plants such as jujuba is essential to utilize them in breeding for the improvement of yield, quality and stress resistance traits.
Computational approach has been widely used for the discovery of miRNAs in many economically important plant species including coffee, rapeseed, rice and watermelon [23][24][25][26][27].These studies have used either whole genome sequences or expressed sequence tags (ESTs) for the identi cation of miRNAs.As genome information and ESTs are currently available for many plant species including jujuba, miRNA identi cation using computational approach is very feasible and useful for future validations and their use in crop improvement programs.miRNA are endogenous single-stranded non-coding RNA molecules of approximately 19-25 nucleotides in length that has a nity to bind to partially complementary sequences in target messenger RNAs (mRNAs) thereby catalyzing posttranscriptional silencing of target genes [28][29][30].miRNAs are transcribed from miRNA genes by RNA polymerase II and the immediate transcript is primary miRNAs (pri-miRNAs) which are processed into precursor miRNAs (pre-miRNAs) [18].Further processing of pre-miRNAs by the Dicerlike 1 enzyme with the help of HYPONASTIC LEAVES 1 and Serrate results in miRNA duplex [18,31].An RNA methyltransferase protein HUA ENHANCER 1 methylates the miRNA duplex and then the guide miRNA strand is incorporated into ARGONAUTE proteins to form a functional RNA-induced silencing complex (RISC) [18,32].When a successful pairing between miRNA and target gene occurs, the RISC instigates the inhibition of protein expression via cleavage of mRNA or inhibition of translation [29,30].
As there is growing potential for genetic improvement of jujubes, identi cation of miRNAs is imperative to utilize them in the jujube breeding programs.Only a handful of studies have been undertaken in jujube to identify miRNAs, particularly the one which are involved in the response to witches' broom disease caused by phytoplasma [33,34].This study embarked on the identi cation of other miRNAs in jujube and their putative target genes using ESTs and a computational approach.

Materials And Method
Retrieval of reference miRNAs, Ziziphus jujuba ESTs and other data All known plant miRNAs were obtained from the miRNA database in miRBase (http://www.mirbase.org/).This step resulted in 48,885 mature miRNA sequences which were then subjected to the removal of redundant sequences using a python code.A total of 2,901 ESTs of Z. jujube were retrieved from NCBI (https://www.ncbi.nlm.nih.gov/).Further, non-redundant protein sequences were obtained from the nonredundant protein database (nr) of NCBI at ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/. mRNA sequences required for this study were sourced from the European Nucleotide Archive (EBA) of EBI (https://www.ebi.ac.uk/ena/browser/home).

Identi cation of potential miRNAs
A reference miRNA data set was prepared from all known plant miRNAs after removing the redundancy in the sequences in order to use it as a query for homology search against Z. jujuba EST sequences using the BLASTN tool (Fig. 1).The following parameters were selected for the BLASTN homology search: low complexity in sequence lter, descriptions and alignments kept at default, word match size was kept at the default value (7) and the e-value at 0.01 [25].From the hits obtained in this step, the sequences that ful lled the following criteria were only considered potential miRNA candidates; (1) at least 18-20 nt length in predicted mature miRNAs and (2) 0-4 mismatches allowed in sequences when searched against previously known mature miRNAs [35].The screened ESTs having miRNA candidates were matched against protein database using BLASTX tool to nd non-protein coding sequences (Fig. 1).The nonprotein coding sequences (candidates for precursor miRNA sequences) were then used for the prediction of secondary structures (details given in the subsequent section).Subsequently, precursor miRNA sequences were obtained using the miREval 2.0 online software [36].

Secondary structure prediction
Non-protein coding ESTs were assessed for their secondary structures using Zuker folding algorithm in MFOLD software [37].ESTs were considered as potential miRNA candidates when the following criteria were satis ed; (1) EST should fold into a proper stem-loop hairpin secondary structure, (2) sRNA should sit on one arm of the hairpin structure, (3) predicted mature miRNA sequence and its opposite miRNA* sequence in the secondary structures should not have more than six mismatches between them, (4) there should not be a loop or break in miRNA or miRNA* sequences, and (5) predicted secondary structure should have higher MFEI (minimal folding free energy index) and negative minimal folding free energy [38].AMFE (adjusted minimal folding free energy) and MFEI values were calculated using the following equations [39].
Novel miRNA predicted in the study was named according to the miRNA nomenclature criteria (Gri ths-Jones et al., 2006).Using small RNA workbench software, the novel mature miRNA sequence was highlighted in the secondary structure [40].Precursor sequence of Z. jujuba miRNA and six related precursor sequences from same family (miR-215) were aligned using FFT-NS-I ×1000 MAFFT algorithm with all parameters at defaults in Geneious software (version 10.2.4).The alignment was used to develop a WebLogo signature using the WebLogo server at https://weblogo.berkeley.edu/[41,42].

Target gene prediction
Limited number of genes are currently annotated in Z. jujuba, thus Morus notabilis (mulberry) in the same order (Rosales) was used as a reference organism to identify the targets of the newly identi ed miRNA.To predict the target genes, the novel miRNA of Z. jujuba was matched in a simple homology search against M. notabilis coding sequences (cds; released on 9 September 2013) using psRNATarget (http://plantgrn.noble.org/psRNATarget/analysis),which is a plant small RNA target analysis server that identi es plant regulatory small RNA (sRNA) including miRNAs by analyzing complementary matches between sRNA and target mRNA sequences using prede ned scoring scheme and by evaluating target site accessibility [43,44].The following parameters were chosen for the analysis; (1) maximum expectation value of 3, (2) multiplicity of target sites of 2, (3) range of critical mismatch for translational inhibition set at 9-11 nucleotides, and (4) maximum mismatches at complementary sites ≤ 4 without any gaps.Using mRNA IDs in UniProt (Universal Protein Resource) at https://www.uniprot.org/,the target protein, molecular function and biological processes of miRNA were determined (Fig. 1).

Phylogenetic analysis
Family members of the newly identi ed miRNA in Z. jujuba and a miRNA (osa-miR395u) from Oryza sativa as an outgroup were collected from miRBase by a simple search of miRNA family name and miRNA name, respectively (Table 1).A phylogenetic analysis was performed with the new miRNA and the collected miRNA family members.Brie y, the miRNA sequences were aligned using MUSCLE alignment algorithm [45] in MEGA X software.A maximum likelihood (ML) analysis was done in IQ-TREE using

Results
Potential miRNAs in Z. jujuba Initial screening by matching the matching of 2,904 EST sequences from Z. jujuba against 48,885 mature miRNA sequences of plants in miRBase resulted in 1,021 hits.Further screening as outlined in gure 1 predicted one potential miRNA belonging to the family miR-215-3p with 2 mismatches (Fig. 2).The sequence length of the new mature miRNA was 22 nucleotides while that of the precursor sequence was 85 nucleotides (Fig. 2).The MFE value of the novel miRNA was estimated at -21.1kcal/mol while the MFEI value of the miRNA was 0.88 (Table 2).The G+C content of the precursor sequence was 28% while (A+U) content was 72% (Table 2).According to the nomenclature criteria, the newly identi ed miRNA in Z. jujuba was named as zju-miR-215-3p (5'-AUUGGCUGUACAGAAUUACAGA-3').

Topological features of precursor secondary structure and its sequence comparison
The secondary structure (stem-loop structure) of the Z. jujuba miRNA showed two hairpins, one large hairpin at the 5' end and one small hairpin at the 3' end of the sequence (Fig. 2).The large hairpin formed 2 bulges while the small hairpin had a single bulge formed by an unpaired adenine (A) nucleotide at the 59th nucleotide position on the 5' side of the stem (Fig. 2).The mature miRNA was found between 31st and 52nd nucleotide positions on the 3' side of the stem of large hairpin (Fig. 2).Although only a handful number of precursor sequences were compared in this study and the mature miRNA sequence identi ed in Z. jujuba showed similarity to other miRNAs at 13 nucleotide positions (Fig. 3a).The alignment revealed that the precursor of Z. jujuba had a seven-base sequence deletion after 71 nucleotides (Fig. 3a).
The WebLogo signature of the alignment of precursor sequences showed that uracil (U) nucleotide was the most conserved nucleotide followed by adenine (A) among the sequences analyzed in this study (Fig. 3b).
Targets and molecular functions novel miRNA in Z. jujuba Molecular target identi cation of the novel miRNA revealed ve different target proteins (Table 3).The identi ed miRNA showed inhibition of gene expression mainly by cleavage of mRNA transcripts and translation inhibition (Table 3).Of ve different targets, four proteins targets were inhibited by mRNA cleavage while the target protein putative alpha-ketoglutarate-dependent dioxygenase showed inhibition of translation (post-transcription step).Of the target proteins, the AP-2 complex subunit alpha has a role in protein transport through transport vesicles in different membrane transport pathways which encompasses molecular processes such as clathrin-dependent endocytosis, clathrin adaptor activity and intracellular protein transport (Table 3).The targets Sentrin-speci c protease 1 (SENP1) and Hydrolase_4 domain-containing protein are involved in cysteine-type peptidase activity and hydrolase activity, respectively.However, the molecular function of C2H2-type domain-containing protein is currently unknown (Table 3).The putative alpha-ketoglutarate-dependent dioxygenase aids in dioxygenase activity to catalyze oxidation-reduction (redox) reactions where both atoms of an oxygen molecule (O 2 ) are incorporated into the (reduced) product(s) of the reaction.Phylogenetic relationship of Z. jujuba miRNA A maximum likelihood tree to explain the phylogenetic relationship of the new miRNA revealed that the Z. jujuba miRNA (zju-miR-215-3p) was found to be closely related to oan-miR-215-3p in O. anatinus which together with some other miRNAs of this family formed a clade (Clade 1; Fig. 3).The miRNAs aca-miR-215-3p, cli-miR-215-3p, cpo-miR-215-3p, gga-miR-215-3p, hsa-miR-215-3p, mml-miR-215-3p, oha-miR-215-3p, pal-miR-215-3p, pbv-miR-215-3p and tgu-miR-215-3p formed a separate clade (Clade 2) from the previous clade (Fig. 3).

Discussion
This study identi ed potential miRNAs in Z. jujuba using a computational approach and assigned putative target genes for the miRNAs.We investigated over 2900 ESTs in Z. jujuba and were able to identify a single miRNA with a length of 22 nucleotides in the family miR-215-3p.This miRNA showed putative regulatory functions by destabilizing mRNA targets or inhibiting protein translation of ve different target proteins.miRNAs play a key function as master modulators of gene expression, particularly for post transcriptional gene regulation or translation inhibition [29,30,49].Because of these regulatory functions and promising roles in the expression or silencing of certain traits, miRNAs have become important candidates for research.
miRNAs have been identi ed and studied extensively using in silico approaches as well as wet lab-based transcriptome and genome analyses in many plant species [26, [50][51][52].As genome sequences of various organisms are increasingly added to databases and freely available, computational approach is the easiest and cheapest option for miRNA identi cation [53].The presence of evolutionarily conserved sequences within species of plant kingdom is the most salient feature enabling the computational discovery of miRNAs or their orthologs in different plant species [54].
In this study, we have utilized the Z. jujuba EST database which is freely available thus making the discovery of miRNA cost-effective.Many studies have used EST databases to identify miRNAs in various plant and animal species using similar approaches [26, [55][56][57][58].The e ciency of computational identi cation of miRNAs using various bioinformatics tools is higher than the identi cation by direct cloning approach that may have low expression level or varied spatiotemporal expression patterns [59].
Thermodynamic stability of the secondary structures of miRNA precursors is an important parameter that is considered in screening the candidates for miRNA prediction [60].miRNA precursors form thermodynamically stable structures as opposed to randomized sequences with the same nucleotide composition and have lower MFE than randomized sequences [61,62].But miRNA precursors have a higher MFE value than other coding or non-coding RNAs including tRNA and rRNA [39].The miRNA that we have identi ed in this study is veracious due to the stringent selection criteria that were used in the computational miRNA identi cation pipeline.To avoid false prediction of miRNA precursors, the current study considered precursor sequences with an MFEI value greater than 0.85 as likely miRNA candidates [35].Various studies have reported the selection of precursors with MFEI within the range of 0.4 -1.9 as potential candidates for miRNA prediction [26,56,[63][64][65].
The miRNA identi ed in this study has ve target proteins that have various cellular and metabolic functions in Z. jujuba.The AP-2 complex which has four subunits (a-, b-, s-and m-) is important for clarithrin-dependent endocytosis for the regulation of signaling and transport in plants [66].The miRNA identi ed in this study targets the subunit alpha.AP-2 complex has essential roles in physiological and development functions including the development of oral parts in A. thaliana [66].Homozygous AP-2 mutants of Arabidopsis thaliana have shown defective ower development that results in pollination failure [66].Adjustment of owering time is an evolutionary feature that is adopted by plants to increase the chance of reproduction under biotic and abiotic stress conditions [67].Therefore, it appears that the down regulation of AP-2 complex subunit alpha by the new miRNA may help prevent ower formation in jujube trees under various stress conditions to alter the reproductive time of the plant and cope with unfavorable conditions.
All eukaryotes have evolutionarily conserved small ubiquitin-related modi er (SUMO) proteins which cause reversible posttranslational modi cation of target proteins [68][69][70].Hyper SUMOylation of target proteins plays a pivotal role in stress and pathogen responses and control of owering [71].But Sentrinspeci c proteases (SENPs) are involved in the maturation and deconjugation of SUMO proteins from their substrate proteins in human [72].Currently six different SENPs (SENP1, 2, 3, 5, 6, 7) have been identi ed in human [72].SENP1 has similar sequence homology to the SUMO protease ESD4 (early in short days4) that catalyzes deSUMOylation in A. thaliana [73,74].The cleavage of the gene encoding SENP1 by the newly identi ed miRNA most likely prevent deSUMOylation in Z. jujuba to help the plant tolerate stresses.
In plants, alpha-ketoglutarate-dependent dioxygenases (also known as 2-oxoglutarate-dependent dioxygenase) [75] are engaged in a variety of metabolic processes including the regulation of hormones such as gibberellic acids (GAs) and jasmonic acid [76,77].The GAs function in plants by counteracting the effects of abscisic acid which induces leaf senescence [78].During stress conditions plants tend to initiate senescence of leaves in order to preserve their energy and food reserves [79].As GA interferes with this stress response, the newly identi ed miRNA in this study may possibly suppress the translation of alpha ketoglutarate-dependent dioxygenases which subsequently prevents GA production thus aiding the plant to survive stress conditions.

Conclusions
The computational approach used in this study discovered one potential miRNA in Z. jujuba.This miRNA showed ve potential target proteins, most of which are involved in metabolic and developmental processes associated with plant growth and reproduction.Suppression or down regulation of these proteins by the miRNA identi ed in this study may possibly help Z. jujuba overcome stress conditions by various processes such as altering owering time and leaf senescence.A better understanding of what environmental variables trigger the production of this miRNA and how this miRNA affects plant phenology is important for future genetic manipulation of Z. jujuba crops to increase productivity under changing climates.Experimental con rmation of this miRNA and its protein targets is necessary.

Figure 1 Procedure
Figure 1

Figure 3 Comparison
Figure 3

Table 1
Details of miRNAs used for phylogenetic analysis

Table 2
Properties of newly identi ed miRNA in Z. jujuba

Table 3
Potential targets and functions of novel miRNA identi ed in Z. jujuba