Integrative analysis of microRNA and mRNA expression in response to salt stress by UMI

Background Small RNAs play an important role in regulating plant responses to abiotic stress through modulation of the processing, stability, and translation of larger RNAs. Plants employ complex mechanisms of gene regulation to respond to salinity stress. Results In this study, we constructed 12 small RNA libraries, 12 mRNA libraries, and a degradome library to systematically investigate the miRNAs response to salt stress. A total of 312 cotton miRNAs were identi ﬁ ed and of these, 80 were known ghr-miRNAs. 144 genes showed signicant differential expression under salt stress combined with the targeting relationship between the sRNA data and mRNA data. We found 56 miRNA-mRNA pairs which were positively correlated, and 91 pairs which were negatively correlated. Using degradome sequencing, 72 target genes were identied associated with 25 miRNA families. Gene ontology and KEGG analysis indicated some targets were involved in vital biological pathways of salinity stress tolerance: 14 were involved in responses to abiotic stress, two were associated with an environmental adaptation pathway, and a total of 16 NAC and MYB family transcription factors were related to salinity stress. Conclusions The present study identied a large number of cotton miRNAs and identied relationships between miRNA and mRNA, with function annotation revealing their possible biological roles in response to salt stress. Our ndings will further functional studies of cotton miRNAs and the salt tolerance mechanism.

diverse physiological, biological and molecular processes such as growth, development, signal transduction, and tolerance to various biotic and abiotic stresses [14,15].
RNAseq and PCR are important tools for biological research and detecting the expression level of miRNAs. Two challenges exist for these technical methods: one is the small copy numbers that limit detection, and the other is the ampli cation bias that reduces quantitative accuracy [16,17]. Unique molecular identi ers (UMIs) are highly sensitive, accurate, and reproducible and can directly count molecules to correct ampli cation bias. The UMI method begins with a reverse-transcription reaction using a primer designed with an anchored polyT and a unique barcode. The number of samples have unique barcodes and are essentially unlimited. The quantitative precision of this method is thus improved, especially at low molecule counts [18].
In the study, to better understand the molecular basis of salt stress responses in cotton, we combined small RNA (sRNA) sequencing with transcriptome and degradome sequencing to identify conserved and novel miRNAs. We analyzed their targets and expression levels in twelve libraries. We established relationships between miRNA and mRNA to further investigate the regulatory mechanisms. We found that ghr-miR156a/b, ghr-miR7502, ghr-miR7459a/b, ghr-miR172 and ghr-miR399c have high correlation with Ghir_A02G007740, Ghir_A02G015000, Ghir_A03G019050, Ghir_A12G010210 and Ghir_D11G003180, respectively. These targets are closely linked to stress and ghr-miR156a/b, ghr-miR172, and ghr-miR399c in particular have been reported to play key roles in response to stress. This study illuminates the siRNAs and putative targets involved in salt tolerance and facilitates the molecular breeding of cotton.

Results
Phenotypic response of E911 to salt treatment Cotton seedlings grown under a 300mM NaCl solution treatment provided a phenotype of salt stress: the leaves turned yellow and wilted from 0h to 3h, then were partially restored at 6h ( Fig. 2A). Furthermore, the conductivity was signi cantly increased at three time points in leaves under the 300 mM NaCl treatment as compared to the control and became stable after 3h (Fig. 2B). Fresh weight decreased gradually, and nally stabilized, while plant water contents increased gradually (Fig. 2C). Therefore, miRNA-seq and RNA-seq was used to pro le the gene expression at each time point and in different tissues to understand the salt stress response in cotton.
Deep sequencing of small RNA libraries A total of 1,107,343,118 reads were generated from the twelve cotton small RNA libraries generated from the salinity treatment as well as the control, generating 80.37% clean reads including miRNA, rRNA, snRNA, snoRNA, tRNA, and degraded fragments of mRNA introns or exons and several other unannotated reads, ranging from 18nt to 44nt in length. The sequences of 18-28nt in length, which accounted for 70.31% of all the clean reads, were extracted to analyze small RNA length distribution (Fig. 3). The size of the small RNAs was not evenly distributed in each library, and a high abundance of 21-24nt sequences was found, accounting for 53.97% of the total reads, thus representing the typical length of plant mature miRNAs. The number of 24nt sequences was signi cantly greater than that of other sequences, followed by 21nt.

Identi cation of conserved and novel miRNAs
All libraries displayed similar distributions of other RNA families in the CK, 1h, 3h, and 6h salt-treatment samples, including rRNA (~1.83%, ~2.52%, ~3.33% and ~3.10% for the unique reads), snRNA (~0.03%, 0.07%, ~0.07% and ~0.04% for the unique reads), snoRNA (~0.03%, ~0.05%, ~0.06%, and ~0.05% for the unique reads), and tRNA (~0.31%, ~0.40%, ~0.52% and ~0.42% for the unique reads) ( Table 1). sRNAs that differed from known miRNAs by no more than two mismatches were de ned as conserved miRNAs in the central miRNA Registry Database, miRBase. The miRNA genes primarily originate from independent transcriptional units, so RNAseq was performed to identify primary transcripts of novel miRNAs. We obtained 2490.2M clean reads and 2090.3M mapped genome. A total of 422.2M unigenes were identi ed. The results of RNAseq were used for mapping reads and prediction of the hairpin structure of precursors of miRNAs. As a result, a total of 232 miRNA sequences were predicted in all libraries ( Additional File 1: Table S 1). We detected 80 ghr-miRNAs belonging to 53 miRNA families, including 31 miRNA families evolutionarily conserved with more than two members, and 22 nonconserved miRNAs with only one member. Most of the conserved miRNA families had two or more members, while most of the non-conserved miRNA families contained only one member.
Identi cation of the targets via online software and degradome analysis Plant miRNAs usually have perfect or near-perfect complementarity with their targets allowing the identi cation of targets using TargetFinder and psRobot14 [19,20] with position-dependent scoring systems to predict miRNA targets. These analyses identi ed 1662 putative targets (Additional File 2: Figure S1; Additional File 3: Table S2). Further, using degradome sequencing, a total of 28.31M sequence reads were generated. After consecutive steps of ltering, 27.55M reads were obtained which were processed for identi cation of cleavage sites. Through degradome analysis, a total of 72 targets were identi ed for miRNAs (Additional File 4: Table S3 ). The maximum number of targets was obtained for members of the miR164 family, followed by miR172 and miR393.

Differential transcript levels of miRNA in response to salt stress
The abundances of the conserved miRNAs in the twelve libraries ranged from 0 to 470,902, indicating that the expression of the miRNAs varied greatly. To distinguish miRNAs responsive to salt stress, the identi ed miRNAs were studied for their expression patterns across all libraries. The majority of the identi ed miRNAs were expressed in more than one sample (Additional File 5: Table S4 ). Cluster analysis revealed four clusters constructed based on the expression patterns of 56 miRNAs in response to different stress regimes across root, leaf, and stem tissues (Fig. 4). Cluster I contains miRNAs with a high expression pattern; Cluster II shows a low expression pattern. Cluster III contains primarily miRNAs with lower expression in roots; Cluster IV contains many miRNAs with higher expression in roots. 144 miRNAs showed signi cant differential expression patterns across different combinations. The majority of miRNAs (down-regulated 284 miRNAs VS up-regulated 160 miRNAs) showed a trend of down-regulation under stress conditions, especially in root. (Fig. 5). Hierarchical clustering analysis of differentially expressed miRNAs was conducted using the heatmap function. First, the intersection analysis found 155 miRNAs differentially expressed between the different samples and union DEGs for nine groups.
Hierarchical clustering of the intersection showed two groups, and union DEGs clustered into three groups (Fig. 6). Relative expression level of miRNAs in response to salt stress was analyzed at different time points in different tissues. Among these, miR68 and miR119 exhibited the highest down-and upregulated changes after salt stress, respectively (Additional File 5: Table S4 ). There are unique regulation modes for abiotic stresses, and roots, stems and leaves respond to salt stress differently.
Association analysis of mRNA and small RNA expression data The expression of stress responsive miRNAs (from small RNA-seq) and their target genes (from RNA-seq) were integrated to infer the mediatory role of miRNAs during stress conditions. Correlation analysis of miRNA and their target mRNA expression pro les using Pearson's correlation coe cient (r>0.6) identi ed a total of 147 miRNA-mRNA interaction pairs across all combinations under different stress conditions. These correlation pairs comprised 126 genes and 38 miRNAs (Additional File 6: Table S5 ). 56 pairs were positive correlations and 91 were negative correlation. Multiple potential target genes of the conserved miRNA ghr_miR7504a/b and novel miRNAs novel_mir83 were listed (Additional File 7: Figure S2). These results indicated that a single miRNA has the capability to cleave multiple targets.
GO and KEGG pathway analysis GO-based analysis allows the determination of which GO terms (biological process, molecular function, and cellular component) a gene belongs to [21]. Therefore, GO-based analysis can provide more insight into understanding miRNA function. We conducted GO enrichment analysis on the signi cantly up-and down-regulated genes detected by pairwise comparisons in different tissues. A total of 735 DEGs of the targets were classi ed into 464 molecular functions, 754 biological processes, and 567 cellular components (Additional File 8: Figure S3). Twenty-six genes belonged to the biological process of response to stimulus which was most likely associated with resistance, and further analysis showed that Ghir_A02G007740, Ghir_A03G019050, Ghir_A08G008470, Ghir_A09G017710, Ghir_A09G023410, Ghir_A10G002160, Ghir_A11G003190, Ghir_A12G010210, Ghir_D06G002780, Ghir_D08G008560, Ghir_D10G002930, Ghir_D11G003180, Ghir_D11G034240 and Ghir_D12G011240 responded to abiotic stress. Fifty-six genes were associated with the biological process of biological regulation, including Ghir_A02G013350, Ghir_A02G015000, Ghir_A07G005320, Ghir_A08G004810, Ghir_A08G007740, Ghir_A09G000800, Ghir_D10G002930, Ghir_D12G011240, Ghir_D06G020660, Ghir_D08G004960, Ghir_A12G014750 and Ghir_A10G002160, which responded to stimulation. KEGG analysis of a set of DEGs identi ed 735 targets from 20 pathways (Additional File 9: Figure S4). The genes of the signal transduction and environmental adaptation pathways may be related to salt stress. Of these, Ghir_A04G000780, Ghir_D03G001070, Ghir_D05G038710, Ghir_D05G039730, Ghir_D11G027430, Ghir_A04G000770, Ghir_A11G027330, Ghir_D05G038660, Ghir_D05G038670, Ghir_D05G038680, Ghir_D05G038730, Ghir_D05G038740 and Ghir_D11G030710 (ko04626//Plant-pathogen interaction, ko04016 //MAPK signaling pathway -plant) can respond to environmental stress. In addition, the signi cantly enriched pathways include that of plant hormone signal transduction, and those genes may be involved in regulation of stress by plant hormones.

Discussion
As an important drought-tolerant crop, cotton provides an ideal system to study drought tolerance. Under environmental stress, some miRNAs are synthesized in plants [22,23]. For example, IAA-Ala Resistant 3 (IAR3) is a new target of miR167a, which is required for drought tolerance [24]. The expression level of osa-MIR393 changed under salinity stress, and target genes of osa-MIR393 were shown to be responsive to abiotic stress. Transgenic rice and Arabidopsis thaliana that over-expressed osa-MIR393 were more sensitive to salt [25]. These plants constitutively over-expressing osaMIR396c showed reduced salt and stress tolerance [26]. miR399f participates in plant responses to abiotic stresses including salt and drought [27]. miR417 plays a role as a negative regulator of seed germination in Arabidopsis under salt stress conditions [28]. Increasing evidence indicates that miRNAs play important roles in plant response to drought. Many stress related mRNAs are reported targets of known miRNAs: the NAC, MYB, and MAPK families are among the most important in the context of drought and salinity, indicating their roles in plant response to drought and salinity stress [29,30]. According to target predictions, a series of cotton miRNAs are associated with these top-ranked genes, including miR164, miR172, miR167, and miR396 [31,32]. Degradome sequencing data con rmed those stress-related miRNA targets (Fig. 6), supporting the data obtained using the UMI method.
RNAseq is a powerful tool for transcriptome analysis in tissues. However, losses in cDNA synthesis and bias in cDNA ampli cation lead to severe quantitative errors. To correct for ampli cation bias, UMIs is used for direct molecular counting. This method corrects for PCR errors and provides an absolute scale of measurement with a de ned zero level. In contrast, standard RNA-seq uses relative measures such as reads per kilobase per million reads (RPKM), which mask differences in total mRNA content. In this study, we counted molecules by the total number of distinct UMIs to quantitatively assess miRNA-seq, and provide the accurate expression patterns of the miRNAs.
In this study, miR164, miR167, miR399, miR7052 and novel miRNA mir45, mir46, mir223, mir227, mir1, mir68 and mir86 had signi cant differential expression. Compared with the 0h samples, mir1, mir68 and mir86 all showed down-regulation at 1h, 3h and 6h, and mir45, mir46, mir223 and mir227 showed upregulation at 1h, 3h and 6h. Similar to the results of previous studies, we found that miR164, miR167 and miR399 were related to salt stress [ 23,27,28]. Therefore, we hypothesized that mir45, mir46, mir223, mir227, mir1, mir68 and mir86 may also be related to salt stress. In future studies, we will verify these presumed results. In addition, there are a number of unannotated small RNA sequences in the small RNA data which are likely to include functional small RNAs such as siRNA with functions yet to be revealed. Therefore, continued efforts are needed to identify the complete set of miRNAs and other small RNAs from cotton.

Conclusion
In this study, miRNA and mRNA expression in response to salt stress was described. Our results indicate that miRNA-mRNA pairs participate in gene expression regulation and promote gene regulation in response to salt stress. The discovery of these miRNAs and their function allows better understanding of salt stress mechanisms in cotton and other drought-resistant crops.

Plant materials and stress treatment
The salt-tolerant cotton cultivar 'E911' (E911 was provided by the Institute of Cotton Research of Chinese Academy of Agricultural Sciences) was used in this study to test miRNA-target response to salt stress. Seeds were sterilized and germinated on 1/2 MS medium under a 16h light/8h dark cycle at 28℃ until two cotyledons unfolded, then healthy seedlings were placed in pots containing aerated Hoagland nutrient solution. Growth conditions were 28/20℃ day/night temperature, 55-70% relative humidity, and a 14/10h light/dark cycle. At the three-true leaf stage, seedlings showing normal growth were randomly divided into two groups: one was treated with 300mM salt stress, and experienced salt shock for 1 h, 3 h and 6 h. The remaining seedlings served as the control. In all experiments, 50 seedlings were used. After exposing the seedlings to salt stress, we measured relative electric conductivity and plant water content [33]. Then, leaves, stems and roots were harvested, immediately immersed in liquid nitrogen, and stored at -80°C.

Construction of small RNA libraries
Before small RNA and mRNA library construction, total RNA was extracted from samples using TRIzol reagent (Invitrogen, CA, USA) according to the manufacturer's instructions and was then tested using the Agilent 2100 bioanalyzer system to guarantee RNA quality. All experiments were performed with three biological replicates, resulting in 36 small RNA and RNA-seq libraries for sequencing. The general process used for constructing the small RNA library is as follows: small RNA 18 to 30nt in length were isolated on a 15% polyacrylamide gel and the 5-adenylated and 3-blocked adaptor were ligated to the 3' end of the small RNA fragment. Unique molecular identi ers (UMI) [34] labeled Primer (8-10nt sequences) were added and 5' end adaptor was ligated, and the unligated adaptor was digested. Puri ed RNAs were reverse-transcribed to cDNA with UMI labeled primer and the library was validated (Fig. 1). The samples were sequenced on an Illumina Genome Analyzer at the Beijing Genomics Institute (BGI) in Wuhan.

Identi cation of conserved and novel miRNAs
The raw reads from the small RNA libraries were rst ltered to remove low-quality reads and the impurities of raw data (5' primer contaminants, no-insert tags, oversized insertion tags, low quality tags, poly A tags and small tags, and/or tags without 3' primer) to obtain clean reads. The clean sequences were used to search GenBank and the Rfam database [35] to annotate rRNA, tRNA, snRNA and snoRNA. After removing sequences of rRNAs, tRNAs, snRNAs and snoRNAs, the remaining sequences were used in a blast search against miRBase18 (http://www.mirbase.org/) to identify conserved miRNA. Only the sequences that contained fewer than two mismatches with known miRNAs in miRBase were considered as conserved miRNAs.
Degradome sequencing of 5' RACE libraries and data analysis Degradome sequencing provides a method for prediction of target genes. The libraries were constructed as previously described (German et al., 2008). Twenty micrograms of control and salt-treated RNA samples were used for degradome sequencing. Brie y, a 5'RNA adapter was ligated to the cleavage products, which possessed a free phosphate at the 5'end. The puri ed ligated products were reversetranscribed to cDNA. After ampli cation, the PCR products were digested using the enzyme MmeI and ligated to the 3'adapter. The ligation products were ampli ed and sequenced on an Illumina Genome Analyzer and produces 49nt raw reads. After ltering, the clean data were classified by alignment to the database, and ncRNAs were removed. Finally, the miRNA-mRNA pairs were identified and mapped to the reference genes and the true miRNA cleavage site.

GO and KEGG analysis
Bioinformatics analysis uncovers the miRNA-gene regulatory network on the basis of biological process and molecular function. The functional enrichment analysis was applied to assign gene ontology (GO) annotations (http://www.geneontology.org/), exploring the candidate genes associated with miRNAs which may play important biological functions in cotton. Similarly, the pathway annotations were assigned based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) database (http://www.genome.jp/kegg/). Gene numbers for every GO term and pathway and hyper geometric distribution were used to detect signi cantly enriched GO terms and pathways with a P-value threshold of ≤ 0.05.

Differential expression analysis
UMI was connected to cDNA molecules by marking each molecule in the original sample at the early stage of library construction. It was used to reduce the quantitative bias introduced by PCR ampli cation and is conducive to obtaining su cient readings for testing. We calculated the species numbers for accurate quantitative comparison of sRNAs. The expression levels of known miRNAs in each library were calculated and the miRNA expression levels between different samples were compared.
DEGseq [36] is novel method based on the MA-plot, which is a statistical analysis tool widely used to detect and visualize intensity-dependent ratio of microarray data [37]. To improve accuracy of the DEG results, we de ned a gene as a DEG (differentially expressed gene) when reads number fold change was ≥2 and Q-value was ≤ 0.001.

Association analysis of miRNAs and mRNAs
Pearson correlation coe cients (r values) between miRNA and mRNA expression ratios were constructed and calculated, respectively, using Graph Pad Prism version 6 with two-tailed t-tests. A P-value >0.05 was considered statistically signi cant. We screened the differentially expressed miRNAs and their target genes. Based on the difference between processed and control samples, the interaction between the genomic data and the gene targeting relationship was shown with VisNetwork.

Statistical Analysis
The statistical signi cance of expression pro les of miRNAs and genes was compared with one-way analysis of variance (ANOVA), followed by a Duncan's multiple range test. All data were analyzed by SPSS10.0 (SPSS Inc., Chicago, IL, USA) software. P-values > 0.05 were considered to be statistically signi cant.

Declarations
Funding This work was supported by funding from the National Natural Science Foundation of China (grants 31621005 and 31701476 Authors'contributions JZ, YD and XYG designed the experiments and drafted the manuscript. LSL, QDY and XW prepared samples for small RNA sequencing. JZ, CL, XH, LYH, MS, YPD, YC and PW performed the high-throughput sequencing data analysis. YD and YZ contributed in the design and discussion of the work, and assisted in drafting the manuscript. All authors read and approved the nal manuscript.
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.  Length distribution of the small RNAs in the twelve libraries. Y-axis represents percentages of miRNAs identi ed in this study; X-axis represents the length of miRNAs. A: leaf; B: root; C: stem. Blue, yellow, red, and green lines represent 0h, 1h, 3h and 6h, respectively.

Figure 5
Differentially expressed miRNAs. X-axis represents the DESs pair, Y-axis represents the number of screened DESs, with green indicating down-regulation and red indicating up-regulation.