High-throughput sequencing and data analysis of small RNAs
In our previously study [32], we reported that maize inbred line CM1 exhibits high tolerance to high temperature stress. To identify miRNAs involved in the response to heat stress, six small RNA libraries were constructed from the leaves of maize inbred line CM1 under heat treatment (MH1, MH2 and MH3) and normal conditions (MC1, MC2 and MC3). These libraries were sequenced using an Illumina Hiseq2500 platform, and the generated data were shown in Table 1. In total, an average of 17,568,632 and 21,192,599 raw reads of the three biological replicates were obtained from MC and MH libraries, respectively. After filtering low quality reads, adapter sequences, and junk reads, 5,522,709 − 19,120,848 valid reads were remained from each sample with 18–25 nt in length, and subsequently used to identify known and novel miRNAs. The length distribution of the small RNAs (sRNAs) was similar between MC and MH libraries (Fig. 1a). We found that the majority of length distribution of the sRNAs was 21 and 22 nt. In addition, the 24 nt sRNAs were also accounted for a large percentage. We noted that the percentage of 18–21 nt sRNAs in MH library was higher than that in MC library, while the opposite result was found for the range of 22–25 nt sRNAs.
Table 1
Statistics of small RNA sequences from the six libraries.
Type
|
MC1
|
MC2
|
MC3
|
MH1
|
MH2
|
MH3
|
Total
|
%
|
Total
|
%
|
Total
|
%
|
Total
|
%
|
Total
|
%
|
Total
|
%
|
Raw reads
|
20,636,944
|
100.00
|
13,188,120
|
100.00
|
18,880,831
|
100.00
|
13,303,420
|
100.00
|
11,923,314
|
100.00
|
38,351,063
|
100.00
|
Valid reads
|
9,315,467
|
45.14
|
5,522,709
|
41.88
|
7,125,136
|
37.74
|
7,693,944
|
57.83
|
6,893,172
|
57.81
|
19,120,848
|
49.86
|
3ADT&length filter
|
5,507,005
|
26.69
|
2,657,816
|
20.15
|
4,303,780
|
22.79
|
3,314,282
|
24.91
|
2,675,581
|
22.44
|
12,435,168
|
32.42
|
Junk reads
|
29,539
|
0.14
|
29,112
|
0.22
|
34,396
|
0.18
|
3,4715
|
0.26
|
24,114
|
0.20
|
62,523
|
0.16
|
Rfam
|
2,521,197
|
12.22
|
1,866,557
|
14.15
|
2,249,898
|
11.92
|
1,202,084
|
9.04
|
1,449,839
|
12.16
|
4,165,178
|
10.86
|
mRNA
|
3,746,950
|
18.16
|
3,647,915
|
27.66
|
5,856,919
|
31.02
|
1,223,256
|
9.20
|
1,090,629
|
9.15
|
3,462,254
|
9.03
|
Repeats
|
20,030
|
0.10
|
15,058
|
0.11
|
17,083
|
0.09
|
22,941
|
0.17
|
16,194
|
0.14
|
52,363
|
0.14
|
rRNA
|
1,840,815
|
8.92
|
1,044,648
|
7.92
|
1,184,966
|
6.28
|
1,009,375
|
7.59
|
1,247,375
|
10.46
|
3,155,023
|
8.23
|
tRNA
|
569,939
|
2.76
|
733,926
|
5.57
|
969,077
|
5.13
|
130,267
|
0.98
|
135,382
|
1.14
|
824,895
|
2.15
|
snoRNA
|
9,604
|
0.05
|
13,219
|
0.10
|
16,023
|
0.08
|
4,435
|
0.03
|
5,842
|
0.05
|
26,990
|
0.07
|
snRNA
|
11,500
|
0.06
|
18,398
|
0.14
|
21,375
|
0.11
|
2,904
|
0.02
|
2,510
|
0.02
|
9,273
|
0.02
|
Other Rfam RNA
|
89,339
|
0.43
|
56,366
|
0.43
|
58,457
|
0.31
|
55,103
|
0.41
|
58,730
|
0.49
|
148,997
|
0.39
|
Identification of known and novel miRNAs
To identify known miRNAs, the valid reads were used to compare with known plant miRNA precursors or mature miRNAs by alignment against miRBase (v22.0). Totally, 215 known miRNAs were identified from the MC and MH libraries (Table S1). The length size of the known miRNAs was ranged from 18 to 25 nt, and we found that the 21 nt miRNAs were the most abundant, accounting for 55.81% followed by the 22 nt miRNAs (18.60%) (Fig 1b). These known miRNAs were divided into 40 different miRNA families, and the number of each miRNA family has a large range. The miR166 and miR169 families contained the largest number with 22 members, followed by miR156 and miR171 families, which have 21 and 20 miRNAs, respectively. Among the 40 miRNA families, 14 families contained only one member, such as miR11969, miR395 and miR6199 families (Fig. 2). After identification of known miRNAs from the valid data, the remaining unmapped sequences were further aligned to the maize genome to screen potential novel miRNAs, and the RNAfold software was used to predict the secondary structures of the sequences to formulate a hairpin structure or not. Finally, a total of 125 members were identified as the novel miRNAs. The length distribution of the novel miRNAs ranges from 18 to 24 nt, and the 24 nt miRNA was most abundant (64.00%), followed by the 21 nt miRNAs (20.80%) (Table S2).
DEMs in response to heat stress and qRT-PCR validation
Expression patterns of the identified miRNAs were analyzed based on the normalized method reported previously [33]. Pearson correlation analysis was performed to evaluate the correlation among the biological replicates for MC and MH samples. As shown in Fig. 3a, the correlation coefficient (R) of the three biological replicates was greater than 0.913 and 0.945 for MC and MH libraries, respectively, suggesting high reproducibility of the biological replicates of the high-throughput sequencing. DEMs from the known and novel miRNAs were screened followed the differential expression analysis criteria, and 35 miRNAs were shown significantly different expression between MC and MH samples, including 26 known miRNAs and 9 novel miRNAs. Compared with the expression in MC library, 21 miRNAs were up-regulated and 14 members were down-regulated (Fig. 3b and 3c). We noted that the 35 DEMs belong to different miRNA families, and some DEMs were shown with large fold change, such as zma-miR160f-5p_R-1, mtr-miR171c and zma-miR168b-3p_R+1 (Table S3). In addition, the same family miRNAs were shown with same expression patterns, for example, each of the five miR164 family members shown up-regulated, while down-regulated expression were detected for the three miR168 family members. To validate the expression patterns of DEMs, 10 DEMs were selected to validate the high-throughput sequencing results by quantitative real-time polymerase chain reaction (qRT-PCR). As shown in Fig. 4, the expression patterns of the 10 selected miRNAs were similar with the sequencing data, suggesting the reliability of sequencing data and can be used for further analysis.
Identification of miRNA targets by degradome sequencing
The target genes of miRNAs were predicted using degradome sequencing technology. We used the CleaveLand and TargetFinder software to detect potentially sliced targets of miRNAs in the MC and MH libraries. Totally, 21136209 and 11630262 raw reads were obtained for the MC and MH libraries, respectively. And a total of 9395171 (44.45%) and 4861092 (41.80%) of the reads were mapped to the maize transcript in MC and MH libraries, respectively (Table S4). The cleavage products were divided into five categories (0-4) according to the relative abundance of the signatures at the target cleavage sites (Fig. 5). In the MC library, a total of 1563 cleavage events were identified, which 683 (43.70%), 72 (4.61%), 437 (27.96%), 29 (1.86%) and 342 (21.88%) were grouped into category 0, 1, 2, 3, and 4, respectively. In the MH library, a total of 1386 cleavage events were identified, with 596 (43.00%), 97 (6.70%), 459 (33.12%), 9 (0.65%) and 225 (16.23%) cleavage events in category 0, 1, 2, 3, and 4, respectively (Fig. 6).
Of these cleavage events, a total of 174 (527 transcripts) target genes were predicted to be cleaved by 115 miRNAs (104 known miRNAs and 11 novel miRNAs). We found that many of the target genes encode different families of transcription factors, such as auxin response factor (ARF), homeobox-leucine zipper protein (HD-Zip), squamosa-promoter binding protein (SBP), NAC domain-containing protein and GRAS protein family (Table S5). Many miRNAs have multiple target genes, and the same family miRNA always have the same type of target genes. For example, the targets of zma-miR160 family encode ARF transcription factors; the targets of zma-miR171 family encode GRAS transcription factors, which might suggest the conserved regulation mechanism of miRNAs. Of the 35 DEMs, degradome sequencing showed that only 17 members had target genes, including 16 known miRNAs and 1 novel miRNAs (Table S3).
GO and KEGG enrichment analysis of miRNA targets
To better understand the potential biological functions of the 174 targets identified by degradome sequencing, GO and KEGG enrichment analysis were performed for these target genes. GO enrichment analysis includes three ontologies of molecular function (MF), cellular component (CC), and biological process (BP) on the basis of the ontological descriptions of GO terms. The 30 most significantly GO terms were shown in Fig. 7. In the molecular function category, the most abundant of targets were enriched in GO terms of DNA binding (GO:0003677), sequence-specific DNA binding transcription factor activity (GO:0003700), and protein dimerization activity (GO:0046983). In the cellular component category, the most abundant of targets were enriched in nucleus (GO:0005634) and nucleolus (GO:0005730). In the biological process category, GO terms with most genes were regulation of transcription, DNA-templated (GO:0006355) and auxin-activated signaling pathway (GO:0009734). Particularly, we also found some genes were enriched in GO term response to hormone (GO:0009725), heat acclimation (GO:0010286), response to hydrogen peroxide (GO:0042542) and etc., which might have potential roles in response to heat stress.
KEGG analysis indicated that a total of 42 pathways were enriched for the 174 target genes, and 30 pathways were shown in Fig. 8 based on the significance of qvalue. Among the 30 pathways, 4 of them were significantly enriched pathways (qvalue<0.05), including ubiquinone and other terpenoid-quinone biosynthesis (ko00130), selenocompound metabolism (ko00450), monobactam biosynthesis (ko00261), sulfur metabolism (ko00920). Of these enriched pathways, the metabolic pathways (ko01100) pathway was the most enriched pathway with 18 genes, followed by biosynthesis of secondary metabolites (ko01110) and ribosome (ko03010) pathways, which contain 10 and 7 members, respectively. In addition, 4 and 3 members were enriched in plant hormone signal transduction (ko04075) pathway and spliceosome (ko03040), and one gene was enriched in MAPK signaling pathway-plant (ko04016) pathway.
Regulatory network of heat-responsive miRNAs in maize
As mentioned above, one miRNA can function on several different target genes, and one gene also can be targeted by one or more miRNAs. To explore the potential regulatory mechanisms of miRNAs under heat stress in maize, regulatory network was constructed using Cytoscape software based on the relationships of miRNAs, target genes and the enriched GO terms and KEGG pathways. As shown in Fig. 9, 11 up-regulated DEMs and 5 down-regulated DEMs were shown to involve in the network, respectively. It can be seen that the target genes were mainly enriched in several GO terms, such as regulation of transcription, DNA-templated (GO:0006355), nucleus (GO:0005634), DNA binding (GO:0003677), and the mainly enriched pathways were metabolic pathways (ko01100), biosynthesis of secondary metabolites (ko01110), and ribosome (ko03010). The relationships of miRNA, targets and the enriched GO terms and pathways were clearly presented in the network. Thus, the regulatory network provided an important foundation for further identification of important miRNAs and target genes in responsive to heat stress.