High-throughput sequencing of lncRNA and small RNA libraries
In total, 1,241,588,130 and 168,839,937 raw reads were generated from the libraries of lncRNAs and small RNAs, respectively. After the removal of low–quality reads and trim adapters, we obtained approximately 1,175,852,744 and 165,272,934 clean reads from the lncRNA and small RNA libraries. Then, we mapped the clean reads for each sample to the maize genome (B73 RefGen_V4) and used them for further analysis (Table 1). For mRNAs, a total of 214,532 transcripts were reconstructed from all of the 12 RNA-seq datasets, and 8836 differentially expressed genes were identified in leaves and roots (Fig. S1a, b). For lncRNAs, we based our analysis on the results of transcript splicing and following the structural characteristics and noncoding proteins, setting up a series of strict screening conditions (Fig. 1a). A total of 6274 transcripts were applied to the analysis of differential expression (Fig. 1b). We obtained 894 reliably expressed lncRNAs in leaves and roots (Fig. 1c, d). For miRNAs, the length distribution (range 18–30 nt) of total small RNAs was generated from clean reads (Fig. S2a). The TPM of miRNA is shown in Fig. S2b. Then, the screened sRNAs were used to analyze the distribution on the reference sequence and identify known and novel miRNAs. A total of 184 known miRNAs and 106 novel miRNAs were found in leaves and roots (Table S1). Clusters were generated and analyzed for the differentially regulated mRNA miRNAs under HN and LN conditions are presented in Fig. S3.
Genome-wide identification of lncRNAs
In maize seedlings, the lncRNAs ranged in length from 201 base pairs (bp) to approximately 29,176 bp, with a mean length of 829 bp and distributed on each chromosome. The lncRNAs showed lower FPKM than the mRNAs (Fig. 1e). Of these lncRNAs, 93.4% were lincRNAs and 6.6% were antisense lncRNAs (Fig. 1f). Their coding potential was also predicted by PFAM and CPC (Fig. 1g). To identify the lncRNAs involved in responses to nitrogen stress, we selected lncRNAs with a q-value<0.05 between the HN and LN groups as being differentially expressed. Overall, 607 and 287 differential lncRNAs were genome-widely screened from leaf and root (Fig. 1c, d), respectively, of which 23 were consistently upregulated and 30 were consistently downregulated in both root and leaf (Fig. 1h). This suggested that these lncRNAs perform similar functions in different tissues in response to LN stress. The clustering analysis of different lncRNAs was applied to determine the cluster model under HN and LN conditions (Fig. 1i); it showed upregulated and downregulated lncRNAs in the two tissues. Besides, 477 differentially expressed transcripts of uncertain coding potential (TUCPs) potentially containing a subset of lncRNAs with certain coding potential were identified (Fig. S1c, d). The characteristics of identified lncRNAs are shown in Fig. S4. In these results, we obtained a majority of lincRNAs and antisense lncRNAs, and a low proportion of TUCPs, which indicated that the RNA-Seq strategy can be used for these RNAs.
The expressional level mRNA distribution from the 12 libraries is shown along 10 chromosomes (Fig. 2). To understand the potential interaction between lncRNAs and coding transcripts, we identified the adjacent genes within 100 kb either up and downstream from the lncRNAs (Table S2). We also used Pearson’s correlation coefficient to analyze the correlation of expression level between lncRNAs and genes; those with a correlation value greater than 0.95 were taken for analysis (Table S2). The target analysis of co-location and co-expression provides an approach to predict the main functions of lncRNAs.
Validation of lncRNA, lncRNA target, and miRNA expression using qRT-PCR
To confirm the reliability of the deep sequencing results, we randomly selected 17 mRNAs, 10 lncRNAs, and 8 miRNAs for qRT-PCR analysis. As shown in Fig. 3a–c, the qRT-PCR, and RNA-seq data showed the same tendency, indicating the reliability of the RNA-seq results.
Analysis of the co-expression network
A total of 38 differentially expressed miRNAs identified from 12 small RNA libraries (Fig. S1e, f) were utilized for analyses of their interactions with lncRNAs and mRNAs. And then, we constructed the co-expression network that lncRNA–gene pairs generally share the same miRNA binding sites [10]. As shown in Fig. 4a, b, 2 lncRNAs, 2 miRNAs, and 14 mRNAs were included in the two co-expression networks, and the lncRNAs served as ceRNAs to communicate with many mRNAs through competing with specific binding miRNAs. These results contributed to our understanding the potential functions of lncRNAs in maize seedling under LN stress, and revealed the mechanism of lncRNAs regulated gene expression at the whole transcriptome.
We then analyzed the dynamic responses of lncRNA–miRNA–mRNA expression levels to LN stress at different treatment stages. For the co-expression network in the leaf, LNC_002923, miR159c, and two mRNAs were selected for qRT-PCR. As shown in Fig. 3d, e, the expression profile of Zm00001d023434 was similar to LNC_002923, but opposite of miR159c. Furthermore, Zm00001d015521 in the lncRNA-associated co-expression network showed the opposite expression pattern to LNC_002923, but a similar one to miR159c at 14 days. In general, the level of Zm00001d023434 was upregulated under the nitrogen starvation of seedling leaves, except for at 24 h and 14 days. Zm00001d015521 was almost completely suppressed at most stages, except for 1 h, 12 h, 24 h, and 4 days. The level of miR159c was upregulated in leaf at most treatment stages, only showed downregulation at 1 h, 6 h, 24 h, and 6 days. The level of LNC_002923 was almost always downregulated in the short–term, but was upregulated in the long–term. For the co-expression network in the root, the expression of LNC_003272, miR164a, and two mRNAs is shown in Fig. 3f, g. The expression of LNC_003272 was hardly detected, only showing downregulation at 12 and 14 days. The levels of miR164a and mRNAs showed the opposite expression pattern at 14 days after LN stress. Overall, the expression profiles of nitrogen-responsive lncRNAs, miRNAs, and mRNAs match better at a later stage than other short–term treatment for the “ceRNA hypothesis”, suggesting that ceRNA mechanisms are vital for further studying lncRNA functions under LN condition in maize.
Targets analysis of LN-responsive lncRNAs
To detect the cis-acting lncRNAs function, we screened 100kb upstream and downstream of the 607 and 287 differentially expressed lncRNAs in leaf and root, respectively, and performed lncRNA-mRNA pairs expression correlation analysis. In total, 2561 and 1142 lncRNA-mRNA pairs for differentially expressed lncRNAs in leaf and root were found, respectively (Table S3). GO analysis predicted some differential genes in the following subcategories within the main category of the biological process: nitrogen compound metabolic process, response to oxidative stress, and chromatin remodeling. Besides, there were also enriched GO terms such as nitrogen compound transport, organ nitrogen compound biosynthetic process, nitrate metabolic process, and regulation of external response (Table S5).
On the other hand, to detect the effect of trans-acting lncRNAs on genes expression regulation in diverse biological processes. We screened the targets were predicted according to the expression correlation between lncRNAs and mRNAs (Pearson correlation > 0.95). 59,577 and 22,227 co-expression relationships of lncRNAs and LN-responsive genes were found in leaf and root, respectively (Table S4). GO categories and subcategories were analyzed, the results predicted that most of these genes were organic substance biosynthetic, alpha-amino acid metabolic, and photosynthetic membrane in leaf and root.
We next analyzed the KEGG pathway enrichment analysis with differentially expressed genes, and the top20 enriched pathways are presented in (Table S6). The results showed the trans-acting lncRNAs were enriched in several pathways associated with cysteine and methionine metabolism, cyanamino acid metabolism. Meanwhile, some enriched pathways associated with nitrogen metabolism, alanine, aspartate, and glutamate metabolism were identified. In cis-acting, the results of these genes function in the chlorophyll metabolism, biosynthesis of secondary metabolites, as well as carbon fixation in photosynthetic organisms. These enriched biological processes and pathways are related to glutamine family amino acid biosynthetic process and abiotic stress, indicating that the differentially expressed LN-responsive lncRNAs play important roles in the absorption and transportation of nitrogen during maize development.
LNC_002923 inhibits the cleavage of Zm00001d015521 by ZmmiR159
According to our previously predicted co-expression regulatory networks, experiments were conducted to verify whether lncRNA affected the mRNA-miRNA pair. The Zm00001d015521 gene is one of the predicted target genes of ZmmiR159c. We used ZmmiR159c and ZmmiR159c-mut to construct the transient expression assay in Nicotiana benthamiana, in which ZmmiR159c-mut was designed primers for six loci in ZmmiR159c mature sequence to introduce the target mutation. The expression levels of ZmmiR159c and ZmmiR159c-mut were detected by RT-qPCR (Fig. 5a, c). As shown in Fig. 5b, d, the expression level of Zm00001d015521 significantly decreased when co-expressed with ZmmiR159c, however, the expression level of Zm00001d015521 was not affected when co-expressed with ZmmiR159c-mut. These results indicate that Zm00001d015521 is one of the target genes of ZmmiR159c. Previous studies have found that lncRNA could bind to miRNA, thus relieving the inhibitory effect of miRNA on target genes. In this study, the binding site of ZmmiR159c was found in LNC_002923 sequence (Fig. 6a). We also introduced six mutations in the ZmmiR159c sequence of LNC_002923 (LNC_002923-mut). Then, we conducted transient expression assays in Nicotiana benthamiana to detect whether LNC_002923 could inhabit the cleavage of Zm00001d015521 by ZmmiR159c. The expression levels of ZmmiR159c,LNC_002923 and LNC_002923-mut were detected by RT-qPCR (Fig. 6b, c). As shown in Fig. 6d, the expression level of Zm00001d015521 was not affected when LNC_002923 was coexpressed with ZmmiR159c, indicating that LNC_002923 could effectively inhibit the cleavage of Zm00001d015521 by ZmmiR159c. However, the Zm00001d015521 expression level was significantly decreased when coexpressed with LNC_002923-mut. These results suggested that LNC_002923 could reduce the inhibitory effect of ZmmiR159c.
Phenotypic differences under HN and LN conditions
The descriptive statistics and broad heritability estimates (H2) of 17 maize seedling traits in HN and LN are presented in Table 2. The significance of genotypes and contrasting N concentrations on maize seedlings could be revealed by the analysis of variance results. Most traits showed significant differences between LN stress and the control, except the seminal root number (SRN). These findings indicated that collection of natural populations was sufficiently diverse for the subsequent association analysis. On average, shoot growth was limited under LN condition, while root growth was enhanced. The shoot length (SL) was greater under HN (36.278) than under LN (30.377) treatment. Moreover, root dry weight (RDW) and total root length (TRL) were higher under LN than under HN. Pearson’s correlation coefficients were calculated for the 17 collected traits under contrasting nitrogen levels listed in Table S8.
Candidate genes revealed by genome-wide association analysis
For the genome-wide identification of significant SNPs associated with traits, the software TASSEL 5.0 was selected to conduct GWAS with 46,108 high-quality SNPs (MAF>0.01) using a mixed linear model (MLM) and markers with a p-value of <1e−4.6 were considered for candidate gene analysis. A total of 23 significant markers were detected by low nitrogen tolerance index (LNTI) and three markers were detected under LN treatment. Among them, 12 SNPs were found to be associated with the crown root number (CRN). Seven and three significant associations corresponded to Forks and Tips, respectively, and three markers were associated with each of Crossings and ARD. Moreover, one SNP was found to be significantly associated with two root traits (Forks and Tips). According to the average linkage disequilibrium (calculated by TASSEL 5.0) decay distance across all 10 maize chromosomes within the mapping population, a total of 1474 candidate genes were screened near these significantly associated SNP markers. Three significant markers were associated with 232 genes under LN treatment, namely, PUT-163a-13126581-167, PZE-104092320, and PZE-109052750, and they were respectively associated with SRN, Crossings, and Tips and located on chromosomes 4, 6, and 9. Meanwhile, 23 significant SNPs associated with 1242 candidate genes were revealed by LNTI value. A list of all significant markers’ trait associations is presented in Table S7. According to the location of identified lncRNAs, a total of 36 and 134 differently expressed lncRNAs within the range of significant markers were screened under LN and LNTI, respectively (Table S7). These LN-responsive candidate genes and lncRNAs could play vital roles in regulating root development during the maize seedling stage.
Combining GWAS and expression profile to mine consistent candidate genes
Combined with GWAS and RNA-Seq, we found that among 232 candidate genes detected by GWAS under LN condition, 16 and 29 candidate genes respectively showed obvious downregulation and upregulation patterns, and 7 candidate genes were expressed in roots and leaves (Fig. S6a and Table S7). At the same time, among 1013 candidate genes detected by GWAS under LNTI, 107 and 125 candidate genes showed downregulation and upregulation respectively, among which 38 candidate genes were expressed in roots and leaves (Fig. S6b and Table S7). Besides, integrating the results of multiple RNA-seq and GWAS, a total of 10 significant SNPs (P <1e−4.6) were associated with 34 candidate genes identified by using LNTI of root traits, and one significant SNP associated with six candidate genes was identified under LN stress (Table 3). The expression levels of candidate genes in which GWAS-identified SNPs located were evaluated in 255 maize lines through RNA-Seq of root at the maize seedling stage under LN stress. This includes the discovery of two genes in the integration of meta-analysis and large-scale gene expression profile data of maize under LN stress [21]. For example, Zm00001d051804 plays an important role in glutamine biosynthesis in response to exogenous nitrogen during seed germination in maize and Arabidopsis [22]. In addition, Zm00001d048998 is a chlorophyll A-B binding protein, which is involved in the absorption and utilization of nitrogen nutrients in maize [23]. For other genes, Zm00001d051666 and Zm00001d049380 were found to be associated with nitrogen acquisition in maize roots [24]. Finally, six candidate genes were found in RNA-seq of maize root in response to nitrogen stress [25], and 29 genes were consistently detected upon comparing with research involving GWAS of carbon and nitrogen metabolism in maize [26], respectively. The Manhattan plots for Forks and fold change values from RNA-Seq are shown in Fig. 7a, b. These consistent candidate genes explored in our research could contribute strongly to deficient nitrogen tolerance.
To investigate the regulatory roles of lncRNAs in maize seedlings under LN stress, using the predictive results of lncRNA targets, potential lncRNAs that could influence 40 consistently LN-response candidate genes in a cis or trans manner were identified. As shown in Fig. 4c, a total of 354 lncRNAs were found. Furthermore, these targets of lncRNAs contained significant SNPs for multiple root traits (Table S9), and lncRNA targets were found to be significantly associated with nitrogen metabolism and the abiotic stress response pathway. Of these lncRNAs, we found LNC_002984, LNC_002985, and LNC_002986 located upstream of Zm00001d051804 that could potentially regulate the expression levels of neighboring genes in a cis manner. Zm00001d051804 was reported to be related to the response to low nitrogen stress [21]. Moreover, the homologous gene in Arabidopsis was shown to play an important redundant role in ammonium assimilation under ammonium-deficient conditions and in facilitating nitrogen remobilization in root [27, 28]. This implies these lncRNAs could play an important role in root development, especially in nitrogen absorption, transfer, and assimilation in plants. The present analysis provides new information for understanding lncRNAs as important regulators in growth and development associated with LN stress.