ssRNA-seq and transcript assembly
To examine lncNAT expression patterns in wild rice under drought stress treatment, one cultivated rice accession (Nip) and two O. nivara accessions (BJ89 and BJ278) were grown under control and drought conditions (Fig. 1a–c). Leaves of BJ89 and BJ278 showed lower water loss than those of Nip (Fig. 1d). Furthermore, the two O. nivara accessions exhibited a higher survival rate than Nip after 25 days of drought stress treatment (Fig. 1e). These physiological data suggest that O. nivara accessions BJ89 and BJ278 are more drought tolerant than Nip at the seedling stage.
Next, we conducted ssRNA-seq analysis of these three accessions treated with or without 25% polyethylene glycol (PEG-4000; w/v) for 10 days. A total of 18 strand-specific cDNA libraries were constructed from leaf tissues, with three replicates per accession in both control and drought stress treatments. In total, 445.5 million paired-end reads (2 × 125 bp) were generated by ssRNA-seq using Illumina HiSeq 2500, of which 373.6 million reads (83.8%) mapped perfectly on to the Nip reference genome (Table S1). Pearson correlation coefficients of the three biological replicates of each accession were greater than 0.9, indicating the high reproducibility of our ssRNA-seq data (Figure S1). Among the 373.6 million paired-end reads, we identified 62,201 transcripts, including 17,583 novel transcripts (8,905 known gene loci and 2,704 new gene loci) with hisat2  and stringtie  (Fig. 2a). We also conducted principal component analysis (PCA) of gene expression data. The results showed that PC2 clearly distinguished between the control and drought treated samples, while PC3 separated the different accessions (Figure S2).
Identification of lncRNAs and NAT pairs
To identify lncRNAs, novel transcripts larger than 200 nt were mapped against the Rfam 13.0 database to exclude micro RNAs (miRNAs), ribosomal RNAs (rRNAs), and other small noncoding RNAs . Then, any transcripts with a coding potential, according to Coding Potential Calculator (CPC)  and Pfam with HAMMER scan , were filtered out (Fig. 2a). Finally, a total of 1,246 lncRNAs were identified, including 940 in Nip, 959 in BJ89, and 974 in BJ278 (reads in the same accession under both control and drought stress conditions were combined to identify lncRNAs) (Fig. 2b and Table S2). Among the 1,246 lncRNAs, 692 (55.5%) were common to all three accessions, 306 (24.6%) were uniquely found in at least one of the two O. nivara accessions, and 111 (8.9%) were present in both O. nivara accessions (Fig. 2b). The expression profiles of lncRNAs were more different between the three accessions than between the drought and control conditions of the same accession (Fig. 2c).
Of the 1,246 lncRNAs, a total of 394 lncRNAs were differentially expressed between control and drought stress treatments (118 in Nip, 227 in BJ89, and 174 in BJ278). Among these 394 lncRNAs, 23 were common to all three accessions; 34 were identified as being shared between one of the O. nivara accessions and Nip; 45 were shared between the two O. nivara accessions (BJ89 and BJ278); and 139, 92, and 61 were uniquely found in BJ89, BJ278, and Nip, respectively (Fig. 3a, b).
Based on their location relative to the gene coding regions, 675 of the 1,246 lncRNAs were long intergenic noncoding RNAs (lincRNAs; lncRNAs located in intergenic regions), and 571 lncRNAs were long noncoding natural antisense transcripts (lncNATs; lncRNAs overlapped with coding genes on the opposite DNA strand) (Table S2). The lincRNAs contained fewer exons than lncNATs; 72.7% lincRNAs contained only one exon (Figure S3). By contrast, mRNAs contained more exons than both lincRNAs and lncNATs (Figure S3). In addition, mRNAs showed higher expression variation than lincRNAs and lncNATs under either control or drought stress condition at the genome level (Figure S4).
It has been shown that lncNATs can regulate the expression of sense transcripts [44, 45], and each strand of a NAT pair could potentially represent a protein-coding gene. Therefore, in addition to identifying NAT pairs from lncRNAs, we scanned NAT pairs across the whole genome of the three accessions, based on the annotation of the Nip reference genome. All transcripts annotated in the Nip reference genome were integrated and assembled with the ssRNA-seq data generated in this study. A total of 8,529 NAT pairs with overlapping regions greater than 25 nt were identified according to a previous study . According to the coding capacity of the sense–antisense pair , 86.88% (7,410) of NAT pairs were coding–coding pairs (both transcripts with protein-coding capacity), 0.33% (28) were noncoding–noncoding pairs (both transcripts represented lncRNAs), and 12.79% (1,091) were coding–noncoding pairs (one strand showed protein-coding capacity, while the other strand represented an lncRNA) (Table S3), and each transcript (transcripts with protein-coding capacity or lncRNAs) could flank a few different transcripts. Depending on the direction and location of the sense and antisense transcripts, 61.33% of the 8,529 NAT pairs were enclosed (one transcript fully embedded in the other), 23.38% were divergent (head-to-head, 5'-end overlap), and 15.29% were convergent (tail-to-tail, 3'-end overlap) (Fig. 3d).
Of the 8,529 NAT pairs, 5,866 were detected through ssRNA-seq analysis of the three accessions, of which 4,783, 4,813, and 4,596 were detected in Nip, BJ89, and BJ278, respectively (Fig. 3c, Figure S5, Table S3). Of the 5,866 NAT pairs, 62.2% (3,651) were shared among all three accessions (2,792 coding–coding pairs, 13 noncoding–noncoding pairs, and 846 coding–noncoding pairs) (Figure S5, Table S3); 8.8% (517) were expressed only in Nip (462 coding–coding pairs, 5 noncoding–noncoding pairs, and 50 coding–noncoding pairs) (Figure S5, Table S3); and 18.5% (1,083) were uniquely expressed in O. nivara accessions (998 coding–coding pairs, 8 noncoding–noncoding pairs, and 77 coding–noncoding pairs) (Figure S5, Table S3).
Genes involved in the response to drought stress
To determine the differences in gene expression patterns between control and drought treatments, we identified differentially expressed genes (DEGs) using the following criteria: fold change (FC) ≥ 2.0 and false discovery rate (FDR) ≤ 0.01. A total of 3,934, 5,880, and 5,036 DEGs were identified between control and drought treatments in Nip, BJ89, and BJ278, respectively (Fig. 3a, Table S4).
To identify genes within biological processes related to drought stress, GO enrichment analysis was performed on all DEGs (FC ≥ 2.0 and FDR ≤ 0.01) and highly differentially expressed genes (HDEGs) (FC ≥ 4.0 and FDR ≤ 0.01) identified in each accession. A total of 57 GO terms were enriched in the three accessions, and most terms were related to primary metabolic pathways essential for plant growth and development, such as ‘biosynthetic process’, ‘cellular biosynthetic process’, and ‘primary metabolic process’ (Figure S6). In addition, different GO terms were enriched in the three accessions in response to drought; for example, the ‘response to water’ GO term was uniquely enriched in BJ278 (Figure S6).
Based on the HDEGs (FC ≥ 4.0 and FDR ≤ 0.01), 63 GO terms in the biological process category were enriched, including 10 terms related to stress, such as ‘response to jasmonic acid stimulus’, ‘oxidation reduction’, and ‘gibberellin metabolic process’ [46, 47] (Fig. 4a, Table S5). Among these ten stress related terms, three (‘response to chemical stimulus’, ‘response to stimulus’, and ‘response to stress’) were detected in both BJ89 and Nip; three terms (‘jasmonic acid mediated signaling pathway’, ‘response to jasmonic acid stimulus’, and ‘response to biotic stimulus’) were uniquely enriched in BJ89; and four terms (‘response to abiotic stimulus’, ‘oxidation reduction’, and ‘response to water and gibberellin metabolic process’) were only enriched in Nip. In BJ278, only one GO term (‘carbohydrate metabolic process’) was enriched. These results suggest that Nip, BJ89, and BJ278 employ different mechanisms to respond to drought stress (Fig. 4a, Figure S6).
A total of 134 HDEGs were enriched in these 10 stress related GO terms. Among these 134 genes, 48 were found only in O. nivara (either one or both accessions); 12 were found only in Nip; 35 were shared between Nip and one of the two O. nivara accessions; and 39 were found in all three accessions but showed different expression levels (FC ≥ 2.0 and FDR ≤ 0.01) either between BJ89 and Nip or between BJ278 and Nip under the drought stress condition (Table S5).
A recent study reported the role of protein kinases in abiotic stress . Therefore, we focused on drought stress-responsive protein kinases in our ssRNA-seq data. We found 51 genes encoding protein kinases (24 belonging to ABA co-receptor clade A type 2C protein phosphatases [PP2Cs]), which were induced in response to drought stress and were differentially expressed between drought stress and control conditions in Nip, BJ89, and BJ278 or between O. nivara (BJ89 or BJ278) and cultivated rice (Nip) under the drought condition (Figure S7). Among the 24 PP2C genes, 14 were downregulated either in O. nivara accessions under drought stress compared with their control counterparts or in O. nivara accessions compared with Nip under the drought stress condition (Figure S8).
NAT pairs responsive to drought stress
Antisense transcription could silence or concordantly regulate the sense transcripts . To detect NAT pairs responsive to drought stress, both sense and antisense transcripts of each NAT pair showing differential expression (FC ≥ 2.0 and FDR ≤ 0.01) between control and drought stress conditions were identified as differentially expressed NAT pairs. A total of 369 differentially expressed NAT pairs were identified, of which 240 were coding–noncoding pairs (193 in BJ89, 96 in BJ278, and 53 in Nip). Additionally, among these 240 differentially expressed NAT pairs, 23 were common in all accessions; 18 were shared between Nip and BJ89 or BJ278; 12 were present only in Nip; and 187 were found only in O. nivara accessions (Figure S9, Table S6).
According to the effect of antisense transcripts on sense transcripts, we classified the NAT pairs into two categories, as described previously : concordant (sense and antisense transcripts expressed coordinately) and discordant (sense and antisense transcripts showing opposite expression patterns). A total of 24 discordant NAT pairs were identified (one shared among all accessions; two shared between BJ89 and BJ278; one shared between Nip and BJ89; 8, 9, and 12 uniquely found in Nip, BJ89, and BJ278, respectively) (Figure S10), including one NAT pair discordant in Nip but concordant in BJ89, and one pair discordant in BJ89 and BJ278 but concordant in Nip (Fig. 4b and Table S6). Additionally, 102 concordant NAT pairs were upregulated (30 in Nip, 61 in BJ89, and 54 in BJ278) (Fig. 4b). Among the upregulated concordant NAT pairs, 10 were shared among all accessions, 7 were shared between Nip and BJ278, 13 were found only in Nip, and 72 were found only in O. nivara (Figure S11 and Table S6).
A total of 245 concordant NAT pairs were downregulated (45 in Nip, 205 in BJ89, and 82 in BJ278), of which 17 were shared among all three accessions, 14 were shared between Nip and BJ89 or BJ278, 14 were found only in Nip, and 200 were found only in O. nivara (Fig. 4b, Figure S12, and Table S6). Among the 10 coding–noncoding NAT pairs related to GO enrichment terms for the drought stress response, nine were uniquely found in O. nivara, and one was common to all accessions (Table S7, Figure S13). Furthermore, among the nine coding–noncoding NAT pairs uniquely found in O. nivara, six were correlated with response to stress, three were correlated with the jasmonic acid stimulus pathway, and one was correlated with oxidation reduction (Table S7).