Identification of rice regular NLR genes at the genome level
More than 10 years ago, 530 NBS genes were identified in the rice genome, of which 440 are regular NLR genes (Zhou et al. 2004; Yang et al. 2006). Since then, the rice genome sequence has been progressively refined. We therefore re-identified the rice NLR genes using updated rice genomic data and obtained a total of 430 regular NLR genes, each containing both NBS and LRR domains (Table S1). The 430 genomic loci were then mapped to their corresponding microarray probes. Because the probes do not cover all genomic loci, 23 NLR genes had no corresponding probe, and 501 probes matched the remaining 407 NLR genes (Table S1).
Based on the presence or absence of a TIR motif in the N-terminal region, NLR genes in Arabidopsis thaliana can be classified into TIR-NBS-LRR (TNL) and CC-NBS-LRR (CNL) genes (Meyers et al. 2003). However, our results showed that none of the 430 regular rice NLR genes contained a TIR domain (Table S1). Non-TIR NLR genes from dicots typically have a Coiled-Coil (CC) motif in the N-terminal region (Tarr and Alexander 2009), so we next performed a CC motif search for these 430 rice NLR genes using the COILS Server (https://embnet.vital-it.ch/software/COILS_form.html). A previous study reported that 160 of the 440 regular rice NLR genes were CNL genes (Zhou et al. 2004). In our analysis, we found 192 CNL genes and 238 XNL genes (without a CC motif) (Table S1).
Expression profiling of the NLR genes under pathogen-infected conditions by integrative microarray analysis
To evaluate the expression profiles of rice NLR genes induced by Xoo and Mor, we analyzed 69 sets of microarray samples (Table S2). To avoid platform-related variance, all microarray data were selected from the same rice platform, GPL2025. The 69 sets of samples covered various rice genotypes and growth stages, diverse pathogen strains, and different inoculation methods and sampling times (Table 1; Table S2). The 407 rice NLR genes (matched by 501 probes) were evaluated for expression level based on their number of occurrences in different expression ranges across the 69 sets of samples. The expression levels of the 407 NLR genes were divided into three groups: low (< 1.0 fold-change), medium (≥ 1.0 and < 1.5 fold-change), and high (≥ 1.5 fold-change). Most of the NLR genes (397/407) showed low or medium levels of expression, and only 10 NLR genes showed high levels of expression (Fig. 1 and Table S3). No NLR gene was expressed at a very low level (< 0.5 fold-change), indicating that no NLR gene was strongly repressed by Xoo or Mor in all 69 samples.
Table 1. Summary of the microarray samples used in this study.
| Samples infected by Xoo | Samples infected by Mor |
Number of samples | 51 | 18 |
Number of rice genotypes | 10 | 5 |
Rice growth stage | from 14 to 75 days old | from 14 to 28 days old |
Number of pathogen strains | 10 | 8 |
Inoculated tissue | leaf | leaf, root |
Inoculation method | infiltration/leaf-clipping | spore suspension |
Sampling time | from 2 to 96 hpi | from 12 to 144 hpi |
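The three expression groups used for the 407 NLR genes can be illustrated with a minimal sketch. The thresholds are those stated in the text; the function name and the example fold-change values are hypothetical and are not taken from Table S3.

```python
def expression_group(fold_change: float) -> str:
    """Assign a fold-change value to the low/medium/high group.

    Thresholds follow the text: low (< 1.0), medium (>= 1.0 and < 1.5),
    high (>= 1.5).
    """
    if fold_change < 1.0:
        return "low"
    if fold_change < 1.5:
        return "medium"
    return "high"


# Hypothetical fold-change values, for illustration only.
example_fold_changes = {"gene_a": 0.8, "gene_b": 1.2, "gene_c": 1.5}
groups = {gene: expression_group(fc) for gene, fc in example_fold_changes.items()}
print(groups)
```

Note that the boundary value 1.5 falls in the high group, consistent with the "≥ 1.5 fold-change" definition.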
The differentially expressed NLR genes (de_NLRs) induced by Xoo- and Mor-infections in rice
To further reveal the differential expression of the 407 rice NLR genes in the 69 microarray samples, a threshold of p-value < 0.05 was used to screen for de_NLRs. The number of de_NLRs per sample varied from 10 to 172 (out of 407 NLR genes) across the 69 samples (Table S4), and in 40 samples it ranged from 51 to 100 (Fig. 2). This variation may be caused by the different conditions used in the individual experiments. Across the 69 samples, the number of samples in which a given NLR gene was differentially expressed ranged from 0 to 39 (Table S5). Of the 407 rice NLR genes, 400 were differentially expressed in at least one sample, two were differentially expressed in 39 samples, and none was differentially expressed in all 69 samples (Table S5).
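The two tallies reported here (de_NLRs per sample, and the number of samples in which each gene is differentially expressed) can be sketched as follows. This is only an illustration under stated assumptions: the nested dictionary layout and all p-values are hypothetical, and only the p-value < 0.05 cutoff comes from the text.

```python
from collections import Counter

P_CUTOFF = 0.05  # screening threshold stated in the text

# Hypothetical p-values (sample -> gene -> p-value), for illustration only.
pvalues = {
    "sample_1": {"gene_a": 0.01, "gene_b": 0.20, "gene_c": 0.04},
    "sample_2": {"gene_a": 0.03, "gene_b": 0.60, "gene_c": 0.30},
    "sample_3": {"gene_a": 0.50, "gene_b": 0.70, "gene_c": 0.02},
}

# Number of de_NLRs in each sample (the quantity summarized in Table S4).
de_per_sample = {
    sample: sum(p < P_CUTOFF for p in genes.values())
    for sample, genes in pvalues.items()
}

# Number of samples in which each gene is a de_NLR (as in Table S5).
gene_frequency = Counter()
for genes in pvalues.values():
    for gene, p in genes.items():
        gene_frequency[gene] += int(p < P_CUTOFF)

print(de_per_sample)
print(dict(gene_frequency))
```

Genes that never pass the cutoff are kept with a frequency of 0, which matches the reported range of 0 to 39.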
In addition, we tested whether the de_NLRs were statistically over-represented relative to the whole rice genome. Using the over-representation method (Khatri et al. 2012), we performed a gene enrichment analysis and compared all the de_NLRs with all the rice differentially expressed genes (DEGs). Only 8 groups of samples showed high enrichment of de_NLRs relative to the total rice DEGs, whereas in 39 samples the de_NLRs showed a lower incidence than the DEGs in the whole rice genome (Figure S1; Table S6). This indicates that, although rice NLR genes are thought to be required for pathogen perception and defense responses, they exhibit different expression profiles under different experimental conditions.
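Over-representation analysis is commonly implemented as a one-sided hypergeometric test. The text does not state which test statistic was used here, so the sketch below is an assumption rather than the authors' procedure. The function name and the small numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """Return P(X >= k) for a hypergeometric draw.

    N: genes in the background (e.g. all genes on the array)
    K: NLR genes in the background
    n: differentially expressed genes (DEGs) in one sample
    k: differentially expressed NLR genes (de_NLRs) in that sample
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


# Hypothetical toy numbers: 10 genes, 4 of them NLRs, 3 DEGs, 2 of which are NLRs.
p_enrichment = hypergeom_upper_tail(k=2, K=4, n=3, N=10)
print(round(p_enrichment, 4))
```

A small p-value would indicate that NLR genes are enriched among the DEGs of that sample; a value close to 1 would indicate that they are not.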
The robust pathogen-responsive rice NLR genes identified by integrative microarray analysis
To determine how many rice NLR genes simultaneously showed a high expression level and significant differential expression, we examined, for the 407 NLR genes, the relationship between the number of samples in which a gene showed a fold-change of at least 1.5 and the percentage of these genes with a p-value of less than 0.01. Combining two empirically rigorous filters should yield a more reliable set of pathogen-responsive NLR genes in rice.
As shown in Fig. 3, along the 1.5 fold-change line, the most dramatic decrease occurred below 17 out of 69 samples. As the frequency of occurrence increased, the percentage of genes significant at the 0.01 level also rose gradually and reached about 95% at 17 out of 69 samples. Based on this result, we reasoned that a cutoff of more than 16 samples would capture a convincing set of pathogen-responsive NLR genes. Hence, two filters were used to query the NLR genes: a) genes showing increased expression with a fold-change of at least 1.5 in at least 17 samples; b) genes with a p-value of 0.01 or less in Fisher's test. Using this approach, a set of 46 rice NLR genes was identified in the rice response to the two pathogens (Table 2). These putative rice NLR genes displayed a high frequency of differential expression under different pathogen-infected conditions and were therefore defined as robust pathogen-responsive rice NLR genes.
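The two-filter selection can be sketched as a simple predicate. The cutoffs (fold-change of 1.5, at least 17 samples, p-value of 0.01 or less) are taken from the text. The function name, the input layout, and the example values are hypothetical, and the sketch assumes that one combined p-value per gene is already available.

```python
FC_CUTOFF = 1.5    # fold-change threshold from the text
MIN_SAMPLES = 17   # minimum number of samples (out of 69) from the text
P_CUTOFF = 0.01    # significance threshold from the text


def is_robust(fold_changes: list[float], p_value: float) -> bool:
    """Apply filter (a) on fold-change frequency and filter (b) on the p-value."""
    n_high = sum(fc >= FC_CUTOFF for fc in fold_changes)
    return n_high >= MIN_SAMPLES and p_value <= P_CUTOFF


# Hypothetical genes, each with 69 fold-change values and one p-value.
passes_both = is_robust([2.0] * 17 + [1.0] * 52, p_value=0.005)
too_few_samples = is_robust([2.0] * 16 + [1.0] * 53, p_value=0.005)
not_significant = is_robust([2.0] * 20 + [1.0] * 49, p_value=0.02)
print(passes_both, too_few_samples, not_significant)
```

Requiring both conditions is stricter than either filter alone, which is why the candidate set shrinks from 407 genes to 46.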
Table 2
List of the 46 pathogen-responsive rice NLR genes identified by the integrative analysis.
Chromosome | Gene ID |
Chr1 | LOC_Os01g57870 |
| LOC_Os01g70080 |
| LOC_Os01g72680 |
Chr2 | LOC_Os02g16060 |
| LOC_Os02g18000 |
| LOC_Os02g18080 |
Chr3 | LOC_Os03g20840 |
| LOC_Os03g50150 |
| LOC_Os03g63150 |
Chr4 | LOC_Os04g02030 |
| LOC_Os04g02110 |
| LOC_Os04g25900 |
| LOC_Os04g30930 |
| LOC_Os04g43340 |
| LOC_Os04g43440 |
Chr5 | LOC_Os05g34220 |
| LOC_Os05g41310 |
Chr6 | LOC_Os06g49390 |
Chr7 | LOC_Os07g17250 |
Chr8 | LOC_Os08g14850 |
| LOC_Os08g16120 |
| LOC_Os08g29809 |
| LOC_Os08g42700 |
Chr9 | LOC_Os09g09490 |
| LOC_Os09g14450 |
| LOC_Os09g34150 |
| LOC_Os09g34160 |
Chr10 | LOC_Os10g04120 |
| LOC_Os10g36270 |
Chr11 | LOC_Os11g11770 |
| LOC_Os11g11960 |
| LOC_Os11g12000 |
| LOC_Os11g12050 |
| LOC_Os11g12330 |
| LOC_Os11g12340 |
| LOC_Os11g16470 |
| LOC_Os11g37050 |
| LOC_Os11g38440 |
| LOC_Os11g38520 |
| LOC_Os11g42660 |
| LOC_Os11g45930 |
Chr12 | LOC_Os12g10180 |
| LOC_Os12g10710 |
| LOC_Os12g13550 |
| LOC_Os12g28100 |
| LOC_Os12g37770 |
Further validation of the expression of the 46 pathogen-responsive rice NLR genes by RNA-seq data
To further validate the 46 candidate pathogen-responsive NLR genes, we compared them against lists of rice genes that showed differential expression in six rice RNA-seq datasets. These RNA-seq data covered interactions between rice and Magnaporthe oryzae as well as between rice and Rhizoctonia solani. The expression levels of the 46 NLR genes in the RNA-seq data were very similar to those in the microarray data. Of the 46 NLR genes, 38 were differentially expressed in at least one RNA-seq dataset, and 15 were differentially expressed in at least three RNA-seq datasets (Table 3). Thus, these 38 NLR genes were detected in both the microarray data and the RNA-seq data, indicating that they are active in the response to rice pathogens (Fig. 4).
Table 3
The golden list of rice NLR genes (FC ≥ 1.5) frequently responsive to Xoo and Mor.
Gene ID | Frequency in microarray data (n = 69) | Frequency in RNA-seq data (n = 6) |
LOC_Os01g70080 | 34 | 5 |
LOC_Os11g12340 | 25 | 5 |
LOC_Os12g13550 | 21 | 5 |
LOC_Os02g18080 | 21 | 4 |
LOC_Os11g11960 | 18 | 4 |
LOC_Os11g12000 | 21 | 4 |
LOC_Os12g10710 | 21 | 4 |
LOC_Os03g20840 | 32 | 3 |
LOC_Os07g17250 | 21 | 3 |
LOC_Os08g42700 | 19 | 3 |
LOC_Os09g34150 | 33 | 3 |
LOC_Os09g34160 | 26 | 3 |
LOC_Os11g37050 | 24 | 3 |
LOC_Os11g45930 | 18 | 3 |
LOC_Os12g37770 | 30 | 3 |
LOC_Os03g63150 | 18 | 2 |
LOC_Os04g02030 | 18 | 2 |
LOC_Os04g43440 | 39 | 2 |
LOC_Os05g34220 | 24 | 2 |
LOC_Os11g12050 | 18 | 2 |
LOC_Os11g12330 | 18 | 2 |
LOC_Os11g38520 | 24 | 2 |
LOC_Os12g10180 | 30 | 2 |
LOC_Os12g28100 | 20 | 2 |
LOC_Os01g57870 | 18 | 1 |
LOC_Os01g72680 | 19 | 1 |
LOC_Os02g18000 | 18 | 1 |
LOC_Os04g02110 | 20 | 1 |
LOC_Os04g25900 | 20 | 1 |
LOC_Os04g30930 | 19 | 1 |
LOC_Os04g43340 | 17 | 1 |
LOC_Os06g49390 | 19 | 1 |
LOC_Os08g16120 | 17 | 1 |
LOC_Os09g09490 | 20 | 1 |
LOC_Os09g14450 | 18 | 1 |
LOC_Os10g04120 | 17 | 1 |
LOC_Os10g36270 | 17 | 1 |
LOC_Os11g11770 | 22 | 1 |
# FC: Fold change between infected and control samples.
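The counts quoted in the text can be checked against the RNA-seq column of Table 3 with a short tally. The frequencies below are transcribed from that column (three genes with 5, four with 4, eight with 3, nine with 2, and fourteen with 1); the variable names are illustrative.

```python
# RNA-seq frequencies (n = 6 datasets) transcribed from Table 3, top to bottom.
rnaseq_frequency = [5] * 3 + [4] * 4 + [3] * 8 + [2] * 9 + [1] * 14

# Genes differentially expressed in at least one RNA-seq dataset.
n_at_least_one = sum(f >= 1 for f in rnaseq_frequency)

# Genes differentially expressed in at least three RNA-seq datasets.
n_at_least_three = sum(f >= 3 for f in rnaseq_frequency)

print(n_at_least_one, n_at_least_three)  # 38 15
```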
Expression patterns of the rice NLR genes detected in both microarray and RNA-seq data
To obtain detailed expression patterns for the 38 NLR genes detected in both the microarray and RNA-seq data, the OmicShare tool (https://www.omicshare.com/) was used to generate a heat map based on the expression values (log2 fold-change between infected and control samples) from the 69 individual experiments. Strikingly, the 38 rice NLR genes showed very similar expression patterns (Figure S2). They were highly up-regulated under most of the experimental conditions and down-regulated in only a few experiments (Figure S2), showing that they were active in the rice response to the two pathogens and suggesting that they play important roles in rice defense against these pathogens.
The putative cis-regulatory elements in the promoters of 38 rice NLR genes
Our previous study showed that four pathogen-inducible cis-regulatory elements (PICEs) (G-box, AS-1, GCC-box, and H-box) occur in the promoter regions of some rice genes and are critical for the response of these genes to pathogen attack (Kong et al. 2018). Understanding the upstream regulators of rice NLR genes should extend our knowledge of NLR gene expression in the rice defense response. Here, we examined the 2000 bp upstream promoter regions of the 38 NLR genes detected in both the microarray and RNA-seq data to determine whether the PICEs are present. Besides two core promoter elements (TATA box and CAAT box), we identified six cis-regulatory elements that occur frequently in the promoter regions: MYC, STRE, MYB, ABRE, G-box, and AS-1 (Fig. 5).
The four most frequent elements were MYC, STRE, MYB, and ABRE, each of which occurred more than 100 times (Fig. 5). The presence of MYB and MYC elements in the promoters suggests that they are potential binding sites for MYB and MYC transcription factors, respectively. These two transcription factors have been shown to play significant roles in plant defense against various stresses (Fang et al. 2018; Feng et al. 2013; Wu et al. 2019). When MYB and MYC transcription factors bind to their corresponding cis-regulatory elements, they regulate a range of genes associated with biotic or abiotic stresses (Ambawat et al. 2013; Dubos et al. 2010; Ogata et al. 1994). In addition, STRE and ABRE are extensively studied stress-responsive elements; the former is a typical stress-responsive element and the latter is an abscisic acid response element (Fujita et al. 2005; Moskvina et al. 1999; Narusaka et al. 2003).
It is worth noting that G-box and AS-1 also occurred at high frequency, and these two cis-regulatory elements have been implicated in the induced expression of some genes after pathogen attack (Loake et al. 1992; Ulmasov et al. 1994; Xiang et al. 1996). Several studies have reported that the AS-1 cis-regulatory element can be bound by the TGA family of basic-leucine zipper (bZIP) transcription factors, which then activate the initial plant defense against pathogen attack (Despres et al. 2000; Sarkar et al. 2018). G-box was found to be bound by G/HBF-1, which is also a bZIP protein, to induce the expression of some defense genes (Droge-Laser et al. 1997). In addition, G-box has been demonstrated to be bound by MYC proteins and basic/helix-loop-helix (bHLH) proteins (Boter et al. 2004; Toledo-Ortiz et al. 2003). We speculate that interactions between the PICEs present in the NLR gene promoter regions and their corresponding transcription factors are required for the transcriptional activation of some rice NLR genes.
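Counting cis-regulatory elements in a promoter amounts to scanning the upstream sequence for short motifs on both strands. The text does not state which tool or motif definitions were used, so the sketch below is only an illustration: the core sequences CACGTG (G-box) and TGACG (AS-1) are commonly cited consensus cores used here as assumptions, and the function names and example sequence are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def count_on_strand(seq: str, motif: str) -> int:
    """Count (possibly overlapping) exact matches of motif in seq."""
    width = len(motif)
    return sum(seq[i:i + width] == motif for i in range(len(seq) - width + 1))


def count_motif(seq: str, motif: str) -> int:
    """Count a motif on both strands; palindromic motifs are counted once per site."""
    forward = count_on_strand(seq, motif)
    rc = reverse_complement(motif)
    if rc == motif:  # e.g. CACGTG reads the same on both strands
        return forward
    return forward + count_on_strand(seq, rc)


# Assumed core motifs (not taken from the text) and a hypothetical short promoter.
MOTIFS = {"G-box": "CACGTG", "AS-1": "TGACG"}
promoter = "AACACGTGTTTGACGAA"
counts = {name: count_motif(promoter, motif) for name, motif in MOTIFS.items()}
print(counts)
```

In practice the same scan would be run over each 2000 bp upstream region and the counts summed per element, as summarized in Fig. 5.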
Validation of selected NLR genes by qRT-PCR
To further evaluate the 38 pathogen-responsive NLR genes identified in this work, the top five up-regulated genes were selected for qRT-PCR analysis. In addition, another five NLR genes, which displayed a high frequency of down-regulation during pathogen infection, were also used for validation. As shown in Fig. 6, three of the five up-regulated NLR genes displayed significant increases (p-value < 0.05) at 24 hpi in PXO99A-infected samples. However, the remaining two NLR genes (LOC_Os09g34150 and LOC_Os12g10180) were down-regulated by PXO99A, which was inconsistent with the microarray and RNA-seq data (Fig. 6). In the down-regulated gene set, the expression levels of four members (LOC_Os12g32660, LOC_Os11g35210, LOC_Os05g30220 and LOC_Os06g41660) were significantly repressed under PXO99A infection, consistent with the microarray and RNA-seq data (Fig. 6). One gene, LOC_Os01g16370, showed the opposite trend of expression change in response to PXO99A inoculation (Fig. 6).
Overall, the qRT-PCR results showed a trend similar to that of the microarray analysis and supported the reliability of the pathogen-responsive rice NLR genes identified in this work. The discrepancy between the qRT-PCR and microarray results for three NLR genes may be due to the different infected rice samples used.
Examination of the expression of 30 NLR genes not detected in the microarray data
We found that 23 rice NLR genes had no corresponding probes and that another 7 rice NLR genes showed no detectable differential expression at the transcriptional level in the microarray data. We therefore asked whether these 30 NLR genes could be detected as differentially expressed in RNA-seq data. Analysis of six published RNA-seq datasets showed that, of the 7 NLR genes examined, only two (LOC_Os12g09240 and LOC_Os11g42770) were differentially expressed in at least one dataset (Table 4). LOC_Os12g09240 showed high levels of expression (> 1.8 fold) and LOC_Os11g42770 showed a low level of expression (< 1.0 fold) (Table 4). Of the 23 NLR genes that had no corresponding probes, ten (43%) were differentially expressed in at least one dataset (Table 4). Among these ten NLR genes, five were differentially expressed in at least two RNA-seq datasets (Table 4). These data demonstrate that twelve of the 30 NLR genes have the potential to be expressed in the rice defense response.
Table 4
Detection of differential expression of 30 rice NLRs in 6 groups of RNA-seq data.
Gene ID | Type | Number of datasets with differential expression | Maximum fold change | Minimum fold change |
LOC_Os12g09240 | Could not be detected in microarray | 2 | 3.8 | 2.8 |
LOC_Os11g42770 | Could not be detected in microarray | 1 | 0.5 | 0.5 |
LOC_Os11g42580 | Could not be detected in microarray | 0 | - | - |
LOC_Os12g10340 | Could not be detected in microarray | 0 | - | - |
LOC_Os12g10360 | Could not be detected in microarray | 0 | - | - |
LOC_Os12g10390 | Could not be detected in microarray | 0 | - | - |
LOC_Os12g29710 | Could not be detected in microarray | 0 | - | - |
LOC_Os06g48520 | No corresponding probe | 4 | 5.3 | 1.3 |
LOC_Os11g12320 | No corresponding probe | 3 | 2.9 | 1.5 |
LOC_Os08g05440 | No corresponding probe | 2 | 1.8 | 0.4 |
LOC_Os11g39330 | No corresponding probe | 2 | 0.5 | 0.2 |
LOC_Os12g17480 | No corresponding probe | 2 | 0.9 | 0.5 |
LOC_Os04g53496 | No corresponding probe | 1 | 0.9 | 0.9 |
LOC_Os06g20050 | No corresponding probe | 1 | 27.3 | 27.3 |
LOC_Os08g32880 | No corresponding probe | 1 | NA | NA |
LOC_Os10g07534 | No corresponding probe | 1 | NA | NA |
LOC_Os11g45840 | No corresponding probe | 1 | NA | NA |
LOC_Os02g16330 | No corresponding probe | 0 | - | - |
LOC_Os04g02460 | No corresponding probe | 0 | - | - |
LOC_Os04g30530 | No corresponding probe | 0 | - | - |
LOC_Os06g41640 | No corresponding probe | 0 | - | - |
LOC_Os08g07380 | No corresponding probe | 0 | - | - |
LOC_Os09g11020 | No corresponding probe | 0 | - | - |
LOC_Os11g30210 | No corresponding probe | 0 | - | - |
LOC_Os11g45924 | No corresponding probe | 0 | - | - |
LOC_Os12g10490 | No corresponding probe | 0 | - | - |
LOC_Os12g17410 | No corresponding probe | 0 | - | - |
LOC_Os12g17420 | No corresponding probe | 0 | - | - |
LOC_Os12g30720 | No corresponding probe | 0 | - | - |
LOC_Os12g30760 | No corresponding probe | 0 | - | - |
-: not detected in the published RNA-seq data; NA: not available in the published RNA-seq data (Bagnaresi et al., 2012; Jain et al., 2017; Zhang et al., 2016; Zhang et al., 2018).
Additionally, we explored whether any factors repress the expression of the remaining 18 NLR genes that could not be detected as differentially expressed by either microarray or RNA-seq. We designated these 18 genes without differential expression as group A, and then randomly selected 18 genes from the differentially expressed NLR genes as group B. By performing a motif search on both groups, we compared the most frequent cis-regulatory elements in the upstream regions of the two groups of genes, but did not observe any differences. We also examined possible transposons within or adjacent to the NLR genes of the two groups, yet found no significant difference between group A and group B.
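Drawing a size-matched random comparison group, as done for group B, can be sketched as below. The text does not describe how the random selection was performed, so the fixed seed, the function name, and the placeholder gene identifiers are assumptions made for illustration.

```python
import random


def pick_control_group(de_genes: list[str], size: int = 18, seed: int = 0) -> list[str]:
    """Randomly draw `size` distinct genes; a fixed seed keeps the draw reproducible."""
    rng = random.Random(seed)
    return rng.sample(sorted(de_genes), size)


# Placeholder identifiers standing in for the differentially expressed NLR genes.
de_pool = [f"de_gene_{i:03d}" for i in range(1, 401)]
group_b = pick_control_group(de_pool)
print(len(group_b), len(set(group_b)))
```

Matching the size of group B to the 18 genes of group A keeps the motif and transposon comparisons balanced between the two groups.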