Whole genome identification of lncRNAs expressed in walnut fruit bracts
To identify lncRNAs expressed in walnut fruits in response to C. gloeosporioides, we constructed 20 cDNA libraries from the anthracnose-resistant and the anthracnose-susceptible walnut fruits at the following five infection stages: tissue at 0 hpi (hours post inoculation), infected tissue at 24, 48, and 72 hpi, and distal uninoculated tissue at 120 hpi (Additional file 1: Table S1). The libraries were sequenced with an Illumina HiSeq 4000 platform. A total of 265.4 Gb clean data were obtained, with an average of 13.27 Gb per library. Approximately 69.7% of the clean reads in all libraries were mapped to the walnut reference genome (Additional file 2: Table S2). The aligned transcripts were assembled, combined, and screened with the FEELnc software to obtain 22,336 lncRNAs (length ≥ 200 nt, ORF coverage < 50%, and potential coding score < 0.5), including 18,403 unknown lncRNAs (23.97%) and 3,933 known lncRNAs (5.12%) (Fig 1).
Characterization of walnut fruit bract lncRNAs
A total of 58,369 mRNAs and 22,336 lncRNAs were obtained for the walnut fruit bracts (all samples combined) (Additional file 3: Table S3,S4). The lncRNAs were characterized according to their locations relative to the partner RNA. A total of 40,429 (67.57%) lncRNAs were located in intergenic regions (i.e., only 32.43% genic lncRNAs). Additionally, 19,767 (48.89%) and 7,302 (37.63%) of the intergenic lncRNAs and genic lncRNAs were located in the antisense strand, respectively (Fig 2a) (Additional file 5: Table S5). Most lncRNAs contained two or three exons, which differentiated them from mRNAs (Fig 2c). Moreover, there was considerable diversity in the distribution of mRNA and lncRNA lengths (Fig 2b). Furthermore, the expression level of most lncRNAs was significantly lower than that of mRNAs (Fig 2d).
Differentially expressed lncRNAs at various infection stages
The lncRNAs that were differentially expressed between the disease-susceptible F423 fruits and the disease-resistant F26 fruits at different C. gloeosporioides infection stages were analyzed. Compared with F423, a total of 14,525 DELs were identified, including 10,645 up-regulated lncRNAs and 3,846 down-regulated lncRNAs in F26. The number of upregulated and downregulated lncRNAs in the various comparisons were respectively as follows: 7,668 and 1,386 in the F26_0hpi vs F423_0hpi comparison; 6,910 and 1,165 in the F26_24hpi vs F423_24hpi comparison; 1,721 and 1,593 in the F26_48hpi vs F423_48hpi comparison; 898 and 1,133 in the F26_72 hpi vs F423_72 hpi comparison; and 4,711 and 550 in the F26_120 hpi vs F423_120 hpi comparison (Fig 3a, b) (Additional file 6: Table S6). Additionally, compared with F423, a total of 34,007 differentially expressed mRNAs were identified, including 15,247 upregulated mRNAs and 13,198 downregulated mRNAs in F26. the number of upregulated and downregulated mRNAs in the various comparisons were respectively as follows: 6,836 and 4,622 in the F26_0 hpi vs F423_0 hpi comparison; 6,392 and 3,955 in the F26_24 hpi vs F423_24 hpi comparison; 3,454 and 4,347 in the F26_48 hpi vs F423_48 hpi comparison; 2,709 and 3,113 in the F26_72 hpi vs F423_72 hpi comparison; and 4,976 and 3,563 in the F26_120 hpi vs F423_120 hpi comparison (Fig 3c, d) (Additional file 7: Table S7). These results revealed the similarities in the expression of lncRNAs and mRNAs. And the number of upregulated lncRNAs and mRNAs in F26 compared to in F423 was significantly higher at the early stages of C. gloeosporioides infection.
Identification of co-expressed lncRNA modules
To identify the hub lncRNAs and predict their potential target genes in trans-regulatory relationships, a weighted gene co-expression network analysis (WGCNA) was used to generate a correlation matrix of the expression levels of 10,645 upregulated lncRNAs and 15,247 upregulated mRNAs. A total of 19 expression modules were screened (Fig 4a) (Additional file 8: Table_S8). The relationships between modules and the resistance traits of the walnut fruit bracts were analyzed and four significantly correlated modules (|r|≥ 0.8) were identified. The MEviolet module was correlated with F26_0hpi (r = 0.95, p = 9e−11), which contains 406 lncRNAs and 1350 mRNAs. The MElightyellow module was correlated with F26_24hpi ( r = 0.86, p = 1e−06 ), which contains 165 lncRNAs and 892 mRNAs. The MEbrown2 module was correlated with F26_48hpi (r = 0.82, p = 8e−0.86), which contains 128 lncRNAs and 224 mRNAs. The MEwhite module was correlated with F26_72hpi (r = 0.81, p = 1e−05), which contains 111 lncRNAs and 378 mRNAs (Fig 4c). Regarding F26_120 hpi, the rand p value for the MEorange module was 0.73 and 3e−0.4, respectively. The highest r value (0.77) for F423 was calculated for the MEdarkseagreen module and F423_48hpi (Fig 4b). And the MEorange module contains 76 lncRNAs and 227 mRNAs (Fig 4c). These results suggested that lncRNAs are closely related to the disease resistance of walnut fruit bracts.
Enrichment analysis of genes co-expressed with lncRNAs
The GO and KEGG pathway databases were used to analyze the genes co-expressed with lncRNAs in each significant module and MEorange module. In the MEviolet module, a total of 208 GO terms were assigned, including 106, 8 and 94 GO terms in “biological process”, “cellular component” and “molecular functions”, respectively (Additional file 9:Table_S9). Among these enriched GO terms, most of them were related to biosynthesis and gene expression regulation, and the ones related to plant immunity were “response to stimulus”(GO:0050896) (187 genes) and “cellular response to stimulus”(GO:0051716) (114 genes) (Fig 5a).In total, 104 enriched KEGG pathways were identified, of which 30 pathways were significantly enriched in this module (Additional file 10: Table_S10). The top 30 significantly enriched pathways for target genes are mentioned in Fig 6a. “Plant hormone signal transduction” (ko04075) (22 genes), “Fatty acid metabolism” (ko01212) (15 genes), “Fatty acid elongation” (ko00062) (12 genes), “Ribosome” (ko03010) (12 genes), and “Spliceosome” (ko03040) (11 genes) were the most significant KEGG pathways.
In the MElightyellow module, a total of 164 GO terms were assigned, including 79, 16 and 69 GO terms in “biological process”, “cellular component” and “molecular functions”, respectively (Additional file 9: Table_S9). Among them, GO terms related to plant immunity included "activation of innate immune response" (GO: 0002218) (4 genes), "activation of immune response" (GO: 0002253) (4 genes), and "induced systemic resistance, jasmonic acid mediated signaling pathway" (GO: 0009864) (3 genes) (Fig 5b). In total, 93 enriched KEGG pathways were identified, of which 30 pathways were significantly enriched in this module (Additional file 10: Table_S10). The top 30 significantly enriched pathways for target genes are mentioned in Fig 6b. “Starch and sucrose metabolism” (ko00500) (14 genes), “Plant hormone signal transduction” (ko04075) (13 genes), “Phenylpropanoid biosynthesis” (ko00940) (11 genes), “Biosynthesis of amino acids” (ko01230) (10 genes), and “DNA replication” (ko03030) (8 genes) were the most significant KEGG pathways.
In the MEbrown2 module, a total of 126 GO terms were assigned, including 89, 5 and 32 GO terms in “biological process”, “cellular component” and “molecular functions”, respectively (Additional file 9: Table_S9). In addition to the terms related to biological metabolism and gene expression regulation, the items related to plant immunity “response to endogenous stimulus” (GO:0009719) (15 genes), “cellular response to endogenous stimulus” (GO:0071495) (13 genes) and “cellular response to hormone stimulus ”(GO:0032870) (12 genes) were also enriched significantly (Fig 5c). In total, 38 enriched KEGG pathways were identified, of which 30 pathways were significantly enriched in this module (Additional file 10: Table_S10).The top 30 significantly enriched pathways for target genes are mentioned in Fig 6c. “Cyanoamino acid metabolism” (ko00460) (3 genes), “Plant hormone signal transduction” (ko04075) (6 genes), “Nitrogen metabolism” (ko00910) (2 genes), “Terpenoid backbone biosynthesis” (ko00900) (2 genes) were the most significant KEGG pathways.
In the MEwhite module, a total of 142 GO terms were assigned, including 95, 4 and 43 GO terms in “biological process”, “cellular component” and “molecular functions”, respectively (Additional file 9: Table_S9). Among the biological process category, the significantly over represented GO terms were “response to stimulus” (GO: 0050896) (67 genes), followed by “response to stress” (GO: 0006950) (51 genes) and “defense response” (GO: 0006952) (43 genes), which were all related to plant immunity . In addition, other terms related to plant immunity were also enriched, such as “immune system process” (GO:0002376) (14 genes), “response to biotic stimulus” (GO:0009607) (14 genes) and “innate immune response” (GO:0045087) (13 genes), etc (Fig 5d). In total, 54 enriched KEGG pathways were identified, of which 30 pathways were significantly enriched in this module (Additional file 10: Table_S10). The top 30 significantly enriched pathways for target genes are mentioned in Fig 6d. “Carbon metabolism” (ko01200) (5 genes), “Cysteine and methionine metabolism” (ko00270) (4 genes), “Amino sugar and nucleotide sugar metabolism” (ko00520) (4 genes) were the most significant KEGG pathways.
In the MEorange module, a total of 128 GO terms were assigned, including 87, 8 and 33 GO terms in “biological process”, “cellular component” and “molecular functions”, respectively (Additional file 9: Table_S9). Among the biological process category, “response to organic substance” (GO: 0010033) (14 genes), “response to endogenous stimulus” (GO: 0009719) (13 genes), and “response to external stimulus” (GO: 0009605) (10 genes)etc., associated with plant immunity were significantly enriched (Fig5e). In total, 32 enriched KEGG pathways were identified, of which 30 pathways were significantly enriched in this module (Additional file 10: Table_S10). The top 30 significantly enriched pathways for target genes are mentioned in Fig 6e. “Plant hormone signal transduction” (ko04075) (4 genes), “Thiamine metabolism” (ko00730) (3 genes), “Starch and sucrose metabolism” (ko00500) (3 genes) and “Fatty acid degradation” (ko00071) (2 genes) were the most significant KEGG pathways.
Network analysis of hub lncRNAs
The hub lncRNAs are important for regulating the whole network. Therefore, we screened the 96 hub lncRNAs and 124 known target genes according to their weight value and connectivity in five modules (Additional file 11: Table_S11). In the MEviolet module, the 25 known target genes for 15 hub lncRNAs were found to be involved in multiple functions (Fig 7a), such as probable galacturonosyl transferase 10 and ultraviolet-B receptor UVR8-like. In addition, target genes encoding receptor-like serine / threonine-protein kinase NCRK (XM_018958556.1) and eukaryotic translation initiation factor 5A-2-like (XM_018994862.1) are known resistance genes (Fig 7a). In the MElightyellow module, 16 hub lncRNAs were generated and their 22 known target genes were involved in many functions (Fig 7b). And the target genes encoding G-type lectin S-receptor-like serine/threonine-protein kinase LECRK1 (XM_018950446.1), probably inactive leucine-rich repeat receptor-like protein kinase At2g25790 (XM_018989953.1) and TMV resistance protein N-like (XM_018961957.1) were known resistance genes (Fig 7b). In the MEbrown2 module, 24 hub lncRNAs and their 15 known target genes were generated (Fig 7c), the target gene encoding probable LRR receptor-like serine / threonine-protein kinase At3g47570 (XM_018962714.1) was konwn resistance gene (Fig 7c). In the MEwhite module, 23 hub lncRNAs were generated and their 38 known target genes were involved in many functions (Fig 7d). The target genes encoding putative disease resistance protein At1g50180 (XM_018965430.1) , probable LRR receptor-like serine/threonine-protein kinase At1g63430 (XM_018973294.1) and L-type lectin-domain containing receptor kinase IV.2-like (XM_018954279.1) were konwn resistance genes (Fig 7d). In the MEorange module, 18 hub lncRNAs were generated and their 24 known target genes were involved in many functions (Fig7e). And the target gene encoding the inactive LRR receptor-like serine / threonine-protein kinase BIR2 (XM_018967526.1) was konwn resistance gene (Fig 7e). All disease resistance genes in walnut are listed in Additional file 12: Table_S12. These results suggested that lncRNAs may participate in the resistance of walnut bracts to C. gloeosporioides by acting on their target genes. Based on the enrichment results of KEGG, we predicted the possible pathway of hub lncRNAs (Additional file 13: Table_S13). Most of the hub lncRNAs and its target genes in the five modules are enriched in the pathways of material metabolism and biosynthesis. In the white module, lncRNA MSTRG.94840.7 and MSTRG.103441.8 were enriched in “plant pathogen interactions” and “plant hormone signal transduction” pathways, which may be related to plant immunity.
Validation of hub lncRNAs and target genes
We randomly selected 5 hub lncRNAs and 5 target genes for qRT-PCR analysis with the aim to validate the expression profiles between F26 and F423 obtained by RNA-Seq. The list of hub lncRNAs specific primers used for qRT-PCR analysis is listed in Additional file 13: Table_S13. The hub lncRNAs selected for qRT-PCR confirmation were MSTRG.13585.8 , MSTRG.152205.1 , MSTRG.11713.16 , MSTRG.112028.8, and MSTRG.62751.2, the target genes were related to ultraviolet-B receptor UVR8-like (LOC109014322), G-type lectin S-receptor-like serine / threonine-protein kinase LECRK1(LOC108979712), NHL repeat-containing (LOC108987880), probable LRR receptor-like serine/threonine-protein kinase At3g47570 (LOC108989177), and putative disease resistance protein At1g50180 (LOC108991254). The qRT-PCR analysis showed that the expression of MSTRG13585 and LOC109014322 peaked at 0hpi , MSTRG11713, MSTRG152205 , LOC108979712 and LOC108987880 at 24hpi , MSTRG112028 and LOC108989177 at 48hpi, MSTRG62751 and LOC108991254 at 72hpi (Fig 9), which were consistent with the RNA-seq data, with similar trends observed for the hub lncRNAs and their target genes.