Determination of drought tolerance in linseed varieties
In this study, we measured three drought-tolerance related phenotypic traits of Z141 and NY-17 (Additional file 1). Z141 consistently performed better than NY-17 under DS (Figure 1a-d). In addition, under DS, Z141 had a lower plant height and biomass reduction rate compared than NY-17 under DS (Figure1e, f; Additional file 2). The biomass reduction rate under DS was 30% and 46% in Z141 and NY-17 respectively. The relative leaf water content (RLWC) of Z141 was significantly higher than that of NY-17, suggesting that Z141 leaves can retain more water under drought stress. (Figure 1g, h; Additional file 3)
Two-way ANOVA results showed significant effects of the different varieties and different drought level treatments and their effects on plant height, biomass ALWC and RLWC (Table 1). By comparing the phenotypes of Z141 and NY-17 under drought stress, it is found that the drought-tolerant of Z141 was stronger than that of NY-17. Therefore, we reveal the molecular mechanism difference between Z141 and NY-17 in response to drought stress using single-molecule long-read transcriptome sequencing..
Analysis of the linseed transcriptome by PacBio Iso-Seq
Total RNA of Z141 and NY-17 was isolated from control, DS, RW and RD treatment groups and quality checked. A total of 16 RNA samples were sent to Wuhan Frasergen Bioinformatics Co.,Ltd. Genomic Service for sequencing using the PacBio Sequel platform. This platform can generate sufficiently long read lengths that cover the full length of most RNA transcripts, ensuring that accurate reconstructed FL splice variants are obtained. Over 2 million polymerase reads with a mean length of ~30,000 bp were generated after quality checking by Frasergen (Additional file 4). After processing raw data, we obtained more than 33 million filtered subreads with a mean length of ~2000 bp (Additional file 5). In addition, we obtained 1,599,415 circular consensus (CCS) reads, which included 1,293,134 FL reads (Additional file 6). De novo reconstruction of the transcriptome data was performed using RNA-Seq reads and publicly available flax sequences. To evaluate the density and length of isoforms, we compared the locus coverages of PacBio full-length and non-chimeric (FLNC) sequences and swine SSC 10.2 annotation. In the PacBio dataset, a total of 1,093,282 high-quality FLNC sequences covered 108,579 isoforms and were allocated to 28,686 loci (Additional file 7). Due to the high base error of SMRT sequencing, high-quality Illumina short reads were obtained using Prooveread software to correct the errors (Additional file 8). In this study, the pre- and post-correction FLNC sequences were aligned to the linseed genome sequence through GMAP, and finally, we obtained 1,093,282 high-quality FLNC sequences for further study (Additional file 9).
Global comparisons of DS- and RD-related transcriptomes reveal gene expression and functional group differences
mRNA populations were compared using principal component analysis (PCA) to provide a framework for understanding how linseed genes are regulated to respond to DS. Transcriptomes of Z141 and NY-17 under DS, RW and RD were likely to share a great similarity in gene expression, with variations forming three groups that were separated far from the control (Figure 2a). The transcriptomes of DS exhibited a distinct relationship from those of RD, suggesting that the gene expression in the transcriptome has a major shift between DS and RD.
Cluster analysis of differentially expressed genes (DEGs) further supported our observed results from PCA (Figure 2b). The overlaps of up- and downregulated genes between Z141-RD and NY-17-RD was significantly higher than that between Z141-DS and Z141-RD, with 62.1% compared to 47.8% (upregulated) and 70.7% compared to 60.6% (downregulated) respectively (Figure 2c, d). In addition, in Z141 and NY-17 approximately 52.2% and 65.6% of upregulated genes were responsive to only RD respectively, and 29.9% and 43.6% of upregulated genes were responsive to only DS (Additional file 10). Specifically, in Z141 and NY-17, 8005 (including 3245 for DS and 4760 for RD) and 6285 (including 2381 for DS and 3904 for RD) genes were upregulated under drought, respectively (Additional file 10). Approximately 9104 (including 4041 for DS and 5063 for RD) and 7908 (3515 for DS and 4393 for RD) genes were downregulated under drought in Z141 and NY-17 (Additional file 10). We also observed a higher proportion of stress-responsive genes under RD than that under DS. In this study, 2275 and 1343 genes were upregulated, and 3067 and 2154 were downregulated when Z141 and NY-17 were under DS, respectively. In total, 1007 and 1686 genes were significantly up- and downregulated when Z141 and NY-17 were under DS and RD (Figure 2c, d). Taken together, these results suggest that the transcriptomes of DS and RD has fundamentally different.
Gene Ontology (GO) enrichment analysis was conducted to examine the functional distribution of the DS-related candidate genes identified in our study. We performed GO enrichment analysis on 2,275 and 1,343 DEGs that both up-regulated under DS and RD stress in Z141 or NY-17 respectively (Additional file 11). A series of GO categories exhibited significantly higher enrichments in the overlapping or unique upregulated gene sets under DS and RD treatments compared to their levels in the control. The GO terms of upregulated genes overlapping between DS and RD in Z141 and NY-17 were mainly enriched in “proline biosynthetic process (GO: 0006561)” and “proline metabolic process (GO: 0006560)” (Figure 3a, b). Moreover, except for amino acid biosynthesis and metabolism, abiotic stress-related GO terms e.g., “response to stress (GO: 0009650)” and “response to desiccation (GO: 0009269)”, exhibited significant enrichment among Z141 upregulated genes (Figure. 3a). Interestingly, GO terms related to flower development (GO: 0009908) were significantly enriched in only Z141 upregulated genes(Additional file 11, Figure 3a). Precocious flowering might be an important drought avoidance mechanism for species preservation when plants under stress[23, 24]. Therefore, this result may indicates that the drought avoidance mechanism of Z141 was activated. DS inhibits plant photosynthesis. In this study, the GO terms of photosynthesis (GO: 0015979) were significantly enriched in downregulated genes in Z141 and NY-17 under DS and RD (Additional file 11). Proline accumulation is one of the striking metabolic responses of plants to drought stress, it contributes to the redox balance of cells under stressful conditions[25]. Our study showed that proline biosynthesis genes were significantly up-regulated in linseed under drought stress.
The difference in linseed gene regulation patterns under DS and RD, suggests that under repeated DS, linseed may have different molecular mechanisms for drought tolerance. In order to verify this hypothesis, we performed GO enrichment analysis on 970 and 2485 DEGs that were specifically up-regulated in Z141 under DS or RD stress. Of the stress responsive GO terms, two distinct functional categories of specific DS upregulated genes in Z141 exhibited significantly higher enrichments, namely methylation and negative regulation. The first group included “histone H3-K36 demethylation (GO: 0070544)” and “macromolecule methylation (GO: 0043414)”, whereas the second group included “negative regulation of biological process (GO: 0048519)” and “negative regulation of macromolecule metabolic process (GO: 0010605)” (Additional file 11 and 12). The GO terms of upregulated genes in Z141 under RD were mainly enriched in “fatty acid oxidation (GO: 0019395)”, “fatty acid biosynthetic process (GO: 0006633)”, “fatty acid m metabolic process (GO: 0006631)” and “lipid metabolic process (GO: 0006629)” (Additional file 11). The GO terms of genes downregulated in only Z141 under DS were mainly enriched in “carbohydrate metabolic process (GO: 0005975)”, “lignin biosynthetic process (GO: 0009809)” and “lignin metabolic process (GO: 0009808)”, whereas under RD, the GO terms of genes downregulated in only Z141 were mainly enriched in “amide biosynthetic process (GO: 0043604)” and “cellular amide metabolic process (GO: 0043603)” (Additional file 11 and 12). Overall, these functional categories indicated that epigenetic modifications might play a crucial role in the DS response process, although the exact functions of these genes remain to be elucidated. Meanwhile, DS may induce the Z141 to shift from vegetative growth to reproductive growth.
Under DS, 1038 DEGs were specifically up-regulated in NY-17, and their GO terms of genes were mainly enriched in RNA regulation, including “RNA modification (GO: 0009451)”, “RNA processing (GO: 0006396)” and “ncRNA processing (GO: 0034470)” (Additional file 11). There were 1525 DEGs specifically up-regulated under RD, and the GO terms of genes upregulated only under RD were mainly enriched in “transmembrane transport (GO: 0055085)” (Additional file 11 and 12). The GO terms of 1379 specifically down-regulated DEGs in NY-17 under DS were mainly enriched in flavonoid biosynthesis (GO: 0009813). Interestingly, more than 3000 DEGs were specifically down-regulated in NY-17 under RD stress, and the GO terms of genes were similar to those in Z141 and were mainly enriched in “amide biosynthetic process (GO: 0043604)” and “cellular amide metabolic process (GO: 0043603)” (Additional file 11 and 12).
Comparison of Z141 and NY-17 transcriptomes reveals the molecular mechanism of linseed drought tolerance
Although the transcriptomes of Z141 and NY-17 are very similar in overall gene expression, a set of stress-responsive genes exhibited altered expression patterns specific to Z141 or NY-17 under DS, indicating that genes of distinguished functional categories could impact the drought tolerance of linseed. There were 1552 overlapping up-regulated genes between Z141 and NY-17 under DS, and the GO items were mainly enriched in two distinct functional categories, including proline biosynthesis and reproductive development. The proline biosynthesis category “proline biosynthetic process (GO: 0006561)”, “proline metabolic process (GO: 0006560)”, “glutamine family amino acid biosynthetic process (GO: 0009084)” and “glutamine family amino acid metabolic process (GO: 0009064)”, whereas the abiotic stress response category includeed “reproductive system development (GO: 0061458)” and “reproductive structure development (GO: 0048608)” (Additional file 13 and 14, Figure 5a). Under RD stress, 2957 DEGs were both up-regulated in Z141 and NY-17. The GO items of these genes were also mainly enriched in the proline biosynthesis category with “proline biosynthetic process (GO: 0006561)” and “proline metabolic process (GO: 0006560)”, and in the abiotic stress response category with “response to abscisic acid (GO: 0009737)”, and “response to desiccation (GO: 00009269)”, “response to acid chemical (GO: 0001101)” (Additional file 13 and 14, Figure 5b). The GO terms of downregulated genes overlapping between Z141 and NY-17 under DS and RD conditions were mainly enriched in functional categories related to photosynthesis (Additional file 12)
There were 1693 specifically up-regulated DEGs under DS in Z141, and the GO items of these genes were mainly enriched in “abscission (GO: 0009838)”, “defense response (GO: 0006952)” and “NADP biosynthetic process (GO: 0006741)” (Additional file 13 and 14), whereas under RD, the GO terms were mainly enriched in “jasmonic acid biosynthetic process (GO: 0009695)” and “jasmonic acid metabolic process (GO: 0009694)” (Additional file 13 and 14). The uniquely upregulated genes showed more enrichment in pathways closely related to plant drought resistance, such as jasmonic acid biosynthesis, abscission and NADP biosynthesis, than in other pathways.. In contrast, the GO terms for genes upregulated in NY-17 under DS were mainly enriched in the RNA regulation functional category with “ncRNA metabolic process (GO: 0034660)”, “ncRNA processing (GO: 0034470)”, and “tRNA processing (GO: 0008033)” terms (Additional file 13 and 14). Under RD, the GO terms for genes in only NY-17 were mainly enriched in “phenylpropanoid biosynthetic process (GO: 0009699)” and “phenylpropanoid metabolic process (GO: 0009698)” (Additional file 13 and 14).
Reduce and visualize GO (REVIGO) analysis
To remove the insignificant GO terms which p. adjust value >0.05 and visualize the GO difference between only Z141 and NY-17 genotypes, we submitted upregulated and downregulated enriched GO categories from Z141 and NY-17, respectively, with a false discovery rate (FDR)<0.05, respectively, to REVIGO analysis (Figure 5a, b). Graphical results revealed that highly significant biological process (BP) GO terms such as proline biosynthesis process (GO: 0006561), DNA recombination (GO: 0006310), reciprocal DNA recombination (GO: 0035825), response to desiccation (GO: 0009269) and response to stress (GO: 0006950) were upregulated in Z141 under DS. These GO terms are enriched in 6 main functional groups, namely, proline biosynthesis, response to desiccation, deoxyribose phosphate metabolism, calcium ion transport, reproductive process, and reproduction (Figure 5a). Although DEGs of proline biosynthesis (GO: 0006561), response to abiotic stimulus (GO: 0009628), and mismatch repair (GO: 0006298) were significantly upregulated in NY-17 under DS stress, more DEGs were enriched in RNA modification (GO: 0009451), RNA processing (GO: 0006396), and ncRNA processing (GO: 0034660). Therefore, the upregulated DEGs in NY-17 under DS were mainly enriched in RNA modification, anatomical structure homeostasis, ribosome biogenesis, protein refolding, reproductive system development, and reproductive process (Additional file 15).
The REVIGO analysis showed that the functional groups of enriched GO terms were more similar between Z141 and NY-17 under RD stress than under DS. The GO terms were mainly enriched in proline biosynthesis, response to stress, metal ion transport, and inorganic ion homeostasis. These functional groups are closely related to the response of plants to DS; however, in NY-17, the DEGs of leaf senescence (GO: 0010150) and ageing (GO: 0007568) were upregulated, and this result is consistent with the phenotype of NY-17 under RD stress (Additional file 15).
The downregulated GO terms in both Z141 and NY-17 under DS and RD stress were mainly involved in tetrapyrrole biosynthesis, photosynthesis, and light reactions ( Additional file 15, Figure 5b). This result is consistent with GO analysis and indicated that the effects of DS on the linseed aboveground parts mainly involved photosynthesis.
Functional analysis of DEGs using MapMan analysis
MapMan is a user-driven tool that projects large data sets onto diagrams of metabolic pathways and other processes. Therefore, in this study, we used it to explore the effects and changes induced under DS in linseed leaf tissues. We input data of specific BP DEGs that were co-upregulated or co-downregulated in Z141 and NY-17 under DS or RD stress and used the reference Lusitatissimum_200. m02. Figure 6 and additional file 16 shows an overview of Z141 and NY-17 up- and downregulated DEGs involved in metabolic pathways under DS and RD stress.
The results showed that of the Z141 and NY-17 DEGs that were up- or downregulated DEGs under DS stress, 1483 upregulated DEGs and 2478 downregulated DEGs were mapped, and of them, only 178 and 581 are visible in figure 6 and additional file 16. In contrast, of the Z141 and NY-17 DEGs that were up- or downregulated under RD stress, 2973 upregulated DEGs and 3581 downregulated DEGs were mapped; 400 and 723 of these are visible in Figure 6 and additional file 14. Consistent with the GO enrichment analysis, up- and downregulated DEGs were mainly enriched in similar functional groups and pathways by MapMan analysis.
It is evident from both GO enrichment and MapMan analysis that upregulated DEGs were mostly enriched in the glutamine family amino acid biosynthesis process (GO: 0009084) and proline biosynthetic process (GO: 0006561). The downregulated DEGs were mainly enriched in photosynthesis (GO: 0015979), light harvesting in photosystem I (GO: 0009768), and light harvesting (GO: 0009765). These terms are most likely to play an essential role in regulating DS in linseed.
PPI network analysis
To further explore the protein interactions during DS, we constructed a PPI network of all the up- and downregulated DEGs and identified them in linseed leaf tissues using the STRING program. For the upregulated DEGs, we identified two interaction subnetworks that were predicted from 43 nodes of proteins with a PPI enrichment p-value< 1.0e-16 at the medium confidence parameter level. In this network analysis, we identified RAD50 (DNA repair protein 50) interacting protein 1 (RIN-1) as a hub gene that interacted with proline biosynthesis and response to stress (Figure 7a). For the downregulated DEGs, there were 94 nodes of proteins with PPI enrichment (Figure 7b). Almost all of the nodes were concentrated on photosynthesis or related regulation networks. This result is completely consistent with the results of our previous analysis.
Identification of transcription factors (TFs) temporarily up- and downregulated in response to DS and RD
TFs have play irreplaceable roles in the response to various abiotic stresses by modulating target gene expression [26]. To understand the essence of regulatory processes during DS and RD treatment, a domain searching method was used to first predict TFs in Z141 and NY-17 on a whole-genome scale based on our identified non-redundant linseed unigenes. A total of 4,936 linseed TF genes distributed among 50 families were identified (Additional file 17) [27].
To profile a stress-responsive TF open reading frame collection (TFome) under DS and RD, we focused on TF genes exhibiting diverse expression patterns with stress changes, including continuous upregulated, continuous downregulated an early peak in expression and a late peak in expression. As a result, 1190 TFs distributed in 50 families were found to be differentially regulated in response to at least one stress. (Fold change ≥ 2 and FDR adjusted p-value < 0.01). Eleven TF families accounted for approximately half of the stress-responsive TF genes, including bHLH (9%), C2H2 (8%), NAC (8%), MYB (6%), ERF (6%), bZIP (5%), WRKY (5%) and MYB-related (4%) (Figure 8a)
Moreover, the 1190 TFs were further classified into 15 clusters according to their expression patterns by performing Mfuzz program analysis in R software. Clusters 5, 8,11 and 13 consisted of 387 TFs mainly upregulated by DS and RD, including DREB, HSF and NF-YA10, which have been confirmed to be key regulators of plant abiotic resistance pathways (Figure 8b and Additional file 18).
Candidate gene prediction
By considering the results of GO enrichment, MapMan, and PPI network analysis and gene annotations, we screened DS-responsive genes from the DEGs that have functions related to proline biosynthesis, response to stress, response to water, and cellular response to abiotic stimulus for candidate gene analysis. A total of 508 DEGs related to the above functions were screened for candidates for gene prediction analysis in Z141 and NY-17, respectively (Additional file 19, Table 2). P5CS gene family encodes 1- pyrrolin-5 - carboxylate synthase (P5CS), which is the key rate-limiting enzyme in plant proline biosynthesis [28]. Previous studies have shown that overexpression of members of the P5CS gene family can significantly increase the proline content in plant cells and improve the drought tolerance of plants [29]. Usually, the P5CS gene family of other plants has 2-4 members [28]. But in linseed, we have identified 8 members, and their expression patterns closely match with our repeated drought patterns (Additional file 19 and 20, Figure 9a). In addition, the expression level of most P5CS gene family members in Z141 was significantly higher than that in NY-17 under drought stress. (Figure 9a). P5CR gene family members encode the last enzyme in the plant proline biosynthesis pathway, overexpression P5CR gene will significantly improve the photosynthetic response of Arabidopsis under drought and high-temperature stress [30, 31]. In this study, we found that the expression level of P5CR gene, such as Lus10034453, in Z141 was significantly higher than that in NY-17 under drought stress, and its expression pattern also matched our drought treatment model (Figure 9b). Higher gene expression of P5CR family members may be conducive to proline accumulation. These results indicate that the members of the P5CS and P5CR gene families are closely related to drought tolerance in plants. Moreover, in this study, we found that some encoding dehydrin genes were also rapidly increased their expression levels under drought stress (Additional file 19 and 20). Previous studies have shown that overexpression dehydrin genes will enhance the drought tolerance of plants [32, 33]. Therefore, we hypothesize that proline, dehydrin, and DNA repair might play important roles in regulating drought tolerance in linseed. Hence, based on all the above analyses, 24 genes (including 8 P5CS gene family members, 2 P5CR gene family members, 8 DNA repair genes and 6 dehydrin -encoding genes) were considered the most likely candidate genes enabling drought tolerance (Additional file 19 and 20). However, further validation and verification are needed to check their actual roles in drought tolerance.
Validation of isoforms by RT-PCR
Expression analysis of differentially expressed functional candidate genes, associated with DNA repair, the MAPK signalling pathway, proline biosynthesis, and photosynthesis that were selected from transcriptome data, was validated by RT-PCR. The results (Figure 10a-d) demonstrated that transcript abundances of selected genes were consistent with the transcriptome analysis results, thereby validating the reliability of our annotated transcriptome data for future studies.