Sequencing，mapping and differentially expressed genes
To characterize the DGE profiles in the ABP and UNP segments of the flax stem, 16 DGE libraries were collected and used for RNA-Seq analysis (Fig.2A). Approximately 47.1～65.0 million raw reads were obtained from these 16 libraries. More than 90% of the reads were mapped to the reference database for the 16 libraries, except UNP30_2. A total of 43,471 genes were detected in all of 16 samples. The RPKM values for each gene in each sample are listed in Table S2. The correlation analysis showed that the correlation between ABP30 and ABP50 was closer than that between ABP30 and UNP30, as well as between ABP30 and UNP50. The correlation between UNP30 and UNP50 was closer than that between ABP30 and UNP30, as well as between ABP30 and UNP50. This indicated that different genes regulate stem growth with different stem regions which above or below the snap point in the same stage. However, the same genes were involved in the growth of the same stem regions at different stem developmental stages. The correlation coefficients of four biologically independent replicates were all above 0.90, except UNP30_1 and UNP30_2, as well as ABP30_1 and ABP30_4, which were 0.899 and 0.829, suggesting that the expression levels of genes in the four replicate libraries were similar and suitable for downstream analysis (Fig.2B).
To identify genes that were differentially expressed in the four tissues of flax, pairs of DGE profiles of the sixteen libraries were compared to analyze gene expression variations. Genes that showed significant differences in expression were identified between the pairwise comparisons of the ABP30 vs. ABP50, ABP30 vs. UNP30, ABP50 vs. UNP50, and UNP30 vs. UNP50 (Fig.2C and Fig.2D). A total of 8639 DEGs were detected across all four sample types. Among of them, 92% of DEGs were detected in ABP30 vs. UNP30 and UNP30 vs. UNP50. For ABP30 vs. UNP30, there were 2003 up-regulated genes and 1698 down-regulated DEGs. For UNP30 vs. UNP50, there were 2791 up-regulated genes and 1631 down-regulated DEGs. The data suggested that UNP30 had great differences in gene regulation with both ABP30 and UNP50, indicating significant differences between genes that regulate stem regions below the snap point in different stem development stages, including fiber development stage and fiber maturity stages. In the fiber development stage, there are also significant differences between genes that regulate stem regions above the snap point and genes that regulate regions below the snap point. Only 18 up-regulated genes and 12 down-regulated DEGs were detected in ABP30 vs. ABP50, indicating that there are much fewer gene regulating differences than regions below the snap point in different stem development stages. Only 230 up-regulated genes and 256 down-regulated DEGs were detected in ABP50 vs. UNP50, indicating that there are much fewer differences between genes that regulate stem regions above the snap point and genes that regulate regions below the snap point in the fiber mature stage than in fiber development stage. All of detected DEGs with their functional annotations are listed in Table S3.
Most abundant genes in fiber development and maturing stage
For pairwise comparisons of the ABP30 and UNP30, the most abundant transcripts in the stem tissue above the snap point when plant at 30 cm were those of the genes encoding rhamnogalacturonate lyase (Lus10011565.g), 1-aminocyclopropane-1-carboxylate synthase 8 (Lus10002795.g), fasciclin-like arabinogalactan protein (Lus10002984.g, Lus10036112.g, and Lus10036113.g), leucine-rich repeat extensin-like protein 4 (Lus10033672.g), peroxidase (Lus10020994.g and Lus10009932.g), CASP-like protein 1D1 (Lus10012808.g), protein SRG1(Lus10015252.g), and MLP-like protein 328 (Lus10008932.g). For pairwise comparisons of the UNP30 vs. UNP50, the most abundant transcripts in stem tissue above the snap point when plant at 30 cm (fiber development stage) were those of the genes encoding probable protein kinase At2g41970 (Lus10000773.g), RPW8-like protein 3 (Lus10000836.g), 1-aminocyclopropane-1-carboxylate oxidase 5 (Lus10000857.g), subtilisin-like protease SBT1.6 (Lus10002397.g), cytochrome P450 82C4 (Lus10003148.g), and MLP-like protein 423 (Lus10003451.g). Some genes were highly expressed in the fiber development stage compared to the fiber mature stage but with unknown proteins, such as Lus10000527.g, Lus10000834.g, Lus10000835.g, Lus10002396.g, and Lus10008019.g (Table 1).
For pairwise comparisons of the ABP50 and UNP50, the most abundant transcripts in stem tissue above the snap point when plants were 50 cm were those of the genes encoding MLP-like protein 328 (Lus10008930.g), ethylene-responsive transcription factor ERF022 (Lus10002953.g), cytokinin hydroxylase (Lus10033555.g), high-affinity nitrate transporter 2.4 (Lus10026527.g), cytochrome P450 83B1 (Lus10034505.g), subtilisin-like protease SBT1.7 (Lus10006307.g), dehydration-responsive element-binding protein 1D (Lus10027412.g), and 7-deoxyloganetin glucosyltransferase (Lus10032218.g). Two genes, Lus10015036.g and Lus10041692.g were highly expressed in stem tissue below the snap point with unknown proteins. For pairwise comparisons of the UNP30 vs. UNP50, the most abundant transcripts in stem tissue above the snap point when plants at 50 cm (fiber maturing stage) were those of the genes encoding 36.4 kDa proline-rich protein (Lus10010482.g), transmembrane protein 205 (Lus10000723.g), auxin-responsive protein SAUR21 (Lus10008992.g), transmembrane protein 205 (Lus10021473.g), BTB/POZ and TAZ domain-containing protein 1 (Lus10033038.g), anthranilate N-methyltransferase (Lus10002668.g), Syntaxin-71 (Lus10007411.g) and short-chain dehydrogenase TIC 32, chloroplastic (Lus10025321.g). Two genes, Lus10015276.g and Lus10033722.g were highly expressed in the fiber mature stage compared to the fiber development stage but with unknown proteins (Table 2).
Gene ontology and metabolic pathway enrichment analysis of differentially expressed genes
Using GO analysis, this study classified differentially expressed genes based on enriched GO terms in the biological process, cellular component, and molecular function ontologies. The top 30 significantly enriched categories for the DEGs are listed in Fig. 3. For the stem tissue above the snap point vs. other tissues, cellular components enriched fewer DGEs then the other two ontologies. At the fiber development stage, most of the cellular component genes were enriched above the snap point (ABP30 vs. UNP30). However, at the fiber maturation stage, genes which involved in membrane part, integral component of membrane, intrinsic component of membrane, and intracellular are significantly different between stem tissues above and under the snap point (ABP50 vs. UNP50). For different fiber development stages, most DGEs were enriched in metabolic process, primary metabolic process, catalytic activity, and organic substance metabolic process, and fiber development stage enriched more DEGs than fiber maturing stage for each ontology (UNP30 vs. UNP50). Furthermore, it was found that some cellulose synthesis GO terms are specific in stem tissue of fiber development stage (UNP30), including cellulose microfibril organization (GO:0010215: Lus10034670.g and Lus10017863.g), galactosyl transferase activity (GO:0008378: Lus10023955.g, Lus10021538.g, Lus10039657.g and Lus10027185.g) (Table S4).
Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis was performed to identify major or specific metabolic pathways and their underlying genes in the fiber development stage (Fig. 4). The term “Biosynthesis of secondary metabolites” was significantly enriched in all of the stem tissue above the snap point vs. other tissues, including over 500 DEGs underlying ABP30 vs. UNP30, UNP30 vs. UNP50, and ABP50 vs. UNP50. In particular, the pathways of “Phosphatidylinositol signaling system,” “Zeatin biosynthesis,” “Glycosaminoglycan degradation,” and “Ubiquitin mediated proteolysis” were significantly enriched in the stem tissue above the snap point. Galactose metabolism pathway is specifically enriched in stem tissue above the snap point in fiber development stage, which include 8 genes (Lus10003875.g, Lus10001822.g, Lus10043044.g, Lus10002740.g, Lus10007370.g, Lus10001529.g, Lus10034218.g, Lus10020788.g) (Table S5), suggesting genes encoding β-galactosidase participated in the early fiber synthesis.
Differentially expressed genes involved in the cellulose biosynthesis
Cellulose is synthesized by large multi-meric cellulose synthase (CesA) complexes, tracking along cortical microtubules at the plasma membrane (Fig. 5A). Recent studies have identified the Korrigan (KOR), sucrose synthase (SuSy), and COBRA-like proteins are indirectly involved in cellulose biosynthesis. According to GO function, this study determined the DEGs involved in the pathway of cellulose synthase activity, cellulose biosynthetic process, cellulose metabolic process, cellulose binding, and cellulose microfibril organization. Most of these genes were highly expressed at the fiber development stage (UBP30), comparing with those at the fiber maturing stage (UBP50) or stem tissues above the snap (ABP30 and ABP50), except Lus10034665.g and Lus10020414.g, which encoding Alpha-1,4-glucan-protein and Phosphoglucosamine mutase (Fig. 5B).
Analysis of CESA and CSL genes in fiber cell wall formation stage
Among the 8639 up-regulated and down-regulated DEGs detected across all four types of samples, 15 genes encoding cellulose synthase proteins or cellulose synthase-like proteins were found. These genes were found in pairwise comparisons of ABP30 vs. UNP30 and UNP30 vs. UNP50. Pairwise comparisons of ABP30 vs. ABP50 and ABP50 vs. UNP50 did not reveal genes related to cellulose synthase. In pairwise comparison of ABP30 vs. UNP30, one probable cellulose synthase A catalytic subunit 3 gene (Lus10022449.g) was found to be expressed at significantly higher levels in stem tissue above the snap point than that of below the snap point. In contrast, five cellulose synthase genes, including two CESA4 genes (Lus10008225.g and Lus10008226.g), one CSLD4 gene (Lus10026568.g), and two CESA8 genes (Lus10007296.g and Lus10029245.g), were found to be expressed at significantly higher levels in stem tissue below the snap point than in those above the snap point. In pairwise comparison of UNP30 vs. UNP50, seven cellulose synthase genes, including two CESA8 genes (Lus10007296.g and Lus10029245.g), one CESA4 gene (Lus10008226.g), one CSLE1 gene (Lus10016625.g), two CSLG3 genes (Lus10023056.g and Lus10023057.g), and one CSLD4 gene (Lus10026568.g) were found to be expressed significantly higher in stem tissue with fiber development stage than in the fiber mature stage. In contrast, two CESA6 genes (Lus10006161.g and Lus10041063.g) were found to be expressed at significantly higher levels in stem tissue at the fiber mature stage than at the fiber development stage. To confirm the results of expression profiling by RNA-Seq analysis, ten transcripts of cellulose synthase genes were used as targets for quantitative real-time RT-PCR analysis, including two CESA4 genes (Lus10008225.g and Lus10008226.g), two CESA6 genes (Lus10006161.g and Lus10041063.g), two CESA8 genes (Lus10007296.g and Lus10029245.g), two CSLG genes (Lus10023056.g and Lus10023057.g), CSLE1 (Lus10016625.g), and CSLD4 (Lus10026568.g). The results showed that the expression patterns of two CESA4 genes (Lus10008225.g and Lus10008226.g), two CESA8 genes (Lus10007296.g and Lus10029245.g), CSLE1 (Lus10016625.g), and CSLD4 (Lus10026568.g) were determined by qRT–PCR fitted in well with that of RNA-seq analysis (Fig. 6A). Furthermore, these six genes were highly expressed in stem tissue at the fiber development stage than in the other three samples, suggesting that these genes were enriched during the fiber cell wall formation stage (Fig. 6B).