3.1 Growth performance
The growth performance of each group at different days were shown in Fig.1A. At the beginning of the experiment (0 days), no significant difference was found between the control and DPC groups (P>0.05). However, the sugarcane heights on days 3, 6, and 12 as well as that of mature sugarcane, were significantly higher in the control than in the DPC groups (P<0.05) (Fig.1B). Contrary to the sugarcane height, the growth rates of DPC groups were significantly lower on days 3, 6, and 12 when compared to the control (P<0.05) (Fig.1C). Moreover, all the internodes were significantly longer in the control group (Fig.1D).
3.2 Full-length transcriptome of sugarcane
To generate a high-accuracy reference for read mapping data, full-length mRNA sequencing was performed using the PacBio Sequel platform on internodes from mature sugarcane. A total of 17 billion raw reads were obtained. The average length was 2,718 bp and N50 was 3,011 bp. After circular-consensus sequence (CCS) extraction, 428,444 reads were identified. Among these reads, 348,840 (81.42%) were full-length reads containing 5′ adaptors, poly(A) tail signals, and 3′ adaptors. Meanwhile, 999 million full-length non-chimeric (FLNC) reads with an average length of 2,906 bp were identified. These FLNC reads from the cDNA library contain repetitive isoforms that provide data for analysis of isoforms by alignment and assignment to different clusters. The present full-length transcriptome generated 72,671 isoforms. Of these, the average length was 2,888.94 bp and the N50 was 3,073 (Additional file 2).
The isoforms were annotated by aligning the protein and nucleotide databases. In total 69,803, 56,843, 47,438, and 30,240 isoforms were annotated from nr, Swissport, KOG, and KEGG, respectively. Combining these results, a total of 69,867 isoforms were annotated (Additional file 3). The isoforms were also aligned to different species. The five species with the most hit sequences were Saccharum spontaneum, Setaria italica, the Oryza sativa Japonica group, Dichanthelium oligosanthes, and Sorghum bicolor. In addition to this, these isoforms were annotated by GO terms assigned to three categories: biological process (50,805 isoforms), cellular component (32,922 isoforms), and molecular function (26,696 isoforms). In the biological process category, metabolic process (13,462 isoforms) and cellular process (12,836 isoforms) were the two most functional terms. Cell (7,598 isoforms) and cell parts (7,597 isoforms) were the two most functional terms in the cellular component category, while in the molecular function category, catalytic activity (13,086 isoforms) and binding (11,642 isoforms) were the two most functional terms (Fig.2C).
3.3 DEGs by DPC treatment
The 150 pair-end reads were obtained for DEG analysis. In total, 1,404,530,300 raw reads were generated from 18 cDNA libraries using the Illumina HiSeq 4000 platform. After trimming the adaptor and removing the low-quality reads, 1,380,323,402 (98.28%) reads were retained as high-quality clean reads. These clean reads were mapped to the reference as the full-length transcriptome. The mapping ratios for the 18 cDNA libraries ranged from 73.97% to 83.78%. Using these data, the normalized expression data were calculated and normalized gene expression was analyzed by PCA (Fig.3A). Two clusters were clearly defined by PCA, which contained the DPC group and control for each cluster. The first principal component, PC1, summarized 30.7% of the whole variability and discriminated samples according to the treatment. The second principal component, PC2, and the third principal component, PC3, summarized 25.1% and 17.4% of the whole variability and discriminated samples, respectively. The DEG analysis showed that the comparison between C2 and D2 groups had the most DEGs (a total of 6,012 genes, which contained 3,227 upregulated genes and 2,785 downregulated genes). D1 showed more upregulated genes compared to D2 and D3 groups, while less downregulated genes were found in D1 than in D2 and D3 groups. In addition, most DEGs in C2-vs-D2, C1-vs-C2 (2,895 DEGs), and D1-vs-D2 (3,157 DEGs) also showed a large number of differentially expressed genes (Fig.3B).
3.4 Functional analyses of DEGs between C2 and D2 groups
To illustrate the functions of the DEGs after DPC treatment, GO enrichment and KEGG enrichment analyses of the comparison of C2 and D2 with the most DEGs were performed. The upregulated and downregulated genes were annotated in 29 and 37 GO terms, respectively (Fig.4A,B). The GO enriched terms with the four most upregulated genes were DNA metabolic process, negative regulation of biological process, regulation of translation, and regulation of cellular amide metabolic process. Meanwhile, the GO enriched terms with the two most downregulated genes were single-organism transport and single-organism localization (Additional file 4). KEGG enrichment analysis showed that 17 and 30 pathways were enriched in the upregulated and downregulated genes, respectively (Fig.5A,B). Either for the upregulated genes or downregulated genes, metabolic pathways and biosynthesis of secondary metabolites were the top two enrichment KEGG pathways with the most genes. Among the upregulated genes, 55 were found to increase in the plant hormone signal transduction pathway. Meanwhile, phenylpropanoid biosynthesis, flavonoid biosynthesis, favone and flavonol biosynthesis, and glucosinolate biosynthesis were enriched in the downregulated genes (Additional file 5). These KEGG pathways were associated with the growth and development of internodes.
3.5 WGCNA and hub genes
The WGCNA divided the genes into 36 modules (Fig.6). Based on the identification of DEGs, we focused on the D2 group. This group contained significant gene expression changes, which is the crucial stage for internode elongation. We found that sienna3 was the module that most significantly correlated with the D2 stage (p=1e-4) (Additional file 6) (Fig.7). The sienna3 module contained 33 genes and the top three hub genes, namely Stf0 sulfotransferase, cyclin-like F-box, and HOX12, were identified in this module. These three hub genes correlated with 30 genes (Additional file 7) (Fig.8).
3.6 Validation of RNA-seq result
qPCR was used to validate the RNA-seq results. Randomly, nine genes were selected for the analysis. Except for GID2 and PBS1, the other six tested genes, GA2OX1, GID1, MPK4, CML49, PRPF8, and ACO2, showed similar qPCR results to those of the RNA-seq. Moreover, the expression trend of six out of eight genes from qPCR and RNA-seq was highly consistent, indicating that the majority of genes had the same tendency (Fig.9). The three hub genes, Stf0 sulfotransferase, cyclin-like F-box, and HOX12, were also analyzed by qPCR, and the results were similar between both qPCR and RNA-seq (Fig.10). These results showed the high reliability of the RNA-Seq data.