Clustering based on gene expression profiles grouped samples in accordance with their phenotypic measures, but also revealed differences within the groups. A higher BCV when contrasting groups was expected because we used different genotypes as replicates of the same group (Fig. 1). The two within-group contrasts are relevant to capture differences between hybrids and wild genotypes that present similar phenotypes. Previously, using SSR genotyping of a subset of the Brazilian Panel of Sugarcane Genotypes, TUC71-7 and SP80-3280 were assigned to the same subpopulation, RB72454 and RB855156 to another and, separately, White Transparent and IN84-58 to the two remaining subpopulations (23). Indeed, the third dimension of the multidimensional scaling based on gene expression showed that SP80-3280 clustered apart from RB72454 (Fig. 1 in Additional file 3). We hypothesize that the lower number of DEGs in the low biomass group reflects sugarcane breeding, because this group is formed by S. officinarum and hybrids.
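As a brief aside on the BCV mentioned above, the following is a sketch under the assumption that BCV denotes the biological coefficient of variation of a negative binomial count model, as is common in RNA-seq analyses. This section does not spell out the definition, and the symbols y_gi (count of gene g in sample i), mu_gi (its expected value) and phi_g (the dispersion of gene g) are introduced here only for illustration.

```latex
\operatorname{Var}(y_{gi}) = \mu_{gi} + \phi_g\,\mu_{gi}^{2},
\qquad
\mathrm{CV}^{2}(y_{gi}) = \frac{1}{\mu_{gi}} + \phi_g,
\qquad
\mathrm{BCV}_g = \sqrt{\phi_g}
```

Under this reading, treating different genotypes as replicates of one group folds genotype-to-genotype differences into the dispersion, which is consistent with the higher BCV expected here.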
The position of accession US85-1008 between the biomass groups also seemingly reflects the sugarcane breeding history, because this hybrid diverged from the high biomass genotypes more than S. officinarum (Criolla Rayada and White Transparent) did from commercial hybrids. Furthermore, the high biomass group included US85-1008 and accessions of two ancestral species: S. spontaneum and S. robustum. Samples of the S. spontaneum SES205A were grouped apart, possibly reflecting the diversity within the subpopulations of this species (8). The wild sugarcane genotypes of the high biomass group showed substantial differences in their expression profiles and we did not find any evidence of kinship among them in the scientific literature. Wild genotypes, particularly those of S. spontaneum, have specific alleles that make them a source of variability for sugarcane breeding. Based on SSR markers, IN84-58 showed more species-specific fragments than Badila and Ganda Cheni (S. officinarum and S. barberi genotypes, respectively) (23). Also, IN84-58 showed a similar expression profile to IJ76-318, an S. robustum accession. In fact, Ferreira and colleagues (24) concluded that S. spontaneum and S. robustum can have similar expression patterns and group together, separately from S. officinarum or a hybrid accession.
Transposition-associated terms were enriched among DEGs both for between- and within-group comparisons. Phylogenetically close species have different transposable element (TE) families and differ in the number of TEs in the genome (25). Saccharum species have a high number of TEs, mainly Long Terminal Repeat (LTR) retrotransposons (26, 27). We suggest that the differential expression of TEs was likely due to the genome differences among the genotypes compared in each contrast. The contrast between S. officinarum and hybrids showed less differential expression of transposition-related genes than the comparisons between groups or between US85-1008 and the other high biomass genotypes (Fig. 6 in Additional file 3). This may partly be explained by the higher contribution of the S. officinarum genome in hybrids and by large differences between the genomes of the wild canes. This is reinforced by the observation that the divergence between S. officinarum and S. spontaneum is partially due to the expansion of two TE families in S. officinarum (28). TEs may show restricted expansion in specific genomes, such as certain families of miniature inverted-repeat transposable elements (MITEs) that proliferated specifically in individual T. aestivum subgenomes (29). Moreover, the activity of TEs resulting from polyploidization is analogous to the induction of TEs promoted by stresses (25), a form of genomic shock (30, 31), which is a well-described phenomenon in allopolyploids (32). We can conclude that differences in transposition found within the low biomass group were largely due to variation between commercial hybrids and White Transparent, similar to the observation when contrasting S. officinarum to the cultivar RB867515 (24).
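To make the notion of a term being "enriched among DEGs" concrete, the following is a minimal sketch of a one-sided over-representation test based on the hypergeometric distribution. It is an illustration only: this section does not state which enrichment test or software was used, and the function name, parameter names and gene counts below are hypothetical.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric probability P(X >= hits).

    population: number of genes tested (hypothetical background size)
    annotated:  background genes carrying the term of interest
    sample:     number of DEGs in the contrast
    hits:       DEGs carrying the term of interest
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of observing `hits` or more annotated genes
    # when drawing `sample` genes at random without replacement.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts, not taken from the study:
    # 20,000 genes tested, 300 annotated with a transposition term,
    # 1,000 DEGs, 40 of which carry that term (15 expected by chance).
    p = enrichment_p_value(20000, 300, 1000, 40)
    print(f"P(X >= 40) = {p:.3e}")
```

In practice such p-values are computed for many terms at once and corrected for multiple testing, a step this sketch leaves out.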
Polyploidy creates an imbalance in the nucleotide pool, causing genomic stress in the cell and triggering non-additive expression of genotype-specific responsive genes and other stochastic differences (33, 34). Along with transposition, we noted enriched defense-associated terms when comparing both biomass groups (Fig. 4 and Fig. 5 in Additional file 3). There is evidence that proteins involved in basal metabolism can be more active during stresses. For instance, Ferreira et al. (24) hypothesized that upregulation of histone genes in a hybrid genotype arose from changes in epigenetic control caused by the genomic stress of hybridization. Carson and colleagues (14) evaluated gene expression in sugarcane leaves and found, among many functions, genes coding for proteins responsible for the maintenance and control of cellular metabolism, as well as transport and stress responses. Not only does ploidy regulate these responses, but genes coding for resistance proteins were also upregulated in culms to protect against the stress caused by increased sugar levels in sucrose-rich genotypes (35). Genotypes in the high biomass group differed in their response to oxidation-reduction, presenting changes in genes whose products are associated with detoxification. Glutathione transferases, involved in detoxification, display gene classes occurring in tandem in plant genomes, coding for enzymes acting on a wide range of substrates (36). Previously, higher expression levels of transcripts related to glutathione-S-transferase were observed in a fiber-rich genotype (18).
The co-expression analysis complemented the enrichment tests based on sets of DEGs. Genes associated with transposition formed two clusters of co-expressed genes, being highly expressed in hybrids and S. officinarum (Table 3 and Fig. 12 in Additional file 3). The machineries of replication, transcription, translation and regulatory mechanisms were enriched with similarly expressed genes. Our differential expression analysis involved leaf samples, but no carbon assimilation terms were enriched among DEGs. Interestingly, genes whose products are involved in this process were grouped in a co-expressed module (Table 3 in Additional file 3). Depending on the contrast assessed, pathway analysis showed changes in specific photosynthesis processes, such as C4/CAM photosynthesis and photorespiration (Fig. 13, Fig. 15 and Fig. 16 in Additional file 3). Recently, Singh and colleagues (19) detected upregulation of almost all photosynthesis-related coding genes in high biomass genotypes. As a C4 grass, sugarcane photosynthesis includes a pathway to obtain a four-carbon compound, a process that occurs in the mesophyll and is orchestrated by PEPC. In agreement with Verma and colleagues (37), we noted that high biomass genotypes may require a more intense expression of PEPC coding genes to support metabolic functions other than sucrose accumulation. Expression of PEPC genes was lower in young leaves associated with maturing culms but was practically invariable in leaves connected with more mature stalks (37). In addition, a group of photophosphorylation genes coding for Psa, Psb and cytochrome proteins formed a downregulated cluster in low biomass genotypes (Fig. 14 in Additional file 3). The module with photosynthesis co-expressed genes was also enriched with terms related to the responses to four hormones: abscisic acid, cytokinin, ethylene and gibberellin. DEGs annotated with hormone responses inside this co-expression module were downregulated in S. spontaneum (Fig. 10 in Additional file 3). In fact, Singh and colleagues (19) noted that low fiber sugarcanes showed upregulation of genes involved with responses to auxin, jasmonic acid, salicylic acid, abscisic acid and ethylene.
Genes coding for enzymes involved in sucrose synthesis, breakdown and transport have been previously studied in different phenological stages of sugarcane culm development (38) and between varying (groups of) genotypes (17, 18, 35). The pioneering transcriptome studies in sugarcane addressed gene expression in leaves or leaf rolls (13, 14). Analysis of tissue-specific expression enabled the detection of functions in leaves and culms (14). Synthesis of sucrose occurs in sugarcane leaves, followed by its transport through phloem to be stored in stalk parenchyma cells (21). Clearly, sucrose storage is higher in the hybrids and S. officinarum clones analyzed herein (Table 1 in Additional file 1). In leaves, higher expression of SPS and SPP coding genes in the low biomass group may indicate that the stalk of these genotypes requires more sucrose. They also showed an upregulated gene coding for Cell Wall Invertase (CWINV), an enzyme that hydrolyzes sucrose and allows the apoplastic entry of hexoses into the stem parenchyma cell (21). However, CWINV overexpression can promote monomer accumulation in leaves, impairing carbohydrate storage and affecting growth, as described in cassava (39).
SPS and CWINV have been shown to be highly expressed in sugarcane before maturation of culms, precisely to allow the development of leaves and to compensate for sucrose storage requirements in sink tissue (37). These authors also pointed out that genes coding for enzymes such as PEPC and SUT1 can show stable or increased expression levels in more mature leaves. Our data show that, in +1 leaves, genes coding for SUT4 were upregulated in hybrids and S. officinarum. However, the SUT1 coding gene was downregulated in the low biomass group but had a higher overall expression level than SUT4 (Fig. 2), which makes it difficult to determine which SUTs are more relevant to sucrose accumulation. A gene coding for the SWEET14 protein was described as repressed in S. officinarum and S. spontaneum (24), but we found a SWEET14 gene repressed in the low biomass group, with no evidence of differential expression within this group. We believe that genes coding for sugar transporter proteins or sucrose transporter families may be differentially expressed in a genotype-specific manner (Fig. 24-B in Additional file 3).
Carbohydrate metabolism in culms also includes gene products from members of the SuSy family. When differentially expressed in a given contrast, SuSy coding genes were always upregulated in genotypes with the higher sucrose level (Fig. 2). One DEG was also detected in the two other contrasts; two other DEGs coding for SuSy were upregulated in US85-1008 (Fig. 23 in Additional file 3). In contrast to its common role in stems, SuSy can synthesize sucrose from the reducing sugars present in leaves. Hoffmann-Thoma and colleagues (40) found a higher SuSy activity than SPS in 60 and 90-day expanded leaves. In the same experiment, they found that the content of hexoses was higher than sucrose and that SPS was more active than SuSy in older leaves (2 through 7). In leaf rolls, a low sucrose breakdown/synthesis ratio indicates that SuSy contributes to sucrose synthesis in young sugarcane tissues (15). Immature leaf rolls, internodes one to six and roots showed higher expression of SuSy1 than leaves (41). The same study, however, revealed a highly expressed SuSy2 gene in immature and mature leaf lamina. The five DEGs coding for SuSy identified with Mercator showed low average expression levels in our study (Table 4 in Additional file 3), three of them being upregulated in low biomass genotypes. Thirugnanasambandam and colleagues (16) noted that the expression levels of four SuSy genes in leaves were lower than in other tissues, regardless of genotype. Although SuSy is possibly synthesizing sucrose, we also stress the importance of SPS for sucrose synthesis in the low biomass group (Fig. 23 in Additional file 3).
Genes coding for proteins of the lignocellulose pathways were upregulated in high biomass genotypes. Expansins are a class of proteins that can modify the structure of the cell wall, promoting its expansion (42). The sugarcane genome has roughly ninety expansin-coding genes, mostly from the families α and β (43). In Poaceae, β-expansin members act on the matrix polysaccharides, loosening the cell wall (42). In our study, the high biomass group showed higher expression of expansin genes, possibly promoting the development of the leaf. Because structures of the sugarcane top are relevant as biomass sources for energy cane, leaf growth is a desirable trait. Moreover, wild high biomass canes displayed higher expression of expansins α-2, β-11 and β-3, which can be explored as candidate genes in other functional genomic studies. More directly related to the cell wall, many genes coding for enzymes that assemble polysaccharides were upregulated in the high biomass genotypes. We identified genes coding for xylosyltransferases, arabinosyltransferases and fucosyltransferases (Fig. 21 in Additional file 3), which are glycosyltransferases involved in the biosynthesis of xyloglucan in the Golgi stacks (22). Loss of function in a xylosyltransferase coding gene led to higher saccharification in mutant rice plants, facilitating xylan extraction (44).
Sugarcane genotypes rich in biomass have a higher content of cellulose, hemicellulose and lignin, to the detriment of sucrose content (45). Clustering of sugarcane genotypes based on similar biomass and sucrose accumulation traits (see Fig. 2 in Additional file 1) was confirmed by gene expression (Fig. 1). The high biomass group contained mainly wild genotypes, while the low biomass group was represented by S. officinarum and hybrids. The high biomass hybrid US85-1008 is the offspring of a wild female parent (an unknown S. spontaneum), while the low biomass hybrids have other hybrids as female parents (23, 46, 47). Moreover, the low biomass hybrids we studied are all genetically related, with varying degrees of relatedness. This distinct variability within each of the two groups reflects the genomic differences of the accessions (Fig. 1 in Additional file 3). Leveraging wild genotypes in sugarcane breeding can be useful to expand the narrow genetic base of this crop (46, 48), making it possible to develop cultivars with adequate biomass-associated traits, addressing the current limitations in the field and industry. Obstacles to sucrose accumulation must also be taken into account, because energy canes must be efficient in both biomass and sugar yields (3).