Transcriptome sequencing and gene expression profiles
To study the gene expression changes in the dedifferentiation of Eucalyptus, we analyzed the transcriptome data of differentiated (stem) and dedifferentiated (callus) tissues from two E. camaldulensis and E. grandis (stem of E. camaldulensis: A1, callus of E. camaldulensis: A2, stem of E. grandis x urophylla: B1, callus of E. grandis x urophylla: B2). Table 1 showed the overview of transcriptome sequencing and the numbers of genes identified in this study. Initially, we obtained 528 million reads for all 12 samples (in triplicates, n=3) and each individual replicate sample had 44 million reads on average. Then, we aligned the reads to the E. grandis genome. It showed 64.66% to 74.84% of the total reads were mapped in the samples with 31.3 million mapped reads on average. Out of the total 36,349 E. grandis genes, we identified 18,777 to 20,240 genes with TPM > 1 in the samples (Additional file 1). If we set the TPM cut-off to 5, only 12,650 to 14,464 genes were identified. Distribution of genes with different expression levels (Figure 1A) showed that more than 60% of the genes were lowly expressed (TPM < 5) and only 0.165% to 0.272% of the Eucalyptus genes were expressed higher than 1,000 TPM. Interestingly, highly expressed genes varied in differentiated and dedifferentiated tissues of Eucalyptus. Figure 1B showed comparison of top 10 genes in all samples according to the average TPM. Dedifferentiated tissues of both E. camaldulensis and E. grandis x urophylla shared 8 genes including Eucgr.H01085 (ethylene-responsive transcription factor ERF071), Eucgr.G03106 (wound-induced protein 1) and Eucgr.F00114 (zinc finger protein ZAT10). We next compared the genes identified in the samples (TPM > 1) and found 10,374 genes shared (Figure 1C). Also, 628, 459, 671 and 580 genes were specifically identified in A1, A2, B1 and B2, respectively. Based on the gene expression, we analyzed the correlation between samples. It was revealed that replicates were performed well, and differentiated tissues were distinguishable from dedifferentiated tissues in E. camaldulensis and E. grandis x urophylla (Figure 1D).
DEGs in E. camaldulensis
We first identified DEGs in the dedifferentiated tissue of E. camaldulensis compared to the dedifferentiated tissue using edgeR and identified 4,690 up-regulated as well as 4,539 down-regulated genes (Figure 2A, Additional file 2). It showed that Eucgr.H02264 (probable indole-3-acetic acid-amido synthetase GH3.1), Eucgr.D02625 (phosphoribulokinase, chloroplastic), Eucgr.J03055 (hypothetical protein), Eucgr.H02600 (protein SRG1) and Eucgr.B03016 (LOB domain-containing protein 40) were the top 5 up-regulated genes while Eucgr.J00794 (DNA-damage-repair/toleration protein DRT100), Eucgr.I02271 (endochitinase), Eucgr.A01080 (glycine-rich RNA-binding protein), Eucgr.D00703 (beta-galactosidase 8) and Eucgr.A01881 (trans-resveratrol di-O-methyltransferase) were the top 5 down-regulated genes, according to the FDR values (Additional file 2). We next analyzed the DEGs enriched KEGG pathways and found that “plant hormone signal transduction” (ko0475), “RNA transport” (ko03013) and “carbon metabolism” (ko01200) were the top 3 significant pathways (Additional file 3). We also analyzed the GO enrichment by the DEGs in DH201-2 (Figure 2B). Except the GO terms enriched by more than 10% of the total DEGs, we also found “reproduction” and “reproduction process” of interest.
Then, we analyzed some gene groups might be involved in the dedifferentiation process of E. camaldulensis. We identified 27 up-regulated and 22 down-regulated genes related to ethylene (Table 2), including 7 AP2-like ethylene-responsive TFs, 41 ethylene-responsive TFs and 1 ethylene response sensor (Additional file 2). Among the DEGs there were 25 up-regulated and 16 down-regulated genes encoding auxin related products (Table 2), including 7 auxin response factors, 3 auxin-binding proteins, 2 auxin-induced in root clusters proteins and 16 auxin-responsive proteins (Additional file 2). Furthermore, we found 111 (15 up-regulated and 86 down-regulated), 79 (25 up-regulated and 44 down-regulated), 36 (23 up-regulated and 13 down-regulated), 39 (27 up-regulated and 12 down-regulated) and 272 (155 up-regulated and 117 down-regulated) DEGs, whose products related to RP, ZFP, HSP, histone and TF, respectively (Table 2). Notably, 8 up-regulated and 3 down-regulated embryogenesis related genes were identified in the dedifferentiated tissue of E. camaldulensis compared to the differentiated tissue, such as late embryogenesis abundant protein, somatic embryogenesis receptor kinas 1 and embryogenesis-associated protein EMB8 (Additional file 2).
DEGs in E. grandis x urophylla
We next analyzed the 4,200 up-regulated and 4,708 down-regulated genes in the dedifferentiated tissue of E. grandis x urophylla compared to the differentiated tissue (Figure 2C). According to the FDR, top 5 DEGs include Eucgr.B03016 (LOB domain-containing protein 40), Eucgr.H02264 (probable indole-3-acetic acid-amido synthetase GH3.1), Eucgr.I02271 (endochitinase), Eucgr.L02894 and Eucgr.L02534 (Additional file 2). KEGG pathway enrichment analysis showed “Glycolysis / Gluconeogenesis” (ko00010), “Carbon metabolism” (ko01200) and “Biosynthesis of secondary metabolites” (ko01110) were the top three pathways involved by the DEGs. Also, we found “Plant hormone signal transduction” (ko04075) significantly enriched by 260 DEGs (q-value: 8.36E-14). Like E. camaldulensis, GO analysis of these DEGs (Figure 2D) revealed that the numbers of DEGs had a similar trend but varied across all the GO terms, such as biological adhesion, virion and nucleoid.
Table 2 showed the numbers of DEGs from the seven gene groups identified in the dedifferentiation process of E. grandis x urophylla. Interestingly, we found 41 up-regulated and 8 down-regulated genes encoding ribosomal proteins in E. grandis. It showed that more ribosomal protein DEGs were up-regulated in E. grandis x urophylla but down-regulated in E. camaldulensis. Moreover, we found in E. grandis x urophylla more down-regulated ethylene related genes and up-regulated genes encoding HSP and histone (Table 2). Figure 2E showed an overview of all the DEGs identified in the dedifferentiated tissues compared to the differentiated tissues in both E. camaldulensis and E. grandis x urophylla. It was revealed that they shared some DEGs and some other DEGs were specifically identified in E. camaldulensis or E. grandis x urophylla, which might be related to the regenerative ability in Eucalyptus.
Regenerative ability associated genes
We next compared the DEGs identified in E. camaldulensis and E. grandis x urophylla. It showed in the upper panel of Figure 3A that they shared 2,687 up-regulated genes in the dedifferentiated tissue compared to differentiated tissue, including Eucgr.H01085 (ethylene-responsive transcription factor ERF071), Eucgr.A01538 (fructose-bisphosphate aldolase 6, cytosolic), Eucgr.G03106 (wound-induced protein 1), Eucgr.K02614 (NDR1/HIN1-Like protein 3), Eucgr.F00114 (zinc finger protein ZAT10) and Eucgr.H03082 (early nodulin-75) (Additional file 2). There were 2,003 and 1,513 up-regulated genes specifically identified in the dedifferentiated tissues of E. camaldulensis and E. grandis x urophylla, respectively (upper panel of Figure 3A). Then, we compared to down-regulated genes and found that they shared 2,581 DEGs (lower panel of Figure 3A), including Eucgr.J00025 (heat shock cognate 70 kDa protein 2), Eucgr.B01596 (probable xyloglucan endotransglucosylase/hydrolase protein 23), Eucgr.A01080 (glycine-rich RNA-binding protein), Eucgr.F00590 (snakin-2) and Eucgr.H03983 (major allergen Pru ar 1) (Additional file 2). A total of 1,958 and 2,127 down-regulated genes were specifically identified in E. camaldulensis and E. grandis x urophylla, respectively (lower panel of Figure 3A). The comparison overview of DEGs in E. camaldulensis and E. grandis x urophylla can also be found in the heat map (Figure 2E).
Interestingly, we found 6, 17, 12, 83, 28, 10, 9 and 98 DEGs related to embryogenesis, ethylene, auxin, RP, ZFP, HSP, histone and TF, respectively, specifically dysregulated in the dedifferentiated tissue of E. camaldulensis (Table 2, Additional file 2). The 6 embryogenesis related genes include 5 up-regulated genes (2 SE receptor kinase, 3 LEA) and 1 down-regulated gene encoding embryogenesis-associated protein EMB8. Among the 17 DH201-2 specific DEGs encoding auxin related products (Figure 3B), Eucgr.K02992 (auxin transporter-like protein 4) and Eucgr.C02984 (auxin-responsive protein IAA26) were down-regulated and Eucgr.H03171 (auxin-induced protein 22D) was up-regulated in E. camaldulensis but down-regulated in E. grandis x urophylla (Additional file 2). All of the ethylene related DEGs specifically in E. camaldulensis were found to encode ER TFs and 7 had reverse regulation in E. camaldulensis and E. grandis x urophylla (Figure 3C). Interestingly, it is shown in the heat maps (Figure 3D and Figure 3E) that most of the E. camaldulensis specific DEGs related to HSP and RP were down-regulated in the dedifferentiated tissue compared to the differentiated tissue. Although some of the DEGs encoding histone were more or less changed in E. grandis x urophylla, the difference of their expression was not significant as that in E. camaldulensis (Figure 3F). Importantly, we found another 4 TF subfamilies including ASIL2, bHLH, MYB and WRKY were dysregulated only in E. camaldulensis (Figure 3G). Further, the MYB and WRKY TF genes were up-regulated in the callus of E. camaldulensis compared to the stem. In addition, some other gene families were identified to be specifically differentially expressed in E. camaldulensis (Table 2, Additional file 3), including 3 ABA related genes, 3 arabinogalactan protein genes, 4 ABC transporter genes and 21 abscisic stress-ripening protein genes.
qRT-PCR validation
We used qRT-PCR to confirm the expression patterns of DEGs identified in the dedifferentiation process of E. camaldulensis and E. grandis x urophylla. We randomly selected 8 genes (Eucgr.A00971, Eucgr.A01091, Eucgr.B03715, Eucgr.C03048, Eucgr.D01811, Eucgr.F00490, Eucgr.F01164 and Eucgr.H03077) for the qRT-PCR experiment and the H2B gene was used as internal control. The primers for each gene can be accessed in the Additional file 5. Each gene was replicated three time in each sample, so we performed 9 reactions in total for the differentiated and dedifferentiated tissues. We used log2FC (log2 fold change) and log2RNE (log2 relative normalized expression) to present the gene changes detected by transcriptome sequencing and qRT-PCR, respectively. Overall, 13 (81.25%) out of 16 events were agreed by both qRT-PCR and deep sequencing (Figure 4). It is notable that the elevation of WRKY TF (Eucgr.D01811) and the decrease of RP gene (Eucgr.A00971) in E. camaldulensis were confirmed by qRT-PCR. The dysregulation of Eucgr.B03715, Eucgr.C03048 and Eucgr.F00490 in E. grandis x urophylla were also confirmed. High agreement of gene expression patterns in transcriptome sequencing and qRT-PCR indicate that the genes identified in this study might be associated with the regenerative ability and the SE in Eucalyptus, which requires future experiments to be explored.
SNPs and indels
We next called the variants including SNPs and small indels in E. camaldulensis and E. grandis x urophylla using the transcriptome sequencing data. Table 3 shows the numbers of variants identified in the samples. Initially, we obtained 97,504 and 75,582 variants in the differentiated and dedifferentiated tissues of E. camaldulensis, respectively. After the variants with less than 100 reads were filtered, we identified 97,974 SNPs and indels for E. camaldulensis. Likewise, 72,208 and 66,311 variants were called in the differentiated and dedifferentiated tissues of E. grandis x urophylla, respectively, and they produced 78,977 highly covered variants (Table 3). Comparison showed 49,527 variants shared by E. camaldulensis and E. grandis x urophylla, and 48,447 were specifically identified in E. camaldulensis. Then, we annotated the E. camaldulensis specific variants using ANNOVAR and found that 13,434 variants were functional, such as nonsynonymous, frameshift insertion, frameshift deletion, stop gain and stop loss variants (Table 3, Additional file 6). Interestingly, these 13,434 variants were produced from 4,723 Eucalyptus genes, such as annexin (Eucgr.F02423, Eucgr.H00564), ARF guanine-nucleotide exchange factor GNOM (Eucgr.B03196), AP2-like ER TF (Eucgr.I00278, Eucgr.J02113), auxin response factors (Eucgr.G00076, Eucgr.C02178, Eucgr.C03293, Eucgr.J00923, Eucgr.F02090, Eucgr.D00264, Eucgr.E00888) and wall-associated receptor kinas-like (Eucgr.I01022). KEGG pathway enrichment analysis showed that 85 pathways including “longevity regulating pathway” (ko04211) and “Plant hormone signal transduction” (ko04075) were enriched by the of the DH201-2 specific mutated genes (Additional file 7).