Somatic embryogenesis efficiency of Jin668
The embryogenic potential of Jin668 is high. NEC cells can be distinguished from EC cell by visual discrimination. NEC is characterized as fluffy, non-differentiated or polarized cells that are actively dividing, Fig. 1A. Typical EC is characterized as small cell clusters, with condensed cytoplasm, that are rapidly dividing and proliferating, Fig. 1B. After inducing callus for 45 days, 4% of the explants were found to be in the NEC stage and subsequently did not develop into embryos (Fig. 1a and Table 1). The remaining 96% of the explants were found to be induced to form characteristic embryonic callus cells (Table 1 and Fig. 1b). In all experimental replications, both NEC and EC cells could be observed in the same piece of callus (Fig. 1a and 1b). Therefore, those calli that consisted of both NEC and EC cells were sampled for gene expression profiling between the NEC and EC (Fig. 1b).
Table 1
Callus induction of cotton hypocotyls explants
No. Explants | No. NEC | No. EC (efficiency rate %) | |
300 | 12 | 288 (96) | |
320 | 12 | 308 (96) | |
330 | 13 | 317 (96) | |
Differential gene expression and GO enrichment of NEC and EC callus
For each callus cell type, an average of 36.8 and 42.7 million read pairs were produced for NEC and EC samples, respectively (BioProject ID PRJNA629328). A total of 39,105 genes were expressed in both NEC and EC callus cells, Fig. 2. Overall, a total of 5,333 genes were differentially expressed with 2,534 upregulated and 2,799 downregulated in EC (Additional file 1). Among these genes, a total of 144 were uniquely expressed in NEC cells, whereas, 174 genes were unique to EC (Fig. 2, Additional file 2: Table S1). Genes with unique expression to either EC or NEC cells were mostly genes of unknown function. Genes with annotations that were highly expressed in NEC include ROTUNDIFOLIA-LIKE 21, gibberellin-regulated, WRKY DNA binding, and others. Genes with high expression unique to EC callus with annotations include MYB family transcription factor, cytochrome C biogenesis, ROTUNDIFOLIA-LIKE 10, heat shock, and several others (Additional file 2: Table S1). Gene expression fold changes ranged from − 12 to 12 in EC when compared to NEC, (Additional file 2: Table S2). A slightly larger set of transcripts indicated transcriptional downregulation from NEC to EC, with a total of 6,786 genes (logFC ) becoming downregulated in EC; while 6,538 (logFC ) were upregulated (Fig. 3A and 3B, Additional file 2: Table S2). The top 10 genes with the sharpest fold changes in upregulation EC versus NEC is a gene with a nodulin late domain (Gohir.D11G247300.1), and 9 genes with unknown function (Table 2). The top 10 genes with the sharpest fold changes in downregulation are a gibberellin-regulate homolog (Gohir.D04G048900.1), 4 genes with unknown function, a gene with an S-adenosyl-l-methionine decarboxylase (AdoMetDC) leader peptide, a ROTUNDIFOLIA like gene (Gohir.D07G179500.2), and 60S ribosomal protein (Table 2). Gene ontology enrichment analysis of the upregulated genes in EC (Fig. 2A and B) discovered genes enriched in biological processes such as trehalose biosynthesis, transmembrane transporter, sugar transport, iron binding, growth factor activity, embryo development, development, auxin efflux, and response to stress among other processes related to cellular reprogramming and differentiation (Fig. 4A, Additional file 2: Table S3). The most pronounced enriched category was DNA binding and transcription factor activity with 309 genes combined, followed by regulation of transcription (155 genes), membrane (228 genes), oxidation reduction (105 genes), and transmembrane transport (80 genes) (Additional file 2: Table S3). Notable categories that were downregulated were categorized by assigned GO terms enriched for oxidation-reduction (524 genes), transmembrane transport (233 genes), response to hormones and auxin (48 genes), methyltransferase activity (112 genes), and peroxidase activity (58 genes) (Additional file 2: Table S4).
Table 2
Top 10 genes upregulated in EC relative to NEC
Gene | Control | Observed Effect | logFC | PValue | FDR | Functional Domains (descriptions) | Functional Domains (domain name) | Athaliana_Defline | Ghirsutum_Defline |
Gohir.D11G247300.1 | NEC | EC | 12.2179854 | 2.64E-60 | 2.44E-56 | Nodulin_late | Nodulin_late | | |
Gohir.A09G068700.1 | NEC | EC | 11.0311639 | 2.26E-16 | 1.41E-14 | DUF3963 | DUF3963 | | |
Gohir.A03G181200.1 | NEC | EC | 10.8541708 | 5.80E-67 | 1.34E-62 | #N/A | #N/A | | |
Gohir.A05G291150.1 | NEC | EC | 10.7942425 | 2.50E-24 | 5.12E-22 | DUF4912 | DUF4912 | | |
Gohir.A09G187300.1 | NEC | EC | 10.6509234 | 1.66E-19 | 1.72E-17 | #N/A | #N/A | | |
Gohir.A11G045350.1 | NEC | EC | 10.6244362 | 0.00116229 | 0.00472295 | #N/A | #N/A | | |
Gohir.D12G136400.1 | NEC | EC | 10.6014584 | 6.50E-05 | 0.0003752 | #N/A | #N/A | | |
Gohir.A08G165050.1 | NEC | EC | 10.5716199 | 1.71E-34 | 1.16E-31 | #N/A | #N/A | | |
Gohir.A07G162000.3 | NEC | EC | 10.488521 | 1.52E-25 | 3.68E-23 | #N/A | #N/A | | |
Gohir.D11G067750.1 | NEC | EC | 10.3328265 | 7.12E-05 | 0.00040677 | #N/A | #N/A | | |
Gohir.D04G048900.1 | NEC | EC | -9.94544461036326 | 1.19E-12 | 3.79E-11 | GASA | GASA | GASR7 - Gibberellin-regulated GASA/GAST/Snakin family protein precursor, expressed; GAST1 protein homolog 4 | GAST1 protein homolog 4 |
Gohir.A09G040900.1 | NEC | EC | -9.974189713011 | 0.0001288 | 0.0006863 | #N/A | #N/A | | |
Gohir.D06G144900.2 | NEC | EC | -9.98990989847144 | 1.08E-39 | 1.16E-36 | #N/A | #N/A | | |
Gohir.D10G179800.1 | NEC | EC | -10.4131140476177 | 5.93E-22 | 8.57E-20 | VanY | VanY | | |
Gohir.D05G002150.1 | NEC | EC | -10.4414002634445 | 0.00169419 | 0.00655082 | #N/A | #N/A | | |
Gohir.A11G280900.1 | NEC | EC | -10.675999492066 | 9.84E-05 | 0.00054239 | Phage_holin_3_7 | Phage_holin_3_7 | | |
Gohir.D13G226251.1 | NEC | EC | -10.6995332832501 | 0.00118551 | 0.00480505 | #N/A | #N/A | | |
Gohir.D05G008300.1 | NEC | EC | -11.6191438066039 | 7.96E-05 | 0.00044992 | AdoMetDC_leader | AdoMetDC_leader | S-adenosyl-l-methionine decarboxylase leader peptide, putative, expressed; conserved peptide upstream open reading frame 9 | conserved peptide upstream open reading frame 9 |
Gohir.D07G179500.2 | NEC | EC | -11.6654565629349 | 0.00012965 | 0.00069003 | DVL | DVL | expressed protein; ROTUNDIFOLIA like 21 | ROTUNDIFOLIA like 21 |
Gohir.D08G229750.1 | NEC | EC | -12.7143630581706 | 1.18E-53 | 5.45E-50 | Ribosomal_L18A, DUF1891 | Ribosomal_L18A, DUF1891 | 60S ribosomal protein L18a, putative, expressed; Ribosomal protein L18ae/LX family protein | Ribosomal protein L18ae/LX family protein |
Hierarchal clustering and GO enrichment of key clusters
Hierarchal clustering of statistically significant gene expression profiles identified 3 distinct sub-clusters when cut into 50% of the tree (Fig. 5A-C, Additional file 2: Table S5). Subcluster 1 contained 2,186 genes, and the expression trend of these genes ranged from moderate to low in NEC and EC callus, respectively (Fig. 5A, Additional file 2: Table S5). Processes enriched in this subcluster were assigned to oxidation reduction, methyltransferase activity, iron binding, and defense (Fig. 5A, Additional file 2: Table S6). Subcluster 2 contained 675 genes, and showed an upward trend in expression in EC callus and enrichment of processes related to transcription factor activity, membrane, growth factor, and developmental processes (Fig. 5B, Additional file 2: Table S7). Subcluster 3 contains 183 genes, and specifically showed a more discreet trend of higher expression in NEC and lower expression in EC callus, with enrichment in processes such as membrane and transport, peroxidase activity, carbohydrate metabolism, and fatty acid binding (Fig. 5C, Additional file 2: Table S8).
Phytohormone gene expression profiles in NEC and EC callus
Genes involved in phytohormone signaling and biosynthesis are crucial for the transition from NEC to EC. In total, there are 6,132 genes annotated as a phytohormone in the Gossypium hirsutum (CV TM1) reference genome assembly [49]. In general, we observed an abundance of phytohormone related genes being downregulated in EC callus relative to NEC (Table 3). Abscisic acid and auxin related genes were the categories with the most abundant differentially expressed genes, including 1,508 and 1,332 genes, respectively (Table 3). Expression magnitudes ranged from − 10 (logFC) in genes involved in gibberellic acid to 9 (logFC) in genes involved in abscisic acid (Additional file 2: Table S9 and Fig. 6). Genes annotated as cytokinin, gibberellic acid, jasmonic acid, salicyic acid, ethylene, and brassinosteroid were differentially expressed in NEC and EC (Table 3 and Additional file 2: Table S9).
Table 3
Phytohormone genes involved in NEC and EC callus
Phytohormone | No. Genes | No. Upregulated (> 1 FC) | No. Downregulated (> 1 FC) | logFC range |
Auxin | 1332 | 129 | 188 | -8 to 7 |
Cytokinin | 355 | 42 | 47 | -9 to 5 |
Gibberellic acid | 461 | 48 | 71 | -10 to 8 |
Abscisic acid | 1508 | 150 | 213 | -8 to 9 |
Jasmonic acid | 781 | 82 | 108 | -8 to 5 |
Salicyic acid | 593 | 74 | 88 | -5 to 5 |
Ethylene | 782 | 82 | 129 | -6 to 5 |
Brassinosteroid | 320 | 47 | 40 | -8 to 8 |
Transcription factors and signaling cascades
Transcription factors and cellular signaling genes that regulate reprogramming and differentiation during somatic embryogenesis were differentially expressed. We found 24 different classes of transcription factors that were significantly differentially expressed (Additional file 2: Table S10). The class with the most abundant genes was the GRAS (11 genes), followed by TGA like (9), MADS-box (9), CBF/NF-Y/archeal histone domain (9), GATA (8), and others with 7 or fewer members (Additional file 2: Table S10). Specific genes with known involvement in embryogenesis were individually analyzed for expression profiles in NEC and EC callus in Jin668. Our data shows that noteworthy embryogenesis transcription factors are lowly expressed in NEC callus with several key genes that are sharply upregulated in EC (Table 4). For example, the gene with the highest upregulation is LEAFY COTYLEDON 1 (LEC1), with the dominant copy residing in the D-subgenome (Table 4). The next highly expressed embryogenesis related gene was the morphogenic regulator BABY BOOM (BBM) with the dominant copy on the A-subgenome, followed by FUSCA (FUS3) and AGAMOUS-LIKE15 (AGL15) both with the dominant copies on the D-subgenome. Interestingly, WUSHEL (WUS) had very little to no expression with no apparent subgenome bias (Table 4).
Table 4
Embryogenesis genes and their expression values in Jin668
| A-Subgenome | D-Subgenome |
GeneID | Gene Description | Arabidopsis Ortholog (GeneID) | GeneID | NEC TMM | EC TMM | GeneID | NEC TMM | EC TMM |
WOX5 | WUS-related homeobox 5 | AT3G11260 | Gohir.A10G233000 | 5.015 | 0.043 | Gohir.D10G245300 | 2.261 | 0.234 |
WUS | WUSHEL | AT2G17950 | Gohir.A10G098300 | 0.223 | 0.773 | Gohir.D10G089500 | 0 | 0 |
Gohir.A12G059800 | 0 | 0 | Gohir.D12G060100 | 0 | 0 |
WRKY2 | WRKY2 | AT5G56270 | Gohir.A08G026100 | 3.614 | 6.522 | Gohir.D08G036600 | 5.855 | 6.839 |
Gohir.A13G154100 | 3.503 | 6.941 | Gohir.D13G158700 | 2.199 | 4.291 |
GRD/RKD4 | GROUNDED | AT5G53040 | Gohir.A03G051800 | 0.092 | 0.682 | Gohir.D03G115300 | 0.027 | 0.351 |
BBM | BABY BOOM | AT5G17430 | Gohir.A08G227000 | 4.889 | 26.81 | Gohir.D08G247400 | 0.372 | 8.677 |
LEC1 | LEAFY COTYLEDON1 | AT1G21970 | Gohir.A13G132600 | 4.363 | 85.08 | Gohir.D13G136000 | 6.251 | 110.289 |
Gohir.A08G025100 | 0.526 | 8.433 | Gohir.D08G035600 | 2.069 | 15.03 |
FUS3 | FUSCA | AT3G26790 | Gohir.A07G230400 | 0.98 | 17.37 | Gohir.D07G237600 | 1.353 | 23.266 |
ABI3 | ABSCISIC ACID INSENSITIVE3 | AT3G24650 | Gohir.A07G154900 | 0.3 | 6.25 | Gohir.D07G161100 | 0.047 | 1.09 |
AGL15 | AGAMOUS-LIKE15 | AT5G13790 | Gohir.A08G141500 | 0.193 | 2.283 | Gohir.D08G162600 | 0.485 | 4.502 |
Gohir.A12G100400 | 0.948 | 10.92 | Gohir.D12G103400 | 0.901 | 14.455 |
Regulation of DNA methylation in NEC and EC
DNA methylation has a critical role in governing gene expression. In general, we observed a larger downregulation of genes regulating various methylation pathways in EC. We identified 226 genes with at least one-fold change in EC (71 upregulated, 155 downregulated) relative to NEC (Additional file 2: Table S11). Of the downregulated methylation genes, 15 proteins belong to protein family methyltransferase, 26 belong to plant invertase/pectin methylesterase inhibitors, 14 involved in S-adenosyl-L-methionine-dependent methyltransferase, 12 in the pectin methylesterase inhibitor family, and 35 in the O-methyltransferase superfamily (Additional file 2: Table S11). In contrast, very few methylation related protein families were upregulated in EC callus. Those that were include ribosomal protein L11 methyltransferase (3), tRNA methyltransferase (3), SAM dependent carboxyl methyltransferase (3), N6-adenine methyltransferase (3), and several others (Additional file 2: Table S11).
RT-qPCR and validation of RNA-seq
To validate the RNA-seq data, ten critical embryogenesis related genes were selected for RT-qPCR analysis. All the primer pairs used in this test amplify a unique band with a unique melting curve peak (data not shown). The results showed (Fig. 7 and Additional file 2: Table S12) that five of the predicted genes were up-regulated in EC and are as follows: GhUGt73C5, UDP-glycosyltransferase 73C5-like gene (Gohir.A01G078300.1) [50]; GhPin2, auxin transporters, encodes an auxin efflux carrier (Gohir.A05G001400.1) [51]; GhNFYB6, nuclear factor Y subunit B-6, acts as a key component regulating embryogenesis and seed maturation in Arabidopsis thaliana (Gohir.A05G176400.1) [52]; GhORG2, transcription factor ORG2-like gene with basic helix-loop-helix (Gohir.A09G106500.1) [53]; GhBBM, AP2-like ethylene-responsive baby boom transcription factor (Gohir.D08G247400.1) [54]. The five deduced genes that were down-regulated in EC, i.e., GhARF4, auxin response factor 4 like, encodes a member of the ARF family of transcription factors (Gohir.A05G074600.1) [55]; GhACRY1/BLU1, cryptochrome-1-like, a flavin-type blue-light photoreceptor with ATP binding and autophosphorylation activity (Gohir.A05G226100.1) [56] (Pooam et al., 2018); GhOMT1, encoding a reticuline 7-O-methyltransferase-like gene (Gohir.D02G181400.1); GhCOMT1, raimondii caffeic acid 3-O-methyltransferase-like (Gohir.D12G246900.1) gene [57]; GhCKX3, cytokinin dehydrogenase 3 (Gohir.D07G011600.1) [58]. The results showed that the expression level evaluated by RT-qPCR exhibited similar patterns to the RNA-seq results of the ten genes, confirmed the results of RNA-sEq. (Fig. 7, Additional file 2: Table S12).