Dynamics of spatial gene expression during the development of somatic embryos in cotton
We developed a research approach using ST to comprehensively assess the spatial expression patterns of genes during somatic embryo development in cotton (Fig. 1a). ST was used to conduct a comprehensive transcriptome analysis of different tissues of cotton across all the stages of somatic embryogenesis. First, frozen tissue sections of globular embryos, torpedo embryos, and cotyledonary embryos were made (Fig. 2a-c). The three stages of sampled embryos possessed three sections, and each spot on each section corresponds to 1-10 cells (Supplementary information, Fig. S1d). After being subjected to quantitative quality control, the number of high-quality spots was 801-1,217 per section, the average number of unique molecular identifiers (nUMIs) per spot was 11,857-16,401, the average number of expressed genes in each spot was 6,287-8,001 (Supplementary information, Fig. S1f-h, Table S1).
We used t-distributed Stochastic Neighbor Embedding (t-SNE) to divide the data into 13 cell clusters, and then mapped the spot cluster back to its original position in the tissue section (Fig. 2d and 2e). The 13 clusters were named based on their spatial location. We also constructed a heatmap of the 10 highest expressed genes in each cluster (Fig. 2f). For example, Gh_D05G159100, which is highly expressed in ‘pro-embryogenic cell’ (cluster 6), is a homologous gene of WUSCHEL-related homeobox 9 (WOX9) that plays a role in embryogenesis and meristems in plants 29. In ‘cotyledon cell’ (cluster 9), Gh_A07G040500 displays specific expression pattern and shows high homology to axial regulator YABBY 5 (YAB5) 30. In ‘cortex cell_12’ (cluster 12), Gh_D08G133100 was identified, which is homologous to YUCCA10 (YUC10) encoding a flavin monooxygenase that has been associated with embryogenesis and auxin synthesis in Arabidopsis 31.
The differentially expressed genes (DEGs) were identified in the 13 clusters (Supplementary information, Table S4). Gene ontology (GO) enrichment of the DEGs in the different cell clusters was conducted (Fig. 2g). Genes associated with the ‘regulation of post-embryonic development’ were highly expressed in ‘cotyledon cell’ (cluster 9), while genes of ‘defense response to other organism’ were found to be enriched in ‘cortex cell_11’ (cluster 11) (Fig. 2g). DOX2/Gh_A09G193100 and WOX9/Gh_D05G159100 were highly expressed in ‘pro-embryogenic cell’ (cluster 6), and this cluster of cells is involved in ‘shoot meristem development’ (Fig. 2f). These results suggest that genes involved in different GO functions may play a specific physiological function during the different stages of somatic embryogenesis.
Analysis of the spatial expression of genes regulating the initial development of somatic embryos in cotton
The initial induction of callus cells and the formation of embryogenic cells are the most critical stages for the development of cotton somatic embryos 32. scRNA-seq had been used to study different cell types of Arabidopsis callus 33. Therefore we also conducted a ST analysis of the callus induced from cotton hypocotyl (Fig. 3). Six cell clusters were obtained (Fig. 3a). It was found that the homologue of vascular tissue marker gene aintegumenta (ANT) (Gh_A11G229400) and the homologue of stem cell formation-related gene of wuschel-related homeobox 13 (WOX13) (Gh_A02G209400) were expressed in cluster 1 34, 35, suggesting that cluster 1 contained vascular tissue cells from explants and callus founder cells. Combined with previous identification of callus cell types 33, cluster 1 was identified as ‘explant vasculature and callus founder cells’. Similarly, the homologue of procambium developmental marker gene homeobox-leucine zipper protein ATHB-15 (ATHB-15) (Gh_D05G130700) was expressed in the ‘explant vasculature and callus founder cells _3’ (cluster 3) 36, and the homologue of vascular cambium developmental marker genes wuschel related homeobox 4 (WOX4) (Gh_A05G188600) and the homologue of WOX13 were expressed in the ‘explant vasculature and callus founder cells _6’ (cluster 6) 34, 37. The epidermal cell marker gene ethylene responsive element binding factor 4 (ERF4) (Gh_A10G019700) is expressed in the ‘outer cell layer of the explant’ (cluster 2) 38. The homologue of embryonic root apical meristem marker gene like auxin resistant 2 (LAX2) (Gh_A01G254000) was expressed in the ‘middle cell layer of callus cells’ (cluster 4). The homologue of WOX4 was expressed in the ‘Inner cell layer of callus cells’ (cluster 5) 37. The expression of marker genes for each cell cluster is presented in Fig. 3b. Further GO analyses indicated that the majority of DEGs in the cell clusters of ‘explant vasculature and callus founder cells_3’ and ‘inner cell layer of callus cells’ (clusters 3 and 5) are involved in the ‘response to hypoxia’ (Fig. 3d).
Analysis of the expression patterns of genes in multiple developmental stages of cotton somatic embryos
To further analyze gene expression in relation to tissue development in somatic embryos of cotton, we divided all the samples from the three developmental stages into 9 sections and determined the correlation of gene expression among the samples by Pearson correlation analysis (Supplementary information, Fig. S1c).
We then analyzed the GO enrichment terms and spatial expression of DEGs in different cell clusters within the nine sections, respectively (Supplementary information, Table S5-Table S13).
In section 1, the ‘non-embryogenic callus cell’ (cluster 1) contains the homologue of vascular tissue marker gene WOX13 39. ‘Globular embryo cell’ (cluster 2) expressed the homologue of early embryonic vascular marker gene homeobox gene 8 (ATHB-8) (Gh_D05G049400) and the homologue of marker gene baby boom (BBM) (Gh_D08G251600) that induces somatic embryogenesis 40, 41 (Supplementary information, Table S5). In section 2, ‘pro-embryogenic cells_3’ (cluster 3) expressed the homologue of embryonic epidermal cell marker gene protodermal factor 1 (PDF1) (Gh_D04G040400) and the homologous gene of WOX9 for embryonic pattern formation 42, and ‘pro-embryogenic cells_4’ (cluster 4) expressed the homologous marker gene of AP2-like ethylene-responsive transcription factor 6 (AIL6) (Gh_D10G104500) for somatic embryogenesis 43, ‘pro-embryogenic cells_5’ (cluster 5) expressed the homologous marker gene of BBM associated with somatic embryogenesis 40. The homologous marker gene of auxin response factor 5 (ARF5) (Gh_D05G188200) and ANT for embryonic vascular tissue formation were found in ‘globular embryo cell_1’ (cluster 1) 35, 44, ‘globular embryo cell_2’ (cluster 2) expressed the homologous marker gene of floral homeotic protein apetala 2 (AP2) (Gh_A11G207700) for somatic embryogenesis 45, and ‘globular embryo cell_6’ (cluster 6) also expressed the homologous marker gene of WOX9 and WOX13 29, 39 (Supplementary information, Table S6). GO enrichment analysis for the DEGs in sections 3 to 9 were conducted and the results are presented in Supplementary information, Fig. S4 to Fig. S10.
Expression Analysis of Genes Regulating the Cambium Development of Somatic Embryos
The spatial division and differentiation of plant stem cells are the basis of plant growth and development 46. The differentiation of embryogenic cells determines the beginning of embryonic development 47. During embryonic development and maturation, differentiation starts from the shoot and root meristems at the base of the embryo, as well as in the cambium 48, 49. The differentiation of cambial cells is essential for the formation of the cortex and the development of vascular tissue. Many well-known TF-encoding genes were highly expressed in these cambial clusters, including revoluta (REV) (Gh_D13G262600), short root (SHR) (Gh_D03G030200), MP/ARF5, and ATHB-15, which have been reported to be key genes associated with cambium differentiation and vascular development in Arabidopsis and rice 50-53. Many new TFs in the HB-HD-ZIP gene family, such as homeobox-leucine zipper protein HAT2 (HAT2) (Gh_A08G191700), ATHB-13 (Gh_D02G129400), ATHB-20 (Gh_D09G022200), ATHB-6 (Gh_A02G158400) were also identified in the cambial clusters. In addition, the TFs newly discovered in the cambium, cotyledon primordium, and hypophysis are listed in Supplementary information, Table S15 and Table S16.
ScRNA-seq Analysis of Gene Expression During Somatic Embryo Development in Cotton
To distinguish the transcriptional heterogeneity of individual cells, we conducted scRNA-seq on protoplasts isolated from the samples used in the ST. Filtering of raw data was conducted using the linear model fitting curve and the violin map of the number of genes (nGene) per cell and the nUMIs (Supplementary information, Fig. S11). After quality control and screening, 8,842, 9,031 and 9,037 single-cell transcripts were identified in globular embryos, torpedo embryos and cotyledonary embryos, with an average of 2,339, 1,992 and 2,146 genes per cell, respectively (Supplementary information, Table S2).
A total of 11 cell clusters were identified after the t-SNE analysis (Fig. 4a). Heatmap of the expression of several highly expressed marker genes for each cell cluster was then constructed (Fig. 4c). A correlation analysis revealed that the scRNA-seq and ST data were strongly correlated (Supplementary information, Fig. S11e-g). The cell types were then annotated with several marker genes identified in the ST results. DEGs were subsequently identified in each cluster (Supplementary information, Table S14). For example, Gh_A05G087300, which was expressed in cluster 2, is a homologous gene of mitogen-activated protein kinase 3 (MPK3) that was shown to play an important role in epidermal cells and embryonic development in Arabidopsis 54. Thus, cluster 2 was classified as ‘epidermal cells_2’. Genes with homology to auxin22 (AUX22) (Gh_D05G125900) and cytochrome P450 87A3 (CYP87A3) (Gh_Contig00374G000600) were highly expressed in ‘ground tissue cells_3’ (clusters 3) and ‘mesophyll cells’ (cluster 5), respectively (Fig. 4b). AUX22 encodes a TF that may be involved in the processes underlying the progression from embryogenic callus to somatic embryos morphogenesis 55. CYP87A3 is an auxin-induced gene originally isolated from coleoptiles in the early stage of rice seedling growth 56. Bcl-2-associated athanogene family molecular chaperone regulator 5 (BAG5) (Gh_D02G193400) was highly expressed in clusters 7 (Fig. 4b). In ‘apical cortex cells_7’ (cluster 7), Gh_D02G193400 plays a role in leaf development in cotton as its homolog BAG5 in Arabidopsis 57. Therefore, based on the combined functions and location of the above genes, we annotated cluster 7 as ‘apical cortex cells_7’. Gh_A08G230200 is a homologous gene of LOB domain-containing protein 12 (LBD12) that is associated with the formation of shoot apical meristem in rice 58. LBD12 is expressed in ‘apical cortex cells-1’, ‘epidermal cells-9’ and ‘cortex cells’ (clusters 1, 9 and 10).
Different with ST analysis, scRNA-seq allows studies at the high-resolution level of single cells 59, 60. In this study, many similarities were found in the two datasets, such as common cell types like epidermal cells, vascular cells and cortex cells were identified by both ST and scRNA-seq; common gene expression, such as LBD12 is expressed in ‘cortex cell_12’ (cluster 12) in the ST, consistently, LBD12 also expressed in 'apical cortex cells-1', 'epidermal cells-9' and 'cortex cells' (clusters 1, 9 and 10) in the scRNA-seq data (Supplementary information, Fig. S12i, Table S14). Thus, we can combine the complementarity of ST and scRNA-seq to perform spatial and high-resolution studies of cell types in cotton somatic embryos.
Next we constructed a developmental trajectory for cell differentiation using a pseudo-time analysis (Fig. 5a and b). Different cell clusters were apparent at different branching positions on the pseudo-time trajectory (Fig. 5b), which can be used to make an initial determination of the relationship between these cells at different developmental stages 61. For example, ‘epidermal cells_2’ (cluster 2) and ‘ground tissue cells’ (cluster 3) exhibited relatively similar distribution patterns on their developmental trajectories, indicating that their developmental stages were similar (Fig. 5b). Then, we analyzed genes enriched at different stages of pseudo-time for different single cell clusters (Fig. 5c). The expression of these genes can be divided into three categories. In the first type, the expression of marker genes from ‘epidermal cells_11’ (cluster 11), ‘epidermal cells_9’ (cluster 9), ‘epidermal cells_8’ (cluster 8) and ‘cortex cells’ (cluster 10) was at their highest level in the early stage of pseudo-time, and then decreased gradually. The second type comprises marker genes from ‘vascular tissue cells’ (cluster 4), ‘mesophyll cells’ (cluster 5), and ‘epidermal cells_2’ (cluster 2), whose expression increased gradually with pseudo-time, and reached their highest level in late pseudo-time. The third type was comprised of DEGs from ‘apical cortex cells_1’ (cluster 1), ‘ground tissue cells’ (cluster 3), ‘epidermal cells_11’ (cluster 11), whose expression level was the highest in the middle stage of pseudo-time, and then gradually decreased.
Establishment of a Gene Expression Map for Somatic Embryo Development in Cotton
As previously mentioned, the initial onset of embryogenesis is a critical stage that transforms non-embryogenic callus cells into embryogenic cells. In the ST clusters, we found some highly expressed genes in ‘non-embryogenic callus’ (cluster 13), ‘pro-embryogenic cell’ (cluster 6) and ‘globular embryo cell_10’ (cluster 10) (Supplementary information, Fig. S12). More specifically, beta-D-xylosidase 1 (BXL1) (Gh_A09G261400) was specifically expressed in ‘non-embryogenic callus’ (cluster 13), and further ISH analysis indicated that BXL1 was primarily expressed in the cortex of globular embryos, but also in the epidermis of both torpedo and cotyledonary embryos (Supplementary information, Fig. S13a-f). Gh_A01G086600 of ‘pro-embryogenic cell’(cluster 6) is a homologous gene of indole-3-acetate O-methyltransferase 1 (IAMT1) that was reported to be involved in auxin metabolism in Arabidopsis 62, suggesting that Gh_A01G086600 may regulate the initial formation of globular embryos through auxin-related metabolic processes. Similarly, Gh_A12G174900 of ‘pro-embryogenic cell’ (cluster 6) is a homologous gene of receptor-like transmembrane kinase 1 (TMK1) that is involved in auxin signal transduction in Arabidopsis 63 (Supplementary information, Fig. S12l), suggesting that Gh_A12G174900 may function in regulating the formation of globular embryos by participating in auxin signal transduction. Additionally, ISH analysis indicated that Gh_D11G232500/ABC transporter G family member 11 (ABCG11) was primarily expressed in the endodermis of globular and torpedo embryos, and mainly in the epidermis of cotyledonary embryos (Fig.s 6b-d and Supplementary information, Fig. S12g). In this regard, the Gh_D11G232500/ABCG11 reportedly encodes a transporter for epidermal lipid transport in Arabidopsis 64.
The establishment and maintenance of embryonic epidermis is crucial for the normal development of embryos. For example, the epidermis of shoot organs of higher plants originates from the embryonic epidermis, and the epidermis also protects the embryo from biotic and abiotic stresses 65, 66. Then we next analyzed the key genes expressed in ‘epidermal cell’ (cluster 3), ‘cortex cell_1’ and ‘cortex cell_12’ (clusters 1 and 12) of ST to identify genes that may regulate the development of cortical and epidermal cells (Fig. 2f). Cysteine-rich receptor-like protein kinase 29 (CRK29) belongs to a class of receptor-like kinases that encode cysteine-rich protein kinases 67. In cotton, Gh_D10G017200/CRK29 was expressed in ‘epidermal cell’ (cluster 3) and the epidermis of early globular and torpedo embryos (Supplementary information, Fig. S12d). Additionally, ISH analysis indicated that Gh_A13G249900, a homolog of histidine-containing phosphotransmitter 3 (AHP3), was first expressed in the cortex of globular embryos, and was still distributed in the cortex of torpedo embryos and cotyledonary embryos (Supplementary information, Fig. S14g-m). AHP3 encodes a histidine phosphate transfer protein involved in the cytokinin signaling pathway in Arabidopsis 68. ISH analysis indicated that LBD12 was primarily expressed in the cortex of globular embryos, torpedo embryos, and cotyledonary embryos (Supplementary information, Fig. S14n-s), which further verifies the reliability of the cell cluster classification.
Gh_A03G031900 is a homologous gene of heavy metal-associated isoprenylated plant protein 39 (HIPP39) that encodes a heavy metal-associated isoprenylated plant protein in rice 69. This gene is specifically expressed in the embryonic cortex of cotyledonary embryos (Supplementary information, Fig. S12e).
AATP1 and DOX2 negatively regulate cotton somatic embryogenesis
Based on the analysis of ST, we identified that AAA-ATPASE1 (AATP1) was expressed in ‘globular embryo_10’, while DOX2 was expressed in ‘pro-embryogenic cells’ (cluster 6) (Fig. 2f), both cell clusters are key cell types associated with cotton somatic embryo regeneration 70. Gh_D09G025900 has homology to AATP1 that has been demonstrated to play a role in inducing systemic signaling to increased resistance to pathogens in potato (Solanum tuberosum) tubers 71. DOX2 encodes an α-dioxygenase, which was reported to participate in α-linoleic acid metabolism and play a role in protecting tissues from oxidative damage and cell death in tomato 72. Given that AATP1 and DOX2 have been found to be highly expressed during the early stages of cotton somatic embryogenesis, we generated three overexpression (OE) and three DNA-edited lines (CRISPR/Cas, CAS) of AATP1 (Gh_D09G025900) and DOX2 (Gh_A09G193100) to determine their potential roles in cotton somatic embryogenesis (Supplementary information, Fig. S15). AATP1-CAS and DOX2-CAS transgenic calli (Cs) proliferated faster and had a larger volume at the end of the first 30 days of calli induction than wild-type (WT) calli (Supplementary information, Fig. S15a). By contrast, the calli proliferation rate of AATP1-CAS and DOX2-CAS transgenics, was lower than the WT control; and thus, their Cs volume was also lower than that of WT (Supplementary information, Fig. S15a). AATP1-CAS and DOX2-CAS Cs at the end of 90 days were friable and could be easily recognized as embryogenic. In contrast, AATP1-OE and DOX2-OE Cs proliferated more slowly during the initial stage of callus induction than the WT control and CAS Cs, and their callus volume was also relatively reduced. Consistently, the Cs proliferation rates in AATP1-OE and DOX2-OE Cs increased by the end of 60 days; however, no significant difference in Cs volume and weight, relative to the control, was observed (Supplementary information, Fig. S15). After 90 days of calli induction, the Cs of AATP1-OE and DOX2-OE transgenics were hard and had no signs of developing into embryogenic callus (ECs). The expression levels of AATP1 and DOX2 in overexpressing lines were found to be significantly increased compared with that of WT (Supplementary information, Fig. S15b). The Cs of AATP1-CAS and DOX2-CAS exhibited a higher induction and differentiation rates than that of WT control (Supplementary information, Fig. S15c and Fig. S15d).
Compared to AATP1-OE, DOX2-OE and control callus tissues at 30d, the cell morphology in AATP1-CAS and DOX2-CAS callus tissues was mainly round or oval, indicating that callus cells had begun to transform into embryogenic callus cells. In contrast, the cells of OE calli mainly exhibited an elongated shape, indicating that these cells were in a non-embryogenic cell state (Supplementary information, Fig. S15e). These results suggest that AATP1 and DOX2 negatively regulate callus initiation and embryogenic callus formation.
Mass Spectrometric Imaging Analysis of Metabolites During Somatic Embryo Development in Cotton
We next used AFADESI-MSI to analyze the metabolites in cotton embryo samples (Supplementary information, Fig. S16). We determined the mass charges and obtained the molecular contours in the tissue sections (Fig. 7a, Supplementary information, Fig. S16c-e). We divided the tissue sections to identify the marker metabolites related to somatic embryo development and made four groups of comparisons: (1) cortex vs. cotyledon, (2) cortex vs. vascular, (3) cotyledon vs. vascular, and (4) hypocotyl callus vs. hypocotyl.
We compared the metabolite information obtained in the different regions using unsupervised PCA (Supplementary information, Fig. S16f). We further used orthogonal projections to latent structures-discriminant analysis (OPLS-DA) to distinguish the overall differences in metabolites among the groups to identify the DPMs in the groups (Supplementary information, Fig. S16g). The metabolic pathway matching analysis of these selected DPMs was carried out based on the exact mass number, SmetDB database, and KEGG database (Fig. 7b). Among the DPMs detected during somatic embryo development (Supplementary information, Table S19 and Table S20), 74 metabolites were assigned to 26 metabolic pathways that are known to be involved in plant development (Fig. 7b). For example, ‘histidine metabolism’ has been reported to be necessary for embryonic development 73, ‘pyrimidine and purine metabolism’ provides a sufficient level of purines and pyrimidines to support nucleotide biosynthesis during embryonic development 74. In ‘arginine and proline metabolism’, arginine acts as a precursor for proline synthesis, and also plays a role in the synthesis of polyamines, which are essential for embryogenesis and organogenesis 75, 76 (Fig. 7b).
We compared the differences in annotated DPMs between cortex vs. cotyledon in group 1 (Supplementary information, Fig. S17a). These metabolites showed significant differences in abundance in different tissues (Supplementary information, Fig. S17c). For example, dihydrozeatin (C02029) and zeatin (C00371) were also highly abundant in cotyledons. Linoleic acid (C01595), (S)-β-aminoisobutyrate (C03284), and D-malic acid (C00497) were highly expressed in the cortex (Supplementary information, Table S19). In particular, the abundance of linoleic acid was particularly high in the cortex (Supplementary information, Fig. S17d).
We also compared the abundance of DPMs in the ‘cortex vs. vascular tissues” (Supplementary information, Fig. S18a). These metabolites participate in several metabolic pathways (Supplementary information, Fig. S18b). Pyroglutamic acid (C01879) is a cyclized derivative of L-glutamate and participates in ‘glutathione metabolism’, which in turn is a key component of the plant stress response network 77. In the cortex, Succinic acid participates in many metabolic pathways, including the ‘citrate cycle (TCA cycle)’ and ‘butanoate metabolism’ (Supplementary information, Fig. S18a). The ‘TCA cycle’ and ‘butanoate metabolism’ are essential metabolic pathways in plants 78, 79.
In the comparative analysis of DPMs between cotyledon and vascular tissues (Supplementary information, Fig. S19a), N-carbamoylputrescine (C00436), L-arginine (C00062) and indole-3-acetaldehyde (C00637) were highly abundant in cotyledons (Supplementary information, Table S19). The dihydrozeatin (C02029), zeatin (C00371) aND DTDP (C00363) were more abundant in hypocotyls (Supplementary information, Fig. S20a). These metabolites are involved in ‘zeatin biosynthesis’ and ‘purine metabolism’ (Supplementary information, Fig. S20b). On the other hand, L-aspartic acid (C00049), L-histidine (C00135) and galabiose (C00760) were more abundant in callus tissues (Supplementary information, Fig. S20d, Table S19). L-aspartic acid, L-histidine are utilized in ‘histidine metabolism’, which is essential for plant development, including embryonic development 73.
Gene-Metabolite Joint Analysis During Somatic Embryo Development in Cotton
Next, a joint analysis of SM and ST data was conducted. Based on the ST data obtained on somatic embryos, a heatmap analysis of the expression of genes involved in the metabolic pathways of identified DPMs was conducted (Supplementary information, Fig. S21, Table S21). Results indicated that most genes related to the synthesis of polyamines (e.g. spermine and spermidine), including spermidine synthase 1 (SPDSYN1) (Gh_A03G164400) and spermine synthase (SPMS) (Gh_A08G106900), were highly expressed in globular embryos, which is consistent with higher abundance of these polyamines (Supplementary information, Fig. S21). And zeatin synthesis like CKX5/Gh_D06G005100 were highly expressed in cotyledonary embryos (Supplementary information, Table S21). The consistency of ST and SM data suggests that spermine and spermidine may represent marker metabolites of globular embryo development, while zeatin may be a marker metabolite of cotyledonary embryo development.
Further, KEGG analysis indicated that DOX2 participates in ‘alpha-linolenic acid metabolism’ as α-dioxygenase and plays a role in the generation of 2(R)-HPOT from alpha-linolenic acid (Supplementary information, Fig. S22a), which indicates that DOX2 acts as a negative regulator in the synthesis of alpha-linolenic acid. Combined with the SM data, we found that the accumulation of alpha-linolenic acid was elevated at the torpedo embryo stage (Supplementary information, Fig. S22b). While alpha-linolenic acid is a type of fatty acids (FA), studies have found that the reduction of FA synthesis may be one of the reasons for embryonic developmental defects 80, 81. This suggests that DOX2 may affect cotton somatic embryogenesis by negatively regulating the synthesis of alpha-linolenic acid, thereby acting as a negative regulator.