2.1 Construction of a multi-omics atlas of HGSOC including chemotherapy response characteristics using single-cell and spatial transcriptomics
To enhance the understanding of the complex interplay between intra- and intertumoral heterogeneity in HGSOC, and to develop a comprehensive tumor evolutionary model that delineates the distinct tumor and microenvironmental profiles across diverse patient cohorts, we assembled a robust dataset (Fig. 1A). This dataset comprises four single-cell datasets and two spatial transcriptomics datasets from GEO datasets (Table S1), totaling 50 samples. Notably, 28 (57%) of these samples were obtained from patients prior to chemotherapy and who exhibited measurable responses to therapy (Fig. 1B). We also curated two bulk RNA-sequencing datasets that include chemotherapy treatment responses from GEO datasets, encompassing 24 samples (Table S1). Our analytical approach began with the application of advanced text-mining and machine-learning algorithms to three of the single-cell datasets. This enabled the construction of a sophisticated hierarchical evolutionary model specific to HGSOC. This model was then validated and further scrutinized for insights into the relationships between intra-tumoral microenvironmental heterogeneity and intertumoral population heterogeneity using the remaining two datasets. Subsequently, we focused on the spatial transcriptomics datasets to dissect the spatial distribution characteristics of microenvironmental heterogeneity In the merged test dataset, which consisted of 14 samples, we processed and analyzed a total of 87,288 cells after quality control (Fig. 1C). These cells were classified into several key subtypes: B cells (5,700, 6.53%), CAFs (17,664, 20.24%), endothelial cells (ECs; 1,174, 1.35%), epithelial cells (Epi; 20,273, 23.22%), monocytes (15,217, 17.43%), and T and NK cells (T_NK; 27,259, 31.23%). We used the CNV method and calculated the epithelial cell score to identify 18,273 individual tumor cells (Fig. 1C). Through this integrated approach, we aimed to not only elucidate the evolutionary dynamics of HGSOC but also provide a foundation for the development of personalized therapeutic strategies that account for the unique heterogeneity observed in each patient’s tumor.
2.2 Inter- and intratumoral heterogeneity in high-grade serous ovarian cancer
Patient prognosis and treatment response are significantly affected by inter- and intratumoral heterogeneity. We used text mining and consensus clustering to dissect the clonal evolution within individual patients, followed by hierarchical clustering and bootstrap analysis to establish intertumoral clonal evolutionary relationships. We identified three evolutionary branches, designated as BR1, BR2, and BR3, across 69 clusters in 14 samples from GSE140819, GSE154600, and GSE184880 (Fig. 2A). Notably, individual tumor samples tended to cluster under a specific evolutionary branch. To validate the robustness of our method, we replicated the analysis in a subset of the data and confirmed consistent evolutionary relationships among cell populations in dataset GSE184880 (Figure S1).
Having established the clonal evolutionary relationships across different samples, we identified key genes within these branches and conducted functional analyses (Table S2). Key genes of BR1 were enriched in pathways related to myeloid leukocyte migration (GO:0097529), neutrophil degranulation (GO:0043312), and digestion (GO:0007586). BR2 genes were predominantly associated with cellular responses to chemokines (GO:1990869), T-cell migration (GO:0072678), regulation of response to type II interferon (GO:0060330), and autophagy (GO:0006914). BR3 genes were enriched in pathways such as the generation of amyloid fibrils (GO:1990000), cellular response to hypoxia (GO:0071456), positive regulation of growth (GO:0045927), and collagen fibril organization (GO:0030199) (Fig. 2B). These findings suggest that BR3 is characterized by highly fibrotic, immune desert-like tumors, BR2 by T-cell-infiltrated hot tumors, and BR1 by predominantly tumor-cell-dominated cold tumors [18].
Further analysis revealed significant heterogeneity in the distribution of different microenvironmental cells across the branches, with CAFs notably enriched in the BR3 branch (Fig. 2C). To delve deeper into intratumoral heterogeneity, we dissected the heterogeneity of epithelial, fibroblast, monocyte, and CD8 cells. We identified six epithelial cell subpopulations (E_HSPA1B, E_JUND, E_MALAT1, E_S100A4, E_SPP1, and E_TOP2A), with E_SPP1 and E_TOP2A predominantly found in the BR3 branch (Fig. 2D). Secreted phosphoprotein 1(SPP1) is highly expressed in multiple tumor types and interacts with fibroblasts or T cells via pathways such as SPP1-CD44, correlating with tumor malignancy[19]. Type IIA topoisomerase(TOP2A), a cell cycle gene, indicates high proliferative states when overexpressed. Cell velocity and pseudo-time analysis revealed that epithelial cells evolve toward the BR3 subpopulation, with terminal pathways enriched in the cell cycle, epithelial differentiation, and lipid metabolism pathways (Fig. 2D).
We identified five fibroblast subpopulations (F_ACTA2, F_COL1A1, F_JUNB, F_LGALS3, and F_MMP11), most of which were concentrated in the BR3 branch (Fig. 2D). Cell velocity analysis showed enrichment of cell adhesion-related pathways at the onset and metabolic, chemokine, and MAPK pathways at the end (Fig. 2D). Eight macrophage subpopulations were identified (M_A2M, M_CCL3, M_CXCL10, M_HLA-DPB1, M_HMGB2, M_HSPA1B, M_MMP9, and M_SPP1), with M_CCL3, M_CXCL10, M_HLA-DPB1, and M_SPP1 enriched in the BR3 branch (Fig. 2D). Temporal analysis revealed the enrichment of lipid metabolism, granulocyte differentiation, and c-Jun N-terminal kinase (JNK) pathway pathways in the terminal stage (Fig. 2D). Four CD8 cell subpopulations were identified, with CD8_GZMA and CD8_TIGIT enriched in BR1 and BR2, and BR3 dominated by CD8_CXCR4. Cell velocity analysis showed enrichment of T-cell activation, lymphocyte differentiation, and autophagy regulation pathways in the terminal stage of cell differentiation (Fig. 2D).
These results underscore significant inter- and intratumoral cellular heterogeneity in HGSOC, with epithelial cells in BR3 samples exhibiting high proliferation rates and infiltration by diverse fibroblast subtypes.
2.3 Tumor functional clonality is associated with patient prognosis and TME composition
In our previous analysis, we divided all single-cell samples into three branches and identified significant heterogeneity between these branches, along with 150 characteristic genes for each branch (Table S2). We further examined the prognostic clustering effect of these genes. Prognostic analysis across five datasets (TCGA, GSE14764, GSE26193, and GSE26712; Table S1) revealed strong prognostic clustering in these datasets, with the poorest prognosis samples exhibiting high levels of fibroblast infiltration (G1 in TCGA and G2 in GSE14764, GSE26193, and GSE26712; Fig. 3A, B). Correlation analysis also indicated a significant high correlation between fibroblast content and fibroblast-associated marker genes (COL3A1, COL1A2, COL1A1, FN1, and DCN; Figure S2). Subsequently, we used the obtained intertumoral heterogeneity branch genes as a reference gene set to perform a hypergeometric test in different bulk samples to calculate the distribution characteristics of samples in the single-cell BR branches. Bulk samples with the poorest prognosis and high fibroblast infiltration were significantly enriched in BR3 scores (Fig. 3C), further validating the reliability of using single-cell samples for studying intertumoral heterogeneity in tumors. Differential gene analysis in bulk samples also revealed the enrichment of numerous pathways related to epithelial cell migration, cell adhesion, and chemokines in samples with poor prognosis (Fig. 3D). These findings suggest that the fibroblast content in HGSOC samples may be highly correlated with tumor malignancy.
2.4 Identification of a subpopulation of CXCL12-positive fibroblasts associated with chemotherapy resistance and poor prognosis
Previous research has demonstrated a strong correlation between the heterogeneity of TME composition, especially the infiltration of fibroblasts, and prognosis[2]. Given that chemotherapy is commonly used in the clinical treatment of ovarian cancer, we explored the relationship between intratumoral heterogeneity and chemotherapy resistance. We analyzed the differences in the cellular composition of the TME in tumors with varying responses to chemotherapy. Using the 150 characteristic genes obtained (Table S2), we conducted a tumor evolution analysis on five samples within the single-cell dataset (GSE154600). The chemotherapy-sensitive samples GSM4675276 and GSM4675276 belonged to the BR1 subgroup, while the chemotherapy-resistant samples GSM4675273 and GSM4675273 were classified under the BR3 subgroup (Fig. 4A; Table S3). Combining these five samples, a total of 50,571 cells were obtained, including B cells, CAF, monocytes, T cells, EPI_1 (epithelial cells with high levels of Matrix metallopeptidase 7 (MMP7) and E74 like ETS transcription factor 3 (ELF3)), and EPI_2 (epithelial cells with a high level of hes family bHLH transcription factor 1 (HES1)) (Figs. 4B and S3). Among these, B cells were significantly enriched in samples derived from BR1 and BR2, while CAFs were significantly enriched in BR3 samples, indicating that the degree of fibroblast infiltration not only affects patient prognosis but also correlates with the treatment response (Fig. 4C). To elucidate the relationship between CAFs and ovarian cancer treatment response, we further sub-classified CAFs and identified seven distinct subtypes (Fig. 4D). F_CCL21 and F_RAMP3 were significantly enriched in BR1, and F_PGF, F_IGFBP4, F_IGFBP7, and F_CXCL12 were significantly enriched in BR3, particularly the F_CXCL12 subtype, which was almost exclusively found in BR3 (Fig. 4E). We then used gene co-expression analysis to identify key characteristic genes in different CAF subpopulations, resulting in four gene modules, with the blue module showing the highest correlation with the BR3-specific F_CXCL12 subgroup (R = 0.88, P < 0.001, Fig. 4F) and the yellow module highly correlated with the BR1-specific F_RAMP3 module (R = 0.97, P < 0.001; Fig. 4F). Analysis of the hub genes in the blue module revealed that these genes were primarily enriched in pathways related to extracellular matrix remodeling and tube morphogenesis (Figs. 4G and S5). Given the significant enrichment of CAFs in BR3, we explored the communication between these fibroblasts and tumor cells and found high-intensity interactions between the fibroblast subpopulations F_CXCL12 and F_IGFBP7 and tumor cells in EPI_1 (Fig. 4H). Specifically, F_CXCL12 interacts with tumor epithelial cells via the ligand-receptor pair CXCL12/CXCR4 and through the secretion of placental growth factor (PGF) (Fig. 4I). These findings suggest that CXCL12-positive CAF cells may be associated with chemotherapy resistance.
To validate the relationship between fibroblasts and chemotherapy, we examined the relationship between fibroblasts and treatment response in the other dataset (GSE165897) that included chemotherapy response information (Table S4). Using the previously identified 150 characteristic genes, we performed a hierarchical evolutionary analysis between samples and conducted permutation tests. The platinum-free interval (PFI) values of samples of EOC3, EOC349, EOC540, and EOC87 in the BR3 subgroup were significantly lower than those in the groups BR1 and BR2, indicating significant chemotherapy resistance, while the samples EOC136 and EOC153 in BR1 showed chemotherapy sensitivity, again confirming the high correlation between intertumoral heterogeneity and treatment response (Fig. 5A; Table S4). From this dataset of nine samples, we obtained 17,898 cells, comprising 11 subpopulations, including two fibroblast subpopulations, F_CXCL12 and F_COL1A1, with significant F_CXCL12 infiltration in BR3 and significant F_COL1A1 enrichment in BR1 (Figs. 5B, C and S6). We then re-clustered the fibroblasts and identified five fibroblast subpopulations, including the F_CXCL12 subgroup (Figs. 5D and S6). Using gene co-expression analysis, we also obtained a gene module that was highly correlated with the F_CXCL12 subgroup (R = 0.85, P < 0.001; Fig. 5E), with characteristic genes primarily enriched in TGF-beta, MAPK, and cytokine interaction signaling pathways (Figure S7).
In both datasets, we identified a fibroblast subgroup with high CXCL12 expression, and a similarity correlation analysis between subgroups confirmed the high similarity of CXCL12-high fibroblast subgroups in both datasets (Fig. 5E).
2.5 Identification of a gene set affecting prognosis and drug resistance in HGSOC
Above, we identified a significant correlation between the highly invasive CXCL12-positive CAF subpopulation and both treatment response and prognosis in ovarian cancer. We next conducted a comparative analysis of the WGCNA hub gene results to consolidate the characteristic genes of the CXCL12-positive CAF subpopulation and identified 24 shared genes (Table S5). Subsequent Cox regression analysis using the GSE26193 dataset revealed that DCN, CXCL12, and TNFAIP6 were significantly positively associated with poor prognosis in ovarian cancer, with DCN specifically identified as a fibroblast-characteristic gene (Fig. 6A). Using these significantly differential genes and Cox regression coefficients, we performed a risk analysis across four additional datasets(GSE14764, GSE26712, TCGA and GSE9891, Table S1). The findings indicated that the risk coefficients for the high gene expression groups were notably higher than those for the low expression groups, and both CXCL12 and DCN showed significant positive correlations with poor overall patient prognosis (Fig. 6A, B). Through Friends analysis, we observed a high correlation of CXCL12 with other genes (Fig. 6C). Additionally, using these 24 genes, we applied enrichment analysis in three treatment datasets (GSE33482, GSE114206 [20], and GSE189843 [15]) to validate the association between CAFs and chemotherapy response (Fig. 6D; Table S6). Based on gene expression data from these samples, we categorized them according to their enrichment levels in the 24 genes. These 24 genes could effectively stratify samples into high and low groups, with the high CAF subpopulation exhibiting strong drug resistance. Moreover, the 24 genes showed excellent predictive efficacy for drug resistance, with AUC values of 1, 0.89, 0.83 in datasets GSE33482, GSE114206 and GSE189843 respectively (Fig. 6D). These findings underscore the critical role of fibroblasts in OC treatment and prognosis, and through further feature extraction, we have identified a reduced set of genes that are indicative of treatment response.
2.6 Spatial distribution characteristics of CXCL12-positive fibroblasts affect the clinical prognostic results of HGSOC
Given the strong correlation between CXCL12-positive fibroblasts and the treatment outcomes and prognoses of HGSOC, as well as their interaction with tumor cells via CXCR4, we elucidated the interaction mechanisms of CXCL12-positive fibroblasts in vitro and in spatial transcriptomic data. In vitro, we silenced the CXCL12 receptor gene CXCR4 in the ovarian cancer cell line SKOV3 (Fig. 7A). As shown by qPCR, all three siRNA knockdown fragments significantly suppressed CXCR4 mRNA expression levels in SKOV3 cells (P < 0.05; Fig. 7A). Subsequent addition of exogenous CXCL12 protein to both the control group and the CXCR4-si group demonstrated that CXCR4 knockdown significantly inhibited cell viability compared with the siNC control group (P < 0.05; Fig. 7B, C). Considering that fibroblasts and tumor cells interact through cellular communication, and the intensity of this communication is highly related to the spatial positioning of genes, we analyzed the spatial distribution of fibroblasts and tumor cells within a treatment response-associated spatial transcriptomic dataset. Using the inferCNV method and cell type identified, we identified tumor regions and the surrounding cellular environments. We analyzed the spatial distribution characteristics of cells in chemotherapy-resistant (GSM6506110) and chemotherapy-sensitive (GSM6506114) samples in GSE211956 dataset respectively (Fig. 7D, 7E, Table S7). In the chemotherapy-resistant samples, we identified 5 clusters (Fig. 7D(i)). Cell type analysis revealed that cluster 4 is enriched with fibroblasts with high expression level both CXCL12 and DCN (Fig. 7D(iii)). Furthermore, in the surrounding area of cluster 4 (Nbs_4), which are enriched in tumor epithelial cells and expressed high level of tumor cell marker KRT19 and CXCR4 (the receptor for CXCL12) (Fig. 7D(iii)). In contrast, in the chemotherapy-sensitive samples, we identified 4 clusters (Fig. 7E(i)). Among these, clusters 0, 1, and 3 are enriched with fibroblasts (Fig. 7E(ii)). Cluster 1 is specifically enriched with KRT19-positive tumor cells (Fig. 7E). However, in the regions dominated by KRT19-positive tumor cells, there is no high expression of CXCR4, forming a spatial interaction domain of CXCL12-positive CAFs. Finally, in our treatment response-associated clinical samples, we also validated the local spatial adjacency of CXCL12/DCN/KRT19 in chemotherapy-resistant samples (Fig. 7E).