Deep learning on image-omics data in identifying prognostic immune biomarkers for ovarian cancer

Although stromal and immune cells in the tumor microenvironment have been shown to directly affect tumor growth and chemoresistance, how the interactions among stromal and immune cells and their spatially resolved cell heterogeneity would inuence ovarian patients’ survival remains largely unknown. To ll this gap, we developed a new imageomics method that (i) incorporates an articial intelligence– based analytics pipeline for imaging mass cytometry (IMC) to increase the accuracy of cell segmentation and spatial information extraction in order to identify immune biomarkers and their interactions that can predict overall survival rates in patients with treatment-naïve ovarian cancer, and (ii) integrates quantitated spatial IMC image data with microdissected tumor and stromal transcriptomic data from the same patients as well as single-cell RNA sequencing data to detect genes correlated with the prognostic features and postulate novel mechanisms by which these genes contribute to the prognostic features.


Introduction
Advanced high-grade serous ovarian cancer (HGSC) is the most lethal gynecologic malignancy, causing more than 13,000 deaths annually in the United States 1 . HGSC is notable for initial sensitivity (75% response rate) to platinum and taxane neoadjuvant chemotherapy or chemotherapy following debulking surgery 2,3 . However, most tumors (> 75-80%) recur within 12 to 24 months after treatment, and many patients die of progressively chemotherapy-resistant disease [4][5][6] . A critically important component that in uences the patient survival is the tumor microenvironment 7,8 , which is primarily composed of broblasts, extracellular matrix proteins, endothelial cells, lymphocytic in ltrates, and cancer cells. The tumor microenvironment has been shown to directly affect cancer cell growth, migration, invasion, chemoresistance, cell-cell interactions, and matrix remodeling 9,10 . However, spatially resolved, single-cell analysis that identi es tumor and stromal cell phenotypes and characterizes their organization and heterogeneity, as well as biomarkers at a single-cell level for predicting survival of HGSC patients, are lacking.
Imaging mass cytometry (IMC) is an imaging-based mass cytometry (CyTOF) that couples immunohistochemical and immunocytochemical methods with high-resolution laser ablation 11 to allow the imaging of more than 30 proteins and protein modi cations simultaneously at subcellular resolution.
This enables researchers to uncover the heterogeneity of cellular phenotypes and cell-cell interactions (phenograph 12 ; histocat 13 ). Cell segmentation is the rst key step of IMC analysis. However, IMC images of biological tissues, particularly those of solid tumors, are extremely challenging for conventional cell segmentation methods, such as the watershed algorithm 11 and pixel-based classi cation 13 , owing to great variations in image intensities and cell shapes, overlapping cells, dense cell clusters, blurred edge information, missing object borders, and low signal-to-noise ratios. In addition, the highly multiplexed IMC data generate rich information of cell phenotypes, spatial organization, and heterogeneity. The way to quantify and integrate of various data to identify reliable prognostic biomarkers remains largely unexplored.
In the current study, we built a novel analytic and modeling pipeline of IMC images using deep learning and applied it to predict patient survival rates using IMC data generated from patient samples of treatment-naïve HGSC tumor tissues. We compared cell density and nearest-neighbor interactions in tumor-enriched regions between long-term and short-term survivors. We also used the cell density or nearest-neighbor interactions as features to build a machine learning model to identify salient prognostic features for predicting patient survival. Finally, we combined the quantitated IMC images with cancer and stromal gene expression data of microdissected tissue specimens from the same HGSC patients to detect genes that are signi cantly correlated with prognostic features and postulate the mechanisms by which these genes contribute to the prognostic features.

Image analysis pipeline
To comprehensively quantify the cellular heterogeneity and spatial organization of HGSC tissue and nd biomarkers that predict patient survival, we used IMC to detect 34 different proteins in 41 tumor samples from treatment-naïve HGSC patients (Table S1). Tissue sections were stained with a panel of metaltagged antibodies followed by laser ablation coupled to mass spectrometry to generate high-dimensional images as previously described 11 (Fig. 1). Among the 34 analyzed proteins, we selected 21 markers with relatively high signal-to-noise ratios for further analysis (Fig. S1A). Our selected panel consisted of markers of proliferation; immune cell regulators, and markers of epithelial, stroma, immune and endothelial lineages (Fig. S2). Resulting data were then analyzed using a novel IMC-adapted image analysis pipeline. Brie y, cell segmentation was performed using a deep learning method, followed by three rounds of clustering to identify and annotate different cell subtypes used for cell density and neighborhood analyses. These two features were then used for survival prediction analysis and correlating IMC phenotype with gene expression pro le (Fig. 1).

Cell segmentation and annotation by deep learning-based IMC data analysis
We employed a deep learning method, mask-R-CNN (MRCNN) 14 for IMC cell segmentation based on the computerized outputs of watershed segmentation ( Fig. 2A). The mean dice similarity of MRCNN was signi cantly higher than that of watershed (mean of mean dice similarity of MRCNN = 0.59, mean of mean dice similarity of watershed = 0.56, p = 0.0023, n = 10 samples, paired t test; Fig. S1B), and the standard deviation of dice similarity of MRCNN was signi cantly lower than that of watershed (mean standard deviation of dice similarity of MRCNN = 0.20, mean standard deviation of dice similarity of watershed = 0.24, p = 1.33*10 − 5 , n = 10 samples, paired t test; Fig. S1C). These results suggest that the segmentation by MRCNN is more similar to the manual segmentation than the watershed segmentation and that it generates less oversegmentation than watershed ( Fig. 2A).
After three iterations of phenograph clustering to identify cell subtypes, we identi ed 40 cell subtypes of T and B cell, macrophage, endothelial, and broblastic cell populations, as well as tumor cell subtypes from 162,869 cells in 41 images (Fig. 2B) and quanti ed the normalized expression of all markers across various cell subtypes (Fig. 2C). Speci cally, based on the normalized expression of cell subtype-speci c markers, we identi ed nine macrophage/monocyte subtypes (Fig. 3A), four CD8 + T cell subtypes and 6 CD4 + T cell subtypes (Fig. 3B), nine tumor subtypes (Fig. 3C), and many other broblastic and immune cell subtypes (Fig. 3D). Because the normalization step converts the expression of all cells in a sample to a z-score, the normalized expression re ects the relative expression across all cells. For example, although tu_1, tu_2, tu_3, tu_5, tu_6, tu_8, and tu_9 have median normalized Keratin 8/18 expression around zero (Fig. 3C), they exhibit signi cantly higher Keratin 8_18 expression levels than non-tumor cells (Fig. S3). Based on the results of Fig. 2C and Fig. 3, the phenotypes of all cell subtypes are summarized in Table S2.

Spatially resolved cell density and cell-cell nearest-neighbor interactions analyses of the ovarian tumor microenvironment
We automatically calculated the tumor-enriched region for each image (Methods). By computing the cell density as the cell count in the tumor-enriched region per total tumor cells, we found several cell subtypes exhibiting signi cant differences between long-term survivors (LTS; overall survival ≥ 60 months, n = 21) and short-term survivors (STS; overall survival ≤ 20 months, n = 20; Fig. 4A). Among different T cell subtypes, granzyme B + CD8 + cytotoxic T cell (CD8_4) density was signi cantly higher in LTS (p = 0.019, Fig. 4A) and CD45RO + CD44 + CD8 + memory T cell (CD8_3) density had a trend of declining in in LTS (p = 0.084, Fig. S4) than in STS. In addition, CD45RO + CD4 + memory T cell (CD4_4) density was signi cantly higher in LTS than in STS (p = 0.024, Fig. 4A). Among different CD73 + cell subtypes, CD73 + cell (CD73_1) density and CD73 mid cell (CD73_2) density were signi cantly lower in the LTS than in STS (p = 0.015 and p = 0.002, respectively; Fig. 4A). CD31 + CD73 mid endothelial cell (CD31) density was signi cantly lower in LTS than in STS (p = 0.007, Fig. 4A). Among tumor cell subtypes, B7H4 + Keratin + tumor cell (tu_9) density was signi cantly lower in LTS than in STS (p = 0.018, Fig. 4A). A comparison of cell densities of all cell subtypes between LTS and STS is shown in Fig. S4.
Next, we examined the prognostic signi cance of cell-cell nearest-neighbor interactions by computing the average cell count of cell subtype X in the nearest neighborhood of cell subtype Y (distance between the center of two cells less than 20 µm) in the tumor-enriched region of every LTS and STS patient sample.
We computed the cell-cell nearest-neighbor interactions that were signi cantly higher or lower (Benjamini-Hochberg adjusted p value < 0.05) in LTS (n = 21) than in STS (n = 20). We identi ed 120 cell-cell nearestneighbor interactions that were signi cantly different between LTS and STS (Fig. 5A). Among them, granzyme B + CD8 + cytotoxic T cells (CD8_4) had signi cantly more interactions with multiple tumor cell subtypes (tu_1, tu_2, tu_3, and tu_5) in LTS than in STS (Fig. 5A), suggesting increased interactions between CD8 cytotoxic cells and multiple subtypes of tumor cells in LTS. An example of the interaction between CD8_4 and tu_1 is shown in Fig. S5A.

Feature selection for overall survival prediction by logistic regression
We used a machine learning method, logistic regression to predict survival, with recursive elimination of features after ltering out highly correlated ones (see Methods). Our rst approach included using only the cell density detected in tumor-enriched regions and patient age as features (Fig. 5B). The optimal number of selected features was seven, because both training accuracy (0.947) and validation accuracy (0.938) were high ( Fig. 5B.a). The test accuracy was 0.78 (test sensitivity = 0.6, test speci city = 1) and the area under the curve (AUC) was 0.8 ( Fig. 5B.b). Among the seven prognostic features selected by the model, CD73 mid cells (CD73_2), CD31 + CD73 mid endothelial cells (CD31), CD163 + CD68 + Vista mid CD14 + macrophages (ma_1), CD45RO + CD44 + CD8 + memory T cells (CD8_3), and age had negative coe cients, suggesting that they were inversely correlated with patient survival. Granzyme B + CD8 + cytotoxic T cells (CD8_4) and CD45RO + CD4 + memory T cells (CD4_4) had positive coe cients, suggesting that they were positively correlated with patient survival (Fig. 5B.c, Fig. 4A).
Next, we used the cell-cell nearest-neighbor interactions, which are related to the seven prognostic cell density features in the tumor-enriched region, and patient age as features (Fig. 5C). The optimal number of selected features was 11, because both training accuracy (1) and validation accuracy (1) were the highest for that number of features ( Fig. 5C.a). The test accuracy was 0.89 (test sensitivity = 1, test speci city = 0.75) and the AUC was 1 (Fig. 5C.b). Among the 11 features that were selected by the logistic regression model, CD73_2 neighboring CD73_1, ma_9, or s_2; CD31 neighboring tu_4; CD8_3 neighboring ma_8; and age were negatively correlated with patient survival. In contrast, CD4_4 neighboring s_2, s_1, ma_1, or CD44 and CD8_4 neighboring tu_1 were positively correlated with survival ( Fig. 5C.c). Some of the prognostic nearest-neighbor interaction features, such as CD8_4 neighboring tu_1, CD73_2 neighboring ma_9, CD4_4 neighboring ma_1, and CD8_3 neighboring ma_8, can be visualized in Fig. S5. These results indicate that multiple neighboring interactions may have similar predictive power for survival prediction. The complex cell-cell interaction patterns of ovarian cancer with various immune cell and other stromal subtypes led to divergence in the tumor microenvironment between LTS and STS.
Taken together, our results indicated that using cell-cell nearest-neighbor interactions and age as features allowed a more accurate prediction of patient survival than using cell densities and age as features. The Spearman correlation of any two neighbor interaction features that both had relatively high correlation with patient survival (Spearman correlation coe cient > 0.2) and were related to the seven prognostic cell density features is visualized in Fig. 5D. The Spearman correlation study demonstrated that certain features were highly correlated. For example, we found that the average cell count of CD8_4 neighboring with a tu_1 cell was highly correlated with the average cell count of CD8_4 neighboring with a tu_2 cell (r = 0.77, p = 5*10 − 9 ) and the average cell count of CD8_4 neighboring with a tu_3 cell (r = 0.72, p = 1*10 − 7 ). The feature elimination process of our logistic regression model rst ltered out the highly correlated features and kept one feature of each highly correlated feature pair (see Methods). Owing to this elimination process, features that are highly correlated with any feature in the prognostic list identi ed by the logistic regression model may also have prognostic value. For example, because CD8_4 neighboring tu_1 was a prognostic feature, CD8_4 neighboring tu_2 or CD8_4 neighboring tu_3 may have similar prognostic values.
Correlations between cell subtype density and transcriptomic pro les from microdissected broblastic and epithelial compartments of HGSC To provide mechanistic insights by which certain cell phenotypes identi ed by IMC modulate survival in HGSC patients, we performed correlation studies to identify genes in the epithelial compartment ( Fig. 6A) or broblastic compartment (Fig. 6B) of HGSCs that had a signi cant positive or negative correlation with the IMC cell density in the tumor tissue (Table S3 and 4) and that were signi cantly different between LTS and STS ( Fig. S6 and 7). A total of 26 samples with whole transcriptome data available were used. We focused our attention on the cell densities of the six IMC features that were signi cantly different between LTS and STS (Fig. 4A), four of which were also selected by our machine learning model for survival prediction (Fig. 5B.c). We discovered relationships consistent with known cancer biology, and we also made several unexpected observations.
In the epithelial compartment of HGSC tumors, we identi ed expression levels of several genes involved in the migration and invasion of cancer cells correlated with CD73_1, CD73_2, and CD31 densities and with poor survival, suggesting that these genes regulate the metastatic potential of cancer cells in a paracrine manner; these genes included PARD6B, S100A10, SLURP1, and SPINT1. In addition, we demonstrated that ovarian cancer cell-derived ITIH5, TACSTD2, and WFDC2 expression levels were negatively correlated with granzyme B + CD8 + cytotoxic T cell (CD8_4) densities. In contrast, BPIFB1 and SLURP1 expression levels were positively correlated, and CD9 and ITIH5 were negatively correlated with CD45RO + CD4 + memory T cell (CD4_4) densities (Fig. 6A, Table S3, Supp. Figure 6). These ndings suggest that these ovarian cancer cell-derived genes, which are known to code for extracellular matrix and metabolism modulators [19][20][21][22][23][24][25] , could facilitate or block T cell in ltration in the tumor microenvironment as well as interfere with T cell activation and facilitate immune system evasion.
Moreover, cancer cell-derived FST and PARD6B were positively correlated with CD31 + CD73 mid endothelial cell (CD31) density, indicating that these genes modulate endothelial cell activity and subsequently angiogenesis. The prognostic signi cance of these genes was determined by analyzing 530 samples of optimally debulked advanced HGSC in the KM Plotter, and the results are summarized in Fig. S6.
CAF-derived genes involved in the modulation of the immune response and also associated with patient survival were also examined, and the results are summarized in Table S4 and Fig. S7. Among these genes, BMPR1B, a gene encoding the receptor of the bone morphogenetic protein (BMP), was negatively correlated with CD4_4 cells and associated with STS. This nding suggests that CAFs expressing high levels of BMPR1B may be more responsive to BMP signaling, which subsequently modulates the rigidity of CAFs, stiffness of the tumor microenvironment, and CD4_4 T cell tra cking. Besides BMPR1B, VSTM4, a secreted protein that can reduce IFN-γ, IL-2, and IL-17 cytokine production by human T cells and cause a profound decrease in T cell activation 51 , was negatively correlated with CD4_4 density but positively associated with STS, suggesting that CAF-derived VSTM4 modulates CD4_4 activity and subsequently leads to poor survival rates in patients with HGSC.
In addition to these correlations between expression of CAF-derived genes and prognostic immune cell phenotypes, expression of several genes with prognostic signi cance was also associated with the two CAF phenotypes CD73_1 and CD73_2. FN1, TGFBI, TNC, and LRRC32 were positively correlated with both CD73_1 and CD73_2 and negatively correlated with survival. LRRC32 is a key regulator of TGF-β activation 52 . Together with increased TGFB1 secreted by CD73_1 and CD73_2 CAFs, LRRC32 may generate an immune-suppressive and pro-tumorigenic microenvironment to support the malignant phenotype of ovarian cancer cells, as we previously described 53,54 . Both FN1 and TNC encode extracellular proteins that have previously been shown to be associated with short progression-free survival and increased migration-inducing potential in HGSC 55,56 .
Several broblastic genes were associated with density of endothelial cells (CD31). Among them, MFGE8, which was also positively correlated with CD73_1 and CD73_2 density, demonstrated the strongest correlation with increased CD31 density and poor patient survival (Table S4 and Fig. S7). Our ndings are supported by published studies reporting that MFGE8 increases tumor angiogenesis by increasing VEGF and ET-1 expression in stromal cells and by enhancing M2 polarization of macrophages 32 . Moreover, MFGE8 proteins accumulated around CD31 + blood vessels have been shown to promote angiogenesis by enhancing PDGF-PDGFRβ signaling mediated by integrin-growth factor receptor crosstalk 33,57 .
Taken together, the transcriptome analysis revealed multiple key genes in both the cancer cell and broblastic compartments of HGSC that correlate with IMC features positively or negatively, confer either a tumor-promoting or an immune-suppressive microenvironment, and subsequently lead to short-term survival in HGSC patients (Fig. 6C).

Discussion
In the present study, we applied a new analytic pipeline to IMC data based on deep machine learning that ultimately allowed us to predict HGSC patients' survival rates. We quanti ed the abundance of 21 markers in each sample, focusing our attention on the characterization of the immune milieu in both tumoral and stromal compartments. MRCNN, a deep learning method that was initially designed for object detection and instance segmentation of natural images 14 , outperformed all existing single-model entries on every task in the recent Microsoft Common Objects in Context (COCO) challenge, one of the most authoritative competitions in object detection and segmentation 58 . MRCNN was adapted to perform nuclei segmentation in histologic microscopic images 59 . In the current study, we demonstrated that the MRCNN model can also be adapted for cell segmentation in noisy, densely packed, structurally complicated, and widely varying IMC images with minor modi cation. We also showed that it performed better than the traditional watershed method for IMC cell segmentation using the computerized outputs of watershed segmentation as training sets.
Analysis of spatially resolved cell densities enabled us to identify several cell subtypes exhibiting signi cant differences between LTS and STS. In particular, we discovered that the density of granzyme B + CD8 + cytotoxic T cells (CD8_4) was signi cantly higher in LTS than in STS, expanding the prior knowledge that CD3 + , CD4 + , and CD8 + tumor-in ltrating lymphocytes are associated with positive outcomes in ovarian cancer [60][61][62][63] and highlighting the fact that only a speci c subtype of activated T cells (granzyme B + ) is associated with good clinical outcomes in ovarian cancer. In addition, we found that the density of CD45RO + CD4 + memory T cell (CD4_4) was signi cantly higher in LTS than in STS.
Memory CD4 + T cells could provide a protective response against cancer cells by making effector cytokines respond early and by enhancing CD8 + T and B cell responses, as well as by secreting cytokines that can induce other cells in the tumor microenvironment to mount antitumor immunity 61 .
Among different CD73 + cell subtypes, densities of CD73_1 and CD73_2, two different CAF subtypes, were signi cantly lower in LTS than in STS. CD73 is a GPI-anchored nucleotidase that catabolizes the production of extracellular adenosine and promotes tumor immune escape and thereby tumor growth. Indeed, CD73 expression has been shown to be associated with shorter disease-free and overall survival in HGSC patients and decreased CD8 + tumor-in ltrating lymphocytes 64,65 . A recent study reported that CAF-derived CD73 enforces an immune checkpoint 66 . These ndings reinforce the hypothesis that CAFs play a role in shaping the immune landscape of the tumor microenvironment and modulating patient survival rates.
Among different tumor cell subtypes, B7H4 + Keratin + tumor cell (tu_9) density was signi cantly lower in LTS than in STS. B7-H4 overexpression in cancer cells has been previously identi ed in high-grade ovarian tumors 67 , but to the best of our knowledge, this is the rst report showing that increased density of tu_9, a subtype of ovarian cancer cells expressing high levels of B7-H4, is associated with poor patient overall survival rates.
Finally, we found CD31 + CD73 mid endothelial cell (CD31) density was signi cantly lower in LTS than in STS. Although CD31 expression has shown no prognostic survival value, high CD31 expression was found in poorly differentiated tumors in a published study 68 .
Communication between heterogeneous tumor cells and various types of stromal cells, including in ltrating T cells, macrophages, CAFs, endothelial cells, and others, has been shown to be able to shape the ovarian cancer ecosystem, which subsequently modulates disease progression and clinical outcome 3,62,69 . Our deep machine learning analytic pipeline identi ed 10 cell-cell nearest-neighbor interactions together with age as features that can achieve the best survival prediction accuracy, resulting in an AUC of 1. These ndings indicate that interaction between different cell subtypes in the tumor microenvironment generates better prognostic features in predicting patient survival rates than cell density alone. For example, CD31 neighboring tu_4 was negatively correlated with patient survival, con rming that increased angiogenesis supports tumor cell growth and subsequently leads to poor patient survival rates. In contrast, CD8_4 neighboring tu_1 was positively correlated with survival, suggesting that intratumoral activated CD8 + CTL closely interacting with tumor cells represent an attempt of an anti-tumor response, which subsequently leads to improved patient survival rates, as has previously been reported [60][61][62][63] . Increased interaction between heterogenous populations of cell subtypes in the tumor microenvironment likely involves ligand-receptor crosstalk among the different cell subtypes. Further experiments using spatially resolved single cell transcriptomes on one cell subtype with its nearest neighboring partner will be needed to validate the cell-cell interactions as well as to understand how these interactions and the crosstalk signaling networks contribute to malignant phenotypes and their correlation with patient survival rates.
Several studies have coupled IMC data to multiplatform genomics to understand how the genome shapes the composition and architecture of tumor ecosystems 70,71 . To delineate the molecular mechanisms by which certain cell phenotypes identi ed by IMC modulate survival rates in HGSC patients, correlation studies have used genes in the microdissected epithelial compartment or broblastic compartment of HGSCs associated with patient survival and prognostic cell phenotypes identi ed by IMC. We found that cancer cell-derived WFDC2 negatively correlated with granzyme B + CD8 + cytotoxic T cell (CD8_4) density and was associated with poor HGSC patient survival. In fact, WFDC2, which encodes the protein HE4, has been shown to correlate with poor survival in HGSC patients and promote tumor growth and confer chemoresistance in ovarian cancer 72 . Moreover, HE4 has been described as a driver of immune failure in ovarian tumors by compromising cytotoxic CD8 + T cells through upregulation of selfproduced dual-speci city phosphatase 6 (DUSP6) 73 . These ndings suggest that our deep machine learning pipeline showed robust performance in identifying prognostic biomarkers associated with immune cell phenotypes described in previous studies. In addition to WFDC2, we found that other interesting genes involved in the regulation of T cell activity, apoptosis, and in ltration were overexpressed in STS; these genes include ITM2B, ITIH5, CD9, and TACSTD2 21,22,74 .
CAFs have been shown to facilitate cancer progression by supporting tumor cell growth, extracellular matrix remodeling, angiogenesis, and formation of an immunosuppressive microenviroment 71 . These results are supported by our research showing that CAF subtypes CD73_1 and CD73_2 express genes, such as ANGPTL2, TNC, TGFB1, FN1, BMPR1B, and LRRC32, that have been shown to promote tumor progression, angiogenesis, and extracellular matrix remodeling 45,46,52,55,56,75,76 . Some of these genes, including TGFB1, BMPR1B, and LRRC32, have been shown to modulate TGF-β signaling, which suppresses in ltration of anticancer immune cells such as cytotoxic T cells and natural killer cells and promotes the function of pro-cancer immune cells, such as regulatory T cells and M2 macrophages, in the tumor microenvironment 45,77,78 , leading to poor patient survival rates. Using additional antibodies targeting these immune cell phenotypes in IMC analysis will further validate our observations.
In conclusion, our novel deep machine learning-based IMC analytic pipeline combined with transcriptomes generated from microdissected epithelial and broblastic compartments of HGSC patient specimens demonstrates the heterogeneity of both tumor and stromal cell subtypes in HGSC. The imageomics analysis also identi ed cellular features and phenotypes with prognostic signi cance and helped delineate the molecular mechanism by which these features modulate the tumor-promoting and immunesuppressive microenvironment (Fig. 6C). Integrating quantitative spatial features, IMC-derived cell phenotypes, and transcriptomic data provides a unique opportunity to characterize ovarian cancer in an unprecedented way, generating multiparametric tissue biomarkers to predict clinical outcomes.

Patient samples
A total of 41 para n-embedded tumor tissue samples obtained from patients with advanced stage (stage IIIB-IV) high-grade serous ovarian cancer (HGSC) were used in the current study. Tissue samples were obtained from the ovarian cancer repositories at The University of Texas MD Anderson Cancer Center and Gangnam Severance Hospital, Yonsei University College of Medicine. They were collected from previously untreated patients undergoing primary cytoreductive surgery for ovarian cancer. After surgery, patients received platinum-based combination chemotherapy. Optimal surgical cytoreduction was de ned by a residual tumor no more than 1 cm in diameter. The overall survival duration was measured from the date of diagnosis to the date of death or censored at the date of the last follow-up examination. Long-term survivors were those with an overall survival time ≥ 60 months, and short-term survivors were those with an overall survival time ≤ 20 months. Clinical data, including age, cytoreduction status (optimal vs. suboptimal), and overall survival, were obtained from the records of the patients with HGSC. All samples and clinical data were collected with the approval of the Institutional Review Boards of MD Anderson and Gangnam Severance Hospital.

Preparation and staining
Tissue slides were depara nized in xylene followed by rehydration in a graded alcohol series. Antigen retrieval was performed with citrate buffer (pH 6) at 95 °C in a decloaking chamber (Biocare Medical) for 25 minutes. Slides were then blocked with 3% bovine serum albumin in phosphate-buffered saline for 30 minutes and incubated for 2 hours at room temperature with metal-tagged antibodies described in Fig.  S1. Following incubation, tissue slides were washed with phosphate-buffered saline and incubated with 0.5 µM Cell-ID Intercalator-Ir (Fluidigm) for the detection of nuclear DNA. Slides were then rinsed in phosphate-buffered saline and air-dried.

Imaging mass cytometry
Imaging mass cytometry (IMC) data were acquired by a Fluidigm Helios CyTOF instrument equipped with a Hyperion System laser ablation module in the Flow Cytometry and Cellular Imaging Facility at MD Anderson. A total of 41 images of 1 mm 3 each were acquired and used for the current study, including 20 images from short-term survivors (overall survival ≤ 20 months) and 21 images from long-term survivors (overall survival ≥ 60 months). Each 1 mm 2 region of interest on the tissue section was selected based on the image from the corresponding hematoxylin and eosin stained serial tissue section, which demonstrated representative of tumor regions surrounded by stomal cells.
Microdissection and microarray analysis of tissue samples RNA was extracted from microdissected frozen HGSC samples, which included tumor epithelial components and stromal components from 16 HGSC patients, from whom IMC data were available.
Extensive details of specimen handling, RNA extraction and ampli cation, microarray hybridization, and quality-control procedures have been described previously 54 .

Data preprocessing and cell segmentation
Data were converted to TIFF format by MCD viewer (Fluidigm). Channel spillover was compensated using the nonnegative least square approach 79,80 . Watershed segmentation was performed on the maximal projection of normalized images of H3 and nucleic acid intercalator ( 191 Ir and 193 Ir). The best Watershed segmentation results were achieved with prior median ltering (2 × 2 pixels) followed by Gaussian blurring (kernel width of 2 pixels), and standard parameters for watersheds. These steps were performed by an in-house-developed Matlab script with the Matlab image processing toolbox. Mask-R-CNN (MRCNN) segmentation was trained on the outputs of Watershed segmentation. Thirty-one images were employed for training, with each image segmented into 16 small pieces to increase the training speed (each small image had a size of ~ 250 × 250 pixels, image resolution 1 µm/pixel). A total of 496 small images were used as the training set. Each image was converted from a grayscale image into an RGB image in which the R and G channels were median-ltered grayscale images with 2 × 2 and 3 × 3 neighborhoods, respectively, and the B channel was the raw grayscale image. We used an MRCNN model with a feature pyramid network and a convolutional neural network ResNet-101 backbone based on an existing implementation (https://github.com/matterport/Mask_RCNN) by Matterport Inc. (released under an MIT License) that employed the python open-source libraries Keras and Tensor ow. We initiated the model using weights obtained from pretraining on the MSCOCO dataset 58 . We started with a learning rate of 0.001 and trained with 50 epochs and decreased the learning rate to 0.0001 and trained for 50 epochs.
The training was performed using one NVIDIA V100 GPU (Amazon AWS p3.2xlarge instance). To compare the results of Watershed and MRCNN segmentation, we manually segmented 10 testing images (~ 300 cells per image, each image had a size of ~ 250 µm × 250 µm) to be used as a reference segmentation. To quantify the performance of different segmentation methods, we computed the dice similarity coe cient between each cell in Watershed segmentation and its maximum overlapping cell in reference segmentation, and between each cell in MRCNN segmentation and its maximum overlapping cell in reference segmentation. We compared the mean and standard deviation of the dice similarity coe cient of all the cells in each sample between Watershed and MRCNN segmentations (Fig.  S1B, C).
Signal-to-noise ratio evaluation and data normalization neighbor interactions, the average cell count of cell subtype X in the nearest neighborhood of cell subtype Y (distance between the centroids of X and Y < 20 µm) in the tumor-enriched region was computed. An unpaired t test was used to determine if there was a signi cant difference (adjusted p < 0.05) between the means of cell-cell nearest-neighbor interactions of long-term and short-term survivors. Adjustment for multiple testing was conducted using the Benjamini-Hochberg method 82 .

Survival prediction
The training and test data sets were randomly split at 80% and 20% of the total 41 samples that contained 21 long-term survivors and 20 short-term survivors, respectively. The ratio of the long-term to short-term survivors in the training data set was 1 and in the test data set was 1.25. For survival prediction using only cell density and age, the 41 cell density and age features were rst subjected to Spearman correlation with survival, and 22 features that had an absolute correlation coe cient larger than 0.2 were kept. Because logistic regression assumes independence between features, features could not be highly correlated. Only one of the highly correlated features (absolute Spearman correlation coe cient ≥ 0.65) was kept and the rest were dropped, leaving 17 features. Each feature was normalized to a 0 to 1 scale. Recursive feature elimination and logistic regression (python's sklearn package) were used to rank the features according to their importance. Owing to the small sample size, leave-one-out cross validation was used to evaluate performance during the training. The optimal feature number was selected at the highest validation accuracy.
To narrow down the nearest-neighbor interaction features that were related to the prognostic cell density features, we selected 46 features of the nearest-neighbor interaction that contained any of the seven prognostic cell density features and were also signi cantly different between long-term and short-term survivors and combined these features with age to predict survival. To lter the features, we performed Spearman correlation between survival and these 47 features. Forty-four features that had an absolute correlation coe cient larger than 0.2 were kept. Because features of logistic regression cannot be highly correlated, only one of the highly correlated features (absolute Spearman correlation coe cient ≥ 0.65) was kept, leaving 26 features. Each feature was normalized to a 0 to 1 scale. Recursive feature elimination and logistic regression were used to rank the features according to their importance. Owing to the small sample size, leave-one-out cross validation was used to evaluate performance during the training. The optimal feature number was selected at the highest validation accuracy.
Correlation of cell density with gene expression Gene expression was normalized by Robust Multi-array Average 83 . Spearman correlation between gene expressions in microdissected broblastic or epithelial components and cell densities in tumor-enriched regions was performed and genes of interest were selected from those that had positive correlation coe cients (p < 0.05, absolute correlation coe cient > 0.4) with each cell subtype that showed a signi cant difference between long-term survivors and short-term survivors. For genes that had multiple probe IDs, we used only the probe ID that had the largest variance. The genes of interest of the microdissected broblastic components were ltered by the single-cell RNA sequencing (scRNAseq) data 84 , and only the genes that were expressed in broblasts or stromal cells were kept. Moreover, to understand the molecular mechanism that might explain the correlation between IMC and HGSC patient survival, we included in the heatmap in Fig. 6B only genes encoding for secreted or receptor proteins for all listed IMC features, except for CD73_1 and CD73_2 (for which we included all genes). The genes of interest of the microdissected epithelial components were ltered by the scRNAseq data and only genes that were expressed in epithelial cells were kept. Among these, only genes encoding for secreted or receptor proteins for all listed IMC features, except for tu_9 (for which we included all genes), were included in the heatmap in Fig. 6A. Kaplan-Meier analysis was performed on the ltered genes of interest, and only genes that had signi cant prognostic values (p < 0.05) were kept.

Kaplan-Meier analysis
Kaplan-Meier analysis was performed differently for genes in broblastic and epithelial components.     clustering with the Ward method. C. Schematic summarizing the genes that are signi cantly correlated with prognostic IMC features to postulate the mechanisms by which these genes contribute to the prognostic features. Cancer cell-derived WFDC2, TACSTD2 and ITIH5 drive immune surveillance failure by compromising cytotoxic CD8+ T cells activity and in ltration. CAF-derived VSTM4 can reduce cytokine production by CD4 T cells and cause a decrease in T cell activation. CAFs expressing BMPR1B are more responsive to BMP signaling, which subsequently modulates the rigidity of CAFs, and reduces CD4_4 T cell tra cking. CAF-derived LRRC32 suppresses anticancer immune cell in ltration by modulating TGF-β signaling networks. ANGPTL2, TNC, TGFB1, FN1, BMPR1B, and LRRC32 promote tumor progression, angiogenesis, and ECM remodeling.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.