Chromatin reprogramming underpins enzalutamide resistance
To study molecular consequences of AR signaling suppression and drug resistance dynamics in PC, we used LNCaP parental and LNCaP-derived ENZ-resistant cell lines RES-A and RES-B generated via long-term exposure to AR-targeting agents (Handle et al., 2019) (see Methods), as well as other independently generated LNCaP- and VCaP-derived models (Fig. 1A). We hypothesized that chromatin structure would be reshaped in ENZ-resistant cells and lead to modification of the transcriptome (Sönmezer et al., 2020; Strickfaden et al., 2020).
To extrapolate the contribution of chromatin structure to ENZ resistance, we performed single-cell (sc) assays for transposase-accessible chromatin and sequencing (scATAC-seq) on four samples: (1) LNCaP parental cells (LNCaP), (2) LNCaP exposed to short-term (48 hours) ENZ (10 µM) treatment (LNCaP-ENZ48), (3) RES-A and (4) RES-B (Fig. 1A). We first analyzed the scATAC-seq data as if it would have been sequenced in bulk cells (see Methods). The ATAC-seq signal at transcription start sites (TSS) decreased in ENZ-resistant cells compared to parental, particularly in RES-B cells (average enrichment score 4.8 in RES-B vs 6.2 in LNCaP, p < 0.001, t-test) (Fig. 1B). RES-A and RES-B cells shared a large proportion (14% in RES-A and 17% in RES-B) of “ENZ-resistant-specific” open chromatin regions not found in parental LNCaP. Additionally, RES-A cells had a higher proportion of unique open sites compared to RES-B (19% vs 5%, P < 0.001, chi-square test) and LNCaP (19% vs 7%, P < 0.001, chi-square test) (Fig. 1C). These findings are consistent with TSS non-targeted opening (Jiang and Zhang, 2021) of the chromatin upon ENZ resistance.
We corroborated the extent of chromatin opening and reprogramming in ENZ-resistant cells by performing formaldehyde-assisted isolation of regulatory elements (FAIRE) sequencing (Giresi et al., 2007) on the parental LNCaP and RES-A cells subjected to androgen starvation, or exposed to androgens, ENZ, or both agents (Figure S1A-D) (see Methods). Even in this bulk assay, ENZ and androgen starvation appeared to be more significant drivers of reprogramming in RES-A than in parental LNCaP. While there was no difference in the total number of open chromatin sites, ENZ-resistant samples had a higher proportion of unique open sites compared to parental in the presence of androgens (P < 0.001, chi-square test) (Figure S1E) and in androgen-deprived (castrate) conditions (P < 0.001, chi-square test) (Figure S1C). Read distribution analysis (see Methods) demonstrated that the chromatin of ENZ-resistant cells is more open in castrate conditions (P = 0.018, t-test) (Figure S1D) but not in presence of androgens (P = 0.239, t-test) (Figure S1F), and that ENZ has an additive effect on castration in inducing chromatin compaction in parental ENZ-sensitive cells that is counteracted by androgens (Figure S1D, Figure S1F).
Next, we used all samples with scATAC-seq to generate cluster visualizations of cell subpopulations with different chromatin accessibility profiles (Fig. 1D) (see Methods). We identified clusters that we termed “unique” or “shared” across the samples (Fig. 1E). Unique clusters were specific to RES-A, RES-B, or both (named “ENZ-induced clusters”), or specific to the untreated and/or short-term ENZ-treated parental line (named “initial clusters”). Shared clusters were present at similar proportions across the samples and were named “persistent clusters” (Fig. 1E). We compared each cluster to all other clusters to determine its unique chromatin profile based on differentially accessible chromatin regions (DARs; Table S1).
The most prevalent chromatin-based scATAC-seq clusters (0, 1, and 2) were persistent (Fig. 1E) and defined by fewer than 20 unique DARs, suggesting that 74% of the cells share an overall similar chromatin accessibility profile during development of ENZ resistance. We then assessed for changes in cluster chromatin DARs between the parental LNCaP, LNCaP-ENZ48, and in RES-A and RES-B (Table S1). DARs were observed around MYC and TP53 in several clusters during the short-term response to enzalutamide, including in cluster 6 that arises during enzalutamide resistance in RES-A.
Studies on PC cell lines cultured for an extended time without androgens tend to display neuroendocrine-like phenotypes (Braadland et al., 2019; Fraser et al., 2019). The largest fold changes in chromatin accessibility based on average signal from all cells showed enrichment for neural system and neurite development processes between the parental (LNCaP-ENZ48 or DMSO) and resistant cells (RES-A or RES-B) (P < 0.001 in RES-A and P = 0.001 in RES-B). Using Gene Set Variation Analysis (GSVA) for gene expression scoring (see Methods), we found elevated expression of NEPC-derived signatures among upregulated genes (Braadland et al., 2019; Tsai et al., 2017) in RES-A and RES-B cells (particularly EZH2, AURKA, STMN1, DNMT1, and CDC25B), as well as increased expression of NEPC-downregulated genes in initial clusters (Figure S1G). Interestingly, in the same cell lines, enrichment analysis of NEPC signatures showed high NEPC signal in RES-A cells only (Figure S1H).
Overall, these data show extensive chromatin reprogramming during the emergence of resistance to AR-targeting agents.
Enzalutamide resistance reconfigures availability of TF binding DNA motifs in the chromatin
Chromatin accessibility determines transcriptional output by exposing a footprint of TF DNA binding motifs. We hypothesized that increased chromatin opening in resistant cells would change the footprint of TF DNA motifs exposed. To this end, we first utilized AR and MYC binding site maps in LNCaP cells (Barfeld et al., 2017) and explored their relationship with open chromatin sites in the bulk FAIRE-seq data from RES-A cells. Using read distribution analysis, we observed a significant increase in open chromatin at MYC binding sites in ENZ-resistant cells (P < 0.001 in castrate conditions and with androgens, t-test) (Fig. 2A, Figure S2A), as well as a reduction of open chromatin at AR binding sites (P < 0.001 in castrate conditions and with androgens, t-test) (Fig. 2B, Figure S2B). These findings suggest that chromatin dysregulation in ENZ-resistance is associated with reconfiguration of AR and MYC chromatin binding, consistent with previously reported increased MYC and reduced AR transcriptional activity in these cells (Handle et al., 2019).
To resolve how chromatin reprogramming affects TF DNA motif exposure at the single-cell level, we performed a TF motif enrichment analysis on the marker DARs characterizing the scATAC-seq cell clusters in each sample (Fig. 2C, Table S1). This analysis confirmed the enrichment of motifs for several PC-associated TFs such as AR and MYC, as well as GATA2, HOXB13, and others in persistent clusters 3 and 5, as well as initial cluster 4 in parental and LNCaP-ENZ48 (Fig. 2C). Clusters 3 and 5 remained enrichment of a subset of the same TFs motifs in RES-A and RES-B, with cluster 5 showing a consistent enrichment profile in all samples (Fig. 2C). AR, CREB1, E2F1, GATA2, and ZFX were common motifs. Cluster 3 was characterized by FOXA1 and JUND, while cluster 5 was characterized by CTCF, ETS-like, and MYC. Although they possessed distinct sets of DARs, the ENZ-induced clusters 6 and 7 did not display enrichment of TF motifs in RES-A or RES-B.
Between pairs of samples, DARs were predominantly closing in cluster 3 compared to all other clusters (8% vs 4% DARs differentially closing, P < 0.001, chi-square test) and predominantly opening in cluster 4 (11% vs 8% DARs differentially opening, P < 0.001, chi-square test) (Figure S2C). We performed selective TFs motif enrichment analysis in DARs opened (Fig. 2D) and closed (Figure S2D) between pairs of samples (see Methods). While we observed no enrichments after short-term ENZ-treatment (LNCaP-ENZ48 vs parental; Fig. 2D), comparing open DARs in RES-A or RES-B vs LNCaP parental retrieved distinct sets of TFs, with MYC and ESR1 being the most common across all clusters in RES-A and RES-B, respectively (Fig. 2D). Similarly, comparing open DARs in RES-A or RES-B vs LNCaP-ENZ48 showed enrichment of most of the PCa-related TF motifs tested in most clusters (Fig. 2D), and to an even greater extent when considering closing DARs between sample conditions (Figure S2D).
These analyses demonstrate that ENZ-resistance is associated with reconfiguration of TF DNA motif footprints.
Transcriptional patterns of enzalutamide resistance are induced by divergent chromatin reprogramming
To study transcriptional patterns in relation to reconfiguration of chromatin structure at the single-cell level, we performed scRNA-seq in the LNCaP parental, RES-A and B models. Integrated clustering of four LNCaP samples (Fig. 3A) (see Methods) showed 7 persistent, 3 ENZ-induced, and 3 initial cell clusters (Fig. 3B) defined by sets of marker differentially expressed genes (DEGs; Table S2; between 17 and 283 DEGs in the 13 clusters). To confirm that these cell subpopulations are relevant in other independent models of ENZ-resistance, we used the label transfer approach (Stuart et al., 2019) to query for matching cell populations in independent scRNA-seq datasets: a LNCaP parental sample, LNCaP ENZ-treated for 1 week (LNCaP-ENZ168), and an independent ENZ-resistant (RES-C) LNCaP-derived cell line (Fig. 1A). Transferring scRNA-seq cluster labels confirmed the presence of initial clusters (4, 6, and 10) in LNCaP parental (Figure S3A) and RES-C (Figure S3B). Presence of ENZ-induced clusters was confirmed in RES-C (17% of cells in cluster 3) and LNCaP-ENZ168 (79% in cluster 3), suggesting that one week of ENZ treatment is sufficient to give rise to this cluster prior to the development of resistance (Figure S3C). Most importantly, we could retrieve persistent subpopulations of cells in the alternative LNCaP-parental sample (4%), in LNCaP-ENZ168 (13%), and in RES-C (31%), suggesting that these persistent cells are consistently found during emergence of ENZ-resistance.
We additionally performed scRNA-seq on a VCaP parental cell line treated with DMSO or ENZ for 48 hours to test for the generalizability of our results beyond a single cell line (Fig. 1A). A similar analysis with VCaP cells confirmed the prevalence of persistent cells in the VCaP parental (93% of cells), as well as initial and ENZ-induced cells in VCaP-ENZ48 (38% and 55% of cells, respectively) (Fig. 3C).
We then sought to determine whether the observed scRNA-seq clusters (Fig. 3A) could be the result of enriched TFs binding activity in alternative open DARs. Using annotated databases, we queried the transcriptional targets of the enriched TFs in the open DARs when comparing RES-A or B to the parental LNCaP (Fig. 2D) in the matching scRNA-seq samples (see Methods). Chromatin remodeling affected TF activity and consequently DEGs in the scRNA-seq for up to a maximum of 11% in cluster 0 in RES-A and 7.1% in cluster 1 in RES-B (Fig. 3D). While target DEGs for TFs such as MYC, JUND, and E2F where found in most clusters in both RES-A and B, other target DEGs for TFs such as AR, RELA (a NFkB subunit), and GRHL2 appeared more specific to RES-A or B, consistent with proposed stoichiometric models of TFs chromatin binding (Klemm et al., 2019). This analysis confirmed that alternative open DARs in ENZ-resistance can activate divergent transcriptional programs.
Next, we connected the scRNA-seq clusters to their matching scATAC-seq clusters. We took advantage once more of the label transfer approach to identify matching scRNA- and scATAC-seq cell states in the same sample conditions (see Methods). In this process, we assigned cell cluster labels within the scRNA-seq to the scATAC-seq clusters, or vice versa (Figure S3D). We found that a chromatin state can correspond to multiple transcriptional states (96% in scATAC-RNA vs 48% in scRNA-ATAC of cells assigned on average across all samples, P < 0.001, chi-square test). Querying the integrated scRNA-seq clusters (Fig. 3A-B) from the scATAC-seq data, we could find matching cell states in the scATAC-seq for scRNA clusters 9, 10, and 11 (Figure S3E). Across the sample conditions, 95% of the cells projected to belong to scRNA-seq cluster 10 belonged to scATAC-seq cluster 4, while 72% of cells projected to belong to scRNA-seq cluster 9 or 11 belonged to scATAC-seq cluster 3 (Fig. 3E).
Taken together, these data show that transcriptional configuration of ENZ-resistant cells, especially cells persisting during treatment, emerges from processes driven partially by chromatin structure and TF-mediated transcriptional reprogramming. These processes affect a number of important regulators of cell fate, consistent with lineage commitment recently observed in tissue development (Ma et al., 2020).
Prostate cancer cell subpopulations with features of stemness precede enzalutamide resistance
Cell cycle phase can be a strong determinant of the integrative clustering of scRNA-seq data. Accordingly, we found that persistent clusters 8, 9, and 11 scored highly for S and G2/M phase related genes using cell cycle scoring in Seurat (see Methods) (Fig. 4A), suggesting that cells in these clusters are more actively cycling and proliferative. However, we found that cells in clusters 9 and 11 were characterized not only by cell cycle genes, but also by expression of genes involved in chromatin remodeling and organization (CTCF, LAMINB, ATAD2), increased cell cycle turnover and stemness (FOXM1, (Ketola et al., 2017), and DNA repair (BRCA2, FANCI, RAD51C, POLQ) (Fig. 4B). Clusters 5 and 11 showed high expression of a gene set, which we named “Stem-Like”, composed of stemness-related genes mainly from Horning et al (Horning et al., 2018) (Fig. 4C). Karthaus et al recently identified activated luminal prostate cells able to regenerate the epithelium following castration (Karthaus et al., 2020). We extracted the gene expression profile associated with these prostate luminal cells (see Methods) and used it to score each scRNA-seq cluster. We found cluster 10, an initial cluster, to score highly for this gene signature, which we renamed PROSGenesis (Fig. 4D, Figure S4A). We visualized the expression of PROSGenesis and Stem-Like in the VCaP scRNA-seq samples, confirming the presence of these subpopulations of cells in other models (Fig. 4E).
We then set out to reconstruct the trajectories of how these clusters of interest were generated during the development of ENZ-resistance. Based on cytoTRACE (Gulati et al., 2020), cells in clusters 10 and 11 were the least differentiated across most of the sample conditions (Fig. 4F), suggesting that the other cells could derive from cells in these clusters. RNA velocity analysis estimated cluster 10 as a precursor of the enzalutamide-induced clusters (Fig. 4G), concordant with a state derived from activated regenerative luminal prostate cells as previously suggested (Karthaus et al., 2020). Cluster-specific differential velocity analysis in RES-A and RES-B revealed downregulation of many PC-related genes, such as ATAD2, as well as upregulation of genes such as UBE2T, PIAS2, PFKFB4, and EGFR (Figure S4B-C). ATAD2 and UBE2T were otherwise upregulated in persistent clusters 8, 9, and 11 (Figure S4C), suggesting additional transcriptional reprogramming in ENZ-induced clusters.
These analyses point at two distinct subpopulations of PC cells which precede ENZ resistance: one persistent cell cluster (cluster 11) matching “Stem-Like” and one initial cluster (cluster 10) matching PROSGenesis, a signature derived from tissue regeneration (Karthaus et al., 2020). Collectively, our data suggest that there exists a small number of PC cells within the bulk with stem-like and regenerative potential.
Model-based characterization of gene signatures in prostate cancer bulk RNA sequencing
The use of molecular gene classifiers or signature scores is an attractive strategy to select cancer patients for treatment (Doultsinos and Mills, 2021; Eggener et al., 2020). According to an unbiased enrichment and differential expression analysis of hallmark gene sets (see Methods), most of the persistent clusters and cluster 10 also showed significant enrichment of E2F targets, G2M checkpoint, and MYC target genes (Figure S4D). These data are largely concordant with the bulk RNA-seq data on the same cells in our previous study (Handle et al., 2019), reflecting the fact that signals from subpopulations of cells can be retrieved in bulk RNA-seq data. Differential expression within clusters (Figure S4E-G) further revealed that oxidative phosphorylation was immediately upregulated in LNCaP-ENZ48, and this process is maintained highly selectively in RES-A but not in RES-B. Moreover, genes regulated by activated mTORC1 signaling were consistently upregulated in most of the clusters as ENZ resistance develops (Figure S4E-G), in agreement with previous reports showing its activation during ENZ treatment in patients (Ma et al., 2020).
We therefore used a collection of signatures derived from the scRNA-seq analysis to describe features of the same cells in bulk RNA-seq datasets. In addition to Stem-Like and PROSGenesis, we included (1) NEPC markers (Figure S1G), (2) a BRCAness gene signature (Li et al., 2017) as RES-A and RES-B maintain sensitivity to PARP inhibition (Handle et al., 2019) and the persistent cluster 11 is characterized by markers of DNA repair (Fig. 4B), (3) gene sets as proxies of AR signaling activation (He et al., 2018), including activation of AR splice variants (AR-Vs), (4) the DEGs defining our scRNA-seq clusters, and (5) gene sets for mTORC1 signaling and MYC targets (Figure S4D-G) (Table S3).
In the bulk, the ENZ-induced DEGs selectively appeared in the RES-B cells (Fig. 5A). Similarly, the persistent clusters were associated with the Stem-Like signature only in RES-A and RES-B when induced with DHT (Fig. 5A). On the other hand, the PROSGenesis signature was elevated only in RES-B (Fig. 5A). The NEPC features in RES-A were associated with MYC activation (Fig. 5A). Consistent with both resistant lines remaining responsive to PARP inhibition (Handle et al., 2019), we found samples from RES-A and RES-B to score highly for BRCAness (Fig. 5A), which is known to downregulate DNA repair machinery (Li et al., 2017). BRCAness was associated with the AR-V signature as previously shown (Kounatidou et al., 2019) (Fig. 5A).
To confirm the properties of different signatures, we used VCaP cells to develop an independent model of resistance to AR signaling-targeted treatments including ADT, bicalutamide, ENZ, and bicalutamide/ENZ multi-resistant sublines, and performed bulk RNA-seq (Fig. 1A). These VCaP-based sublines did not show NE features (Fig. 5B). Only ENZ-resistant VCaP cells scored highly for the ENZ-induced DEGs, confirming the specificity of this signature to ENZ treatment and resistance. Parental and ENZ-resistant VCaP cells scored highly for the PROSGenesis signature, while the scores of the persistent, Stem-Like, mTORC1 signaling, and MYC target signatures scored highly selectively in resistant VCaP sublines (Fig. 5B). This suggests a convergent mechanism of resistance to these agents in independent models.
Next, we scored xenografts of AR+/NE−, AR−/NE+, or AR−/NE− CRPC and NEPC tumors resistant to ENZ (Labrecque et al., 2019; Lam et al., 2020) with the same signature sets (Figure S5A). AR+/NE− xenograft samples clustered into two separate clusters. AR− tumors clustered together with a series of AR+/NE− tumors due to low mTORC and MYC signaling, while one cluster of AR+/NE− scored highly for all of the gene sets except for markers upregulated in NEPC (“NEPC upregulated”). Interestingly the PROSGenesis signature, along with initial clusters and ENZ-induced clusters, scored particularly high in AR+ tumors while the Stem-Like signature, along with the persistent clusters, scored high in both AR+/NE− and AR−/NE+ tumors (Figure S5A), suggesting that the two signatures capture different tumor biologies. In a transcriptome dataset based on an independent xenograft model (King et al., 2017), we found ENZ resistance to be uniquely associated with higher AR activity, higher expression of MYC target genes, PROSGenesis high score, and high expression of ENZ-induced cluster gene sets (Figure S5B). These data suggest that the Stem-Like status is independent of the AR status and that persistent cells might mediate the development of both AR positive CRPCs and negative NEPCs.
Collectively, the persistent, initial, PROSGenesis, and Stem-Like derived gene signatures show potential for identifying aggressive, regenerative features of PC from bulk RNA-seq.
Transcriptional signal enrichment analysis identifies treatment-persistent cells and prognostic gene signatures in prostate cancer patients
We hypothesized that we could use enrichment of gene signature expression to stratify advanced and primary PC patients.
To this end, we interrogated clinical data of CRPC patients treated with ENZ reported in Alumkal et al. (Alumkal et al., 2020). The patients aggregated into two clusters based on our complete signature set (Figure S5C), but patients in neither cluster had significantly shorter progression-free survival (PFS; P > 0.05, log-rank test). Utilizing a stepwise variable selection process we identified five significant signatures (NEPC upregulated, PROSGenesis, MYC targets, AR activity, and ARV) that are able to identify patients with significantly shorter PFS (Figure S5D). Moreover, PFS analysis of individual gene signatures revealed association with shorter time to progression for patients scoring high for the Stem-Like signature (P = 0.025, log-rank test) or for genes upregulated in NEPC (P < 0.001, log-rank test) (Fig. 5C-D), while patients with longer PFS scored highly for PROSGenesis (P = 0.021, log-rank test) and for the initial cluster signature(P = 0.018, log-rank test) (Figure S5E).
None of the cluster marker gene sets showed a significant difference between Stand Up To Cancer (SU2C) CRPC abiraterone/ENZ-naive and abiraterone/ENZ-exposed patients (Abida et al., 2019) according to their latest treatment regime, suggesting that differences between the tumors based on the signatures may be difficult to retrieve using bulk sequencing from heavily pre-treated patients. Despite the challenges of applying single-cell derived signatures to bulk data however, Stem-Like was still significantly associated with poor overall survival in these patients (Fig. 5E), supporting the potential significant activity of the persistent cells in this group of patients. Similarly, we could not stratify patients that developed resistance to ENZ in the SU2C West Coast DT Quigley et al dataset (Figure S5F) (Quigley et al., 2018), although in this case, ENZ-sensitive patients had higher expression of PROSGenesis (P = 0.024, Wilcoxon rank-sum test) (Figure S5G).
These data show that the Stem-Like signature associated with persistent cells (cluster 11) from our single-cell analysis of ENZ resistance is a consistent classifier with the potential of stratifying patients for response to second line AR-targeted treatments (Fig. 5F).
We then hypothesized that we could systematically use the persistent cluster 11, Stem-Like, initial cluster 10, and PROSGenesis signatures as a proxy for the presence of PC cells with different transcriptional features in clinical settings, to capture signals from such types of pre-existing subclones with metastatic potential in primary untreated tumors. To this end, we took advantage of a recently published scRNA-seq dataset on clinically relevant PCs specimens (Fig. 6A) (Chen et al., 2021). We used GSVA score to highlight our 13 scRNA-seq clusters in 36424 cells from primary untreated PC specimens of 13 patients (Figure S6A). The analysis showed that our LNCaP model-derived cell clusters scored higher in luminal and basal/intermediate cells compared to fibroblasts (P = 0.047, t-test) (Figure S6A). Additionally, luminal cells had higher expression of genes associated with our initial scRNA-seq clusters compared to the basal/intermediate cells (P = 0.02, t-test) and compared to fibroblasts (P < 0.001, t-test).
We then scored the cells for expression of genes from the Stem-Like and PROSGenesis signatures, along with the associated clusters (11 and 10, respectively) and control signatures linked to AR activity (ARV, AR-FL, and AR activation), BRCAness, and NEPC (Fig. 6B). We defined a high score for a gene signature to be above the 90th percentile. 48% percent of the cells that scored highly for the Stem-Like signature were luminal cells (Fig. 6C). Cells scoring highly for the PROSGenesis signature were mostly basal/intermediate (78% of high scorers) (Fig. 6D). Each single patient harbored on average 8% of cells scoring high for the Stem-Like signature (ranging from 2% in patient 173 to 23% in patient 156) and 8% of cells scoring high for the PROSGenesis signature (ranging from 0.9% in patient 153 to 33% in patient 172) (Fig. 6E).
To reconcile the presence of these cells and their relative histopathological position, we assessed gene expression within two sections (Prostate A and B) of primary untreated PC with spatial transcriptomics (see Methods). We reconstructed the gene expression signal from stromal and epithelial components in an average of 1682 spots per sample using clustering analysis and annotated the tissue architecture in 5 clusters of stromal tissue (ST), benign epithelium (BE), and adenocarcinoma (PC-AC) (Fig. 6F, Figure S6B). PROSGenesis and Stem-Like signatures, as well as the companion model-derived cluster 10 signature, showed high expression scores within the sections compared to scores from housekeeping gene signatures (Fig. 6G, Figure S6C). We compared the score distributions of our signatures to the housekeeping gene set score distributions and determined the 90th percentile as a score cutoff for high expression by allowing for 5% false positives (see Methods). Spots with high signal were found interspersed in all 5 clusters in both sections (Fig. 6H, Figure S6D). In Prostate A, however, spots scoring highly for the Stem-Like signature were more prevalent in the PC-AC cluster compared to ST (P = 0.005, chi-square test). Spots scoring highly for PROSGenesis were further enriched in the BE and PC-AC clusters compared to ST (P < 0.001 in both cases, chi-square test), while spots scoring highly for cluster 10 were enriched in the BE clusters compared to all other tissue regions (P < 0.001 for each comparison, chi-square test) (Fig. 6H). Concordant observations were made for PROSGenesis and cluster 10 signatures in Prostate B (Figure S6D). To validate these findings, we undertook a similar approach to re-analyze spatial transcriptomics data from prostate Sect. 3.3, 1.2, and 2.4 from Berglund et al (Berglund et al., 2018), which were annotated to contain a significant proportion of cancer. Similarly to our initial observation, these sections showed enrichment of spots scoring highly for the PROSGenesis in the PC-AC clusters compared to ST or prostatic intraepithelial neoplasia clusters (Figure S6E-G). Spots scoring highly for cluster 10 were more prevalent in both benign and cancerous clusters. Taken together these data suggest the presence of treatment-persistent cells interspersed within the primary untreated prostate tissue of PC patients with high metastatic potential.
Finally, we verified whether we could predict recurrence in primary PC patients using the signature genes derived in these cells. We interrogated legacy primary tumor TCGA PRAD (https://www.cancer.gov/tcga) (Fig. 7A) and early onset PC (EOPC) ICGC (Gerhauser et al., 2018) RNA-seq data (Figure S7A) for our gene signatures of interest. Using all signatures for clustering the TCGA PRAD cohort separated 54% of Gleason score (GS)-7 and 15% of GS-8 + patients which would not benefit from additional treatment, as they had relatively good prognosis (Fig. 7B). A similar trend was also observed in the ICGC cohort (Figure S7B). ENZ-induced (Fig. 7C), PROSGenesis (Fig. 7D), Stem-Like (Fig. 7E), and persistent (Fig. 7F) gene signatures were the most significant (P < 0.05, log-rank test) contributors to cluster separation in the TCGA cohort, while NEPC downregulated genes were the major determinant in the ICGC cohort (Figure S7C). In line with previous reports (Alumkal et al., 2020), signatures reflecting AR activity (AR activity and full length) in these tumors were consistently associated with longer time to progression in the TCGA cohort (Fig. 7G-H), suggesting a better response to inhibition of AR signaling in AR driven tumors. In the EOPC cohort, which is enriched in GS-7 tumors compared to the TCGA PRAD, the persistent and PROSGenesis signatures significantly stratified GS-7 patients (P < 0.05, log-rank test) (Figure S7D-E), suggesting the ability of these signatures to further refine GS-based risk stratification in patients and avoid overtreatment. High PROSGenesis score was associated with good prognosis together with the gene set from the initial cluster 10 (Fig. 7I). Individually, 8 out 13 clusters-derived signatures showed association with PFS in the TCGA cohort (Fig. 7I), pointing at the utility of these signatures in PC patient risk stratification.