Patient characteristics
We analyzed the proteome of 107 MIBC patients undergoing chemotherapy (neoadjuvant or induction, collectively referred to as ‘NAC’ in this study), and N = 25 NMIBC patients. All MIBC patients received transurethral resection (TUR) followed by RC, with 44% showing nodal metastasis and 49% being cT3-4 pre-NAC, with a complete response of 30% defined at time of RC as ypT0N0M0 (Table 1).
Pre-NAC TUR tissue was harvested from all 107 MIBC patients, with an additional 36 duplicate TUR samples collected to study intratumoural heterogeneity (ITH) for a total of 143 TUR specimens. Of the 107 patients, 75 were partial responders (pTis/a/1N0M0) or non-responders (≥ pT2NXM0) after NAC and RC. From these, we collected tissue from 55 post-NAC RC specimens (20 had no available material), and 14 additional duplicate RC specimens to study ITH, for a total of 69 post-NAC RC specimens. We also pooled N = 37 benign ureter (BU) samples collected at time of RC. The NMIBC cohort comprised 29 TUR tissue specimens from 25 NMIBC patients (4 duplicate samples) with high-grade Ta (Ta-HG) tumours for comparison (Supplementary Table 1). The total number of samples was 278 (107 + 36 = 143 TUR; 55 + 14 = 69 RC; 25 + 4 = 29 NMIBC; 37 BU). An overview of all cohorts, samples, study design, and disease history is provided in Fig. 1a-b and Supplementary Fig. 1a.
All FFPE samples underwent SP3-CTP 11-multiplex Tandem Mass Tag (TMT)-MS analysis, as described previously33,38. The 278 samples were distributed across 31 TMT 11-plexes (Methods and Supplementary Fig. 1b). Results were high-quality, with the MS spectra mapping to 182,899 total peptides and 9,769 total proteins, of which 5,823 were quantified across all samples submitted (Supplementary Fig. 1c-e). BU replicates showed robust quantification across plexes with a median coefficient of variation (CV) < 10% for overall and ubiquitous proteins, while technical replicates (N = 37) of the cell line mixture standard (SuperMix) showed higher CV (Supplementary Fig. 1f) probably due to differences in nature (in vitro/in vivo) and preservation (frozen/FFPE) of the cell line mix versus tumour samples, respectively. Principal component analysis (PCA) on the proteome of all MIBC samples revealed no segregation by institution or material age (Supplementary Fig. 1g). Thus, we merged the Vancouver and Bern MIBC cohorts into a single MIBC ‘NAC’ cohort (including TUR and RC samples) for further analysis. Samples from TUR and RC clustered separately, suggesting proteomic heterogeneity according to sample type and/or intervening systemic therapy and not collection site (Supplementary Fig. 1h).
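The replicate-based quality metric quoted above (median CV across proteins) can be illustrated with a minimal sketch. It assumes linear-scale abundances and uses the sample standard deviation; the protein names and values are hypothetical and are not taken from the study data.

```python
from statistics import mean, median, stdev


def coefficient_of_variation(values):
    """Return the CV (sample SD divided by mean) in percent for one protein."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def median_cv(abundance_by_protein):
    """Median CV across proteins; input maps protein -> replicate abundances."""
    return median(coefficient_of_variation(v) for v in abundance_by_protein.values())


# Hypothetical linear-scale abundances for three proteins in four replicates.
replicates = {
    "PROT_A": [100.0, 102.0, 98.0, 100.0],
    "PROT_B": [50.0, 55.0, 45.0, 50.0],
    "PROT_C": [10.0, 10.5, 9.5, 10.0],
}

if __name__ == "__main__":
    print(f"median CV = {median_cv(replicates):.2f}%")
```

A median CV below a chosen cutoff (10% in the text above) would then be read as robust quantification across plexes.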
The pre-NAC proteome reveals four different MIBC clusters with distinct biology, clinical features, pathological response and survival
We performed unsupervised consensus clustering on 107 pre-NAC MIBC samples derived from TURs using the 25% most variable proteins (N = 1,170) with robust quantification (CV < 10% across tumour samples). Based on examination of the consensus matrix and delta plots focusing on the variation in consensus cumulative distribution function (CDF) area (Supplementary Fig. 1i-k), four strongly segregated Consensus Clusters (CCs; CC1, CC2, CC3, CC4) with unique biological characteristics, response to NAC, and prognostic relevance were identified (Fig. 1c). We named the clusters based on their protein expression profiles and enrichment of biological signatures.
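The feature-selection step that precedes clustering can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the CV is treated as a precomputed quality metric per protein, the robustness filter is applied before the variance ranking (the order is not specified in the text), and all names and values are hypothetical.

```python
from statistics import variance


def top_variable_proteins(abundance, cv_percent, cv_cutoff=10.0, fraction=0.25):
    """Keep proteins with CV below cv_cutoff, then return the most variable fraction.

    abundance:  protein -> list of abundances across tumour samples
    cv_percent: protein -> precomputed CV in percent (quality metric)
    """
    robust = [p for p in abundance if cv_percent[p] < cv_cutoff]
    # Rank robustly quantified proteins by variance across samples, highest first.
    ranked = sorted(robust, key=lambda p: variance(abundance[p]), reverse=True)
    n_keep = max(1, int(len(ranked) * fraction)) if ranked else 0
    return ranked[:n_keep]


# Hypothetical data: nine proteins, variance grows with the index i.
abundance = {f"P{i}": [0.0, float(i), -float(i)] for i in range(1, 10)}
cv_percent = {f"P{i}": 5.0 for i in range(1, 10)}
cv_percent["P9"] = 15.0  # fails the robustness filter despite high variance

if __name__ == "__main__":
    print(top_variable_proteins(abundance, cv_percent))
```

The selected proteins would then be passed to a consensus clustering routine, which is not reproduced here.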
CC1-Luminal, which encompassed 39% (42/107) of tumours, was enriched for luminal markers (e.g. KRT20, PPARG) consistent with urothelial differentiation. CC1-Luminal displayed high metabolic and mitochondrial activity by gene set enrichment analysis (GSEA; Fig. 1d-e). CC1-Luminal also had a greater proportion of micropapillary secondary histology, and the highest pathologic response rate to NAC (50%, Fig. 2a). In agreement with other MIBC studies using transcriptomic-based classifiers8,9,11, patients in CC1-Luminal had the best OS (Fig. 2b).
CC2-Nuclear (7/107 = 7%) was enriched for nuclear features including cellular processes involving mRNA binding, RNA metabolism, and histone lysine N-methyltransferase activity indicative of hypermethylated tumours (Fig. 1d-e). Approximately half of the most common neuroendocrine differentiation markers66 were enriched in CC2-Nuclear (e.g. PSP1, NCAM1; Supplementary Fig. 2a). CC2-Nuclear captured three out of five histologic small cell/neuroendocrine tumours (Fig. 2a). Patients in CC2-Nuclear had the most favourable oncologic outcomes (Fig. 2b), which differs from prior reports using transcriptomic classifiers6,9. This is probably related to the small sample size of this subgroup and their unexpectedly high CR rate after NAC (80%). One of these patients had only T1N0M0 disease prior to treatment.
CC3-Basal (20/107 = 19%) was characterized by markers of basal cell differentiation (e.g. CD44), keratinization (e.g. KRT5, KRT14) and immune pathway enrichment (e.g. IFN-γ production; Fig. 1d-e). This cluster was enriched for female sex (44%), consistent with prior reports9,67. These tumours correlated with higher clinical T-stage, squamous differentiation (Fig. 2a), and the worst RFS and OS among clusters (Fig. 2b). These findings align with reports in the context of NAC68 or other neoadjuvant/adjuvant therapy11, where basal tumours were highly immune-infiltrated (particularly with cytotoxic T cells and natural killer cells) and were associated with worse treatment response and survival compared to other subtypes.
Tumours in CC4-Stroma-rich (38/107 = 35%) exhibited high expression of stroma-related proteins, including structural proteins (COL1A1, DES), extracellular matrix (ECM) signatures, and high immune-infiltration (Fig. 1c-e). This cluster showed similar survival outcomes to CC1-Luminal and CC2-Nuclear (Fig. 2b). CC1-Luminal and CC4-Stroma-rich shared the most similarities with the BU, suggesting greater urothelial differentiation than CC2-Nuclear and CC3-Basal (Supplementary Fig. 3e). The existence of a stroma-rich group has also been described by others using transcriptomics, and identifies tumours with intermediate levels of urothelial differentiation, a higher proportion of non-tumour cells and distinct immune infiltrates (e.g. T and B cell populations)9,68.
We compared our proteomic cluster centroids to previously published global proteomic data generated from FFPE bladder tissue (Supplementary Fig. 2b)31. Xu et al. described three distinct clusters (U.I, U.II, and U.III) in a mixed cohort of MIBC RC and NMIBC TUR samples without chemotherapy history. We found that CC1-Luminal and U.II cluster centroids shared similar profiles, although U.II had notable differences, including low metabolic processes and high activity of RNA and DNA nuclear processes, aligning more closely with CC2-Nuclear. U.III showed centroid similarity with both CC3-Basal and CC4-Stroma-rich, while U.I (enriched for NMIBC) did not align with any of our clusters (Supplementary Fig. 2b). Comparison of our proteomic cluster centroids with those of the consensus RNA-based subtypes9 showed high concordance. CC2-Nuclear aligned with the consensus NE-like subtype, CC3-Basal with Ba/Sq, and CC4-Stroma-rich with Stroma-rich. As expected, CC1-Luminal showed similarity with all three luminal subtypes, with Luminal-Non-Specified (LumNS) being the closest centroid (Supplementary Fig. 2c). Collectively, despite notable cohort differences, our centroids were largely concordant with those of others, and deviations are probably attributable to cohort composition, sample type (RC versus TUR; Supplementary Fig. 1h), and differences in comparing RNA to protein expression.
The proteome of non-muscle invasive bladder cancer is similar to benign urothelium and distinct from muscle-invasive bladder cancer
The NMIBC cohort was independent from the MIBC cohort and only included patients with high-grade Ta tumours. Specific differences in the proteome of these NMIBC tumours compared to MIBC tumours could reflect global features of tumour aggressiveness crucial in bladder cancer progression. Comparing the most differentially expressed proteins (N = 26) of each pre-NAC CC with the NMIBC cohort showed that the CC1-Luminal proteome had the highest overlap with the NMIBC proteome (Supplementary Fig. 3a). This is consistent with other multiomic studies showing that the majority of NMIBC, and virtually all Ta tumours, express a luminal proteomic and transcriptomic profile69,70. Further analysis revealed proteins (e.g. CD44, MT2A) that are enriched in MIBC compared to NMIBC (Supplementary Fig. 3b). Direct comparison of CC1-Luminal with NMIBC revealed additional unique markers (e.g. FUCA1, PLS1) exclusively upregulated in CC1-Luminal compared to NMIBC (Supplementary Fig. 3b). GSEA showed that three of the five most upregulated processes in MIBC versus NMIBC involved antibody-mediated immune responses. Three of the five most upregulated pathways in NMIBC were associated with metabolic processes (Supplementary Fig. 3c). Despite some discrepancies, likely due to fixation methods, this comparison between NMIBC and MIBC generally agreed with a similar analysis by Stroggilos et al.37 in 117 tumour samples (98 NMIBC and 19 MIBC, Supplementary Fig. 3d).
We analyzed the pooled BU proteome and observed a high degree of similarity of BU with both NMIBC and CC1-Luminal (Supplementary Fig. 3a). Proteins implicated in urothelial differentiation, such as PPARG and FOXA1, were similarly expressed in BU, NMIBC, and CC1-Luminal. Conversely, proteins commonly expressed by stromal cell populations (COL1A1, DES) were highest in BU and CC4-Stroma-rich, suggesting BU probably contains a mix of urothelium and underlying layers of the ureter (Fig. 1e). Comparing the BU pool with all pre-NAC tumours revealed the proteomic signals of CC1-Luminal and CC4-Stroma-rich were closer to that of the BU than CC2-Nuclear and CC3-Basal (Supplementary Fig. 3e).
Immunohistochemical validation of cluster-defining and prognostic proteins
Based on the availability of high-quality antibodies for immunohistochemistry (IHC; Supplementary Fig. 4a), we selected one cluster-defining protein from each CC for IHC validation using a matched TMA (TMA-internal; Supplementary Table 2) with pre-NAC samples. We observed positive correlations (all p < 0.01) between IHC H-score and MS-derived protein abundances (log2FC) for SNCG (CC1-Luminal, R = 0.79), PSIP1 (CC2-Nuclear, R = 0.82), SERPINB3 (CC3-Basal, R = 0.51) and COL1A1 (CC4-Stroma-rich, R = 0.51; Supplementary Fig. 4b-c). COL1A1, specific to stroma cells, was most abundant in CC1-Luminal and CC4-Stroma-rich (R = 0.51). We confirmed the relative selectivity of each protein for its corresponding CC by IHC (Supplementary Fig. 4c). Representative IHC profiles for each CC are shown in Supplementary Fig. 5a-d.
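As an illustration of how an R value relating IHC H-scores to MS-derived abundances might be obtained, the following sketch computes a Pearson correlation coefficient. The choice of Pearson (rather than, for example, Spearman) is an assumption, since the correlation method is not stated in this section, and the paired values are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


# Hypothetical paired measurements for one protein across five tumours.
h_scores = [20, 80, 150, 210, 280]      # IHC H-score (0-300 scale)
log2fc = [-1.2, -0.4, 0.1, 0.9, 1.5]    # MS-derived abundance (log2FC)

if __name__ == "__main__":
    print(f"R = {pearson_r(h_scores, log2fc):.2f}")
```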
We identified 84 proteins from the MS-derived proteomic data in pre-NAC TUR specimens with the greatest prognostic value. For IHC validation, we selected candidate proteins based on criteria including a high number of peptides per protein, high protein abundance, and cross-sample abundance in the pre-NAC TUR sample cohort, and availability of high-quality antibodies for IHC (Supplementary Fig. 6a-b). We then selected proteins with clinical and transcriptomic/proteomic clustering relevance, assessing their prognostic value in both the pre- and post-NAC setting following uni- and multivariable analysis (Supplementary Fig. 6c). Several luminal markers showed trends toward favourable prognostic value (FGFR3, FOXA1, GATA3, with HRs < 1), whereas markers of basal subtypes (KRT5, KRT6A, KRT6B) had unfavourable prognostic value (HRs > 1; Supplementary Fig. 6c)66,67.
We selected two prognostic markers for further validation that were not associated with any CCs (Supplementary Fig. 6c), based on criteria described above for cluster-defining proteins: one with favourable (MAPK9/JNK2) and one with unfavourable (NES) prognostic value in both the pre- and post-NAC setting. We stained them in a matched TMA-internal (Supplementary Fig. 7a-b; Supplementary Table 2) and an unmatched TMA-validation (Supplementary Fig. 8a-b; Supplementary Table 3). High expression of MAPK9/JNK2 tended to identify better OS in both the TMA-internal (p = 0.14, Supplementary Fig. 7a) and the TMA-validation (p = 0.075, Supplementary Fig. 8a), but did not reach statistical significance. Likewise, high expression of NES in tissue was not associated with worse OS in either of the TMAs (p = 0.27, p = 0.38, Supplementary Fig. 7b and 8b).
Given the prognostic relevance of the pre-NAC CCs themselves, we assessed whether the four single cluster-defining markers described above retained prognostic value when split into high versus low H-score (Supplementary Fig. 7c-f and 8c-f). High SNCG H-score was robustly associated with favourable OS in both TMAs (p < 0.005 and p < 0.018). High PSIP1 H-score approached significance (p = 0.061) in the matched TMA-internal but not in the TMA-validation (p = 0.36). SERPINB3 and COL1A1 did not correlate with prognosis in either TMA. We subsequently integrated the four cluster-defining markers (SNCG, SERPINB3, PSIP1, COL1A1) into a combined score (i.e. high if above the median, or low if below the median; see Methods). In the TMA-internal data, the combined scores trended towards the identification of worse survival in the ‘high’ group (Supplementary Fig. 7g, p = 0.071) but did not in the TMA-validation (Supplementary Fig. 8g, p = 0.28). Future validation studies will require additional IHC markers in larger independent MIBC cohorts.
Comparison of the proteome before and after neoadjuvant chemotherapy
To study tissue plasticity and biological changes associated with resistance to NAC, we profiled residual tumour tissues from 55 patients with paired pre- and post-NAC tumours (Fig. 1a-b; Supplementary Fig. 1a). Of these, 93% were non-responders (≥ ypT2NXM0) and 7% were partial responders (ypTis/a/1N0M0), with 58% experiencing a local or distant recurrence within 6 months (Supplementary Table 4).
We performed unsupervised consensus clustering on 1,170 proteins (25% most variable proteins) that exhibited robust quantification across BU technical replicates (CV < 10% in BU samples). We identified four post-NAC clusters with different biology and clinical outcomes (Fig. 3a-c); their correlation with the pre-NAC CCs is shown in Fig. 3d. The post-NAC (PN)-CC1 proteome (N = 3) resembled that of pre-NAC CC2-Nuclear. Like CC2-Nuclear, PN-CC1 tumours showed upregulated nuclear processes and PN-CC1 captured both pre-NAC CC2-Nuclear cases with residual tumour. Two of three PN-CC1 patients subsequently recurred. The PN-CC2 tumours (N = 15) were characterized by high immune infiltration indicative of adaptive immune responses, and abundant ECM protein expression. Eleven of 13 pre-NAC CC3-Basal cases with residual tumour were classified as PN-CC2. Recurrence was subsequently observed in 10 of 15 cases. PN-CC3 (N = 15) tumours were characterized by moderate immune- and stroma-cell infiltration and were enriched for antibody-mediated immune responses and wound healing pathways. PN-CC3 captured a mix of pre-NAC CC1-Luminal and CC4-Stroma-rich tumours. Nine of 15 PN-CC3 patients recurred. Lastly, PN-CC4 (N = 22) was characterized by high metabolic pathway activity and was derived primarily from pre-NAC CC1-Luminal tumours and, to a lesser extent, pre-NAC CC4-Stroma-rich tumours. Recurrence was observed in 11 of 21 PN-CC4 cases (one patient had missing clinical information at time of RC). Although limited by small sample size, PN-CC1 showed poor RFS and OS, while PN-CC4 showed a trend toward longer RFS (Fig. 3b-c).
Paired analysis of pre-NAC CCs and PN-CCs showed convergent proteomic profiles, particularly regarding the cluster-defining proteins identified in the pre-NAC samples (Supplementary Fig. 9a). Pre-NAC CC1-Luminal and CC4-Stroma-rich showed the most plasticity after NAC, with CC1-Luminal patients aligning with PN-CC3 or PN-CC4, while CC4-Stroma-rich patients associated with PN-CC2, PN-CC3, or PN-CC4. Pre-NAC CC2-Nuclear and CC3-Basal were most concordant with the respective post-NAC CC (Fig. 3d; Supplementary Fig. 9b). It is important to acknowledge that differences between TUR and RC were likely influenced by NAC and procedural factors (Supplementary Fig. 1h). Post-NAC samples were closer in proteome composition to the BU than the pre-NAC specimens (Supplementary Fig. 9c). Therefore, we compared the proteins enriched in post- versus pre-NAC tissue to those enriched in BU versus pre-NAC tissue. This analysis highlights differences in individual proteins, such as CTHRC1 and NNMT, that are involved in ECM and epithelial-to-mesenchymal transition (EMT) and are exclusively upregulated in post-NAC and/or BU tissue (Fig. 3e; Supplementary Fig. 9d). At the pathway level, RC specimens were enriched for ECM and antibody-mediated immune responses, whereas TURs were enriched for epigenomic processes and keratinization (Fig. 3f).
Tumours resistant to NAC have a high rate of recurrence and progression, leading to poor OS71. Analysis of differential proteome expression between post- and pre-NAC tissue may reveal proteins relevant to platinum chemoresistance. Proteins enriched in post-NAC residual tumours could represent therapeutic targets to circumvent resistance (Fig. 3g). Using interaction scores > 40 from the Drug Gene Interaction Database55, we identified post-NAC enrichment of proteins with roles in platinum-resistance in other cancers, including SOD1 and FN1, both of which are targets of FDA-approved agents. PN-CCs showed differential expression of druggable proteins across clusters for which there are agents that are FDA-approved (Supplementary Fig. 10a) or under investigation (Supplementary Fig. 10b). For example, PN-CC1 tumours showed homogeneous enrichment for MTOR, PARP1, and PARP2, while PN-CC4 showed enrichment of NECTIN4 in about 50% of cases (Supplementary Fig. 10a). These are targets of clinically approved drugs72–74. This analysis identifies cluster-specific therapies that could be investigated for clinical utility to treat recurrences.
Integrating the transcriptomic and proteomic landscape of pre-NAC MIBC
The proteomic CCs largely overlapped with previously described RNA subtypes9 based on the 25% most variable proteins and their corresponding transcriptome values (Fig. 4a). Integrating the two transcriptomic platforms used in the study revealed a modest positive correlation (R = 0.34) between protein and RNA expression across the entire MIBC cohort. Notably, the correlation of transcriptome and protein was higher in microarray data compared with RNA-seq (R = 0.41 versus 0.29; Fig. 4b; Supplementary Fig. 11a). CC1-Luminal captured tumours of the Luminal papillary (LumP), Luminal non-specified (LumNS) and Luminal unstable (LumU) classes according to the transcriptomic consensus classifier9 with highly concordant expression of proteins and RNA (Fig. 4a). CC2-Nuclear encompassed all tumours classified as NE-like by RNA expression, in addition to two tumours classified as Basal/Squamous (Ba/Sq) and one as LumU. CC3-Basal was the most concordant with RNA classifiers and had the highest level of RNA cluster separation. All but one of the CC3-Basal tumours were classified as Ba/Sq. CC4-Stroma-rich showed a mix of all RNA subtypes, indicating highly heterogeneous tumours with a large stromal component. This may partly be explained by the loss of the stromal component in the RNA signal (lower right quadrant, Fig. 4a) and the extended half-life of extracellular collagens at the protein level75.
Further exploration of the differences in RNA and proteome expression between responders and non-responders to NAC revealed a positive correlation between data types (R = 0.45, p < 0.001, Fig. 4c). While no significant differences were observed between responders and non-responders at the protein level (data not shown), several transcripts were significantly up- or down-regulated, indicating disparity in expression of RNA and the corresponding protein76.
To delineate the immune component across pre-NAC CCs, we performed ESTIMATE77 and CIBERSORTx52 analyses on both the proteomic and transcriptomic datasets. We confirmed that CC3-Basal and CC4-Stroma-rich had the highest immune and stroma scores compared to CC1-Luminal and CC2-Nuclear (Supplementary Fig. 11b-c). Deconvolution of specific immune and other cell populations using CIBERSORTx was most effective with transcriptomic data, as endothelial and mast cells were not delineated using the proteomic data (Supplementary Fig. 11d). CC3-Basal and CC4-Stroma-rich were most enriched for monocytes and B cells, suggesting these are immune ‘hot’ tumours. Conversely, CC1-Luminal and CC2-Nuclear showed immune ‘cold’ features, except for T cell populations, which appeared equally enriched across all CCs. These results align with those of Xu et al., where U.III (basal) tumours were defined as immune ‘hot’ and U.I (luminal) tumours were classed as ‘cold’31.
To further study the ECM in CC4-Stroma-rich tumours, we visualized the matched expression of RNAs and proteins across the CCs for collagens, proteoglycans, and secreted proteins, collectively defined as the ‘matrisome’56 (Supplementary Fig. 12a-b). In line with Fig. 1d, we confirmed CC4-Stroma-rich to be enriched for collagens and proteoglycans at the protein level and, to a lesser extent, at the RNA level. Notably, CC4-Stroma-rich contained a mix of transcriptome classes, with tumours of luminal subtypes showing the most enrichment for collagens and proteoglycans. Analysis across the cohort identified high levels of calprotectin (S100A8/A9) in CC3-Basal tumours.
Analysis of intratumoural proteomic heterogeneity and its association with clinical outcomes
We performed multiregional sampling of morphologically similar pre-NAC tissue sections in 36 of 107 MIBC patients. Unsupervised clustering using all 5,828 proteins identified across all pre-NAC TUR duplicate samples showed high inter-tumour, but low ITH across the cohort (Fig. 5a). However, four of 36 sample pairs clustered separately, suggesting some samples had higher ITH despite having similar histologic features (Fig. 5a).
To quantify the level of ITH between paired samples, we assigned each pair an ITH score by Euclidean distance. Despite similar clinical profiles, patients whose tumours had above-median ITH (median = 21, range 12–37) had poorer NAC response rates (t-test, p = 0.029) and survival outcomes than those with below-median ITH (RFS p = 0.0056 and OS p = 0.046; Fig. 5b, d-e). Analysis of the distance for individual pre-NAC TUR samples (N = 72) to the CC centroids confirmed individual sample pre-NAC clusters were robust among pairs, with only four specimens having a CC switch within the same sampled tumour (two from CC1-Luminal to CC4-Stroma-rich and two from CC4-Stroma-rich to CC1-Luminal, Fig. 5f). Three of these four belonged to the high ITH group (Fig. 5b).
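The ITH score and median split described above can be sketched in a few lines. This is an illustration under assumptions: each sample is represented as a vector of protein abundances in a shared protein order, pairs with a score exactly equal to the median are placed in the low group (the handling of ties is not described in the text), and the patient identifiers and profiles are hypothetical.

```python
from math import dist
from statistics import median


def ith_scores(pairs):
    """Euclidean distance between the two regional profiles of each tumour.

    pairs: patient id -> (profile_a, profile_b), equal-length abundance vectors
    """
    return {pid: dist(a, b) for pid, (a, b) in pairs.items()}


def split_by_median(scores):
    """Label each patient 'high' (above the median score) or 'low' (otherwise)."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


# Hypothetical two-protein profiles for four patients with duplicate samples.
pairs = {
    "pt1": ([0.0, 0.0], [3.0, 4.0]),
    "pt2": ([1.0, 1.0], [1.0, 2.0]),
    "pt3": ([0.0, 0.0], [6.0, 8.0]),
    "pt4": ([2.0, 2.0], [2.0, 2.0]),
}

if __name__ == "__main__":
    scores = ith_scores(pairs)
    print(scores)
    print(split_by_median(scores))
```

The resulting high and low groups would then be compared for response and survival, for example with a t-test and log-rank tests, which are outside the scope of this sketch.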
PCA of paired pre- and post-NAC multiregional samples (N = 8) showed a high level of heterogeneity between the TUR and RC specimens (Fig. 5g), which agrees with the data shown in Supplementary Fig. 1h. Despite the low number of samples included in this part of the study, our data suggest that ITH is a potential biomarker to identify NAC-resistant tumours with poor prognosis. Prior studies using DNA sequencing on pre- and post-NAC-treated tissue have similarly reported that high ITH is associated with worse outcomes78.