Inferring Bladder Cancer Evolution from Mucosal field Effects by Whole-Organ Spatial Mutational, Proteomic, and Metabolomic Mapping

Multi-platform mutational, proteomic, and metabolomic spatial mapping was used on the whole-organ scale to identify the molecular evolution of bladder cancer from mucosal field effects. We identified complex proteomic and metabolomic dysregulations in microscopically normal areas of bladder mucosa adjacent to dysplasia and carcinoma in situ. The mutational landscape developed in a background of complex defects of protein homeostasis which included dysregulated nucleocytoplasmic transport, splicesome, ribosome biogenesis, and peroxisome. These changes were combined with altered urothelial differentiation which involved lipid metabolism and protein degradations controlled by PPAR. The complex alterations of proteome were accompanied by dysregulation of gluco-lipid energy-related metabolism. The analysis of mutational landscape identified three types of mutations based on their geographic distribution and variant allele frequencies. The most common were low frequency α mutations restricted to individual mucosal samples. The two other groups of mutations were associated with clonal expansion. The first of this group referred to as β mutations occurred at low frequencies across the mucosa. The second of this group called γ mutations increased in frequency with disease progression. Modeling of the mutations revealed that carcinogenesis may span nearly 30 years and can be divided into dormant and progressive phases. The α mutations developed gradually in the dormant phase. The progressive phase lasted approximately five years and was signified by the advent of β mutations, but it was driven by γ mutations which developed during the last 2–3 years of disease progression to invasive cancer. Our study indicates that the understanding of complex alterations involving mucosal microenvironment initiating bladder carcinogenesis can be inferred from the multi-platform whole-organ mapping.


INTRODUCTION
The molecular mechanisms that initiate carcinogenesis involve microscopically normal appearing tissue and are collectively referred to as eld effects. 1,2Their characterization may facilitate the development of early detection, prevention, and treatment strategies intercepting carcinogenesis in its early phases before it progresses to clinically aggressive and often uncurable disease.Understanding of these initiating events is not possible unless they are analyzed in the geographic spatial frame of mucosal changes in the entire organ.Bladder cancer is a particularly useful model for such studies as it develops in the epithelial lining, the urothelium, of the anatomically simple organ facilitating combined geographic microscopic and multiplatform genomic mapping.4][5][6][7] These studies revealed that regionally restricted clonal expansions are present in areas of the urothelium that appear phenotypically normal, and that these changes are associated with mutations in cancer driver genes, particularly those that regulate chromatin structure. 3,4,8These changes are associated with alterations in RNA expression that appear to disrupt innate immunity among other pathways and cause T-cell exhaustion. 3,8However, very little is known about how these genomic changes alter urothelial biology, which requires a deeper understanding of the downstream effects on protein expression and metabolism.Here we provide the rst descriptions of the proteomic and metabolomic pro les of bladder cancer evolution from mucosal eld effects in the context of their mutational landscape on the whole-organ scale.

Preparation of Whole-Organ Cystectomy for Spatial Mapping
To molecularly characterize the evolution of bladder cancer to mucosal eld effect on the whole-organ scale, we collected geographically annotated mucosal samples from a representative cystectomy specimen with invasive bladder cancer (Fig. 1A-J).For whole-organ mapping, the resected human bladder was open along the anterior wall and pinned down to a para n block.Then a mapping grid was applied which separated the mucosal areas into 1x2 cm wells allowing DNA and protein to be extracted while simultaneously preserving the urothelium for microscopic inspection from which the histologic map of the entire bladder mucosa and invasive cancer was reconstructed.The samples were microscopically classi ed as normal urothelium (NU), in situ preneoplastic conditions referred to as low and high-grade intraurothelial neoplasia (LGIN and HGIN) respectively, and invasive urothelial carcinoma (UC).
Geographically oriented samples were analyzed by bulk whole-exome DNA sequencing and proteomic sequencing.Germline DNA from peripheral blood samples was used as a reference for DNA sequencing.Microscopically normal urothelium harvested from ureters of nephrectomy specimens from patients without urothelial neoplasia was used as reference from proteomic sequencing.

Mutational Landscape of Field Effects and their Evolution to Carcinoma
Whole-exome sequencing of DNA from geographically mapped mucosal samples identi ed nonsynonymous variant alleles in 12,764 (12,022 SNVs, 450 inserts, and 292 deletions) genomic loci.The heat map which included their geographic distribution and variant allele frequencies (VAFs) of these mutations of individual mucosal samples is shown in Fig. 2A.Based on their VAFs and geographic distribution, we separated these mutations in to two major groups.The rst group comprised of cluster A consisting of mutations with low VAFs con ned to individual mucosal samples referred to as α mutations (Table S1).The second group comprised of cluster B consisting of mutations expanding across the bladder mucosa.Cluster B mutations were further divided into two subgroups referred to as β and γ (Table S2, S3; Fig. 2B).Mutations of cluster β were expanding across the bladder mucosa but their VAFs were consistently low (< 20%) and did not increase in their frequencies with progression to HGIN and UC.On the other hand, mutations of cluster γ formed a plaque involving large areas of bladder mucosa with consistently high level (> 20%) of VAFs.Overall, based on the geographic distribution and the VAFs, three distinct types of mutations were identi ed and referred to as α, β, and γ (Table S4; Fig. 2A-D).The mutations of α type were the most frequent (12,431) comprising 11,693 SNVs, 448 insertions, and 285 deletions.There were only 54 β mutations with 51 SNVs, 1 insertion, and 2 deletions.There were 324 γ mutations with 315 SNVs, 2 insertions, and 7 deletions.Although the VAFs of γ mutations were high in all three groups of samples corresponding to NU/LGIN, HGIN, and UC there was a major increase in the number of mutations with progression to HGIN and UC (Fig. 2C-F).Most of those mutations were of γ type.
In order to address the issue of preferential selection of the mutations and their potential driver's role we analyzed their spatial distribution and VAFs in different elds of the map.We focused on the distribution patterns of α, β, and γ mutations addressing the hypothesis that some of these mutations maybe under positive selection while others are mere hitchhikers (Fig. 2G-I).First, we additionally classi ed the mutations into private (present in only one eld), regional (present in 2-10 or 11-20 elds), and widespread mutations (present in 21-30 or > 30 elds).The mutations were almost exclusively private while the mutations were almost exclusively widespread.Some of the mutations were regional while others were widespread.
The mutations, in addition to being private have right-skewed VAF distributions typical of proliferating cells with neutral mutations (Fig. 2G, J; Figure S1A). 9Similarly, the VAFs of β mutations were right skewed but their spread was at least regional involving several mucosal samples (Fig. 2J, K; Figure S1B).The VAFs of the widespread mutations were binomially distributed in most elds, which signi es a secondary clone that acquired positive selective advantage (Fig. 2H, I; Figure S1C). 9In some elds the distribution of VAFs for γ mutations were almost uniform which is likely the sign of widespread genome copy number dysregulation (Figure S1C).The VAFs of β mutations in individual samples were in general highly irregular but in some elds have more right skewed pattern while in others have binomial distributions.This pattern suggests the mixture of clones with different proliferative advantage.Overall, there was a progressive increase of COSMIC driver mutations in mutations α, β, and γ.

Mechanisms of Mutagenesis involved in Bladder Cancer Evolution from Field Effects
To characterize the mutagenesis signatures in the evolution of bladder cancer from eld effects we rst analyzed six single-based nucleotide substitutions (C > A, C > G, C > T, T > A, T > C, and T > G) and their context motifs in all mucosal samples (Fig. 3A-C).The frequency of C > T substitutions increased at the transition from NU/LGIN to HGIN and UC with signi cant changes in 12 mutational signatures.Signatures 1, 6, 12, 20, and 24 appeared to be the most dominant with signature 1 having the highest weight scores (Fig. 3D-F).To evaluate the contributions of individual mutagenesis pattern to the mucosal mutational landscape we performed bootstrapping and calculated the p value to assess their signi cance (p < 0.005 was considered statistically signi cant).This approach con rmed the dominance of signatures 1 and 6 (Fig. 3G).The analysis of α, β, and γ mutations identi ed a signi cant increase of C > T substitutions in γ mutations (Fig. 3H).Similarly, it was evident that these three groups of mutations were associated with distinct mutational signatures con rming their development through distinct mutational mechanisms (Fig. 3I-K).

Modeling of Bladder Cancer Evolution from Mucosal Field Effects
To analyze the mutational pattern of clonal evolution of bladder cancer from eld effects we used all nonsilent and silent mutations to construct an evolutionary tree.This revealed a complex branching pattern corresponding to multiple waves of clonal expansion with 4-11 nodes evolving along three distinct branches referred to as δ, ε, and ζ (Fig. 4A).The process evolved from the hypothetical node 0 in the center and neighboring nodes 1 and 2 of the three branches representing incipient events of carcinogenesis.The heat map of genetic distance among individual samples showed the gradual evolution of the mutational landscape from the initiating events of the left lower corner with three clusters of the mutational landscape corresponding to the three branches of the evolutionary tree (Fig. 4B).The branch ζ appeared to be most mutationally active showing an increase of their VAFs (Fig. 4C,D).
To answer the question of how long bladder cancer takes to develop, we applied a mathematical modeling algorithm to a whole-organ mutational landscape.We used successive waves of clonal evolutions of the parsimony trees applying a time-continuous Markov branching process. 10This provided time scale of cancer evolution from eld effects based on maximal parsimonious principles.Initially, we performed the analysis using all synonymous and nonsynonymous mutations (Fig. 4E).These analyses showed that cancer evolved from eld effects over approximately 30 years and the age-related curve of mutations had a left-skewed pattern with the mutations gradually developing over nearly three decades.Using the time scale of mutations, the process of tumor evolution could be divided into two major phases referred to as dormant and progressive.The rst, older dormant phase in which mutations developed over approximately two decades, involved mutations which were characterized by low selection coe cients consistent with their marginal proliferative advantage.Second, the more recent and progressive phase was less than ve years old and was characterized by a large number of mutations with high selection coe cients consistent with their clonal expansion and high proliferative advantage.We repeated the modeling analysis selectively using the three types of mutations referred to as α, β, and γ (Fig. 4F-H).
These analyses have shown that the α mutations were the oldest and gradually developed over 30 years (Fig. 4F).Large proportion of them had low selection co-e cient consistent with the minimal proliferative advantage and the fact they were involving small mucosal areas.The mutations of β type emerged at the transition from dormant to progressive phase and developed over the period of less than ve years (Fig. 4G).Large proportion of them showed increased selection coe cient consistent with proliferative advantage and clonal expansion.The mutation of γ type were the younger and they developed over the last two years before cancer diagnosis (Fig. 4H).These mutations were characterized by high proliferative coe cients consistent with the proliferative advantage and the involvement in the last two years of the progressive phase.Overall, these analyses have shown that dormant and progressive phases of bladder carcinogenesis were characterized by distinct involvement of α, β, and γ mutations.
To address the issue of potential distinctive involvement of different mutagenesis mechanisms in different phases of bladder carcinogenesis we performed a detailed analysis of the mutational signatures in dormant and progressive phases of the disease.These analyses showed that the progressive disease was characterized by an increased C > A substitutions combined with major increase of both silent and non-silent mutations and their VAFs (Fig. 4I, J).Dormant and progressive phases of bladder carcinogenesis showed the involvement of distinct mutagenesis signatures con rming the concept that distinct mutational mechanisms are switched on in a background of dormant eld effect driving the progression to clinically aggressive bladder cancer (Fig. 4K, L).

Proteomic Pro le of Bladder Cancer Evolution from Mucosal Field Effects
Proteomic pro ling of 38 geographically mapped mucosal samples from a cystectomy specimen identi ed 8,475 distinctive proteins, which are listed in Table S5.and UC samples (Fig. 5A).The HGIN and UC samples co-clustered together.Gene ontology analysis of cellular components showed that we recovered approximately 50% proteins of the reported total number of proteins in each cellular compartment (Figure S2E).In the analysis of proteins, we focused on the monotonic changes of protein expression which paralleled the progression of neoplasia from NU/LGIN through HGIN to UC.The total numbers of proteins with monotonic upregulation or downregulation in this progression were: 2,343, 2,413, and 2,504 respectively which were analyzed by iPathwayguide program.
The top 50 monotonically upregulated and downregulated proteins are shown in Fig. 5C, D. By using this approach, we identi ed 42 dysregulated protein pathways in the mucosal eld effects corresponding to NU/LGIN, which continued to be progressively dysregulated in the evolution of neoplasia through HGIN to UC (Table S6).The dysregulated pathways converging on energy and protein homeostasis were complemented by tissue differentiation program defects (Fig. 5E).The example of downregulated energy pathways included oxidative phosphorylation, valine, leucine and isoleucine degradation, TCA cycle, and thermogenesis.The protein homeostasis defects include dysregulated nucleocytoplasmic transport, splicesome, ribosome biogenesis, and peroxisome.The signature dysregulated pathway involved in tissue differentiation, lipid metabolism and protein degradation represents PPAR signaling.

Metabolomic Analysis of Bladder Cancer Evolution from Field Effects
Targeted metabolomic pro ling of 38 geographically mapped mucosal samples from a cystectomy specimen using positive and negative ionization modes (Electrospray Ionization, ESI) identi ed 91 metabolites (Table S7).Subsequently, the peak areas of metabolites were normalized using internal standards for further analysis.The Principal Component Analysis (PCA) showed the clustering distinction among the samples from NU/LGIN, HGIN, and UC groups and revealed a preferential clustering of HGIN and UC samples (Fig. 6A).Differentially expressed metabolites visualized on the PCA plot, represented their log fold change values, with signi cantly upregulated and downregulated metabolites highlighted.
The differentially expressed metabolites of the mucosal samples are shown in the volcano plots as compared to control samples and for comparisons between HGIN/UC versus NU/LGIN in Fig. 6B.The total number of measured metabolites and their expression patterns in individual mucosal samples classi ed as NU/LGIN and HGIN/UC are shown in a heatmap and box plot in Fig. 6C,D.Among the top ve monotonically upregulated metabolites were glycerol-3-phosphate, betaine, farnesyl-PP, taurine, and lactate.Among the top ve monotonically downregulated proteins were 5-CMP, homocysteine, cystathionine, dCMP, and ADP.Furthermore, we conducted pathway analysis for three comparisons: 1) NU/LGIN vs. control, 2) HGIN vs. control, and 3) UC vs. control (Fig. 6E).The top ve monotonically enriched pathways were tRNA charging, citrulline biosynthesis, superpathway of citrulline metabolism, arginine degradation VI, and proline biosynthesis II (from arginine).The analysis of enrichment scores in individual mucosal samples for metabolites using single sample gene set enrichment analysis (ssGSEA) identi ed 74 monotonically dysregulated pathways (Table S8; Figure S4).Among the top ve activated pathways were glycerolipid metabolism, steroid biosynthesis, terpenoid backbone biosynthesis, taurine and hypotaurine metabolism, and cAMP signaling.Among the top ve downregulated pathways were platelet activation, cysteine and methionine metabolism, fatty acid elongation, fatty acid biosynthesis, and fatty acid degradation.Interestingly, elevation of lactate and activation of glycolysis was already evident in microscopically normal appearing areas of bladder mucosa implicating the emergence of Warburg phenotype in the eld effects.These analyses taken together indicate that there were complex alterations of gluco-lipid energy-related metabolism in eld effects dysregulating mucosal microenvironment in the incipient phases of bladder carcinogenesis.

DISCUSSION
This study provided a proof of principle for the validity and feasibility of multi-platform mutational, proteomic, and metabolomic analysis on the whole-organ scale to infer the molecular pro le of bladder cancer development from mucosal eld effects.It provides evidence for time modeling of mutational landscape in the context of complex alterations involving protein homeostasis and tissue differentiation program combined with dysregulation of gluco-lipid energy-related metabolism.2][13] Whole-organ spatial geographic analyses revealed that bladder carcinogenesis may develop innocuously over several decades and can be divided into dormant and Highly variable mutagenesis signatures were identi ed in individual mucosal samples but signatures 1 and 6 were dominant. 14Moreover, the development of α, β, and ϒ mutations were driven by distinct mutational mechanisms.Similarly, the transition from dormant to progressive phase of carcinogenesis was associated with the involvement of different mutagenesis signatures con rming that the progressive phase of bladder carcinogenesis developed through the activation of distinct mutational mechanisms.
These mutational landscape changes developed in a background of complex proteomic and metabolomic dysregulations.Overall the proteome alterations involved defects of protein homeostasis including dysregulated nucleocytoplasmic transport, splicesome, ribosome biogenesis, and peroxisome.Interestingly these changes were associated with dysregulated urothelial differentiation controlled by PPAR and involved lipid metabolism and protein degradations. 15,16The proteome alterations were accompanied by equally complex dysregulation of glycerolipid metabolism, steroid biosynthesis, terpenoid backbone biosynthesis, taurine and hypotaurine metabolism, and cAMP signaling.The top dysregulated metabolomic pathways involved cysteine and methionine metabolism, fatty acid elongation, fatty acid biosynthesis, and fatty acid degradation including activated glycolysis signifying complex alterations of gluco-lipid energy-related metabolism in eld effects complementing the dysregulated mucosal microenvironment in the incipient phases of bladder carcinogenesis.[23] MATERIAL AND METHODS

Experimental Model and Subject Details
Human samples and clinical data were collected according to the laboratory protocols approved by the Institutional Review Board of The University of Texas MD Anderson Cancer Center.The whole-organ histologic, genomic, and proteomic mapping (WOHGPM) was performed on the radical cystectomy specimen from 92-year-old white male patient with high-grade urothelial carcinoma invasive into muscularis propia (T 3 ) as previously described.In brief, the preparation of cystectomy specimen for whole-organ histologic mapping combined with DNA/RNA and protein extraction followed the steps illustrated in Supplementary Data Fig. 1.The cystectomy specimen was open longitudinally along the anterior wall of the bladder and pinned down to a para n block.Then the mapping grid was superimposed and pressed down over the bladder mucosa with mechanical screws.The mapping grid provided sealed wells that separated mucosal areas into 1x2cm (2cm 2 ) rectangles.Phosphate-buffered saline (PBS) with 0.25% Trypsin (1ml) was poured into each well and the surface urothelium was scraped.
The uid was collected into Eppendorf tubes and incubated at 37°C for 30 minutes.The urothelial cell clusters and tissue fragments were removed by passing the cell suspension through the 70µm cell strainer.Subsequently, the red blood cells were removed by Ficoll gradient centrifugation.The single cell suspensions were washed twice in PBS containing 0.04% BSA and resuspended in 600µl of PBS.A small proportion of this uid was used for cytospin preparation to assess the purity and quality of cell suspension.In areas of mucosa which contain grossly recognizable tumor, the tumor tissue was collected by direct dissection from the bladder wall, cut into small pieces and processed as described above.The nal 600µl cell suspensions were divided into three parts and kept frozen until processed for DNA, protein, and metabolite extractions.For DNA extraction the cell suspensions were defrosted and treated with triazol reagents.The mapping grid was removed from the surface of the bladder which was then xed in formalin overnight.The grooves at the bottom of the mapping grid left permanent impression on the bladder surface and preserved the urothelium for microscopic inspection and histologic mapping of the entire bladder mucosa.Para n embedded sections corresponding to mapping grooves were collected and were stained with hematoxylin & eosin to evaluate the distribution of microscopically normal urothelium in situ precursor lesions and urothelial carcinoma.The intraurothelial lesions were dichotomized into lowand high-grade categories referred to as low-grade intraurothelial neoplasia (LGIN) and high-grade intraurothelial neoplasia (HGIN) as previously described. 24,25Samples with tumor tissue were classi ed according to the two-tier histologic grading system of the World Health Organization (WHO) referred to as low-and high-grade. 26The growth pattern of papillary versus solid and the depth of invasion were recorded.Levels of invasion were de ned according to the Tumor Node Metastasis (TNM) Staging System. 27,28ere were two steps of quality check controls which comprised of the overall assessment of the cystectomy specimens and the quality of nal DNA/RNA and protein preparations for genomic and proteomic pro ling.In the rst step the cystectomy specimen was assessed in terms of the representation of the whole spectrum of the in situ precursor lesions and tumor samples as well as purity of urothelial and tumor cell preparations.In the second step the quality of DNA/RNA preparations were veri ed using NanoDrop, Bioanalyzer, and Qubit.

Whole-Exome Sequencing and Data Analysis
Whole-exome sequencing as well as downstream data analysis were performed as outlined in recent cancer sequencing project publications.In brief, whole-exome sequencing was performed by using an Illumina NovaSeq6000 sequencer using high output ow cell with an average coverage across the samples of 300X ± 85.3 SD.The initial alignments of reads to GRCh38 reference genome was performed with BWA-MEM (version 0.7.12).The Genome Analysis Toolkit (GATK, version 3.4-46) was used to generate realigned and recalibrated BAM les.MuTect2 and Oncotator (version 1.8.0.0) were used to identify mutations.

Mutational Signatures
The analyses of mutational signatures were performed as previously described. 3In brief, non-silent mutations which were present in at least one sample with the following substitutions: C > A, C > G, C > T, T > A, T > C, T > G were used to analyze their distribution in the three groups of samples corresponding to NU/LGIN, HGIN, and UC.Fisher's exact test was used to test the null hypothesis that they are equally distributed in the three groups of samples.The genomic context of SNVs including the two anking bases on the 5' and 3' sides to each SNV was assembled and included 96 mutational patterns.The frequency of any ngerprints between groups of mucosal samples was tested by Wilcoxon Rank Sum tests.The Benjamini and Hochberg (BH) method was used to assess the false discovery rate (FDR). 29We used mutational ngerprints (V) in quadratic programming to estimate a weight score (H) for each mutational signature (W) from the Sanger Institute database (https://cancer.sanger.ac.uk/cosmic/signatures) as previously described. 3,4We used the matrix of canonical signatures (W) with the mutational pro le of a sample (V) to compute the 30 by 1 vector (H) for each of the canonical signatures' relative contributions to the sample mutagenesis pro le by computing the following optimization: minH (WH -V)T(WH -V) such that hi ≥ 0 and Σi hi = 1 Kruskal-Wallis test was used to test the null hypothesis of no difference in weight scores among the groups of samples.We applied bootstrapping to assess the contribution of mutational signatures in mucosal samples.Mutational ngerprints (V) for each sample were resampled with replacement and the weight scores were computed as above 2000 times.The one-sided empirical p value was computed as the percentage of weight scores that were equal or greater than the weight score in the resampling distribution.

Phylogenetic Analysis and Modeling of Bladder Cancer Evolution
The phylogenetic tree was constructed by calculating the Hamming distances among the mucosal samples using a matrix of all nonsilent and silent mutations present in at least one sample by applying the maximum parsimony algorithm as previously described. 3In graphical representation of phylogenetic tree, each node corresponds to a population of cells and the length of the edge connecting the nodes is proportional to the number of mutations.A brunch represents a point in the evolution where two distinct population emerge.While the length of the branch is proportional to the number of mutations which are unique for each population.
To reconstruct the time of evolution from mucosal eld effects to bladder cancer, the modi ed timecontinuous Markov branching process with immigration and parsimonious principles was used as described previously with the following modi cations. 10In brief, a mutation appears at time in a progenitor cell of the urinary bladder urothelial lining and gives rise to a mutant clone.Mutant cells divide at rate (1/year), and after division, one cell enters self-renewal and the other differentiates with j t j 0 λ j probability or both cells enter self-renewal with probability .As a consequence, the mutant clone grows exponentially as , where is the age of the -th mutant's clone counted from .The secondary clones expand, involving different areas of bladder mucosa at times modeled by a stochastic Poisson process with intensity (1/yr). 30If the expected cell counts in the successive -th mutant clones are denoted by , and the number of haploid genomes in normal uroprogenitor cells are denoted by , the corresponding VAFs are de ned as the ratios and are computed as follows. 10r any mutation of age , the sequence of expectations was computed to estimate the coe cients and .With a cell division rate and migration rate , the parameter is the proxy for mutation age , whereas the ratio is the proxy for selection coe cient .The coe cient is a constant parameter representing an estimate of the number of uroprogenitor cells in the sampled area.The computations were performed for uroprogenitor cells in the sampled mucosal area, which did not signi cantly change the time modeling results, but the best t was obtained with 5 uroprogenitor cells, for which the data are presented.
The objective is reconstruction of the evolution of mutational landscape from mucosal eld effects to invasive cancer in the forward time by connecting the migration of urothelial cells to their proliferation rate.A parsimonious model assumes that the migration rate is proportional to the power of the proliferation rate, i.e. .Hence the parameter , for mutant , has the form where (and is a reference migration rate), has the form . Solving these two equations for and provides estimates for the proliferation rate (proxy for the selection coe cient) and mutation age of mutant , We carried out a series of extensive parametric studies an found that estimates corresponding to high values of parameter , such as , t the chronology of different mutation classes, and , which is consistent with biological and clinical intuition.2][33] The resulting time estimates were presented as bar diagrams representing the age of mutations and point charts for the corresponding selection coe cients.

Mass Spectrometry-based Proteome Pro ling
The global proteome pro ling was processed as described previously with the following modi cations. 34,35The frozen tissue was resuspended and lysed in 50mM Ammonium bicarbonate, 1 mM CaCl2 by 3 min of sonication, then 50 ug of tissue lysate were digested using 1 ug of trypsin for 12 hours at 37°C.Digested peptides concentration was measured using a colorimetric peptide assay kit (Thermo Fischer Scienti c, cat.#23275) and 25 ug of peptides were separated in a home-made high pH reversephase C18 column in a pipet tip.Peptides were eluted and separated into fteen fractions using a stepwise gradient of increasing acetonitrile (2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28 with Mascot algorithm (Mascot 2.4, Matrix Science).The precursor mass tolerance of 20 ppm and fragment mass tolerance of 0.5 Da was allowed.Two maximum missed cleavage, and dynamic modi cations of acetylation of N-term and oxidation of methionine were allowed.Assigned peptides were ltered with a 1% false discovery rate (FDR) using Percolator validation based on q-value.The Peptide Spectrum Matches (PSMs) output from PD2.1 was used to group peptides onto gene level using 'gpGrouper' algorithm. 36An in-housed program, gpGrouper, uses a universal peptide grouping logic to accurately allocate and provide MS1 based quanti cation across multiple gene products.Gene-protein products (GPs) quanti cation was performed using the label-free, intensity-based absolute quanti cation (iBAQ) approach and then normalized to FOT (a fraction of the total protein iBAQ amount per experiment).
FOT was de ned as an individual protein's iBAQ divided by the total iBAQ of all identi ed proteins within one experiment.
The missing values in the proteome recovery were replaced with half of the minimally detected value in the entire dataset.Following log2 transformation of this dataset, the differential analysis (t test) was performed comparing one speci c group against all the remaining samples combined.This step was repeated for all the different experimental groups concerned. 37Any protein was deemed statistically altered expression it has a p-value of < 0.05 and greater than 1.5 fold change.The selected GPs were analyzed using Advaita Bio's iPathwayGuide.A systems biology approach for pathway level analysis. 38supervised hierarchical clustering of normalized protein expression values was performed in software R, and the results were visualized with R package "ComplexHeatmap".Pearson's correlation, mean centering, and average linkage were applied in all clustering applications.

Targeted Metabolomics Analysis
The human bladder mucosal and tumor tissue lysates were used for metabolite extraction.The lysates from the urothelium harvested from normal human ureters from nephrectomies of patients with renal cancer without evidence of urothelial neoplasia were used as a reference.The mouse liver pool was used as Quality Control (QC), and was combined with 750 µL of internal standard (ISTD) mix.0][41] In brief, after partitioning through ice-cold chloroform and water, organic and aqueous layers were carefully transferred into new glass vials.Proteins and lipids were removed from extracted samples using a 3K Amicon-Ultra lter (Millipore Corporation, Billerica, MA).The dried pellets were dissolved into methanol-water (50:50 v/v).
The extracted total metabolite samples were analyzed through high-throughput Liquid Chromatography-Mass Spectrometry (LC-MS/MS) techniques described previously. 39,40The chromatographic separation of extracted metabolites was performed through Hydrophilic Interaction Chromatography (HILIC) and Reverse Phase (RP) chromatography techniques.The metabolites were separated through the XBridge Amide HPLC column (3.5 µm, 4.6 x 100 mm, Waters, Milford, MA) in both ESI positive and negative mode.
The details Lc methods were described in our earlier publications. 39,42,43The data was acquired via multiple reaction monitoring (MRM) using a 6495 Triple Quadrupole mass spectrometry coupled to an HPLC system (Agilent Technologies, Santa Clara, CA) through Agilent Mass Hunter Software. 39The acquired data were analyzed and integrated into each peak using Agilent Mass Hunter Quantitative Analysis software.The extracted peak area was log2 transformed and normalized by an isotopically labeled internal standard for each method.There were in total 92 metabolites pro led and passing quality control.In addition to clustering and visualizing individual metabolites across samples by heatmaps, we calculated relative metabolic pathway activity level in each individual sample by single sample gene set enrichment analysis (ssGSEA).
Speci cally, we extracted 74 metabolic pathways from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database.The ssGSEA method is an extension of GSEA method: it compares the difference in empirical cumulative distribution functions of the metabolites' ranks inside and outside a given metabolic pathway to calculate an enrichment score for each sample.These ssGSEA enrichment scores could re ect how the metabolites in a pathway are coordinately upregulated or downregulated within a sample, with further normalization across all samples.We examined how the pathways were differentiated among the groups of NU/LGIN, HGIN, and UC using hierarchical clustering and heatmaps.

Declarations
Data Availability whole exome sequencing data both raw and analyzed were deposited on SRA: PRJNA1065919.The mass spectrometry data for proteome pro ling have been deposited via the MASSIVE repository (MSV000085220) to the Proteome X change Consortium (http://proteomecentral.proteomexchange.org)with the dataset identi er PXD018341).

Figures
The heatmap of their expression patterns are shown in FigureS3A, B. The proteome coverage for individual mucosal samples ranged from 150 to 7,123 proteins (FigureS2A).The samples from mucosal areas with more than 2,500 identi ed proteins were selected for downstream analysis and grouped into NU/LGIN, HGIN, and UC.The protein extracted from normal ureters of nephrectomy specimens from patients without urothelial neoplasia were used as a reference.The proteome abundancy distribution was consistent across 38 selected mucosal samples as shown in FigureS2B.The number of gene products and Peptide Spectrum Matches (PSMs) in individual mucosal samples and the rank order of normalized proteins are shown in FigureS2C, D. The principal component analysis of proteome composition showed a clear separation among NU/LGIN versus HGIN progressive phases.The mutational landscape develops gradually via multiple waves corresponding to distinct successive clones of cells which may develop along several distinct branches.The most common are low frequency α mutations restricted to individual mucosal samples.It is very likely they represent the progeny of individual uro-progenitor cells.The α mutations are the oldest and gradually develop over three decades.The other two groups referred to as β and γ mutations were associated with intramucosal clonal expansion.The rst of these two groups were β mutations with low frequency of VAFs typically comprising of < 10% of cells.The emergence of clone with β mutations was associated with the transition to the progressive phase during the last ve years of carcinogenesis.The second group referred to as γ mutations appeared to be drivers of the progressive phase emerging 2-3 years before the progression to invasive cancer.

Figure 1 Preparation
Figure 1

Figure 6
Figure 6 26% acetonitrile, 0.1% formic acid at a ow rate of 800 nl/min.The Mass spectrometry was operated in a data-dependent mode, acquiring fragmentation spectra of the top 30 strongest ions under the control of Xcalibur software version 4.1 (Thermo Fisher Scienti c).The parental ion was acquired in the Orbitrap with a full MS range of 300-1400 m/z at the resolution of 120,000.Higher-energy collisional dissociation (HCD) fragmented MS/MS spectrum was acquired in ion-trap with rapid scan mode.The MS/MS spectra were searched against the target-decoy Human RefSeq database (release Jan.21, 2020, containing 80,872 entries) in Proteome Discoverer 2.1 interface (Thermo Fisher)