Clinical characteristics of patients with bladder cancer
We analyzed genome sequencing data from 412 patients with bladder cancer obtained from the TCGA Bladder Cancer (BLCA) Data Portal. The clinical and pathological characteristics of all patients included in the current analysis are shown in Table 1. High-grade tumors comprised 94.9% of the analysis cohort, and low-grade tumors the remaining 5.1%.
Comprehensive analysis of somatic variants in bladder cancer
A total of 87,624 somatic mutations were identified using Maftools. These variants comprised 75,116 missense mutations, 7,291 nonsense mutations, 1,216 insertions, 1,982 deletions, 130 translation start site variants, 1,762 splice site variants, and 127 nonstop mutations (Fig. 1a, b and d). These mutations were further classified using the Variant Effect Predictor (Fig. 1c). The median number of mutations per sample was 148 (Fig. 1a and e). The numbers of mutations differed significantly among variant types (Fig. 1f). Using Maftools, we identified the 50 most frequently mutated genes (Fig. 1b and Table S1), including TP53 (47%), TTN (45%), MUC16 (28%), KDM6A (26%), SYNE1 (20%), RB1 (18%), FGFR3 (14%), STAG2 (14%), BIRC6 (11%), RYR1 (10%), ADGRV1 (10%), and AHNAK (10%).
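As an illustration of how per-gene mutation frequencies such as those above can be tallied, the following minimal Python sketch computes the fraction of samples carrying at least one mutation in each gene. It is not the Maftools implementation; the function name, the (sample, gene) record layout, and the toy records are hypothetical.

```python
from collections import defaultdict


def gene_mutation_frequency(records, n_samples):
    """Return {gene: fraction of samples with at least one mutation in that gene}.

    records   -- iterable of (sample_id, gene) pairs, one per somatic mutation
    n_samples -- total number of samples in the cohort
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in records:
        # A set ensures a sample with several hits in one gene is counted once.
        samples_by_gene[gene].add(sample_id)
    return {gene: len(ids) / n_samples for gene, ids in samples_by_gene.items()}


# Hypothetical toy records, not taken from the TCGA cohort.
toy_records = [("S1", "TP53"), ("S1", "TP53"), ("S2", "TP53"), ("S3", "FGFR3")]
toy_freq = gene_mutation_frequency(toy_records, n_samples=4)

for gene, freq in sorted(toy_freq.items(), key=lambda item: -item[1]):
    print(f"{gene}: {freq:.0%}")
```

Counting samples rather than mutations keeps heavily mutated samples from inflating the frequency of long genes.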
MATH value characteristics and their relationship with clinical factors
Analysis of mutation rates across tumor types in the TCGA database revealed a high mutation rate for bladder cancer. The median mutational burden of the bladder cancer samples was 148 mutations, which is higher than that of many other cancer types in the TCGA dataset, ranking fourth among TCGA tumor types (Fig. 2a). Bladder cancer also displays high intratumor heterogeneity (ITH) during its progression. ITH can be assessed using the mutant-allele tumor heterogeneity (MATH) value. MATH values are calculated from the median absolute deviation (MAD) and the median of the mutant-allele fractions at tumor-specific mutated loci, so their precision depends on the sampling of loci and of mutant vs. reference alleles [19]. Thus, the MATH value represents a different aspect of tumor biology than the mutation rate, and it has been shown to be a simple, quantitative, and generally applicable measure of the degree of ITH. Here, we calculated MATH values in the TCGA bladder cancer cohort to evaluate the clinical implications of ITH in bladder cancer. The distribution of the MATH values is shown in Fig. 2b. A Q-Q (quantile-quantile) plot (Fig. 2c) and the Kolmogorov-Smirnov test (p = 0.2) both indicate that the MATH values in this cohort are consistent with a normal distribution.
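The MATH calculation can be sketched as follows, assuming the commonly used definition MATH = 100 × MAD / median, with the MAD scaled by the constant 1.4826. The text above names only the MAD and the median, so the scaling constant is an assumption, and the function name and input fractions are hypothetical.

```python
from statistics import median

# Assumed scaling constant that makes the MAD comparable to a standard
# deviation for normally distributed data.
MAD_SCALE = 1.4826


def math_score(mutant_allele_fractions):
    """Return 100 * scaled MAD / median of the mutant-allele fractions."""
    fractions = list(mutant_allele_fractions)
    if not fractions:
        raise ValueError("at least one mutant-allele fraction is required")
    center = median(fractions)
    if center <= 0:
        raise ValueError("median mutant-allele fraction must be positive")
    mad = MAD_SCALE * median(abs(f - center) for f in fractions)
    return 100.0 * mad / center


# Hypothetical tumors: tightly clustered vs. widely spread allele fractions.
homogeneous = [0.40, 0.41, 0.39, 0.42, 0.40]
heterogeneous = [0.05, 0.12, 0.40, 0.25, 0.45]

print(round(math_score(homogeneous), 2))
print(round(math_score(heterogeneous), 2))
```

Because the score is a ratio of spread to center, a wider distribution of mutant-allele fractions yields a higher MATH value regardless of how many mutations the tumor carries.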
The lower and upper tertiles of the MATH values were 39.4 and 52.3, respectively. Thus, cases with MATH values below 39.4 were classified into the “low MATH” group (135 patients; 33.3%), cases with MATH values above 52.3 into the “high MATH” group (135 patients; 33.3%), and the remainder (MATH values between 39.4 and 52.3) into the “intermediate MATH” group (136 patients; 33.4%). First, we examined the relationship between MATH value and overall survival time for each patient using linear regression and found no linear dependence of overall survival time on MATH value (p = 0.646, Fig. 3a). Next, we investigated the prognostic significance of MATH value across the three MATH groups defined above. We found that MATH value was not an independent predictor of overall survival time in the entire bladder cancer cohort (p = 0.725, Fig. 3b).
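The tertile-based grouping can be sketched as follows. The default cut-offs are the values reported above (39.4 and 52.3); the function name and the example MATH values are hypothetical, and the quantile method shown for deriving cut-offs from data is an assumption, not necessarily the one used in the analysis.

```python
from statistics import quantiles


def classify_math(value, low_cut=39.4, high_cut=52.3):
    """Assign a MATH value to the low, intermediate, or high group."""
    if value < low_cut:
        return "low"
    if value > high_cut:
        return "high"
    return "intermediate"


# Hypothetical MATH values, used only to show how cut-offs could be derived.
example_values = [22.1, 35.0, 41.7, 44.3, 48.9, 55.2, 61.8, 70.4, 38.6]
low_cut, high_cut = quantiles(example_values, n=3)  # two tertile cut points

groups = [classify_math(v, low_cut, high_cut) for v in example_values]
print(low_cut, high_cut)
print(groups)
```

Values that fall exactly on a cut-off are placed in the intermediate group, matching the strict "below" and "above" wording of the group definitions.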
Because the MATH values grouped by grade, stage, or race did not follow a normal distribution, the rank sum test was used to compare them. Using the rank sum test, MATH value was significantly associated with tumor grade (p = 0.024), but not with race (p = 0.484) or stage (p = 0.425) (Table 1). Using one-way ANOVA, MATH value was not associated with recurrence (p = 0.389) or tumor histologic subtype (p = 0.162) (Table 1). Fig. 4a shows the range of calculated MATH values for tumors initially labeled low-grade and high-grade. MATH values were higher in the high-grade group than in the low-grade group (p < 0.05), which suggests that MATH value may be useful in differentiating such patients. We next analyzed the percentage of low- and high-grade cases across different ranges of tumor MATH values; high-grade cases showed higher MATH values than low-grade cases (Fig. 4b). We then explored the relationship between MATH value and survival in the low-grade and high-grade subgroups. Interestingly, MATH value was not an independent prognostic factor in either subgroup (Fig. 4c, p = 0.116). Taken together, MATH value was not an independent risk factor for survival time, but a high MATH value was associated with high-grade BLCA.
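A two-group rank sum comparison, such as low-grade vs. high-grade MATH values, can be sketched as below. This is a minimal Wilcoxon rank-sum (Mann-Whitney U) test using the normal approximation with tie correction and no continuity correction; whether the analysis above used this exact variant is not stated, and the function name and inputs are hypothetical. Comparisons among more than two groups (for example, stage) would need a multi-group rank test instead.

```python
import math


def rank_sum_test(x, y):
    """Two-sided rank-sum test; returns (U statistic for x, p-value)."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])

    ranks = [0.0] * n
    tie_term = 0  # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        t = j - i
        tie_term += t ** 3 - t
        i = j

    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        raise ValueError("all observations are identical")
    z = (u_x - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return u_x, p_value


# Hypothetical MATH values for two grade groups.
low_grade = [31.2, 28.5, 40.1, 35.7, 33.0]
high_grade = [45.9, 52.3, 38.4, 60.2, 49.8, 55.1]
u_stat, p_val = rank_sum_test(low_grade, high_grade)
print(u_stat, p_val)
```

The normal approximation is adequate for cohorts of the size analyzed here; for very small groups an exact test would be preferable.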
Low MATH value was an independent favorable prognostic biomarker in FGFR3-mutant patients
Although MATH value was not related to the overall mutation rate, we hypothesized that different driver gene mutations may direct different routes of tumor evolution, resulting in variable intratumor heterogeneity. To test this, we analyzed the mutual exclusivity and co-occurrence of the top 50 mutated genes in BLCA using cBioPortal [21, 22] (Table 2). We observed mutually exclusive variants in FGFR3 vs. TP53 (q < 0.003) and FGFR3 vs. RB1 (q < 0.02). Similar analyses identified co-occurrence of variants in TP53 vs. RB1, RYR1 vs. AHNAK, BIRC6 vs. ADGRV1, SYNE1 vs. AHNAK, FGFR3 vs. STAG2, KDM6A vs. STAG2, MUC16 vs. BIRC6, and TTN vs. MUC16 (q < 0.005). In the BLCA cohort, TP53 was mutated in 47% of the samples and FGFR3 in 14%. FGFR3 and TP53 mutations are potential survival prediction biomarkers in MIBC. The known role of p53 as a guardian of genome stability [23] supports the general hypothesis that TP53 mutations lead to increased ITH. Thus, we first aimed to test the hypothesis that TP53 mutations, rather than FGFR3 mutations, were associated with greater ITH in BLCA. Based on the presence of somatic mutations in TP53 and FGFR3, we divided the BLCA cohort into three groups: TP53-mutant, FGFR3-mutant, and neither TP53- nor FGFR3-mutant. Using MATH value as a measure of ITH, we compared heterogeneity among these three groups. Consistent with our hypothesis, TP53 mutations were associated with higher MATH values: MATH values were higher in BLCA cases with TP53 mutations than in the FGFR3-mutant group (Fig. 5a, p < 0.001) or the group with neither mutation (Fig. 5a, p < 0.001). MATH values were lowest in BLCA cases with FGFR3 mutations (Fig. 5a).
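Pairwise mutual exclusivity and co-occurrence can be illustrated with a one-sided Fisher's exact test on a 2×2 table of mutation status, which is our understanding of the approach cBioPortal takes. The sketch below is not the cBioPortal implementation: the function name and the counts are hypothetical, and it returns an unadjusted p-value, whereas the q-values reported above additionally involve a multiple-testing correction.

```python
from math import comb


def fisher_one_sided(both, only_a, only_b, neither, alternative):
    """One-sided Fisher's exact test on mutation co-occurrence counts.

    alternative="less"    -> mutual exclusivity (fewer co-mutated samples
                             than expected under independence)
    alternative="greater" -> co-occurrence (more co-mutated samples)
    """
    n_a = both + only_a          # samples with gene A mutated
    n_b = both + only_b          # samples with gene B mutated
    n = both + only_a + only_b + neither
    k_min = max(0, n_a + n_b - n)
    k_max = min(n_a, n_b)

    def pmf(k):
        # Hypergeometric probability of exactly k co-mutated samples.
        return comb(n_b, k) * comb(n - n_b, n_a - k) / comb(n, n_a)

    if alternative == "less":
        ks = range(k_min, both + 1)
    elif alternative == "greater":
        ks = range(both, k_max + 1)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return sum(pmf(k) for k in ks)


# Hypothetical counts for two genes in a 400-sample cohort.
p_exclusive = fisher_one_sided(both=10, only_a=50, only_b=180, neither=160,
                               alternative="less")
print(p_exclusive)
```

A small p-value under the "less" alternative indicates that the two genes are co-mutated less often than independence would predict, which is the pattern reported for FGFR3 vs. TP53.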
Next, we investigated the prognostic significance of MATH value in the TP53-mutant and FGFR3-mutant groups. For FGFR3-mutant patients, we observed a trend toward longer survival compared with all patients in the BLCA cohort, although the difference was not significant (Fig. 5b, p = 0.149). We then divided the TP53-mutant and FGFR3-mutant groups into low and high MATH value subgroups, using quartile MATH values as cut-offs. We found that low MATH value was a significant predictor of better survival in the FGFR3-mutant group (Fig. 5c, p < 0.05). In contrast, overall survival of FGFR3-mutant patients with high MATH value did not differ significantly from that of all patients in the BLCA cohort (Fig. 5d, p = 0.925), indicating that better survival was specific to patients carrying FGFR3 mutations with low MATH value. We also compared overall survival between TP53-mutant patients and all patients in the BLCA cohort and found no significant difference (Fig. 5e, p = 0.761). Moreover, overall survival of TP53-mutant patients with either low or high MATH value did not differ significantly from that of all patients in the BLCA cohort (Fig. 5f, p = 0.639; Fig. 5g, p = 0.626), indicating that MATH value was not a prognostic biomarker in TP53-mutant patients. Taken together, we found that low MATH value was an independent favorable prognostic biomarker in FGFR3-mutant patients.
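The survival comparisons above yield one p-value per pair of groups. The text here does not name the test, so the following sketch assumes a standard two-group log-rank test; the function name and the follow-up data are hypothetical.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value).

    times_*  -- follow-up times
    events_* -- 1 if death was observed at that time, 0 if censored
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        at_risk = [row for row in data if row[0] >= t]
        n = len(at_risk)
        n_a = sum(1 for row in at_risk if row[2] == 0)
        deaths = sum(1 for row in at_risk if row[0] == t and row[1])
        deaths_a = sum(1 for row in at_risk
                       if row[0] == t and row[1] and row[2] == 0)
        # Expected deaths in group A if both groups shared one hazard.
        observed_minus_expected += deaths_a - deaths * n_a / n
        if n > 1:
            variance += (deaths * (n_a / n) * (1 - n_a / n)
                         * (n - deaths) / (n - 1))

    if variance == 0:
        raise ValueError("not enough events to compare the groups")
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-square tail, 1 d.f.
    return chi2, p_value


# Hypothetical follow-up times (months) and event indicators.
chi2, p = logrank_test([12, 30, 45, 60, 72], [1, 0, 1, 0, 0],
                       [6, 10, 14, 22, 35], [1, 1, 1, 0, 1])
print(chi2, p)
```

Censored patients contribute to the at-risk counts until their last follow-up time but never to the death counts, which is how the test uses incomplete follow-up.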
Validation of low MATH value as a prognostic biomarker in FGFR3-mutant patients in an independent BLCA cohort
To further investigate whether low MATH value in FGFR3-mutant patients is widely applicable for predicting better overall survival in BLCA, we selected another BLCA cohort, composed of 109 bladder cancer patients and published in 2015 [24]. We downloaded all mutation and clinical data from cBioPortal and reassessed the prognostic value of MATH. First, to verify that FGFR3 mutations, rather than TP53 mutations, were associated with lower ITH in BLCA, we divided the 109 samples [24] into three groups: TP53-mutant, FGFR3-mutant, and neither TP53- nor FGFR3-mutant. Using MATH as a measure of ITH, we compared heterogeneity among these three groups. Although the differences were not significant (p > 0.05), probably because of the small sample size of this cohort, the median MATH values showed the same overall trend: MATH values were greater in BLCA patients with TP53 mutations than in those with FGFR3 mutations (Fig. 6a, p = 0.074) and were lowest in BLCA cases with FGFR3 mutations (Fig. 6a), consistent with our findings in the TCGA BLCA cohort (Fig. 5a).
Next, we investigated the prognostic significance of MATH value in the FGFR3-mutant and TP53-mutant groups. In all FGFR3-mutant patients, we observed a trend toward longer survival compared with all patients in the whole cohort (Fig. 6b, p = 0.279). We divided the FGFR3-mutant and TP53-mutant patients into low and high MATH value subgroups. FGFR3-mutant patients with low MATH value tended to have a better prognosis than patients in the whole cohort, although the difference was not statistically significant, probably because of the limited number of patients in this subgroup (Fig. 6c, p = 0.170). We found no significant difference in overall survival in any other subgroup (Fig. 6e-g), indicating that MATH value was not a prognostic biomarker in TP53-mutant patients. Taken together, the trends in this independent bladder cancer cohort were consistent with low MATH value being a favorable prognostic biomarker in FGFR3-mutant patients, although they did not reach statistical significance.