BLCA data acquisition
The flowchart of the study procedure was shown in Fig. 1. Twelve pathways linked to glucose metabolism were chosen with a total of 606 glucose metabolism-related genes included for following study. The RNA-seq transcriptome data of 405 bladder cancer samples and 19 normal bladder samples from TCGA BLCA cohort was collected and corresponding DNA methylation data was obtained from TCGA 450 k microarray.
Consensus clustering of glucose metabolism-related genes identified two main clusters of BLCA samples
With the increase of k value, the 405 bladder cancer samples were gradually and steadily divided into two main large clusters by 30 glucose metabolism-related genes with the largest MAD values, the light green cluster and the light blue cluster (Fig. 2a-j), which suggested that glucose metabolism could play a potential role as an index in classifying bladder cancer patients. Then we tried to categorize the 30 genes into two clusters according to their expression trends, namely glucose cluster 1 and glucose cluster 2. The result showed 405 samples were obviously divided into two clusters, cluster A and cluster B, which were characterized with significant differences on the expressions of cluster 1 genes and cluster 2 genes (Fig. 2k). By measuring the associations between the clustering and clinicopathological features, we detected significant difference between cluster 1 and cluster 2 for neoplasm grade (P = 2.999e-06), while no obvious differences were found for all of the other parameters (Table. 1), which indicated that compared to current indexes, it’s a relatively independent but effective method to categorize subtypes of bladder cancer samples by using glucose metabolism-related genes and it’s worth further exploring.
Table 1
Clinicopathological characteristics of bladder cancer patients by consensus clustering analysis.
Characteristics | Clust A (N = 156) | Clust B(N = 249) | P-value |
Age [mean (1st, 3rd QU)] | 68.66 (61,76) | 67.66 (60,76) | 0.3496 |
Tumor stage (I/II/III/IV) | 0/41/67/48 | 2/89/71/85 | 0.3444 |
Grade (T1/T2/T3/T4) | 1/35/90/19 | 3/84/104/36 | 0.1084 |
Metastasis (Negative/Positive) | 60/5 | 135/6 | 0.3606 |
Lymph node (Positive/Negative) | 46/96 | 82/139 | 0.1604 |
Neoplasm (High/Low) | 156/0 | 225/21 | 2.999e-06 |
P-value < 0.05 as threshold. |
Confirmation Of Glucose Metabolism-related Genes Associated With Survival
In the training set, by comparing patients with survival time less than 1 year and patients who lived longer than 3 years, 70 DEGs were screened out with p < 0.05 as the threshold (see Additional file 1). Then, 101 genes significantly associated to patients’ survival status were obtained from 606 glucose metabolism-related genes by using univariate COX regression analysis in the second step (see Additional file 2). The 46 overlapping genes from the two steps above were survival-related DEGs. In the third step, by analyzing promoter methylation levels of the 46 survival-related DEGs, we detected 18 genes of which the expression levels were significantly correlated to corresponding promotor methylation strengths, regarding the numbers of significant probes/the numbers of non-significant probes > 2/3 as the threshold (see Additional file 3). Thus, via the three steps above, a total of 18 target genes (FBP1, HIP1R, USF1, SLC45A3, SARS, LRRC59, TRAF5, PICK1, UCHL1, NUP188, COPS5, MGST2, SSTR5, ZAP70, SLC44A4, PYCR1, TXNIP and ELF3) associated with both patients’ prognosis and DNA methylation status were selected as predictors for following further study (Fig. 3a). The regression coefficients (β) of the 18 genes were obtained from the univariate Cox proportional hazard regression analysis (Fig. 3b). Genes with regression coefficient values higher than 1 were identified as risky factors while those with coefficient values lower than 1 were protective factors.
Identification of an 18 gene-related risk signature for the prognosis of BLCA
To predict the overall survival of BLCA patients, we built a risk score formula on the basis of the expressions and the regression coefficients of the 18 selected genes as follows: risk score = (0.852801514 * expression level of FBP1) + ( 0.658588685 * expression level of HIP1R) + (0.632868909 * expression level of USF1) +(0.759835042 * expression level of SLC45A3) + (1.746061357 * expression level of SARS) + (1.53204174 * expression level of LRRC59) + (0.636153332 * expression level of TRAF5) + (0.662610091 * expression level of PICK1) +(1.121981252 * expression level of UCHL1) + (1.421043199 * expression level of NUP188) + (0.584186664 * expression level of COPS5) + (0.787686002 * expression level of MGST2) + (0.527554635 * expression of SSTR5) + (0.777212294 * expression level of ZAP70) + (0.918628408 * expression level of SLC44A4) + (1.153860484 * expression level of PYCR1) + (0.880775148 * expression level of TXNIP) + (0.917167392 * expression level of ELF3). We called the signature we established above the18-mRNA risk score.
After calculating the risk score of each patient according to their 18-mRNA expression signature in the training set, we divided patients into a high-risk group (n = 150) and a low-risk group (n = 150) with the median risk score as the cutoff(Fig. 4a). Each patient’s survival time and ending point are shown in (Fig. 4b), which reveals remarkable worse prognosis in the high-risk group than in the low-risk group. The 18 selected genes’ expression levels of each patient are shown in (Fig. 4c) and we found that risky and protective genes showed opposite expression patterns according to the 18-mRNA risk score. Higher levels of 5 genes (UCHL1, PYCR1, SARS, NUP188, LRRC59) were more likely to be expressed in the high-risk group than in the low-risk group while the other genes (FBP1, HIP1R,USF1, SLC45A3, TRAF5, PICK1, COPS5, MGST2, SSTR5, ZAP70, SLC44A4, TXNIP, ELF3) were in the opposite situation, which was similar to our previous result (Fig. 3b).
We then carried out Kaplan-Meier analysis of overall survival for patients in the training set, which suggested great performance of the risk model in dividing patients into high-risk and low-risk (p = 0.0015) (Fig. 4d). In order to confirm the predictive ability of this risk formula, we calculated the 18-mRNA risk score of each patient in the validation set and stratified the patients into the same subgroups by using the median risk score as the cutoff. Similar to the result in the training set, the overall survival rate of the high-risk group was significantly lower than that of the low-risk group in the validation set (p = 0.00024) (Fig. 4e).
ROC analyses for 1-year, 3-year and 5-year survival were respectively carried out in the training set and the validation set, respectively. The results were shown in Fig. 5a-f, which exhibited satisfying survival predictive ability of the 18-mRNA risk score signature. Moreover, the sensitivity and specificity were even higher for the validation set than for the training set, which provides us with more confidence in the prognostic signature we built.
The 18-mRNA risk score signature was an independent prognostic factor for bladder cancer patients
Multivariate Cox regression analysis was performed in the validation set to determine whether the prognostic signature we had previously built could function as an independent prognostic indicator. When we included age, neoplasm grade, pathological T grade, pathological N grade, pathological M grade and diagnostic tumor stage together with the 18-mRNA risk score into the Multivariate Cox regression model, the age (P < 0.0001, HR = 1.033), pathological N stage (P = 0.022, HR = 1.195) and the 18-mRNA risk score signature (P = 0.004, HR = 1.167) showed significance while the other clinicopathological factors showed no statistical significance (Table. 2). Thus, it was indicated that the signature we established might be an independent prognostic factor for BLCA patients.
Table 2
Multivariate Cox regression analysis of the18-mRNA signature and clinical characteristics predictive of overall survival.
Variable | HR | P-value |
Age | 1.033 | < 0.001 |
Neoplasm grade | 0.668 | 0.594 |
Pathological M stage | 1.182 | 0.082 |
Pathological N stage | 1.195 | 0.022 |
Pathological T stage | 1.329 | 0.091 |
Tumor grade | 1.328 | 0.070 |
18-mRNA risk score signature | 1.167 | 0.004 |
HR: hazard ratio; P-value < 0.05 as the criteria. |
GSEA analysis between high-risk patients and low-risk patients
GSEA analysis result showed pathways and functions including wnt/beta-catenin-independent signaling pathway were significantly enriched in the high-risk group (Fig.
6a-c), which might provide us with insights into cancer progression and potential molecular mechanisms.
Weighted Gene Co-expression Network Analysis (wgcna)
With 24286 genes were adopted in WGCNA network construction, including mRNAs, miRNAs and lncRNAs, 34 modules in total were obtained. Among the 34 modules, 9 genes from the 18 glucose metabolism-related genes were identified as hub genes which were linked to 3 modules of co-expressive genes (Fig. 6d).
Functional characteristics of UCHL1 and PYCR1
In order to further validate our bioinformatic analyses, we tended to select two risky genes from the 18 genes to conduct cell functional experiments by using siRNAs to silence their expressions. Via the process of analyzing DNA methylation status of survival-related DEGs, we found the promotor methylation strengths of UCHL1 and PYCR1 were extraordinarily related to their own expressions. At the same time, no studies have so far reported the functions that these two genes play on bladder cancer cells. With great interest, we performed functional validations to determine the impacts that the two genes might have on bladder cancer cell lines, UM-UC-3 and T24. qRT-PCR results confirmed the knockdowns of UCHL1 and PYCR1 by siRNAs at mRNA level, respectively (Fig. 7A-B). Then, the proliferation ability, migration and invasiveness of BLCA cells were proved to be obviously inhibited after downregulating the expression of UCHL1 or of PYCR1 (Fig. 7C-D). Via performing western blotting, we noticed that silencing UCHL1 or PYVR1 resulted in downregulation of the expressions N-cadherin and CyclinD1 at protein level, which suggested that the two genes might promote epithelial-mesenchymal transition (EMT)of bladder cancer cells and accelerate proliferation by regulating cell cycle (Fig. 7E). Meanwhile, no significant changes were detected in beta-catenin protein level, which was accordant to the GSEA result (Fig. 7E). In general, the results above provided us with not only experimental basis to partially validate our bioinformatic analyses but also with further insights to molecular functions and mechanisms for further research.