High Expression of CTSV in Bladder Cancer as a Predictor of Poor Prognosis: A Study Based on the TCGA and GEO Databases

Background: Bladder cancer (BLCA) is the most common malignancy of the urinary system and has a high recurrence rate. We aimed to explore the relationship between cathepsin V (CTSV) expression and prognosis in patients with bladder cancer. Methods: RNA-Seq gene expression data and the corresponding clinical information for BLCA were downloaded from the TCGA database. The gene expression profiles of GSE13507 and GSE133624 were downloaded from the GEO database. BLCA patients were divided into high and low expression groups according to the cutoff value of CTSV expression. The relationship between clinicopathologic characteristics and CTSV expression was analyzed with the Wilcoxon signed-rank test and logistic regression. Kaplan-Meier analysis and Cox regression were used to analyze the relationship between overall survival and clinicopathologic characteristics. Gene set enrichment analysis (GSEA) was used to identify enriched KEGG pathways. Results: High expression of CTSV was significantly correlated with pathological grade (OR = 1.662 for low vs. high), clinical stage (OR = 1.589 for I-II vs. III-IV), status (OR = 1.435 for normal vs. tumor), T stage (OR = 1.589 for T1-2 vs. T3-4), and M stage (OR = 4.499 for M0 vs. M1). The expression of CTSV was significantly increased in BLCA compared with normal tissue (P < 0.001). Kaplan-Meier survival analysis showed that BLCA patients with high CTSV expression had a poorer prognosis than those with low CTSV expression (P = 0.0016). Univariate Cox analysis showed that high expression of CTSV was significantly associated with poorer overall survival (HR: 1.662, 95% CI: 1.209-2.286, P = 0.002). Multivariate Cox regression showed that high expression of CTSV was an independent risk factor for poor prognosis in BLCA patients (HR: 1.495, 95% CI: 1.069-2.089, P = 0.019). We also used the


Background
Bladder cancer (BLCA) is the most common malignancy of the urinary system, with nearly 550,000 new cases and 200,000 deaths worldwide in 2018 [1]. According to the depth of tumor invasion and histopathological subtype, BLCA is divided into non-muscle invasive bladder cancer (NMIBC) and muscle invasive bladder cancer (MIBC). Although patients with NMIBC (pTa-pT1) are treated with transurethral resection of bladder tumor (TURBT), the recurrence rate within five years can be up to 75% [2]. MIBC has a high metastasis rate, with a 5-year survival rate of approximately 60% [3].
BLCA is not easily cured because of its high rates of recurrence and metastasis, which result in a five-year survival rate of approximately 57% [4,5]. Therefore, it is particularly important to discover new diagnostic methods and treatment strategies for BLCA.
The most common pathologic type of BLCA is transitional cell carcinoma. Mutations are found in oncogenes and tumor suppressor genes such as FGFR3 and TP53. NMIBC and MIBC differ genetically. NMIBC is characterized by a high frequency of mutations in the FGFR3 oncogene, leading to constitutive activation of the RAS/MAPK pathway. In contrast, MIBC is highly enriched in inactivating mutations of TP53 [6][7][8]. In addition, other mutated genes have been identified in BLCA, including HRAS, RB1, TSC1 and PIK3CA [9].
Cathepsins are a class of lysosomal peptidases. The cathepsin family includes cathepsin A, B, C, D, E, F, G, H, L, K, O, S, V and W, which play a key role in normal tissue homeostasis [10]. Studies have shown that cathepsins are highly expressed in various cancers and are closely related to tumor progression and metastasis [11]. Cathepsin V, also known as cathepsin L2 (CTSV/CTSL2), is a cysteine cathepsin that is specifically expressed in the thymus, cornea, and testis [12,13]. CTSV is over-expressed in tumors such as breast cancer, colorectal cancer, renal cancer and ovarian cancer [14]. High expression of CTSV has also been shown to be associated with adverse prognosis in hepatocellular carcinoma [15]. However, there are no studies on its prognostic significance in BLCA.
Thus, the objective of the current study was to evaluate the prognostic value of CTSV expression in BLCA based on the TCGA database. To identify the biological pathways related to the CTSV regulatory network in BLCA, we used GSEA. We found that high expression of CTSV in BLCA is related to poor prognosis, which suggests CTSV as a potential diagnostic marker and therapeutic target for BLCA.

Data acquisition and preprocessing
RNA-Seq gene expression data and the corresponding clinical information for BLCA (including age, gender, race, histologic grade, clinical stage, T stage, N stage, M stage, overall survival (OS) time and survival outcome) were downloaded from The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). After removing multiple tissue samples corresponding to the same patient, we obtained 408 tumor samples and 19 normal samples. A total of 403 samples with complete survival information were used for survival analysis. Samples with complete clinicopathologic variables were used for logistic regression and for univariate and multivariate Cox analysis. The GSE13507 and GSE133624 gene expression profile matrix files were downloaded from the Gene Expression Omnibus (GEO; https://www.ncbi.nlm.nih.gov/geo/) and were used to validate the expression level of CTSV mRNA in BLCA patients. GSE13507 contains 188 bladder cancer tissues and 68 normal bladder tissues (Platform: GPL6102 Illumina human-6 v2.0 expression beadchip). GSE133624 contains 36 bladder cancer tissues and 29 normal bladder tissues (Platform: GPL20795 HiSeq X Ten (Homo sapiens)). The gene expression data and clinical data were retained and further analyzed.
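The step of keeping one sample per patient can be illustrated with a minimal Python sketch. It relies on the TCGA barcode layout, in which the first 12 characters identify the patient and characters 14-15 give the sample type code (01-09 tumor, 10-19 normal). The function names and the rule of keeping the alphabetically first barcode are assumptions made for illustration; the text does not state which duplicate was retained, and the original analysis was performed in R.

```python
def split_barcode(barcode):
    """Return (patient_id, sample_type_code) from a TCGA sample barcode."""
    patient_id = barcode[:12]      # e.g. "TCGA-XX-0001"
    sample_type = barcode[13:15]   # "01".."09" tumor, "10".."19" normal
    return patient_id, sample_type


def one_sample_per_patient(barcodes):
    """Keep a single barcode per (patient, tissue type).

    Assumption: when a patient has several samples of the same tissue
    type, the alphabetically first barcode is kept.
    """
    kept = {}
    for barcode in sorted(barcodes):
        patient, code = split_barcode(barcode)
        tissue = "tumor" if int(code) < 10 else "normal"
        kept.setdefault((patient, tissue), barcode)
    return kept


# Hypothetical barcodes, not taken from the study data.
example = [
    "TCGA-AA-0001-01A-11R-0000-07",
    "TCGA-AA-0001-01B-11R-0000-07",
    "TCGA-AA-0001-11A-11R-0000-07",
    "TCGA-BB-0002-01A-11R-0000-07",
]
kept = one_sample_per_patient(example)
n_tumor = sum(1 for (_, tissue) in kept if tissue == "tumor")
n_normal = sum(1 for (_, tissue) in kept if tissue == "normal")
```

In this toy input the second tumor aliquot of patient TCGA-AA-0001 is dropped, leaving two tumor samples and one normal sample.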

Gene set enrichment analysis
Gene set enrichment analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states [16]. The Molecular Signatures Database (MSigDB) is a collection of annotated gene sets for use with GSEA software. In this study, the samples were divided into a high expression group and a low expression group according to the cutoff value of CTSV expression, and GSEA was used to identify enriched KEGG pathways. The low and high expression groups were used as the phenotype labels. The number of permutations was set to 1000 and the minimum gene set size to 100. We used the normalized enrichment score (NES) and false discovery rate (FDR) to sort the pathways enriched in each phenotype. A nominal P value (NOM p-val) < 0.05 and an FDR q value (FDR q-val) < 0.05 were considered statistically significant.
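The core of GSEA is a running-sum statistic over a ranked gene list. The sketch below shows only that enrichment score in plain Python; it is not the GSEA software used in this study, and it omits the permutation step and the normalization that produce the NES and FDR. The function name, the weighting exponent `p` and the toy gene names are illustrative assumptions.

```python
def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set.

    ranked_genes: genes ordered by a ranking metric (high group first).
    scores: the ranking metric for each gene, in the same order.
    gene_set: set of genes belonging to the pathway.
    p: weighting exponent applied to |score| for genes in the set.
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(in_set)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    hit_total = sum(abs(s) ** p for s, hit in zip(scores, in_set) if hit)

    running, best = 0.0, 0.0
    for s, hit in zip(scores, in_set):
        if hit:
            running += abs(s) ** p / hit_total   # step up at a pathway gene
        else:
            running -= 1.0 / n_miss              # step down otherwise
        if abs(running) > abs(best):
            best = running                       # maximum deviation from zero
    return best


# Toy ranked list: genes "a" and "b" are most associated with the high group.
genes = ["a", "b", "c", "d"]
metric = [2.0, 1.0, -1.0, -2.0]
es_top = enrichment_score(genes, metric, {"a", "b"})
es_bottom = enrichment_score(genes, metric, {"c", "d"})
```

A set concentrated at the top of the ranking yields a positive score and a set concentrated at the bottom yields a negative one. In the full method, significance is assessed by recomputing the score after permuting the phenotype labels.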

Statistical Analyses
All statistical analyses were conducted with R software (version 4.0.0, https://www.r-project.org/). Boxplots were used to visualize CTSV expression across samples for discrete variables. The Wilcoxon signed-rank test and logistic regression were used to analyze the relationship between clinicopathologic characteristics and CTSV expression. The Kaplan-Meier method was used for survival analysis, and the cutoff value was determined with the survminer package based on the CTSV expression value, the survival time, and the survival status. Univariate Cox analysis was then used to analyze the relationship between OS and clinicopathologic characteristics. Multivariate Cox analysis was used to assess the influence of CTSV expression on survival together with other clinical characteristics (age, clinical stage, T stage, N stage, and M stage). P < 0.05 was considered statistically significant.
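To make the survival workflow concrete, the following Python sketch implements a Kaplan-Meier estimator, a two-group log-rank statistic, and a simple cutoff search that picks the expression value giving the largest log-rank statistic. This is a simplified stand-in written for illustration, not the survminer procedure used in the study; the function names, the `min_frac` parameter and the toy data are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time (events: 1 = death)."""
    data = sorted(zip(times, events))
    at_risk, surv, curve, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def logrank_chi2(times, events, in_group):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    diff, var = 0.0, 0.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for ti in times if ti >= t)
        n1 = sum(1 for ti, g in zip(times, in_group) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d1 = sum(1 for ti, e, g in zip(times, events, in_group)
                 if ti == t and e and g)
        diff += d1 - d * n1 / n                      # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    return diff ** 2 / var if var > 0 else 0.0


def best_cutpoint(expr, times, events, min_frac=0.1):
    """Scan candidate cutoffs; keep the one with the largest log-rank statistic.

    min_frac (hypothetical parameter) is the minimum share of samples
    required in each group.
    """
    n, best = len(expr), (None, -1.0)
    for c in sorted(set(expr)):
        high = [x > c for x in expr]
        k = sum(high)
        if k < min_frac * n or n - k < min_frac * n:
            continue
        stat = logrank_chi2(times, events, high)
        if stat > best[1]:
            best = (c, stat)
    return best


# Toy data, not from the study: higher expression, shorter survival.
expr = [1, 2, 3, 4, 5, 6, 7, 8]
times = [10, 9, 8, 7, 2, 3, 1, 4]
events = [1, 1, 1, 1, 1, 1, 1, 1]
cut, stat = best_cutpoint(expr, times, events, min_frac=0.25)
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

Because the cutoff is chosen to maximize the group difference, the resulting log-rank P value is optimistic unless it is corrected for the search; this is worth bearing in mind when reading P values obtained with a data-driven cutoff.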

Association between CTSV expression and clinicopathologic variables
A total of 427 BLCA samples with CTSV expression data were downloaded from the TCGA database, including 408 tumor samples with complete clinical information and 19 normal samples. As shown in Fig. 1, the expression of CTSV was significantly higher in tumor tissues than in normal tissues (P < 0.001). High expression of CTSV was associated with gender (P = 0.046), pathological grade (P < 0.001), race (P = 0.0039), and T stage (P = 0.043). Univariate logistic regression with CTSV expression as the dependent variable (dichotomized at the cutoff value of 211) showed that it was correlated with clinicopathological characteristics indicative of poor prognosis (Table 2). High expression of CTSV was significantly correlated with race (OR = 2.519 for white vs. others), pathological grade (OR = 1.662 for low vs. high), clinical stage (OR = 1.589 for I-II vs. III-IV), status (OR = 1.435 for normal vs. tumor), T stage (OR = 1.589 for T1-2 vs. T3-4), and M stage (OR = 4.499 for M0 vs. M1) (all P < 0.05). These results suggest that high expression of CTSV in BLCA is associated with higher pathological grade, clinical stage, T stage, and M stage.
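For a single binary predictor, the odds ratio from univariate logistic regression equals the cross-product ratio of the corresponding 2x2 table, and its Wald confidence interval follows from the standard error of the log odds ratio. The Python sketch below shows that arithmetic. The function name and the counts are illustrative assumptions and are not taken from Table 2.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: index category, high CTSV     b: index category, low CTSV
    c: reference category, high CTSV d: reference category, low CTSV
    z: normal quantile for the interval (1.96 gives about 95%).
    """
    estimate = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts for illustration only.
estimate, lower, upper = odds_ratio(30, 20, 20, 30)
```

An interval that excludes 1 corresponds to a Wald P value below 0.05 for that predictor.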

Validation of CTSV gene expression in BLCA
The GSE13507 and GSE133624 datasets were used to verify CTSV gene expression in BLCA tissues and normal tissues. As shown in Fig. 3, the expression of CTSV was significantly higher in BLCA tissues than in normal tissues, which confirmed the results of the TCGA analysis above. P < 0.05 was considered statistically significant.
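A tumor versus normal comparison of this kind can be sketched as a rank-based two-group test. The Methods name the Wilcoxon signed-rank test; the sketch below instead assumes the tumor and normal samples are unpaired and uses the rank-sum (Mann-Whitney) form with a normal approximation, tie correction and no continuity correction, so its P values will not match R output exactly. The function name and the toy values are illustrative.

```python
import math


def rank_sum_test(x, y):
    """Wilcoxon rank-sum (Mann-Whitney U) test, two-sided normal approximation.

    Returns (U statistic for x, z score, two-sided P value).
    """
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    tie_term, i = 0.0, 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2      # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = average_rank
        tie_term += (j - i) ** 3 - (j - i)  # tie correction term
        i = j

    n1, n2 = len(x), len(y)
    n = n1 + n2
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u = rank_sum_x - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - mean_u) / math.sqrt(var_u)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, z, p_value


# Toy expression values, not from GSE13507 or GSE133624.
tumor = [5.0, 6.0, 7.0, 8.0]
normal = [1.0, 2.0, 3.0, 4.0]
u, z, p_value = rank_sum_test(tumor, normal)
u_rev, z_rev, p_rev = rank_sum_test(normal, tumor)
```

If the GEO normal samples were matched to tumor samples from the same patients, a paired test would be the appropriate choice instead.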

Discussion
It has been reported that CTSV is highly expressed in some tumors, including colorectal cancer, breast cancer, oral cancer and endometrial cancer [14,17,18], and is associated with poor prognosis in liver cancer and breast cancer [15,19]. High expression of CTSV was significantly correlated with tumor number, pathological grade, vascular invasion, T stage and clinical stage [15]. The purpose of our study was to use bioinformatics to explore the expression of CTSV in BLCA, its potential effect on prognosis, and whether it can act as a prognostic biomarker in BLCA patients.
In this study, RNA sequencing data and the relevant clinical information for BLCA were downloaded from the TCGA database for bioinformatics analysis. The results showed that high expression of CTSV was correlated with clinicopathological features (histologic grade, clinical stage, T stage, etc.), OS, and poor prognosis. To further identify the function of CTSV in BLCA, we performed gene set enrichment analysis between the high and low CTSV expression data sets. GSEA showed that cell cycle, purine metabolism, chemokine signaling pathway, regulation of actin cytoskeleton, pathways in cancer, tight junction, focal adhesion, ubiquitin mediated proteolysis, Wnt signaling pathway, endocytosis, MAPK signaling pathway, T cell receptor signaling pathway, axon guidance, Toll-like receptor signaling pathway, spliceosome, cell adhesion molecules (CAMs), natural killer cell mediated cytotoxicity, JAK-STAT signaling pathway, insulin signaling pathway, leukocyte transendothelial migration, lysosome and cytokine-cytokine receptor interaction were enriched in the high CTSV expression phenotype. These results suggest that CTSV may be a potential prognostic biomarker and therapeutic target in BLCA patients.
As a member of the cysteine cathepsin family, CTSV plays an important role in cancer progression, proliferation, invasion and metastasis [20]. Immunohistochemistry and quantitative real-time polymerase chain reaction have shown that high expression of CTSV in hepatocellular carcinoma is associated with poor prognosis [15]. Our study likewise found that high expression of CTSV in BLCA was associated with advanced clinicopathologic features and predicted poor prognosis.
Cysteine cathepsins are key acid hydrolases in lysosomes that can degrade the extracellular matrix and promote tumor progression [11]. In our GSEA results, we also found many biological pathways associated with the extracellular matrix, including regulation of actin cytoskeleton, tight junction, focal adhesion, and cell adhesion molecules (CAMs).
In this study, GSEA identified 22 biological pathways enriched in the high expression phenotype. It has been reported that the chemokine signaling pathway plays an important role in the occurrence and development of BLCA and may be a therapeutic target [21]. Sun et al. [22] demonstrated that activation of the MAPK signaling pathway promotes proliferation and migration of BLCA cells. Similarly, the Wnt signaling pathway is activated in BLCA and regulates tumor growth and proliferation [23,24]. The suppressor of cytokine signaling 3 (SOCS3) gene has been found to regulate the initiation and progression of BLCA through the JAK-STAT signaling pathway [25]. Current evidence suggests that HGF/SF and IL-8 may play an important role in the development of BLCA from superficial to invasive disease by influencing the structure and function of tight junctions [26]. Cell adhesion molecules (CAMs) play a role in cancer progression and metastasis, and the serum level of CAMs is correlated with the extent of distant metastasis in BLCA [27]. Taken together, these reports suggest that CTSV may play an important role in the progression of BLCA through these pathways.
This study has some limitations. The sample size is not large enough and some clinical data are missing; for example, nearly half of the patients lack definite information on distant metastasis, which may cause some bias. In addition, the expression and molecular mechanism of CTSV in BLCA need to be further verified in vitro. Therefore, more experimental and clinical studies are needed.

Conclusions
In summary, high expression of CTSV in BLCA is associated with poor prognosis and may serve as a new biomarker. In addition, the chemokine signaling pathway, MAPK signaling pathway, Wnt signaling pathway, JAK-STAT signaling pathway, tight junction and cell adhesion molecules may be key pathways regulated by CTSV in BLCA.

Figures
Figure 1

Table 1
Bladder cancer patient characteristics in the TCGA cohort

Declarations
and revised the manuscript. All authors read and approved the final manuscript.
Funding
This study was supported by the Science and Technology project of Chengguan District, Lanzhou city, Gansu Province Science and Technology Bureau (Project number: 2017KJGG0052), the Cuiying Graduate Supervisor Applicant Training Program of Lanzhou University Second Hospital (201704), the Gansu Health Industry Research Project (GSWSKY2017-10), the Cuiying Scientific and Technological Innovation Program of Lanzhou University Second Hospital (Project number: CY2017-BJ16, Doctoral supervisor training program), Lanzhou City Talent Innovation and Entrepreneurship Project (Project number: 2019-RC-37) and the National Nature Science Foundation of China (NO: