Development and verication of immune-related long non-coding RNA prognostic signature in bladder cancer

Background: Bladder cancer is the second most common malignant tumor in urogenital system. The research aimed to investigate the prognostic role of immune-related long non-coding RNA (lncRNA) in bladder cancer. Methods: We extracted 411 bladder cancer samples from The Cancer Genome Atlas database. Single-sample gene set enrichment analysis was employed to assess the immune cell inltration of these samples. We recognized differentially expressed lncRNAs between tumors and paracancerous tissues, and differentially expressed lncRNAs between the high and low immune cell inltration groups. Venn diagram analysis detected differentially expressed lncRNAs that intersected the above groups. LncRNAs with prognostic signicance were identied by regression analysis and survival analysis. Multivariate Cox analysis was used to establish the risk score model. The nomogram was established and evaluated by receiver operating characteristic (ROC) curve analysis, concordance index (C-index) analysis, calibration chart, and decision curve analysis (DCA). Additionally, we performed gene set enrichment analysis to explore the potential functions of the screened lncRNAs in tumor pathogenesis. Results: Three hundred and twenty differentially expressed lncRNAs were recognized. We randomly divided patients into the training data set and the testing data set at a 2: 1 ratio. In the training data set, 9 immune-related lncRNAs with prognostic signicance were identied. The risk score model was constructed to classify patients as high- and low-risk cohorts. Patients in the low-risk cohort had better survival outcomes than those in the high-risk cohort. The nomogram was established based on the indicators including age, gender, TNM stage, and risk score. The model’s predictive performance was conrmed by ROC curve analysis, C-index analysis, calibration chart, and DCA. The testing data set also achieved similar results. Bioinformatics analysis suggested that the 9-lncRNA signature was involved in modulation of various immune responses, antigen processing and presentation, and T cell receptor signaling pathway. Conclusions: The immune-related lncRNAs cancer biology.


Background
Bladder cancer is the second most common urological malignancy of the world with approximately 549,000 new cases and 200,000 deaths in 2018 [1]. It was reported that the prognosis of bladder cancer patients was strictly related to the immune microenvironment of tumor tissues [2]. Accumulated evidence has veri ed the therapeutic effect of immune checkpoint inhibitors in bladder cancer, including atezolizumab, avelumab, durvalumab, nivolumab, and pembrolizumab [3]. A recent research demonstrated that pembrolizumab could prolong the progression-free survival of patients with high RNA-based immune signature scores [4]. This suggested that we might identify immune-related prognostic indicators to improve the prognosis of breast cancer patients and guide the treatment.
Long non-coding RNAs (lncRNAs) are a group of RNAs that participate in the human physiological and pathological process by interacting with speci c RNAs and proteins. In recent years, lncRNAs were discovered to participate in tumor growth and progression [5]. In bladder cancer, lncRNA plays a vital role in lymphatic metastasis, epithelial-mesenchymal transformation, proliferation, migration, and apoptosis of tumor cells [6,7]. LncRNA SOX2OT could maintain the stemness phenotype of bladder cancer stem cells and served as an adverse indicator of patients' clinical outcomes and prognosis [8]. Furthermore, exosomal lncRNA LNMAT2 in bladder cancer could stimulate the formation and migration of lymphatic endothelial cells tube and intensify cancer lymphangiogenesis and lymphatic metastasis [9]. Therefore, lncRNA, as a novel biological marker, offers broad prospects for early diagnosis and prognosis prediction of bladder cancer.
Studies demonstrated that immune-related lncRNAs have a unique value in the prognosis of several cancers. The heterogeneous expression of lncRNAs was identi ed among different immune-in ltrating groups in muscle-invasive bladder cancer [10]. The potential value of immune-related lncRNAs as prognostic indicators has been validated in several cancers. Shen et al. recognized 11 immune-related lncRNAs as prognostic markers for breast cancer, and the 11-lncRNAs signature was related to the in ltration of immune cell subtypes [11]. Li et al. screened 7 immune-related lncRNAs in low-grade glioma and con rmed that these lncRNAs have prognostic value in patients [12]. However, immune-related lncRNAs in bladder cancer has not been revealed before.
In the study, we analyzed the lncRNAs data set and corresponding clinical information from the Cancer Genome Atlas (TCGA) and screened for immune-related lncRNAs by single-sample gene set enrichment analysis (ssGSEA). We established a prognostic model based on these lncRNAs and explored their potential biological functions in bladder cancer.

Materials And Methods
Bladder cancer sample sources and grouping Gene expression data (RNA-Seq), lncRNA sequencing data and corresponding clinical data of bladder cancer were downloaded from the TCGA database (https://portal.gdc.cancer.gov). According to the published research [13], 29 immune cell data sets were applied to evaluate the in ltration level of immune cells through the ssGSEA method. After that, patients were classi ed as the high and low immune cell in ltration groups using hclust package. The stromal score, immune score, and tumor purity score were calculated by the ESTIMATE algorithm to verify the effectiveness of ssGSEA groupings [14]. In addition, we assessed the difference between the two groups by analyzing the expression of the human leukocyte antigen (HLA) gene. CIBERSORT algorithm was employed to determine the in ltration of various immune cells in the tumor sample and verify the potency of the immune groupings again [15]. Screening Of Immune-related Lncrna In Bladder Cancer We set | log 2 FC | > 0.5 and p < 0.05 as the standard to recognize the differentially expressed lncRNAs between the high and low immune cell in ltration groups by edgeR package. Differentially expressed lncRNAs between bladder cancer and paracancerous tissue were also identi ed by the same method. Venn diagram analysis was used to screen out immune-related lncRNAs in bladder cancer from the above two sets.

Construction Of Risk Score Model Based On Immune-related Lncrnas
In the training data set, univariate Cox regression was performed on immune-related lncRNAs to identify 38 prognosis-associated lncRNAs (Fig. 4a). LASSO regression analysis further screened 9 crucial lncRNAs (Fig. 4b, c). Survival analyses of immune-related lncRNAs revealed that 9 lncRNAs were signi cantly related to OS, including AC126773.  (Table 1) to establish the risk score model: Risk score = . We set the median risk score as the cutoff and divided 411 patients into high-risk and low-risk groups. The Kaplan-Meier curve disclosed that the OS in the low-risk group was signi cantly better than that in the high-risk group (p = 7.542e-05) (Fig. 6a). The risk curve and scatter plot indicated that the risk coe cient and mortality of patients in the high-risk group were higher than those in the low-risk group (Fig. 6b, c). The heat map exhibited the expression pro les of the 9-lncRNAs signature in the high-risk and low-risk groups (Fig. 6d). The correlation status of B cells, CD4 + T cells, CD8 + T cells, dendrictic cells, macrophages, and neutrophils with risk score were ploted in Fig. 7 (Fig. 7a-f for the training data set and Fig. 7g-i for the testing data set). Similar results were obtained using the same method on the testing data set ( Fig. 6e-h). Table 1 The prognostic signi cance of the 9-lncRNAs signature We evaluated the prognostic signi cance of risk score and clinical variables such as age, gender, and TNM stage by univariate and multivariate Cox regression analyses. The nomogram was established according to the results of multivariate Cox regression to predict each patient's 3-and 5-year OS. We conducted the ROC curve analysis, concordance index (C-index) method, calibration curve method, and decision curve analysis (DCA) to assess the model's accuracy. Finally, the testing set data was used to evaluate the above results.

Gene Set Enrichment Analysis
We performed GO enrichment analysis and KEGG pathway analysis on the differentially expressed genes between the high-risk and low-risk groups. GO enrichment analysis indicated that the genes were enriched in ephrin receptor signaling pathway, epidermal growth factor receptor (EGFR) signaling pathway, ERBB signaling pathway, mRNA splice site selection, DNA ADP ribosyltransferase activity, and T cell selection ( Fig. 10a). KEGG pathway analysis showed that these genes were involved in amino sugar and nucleotide sugar metabolism, antigen processing and presentation, extracellular matrix (ECM) receptor interaction, focal adhesion, primary immunode ciency, and T cell receptor signaling pathway (Fig. 10b). These ndings may help researchers further explore the mechanism of immune-related lncRNA affecting the pathogenesis of bladder cancer.

Construction and veri cation of bladder cancer groupings
The owchart of our research was plotted in Fig. 1. Information of 411 bladder cancer tissues and 19 paracancerous tissues were obtained from the TCGA database. The transcriptome data of bladder cancer samples were analyzed with ssGSEA method to assess the immune cell in ltration level. Unsupervised hierarchical clustering algorithm was employed to divide patients into the high immune cell in ltration group (n = 85) and the low immune cell in ltration group (n = 326) (Fig. 2a). ESTIMATE algorithm was used to calculate the ESTIMATE score, immune score, stromal score and tumor purity of all samples. Compared to the low immune cell in ltration group, the high immune cell in ltration group presented higher ESTIMATE score, higher immune score, higher stromal score, and lower tumor purity (p < 0.001) ( Fig. 2b-e). The expression of HLA family genes in the high immune cell in ltration group was higher than that in the low immune cell in ltration group (p < 0.001) (Fig. 2f). In addition, the CIBERSORT method revealed that the high immune cell in ltration group had a higher density of immune cells (Fig. 2g). Overall, our results indicated that the bladder cancer grouping was feasible. Identi cation Of Immune-related Lncrnas We recognized 2067 differentially expressed lncRNAs between tumors and paracancerous tissues (Fig. 3a) and 1076 differentially expressed lncRNAs between the high and low immune cell in ltration groups (Fig. 3b). The Venn diagram analysis detected 320 differentially expressed lncRNAs that intersected the above groups (Fig. 3c). Taking together, we screened 320 immune-related lncRNAs in bladder cancer.

Establishment And Evaluation Of Prognostic Model
Univariate Cox regression showed that the risk score and clinical indicators including age, gender, and TNM stage were rmly related to OS (Fig. 8a). We further conducted the multivariate Cox analysis and found that the 9-lncRNAs signature was an independent prognostic factor for bladder cancer (p < 0.001) (Fig. 8b). ROC curve analysis validated the predictive performance of the signature (Fig. 8c). We then established a nomogram including age, gender, TNM stage, and risk score (Fig. 9a). The area under the curves (AUCs) for 3-year, 5-year OS predicted by the model were 0.784 and 0.790, respectively (Fig. 9b). The C-index of the nomogram was 0.751. The calibration curves and DCAs of the prognostic model showed that the model had an excellent predictive effect (Fig. 9c-f). We acquired similar results using the same method on the testing data set (Fig. 8d-f, 9 g-l).

Discussion
Bladder cancer has the characteristics of high recurrence rate and poor prognosis [1]. Accurately predicting the prognosis of bladder cancer patients is of great importance in guiding their treatment. Transcriptome sequencing results have been widely used in the prognostic model establishment for tumor patients. Prognostic models based on immune-related genes have been developed and proved to have excellent predictive e cacy in bladder cancer patients [16,17]. Using lncRNA to construct a prognostic model may be an important supplement to the prognosis prediction of bladder cancer.
Targeted therapy for immune checkpoints could inhibit the immune escape of tumor cells and eliminate them by activating the immune system [18]. Many immune checkpoint blockers have achieved gratifying therapeutic effects in bladder cancer patients. Accumulated studies have shown that lncRNA played an essential role in the tumor immune microenvironment. LncRNA NKILA could activate T cell-induced cell death to promote tumor immune escape [19]. LncRNA SNHG1 could regulate the differentiation of Treg cells by targeting miR-448 and thus affected the immune escape of breast cancer [20]. It was reported that pre-existing immune cell in ltration in tumor tissue was an crucial factor determining the treatment response of immune checkpoint inhibitors [21]. The above researches indicated that these lncRNAs might have prognostic value in cancer patients. Zhou et al. identi ed 6 immune-related lncRNAs in glioblastoma and con rmed that these lncRNAs had prognostic value in glioblastoma patients [22]. However, the prognostic value of immune-related lncRNAs has not been studied before.
In the research, we analyzed the lncRNAs data set from the TCGA database and screened 320 immunerelated lncRNAs. Nine immune-related lncRNAs with prognostic signi cance were identi ed ultimately. Multivariate Cox analysis was used to construct the risk score model. We found that patients in the lowrisk group had longer OS than those in the high-risk group. Subsequently, we established a nomogram including age, gender, TNM stage, and risk score. The ROC curve analysis, C-index, calibration curves, and DCA con rmed the model's predictive power. Compared to models based on other sequencing data, the prognostic model constructed by immune-related lncRNAs presented better e cacy according to ROC curve method [23][24][25][26].
Subsequently, we performed GO enrichment analysis and KEGG pathway analysis to explore the potential functions of the 9-lncRNAs signature in bladder cancer. The results showed that these lncRNAs were involved in various immune responses, antigen processing and presentation, T cell receptor signaling pathway, epidermal growth factor receptor signaling pathway, ERBB signaling pathway, ECM receptor interaction, focal adhesion, and primary immunode ciency. Epidermal growth factor was reported to activate the androgen receptor and increase the expression of TRIP13 to promote bladder cancer progression [27,28]. Notably, ECM modi cation could not only promote tumor cells to escape, but also help generate and maintain the cancer stem cell niche [29]. Moreover, high in ltration of memory activated CD4 + T cell subsets were associated with prolonged OS and reduced risk of tumor recurrence in bladder cancer [30]. Chobrutskiy et al. demonstrated that lower CDR3 region isoelectric point in T cell receptor was associated with better survival outcomes in bladder cancer [31]. In brief, our data suggested that the 9-lncRNAs signature modulated the development and progression of bladder cancer in a variety of ways.
There are some limitations in our research. It is a retrospective study whose data was obtained from the TCGA database that lacked information including treatment and recurrence records. Further in vivo or in vitro experiments and prospective clinical researches are needed to validate our conclusions.

Conclusion
In summary, the present study identi ed 9 immune-related lncRNA and the 9-lncRNAs signature possessed prognostic value for bladder cancer patients. Bioinformatics analysis suggested that immunerelated lncRNAs may regulate tumor pathogenesis through modulation of various immune responses, antigen processing and presentation, and T cell receptor signaling pathway. Our research proposed a potential model and biomarkers for the immune-related work and personalized treatment in bladder cancer patients.

Data availability
All data used in this study were acquired from TCGA database.
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Con icts of interest
The authors declare no con icts of interest.   Establishment and veri cation of bladder cancer grouping. (a) The heatmap showed the unsupervised clustering of 29 immune cells in the high immune cell in ltration group (Immunity_H) and the low immune cell in ltration group (Immunity_L). Parameters including the tumor purity, ESTIMATE score, immune score, and stromal score were also displayed. (b-e) The box plots revealed the statistical differences in tumor purity, ESTIMATE score, immune score, and stromal score between the high and the low immune cell in ltration groups. (f) The expression of HLA family genes in the high immune cell in ltration group was higher than that in the low immune cell in ltration group. (g) the CIBERSORT method demonstrated that a higher density of immune cells was found in the high immune cell in ltration group compared to the low immune cell in ltration group. *p < 0.05; **p < 0.01; ***p < 0.001.    Construction of risk-score model based on immune-related lncRNAs. (a, e) Kaplan-Meier analysis evinced that patients in the high-risk group suffered worse OS compared to the low-risk group in training and testing data sets, respectively. (b, f) The overviews of survival time for each patient in training and testing data sets, respectively. (c, g) The distributions of a risk score for each patient in training and testing data sets, respectively. (d, h) The heatmaps of expression pro les for 9-lncRNAs signature between the low-risk and high-risk groups in training and testing data sets, respectively. Warm colors represented high expression, and cold colors represented low expression. The prognostic value of risk score and clinical variables. (a,d) Univariate Cox analysis showed that risk score and clinical variables including age, gender, and TNM stage were signi cantly related to OS in training and testing data sets, respectively. (b,e) Multivariate Cox analysis manifested that the 9-lncRNAs signature was an independent prognostic indicator for bladder cancer in training and testing data sets, respectively. (c,f) ROC curve analysis of the 9-lncRNAs signature demonstrated that AUCs in training data set and in testing data set were 0.727 and 0.752, respectively. The prognostic value of risk score and clinical variables. (a,d) Univariate Cox analysis showed that risk score and clinical variables including age, gender, and TNM stage were signi cantly related to OS in training and testing data sets, respectively. (b,e) Multivariate Cox analysis manifested that the 9-lncRNAs signature was an independent prognostic indicator for bladder cancer in training and testing data sets, Page 18/18 respectively. (c,f) ROC curve analysis of the 9-lncRNAs signature demonstrated that AUCs in training data set and in testing data set were 0.727 and 0.752, respectively.

Figure 10
Gene set enrichment analysis on the differentially expressed genes between the high-risk and low-risk groups. (a) GO enrichment analysis indicated that the genes were enriched in ephrin receptor signaling pathway, EGFR signaling pathway, ERBB signaling pathway, mRNA splice site selection, DNA ADP ribosyltransferase activity, and T cell selection. (b) KEGG pathway analysis showed that these genes were involved in amino sugar and nucleotide sugar metabolism, antigen processing and presentation, ECM receptor interaction, focal adhesion, primary immunode ciency, and T cell receptor signaling pathway.