The Immunogenomic Landscape of and Immune Cell Inltration in Bladder Cancer

The role of immune cell inltration in tumor biology and the potential of immunotherapy for the treatment of several cancers have been proven. However, the immunogenomic landscape and immune cell inltration need to be comprehensively analyzed in bladder cancer (BC). This study aimed to explore the immune-related genes (IRGs) in BC to create a prognostic risk assessment model and gain some insights into the molecular underpinnings of BC. Based on the datasets retrieved from The Cancer Genome Atlas (TCGA) database, we identied survival-associated IRGs via univariate Cox analysis. Then, we created an immune-related gene-based prognostic factor (IRGPF) and validated it by multivariable Cox analysis. We displayed the proles of 22 types of immune cells by using CIBERSORT and explored the relationship between IRGs and immune cell inltration.


Abstract Background
The role of immune cell in ltration in tumor biology and the potential of immunotherapy for the treatment of several cancers have been proven. However, the immunogenomic landscape and immune cell in ltration need to be comprehensively analyzed in bladder cancer (BC). This study aimed to explore the immune-related genes (IRGs) in BC to create a prognostic risk assessment model and gain some insights into the molecular underpinnings of BC.

Methods
Based on the datasets retrieved from The Cancer Genome Atlas (TCGA) database, we identi ed survivalassociated IRGs via univariate Cox analysis. Then, we created an immune-related gene-based prognostic factor (IRGPF) and validated it by multivariable Cox analysis. We displayed the pro les of 22 types of immune cells by using CIBERSORT and explored the relationship between IRGs and immune cell in ltration.

Results
Altogether, 58 differentially expressed IRGs were found to be associated with the prognosis of patients with BC. We constructed a prognostic assessment model as an IRGPF with IRGs (THBS1, CXCL9, CXCL11, FABP6, BIRC5, and PPY). Pro les of the in ltrating immune cells con rmed their signi cance based on clinical factors and individual differences. The IRGPF was related to immune cell in ltration, and the key gene was identi ed as THBS1.

Conclusions
Our ndings con rmed that IRGs could act as independent prognostic factors and immune-driver factors.
Patients with high levels of activated memory CD4 T cells but low levels of resting memory CD4 T cells had a better prognosis. This study indicates the possibility of developing new immunotherapeutic strategies and individualized treatment based on this approach.

Background
In 2020, it is estimated that approximately 81,400 people will be newly diagnosed with bladder cancer (BC), with approximately 17,980 deaths, in the United States [1]. BC is a highly prevalent disease and is related to substantial and increasing morbidity and mortality; its management requires long-term followup, as well as expensive and complex therapeutic strategies [2,3]. Nearly 50% of the patients with BC are in an advanced stage with poor prognosis when diagnosed [4]. Furthermore, distant recurrence still occurs in approximately 50% of patients with muscle-invasive BC after radical cystectomy, which mostly occurs in lymph nodes, lungs, liver, and bone within 24 months [4,5]. Therefore, it is crucial to explore the pathogenesis and molecular classi cation of BC to develop better therapeutic approaches.
In recent years, immunotherapy has been applied to and recognized for the treatment of various cancers, and this has been a main driver of personalized therapy [6][7][8]. Therapies that target immune checkpoints, such as programmed cell death protein 1, programmed cell death ligand 1, and cytotoxic T lymphocyteassociated antigen 4, are regarded as promising for BC [9]. In addition, some studies have revealed that immune cell in ltration is related to clinical characteristics, such as tumor stage, grade, and prognosis, whereas clusters based on immune cell in ltration are conducive to the selection of treatment strategies [10][11][12]. Moreover, immune-related genes (IRGs) were integrated to construct an independent prognostic assessment model for papillary thyroid cancer patients [13]. All these results con rm the signi cance of immunology and immunotherapy in the treatment of cancers.
However, in BC, only a limited number of studies has analyzed the prognostic signi cance of IRGs and the correlation between immune cell in ltration and BC outcomes. Such studies might illuminate the probable molecular mechanisms involved in BC, thereby facilitating the development of effective therapeutic strategies. The aim of this study was to explore the IRGs in BC as prognostic assessment tools in the clinic and as a clue to understanding immune cell in ltration. We constructed a prognostic risk assessment model for patients with BC by integrating the survival-associated IRGs and exploring the whole immune cell landscape. We gained insight into the relationship between IRGs and immune cell in ltration to explore the underlying regulatory mechanisms. These results could help to gain a deeper understanding of the independent prognostic factors, novel biomarkers, and molecular mechanisms involved in BC, which could provide signi cant impetus to the development of unique immunotherapeutic approaches and individualized patient management.

Data acquisition
We downloaded the RNASeqV2 normalized count datasets (n = 430), simple nucleotide variation datasets (n = 411), and clinical datasets (n = 409) from The Cancer Genome Atlas (TCGA, https://cancergenome.nih.gov/), all of which were combined into a matrix. We summarized and logtransformed the normalized mRNA expression datasets. A total of 411 data les represented tumor samples, whereas 19 represented normal samples. We extracted the information on age, sex, tumor grade and stage, survival status, and survival time from the clinical data les. Furthermore, we downloaded the GSE31864 dataset from the Gene Expression Omnibus (GEO) database to conduct external validation. We acquired a set of IRGs based on the Immunology Database and Analysis Portal (ImmPort) repository [14]. We identi ed clinically relevant transcription factors (TFs) from the Cistrome Cancer web resource [15]. All aforementioned data are open-source; hence, no approval of the ethics committee was required.

Differentially expressed genes (DEGs)
To obtain the DEGs, we explored the datasets using R software and edgeR package [16]. A Wilcoxon test was performed to compare the tumor and normal tissues, taking log 2 |fold change| > 1 and a false discovery rate (FDR) < 0.05 as the cut-off values for screening. Differentially expressed immune-related genes (DEIRGs) and TFs were then identi ed from all DEGs. We applied the Database for Annotation, Visualization, and Integrated Discovery to complete the functional enrichment analysis for DEIRGs to obtain the relevant Gene Ontology terms and visualized the results through the Gopolt R package. An FDR < 0.05 was set as the cut-off value for screening. In addition, functional enrichment analysis based on the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway database was performed with an adjust p < 0.05 as the cut-off.

Identi cation of survival-associated IRGs and construction of the regulatory network
We conducted univariate Cox analysis to explore the survival-associated IRGs with p < 0.05 as the cut-off value. The network between TFs and survival-associated IRGs was built using the Pearson test and displayed using Cytoscape software. P < 0.001 and |Cor| > 0.4 were taken as the cut-off values.

Model construction and validation
Using multivariate Cox regression, datasets from TCGA were randomly divided on average into two parts, one as a training cohort for constructing a prognostic risk assessment model and the other as a testing cohort for internal validation. Datasets from the GEO database were used for external validation. All patients were classi ed into high and low risk groups by using the median risk score of the training cohort. Survival analysis between the two groups was performed using the Kaplan-Meier curves and logrank test. We utilized the time-dependent receiver operating characteristic (ROC) curve to evaluate the prognostic performance of the testing, training, and validation cohorts.

Immune cell in ltration
By using the CIBERSORT algorithm, we calculated the relative proportions of 22 types of tumor-in ltrating immune cells [17]. We performed this analysis using the CIBERSORT code and R software based on 1000 permutations. We quanti ed the 22 immune cell types using the CIBERSORT p-value for each sample, through which we removed the deconvolution results with no statistical signi cance. Samples with a CIBERSORT p-value less than 0.05 were included in the subsequent analysis. Next, the differences in 22 types of in ltrating immune cells between the carcinoma and normal tissues were determined using the Wilcoxon test along with the differences in in ltrating immune cells based on pathological stage and TNM stage. The relationship between tumor-in ltrating immune cells and overall survival (OS) was explored using the Kaplan-Meier curves and evaluated via the log-rank test.
We conducted hierarchical clustering analysis of all included patients to con rm the correlation between the classes and clinical factors. The association between clusters and OS was explored using the Kaplan-Meier curves and assessed by the log-rank test. We conducted a Pearson test to explore the correlation between the risk score and the in ltrating immune cells. Additionally, the key genes were identi ed.  1C and 1F). Using the gene functional enrichment analysis, DEIRGs were found to be enriched in certain cellular components, biological processes, and molecular functions (Fig. 1G). Moreover, based on the KEGG pathway analysis, DEIRGs were found to be mostly enriched in cytokine-cytokine receptor interactions (Fig. 1H).

Identi cation of survival-associated IRGs and construction of the regulatory network
The univariate Cox analysis of DEIRGs in BC showed that 58 genes were associated with OS (p < 0.05, Fig. 2). The survival-associated IRGs were mostly enriched in categories such as extracellular region, extracellular space, growth factor activity, in ammatory response, and positive regulation of cell proliferation (Fig. 1I). By using the correlation test, we developed a regulatory network consisting of 27 TFs and 33 survival-associated IRGs (|cor| > 0.4, p < 0.001; Fig. 1J).

Construction and validation of the prognostic risk model
We constructed a prognostic risk-assessment model based on the training cohort using multivariate Cox regression analysis (Fig. 3A). The risk score was calculated using the following formula: level of THBS1 expression ⋅ 0.006 + level of CXCL9 expression ⋅ (− 0.009) + level of CXCL11 expression ⋅ (− 0.020) + level of FABP6 expression ⋅ (− 0.030) + level of BIRC5 expression ⋅ 0.023 + level of PPY expression ⋅ (0.466). According to this formula, we calculated the risk score for the testing cohort and the validation cohort. Based on the median risk score for the training cohort, patients were classi ed into high-and low-risk groups. We found that the patients in the low-risk group had a better 5-year OS than the patients in the high-risk group in all cohorts (Fig. 3B-D). We used time-dependent ROC curves to evaluate the prognostic risk assessment model, and calculated the area under the curve values of the training cohort, testing cohort, and validation cohort, which were 0.775, 0.653, and 0.634, respectively (Fig. 3E-G). Then, we showed the level of gene expression integrated based on the risk model ( Figure S1A-C), risk score ( Figure  S1D-F), and survival status (Figure S1G-I) of patients in the high-and low-risk groups in the three cohorts. Using the univariate and multivariate Cox analyses, we showed that the risk score calculated by the model could act as an independent prognostic factor for BC in the TCGA cohort (Fig. 3H, 3I).

Landscape of IRG mutations and immune cell in ltration
We next explored genetic alterations in survival-associated IRGs and found that missense mutations comprised the most common type of mutation (Fig. 4A). AHNAK had the highest mutation rate, at more than 20%. By using the CIBERSORT algorithm, we rst depicted 22 subpopulations of immune cells in patients with BC in the TCGA cohort with p < 0.05 as a lter (Fig. 4B). The difference in immune cell in ltration between the normal and tumor samples was also explored. Naive B cells, memory B cells, and resting mast cells were found to be decreased in tumor samples, whereas resting NK cells, M0 macrophages, and M1 macrophages were increased (p < 0.05; Fig. 4C). The proportions of different in ltrating immune cells were weakly to moderately correlated ( Figure S2A). In addition, M0 macrophages were found to be related to lymph node metastasis and distant metastasis (p < 0.05; Figure S2B and 2F), and M2 macrophages were found to be related to distant metastasis (p < 0.05; Figure S2C). We found that activated memory CD4 T cells were related to the pathological stage and lymph node metastasis (p < 0.05; Figure S2D). Survival analysis showed that the high resting memory CD4 T cell in ltration group had a worse 5-year OS (p < 0.05; Figure S2G). In contrast, the high resting NK cell in ltration group had a better 5-year OS (p < 0.05; Figure S2H).

Clustering analysis of immune cell in ltration
Given that the variation in the proportions of in ltrating immune cells might be an intrinsic feature that could characterize the individual differences, we performed hierarchical clustering analysis to further explore the relationship between immune cell in ltration and the clinical outcome of BC. The optimal number of clusters was determined as suggested, and appeared to be k = 3. This was an adequate selection with clustering stability increasing from k = 2 to 9 in TCGA datasets ( Fig. 5A and B). The consensus matrix heatmap revealed the three identi ed clusters, each of which appeared as wellindividualized clusters (Fig. 5C). Furthermore, clusters were correlated with the distinct patterns of OS (Fig. 5D). The relationships between the cell proportions of each cluster and clinical factors showed that cluster 1, with high levels of CD8 T cells and activated memory CD4 T cells but low levels of resting memory CD4 T cells, was associated with better prognosis (Fig. 5E).

Correlation analysis between the IRGPF and immune cell proportions
We nally analyzed the relationship between the IRGPF and immune cell in ltration (p < 0.05; Fig. 6). The risk score was positively related to the in ltration of M0 macrophages, M2 macrophages, resting mast cells, and neutrophils ( Fig. 6A-D). In contrast, the risk score was negatively related to M1 macrophages, resting NK cells, activated CD4 memory T cells, CD8 T cells, and follicular helper T cells (Fig. 6E-I).
Moreover, we explored the relationship between the key gene, THBS1 and immune cell in ltration (p < 0.05; Figure S3). The risk score was positively related to the in ltration of neutrophils, resting mast cells, and M2 macrophages ( Figure S3A-C). In contrast, the risk score was negatively related to the follicular helper T cells, regulatory T cells, CD8 T cells, and resting dendritic cells ( Figure S3D-G).

Discussion
In recent years, increasing attention has been paid to the treatment of tumors using immunotherapy, and great progress has been made in this eld [18][19][20]. Further, the importance of IRGs in tumor progression and immunotherapeutics has been well proven. Moreover, immune cell in ltration, taken as one of the intrinsic features of the tumor has been con rmed to be related to tumor grade, progression, and prognosis. However, fewer studies concerning IRGs have been conducted for BC, especially with respect to the clinical signi cance of immune cell in ltration, which might help to elucidate the underlying molecular mechanisms and describe the immune status more comprehensively [10][11][12].
Prognostication in BC cannot be exact, despite the knowledge that the depth of tumor in ltration into the bladder wall can provide a simple strati cation of risk [21][22][23]. However, integrating multiple biomarkers into a single model could substantially improve the prognostic value over that of a single model [24][25][26].
To create a simple and convenient indicator for assessing the prognosis of patients with BC, we created an IRGPF pro le. The IRGs were explored to construct a risk assessment model to predict clinical outcomes. A previous studies lacked validation [27]; however, we conducted external and internal validation to improve the credibility of our ndings.
We discovered that some in ltrating immune cells are related to the TNM stage and prognosis in BC. In addition, our cluster analysis results of immune cell in ltration could signi cantly improve the prognosis of patients. The results con rmed that immune cells are crucial to BC progression, similar to the ndings of some previous studies on BC and other cancers, and con rming the importance of immune cells in the pathophysiology of cancer [10][11][12]. Based on clustering analysis and survival analysis, we found that patients with high levels of activated memory CD4 T cells but low levels of resting memory CD4 T cells had a better prognosis. Further, highly activated CD4 cell in ltration predicted decreased lymphatic metastasis and a lower pathological grade. Activated memory CD4 T cells might play an important role in BC, which helps to discover new therapeutic targets. Our results indicated that the IRGPF was signi cantly correlated with some in ltrating immune cells, suggesting that it could not only assess prognosis but also indicate the immune status. This might provide clues to the molecular mechanism of immune cell in ltration.
Among the results of these experiments, the THBS1 gene caught our attention. THBS1 features prominently in the TF-mediated network regulating the survival-associated IRGs. As an independent prognostic factor, THBS1 played a key role in the development of a prognostic assessment model. Our analysis also con rmed that THBS1 is related to immune cell in ltration, which we have great interest in exploring further. Some studies suggested that THBS1 is involved in tumor progression and metastasis. THBS1 promotes the invasion of oral squamous cell carcinoma induced by TGFB1 in the cancer stroma [28]. THBS1 is also a prognostic factor and has a negative relationship with histone deacetylase in advanced gastric cancer [29]. Gene methylation leads to brain metastases of solid tumors, primarily involving epigenetic silencing of THBS1 regulators, such as RB1/p16INK4a [30]. THBS1 might also participate in the development and progression of BC and could be a potential key gene to reveal the underlying pathogenesis.
Our ndings might help to reveal the drivers and mechanisms of immune cell in ltration. However, there are some limitations. First, our proposed model needs more reliable datasets to validate the ndings. Second, our molecular results need to be veri ed by vitro or in vivo experiments. Third, the function of THBS1 in BC development and progression needs further study. A multi-group analysis involving immunogenomics, epigenetics, transcriptomics, metabolomics, and proteomics should be employed to further characterize the molecular mechanisms involved in BC.

Conclusion
We analyzed the signi cance of IRGs in monitoring the prognosis of BC and the relationship between IRGs and immune cells. We also identi ed a key gene, THBS1, relevant to BC prognosis. We anticipate that the risk assessment prognostic model that we created would be clinically signi cant. Our ndings provide new insights that might shed some light on the pathogenesis of BC and help in the development of new immunotherapeutic strategies.