Comprehensive Analysis of a Ferroptosis-related Gene Signature for Overall Survival and Immunotherapy Response in Patients With Bladder Urothelial Carcinoma

Background: Bladder urothelial carcinoma (BLCA) is the most common malignant tumor of the urinary system. Ferroptosis is a new type of programmed cell death that is iron-dependent and different from apoptosis, cell necrosis, and autophagy. Studies have indicated that many genes are involved in regulating ferroptosis or markers of ferroptosis. However, the value of genes related to ferroptosis in BLCA remains unclear. Methods: We comprehensively evaluated the differences in ferroptosis genes in patients with BLCA and control samples based on public databases, including mRNA expression, mutations, and copy number variations. The ferroptosis gene expression prole was used for consistent clustering to obtain ferr.clusterA and ferrclusterB. An analysis of differences between groups was performed to obtain ferroptosis-related gene, and then consistent clustering was performed to obtain ferr.gene.cluster A and ferr.gene.clusterB. Subsequently, the random forest algorithm was used to reduce dimensionality, Cox analysis was used to screen characteristic genes, principal component analysis was performed, and the ferroptosis score was constructed to quantify the ferroptosis expression of a single sample. Results: According to the ferroptosis score, the samples could be divided into two groups with signicant differences in prognosis, which proves that the ferroptosis expression pattern in a single tumor can predict the prognostic response of patients. Conclusion: In summary, ferroptosis-related genes are signicantly related to the progression of BLCA.


Background
Bladder cancer is the most common urinary tract malignancy [1]. Most bladder cancers are bladder urothelial carcinomas (BLCAs). Among them, non-muscular invasive bladder cancer usually has a good prognosis, but the recurrence rate and treatment cost are high [2][3][4]. In contrast, the prognosis of muscular invasive bladder cancer is poor [2,5]. It is estimated that the annual incidence of bladder cancer in Western countries is about 2/100,000 [6]. In 2020, there were approximately 81,400 new cases of bladder cancer in the United States (about 62,100 in men and 19,300 in women). Among them, about 17,980 people died of bladder cancer (about 13,050 cases for men and 4,930 cases for women) [7]. After 30 years of development, clinicians continue to provide patients with the same limited treatment methods, so the 5-year survival rate has basically not changed [8]. Therefore, the key to improving the survival rate of this dangerous disease may lie in genomics research, risk factor prevention, and immunotherapy.
In 2012, Dixon et al. [9] proposed that ferroptosis is an iron-dependent form of non-apoptotic regulatory cell death. It has attracted widespread attention in the research community of biochemistry, oncology, and particularly materials science. The killing effect of tumor cells can be achieved by ferroptosis, which belongs to the mechanism of reactive oxygen species. Just as photodynamic therapy does not rely on the external activation of light, the ferroptosis mechanism does not rely on the activation of external oxygen, so it is expected to be used to treat hypoxic tumors [10]. Hypoxia is one of the characteristics of most solid tumors and is closely related to tumor angiogenesis, metastasis and drug resistance [11][12][13]. In addition to aiding in the regulation of iron metabolism, ferroptosis is also regulated by glutathione peroxidase 4 (GPX4) [14,15]. Inactivation of GPX4 inhibits the reduction of reactive oxygen species (ROS) and lipid peroxides, leading to irreversible cell death. Cancers are more or less sensitive to ferroptosis, depending on the relative level of GPX4 [16,17]. There is increasing evidence that many tumors are related to ferroptosis, such as adrenal cancer, pancreatic cancer, hepatocellular carcinoma, and ovarian cancer [18]. Outstanding examples are diffuse large B-cell lymphomas and clear cell carcinomas of the kidney [14]. However, there are few studies on BLCA and ferroptosis, and it is still unknown whether ferroptosis-related genes are related to the prognosis of BLCA patients.
In this study, we rst downloaded the mRNA expression pro le and corresponding clinical data of BLCA patients from public databases. Then, we conducted a functional enrichment analysis to explore the underlying mechanism and analyzed the differential expression of ferroptosis-related genes in BLCA samples to identify the enriched pathways and their biological functions. We determined that ferroptosisrelated genes were associated with the prognosis of BLCA. Overall, our data suggested that ferroptosisrelated genes play pivotal roles in BLCA progression and are potential prognostic markers and therapeutic targets for BLCA.

Data collection
We obtained the mRNA expression pro le data, clinical information, and copy number variation (CNV) information of the BLCA and control samples from The Cancer Genome Atlas (TCGA) via the UCSC Xena website (https://xenabrowser.net/datapages/). The CNV information was downloaded using the "RTCGA" R package and the mutation data were downloaded using the "TCGAbiolinks" R package. The GSE13507 dataset was downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/). The immunotherapy dataset of IMvigor210 for BLCA was obtained using the "IMvigor210CoreBiologies" package. The data of TCGA, GEO, and IMvigor210 are all publicly available. Therefore, this study did not require the approval of the local ethics committee. The research followed the data access policies and publication guidelines of TCGA, GEO, and IMvigor210.

Data preprocessing
To maintain data consistency, we rst converted the IMvigor210 dataset through log2 (fpkm+1) and then used the "sva" R package to normalize the expression pro les of the three datasets.

Unsupervised cluster analysis
According to the expression data of 55 ferroptosis genes (a total of 60 ferroptosis genes, of which 55 were expressed in the three datasets), unsupervised cluster analysis was used to identify different ferroptosis expression types and further analyze the patient classi cation. The consensus clustering algorithm was used to determine the number of clusters and their stability. We used the "ConsensusClusterPlus" R package. The clustering method used was K-means, the distance used was Euclidean distance, and 1000 repetitions were performed to ensure the stability of the classi cation.
Gene Set Variation Analysis(GSVA) and single-sample Gene Set Enrichment Analysis (ssGSEA) To study the differences in ferroptosis expression patterns in biological processes, we used the "GSVA" R package to perform GSVA analysis. GSVA analysis is a non-parametric, unsupervised method that is mainly used to estimate changes in pathways and biological process activity in samples. We downloaded the "c2.cp.kegg.v7.1.symbols.gmt" gene set from the MSigDB database (https://www.gseamsigdb.org/gsea/index.jsp) for GSVA analysis. In order to evaluate the ratio and the difference of 28 immune cells in different ferroptosis expression subgroups (ferr.cluster), we used the ssGSEA analysis with the "GSVA" R package to obtain the in ltration ratio of 28 immune cells. Then, the samples of each ferr.cluster were compared by Wilcoxon test, and the different cells were subjected to Cox regression analysis to compare the prognostic differences.
Identify differentially expressed genes among differentferr.cluster Based on the expression of 55 ferroptosis genes, we divided the BLCA samples in the three datasets into two categories, and used the "limma" R package to determine the differentially expressed genes between different categories. The signi cance standard for determining the difference gene was set as adj.P.Val<0.05 (P values were adjusted by Benjamini & Hochberg correction.), and the multiple of difference was greater than 2 times or less than 0.5 times.
Ferroptosis score (ferr.score) calculation For the differential genes obtained in the previous step, the random forest method was used to remove redundant genes. Then survival analysis was performed on the remaining genes, and genes that were less related to survival were ltered out (P < 0.05 was considered to be related to survival). Finally, the remaining genes were used for principal component analysis (PCA), and the following formula was used to calculate the ferr.score: . The P for positive and N for negative. According to the median of the ferr.score, the samples were divided into high and low groups. The correlation between the two types of samples and the prognosis was further analyzed.
Correlation between ferr.score and other biological processes Mariathasan et al. [19] constructed a set of genes to store genes related to certain biological processes, including immune checkpoints; antigen processing and presentation; EMT1, EMT2, EMT3, and other epithelial-mesenchymal transition (EMT) markers; DNA damage repair; mismatch repair; and nucleotide excision repair and other pathways. We used GSVA to calculate the enrichment score (ES) for these biological functions in each sample. Then, the Pearson correlation analysis was performed on the ferr.score and the ES score of these biological processes to further reveal the relationship between the ferr.score and related biological pathways.

CNV analysis
The GISTIC method was used to detect the common copy number change area in all samples based on SNP6 CopyNumber segment data. The parameters of the GISTIC method were set as follows: Q ≤ 0.05 was the signi cance level of the change; when determining the peak interval, the con dence level was 0.95. The analysis was performed through the corresponding MutSigCV module in the online analysis tool GenePattern (https://cloud.genepattern.org/gp/pages/index.jsf), which was developed by the Broad Research Institute.

Tumor immune dysfunction and exclusion (TIDE) and inhibiting concentration 50 (IC50)
Researchers from Harvard developed the TIDE (http://tide.dfci.harvard.edu/) tool to evaluate the clinical effects of immune checkpoint suppression therapy. A higher tumor TIDE prediction score is related to a poorer immune checkpoint suppression therapy e cacy and has a poor prognosis. The prognosis prediction of immune checkpoint suppression therapy in this study was completed by the TIDE score. We used the "pRRophetic" R package to estimate the IC50 value of drugs (Cisplatin and Gemcitabine) according to the expression pro le, and then compared the IC50 differences in the ferr.score high and low group samples.

Statistical analysis
In the signi cance analysis between various scores, the Wilcoxon test was used to compare the differences between the two groups of samples. The Kaplan-Meier method was used to generate the survival curve for prognostic analysis, and the log-rank test was used to determine the signi cance of the difference. The receiver operating characteristic curve (ROC) was used to evaluate the ferr.score's prediction of immunotherapy, and the area under the curve (AUC) was quanti ed using the "pROC R" package. When displaying mutation maps, the "maftools" R package was used to present the mutation landscape of ferr.score patients with high and low subtypes. The "RCircos" R package was used to plot the chromosome distribution of 55 ferroptosis genes in 23 pairs of chromosomes.

Results
The ow chart of the present study is shown in Fig. 1. A total of 913 BLCA samples (400 from TCGA-BLCA, 165 from GSE13507, and 348 from IMvigor210) and 73 control samples (19 from TCGA-BLCA and 58 from GSE13507) were included. The clinical information of BLCA samples is presented in Table 1. We rst studied the mRNA expression levels of 55 ferroptosis genes in tumor samples and control samples in three datasets. The results showed that 20 genes were highly expressed and 12 were lowly expressed in BLCA patients compared with normal tissues (Fig. 2A). Then, we summarized the CNV and somatic mutations in 55 ferroptosis genes in tumor samples from the TCGA-BLCA dataset. Among these samples, the mutation frequency of TP53 was the highest (Fig. 2B). Moreover, we found that CNV was common in the 55 ferroptosis genes. Genes such as AKR1C1, AKR1C2, and AKR1C3 generally had copy number ampli cation; genes such as ACSL3, GOT1, and GPX4 generally had deletions (Fig. 2C). Figure   2D shows the location of the ferroptosis genes on the chromosomes. According to the expression of these 55 ferroptosis genes in the three datasets, PCA showed that the tumor samples could be distinguished from normal samples (Fig. 2E).
Unsupervised clustering of ferroptosis genes in BLCA samples The 55 ferroptosis gene expression pro les of 913 tumor samples in three datasets were used to perform univariate Cox analysis and consensus clustering. The interaction between ferroptosis genes and the correlation between regulatory factors and prognosis is presented in Fig. 3A. The result of consensus clustering showed that the BLCA patients were divided into two clusters, which we called ferr.clusterA and ferr.clusterB (Fig. 3B). These two clusters had signi cant differences for prognosis (Fig. 3C). The result of the GSVA analysis showed that ferr.clusterA was signi cantly enriched in cytokine-receptor interaction signaling pathways and ferr.clusterB was signi cantly enriched in steroid hormone biosynthesis and other biological processes (Fig. 3D). Furthermore, we used the expression pro le data to perform ssGSEA analysis to obtain the proportion In the three datasets, we found that there was no signi cant difference between ferr.clusterA and ferr.clusterB in gender (chi-square test P =0.27997, Fig. 4A) and age (P = 0.75195, Fig. 4B). However, signi cant differences were found in grade (P = 0.00021, Fig. 4C) and stage (P <0.00001, Fig. 4D). In the TCGA-BLCA dataset, we found that ferr.clusterA and ferr.clusterB were signi cantly different in the TP53 mutation group (P = 0.00023, Fig. 4E) but had no signi cant difference in the TTN mutation group (P = 0.28833, Fig. 4F). The results of the GSVA showed that the enrichment scores of different ferr.cluster groups had a signi cant difference in most pathways (Fig. 4G). In addition, the distribution of ferroptosis genes in ferr.clusterA and ferr.clusterB is shown in Fig. 5.
Differential genes related to ferroptosis genes A total of 201 differential genes related to the ferroptosis phenotype were identi ed. Then, we performed unsupervised cluster analysis based on the obtained ferroptosis-related genes. Similar to the clustering grouping of ferroptosis expression patterns, we also identi ed two clusters based on ferroptosis-related genes. We named these two clusters ferr.gene.clusterA and ferr.gene.clusterB (Fig. 6A) and observed that ferr.gene.clusterA patients had a better prognosis than ferr.gene.clusterB patients (Fig. 6B). The ferroptosis-related genes were mainly involved in the response of metal ions, glutathione metabolism, and other biological processes (Fig. 6C-D). The expression of most ferroptosis genes in ferr.gene.clusterA and ferr.gene.clusterB was signi cantly different (Fig. 6E).
Ferr.score analysis According to the median value of the ferr.score (−0.1517157), the samples were divided into the ferr.score high group and ferr.score low group (Fig. 7A). As shown in Fig. 7B, the ferr.score high group had a poor prognosis, whereas the ferr.score low group had a good prognosis, indicating that ferr.score offers a good characterization of the prognosis for BLCA patients. The correlation analysis between ferr.score and the known gene features showed that the ferr.score was signi cantly positively correlated with immune checkpoints, EMT2, and other biological functions (Fig. 7C-D). The Wilcoxon test showed that ferr.clusterA and ferr.gene.clusterB had signi cant differences on the ferr.score (Fig. 7E-F). The ferr.score of ferr.clusterA group was signi cantly higher than that of the ferr.clusterB group. Moreover, the ferr.score of the ferr.gene.clusterA group was signi cantly lower than that of the ferr.gene.clusterB group. The AUC of the prediction results for BLCA patients in 1 year, 3 years, and 5 years reached 0.584, 0.589, and 0.582, respectively (Fig. 7G).
In the three datasets, there was no signi cant difference between ferr.gene.clusterA and ferr.gene.clusterB in gender (Fig. 8A), but there were signi cant differences in age, grade, and stage ( Fig. 8B-D). Upon further analysis of the TCGA dataset, we found that the ferr.score had a signi cant difference in the TP53 mutation group (Fig. 8E), but no signi cant difference in the TTN mutation group (Fig. 8F). Survival analysis showed that the ferr.score high group had a worse survival rate than the ferr.score low group in the TCGA-BLCA and GEO datasets ( Fig. 8G-H), but there was no signi cant difference in the IMvigor210 dataset (Fig. 8I). We further explored molecular characteristics of the ferr.score high and low groups in TCGA dataset and found that the somatic mutation and CNV were similar in the two groups ( Fig. 9).
Immunotherapy response in ferr.score high and low groups We estimated the IC50 values of drugs (Cisplatin and Gemcitabine) based on the expression pro les of three datasets. Then, we compared the differences of IC50 values for Cisplatin and Gemcitabine between the ferr.score high group and the ferr.score low group. The results showed that the IC50 values of the ferr.score low group were signi cantly higher than those of the ferr.score high group (Fig. 10A-B). Moreover, we used TIDE to evaluate the clinical effect of the immune checkpoint suppression treatment for the ferr.score high and low groups based on the mRNA expression pro les. We found that the TIDE score of the ferr.score high group was signi cantly higher than the that of the ferr.score low group (Fig.  10C), and the expression of CD274 gene in the ferr.score high group was signi cantly higher than that of the ferr.score low group (Fig. 10F). The ROC curve was used to show different clinical factors and the ferr.score model to predict the immunotherapy response. The ferr.score model was the best (Fig. 10D).
However, in the IMvigor210 dataset, the ferr.score of different immunotherapy responses had no signi cant difference (Fig. 10E).

Discussion
Ferroptosis is closely related to the occurrence of tumors and plays an important role in the treatment of cancer. However, there is currently a lack of systematic studies on ferroptosis and its regulatory genes and BLCA. In this study, we used BLCA data from TCGA, GEO and IMvigor210 databases to explore the differences in molecular and clinical characteristics among different subgroups based on the expression of 55 ferroptosis genes. According to the difference of gene expression, 55 ferroptosis genes were clustered into two clusters to determine the correlation between genes and prognosis. Using BLCA samples to further reduce the dimensionality of ferroptosis-related genes that are differentially expressed between subgroups, we calculated the ferr.score, compared the differences between ferr.score high and low groups of BLCA samples, and used ferr.score classi cation to predict immunotherapy response.
Although the mechanism by which ferroptosis regulates the development and proliferation of BLCA cells is still unclear, we can improve our understanding of ferroptosis by analyzing the relationship between ferroptosis genes and BLCA. Our study found that most of the ferroptosis genes showed signi cant low expression in BLCA samples. These differentially expressed genes may be used as markers for the diagnosis and prognosis of BLCA. The CRYAB gene is a member of the small-molecule heat shock protein family. Its function is to act as a molecular chaperone to protect proteins from aggregation and protect cells from oxidative stress and other damage [20]. In BLCA, the CRYAB gene may play a role in inhibiting tumor development [21]. Moreover, the PTGS2 gene is up-regulated in many in ammatory or tumor tissues. The main product prostaglandin has many biological activities, such as inhibiting cell apoptosis, promoting cell proliferation, inhibiting immune surveillance, and promoting angiogenesis [22]. However, the expression of the PTGS2 gene is down-regulated in BLCA tissues [23].
After analyzing the tumor samples in the TCGA dataset, it was found that the mutation frequency of TP53 in the ferroptosis gene of the sample was the highest, and a change in CNV was also common. The gene encoding the p53 transcription factor is mutated in over half of all human cancers, re ecting its crucial role in preventing cancer [24]. Two clusters clustered based on 55 gene expression levels in the TCGA dataset also found signi cant differences in the TP53 grouping. Studies have shown that TP53 regulates ferroptosis in both directions. Under the condition of low levels of oxygen free radicals, it has the effect of inhibiting ferroptosis. In contrast, it promotes ferroptosis under high oxidative pressure [25].
There is a signi cant correlation between ferroptosis genes and the prognosis of BLCA patients, and there are interactions between different genes. The GSVA enrichment analysis indicated that these genes may be mainly involved in the regulation of two pathways in ferroptosis. One involves the cell factors and receptors that interact with signaling pathways, and the other is a the pathway for steroid hormone biosynthesis. Previous studies indicated that the occurrence of ferroptosis is mainly due to changes in the method of cell metabolism, which is characterized by excessive lipid oxidation catalyzed by iron [26]. The glutathione synthesis pathway is inhibited, causing the accumulation of lipid peroxides in the cell to initiate ferroptosis [9]. GPX4 is a key gene regulating ferroptosis. When the level of intracellular lipid reactive oxygen species (L-ROS) exceeds the antioxidant activity of glutathione-dependent peroxidase (GPX4), cellular redox homeostasis collapse can occur. Iron is a redox active metal that can participate in the formation of free radicals and the spread of lipid peroxidation through Fenton reaction. Once the clearance of lipid peroxides in cells is blocked, lipid reactive oxygen species (L-ROS) can lead to ferroptosis [27]. In short, ferroptosis is mainly the result of metabolic dysfunction caused by active oxygen, iron, and polyunsaturated fatty acids.
By clustering the expression level, we found that the ferroptosis gene had signi cant differences in the grade and stage of BLCA samples, indicating that the ferroptosis gene has great application value in the graded diagnosis of the degree of malignancy of BLCA. According to the phenotype of the ferroptosis gene, we clustered the differential genes related to the phenotype and found that the tumor prognosis of ferr.gene.clusterA cluster was better. This discovery lays the foundation for future research on the relationship between speci c ferroptosis-related genes and the prognosis of BLCA. We used the differential gene to calculate the ferroptosis score of each sample, and analyzed the relationship between the prognosis of BLCA patients based on the score, and found that the group with a high ferr.score was associated with a poor prognosis. The ferroptosis gene can simultaneously exert oncogene and tumor suppressor effects in cancer [28], and the ferr.score can be used as a protective or risk factor for a variety of cancers. The correlation analysis of known gene features constructed by the ferr.score and Mariathasan et al. showed that ferr.score was signi cantly positively correlated with biological functions such as immune checkpoints and EMT2. Studies have also indicated that by blocking the immune suppression mechanism and the function of immune suppression cells, a potential anti-tumor immune response can be triggered [29]. In recent years, manipulation of immune checkpoints or pathways has become an important and effective form of immunotherapy [30]. Studies have also indicated that the ferr.score has signi cant differences in TP53 grouping, age grouping, grade grouping and stage grouping. TP53 inhibits the uptake of cystine and sensitizes cells to ferroptosis by inhibiting the expression of SLC7A11 (cystine/glutamate antiporter) [31]. This shows that these aspects need to be considered in the process of clinical diagnosis and treatment.
We also explored the relationship between the ferr.score and drug resistance, as studies have indicated that the high ferr.score group has worse drug resistance. Immunotherapy is a tumor treatment method that uses the body's immune system to produce an anti-tumor response [30]. We found through research that the ferr.score high group had a poorer immunosuppressive point suppression treatment effect. Moreover, the expression results of CD274 (PD-L1) showed that the ferr.score high group was signi cantly higher than that of the ferr.score low group. This indicated that patients in the high ferr.score group may have a strong inhibitory effect on clinical immunotherapy and that the poor prognosis of patients in the high ferr.score group may be due to the strong immunosuppressive effect. The study also indicated that the ferr.score model is the best predictor of the corresponding results of sample immunotherapy.
The advantage of this study is that ferr.gene.cluster and ferr.score grouping have obvious differences in prognosis, and the differences in clinical or molecular characteristics are also obvious. IC50 and TIDE scores in the prediction of immunotherapy response results have signi cant differences in ferr.score grouping. Our research had several limitations. First, the number of normal samples in the three databases was relatively small. Secondly, this study only included the database mining design, without using fresh samples and prospective experimental studies for veri cation. Therefore, we will collect fresh samples and further prove our conclusions through experiments. There was no signi cant difference in the ferr.score of different immunotherapy responses in the immunotherapy response results.

Conclusion
In summary, we analyzed the relationship between 55 ferroptosis genes and BLCA through the BLCA data of the three databases of TCGA, GEO, and IMvigor210. The established prognostic model based on these 55 genes showed bene cial diagnostic and predictive performance. Understanding the mechanisms underlying ferroptosis and its effect on OS as well as its implications for the treatment of BLCA can provide insights into the identi cation of therapeutic targets for BLCA. Author's contribution HW, XTZ, and XHW conceived and designed research. HW, TD, and SY performed experiments and interpreted results of experiments. HW and QH analyzed data and prepared gures. HW, TD, and SY drafted the paper. XTZ and XHW edited and revised manuscript. All authors read and approved nal version of manuscript.

Funding
None.

Availability of data and materials
The data that support the ndings of this study are available from the corresponding author upon reasonable request.
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable. Figure 1 Flow chart of data collection and analysis. The P values were showed as: ns, no signi cant; *, P < 0.05; **, P < 0.01; ***, P < 0.001; ****, P < 0.0001.  Clinical and genetic variation analyses between ferr.cluster groups. A-D. The distribution of the clinical characteristics of the three datasets in the two ferr.cluster groups. E-F. The distribution of TP53 and TTN mutations in the two ferr.clusters in the TCGA dataset. G. The difference of enrichment score in two ferr.cluster groups.

Figure 5
The expression of ferroptosis genes in two ferr.cluster groups.  The ferr.score construction. A. Alluvial plot for the ferr.clusters, ferr.gene.clusters, and ferr.score groups. B.
Kaplan-Meier plot for overall survival in the high ferr.score group and low ferr.score group. C. The correlation between the ferr.score groups and known gene characteristics in BLCA. Negative correlations Analysis of molecular characteristics of ferr.score high and low groups in TCGA dataset. AB. The distribution of gene mutations in the samples of high ferr.score group (A) and low ferr.score group (B).
CD. The distribution of copy number ampli cation and deletion regions in the samples of high ferr.score group (C) and low ferr.score group (D).

Figure 10
The immunotherapy results of ferr.score high and low groups. AB. The difference between the IC50 values of Cisplatin (A) and Gemcitabine (B) in the samples of ferr.score high group and ferr.score low group. C. The difference of TIDE score between ferr.score high group and ferr.score low group. D. ROC curves of ferr.score and clinical characteristics. E. The ferr.score distribution of different immunotherapy