A Novel Ferroptosis-Related Gene Signature Robustly Predicts Clinical Prognosis And Tumor Immunity in Patients With Kidney Renal Clear Cell Carcinoma


 Objective: This study aimed to construct a ferroptosis-related gene signature to predict clinical prognosis and tumor immunity in patients with kidney renal clear cell carcinoma (KIRC).Methods: The mRNA expression profiles and corresponding clinical data of KIRC patients were downloaded from The Cancer Genome Atlas (TCGA), which were randomly divided into training (398 patients) and validation set (132 patients). The iron death related (IDR) prediction model was constructed based on training set and 60 ferroptosis-related genes from previous literatures, followed by prognostic performance evaluation and verification using the validation set. Moreover, functional enrichment, immune cell infiltration, metagene clusters correlation, and TIDE scoring analyses were performed. Results: In total, 23 ferroptosis-related genes were significantly associated with overall survival (OS). The IDR prediction model (a 10-gene signature) was then constructed to stratify patients into two risk groups. The OS of KIRC patients with high-risk scores was significantly shorter than those with low-risk scores. Moreover, the risk score was confirmed as an independent prognostic predictor for OS. The positive and negative correlated genes with this model were significantly enriched in p53 signaling pathway, and cGMP-PKG signaling pathway. The patients in the high-risk group had higher ratios of plasma cells, T cells CD8, and T cells regulatory Tregs. Furthermore, IgG, HCK, LCK, and Interferson metagenes were significantly correlated with risk score. By TIDE score analysis, patients in the high-risk group could benefit from immunotherapy.Conclusions: The identified ferroptosis-related gene signature is significantly correlated with clinical prognosis and immune immunity in KIRC patients.


Introduction
Kidney renal clear cell carcinoma (KIRC), also known as clear cell renal cell carcinoma (ccRCC) is one of fatal malignancies, accounts for 75% of all renal cancers (1,2). It is reported that more than 30% of KIRC patients eventually develop metastasis, and the ve-year survival rate of metastatic KIRC is less than 10%, with a median survival of only 13 months (3). Due to the late diagnosis and resistance to chemoradiotherapy (4), the prognosis of patients with metastatic KIRC is dismally poor (5). Therefore, early identi cation of KIRC metastatic potential and design of prognostic methods to identify patients at higher risk may be bene cial for the improvement of clinical outcomes. As human cancers are a consequence of genetic alterations, dysregulation of molecular signaling pathways, and epigenetic disorders (6), further investigations of molecular mechanisms underlying KIRC and identi cation of novel biomarkers are essential for early diagnosis, prognosis, and personalized treatment.
Programmed cell death (PCD) is a crucial terminal pathway for cells of multicellular organisms, whose dysregulation has been revealed to be related to tumorigenesis (7) and tumor progression (8). Ferroptosis is a newly discovered type of PCD that differs from traditional apoptosis, necrosis, or autophagic cell death, and is driven by iron-dependent lipid peroxide accumulation (9,10). Recent research has revealed that ferroptosis is widely involved in the pathophysiological process of many diseases, such as blood diseases, neurological disorders, kidney injury, and various tumors (11). Moreover, accumulating evidence has con rmed that ferroptosis plays a pivotal role in tumor development and treatment (12)(13)(14), and may be a promising trigger option for cancer cell death, especially for malignancies that are resistant to traditional therapies (15,16). Ferroptosis-related genes such as BRCA1-associated protein 1 (BAP1) (17), nuclear factor erythroid 2-related factor 2 (NRF2) (18,19), and glutathione peroxidase 4 (GPX4) (20) are found to be implicated in tumorigenesis and progression. Although recent studies have revealed that some ferroptosis-related genes, such as Cystathionine beta-synthase (CBS), CD44, Fanconi anemia complementation group D2 (FANCD2), glutamate exchanger xCT (SLC7A11) are associated with high-risk patients with KIRC (21,22), the key ferroptosis-related genes that could predict clinical prognosis and tumor immunity in KIRC have not been fully discovered.
In the present study, the mRNA expression pro les and corresponding clinical data of KIRC patients were downloaded from The Cancer Genome Atlas (TCGA). Based on training set (398 KIRC patients) and 60 ferroptosis-related genes retrieved from the previous literature (10,15,23,24), the iron death related (IDR) prediction model was constructed, followed by performance evaluation for predicting prognosis of KIRC patients by survival analysis and veri cation using the validation set (398 KIRC patients). Moreover, functional enrichment analysis was conducted to explore the key functions of the IDR prediction model. Furthermore, immune cell in ltration analysis, gene set variation analysis (GSVA) for metagene clusters, tumor immune dysfunction and exclusion (TIDE) scoring analysis were carried out to analyze the association of the IDR prediction model with immune in ltration, in ammation, and immunotherapy response in KIRC, respectively. The ow chart of this study is shown in Figure 1. Our ndings may help to improve the clinical outcomes of patients under personalized treatment.

Data collection
The RNA-sequencing (RNA-seq) data and clinical information of 602 KIRC patients in TCGA were downloaded from the public database UCSC Xena (http://xena.ucsc.edu). After excluding the control samples and patients with incomplete clinical data, 530 KIRC patients were included in this study. Clinical indicators with a missing value of more than 50% were excluded in the clinical phenotypic study, and nally, 10 indicators were included, including hemoglobin, platelet, serum calcium, white cell count, M, N, T, TNM, age, and gender. Since the data was obtained from a public database, approval from the ethics committee or written informed consent from patients was not required. Moreover, 60 ferroptosis-related genes were retrieved from the previous literature (10,15,23,24).

Construction of IDR prediction model
All data were randomly divided into training, testing and validation sets using R (4.0.2). Based on training set (398 KIRC patients) and 60 ferroptosis-related genes were retrieved from the previous literature, univariate Cox proportional hazards regression was performed to select candidate ferroptosis-related genes from 60 ferroptosis-related genes from previous literature, whose expression levels were signi cantly associated with the overall survival (OS) of the KIRC patient cohort (P < 0.05). The hazard ratios (HRs) were used to identify risk-related ferroptosis-related genes (HR > 1) and protective ferroptosisrelated genes (HR < 1). Four-fold cross-validation for candidate genes was conducted and the area under the curve (AUC) of receiver operating characteristic (ROC) curves was calculated using R 4.0.2 ROCR (25) based on the testing set. Testing set was included in training set but was not used for model construction. In addition, another model construction method of non-cross-validation of all sample sets was used, and the AUC value was also calculated based on the testing set. By comparison of the AUC values obtained from the two model construction methods, the prediction model with the optimal AUC value was selected as the nal IDR prediction model.

Survival analysis
Survival analysis was carried out using the Survival package in R (26), and log-rank tests were used to compare the differences in multiple subtypes (such as age, sex, and clinical stage) between high-risk and low-risk groups.

Functional enrichment analysis
The IDR prediction model-related genes with low expression value (50% of the gene expression value was 0) were ltered. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses for IDR-related genes were performed using clusterPro ler (27) in R. The false discovery rate (FDR) adjusted P value < 0.05 was set as the signi cance threshold.

Immune cell in ltration analysis
The IDR prediction model-related genes with low expression value (50% of the gene expression value was 0) were ltered. The proportion of human immune cell subpopulations, such as plasma cells, NK cells, mast cells, several T cell types, and macrophages were analyzed using Cell type identi cation by estimating relative subsets of known RNA transcripts (CIBERSORT, R 4.0.2) algorithm (28).
Seven metagene clusters including IgG, Interferson, lymphocyte speci c kinase (LCK), major histocompatibility complex-I (MHC-I), MHC-II, signal transducer and activator of transcription 1 (STAT1), and hemopoietic cell kinase (HCK) metagenes have been revealed to act as surrogate markers for the amount of different immune cell types in the breast cancer sample (29). Herein, the expression value of 7 metagene clusters was analyzed using R GSVA (30), and then the association between metagene clusters and IDR risk scores were analyzed using Spearman analysis.

TIDE scoring analysis
The TIDE website (http://tide.dfci.harvard.edu) was used to analyze the immunotherapy biomarkers of all samples. They were classi ed into high-risk and low-risk groups using median risk score. The immune cell components and the differences of immunotherapy biomarkers between the two groups were analyzed by Mann-Whitney U test.

Results
Identi cation of prognostic ferroptosis-related genes in the training cohort Based on the gene expression data and clinical data of training set (398 KIRC patients), the information of 60 ferroptosis-related genes was extracted. After excluding genes with a missing value of more than 50%, univariate Cox proportional hazards regression analysis was conducted. The results showed that 23 ferroptosis-related genes were signi cantly associated with OS. Among them, 15 ferroptosis-related genes were considered as risk factors (HR > 1), whereas the remaining 8 ferroptosis-related genes were considered as protective factors (HR < 1) ( Table 1). 95%CI: 95% con dence interval. ***P < 0.0001, **P < 0.01, and *P < 0.05.
Results of IDR prediction model construction Using the two model construction methods, the results of model construction were obtained ( Table 2).
Cvlist1 model with AUC = 0.74 was selected as the optimal IDR prediction model with the optimal AUC value was selected as the nal IDR prediction model. This model contained 10 ferroptosis-related genes, including 6 risk factors (STEAP3, CD44, CHAC1, FANCD2, G6PD, and FADS2) and 4 protective factors (GOT1, NCOA4, PEBP1, and GLS2). The risk score was calculated using the following formula for the ferroptosis-related gene signature: performed on the training set. Based on the median risk score (= 7.9775) as the cut-off point, KIRC patients were categorized into high-risk (n = 199) and low-risk (n = 199) groups (P < 0.0001, Figure 2A). KM survival curve analysis showed that the OS of KIRC patients with high-risk scores was signi cantly shorter than those with low-risk scores ( Figure 2B). Moreover, the patients were further grouped according to age (≥ 60 (n = 222) and < 60 (n = 176) years), gender (female (n = 135) and male (n = 263)), and AJCC stage (early (stages I and II, n = 247) and advanced (stages III and IV, n = 151) stages). KM survival curve analysis also showed that the OS rate was signi cantly shorter for the high-risk patients relative to the low-risk patients based on the prognostic signature among those with age ≥ 60 years (P < 0.001), those with age < 60 years (P < 0.001), female (P < 0.001), male (P < 0.001), early stages (P = 0.0033), and advanced stages (P < 0.001) ( Figure 2C-E). These results suggested that the IDR prediction model could determine the prognosis of KIRC patients.
We then analyzed whether the IDR prediction model was an independent prognostic predictor for OS. Univariate and multivariate Cox regression analyses were conducted. A total of 10 clinicopathological characteristics of the KIRC patients were also included, including hemoglobin, platelet, serum calcium, white cell count, M, N, T, TNM, age, and gender. Univariate analysis showed that age, tumor stage, serum calcium, T, M, platelet qualitative, white cell count, hemoglobin, and risk score were signi cantly associated with OS (all P < 0.05, Table 3). Further multivariate analysis revealed that age, tumor stage, serum calcium, T, and risk score were signi cantly associated with OS (all P < 0.05, Table 3). These data con rmed that the risk score was an independent prognostic factor for KIRC patients.

Validation of the IDR prediction model in the validation set
To test the robustness of the IDR prediction model constructed by training set, 132 KIRC patients in the validation set were strati ed into high-risk (n = 68) and low-risk (n = 64) groups based on the median risk score (= 7.9775). Likewise, KM survival curve analysis revealed that KIRC patients in the high-risk group had a shorter OS compared with those in the low-risk group (P = 0.0087, Figure 3A). Then, the validation set were randomly divided into three cohorts, named Cvlist1, Cvlist2, and Cvlist3. According their respective optimal median cut-off value, KIRC patients were strati ed into high-risk and low-risk groups, followed by KM survival curve analysis. Expect Cvlist1 set ( Figure 3B), patients in the high-risk group in Cvlist2 and Cvlist3 sets had a signi cantly worse OS than those in the low-risk group (P = 0.027, Figure 3C; P = 0.045, Figure 3D). Collectively, our results indicated that the predictive performance of the IDR prediction model was robust and reliable.

Functional analysis for the IDR prediction model
To elucidate the functions and pathways that were associated with the risk score, functional enrichment analysis was performed. The 398 samples (training set) and 132 samples (validation set) were all included in this part of the study. The genes with low expression were ltered out, and nally, the expression data of 32643 genes were analyzed. The genes that were signi cantly correlated with risk score was identi ed using Pearson analysis with the cut off value of P < 0.05 and coe cient > |0.4|. As results, 715 genes were found to be signi cantly correlated with risk score, among them, 668 genes were positive correlated while 47 were negative correlated. Heatmap for the top 100 positive correlated genes and 47 negative correlated genes is shown in Figure 4A. Further functional enrichment analysis showed that positive correlated genes were signi cantly enriched in several KEGG pathways, such as cell cycle and p53 signaling pathway, as well as GO functions associated with organelle ssion and nuclear division ( Figure 4B); the negative correlated genes were remarkably enriched in several KEGG pathways, such as Bile secretion and cGMP-PKG signaling pathway, as well as GO functions associated with apical plasma membrane and sodium ion transmembrane transporter activity ( Figure 4C).

Analysis of the relationship between the IDR prediction model and immune in ltration or in ammatory activities
To investigate the relationship between the IDR prediction model and immune in ltration, the CIBERSORT algorithm was applied The 398 samples (training set) and 132 samples (validation set) were all included in this part of the study. The genes with low expression were ltered out, and nally, the expression data of 32643 genes were analyzed. The landscape of immune cell in ltration of high-risk and low-risk groups was shown in Figure 5A. KIRC patients in the high-risk group had higher ratios of plasma cells, T cells CD8, and T cells regulatory Tregs (P < 0.05, Figure 5B-C).
Moreover, GSVA for 7 metagene clusters was conducted to reveal the relationship between the IDR prediction model and in ammatory activities. The duplicated genes and genes with 0 expression of more than 50% in 530 samples (398 samples in training set and 132 samples in validation set) were excluded, and nally the remaining 93 genes were included in the GSVA analysis. Heatmap for the expression value of 93 genes in the high-risk and low-risk groups is displayed in Figure 5D. By further Spearman correction analysis, IgG metagenes (Coe cient = 0.2170, P < 0.001), HCK metagenes (Coe cient = 0.0977, P = 0.0165), and LCK metagenes (Coe cient = 0.0892, P = 0.0286) were found to be signi cantly positively correlated with risk score, and Interferson metagenes (Coe cient = -0.1333, P = 0.0010) was signi cantly negatively correlated with risk score ( Figure 5E).

Discussion
In the present study, we built a novel IDR prediction model integrating 10 ferroptosis-related genes to stratify KIRC patients into two risk groups. The overall survival (OS) of KIRC patients with high-risk scores was signi cantly shorter than those with low-risk scores in both training set and validation set. Moreover, the risk score was con rmed as an independent prognostic predictor for OS. In addition, the positive and negative correlated genes with this model were signi cantly enriched in p53 signaling pathway, and cGMP-PKG signaling pathway. Immune in ltration analysis revealed that KIRC patients in the high-risk group had higher ratios of plasma cells, T cells CD8, and T cells regulatory Tregs. Furthermore, IgG, HCK, and LCK metagenes were signi cantly positively correlated with risk score, and Interferson metagenes was negatively correlated. By TIDE score analysis, patients in the high-risk group could bene t from immunotherapy. These ndings present future KIRC research with more information.
Increasing evidence has con rmed that ferroptosis plays a crucial role in diverse kidney diseases, including KIRC (31) and has anticancer functions that are useful in cancer treatment (10). Investigating the role of ferroptosis and identifying the ferroptosis-related genes will facilitate to the development of effective treatment strategies for KIRC. Moreover, it is reported that multigene signatures could provide risk strati cation and prognostic prediction in breast cancer, such as PAM50 signature (32). A multigene panel has exhibited the potent peformance to predict poor prognosis in patients with KIRC (33). In this present study, the IDR prediction model was built by 10 ferroptosis-related genes, and the risk score was con rmed as an independent prognostic predictor for OS. Therefore, we conduct that our identi ed multigene signature can predict clinical prognosis and brings insight to molecular biologic characteristics of KIRC.
The identi ed IDR prediction model included 6 OS-related risk factors (STEAP3, CD44, CHAC1, FANCD2, G6PD, and FADS2) and 4 potective factors (GOT1, NCOA4, PEBP1, and GLS2). STEAP3 is shown to have a ability to catalyse the reduction of free ferric iron (Fe3 + ) to divalent iron (Fe2 + ) in endosome and then release Fe2 + to the cellular labile iron pool, thus triggering ferropotsis (34). CD44 is widely implicated in various malignant processes such as tumor growth, and angiogenesis (35), which has been observed to be associated with a poor survival rate in many human tumors (36). The P-selectin and the CD44 receptor interactions is reveal to promote the cellular uptake of Fe 3 O 4 -SAS@PLT nanoparticles, resulting in higher cytotoxicity by ferroptosis (37). Upregulated CHAC1 can degrade intracellular glutathione and then contribute to ferroptosis in human triple negative breast cancer cells (38). FANCD2 is a nuclear protein that can protect against bone marrow injury form ferroptosis-mediated injury and represent an amenable target for the development of novel anticancer therapies (39). G6PD is a key enzyme that generates NADPH to maintain reduced glutathione, whose high expression is correlated with poor prognosis and poor outcome in several types of carcinoma (40,41). FADS2 is a key enzyme involved in the biosynthesis of polyunsaturated fatty acids, whose expression is correlated with multiple tumor processes in human cancers, such as tumor proliferation, angiogenesis, ferroptosis, and prognosis (42). NCOA4-mediated ferritinophagy is shown to be implicated in the initiation of ferroptosis (43). PEBP1 is required to form the 15LOX/PEBP1 complex, and suppression of this complex can mediate the Ferrostatin-1 prevention of ferroptosis (44). GOT1 and GLS2 is found to take part in glutaminolysis to regulate ferroptosis process (45). It is reported that miR-9 can suppress ferroptosis via downregulating GOT1 expression in melanoma (46). In cancer cells, GLS2 upregulation can induce an antiproliferative response with cell cycle arrest (47). Notably, high expression of risk genes, such as CD44, FANCD2 and CHAC1 as well as low expression of NCOA4 were found to be correlated with worse OS in KIRC (21,22), in line with our results. Taken together, these genes may be involved in KIRC via involved in ferroptosis. Moreover, the positive and negative correlated genes with this model were signi cantly enriched in p53 signaling pathway, and cGMP-PKG signaling pathway. The p53 signaling pathway has been con rmed to exert tumor suppressive function via regulating cellular processes, like anti-oxidant defense and ferroptosis (48). The cGMP/PKG signaling pathway is revealed to play a signi cant role in regulating the proliferation and survival of human renal carcinoma cells (49). Therefore, we speculate that p53 signaling pathway, and cGMP-PKG signaling pathway may participate in KIRC via regulating ferroptosis.
Although great efforts have been made to investigate the mechanisms underlying tumor susceptibility to ferroptosis, it remains elusive in the potential modulation between ferroptosis and tumor immunity. The immune in ltration analysis revealed that KIRC patients in the high-risk group had higher proportion of plasma cells, T cells CD8, and T cells regulatory Tregs. Tumor-in ltrating plasma cells are shown to play a positive role in antitumor immunity, suggesting that the potential of enhanced these response in the design of cancer immunotherapy (50). Lee et al. demonstrated that increased Tumor-in ltrating plasma cells are associated with worse survival in lung adenocarcinomas (51). CD8 + T cells can enhance ferroptosis-speci c lipid peroxidation and, in turn, affect the anti-tumor e cacy of immunotherapy (52). Treg frequency in peripheral blood and in tumour-in ltrating lymphocytes is revealed associated with poor prognosis in renal cell carcinoma (53). Moreover, we found that IgG, HCK, and LCK metagenes were signi cantly positively correlated with risk score, and Interferson metagenes was negatively correlated. By TIDE score analysis, most of immunotherapy biomarkers, such as CAF, TIDE, T Cell Exclusion, CD8, Merck18, MDSC, T Cell Dysfunction, CD274, MSI.Expr.Sig, and TAM.M2 were signi cantly different between high-risk and low-risk groups. These data prompted that patients in the high-risk group were in an immunoactive state and bene t from immunotherapy, which will provide a more comprehensive view of cancer immunotherapy e cacy Limitations of our study are as follows: rst, our IDR prediction model was constructed and validated with data from TCGA database, therefore, the use of this model in a real clinical setting remains controversial.
Second, the relationships between the risk score and immune immunity are not experimentally con rmed.
More studies are still required to con rm our observation.
In conclusion, a novel gene signature consisting of 10 ferroptosis-related genes shows reliable ability in predicting clinical prognosis and is correlated with immune immunity in patients with KIRC. The patients at higher risk might be more suitable to bene t from immunotherapy. With further validation, this model can be explored as a predictive biomarker to develop personalized therapies for patients with KIRC.

Declarations
Ethics approval and consent to participate Our study did not require an ethical board approval because it did not contain human or animal trials.
Availability of data and materials: Not applicable. This study was only the primary research, and further study has been in progress.

Consent for publication
Not applicable. Funding: National Youth Fund of Natural Science Foundation (Grant Nos. 81902572)

Competing Interests
The authors declare that they have no competing interests.

Author contributions
Qi Zhang carried out the Conception and design of the research, Wei Song participated in the Acquisition of data. Weiting Kang participated in the design of the study and performed the statistical analysis. Qi Zhang wrote and revised the manuscript. All authors read and approved the nal manuscript.

Figure 1
The ow chart of this study.    Analysis of the relationship between the IDR prediction model and immune cell in ltration and in ammatory activities. A: The landscape of immune cell in ltration of high-risk and low-risk groups. B: Higher ratios of immune cells in the low-risk group. C: Higher ratios of immune cells in the high-risk group.
D: Heatmap for the expression value of 93 genes in the high-risk and low-risk groups. E: The correlation between risk score and in ammatory metagenes.