A total of 530 patients with pathologically confirmed KIRC from the TCGA cohort were included in this study and randomly divided equally into two groups: 265 patients were included in the training set and 265 patients were included in the validation set.
2.1 Determination of DEGs related to prognosis of immune-ferropotosis
According to the gene expression data of TCGA patients, 6534 differential genes were obtained through differential analysis, and volcano maps were drawn according to the level of differential expression (Fig. 1a). Genes related to immunity and ferropotosis were intersected using a Venn diagram (Fig. 1b), and 103 intersected genes were obtained. Univariate Cox analysis was performed on genes related to immunity and ferropotosis. According to statistical significance (P < 0.05), 52 genes related to immunity and ferropotosis were obtained, from which forest maps were created: 23 low-risk genes and 29 high-risk genes (Fig. 1c).
2.2 Construction of immune-ferropotosis gene marker prognostic model (IFRSig)
Lasso-Cox regression was performed on prognostic genes using the GLNMET package in R to screen for the best prognostic genes. The Cox multivariate regression analysis coefficients of the prognostic genes were extracted (Fig. 1d and 1e). Risk scores were calculated based on gene expression levels using the following formula: Risk score= Ʃ(Expi × β I), where “Expi” denotes the gene expression level and β I represents the Cox risk ratio coefficient of genes. The risk score of this model was calculated as follows:
-0.578*ACADSB + 0.287*CHAC1-0.397*LURAP1L + 0.447*PLA2G6.
According to the median risk score of the TCGA cohort, the patients were divided into a high-risk group and low-risk group, then randomly divided into training set, validation set, and cohort at a ratio of 1:1 for verification.
To intuitively analyze the high- and low-risk groups, we adopted PCA and t-SNE technology, with dimension reduction as a two-dimensional plane, according to the dangerous degree of the median value of PCA and t-SNE image processing. According to the visualization images, most patients could be divided into high- and low-risk groups, and only a minority of patients could not be classified. This indicates that the model had a high sensitivity (Fig. 1f and 1g).
2.3 Validation of prognostic models
In the training set, high-risk patients had significantly lower survival rates than low-risk patients (Fig. 2a, d). This result was confirmed in the validation set and cohort (Fig. 2b-f). Heat maps were constructed to show the expression levels of the four prognostic genes in the high- and low-risk groups in the three cohorts (Fig. 2g-i).
Further survival analysis showed that the OS of high-risk patients in the training set was significantly lower than that in the low-risk group (Fig. 2j, P < 0.001), which was confirmed by the validation set and cohort (Fig. 2l and 2n, P < 0.001).The predictive ability of the risk score on OS was evaluated using time-dependent ROC curve analysis. The results showed that the AUC of the training set cohort were 0.802 (1 year), 0.746 (3 years), and 0.752 (5 years) (Fig. 2k), while the AUC of the verification set cohort were 0.696 (1 year), 0.699 (3 years), and 0.726 (5 years) (Fig. 2m). The AUC of TCGA cohort was 0.750 (1 year), 0.726 (3 years), and 0.736 (5 years) (Fig. 2o), suggesting that the model had good predictive ability.
2.4 Independent prognostic analysis of risk scores
To verify whether the risk score was independent of other prognostic factors, univariate and multivariate Cox proportional risk regression analyses were performed in the training set, validation set, and cohort, respectively. Univariate and multivariate Cox regression analyses were performed for age, sex, tumor stage, grade, and risk score, as shown in Fig. 3. The risk score and tumor stage of the training set, validation set, and total TCGA cohort were found to be significantly associated with OS, suggesting that tumor stage and risk score were independent predictors of OS (training set: HR = 1.313, 95%CI: 1.154–1.493, P < 0.001; validation set: HR = 1.178, 95%CI: 1.065–1.303, P < 0.001); total TCGA cohort: HR = 1.229, 95%CI: 1.140–1.326, P < 0.001).
2.5 Construction of the prediction column chart
To predict the survival probability of patients with KIRC, 1-, 3-, and 5-year nomograph were established in the cohort based on the prediction model (risk score) and clinical factors (T, N, and M) (Fig. 3g). According to the scoring results of each factor, the total value of the line graph can be obtained from the sum of the single scores of each factor associated with the overall score, which was used to estimate patient survival at 1, 3, and 5 years. To test the predictive power of the prognostic model of the nomograph, the calibration curve was used to evaluate the predictive model. The calibration curve was approximately 45 degrees, acting as the base line (Fig. 3h), indicating a high consistency between the actual and expected survival rates of the model.
2.6 The high-risk group was closely associated with poor prognosis in patients with KIRC
To further evaluate the predictive power of the model, we re-assigned age, gender, G1-2, G3-4, T1-2, T3-4, M0, M1, N0, N1-3, Stage I-II, and Stage III-IV as prognostic and clinicopathological factors in KIRC patients in the TCGA cohort according to different conditions. In multiple clinical subgroups, age ≤ 65 (P < 0.001), age > 65(P < 0.001), female (P < 0.001), male (P = 0.007), T1-2 (P = 0.009), T3-4 (P < 0.001), M0 (P < 0.001), M1 (P = 0.008), N0 (P < 0.001), N1-3 (P = 0.866), Stage I-II (P = 0.016), and Stage III-IV (P < 0.001) were indicative of the relationship between a high risk score and prognosis. We found that the Kaplan-Meier survival analysis and survival rate of each subgroup showed that regardless of age and gender, TM Stage and Stage, the survival time of patients in the low-risk prognostic model group was significantly prolonged, except that the high-risk prognostic curve in the N1-3 subgroup was insignificant. This suggests that the prognostic model has good predictive power in most subclinical subgroups (Fig. 4).
2.7 Enrichment analysis of functional pathways of DEGs
To elucidate why the model differentiated high- and low-risk patients with KIRC, the gene enrichment in high- and low-risk patients with KIRC was further compared. Using GO analysis, the differential genes were found to be enriched in biological process (BP), cell component (CC), and molecular function (MF) (Fig. 5a, Table 2). Differential genes were mainly enriched in complement activation (classical pathway) and complement activation of humoral immune response mediated by circulating immunoglobulins. Differential genes were mainly enriched in immunoglobulin complex, circulating immunoglobulin complex, blood particles, and other functions. In MF, the differential genes were mainly enriched in antigen binding, immunoglobulin receptor binding, chemokine activity, and other functions. In KEGG enrichment analysis, the differential genes were enriched in viral protein and cytokine and cytokine receptor interaction, cytokine-cytokine receptor interaction, PPAR signaling pathway, and other pathways. In GSEA analysis (Fig. 5b), the high-risk group showed abundant gene enrichment in diseases, immune system, innate immune system, hemostasis, and infectious diseases (P < 0.05). These results indicate that these functions and pathways were closely related to the development of KIRC.
2.8 Clinically relevant heat map
To observe the expression of prognostic model genes in clinical features, we constructed an expression heat map based on the correlation of clinical features to observe the expression relationship between the prognostic model genes in high- and low-risk groups, as well as patient age, sex, metastasis, tumor stage, grade, and immune score (Fig. 5c).
2.9 Expression and immunofluorescence localization of model genes in KIRC
According to the median value of gene expression in the prognostic model, patients with TCGA renal clear cell carcinoma data were divided into high- and low-expression groups. The mRNA expression levels of ACADSB and CHAC1 in renal clear cell carcinoma tissues were significantly lower than those in normal tissues (Fig. 6a-b). The mRNA expression levels of LURAP1L and PLA2G6 were significantly higher than those in the normal tissues (Fig. 6c-d). To investigate the subcellular localization of ACADSB, CHAC1, LURAP1L, and PLA2G6 in cancer cells, we used the HPA database to evaluate the distribution of ACADSB, CHAC1, LURAP1L, and PLA2G6 in renal tissue. As shown in Fig. 6e-h ACADSB and CHAC1 were mainly distributed in the mitochondria of U-2OS cells, while LURAP1L was mainly distributed in the nucleoli of A-431 cells. PLA2G6 was mainly distributed in the cytoplasm of the central line satellite of U-2OS cells.
In the TCGA-KIRC cohort, ACADSB and CHAC1 were low in renal clear cell carcinoma, whereas LURAP1L and PLA2G6 were higher in RENAL clear cell carcinoma than in adjacent non-tumor renal tissue. To confirm the expression of these four genes in clinical samples, we further validated the expression of proteins encoded by the four model genes using clinical samples from HPA. Immunohistochemical images of four gene signatures in normal kidney tissue and renal carcinoma are shown in Fig. 6i-p.
2.10 Single-cell RNA sequencing localization analysis of ACADSB gene
To evaluate the localization of model genes in single cells, we used the PanglaoDB dataset for single-cell RNA sequencing localization analysis; however, only ACADSB was found to be included in renal tissues. Therefore, we conducted ACADSB single-cell RNA sequence localization analysis. As shown in Fig. 6q and 6r, renal tissue cells were divided into eight cell clusters: distal tubule cells (P = 5.647E-14), endothelial cells (P = 1.14415E-19), macrophages (P = 1.38073E-10), podocytes (P = 5.62787E-05), main cells (P = 3.46711E-30), proximal tubule cells (P = 7.06537E-35), smooth muscle cells (P = 1.31683E-23), and unknown (P = 0.000803306). ACADSB was set as the superimposed expression of genes, and according to single-cell analysis, ACADSB was found to be enriched in clusters of distal tubule cells (Fig. 6s). This suggests that a decreased expression of ACADSB in the distal tubules plays an important role in the carcinogenesis of KIRC.
2.11 Mutation and correlation analysis of four model genes
To evaluate the mutation of four model genes in KIRC, we used cBioPortal database for data analysis. A total of 1.56% (7/448) of the patients were found to have genetic changes. Genetic changes in ACADSB included depth loss and splicing mutation, while genetic changes in CHAC1 included deep deletions. Similarly, genetic changes in LURAP1L included non-frameshift mutation and missense mutation, while genetic changes in PLA2G6 included missense mutations (Fig. 7a-b). Thereafter, Spearman correlation analysis was used to examine the correlation between model genes. As shown in Fig. 7c, ACADSB was significantly positively correlated with LURAP1L, and negatively correlated with CHAC1 and PLA2G6. CHAC1 was negatively correlated with LURAP1L.
2.12 Correlation between the model and immune cell infiltration
To better investigate the complex cross-talk between the prognostic model and immune characteristics, we evaluated the immune infiltrate profiles of 22 immune cells in clear renal cell carcinoma tissues between the high- and low-risk groups in the KIRC sample using the CIBERSORT algorithm (data visualized in a violin plot). The results showed that CD8 + T cells (P < 0.001), follicular helper T cells (P < 0.01), regulatory T cells (Tregs) (P < 0.001), activated NK cells (P = 0.011), and macrophage M0 (P = 0.01) were significantly increased in the high-risk group of KIRC patients. By contrast, monocytes (P < 0.001), macrophage M1 (P < 0.001), and macrophage M2 (P < 0.001) were significantly reduced in the high-risk group of KIRC patients (Fig. 7d).
2.13 Correlation analysis between risk score and immune cell abundance
With regards to the correlation between the risk score and immune cell abundance, Spearman correlation analysis showed that risk score was positively correlated with regulatory T cells (Tregs), macrophages M0, activated NK cells, plasma cells, CD8 + T cells, and follicular helper T cells (Fig. 7e-t). Risk scores were negatively correlated with naive B cells, active dendritic cells, resting dendritic cells, eosinophils, macrophages M1, macrophages M2, monocytes, neutrophils, resting mast cells, and resting memory CD4 + T cells (Fig. 7e-t).
2.14 Correlation of immune cells
Thereafter, the correlation of these 22 different immune cells was explored, and the results showed that Treg cells were positively correlated with CD8 + T cells and follicular helper T cells. CD8 + T cells were positively correlated with follicular helper T cells and T cells γδ. B cell infantile was positively correlated with plasma cell. Activated dendritic cells were positively correlated with eosinophils. Macrophage M2 was negatively correlated with CD8 + T cells and follicular helper T cells. The resting CD4 memory T cells were negatively correlated with CD8 + T cells and follicular helper T cells (Fig. 8a).
2.15 Risk score and immune-related function
Based on ssGSEA, the correlations between the risk scores and immune-related functions were determined, as shown in the boxplot in Fig. 8b. As a result, the high-risk group was found to have better immune function in terms of immune checkpoint activity, cytolysis activity, proinflammatory activity, T cell co-inhibition, and type II interferon response.
2.16 Kaplan-Meier survival analysis of immune cells
Subsequently, Kaplan-Meier survival analysis of immune cells was performed. The results showed that Treg cells, CD4 + T memory cells, follicular helper T cells, monocytes, inactive mast cells, and resting dendritic cells had a good predictive ability for OS (Fig. 8c-h).
2.17 Immune microenvironment and immune escape
To explore the relationship between the tumor microenvironment (TME), we performed TME scores and found that high-risk patients had higher immune scores and higher ESTIMATE scores than low-risk patients (Fig. 9a-c).To assess the potential of risk scores as biomarkers for immunotherapy or chemotherapy, we used the TIDE online tool to predict responses to immunotherapy in different risk groups. The TIDE score was found to be negatively correlated with the efficacy of immunotherapy, and the results showed that TIDE score of high-risk patients significantly increased (P < 0.05) (Fig. 9d-f), suggesting that the efficacy of immunotherapy in the high-risk group was worse than that in the low-risk group.
2.18 Immune checkpoints
Based on the importance of immunotherapy with checkpoint inhibitors, we further investigated the expression of immune checkpoints in both risk groups. The results showed that most immune checkpoints were more active in high-risk populations. The expression of immune checkpoints was significantly increased in high-risk patients. In particular, the expression of molecules, such as CD70 (CD70), cytotoxic T lymphocyte-associated protein 4 (CTLA4), and PDCD1, were significantly elevated in the high-risk group (Fig. 9g). These results suggest that the immune microenvironment in high-risk groups may be suppressed by the upregulation of immunosuppressive cytokines and immune checkpoints.
2.19 Small molecule drugs
Many small molecule drugs often show drug resistance in the process of cancer treatment, resulting in poor drug efficacy, leading to renal clear cell carcinoma and worse clinical prognosis. To validate the use of drugs in different risk groups, we compared the median maximum inhibitory concentration (IC50), which can help quantify the therapeutic ability of drugs to induce cancer cell apoptosis, which is inversely proportional to the sensitivity of small molecule drugs. Using the pRRophetic algorithm, we calculated the chemotherapy effect of 12 common small molecule drugs (sunitinib, pyrimidine, paclitaxel, lenalidomide, imatinib, gemcitabine, gefitinib, erlotinib, cytarabine, cisplatin, and bosutinib) on KIRC patients to evaluate the relationship between risk score and small molecule drug resistance based on IC50. As shown in Fig. 10, the IC50 for gemcitabine, gefitinib, imatinib, and cisplatin (P < 0.001) was significantly higher in the high-risk group than in the low-risk group, suggesting that high-risk patients may not benefit from these drugs. Sunitinib, pyrimidine, paclitaxel, lenalidomide, erlotinib, cytarabine, and bosutinib were significantly reduced in the high-risk group, indicating that these small molecule drugs may be more sensitive and have a greater impact on high-risk patients. These results indicate that the risk prognostic model could not only classify individuals into different risk groups, but also assist in the selection of small molecule drugs according to the corresponding sensitivity values of clinically observed KIRC patients.