The flow chart of this study is shown in Fig. 1. A total of 433 BC patients from the TCGA-LIHC cohort were finally enrolled. The detailed clinical characteristics of these patients are summarized in Table 1.
Identification of prognostic ferroptosis-related DEGs in the TCGA cohort
61 ferroptosis-related genes were differentially expressed between tumor tissues and adjacent nontumorous tissues (Supplementary Table S3), and 12 of them were correlated with OS in the univariate Cox regression analysis (Fig. 2a). A total of 12 prognostic ferroptosis-related DEGs were preserved (Fig.2b-c). The interaction network among these genes indicated that SLC2A12 and SLC3A2 were the hub genes (Fig.2d). The correlation between these genes is presented in Fig. 2e.
Construction of a prognostic model in the train group
All samples data were divided into train group (n=200) and test group (n=200) averagely(Supplementary Table S5 and S6). LASSO Cox regression analysis was applied to establish a prognostic model using the expression profile of the 12 genes mentioned above in train group. After LASSO Cox regression analysis, the optimal 9 genes and their corresponding correlation coefficients were obtained (Supplementary Table S4). Survival analyses, according to the optimal cut-off expression value of each gene, indicated that high expression of these genes all correlated with a poor prognosis (all adjusted P<0.05, Fig. S2). The risk score was calculated as follows: e (0.001 * expression level of CAPG + 0.059 * expression level of CDO1 + 0.609 * expression level of DRD5 + 0.001 * expression level of SCD + 0.003 * expression level of SLC3A2 + 0.003 * expression level of TFRC - 0.072 * expression level of SLC38A1 - 0.047 * expression level of TP63 - 0.370 * expression level of ZNF419). The patients were stratified into a high-risk group (n=100) or a low-risk group (n=100) according to the median cut-off value (Fig. 3a). As shown in Fig. 3b patients with high risk had a higher probability of death earlier than those with low risk. PCA and t-SNE analysis indicated the patients in different risk groups were distributed in two directions (Fig. 3c-d). Consistently, the Kaplan-Meier curve indicated that patients in the high-risk group had a significantly worse OS than their low-risk counterparts (Fig. 3e, P< 0.001). The predictive performance of the risk score for OS was evaluated by time-dependent ROC curves, and the area under the curve (AUC) reached 0.712 at 1 year, 0.661 at 2 years, and 0.689 at 3 years (Fig. 3f).
Validation of prognostic model in test group
Using test group to verify the robustness of the prognostic model constructed. The same risk scoring formula was used to calculate, and the calculated median value (train group) was divided into high-risk group and low-risk group, as shown in Figure 4a. Similarly, patients in the high-risk group had a higher rate of early death than patients in the low-risk group, as shown in Figure 4b. PCA and t-SNE analysis indicated the patients in different risk groups were distributed in two directions (Fig. 4c-d). Consistently, the Kaplan-Meier curve indicated that patients in the high-risk group had a significantly worse OS than their low-risk counterparts (Fig. 4e, P< 0.01). The predictive performance of the risk score for OS was evaluated by time-dependent ROC curves, and the area under the curve (AUC) reached 0.682 at 1 year, 0.693 at 2 years, and 0.706 at 3 years (Fig. 4f).
Independent prognostic value of the 9-gene signature
Univariate and multivariate Cox regression analyses were carried out among the available variables to determine whether the risk score was an independent prognostic predictor for OS. In univariate Cox regression analyses, the risk score was significantly associated with OS (HR= 3.599, 95% CI =2.383-5.437, P< 0.001). After correction for other confounding factors, the risk score still proved to be an independent predictor for OS in the multivariate Cox regression analysis (HR =3.011, 95% CI = 1.956-4.635, P<0.001) (Fig. 5a, b).
Functional analyses in the TCGA cohort
To elucidate the biological functions and pathways that were associated with the risk score, the DEGs between the high-risk and low-risk groups were used to perform GO enrichment and KEGG pathway analyses. As expected, DEGs were enriched in several ferroptosis-related molecular functions(P. adjust < 0.05, Fig.6a, c). Interestingly, the DEGs from the TCGA cohort were also obviously enriched in many immune-related biological processes, including humoral immune response, complement activation, classical pathway, humoral immune response mediated by circulating immunoglobulin, cell recognition, phagocytosis, recognition, immunoglobulin complex, circulating, antigen binding, immunoglobulin receptor binding, cytokine activity, chemokine activity, and chemokine receptor binding (P. adjust < 0.05, Fig. 6c). KEGG pathway analyses also indicated that the Viral protein interaction with cytokine and cytokine receptor, cytokine-cytokine receptor interaction pathway, Chemokine signaling pathway, Complement and coagulation cascades, Intestinal immune network for IgA production and Natural killer cell mediated cytotoxicity was enriched in both cohorts (P < 0.05, Fig. 6b, d).
To further explore the correlation between the risk score and immune status, we quantified the enrichment scores of diverse immune cell subpopulations, related functions, or pathways with ssGSEA. Interestingly, the score of most immune cells were significantly different between the low risk and high risk group in the TCGA cohort, except NK cells, T helper cells and Th 2 cells (all adjusted P< 0.05, Fig. 7a). Moreover, the score of most immune function were significantly different between the low risk and high risk group. lower in the high risk group, while the type II IFN response were no significantly (adjusted P< 0.05, Fig. 7a-b).