Patient characteristics
The flow chart of our research is shown in Fig. 1. A total of 1139 luminal subtype BRCA patients from the METABRIC cohort were included as a training set, and 754 patients from the TCGA cohort were enrolled as a testing set. The detailed clinical features of these patients are shown in Table 1.
Candidate prognostic ferroptosis-related DEGs were identified in the METABRIC cohort
More than half of the ferroptosis-related genes (32/63) were differentially expressed between 140 adjacent normal breast tissues and 1139 luminal subtype BRCA samples, and in the univariate Cox regression analysis, twelve of them were associated with OS (Fig. 2a). Some ferroptosis-related genes showed correlations among each other (Fig. 2b). The protein-protein interaction (PPI) network showing interactions between candidate genes is presented in Fig. 2c. A forest plot was used to display the results of the univariate Cox regression analysis of the relationship between the expression of candidate genes and OS (Fig. 2d). The heatmap showed that more than half of the genes were downregulated in tumor tissue, and consistent with the univariate Cox regression analysis, they represented a better prognosis, including PTGS2, ACO1, DPP4, CRYAB, PRKCA, ACSL4 and AKR1C3 (Fig. 2e).
Constructing a prognostic model based on the METABRIC cohort
To further identify the best candidate genes for building a predictive model, LASSO Cox regression was performed in the METABRIC cohort. Finally, 10 candidate gene signatures were found to have the optimal value of lambda (Fig. 2f and 2g). A risk score was established to identify the predictive performance of the 10 ferroptosis-related gene-based signature in the METABRIC cohort (Table S3). Patients with a risk score greater than 0.22 were categorized into the high-risk group, and the remaining patients were stratified into the low-risk group. The distributions of the risk scores, survival time, and survival status are displayed in Fig. 3a and 3b. The high-risk group was found to be significantly associated with higher age, postmenopausal status and histologic grade (Table 2). Kaplan-Meier curves were constructed and indicated that patients with low-risk scores were significantly correlated with better prognosis in the METABRIC cohort (Fig. 3c, P < 0.001). Then, time-dependent receiver operating characteristic (ROC) curve analysis was performed to evaluate the area under the curve (AUC). The AUCs of the 10 ferroptosis-related gene-based signature for predicting OS at 1, 3 and 5 years reached 0.721, 0.604 and 0.646, respectively (Fig. 3d). As shown in Fig. 3e and 3f, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) analyses showed that patients in different risk groups were spread out in two directions.
Validation of the 10 ferroptosis-related gene-based signature in the TCGA cohort
The baseline characteristics of the patients in different risk groups in the METABRIC and TCGA cohorts are shown in Table 2. To examine the soundness of the model constructed based on the METABRIC cohort, patients in the TCGA cohort were also assigned to either the high-risk or low-risk group with the same calculation formula as that for the METABRIC cohort. The high-risk group was also associated with higher tumor stage in the TCGA cohort (Table 2). Similar outcomes as those in the METABRIC cohort were obtained, and patients in the low-risk group had a longer survival time than those in the high-risk group (Fig S1. c, P = 0.0029). In addition, the AUC values of the 10 ferroptosis-related gene-based signature were 0.628 at 1 year, 0.593 at 3 years, and 0.649 at 5 years in the TCGA cohort (Fig S1. d). PCA and t-SNE analyses also confirmed that patients were distributed in two subgroups in a discrete direction (Fig S1. e and f). The complete list of the 10 candidate genes in the TCGA cohort is provided in Table S4.
Estimation of the independent prognostic value of the 10 ferroptosis-related gene-based signature
The outcomes of univariate and multivariate Cox regression analyses are illustrated with forest plots (Fig. 4), and the complete data are shown in Table S5 and Table S6. The risk score based on the 10 ferroptosis-related gene-based signature was determined to be an independent prognostic predictor in both the METABRIC (Fig. 4a) and TCGA (Fig. 4b) cohorts (hazard ratio (HR), 1.41, 95% confidence interval (CI), 1.14-1.76, P = 0.002; HR, 2.19, 95% CI, 1.13-4.26, P= 0.02). In addition, tumor stage and age were also independent prognostic predictors in both cohorts (P < 0.01).
Gene expression differences and functional analyses between the high- and low-risk score groups in the METABRIC and TCGA cohorts
To better explore the biological functions of the genes in the risk score, the DEGs between the high- and low-risk groups were identified and were consistent with the results of previous univariate Cox regression analysis (Fig. 2d). High expression of ferroptosis-related genes, including FANCD2, CS, G6PD and NQO1, in the high-risk group represented a higher risk of survival (Fig. 5a and Fig S. 2a). GSEA using the KEGG pathway database (c2.cp.kegg.v7.1.entrez.gmt) showed that cytokine-cytokine receptor interactions were enriched in the METABRIC and TCGA cohorts (Fig. 5b and Fig S. 2b). GO and KEGG pathway analyses were also used to explore the potential functions of the DEGs between the two groups. Interestingly, the DEGs from the METABRIC and TCGA cohorts showed enrichment of several cancer-related molecular pathways, including the PI3K-Akt signaling pathway, proteoglycans in cancer and the cell cycle (Fig. 5c, 5d and Fig S. 2c, 2d).
Immune cell infiltration landscapes of high- and low-risk patients with luminal subtype BRCA
The latest literature reported that CD8+ T cells downregulated the expression of SLC3A2 and SLC7A11 to promote tumor cell lipid peroxidation and ferroptosis [29]. The use of immunotherapy with immune checkpoint blockade targeting CTLA-4 and PD-1 has emerged as a promising strategy for the treatment of various malignancies [30]. Therefore, the immune microenvironment may have a strong link to ferroptosis. To identify immunotherapy targets and assess immunotherapy response in patients in high- and low-risk groups. The correlation analysis showed that the estimate score (r = -0.44; P < 0.001, Fig 6a), immune score (r = -0.32; P < 0.001, Fig 6b) and stromal score (r = -0.50; P < 0.001 Fig 6c) were negatively correlated with the score of the gene signature. Twelve common immune checkpoint genes were negatively correlated with the score of the gene signature (r>0.2; P < 0.005, Fig 6d). The correlation analysis bubble diagram shows the relationships among the 5 immune infiltration cell scores from TIMER (r>0.2; P < 0.005, Fig 6e). Similar, 18 and 1 immune infiltrating cell scores from TCIA (r>0.2; P < 0.005, Fig 6f) were negatively and positively correlated with score of gene signature, respectively. This suggests that low risk patients may have more options for immune targets when faced with immunotherapy. In Fig 6g, the percentage of immunotherapy responses in the low-risk group is much higher than in the high-risk group (P <0.001). This demonstrate that the low-risk group had a better response to immunotherapy. To better explore the association between the risk score and immune status, ImmuCellAI, which is used for precisely estimating the abundance of 24 immune cell types, including 18 T-cell subsets, was used to calculate the immune infiltration scores in and TCGA cohorts (Fig. 6h). Interestingly, several T-cell subsets, including Th2, Th17, Tgd, Tfh, Tem, Tcm and Tc, were significantly different between the two groups in TCGA cohorts (adjusted P< 0.05) (Table S7).
Chemotherapy response and chemotherapeutic agents prediction based on gene signature in high- and low-risk patients with luminal subtype BRCA
Although immunotherapy is a new trend in the development of BRCA treatment, chemotherapy remains the cornerstone of current treatment. The search for chemotherapeutic agents with higher drug sensitivity is necessary for patients and two drug response data (CTRP and PRISM) were used. We used two strategies to identify chemotherapy candidates. First, this agent with lower AUC value in the high-risk group (Log2FC>0.10). Second, these agents with negative spearman correlation coefficients (r < -0.20). These analyses yielded three CTRP-derived compounds (including Panobinostat, SB-743921 and KX2-391) and three PRISM-derived compounds (including volasertib, arcyriaflavin-a and CCT128930). All these compounds had lower estimated AUC values in high-risk group and a negative correlation with risk score (Fig 7a and 7b).
The heterogeneity between high- and low-risk patients
To further explore the heterogeneity of the two patient groups, the reverse-phase protein microarray (RPPA) was obtained from previous literature[31]. Our analysis identified that the risk score obtained from the ferroptosis-related gene-based signature were significantly correlated with tumor purity scores (r = 0.3, p <0.001) and most pathway scores (Fig 8a and Table S8). In addition, we sought to investigate whether pathway scores show differences between high- and low-risk patients with luminal subtype BRCA (Fig 8b-k). Our analysis implies that the pathway scores, except for EMT, RAS/MAPK and RTK, all were significantly higher in the high-risk group. These results suggest that the ferroptosis-related gene-based signature show differences in most BRCA-associated phenotypes.
Analysis of mutations and copy number variants in patients with distinct risk statuses
Previous literature has reported that genetic alterations, including copy number variants (CNVs) and somatic mutations, are common features of cancer [32, 33]. Therefore, a waterfall plot depicting the somatic mutation burdens (mutations/Mb) in the high- and low-risk groups was generated in the TCGA cohort. However, no significant differences were found between the two groups of patients (Fig 9a). Many tumors exhibit a low rate of somatic mutations but show large alterations in somatic copy number. GISTIC 2.0 was used to find the differences in somatic copy number alterations between patients with distinct risk scores. As shown in Fig 9b, amplifications on chromosome 11 accompanied by deletion of chromosome 1 were enriched in the high-risk subgroup. The GISTIC scores calculated by GISTIC 2.0 of patients with different risk scores are provided in Table S9 and Table S10.
Building a predictive nomogram for luminal subtype BRCA patients
To provide a clinically appropriate approach for predicting the probability of OS in luminal subtype BRCA patients, the independent risk factors were used to build a risk estimation nomogram (Fig. 10a). These predictors included tumor stage, risk score related to ferroptosis, age and histologic stage. The C-index of our nomogram was 0.66 in the METABRIC cohort. The calibration plots for 3-, 5- and 7-year survival probabilities in the METABRIC cohort are presented in Fig. 10b, c and d, respectively. Importantly, there was good agreement between the predicted survival rate and the actual observed survival rate. This means that our nomogram has good predictive value.