Identification of prognostic ferroptosis-related DEGs and building a prognostic model
The dataset included a total of 509 GBM samples, 394 of which were used as validation samples. Patients’ clinical characteristics are summarized in Table 1. More than half of the genes(39/60) were differentially expressed between two immune score groups. 17 genes that were significantly related to the OS of GBM patients were identified by bioinformatics analysis. A total of 5 prognostic ferroptosis-related DEGs(ACACA,HMOX1,HSPB1,STEAP3,ZEB1) were preserved(Fig.1a-c). The relationship between these genes is presented in Fig.1d. Then, the least absolute shrinkage and selection operator (LASSO) method was utilized to screen significant DEGs from 5-survival-related genes mentioned above and to avoid overfitting the model. 10-fold cross validation with minimum criteria was employed to find an optimal λ, where the final value of λ gave minimum cross validation error. A 4-gene signature was identified based on the optimal value of λ (Fig.2). Using the following formula, the risk score was calculated: 0.3839* expression level of HMOX1+0.5187* expression level of HSPB1 +0.1811* expression level of STEAP3 +0.2649* expression level of ZEB1. A median cut-off value was applied to stratify patients into a high-risk group (n = 61 ) and a low-risk group (n =62) (Fig.3a). The PCA and t-SNE showed that patients in two risk groups were significantly distributed in different directions by the TCGA cohort (Fig.3b-c). As shown in Fig.3d, high-risk group patients is closely related to the high risk of mortality. The same, patients predicted to be in the low-risk group had a higher probability of survival. Kaplan–Meier survival analysis showed similar outcomes(Fig.3e,p<0.01). In order to verify the predictive ability of the risk scoring model. Receiver operating characteristic (ROC) curve were carried out to evaluated the predictive performance of the prognostic model. the AUC of the prognostic risk assessment model for the 4 ferroptosis-related genes was 0.82 at 1 years, 0.75 at 3 years, 0.67 at 5 years (Fig. 3f). Calibration plots can predict probabilities based on actual observed risks, so use calibration plots to assess the calibration of the model. The calibration curve of survival probability for the TCGA dataset showed that the calibration curve was close to the ideal curve, indicating that the model has good prognostic effect.Then, the difference between actual and predicted risk was compared, the study showed good concordance between the predicted and actual outcome (Fig. 3g-h).
Validation of the 4-gene prognostic model
In order to evaluate the effectiveness of the prognostic model constructed using data from the TCGA cohort, its performance was also assessed using validation cohorts. The risk score of each patient was calculated according to the same formula and patients were classified into the high- or low-risk groups according to the same cutoff value(Fig.4a-b). In the CGGA cohorts, we obtained the similar results to the training set. PCA and t-SNE analyses also demonstrated that patients in the two groups were distributed in discrete directions(Fig.4c-d). Similar, The OS time of patients in the high-risk group was significantly worse compared to the low-risk group (p < 0.05) (Fig.4e). Besides, in the CGGA cohort ,the AUC of the 4-gene signature was 0.82 at 1 year, 0.75 at 3 years, and 0.67 at 3 years(Fig.4f). On a calibration graph, the validation groups revealed a moderate calibration(Fig.4g-h), Meanwhile, our results are verified in the GEO dataset, a similar outcome was obtained.
Independent prognostic value of the 4-gene signature
Additionally, To identify the prognostic factors and rule out other confounding factors in GBM patients, we performed the univariate and multivariate cox regression analyses. The results showed that the risk score was an independent prognostic factor, and the high-risk group had a worse OS compared with the low-risk group(Derivation cohort (HR= 2.156, 95% CI =1.976-2.415, Table.2,P< 0.001); Validation cohort ( HR= 1.858, 95% CI = 1.756-2.615, Table.3, P<0.01)). And the multivariable Cox regression analysis indicated risk score was an independent predictors for OS(Derivation cohort(HR =1.656, 95% CI=1.245-1.855,Table.2,P<0.01),Validation cohort(HR=1.221, 95% CI = 1.168-1.498, Table.3, P<0.01)).
Functional analyses
GO was carried out in 3 functional ontologies: biological process (BP), cellular component (CC), and molecular function (MF). Based on GO analysis, the DEGs were most enriched in biological processes, and specifically in immune related processes(TableS2). Accumulated evidence has revealed that immune infiltration plays an essential role in GBM[24]. In addition, KEGG pathway analyses indicated that these DEGs were highly enriched in ferroptosis, fatty acid metabolism and cytokine-cytokine receptor interaction signaling pathway (P. adjust < 0.05). To further explore relationship between immune cell infiltration and risk scores in the prognostic risk mode. Implementation of ssGSEA the infiltration levels of immune cell types, functions, pathways in the cancer samples were quantified by ssGSEA in R package gsva. Interestingly, the score of Macrophages, Tfh, TIL, Treg and T helper cells were significantly different between high-and low-risk groups(Fig.5a). Moreover,APC_co_stimulation,CCR,Checkpoint,Cytolytic_activity,HLA,Inflammation−promoting,Parainflammation,T_cell_co−stimulation were the statistically different between the two risk groups(Fig.5b,all adjusted P<0.05). Similar results were obtained in validation group, which was consistent with the findings in the GO and KEGG analysis