Identification of DEGs
In total, 2721 DEGs were identified between GC (n=375) and normal samples (n=32), comprising 1658 downregulated and 1063 upregulated DEGs. The heatmap and volcano plot are shown in Fig 1a, b.
Identification of significant gene modules by WGCNA
Overall, 2721 DEGs and 407 samples were retained after gene and sample screening and preprocessing. A soft-thresholding power of β = 3 (scale-free R² = 0.895) was selected (Fig 2a). The gene dendrogram and the corresponding modules are displayed in Fig 2b. The network yielded 7 modules (blue, black, red, brown, green, turquoise, and yellow) in addition to the special gray module. Among these 7 modules, the red (r = 0.42; p = 3e−19), blue (r = 0.58; p = 2e−37), and black (r = 0.36; p = 1e−13) modules were positively correlated with GC (Fig 2c). In contrast, the brown (r = −0.39; p = 3e−16), green (r = −0.40; p = 1e−16), turquoise (r = −0.45; p = 1e−21), and yellow (r = −0.31; p = 9e−11) modules were negatively correlated with GC (Fig 2c). Finally, we identified the blue module, which contained 655 genes and showed the strongest positive correlation with GC, as the key module.
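The role of the power β = 3 reported above can be illustrated with a minimal sketch of how a soft-thresholded co-expression adjacency is typically formed. It assumes an unsigned network, a_ij = |cor(x_i, x_j)|^β, which is an assumption and not confirmed as the setting used in this analysis; the gene names and expression values are hypothetical toy data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def soft_threshold_adjacency(expr, beta=3):
    """Unsigned adjacency a_ij = |cor(gene_i, gene_j)| ** beta (assumed form).

    expr maps a gene name to its expression values across samples.
    """
    adjacency = {}
    for gene_i in expr:
        for gene_j in expr:
            if gene_i != gene_j:
                r = pearson(expr[gene_i], expr[gene_j])
                adjacency[(gene_i, gene_j)] = abs(r) ** beta
    return adjacency

# Hypothetical toy expression profiles (not data from the study).
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],  # perfectly correlated with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],  # perfectly anti-correlated with geneA
    "geneD": [1.0, 3.0, 2.0, 4.0],  # r = 0.8 with geneA
}
adj = soft_threshold_adjacency(toy_expr, beta=3)
print(adj[("geneA", "geneD")])
```

Raising the absolute correlation to a power suppresses weak correlations more than strong ones (here 0.8 becomes about 0.512), which is why a larger β pushes the network toward a scale-free topology.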
The analysis of DEFRGs
Next, we screened for DEFRGs in the TCGA cohort and identified 61 DEFRGs, including 31 upregulated and 30 downregulated genes (Fig 3; Table 1). We then intersected the co-expressed genes in the blue module with the DEFRGs and obtained a set of 23 shared genes (Fig 4a). Furthermore, strong correlations were observed among these FRGs (Fig 4b).
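The overlap step amounts to a set intersection between the blue-module genes and the DEFRGs. The sketch below uses three gene symbols that the text places among the shared genes, padded with hypothetical placeholder names; the actual lists of 655 and 61 genes are not reproduced here.

```python
# Toy gene sets: the *_ONLY_* names are hypothetical placeholders.
blue_module_genes = {"ANGPTL4", "SLC1A5", "CGAS", "MODULE_ONLY_1", "MODULE_ONLY_2"}
defrgs = {"ANGPTL4", "SLC1A5", "CGAS", "DEFRG_ONLY_1"}

# Genes present in both the key module and the DEFRG list.
shared_genes = sorted(blue_module_genes & defrgs)
print(shared_genes)
```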
Functional enrichment analysis
To better understand the functions and signaling pathways of the DEFRGs in ferroptosis, functional enrichment analysis was performed on the 23 genes. The DEFRGs were enriched in iron-related pathways, such as regulation of cell aging, cell cycle arrest, and NF-kappaB binding. KEGG pathway analysis showed that the DEFRGs were involved in ferroptosis-associated pathways, including cellular senescence, the p53 signaling pathway, phenylalanine metabolism, the HIF-1 signaling pathway, and the cell cycle (Fig 5a, b).
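Enrichment of a gene list in a term or pathway is commonly tested with a one-sided hypergeometric (over-representation) test. The following is a minimal sketch of that test, not necessarily the exact procedure used here; the background size, term size, and overlap in the example are hypothetical, and only the list size of 23 comes from the text.

```python
from math import comb

def hypergeom_enrichment_p(background, in_term, selected, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    background: number of genes in the universe
    in_term:    number of background genes annotated to the term
    selected:   size of the gene list being tested
    overlap:    number of selected genes annotated to the term
    """
    upper = min(in_term, selected)
    total = comb(background, selected)
    tail = sum(
        comb(in_term, k) * comb(background - in_term, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical example: 23 selected genes, 4 of which fall in a 100-gene
# term, drawn from an assumed background of 20000 genes.
p_value = hypergeom_enrichment_p(background=20000, in_term=100, selected=23, overlap=4)
print(p_value)
```

In practice the resulting p-values are adjusted for multiple testing across all terms before any term is called enriched.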
Construction of the three-gene-based GC prognostic model
UCR analysis of the 23 FRGs identified 5 FRGs as potential prognostic indicators of overall survival (OS) in GC: ANGPTL4, SMPD1, MYB, SLC1A5, and CGAS (Fig 6). After this primary filtering, we further narrowed the gene set to three genes: SLC1A5, ANGPTL4, and CGAS. To establish an optimal prognostic gene model, MCR analysis was performed on the three genes. The risk score was calculated by the following formula: risk score = [0.1497 × mRNA expression level of ANGPTL4] + [−0.1806 × mRNA expression level of SLC1A5] + [−0.2385 × mRNA expression level of CGAS]. We then divided 370 patients into high-risk (n=185) and low-risk (n=185) groups using the median risk score as the cut-off. Patients in the high-risk group had worse OS than those in the low-risk group (Fig 7a). As the risk score increased, patients had shorter survival times and more death events (Fig 7b, c). The risk heatmap showed the expression differences of the three genes (SLC1A5, ANGPTL4, and CGAS) between the two groups (Fig 7d). We used the GEO cohort for external validation of this three-gene signature and obtained consistent results (Fig 7e-h), which further supported the reliability and stability of the three-gene-based model.
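The risk score formula and the median split can be sketched as follows. The coefficients are those given in the formula above; the patient identifiers and expression values are hypothetical, and assigning scores equal to the median to the low-risk group is an assumption made for this sketch.

```python
from statistics import median

# Coefficients taken from the risk score formula in the text.
COEFFICIENTS = {"ANGPTL4": 0.1497, "SLC1A5": -0.1806, "CGAS": -0.2385}

def risk_score(expression):
    """Weighted sum of the three genes' mRNA expression levels."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Scores equal to the median are labeled 'low' (an assumption).
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}

# Hypothetical expression values for four toy patients.
patients = {
    "P1": {"ANGPTL4": 5.0, "SLC1A5": 2.0, "CGAS": 1.0},
    "P2": {"ANGPTL4": 1.0, "SLC1A5": 4.0, "CGAS": 3.0},
    "P3": {"ANGPTL4": 3.0, "SLC1A5": 3.0, "CGAS": 3.0},
    "P4": {"ANGPTL4": 6.0, "SLC1A5": 1.0, "CGAS": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(groups)
```

Because ANGPTL4 carries a positive coefficient and SLC1A5 and CGAS carry negative ones, higher ANGPTL4 expression raises the score while higher SLC1A5 or CGAS expression lowers it.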
Assessment of the Three-FRG Signature as an Independent Prognostic Factor in GC Patients
To confirm whether the newly generated risk score was an independent risk factor in GC patients, we performed UCR and MCR analyses, which showed that T, N, metastasis, and risk score were independent prognostic factors for OS in GC (p < 0.001) (Fig 8b, c). To evaluate the predictive performance of the risk model in GC, ROC curves were constructed. The area under the ROC curve (AUC) of the risk score (0.611) was higher than that of age (0.571), gender (0.539), T (0.590), N (0.572), and metastasis status (0.547) (Fig 8a). Together, these results indicated that the three-FRG signature was an independent prognostic factor in GC.
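For readers comparing the AUC values above, the sketch below computes a plain binary AUC as the probability that a randomly chosen event case has a higher score than a randomly chosen non-event case, with ties counted as one half. This is an illustration only: the analysis may have used time-dependent ROC curves, which this sketch does not implement, and the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Binary AUC via pairwise comparison (Mann-Whitney formulation).

    labels: 1 for an event (e.g. death), 0 otherwise
    scores: predictor values, higher meaning higher predicted risk
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical outcomes and risk scores for six toy patients.
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.4, 0.35, 0.8, 0.7, 0.1]
auc = roc_auc(toy_labels, toy_scores)
print(auc)
```

An AUC of 0.5 corresponds to a predictor no better than chance, so the values reported for the clinical variables and the risk score all indicate modest discrimination.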
Validation of Prognostic Performance of the FRGs Signature in GC
To further assess outcome prediction, we combined the validation datasets (433 patients in total) to evaluate the robustness of the three-gene signature. The AUC of the ROC curve was 0.676 for the validation datasets (Fig 8d), comparable to that in the TCGA set. Cox regression analyses indicated that the risk score of the signature could be a powerful indicator of GC patients' clinical outcomes (Fig 8e, f). Finally, the expression of the 3 signature genes was validated using immunostaining results from the HPA database. As shown in Fig 9, SLC1A5 was highly expressed in GC tissue, while ANGPTL4 and CGAS were downregulated in GC. The waterfall plots in Fig 10 show the mutational landscape of the top 30 mutated genes. Interestingly, TTN, TP53, MUC16, and ARID1A, which are involved in various biological processes, were the most frequently mutated genes in both risk groups. In addition, the overall mutation frequency was higher in the high-risk group (96.74%) (Fig 10a) than in the low-risk group (84.75%) (Fig 10b), suggesting that somatic mutation was positively correlated with risk score.
Immune Profile in risk groups
Because ferroptosis is strongly associated with immune status, we further explored the correlations between risk score and immune status, using CIBERSORT and TIMER to evaluate immune cell features. Monocytes, Macrophages M2, Dendritic cells activated, Mast cells resting, and Neutrophils were upregulated in the high-risk group, while T cells CD8, T cells CD4 memory activated, T cells follicular helper, and Macrophages M2 were significantly downregulated (p < 0.05) (Fig 11). Moreover, the risk score was positively associated with Macrophages (r=0.366; p=6.846e-13) and T cells CD4 (r=0.135; p=0.010) (Fig 12). These results indicated that the ferroptosis-related risk model was correlated with the immune state of GC.
Evaluation of pathways within Both High- and Low-Risk TCGA Cohorts
GSEA was performed to identify gene sets differentially enriched between the high- and low-risk groups using the MSigDB database (c2.cp.kegg.v6.2.symbols.gmt). The cell cycle, p53, MAPK, ubiquitin-mediated proteolysis, and TGF-β signaling pathways were among the most significantly enriched pathways (Fig 13).