Identication the ferroptosis-related gene signature in Gastric cancer based on weighted gene coexpression network analysis (WGCNA)

Ferroptosis is a mode of regulated cell death that depends on iron, plays pivotal roles in regulating various biological process in human cancers. However, the role of ferroptosis in Gastric cancer (GC) remains unclear. A total of 2721 differentially expressed genes (DEGs) were ltered base on The Cancer Genome Atlas (TCGA) (n = 375) dataset. Gene modules were identied based on co-expression network analysis (WGCNA). Functional analysis was performed to explore the biological function. Lasso-penalized and univariate Cox regression (UCR) analysis, survival genes were screened out to construct a prognostic model, which validated by the GSE43292 dataset. Gene set enrichment analysis (GSEA) for prognostic index was performed. Finally, the correlations of ferroptosis and immune cells were assessed through the TIMER database.


Introduction
As a major public health issue globally, gastric cancer (GC) is the third leading cause of cancer-related death [1]. Due to early stages of GC are usually asymptomatic, and if patients are diagnosed at advanced stage, leading to poor survival [2]. Moreover, among patients with GC who receive adjuvant therapy, 50% experience local or distant disease recurrence [3]. Hence, identi cation of reliable diagnostic and prognostic biomarkers is critical for GC, not only to improve prognostication but also to provide novel therapeutic targets for GC.
Ferroptosis is a novel form of regulated cell death that was described in 2012 [4]. It is a type of regulated cell death that is driven by intracellular lipid peroxidation and is highly dependent on reactive oxygen species (ROS) generation and iron availability [5]. Ferroptosis plays an important role in various tumors cells such as brosarcoma, lung cancer, and prostate cancer cells [6][7][8]. In addition, Several publications have reported that natural active components alleviated multidrug resistance of cancer and inhibit the progression of multiple tumors through inducing ferroptosis [9]. These ndings suggest ferroptosis as a new player that regulates the tumor suppressive function. Nevertheless, prognostic models for FRGs have not been constructed for prediction of overall survival (OS) in GC patient.
The present study performed comprehensive analyses utilizing TCGA and GEO. We evaluated the prognostic value of the DEFRGs and constructed a three-mRNA signature that could effectively predict patient survival in TCGA dataset, and further validated in GEO dataset. Further, we further conducted functional studies on risk scores to elucidate the pathogenic mechanisms and provide scienti c basis for clinical diagnosis and treatment.

Identifcation of DRGs and DFRGs
Differential expression of DRGs and DFRGs between tumor and normal samples was assessed using the R-package called limma [11], The false discovery rate (FDR) <0.05 and | log2(fold change)| > 0.5, visualized on boxplots and heatmaps using the "ggpubr" and "heatmap" package, respectively.

Highly Coexpressed Gene Set-Gene Module
The WGCNA was conducted by the WGCNA package [12] in R software. As a previous study showed that the WGCNA analysis was sensitive to batch effects and outlier samples, we performed hierarchical cluster analysis [13]. The module eigengene (ME), which is considered representative of the gene expression pro les, was calculated to identify clinical associated modules base on DRGs. To nd the most tumor related modules, we conducted Module-Trait Relationships calculations for each module.
Then, for genes in the signi cant tumor-related modules, we calculated the Gene Signi cance and Gene Module Membership (MM) within the genes, modules, and clinical traits. Finally, we identi ed the genes in the GC-related modules [14,15].

Acquisition of intersecting genes
Overlapping genes were identi ed as candidates for the subsequent analysis and were oriented from the DFRGs and WGCNA analysis. The online tool Draw Venn Diagram (http://bioinformatics.psb.ugent.be/webtools/Venn/). Coexpression analysis was performed using the "Corrplot" package.

Function and pathway enrichment analyses
The cluster Pro ler package in R [16] was used to test the statistical enrichment of functions. To assess for functional categories, we used Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway. The P-value of <0.05 and q-value of <0.05 were set as the threshold.
Construction of an FRGs-based risk score model The candidate FRGs were analyzed using UCR analysis (p < 0.05). The median values were de ned as the cutoff values for high and low FRGs expression in Cox survival analysis. After identifying the prognosis related FRGs, we performed MCR analysis to identify independent prognostic FRGs. Finally, the risk score for each patient was calculated by taking the sum of Cox regression coe cient for each signature gene multiplied with its corresponding expression value. The immunohistochemistry staining of genes were examined by The Human Protein Atlas (HPA) (https://www.proteinatlas.org/about/download).
Gene Set Enrichment Analysis (GSEA) was performed using GSEA v4.0.3 software with 1000 permutations and weighted enrichment statistics. The median risk score was used as the cut-off point for high or low-risk group classi cation [17].

Immune In ltration Analysis
The in ltration level of immune cells in GC was predicted using the TIMER database [18]. Correlation analysis between six types of immune cells and risk score were then conducted (https://cistrome.shinyapps.io/timer/).

Exploring Relationships Between Immune Components and risk groups
We used the CIBERSORT algorithm to estimate data on tumor-in ltrating immune cells. Violin plots were used to present the full distribution of the data [19].

Statistical analysis
For analysis of differences between two groups, Student's t test was performed. A Kaplan-Meier survival analysis was performed to estimate survival curve. Statistical analyses were performed using the statistical software R (version 4.0.2). A value less than 0.05 (p < 0.05) was considered signi cant. Identi cation of signi cant gene modules by WGCNA Overall, 2721 DEGs and 407 samples were selected after gene and sample screening and preprocessing. We used a power calculation of β = 3 (scale-free R2 = 0.895) (Fig 2a). All selected gene dendrograms and their corresponding modules are displayed in Fig 2b. There were 7 modules, including the special gray module according to the network result (i.e., blue, black, red, brown, green, turquoise, and yellow modules). Among these 7 modules, red (r = 0.42; p = 3e−19), blue (r = 0.58; p = 2e−37), and black (r = 0.36; p = 1e−13) modules showed positive relationships with GC ( Fig 1E). Furthermore, the genes in the turquoise and red modules showed strong negative correlations with GC (Brown: r = −0.39; p = 3e−16, Green: r = −0.4; p = 1e−16, Turquoise: r = −0.45; p = 1e−21, and Yellow: r = −0.31; p = 9e−11) (Fig 2c). Finally, we identi ed blue module as the key modules, in which there were 655 genes.

The analysis of DEFRGs
Next, we screened the DEFRGs in TCGA. We identi ed Sixty-one DEFRGs, including 31 up-regulated and 30 down-regulated genes (Fig 3; Table 1). Following, we compared the co-expressed genes in blue module with the DEFRGs, then a set of 23 shared genes were obtained (Fig 4a). Furthermore, a strong correlation among the FRGs (Fig 4b).

Functional enrichment analysis
To better understand the signaling pathways and functions of DEFRGs in ferroptosis, functional enrichment on the 23 genes were performed, and found that DEFRGs were enriched in iron-related pathways, such as regulation of cell aging, cell cycle arrest, and NF−kappaB binding. KEGG pathway analysis for the DEFRGs showed that genes involved in ferroptosis, including the Cellular senescence, p53 signaling pathway, Phenylalanine metabolism, HIF−1 signaling pathway and Cell cycle (Fig 5a, b).

Construction of the three-gene-based GC prognostic model
After that, UCR analysis of the screening results, including 23 FRGs, led to the identi cation of 5 FRGs as potential prognostic indicators of GC overall survival (OS), including ANGPTL4, SMPD1, MYB, SLC1A5, and CGAS (Fig 6). After primary ltering, we further shrink the scope of gene screening. Three genes were identi ed: SLC1A5, ANGPTL4, and CGAS. To establish an optimal prognostic gene model, MCR analysis was performed on the three genes. Risk score was calculated by the following formula: risk score = high-risk (n=185) and low-risk (n=185) groups using median risk score as the cut-off. The patients in the high-risk group have worse OS than those in the low-risk group (Fig 7a). As the risk score rising, the patients had a shorter survival time, more death events (Fig7b, c). The risk heatmap showed the differences of three genes (SLC1A5, ANGPTL4, and CGAS) (Fig 7d). We used GEO group for further external validation of this 3-gene-based signature. We got the same result as above (Fig7e-h). Next, reliability and stability of the three gene-based model were further con rmed.

Assessment of three FRGs Signature as Independent Prognostic Factor in GC Patients
To further con rm whether the newly generated risk score was an independent risk factor in GC patients, we employed UCR and MCR analyses, which showed that T, N, metastasis and risk score were independent prognostic factors for OS in GC (p < 0.001) (Fig 8b, c). To evaluate the diagnostic performance of risk model in GC, ROC curves were constructed. The under the ROC (AUC) of risk score (0.611) was much higher than that of age (0.571), gender (0.539), T (0.590), N (0.572), and metastasis status (0.547) (Fig 8a). All results illustrated that the three FRGs signature was an independent prognostic factor in GC.

Validation of Prognostic Performance of the FRGs Signature in GC
To further assess outcome prediction, we combined the validation datasets (total of 433 patients) to evaluate the robustness of the three-gene signature. The results revealed the ROC curve AUC = 0.676 for validation datasets (Fig 8d), which is similar to the one in the TCGA set. Cox regression analyses indicated that the risk score of the signature could be a powerful indicator of GC patient's clinical outcome (Fig 8e, f). Finally, the expressions of 3 signature genes were validated using the immunostaining results from the HPA database. As demonstrated in Fig 9, SLC1A5 was highly expressed in GC tissue, while ANGPTL4 and CGAS was downregulated in GC. Waterfall plot representing the mutant landscape of the top 30 in Fig 10. Interestingly, TTN [20], TP53 [21], MUC16 [22], and ARID1A [23] were the top mutations in both cohorts, which were involved in various biological processes. Besides, the frequencies of all mutated genes were higher in the high-risk group (96.74%) (Fig 10a) than in the low-risk group (84.75%) (Fig 10b), suggesting somatic mutation was positively correlated risk scores.

Immune Pro le in risk groups
Considering ferroptosis was strongly associated with immune status, the correlations between risk scores and immune status were further explored using CIBERSORT and TIMER to evaluate the immune cell features. As shown in Fig 10, Monocytes, Macrophages M2, Dendritic cells activated, Mast cells resting, and Neutrophils were up-regulated in the high-risk group, while T cells CD8, T cells CD4 memory activated, T cells follicular helper, and Macrophages M2 were signi cantly down-regulated (p < 0.05) (Fig 11). Moreover, the levels of riskscore were positively associated with Macrophages (r=0.366; p =6.846e-13) and T cells CD4(r=0.135; p =0.010) (Fig 12). The results showed strong correlations between ferroptosisrelated risk model and the immune state of GC.

Evaluation of pathways within Both High-and Low-Risk TCGA Cohorts
GSEA was performed to identify gene sets differentially expressed in high and low risk groups from the MSigDB databases (c2.cp.kegg.v6.2.symbols.gmt). The cell cycle, P53, MAPK, Ubiquitin mediated proteolysis, and TGF-β signaling pathways were among the most signi cantly correlated enriched pathways (Fig 13).

Discussion
Ferroptosis is a recently recognized form of regulated cell death that is characterized by lipid peroxidation [24,25]. Increasing evidence suggest that ferroptosis plays a powerful role in enabling malignancy features of tumor [26]. However, fewer studies reported on ferroptosis-related research in GC and the systematic analysis has not been elucidated yet.
In the present study, Results in this study 23 differently expressed FRGs between GC tissue and normal tissue were revealed by WCGNA and limma, and 5 FRGs out of them were of prognostic value. Three gene prognostic model (SLC1A5, ANGPTL4, and CGAS) was constructed in TCGA dataset, and validated in GEO test set.
Solute carrier family 1 member 5 (SLC1A5), also referred to as ASCT2, is a sodium channel which acts as a high-a nity glutamine transporter in tumor cells [27]. Inhibition of SLC1A5 impedes glutamine uptake, leading to disturbance of mTORC1 signaling and activation of autophagy and cancer cell growth [28,29]. Increased SLC1A5 expression has been documented in melanoma [30], neuroblastoma [31], and GC [32]. In precent research shown that miR-137 suppresses ferroptosis by targeting glutamine transporter SLC1A5 in melanoma [33]. Angiopoietin-Like 4 (ANGPTL4) is a member of the angiopoietin family and the members of which act as regulators of lipid and glucose metabolism [34]. ANGPTL4 is overexpressed in several types of cancers and are associated with poor clinical outcome [35,36]. In addition, ANGPTL4 induces ROS accumulation and subsequent ferroptosis, and high ANGPTL4 levels and anoikis resistance as novel and highly relevant oncogenic properties that can be targeted by inducing ferroptosis [37]. Cyclic GMP-AMP synthase (CGAS) is a cytosolic DNA sensor that activates innate immune responses. cGAS catalyzes the synthesis of cGAMP, which functions as a second messenger that binds and activates the adaptor protein stimulator of interferon genes (STING) to induce type I interferons (IFNs) and other immune modulatory molecules [38]. The expression of cGAS, which produces cGAMP for STING activation, facilitating the activation of antitumor CD8 + T cell responses [39]. Furthermore, 8hydroxy-2 -deoxyguanosine (8-OHG) functions as a damage-associated molecular pattern (DAMP) during ferroptotic cell death to trigger STING1-dependent macrophage polarization, supporting pancreatic cancer initiation and progression [40]. These results indicate that three genes were closely related to tumor ferroptosis.
Considering the pivotal role of ferroptosis in the progress tumor-invading immune cells of cancers, we explored the diverse immune cell in ltrations between low-and high-risk cohorts. A recent investigation documented that resting memory CD4 T cells as one of the most enriched tumor-invasion immune cell in GC samples [41]. Studies have reported follicular helper T cells in tertiary lymphoid structures of numerous tumors, implying that they participate in generating effective, as well as sustained antitumor immune responses [42]. M1 macrophages are linked to antitumor activity, whereas M2 macrophages are associated with cancer progression and metastasis [41]. Herein, in the high-risk group had an elevated level of M2 macrophage. In contrast, the patients in the low-risk group had elevated proportions of M1 macrophages. The GSEA collection found that cell cycle, P53, MAPK, Ubiquitin mediated proteolysis, and TGF-β signaling pathways were most enriched. Recent studies have demonstrated the signi cant role played p53 in the regulation of glucose metabolism, reactive oxygen species (ROS) responses and ferroptosis [43]. Acetylation-defective p53 mutants were shown to promote ferroptosis, an iron-dependent, oxidative and non-apoptotic form of cell death [43]. TGF-β1 is the most important cytokine of Epithelialto-mesenchymal transition (EMT). It has been reported that tumor cells in a high state of oxidative stress typically exhibit EMT, under which tumor cells may resist apoptotic cell death and increase their sensitivity to ferroptosis [44]. The mitogen-activated protein kinase (MAPK) signaling pathway has also been found to be involved in ferroptosis initiation [45]. Inhibiting MAPK signaling protects lung cancer cells against ferroptosis [46]. These GSEA results gave a detailed description of the ways and methods by which the three-gene signature participates in GC's progress, which may bene t future precision medicine research.Our study found a novel, robust FRGs signature for GC. the three-FRGs signature can effectively make GC patient prognosis evaluation and utilizing ferroptosis may be the therapeutic targets. However, further studies that verify these results are required. Declarations FW and HC contributed to the conception and design of the study; FW collected data and wrote the manuscript; CC performed the data analysis and constructed the gures and tables; WP and ZL reviewed and revised the manuscript and were involved in the conception of the study. Additionally, HC was responsible for the organization, revision and submission of this manuscript. All authors read and approved the nal manuscript.

Ethics approval and consent to participate
This was not applicable to this manuscript.

Consent for publication
Consent for publication was obtained from all participants.