The Novel Ferroptosis-related Gene Markers That Can Predict the Survival in Gastric Cancer Patients

Background: Gastric cancer(GC) refers to malignant tumor that derived from gastric epithelial cells. Ferroptosis is another programmed cell demise mode that is Fe-dependent, unique concerning apoptosis, cell necrosis, and autophagy. Current research demonstrates that ferroptosis assumes a basic part of cancer biology. Extracellular matrix(ECM) has been conrmed to play an essential role in the proliferation, apoptosis, metabolism and differentiation of tumor cells. As an important component of the tumor microenvironment, ECM interacts with the immune microenvironment and affects tumor development and progression. Methods: GC data were downloaded from the TCGA database. Furthermore, 259 ferroptosis-related genes were acquired with the FerrDb database. COX regression analysis was used to screen ferroptosis-related genes related to GC's prognosis, and these genes constructed the prediction model. The risk score of the model and clinical data of GC were further analyzed to get the correlation between the model and the overall survival(OS) rate and clinicopathological features. Finally, GO and KEGG enrichment analysis was carried out on the genes of the model. To further analyze the correlation between the genes in the model and tumor immunity, ssGSEA was used to score immune cells and calculate immune-related pathways' activity quantitatively. Results: A prognosis model was constructed according to the 11 ferroptosis-related genes related to prognosis to predict the prognosis of GC patients better. According to univariate and multivariate, risk score can be regarded as an independent predictor. Conclusions: we identied 11 ferroptosis-related genes (NOX4, NOX5, SLC1A5, GLS2, MYB, TGFBR1, NF2, ZFP36, DUSP1, SLC1A4, SP1), which affected the prognosis of GC patients.


Background
Gastric disease (GC) is a regular malignant tumor beginning starting with the gastric mucosal epithelium.
Worldwide, GC's incidence is at the forefront of all cancers, with the second-highest mortality rate [1] . Due to differences in China's dietary structure and lifestyle, GC is the second most common cancer in China and has a high mortality rate [2] . At present, more and more studies construct disease prediction models by screening multiple genes [3] . With the extensive use of high-throughput sequencing in prognosis research, various bioinformatics databases such as TCGA provide a useful data platform for constructing prediction models. However, considering that GC's therapeutic strategies are still limited, it is still necessary to develop new prognostic models.
Ferroptosis is a Fe-dependent, new kind about programmed cell demise. It is characterized by the increase of Fe-dependent lipid reactive oxygen species. Its concept was rst introduced in 2012 [4] . In subsequent studies, it was found that glutathione peroxidase 4(GPX4) mediated signaling pathway played an essential role in ferroptosis, and FSP1 was also found to play a similarly important role as GPX4 in ferroptosis [5][6] . Further studies have revealed the regulation of cadherin-regulated intracellular interactions on ferroptosis [7] .In cancer, ferroptosis may have cancer inhibiting function and can be used to treat cancer [8][9] . With the study revealing that CD8+T cells can regulate tumor cells' ferroptosis process through IFNγ [10] , a link has been established between ferroptosis and tumor immunity. The correlation between GC and ferroptosis has been covered by some studies [11][12][13][14] . Several genes may be included for regulating ferroptosis in GC and in uence the progression from GC, but the correlation with OS rate in GC patients has not been clari ed.
ECM is an essential component of the tumor microenvironment, and its related characteristics and components play an important role in the proliferation, apoptosis, metabolism, and immune response of tumor cells [15] . In cancer, the remodeling of ECM and changes in the immune environment directly or indirectly affect tumor occurrence and development. Among them, macrophages [16][17][18] , NK cells [19][20] and T cells [21][22] have been con rmed to guide the changes of the immune microenvironment through ECM, thereby affecting the tumor cells. In addition, as important components of ECM, collagen and broblasts are also involved in tumor immune changes [23][24] .
This study downloaded mRNA expression pro les and relating clinical information of GC patients from a public database and obtained ferroptosis-related genes. A prognostic model with differentially expressed genes related to ferroptosis was constructed and veri ed. The potential mechanism was further explored through functional enrichment analysis. Finally, we identi ed 11 ferroptosis-related genes that affected GC's prognosis and established the risk scoring model.

Data download
The mRNA expression data (RNA-seq) of 407 patients with GC and clinical data were downloaded from the public database of TCGA (https://portal.gdc.cancer.gov/) updated on September 16, 2020. Data on 259 ferroptosis-related genes were obtained by downloading the FerrDb database (http://www.zhounan.org/ferrdb/).

Obtaining ferroptosis-related genes with prognostic value
Based on the FDR < 0.05, the downloaded TCGA data were standardized using the "Limma" R package in the R programming environment to obtain the DEGs between GC tissues and normal paracancerous tissues, and to extract the ferroptosis-related genes from the DEGs. The OS was determined by univariate COX analysis to screen genes with prognostic value. Then, in the R programming environment, the Venn diagram was drawn using the "Venn" R package, and the intersection genes of the DEGs and the prognostic genes were taken to obtain the ferroptosis-related genes with prognostic value. These genes were analyzed using the "pheatmap" and "survival" R packages to produce heat maps and forest maps.
Besides, the PPI network was obtained through the STRING database [25] , and the correlation network was plotted using the "Igraph" R package.

Construct a prognosis model
The LASSO-penalized Cox regression analysis [26][27] was applied to chosen the genes related to survival and prognosis selected by univariate COX analysis to ensure the best tting error and obtain the lowest variables risk of over tting to improve the accuracy of the model. The penalty parameter(λ) of the model is obtained by the self-contained "cv.glmnet" function in the "glmnet" R package. With the expression matrix of the screened candidate prognostic DEGs as an independent variable and the overall survival time and status as the response variables, the multivariate COX analysis was carried out to construct the prognosis model. The risk score was computed in light of each gene's expression level and the corresponding correlation coe cient. Formula: risk score =∑(expression level of each gene×correlation coe cient of that gene, coef). As stated by the median value of risk values, those patients were divided into high-risk and low-risk groups.

Survival analysis
The "Survminer" R package was used in the R programming environment to determine the optimal cut-off expression value, and the Kaplan-Meier survival curve was drawn by the "Survival" R package (P<0.05). The ROC curve was drawn by "SurvivalROC" R package, and the area under the AUC curve is used to evaluate the model's predictive value. Also, to investigate the distribution of diverse groups, the "stats" R package and "Rtsne" R packages were utilized to PCA and t-SNE analysis.

Functional enrichment analysis
The "clusterPro ler" R package was used for GO and KEGG analysis of DEGs. The ssGSEA [28] was used to score 16 immune cells' in ltration and calculate 13 immune-related pathways' activities.

Statistical analysis
Kaplan-Meier analysis was used to assess the OS between different groups. To determine independent predictors of OS, univariate and multivariate Cox regression analyses were used. Assess the relationship between clinical data and risk scores. The ssGSEA scores of immune cells or pathways among different groups were compared by Mann-Whitney test (P < 0.05). All relevant statistical analysis and mapping work was performed using the above R software (Version 4.0.2) and R packages.

Result
The ow of this study is shown in Figure 1. We downloaded what added up to 407 GC patients from the TCGA database, including 375 GC samples and 32 normal para-cancer samples. The detailed clinical features of these patients are shown in Table 1.

Acquisition of prognostic ferroptosis-related DEGs
What added up to 259 ferroptosis-related genes were extracted from the FerrDb database( Figure 2a). Among them, 170 genes showed signi cant expression differences between GC samples and normal para-cancer samples, and 18 prognostic genes were screened out by univariate COX regression analysis (FDR<0.05, Figure 2). The heat map and forest map of the 18 genes are shown in gures 2b-c. The PPI network of these 18 genes was constructed in the STRING database, and it was found that CAV1, DUSP1, SP1, and NOX4 were the critical genes in this PPI network (Figure 2d). In addition, the correlation network of the 18 genes is indicated done Figure 2e.

Construction and validation of the prognostic model
Those 18 prognostic ferroptosis-related genes were further screened by LASSO regression analysis. The optimal λ value was calculated, and the corresponding 11 genes with the most reduced cross-validation lapse were chosen ( gures 3a-b). Finally, a model was constructed based on the 11 genes. The risk score has been computed dependent upon each gene's expression level and the corresponding correlation coe cient (coef) ( Table 2), and the patients were partitioned under a high-risk group and a low-risk group with the risk values' median value. (Figure 4a). The survival prognosis of the low-risk group is higher than that of the high-risk group through the Kaplan-Meier survival curve analysis(p=1.651e-04) (Figure 4b).
The area under the AUC curve in the ROC curve was used to evaluate the model's prediction reliability, and as shown in Figure 4c, the model had good sensitivity. PCA and t-SNE analyses showed that patients' distribution in the two risk groups was in two directions (Figure 4d-e). Besides, patients with high-risk died earlier than the low-risk group (Figure 4f).

Analysis of the independent prognostic value of the model
To determine whether the established prognosis model can be used as a prognostic factor independent of other clinical features, univariate and multivariate COX regression analysis was used for veri cation. The results showed a signi cant correlation between risk scores and OS in both univariate and multivariate Cox regression analyses (HR=3.627, 95% CI=2.256-5.829, p<0.001, Figure 5a; HR=4.176, 95% CI=2.588-6.736, P<0.001, Figure 5b). Previously summary, the risk score of the prognostic model we constructed can be used as an independent prognostic factor.
Function analysis of the prognosis model GO and KEGG analyses of DEGs between the two risk groups were performed to dissect related biological functions and pathways. Those outcomes indicated that the DEGs were signi cantly enriched in extracellular matrix (ECM) related biological processes, including ECM organization, collagen-containing ECM, focal adhesion, ECM structural constituent, collagen binding, ECM binding (P. adjust<0.05, Figure   6a). KEGG pathway analysis showed that these DEGs were also enriched in ECM related pathways, such as proteoglycans in cancer and ECM-receptor interactions (P. adjust<0.05, Figure 6b). Besides, these DEGs are enriched in the muscle system.
Tumor-targeted immunotherapy is a hot topic nowadays. ECM has been speculated as a part of health and homeostasis, plays a critical immunomodulatory role [29] , and the synergistic antitumor effect between ECM and the immune system will be the focus of future research. Therefore, ssGSEA is used to quantify the enrichment scores of different immune cell subsets, related functions, or pathways further to explore the correlation between risk scores and immune status. The results showed signi cant differences in mast cells, neutrophils, T helper cells, and Th2 cells between these two different risk groups (P. adjust<0.05, Figure 7a). In the KEGG analysis, except for the fact that MHC class I had a lower score in the low-risk group, others including APC co-stimulation, cytokinecytokine receptor interaction (CCR), type IFN response, parain ammation, and type IFN response were all scored higher in the high-risk group (P. adjust<0.05, Figure 7b).

Discussion
In the present study, we analyzed 259 ferroptosis-related genes in GC tissues and their relationship with OS. More than a large portion of the ferroptosis-related genes (65. 6%) were differentially expressed in GC tissues and adjacent non-tumor tissues. Also, 18 genes are associated with OS, which also indicates, to some extent, the possibility part of ferroptosis in GC. Therefore, we then constructed a new prognostic model containing 11 ferroptosis-related genes and further validated the prognostic model.
The prognostic model constructed in this study included 11 ferroptosis-related genes (NOX4, NOX5, SLC1A5, GLS2, MYB, TGFBR1, NF2, ZFP36, DUSP1, SLC1A4, SP1). NOX4 can catalyze the molecular oxidation-reduction to reactive oxygen species (ROC). Also, ROC assumes a crucial part in the ferroptosis [30] . In melanoma, miR-137 can negatively regulate cells' ferroptosis by directly targeting the glutamine transporter SLC1A5 [31] . GLS2 is an essential transporter of glutamine to glutamic acid. In GC, physcion8-O-β-glucopyranoside (PG) plays a Fe-promoting and anti-tumor role in vivo and vitro by regulating the miR-103a-3p/GLS2 axis [13] . One of the characteristics of ferroptosis is the loss of GPX4 activity. Studies have shown that c-Myb transcription regulates cysteine dioxygenase 1(CDO1) and upregulates the expression of GPX4 by inhibiting the expression of CDO1 [11] . Ferroptosis can be regulated by non-cell-autonomously through cadherin-mediated intercellular interactions, mediated by the activation of intracellular NF2 and Hippo signaling pathways, thereby inhibiting ferroptosis [7] . In the study of intestinal ischemia-reperfusion injury, inhibition of ACSL4 before reperfusion protects the process of ferroptosis, during which Sp1 can serve as a critical transcription factor that binds to the ACSL4 promoter region and improves the transcription of ACSL4 [32] . To sum up, in the prognostic model we constructed, the six genes(NOX4, SLC1A5, GLS2, MYB, NF2, and SP1) have been con rmed to participate in the process of ferroptosis, while the ve genes (NOX5, SLC1A4, TGFBR1, ZFP36, and DUSP1) have not been con rmed to be related to ferroptosis. In conclusion, these 11 genes have been a rmed to be related to poor prognosis in our prognostic model. Nevertheless, whether these genes affect the prognosis of patients with GC through the process related to ferroptosis has not been studied in depth. GLS2 alone has been studied for the correlation between GC and ferroptosis. In our GO and KEGG analysis results, the enrichment areas of biological processes and pathways are in the extracellular matrix. The research by Brown et al. [33] proposed that ECM separation of cells lacking GPX4 was the physiological trigger for ferroptosis. Studies have shown that in tumors, porcine bladder ECM has anti-tumor cell activity, and this function is not directly acting on tumor cells but mediated by immune cells [34] . Badylak [29] explored the future research prospect between ECM and tumor immunity and speculated that the presence of individual-speci c signaling molecules in ECM would cause the anti-tumor phenotype of immune cells them possess anti-tumor cell activity. In the further immune-related analysis, many immune cells and immune-related functional pathways were found to be statistically different, which also provides a theoretical basis for us to speculate that ferroptosis may be potentially related to tumor immunity. However, the current study on the potential relationship between tumor immunity and ferroptosis is still preliminary. The proportion of mast cells and neutrophils was higher in the high-risk group. Mast cells exist as immune cells in almost all vascularized tissues, and in tumors, they can affect tumor development, tumor tissue angiogenesis, and the adaptive immune response of tumors [35] . The increase in mast cell density in GC is accompanied by an increase in the release of related angiogenic and lymphangiogenic factors to promote tumor cells' proliferation [36] . Neutrophils are well known in infections of the organism. In tumors, tumor-associated neutrophils can induce angiogenesis, ECM remodeling, metastasis, and immunosuppression [37] . The function of these immune cells in the tumor microenvironment allows us to speculate whether the cells release special signals during the ferroptosis, so that mast cells and neutrophils promote angiogenesis and further promote the proliferation metastasis of tumor cells. Besides, we found noteworthy contrasts in the antigen presentation procedure between those two risk groups, with a higher risk score associated with impaired anti-tumor immunity, including the activity of type I and type II IFN responses. It has been speculated that antigen-presenting cells (APCs) are attracted to dead cells' sites during the ferroptosis [38] . Study has found that cancer cells that induce the death of immunogenic cancer cells (ICD) can coordinate "change self-mimicry" at the level of nucleic acid or chemokine, leading to type I IFN response [39]. In conclusion, the changes in the immune environment and weakened anti-tumor immune function in the high-risk group of GC patients may lead to poor prognosis of GC patients.
This study successfully constructed a prognostic model for GC and veri ed its reliability, but this model's data were retrospective. Besides, due to public databases' limitation, we did not use data from multiple public databases to further verify the constructed model. Finally, the link between the risk score in the model and the ECM as well as the immune system has not been experimentally veri ed.

Conclusions
This study investigates 11 ferroptosis-related genes(NOX4, NOX5, SLC1A5, GLS2, MYB, TGFBR1, NF2, ZFP36, DUSP1, SLC1A4, SP1) were identi ed, and a new prognostic model for GC was constructed. The model has been further con rmed to be capable of predicting the prognosis of patients with GC and has provided a potential new research direction for predicting GC's prognosis. Besides, the potential relationship between these genes and ECM as well as tumor immunity is also the subject of further research in the future.

Consent for publication
All authors agree to publish the paper.

Availability of data and materials
The data used to support the ndings of this study are included within the article.

Competing interests
The authors declare that they have no competing interests.    Identi cation of ferroptosis-related genes associated with prognosis. a. Venn diagram: 18 genes related to prognosis and differentially expressed in GC tissues and normal paracancerous tissues were screened out. b. expression of 18 candidate genes in tumor tissue and normal paracancerous tissue. c. forest plots: the relationship between gene expression and OS was shown by univariate Cox regression analysis. d.
the PPI network of 18 candidate genes. e. the correlation network of 18 candidate genes.

Figure 3
Through lasso regression analysis, 11 ferroptosis-related genes were identi ed to construct the prognosis model. a. LASSO coe cient of 18 candidate genes. b. determine the penalty parameter (λ) in the model, and the dashed vertical line is drawn at the optimal value of the tting error.

Figure 4
Prognostic analysis of a model constructed from 11 candidate genes. a. distribution and the median value of risk scores. b. Kaplan-Meier survival curves for patients in the high-risk and low-risk groups. c.
ROC curve of the model, where AUC veri es that the sensitivity of the model is acceptable. d. PCA plot of the model. e. t-SNE analysis of the model. f. distribution of OS status and risk scores.