Development and validation of a ferroptosis-related prognostic model in pancreatic cancer

Background: The purpose of this study was to identify ferroptosis-related genes (FRGs) associated with the prognosis of pancreatic cancer and to construct a prognostic model based on FRGs. Methods: Based on pancreatic cancer data obtained from The Cancer Genome Atlas database, we established a prognostic model from 232 FRGs. A nomogram was constructed by combining the prognostic model and clinicopathological features. Gene Expression Omnibus datasets and tissue samples obtained from our center were utilized to validate the model. The relationship between risk score and immune cell infiltration was explored by CIBERSORT and TIMER. Results: The prognostic model was established based on four FRGs (ENPP2, ATG4D, SLC2A1 and MAP3K5), and the risk score was demonstrated to be an independent risk factor in pancreatic cancer (HR 1.648, 95% CI 1.335–2.035, p < 0.001). Based on the median risk score, patients were divided into a high-risk group and a low-risk group. The low-risk group had a better prognosis than the high-risk group. In the high-risk group, patients treated with chemotherapy had a better prognosis. The nomogram showed that the model was the most important element. Gene set enrichment analysis identified three key pathways, namely, TGFβ signaling, HIF signaling pathway and the adherens junction. The prognostic model may be associated with infiltration of immune cells such as M0 macrophages, M1 macrophages, CD4 + T cells and CD8 + T cells. Conclusion: The ferroptosis-related prognostic model can be employed to predict the prognosis of pancreatic cancer. Ferroptosis is an important marker, and immunotherapy may be a potential therapeutic target for pancreatic cancer.


Introduction
Pancreatic cancer is a common malignant tumor of the pancreas, with a poor prognosis. The 5-year survival rate of pancreatic cancer is approximately 9% in the United States, and the median survival time in China is less than 7.8 months [1][2][3]. The morbidity and mortality of pancreatic cancer have increased in recent years [1]. Studies have shown that by 2030, pancreatic cancer will be the second leading cause of cancer-related deaths [4,5]. One of the primary causes of the poor prognosis of pancreatic cancer is delayed diagnosis. Due to the deep anatomical location of the pancreas and hidden clinical manifestations of these cancers in the early stages, most pancreatic cancers are detected in the advanced stages when the opportunity for surgery has passed [3]. In the past decade, the diagnosis and treatment of pancreatic cancer patients have been considerably improved with the development of genomics.
Ferroptosis is an iron-dependent nonapoptotic form of cell death that is characterized by excessive accumulation of lipid peroxides and reactive oxygen species (ROS) [6]. Ferroptosis is identified to be closely related to a series of different tumor types [6], such as hepatocellular carcinoma [7,8], prostate Chen-jie Qiu, Xue-bing Wang and Zi-ruo Zheng contributed equally to this work. cancer [9], breast cancer [10], and pancreatic cancer [11][12][13]. This type of cell death inhibits the proliferation of tumor cells and kills tumor cells as a new tumor inhibition pathway [14,15]. Recent studies [13] have determined that artesunate plays an anticancer role in pancreatic cancer by affecting ferroptosis. Another study [16] showed that piperlongumine was highly effective in treating pancreatic cancer, as it significantly increases intracellular ROS levels and induces ferroptosis and it can be inhibited by ferroptosis inhibitors and the iron chelator, deferoxamine. Numerous genes have also been identified as drivers, suppressors or markers of ferroptosis. Ferroptosisrelated genes (FRGs), such as HSPA5, can act as antitumor agents for gemcitabine by inducing ferroptosis [17]. In addition, knocking out or knocking down ATG5 and ATG7 inhibits ferroptosis in pancreatic cancer cells [18].
The development of sequencing technology has allowed the study of FRGs and clinical features to build prognostic models. Characterizing the molecular landscape of pancreatic cancer may help stratify patients by risk and adopt different treatment plans in a personalized manner to maximize the benefits for patients [19]. To assess the significance of FRGs for pancreatic cancer patients, we screened the FRGs that were significantly related to the overall survival (OS) of pancreatic cancer patients to construct a prognostic model and verify its reliability with external datasets. In addition, we analyzed the correlation between prognosis-related FRGs and clinicopathological parameters, and we constructed a nomogram based on clinicopathological parameters and a prognostic model. We also evaluated the correlation between risk score and immune cell infiltration, and we studied immune checkpoints to seek immunotherapy for pancreatic cancer.

Identification of differentially expressed FRGs (DE-FRGs)
Perl software was used to obtain the expression levels of FRGs in TCGA samples. Genes with p value <0.05 and |log FoldChange| > 1 were considered DE-FRGs. The DE-FRGs were visualized by the "volcano" and "heatmap" R packages. The functional enrichment analyses of DE-FRGs primarily involving Gene Ontology (GO) terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis were performed with the "clusterprofiler" R package. GO analysis included the functions of DE-FRGs in biological processes (BP), cellular components (CC), and molecular functions (MF). KEGG analysis demonstrated the pathway enrichment of DE-FRGs. The STRING database (https://string-db. org/) was utilized to construct a protein-protein interaction (PPI) network. Next, we input the interacting data derived from STRING to Cytoscape for visualization. The correlation between the DE-FRGs was calculated by Pearson's test and visualized by the "corrplot" R package.
Identification of prognostic-related genes and construction of a prognostic model Univariate Cox analysis was employed to identify FRGs related to the OS of pancreatic cancer patients. The OSrelated FRGs were included in least absolute shrinkage and selection operator (LASSO) analysis to be further screened. Multivariate Cox analysis was further used to identify independent risk FRGs associated with prognosis. Finally, a prognostic model was constructed via the linear combination of the expression levels of FRGs with the multivariate Cox regression coefficient (β) serving as the weight. The prognostic model was as follows: risk score = expression of gene1 × β1 + expression of gene2 × β2 + … + expression of gene n × βn. Patients were divided into a high-risk group and a low-risk group according to the median risk score. The Kaplan-Meier (KM) method was employed to describe the prognostic difference between the two groups. To evaluate the feasibility of the prognostic model constructed above, we incorporated risk score, age, sex, stage, grade, T stage and N stage into univariate and multivariate Cox regression analyses to determine independent risk factors. We also incorporated these factors into receiver operating characteristic (ROC) analysis to assess the accuracy and sensitivity of the model. The GSE62452 and GSE57495 datasets were selected to validate the prognostic model finally. X-tile software was used to determine the best cutoff value for OS.

Construction of the nomogram
The "rms" and "survival" R packages were used to construct the nomogram based on the Cox regression model. Calibration curves were described to assess the consistency between the actual and predicted 1-year, 2-year and 3-year survival to further evaluate the prognostic model.

Experimental verification
A total of 88 pancreatic cancer tissues were obtained from the Pancreatic Center of the First Affiliated Hospital of Nanjing Medical University. The tissues were frozen in liquid nitrogen after surgery was performed. The study was approved by the Ethics Committee of the First Affiliated Hospital of Nanjing Medical University (No. 2018-SR-344.A2). Total RNA was isolated with TRIzol (Takara, Japan) according to the manufacturer's instructions and subsequently reverse-transcribed into cDNA using PrimeScript RT MasterMix (Takara, Japan). Real-Time quantitative polymerase chain reaction (RT-qPCR) was performed with SYBR Green (Vazyme, China). The expression levels of ENPP2, ATG4D, SLC2A1 and MAP3K5 were normalized to those of 18S rRNA and calculated using the 2 −ΔΔCt method. The primers are shown in Table S1.

Gene set enrichment analysis (GSEA) in the high-risk group
Based on the criteria of p value <0.05 and |log FoldChange| > 1, differentially expressed genes between the high-and lowrisk groups were identified and used to perform enrichment analysis using Metascape (http://metascape.org/gp/index. html) to predict the potential mechanisms. In addition, gene set enrichment analysis (GSEA, http://www.broadinstitute. org/gsea) was applied to the Molecular Signatures Database (v7.2), including hallmark (H) gene sets and curated (C2) gene sets (BIOCARTA and KEGG), to identify enriched pathways related to the high-risk group. The enriched gene sets were obtained based on p-values <0.05 and false discovery rate (FDR) values <0.25.

Relationship between immune cell infiltration and risk score
The "Estimate" R package was utilized to calculate the immune score, stromal score and estimate score of pancreatic cancer samples of TCGA. The "CIBERSORT" R package was used to evaluate 22 types of immune cell infiltration. The immune cell infiltration and somatic number alteration (SCNA) were also assessed by Tumor Immune Estimation Resource (TIMER, https://cistrome. shinyapps.io/timer/) based on six immune cells, including B cells, CD8 + T cells, CD4 + T cells, macrophages, neutrophils and dendritic cells. Finally, we evaluated the relationship between risk score and classical immune checkpoints, such as PDCD1, CTLA4 and CD274, with Pearson's test.

Identification and enrichment analysis of DE-FRGs
The flow chart of the study is shown in Fig. 1. The gene expression profiles and clinical information of 182 cases, including 178 tumor cases and 4 normal cases, were downloaded from TCGA database. Patients with a survival time ≤ 30 days were excluded from subsequent analysis. The clinical information of pancreatic cancer patients is presented in Table S2. A total of 232 FRGs were selected from the FerrDb database. After extracting the gene expression of FRGs from TCGA samples, we performed the differential analysis. According to the standards mentioned above, 18 DE-FRGs were acquired by the Wilcoxon test with 10 upregulated DE-FRGs and 8 downregulated DE-FRGs ( Fig. 2a-b). Next, we performed GO and KEGG pathway analyses to further study the biological functions of ferroptosis in the development of pancreatic cancer. GO analysis indicated that these DE-FRGs were enriched in biological processes related to ferroptosis, such as cellular response to oxidative stress, chemical stress, toxic stress, response to oxidative stress and liver development. The cellular components and molecular functions of these DE-FRGs are shown in Fig. 2c. KEGG pathway analysis suggested that most of the DE-FRG pathways were significantly linked to ferroptosis, leishmaniasis, HIF-1 signaling pathway, fluid shear stress, atherosclerosis and phagosomes (Fig. 2d). Using STRING and Cytoscape software, we constructed a PPI network for these DE-FRGs (Fig. 2e). The correlation of these DE-FRGs is presented in Fig. 2f.

Construction of the FRG-based prognostic model
All FRGs were included in the univariate Cox regression analysis for construction of the prognostic model. The results of the univariate Cox analysis showed that 13 FRGs were associated with the prognosis of pancreatic cancer patients (Fig. 3a). The relative expression levels of the 13 genes in tumor and normal samples are shown in Fig. 3b. The 13 genes were subsequently input into LASSO regression to screen the genes and effectively avoid overfitting (Fig. 3c, d). Finally, the selected genes were analyzed by multivariate Cox regression analysis, and four genes were obtained as follows: Risk score = (−0.30685*ENPP2) + ( − 0 . 8 5 2 6 3 * A T G 4 D ) + ( 0 . 1 1 9 4 4 6 * S L C 2 A 1 ) + (0.636851*MAP3K5).
The median risk score was applied to divide the pancreatic cancer patients into a high-risk group or a low-risk group. Figure S1A shows the risk score of each pancreatic cancer patient and Fig. S1B shows the relationship between the risk score and survival time. Figure S1C shows the gene expression profiles of the four genes in the high-risk group and lowrisk group by heatmap.

Prognostic model as an independent risk factor
We evaluated the clinical paramrters, pathological parameters and risk score for prognostic value by univariate and multivariate analyses. Figure 4a shows that age, N stage and risk score were associated with the OS, and Fig. 4b shows that the risk score was an independent risk factor (HR 1.648, 95% CI 1.335-2.035, p < 0.001). In addition, the KM curve showed that the high-risk group had a poorer OS rate than the low-risk group (Fig. 4c). ROC analysis was performed to explore the sensitivity and specificity of the model. The area under the curve (AUC) was 0.673, which was higher than that of other clinical and pathological parameters (Fig. 4d). We also explored the prognostic value of risk score for pancreatic cancer patients treated with chemotherapy. Patients in the high-risk group treated with chemotherapy had a better survival outcome, but those in the low-risk group who underwent chemotherapy did not (Fig. 4e, f).
In addition, the patients were stratified based on clinicopathologic parameters for further survival analysis. In almost all stratified analyses, this risk model achieved good performance, except for stages III/IV and T1/2, which may be attributed to the low number of cases (Fig. S2).

Correlation between prognostic model and clinicopathological characteristics
Based on the association of these FRGs with OS in pancreatic cancer, we further examined the correlation between the gene expression levels of FRGs and clinicopathological characteristics, such as age (≥ 65/< 65), sex (male/female), grade (G3/4 and G1/2), stage (stages III/ IV and stage I/II), T stage (T3/4 and T1/2) and N stage (N1/N0). Only ATG4D was significantly associated with stage and N stage (Fig. S3A-B), and the risk score was closely related to T stage (Fig. S3C).

Construction of the nomogram for prediction of prognostic risk
To better determine the survival risk of pancreatic cancer patients, w e integrated t he prognostic m odel and  (Fig. 5a). Compared to the actual 1-year, 2-year and 3-year survival rates, the calibration curve showed that when the survival rates predicted by the nomogram achieved good agreement (Fig. 5b-d).  4 The prognostic model is significantly associated with the survival of pancreatic cancer. a The risk score was associated with the prognosis of pancreatic cancer analyzed by univariate Cox regression analysis. b The prognostic model was an independent factor for pancreatic cancer. c Kaplan-Meier curve of pancreatic cancer patients stratified by the median risk score. The high-risk group had a poorer OS rate than the low-risk group. d ROC analysis of the prognostic model and other clinicopathologic parameters

Validation of the prognostic model
To validate the prognostic model, a total of 88 pancreatic cancer tissue samples from our center were used for further experimental verification. The KM curve showed that the OS of the low-risk group was significantly better than that of the high-risk group (Fig. 6a). Figure 6b shows that the AUC of the prognostic model was 0.647, and Fig. 6c shows that higher risk scores indicated lower survival times. The heatmap of the 4 genes is also shown in Fig. 6c. The nomogram was validated, and the 1-year and 1.5-year calibration curves are shown in Fig. 6d, e.
In addition, GSE57495 and GSE62452 datasets were downloaded for further assessment. Figure S4 validates the prognostic model in GSE57495 and GSE62452.

Enrichment analyses in the high-risk group
To further study the enriched pathways in the high-and lowrisk groups, the differential analysis was performed and the differentially expressed genes were input into the Metascape online tool (Fig. 7a). The enrichment analysis showed that the genes were enriched in several biological functions, including chemical synaptic transmission, regulation of ion transport, regulation of system process, neuronal system and inorganic cation transmembrane transport (Fig. 7b-d). In addition, GSEA showed that the pathways enriched in the high-risk group were TGFβ signaling, HIF signaling pathway and the adherens junction ( Fig. 7e and Table S3).

Differences in the immune landscape between the high-and low-risk groups
We assessed the infiltration degree of 22 types of immune cells in the high-and low-risk groups, and we calculated the immune score and immune microenvironment score. Figure 8a shows the immune cells, immune score, stromal score and estimate score in pancreatic cancer patients. Importantly, the risk score was significantly associated with immune score. Figure 8b shows that some immune cells differed significantly between high-and low-risk groups, indicating that immune cells play an important role. M0 and M1 macrophages showed higher infiltration in the high-risk group. To further examine the relationship between the risk score and immune cell infiltration, the four risk genes were first studied by TIMER. ENPP2 expression was associated with B cells, CD8 + T cells, CD4 + T cells, macrophages, neutrophils and dendritic cells, and the other three genes were also related to some of the immune cells (Fig. 8c). Figure 8d-i reveals that the risk score was negatively correlated with CD4 + T cells (p = 0.009, R = −0.199) and positively correlated with CD8 + T cells (p = 0.033, R = 0.163). In addition, the relationship between the SCNA of the four genes and immune cells was explored by TIMER. Figure S5 shows that the SCNA may result in changes in immune infiltration levels of B cells and CD4 + T cells.

Relationship between immune checkpoints and risk score
Immunotherapy plays an important role in pancreatic cancer. In this study, we evaluated the relationship between risk score and classical immune checkpoints, such as PDCD1, CTLA4 and CD274. As shown in Fig. 9, immune checkpoints were correlated with risk score, and the four genes included in the prognostic model were also correlated with risk score. Figure 9c indicates that CD274 may be a key immune checkpoint in pancreatic cancer treatment.

Discussion
In this study, we first obtained FRGs through the FerrDb ferroptosis database and analyzed their expression levels from the TCGA database. Univariate Cox regression analysis showed that 13 FRGs were associated with the prognosis of pancreatic cancer. A signature of four genes was established by LASSO regression and multivariate Cox analysis. The risk score of each patient was calculated by multiplying the risk coefficient by the expression level of the four genes. Through survival analysis, we observed that the prognostic model achieved good performance for all cohorts or stratified analysis, except stages III/IV and T1/2. The low performance in stages III/IV and T1/2 may be due to the small number of cases. In addition, univariate and multivariate Cox analyses showed that the risk score is an independent risk factor for pancreatic cancer. We further studied the correlation between the prognostic model and clinicopathological parameters, and the risk score was demonstrated to be highly associated with T stage. Finally, two independent GEO datasets and 88 pancreatic cancer samples in our center were selected to validate the prognostic performance.
At present, few studies have reported prognostic models of FRGs. These models have been reported only in liver cancer [20], clear cell renal cell carcinoma [21] and glioma [22]. However, the sources of the FRGs that these studies employed were only from a comprehensive literature survey, which may lead to the loss of FRGs. For example, Liang el al. [20] identified 60 FRGs from the previous literature and conducted a 10-FRG signature to predict the prognosis of liver cancer.
The four FRGs in the prognostic model were identified in FerrDb, and they were classified into the following three categories: markers (SLC2A1 and MAP3K5), suppressors (ENPP2) and drivers (ATG4D). SLC2A1 encodes the GLUT-1 protein, a glucose transporter member [23]. Previous studies have indicated that GLUT-1 is a prognostic protein for pancreatic cancer patients [24][25][26] as it is highly expressed in pancreatic cancer and promotes the proliferation of pancreatic cancer cells [24]. LSH interacts with WDR76 to inhibit ferroptosis by activating the lipid metabolism-associated gene, GLUT-1 [27]. ENPP2, termed ATX, produces the lysophosphatidic acid (LPA) signaling molecule as a secreted enzyme, and the lipid metabolic reaction may be associated with the development of colorectal cancer [28,29]. ENPP2/LPA signaling significantly inhibits ROS generation and ferroptosis in cardiomyocytes by regulating GPX4, ACSL4 and NRF2 [30]. According to microarray analysis, the expression of ATX is elevated in pancreatic cancer, and serum ATX activity might be a useful marker [31,32]. Many autophagyrelated genes, including ATG4D, have been identified as positive regulators of ferroptosis by RNAi screening and Fig. 7 Enrichment analysis in the high-risk groups. a Heatmap of differentially expressed genes between high-and low-risk groups. b Bar graph of the enriched terms constructed by Metascape and colored by p value. c Colored by cluster ID, in which nodes that share the same cluster ID are typically close. d Colored by p value, in which terms containing more genes tend to have a more significant p value. e Gene set enrichment analysis (GSEA) of the high-risk patients based on the prognostic model, including TGFβ signaling, HIF signaling pathway and the adherens junction genetic analysis, and autophagy has been shown to play a crucial role in ferroptosis [33]. ATG4D is regulated by SNHG14 and miR-101 in gemcitabine resistance and cell viability in pancreatic cancer. MAP3K5, also known as ASK1, is a serine/threonine kinase that plays an important role in the MAP kinase signal transduction pathway [34]. ASK1 is activated by cytotoxic stresses, such as lipopolysaccharide (LPS) and ROS [34]. ASK1 acts as an oncogene in the development of pancreatic cancer by upregulating cyclin E, and it is also critical for LPS-induced p38 activation and cytokine production [35]. Previous studies have shown that cold stress induces ferroptosis and the ASK1-p38 MAPK pathway, which is a regulator of ferroptosis downstream of lipid peroxide [36]. Based on the DE-FRGs between the normal and tumor groups, we performed KEGG pathway analysis, which indicated that the HIF-1 signaling pathway was enriched. At the same time, GSEA suggested that the HIF signaling pathway was enriched in the high-risk group. The HIF-1 signaling pathway plays an important role in the carcinogenesis and progression of pancreatic cancer, and it promotes ferroptosis sensitivity in cancers [37,38]. Jiang et al. [27] found that LSH interacts with WDR76 to inhibit ferroptosis and that LSH was activated by EGLN1 and c-Myc by inhibiting HIF-1a. Yang et al. [39] showed that the degradation of ARNTL promotes ferroptosis by inducing the expression of EGLN2, thereby destabilizing HIF-1a.
We constructed a nomogram to predict the prognosis of pancreatic cancer patients. The nomogram, which included age, sex, grade, stage, T stage, N stage and risk score, showed that the risk score was the most important variable. Moreover, the calibration plot showed that actual survival and predicted survival achieved good agreement, indicating that the prognostic model accurately assesses prognosis.
The prognostic model not only was associated with the OS but also affected the immune infiltration level of pancreatic cancer patients. Based on the CIBERSORT and ESTIMATE algorithm, the infiltration level of M0 and M1macrophages was increased in the high-risk group. Previous studies have shown that PD-L1 expression is positive in approximately 30% of pancreatic cancer patients [40]. Sporadic clinical evidence has suggested that PD-1/ PD-L1 monotherapy is almost ineffective in pancreatic cancer patients [41]. Recent studies have found that PD-1/PD-L1 inhibitors combined with other immunotherapies, such as galectin-9 antibodies [42], can significantly inhibit the growth of pancreatic cancer cells. Hence, targeted therapy and immune checkpoints therapy for pancreatic cancer will important topics in the future.
The present study had several limitations. First, this study was a retrospective study, and the model should be validated with prospective data. Second, we did not include M stage in the nomogram due to missing M stage data.

Conclusion
We built and validated a prognostic model of four ferroptosisrelated genes. A nomogram that included the risk score and clinicopathological parameters was also established to predict the OS of pancreatic cancer patients. Immune checkpoint inhibitors may be a potential therapy for pancreatic cancer.