A Novel Ferroptosis-related 6-Gene Signature for Overall Survival Prediction in Patients with Thyroid carcinoma

Background: Ferroptosis is a new form of regulated cell death (RCD) that plays a crucial role in the genesis and prognosis of tumor. Nevertheless, the relationship between ferroptosis and the prognosis of thyroid carcinoma (THCA) remains unclear and needs to be explored. Methods: By analyzing data from the THCA cohort in the TCGA database, ferroptosis-related differentially expressed genes (DEGs) with prognostic value were identied, which were used to establish a prognostic signature based on Lasso-penalized Cox regression analysis. Then, the model was testied with Kaplan-Meier survival, Cox regression and receiver operating characteristic (ROC) analyses based on overall survival (OS). Finally, DEGs between the low-risk and high-risk groups were identied and used to conduct GO enrichment analysis, KEGG pathways analysis and immune inltration analysis. Results: A 6-gene signature was constructed which including DPP4, GPX4, GSS, HMGCR, TFRC and PGD. The area under the curve (AUC) were 0.890 (1 year), 0.863 (2 years) and 0.883 (3 years) which validated the prominent predictive capacity of the model. Multivariate Cox regression certied the model as a prognostic-related independent predictor for OS. Conclusion: In this study, we established an innovative prognostic signature of 6 ferroptosis-related genes which can be as a prognostic-related independent predictor for OS in THCA, while the potential mechanisms was still unclear and needed further exploration.

Reports have indicated that the regulatory mechanisms of ferroptosis may involve the GPX4, SLC7A11, p53 and Non-Coding RNAs [14]. Meanwhile, more and more ferroptosis-related genes have been found, and scientists were keen on exploring the function of ferroptosis-related genes in cancer in order to nd more biological factors with diagnostic, prognostic or therapeutic value. Liang et al. structured and validated a signature of ferroptosis-related genes for overall survival (OS) to predict the prognosis of hepatocellular cancer patients [15]. Another study [16] suggested that low expression of ferroptosisrelated NCOA4 gene was associated with poor prognosis, disease progression and impaired in ltration of immune cells in clear cell renal carcinoma. However, few studies focused on the role of ferroptosis in malignant tumor of endocrine-system, especially thyroid cancer.
In our study, we sought to explore the potential relationships between ferroptosis and THCA prognosis based on the Cancer Genome Atlas (TCGA) datasets. We identi ed prognostic ferroptosis-related genes and structured a ferroptosis-related gene model for OS. Then, bioinformatics analyses were performed to explore the potential mechanism. This study can offer potential clinical utility of ferroptosis and great promise for prognosis prediction and targeted therapy of THCA.

Materials And Methods
Retrieving mRNA sequencing and corresponding clinical data MRNA sequencing data (HTSeq-FPKM) and corresponding clinical information of THCA and normal thyroid tissue samples were downloaded from the THCA cohort in the TCGA database (https://portal.gdc.cancer.gov/) up to December 5, 2020 using the "TCGAbiolinks" R package [17]. Then the scale method form the "limma" R package [18] was used to normalize the gene expression pro les. No ethical approval from local ethics committees was required because all the data utilized in our study were obtained from public database.
Construction and further analysis of the prognostic ferroptosis-related gene model. 60 ferroptosis-related genes (provided in Supplementary Table S1) were obtained from the high-quality articles gathered by the comprehensive literature survey [14,[19][20][21], in which differentially expressed genes (DEGs) between the THCA tissues and non-tumorous tissues were identi ed with a |log2FC| ≥ 1 and false discovery rate (FDR) < 0.05 using the "limma" R package. Prognostic ferroptosis-related DEGs were screened out using the univariate Cox analysis of OS. LASSO Cox regression analysis was used to establish a prognostic ferroptosis-related gene model with the "glmnet" R package [22][23][24]. Based on the prognostic model, all patients were assigned with risk scores according to the established score formula (risk score = sum (expression level of each gene * corresponding coe cient) and grouped into low-risk group or high-risk group by comparing to the median risk score. Next, principal component analysis (PCA) and t-distributed stochastic neighbor embedding (t-SNE) analysis were performed to show the distribution of patients. The Kaplan-Meier survival analysis and the receiver operating characteristic (ROC) curve were used to evaluate the predictive value of OS for the signature and the Cox analyses was conducted to identify the independent predictive factor. Following, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed by the "clusterPro ler" R package [25] (P-value < 0.05, gene count > 10) based on the DEGs between the low-risk and high-risk group (|log2FC| ≥ 1, FDR < 0.05). We used single sample gene set enrichment analysis (ssGSEA) [26] to calculated the normalized enrichment scores of 16 immune-related cells and 13 immune-related pathways (the annotated gene sets le was provided in Supplementary Table S2) with the "gsva" R package [27] in order to explore the correlation between immune status and risk score. At last, using the immunohistochemistry images from the human protein atlas (HPA) database (http://www.proteinAtlas.org/), we analyzed protein expression levels of the genes in the signature. And the cbioProtal database (http://www.cbioportal.org/) was accessed to analyze the genetic alteration of these genes.

Statistical analysis
All statistical analyses and most of visualization were performed in the R (version 4.0.3) and the SPSS software (version 22.0). Univariate Cox analysis was conducted to identify prognostic genes for OS. We used the LASSO Cox regression analysis to calculate the risk coe cient of each gene and construct the prognostic model. Difference of clinical data between the low-risk and high-risk group were analysed by Chi square test. Univariate and multivariate Cox analyses were performed to nd the independent predictive factor. The ssGSEA was utilized to quantify the THCA samples and Mann-Whitney U tests was used to compare the ssGSEA scores between the low-risk and high-risk group. A P value less than 0.05 indicated statistical signi cance.

Study population and data acquisition
The detailed work ow of this study is shown in Fig. 1. 497 THCA samples and 56 normal tissue samples were retrieved, and the clinical characteristics of which were showed in Table 1. Identi cation of prognostic DEGs related to THCA and ferroptosis We found 44 DEGs from 60 ferroptosis-related genes obtained from previous published literatures, in which DPP4, GPX4, GSS, HMGCR, TFRC, SQLE and PGD were signi cantly related to OS according to univariate Cox regression ( Fig. 2a and 2b). The expression of 7 genes between different tissues were displayed with heat map in Fig. 2c. Forest plots (Fig. 2a) of OS showed that HMGCR, TFRC, SQLE and PGD were the high-risk genes, while DPP4, GPX4 and GSS were the protective genes for THCA. The correlation network in Fig. 2d shown the correlation between the prognostic DEGs related to ferroptosis.

Construction of a prognostic ferroptosis-related 6-gene signature
Based on the LASSO Cox regression analysis, we used the 7 OS-associated genes above to constructed a prognostic gene signature. A 6-gene signature was established and the coe cients of each gene within are list in Table 2. Each patient's risk score was calculated from each gene's expression and regression coe cients as follows: Risk score = (-0.33121 * expression value of DPP4) + (0.92445 * expression level of GPX4) + (-2.62370 * expression level of GSS) + (0.08964 * expression level of HMGCR) + (1.08994 * expression level of TFRC) + (0.85750 * expression level of PGD). We classi ed all patients into the low-or high-risk group according to the corresponding risk scores (Fig. 3a). The difference of clinical data between the two groups were shown in Table 3 and we found that the low-risk group had better OS. PCA ( Fig. 3b) and t-SNE ( Fig. 3c) analysis shown that patients in this two risk groups were divided into two diverse sets. The dead cases were mainly distributed in the high-risk group (Fig. 3d). The Kaplan-Meier cumulative curve displayed that a patient with high risk score had a signi cantly worse survival outcome than those with low risk score (Fig. 3e). Next, we used the ROC curve (Fig. 3f) to evaluate the prognostic value of the model. The area under the curve (AUC) from the ROC analysis were 0.890 (1 year), 0.863 (2 years) and 0.883 (3 years), which indicated this model as a highly reliable prognosic predictor.  Functional enrichment analysis and immune-related analysis in the THCA cohort To investigate the biological characteristics of the prognostic signature, we identi ed the DEGs between the low-risk and high-risk group, which were used to conducted GO and KEGG functional analyses. The top 10 enriched terms in GO and top 16 enriched terms in KEGG were shown in Fig. 5a and Fig. 5b. From the result of GO, we found out that the DEGs were enriched in several biological process (BP) terms such as hormone metabolic process, organic acid transport, stress response to metal ion, stress response to copper ion and response to cadmium ion. The main enriched cellular component (CC) term includes collagen − containing, extracellular matrix, speci c granule lumen. In addition, the DEGs are also enriched in several molecular function (MF) term, such as carboxylic acid binding and organic acid binding. KEGG analysis demonstrated that the DEGs were related to thyroid hormone synthesis, mineral absorption, pancreatic secretion and gastric acid secretion. Interestingly, we can nd that many hormone-metabolicrelated process (especially thyroid hormone), organic-acid-related process (especially fatty acid and carboxylic acid) and ion-related process were enriched.
In order to analyze the relationship between immunology and THCA, ssGSEA was performed to quantify the enrichment scores of the immune cells and the immune-related pathways between the low-and highrisk group in THCA samples. For immune cells, the result (Fig. 5c) showed that the score of natural killer cells (NK-cells), mast-cells, dendritic cells (DCs), immature DCs (iDCs), activated DCs (aDCs) and in the low-risk group were signi cantly higher than those in the high-risk group. Interestingly, for immune-related functions and pathways (Fig. 5d), the score of antigen-presenting cell (APC) co-stimulation, APC coinhibition, MHC class I, parain ammation, type-I immune interferon (IFN) reponse, check − point and human leukocyte antigen (HLA) had similar conditions.

Comprehensive analysis of 6 genes in prognostic signature
Finally, we further explored the biological characteristics of the genes in the prognostic signature. The HPA database was used to evaluated the protein expression. As shown in the immunohistochemistry images (Fig. 6a), the protein expression levels of most genes were basically consistent with the mRNA levels. The cBioPortal database was accessed to analyze the genetic alteration of these genes. The results showed (Fig. 6b) that DDP4 and GSS has the highest alteration frequency (8% and 7%). GPX4 and HMGCR has the same alteration ratio of 6%. While the mutations ratio of PGD (5% of all sample) and TFRC (4% of all sample) was relatively lower than others. Meanwhile, we observed that the major type of alteration were mRNA expression changes, which may be an important prognostic factor for THCA.

Discussion
For the last several decades, the morbidity of thyroid carcinoma has been increasing in the world. Although the traditional therapy can cure most patients, there are still some THCA patients suffer from poor prognosis, which encourage scientist to continue to explore the new targets for prognosis prediction and treatment. Ferroptosis, an iron-dependent and non-apoptotic RCD process, has been proved by many researches that it is playing an important role in several cancers [28], especially in terms of diagnosis and prognosis. Nevertheless, the biological value of ferroptosis in THCA have not been systematically explored by far. Here, our study rst identi ed the prognostic ferroptosis-related genes and structured a 6gene signature for OS based on THCA dataset. Further biological analysis helped us to explore the potential mechanism of ferroptosis in THCA and we found that hormone and immune function may be valuable for further research.
The Kaplan-Meier analysis and ROC curve showed that the prognostic model we structured provided an excellent predictive value for OS. The prognostic signature was made up of 6 ferroptosis-related genes (DPP4, GPX4, GSS, HMGCR, TFRC and PGD). P53 gene is a well-known tumor suppressor gene, the oncosuppressive functions of which have been proved to be in uenced by ferroptosis [29,30]. A recent study [31] found that nuclear accumulation of DPP4 can be prohibited due to the loss of P53 which will promote DPP4-dependent lipid peroxidation and nally lead to ferroptosis. GPX4 is a glutathione (GSH) dependent enzyme which can catalyses and reduces lipid peroxides [32]. While, ferroptosis inducers can impair the GSH, resulting in excessive aggregation of lipid ROS and nally leading to cell death [33]. A study found that the overexpression and knockdown of GPX4 can modulate the lethality of 12 ferroptosis inducers which proved that GPX4 was the important regulator of ferroptosis [34]. GSS is also associated with GSH, which inhibited elevated lipid ROS levels in HepG2 cells [35]. In addition, another study shown that ribonucleotide reductase regulatory subunit M2 (RRM2) inhibited ferroptosis by stimulating GSH synthesis via GSS in liver cancer [36]. Sterols coenzyme Q10 (CoQ10) is an inhibited of ferroptosis, the synthesis of which can be regulated by HMGCR through regulating the synthesis of mevalonic acid. Shimada et al. [37] found that the inhibition of HMGCR can enhances FIN-56-induced ferroptosis. TFRC, takes part in cellular transferrin-iron uptake, with other ferroptosis modulators can be upregulated by proto-oncogenic transcriptional co-activator YAP which result in promoting ferroptosis [38,39]. For PGD, a kind of pentose phosphate pathway (PPP) enzymes, have been found that the knockdown of which suppressed erastin-induced ferroptosis in Calu-1 cell of non-small cell lung cancer [6]. To sum up, 3 genes (DPP4, TFRC and PGD) in prognostic signature were con rmed to promot ferroptosis, while the other 3 genes (GPX4, GSS and HMGCR) can protect cells from ferroptosis. Though the prognostic model we structured performed an excellent predictive value for OS and these 6 genes were proved as ferroptosisrelated genes, whether these genes in uence the prognosis of THCA by regulating and controlling the process of ferroptosis remains to be veri ed.
In order to explore the potential mechanism, we further conducted GO and KEGG analyses and found that several organic-acid-related terms and ion-related terms were signi cantly enriched. Interestingly, GO and KEGG analyses both pointed out that the risk-related DEGs was correlated with hormone-metabolicrelated process, especially thyroid hormone. Weigand et al. found that adrenal cortex cells are extraordinary sensitive to ferroptosis which depend on the active steroid synthetic pathways [40]. Another study [41] found that melatonin may be an excellent ferroptosis inhibitor and it can produce cerebroprotection from traumatic brain injury (TBI) by inhibiting neuronal Fth-mediated ferroptosis. These previous researches indicated that diseases can be affected by hormone through ferroptosis-related pathways. Though we have not found any report about the correlation between thyroid hormone and ferroptosis, it is reasonable to assume that thyroid hormone synthesis can affect the prognosis of THCA by in uencing the process of ferroptosis.
As is known to all, our immune system plays a vital role in the development of cancer. Ferroptosis and immunoregulation are both intense research areas in oncology, and they can provide different elimination mechanisms for cancer cells. Nevertheless, the relationship between ferroptosis and tumor immunity remains elusive. In our study, we found that the ssGSEA scores of several immune-related cell and functions were signi cantly lower in the high-risk group, including Type_I_IFN_Reponse, MHC_class_I, APC_co_stimulation and the NK_cells. Thus, immune escape may be an important reason for the poorly prognosis of the THCA patients in the high-risk group, which may be affected by ferroptosis. A recent research pointed out that ferroptotic cancer cells can be engulfed by macrophages in vitro [42]. This process may correlate with the modulation of antitumour immunity by AA oxidation products released from ferroptotic cells. In addition, cancer cells can be recognized by NK cells and its antigens will be presented to T cells through other immune cells and immune factors, which will eventually lead to cell elimination by the immune system [43,44]. A speculation about ferroptosis pointed out that ferroptotic cells can release signals which can attract the APCs or other immune cells to the location where the ferroptotic cells died [45].
The present study inevitably has several limitations. Firstly, this is a retrospective study because all the mRNA sequencing data were from TCGA database and no prospective validation in a clinical trial was conducted. What's more, we failed to nd a validation dataset for our model from other databases. Lastly, the underlying mechanisms of ferroptosis-related genes to predict prognosis of thyroid carcinoma are lack of literature support and need to be further investigated.
In sum, we established an innovative independently prognostic model of 6 ferroptosis-related genes based on OS for THCA. Also, our study offered some ideas for further research of the relationship between ferroptosis and thyroid carcinoma, especially in hormone and immune function. Z.x.W. and P.l.S. designed the present study. Z.x.W. and F.d.Q completed the search and downloaded the data. Z.x.W. and P.l.S. analyzed the data and designed the illustrations. Z.x.W. wrote the manuscript. All authors read and approved the nal manuscript. Figure 1 A schematic representation of the study protocol   Univariate and multivariate Cox analyses to identify the independent predictive factor for THCA prognosis Figure 5 Functional enrichment analysis and immune-related analysis of ferroptosis-related DEGs between the high-risk and low-risk groups a. GO functional analysis of the differentially expressed genes between high-risk and low-risk groups. b. KEGG functional analysis of the differentially expressed genes between high-risk and low-risk groups. c. The ssGSEA scores of 16 immune cells between high-risk and low-risk groups. d. The ssGSEA scores of 13 immune-related functions between high-risk and low-risk groups. (*, P< 0.05; **, P< 0.01; ***, P< 0.001)