A Novel Ferroptosis-related lncRNA Prognostic Signature for Colorectal Cancer by Bioinformatics Analysis

Background: Recent studies have increasingly confirmed a role for ferroptosis in cancer treatment. The current study aims to construct a robust ferroptosis-related lncRNA signature for predicting the prognosis of colorectal cancer (CRC) patients by bioinformatics analysis.
Methods: The transcriptome data were obtained from The Cancer Genome Atlas (TCGA). Differentially expressed lncRNAs were screened by comparing 568 CRC tissues with 44 adjacent non-CRC tissues. Univariate Cox regression, lasso regression, and multivariate Cox regression were conducted to design a ferroptosis-related lncRNA signature. The prognostic value of this signature was verified by the log-rank test of the Kaplan-Meier curve and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve in the train set, test set, and entire set. Furthermore, univariate and multivariate Cox regression were used to analyze its independent prognostic ability. The relationship between the expression of the ferroptosis-linked lncRNAs and clinical variables was assessed by the Wilcoxon rank-sum test and the Kruskal-Wallis test. Gene set enrichment analysis (GSEA) was performed to identify the signaling pathways that may be involved.
Results: A total of 2541 differentially expressed lncRNAs were screened, of which 439 were ferroptosis-related lncRNAs. A prognostic signature of seven ferroptosis-related lncRNAs (AC005550.2, LINC02381, AL137782.1, C2orf27A, AC156455.1, AL354993.2, AC008760.1) was constructed, validated, and evaluated. Prognosis in the high-risk group was clearly worse than in the low-risk group in the train set, test set, and entire set. The AUC of the ROC curve for predicting three-year survival in the train set, test set, and entire set was 0.796, 0.715, and 0.758, respectively. Moreover, the designed molecular signature was found to be an independent prognostic variable. Compared with clinical variables, the ROC curve of this signature had the second largest AUC value (0.737). The expression of these


Background
In 2018, a total of 1.8 million new patients with colorectal cancer (CRC) and 881,000 CRC-related deaths were reported, accounting for about one in 10 of all newly diagnosed cancer cases and cancer-associated deaths. CRC therefore ranks third in incidence but second in cancer-related mortality [1]. Despite recent advances in the genetic and molecular characterization of tumors, the 5-year survival rate of metastatic CRC remains below 14%, whereas that of early CRC exceeds 90% [2]. Therefore, identifying promising prognostic signatures along with potential therapeutic targets is an essential step toward improving outcomes.
Ferroptosis is a type of cell death characterized by high production of lipid ROS (L-ROS) resulting from inactivation of cellular glutathione (GSH)-dependent antioxidant defenses. This form of cell death is iron-dependent and differs from apoptosis, classic necrosis, autophagy, and other forms of cell death [3,4]. Ferroptosis has been associated with the initiation of multiple diseases, including kidney injury, blood circulation diseases, conditions of the nervous system, and ischemia-reperfusion injury. It is therefore being investigated as a potential prognostic marker for various diseases [5]. Scholars have suggested that ferroptosis may be an adaptive strategy for eliminating cancerous cells, thereby preventing cancer development under conditions of infection, cellular stress, and nutrient deficiency [6]. Previous research has reported that some inducers, such as RSL3 [7], β-elemene [8], Resibufogenin [9], andrographis [10], bromelain [11], IMCA [12], talaroconvolutin A (TalaA) [13], ACADSB [14], erastin [15], dichloroacetate [16], and B. etnensis Raf. extract [17], suppress the progression of CRC by inducing ferroptosis. Hence, it is essential to discover ferroptosis-linked biomarkers that can serve as valuable early diagnostic and prognostic indicators for CRC.
Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs more than 200 nucleotides long that have little or no apparent protein-coding ability [18]. LncRNAs regulate critical biological functions related to cell growth and survival, allosteric regulation of enzyme activities, chromatin modifications, and genomic imprinting [19]. In addition, a growing number of studies have reported that lncRNAs affect cancer progression and prognosis in diverse cancer types by modulating ferroptosis. For example, p53-related lncRNA (P53RRA) promotes apoptosis and ferroptosis of cancerous cells by activating the p53 pathway [20]. LncRNA GABPB1-AS1 regulates oxidative stress during erastin-triggered ferroptosis in HepG2 hepatocellular carcinoma cells [21]. LncRNA LINC00336 suppresses ferroptosis in lung cancer tissues by acting as a competing endogenous RNA [22]. LINC00618 accelerates ferroptosis via inhibiting vincristine (VCR) and lymphoid-specific helicase (LSH)/SLC7A11 in leukemia [23]. In non-small cell lung cancer cells, lncRNA MT1DP loaded on folate-modified liposomes promotes erastin-triggered ferroptosis by modulating the miR-365a-3p/NRF2 axis [24]. Hence, it is critical to explore the pivotal lncRNAs closely linked to both ferroptosis and prognosis in CRC.
To our knowledge, this study is the first to propose a predictive model based on lncRNAs related to ferroptosis genes in tumors. We postulated that ferroptosis-linked lncRNAs could be valuable prognostic biomarkers for CRC patients. Herein, we explored the expression of lncRNAs in CRC from The Cancer Genome Atlas (TCGA) and identified ferroptosis-associated lncRNAs with prognostic potential. We constructed and verified a seven ferroptosis-correlated lncRNA biosignature with the ability to estimate the survival prognosis of CRC patients.

Methods
Data download and processing
(Table 1). Patients with no follow-up time or with a follow-up time shorter than 30 days were not enrolled in the study.
Furthermore, we identified ferroptosis-related lncRNAs by correlation analysis between lncRNA expression levels and the ferroptosis genes, based on the criteria of P < 0.001 and |Correlation Coefficient| > 0.3.
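The correlation screen described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a Pearson correlation (the text does not name the correlation method), approximates the two-sided P-value with the Fisher z-transform instead of an exact t-test, and uses hypothetical function names.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant profile carries no correlation signal
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Two-sided P-value via the Fisher z-transform (normal approximation)."""
    if abs(r) >= 1.0:
        return 0.0
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def is_ferroptosis_related(lnc_expr, gene_expr, r_cut=0.3, p_cut=0.001):
    """Apply the stated criteria: |r| > 0.3 and P < 0.001."""
    r = pearson_r(lnc_expr, gene_expr)
    p = approx_p_value(r, len(lnc_expr))
    return abs(r) > r_cut and p < p_cut
```

Under this sketch, an lncRNA would be labeled ferroptosis-related if it passes the test against at least one ferroptosis gene; whether the authors used exactly this rule is an assumption.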

Development, verification, and assessment of prognostic biosignature
We used the "caret" package in R (version 4.0.1) to randomly divide the entire data set (Additional file 1) with FRlncRNA expression profiles into a train set (Additional file 2) and a test set (Additional file 3), and conducted univariate Cox regression for the FRlncRNAs in the train set (P < 0.05). Lasso regression was then performed with the "glmnet" package [26] to minimize overfitting (P < 0.05). Afterward, multivariate Cox regression was used to develop the optimal prognostic risk model, using the "coxph" function of the R "survival" package [27] with the "direction = both" option for stepwise variable selection (P < 0.05). The risk score of the prognostic lncRNA signature was then calculated by summing the products of each lncRNA's expression level and its corresponding coefficient. Additionally, the proportional hazards assumption was tested for the Cox model. The risk score formula derived from the train set was then applied to the test set and the entire set for validation.
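The risk score described above is a weighted sum, riskScore = Σ (coefficient_i × expression_i), taken over the lncRNAs in the signature. The following minimal sketch illustrates the calculation; the function name, lncRNA labels, and coefficient values are hypothetical placeholders, not the fitted coefficients of the model.

```python
def risk_score(expression, coefficients):
    """Sum of each signature lncRNA's expression times its Cox coefficient."""
    return sum(coef * expression[name] for name, coef in coefficients.items())


# Hypothetical coefficients and expression values, for illustration only.
example_coefficients = {"lncA": 0.5, "lncB": -0.25}
example_patient = {"lncA": 2.0, "lncB": 4.0, "lncC": 7.0}  # lncC is not in the signature

score = risk_score(example_patient, example_coefficients)
```

Only lncRNAs with a coefficient contribute to the score, so the same function can be applied unchanged to the test set and the entire set once the coefficients have been fixed on the train set.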
Patients in the train set, test set, and entire set were divided into a low-risk group and a high-risk group according to the median risk score, and survival was compared between the two groups using Kaplan-Meier curves and the log-rank test. The predictive power of the lncRNA signature was assessed by computing the 3-year AUC of the ROC curve with the "survivalROC" package [28].
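As an illustration of the grouping and survival estimation described above, the sketch below splits patients at the median risk score and computes a Kaplan-Meier curve. It is a simplified stand-in for the R workflow and makes two assumptions: patients whose score equals the median are assigned to the low-risk group, and the function names are hypothetical. The log-rank test and the time-dependent ROC analysis are not reproduced here.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate; returns [(time, survival)] at each event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve
```

Reading the survival value of each group's curve at the last event time not exceeding three years gives a three-year OS estimate of the kind reported for the high- and low-risk groups.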
To further enhance the prognostic signature's credibility, we conducted a stratified survival prognostic analysis by gender, age, clinical stage, postoperative tumor status, CEA levels, perineural invasion, vascular invasion, mismatch repair (MMR) status, and gene mutation status (KRAS, BRAF).
Independent and prognostic value of the lncRNA signature
Univariate and multivariate Cox regression analyses were conducted to assess the independent prognostic ability of the lncRNA signature in the train set (Additional file 4), test set (Additional file 5), and entire set (Additional file 6). The clinical parameters included age, gender, clinical stage, T stage, lymph-node status, and distant metastasis. In addition, ROC curves were used to explore whether the lncRNA biosignature has better predictive power than the clinical variables. The "rms" package was used to construct the nomogram according to the multivariate Cox regression result (P < 0.05). To further investigate whether the ferroptosis-associated lncRNAs are involved in CRC development, we explored the relationship between the expression of the ferroptosis-linked lncRNAs and clinical variables using the Wilcoxon rank-sum test and the Kruskal-Wallis test.
GSEA analysis of the lncRNA signature
Gene set enrichment analysis software (GSEA 4.1.0), downloaded from https://www.gsea-msigdb.org/gsea/index.jsp, was used to identify the biological function of the prediction model [29]. Based on the median risk score of the lncRNA signature in 568 tumor samples, we divided the samples into low- and high-risk groups for GSEA of KEGG pathways. Enriched signaling pathways in each phenotype were evaluated by the normalized enrichment score (NES), the nominal (NOM) P-value, and the false discovery rate (FDR). Pathways with FDR < 25% and NOM P-value < 5% were included.
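The inclusion criteria above amount to a simple filter over the GSEA result table. The sketch below shows one way to apply it; the record keys ("name", "nes", "nom_p", "fdr_q"), the function name, and all numeric values are hypothetical and do not come from the study's output.

```python
def select_enriched(results, p_cut=0.05, fdr_cut=0.25):
    """Keep gene sets with NOM P-value < 5% and FDR < 25%, ranked by |NES|."""
    kept = [r for r in results if r["nom_p"] < p_cut and r["fdr_q"] < fdr_cut]
    return sorted(kept, key=lambda r: abs(r["nes"]), reverse=True)


# Invented rows for illustration only; a positive NES is taken here to mean
# enrichment in the high-risk phenotype and a negative NES in the low-risk one.
example_results = [
    {"name": "SET_A", "nes": 1.9, "nom_p": 0.010, "fdr_q": 0.12},
    {"name": "SET_B", "nes": -1.6, "nom_p": 0.030, "fdr_q": 0.40},
    {"name": "SET_C", "nes": -2.1, "nom_p": 0.002, "fdr_q": 0.05},
]
selected_names = [r["name"] for r in select_enriched(example_results)]
```

In this example SET_B is dropped because its FDR exceeds 25%, even though its nominal P-value passes.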

Statistical Analysis
R software (version 4.0.3) and its attached packages were used to conduct the data analyses. All statistical tests were two-sided. P < 0.05 was considered statistically significant.

Results
Screening of ferroptosis-related lncRNAs in CRC
Comparing CRC tissues with adjacent non-CRC tissues, 2541 differentially expressed lncRNAs were found, of which 1805 were up-regulated and 736 were down-regulated (Additional file 7). Correlation analysis between 259 ferroptosis-related genes and the differentially expressed lncRNAs identified 439 ferroptosis-related lncRNAs (FRlncRNAs) (Additional file 8).
Construction, validation, and evaluation of a seven ferroptosis-related lncRNAs prognostic signature
The entire set (N = 506) with expression data for the 439 FRlncRNAs was randomized into the train set (N = 254) and the test set (N = 252). In the univariate Cox regression, 22 FRlncRNAs were associated with the overall survival of patients in the train set (Fig. 1a). Lasso regression was then used to eliminate overfitting lncRNAs, and the 16 lncRNAs obtained were used for the subsequent multivariate Cox regression analysis (Fig. 1b-d). According to the median value of the risk score, the Kaplan-Meier curves demonstrate that the high-risk group had markedly worse overall survival (OS) than the low-risk group in the train set (P = 2.899E-06), test set (P = 5.314E-03), and entire set (P = 1.1E-06) (Fig. 2a-c). In the train set, three-year OS for patients in the high- and low-risk groups was 60.6% and 90.5%, respectively. In the test set, it was 63.9% and 90.1%, respectively. In the entire set, it was 60.6% and 90.5%, respectively. The AUC of the three-year time-dependent ROC curve for the seven-lncRNA biosignature was 0.796, 0.715, and 0.758 in the train set, test set, and entire set, respectively (Fig. 2d-f), which demonstrates the good performance of the model in estimating the OS of CRC patients. The mortality rate was higher in patients with high risk scores than in those with low risk scores in the three sets (Fig. 2g-i). In the cluster heat maps, six lncRNAs of the signature (AC005550.2, LINC02381, C2orf27A, AC156455.1, AL354993.2, AC008760.1) were expressed at lower levels in the low-risk group than in the high-risk group, whereas AL137782.1 showed the opposite pattern (Fig. 2j-l).
Notably, high expression of AC156455.1 and of AL354993.2 was also associated with worse OS than low expression (Fig. 3). The association of the seven lncRNAs with ferroptosis genes is shown in Fig. 4. In addition, we stratified patients according to various clinical factors (clinical stage, gender, age, CEA levels, MMR status, postoperative tumor status, perineural invasion, vascular invasion, KRAS mutation, BRAF mutation) and applied the prognostic model to OS prediction within each stratum (Fig. 5). The results showed that the signature has good predictive significance for CRC patients in most strata. Some results were not significant (P > 0.05), which might be due to insufficient sample sizes in those strata.
Independent prognostic analysis of the seven ferroptosis-associated lncRNAs signature and its correlation with clinical variables.

Based on stratification by clinical variables, the correlation analysis between the lncRNAs and clinical variables showed that LINC02381 expression was related to T stage, lymph-node status, clinical stage, KRAS mutation, BRAF mutation, and perineural invasion. C2orf27A expression was associated with T stage, lymph-node status, clinical stage, KRAS mutation, and MMR status. AC156455.1 expression was correlated with lymph-node status. AL354993.2 expression was associated with distant metastasis, lymph-node status, clinical stage, and KRAS mutation. AC008760.1 expression was associated with lymph-node status, distant metastasis, clinical stage, and KRAS mutation. AL137782.1 expression was linked to KRAS mutation. The risk score of the lncRNA signature was associated with T stage, lymph-node status, distant metastasis, clinical stage, and KRAS mutation (Fig. 7).
Functional enrichment analysis of the seven ferroptosis-related lncRNAs signature.
GSEA was used to explore the potential biological functions of the seven ferroptosis-associated lncRNA signature in CRC (Fig. 8). Using the c2.cp.kegg.v7.2.symbols.gmt gene set collection, three signaling pathways (KEGG_HEDGEHOG_SIGNALING_PATHWAY, KEGG_ARACHIDONIC_ACID_METABOLISM, KEGG_ALPHA_LINOLENIC_ACID_METABOLISM) were significantly enriched in the high-risk group, and three signaling pathways (KEGG_FRUCTOSE_AND_MANNOSE_METABOLISM, KEGG_PENTOSE_PHOSPHATE_PATHWAY, KEGG_CITRATE_CYCLE_TCA_CYCLE) were enriched in the low-risk group. These results suggest that this signature model may influence CRC progression and prognosis mainly through metabolism-related pathways.

Discussion
CRC is a common and aggressive cancer with poor survival and prognosis, mainly because it is prone to metastasize to the liver and lung [30]. Given that there are no accurate and sensitive markers to predict the prognosis of CRC patients, it is crucial to investigate and develop more specific biomarkers to improve patient survival. Although current treatment methods have advanced greatly, the prognosis is still very poor. Ferroptosis differs from other types of cell death both biochemically and morphologically and has been shown to regulate cancer development [3]. A growing number of reports have documented that lncRNAs play a very important role in regulating gene expression in tumors [19,31]. In addition, many lncRNAs influence the progression of CRC by regulating ferroptosis. However, no prognostic model based on ferroptosis-related lncRNAs has been reported. Although two previous prognostic models based on ferroptosis genes have been reported in hepatocellular carcinoma [32] and glioma [33], our study is the first to report a ferroptosis-related lncRNA prognostic model in CRC.
In the present study, we downloaded ferroptosis genes from FerrDb and used R and its attached packages to find differentially expressed lncRNAs related to ferroptosis (FRlncRNAs). We randomly grouped all the patients into a train set and a test set, and then established a seven ferroptosis-related lncRNA signature model (AC005550.2, LINC02381, AL137782.1, C2orf27A, AC156455.1, AL354993.2, AC008760.1) through univariate Cox regression, Lasso regression, and multivariate Cox regression in the train set. The biosignature was then verified in the test set and the entire set. On the basis of the median risk score, the Kaplan-Meier curves revealed that the high-risk group had clearly worse overall survival than the low-risk group in the three data sets.
Among the lncRNAs of the signature, some studies have shown that LINC02381 is related to immune genes [43] and autophagy genes [44] in colon adenocarcinoma. Interestingly, our research shows that this lncRNA is also related to ferroptosis, which merits further investigation. In addition, Jafarzadeh et al. reported that LINC02381 might suppress human CRC tumorigenesis partly by regulating the PI3K signaling pathway [45]. Meanwhile, LINC02381 inhibits gastric cancer progression and metastasis by regulating the Wnt signaling pathway [46]. However, LINC02381 functions as a cancer-promoting gene in cervical cancer, promoting cell migration and viability by regulating miR-133b/RhoA [47].
AC008760.1 was reported to be related to autophagy, and Li et al. constructed an autophagy-related lncRNA prognostic model in CRC [48]. The remaining lncRNAs have not been reported in previous studies and merit further research.
Our study found that the expression of these lncRNAs and the constructed prognostic signature were closely related to the patient's clinical stage, distant metastasis, lymph-node status, T stage, MMR status, BRAF mutation, KRAS mutation, and perineural invasion, especially MMR status, BRAF mutation, and KRAS mutation. These features are important in guiding patients' medication. Whether and how these lncRNAs regulate these variables remains to be explored. There have been many studies on the role of ferroptosis in the drug resistance of tumor patients [49,50]. The current study demonstrated the prognostic significance of these ferroptosis-related lncRNAs and of the signature in CRC. Therefore, we have reason to believe that these lncRNAs are worthy of in-depth research into tumor resistance mechanisms.
Our current study also has some limitations. First, we used data from the TCGA database as the starting point for the research; although the model has been internally verified, it still requires further verification in external data. Second, patients in TCGA are mainly white (75%), and whether the model fits other races needs further verification. Third, the analysis of the lncRNA expression of the model and the KEGG functional enrichment analysis by GSEA require further verification by cell function experiments.

Declarations
Ethics approval and consent to participate
LncRNA and mRNA sequencing profiles were obtained from the TCGA data portal, which is a publicly available dataset. Therefore, no ethics approval is needed.