A ferroptosis-related gene signature for lung function and quality of life in patients with idiopathic pulmonary fibrosis

Background: Rapid advances in genetic and genomic technologies have begun to reshape our understanding of idiopathic pulmonary fibrosis (IPF). Ferroptosis, an iron-dependent form of regulated cell death, plays an important role in the development of IPF. Our study therefore aimed to explore the role of ferroptosis-related genes (FRGs) and their correlation with lung dysfunction and quality of life in patients with IPF.
Methods: Datasets were acquired by searching the Gene Expression Omnibus. FRGs were acquired by searching the GeneCards database and PubMed. Ferroptosis-related differentially expressed genes (FRDEGs) were identified by integrating the FRGs with the DEGs identified in the GSE110147 dataset. Candidate key genes were identified from the miRNA-target FRDEGs network and the protein-protein interaction (PPI) network. The relationship between key genes and lung function or quality of life was assessed using the GSE32537 dataset.
Results: A total of 293 FRGs were obtained, and 71 FRDEGs were identified. According to enrichment analysis, cell growth and death and cancer-associated pathways were the important pathways, and the significant biological processes mainly consisted of cellular responses to stimuli and various situations. In addition, a PPI network and a miRNA-target network were constructed based on the 71 FRDEGs, yielding 19 candidate key genes. Furthermore, acyl-CoA synthetase long chain family member 1 (ACSL1), integrin subunit beta 8 (ITGB8) and ceruloplasmin (CP) were identified as the key genes. The expression level of ACSL1 was the strongest (negative) predictor of lung function, including percent predicted forced vital capacity (FVC% predicted) and percent predicted diffusion capacity of the lung for carbon monoxide (Dlco% predicted), and of quality of life (also negative). In addition, ITGB8 and CP were negatively associated with FVC% predicted. According to DrugBank and PubMed, 4 drugs and 16 drugs have been found to act on ACSL1 and CP, respectively.
Conclusion: These results imply that FRGs may shed new light on the disease mechanism and provide potential biomarkers and therapeutic targets to predict IPF progression.


Introduction
Idiopathic pulmonary fibrosis (IPF), a common interstitial lung disease (ILD) of unknown etiology with repeated acute lung injury, causes worsening dyspnea and deteriorating lung function [1]. According to a study in the United States, the incidence of IPF among people aged 18-64 years between 2005 and 2010 was 6.1 new cases per 100,000 person-years [2]. Currently, two drugs (pirfenidone and nintedanib) have been identified as moderately effective in treating IPF [3,4]. However, the prognosis of IPF remains poor, with death usually occurring within 2-3 years after diagnosis [5,6] and a 5-year survival rate of only 20% [7]. Over the past decades, rapid advances in genetic and genomic technologies have begun to reshape our understanding of IPF. Studies have uncovered several genes linked to IPF, including telomerase reverse transcriptase (TERT) [8,9], transforming growth factor beta 1 (TGFB1) [10] and mucin 5B (MUC5B) [11], among others. However, the pathophysiologic mechanisms of IPF are complex and remain incompletely understood.
Ferroptosis is a recently described type of regulated cell death (RCD) that depends on iron and differs from apoptosis, necrosis and autophagy [12]. A previous study confirmed that iron overload may cause lung fibrosis through increased lipid peroxidation and decreased glutathione peroxidase 4 (GPX4) activity in lung tissues [13]. Furthermore, studies have verified that ferroptosis plays an important role in the development of pulmonary fibrosis, and that ferroptosis inhibitors may attenuate the progression of pulmonary fibrosis [14,15]. Many genes, such as GPX4, solute carrier family 7 member 11 (SLC7A11) and transforming growth factor beta receptor 1 (TGFBR1), have also been identified as regulators or markers of ferroptosis and have been associated with the development of pulmonary fibrosis [14][15][16]. However, a systematic exploration of the role of ferroptosis-related genes (FRGs) and of their value for lung function and quality of life is lacking in patients with IPF.
Therefore, the purposes of this study were to analyze the characteristics of ferroptosis-related differentially expressed genes (FRDEGs) in IPF based on the Gene Expression Omnibus (GEO) and other databases, such as miRDB and GeneCards, and to construct a miRNA-target interaction network as a novel approach for determining gene functions and the pathogenesis of IPF. Furthermore, we summarized the information derived from miRNA-target interactions, Gene Ontology (GO) analysis, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis and protein-protein interaction (PPI) data, and then screened for potential biomarkers of lung function and quality of life and for therapeutic targets in IPF.

Materials And Methods
Acquisition of datasets
Figure 1 shows the workflow of our study. Datasets selected from the GEO database (http://www.ncbi.nlm.nih.gov/geo/) had to meet the following criteria: (1) the gene expression profile was measured using the same platform; (2) the samples came from the lung tissues of patients with IPF or healthy donors; (3) raw data or a gene expression matrix was provided. Finally, two datasets were identified, GSE110147 and GSE32537 (platform: GPL6244). Approval of the Ethics Committee was not required because the patient information was obtained from GEO.
Human miRNA-target interaction data were downloaded from miRDB [17]. FRGs were obtained from the GeneCards database (https://www.genecards.org/) by searching the term "ferroptosis" and from PubMed by searching the terms "Ferroptosis [MeSH] OR Ferroptosis* [tiab]". Consequently, 103 FRGs and 190 FRGs were collected from GeneCards and PubMed, respectively (Supplementary Table 1).

Datasets preprocessing
The raw data (CEL format) of GSE110147 and GSE32537 were downloaded from GEO. The "affy" package (http://bioconductor.org/packages/release/bioc/html/affy.html, v.1.68.0) was used to normalize the array data with the robust multi-array average (RMA) method. We defined IPF differentially expressed genes (DEGs) as genes whose expression levels differed significantly between IPF patients and controls (|log fold change| > 1 and adjusted p-value < 0.05). The "limma" package (v.3.46.0) [18] was used for the DEG analysis. In addition, the St. George's Respiratory Questionnaire (SGRQ) score and lung function [percent predicted forced vital capacity (FVC% predicted) and percent predicted diffusion capacity of the lung for carbon monoxide (Dlco% predicted)] were extracted from the GSE32537 dataset (Table 1). Data are presented as mean ± SD or n (%).
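The thresholds above can be illustrated with a minimal Python sketch. It is not the limma workflow used in this study; it only applies the stated cutoffs to a hypothetical results table and then keeps the DEGs that also appear in an FRG list, assuming that FRDEGs are the overlap of the two sets. All gene names and numbers are made up for illustration.

```python
# Hypothetical limma-style results: (gene, log fold change, adjusted p-value).
# The values are invented for illustration and are not taken from GSE110147.
results = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", -1.3, 0.020),
    ("GENE_C", 0.4, 0.001),   # fold change too small
    ("GENE_D", 2.1, 0.200),   # not significant after adjustment
    ("GENE_E", -2.5, 0.003),
]

# Hypothetical ferroptosis-related gene list (stand-in for GeneCards + PubMed).
frgs = {"GENE_A", "GENE_C", "GENE_E", "GENE_F"}


def select_degs(rows, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return {gene: direction} for rows passing both thresholds."""
    degs = {}
    for gene, log_fc, padj in rows:
        if abs(log_fc) > lfc_cutoff and padj < padj_cutoff:
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


degs = select_degs(results)
# Assumption: FRDEGs are the DEGs that are also in the FRG list.
frdegs = {gene: direction for gene, direction in degs.items() if gene in frgs}

print("DEGs:", degs)
print("FRDEGs:", frdegs)
```

In this toy table, GENE_C and GENE_D fail one threshold each, and GENE_B passes both thresholds but is not an FRG.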
Analysis of data
GO and KEGG enrichment analyses of the FRDEGs of IPF were performed and visualized with the R package "clusterProfiler" [19]. The heatmap was constructed with the R packages "gplots" (v.3.1.1) and "RColorBrewer" (v.1.1-2). STRING (http://www.string.embl.de/, version 11.0b) was used to analyze the protein-protein interactions (PPI) [20]. Cytoscape (version 3.7.1) [21] was used to visualize the miRNA-target network and the PPI network, and its MCODE plugin was used to identify modules in the PPI network [parameters: degree cutoff ≥ 2 (each node in a module had a degree of at least 2), K-core ≥ 2 (each node in a module had at least 2 connections within the module subgraph)].
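The K-core parameter can be illustrated with a short Python sketch. This is not the MCODE algorithm itself, which also scores and seeds clusters; it only shows the k-core pruning idea, in which nodes with fewer than k neighbours are removed repeatedly until none remain. The function name and the toy edges are hypothetical.

```python
def k_core(edges, k):
    """Return the node set of the k-core of an undirected graph."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        # Nodes whose current degree is below k are pruned in this pass.
        for node in [n for n, nbrs in adj.items() if len(nbrs) < k]:
            for nbr in adj.pop(node):
                adj[nbr].discard(node)
            changed = True
    return set(adj)


# Toy graph: a triangle A-B-C, a pendant node D, and a separate pair E-F.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("E", "F")]
core = k_core(toy_edges, 2)
print(sorted(core))
```

With k = 2, only the triangle survives, because D, E and F each have a single neighbour.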

Drug discovery
DrugBank [22] and PubMed were used to screen for drugs associated with the genes predicted to be important in this study.

Statistical analysis
Continuous variables were compared between two groups by applying the non-parametric t test. Associations between the expression levels of genes and lung function and SGRQ score were determined by the Spearman correlation coefficient. All statistical analyses were carried out with GraphPad Prism 7.0, and P < 0.05 was considered statistically significant.
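For readers unfamiliar with the Spearman coefficient, the following Python sketch computes it as the Pearson correlation of average ranks. The analyses in this study were run in GraphPad Prism; the function names and the example values here are hypothetical and serve only to show the calculation.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in ry))
    return cov / (sd_x * sd_y)


# Invented example: expression rises while FVC% predicted falls monotonically.
expression = [5.1, 6.3, 7.0, 7.8, 8.4]
fvc_percent = [92, 85, 80, 71, 60]
rho = spearman_rho(expression, fvc_percent)
print(round(rho, 3))
```

Because the invented relationship is perfectly monotonic and decreasing, the coefficient is -1; real data would give intermediate values.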

PPI network
The PPI network was constructed from the 71 FRDEGs using the STRING database (average node degree: 5.6, PPI enrichment p-value: < 1.0e-16) and visualized with Cytoscape [20,21]. After removing the nodes with no connections, the final network contained 66 nodes and 196 edges (Fig. 2C). Ceruloplasmin (CP) was the most up-regulated gene, and angiopoietin like 4 (ANGPTL4) was the most down-regulated gene in the PPI network. We calculated the connectivity degree of each node and selected those with degrees ≥ 15, as follows: mitogen-activated protein kinase 3 (MAPK3, down-regulated), heme oxygenase 1 (HMOX1, down-regulated), KRAS proto-oncogene, GTPase (KRAS, up-regulated), heat shock protein family A member 5 (HSPA5, up-regulated) and ATM serine/threonine kinase (ATM, up-regulated). In addition, one module (Figure S1) was selected after MCODE analysis of the whole network. The enrichment analysis of the FRDEGs within the module, performed with the R package "clusterProfiler" [19], is shown in Figure S2 and revealed the important pathways: cell growth and death, and cancer-associated pathways.
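The degree-based selection described above can be sketched in Python as follows. The network in this study was built in STRING and Cytoscape; the function, the toy protein names and the threshold of 2 are hypothetical and only illustrate how isolated nodes are dropped and hubs are picked by a degree cutoff.

```python
from collections import Counter


def hub_nodes(nodes, edges, min_degree):
    """Drop unconnected nodes, then list nodes whose degree meets the cutoff."""
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    connected = [n for n in nodes if degree[n] > 0]
    hubs = sorted(
        (n for n in connected if degree[n] >= min_degree),
        key=lambda n: (-degree[n], n),  # highest degree first, then by name
    )
    return connected, dict(degree), hubs


# Toy network: P5 has no interactions and is removed before hub selection.
toy_nodes = ["P1", "P2", "P3", "P4", "P5"]
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]

connected, degree, hubs = hub_nodes(toy_nodes, toy_edges, min_degree=2)
average_degree = 2 * len(toy_edges) / len(connected)
print(connected, degree, hubs, average_degree)
```

The average degree of an undirected network is twice the number of edges divided by the number of nodes, which is how the last line is computed for the toy example.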

Key gene ontology and pathways enriched in IPF
In order to reveal the biological significance of the 71 FRDEGs regulating IPF at a single level, we used the R package "clusterProfiler" [19] to conduct biological pathway enrichment and biological process annotation for the 71 genes mentioned above. The 20 most significantly enriched KEGG pathways were selected (Supplementary Table 2, Fig. 3A, 3F). More importantly, cell growth and death, cancer-associated pathways and signal transduction were the main pathways, implying that FRDEGs may participate in the process of IPF through these pathways (Figure S3A). Hsa04216 (Ferroptosis, including 11 FRDEGs) was the most significantly enriched pathway (Figure S3B). The top 20 FRDEG-related biological processes (BP), cellular components (CC) and molecular functions (MF) are shown in Fig. 3B-3D, respectively. The top 20 GO terms are shown in Supplementary Table 3 and Fig. 3E, and consisted of cellular responses to stimuli and various situations.
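Over-representation analysis of this kind rests on a hypergeometric test, which clusterProfiler applies to each pathway. The Python sketch below shows only that calculation, without multiple-testing correction. The function name and argument names are hypothetical, and the example sizes are invented, because the background and pathway sizes used in this study are not stated here.

```python
from math import comb


def hypergeom_pvalue(overlap, list_size, pathway_size, background_size):
    """P(X >= overlap) when drawing list_size genes from the background.

    pathway_size genes in the background belong to the pathway; X counts
    how many of the drawn genes fall in the pathway.
    """
    upper = min(list_size, pathway_size)
    total = comb(background_size, list_size)
    favourable = sum(
        comb(pathway_size, i) * comb(background_size - pathway_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Invented sizes: 4 background genes, 2 in the pathway, 2 genes in our list,
# and both listed genes fall in the pathway.
p_small = hypergeom_pvalue(overlap=2, list_size=2, pathway_size=2, background_size=4)
print(p_small)
```

A small p-value means that the observed overlap between the gene list and the pathway is larger than expected by chance given the background.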
Some potential biomarkers were found in IPF
A total of 1638 miRNA-target interactions involving 68 of the 71 FRDEGs and 463 related miRNAs were derived from miRDB [17] and visualized with Cytoscape (Fig. 4). The gene nodes with degrees ≥ 25 are shown in Table 2. The more interactions a gene has with miRNAs, the higher its degree. Therefore, integrin subunit beta 8 (ITGB8) was considered the hub node. In addition, the miRNA nodes with degrees ≥ 9 are shown in Table 3. The top 5 miRNA hub nodes by degree were hsa-miR-513a-3p, hsa-miR-513c-3p, hsa-miR-19a-3p, hsa-miR-19b-3p and hsa-miR-3065-5p.
Table 3. Nodes with degrees ≥ 9 in the miRNA-target network.
The bioprocess enrichment analysis showed that the 71 FRDEGs mentioned above were significantly correlated with a series of biological processes: cellular responses to stimuli and various situations. Persistent alveolar epithelial injury and abnormal repair are important causes of lung fibrosis [25]. Therefore, cellular responses to persistent injury are important in the development of IPF. Abnormal cellular responses may lead to epithelial-mesenchymal transition (EMT), which may promote the development of lung fibrosis [15]. Therefore, FRGs may participate in the development of IPF through these biological processes.
Furthermore, KEGG pathway analysis of the 71 FRDEGs and of the module identified from the PPI network showed that cell growth and death, cancer-associated pathways and signal transduction were significantly enriched pathways. Similar to cancer, IPF affects susceptible individuals and shares common risk factors with cancer, such as smoking, environmental or occupational exposure, viral infections and chronic tissue injury [26]. The incidence of cancer, especially lung cancer, is higher in IPF patients than in matched controls [27]. Ferroptosis, the FoxO signaling pathway, the HIF-1 signaling pathway and others play key roles in the development and prognosis of cancer [28][29][30][31]. In addition, the programmed death ligand-1/programmed cell death 1 (PD-L1/PD-1) axis can help cancer cells escape the surveillance of the immune system, and studies have shown that PD-L1 is overexpressed in the lung tissues [32], lung fibroblasts [33] and CD4 T cells [34] in IPF. Therefore, we speculate that FRDEGs may participate in the development of cancer in patients with IPF through these pathways.
MicroRNAs (miRNAs), a class of small non-coding regulatory RNAs, are composed of 18-25 nucleotides and inhibit the translation or promote the degradation of RNA transcripts in a sequence-specific manner, thus controlling the expression of protein-coding and non-protein-coding genes [35,36]. To date, several studies have suggested that differentially expressed miRNAs, DEGs and miRNA-controlled differential gene expression represent key topics in biomedical research into pulmonary fibrosis [37][38][39]. In this study, we constructed a miRNA-target FRDEGs network and found that ITGB8 had the highest degree in the network, followed by ACSL4 and PIK3CA; these genes may be important biomarkers for regulating IPF. A search of the ILDGDB database [40] (a manually curated database of genomics, transcriptomics, proteomics and drug information for interstitial lung diseases) found no related study of these three genes in patients with IPF.
Subsequently, we verified the expression of the 19 candidate key genes derived from the miRNA-target network and the PPI network in the GSE32537 dataset, and 5 key genes were found. According to linear regression, ACSL1 was the strongest predictor of lung function and quality of life. ACSL1 plays a key role in fatty acid metabolism. Studies have found that lipid metabolism dysregulation plays an important role in the pathogenesis of IPF [44,45]. In addition, the level of stearic acid (a fatty acid) is lower in IPF lung tissues than in control lung tissues, and a further study found that stearic acid has antifibrotic activity [45]. Therefore, ACSL1 may play a key role in the development of IPF by regulating fatty acid metabolism. Interestingly, ACSL1 is up-regulated in the GSE110147 dataset but down-regulated in the GSE32537 dataset. Further study is needed to confirm the expression level of ACSL1.
Drugs targeting ACSL1, ITGB8 and CP were also screened in DrugBank and PubMed. Four drugs and sixteen drugs have been found to act on ACSL1 and CP, respectively. For example, representative compound 13 was a remarkable inhibitor of not only ACSL1 (IC50 = 0.042 µM) but also other ACSL isoforms [23]. However, more experimental verification is still needed to prove this hypothesis.

Conclusion
These results suggest that FRDEGs may provide new clues to potential biomarkers and therapeutic targets for
Declarations