3.1 Differential analysis results.
To investigate the changes in gene expression between TACE effective and ineffective patients, we performed differential analysis on the GSE104580 dataset using the limma package. We set the criteria as an absolute value (logFC) > 0.3 and adjPvalue < 0.05, resulting in 2689 differentially expressed genes (DEGs), including 1255 upregulated genes and 1434 downregulated genes. We generated a hierarchical cluster heatmap (Fig. 2A) and a volcano plot (Fig. 2B) of the top ten upregulated and downregulated DEGs, respectively, which effectively distinguished TACE effective and ineffective patients. Furthermore, we identified 45 common genes by intersecting the DEGs from GSE104580 and ferroptosis-related genes, illustrated by a Venn diagram (Fig. 2C). To explore the potential interactions among these 45 genes, we constructed a protein-protein interaction network (Fig. 2D).
3.2 Functional Enrichment Analysis of Differentially Expressed Genes
To investigate the relationship between differentially expressed genes (DEGs) significantly associated with TACE activity and various biological processes, molecular functions, cellular components, biological pathways, and diseases, we conducted a functional enrichment analysis on these DEGs. We found that the DEGs significantly associated with TACE activity were mostly enriched in several biological processes including "cellular response to chemical stress," "response to nutrient levels," and "response to extracellular stimulus" (Fig. 3A). Furthermore, they were also found to be enriched in various cellular components such as "basal plasma membrane," "basal part of cell," and "melanosome" (Fig. 3B), as well as molecular functions like "neutral amino acid transmembrane transporter activity," "organic anion transmembrane transporter activity," and "L-amino acid transmembrane transporter activity" (Fig. 3C). Next, we performed pathway enrichment analysis on these DEGs and found that they were enriched in several biological pathways such as "Central carbon metabolism in cancer," "Ferroptosis," and "Bladder cancer" (Fig. 3D).
3.3 Enrichment Analysis using Gene Set Enrichment Analysis (GSEA)
The results of the GSEA enrichment analysis with an adjusted P-value of less than 0.05 yielded significant gene sets including REACTOME_PHASE_I_FUNCTIONALIZATION_OF_COMPOUNDS, REACTOME_CYTOCHROME_P450_ARRANGED_BY_SUBSTRATE_TYPE, REACTOME_BIOLOGICAL_OXIDATIONS, WP_OXIDATION_BY_CYTOCHROME_P450, and REACTOME_RESOLUTION_OF_SISTER_CHROMATID_COHESION (Fig. 4A-D).
3.4 Identification of Co-expression Modules in DEGs through WGCNA Analysis
To identify the co-expression modules present in the differentially expressed genes (DEGs) of TACE ineffective and effective groups, we performed weighted gene co-expression network analysis (WGCNA). During the WGCNA analysis of the GSE104580 dataset, we observed one outlier sample through cut height setting (Fig. 5A). By plotting a scatter plot, we determined 5 as the optimal soft threshold value for subsequent analyses (Fig. 5B). Subsequently, co-expressed genes from both groups were clustered into darkturquoise, ivory, royalblue, and sienna3 modules (Fig. 5C). Based on the expression patterns of module genes and grouping information, we assessed the correlation between the modules and TACE effectiveness. We selected the darkturquoise, ivory, royalblue, and sienna3 modules that showed positive correlation with TACE effectiveness and significant p value < 0.05 for further analysis (Fig. 5D).
3.5 Protein-protein interaction network analysis
To identify the crucial genes associated with TACE, a comprehensive analysis was conducted using GSE104580 dataset. A total of 37 intersecting genes were identified through co-expression and differential expression analysis with ferroptosis-related genes and TACE-associated genes. The Venn diagram was constructed to visualize the overlapping genes (Fig. 6A).
To further explore the protein-protein interactions among the TACE-associated genes, a protein-protein interaction network was constructed using Cytoscape software (Fig. 6B). The cytoHubba plugin's MCC algorithm was utilized to calculate the top 10 hub genes, which included GLS2, CDKN1A, SRC, GPT2, SLC7A11, SLC7A5, ASNS, SLC38A1, SLC2A1, and SLC1A5 (Fig. 6C). The correlation coefficient heatmap was generated for the hub genes to illustrate their relationships (Fig. 6D).
3.6 LASSO Regression Constructing RESPONSE Diagnostic Model and Identifying Feature Genes
To determine the effective disease-specific feature genes of TACE and analyze their diagnostic capabilities for diseases, we employed the LASSO regression model method. We randomly split GSE104580 into a training group and a testing group at a ratio of 4:1 and used the training group to construct the model and the testing group to verify it. During the model building process, as λ increased, the selected feature parameters decreased while the coefficient absolute value increased (Fig. 7A,B). After simulating and selecting the number of features, we constructed a model and identified six feature genes in the model: GLS2, CDKN1A, GPT2, ASNS, SLC38A1, and SLC2A1(ferroptosis-related genes, FRGs). We validated the risk model by plotting ROC curves and calculating AUC (area under the curve) for both datasets. The results showed that the AUC values of the Train group and Test group were 0.846 and 0.822, respectively (Fig. 7C). Furthermore, we performed differential analysis on the feature genes between the effective and ineffective groups of TACE, and identified six differentially expressed feature genes with statistical significance (p < 0.05), including GLS2, CDKN1A, GPT2, ASNS, SLC38A1, and SLC2A1. We presented these findings using box plots (Fig. 7D).
3.7 Identification of Two Ferroptosis Patterns Based on Feature Genes
In this study, we utilized the "ConsensusClusterPlus" package in R software for cluster analysis of six iron-induced cell death-related genes based on their feature genes. A consensus clustering method was employed, and a consensus clustering plot for k = 2 was generated (Fig. 8A). Additionally, we calculated the relative change in area under the cumulative distribution function curve from k = 2 to k = 9 using the CDF plot (Fig. 8B). Furthermore, we illustrated the cumulative distribution function of the consensus clustering results (Fig. 8C) as well as the tracking plot (Fig. 8D).
3.8 Immunoinfiltration Analysis
To investigate the differences in immunoinfiltration levels between patients who responded to transarterial chemoembolization (TACE) and those who did not, we utilized the cibersort algorithm to determine the degree of infiltration of 22 types of immune cells in both groups using data from the GSE104580 dataset (Fig. 9A). Upon applying the wilcox.test algorithm, six types of immune cells demonstrated significant differences between TACE responders and non-responders in the GSE104580 dataset (Fig. 9B), namely T cells follicular helper, T cells gamma delta, Macrophages M0, Macrophages M1, Mast cells resting, and Neutrophils.
Furthermore, we analyzed the correlation between the expression levels of six ferroptosis-related genes and the abundance of 22 immune cell types. Our analysis identified significant correlations at p < 0.05 between ASNS and T cells gamma delta, NK cells activated, and Mast cells resting; CDKN1A and NK cells resting and Macrophages M0; GLS2 and T cells CD4 memory activated, T cells follicular helper, T cells regulatory (Tregs), and Macrophages M0; SLC2A1 and T cells follicular helper and Neutrophils; and SLC38A1 and T cells CD4 memory activated, T cells gamma delta, and NK cells activated (Fig. 9C).
3.9 Interactions among hub-mRNAs, hub-miRNAs, and hub-LncRNAs in a network
In this study, we have constructed an interplay network among LncRNA, miRNA, and mRNA. The network included two ferroptosis -related genes, CDKN1A and SLC2A1. Using miRTarBase and TarBase databases, we predicted the interactions between miRNAs related to ferroptosis and the ferroptosis-related genes. We found 101 intersecting groups of interaction relationships (Fig. 10A). Further screening of these interactions resulted in 75 groups with experimental evidence supporting them. We then used the Starbase database to predict LncRNAs that interact with the miRNAs, resulting in the construction of a Sankey diagram (Fig. 10B) and network map (Fig. 10C) of the LncRNA-miRNA-mRNA network.
3.10 Construction and validation of LICH prognostic model based on ferroptosis-related genes
In this study, we constructed a prognostic model for liver hepatocellular carcinoma (LIHC) patients based on six ferroptosis-related genes (GLS2, CDKN1A, GPT2, ASNS, SLC38A1, and SLC2A1). Using patient data from The Cancer Genome Atlas (TCGA), we classified patients into high and low risk groups. Kaplan-Meier survival analysis showed that the model had a hazard ratio of 0.52 (0.36–0.74) and P < 0.001 (Fig. 11A). Additionally, the time-dependent receiver operating characteristic (ROC) curve analysis revealed an area under the curve (AUC) of 0.705, 0.636, and 0.658 at 1, 2, and 3 years, respectively (Fig. 11B).
To validate the model, we used patient data from the International Cancer Genome Consortium (ICGC) and calculated the risk score for each patient. We then classified patients into high and low risk groups and performed Kaplan-Meier survival analysis. The results showed that the model had a hazard ratio of 0.45 (0.24–0.85) and P = 0.013 (Fig. 11C). Moreover, the time-dependent ROC curve analysis revealed an AUC of 0.728, 0.642, and 0.657 at 1, 2, and 3 years, respectively (Fig. 11D).