A novel mitochondria-related gene signature in esophageal carcinoma: prognostic, immune, and therapeutic features

Esophageal carcinoma (ESCA) is a common and lethal malignant tumor worldwide. The mitochondrial biomarkers were useful in finding significant prognostic gene modules associated with ESCA owing to the role of mitochondria in tumorigenesis and progression. In the present work, we obtained the transcriptome expression profiles and corresponding clinical information of ESCA from The Cancer Genome Atlas (TCGA) database. Differential expressed genes (DEGs) were overlapped with 2030 mitochondria-related genes to get mitochondria-related DEGs. The univariate cox regression, Least Absolute Shrinkage and Selection Operator (LASSO) regression, and multivariate cox regression were sequentially used to define the risk scoring model for mitochondria-related DEGs, and its prognostic value was verified in the external datasets GSE53624. Based on the risk score, ESCA patients were divided into high- and low-risk groups. Gene Ontology (GO), Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Set Enrichment Analysis (GSEA) were performed to further investigate the difference between low- and high-risk groups at the gene pathway level. CIBERSORT was used to evaluate immune cell infiltration. The mutation difference between high- and low-risk groups was compared by using the R package “Maftools”. Cellminer was used to assess the association between the risk scoring model and drug sensitivity. As the most important outcome of the study, a 6-gene risk scoring model (APOOL, HIGD1A, MAOB, BCAP31, SLC44A2, and CHPT1) was constructed from 306 mitochondria-related DEGs. Pathways including the “hippo signaling pathway” and “cell–cell junction” were enriched in the DEGs between high and low groups. According to CIBERSORT, samples with high-risk scores demonstrated a higher abundance of CD4+ T cells, NK cells, M0 and M2 macrophages, and a lower abundance of M1 macrophages. The immune cell marker genes were correlated with the risk score. In mutation analysis, the mutation rate of TP53 was significantly different between the high- and low-risk groups. Drugs with a strong correlation with the risk model were selected. In conclusion, we focused on the role of mitochondria-related genes in cancer development and proposed a prognostic signature for individualized integrative assessment.


Introduction
Esophageal carcinoma (ESCA) is a common malignant tumor of the digestive tract, with the seventh highest incidence and the sixth mortality worldwide (Sung et al. 2021). Only about one-quarter of patients are candidates for attempted curative treatment, such as surgery, chemotherapy, radiation therapy, molecular targeting therapy, and combinations of these therapies, and the prognosis still remains poor, with a 5-year survival rate of about 30 percent (Djärv et al. 2014). There are several factors that contribute to the development of ESCA, including environmental factors, living habits, microorganisms, and genetic factors (Arnal 2015). Xintong Zhang and Hao Wu contributed equally to this work.
Since there is no sensitive method for detecting ESCA early, about half of the patients diagnosed with the tumors already have distal metastases (Liang et al. 2020). It is necessary to develop some early diagnostic markers of ESCA.
Mitochondria are the site of adenosine triphosphate (ATP) production in cells. Recent studies have shown that mitochondria are involved in the initiation, progression, and metastasis of tumor cells (Giampazolias and Tait 2016;Wang et al. 2020). Meanwhile, mitochondria are also the site of producing reactive oxygen species (ROS) in the cells, and the dysregulation of ROS contributes to malignancy (Kudryavtseva et al. 2016). Moreover, as a key regulator of apoptosis, mitochondrial disorder can enhance the antiapoptotic ability of cancer cells (Kudryavtseva et al. 2016). Triterpenoid saponins and dioscin can induce tumor cells of ESCA apoptosis by affecting the mitochondrial apoptotic pathway (Mo et al. 2013;Wang et al. 2018), which has been widely investigated as a potential treatment target for cancer (Lopez and Tait 2015).
Considering mitochondria are crucial mediators of tumorigenesis, it is urgent to identify mitochondrial-related prognostic markers in ESCA patients. Besides, cancer immunotherapy has advanced rapidly in recent years and is now being explored as a treatment for ESCA (Vrána et al. 2019). However, little is known about the relationship of mitochondria and immune in tumors. In this study, we constructed a Cox proportional-hazards model using mitochondria-related markers to predict patients' prognosis of ESCA. Combining these biomarkers with immune infiltration analysis provided a more comprehensive view of the risk factors for ESCA as well as a prediction method of patient prognosis. Immune cell infiltration in tumors has been shown to influence carcinogenesis and progression and may also predict patient response to neoadjuvant chemotherapy (Tian et al. 2020). In summary, our findings highlight mitochondria-related ESCA prognostic signatures and explore their correlation with immunity to provide new ideas for the diagnosis and treatment of ESCA.

Datasets and source
Gene expression RNAseq (HTseq) was downloaded from The Cancer Genome Atlas database (TCGA: https:// portal. gdc. cancer. gov/). The matched phenotype and survival data were from the University of California SANTA CRUZ (UCSC: https:// xenab rowser. net/ datap ages/). GEO datasets are all downloaded from the National Center for Biotechnology Information (NCBI) online Gene Expression Omnibus (GEO) database (https:// www. ncbi. nlm. nih. gov/ geo); 1136 genes encoding mitochondrial proteome were downloaded from MitoCarta3.0 (http:// www. broad insti tute. org/ mitoc arta), including sub-mitochondrial localization genes and genes involved in MitoPathway annotations (Rath et al. 2021). The Gene Set Enrichment Analysis (http:// www. gseamsigdb. org/ gsea/ msigdb/ index. jsp) database was searched with the keywords "mitochondrion" and "mitochondrial"; 165 gene sets (9200 genes) related to mitochondrial structure and function, were enriched. By merging MitoCarta 3.0 database and these 165 gene sets, as well as removing 8306 duplicates, 2030 human mitochondrial-related genes were obtained. The detailed information on these 2030 human mitochondrial-related genes was generated from the Metascape online website (https:// metas cape. org/ gp/ index. html#/ main/ step1 , Zhou et al. 2019, Supplementary Table S1). Patients with OS of fewer than 30 days might die of noncancer-related diseases, so they were excluded from the analysis to remove potential bias related to treatment effects (Jones et al. 2018;Gao et al. 2021). Figure 1 shows the workflow of this study.

Differential expression analysis
The mRNA expression value from TCGA was logarithmic normalized, and the DEGs were identified using the "limma" package in R software (version 3.5.1) (Tian et al. 2020). "Adjusted P < 0.01 and |log (fold change)|> 0.5" was defined as the thresholds for screening DEGs.

Enrichment analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were performed with the "clusterProfiler" R package. All pathways available in the C2-curated gene set from Gene Set Enrichment Analysis (GSEA) were included in the pathway analysis. P-values less than 0.05 were considered statistically significant. Pathway-related genes were downloaded from GSEA (https:// www. gsea-misgdb. org/ gsea/ index. jsp). Data visualization was performed using "ggplot2" in R. Lollipop chart of correlation coefficients was plotted on the bioinformatics website (http:// www. bioin forma tics. com. cn/).

Establishment of risk scoring model for prognosis
Univariate Cox regression, LASSO regression, and multivariate Cox regression were applied in order, using the "survival" package in R. A mathematical formula ( Riskscore = ∑ n i=1 Exp i * Coe i ) was developed to predict the risk score for each patient based on the multivariate Cox regression analysis result. All patients were classified into high-and low-risk groups according to the median risk score.

Evaluation of the value of the risk scoring model
The Kaplan-Meier (K-M) survival analysis was conducted using the R package "survival". The time-dependent relative operating characteristic (ROC) curve was used to assess the accuracy of the established model, conducted by the "tim-eROC" package. Concordance index (C-index) values were calculated based on Harrell's C-index (Harrell et al. 1996) to evaluate the difference between the predicted value and actual value of the Cox model in survival analysis and judge the prediction accuracy of the prognostic model of tumor patients. To verify the general applicability of the Risk Scoring Model, the GSE53624 dataset was downloaded from Gene Expression Omnibus (GEO) database. The risk plot was generated using the R package "pheatmap". Other charts were created with "ggplot2".

Construction of nomogram signature
Using univariate Cox regression analysis (P < 0.1) and multivariate Cox regression analysis (P < 0.05), proper prognostic factors were identified. Based on the results of multivariate Cox proportional hazards analysis, a nomogram was formulated to predict 1-, 3-, and 5-year overall survival.
Each of the factors was represented by a graphical representation through the R package "rms", which could be used to calculate the risk of death for an individual ESCA patient based on points associated with each risk factor. Similarly, the nomogram was evaluated by ROC curves and C-index. The calibration plot predicted probability and actual occurrence rate were compared. Decision curve analysis (DCA) was used to determine net clinical benefits through the R package "rmda".

Immune analysis
Tumor-infiltrating immune cells were calculated using the CIBERSORT algorithm, with R package "cibersort". ESTIMATE (Estimation of Stromal and Immune cells in Malignant Tumor tissues using Expression) algorithm defined immune score, stromal score, estimate score, and tumor purity of ESCA patients via R package "estimate". TIDE (Tumor Immune Dysfunction and Exclusion) was used to assess the potential response to immune checkpoint inhibitor (ICI) therapy (http:// tide. dfci. harva rd. edu/). Gene lists encoding immunomodulators and chemokines were downloaded from translation initiation site database (TISDB) website (http:// cis. hku. hk/ TISIDB/ downl oad. php).

Mutation analysis
Somatic mutation profiles were obtained from the TCGA database in the form of mutation annotation format (MAF). The mutation profiles and frequencies of genes in high-and lowrisk groups were analyzed and visualized using the R package "Maftools". Tumor mutation burden (TMB) was calculated as mutations per megabase (mut/Mb) (Mayakonda et al. 2018).

Drug sensitivity analysis
The same sample of gene expression and drug sensitivity data was downloaded from the cellminer website (https:// disco ver. nci. nih. gov/ cellm iner/, Reinhold et al. 2015) Drug sensitivity data after clinical laboratory verification and FDA standard A Volcano map showing DEGs between tumor and normal tissues of ESCA (|log FC|> 0.5, P < 0.01). B 306 genes were selected by taking the intersection of the DEGs and mitochondria-related genes. C GO analysis of the mitochondria-related DEGs. D KEGG analysis of the mitochondriarelated DEGs certification was filtered. Pearson correlation analysis was conducted to determine the correlation between risk scoring model and drug sensitivity. The drugs were ranked in order of the absolute value of correlation, and the first 2 with a strong correlation with each gene were displayed.

Statistical analyses
Data analyses and visualization were performed using R version 4.0.2. The chi-square test was executed for the comparison of categorical variables between groups. Student's t-test or Mann-Whitney U test was used to estimate statistical significance between two groups for continuous variables. The correlation between the two parameters was assessed through the Spearman correlation analysis. P < 0.05 was considered statistically significant.

Identification of DEGs and enrichment analysis in ESCA
MRNA sequencing results of ESCA patients collected from 150 tumor tissues and 11 adjacent normal tissues were downloaded from the TCGA database. Three patients with OS of fewer than 30 days were removed. A total of 2195 differentially expressed genes (DEGs) were obtained between the two groups with the screening criteria of P value < 0.01 and |log FC|> 0.5 ( Fig. 2A). Among these DEGs, 306 genes overlapped with mitochondria-related genes (Fig. 2B).
GO pathway enrichment analysis of the 306 mitochondria-related DEGs showed, for biological process (BP), DEGs were mainly enriched in the pathway of cofactor metabolic process, small molecule catabolic process, and fatty acid metabolic process; for cellular component (CC), DEGs were significantly enriched in the pathway of the mitochondrial matrix, mitochondrial inner membrane, and mitochondrial outer membrane; for molecular function (MF), DEGs were associated with coenzyme binding, lyase activity, and flavin adenine dinucleotide binding (Fig. 2C).
KEGG enrichment analysis indicated that the pathways were mainly associated with neurodegeneration-multiple disease, carbon metabolism, and diabetic cardiomyopathy (Fig. 2D).

Risk scoring model construction and validation
Based on the gene expression and overall survival (OS) information of 147 tumor samples, 17 genes were screened out by univariate cox regression analysis of 306 mitochondrial-related DEGs (Fig. 3A). Combining the results of LASSO regression analysis, it was considered that the model was the best when the penalty coefficient was 12 (Fig. 3B); thus, the corresponding 12 mitochondrial-related DEGs including PDHA1, HSPE1, APOOL, H6PD, HIGD1A, MAOB, MTERF2, AASS, MT-ND1, BCAP31, SLC44A2, and CHPT were selected. Then, multivariate cox regression analysis was performed on the 12 genes, and six genes, APOOL, HIGD1A, MAOB, BCAP31, SLC44A2, and CHPT1, were able to enter the equation as a prognostic predictor (Fig. 3C, Table 1). In ESCA, the expression level of BCAP31 was significantly up-regulated in tumor tissues, and the expression Fig. 3 Construction of the risk scoring model. A Univariate COX analysis identified 17 genes related to prognostic values. There were 11 genes with HR > 1, which were marked in red, and 6 genes with HR < 1, which were marked in blue. B LASSO regression analysis on these 17 genes obtained 12 genes. C Multivariate cox regression analysis identified 6 genes related to prognostic values. There were 4 genes with HR > 1, which were marked in red, and 2 genes with HR < 1, which were marked in blue. *P < 0.05, **P < 0.01, ***P < 0.001 level of APOOL, HIGD1A, MAOB, SLC44A2, and CHPT1 was down-regulated in tumor tissues compared with normal samples ( Supplementary Fig. S1A). The expression level of APOOL, HIGD1A, MAOB, SLA44A2, and CHPT1 was significantly up-regulated in adenomas and adenocarcinomas, and the expression level of BCAP31 was down-regulated in adenomas and adenocarcinomas compared with squamous cell neoplasms samples ( Supplementary Fig. S1B). A subsequent stratification analysis revealed that these genes were not differentially expressed by age, gender, grade, or stage ( Supplementary Fig. S1C). Kaplan-Meier survival curves ( Supplementary Fig. S1D) confirmed that higher expression of MAOB and lower expression of BCAP31, CHPT1, and HIGD1A contributed to better survival. The difference in overall survival time between high-and low-expression levels of APOOL and SLC44A2 had no statistical significance.
The risk score of each patient was calculated according to the following formula.
Based on the median value of risk scores, ESCA patients were divided into high-and low-risk groups with 73 and 74 individuals, respectively. The chi-square test between high-and low-risk groups of clinical characteristics (Table 2) showed that there were significantly different in esophageal tumor central location (p < 0.001), M stage (p = 0.026), Kaplan-Meier survival analysis confirmed that the prognosis of patients in the high-risk group was significantly worse than that in the low-risk group (p value = 8.8e-06) (Fig. 4A). Figure 4B showed the risk score distributions in low-and high-risk patients in incremental order. The risk plot indicated that more patients from the high-risk group were distributed at the bottom of the figure with shorter survival time; on the contrary, more patients from the low-risk group were distributed at the top of the figure with longer survival time (Fig. 4C). The heatmap showed the expression level of 6 modeling genes in the two groups (Fig. 4D). The AUC values for 1 year, 3 years, and 5 years were 0.737, 0.724, and 0.945, respectively, indicating that the model could be used to predict patient prognosis accurately (Fig. 4E). The C-index of the model was 0.723. As shown in Fig. 4F-J, the results of the validation set (GSE53624) were consistent with our results in Fig. 4A-E.

Nomogram construction and validation
Clinical information of 147 ESCA patients was downloaded from the UCSC database. By univariate and multivariate Cox hazard regression analysis, columnar metaplasia present, M stage, reflux history, and risk score were selected to construct a nomogram with P value < 0.05 ( Supplementary Fig. S2A). The AUC values for 1 year, 3 years, and 5 years were 0.7, 0.722, and 0.835, respectively, indicating the accuracy of the model (Supplementary Fig. S2B). The C-index of the nomogram was 0.777. Values in blod indicated significant difference between low-and highrisk groups. The calibration plot revealed excellent agreement between observations and predictions ( Supplementary Fig. S2C). Based on these results, decision curve analysis (DCA) suggested the clinical effectiveness of risk score when combined with various clinical features and nomograms ( Supplementary Fig. S2D). These results revealed that the risk signature could serve as an independent factor for predicting the prognosis of ESCA patients.

Pathway analysis in high-and low-risk groups
1188 DEGs were obtained between the high-risk and the low-risk groups with the screening criteria of P value < 0.01 and |log FC|> 0.5. The DEGs were then analyzed for GO and KEGG enrichment. For GO BP, DEGs were mainly enriched in the "regulation of neuron projection development, axonogenesis, and epidermis development." For GO MF, DEGs were associated with "cell adhesion molecule binding, actin binding, actin filament binding." Most notably, for GO CC, DEGs were significantly enriched "cell-cell junction and collagencontaining extracellular matrix." KEGG enrichment analysis indicated that the DEGs were mainly enriched in the "hippo signaling pathway and tight junction" (Fig. 5A, B).
On the basis of KEGG enrichment analysis results, we explored the correlation between risk score and hippo signaling pathway biomarker genes. The results of Spearman correlation analyses demonstrated that risk score negatively correlated with a majority of the hippo signaling pathway biomarker genes in ESCA samples, including AMOTL1, WWTR1, NGFR, and TEAD2 (Fig. 5C). Based on GO CC results, Spearman's correlation analysis revealed that risk score was negatively correlated with more cell-cell junction biomarker genes (Fig. 5D).
Additionally, the top 20 enriched pathways in the highrisk and low-risk groups were obtained from GSEA (Supplementary Fig. S3). The result demonstrated that "genes down-regulated in metastases from malignant melanoma, genes down-regulated in esophageal adenocarcinoma and Barret's esophagus" were abundantly enriched in the lowrisk group (Supplementary Fig. S3A). While "Genes upregulated in early gastric cancer, cervical cancer proliferation cluster," potential pathways were enriched in the high-risk group (Supplementary Fig. S3B).

Immune analysis in high-and low-risk groups
According to CIBERSORT immune analysis, there were significant differences in the types of immune cells between the high-and low-risk groups (Supplementary Fig. S4A). Samples with high risk demonstrated a higher abundance of memory resting CD4 + T cells, resting NK cells, and M0 and M2 macrophages. Supplementary Fig. S5A,B showed the distribution of immune cells in high-and low-risk groups and in individuals of two groups, respectively. To determine whether there is a correlation between risk score and immune cells, including effector memory CD4 + T cell, central memory CD4 + T cell, activated CD4 + T cell, activated DC, immature DC, plaslacytoid DC, macrophage, and NK cell, we conducted a correlation test between immune cell marker genes and risk score, as shown in Supplementary Fig. S5C. Supplementary Fig. S4B-I showed marker genes connected with a risk score, which suggested a certain correlation between immune cells and risk score.
As a new weapon against cancer, immune checkpoint therapy, which controls the activity of T cells, improved antitumor immune responses (Sharma and Allison 2015). Therefore, we identified the expression of immune checkpoint signatures in both high-risk and low-risk groups (Supplementary Fig. S5C). The result indicated that there were significant differences in the expression of CD115, TIM1, CD30, CD112, HVEM, HHLA2, SEMA4A, and BTNL2 (Supplementary Fig. S4J). Afterwards, we calculated the correlation between risk score and the expression of immune checkpoint signatures (Supplementary Fig. S5D). The results illustrated that BTNL2, SEMA4A, and CD30 were negatively correlated with risk score, while TIM1, CD155, HVEM, CD112, and HHLA2 were positively correlated with risk score (Supplementary Fig. S4K).
ESTIMATE calculates the tumor purity, the ESTIMATE score, the immune score, and the stromal score of each ESCA sample. Spearman correlation analysis revealed the risk score was weak and negatively correlated with the stromal score while not significantly associated with the tumor purity, the ESTIMATE score, and the immune score (Supplementary Fig. S6A). And between high-risk and low-risk groups, there was no significant difference in tumor purity, stromal score, immune score, and ESTIMATE score (Supplementary Fig. S6B).
ICI therapy has undoubtedly been proven to be an effective strategy of immunotherapy, which led to an important Fig. 4 Evaluation of the value of the risk scoring model. A Kaplan-Meier curves of OS for high-and low-risk groups in TCGA cohort. B Distribution of high-and low-risk groups with the mean value (vertical dashed line) of the risk score in the TCGA cohort. C Distribution of survival status in high-and low-risk groups in TCGA cohort. D Heatmap for the expression of modeling genes in high-and low-risk groups in TCGA cohort. E Time-dependent ROC curves at 1-, 3-, and 5-year OS outcomes in accordance with risk score in TCGA cohort. F Kaplan-Meier curves of OS for high-and low-risk groups in the GSE53624 cohort. G Distribution of high-and low-risk groups with the mean value (vertical dashed line) of the risk score in the GSE53624 cohort. H Distribution of survival status in high-and low-risk groups in the GSE53624 cohort. I Heatmap for the expression of modeling genes in high-and low-risk groups in the GSE53624 cohort. J Time-dependent ROC curves at 1-, 3-, and 5-year OS outcomes in accordance with risk score in the GSE53624 cohort ◂ breakthrough in the field of antitumor therapy. Thus, the TIDE algorithm was used to predict the potentially different responses to ICB therapy (Wu et al. 2021). Patients with higher-risk scores showed a lower level of TIDE score (Fig. 6A). The highrisk group contained a higher proportion of responders to ICI therapy compared with the low-risk group (Fig. 6B). Samples were divided into the high-score group and the low-score group according to the median value of the stromal score. There was no statistical difference in the response of patients to drug treatment evaluated by the stromal score (Fig. 6C). However, combining risk score with stromal scores, the result showed that ICI therapy was more effective in patients with high-risk scores Pathway analysis in high-and low-risk groups. A GO analysis of DEGs between high-and low-risk groups. B KEGG analysis of DEGs between high-and low-risk groups. C Lollipop chart of cor-relation coefficients between risk score and hippo signaling pathway biomarker genes. D Lollipop chart of correlation coefficients between risk score and cell-cell junction biomarker genes and low stromal scores as compared to other groups (Fig. 6D). Above all, we speculated that patients with a high-risk and low stromal score could benefit more from ICI therapy than patients with a low-risk and high stromal score.

Mutation analysis in high-and low-risk groups
Mutation data of the ESCA patients were obtained from TCGA by the "maftools" package. As shown in Supplementary Fig. S7A, TP53, TNN, SYNE1, MUC16, DNAH5, and other genes were frequently mutated in both highrisk and low-risk groups. Among these mutant genes, the mutation rate of TP53 was significantly different between the high-and low-risk groups (Supplementary Fig. S7B). Tumor mutation burden (TMB) scores were calculated for each ESCA patient. Spearman correlation analysis suggested a positive correlation between TMB score and risk score ( Supplementary Fig. S7C), though there was no significant difference in TMB score between high-and low-risk groups ( Supplementary Fig. S7D). The higher the TMB, the better the effect of tumor immunotherapy. The result is consistent with the result of the high-risk group that contained a higher proportion of responders to ICI therapy.

Drug sensitivity analysis of risk scoring model
CellMiner was used to assess the correlations between the expression of model genes and IC50s (half-maximal inhibitory concentrations) of drugs in order to facilitate better precision treatment of ESCA. As shown in Supplementary  Fig. S8, drugs with the 2 highest correlation with model genes and risk score were selected, including Simvastatin, Fluvastatin, SNS-314, Nitazoxanide, Everolimus, malacid, Tegafur, 4SC-202, PX-316, Artemether, Selumetinib, Acetalax, BMS-690514, and Sapitinib. A complete list of all the drugs related to model genes and risk score (|r|> 0.2, P < 0.05) is provided in Supplementary Table S2.

Discussion
As it is reported in Global Cancer Statistic 2020, ESCA is a digestive tract cancer with high morbidity and mortality (Sung et al. 2021). There were over 477,900 new cases and 375,000 annual deaths occurring in China (Chen et al. 2016). Early symptoms of ESCA are not typical and easy to be ignored. Hospitalized patients are often at a late stage, thus making the treatment difficult. Although tremendous advances have been made in diagnostic technologies and management approaches, ESCA patients still have an unsatisfactory five-year survival rate (Yoshimizu et al. 2018). Functional mitochondria, which are essential for the cancer cell, are able to alter the mitochondrial bioenergetic and biosynthetic state (Wallace 2012). Mitochondria are potential targets for ESCA treatment. For instance, Chiyo et al. have reported that Gal-9 induces mitochondrial-mediated apoptosis in esophageal squamous cell carcinoma cells (Chiyo et al. 2019). Similarly, in the study from Ma and Ke et al.,DS2 induces mitochondria-mediated apoptosis in human ESCA cells (Ma et al. 2016). MiR-634 activates mitochondrial apoptosis in esophageal squamous cell carcinoma and enhances chemotherapy-induced cytotoxicity through direct targeting of genes associated with mitochondrial homeostasis, antiapoptosis, antioxidant ability, and autophagy (Fujiwara et al. 2015). Therefore, to develop more effective treatment strategies targeting mitochondria, identifying the molecular etiology involved in ESCA development and progression is feasible and imperative.
Based on univariate cox regression analysis, LASSO regression analysis and multivariate cox regression analysis, six genes MAOB, APOOL, HIGD1A, BCAP31, SLC44A2, and CHPT1 from the 306 mitochondria-related DEGs were obtained for inclusion in the Cox proportionalhazards model as prognostic predictors. Notably, MAOB can be used as a novel indicator to predict the progression and prognosis of colorectal cancer patients (Yang et al. 2020a). APOOL is a cardiolipin-binding component of the Mitofilin/MINOS complex that determines mitochondrial cristae shape (Weber et al. 2013). HIGD1A, a survival factor induced by hypoxia-inducible factor 1 (HIF1), decreases tumor growth while promoting tumor cell survival in vivo (Ameri et al. 2015). BCAP31 was selected to construct a prognostic signature on ESCA associated with DNA repair in the study of Wang and Li et al., which was consistent with our research . Downregulation of SLC44A2 was associated with poor outcomes of neuroblastoma (Lasorsa et al. 2020). CHPT1 is reported to be a tumor-promoting gene in castration-resistant prostate cancer (Wen et al. 2020). The differently expressed of these six genes between normal and ESCA samples, as well as between in adenomas and adenocarcinomas, suggests that these genes are involved in the occurrence, development, and differentiation of ESCA.
In the risk model constructed by these six genes, Kaplan-Meier survival analysis revealed a worse prognosis for the high-risk group than the low-risk group. This result was confirmed by GSE53624 and suggested that the Cox proportional-hazards model was an effective biomarker for predicting the prognosis of ESCA. ROC curves and C-index demonstrated that the Cox proportional-hazards model could accurately predict prognosis. Also, the nomogram model constructed by risk score and other clinical factors was relatively accurate in predicting ESCA patients' survival. DCA result suggested risk score was better to be a prognostic factor of ESCA than other clinical parameters including nomogram.
Enrichment analysis manifested that DEGs between highand low-risk groups were enriched in the hippo signaling pathway and cell-cell junction pathways. ESCA samples showed a negative correlation between risk score and biomarkers of both the hippo signaling pathway and cell-cell junction, indicating that in high-risk patients of ESCA, the hippo signaling pathway and cell-cell junctions may be suppressed. The hippo signaling pathway regulates cell proliferation, apoptosis, and stem cell renewal (Zhang et al. 2018). Dysregulation of the hippo signaling pathway leads to tumor progression and initiation (Lee et al. 2022). The cell-cell junction plays a crucial role in maintaining normal physiological functions, such as migration, proliferation, and paracellular transport. There are several diseases associated with altered cell-cell junctions (Gray et al. 2020). Our results are consistent with these researches.
Bioinformatics analyses have been conducted in various studies to explore potential pathological mechanisms in ESCA over the past decade. Nevertheless, the mechanisms of immune cell infiltration and the genes involved in immune infiltration in ESCA remain unknown. Hence, we calculated the immune score, estimate score, stromal score, and tumor purity. High-risk patients had lower levels of the stromal score, while tumor purity, ESTIMATE score, and immune score were not significantly affected. Using CIBER-SORT, a comparison of 22 immune cell subsets in high-risk and low-risk samples was conducted. It was found that the high-risk group had higher levels of memory resting CD4 + T cells, resting NK cells, and M0 and M2 macrophages, and a lower level of M1 macrophages. Risk scores are moderately correlated with immune cells. Previous studies have reported high levels of CD4 + T cells in ESCA, (Yang et al. 2006) and ESCA with high numbers of tumor-infiltrating CD4 + T cells have a favorable prognosis (Cho et al. 2003). Tumor-infiltrating innate immune cells, like NK cells and macrophages, are responsible for antitumor immunity (Vitale et al. 2014). While M1-type macrophages can kill tumor cells, M2-type macrophages promote tumor growth (Huang et al. 2021). We speculated that the increased infiltration of CD4 + T cells may potentially sensitize tumors to immune therapy. This hypothesis was confirmed by the subsequent prediction of the response to ICI therapy. Based on the TIDE algorithm, the high-risk group presented a relatively lower TIDE score compared with the low-risk group. ICI therapy was more effective in patients with higher-risk scores and lower stromal scores. As a consequence, the high-risk group might benefit from immunotherapy; it provides a signature model for screening immunotherapy patients.
Additionally, in our study, we analyzed the relationship between model genes and the drug sensitivity of anticancer drugs, offering a novel insight into treating tumors and avoiding drug resistance. According to some studies, the Fig. 6 TIDE score between high-and low-risk groups. A Correlation between risk score and TIDE score. B Distribution of responder and nonresponder samples in high-and low-risk groups. C Distribution of responder and nonresponder samples in high and low stromal score groups. D Distribution of responder and nonresponder samples in different risk and stromal score groups. ns: P > 0.05, *P < 0.05, **P < 0.01 Page 13 of 15 109 use of statins, including Simvastatin and Fluvastatin, was significantly associated with a decrease in all-cause and cancer-specific mortality in patients with ESCA (Nguyen et al. 2018). Besides, Everolimus, an immunosuppressive drug, was used for treating breast cancer, neuroendocrine tumors, and kidney cancer (Fu et al. 2016). A combination of tegafur and uracil is widely used for the treatment of breast cancer and other solid tumors like lung, gastric and rectal cancers. There is evidence that 4SC-202 inhibits colorectal cancer, hepatocellular carcinoma, and urothelial carcinoma (Fu et al. 2016;Zhijun et al. 2016;Pinkerneil et al. 2016). Selumetinib has been approved by the FDA for the treatment of adult melanoma, colon cancer, and non-small cell lung cancer, and it has shown antitumor activity in preliminary trials (Irving et al. 2014). Colorectal cancer and non-small cell lung carcinoma are effectively suppressed by BMS-690514 (Yang et al. 2020b;Motte Rouge et al. 2007). For treating ESCA, however, these drugs have not been fully investigated. At present, only a few chemotherapeutic drugs are effective in the treatment of ESCA, and our study sheds new light on the research for new anti-ESCA drugs.
Nonetheless, several limitations of this study should be addressed. First, all analyses were performed using retrospective data, and the prospective data from multi-center cohorts, which would produce more reliable results, is lacking in public databases. Second, the small sample size of normal samples is a factor causing the possible bias of results. Third, according to a previous report, there are regional differences in the mortality of ESCA because of different racial characteristics, living habits, and other complex risk factors (Ma et al. 2022;Middleton et al. 2022;Liao et al. 2021). By obtaining more detailed information about the clinical samples and by categorizing them more precisely, we can get more accurate conclusions. Finally, the detailed influence mechanisms of the risk model genes on immunity should be verified in vivo and in vitro experiments in the further.

Conclusion
Taken together, we focused on the role of mitochondriarelated genes in esophageal carcinoma development and proposed a prognostic signature for individualized integrative assessment. It may be possible to predict the prognosis of ESCA patients more accurately and choose better immunotherapy options in the future, with further research needed.
Funding This work was supported by the National Natural Science Foundation of China (No. 81902513), the Applied Basic Research Project of Shanxi Province (No. 202103021224228 and No. 20210302124376) and the Science/Technology Project of Sichuan Province (No. 2021YFS0161).

Competing interests
The authors declare no competing interests.