Bioinformatics analysis of autophagy-related lncRNAs in esophageal carcinoma

Background: Esophageal carcinoma (ESCA) is a malignant tumor with high invasiveness and mortality. Autophagy has multiple roles in the development of cancer; however, there are limited data on autophagy genes associated with long non-coding RNAs (lncRNAs) in ESCA. The purpose of this study was to screen potential diagnostic and prognostic molecules, and to identify gene co-expression networks associated with autophagy in ESCA. Methods: We downloaded transcriptome expression proles from The Cancer Genome Atlas and autophagy-related gene data from the Human Autophagy Database, and analyzed the co-expression of mRNAs and lncRNAs. In addition, the diagnostic and prognostic value of autophagy-related lncRNAs was analyzed by multivariate Cox regression. Furthermore, Kyoto Encyclopedia of Genes and Genomes analysis was carried out for high-risk patients, and enriched pathways were analyzed by gene set enrichment analysis. Results: The results showed that genes of high-risk patients were enriched in protein export and spliceosome. Based on Cox stepwise regression and survival analysis, we identied seven autophagy-related lncRNAs with prognostic and diagnostic value, with the potential to be used as a combination to predict the prognosis of patients with ESCA. Finally, a co-expression network related to autophagy was constructed. Conclusion: These results suggest that autophagy-related lncRNAs and the spliceosome play important parts in the pathogenesis of ESCA. Our ndings provide new insight into the molecular mechanism of ESCA and suggest a new method for improving its treatment.


Background
Autophagy is a catabolic process that can degrade and regulate proteins or organelles to maintain the stability of the body. Under normal physiological conditions, autophagy maintains a low level of activity, forming autophagosomes by capturing organelles to be degraded [1]. These combine with the lysosome to form an autophagic lysosome, which degrades the contents for use in the self-renewal of cells [2].
Existing studies have shown that autophagy in the pathological state is closely related to the occurrence and development of cancer [3]. Increased autophagy has an inhibitory effect during the early development of cancer, and on benign tumors. However, the main role of autophagy in cancer is to promote tumor growth. Compared with normal cells, cancer cells have greater autophagy dependence [4].
In addition, cancer can change the normal autophagy mechanism, and autophagy-related gene mutations can lead to the occurrence of cancer [5].
Esophageal carcinoma (ESCA) is eighth the most common malignant tumor in the world, with high mortality and invasiveness [6]. In recent decades, with the aging of the population, the number of ESCA patients has increased [7]. The occurrence of ESCA is mainly due to risk factors such as chronic esophageal stimulation, smoking, and alcohol. At present, treatments for ESCA include surgery, radiotherapy, chemotherapy, and immunotherapy; however, their effects remain unsatisfactory [8]. In addition, it has been reported that autophagy-related genes (ARGs) and TNM stages can predict the prognosis of patients with ESCA [9]. However, due to the in uence of tumor heterogeneity, the TNM stage cannot accurately predict the prognosis of ESCA patients [10]. Moreover, the pathogenesis of ESCA is often regulated by complex molecular networks. The study aimed to elucidate the mechanism of autophagy in the pathogenesis of ESCA, and to identify prognostic factors and potential molecular therapeutic targets.
First, we used The Cancer Genome Atlas (TCGA) database to obtain a gene expression matrix for ESCA.
Furthermore, we screened autophagy-related genes in ESCA using the Human Autophagy Database (HADb). Then, we constructed an autophagy-related co-expression network based on gene set enrichment analysis (GSEA), survival analysis, and multivariate Cox regression analysis. Finally, seven predictive lncRNAs, RAB11B-AS1, ZFAS1, AC093582.1, AC012467.2, AC079684.2, AC092687.3, and AC083799.1, were identi ed. These genes are closely related to the autophagy of ESCA. This study provides a new molecular therapeutic target and lays the foundation for a better understanding of the mechanism of autophagy in ESCA. The work ow of the speci c analysis is shown in (Supplementary Fig.S1).

Data preparation
RNA expression data were obtained from TCGA (https://www.cancergenome.nih.gov/). As of July 16, 2020, 11 normal samples and 160 ESCA samples had been merged. We also downloaded gene expression data for ESCA, reported as fragments per kilobase million. Then, we used a Perl script to extract an mRNA matrix and long non-coding RNA (lncRNA) matrix for ESCA. Further, we downloaded autophagy-related mRNA data from HADb (http://www.autophagy.lu/) [11].
Co-expression analysis of autophagy genes Based on the autophagy genes in HADb, we screened out a matrix of autophagy-related mRNAs from ESCA patient data in TCGA. Then, we used the limma package of the R software (version 4.0.2) to analyze the co-expression between this autophagy mRNA matrix and the ESCA lncRNA matrix, and to remove lncRNAs with low expression. At the same time, whether the expression of lncRNA and ARGs uctuated in different tumor samples was tested by cyclic sentences, and the genes with no differential expression in tumor samples were excluded. Furthermore, the correlation coe cient between lncRNA and ARGs was evaluated by correlation test. Finally, we identi ed the lncRNAs co-expressed with autophagy genes as autophagy-related lncRNAs for subsequent analysis. |Correlation coe cient | ≥ 3 and p-value < 0.001 were used as the screening criteria [12][13].
Survival analysis of lncRNAs associated with autophagy We used our Perl script to extract the survival time and survival status of ESCA patients from the clinical data in TCGA. The survival R package was used to analyze the relationships between survival and autophagy-related lncRNAs. Kaplan-Meier survival analysis was used to analyze the relationship between lncRNAs and prognosis of ESCA patients. p < 0.05 was considered to indicate a signi cant difference.

Multivariate Cox regression model analysis
We used the survival R package to perform multivariate Cox regression analysis using the seven meaningful lncRNAs (ZFAS1, RAB11B-AS1, AC012467.2, AC079684.2, AC092687.3, AC083799.1, and AC093582.1). We also performed stepwise regression analysis on these seven variables. To evaluate the prognostic value of the nal gene combination in patients with ESCA, patients were divided into high-and low-risk groups based on the median risk score. Moreover, survival analysis and receiver operating characteristic (ROC) curve analysis of the nal gene combination were carried out using the survival and survivalROC R packages, respectively. Further, we calculated the area under the ROC curve (AUC). A heat map, risk scores, and survival diagram for the seven lncRNAs were constructed using the pheatmap package. To increase the comparison with clinical parameters, we combined the clinical data of ESCA patients and the risk model constructed in this experiment and used the survival package of R software for univariate and multivariate COX analysis to get the relevant forest plot. Finally, the relevant ROC curve was drawn (p < 0.05).

Construction of co-expression network
We used the Cytoscape software (version 3.7.2) [14] to visualize the autophagy-related co-expression network. Then, the ggalluvial R package was used to construct a Sankey diagram of lncRNAs for the high-and low-risk groups of ESCA patients. Hazard ratio (HR) values > 1 and < 1 were regarded as indicating high-risk and low-risk patients, respectively.

KEGG analysis and GSEA
After the ESCA patients had been divided into high-and low-risk groups, we assigned each sample to the high-or low-risk group according to the median risk score, and then performed KEGG (version 7.1) enrichment analysis using GSEA (version 4.0.3) [15] based on the TCGA mRNA matrix. NES ≥ 1, NOM p ≤ 0.05, and false discovery rate q ≤ 0.25 were considered to indicate signi cance.

Results
Co-expression analysis of autophagy genes A total of 232 autophagy-related genes were obtained from HADb. Then, we used the limma R package to analyze the co-expression of autophagy-related mRNAs and lncRNAs in patients with ESCA. We averaged the repeated lncRNAs and eliminated those with low expression. Finally, we obtained 199 autophagyrelated mRNAs and 1746 autophagy-related lncRNAs in ESCA patients.

Survival analysis of lncRNA associated with autophagy
To study the prognostic role of lncRNA in patients with ESCA, we analyzed the relationships of 1746 genes with survival. According to Kaplan-Meier survival analysis, 12 autophagy-related lncRNAs had signi cant prognostic value in patients with ESCA. Among them, only RAB11B-AS1 and AC093582.1 were related to good prognosis in patients with ESCA. High expression of the other 10 genes was signi cantly associated with shorter overall survival (OS) ( Table 1).

Multivariate Cox regression model analysis
As various marker combinations appeared very promising as prognostic indicators, we further evaluated the prognostic value of a combination of 12 genes in patients with ESCA. Using stepwise regression analysis of these 12 variables, we obtained seven variables by Cox regression. These seven lncRNAs were closely related to the prognosis of patients with ESCA (Fig. 1). To further evaluate the prognostic value of the seven-gene combination in patients with ESCA, patients were divided into high-and low-risk groups according to the median score from the Cox regression model. Patients' risk scores continued to increase (Fig. 2a) from left to right. Compared with the low-risk group, the high-risk group had higher risk scores.
According to the survival diagram, patients in the high-risk group had higher mortality and shorter survival times than those in the low-risk group (Fig. 2b). Kaplan-Meier survival analysis showed that the survival time of patients in the high-risk group was signi cantly shorter than that of those in the low-risk group (p < 0.05) (Fig. 2c). ROC curve analysis showed that the combination of seven genes had high accuracy and speci city in evaluating the prognosis of patients with ESCA, with an AUC value of 0.797 (Fig. 2d). These results suggest that the combination of seven genes could be used as a speci c diagnostic and prognostic index in patients with ESCA. The expression of the seven genes in the high-and low-risk groups was visualized using a heat map (Fig. 3).
The forest plot of the univariate analysis showed that stage, N, M, and riskscore could all predict the prognosis of patients with ESCA ( Supplementary Fig.S2a). Multivariate COX analysis showed that only riskscore could be used as an independent prognostic factor ( Supplementary Fig.S2b). However, the ROC results showed that the area under the curve (AUC) of the risk model, M, and N is 0.777, 0.509, and 0.676, respectively ( Supplementary Fig.S2c). Therefore, compared with TNM staging, the risk model we constructed is more accurate in predicting patient survival.

Construction of co-expression network
After the multivariate Cox regression analysis, we used Cytoscape to visualize the co-expression network. The network included seven lncRNAs and 33 mRNAs (Fig. 4a, Table 2) associated with autophagy. Overall, HR > 1 was considered to be a risk factor, whereas HR < 1 was a protective factor. The Sankey diagram indicated that RAB11B-AS1 and AC093582.1 were protective factors in patients with ESCA, whereas ZFAS1, AC012467.2, AC079684.2, AC092687.3, AC083799.1 were risk factors (Fig. 4b, Table 3).

KEGG analysis and GSEA
To study the biological characteristics of the high-and low-risk groups of patients with ESCA, we used GSEA to analyze the KEGG enrichment results. In the high-risk group, enrichment was mainly found in protein export and the spliceosome. The difference was statistically signi cant. The enrichment scores of both KEGG pathways were positive, suggesting that most of the genes in the high-risk group were highly expressed in these two pathways and accompanied by a higher risk score (Fig. 5). There was no signi cant enrichment in the low-risk group.

Discussion
ESCA is a common malignant tumor with high invasiveness. Although the pathogenesis of ESCA is still unclear, it is known to be associated with chemical stimulation, genetic factors, and other esophageal diseases. Studies have shown that the biological characteristics of ESCA are closely related to autophagy. In the early stage of ESCA, autophagy can activate anti-tumor cells and micro-environment to inhibit the development of ESCA [16]. However, in the middle and later stages of ESCA, autophagy can be used as a promoter of tumor development, by clearing abnormal mitochondria and providing energy for tumor cells, and then promote the proliferation and metastasis of tumor cells [17]. Whelan et al. have also demonstrated that autophagy can promote the metastasis of ESCA by mediating the mechanism of epithelial-mesenchymal transition (EMT) [18]. In addition, it has been proved that autophagy inhibitors can inhibit the development of ESCA and improve the prognosis of patients [19]. Interestingly, anesthetics can also inhibit tumor growth by regulating autophagy. So et al. found that midazolam can induce apoptosis of Leydig tumor cells by regulating autophagy [20]. Zhang et al. also found that ropivacaine can inhibit the growth of melanoma [21]. Since these autophagy inhibitors have not been widely used in clinical treatment, their safety and potential molecular mechanisms related to ESCA are unclear. Therefore, it is very bene cial for patients to explore the potential autophagy mechanism of ESCA and to nd more effective targets for diagnosis and treatment. However, most studies tend to evaluate the role of ARGs in ESCA. So far, there is no study on the diagnostic and prognostic value of autophagy-related lncRNAs in ESCA. Therefore, it is necessary to elucidate the potential molecular mechanism of autophagy-related lncRNAs in ESCA and identify new therapeutic targets.
In this study, we divided ESCA patients into high-and low-risk groups, and then performed KEGG analysis with GSEA on the genes of patients in the high-and low-risk groups. The nal results showed that the genes in the high-risk group were enriched in the protein export and spliceosome KEGG pathways, compared with the low-risk group.
The spliceosome has been shown to be associated with the occurrence of cancer, and is also closely related to the autophagy process in cancer [1]. The spliceosome is a multi-component complex formed during the splicing of RNA, and has an important role in regulating genetic activities [22]. Under normal physiological conditions, the spliceosome removes introns and exon junctions by splicing precursor mRNA, which is transformed into mature mRNA and nally translated by ribosomes into proteins that perform various functions and activities essential to life. During tumor formation, abnormal RNA splicing at the transcript level often promotes the occurrence of cancer [23]. High expression of SF3B4, a component of the splicing complex, is signi cantly correlated with shorter OS in patients with ESCA [24].
Similarly, bioinformatics studies have shown that the occurrence of ESCA is related to the spliceosome [25]. Interestingly, the spliceosome is also closely related to autophagy [26][27][28]. Quidville et al. showed that SNRPE was overexpressed in lung and breast cancer, and knocking down SNRPE could cause cancer cell death through autophagy [29]. Therefore, we speculate that the spliceosome could affect the progress of ESCA by regulating the expression of autophagy genes in ESCA.
In this study, we identi ed seven lncRNAs and 33 mRNAs by analyzing autophagy-related lncRNAs in ESCA and constructing a co-expression network. These genes were closely related to autophagy in ESCA.
ULK1 is a serine/threonine protein kinase that initiates autophagy and regulates autophagy through mTORC1 and AMPK. It has been shown to regulate the growth of cancer cells through autophagy in a variety of cancers [30]. In colon cancer, Liu et al. blocked the autophagy of cancer cells by knocking down the expression of ULK1, thereby promoting cell apoptosis and inhibiting the growth of colon cancer [31]. ULK1 has a similar role in prostate cancer [32]. Jiang et al. found that ULK1 was highly expressed in ESCA patients compared with normal samples through bioinformatics analysis [33]. Moreover, high expression of ULK1 was signi cantly correlated with shorter OS. Functional experiments showed that after inhibiting the expression of ULK1, the proliferation of ESCA cells decreased and apoptosis increased [33]. MAPK8, also known as JNK, is a member of the protein kinase family that participates in processes including cell proliferation, autophagy, and transcriptional regulation. MAPK8 can activate autophagy in lung cancer, bladder cancer, and prostate cancer cells. He et al. showed that activating MAPK8 could induce autophagy and reduce apoptosis of cancer cells [34]. Bao et al. found low expression of MAPK8 in ESCA, and showed that overexpression of MAPK8 could inhibit the proliferation and migration of cancer cells and was related to poor prognosis in ESCA patients [35]. CAPN10 is highly expressed in ESCA. Chan et al. found that knockdown of CAPN10 decreased proliferation of ESCA cells and increased apoptosis, which was signi cantly related to poor prognosis. However, the autophagy mechanism of CAPN10 in ESCA has not been reported. In addition, ULK1, MAPK8, and CAPN10 have roles involving the spliceosome in cancer or other diseases. In breast cancer, abnormal splicing of ULK1 leads to an increase in autophagy and promotes the growth of breast cancer cells [36]. ULK1 and MAPK8 have a similar role in retinitis pigmentosa [37]. Abnormal splicing of CAPN10 leads to ovarian cancer [38].
ZFAS1 is a new tumor-related lncRNA that is abnormally expressed in multiple tumors [39]. ZFAS1 is highly expressed in colon cancer and can inhibit tumor proliferation and promote cancer cell apoptosis by regulating downstream target genes [40]. ZFAS1 is also a sarcoplasmic reticulum Ca2+-ATPase2a (SERCA2a) inhibitor; it induces mitochondrial-mediated cardiomyocyte apoptosis by inhibiting SERCA2a [41]. The expression of ZFAS1 is upregulated in ESCC, and its silencing can inhibit the proliferation, migration, and invasion of ESCC cells [42]. However, the autophagy mechanism mediated by ZFAS1 in ESCA remains unclear. There have been few reports on RAB11B-AS1; at present, little is known about this lncRNA except that its expression is increased in breast cancer and lung cancer [43,44]. Overexpression of RAB11B-AS1 has been shown to promote the proliferation and metastasis of breast cancer and lung cancer cells; however, it had the opposite effect on osteosarcoma [45].
In this study, we found that the pathogenesis of patients in the high-risk group may be related to the spliceosome, in contrast to those in the low-risk group. Survival analysis showed that a total of seven lncRNAs were closely related to the prognosis of patients with ESCA. Furthermore, the results of the multivariate Cox analysis showed that the combination of RAB11B-AS1, AC093582.1, ZFAS1, AC012467.2, AC079684.2, AC092687.3, and AC083799.1 could identify high-risk ESCA patients; thus, they represent a novel multi-gene signature for potential clinical use in ESCA. However, the role of these seven lncRNAs in the autophagy mechanism of ESCA has not been determined, and there have been no reports about the role of the spliceosome. Therefore, we will investigate this in future research.

Conclusions
we used mRNA and lncRNA expression pro les in ESCA to construct an autophagy-related co-expression network. This will provide a basis for further research on the regulatory mechanism of ESCA. In addition, we identi ed a combination of seven genes with potential applications in the diagnosis and prognostic analysis of ESCA patients. In future work, based on these data, we will analyze the mechanisms of autophagy and the spliceosome in ESCA in more detail, in order to clarify the role of these genes in the autophagy network.

Declarations
Ethics approval and consent to participate Not applicable.

Consent to publish
Not applicable.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the TCGA database (https://portal.gdc.cancer.gov/) and the HADb database (http://www.autophagy.lu/). Processed data are available from the corresponding author on a reasonable request. DW performed the statistical analysis, created the gures, and wrote the manuscript. YD and JBF helped with designing the experiments and assisted in manuscript writing. All authors read and approved the nal version of the manuscript.