Clinical Prognostic Model of Autophagy-Related LncRNA Genes in Esophageal Adenocarcinoma (EAC) for Predicting Overall Survival (OS) of Patients: Evidence From Bioinformatic Analysis

Objective: Autophagy-related LncRNA genes play a vital role in the development of esophageal adenocarcinoma. Our study aimed to construct a prognostic model based on autophagy-related LncRNAs in esophageal adenocarcinoma and to use this model to calculate a survival risk value for each patient, which can be used to evaluate survival prognosis. At the same time, we explored potential target genes for therapy to provide valuable guidance for the clinical diagnosis and treatment of esophageal adenocarcinoma.
Methods: We downloaded 261 samples of LncRNA-related transcription data and clinical data of 87 patients with esophageal adenocarcinoma from the TCGA database, and data on 307 autophagy-related genes from www.autuphagy.com. We used R software (Version 4.0.2) for data analysis, merged the transcriptome LncRNA genes, autophagy-related genes and clinical data, and screened autophagy-related LncRNA genes associated with the prognosis of esophageal adenocarcinoma. We also performed KEGG, GO and GSEA enrichment analyses on these LncRNA genes to analyze the risk characteristics and bioinformatic functions of signal transduction pathways. Univariate and multivariate Cox regression analyses were used to assess the autophagy-related LncRNAs as independent risk factors. ROC curves were used to evaluate the predictive performance of the prognostic model, and we further studied the correlation between the autophagy-related LncRNAs and the clinical characteristics of patients with esophageal adenocarcinoma. Finally, we used survival analysis, risk analysis and independent prognostic analysis to verify the prognostic model of esophageal adenocarcinoma.
Results: We screened and identified 22 autophagy-related LncRNA genes that are highly correlated with the overall survival (OS) of patients with esophageal adenocarcinoma. The area under the ROC curve was high (AUC=0.941) and the calibration curve showed good agreement, indicating that the model has statistical value. In addition, univariate and multivariate Cox regression analyses showed that this autophagy-related LncRNA signature is an independent predictor in esophageal adenocarcinoma.
Conclusion: The LncRNAs screened and identified here may participate in the regulation of cellular autophagy pathways and, at the same time, affect tumor development and the prognosis of patients with esophageal adenocarcinoma. These results indicate that the risk signature and nomogram are important indicators of the prognosis of patients with esophageal adenocarcinoma.


Introduction
Esophageal cancer is one of the common malignant tumors of the upper gastrointestinal tract and ranks among the top ten causes of cancer death worldwide. In recent years, the incidence of esophageal cancer has shown an increasing trend globally (1). The pathological types of esophageal cancer are mainly divided into esophageal squamous cell carcinoma and esophageal adenocarcinoma.
Esophageal squamous cell carcinoma is common in China, Japan, South Korea and Southeast Asia, whereas esophageal adenocarcinoma is more common in Western regions such as Europe, America and Australia.
However, recent studies have found that in China and other Eastern countries, with changes in living standards, eating habits and the surrounding environment, the incidence of esophageal adenocarcinoma has increased significantly faster than that of esophageal squamous cell carcinoma (2), and this has caused widespread concern. In clinical practice, many patients with esophageal adenocarcinoma lack specific manifestations in the early stage and are not detected in time. Patients with advanced esophageal adenocarcinoma have a worse prognosis because of metastasis and recurrence, and morbidity and mortality are gradually increasing. Therefore, pretreatment intervention and evaluation will help identify and treat patients at high risk of recurrence. The autophagy-related LncRNA prognostic model for esophageal adenocarcinoma may be used to guide targeted therapy strategies in the future (3); it may reduce the mortality and recurrence rate and improve the survival and prognosis of patients. It is therefore very important to determine the prognostic indicators of esophageal adenocarcinoma and to predict the overall survival time of patients effectively.
Long-chain non-coding RNA (LncRNA) (4) is defined as RNA sequences that are longer than 200 nucleotides and are not translated into protein. LncRNAs participate in cell growth, proliferation and gene replication, including the regulation of transcription, translation, editing and other processes. Recent reports indicate that, under certain conditions, some LncRNAs regulate apoptosis by participating in the autophagy pathway (5). With the rapid development of biology and genetics, more evidence shows that autophagy-related LncRNA expression plays a vital role in the initiation, induction and progression of esophageal adenocarcinoma. The expression of these autophagy-related LncRNAs therefore directly or indirectly affects the overall survival (OS) of patients with esophageal adenocarcinoma. In addition, related studies have shown that these LncRNAs may become candidate prognostic indicators of esophageal adenocarcinoma and ideal sites for targeted therapy in the future. Exploring autophagy-related LncRNAs thus has important guiding value and broad application prospects in the basic research, clinical diagnosis and treatment of esophageal adenocarcinoma.

Data acquisition and processing
We downloaded data on esophageal adenocarcinoma from The Cancer Genome Atlas (TCGA) database on October 29, 2020. These data comprise LncRNA sequencing data for 261 transcriptome samples and the clinical information of 87 patients with esophageal adenocarcinoma. We also downloaded data on a total of 307 autophagy-related genes from the www.autuphagy.com website. These 3 data sets were overlapped and merged by their corresponding gene IDs, and 47 overlapping autophagy-related LncRNAs were retained for further analysis. From these 47 LncRNA genes, 22 genes related to the prognosis of esophageal adenocarcinoma were screened. In R, the "edgeR" package was used to identify differentially expressed LncRNAs (DELs) among the 261 transcriptome samples and 87 clinical samples downloaded from TCGA, with thresholds of |logFC|>2 and corrected P<0.05. In the survival-time analysis, the 22 prognosis-related LncRNAs showed statistically significant differences.

Screening and identification of autophagy differential genes
Our study used GEO2R (http://www.ncbi.nlm.gov/geo2r) to screen the differentially expressed genes (DEGs) between autophagy-related LncRNA samples. GEO2R is an interactive web tool that lets researchers compare two or more groups of samples in a GEO data set in order to identify DEGs across experimental conditions. Adjusted P-values (adj. P) based on the Benjamini and Hochberg false discovery rate were used to identify statistically significant target genes and to correct for false positives; probe sets without corresponding gene names, and genes with multiple probe sets, were removed. The cut-off was a false discovery rate (FDR)<0.05 and |logFC|>1.
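The screening rule above (Benjamini and Hochberg adjustment followed by FDR and fold-change cut-offs) can be illustrated with a minimal Python sketch. This is not the GEO2R or R code used in the study; the function names, the tuple layout and the example gene names are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(genes, fdr_cutoff=0.05, lfc_cutoff=1.0):
    """Keep genes with adjusted p < fdr_cutoff and |logFC| > lfc_cutoff.

    genes: list of (name, log_fc, raw_p) tuples (hypothetical layout).
    """
    adjusted = benjamini_hochberg([raw_p for _, _, raw_p in genes])
    return [
        name
        for (name, log_fc, _), q in zip(genes, adjusted)
        if q < fdr_cutoff and abs(log_fc) > lfc_cutoff
    ]


# Made-up example values, not data from the study.
example = [("A", 2.5, 0.01), ("B", 0.5, 0.04), ("C", -1.8, 0.03), ("D", 3.0, 0.005)]
selected = select_degs(example)
```

In this toy input, gene "B" passes the FDR cut-off but is dropped because its absolute logFC does not exceed 1, which shows that both conditions must hold.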

KEGG and GO enrichment analysis of autophagy-related genes
KEGG is an online database resource used to interpret the large volumes of molecular data generated by high-throughput experiments and to learn about the high-level biological functions of the corresponding LncRNAs. GO is a bioinformatics resource used to annotate and analyze the biological processes of these genes. We downloaded the corresponding gene IDs, gene expression values and logFC. Before running the script, we installed the "colorspace", "stringi", and "ggplot2" packages and then loaded the "BiocManager", "DOSE", and "Enrichplot" packages in R, using a corrected P value of 0.05 as the filter condition for GO enrichment.
Before drawing the GO enrichment bubble chart, we set width=10 and height=8, and displayed the corresponding IDs where -log(adj p-value)>3. Results with a false discovery rate (FDR)<0.05 were considered significant.
We used R software to perform KEGG and GO analysis on the differentially expressed autophagy-related LncRNAs in order to understand the functions and pathways in which they are enriched. This provides a basis for future basic research on esophageal adenocarcinoma and valuable guidance for the clinical prognosis, diagnosis and treatment of patients with esophageal adenocarcinoma.

Screening of prognostic related autophagy genes
From the 47 autophagy-related LncRNAs obtained through integration and screening with the "edgeR" package in R, we identified 22 LncRNA genes related to the prognosis of esophageal adenocarcinoma, including 18 LncRNA genes that are highly expressed in esophageal adenocarcinoma tumor cells and 4 LncRNA genes with low expression.

Construction of a prognostic model of autophagy-related genes in esophageal adenocarcinoma

Survival analysis
The Kaplan-Meier method was used to compare survival between the high-risk group and the low-risk group. We calculated the risk value of each patient with esophageal adenocarcinoma and the overall survival (OS), reported as the 1-year, 3-year and 5-year survival rates.
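As an illustration of how a Kaplan-Meier survival rate at a given time point is obtained, the following Python sketch implements the product-limit estimator. It is a simplified stand-in for the R survival tooling, and the function names and example follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up durations (for example, in years)
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def survival_at(curve, t):
    """Read the estimated survival probability at time t from the step curve."""
    prob = 1.0
    for time, value in curve:
        if time <= t:
            prob = value
        else:
            break
    return prob


# Made-up follow-up data for five patients, not data from the study.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
```

Calling `survival_at(curve, 3)` on a curve fitted separately to each risk group would give that group's 3-year survival rate; comparing the two curves statistically (for example, with a log-rank test) is a separate step not shown here.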

ROC analysis
ROC analysis was used to test the sensitivity and specificity of survival prediction by the independent risk factors. For a useful model, the area under the ROC curve (AUC) ranges from 0.5 to 1.0; the closer the AUC is to 1, the more accurate the model's predictions. An AUC of 0.5 or below suggests that the prognostic model has no predictive power.
ROC analysis was run with the "survivalROC" package in R to evaluate the LncRNA signature.
The calibration curve was used to evaluate whether the predicted survival is consistent with the actual survival, and the area under the curve (AUC) was also used to rate the predictive performance and accuracy of the prognostic model. In addition, we validated the prognostic model of esophageal adenocarcinoma through univariate and multivariate Cox proportional hazards regression (CPHR) analysis, and further explored whether the 22-LncRNA signature is an independent predictor of overall survival (OS) in patients with esophageal adenocarcinoma.
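To make the meaning of the AUC concrete, the sketch below computes it in Python for a simple binary outcome as the probability that a randomly chosen patient who died has a higher risk score than a randomly chosen survivor. This is a deliberate simplification: the time-dependent ROC analysis used for survival data also accounts for censoring, which this sketch does not. The function name and example values are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Computed as the fraction of (positive, negative) pairs in which the
    positive case has the higher score; ties count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up risk scores and outcomes, not data from the study.
perfect = auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0])
partial = auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
uninformative = auc([0.5, 0.5, 0.5, 0.5], [1, 1, 0, 0])
```

A score that separates the two groups completely gives an AUC of 1, while a score that carries no information gives 0.5, matching the interpretation given in the text.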

GSEA enrichment analysis
Based on the median risk score of the 22-LncRNA signature, the 261 transcriptome samples and 87 clinical samples downloaded from TCGA were divided into a high-risk group and a low-risk group. We used the Java GSEA software to perform GSEA enrichment analysis between the high-risk and low-risk groups of esophageal adenocarcinoma patients, with a nominal (NOM) P value <0.05 as the selection criterion.
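The median split can be sketched in Python as follows. The text does not state how the risk score is computed, so the sketch assumes the form commonly used for Cox-based signatures, a sum of expression values weighted by regression coefficients; that formula, the function names, the gene names and all numbers are assumptions for illustration only.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Assumed risk score: sum over signature genes of coefficient * expression.

    expression:   {patient_id: {gene: expression value}}
    coefficients: {gene: regression coefficient}
    """
    return {
        patient: sum(coefficients[g] * values[g] for g in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Hypothetical two-gene signature and four patients (made-up numbers).
coefficients = {"geneA": 0.5, "geneB": -1.0}
expression = {
    "p1": {"geneA": 2.0, "geneB": 1.0},
    "p2": {"geneA": 4.0, "geneB": 0.5},
    "p3": {"geneA": 1.0, "geneB": 2.0},
    "p4": {"geneA": 6.0, "geneB": 1.0},
}
scores = risk_scores(expression, coefficients)
groups = split_by_median(scores)
```

Patients whose score equals the median are placed in the low-risk group here; the opposite convention would be equally defensible and should be stated explicitly in any real analysis.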

Difference analysis of autophagy-related genes
We prepared a detailed data-processing flowchart to help readers follow the design and verification steps of this study (Fig 1).

Boxplot
In this study, 262 LncRNA transcription samples were analyzed by box plot. We used the "ggpubr" package in R to compare the expression levels of the differential genes and drew boxplots of the 47 autophagy-related LncRNAs in esophageal adenocarcinoma samples versus normal tissue samples (tumor count: 147; normal count: 115) (Fig 2). The abscissa shows the names of the 47 differential LncRNA genes, and the ordinate shows their expression. All 47 differential autophagy-related LncRNAs are up-regulated in esophageal adenocarcinoma samples.

GO enrichment analysis
In Fig 3A, the ordinate is the name of the GO term, the abscissa is the number of genes enriched in each GO term, and the color of the bar represents the degree of GO enrichment. GO is divided into three categories: biological process (BP), cellular component (CC) and molecular function (MF). We found that these genes are mainly concentrated in apoptosis and autophagy regulation mechanisms, including regulation of cysteine-type endopeptidase activity, regulation of apoptotic signaling pathways, vacuolar membrane, integrin complex, autophagosome, participation in cell adhesion, autophagosome membrane, membrane raft, membrane microdomain, melanosome, pigment granule, membrane domain, protein complex, etc. (Fig 3A). In Fig 3B, the ordinate is -log(adj p-value) and the abscissa is the Z-score; Z-score>0 indicates that the number of up-regulated genes enriched in the GO term is large, and Z-score<0 indicates that it is small (green represents BP, red represents CC, and blue represents MF). We found that the genes enriched in GO are all up-regulated genes, including GO:0001961, GO:0060760 and GO:0043280 (Fig 3B). We displayed the 10 GO terms (count number=10) showing the most significant enrichment of these differential genes. In the GO heat map, the genes we screened are mainly enriched in response to unfolded protein, regulation of apoptotic signaling pathway, process utilizing autophagic mechanism, neuron death, neuron apoptotic process, autophagy and other terms (Fig 3D).

KEGG enrichment analysis
In Fig 4A, the abscissa is the number of genes and the ordinate is the name of the signaling pathway; a darker color indicates higher enrichment of genes in that pathway. Enrichment is most obvious in KEGG pathways such as autophagy, apoptosis, measles and influenza (Fig 4A). In Fig 4B, the abscissa is the gene ratio, the ordinate is the name of the KEGG signaling pathway, and the color of the bubble represents the significance of gene enrichment. We found significant enrichment in pathways such as autophagy, process utilizing autophagic mechanism, regulation of endopeptidase activity, and regulation of apoptotic signaling pathway. Among these differential genes, the autophagy pathway is the most enriched, which also supports a correlation between esophageal adenocarcinoma-related genes and the autophagy signaling pathway (Fig 4B). When we set -log(adj p-value)>3, the names of the KEGG enrichment pathways of the autophagy-related LncRNA genes are displayed. We found that hsa05219 and hsa01524 are both up-regulated in the KEGG enrichment analysis (Fig 4C). These up-regulated genes are mainly enriched in hsa01524, which is related to resistance to anti-tumor drugs. The KEGG pathways of these up-regulated genes are also related to ovarian cancer, lung cancer, bladder cancer, the P53 signaling pathway and the IL-17 signaling pathway (Fig 4D). The autophagy-related LncRNAs we downloaded and screened show high expression in pancreatic cancer, bladder cancer, infection, and drug resistance (Fig 4E).

Prognostic model construction and analysis

Survival analysis
From the 47 screened autophagy-related LncRNA genes in esophageal adenocarcinoma, we performed further screening and overlap analysis in R and collected clinical data on the overall survival (OS) of patients, including the 1-year, 3-year and 5-year survival rates, in order to construct a prognostic model. In the end, we obtained 22 prognosis-related LncRNAs, including 18 LncRNA genes highly expressed in esophageal adenocarcinoma tumor cells and 4 LncRNA genes with low expression.
In addition to drawing survival curves for these 22 prognosis-related LncRNAs, we performed an overall survival (OS) analysis (Fig 6). The final results showed that the high-risk LncRNA signature is important for the prognosis of patients with esophageal adenocarcinoma. As shown in Fig 6, the survival rates of the high-risk group (1-year survival rate: 50%, 3-year survival rate: 0%, 5-year survival rate: 0%) were lower than those of the low-risk group (1-year survival rate: 100%, 3-year survival rate: 90%, 5-year survival rate: 40%), and the difference is statistically significant (P<0.001).

Risk analysis
We performed a risk analysis of the autophagy-related LncRNA prognostic model of esophageal adenocarcinoma, calculated the risk value of each patient and drew the risk plots, including the risk score, survival time and risk heat map. Patients are ordered along the abscissa from left to right by increasing risk value (low-risk patients on the left, high-risk patients on the right), and the plots show each patient's risk value, overall survival and survival status. As the risk value increases, the number of deaths among patients also increases, which is consistent with the prognostic model. We also performed a visual analysis of risk heat maps for these 22 LncRNAs. According to the standard of NOM P value <0.05, 11 KEGG signaling pathways changed significantly in the high-risk group. According to previously published research, KEGG_REGULATION_OF_AUTOPHAGY, one of the most changed pathways, may promote the occurrence and progression of esophageal adenocarcinoma; therefore, we studied only the expression of autophagy pathways. The risk signature for predicting OS in patients with esophageal adenocarcinoma showed that the expression of AL391121.1, AC015813.1, AL031670.1, AP002449.1, AC133552.5, MIRLET7BHG, and AC092574.2 was related to survival prognosis (Fig 7).

Multi-index ROC curve analysis
The multiple curves represent the clinical characteristics of esophageal adenocarcinoma, including risk value, age, gender, tumor stage and so on. Among them, the red ROC curve, drawn from the risk values of the 22 prognosis-related LncRNAs, is the main object of analysis. The larger the area under the ROC curve of the risk value, the more accurate the established prognostic model. Only the risk value (AUC=0.941) and tumor stage (AUC=0.622) had an AUC>0.6, whereas all other clinical features had an AUC<0.6. We can therefore conclude that, in the prognostic model of esophageal adenocarcinoma we constructed, using the risk value to predict a patient's survival prognosis has a clear advantage over the other clinical features (Fig 8).

Independent prognostic analysis
We constructed a Cox model to analyze the clinical data of patients with esophageal adenocarcinoma. The risk score and stage were significant in both the univariate (P=0.003 vs. P=0.002) and multivariate (P=0.002 vs. P<0.001) independent prognostic analyses (P<0.05), indicating that these two indicators can be used as independent prognostic factors in the esophageal adenocarcinoma model, whereas the other clinical characteristics cannot (P>0.05, indicating that the difference is not statistically significant). Univariate and multivariate CPHR analysis showed that this signature is an independent predictor for patients with esophageal adenocarcinoma (Fig 9).

Clinical correlation analysis
We downloaded all the clinical data of patients with esophageal adenocarcinoma in the TCGA and GEO databases, excluded data without follow-up, without specific tumor staging, without a clear diagnosis, or without complete information, and retained the data that met the conditions. To explore the relationship between the 22 autophagy-related LncRNAs and the clinical features of esophageal adenocarcinoma, we used the t-test and the Kruskal-Wallis test to calculate the correlation. We found that the prognostic model is correlated with clinical tumor grade and stage (grade G1-2 P=0.001, stage I-II P=0.02), and the difference is statistically significant (Table 1). We also found that the model is more accurate and valuable in predicting the prognosis of patients with early (stage I-II) esophageal adenocarcinoma (P<0.05). For patients with advanced esophageal adenocarcinoma, characterized by later clinical stage, poor differentiation, high malignancy and metastasis, the model is not effective in evaluating the prognosis (P>0.05).

Table 1. Correlation analysis of clinical characteristics data of patients with esophageal adenocarcinoma

GSEA enrichment analysis
In the GSEA enrichment analysis, a peak on the upper left side means that high expression of the gene promotes the corresponding signal transduction pathway. Conversely, a peak on the lower right side means that high expression of the gene inhibits the corresponding signal transduction pathway.
We found that high expression of the selected LncRNAs can promote signal transduction pathways such as TRANSCRIPTION-FACTORS, CELL-CYCLE, MISMATCH-REPAIR, NUCLEOTIDE-EXCISION-REPAIR, OOCYTE-MEIOSIS and SPLICEOSOME (Fig 10).

Discussion
In the field of tumor research, exosomes, apoptosis, autophagy, enzyme regulation and related processes participate in the occurrence and development of tumor cells, and they have become research hotspots in tumor prevention and treatment worldwide.
Leidal AM et al. (33) pointed out that autophagy dysfunction is correlated with various diseases, such as infection, tumors, transplantation and metabolic disorders. Autophagy is a double-edged sword in tumorigenesis. It participates in the growth and metabolism of tumor cells and can promote or inhibit tumor development through signal transduction pathways, regulating tumor cell death or promoting tumor cell proliferation. Our research shows that a total of 22 autophagy-related LncRNAs are involved in the prognostic regulation of esophageal adenocarcinoma. After KEGG and GO enrichment analysis, we found that almost all of these LncRNAs are involved in autophagy signaling pathways. In addition, we found that these 22 LncRNAs have prognostic value. Among them, AL391121.1, AC015813.1, AL031670.1, AP002449.1, AC133552.5, MIRLET7BHG and AC092574.2 play a protective role in the prognosis of patients with esophageal adenocarcinoma. Chen-Yu Liu et al. (34) found that the MIRLET7BHG polymorphism may be an important predictor of lung cancer related to asbestos exposure and used it to establish a lung cancer risk assessment model; in this study, we found that MIRLET7BHG is also involved in the prognostic evaluation of the overall survival (OS) of patients with esophageal adenocarcinoma. Yu Y et al. (35) pointed out that the relationship between autophagy genes and esophageal adenocarcinoma is related to the expression of the corresponding LncRNAs.
JPX is the most abundant LncRNA in the human genome, yet its mechanism of action remains largely unknown. Karner H et al. (36) conducted a dedicated study on JPX. They found that human JPX and its mouse homologue, lncRNA Jpx, differ greatly in nucleotide sequence and RNA secondary structure. Despite these differences, both lncRNAs showed strong binding to CTCF. These findings support a model in which lncRNA function is conserved independently of sequence and structural differences.
JPX has mostly been studied in liver cancer, and there are few reports on its mechanism in esophageal adenocarcinoma. Lin XQ et al. (37) found that JPX can induce XIST to inhibit hepatocellular carcinoma by sponging miR-155-5p. Much evidence also indicates that JPX has an inhibitory effect on tumor metabolism signaling. Gong J et al. (38) proposed that JPX will become a new tumor-specific prognostic lncRNA marker, indicating a correlation between JPX and tumor prognosis.
UBR5-AS1 is recognized as a key LncRNA in many prevalent cancers. Ji SQ et al. (39) not only explored the relationship between UBR5 expression and cell proliferation, apoptosis and regulatory mechanisms in colon cancer cell lines, but also showed that UBR5 can degrade P21 through ubiquitination. UBR5 may become a potential target for tumor therapy in the future. Tasaki T et al. (40) studied the relationship between UBR5 and autophagy. They found that loss of UBR5 leads to a variety of defects in autophagy induction and flux, including the synthesis and lipidation/activation of the ubiquitin-like protein LC3 and the formation of the autophagic double-membrane structure.
Chen S et al. (41) found that MINCR has an inhibitory effect on cell cycle arrest and apoptosis in NSCLC cells, suggesting that this lncRNA can be used as a potential target for NSCLC treatment. In our study, MINCR was also positively expressed in patients with esophageal adenocarcinoma, but basic experimental studies verifying the relationship between MINCR and the prognosis of esophageal adenocarcinoma are still lacking.
TMPO-AS1 is a LncRNA that promotes tumorigenesis. It often mediates activation of the PI3K and Akt pathways to regulate the cell cycle and tumor cell proliferation. Although its correlation with overall survival (OS) is not very significant, our results still indicate that increased expression of TMPO-AS1 in esophageal adenocarcinoma is correlated with poor prognosis. In our study, the expression of TMPO-AS1 in esophageal adenocarcinoma was significantly higher than in the normal control group. This is consistent with the previous work of Huang W et al. (42), which revealed the effect of TMPO-AS1 in patients with esophageal adenocarcinoma: its high expression is related to the aggressive tumor behavior of esophageal adenocarcinoma.
In addition to apoptosis, another key mechanism of cell death is autophagy, which involves signal transduction pathways, receptor binding, and enzyme decomposition and activation. There is an intricate relationship between autophagy and apoptosis (43). Their interaction is mediated by different molecules in the microenvironment of esophageal adenocarcinoma, and this intricate association readily triggers signal transduction regulation that promotes or inhibits tumor growth. In our study, KEGG and GO enrichment analysis showed that most of the LncRNAs related to the prognosis of esophageal adenocarcinoma are mainly enriched in the apoptosis and autophagy pathways. These results were confirmed in the study of Zhu L et al. (19). This study shows that the prognosis of esophageal adenocarcinoma is correlated with autophagy-related LncRNAs. The mechanism of some of these LncRNAs is still unknown, and some LncRNAs are highly expressed in other tumors but not significantly expressed in esophageal adenocarcinoma. Much basic experimental research on genes related to esophageal adenocarcinoma and autophagy is still needed. At the same time, extensive clinical follow-up is needed to obtain sufficient data on prognostic indicators such as the overall survival (OS) time of patients with esophageal adenocarcinoma. Improving the prognostic model we have constructed will allow a more accurate assessment of the prognosis of patients with esophageal adenocarcinoma in clinical work and will help explore potential targeted treatment sites, which will, to a large extent, also shape treatment strategies and the survival chances of patients with esophageal adenocarcinoma.

Declarations
Ethics approval and consent to participate
Our study does not contain data from any individual person or any animals.

Author Contributions
Liusheng Wu wrote the manuscript. Dingwang Wu and Pengcheng Xu collected the data; Xiaoqiang Li, Jixian Liu and Da Wu conceived and guided the study. All authors reviewed, edited and approved the final manuscript. Dingwang Wu and Pengcheng Xu contributed to the acquisition and interpretation of data. All the authors revised it critically for important intellectual content, gave final approval of the version to be published and agreed to be accountable for all aspects of the work. Liusheng Wu, Jixian Liu, Xiaoqiang Li and Da Wu are joint primary authors.