Seven Autophagy-Related Genes are Associated with the Tumor Immune Microenvironment in Predicting Survival Risk of Non-Small Cell Lung Cancer

Background: Non-small cell lung cancer (NSCLC) ranks rst among global cancer-related deaths. Despite the emergence of various immunological and targeted therapies, immune tolerance remains a barrier to treatment. Methods: It has been found that this obstacle can be overcome by targeting autophagy-related genes (ATGs). ATGs were screened by coexpression analysis and the genes related to the prognosis of lung cancer were screened using Kaplan–Meier (K-M) survival analysis, univariate Cox regression, and multivariate Cox regression. The prognostic risk model of ATGs was constructed and veried using K-M survival analysis and receiver operating characteristic (ROC) curve analysis. Results: The prognostic risk model of ATGs was constructed. Gene set enrichment analysis (GSEA) showed that the function and pathway of ATG enrichment were closely related to immune cell function. CIBERSORT, LM22 matrix, and Pearson correlation analysis showed that risk signals were signicantly correlated with immune cell inltration and immune checkpoint genes. Conclusions: We identied and independently veried the ATG (AL691432.2, MMP2-AS1, AC124067.2, CRNDE, ABALON, AL161431.1, NKILA) in NSCLC patients and found that immune regulation in the tumor microenvironment is closely related to this gene.

1 Introduction Owing to its rapid morbidity and mortality, lung cancer poses a great challenge to human health and life on a global scale [1]. The 2017 Global Burden of Disease Study reveals that there are 2.2 million new cases of lung cancer and 1.9 million deaths annually [2]. Non-small cell lung cancer (NSCLC) accounts for approximately 85% of lung cancers [3,4], making it the leading cause of cancer-related deaths worldwide. The current treatment methods for lung cancer include surgery, chemoradiotherapy, and targeting therapy; however, NSCLC has strong aggressiveness and heterogeneity, drug resistance, recurrence, and even malignant progression. Therefore, new treatment options are urgently needed.
Immunotherapy is currently carried out in a variety of cancers. As immunotherapy aims to arouse the tumor recognition function of the patient, this is also a recent research direction for lung cancer. Given that immune tolerance currently presents an obstacle in this research, we hope to nd new biomarkers to guide treatment.
Autophagy is an "eat yourself" phenomenon that can use the components of a cell to recompose other required components. In normal cells, autophagy helps maintain homeostasis. In the context of starvation or malignancy, signi cantly upregulated autophagy functions [5] can maintain the metabolic requirements of cells [6][7][8]. Autophagy can reduce cell death, damage, and chronic in ammation caused by the accumulation of harmful substances in cells during stress [9]. An increasing number of studies have shown that autophagy is also involved in immune regulation. In previous studies on renal carcinoma, it was found that the immune e cacy of tumors could be enhanced by targeting autophagy.
Similar ndings were also found in studies on oral squamous cell carcinoma. However, autophagy has two functions in cancer [10]. Tumor cells can be more tolerant to hypoxia [11], starvation, and treatment

Screening of autophagy-related lncRNAs
Autophagy-related genes (ATGs) were downloaded from the Human Autophagy Database (HADb, (http://www.autophagy.lu/index.html). We used the "limma R" (http://www.bioconductor.org/) software package and the function of Cor in R (https://cran.r-project.org/) and set the coe cient of cor-lter to > 0.4 with P < 0.001. We analyzed the correlation between autophagy genes and the lncRNA expression level in the sample and determined the autophagy-related lncRNA.

Prognosis model development
The "Survival R" package was used to calculate survival prognosis. First, the Kaplan-Meier (K-M) test and univariate Cox regression analysis were used to screen lncRNAs with statistical signi cance. Then, multivariate Cox regression was used to screen lncRNAs with independent prognostic signi cance. Hazard ratios (HRs) were used to distinguish between high-risk lncRNAs (HR > 1) and protective lncRNAs (HR < 1). The risk score of each NSCLC patient was calculated based on the expression of the model lncRNA and its coe cient. The risk score was calculated as follows: risk score = β gene 1 × expressed gene 1 + β gene 2 × expressed gene 2 + β gene 3 × expressed gene 3 +... + β gene 7× expression gene 7. Finally, patients with NSCLC were classi ed into high-risk and low-risk groups based on the median risk score.

Verifying and evaluating the prognostic signature
We used the K-M "Survival R" package and R software to draw the K-M survival curve. The K-M survival curve was used to calculate the difference in OS between the high-and low-risk groups. The receiver operating characteristic (ROC) curve was used to measure the predictive e ciency of various clinical indicators. A strati ed survival analysis was then performed to determine the accuracy of the prognostic model. Finally, univariate and multivariate Cox regression analyses were performed to test whether the prognostic model was an independent risk factor. Statistical signi cance was set at P < 0.05.

Establishment and evaluation of nomograms
Traditional clinical variables (such as age, sex, and AJCC stage) and risk scores derived from prognostic markers were used to construct a nomogram, and the OS of patients with NSCLC at 1, 3, and 5 years were analyzed. At the same time, a calibration curve was constructed to analyze the accuracy of the nomogram.
2.6 Construction of the lncRNA-mRNA coexpression network.
A lncRNA-mRNA coexpression network was constructed to analyze the correlation between autophagyrelated lncRNAs and target mRNAs. Cytoscape software (version 3.7.1, http://www.cytoscape.org/) was used to construct and visualize the lncRNA-mRNA coexpression network.

Functional enrichment analysis
Gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genome (KEGG) pathway analysis were used to determine the mRNAs related to lncRNAs. Statistical signi cance was set at P < 0.05.

Gene Set Enrichment Analysis (GSEA)
Gene set enrichment analysis (GSEA4.1.0) was performed on the whole genome expression pro le of NSCLC patients to determine the biological functions of differentially expressed genes between high-risk and low-risk groups. We used 500 and 15 genes as the largest and smallest gene set sizes to lter gene sets, respectively. Permutation was performed 1,000 times.

Analysis of immune cell components
CIBERSORT is a robust and novel method that can construct the cellular composition of tissues (such as solid tumors) based on gene expression pro les. The LM22 matrix is a leukocyte gene marker matrix consisting of 547 genes that can distinguish 22 subtypes of human hematopoietic cells. CIBERSORT combined with the LM22 matrix was used to assess the composition of immune cells in the high-risk and low-risk groups according to the risk score. NSCLC samples were deleted (P > 0.05). At the same time, the correlation of immune cells was examined, and the correlation between each immune cell and risk score was calculated using the Pearson correlation method. The expression of classic immune checkpoint genes (PD-1, PD-L1, CTLA4, and IDO1), which have traditionally been regarded as targets of immunotherapy, was studied.

Statistical methods
All statistical tests were conducted in R. P < 0.05 was considered statistically signi cant.
3 Results were considered as protective factors with HR values less than 1.

Evaluation of prognostic markers including seven autophagy-related lncRNAs
The risk score of each patient in TCGA dataset was calculated using the following formula to calculate the autophagy-related lncRNA signature: risk score = (-0.096 × AL691432.2 expression level) + (-0.216 × MMP2-AS1 expression level) + (-0.045 × AC124067 .2 expression level) + (-0.020 × CRNDE expression level) + (0.351 × ABALON expression level) + (0.006 × AL161431.1 expression level) + (0.091 × NKILA expression level). Using the median value of the risk score as the cutoff point, NSCLC patients were divided into high-risk (n = 466) and low-risk (n = 462) groups. K-M survival curve analysis showed that the overall survival (OS) and progression-free survival (PFS) of NSCLC patients with high-risk scores were signi cantly shorter than those of NSCLC patients with low-risk scores ( Fig. 2A). The 3-year survival rates of high-risk and low-risk patients were 53.40% and 67.2%, and the 5-year survival rates were 35.34% and 50.1%, respectively. Principal component analysis (PCA) based on seven autophagy-related lncRNAs showed that there were two distinct distribution patterns between the high-risk and low-risk groups (Fig. 2B). Time-dependent ROC curve analysis showed that the 1 -, 3 -, and 5-year survival rates predicted by the three lncRNA risk score curves (AUC) were 0.660 (Fig. 2F), 0.641 (Fig. 2G), and 0.634 (Fig. 2H), respectively. The NSCLC patients were then ranked according to the risk score calculated using autophagy-related lncRNA prognostic indicators (Fig. 2C). The scatter plot shows that the survival rate of NSCLC patients was related to the risk score; patients with a higher risk score had a shorter survival time (Fig. 2D). The heat map shows that the expression levels of the seven prognostic marker-related lncRNAs were signi cantly different in NSCLC patients in different risk groups. High-risk patients expressed higher levels of risk factors (ABALON, AL161431.1, and NKILA), and low-risk patients expressed higher levels of protective factors (AL691432.2, MMP2-AS1, AC124067.2, and CRNDE) (Fig. 2E).

Correlation analysis between autophagy-related lncRNA prognostic signals and other clinicopathological parameters
We then analyzed the correlation between the risk score of the autophagy-associated lncRNA prognostic signature and the clinicopathological characteristics of NSCLC patients. We found that the risk scores were statistically similar among patients at all levels ( Table 1). We further performed a strati ed analysis to study the prognostic value of autophagy-related lncRNAs. Hierarchical analysis was performed according to the pathological classi cation, age (≤ 65 years old, > 65 years old), gender (male and female), AJCC stage (stage I, II and III, IV), T stage (T1/T2 and T3/T4), N stage (N0 and N1/N2/N3), and M stage (M0 and M1). As shown in Fig. 3, K-M survival curve analysis showed that the OS of patients with a high risk score in lung squamous cell carcinoma (LUSC), lung adenocarcinoma (LUAD), age > 65 years, male sex, and AJCC stage was shorter than that of patients with low risk scores, and the difference was statistically signi cant. We analyzed the PFS of patients and reached similar conclusions ( Supplementary Fig. 2). This indicates that the prognostic characteristics can accurately judge the prognosis of patients compared with other clinicopathological characteristics. 3.4 Autophagy-related lncRNA signals are independent prognostic factors Univariate and multivariate Cox regression analyses were performed to determine whether autophagyrelated lncRNA prognostic markers were independent prognostic factors for NSCLC patients. Univariate analysis showed that AJCC stage (P < 0.001), T stage (P < 0.001), N stage (P < 0.001), M stage (P = 0.02), and autophagy-related lncRNA prognostic risk score (P < 0.001) were signi cantly correlated with OS (Fig. 4A). Multivariate analysis showed that the autophagy-related lncRNA prognostic risk score (P < 0.001) was signi cantly correlated with OS (Fig. 4B). ROC curve analysis (Fig. 2F) showed that the AUC value of prognostic markers of autophagy-related lncRNAs was 0.660-fold higher than that of age (AUC = 0. 3.5 Assessment of prognosis prediction nomogram, including autophagy-related lncRNA prognostic signature risk score We used the risk score calculated by autophagy-related lncRNA prognostic signals and other clinicopathological factors (including age, gender, grade, AJCC stage, T stage, and N stage) to construct a nomogram to accurately estimate the 1-, 3-, and 5-year survival rates (Fig. 5A). The analysis of the calibration curve shows that the actual and predicted 1-, 3-, and 5-year survival times were similar (Figs. 5B-5D). These results indicate the reliability and accuracy of the nomogram, which contains the autophagy-related lncRNA prognostic signature risk score.

Construction and functional enrichment analysis of the lncRNA-mRNA coexpression network
The lncRNA-mRNA coexpression network was constructed using Cytoscape, which contained 28 pairs of lncRNA-mRNA (Pearson correlation coe cient |R|> 0.4, P < 0.05) (Fig. 6A). The Sankey diagram shows the relationship (risk/protection) between 28 types of mRNAs and seven types of lncRNAs (Fig. 6B). The rst three GO terms of biological processes are autophagy, processes that utilize the autophagy mechanism, and macroautophagy (Fig. 6C). The rst three GO terms for cellular components were autophagy, vacuolar membrane, and phosphor assembly site (Fig. 6D). The rst three GO terms for molecular functions were protein serine/threonine kinase activity, microtubule binding, and tubulin binding (Fig. 6E). KEGG pathway analysis con rmed that autophagy was the most signi cantly enriched pathway (Fig. 6F).

Gene set enrichment analysis (GSEA)
To study the biological functions and pathways of ATG risk signals in NSCLC patients, we conducted GSEA. A total of 5,472 GO functions (the rst 50 in Table S1) and 178 KEGG pathways (the rst 50 in Table S2) were enriched, with many results related to cancer behavior, immune cell function, and response (adjusted P-value < 0.25). The results of GO and KEGG analyses showed that risk signals were closely related to the tumor immune microenvironment (Fig. 7A) and warranted further analysis.

Recognition of immune cell landscape and its correlation with prognostic markers of ATGs
Combining the CIBERSORT and LM22 matrix, as shown in the heat map of the immune cell distribution of the NSCLC sample (Fig. 7B), the bar graph ( Supplementary Fig. 3), and the violin graph (Fig. 7C), showed that many immune cell types have signi cant changes between groups. Pearson correlation analysis was used to identify the coexpression pattern between signi cantly changed immune cells. The results indicated that these immune cells work together and regulate each other (Fig. 7D). Next, we analyzed the relationship between the prognostic characteristics of patients with NSCLC and the immune cell subtypes. As shown in Supplementary Fig. 4A-H, the correlation between B cells, T cells, regulatory T cells (Treg), M2 macrophages, mast cells, dendritic cells (DCs), monocytes, and risk signals was determined.

Research on the correlation between immune checkpoint genes and risk score
The correlation between risk characteristics and immune checkpoint genes was studied in the high-risk and low-risk groups, and the median risk in NSCLC samples was used as the strati cation standard. As a result, the expression of PD-1 ( Supplementary Fig. 4J), PD-L1 ( Supplementary Fig. 4K), CTLA4 ( Supplementary Fig. 4L), and PD-L2 ( Supplementary Fig. 4I) was relatively higher in the high-risk group.

Discussion
In our study, we identi ed and independently veri ed prognostic signals based on seven ATG lncRNAs (AL691432.2, MMP2-AS1, AC124067.2, CRNDE, ABALON, AL161431.1, and NKILA). GSEA results showed that risk signals were signi cantly correlated with immune cell function and immune response. This nding led us to further study the relationship between autophagy and immune in ltration.
lncRNAs are a type of noncoding RNA with a length of more than 200 nucleotides. They interact with proteins, RNA, and DNA to regulate gene expression before and after transcription. Studies have found that lncRNAs are widely involved in human life activities such as cell development, differentiation, proliferation, apoptosis, migration, invasion, and metastasis, which are closely related to the formation, progression, and metastasis of malignant tumors. In addition, lncRNA is also involved in chemotherapy resistance and targeted therapy.
A study of endometrial cancer found that lncRNA AL161431.1 targets miR-1252-5p, which leads to the upregulation of MAPK signals and regulates cancer cell apoptosis and autophagy cell death [13]. NKILA is a lncRNA that interacts with NF-κB; it regulates the sensitivity of T cells to T lymphocyte death (AICD) by inhibiting NF-κB activity. NKILA can directly or indirectly inhibit IκBα phosphorylation and NF-κB activation in breast and liver cancers [14,15]. In breast cancer studies, in ammatory cytokines were found to upregulate NKILA expression through the NF-κB pathway, which regulates breast cancer metastasis and inhibits angiogenesis in HUVECs [14,16]. In nasopharyngeal carcinoma (NPC), NKILA was also found to inhibit NPC metastasis through the NF-κB pathway [17]. P65 in laryngeal cancer tissues can also reportedly positively regulate NKILA expression; however, NKILA can inhibit the transport of P65, thus reducing the resistance of cells to laryngeal cancer X-ray radiation [18]. We found that NKILA reduces MMP14 expression by mediating IκBα phosphorylation and NF-κB translocation to the nucleus, thereby impairing the migration and invasion of ESCC cells [19]. Lung cancer studies have found that NKILA is associated with the survival prognosis of LUAD NKILA expression is regulated by the classic TGF-β signaling pathway, which subsequently inhibits the migration and invasion of NSCLC cells by interfering with the NF-κB/Snail signaling pathway in NSCLC cells [20]. A similar function was observed in tongue squamous cell carcinoma [21].
Colorectal neoplasia differentially expressed (CRNDE) was rst observed in colorectal adenomas and colorectal cancers [22]. CRNDE is elevated in a variety of cancers, including colorectal cancer (CRC) [23,24], glioma [25], hepatocellular carcinoma [26], and lung cancer [27], and there is increasing evidence that it plays a role in regulating cancer cell proliferation, migration, invasion, and apoptosis. Downregulation of CRNDE reportedly results in the inhibition of CRC cell proliferation and induces cell apoptosis [24]. Ellis et al. found that CRNDE is related to the regulation of aerobic glycolysis, or the Warburg effect, in cancer cells [28]. In lung cancer studies [29], LUAD tissue and CRNDE were signi cantly associated with poor differentiation, TNM stage, lymph node metastasis, radiotherapy response, and shorter OS time. Liu et al. found that CRNDE promotes NSCLC cell proliferation and growth by activating PI3K/AKT signal transduction [30].
As tumor-in ltrating lymphocytes in uence the immunotherapy response as well as the clinical consequences of colorectal, gastric, lung, and breast cancers, their role in various cancer-related processes is increasingly recognized. In this study, we also used gene expression pro les to explore the involvement of TILs in NSCLC patients. In our ATG risk model, the high-risk and low-risk groups, showed a different immune landscape. Speci cally, we found that the distribution of B cells, T cells, macrophages, DCs, mast cells, and monocytes in our risk model was different and correlated with risk (P < 0.05).
Increasing numbers of studies are using DCs to treat cancer. DCs have strong antigen presentation ability, which may overcome tumor tolerance and induce antitumor immunity [31]. In NSCLC patients, DCs upregulate the expression of the coinhibitory molecule B7-H3 and inhibit T cell activity [32]. Lung tumor cells secrete substances that lead to monocyte-induced DC differentiation disorders, maturation abnormalities, and phenotypic defects. Tumor-induced DC de ciency leads to an insu cient ability to recognize and present tumor antigens and tumor immune escape [33].
Previous studies have shown that T cells are the main immune-in ltrating cells in NSCLC [34]; among these cells, CD4 + T cells are the most abundant T cell population (26%), followed by CD8 + T cells (22%) [35]. Many studies have shown that low levels of CD8 + T cell in ltration in tumor lesions are associated with poor prognosis [36]. As a key factor in the tumor microenvironment, B cells and neutrophils play an important regulatory role in tumor progression. However, these cells have both antitumor and pro-tumor activities [37][38][39].
The same macrophages play different roles in the tumor microenvironment. In LUAD, there were higher immune scores, more memory B cells, and more M0 macrophages in the early stage than in the late stage. The abundance of M0 macrophage in ltration was signi cantly correlated with TNM stage and survival rate [40].
In LUAD, mast cells are associated with tumor angiogenesis and poor prognosis [41,42]. In this study, a predictive model of autophagy-related lncRNAs was established and validated by univariate and multivariate Cox regression analyses. The value of our K-M survival curve was P = 3.325e − 05, indicating that our prediction model was closely related to the survival outcomes of patients with lung cancer. In addition, our prognostic model was superior to the other prognostic markers. Finally, we used GSEA to detect the biological functions of the prediction model. These results strongly demonstrate that these lncRNAs are involved in tumor progression and are associated with immune in ltration.
Our study has several limitations. First, our ndings need to be further validated in other independent cohorts to determine the stability of autophagy-related lncRNA prognostic markers. Second, our study was based on 712 patients from the publicly available TCGA database. Among them, the samples of NSCLC patients with early cancer (n = 557) were signi cantly larger than those of patients with advanced cancer (n = 155), which may have skewed our results; therefore, further analysis with a larger sample size is needed. Finally, further studies on biochemical experiments, such as immunohistochemistry, quantitative real-time polymerase chain reaction, ow cytometry, and clinical data analysis, are needed to con rm our ndings.

Conclusions
In conclusion, we herein constructed an autophagy-related lncRNA model of NSCLC. These ndings suggest that seven prognostic models of autophagy-related lncRNAs can effectively predict clinical prognosis. In addition, based on the effect of the immune-infused microenvironment on tumor proliferation and metastasis, our ndings provide a new direction for immunotherapy.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.