ATG101 as a Novel Prognostic Biomarker Related to Immunotherapy in Pan-cancer

Background: ATG101 plays a signicant role in the occurrence and development of tumours by regulating autophagy. Our study aimed to research the correlation between the expression of ATG101 and tumour prognosis and its role in tumour immunity. Methods: First, integrated analysis of The Cancer Genome Atlas and Genotype-Tissue Expression portals were used to analyse the expression of ATG101. Then, we used Kaplan–Meier curves for survival analysis. Next, we analysed the relationship between ATG101 expression and six immune cells, the immune microenvironment and immune checkpoints. Besides, we analysed the relationship between the expression of ATG101 and methyltransferase. Finally, we used GSEA to study the function of ATG101 in COAD and LIHC. Results: Integrated analysis showed that ATG101 was overexpressed in different tumours. Kaplan–Meier curves found that ATG101 was associated with poor prognosis in most tumours. We found that that ATG101 can be used as a target and prognostic marker of tumour immunotherapy for different tumours. We also found that ATG101 regulates DNA methylation. GSEA analysis showed that ATG101 may play a critical role in COAD and LIHC. Conclusions: Our study highlights the signicance of ATG101 in the study of tumour immunity from a pan-cancer perspective.

mechanism of ATG101 in different cancers by GSEA enrichment analysis. Our results provide the rst strong evidence for ATG101 as a target for immunotherapy from the perspective of pan-cancer and suggest that ATG101 can be used as a prognostic marker for immunotherapy.

Methods
Obtain ATG101 expression data from public databases The expression of ATG101 in various normal tissues and cell lines was analyzed using data from the GTEx and CCLE databases. Data from a total of 6678 normal tissues and organs were obtained from the GTEx database (https://www.gtexp ortal.org/home/). We collected 11057 samples from the TCGA database, including 10327 tumour samples and 730 paraneoplastic samples matched with 33 types of malignant cancers. For the calculation of the difference in ATG101 expression between tumour and paraneoplastic tissues, some tumours with missing matches or too few paraneoplastic samples were excluded. For the remaining samples, the tumour samples were normalized and then subjected to the Wilcoxon test to analyze whether there were differences in ATG101 expression between these cancers and normal tissues.
Correlation between the expression of ATG101 and the prognostic value in various cancer patients To verify whether the expression of ATG101 signi cantly affects the prognosis of tumours, the survival information (including the OS, DSS, DFS and PFS rates) of the patients from whom the 10327 tumour samples from 33 cancer types in the TCGA database were obtained was downloaded. The Kaplan-Meier survival estimation method was used to calculate the prognostic value of ATG101, and the log-rank test was used to determine the signi cant difference index. With P<0.05 indicating a signi cant difference, the results obtained by these two methods were analyzed.

Correlation analysis between immune cell in ltration and ATG101 expression
Through the TIMER algorithm, we used the gene expression pro les of 10897 samples from 32 cancer types in the TCGA database to infer the number of tumour-in ltrating immune cells (TIICs). TIMER is a statistical method of deconvolution that can be used for approximate calculation of immune in ltration. We analyzed the relationship between the expression level of the ATG101 gene and the abundance of in ltrating immune cells (including CD4 + T cells, CD8 + T cells, B cells, neutrophils, dendritic cells, and macrophages) through the above methods. Correlation analysis between the expression level of ATG101 and the tumour microenvironment The non-tumour component cells in the tumour microenvironment, including immune cells and stromal cells, are mainly used for cancer diagnosis and prognostic evaluation. ESTIMATE (https://bioinformatics.mdanderson.org/public-software/estimate/) is a commonly used tool to predict the in ltration of stromal cells and immune cells in the tumour microenvironment and the purity of cancer tissues. According to the expression data from 9664 samples of 33 malignant cancer types, after excluding the normal tissue data, the ratio of stromal cells and immune cells in each tumour sample was calculated, and then the immune score, stromal score and ESTIMATE score were used to evaluate tumour purity. Furthermore, the ATG101 expression data and the score of the estimation algorithm were combined, and the Spearman correlation test was used to verify their correlation. This process mainly involved the following packages: ggplot2 (https://CRAN.R-project.org/package=ggplot2), ggpubr and ggExtra (https://CRAN. R-proje ct.org/packa ge = ggExtra).
The correlation of TMB, MSI and neoantigens with the expression of ATG101 Tumour mutation burden (TMB) and microsatellite instability (MSI) are important evaluation indicators directly related to the effect of immunotherapy, such as PD-1 blockade. The mutation data from 10114 samples from 33 cancer types were used to analyze the correlation between TMB and ATG101 expression levels. First, we calculated the mutation score of each tumour sample and obtained the TMB information for each cancer. In addition, we used the Spearman correlation test to analyze the correlation between ATG101 expression and TMB and used the fmsb software package (https://CRAN.Rproject.org/package=fmsb) to construct a related radar chart. We also downloaded and analyzed the MSI scores of 10,415 tumour samples and used the same processing method to draw related radar charts of ATG101 expression and MSI. Predicted neoantigens of each sample were retrieved from a previous publication [8] . The correlations between TMB, MSI, neoantigens and ATG101 expression were all calculated using the Spearman correlation test.
The correlation between ATG101 and known essential marker genes Considering that the analysis results are intended to guide clinical practice, after exploring the relationship between ATG101 expressoin and immune cell in ltration, the correlation between ATG101 expression and various (47) immune checkpoint target genes was further analyzed. Here, the Spearman correlation test was used. Next, the reshape2 software package (http://www.jstatsoft.org/v21/i12/) was used to establish a related heat map. Changes in the DNA methylation state are common in tumours.
DNA methylation plays an important role in the occurrence and development of tumours. Therefore, we deeply analyzed the correlation between four methyltransferases (DNMT1, DNMT2, DNMT3A and DNMT3B) and the expression level of ATG101. The analysis method was the same as that described above.
Gene Set Enrichment Analysis (GSEA) analysis ATG101 expression is closely related to patient prognosis and immune regulation responsiveness in COAD and LIHC; thus, we aimed to determine the potential mechanisms and signalling pathways of ATG101 in different tumours. First, according to the expression level of ATG101, the tumours were divided into high expression groups and low expression groups. Then, limma, org.Hs.eg.db, clusterPro ler (http://bioconductor.org/packages/relea se/bioc/html/clust erPro ler.html) and c2.cp.kegg.v7.2.symbols background were used to perform gene set enrichment analysis on tumours and to select the most important KEGG pathway to draw an enrichment curve. P<0.05 was used as the standard of signi cant difference. Then, the transcription factor information related to ATG101 binding was downloaded from ChIPBase, and the genes enriched by GSEA intersected with the genes identi ed in ChIPBase. Twenty-one common genes were obtained. These 21 genes were then analyzed by STRING (https://www.stringdb.org/)) and visualised with Cytoscape Statistical analysis All data in this study were analyzed using R version 4.0.3 (https://www.r-project.org/) and its auxiliary software packages, and all analyses were performed with P<0.05 as the standard of signi cant difference. When analyzing the expression differences of ATG101, the Limma software package and Student's t-test were used, the Kaplan-Meier curve was used for survival analysis, and the Spearman or part of the Spearman method was used to analyze the correlation between different genes and the relationship between different genes and different immune cells.

Results
Expression of ATG101 in pan-cancer First, we compared tumour samples and paracarcinoma tissues from the TCGA database, normal samples from the GTEx database and cancer cells from the CCLE database to evaluate the mRNA expression characteristics of ATG101 in humans. We found that the expression of ATG101 varied in different normal tissues ( Figure 1A) and cancer cells ( Figure 1B) after collating the tumour tissues and paracarcinoma tissues from the TCGA database. We found that the expression of ATG101 was upregulated in BRCA, CHOL, COAD, ESCA, HNSC, LIHC, LUAD, LUSC, PRAD, READ, STAD, and UCEC. ATG101 was downregulated only in KIRP ( Figure 1C). Then, we analyzed the difference in ATG101 expression between normal samples from the GTEx database and tumour tissues from the TGGA database and found that ATG101 expression was upregulated in BRCA, CESC, CHOL, COAD, GBM, HNSC, KICH, LGG, LIHC, LUSC, OV, PAAD, PRAD, READ, SKCM, UCEC and UCS. However, its expression was found to be downregulated in ESCA, LAML, LUAD, SKCM, STAD and TGCT ( Figure 1D).

Correlation analysis between ATG101 expression level and prognostic value
The characteristics of ATG101 expression at the mRNA level suggested that ATG101 may be a valuable target for pan-cancer. Therefore, we further used Kaplan-Meier analysis to explore the correlation between ATG101 mRNA expression levels and the survival outcomes (including the OS, DSS, DFS and PFS rates) of different cancers from the TCGA database. Our results demonstrated that upregulated ATG101 expression was associated with a shorter OS rate in ACC, CHOL, LGG, LIHC and MESO ( Figure 2 A-E). In addition, upregulation of ATG101 expression was related to a shorter DSS rate in ACC, COAD, KIRP, LGG, LIHC, MESO and READ (Figure 2 F-L). High ATG101 expression was associated with a poor DFS rate in ACC, CHOL and LUSC, while low ATG101 expression was associated with a poor DFS rate in USC ( Figure 2 M-P). Furthermore, the high expression of ATG101 was also associated with a poor PFS rate in ACC, COAD, KIRP and LIHC (Figure 2 Q-T). These results con rmed that the high expression of ATG101 is negatively correlated with the prognosis of most malignant tumours.

Relationship between ATG101 expression and the tumour immune microenvironment
Based on the correlation between immune cells and ATG101 expression, we also explored the relationship between ATG101 expression and the tumour immune microenvironment. Our results showed that ATG101

Relationship between ATG101 expression and immune checkpoint gene expression in various cancers
The immune checkpoint is an inhibitory signalling pathway in the immune system that regulates the intensity and persistence of the immune response in peripheral tissue, prevents tissue injury and plays an important role in maintaining self-antigen tolerance. We used the mRNA sequence database to assess whether there was an association between the expression level of ATG101 and 47 common immune checkpoint genes. We found that ATG101 was highly correlated with immune checkpoint genes in various cancers. In BRCA, COAD, LIHC, LUAD and PCPG, we found that ATG101 and other immune checkpoint genes were signi cantly coexpressed. However, the expression of ATG101 was negatively correlated with all forty-seven immune checkpoint genes in CHOL, DCBC, TGCT and UCS (Figure 5 A).

Relationship between ATG101 expression and TMB and MSI in various cancers
Classic immune checkpoint therapeutic targets, including PD-1, PD-L1 and CTLA-4, are widely targeted clinically and these therapies have achieved satisfactory effects. TMB and MSI are important evaluation indexes directly related to the e cacy of immune checkpoint therapies such as PD-1 blockade. We found that TMB was signi cantly correlated with the expression levels of ATG101 in COAD, HNSC, LGG, LUAD, SARC, SKCM, STAD, THCA and THYM (P < 0.05). In addition, UCEC had the lowest TMB score, and SKCM had the highest. (Figure 5 B). The results showed that ATG101 expression is negatively correlated with the hypermutation state in UCE but positively correlated with the hypermutation state in SKCM. We also researched the correlation between ATG101 expression and MSI in various cancers. We found that ATG101 expression was signi cantly correlated with MSI in COAD, DLBC, HNSC, KICH, KIRC, KIRP, READ, SARC and SKCM (P<0.05), while DLBC had the highest coe cient and CHOL had the lowest score (

Relationship between ATG101 expression and neoantigens in various cancers
Neoantigens are new antigens produced by tumour cells after tumour mutation and are different from those expressed by normal cells. The more neoantigens are produced, the easier it is for tumour cells to be recognized and attacked by immune cells, which is more conducive to the development of new targeted drugs. The development of bioinformatics technology has accelerated the identi cation of neoantigens. Here, we further analyzed the relationship between ATG101 expression and pan-cancer neoantigens to provide a reference for new immunotherapies for tumours. Our results showed that ATG101 expression was positively correlated with neoantigens in BRCA, KIRC, READ, HNSC, LIHC, SKCM and CESC (P < 0.05). However, in GBM, OV, LUAD, LUSC, KIRP, UCEC, COAD, STAD, THCA, BLCA, PRAD and LGG, ATG101 expression was negatively correlated with neoantigens ( Figure S4).

Relationship between ATG101 expression and the expression of four methyltransferases in various cancers
DNA methylation is a physiological process that changes chromatin structure, DNA conformation, DNA stability, and the interaction between DNA and protein through the action of DNA methyltransferase. It is one of the main mechanisms by which gene expression is regulated. DNA methylation is considered to be one of the main factors affecting tumour occurrence and development. In our research, we explored the correlation between ATG101 expression and the expression of four methyltransferases (DNMT1, DNMT2, DNMT3A and DNMT3B) ( Gene set enrichment analysis of ATG101 in COAD and LIHC After preliminary exploration of the genetic and epigenetic mechanisms of ATG101, we used GSEA to research the function of ATG101 in COAD and LIHC. In COAD, ATG101 can inhibit proteasome, RNA polymerase, BASE excision repair, DNA replication, and ribosome (Figure 7 A), but promote Wint signaling pathway, ERBB signaling pathway, Adherens junction, and aldosterone regulated sodium reabsorption (Figure 7 B). While in LIHC, ATG101 can inhibit Base excision repair, spliceosome, RNA degradation, cytosolic DNA sensing pathway, and epitheliak cell signaling in helicobacter pylor (Figure 7 C), but promote complement and coagulation cascades, butanoate metabolism, valine leucine and isoleucine degradation, drug metabolism cytochrome P450, and fatty acid metabolism (Figure 7 D). As for transcription factor level, ATG101 can promote the function of transcription factor 4, androgen receptor, signal transducer and activator of transcription 5A and YY1 transcription factor (Figure 7 E). In addition, ATG101 can inhibit the function of members of the ETS oncogene family, v-myc avian myelocytomatosis viral oncogene homologue, cAMP responsive element binding protein 1, v-ets avian erythroblastosis virus E26 oncogene homologue and early growth response 1 (Figure 7 F). In LIHC, ATG101 can affect the function of v-myc avian myelocytomatosis viral oncogene homologue, GA binding protein transcription factor alpha subunit, ETS oncogene family, v-myc avian myelocytomatosis viral oncogene homologue and v-myc avian myelocytomatosis viral oncogene homologue (Figure 7 G). The transcription factor binding information for ATG101 was then downloaded from ChIPBase, and the genes enriched by GSEA were intersected with the genes identi ed from ChIPBase. Twenty-one common genes related to ATG101, COAD and LIHC were obtained ( gure 7H). Discussion ATG101, a novel mammalian autophagy factor thought to directly interact with Atg13, was rst identi ed by Hosokawa in 2009 [9] . Previous studies regarding ATG101 were mainly in the eld of autophagy. ATG101 is an essential gene for the initiation of autophagy and might be a potential therapeutic target in diseases involving endothelial injury [7] . Meanwhile, ATG101 also functions together with FIP200 ATG13 and ULK [9] . Regulating ULK1 to regulate autophagy/autophagy-related cell death (ACD) may be effective in the treatment of triple-negative breast cancer [10] . Moreover, ATG101 is essential for tissue homeostasis in both adult brains and midguts [11] and mediates GSN and OAS2, which are positively and negatively associated with the recurrence of colorectal cancer, respectively [12] . Some scholars have used 7 ATGrelated genes to establish the prognostic risk characteristics of hepatocellular carcinoma [13] . Another previous study showed that stable knockdown of ATG5 can signi cantly inhibit the occurrence and progression of colorectal cancer tumours in vivo [14] . In addition, ATG genes are mainly associated with the regulatory mechanism of cell-speci c gene expression in cholangiocarcinoma [15] . However, frameshift mutations of ATG-related genes may affect tumour progression by regulating autophagy in gastric cancer and colorectal cancer [16] . These studies are consistent with the results of our study. Therefore, ATG101 is expected to be a diagnostic and prognostic marker of different tumours. However, there are no studies on the signi cance of ATG101 in pan-cancer or its role in the immune microenvironment. In this study, we explored for the rst time the relationship between ATG101 and different tumours from a pan-cancer perspective.
Cancer neoantigens are generally considered to be a new class of antigens produced as a result of individual cancer cell mutations. Because of their immunogenicity and low expression in normal tissues, cancer neoantigens are considered to be an important target for cancer immunotherapy. [17] . Changes in the abundance and mutation pro le of ATG101 in various cancers leads to the emergence of new antigenic epitopes. Some initial, authoritative clinical trials revealed that dendritic cell-related neoantigen vaccines are safe and can induce CD8+ and CD4+ neoantigen-speci c T cell responses, which have broad application prospects [18] . In fact, dendritic cells (DCs) are an important part of the immune microenvironment. An increasing number of studies have revealed that DC vaccines with speci c genes (such as ATG101) can effectively initiate adaptive cytolytic immune responses. For example, one prospective study con rmed the safety and effectiveness of vaccines against cancer neoantigens. The cancer neoantigen vaccine can induce speci c CD4+ and CD8+ T cells to target speci c patient tumours, and in further clinical trials, it is expected that the effect of the vaccine will be satisfactory [19] . In addition, Beatriz con rmed in patients with advanced melanoma that the DC vaccine can improve naturally occurring neoantigen-speci c immunity and discovered a new human leukocyte antigen (HLA) neoantigen [20] . In addition, there is an urgent need to develop vaccines that target DCs or use them to present antitumour antigens in clinical practice [21] .
In this study, we attempted to deeply explore the relationship between immunity and ATG101 expression; therefore, we analysed the correlation between six types of immune cells, B cells, CD4+ T cells, CD8+ T cells, neutrophils, macrophages and dendritic cells, and the expression levels of ATG101. We found that in LIHC, LUSC and PCPG, the expression of ATG101 was positively correlated with all six immune cells. ATG101 may be a potential target for immunotherapy of these cancers. We further studied the relationship between ATG101 expression and neoantigens in various cancers. Our results showed that ATG101 expression was positively correlated with neoantigens in BRCA, KIRC, READ, HNSC, LIHC, SKCM and CESC. LIHC (liver hepatocellular carcinoma) was the only cancer in which there was a positive correlation among all six immune cells and neoantigens. The LHIC tumour microenvironment is complex and changeable. The TME affects the process of cancer cell antigen presentation by expressing tumour antigens, thereby allowing tumour cells to evade effective tumour treatment methods [22] . The complex tumour immune microenvironment of the LHIC is also one of the main causes of the heterogeneity of the treatment response to immune checkpoint blockers such as PD-1 and CTLA-4 blockade in LHIC patients with the same TNM stage [23] . We also studied the relationship between ATG101 expression and immune checkpoint genes. In BRCA, COAD, LIHC, LUAD and PCPG, we found that ATG101 and other immune checkpoint genes were signi cantly coexpressed. However, the expression of ATG101 was negatively correlated with all forty-seven immune checkpoint genes in CHOL, DCBC, TGCT and UCS. Therefore, ATG101 may be a potential biomarker to determine the prognosis of LHIC patients.
TMB is the total number of mutations per megabase in cancer, representing tumour mutation quantity. The more tumour mutations there are, the greater the differences between the cancer cells and normal cells. That is, the higher the value of TMB is, the more likely the cancer is to be recognized and attacked by the immune system and the better the response to immunotherapy. In recent years, TMB has been considered a biomarker for immunotherapy and an important evaluation index directly related to the e cacy of immune checkpoint therapy, such as PD-1 and CTLA-4 blockade [24] . Similarly, the formation of MSI caused by DNA MMR protein defects results in the accumulation of mutations and the production of neoantigens. MSI is also an important marker that determines the effectiveness of immunotherapy [25] . Our results showed that ATG101 expression was signi cantly correlated with both MSI and TMB only in COAD, HNSC and SKCM. This indicates that there is a signi cant negative correlation between ATG101 expression and MSI and TMB in these cancers. Our results are consistent with the results of previous studies. Maria demonstrated that immunogenic ileal apoptosis contributes to the prognosis of chemotherapy-treated COAD [26] . In addition, Simone's research showed that CD8+ T cell-related therapies can be used for anticancer immunotherapy for SKCM [27] . Similar results have also been veri ed in HNSC [28] .
In recent years, an increasing number of studies have shown that DNA methylation is highly related to the diagnosis, prognosis and prediction of the response to therapies for tumours, such as NSCL, colorectal cancer, hepatocellular carcinoma and metastatic breast cancer [29][30][31] . DNA methylation is one of the most common epigenetic events in the mammalian genome. Normal DNA methylation can maintain the normal functions of the body, such as the stability of the genome structure, normal embryonic development and cell differentiation, while abnormal DNA methylation will lead to tumorigenesis [32] . Some studies have indicated that the ATG gene may be involved in the development and progression of various diseases through promoting demethylation, regulating a variety of cellular functions and signalling pathways [33][34][35] . These studies indicated that DNA methylation may play a critical role in ATG gene expression. This study also revealed that the expression of several DNA methylation/demethylation factors has a signi cant correlation with ATG101 methylation status.
In this study, we found that the correlation between the expression levels of DNA methyltransferase and ATG101 varied in different tumours. Moreover, we explored the function of ATG101 in COAD and LIHC. Twenty-one common genes related to ATG101, COAD and LIHC were obtained. Our results are in agreement with previous studies. It was concluded that targeting STAT5-and STAT6-related signalling pathways can inhibit the proliferation of CRC cells and induce CRC cell apoptosis [36] , and Dong suggested that STAT5A, STAT5B and STAT6 expression may be potential prognostic markers of hepatocellular carcinoma [37] . MiR-532-3p can inhibit the progression of colorectal cancer by regulating ETS1-related signalling pathways [38] . Meanwhile, WTAP plays an important role in the progression of hepatocellular carcinoma by affecting the epigenetic modi cations of ETS1 [39] . Chen concluded that MJD1C can affect colorectal cancer metastasis by targeting ATF2 [40] . However, further research is still needed to explore ATG101 epigenetic changes and its potential functions, which may contribute to the development of new cancer treatments and the discovery of new markers to predict the prognosis of patients with cancer.

Conclusions
In summary, our study is the rst to demonstrate the relationship between the expression of ATG101 and tumour prognosis and the role of ATG101 in tumour immunity. Our research highlights the signi cance of ATG101 in the study of tumour immunity from the perspective of pan-cancer. However, our results need to be further veri ed in large-sample genome sequencing and in vivo and in vitro functional experiments.    The relationship between the algorithm score in cancer patients and the expression level of ATG101. The third highest correlations with the stromal score, immune score and ESTIMATE score. The relationship between 4 methyltransferases and the expression level of ATG101.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.