Expression signature, prognosis value, and immune characteristics of MUC1 identied by pan-cancer analysis

Background: Mucin 1 (MUC1) plays a major role in the occurrence and development of tumor by regulating the process of tumor cell proliferation, epithelial-mesenchymal transformation and epigenetics. However, the relationship between MUC1 expression and tumor prognosis and its role in tumor immunity is still worth exploring. Methods: MUC1 expression was analyzed via GTEx database and TCGA database. We used the Kaplan-Meier survival estimation method to evaluate the inuence of MUC1 on tumor prognosis through the survival information from TGCA database. The correlations between MUC1 and immune cell inltration, tumor microenvironment were investigated through TIMER algorithm and ESTIMATE. In addition, we used Spearman correlation test to examine the correlation between MUC1 and TMB, MSI. Spearman correlation test was also designed to predict the correlation between MUC1 and immune checkpoint genes, four methyltransferases. Further, Gene Set Enrichment Analysis was used to explore the potential mechanism of MUC1 in Adrenocortical carcinoma (ACC) and Liver hepatocellular carcinoma (LIHC). Results: Our study reported that MUC1 is highly expressed in most tumors, differing between cancer types. In most cancers, high expression of MUC1 means poor prognosis indicators, such as overall survival (OS), disease free survival (DSS), disease free survival (DFS) and progression free survival (PFS). MUC1 was positively correlated with inltrating levels of B cells, CD4+ T cells and CD8+ T cells, dendritic cells, macrophages, neutrophils in LIHC, Prostate adenocarcinoma (PRAD) and Thyroid carcinoma (THCA). Besides, we reported that the expression of MUC1 correlated signicantly with immune checkpoint gene, TMB and MSI and in most tumors. However, we found that MUC1 is negatively correlated with most immunotherapy-related indicators in BRCA and LUAD. The relationship between MUC1 and tumor neoantigens and DNA methylase is different in different tumors. highlights the from


Introduction
Cancer is now being the main cause of death and an important obstacle to the extension of life expectancy. Depending on estimates from the World Health Organization in 2019, cancer is the rst or second leading cause of death before the age of 70 in 112 countries, and the third or fourth cause of death in 23 other countries [1]. Among women, the top ve cancers are breast cancer, lung cancer, colorectal cancer, prostate cancer and gastric cancer. Among men, the most common cancers are lung cancer, prostate cancer, colorectal cancer and liver cancer. Lung cancer and breast cancer are the most frequent causes of death for men and women, respectively [2]. After surgery, radiotherapy and chemotherapy, cancer immunotherapy is rapidly developing into the fourth most prominent cancer therapy [3]. As the rst membrane mucin to be identi ed, MUC1 is up-regulated, aberrantly glycosylated and polarization disappears in many cancers [4]. Using MUC1 as a target for immunotherapy has made excellent progress. However, there are no studies to explore the signi cance of MUC1 and its role in immunotherapy from the perspective of pan-cancer.
In this study, we used bioinformatics techniques to predict the effect of MUC1 on the prognosis of pancancer and in ltration of immune cells in different tumors. Besides, we analyzed the relationship between MUC1 and forty common immune checkpoint genes as well as the relationship of MUC1 expression levels with the expression of four methyltransferases (DNMT1, DNMT2, DNMT3A, DNMT3B) in different tumors. Finally, we explain the regulatory mechanism of MUC1 in different tumors by GSEA enrichment analysis. Our results further provide evidence for MUC1 as a target for immunotherapy and suggest that MUC1 can be invoked as a prognostic marker for immunotherapy.

Results
Expression of MUC1 in pan-cancer First, we collated tumor samples and adjacent nearby tissues from The Cancer Genome Atlas (TCGA), normal samples from Genotype-Tissue Expression (GTEx) database to evaluate the mRNA expression characteristics of MUC1 in humans. Through pan-cancer analysis, we found that the expression of MUC1 was quite different in different normal tissues ( Figure 1A). In general, comparing tumor tissues with adjacent nearby tissues from TCGA, we found that the expression of MUC1 was upregulated, in such as Relationship between MUC1 expression and tumor immune microenvironment Because of the interaction of immune cells, we also explored the relationship between MUC1 and immunity from the perspective of tumor immune microenvironment. The ESTIMATE method was used to analyze the correlation between MUC1 expression in TCGA tumor samples and the ratio of both stromal cells and immune cells ( Figure 4). In GBM, KIRC, LGG, LIHC, LUSC, OV, PCPG, PRAD, THCA, SKCM and UVM, we found that MUC1 signi cantly positively correlated with stromal score, immune score, and estimated score. MUC1 had the highest correlation with stromal score in TGCT (r = 0.646, P < 0.001), while the highest correlation with immune score (r = 0.556, P < 0.001) and estimate score (r = 0.513, P < 0.001) was found in LGG.

Relationship between MUC1 expression and immune checkpoint gene expression in different tumors
Immune checkpoint refers to a series of molecules expressed on immune cells and can regulate the degree of immune activation. They play an important role in preventing autoimmunity (such as inhibiting abnormal immune function and preventing overactivation of immune function). Tumors can use these molecules to escape recognition and destruction of the body. Therefore, blocking these molecules will cause anti-cancer response and promote the immune clearance of tumor cells, thus inhibit the occurrence and development of the tumor. We used the mRNA sequence database to assess whether there is a relationship between MUC1 expression and more than forty common immune checkpoint genes. The results showed that MUC1 was highly correlated with genes in many different tumors. In ACC, LGG, THCA, Thymoma (THYM) and UVM, we found that MUC1 and other immune checkpoint genes were signi cantly co-expressed. Besides, the expression of MUC1 was negatively correlated with most immune checkpoint genes in BRCA, LUAD and PAAD (Figure 5 A).

Relationship between MUC1 expression, and TMB, MSI in different tumors
Immune checkpoint inhibitors can kill tumor cells by breaking immune tolerance and activating immune activity. At present, a variety of immune checkpoint inhibitors (ICI) drugs have been used in clinic. Tumor Mutational Burden (TMB) and Microsatellite instability (MSI) are important indicators to predict the prognosis of ICI-treated patients and have been explored extensively. Our results showed that the expression of MUC1 correlated signi cantly with TMB in THYM, THCA, PAAD, LGG, HNSC, ESCA, COAD and BLCA (P < 0.05), and THYM had the highest coe cients, while LUAD and BRCA had the lowest score (Figure 5 B). The coe cient value refers that MUC1 was positively correlated with high mutation status in THYM, and positively correlated with low mutation status in BRCA and LUAD. In addition, we also analyzed the correlation between MUC1 expression and MSI in different tumors. Results showed that MUC1 was signi cantly correlated with MSI in TGCT and COAD (P < 0.05), indicating a positive correlation between MUC1 expression and MSI in these cancers. In contrast, the expression of MUC1 had negative coe cients in UCS, UCEC, CESC, and BRCA, indicating that there is a signi cant negative correlation between MUC1 expression and MSI in these cancers (Figure 5 C).

Relationship between MUC1 expression and neoantigen in different tumors
Neoantigen is a new antigen encoded by mutant genes of tumor cells, which is different from that expressed by normal cells. According to the speci c immune activity of neoantigen, the new antigen vaccine can be designed and synthesized through the mutation of tumor cells to achieve the therapeutic effect. Our research evaluated the correlation between MUC1 expression and neoantigen in different tumors, and MUC1 was positively correlated with neoantigen in HNSC and STAD (P < 0.05). However, in LUAD, BRCA, LIHC and SKCM, MUC1 was negative correlated with neoantigen ( Figure 5 D).

Relationship between MUC1 expression and four methyltransferases expression in different tumors
DNA methylation is a form of chemical modi cation of DNA, which can cause changes in chromatin structure, DNA conformation, DNA stability and the interaction between DNA and protein, thus regulating gene expression. It can modify the genetic expression without altering the DNA sequence through the action of DNA methyltransferase. In our research, we discussed the correlation between MUC1 expression and four methyltransferases expression (DNMT1, DNMT2, DNMT3A and DNMT3B) ( Figure 6). In STAD, THCA, UCEC, UVM, KIRP, LAML, LGG, LIHC, PCPG, PARD and READ, MUC1 was positively correlated with DNA methylation (P < 0.05). On the contrary, MUC1 was negatively correlated with DNA methylation in BRCA, COAD, ESCA, LUAD, LUSC, OV, and PAAD (P < 0.05).

Gene set enrichment analysis (GSEA) of MUC1 in ACC and LIHC
After preliminary exploration of the gene epigenetic mechanism of MUC1, we used GSEA to examine the function of MUC1 in ACC and LIHC. In ACC, MUC1 can promote the cycle of citrate and TCA, the metabolism of alanine, aspartate and glutamate, the metabolism of xenobiotics and drugs by cytochrome P450 (Figure 7 A). In addition, MUC1 can inhibit amyotrophic lateral sclerosis, DNA replication, base excision repair, proteasome in ACC. In LIHC, MUC1 advances the metabolism of many substances, such as drug, propanoate, fatty acid, glycine, serine and threonine (Figure 7 B). Conversely, MUC1 inhibits axon guidance, the interaction of cytokine and cytokine receptor, focal adhesion, and endocytosis.

Discussion
MUC1 is a high molecular weight glycosylated transmembrane protein expressed mainly on the apical surface of glandular or luminal epithelial cells of almost all tissues, such as the mammary gland, gastrointestinal, respiratory, urinary and reproductive tracts [5]. Early studies have suggested that the function of MUC1 is to form physical barriers, block pathogens and protect and lubricate the epithelium [6,7].However, abnormal expression of MUC1 was also found in a variety of cancers, including breast cancer, gastric cancer, hepatobiliary cancer, bladder cancer, pancreatic cancer and other tumors [8][9][10][11][12]. The high expression of MUC1 is associated with poor prognosis in many cancers. In lung cancer, MUC1 can be regulated by various signal molecules and transcription factors, such as the extreme expression of STAT3 in NSCLCs [13]and the hypoxia-induced HIFA [14]. In addition, the overexpression of MUC1 can enhance the stemness and Paclitaxel resistance of NSCLCs, affecting the prognosis [15]. MUC1 promoted the expression of platelet-derived growth factor-A, which causing the tumor growth, angiogenesis and metastasis of pancreatic ductal adenocarcinoma [16]. Another study shows that MUC1 can directly regulate the expression of multidrug resistance genes in pancreatic cancers cells, increasing resistance to chemotherapeutic drugs (gemcitabine and etoposide) [17]. In endometrial cancer, MUC1 was hormonally upregulated and signi cantly associated with poor prognosis [18]. These researches are consistent with the results of our study. Therefore, MUC1 is supposed to be a diagnostic and prognostic marker of different tumors.
MUC1 changes both quality and quantity in different tumors, leading to the emergence of new antigenic epitopes. MUC1 is a cell surface molecule that comes into contact with the immune system, so it has been considered as an ideal target for immunotherapy in recent studies. Dendritic cells (DCs) are important parts of immune microenvironment, more and more studies have demonstrated that effectiveness of vaccination with DCs pulsed with MUC1 to initiate adaptive cytolytic immune responses via T cells. It has been noted that MUC1 can be presented to DC, to activate CD8+T cell and induce anticancer-related immune response [19].A study evaluated the clinical effectiveness of a MUC1 peptideloaded DC vaccine in 12 pancreatic and biliary cancer patients following surgical resection. The vaccine was tolerated and no toxicity was observed. Surprisingly, four vaccinated patients survived for up to ve years [20]. Similarly, the e cacy of the vaccine in pancreatic cancer has been con rmed by many researches [21,22].Ge et al. [23] reported that DC vaccine pulsed with survivin and MUC1 was well tolerated without dose-limiting toxicity, and signi cantly improved the quality of NSCLS patients' life.
Another clinical study showed that the DC-based vaccine targeting MUC1 induced an antitumor immune response that promoted prolonged survival of patients with refractory NSCLC. The median survival time (MST) after the initial vaccination was 7.4 months, and the 1-year survival rate was 25.0%. When the patients were vaccinated more than 6 times, the MST could be extended to 9.5 months, and the 1-year survival rate was 39.3% [24]. The vaccine can be also against a variety of tumors, such as prostate cancer, metastatic renal cancer and metastatic colorectal cancer [25][26][27]. Targeting MUC1 to chimeric antigen receptor (CAR)-T cells is also a promising immunotherapy. Zhang et al. [28] con rmed MUC1-CAR-T cells have long-lasting tumor killing and proliferative capabilities in esophageal cancer, and it had signi cant antitumor function and a prolonged half-life in subcutaneous transplantation models and PDX models. In another study, Mei et al. found [29] that human IL22 recombinant protein could increase the MUC1 expression and enhance the function of T cells. They constructed a fourth-generation CAR (CAR-MUC1-IL22 T cells) that secretes IL22, and con rmed that it has a stronger anti-tumor effect on MUC1+HNSCC cells. MUC1-CAR-T cells have good therapeutic effects in seminal vesicle cancer, NSCLC, triple-negative breast cancer, and pancreatic ductal adenocarcinoma [30][31][32][33].
Our study attempts to deeply analyze the relationship between immunity and MUC1, so we explored the correlation between the expression of MUC1 and 6 kinds of immune cells. Our results showed that in LIHC, PRAD and THCA, the expression of MUC1 was positively correlated with immune cells. MUC1 may be a potential target for immunotherapy of these cancers.
TMB is the total number of mutations per megabase in tumor, re ecting cancer mutation quantity. The more tumor mutations, the greater the difference between tumor and normal cells. Therefore, the higher the value of TMB, the more likely the tumor is supposed to be recognized by immune cells, and the more effective it is for immunotherapy. In recent years, TMB has been taken into account as a biomarker of immunotherapy [34]. Similarly, the formation of MSI caused by DNA MMR proteins defect leads to the accumulation of mutations and the production of neoantigens. MSI is another important marker that determines the effectiveness of immunotherapy [35]. Neoantigens are also given the same meaning.
Neoantigens are not expressed in normal tissues but only expressed in tumor tissues. Because of its high speci city and can be recognized by immune cells, it is the target of personalized therapeutic cancer vaccines [36]. Previous studies have con rmed thatMUC16 is associated with TMB in hepatocellular carcinoma and gastric cancer [37,38]. MUC family proteins are highly related to MSI in colon cancer [39]. MUC2 expressed in mucinous carcinomas, a distinct subtype of colon cancer associated with MSI [40]. In our research, we found that MUC1 is associated with TMB, MSI and neoantigen in a variety of tumors. Especially in BRCA, these three indexes are all negatively correlated with MUC1. Our results suggest that the expression of MUC1 in BRCA may be used to predict the effectiveness of immunotherapy. In general, our study revealed a relationship between the expression of MUC1 and the effectiveness of immunotherapy. At present, in speci c cancers, a lot of researches are still needed to examine the possibility of MUC1 as a target of immunotherapy and predictive.
In recent years, immune checkpoint inhibitors have made a signi cant breakthrough as a novel type of immunotherapy. Antibodies blocking the cytotoxic T lymphocyte-associated protein 4 (CTLA-4) or the programmed cell death 1 (PD-1) pathway are the two most common inhibitors with good clinical effects [41]. At present, there are many other target antibodies under study, such as LAG-3, TIM-3, TIGIT. On the basis of exploring the relationship between MUC1 and immunotherapy, we also studied the relationship between MUC1 and immune checkpoint genes. Our result showed that the expression of MUC1 was negatively correlated with most immune checkpoint genes in BRCA, LUAD and PAAD.
Interestingly, the expression of MUC1 in BRCA and LUAD is negatively correlated with most immunotherapy-related indicators in this study.
More and more studies have demonstrated that DNA methylation is highly related to diagnosis, prognosis and prediction of response to therapies of tumors, such as NSCL, colorectal cancer, hepatocellular carcinoma and metastatic breast cancer [42][43][44][45]. Yamada et al. found that the DNA methylation level of MUC1-negative cancer cell lines was greater than the MUC1-positive cell lines [46]. This research indicated that DNA methylation play a critical role in MUC1 gene expression. Another research showed that MUC1 and NF-κB p65 form a complex that binds to the promoter of DNMT1 and DNMT3b, promoting transcription of DNMT1 and DNMT3b [47]. DNMT1 and DNMT3b are two crucial DNA methyltransferases, so MUC1 can promote DNA methylation. In pancreatic ductal adenocarcinomas, a study found that low methylation of MUC1 promoters correlates with decreased overall survival [48]. This study also revealed that the expression of several DNA methylation/demethylation factors have a signi cant correlation with MUC1 methylation status. Subsequent studies also concluded that aberrant methylation of MUC1 promoters are potential prognostic biomarkers for pancreatic ductal adenocarcinomas [49]. DNA methylation refers to the chemical modi cation that transfers methyl to speci c bases in the DNA chain under the action of DNA methylase. In our research, we found that the correlation between expression levels of DNA methyltransferase and MUC1 varied in different tumors. In addition, although we have explored the function of MUC1 in ACC and LIHC, a lot of research is still needed to explore the epigenetic changes of MUC1 and its potential functions, which may contribute to discover new cancer treatments and predict the prognosis of cancer patients.

Materials And Methods
Obtain MUC1 expression data from public databases We analyzed the expression of MUC1 in various normal tissues through the GTEx database. We have introduced 6678 healthy tissues and organs from the GTEx (https://www.gtexp ortal.org/home/) database sample. Sequencing data of 33 tumors were collected from 11,057 samples (10,327 tumor samples and 730 matched paraneoplastic samples) in the TCGA database. Some tumors with missing matches or too few paraneoplastic samples were excluded from the calculation of the difference in expression of MUC1 in tumor and paraneoplastic tissue. For the remaining tumor samples, the tumor samples were normalized and then subjected to the Wilcoxon test to analyze whether there were differences in MUC1 expression between these tumors and normal tissues. P <0.05 is considered statistically signi cant.

Correlation between the expression of MUC1 and the prognosis of tumor patients
We downloaded the survival information (including overall survival, disease-speci c survival, disease-free survival, and progression-free survival) of 10,327 tumor samples from 33 tumors in TCGA. The Kaplan-Meier survival estimation method was used to calculate the prognostic value of MUC1, and the log-rank test was used to determine the signi cant difference index. Comprehensive analysis of the results obtained by these two methods, when the overall survival test P <0.05, we believe that the expression of MUC1 signi cantly affects the prognosis of the tumor.

Correlation analysis between immune cell in ltration and MUC1 expression
Through the TIMER algorithm, we use gene expression pro les of 10897 samples from 32 tumors in TCGA to infer the number of tumor in ltrating immune cells (TIICs). TIMER is a statistical method of deconvolution, which can be used for approximate calculation of immune in ltration. We analyzed the relationship between the expression level of the MUC1 gene and the abundance of in ltrating immune cells (including CD4+ T cells, CD8+ T cells, B cells, neutrophils, dendritic cells, and macrophages) through the above methods.

Correlation analysis between tumor microenvironment and MUC1 expression level
Immune cells and stromal cells are the two main non-tumor components in the tumor microenvironment. They are of great value to the diagnosis and prognosis of cancer. ESTIMATE (https://bioinformatics.mdanderson.org/public-software/estimate/) is a tool that uses gene expression data to predict the abundance of stromal and immune cell in ltration and tumor purity in tumor tissues.
Based on the expression pro le model les of 9664 samples of 33 tumors, after removing the normal samples, according to the ratio of stromal cells and immune cells in each tumor sample, the estimate package is used to estimate the tumor purity. The stromal score, immune score, and estimated score are used to calculate tumor purity. Then we combine the MUC1 expression data and the score of estimate algorithm, use the spearman correlation test to calculate their correlation. After that, the correlations were plotted using the ggplot2 (https://CRAN.R-project.org/package=ggplot2), ggpubr, and ggExtra (https://CRAN.R-proje ct.org/packa ge=ggExt ra) package.
The correlation of TMB, MSI and neoantigens with the expression of MUC1 Tumor mutation burden (TMB) and microsatellite instability (MSI) are important evaluation indicators directly related to the effect of immunotherapy such as PD-1. We downloaded the mutation data of 10114 samples of these 33 tumors in TCGA, calculated the mutation score of each sample, and obtained the TMB information of each tumor. Further, we used Spearman correlation test to analyze the correlation between MUC1 expression and TMB, and used the fmsb package (https://CRAN.Rproject.org/package=fmsb) to construct the relevant radar chart. We also downloaded and analyzed the MSI scores of 10,415 tumor samples, and used the same processing method to draw related radar charts of MUC1 and MSI. Predicted neoantigens of each samples were retrieved from previous publication [50]. Correlation among TMB, MSI, neoantigens and MUC1 were calculated using the spearman correlation test.
The correlation between MUC1 and known essential marker genes After exploring the relationship between MUC1 and immune cell in ltration, it is necessary to further explore the correlation between MUC1 and speci c immune checkpoint genes. We screened 47 immune checkpoint genes from the literature, and tested the correlation between MUC1 expression in tumors and the expression of these immune checkpoint genes through Spearman correlation test. Then, use the reshape2 package (http://www.jstatsoft.org/v21/i12/) to create related heat maps. DNA methylation changes chromatin structure, DNA conformation, DNA stability, and the way DNA interact with proteins. Therefore, we analyzed the correlation between MUC1 expression and four methyltransferases (DNMT1, DNMT2, DNMT3A and DNMT3B). The analysis method is basically the same as described above.

Gene Set Enrichment Analysis (GSEA) analysis
Since MUC1 is closely related to patient prognosis and immune regulation responsiveness in LIHC and ACC, we further want to explore the potential mechanism of MUC1 in these tumors. To determine the most signi cant pathways of MUC1 in these tumors, the tumors were grouped into high and low expression groups based on the expression level of MUC1. The expression level of MUC1 was then analyzed by limma, org.Hs.eg.db, clusterPro ler (http://bioconductor.org/packages/relea se/bioc/html/clust erPro ler .html), and enrichplot package based on c2.cp.kegg.v7.2.symbols background le to perform Gene Set Enrichment Analysis on the tumor expression matrix le and the most signi cant KEGG pathways were selected to plot enrichment curves. P < 0.05 was considered a signi cant difference criterion.

Statistical analysis
The R version 3.6.2 software (https://www.r-proje ct.org/) and its ancillary packages were used for data analysis. Limma package and Student's t-test were used to analyze MUC1 expression, and P < 0.05 was considered statistically signi cant. Kaplan-Meier curves were used for survival analysis using the log-rank test, and P < 0.05 was considered statistically signi cant survival difference. Spearman or partial Spearman method was used to analyze correlations between genes and correlations between genes and immune cells. P < 0.05 was considered statistically signi cant. All gures in this study were performed using R version 3.6.2.

Declarations
Ethics approval and consent to participate: Not applicable.
Consent for publication: Not applicable.
Availability of data and materials: All data generated or analyzed during this study are included in