KMT2C is Potential Prognostic Biomarker and its Immune Regulating Roles in Pan-Cancer: A Comprehensive Analysis

Guangnan Wei, South China University of Technology
Yuchen Zhang, South China University of Technology
Hongkai Zhuang, Shantou University of Medical College
Yingzi Li, Guangdong Academy of Medical Sciences
Chongyang Ren, Guangdong Academy of Medical Sciences
Lingzhu Wen, Guangdong Academy of Medical Sciences
Jiali Lin, Guangdong Academy of Medical Sciences
Danian Dai, Guangdong Academy of Medical Sciences
Bo Chen, Guangdong Academy of Medical Sciences
Ning Liao (syliaoning@scut.edu.cn), Guangdong Academy of Medical Sciences


Introduction
There is mounting evidence that epigenetic mechanisms play important roles in the progression of many cell types and that they are a fundamental part of cancer onset and development.
Epigenetic alterations during the initiation and progression of disease, including chromatin rearrangement, histone modifications, DNA methylation, and non-coding RNAs, have been widely reported in recent research [1,2]. Histone lysine methyltransferases (KMTs) are a fundamental class of chromatin modifiers responsible for histone modifications related to active transcription. They catalyze methylation at specific histone amino acid sites, thereby changing chromatin structure and affecting the transcription of target genes, and so play an essential role in regulating epigenetic changes [3][4][5].
Based on primary amino acid sequence and substrate specificity, KMTs are classified into six subfamilies [3]. KMT2C, an H3K4 monomethylase and a member of the histone lysine methyltransferase family, is involved in the epigenetic modification of gene expression [6]. KMT2C is located on chromosome 7q36 and encodes a nuclear protein with an AT-hook DNA-binding domain, a SET domain, and zinc fingers [7][8][9]. Several observations and experimental data support KMT2C as a tumor suppressor gene, with research focusing on the function of the SET domain in tumors via knockout models [10][11][12][13]. For example, in a mouse model of pancreatic adenocarcinoma constructed by knocking out the SET domain of KMT2C, the knockout contributed to the formation of epithelial tumors, which also suggests that KMT2C may function as a tumor suppressor gene [14]. Our previous study showed that KMT2C mutations occur at a high rate in Chinese patients with breast cancer and that KMT2C mutation frequency differs significantly by race and ethnicity [13]. With improvements in bioinformatics technology and the establishment of open-access databases, KMT2C mutations have been detected in most cancer types. However, data on KMT2C expression in different tissues are lacking, and there are not enough data to reveal the correlation between prognosis and KMT2C expression. A link between the tumor immune response and epigenetic regulation is well known [15][16][17][18][19], and the participation of epigenetic alterations in gene silencing allows tumors to adapt to changes in the surrounding microenvironment [19]. This study aimed to mine biological information by analyzing the expression of KMT2C in different tissues, to present the relationship between KMT2C expression and the immune microenvironment, and to determine whether KMT2C mutations are related to prognosis.

Results
The expression level of KMT2C in normal tissue and tumor
The GTEx database analysis revealed that KMT2C mRNA expression is similar across tissues from different healthy people (Figure 1A), with the lowest expression in blood. Analysis of cancer cell lines from the CCLE database indicated that KMT2C expression was clearly elevated, with narrower ranges, compared with normal tissue (Figure 1B). Further, in the TCGA database, 9 of 20 cancer types showed significantly different expression between tumor and peri-tumor tissue.
Considering that there are fewer normal samples in TCGA, we integrated the TCGA data with GTEx, and significant differences were found in 21 of 27 cancers. Bladder urothelial carcinoma (BLCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), uterine corpus endometrial carcinoma (UCEC) and uterine carcinosarcoma (UCS) were found to have lower expression of KMT2C in cancer cells than in normal cells (Figure 1C
In addition, we show the correlation of KMT2C expression with immune score, stromal score and ESTIMATE score across 33 tumor types. For the immune score, the three top-ranking tumor cohorts were sarcoma (SARC), thyroid carcinoma (THCA) and breast invasive carcinoma (BRCA), while for the stromal score the most significant relationships were found in SARC, head and neck squamous cell carcinoma (HNSC) and glioblastoma multiforme (GBM). For the ESTIMATE score, the three top-ranking tumor cohorts were SARC, THCA, and GBM (Figure 3B). To further understand the relationship between KMT2C expression and immune checkpoint genes, we collected recognized immune checkpoint genes and calculated the correlation of their expression with that of KMT2C. As shown in Figure 4, NRP1, ADORA2A, CD160 and TNFSF15 were significantly correlated (P<0.05) with KMT2C in most cancer types. KMT2C was co-expressed with a larger number of immune checkpoint genes in COAD and LIHC. By contrast, KMT2C expression was negatively correlated with most immune checkpoint genes in CHOL, esophageal carcinoma (ESCA), MESO and UCS.

KMT2C is associated with neoantigens, TMB and MSI in some cancers
A neoantigen is a new protein encoded by a mutated gene (through point mutations, deletions, or gene fusions) in a tumor cell, as opposed to the proteins expressed by normal cells [29]. Here we counted the number of neoantigens in each cancer sample and described the correlation between KMT2C expression and the number of neoantigens. In BRCA, KIRC, THCA, prostate adenocarcinoma (PRAD) and LGG, KMT2C expression was positively correlated with neoantigens (Figure 5A). Some studies have reported that microsatellite instability (MSI) status and the level of tumor mutation burden (TMB) are robust prognostic biomarkers and predictors of therapeutic response to immune checkpoint inhibitors [30]. We examined their respective relationships with KMT2C expression in many cancer types to explore the link between KMT2C expression and mutation across tumors. The correlation between KMT2C expression and MSI was significant (P<0.05) in 12 cancer types, including lymphoid neoplasm diffuse large B-cell lymphoma (DLBC), HNSC, LUAD and LUSC; LUSC had the highest coefficient and DLBC the lowest (Figure 5B). The coefficients indicate that KMT2C expression correlates positively with MSI in LUSC but negatively in DLBC. For TMB, 6 of 33 cancer types (BRCA, KIRC, THCA, THYM, UCEC, UVM) showed a relationship with KMT2C expression. THYM and UCEC had the highest coefficients, suggesting that KMT2C expression is positively related to high mutation burden in those cancer types, but to low mutation burden in BRCA, KIRC, THCA and UVM (particularly THCA).

KMT2C expression may affect methylation after transcription by regulating MMR genes
We also investigated the relationship between KMT2C expression and several well-known mismatch repair (MMR) genes (MLH1, MSH2, MSH6, PMS2 and EPCAM). The expression of KMT2C was closely related to the MMR genes (Figure 6A). In some specific cancer types, KMT2C expression was positively related to MLH1, MSH2, MSH6 and PMS2. As for the correlation between KMT2C expression and four methyltransferases (DNMT1, DNMT2, DNMT3A, DNMT3B), co-expression was seen in many cancer types except UCS. SARC and testicular germ cell tumor (TGCT) showed significantly low co-expression coefficients, whereas the coefficients in the other cancer types were all high (Figure 6B).

Discussion
Our results, based on the analysis of multiple cancer types from the GTEx and TCGA databases, revealed that KMT2C mRNA expression varies across tissues and is especially low in blood.
When cancer cells were compared with the corresponding normal tissues, KMT2C expression was clearly elevated in tumors. The mutation frequency of KMT2C differs between cancer types. Our study found that the most common mutation types are missense and frameshift mutations, but KMT2C mutations were not found in all tumors. In the COSMIC database, 28.3% of KMT2C mutations are frameshift or missense mutations, and a substantial proportion are missense mutations (17.1%) [7]. It has been reported that these two mutation types usually truncate most of the protein; frameshift mutations probably affect the carboxy-terminal SET domain, while missense mutations are found in the PHD domain. The different mutation patterns imply that the function of KMT2C might be disturbed by differently localized mutations [31,32].
Interestingly, survival outcome is not always associated with mutation frequency or mutation type. For example, CESC patients with a high KMT2C mutation frequency did not suffer from early tumor recurrence. KMT2C acts as a tumor suppressor, and deletion of the KMT2C gene was shown to be associated with worse outcome in KIRC, LUAD, OV and UVM, while LGG patients with lower KMT2C expression had better outcomes. One possible explanation for this phenomenon is that heterozygous KMT2C mutations might contribute to heritable cancer incidence. KMT2C expression might not be an independent risk factor for predicting the prognosis of cancer patients, and different mutations may have different effects on KMT2C function. Genome-wide functional studies showed that KMT2C is involved in depositing mono- and dimethylation of H3K4 at gene enhancers, which helps to activate genes [6]. When KMT2C is bound to a promoter, gene transcription can be repressed [33]. KMT2C therefore appears to play a complex role in tumorigenesis and cancer progression. Epigenetic variation can be affected by diet, exercise, or other factors, and this can be reflected in differences in the outcomes of diseases caused by DNA damage [1][2]. Nevertheless, the differences in KMT2C expression between cancer cells and normal cells found in our correlation analysis, together with their association with different patient outcomes, make KMT2C a promising biomarker for cancer monitoring and further investigation.
Another question to consider is whether KMT2C expression is related to the immune response. In our study, we tried to answer this question by analyzing the relationship between KMT2C expression and the immune microenvironment. We found that KMT2C expression was associated with tumor immune cell infiltration across different cancer types; the highest-scoring tumors were COAD, KIRC, and LUAD. In these three cancer types, the immune cell type with the highest coefficient varied. Studies show that tumor-infiltrating lymphocytes are an essential component of the immune microenvironment, acting as mediators of antitumor immunity [34,35]. Macrophages provide a first line of defense in tumor immunity and can produce tumor-promoting cytokines.
Previous studies showed that macrophages produce proinflammatory cytokines, with NF-kB acting as a central regulator, leading to tumorigenesis [36]. Neutrophils are the predominant infiltrating cells in early inflammation and are involved in initiating and expanding the inflammatory response [37]. However, data on the role of KMT2C in the immune microenvironment have been lacking. Our findings are consistent with previous reports that KMT2C affects immune cell infiltration. Moreover, we showed that KMT2C expression is closely connected with immune checkpoint genes across tumors. Given the complexity of KMT2C function, other mechanisms remain to be discovered. Previous studies have focused on establishing MSI status and high TMB as predictors of response to immune checkpoint inhibitors [30]. According to our results, LUSC patients with high MSI may show better checkpoint inhibitor responses. TMB is now also recognized as an indicator for screening patients who benefit from immunotherapy [30]; higher TMB is associated with a good response to immunotherapy. In UCEC patients, both TMB and MSI were positively related to KMT2C expression in our study, which makes our proposal that KMT2C might be a candidate for predicting immune responses more credible. Peptide fragments produced by enzymatic hydrolysis of neoantigens are presented as antigens to T cells by dendritic cells (DCs), which promotes the maturation of T cells into activated T cells that specifically recognize tumor neoantigens, as well as the proliferation of these activated T cells [29]. Neoantigen vaccines can be designed and synthesized according to the mutations of tumor cells, exploiting the immune activity of tumor neoantigens, and patients can then be immunized to achieve therapeutic effects [38]. In BRCA, KIRC, THCA, PRAD and LGG, patients might respond well to neoantigen vaccines.
In normal tissues, the MMR system recognizes and corrects DNA mismatches to maintain genomic stability [39,40]. In patients with mutated or deleted MMR genes, the chance of genetic errors may increase, contributing to cancer incidence through genome instability. For example, BRCA germline mutations predispose to breast or ovarian cancer.
We also showed that KMT2C expression was closely related to MMR genes in all cancer types.
Recent research has shown that DNA methylation is a well-known epigenetic feature of tumor development [3][4][5]. In the present study, we found a strong correlation between KMT2C expression and methyltransferases in human cancers except UCS. These results suggest that KMT2C is involved in tumorigenesis and progression through mediating MMR gene mutation and DNA methylation.
Our results illustrate the role of KMT2C in tumor development and provide useful data for immunotherapy across cancer types, but some limitations need to be mentioned. Our study is based on data from public databases without experimental confirmation, so it may be biased by confounding factors. In addition, we only reported KMT2C mRNA levels downloaded from open-access databases; the levels of functional protein were not examined. The influence of post-transcriptional modification in normal or cancer cells therefore cannot be determined.

Conclusions
In summary, KMT2C is expressed in many tissues, and its expression is high in cancer cells. The expression level of KMT2C is related to survival outcome; for example, low expression in KIRC predicts a better outcome. In some specific cancer types, KMT2C expression is strongly related to tumor infiltration and immune checkpoint gene expression, and it could be affected by the immune microenvironment. The connection between KMT2C expression and immunotherapy indicators such as TMB and MSI also reflects its important role in immunotherapy. An interesting finding is that KMT2C expression may affect MMR genes and DNA methylation. Further studies of KMT2C would help to better understand its role in human cancer.

Samples information and KMT2C expression analysis in Human Pan-Cancer
Data on KMT2C expression in different cancer types were obtained from The Cancer Genome Atlas (TCGA) database and the Broad Institute Cancer Cell Line Encyclopedia (CCLE) database. Data for 31 normal tissues (liver, lung, kidney, brain, bone marrow, etc.) were collected from the Genotype-Tissue Expression (GTEx) program and downloaded through the GTEx portal [20][21][22]. All expression data were normalized through log2 conversion using the rma function within the R package (RStudio version 1.2.1335, R version 3.6.1) [23,24]. Moreover, we extracted mutation information and mismatch repair (MMR) gene data for 33 cancer types from the TCGA database. TMB and MSI information was also downloaded from the TCGA database.
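As a minimal sketch of the log2 conversion step only (it does not reproduce the background correction or quantile normalization that an rma-style procedure would perform), the transformation can be written as follows. The function name, the pseudocount of 1 and the input values are illustrative assumptions, not details taken from the analysis.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Return log2(value + pseudocount) for each expression value.

    The pseudocount (an assumption for this sketch) avoids log2(0)
    for genes with zero measured expression.
    """
    return [math.log2(v + pseudocount) for v in values]


# Hypothetical raw expression values for one gene across four samples.
raw_expression = [0.0, 1.0, 3.0, 7.0]
log_expression = log2_transform(raw_expression)
print(log_expression)
```

The log scale compresses the wide dynamic range of expression values, so that samples from different sources can be compared on a similar footing.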

Prognosis analysis
Correlations between KMT2C expression and patient prognosis, including overall survival (OS), disease-free interval (DFI), disease-specific survival (DSS) and progression-free interval (PFI), across 33 cancer types were analyzed and visualized using the survivalROC and survival packages in R (rdocumentation.org/packages/survival); the results also showed the specificity and time-dependent sensitivity of survival [25]. The Kaplan-Meier method was used to plot survival curves for each cancer type after dividing patients into two groups by KMT2C expression. Hazard ratios (HRs) with 95% confidence intervals and log-rank P-values were calculated by univariate survival analysis.
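To make the grouping and survival estimation concrete, the following is a minimal Python sketch of a Kaplan-Meier estimator applied to two expression groups. It is an illustration only, not the R workflow used here: the median cut-off is an assumption (the text does not state how the two groups were defined), and the function names and toy data are hypothetical.

```python
from statistics import median


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times:  follow-up time for each patient
    events: 1 if the event was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


def split_by_median(expression):
    """Label each sample True (high) if above the median, else False (low)."""
    cutoff = median(expression)
    return [value > cutoff for value in expression]


# Hypothetical toy cohort: expression, follow-up time, event indicator.
expression = [2.1, 9.5, 1.3, 8.7, 3.0, 7.9]
times = [5, 8, 12, 12, 20, 25]
events = [1, 0, 1, 1, 0, 1]

is_high = split_by_median(expression)
high_curve = kaplan_meier(
    [t for t, h in zip(times, is_high) if h],
    [e for e, h in zip(events, is_high) if h],
)
low_curve = kaplan_meier(
    [t for t, h in zip(times, is_high) if not h],
    [e for e, h in zip(events, is_high) if not h],
)
print(high_curve)
print(low_curve)
```

Censored patients leave the risk set without lowering the survival estimate, which is why the curve only steps down at observed event times. Comparing the two curves with a log-rank test would be the next step and is not covered by this sketch.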

Immune Correlation Analysis
The TIMER database, which is based on TCGA, was used to analyze the abundance of six subgroups of tumor-infiltrating immune cells (TIICs): B cells, CD4+ T cells, CD8+ T cells, neutrophils, dendritic cells and macrophages [26,27]. The correlation between KMT2C expression and immune cell scores was analyzed using Spearman correlation analysis. In addition, the link between KMT2C expression and immune checkpoint genes was assessed with correlation modules.
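For readers less familiar with rank-based correlation, the sketch below shows one way to compute a Spearman coefficient: each variable is converted to ranks (ties receive the average rank) and a Pearson correlation is then computed on the ranks. The function names and the sample values are hypothetical, and no P-value is computed here.

```python
import math


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical expression values and immune infiltration scores.
kmt2c_expression = [2.1, 3.4, 1.0, 5.6, 4.2]
infiltration_score = [0.10, 0.35, 0.05, 0.90, 0.30]
rho = spearman(kmt2c_expression, infiltration_score)
print(round(rho, 3))
```

Because it works on ranks, the coefficient captures any monotonic relationship and is less sensitive to outliers than a Pearson correlation on the raw values.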

Statistics
We used the Kruskal-Wallis test to analyze KMT2C expression levels across different normal tissues and different cancer cell lines. Paired t tests or t tests were performed to compare KMT2C expression levels between tumor tissues and normal tissues. Univariate survival analysis was applied to analyze the connection between KMT2C expression and patient survival. We used the Kaplan-Meier method to compare survival between different levels of KMT2C expression. Pearson correlation analysis was used to evaluate the relationship between KMT2C expression and DNA methyltransferases and MMR genes. We defined a significant positive correlation as P < 0.05 and r > 0.20.
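The significance criterion can be expressed as a simple filter over correlation results. The sketch below assumes that a coefficient r and a P-value have already been computed for each gene; the gene labels and numbers are placeholders, not results from this study.

```python
def is_significant_positive(r, p, r_min=0.20, alpha=0.05):
    """Apply the stated criterion: P < 0.05 and r > 0.20."""
    return p < alpha and r > r_min


# Placeholder (r, P) pairs keyed by hypothetical gene labels.
correlations = {
    "GENE_A": (0.45, 0.001),   # passes both thresholds
    "GENE_B": (0.15, 0.010),   # significant but r too small
    "GENE_C": (0.30, 0.200),   # r large enough but not significant
    "GENE_D": (-0.40, 0.001),  # significant but negative
}

selected = sorted(
    gene for gene, (r, p) in correlations.items()
    if is_significant_positive(r, p)
)
print(selected)
```

Requiring both thresholds excludes correlations that are statistically significant only because of large sample sizes but too weak to be of practical interest.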