Biological Omics Analysis of Tumor Mutation Burden Combined with Immune Inltrates in Uterine Corpus Endometrial Carcinoma

Background: In various malignancies, whether tumor mutation burden (TMB) is associated with improved survival outcomes or enhanced immunotherapy remains controversial. We are committed to exploring the prognosis of TMB and its potential association with immune inltration of uterine corpus endometrial cancer (UCEC). Methods: We downloaded transcriptome data and somatic mutation data from the Cancer Genome Atlas (TCGA) database, and visualized the mutation proles using the "maftools" package. TMB was calculated and the samples were divided into two groups by the median. Kaplan-Meier analysis and Wilcoxon test were used to compare the differences in survival and clinicopathological characteristics. Furthermore, we performed functional enrichment analysis to screen for signicant contributing pathways. The "CIBERSORT" package was used to estimate the abundance of immune components. Differential analysis was performed using the "limma" package to compare gene expression proles. Multivariate analysis was used to identify risk genes related to TMB. Based on these genes, a risk model was established, and the receiver operating characteristic (ROC) curve was drawn to assess the predictive accuracy. Finally, we evaluated the relationship between risk genes and immune inltration using the Timer database. Results: The mutation data included 542 UCEC patients. The waterfall plot summarized the specic mutation information. Higher TMB levels indicated improved overall survival (OS) (p =0.048) and were associated with advanced grades. Pathway analysis showed that the differential signals were enriched in multiple immune-related crosstalks. We screened 108 differentially expressed genes in two TMB groups and identied four risk-related genes. The model based on these four risk genes had good predictive value, the area under the curve (AUC) = 0.670, and patients with higher risk scores showed worse OS (p <0.001). Additionally, CD8 T cells memory-activated, CD4 T cells, and M1 macrophages in the high TMB group showed higher inltrates, while regulatory T cells (Tregs) and M2 macrophages showed lower inltrates. Conclusions: Higher TMB was associated with better survival outcomes and increased the immune inltration in the tumor microenvironment of UCEC.


Introduction
Uterine corpus endometrial cancer (UCEC) is a group of epithelial malignancies occurring in the endometrium (the lining of the uterus or womb), most generally in perimenopausal and postmenopausal women (1). UCEC is one of the most common tumors of the female reproductive system, with nearly 200,000 new cases each year. In the past few decades, the incidence and mortality of endometrial cancer have been on the rise. This makes it the third most common cause of death in cancers which only affect women, behind ovarian and cervical cancer (2,3). According to the pathogenesis and biological behavioral characteristics, UCEC can be divided into estrogen-dependent type (type I) and non-estrogendependent type (type II) (4). Approximately 40% of cases are related to obesity (5,6). Besides, endometrial cancer is also associated with excessive estrogen exposure, hypertension, and diabetes. The prognosis of UCEC is based largely on histologic grade and clinical stage (7,8). Although patients with early-stage disease have a 5-year overall survival rate of about 80%, those with advanced-stage have a 5-year overall survival rate of less than 20% (3). As overall survival remains poor in advanced and recurrent cases, the emergence of new targeted therapeutics and immunotherapies for these sub-patients offers hope for improved outcomes (9).
In recent years, there has been a lot of good news about cancer immunotherapy (10). Currently, it has manifested strong anti-tumor capability in the treatment of solid tumors such as melanoma (11), nonsmall cell lung cancer (12), kidney cancer (13), and prostate cancer (14). Several immunotherapy drugs have been approved for clinical application by the Food and Drug Administration (FDA) (15). Due to its prominent e cacy and innovation, "cancer immunotherapy" was named the most important scienti c breakthrough of the year by "science" in 2013. In 2018, American immunologist James P. Allison and Japanese immunologist Tasuku Honjo received the Nobel Prize in Physiology or Medicine for their discovery of cancer therapy by inhibition of negative immune regulation (16). Immunotherapies can be categorized as active, passive, or hybrid (active and passive) (17). Active immunotherapies direct the immune system to attack tumor cells by targeting tumor antigens, while passive immunotherapies enhance existing anti-tumor responses (18). Immunotherapies mainly involve immune checkpoint inhibitors (ICIs), therapeutic antibody, cancer vaccines, immune system modulators, cell therapy, and small synthetic molecule inhibitors. Endometrial cancer cells and tumor microenvironment have been shown to modulate immune responses. The strategies utilizing ICIs monotherapy, ICIs combination regimens, and ICIs combined with other fundamental targeted therapies have become promising treatments to improve the anti-tumor immune response of advanced endometrial cancer (19).
Tumor mutational burden (TMB, the number of mutations within a targeted genetic region in the cancerous cell's DNA) is a promising biomarker for predicting the effect of immunotherapy (20,21). Goodman AM et al. (2017)indicated that higher TMB is a promising biomarker that predicts favorable outcome to PD-1/PD-L1 blockade in melanoma and non-small cell lung cancer (NSCLC) (22). TMB may be an independent predictor of immunotherapy response. Thomas A et al. (2018) found that TMB swayed immune-related survival outcomes in breast cancer patients (23). In contrast, there was growing evidence that higher TMB in renal cell carcinoma (RCC) was correlated with immune cell exclusion and the immunologically cold phenotypes. Accordingly, higher TMB was supposed to be associated with worse survival in RCC (24,25). These ndings stated that TMB levels may have contradictory predictions in different tumor types.
With the continuous development of bioinformatics, we have the opportunity to obtain helpful resources about genomics and therapeutic responses from various public databases. Zhang C et al. performed a multi-omics analysis of TMB and immune in ltration in bladder urothelial carcinoma using the Cancer Genome Atlas (TCGA) database (26). Given the impressive results achieved by immunotherapies, especially ICIs in the treatment of UCEC, we performed this study to explore the prognostic role of TMB and its potential association with immune in ltration in UCEC.

Data Collection
Somatic mutation data were downloaded from the Genomic Data Commons (GDC) Data Portal of TCGA (https://portal.gdc.cancer.gov/). Next, the "Masked Somatic Mutation" data was selected and processed based on the VarScan2 software. We input the downloaded mutation annotation format (MAF) data and implemented the visualization process for genomic analysis by executing the "maftools" R package. We acquired the transcriptome RNA-sequencing data with HTSeq-FPKM type and corresponding clinical follow-up information of UCEC also from the TCGA database by the GDC tool.
2.2 Assessment of TMB and prognostic analysis TMB was de ned as the total number of mutations per coding area of a tumor genome (27). In our study, the TMB was calculated as (total number of variants) / (total length of exons), and the variants included base substitution, deletion, or cross-base insertion. We then operated the Perl script based on the JAVA8 platform to calculate the frequency of the variants of each sample. According to the median as the cutoff value, UCEC patients were divided into the low-TMB group and high-TMB group. The TMB levels were merged with the corresponding survival data of each sample according to the sample ID. The "survival" R package was applied to Kaplan-Meier analysis to compare OS between the two groups. Moreover, the correlation between the TMB levels and other clinicopathological factors was further judged, where Wilcoxon rank-sum test was utilized.

Differential gene analysis and functional enrichment analysis
First, the transcriptional data of UCEC samples were divided into low and high TMB groups. Differential gene analysis was performed using the "limma" package, with parameters of log fold change (FC)>1 or<-1 and false discovery rate (FDR)<0.05. And we applied the "heatmap" package to visualize the heatmap. Then, functional enrichment analysis was conducted through gene ontology (GO) and "Kyoto Encyclopedia of Genes and Genomes" (KEGG) analysis to explore the potential functions of TMB, using "ggplot2", "enrichplot", and "clusterPro ler" packages. Additionally, based on the TMB level as the phenotypes, gene set enrichment analysis (GSEA) was performed, in which "c2. Cp. Kegg. V6.2. Symbols. GMT gene set" was selected as the reference gene set. The way to download GSEA software is http://software.broadinstitute.org/gsea/index.jsp. The gene set was obtained from the MSigDB database (http://software.broadinstitute.org/gsea/msigdb/). Set the false discovery rate (FDR) less than 0.05 as the threshold.

Analysis of immune cell in ltration
Based on the CIBERSORT analysis tool (https://cibersort.stanford.edu/), we estimated the expression abundance of 22 immune cells. The box plot showed the in ltration proportion of 22 types of immune cells in each UCEC patient. Wilcoxon rank-sum test was used to accurately evaluate the differential density between high and low TMB groups, which was visually represented by the violin plot.

Identi cation of TMB-related immune signature
The "Gene Summary" le containing the information of immune-related genes was downloaded from the immunology database -Immport (https://www.immport.org/). Based on the immune-related gene data and the differential gene expression data between the TMB high and low groups, the "VennDiagram" package was used to screen out the differentially expressed immune genes between the two groups.

Construction of prognostic risk model based on differential TMB-related immune genes
Multivariate Cox regression analysis was applied to opt for the differential TMB-related immune genes (DTIGs) associated with prognosis risk, and resulting weighted scores were calculated as risk scores for each patient. The score of the prediction model was determined as = (exprDTG1×coef1) + (exprDTG2×coef2) +... + (exprDTGn×coefn) and the score was described TMB Prognostic index (TMBPI). According to the median score as a cut-off value, UCEC patients were divided into the low-risk group and high-risk group. Kaplan-Meier analysis was used for survival analysis. Using the "survivalROC" R package, we drew the ROC curve and calculated the area under the curve (AUC) to verify the predictive capability of this model. Besides, based on the "SCNA" module of the TIMER database (https://cistrome.shinyapps.io/timer/), we further assessed the association between DTGs in the UCEC prognosis model and the in ltration of 6 immune cells.

Statistical analysis
All analyses were conducted using R software (version 3.6.2). Wilcoxon rank-sum test was selected for the non-parametric statistical test used to compare the differences between the two groups. Survival analysis was performed using the log-rank test. Multivariate analyses were performed via Cox regression. P values<0.05 were considered signi cant.

Landscape of mutation pro les of UCEC
We downloaded the somatic mutation information of 542 UCEC patients from TCGA, in which the "Masked Somatic Mutation" data type and "VarScan2" work ow type were selected. In the TCGA database, there are four types of mutation data: Annotated Somatic Mutation, which is an annotation le of somatic mutations in the format of VCF; Raw Simple Somatic Mutation, which is the original le of somatic mutations in the format of VCF, Aggregated Somatic Mutation, which is protected Mutation annotation le in the format of MAF, Masked Somatic Mutation, which is open access annotation le in the format of MAF. Next, we input the prepared MAF les and visualizing the results of the patients' mutation data using the "maftools" package. The detailed mutation information of each UCEC patient was shown in the waterfall plot ( Figure 1). The clinical baseline of all 545 UCEC patients was summarized in Table 1, of whom the mean age was 63.93±11.14. To discuss the patient's variant details in more depth, we further categorized and summarized the mutations. To sum up, we found that missense mutation was the most frequently occurring variant classi cation (Figure 2a), single nucleotide polymorphism (SNP) was the most common variant type (Figure 2b), and C>T accounted for the largest proportion in single-nucleotide variation (SNV, Figure 2c). In addition, the number of variants for each patient sample was calculated and displayed (Figure 2d), and variant classi cation levels were presented again in a box plot ( Figure 2e). Subsequently, we listed the top ten frequently mutated genes, which were PTEN (64%), PIK3CA (48%), ARID1A (44%), TTN (38%), TP53 (37%), PIK3R1 (30%), KMT2D (26%), CTCF (26%), MUC16 (25%), and CTNNB1(24%, Figure 2f). What's more, our study continued to examine the consistency and exclusivity correlations among these mutated genes, with green for co-occurrence and brown for mutual exclusion. It can be observed from the gure that there were coexistence relationships across numerous mutated genes, while the mutually exclusive relationships between PTEN and TP53 and between TP53 and ARID1A were obvious (Figure 2g). Finally, the genetic cloud map was drawn to distinctly recap the mutated genes again (Figure 2h).

The relationship between TMB level and survival prognosis and tumor grades in UCEC
Tumor mutation burden (TMB) is de ned as the total number of somatic gene coding errors, base substitutions, gene insertions, or deletion errors detected per million bases. The TMB level of each UCEC sample was calculated, and patients were divided into high TMB group and low TMB group with the median as the cut-off value. Then, according to the sample ID, TMB levels were merged with the patient survival information and the clinicopathological characteristics information. Comparing the survival outcomes of the two groups, it was found that patients with high-level TMB kept better overall survival (OS), and the results were statistically signi cant (p = 0.048, Figure 3a). Surprisingly, higher TMB was correlated with advanced pathological grades of UCEC (p = 0.002, Figure 3b). It seemed that TMB was higher in patients aged 65 and younger than in older patients, but not statistically signi cant (p = 0.893, Figure 3c).

Differential expression gene identi cation and functional enrichment analysis
Transcriptome RNA-sequencing data of HTSeq-FPKM type were downloaded from TCGA, including 552 UCEC tissues and 23 adjacent non-tumor tissue samples. Comparing the transcriptome genes of the two TMB groups, 427 differential genes were obtained. Then, we selected the 40 genes with the most signi cant differences to draw a gene heatmap (Figure 4a). Moreover, we performed enrichment analyses on differential genes, including gene ontology (GO) analysis, Kyoto gene and genome encyclopedia (KEGG) analysis, and gene set enrichment analysis (GSEA). It is well known that GO can be divided into three parts: Molecular Function (MF), Biological Process (BP), and Cellular Component (CC). Our study found that in the BP group, humoral immune response, lymphocyte-mediated immunity, and complement activation were frequently enriched. In the CC group, differential genes were mainly involved in immunoglobulin complex, external side of the plasma membrane, and immunoglobulin complex, circulating. In the MF group, antigen binding, immunoglobulin receptor binding, and receptor-ligand activity were the three main terms (Figure 4b). According to the above results, we discovered that the major GO terms enriched by differential genes were mainly concerned with the immune response. The top ten KEGG pathways were Neuroactive ligand-receptor interaction, Alzheimer disease, Cytokine-cytokine receptor interaction, Breast cancer, Cell adhesion molecules (CAMs), Gastric cancer, Hippo signaling pathway, Proteoglycans in cancer, MAPK signaling pathway and Wnt signaling pathway (Figure 4c). Finally, we listed four excellent results of GSEA, in which the high TMB levels were signi cantly enriched in pyrimidine metabolism, nucleotide excision repair, P53 signaling pathway, and fructose and mannose metabolism (Figure 4d).

Comparison of immune cells in ltration between two groups of high and low TMB levels
In 2015, Newman et al. of Stanford university school of medicine proposed a new method for analyzing single-cell types, called CIBERSORT, which is a computer algorithm that reconstructs the type and number of original cells based on the RNA content of all cell mixtures (28). Based on the CIBERSORT algorithm, our study evaluated the immune pro le of each patient and compared the in ltration differences of these 22 immune cells between the high and low TMB groups. The immune in ltration pro le of each sample was shown in Figure 5a, where each bar represented a patient, and different colors represented different cell components. Additionally, the violin-plot revealed that the in ltration levels of CD8 T cells, memory activated CD4 T cells, follicular helper T cells, and M1 macrophages in the high TMB group were signi cantly higher than those in the low TMB group (Figure 5b). Based on the above results, it was not di cult to recognize that higher TMB generally heightened the level of immune in ltration of UCEC samples and advanced the patients' anti-tumor immune response.

Construction of a TMB-related immune genes risk model for UCEC patients
Through the intersection of immune-related genes and differential genes, 108 differential TMB-related immune genes (DTIGs) were obtained (Figure 6a). Next, we performed multivariate Cox regression analysis on these DTIGs and acquired 4 risk-related DTIGs (EDN3, FGF19, IL13RA2, TRAV21) and their coe cients, which participated in the construction of the risk model ( Table 2). The risk score in the model was calculated as ∑coe cients * expression values. Here, we de ned this risk score as the "TMB Risk Index" (TMBRI). The median TMBRI was 1.1734604, which was regarded as a cut-off value, and patients were divided into the high-risk group (n=261) and low-risk group (n=262). Kaplan-Meier (KM) survival analysis showed that there was a signi cant difference in OS between the two groups, and OS in the highrisk group was even worse (Figure 6b). The 5-year survival rate of patients in the high-risk group was nearly 20% lower than that in the low-risk group (Table 3). Furthermore, ROC analysis was performed to verify the reliability of this risk model. The area under the curve (AUC) of the ROC curve was 0.670, indicating that the TMBRI risk model had certain applicability in predicting the prognosis of patients with UECE ( Figure 6c). Besides, we further evaluated the potential relationship between the mutants of these DTIGs in the risk model and immune in ltration in the microenvironment. Using the SCNA module of the TIMER database, the relationship between END3 or FGF19 or IL13RA2 mutants, and the in ltration of 6 immune cells were exhibited by the box plot (Figure 6d-f).

Discussion
Endometrial cancer was decided for appropriate treatment based on the patient's age, physical condition, lesion range, and histologic type. The majority of endometrial carcinoma is adenocarcinoma, which is not highly sensitive to radiotherapy, so the treatment is mainly surgery, and adjuvant therapy includes radiotherapy, chemotherapy, hormone therapy, and targeted therapy (7). New targeted and ICB therapies for advanced endometrial cancer held promise for improving prognosis, as overall survival for patients with advanced and recurrent endometrial cancer remains poor. Overexpression of PD-1 and PD-L1 was detected in EC tissues (29). Immunohistochemical results showed that more than 75% of endometrioid carcinomas were positive for PD-1 and PD-L1 (30). Therefore, targeting PD-1 / PD-L1 may be a promising strategy to magnify the anti-tumor immune response. Le et al. (2015) demonstrated for the rst time that Pembrolizumab, a humanized monoclonal antibody against PD-1, was clinically effective in tumors with mismatch repair defects, in a phase II study (31). In the Phase Ib study (KEYNOTE-028) of pembrolizumab in the treatment of PD-L1-positive advanced endometrial cancer, three of the 24 patients with recurrent and metastatic endometrial cancer achieved partial remission (PR), of which one had POLE mutation. Three other patients achieved stable disease (SD). The overall response rate (ORR) was 13%. As for toxicities, only mild side effects were observed in thirteen patients (54.2%), the most common being fatigue, itching, fever, and anorexia (32). Additionally, in the monotherapy for PD-L1-positive endometrial cancer, the e cacy of Atezolizumab (anti-PD-L1) and Nivolumab (anti-PD-1) were also examined, with ORR of 13% and 23%, respectively(3). According to existing investigations, the e cacy of ICB therapy was not only based on the expression of PD-1 / PD-L1 but also the release and presentation of tumor antigens and the in ltration of immune cells in the tumor microenvironment. Although the ICB has shown certain satisfactory therapeutic outcomes in anti-tumor treatment, only a small number of patients can bene t from it. Therefore, many studies are devoted to nding biomarkers that can effectively predict the e cacy of immunotherapy.
As the latest marker for evaluating the e cacy of PD-1 blockade, the predictive capability of TMB has been approved in the use of PD-1 antibody in the treatment of colorectal cancer with mismatch repair de cient (dMMR) (31,33). Among 30 human cancers, endometrial cancer has the highest incidence of microsatellite instability (MSI) (34). Approximately 30% of primary endometrial cancer are MSI-H, and 13% to 30% of relapsed endometrial cancer are MSI-H or dMMR (35,36). Based on the above conclusions, we speculated that TMB may play a crucial role in predicting the prognosis of endometrial cancer. Zhang et al. (2019) performed a multi-omics analysis of TMB and immune in ltration in bladder urothelial carcinoma on the TCGA public platform, demonstrating that TMB may be an independent biomarker for survival prediction of bladder cancer (26). In this study, a similar methodology was utilized to analyze the relationship between TMB level and survival status in endometrial cancer. Our results discovered that higher TMB resulted in better OS (p = 0.048, Figure 3a), which was consistent with similar results in most other malignancies, that higher TMB was more conducive to provoking local immune responses and improving prognosis. In the trial of PD-1 combined with CTLA-4 blockade therapy for NSCLC, Hellmann MD et al. (2018) observed that high TMB can predict improved objective response, long-term bene t, and progression-free survival (PFS). Moreover, TMB was independent of the expression of PD-L1, which was the strongest feature associated with a therapeutic effect in multivariate analysis (37). Lv et al. (2020) revealed that the prognosis of bladder cancer patients with higher TMB was better so that TMB was considered to be a powerful predictor of tumor behavior and response to immunotherapy (38). However, in some types of cancer, such as clear cell renal cell carcinoma, patients with high TMB levels showed worse survival outcomes (24).
So why did TMB predict the opposite across different tumor types? Chen (2013) (39) proposed the concept of the "cancer-immune cycle" and pointed out seven important steps for the anti-tumor immune response to function, including the release and presentation of tumor antigens and nal T cell activation, etc. Therefore, the anti-tumor effect largely depended on the in ltration and activation of immune cells in the tumor microenvironment. In our study, patients in the high TMB group possessed higher levels of CD8 T cells, memory-activated CD4 T cells, follicular helper T (Tfh) cells, and M1 macrophage in ltration, while patients in the low TMB group held higher levels of regulatory T cells (Tregs) and M2 macrophages (Figure 5b). CD8 T cells are the leading anti-tumor effector cells. Memory CD4 T cells help the body obtain long-term protection against pathogen reinfection (40). Tfh cells secrete BCL-6, IL-21, CXCR5, and ICOS, which are mainly involved in the information transmission during B cell differentiation, helping to activate B cells and maintain humoral immune response for a long time (41,42). Macrophages participate in the tumor immune response through different polarization methods: classic M1 macrophages produce IL-12 and promote tumor immune response: while M2 macrophages produce IL-10 to promote tumor progression (43). Tregs are a group of lymphocytes with negative regulation of the body's immune response, which are involved in the escape of tumor cells from the body's immune surveillance and chronic infection (44). From the above results, it can be seen that high levels of TMB in endometrial cancer were positively correlated with anti-tumor immune cells and negatively correlated with immune cells that mediate tumor immune escape. Furthermore, we also observed that the level of TMB was correlated with the pathological grade of UCEC. The higher the level, the worse the differentiation, but patients with advanced-stage may be more prone to bene t from ICB (Figure 3b).
GO enrichment analysis of DTIGs indicated that these genes were mainly involved in humoral immune response, lymphocyte immunity, immunoglobulin complex, antigen binding, and other immune-related crosstalk. Subsequently, 4 risk-related DTIGs (positive correlation: FGF19 and IL13RA2, negative correlation: EDN3 and TRAV21) that were highly associated with survival were identi ed via multivariate cox analysis, and risk prediction models were developed based on these 4 genes. Patients with high TMBRI showed poor survival results (p <0.001), and the AUC of the model 's ROC curve was 0.670, indicating excellent predictive reliability.
The binding of FGF19 to its speci c receptor FGFR4 can inhibit apoptosis and NF-κB signaling, and upregulate the expression of proliferation-related genes (45,46). Both are linked with malignancy and risk prognosis and may be independent prognostic indicators for breast cancer and bladder cancer. IL13RA2 (interleukin-13 receptor Subunit Alpha 2) is a protein-coding gene related to diseases including glioblastoma and malignant glioma, which are mainly associated with poor prognosis and drug resistance (47,48). Besides, Barderas et al. (2012) showed that interleukin 13 (IL-13) mediated the invasion, pro-metastatic effect, and poor prognosis in colorectal cancer through IL13Rα2 (49). At present, there are no researches on the association between IL13RA2 and the development, metastasis, and prognosis of endometrial cancer. END3 (endothelin-3) was down-regulated in a variety of tumor tissues, such as breast (50), cervical (51), and colon cancer (52,53). In our study, EDN3 was also a negative risk factor, and the higher its expression, the lower the patient's risk value (coe cient = -0.016578188).
However, some studies have suggested that EDN3 may be involved in the development of melanoma via Hypoxia-Inducible factor-1alpha (54). Therefore, the effect of EDN3 on endometrial cancer remains to be further explored.

Conclusions
In summary, our study demonstrates that in UCEC, higher TMB was associated with better survival outcomes, which may be due to the increased immune in ltration in the tumor microenvironment caused by high levels of TMB.

Availability of data and materials
All data generated or analyzed during this study are included in this article and its additional les.

Con ict of interest
All authors declare that they have no con ict of interest.         The landscape of mutation pro les in UCEC samples. The detailed mutation information of each patient was shown in the waterfall plot, with different colors with speci c annotations at the bottom indicating various types of mutations. The barplot above the legend showed the number of mutation burden. UCEC, uterine corpus endometrial carcinoma.