LPCAT1 is a Prognostic Biomarker and Correlated with Tumor Microenvironment in Endometrial Cancer

Background: Endometrial cancer (EC) is one of the three malignant reproductive tumors threatening women’s life and health. Glycerophospholipids (GPLs) are important bioactive lipids involved in various physiological and pathological processes including cancer. Immune-inltration of the tumor microenvironment (TME) is positively associated with the overall survival in EC. Exploring GPLs-related factors associated with TME in endometrial cancer can aid in the prognosis of patients and provide new therapy targets. Methods: Differentially expressed GPLs-related genes were identied from TCGA-UCEC datasets and Molecular Signatures Database (MSigDB). Univariate Cox regression analysis was used to select GPLs-related genes with prognostic values. Random forest algorithm, LASSO algorithm and PPI network were used to identify critical genes. ESTIMATEScore was calculated to nd out genes associated with TME. Then, differentiation analysis and survival analysis of LPCAT1 were performed based on TCGA datasets. GSE17025 and immunohistochemistry (IHC) veried the results of differentiation analysis. GO and KEGG enrichment analyses were performed to explore the underlying mechanism. In addition, we used ssGSEA algorithm to explore the correlation between LPCAT1 and cancer immune inltrates. Results: Twenty-three differentially expressed GPLs-related genes were identied and eleven prognostic genes were selected by Univariate Cox regression analysis. Four signicant genes were found out by two different algorithms and PPI network. Only LPCAT1 was signicantly correlated with tumor microenvironment. Then, we found that LPCAT1 was highly expressed in tumors samples compared with normal tissues, and lower survival rates was along with high expression groups of LPCAT1. Moreover, the expression of LPCAT1 was positively correlated with histologic grades and types. ROC curve indicated LPCAT1 had good accuracy of prognostic value. Receptor ligand activity, pattern specication process, regionalization, anterior/posterior pattern specication and salivary secretion pathways were enriched as potential targets of LPCAT1. By using ssGSEA algorithm, fteen kinds of tumor inltrating cells (TICs) were found correlated with LPCAT1 expression. Conclusion: These ndings suggested that LPCAT1 was a valuable prognostic biomarker and correlated with immune inltrates in endometrial cancer, which may provide novel therapy options and improved treatment of EC.


Introduction
Endometrial cancer (EC) is one of the most prevalent gynecological cancers worldwide with an incidence of about 4.5% in female cancer cases and over 410,000 new cases were diagnosed in 2020 [1]. The treatment for endometrial cancer is mainly with surgery to remove the uterus, complementary with chemotherapy, radiotherapy and hormone therapy. Other options include immunotherapy which mobilizes immune system against cancer and targeted therapy with drugs that attack speci c weaknesses in cancer cells. Despite great advances in medical device and treatment in recent years, the mortality of EC has still increased in the past several years, and treatment effects are poor for patients with advanced and speci c subtypes [2]. Endometrial cancer often occurs in women with metabolic disorders, including diabetes, hypertension and obesity [3]. A fully understanding of metabolic alterations could contribute to early diagnosis, timely treatment and development of new therapeutic targets.
Glycerophospholipids (GPLs) are the fundamental components of biomembrane system. Enhanced synthesis of GPLs provides su cient membrane structure for rapid cell proliferation and serves as a source of energy supply in conditions of nutrient de ciency [4]. The composition of fatty acyl chain in individual GPLs contributes to speci c biophysical properties of cell membrane and in uences varieties of cellular processes [5]. Accumulating evidence have shown that GPLs-related genes which regulate GPLs synthesis and remodeling were involved in development of cancer. Phospholipase A2 enzymes (PLA2s) was found signi cantly overexpressed in various tumor tissues including colorectal cancer, prostate cancer and gastric cancer, and patients with higher PLA2s expression were accompanied with poor prognosis [6]. The study by Wang et al demonstrated that phospholipid remodeling caused by lysophosphatidylcholine acyltransferase 3 (LPCAT3) deletion promoted the proliferation and division of small intestinal stem cells [7]. Moreover, Trousil et al found that upregulation of choline kinase α (CHKA) was responsible for alterations of choline phospholipid metabolism in EC and validated abnormal choline biochemistry as biomarker for EC [8]. Although attentions had been paid to the role of GPLs-related genes on EC progression, their effects were not well characterized.
Tumor microenvironment is an ecosystem composed of mesenchymal cells, immune cells, extracellular matrix (ECM) molecules and in ammatory mediators that have vital impact on tumor progression and clinical outcomes [9]. Structural components of the TME contain not only cancer cells but also recruited immune cells and resident stromal cells. Studies have shown that immune and stromal scores positively correlated with EC clinical characters and outcomes [10]. Moreover, several genes related to the EC immune environment that could predict prognosis [11]. Lipid metabolites are important mediators which affect tumor microenvironment. A recent study elucidated that GPLs metabolism level in uence the effect of PD-1 antibody by changing the components of tumor microenvironment (TME) in colorectal cancer [12]. However, whether GPLs-related genes have in uences on EC microenvironment and clinical outcomes are largely unknown.
Here, our results indicated that GPLs-related genes LPCAT1 had effects on endometrial cancer TME and clinical outcomes of EC patients, which might provide novel prognostic biomarkers and immunotherapy targets for EC.

Generation of GPLs-related DEGs
Transcriptome RNA-seq data and the corresponding clinical data which contains 552 endometrial cancer cases and 35 normal control cases were downloaded from The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/) by R-software. GPLs-related gene sets were obtained from Molecular Signatures Database (MSigDB) on gene set enrichment analysis (GSEA) website. Differential expression genes (DEGs) analysis was performed using the "limma" package, at a corrected P < 0.05 and | log 2 FC | ≥1. The website Venn diagrams tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) was used to identify the intersection genes between upregulated or downregulated DEGs and GPL-related gene sets which were selected for further analysis.
Identi cation of the Core Prognostic GPLs-related Genes Univariate Cox proportional hazards regression analysis was performed to identify those genes associated with overall survival (OS). Those survival-related genes were put into the least absolute shrinkage and selection operator (LASSO) penalty Cox regression analysis to eliminate false positives because of over-tting. Then, The patients were randomly divided into training set and testing set according to the ratio of 7:3. Random Forest algorithm by "randomForest" R package was used for candidate genes selection in training set and veri ed in testing set [13]. To understand protein interactions, we constructed the protein-protein intersection (PPI) network by STRING (V11) [14] with high con dence (0.65). The overlapped genes in PPI network, LASSO and Random Forest algorithms were selected.
Generation of ImmuneScore, StromalScore, and ESTIMATEScore ESTIMATE algorithm by feat of R language loaded with estimate package [15] was used to estimate the ratio of immune-stromal component in TME for each sample, exhibited in the form of three kinds of scores: ImmuneScore, StromalScore, and ESTIMATEScore. The scores are positively correlated with the ratio of immune, stromal, and the sum of both, respectively, which means the higher the scores, the greater the proportion of the corresponding components in TME.

Survival and expression analysis of LPCAT1
Box plots using disease state (Tumor or Normal) as variable was graphed to visualize the differential expression of LPCAT1. The expression of LPCAT1 in GSE17025 data set was download from GEO database (https://www.ncbi.nlm.nih.gov/geo/) to verify the differential expression of it between tumor and normal tissues. Box plots using pathological type and cancer grade as variables were analyzed to compare the LPCAT1 expression in different pathological types and cancer grades. Meanwhile, Kaplan-Meier curve analysis R package was further conducted to evaluate the relationship between LPCAT1 expression and survival rate, including overall survival (OS), disease free interval (DFI) and progress free survival (PFS). In addition, Receiver operating characteristic (ROC) curve analysis con rmed the LPCAT1's predictive capacity.

Enrichment analysis
All the patients were divided into two groups by median expression of LPCAT1,and DEGs analysis were performed using "limma" package. The resulting data was used to generate volcano plots using R language with package ggplot2. GO and KEGG enrichment analyses using 379 DEGs were performed with R language with the aid of packages clusterPro ler, enrichplot, and ggplot2. Only terms with both pand q-value < 0.05 were considered signi cantly enriched. PPI network was used to show the interactions of each protein.

Immunohistochemistry analysis
Pathological specimens from 25 EC tissue samples and 5 adjacent non-tumor tissue samples were obtained from Qilu hospital of Shandong University. Tissue samples were xed with 4% paraformaldehyde then embedded in para n. After cutting into 4µm sections, they were depara nized and rehydrated with xylene and graded concentrations of ethanol solutions. Tris/EDTA buffer was used to perform heat mediated antigen retrieval. Then, sections were incubated in 3% H 2 O 2 to reduce endogenous peroxidase activity and blocked in 10% goat serum for 30min to reduce nonspeci c antigens.
Sections were incubated with a rabbit monoclonal anti-LPCAT1 antibody (1:2000 dilution, Abcam ab214034) overnight at 4°C, followed by incubation of biotin-labeled secondary goat anti-rabbit antibody and horseradish enzyme-labeled streptomycin at 37℃ for 30 min, respectively. DAB reagent was used to detect positive signals. Integrated optical density (IOD) of each section was calculated using Image-Pro Plus software 6.0 (Media Cybernetics, USA) by two independent blinded investigators randomly.

TICs Pro le
To explore the differences of immune cell subtypes, single sample gene set enrichment (ssGSEA) analysis was conducted to estimate TIC abundance pro les in all tumor samples. The results were used for further following correlation analysis. Mann-Whitney U test was used to compare differences in immune cell subtypes between the high LPCAT1 and low LPCAT1 groups.

Statistical analysis
All analyses were carried out by R version 4.0.3 and corresponding packages. Kaplan-Meier analysis was further conducted to evaluate the relationship between LPCAT1 expression and survival rate using survimer R package. Meanwhile, The glmnet R package was used for LASSO analysis.

Results
The ow chart of this study is shown in Fig. 1. A total of 552 UCEC patients from the TCGA-UCEC cohort were nally enrolled. The detailed clinical characteristics of these patients were summarized in Table 1. Identi cation of prognostic GPLs-related differentially expressed genes (DEGs) To identify the DEGs between tumor tissues and normal tissues in the TCGA-UCEC cohort, differential analysis using "limma" package was performed. 24 GPLs-related genes were found to be differentially expressed between tumor tissues and normal tissues (Fig. 2a). Univariate COX regression analysis for the survival of UCEC patients was performed to determine the prognostic factors among 23 DEGs and 11 genes were selected (Fig. 2b).
Intersection of Random forest model, LASSO COX model and Analysis of PPI Network To further identify core GPLs-related genes, we performed two different machine learning algorithms, LASSO algorithm and Random forest algorithms. By LASSO algorithm, 9 out of 11 genes were identi ed (Fig. 3a). The areas under ROC curve (AUC) (Additional le 1: Fig. S1a, AUC:0.698) and risk score were calculated based on the coe cients in the LASSO algorithm. The risk score of patients who died of the desease was signi cantly higher than that of patients who survived (Additional le 1: Fig. S1b). Simultaneously, the top 10 genes out of 11 were con rmed by Random forest algorithm (Fig. 3b), and the predictive value of patients who died was signi cantly higher not only in training set (Additional le 2: Fig. S2a) but also in testing set (Additional le 2: Fig. S2b). Also, we constructed PPI network based on the STRING database using Cytoscape software [National Institute of General Medical Sciences (NIGMS) USA] to explore the underlying mechanism (Fig. 3c). Then, the LASSO algorithm was cross-analyzed with the top 10 genes ranked by random forest and the leading nodes in PPI network, and four factors, PLA2G2F, PLA2G2A, LPCAT2 and LPCAT1 were found to overlap in the above analysis (Fig. 3d).

The expression of LPCAT1 was associated with EC microenvironment
In order to determine the relationship between the proportion of immune and stromal components with the expression of core GPLs-related genes, data from 552 EC cases and 19618 RNAs extracted from RNAseq data according to ENSEMBL Genomes (hg38) were analyzed in this study. All cases were divided into high-expression group and low-expression group compared with median expression of above four core GPLs-related genes. ImmuneScore, ESTIMATEScore and StromalScore were predicted by expression pro le data using the ESTIMATE R package (Table S1). We observed no signi cant differences in ImmuneScore, ESTIMATEScore or StromalScore between high-expression group and low-expression group of LPCAT2 (Fig. 4d-f) and PLA2G2F (Fig. 4j-l). Although the expression of PLA2G2A had signi cant correlation with StromalScore ( Fig. 4i), there was no signi cant correlation between PLA2G2A expression and ImmuneScore or ESTIMATEScore (Fig. 4g-h). However, ImmuneScore, ESTIMATEScore and StromalScore of LPCAT1 in low-expression group were signi cantly higher than those in high-expression group (Fig. 4a-c). These results suggested that the expression of LPCAT1 was associated with the structural components of EC microenvironment.
Relationship between LPCAT1 expression and survival, grade, and histological types in EC patients According to TCGA analysis, we found that LPCAT1 expression in endometrial cancer tissues was signi cantly higher than that in adjacent non-tumor tissues (Fig. 5a). Also, the expression level of LPCAT1 in tumor tissues was also signi cantly higher than that in paired normal tissues (Fig. 5b). The same nding was veri ed in GSE17025 (Fig. 5c). From the Kaplan-Meier curve, lower overall survival (OS), progress free survival(PFS), disease speci c survival(DSS) rates were recorded for patients in the highrisk compared to those in the low-risk group (Fig. 5d-f). As shown in Fig. 4, LPCAT1 expression showed positive correlation with grades (Fig. 5g). The expression of LPCAT1 in serous EC tissues was signi cantly higher than that in endometrioid EC tissues (Fig. 5h). The AUC value was 0.898,indicating a good accuracy of the prognostic prediction-value of LPCAT1 (Fig. 5i). To verify LPCAT1 expression in EC, immunohistochemistry (IHC) was performed in 25 EC patients with different differentiation levels and 5 adjacent normal tissues. Results indicated that LPCAT1 expression in EC tissues was signi cantly higher than that in adjacent normal tissues, while there was no signi cant difference among EC tissues with different differentiation levels ( Fig. 7a-b).

Enrichment and PPI analysis of DEGs
To determine the biological role of LPCAT1, we divided patients into high expression group and low expression group by median expression of LPCAT1. DEGs was analyzed by "limma" package. Results were shown by volcano plots (Fig. 6a), in which the red dots represented the up-regulated genes screened on the basis of corrected P < 0.05 and log 2 FC ≥ 1. The green dots represented the down-regulated genes screened on the basis of corrected P < 0.05 and log 2 FC≤-1. The grey dots represented genes with no signi cant differences. PPI network of DEGs with con dence > 0.65 was constructed (Fig. 6b). Go enrichment analysis and KEGG pathway enrichment of DEGs were performed by R software (Fig. 6c). The results of GO analysis indicated that receptor ligand activity, endopeptidase inhibitor activity, hormone activity, transmembrane transporter complex, ion channel complex, motile cilium, pattern speci cation process, regionalization and anterior/posterior pattern speci cation were involved. The KEGG results showed that salivary secretion were suggested to be enriched pathways.

Correlation of LPCAT1 Expression with the Proportion of TICs
We applied ssGSEA algorithm to further con rm the correlation of LPCAT1 expression and the immune component, analyzing the proportion of tumor in ltrating immune cells and constructing 28 sorts of immune cell pro les in UCEC samples (Fig. 7c-d). By constructing difference and correlation analysis, 15 kinds of TICs were found signi cantly correlated with the expression of LPCAT1 (Fig. 8a-c). Among them, 4 kinds of TICs were positively associated with LPCAT1 expression, including Activated CD4 T cell, effector memory CD4 T cell, memory B cell and Type 2 T helper cell; 11 kinds of TICs were negatively correlated with LPCAT1 expression, including activated B cell, activated CD8 T cell, CD56dim natural killer cell, central memory CD4 T cell, effector memory CD8 T cell, eosinophil, macrophage, mast cell, MDSC, monocyte, and T follicular helper cell. These results strongly supported that LPCAT1 expression in uenced the immune activity of TME. Then, the correlation of LPCAT1 expression with immune checkpoints genes (ICGs) was explored to estimate the immunotherapy responses through LPCAT1 expression. The correlation between LPCAT1 expression and ICGs [cytotoxic T-lymphocyte associated protein 4 (CTLA4), lymphocyte activating 3 (LAG3), CD47, CD70, TNF receptor superfamily member 14 (TNFRSF14), CD155 (PVR), etc.] was performed, indicating signi cant different expression of ICGs was observed between high-expression group and low expression group of LPCAT1 (Fig. 8d). The results demonstrated that LPCAT1 may help evaluate immunotherapy responses in EC.

Discussion
In the current study, we attempted to identify prognostic GPLs-related genes involved in tumormicroenvironment in endometrial cancer. LPCAT1 has been con rmed to be associated with the immune activities. Importantly, ssGSEA analysis indicated that LPCAT1 might serve as an indicator for TME statement in UCEC patients.
Obesity, hypertension and diabetes are clearly recognized as risk factors of endometrial cancer. And the metabolic changes of endometrial tumors themselves compared with their nonmalignant counterparts have been gradually recognized. Studies have shown glucose transport which is mediated by GLUT-6 and glycolytic-lipogenic metabolism may be responsible for tumor cell survival [16]. Signi cant elevation in the expression of SREBP1 and subsequent enhancement of lipid synthesis are the main characteristics of EC [17]. Poor diagnosis has been proved to be associated with overexpression of FASN in endometrial cancer [18]. Despite increasing metabolic genes were found responsible for endometrial cancer, there are few studies on glycerophospholipids-related metabolites. GPLs are necessary for cells to maintain homeostasis and normal physiological functions. Disorder of GPLs is obviously involved in benign disease and cancers [19][20][21][22]. Identi cation of dysregulated GPLs-related genes may provide a novel perspective for the therapies of EC.
More and more studies have shown that metabolic changes can affect tumor microenvironment (TME), especially immune cells. The study of Ringel et al demonstrated that free fatty acid (FFA) was decreased in tumor microenvironment of mice fed with high-fat diet, and the exhaustion of free fatty acid not only inhibited the function of CD8 + T cells but also reduced the number of it [23]. TME components played an indispensable role in the initiation and development of tumorigenesis. Targeting TME remodeling may provide a potential therapy strategy to inhibit tumor progression. Several studies have demonstrated that immune microenvironment in uence the tumor biological behavior [9,24,25]. Lack of Tumor-killing immune cells have been shown to associated with poor diagnosis of various malignancies.
Focusing on GPLs metabolism in uence endometrial cancer TME. We systematically investigated the expression of 77 GPLs-related genes in UCEC tumor tissues and their prognostic values. 23 out of 77 genes were differentially expressed, and the expression of 11 genes was related to prognosis of UCEC patients. These results signi cantly indicated the potential role of GPLs metabolism in EC and the possibility of targeting GPLs-related genes as therapy. To identify core prognostic genes playing essential role in EC, two machine learning algorithm and PPI analysis were used. Finally, we select four signi cant genes. Then, Patients were divided into high-expression group and low-expression group by median expression of four signi cant genes, respectively. ImmuneScore, StromalScore and ESTIMATEScore were calculated by estimate R package to estimate the immune and stromal proportion of each patients. The higher the ImmuneScore and the StromalScore, the larger respective components were in TME. Scores were compared between high-expression group and low-expression group and the results revealed that only LPCAT1 was related to tumor microenvironment. Scores of high expression group of LPCAT1 was signi cantly lower than low expression group, suggesting that the high concentration of immune cells was in the TME of low expression group of LPCAT1. Here, we embarked from the transcriptomic analysis of UCEC in TCGA database, which revealed that the increased expression of LPCAT1 was signi cantly associated with the advanced, speci c subtypes and poor prognosis. Chen et al. had demonstrated that the immune and stromal scores positively correlated with clinical outcomes of EC patients [10]. It might represent one of the mechanisms contributing to better prognosis of low expression group of LPCAT1. Accordingly, it suggested that LPCAT1 might be a potential prognostic marker and a therapeutic target of TME in UCEC.
LPCAT1 is a membership of lysophosphatidylcholine acyltransferases (LPCATs) family which regulate phospholipid metabolism in the Lands cycle. It catalyzes the transformation of lysophosphatidylcholine (LPC) into phosphatidylcholine (PC) by incorporating fatty acyl chains into phosphatidy lcholine [26]. LPCAT1 was initially isolated from alveolar type II cells and involved in the synthesis of alveolar surfactant [27]. Recently, LPCAT1 has been found overexpressed and acted as an oncogene in a variety of tumors [28]. However, no study demonstrated the relationship of LPCAT1 in endometrial cancer and tumor microenvironment.
To further investigate the functions of LPCAT1 in EC, DEGs analysis were performed, which were further subjected to GO and KEGG enrichment analysis.
Our results showed that salivary secretion in KEGG and receptor ligand activity, endopeptidase inhibitor activity, hormone activity, transmembrane transporter complex, ion channel complex, motile cilium, pattern speci cation process, regionalization, anterior/posterior pattern speci cation in GO were enriched.
Those all suggested that the underlying mechanism of LPCAT1 served as a potential prognostic molecular marker and therapeutic target in EC.
Another important aspect of this study was the correlation between LPCAT1 expression and the level of immune in ltration, which closely tied to the microenvironment of EC. As many studies have reported that various of immune cells, including different kinds of lymphocytes and macrophages, could form an immune microenvironment of EC, which consequently in uenced EC patients outcomes [29][30][31]. Thus, we analyzed the relationship between LPCAT1 genomic alterations and immune in ltration in EC. Our results demonstrate that the upregulation of LPCAT1 was along with the increase of activated CD4 T cell, effector memory CD4 T cell, memory B cell and Type 2 T helper cell, which were positively correlated with LPCAT1 expression. While the upregulation of LPCAT1 was also along with the reduction of activated B cell, activated CD8 T cell, CD56dim natural killer cell, central memory CD4 T cell, effector memory CD8 T cell, eosinophil, macrophage, mast cell, MDSC, monocyte, T follicular helper cell which were negatively correlated with expression of LPCAT1. Meanwhile, LPCAT1 might act as a predictor estimating immunotherapy responses. Together these ndings suggested that the LPCAT1 played an important role in recruitment and regulation of immune in ltrating cells in UCEC.
However, lack of experimental proof is the limitation of this study. Although IHC assay proved the LPCAT1 expression was signi cantly higher in EC than in adjacent normal tissues, the potential role of LPCAT1 and its relationship with immune cells is not su cient without further in vivo or in vitro experiments to support. Better understanding of the functions and underlying mechanism of LPCAT1 in EC will be recognized by more verifying experiments and clinical trials which might improve accuracy of diagnosis and therapy strategy. Therefore, further studies should be conducted to verify the accuracy of a combined analysis of LPCAT1 expression, GPLs and the amounts of tumor-in ltrating immune cells for UCEC patients.

Conclusion
Our ndings provided speci c insight into the role of GPLs-related genes in EC. We concluded that LPCAT1 served as a molecular marker to predict the prognosis and status of EC microenvironment. The results further suggested the potential mechanism of LPCAT1 as a novel therapy target for improving clinical outcomes with LPCAT1. We strongly recommend further investigation on the subject matter to progressively clarify the exact biological impact of LPCAT1.

Availability of data and materials
The datasets generated and analyzed during this study are available in the TCGA database (https://portal.gdc.cancer.gov) and GEO database (https://www.ncbi.nlm.nih.gov/geo/).   Correlation of ImmuneScore, StromalScore with expression of core  patients with different LPCAT1 expression. Patients were labeled with high expression or low expression according to the optimal cut-off value (minimum p-value). p = 0.006, 0.042, 0.036 by log-rank test. (g-h)

Ethics approval and consent to participate
The correlation of LPCAT1 expression with histologic grade and histological type. Wilcoxon rank sum or Kruskal-Wallis rank sum test served as the statistical signi cance test. (i) ROC curve for judging the accuracy of LPCAT1(AUC=0.898). DEGs, terms with p and q < 0.05 were believed to be enriched signi cantly. Heatmap showed the correlation between 22 kinds of TICs, and the numeric indicated the p-value of the correlation. Figure 8