Identification of prognosis-related genes in the cervical cancer immune microenvironment CURRENT POSTED

Background Cervical cancer is the fourth most common cancer in women worldwide. The metastasis and invasion of this type of cancer are closely related to the tumor microenvironment. Immune cells and stromal cells dominate the tumor microenvironment in cervical cancer. Therefore, we should further understand the association between tumor progress and immune cells or stromal cells. Methods we downloaded the gene expression profiles and clinical data of 307 patients with cervical cancers based on the TCGA database. Subsequently the Estimation of Stromal and Immune cells in Malignant Tumours using Expression data (ESTIMATE) algorithm was used to calculate the scores of stromal cells and immune cells to find differential genes, and analyzed the correlation between their scores and patient survival. Moreover, we also used R language packs and network tools to analyze GO term, gene enrichment pathway, and protein-protein relationship to find genes related to inflammation and immune regulation. Results The gene expression profiles and corresponding clinical data of 307 patients were obtained from TCGA datasets. The results showed that there was a statistically significant difference between the high immunescore group and the low immunescore group. And the low immunescore group had shorter lifetimes than the high scores group (P = 0.035).Moreover, PPI network analysis CCR5 and CXCL9, -10, -11 / CXCR3 axis might be new target for cervical cancer treatment. Finally, Kaplan-Meier survival curves found out nine representative genes significantly related to survival including BTNL8 , CCR7 , CD1E , CD6 , CD27 , CD79A , GRAP2 , SP1B , LY9 . Conclusions These genes can be used as markers for the prognosis and diagnosis of cervical cancer and also might be used as treatment targets.


Background
The incidence and mortality of cervical cancer were ranked fourth in the list of new cases and deaths of common cancers in 2018,with 6.6% and 7.5% respectively (1). Especially in Eastern Africa, cervical cancer was the most frequently diagnosed cancer in women. There were many risk factors that influence cause of cervical cancer, such as high-risk human papillomavirus(hrHPV) infection, smoking, health conditions, economic status and sexual partners (2,3). Compared with 2012 statistics, the incidence of cervical cancer in 2015 had decreased by 1.3% (4), possibly dued to production of HPV vaccine, improvement of living conditions and early screening (5). However, it was still a highmortality cancer in Asia, accounting for 53.8% of the world. Therefore, essential measures were still needed to reduce the incidence of cervical cancer.
Immunotherapy for cancer is based on the tumor microenvironment (TME), which improves endogenous immune system to monitor and eliminate tumor cells (6). In addition to traditional surgical treatment, radiotherapy, chemotherapy and corresponding vaccines, cervical cancer immunotherapy has gradually emerged (7, 8). TME plays a bad role in immunotherapy to suppress immune cell function and promote tumor cell infiltration (9). TME is mainly composed of cellular and molecular parts, including fibroblasts, endothelial cells, tumor cells, extracellular matrix (ECM), immune cells and vasculature (10). ESTIMATE algorithm, which is based on sample Gene Set Enrichment Analysis, create immune and stromal score to predict their respective infiltration level in tumor (11). Currently, several cancer types had been evaluated the infiltration results in TME through using ESTIMATE algorithm (12)(13)(14)(15). However, as far as we known, there was no report to systematically assess the infiltration condition in TME of cervical cancer yet.
Therefore, in the present study, we would estimate immune and stromal score in TME based on cervical cancer data obtained from The Cancer Genome Atlas (TCGA) database to further find genes related to the diagnostic and prognosis of cervical cancer.

Database
The gene expression profiles of adult patients with cervical cancer and corresponding patients' clinical profiles dataset including outcome, age, TNM stage and survival time were downloaded from the TCGA database (https://gdc.nci.nih.gov/) (16). Overall stromal content (StromalScore), immune infiltration (ImmuneScore) and the combined (ESTIMATEScore) score of each tumor samples were assessed by the ESTIMATE algorithm of the down-loaded database (17).

Differentially expressed genes (DEG) identification
All carcinoma of uterine cervix patients were divided into high-score and low-score groups based on median score of the immune and stromal scores. Grouping criteria were based on DEGs with a p value and false discovery rate (FDR) less than 0.05 and fold change greater than 2 times.In this study, package edgeR was used for data analysis and heatmap of the DEGs were drawn using the pheatmap R package.

GO and pathway enrichment analyses
The R package (18) was applied to analyze DEG functions and KEGG pathway enrichment. GO term analysis consists of biological processes (BP), cellular component (CC), and molecular function (MF).
The ClueGo plug-in from Cytoscape software were employed to analyze DEGs enriched in various pathways. Differences of genes between high score group and low score group were considered to be statistically significant at P<0.05 and were reported.

PPI network construction
The STRING database (http://string-db.org/cgi/input.pl) is an online tool for evaluating protein-protein interaction (PPI) information (19). We used STRING database to assess prepared DEGs-encoded proteins and interactions were selected with an interaction score greater than 0.95. Cytoscape software was applied to establish PPI network, and the Molecular Complex Detection (MCODE) in Cytoscape software was subsequently utilized to carry out modular analysis based on the node number and MCODE score. The most significant module was identified according to node number and the MCODE score (20).

Survival analysis
Kaplan-Meier plots were performed to analyze correlations between the overall survival of patients with carcinoma of uterine cervix and expression of DEGs or immune/stromal scores. The log-rank test was used to calculate statistical significance of the correlation.The survival analyses were performed in TCGA datasets by R"survival" package.

Immune microenvironment correlates with clinical characteristics of patients with cervical cancer
The gene expression profiles and corresponding clinical data of 307 patients were obtained from TCGA datasets. The youngest patients were 20 years old, the oldest 88 years old, and the median was 46 years old.
We used the ESTIMATE algorithm to calculate the immunescore ranging from -1209.74 to 3419.323, stromalscore ranging from -2437.40 to 804.22 and ESTIMATEScore ranging from -3262.05 to 4002 of all patients (13). This data indicated that the higher the ESTIMATEScore, the less the proportion of tumor cells. In order to study the potential relationship between clinical overall survival of patients and their immunescores or stromalscores, we divided the 307 patients into high-and low-score groups according to immunescore or stromalscore, then evaluated their relationship with Kaplan-Meier survival analysis. The results showed that there was a statistically significant difference between the high immunescore group and the low immunescore group. And the low immunescore group had shorter lifetimes than the high scores group (P = 0.035, Fig. 1A). However, there was no significant difference between the high stromalscore group and low stromalscore group for nearly 15 years. (P = 0.391, Fig. 1B) Furthermore, according to the clinical TMN stage of the tumor, patients with cervical cancer were classified into two groups: M1 and M0 stage. We analyzed the relationship between M1/M0 stage and their corresponding immune score or stromal score. As shown in Figure 1C,D, the immunescore and stromalscore in M0 stage group were significantly higher than those in M1 stage group (P=0.048,P=0.027).

Identification of DEGs based on immunescore and stromalscore
The expression profile data for all 307 patients with cervical cancer from TCGA was investigated to screen for DEGs between the high score and low score groups. A total of 1067 and 946 DEGs in cervical cancer sample cells based on immunescores and stromalscores, respectively. The heat map for DEGs of immune/stromal score groups was shown that DEGs were classified into 2 clusters. Genes in cluster with blue bar were belonged to high score group, while genes in cluster with pink bar were belonged to low score group in cervical cancer samples ( Fig. 2A,B). Among them, we identified 408 up-regulated genes and 17 down-regulated genes in Fig. 2C,D from both immunescore groups and Discussion Although the incidence of cervical cancer had generally declined, poor prognosis was most frequently stated problem with advanced and metastatic cervical cancer (21). In recent years, immunotherapy emerged as a novel treatment option, opening the new door to tumor treatment (22). But it still had little effect in solid tumors, including cervical cancer. Mainly because the stromal cells build a complex and hard barrier for the TME, so that immune cells are difficult to infiltrate into the tumor tissue. The metastasis and progression of cervical cancer were mainly due to the loss of cancer cell adhesion and the degradation of the extracellular matrix leading to the invasion phenotype of the cells from in situ metastasis. Moreover, tumor evoluted a series of ways to escape immune surveillance even facing continuous immune pressure in the TME (23,24). Given stromal cells and immune cells played pivotal role in the cervical cancer microenvironment. It was essential to further elucidate the their correlation between the survival of patientsand immune cells or stromal cells in cervical cancer.
In this study, we downloaded the gene expression profile and clinical database of cervical cancer from the TCGA database using a web tool to study the composition of cervical cancer TME including stromal cells and immune cells through ESTIMATE algorithm for the first time. Meanwhile, the scores of stromal cells and immune cells calculated with ESTIMATE algorithm, then their correlation with the survival of cervical cancer patients were also analyzed. Finally, we found DEGs correlation using GO analysis, signaling pathway enrichment, PPI network construction and compared the relationship between DEGs and overall survival by package edgeR.
At present, few studies explored the relationship between the prognosis of cervical cancer and the infiltration condition of immune cells or stromal cells. In this study, the ESTIMATE algorithm was used to calculate the scores of immune cells or stromal cells, which were divided into high score group and low score group. The calculated results demonstrated that these scores were strongly associated with the survival in cervical cancer. Some previous studies had found that in cervical cancer patients with high-risk, immune-related genes might reflect a changing tumor microenvironment (25). Unbalanced activation of Th17 and regulatory T cells might contribute to cancer progression and had a poorer prognosis (26). Th17 cells differentiation signaling pathways in KEGG analysis had also been mentioned in cervical cancer progression. Moreover, intense interaction between the tumor and the surrounding stroma cells could lead to a vicious circle of cancer (27). Based on our experimental data of immune / stromal scores associated with the survival of cervical cancer patients, the results were coincident with others' findings. It was further speculated that the TME or the content of immune cells and stromal cells would have a correlation with the survival of patients with cervical cancer. Various intrinsic factors such as disease stage, lymph node metastasis, and distant metastasis of cervical cancer had been studied to identify factors associated with cancer survival(28, 29). Our data also showed that immune/stroma scores were closely related to distant metastasis, which was consistent with their studies.
Subsequently, we found DEGs between high and low score groups, and processed them using bioinformatics methods including GO term analysis, KEGG analysis and PPI network construction. GO term analysis showed that DEGs were mainly enriched in T cell activation, external side of plasma membrane and cytokine receptor activity. KEGG pathway analysis observed DEGs mainly clustered in for the cytokine-cytokine receptor interaction, chemokine signaling pathway, cell adhesion molecules and the intestinal immune network for IgA production. Moreover, according to DEGs of the PPI network, we identified two modules including some chemokines and CD3 molecules which were relevant with the immune and inflammatory response.
Chemokines are small proteins that binds to the G protein-coupled receptors. Among them, CXCL9, -10, -11 are selective ligands for the CXCR3 receptor. CXCL9, CXCL10, CXCL11 / CXCR3 axis played an important role that regulate the migration (espacially chemotactic migration), activation and proliferation of immune cells. On the other hand, it also caused cancer cell metastasis mainly due to tumor-derived CXCR3A ligand activity (30, 31). Our present work found that this axis was also expressed in cervical cancer. In addition, through our PPI network analysis chemokine receptor CCR5 was also expressed. The previous study revealed CD8 + T cells expressing CCR5 and CXCR3 chemotactic receptors preferentially proliferated in cervical cancer (32). And anti-CXCR3 treatment could suppress tumor metastasis and tumor-host response. Furthermore, down-regulating the expression of CCR5 inhibited the proliferation and invasion of cervical cancer(33, 34). Therefore, CCR5 and CXCL9, -10, -11 / CXCR3 axis might be new target for cervical cancer treatment.
CD3D, CD3G, and CD3E genes are three of the genes encoding CD3 chain in TCR/CD3 complex. Their mutations caused severe immune deficiencies and showed increased susceptibility to infections early (35). It was reported that a high CD3D / CD4 ratio predicted better survival and also be used as a prognostic marker for bladder cancer (36). Our result also provided the evidence about CD3 / CD4 expression involved in cervical cancer progression Finally, we verified 9 representative genes including BTNL8, CCR7, SPIB, CD6, LY9, CD1E, CD27, CD79A, GRAP2 as prognostic markers of cervical cancer, which needed further verification in clinical.
Among them, CD27 and CCR7 were markers of T cell differentiation. Previous studies showed that advanced CD27 − CCR7 − CD45RA − T cell of PD-1 + TIGIT + 2B4 + Tim-3 + KLRG-1 − CTLA-4 − CD8 TILs was related to poorly differentiated cervical cancer (37), and CCR7 might be associated with lymph node metastasis. So that they could be used as a prognostic factor for survival (38). As we known, the remaining 7 genes were the first time to be reported to possess relationship with cervical cancer, thus they might also be used as a reference for the prognosis of this type of cancer.

Conclusion
In conclusion, we used the ESTIMATE algorithm for the first time to evaluate the score of stromal cells and immune cells in cervical cancer based on TCGA database, and analyzed the DEGs between immune cells and stromal cells through systematic bioinformatics analysis. Based on our study, more understanding on complicated relationship between cervical cancer microenvironment and tumors were promoted, as well as some potential biomarkers related to cervical cancer were found. Of