Identification of CXCL8 as an Immune Microenvironment Regulator of Colorectal Cancer: A Bioinformatics Study Based on Cancer Genome Atlas and Gene Expression Omnibus Datasets


 BackgroundAt present, a lack of knowledge of the molecular mechanisms underlying the genesis and progression of colorectal cancer (CRC) exists. This is one of the reasons for poor prognosis of this increasingly occurring disease. The purpose of our research is to identify molecular candidates for therapeutic intervention that would help stop the malignant progression of CRC.MethodsIn this study, we performed bioinformatic analysis of the combined microarray data GSE40967 and GSE8671 in Gene Expression Omnibus (GEO) and the CRC RNA-seq dataset in The Cancer Genome Atlas (TCGA). This analysis consisted of 1067 CRC samples and 92 normal mucosae in total. We identified the common differentially expressed genes (DEGs) using R and Venn software. Protein–protein interaction (PPI) network information was obtained using the Search Tool for the Retrieval of Interacting Genes (STRING) Database. The Cytoscape plug-ins MCODE and cytoHubba were used to screen hub modules. CXCL8, which was the selected hub gene, was subjected to functional analysis and Gene Set Enrichment Analysis (GSEA). The algorithm “Cell type Identification By Estimating Relative Subsets Of RNA Transcripts (CIBERSORT)” was employed to estimate the relative abundance of tumor infiltrating lymphocyte cells (TILs) with CXCL8 expression. Finally, correlation analysis was carried out to explore the underlying mechanisms.ResultsWe identified 135 common DEGs, which consisted of 38 upregulated genes and 97 downregulated genes. The upregulated DEGs were strikingly enriched in regulation of neutrophil chemotaxis and cytokine-cytokine receptor interaction. The downregulated DEGs were enriched in bicarbonate transport. We identified the candidate molecule CXCL8 as a product of the hub gene. Functional enrichment analysis and GSEA indicated that CXCL8 may be closely related to a regulator of the CRC immune microenvironment. Finally, TILs analysis and correlation analysis showed that CXCL8 may inhibit CD8+ T cell immune infiltration through the HIF-1α/PD-Ls axis. In addition, CXCL8 could induce the polarization of M2 macrophages through the PI3K/AKT3 pathway.ConclusionsWe determined that hub gene CXCL8 may promote immune escape of CRC by inhibiting CD8+ T cell immune infiltration or promoting polarization of M2 macrophages. CXCL8 is a potential therapeutic target for the treatment of CRC patients.

CRC is one of the most common malignancies of the digestive system. Most CRC is related to old age, gene aberrations, environmental in uence, and lifestyles [1]. However, accumulating evidence has demonstrated that the incidence of CRC in young and middle-aged people has been increasing gradually in recent years [2]. There is a lack of knowledge of the molecular mechanisms underlying CRC progression and uncertainty about the increasing occurrence of the disease and its poor prognosis. Therefore, clarifying the roles of the key molecules and the pathogenesis involved in the progression of CRC is extremely important for the development of effective diagnostic and therapeutic strategies.
In recent years, the high-throughput data from the analysis of gene expression have been archived and deposited in public databases. This includes for example data from platforms such as microarrays. Two databases-GEO and TCGA-serve to collect data from different microarray platforms. More and more researchers have begun to use bioinformatics to discover the precise molecular mechanisms of disease by data mining the many gene expression pro les [3][4][5]. These comprehensive analyses helped reveal the precise molecular mechanism of tumor pathogenesis and progression. They also provided a theoretical basis for effective diagnostic and therapeutic strategies [6].
In the present study, bioinformatics methods and techniques were used to analyze and integrate the mRNA expression data of CRC from GEO and TCGA. We determined CXCL8 as the hub molecule that might regulate the microenvironment of CRC using PPI information, hub module analysis, and functional enrichment analysis. In addition, we explored the underlying mechanisms of CXCL8 to regulate the immune microenvironment of CRC based on the algorithm CIBERSORT and correlation analysis. This study provides a reliable way to investigate the mechanism of in ammatory factors mediating immune escape of CRC and to search for potential therapeutic targets for clinical diagnosis or treatment.

Microarray and RNA-seq Data
The National Center for Biotechnology Information (NCBI)-GEO (www.ncbi.nlm.nih.gov/geo/) is a free public database of microarray pro les. We used GEO to obtain the gene expression pro les of GSE40967 and GSE8671 in CRC samples and normal tissue. The microarray data of GSE40967

Data Preprocessing of DEGs
After CRC data were downloaded, probe identi cation numbers were transformed into gene symbols using the Perl software. If multiple probes corresponded to one gene, the average of multiple expression was taken as the gene expression level. Gene expression values were subsequently normalized using the R software.
The aberrantly expressed genes in normal and cancer specimens were analyzed by R software. DEGs were screened by using the criterion |log 2 (Fold Change)| > 1.5 and adjusted P<0.05. The raw data were checked online in Venn software (bioinfogp.cnb.csic.es/tools/venny/) to detect common DEGs among the three datasets. The DEGs with log 2 (Fold Change) < 0 were considered as down-regulated genes, while the DEGs with log 2 (Fold Change) > 0 were considered as up-regulated genes.
A heatmap of DEGs between CXCL8 high-expression group and CXCL8 low-expression expression samples in the TCGA CRC dataset was completed using the "heatmap" package in R software [7].

Gene Ontology and Pathway Enrichment
Gene ontology (GO) analysis is a useful method for annotating genes and identifying characteristic biological properties of genome and transcriptome data [8]. Kyoto Encyclopedia of Genes and Genomes (KEGG) is a collection of databases dealing with genomes, diseases, biological pathways, and chemical materials. The functional annotation tool of Visualization and Integrated Discovery (DAVID) Bioinformatics Resources 6.8 (david.ncifcrf.gov/) was used to verify the remarkable enrichment of gene function and pathways in the resulting dataset. A Q value < 0.05 was set as signi cant.

Integration of PPI Network and Module Analysis
PPI information can be evaluated online using the STRING tool (string-db.org/) [9]. We mapped the DEGs using the STRING website. A minimum required interaction score > 0.4 was selected as the signi cant threshold. PPI networks were constructed using the Cytoscape software [10]. MCODE plug-in was used to nd highly interconnected subgraphs as molecular complexes or clusters of PPI networks. The parameter criteria were set as: MCODE scores > 3 and number of nodes > 4. Finally, the cytoHubba plug-in of Cytoscape was utilized to explore PPI network hub genes. This provides a user-friendly interface to explore important nodes in biological networks and computes using eleven methods. Out of these methods, Maximal Clique Centrality (MCC) had a better performance in the PPI network [11].
Gene set enrichment analysis GSEA (software.broadinstitute.org/gsea/index.jsp) is a computational evaluation of whether the gene expression differences across biological samples among certain gene sets reach statistical signi cance [12]. 478 samples of CRC from TCGA were divided into two groups according to the median values of the expression of the CXCL8 gene: CXCL8 high-expression group vs CXCL8 low-expression group. GSEA was used to screen the biological processes and pathways relevant to CXCL8. A False Discovery Rate (FDR) < 0.05 and a Q value < 0.05 were set to be signi cant.

Tumor in ltrating lymphocyte cell analysis
The CIBERSORT algorithm is highly sensitive and speci cally discriminates 22 human immune cell phenotypes using support vector regression [13]. It has already been used for the construction of immunoscore models in several cancers [14][15][16]. The CXCL8 high-expression group (239 samples) and CXCL8 low-expression group (239 samples) divided as described above were used for the following analysis. The CIBERSORT algorithm was used to estimate the relative abundance of 21 TILs with CXCL8 expression using R software. P < 0.05 was considered to be statistically signi cant.

Correlation analysis
The corrplot package was used to calculate the correlation between different genes. The darkest red means the maximum positive correlation coe cient, and the darkest blue means the maximum negative correlation coe cient.
Statistical analysis of Scatter diagram was completed by SPSS 19.0 software package and Prism 8.0. P < 0.05 was considered statically signi cant.

Prediction of transcription factor binding site
The promoter region of PD-L2 was identi ed using resources available at the NCBI website (www.ncbi.nlm.nih.gov/). First, we searched for the promotor region of PD-L2 at NCBI-GENE (www.ncbi.nlm.nih.gov/gene/). Then we predicted the transcription factor HIF-1α binding site in promotor region of PD-L2 using Jaspar (jaspar.genereg.net/).

Identi cation of DEGs in CRC
There were 1076 CRC samples and 92 normal colorectal mucosae in our present study. We extracted 434, 618, and 1395 DEGs from the GSE40967, GSE8671, and TCGA CRC samples, respectively, using the limma package and Student's t-test. Venn diagram software was used to identify the common DEGs in the three datasets. 135 DEGs were detected, which included 38 up-regulated genes and 97 down-regulated genes in the CRC tissues and normal colorectal mucosae (Table 1, Fig.1a and b).

Gene ontology and KEGG pathway analysis of DEGs
All 135 of the DEGs were analyzed by DAVID 6.8 to identify GO categories and KEGG pathways. The results of the GO analysis indicated the following: 1) for the biological process (BP), up-regulated DEGs were particularly enriched in "positive regulation of neutrophil chemotaxis", "chemokine-mediated signaling pathway", "hair follicle development", "in ammatory response", and down-regulated genes were enriched in "bicarbonate transport"; 2) for molecular function (MF), up-regulated DEGs were enriched in "chemokine activity", "CXCR chemokine receptor binding", and down-regulated genes were enriched in "carbonate dehydratase activity", and "chloride channel activity"; 3) for the GO cell component (CC), upregulated DEGs were particularly enriched in "extracellular space", "extracellular region", and down-regulated genes were enriched in "apical plasma membrane", and "brush border membrane" ( Table 2, Fig.1c and d). Fig.1e shows the most signi cantly enriched pathways of the up-regulated DEGs and downregulated DEGs analyzed by KEGG analysis. The up-regulated DEGs were particularly enriched in regulation of "cytokine-cytokine receptor interaction", "Tumor Necrosis Factor (TNF) signaling pathway", "NOD-like receptor signaling pathway", and "chemokine signaling pathway", and down-regulated genes were enriched in "retinol metabolism", "bile secretion", "nitrogen metabolism", and "mineral absorption".

PPI network and modular analysis
We used the STRING online database and Cytoscape software to look at the PPI network. All 135 of the DEGs were imported into the PPI network complex, which contained 88 nodes and 152 edges (Fig.2a), 49 genes did not fall into the PPI network. We applied MCODE for further analysis and it showed that the signi cant module (8 nodes, 25 edges, Fig.2b) from the PPI network was selected. Moreover, a total of 135 nodes and 152 edges were analyzed using the cytoHubba plug-in of Cytoscape. We used the top ten genes for the analysis in each of the eleven methods. The results of ve methods, namely, MCC, Maximum Neighborhood Component (MNC), degree, Edge Percolated Component (EPC), and closeness, indicated that their rst ranked gene was CXCL8 ( Fig.2c-g).

Enrichment analysis of CXCL8 in the TCGA CRC dataset
We detected the DEGs between the CXCL8 high-expression and CXCL8 low-expression patients (Fig.3a). 1229 DEGs were enriched using the DAVID database for KEGG pathway and GO functional enrichment analysis to make a rst pass at understanding the biological relevance of CXCL8. The genes were mainly enriched in "cytokine secretion", "cytokine-cytokine receptor interaction", "immune response", in addition to accumulating "neutrophil chemotaxis" (Fig.3b and c). Furthermore, GSEA was also applied to look at how the biological process was enriched in CXCL8 highly expressed samples. Similarly, we analyzed several gene sets that may be associated with immune microenvironment. GSEA analysis not only enriched the gene sets related to "cytokine-cytokine receptor interaction", but also enriched many immune cell receptor signaling pathways ( Fig.3d and Fig.4). Therefore, it can be speculated that the function of CXCL8 in CRC tissues may be closely related to its immune microenvironment.
CXCL8 expression negatively correlates with the immune in ltration of CD8 + T cell and positively correlates with the immune in ltration of M2 macrophages Analysis was performed to explore the mechanism by which CXCL8 regulates the immune microenvironment of CRC. We estimated the relative abundance of 21 TILs between CXCL8 high-and lowexpression samples of TCGA using the CIBERSORT algorithm. The results showed that CD8 + T cell exhibited low in ltration in the CXCL8 high-expression group, while M2 macrophages were enriched in the CXCL8 high-expression group (Fig.5b). We also observed that the CXCL8 high-expression samples of GSE40967 had lower abundance of CD8 + T cell and higher abundance of M2 macrophages (Fig.5c).
CXCL8 may through HIF-1α/PD-Ls axis inhibited immune in ltration of CD8 + T cell Next, we analyzed the mechanism by which CXCL8 regulates the immune microenvironment by inhibiting CD8 + T cell in ltration. Immunotherapy currently relies on the activation of the anti-tumor effect of T lymphocytes to improve the therapeutic effect. Many researchers have focused on the function of immune inhibitory signaling molecules, including PD-L1 and PD-L2, which are highly expressed on tumor cells and one of the reasons for evasion of the immune system [17,18]. In this study, Spearman analysis demonstrated that CXCL8 was positively correlated with PD-L1 (CD274) and PD-L2 (PDCD1LG2) (Fig.6a-c). This indicated that CXCL8 inhibited T cell proliferation and that activation may proceed through effects on the expression of PD-L1 and PD-L2. Therefore, we analyzed the transcriptome data of TCGA to explore the upstream transcription factors that regulated PD-L1 and PD-L2 expression. The correlation analysis showed that HIF-1α was the highest positively correlating entity with CXCL8 ( Fig.6d and e). In addition, HIF-1α also positively correlated with PD-L1 and PD-L2 ( Fig.6f and g). HIF-1α can regulate the expression of PD-L1 by binding directly to a hypoxia response element (HRE) [19]. The promoter region CCACATGCCT of PD-L2 may be the HIF-1α binding site according to analysis performed using Jasper (Table 4). These predictions provide clues that HIF-1α may regulate PD-L1 and PD-L2 expression. In summary, we preliminarily speculate that CXCL8 may inhibit CD8 + T cell in ltration in the immune microenvironment through the HIF-1α/PD-Ls axis.
CXCL8 may act through the PI3K/AKT3 pathway to induce the polarization of M2 macrophages It has been reported that CXCL8 induces the tra cking of M2 macrophages [20] and that the PI3K/AKT pathway could regulate the polarization of M2 macrophages [21]. We found that the high expression of CXCL8 was positively related to the activation of the PI3K pathway and AKT3 expression ( Fig.6h and i). CXCL8 may therefore promote the polarization of M2 macrophages through the PI3K/AKT3 pathway to exert its immunosuppressive role.

Discussion
CRC is the result of accumulated genetic, epigenetic, and endocrine aberrations [11,22]. The limitations of previous studies are that they focused on a single cohort study or results from a single database [23][24][25]. In the present study, we analyzed two microarray datasets GEO40967 and GEO8671 from GEO and RNA sequencing data form TCGA. This included 1067 CRC specimens and 92 normal specimens and compensated for the limitation of a single database.
Although several studies indicated that CXCL8 can promote the malignant progression of a variety of tumors including CRC [26-30], the function of CXCL8 in the immune microenvironment still needs further research. Recently, researchers found that CXCL8 could reshape the tumor microenvironment and accelerate the malignant progression of tumor through recruiting myeloid-derived suppressor cells [31], stimulating angiogenesis [32], and promoting tumor cell epithelial-mesenchymal transition [27]. However, the in uence of CXCL8 on the CRC microenvironment is still poorly understood and needs further exploration. Further studies are needed if suitable therapies are to be developed for CRC.
Previously, Oliver and colleagues found that CXCL8-derived macrophages can exert immunosuppressive function by inducing macrophages to express PD-L1 in gastric cancer [33]. Furthermore, more recent research indicated that high expression of PD-L1 can lead to CD8 + T cell exhaustion, which may be a key mechanism of tumor immune escape [34]. In this study, we rst indicated that CXCL8 upregulation is negatively associated with CD8 + T cell immune in ltration in CRC. Moreover, evidence showed that a series of molecular features in uenced immunotherapy e cacy, including expression of immune checkpoint molecules [33], in ltration of CD8 + T cell [35,36], oncogenic pathways [37], and tumor-speci c neoantigens [38]. In this study, we found that CXCL8 up-regulation is negatively associated with PD-L1 (CD274) and PD-L2 (PDCD1LG2) Meanwhile, we established that the high expression of CXCL8 was positively correlated with the HIF-1α expression. Furthermore, evidence showed that HIF-1α was positively correlated with the expression of PD-L1 and PD-L2 [39,40]. Therefore, we speculated that CXCL8 may play an immunosuppressive function through the HIF-1α/PD-Ls axis. At present, immunotherapy targeting immunosuppressive checkpoints (such as PD-1/PD-L1) has been applied in the clinic and can improve the treatment of various malignant tumors. Only CRC patients with high microsatellite instability and DNA mismatch repair defects can bene t from this type of treatment [41]. Our research helps provide a better understanding of the mechanism of PD-L1 expression and could improve therapies targeting PD-1/PD-L1.
In addition, we found that high expression of CXCL8 is positively correlated with the degree of M2 macrophage in ltration. The imbalance of M1/M2 polarization of macrophages is usually associated with various in ammatory diseases. M1 macrophages are mainly involved in initiating and maintaining the in ammatory response; M2 macrophages are mainly involved in the regression of in ammation [42]. We can preliminarily speculate that CXCL8 can accelerate the immune escape of CRC cells by promoting M2 macrophages polarization. The polarization of macrophages is regulated by a variety of molecules and pathways, including the PI3K/AKT pathway, Notch pathway, IRF-STAT signaling, TGF-β pathway and TLR4/NF-κB pathway [43]. Our results showed that high expression of CXCL8 was positively related to the activation of the PI3K pathway and AKT3 expression. Therefore, the results indicated that CXCL8 may promote the polarization of M2 macrophages through the PI3K/AKT3 pathway, and induce its immunosuppressive effect on T cells.
We integrated gene expression pro les and sequencing data from datasets of TCGA and GEO. Bioinformatics analysis identi ed CXCL8 as key genes involved in pathways that may play important roles in CRC immune escape. Our results are expected to complement the mechanism of in ammatory factors mediating immune escape of CRC. This is conducive to the search for potential therapeutic targets of CRC and lays a theoretical basis for promoting the application of targets in the clinic. However, there are also limitations to the study. First, our results were only an analysis of microarray or RNA-seq data. The effect of the key molecule CXCL8 on T cells and macrophages needs to be veri ed in CRC samples and tumor cells. In addition, the mechanisms of CXCL8-mediated immune escape of CRC remain unclear. Further studies are needed to determine if CXCL8 can become a potential target of immune therapy. This will require further validation by subsequent experiments.

Page 9/23
This study explored the key role of CXCL8 in the microenvironment of CRC from the perspective of bioinformatics. CXCL8 could regulate the CRC microenvironment by preventing CD8 + T cell in ltration or inducing M2 macrophages polarization by different pathways. CXCL8 may therefore serve as a potential immunotherapy target for CRC. Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable.

Competing interests
The authors declare that they have no competing interests.        The relative abundance of 21 TILs between CXCL8 high-and low-expression samples of CRC. a Group of high-and low-expression of CXCL8 in TCGA CRC dataset. b Violin diagram showed the relative abundance of 21 TILs of CXCL8 in TCGA CRC dataset. Red color represented the CXCL8 high-expression samples, blue color represented the CXCL8 low-expression samples. c Violin diagram showed the relative abundance of 21 TILs of CXCL8 in GSE40967 dataset. Red color represented the CXCL8 high-expression samples, blue color represented the CXCL8 low-expression samples.