A Four-Immune Gene Prognostic Risk Model for Colorectal Adenocarcinoma Based on the TCGA and ImmPort Data Sets

Background: Immune research, a hot spot in the field of oncology, provides new ideas for the diagnosis and treatment of tumors. The histological types of colorectal cancer differ from one another. Adenocarcinoma, the type accounting for the highest proportion of cases, has high research value. This study aims to build an immune gene prognostic risk model for colorectal adenocarcinoma to improve its diagnosis and prognosis prediction. Methods: Differentially expressed immune genes were obtained from gene expression data downloaded from The Cancer Genome Atlas (TCGA) and immune gene data downloaded from the ImmPort database. Univariate and multivariate Cox analyses were used to construct the immune gene prognostic risk model and to evaluate its clinical application potential. The correlation between the model and immune cell infiltration, and the influence of each immune cell type on survival, were analyzed. Results: A total of 5975 differentially expressed genes were obtained, and 497 differentially expressed immune genes were selected by combining them with the immune gene information. Among them, 36 immune genes were associated with prognosis, and 4 immune genes (THRB, IL1RL2, LGR6, LTB4R2) were included in the prognostic risk model. Patients with a higher Risk Score had shorter survival. Compared with gender, age and pathological stage, the model had better predictive potential. In addition, the model was correlated with Macrophages M0, Macrophages M1, T cells follicular helper and NK cells activated. Among them, T cells follicular helper and Macrophages M0 were related to patient survival. Conclusion: We developed a prognostic risk model containing four immune genes, THRB, IL1RL2, LGR6 and LTB4R2, which accurately described patient prognosis and may affect patient survival by influencing the infiltration of Macrophages M0 and T cells follicular helper.


Background
Colorectal cancer is a major focus of oncology research and seriously endangers human health. Its morbidity and mortality rank third and second, respectively, among all cancers, and both are increasing (1). Colorectal cancer has many histological types, including adenocarcinoma, squamous cell carcinoma and undifferentiated carcinoma. These histological types differ greatly in molecular structure, cell type and biological behavior (2,3).
Adenocarcinoma, as the most common type, has high research value.
Immune research, a hot spot in the field of cancer research, provides a new approach to the diagnosis, treatment and prognosis assessment of malignant tumors. Studies have shown that low immune cytotoxicity and the absence of T cell infiltration are associated with poor prognosis in colon carcinoma patients (4,5). In the immunotherapy of colorectal cancer, immune checkpoint inhibitors targeting PD-1 and CTLA-4 are very effective in advanced-stage, mismatch repair deficient (dMMR) colon cancers and have become the standard treatment for dMMR metastatic colorectal cancer (6)(7)(8). The protein encoded by the immune gene CCR8 is a chemokine receptor mainly expressed on regulatory T cells (Tregs), which affects the regulation of monocyte chemotaxis and thymocyte apoptosis and is crucial for CCR8+ Treg-mediated immune suppression (9). Studies have shown that CCR8 is expressed with high specificity in colon cancer, and that inhibition of CCR8 can enhance anti-tumor immunity and prolong patient survival by regulating tumor-resident regulatory T cells (10).
With the progress of high-throughput gene sequencing technology, more and more genes have been put into practical use as specific therapeutic targets and prognostic markers for malignant tumors (11,12).
However, compared with other cancers, immune studies in colorectal cancer are still immature, and the specific immune genes and immune cell infiltration of colorectal adenocarcinoma remain to be explored. Therefore, this study constructed a risk model based on differentially expressed immune genes in patients with colorectal adenocarcinoma and examined the influence of this model on immune cell infiltration in these patients, as well as its clinical application potential.

Methods

Database download
Transcriptome data and clinical data of all patients with colorectal adenocarcinoma were obtained from the TCGA database (https://portal.gdc.cancer.gov/). The ImmPort database (www.immport.org) provided the immune gene data.

Identification of DE genes and DE immune genes
Differentially expressed (DE) gene analysis was performed on the transcriptome data with the R package limma. Genes with |log2FC| > 1 and P value < 0.05 were selected as DE genes for subsequent analysis. DE immune genes were selected by combining the DE genes with the immune gene information.

Construction and evaluation of prognostic risk models
Based on DE immune genes and patients' survival time, univariate Cox analysis was used to determine the DE immune genes associated with survival, with P value < 0.05 as the cut-off. Based on these genes, multivariate Cox analysis was used to determine the optimal prognostic risk model. The Risk Score is calculated from the Cox coefficients and gene expression levels as follows:

\[ \text{Risk Score} = \sum_{i=1}^{N} \text{Coe}_i \times \text{Exp}_i \]

where N, Coe_i and Exp_i represent the number of genes, the coefficient value and the gene expression level, respectively.
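The Risk Score is a weighted sum of gene expression values, with the Cox coefficients as weights. The following is a minimal Python sketch of that calculation, not the authors' R code. The coefficients are the four values reported in the Results of this paper; the function name `risk_score` and the patient expression values are hypothetical and serve only as illustration.

```python
# Cox coefficients of the four-gene model, as reported in the Results.
COEFFICIENTS = {
    "THRB": 0.924903748,
    "IL1RL2": 0.443400271,
    "LGR6": 0.053843604,
    "LTB4R2": 0.428025347,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the sum over model genes of coefficient * expression."""
    missing = [gene for gene in coefficients if gene not in expression]
    if missing:
        raise KeyError(f"missing expression values for: {missing}")
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


# Hypothetical expression values for one patient (illustration only).
patient = {"THRB": 1.0, "IL1RL2": 2.0, "LGR6": 10.0, "LTB4R2": 0.5}
score = risk_score(patient)
print(round(score, 4))
```

Because every coefficient in the model is positive, a higher expression of any of the four genes raises the score.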
The cut-off value of the Risk Score is the median. A patient whose Risk Score is greater than or equal to the cut-off value is classified as high risk; otherwise the patient is classified as low risk. The R packages survival and survminer were used to analyze the survival of the two groups. Receiver operating characteristic (ROC) analysis was used to evaluate the accuracy of the model for predicting 3-year and 5-year survival. An area under the ROC curve (AUC) > 0.6 is considered to indicate predictive value, while an AUC > 0.75 represents excellent predictive ability (13,14).
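The median cut-off and the AUC criterion can be illustrated with a short Python sketch. It rests on two simplifying assumptions that are not taken from the paper: the outcome is treated as a plain binary label (death within the time horizon or not), which ignores the censoring that a time-dependent ROC analysis would handle, and all scores and outcomes below are made-up toy data. The function names are hypothetical.

```python
from statistics import median


def assign_risk_groups(scores):
    """Split patients at the median: score >= median is 'high', else 'low'."""
    cutoff = median(scores)
    groups = ["high" if s >= cutoff else "low" for s in scores]
    return cutoff, groups


def auc(scores, events):
    """AUC as the fraction of (event, non-event) pairs in which the patient
    with the event has the higher score; ties count as half a pair."""
    pos = [s for s, e in zip(scores, events) if e == 1]
    neg = [s for s, e in zip(scores, events) if e == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (illustration only): risk scores and a binary 5-year outcome.
scores = [0.8, 1.5, 2.1, 2.9, 3.4, 4.0]
died_within_5y = [0, 0, 1, 0, 1, 1]

cutoff, groups = assign_risk_groups(scores)
model_auc = auc(scores, died_within_5y)
print(cutoff, groups, round(model_auc, 3))
```

Under the thresholds stated above, an AUC computed this way would be read as having predictive value above 0.6 and as excellent above 0.75.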

Relationship between model and immune cell infiltration
The infiltration of 22 kinds of immune cells was analyzed with the R package CIBERSORT, and the influence of different immune cells on survival was analyzed. The Pearson correlation coefficient test was used to calculate the correlation between the Risk Score and immune cell infiltration.
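The Pearson correlation coefficient between the Risk Score and the infiltration fraction of one immune cell type can be sketched in Python as follows. This is an illustrative sketch, not the authors' R code; the function name `pearson_r` and the toy values are hypothetical, and the significance test that accompanies the coefficient in the paper is not reproduced here.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation: covariance divided by the product of the
    standard deviations (the common 1/n factors cancel out)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Toy data (illustration only): risk scores and the estimated fraction of
# one immune cell type in the same five patients.
risk_scores = [1.0, 2.0, 3.0, 4.0, 5.0]
cell_fraction = [0.10, 0.15, 0.12, 0.22, 0.25]

r = pearson_r(risk_scores, cell_fraction)
print(round(r, 3))
```

A positive r means the cell fraction tends to rise with the Risk Score, and a negative r means it tends to fall.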

Results

Database download
The transcriptome data of 39 normal tissues and 398 colon adenocarcinoma tissues were downloaded from the TCGA database. Duplicate samples from the same patient and samples with a short follow-up time (< 90 days) were removed. As a result, 234 samples were included in further analysis. The patients' clinical information is shown in Table 1.

Identification of DE genes and DE immune genes
A total of 5975 DE genes were obtained, of which 497 were DE immune genes (Fig. 1A and B). The distribution of DE genes and DE immune genes by log2FC and log10 P value is shown in volcano plots (Fig. 1C and D).

Function and pathway enrichment analysis of immune genes
BP and KEGG analysis showed that DE immune genes mainly influenced tumor-related and immune-related processes and pathways. The top ten BP and KEGG terms are shown in Fig. 1E and F (Tables 3 and 4).

Construction of prognostic risk model
Univariate Cox analysis was used to screen for DE immune genes related to patient survival. With P value < 0.05 as the cut-off, a total of 36 genes were obtained (Fig. 2A). By multivariate Cox analysis, four high-risk immune genes were obtained (Fig. 2B): thyroid hormone receptor beta (THRB), interleukin 1 receptor like 2 (IL1RL2), leucine rich repeat containing G protein coupled receptor 6 (LGR6) and leukotriene B4 receptor 2 (LTB4R2). The formula of the Risk Score is as follows: Risk Score = (0.924903748 × Exp of THRB) + (0.443400271 × Exp of IL1RL2) + (0.053843604 × Exp of LGR6) + (0.428025347 × Exp of LTB4R2). The median Risk Score was set as the cut-off value, and the patients were divided into a high-risk group and a low-risk group. There was a significant difference in survival between the two groups (P value = 1.625e-03) (Fig. 2C). The ROC curves show that the AUC values of the model at 3 years and 5 years are 0.731 and 0.827, respectively, which indicates that the model has good predictive ability (Fig. 2D). The heatmap of gene expression, the Risk Score distribution map and the survival state map of the two groups are shown in Fig. 2E, Fig. 3A and Fig. 3B.

Independent prognostic ability of the model
According to univariate and multivariate Cox analyses, the survival of patients with colorectal adenocarcinoma was correlated with Risk Score and pathological stage (Fig. 3C and D). The Risk Score had a P value of 0.004 and a 5-year AUC of 0.800, indicating good independent predictive ability (Fig. 3E).

Correlation between genes in model and clinical characteristics
As THRB expression increased, N stage and pathological stage showed an increasing trend (Fig. 4B and C). LGR6 had the same relationship with age, M stage and pathological stage (Fig. 4D and F). In addition, IL1RL2 was positively correlated with age (Fig. 4A).

Relationship between model and immune cell infiltration
The infiltration of 22 kinds of immune cells (Fig. 5A) was analyzed. Among them, the infiltration of 16 kinds of immune cells differed between colorectal adenocarcinoma tissues and normal tissues (Fig. 5B). The correlations between different immune cells are shown in Fig. 5C. Macrophages M0 was positively correlated with Risk Score (r = 0.247, P = 0.013), while Macrophages M1 (r = -0.213, P = 0.032), NK cells activated (r = -0.206, P = 0.039) and T cells follicular helper (r = -0.310, P = 0.002) were negatively correlated with Risk Score (Fig. 6A-D). In the survival analysis of these four kinds of cells, Macrophages M0 was negatively correlated with survival, while T cells follicular helper showed a positive correlation (Fig. 6E and F). This suggests that the model might affect patient survival by affecting the infiltration of Macrophages M0 and T cells follicular helper in patients with colorectal adenocarcinoma.

Discussion
Colorectal cancer ranks third in morbidity and second in mortality among malignant tumors, with about 1.8 million new cases and about 550,000 deaths in 2018 (1). Further research on colorectal cancer is therefore needed. Studies have shown that the immune system greatly influences the occurrence and development of tumors, and tumor immunotherapy is increasingly being put into clinical application with good results (15)(16)(17). However, although adenocarcinoma is the most common histological type of colorectal cancer, immune research specific to colorectal adenocarcinoma is still scarce, so the influence of immune genes and immune cells on colorectal adenocarcinoma has high research value. In this study, we developed a four-gene prognostic risk model for colorectal adenocarcinoma that can be used as an independent risk predictor of patient survival. Compared with age, gender and pathological stage, this model has higher prediction accuracy. Immune cell infiltration analysis also showed that two kinds of immune cells related to survival were correlated with the model, which suggests that the model could affect patient prognosis through immune infiltration.
In the analysis of differentially expressed genes between normal tissues and adenocarcinoma tissues, 497 immune genes were obtained. Enrichment analysis of BP and KEGG pathways showed that these genes affected immune response, signal transduction and other biological processes, as well as cytokine-cytokine receptor interaction, the chemokine signaling pathway and other pathways. These results provide a reference for further basic experiments. Among the 497 genes, 36 immune genes were associated with survival.
Finally, four immune genes (THRB, IL1RL2, LGR6 and LTB4R2) were included in the risk model. The protein encoded by THRB is one of the receptors of thyroid hormone and mediates the biological activity of thyroid hormone (18). Studies have shown that nuclear expression of THRB is associated with poor prognosis in breast carcinoma patients (19). IL1RL2 is expressed in intestinal T lymphocytes, where it induces the proliferation of CD4+ lymphocytes, and it has low expression in the colon of patients with Hirschsprung's disease, resulting in increased inflammation and changes in mucosal healing (20,21).
LGR6 encodes a glycoprotein hormone receptor, which is highly expressed in colon carcinoma and has a tumor-promoting effect. It can be used as an independent risk factor and a biomarker for the diagnosis and prognosis of colon carcinoma (22,23). LTB4R2, a member of the G protein-coupled receptor (GPCR) family, greatly influences the progression of many diseases, including tumors and asthma (24,25). Research has shown that LTB4R2 promotes the invasion and metastasis of bladder carcinoma through a reactive oxygen species-linked pathway (26).
To estimate the risk predictive capacity of the model, we used univariate Cox analysis to calculate the correlations among clinical variables, Risk Score and survival. The results showed that pathological stage and Risk Score could be regarded as independent risk predictors. Further multivariate Cox analysis showed that, compared with other clinical variables, the Risk Score had better predictive ability. In addition, the genes in this risk model were correlated with clinical parameters. The expression of IL1RL2 increased with age. Meanwhile, THRB was positively correlated with N stage and pathological stage. LGR6 showed the same relationship with age, M stage and pathological stage.
Immune cell infiltration is an important part of tumor immunity, which affects the treatment and prognosis of various cancers (27)(28)(29). In this study, we found that the Risk Score of the model was correlated with four kinds of immune cells: Macrophages M0 was positively correlated, while Macrophages M1, T cells follicular helper and NK cells activated were negatively correlated. Survival analysis showed that a high level of Macrophages M0 was associated with poor prognosis, while a higher level of T cells follicular helper was associated with longer survival. Qun Zhang et al. reported that in ovarian cancer, the apoptosis of tumor cells stimulated the differentiation of Macrophages M0 to M2 and promoted the proliferation and migration of the tumor, which provided a possible chemotherapy scheme for patients with ovarian cancer (30). In addition, a higher level of T cells follicular helper is associated with a better prognosis in patients with colon and breast cancer (31)(32)(33). In this study, a high Risk Score corresponded to higher Macrophages M0 and lower T cells follicular helper levels, which have adverse effects on patient survival. However, the model derived from this analysis needs to be further verified with clinical samples and basic experiments to establish its exact impact on the prognosis of patients with colorectal adenocarcinoma.

Conclusion
We developed and validated a four-immune gene model of colorectal adenocarcinoma, including THRB, IL1RL2, LGR6 and LTB4R2. This model could serve as a useful tool in the prognosis prediction of colorectal adenocarcinoma.

Availability of data and materials
The data used to support the findings of this study are included within the article. The data and materials in the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare no conflicts of interest.