Data resource processing
We downloaded clinical data for a total of 461 COAD patients and 456 gene expression profiles (including 480 tumor tissue expression profiles and 41 adjacent normal tissue expression profiles) from the TCGA database. After integrating the two datasets and removing cases with no prognostic information, mismatched information, or duplicate tumor tissue expression profiles, we obtained 438 COAD tumor tissue samples and 41 adjacent normal tissue samples. The univariate survival analysis of clinical parameters is shown in Table 1. TNM stage was correlated with the OS of COAD patients (log-rank P < 0.001).
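The filtering described above can be sketched as follows. This is a minimal illustration, not the pipeline actually used: the record layout (`clinical`, `expression`) and the function name `integrate` are hypothetical, and the rules for which duplicate to keep and for retaining all normal samples are assumptions. The sketch relies only on the standard TCGA barcode layout, in which the first 12 characters identify the patient and characters 14-15 give the sample type (`01` primary tumor, `11` solid tissue normal).

```python
def integrate(clinical, expression):
    """Match expression samples to clinical records and filter them.

    clinical:   dict patient_id -> {"os_days": number or None, "event": 0/1 or None}
    expression: list of (sample_barcode, expression_value) tuples
    Returns (tumor, normal), two dicts keyed by patient_id.
    """
    tumor, normal = {}, {}
    for barcode, value in expression:
        patient = barcode[:12]        # e.g. "TCGA-AA-0001"
        sample_type = barcode[13:15]  # "01" = tumor, "11" = adjacent normal
        if sample_type == "11":
            # Assumption: normal samples are kept regardless of survival data.
            normal.setdefault(patient, value)
            continue
        if sample_type != "01":
            continue
        info = clinical.get(patient)
        if info is None:
            continue                  # mismatched: no clinical record
        if info["os_days"] is None or info["event"] is None:
            continue                  # no prognostic information
        # Assumption: keep the first tumor profile, drop later duplicates.
        tumor.setdefault(patient, value)
    return tumor, normal
```

Keeping the first duplicate is only one possible rule; averaging duplicate profiles would be an equally plausible choice.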
Table 1
Baseline patient characteristics in TCGA cohort
| Variables | Patients (n = 438) | OS: No. of events | OS: MST (days) | OS: HR (95% CI) | OS: Log-rank P |
| --- | --- | --- | --- | --- | --- |
| Age (years) | | | | | |
| ≤ 65 | 268 | 57 | 3042 | 1 | 0.196 |
| > 65 | 168 | 41 | 1910 | 1.305 (0.871–1.965) | |
| Missing* | 2 | | | | |
| Sex | | | | | |
| Male | 234 | 54 | 2475 | 1 | 0.545 |
| Female | 204 | 44 | NA | 0.884 (0.593–1.318) | |
| TNM Stage | | | | | |
| Ⅰ | 73 | 4 | NA | 1 | < 0.001 |
| Ⅱ | 168 | 28 | 2821 | 2.308 (0.807–6.602) | |
| Ⅲ | 126 | 31 | NA | 4.101 (1.446–11.634) | |
| Ⅳ | 61 | 31 | 858 | 11.355 (4.003–32.208) | |
| Missing# | 10 | | | | |
Notes: Missing*, information of age was unknown in 2 patients; Missing#, information of TNM stage was not reported in 10 patients; TCGA, The Cancer Genome Atlas; OS, overall survival; MST, median survival time; 95% CI, 95% confidence interval; HR, hazard ratio; NA, not available.
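A median survival time is commonly unavailable when the estimated survival curve never falls to 50%, for example when a group has few events. The sketch below illustrates this with a Kaplan-Meier estimate. Both the use of Kaplan-Meier here and this explanation of the NA entries in Table 1 are assumptions, and `km_median` is a hypothetical helper name.

```python
from fractions import Fraction


def km_median(times, events):
    """Kaplan-Meier median survival time.

    times:  follow-up time for each patient
    events: 1 if the patient died at that time, 0 if censored
    Returns the first time at which estimated survival is <= 0.5,
    or None if the curve never falls that far.
    """
    records = sorted(zip(times, events))
    at_risk = len(records)
    survival = Fraction(1)  # exact arithmetic avoids rounding at 0.5
    i = 0
    while i < len(records):
        t = records[i][0]
        deaths = leaving = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(records) and records[i][0] == t:
            deaths += records[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= Fraction(at_risk - deaths, at_risk)
            if survival <= Fraction(1, 2):
                return t
        at_risk -= leaving
    return None
```

With one death among four patients, estimated survival stays at 0.75 and the function returns `None`, which corresponds to an NA entry.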
Differential expression analysis and diagnostic ROC curve analysis based on the TCGA cohort and public databases
We obtained HEPACAM2 gene expression levels in various cancer tissues and normal tissues from the UALCAN and TIMER databases. The expression level of the HEPACAM2 gene was lower in COAD tumor tissue than in normal colon tissue (Fig. 1). We further obtained box plots of HEPACAM2 expression in COAD tissue and normal colon tissue from the GEPIA and MERAV databases, and the results were consistent with this finding (Figure S1A-1B). Meanwhile, HEPACAM2 expression did not differ significantly among COAD patients of different TNM stages (P > 0.05) (Figure S1C).
Based on the TCGA database, we also investigated the differential expression of the HEPACAM2 gene between COAD tumor tissues and adjacent normal tissues. HEPACAM2 expression was higher in adjacent normal tissues than in tumor tissues, and it did not differ among TNM stages (Fig. 2A). The diagnostic ROC curve showed that HEPACAM2 had high diagnostic value in patients with COAD (P < 0.001, area under the curve (AUC) = 0.940, 95% CI = 0.805–0.979) (Fig. 2B).
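For readers unfamiliar with the AUC, it can be read as the probability that a randomly chosen tumor sample has lower HEPACAM2 expression than a randomly chosen normal sample. The sketch below computes it by direct pairwise comparison. It is an illustration only: the function name is hypothetical, and the software actually used for the ROC analysis is not specified here.

```python
def auc_lower_in_tumor(tumor_values, normal_values):
    """AUC for a marker that is expected to be LOWER in tumor tissue.

    Counts the fraction of (tumor, normal) pairs in which the tumor
    value is below the normal value; ties count as half a pair.
    """
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t < n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))
```

An AUC of 1.0 means every tumor sample lies below every normal sample, while 0.5 means the marker does not separate the two groups.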
Finally, we investigated the differential expression and diagnostic value of the HEPACAM2 gene in COAD and CRC using Oncomine datasets based on GEO cohorts. HEPACAM2 was more highly expressed in normal tissue than in tumor tissue in both the Skrzypczak COAD and Hong CRC datasets, and it had high diagnostic value in Skrzypczak COAD (P < 0.001, AUC = 0.896, 95% CI = 0.812–0.980) and Hong CRC (P < 0.001, AUC = 0.976, 95% CI = 0.944–1.000) (Fig. 3).
Validation of the diagnostic value of HEPACAM2 in COAD based on the Guangxi cohort
We collected 30 pairs of tumor tissues and adjacent normal tissues from COAD patients. RT-qPCR showed that the expression level of the HEPACAM2 gene in COAD tumor tissue (0.036942 ± 0.062463) was significantly lower than that in adjacent normal colon tissue (0.167750 ± 0.179779) (P < 0.001). Meanwhile, the diagnostic ROC curve suggested that the HEPACAM2 gene had high diagnostic value in patients with COAD (P < 0.001, AUC = 0.892, 95% CI = 0.805–0.979) (Fig. 2C-2E).
Univariate and multivariate survival analysis of HEPACAM2 gene in COAD
We performed a survival analysis of the HEPACAM2 gene in patients with COAD, dividing patients at the median HEPACAM2 expression value. Patients with high HEPACAM2 expression had better survival than those with low HEPACAM2 expression (Fig. 4A). Based on the univariate survival results for the clinical parameters, we constructed two adjusted models and found that the HEPACAM2 gene was associated with the OS of patients with COAD in both: model 1, adjusted for TNM stage (P = 0.044, HR (95% CI) = 0.643 (0.419–0.988)), and model 2, adjusted for age, sex, and TNM stage (P = 0.038, HR (95% CI) = 0.635 (0.414–0.976)) (Table 2). Finally, the association between the HEPACAM2 gene and the death risk of COAD patients is presented in Fig. 5A-5C. In short, the death risk of COAD decreased as HEPACAM2 expression increased. The HEPACAM2-related nomogram showed that the HEPACAM2 gene contributed to the prediction of COAD OS (Figure S2A).
Table 2
Multivariate survival analysis of HEPACAM2 gene and HEPACAM2 related genes expression in COAD of TCGA cohort
| Gene | Patients (n = 438) | OS (model 0): HR (95% CI) | Crude P* | OS (model 1): HR (95% CI) | Adjusted P# | OS (model 2): HR (95% CI) | Adjusted P& |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HEPACAM2 | | | 0.006 | | 0.044 | | 0.038 |
| Low | 219 | 1 | | 1 | | 1 | |
| High | 219 | 0.560 (0.370–0.846) | | 0.643 (0.419–0.988) | | 0.635 (0.414–0.976) | |
| CLCA1 | | | 0.001 | | 0.008 | | 0.006 |
| Low | 219 | 1 | | 1 | | 1 | |
| High | 219 | 0.499 (0.330–0.755) | | 0.561 (0.366–0.860) | | 0.550 (0.358–0.844) | |
| REP15 | | | 0.023 | | 0.046 | | 0.042 |
| Low | 219 | 1 | | 1 | | 1 | |
| High | 219 | 0.623 (0.414–0.938) | | 0.651 (0.427–0.992) | | 0.654 (0.423–0.984) | |
| B3GNT6 | | | 0.003 | | 0.005 | | 0.004 |
| Low | 219 | 1 | | 1 | | 1 | |
| High | 219 | 0.528 (0.348–0.801) | | 0.540 (0.352–0.830) | | 0.535 (0.348–0.823) | |
Notes: Crude P*, Univariate survival analysis; Adjusted P#, Adjustment for TNM stage; Adjusted P&, Adjustment for Age, Sex, and TNM stage; TCGA, The Cancer Genome Atlas; OS, Overall survival; COAD, Colon adenocarcinoma; HR, hazard ratio; CI, confidence interval; TNM, Tumor Node Metastasis; NA, not available.
Table 3
Joint effects analysis of HEPACAM2 and related genes expression in COAD patients OS
| Group | HEPACAM2 | Related genes | Patients (n = 438) | OS (model 0): HR (95% CI) | Crude P* | OS (model 1): HR (95% CI) | Adjusted P# | OS (model 2): HR (95% CI) | Adjusted P& |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HEPACAM2&CLCA1 | | | | | | | | | |
| A | Low | Low | 117 | 1 | | 1 | | 1 | |
| B | Low | High | 45 | 0.654 (0.333–1.286) | 0.218 | 0.694 (0.351–1.372) | 0.294 | 0.724 (0.365–1.439) | 0.357 |
| C | High | Low | 42 | 0.752 (0.358–1.581) | 0.452 | 0.832 (0.392–1.765) | 0.631 | 0.905 (0.426–1.924) | 0.796 |
| D | High | High | 174 | 0.433 (0.270–0.695) | 0.001 | 0.499 (0.304–0.819) | 0.006 | 0.489 (0.298–0.803) | 0.005 |
| HEPACAM2&REP15 | | | | | | | | | |
| A | Low | Low | 183 | 1 | | 1 | | 1 | |
| B | Low | High | 39 | 0.927 (0.493–1.743) | 0.814 | 0.856 (0.453–1.619) | 0.633 | 0.864 (0.448–1.666) | 0.663 |
| C | High | Low | 36 | 0.685 (0.311–1.510) | 0.348 | 0.815 (0.367–1.809) | 0.615 | 0.823 (0.365–1.856) | 0.639 |
| D | High | High | 180 | 0.515 (0.323–0.820) | 0.005 | 0.568 (0.350–0.923) | 0.022 | 0.560 (0.344–0.910) | 0.019 |
| HEPACAM2&B3GNT6 | | | | | | | | | |
| A | Low | Low | 184 | 1 | | 1 | | 1 | |
| B | Low | High | 35 | 0.724 (0.365–1.439) | 0.357 | 0.413 (0.173–0.984) | 0.046 | 0.397 (0.166–0.947) | 0.037 |
| C | High | Low | 35 | 0.905 (0.426–1.924) | 0.796 | 0.698 (0.314–1.551) | 0.377 | 0.653 (0.291–1.462) | 0.300 |
| D | High | High | 184 | 0.489 (0.298–0.803) | 0.005 | 0.544 (0.340–0.870) | 0.011 | 0.534 (0.333–0.856) | 0.009 |
Notes: Crude P*, Univariate survival analysis; Adjusted P#, Adjustment for TNM stage; Adjusted P&, Adjustment for Age, Sex, and TNM stage; TCGA, The Cancer Genome Atlas; OS, Overall survival; COAD, Colon adenocarcinoma; HR, hazard ratio; CI, confidence interval; TNM, Tumor Node Metastasis; NA, not available.
HEPACAM2 gene mutation and immune infiltration information of COAD
We investigated the mutation status of the HEPACAM2 gene in patients with COAD and found that the mutation frequency was low, although genomic alterations did occur in COAD patients (Fig. 6A-6B). Additionally, we used the TIMER database to analyze possible correlations between the HEPACAM2 gene and the infiltration of different immune cells in COAD. No significant positive association was found between the HEPACAM2 gene and the different immune cells (Fig. 6C).
Correlation analysis and survival analyses of correlated genes in COAD patients
We selected genes whose expression correlated with that of the HEPACAM2 gene in COAD, requiring a correlation coefficient greater than 0.7 in each of three databases: GEPIA, UALCAN, and LinkedOmics. The Venn diagram of the correlation analysis showed that the HEPACAM2 gene was strongly associated with the chloride channel accessory 1 (CLCA1) gene, the RAB15 effector protein (REP15) gene, and the UDP-GlcNAc:betaGal beta-1,3-N-acetylglucosaminyltransferase 6 (B3GNT6) gene (Figure 7 and Figure 8). The survival curves showed that low expression of the CLCA1, REP15, and B3GNT6 genes was associated with worse survival in COAD (Figure 4B-4D). Finally, the multivariable survival analyses indicated that the expression of the CLCA1 gene (Model 1: P = 0.008, HR (95% CI) = 0.561 (0.366–0.860); Model 2: P = 0.006, HR (95% CI) = 0.550 (0.358–0.844)), the REP15 gene (Model 1: P = 0.046, HR (95% CI) = 0.651 (0.427–0.992); Model 2: P = 0.044, HR (95% CI) = 0.654 (0.423–0.984)), and the B3GNT6 gene (Model 1: P = 0.005, HR (95% CI) = 0.540 (0.352–0.830); Model 2: P = 0.004, HR (95% CI) = 0.535 (0.348–0.823)) was significantly associated with the OS of patients with COAD.
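The gene selection step described above amounts to a threshold filter followed by a set intersection, which is what a three-way Venn diagram displays at its center. A minimal sketch follows; the function name and input layout are hypothetical, and how each database reports its correlation coefficients is not covered.

```python
def shared_correlated_genes(correlations_by_db, threshold=0.7):
    """Genes whose correlation with HEPACAM2 exceeds the threshold in every database.

    correlations_by_db: dict database_name -> dict gene -> correlation coefficient
    """
    gene_sets = [
        {gene for gene, r in table.items() if r > threshold}
        for table in correlations_by_db.values()
    ]
    # The intersection keeps only genes that pass the cutoff in all databases.
    return set.intersection(*gene_sets) if gene_sets else set()
```

A gene that exceeds the cutoff in two databases but is missing or weaker in the third is excluded, which makes the selection conservative.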
Joint-effects analysis and comprehensive prognosis analysis
We carried out a survival analysis of each gene combination with the HEPACAM2 gene (Figure 5D-5F). High HEPACAM2 expression combined with high expression of the CLCA1 gene (Model 1: P = 0.006, HR (95% CI) = 0.499 (0.304–0.819); Model 2: P = 0.005, HR (95% CI) = 0.489 (0.298–0.803)), the REP15 gene (Model 1: P = 0.022, HR (95% CI) = 0.568 (0.350–0.923); Model 2: P = 0.019, HR (95% CI) = 0.560 (0.344–0.910)), or the B3GNT6 gene (Model 1: P = 0.011, HR (95% CI) = 0.544 (0.340–0.870); Model 2: P = 0.009, HR (95% CI) = 0.534 (0.333–0.856)) predicted longer OS in COAD patients. In other words, the combination of these highly expressed genes was associated with a reduced risk of death in COAD. However, we also observed that the combination of low HEPACAM2 expression and high B3GNT6 expression was associated with a better prognosis for COAD OS (Model 1: P = 0.046, HR (95% CI) = 0.413 (0.173–0.984); Model 2: P = 0.037, HR (95% CI) = 0.397 (0.166–0.947)).
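The four joint-effects groups (A to D in Table 3) follow from splitting each gene into low and high expression and crossing the two labels. The sketch below assumes a median split in which values equal to the median are labeled low. The tie rule and both function names are assumptions for illustration.

```python
from statistics import median


def dichotomize(values):
    """Label each expression value 'High' or 'Low' relative to the median."""
    cutoff = median(values)
    # Assumption: values equal to the median are placed in the 'Low' group.
    return ["High" if v > cutoff else "Low" for v in values]


# Group letters as laid out in Table 3: (HEPACAM2 level, related gene level).
JOINT_GROUPS = {
    ("Low", "Low"): "A",
    ("Low", "High"): "B",
    ("High", "Low"): "C",
    ("High", "High"): "D",
}


def joint_group(hepacam2_level, related_level):
    """Return the Table 3 group letter for one patient."""
    return JOINT_GROUPS[(hepacam2_level, related_level)]
```

Group A (both low) serves as the reference category, so the hazard ratios reported for groups B to D are relative to it.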
The nomograms of HEPACAM2 and CLCA1, HEPACAM2 and REP15, and HEPACAM2 and B3GNT6 showed that each combination made a greater prognostic contribution to COAD OS than the nomogram based on the HEPACAM2 gene alone (Figure S2B-S2D).
Finally, the risk score model of HEPACAM2 and CLCA1 was constructed using the following formula: (-0.453 × HEPACAM2 expression) + (-0.598 × CLCA1 expression). The risk score model of HEPACAM2 and REP15 was as follows: (-0.453 × HEPACAM2 expression) + (-0.615 × REP15 expression). The risk score model of HEPACAM2 and B3GNT6 was as follows: (-0.453 × HEPACAM2 expression) + (-0.429 × B3GNT6 expression). All the models we constructed showed that the low-risk group had better survival and fewer deaths. The 1-, 3-, and 5-year prognostic AUCs of HEPACAM2 and CLCA1 were 0.547, 0.561, and 0.635, respectively. The 1-, 3-, and 5-year prognostic AUCs of HEPACAM2 and REP15 were 0.559, 0.582, and 0.650, respectively. The 1-, 3-, and 5-year prognostic AUCs of HEPACAM2 and B3GNT6 were 0.550, 0.569, and 0.619, respectively (Figure 9).
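The three risk score formulas share the same form and can be written as one small function. The coefficients below are the ones given in the text; everything else is an assumption made for illustration: the function names, the scale of the expression inputs (not stated here), and the median cutoff used to separate low-risk from high-risk patients.

```python
from statistics import median

HEPACAM2_COEF = -0.453
RELATED_GENE_COEF = {"CLCA1": -0.598, "REP15": -0.615, "B3GNT6": -0.429}


def risk_score(hepacam2_expr, related_expr, related_gene):
    """Linear risk score for HEPACAM2 combined with one related gene."""
    return (HEPACAM2_COEF * hepacam2_expr
            + RELATED_GENE_COEF[related_gene] * related_expr)


def split_by_median(scores):
    """Assumed grouping rule: scores above the median are 'high' risk."""
    cutoff = median(scores)
    return ["high" if s > cutoff else "low" for s in scores]
```

Because all coefficients are negative, higher expression of either gene lowers the score, which agrees with the finding that high expression is associated with better survival.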