Construction of Weighted Gene Co-expression Network Analysis
Based on the standard P < 0.05, | logFC | > 1, we used R package to analyze the TCGA-COAD and GSE44076 data respectively to obtain meaningful differentially expressed genes. The top 50 with the most significant difference were selected to draw the heatmap and volcano map, in which the red part represents the up-regulated genes and the green part represents the down-regulated genes as shown in Fig. 1,2. Subsequently, A gene co-expression network was constructed from TCGA-COAD and GSE44076 data sets using WGCNA software package to find functional clusters in COAD patients. In the case of assigning color to each module, 10 modules of TCGA-COA (Fig. 3A) gain color assignment (remove a gray module that is not assigned to any cluster), and 7 modules in GSE44076 (Fig. 4A) get different colors in this article. Subsequently, we mapped the module feature relationship to assess the association between each module and two clinical features (cancer and normal), The outcomes of the module trait relationship are shown in Fig. 3B,4B. It was found that the brown module (r = 0.86, p = 3e-131) in TCGA-COAD and the turquoise module (r = 0.89, p = 2e-87) in GSE44076 had the positively highest correlation with normal tissues respectively (Fig. 3C,4C).
Identification between DEG Databases and Co-expression Modules
Based on the cut-off criteria of |logFC| ≥ 1.0 and adj. P < 0.05, It was found that 1913 DEGs in TCGA dataset and 206 DEGs in GSE44076 dataset were dislocate in tumor tissues,480 and 417 co-expressed genes were found in the brown module of TCGA dataset and the turquoise module of GSE44076 Separately. Finally, a total of 451 intersection genes were extracted to verify the genes of the co-expression modules (Fig. 4A).
Functional Enrichment Analyses and Identification of Hub Gene
The clusterProfiler software package was used for gene enrichment analysis to further understand the potential functions of 451 genes overlapping with DEG lists and two co-expression modules. After screening the GO enrichment analysis, we observed that biological processes (BP) mainly focus on lipid catabolic process (Fig. 5 and Table 1). The results of cell composition (CC) showed that these genes were mainly involved in the apical part of cell. In molecular function (MF) analysis, phosphoric ester hydrolase activity was considered to be related to 451 genes. Besides, Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analysis of top ten genes in DEGs showed that the genes are mainly related to mineral absorption and PPAR signaling pathway (Fig. 6, Table 2).
Table 1
The GO function enrichment analysis of DEGs in COAD
GO
|
Category
|
Description
|
Count
|
%
|
Pvalue
|
Qvalue
|
GO:0016042
|
BP
|
Lipid catabolic process
|
30
|
0.073
|
1.90E-11
|
6.17E-08
|
GO:0006631
|
BP
|
Fatty acid metabolic process
|
29
|
0.071
|
8.83E-06
|
7.83E-06
|
GO:0006066
|
BP
|
alcohol metabolic process
|
27
|
0.066
|
1.03E-05
|
9.13E-06
|
GO:0044242
|
BP
|
cellular lipid catabolic process
|
23
|
0.056
|
5.02E-07
|
5.02E-07
|
GO:0007586
|
BP
|
digestion
|
16
|
0.039
|
4.98E-08
|
1.62E-05
|
GO:0045177
|
CC
|
apical part of cell
|
39
|
0.114
|
1.01E-13
|
3.31E-11
|
GO:0016324
|
CC
|
apical plasma membrane
|
34
|
0.100
|
1.66E-12
|
2.70E-10
|
GO:0031253
|
CC
|
cell projection membrane
|
25
|
0.073
|
2.20E-07
|
1.19E-05
|
GO:0098862
|
CC
|
cluster of actin-based cell projections
|
19
|
0.055
|
2.84E-09
|
1.85E-07
|
GO:0005903
|
CC
|
brush border
|
18
|
0.052
|
1.67E-11
|
1.81E-09
|
GO:0042578
|
MC
|
phosphoric ester hydrolase activity
|
25
|
0.073
|
1.10E-06
|
1.50E-04
|
GO:0008509
|
MC
|
anion transmembrane transporter activity
|
21
|
0.050
|
1.10E-05
|
6.01E-04
|
GO:0008081
|
MC
|
phosphoric diester hydrolase activity
|
15
|
0.035
|
1.41E-09
|
7.69E-07
|
GO:0016616
|
MC
|
oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
|
14
|
0.033
|
8.74E-07
|
1.50E-04
|
GO:0016616
|
MC
|
oxidoreductase activity, acting on the CH-OH group of donors, NAD or NADP as acceptor
|
14
|
0.033
|
8.74E-07
|
1.50E-04
|
Table 2
The KEGG function enrichment analysis of DEGs in COAD
GO
|
Description
|
Count
|
%
|
Pvalue
|
Qvalue
|
hsa04978
|
Mineral absorption
|
10
|
0.043
|
6.23E-06
|
0.000764207
|
hsa03320
|
PPAR signaling pathway
|
10
|
0.043
|
4.71E-05
|
0.003854135
|
hsa04530
|
Tight junction
|
10
|
0.043
|
2.24E-02
|
0.189207
|
hsa00071
|
Fatty acid degradation
|
9
|
0.038
|
2.55E-06
|
0.000625852
|
hsa00830
|
Retinol metabolism
|
9
|
0.038
|
1.20E-04
|
0.004907811
|
hsa00982
|
Drug metabolism - cytochrome P450
|
9
|
0.038
|
1.88E-04
|
0.005812421
|
hsa04976
|
Bile secretion
|
9
|
0.038
|
9.23E-04
|
0.022638297
|
hsa04936
|
Alcoholic liver disease
|
9
|
0.038
|
1.99E-02
|
0.178525176
|
hsa01212
|
Fatty acid metabolism
|
8
|
0.034
|
1.90E-04
|
0.005812421
|
hsa00983
|
Drug metabolism - other enzymes
|
8
|
0.034
|
1.90E-03
|
0.038794788
|
The PPI network between overlapping genes was established by using STRING database. The hub genes selected in the PPI network shows the MCC algorithm using the CytoHubba plug-in (Fig. 4B). According to the statistics of MCC, the top 20 genes with the highest scores are selected as the hub genes include Acetyl-CoA acyltransferase2 (ACAA2), Acyl-CoA dehydrogenase short chain (ACADS), Fatty acid binding protein 1 (FABP1), Carnitine palmitoyltransferase 1A (CPT1A), Acyl-CoA oxidase 1(ACOX1), Metallothionein 1G (MT1G), Metallothionein 1E (MT1E), Enoyl-CoA hydratase and 3-hydroxyacyl CoA dehydrogenase (EHHADH), Carnitine palmitoyltransferase 2 (CPT2), Acyl-CoA dehydrogenase medium chain (ACADM), Metallothionein 1X (MT1X), Metallothionein 1H (MT1H), PPARG coactivator 1 alpha (PPARGC1A), Acyl-CoA synthetase short chain family member 2 (ACSS2), Metallothionein 2A (MT2A), Metallothionein 1F (MT1F), Carnitine O-acetyltransferase (CRAT), UDP glucuronosyltransferase family 2 member B17 (UGT2B17), Beta-1,3-N-acetylglucosaminyltransferase 6 (B3GNT6), and Mucin 4 (MUC4) as shown in Fig. 4C. After screening 20 hub genes through the CytoHubba plug-in, we verified the expression level of hub gene in patients by convenient online database GEPIA. Compared with normal tissues, 20 hub genes in COAD were significantly down regulated as shown in Fig. 7C,7D and supplement 1,2.
Correlation of Hub Gene Expression and Colon Carcinoma
Kaplan Meier plotter used R survival software package to analyze OS of 20 hub genes to investigate the prognostic value of hub genes in COAD patients. Among the 20 hub genes, Kaplan Meier analysis showed that the low expression level of CPT1A and B3GNT6 was significantly correlated with better OS in patients with COAD (P < 0.05) (Fig. 7A,7B and supplement 3,4). Moreover, according to the UALCAN database, the protein level of B3GNT6 gene in tumor tissues was significantly lower than that in normal tissues (Fig. 8A). There was an obvious correlated among clinical stages, Gender and age with low expression level of B3GNT6 (Fig. 8B, C, D). Also, we applied TIMER to analysis that low expression level of B3GNT6 is significant in different tissue include colon adenocarcinoma, esophageal carcinoma and prostate adenocarcinoma (Fig. 9A). TIMER database was used to analyze the correlation between B3GNT6 expression and immune infiltrating cells in COAD. The results showed that the negative expression of the B3GNT6 gene was related to the expression levels of B cells (r = 0.1708, P < 0.05) and CD4 + T cells (r = 0.0986, P < 0.05) (Fig. 9B). CIBERSORT analysis showed that the expression level of B3GNT6 was related to tumor immune cell infiltration include NK cell activated (p = 0.030), Macrophages M1 (P = 0.006) and Mast cells activated (p = 0.037) (Fig. 9C). All the above observations confirmed that the deregulation of B3GNT6 expression was interrelated with COAD.
Expression of the B3GNT6 Gene in HPA
By searching the HPA database, we entered B3GNT6 as the target gene into the website. The output showed a significant difference, in which B3GNT6 was lightly stained in colon cancer tissues, while it was significantly stained in normal colon tissues (Fig. 10).