3.1 Selection of Differentially Expressed Genes
Three gastric carcinoma (GC)-associated datasets from GEO (GSE13911, GSE19826, and GSE79973) and the TCGA-STAD dataset were screened. Differentially expressed genes (DEGs) between GC and normal tissues were identified in the three GEO datasets with the limma package in R, and the TCGA-STAD data were likewise analyzed for DEGs between GC and normal tissues. Genes with |logFC| > 1 and adjusted p-value (p.adj) < 0.05 were defined as DEGs; volcano plots show upregulated genes in red and downregulated genes in green (Figure 1A–D). Intersecting the DEGs from the three GEO datasets with those from TCGA identified 201 shared DEGs, 65 of which were upregulated (Figure 1E).
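The thresholding and intersection step can be illustrated with a minimal sketch. It assumes each dataset has already been reduced to a table of log fold changes and adjusted p-values (for example, exported from limma), and that upregulated and downregulated genes are intersected separately; the gene names, values, and function names below are invented for illustration and are not taken from the study.

```python
# Hypothetical per-dataset results: gene -> (log2 fold change, adjusted p-value).
# All gene names and numbers are invented for illustration only.

def select_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Return (upregulated, downregulated) gene sets passing both thresholds."""
    up, down = set(), set()
    for gene, (logfc, padj) in results.items():
        if padj < padj_cutoff and abs(logfc) > lfc_cutoff:
            (up if logfc > 0 else down).add(gene)
    return up, down

def shared_degs(datasets):
    """Intersect the up- and downregulated sets separately across datasets."""
    ups, downs = zip(*(select_degs(r) for r in datasets))
    return set.intersection(*ups), set.intersection(*downs)

toy_datasets = [
    {"GENE_A": (2.3, 0.001), "GENE_B": (-1.8, 0.010), "GENE_C": (0.4, 0.001)},
    {"GENE_A": (1.6, 0.020), "GENE_B": (-2.1, 0.003), "GENE_C": (1.5, 0.200)},
    {"GENE_A": (1.2, 0.040), "GENE_B": (-1.1, 0.049), "GENE_D": (3.0, 0.001)},
]
shared_up, shared_down = shared_degs(toy_datasets)
print(sorted(shared_up), sorted(shared_down))
```

In this toy input, GENE_C fails the fold-change cutoff in one dataset and the adjusted p-value cutoff in another, and GENE_D is present in only one dataset, so neither survives the intersection.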
3.2 Screening for Prognostic Genes
RNA-seq data were obtained from the TCGA-STAD project, and the association of the DEGs with overall survival (OS) was assessed by Cox regression analysis. Venn diagram analysis yielded a total of 26 shared genes (Figure 2A). COL12A1 was significantly upregulated in GC and was associated with poor prognosis. We compared OS and disease-specific survival (DSS) between expression groups using the Kaplan-Meier method: OS was shorter in cases with COL12A1 upregulation (p = 0.033, Figure 2B), and DSS was likewise worse (p = 0.02, Figure 2C). These results indicate that high COL12A1 expression is associated with poor prognosis in GC patients. A nomogram is a visual tool for describing individual prognosis or the risk of a clinical event, allowing clinicians to readily estimate a patient's survival probability from a simple graph (11). We combined COL12A1 expression levels with clinical parameters to construct a nomogram for predicting 1-, 3-, and 5-year survival. The results indicated that GC patients with high COL12A1 expression had a poorer survival probability (Figure 2D). Therefore, COL12A1 was selected for further analysis.
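For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate behind such survival curves can be sketched as follows. This is a minimal illustration only: the follow-up times and event indicators are invented, the function name is hypothetical, and the study's actual curves were not produced with this code.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    Returns [(time, survival probability)] at each distinct death time.
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)  # every patient leaves the risk set at their own time
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

# Invented follow-up times (months) for a hypothetical high-expression group
high_times = [5, 8, 12, 12, 20, 24]
high_events = [1, 1, 1, 0, 1, 0]
curve = kaplan_meier(high_times, high_events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the risk set until their last follow-up but do not lower the curve themselves, which is why the estimate drops only at observed death times.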
3.3 Correlation of COL12A1 Expression with Clinicopathological Features
The chi-square test was used to assess whether COL12A1 expression level (high vs. low) differed across clinical characteristics. COL12A1 expression in cancer tissues differed significantly by T stage and histological stage (p < 0.05, Table 1), suggesting a correlation between COL12A1 expression and clinicopathological factors.
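As an illustration of the test used here, the sketch below computes a Pearson chi-square statistic for a 2x2 contingency table (expression group by a dichotomized clinical feature). The dichotomization, the counts, and the function name are assumptions made for this example; the actual tables in the study may have more categories, and no continuity correction is applied.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p-value) with 1 degree of freedom.
    """
    n = a + b + c + d
    rows, cols = (a + b, c + d), (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Invented counts: rows = early vs. advanced stage, columns = low vs. high expression
stat, p_value = chi_square_2x2(30, 10, 20, 40)
print(round(stat, 3), p_value < 0.05)
```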
3.4 COL12A1 Overexpression in GC
COL12A1 was upregulated in GC tissues in the GSE13911, GSE19826, and GSE79973 datasets and in the TCGA database (p < 0.001; Figure 3A–D). In all four datasets, COL12A1 was overexpressed in GC samples compared with both paired and unpaired normal samples (Figure 3E–H). Consistent with the differential expression analysis (Figure 1), immunohistochemical staining from the HPA also confirmed higher levels of COL12A1 in tumor tissue than in adjacent normal gastric tissue (Figure 3I). Compared with normal epithelial cells, COL12A1 was upregulated in the GC cell lines AGS, MKN-28, and HGC-27 (p < 0.01; Figure 3J).
3.5 Knockdown of COL12A1 Suppresses the Malignant Phenotype of GC Cells
To elucidate the effect of COL12A1 on GC oncogenesis and development, we knocked down COL12A1 in GC cells. The GC cell lines HGC-27 and AGS were transfected with si-COL12A1, which significantly downregulated COL12A1 expression (Figure 4A). The CCK-8 assay showed that COL12A1 downregulation inhibited the growth of HGC-27 and AGS cells (Figure 4B–C). In addition, Transwell assays showed that COL12A1 downregulation significantly suppressed the invasion and migration of both cell lines (Figure 4D–G).
3.6 Pathway Enrichment Analysis
To further investigate the biological role of COL12A1 in STAD, COL12A1 co-expression was examined using the LinkedOmics database (12). As shown in Figure 5A, 5087 genes were positively correlated with COL12A1, whereas 4252 genes were negatively correlated (FDR < 0.05). Heatmaps show the 50 genes most significantly positively correlated with COL12A1 and the 50 most significantly negatively correlated (Figure 5B–C). We analyzed COL12A1-related genes using Gene Ontology (GO) and KEGG pathway enrichment. At p.adj < 0.1, 221 biological processes (GO-BP), 31 cellular components (GO-CC), 24 molecular functions (GO-MF), and 14 KEGG pathways were enriched. Twelve GO and KEGG entries are displayed in bubble plots, three each for BP, CC, MF, and KEGG. According to the GO and KEGG analyses, COL12A1 co-expression is primarily related to extracellular structure organization and extracellular matrix organization (Figure 5D–E). The GO and KEGG results for genes co-expressed with COL12A1 are additionally visualized as string plots (Figure 5F) and circle plots (Figure 5G–H).
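The FDR and p.adj thresholds above depend on a multiple-testing correction. The text does not name the method, so the sketch below assumes the widely used Benjamini-Hochberg procedure; the input p-values and the function name are invented for illustration.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity of the result.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Invented raw p-values for four hypothetical enrichment terms
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
enriched = [i for i, p in enumerate(adj_p) if p < 0.1]
print([round(p, 4) for p in adj_p], enriched)
```

Under the assumed procedure, a term is retained at p.adj < 0.1 when its adjusted value falls below the cutoff, as the `enriched` index list shows for the toy input.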
3.7 Alterations in the COL12A1 Gene among STAD Cases
In total, 765 STAD cases from three datasets in the cBioPortal database were evaluated: TCGA, Firehose Legacy; OncoSG, 2018; and Pfizer and UHK, Nat Genet 2014 (13). COL12A1 was altered in 12% of STAD cases (Figure 6A), and missense mutations were the predominant mutation type (Figure 6B). The COL12A1 mutation types were further assessed in the COSMIC database (14). A pie chart of the mutation types shows missense substitutions in approximately 33.29%, synonymous substitutions in 10.02%, and nonsense mutations in 2.91% of the samples (Figure 6C). The main substitution mutations were G>A (27.58%), followed by C>T (25.18%), G>T (16.51%), and C>A (10.52%) (Figure 6D).
3.8 Visualization of the Relationship between Methylation Levels and COL12A1 Expression
The MethSurv database was used to examine DNA methylation of COL12A1 and the prognostic value of individual CpG sites (15). There were 41 methylated CpG sites, of which cg13319757, cg13395133, and cg16633701 had the highest levels of DNA methylation (Figure 7). The 17 CpG sites whose methylation levels were prognostically significant were cg03503642, cg03564793, cg04504006, cg04611812, cg08009622, cg11353250, cg11526848, cg12488810, cg12801474, cg13319757, cg13395133, cg14375912, cg14375912, cg15089846, cg21112099, cg24897255, and cg26997327 (p < 0.05; Table 2). High COL12A1 methylation at these CpG loci was significantly associated with shorter overall survival.
3.9 Correlation between COL12A1 and Methylation-Related Genes
To further explore the role of methylation, we analyzed the correlation of COL12A1 with methylation-related genes using TCGA data. ALKBH5, FTO, HNRNPA2B1, HNRNPC, IGF2BP1, IGF2BP2, IGF2BP3, METTL3, METTL14, RBMX, RBM15, RBM15B, VIRMA, WTAP, YTHDC1, YTHDC2, YTHDF1, YTHDF2, YTHDF3, and ZC3H13 were among the 24 methylation-related genes of interest. The heatmap of the association between COL12A1 expression and these genes is displayed in Figure 8A. As shown in Figure 8B–J, COL12A1 expression was positively correlated with FTO, VIRMA, METTL14, and YTHDF3 and negatively correlated with HNRNPC, YTHDF1, HNRNPA2B1, YTHDC1, and RBMX. In particular, COL12A1 expression showed a marked positive correlation with FTO (r = 0.345, p < 0.001) and a marked negative correlation with RBMX (r = -0.271, p < 0.001). These findings support the notion that COL12A1 is associated with methylation.
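The r values reported here are correlation coefficients between two expression vectors. The text does not state whether Pearson or Spearman correlation was used; the sketch below assumes Pearson correlation, and the expression values and variable names are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Invented expression values for five hypothetical samples
col12a1_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
other_gene_expr = [1.2, 1.9, 3.2, 3.8, 5.1]
r = pearson_r(col12a1_expr, other_gene_expr)
print(round(r, 3))
```

A positive r indicates that the two genes tend to rise together across samples, and a negative r indicates that one tends to fall as the other rises.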
3.10 COL12A1 is Associated with Infiltration of Immune Cells within GC Tissue
Immune infiltration is known to play a role in the tumor microenvironment. The TIMER database was used to assess immune cell infiltration in STAD tissues (16). As shown in Figure 9A, COL12A1 expression was significantly correlated with the infiltration levels of macrophages, CD4+ T cells, dendritic cells (DCs), and neutrophils in GC tissues (p < 0.01), whereas no statistically significant association was seen with B cells or CD8+ T cells (p > 0.05). The correlation of COL12A1 with macrophages was significant (r = 0.402). Independent analyses of immune cell infiltration can provide key clues to the function and mechanism of COL12A1. The cancer samples were therefore divided into a high-expression group (188 samples) and a low-expression group (187 samples). To test for differences in the tumor immune microenvironment, we compared the levels of 24 distinct immune cell types (17) between the two COL12A1 expression groups (Figure 9B). Immune cells with significant correlations (p < 0.01) are presented in a lollipop plot, and these correlations are also visualized in chord diagrams. Macrophages (r = 0.520), Tem (r = 0.409), NK cells (r = 0.388), Th1 cells (r = 0.367), and neutrophils (r = 0.360) differed markedly between the high and low COL12A1 expression groups and were correlated with COL12A1 expression (Figure 9C–D). Overall, COL12A1 expression in GC tissues was associated with the infiltration of multiple immune cell types.
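The high/low grouping can be sketched as a split at the median expression value. The cutoff rule is an assumption: the text does not state how the groups were defined, but assigning samples at or above the median to the high group reproduces a 188/187 split for 375 samples with distinct values. Sample identifiers and expression values below are invented.

```python
from statistics import median

def split_by_median(expression):
    """Split samples into (high, low) groups at the median expression value.

    expression: dict mapping sample ID to expression value.
    Samples at or above the median are assigned to the high group (assumed rule).
    """
    cutoff = median(expression.values())
    high = [s for s, v in expression.items() if v >= cutoff]
    low = [s for s, v in expression.items() if v < cutoff]
    return high, low

# 375 hypothetical samples with distinct, invented expression values
toy_expression = {f"sample_{i}": float(i) for i in range(375)}
high, low = split_by_median(toy_expression)
print(len(high), len(low))
```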
3.11 Drug Sensitivity Analysis
Drug response in cancer is a complex process regulated by various factors (18). Studies have shown that genes highly associated with drug sensitivity are oncogenes, which can serve as direct targets for the relevant drugs (19). The study of genomic markers and anticancer drug sensitivity can therefore provide new drug targets and key predictive biomarkers for improving the sensitivity of cancer treatments (20). RNAactDrug is a comprehensive resource integrating several databases that can be used to explore the link between drug sensitivity and RNA molecular expression (21). We used the RNAactDrug database to determine the link between anticancer drug sensitivity and COL12A1 mRNA at multiple omics levels. As presented in Table 3, palbociclib, geldanamycin, rifocin, eupachlorin, nanaomycin, and nilotinib were negatively associated with COL12A1 mRNA expression. Vorinostat, zibotentan, amuvatinib, alectinib, and ruxolitinib were also negatively associated with COL12A1 mRNA expression, whereas alectinib, ruxolitinib, and cabozantinib were positively correlated with COL12A1 mRNA. Therefore, antineoplastic drugs such as palbociclib, geldanamycin, rifocin, eupachlorin, nanaomycin, and nilotinib may be potential therapeutic agents for GC.