Identification of DEGs from GEO datasets analysis
The raw data from three gene expression profiles (GSE103227, GSE104267, and GSE111260) were downloaded from NCBI GEO database. There were 81 GBM samples and 11 normal samples in this study. DEGs between GBM samples and normal samples were screened from three studies, and then visualized by volcano plots and Principal Components Analysis (PCA) score plots (Figure 2A and 2B). Afterwards, the number of DEGs obtained from three datasets is shown in Supplementary Table 1. Furthermore, the Venn diagrams showed that 24 overlapping DEGs were obtained among three datasets, including 18 up-regulated genes (Figure 2Ca) and 6 down-regulated genes (Figure 2Cb).
Meta-analysis of three GEO datasets
By employing three statistical methods, a total of 5801, 640, and 2368 DEGs were identified by Fisher’s method, Fixed effects models, and Vote counting, respectively. Additionally, 613 shared genes were obtained by all three statistical methods and 2357 DEGs were existed in at least two methods (Figure 2D).
Survival analysis of DEGs
In order to clarify the relationship between gene expression and GBM prognosis, we used K-M and log rank test for survival analysis. The clinical data of 167 patients with GBM was downloaded from TCGA, and the overall survival analysis of 2357 DEGs was performed. Finally, we obtained 78 DEGs were significantly connected with the prognosis of GBM (Supplementary Table 2).
GO enrichment and KEGG pathway analysis of prognosis related genes
According to the result mentioned above, the functional enrichment analysis of 78 prognosis-related genes was conducted. Three categories of GO enrichment analysis were performed, including biological process (BP), cellular component (CC), and molecular function (MF). The results indicated that these genes were mainly associated with GO_BP terms such as behavior, sensory organ morphogenesis, and chromosome separation. As for GO_CC terms, genes were primarily enriched in histone deacetylase complex, neuron to neuron synapse, transferase complex, and axon. For MFs, DEGs were particularly related to DNA-binding transcription repressor activity, RNA polymerase II-specific, actin filament binding, and chromatin binding. Additionally, the KEGG pathway analysis revealed that these genes were significantly involved in longevity regulating pathway, bile secretion, insulin secretion, and thyroid hormone signaling pathway (Figure 3A and Supplementary Table 3). Furthermore, all terms were grouped into clusters based on the similarities, and a total of 13 clusters of significantly enriched terms were obtained (Figure 3B), among these, sensory organ morphogenesis was the most enriched term.
Establishment of PPI network
In order to understand the potential relationships between prognostic related DEGs, the PPI analysis was conducted. The PPI network was composed of 71 nodes and 214 edges (Figure 4). A total of 16 nodes with the higher connectivity degrees were screened as hub genes, including ELAV like RNA binding protein 3 (ELAVL3), histone deacetylase 2 (HDAC2), Calbindin 1 (CALB1), cullin 3 (CUL3), synapsin II (SYN2), citron Rho-Interacting serine/threonine kinase (CIT), SH3 and multiple ankyrin repeat domains 2 (SHANK2), solute carrier family 12 member 5 (SLC12A5), superoxide dismutase 1 (SOD1), SET domain bifurcated histone lysine methyltransferase 1 (SETDB1), calneuron 1 (CALN1), cyclase associated actin cytoskeleton regulatory protein 2 (CAP2), ADP ribosylation factor like GTPase 13B (ARL13B), adenylate cyclase 3 (ADCY3), centrin 2 (CETN2), and marker of proliferation ki-67 (MKI67). Additionally, the specific degree values of these genes are listed in Table 2.
The mRNA level and mutation state of hub genes
By analyzing the expression of the hub genes in the TCGA GBM data, we observed that the expression levels of 10 hub genes were consistent with the results of microarray datasets, including MKI67, ARL13B, SETDB1, ELAVL3, ADCY3, SOD1, CALN1, SYN2, and SLC12A5. Notably, compared with normal samples, the expression level of MKI67, ARL13B, and SETDB1 was significantly up-regulated in GBM samples, while ELAVL3, ADCY3, SOD1, CALN1, SYN2, and SLC12A5 were markedly down-regulated (Figure 5 and Table 3). In addition, we also display the K-M curves of hub genes in Supplementary Figure 1. Results showed that CETN2, MKI67, ARL13B, and SETDB1 with lower expression level were related to a significantly longer survival time; meanwhile, high expression of CALN1, ELAVL3, ADCY3, SYN2, ARL13B, SLC12A5, and SOD1 were associated with better overall survival of patients with GBM. The results of prognosis were consistent with the mRNA expression levels of hub genes.
Furthermore, the hub gene mutations in GBM were tested using cBioPortal. The MKI67, SLC12A5, and SOD1 exhibited higher mutation frequencies, and the proportion of them was 2.2, 0.7, and 0.2%, respectively (Supplementary Figure 2A). Meanwhile, approximately 3% of GBM clinical cases showed significant alterations in the 10 hub genes (Supplementary Figure 2B).
Apart from investigating the mRNA level of hub genes, the protein expression levels were also explored using the HPA database. Because the immunohistochemical information of SYN2 was not existed in HPA, we have displayed nine pairs of staining results in Figure 6. The protein level of MKI67 and ARL13B was undetected in normal tissues, while the level of these genes was medium and high in the GBM tissues, respectively. The protein level of CETN2 was low in normal samples, while the level of it was high in GBM samples. Additionally, the medium protein level of SETDB1 was observed in normal tissues, whereas the high protein level was revealed in GBM tissues. Meanwhile, the protein level of CALN1 was medium in normal samples, while was low in the GBM samples. SOD1 moderately expressed in normal tissues but undetectable in GBM tissues, and ELAVL3 and ADCY3 lowly expressed in normal tissues but undetectable in GBM tissues. Moreover, SLC12A5 was undetectable in normal and GBM samples. Thus, CETN2, MKI67, ARL13B, SETDB1, and CALN1 might be potential biomarkers for screening high-risk patients with GBM.
Analysis of GBM-related small molecular drugs
To identify candidate small molecular drugs targeting the gene expression of GBM, all the prognosis-related DEGs were divided into up-regulated and down-regulated groups, which were submitted to the CMAP database. A total of 98 small molecular drugs that closely related to the biological status of GBM were obtained, of which 45 drugs might play potential synergies role in the development of GBM (enrichment score> 0), while 53 drugs might serve repress role in the GBM progression (enrichment score< 0). The top 10 vital small molecule drugs were selected (Figure 7). Among these drugs, cycloserine (enrichment score =-0.844) and 11-deoxy-16,16-dimethylprostaglandin E2 (enrichment score =-0.835) showed highly significant negative correlation and had potential to reverse the carcinoma status of GBM. These identified small molecule drugs with enrichment scores <0 could reverse the abnormal gene expression and serve as potential drugs for GBM treatment.