Identification of DEGs from the GEO datasets
The raw data of three gene expression profiles (GSE103227, GSE104267, and GSE111260) were downloaded from the NCBI GEO database. Detailed information on these three datasets is listed in Table 1. DEGs between GBM samples and normal samples were screened in each of the three studies and visualized with PCA and volcano plots (Figure 2A and 2B). The number of DEGs obtained from each dataset is shown in Supplementary Table 1. In addition, an integrative analysis across the three datasets was conducted, and the results are shown in Figure 2C. We observed 18 up-regulated and 6 down-regulated DEGs that overlapped across all three microarray datasets.
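The overlap step can be illustrated with a minimal Python sketch. It assumes each dataset has already been reduced to a mapping from gene symbol to direction of change. The helper `overlapping_degs` and the gene names are hypothetical placeholders, not the study's code or data.

```python
def overlapping_degs(*datasets):
    """Return (up, down) gene sets whose direction agrees in every dataset.

    Each dataset is a dict mapping gene symbol -> "up" or "down".
    """
    up = set.intersection(*({g for g, d in ds.items() if d == "up"} for ds in datasets))
    down = set.intersection(*({g for g, d in ds.items() if d == "down"} for ds in datasets))
    return up, down


# Hypothetical toy inputs (placeholders, not genes from the study).
ds1 = {"GENE_A": "up", "GENE_B": "down", "GENE_C": "up"}
ds2 = {"GENE_A": "up", "GENE_B": "down", "GENE_C": "down"}
ds3 = {"GENE_A": "up", "GENE_B": "down", "GENE_D": "up"}

up, down = overlapping_degs(ds1, ds2, ds3)
print(sorted(up), sorted(down))
```

A gene is kept only when it is called in the same direction in every dataset, so GENE_C, whose direction differs between the first two toy datasets, is excluded.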
Integrated analysis of three GEO datasets
Using three statistical methods, a total of 5801, 640, and 2368 DEGs were identified by Fisher’s method, the fixed effects model, and vote counting, respectively. Additionally, 613 genes were identified by all three methods (Supplementary Figure 1A). A heat map of the top 10 hub DEGs across the different microarrays is displayed in Supplementary Figure 1B. These genes, namely centrin 2 (CETN2), marker of proliferation ki-67 (MKI67), ADP ribosylation factor like GTPase 13B (ARL13B), SET domain bifurcated histone lysine methyltransferase 1 (SETDB1), calneuron 1 (CALN1), ELAV like RNA binding protein 3 (ELAVL3), adenylate cyclase 3 (ADCY3), synapsin II (SYN2), solute carrier family 12 member 5 (SLC12A5), and superoxide dismutase 1 (SOD1), were selected as hub genes and subjected to further analysis and validation.
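As an illustration of two of the combination strategies named above, the following Python sketch implements Fisher's method for combining per-study p-values and a simple vote count. It is a sketch under stated assumptions, not the pipeline used in the study: the function names are hypothetical, the p-values are assumed to be independent, and the fixed effects model is not shown.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    The statistic X = -2 * sum(ln p_i) follows a chi-square distribution
    with 2k degrees of freedom under the null hypothesis.
    """
    k = len(pvalues)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in pvalues):
        raise ValueError("p-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    # Closed-form chi-square survival function for even degrees of freedom 2k:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def vote_count(directions):
    """Count how many studies report a gene as up- or down-regulated."""
    return {"up": directions.count("up"), "down": directions.count("down")}


# Hypothetical per-study p-values and directions for a single gene.
combined = fisher_combined_pvalue([0.01, 0.01, 0.01])
votes = vote_count(["up", "up", "down"])
print(combined, votes)
```

Fisher's method rewards consistently small p-values across studies, whereas vote counting only tallies the direction reported by each study, which is one reason the two approaches can return gene lists of very different sizes.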
Survival analysis of hub genes
A total of 167 samples were downloaded from the TCGA-GBM expression dataset. To further analyze the relationship between hub gene expression and the prognosis of GBM, overall survival analysis of the hub genes was performed (Figure 3). For each gene, GBM patients were divided into high- and low-expression groups according to the median expression level. Lower expression of CETN2, MKI67, ARL13B, and SETDB1 was associated with significantly longer survival, whereas higher expression of CALN1, ELAVL3, ADCY3, SYN2, SLC12A5, and SOD1 was associated with better overall survival among patients with GBM.
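The median split and the survival curves it feeds can be sketched in Python as follows. This is a minimal illustration, assuming hypothetical patient identifiers and follow-up times. The text does not state which survival estimator or significance test was used, so the Kaplan-Meier estimator here is an assumption, and ties at the median are assigned to the low group by convention.

```python
from statistics import median


def split_by_median(expression):
    """Assign each patient to 'high' or 'low' by the cohort median expression.

    Values equal to the median go to 'low' (a tie-breaking assumption).
    """
    cutoff = median(expression.values())
    return {pid: ("high" if value > cutoff else "low") for pid, value in expression.items()}


def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times: follow-up durations; events: 1 if death observed, 0 if censored.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        leaving = sum(1 for ti in times if ti == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Hypothetical toy cohort (not TCGA data).
groups = split_by_median({"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0})
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
print(groups, curve)
```

In practice the estimator would be applied separately to the high and low groups of each hub gene and the two curves compared with a significance test.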
GO enrichment and KEGG pathway analysis of hub genes
Functional enrichment analysis was conducted on the GBM prognosis-related genes. The results of GO enrichment analysis indicated that these genes were mainly associated with behavior, sensory organ morphogenesis, and chromosome separation. Additionally, the KEGG pathway analysis revealed that these genes were primarily involved in the longevity regulating pathway, bile secretion, and hemostasis (Figure 4A). Furthermore, all terms were grouped into clusters based on their similarity, yielding a total of 14 clusters of significantly enriched terms (Figure 4B); among these, sensory organ morphogenesis was the most enriched term.
Establishment of PPI network
To understand the potential relationships among the prognosis-related DEGs, a PPI network was constructed. The network comprised 71 nodes and 214 edges (Figure 5). Sixteen DEGs with a degree > 10 were considered hub genes, and their specific degree values are listed in Table 2. The degrees of ELAVL3, HDAC2, and CALB1 were higher than those of the other genes.
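The degree-based selection of hub genes can be illustrated with a short Python sketch. It assumes the PPI network is available as a list of undirected edges; the function names and the toy edge list are hypothetical, and only the degree > 10 threshold comes from the text.

```python
from collections import Counter


def node_degrees(edges):
    """Count the number of distinct interaction partners of each node."""
    # Treat edges as undirected, drop self-loops and duplicate pairs.
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degrees = Counter()
    for edge in unique_edges:
        for node in edge:
            degrees[node] += 1
    return degrees


def hub_nodes(edges, min_degree=10):
    """Return nodes whose degree is strictly greater than min_degree, highest first."""
    degrees = node_degrees(edges)
    return sorted((n for n, d in degrees.items() if d > min_degree),
                  key=lambda n: (-degrees[n], n))


# Hypothetical toy network: one node with 11 partners plus a duplicated edge.
toy_edges = [("HUB", f"N{i}") for i in range(11)] + [("N0", "N1"), ("N1", "N0")]
print(hub_nodes(toy_edges))
```

Collapsing each pair into a `frozenset` ensures that an interaction listed in both directions contributes only once to the degree of each protein.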
The expression and mutation of hub genes
Analysis of hub gene expression in the TCGA GBM data showed that the direction of expression change was consistent with the microarray datasets. Compared with normal samples, the expression levels of MKI67, ARL13B, and SETDB1 were significantly up-regulated in GBM samples, while those of ELAVL3, ADCY3, SOD1, CALN1, SYN2, and SLC12A5 were markedly down-regulated (Figure 6). Information on the hub genes is presented in Table 3. In addition, hub gene mutations in GBM were examined using cBioPortal. MKI67, SLC12A5, and SOD1 exhibited higher mutation frequencies, at 2.2%, 0.7%, and 0.2%, respectively (Supplementary Figure 2A). Meanwhile, approximately 3% of GBM clinical cases showed significant alterations in the 10 hub genes (Supplementary Figure 2B).
Immunohistochemical analysis
The protein expression levels of the 10 hub genes in GBM were explored using the HPA database (Figure 7). MKI67 and ARL13B proteins were not detected in normal tissues, whereas their levels were medium and high in GBM tissues. The protein level of CETN2 was low in normal samples but high in GBM samples. Additionally, a medium protein level of SETDB1 was observed in normal tissues, whereas a high protein level was observed in GBM tissues. In contrast, the protein level of CALN1 was medium in normal samples but low in GBM samples. In brief, these results indicated that the protein levels of the hub genes were consistent with their mRNA expression levels.
Analysis of GBM-related small-molecule drugs
Using a screening criterion of p < 0.05, we obtained 98 GBM-related small-molecule drugs from the CMAP database. Of these, 45 drugs might act synergistically in the development of GBM (enrichment > 0), and 53 drugs might repress GBM progression (enrichment < 0). The 10 most relevant drugs (those with the smallest p-values) were selected and are displayed in Figure 8. The results suggested that drugs such as nomegestrol (enrichment=0.975), adiphenine (enrichment=0.783), etiocholanolone (enrichment=0.775), and podophyllotoxin (enrichment=0.835) might be potential drugs for GBM treatment.
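The screening logic described above (a p-value cutoff followed by a split on the sign of the enrichment score) can be sketched in Python. The record layout, the function name `partition_cmap_hits`, and the compound names are hypothetical; this does not reproduce the CMAP output format or the study's results.

```python
def partition_cmap_hits(records, alpha=0.05):
    """Split significant compounds by the sign of their enrichment score.

    records: iterable of (name, enrichment, p_value) tuples.
    Returns (positive, negative), each sorted by ascending p-value.
    """
    significant = [r for r in records if r[2] < alpha]
    positive = sorted((r for r in significant if r[1] > 0), key=lambda r: r[2])
    negative = sorted((r for r in significant if r[1] < 0), key=lambda r: r[2])
    return positive, negative


# Hypothetical toy records (placeholder compounds and scores).
toy_records = [
    ("compound_a", 0.80, 0.001),
    ("compound_b", -0.70, 0.010),
    ("compound_c", 0.60, 0.200),   # fails the p < 0.05 filter
    ("compound_d", -0.90, 0.002),
]
positive, negative = partition_cmap_hits(toy_records)
print([r[0] for r in positive], [r[0] for r in negative])
```

Sorting each group by p-value makes it straightforward to take the most significant compounds from either list.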