Analysis of glioma mutation database in TCGA database and characteristics of glioma patients
The single nucleotide variation data of glioma patients were obtained from TCGA database, the data processed by varscan software were selected, and the R software "maftools" package was used for visualization. A total of 488 (96.44%) of the 506 patients had mutations, of which missense mutations accounted for the largest proportion. The mutation types were dominated by single nucleotide polymorphisms. In the SNV classification, C>T was the most important type, and IDH1, TP53 and ATRX had the highest mutation rates, as shown in Fig 1A. The waterfall chart (Fig 1B) showed the specific mutation type and mutation gene ratio in each sample. Interaction diagram (Fig 1C) showed the relationship between two mutant genes, in which green indicated positive correlation and brown indicated negative correlation.
The clinical information of 515 samples was obtained from TCGA database. After sorting out the data, it was found that there were 285 male patients and 230 female patients, and the clinical baseline was detailed in Table 1.
Table 1 Clinical characteristics of 515 patients in TCGA clinical database.
Variables
|
Number(%)
|
Vital status
|
|
Alive
|
406(78.83%)
|
Dead
|
109(21.17%)
|
Age
|
|
≤65
|
483(93.79%)
|
>65
|
32(6.21%)
|
Gender
|
|
Female
|
230(44.66%)
|
Male
|
285(55.34%)
|
Correlation between TMB score and survival outcome
529 glioma transcriptome data downloaded from TCGA were divided into TMB groups, including 243 cases in high TMB group, 276 cases in low TMB group, and 10 cases were unable to match the grouping due to lack of data. It was found that the survival rate of high TMB group was significantly higher than that of low TMB group, and the difference was statistically significant (P<0.001), which confirmed that the survival time of patients in high TMB group was longer. According to the analysis of clinical data, the correlation between TMB scores and age, gender were explored respectively. The results showed that TMB value of glioma patients aged > 65 years was significantly higher than that of glioma patients aged ≤65 years, the difference was statistically significant (P < 0.001), TMB was not associated with gender, shown in Fig 2.
Gene expression differential analysis, GSEA analysis and PPI network of different TMB groups
There were 318 differentially expressed genes between the high and low TMB groups, and the differentially expressed genes were identified by "limma" package, with |log2FC|> 1 and FDR < 0.05 as screening criteria, and the top 10 differentially expressed genes were shown in Table 2. According to the differential gene FDR and its expression multiple, the volcano map was drawn by "ggplot2" package in Fig 3A. The heat map showed the expression level of DEGs by constructing the "pheatmap" package of R software, and the results were shown in Fig 3B. Venny diagram showed 1811 immune related genes and 318 differentially expressed genes, and 26 differentially expressed immune genes were obtained, as shown in Fig 3C. GO enrichment and KEGG pathway analysis of DEGs showed that BP was mainly related to pattern specification process, chromosome segregation, nuclear division and other pathways, while CC was mainly related to spindle, chromosome. Moreover, the MF was mainly related to DNA binding transcription activator activity-RNA polymerase II specific, microtubule binding and extracellular matrix structural constitution. KEGG pathway was related to cell cycle, proteoglycans in cancer, oocytemeios and other signal pathways. The results were shown in Fig 4A, B. In addition, GSEA results showed that the expression of adipocytokine signaling pathway, alanin aspartate and glutamate metabolism, aldosterone regulated sodium reabsorption and other signaling pathways were significantly enriched when TMB level was used as phenotype, as shown in Fig 4C.
Table 2 The top 10 DEGs in glioma
Gene
|
Low group
|
High group
|
LogFC
|
p-Value
|
FDR
|
SFRP2
|
77.20208
|
30.61399
|
-1.33445
|
5.86E-22
|
1.08E-17
|
SELL
|
47.09799
|
22.71912
|
-1.05176
|
2.61E-12
|
3.77E-10
|
USH1C
|
13.91831
|
6.485024
|
-1.1018
|
2.49E-14
|
1.07E-11
|
FSTL5
|
3.620375
|
1.527565
|
-1.24491
|
3.87E-20
|
2.38E-16
|
HPSE2
|
8.994663
|
3.922698
|
-1.19722
|
1.27E-16
|
1.67E-13
|
CHI3L1
|
27.82078
|
189.8005
|
2.770249
|
1.97E-11
|
1.81E-09
|
LTF
|
3.718816
|
24.61306
|
2.726509
|
4.72E-05
|
0.000281
|
FMOD
|
1.543689
|
11.3636
|
2.879966
|
7.54E-07
|
8.79E-06
|
AC004233.2
|
0.111767
|
5.472285
|
5.613576
|
1.50E-05
|
0.000108
|
OTP
|
0.05032
|
0.377971
|
2.909065
|
9.82E-18
|
1.88E-14
|
PPI network of DEGs was constructed by STRING v10, and visualized by Cytoscape. The result was shown in Fig 5A. The cytohubba software of Cytoscape was selected to determine the key nodes in the PPI network, and 10 key genes including CCNB1, AURKA, BUB1, AURKB, CDC20, BIRC5, TTK, CCNB2, CDCA8 and CENPE are obtained. The node degree score was generated according to Cytoscape, and the hub gene score was shown in Figure 5B.
Immune cells in high and low TMB group.
Based on the CIBERSORT algorithm, the proportion of 22 immune cells in each glioma patient is estimated, and the results are displayed in a block diagram, where different colors represent different cell subgroups, as shown in Fig 6A. The differences between the high and low TMB groups in 22 immune cells are illustrated in Fig 6B. Among them, the expression of the low mutation load group in Monocytes cells was significantly higher than that in the high mutation load group, and the difference was statistically significant. There was no significant difference in the expression of the two groups in the remaining cells.
Establishment of risk model
The related genes were obtained by univariate COX and multivariate COX analysis, a total of 9 genes, including CXCL11, GDF15, VEGFA, PLA2G2A, BIRC5, VAV3, IL9, TGFB2 and TNFRSF12A, were involved in the construction of the prognosis model. The survival time of the high expression group of 9 genes was significantly longer than that of the low expression group, and the difference was statistically significant (P < 0.05). The results were shown in Fig 7A. The glioma patients were divided into high-risk and low-risk groups. Comparing the survival time of high-risk patients with that of low-risk patients, it was found that the survival time of high-risk patients was significantly longer than that of low-risk patients, and the difference between the two groups was statistically significant (p<0.05), as shown in Fig 7B. The ROC curve constructed a model by calculating the area under the curve to predict the survival time of patients, and the AUC value of the curve was 0.863, indicating that the model had higher accuracy in predicting the survival time, as shown in Fig 7C.
TIMER database analysis
Immune cell analysis of high TMB and low TMB groups showed that the expression of low mutation load group was significantly higher than that of high mutation burden group in Monocytes cells, and 9 genes including CXCL11, GDF15, VEGFA, PLA2G2A, BIRC5, VAV3, IL9, TGFB2, TNFRSF12A participated in the model construction. The correlation between Monocytes cells and copy numbers of 9 genes was analyzed, and the results were shown in Fig 8A. Kruskal-Wallis analysis showed that among the 9 genes, the copy numbers of CXCL11, CDF15, PLA2G2A, TGFB2, VAV3 were correlated with the content of Monocytes cells.
Multivariate Cox regression analysis was performed on 26 immune-related DEGs by the "Survival" module of TIMER database. Nine genes were identified as highly correlated with the survival time of patients. The expression difference of these immune genes in glioma was verified by analyzing the expression of CXCL11, GDF15, VEGFA, PLA2G2A, BIRC5, VAV3, IL9, TGFB2 and TNFRSF12A in TIMER database. The results showed that sCNAs with different genes had influence on immune infiltration of glioma patients, as shown in Fig 8B. To illustrate the relationship between immune infiltrating cells and glioma survival rate, COX regression equation was used to calculate the expression levels of six immune infiltrating cells. Kaplan-Meier plots showed that the contents of 6 kinds of immune infiltrating cells were related to the survival and prognosis of patients, and the high levels of 6 kinds of cells could prolong the survival time of patients.
The correlation between BIRC5 gene and immune checkpoint
Ten Hub genes were constructed from PPI network and nine genes were constructed from survival model to screen BIRC5 as the target gene. 529 RNAseq data in LGG HTSeq-FPKM format were downloaded from TCGA database, and 1 case was deleted, with 528 cases in total. RNAseq data in FPKM format were transformed into log2, and the correlation between BIRC5 gene in glioma and CD274, PDCD1, CTLA4 immune checkpoints was analyzed by r software "ggplot2" package. The results indicated that BIRC5 gene expression was closely related to glioma prognosis, and BIRC5 gene expression was positively correlated with PDCD1 and CTLA4, but negatively correlated with CD274, Fig 9.
The correlation between BIRC5 expression and prognosis of glioma and clinical features of glioma
The expression data of BIRC5 was extracted, and the columnar difference map of BIRC5 expression was drawn by software "ggplot2". The result was shown in Fig10A, which showed that the expression of BIRC5 in tumor tissue was significantly higher than that in normal tissue. The glioma patients were divided into survival group and death group according to their survival status, and the correlation analysis was carried out between the two groups, as shown in Fig 10B. The results showed that the expression of BIRC5 in death group was significantly higher than that in survival group (p < 0.001), indicating that BIRC5 gene was a risk factor for glioma patients, and the higher the expression of BIRC5, the worse the prognosis of glioma patients. According to the median expression of BIRC5 mRNA, 528 glioma samples were divided into high expression group and low expression group. The relationship between BIRC5 expression and clinical characteristics of glioma patients was expounded. The results showed that BIRC5 expression was related to age of glioma patients. The expression of BIRC5 in patients age > 40 was significantly higher than that in patients age ≤40 (P<0.05) , no significantly in gender, results as shown in Fig 10C-D.
GSEA enrichment analysis of BIRC5 gene
To further understand the biological process and function of BIRC5 in glioma and the related signal pathway, glioma patients were divided into high expression group and low expression group according to BIRC5 expression level. In this study, BIRC5 expression data set. get file and phenotypic data. els file were obtained by Perl language and imported into GSEA4.0.3 software. The result showed that the high expression of BIRC5 mRNA was related to P53 signaling pathway, DNA replication, cell cycle, phosphoinositol signaling system and neuroactive ligand receptor interaction signaling pathway. The results were shown in Fig 11 and Table 3.
Table 3 GSEA analysis of B1RC5 enrichment results
ID
|
ES
|
NES
|
p.adjust
|
FDR
|
CELL_CYCLE
|
0.797
|
2.771
|
0.013
|
0.009
|
P53_SIGNALING_PATHWAY
|
0.723
|
2.296
|
0.013
|
0.009
|
DNA_REPLICATION
|
0.807
|
2.306
|
0.013
|
0.009
|
PHOSPHATIDYLINOSITOL_SIGNALING_SYSTEM
|
-0.577
|
-2.008
|
0.013
|
0.009
|
NEUROACTIVE_LIGAND_RECEPTOR_INTERACTION
|
-0.565
|
-2.349
|
0.016
|
0.011
|