Six vitamin-related genes were screened out to develop a risk signature
To identify vitamin-related genes that were significantly associated with glioma OS, we first performed univariate COX regression analysis based on the TCGA, CGGA, and GSE16011 datasets. There were 482, 354, and 290 vitamin-related genes, respectively, in the TCGA, CGGA, and GSE16011 datasets that were obviously linked with OS of glioma patients (Fig. 1A, Tables S4 − 6). Eleven vitamins-related genes closest linked with the prognosis of glioma patients were selected out from the 193 overlapped genes (Fig. 1A, Table S7) in the three databases according to the permutation importance score by the RSFVH algorithm based on TCGA data (Fig. 1B). To identify an optimal prognostic model and avoid overfitting of the risk signature, the above 11 genes were subjected to multivariate Cox regression analysis with a stepwise model selection to establish an optimal gene combination for constructing the risk signature based on TCGA data. Finally, six genes (POSTN, IRX5, EEF2, RAB27A, MDM2 and ENO1) were filtered out (Fig. 1C). Among the six genes, POSTN, IRX5, RAB27A, MDM2 and ENO1 were unfavorable factors for glioma survival with HR > 1, and EEF2 was a conservatory factor with HR < 1.
Construction and evaluation of the vitamin-related risk signature
After extracting the coefficient values obtained from the multivariate Cox regression analysis on the basis of TCGA data, we calculated individualized risk scores with coefficient-weighted expression levels of six vitamin-related genes with the following formula: risk score = (0.1363 × expression level of POSTN) + (0.2367 × expression level of IRX5) + (− 0.4342 × expression level of EEF2) + (0.2201 × expression level of RAB27A) + (0.2183 × expression level of MDM2) + (0.4314 × expression level of ENO1). Individuals were classified as low- or high-risk based on the median value of the risk scores; consequently, a vitamin-related six-gene risk signature was established. The ROC curve analysis displayed excellent discrimination with the areas under the curve (AUC) of 0.900, 0.936, and 0.874 at 1-, 3-, and 5-year OS, respectively, in the TCGA dataset (Fig. 2A). ROC curve analysis based on the CGGA and GSE16011 datasets also indicated that the vitamin-related risk signature had high sensitivity and specificity in prognostic prediction (Fig. 2B, C). To further examine the prognostic prediction ability of the vitamin-related risk model in glioma, KM analysis was employed to determine the difference of OS between the low- and high-risk groups, which revealed that the low-risk group had a considerably increased OS compared with the high-risk group in the TCGA, CGGA, and GSE16011 datasets (Fig. 2D − F). The distribution of risk scores and survival status displayed in Fig. 2G and H and Figure S1 further confirmed the excellent prognostic prediction potential of the risk model. These results confirmed that the vitamin-related risk signature could act as a reliable prognostic predictor for glioma patients.
Genetic and expression characteristics of the six genes
To better understand the six vitamin-related genes, we analyzed the correlations of the six genes in the TCGA, CGGA, and GSE16011 datasets. We found that POSTN and RAB27A had a significantly positive correlation while EEF2 was negatively correlated with RAB27A in the three datasets (Fig. 3A − C). We next explored the expression of the six genes in glioma and normal brain tissue. GEPIA, the online tool with data sourced from TCGA and GTEx, was adopted to detect the mRNA expression of these genes (22). As show in Fig. 3D, except for EEF2, the expression of the remaining five genes in GBM was higher than that in LGG, and the expression of the six genes in glioma was higher than that in normal brain tissue. We further explored the protein expression of POSTN, EEF2, MDM2, and ENO1 in the HPA dataset; characteristic images are shown in Fig. 3E. However, we did not find IRX5 and RAB27A proteins expression in the HPA database. Consistent with the multivariate Cox regression analysis results, KM survival curves presented that samples with higher expression levels of POSTN, IRX5, RAB27A, MDM2, or ENO1 had worse outcomes and samples with higher expression levels of EEF2 had a more favorable outcome in the TCGA cohort (Fig. 3F − K). Then, we examined the genetic alteration of these risk-associated genes in glioma; among the 794 patients included in the cBioportal for Cancer Genomics database, 66 (8.3%) patients showed genetic alterations. Amplification was the most prevalent genetic alteration (Figure S2).
The correlation of risk scores and clinicopathologic features
We further detected the risk scores in glioma patients classified by WHO grades, IDH status, 1p19q status, MGMT promoter, and subtype. The risk score was significantly increased along with WHO grades in the TCGA, CGGA, and GSE16011 cohorts (Fig. 4A − C), and the remarkably increased risk scores were also observed in IDH wild type (Fig. 4D − F), 1p19q non-codeleted (Fig. 4G, H), MGMT promoter unmethylated (Fig. 4I, J), and mesenchymal subtype (Fig. 4K, L) glioma patients. These data suggested a notable association between the vitamin-related risk signature and clinical molecular features of glioma patients.
Prognostic significance of the risk signature in stratified patients
To comprehensively examine the risk model in glioma, we detected the correlation between risk scores and OS of glioma patients classified by WHO grades, IDH status, 1p19q status, and MGMT promoter. KM analysis showed samples in the low-risk group had an obviously longer OS than those in the high-risk group for LGG patients in the TCGA, CGGA, and GSE16011 datasets (Fig. 5A − C). Consistent results were also found in most stratified samples expect in 1p19q codeleted patients (Fig. 5D − T). These findings demonstrated the powerful potential of the vitamin-related risk model in predicting the outcome of glioma patients.
Functional enrichment analysis
To unearth the vitamin-related risk signature associated biological processes, the GSVA score of KEGG pathways and GO biological processes (c5.bp.v7.1.symbols.gmt) was investigated between the low- and high-risk groups in the TCGA, CGGA, and GSE16011 datasets. The top 30 KEGG pathways and top 30 GO biological processes significantly enriched in the high-risk group (p < 0.05) were screened out. We found that most of the processes were linked with immune and inflammatory responses in the TCGA (Fig. 6A − B), CGGA (Figure S3A − B), and GSE16011 (Figure S4A − B) datasets. To confirm this result, we performed GO and KEGG analyses of the differentially expressed genes between the two groups. The biological processes and pathways enriched in the high-risk group was similar to the results of GSVA in the three datasets (Fig. 6C, D, Figure S3C, D, Figure S4C, D). Additionally, the GSEA analysis of GO biological processes in the three databases was also consistent with the GO analysis (Fig. 7A − C). These data denoted that the vitamin-related risk model may be able to distinguish the immune characteristics of gliomas.
Immune features of the low- and high-risk groups
Considering the role of vitamins in the immune system (11, 12) and the results of the above-mentioned functional enrichment analysis, we speculated that the vitamin-related risk signature may be linked with the tumor immune characteristics of glioma. To verify this conjecture, we first compared the ESTIMATE scores between the low- and high-risk groups. ESTIMATE scores in the high-risk group were significantly higher than that in the low-risk group in the TCGA (Fig. 8A), CGGA (Fig. 8E), and GSE16011 (Figure S5A) datasets. Similarly, the immune and stromal scores were obviously higher in the high-risk group than in the low-risk group in the three datasets; while tumor purity showed the opposite result (Fig. 8B − D, F − H, Figure S5B − D). Correlation analysis presented that the ESTIMATE, immune and stromal scores were all significantly positively correlated with the risk score, but tumor purity was negatively correlated with the risk score in the TCGA, CGGA, and GSE16011 datasets (Fig. 8I − P, Figure S5E − H). These results indicated that the higher the risk score, the lower the purity of the glioma, indicating that the higher the risk score, the more complex the microenvironment of the glioma. We previously confirmed that, in glioma, lower tumor purity indicated a worse prognosis (33). This was accordant with the prognosis prediction of the risk signature of this study, and further confirmed that the vitamin-related risk model was accurate in predicting the tumor purity of glioma and the prognosis of glioma patients.
Immune cells, as an important component in the glioma microenvironment, affect tumor purity of glioma (34, 35). As glioma purity is reduced, the local immune status can be enhanced (33) and the vitamin-related risk score is significantly negatively correlated with tumor purity. We speculated that the risk score may be linked with the distribution of immune cells in the glioma microenvironment. To investigate their relationship, we employed GSVA to estimate signatures of multiple immune cells and compared the distribution of immune cells in low- and high-risk groups. We found that most immune cells, whether innate or adaptive immune cells, were notably more enriched in the high-risk group in the TCGA (Fig. 9A, B), CGGA (Fig. 9C, D), and GSE16011 (Figure S6A, B) datasets. This indicated that the higher the risk score, the more immune cells were enriched in the glioma microenvironment. At the same time, it also suggested that the vitamin-related risk signature can directly predict the local immune status of glioma, highlighting its strong predictive ability. It has been shown that higher immune cells infiltration indicates a worse outcome in both LGG and GBM (36). This further confirmed the accuracy and reliability of the vitamin-related risk signature in predicting the local immune response of the glioma microenvironment and the prognosis of glioma patients.
To further characterize intratumoral immune states, we analyzed the components of immune subtypes in the TCGA, CGGA, and GSE16011 datasets using the immuneSubtypeClassier package. In the three datasets, the C4 subtype was the main component of the high-risk group, while the C5 subtype was mainly concentrated in the low-risk group (Fig. 9E). The OS of tumor patients classified as C4 subtype is worse than that of tumor patients classified as C5 subtype (28), which is consistent with the prognosis of glioma in the low- and high-risk groups differentiated by the vitamin-related risk model. These results indicated that the vitamin-related risk model could not only accurately predict the immune status, but also could exactly reveal the prognosis of glioma patients.
Vitamin-related signature indicates an immunosuppressive microenvironment
In the process of the immune system's elimination of tumor cells, a series of anti-tumor immune response processes are initiated and expanded iteratively; these processes are defined as the cancer-immunity cycle (37). Immune cells play a vital role in tumor clearance in this cycle. However, in glioma, the high-risk group had significantly higher immune cell enrichment than the low-risk group, but the prognosis was worse than that of the low-risk group, indicating that there may be other immunosuppressive mechanisms in the high-risk group. To explore this, we first detected the expression of cancer-immunity cycle inhibitors in the two groups (38). The expression levels of most cancer-immunity cycle inhibitors in the high-risk group were obviously higher than those in the low-risk group in the TCGA, CGGA, and GSE16011 (Fig. 10A, B, Figure S7A) datasets. TGFB1, VEGFA, IL10, ARG1, CSF1, and FGL2 are immunosuppressive factors secreted by glioma cells, while CD70 and FASLG are glioma cell surface immunosuppressive factors (39). We further compared the expression levels of these genes in the two groups in the TCGA, CGGA, and GSE16011 datasets. As presented in Fig. 10C − E, the expression levels of these immunosuppressive factors in the high-risk group were evidently higher than those in the low-risk group in the three datasets.
In addition to cancer-immunity cycle inhibitors, immune checkpoints also play a vital role in tumor immune tolerance. Some immune checkpoint blockers have emerged as powerful weapons in the oncological armamentarium (40). Similar to cancer-immunity cycle inhibitors, the expression of most immune checkpoints in the high-risk group was notably higher than that in the low-risk group in the TCGA (Fig. 11A), CGGA (Fig. 11B), and GSE16011 (Figure S7B) datasets. Since CTLA4, PD1 and PD-L1 blockers have shown clinical benefit in tumor treatment (41, 42), we separately explored their expression in the two groups and their relationship with risk scores in the TCGA and CGGA datasets. As shown in Fig. 11C − N, the expression levels of CTLA4, PD1 and PD-L1 in the high-risk group were significantly higher than those in the low-risk group and significantly positively correlated with the risk scores.
These findings indicated that glioma with high vitamin-related risk scores tended to form an immunosuppressive microenvironment via overexpression of cancer-immunity cycle inhibitors and immune checkpoints. This also explained why the high-risk group had more immune cells infiltration than the low-risk group but their prognosis was worse.
Building and validating a nomogram
To examine whether the prognostic significance of vitamin-related risk model did not depend on other clinicopathological factors, univariate and multivariate Cox regression analyses were conducted in the TCGA dataset We detected that the risk signature was obviously linked with OS even when adjusted by other clinical factors (Fig. 12A, B). The same results were also obtained in the CGGA and GSE16011 datasets (Figure S8A − D).
To develop a clinically utilizable method for predicting the 1-, 3‐, and 5‐year OS of glioma patients, we constructed a nomogram based on the TCGA cohort. The factors of the nomogram included clinicopathological variables (age, gender, WHO grade, IDH status, 1p19q status, and MGMT promoter status) and the vitamin-related risk score (Fig. 12C). Calibration curves were employed to evaluate the accuracy of the nomograms in predicting the OS of glioma patients. The 45-degree line denoted the most accurate prediction. Calibration curves displayed that the nomogram performed excellently in both the TCGA and CGGA cohorts (Fig. 12D, E). The DCA based on the TCGA dataset suggested that the risk score-based nomogram could amplify net benefits and display a broad range of threshold possibility in the prediction of 1-, 3-, and 5-year OS (Fig. 12F − H). Consistent results were also found in the CGGA dataset (Figure S8E − G). The net reduction analyses based on the TCGA and CGGA cohorts further confirmed the results of the DCA (Fig. 12I − K, Figure S8H − J).