Basic characteristics
This study analysed data from 2,322 medulloblastoma patients registered in the SEER database between 2000 and 2019. Table 1 presents the demographic features of the patients: 869 (37.42%) were female and 1,453 (62.58%) were male. By race, 185 patients (7.97%) were Black, 1,939 (83.51%) were White, and 198 (8.53%) belonged to other races. By histological subtype, 329 patients (14.17%) had desmoplastic/nodular medulloblastoma (DMB), 1,866 (80.36%) had medulloblastoma, not otherwise specified (MB, NOS), and 127 (5.47%) had large-cell/anaplastic medulloblastoma (LC). In terms of surgical interventions, 1,616 patients (69.60%) underwent total resection, 244 (10.51%) underwent subtotal resection, 343 (14.77%) underwent local excision or biopsy, and 119 (5.12%) did not undergo surgery. Overall, 1,849 patients (79.63%) received chemotherapy and 1,766 (76.06%) underwent radiation therapy; 713 patients (30.71%) died. The cutoff values for age and tumor size were determined using X-tile analysis (Fig. 2). Specifically, 324 patients (13.95%) were ≤ 3 years old, and 1,998 (86.05%) were older than 3 years. Regarding tumor size, 314 patients (13.52%) had tumors ≤ 3.4 cm, 1,269 (54.65%) had tumors > 3.4 cm, and the tumor size was unknown for 739 (31.83%).
To develop and evaluate the predictive models, the complete dataset was randomly partitioned into two mutually exclusive subsets: 70% for the training set (1,625 patients) and the remaining 30% for the test set (697 patients). The models were fitted on the training set, and their accuracy was estimated on the test set. No statistically significant differences in characteristics were found between the two sets (Table 1), and survival outcomes also did not differ between them (Fig. 3).
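The 70/30 partition described above can be illustrated with a minimal sketch. It assumes a simple shuffled split with a fixed seed; the source does not state the software or seed used, and the function name and placeholder identifiers below are hypothetical.

```python
import random

def split_train_test(ids, train_fraction=0.7, seed=0):
    """Randomly partition ids into two mutually exclusive subsets."""
    shuffled = list(ids)
    random.Random(seed).shuffle(shuffled)  # seed is an arbitrary assumption
    n_train = int(len(shuffled) * train_fraction)  # floor of 70% of the cohort
    return shuffled[:n_train], shuffled[n_train:]

# Placeholder identifiers standing in for the 2,322 patients (not SEER records).
patient_ids = range(2322)
train_ids, test_ids = split_train_test(patient_ids)
print(len(train_ids), len(test_ids))  # 1625 697
```

Taking the floor of 70% of 2,322 yields the 1,625/697 split reported in Table 1.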
Table 1
Distribution of patient characteristics in the training and test sets.
Variables | Overall N (%) | Train cohort N (%) | Test cohort N (%) | P |
Patients | 2,322 | 1,625 (69.98) | 697 (30.02) | |
Age (years) | | | | 0.73 |
≤ 3 | 324 (13.95) | 229 (14.09) | 95 (13.63) | |
> 3 | 1,998 (86.05) | 1,396 (85.91) | 602 (86.37) | |
Sex | | | | 0.12 |
Female | 869 (37.42) | 625 (38.46) | 244 (35.01) | |
Male | 1,453 (62.58) | 1,000 (61.54) | 453 (64.99) | |
Race | | | | 0.57 |
Black | 185 (7.97) | 135 (8.31) | 50 (7.17) | |
White | 1,939 (83.51) | 1,349 (83.02) | 590 (84.65) | |
Other | 198 (8.53) | 141 (8.68) | 57 (8.18) | |
Histopathology | | | | 0.24 |
DMB | 329 (14.17) | 229 (14.09) | 100 (14.35) | |
MB, NOS | 1,866 (80.36) | 1,302 (80.12) | 564 (80.92) | |
LC | 127 (5.47) | 94 (5.78) | 33 (4.73) | |
Size (cm) | | | | 0.17 |
≤ 3.4 | 314 (13.52) | 211 (12.98) | 103 (14.78) | |
> 3.4 | 1,269 (54.65) | 896 (55.14) | 373 (53.52) | |
Unknown | 739 (31.83) | 518 (31.88) | 221 (31.71) | |
Surgery | | | | 0.14 |
Total resection | 1,616 (69.60) | 1,122 (69.05) | 494 (70.88) | |
Subtotal resection | 244 (10.51) | 175 (10.77) | 69 (9.90) | |
Local excision/Biopsy | 343 (14.77) | 238 (14.65) | 105 (15.06) | |
No evidence | 119 (5.12) | 90 (5.54) | 29 (4.16) | |
Chemotherapy | | | | 0.25 |
Yes | 1,849 (79.63) | 1,274 (78.40) | 575 (82.50) | |
No evidence | 473 (20.37) | 351 (21.60) | 122 (17.50) | |
Radiotherapy | | | | 0.33 |
Yes | 1,766 (76.06) | 1,214 (74.71) | 552 (79.20) | |
No evidence | 556 (23.94) | 411 (25.29) | 145 (20.80) | |
Status | | | | 0.46 |
Death | 713 (30.71) | 514 (31.63) | 199 (28.55) | |
Alive | 1,609 (69.29) | 1,111 (68.37) | 498 (71.45) | |
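The source does not state which test produced the P values in Table 1. As an illustration only, the sketch below assumes a Pearson chi-square test without continuity correction and applies it to the sex-by-cohort counts from the table. The function name is hypothetical, and the result need not match the tabulated P exactly.

```python
import math

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Sex by cohort, from Table 1: rows = train/test, columns = female/male.
stat, p = pearson_chi2_2x2(625, 1000, 244, 453)
print(f"chi2 = {stat:.3f}, P = {p:.3f}")
```

A P value well above 0.05 here is consistent with the statement that the two cohorts do not differ significantly in sex distribution.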
Cox proportional-hazard (CoxPH) model
The CoxPH model was developed using the training set (Fig. 4). Only variables that showed statistical significance in the univariate analysis were included in the multivariate analysis. In the univariate analysis, the survival of medulloblastoma patients was significantly associated with non-surgical treatment, LC, White race, tumor size ≤ 3.4 cm, total resection, age > 3 years, chemotherapy, and radiotherapy. These features remained significantly associated with survival in the multivariate analysis. The collinearity analysis also revealed a high correlation between age and radiotherapy, as well as between chemotherapy and radiotherapy (Fig. 5). Ultimately, we included seven features (age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy) in the model development.
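For reference, the standard form of the Cox proportional-hazards model is given below. The notation is ours rather than the source's: x denotes the coded vector of the seven included features, h_0(t) the unspecified baseline hazard, and β the fitted coefficients, with exp(β_j) interpreted as the hazard ratio for feature j.

```latex
h(t \mid \mathbf{x}) = h_0(t)\,\exp\!\left(\boldsymbol{\beta}^{\top}\mathbf{x}\right),
\qquad
\mathrm{HR}_j = \exp(\beta_j)
```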
Random Survival Forests (RSF)
Prediction error was calculated using the out-of-bag (OOB) data from the training and test sets (Fig. 6A, B). Variable importance (VIMP) is measured by randomising a specific variable (Fig. 6C); a higher VIMP value indicates that the variable has a greater influence on accurately predicting the outcome [20]. Interactions between variables are shown in Fig. 6D. If the split on one variable in a decision tree influences the split on another variable, this suggests an interaction between them [21,22]. The extent of interaction is assessed by the minimum depth, which is the distance from the root node to the node where the variable first splits. Chemotherapy and radiotherapy had the lowest minimum depth among the variables considered and were therefore expected to be associated with other variables.
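The idea behind permutation-based VIMP can be sketched as follows: shuffle one variable, recompute the prediction error, and take the increase over the unpermuted error. This is a generic illustration, not the RSF implementation used in the study. It assumes a toy deterministic predictor and squared error in place of a survival-specific error measure, and all names are hypothetical.

```python
import random

def permutation_vimp(predict, rows, targets, error, column, seed=0):
    """Increase in prediction error after shuffling one column of `rows`."""
    baseline = error(targets, [predict(r) for r in rows])
    shuffled_col = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled_col)
    # Rebuild each row with the shuffled value in the chosen column.
    permuted_rows = [
        r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled_col)
    ]
    permuted = error(targets, [predict(r) for r in permuted_rows])
    return permuted - baseline

def mean_squared_error(y_true, y_pred):
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

# Toy stand-in for a fitted model: the prediction depends only on column 0.
def toy_predict(row):
    return 2.0 * row[0]

rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [2.0 * r[0] for r in rows]

vimp_used = permutation_vimp(toy_predict, rows, targets, mean_squared_error, column=0)
vimp_unused = permutation_vimp(toy_predict, rows, targets, mean_squared_error, column=1)
print(vimp_used > 0, vimp_unused == 0)
```

Shuffling the variable the model relies on raises the error, whereas shuffling an unused variable leaves it unchanged, which is why larger VIMP values are read as greater importance.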
DeepSurv
The loss curve, which plots the loss against the number of training iterations, shows how well the model optimizes its parameters and whether training converges [23]. Plotting performance on the training dataset against the number of iterations additionally shows how accurately the model ranks the samples. Together, these curves help monitor the model's generalization ability and reveal signs of overfitting, in which the model captures details specific to the training dataset but fails to generalize to unseen samples [24]. The learning process of DeepSurv, a survival prediction model based on deep learning, is visualized in Fig. 7. The figure demonstrates a good fit, indicating that the model effectively learned the patterns within the data.
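The loss tracked in such a curve is, for DeepSurv-type models, the negative log Cox partial likelihood of the network's risk scores. The sketch below is a plain-Python illustration under simplifying assumptions: no regularization term, ties handled Breslow-style, and toy data that are not from the study. The function and variable names are hypothetical.

```python
import math

def neg_log_partial_likelihood(risk_scores, times, events):
    """Average negative log Cox partial likelihood over observed events."""
    total, n_events = 0.0, 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored subjects contribute only through risk sets
        # Risk set: everyone still under observation at time t_i.
        log_risk_sum = math.log(
            sum(math.exp(r) for r, t in zip(risk_scores, times) if t >= t_i)
        )
        total += risk_scores[i] - log_risk_sum
        n_events += 1
    return -total / n_events

# Toy data: four subjects, the third one censored.
times = [5, 3, 8, 1]
events = [1, 1, 0, 1]
well_ordered = [0.5, 1.0, -1.0, 2.0]    # higher risk for shorter survival
uninformative = [0.0, 0.0, 0.0, 0.0]
reversed_scores = [-0.5, -1.0, 1.0, -2.0]

loss_good = neg_log_partial_likelihood(well_ordered, times, events)
loss_zero = neg_log_partial_likelihood(uninformative, times, events)
loss_bad = neg_log_partial_likelihood(reversed_scores, times, events)
print(loss_good < loss_zero < loss_bad)  # True
```

Risk scores that rank patients in the right order give a lower loss, so a decreasing curve during training reflects improving risk ranking.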
Model comparisons
The predictive performance of the three models is shown in Table 2. In the test set, the DeepSurv and RSF models showed marginally higher discrimination (C-index: DeepSurv 0.751, RSF 0.750) than the CoxPH model (C-index: 0.748), with DeepSurv achieving the highest value. The integrated Brier score (IBS) was 0.150 for DeepSurv, 0.160 for RSF, and 0.166 for CoxPH; lower IBS values indicate better model performance. Additionally, the C-indices obtained in the training set (DeepSurv: 0.763, RSF: 0.759, CoxPH: 0.757) differed only slightly from those in the test set, indicating that the models did not overfit.
Furthermore, in terms of the Brier score (Fig. 8), DeepSurv outperformed the other two models, indicating its superior accuracy. The AUC for DeepSurv was also higher than those of the other models at 1 year (DeepSurv: 0.838, RSF: 0.809, CoxPH: 0.808), 3 years (DeepSurv: 0.820, RSF: 0.791, CoxPH: 0.782), and 5 years (DeepSurv: 0.805, RSF: 0.780, CoxPH: 0.773) (Fig. 9). These results demonstrate that DeepSurv outperforms both RSF and the classical CoxPH model in accurately predicting the prognosis of patients with medulloblastoma.
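The C-index used above to measure discrimination can be illustrated with a minimal sketch of Harrell's concordance index. It assumes the common convention that tied risk scores count as half-concordant; the source does not state which implementation was used, and the toy data are not from the study.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair is comparable when the subject with the shorter time had an event.
    Tied risk scores count as half-concordant.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Toy data: four subjects, the third one censored.
times = [5, 3, 8, 1]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [0.5, 1.0, -1.0, 2.0])
c_random = concordance_index(times, events, [0.0, 0.0, 0.0, 0.0])
c_inverted = concordance_index(times, events, [-0.5, -1.0, 1.0, -2.0])
print(c_perfect, c_random, c_inverted)  # 1.0 0.5 0.0
```

A value of 0.5 corresponds to uninformative ranking and 1.0 to perfect ranking, which places the reported test-set values of about 0.75 in context.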
Table 2
Performance of three survival models.
Models | C-index (train) | C-index (test) | IBS | 1-year AUC | 3-year AUC | 5-year AUC |
CoxPH | 0.757 | 0.748 | 0.166 | 0.808 (0.77–0.85) | 0.782 (0.74–0.82) | 0.773 (0.74–0.81) |
RSF | 0.759 | 0.750 | 0.160 | 0.809 (0.77–0.85) | 0.791 (0.75–0.83) | 0.780 (0.74–0.82) |
DeepSurv | 0.763 | 0.751 | 0.150 | 0.838 (0.80–0.88) | 0.820 (0.78–0.86) | 0.805 (0.76–0.84) |