Deep learning models for predicting the survival of patients with medulloblastoma based on a surveillance, epidemiology, and end results analysis

doi:10.21203/rs.3.rs-3975955/v1

This work is licensed under a CC BY 4.0 License

Abstract

Background

Medulloblastoma is a malignant neuroepithelial tumor of the central nervous system. Accurate prediction of prognosis is essential for therapeutic decisions in medulloblastoma patients. Several prognostic models have been developed using multivariate Cox regression to predict the 1-, 3- and 5-year survival of medulloblastoma patients, but few studies have investigated the results of integrating deep learning algorithms. Compared with simplifying prediction into a binary classification task, modelling the probability of an event as a function of time with deep learning may provide greater accuracy and flexibility.

Methods

Patients diagnosed with medulloblastoma between 2000 and 2019 were extracted from the Surveillance, Epidemiology, and End Results (SEER) registry. Three models were selected for training: one based on neural networks (DeepSurv), one based on ensemble learning (random survival forest [RSF]), and a typical Cox proportional-hazards (CoxPH) model. The dataset was randomly divided into training and testing datasets in a 7:3 ratio. Model performance was evaluated using the concordance index (C-index), Brier score and integrated Brier score (IBS). The accuracy of predicting 1-, 3- and 5-year survival was assessed using receiver operating characteristic (ROC) curves and the area under the ROC curve (AUC).

Results

The 2,322 patients with medulloblastoma included in the study were randomly divided into the training cohort (70%, n = 1,625) and the test cohort (30%, n = 697). There was no statistically significant difference in clinical characteristics between the two cohorts (p > 0.05). We performed Cox proportional hazards regression on the data from the training cohort, which showed that age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy were significant factors influencing survival (p < 0.05). The DeepSurv model outperformed the RSF and classic CoxPH models, with C-indexes of 0.763 and 0.751 for the training and test datasets, respectively. The DeepSurv model also showed better accuracy in predicting 1-, 3- and 5-year survival (AUC: 0.805–0.838).

Conclusion

The deep learning-based predictive model that we developed can accurately predict the survival probability and survival time of patients with medulloblastoma.

Biological sciences/Cancer/Cancer models

Biological sciences/Cancer/Cns cancer

DeepSurv

Medulloblastoma

Neural network

Survival prediction

SEER

Introduction

Medulloblastoma is an embryonal tumor that arises from the cerebellum and has the potential to spread throughout the nervous system. It is the most common type of paediatric embryonal tumor, with an incidence ranging from 5 to 11 cases per 1 million individuals^1,2. According to current international consensus, there are four subgroups of medulloblastoma: Wingless (WNT), Sonic Hedgehog (SHH), group 3 (G3), and group 4 (G4)³. Multimodal therapy, which includes surgery, external beam irradiation, and/or cytotoxic chemotherapy, can result in survival rates ranging from 50–80% based on clinical staging⁴. Certain prognostic features, such as age at diagnosis, extent of resection, histological subtype, and molecular subgroup classification, have been found to affect survival predictions in individual patients.

Previous studies have used the Cox proportional-hazards model (CoxPH) to evaluate the survival rate of medulloblastoma patients^5–7. This model takes survival outcome and time as target variables, allowing the impact of multiple factors on survival time to be analyzed simultaneously. It is extensively used for predicting outcome events when the survival distribution of the analyzed data is unknown⁸. A nomogram is a commonly used method for quantifying and combining important clinical characteristics of patients to calculate the probabilities of outcome events based on the CoxPH model ⁹. However, the model assumes that each predictor variable has the same effect throughout the follow-up time, which ignores variations in its impact on individual patients at different times. Therefore, a new method is required to improve the accuracy of predicting the survival rate of cancer patients.

In recent years, advances in computer and information technology have revealed the revolutionary potential of artificial intelligence (AI) in the healthcare industry ^10–12. Machine learning models have stronger nonlinear modeling capabilities than traditional linear models and can better capture complex relationships among clinical variables. These models can provide accurate personalized survival predictions and decision-making support for treatment strategies to improve patient survival rates ^13,14. Deep learning, a field in machine learning, explores patterns and representations within data to characterize their distribution ^15,16. A deep learning model typically consists of an input layer, one or more hidden layers, and an output layer, and can solve complex, multifactorial, and nonlinear problems. Owing to continuous advances in deep learning techniques and the abundance of biomedical big data, deep learning-based models have become highly effective predictors of clinical outcomes across various disease domains. Jiang et al. ¹⁷ used an artificial neural network model and clinical information to predict the survival rate of patients diagnosed with pancreatic neuroendocrine neoplasms. Katzman et al. ¹⁸ combined the Cox proportional-hazards model with a multilayer neural network, known as the DeepSurv model, resulting in a personalized treatment recommendation system that showed remarkable performance.

To our knowledge, little research has combined deep learning techniques with the study of medulloblastoma. Therefore, this study aimed to fill this gap by using data from the Surveillance, Epidemiology, and End Results (SEER) database, which contains information on patients diagnosed with medulloblastoma in the United States, and applying the DeepSurv model to predict their survival.

Method

Data source and patient selection

The data for this retrospective cohort study were obtained from the SEER database, which encompasses information from 18 cancer registries representing approximately 28% of the entire US population¹⁹. This database offers extensive and detailed patient data, including demographic characteristics, tumor-related information, cause of death, and survival duration. The SEER*Stat software (version 8.3.6) was used to identify patients with medulloblastoma diagnosed in the United States between 2000 and 2019.

Patients included in the study had to meet the following criteria: 1) a confirmed pathological diagnosis of medulloblastoma; 2) a histology code from the third edition of the International Classification of Diseases for Oncology (ICD-O-3) of 9470/3 (medulloblastoma, NOS), 9471/3 (desmoplastic nodular medulloblastoma), or 9474/3 (large cell medulloblastoma); and 3) a known survival status and survival time. Eligible patients were then randomly divided into a training group and a testing group at a 7:3 ratio. The flowchart in Fig. 1 illustrates the process of patient selection.
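The 7:3 random split can be illustrated with a minimal Python sketch. The paper does not report which tool or random seed was used for the split, so the function name `split_indices` and the seed value here are assumptions made purely for illustration.

```python
import random


def split_indices(n_patients, train_fraction=0.7, seed=42):
    """Shuffle patient indices and split them into training and test sets.

    The seed is a hypothetical choice; it only makes the split reproducible.
    """
    indices = list(range(n_patients))
    rng = random.Random(seed)
    rng.shuffle(indices)
    n_train = int(n_patients * train_fraction)  # floor of the training share
    return indices[:n_train], indices[n_train:]


# Cohort size taken from the paper (2,322 eligible patients).
train_idx, test_idx = split_indices(2322)
print(len(train_idx), len(test_idx))
```

With 2,322 patients, flooring 70% gives 1,625 training and 697 test patients, which matches the cohort sizes reported in the Results.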

Variable definitions

Several variables were collected for each patient, including age at diagnosis, sex, race, histological type, tumor size, surgery, chemotherapy, radiation therapy, and survival time. To evaluate the prognostic value of age and tumor size in patients with medulloblastoma objectively, the patients were categorized into groups based on the optimal cutoff values obtained using the X-tile software (https://x-tile.software.informer.com, Yale School of Medicine, New Haven, CT, United States). Age was categorized as ≤ 3 years or > 3 years, and tumor size as ≤ 3.4 cm, > 3.4 cm, or unknown. For detailed visual representations, please refer to Fig. 2.
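The categorization by the X-tile cutoffs could be coded as in the short sketch below. The function names and the use of `None` for a missing tumor size are assumptions for illustration; only the cutoff values (3 years and 3.4 cm) come from the text.

```python
def age_group(age_years):
    """Map age at diagnosis to the two groups defined by the 3-year cutoff."""
    return "<=3" if age_years <= 3 else ">3"


def size_group(size_cm):
    """Map tumor size to the groups defined by the 3.4 cm cutoff.

    A missing size (represented here as None, an assumption) is kept as its
    own 'unknown' category rather than being dropped.
    """
    if size_cm is None:
        return "unknown"
    return "<=3.4" if size_cm <= 3.4 else ">3.4"


print(age_group(2), age_group(10), size_group(3.4), size_group(5.0), size_group(None))
```

Keeping "unknown" as a separate level means that patients without a recorded tumor size stay in the cohort.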

Model development

This study selected three models for training: DeepSurv, RSF, and CoxPH. DeepSurv is a deep feedforward neural network used to predict patients' survival time or survival probability. It employs a multi-layer neural network to capture the complex nonlinear relationship between patients' survival probability and input features. This study used the DeepSurv method described by Katzman et al.¹⁸ to predict the survival outcome of patients diagnosed with medulloblastoma. RSF (random survival forests) is a survival analysis method based on random forests. When constructing a random survival forest, subsets of samples and features are randomly selected, and multiple decision trees are built using these subsets. Each decision tree splits the samples at its nodes based on the features and chooses the split that maximizes the difference in survival between the resulting groups. The predictions from the individual trees are combined to obtain the final survival prediction. CoxPH is a semi-parametric regression model used to analyze survival data and estimate the risk of event occurrence. It is used to compare the relative risks of events between different groups and to study the impact of various factors on event occurrence. The model expresses the hazard at a given time as the product of an unspecified baseline hazard and an exponential function of a linear combination of the covariates.
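To make the relationship between CoxPH and DeepSurv concrete, the two hazard models and the DeepSurv training loss can be sketched as follows. The notation is an assumption introduced here and does not appear in the original text: x is the covariate vector, h_0 the baseline hazard, β the Cox coefficients, g_θ the network output, δ_i the event indicator, R(t_i) the set of patients still at risk at time t_i, and N_E the number of observed events. The loss is the average negative log partial likelihood used by Katzman et al.¹⁸, shown here without any regularization term.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta^{\top} x\right)
\qquad \text{(CoxPH)}

h(t \mid x) = h_0(t)\,\exp\!\left(g_{\theta}(x)\right)
\qquad \text{(DeepSurv)}

\ell(\theta) = -\frac{1}{N_E} \sum_{i:\,\delta_i = 1}
\left[ g_{\theta}(x_i) - \log \sum_{j \in R(t_i)} \exp\!\left(g_{\theta}(x_j)\right) \right]
```

The only structural difference is that the linear predictor of CoxPH is replaced by the output of a neural network, which is what allows DeepSurv to represent nonlinear covariate effects.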

CoxPH and RSF were implemented using the R packages "survival" and "randomForestSRC", respectively, and DeepSurv was implemented using an open-source Python package. Hyperparameter optimization for DeepSurv was conducted using the Hyperopt package within the TensorFlow framework.

Model evaluation

The study evaluated the model's performance using several metrics, including C-index, Brier score, integrated Brier score (IBS), Receiver Operating Characteristic (ROC) curves, and Area Under the Curve (AUC) values.

The C-index is a commonly used metric for evaluating the accuracy of survival predictions. It measures the concordance between the predicted survival risk and the observed survival time, that is, the proportion of comparable patient pairs in which the patient with the higher predicted risk experiences the event earlier. A C-index of 0.5 indicates random predictions, while a value of 1.0 indicates perfect predictions. The Brier score assesses the mean squared difference between the observed patient statuses (event occurrence or censoring) and the predicted survival probabilities. It ranges from 0 to 1, with 0 indicating a perfect match between predictions and observations. A Brier score of 0.25 corresponds to a noninformative prediction of 0.5 for every patient, so in practice models with Brier scores below 0.25 are considered useful. The IBS summarizes the Brier score across all available time points by integrating it over the follow-up interval, providing an overall measure of predictive accuracy. ROC curves are frequently used to assess a model's sensitivity and specificity at various discrimination thresholds. The ROC curve plots the true positive rate against the false positive rate. The AUC, which ranges from 0 to 1, quantifies the overall discrimination of the model, and a higher AUC indicates better discrimination ability. This study calculated AUC values for predicting survival at three time points: 1, 3, and 5 years.
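The pairwise definition of the C-index can be illustrated with a minimal pure-Python sketch of Harrell's estimator. This is not the implementation used in the study; the function name and the toy data are hypothetical, and pairs with tied survival times are simply ignored.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored data (illustrative sketch).

    times  : observed follow-up times
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk scores (higher means worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored on a patient with an observed event
        for j in range(n):
            if times[i] < times[j]:  # patient i failed before patient j's time
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0  # higher risk failed earlier
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, the third one is censored.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(times, events, risks)
print(c_index)
```

In this toy example five pairs are comparable and four of them are ranked correctly, giving a C-index of 0.8.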

Statistical analysis

In the clinical data, continuous variables are expressed as mean ± standard deviation (SD), while categorical variables are described using frequencies and percentages. Statistical tests such as chi-square tests and unpaired t-tests are used to compare variables between groups.
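As an illustration of the chi-square comparison between cohorts, the sketch below computes a Pearson chi-square test for a 2×2 table using only the standard library. The counts are the female/male numbers for the training and test cohorts reported in Table 1; the function name is hypothetical, and no continuity correction is applied, which is an assumption because the paper does not state whether one was used.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for r, row in enumerate(table):
        for k, observed in enumerate(row):
            expected = row_totals[r] * col_totals[k] / n
            chi2 += (observed - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Rows: training cohort, test cohort. Columns: female, male (from Table 1).
sex_table = [(625, 1000), (244, 453)]
chi2, p_value = chi_square_2x2(sex_table)
print(round(chi2, 3), round(p_value, 3))
```

Under these assumptions the test gives a P value of roughly 0.11 to 0.12, in line with the non-significant difference reported for sex in Table 1.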

Result

Basic characteristics

This study analysed data from 2,322 medulloblastoma patients registered in the SEER database between 2000 and 2019. Table 1 presents the demographic features of the patients, with 869 cases (37.42%) being female and 1,453 cases (62.58%) being male. The racial distribution was as follows: 185 patients (7.97%) were Black, 1,939 (83.51%) were White, and 198 (8.53%) belonged to other races. Regarding the subtypes of medulloblastoma, 329 patients (14.17%) had desmoplastic/nodular medulloblastoma (DMB), 1,866 (80.36%) had medulloblastoma, not otherwise specified (MB, NOS), and 127 (5.47%) had large-cell/anaplastic medulloblastoma (LC). In terms of surgical interventions, 1,616 patients (69.60%) underwent total resection, 244 (10.51%) underwent subtotal resection, 343 (14.77%) underwent local excision or biopsy, and 119 (5.12%) did not undergo surgery. Of the patients, 1,849 (79.63%) received chemotherapy, 1,766 (76.06%) underwent radiation therapy, and 713 (30.71%) died. The cutoff values for age and tumor size were determined using X-tile analysis (Fig. 2). Specifically, 324 patients (13.95%) were ≤ 3 years old, and 1,998 patients (86.05%) were older than 3 years. Regarding tumor size, 314 patients (13.52%) had tumors ≤ 3.4 cm, 1,269 patients (54.65%) had tumor size > 3.4 cm, and the tumor size was unknown for 739 patients (31.83%).

The predictive models were generated by partitioning the complete dataset into two mutually exclusive subsets: 70% of the dataset was allocated to the training set, and the remaining 30% to the test set. The models were fitted on the 1,625 patients randomly assigned to the training set, and their accuracy was estimated on the 697 patients randomly assigned to the test set. No statistically significant differences in characteristics were found between the two groups (refer to Table 1). Additionally, survival outcomes showed no differences between the two groups (refer to Fig. 3).

Table 1

Characteristic distribution of data in the training and test sets.
Variables	Overall N (%)	Train cohort N (%)	Test cohort N (%)	P
Patients	2,322	1,625 (69.98)	697 (30.02)
Age (years)				0.73
  ≤ 3	324 (13.95)	229 (14.09)	95 (13.63)
  > 3	1,998 (86.05)	1,396 (85.91)	602 (86.37)
Sex				0.12
  Female	869 (37.42)	625 (38.46)	244 (35.01)
  Male	1,453 (62.58)	1,000 (61.54)	453 (64.99)
Race				0.57
  Black	185 (7.97)	135 (8.31)	50 (7.17)
  White	1,939 (83.51)	1,349 (83.02)	590 (84.65)
  Other	198 (8.53)	141 (8.68)	57 (8.18)
Histopathology				0.24
  DMB	329 (14.17)	229 (14.09)	100 (14.35)
  MB, NOS	1,866 (80.36)	1,302 (80.12)	564 (80.92)
  LC	127 (5.47)	94 (5.78)	33 (4.73)
Size (cm)				0.17
  ≤ 3.4	314 (13.52)	211 (12.98)	103 (14.78)
  > 3.4	1,269 (54.65)	896 (55.14)	373 (53.52)
  Unknown	739 (31.83)	518 (31.88)	221 (31.71)
Surgery				0.14
  Total resection	1,616 (69.60)	1,122 (69.05)	494 (70.88)
  Subtotal resection	244 (10.51)	175 (10.77)	69 (9.90)
  Local excision/Biopsy	343 (14.77)	238 (14.65)	105 (15.06)
  No evidence	119 (5.12)	90 (5.54)	29 (4.16)
Chemotherapy				0.25
  Yes	1,849 (79.63)	1,274 (78.40)	575 (82.50)
  No evidence	473 (20.37)	351 (21.60)	122 (17.50)
Radiotherapy				0.33
  Yes	1,766 (76.06)	1,214 (74.71)	552 (79.20)
  No evidence	556 (23.94)	411 (25.29)	145 (20.80)
Status				0.46
  Death	713 (30.71)	514 (31.63)	199 (28.55)
  Alive	1,609 (69.29)	1,111 (68.37)	498 (71.45)

Cox proportional-hazard (CoxPH) model

The CoxPH model was developed using the training set (refer to Fig. 4). Only variables that showed statistical significance in the univariate analysis were included in the multivariate analysis. The survival of medulloblastoma patients was significantly affected by non-surgical treatment, LC, white race, tumor size ≤ 3.4 cm, total resection, age > 3 years, chemotherapy, and radiotherapy. Furthermore, the survival of the patients was significantly associated with these features in the multivariate analysis. The collinearity analysis also revealed a high correlation between age and radiotherapy, as well as between chemotherapy and radiotherapy (refer to Fig. 5). Ultimately, we included seven features (age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy) in the model development.

Random Survival Forests (RSF)

Prediction error was calculated using the out-of-bag (OOB) samples for the training and the testing set (Fig. 6A, B). Variable importance (VIMP) can be measured by randomizing a specific variable, as shown in Fig. 6C. A higher VIMP value indicates a greater influence or importance of that variable in accurately predicting the outcome²⁰. The interactions between variables in the analyzed data are displayed in Fig. 6D. If one variable's split in a decision tree affects the split of another variable, this suggests an interaction between those variables^21,22. The extent of interactions is assessed based on the minimum depth, which represents the distance from the root node to the node where the variable first splits. Chemotherapy and radiotherapy had the lowest minimum depth among the variables considered and were therefore expected to be associated with other variables.

DeepSurv

The loss function curve, which illustrates the relationship between the loss and the number of iterations during the training process, provides valuable insights into the convergence and performance of the model. By examining this curve, we can assess how well the model is optimizing its parameters over iterations²³. Furthermore, plotting the performance of the training dataset as a function of the number of iterations allows us to evaluate the model's ability to rank the samples accurately. This measure helps us monitor the model's generalization ability and identify any signs of overfitting, where the model may excessively capture details specific to the training dataset but fails to generalize well to unseen samples²⁴. The learning process of DeepSurv, a survival prediction model based on deep learning, was visualized (Fig. 7). The figure demonstrates a good fit of the model, indicating that it is effectively learning and capturing the patterns within the data.

Model comparisons

The predictive performance of the three models is shown in Table 2. In the test dataset, the DeepSurv and RSF models exhibited slightly better discrimination (C-index: DeepSurv 0.751, RSF 0.750) than the CoxPH model (C-index: 0.748), with DeepSurv having the highest C-index of the three models. The IBS values for the three models were as follows: DeepSurv (0.150), RSF (0.160), and CoxPH (0.166). Lower IBS values indicate better model performance. Additionally, the C-indexes obtained from the training dataset (DeepSurv: 0.763, RSF: 0.759, CoxPH: 0.757) differed only slightly from those of the test dataset, indicating that the models did not exhibit overfitting.
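To show how an IBS summarizes a Brier score curve, the following sketch integrates Brier scores over a time grid with the trapezoidal rule and divides by the length of the interval. The time grid and Brier score values are hypothetical illustration data, not results from this study, and the function name is an assumption.

```python
def integrated_brier_score(time_grid, brier_scores):
    """Average a Brier score curve over time using the trapezoidal rule."""
    if len(time_grid) != len(brier_scores) or len(time_grid) < 2:
        raise ValueError("need at least two matching time points and scores")
    area = 0.0
    for k in range(1, len(time_grid)):
        width = time_grid[k] - time_grid[k - 1]
        area += 0.5 * width * (brier_scores[k] + brier_scores[k - 1])
    # Normalize by the interval length so the result stays on the 0-1 scale.
    return area / (time_grid[-1] - time_grid[0])


# Hypothetical Brier scores at 0, 12, 36 and 60 months of follow-up.
months = [0, 12, 36, 60]
scores = [0.0, 0.08, 0.16, 0.20]
ibs = integrated_brier_score(months, scores)
print(round(ibs, 3))
```

Because the result is a time-averaged Brier score, it can be read on the same scale as the Brier score itself, with lower values indicating better calibrated predictions over the whole follow-up period.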

Furthermore, in terms of the Brier score (Fig. 8), DeepSurv outperformed the other two models, indicating its superior accuracy. The AUC for DeepSurv was also higher than that of the other models (1-year AUC: DeepSurv 0.838, RSF 0.809, CoxPH 0.808; 3-year AUC: DeepSurv 0.820, RSF 0.791, CoxPH 0.782; 5-year AUC: DeepSurv 0.805, RSF 0.780, CoxPH 0.773) (Fig. 9). These results demonstrate that DeepSurv outperforms both RSF and the classical CoxPH model in predicting the prognosis of patients with medulloblastoma.

Table 2

Performance of three survival models.
Models	C-index (train)	C-index (test)	IBS	1-year AUC	3-year AUC	5-year AUC
CoxPH	0.757	0.748	0.166	0.808 (0.77–0.85)	0.782 (0.74–0.82)	0.773 (0.74–0.81)
RSF	0.759	0.750	0.160	0.809 (0.77–0.85)	0.791 (0.75–0.83)	0.780 (0.74–0.82)
DeepSurv	0.763	0.751	0.150	0.838 (0.80–0.88)	0.820 (0.78–0.86)	0.805 (0.76–0.84)

Discussion

Medulloblastoma, a malignant brain tumor that mainly affects children, continues to pose a substantial challenge in pediatric oncology. Precisely predicting the individual prognosis of patients is crucial for customizing treatment approaches and enhancing survival rates. Prior research has identified several prognostic factors that affect the survival duration of medulloblastoma patients, including age, extent of surgical removal, and the administration of radiotherapy or chemotherapy ^7,25,26. Moreover, as medicine advances, an increasing amount of imaging data ⁵ and genetic data ²⁷ is being used for survival analysis of medulloblastoma patients. However, classical survival analysis methods, such as the Cox proportional-hazards model, assume a linear relationship between the covariates and the log hazard, which may be limiting in the face of multidimensional data. With the advancement of artificial intelligence, machine learning methods are being applied to clinical, imaging, and genetic data, allowing for the discovery of potential nonlinear relationships within the data ^28–30. Within machine learning, deep learning is a specific class of methods that utilizes multilayered neural networks to extract high-order features. Deep learning has gained increasing popularity in the field of cancer survival analysis and has demonstrated excellent performance ^31–33. As far as we know, this approach has not been applied to medulloblastoma. Therefore, we developed a deep learning model (DeepSurv) to predict the overall survival (OS) of medulloblastoma patients and compared its performance to that of a machine learning model (RSF) and a classical model (CoxPH).

By extracting potentially significant features from the SEER database, this research developed multiple models to forecast the survival rates of individuals diagnosed with medulloblastoma. Initially, we utilized the X-tile tool to determine the optimal cutoff values for age and tumor size from a cohort of 2,322 medulloblastoma patients. We identified two high-risk factors, age ≤ 3 years old and tumor size > 3.4 cm, that significantly impact the survival duration of patients with medulloblastoma. Subsequently, we employed Cox proportional hazards regression to identify variables associated with the prognosis of medulloblastoma patients. Age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy were selected for inclusion in the modeling process (p < 0.05). We established RSF, DeepSurv and CoxPH models and evaluated their performance using metrics such as the C-index, IBS, and ROC curve. The study results demonstrated that the DeepSurv model outperformed both the CoxPH and RSF models, as indicated by its higher C-index in both the training and testing sets. Moreover, the DeepSurv model exhibited the lowest IBS and the largest AUC values when predicting 1-, 3-, and 5-year survival. These findings collectively suggest that the DeepSurv model is more accurate in predicting the survival of patients with medulloblastoma.

In previous studies, Guo et al. ⁷ and Zhou et al. ⁵ utilized Cox proportional hazard regression for survival analysis of medulloblastoma and developed nomograms. Compared with their studies, the C-index values obtained from the DeepSurv model were higher in both the training and test cohorts, indicating its superior accuracy in predicting the prognosis of patients with medulloblastoma. This finding is consistent with the results reported in several previous studies focusing on cancer prognosis^34,35. The main advantage of the DeepSurv model lies in its ability to process both linear and nonlinear predictive variables by utilizing a multilevel neural network. By transforming the inputs through its hidden layers into a final linear combination, the model can uncover associations that may not be readily apparent to the human eye or to traditional statistical techniques.

Nevertheless, our study encountered several limitations. Firstly, the data collected from the SEER database for medulloblastoma patients contained some missing information that could potentially influence survival outcomes. This includes important details such as molecular subgroups, specific radiotherapy dosages, and chemotherapy regimens. The availability and completeness of these data rely on the ongoing improvements in data collection within the SEER database. Secondly, our model has yet to undergo external validation, and it is necessary to validate its performance on new data. Conducting further validations using independent datasets would enhance the reliability and generalizability of the findings. Another inherent limitation lies within the DeepSurv model itself. Due to its utilization of hidden layers in its architecture, the model operates as a black-box, making it challenging to fully comprehend the computations involved in the model construction process and its associated limitations. Future research should aim to address these concerns and explore the inner workings of the model to improve interpretability.

Conclusions

This study employed Cox proportional hazards regression analysis to examine the prognostic factors influencing medulloblastoma patients' outcomes, which include age, race, tumor size, histological type, surgery, chemotherapy, and radiotherapy. Subsequently, we developed a DeepSurv prediction model, which exhibited strong predictive capabilities in assessing the prognosis of patients diagnosed with medulloblastoma. The DeepSurv model holds significant potential for accurately predicting the survival duration of medulloblastoma patients.

Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Competing interests

The authors have no relevant financial or non-financial interests to disclose.

Author Contributions

Sun M: conceptualization, data curation, investigation, methodology, software, visualization, writing—original draft; Sun J: conceptualization, data curation, formal analysis; Li M: methodology, project administration, supervision—review and editing. All authors read and approved the final manuscript.

Data availability

The datasets analyzed during the current study are available in the SEER database repository (https://seer.cancer.gov/).

Ethical approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

References

1. Gajjar, A. J. & Robinson, G. W. Medulloblastoma-translating discoveries from the bench to the bedside. Nat Rev Clin Oncol 11, 714–722, doi:10.1038/nrclinonc.2014.181 (2014).
2. Ostrom, Q. T., Cioffi, G., Waite, K., Kruchko, C. & Barnholtz-Sloan, J. S. CBTRUS Statistical Report: Primary Brain and Other Central Nervous System Tumors Diagnosed in the United States in 2014–2018. Neuro Oncol 23, iii1-iii105, doi:10.1093/neuonc/noab200 (2021).
3. Taylor, M. D. et al. Molecular subgroups of medulloblastoma: the current consensus. Acta Neuropathol 123, 465–472, doi:10.1007/s00401-011-0922-z (2012).
4. Ramaswamy, V. & Taylor, M. D. Medulloblastoma: From Myth to Molecular. J Clin Oncol 35, 2355–2363, doi:10.1200/JCO.2017.72.7842 (2017).
5. Zhou, L. et al. Automatic image segmentation and online survival prediction model of medulloblastoma based on machine learning. Eur Radiol, doi:10.1007/s00330-023-10316-9 (2023).
6. Li, X. & Gong, J. Survival nomogram for medulloblastoma and multi-center external validation cohort. Front Pharmacol 14, 1247812, doi:10.3389/fphar.2023.1247812 (2023).
7. Guo, C. et al. External Validation of a Nomogram and Risk Grouping System for Predicting Individual Prognosis of Patients With Medulloblastoma. Front Pharmacol 11, 590348, doi:10.3389/fphar.2020.590348 (2020).
8. Baek, E. T. et al. Survival time prediction by integrating cox proportional hazards network and distribution function network. BMC Bioinformatics 22, 192, doi:10.1186/s12859-021-04103-w (2021).
9. Iasonos, A., Schrag, D., Raj, G. V. & Panageas, K. S. How to build and interpret a nomogram for cancer prognosis. J Clin Oncol 26, 1364–1370, doi:10.1200/JCO.2007.12.9791 (2008).
10. Schwalbe, N. & Wahl, B. Artificial intelligence and the future of global health. Lancet 395, 1579–1586, doi:10.1016/S0140-6736(20)30226-9 (2020).
11. Hamet, P. & Tremblay, J. Artificial intelligence in medicine. Metabolism 69S, S36-S40, doi:10.1016/j.metabol.2017.01.011 (2017).
12. Hunter, D. J. & Holmes, C. Where Medical Statistics Meets Artificial Intelligence. N Engl J Med 389, 1211–1219, doi:10.1056/NEJMra2212850 (2023).
13. Connor, C. W. Artificial Intelligence and Machine Learning in Anesthesiology. Anesthesiology 131, 1346–1359, doi:10.1097/ALN.0000000000002694 (2019).
14. Bhat, M., Rabindranath, M., Chara, B. S. & Simonetto, D. A. Artificial intelligence, machine learning, and deep learning in liver transplantation. J Hepatol 78, 1216–1233, doi:10.1016/j.jhep.2023.01.006 (2023).
15. Choi, R. Y., Coyner, A. S., Kalpathy-Cramer, J., Chiang, M. F. & Campbell, J. P. Introduction to Machine Learning, Neural Networks, and Deep Learning. Transl Vis Sci Technol 9, 14, doi:10.1167/tvst.9.2.14 (2020).
16. Greener, J. G., Kandathil, S. M., Moffat, L. & Jones, D. T. A guide to machine learning for biologists. Nat Rev Mol Cell Biol 23, 40–55, doi:10.1038/s41580-021-00407-0 (2022).
17. Jiang, C. et al. Predicting the survival of patients with pancreatic neuroendocrine neoplasms using deep learning: A study based on Surveillance, Epidemiology, and End Results database. Cancer Med 12, 12413–12424, doi:10.1002/cam4.5949 (2023).
18. Katzman, J. L. et al. DeepSurv: personalized treatment recommender system using a Cox proportional hazards deep neural network. BMC Med Res Methodol 18, 24, doi:10.1186/s12874-018-0482-1 (2018).
19. Hankey, B. F., Ries, L. A. & Edwards, B. K. The surveillance, epidemiology, and end results program: a national resource. Cancer Epidemiol Biomarkers Prev 8, 1117–1121 (1999).
20. Taylor, J. M. Random Survival Forests. J Thorac Oncol 6, 1974–1975, doi:10.1097/JTO.0b013e318233d835 (2011).
21. Gilhodes, J. et al. Comparison of variable selection methods for high-dimensional survival data with competing events. Comput Biol Med 91, 159–167, doi:10.1016/j.compbiomed.2017.10.021 (2017).
22. Kretowska, M. Tree-based models for survival data with competing risks. Comput Methods Programs Biomed 159, 185–198, doi:10.1016/j.cmpb.2018.03.017 (2018).
23. Du, J., Zhou, Y., Liu, P., Vong, C. M. & Wang, T. Parameter-Free Loss for Class-Imbalanced Deep Learning in Image Classification. IEEE Trans Neural Netw Learn Syst 34, 3234–3240, doi:10.1109/TNNLS.2021.3110885 (2023).
24. Serghiou, S. & Rough, K. Deep Learning for Epidemiologists: An Introduction to Neural Networks. Am J Epidemiol 192, 1904–1916, doi:10.1093/aje/kwad107 (2023).
25. Dasgupta, A. et al. Nomograms based on preoperative multiparametric magnetic resonance imaging for prediction of molecular subgrouping in medulloblastoma: results from a radiogenomics study of 111 patients. Neuro Oncol 21, 115–124, doi:10.1093/neuonc/noy093 (2019).
26. Liu, H. & Sun, P. A Nomogram Model for Predicting Prognosis of Patients with Medulloblastoma. Turk Neurosurg 34, 38–45, doi:10.5137/1019-5149.JTN.40397-22.3 (2024).
27. Zhu, S. et al. Identification of a Twelve-Gene Signature and Establishment of a Prognostic Nomogram Predicting Overall Survival for Medulloblastoma. Front Genet 11, 563882, doi:10.3389/fgene.2020.563882 (2020).
28. Erickson, B. J., Korfiatis, P., Akkus, Z. & Kline, T. L. Machine Learning for Medical Imaging. Radiographics 37, 505–515, doi:10.1148/rg.2017160130 (2017).
29. Eraslan, G., Avsec, Z., Gagneur, J. & Theis, F. J. Deep learning: new computational modelling techniques for genomics. Nat Rev Genet 20, 389–403, doi:10.1038/s41576-019-0122-6 (2019).
30. Handelman, G. S. et al. eDoctor: machine learning and the future of medicine. J Intern Med 284, 603–619, doi:10.1111/joim.12822 (2018).
31. She, Y. et al. Deep learning for predicting major pathological response to neoadjuvant chemoimmunotherapy in non-small cell lung cancer: A multicentre study. EBioMedicine 86, 104364, doi:10.1016/j.ebiom.2022.104364 (2022).
32. Tran, K. A. et al. Deep learning in cancer diagnosis, prognosis and treatment selection. Genome Med 13, 152, doi:10.1186/s13073-021-00968-x (2021).
33. Foersch, S. et al. Multistain deep learning for prediction of prognosis and therapy response in colorectal cancer. Nat Med 29, 430–439, doi:10.1038/s41591-022-02134-1 (2023).
34. Huang, B. et al. Deep Learning for the Prediction of the Survival of Midline Diffuse Glioma with an H3K27M Alteration. Brain Sci 13, doi:10.3390/brainsci13101483 (2023).
35. Zhang, X. et al. Deep learning-based pathology image analysis predicts cancer progression risk in patients with oral leukoplakia. Cancer Med 12, 7508–7518, doi:10.1002/cam4.5478 (2023).

No competing interests reported.

Editorial decision: Revision requested
25 Mar, 2024
Reviews received at journal
08 Mar, 2024
Reviewers agreed at journal
06 Mar, 2024
Reviewers agreed at journal
06 Mar, 2024
Reviewers invited by journal
06 Mar, 2024
Editor assigned by journal
06 Mar, 2024
Editor invited by journal
05 Mar, 2024
Submission checks completed at journal
05 Mar, 2024
First submitted to journal
21 Feb, 2024
