Interpretable Machine Learning Model to Predict Rupture of Small Intracranial Aneurysms and Facilitate Clinical Decision

DOI: https://doi.org/10.21203/rs.3.rs-1015315/v1

Abstract

Estimating the rupture risk of small intracranial aneurysms (IAs) to decide whether to treat them is difficult but crucial. We aimed to construct and externally validate a convenient machine learning (ML) model for assessing the rupture risk of small IAs. A total of 1004 patients with small IAs recruited from two hospitals were included in this retrospective study. The patients at hospital 1 were randomly stratified into a training set (70%) and an internal validation set (30%), and the patients at hospital 2 were used for external validation. We selected predictive features using the least absolute shrinkage and selection operator (LASSO) method and constructed five ML models with different algorithms: random forest classifier (RFC), categorical boosting (CatBoost), support vector machine (SVM) with linear kernel, light gradient boosting machine (LightGBM) and extreme gradient boosting (XGBoost). Shapley Additive Explanations (SHAP) analysis was used to interpret the best ML model. The training, internal validation and external validation cohorts included 658, 282, and 64 IAs, respectively. The SVM performed best, with an AUC of 0.817 in the internal validation cohort [95% confidence interval (CI), 0.769-0.866] and 0.893 in the external validation cohort (95% CI, 0.808-0.979), significantly outperforming the PHASES score (all P < 0.001). SHAP analysis showed that maximum size, location and irregular shape were the three most important features for predicting rupture. Our SVM model, based on readily accessible features, showed satisfactory discrimination in predicting the rupture of small IAs. Morphological parameters made important contributions to the prediction.

Introduction

Intracranial aneurysms (IAs), which occur in around 3% of adults, are relatively common in the general population [1]. Ruptured IAs lead to aneurysmal subarachnoid hemorrhage, which carries high morbidity and disability [2]. Of note, most incidentally detected IAs are small (≤7 mm in diameter) [3]. Small IAs account for more than 40% of all ruptured IAs [4], which may push patients with small IAs to accept preventive treatment and bear the additional treatment risks. Early evaluation of the rupture risk of small IAs is therefore important to help physicians and patients formulate treatment strategies.

Scoring systems for evaluating the rupture risk of IAs have been reported [5,6]. These studies examined the relationships between various risk factors and rupture using traditional statistical methods such as logistic regression. However, these relationships are usually complex and nonlinear, which makes conventional methods less reliable. In addition, because the pathophysiological characteristics of small and large IAs differ [7], these scoring systems do not apply well to small IAs. New approaches are therefore needed to build rupture risk prediction models for small IAs.

Machine learning (ML), a relatively new family of modeling methods, can identify correlations between features in large multivariate data sets [8]. It is superior to conventional statistical methods in handling non-linear relationships and complicated patterns [9]. The effectiveness of ML approaches in predicting the rupture risk of IAs has been demonstrated. Liu et al. proposed an ML model that achieved an overall prediction accuracy of 94.8% in evaluating the rupture risk of IAs located at the anterior communicating artery [10]. Another study used ML to stratify the risk of developing IAs among people undergoing health examinations and recommended further screening tests for those at high risk [11].

In this study, we constructed prediction models for the rupture risk of small IAs based on ML methods and clinical data, and validated them on a dataset from another center in a different region. To improve interpretability, we used a model interpretation technique to rank the importance of the selected input features. We aimed to develop a convenient tool to facilitate clinical decision-making and optimize treatment options.

Methods

Study population

We recruited a consecutive series of patients with IAs from two hospitals (Hunan Provincial People's Hospital and the Second Affiliated Hospital of Nanjing Medical University) between September 2015 and December 2020 and obtained data retrospectively from cerebrovascular images and medical records. The ethics committee of Hunan Provincial People's Hospital approved this study ([2015]-10). The inclusion criteria were as follows: (1) IA(s) confirmed by digital subtraction angiography (DSA), (2) age ≥18 years, (3) IA size ≤7 mm, and (4) available clinical information and imaging data. Patients with malignant brain tumors, fusiform or dissecting IAs, arteriovenous fistulas, moyamoya disease or other cerebrovascular diseases, and those with incomplete clinical or imaging data, were excluded.

Data collection and data pre-processing

The baseline data of patients were as follows: age; gender; drinking; smoking; presence of hypertension, coronary heart disease (CHD) and diabetes mellitus (DM); and history of subarachnoid hemorrhage (SAH). Morphological parameters (such as size, location and shape) were extracted from 3D-DSA images and measured by two researchers under the supervision of two senior neurosurgeons. The maximum neck width, neck-to-dome length (from the neck center to the IA dome) and IA width (perpendicular to the neck-to-dome line) were measured on a 0.1 mm scale. IA size was defined as the neck-to-dome length or the largest distance within the aneurysm sac. IAs were categorized as narrow neck aneurysms (NNA) or wide neck aneurysms (WNA; neck width exceeding 4 mm or a ratio of maximum diameter to neck width less than 2). According to their position relative to the parent vessel, IAs were divided into sidewall and bifurcation types. IA shape was categorized as regular or irregular (presence of aneurysm wall protrusions, bi- or multi-lobular shape, or small blebs). IA location was recorded as internal carotid artery (ICA), anterior communicating artery (ACOA), anterior cerebral artery (ACA), posterior cerebral artery (PCA), middle cerebral artery (MCA), posterior communicating artery (PCOA), vertebral artery (VA), basilar artery (BA), posterior inferior cerebellar artery (PICA) or others, and was further dichotomized as anterior vs posterior circulation. When a patient had two or more IAs, the largest IA was used for analysis.
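The neck classification rule above can be written as a small function. This is a minimal sketch of the stated thresholds only; the function name `classify_neck` and its argument names are illustrative and do not come from the study.

```python
def classify_neck(neck_width_mm: float, max_diameter_mm: float) -> str:
    """Return "WNA" (wide neck) or "NNA" (narrow neck) for one aneurysm."""
    if neck_width_mm <= 0:
        raise ValueError("neck width must be positive")
    ratio = max_diameter_mm / neck_width_mm
    # Wide neck: neck width above 4 mm, or maximum diameter to neck width ratio below 2.
    if neck_width_mm > 4.0 or ratio < 2.0:
        return "WNA"
    return "NNA"


# Hypothetical measurements on the 0.1 mm scale described in the text.
print(classify_neck(neck_width_mm=2.0, max_diameter_mm=5.0))
print(classify_neck(neck_width_mm=3.0, max_diameter_mm=5.0))
```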

Feature selection and model development

The eligible patients at hospital 1 were assigned to a derivation (training) cohort (70%) and an internal validation cohort (30%) using stratified random sampling, and the eligible patients at hospital 2 were used for external validation. Feature selection, model derivation and hyper-parameter tuning, described below, were performed using the training cohort only. Before developing the ML models, z-score normalization was applied to the continuous data [12], and one-hot encoding was used to transform the categorical data [13]. The least absolute shrinkage and selection operator (LASSO) method was applied to select predictive features [14]; features with non-zero coefficients were retained to train the ML models. We constructed ML models to classify ruptured versus unruptured IAs with the random forest classifier (RFC), extreme gradient boosting (XGBoost), support vector machine (SVM) with linear kernel, light gradient boosting machine (LightGBM) and categorical boosting (CatBoost) algorithms, and tuned the model hyper-parameters using ten-fold cross-validation combined with grid search [15]. In ten-fold cross-validation, the training dataset was randomly stratified into 10 smaller subsets. For each fold, 9 subsets were used to build a model with a specific set of hyper-parameters, and the remaining subset was used for evaluation. Finally, the models were retrained using the optimal hyper-parameters, that is, the set with the best average AUC across the 10 folds.
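The preprocessing and fold assignment described above can be sketched in plain Python. This is an illustration under assumptions, not the authors' code: the helper names (`zscore`, `one_hot`, `stratified_folds`) and the random seed are hypothetical, and a real analysis would more likely rely on an established ML library. The mean and standard deviation should be estimated on the training cohort and reused for the validation cohorts.

```python
import random
from statistics import mean, pstdev


def zscore(values, mu=None, sigma=None):
    """Standardise a continuous feature; pass training mu/sigma for new data."""
    mu = mean(values) if mu is None else mu
    sigma = pstdev(values) if sigma is None else sigma
    return [(v - mu) / sigma for v in values]


def one_hot(labels, categories):
    """Encode each label as a 0/1 vector over a fixed category order."""
    return [[1 if lab == cat else 0 for cat in categories] for lab in labels]


def stratified_folds(y, k=10, seed=0):
    """Assign sample indices to k folds so each fold keeps the class balance."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(y)):
        idx = [i for i, label in enumerate(y) if label == cls]
        rng.shuffle(idx)
        # Deal the shuffled indices of this class round-robin across folds.
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


# Hypothetical inputs: four aneurysm sizes (mm) and two location labels.
sizes_z = zscore([2.7, 3.0, 6.1, 6.3])
locations = one_hot(["ICA", "PCOA"], ["ACOA", "ICA", "PCOA"])
# Class sizes of the training cohort: 336 ruptured (1) and 322 unruptured (0).
folds = stratified_folds([1] * 336 + [0] * 322, k=10)
```

For each candidate hyper-parameter set, a model would be fitted on nine folds and scored on the held-out fold, and the set with the best mean AUC across the ten folds would be kept.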

Model evaluation

Model performance was measured by the area under the curve (AUC) of the receiver operating characteristic (ROC). We also compared our five ML models with the PHASES score [5], in which a higher score denotes a higher rupture risk. For instance, patients scoring 2 have a 5-year rupture risk of 0.4%, while those scoring 11 have a risk of 7.2%. The method of DeLong et al. was used to compute confidence intervals (CIs) of the AUC values and to compare ROC curves [16]. The threshold corresponding to the maximum Youden index was selected as the optimal cut-point for dichotomizing the predictions [17]. Sensitivity, specificity, accuracy, positive predictive value (PPV) and negative predictive value (NPV) were calculated at the optimal threshold.
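The evaluation metrics named above can be illustrated with a short, dependency-free sketch. The function names and the toy labels and scores are hypothetical; the AUC is computed as the probability that a ruptured case receives a higher score than an unruptured one, and the DeLong confidence intervals are not reproduced here.

```python
def auc(y_true, scores):
    """AUC as the chance that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def confusion(y_true, scores, threshold):
    """Counts (tp, fp, tn, fn) when score >= threshold is called ruptured."""
    pairs = list(zip(y_true, scores))
    tp = sum(1 for y, s in pairs if y == 1 and s >= threshold)
    fn = sum(1 for y, s in pairs if y == 1 and s < threshold)
    fp = sum(1 for y, s in pairs if y == 0 and s >= threshold)
    tn = sum(1 for y, s in pairs if y == 0 and s < threshold)
    return tp, fp, tn, fn


def youden_threshold(y_true, scores):
    """Threshold maximising J = sensitivity + specificity - 1."""
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        tp, fp, tn, fn = confusion(y_true, scores, t)
        j = tp / (tp + fn) + tn / (tn + fp) - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


def summary(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV, NPV and accuracy from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Toy data: 1 = ruptured, 0 = unruptured; scores are model outputs.
y = [0, 0, 1, 0, 1, 1]
s = [0.1, 0.3, 0.35, 0.6, 0.7, 0.9]
cut, j = youden_threshold(y, s)
print(auc(y, s), cut, summary(*confusion(y, s, cut)))
```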

Model interpretation

ML models are often criticized as black boxes because the function relating input features to model output is not visible to researchers. We applied a model interpretation technique, Shapley Additive Explanations (SHAP) [18], to our best-performing model to reveal the importance of each included feature and thereby improve interpretability and trustworthiness. In addition, we randomly sampled 2 correctly predicted and 2 falsely predicted cases from the derivation set to explain individual predictions and clarify the causes of the model's correct and incorrect predictions.
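For a linear model, such as the decision function of an SVM with a linear kernel, SHAP values have a simple closed form when the features are treated as independent: the contribution of feature i is its weight times the deviation of its value from the background mean. The sketch below illustrates that property only. The text does not state which SHAP explainer was used, and the weights, bias and data are hypothetical.

```python
def linear_shap(weights, x, background):
    """SHAP values for f(x) = b + sum(w_i * x_i), assuming independent features.

    phi_i = w_i * (x_i - mean of feature i over the background data).
    """
    n = len(background)
    means = [sum(row[i] for row in background) / n for i in range(len(weights))]
    return [w * (xi - m) for w, xi, m in zip(weights, x, means)]


# Hypothetical weights for three standardised features, e.g. size, age, shape.
weights, bias = [0.8, -0.3, 0.5], -0.1
background = [[0.0, 0.0, 1.0], [1.0, 1.0, 0.0], [-1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
case = [1.5, 1.0, 0.0]

phi = linear_shap(weights, case, background)
# A positive value pushes the score towards rupture, a negative value away from it.
print(phi)
```

The values sum to the difference between the model output for the case and the average output over the background data, which is the additivity property that the per-case explanation plots rely on.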

Statistical analyses

Patient and IA characteristics were compared across the training, internal validation and external validation sets using analysis of variance (ANOVA) for continuous data and Fisher's exact test for categorical data. In the univariable analysis of differences in clinical features between the ruptured and unruptured groups, the Mann-Whitney U test or Student's t-test was applied to continuous data and the chi-squared test to categorical data. A two-tailed P < 0.05 was considered statistically significant. Data were analyzed with SPSS software (IBM Corporation, USA).
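As an illustration of a categorical comparison, the sketch below computes a Pearson chi-squared statistic for a contingency table. It is not the SPSS procedure used in the study, and the across-set comparisons were reported with Fisher's exact test, so this is only an approximation. The counts are the ruptured and unruptured IAs per cohort taken from Table 1, and the closed-form p-value is valid only for 2 degrees of freedom.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


def p_value_dof2(stat):
    """Upper-tail p-value of the chi-squared distribution with exactly 2 dof."""
    return math.exp(-stat / 2.0)


# Rows: training, internal validation, external validation.
# Columns: ruptured, unruptured (set size minus ruptured count, from Table 1).
rupture_by_set = [[336, 322], [144, 138], [20, 44]]
stat, dof = chi_squared(rupture_by_set)
print(stat, dof, p_value_dof2(stat))
```

The resulting p-value is about 0.009, close to the value reported for rupture status in Table 1.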

Results

Study population

A total of 1004 patients were included in this retrospective study. Their mean age was 59.06 ± 10.48 years, and 70.2% of them were female. Patient and IA characteristics of the training (n = 658) and internal validation (n = 282) cohorts from hospital 1 and the external validation cohort (n = 64) from hospital 2 are presented in Table 1. Significant differences in age, history of DM, irregular shape and rupture status were observed across the three groups (all P < 0.05, Table 1), while no significant differences in any variable were found between the training and internal validation sets (P = 0.060-1.000, Supplementary Table 1). The univariable analysis showed that maximum size, location, irregular shape, presence of hypertension and DM were significantly related to IA rupture (all P < 0.05, Supplementary Table 2).

Model performance

The prediction models were trained using 9 predictive features and 5 ML algorithms (RFC, SVM, XGBoost, LightGBM and CatBoost). The predictive features, determined by LASSO analysis, were age, hypertension, DM, irregular shape, NNA, maximum size, location at ACOA, location at ICA, and location at PCOA (Table 2). The hyper-parameters of each algorithm are presented in Supplementary Table 2.

The AUC, specificity, sensitivity, accuracy, PPV and NPV of the five ML models are summarized in Table 3. ROC curves and AUC values of the PHASES score and our five models are shown in Fig. 1a-c. The SVM model achieved the highest AUC (0.817 [95% CI, 0.769-0.866]) in the internal validation, followed by LightGBM (0.791 [95% CI, 0.740-0.843]), RFC (0.789 [95% CI, 0.736-0.841]), CatBoost (0.785 [95% CI, 0.733-0.838]) and XGBoost (0.773 [95% CI, 0.719-0.827]). All five ML models significantly outperformed the PHASES score (all P < 0.001). In the external validation, the SVM model also had a significantly higher AUC (0.893 [95% CI, 0.808-0.979]) than the others.
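The threshold-based metrics in Table 3 are tied together by the class counts in each cohort, which gives a quick consistency check. The sketch below rebuilds an approximate confusion matrix from the reported sensitivity and specificity and derives PPV, NPV and accuracy from it. The function name is hypothetical, and the resulting counts are inferred by rounding, not reported by the authors.

```python
def derive_metrics(sensitivity, specificity, n_ruptured, n_unruptured):
    """Infer confusion-matrix counts by rounding, then derive PPV, NPV, accuracy."""
    tp = round(sensitivity * n_ruptured)
    tn = round(specificity * n_unruptured)
    fn = n_ruptured - tp
    fp = n_unruptured - tn
    return {
        "tp": tp, "fn": fn, "tn": tn, "fp": fp,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (n_ruptured + n_unruptured),
    }


# SVM in the external validation cohort: sensitivity 0.900, specificity 0.636,
# with 20 ruptured and 44 unruptured IAs (64 IAs, 20 ruptured, per Table 1).
svm_external = derive_metrics(0.900, 0.636, 20, 44)
print(svm_external)
```

With these inputs the derived PPV, NPV and accuracy agree with the values in Table 3 (0.529, 0.933 and 0.719) to three decimals.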

Model interpretation

SHAP analysis was used to reveal the contribution of each feature to the predictions of the SVM model. The model tended to associate larger size, location at ACOA, location at PCOA, and NNA with increased rupture risk, represented as positive SHAP values (Fig. 2). In order of importance, the three features contributing most to the classification of ruptured and unruptured IAs were maximum size, location (ACOA and ICA) and irregular shape (Fig. 3).

We also randomly sampled 2 correctly predicted and 2 falsely predicted cases from the training dataset, as plotted in Supplementary Fig. 1. The true positive prediction, in which the first case was correctly classified as a ruptured IA, mainly resulted from a maximum size of 6.3 mm, irregular shape, absence of hypertension and location at PCOA. The true negative prediction, in which the second case was classified as an unruptured IA, mainly relied on a maximum size of 2.7 mm, regular shape and absence of hypertension. A maximum size of 6.1 mm, location at PCOA and irregular shape were the main reasons for the false positive prediction that the third case was a ruptured IA, while a maximum size of 3 mm and regular shape were the main reasons for the false negative prediction.

Discussion

In this study, we combined simple variables obtained in routine clinical practice with ML algorithms to establish a model for predicting the rupture risk of small aneurysms. The best model, SVM, showed satisfactory discrimination in screening for IAs at high risk of rupture, with AUC values of 0.817 and 0.893 in the internal and external validation, respectively. In the SHAP analysis, size, location, shape and presence of hypertension strongly influenced the predicted outcome.

Physicians and patients often face a dilemma when making treatment decisions for unruptured IAs, especially small ones. On the one hand, the low rupture risk of small IAs makes conservative treatment seem more reasonable. On the other hand, the disastrous consequences of rupture incline many patients toward preventive treatment. Nevertheless, treatment always carries risks. A previous study indicated that, among patients without a history of hemorrhage, the overall morbidity and mortality rates 1 year after open surgery and endovascular treatment were 12.6% and 9.8%, respectively [1]. Accordingly, accurately and quickly identifying IAs at high risk of rupture for preventive treatment is crucial.

Traditional statistical methods have been widely used to relate ruptured aneurysms to risk factors. However, the complex relationships between features and the outcome pose problems for analyses that assume a simple linear relationship. ML has shown great potential in dealing with nonlinear relationships and missing values [8,9]. It could give us a more comprehensive understanding of these relationships from different perspectives. Furthermore, with the wide adoption of electronic medical record systems and advances in technology, ML models could be integrated into systems that automatically process large amounts of data, helping doctors make clinical decisions and providing individualized diagnosis and treatment for patients.

The main advantage of our model is that it is convenient for physicians and patients to use. Because physicians have little time in their busy work to collect complex additional information, we used only patient and morphological characteristics that are available in routine clinical practice. This design improves the convenience of our model in the clinical environment. In contrast, two previous studies constructed ML models based on complex hemodynamics and PyRadiomics-derived morphological features, which may limit their clinical adoption [19,20]. Two other studies used convolutional neural networks to develop prediction models that extract information from 3D-DSA [21,22]. However, ignoring important patient characteristics could reduce the clinical efficacy of their models in the real world.

Another advantage of our model is its interpretability: we used the SHAP algorithm to rank the importance of the selected input features. ML has gradually become a research hotspot because of its excellent ability to handle large samples and nonlinear relationships. However, a significant drawback of ML models is that they tend to operate like “black boxes”, which makes them seem less reliable to experts. To address this, we interpreted the predictions made by our models using the SHAP method [18]. In this way, the rules behind the predictions of our ML model can be better revealed, and physicians can check the interpretation of the ML model against their professional knowledge.

Researchers have extensively studied factors related to ruptured IAs, and most studies have recognized that larger size [3,23], irregular shape [24], and location at ACOA or PCOA [25] are associated with higher rupture risk. Our study reached the same conclusions. Interestingly, patients with a history of hypertension in our cohort showed a lower risk of rupture, which differs from some studies. This may be attributable to the use of antihypertensive drugs. A previous animal study found that normalization of blood pressure by antihypertensive drugs reduced the rupture rate of aneurysms in mice [26]. In addition, a Finnish study suggested that drug-treated hypertension may be related to the formation of IAs rather than their rupture, and that hypertension raises rupture risk only if left untreated [27]. Similarly, several studies regarded DM as a protective factor and attributed this to the use of hypoglycemic agents [28,29]. More well-designed studies are required to investigate the connection between IA rupture and drug-treated hypertension.

Our study has several limitations. First, the retrospective design may have introduced bias into our analysis. Second, many of the IAs had ruptured during the study period. Although ruptured IAs are indeed unstable, it has been reported that post-rupture morphology should not be considered an adequate surrogate for evaluating rupture risk [30]. Third, we only took into account clinically accessible factors. More complex factors, such as detailed morphological and hemodynamic parameters, were rarely included in the current study. Finally, although our model performed well in external validation, the external validation dataset was relatively small. Going forward, prospective multicenter validation and long-term follow-up are needed to confirm and improve our results.

Conclusions

Our study combined readily accessible clinical and morphological features to derive ML models for predicting the rupture risk of small IAs. In internal and external validation, our SVM model showed satisfactory discrimination. Morphological parameters (size, location and shape) made important contributions to the prediction.

Declarations

Funding

This study was funded by the Special Scientific Research Fund Project of Jiangsu Research Hospital Association - Lean Drug use - Stone Medicine [Grant number JY202001]; National Natural Science Foundation of China [grant number 82173899].

Conflicts of interest

The authors declare that they have no conflicts of interest.

Data availability

Not applicable.

Code availability

SPSS version 25.0, Python version 3.7.

Authors’ contributions

JJZ, YBL and HWF conceived and designed the study. WGX, TTC and ZHZ contributed equally to this work. WGX, TTC and ZHZ conducted the literature review. TTC performed data analysis. WGX drafted the manuscript. XML, YJS, LX and DC collected the data. LHG and ZZ polished this article. All authors have read and agreed to the published version of the manuscript.

Ethics approval

Data collection and scientific use were approved by the institutional ethics committee of Hunan Provincial People's Hospital ([2015]-10).

Consent to participate

Data collection and scientific use were approved by the patients according to regulations by the ethics committee.

Consent for publication

All authors agreed to the publication of the manuscript.

References

  1. Wiebers, D. O., Whisnant, J. P., Huston, J., 3rd, Meissner, I., Brown, R. D., Jr, Piepgras, D. G., Forbes, G. S., Thielen, K., Nichols, D., O'Fallon, W. M., Peacock, J., Jaeger, L., Kassell, N. F., Kongable-Beckman, G. L., Torner, J. C., & International Study of Unruptured Intracranial Aneurysms Investigators (2003) Unruptured intracranial aneurysms: natural history, clinical outcome, and risks of surgical and endovascular treatment. Lancet (London, England) 362(9378): 103–110. https://doi.org/10.1016/s0140-6736(03)13860-3
  2. Macdonald, R. L., & Schweizer, T. A. (2017) Spontaneous subarachnoid haemorrhage. Lancet (London, England) 389(10069): 655–666. https://doi.org/10.1016/S0140-6736(16)30668-7
  3. Malhotra, A., Wu, X., Forman, H. P., Grossetta Nardini, H. K., Matouk, C. C., Gandhi, D., Moore, C., & Sanelli, P. (2017) Growth and Rupture Risk of Small Unruptured Intracranial Aneurysms: A Systematic Review. Annals of internal medicine 167(1): 26–33. https://doi.org/10.7326/M17-0246
  4. Bender, M. T., Wendt, H., Monarch, T., Beaty, N., Lin, L. M., Huang, J., Coon, A., Tamargo, R. J., & Colby, G. P. (2018) Small Aneurysms Account for the Majority and Increasing Percentage of Aneurysmal Subarachnoid Hemorrhage: A 25-Year, Single Institution Study. Neurosurgery 83(4): 692–699. https://doi.org/10.1093/neuros/nyx484
  5. Greving, J. P., Wermer, M. J., Brown, R. D., Jr, Morita, A., Juvela, S., Yonekura, M., Ishibashi, T., Torner, J. C., Nakayama, T., Rinkel, G. J., & Algra, A. (2014) Development of the PHASES score for prediction of risk of rupture of intracranial aneurysms: a pooled analysis of six prospective cohort studies. The Lancet. Neurology 13(1): 59–66. https://doi.org/10.1016/S1474-4422(13)70263-1
  6. UCAS Japan Investigators, Morita, A., Kirino, T., Hashi, K., Aoki, N., Fukuhara, S., Hashimoto, N., Nakayama, T., Sakai, M., Teramoto, A., Tominari, S., & Yoshimoto, T. (2012) The natural course of unruptured cerebral aneurysms in a Japanese cohort. The New England journal of medicine 366(26): 2474–2482. https://doi.org/10.1056/NEJMoa1113260
  7. Kataoka, K., Taneda, M., Asai, T., & Yamada, Y. (2000) Difference in nature of ruptured and unruptured cerebral aneurysms. Lancet (London, England) 355(9199): 203. https://doi.org/10.1016/S0140-6736(99)03881-7
  8. Senders, J. T., Arnaout, O., Karhade, A. V., Dasenbrock, H. H., Gormley, W. B., Broekman, M. L., & Smith, T. R. (2018) Natural and Artificial Intelligence in Neurosurgery: A Systematic Review. Neurosurgery 83(2): 181–192. https://doi.org/10.1093/neuros/nyx384
  9. Ngiam, K. Y., & Khor, I. W. (2019) Big data and machine learning algorithms for health-care delivery. The Lancet. Oncology 20(5): e262–e273. https://doi.org/10.1016/S1470-2045(19)30149-4
  10. Liu, J., Chen, Y., Lan, L., Lin, B., Chen, W., Wang, M., Li, R., Yang, Y., Zhao, B., Hu, Z., & Duan, Y. (2018) Prediction of rupture risk in anterior communicating artery aneurysms with a feed-forward artificial neural network. European radiology 28(8): 3268–3275. https://doi.org/10.1007/s00330-017-5300-3
  11. Heo, J., Park, S. J., Kang, S. H., Oh, C. W., Bang, J. S., & Kim, T. (2020) Prediction of Intracranial Aneurysm Risk using Machine Learning. Scientific reports 10(1): 6921. https://doi.org/10.1038/s41598-020-63906-8
  12. Shalabi LA, Shaaban Z, Kasasbeh B. (2006) Data mining: a preprocessing engine. J Comput Sci 2: 735-739. https://doi.org/10.3844/jcssp.2006.735.739
  13. Okada, S., Ohzeki, M., & Taguchi, S. (2019) Efficient partition of integer optimization problems with one-hot encoding. Scientific reports 9(1): 13036. https://doi.org/10.1038/s41598-019-49539-6
  14. Tibshirani R. (2011) Regression shrinkage and selection via the Lasso: a retrospective. J R Statist Soc B 73: 273-282. https://doi.org/10.1111/j.1467-9868.2011.00771.x
  15. Saeys, Y., Inza, I., & Larrañaga, P. (2007) A review of feature selection techniques in bioinformatics. Bioinformatics (Oxford, England) 23(19): 2507–2517. https://doi.org/10.1093/bioinformatics/btm344
  16. DeLong, E. R., DeLong, D. M., & Clarke-Pearson, D. L. (1988) Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 44(3): 837–845.
  17. Ruopp, M. D., Perkins, N. J., Whitcomb, B. W., & Schisterman, E. F. (2008) Youden Index and optimal cut-point estimated from observations affected by a lower limit of detection. Biometrical journal. Biometrische Zeitschrift 50(3): 419–430. https://doi.org/10.1002/bimj.200710415
  18. Rodríguez-Pérez, R., & Bajorath, J. (2020) Interpretation of Compound Activity Predictions from Complex Machine Learning Models Using Local Approximations and Shapley Values. Journal of medicinal chemistry 63(16): 8761–8777. https://doi.org/10.1021/acs.jmedchem.9b01101
  19. Liu, Q., Jiang, P., Jiang, Y., Ge, H., Li, S., Jin, H., & Li, Y. (2019) Prediction of Aneurysm Stability Using a Machine Learning Model Based on PyRadiomics-Derived Morphological Features. Stroke 50(9): 2314–2321. https://doi.org/10.1161/STROKEAHA.119.025777
  20. Shi, Z., Chen, G. Z., Mao, L., Li, X. L., Zhou, C. S., Xia, S., Zhang, Y. X., Zhang, B., Hu, B., Lu, G. M., & Zhang, L. J. (2021) Machine Learning-Based Prediction of Small Intracranial Aneurysm Rupture Status Using CTA-Derived Hemodynamics: A Multicenter Study. AJNR. American journal of neuroradiology 42(4): 648–654. https://doi.org/10.3174/ajnr.A7034
  21. Kim, H. C., Rhim, J. K., Ahn, J. H., Park, J. J., Moon, J. U., Hong, E. P., Kim, M. R., Kim, S. G., Lee, S. H., Jeong, J. H., Choi, S. W., & Jeon, J. P. (2019) Machine Learning Application for Rupture Risk Assessment in Small-Sized Intracranial Aneurysm. Journal of clinical medicine 8(5): 683. https://doi.org/10.3390/jcm8050683
  22. Ahn, J. H., Kim, H. C., Rhim, J. K., Park, J. J., Sigmund, D., Park, M. C., Jeong, J. H., & Jeon, J. P. (2021) Multi-View Convolutional Neural Networks in Rupture Risk Assessment of Small, Unruptured Intracranial Aneurysms. Journal of personalized medicine 11(4): 239. https://doi.org/10.3390/jpm11040239
  23. Ikawa, F., Morita, A., Tominari, S., Nakayama, T., Shiokawa, Y., Date, I., Nozaki, K., Miyamoto, S., Kayama, T., Arai, H., & Japan Neurosurgical Society for UCAS Japan Investigators (2019) Rupture risk of small unruptured cerebral aneurysms. Journal of neurosurgery 1–10. Advance online publication. https://doi.org/10.3171/2018.9.JNS181736
  24. Lindgren, A. E., Koivisto, T., Björkman, J., von Und Zu Fraunberg, M., Helin, K., Jääskeläinen, J. E., & Frösen, J. (2016) Irregular Shape of Intracranial Aneurysm Indicates Rupture Risk Irrespective of Size in a Population-Based Cohort. Stroke 47(5): 1219–1226. https://doi.org/10.1161/STROKEAHA.115.012404
  25. Rousseau, O., Karakachoff, M., Gaignard, A., Bellanger, L., Bijlenga, P., Constant Dit Beaufils, P., L'Allinec, V., Levrier, O., Aguettaz, P., Desilles, J. P., Michelozzi, C., Marnat, G., Vion, A. C., Loirand, G., Desal, H., Redon, R., Gourraud, P. A., Bourcier, R., & ICAN Investigators (2021) Location of intracranial aneurysms is the main factor associated with rupture in the ICAN population. Journal of neurology, neurosurgery, and psychiatry 92(2): 122–128. https://doi.org/10.1136/jnnp-2020-324371
  26. Tada, Y., Wada, K., Shimada, K., Makino, H., Liang, E. I., Murakami, S., Kudo, M., Kitazato, K. T., Nagahiro, S., & Hashimoto, T. (2014) Roles of hypertension in the rupture of intracranial aneurysms. Stroke 45(2): 579–586. https://doi.org/10.1161/STROKEAHA.113.003072
  27. Lindgren, A. E., Kurki, M. I., Riihinen, A., Koivisto, T., Ronkainen, A., Rinne, J., Hernesniemi, J., Eriksson, J. G., Jääskeläinen, J. E., & von und zu Fraunberg, M. (2014) Hypertension predisposes to the formation of saccular intracranial aneurysms in 467 unruptured and 1053 ruptured patients in Eastern Finland. Annals of medicine 46(3): 169–176. https://doi.org/10.3109/07853890.2014.883168
  28. Song, J., & Shin, Y. S. (2016) Diabetes may affect intracranial aneurysm stabilization in older patients: Analysis based on intraoperative findings. Surgical neurology international 7(Suppl 14): S391–S397. https://doi.org/10.4103/2152-7806.183497
  29. Can, A., Castro, V. M., Yu, S., Dligach, D., Finan, S., Gainer, V. S., Shadick, N. A., Savova, G., Murphy, S., Cai, T., Weiss, S. T., & Du, R. (2018) Antihyperglycemic Agents Are Inversely Associated With Intracranial Aneurysm Rupture. Stroke 49(1): 34–39. https://doi.org/10.1161/STROKEAHA.117.019249
  30. Skodvin, T. Ø., Johnsen, L. H., Gjertsen, Ø., Isaksen, J. G., & Sorteberg, A. (2017) Cerebral Aneurysm Morphology Before and After Rupture: Nationwide Case Series of 29 Aneurysms. Stroke 48(4): 880–886. https://doi.org/10.1161/STROKEAHA.116.015288

Tables

Table 1 Patient and aneurysm characteristics of the training, internal validation and external validation set

| Characteristics | Training set | Internal validation set | External validation set | P-value |
|---|---|---|---|---|
| Patient characteristics | | | | |
| Age, years, mean ± SD | 59.032 ± 10.52 | 58.170 ± 10.09 | 63.33 ± 10.09 | 0.002* |
| Female, n (%) | 194 (29.5%) | 82 (29.1%) | 23 (35.9%) | 0.531 |
| Hypertension, n (%) | 395 (60%) | 169 (59.9%) | 39 (69.9%) | 1.000 |
| DM, n (%) | 62 (9.4%) | 19 (6.7%) | 14 (21.9%) | 0.002* |
| AF, n (%) | 8 (1.2%) | 6 (2.1%) | 0 (0%) | 0.458 |
| CHD, n (%) | 64 (9.7%) | 28 (9.9%) | 8 (12.5%) | 0.722 |
| SAH, n (%) | 10 (1.5%) | 1 (0.4%) | 1 (1.6%) | 0.263 |
| Smoking, n (%) | 109 (16.6%) | 43 (15.2%) | 6 (9.4%) | 0.329 |
| Drinking, n (%) | 47 (7.1%) | 19 (7.1%) | 3 (4.7%) | 0.835 |
| Aneurysm characteristics | | | | |
| PC, n (%) | 52 (7.9%) | 26 (9.2%) | 2 (3.1%) | 0.266 |
| Bifurcation location, n (%) | 54 (8.2%) | 19 (6.7%) | 3 (4.7%) | 0.568 |
| Irregular shape, n (%) | 269 (40.9%) | 134 (47.5%) | 7 (10.9%) | <0.001* |
| NNA, n (%) | 201 (30.5%) | 82 (29.1%) | 22 (34.4%) | 0.684 |
| Maximum size, mm, mean ± SD | 4.13 ± 1.45 | 4.08 ± 1.46 | 4.21 ± 1.35 | 0.746 |
| Location, n (%) | | | | 0.052 |
| ACA | 34 (5.2%) | 17 (6.0%) | 3 (4.7%) | |
| ACOA | 117 (17.8%) | 49 (17.4%) | 8 (12.5%) | |
| ICA | 125 (19.0%) | 41 (14.5%) | 24 (37.5%) | |
| MCA | 86 (13.1%) | 34 (12.1%) | 6 (9.4%) | |
| PCOA | 244 (37.1%) | 115 (40.8%) | 21 (32.8%) | |
| PCA | 12 (1.8%) | 7 (2.5%) | 0 (0%) | |
| VA | 7 (1.1%) | 6 (2.1%) | 1 (1.6%) | |
| PICA | 8 (1.2%) | 7 (2.5%) | 0 (0%) | |
| BA | 16 (2.4%) | 2 (0.7%) | 0 (0%) | |
| Others | 9 (1.4%) | 4 (1.4%) | 1 (1.6%) | |
| Ruptured IAs | 336 (51.1%) | 144 (51.1%) | 20 (31.3%) | 0.009* |

DM, diabetes mellitus; AF, atrial fibrillation; CHD, coronary heart disease; SAH, subarachnoid hemorrhage; PC, posterior circulation; NNA, narrow neck aneurysm; ACA, anterior cerebral artery; ACOA, anterior communicating artery; ICA, internal carotid artery; MCA, middle cerebral artery; PCOA, posterior communicating artery; PCA, posterior cerebral artery; VA, vertebral artery; PICA, posterior inferior cerebellar artery; BA, basilar artery; IA, intracranial aneurysm. * Statistical difference across the three groups on intergroup comparison.


Table 2 Predictors determined by LASSO analysis

| Feature | Coefficient |
|---|---|
| ICA | 0.335 |
| ACOA | 0.290 |
| DM | 0.115 |
| Size | 0.102 |
| Hypertension | 0.080 |
| Shape | 0.077 |
| PCOA | 0.016 |
| NNA | 0.008 |
| Age | 0.002 |


Table 3 Model performance in the training, internal validation and external validation set

| Data set | Model | AUC (95% CI) | Sensitivity | Specificity | PPV | NPV | Accuracy | P-value |
|---|---|---|---|---|---|---|---|---|
| Training | PHASES | 0.635 (0.592-0.678) | 0.848 | 0.457 | 0.620 | 0.742 | 0.657 | Reference |
| Training | RFC | 0.887 (0.863-0.910) | 0.810 | 0.792 | 0.802 | 0.799 | 0.801 | <0.001* |
| Training | SVM | 0.823 (0.792-0.855) | 0.821 | 0.699 | 0.740 | 0.789 | 0.761 | <0.001* |
| Training | XGBoost | 0.889 (0.865-0.913) | 0.783 | 0.820 | 0.819 | 0.783 | 0.801 | <0.001* |
| Training | CatBoost | 0.876 (0.859-0.901) | 0.872 | 0.730 | 0.771 | 0.845 | 0.802 | <0.001* |
| Training | LightGBM | 0.854 (0.825-0.883) | 0.744 | 0.826 | 0.817 | 0.756 | 0.784 | <0.001* |
| Internal validation | PHASES | 0.616 (0.550-0.683) | 0.896 | 0.391 | 0.606 | 0.783 | 0.649 | Reference |
| Internal validation | RFC | 0.789 (0.736-0.841) | 0.771 | 0.645 | 0.694 | 0.730 | 0.709 | <0.001* |
| Internal validation | SVM | 0.817 (0.769-0.866) | 0.854 | 0.623 | 0.703 | 0.804 | 0.741 | <0.001* |
| Internal validation | XGBoost | 0.773 (0.719-0.827) | 0.743 | 0.645 | 0.686 | 0.706 | 0.695 | <0.001* |
| Internal validation | CatBoost | 0.785 (0.733-0.838) | 0.826 | 0.594 | 0.680 | 0.766 | 0.713 | <0.001* |
| Internal validation | LightGBM | 0.791 (0.740-0.843) | 0.708 | 0.681 | 0.699 | 0.691 | 0.695 | <0.001* |
| External validation | PHASES | 0.667 (0.536-0.789) | 0.800 | 0.545 | 0.444 | 0.857 | 0.625 | Reference |
| External validation | RFC | 0.852 (0.744-0.960) | 0.900 | 0.614 | 0.514 | 0.931 | 0.703 | <0.001* |
| External validation | SVM | 0.893 (0.808-0.979) | 0.900 | 0.636 | 0.529 | 0.933 | 0.719 | <0.001* |
| External validation | XGBoost | 0.842 (0.728-0.956) | 0.900 | 0.500 | 0.450 | 0.917 | 0.625 | <0.001* |
| External validation | CatBoost | 0.869 (0.770-0.967) | 0.900 | 0.500 | 0.450 | 0.917 | 0.625 | <0.001* |
| External validation | LightGBM | 0.877 (0.788-0.967) | 0.900 | 0.500 | 0.500 | 0.929 | 0.688 | <0.001* |

AUC, area under the curve of receiver operating characteristic; CI, confidence interval; PPV, positive predictive value; NPV, negative predictive value; RFC, random forest classifier; SVM, support vector machine; XGBoost, extreme gradient boosting; LightGBM, light gradient boosting machine; CatBoost, categorical boosting. P-values were calculated using the DeLong test.