Background
Stroke is one of the most significant health challenges facing China, imposing a heavy burden on families, national health services, social services, and the economy. Length of hospital stay (LOS) is an essential indicator of medical service utilization and is commonly used to assess the efficiency of hospital management and the quality of patient care. This study established a machine learning model to predict the LOS of patients with ischemic stroke.
Methods
Electronic medical records of 18,195 patients with ischemic stroke, comprising 28 attributes, were extracted from a large comprehensive hospital in China. After data preprocessing and feature selection, the XGBoost algorithm was used to build the prediction model, which was validated with 10-fold cross-validation. Accuracy (ACC), recall (RE), and the F1 measure were used to evaluate the model's performance in predicting the LOS of ischemic stroke patients. Finally, XGBoost feature importance was used to rank all attributes and to identify and remove irrelevant features.
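As a rough illustration of the validation scheme described above, the following sketch splits sample indices into 10 folds and computes ACC, RE, and F1 from predicted labels. It is not the authors' implementation: the function names `k_fold_indices` and `binary_metrics` are hypothetical, and the sketch assumes a binary LOS label, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int = 10) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_samples) into k contiguous folds of near-equal size.

    Returns one (train_indices, test_indices) pair per fold.
    Shuffling is omitted here for simplicity (an assumption of this sketch).
    """
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # first `extra` folds get one more sample
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, recall, F1) for binary labels, with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return accuracy, recall, f1
```

In a full pipeline, a classifier would be trained on each fold's training indices and scored on its test indices, and the three metrics would then be averaged over the 10 folds.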
Results
Under 10-fold cross-validation, the average ACC, RE, and F1 measure were 0.96, 0.82, and 0.79, respectively. The feature importance analysis showed that the LOS of ischemic stroke patients was affected by demographic characteristics, past medical history, admission examination features, and operation characteristics. NIHSS, MRS, Hemiplegia aphasia, age, BMI, and TIA were among the ten most important features for predicting the LOS of ischemic stroke patients.
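The ranking step behind a top-ten list can be sketched as follows. This is a minimal illustration only: `top_k_features` is a hypothetical helper, and the feature names and scores are placeholders, not values from the study.

```python
from typing import Dict, List, Tuple


def top_k_features(importances: Dict[str, float], k: int = 10) -> List[Tuple[str, float]]:
    """Return the k features with the highest importance score, highest first."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return ranked[:k]


# Placeholder scores for illustration only; NOT results from the study.
example_scores = {
    "feature_a": 0.10,
    "feature_b": 0.40,
    "feature_c": 0.05,
    "feature_d": 0.30,
    "feature_e": 0.15,
}
top3 = top_k_features(example_scores, k=3)
```

With a trained model, the dictionary would instead hold the importance score that the model assigns to each of the 28 attributes.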
Conclusions
The XGBoost algorithm is an appropriate machine learning method for predicting the LOS of patients with ischemic stroke. Based on this prediction model, an intelligent medical management system could be developed to predict LOS from the electronic medical records of ischemic stroke patients.