Background: Lymphedema is tissue swelling caused by an accumulation of protein-rich fluid that is normally drained through the lymphatic system. Detection of lymphedema often relies on expensive diagnostic methods such as bioimpedance spectroscopy, shear wave elastography, and computed tomography. Applications of data science and machine learning to predicting medical conditions have supported medical doctors and patients in the early detection of diseases. Although current studies have proposed machine learning models that predict lymphedema from patient-reported symptoms, patient-input data may be uncertain. In this study, we propose using more reliable input data, such as complete blood count, serum, and therapy data, to develop predictive models for lymphedema.
Methods: We collected data from 2137 patients, including 356 patients with lymphedema and 1781 patients without lymphedema. The lymphedema status of each patient was confirmed by clinicians. The data for each patient include: 1) a complete blood count (CBC) test, 2) a serum test, and 3) therapy information. We used machine learning algorithms (i.e., random forest, gradient boosting, support vector machine, decision tree, and artificial neural network) to develop predictive models on the training dataset (i.e., 80% of the data) and tested the models on the test dataset (i.e., 20% of the data). After choosing the best predictive models, we developed a web application that allows medical doctors and clinicians to use our models for quick screening of lymphedema patients.
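As an illustration of the 80%/20% partition described in the Methods, the following is a minimal sketch in plain Python. It assumes a stratified split (one that keeps the class ratio in both parts); the abstract does not state whether the split was stratified, and the function name `stratified_split`, the seed, and the 0/1 label coding are hypothetical choices made only for this example.

```python
import random

def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx), keeping class proportions in both parts."""
    rng = random.Random(seed)
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Class counts reported in the abstract: 356 with lymphedema (coded 1 here,
# an assumption) and 1781 without (coded 0).
labels = [1] * 356 + [0] * 1781
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With a class imbalance of roughly 1:5, stratifying is one way to make sure the minority class is represented in the test set at the same rate as in the full dataset.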
Results: A dataset of 2137 patients was collected from Seoul National University Bundang Hospital. Predictive models based on the random forest algorithm showed satisfactory performance (balanced accuracy = 86.7 ± 0.9%, sensitivity = 84.3 ± 0.6%, specificity = 89.1 ± 1.5%, precision = 97.4 ± 0.4%, F1 score = 90.4 ± 0.4%, and AUC = 0.931 ± 0.007). A web application was built to assist medical doctors in quick screening of lymphedema: https://snubhtxt.shinyapps.io/SNUBH_Lymphedema.
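The metrics reported in the Results follow standard definitions, which can be sketched from a binary confusion matrix as below. The function name `screening_metrics` and the example counts are hypothetical and are not the study's confusion matrix; only the three reported mean values at the end are taken from the abstract.

```python
def screening_metrics(tp, fn, tn, fp):
    """Compute standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)     # positive predictive value
    balanced_accuracy = (sensitivity + specificity) / 2
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "balanced_accuracy": balanced_accuracy,
        "f1": f1,
    }

# Hypothetical counts, for illustration only.
example = screening_metrics(tp=60, fn=11, tn=320, fp=36)

# Consistency check using the mean values reported in the abstract.
sens, spec, prec = 0.843, 0.891, 0.974
reported_balanced_accuracy = (sens + spec) / 2
reported_f1 = 2 * prec * sens / (prec + sens)
print(round(reported_balanced_accuracy, 3), round(reported_f1, 3))
```

Plugging in the reported means gives values close to the reported balanced accuracy (86.7%) and F1 score (90.4%). The match is only approximate in general, because a metric averaged over repeated runs need not equal the same metric computed from averaged inputs.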
Conclusions: Our study provides a tool for the early detection of lymphedema and can serve as a basis for future studies on predicting lymphedema stages.