DFU Datasets
DFUC 2020
The goal of the Diabetic Foot Ulcer Competition 2020 (DFUC 2020) dataset was to improve the accuracy of DFU detection in real-world settings21. The dataset consisted of foot images with DFUs collected from Lancashire Teaching Hospitals. The images were captured using three digital cameras (Kodak DX4530, Nikon D3300 and Nikon COOLPIX P100), and close-ups of the foot were taken without zoom or macro functions. The dataset comprised 4,000 images, with 2,000 used for training and 2,000 for testing. The images were acquired during regular patient appointments, resulting in variability in factors such as distance, angle, lighting, and the presence of background objects. The dataset included cases with multiple DFUs, different stages of healing, partial foot visibility, and foot deformities. The dataset also featured cases with time stamps, rulers, and partial blurring or obfuscation of wounds. The images were annotated by healthcare professionals, indicating the ulcer location using bounding boxes.
DFUC 2021
The goal of the Diabetic Foot Ulcer Competition 2021 (DFUC 2021) dataset was to improve the accuracy of DFU classification in real-world clinical settings22. The images in the dataset were captured from patients during their clinical visits at Lancashire Teaching Hospitals using three different camera models. Close-up photographs of the full foot were taken at a distance of 30–40 cm, ensuring a parallel orientation to the ulcer plane and using adequate room lighting for consistent colors. The dataset includes annotations by a podiatrist and a consultant physician for ulcer location, ischemia, and infection status. Data curation involved cropping DFU regions and applying natural data augmentation. The DFUC 2021 dataset contained a total of 15,683 images.
Wound image preprocessing
To optimize the performance of the wound detection model, a comprehensive image preprocessing pipeline was applied with the primary objective of removing background regions from the wound images (Fig. 3A). Before background removal, min-max image normalization was applied to ensure the comparability of wound images across different samples. This technique rescaled the pixel intensities of each image to the range 0 to 1. By subtracting the minimum pixel value and dividing by the range of pixel values, consistent intensity levels across all samples were achieved, accounting for variations in camera resolution and lighting conditions.
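As a minimal NumPy sketch of this normalization step (the helper name is illustrative, not from DFUCare):

```python
import numpy as np

def min_max_normalize(img: np.ndarray) -> np.ndarray:
    """Rescale pixel intensities of an image to the range [0, 1]."""
    img = img.astype(np.float64)
    lo, hi = img.min(), img.max()
    if hi == lo:                      # guard against constant images
        return np.zeros_like(img)
    return (img - lo) / (hi - lo)
```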
To accurately distinguish between the skin and background regions in the wound images, we implemented a colorspace thresholding approach. Extensive research has demonstrated the effectiveness of the Cr channel in the YCbCr colorspace, as well as the a* channel in the CIELAB colorspace, for precise skin-to-background segmentation23. Leveraging this knowledge, we generated a binary mask by applying Otsu's thresholding technique to the a* channel in the CIELAB colorspace and the Cr channel in the YCbCr colorspace. This binary mask was applied to the original wound image to separate the foreground from the background. In addition, median filtering was incorporated to refine the binary mask obtained from the thresholding process and minimize background region inconsistencies (Fig. 3B). This technique replaced each pixel with the median value of its neighboring pixels, removing isolated background pixels while preserving the overall structure of the mask. By incorporating this multi-step approach, our platform achieved a significant reduction in background regions in the wound images.
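The thresholding and filtering steps can be sketched as follows. The Otsu step is written out in NumPy for exposition (in practice OpenCV's `cv2.threshold` with `THRESH_OTSU` serves the same purpose), and the function names are assumptions:

```python
import numpy as np
from scipy.ndimage import median_filter

def otsu_threshold(channel: np.ndarray) -> int:
    """Otsu threshold for an 8-bit channel: pick the level that
    maximizes the between-class variance of the histogram."""
    hist = np.bincount(channel.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    omega = np.cumsum(prob)                    # class-0 probability
    mu = np.cumsum(prob * np.arange(256))      # class-0 mean * omega
    mu_t = mu[-1]                              # global mean
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * omega - mu) ** 2 / (omega * (1.0 - omega))
    return int(np.argmax(np.nan_to_num(sigma_b)))

def skin_mask(channel: np.ndarray, filter_size: int = 5) -> np.ndarray:
    """Binary skin mask: Otsu thresholding followed by median
    filtering to suppress isolated background pixels."""
    mask = (channel > otsu_threshold(channel)).astype(np.uint8)
    return median_filter(mask, size=filter_size)
```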
Wound detection and localization
DL-based object localization models, such as the YOLO series, have consistently demonstrated exceptional speed and accuracy in detecting objects. In particular, YOLOv5 exhibits improved learning capabilities compared to its predecessors and utilizes the BottleneckCSP technique to extract hierarchical features with reduced computational complexity24.
For our study, we employed the YOLOv5s model, pretrained on the COCO dataset, and fine-tuned it on the DFUC 2020 dataset to enhance model convergence. The DFUC 2020 dataset was divided into a training set (n = 1800) and a test set (n = 200), and a 10-fold cross-validation technique was applied, training each fold for 30 epochs.
To address the limited number of wound images in the dataset, we employed data augmentation techniques. These included adjusting the hue, saturation, and value (HSV) of the images, as well as utilizing translation, scaling, flipping, and mosaic techniques. This augmented dataset improved model performance and generalization.
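A simplified version of the augmentations might look like the following; this sketch only illustrates flips and a brightness shift in NumPy (HSV jitter, translation, scaling, and mosaic are configured through the YOLOv5 training hyperparameters rather than hand-written code):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img: np.ndarray) -> np.ndarray:
    """Apply one random pass of horizontal/vertical flips and a
    brightness shift to an 8-bit image."""
    out = img.copy()
    if rng.random() < 0.5:
        out = out[:, ::-1]                       # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :]                       # vertical flip
    shift = rng.integers(-30, 31)                # brightness shift
    return np.clip(out.astype(np.int16) + shift, 0, 255).astype(np.uint8)
```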
Additionally, the YOLOv5s model employs a stochastic gradient descent (SGD) optimizer with an initial learning rate of 0.0125. The chosen learning rate ensures a balance between convergence speed and accuracy, allowing the model to effectively optimize its performance in detecting wounds.
To improve the localization accuracy of the model and reduce generalization error, the weights were tuned to achieve the highest mAP and Intersection over Union (IoU) scores within the range of 0.5 to 0.95. A 10-fold cross-validation process was performed and the weights that achieved the best mAP and IoU scores were aggregated. This ensures that the selected weights yield improved localization performance on the DFUs even for unseen wound images beyond the training set.
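IoU, the localization metric referenced above, is the area of overlap between the predicted and ground-truth boxes divided by the area of their union, for example:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

A perfect prediction scores 1.0; disjoint boxes score 0.0, and mAP@0.5:0.95 averages precision over IoU thresholds in that range.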
Automated classification of infection and ischemia in wound images
To classify the detected wound images into four categories: i) infection, ii) ischemia, iii) both infection and ischemia, and iv) neither infection nor ischemia, both a classical machine learning pipeline trained on hand-crafted image features and a DL pipeline were developed. The inclusion of the classical machine learning approach facilitates the extraction of interpretable wound features, ensuring transparency and practicality in medical applications. The DL-based approach automatically learns complex patterns and hierarchical representations from wound images, capturing subtle features and nuances not easily discernible through traditional hand-crafted feature extraction, thereby improving model performance.
Deep learning-based classification of DFU
To determine the CNN architecture that achieved the highest DFU classification reliability, we chose four of the most popular ImageNet-pretrained models (ResNet50V2, VGG16, InceptionResNetV2, and DenseNet121)26–29. For each architecture, variants were trained with and without an additional dense layer between the last convolutional layer and the output node; an output node with a sigmoid activation function produced the binary classification result. All eight model variants were trained on the binary classification tasks for the presence of either infection or ischemia. Approximately 20% of the images from the training dataset (1,156 images) were held out for validation. To prevent overfitting and improve the performance of the DL models, image augmentation techniques, including random rotations, flips, and shifts in brightness, were applied to each image in each epoch, and binary cross-entropy was used as the loss function to update the weights in each iteration. We evaluated the performance of the algorithms using multiple metrics, including binary accuracy, area under the curve (AUC), precision, and recall. Models were trained in three phases of 20 epochs each in TensorFlow 2 (Fig. 4)30: 1) all convolutional layer weights were frozen and the remaining weights were optimized by Adam with a learning rate of 3e-4; 2) 4/5ths of the convolutional layers were frozen and RMSprop with a learning rate of 1e-5 was used for optimization; 3) 2/3rds of the layers remained frozen and optimization continued with a decayed learning rate of 1e-6 on the binary cross-entropy loss.
Due to the imbalance in the number of ischemia images present (179 of the 4,799 images), ischemia models were trained both on the dataset as-is and on a version in which ischemia-only images and images with both ischemia and infection were upsampled by a factor of six with random augmentations. This duplication brought the number of positive ischemia cases (662) in line with the number of negative ischemia cases (4,137). No modifications were made to the validation dataset.
The binary classification results were converted to a four-way classification result through the following formulas:
$$P\left(none\right)=\left(1-P\left(Inf\right)\right)*\left(1-P\left(Isch\right)\right)\quad \left(1\right)$$
$$P\left(In{f}_{Only}\right)=P\left(Inf\right)*\left(1-P\left(Isch\right)\right)\quad \left(2\right)$$
$$P\left(Isc{h}_{Only}\right)=\left(1-P\left(Inf\right)\right)*P\left(Isch\right)\quad \left(3\right)$$
$$P\left(Both\right)=P\left(Inf\right)*P\left(Isch\right)\quad \left(4\right)$$
where \(P\left(Inf\right)\) is the output of the binary infection model and \(P\left(Isch\right)\) is the output of the binary ischemia model. Four-way classification accuracy, F1-score, and AUC were assessed on the training, validation, and test datasets by combining each network architecture’s best infection and ischemia models.
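Equations 1–4 translate directly into code; a minimal sketch (the function and key names are illustrative):

```python
def four_way_probs(p_inf: float, p_isch: float) -> dict:
    """Combine the two binary model outputs into four-way class
    probabilities, assuming independence (Eqs. 1-4)."""
    return {
        "none":      (1 - p_inf) * (1 - p_isch),   # Eq. 1
        "inf_only":  p_inf * (1 - p_isch),         # Eq. 2
        "isch_only": (1 - p_inf) * p_isch,         # Eq. 3
        "both":      p_inf * p_isch,               # Eq. 4
    }
```

Because the four terms expand the product of two binary distributions, the probabilities always sum to 1.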
Handcrafted features extraction and classical machine learning-based DFU classification
The classical machine learning algorithm for wound classification was a comprehensive approach that incorporated six visual analysis methods to extract features from wound images31. The algorithm computed the distribution of CIELAB color space channels, the Gray Level Co-occurrence Matrix (GLCM) for the full image, the distribution of GLCM metrics for 64x64 pixel patches across an image, Local Binary Patterns (LBP), Local Phase Quantization (LPQ), and Gabor filters to extract a mixture of color and textural features (Figure S4). These handcrafted features were used to train classical models, including a non-linear SVM with a Radial Basis Function (RBF) kernel, gradient boosting (100 trees with a depth of 3, either on raw features or after applying Principal Component Analysis (PCA)), XGBoost (100 trees with a depth of 3, raw features or after PCA), and a multilayer perceptron (MLP) with three layers, to classify infected vs. non-infected or ischemic vs. non-ischemic DFUs32–34. The algorithm was trained on a dataset of 4,799 images using 5-fold cross-validation to select the optimal number of principal components, and additionally tested on the held-out validation set (1,156 images). Two binary classifiers identifying infection and ischemia, respectively, and a multi-class classifier with four categories were developed and evaluated using F1-score, precision, recall, and accuracy.
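As an illustration of one of these hand-crafted descriptors, a basic 8-neighbour LBP histogram can be computed in NumPy as follows (scikit-image's `local_binary_pattern` offers rotation-invariant variants in practice; this minimal version is for exposition only):

```python
import numpy as np

def lbp_histogram(gray: np.ndarray) -> np.ndarray:
    """Normalized 256-bin histogram of basic 8-neighbour Local Binary
    Pattern codes: each interior pixel gets one bit per neighbour that
    is >= the center value."""
    g = gray.astype(np.int16)
    center = g[1:-1, 1:-1]
    codes = np.zeros_like(center)
    # neighbour offsets, clockwise from top-left
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neigh = g[1 + dy:g.shape[0] - 1 + dy, 1 + dx:g.shape[1] - 1 + dx]
        codes += (neigh >= center).astype(np.int16) << bit
    hist = np.bincount(codes.ravel(), minlength=256).astype(np.float64)
    return hist / hist.sum()
```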
Wound characterization and analysis
Wound size measurement
To determine the surface area of the wound from a camera image, DFUCare utilized a 1.3 cm × 1.3 cm ArUco marker placed near the wound, along with the OpenCV library, to calculate a "pixel to metric" ratio based on the predefined size of the marker. This allowed the conversion of pixel dimensions to measurements in centimeters (Fig. 5). The width and height of the wound region were then obtained from the size of the bounding box produced by the wound localization step.
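Once the marker's side length in pixels is known, the conversion reduces to simple arithmetic. A sketch, assuming the marker side in pixels comes from the detected ArUco corner coordinates (function names are illustrative):

```python
MARKER_SIZE_CM = 1.3  # known physical side length of the ArUco marker

def pixels_to_cm(length_px: float, marker_side_px: float) -> float:
    """Convert a pixel length to centimeters using the marker-derived
    pixel-to-metric ratio (cm per pixel)."""
    ratio = MARKER_SIZE_CM / marker_side_px
    return length_px * ratio

def wound_size_cm(bbox_w_px, bbox_h_px, marker_side_px):
    """Width and height of the localization bounding box in cm."""
    return (pixels_to_cm(bbox_w_px, marker_side_px),
            pixels_to_cm(bbox_h_px, marker_side_px))
```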
Color analysis of the wounds
The coloration of DFUs is a significant factor in their classification and assessment. Studies have demonstrated that ulcers with a red or yellow hue are more likely to be infected, while those displaying pale or darker tones are more likely to be caused by ischemia31,35,36. DFUCare employed unsupervised K-means clustering, implemented with the Scikit-learn library, to analyze and determine the relative percentage of the seven major colors present in the localized DFU images, providing valuable insights to clinicians for tissue analysis (Fig. 5). This color analysis tool enables physicians to conduct a proper analysis of DFU coloration by quantifying the relative percentage of each color present in the wound.
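A minimal NumPy sketch of this color analysis (DFUCare uses Scikit-learn's `KMeans`; the function name and the unique-color initialization scheme here are assumptions for illustration):

```python
import numpy as np

def dominant_colors(pixels: np.ndarray, k: int = 7,
                    iters: int = 20, seed: int = 0):
    """Cluster RGB pixels with plain K-means and return the cluster
    centers plus the relative percentage of pixels in each cluster."""
    rng = np.random.default_rng(seed)
    pts = pixels.reshape(-1, 3).astype(np.float64)
    uniq = np.unique(pts, axis=0)
    k = min(k, len(uniq))             # cannot have more clusters than colors
    centers = uniq[rng.choice(len(uniq), size=k, replace=False)]
    for _ in range(iters):
        # assign each pixel to its nearest center, then update centers
        d = np.linalg.norm(pts[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            sel = labels == j
            if sel.any():
                centers[j] = pts[sel].mean(axis=0)
    pct = np.bincount(labels, minlength=k) / len(pts) * 100.0
    return centers, pct
```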
Texture analysis of the wounds
The progression of wound healing can be observed through changes in the texture of the wound surface. A smooth surface is indicative of proper healing as new tissue forms and the wound contracts, whereas roughness may suggest infection or delayed tissue regeneration. Furthermore, the accumulation of necrotic tissue, also known as eschar, can contribute to roughness and impede healing. To obtain roughness values, the two-dimensional grayscale image of the wound surface was transformed into a three-dimensional representation with a height map projection using the NumPy and SciPy libraries. After applying a Gaussian filter to minimize image noise, roughness was calculated by analyzing the "bumps", or variations, of the surface of the three-dimensional projection. This allowed both a graphical representation of the roughness and a numerical measurement.
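One plausible way to reduce the filtered height map to a single number is the RMS deviation of the smoothed surface from its mean height; the exact metric used by DFUCare is not specified, so this sketch is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def surface_roughness(gray: np.ndarray, sigma: float = 2.0) -> float:
    """Treat the grayscale image as a height map, denoise it with a
    Gaussian filter, and report roughness as the RMS deviation of the
    smoothed surface from its mean height."""
    height = gaussian_filter(gray.astype(np.float64), sigma=sigma)
    return float(np.sqrt(np.mean((height - height.mean()) ** 2)))
```

A perfectly flat surface scores zero; larger surface variations yield larger values.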
Pilot study for determining the performance of DFUCare algorithm
To test the performance of the DFUCare algorithm, we performed a pilot study in collaboration with the PGIMER in Chandigarh, India. Wound images were obtained as part of a routine visit to the foot care lab of the endocrinology clinic at PGIMER. All image and data collection was performed using methods and procedures in accordance with the relevant guidelines and regulations approved by the “Institute Ethics Committee, PGIMER, Chandigarh, India”. The infection and ischemia status of wounds was determined by a physician at the foot care lab of PGIMER with the help of standard cultures and wound characteristics. The wound images, with the ArUco marker placed adjacent to the wound, were acquired using an iPhone X camera. In addition to wound images, de-identified patient demographics, infection status, ischemia status, and manually measured wound size (rounded to the nearest whole number) were also collected.
Ethical compliance
All wound patch images were collected from the Diabetes clinic at the Postgraduate Institute of Medical Education and Research (PGIMER), Chandigarh India according to the procedure approved by the “Institute Ethics Committee, PGIMER, Chandigarh, India”. Informed consent was obtained from study participants at PGIMER.