In this study, we used deep learning to develop a CAD system that detects radio-opaque urinary tract stones on plain X-ray images. The model detected urinary tract stones quickly and with high sensitivity. Such a system might become a new screening modality for diagnosing urolithiasis.
Automatic detection of urinary tract stones using a CAD system has several advantages. First, a CAD system provides quick and consistent interpretation without fatigue or day-to-day variability, which is helpful in emergency medicine. Second, a CAD system is economical and easy to access because it runs as an application on a computer; no special equipment needs to be purchased, and only the application needs to be installed. Third, a CAD system based on deep learning can continue to learn: training on the mistakes it has made allows its diagnostic performance to improve. Given these advantages, a plain X-ray combined with a CAD system may become a reliable screening tool in clinical practice, although its accuracy remains the main problem.
We evaluated the diagnostic performance of the CAD system with a focus on the balance between overlooking and misdetection, which we adjusted by changing the weight of the loss for overlooking. We measured the F score, which represents this balance. The F score was best when the weight of the loss for overlooking was set to 1, at which point the sensitivity and PPV were 0.872 and 0.662, respectively. The sensitivity appears sufficient for the CAD system to help primary care physicians find urinary tract stones. However, the PPV was low, particularly in the kidney, the mid-ureter, and the distal ureter. This was probably because the model tended to misidentify calcifications or bones as urinary tract stones. The model has no knowledge of the structure of the human body, which results in mistakes that physicians would not make. Another possible reason is the small amount of training data: we prepared only 1017 images, which is a small number for deep learning. For example, a previous report on deep learning for chest X-rays used over 100,000 images [15]. In the future, if we can prepare an adequate amount of training data and combine the model with another algorithm that identifies the structure of the human body, the accuracy may be further enhanced. Nevertheless, despite the low PPV, the CAD system created in this study seems to be useful as a screening tool.
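As a rough check on the balance metric discussed above, the F score can be recomputed from the reported sensitivity and PPV. The sketch below assumes that the F score is the standard F1 score, that is, the harmonic mean of sensitivity and PPV; the function name `f_score` and the `beta` parameter are illustrative and are not taken from the study.

```python
def f_score(sensitivity: float, ppv: float, beta: float = 1.0) -> float:
    """F-beta score from sensitivity (recall) and PPV (precision).

    beta = 1 gives the harmonic mean (F1). Treating the study's F score
    as F1 is an assumption made for this sketch.
    """
    if not (0.0 <= sensitivity <= 1.0 and 0.0 <= ppv <= 1.0):
        raise ValueError("sensitivity and ppv must lie in [0, 1]")
    denom = beta ** 2 * ppv + sensitivity
    if denom == 0:
        return 0.0  # no true positives at all
    return (1 + beta ** 2) * ppv * sensitivity / denom


# Values reported for a loss weight of 1 for overlooking.
print(round(f_score(0.872, 0.662), 3))
```

Under this assumption, the reported sensitivity and PPV correspond to an F score of about 0.75. A `beta` greater than 1 would weight overlooking more heavily than misdetection, which may be closer to what a screening tool needs.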
Data augmentation is a method of enlarging the training data by applying changes to the original images. We did not perform data augmentation in this study. In our pilot analysis, data augmentation did not improve accuracy (the best F score was 0.636), probably because the model identified urinary tract stones based on their shape or orientation. Data augmentation by transformation or rotation therefore might not be helpful for developing a CAD system to detect urinary tract stones. However, data augmentation without transformation or rotation may improve the accuracy. One such method is embedding augmentation, which creates a synthetic image that is similar to a real image [16]. Using stone embedding, we could create a large number of synthetic urolithiasis images and increase the training data for deep learning, which may improve the CAD system’s diagnostic performance.
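To make the idea of stone embedding concrete, the following is a minimal sketch that blends a small bright patch into a grayscale image without rotating or deforming the background. It is an illustration under stated assumptions, not the method of reference [16]: the helper `embed_stone`, the Gaussian blob used as a stand-in stone, the random background, and the intensity range of [0, 1] are all hypothetical.

```python
import numpy as np


def embed_stone(image, patch, top, left, alpha=1.0):
    """Blend a stone-like patch into a copy of a grayscale image.

    Returns the augmented image and the bounding box
    (top, left, bottom, right) of the embedded patch, which could be
    used as a detection label. Hypothetical helper for illustration.
    """
    h, w = patch.shape
    if top < 0 or left < 0 or top + h > image.shape[0] or left + w > image.shape[1]:
        raise ValueError("patch does not fit inside the image")
    out = image.astype(np.float32)  # astype returns a copy, so the input is untouched
    region = out[top:top + h, left:left + w]
    # Radio-opaque stones appear brighter than the background, so the
    # patch is added to the background and clipped to the valid range.
    out[top:top + h, left:left + w] = np.clip(region + alpha * patch, 0.0, 1.0)
    return out, (top, left, top + h, left + w)


rng = np.random.default_rng(0)
# Stand-in for a real X-ray image, with intensities assumed to lie in [0, 1].
background = rng.uniform(0.2, 0.4, size=(64, 64)).astype(np.float32)
# Stand-in for a stone: a 9 x 9 Gaussian blob.
yy, xx = np.mgrid[-4:5, -4:5]
stone = np.exp(-(yy ** 2 + xx ** 2) / 8.0).astype(np.float32)

augmented, box = embed_stone(background, stone, top=20, left=30, alpha=0.5)
print(augmented.shape, box)
```

In practice, the patch would more plausibly be cropped from a real stone image and placed at anatomically plausible positions; the random placement here only shows the mechanics.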
The present study has several limitations. First, we excluded cases with only radiolucent urinary tract stones, which are reportedly observed in 10% of patients with urolithiasis, because the diagnostic ability of the algorithm depends on the contrast information in the image [17]. A radiolucent urinary tract stone will not be detected by a model trained with a CNN algorithm on plain X-ray images; this appears to be an inherent limit of the approach. Although the diagnostic performance of a CAD system would currently be inferior to CT for diagnosing radiolucent stones, a CAD system could be improved in the future, and it is expected to be used in various situations, such as follow-up until stone expulsion or objective evaluation of stone treatment. In addition, the efficacy of a CAD system combined with CT images in the urological field, including not only urolithiasis but also neoplastic disease, is under investigation. Second, the proportion of X-ray images with a distal ureteral stone was lower in the test dataset than in the training dataset. Distal ureteral stones were difficult to distinguish from pelvic phleboliths, which might tend to increase the number of FP results. Therefore, the composition of the test dataset in the present study might have reduced the number of FPs. Third, we did not include negative images without a stone lesion. Therefore, the specificity and negative predictive value could not be calculated in this study. Prospective investigations of the CAD system’s usefulness in clinical practice for urinary stones are required.
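The third limitation follows directly from how the metrics are defined: specificity and negative predictive value both require a count of true negatives, which is unavailable without negative images. The sketch below illustrates this with a hypothetical helper, `screening_metrics`; the counts are made up for illustration and are not results from the study.

```python
from typing import Optional


def screening_metrics(tp: int, fp: int, fn: int, tn: Optional[int] = None) -> dict:
    """Basic diagnostic metrics from confusion-matrix counts.

    Specificity and NPV are returned as None when no true-negative
    count is available, for example when the dataset has no negative images.
    """
    def ratio(num: int, den: int) -> Optional[float]:
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # TP / (TP + FN)
        "ppv": ratio(tp, tp + fp),          # TP / (TP + FP)
        "specificity": None if tn is None else ratio(tn, tn + fp),  # TN / (TN + FP)
        "npv": None if tn is None else ratio(tn, tn + fn),          # TN / (TN + FN)
    }


# Made-up counts for illustration only; not results from the study.
print(screening_metrics(tp=8, fp=4, fn=2))
```

Sensitivity and PPV depend only on true positives, false positives, and false negatives, so they can still be reported for a dataset in which every image contains a stone.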