The present study suggests that deep learning may be a valuable tool for automatically evaluating the quality of canine latero-lateral thoracic radiographs. This option would be highly beneficial in situations where an expert veterinary radiologist is not readily available, such as when centres rely on external consultation services or when an expert radiologist is only occasionally present. Overall, the ability to automatically evaluate image quality has the potential to improve efficiency and effectiveness in the veterinary medical imaging field.
In this prospective quality-improvement study, the quality criteria for chest radiographs were derived from the indications given in textbooks1, while also incorporating elements from prior works on the automatic evaluation of chest radiographs in human medicine14,15. Radiographic abnormalities were evaluated by the authors based on their expertise in veterinary diagnostic imaging, which inevitably involved some degree of subjectivity. To overcome this subjectivity, at least partially, each radiograph was evaluated simultaneously by three experienced operators.
Not surprisingly, one of the most common quality issues encountered in our database was a lack of parallelism (in 840 latero-lateral and 1018 sagittal radiographs) between the animal and the detector, labelled as “rotated” in this paper. This quality index is also frequently reported in human medicine: Nousiainen et al. (2021)15 proposed an automated methodology for chest radiograph quality control using convolutional neural networks (CNNs). Rotation was evaluated subjectively in that study, and the deep learning-based approach achieved an AUC of 0.72 for detecting this type of quality issue. In contrast, the model presented here demonstrated a higher accuracy (AUC of 0.84) for rotation, likely due to the larger size of our training database. Another study, by Meng et al. (2022)14, also examined the automatic evaluation of human chest X-rays, including the assessment of rotation. However, it is difficult to compare our results directly with those of Meng et al. (2022), as the methods used were quite different: Meng et al. developed a complex method to automatically measure the degree of rotation, and the accuracy of that method for detecting rotation was limited.
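The AUC figures compared above summarise how well a model's continuous output separates "rotated" from "non-rotated" images across all possible decision thresholds. As a minimal sketch, assuming per-image sigmoid scores from a binary classifier and 0/1 reference labels from the majority reading (the data below is purely illustrative, not from the study):

```python
# Illustrative AUC computation for a binary "rotated" quality flag.
# Labels and scores are made-up placeholders, not study data.
from sklearn.metrics import roc_auc_score

labels = [0, 0, 1, 1, 1, 0, 1, 0]                   # 1 = rotated (reference reading)
scores = [0.1, 0.3, 0.8, 0.6, 0.9, 0.2, 0.7, 0.4]   # model's sigmoid outputs

auc = roc_auc_score(labels, scores)
print(auc)  # 1.0 here, since every rotated image scores above every non-rotated one
```

An AUC of 0.5 corresponds to chance-level ranking, while 1.0 means perfect separation; the 0.72 vs. 0.84 comparison above is on this scale.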
In the present study, the accuracy for classifying both underexposed and overexposed radiographs was high, with AUCs between 0.84 and 0.92 across the different datasets. This result was rather unexpected because the radiographs included in the study were obtained using both computed radiography (CR) and direct radiography (DR) systems, and underexposure is known to appear slightly differently in CR than in DR16. Nonetheless, the high accuracy achieved in this study suggests that the developed algorithm was able to identify common features of underexposure in both modalities. To the best of our knowledge, this is the first study proposing a deep learning-based algorithm to evaluate such quality indices and, therefore, a comparison with similar studies is not possible.
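To make the exposure indices concrete: before deep learning, such flags were often approximated with simple grey-level statistics. The sketch below is an illustrative classical heuristic, not the study's method, and its thresholds are arbitrary assumptions for an 8-bit image:

```python
# Crude histogram-based exposure heuristic (illustrative only; the study used
# a deep learning classifier, not this rule). Thresholds are arbitrary.
import numpy as np

def exposure_flags(img: np.ndarray, low=30, high=225, frac=0.6):
    """Return (underexposed, overexposed) guesses for an 8-bit greyscale image."""
    under = np.mean(img < low) > frac    # most pixels are very dark
    over = np.mean(img > high) > frac    # most pixels are very bright
    return bool(under), bool(over)

dark = np.full((64, 64), 10, dtype=np.uint8)   # synthetic underexposed image
print(exposure_flags(dark))  # (True, False)
```

A learned classifier can outperform such fixed thresholds precisely because, as noted above, underexposure manifests differently across CR and DR systems.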
The presence of any foreign object on the radiograph was recorded and included in the quality indices. While these foreign objects are not a quality issue in and of themselves, they can obscure important areas of the image, making it difficult to detect certain lesions. Most of the time, these objects are medical devices that are vital to the patient (e.g. metallic clips, tracheal or oesophageal tubes, chest drains). To the best of our knowledge, the influence of foreign bodies on the accuracy of AI-powered diagnostic tools has not yet been investigated. However, it can be postulated that their presence might interfere with the interpretation of the images by the algorithms, as these objects are superimposed on thoracic structures.
Mispositioning of the limbs is a common issue in latero-lateral radiographs, and it can hinder interpretability due to the superimposition of the shoulder and forelimb muscles and bones on the cranial portion of the thorax, potentially obscuring lesions in that region18. The developed network had a high accuracy in detecting this technical error (AUC = 0.93 on latero-lateral and 0.92 on sagittal radiographs), suggesting that it was readily identified by ResNet-50. In our opinion, this quality index is less prone to subjectivity, so the evaluations by the three experienced radiologists may have been more consistent, contributing to the network’s high accuracy.
One limitation of this study is that the respiration phase was not considered among the quality indices. Other similar studies in human medicine have included this quality index in their analysis15. We elected not to include inspiration among the quality indices because there are no objective criteria for evaluating the appropriateness of the respiratory phase in the literature, and such an assessment would therefore be very subjective and prone to high inter- and intra-rater variability.
The overall accuracy of the system was slightly higher on latero-lateral images (total accuracy 81.5%) than on sagittal images (total accuracy 75.5%). In the authors’ opinion, this discrepancy is largely due to the smaller size of the sagittal image database compared with the latero-lateral radiograph database. Employing a more extensive database could potentially yield higher overall classification performance.
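For clarity on the metric: a "total accuracy" of this kind can be read as the fraction of images whose predicted quality flags all agree with the reference annotation. The sketch below illustrates that computation on made-up data; whether the study scored per-image exact matches or per-flag agreement is an assumption here:

```python
# Illustrative "total accuracy": fraction of images whose full tuple of
# predicted quality flags matches the reference. Data is made up.
preds = [(1, 0), (0, 0), (1, 1), (0, 1)]   # predicted flags per image
truth = [(1, 0), (0, 1), (1, 1), (0, 1)]   # reference flags per image

correct = sum(p == t for p, t in zip(preds, truth))
print(correct / len(truth))  # 0.75
```

Under this exact-match reading, each additional quality index makes the criterion stricter, which is one reason multi-label systems often report per-index AUCs alongside a single total accuracy.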