The inspiration for this study arises from the unmet clinical need to accurately predict UA stones before selecting the optimal treatment modality. Over the last decade, technological advancements in ureteroscopes and laser lithotriptors have led to an upsurge in surgical intervention regardless of stone composition [15, 16]. Although alkalization therapy for UA stones is ideal for patients with high morbidity and for recurrent UA stone formers, it is commonly underused because of the lack of reliable factors predicting its outcome, concerns about heterogeneous stone composition, and patient intolerance [7–15]. Above all, the lack of standardized protocols for predicting UA components complicates treatment decisions in real-life situations [15]. The present study is the first to develop and validate an effective predictive model that combines NCCT images with traditional demographic and clinical data to distinguish UA from non-UA stones in patients with stones in the ‘grey zone’ HUs. External validation showed that our objective, expeditious, and non-invasive model could identify UA stones with an accuracy of 97.1%, the highest predictive performance reported to date.
Our model has several implications for improving the current standard of care once implemented in clinical practice. First, the input variables, including demographic, clinical, and NCCT data, are readily available in real-world practice, which supports the general applicability of our model. Previously reported stone component classification models that use imaging data generally require time-consuming manual analysis of HU parameters or additional examination with specific CT scanner types, such as DECT, which may not be available across all practice settings [8, 17–19]. In contrast, our automated model has the potential to be integrated into any electronic medical records system that supports coding algorithms and to serve as a decision support system. Such a system may reduce the time required for classification and avoid additional radiation exposure and costs.
Second, we selected patients with stones of relatively low HUs for model development, since these stones pose a diagnostic dilemma in clinical decision-making for alkalization therapy [5]. We selected stones with HUs < 800 in order to include struvite and cystine stones, in addition to UA stones, because each of these requires a completely distinct management approach. The multiclass classification model performed somewhat worse than the binary classification model. However, its overall performance was excellent and surpassed that of the conventional multivariate logistic regression model, providing a reliable diagnostic standard for treatment decision-making. Lastly, the architecture of our model and its working principle allow for future refinements. Our model could additionally integrate intraoperative laser lithotripsy data and has the potential to provide patient-specific optimal laser settings for maximal fragmentation efficiency according to the features of each stone.
Several strengths of our study are worth mentioning. First, external validation of prediction models is essential before their use in clinical practice. Since validation samples should be obtained from different but plausibly relevant cohorts, we validated the performance of our model in an external cohort comprising patients from an international institution with distinct ethnic backgrounds. Discrimination performance is usually inferior in the external validation cohort compared to the development cohort [20]. Nevertheless, the performance of our model in the external validation cohort was non-inferior to that in the development cohort, indicating the validity and feasibility of our model. Second, we incorporated a comprehensive set of demographic, clinical, and NCCT imaging data potentially associated with stone components into model development. Moreover, the dataset was of high quality: all input variables of the development and external validation cohorts were manually reviewed and incorporated without any missing data, which may have contributed to the high predictive performance of the model.
This study is not without limitations. First, mixed component stones were excluded from the development and external validation cohorts. Although the distinction between pure UA stones and mixed component stones is crucial in the decision-making of alkalization therapy, only stones with pure components were included. Since the extent of the UA component beyond which a stone has to be defined as mixed is unclear, subsequent studies incorporating quantitative analysis of mixed stones will be needed to identify the patients best suited to medical therapy. Second, a population-based database with a larger number of subjects may provide better generalizability. Nevertheless, we used institutional data, which provided a comprehensive and high-quality dataset, to maximize predictive performance. Lastly, performance declined for the multiclass classification, leaving its clinical usefulness uncertain, especially in classifying cystine stones. The likely explanation is the limited number of cystine stones in both the development and external validation cohorts. Notwithstanding these limitations, the advantages of our model over previously reported tools for predicting stone components indicate that it is feasible and generally applicable for implementation in real-world clinical practice.