A combined model based on CT radiomics and clinical variables to predict uric acid calculi with good accuracy

The aim of this study was to develop a diagnostic model based on CT radiomics and clinical variables for the preoperative prediction of uric acid calculi. In this retrospective study, 370 patients with urolithiasis who underwent preoperative urinary CT scans were enrolled. The CT images of each patient were manually segmented, and radiomics features were extracted. Sixteen radiomics features were selected by one-way analysis of variance (ANOVA) and the least absolute shrinkage and selection operator (LASSO). Logistic regression (LR), random forest (RF) and support vector machine (SVM) were used to model the selected features, and the model with the best performance was selected. Multivariate logistic regression was used to screen out significant clinical variables, and the radiomics features and clinical variables were combined to construct a nomogram model. The area under the receiver operating characteristic (ROC) curve (AUC) and related indicators were used to evaluate the diagnostic performance of the models. Among the three machine learning models, the LR model had the best performance and was robust across the training and validation sets. Therefore, the LR model was used to construct the nomogram. The AUCs of the nomogram model in the training set and validation set were 0.878 and 0.867, respectively, which were significantly higher than those of the radiomics model and the clinical feature model. The CT-based radiomics model has good performance in distinguishing uric acid stones from nonuric acid stones, and the nomogram model has the best diagnostic performance among the three models. This model can provide an effective reference for clinical decision-making.


Introduction
Urolithiasis is a common urological disease, and its incidence is increasing yearly. Its chemical components include uric acid, cystine, calcium oxalate, hydroxyapatite and so on. Uric acid stones (UAs) account for 10 to 15% of all urolithiasis cases [1,2]. Seventy-nine percent of UAs occur in men, mostly between the ages of 60 and 65. It is generally believed that UAs are associated with obesity, diabetes, and hypertension [3][4][5]. It is very important to distinguish between UA and non-UA (nUA) before treatment. Oral litholytic drugs that alkalinize and dilute urine are the first choice for the treatment of UA, and most patients with UA can be cured by drug treatment and avoid surgery, while patients with nUA generally require comprehensive treatment with multiple methods [6,7]. With the rapid development of CT technology, dual-energy CT (DECT) has been used to accurately distinguish UA from nUA, but DECT stone composition analysis has not been widely adopted, and it significantly increases the radiation exposure dose and treatment cost of patients [8,9]. Since the composition of the stones cannot be known before surgery, most UA patients are still treated with extracorporeal shock wave lithotripsy or endoscopic surgery, which causes physical pain and increases financial burdens.
Radiomics is a relatively new approach that uses data representation algorithms to extract features from multimodal medical images [10,11]. Over the past decade, radiomics features have been used as imaging biomarkers for cancer prognosis, staging, and prediction [12][13][14]. Radiomics methods have been successfully applied in the preoperative diagnosis, component identification and prognosis prediction of urinary system diseases such as prostate cancer, kidney cancer and bladder cancer [15][16][17][18][19][20]. Urolithiasis is an important disease of the urinary system, and relevant radiomics studies have made progress in stone composition identification, stone volume assessment, and prediction of the stone-free rate after flexible ureteroscopy [21][22][23][24]. The aim of this study was to develop and validate a diagnostic model based on CT radiomics and clinical variables for the preoperative prediction of uric acid in urinary stones.

Patient data
The institutional review board approved this retrospective study and waived the requirement for informed consent from patients. The data of 370 patients with urolithiasis who underwent surgical treatment at the Affiliated Hospital of Qingdao University from July 2013 to December 2021 were retrospectively collected, including 169 cases of UA and 201 cases of nUA. Stones were obtained via percutaneous nephrolithotomy (PCNL), ureteroscopic lithotripsy, and open or laparoscopic lithotomy.
The inclusion criteria were as follows: (1) UA or nUA confirmed by postoperative stone composition analysis; (2) preoperative urinary noncontrast CT (NCCT) scan; and (3) age over 18 years. The exclusion criteria were as follows: (1) coexisting urinary system diseases such as tumors and malformations; (2) nephrostomy tubes, DJ tubes, urinary catheters, etc., indwelling in the patient during imaging; (3) heavy image artifacts that precluded subsequent image segmentation and radiomics feature extraction; and (4) stones that appeared on a single slice or were less than 5 mm in diameter. The baseline clinical data, including age, sex, blood analysis and urinalysis results, were obtained from the medical records. See the flowchart (Fig. 1) for the inclusion and exclusion of participants.

CT image acquisition
The scan range was from the upper edge of the kidney to the lower edge of the ischial tuberosity, and NCCT was performed with a 64-slice MDCT scanner (Discovery CT 750 HD, GE Healthcare, USA). The relevant CT imaging acquisition parameters were as follows: tube voltage, 100-120 kV; automatic tube current, 200-350 mA; rotation time, 0.5 s; scan slice thickness, 5 mm; and reconstruction thickness, 5 mm.

Stone composition detection
All stones were analyzed with an automatic infrared spectroscopy analysis system (LIIR-20, Lamoride Company, Tianjin, China) in the lithotripsy center of our hospital. Before testing, the stained residues on the surface of the stone were washed off with clean water and distilled water, and the stone was dried in an oven at 70-100 °C. A stone sample (1.0-1.5 mg) was taken and thoroughly ground and mixed in an agate mortar with 200-300 mg of pure potassium bromide that had been fully dried in advance. Subsequently, the mixture was pressed at 20 MPa with a tablet press for 1 min to make a translucent tablet with a thickness of 35 mm, which was quickly placed into the infrared spectral slot for scanning. The stone composition was automatically analyzed and reported by computer. If the uric acid component exceeded 50% of the total stone composition, it was considered the major component. Because most of the stones submitted for examination were mixed stones, this study took the most prevalent component of each stone as its main component; for nUA, these included calcium oxalate, carbonate apatite, hydroxyapatite, magnesium ammonium phosphate hexahydrate, and cystine. The composition of nUA is shown in Fig. 2.

Lesion segmentation and feature extraction
The region of interest (ROI) was delineated layer by layer along the edge of the stone by two radiologists, with a window level of 60 and a window width of 400. As much of the stone as possible was included, while the surrounding renal tissue, blood vessels, fat, and image artifacts were excluded to ensure the reliability of the data. The ROIs were drawn using the open-source software 3D Slicer (Version 4.1.1, https://www.slicer.org/), and the PyRadiomics platform (Version 3.0, http://pyradiomics.readthedocs.io/; Python version 3.6.7) was used to extract radiomics features. A total of 1037 features were extracted from each region of interest.

Feature selection and data dimension reduction
The intraobserver and interobserver intraclass correlation coefficients (ICCs) were calculated, and features with an ICC > 0.80 were retained for subsequent analysis. The training set and validation set were divided according to the visit time. To reduce the risk of overfitting, the retained radiomics features were normalized and screened by one-way analysis of variance (ANOVA), and the least absolute shrinkage and selection operator (LASSO) was then used for feature selection. The hyperparameter λ of the LASSO regression model was tuned, the λ with the smallest model error was selected, and the features with nonzero coefficients were retained.
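The normalization and ANOVA screening steps can be illustrated with a minimal sketch. It assumes z-score normalization and a two-group (UA versus nUA) one-way ANOVA computed per feature; the function names and toy values are hypothetical and are not taken from the study's code, which used R.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature column to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def anova_f(groups):
    """One-way ANOVA F statistic for a list of groups (each a list of floats).

    Assumes at least two groups and nonzero within-group variance.
    """
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    group_means = [mean(g) for g in groups]
    k = len(groups)
    n = len(all_values)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical values of one radiomics feature in each group.
ua_values = [1.0, 2.0, 3.0]
nua_values = [4.0, 5.0, 6.0]
f_stat = anova_f([ua_values, nua_values])
```

A feature would be kept for the LASSO step when the p value associated with its F statistic falls below the chosen significance level; converting F to a p value requires the F distribution and is omitted here.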

Model building and model selection
Based on the previously screened radiomics features, three machine learning methods, namely logistic regression (LR), random forest (RF) and support vector machine (SVM), were used to establish radiomics models. After the models were built on the training dataset, their performance was evaluated on the validation dataset. The receiver operating characteristic (ROC) curve was drawn, and the evaluation indicators included the area under the ROC curve (AUC), sensitivity, specificity and accuracy. The models were compared, and the one with the best performance was selected.
Clinical factors such as age, sex, BMI, history of hypertension, low-density lipoprotein, triglyceride, blood uric acid, blood/urine WBC, urine pH, stone CT value and CT maximum stone diameter were included in the multivariate logistic regression. The significant clinical factors and the best machine learning model were combined to construct a clinical-radiomics nomogram.
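The evaluation indicators named above can be computed from predicted probabilities as in the following sketch. The 0.5 cutoff, the function names and the toy data are assumptions for illustration; the study itself plotted ROC curves with the "pROC" package in R.

```python
def confusion_metrics(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity and accuracy at a probability cutoff.

    y_true holds 1 for UA and 0 for nUA; both classes must be present.
    """
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(y_true),
    }


def auc(y_true, y_prob):
    """AUC as the probability that a random UA case scores higher than a
    random nUA case (ties count one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities of UA.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
metrics = confusion_metrics(labels, probs)
area = auc(labels, probs)
```

The pairwise formulation of the AUC is quadratic in the number of cases, which is acceptable for a cohort of a few hundred patients.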

Model evaluation
A radiomics nomogram was constructed by combining the important variables of clinical factors and the radiomics model. Calibration curves were used to evaluate the calibration of the nomogram. The Hosmer-Lemeshow test was used to evaluate the goodness of fit of the nomogram. On the training set and validation set, the ROC curve of the nomogram was drawn, and the diagnostic performance of the clinical factor model, the radiomics feature model and the nomogram model in identifying UA and nUA were compared. To assess the clinical usefulness of the nomogram, we plotted the corresponding decision curve analysis (DCA) by calculating the net benefit over a threshold probability range across the entire cohort.
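The net benefit plotted in a decision curve can be sketched as follows, using the standard definition: true positives per patient minus false positives per patient weighted by the odds of the threshold probability. The function names and toy data are hypothetical; the study used the "dca.R" package.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: every patient is treated as if UA were present."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical labels (1 = UA) and predicted probabilities.
labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# One point per threshold gives the decision curve of the model.
curve = [(t / 10, net_benefit(labels, probs, t / 10)) for t in range(1, 10)]
```

The "treat none" strategy has a net benefit of zero at every threshold, so a model is useful at a given threshold only if its curve lies above both zero and the treat-all curve.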

Statistical analysis
Statistical tests were performed using SPSS (version 25.0, IBM) and R statistical software (version 4.2.1, https://www.r-project.org). Univariate analysis was applied to compare the clinical factors between the two groups, using the chi-square test or Fisher exact test for categorical variables and the Mann-Whitney U test for continuous variables, where appropriate. One-way ANOVA was used to compare the value of each radiomics feature for the differentiation of UA from nUA. The "glmnet" package was used to perform the LASSO regression analysis. The ROC curves were plotted using the "pROC" package. Nomogram development and calibration plots were performed using the "rms" package, and the Hosmer-Lemeshow test was performed using the "generalhoslem" package. The DeLong test was used to compare the AUC values between models. The DCA was performed using the "dca.R" package. A two-sided p < 0.05 was considered significant.

Feature extraction and model building
A total of 1037 radiomics features were extracted from each patient's CT images, including features from images processed with the Laplacian of Gaussian filter and wavelet filter. After screening by ANOVA and LASSO with tenfold cross-validation (Fig. 3A, B), 16 features were retained (Table 3). We plotted the ROC curves of the three machine learning models on the training set and the validation set (Fig. 4A, B), and the corresponding results are shown in Table 4.
Based on the comparison, the LR model had the best results on the validation set, and its performance was similar and stable across the training set and the validation set. Therefore, the LR model was finally used to construct the nomogram. Logistic regression was used to establish a prediction model from the selected features, and the radiomics score (Rad-score) of each sample was calculated to reflect the risk of UA. The Rad-score is the intercept plus the sum of the selected radiomics features weighted by their corresponding coefficients: Rad-score = intercept + ∑ βi × Xi, where Xi is the value of the i-th selected feature and βi is its coefficient. The coefficient of each feature is given in Table 3.
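How a LASSO penalty drives the coefficients of uninformative features to exactly zero can be shown with a small coordinate-descent sketch. This is a simplified squared-error variant written for illustration, with a fixed λ and hypothetical toy data; the study fitted a binomial LASSO with "glmnet" and chose λ by tenfold cross-validation, which is not reproduced here.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimize (1/2n) * ||y - X b||^2 + lam * ||b||_1 by coordinate descent.

    Assumes the columns of X and the response y are already centered,
    so no intercept term is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0
            z = 0.0
            for i in range(n):
                # Prediction from all features except feature j.
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
                z += X[i][j] ** 2
            beta[j] = soft_threshold(rho / n, lam) / (z / n) if z > 0 else 0.0
    return beta


# Toy data: feature 0 tracks the response, feature 1 is unrelated to it.
X = [[-1.0, 1.0], [-1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]
y = [-1.0, -1.0, 1.0, 1.0]

beta_small = lasso_coordinate_descent(X, y, lam=0.1)  # feature 1 is zeroed
beta_large = lasso_coordinate_descent(X, y, lam=2.0)  # every coefficient is zeroed
```

Features whose coefficients remain nonzero at the selected λ are the ones retained, which is the sense in which LASSO performs feature selection.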
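The Rad-score formula can be written out as a short sketch. The intercept, coefficients and feature values below are hypothetical placeholders, not the values in Table 3, and converting the score to a probability with the logistic function is an assumption that follows from the score being the linear predictor of a logistic regression model.

```python
import math


def rad_score(intercept, coefficients, features):
    """Rad-score = intercept + sum of coefficient_i * feature_i."""
    if len(coefficients) != len(features):
        raise ValueError("coefficients and features must have equal length")
    return intercept + sum(b * x for b, x in zip(coefficients, features))


def logistic(score):
    """Map a linear predictor to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical two-feature model; the study's model used 16 features.
example_intercept = -0.5
example_coefficients = [1.2, -0.8]
example_features = [0.5, 0.25]  # normalized feature values of one patient

score = rad_score(example_intercept, example_coefficients, example_features)
probability_ua = logistic(score)
```

A higher Rad-score corresponds to a higher predicted probability of UA under this reading.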

Nomogram construction and evaluation
Age, stone CT value, CT longest diameter of stone, urine pH, urine WBC, and Rad-score were included to construct the nomogram model, as shown in Fig. 5A. The nomogram   3. Figure 6C shows the DCA for the three models.
The decision curve analysis showed that the Rad-score model and the nomogram model had high overall net benefits in distinguishing UA, significantly higher than that of the clinical model.

Application of nomogram
Figure 7 shows a typical clinical application of the nomogram. A 68-year-old male patient was admitted for PCNL due to kidney stones. NCCT showed that the mean core CT value of the stone was 504 HU and the longest diameter of the stone was 10.9 mm; the Rad-score was 2.1, the preoperative urine pH was 5, and the urinary white blood cell count was 28.7/μL. The nomogram showed that the score of the patient was 155 points, and the probability of UA was greater than 0.9. The postoperative stone composition analysis of the patient also showed UA.

Discussion
The ability to predict stone composition, especially UA, has great clinical value for patient education and for the selection of diagnosis and treatment measures. However, many in vitro and in vivo attempts to address this problem have yielded different results, and there is currently no universally recognized simple and effective tool to classify stones. Several studies have identified UA and nUA by using clinical models and CT images. When comparing UA and calcium stones, Ganesan et al. [25] found that there was a difference in attenuation between the core and periphery of the stone. Based on this property, they drew a two-dimensional distribution of stone attenuation and constructed a semiautomatic algorithm to process the image. With a sample size of 100, the sensitivity was 89% and the specificity was 91%. Jendeberg et al. [26] defined three kinds of features as prediction indicators based on the Hounsfield unit (HU) values of NCCT images; their accuracies were 97%, 98% and 94%, respectively, all reaching the same level as DECT. Urine pH is the most important factor affecting the solubility of uric acid. At a urine pH < 5.5, almost 100% of the uric acid does not dissociate, and the urine becomes supersaturated with uric acid. In contrast, at a pH > 6.5, most of the uric acid is in the form of anionic urate; therefore, alkalization of urine can significantly slow UA progression and prevent UA formation. When Spettel et al. [27] combined HU with urine pH information and restricted the analysis to stones larger than 4 mm, they obtained 86% sensitivity and 99.4% specificity. Based on NCCT, Kim et al. [28] defined the maximal stone length (MSL), mean stone density (MSD), stone heterogeneity index (SHI), variant coefficient of stone density (VCSD = SHI/MSD × 100) and other indicators to identify uric acid stones, among which SHI had the best results, with an AUC of 0.893 (95% CI 0.855-0.931). Zhang et al. [29] applied texture analysis to the NCCT images of 49 patients with calculi and found that a combined model composed of the standard deviation, kurtosis and skewness of the gray-level histogram could achieve an accuracy of 88-92% in identifying UA. Black et al. [30] used the deep learning algorithm ResNet-101 to classify photographs of surgically removed stones and improved the prediction ability of four types of stones to a high level. The specificity and accuracy of each type of stone were as follows: UA (97.83%, 94.12%), calcium oxalate monohydrate (97.62%, 95%), struvite (91.84%, 71.43%), cystine stones (98.31%, 75%) and brushite (96.43%, 75%). However, the shortcomings of that study were also obvious. First, studies based on postoperative stone composition cannot help patients avoid unnecessary surgical procedures. Second, the sample comprised only 63 cases, and because the four classification models needed to be divided into a training set and a validation set at the same time, the sample size of each group was < 10, which seriously limits the range of application of the model.
Data-driven algorithms based on machine learning and radiomics have opened a new avenue for the prediction of stone composition. Tang et al. [24] used radiomics to identify oxalate stones and screened 8 radiomics features from the CT images of 507 patients with stones. The sensitivity, specificity and accuracy of the model were 90.5%, 84.3% and 88.5% in the training set and 90.1%, 84.3% and 88.3% in the validation set, respectively. The AUC of the training set was 0.935, and that of the validation set was 0.933. Zheng et al. [21] extracted radiomics features from the CT images of 1198 eligible urolithiasis patients from three centers and combined them with the presence or absence of urease-producing bacteria in urine and the urine pH value to construct a nomogram model. The AUCs of the model in the training set and three external validation sets were 0.898, 0.832, 0.825 and 0.812. Similar to the above two studies, we combined the radiomics model and clinical features to construct a nomogram. The AUC reached 0.878 in the training set and 0.867 in the validation set, which were similar to the results of the above two studies. The identification of UA is more important in the choice of preoperative management than the identification of oxalate calculi and infectious calculi because most patients with UA can be treated with drug therapy and spared surgery. We established the first radiomics model for the preoperative identification of uric acid stones, and it showed strong robustness.
Our study had several limitations. First, this was a single-center study; due to differences in medical equipment and scanning parameters between hospitals, it is not clear how effective the model may be when applied to other centers, and it still requires verification using a larger sample set from different centers. Second, due to the limitation of clinical data, our model was based on CT images with a thickness of 5 mm, which have a lower accuracy than thin-slice CT images. Finally, our qualitative description of stone composition relied on postoperative infrared spectroscopy analysis as the gold standard, which was greatly affected by the sampling site, and the stone samples after lithotripsy had already been powdered and had lost their original morphological structure, which made our description of stone composition incomplete. Therefore, improving the prediction ability for mixed stones and rare component stones is the most important research direction for radiomics technology in the future.

Fig. 7 Nomogram applied to an elderly male urolithiasis patient. The nomogram showed that the score of the patient was 155 points, and the probability of UA was greater than 0.9. The postoperative stone composition identification of the patient also showed UA

Conclusion
NCCT is widely used in clinical practice and is simple, convenient, and fast. Combined with radiomics, the preoperative prediction of UA stones can be realized in vivo with high sensitivity, specificity, and accuracy. Further study of radiomics models and their application in multicenter prospective clinical trials can help surgeons develop appropriate treatment and prevention programs.

Fig. 1 The flowchart shows the inclusion/exclusion process of participants

Fig. 2 Proportion of components of nonuric acid stones in the included samples. CO calcium oxalate, CA carbonate apatite, HA hydroxyapatite, MAPH magnesium ammonium phosphate hexahydrate

Fig. 3 A Radiomics feature selection using the least absolute shrinkage and selection operator (LASSO) regression model. Selection of the tuning parameter (λ). The LASSO regression model was used with penalty parameter tuning that was conducted by tenfold cross-validation according to minimum criteria. The binomial deviance was

Fig. 5 Performance of the radiomics model. A Nomogram developed based on the radiomics model and meaningful clinical indicators; B, C calibration curves of the nomogram model on the training set (B) and validation set (C). The calibration curve indicates the goodness of fit

Fig. 6 A, B Radiomics nomogram scores for each patient in the training and validation sets. Blue bars represent the scores of patients without uric acid stones, while yellow bars represent the scores of patients with uric acid stones. C Decision curve analysis of the three models. The Y-axis is the net benefit; the X-axis represents the threshold probability. The blue, green, and red lines represent the net benefits of the Rad-score model, clinical model, and nomogram model, respectively. Compared with the other model and the simple strategies (treating all patients as having uric acid stones (gray line) or all patients as not having uric acid stones (black line)), the Rad-score model and the nomogram model have high net benefits

Table 1
Clinical baseline data of patients with urolithiasis. CTdmax CT maximum stone diameter, HBP High Blood Pressure, BMI Body Mass Index, WBC White Blood Cell, FBG Fasting Blood Glucose, b-UA Blood Uric Acid, TG Triglyceride, LDL Low Density Lipoprotein, U-pH Urine pH, U-WBC Urine WBC
WBC and other laboratory indicators, were not significantly correlated with UAs. Older patients were more likely to have uric acid stones.

Table 2
Logistic regression analysis of clinical candidate predictors. CI Confidence Interval, OR Odds Ratio, BMI Body Mass Index, HBP High Blood Pressure, WBC White Blood Cell, FBG Fasting Blood Glucose, b-UA Blood Uric Acid, TG Triglyceride, LDL Low Density Lipoprotein, U-pH Urine pH, U-WBC Urine WBC, CTdmax CT maximum stone diameter

Table 4
Diagnostic performance of the logistic regression, random forest, and support vector machine models. LR logistic regression, RF random forest, SVM support vector machine

Table 5
Diagnostic performance of the clinical factor model, the radiomics model, and the nomogram model