Test–Retest Reproducibility Analysis of Bone Mineral Densitometry Radiomics Features


Abstract

Background
Assessing the reproducibility of radiomics features is a critical issue in the era of imaging biomarker development. In the present study, we aimed to assess the test-retest reproducibility of radiomics features extracted from bone mineral densitometry (BMD) images.

Methods
In this prospective study, eighteen patients were included and underwent two dual-energy X-ray absorptiometry (DXA) BMD scans acquired within 10 min of each other under an approved protocol. Seven regions of interest (ROIs), comprising four lumbar spine regions (L1-L4) and three hip regions (trochanteric, inter trochanteric and neck), were segmented in both test and re-test images, and 107 radiomics features from seven different feature sets were extracted. The intra-class correlation coefficient (ICC) was initially used to estimate radiomics feature reproducibility.

Results
We showed that no radiomics feature had 90% < ICC < 100% in all ROIs, whereas three features, namely Strength (from the NGTDM feature set), SALGLE (Small Area Low Gray Level Emphasis, from the GLSZM feature set) and Busyness (from the NGTDM feature set), had ICC < 70% in all eight ROIs (the seven regions plus their combination). The shape feature set also contained features with ICC < 70%.

Conclusion
Our test-retest reproducibility analysis of bone mineral densitometry radiomics features shows that radiomics features vary with the time of image acquisition. The reproducible features may be used as imaging biomarkers in the field of clinical densitometry. This study may be repeated with more radiomics features and more BMD scanners as a first step towards bone mineral biomarker discovery.

Background
Bone mineral density (BMD) assessment using dual-energy X-ray absorptiometry (DXA) is the gold standard for diagnosis, monitoring and follow-up of bone mineral loss conditions such as osteoporosis and osteopenia. Generally speaking, DXA plays three main roles: diagnosis of osteoporosis, fracture risk assessment, and monitoring of treatment response (1). Studies have indicated that BMD measurement in the hip and lumbar regions is the most reliable measurement for predicting hip fracture risk and for spinal therapy monitoring, respectively. In addition, DXA has several advantages, including short scan times, ease of use, low radiation dose, and good measurement precision. However, although DXA is a feasible approach for BMD assessment, it suffers from some limitations. For example, it is a two-dimensional (2D) imaging approach, and the areal density measurement is affected by bone size as well as by the true 3D volumetric density of the bone tissue (2).
In addition to DXA, several other approaches have been developed for BMD assessment. These methods include quantitative computed tomography (QCT), peripheral DXA (pDXA), quantitative ultrasound (QUS) and magnetic resonance imaging (MRI) (3)(4)(5). Recently, quantitative radiomics texture analysis has been applied to a wide range of clinical applications such as disease detection, diagnosis, prognosis and prediction (6)(7)(8). Although radiomics is used mainly in cancer studies, it is also applicable to the management of bone diseases, such as osteoporosis diagnosis and therapy response prediction and assessment. Previous studies have identified that radiomics features extracted from computed tomography (CT), magnetic resonance imaging (MRI) and radiography images could be used for bone disease management (9,10). In our previous study, we showed the feasibility of using radiomics features extracted from DXA images to classify osteoporosis and osteopenia patients from normal ones (11).
Radiomics is an advanced image processing approach with several steps, including image acquisition, segmentation, feature extraction, feature selection and data modelling (12)(13)(14). In the radiomics approach, the extracted features can serve as imaging biomarkers for diagnosis, prognosis, response prediction and assessment of therapy for several diseases. In this regard, radiomics features have to be reproducible, meaning that feature values should stay unchanged or change only minimally when the feature is computed from a repeat scan acquired after a short time interval. A wide range of studies have reported that radiomics features are vulnerable to changes in image acquisition, reconstruction, pre-processing, segmentation and analysis (15)(16)(17). In this light, and to decrease the false positive rate, robust radiomics features have to be identified and used for further clinical applications.
Based on previous radiomics studies, a reproducible/repeatable feature remains the same under different settings of the radiomics process, or under the same process at different times. To the best of our knowledge, there are no reports on BMD radiomics reproducibility with respect to changes in the time of image acquisition. In the present study, we aimed to assess the test-retest reproducibility of radiomics features extracted from DXA images. In this study, for the first time, the reproducibility of image features was checked in several regions of BMD images acquired at two consecutive times. Reproducibility was calculated using statistical tests.

Patients
This research was conducted as a prospective study and was approved by the local ethics committee. In this work, 18 patients who were referred to the bone densitometry department were included. Informed consent was obtained from each patient before examination. All patients underwent two DXA scans within 10 minutes of each other, before any treatment was delivered. Both scans were obtained with the same DXA scanner using the same imaging protocol (Lexxos, 4.7 mm Al 75 kVp and 14.2 mm Al 140 kVp). The patients' images were divided into two groups: test and re-test.

Contouring
In this work, seven anatomical regions were contoured by a radiologist with nine years of experience in the field of bone densitometry. All contours were drawn in ITK-SNAP (18), and all images and contours were then exported for feature extraction and analysis. The different contours are shown in Fig. 1. Four regions from the lumbar spine (L1, L2, L3 and L4) and three regions from the hip (neck (NK), trochanteric (TR) and inter trochanteric (IT)) were drawn for radiomics analysis. These regions are analyzed in routine clinical BMD assessment. In addition, the combination of all regions was analyzed as a separate region, called the ALL region. The contours were the same in both the test and re-test images.

Radiomics
A total of 107 radiomics features were extracted using the PyRadiomics feature extraction platform (19). These features are listed in Supplementary Table 1.

Data analysis
To estimate the reproducibility of radiomics features using the repeat DXA data, the intra-class correlation coefficient (ICC) was initially used. The ICC was calculated from analysis-of-variance mean squares, where MSR = mean square for rows, MSW = mean square for residual sources of variance, k = number of observers involved and n = number of subjects. In this study, radiomics feature reproducibility was grouped into four categories: 1) ICC < 70%, 2) 70% < ICC < 80%, 3) 80% < ICC < 90% and 4) 90% < ICC < 100%. ICC values are shown as a heatmap, box plot, bar plot and density distribution plot.
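As an illustration of the calculation described above, the following is a minimal Python sketch and not the analysis code used in this study. It assumes the one-way random-effects, single-measurement form, ICC = (MSR - MSW) / (MSR + (k - 1) MSW), which is one form consistent with the symbols defined here; the ICC variant actually used is not stated. The function names `icc_one_way` and `icc_category`, the example values, and the handling of the exact category boundaries are hypothetical.

```python
from statistics import mean


def icc_one_way(scores):
    """One-way random-effects, single-measurement ICC (assumed form).

    scores: one row per subject, each row holding k repeated
    measurements of the same feature (here k = 2: test and re-test).
    Raises ZeroDivisionError if every value in `scores` is identical.
    """
    n = len(scores)        # number of subjects
    k = len(scores[0])     # number of repeated measurements
    grand_mean = mean(v for row in scores for v in row)
    row_means = [mean(row) for row in scores]
    # MSR: mean square for rows (between-subject variance)
    msr = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    # MSW: mean square for residual sources of variance (within-subject)
    msw = sum((v - m) ** 2
              for row, m in zip(scores, row_means)
              for v in row) / (n * (k - 1))
    return (msr - msw) / (msr + (k - 1) * msw)


def icc_category(icc):
    """Map an ICC (as a fraction) to the four categories used in the text.

    Assumption: values exactly on a boundary go to the higher category.
    """
    pct = icc * 100
    if pct < 70:
        return "ICC < 70%"
    if pct < 80:
        return "70% < ICC < 80%"
    if pct < 90:
        return "80% < ICC < 90%"
    return "90% < ICC < 100%"


# Hypothetical test/re-test values of one feature in three patients.
feature_values = [[10, 11], [20, 19], [30, 31]]
icc = icc_one_way(feature_values)
print(round(icc, 4), icc_category(icc))
```

In this sketch each row is one patient and each column one scan, so a feature whose test and re-test values differ little relative to the spread between patients receives an ICC close to 1.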
In addition, Bland-Altman graphs were drawn for all features, and a selection of these graphs is presented. This graphical method was used to quantify the agreement between the test and re-test values of a radiomics feature by examining the mean difference and the limits within which 95% of the differences between the re-test and test values fall. R (version 3.1.3) with the IRR package was used for all statistical analyses.
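The Bland-Altman quantities described above can be sketched in a few lines of Python. This is an illustrative example only: it assumes the conventional limits of mean difference plus or minus 1.96 standard deviations, which cover about 95% of differences under normality, and the function name `bland_altman` and the example values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test, retest):
    """Return the mean difference, SD and 95% limits for paired values.

    Differences are taken as re-test minus test. The 1.96 multiplier is
    the conventional choice and is an assumption of this sketch.
    """
    diffs = [r - t for t, r in zip(test, retest)]
    bias = mean(diffs)   # mean difference
    sd = stdev(diffs)    # sample standard deviation of the differences
    return {
        "mean": bias,
        "sd": sd,
        "lower": bias - 1.96 * sd,  # lower reproducibility limit
        "upper": bias + 1.96 * sd,  # upper reproducibility limit
    }


# Hypothetical test/re-test values of one feature in three patients.
limits = bland_altman([10, 20, 30], [11, 19, 31])
print(limits)
```

A feature whose mean difference is close to zero and whose limits are narrow relative to its range of values would be considered reproducible under this analysis.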

Results
The results of this study are shown as a clustering heatmap, bar plot, box plot, density distribution and Bland-Altman graphs. In Fig. 2 [...]. The ICC density distribution is shown in Fig. 5. In this plot, there are two peaks of high ICC, in the L4 region and in the combination of all ROIs (ALL). For the L1 and NK regions, there are also peaks at ICC values above 75%. The lowest ICC peak is for radiomics features extracted from the L2 region. The box plot of ICC for all ROIs is shown in Fig. 6. As depicted, the maximum ICC is for the combination of all ROIs (ALL), followed by L4, TR and L2.
The Bland-Altman analysis for all feature sets is shown in Figs. 7-13. In these figures, features with ICC greater than 0.85 are depicted, except for the NGTDM feature set, which did not have any reproducible features and for which the features with the highest ICC are shown. In the Bland-Altman-based reproducibility analysis, the mean, standard deviation (SD), and upper/lower reproducibility limits (U/LRL) are considered in order to find reproducible radiomics features.

Discussion
Bone mineral densitometry is a critical medical approach to assess, monitor and predict several bone mineral diseases. It is a widely accepted examination with several advantages, such as ease of use and of data reporting, low cost and low radiation dose. However, it has some limitations and challenges, and new approaches are being studied for bone density assessment. Radiomics is an advanced quantitative imaging approach that could be used for bone disease diagnosis, prognosis and prediction. In this approach, radiomics features extracted from medical images are used as biomarkers for clinical decision making. However, before any clinical application, they have to be assessed in terms of reproducibility and repeatability (20).
In the present study, we assessed the test-retest reproducibility of radiomics features extracted from DXA BMD images. We observed that radiomics features, depending on the region of interest, change with the time of image acquisition.
In our study, we found that most radiomics features extracted from BMD images are vulnerable to changes in the time of acquisition. Based on the results presented in the heatmap figure, we observed [...]. The mechanisms of radiomics feature changes are not fully understood. However, factors such as different noise levels, changes in the patient's status, operational errors, radiation scattering, electronic changes in the scanner, the nature of the radiomics features and changes in biology may induce variations in radiomics features (20,22). In this study, the time between the two image acquisitions was ten minutes, an interval that is unlikely to induce any change in bone status. In previous studies on the reproducibility of cancer image radiomics, longer intervals were used between the two scans, which may affect the biology of the tumors and therefore the radiomics feature values. Also, in some cases, features are compared before and after a treatment, and changes in the features are assessed and used for therapy monitoring.
This study suffers from some limitations. First, the sample size was small. Eighteen patients is a low number, and larger samples would yield more reliable results. Second, this study was conducted with one scanner. Further studies with different scanners from different vendors are needed. Third, there is segmentation variation. Our segmentation was done manually, which may induce some variation in the results. We suggest using automatic or semi-automatic segmentation and comparing the results. We also suggest that the results of this study be compared with those from other imaging modalities such as MRI, CT or radiography.

Conclusions
In summary, our test-retest reproducibility analysis of bone mineral densitometry radiomics features shows that radiomics features vary with the time of image acquisition.
The reproducible features may be used as imaging biomarkers in the field of clinical densitometry.

Regions of interest (ROI) identification for the radiomics study. L: Lumbar; NK: Neck; TR: Trochanteric; IT: Inter trochanteric.

Figure 4
Bar plot showing the percentage of all radiomics features reproducibility based on the ICC for eight