Table 4 indicates a strong relationship between IBEX BH and DXA, with all adjusted R² values between 0.86 and 0.89. The correlation between IBEX BH and the GE Lunar DXA system is 0.932, which is also within the bounds of the reported correlations between different DXA manufacturers [0.78, 0.95] [22]. These results indicate that the software performs at a similar level to commercially available DXA systems and should therefore be above the performance level required for a clinically useful screening device. This is further supported by the AUCs (0.95 and 0.98) for discriminating whether the T-score is ≤ −2.5 at that ROI. Using this classifier, patients at high risk of osteoporosis at the wrist could be identified with clinically useful sensitivity and specificity.
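As an illustration of how an AUC for the T-score ≤ −2.5 threshold can be computed, the following is a minimal sketch, not the analysis code used in this study. It uses the rank-based (Mann-Whitney) formulation of the AUC. The function name `auc_from_scores` and all numerical values are hypothetical and chosen only for illustration.

```python
def auc_from_scores(scores, is_positive):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    A higher score must mean "more likely positive". Tied pairs count 0.5.
    """
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    # Fraction of (positive, negative) pairs ranked in the correct order.
    return wins / (len(pos) * len(neg))


# Hypothetical values for illustration only, NOT study data.
t_scores = [-3.1, -2.7, -2.5, -1.9, -1.2, -0.4, 0.3]         # reference DXA T-scores
predicted_abmd = [0.29, 0.35, 0.31, 0.33, 0.41, 0.45, 0.50]  # predictor output

# Positive class: T-score at or below the osteoporosis threshold.
osteoporotic = [t <= -2.5 for t in t_scores]

# Low aBMD indicates osteoporosis, so negate it to make higher = more likely positive.
auc = auc_from_scores([-v for v in predicted_abmd], osteoporotic)
print(f"AUC = {auc:.3f}")
```

An AUC of 1 would mean every osteoporotic case is ranked below every non-osteoporotic case by the predictor; 0.5 corresponds to chance.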
Table 4 also shows no statistically significant difference between the adjusted R² values reported for manual and automated use of IBEX BH, indicating that the automation features within the software are non-inferior to manual user intervention. Automation offers the possibility of avoiding the human error associated with differences in ROI placement, removes the need for user training and enables the software to be integrated with minimal impediment to radiology workflow. These results indicate that the software could be used as an opportunistic screening tool with standard DR procedures.
The least significant change of IBEX BH could not be measured because the study design did not allow multiple images to be taken of the same patient. The residual standard deviation of 0.042 provides a measure analogous to the least significant change of 0.016 stated on the forearm DXA reports. This implies that the difference between IBEX BH and DXA is above intra-machine variability but within inter-machine variability.
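For readers unfamiliar with the quantities involved, the sketch below shows how a residual standard deviation and an adjusted R² are obtained from a least-squares fit. It is a simplification under stated assumptions: a single predictor with an intercept, whereas the study's models were chosen by model selection. The function name `linear_fit_summary` and the example data are hypothetical.

```python
import math


def linear_fit_summary(x, y):
    """Ordinary least-squares fit y = a + b*x with summary statistics."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    p = 1  # number of predictors, excluding the intercept
    r2 = 1.0 - rss / tss
    # Adjusted R^2 penalises R^2 for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    # Residual standard deviation: typical size of the prediction error.
    resid_sd = math.sqrt(rss / (n - p - 1))
    return {"slope": slope, "intercept": intercept,
            "r2": r2, "adj_r2": adj_r2, "resid_sd": resid_sd}


# Hypothetical paired measurements for illustration only, NOT study data.
predictor = [1.0, 2.0, 3.0, 4.0]
reference = [1.0, 3.0, 2.0, 4.0]
summary = linear_fit_summary(predictor, reference)
print(summary)
```

The residual standard deviation is expressed in the units of the reference measurement, which is what allows it to be set alongside a least significant change quoted in the same units.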
Table 1 displays results for the continuous demographic factors. As expected, the older the participant the lower their aBMD and the heavier a participant the higher their aBMD. Weight is most predictive of a T-score ≤ −2.5 at the UD or TD regions with an AUC of 0.81 compared to age which is the least predictive with an AUC of 0.62. Table 2 displays results for the categorical demographic factors. Sex is the strongest predictor of osteoporosis with females more likely to have a T-score ≤ −2.5 at the forearm.
Table 3 displays results for the continuous DXA and IBEX BH outputs. DXA is most predictive of osteoporosis at the forearm with a minimum AUC of 0.97 compared to a maximum AUC of 0.95 for IBEX BH outputs. This is expected as DXA was used to define the osteoporosis and non-osteoporosis cohorts so it should have the highest AUC. The AUC is not 1 for DXA since sex was not included and osteoporosis at the forearm was defined using T-scores. IBEX BH bone thickness is the most predictive of osteoporosis, followed by total thickness and finally alloy. This is expected since, if the same bone material assumptions were used and there was no error in IBEX BH or DXA, bone thickness and DXA aBMD would theoretically have correlation equal to 1.
Comparing Tables 1 and 3, all DXA outputs, IBEX BH bone thickness and IBEX BH total thickness are more predictive than the continuous demographic factors. IBEX BH alloy is more predictive than age and height but not weight. Total thickness is the next best predictor after bone thickness because the total AP thickness is highly correlated with the thickness of the bone. It is a better predictor at the UD region than at the TD region because there is less variability in the tissue surrounding the UD region. The conclusion from this table is that all IBEX BH outputs appear to be predictive of DXA and are therefore justified for inclusion in the subsequent model selection. The differences in predictive ability make intuitive sense given the meaning of IBEX BH outputs and forearm morphology.
An alternative to IBEX BH as an opportunistic screening tool in DR is radiogrammetry, which looks at geometric features of the bone, such as cortical thickness, to infer aBMD. Whilst cortical thickness is linked to bone strength, there is evidence to suggest that cortical changes can occur as a result of ageing independently of aBMD [23]. The reported correlation of radiogrammetry to a number of DXA devices was between 0.72 and 0.83 [22], which is lower than the lowest correlation (0.93) achieved here. It is hypothesised that this is because IBEX BH uses a physics-based inverse problem solving approach that solves the same fundamental problem as DXA: that a dense bone and a porous bone can exhibit equivalent intensity values in a radiograph depending on the surrounding tissue. Radiogrammetry measures a distinct quantity and relies on its correlation to DXA aBMD. These results indicate that IBEX BH outperforms radiogrammetry in comparison to the reference standard DXA.
Another alternative imaging modality is based on quantitative computed tomography (QCT) scans [24]. At the forearm, QCT had a correlation to DXA comparable to the results reported here, between 0.82 and 0.93. It can also measure BMD at the more commonly measured central sites, spine and NoF. However, in the UK the number of DR examinations is larger than the number of CT examinations (21.4 million compared to 6.6 million [25]), and hence a larger fraction of the target population is accessible via the DR imaging modality. Therefore, IBEX BH has the potential to have wider impact than QCT.
IBEX BH also exceeds the performance reported for a commercially available quantitative ultrasound (QUS) elective screening device which demonstrated an inferior correlation to DXA aBMD of between 0.61 and 0.71 [26]. Radiofrequency Echographic Multi Spectrometry is another elective screening device that reports a correlation to DXA aBMD of 0.93 [27] which is not significantly different to the results presented here. Whilst it is not possible to make a direct comparison with other devices from this study design, IBEX BH performance is not worse than these elective screening devices in clinical use.
Where a fracture or previous fracture prevents the use of the UD region, this study has shown that the TD region provides an effective alternative (R² = 0.88). The TD region is the forearm site most commonly reported by DXA (possibly due to lower fracture incidence) [28]. Initial testing also indicates that similar performance may be possible on the metacarpals, which have been evidenced by other methods such as Digital X-ray Radiogrammetry to provide clinically useful indications of bone health [23]. Therefore, if a fracture is suspected, a small change to the field of view (an increase to 24 cm × 12 cm, for example) would enable the software to still assess bone health in the presence of a fracture. In addition, further development is likely to include the metacarpals, in which case no adjustment to the field of view would be required.
IBEX BH has also been applied in a previous study [12] to the pelvis, which is a standard site for diagnosing osteoporosis [29]. A further study is planned (IRAS study reference 326406) which will assess performance at a clinically relevant patient dose (the previous study having been conducted at one fifth of standard dose). This, combined with improvements to the underlying algorithm, is expected to result in improved AUC for osteoporosis diagnosis using IBEX BH.
Limitations
The study was carried out at a single centre by a small research team. A single imaging system was used with a fixed protocol so the variation in forearm positioning and acquisition parameters is likely to have been smaller than clinical practice. Further multi-centre clinical studies are needed to evidence that these results can be achieved in clinical practice. A single DXA system was used for the reference standard measurements and as there are differences between manufacturers’ T-scores, for example due to a difference in reference ranges, these results may not transfer directly to other DXA manufacturers.
As reported in Table 2, the sample population varies from the target population (over 50s) with a bias towards i) females, ii) over 70s and iii) low body mass indexes. Furthermore, the use of a volunteer population means that it is likely fewer participants exhibited co-morbidities relative to clinical practice. Most pertinently, there were no fractures present in any of the forearms analysed. Further clinically-based studies are needed to evidence that these results can be achieved on the target population.
Further Research
- The data in this study could be used to extend the number of ROIs that could be used. The metacarpals are used in DXR to measure bone health so may well extend the number of images that IBEX BH can be used on. Significant labelling time is required to train the automated ROI detection for the metacarpals.
- A follow-on study is being undertaken to extend IBEX BH to other body parts for opportunistic screening: the ankle, knee and pelvis. Wider compatibility will enable a larger fraction of the target population to be assessed.
- Further studies are needed to evidence i) the repeatability of the software over time, ii) performance across different software manufacturers and iii) the performance on a wider patient demographic including ethnic minorities.
- Clinical trials investigating IBEX BH in clinical practice are needed that measure not only its performance against DXA but also its impact on patient outcomes and healthcare costs. This would involve testing a new care pathway wherein high-risk participants are referred by IBEX BH for follow-up investigation.
- Finally, studies are being considered to extend IBEX BH to mammography systems. In this case an additional scan would be taken when a patient receives their routine breast cancer screening scan. Dose and cost effectiveness become a more complex proposition for elective rather than opportunistic scans. However, the patient demographic is ideally suited to benefit from such provision.