Assessing Accuracy and Precision of 3D Augmented Reality Holographic models derived from DICOM data

Objective: Assess accuracy and precision of measurements on 3D augmented reality (AR) models derived from CT DICOM data, and compare AR model measurements with PACS measurements. Materials/Methods: Five individual 3D hologram models were produced using a CT phantom with fiducial markers set at varying distances. DICOM files were translated into 3D AR models using open source software. AR models were adapted for display on an AR device using a novel application. AR models were projected and distances between the projected fiducial markers were measured. Finally, 5 measurements each were obtained of the holographic projected distances between fiducials in the x1, y1, and z1 labeled planes respectively for precision assessment. Mann-Whitney U test was performed to compare measured distances on AGFA-PACS, AR models, and actual measured distances on phantom models. Results: No significant difference was found between gold standard measurements and either PACS measurements (p=0.9124) or AR measurements (p=0.8966). AR model measurements had a standard error of 0.24mm, 0.24mm, and 0.38mm in the x, y, and z planes respectively. Furthermore, measurements on AR models demonstrated a high degree of accuracy in comparison to gold standard measurements. Conclusion: Current AR technology can produce reliable 3D AR models from CT DICOM data.


Introduction
Augmented reality applications are being increasingly utilized in a number of fields including military, industry, and sports 1-3 with emerging potential applications in medical imaging 4-7. Recent advances in technology have allowed for increased portability of augmented reality hardware, making more widespread utilization possible 8,9. While preliminary investigation of accuracy and precision of augmented reality models has been described with large, projection-based equipment 10, only limited data exist for accuracy and precision of head mounted display (HMD)-based conversion of Digital Imaging and Communications in Medicine (DICOM) medical imaging information into augmented reality models (7). The aim of this study was to assess the accuracy and precision of DICOM-derived 3D holographic models on an HMD using a proprietary C# programming language-based software application and to determine if any statistically significant differences exist between gold-standard physical measurements, Picture Archiving and Communication System (PACS)-based measurements, and augmented reality holographic measurements.

Methods
Five unique 3D models were produced using a CT quality control phantom (model 137856101, GE Healthcare, Waukesha WI USA) with fiducial markers (CT/MRI 2 mm center hole Multi-modality marker, MM3002, Izi Medical Products, Owings Mills, MD USA). The fiducial markers had an adhesive to allow for adjustment on the surface of the CT phantom. For each trial of the CT phantom, fiducial markers were adjusted to provide new distances in each of the orthogonal directions.
The distances were measured using electronic digital calipers (model 01407A, Neiko Tools, Henan, China). These measurements were set as the gold standard (GS), aiming to reproduce measurements gathered in clinical settings such as those via surgical ruler or calipers. The calipers' product information reports a resolution of 0.01mm and an accuracy of 0.02mm. FIGURES 1,2. A total of six measurements between fiducial markers were made by one observer for each model CT phantom: two in the x-direction, two in the y-direction and two in the z-direction. The observer was trained to measure the shortest distance between fiducial markers set in the orthogonal directions (x, y, z) using the inside calipers.
CT scans of the phantom were obtained using the Head CT quality control phantom settings (5mm slice thickness, 134mAs, 22.7cm DFOV, 0.516:1 pitch) with 0.625cm x 0.625cm x 0.625cm voxels on a GE Lightspeed CT scanner (GE Healthcare, Waukesha WI USA) and stored as DICOM files within our AGFA IMPAX version 6.7.0.3502 "site" Picture Archiving and Communication System (PACS) (AGFA, Mortsel, Belgium). The DICOM files obtained were converted into 3D phantom models, and measurements were made on the AGFA-PACS. The observer was trained to measure the shortest distance between the outer edges of the fiducial markers using only the "ruler" found within PACS. FIGURE 3. Additionally, the observer had access to axial, sagittal and coronal views to project views of fiducial markers in their most appropriate planes.
Open-source software programs Horos (Purview, Annapolis, MD USA) and Blender (Stichting Blender Foundation, Amsterdam, Netherlands) were used to translate DICOM files into a 3D image. The 3D images were adapted and loaded onto the AR HMD platform using C# programming language-based code on the Unity Platform (Unity Technologies, San Francisco, CA USA). The AR HMD used for this study was the HoloLens 1 (Microsoft, Redmond, WA USA), OS version 10.0.14393.1358. The HoloLens projection settings were left at factory defaults, with the exception of the distance of initial hologram projection, which was set to 30cm from the HoloLens. This was done to project the hologram within approximately arm's length of the user. The hologram was manipulated via rotation and translation, without scaling, to obtain the best views of the markers. FIGURE 4 (with supplemental video) demonstrates a side-by-side comparison of the CT phantom and its hologram. The 3D models were projected as holograms and the distances between the projected fiducial markers were measured by the same observer who made the GS measurements. Using the built-in AR HMD capabilities, the hologram was pinned to a table where the calipers could be laid flat to make the measurements, FIGURE 5. These measurements are referred to as Holo measurements.
A power analysis was performed using SAS University Edition (SAS Institute Inc., Cary, NC). This demonstrated that for a paired t-test (α=0.05) with a sample size of n=30 per group, the power would be 80% in detecting a difference of 0.3mm. In a previous study assessing accuracy and reliability of CT measurement, the maximum difference was observed in this range (11); the authors opted to detect a difference of this magnitude using this sample size in consideration of available time, resources and training in both PACS and Holo measurements.
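The relationship between sample size, detectable difference, and power described above can be illustrated with a short sketch. It is not the SAS procedure used in the study: it uses a normal approximation to the paired t-test, and because the text does not report the standard deviation of the paired differences, it back-solves for the value that the stated design (n=30, α=0.05, 80% power, 0.3mm) would imply. The function names and the resulting standard deviation are illustrative assumptions only.

```python
from math import sqrt
from statistics import NormalDist


def paired_power_normal_approx(delta_mm, sd_diff_mm, n, alpha=0.05):
    """Approximate two-sided power of a paired t-test (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Standardized shift of the test statistic under the alternative hypothesis.
    shift = (delta_mm / sd_diff_mm) * sqrt(n)
    # Probability of rejecting in either tail when the true difference is delta_mm.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


def implied_sd_of_differences(delta_mm, n, alpha=0.05, power=0.80):
    """SD of paired differences implied by a design (assumption: not reported in the text)."""
    z = NormalDist()
    return delta_mm * sqrt(n) / (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power))


# Design values taken from the text: 0.3 mm difference, n = 30, alpha = 0.05, power = 80%.
sd_diff = implied_sd_of_differences(0.3, 30)
power = paired_power_normal_approx(0.3, sd_diff, 30)
print(f"implied SD of differences: {sd_diff:.2f} mm, approximate power: {power:.2f}")
```

Under this approximation, a smaller standard deviation of the differences would give more than 80% power at n=30, and a larger one would give less. An exact calculation would use the noncentral t distribution rather than the normal distribution.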
Since DICOM and PACS have become standard in the clinical radiology workspace, the accuracy and precision of the Holo measurements were assessed against this standard (12). To determine accuracy, the absolute error from the GS was calculated for PACS and Holo measurements. To compare the means and distributions of the absolute error, a two-tailed t-test was employed. For the assessment of precision, a right-tailed chi-square test of variance was used, with the PACS absolute error variance set as the null value. Additionally, a root-mean-square error was calculated for a series of repeated measurements made in each orthogonal direction (n=15).
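The quantities named above can be sketched in a few lines. The measurement values below are hypothetical placeholders, not data from this study, and the helper names are illustrative. The sketch computes the absolute error against the GS, the chi-square variance statistic (n - 1)s²/σ₀² with the PACS error variance as the null value, and a root-mean-square error. It stops at the test statistic: a right-tailed p-value would additionally require the chi-square distribution with n - 1 degrees of freedom from a statistics library.

```python
from math import sqrt
from statistics import mean, variance


def absolute_errors(measured_mm, gold_standard_mm):
    """Absolute error of each measurement relative to its gold-standard value."""
    return [abs(m - g) for m, g in zip(measured_mm, gold_standard_mm)]


def variance_test_statistic(sample, null_variance):
    """Chi-square statistic (n - 1) * s^2 / sigma0^2 for a one-sample variance test."""
    n = len(sample)
    return (n - 1) * variance(sample) / null_variance


def rmse(measured_mm, reference_mm):
    """Root-mean-square error between measurements and reference values."""
    squared = [(m - r) ** 2 for m, r in zip(measured_mm, reference_mm)]
    return sqrt(mean(squared))


# Hypothetical distances in mm (placeholders for illustration only).
gs = [20.00, 35.50, 50.25, 42.10]
pacs = [20.10, 35.40, 50.45, 42.00]
holo = [20.30, 35.20, 50.65, 41.90]

pacs_err = absolute_errors(pacs, gs)
holo_err = absolute_errors(holo, gs)

# Null value: sample variance of the PACS absolute errors, as described in the text.
chi_sq = variance_test_statistic(holo_err, variance(pacs_err))
print(f"mean abs error: PACS {mean(pacs_err):.3f} mm, Holo {mean(holo_err):.3f} mm")
print(f"chi-square statistic: {chi_sq:.2f} (df = {len(holo_err) - 1})")
print(f"Holo RMSE vs GS: {rmse(holo, gs):.3f} mm")
```

A statistic well above its degrees of freedom points toward a larger Holo error variance than the PACS reference, which is the direction the right-tailed test is designed to detect.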

Results
All measurements taken for each CT phantom trial on GS, PACS, and Hologram are found in Table 1.

Discussion
Currently, there is little existing literature assessing the accuracy of 3D holograms derived from DICOM data. There have been some studies investigating the reliability of applications of AR in the medical workspace. A 2017 study evaluated the utility of AR in endovascular interventions using the HMD (7). CT data were utilized to reproduce a hologram of vasculature superimposed on a phantom, and a two-step calibration employing a tracked catheter demonstrated the plausibility of visualizing endovascular procedures without the use of x-ray. Their accuracy assessment pertained to the first calibration step and demonstrated a root-mean-square error of 4.357 mm in 20 trials of 4 landmarks. Though that study does not break the error down by dimension, our data suggest a smaller degree of error, possibly due to the scale of our phantom.
The results of this study suggest that Holo measurements are as accurate as those made on the PACS.
The Holo measurements trended towards being less precise than the measurements made on PACS, although not at a significant level. This may be explained in part by the lack of tactile feedback from holograms, leading to less precise measurements.
Several limitations were faced in obtaining measurements across the modalities. The GS instrument's intrinsic errors were ignored as additional sources of error. Given that most clinical measurements are rarely recorded to the hundredth of a millimeter, the authors deemed these errors less relevant for the purpose of this study. Additionally, user setting variability on the HMD was not assessed, which is a limitation of this study. A challenge faced while obtaining measurements on the Holo models was that measurements were made by visually estimating distances between markers. GS measurements on the physical model had tactile feedback to establish edges, while the Holo models were reliant on the observer's best estimation of depth and distance between markers. Furthermore, there was no assessment of variability between HMD users and no consideration of user experience, which may impact accuracy and precision of measurements.
Ultimately, this study demonstrates the utility and reliability of AR HMD technology for rendering accurate holographic models. The advantage of holographic models is the visualization of 3D internal anatomical structures and the ability to superimpose the image on a real environment. AR holographic models have the potential to directly impact fields of medicine such as radiology, surgery and medical education.

Conclusion
Measurements on hologram models demonstrated a high degree of accuracy compared with reference standard measurements, nearly approaching that of PACS measurements. The precision of hologram measurements warrants further investigation, especially their reproducibility among different users. Current augmented reality technology can produce reliable 3D holograms from CT DICOM data and could be used for educational, training, or research purposes. Our study provides the groundwork for future larger-scale research into AR's emerging applications in medicine.

Declarations
Ethics approval and consent to participate
The study was approved by our institutional review board.

Consent for publication
Not applicable

Availability of data and material
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Photograph illustrating scientific caliper measuring distance between fiducial markers. Note: The top smaller calipers (depicted with yellow arrows) are used for this study.

Side-by-side comparison of CT phantom and the resulting Augmented Reality hologram (supplemental video file).