The aim of this study was to develop an automatic quantification package for phantom-based image quality assessment of bone SPECT and to assess its validity. The software could automatically classify the DS of hot lesions in the SIM2 bone phantom using self-calculated quantitative indexes. Excellent agreement in DS between Hone Graph and the observations of expert BCNMTs was demonstrated. The most interesting finding was that %DEV was the index most strongly associated with DS.
Hone Graph allows easy assessment and reliable results of image quality (e.g., CNR, %CV, and detectability) with excellent repeatability and reproducibility, since the software can automatically calculate the indexes simply from selected SPECT image files. Moreover, the report is visually easy to understand for observers with a wide range of experience. Our findings underscore that the detectability calculated by Hone Graph agreed almost perfectly with that of expert BCNMTs. Thus, Hone Graph is not only easier to use but may also improve repeatability and reproducibility for quality assessment in bone SPECT. The repeatability and reproducibility of Hone Graph were significantly superior to those of ROI analysis performed by experienced BCNMTs. Hone Graph provided identical results when the same SPECT data were repeatedly measured. Notably, misalignment in phantom positioning will not substantially affect the results. Owing to the rigid alignment to the reference image and resampling to a 1-mm pixel size, the influence of positional shifts is greatly decreased, especially for small spheres such as the 13-mm sphere, given that the pixel size of bone SPECT images is generally approximately 5 mm and the images are therefore sensitive to such shifts.
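The effect of resampling on shift sensitivity can be sketched as follows. This is a minimal illustration only, assuming trilinear interpolation; the function name, the 5-mm voxel size, and the use of `scipy.ndimage.zoom` are our assumptions, not Hone Graph's published implementation.

```python
import numpy as np
from scipy import ndimage

def resample_to_1mm(volume, voxel_size_mm):
    """Resample a SPECT volume to an isotropic 1-mm grid (hypothetical
    helper; trilinear interpolation is assumed)."""
    factors = [s / 1.0 for s in voxel_size_mm]  # zoom factor per axis
    return ndimage.zoom(volume, zoom=factors, order=1)

# On a ~5-mm grid, a sub-voxel phantom shift moves intensity across whole
# coarse voxels; on the resampled ~1-mm grid the same shift is spread over
# many fine voxels, so VOI statistics change far less.
vol = np.random.rand(32, 32, 32)
fine = resample_to_1mm(vol, (5.0, 5.0, 5.0))
print(fine.shape)  # (160, 160, 160)
```

After rigid alignment to the reference image, the remaining residual shift is small relative to the 1-mm grid, which is why even the 13-mm sphere is measured stably.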
Among the several indexes calculated by the software, the index most strongly associated with DS was %DEV; thus, it was selected as the first node in the decision tree analysis. We defined %DEV as an alternative index for detectability, and in our preliminary experiment, a threshold value of 40% was optimal. Subsequently, the classification between DS 1–2 and DS 3–4 showed detection performance with a certainty of 100% at a %DEV threshold of 14.215%. The classification between DS 1 and DS 2, which indicates whether a hot lesion was barely detected, was then performed with a CNR threshold of 5.25. This value is in agreement with our earlier observations, which showed that observable lesions have a CNR greater than 5 under the Rose criterion [7]. Indeed, in the present study, only one out of 65 detectable lesions did not have a CNR greater than 5. Nevertheless, the decision tree analysis indicated that %DEV > 14%, rather than CNR > 5, was most strongly associated with the detection of hot lesions. Several previous reports have demonstrated that decision tree analysis provides new incremental value in nuclear medicine [21, 22]. In the current study, the weak correlation between DS and CNR can be attributed to the fact that only hot lesions with a CNR from 4 to 6 were included. Indeed, many studies have focused only on CNR for the detection of hot lesions; thus, the potential effects of lesion volume and maximum density in the image may have been neglected [11, 12, 23, 24]. However, prior studies have also highlighted the importance of a quantitative index combining CNR and lesion volume for detectability [8, 13]. These previous reports focused on the number of pixels in the lesion ROI, which makes comparison of images with different pixel sizes impossible. In the current study, %DEV was calculated from the CNR together with a factor for lesion volume, and it was associated with the detectability of hot lesions.
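The two splits described above can be sketched as a small classifier. This is a hedged sketch: the assignment of the %DEV split to DS 1–2 versus DS 3–4 and the ordering of the nodes follow our reading of the tree, and the split between DS 3 and DS 4 is not described in the text, so it is not reproduced.

```python
def classify_ds(pct_dev: float, cnr: float) -> str:
    """Sketch of the reported decision-tree thresholds (assumed order:
    %DEV > 14.215 separates DS 3-4 from DS 1-2 at the first node, then
    CNR > 5.25 separates DS 2, barely detected, from DS 1)."""
    if pct_dev > 14.215:   # first node: %DEV, most strongly associated with DS
        return "DS 3-4"
    if cnr > 5.25:         # second node: barely detected lesions
        return "DS 2"
    return "DS 1"

print(classify_ds(20.0, 4.0))  # DS 3-4
print(classify_ds(10.0, 6.0))  # DS 2
print(classify_ds(10.0, 4.0))  # DS 1
```

The second example shows why CNR alone correlated weakly with DS here: a lesion with CNR > 5 can still fall below the %DEV threshold and be classified as only barely detected.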
One unexpected finding was that the SI, a resolution characteristic free from the influence of background and noise, was poorly associated with DS, despite the fact that the hot lesions were blurred by the 9.6-mm Gaussian filter.
Given that visual analysis may lead to greater inter-observer variation and more laborious work, previous studies have substituted CNR for detectability [8–11]. Although quantitative indexes such as CNR correlate strongly with detectability, measurement of CNR using manual ROI settings showed significant inter-observer variation in this study. In manual measurement of CNR, variation between observers is one of the primary causes of variation in the CNR, irrespective of the length of experience [25]. With the manual method, CNRs differed between observers by more than a factor of three, whereas Hone Graph provided stable results. The intra-observer variation for the manual method was approximately 10%, but the method still significantly underperformed Hone Graph. Very few studies have evaluated the inter-observer reproducibility of ROI analysis, and it has been largely neglected in phantom-based studies. The excellent repeatability and reproducibility of Hone Graph are supported by previous studies that demonstrated the validity of SPM [26, 27]. Regarding detectability, the results showed that classification of the DS using Hone Graph was in almost perfect agreement with the gold standard and in excellent agreement with experienced BCNMTs. However, with each decrease in years of experience, the agreement with the DS of Hone Graph also decreased. This variation across years of experience is in line with previous studies demonstrating that interpretation performance improves with experience, although phantom studies differ from clinical studies in that the observers know where to look for the lesions [28, 29].
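Because no formula is restated in this section, the following sketch assumes a common CNR definition, (mean lesion − mean background) / background SD; the function name and the sample values are illustrative only, not the study's verbatim formula.

```python
import numpy as np

def cnr(lesion_voi, background_voi):
    """CNR under a common (assumed) definition:
    (mean lesion - mean background) / SD of background."""
    lesion = np.asarray(lesion_voi, dtype=float)
    bg = np.asarray(background_voi, dtype=float)
    return (lesion.mean() - bg.mean()) / bg.std(ddof=1)

# With fixed, software-defined VOIs, repeated measurement of the same data
# always yields the same CNR; manually drawn ROIs shift the sampled voxels
# between observers, which is the source of the inter-observer variation.
print(cnr([160.0, 160.0], [90.0, 100.0, 110.0]))  # 6.0
```

The denominator is the dominant source of manual variability: small changes in background ROI placement change both the background mean and its SD, and the two errors compound in the ratio.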
Even when a different device or acquisition parameter (e.g., pixel size or acquisition time) is used, Hone Graph will likely serve as a useful basis for comparison across images owing to its pixel-size resampling and normalization of the count distribution. As a phantom-based method, Hone Graph is a simple and convenient means of understanding differences between gamma camera systems or imaging technologies. As an extension of this idea, standardization of bone SPECT imaging technology may be possible using Hone Graph. Quantitative indexes are usually calculated on the central slice through the target (i.e., in two dimensions only), whereas Hone Graph uses a VOI (i.e., three dimensions) and showed excellent reproducibility. CNR measurement using an ROI can pose a serious problem for reproducibility owing to variation in pixel size or to a slice through the target that is off-center. ROI analysis makes comparison of CNR difficult even when the pixel size alone varies [12].
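The count-distribution normalization mentioned above might look like the following. This is an assumption for illustration (scaling by the mean count in a reference region); Hone Graph's exact normalization scheme is not specified in this section.

```python
import numpy as np

def normalize_counts(volume, reference_mask):
    """Scale a SPECT volume by the mean count inside a reference region
    (assumed normalization; the software's exact method is not given)."""
    return volume / volume[reference_mask].mean()

# Doubling the counts (e.g., a longer acquisition or a more sensitive
# camera) leaves the normalized distribution unchanged, which is what
# makes image-to-image comparison across devices possible.
vol = np.arange(64, dtype=float).reshape(4, 4, 4) + 1.0
mask = np.zeros_like(vol, dtype=bool)
mask[0] = True  # hypothetical reference region
print(np.allclose(normalize_counts(vol, mask),
                  normalize_counts(vol * 2.0, mask)))  # True
```

Combined with resampling to a common 1-mm grid, this removes the two main sources of cross-acquisition incomparability: voxel size and absolute count level.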
Many published studies on lesion detection in bone SPECT [1–5] highlight the importance of nuclear medicine technologists maintaining adequate imaging technology for bone SPECT. However, to the best of our knowledge, almost no reports have described imaging technology for bone SPECT. The rapid SPECT/CT protocol has a potential impact on SPECT/CT acquisition, although this scan protocol has not been shown to detect smaller or weaker lesions. Conversely, phantom-based image quality assessment assists in developing a greater understanding of imaging technology and of the ability to detect small lesions. In terms of quality assessment, the accuracy of quantification, repeatability, and reproducibility are of the utmost importance, all of which are possible with the software.
A limitation of our study is that we assessed the proposed software using only one gamma camera system and a limited range of imaging parameters. The maximum value in the reference section is the criterion value used to assess image quality with Hone Graph; however, this maximum value may be decreased by the Gibbs artifact, which can lead to overestimation of the DS of lesions [30]. Even after optimization based on the proposed method, additional assessment of clinical images by a physician is likely to be needed owing to the structural differences between the phantom and the human body. It is conceivable that Hone Graph can assess bone SPECT images across multiple centers; however, the quantitative indexes other than the DS calculated by the program have no criterion values. Further multicenter studies are required to define criteria for optimizing imaging technology for bone SPECT. With increased volumes of SPECT images, Hone Graph may be extended with machine learning methods such as random forests or support vector machines.