This study proposed a preliminary protocol for assessing the technical factors that contribute to the accuracy and reproducibility of quantitative flow values in MPI. This protocol could eventually be used with the flow phantom to plan harmonization measurements for myocardial perfusion imaging with [15O]H2O. We evaluated the impact of technical factors on the modelled flow values, Qin and Qout, extracted from the flow phantom. Accuracy and reproducibility were evaluated between test and retest measurements, between two [15O]H2O injectors, and between two digital PET systems, Discovery MI and Biograph Vision 600. However, there is no specific reason why the protocol could not be applied to analog PET/CT systems as well.
The administered [15O]H2O activities were repeatable between test and retest measurements on both systems (Table 3). The RWG installed on DMI-20 produced a relative difference of over 10 % between test and retest in only one measurement, and the RWG on Vision-600 in two measurements. All administered activities were within 15 % of the requested 500 MBq and thus fell within the vendor specifications. However, the RWG installed on the Vision-600 delivered systematically higher activities, which directly contributed to the amplitude of the bolus curves.
To minimize the outside technical factors affecting the modelled flow values and to conduct quality control of our measurements, we also measured the error of Qtube+Qcyl with respect to Qpump. Ideally, Qtube+Qcyl should be identical to Qpump, but differences were observed in most measurements on both DMI-20 and Vision-600. Vision-600 showed larger Qtube+Qcyl differences than DMI-20. Overall, all Qtube+Qcyl vs Qpump differences were below 10 % on both systems (Figure 1), which is why we assume that this factor has only a minor effect on the modelled flow values.
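The flow-balance check described above can be sketched in a few lines. This is a minimal illustration, assuming the error is expressed as the signed difference of Qtube+Qcyl from Qpump relative to Qpump; the function names and the example flow values are hypothetical and are not taken from the measurements or from the phantom software.

```python
def flow_balance_error_pct(q_tube: float, q_cyl: float, q_pump: float) -> float:
    """Signed error of Qtube + Qcyl relative to Qpump, in percent (assumed definition)."""
    if q_pump <= 0:
        raise ValueError("q_pump must be positive")
    return (q_tube + q_cyl - q_pump) / q_pump * 100.0


def passes_flow_qc(q_tube: float, q_cyl: float, q_pump: float,
                   limit_pct: float = 10.0) -> bool:
    """True if the absolute flow-balance error is below the limit (10 % in the text)."""
    return abs(flow_balance_error_pct(q_tube, q_cyl, q_pump)) < limit_pct


# Hypothetical flow readings in ml/min, for illustration only (not measured data).
error = flow_balance_error_pct(q_tube=120.0, q_cyl=75.0, q_pump=200.0)
print(f"Flow-balance error: {error:.1f} %")
print("QC passed:", passes_flow_qc(120.0, 75.0, 200.0))
```

A signed error is kept here so that a systematic over- or under-reading of the flow meters remains visible; only the pass/fail decision uses the absolute value.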
The phantom protocol includes quality control procedures, which we conducted to minimize the variation of Qtube+Qcyl. The most important of these is the pump calibration prior to each measurement session. There could also be a benefit from calibrating the pump whenever the flow rate or the constriction of Qcyl or Qtube is altered between measurements. However, this might not be practical, as changing the calibration between measurements may compromise the measurement reproducibility. In addition, the flow meter calibration applied in the QuantifyDCE software should ensure the validity of the flow meter readings, so a single pump calibration should be sufficient to ensure the accuracy of the recorded flow values.
In general, the most significant factor contributing to the differences in the input TACs and the resulting flow values was the difference in [15O]H2O bolus peak amplitudes between measurements performed on DMI-20 and Vision-600 (Figure 2). The difference resulted in higher bolus AUCs as well as input AUCs on Vision-600 (Supplemental Figures 1, 2). The direct relationship between the bolus and the input TAC arises because the bolus travels directly from the injection port to the input chamber over a relatively short distance and time, without mixing or dispersion on the way. Another factor contributing to this variation is that each PET/CT system has its own RWG, although both are calibrated to the same reference and their delivered activity should not deviate from the target activity by more than 15 %. However, systematic differences in bolus delivery between test and retest measurements could be seen, especially in the measurements conducted with Vision-600 (Figure 2). Nevertheless, the input TACs were reproducible between the systems and the measurements (Figures 3 and 4).
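As a minimal sketch of why a higher bolus peak translates into a higher AUC, the area under a sampled curve can be approximated with the trapezoidal rule. The function name and the sample curves below are hypothetical, and the trapezoidal rule is an assumed choice, not necessarily the integration used in the analysis software.

```python
def trapezoid_auc(times_s, activity):
    """Area under a sampled time-activity curve using the trapezoidal rule."""
    if len(times_s) != len(activity):
        raise ValueError("times and activity must have the same length")
    if len(times_s) < 2:
        raise ValueError("at least two samples are required")
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * dt * (activity[i] + activity[i - 1])
    return area


# Hypothetical bolus curves (arbitrary units), for illustration only.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
bolus_low = [0.0, 4.0, 8.0, 4.0, 0.0]
bolus_high = [1.2 * a for a in bolus_low]  # same shape, 20 % higher amplitude

ratio = trapezoid_auc(times, bolus_high) / trapezoid_auc(times, bolus_low)
print(f"AUC ratio: {ratio:.2f}")
```

For curves of identical shape, the AUC scales linearly with the amplitude, so the AUC ratio equals the amplitude ratio.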
We observed that the dependency of the input TACs on the bolus delivery does not seem to propagate directly to the tissue and modelled TACs. The tissue TACs showed high reproducibility between measurements and systems (Figures 3 and 4). We could not detect a clear discrepancy in the tissue and modelled TACs between the test and retest measurements; only their height differed visibly in one measurement on DMI-20 (250-80 %) and in three measurements on Vision-600 (150-20 %, 150-80 %, and 200-80 %). Overall, the similar behavior of the tissue TACs between test and retest measurements and between systems is possibly due to the tracer already mixing within the exchange cylinder and washing out at a certain rate. This rate is primarily determined by the internal flow characteristics of the system, which are controlled by the pump and the flow constrictor valves.
Most importantly, we could show that most of the measurements were highly repeatable, as the repeatability errors of Qin and Qout were below 15 % in the majority of the measurements (Figure 5, Table 5). The highest repeatability was seen in the measurements performed on the DMI-20 system. However, as the bolus peak amplitudes were higher in the measurements on Vision-600, and thus the input TACs were larger, the modelled flow values consequently showed more variation on Vision-600 than on DMI-20. A large factor contributing to this variation is the use of a separate RWG on each PET/CT system, which indicates that the differences originate not from the PET/CT system alone but also from the different bolus delivery systems.
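To make the 15 % criterion concrete, the following sketch computes a test-retest repeatability error and the fraction of measurements that fall below a threshold. It assumes the error is defined as the absolute test-retest difference relative to the mean of the pair; the definition used in the study may differ, and all names and values below are hypothetical.

```python
def repeatability_error_pct(test: float, retest: float) -> float:
    """Absolute test-retest difference as a percentage of the pair mean (assumed definition)."""
    pair_mean = (test + retest) / 2.0
    if pair_mean == 0:
        raise ValueError("mean of test and retest must be non-zero")
    return abs(retest - test) / pair_mean * 100.0


def fraction_below(errors_pct, threshold_pct: float = 15.0) -> float:
    """Fraction of repeatability errors that are below the threshold."""
    if not errors_pct:
        raise ValueError("errors_pct must not be empty")
    return sum(e < threshold_pct for e in errors_pct) / len(errors_pct)


# Hypothetical (test, retest) flow pairs in ml/min, for illustration only.
pairs = [(100.0, 110.0), (150.0, 152.0), (200.0, 260.0), (250.0, 245.0)]
errors = [repeatability_error_pct(t, r) for t, r in pairs]
print([f"{e:.1f}" for e in errors])
print(f"Fraction below 15 %: {fraction_below(errors):.2f}")
```

Normalizing by the pair mean treats test and retest symmetrically; normalizing by the test value alone would be an equally plausible convention.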
In general, the input curve discrepancies due to the bolus delivery mainly resulted in higher Qin differences between test and retest measurements (Table 5). This results from Qin being analogous to the K1 (wash-in) parameter of the two-compartmental model and therefore being more sensitive to variations in the input function. We also saw more variation in the Qin parameter at higher flow rates, indicating that Qin might be sensitive to the flow rate as well as to the bolus (Figure 8). The changes in the input TAC height did not seem to affect the Qout parameter in the same way as Qin. The Qout values are analogous to the k2 (wash-out) rate parameter, which makes them more sensitive to changes in the tissue TAC. However, variations could also be seen in the Qout parameter in the correlation analysis (Figures 6 and 7) as well as in the agreement analysis (Figure 8), although in general the relative errors between the measurements were smaller for the Qout parameter.
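The analogy between Qin/Qout and K1/k2 can be illustrated with a small simulation. This is a sketch only: it assumes the standard single-tissue compartment equation dCt/dt = K1·Ca(t) − k2·Ct(t), integrated with a simple forward-Euler step. The function name, the rate constants, and the input curve are hypothetical, and the model implemented in the phantom analysis software may differ.

```python
def simulate_tissue_tac(input_tac, dt, k1, k2):
    """Forward-Euler integration of dCt/dt = k1*Ca(t) - k2*Ct(t), starting from Ct = 0.

    input_tac: samples of the input curve Ca, spaced dt apart.
    Returns the tissue curve Ct sampled at the same time points.
    """
    tissue = []
    ct = 0.0
    for ca in input_tac:
        tissue.append(ct)
        # Wash-in is driven by the input curve, wash-out by the tissue curve itself.
        ct += dt * (k1 * ca - k2 * ct)
    return tissue


# Hypothetical constant input and rate constants (1/s), for illustration only.
dt = 0.01
constant_input = [1.0] * 5000
tissue = simulate_tissue_tac(constant_input, dt, k1=0.5, k2=0.25)
print(f"Plateau: {tissue[-1]:.3f} (expected K1/k2 = {0.5 / 0.25:.3f})")
```

The wash-in term multiplies the input curve directly, whereas the wash-out term depends only on the tissue curve, which is consistent with the wash-in parameter being the one that responds most to changes in the input function.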
In the Bland-Altman analysis (Figure 8), Qin and Qout showed differences between DMI-20 and Vision-600 and varied more at high flow values. Although Qout showed a smaller mean difference, there were variations in both Qin and Qout in both the test and retest measurements between the systems. Altogether, the LoAs were similar for Qin and Qout, and the mean differences between DMI-20 and Vision-600 were 9.9 ml/min and -0.3 ml/min for Qin and Qout, respectively (Figure 8). The results show that although both Qin and Qout can be derived across two injector systems and two PET/CT systems with a small bias compared to the actual flow values, further measures need to be implemented to reduce the variation between the systems and thereby increase the reproducibility of both Qin and Qout.
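For reference, the quantities reported in a Bland-Altman analysis can be computed as in the sketch below. It assumes the conventional definition of the limits of agreement as the mean difference ± 1.96 times the sample standard deviation of the paired differences; the function name and the example values are hypothetical and do not reproduce the study data.

```python
from statistics import mean, stdev


def bland_altman(values_a, values_b):
    """Return (bias, lower LoA, upper LoA) for paired measurements.

    bias is the mean of the paired differences a - b; the limits of
    agreement are bias -/+ 1.96 * sample standard deviation of the differences.
    """
    if len(values_a) != len(values_b):
        raise ValueError("inputs must have the same length")
    if len(values_a) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(values_a, values_b)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Hypothetical paired flow values in ml/min from two systems, for illustration only.
system_a = [100.0, 150.0, 200.0, 250.0]
system_b = [90.0, 155.0, 190.0, 245.0]
bias, loa_low, loa_high = bland_altman(system_a, system_b)
print(f"Bias {bias:.1f} ml/min, LoA [{loa_low:.1f}, {loa_high:.1f}] ml/min")
```

A small bias with wide limits of agreement, as in this toy example, corresponds to the situation described above: good agreement on average but considerable variation between individual paired measurements.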
When the flow phantom measurements are compared with published MBF values measured in clinical subjects, the mean values and standard deviations of the repeatability errors between test and retest measurements on Vision-600 (Qin: 23 % ± 25 % and Qout: 22 % ± 27 %) were still within a range similar to that measured in clinical myocardial perfusion studies [23]. For example, the table in Klein et al. reports that the stress MBF values overall have a repeatability accuracy from 11 % to 34 %, and for [15O]H2O the reported values are 27 % [18] and 25 % [19]. In this regard, the values measured in the phantom study agree well with the results obtained from clinical subjects. Compared with our phantom study, the higher variation in patient studies reflects the contribution of differences in biological factors, such as the systemic hemodynamic state, to MBF.
Finally, these experiments confirmed that the proposed imaging protocol, with several flow values as well as imaging parameters including reconstructions, can be applied across different injector and PET/CT systems to provide an understanding of the technical accuracy and reproducibility of the quantitative flow values between test-retest measurements. A future multi-center study would be essential to provide upper and lower limits for the Qin and Qout values for calibration purposes and would give specific information about the different factors contributing to the variations of MBF values between centers and PET/CT systems. However, in order to apply this protocol in a multi-center setting, a careful investigation into standardizing the bolus delivery between different sites and dispenser systems needs to be conducted first. Ensuring an even more consistent tracer administration profile should improve test-retest repeatability further [24].
Limitations
The remaining technical factors from the PET systems that may affect the modelled flow values are, for example, the higher sensitivity of the Vision-600 compared to DMI-20 (13.7 cps/MBq on DMI-20 and 16.4 cps/MBq on Vision-600) as well as the finer spatial resolution of Vision-600 compared to DMI-20 (4.1 mm on DMI-20 and 3.5 mm on Vision-600) (Table 1). However, no direct relationship could be derived between these factors and the quantification differences between the systems. Future studies conducted on multiple systems will complement the understanding of these factors.
As an additional limitation, the flow phantom has internal characteristics that may affect the repeatability of the measurements. For example, the pressure variations inside the hoses, the air bubbles within the phantom, the air pressure and humidity of the scanner room, and the phantom flow meter inaccuracies are likely to affect the modelled flow values. In addition, the flow phantom presents only a simplified simulation of myocardial perfusion.
In clinical subjects, there will be more variation due to physiological factors, including tracer dispersion and individual reactions to pharmacological stress, in addition to other physiological factors that contribute to the measured MBF. However, the flow phantom allows the contribution of different technical factors to be studied in a controlled and measurable way, where the ground truth is known. Moreover, the phantom includes several calibration and quality control measures to ensure the reproducibility and comparability of the measurements.
Another limitation is that the clinical reconstruction protocols were used on each system without specific harmonization. Ideally, the reconstruction parameters would be harmonized between systems. For example, EARL has proposed guidelines for applying reconstruction parameters that result in equal voxel sizes regardless of the PET system used [4]. In this study, the voxel sizes (x, y, and z) were 1.82 mm, 1.82 mm, and 2.79 mm on DMI-20, and 1.65 mm, 1.65 mm, and 3 mm on Vision-600. Thus, the final reconstructed image resolution on both systems was similar despite the use of the clinical protocols. Moreover, assessing the clinical protocol on each system is more representative of a multi-center situation in which no specific harmonization has been conducted. However, the effect of harmonizing system reconstructions should be investigated in the future.
Furthermore, there is still room for validation of the protocol with the multiple tracers used for MPI. The present protocol was assessed only for [15O]H2O, whereas [82Rb] and [13N]NH3 are also commonly used in MPI as perfusion tracers. Thus, at least a cross-verification against the present protocol, or the design of a tracer-specific protocol for these tracers, is required. Moreover, modifying the protocol for use with [18F]-labelled tracers would be highly useful, as they are widely available in different centers. Thereafter, preliminary harmonization measures could also be evaluated using 18F-tracers, provided that the protocol has been cross-calibrated with the present [15O]H2O protocol.
Summary of the Findings and Future
We were able to demonstrate repeatability errors within 15 % across all measurements on DMI-20 and in half of the measurements on Vision-600. This relatively high repeatability is expected, as in a single-center setting several factors are advantageous for minimizing the bias and variability between the measurements. First, the [15O]H2O bolus injectors were calibrated to a common reference within the center. Second, the acquisition protocols and the activity delivery were standardized between the measurements. Third, both PET systems were cross-calibrated to the common reference within the center, and their acquisition durations and reconstruction frame times were the same, with relatively similar reconstruction parameter settings.
However, in multi-center settings there will be several more technical factors affecting the measurements, all of which should be investigated separately. The present findings therefore support assessing the current protocol for measuring the accuracy and reproducibility of flow values in a multi-center setting for myocardial perfusion imaging with [15O]H2O, where the first step is to ensure high repeatability and reproducibility of the bolus delivery, especially with different injector systems.