Evaluation of a Method Combining Physical Experiment and Data Insertion For the Assessment of PET-CT Systems

Background: To evaluate and compare Positron Emission Tomography (PET) devices, tests are performed on phantoms that generally consist of simple geometrical objects fillable with radiotracers. On the one hand, these tests give the operator control over the experiment through its preparation; on the other hand, they are limited in terms of reproducibility and repeatability and are time-consuming, in particular if several replications are required. To overcome these restrictions, we designed a method combining physical experiment and data insertion that aims to avoid experimental repetitions while testing multiple configurations for the performance evaluation of PET scanners.

Methods: Based on the National Electrical Manufacturers Association Image Quality standard, four experiments with different sphere-to-background ratios (2:1, 4:1, 6:1 and 8:1) were performed. An additional acquisition was done with a radioactive background and no activity within the spheres. It served as a baseline to artificially simulate the radioactive spheres and reproduce the initial experiments. The standard sphere set was replaced by smaller target sizes (4, 5, 6, 8, 10 and 13 mm) to match the current detectability performance of PET scanners. Images were reconstructed following standard guidelines, i.e. using the OSEM algorithm, and an additional BPL reconstruction was performed. We visually compared experimental and simulated images. We measured the activity concentration values in the spheres to calculate the mean and maximum recovery coefficients (RCmean and RCmax), which we used in a quantitative analysis.

Results: No significant visual discrepancies were identified between experimental and simulated series. Mann-Whitney U tests comparing simulated and experimental distributions showed no statistical differences for either RCmean (P value = 0.611) or RCmax (P value = 0.720). Spearman tests revealed high correlation for RCmean (ρ = 0.974, P value < 0.001) and RCmax (ρ = 0.974, P value < 0.001) between both datasets. According to Bland-Altman plots, we highlighted slight shifts in RCmean and RCmax of 2.1 ± 16.9 % and 3.3 ± 22.3 %, respectively.

Conclusions: The method produced realistic results compared to experimental data. Known synthesized information fused with original data allows full exploration of the system's capabilities while avoiding the limitations associated with repeated experiments.

Since the first Positron Emission Tomography (PET) scanners were introduced, this molecular imaging modality has been technologically redesigned and enhanced to reach higher sensitivity and better spatial and timing resolutions. This has been achieved through the development of its hardware components (scintillator crystal, photodetector, electronics) and software solutions (reconstruction and image analysis) [1]. As a result, the overall performance of PET devices has greatly increased, but with various technological and commercial strategies among manufacturers leading to different characteristics. Therefore, the assessment of their performance is essential to classify the systems, which can be achieved through tests based on standards defined by scientific experts from national and international authorities such as the National Electrical Manufacturers Association (NEMA) and the International [...] repeatability [4]. In addition, some parameters do not vary during the acquisition, such as the activity ratio between two compartments. Hence, to test n configurations, the experiment has to be repeated n times. Moreover, these phantoms are often no longer suited either to PET performance or to current clinical challenges such as the detectability of sub-centimeter lesions [5,6]. An auditable way to study PET system performance would be to simulate data to avoid the introduction of biases and experimental repetitions.
Indeed, simulation has several advantages over experiments performed on physical phantoms, such as a better control on [...] line, combining time-of-flight (TOF), low resolution and high sensitivity that improve overall image quality by reducing the noise in the reconstructed images and enhance lesion detection [12]. There are several reconstruction algorithms available to produce images from the acquired raw data, such as the common ordered subset expectation maximization (OSEM). A more recent algorithm, Bayesian penalized likelihood (BPL), gives access to a regularization parameter β that reduces image noise at each iteration [13,14]. Results from the NEMA NU2- [...] The evaluation of its performance and its benchmarking against other models is important but is not sufficiently discriminating given the opportunities that this high-performance PET-CT could offer. We performed several experiments based on the image quality (IQ) performance standard, in which we adapted the set of fillable spheres to challenging smaller sizes.

Phantom experiments
Based on the NEMA IQ NU-2 2018 procedure [2], we used the NEMA IEC body phantom with the lung insert. We replaced the standard fillable spheres by another set with smaller internal diameters of 4, 5, 6, 8, 10 and 13 mm (Data Spectrum Corporation, Durham, NC, USA), whose central section in the phantom is shown in Figure 1. As required by the standard, a scatter phantom was placed outside the scanner field of view.

The filling of the phantoms, their positioning on the examination bed and their acquisitions were carried out according to the standard recommendation. Five distinct experiments were performed, corresponding to five concentrations of 18F leading to five SBRs of approximately 2:1, 4:1, 6:1, 8:1 and finally 0:1 (water only within the spheres) as a baseline for data insertion.

The radioactive concentration of the background was nearly the same for all the experiments, so that the global activity within the body phantom was roughly the same. Details of the filling level in the different compartments of the scanned phantom are available in Table 1.

We chose to conduct this study with the OSEM and BPL algorithms because the former is part of the standard procedure and the latter is used in the clinical routine of our institution. This allows us to compare the impact of the reconstruction algorithm using our method. Images were obtained using the acquisition and reconstruction parameters detailed in Table 2.

Data insertion method

We aimed to simulate radioactivity in spheres filled with water (SBR 0:1), using the background as an activity reference, in order to reproduce the SBR of the other experiments (2.07:1, [...] we determine the spatial coordinates of the center of each physical sphere from the CT images in order to precisely define the center of each artificial sphere. Lastly, we calculate the activity concentration (AC) (Bq/mL) to be inserted inside them. To achieve that, we draw 12 Volumes- [...] (3)

In order to simulate the four synthetic sets of images reproducing the already acquired experiments, we multiply the mask by the exact experimental value of each SBR, i.e. 2.07, 3.93, 6.03 and 7.97, respectively (4).
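The mask scaling described above can be illustrated with a minimal sketch. It is not the manufacturer's insertion tool: the function names `sphere_mask` and `scale_mask`, the binary (no partial-volume) mask, the toy grid and the background value of 1000 Bq/mL are all assumptions made for illustration. The sketch assumes that the mask holds the background activity concentration inside each artificial sphere and zero elsewhere, so that multiplying it by an SBR gives the activity concentration to insert.

```python
def sphere_mask(shape, center_mm, diameter_mm, voxel_mm, fill_value=1.0):
    """Return a 3D nested-list mask: fill_value inside the sphere, 0.0 outside.

    A voxel is 'inside' when its center lies within the sphere radius
    (binary mask; partial-volume weighting is deliberately ignored here).
    """
    nz, ny, nx = shape
    cz, cy, cx = center_mm
    radius_sq = (diameter_mm / 2.0) ** 2
    mask = [[[0.0] * nx for _ in range(ny)] for _ in range(nz)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                # Voxel-center coordinates in mm, relative to the sphere center.
                dz = (z + 0.5) * voxel_mm - cz
                dy = (y + 0.5) * voxel_mm - cy
                dx = (x + 0.5) * voxel_mm - cx
                if dz * dz + dy * dy + dx * dx <= radius_sq:
                    mask[z][y][x] = fill_value
    return mask


def scale_mask(mask, sbr):
    """Multiply every mask voxel by the sphere-to-background ratio."""
    return [[[value * sbr for value in row] for row in plane] for plane in mask]


# Hypothetical background activity concentration (Bq/mL), not a measured value.
background_ac = 1000.0
base = sphere_mask((5, 5, 5), (2.5, 2.5, 2.5), 2.0, 1.0, fill_value=background_ac)

# One scaled mask per experimental SBR quoted in the text.
inserts = {sbr: scale_mask(base, sbr) for sbr in (2.07, 3.93, 6.03, 7.97)}
```

In this sketch a single baseline mask yields all four synthetic configurations, which mirrors the idea of reproducing several experiments from one acquisition.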

Once these preliminary steps have been completed, we proceed to the generation of modified

\[ RC_{mean/max} = \frac{AC_{mean/max}\ (\text{kBq/mL})}{AC_{theoretical}\ (\text{kBq/mL})} \quad (5) \]

As the background activity may vary from one experiment to another, even if the SBR is roughly the same, we performed this calculation in order to normalize the different datasets and avoid discrepancies between reiterations ( [...] algorithms. An overview of the images is available in Figures 3 and 4 for OSEM and BPL reconstruction, respectively. [...] size. We highlight RCmean relative errors below 20 % for all configurations, and nearly the same for RCmax except for the 6 mm sphere at SBR 2:1 (OSEM), which is not visible on the image and [...] step required to normalize the data, which reflects the empirical status of the estimation made on the acquisition and reconstruction processes to mimic the system response.
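The recovery coefficient of Eq. (5) can be illustrated with a short sketch. The function name `recovery_coefficients` and the sample voxel values are hypothetical; the only thing taken from the text is the ratio of the measured mean or maximum activity concentration to the theoretical one, both in the same unit.

```python
def recovery_coefficients(voi_values, ac_theoretical):
    """Return (RCmean, RCmax) for one sphere, following Eq. (5).

    voi_values     -- activity concentrations (kBq/mL) of the voxels in the VOI
    ac_theoretical -- expected activity concentration (kBq/mL) in the sphere
    """
    if not voi_values:
        raise ValueError("the VOI must contain at least one voxel")
    if ac_theoretical <= 0:
        raise ValueError("the theoretical activity concentration must be positive")
    ac_mean = sum(voi_values) / len(voi_values)
    ac_max = max(voi_values)
    return ac_mean / ac_theoretical, ac_max / ac_theoretical


# Hypothetical VOI voxel values (kBq/mL), for illustration only.
rc_mean, rc_max = recovery_coefficients([8.0, 10.0, 12.0], 20.0)
```

Because both terms of the ratio share the same unit, the coefficient is dimensionless, which is what makes datasets with slightly different background activities comparable.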

In order to overcome these limitations, we designed and evaluated a method combining physical experiment and data insertion that aims to test multiple configurations for the performance evaluation of PET-CT devices while avoiding experimental reiterations. We used the method to reproduce, from a single test, the different experimental scenarios we had conducted. We then focused our analysis on the realism of the synthesized objects through a visual and quantitative comparison with acquired experimental data.

Based on real images and using an accurate model of the scanner, the method generates images whose visual rendering and object visibility are similar to those of experimental images. Discrepancies observed in the visual comparison occur for targets whose size and contrast are challenging for the device to detect. This is the limit of the method, which shows slight deviations from the physical images when the performance limits of the system are reached.

Nevertheless, given how specific and small these differences are, we considered them negligible for our study and assessed the equivalence between the two datasets. From the quantitative analyses, we were able to verify that the experimental and artificial data were comparable and correlated, with normally distributed differences. Both algorithms showed no statistical differences and close results in terms of correlations and limits of agreement. [...] quantitative results compared to real data even under challenging situations such as small and low-contrast targets. Based on computer programs developed by the PET-CT manufacturer, the method uses the same processes available on the physical systems. In contrast to some simulation studies [7-9], the processes rely on a real acquisition performed on the system.
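The bias and limits of agreement used in a Bland-Altman analysis can be sketched as follows. This is an assumption-laden illustration, not the analysis pipeline of the study: the function name `bland_altman_percent` is hypothetical, differences are expressed as a percentage of the pair mean, and the limits are taken as bias ± 1.96 sample standard deviations, which is the usual convention when differences are normally distributed.

```python
import statistics


def bland_altman_percent(experimental, simulated):
    """Return (bias, lower_limit, upper_limit) of the percent differences.

    Each difference is (simulated - experimental) relative to the pair mean,
    in percent. Limits of agreement are bias +/- 1.96 * sample SD.
    """
    if len(experimental) != len(simulated):
        raise ValueError("both series must have the same length")
    if len(experimental) < 2:
        raise ValueError("at least two paired measurements are required")
    diffs = [
        100.0 * (s - e) / ((s + e) / 2.0)
        for e, s in zip(experimental, simulated)
    ]
    bias = statistics.mean(diffs)
    half_width = 1.96 * statistics.stdev(diffs)
    return bias, bias - half_width, bias + half_width


# Hypothetical paired recovery coefficients, for illustration only.
bias, lower, upper = bland_altman_percent([0.40, 0.60, 0.80], [0.42, 0.60, 0.78])
```

A bias close to zero with narrow limits would indicate that simulated and experimental recovery coefficients agree; wide limits point to configurations, such as the smallest spheres, where the two diverge.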

Hence, files generated during the initial acquisition and reconstruction are used to generate new datasets in which virtual information is inserted, yielding a realistic rendering of the exam. These synthetic datasets are useful for qualitative and quantitative assessment of system performance, as they combine real backgrounds with inserted objects of known size, activity and location.

From a single experiment, the method can generate as many configurations as needed without requiring access to the scanner, which may be limited in terms of device and radiotracer availability. In addition, it can be applied directly to clinical data in order to evaluate impacts

One of the underlying assumptions in the manufacturer functions involved in the method is that the inserted information has a negligible impact on the scatter and random coincidences in the resulting sinogram (simulated + original). Hence, the modified data include only the original random and scattered coincidences. In this study, given the sizes and activities present in the spheres, we assume that their impact is negligible.

Conclusion

We developed and validated a method allowing the generation of realistic virtual objects in phantom images. The insertions can be fully controlled and provide opportunities to evaluate medical imaging functions and image processing techniques.

In the context of a collaborative research partnership, this study is a first step in the development of the method for the performance evaluation of the next generation of PET systems. It will then be extended to more complex phantom models and clinical data.