Can a Deep Progressive Learning Method Help to Achieve High Image Quality as Total-body PET Imaging?

Purpose To propose and validate a total-body PET (TB-PET) guided deep progressive learning method (DPR) for low-dose clinical imaging on standard axial field-of-view PET/CT scanners (SAFOV-PET).

Methods List-mode raw data from a total of 182 scans were collected, including 100 patient scans from a TB-PET, and 15 phantom and 67 patient scans from a SAFOV-PET. Neural networks employed in DPR were trained with the high-quality images obtained from the TB-PET using a progressive learning strategy and evaluated on a SAFOV-PET through three stages of studies. The CTN phantom was first used to verify the effectiveness of the protocols in the DPR and OSEM algorithms. Subsequently, list-mode raw data from retrospective and prospective PET oncological patients (n=26 and 41, respectively) were rebinned into short-duration scans (referred to as DPR_full, DPR_1/2, DPR_1/3, and DPR_1/4) and reconstructed with DPR. Full-duration [...] image quality to the reference. In the prospective study, good agreement of the SUVs between DPR and OSEM was found in all the selected background tissues even when the injected dose was reduced to 1/3. Both quantitative and qualitative results demonstrated that the DPR_1/3 group showed no significant difference from the reference regarding the liver COV and subjective scores. The lesion SUVs and TLRs in the DPR_1/3 group were significantly enhanced compared with the reference, even for small lesions.

Conclusions The proposed DPR method can reduce the injected dose of a SAFOV-PET scan by up to 2/3 in a real-world deployment while maintaining image quality. [...] 1/3 of the standard without compromising the image quality and small lesion detectability. This study has shown the potential of the proposed DPR algorithm in low-dose PET imaging for conventional digital PET/CT scanners in clinical routines.

[...] Helsinki declaration and its later amendments or comparable ethical standards.

Consent to participate
The need for written informed consent in the retrospective study was waived by the Institutional Review Board of Shanghai General Hospital, and all participants in the prospective study signed the informed consent form prior to the study.

Positron emission tomography (PET) is a non-invasive molecular imaging modality widely used in oncology, neurology, cardiology, and other fields [1-8]. Good image quality and accurate quantification of the radiotracer are vital for clinical diagnosis, prognosis, staging/restaging, and treatment monitoring. However, PET images are limited by high noise and low spatial resolution [9]. The signal-to-noise ratio (SNR) of PET images for a given scanner, a metric of image quality, is proportional to the square root of the product of the system sensitivity, the injected activity, and the acquisition time. Reducing the injected activity or the acquisition time is desirable for lowering radiation exposure or improving patient comfort, provided that the image quality remains at an acceptable level for clinical use. With the advent of the state-of-the-art TB-PET scanner, the system sensitivity can be improved by a factor of 40, allowing for ultra-high image quality [10]. The TB-PET scanner (uEXPLORER, United Imaging Healthcare, Shanghai, China) is composed of 8 PET units along the axial direction, giving the system a total axial length of 194 cm [11]. Due to its high cost, the availability of the TB-PET scanner is limited. Thus, an approach that elevates the image quality of the standard axial field-of-view (SAFOV) PET to that of the TB-PET is desirable.
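The square-root relation above can be turned into a quick back-of-the-envelope calculation. The following is a minimal sketch, assuming that the proportionality holds exactly and that all other factors (reconstruction, patient size, count-rate effects) are unchanged; the function name `relative_snr` and its arguments are illustrative and do not come from the study.

```python
import math


def relative_snr(sensitivity=1.0, activity=1.0, duration=1.0):
    """Return SNR relative to a reference scan, assuming
    SNR is proportional to sqrt(sensitivity * activity * duration).

    Each argument is a ratio to the reference scan (1.0 = unchanged).
    """
    return math.sqrt(sensitivity * activity * duration)


# One third of the injected activity, same scanner and scan time.
snr_third_dose = relative_snr(activity=1 / 3)

# One third of the activity, compensated by a three times longer scan.
snr_compensated = relative_snr(activity=1 / 3, duration=3)

# A 40-fold sensitivity gain, the figure quoted for TB-PET.
snr_total_body = relative_snr(sensitivity=40)

print(f"1/3 activity:          {snr_third_dose:.3f} x reference SNR")
print(f"1/3 activity, 3x time: {snr_compensated:.3f} x reference SNR")
print(f"40x sensitivity:       {snr_total_body:.3f} x reference SNR")
```

Under this simplified model, a threefold dose reduction alone costs a little over 40% of the SNR, which is the deficit a reconstruction method would need to recover if the scan time is not extended.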

During the past few years, deep learning approaches have been shown to achieve superior performance in denoising PET images [12-17]. They have shown the potential for low-dose to full-dose conversion in various studies. During network training, the high image quality of the training pairs is essential to network performance. However, direct learning from an input image to the target image is challenging if the gap between the two images is large. More recently, a deep progressive learning reconstruction (DPR) algorithm for PET was proposed to bridge the gap between low-quality and high-quality images through multiple learning steps [18]. The training data used in the network come from TB-PET images with an acquisition time of 900 s, which yields excellent image quality. Thus, we hypothesized that the proposed DPR algorithm can help SAFOV PET/CT scanners to reconstruct PET images with a quality comparable to that of a TB-PET scanner.

To validate the proposed DPR algorithm, we first investigated its performance in shortening the acquisition time with the retrospective data. The fast acquisition simulates a reduced injected activity in PET imaging and provides evidence for the subsequent study with real-world low-dose injection. Patients were then prospectively enrolled and injected with a reduced activity based on the above results. The image quality of these patients was comprehensively evaluated regarding quantification accuracy, lesion contrast, and visual assessment.

This study aimed to investigate the image quality of 18F-FDG PET images reconstructed by the DPR algorithm in patients with both a simulated and a real-world low injected activity, and to compare it with that of images reconstructed by the standard ordered subset expectation maximization (OSEM) algorithm.

[...] where $A \in \mathbb{R}^{M \times N}$ is the system matrix, $r$ is the random and scatter estimate, and $e$ is the additive noise. In the framework of the unrolled method for deep learning based iterative image reconstruction algorithms [19], the maximum likelihood estimate of the unknown image $x$ can be calculated as [...] where $L(y \mid x)$ is the log-likelihood function, and $z = f(x; \theta)$ is a convolutional neural network (CNN) representation of an image $z$ with input image $x$ and parameters $\theta$. The DPR algorithm suggests that the network $f$ could be decomposed into many sub-networks to make the network training easier. Our current implementation employs two sub-networks, that is, [...] demonstrated its ability to provide clinically acceptable images for a scan as short as 60 s and good images for a scan of more than 180 s. A 900-s scan is therefore long enough to generate excellent images with very low image noise and high image contrast.

For CNN-DE, PET images with 10% uniformly down-sampled counts were used as training input, and PET images with full counts were used as training targets. For CNN-EH, PET images with insufficient iterations were used as training input, and PET images with sufficient iterations were used as training targets. The training image size was 249×249×671 with a voxel size of 2.4×2.4×2.68 mm³. The reconstruction algorithm was OP-OSEM with time-of-flight (TOF) and resolution modeling. All necessary corrections (scatter, normalization, dead time, attenuation, random, and decay) were applied. A total of 53680 image pairs from 80 patients were used to construct the training dataset.
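As an illustration of how a 10% low-count input could be derived from a full-count acquisition, the sketch below keeps every tenth event of a time-ordered event list. This is only one plausible reading of "uniformly down-sampled"; the actual procedure is not described in the text, and `uniform_downsample` and the toy event list are hypothetical. The second part checks the reported number of training pairs under the assumption of one image pair per axial slice.

```python
def uniform_downsample(events, keep_one_in=10):
    """Keep every `keep_one_in`-th event of a time-ordered event list.

    With keep_one_in=10, about 10% of the counts remain, spread evenly
    over the whole acquisition.
    """
    if keep_one_in < 1:
        raise ValueError("keep_one_in must be a positive integer")
    return events[::keep_one_in]


# Stand-in for time-ordered list-mode events (hypothetical data).
events = list(range(1000))
low_count_events = uniform_downsample(events, keep_one_in=10)

# Bookkeeping check, assuming one image pair per axial slice.
SLICES_PER_VOLUME = 671      # third dimension of the 249x249x671 image
TRAINING_PATIENTS = 80
training_pairs = TRAINING_PATIENTS * SLICES_PER_VOLUME

print(len(low_count_events) / len(events))   # fraction of counts kept
print(training_pairs)                        # compare with the reported 53680
```

Taking every tenth event, rather than the first tenth of the scan, keeps the low-count input and the full-count target aligned in time, so tracer redistribution and motion affect both images equally.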
The training data were augmented via flipping and rotating, and z-score normalization was applied before they were fed into the network.

The network parameters were initialized with Kaiming initialization. The loss function was the sum of the L1 losses of all three branches. The loss was minimized by backpropagation using the adaptive moment estimation (Adam) optimizer with a cyclical learning rate. The minimum and maximum values of the cyclical learning rate were 1e-5 and 1e-4, respectively. The training settings were the same for both CNN-DE and CNN-EH. All the training was conducted using PyTorch 1.5.0 on a computer cluster with four NVIDIA Quadro RTX 6000 GPUs. The CUDA version was 10.2, and the cuDNN version was 7.6. The trained networks were tested with 13420 image pairs from the other 20 patients to ensure that they could be used in this study.

Algorithm evaluation
The performance of the proposed DPR algorithm was evaluated through three stages of studies.

The phantom study verified the effectiveness of reconstruction protocols. The retrospective patient study simulated the low-dose scenario using reduced scan durations, providing evidence for the subsequent prospective study with real-world low-dose injection.

Phantom study
The phantom study used a CTN anthropomorphic chest phantom that had two polystyrene-filled chambers and a uniform background to mimic the lung and the other soft tissue in the chest. Seven spherical lesions (diameter = 7, 10, 13, 17, 22, 28, and 37 mm) were placed in the phantom background and five spherical lesions (diameter = 10, 10, 13, 17, 22 mm) in the polystyrene-filled chambers. All spherical lesions and the phantom background were filled with 18F solution, and the concentrations at the start of the first experiment scan were 23.6 and 6.2 kBq/mL, respectively, resulting in a 3.81:1 lesion-to-background concentration ratio. In addition to the first experiment, two delayed experiments were conducted to evaluate the image quality at lower activity. The intervals between the first and second experiments, and between the second and third experiments, were 110 min and 65 min, respectively. Thus, the actual activity concentrations at the start of the second and third experimental scans were 1/2 and 1/3 of that of the first experiment, respectively. For each experiment, the CTN phantom was scanned five times, with 2 min per scan, on a digital PET/CT scanner (uMI 780, United Imaging Healthcare, China).
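The delay times and concentration fractions quoted above follow from the physical decay of 18F. A short check, assuming a half-life of 109.77 min (a standard value that is not stated in the text) and ignoring decay during each 2-min scan:

```python
F18_HALF_LIFE_MIN = 109.77   # physical half-life of fluorine-18 (assumed, not from the text)


def decay_fraction(elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Fraction of the initial activity left after `elapsed_min` minutes."""
    return 0.5 ** (elapsed_min / half_life_min)


lesion_kbq_ml = 23.6        # lesion concentration, first experiment
background_kbq_ml = 6.2     # background concentration, first experiment

contrast_ratio = lesion_kbq_ml / background_kbq_ml
fraction_second = decay_fraction(110)          # after the first interval
fraction_third = decay_fraction(110 + 65)      # after both intervals
background_third = background_kbq_ml * fraction_third

print(f"lesion-to-background ratio: {contrast_ratio:.2f}")
print(f"second experiment fraction: {fraction_second:.3f}")
print(f"third experiment fraction:  {fraction_third:.3f}")
print(f"third experiment background: {background_third:.2f} kBq/mL")
```

The lesion-to-background ratio is unaffected by the delay, since both compartments contain the same isotope and decay at the same rate.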

The phantom data were reconstructed with both OSEM and DPR algorithms. The OSEM algorithm was applied with two iterations, 20 subsets, a Gaussian filter with a full width at half maximum of 3 mm, a 192×192 matrix, a 600 mm field-of-view (FOV), 2.68 mm slice thickness, as well as TOF and resolution modeling. The standard corrections (scatter, random, dead time, decay, attenuation, and normalization) were included in the reconstruction. The DPR algorithm was applied with the same FOV, matrix, and slice thickness as in the OSEM. No other post-processing method was included in the reconstruction. The reconstructed PET images were analyzed using the phantom analysis toolkit provided by SNMMI (http://www.snmmi.org/PAT). Recovery coefficients (RC) of the spherical lesions and the background coefficient of variation (COV) were recorded and used as quantitative measurements to evaluate the performance of the different algorithms.

Patient study
Patients
The study included two patient cohorts, a retrospective and a prospective one. Twenty-six and forty-one oncological patients (female/male: 19/33, age: 24-87 years), referred to the Shanghai General Hospital from Nov. 2020 to Sep. 2021 for clinical 18F-FDG PET/CT examinations, were enrolled in the retrospective and prospective cohorts, respectively. Their demographic and clinical information is listed in Table 1.

All patients had fasted for at least 6 h, and their blood glucose level was confirmed to be ≤10 mmol/L before the 18F-FDG injection. A weight-based 18F-FDG dose (full dose in the retrospective cohort and one-third dose in the prospective cohort) was administered using a fully automated PET infusion system (MEDRAD, Bayer Medical Care Inc., Pennsylvania, USA) that allows an accurate dose administration. During the uptake period of about 60 min, the patients were hydrated orally with 0.5-1.0 L of water.

This study was approved by the Institutional Review Board of Shanghai General Hospital. Written informed consent was waived in the retrospective part and obtained from the patients in the prospective part.

PET/CT imaging and reconstruction
The same PET/CT scanner was used for image acquisition in both cohorts. Patients were first scanned with CT using a fixed tube voltage of 120 kV and an auto-mAs technique for dose modulation, providing anatomical information and attenuation correction for the PET images. Subsequently, patients were scanned with PET in step-and-shoot mode. PET data were acquired in list-mode for 120 s and 360 s per bed position in the retrospective and prospective studies, respectively.

The same reconstruction protocols as in the phantom study were used to reconstruct the acquired PET images (hereinafter referred to as OSEM_full and DPR_full). In the retrospective study, we rebinned the list-mode PET data to 60, 40, and 30 s per bed position to simulate 1/n (n=2, 3, 4) of the injected activity. The DPR algorithm was also applied to the rebinned PET data (hereinafter referred to as DPR_1/2, DPR_1/3, and DPR_1/4). In the prospective study, PET images were reconstructed using the first 120 s of data, and the acquisition time was extended to 180 s and 360 s per bed position to simulate the 1/2 and full dose scenarios. The DPR algorithm was also applied to the acquired and rebinned PET data (hereinafter referred to as DPR_1/3, DPR_1/2, and DPR_full).
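The mapping between scan duration and simulated dose in the two cohorts can be written down explicitly. The sketch below assumes that counts scale linearly with both injected activity and acquisition time (decay and uptake changes within a bed position are ignored) and uses the 120-s full-dose scan as the reference; the function names and the toy event stream are hypothetical and only illustrate the rebinning idea.

```python
from fractions import Fraction


def rebin_first_seconds(event_times_s, duration_s):
    """Keep the list-mode events recorded in the first `duration_s` seconds."""
    return [t for t in event_times_s if 0 <= t < duration_s]


def equivalent_dose_fraction(duration_s, injected_fraction=Fraction(1),
                             reference_duration_s=120):
    """Dose fraction emulated by a scan, relative to a full-dose scan of
    `reference_duration_s`, assuming counts scale with activity x time."""
    return injected_fraction * Fraction(duration_s, reference_duration_s)


# Retrospective cohort: full dose injected, 120 s acquired, then shortened.
retrospective = {d: equivalent_dose_fraction(d) for d in (120, 60, 40, 30)}

# Prospective cohort: one third of the dose injected, 360 s acquired.
prospective = {d: equivalent_dose_fraction(d, Fraction(1, 3))
               for d in (120, 180, 360)}

# Toy event stream: two events per second for 120 s.
event_times = [i * 0.5 for i in range(240)]
short_scan = rebin_first_seconds(event_times, 40)

for duration, fraction in retrospective.items():
    print(f"retrospective {duration:3d} s -> dose fraction {fraction}")
for duration, fraction in prospective.items():
    print(f"prospective   {duration:3d} s -> dose fraction {fraction}")
print(f"events kept in 40 s: {len(short_scan)} of {len(event_times)}")
```

Under this assumption, the 120-s reconstruction of a one-third-dose injection and the 40-s rebinning of a full-dose scan both correspond to the DPR_1/3 condition.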

Image analysis in the retrospective study
In the retrospective study, the image quality was quantitatively assessed on an advanced workstation (uWS-MI, United Imaging Healthcare, Shanghai, China). For each patient, a volume of interest (VOI) with a diameter of 30±3 mm was manually drawn at the same position and slice on a homogeneous area of the right liver lobe. The SUVmean and standard deviation (SD) within the VOI were recorded. The liver COV, as a measure of background noise, was obtained by dividing the SD by the SUVmean.
For the lesions, the SUVmax of each identified FDG-avid lesion was measured by placing a VOI that encompassed the whole lesion. The tumor-to-liver ratio (TLR), as a measure of image contrast, was then obtained by dividing the lesion SUVmax by the liver SUVmean.
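The two derived metrics are simple ratios. A minimal sketch with made-up voxel values follows; the function names are illustrative, and the use of the sample standard deviation is an assumption, since the workstation's SD convention is not stated.

```python
import statistics


def coefficient_of_variation(voxel_suvs):
    """COV = SD / SUVmean over the voxels of a background VOI."""
    suv_mean = statistics.mean(voxel_suvs)
    suv_sd = statistics.stdev(voxel_suvs)   # sample SD (assumed convention)
    return suv_sd / suv_mean


def tumor_to_liver_ratio(lesion_suvmax, liver_suvmean):
    """TLR = lesion SUVmax / liver SUVmean."""
    return lesion_suvmax / liver_suvmean


# Hypothetical liver VOI voxels and lesion SUVmax, for illustration only.
liver_voxels = [2.0, 2.2, 1.8, 2.1, 1.9]
lesion_suvmax = 8.0

liver_cov = coefficient_of_variation(liver_voxels)
tlr = tumor_to_liver_ratio(lesion_suvmax, statistics.mean(liver_voxels))

print(f"liver COV: {liver_cov:.1%}")
print(f"TLR:       {tlr:.2f}")
```

Because both metrics are normalized by the liver SUVmean, they are insensitive to a global scaling of the image, which makes them suitable for comparing reconstructions of the same patient.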

Image analysis in the prospective study
In the prospective study, the same nuclear medicine physician analyzed the images on the same workstation as in the retrospective study. Similarly, a VOI was manually drawn at the right liver lobe, aorta, and gluteus maximus. The SUVmean and standard deviation (SD) of the VOI for each series were recorded. The liver and the muscle COV were obtained by dividing the SD by the SUVmean. The SUVmax and TLR of each identified lesion were obtained using the same method as above. In addition, the diameter of the lesion was measured on the CT images.
Subsequently, the qualitative image quality of unlabelled images was assessed by two nuclear medicine physicians (XY, 17 years' experience, and WTS, 18 years' experience) in a randomized order. The readers were blinded to the patient's clinical information, the acquisition duration, and the reconstruction algorithm. The physicians viewed both the maximum intensity projection (MIP) and transverse PET images and judged the image quality using a 5-point Likert scale from the following three perspectives: image contrast, image noise, and diagnostic confidence (with 1=worst and 5=best). A score of 3 was given to images that were acceptable for clinical diagnosis.

Statistical analysis
Continuous parameters are presented as the mean ± SD and range. Fisher's exact test was performed to compare the gender distribution of the two cohorts. An independent t-test was used to compare the other demographic parameters of the enrolled patients between the two cohorts. Bland-Altman analyses were performed to assess the agreement of the SUVs between the reference and DPR images. All the quantitative parameters were tested for normality using the Kolmogorov-Smirnov test, and the two-tailed paired samples t-test was subsequently performed. Inter-rater reliability was evaluated using Cohen's weighted kappa (linear) coefficient. The scores of the qualitative image quality were subsequently compared using the Wilcoxon signed-rank test. A p-value of less than 0.05 was considered statistically significant, and all statistical tests were performed using SPSS Statistics, version 25 (IBM, Armonk, NY, USA) and R.
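To make two of the agreement measures concrete, the sketch below computes the Bland-Altman bias with 95% limits of agreement and a linearly weighted Cohen's kappa in plain Python. It is an illustration of the textbook definitions, not the SPSS or R routines used in the study, and all input values are invented.

```python
import statistics


def bland_altman(reference, test):
    """Return (bias, lower limit, upper limit) of agreement, using
    bias +/- 1.96 * SD of the paired differences (test - reference)."""
    diffs = [t - r for r, t in zip(reference, test)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def linear_weighted_kappa(rater_a, rater_b, categories=(1, 2, 3, 4, 5)):
    """Cohen's kappa with linear disagreement weights |i - j| / (k - 1)."""
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]
    d_obs = sum(abs(i - j) / (k - 1) * observed[i][j]
                for i in range(k) for j in range(k))
    d_exp = sum(abs(i - j) / (k - 1) * row[i] * col[j]
                for i in range(k) for j in range(k))
    if d_exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1 - d_obs / d_exp


# Invented liver SUVmean values (reference vs. a DPR series).
bias, lower, upper = bland_altman([2.0, 2.5, 3.0], [2.1, 2.4, 3.2])

# Invented 5-point Likert scores from two readers.
kappa_same = linear_weighted_kappa([1, 2, 3, 4, 5], [1, 2, 3, 4, 5])
kappa_opposite = linear_weighted_kappa([1, 2, 3, 4, 5], [5, 4, 3, 2, 1])

print(f"bias {bias:.3f}, limits of agreement [{lower:.3f}, {upper:.3f}]")
print(f"kappa (identical scores): {kappa_same:.2f}")
print(f"kappa (reversed scores):  {kappa_opposite:.2f}")
```

Linear weights penalize a one-point disagreement on the Likert scale a quarter as much as a four-point disagreement, which suits ordinal scores better than unweighted kappa.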

Phantom study
The current reconstruction protocols for both DPR and OSEM satisfy the minimum allowable RC values required by EANM EARL2 [20]. Moreover, the SUVpeak difference between the two algorithms is within 10%, which shows good consistency in quantitation accuracy. As the activity decreases from 6.2 kBq/mL to 2.1 kBq/mL, the background COV of the DPR image increases from 6.0% to 7.2%, while that of the OSEM image increases from 12.2% to 20.2%. The DPR image has much lower noise than the OSEM image in low-dose PET imaging (Fig. 2).

The phantom study verified the effectiveness of our current reconstruction protocols, which were then used in the subsequent patient study.
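The reported background COV values can be compared with a simple counting-statistics expectation. The sketch below assumes that noise scales with the inverse square root of the activity at fixed scan time; this is an idealization introduced here for illustration, not an analysis performed in the study.

```python
import math


def relative_increase(before, after):
    """Relative change from `before` to `after`."""
    return (after - before) / before


activity_high, activity_low = 6.2, 2.1   # kBq/mL, first and third experiment

dpr_increase = relative_increase(6.0, 7.2)      # DPR background COV (%)
osem_increase = relative_increase(12.2, 20.2)   # OSEM background COV (%)

# Idealized prediction: COV grows as sqrt(activity_high / activity_low).
noise_factor = math.sqrt(activity_high / activity_low)
osem_predicted_cov = 12.2 * noise_factor

print(f"DPR COV relative increase:  {dpr_increase:.1%}")
print(f"OSEM COV relative increase: {osem_increase:.1%}")
print(f"counting-statistics factor: {noise_factor:.2f}")
print(f"predicted OSEM COV at low activity: {osem_predicted_cov:.1f}%")
```

Under this assumption, the OSEM noise rises roughly as counting statistics would suggest, whereas the DPR noise rises far less, which is consistent with the statement that DPR suppresses noise in the low-dose setting.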

Retrospective patient study
The liver SUVmean agreed well, as shown in the Bland-Altman plots (Fig. 3), and no significant difference was found between the DPR groups and the reference (all p>0.05). The liver COV in the DPR_1/3 group showed no significant difference from that in the reference (p=0.955), indicating a comparable image quality (Fig. 4). The image quality of the DPR_1/2 and DPR_full groups was significantly improved, with a reduced COV (both p<0.001). Both the lesion SUVmax and TLR in all the DPR groups were significantly higher than those in the reference (all p<0.001), indicating an improvement in lesion conspicuity. Based on the above results, we concluded that the DPR algorithm can reduce the acquisition time to 1/3. Thus, in the subsequent prospective study, the image quality of patients injected with 1/3 of the standard 18F-FDG dose was analyzed.

Prospective patient study
In the prospective study, we analyzed the uptake of background tissues, including the liver, blood pool, and muscle, and found that the SUVmean values agreed well between groups, as shown in the Bland-Altman plots (Fig. 5). Subsequently, the COVs of the background tissues were compared (Fig. 6). There was no significant difference in the COVs between the DPR_1/3 group and the reference (p=0.055, 0.526, and 0.604 for the liver, blood pool, and muscle, respectively), while the COVs in the DPR_1/2 and DPR_full groups were both significantly lower than those in the reference (both p<0.001). Thus, the image quality of the DPR_1/2 and DPR_full groups was improved, while the DPR_1/3 images showed a quality comparable to that of the reference.

A total of 98 lesions were identified and included in the quantitative analysis. The SUVmax and TLR of the lesions in all the DPR groups were significantly larger than those in the reference group (all p<0.001, as shown in Fig. 7). The enhancement of the lesion uptake in the DPR images can be observed in the MIP and transverse images. Figure 8 shows PET images of a 31-year-old man with Hodgkin's lymphoma. Both the MIP and transverse DPR images showed improved lesion detectability. Meanwhile, the DPR images demonstrated a non-inferior noise level compared to the reference.

A further analysis was performed on the small lesions with a diameter of less than 10 mm (n=27). In all the DPR images, the uptake was significantly higher than that in the OSEM images (all p<0.001). Likewise, the TLR in the DPR groups showed significant improvement compared with that in the OSEM group (all p<0.001).

In the visual analysis, the weighted kappa coefficient was 0.612, indicating substantial agreement of the subjective scores between the two readers. There were no significant differences between the DPR_1/3 group and the reference regarding contrast, noise, and diagnostic confidence (p=0.284, 0.655, and 0.137, respectively). Both the DPR_1/2 and DPR_full groups showed significantly higher scores than the reference in all three perspectives (all p<0.001, as shown in Fig. 9).

In this study, we investigated the image quality of a deep progressive learning algorithm with both simulated and real-world reduced injected dose. PET relies on the detection of annihilation photons that are produced back-to-back after positron emission from a radioactive tracer. Hence, radiation exposure is inevitable in PET imaging, and the injected dose should be reduced as far as possible while maintaining adequate image quality that provides sufficient clinical information.

Particularly for longitudinal studies in which multiple PET/CT scans are performed, it is desirable to lower the injected dose as far as possible to reduce the accumulated dose. In addition, radiation exposure is of particular concern in paediatric populations because the risk of radiation-induced carcinogenesis is higher in children, who are thus more prone to developing secondary tumors during their lifetime [21-23].

The advent of total-body or long axial field-of-view (LAFOV) PET scanners has brought a breakthrough in system sensitivity, which can greatly help to reduce the injected dose. Previous studies have been performed with half and even 1/10 of the dose on both oncological patients and healthy volunteers, and demonstrated the feasibility of low-dose and ultra-low-dose imaging using the TB-PET scanner [24-28]. However, only a limited number of TB or LAFOV PET/CT scanners are available worldwide to date. For the remaining PET sites, applying deep learning technology is a more practical alternative for low-dose PET scans.

Several studies have shown the promising performance of CNN-based methods in improving PET image quality with a reduced acquisition time or injected activity while maintaining adequate information for diagnosis and quantification accuracy. In this work, the phantom study was performed to verify the harmonization of the selected reconstruction protocols as required by EANM EARL2. Subsequently, both the simulated and real-world low-dose DPR images were evaluated and compared with the standard OSEM images, and the results showed that the image quality can be maintained even when the injected dose is reduced to 1/3. In addition, this work comprehensively evaluated the image quality by using quantitative parameters and subjective scores.

During network training, high-quality images reconstructed from a high injected dose or a long acquisition time are required as the training labels. A high injected dose in PET imaging raises potential safety concerns and is not employed in clinical practice. On the other hand, a longer acquisition time may degrade the image quality due to the patient's motion during the acquisition. An alternative solution is to adopt regularised images with less noise and better contrast, such as those from block sequential regularised expectation maximization (BSREM) [17, 29-32]. Moreover, the use of TB-PET images can bring an unprecedented improvement in the image quality of the training dataset. The proposed DPR method uses two learning steps to map low-quality images to high-quality images and is feasible for both reducing noise and improving image contrast. A shortcoming of CNN-based methods is that the performance is usually degraded on small lesions, since they are overwhelmed by the image noise in low-SNR images. The proposed DPR method can tackle this problem by incorporating the networks into the iteration process [18]. In a previous study on the same network, even the smallest hot sphere with a diameter of 10 mm in an IEC body phantom still showed at least a 2-fold contrast-to-noise ratio (CNR) gain. Consistent results were found with the clinical data in this study. The DPR algorithm still performed well on the small lesions regarding both the quantitative SUV measurements and TLR values.

The present study has several limitations. First, the work was a single-centre study with a limited number of enrolled patients. A large-scale multi-centre study is needed, especially on the quantification accuracy of SUVs. In addition, a difference in physiological uptake was observed between the images reconstructed from the first two minutes of data and those from the last two minutes of data in the prospective study. Currently, we rebinned the first two minutes of data to reconstruct the DPR images for analysis. Furthermore, the supervised network in this work was trained with 18F-FDG PET data, and this study only enrolled oncological patients who underwent 18F-FDG PET examinations. In future work, the performance of the DPR algorithm can be evaluated with tracers other than 18F-FDG, such as 68Ga-PSMA.

In this work, a total-body PET guided deep progressive learning method was proposed for reducing the noise and improving the contrast of 18F-FDG low-dose PET images.

Figures

Figure 1
(a) A general scheme of the DPR algorithm. The reconstruction workflow consists of multiple CNN blocks. Each block receives the output image of the previous block as the initial image for the EM iterations and passes the output image to the next CNN block. (b) The network architecture of the FB-Net. In this design, the network has three branches. H1, H2 and H3 are the outputs of these three branches respectively. Green lines indicate residual connections between the input and the output of each branch. Gray lines indicate dense connections between different layers in the FB-Block.

Figure 3
Bland-Altman plots of liver SUVmean between the DPR group and the reference group in the retrospective study. All the DPR groups showed a good agreement on the quantification accuracy of the liver SUVmean with the reference.

Figure 4
Comparison of the liver COV, lesion SUVmax and TLR between DPR groups and the reference in the retrospective study. The liver COV showed no significant difference between the DPR_1/3 group and the reference. All the DPR groups show a significantly elevated SUVmax and TLR compared to the reference group, indicating an improved image contrast. *** p<0.001; ns, no significant difference. COV, coefficient of variation. SUV, standardized uptake value. TLR, target-to-liver ratio.

Figure 5
Bland-Altman plots of the liver, the blood pool and the muscle SUVmean between DPR groups (from left to right: DPR_full, DPR_1/2 and DPR_1/3) and the reference in the prospective study. SUV, standardized uptake value.

Figure 6
Comparison of the liver, the blood pool and the muscle COV between DPR groups and the reference. The liver, the blood pool and the muscle COV showed no significant difference between the DPR_1/3 group and the reference. All the other DPR groups show a significantly reduced COV compared to the reference group, indicating an improved image quality. *** p<0.001; ns, no significant difference. COV, coefficient of variation.

Figure 7
Comparison of the lesion SUVmax and TLR between DPR groups and the reference (blue line: an increase; orange line: a decrease). The SUVmax and TLR of all the DPR groups were significantly improved compared to the reference group. *** p<0.001. SUV, standardized uptake value. TLR, target-to-liver ratio.

Figure 9
Visual assessment of the image contrast, noise and diagnostic confidence between DPR groups and the reference. All DPR groups show a significantly higher score compared to the reference group. *** p<0.001.