Ultralow-dose Pediatric Total-body PET/CT Imaging Using an Artificial Intelligence Technique

Purpose Young children are more sensitive to radiation than adults, and the effective dose they absorb can be four times higher than that of adults, which increases the risk of secondary injury. Here, we propose for the first time the use of artificial intelligence techniques combined with low-dose CT prior information to improve image quality in ultralow-dose total-body PET/CT scans. Methods A total of 44 pediatric patients (weight range: 8·5–50·1 kg; ages 1–12 years) who underwent total-body PET/CT at the Sun Yat-sen University Cancer Center were retrospectively enrolled. 18F-FDG was administered at a dose of 3·7 MBq/kg, and the acquisition time was 600 s. The low-dose PET images were simulated by truncating the list-mode data to reduce the count density. The neural network uses a residual network as its basic structure and fuses low-dose CT images into the network at different scales as prior information. The image quality was assessed by subjective and objective analyses. Bland-Altman plots were used to assess the agreement of regional SUV ratios between the image types. Statistical analysis was carried out to assess the differences in the image quality metrics and reader agreement. The use of artificial intelligence techniques can significantly improve PET image quality. When combined with prior CT information, the anatomical information of the images was better recovered, and the 15-second acquisition yielded image quality equivalent to that of the 10-minute acquisition. Equivalently, this result can guide a reduction in the injected tracer dose, which is very important for dose-sensitive pediatric patients.

Keywords: total-body; ultralow-dose


Introduction
Positron emission tomography (PET) combined with computed tomography (CT), referred to as PET/CT, is an indispensable tool for diagnosing malignancy in hospital radiology departments (1-4). A malignant tumor is often not a localized but a systemic disease (5-7). Therefore, PET/CT is generally performed as a whole-body scan, which can reveal not only the primary lesion but also metastatic lesions in soft tissue organs and bones throughout the body. This is very helpful for staging the tumor and determining the extent of metabolically active lesions, and it provides accurate information on the site for puncture or tissue biopsy. It also provides more reasonable and accurate positioning for radiation therapy (especially precision radiotherapy) and reduces the side effects of treatment (4). PET uses radioactive tracers, special cameras and computers to image the tracer distribution and evaluate organ and tissue functions.
Typically, the administered tracer activity and the event signal acquisition time are positively correlated with the imaging quality (8, 9). However, higher tracer activity increases the risk of secondary cancer in patients, and longer acquisition times may introduce voluntary or involuntary patient motion artifacts into the images. Therefore, the administered radiotracer activity and the duration of data collection are often restricted by radiation safety and tolerability (3, 10-12). Compared with adults, children are more sensitive to ionizing radiation, and the effective dose absorbed can reach 4-5 times that of adults (10). Pediatric patients who are exposed to radiation at an early age have a higher risk of developing malignant tumors because their bodies are still developing and their life expectancy is longer (11, 13, 14). In addition, children's limited ability to remain still is also a problem. The PET data acquisition process is relatively time-consuming compared to CT. Movement of the child during the event signal acquisition causes image artifacts, can blur the image, and makes diagnosis difficult (15). Therefore, in pediatric nuclear medicine, it is important to minimize both the administered dose of the radiotracer and the acquisition time of the event signal.
The typical PET axial field-of-view (FOV) is 20 cm, so a whole-body PET scan requires data to be collected at multiple locations. At each location, over 85% of the body lies outside the scanner's FOV, and signals from these areas of the body cannot be collected. This yields less than 1% sensitivity to the signal and makes it very difficult to achieve an overall dose reduction (16, 17). An ultra-long axial FOV is regarded as a new generation of PET technology that can greatly improve signal sensitivity (17). Recently, a long-axial-FOV PET scanner was introduced (3, 18, 19). This PET scanner, called uEXPLORER (United Imaging Healthcare, Shanghai, China), has an axial FOV of 194 cm and can record coincident photons from the entire body simultaneously, hence the name total-body PET imaging. It increases the effective sensitivity by a factor of approximately 40 with respect to the 20 cm axial FOV, thus allowing conventional image quality to be achieved using lower injected doses and shorter acquisition times (11, 16, 20). Based on this platform, artificial intelligence techniques are used to explore the low-dose limit of rapid scanning achievable by current PET devices, which is of clinical importance, especially for pediatric PET diagnosis.
In this retrospective study of pediatric data, we aim to use artificial intelligence (AI) techniques, specifically convolutional neural networks (CNNs), to perform ultralow-dose image recovery on a total-body PET/CT system that already has the advantage of using a low dose. CNNs based on multimodal data fusion have been shown to combine the advantages of the data of each modality, which can effectively and significantly reduce the dose (21-24). We use the accompanying CT images as prior knowledge to enhance the anatomical information of the images. The synthesis network we adopted takes the residual module as its main framework, introduces the high-dimensional information of the prior CT at different scales, uses a perceptual loss to ensure the effective restoration of structure, and uses a cosine annealing training strategy to speed up the training process (25-27). The experimental results show that the CNN can effectively synthesize images from the ultralow-dose images at a level suitable for clinical diagnosis, and that introducing CT prior image information improves the performance of the network model in anatomical structure recovery.

Materials And Methods
The data of this retrospective study came from the Sun Yat-sen University Cancer Center. The study was approved by the institutional review board of this center, and informed consent was obtained from all of the patients' legal guardians.

Data Acquisition
A total of 44 pediatric patients who underwent total-body PET/CT using the uEXPLORER scanner at the Sun Yat-sen University Cancer Center from July 2020 to August 2020 were retrospectively enrolled in this study.
A K-fold cross-validation strategy (here K=5) was used to compensate for the lack of training samples, and after training, we selected the last set of 5 patients (4 males, 1 female) as the evaluation set (Table 1). The dose of 18F-FDG was approximately 3·7 MBq/kg, and the acquisition time was 600 s. Low-dose total-body CT scans were acquired at a tube current of 10 mA and a voltage of 100 kV (rotation time 0·5 s, pitch 1·0125, collimation 80 × 0·5 mm) and were reconstructed in a 512 × 512 matrix for attenuation correction. Low-dose PET images (0·037-0·925 MBq/kg) were simulated by truncating the list-mode data (equivalent to shortening the event signal sampling time) to reduce the count density.
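The relationship between a truncated acquisition and its equivalent injected dose can be illustrated with a short sketch. It assumes that the recorded counts scale linearly with both acquisition time and injected activity, and it ignores tracer decay during the scan and count-rate effects. The function names and the event-timestamp representation are hypothetical and are not taken from the study code.

```python
FULL_ACQ_S = 600.0           # full acquisition time stated in the text
FULL_DOSE_MBQ_PER_KG = 3.7   # injected activity per kg stated in the text


def truncate_list_mode(event_times_s, keep_s):
    """Keep only the events recorded in the first keep_s seconds.

    event_times_s is a hypothetical list of event timestamps in seconds.
    """
    return [t for t in event_times_s if t < keep_s]


def equivalent_dose(keep_s, full_acq_s=FULL_ACQ_S, full_dose=FULL_DOSE_MBQ_PER_KG):
    """Dose (MBq/kg) that would give the same counts in a full-length scan.

    Assumption: counts are proportional to time and to activity.
    """
    return full_dose * keep_s / full_acq_s


if __name__ == "__main__":
    # Durations mentioned in the text (6 s to 150 s).
    for keep_s in (6, 15, 30, 60, 150):
        print(f"{keep_s:>3d} s -> {equivalent_dose(keep_s):.4f} MBq/kg")
```

Under these assumptions, the 6 s and 150 s truncations correspond to 0·037 and 0·925 MBq/kg, the two ends of the low-dose range quoted above.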

Image Preprocessing
The CT images were registered to the PET images using MATLAB (MathWorks, Natick, MA) software after considering the position offset of the patient in different acquisition modes. All images were resampled to the voxel dimensions of the acquired PET volumes. The PET image pixel values were converted to SUV, and the CT image pixel values were approximated as the absorption coefficients of water for X-rays at 60 keV. A circular mask was created and applied to the PET and registered CT images based on the maximum torso diameter of the 44 patients and the World Health Organization (WHO) Child Growth Standards (28). The processed PET images and CT images were used as input to the CNN.
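Two of the preprocessing steps above, SUV conversion and circular masking, can be sketched as follows. The sketch uses the common body-weight SUV definition (tissue concentration divided by injected activity per gram of body mass, taking 1 g of tissue as 1 mL) and omits decay correction; whether the study used exactly this variant is an assumption. The function names, the image representation as nested lists, and the pixel radius are hypothetical.

```python
def to_suv(activity_bq_per_ml, injected_bq, weight_kg):
    """Body-weight SUV for one voxel (decay correction omitted in this sketch)."""
    return activity_bq_per_ml * (weight_kg * 1000.0) / injected_bq


def circular_mask(size, radius_px):
    """size x size grid of booleans, True inside a circle centred on the image."""
    c = (size - 1) / 2.0
    return [[(x - c) ** 2 + (y - c) ** 2 <= radius_px ** 2 for x in range(size)]
            for y in range(size)]


def apply_mask(image, mask, fill=0.0):
    """Replace every pixel outside the mask with the fill value."""
    return [[v if m else fill for v, m in zip(row, mask_row)]
            for row, mask_row in zip(image, mask)]


if __name__ == "__main__":
    # A 20 kg child injected with 74 MBq (3.7 MBq/kg): 3700 Bq/mL maps to SUV 1.
    print(to_suv(3700.0, 74e6, 20.0))
    mask = circular_mask(5, 2)
    image = [[1.0] * 5 for _ in range(5)]
    for row in apply_mask(image, mask):
        print(row)
```

In practice the mask radius would be derived from the physical torso diameter and the pixel spacing, neither of which is fixed by this sketch.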

CNN Implementation
The proposed neural network for low-dose PET image synthesis is shown in Figure 1. The inputs to the network were the low-dose PET images and low-dose CT images. The full-dose PET image was treated as the ground truth. To enhance the network's ability to recover anatomical structures and texture details, the loss function combined the L2 norm with a perceptual loss (30). The network was constructed using the PyTorch deep learning framework and was optimized using the Adam optimizer with a cosine annealing strategy to speed up convergence (31, 32). We open sourced the network code so that readers can obtain detailed information on parameters such as the learning rate and batch size from the code (33).
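As an illustration of the two training ingredients named above, the following framework-free sketch shows a cosine annealing learning-rate schedule in its standard closed form and a loss that adds a pixel-wise L2 term to an L2 term computed on feature maps. The weighting factor, the toy feature extractor, and all function names are hypothetical; the actual hyperparameters are in the authors' released code, not here.

```python
import math


def cosine_annealing_lr(step, total_steps, lr_max, lr_min=0.0):
    """Learning rate that decays from lr_max to lr_min along a half cosine."""
    cos_term = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cos_term)


def mse(a, b):
    """Mean squared error between two equally long flat sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(pred, target, features, weight=0.1):
    """Pixel-wise L2 term plus a perceptual term on feature maps.

    `features` stands in for a fixed, pretrained feature extractor;
    `weight` is a hypothetical balance factor, not a value from the paper.
    """
    return mse(pred, target) + weight * mse(features(pred), features(target))


def toy_features(values):
    """Placeholder feature extractor used only to make the sketch runnable."""
    return [2.0 * v for v in values]


if __name__ == "__main__":
    for step in (0, 25, 50, 75, 100):
        print(step, cosine_annealing_lr(step, 100, 1e-3))
    print(combined_loss([1.0, 2.0], [1.0, 4.0], toy_features))
```

The schedule starts at the maximum learning rate and decays smoothly toward the minimum over the chosen number of steps.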

Quantitative Imaging Analysis
The objective image quality evaluation was performed by an experienced technician under the supervision of a radiologist. The images generated by the neural network were first visually inspected for artifacts. For each axial section, the synthetic PET images and the original low-dose PET images were compared to the full-dose images using the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM). Two-dimensional circular regions of interest (ROIs) with a diameter of 2 cm were drawn over a homogeneous area of the liver parenchyma, taking care to avoid blood vessels and tumors, to record semiquantitative uptake measurements of the liver, such as the SUVmax and SUVmean. The shortest diameter of the 18F-FDG-avid suspected lesion (not necessarily malignant) was identified, and the ROIs were drawn for this lesion on the section with the largest lesion diameter to record semiquantitative uptake measurements of the lesion location, including the SUVmax and SUVmean.
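The metrics named above can be sketched in plain Python. PSNR follows its usual definition. The SSIM shown here is a simplified single-window (global) form, whereas common implementations average the index over local windows, so its values will differ from those of a windowed SSIM. The peak value taken from the reference image, the ROI radius expressed in pixels, and the function names are assumptions made for illustration.

```python
import math


def _flatten(img):
    return [v for row in img for v in row]


def psnr(ref, test, peak=None):
    """Peak signal-to-noise ratio in dB; peak defaults to the reference maximum."""
    r, t = _flatten(ref), _flatten(test)
    peak = max(r) if peak is None else peak
    mse = sum((a - b) ** 2 for a, b in zip(r, t)) / len(r)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def global_ssim(ref, test, peak=None, k1=0.01, k2=0.03):
    """SSIM computed once over the whole image (no sliding window)."""
    r, t = _flatten(ref), _flatten(test)
    peak = max(r) if peak is None else peak
    n = len(r)
    mu_r, mu_t = sum(r) / n, sum(t) / n
    var_r = sum((a - mu_r) ** 2 for a in r) / n
    var_t = sum((b - mu_t) ** 2 for b in t) / n
    cov = sum((a - mu_r) * (b - mu_t) for a, b in zip(r, t)) / n
    c1, c2 = (k1 * peak) ** 2, (k2 * peak) ** 2
    num = (2 * mu_r * mu_t + c1) * (2 * cov + c2)
    den = (mu_r ** 2 + mu_t ** 2 + c1) * (var_r + var_t + c2)
    return num / den


def roi_stats(img, cx, cy, radius_px):
    """Return (SUVmax, SUVmean) inside a circular ROI centred at (cx, cy)."""
    vals = [v for y, row in enumerate(img) for x, v in enumerate(row)
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius_px ** 2]
    return max(vals), sum(vals) / len(vals)


if __name__ == "__main__":
    ref = [[0.0, 1.0], [1.0, 0.0]]
    noisy = [[0.0, 1.0], [1.0, 1.0]]
    print(psnr(ref, noisy), global_ssim(ref, noisy))
    print(roi_stats([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 1, 1, 1))
```

A 2 cm ROI would be converted to a pixel radius using the reconstructed pixel spacing, which this sketch leaves to the caller.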

Qualitative Imaging Assessment
A subjective assessment of the PET image quality was performed independently by two nuclear radiologists (a senior radiologist with >10 years of experience and a radiologist with >5 years of experience) using a 5-point Likert scale. The synthesized PET images, the low-dose PET images, and the full-dose PET images of each data set were anonymized. The 5-point Likert scale was used for (1) the overall impression of image quality, (2) the significance of major suspicious malignant lesions, (3) the significance of organ anatomy and (4) the image noise.

Image Quality
The objective measurements of the image quality are shown in Supplementary Table S1.
The semiquantitative uptake measurements obtained from the uniform areas of the liver parenchyma, SUVmax and SUVmean, are presented in Table 2. The standard deviation of the SUVmean and SUVmax values of the low-dose PET images decreased with increasing dose, whereas that of the images synthesized by artificial intelligence changed little, was more stable, and reached the standard of the reference images even in the lowest-dose case. To further compare the differences between the two artificial intelligence methods, a Bland-Altman plot was drawn (Figure 5), which showed that the AI-based synthesis method combined with the prior CT images had the lowest bias and almost the lowest variance among all of the dose groups compared to the reference standard full-dose images. (G60s-G150s, p>0·05). The artificial intelligence method combined with the prior CT information had an advantage at lower doses.
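The bias and spread summarized by a Bland-Altman analysis can be computed as in the following sketch. It uses the conventional definition, in which the bias is the mean of the paired differences and the 95% limits are the bias plus or minus 1·96 standard deviations of the differences. Whether the study computed its intervals in exactly this way is an assumption, and the function name and example values are hypothetical.

```python
import statistics


def bland_altman(method_vals, reference_vals):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    bias is the mean difference (method minus reference); the limits are
    bias -/+ 1.96 times the sample standard deviation of the differences.
    """
    diffs = [m - r for m, r in zip(method_vals, reference_vals)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


if __name__ == "__main__":
    # Made-up SUV values, for illustration only.
    synthesized = [2.0, 4.0, 6.0]
    full_dose = [1.0, 2.0, 3.0]
    print(bland_altman(synthesized, full_dose))
```

A method with a bias near zero and narrow limits agrees closely with the full-dose reference.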

Clinical Readings
The subjective image quality scores of the different methods were compared in the different dose groups, where the mean ± SD of the Likert scores for the overall image metrics are presented in Table 4. The radar plot of all perspectives (significance of major suspicious malignant lesions, significance of organ anatomy and image noise) is shown in Figure 6. Synthetic PET with the prior CT information method had an excellent performance for all perspectives in the different dose groups.
The interrater agreement of the Likert scores is important when performing subjective studies of the images; the agreement results are given in Table 5.
For the G6s dose group, the kappa coefficient for the low-dose PET was 0·5045, indicating a moderate degree of agreement between the two radiologists. Additionally, the kappa coefficient for synthesized PET was 0·7713, indicating a high degree of agreement between the two radiologists. For the other cases, the agreement between the two radiologists was almost perfect (0·8383-1·0).
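For readers unfamiliar with the kappa coefficient, the following sketch computes the unweighted Cohen's kappa for two raters. The text does not state whether an unweighted or weighted kappa was used, so the unweighted form is an assumption, and the function name and example scores are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equally long lists of category labels."""
    n = len(rater_a)
    # Observed agreement: fraction of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal score frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Made-up Likert scores, for illustration only.
    senior = [5, 4, 4, 3, 5, 2]
    junior = [5, 4, 3, 3, 5, 2]
    print(cohen_kappa(senior, junior))
```

Kappa corrects the raw agreement rate for the agreement expected by chance, so it equals 1 for identical ratings and 0 when agreement is no better than chance.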
The results of Tukey's honestly significant difference test comparing the subjective image quality scores between the groups are shown in Supplementary Tables S2-S3. Significant differences were found between the scores of the images synthesized by the artificial intelligence methods and the original low-dose image scores for the different dose groups in (1) the overall impression of image quality, (2) the significance of major suspicious malignant lesions, (3) the significance of organ anatomy and (4) the image noise. There were almost no significant differences between the two AI methods (Supplementary Table S2).
There was no significant difference in (4) the image noise between the different dose groups of the same method. For (2) the significance of major suspicious malignant lesions, there were significant differences between the low-dose groups (G6s-G30s, G6s-G60s, G6s-G150s) and no significant differences between the high-dose groups. Significant differences existed for (3) the significance of organ anatomy except for G60s- (Supplementary Table S3). All scores were presented as the mean value ± SD.

Discussion
The ultralong axial FOV (2 m) total-body PET/CT platform allows for high-quality imaging at lower tracer doses and shorter acquisition times and is well suited for pediatric patients who are dose sensitive and have difficulty cooperating with prolonged examinations. Our aim was to further investigate ultralow-dose imaging on the total-body PET/CT platform using artificial intelligence techniques to evaluate the shortest data acquisition time that can be allowed on this platform when standard doses are used (the equivalent lowest injected tracer dose when the standard acquisition lengths are used, or both the shortest data acquisition time and the lowest injected tracer dose).
This proof-of-concept study shows that the use of artificial intelligence techniques can be effective in improving the quality of low-dose images. Compared to the low-dose PET images and the images synthesized by a model that does not use CT images as prior information, the images synthesized by a network model that considers the prior have higher average image quality and lower bias and variance in the regional SUV relative values. This indicates the important value of introducing CT images with rich anatomical structure information into the imaging model.
The quantitative results, such as the data shown in Figures 2-5 and Tables 2-3, show that the overall image quality becomes progressively higher as the acquisition time increases and that, for the same duration, the images synthesized by the model using CT images as a priori information have quantitative values (SSIM, PSNR) that are closest to those of the full-dose reference images. Table 2 and Figures 4-5 also show that the SUV values are more stable, which helps to ensure accuracy when performing a PET quantitative analysis.
The qualitative results, such as the results shown in Figure 6 and Tables 4-5, show that the images synthesized based on AI techniques performed best in terms of the image noise. In the 15-second data case, the images synthesized by the network model using CT prior information excelled in terms of the anatomical structure, which indicates that CT anatomical structure information plays a key role. In the 30-second data case, the AI-based images achieved the standard of the full-dose images in almost all aspects.
There are several limitations to our study. Data from a total of 44 pediatric patients of different ages and genders were used in this retrospective study, and K-fold cross-validation was used to compensate for the lack of training
analysed the data, and reviewed and revised the manuscript. All authors approved the final manuscript as submitted and agree to be accountable for all aspects of the work.

Data Availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Ethics approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards.

Consent to participate
Informed consent was obtained from legal guardians.

Consent for publication
Additional informed consent was obtained from all legal guardians for whom identifying information is included in this article.

Competing Interests

Figure 1
Schematic diagram of the convolutional neural network used in this work.
It includes two modality-specific encoders and one decoder that synthesizes the full-dose PET images. The arrows indicate the flow of computational operations, and the lower left corner of the box is labeled with the number of input and output feature images for this module.

Figure 2
combined with the prior CT information. (p) Reference, full-dose image with the axis view locations marked.
(q) The prior CT image. The synthesized PET images show significantly reduced noise compared to the low-dose PET images, and the images generated from the PET combined with the prior CT information model were superior in reflecting the underlying anatomic patterns compared with the images generated from the PET-only model.

Figure 3
Image quality metrics comparing the images from low-dose PET, the synthesized PET-only model, and the synthesized PET combined with CT prior model.
For all of the metrics, the comparisons were made to the full-dose PET images. The images from the synthesized PET with the CT prior model were superior for all of the metrics, including a higher structural similarity index (SSIM) and a higher peak signal-to-noise ratio (PSNR).

Figure 5
Bland-Altman analysis of SUV differences compared between methods (low-dose PET, synthetic PET, synthetic PET combined with the CT prior) and references across all of the data sets.
The solid and dashed lines denote the mean and 95% confidence interval (CI) of the SUV differences, respectively. The images from the synthesized PET combined with the CT prior model had the lowest bias and almost the smallest 95% CI relative to the reference full-dose images.