A comparison of techniques for PET data-driven tracking of head motion

Background / Aims: Patient motion during positron emission tomography (PET) imaging can corrupt the image by causing blurring and quantitation error due to misalignment with the attenuation correction image. Data-driven techniques for tracking motion in PET imaging allow for retrospective motion correction, where motion may not have been prospectively anticipated. Methods: A two-minute PET acquisition of a Hoffman phantom was performed on a Biograph mCT Flow, during which the phantom was rocked, simulating periodic motion with varying frequency. Motion was tracked using the sensitivity method, the axial centre-of-mass (COM) method, a novel 3D-COM method, and the principal component analysis (PCA) method. A separate two-minute acquisition with no motion was performed as a gold standard. The tracking signal was discretised into 10 gates using k-means clustering. Motion was modelled and corrected using the reconstruct-transform-add (RTA) technique, leveraging Multimodal Image Registration using Block-matching and Robust Regression (Mirorr) for rigid registration of non-attenuation-corrected 4D PET and Software for Tomographic Image Reconstruction (STIR) for PET reconstructions. Evaluation was performed by segmenting white matter (WM) and grey matter (GM) in the attenuation correction computed tomography (CT). The mean uptake in the GM region was compared with that in the WM region. Additionally, the difference between the intensity distributions of the WM and GM regions was measured with the t-statistic from a Welch’s t-test. Results: The difference in mean between the WM and GM distributions ranked the techniques in order of efficacy: no correction, sensitivity, axial-COM, 3D-COM, PCA, no motion. PCA correction had a greater WM/GM separation, as measured by the t-value, than the no motion scan. This was attributed to interpolation blurring during motion correction reducing class variance.
Conclusion: Of the techniques examined, PCA was found to be most effective for tracking rigid motion. The sensitivity and axial-COM techniques are mostly sensitive to axial motion, and so were ineffective in this phantom experiment. 3D-COM demonstrates improved transaxial motion sensitivity, but not to the level of effectiveness of PCA.


Introduction
Patient motion during neurological positron emission tomography (PET) corrupts images, causing blurring and quantitation error due to misalignment with an attenuation correction image. These effects result in poorer image quality, which may affect patient management in clinical practice, or reduce effect size in research. For these reasons, motion correction has been employed to reduce the effect of patient motion.
In neurological imaging, it can be difficult to predict before a scan whether motion will occur.
Many subjects can remain still and motion correction is only required in a subset of scans. Motion correction techniques can require complex hardware to be set up before a scan, which can be impractical in clinical practice due to the time taken to set up the apparatus, especially for centres where the PET scanner is in high demand. To address this, some authors have proposed a class of techniques which can track motion from raw PET data, without requiring additional hardware.
These techniques are called data-driven motion tracking techniques.
Various techniques for data-driven PET motion tracking have been proposed, and are described in Section 2. In this work, a Hoffman phantom acquisition, described in Section 3.1, is used to compare the subset of these techniques that can be employed for tracking head motion. The applicable techniques used in this investigation are described in more detail in Section 3.2. Some of the other techniques that are not applicable in this investigation are addressed briefly in the discussion. The extracted motion signals are used to perform motion correction of a phantom acquisition, as described in Section 3.3, and the results are objectively ranked by comparing white matter (WM) and grey matter (GM) voxel distributions, as described in Section 3.4. Similar previous works either are not comprehensive in evaluating the applicable techniques for head motion [1,2], or are focused on thoracic or abdominal motion [3]. The authors believe this work is the first to focus on this niche in a phantom experiment.

Background
Various methods for data-driven motion correction have been developed, and those applicable to head motion are investigated in this work. Data-driven techniques analyse the distribution of raw PET data, in order to determine when motion occurs. They produce a motion surrogate, a signal related to motion that can be used for correction.
The first of such techniques proposed was the sensitivity technique [4,5,6], originally proposed by Visvikis et al. [4]. This method exploits the fact that the sensitivity profile of the scanner is heterogeneous. In a typical cylindrical PET gantry, the sensitivity varies axially, approximately linearly with axial distance from the isocentre of the gantry rings, with the most sensitive region being in the centre of the PET gantry. There are also smaller sensitivity variations transaxially, but the scanner is more sensitive to axial changes than to transaxial changes. These are depicted in Fig. 1. Patient motion causes a shift of the emission distribution with respect to the static sensitivity distribution, resulting in a change in the instantaneous count rate. Hence, the instantaneous count rate can be used as a motion surrogate, and has been applied to respiratory motion, where it tracks the diaphragm. The diaphragm moves in the craniocaudal direction, parallel to the axial direction of the PET scanner where changes in sensitivity are most pronounced and hence where motion is most easily detected. The method can also be applied to head motion. Extensions to this technique that mask out regions without motion to improve signal-to-noise ratio (SNR) have been proposed [7].
In these extensions, spectral techniques are used to identify periodic motion, such as respiratory or cardiac motion, and count rates are only calculated in the regions that have not been masked out. These techniques are not investigated in this work, as they are not applicable to head motion. The axial centre-of-mass (COM) technique [8] is calculated similarly to the previous technique, but instead of measuring count rates, the central tendency of the distribution is measured. Specifically, it measures the axial COM of raw data sinograms. In tomography, the axial coordinate in sinogram and image space is identical; therefore, the COM in raw data space corresponds to the axial COM of the emission image. This motion surrogate therefore measures the mean z position of the distribution, yielding an interpretable value with units of distance. As with the sensitivity techniques, spectral analysis masking extensions are available [9], but are not investigated in this work as head motion is the focus.
Ren et al. [10] introduced a three-dimensional extension to the COM technique called centroid-of-distribution (COD) [11]. Each detected event has an associated emission location probability distribution that corresponds to the probability an emission at a given location caused the event.
When time of flight (ToF) information is included, the distribution can be reduced to a relatively tight distribution, with the mode of the distribution representing the most likely emission location.
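The count-rate surrogate described above can be illustrated with a minimal sketch. It assumes the list mode data have already been reduced to an array of event timestamps in seconds; the function name, the 0.5 s bin width, and the omission of randoms, scatter and decay handling are assumptions made for illustration, not details taken from this work.

```python
import numpy as np

def sensitivity_surrogate(event_times_s, duration_s, bin_width_s=0.5):
    """Return the count rate (counts per second) in consecutive time bins.

    event_times_s: 1D array of event timestamps in seconds (hypothetical input).
    duration_s: total acquisition length in seconds.
    bin_width_s: temporal sampling of the surrogate (assumed value).
    """
    n_bins = int(np.ceil(duration_s / bin_width_s))
    edges = np.arange(n_bins + 1) * bin_width_s
    counts, _ = np.histogram(event_times_s, bins=edges)
    # A shift of the emission distribution relative to the scanner's
    # sensitivity profile appears as a change in this count rate.
    return counts / bin_width_s
```

Shorter bins give finer temporal resolution but a noisier signal, so the bin width is a trade-off that would need tuning for a given count rate.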
In the COD and related [1,12] surrogate extraction techniques, the modal emission location for each event is recorded, and the centroid of these in a given time period represents the centre of the distribution being tracked. This exact technique was not examined here, because it relies upon ToF information to work, which was not the focus of this investigation. Instead, an alternative 3D extension of the axial COM technique, proposed in Section 3.2, was examined.
The final method investigated is the principal component analysis (PCA) technique [13,2], which uses principal component analysis to define a linear model of the modes of variance in the raw data histogram intensities. A truncated PCA transformation [14] yields a dimensionality-reduced signal, each component of which measures an independent change in the estimated true count distribution over time. The primary modes of variation encode gross structured changes. Assuming such structured (i.e., non-noise) temporal changes do not arise from factors such as tracer kinetics, the output of a PCA transform will correspond to patient pose, from which relative motion can be inferred.
Commonly, this motion surrogate is used to divide the acquisition into a number of approximately motion-free segments, either as a series of frames, or periodic gates. Motion correction can then be applied, for example, using the reconstruct-transform-add (RTA) motion correction scheme [15,16]. In this scheme the measured pose transform for each gate is applied to the attenuation coefficient image (µ-map) in order to get an aligned µ-map. Each gate is then independently reconstructed, and the inverse pose transform applied to correct the reconstruction back to the reference pose. Finally, the gates are summed to estimate the total activity with a high SNR. Four data-driven PET motion tracking techniques were investigated for motion surrogate extraction: the sensitivity method [5,6], the axial-COM method [8], a modified 3D-COM method, and the PCA method [13,2].
The second method investigated is the axial-COM surrogate, $\tau_{\mathrm{COM}_z}$. The equation used to determine the motion surrogate signal is a weighted mean of counts, weighted by the axial position, and is given in Eq. (2).

$$\tau_{\mathrm{COM}_z}(t) = \frac{\sum_{e,a,z,s \in E,A,Z,S} z \, T_{e,a,z,s,t}}{\sum_{e,a,z,s \in E,A,Z,S} T_{e,a,z,s,t}} \tag{2}$$

In the third method investigated, the axial-COM surrogate is extended to three dimensions, as an alternative to the COD technique. To achieve this, an approximately parallel subset of the transaxial lines of response (LORs) is considered. The COM of the vertical lines gives the mean x location of the distribution, and the COM of the horizontal lines gives the mean y location of the distribution, as depicted in Fig. 2. This technique avoids ascribing a position to each count in the direction parallel to the LOR, which has the most uncertainty without ToF information.
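The weighted mean in Eq. (2), and its extension to the transaxial directions, can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used here: the array layout `T[a, z, s, t]` (view, axial position, radial position, time) drops the `e` index for brevity, and the function names, the choice of which view indices count as "vertical" or "horizontal", and the sign conventions are all hypothetical.

```python
import numpy as np

def com_along(T, positions, axis):
    """Count-weighted mean position along one sinogram axis, per time point.

    T: array of counts whose LAST axis is time; `axis` must not be the time axis.
    positions: physical coordinate of each index along `axis`.
    """
    T = np.moveaxis(np.asarray(T, dtype=float), axis, 0)
    # Sum counts over every axis except the chosen one and time: shape (P, t).
    per_position = T.reshape(T.shape[0], -1, T.shape[-1]).sum(axis=1)
    return positions @ per_position / per_position.sum(axis=0)

def com_3d(T, z_positions, s_positions, vertical_views, horizontal_views):
    """Sketch of a 3D-COM surrogate for T laid out as T[a, z, s, t] (assumed).

    vertical_views / horizontal_views: indices of the approximately parallel
    views whose radial coordinate maps onto x and y respectively (assumed).
    """
    z = com_along(T, z_positions, axis=1)                    # axial COM, Eq. (2)
    x = com_along(T[vertical_views], s_positions, axis=2)    # vertical LORs -> x
    y = com_along(T[horizontal_views], s_positions, axis=2)  # horizontal LORs -> y
    return np.stack([x, y, z], axis=-1)                      # shape (t, 3)
```

Because only a subset of views contributes to x and y, those two components use fewer counts than the axial component and can be expected to be noisier.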
The PCA surrogate, $\tau_{\mathrm{PCA}}$, was calculated similarly to that proposed by Thielemans et al. [13].
An activity mask is estimated by median filtering and thresholding at 10% of the 95th percentile activity of the static sinogram. The mask was applied to the estimated true count series, $T$, which was subsequently normalised to have equal counts at each time point (obviating the need for decay correction), then variance stabilised with a Freeman-Tukey transform [17], yielding $T_{\mathrm{norm}}$. A PCA was conducted on $T_{\mathrm{norm}}$ to determine the PCA weights, $W$, and the surrogate is given by the transform to k components, given in Eq. (4). In this work, 2 components were found to be sufficient to capture motion, after observing the signals, yielding a 2-dimensional PCA surrogate.
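The preprocessing and projection steps above can be sketched in a few lines. This is a simplified illustration, not the pipeline used in this work: the sinogram is assumed to be flattened to shape `(n_bins, n_times)`, the median filtering step is omitted to keep the example dependency-free, the counts are rescaled to the mean total (one possible way of equalising counts), the Freeman-Tukey transform is taken as `sqrt(x) + sqrt(x + 1)`, and the PCA is computed through a singular value decomposition. The function and parameter names are hypothetical.

```python
import numpy as np

def pca_surrogate(T, k=2, mask_fraction=0.10):
    """Sketch of a PCA motion surrogate.

    T: non-negative counts, shape (n_bins, n_times) (assumed layout).
    Returns an array of shape (n_times, k).
    """
    T = np.asarray(T, dtype=float)

    # Activity mask: threshold the static (time-summed) sinogram at a
    # fraction of its 95th percentile. Median filtering is omitted here.
    static = T.sum(axis=1)
    mask = static > mask_fraction * np.percentile(static, 95)
    X = T[mask]

    # Equalise total counts at every time point.
    totals = X.sum(axis=0)
    X = X * (totals.mean() / totals)

    # Freeman-Tukey variance-stabilising transform.
    X = np.sqrt(X) + np.sqrt(X + 1.0)

    # Centre each bin over time, then PCA via SVD of the (time x bin) matrix.
    X = (X - X.mean(axis=1, keepdims=True)).T
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    W = Vt[:k].T          # PCA weights, shape (n_masked_bins, k)
    return X @ W          # surrogate, shape (n_times, k)
```

The sign of each principal component is arbitrary, so a component may come out inverted relative to the physical motion; this does not matter when the signal is only used for clustering into gates.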

Motion correction
For each surrogate extraction method, the motion surrogate is generated using the PET list mode data, and k-means clustering with 10 groups is used to discretise the surrogate into a gating signal.
Although k-means is an unconventional motion surrogate discretisation technique, it allows for clustering of multidimensional signals (PCA and 3D-COM) in an identical manner to single dimensional signals (sensitivity, axial-COM). A motion model is formed by binning the list mode data into 10 separate sinograms, each containing data from one of the 10 gates from the gating signal, generating a non-attenuation-corrected (NAC) PET reconstruction for each, and finally registering each to a chosen reference gate using an iterative registration scheme. Reconstructions were performed using Software for Tomographic Image Reconstruction (STIR).
The iterative registration scheme, depicted in Fig. 3, ensured high quality registrations, even when some gates may have low counts contributing to the reconstructions and hence have a low SNR. A sum over all frames increases the SNR, at the cost of some additional blur due to motion, and is registered to the µ-map to ensure the accuracy of subsequent attenuation correction, and to prevent the registrations from drifting through iterations. Each individual frame is then registered to the aligned, summed image. Registering frames to this image is more accurate than registering directly to the µ-map, as the mono-modality registration is simpler to optimise. These steps are iterated three times, so that the sum image is less blurred due to motion at each step. The Multimodal Image Registration using Block-matching and Robust Regression (Mirorr) [18] software was used for rigid registrations, with the normalised correlation metric, omitting the highest resolution step of the resolution pyramid. Motion correction was subsequently performed using the RTA scheme. In this scheme, PET data from each gate or frame is reconstructed separately without attenuation correction. These reconstructions are registered with a reference to determine pose parameters at each gate. Next, the measured pose transform for each gate is applied to the µ-map in order to get an aligned µ-map.
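The discretisation of a surrogate into gates can be illustrated with a plain Lloyd's-algorithm k-means written in NumPy. This is a sketch only: the initialisation (random selection of samples with a fixed seed), the iteration limit, and the function name are assumptions, and a library implementation could equally be used. The same function handles one-dimensional surrogates (sensitivity, axial-COM) and multidimensional ones (3D-COM, PCA).

```python
import numpy as np

def kmeans_gates(surrogate, n_gates=10, n_iter=100, seed=0):
    """Assign each time sample of a surrogate to one of n_gates clusters.

    surrogate: shape (n_times,) or (n_times, n_dims).
    Returns (labels, centres); labels[i] is the gate index of sample i.
    """
    X = np.asarray(surrogate, dtype=float)
    if X.ndim == 1:
        X = X[:, None]                      # treat 1D signals as (n_times, 1)

    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=n_gates, replace=False)]

    def assign(c):
        # Squared Euclidean distance from every sample to every centre.
        d2 = ((X[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centres)
        # Move each centre to the mean of its members; keep empty clusters fixed.
        updated = np.array([
            X[labels == g].mean(axis=0) if np.any(labels == g) else centres[g]
            for g in range(n_gates)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated

    return assign(centres), centres
```

List mode events would then be binned into one sinogram per gate according to the label of the time sample in which they were detected.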
Each gate is then independently reconstructed, and the inverse pose transform applied to correct the reconstruction back to the reference pose. Finally, the gates are summed to estimate the total activity with a high SNR.

Analysis
A novel technique is proposed to objectively and quantitatively analyse motion correction in the image. The CT was segmented by intensity into WM and GM classes. The segmentation is used to mask GM and WM PET voxels and generate a distribution of PET intensity in each tissue. Blurring due to uncorrected motion causes higher PET activity in the GM to spill into the WM, causing the two distributions to converge. Therefore, the difference between the distributions decreases as motion blur increases. The difference between the two distributions is measured with the difference in means, and with the t-statistic from a Welch's t-test. Each measure provides an objective and quantitative surrogate measure of motion corruption.
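The two separation measures can be written out explicitly. The sketch below assumes the PET image and the GM and WM masks are NumPy arrays on the same voxel grid; the function name and argument names are hypothetical. Welch's t-statistic is the difference in means divided by the square root of the sum of each sample variance over its sample size, using the unbiased (n - 1) variance.

```python
import numpy as np

def wm_gm_separation(pet, gm_mask, wm_mask):
    """Return (difference in means, Welch's t-statistic) for GM versus WM.

    pet: PET image array; gm_mask, wm_mask: boolean arrays of the same shape.
    """
    gm = np.asarray(pet, dtype=float)[np.asarray(gm_mask, dtype=bool)]
    wm = np.asarray(pet, dtype=float)[np.asarray(wm_mask, dtype=bool)]

    diff = gm.mean() - wm.mean()
    # Welch's test does not assume equal variances in the two groups.
    standard_error = np.sqrt(gm.var(ddof=1) / gm.size + wm.var(ddof=1) / wm.size)
    return diff, diff / standard_error
```

Because the t-statistic divides by the within-class spread, any smoothing that lowers the variance of each class raises the t-value even when the means move closer together, which is why the two measures can rank reconstructions differently.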

Results
The extracted motion surrogates are depicted in Fig. 4. Subjective estimation of SNR of the raw motion surrogates (Fig. 4) agreed with the subjective analysis of motion correction quality (Fig. 5). The sensitivity technique provided only a slight improvement compared to performing no correction. Although the motion surrogate for axial-COM appeared better than the sensitivity surrogate, the improvements seen in the reconstructed image are limited (Fig. 4). The 3D-COM technique exhibits less blurring than axial-COM. The 3D-COM surrogates (Fig. 4) show that most motion occurred in the x direction, and only a small change was measured in the z direction. This explains the lower performance of the sensitivity and axial-COM techniques, which are mostly sensitive to axial motion. Finally, the reconstructions using the PCA technique appear similar to a smoothed version of the no motion reconstruction. As expected, reconstructing and applying no motion correction resulted in the lowest subjective image quality, and acquiring data with no motion present resulted in the best subjective image quality (Fig. 5).

Objectively, PCA provides superior WM/GM separability
The objective results agreed with the subjective observations discussed previously: the motion correction techniques are ranked by both the t-statistic and the difference of means in the order: no correction, sensitivity, axial-COM, 3D-COM, PCA (Fig. 6). Interestingly, the two distributions from the PCA technique are more separable under a t-test than those from the no motion acquisition, although the distribution means are further apart in the no motion acquisition.
Motion correction was observed to have an effect similar to a local spatial filter, smoothing the image and thereby reducing the variance of the distribution (Fig. 5). This result does not imply that motion corrected reconstructions are better than equivalent motion-free reconstructions.
It may imply that the post-filtering applied in the no-motion case was lower than optimal; however, reconstruction parameters were held constant for consistency.
PCA, therefore, was determined to be most effective in this experiment at tracking rigid motion of a phantom. This technique was originally designed for periodic motion, such as respiratory motion [13]. Although it has been verified to be applicable to rigid motion previously [2], the technique has not been directly compared with alternatives for tracking rigid motion.

Limitations and future opportunities
The 3D-COM extension of the axial-COM technique is similar in concept to the COD technique [10,11]. When ToF measurement is available during the PET acquisition, the COD technique is likely to supply superior SNR. This is because the ToF information that is considered by COD but not 3D-COM provides superior localisation, and also because the 3D-COM technique only considers a subset of available data. However, when ToF measurement is not available, as is the case on the Siemens Biograph mMR, then the 3D-COM technique provides a useful approximation.
This investigation compared techniques for tracking motion for a simple translation of a phantom. This represents a simple approximation to a real neurological scan. Several variables that would be present in a human subject scan are not present, such as tracer dynamics. Additionally, in most neurological scans, a portion of the neck is in the field of view (FOV), meaning that some of the activity distribution moves with the head, and some does not. Such factors are likely to disrupt the motion surrogate extraction techniques to varying degrees. A future investigation should validate the comparison using clinical data.
No literature exists on optimising the parameters for the PCA technique. Parameters used in this investigation, such as the PCA preprocessing, the number of clusters extracted, and the number of motion gates during k-means clustering, were chosen to provide a good base-level performance, but are not optimal. Future work could investigate the optimisation of these parameters, and suggest good default choices for users.

Conclusion
PCA was demonstrated to be most effective for tracking rigid motion in this phantom experiment.
The sensitivity and axial-COM techniques are mostly sensitive to axial motion, and so were ineffective in this phantom experiment. 3D-COM demonstrates improved transaxial motion sensitivity compared to axial-COM, but not to the level of effectiveness of PCA.
Conflicts of interest There are no conflicts of interest to report.
Availability of data and material The data and materials are not available.
Code availability The code is not available.
Authors' contributions AGG, JS and ND conceived and ran the experiment and contributed to analysis and writing of the manuscript. SR advised on the experiment and contributed to analysis and writing of the manuscript. JAD contributed to analysis and writing of the manuscript.
Ethics approval Not applicable, no human data was used in this study.
Consent to participate Not applicable, no human data was used in this study.
Consent for publication Not applicable, no human data was used in this study.