2.1 Patients
This retrospective study of PET/CT for differentiating metastatic from benign bone lesions was approved by the Institutional Review Board (201900947B0), which waived the requirement for signed informed consent. For the current analysis, we retrieved studies performed on a single PET/CT scanner (Discovery ST16, GE Health Systems, Milwaukee, WI, USA) at our institution from 2016 to 2018 in patients with cervical cancer. Patients with a history of other malignancy were excluded. Bone lesions from studies performed in 2016 and 2017 formed the training dataset, and those from studies performed in 2018 formed the validation dataset.
2.2 18F-FDG PET/CT Imaging
Patients were instructed to fast for at least 4 hours prior to examination. The scan started about 90 min after intravenous injection of 370 MBq (± 10%) of 18F-FDG. A diluted CT contrast agent (Iothalamate meglumine, Mallinckrodt, Missouri, USA) was administered orally during the tracer uptake period. Patients were scanned in the supine position. After a CT acquisition from the head to the upper thigh, the PET data were acquired in three-dimensional (3D) mode, with an acquisition time of 2.5 minutes per cradle position. The CT data were used for attenuation correction, and the PET images were reconstructed with an iterative ordered subset expectation maximization algorithm (4 iterations, 10 subsets) and a transaxial image matrix size of 128 by 128. The reconstructed voxel size was 5.47 mm by 5.47 mm by 3.27 mm. SUV was defined as the measured tissue concentration of the tracer in a reconstructed voxel (MBq/mL) divided by the injected activity per unit body weight (MBq/g).
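The SUV definition above can be written out as a short calculation. The following Python sketch applies the stated ratio literally; the function and parameter names are illustrative assumptions, and it models nothing beyond the definition given in the text.

```python
def suv(tissue_conc_mbq_per_ml, injected_mbq, body_weight_g):
    """SUV = tissue concentration (MBq/mL) / (injected activity / body weight) (MBq/g).

    Names are hypothetical; the formula follows the definition in the text.
    """
    if injected_mbq <= 0 or body_weight_g <= 0:
        raise ValueError("injected activity and body weight must be positive")
    activity_per_gram = injected_mbq / body_weight_g  # MBq/g
    return tissue_conc_mbq_per_ml / activity_per_gram


# Illustrative values only: 0.02 MBq/mL in a voxel, 370 MBq injected, 60 kg patient.
example_suv = suv(0.02, 370.0, 60000.0)
```

With these illustrative inputs the ratio reduces to 0.02 × 60000 / 370, which makes the unit handling easy to check by hand.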
2.3 Identification of Metastatic Bone Lesions
Clinical records of the patients were reviewed. A patient was confirmed to have bone metastases if there was pathological proof or if an imaging study showed progression of metastatic bone lesions following the PET/CT study. Metastatic bone lesions in the PET/CT studies were identified by a nuclear medicine physician. For each identified lesion, the coordinates of the PET voxel with lesion SUVmax were recorded.
2.4 Selection of Benign Bone Lesions
To match the number of metastatic bone lesions identified, an equal number of benign bone abnormalities were selected from the PET/CT studies of patients without evidence of bone metastasis in the clinical records and follow-up imaging studies. These were bone abnormalities with relatively increased 18F-FDG activity. For each selected benign abnormality, the coordinates of the PET voxel with lesion SUVmax were recorded.
2.5 First-order Metrics and VOI Settings
Five first-order metrics were assessed for lesion classification: lesion SUVmax and the mean (SUVmean), standard deviation (SUVsd), skewness (SUVsk) and kurtosis (SUVku) of voxel SUV in the VOI. Because bone lesions are small and the resolution of reconstructed PET imaging data is limited, we chose not to define the exact border of a bone lesion. Instead, a simple cuboid VOI centered on the recorded voxel with lesion SUVmax was defined. The side length of the VOI was limited to at most 20 mm to avoid including too many background voxels. A VOI of 3 by 3 by 5 voxels (approximately 16.4 mm by 16.4 mm by 16.4 mm) was thus selected to approximate a regular cubic shape.
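The five first-order metrics can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names, the volume[z][y][x] indexing, the use of the sample (n - 1) standard deviation, and the moment-based (non-excess) definitions of skewness and kurtosis are all assumptions made for illustration.

```python
import math


def cuboid_voi(volume, center, half_extent=(1, 1, 2)):
    """Collect voxel SUVs from a cuboid centered on `center`.

    volume is indexed as volume[z][y][x]; center is (x, y, z).
    half_extent (1, 1, 2) yields 3 x 3 x 5 voxels. The VOI is assumed
    to lie fully inside the volume (no edge handling in this sketch).
    """
    cx, cy, cz = center
    hx, hy, hz = half_extent
    return [volume[z][y][x]
            for z in range(cz - hz, cz + hz + 1)
            for y in range(cy - hy, cy + hy + 1)
            for x in range(cx - hx, cx + hx + 1)]


def first_order_metrics(values):
    """Return SUVmax, SUVmean, SUVsd, SUVsk and SUVku for a list of voxel SUVs."""
    n = len(values)
    mean = sum(values) / n
    # Central moments of order 2, 3 and 4 (population form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    sd = math.sqrt(m2 * n / (n - 1))  # assumption: sample standard deviation
    if m2 == 0:
        skew = kurt = float("nan")    # undefined for a constant VOI
    else:
        skew = m3 / m2 ** 1.5         # assumption: moment-based skewness
        kurt = m4 / m2 ** 2           # assumption: non-excess kurtosis
    return {"SUVmax": max(values), "SUVmean": mean,
            "SUVsd": sd, "SUVsk": skew, "SUVku": kurt}
```

A typical call would be `first_order_metrics(cuboid_voi(volume, peak_xyz))`, where `peak_xyz` is the recorded voxel with lesion SUVmax.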
2.6 Voxel Size, SUV Interpolation, and VOI Definition for TA Metrics
Because stable TA requires a large number of voxels in the VOI, the reconstructed PET voxel size would be inadequate if applied directly [15, 16]. Smaller cubic voxels with side lengths of 1 mm, 2 mm and 3 mm were therefore adopted, with voxel SUV estimated by trilinear interpolation. In TA of two-dimensional images, a square region of interest is usually defined; for the 3D data in the current study, an isotropic cubic VOI was adopted. For each recorded lesion, TA metrics were computed from cubic VOIs centered on the recorded voxel location. Three representative VOI sizes, with side lengths of 10 mm, 15 mm and 20 mm, were adopted for further TA.
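The resampling step can be sketched as follows. This Python example shows trilinear interpolation on a volume[z][y][x] grid and its use to fill a cubic VOI with smaller voxels. The function names, the clamping at the volume edge, and the rule for the number of samples per side (side length divided by voxel size, rounded) are assumptions for illustration, not details taken from the study.

```python
import math


def trilinear(volume, x, y, z):
    """Trilinearly interpolate volume[z][y][x] at fractional voxel indices.

    Coordinates are clamped to the volume; each axis needs at least 2 voxels.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])

    def split(c, n):
        c = min(max(c, 0.0), n - 1.0)        # clamp (assumed edge rule)
        i0 = min(int(math.floor(c)), n - 2)  # lower corner index
        return i0, c - i0                    # index and fractional part

    x0, fx = split(x, nx)
    y0, fy = split(y, ny)
    z0, fz = split(z, nz)
    value = 0.0
    # Weighted sum over the 8 surrounding voxel centers.
    for dz, wz in ((0, 1.0 - fz), (1, fz)):
        for dy, wy in ((0, 1.0 - fy), (1, fy)):
            for dx, wx in ((0, 1.0 - fx), (1, fx)):
                value += wz * wy * wx * volume[z0 + dz][y0 + dy][x0 + dx]
    return value


def resample_cubic_voi(volume, center_idx, spacing_mm, side_mm, new_voxel_mm):
    """Sample a cubic VOI centered on center_idx = (x, y, z) voxel indices.

    spacing_mm is the reconstructed voxel size (sx, sy, sz), for example
    (5.47, 5.47, 3.27). Returns a nested list indexed [z][y][x].
    """
    n = max(1, int(round(side_mm / new_voxel_mm)))  # samples per side (assumed)
    cx, cy, cz = center_idx
    sx, sy, sz = spacing_mm
    offsets = [(k - (n - 1) / 2.0) * new_voxel_mm for k in range(n)]
    return [[[trilinear(volume, cx + dx / sx, cy + dy / sy, cz + dz / sz)
              for dx in offsets] for dy in offsets] for dz in offsets]
```

For example, `resample_cubic_voi(volume, peak_xyz, (5.47, 5.47, 3.27), 10.0, 2.0)` would return a 5 by 5 by 5 grid of interpolated SUVs under the assumed sampling rule.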
2.7 Quantization of Voxel SUV in VOI
To assess the various texture features, voxel SUV values in the VOI have to be quantized into a specified number of bins. In this study, representative bin numbers of 2^n (n = 1 to 8), that is, 2 to 256 bins, were used. A linear quantization method was adopted, in which the specified number of bins was spaced linearly between the minimum and maximum voxel SUV values in the VOI.
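Linear quantization between the VOI minimum and maximum can be sketched in a few lines of Python. The exact placement of bin edges is not specified in the text, so this sketch assumes equal-width bins with the maximum value assigned to the top bin and 0-based bin labels; the function name is illustrative.

```python
def quantize_linear(values, n_bins):
    """Map voxel SUVs to integer bins 0 .. n_bins - 1 between min and max.

    Assumptions: equal-width bins, and the maximum value falls in the top bin.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # constant VOI: everything in one bin
    width = (hi - lo) / n_bins
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


# Bin numbers of 2^n for n = 1 to 8.
BIN_NUMBERS = [2 ** n for n in range(1, 9)]
```

Because the bins are rescaled to each VOI's own minimum and maximum, the same SUV value can fall in different bins in different lesions.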
2.8 Texture Features and Metrics
TA metrics were computed with a software tool based on the open-source Chang-Gung Image Texture Analysis (CGITA) toolbox, which was developed at our institution and implemented in the MATLAB (MathWorks Inc., Natick, MA, USA) software environment [17]. Eight kinds of parent texture matrices were used: cooccurrence matrix [8], run-length (voxel-alignment) matrix [18], neighborhood difference matrix [19], size-zone matrix [20], texture spectrum matrix [21], texture feature coding matrix [22], texture feature coding cooccurrence matrix [22], and neighborhood dependence matrix [23]. The 62 texture features derived from these parent matrices are listed in Table 1. For lesion classification, a total of 4464 TA metrics were evaluated, arising from the combination of three VOI sizes, three voxel sizes, eight bin numbers and 62 texture features (3 × 3 × 8 × 62 = 4464).
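The total of 4464 TA metrics can be checked by enumerating the settings. In the Python sketch below, the per-matrix feature counts are tallied from Table 1, and all variable names are illustrative.

```python
from itertools import product

VOI_SIDES_MM = (10, 15, 20)
VOXEL_SIDES_MM = (1, 2, 3)
BIN_NUMBERS = tuple(2 ** n for n in range(1, 9))

# Number of texture features listed per parent matrix in Table 1.
FEATURES_PER_MATRIX = {
    "cooccurrence": 9,
    "run-length": 11,
    "neighborhood difference": 5,
    "size-zone": 11,
    "texture spectrum": 2,
    "texture feature coding": 4,
    "texture feature coding cooccurrence": 9,
    "neighborhood dependence": 11,
}

n_features = sum(FEATURES_PER_MATRIX.values())
settings = list(product(VOI_SIDES_MM, VOXEL_SIDES_MM, BIN_NUMBERS))
n_metrics = len(settings) * n_features  # one metric per feature per setting
```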
Table 1
Parent texture matrices with derived texture features exploited in the study
Parent texture matrix | Texture features
Cooccurrence | Energy, entropy, variance, correlation, contrast, homogeneity, sum average, dissimilarity, inverse difference moment
Run-length | Run percentage, short-run emphasis, long-run emphasis, low-intensity run emphasis, high-intensity run emphasis, gray-level nonuniformity, run length nonuniformity, low-intensity short-run emphasis, high-intensity short-run emphasis, low-intensity long-run emphasis, high-intensity long-run emphasis
Neighborhood difference | Coarseness, contrast, busyness, complexity, strength
Size-zone | Run percentage, short-run emphasis, long-run emphasis, low-intensity run emphasis, high-intensity run emphasis, gray-level nonuniformity, run length nonuniformity, low-intensity short-run emphasis, high-intensity short-run emphasis, low-intensity long-run emphasis, high-intensity long-run emphasis
Texture spectrum | Maximum, variance
Texture feature coding | Coarseness, homogeneity, mean convergence, variance
Texture feature coding cooccurrence | Energy, entropy, variance, correlation, contrast, homogeneity, sum mean, dissimilarity, inverse difference moment (IDM)
Neighborhood dependence | Run percentage, short-run emphasis, long-run emphasis, low-intensity run emphasis, high-intensity run emphasis, gray-level nonuniformity, run length nonuniformity, low-intensity short-run emphasis, high-intensity short-run emphasis, low-intensity long-run emphasis, high-intensity long-run emphasis
2.9 Definition of 3D Connectivity and Cooccurrence Matrix Offset
The simple 6-connected neighborhood was adopted for calculation of 3D texture features; that is, the six face-touching voxels were considered direct neighbors of the central voxel. TA metrics were calculated along these six neighboring directions and averaged. Because different voxel sizes were used, only a distance offset of one voxel was evaluated in the computation of the cooccurrence matrix.
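The directional averaging can be illustrated for one cooccurrence feature. The Python sketch below builds a normalized cooccurrence matrix for each of the six face-neighbor directions at an offset of one voxel, computes contrast for each, and averages the six values. It assumes a quantized volume indexed as q[z][y][x] with 0-based gray levels; the function names are illustrative and this is not the CGITA code.

```python
# The six face-touching neighbor directions as (dx, dy, dz).
DIRECTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cooccurrence(q, n_levels, direction):
    """Normalized cooccurrence matrix for one direction at an offset of one voxel."""
    dx, dy, dz = direction
    nz, ny, nx = len(q), len(q[0]), len(q[0][0])
    counts = [[0] * n_levels for _ in range(n_levels)]
    for z in range(nz):
        for y in range(ny):
            for x in range(nx):
                x2, y2, z2 = x + dx, y + dy, z + dz
                # Count the pair only if the neighbor is inside the VOI.
                if 0 <= x2 < nx and 0 <= y2 < ny and 0 <= z2 < nz:
                    counts[q[z][y][x]][q[z2][y2][x2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return [[0.0] * n_levels for _ in range(n_levels)]
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Cooccurrence contrast: sum of (i - j)^2 * p[i][j]."""
    return sum((i - j) ** 2 * p[i][j]
               for i in range(len(p)) for j in range(len(p)))


def averaged_contrast(q, n_levels):
    """Contrast computed along each of the six directions, then averaged."""
    per_direction = [contrast(cooccurrence(q, n_levels, d)) for d in DIRECTIONS]
    return sum(per_direction) / len(per_direction)
```

Opposite directions produce transposed matrices, so symmetric features such as contrast take the same value in each pair of opposite directions.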
2.10 Statistical Analysis
The means of first-order metrics for metastatic and benign bone lesions were compared using the independent-samples t-test. A logistic regression model was used to evaluate lesion classification, with the area under the receiver operating characteristic curve (AUC) as the performance measure. AUCs were compared using a fast implementation of DeLong’s algorithm [24]. A two-sided P value less than .05 was considered statistically significant. Correlations between metrics were assessed using Pearson’s correlation coefficient and were considered very weak, weak, moderate, strong, or very strong if the absolute coefficient was < .2, .2 to < .4, .4 to < .6, .6 to < .8, or .8 to 1.0, respectively. Statistical analyses were performed using R software (version 4.1.0, R Foundation for Statistical Computing, Vienna, Austria). Because of the large number of TA metrics, only the twenty top-performing ones in the training dataset are listed. First-order and top-performing TA metrics were then assessed with the validation dataset.
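Two of the quantities above can be illustrated compactly. The Python sketch below computes an empirical AUC by pairwise comparison of scores (equivalent to the Mann-Whitney statistic) and labels a Pearson coefficient with the strength categories defined in the text. It is a stand-in for illustration, not the R code used in the study; the function names are assumptions, and the scores could be, for example, fitted probabilities from the logistic regression model.

```python
import math


def empirical_auc(pos_scores, neg_scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_strength(r):
    """Label |r| using the thresholds stated in the text."""
    a = abs(r)
    if a < 0.2:
        return "very weak"
    if a < 0.4:
        return "weak"
    if a < 0.6:
        return "moderate"
    if a < 0.8:
        return "strong"
    return "very strong"
```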