Patient cohort
The study included patients with primary and secondary intrahepatic malignancies who had 90Y PET/CT imaging performed after 90Y radioembolization with glass microspheres (Theraspheres) at University of Michigan (UM) Medical Center as part of an ongoing dosimetry research study. Selection criteria for 90Y PET/CT imaging were: well defined lesions >2 mL, ability to undergo imaging, follow-up at UM and informed consent. The patient and lesion characteristics for the 36 lobar treatments (30 patients, 105 lesions, 6 patients had treatment to right and left lobes at different time points.) are summarized in Supplemental Table 1. The treating physician followed standard guidelines to deliver 80-150 Gy to the treated liver with empirical adjustments within this range based on clinical factors. The 90Y PET/CT imaging was approved by the institutional review board, and all subjects signed an informed consent form.
90Y PET/CT Imaging and dosimetry
Images were acquired on a Siemens Biograph mCT PET/CT within a couple of hours of the RE procedure (prior to discharge) with an acquisition time of ~30 minutes to cover the entire liver and partial lung. PET reconstruction parameters were selected based on phantom studies considering both activity recovery and noise: 1 iteration, 21 subsets of 3D OS-EM with time-of-flight and resolution recovery and a 5 mm Gaussian post-filter [18]. The PET matrix size was 200×200 with a pixel size 4.07×4.07 mm and a slice thickness of 3 mm. The CT was performed in low dose mode (120 kVp; 80 mAs) during free-breathing. The CT matrix size was 512×512 with a pixel size of 0.97×0.97 mm and a slice thickness of 2 mm.
PET images were transformed to CT-space and the CT-derived density map were input to our DPM Monte Carlo code [18] to generate dose-rate maps that were converted into absorbed dose maps by accounting for 90Y physical decay. Mean absorbed doses to segmented lesions were reported following partial volume correction based on volume-dependent recovery coefficients, determined from a phantom study [18].
Radiomics: lesion segmentation, PET data preprocessing and feature extraction
Lesion segmentation was performed on diagnostic quality contrast enhanced baseline CT or MRI by a radiologist specializing in hepatic malignancies (RK), which is considered a gold standard. Note that variability due to contouring can be a source of error, but has been addressed in several previous studies [6, 24, 25]. The diagnostic scan was then rigidly registered to the CT of the 90Y PET/CT and the contours were transformed with fine tuning when mis-registration was evident on MIM (MIM Software Inc, Cleveland, OH). In some cases, where the lesions were well visualized on the non-contrast low-dose CT of the PET/CT they were directly defined on this CT in order to minimize mis-registration effects. Up to 5 (largest) lesions > 2 mL were segmented per patient.
Lesion contours and 90Y PET images were input to an in-house developed (Matlab, MathWorks Inc., Natick, MA) radiomics toolbox (benchmarked by image biomarker standardization - ISBI) that run as an extension on MIM. Our radiomics code is shared at https://github.com/mvallieres/radiomics. All subsequent analyses were performed in MATLAB. First, a root-squared transform was applied to the PET images to reduce quantum noise effects [26].
The full intensity range of the tumor region was quantized to a smaller number of gray levels (Ng) before computation of the features. The quantization algorithm used is Lloyd-Max algorithm, which attempts to minimize the mean-squared quantization error of the output. Ng was experimentally chosen as 32 [27]. The features were extracted from 3D 90Y PET images, which were interpolated to isotropic voxel size (0.97 mm). 46 features, including volume, one shape feature (sphericity), 4 global features, and 40 texture features from gray-level co-occurrence matrix (GLCM), gray-level run length matrix (GLRLM), gray-level size zone matrix (GLSZM), and neighborhood gray-tone difference matrix (NGTDM), were extracted. All the feature extraction followed the Image biomarker standardization initiative (IBSI) guidance [28]. These features represent the spectrum of commonly used features, especially in PET imaging [6, 29, 30]. We further opted for extraction parameters following the ISBI guidelines due to the limited sample size, we didn’t explore further parameterization or less commonly used features. Supplemental Table 2 presents the list of radiomics features used in this study.
Lesion-level Study Endpoints
Phantom study to assess radiomics feature repeatability and reproducibility
A 90Y PET/CT study with a liver/lung torso phantom consisting of a ‘warm’ liver compartment and three ‘hot’ lesion inserts (29 mL ellipsoid, 16 mL sphere, 8 mL sphere) with an insert-to-liver activity concentration ratio of 5:1 was performed. The total activity in the phantom was 1.9 GBq and the activity concentrations in the inserts were 6.0-7.3 MBq/mL and liver minus inserts was 1.2 MBq/mL. To assess radiomics feature repeatability 5 consecutive 30 min acquisitions under identical conditions were performed on the same PET/CT system as in the patient studies. To assess sensitivity to reconstruction parameters and filtering, each of the 5 scans were reconstructed with 1 and 2 OS-EM iterations (21 subsets) and with and without Gaussian post-filter. The activity concentrations, acquisition time and parameters used in the phantom study were chosen to reflect conditions for imaging following 90Y RE, hence, the noise-level was clinically relevant.
Statistical Analysis
1. Phantom feature robustness study
Concordance correlation coefficient (CCC) metric assumed each observation was independent as has been commonly reported in repeatability/reproducibility studies [22, 32]. Thus, in the robustness study of our extracted radiomics features, CCCs were computed for the different scans, different iterations, and with/without Gaussian filtering. For each of the 45 radiomics features (without volume), the resulting CCCs were averaged, and features with larger than 0.85 [22, 33-36] average CCC -robust radiomics feature set, were further investigated in the patient radiomics modeling.
2. Lesion overall response and progression modeling studies
(1) Univariate analysis
Univariate association between the features (or absorbed dose) and OR classification was investigated using Spearman’s rank correlation. Univariate analysis for the features (or absorbed dose) and progression was investigated by Cox regression.
(2) Multivariate analysis -- modified Bo-LASSO
In order to select robust features, build generalized models and evaluate unbiased model performance, a nested cross-validation (CV) framework has been employed (details are shown in Fig. 1). In the outer loop, 10 times 5-fold cross validation was used to estimate the model performance. On the training set of each inner loop, N times bootstrap was performed. For each resampling training set, optimal lambda hyper-parameter for LASSO was tuned by another cross-validation process. Subsequently, features with non-zero coefficients were recorded. With N resampled training sets, N sets of features were recorded. The frequency of a certain feature being selected by LASSO was calculated and thus a ranking list of the features was obtained. Then, M times bootstrap logistic regression modeling was used to estimate the model order. Specifically, models using top i (i = 1, ... , n = number of features) ranked features were developed and mean AUC/c-index for each model order with confidence interval was obtained and the model order corresponding to highest AUC/c-index within one standard error was selected [37]. After we obtained the model order and top selected features, final model in each outer loop was obtained by retraining on the training set and applied on the outer test set (Here, N and M were both 100).
With the developed method, models were constructed using the 15 robust radiomics features set, lesion volume and mean absorbed dose (AD) (15+1+1=17). Since there are two subgroups in this patient cohort, the developed models were applied to both subgroups to assess if the tumor response correlated differently for primary HCC and metastatic lesions. The ROC curve (AUC) and c-index were used to evaluate the lesion OR and progression model performance, respectively. The confidence intervals were calculated by the bootstrap method [38]. The statistical analysis was performed using MATLAB R2019a and RStudio 1.1.463. The Bonferroni correction was applied to account for the family-wise error rate [39]. Overall, 17 features (dose + volume +15 radiomics features) were tested; therefore, p-values < 0.05/17=0.003 was considered significant. For the whole set of features (dose + volume + 45 radiomics features), p- values < 0.05/47=0.001 were considered significant. Meanwhile, due to the existence of unbalance in the dataset, especially for progression analysis (events 14/103), Adaptive Synthetic Sampling Approach (ADASYN) was applied for the multivariate analysis to see if it can improve the performance [40].