Patient Clinical Characteristics
A total of 100 patients were included in this study. At institution 1, 64 patients (mean age ± standard deviation, 61.9 years ± 8.8) were identified and assigned to the training set; delayed imaging was performed in 46 of these patients. Thirty-six patients (mean age ± standard deviation, 67.6 years ± 8.8) from institution 2 were identified and assigned to the external test set, and 25 of these patients underwent delayed scans. Clinical characteristics of the training (n=64) and external test (n=36) sets are summarized in Table 1. In the training set, 39 patients were diagnosed with prostate cancer and 25 patients with benign prostatic disease, including benign prostatic hyperplasia (BPH) (n=7), chronic inflammation (n=1) and BPH with chronic inflammation (n=17). In the external test set, 21 patients were diagnosed with prostate cancer. The remaining 15 patients had benign prostatic disease, including BPH (n=8), chronic inflammation (n=1) and BPH with chronic inflammation (n=6). The distributions of age, tPSA, PSAD, Gleason score and ISUP grade were not significantly different between the training and external test sets.
Radiomics Model Development
Development and Testing of the Radiomics RF Model based on Standard PET images
A total of 1781 radiomics features were extracted from the prostate ROIs on standard PET images. The 10 most useful predictive features of standard PET images selected by the mRMR method were one original first-order feature, 3 wavelet-based features, 4 Laplacian of Gaussian–based features, one square root-based feature and one logarithm-based feature. A cluster heatmap of these radiomics features was generated by means of agglomerative hierarchical clustering to visualize the differences between the PCa and non-PCa groups and the association between the clusters of subjects and the features (Figure 4.a).
The average AUC, precision score, recall score, accuracy score and F1 score of the RF model based on standard PET images from 10-fold cross-validation were 0.87 (95% CI: 0.72, 1.00), 0.90 (95% CI: 0.70, 1.00), 0.82 (95% CI: 0.60, 1.00), 0.84 (95% CI: 0.68, 1.00) and 0.84 (95% CI: 0.68, 1.00), respectively.
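The greedy logic behind mRMR (minimum redundancy, maximum relevance) selection can be illustrated with a minimal sketch. It is not the implementation used in this study: absolute Pearson correlation stands in for the relevance and redundancy measures (mRMR is often formulated with mutual information), the difference criterion is assumed, and the feature names and values in `toy_features` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def mrmr_select(features, labels, k):
    """Greedy mRMR with the difference criterion (relevance minus mean redundancy).

    features: dict mapping feature name -> list of values, one per patient
    labels:   list of 0/1 class labels (1 = PCa, 0 = non-PCa)
    Returns the names of the k selected features in selection order.
    """
    # Relevance: how strongly each feature tracks the class label.
    relevance = {name: abs(pearson(vals, labels)) for name, vals in features.items()}
    selected = []
    remaining = set(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            # Redundancy: mean absolute correlation with already selected features.
            redundancy = sum(
                abs(pearson(features[name], features[s])) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(sorted(remaining), key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: 8 patients, 4 candidate features.
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
toy_features = {
    "feat_strong": [1, 2, 1, 2, 5, 6, 5, 6],
    "feat_near_duplicate": [1, 2, 1, 2, 5, 6, 5, 7],
    "feat_complementary": [2, 1, 2, 1, 4, 3, 4, 3],
    "feat_noise": [3, 1, 4, 1, 3, 1, 4, 1],
}
toy_selection = mrmr_select(toy_features, toy_labels, k=2)
```

In this toy example the near-duplicate feature is passed over in the second step even though it is more relevant than the complementary one, because almost all of its information is already carried by the first selected feature.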
Development and Testing of the Radiomics Model based on Delayed PET images
A total of 1781 radiomics features were extracted from the prostate ROIs on delayed PET images. The 10 most useful predictive features of delayed PET images selected by the mRMR method were 5 wavelet-based features, one Laplacian of Gaussian–based feature, one square-based feature, one exponential-based feature, one logarithm-based feature and one gradient-based feature. A cluster heatmap of these radiomics features was generated by means of agglomerative hierarchical clustering to visualize the differences between the PCa and non-PCa groups and the association between the clusters of subjects and the features (Figure 4.b).
The average AUC, precision score, recall score, accuracy score and F1 score of the RF model based on delayed PET images from 10-fold cross-validation were 0.86 (95% CI: 0.63, 1.00), 0.87 (95% CI: 0.63, 1.00), 0.81 (95% CI: 0.57, 1.00), 0.80 (95% CI: 0.53, 1.00) and 0.79 (95% CI: 0.51, 1.00), respectively.
Development and Testing of the Radiomics Model based on Standard and Delayed PET images
To develop the RF model based on standard and delayed PET images, we used both the 10 most useful predictive features of standard PET images and the 10 most useful predictive features of delayed PET images selected by the mRMR method.
The average AUC, precision score, recall score, accuracy score and the F1 score of the RF model based on standard PET images and delayed PET images from 10-fold cross validation were 0.91 (95% CI: 0.69, 1.00), 0.91 (95% CI: 0.65, 1.00), 0.93 (95% CI: 0.75, 1.00), 0.88 (95% CI: 0.65, 1.00) and 0.87 (95% CI: 0.61, 1.00), respectively.
The performance of the trained radiomics RF models based on standard PET images, delayed PET images, and both standard and delayed PET images in 10-fold cross-validation is summarized in Table 2.
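As an illustration of how a 10-fold cross-validation partitions the 64 training patients (39 PCa, 25 benign), the following sketch assigns each patient to exactly one test fold. Stratification by class, the shuffling seed and the function name `stratified_folds` are assumptions made for this example; the text does not state whether the folds were stratified.

```python
import random


def stratified_folds(labels, n_folds=10, seed=0):
    """Assign every sample index to one of n_folds test folds, balancing classes.

    Indices of each class are shuffled and dealt round-robin into the folds,
    so per-class counts differ by at most one between folds.
    """
    rng = random.Random(seed)
    folds = [[] for _ in range(n_folds)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % n_folds].append(i)
    return folds


# Class sizes taken from the training set described above: 39 PCa, 25 benign.
cv_labels = [1] * 39 + [0] * 25
cv_folds = stratified_folds(cv_labels)

for fold_number, test_idx in enumerate(cv_folds):
    held_out = set(test_idx)
    train_idx = [i for i in range(len(cv_labels)) if i not in held_out]
    # A model would be fitted on train_idx and scored on test_idx here;
    # the per-fold scores are then averaged across the 10 folds.
    assert len(train_idx) + len(test_idx) == len(cv_labels)
```

With 64 patients each test fold holds only 5 to 7 cases, which is one reason per-fold metrics vary widely and the resulting confidence intervals are broad.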
Comparison of the Radiomics Models and PSAD in the External Test Set
In the external test set, the AUCs of the trained RF models based on standard PET images, delayed PET images, and both standard and delayed PET images were 0.903 (95% CI: 0.830, 0.975), 0.856 (95% CI: 0.748, 0.964) and 0.925 (95% CI: 0.838, 1.000), respectively. The sensitivity, specificity, accuracy, PPV and NPV were 0.816 (95% CI: 0.657, 0.923), 0.767 (95% CI: 0.577, 0.901), 0.794 (95% CI: 0.679, 0.883), 0.816 (95% CI: 0.648, 0.923) and 0.767 (95% CI: 0.587, 0.901) for the RF model based on standard PET images; 0.750 (95% CI: 0.509, 0.913), 0.846 (95% CI: 0.651, 0.956), 0.804 (95% CI: 0.661, 0.906), 0.789 (95% CI: 0.560, 0.930) and 0.814 (95% CI: 0.603, 0.946) for the RF model based on delayed PET images; and 0.850 (95% CI: 0.621, 0.968), 0.885 (95% CI: 0.698, 0.976), 0.870 (95% CI: 0.737, 0.951), 0.850 (95% CI: 0.631, 0.968) and 0.885 (95% CI: 0.689, 0.976) for the RF model based on both standard and delayed PET images, respectively.
The AUC, sensitivity, specificity, accuracy, PPV and NPV of PSAD (cutoff value, 0.15 ng/mL/mL) in the external test set were 0.662 (95% CI: 0.510, 0.813), 0.857 (95% CI: 0.637, 0.970), 0.467 (95% CI: 0.213, 0.734), 0.694 (95% CI: 0.519, 0.837), 0.692 (95% CI: 0.410, 0.923) and 0.700 (95% CI: 0.405, 0.880), respectively.
The AUCs of the radiomics RF models based on standard PET images, delayed PET images, and both standard and delayed PET images were significantly higher than that of PSAD (0.903, 0.856 and 0.925 vs 0.662, respectively; P = .007, P = .045 and P = .005, respectively).
The diagnostic performance of the three radiomics models and of PSAD (cutoff value, 0.15 ng/mL/mL) in the external test set is summarized in Table 3. The ROC curves of the three radiomics models and PSAD are shown in Figure 5.
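The PSAD figures can be checked with a short worked example. The confusion-matrix counts used below (18 true positives, 3 false negatives, 7 true negatives, 8 false positives) are not reported in the text; they are inferred here from the external test set composition (21 PCa, 15 benign) and the reported proportions, and should be read as an assumption.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Standard diagnostic accuracy measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # detected cancers / all cancers
        "specificity": tn / (tn + fp),            # correctly cleared / all benign
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),                    # cancers among positive calls
        "npv": tn / (tn + fn),                    # benign among negative calls
    }


# Inferred counts for PSAD at the 0.15 cutoff (assumption, see the text above).
psad_metrics = diagnostic_metrics(tp=18, fn=3, tn=7, fp=8)

for name, value in psad_metrics.items():
    print(f"{name}: {value:.3f}")
```

Under these counts the five proportions round to the reported 0.857, 0.467, 0.694, 0.692 and 0.700, which shows that the low specificity corresponds to 8 of the 15 benign cases exceeding the PSAD cutoff.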
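For readers less familiar with the AUC, it can be read as the probability that a randomly chosen PCa patient receives a higher score than a randomly chosen benign patient. The sketch below computes it by direct pairwise comparison; the scores are hypothetical toy values and not study data, and the statistical test used to compare AUCs is not reproduced here.

```python
def auc_from_scores(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A pair counts 1 when the positive case scores higher, 0.5 on a tie and
    0 otherwise (equivalent to the Mann-Whitney U statistic, rescaled).
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs for 4 PCa and 4 benign patients.
toy_pca_scores = [0.9, 0.8, 0.7, 0.4]
toy_benign_scores = [0.6, 0.3, 0.2, 0.4]
toy_auc = auc_from_scores(toy_pca_scores, toy_benign_scores)
```

Because the AUC depends only on how cases are ranked, it can be compared directly between a model's predicted probability and a raw clinical measurement such as PSAD without choosing a cutoff.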