To the best of our knowledge, no previous study has evaluated PET/CT combined with MR-based radiomics and baseline clinical parameters in patients with NPC. We found that radiomic features from MR and PET/CT improved the prediction of OS and PFS, particularly when combined (AUC 0.96 and 0.86, respectively). Clinical + RP-MR features initially outperformed clinical + PET/CT features (< 18 months). In the OS model, clinical + PET/CT features then consistently outperformed clinical + RP-MR features, whereas in the PFS model clinical + RP-MR features again outperformed clinical + PET/CT features late in follow-up (> 42 months).
Our study confirms the findings of multiple published studies that have demonstrated the pre-treatment prognostic value of MR-based radiomics in patients with NPC, consistently showing that MR-based radiomics outperform clinical features alone when predicting either PFS or OS [4, 6, 11, 12, 14, 16–18, 20, 22, 28–30]. The AUC for clinical + RP-MR in our study was as high as 0.84 for PFS and 0.87 for OS, which is comparable with the literature, where the AUC varies from 0.8 to 0.886 and the C-index from 0.72 to 0.874.
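For readers comparing the C-index values quoted above, the following is a minimal sketch of how Harrell's concordance index can be computed for right-censored survival data. It is an illustration only and not the implementation used in this study or in the cited studies; the function name `harrell_c_index` and the toy data are hypothetical, and tied follow-up times are simply skipped.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's C-index for right-censored data (simplified sketch).

    times       : follow-up time per patient
    events      : 1 if the event was observed, 0 if censored
    risk_scores : model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this sketch
        if not events[a]:
            continue  # shorter time is censored: true ordering unknown
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event had the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: five patients, two of them censored.
toy_times = [6, 12, 18, 24, 30]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.7, 0.8, 0.3, 0.1]
toy_c_index = harrell_c_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of comparable patient pairs, which is why C-index and AUC values are often discussed side by side even though they are not identical quantities.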
A significant proportion of these studies were only performed among patients with advanced (stage III-IV), non-metastatic NPC [4, 6, 11, 12, 14, 30], with the remainder performed among non-metastatic NPC patients of all stages, similar to our study [17–20, 22, 28].
Similar to the majority of MR-based radiomic studies, we included both contrast-enhanced T1-w and T2-w MR sequences in our study [4, 6, 11, 12, 14, 16–18, 20, 22, 30]. However, although both sequences were evaluated, ultimately only radiomic features from the contrast-enhanced T1-w sequences were found to be significant and included in our final OS and PFS models (RP_T1_GLZLM_GLNU, RP T1 CONVENTIONAL Skewness and RP T1 NGLDM Busyness). This differs in part from other studies, which have shown that joint contrast-enhanced T1 and T2 radiomic features have a better prognostic performance than T1 or T2 features alone [11, 12]. The difference may be the result of better-performing PET-based radiomic features being incorporated into our model.
Another difference from the literature is the method used for radiomic feature extraction (e.g. MATLAB), with only one other NPC radiomic study also using LIFEx software for radiomic feature extraction. Despite the use of the same MR sequences (contrast-enhanced T1-w and T2-w sequences) and radiomic extraction software, different radiomic features were found to be significant (RP_T1_GLZLM_GLNU, RP T1 CONVENTIONAL Skewness and RP T1 NGLDM Busyness in our study, and GLCM_Energy, GLCM_Corre, and CONV_st in Yang et al. 2019). This may reflect our use of 3.0 T, fat-saturated MR sequences with different technical parameters. Similar to the majority of studies into NPC radiomics, our study evaluated radiomic parameters within the primary tumor; however, a number of studies have assessed both the primary NPC tumor and adjacent locoregional lymph nodes, with similar findings, confirming the prognostic value of combined baseline clinical and MR-based radiomics [14, 19].
Three studies in the literature have explored the performance of PET/CT-based radiomic features in NPC patients. Similar to our study, they demonstrate that combining clinical with PET/CT features improved the prediction of PFS, with a C-index of 0.77 and 0.69 and an AUC of 0.829, compared with 0.81 in our study. The study from Peng et al. only examined patients with advanced NPC (stage II-IV), in contrast to our study and the remaining PET/CT radiomic studies. The study by Lv et al. identified age as a significant clinical parameter, as in our study, in addition to IgA, N and M stage. Our study identified PET_CONVENTIONAL_SUVbwQ1 and PET DISCRETIZED SUVbwmin as significant PET radiomic parameters; however, no PET features were retained following multivariate analysis in the study of Lv et al. 2019. By comparison, other parameters such as PET-NGTDM-Complexity, CT-GLGLM-LGGE and PET-GLGLM-SGLGE were found to be significant in the study by Xu et al. 2019 [24].
Our study evaluated both the PET and the CT component of the PET/CT study; however, no CT parameters were found to have significant prognostic value, unlike in the remaining PET/CT-based radiomic studies [7, 23, 24]. We routinely evaluate the CT component in our radiomics studies since PET/CT is used clinically as a combined imaging modality. The complementary value of the CT component has previously been demonstrated in the literature, and if radiomics should enter routine clinical decision making in the future, the combined value of PET and CT radiomics would be beneficial per disease site.
There is currently only a single study in the existing literature comparing the prognostic value of PET and MR; however, it only uses T2-w MR and PET images. Our study is the first to demonstrate the improved prognostic value of combined clinical + PET/CT + MR features compared with clinical, PET/CT or MR features individually for both OS and PFS (AUC 0.96 at 24 months for OS and 0.86 at 21 months for PFS). Since our results indicate that mainly PET and MR radiomic features have prognostic value, combined PET/MR imaging could be considered as a clinical tool for staging, prognostication and potentially surveillance of NPC. This may offer the patient (and the hospital) improved staging logistics (one combined exam instead of separate PET/CT and MR exams) and possibly a better prognostication tool in the future.
Interestingly, clinical + RP-MR features outperformed clinical + PET/CT features early in the follow-up period (< 18 months) for both OS and PFS, and again late in follow-up (> 42 months) for PFS. Since MRI is used mostly for local staging (because of its well-documented superiority), one possible explanation is that the local tumor is the dominant driver of short-term tumoral behavior. PET, however, may provide improved overall prognostication, representing the overall pathophysiological behavior better than morphological imaging procedures. Ultimately, these findings remain indeterminate and need to be confirmed in similar studies.
Our study had some limitations, predominantly in terms of methodology. This was a retrospective study with a moderate number of patients (124) (reported sample sizes in the literature range from 85 to 737), with mixed clinical stages of NPC (I-IV). Other prognostic molecular biomarkers, such as haemoglobin, LDH, neutrophil-lymphocyte ratio, c-Met, ERBB3 and MTDH, were not available for inclusion in the study, as these were not routinely obtained at our institution at that time.
Although the PET/CT and RP-MR images were obtained from the same institution and scanners, maintaining uniformity in image acquisition, no image preprocessing was performed prior to segmentation. However, there is currently no general consensus on whether image preprocessing should be performed and, if so, which kind. Some researchers are opposed to image preprocessing, since it would be prohibitive to implement clinically on a large scale. Segmentation was also only performed manually, without reproducibility evaluation.
Statistical methodology, in terms of feature selection and modelling, is highly variable between radiomic studies (LASSO, RFE, univariate analysis; RS, CR and nomogram, Chi-squared test, SFFS and SVM). We performed univariate analysis followed by construction of a multivariate Cox regression model into which radiomic features were then added. The major difference between our study and those in the literature is that the majority of studies use both training and validation cohorts to estimate model performance, with only one other study utilizing internal cross-validation. Leave-one-out cross-validation is considered a robust statistical analysis, especially for study populations like ours. Although this approach has the benefit of requiring a smaller sample size, the absence of an independent validation cohort means that this study can only be classified as exploratory. An independent dataset would therefore still be required for validation of the models presented.
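The general idea of leave-one-out cross-validation can be sketched as follows. This is an illustration under stated assumptions, not the analysis code of this study: it assumes a binary event label at a fixed time point, uses a toy nearest-centroid scorer on a single feature in place of the Cox regression model, and all function names and data are hypothetical.

```python
def leave_one_out_scores(features, labels, fit, predict):
    """Refit the model n times, each time scoring the one held-out patient."""
    scores = []
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)  # patient i is never seen in training
        scores.append(predict(model, features[i]))
    return scores


def auc(labels, scores):
    """AUC as the fraction of (event, non-event) pairs ranked correctly."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy stand-in model (assumption): class centroids of one feature.
def fit_centroids(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)


def predict_score(model, x):
    pos_mean, neg_mean = model
    # Higher score: closer to the event centroid than to the non-event one.
    return abs(x - neg_mean) - abs(x - pos_mean)


# Hypothetical data: one radiomic feature and a binary event label.
toy_features = [1.0, 1.2, 0.8, 3.0, 3.2, 2.8]
toy_labels = [0, 0, 0, 1, 1, 1]
held_out = leave_one_out_scores(toy_features, toy_labels,
                                fit_centroids, predict_score)
toy_auc = auc(toy_labels, held_out)
```

Because every score is produced by a model that did not see the corresponding patient, the resulting AUC is an internal estimate of performance; it does not replace validation on an independent cohort.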