2.1 Patient Selection and Clinical Characteristics
Data for this study were obtained from the First Affiliated Hospital of GuangXi Medical University, China. This retrospective study was approved by the ethics committee of the hospital (No. KY-E-294,2021). All treatment protocols were carried out in accordance with National Comprehensive Cancer Network guidelines. Because of the retrospective design of the study, the ethics committee waived the need for individual informed consent.
In total, 65 patients with newly diagnosed, biopsy-confirmed NPC between January 2017 and February 2021 were retrospectively reviewed. The inclusion criteria were as follows: (1) pathologically confirmed NPC of stage III to IVb (according to the 8th edition of the American Joint Committee on Cancer/Union for International Cancer Control TNM staging system); (2) no treatment before MRI; (3) a Karnofsky performance score > 70; (4) treatment with RHES + CCRT; and (5) MRI performed at the stipulated times: pretreatment MRI within 7 days after hospitalization and post-treatment MRI at the end of RHES + CCRT. The exclusion criteria were as follows: poor MRI image quality, stage I–II NPC, and failure to follow the treatment regimen prescribed in this study.
Thirteen of the 78 patients screened were excluded: 3 patients (3.8%) with stage II disease, 3 patients (3.8%) with significant imaging artifacts caused by dentures, and 7 patients (8.9%) who did not undergo the post-treatment MRI examination. The remaining 65 patients (83.3%) with NPC were included in the present study.
Data on age and sex of the patients and pathologic type, TN staging, and morphologic size of the lesions were collected as clinical factors. All clinical records and MRI images were collected and recorded by two radiologists separately. Any disagreement was resolved through discussion.
2.2 Recombinant Human Endostatin Combined with Concurrent Chemoradiotherapy
All patients received RHES (dissolved in 250 mL of 0.9% normal saline; dosage: 7.5 mg/m²/day) intravenously for the first 5 days of CCRT. The RHES (Endostar) was provided by Simcere Pharmaceutical Research Co., Ltd.
2.3 Criteria for Treatment Response
The tumor treatment response was evaluated independently by two radiologists according to the RECIST 1.1 criteria. They measured the maximum diameters of target lesions on CE T1-weighted images obtained before and after RHES + CCRT treatment using the tools of the picture archiving and communication system (PACS; Carestream, Ontario, Canada). Both radiologists were blinded to patients’ information.
According to the change in maximum diameter between the two time points, the treatment response was classified as follows: (1) complete response (CR), wherein the nasopharyngeal mass had resolved on MRI and the mucosal thickness was <5 mm with no obvious abnormalities noted on nasopharyngoscopy; (2) partial response (PR), wherein the maximum diameter decreased by >30%; and (3) stable disease (SD), wherein the maximum diameter decreased by <30%. None of the cases in this study showed disease progression. Efficacy evaluation was performed at the end of CCRT. Patients with CR or PR were defined as responders, and patients with SD were defined as non-responders.
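The classification rule above can be illustrated with a minimal Python sketch. It is not the software used in this study; the function and parameter names are hypothetical. Two assumptions are made: a decrease of exactly 30% is treated as PR (RECIST 1.1 defines PR as a decrease of at least 30%), and disease progression is not handled because no case in this cohort progressed.

```python
def classify_response(pre_mm, post_mm, mucosa_mm=None, endoscopy_normal=False):
    """Return 'CR', 'PR', or 'SD' from the maximum diameters (mm) measured
    before and after treatment. All names here are illustrative."""
    if pre_mm <= 0:
        raise ValueError("pre-treatment diameter must be positive")
    # CR: mass resolved, mucosal thickness < 5 mm, and no obvious
    # abnormalities on nasopharyngoscopy.
    if mucosa_mm is not None and mucosa_mm < 5 and endoscopy_normal:
        return "CR"
    # Fractional decrease in maximum diameter between the two time points.
    shrinkage = (pre_mm - post_mm) / pre_mm
    # Assumption: exactly 30% counts as PR. Progression is not modelled.
    return "PR" if shrinkage >= 0.30 else "SD"


def is_responder(category):
    """CR and PR are responders; SD is a non-responder."""
    return category in {"CR", "PR"}
```

For example, a lesion shrinking from 40 mm to 20 mm (a 50% decrease) would be classified as PR and counted as a responder under these assumptions.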
2.4 MRI Imaging Protocols
All 65 patients underwent routine non-contrast and contrast-enhanced MR examination of the nasopharynx and cervical region on a 1.5-Tesla scanner (GE Signa Echospeed; GE Medical Systems, Milwaukee, USA) with head and neck coils. Axial T2-weighted imaging with fat suppression (T2WI_FS) and T1-weighted imaging with contrast enhancement (T1WI_CE) were chosen for radiomics analyses. The geometric parameters of T2WI_FS and T1WI_CE were identical; the main imaging parameters were as follows: field of view = 240 mm × 230 mm, matrix = 232 × 219, spatial resolution = 0.25 mm × 0.25 mm × 5.0 mm, slice thickness = 5 mm, spacing between slices = 1 mm, repetition time (TR)/echo time (TE) = 6000/70 ms (T2WI_FS), and TR/TE = 550/8.1 ms (T1WI_CE). The contrast agent was gadolinium-diethylenetriamine penta-acetic acid (Gd-DTPA, Magnevist; Schering Diagnostics AG, Berlin, Germany), administered at 0.1 mmol/kg of body weight at an injection rate of 2 mL/s.
2.5 Tumor Segmentation and Radiomics Feature Extraction
All MRI images of the enrolled patients were exported from the PACS in Digital Imaging and Communications in Medicine (DICOM) format and segmented using the ITK-SNAP software (www.itksnap.org, version 3.6.0). During segmentation, blood vessels with a diameter of >3 mm inside the tumor were excluded, as were secretions and impurities on the surface of the nasopharyngeal mucosa.
Two radiologists experienced in head and neck diagnosis, who were blinded to all clinical records of the patients, delineated the primary NPC slice by slice on the T2WI_FS images to obtain the three-dimensional volume of interest (VOI) of the primary tumor. Because the geometric parameters of T1WI_CE and T2WI_FS were the same, the VOI was copied directly to the T1WI_CE sequence in ITK-SNAP to obtain the VOI of the tumor on the T1WI_CE images.
In the current study, we used the FeAture Explorer software (ver. 0.3.6 on Python 3.7.6) for MRI texture feature extraction. A total of 144 texture features were extracted from the T2WI_FS and T1WI_CE sequences. The extracted radiomics features included the original, shape, first-order, gray-level co-occurrence matrix (GLCM), and gray-level run length matrix (GLRLM) features. The features extracted from the two radiologists' segmentations were tested for intra- and inter-observer consistency. The flow of our study is shown in Figure 1.
Before applying the machine learning model for dimensionality reduction of the texture features, we standardized the numerical values of all texture features using the Z-score method to remove differences in units and scale among the features.
Z-score = (x-μ)/σ,
where x is the mathematical value of the texture feature, μ represents the average value of this feature in all enrolled patients, and σ is the corresponding standard deviation.
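As an illustration of the formula above, the following minimal Python sketch standardizes the values of a single feature across patients. It uses only the standard library and is not the code used in this study; the function name is hypothetical, and the use of the population standard deviation (rather than the sample standard deviation) is an assumption, since the text does not specify which was used.

```python
from statistics import mean, pstdev


def z_score(values):
    """Standardize one feature across all patients: (x - mu) / sigma."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # A constant feature carries no information; map it to zeros
        # instead of dividing by zero.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]
```

After this transformation, each feature has a mean of 0 and a standard deviation of 1 across patients, so features measured in different units become directly comparable.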
2.6 Statistical Analysis
Statistical analysis was performed using SPSS 20.0 (SPSS Inc., Chicago, IL, USA) and R software (version 4.1.0, https://www.r-project.org). To analyze differences in clinical factors between the groups with different treatment responses, the independent t-test or Mann–Whitney U test was used for numerical variables, depending on whether the data were normally distributed, and the chi-square test was used for categorical variables. Intra-class correlation coefficient (ICC) values with 95% confidence intervals (CIs) were used to assess the inter-reader agreement of the quantitative measurements. Only features with an ICC > 0.7 were retained for radiomics analysis.
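The ICC-based feature filter can be sketched as follows. The text does not state which form of the ICC was computed, so this example assumes the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1); the function names are hypothetical, and the 95% CI is not computed here. The sketch is illustrative only and does not reproduce the SPSS or R procedure used in this study.

```python
def icc_2_1(ratings):
    """ICC(2,1) for a table with one row per patient and one column per reader."""
    n = len(ratings)      # number of patients
    k = len(ratings[0])   # number of readers
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)              # between-patient mean square
    msc = ss_cols / (k - 1)              # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def stable_features(feature_ratings, threshold=0.7):
    """Keep the names of features whose ICC exceeds the threshold."""
    return [name for name, table in feature_ratings.items()
            if icc_2_1(table) > threshold]
```

Here `feature_ratings` would map each feature name to a patients-by-readers table of that feature's values, and only features with ICC > 0.7 would pass to the selection steps.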
2.7 Feature Selection and Radiomics Signature Building
In this study, we labelled the responder group as 0 and the non-responder group as 1. First, we compared the features between the groups using the Mann–Whitney U test. Only the features that differed significantly between the two groups (P < 0.05) were passed to the next step of univariate logistic regression analysis. Second, univariate logistic regression was used to explore whether the features were discriminative between the two groups. Then, because of the redundancy among the features, we used the maximum relevance minimum redundancy (mRMR) algorithm for dimensionality reduction of the texture features that had been screened out. After redundant and irrelevant texture features were removed using the mRMR algorithm, only eight texture features were retained. For each patient, the Rad score of the primary tumor was calculated as the linear combination of the texture features selected by mRMR and the weighting coefficient corresponding to each feature. Finally, multivariable logistic regression was used to select the most predictive feature subset, and prediction models were constructed based on clinical factors and the Rad score, including radiomics models (only the Rad score), clinics models (only clinical factors), and combined models (the Rad score combined with clinical factors).
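The Rad score described above is a weighted sum of the selected features, which can be sketched in a few lines of Python. The function name, the feature names, and the optional intercept term are hypothetical and serve only to illustrate the calculation; the actual eight features and their coefficients are not reproduced here.

```python
def rad_score(features, weights, intercept=0.0):
    """Linear combination of the selected (standardized) texture features.

    features: feature name -> value for one patient
    weights:  feature name -> weighting coefficient (selected features only)
    intercept: optional constant term (assumption; 0.0 leaves the sum unchanged)
    """
    return intercept + sum(weights[name] * features[name] for name in weights)


# Hypothetical example with two placeholder features.
example = rad_score({"feature_a": 1.0, "feature_b": -2.0},
                    {"feature_a": 0.5, "feature_b": 0.25})
```

Only the features listed in `weights` contribute, so any features removed during selection can remain in the patient's feature dictionary without affecting the score.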
2.8 Model Performance Evaluation
A 10-fold cross-validation method was used to train and validate the machine learning classifiers and prediction models to avoid over-fitting. Specifically, the data were divided into 10 approximately equal parts; in each of 10 rounds, 9 parts were used for training and the remaining part was used for validation. The average of the 10 results was used as an estimate of the accuracy of the classifier or prediction model.
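The cross-validation procedure can be sketched as follows, using only the Python standard library. This is an illustration rather than the implementation used in this study: the function names, the random seed, and the `fit_and_score` callback are hypothetical, and the split is not stratified by response group because the text does not say whether stratification was applied. With 65 patients the parts cannot be exactly equal, so five folds contain 7 patients and five contain 6.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, validation) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)       # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]  # k nearly equal parts
    for i in range(k):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


def cross_validated_score(n_samples, fit_and_score, k=10, seed=0):
    """Average the score returned by fit_and_score(train, validation) over k folds."""
    scores = [fit_and_score(train, validation)
              for train, validation in k_fold_indices(n_samples, k, seed)]
    return sum(scores) / len(scores)
```

Each patient appears in exactly one validation fold, so every prediction used in the averaged estimate comes from a model that did not see that patient during training.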
Receiver operating characteristic (ROC) curves were plotted to evaluate and compare the predictive ability of the radiomics model, clinics model, and combined model. To quantify model performance, we calculated the area under the ROC curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV). We also constructed a nomogram and plotted calibration curves, which were used to measure the consistency between the predicted and actual RHES response probabilities. Finally, for the three models, goodness-of-fit was assessed using the Hosmer–Lemeshow test, and decision curves were obtained.
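The performance measures listed above can be computed from predicted labels and scores as in the following minimal Python sketch. It is illustrative only and is not the SPSS or R code used in this study; the function names are hypothetical. As an assumption, the class labelled 1 is treated as the positive class, and the AUC is computed with the rank-based (Mann–Whitney) formulation, in which ties count as one half.

```python
def confusion_metrics(y_true, y_pred, positive=1):
    """Sensitivity, specificity, accuracy, PPV, and NPV from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "accuracy": ratio(tp + tn, len(pairs)),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


def auc(y_true, scores, positive=1):
    """Probability that a positive case scores higher than a negative case."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In a cross-validated setting, these functions would be applied to the held-out predictions of each model so that the three models are compared on the same patients.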