Patients and Treatment
The institutional review board approved this retrospective analysis of MR imaging data and waived the requirement for informed consent. Eligibility criteria were (a) treatment with Gamma Knife radiotherapy for NFPA from August 2009 through August 2012; (b) availability of MRI scans obtained 12-18 months after radiotherapy; and (c) an identical MR sequence protocol that included CE-T1. Each patient was treated with a Gamma Knife (Elekta Instrument AB, Sweden) at a dose of 9-12 Gy in a single fraction.
Patients were randomly assigned to the training and test sets at a 2:1 ratio using computer-generated random numbers.
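A minimal sketch of such a 2:1 random assignment is given below. The function name, the seed, and the cohort size of 90 patients are illustrative assumptions and are not taken from the study.

```python
import random

def split_patients(patient_ids, train_fraction=2 / 3, seed=0):
    """Randomly assign patient IDs to training and test sets (about 2:1)."""
    rng = random.Random(seed)        # fixed seed (assumed) makes the split reproducible
    shuffled = list(patient_ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical cohort of 90 patient IDs, for illustration only.
train_ids, test_ids = split_patients(range(1, 91))
print(len(train_ids), len(test_ids))
```

Splitting at the patient level, before any feature processing, keeps the test set independent of every later fitting step.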
MR Image Acquisition and Segmentation
All MR images were acquired on a 3-T MRI system (Siemens) during routine clinical visits. Patients whose MRI data were of poor quality because of motion artifacts or inadequate contrast injection were excluded. A neurosurgeon manually contoured the regions of interest on the CE-T1 images using ITK-SNAP[19] (www.itksnap.org), because lesions are more easily detected after contrast agent injection than on other scan types. A second neurosurgeon and a radiologist then reviewed the contours to ensure correct segmentation.
Radiomics Feature Extraction
To reveal phenotypic differences between patients who were sensitive to radiotherapy and those who were not, we loaded each original MR image and the corresponding mask segmented by the neurosurgeon into radiomics feature extraction software implemented with the PyRadiomics package[20] (https://github.com/Radiomics/pyradiomics) in Python 3.6.4 (https://www.python.org).
Following prior work[13, 21, 22], we extracted a large panel of radiomic features quantifying five kinds of phenotypic characteristics: first-order statistics, shape descriptors, and three families of texture features, namely gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), and gray level size zone matrix (GLSZM) features.
Feature Engineering
As the first step of feature engineering, each radiomics feature was Z-score standardized in the training set, and the test set was then standardized using the mean and standard deviation from the training set. To screen out uninformative features, we assessed the association of each radiomics feature with sensitivity status using the Mann-Whitney U test and discarded features that did not differ significantly between the sensitive and insensitive groups (P < 0.05 was considered statistically significant). To minimize redundancy, we calculated the Pearson correlation coefficient between each pair of features and, for pairs with a coefficient greater than 0.8, retained only one of the two features. To further reduce the risk of overfitting, we applied support vector machine (SVM) based recursive feature elimination (RFE), with the number of selected features tuned automatically by cross-validation. This left a small number of key features for constructing the classification model.
All of the above steps were fitted on the training set only; the test set simply adopted the resulting standardization parameters and selected features.
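The four steps can be sketched with NumPy, SciPy, and scikit-learn as follows. This is an illustration under stated assumptions, not the study's actual code: the function name, the linear kernel, the use of the absolute correlation, the 3-fold setting and AUC scoring inside RFE, and the synthetic data are all assumed here.

```python
import numpy as np
from scipy.stats import mannwhitneyu
from sklearn.feature_selection import RFECV
from sklearn.svm import SVC

def select_features(X_train, y_train, X_test, alpha=0.05, r_max=0.8, cv=3):
    """Standardize, filter, and select features using the training set only."""
    # 1. Z-score with the training-set mean and SD, reused for the test set.
    mean, std = X_train.mean(axis=0), X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant features
    Z_train, Z_test = (X_train - mean) / std, (X_test - mean) / std

    # 2. Mann-Whitney U test per feature; keep features with P < alpha.
    keep = [j for j in range(Z_train.shape[1])
            if mannwhitneyu(Z_train[y_train == 1, j], Z_train[y_train == 0, j],
                            alternative="two-sided").pvalue < alpha]

    # 3. Pearson filter: of any pair with |r| > r_max, keep the earlier feature
    #    (using the absolute value is an assumption of this sketch).
    corr = np.atleast_2d(np.abs(np.corrcoef(Z_train[:, keep], rowvar=False)))
    kept = []
    for a in range(len(keep)):
        if all(corr[a, b] <= r_max for b in kept):
            kept.append(a)
    keep = [keep[a] for a in kept]

    # 4. SVM-based RFE; cross-validation picks the number of features.
    rfecv = RFECV(SVC(kernel="linear"), step=1, cv=cv, scoring="roc_auc")
    rfecv.fit(Z_train[:, keep], y_train)
    selected = [j for j, ok in zip(keep, rfecv.support_) if ok]
    return Z_train[:, selected], Z_test[:, selected], selected

# Synthetic data for illustration: 60 cases, 6 features.
rng = np.random.default_rng(0)
y = np.array([0, 1] * 30)
X = rng.normal(size=(60, 6))
X[:, 0] += 2.0 * y                                # informative feature
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=60)    # near-duplicate of feature 0
X[:, 2] -= 2.0 * y                                # second informative feature
Z_tr, Z_te, selected = select_features(X[:40], y[:40], X[40:])
print(selected)
```

Because the mean, standard deviation, and selected columns all come from the training rows, the test rows never influence any fitted quantity.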
Model Construction and Radiomics Score Calculation
The classification model was a support vector machine (SVM) built on the training set from the selected radiomics features. The SVM penalty parameter C was chosen automatically by 3-fold cross-validation. Model performance was assessed with the receiver operating characteristic (ROC) curve and the area under the curve (AUC). An AUC of 0.5 indicates no discriminative ability, and an AUC of 1.0 indicates perfect discrimination.
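One way to tune C by 3-fold cross-validation and then compute the ROC curve and AUC is sketched below with scikit-learn. The candidate grid for C, the linear kernel, the AUC scoring used during tuning, and the synthetic data are assumptions made for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

def fit_svm(Z_train, y_train, c_grid=(0.01, 0.1, 1, 10, 100)):
    """Tune C by 3-fold cross-validation, then refit on the full training set."""
    search = GridSearchCV(SVC(kernel="linear"), {"C": list(c_grid)},
                          cv=3, scoring="roc_auc")
    search.fit(Z_train, y_train)
    return search.best_estimator_

# Synthetic standardized features: 60 training and 30 test cases.
rng = np.random.default_rng(1)
y = np.array([0, 1] * 45)
Z = rng.normal(size=(90, 3)) + 1.5 * y[:, None]

model = fit_svm(Z[:60], y[:60])
scores = model.decision_function(Z[60:])   # signed distance to the hyperplane
auc = roc_auc_score(y[60:], scores)
fpr, tpr, _ = roc_curve(y[60:], scores)    # points of the ROC curve
print(model.C, round(auc, 3))
```

The AUC is computed from the continuous decision values rather than from predicted labels, so it summarizes the ranking of cases across all thresholds.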
To illustrate the model's performance, we calculated a radiomics score for each case as a linear combination of the selected features weighted by their corresponding coefficients.
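A linear score of this kind can be written in a few lines. The weights, the intercept, and the feature values below are hypothetical. With a linear-kernel SVM in scikit-learn, the weights and intercept would typically be read from the fitted model's `coef_` and `intercept_` attributes, but that mapping is an assumption here, since the text does not state the kernel.

```python
def radiomics_score(features, weights, intercept=0.0):
    """Weighted sum of standardized feature values plus an intercept."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return intercept + sum(w * x for w, x in zip(weights, features))

# Hypothetical values for one case with three selected features.
case_features = [0.5, -1.2, 2.0]
model_weights = [0.8, -0.5, 0.25]
score = radiomics_score(case_features, model_weights, intercept=-0.1)
print(score)
```

Under this convention a higher score places a case further on the positive side of the decision boundary.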
Statistical Analysis
Statistical analysis was performed in Python (version 3.6.4, https://www.python.org). The scikit-learn package (https://scikit-learn.org) was used to implement the feature engineering procedures and to construct the SVM model, and the Matplotlib package (https://matplotlib.org/) was used to plot the figures. A two-sided P-value below 0.05 was considered statistically significant.