Patients
This study was approved by our institutional review board, and the requirement for informed consent was waived because of its retrospective nature. Between January 2015 and July 2017, consecutive patients with surgically confirmed ODG2 and ODG3 were retrospectively recruited. Tumors were classified according to the 2007 WHO classification, or according to the 2016 WHO guidelines when sufficient information was available. The inclusion criteria were as follows: (1) the patient underwent a preoperative conventional MRI scan; and (2) the patient underwent gross total or subtotal resection of the lesion, and a confirmed pathological diagnosis was made. Thirty-six patients were included (19 men, 17 women; mean age = 45 years; age range = 9–65 years) and classified into two groups: ODG2 (n = 19; mean age = 46 years; age range = 10–65 years) and ODG3 (n = 17; mean age = 44 years; age range = 9–65 years). The patient selection process is summarized in Figure 1.
MRI Data Acquisition
All patients underwent 3-T MR scanning (Discovery MR750, General Electric Medical System, Milwaukee, WI, USA) with an 8-channel head coil (General Electric Medical System). The conventional MR protocol included T1-weighted imaging (T1WI) performed before and after contrast enhancement (the latter denoted T1CE), axial T2-weighted imaging (T2WI), and transverse fluid-attenuated inversion recovery (FLAIR) imaging.
The parameters of the conventional MRI sequences were as follows: T1WI with gradient echo (TR/TE, 1750 ms/24 ms; matrix size, 256 × 256; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm), T2WI with turbo spin-echo (TR/TE, 4247 ms/93 ms; matrix size, 512 × 512; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm), and sagittal T2WI (TR/TE, 10,639 ms/96 ms; matrix size, 384 × 384; FOV, 24 × 24 cm; number of excitations, 2; slice thickness, 5 mm; gap, 1.0 mm). Axial FLAIR was obtained with the following parameters: TR/TE, 8000 ms/165 ms; matrix size, 256 × 256; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm.
Finally, T1CE was performed after intravenous bolus injection of gadodiamide (Omniscan; GE Healthcare, Co. Cork, Ireland) at a dose of 0.1 mmol/kg body weight. The parameters of T1CE with volumetric interpolated breath-hold examination (VIBE) were as follows: TR/TE, 8.2 ms/3.2 ms; TI, 450 ms; flip angle, 12°; section thickness, 1.2 mm; FOV, 24 × 24 cm; matrix size, 256 × 256; number of excitations, 1; image number, 140.
Tumor Segmentation or Delineation
Two neuroradiologists (S.S.Z., with 8 years of experience, and L.F.Y., with 12 years of experience in neuro-oncology imaging) independently reviewed all images. A third, senior neuroradiologist (G.B.C., with 25 years of experience in brain tumor imaging) re-examined the images and made the final imaging diagnosis when the two neuroradiologists disagreed. The preoperative conventional imaging features of the tumors were recorded according to the criteria outlined in Supplementary Table 1 (online).
The volumes of interest (VOIs) were semi-automatically segmented using ITK-SNAP (version 3.6, http://www.itk-snap.org) by the two neuroradiologists (S.S.Z. and L.F.Y.). The VOIs covering the enhancing lesion were drawn slice by slice on T1CE, avoiding regions of macroscopic necrosis, cyst, edema, and non-tumor macrovessels [20].
Radiomics Strategy
Feature extraction A total of 1044 features were extracted from the T1CE images using Analysis-Kinetics (A.K., GE Healthcare) software: 42 histogram features, 432 gray level co-occurrence matrix (GLCM) features, 540 gray level run length matrix (GLRLM) features, 11 gray level size zone matrix (GLSZM) features, 9 form factor features, and 10 Haralick features. We used these features because they were found to be relevant for distinguishing ODG2 from ODG3 on MR imaging in our previous work [16].
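To illustrate what a GLCM feature measures, the following minimal Python sketch builds a co-occurrence matrix for a toy 4 × 4 image and derives two common descriptors from it. It is not the A.K. software's implementation: the offset (one pixel to the right), the symmetric counting, the four gray levels, and the function names glcm, glcm_contrast, and glcm_energy are all assumptions made for illustration.

```python
from collections import Counter

def glcm(image, dx=1, dy=0, symmetric=True):
    """Normalised co-occurrence probabilities of gray-level pairs at offset (dx, dy)."""
    counts = Counter()
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[(i, j)] += 1
                if symmetric:
                    # Count the pair in both directions so the matrix is symmetric.
                    counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}

def glcm_contrast(p):
    """Contrast: large when neighbouring pixels differ strongly in gray level."""
    return sum(prob * (i - j) ** 2 for (i, j), prob in p.items())

def glcm_energy(p):
    """Energy (angular second moment): large when few pairs dominate."""
    return sum(prob ** 2 for prob in p.values())

# Toy image with four gray levels (0-3); not patient data.
toy = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(toy)
contrast = glcm_contrast(p)
energy = glcm_energy(p)
```

In practice the matrix would be computed within the segmented VOI, typically for several offsets and directions, which is one reason the number of GLCM features is large.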
Feature selection After being centered and scaled, the features were subjected to a two-step selection procedure to remove redundancy. First, highly correlated features were eliminated using Pearson correlation analysis with an r threshold of 0.75. Then, a random forest (RF) classifier, which consists of multiple decision trees, was used to rank feature importance. Each node in a decision tree is a condition on a single feature, chosen to split the dataset in two so that similar response values end up in the same subset. The measure by which the (locally) optimal condition is chosen is called impurity; for classification, it is typically either Gini impurity or information gain (entropy). When a tree is trained, one can therefore compute how much each feature decreases the weighted impurity in that tree. For the forest as a whole, the impurity decrease attributable to each feature is averaged over the trees, and the features are ranked by this average. In our study, the decrease in Gini impurity was used as the measure of feature importance.
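The two selection steps can be sketched in plain Python as follows. This is an illustrative sketch rather than the R code used in the study. Three points are assumptions: the threshold is applied to the absolute value of r, the first feature encountered in a correlated pair is the one retained (the text does not state which member is dropped), and the Gini decrease is shown for a single split on a single feature rather than averaged over a forest. All names and data are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature has no defined correlation
    return sxy / math.sqrt(sxx * syy)

def drop_correlated(features, threshold=0.75):
    """Greedy filter: keep a feature only if |r| <= threshold with every kept feature."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(values, labels, cut):
    """Impurity of the parent node minus the size-weighted impurity of its two children."""
    left = [lab for v, lab in zip(values, labels) if v <= cut]
    right = [lab for v, lab in zip(values, labels) if v > cut]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Hypothetical toy data: f2 is almost a multiple of f1, f3 is unrelated.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2, 4, 6, 8, 10, 12.5],
    "f3": [5, 1, 4, 2, 6, 3],
}
labels = ["ODG2", "ODG2", "ODG2", "ODG3", "ODG3", "ODG3"]

kept = drop_correlated(features)
importance = {name: gini_decrease(features[name], labels, cut=3.5) for name in kept}
ranking = sorted(importance, key=importance.get, reverse=True)
```

Here f2 is removed by the correlation filter, and f1 ranks above f3 because splitting on f1 separates the two grades perfectly in the toy data.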
Radiomics model building The 30 most important features were fed into a conditional inference RF classifier to build the model [21]. Five-fold cross-validation was used to tune the number of RF trees as a hyperparameter. The five-fold cross-validation, including pre-processing, feature selection, and model construction, was repeated 3 times to limit bias and overfitting as far as possible, and the final results were the average over the 3 repetitions. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were computed to evaluate classification performance. A receiver operating characteristic (ROC) curve was also constructed to obtain the area under the ROC curve (AUC); a larger AUC indicates better classification [22]. The whole procedure of feature extraction and machine learning is depicted in Figure 2.
Radiologist’s assessment To compare the performance of neuroradiologists and machine learning in differentiating ODG2 from ODG3, the images were also evaluated by three junior neuroradiologists (X.L.F., G.X., and Y.H., with 6, 7, and 7 years of experience in neuroradiology, respectively). The neuroradiologists were blinded to clinical information.
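The structure of the repeated cross-validation can be sketched as follows. The important point is that pre-processing, feature selection, and model fitting all happen inside each training fold, so the held-out fold never influences them. This is a schematic under stated assumptions, not the caret workflow used in the study: stratification by grade is assumed (the text does not state it), and majority_baseline is a hypothetical stand-in for the real pipeline.

```python
import random

def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds with similar class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for idx, lab in enumerate(labels):
        by_class.setdefault(lab, []).append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for lab in sorted(by_class):
        idxs = by_class[lab]
        rng.shuffle(idxs)
        # Deal the shuffled indices round-robin, continuing where the last class stopped.
        for pos, idx in enumerate(idxs):
            folds[(offset + pos) % k].append(idx)
        offset += len(idxs)
    return folds

def repeated_cv(labels, fit_and_score, k=5, repeats=3):
    """Average a score over `repeats` independent k-fold splits."""
    scores = []
    for rep in range(repeats):
        for test_idx in stratified_folds(labels, k, seed=rep):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(labels)) if i not in held_out]
            # fit_and_score must scale, select features, and fit on train_idx only.
            scores.append(fit_and_score(train_idx, test_idx))
    return sum(scores) / len(scores)

def majority_baseline(labels):
    """Hypothetical placeholder pipeline: predict the majority class of the training fold."""
    def fit_and_score(train_idx, test_idx):
        train = [labels[i] for i in train_idx]
        majority = max(sorted(set(train)), key=train.count)
        return sum(labels[i] == majority for i in test_idx) / len(test_idx)
    return fit_and_score

# Group sizes follow the cohort description (19 ODG2, 17 ODG3).
labels = ["ODG2"] * 19 + ["ODG3"] * 17
mean_accuracy = repeated_cv(labels, majority_baseline(labels))
```

Replacing majority_baseline with a function that centers and scales, filters, ranks, and fits an RF on the training indices would give the nested arrangement described above.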
Statistical Analysis
The Fisher exact test or the chi-square test was used for categorical variables, and the unpaired Student t test was used for continuous variables, to compare the ODG2 and ODG3 groups. Statistical analyses of clinical characteristics were performed using SPSS 20.0 software (SPSS Inc., Chicago, IL, USA).
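For readers unfamiliar with the Fisher exact test, the following sketch computes a two-sided P value for a 2 × 2 table directly from the hypergeometric distribution. It is an illustration only and not the SPSS procedure. The two-sided rule used here (summing all tables no more probable than the observed one) is one common convention and is an assumption, and the table is a textbook toy example rather than study data.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the 2 x 2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table at most as probable as the observed one (small tolerance for rounding).
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)

# Toy table (not study data): 3 vs 1 in one group, 1 vs 3 in the other.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```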
The machine-learning analyses were performed using R version 3.4.2 (R Foundation for Statistical Computing). An RF analysis was performed to train the machine-learning classifier, with the goal of building a model that differentiates ODG2 from ODG3 based on radiomics features of the T1CE images. The following R packages were used: the randomForest package for feature ranking, and the caret and unbalanced packages for RF classification. Classifier performance was assessed using accuracy, sensitivity, and specificity. AUC values were also calculated for the three readers and compared with that of the radiomics classifier. A P value < 0.05 was considered statistically significant.
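The performance measures named above can be computed from predicted labels and scores as in the following sketch. It is written in Python for illustration and is not the R code used in the study. Treating ODG3 as the positive class is an assumption, as are the 0.5 decision threshold and the toy scores; the AUC is computed as the proportion of positive-negative pairs ranked correctly, with ties counted as one half.

```python
def _ratio(num, den):
    """Safe division: undefined ratios are reported as NaN."""
    return num / den if den else float("nan")

def confusion_metrics(y_true, y_pred, positive="ODG3"):
    """Accuracy, sensitivity, specificity, PPV, and NPV for a binary task."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }

def auc(y_true, scores, positive="ODG3"):
    """AUC as the fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return _ratio(wins, len(pos) * len(neg))

# Toy example (not study data): scores are hypothetical probabilities of ODG3.
y_true = ["ODG3", "ODG3", "ODG3", "ODG2", "ODG2", "ODG2"]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = ["ODG3" if s >= 0.5 else "ODG2" for s in scores]

metrics = confusion_metrics(y_true, y_pred)
area = auc(y_true, scores)
```

The same auc function applies to a reader's graded confidence ratings, which is what makes a reader-versus-classifier comparison of AUC values possible.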