This study was approved by our institutional review board, and the requirement for informed consent was waived because of its retrospective nature. From January 2015 to July 2017, patients with confirmed ODGs were retrospectively and consecutively recruited. Tumors were classified according to the 2007 WHO classification, or the 2016 WHO guidelines when sufficient information was available. The inclusion criteria were: (1) the patient underwent a preoperative conventional MRI scan; and (2) the patient underwent gross total or subtotal tumor resection, and a confirmatory pathological diagnosis was made. Thirty-six patients with T1CE images were included (19 men, 17 women; mean age = 45 years; age range = 9 - 65 years) and classified into two groups: ODG2 (n = 19; mean age = 46 years; age range = 10 - 65 years) and ODG3 (n = 17; mean age = 44 years; age range = 9 - 65 years). Of these 36 patients, the 33 with FLAIR images were enrolled for the FLAIR analysis (18 men, 15 women; mean age = 45 years; age range = 9 - 65 years) and classified into two groups: ODG2 (n = 17; mean age = 45 years; age range = 10 - 65 years) and ODG3 (n = 16; mean age = 45 years; age range = 9 - 65 years). Patient selection is summarized in Figure 1.
MRI Data Acquisition
All patients underwent 3-T MR scanning (Discovery MR750, General Electric Medical System, Milwaukee, WI, USA) with an 8-channel head coil (General Electric Medical System). The initial routine scan sequences for each patient included T1-weighted imaging (T1WI) performed before and after contrast enhancement, axial T2-weighted imaging (T2WI), and transverse FLAIR to assist with diagnosis.
The parameters of the conventional MRI sequences were as follows: T1WI with gradient echo (TR/TE, 1750 ms/24 ms; matrix size, 256 × 256; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm), T2WI with turbo spin-echo (TR/TE, 4247 ms/93 ms; matrix size, 512 × 512; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm) and sagittal T2WI (TR/TE, 10,639 ms/96 ms; matrix size, 384 × 384; FOV, 24 × 24 cm; number of excitations, 2; slice thickness, 5 mm; gap, 1.0 mm). Axial FLAIR was obtained with the following parameters: TR/TE, 8000 ms/165 ms; matrix size, 256 × 256; FOV, 24 × 24 cm; number of excitations, 1; slice thickness, 5 mm; gap, 1.5 mm.
Finally, T1CE was performed after intravenous bolus injection of gadodiamide (Omniscan; GE Healthcare, Co. Cork, Ireland) at a dose of 0.1 mmol/kg body weight. The parameters of T1CE with volumetric interpolated breath-hold examination (VIBE) were as follows: TR/TE, 8.2 ms/3.2 ms; TI, 450 ms; flip angle, 12°; section thickness, 1.2 mm; FOV, 24 × 24 cm; matrix size, 256 × 256; number of excitations, 1; image number, 140.
Tumor Segmentation or Delineation
Two neuroradiologists (S.S.Z, with 8 years of experience, and L.F.Y, with 12 years of experience in neuro-oncology imaging) independently reviewed all images. When their readings disagreed, a third senior neuroradiologist (G.B.C, with 25 years of experience in neuro-oncology imaging) re-examined the images and determined the final imaging diagnosis. The preoperative conventional imaging features of the tumors were recorded according to the criteria outlined in Supplementary Table 1 (online).
The volumes of interest (VOIs) were semi-automatically segmented using ITK-SNAP (version 3.6, http://www.itk-snap.org) by two neuroradiologists (S.S.Z and L.F.Y). The VOIs covering the enhanced lesion were drawn slice by slice on T1CE and co-registered to the FLAIR images, avoiding regions of macroscopic necrosis, cyst, edema and non-tumor macrovessels.
Feature extraction
Texture features included 162 first-order features, 216 gray level co-occurrence matrix (GLCM) features, 144 gray level run length matrix (GLRLM) features, 144 gray level size zone matrix (GLSZM) features, 126 gray level dependence matrix (GLDM) features, 45 neighborhood gray-tone difference matrix (NGTDM) features and 14 shape features. A total of 1072 features were extracted from the T1CE and FLAIR images using 3D Slicer software. We used these features because our previous MR imaging studies found them relevant for distinguishing ODG2 from ODG3.
Feature selection
After being centered and scaled, the features were subjected to a two-step selection procedure to remove highly redundant and correlated features. First, highly correlated features were eliminated using Pearson correlation analysis, with an r threshold of 0.75. Then, a random forest (RF) classifier consisting of a number of decision trees was used to rank feature importance. Every node in a decision tree is a condition on a single feature, designed to split the dataset into two so that similar response values end up in the same set. The measure by which the optimal condition is chosen is called impurity. For classification, it is typically either Gini impurity or information gain/entropy. When a tree is trained, one can therefore compute how much each feature decreases the weighted impurity in that tree. For the forest, the impurity decrease from each feature is averaged over the trees, and the features are ranked by this average. In our study, the Gini impurity decrease was used as the measure of feature importance.
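The two selection steps can be illustrated with a minimal standard-library Python sketch. It is not the pipeline used in the study (which ran in R): the function names, the toy feature table and the greedy rule for choosing which member of a correlated pair to drop are assumptions for illustration, and the impurity decrease is shown for a single split rather than averaged over a forest.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:  # constant feature: correlation undefined, treat as 0
        return 0.0
    return cov / math.sqrt(vx * vy)

def drop_correlated(features, threshold=0.75):
    """Greedy filter (assumed rule): keep a feature only if its |r| with
    every feature kept so far does not exceed the threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def gini_decrease(values, labels, cut):
    """Impurity decrease for one split 'value <= cut' on one feature."""
    left = [l for v, l in zip(values, labels) if v <= cut]
    right = [l for v, l in zip(values, labels) if v > cut]
    n = len(labels)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(labels) - weighted

# Toy data (hypothetical): f2 is nearly a multiple of f1, f3 is unrelated.
features = {
    "f1": [1, 2, 3, 4, 5, 6],
    "f2": [2.1, 3.9, 6.2, 8.0, 9.8, 12.1],
    "f3": [5, 1, 4, 2, 6, 3],
}
labels = ["ODG2", "ODG2", "ODG2", "ODG3", "ODG3", "ODG3"]

kept = drop_correlated(features)  # f2 is removed as redundant with f1
ranking = sorted(kept, key=lambda f: gini_decrease(features[f], labels, 3.5),
                 reverse=True)
print(kept, ranking)
```

In a forest, the per-split decreases attributed to a feature would be accumulated within each tree and then averaged across trees before ranking.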
Radiomics model building
The 30 most important features were fed into a conditional inference RF classifier to build the model. Five-fold cross-validation was used to tune the number of RF trees. The five-fold cross-validation, including pre-processing, feature selection and model construction, was performed 3 times to limit bias and overfitting as far as possible, and the final results were the average of the 3 runs. No feature selection was applied to the combination of T1CE and FLAIR during model building. Accuracy, sensitivity and specificity were computed to evaluate classification performance. The receiver operating characteristic (ROC) curve was also constructed to obtain the area under the ROC curve (AUC); the larger the AUC, the better the classification. The whole procedure of feature extraction and machine learning is summarized in Figure 2.
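The repeated cross-validation scheme can be sketched as follows. This is an illustrative standard-library Python outline, not the study's R code: `fit_threshold` is a hypothetical stand-in for the full per-fold pipeline (scaling, feature selection, RF training), treating ODG3 as the positive class is an assumption, and the folds are not stratified by grade. The point it shows is that everything fitted to data is refitted on the training folds only, and that metrics are averaged over the 3 repeats.

```python
import random

def kfold_indices(n, k, seed):
    """Shuffle 0..n-1 and deal the indices into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def metrics(y_true, y_pred, positive="ODG3"):
    """Accuracy, sensitivity, specificity with `positive` as target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    acc = (tp + tn) / len(pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return acc, sens, spec

def fit_threshold(x, y):
    """Hypothetical stand-in for the per-fold pipeline: classify by the
    midpoint between class means (assumes ODG3 has the higher values)."""
    g2 = [v for v, c in zip(x, y) if c == "ODG2"]
    g3 = [v for v, c in zip(x, y) if c == "ODG3"]
    cut = (sum(g2) / len(g2) + sum(g3) / len(g3)) / 2
    return lambda v: "ODG3" if v > cut else "ODG2"

def repeated_cv(x, y, k=5, repeats=3):
    """Run k-fold CV `repeats` times; return metrics averaged over repeats."""
    runs = []
    for r in range(repeats):
        preds = [None] * len(x)
        for test in kfold_indices(len(x), k, seed=r):
            held_out = set(test)
            train = [i for i in range(len(x)) if i not in held_out]
            # Fit on training folds only, so no test information leaks in.
            model = fit_threshold([x[i] for i in train], [y[i] for i in train])
            for i in test:
                preds[i] = model(x[i])
        runs.append(metrics(y, preds))
    return tuple(sum(m) / len(runs) for m in zip(*runs))

# Toy, well-separated data (hypothetical values, not study data).
x = list(range(1, 11)) + list(range(101, 111))
y = ["ODG2"] * 10 + ["ODG3"] * 10
print(repeated_cv(x, y))
```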
Radiologist’s assessment
To compare the performance of neuroradiologists and machine learning in differentiating ODG2 from ODG3, the images were also independently assessed by three junior neuroradiologists (X.L.F, G.X and Y.H, with 6, 7 and 7 years of neuroradiology experience, respectively). The neuroradiologists were blinded to the clinical information but were aware that the tumors were either ODG2 or ODG3, without knowing the exact number of patients with each entity. The three readers assessed only conventional MR images (T1WI, T2WI, FLAIR and T1CE) and recorded the final diagnosis using a 4-point scale (1 = definite ODG2; 2 = likely ODG2; 3 = likely ODG3; and 4 = definite ODG3).
Fisher's exact test or the chi-square test was used for categorical variables, and the unpaired Student's t test was used for continuous variables, to compare the ODG2 and ODG3 groups. The statistical analyses of clinical characteristics were performed using SPSS 20.0 software (SPSS Inc., Chicago, IL, USA).
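For readers unfamiliar with Fisher's exact test, the following standard-library Python sketch shows the two-sided calculation for a 2 × 2 table. It only illustrates the test itself and is not the SPSS procedure used in this study; the function name and the example table are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]].

    With the margins fixed, the top-left cell follows a hypergeometric
    distribution; the p value sums the probabilities of all tables that
    are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with p_obs survive rounding error.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical table: a binary imaging feature (rows) by tumor grade (columns).
print(fisher_exact_two_sided(3, 0, 0, 3))
```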
The statistical analyses for machine learning were performed using R version 3.4.2 (R Foundation for Statistical Computing). An RF analysis was performed to train the machine-learning classifier. The goal of machine learning was to build a model to differentiate ODG2 from ODG3 based on radiomics features of T1CE and FLAIR images. The following R packages were used: the randomForest package for feature ranking, and the caret and unbalanced packages for RF classification. Classifier performance was assessed using accuracy, sensitivity and specificity. The AUC values were also calculated for the three readers and compared with that of the radiomics classifier. A P value < 0.05 was considered statistically significant.
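A reader's AUC can be obtained directly from the 4-point scores, because the AUC equals the probability that a randomly chosen ODG3 case receives a higher score than a randomly chosen ODG2 case, with ties counted as one half. The standard-library Python sketch below illustrates this; it is not the R code used in the study, and the function name and example ratings are hypothetical.

```python
def auc_from_scores(scores, labels, positive="ODG3"):
    """AUC via pairwise comparison (Mann-Whitney formulation).

    Every positive/negative pair contributes 1 if the positive case has
    the higher score, 0.5 if the scores tie, and 0 otherwise.
    """
    pos = [s for s, l in zip(scores, labels) if l == positive]
    neg = [s for s, l in zip(scores, labels) if l != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical reader scores on the 4-point scale
# (1 = definite ODG2 ... 4 = definite ODG3).
reader_scores = [1, 2, 2, 4]
truth = ["ODG2", "ODG3", "ODG2", "ODG3"]
print(auc_from_scores(reader_scores, truth))  # 0.875
```

Because the scale has only four levels, ties between cases are common, which is why they are given half credit rather than being ignored.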