Patient information
The Institutional Review Board approved the retrospective review of the medical records for this analysis. Participants were selected according to the inclusion and exclusion criteria and were limited to patients treated at our hospital between January 2015 and June 2018, yielding a total of 86 patients. The inclusion criteria were as follows: (I) the patient underwent surgical lung resection and systematic LN dissection within two weeks after non-contrast and contrast-enhanced CT scans; (II) the tumor subtype and LN status were confirmed by pathology; and (III) multiple tumors and other manifestations were absent. Because of its high cost and limited availability, PET/CT was not required as a preoperative examination for eligibility in this retrospective study. The exclusion criteria were as follows: (I) clinical data were incomplete or statistical analysis could not be performed; (II) the patient received treatment before the scans were performed; (III) poor image quality affected the quantitative analysis; and (IV) CT images were reconstructed using different algorithms, slice thicknesses, or equipment.
The enrolled patients were then divided into two independent cohorts by treatment date: the 61 patients treated between January 2015 and June 2017 constituted the training cohort, and the 25 patients treated between July 2017 and June 2018 constituted the validation cohort. Tumor subtype and lymph node status were confirmed by pathology, and clinical factors, including gender and stage, were obtained from the medical records. Disease stage was evaluated according to the TNM Classification of Malignant Tumors, 7th Edition.
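The date-based cohort assignment described above can be sketched as follows. This is a minimal illustration, not the authors' code: the patient identifiers, the treatment dates, and the function name `split_by_date` are hypothetical, and only the cut-off (validation from July 2017 onward) is taken from the text.

```python
from datetime import date

# Hypothetical records (patient_id, treatment_date); values are illustrative only.
patients = [
    ("P001", date(2015, 1, 20)),
    ("P002", date(2017, 6, 30)),
    ("P003", date(2017, 7, 1)),
    ("P004", date(2018, 6, 15)),
]

# Cut-off from the text: training through June 2017, validation from July 2017.
VALIDATION_START = date(2017, 7, 1)


def split_by_date(records, validation_start=VALIDATION_START):
    """Assign each patient to a cohort according to treatment date."""
    training = [pid for pid, treated in records if treated < validation_start]
    validation = [pid for pid, treated in records if treated >= validation_start]
    return training, validation


training_ids, validation_ids = split_by_date(patients)
print(training_ids, validation_ids)
```

Splitting on a calendar cut-off rather than at random keeps the validation cohort temporally separate from the training cohort.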
CT image acquisition
All patients underwent non-contrast and contrast-enhanced CT scanning on a Philips scanner (Holland, CT LightSpeed 16) with the following imaging protocol: tube voltage 120 kV, tube current 300 mA, slice thickness 2 mm, and in-plane resolution 0.97×0.97. The contrast medium was injected into the antecubital vein at a rate of 2.3 to 3.0 mL/s, with a maximum dose of 100 mL. An arterial phase scan was performed 25 to 30 seconds after contrast medium injection, and a venous phase scan was performed 90 seconds later. Plain, arterial, and venous phase images were obtained. All images were exported in Digital Imaging and Communications in Medicine (DICOM) format for image feature extraction.
Radiomics workflow
The radiomics workflow included: (1) image segmentation, (2) feature extraction, (3) feature selection, and (4) predictive model building.
Lesion segmentation
We performed manual segmentation on arterial phase CT images using MIM Maestro version 6.8.2 (MIM software, Cleveland, OH), and pathologically confirmed LNs were defined as regions of interest (ROIs). With the arterial phase CT image as the reference, the plain and venous phase CT images were aligned to it by nonrigid registration, and the contours were then mapped onto the plain and venous phase images. The ROIs were delineated by two senior radiologists with 20 years of experience in chest CT diagnosis, and any disagreements were resolved by a third senior radiologist. Figure 1 shows schematic diagrams of the ROIs on CT images from the three phases.
Feature extraction
Radiomic features were extracted from the LNs using 3D Slicer (version 4.6, http://www.slicer.org), an open-source software platform used here for the extraction of features from medical images [8]. In total, 841 radiomic features were extracted and organized into two categories: (I) features based on the original images; and (II) features based on wavelet-filtered images. Eighteen first-order features derived from the intensity histogram reflected the distribution of individual voxel values without regard to spatial relationships. Thirteen shape features described the geometry of the ROI. Seventy-four texture features described the spatial arrangement of voxels, as calculated from different parent matrices: the gray level dependence matrix (GLDM), the gray level co-occurrence matrix (GLCM), the gray level size zone matrix (GLSZM), the gray level run length matrix (GLRLM), and the neighborhood gray-tone difference matrix (NGTDM) [9]. In addition, 736 wavelet features derived from eight filtering modes were obtained.
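The feature counts reported above can be checked with simple arithmetic. The short sketch below assumes, as the totals imply but the text does not state explicitly, that the shape features are computed only on the original images, so that each of the eight wavelet filters contributes only first-order and texture features.

```python
# Counts stated in the text.
FIRST_ORDER = 18
SHAPE = 13
TEXTURE = 74
WAVELET_FILTERS = 8  # "eight filtering modes"

# Features computed on the original images.
original_features = FIRST_ORDER + SHAPE + TEXTURE

# Assumption: shape features are not recomputed on wavelet-filtered images,
# so each filter yields first-order plus texture features only.
wavelet_features = (FIRST_ORDER + TEXTURE) * WAVELET_FILTERS

total_features = original_features + wavelet_features
print(original_features, wavelet_features, total_features)
```

Under this assumption the counts reproduce the reported 736 wavelet features and the overall total of 841.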
Feature selection and radiomic model development
A least absolute shrinkage and selection operator (LASSO) logistic regression algorithm was used to select the significant features, defined as those with nonzero coefficients, for model development. We constructed six models based on the radiomic features of single-phase imaging and of paired two-phase imaging: models 1, 2, and 3 were based on the plain, arterial, and venous phase radiomic features, respectively, and models 4, 5, and 6 were based on the delta radiomic features between plain and arterial phase imaging, plain and venous phase imaging, and arterial and venous phase imaging, respectively. This process was implemented in R software (version 3.3.3, https://www.r-project.org). The classification performance of the radiomic models was quantified by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, positive predictive value (PPV), and negative predictive value (NPV) in both the training and validation cohorts.
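For reference, the performance measures listed above can be computed from true labels and model outputs as in the following sketch. It is written in Python for illustration and is not the authors' R implementation; the labels, scores, 0.5 decision threshold, and function names are hypothetical, and the code assumes that both classes and both predicted classes are present so that no denominator is zero.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, accuracy, PPV and NPV from binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


def auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels and predicted probabilities, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, scores)
print(metrics, auc_value)
```

The AUC depends only on the ranking of the scores, whereas the other five measures depend on the chosen decision threshold.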
Statistical analysis
Data analysis was performed using Statistical Package for Social Sciences (SPSS) software version 23.0 (SPSS, Chicago, IL, USA) and R software (version 3.4.0, https://www.r-project.org). Clinical characteristics were compared between the training and validation cohorts using the Wilcoxon rank-sum test. P values less than 0.05 were considered statistically significant.
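As an illustration of the comparison described above, the following sketch implements a two-sided Wilcoxon rank-sum test using the normal approximation. It is not the SPSS or R routine used in the study: it omits tie and continuity corrections to the variance, the function name `rank_sum_test` is hypothetical, and the two samples are invented values, so statistical software may report slightly different P values.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, no tie correction)."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values and give each the average rank.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    n1, n2 = len(x), len(y)
    # W is the sum of the ranks belonging to the first sample.
    w = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return w, z, p_value


# Hypothetical values of one clinical variable in two cohorts.
cohort_a = [1, 2, 3, 4, 5]
cohort_b = [6, 7, 8, 9, 10]
w, z, p_value = rank_sum_test(cohort_a, cohort_b)
print(w, z, p_value, p_value < 0.05)
```

For samples this small an exact test would normally be preferred; the normal approximation is used here only to keep the example short.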