Ethical approval was obtained for this study, and the requirement for informed consent was waived because the data were analyzed retrospectively and anonymously. We retrospectively reviewed the medical records of 403 patients with stage III (American Joint Committee on Cancer (AJCC) 8th edition) NSCLC who received definitive CRT at Shandong Cancer Hospital between January 2014 and December 2017. Patients were excluded from the analysis if they met at least one of the following criteria: (I) surgery before CRT; (II) radiotherapy dose less than 55 Gy; (III) no pretreatment CT images; (IV) no posttreatment CT images; (V) other causes of mortality; or (VI) loss to follow-up before the clinical endpoints.
In total, 139 patients were included in our analysis: patients treated before May 4th, 2016, formed the primary cohort (n = 100), and the remaining patients formed an independent validation cohort (n = 39). At baseline, the clinical features of the patients (age, gender, smoking history, tumor location, etc.) and the acquisition date of CT imaging were recorded.
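The selection and cohort assignment described above can be sketched as follows. This is only an illustration: the `Patient` record and its field names are hypothetical, and it assumes that "treated before May 4th, 2016" refers to the CRT start date, which the text does not state explicitly.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical record layout; the field names are illustrative only.
@dataclass
class Patient:
    patient_id: str
    crt_start: date            # assumed to be the date that defines "treated before"
    surgery_before_crt: bool   # criterion (I)
    rt_dose_gy: float          # criterion (II)
    has_pre_ct: bool           # criterion (III)
    has_post_ct: bool          # criterion (IV)
    died_other_cause: bool     # criterion (V)
    lost_to_follow_up: bool    # criterion (VI)

CUTOFF = date(2016, 5, 4)  # patients treated before this date form the primary cohort
MIN_DOSE_GY = 55.0

def is_excluded(p: Patient) -> bool:
    """True if the patient meets at least one exclusion criterion."""
    return (
        p.surgery_before_crt
        or p.rt_dose_gy < MIN_DOSE_GY
        or not p.has_pre_ct
        or not p.has_post_ct
        or p.died_other_cause
        or p.lost_to_follow_up
    )

def split_cohorts(patients):
    """Drop excluded patients, then split the rest by treatment date."""
    eligible = [p for p in patients if not is_excluded(p)]
    primary = [p for p in eligible if p.crt_start < CUTOFF]
    validation = [p for p in eligible if p.crt_start >= CUTOFF]
    return primary, validation
```

A temporal split of this kind keeps the validation patients strictly later than the primary cohort, so no information from the validation period enters model development.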
Patients were followed up every three to six months after treatment, and surveillance contrast-enhanced CT and/or PET/CT, brain magnetic resonance imaging (MRI), and whole-body bone scans were performed to assess treatment response or tumor progression based on the US National Comprehensive Cancer Network (NCCN) guidelines. The primary endpoint of this study was DMs, defined as progression of the disease to other organs as assessed on surveillance scans. Time to DMs was defined as the interval between the start date of CRT and either the date of the first scan showing radiographically evident distant metastases or the date of censoring (date of the last negative scan).
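The time-to-DMs definition above maps directly onto a (time, event) pair for survival analysis. A minimal sketch follows. The function name and arguments are hypothetical, and the conversion to months with 30.4375 days per month is an assumption, since the time unit is not specified here.

```python
from datetime import date
from typing import Optional, Tuple

DAYS_PER_MONTH = 30.4375  # assumed conversion factor (365.25 / 12)

def time_to_dm(
    crt_start: date,
    first_dm_scan: Optional[date],
    last_negative_scan: date,
) -> Tuple[float, int]:
    """Return (time in months, event flag).

    event = 1: distant metastases seen on a scan; time ends at that scan.
    event = 0: censored; time ends at the last negative scan.
    """
    if first_dm_scan is not None:
        end, event = first_dm_scan, 1
    else:
        end, event = last_negative_scan, 0
    days = (end - crt_start).days
    if days < 0:
        raise ValueError("end date precedes the start of CRT")
    return days / DAYS_PER_MONTH, event
```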
Oligometastases were defined as 1–5 separate metastatic lesions in up to three different organs. Metastases to all organs were included, except diffuse serosal metastases (meningeal, pericardial, pleural, mesenteric) and bone marrow involvement, as these cannot be treated with radical intent. Other metastases were classified as polymetastases.
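One way to read this rule as a classifier is sketched below. The input format (one site label per separate lesion) and the site labels themselves are hypothetical. The sketch also assumes that any diffuse serosal or bone marrow involvement places the patient in the polymetastases group.

```python
# Sites that cannot be treated with radical intent (labels are illustrative).
DIFFUSE_SITES = {"meningeal", "pericardial", "pleural", "mesenteric", "bone marrow"}

MAX_LESIONS = 5
MAX_ORGANS = 3

def classify_metastases(lesion_sites):
    """Classify a patient from a list with one site label per separate lesion."""
    if not lesion_sites:
        return "none"
    # Assumption: diffuse serosal or marrow disease is never oligometastatic.
    if any(site in DIFFUSE_SITES for site in lesion_sites):
        return "polymetastases"
    n_lesions = len(lesion_sites)
    n_organs = len(set(lesion_sites))
    if n_lesions <= MAX_LESIONS and n_organs <= MAX_ORGANS:
        return "oligometastases"
    return "polymetastases"
```

For example, two brain lesions plus one liver lesion give three lesions in two organs, which falls in the oligometastases group under this reading.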
The conventional clinical parameters considered for this study included age, gender, smoking status, histology subtype (1-squamous cell carcinoma (SCC), 2-adenocarcinoma (AC), 3-others), tumor location (peripheral or central), Eastern Cooperative Oncology Group (ECOG) performance status (PS), tumor-node-metastasis (TNM) stage per the AJCC staging system (8th edition), CT-based measurements commonly utilized in the clinic (e.g., maximal tumor diameter measured on a single axial slice), and treatment characteristics.
All patients underwent pretreatment contrast-enhanced CT (reconstruction thickness of 5 mm) with a 64-row detector scanner (Somatom Definition AS, Siemens Healthineers, Germany). The acquisition parameters were as follows: tube voltage of 120 kV, tube current of 200 mAs, detector collimation of 64 × 0.625 mm, beam pitch of 1.5, and matrix size of 512 × 512. We retrieved the pretreatment CT images in DICOM format from the picture archiving and communication system (PACS; Carestream, Canada).
Tumor segmentation and radiomics feature extraction
All available pretreatment CT images were collected centrally and transferred to 3D Slicer (software version 4.8.1), an open-source image analysis platform for image registration, segmentation, 3D visualization, and feature extraction[26–29]. The region of interest (ROI) contained the entire primary lung tumor and was segmented in 3D in 3D Slicer with a manual single-click ensemble segmentation approach by an experienced radiologist blinded to all clinical outcomes. The primary tumor segmentation was then confirmed by another senior radiologist.
In total, 724 quantitative radiomics features, including first-order features, shape, gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM) and neighboring gray-tone difference matrix (NGTDM), were extracted from each patient’s contrast-enhanced CT images. In addition, considering that a wavelet provides a spatial and frequency representation of the signal, the aforementioned texture features were also extracted from the images that were preprocessed with the wavelet filter. Overall, the radiomics features were extracted from both filtered and unfiltered images.
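To illustrate what a texture feature of this kind measures, the sketch below builds a GLCM for a small 2D image and derives the contrast feature from it. It is a toy example, not the 3D Slicer extraction pipeline used in the study. The single horizontal offset, the symmetric counting, and the pre-discretized integer gray levels are all assumptions made for illustration.

```python
def glcm(image, levels, offset=(0, 1), symmetric=True):
    """Normalized gray-level co-occurrence matrix of a 2D integer image.

    image   : list of rows, each value in range(levels)
    offset  : (row step, column step) between the two pixels of a pair
    """
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("no pixel pairs for this offset")
    return [[v / total for v in row] for row in counts]

def glcm_contrast(p):
    """Contrast: sum over (i, j) of (i - j)^2 * p(i, j)."""
    n = len(p)
    return sum((i - j) ** 2 * p[i][j] for i in range(n) for j in range(n))

# Small example image with four gray levels (0..3).
toy_image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
toy_contrast = glcm_contrast(glcm(toy_image, levels=4))
```

Contrast grows when neighboring pixels differ strongly in gray level, so a homogeneous region yields a value near zero.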
Clinical variables
Baseline continuous variables were compared between the primary and validation cohorts using the Mann-Whitney U-test, and categorical data were analyzed by the chi-square test or Fisher’s exact test. Univariate and multivariate Cox proportional hazards regression modeling was utilized to evaluate clinical variables as predictors of DMs in SPSS 25 statistical software (IBM, Armonk, NY). P values < 0.05 were considered statistically significant, and all P values were two-sided.
Construction of the radiomics score-based signature
The least absolute shrinkage and selection operator (LASSO) Cox regression model, which is suitable for the reduction of high-dimensional data, was applied to select the most predictive features to develop the radiomics signature[31, 32]. The discrimination of the signature was assessed by the area under the curve (AUC). A radiomics score (Rad-score) was calculated for each patient as a linear combination of the selected features weighted by their respective coefficients.
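The Rad-score calculation can be sketched as below. The feature names and coefficient values are invented for illustration and are not the study's selected features. The sketch also assumes the feature values are already on the scale used when the LASSO model was fitted.

```python
def rad_score(features, coefficients):
    """Linear combination of the selected features.

    features     : dict of feature name -> value for one patient
    coefficients : dict of feature name -> nonzero LASSO coefficient
    Features that were not selected are simply ignored.
    """
    return sum(coef * features[name] for name, coef in coefficients.items())

# Hypothetical coefficients and feature values, for illustration only.
example_coefficients = {"glcm_contrast": 0.5, "wavelet_glrlm_feature": -0.2}
example_patient = {
    "glcm_contrast": 2.0,
    "wavelet_glrlm_feature": 1.0,
    "unselected_feature": 9.0,
}
example_score = rad_score(example_patient, example_coefficients)
```

Because LASSO shrinks most coefficients to exactly zero, only the few features with nonzero coefficients contribute to the score.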
Validation of the radiomics signature
The patients were classified into high-risk or low-risk groups according to the Rad-score, the threshold of which was identified by using X-tile (X-tile software, version 3.6.1, Yale University School of Medicine, New Haven, Conn). Kaplan-Meier survival curves were generated to depict the association between the radiomics signature and clinical outcomes. The signature was first evaluated in the primary cohort and then verified in the validation cohort. Log-rank testing was performed to compare the survival curves of the high-risk and low-risk groups.
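For readers unfamiliar with these two methods, the sketch below gives a plain Python version of the Kaplan-Meier estimator and a two-group log-rank test. It is a teaching illustration under simple assumptions (event flag 1 = DMs observed, 0 = censored), not the R code used for the analysis, and it does not reproduce the X-tile threshold search.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time."""
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects whose time equals t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / n_at_risk
            curve.append((t, survival))
        n_at_risk -= n_leaving
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, two-sided p)."""
    all_times = list(times_a) + list(times_b)
    all_events = list(events_a) + list(events_b)
    event_times = sorted({t for t, e in zip(all_times, all_events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)  # at risk in group A
        n_b = sum(1 for x in times_b if x >= t)  # at risk in group B
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = observed_minus_expected ** 2 / variance
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Here the two groups would be the high-risk and low-risk patients, with time to DMs as the time variable.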
Model construction
For the construction of the nomogram, we performed multivariate Cox analysis of clinical parameters, including age, gender, smoking status, histology subtype, tumor location, ECOG, TNM stage, maximum tumor diameter (MTD) and treatment characteristics. The radiomics nomogram incorporated both the radiomics signature and the independent clinical risk factors.
Model evaluation
The discrimination performance of the model was evaluated by the Harrell concordance index (C-index). Calibration curves were used to assess the agreement between the actual clinical outcomes and the predicted outcomes. Decision curve analysis (DCA) was conducted to determine the clinical usefulness of the nomogram by quantifying the net benefits at different threshold probabilities in the entire cohort[35, 36].
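The two metrics named above can be sketched in a few lines. Both functions are simplified illustrations rather than the "Hmisc" or "rmda" implementations. The C-index version counts a pair as comparable only when the subject with the shorter time had an observed event. The net benefit version assumes a binary outcome at a fixed time point and ignores censoring.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    Ties in risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Subject i had an observed event strictly before subject j's time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

def net_benefit(outcomes, predicted_probs, threshold):
    """Net benefit at one threshold probability (0 < threshold < 1).

    net benefit = TP/N - FP/N * threshold / (1 - threshold)
    """
    n = len(outcomes)
    pairs = list(zip(outcomes, predicted_probs))
    true_pos = sum(1 for y, p in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in pairs if p >= threshold and y == 0)
    return true_pos / n - false_pos / n * threshold / (1.0 - threshold)
```

A decision curve is obtained by evaluating `net_benefit` over a range of thresholds and comparing the result with the "treat all" and "treat none" strategies.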
All statistical tests were two-sided, with the statistical significance level set at 0.05. Statistical analysis was performed with the “glmnet,” “rms,” “Hmisc,” “lattice,” “survival,” “Formula,” “ggplot2,” and “rmda” packages in R software (Version 3.6.1; http://www.Rproject.org).