Radiomics Nomograms for Predicting Different Patterns of Distant Metastases After Definitive Chemoradiotherapy for Locally Advanced Non-Small-Cell Lung Cancer

Background: To develop and validate radiomics-based nomograms for predicting different patterns of distant metastases (DMs), including oligometastases and polymetastases, after definitive chemoradiotherapy (CRT) for locally advanced non-small-cell lung cancer (LA-NSCLC). Methods: In all, 139 LA-NSCLC patients treated with definitive CRT between January 2014 and December 2017 were analyzed retrospectively. Computed tomography (CT) radiomics features were quantitatively extracted with 3D Slicer software. The least absolute shrinkage and selection operator (LASSO) Cox regression model was applied for data dimension reduction, feature selection and radiomics signature development. Univariate and multivariate Cox regression analyses were used to identify independent predictors of DMs. Finally, a nomogram that incorporates both the radiomics signature and independent clinical risk factors was developed and assessed according to the Harrell concordance index (C-index), calibration curve and clinical usefulness. Results: With a median follow-up of 20.6 months, DMs were detected by follow-up imaging in 47 patients (26 patients had oligometastatic disease). For those who experienced DM, the first metastases were most common in the lung (32.1%), followed by the brain (20.7%) and bone (15.1%). The median time to DMs was 13.5 months. Twelve radiomics features were found to be predictive of DMs. Multivariate Cox proportional hazards testing revealed that histology subtype and overall stage were independent clinical risk factors for DMs (p-value < 0.05). The radiomics nomogram combining the selected radiomics signature, histology subtype and overall stage predicted DMs better than either the radiomics signature or clinical factors alone (p-value < 0.001). The model showed good discrimination and good calibration. Furthermore, the radiomics signature consisting of nine selected features was significantly associated with oligometastases (p-value < 0.001).
The radiomics-based nomogram showed strong discrimination ability for the prediction of oligometastatic disease. The C-index was 0.800 (95% CI = 0.710 to 0.890) for the primary cohort, and 0.828 (95% CI = 0.686 to 0.970) for the validation cohort. Conclusion: The above-described radiomics nomograms can precisely predict DM and the oligometastatic state in patients with LA-NSCLC, which may constitute a useful clinical tool to guide subsequent personalized treatments.


Introduction
Definitive chemoradiotherapy (CRT) is considered the standard of care for patients presenting with inoperable locally advanced non-small-cell lung cancer (LA-NSCLC) [1]. Despite recent advances in radiation therapy technologies, approximately 30%-40% of NSCLCs metastasize to distant sites after definitive treatment [2]. There is a common belief that patients with metastatic NSCLC cannot be cured, and treatment consists mainly of systemic therapy with radiation reserved for palliation [3]. However, patients with limited metastatic burden, "oligometastatic" disease (OMD), may have a better prognosis than those with polymetastatic disease [4,5]. Although there is no formal consensus on the definition of oligometastatic disease, it is currently widely accepted that it constitutes up to five isolated lesions in up to three different organs that are potentially amenable to locally ablative therapy (surgical resection, radiotherapy or both) [6][7][8].
The term OMD was first coined by Hellman and Weichselbaum, who hypothesized an intermediate clinical state between locoregional and widely metastatic disease [9]. For patients with oligometastatic and oligorecurrent NSCLC, intensive local therapy, including surgical resection or definitive radiotherapy, is expected to achieve long-term survival and may even be curative [10][11][12][13][14][15]. However, the accurate classification of oligometastatic and polymetastatic patients and the identification of patients eligible for local radical therapy or systemic therapy remain of high importance. Lussier et al. found that molecular techniques using microRNAs can distinguish between oligometastases and polymetastases in lung cancer [16,17]. Furthermore, some evidence has indicated that whole-body 18F-FDG positron emission tomography (PET)/CT can be used for identifying and following up patients with oligometastatic NSCLC [18,19]. Consequently, biomarkers can be developed to identify and distinguish patients with distant metastases (DMs), oligometastases and polymetastases before treatment, which may help clinicians adopt personalized treatments to improve patient prognosis.
Radiomics, as an emerging field of quantitative imaging, has the capacity to capture tumor phenotypic differences by examining a large set of quantitative features based on noninvasive imaging, which provides a new basis for the accurate diagnosis and treatment of tumors. It shows great potential for assisting in the clinical diagnosis, treatment, and prognosis of cancer [20][21][22]. Recently, numerous studies have shown that imaging-based radiomics features can be used to quantify tumor heterogeneity and have potential application as clinical biomarkers for patient stratification [2,[23][24][25]. In particular, radiomics studies have shown that CT-derived imaging features may be novel prognostic indicators for DMs in stage III NSCLC patients treated with CRT [2,20]. Nevertheless, there is still a lack of studies on imaging-based radiomics for the identification of oligometastatic and polymetastatic NSCLC.
Therefore, the aim of this study was to develop CT-based radiomics nomograms to precisely predict different patterns of DMs, including oligometastases and polymetastases, in patients with LA-NSCLC and to provide useful information for adopting accurate and personalized treatments in the clinic.

Patients
Ethical approval was obtained for this study, and the necessity to obtain informed consent was waived, as the data were analyzed retrospectively and anonymously. We retrospectively reviewed the medical records of 403 patients with stage III (American Joint Committee on Cancer (AJCC) 8th edition) NSCLC after definitive CRT in Shandong Cancer Hospital between January 2014 and December 2017. Patients were excluded from the analysis if they met at least one of the following criteria: (I) surgery before CRT; (II) radiotherapy dose less than 55 Gy; (III) no pretreatment CT images; (IV) no posttreatment CT images; (V) other causes of mortality; or (VI) loss to follow-up before the clinical endpoints.
In total, 139 patients were included in our analysis: patients treated before May 4th, 2016, formed the primary cohort (n = 100), and the remaining patients formed an independent validation cohort (n = 39). At baseline, the clinical features of the patients (age, gender, smoking history, tumor location, etc.) and the acquisition date of CT imaging were recorded.

Clinical endpoints
Patients were followed up every three to six months after treatment, and surveillance contrast-enhanced CT and/or PET/CT, brain magnetic resonance imaging (MRI), and whole-body bone scans were performed to assess treatment response or tumor progression based on the US National Comprehensive Cancer Network (NCCN) guidelines. The primary endpoint of this study was DMs, which was defined as progression of the disease to other organs as assessed in surveillance scans, and time to DMs was defined as the time interval between the start date of CRT and the first scan date of radiographically evident distant metastases or censoring (date of last negative scan).
Oligometastases were defined as 1-5 separate metastatic lesions in up to three different organs. Metastases to all organs were included, except diffuse serosal metastases (meningeal, pericardial, pleural, mesenteric) and bone marrow involvement, as these cannot be treated with radical intent. Other metastases were classified as polymetastases.
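The definition above can be sketched as a small rule-based classifier. This is only an illustration under stated assumptions: the function name, the site labels and the input format (one organ name per lesion) are hypothetical, and treating any diffuse serosal or bone marrow involvement as polymetastatic is our reading of the text rather than a documented rule of the study.

```python
# Hypothetical site labels for diffuse disease that cannot be treated with
# radical intent (assumed to be classified as polymetastases).
DIFFUSE_SITES = {"meningeal", "pericardial", "pleural", "mesenteric", "bone marrow"}


def classify_metastatic_pattern(lesion_organs):
    """Classify a patient from a list with one organ name per metastatic lesion.

    Assumed rule: 1-5 lesions in at most three organs, with no diffuse
    serosal or bone marrow involvement, counts as oligometastases.
    """
    if not lesion_organs:
        return "no distant metastases"
    organs = {organ.lower() for organ in lesion_organs}
    if organs & DIFFUSE_SITES:
        return "polymetastases"
    if len(lesion_organs) <= 5 and len(organs) <= 3:
        return "oligometastases"
    return "polymetastases"


if __name__ == "__main__":
    print(classify_metastatic_pattern(["brain", "lung", "lung"]))
    print(classify_metastatic_pattern(["bone"] * 6))
```

Counting lesions and organs separately keeps the two limits of the definition (lesion number and organ number) explicit.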

Clinical variables
The conventional clinical parameters considered for this study included age, gender, smoking status, histology subtype (1-squamous cell carcinoma (SCC), 2-adenocarcinoma (AC), 3-others), tumor location (peripheral or central), Eastern Cooperative Oncology Group (ECOG) performance status (PS), tumor-node-metastasis (TNM) stage per the AJCC staging system (8th edition), CT-based measurements commonly utilized in the clinic (e.g., maximal tumor diameter measured on a single axial slice), and treatment characteristics.

Image acquisition
All patients underwent pretreatment contrast-enhanced CT (reconstruction thickness of 5 mm) with a 64-row detector scanner (Somatom Definition AS, Siemens Healthineers, Germany). The acquisition parameters were as follows: tube voltage of 120 kV, tube current of 200 mAs, detector collimation of 64 × 0.625 mm, beam pitch of 1.5, and 512 × 512 matrix size. We retrieved the pretreatment CT images in DICOM format from the picture archiving and communication system (PACS; Carestream, Canada).

Tumor segmentation and radiomics feature extraction
All available pretreatment CT images were collected centrally and transferred to 3D Slicer (software version 4.8.1), an open-source image analysis platform for image registration, segmentation, 3D visualization, and feature extraction [26][27][28][29]. The region of interest (ROI) contained the entire primary lung tumor and was segmented in 3D in 3D Slicer with a manual single-click ensemble segmentation approach by an experienced radiologist blinded to all clinical outcomes. The primary tumor segmentation was then confirmed by another senior radiologist.
In total, 724 quantitative radiomics features, including first-order features, shape, gray-level co-occurrence matrix (GLCM), gray-level run-length matrix (GLRLM), gray-level size zone matrix (GLSZM) and neighboring gray-tone difference matrix (NGTDM), were extracted from each patient's contrast-enhanced CT images. In addition, considering that a wavelet [30] provides a spatial and frequency representation of the signal, the aforementioned texture features were also extracted from images preprocessed with the wavelet filter. Overall, the radiomics features were extracted from both filtered and unfiltered images.

Statistical analysis
Clinical variables
Baseline continuous variables were compared between the primary and validation cohorts using the Mann-Whitney U-test, and categorical data were analyzed by the chi-square test or Fisher's exact test. Univariate and multivariate Cox proportional hazards regression modeling was utilized to evaluate clinical variables as predictors of DMs in SPSS 25 statistical software (IBM, Armonk, NY). P values < 0.05 were considered statistically significant, and all P values were two-sided.

Construction of the radiomics score-based signature
The least absolute shrinkage and selection operator (LASSO) Cox regression model, which is suitable for the reduction of high-dimensional data, was applied to select the best predictive features to develop the radiomics signature [31,32]. The discrimination of that signature was calculated by the area under the curve (AUC). A radiomics score (Rad-score) was calculated for each patient via a linear combination of selected features that were weighted by their respective coefficients.
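The linear combination described above can be sketched as follows. The feature names, coefficient values and intercept are placeholders invented for illustration and are not the values fitted in this study; only the form (intercept plus coefficient-weighted sum of the selected features) comes from the text.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Return intercept + sum(coefficient * feature) over the selected features.

    `features` maps feature name -> value for one patient; `coefficients`
    holds only the features kept by the LASSO step (nonzero coefficients).
    Features absent from `coefficients` are ignored.
    """
    return intercept + sum(
        weight * features[name] for name, weight in coefficients.items()
    )


if __name__ == "__main__":
    # Hypothetical coefficients and feature values, for illustration only.
    example_coefficients = {"feature_a": 0.5, "feature_b": -1.2}
    example_patient = {"feature_a": 2.0, "feature_b": 1.0, "feature_c": 9.0}
    print(rad_score(example_patient, example_coefficients, intercept=-1.0))
```

Keeping the coefficients in a dictionary makes it explicit that features shrunk to zero by LASSO do not contribute to the score.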

Validation of the radiomics signature
The patients were classified into high-risk or low-risk groups according to the Rad-score, the threshold of which was identified by using X-tile (X-tile software, version 3.6.1, Yale University School of Medicine, New Haven, Conn) [33]. Kaplan-Meier survival curves were generated to depict the association between the radiomics signature and clinical outcomes. The signature was first evaluated in the primary cohort and then verified in the validation cohort. Log-rank testing was performed to compare the survival curves of the high-risk and low-risk groups.

Model construction
For the construction of the nomogram, we performed multivariate Cox analysis of clinical parameters, including age, gender, smoking status, histology subtype, tumor location, ECOG, TNM stage, maximum tumor diameter (MTD) and treatment characteristics. The radiomics nomogram incorporated both the radiomics signature and the independent clinical risk factors.

Model evaluation
The discrimination performance of the model was evaluated by the Harrell concordance index (C-index). Calibration curves were used to assess the agreement between the actual clinical outcomes and the predicted outcomes [34]. Decision curve analysis (DCA) was conducted to determine the clinical usefulness of the nomogram by quantifying the net benefits at different threshold probabilities in the entire cohort [35,36].
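For readers unfamiliar with the C-index, a minimal pure-Python sketch of the Harrell estimator is given below. It is not the software used in this study; the function name is hypothetical, and pairs with tied follow-up times are simply skipped, which is a simplifying assumption.

```python
def harrell_c_index(times, events, risks):
    """Harrell concordance index for right-censored data.

    A pair (i, j) is comparable when subject i had the event (events[i] == 1)
    and a strictly shorter follow-up time than subject j. The pair is
    concordant when the subject with the earlier event has the higher
    predicted risk; tied risks count as half. Tied times are skipped.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    # Toy data: months to DM or censoring, event flag, predicted risk score.
    print(harrell_c_index([6, 12, 18, 24], [1, 1, 0, 1], [2.0, 1.5, 0.3, 0.1]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of risk against time to event.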

Clinical characteristics
The demographic and clinical characteristics of the primary and validation cohorts are listed in Table 1.
In total, 139 LA-NSCLC patients treated with definitive CRT were analyzed in this study. There were 16 females and 123 males with a median age of 61 years (range: 39-84 years). The median follow-up time was 20.6 months (range: 3.7-58.8 months). The median time to DM was 13.5 months (range: 1.4-56.8 months); 47 (34%) patients developed DMs and 92 (66%) did not. For those who experienced DMs, the first metastases were most commonly in the lung (32.1%), followed by the brain (20.7%) and bone (15.1%). Table 2 lists sites of metastases and the number of sites of metastatic lesions. According to the time of enrollment, participants were divided into a primary cohort and a validation cohort. Except for the use of radiotherapy (P < 0.05), there were no significant differences in clinical characteristics. An outlier was observed among the patient characteristics: the proportion of elderly patients in the validation cohort was higher than that in the primary cohort. There were no significant differences in the probability of DMs (hazard ratio (HR) = 1.40, 95% CI = 0.75 to 2.61, P > 0.05) between the two cohorts.

Clinical risk factor selection
Univariate Cox proportional hazards testing revealed that histology subtype (P < 0.01) and overall stage (P < 0.05) were statistically significant predictors of DM. The multivariate Cox proportional hazards modeling results are presented in Table 3. Histology subtype appeared to be an independent clinical risk factor (P = 0.013) for metastasis. Overall stage was also independently associated with increased risk of distant metastases (HR = 2.37, 95% CI = 1.00 to 5.59, P < 0.05). The model that incorporated the above independent predictors was developed and presented as the nomogram.

Feature selection and radiomics score-based signature construction
Twelve nonredundant features with nonzero coefficients in the LASSO Cox regression model were selected from the 724 features based on the primary cohort of 100 cases (Fig. 1). The radiomics signature was constructed from these features, and a radiomics score was calculated for each patient (Table 4). The formula to calculate the radiomics score was: Score = Intercept + Σ (Coefficient × Radiomics feature), summed over the selected features. The radiomics signature that we developed showed a significant capability to distinguish DM and non-DM in the primary cohort (AUC = 0.836, 95% CI = 0.749 to 0.903) and validation cohort (AUC = 0.661, 95% CI = 0.492 to 0.805).

Validation of the radiomics signature
The optimum cutoff generated by the X-tile plot was -2.45. Accordingly, patients were classified into a high-risk group (Rad-score ≥ -2.45) and a low-risk group (Rad-score < -2.45). Figure 2 demonstrates the potential stratification that can be achieved using the radiomics signature. The radiomics signature was associated with the metastasis-free probability in the primary cohort (HR = 20.11, 95% CI = 7.98 to 50.70, P < 0.001). Using the cutoff value derived from the primary cohort, Kaplan-Meier analysis in the validation cohort showed a statistically significant difference (log-rank p < 0.05) for metastasis-free probability estimates. The low-risk group showed an HR of 0.20 compared to the high-risk group.
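The stratification step can be illustrated with a short sketch that assigns risk groups from the reported cutoff and computes a Kaplan-Meier estimate of metastasis-free probability. Only the cutoff value (-2.45) comes from the text; the function names and the toy follow-up data are hypothetical, and this is a generic product-limit estimator rather than the software used by the authors.

```python
RAD_SCORE_CUTOFF = -2.45  # X-tile cutoff reported in the text


def risk_group(score, cutoff=RAD_SCORE_CUTOFF):
    """High risk if Rad-score >= cutoff, otherwise low risk."""
    return "high" if score >= cutoff else "low"


def kaplan_meier(times, events):
    """Product-limit estimate; returns (time, probability) at each event time.

    `events` uses 1 for an observed distant metastasis and 0 for censoring.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    probability = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all subjects sharing the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            probability *= 1.0 - n_events / at_risk
            curve.append((current_time, probability))
        at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Toy data for illustration only: (Rad-score, months, event flag).
    patients = [(-1.9, 5, 1), (-2.0, 8, 1), (-3.1, 8, 0), (-2.8, 12, 0)]
    for group in ("high", "low"):
        subset = [p for p in patients if risk_group(p[0]) == group]
        print(group, kaplan_meier([p[1] for p in subset], [p[2] for p in subset]))
```

The curves for the two groups would then be compared with a log-rank test, as described in the statistical analysis section.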

Performance of the multimodality prediction model
A predictive nomogram (Fig. 3a) was generated on the basis of the selected radiomics signature, histology subtype, and overall stage. The C-index for the combined nomogram was 0.807 (95% CI = 0.748 to 0.866) for the primary cohort, which was confirmed to be 0.692 (95% CI = 0.556 to 0.828) by bootstrapping validation.
Further, the calibration curves of the nomogram for metastasis-free probability at 1, 2, or 3 years after definitive CRT are shown in Fig. 3b, and they demonstrated good agreement between the predicted and actual observations in the primary cohort.
The predictive performance of these models was assessed on the primary cohort and the validation cohort (Table 5). In the primary cohort, compared with either the radiomics signature (C-index = 0.763; 95% CI = 0.686 to 0.840) or the clinicopathologic model (C-index = 0.707; 95% CI = 0.629 to 0.785), the radiomics nomogram showed a better discrimination capability (P < 0.001). In the validation cohort, the radiomics signature and the radiomics nomogram had similar C-indexes (0.683 for the radiomics signature and 0.692 for the radiomics nomogram, P = 0.779).
*The C-index was compared between the clinical nomogram and the radiomics signature.
**The C-index was compared between the radiomics signature and the radiomics nomogram.

Clinical use
The DCA for each model is presented in Fig. 4. Based on the threshold probability, the decision analysis curve was leveraged to evaluate the clinical application of the prediction model. In our study, a threshold probability < 0.38 or > 0.62 indicates that using the radiomics nomogram to predict DM improves the net benefit compared to that with standard treatment or no treatment. Within this range, the net benefit was comparable, with several overlaps, on the basis of the radiomics nomogram, the radiomics signature and the model with clinical risk factors integrated.
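To make the notion of net benefit concrete, the sketch below uses the standard decision curve formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability. The function names and toy inputs are hypothetical, and the binary-outcome form shown here is a simplification of what a time-to-event analysis would require.

```python
def net_benefit(predicted_probs, outcomes, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(outcomes)
    pairs = list(zip(predicted_probs, outcomes))
    true_pos = sum(1 for p, y in pairs if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in pairs if p >= threshold and y == 0)
    odds = threshold / (1.0 - threshold)  # weight given to a false positive
    return true_pos / n - (false_pos / n) * odds


def net_benefit_treat_all(outcomes, threshold):
    """Reference strategy: every patient is treated as high risk."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(outcomes) / len(outcomes)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Toy predicted DM probabilities and observed outcomes (1 = DM).
    probs = [0.9, 0.8, 0.3, 0.1]
    observed = [1, 0, 1, 0]
    for pt in (0.2, 0.5):
        print(pt, net_benefit(probs, observed, pt), net_benefit_treat_all(observed, pt))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all strategy and the treat-none strategy, whose net benefit is zero.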

Subgroup analysis according to the number of metastatic lesions
Among the 47 NSCLC patients with DMs, 26 had oligometastatic disease. After excluding patients with polymetastatic NSCLC, we reclassified the remaining patients into the primary cohort (n = 96) and the validation cohort (n = 22). In the oligometastatic NSCLC subgroup, multivariate Cox regression analysis showed that the histology subtype was an independent clinical risk factor (P < 0.05) (see Additional file 1: Table S1). The radiomics signature, consisting of 9 features, was significantly associated with oligometastatic NSCLC (P < 0.001) (see Additional file 1: Table S2). The AUC values of the radiomics signature in the primary and validation cohorts were 0.865 (95% CI = 0.781 to 0.926) and 0.659 (95% CI = 0.428 to 0.845), respectively. Consequently, we integrated the independent predictive factors (radiomics signature and histology subtype) to construct a nomogram specific to oligometastatic NSCLC (Fig. 3a), which has strong discrimination and calibration power (Fig. 3b). The C-index was 0.800 (95% CI = 0.710 to 0.890) for the primary cohort and 0.828 (95% CI = 0.686 to 0.970) for the validation cohort.

Discussion
Considering the high incidence of DM, metastatic NSCLC is traditionally believed to be incurable, with very little information on the identification of oligometastases and polymetastases. To our knowledge, this study is the first to build CT-based radiomics prediction models for different patterns of DMs in patients with LA-NSCLC based on limited numbers of metastatic lesions. In this study, we extracted twelve radiomics features with nonzero coefficients in the LASSO Cox regression model from a total of 100 LA-NSCLC patients treated with definitive CRT, and we validated the radiomics-based signature on an independent validation dataset. The radiomics signature successfully stratified patients according to their risk of DM. Incorporating the radiomics signature and clinical risk factors into the radiomics-based nomogram resulted in better performance (P < 0.001) as well as better discrimination and calibration for the prediction of DM (C-index: 0.807 for the primary cohort and 0.692 for the validation cohort) than either the clinicopathologic nomogram or the radiomics signature. Furthermore, for the oligometastatic NSCLC subgroup, we developed and validated a diagnostic, radiomics-based nomogram specific to this subgroup for the individualized pretreatment prediction of oligometastatic disease. Since patients with oligometastatic disease are different from those with polymetastases in terms of treatment and prognosis, early identification of patients at different risk of developing DM would help clinicians stratify patients and may also provide useful information to adapt treatments to improve outcomes.
We evaluated the conventional clinical parameters associated with DM to determine its clinical risk factors in the context of all LA-NSCLC patients. The results of this study revealed that the histology subtype and overall stage were independent clinical risk factors. Predictors of DM after definitive treatment for NSCLC have been elucidated in a number of reports. Previous investigations suggested that the histologic subtype was associated with sites of DM [37]. In a pooled analysis of four Radiation Therapy Oncology Group (RTOG) trials including 1765 patients, Cox et al. demonstrated that nonsquamous histology was associated with DM. Similar results were confirmed by Tang et al., who also identified nonsquamous histology and advanced initial disease stage as predictors of metastasis at any site [38,39]. In addition, compared with SCC, the risk of DM is higher in AC [40]. However, there was no statistically significant difference in DM incidence between SCC and AC in our study (P = 0.086). One possible explanation for this finding may be the limited number of patients with AC.
Preliminary evidence has suggested that CT-based radiomics features are associated with failure patterns in patients with LA-NSCLC. Fried et al. found that the combination of CT texture features and conventional prognostic factors (CPFs) may have better predictive value than either alone for locoregional control and distant metastasis in stage III NSCLC patients treated with definitive radiotherapy [20]. Another report also concluded that the CT-based radiomics signature could be used as a biomarker for predicting distant failure after CRT in patients with locally advanced lung AC [2]. Currently, there is still a lack of targeted studies on imaging-based radiomics for the identification of oligometastases and polymetastases. First, our study demonstrated that the integrative nomogram of CT-based radiomics and clinical features could be used to generate outcome models superior to those only considering clinical factors or the radiomics signature. We also built a radiomics-based model for the identification of oligometastatic NSCLC subgroups. By means of the above-described models, we can preliminarily stratify patients with DMs to select patients with oligometastatic disease suitable for local radical therapy and to select the remaining patients with polymetastatic disease suitable for systemic therapy, including targeted therapy and immunotherapy.
In this study, we found that approximately 32.1% of first metastases occurred in the lung, followed by the brain (20.7%) and bone (15.1%). Our observed sites of first metastases were consistent with those presented in other studies assessing treatment with definitive CRT. In a study of NSCLC patients, approximately 78% of whom had stage III NSCLC, first metastases were most common in the lung (27%), brain (26%), and bone (18%) [37]. A recent analysis of this same patient group showed that first metastases were most common in the brain (30%), contralateral lung (27%), and bone (27%) in stage III NSCLC patients treated with definitive CRT [41]. The brain was also an organ with a high probability of extrapulmonary metastasis in stage IIIA NSCLC patients, especially in AC patients [40,42]. In the future, we hope to conduct additional studies examining the use of radiomics analysis in tumor-specific metastatic sites such as the brain.
Various limitations of this analysis deserve mention. First, as this study was retrospective in nature, our analysis is limited by its inherent biases and the relatively small sample size. To build practical radiomic models, multiple larger databases and multicenter prospective studies are required. Second, manual delineation is most commonly used in these types of analyses but is far from perfect. Thresholding is a useful strategy to enhance contour reproducibility, particularly in lung tumors. However, although our study has some limitations, it is one of the largest attempts to identify a biomarker predictive of different patterns of DM, including oligometastases and polymetastases, in patients with LA-NSCLC.

Conclusion
In conclusion, the combined model of the radiomics signature and clinical risk factors results in better performance for the prediction of DM in LA-NSCLC compared to the radiomics signature or clinical factors alone. Herein, we reported a radiomics-based model for the identification of the oligometastatic NSCLC subgroup, which may aid in the stratification of patients with DM and the selection of

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.
Natural Science Foundation (ZR2019LZL019), and the Jinan Scientific and Technology Development Project (201805005).

Authors' contributions
LX and XS conceived of the present idea. XC and QQ delineated and confirmed the ROI. YC acquired the patient data. XT analyzed the data and drafted the manuscript. LX and XS edited and corrected the manuscript. All authors read and approved the final manuscript.