In this study, machine learning was combined for the first time with the SHAP method, radiomics features, and clinicopathological features to construct a risk prediction model for postoperative BM in patients with NSCLC. The model constructed from intratumoral and peritumoral radiomics features together with clinicopathological features performed well. SHAP values were used to interpret the model, to determine which features were most closely associated with improved stability, and to identify the most important features for predicting BM. These tumor imaging features can therefore predict BM and help patients and clinicians make clinical decisions in an understandable way.
Radiomics captures more detailed features than the human eye can perceive. Our results show that CT radiomics features from the peritumoral region can predict the risk of postoperative BM in NSCLC and that combining them with intratumoral radiomics features greatly improves model performance, with AUCs ranging from 0.888 to 0.928 in the training cohort and from 0.838 to 0.894 in the validation cohort. This indicates that the prognosis of lung cancer is reflected not only in the lesion itself but also in the tissue surrounding the tumor (Gao et al. 2023). The microenvironment surrounding the tumor is associated with aggressiveness (Braman et al. 2017; Chen et al. 2022), and capillaries and various cells around the tumor boundary may be more active than those inside the tumor. Peritumoral characteristics were substantially related to tumor spread through air spaces status (Liao et al. 2022). Algohari et al. (2020) studied the density of peritumoral mesenchymal macrophages, epithelial cells, and lymphocytes and found that it was associated with the risk of prostate cancer metastasis. This finding suggests that the tissue surrounding the tumor is helpful in predicting patient prognosis.
In addition, the model combining intratumoral and peritumoral radiomics features was more effective in predicting BM than the clinical model, which is consistent with previous findings. Zheng et al. (2023) found that a PET/CT radiomics model was more effective in predicting BM than the clinical model, with an AUC of 0.911 in the training cohort and 0.833 in the internal validation cohort. In the study by Ding et al. (2022), both the radiomics model and the combined model of radiomics and clinical features were superior to the clinical model. The AUC of the radiomics model was 0.870 in the training cohort and 0.824 in the validation cohort, and the AUC of the combined model was 0.912 and 0.859 in the training and validation cohorts, respectively, whereas the AUC of the clinical model was 0.712 in the training set and 0.692 in the validation set. Jiang et al. (2022) established a radiomics model based on 8 selected features with a C-index of 0.733 (95% CI, 0.637–0.828) in the training cohort and 0.693 (95% CI, 0.569–0.818) in the validation cohort, which was also higher than that of the clinical model. The combined model of radiomics and clinical features constructed by F. Sun et al. (2021) was likewise superior to the clinical model, which is consistent with our results. Radiomics thus provides complementary value to clinical information and improves the performance of the predictive model.
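Because the studies above are compared by AUC and C-index, a minimal sketch may help make the metric concrete. For a binary outcome, the AUC can be read as the probability that a randomly chosen patient with BM receives a higher predicted risk than a randomly chosen patient without BM, with ties counted as one half; without censoring, this pairwise concordance coincides with the C-index. The function name `auc_from_scores` and the toy labels and risk scores below are illustrative assumptions and are not data from this study.

```python
from itertools import product


def auc_from_scores(labels, scores):
    """Return the AUC computed as pairwise concordance.

    labels: 1 for patients with the event (e.g. BM), 0 otherwise.
    scores: predicted risk for each patient, higher means higher risk.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for pos_score, neg_score in product(positives, negatives):
        if pos_score > neg_score:
            concordant += 1.0   # correctly ranked pair
        elif pos_score == neg_score:
            concordant += 0.5   # tied pair counts one half
    return concordant / (len(positives) * len(negatives))


# Hypothetical toy cohort: three patients with BM, four without.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_from_scores(toy_labels, toy_scores)
print(f"toy AUC = {toy_auc:.3f}")
```

In this toy cohort, 11 of the 12 patient pairs are ranked correctly, so the AUC is 11/12. The pairwise form is quadratic in cohort size and is meant only to illustrate the definition.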
The clinical risk factor analysis showed that T stage was an independent risk factor for BM, and the nomogram and SHAP plot showed that T4 stage was associated with a lower risk than the other stages. This finding differs from other studies, which showed that larger tumors and higher T stages were associated with a greater probability of BM (Zhang et al. 2023). The reason may be that most of the T4 patients included in this study had N0–N1 disease, which made their overall stage relatively low. However, other studies have shown that T stage is not an independent risk factor for BM (F. Sun et al. 2021). In this study, there was no significant correlation between overall stage and BM incidence. It should be noted that this study included only patients with resectable stage IIB–IIIB disease. In addition, histopathological type was also an independent risk factor for BM. Patients with nonsquamous cell carcinoma, and with adenocarcinoma in particular, are more likely to develop BM than those with squamous cell carcinoma, which is consistent with the findings of other studies (F. Sun et al. 2021; S. Sun et al. 2021). Spicules are caused by lung cancer cells infiltrating adjacent normal lung tissue; pulmonary slippage is an important manifestation of the aggressiveness of the margin of malignant tumors and was considered an independent risk factor for BM in this study. Li et al. (2023) suggested that spicules play a role in promoting angiogenesis in patients with lung cancer, which may be related to distant metastasis.
Previous machine learning prediction models were difficult to interpret. We therefore also used SHAP, a recently developed tool for interpreting the results of "black box" machine learning models. In interpretability analysis, a SHAP value is calculated for each feature, quantifying how much that feature contributes to an individual prediction. This approach helps clinicians understand the model and makes the model more practical and generalizable. Deng et al. (2023) combined a radiomics model based on contrast-enhanced T1 MR with the XGBoost and SHAP algorithms and showed promise in identifying brain lesions in patients with NSCLC accurately and interpretably. In addition, the dynamic nomogram we used not only offers additional convenience for clinicians but also helps translate scientific research into clinical practice.
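To illustrate what a per-feature SHAP value represents, the following minimal sketch computes exact Shapley values for a toy risk function by averaging each feature's marginal contribution over all feature subsets. It is a simplified illustration, not the pipeline used in this study: it "removes" a feature by substituting a single reference value, whereas the SHAP library typically averages over a background dataset. The function `toy_risk`, its coefficients, and the feature names are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x relative to a baseline input.

    A feature that is "absent" from a subset takes its baseline value.
    Cost grows as 2**n, so this is only practical for a few features.
    """
    n = len(x)

    def value(subset):
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for k in range(n):
            # Shapley weight for subsets of size k that exclude feature i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(features):
    """Hypothetical risk score with one interaction term (illustrative only)."""
    intra, peri, adeno = features
    return 0.10 + 0.30 * intra + 0.20 * peri + 0.10 * adeno + 0.10 * intra * peri


feature_names = ["intratumoral_score", "peritumoral_score", "adenocarcinoma"]
patient = [1.0, 1.0, 1.0]      # hypothetical patient
reference = [0.0, 0.0, 0.0]    # hypothetical reference input
phi = shapley_values(toy_risk, patient, reference)

for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's predicted risk and the reference prediction, and the interaction term is split equally between the two features involved. This additivity is what allows a SHAP plot to decompose a single prediction feature by feature.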
This study has several limitations. First, it was a single-center retrospective study, and multicenter data will be needed in the future to confirm the validity and accuracy of the model. Second, the data of some samples were incomplete, and combined prediction using multiple omics, such as pathomics and genomics, was not performed. Third, the mining of CT images needs further improvement, and the sample size should be expanded to support deep learning research in the future (Gu et al. 2022).