Clinical prediction model
Supplement Table 3 (Supplemental Digital Content 5) provides the clinical and pathological data for the training, internal validation, and external validation sets. For the CT features, the quantitative measurements of the two observers agreed strongly (ICC 0.954-0.996), and the qualitative indicators showed high consistency (kappa 0.832-1.000). Inter-observer agreement results for each CT feature are detailed in Supplement Table 4 (Supplemental Digital Content 6). Univariate analysis revealed significant differences (P < 0.05) between the VPI-positive and VPI-negative groups for several CT features: tumor size, solid component size, CTR, whole tumor pleural contact length, solid component pleural contact length, density type, pleural indentation sign, solid attachment sign, spiculation sign, vascular convergence sign, and the presence of emphysema. Notably, the VPI-positive group exhibited a higher prevalence of solid nodules, solid attachment, pleural indentation, spiculation, and vascular convergence signs, along with significantly larger tumors and solid components, higher CTR, and longer whole tumor and solid component pleural contact lengths (all P < 0.05; Table 1).
Multivariate logistic regression analysis was employed to select the optimal combination of predictive variables for constructing the clinical model. The independent risk factors for VPI were solid component size (OR = 1.23, 95% CI 1.16-1.30, P < 0.001), pleural indentation sign (OR = 3.36, 95% CI 1.61-7.02, P = 0.001), solid attachment sign (OR = 2.98, 95% CI 1.56-5.70, P < 0.001), and vascular convergence sign (OR = 4.51, 95% CI 1.49-13.69, P = 0.008). The AUC values of the clinical model were 0.885, 0.814, and 0.838 in the training, internal validation, and external validation sets, respectively.
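An odds ratio from logistic regression is the exponentiated coefficient, and a Wald-type 95% CI is symmetric around that coefficient on the log scale. The sketch below assumes the intervals above are Wald intervals, which the text does not state; under that assumption each reported OR should sit close to the geometric mean of its CI bounds. The function names are illustrative only.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def odds_ratio_with_ci(beta, se, z=Z_95):
    """Return (OR, lower, upper) for a logistic coefficient and its standard error."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_ci(lower, upper, z=Z_95):
    """Recover (beta, se) from the bounds of a Wald CI reported on the OR scale."""
    beta = (math.log(lower) + math.log(upper)) / 2
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# ORs and CI bounds as reported for the clinical model.
reported = {
    "solid component size": (1.23, 1.16, 1.30),
    "pleural indentation sign": (3.36, 1.61, 7.02),
    "solid attachment sign": (2.98, 1.56, 5.70),
    "vascular convergence sign": (4.51, 1.49, 13.69),
}

for name, (odds_ratio, lower, upper) in reported.items():
    beta, se = coefficient_from_ci(lower, upper)
    # exp(beta) is the geometric mean of the bounds; compare it with the reported OR.
    print(f"{name}: reported OR {odds_ratio}, implied OR {math.exp(beta):.3f}, SE {se:.3f}")
```

Small gaps between the reported and implied values are expected because the published bounds are rounded to two decimals.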
Radiomics model
A total of 1218 radiomic features were extracted from each of the four VOIs (GTV, GPTV5, GPTV10, and GPTV15). Of these, 72.2% (880/1218) of GTV features, 92.3% (1124/1218) of GPTV5 features, 97.1% (1183/1218) of GPTV10 features, and 97.8% (1191/1218) of GPTV15 features demonstrated good repeatability, with both inter-observer and intra-observer ICCs exceeding 0.75. Among the features with ICC > 0.75, the mRMR algorithm was first used to eliminate redundant and irrelevant features, retaining 30 features in each group. The LASSO regression algorithm was then applied to select the optimized feature subset for constructing the final model, with 10-fold cross-validation used to determine the optimal hyperparameter λ. The optimal λ values for GTV, GPTV5, GPTV10, and GPTV15 were 0.153, 0.131, 0.068, and 0.100, respectively (see Supplementary Fig. 1, Supplemental Digital Content 7). With these optimal λ values, 2, 5, 7, and 3 features were selected to construct the radiomics models for GTV, GPTV5, GPTV10, and GPTV15, respectively (Fig. 3). The features used for model construction and their ICC details are provided in Supplement Table 5 (Supplemental Digital Content 8). The radscore formulas for the four radiomics models can be found in the supplementary data (Supplemental Digital Content 9). In all four models, the radscore was significantly higher in the VPI-positive group than in the VPI-negative group (all P < 0.05), as shown in Supplementary Fig. 2 (Supplemental Digital Content 10).
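LASSO performs feature selection because its penalty shrinks coefficients toward zero and sets the weakest ones exactly to zero, with λ controlling how many survive. This is easiest to see in the special case of an orthonormal design, where the LASSO solution is the soft-thresholded least-squares coefficient. The sketch below illustrates only that special case. The coefficients and λ grid are hypothetical and are not taken from the study, whose models were fitted on correlated radiomic features where no closed form exists.

```python
def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return exactly 0.0 if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_orthonormal(ols_coefs, lam):
    """LASSO solution for an orthonormal design: soft-threshold each OLS coefficient."""
    return [soft_threshold(b, lam) for b in ols_coefs]


def n_selected(coefs):
    """Count the features that keep a non-zero coefficient."""
    return sum(1 for b in coefs if b != 0.0)


# Hypothetical least-squares coefficients for seven candidate features.
hypothetical_ols = [0.42, -0.31, 0.18, 0.12, -0.09, 0.05, 0.02]

# Hypothetical lambda grid; larger values leave fewer features in the model.
for lam in (0.01, 0.10, 0.20, 0.50):
    kept = n_selected(lasso_orthonormal(hypothetical_ols, lam))
    print(f"lambda={lam:.2f}: {kept} features retained")
```

Cross-validation then picks the λ on such a grid that gives the best held-out performance, which is the role the 10-fold procedure plays in the text.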
Efficacy comparison of radiomics models
The AUC values of the GTV, GPTV5, GPTV10, and GPTV15 models for predicting VPI status were 0.838, 0.849, 0.855, and 0.841, respectively, in the training set; 0.808, 0.855, 0.842, and 0.824 in the internal validation set; and 0.809, 0.826, 0.842, and 0.823 in the external validation set. The prediction performance of each radiomics model is summarized in Table 4, and the ROC curves for each radiomics model in the three sets can be found in Supplementary Fig. 3 (Supplemental Digital Content 11). The DeLong test indicated that, in the training set, the GPTV10 model significantly outperformed the GPTV15 model (Z = 2.076, P < 0.05). In the internal validation set, the GPTV5 and GPTV10 models both significantly outperformed the GTV model (Z = 3.030 and 2.163, both P < 0.05). The radiomics model with the highest AUC in the external validation set, GPTV10 (AUC 0.842), was selected as the best radiomics model. Consequently, a combined model was constructed from the GPTV10 radscore and CT morphological features.
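The selection rule and the significance statements above can be checked with a few lines of arithmetic. The sketch below picks the model with the highest external validation AUC and converts a DeLong Z statistic to a P value. It assumes the reported tests are two-sided, which the text does not state; the function name is illustrative.

```python
import math

# External validation AUCs as reported for the four radiomics models.
external_auc = {"GTV": 0.809, "GPTV5": 0.826, "GPTV10": 0.842, "GPTV15": 0.823}

# Selection rule from the text: highest AUC in the external validation set.
best_model = max(external_auc, key=external_auc.get)


def two_sided_p(z):
    """Two-sided P value for a standard normal Z statistic (assumed two-sided)."""
    return math.erfc(abs(z) / math.sqrt(2))


print("best radiomics model:", best_model)
for z in (2.076, 3.030, 2.163):
    print(f"Z = {z:.3f} -> two-sided P = {two_sided_p(z):.4f}")
```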
Efficacy evaluation of combined models
The radscore from the GPTV10 model and CT morphological features included in the clinical model were used as predictive variables to construct a combined model and the corresponding nomogram. The formula for the combined model was as follows:
Nomoscore = -2.738 + 0.076 × solid component size + 1.169 × pleural indentation sign + 1.178 × presence of solid component in contact with the pleura + 1.329 × vascular convergence sign + 0.650 × presence of combined emphysema + 1.110 × GPTV10-radscore
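To make the formula concrete, the sketch below evaluates it for one hypothetical patient. It rests on several assumptions that the text does not state: the Nomoscore is treated as the linear predictor (log-odds) of a logistic model, so a logistic transform gives a predicted probability; the qualitative signs are coded 1 for present and 0 for absent; and solid component size is in the unit used when the model was fitted. The dictionary keys and patient values are illustrative, not study data.

```python
import math

INTERCEPT = -2.738

# Coefficients from the combined model formula; key names are illustrative.
COEFFICIENTS = {
    "solid_component_size": 0.076,
    "pleural_indentation": 1.169,
    "solid_pleural_contact": 1.178,
    "vascular_convergence": 1.329,
    "emphysema": 0.650,
    "gptv10_radscore": 1.110,
}


def nomoscore(patient):
    """Linear combination of the predictors, as in the combined model formula."""
    return INTERCEPT + sum(coef * patient[name] for name, coef in COEFFICIENTS.items())


def predicted_probability(score):
    """Logistic transform; assumes the Nomoscore is on the log-odds scale."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical patient: binary signs coded 0/1 (assumed coding).
patient = {
    "solid_component_size": 10.0,
    "pleural_indentation": 1,
    "solid_pleural_contact": 0,
    "vascular_convergence": 0,
    "emphysema": 0,
    "gptv10_radscore": 0.5,
}

score = nomoscore(patient)
print(f"Nomoscore = {score:.3f}, predicted probability = {predicted_probability(score):.3f}")
```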
The AUC values of the combined model were 0.894, 0.828, and 0.876 in the training, internal validation, and external validation sets, respectively, as presented in Table 4. The nomogram and examples of its clinical application can be found in Fig. 4 and Fig. 5. The ROC curves of the GPTV10-based radiomics model, clinical model, and combined model for predicting VPI in the three sets are shown in Fig. 6.
The DeLong test showed that the combined model outperformed the GPTV10 radiomics model in the training set (Z = 2.987, P < 0.05). In the external validation set, the combined model performed better than the clinical model (Z = 2.348, P < 0.05). The Hosmer-Lemeshow test showed no significant lack of fit for the combined model in any of the three sets (all P > 0.05), as shown in Fig. 7. Decision curve analysis (DCA) revealed that the combined model achieved a higher net benefit in predicting VPI status than the clinical model and the GPTV10 radiomics model, as shown in Fig. 8.
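For readers unfamiliar with DCA, net benefit at a threshold probability pt is commonly defined as TP/n - (FP/n) × pt/(1 - pt), where a patient is classified as positive when the predicted probability is at least pt. The sketch below implements that standard definition together with the "treat all" reference strategy. It is not the authors' analysis code, and the labels and probabilities are hypothetical.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of classifying as positive when y_prob >= threshold (0 < threshold < 1)."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: classify every patient as positive."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical outcomes (1 = VPI-positive) and model probabilities.
y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.8, 0.3, 0.1]

for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(f"pt={pt}: model {model_nb:.3f}, treat all {all_nb:.3f}, treat none 0.000")
```

A decision curve plots these quantities across a range of thresholds; one model is preferred over another where its curve lies higher.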