In this study, we developed machine learning models tailored specifically to MLBO patients. These models accurately predict the OS and DFS of MLBO patients. MLBO is a complex condition that arises from CRC, in which obstruction is itself a risk factor. It calls for emergency interventions that weigh oncological outcomes, surgical tolerance, and the emergency situation under the concept of damage control(4). Unfortunately, the current literature provides limited guidance on the prognosis of MLBO, and debate persists regarding the efficacy of stent-bridging procedures versus emergency procedures(10, 21, 22). These factors underscore the urgent need for a scoring system capable of accurately predicting long-term prognosis in MLBO patients. Our approach integrates the emergency situation, radiological features, and oncological characteristics into a unified model. Several prognostic indicators relevant to MLBO have been proposed. Classical diagnostic and therapeutic standards for CRC include TNM staging(2, 3), while inflammation, a pivotal clinical feature of MLBO, has been closely linked to prognosis(23-25). HALP baseline metrics, which are used to assess the nutritional status of cancer patients, have proven to be reliable prognostic indicators(26, 27). Moreover, the ASA grade was developed for cardiopulmonary evaluation(28), and the CROSS score is used to measure the degree of obstruction(29). However, no existing prognostic evaluation system specifically addresses MLBO by considering these aspects together.
Our study successfully reconstructed the obstructed segment in MLBO and conducted a professional conversion of image data. This builds on previous applications for educational purposes, such as 3D-printed models, reconstruction of mesenteric vessels, and MRI reconstruction of rectal cancer(30-33). However, the final predictive results from the radiological feature classifiers were not entirely satisfactory. The LR for DFS displayed an AUC of 0.61 (95% CI: 0.48-0.79), while the LDA demonstrated decent performance for OS, with an AUC of 0.79 (95% CI: 0.56-0.87). The relatively small sample size used in this research may have limited the efficacy of radiomics. In contrast, the clinicopathological feature classifier outperformed the radiological feature classifiers.
At our center, we constructed a novel predictive model considering factors such as LVI, nutritional information, tumor markers, ASA grade, and the CROSS score, all underpinned by TNM stage. After data downscaling and self-clustering, the model yielded exceptional accuracy in predicting survival outcomes. XGB exhibited remarkable performance, with an AUC of 0.97 (95% CI: 0.72-0.97) for DFS. Moreover, the LDA showed excellent performance, with an AUC of 0.92 (95% CI: 0.64-0.90) for OS. Both classifiers have proven to be reliable within the medical field(34-37). To further optimize the model, we integrated clinical genomics and radiomics data. However, radiomics did not significantly enhance the overall effectiveness of the model. The predictive performance of the LR remained outstanding, with an AUC of 0.96 (95% CI: 0.75-0.97) for DFS and 0.92 (95% CI: 0.66-0.92) for OS. Given these results, we concluded that single clinicopathological classifiers exhibit robust efficacy(38, 39). To further validate our predictive model, we applied it to an external validation cohort. However, when the external cohort was used as a single validation set, predictive performance deteriorated, with an AUC of 0.42 for DFS and 0.50 for OS.
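The AUCs and 95% CIs reported above can be illustrated with a short sketch. The text does not state how the intervals were obtained, so the percentile bootstrap used here is an assumption, and the function names (`auc`, `bootstrap_auc_ci`) and the toy labels and scores are hypothetical.

```python
import random


def auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Equivalent to the Mann-Whitney U statistic; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample contained a single class; draw again
        stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    low = stats[int((alpha / 2) * n_boot)]
    high = stats[int((1 - alpha / 2) * n_boot) - 1]
    return low, high


# Toy data: 1 = event (recurrence or death), 0 = no event.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
point = auc(labels, scores)
low, high = bootstrap_auc_ci(labels, scores, n_boot=500)
print(point, low, high)
```

With small cohorts, intervals of this kind tend to be wide, which is consistent with the broad CIs reported for the radiological classifiers.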
To understand these conflicting results, we carried out a randomized matched difference study between centers to identify potential discrepancies. Regarding DFS, our center exhibited a higher recurrence rate, potentially attributable to higher T-stage and inflammation levels. Regarding OS, our center recorded lower CROSS scores and ASA grades, reflecting more severe bowel obstruction and greater cardiopulmonary tolerance. A lower BMI was also observed at our center, reflecting body type differences between southern and northern regions in China(40). Also for OS, our center reported a lower incidence of LVI, indicating comparatively less tumor dissemination(41). These discrepancies undermine predictive efficacy when all of QD's data are treated as a single validation set. Many multicenter studies compare data variability without merging the data, which can lead to mixed outcomes. Some validation sets performed well(42-44), while others did not meet expectations(45).
Therefore, we performed training and testing using hybrid data from both centers, which improved predictive accuracy for DFS (LDA, AUC of 0.96, 95% CI: 0.76-0.95) and OS (LDA, AUC of 0.95, 95% CI: 0.76-0.96). Based on these findings, we propose that multicenter data be merged before training begins. In our internal variability study involving merged data from the two centers, we found that higher TNM stage, the occurrence of LVI, high CA199 levels, and high MONO levels were closely correlated with recurrence and mortality in MLBO patients. This finding is consistent with previous studies and anticipated outcomes(25, 46, 47). The severity of the tumor burden and inflammatory response mirrors the progression of MLBO(48-50). This congruence further supports the robustness of our predictive model and underscores the significance of these factors in determining the prognosis of MLBO patients.
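The proposal to merge multicenter data before training can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not describe how the pooled data were split, so stratifying by center and event status is our assumption, and the function name `pooled_stratified_split`, the record fields, and the toy cohorts are hypothetical.

```python
import random
from collections import defaultdict


def pooled_stratified_split(center_a, center_b, test_fraction=0.3, seed=0):
    """Pool two centers, then split within each (center, event) stratum.

    Stratifying keeps both centers and both outcomes represented in the
    training and test sets (an assumed design, not the paper's stated one).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for name, records in (("A", center_a), ("B", center_b)):
        for rec in records:
            # Tag each record with its center before pooling.
            strata[(name, rec["event"])].append({**rec, "center": name})
    train, test = [], []
    for key in sorted(strata):
        rows = strata[key]
        rng.shuffle(rows)
        n_test = round(len(rows) * test_fraction)
        test.extend(rows[:n_test])
        train.extend(rows[n_test:])
    return train, test


# Toy cohorts: event = 1 for recurrence or death, 0 otherwise.
center_a = [{"id": f"A{i}", "event": i % 2} for i in range(20)]
center_b = [{"id": f"B{i}", "event": i % 2} for i in range(20)]
train, test = pooled_stratified_split(center_a, center_b)
print(len(train), len(test))
```

Splitting after pooling, rather than holding out one center entirely, means that between-center differences such as those in CROSS score or BMI are seen during training instead of appearing only at validation time.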
Despite these promising findings, our study has several limitations. First, this was a retrospective investigation conducted with a relatively small sample size. Second, the selection of classifiers was confined to those that are currently mature and widely used. Future studies may require custom classifiers for more specific or nuanced applications.