In this study, we built a precise and repeatable classifier to differentiate patients with BM from those with CAE, using a wide range of radiomics features and machine learning methods. We built five machine learning models to distinguish CAE from BM on conventional contrast-enhanced T1WI images. The KNN classifier achieved the best performance, with an AUC of 0.97, precision of 0.70, accuracy of 0.86, sensitivity of 1.0, and specificity of 0.78. The logistic regression algorithm performed worst, with an AUC of 0.87, precision of 0.55, accuracy of 0.71, sensitivity of 0.86, and specificity of 0.64.
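To make the relationship between the four threshold-based figures concrete, the following minimal sketch computes precision, accuracy, sensitivity, and specificity from a 2×2 confusion matrix. The counts used (7 true positives, 3 false positives, 11 true negatives, 0 false negatives) are hypothetical: no confusion matrix is reported here, and the counts were chosen only because they give values close to those of the KNN classifier. The function name `binary_metrics` is likewise an illustrative choice.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return standard binary-classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero (at least one predicted positive,
    one actual positive, and one actual negative).
    """
    return {
        "precision": tp / (tp + fp),                  # positive predictive value
        "accuracy": (tp + tn) / (tp + fp + tn + fn),  # fraction of all cases classified correctly
        "sensitivity": tp / (tp + fn),                # recall, true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
    }

# Hypothetical counts (NOT taken from the study).
knn_like = binary_metrics(tp=7, fp=3, tn=11, fn=0)
for name, value in knn_like.items():
    print(f"{name}: {value:.3f}")
```

With these counts, sensitivity is 1.0 because no positive case is missed, while precision is only 0.70 because three of the ten positive calls are false positives. Perfect sensitivity can therefore coexist with modest precision, which is the pattern reported for the KNN classifier.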
The use of radiomics-based machine learning for the diagnosis of CAE and brain metastases has several potential advantages over traditional methods. First, it may provide more accurate and reproducible results, because it extracts more detailed quantitative information from medical images. In addition, it can detect subtle differences between CAE and brain metastases that may not be visible to the naked eye. Cerebral alveolar echinococcosis is a rare parasitic disease, but it remains a severe public health issue in many parts of the world. We believe that radiomics-based machine learning, which has proven to be a powerful approach in other fields33–36, is a novel tool for investigating this disease.
Alveolar echinococcosis (AE), which is caused by Echinococcus multilocularis, is a rare zoonotic infection associated with high rates of morbidity, mortality, and disability, particularly in endemic regions, which are limited exclusively to the northern hemisphere37. When eggs are ingested by the intermediate host, the embryo is released in the duodenum, passes through the intestinal wall, and reaches the liver, the primary organ affected, via the portal vein38. Extrahepatic involvement is possible and arises by spread from the liver lesions39. Primary cerebral alveolar echinococcosis is very rare; to date, only 7 patients have been reported40–44. Brain involvement is considered a sign of the terminal stage and carries a poor prognosis45. The treatment of AE mainly involves surgery and chemotherapy with benzimidazoles (albendazole or mebendazole). Because of its tumor-like growth, a complete cure is possible in some cases, but it can only be achieved by radical resection of the lesion with adjuvant benzimidazole chemotherapy46. In most cases, however, complete resection cannot be carried out because of late admission, and chemotherapy is the backbone of treatment. Chemotherapy for these patients should last for a long time: some articles suggest that it be sustained for at least about 2 years, while others found no recurrence after chemotherapy was suspended under regular follow-up and evaluation of the patient's condition47. The introduction of benzimidazole and its derivatives (albendazole and mebendazole) has greatly improved patients' quality of life. According to relevant studies, the prognosis was poor before the introduction of benzimidazole in 1976. In the 1970s, the life expectancy of AE patients was estimated to be shortened by 18.2 years for men and 21.3 years for women. By 2005, however, these figures had decreased to 3.5 years and 2.6 years, respectively48.
The discovery of benzimidazole was a major breakthrough in this field; the prolonged survival and improved quality of life of patients are attributed to continuous chemotherapy.
To better describe the anatomical invasion characteristics of the disease, the WHO-IWG (World Health Organization Informal Working Group on Echinococcosis) proposed the PNM classification system. P (P1–4) represents the location of the parasite in the liver, N indicates whether adjacent organs are affected, and M indicates whether metastasis has formed49,50.
With the advancement of modern medical science, most cancer patients have longer survival and better prognoses, but about 20% of patients will develop brain metastases, which have poor clinical outcomes51. Lung cancer, breast cancer, and melanoma are the primary tumors that most frequently give rise to brain metastases, accounting for 67–80% of patients52. In our study, of the 38 patients with brain metastases, 26 had lung cancer, 5 breast cancer, 2 esophageal cancer, 2 renal cancer, 1 melanoma, 1 testicular cancer, and 1 gastric cancer, which is consistent with previous studies52. Studies53 show that the highest rates of brain metastases are found in individuals with small-cell lung cancer (SCLC) or non-small-cell lung cancer (NSCLC) at diagnosis.
Diagnosis of brain metastases requires imaging examination, and MRI is the imaging method of choice because it is more sensitive than CT at identifying the size, number, and distribution of lesions in the central nervous system. Brain metastases are often solid and ring-enhancing lesions with a pseudospherical form at the grey-white junction; their most common locations are the cerebral hemispheres (80%), cerebellum (15%), and brainstem (5%)54. Patients with BM often receive multimodal therapy, including surgery, radiotherapy, chemotherapy, immunotherapy, targeted therapy, and endocrine therapy55.
Cerebral alveolar echinococcosis (CAE) lesions are often confused with brain metastases (BM) because the two have similar imaging features and signal intensity, which makes them difficult to diagnose correctly. Although CAE is known as "worm cancer", its biological characteristics are very different from those of BM, and the two diseases differ completely in treatment and prognosis; a correct differential diagnosis before treatment is therefore crucial. Because rupture of CAE lesions can lead to disseminated or chemical meningitis and fatal anaphylaxis, puncture biopsy is absolutely contraindicated in CAE8,56,57. Accurate imaging diagnosis is therefore required before treatment.
Radiomics aims to extract high-throughput quantitative features from radiographic images and to use them to train a prediction model58. Since its introduction by Philippe Lambin in 2012, radiomics has shown considerable promise for developing models that distinguish different types of tumors based on the numerous image features extracted from MRI that represent tumor heterogeneity14,17,21,59. As an important part of artificial intelligence, machine learning also has enormous potential in medical image processing, and radiomics combined with machine learning has been widely studied in recent years. This approach has recently become a powerful tool for facilitating personalized therapy in clinical practice, particularly for tumor detection, subtype classification, and prognostic assessment.
Dong J et al17 used this approach to distinguish ependymoma from medulloblastoma in children. They used contrast-enhanced T1WI MRI images of 51 patients (24 ependymomas and 27 medulloblastomas) and extracted 188 features, including histogram, shape-based, and textural features. They then selected 66 features using univariate analysis, univariate analysis screening, and multivariate logistic regression, and built four machine learning models: random forest, support vector machine, adaptive boosting, and K-nearest neighbor. The highest AUC was obtained when random forest was used with features selected by multivariate logistic regression (AUC = 0.91). The combination of radiomics and machine learning could distinguish ependymoma from medulloblastoma in children well, which could assist doctors in clinical practice. In our study, the KNN classifier had an AUC of 0.97, an accuracy of 0.86, and a specificity of 0.78. Qian Z et al14 sought to identify the best machine learning model for differentiating glioblastoma from solitary brain metastases. A total of 412 individuals with solitary brain tumors (242 glioblastomas and 170 solitary brain metastases) were divided into training and test groups. They used PyRadiomics software to extract 1303 radiomics features, and then applied 12 feature selection methods and 7 supervised machine learning algorithms. Combining the 12 subsets of selected features with the 7 classification methods yielded 84 machine learning classifiers, 13 of which performed exceptionally well in the training group (AUC 0.95). In the test group, support vector machine (SVM) combined with the least absolute shrinkage and selection operator (LASSO) had the best prediction efficiency (AUC = 0.90). As in our study, the application of machine learning can help physicians and neuroradiologists to accurately distinguish, before clinical intervention, two different brain lesions that require entirely different treatment.
Cheng J et al33 used an approach similar to ours, radiomics combined with machine learning, to differentiate immune checkpoint inhibitor-related pneumonitis from radiation pneumonitis in lung cancer, and achieved excellent outcomes.
Apart from differentiating tumors, radiomics combined with machine learning is also used to evaluate treatment efficacy. Tumor prognosis and immunotherapy response are closely related to the tumor immune microenvironment (TIME). To better understand the relationship between the neutrophil-to-lymphocyte ratio (NLR) in the TIME and radiomics imaging biomarkers, along with its associations with tumor prognosis and immunotherapy response in advanced gastric cancer, Huang W et al26 developed and validated a CT-based radiomics score (RS) using 2272 gastric cancer patients. The AUCs of the RS for predicting the NLR in the TIME ranged from 0.795 to 0.861. Notably, the radiomics imaging biomarkers were as accurate as the immunohistochemistry (IHC)-derived NLR status at predicting disease-free survival (DFS) and overall survival (OS) in each group. In the cohort of patients receiving anti-PD-1 immunotherapy, the objective response rate was significantly higher in the low-RS group than in the high-RS group. In patients with gastric cancer, radiomics imaging biomarkers therefore offer a non-invasive way to evaluate the TIME and may be related to prognosis and responsiveness to anti-PD-1 treatment.
Lao J et al60 investigated whether radiomics signatures built from deep features extracted by transfer learning can predict overall survival in patients with glioblastoma multiforme (GBM), analyzing a discovery data set of 75 patients and an independent validation data set of 37 patients. In total, 98304 deep features and 1403 handcrafted features were obtained from preoperative multi-modality MR images. After feature selection, a signature of six deep features was built using the least absolute shrinkage and selection operator (LASSO) Cox regression model. A radiomics nomogram was additionally created by integrating the signature with clinical risk factors, including age and the Karnofsky Performance Score. The proposed signature outperformed conventional risk factors in predicting overall survival (OS) and significantly improved the stratification of patients into prognostically distinct groups. Their findings show that deep features based on transfer learning can provide prognostic imaging signatures for OS prediction and patient stratification in GBM, demonstrating the promise of deep imaging feature-based biomarkers in the preoperative management of these patients.
Because CAE is rare and data are limited, we used nested cross-validation, which is particularly helpful when the dataset is small and the model has numerous hyperparameters to tune61. Nested cross-validation benefits generalization for several reasons. First, it gives a more realistic estimate of the model's performance and thereby reduces the risk of reporting an optimistically biased, overfitted result. The data are split into an outer and an inner loop: the inner loop is used for hyperparameter tuning, so that the model is tuned for generalization rather than fitted to the training data, while the outer loop provides an unbiased assessment of how well the model will perform on unseen data. Second, cross-validation reduces the dependence of the performance estimate on a particular train-test split. By repeating the process with different splits of the data, the variability of the performance estimate can be assessed, which shows how well the model performs on different unseen subsets of the data. Third, nested cross-validation makes model selection more reliable. It allows different models or hyperparameter combinations to be compared objectively and the best-performing one to be chosen, which helps to identify models that are effective not only on the training data but also on new, unseen data62,63. For the selection of biomarkers in high-dimensional data, the shrinkage-based variable selection method LASSO has been widely used64. By adding a penalty function, it builds a more parsimonious model in which some coefficients are shrunk and others are set exactly to zero. In this way, feature screening (dimension reduction) is performed and over-fitting is reduced during model training.
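The nested cross-validation scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in this study: it uses only the Python standard library, a toy K-nearest-neighbor classifier, synthetic two-feature data, and accuracy as the score. All function names, the grid of `k` values, and the fold counts are hypothetical choices.

```python
import random

def k_folds(indices, k, rng):
    """Shuffle the given indices and deal them into k folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def knn_predict(train, x, k):
    """Majority vote (labels 0/1) among the k nearest (features, label) pairs."""
    nearest = sorted(train, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))[:k]
    return 1 if 2 * sum(label for _, label in nearest) > len(nearest) else 0

def accuracy(data, train_idx, test_idx, k):
    """Fit on train_idx (KNN just stores the points) and score on test_idx."""
    train = [data[i] for i in train_idx]
    hits = sum(knn_predict(train, data[i][0], k) == data[i][1] for i in test_idx)
    return hits / len(test_idx)

def tune_k(data, train_idx, k_grid, inner_k, rng):
    """Inner loop: pick the k with the best mean validation accuracy."""
    folds = k_folds(train_idx, inner_k, rng)

    def mean_score(k):
        scores = []
        for val in folds:
            val_set = set(val)
            fit = [i for i in train_idx if i not in val_set]
            scores.append(accuracy(data, fit, val, k))
        return sum(scores) / len(scores)

    return max(k_grid, key=mean_score)

def nested_cv(data, k_grid=(1, 3, 5), outer_k=5, inner_k=3, seed=0):
    """Outer loop: score each tuned model on a fold never used for tuning."""
    rng = random.Random(seed)
    scores = []
    for test in k_folds(range(len(data)), outer_k, rng):
        test_set = set(test)
        train = [i for i in range(len(data)) if i not in test_set]
        best_k = tune_k(data, train, k_grid, inner_k, rng)
        scores.append(accuracy(data, train, test, best_k))
    return scores

def make_toy_data(n_per_class=20, seed=1):
    """Two synthetic, well-separated classes with two features each."""
    rng = random.Random(seed)
    return [([rng.gauss(c, 1.0), rng.gauss(c, 1.0)], label)
            for label, c in ((0, 0.0), (1, 3.0)) for _ in range(n_per_class)]

data = make_toy_data()
outer_scores = nested_cv(data)
print("outer-fold accuracies:", outer_scores)
print("mean accuracy:", sum(outer_scores) / len(outer_scores))
```

The key design point is that `tune_k` only ever sees the outer-training indices, so the outer test fold plays no part in choosing the hyperparameter. In a full radiomics pipeline, any feature scaling or feature selection would need to be fitted inside the same loops for the outer estimate to remain unbiased.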
In the domains of molecular biology and neuroimaging, SVM is a robust and efficient machine-learning classifier65. These characteristics allowed the LASSO regression model and the SVM classifier to work well together in the radiomics investigation. Additionally, the radiomics features chosen by the LASSO algorithm came from a variety of filters and feature classes, which suggests that multiple feature categories may provide complementary information for differentiating between CAE and BM. Although the biological basis of these radiomics features is not yet known, we hypothesize that they may capture fine characteristics of the lesion microstructure and its immediate surroundings.
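The way the LASSO penalty sets some coefficients exactly to zero can be illustrated with the soft-thresholding operator. For the objective (1/2)·||y − Xb||² + lam·||b||₁ with an orthonormal design matrix, the LASSO solution is obtained by applying this operator to each least-squares coefficient. The sketch below is a simplified illustration under that assumption, not the feature selection procedure used in this study; the coefficient values and the penalty `lam` are made up.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values with |z| <= lam become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

# Hypothetical least-squares coefficients for four features (illustrative values).
ols_coefs = [2.5, -0.25, 0.125, -1.75]
lam = 1.0  # hypothetical penalty strength

lasso_coefs = [soft_threshold(b, lam) for b in ols_coefs]
selected = [j for j, b in enumerate(lasso_coefs) if b != 0.0]

print(lasso_coefs)  # [1.5, 0.0, 0.0, -0.75]
print(selected)     # [0, 3]
```

Features whose coefficients fall below the penalty in magnitude are dropped entirely, while the remaining coefficients are shrunk, which is why LASSO performs feature selection and regularization at the same time.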
Finally, radiomics combined with machine learning has the potential to revolutionize the way we diagnose and differentiate between cerebral alveolar echinococcosis and brain metastases. Radiomics is a branch of medical imaging that uses advanced algorithms to extract quantitative features from medical images. These features can then be used to build predictive models that accurately differentiate between CAE and brain metastases66.