Differentiation between cerebral alveolar echinococcosis and brain metastases with radiomics combined machine learning approach

doi:10.21203/rs.3.rs-3304181/v1

Research Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 09 Dec, 2023

Read the published version in European Journal of Medical Research →

Background

Cerebral alveolar echinococcosis (CAE) and brain metastases (BM) are similar in location and imaging appearance. However, CAE is usually treated with chemotherapy and surgery, whereas BM is often treated with radiotherapy and targeted treatment of the primary malignancy. Accurate diagnosis is therefore critical, given the vastly different treatment approaches for these conditions.

Purpose

This study aims to investigate the effectiveness of radiomics and machine learning approaches on magnetic resonance imaging (MRI) in distinguishing CAE and BM.

Methods

We retrospectively analyzed MRI images of 130 patients (30 CAE, 100 BM; training set = 91, testing set = 39) with confirmed CAE or BM at the First Affiliated Hospital of Xinjiang Medical University from January 2014 to December 2022. Three-dimensional lesions were segmented by radiologists on contrast-enhanced T1WI images using the open-source software 3D Slicer. Features were extracted with PyRadiomics, and feature reduction was carried out using univariate analysis, correlation analysis, and the least absolute shrinkage and selection operator (LASSO). Finally, we built five machine learning models (support vector machine, logistic regression, linear discriminant analysis, k-nearest neighbors classifier, and Gaussian naive Bayes) and evaluated their performance via several metrics, including sensitivity (recall), specificity, positive predictive value (precision), negative predictive value, accuracy, and the area under the curve (AUC).

Results

The AUCs of the SVC, LR, LDA, KNN, and NB algorithms in the training (testing) sets were 0.99 (0.94), 1.00 (0.87), 0.98 (0.92), 0.97 (0.97), and 0.98 (0.93), respectively. Nested cross-validation demonstrated the robustness and generalizability of the models. Additionally, the calibration plot and decision curve analysis demonstrated the practical usefulness of these models in clinical practice, with lower bias toward different subgroups during decision-making.

Conclusion

The combination of radiomics and machine learning on contrast-enhanced T1WI images could distinguish CAE from BM well. This approach holds promise in assisting doctors with accurate diagnosis and clinical decision-making.

Cerebral alveolar echinococcosis

Brain metastases

Machine learning

Radiomics

Hydatid disease comprises two distinct diseases, cystic echinococcosis and alveolar echinococcosis, which are caused by the larval stages of Echinococcus granulosus and Echinococcus multilocularis, respectively¹. Alveolar echinococcosis is a lethal parasitic disease whose endemic area is limited to the northern hemisphere, including Japan, northwestern China, Central Asia, Russia, parts of Iran and Türkiye, central Europe, and North America. Its primary host is the red fox; however, domestic dogs play a substantial role in transmission to humans. The adult stage of the parasite inhabits the small intestine of domestic and wild canid hosts and releases eggs into the environment through the feces of these hosts. Humans become infected by ingesting food or water contaminated with eggs, or through contact with contaminated soil or direct contact with canid hosts^2–6. After eggs are ingested by intermediate hosts, the embryos penetrate the gut wall, reach internal organs through the blood or lymph circulation, and develop into the larval stage. Humans are unusual intermediate hosts of the parasite. The liver is the initial site of infestation in more than 95% of cases; the larva may spread to other organs by regional extension or by distant metastasis through hematogenous or lymphatic pathways, reaching the lung, spleen, brain, bones, lymph nodes, or muscles⁷. Cerebral alveolar echinococcosis is a rare and severe parasitic infection of the central nervous system that accounts for about 1% of cases with extrahepatic involvement. The disease predominantly affects adults, with the average age at diagnosis ranging from 40 to 60 years; however, cases in children and adolescents have also been reported⁸.

Brain metastases account for about 50% of supratentorial brain tumors and are the most common secondary malignant brain tumors. They are most often seen in patients with lung cancer, breast cancer, and melanoma^9,10. In daily clinical practice, CAE and BM are easy to diagnose in patients with a definite history of extracerebral AE or a primary malignancy. However, when clinical information is insufficient, or when CAE is found in non-endemic areas, it is difficult to differentiate them accurately.

Radiomics and machine learning have emerged as topics of great interest in medical imaging and nuclear medicine in recent years. Generally, radiomics aims to extract a wealth of information from medical images, converting them into large amounts of minable data that are difficult to discern with the human eye and that provide valuable insights into tumor physiology and phenotypes¹¹. Numerous researchers have successfully used radiomics approaches to achieve accurate tumor differentiation and assess tumor biology^12–15. Machine learning leverages sophisticated algorithms to process vast amounts of data, uncovering meaningful patterns that may be challenging to detect even for highly skilled individuals¹⁶. In medicine, machine learning has found extensive use, ranging from differential diagnosis of brain tumors^14,17–23 and classification of tumor phenotypes²⁴ to disease onset prediction based on patients’ electronic records²⁵ and evaluation of the tumor immune microenvironment for predicting immunotherapy efficacy²⁶.

CAE and BM have highly similar imaging appearances. CAE lesions are characterized by keratinized, calcified, fibrotic, and other mixed components; they mostly show equal or slightly high signal on T1WI and markedly low signal on T2WI and T2WI-FLAIR sequences. The edema is usually extensive. On enhanced scans, large foci show clear ring enhancement, whereas small foci show solid nodule-like enhancement^27,28.

In contrast, BM lesions tend to be roughly spherical, with predominantly low signal on T1WI and variable signal on T2WI depending on the components within the lesion. On many common MRI sequences, the presence of calcification, hemorrhage, and cystic components affects the signal of the lesion, which is often surrounded by vasogenic edema. The edema is usually extensive, presenting as a "small lesion with large edema", with wreath-like enhancement on enhanced scans²⁹.

MRI is currently one of the most advantageous examinations of the nervous system. However, conventional MRI techniques only provide information on the location, size, morphology, degree of edema surrounding the lesion, and macroscopic structural changes in the lesion such as necrosis and cystic changes, and this information is not always sufficient for diagnosis.

Because CAE and BM have similar imaging findings but entirely different treatments, it is critically important for clinicians to diagnose them accurately before clinical intervention. Moreover, accurate diagnosis is particularly difficult in non-endemic areas. We therefore used machine learning models combined with a radiomics approach to distinguish the two diseases.

The institutional review board of The First Affiliated Hospital of Xinjiang Medical University approved this study. Because the study was retrospective, written informed consent was not required.

2.1 Study population

We searched our electronic medical record system for patients diagnosed with CAE or BM in our hospital from January 2015 to December 2022 and identified 130 patients with histologically proven CAE or BM. The inclusion criteria were as follows: (1) pathological confirmation of CAE or BM; (2) pathologically confirmed hepatic alveolar echinococcosis together with a comprehensive clinical diagnosis of CAE; (3) availability of T1WI, T2WI, and contrast-enhanced data from preoperative multi-parametric MRI; (4) absence of preoperative treatment history; (5) a definite history of extracerebral malignancy; (6) absence of prior brain cancer in all BM cases; (7) availability of clinical characteristics. The exclusion criteria were as follows: (1) previous treatment for CAE or BM (such as surgery, radiation, or chemotherapy); (2) unavailable imaging data; (3) imaging artifacts that made lesion segmentation difficult.

All participants in this research were divided at random into a training and a testing set at the ratio of 7:3 (Fig. 1).
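The 7:3 random division can be sketched as follows. This is only an illustration: the seed and the function name `split_indices` are hypothetical, and the text does not say whether the split was stratified by diagnosis, so a plain shuffle is assumed here.

```python
import random

def split_indices(n_patients, train_fraction=0.7, seed=0):
    """Shuffle patient indices and cut them into training and testing sets."""
    indices = list(range(n_patients))
    rng = random.Random(seed)  # hypothetical seed, used only for reproducibility
    rng.shuffle(indices)
    n_train = round(n_patients * train_fraction)
    return indices[:n_train], indices[n_train:]

train_idx, test_idx = split_indices(130)
print(len(train_idx), len(test_idx))  # 91 39
```

With 130 patients this reproduces the reported set sizes of 91 and 39.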

2.2 Imaging characteristics

A 3.0-T Signa Hdx MR scanner (General Electric, USA) was used. All examinations included axial T1WI, axial T2WI, axial fluid-attenuated inversion recovery (FLAIR), and sagittal T2WI sequences, as well as contrast-enhanced axial, sagittal, and coronal T1WI sequences. The main parameters were as follows: axial T1WI, TR = 200 ms, TE = 12 ms, slice thickness = 6 mm; T2WI, TR = 3900 ms, TE = 120 ms, slice thickness = 6 mm, with a field of view (FOV) of 256 × 256 matrices. DTPA-Gd injection (0.1 mmol/kg, Beijing Beilu Pharmaceutical Co., Beijing, China) was used for contrast-enhanced MRI scans, with the following parameters: TR = 200 ms, TE = 12 ms, slice thickness = 6 mm. Images were retrieved from the picture archiving and communication system (PACS) in Digital Imaging and Communications in Medicine (DICOM) format.

2.3 Image Segmentation

Two radiologists, with more than three and more than five years of experience in neuroradiology, respectively, reviewed the images of all sequences (T1WI, T2WI, FLAIR, and contrast-enhanced images) independently and blinded to the clinical data. Using 3D Slicer 4.10.1 (https://www.slicer.org), they drew the volume of interest (VOI) along the tumor border on contrast-enhanced images.

2.4 Radiomics Feature Pre-processing and Extraction

We used PyRadiomics package (version 3.0.1) to calculate all radiomics characteristics ³⁰. Shape features, first-order (distribution) features, and texture features are the three categories into which image features can be divided. All intensities within the VOI of MR images were discretized to 25 bins. We set the resampling parameter to 1 × 1 × 1 mm³ and the normalization parameter as true for the MR image before feature extraction. First, metrics like volume and surface as well as more complex variables like compactness and sphericity were determined by the segmentation's shape. The distribution of intensities in the volume of interest was examined to produce the second category of characteristics. These characteristics include conventional distributional measures like the mean, median, and interquartile range as well as shape descriptors like skewness and information-theoretical metrics like entropy. Third, texture characteristics were extracted from the volume of interest using discretized gray values. To describe patterns in the discretized gray values, various matrices were developed, including the gray-level size-zone matrix (GLSZM), gray-level run-length matrix (GLRLM), and gray-level co-occurrence matrix (GLCM). Gray-level dependence matrix (GLDM) and neighboring gray-tone difference matrix (NGTDM) are two more matrices that examined the immediate vicinity of pixels. In addition to extracting the characteristics mentioned above, filters were applied to these images to decrease the noise that is inherent to each MR measurement. For each patient, a total of 1584 radiomics features were extracted.
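The extraction settings named above (25 bins, 1 × 1 × 1 mm³ resampling, normalization enabled) map onto PyRadiomics configuration keys roughly as sketched below. The image and mask paths are hypothetical inputs, and the wavelet filter is an assumption made for illustration, since the text does not list which filters were applied.

```python
def extraction_settings():
    """Settings named in the text: 25 bins, 1 x 1 x 1 mm resampling, normalization on."""
    return {
        "binCount": 25,
        "resampledPixelSpacing": [1, 1, 1],
        "normalize": True,
    }

def extract_features(image_path, mask_path):
    """Run PyRadiomics on one image/mask pair (both paths are hypothetical)."""
    # Imported lazily so the settings can be inspected without pyradiomics installed.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor(**extraction_settings())
    extractor.enableAllFeatures()               # shape, first-order and texture classes
    extractor.enableImageTypeByName("Wavelet")  # assumed filter; not stated in the text
    return extractor.execute(image_path, mask_path)

print(extraction_settings())
```

The number of features returned depends on which filters are enabled, so this sketch is not expected to reproduce the 1584 features reported.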

2.5 Feature Selection

Several features were expected to be correlated, for example when many filters are applied to the same image. Datasets with a large number of features make a machine learning model difficult to visualize and analyze, and they increase the time and space complexity of the model. Uninformative features in the dataset may also cause the model to perform poorly on the testing data. Consequently, feature-selection algorithms were used to reduce the number of features needed for training. The variance threshold method removed features with a variance of less than 0.8. Finally, the least absolute shrinkage and selection operator (LASSO) was applied with the optimal lambda to shrink the coefficients of unimportant features to zero.
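For reference, the generic form of the LASSO estimate can be written as below. The symbols are introduced here for illustration only: $x_i$ is the feature vector of patient $i$, $y_i$ the label, $\ell$ a loss function and $\lambda$ the penalty weight. The text does not state which loss was used; squared error and logistic loss are the usual choices.

```latex
\hat{\beta} = \arg\min_{\beta_0,\,\beta}\;
  \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(y_i,\ \beta_0 + x_i^{\top}\beta\bigr)
  \;+\; \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert
```

Because the penalty is the sum of absolute values, increasing $\lambda$ drives more coefficients $\beta_j$ to exactly zero, and the features with non-zero coefficients are the ones retained.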

2.6 Model Construction and Optimization

We applied the selected features to several models in the current study: logistic regression (LR), support vector machine classifier (SVC), k-nearest neighbors (KNN), linear discriminant analysis (LDA), and Gaussian naive Bayes (NB).

Model optimization consists of modifying the values of the intrinsic parameters of each algorithm. A change to any parameter may improve or degrade prediction performance. A vital step in the tuning process is validating the model with the tuned parameters, yet this step carries a risk of data leakage. We therefore used grid search with a nested resampling method when optimizing parameters, in which the inner resampling (cv = 3) is responsible for tuning and the outer resampling (cv = 5) for validating the result³¹. Nested cross-validation is a technique for evaluating the performance of a machine learning model while its parameters are being tuned. The outer loop repeatedly splits the data into a training part and a testing part. Within each outer training part, the inner loop splits the data again into training and validation parts; candidate parameter settings are trained on the former and evaluated on the latter. The best setting is then tested on the outer testing part, which played no role in tuning. This technique is useful for assessing the accuracy of a model and for selecting the best model for a given dataset³². Furthermore, the nested and non-nested strategies were compared with each other to assess the impact of data leakage.
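The nested resampling scheme with inner cv = 3 and outer cv = 5 can be illustrated with a small, dependency-free sketch. Everything in it is an assumption made for illustration: the data are synthetic (not patient data), the only tuned parameter is the neighbourhood size of a toy one-feature k-nearest-neighbours classifier, and the seeds and function names are hypothetical.

```python
import random
from statistics import mean

def make_folds(indices, k, seed):
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = indices[:]
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def knn_predict(train_idx, query, X, y, k):
    """Majority vote among the k nearest training samples (single toy feature)."""
    nearest = sorted(train_idx, key=lambda i: abs(X[i] - query))[:k]
    votes = sum(y[i] for i in nearest)
    return 1 if 2 * votes > len(nearest) else 0

def accuracy(train_idx, eval_idx, X, y, k):
    hits = sum(knn_predict(train_idx, X[i], X, y, k) == y[i] for i in eval_idx)
    return hits / len(eval_idx)

def cross_validate(indices, X, y, k, n_folds, seed):
    """Mean accuracy of k-NN with neighbourhood size k over n_folds folds."""
    scores = []
    for held_out in make_folds(indices, n_folds, seed):
        train = [i for i in indices if i not in held_out]
        scores.append(accuracy(train, held_out, X, y, k))
    return mean(scores)

def nested_cv(X, y, grid, outer_folds=5, inner_folds=3, seed=0):
    """Outer loop estimates performance; inner loop tunes k on outer-training data only."""
    all_idx = list(range(len(X)))
    outer_scores = []
    for fold_no, test in enumerate(make_folds(all_idx, outer_folds, seed)):
        train = [i for i in all_idx if i not in test]
        inner_seed = seed + fold_no + 1
        # The grid search never sees the outer test fold.
        best_k = max(
            grid,
            key=lambda k: cross_validate(train, X, y, k, inner_folds, inner_seed),
        )
        outer_scores.append(accuracy(train, test, X, y, best_k))
    return outer_scores

# Synthetic data: class 1 tends to have larger feature values than class 0.
rng = random.Random(42)
y = [0] * 40 + [1] * 20
X = [rng.gauss(2.0 * label, 1.0) for label in y]
scores = nested_cv(X, y, grid=[1, 3, 5])
print([round(s, 2) for s in scores], round(mean(scores), 2))
```

A non-nested variant would pick the best parameter using the same folds that are later reported as the score, which is the leakage the nested scheme is meant to avoid.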

2.7 Model Evaluation

Summary statistics were calculated for model performance, including sensitivity (recall), specificity, positive predictive value (precision), negative predictive value, accuracy, and the area under the curve (AUC). Receiver operating characteristic (ROC) curves were constructed. To evaluate the consistency between predicted values and actual labels, a calibration plot was created.
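The threshold-based metrics listed here all derive from the four cells of a confusion matrix, as the sketch below shows. The counts used in the example are hypothetical: they are not reported in the text and were chosen only because they yield values matching the KNN row of Table 2.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall, true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    ppv = tp / (tp + fp)                  # precision
    npv = tn / (tn + fn)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }

# Hypothetical counts (not reported in the source).
metrics = classification_metrics(tp=7, fn=0, fp=3, tn=11)
print({name: round(value, 3) for name, value in metrics.items()})
```

This sketch does not guard against empty denominators, for example when a model predicts no positives at all.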

2.8 Statistical Analysis

To determine whether continuous features were normally distributed, we applied the Shapiro-Wilk test. Normally distributed continuous features are presented as mean ± standard deviation (SD) and were compared using Student’s t-test, while non-normally distributed features are presented as median and interquartile range, M (P25, P75), and were compared using the rank sum test. Categorical data are presented as frequency (percentage), and Fisher's exact test or the χ2 test was used to compare the two groups. The independence of the selected features was examined using Pearson correlation coefficients. The above statistical analyses were performed using R 4.2.2 and SPSS 25.0 software.

3.1 Patient Characteristics

Table 1

Baseline of patients
Characteristics	ALL(n = 130) M(P₂₅,P₇₅), n(%)	BM(n = 100) M(P₂₅,P₇₅), n(%)	CAE(n = 30) M(P₂₅,P₇₅), n(%)	H/χ²	P
Age	43.50(33.00,52.00)	43.50(33.25,54.00)	43.00(31.00,50.00)	-1.515	0.130
BMI	20.00(19.00,22.25)	20.00(19.00,22.00)	20.00(19.00,23.00)	0.202	0.840
Gender				14.625	< 0.001
Male	78(60.00)	51 (51.00)	27(90.00)
Female	52(40.00)	49 (49.00)	3(10.00)
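As a worked check of the gender comparison in Table 1, the Pearson χ² statistic for a 2 × 2 table can be computed directly from the four counts. The function name is hypothetical, and the absence of a continuity correction is an assumption; the uncorrected statistic is used here because it reproduces the tabulated value.

```python
def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

# Rows: male, female. Columns: BM, CAE (counts taken from Table 1).
chi2 = pearson_chi_square_2x2(51, 27, 49, 3)
print(round(chi2, 3))  # 14.625
```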

3.2 Extraction and Selection of Features

For feature selection, 127 of the 1584 features were initially retained by univariate analysis. Afterward, 26 features remained after redundant variables with high correlation coefficients were removed. Finally, 9 optimal features were selected with the LASSO algorithm (Fig. 2). The Pearson correlation coefficient was used to determine whether these features were correlated. According to the results, the majority of the features were independent. The heat map of correlations among the radiomics features is displayed in Fig. 3.
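One common way to remove redundant, highly correlated features is a greedy Pearson filter, sketched below. The cutoff of 0.9, the feature names, and the toy values are all hypothetical; the text does not state the threshold or the exact procedure that was used.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)  # undefined for constant features (zero variance)

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| with every already-kept feature is below threshold."""
    kept = []
    for name in features:
        if all(abs(pearson(features[name], features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Toy values, one list per feature and one entry per patient.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0, 5.0],
    "surface": [2.1, 3.9, 6.2, 8.0, 9.9],  # nearly proportional to volume
    "entropy": [0.5, 0.1, 0.4, 0.2, 0.3],
}
print(drop_correlated(features))  # ['volume', 'entropy']
```

The result of a greedy filter depends on the order in which features are visited, so a real pipeline would need to fix that order deliberately.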

3.3 Model Optimization

We tuned the parameters of each model before building the model on the entire training dataset. The comparison of nested and non-nested resampling is shown in Fig. 4. The non-nested method showed apparently better performance, which is a result of data leakage when tuning the parameters.

3.4 Model Performance Evaluation

The ROC curves of the five radiomics models are shown in Fig. 5A and 5B. The AUC of SVC, LR, LDA, KNN, and NB algorithms in training (testing) sets are 0.99 (0.94), 1.00 (0.87), 0.98 (0.92), 0.97 (0.97), and 0.98 (0.93) respectively. Other metrics are shown in Table 2. The calibration plot in Fig. 5C and 5D revealed good consistency between the predicted and actual labels, indicating that the model's prediction is stable. The five radiomics models' decision curves demonstrated that each model performs better than both the treat-all-patients and the treat-none measures in terms of result prediction (Fig. 6A, B).
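Decision curve analysis compares the net benefit of a model against the treat-all and treat-none strategies across threshold probabilities. A minimal sketch of the standard net-benefit formula follows; the labels and predicted probabilities are made-up toy values, not study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    n = len(y_true)
    tp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_prob) if p >= threshold and t == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

def net_benefit_treat_all(y_true, threshold):
    """Net benefit when every patient is treated; treat-none is always 0."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)

# Toy data for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2, 0.2, 0.1, 0.1, 0.05]
for pt in (0.2, 0.5):
    model_nb = net_benefit(y_true, y_prob, pt)
    all_nb = net_benefit_treat_all(y_true, pt)
    print(pt, round(model_nb, 3), round(all_nb, 3))
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all value and zero.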

Table 2

Model Performance
Classifier	Brier loss	Log loss	Acc.	Recall	F1	Sen.	Spe.	Npv.	Ppv.
LDA	0.160	0.601	0.810	1.000	0.778	1.000	0.714	1.000	0.636
LR	0.231	1.091	0.714	0.857	0.667	0.857	0.643	0.900	0.545
SVC	0.159	0.507	0.762	0.857	0.706	0.857	0.714	0.909	0.600
KNN	0.130	0.396	0.857	1.000	0.824	1.000	0.786	1.000	0.700
NB	0.199	0.751	0.762	1.000	0.737	1.000	0.643	1.000	0.583
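The first two columns of Table 2 are probability-based scores. As a reminder of how they are defined, the sketch below computes the Brier score and the log loss from predicted probabilities; the labels and probabilities are toy values, not study data.

```python
from math import log

def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and the 0/1 label."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)
        total -= t * log(p) + (1 - t) * log(1 - p)
    return total / len(y_true)

# Toy data for illustration only.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.6, 0.4]
print(round(brier_score(y_true, y_prob), 4), round(log_loss(y_true, y_prob), 4))
```

For both scores, lower values indicate predicted probabilities that sit closer to the observed labels.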

In this study, we created a precise and repeatable classifier to differentiate patients with BM from those with CAE using a wide range of radiomics characteristics and machine learning methods. We built five machine learning models to distinguish CAE and BM on conventional contrast-enhanced T1WI images. The KNN classifier achieved the best performance, with an AUC of 0.97, precision of 0.70, accuracy of 0.86, sensitivity of 1.0, and specificity of 0.78. The logistic regression algorithm had the lowest performance, with an AUC of 0.87, precision of 0.55, accuracy of 0.71, sensitivity of 0.86, and specificity of 0.64.

The use of radiomics-based machine learning for the diagnosis of CAE and brain metastases has several advantages over traditional methods. First, it can provide more accurate and reliable results, because it extracts more detailed information from medical images than traditional methods do. Additionally, it can detect subtle differences between CAE and brain metastases that may not be visible to the naked eye. Cerebral alveolar echinococcosis is a rare parasitic disease, but it remains a severe public health issue in many parts of the world. We believe that radiomics-based machine learning, which has proven to be a powerful approach in other fields^33–36, is a novel tool for investigating this disease.

Alveolar echinococcosis, which is caused by Echinococcus multilocularis, is a rare zoonotic infection associated with significant incidence, mortality, and disability, particularly in endemic regions, which are limited to the northern hemisphere³⁷. When eggs are ingested by the intermediate host, the embryo is released in the duodenum, passes through the intestinal wall, and reaches the liver via the portal vein; the liver is the primary organ affected³⁸. Extrahepatic involvement is possible through spread from liver lesions³⁹. Primary cerebral alveolar echinococcosis is very rare, and only 7 patients have been reported to date^40–44. Brain involvement is considered a sign of the terminal stage and carries a poor prognosis⁴⁵. The treatment of AE mainly involves surgery and benzimidazole (albendazole or mebendazole) chemotherapy. Because of its tumor-like growth, a complete cure is possible in some cases, but it can only be achieved by radical lesion resection and adjuvant chemotherapy with benzimidazoles⁴⁶. In most cases, however, complete resection cannot be carried out because of late admission, and chemotherapy is the backbone of treatment. Chemotherapy for these patients should be long-term (some articles suggest at least about 2 years, whereas others found no recurrence after chemotherapy was suspended under regular follow-up and condition evaluation)⁴⁷. The introduction of benzimidazole and its derivatives (albendazole and mebendazole) has greatly improved patients’ quality of life. According to relevant studies, the prognosis was poor before the introduction of benzimidazole in 1976. In the 1970s, the life expectancy of AE patients was estimated to be shortened by 18.2 years for men and 21.3 years for women. By 2005, these figures had decreased to 3.5 years and 2.6 years, respectively⁴⁸. The discovery of benzimidazole was a major breakthrough in this field, and the prolonged survival and improved quality of life of patients are attributed to continuous chemotherapy.

To better describe the anatomical invasion characteristics of the disease, the World Health Organization Informal Working Group on Echinococcosis (WHO-IWG) proposed the PNM classification system. P1-4 represents the location of the parasite in the liver, N indicates whether adjacent organs are affected, and M indicates whether metastasis has formed^49,50.

With the advancement of modern medical science, most cancer patients have extended survival periods and better prognoses; however, about 20% of patients will develop brain metastases, which have poor clinical outcomes⁵¹. Brain metastases develop most frequently in patients with lung cancer, breast cancer, and melanoma, which account for 67–80% of patients⁵². In our study, among 38 patients with brain metastases, 26 had lung cancer, 5 breast cancer, 2 esophageal cancer, 2 renal cancer, 1 melanoma, 1 testicular cancer, and 1 gastric cancer, which is consistent with previous studies⁵². Studies⁵³ show that the highest rates of brain metastases are found in individuals with small-cell lung cancer (SCLC) or non-small-cell lung cancer (NSCLC) at diagnosis.

Diagnosis of brain metastases requires imaging examination, and MRI is the imaging method of choice since it is more sensitive than CT at identifying the size, number, and distribution of lesions in the central nervous system. Brain metastases are often solid and ring-enhancing lesions with a pseudospherical form at the grey-white junction, and their most common locations are the cerebral hemispheres (80%), cerebellum (15%), and brainstem (5%)⁵⁴. Patients with BM often receive multimodal therapies, including surgery, radiotherapy, chemotherapy, immunotherapies, targeted therapies, and endocrine therapy⁵⁵.

CAE lesions are often confused with BM because they have similar imaging features and signal intensity, which makes them difficult to diagnose correctly. Although CAE is known as “worm cancer”, its biological characteristics are very different from those of BM, and the treatment and prognosis of the two diseases are completely different; a correct differential diagnosis before treatment is therefore crucial. Because rupture of CAE lesions can lead to dissemination, chemical meningitis, and fatal anaphylaxis, puncture biopsy is absolutely contraindicated in CAE^8,56,57. Accurate imaging diagnosis is therefore required before treatment.

Radiomics is closely tied to machine learning: it aims to extract high-throughput quantitative image features from radiographic images and to train a prediction model on them⁵⁸. Since its introduction by Philippe Lambin in 2012, radiomics has demonstrated considerable promise in developing models that can distinguish different types of tumors based on the numerous image features extracted from MRI that represent tumor heterogeneity^14,17,21,59. As an important part of artificial intelligence, machine learning also has enormous potential in medical image processing. Radiomics combined with a machine learning approach has been widely studied in recent years. This method has recently become a powerful tool for facilitating therapy personalization in clinical practice, particularly in tumor detection, subtype categorization, and prognostic assessment.

Dong J et al¹⁷ aimed to distinguish ependymoma from medulloblastoma in children. They used contrast-enhanced T1WI MRI images of 51 patients (24 ependymomas and 27 medulloblastomas) and extracted 188 features, including histogram, shape-based, and textural features. They then selected 66 features using univariate analysis, univariate analysis screening, and multivariate logistic regression, and built four machine learning models: random forest, support vector machine, adaptive boosting, and K-nearest neighbor. The highest AUC was obtained when random forest was combined with features selected by multivariate logistic regression (AUC = 0.91). The combination of radiomics and machine learning methods could distinguish ependymoma from medulloblastoma in children well, which could assist doctors in clinical practice. In our study KNN classifier has AUC of 0.97, 0.94, accuracy 0.86, and specificity 0.78. Qian Z et al¹⁴ sought to identify the best machine learning model for differentiating glioblastoma from solitary brain metastases. Training and test groups were created from 412 individuals with solitary brain tumors (242 glioblastomas and 170 solitary brain metastases). They extracted 1303 radiomics features with PyRadiomics, used 12 feature selection methods, and applied 7 supervised machine learning algorithms. Eighty-four machine learning classifiers were created by analyzing the 12 subgroups of selected features with the 7 classification methods. In the training group, 13 classifiers performed exceptionally well in terms of prediction (AUC 0.95); in the test group, the support vector machine (SVM) combined with the least absolute shrinkage and selection operator (LASSO) had the best prediction efficiency (AUC = 0.90). As in our study, the application of machine learning can help physicians and neuroradiologists accurately identify two different brain tumors that require entirely different treatments before clinical intervention. Cheng J et al³³ used an approach similar to ours, combining radiomics with machine learning to differentiate immune checkpoint inhibitor-related pneumonitis from radiation pneumonitis in lung cancer, and achieved excellent outcomes.

Apart from differentiating tumors, radiomics combined with machine learning models is also used to evaluate treatment efficacy. Tumor prognosis and immunotherapy response are closely related to the tumor immune microenvironment (TIME). To better understand the relationship between the neutrophil-to-lymphocyte ratio (NLR) and radiomics imaging biomarkers in the TIME, along with its associations with tumor prognosis and immunotherapy response in advanced gastric cancer, Huang W et al²⁶ developed and validated a CT-based radiomics score (RS) using 2272 gastric cancer patients. The AUC of the RS for predicting NLR in the TIME ranged from 0.795 to 0.861. Notably, radiomics imaging biomarkers were as accurate as immunohistochemistry (IHC)-derived NLR status at predicting disease-free survival (DFS) and overall survival (OS) in each group. In the cohort of patients receiving anti-PD-1 immunotherapy, objective responses were significantly higher in the low-RS group than in the high-RS group. In patients with gastric cancer, radiomics imaging biomarkers therefore offer a non-invasive way to evaluate the TIME and may be related to prognosis and responsiveness to anti-PD-1 treatment.

To determine whether radiomics signatures built from deep features extracted by transfer learning can predict overall survival in patients with glioblastoma multiforme (GBM), Lao J et al⁶⁰ analyzed a discovery data set of 75 patients and an independent validation data set of 37 patients. From preoperative multi-modality MR images, a total of 98304 deep features and 1403 handcrafted features were obtained. After feature selection, a signature of six deep features was produced using the least absolute shrinkage and selection operator (LASSO) Cox regression model. By integrating the signature with clinical risk factors including age and the Karnofsky Performance Score, a radiomics nomogram was additionally created. The proposed signature outperformed conventional risk variables in terms of overall survival (OS) prediction and significantly improved patient stratification into prognostically distinct groups. The findings of their study demonstrate that deep features based on transfer learning can provide prognostic imaging signatures for OS prediction and patient stratification in GBM, showing the promise of deep imaging feature-based biomarkers in the preoperative management of these patients.

Because of the rarity of CAE and the limited data available, we used nested cross-validation in our research; it is extremely helpful when the dataset is small and the model has numerous hyperparameters to adjust⁶¹. Nested cross-validation benefits generalization for several reasons. First, by giving more accurate estimates of the model's performance, it helps to reduce the risk of overfitting. The data are split by an outer loop and an inner loop: the inner loop is used for hyperparameter adjustment, so that the model is tuned for better generalization rather than overfitted to the training data, while the outer loop offers an objective assessment of how well the model will perform on unseen data. Second, cross-validation reduces the dependence of the performance estimate on a particular train-test split. By repeating the process multiple times with different splits of the data, the variability in the performance estimate can be assessed. This helps to capture the model's ability to perform well on unseen data from different perspectives. Third, nested cross-validation makes the model selection process more reliable. It makes it possible to compare various models or hyperparameter combinations objectively and choose the one that performs best, which helps to find models that are effective on the training data and also generalize well to new data^62,63. For the selection of biomarkers in high-dimensional data, the variable selection and shrinkage estimation method LASSO has been widely used⁶⁴. By introducing a penalty function, it builds a more refined model by shrinking some coefficients and setting others to exactly zero. In this way, feature screening (dimension reduction) is achieved and over-fitting is limited during model training. In the domains of molecular biology and neuroimaging, SVM is a robust and efficient machine-learning classifier⁶⁵. These characteristics allowed the LASSO regression model and the SVM classifier to work well together in the radiomics investigation. Additionally, the LASSO algorithm chose the observed radiomics characteristics from a variety of filters and feature classes, which shows that multiple feature categories may provide complementary information for differentiating between CAE and BM. Even though the biological activity underlying these radiomics features is not yet known, we hypothesize that they may capture fine properties of the microstructure and of the lesion's immediate surroundings.

Finally, combining radiomics with machine learning has the potential to revolutionize the way we diagnose and differentiate between cerebral alveolar echinococcosis and brain metastases. Radiomics is a branch of medical imaging that uses advanced algorithms to extract quantitative features from medical images. These features can then be used to build predictive models that can accurately differentiate between CAE and BM⁶⁶.

To the best of our knowledge, this is the first study to use a combination of radiomics and machine learning algorithms to differentiate CAE from BM. Our study has some limitations. First, because CAE is rare, the sample size remains small even though data for CAE and BM were collected over a period of nearly ten years. We intend to conduct multicenter research in the future to address this issue. Second, since the borders of CAE and BM are better defined on contrast-enhanced sequences than on T2WI sequences, only contrast-enhanced MRI sequences were used in our study. Including multimodal imaging data in the future may improve our model.

In conclusion, with good predictive accuracy and stability, the presented radiomics machine-learning classifier provides a non-invasive way to distinguish CAE from BM before surgery. We believe that combining radiomics analysis with machine learning techniques can improve diagnostic accuracy and clinical practice.

Ethical approval

The 1964 Declaration of Helsinki and its later amendments or equivalent ethical standards were followed in all procedures carried out in studies involving human subjects. These procedures also complied with institutional and/or national research committee ethical requirements.

Consent to Participate

This retrospective study was approved by the institutional review boards of our hospital, and the requirement for patient informed consent was waived.

Consent for Publication

Written informed consent for publication was waived by the Institutional Review Board and all authors gave their consent for publication unanimously.

Authors Contributions

Yasen Yimit, Parhat Yasin and Abuduresuli: Investigation, Methodology, Writing original paper, Visualization, Supervision. Parhat Yasen and Abudoukeyoumujiang Abulizi: Data curation, Investigation, review. Wenxiao Jia, Yunling Wang and Mayidili Nijiati: Methodology, Resources, Supervision, Writing review & editing.

Competing Interest

The authors affirm that they have no known financial or personal conflicts of interest that could have appeared to influence the research presented in this study.

Availability of data and materials

The datasets used or generated in our research are available from the corresponding author upon reasonable request.

Code availability

The code used in our research is available from the corresponding author upon reasonable request.

Acknowledgments

This work was supported by the National Key R&D Program of China [grant number 2022ZD0160705]; Tianshan Innovation Team Program of Autonomous Region [grant number 2022D14007].

References

Meinel TR, Gottstein B, Geib V, et al. Vertebral alveolar echinococcosis-a case report, systematic analysis, and review of the literature. Lancet Infect Dis. 2018;18(3):e87-e98.
Baumann S, Shi R, Liu W, et al. Worldwide literature on epidemiology of human alveolar echinococcosis: a systematic review of research published in the twenty-first century. Infection. 2019;47(5):703–727.
Deplazes P, Rinaldi L, Alvarez Rojas CA, et al. Global Distribution of Alveolar and Cystic Echinococcosis. Adv Parasitol. 2017;95:315–493.
Paternoster G, Boo G, Wang C, et al. Epidemic cystic and alveolar echinococcosis in Kyrgyzstan: an analysis of national surveillance data. Lancet Glob Health. 2020;8(4):e603-e611.
Vuitton DA, Zhou H, Bresson-Hadni S, et al. Epidemiology of alveolar echinococcosis with particular reference to China and Europe. Parasitology. 2003;127 Suppl:S87–107.
Wen H, Vuitton L, Tuxun T, et al. Echinococcosis: Advances in the 21st Century. Clin Microbiol Rev. 2019;32(2).
Kantarci M, Bayraktutan U, Karabulut N, et al. Alveolar echinococcosis: spectrum of findings at cross-sectional imaging. Radiographics. 2012;32(7):2053–2070.
Yibulayin A, Li XH, Qin YD, Jia XY, Zhang QZ, Li YB. Biological characteristics of 18F-FDG PET/CT imaging of cerebral alveolar echinococcosis. Medicine (Baltimore). 2018;97(39):e11801.
Boire A, Brastianos PK, Garzia L, Valiente M. Brain metastasis. Nat Rev Cancer. 2020;20(1):4–11.
Hakyemez B, Erdogan C, Gokalp G, Dusak A, Parlak M. Solitary metastases and high-grade gliomas: radiological differentiation by morphometric analysis and perfusion-weighted MRI. Clin Radiol. 2010;65(1):15–20.
Mayerhoefer ME, Materka A, Langs G, et al. Introduction to Radiomics. J Nucl Med. 2020;61(4):488–495.
Lenga L, Bernatz S, Martin SS, et al. Iodine Map Radiomics in Breast Cancer: Prediction of Metastatic Status. Cancers (Basel). 2021;13(10).
Li G, Li L, Li Y, et al. An MRI radiomics approach to predict survival and tumour-infiltrating macrophages in gliomas. Brain. 2022;145(3):1151–1161.
Qian Z, Li Y, Wang Y, et al. Differentiation of glioblastoma from solitary brain metastases using radiomic machine-learning classifiers. Cancer Lett. 2019;451:128–135.
Yang L, Gu D, Wei J, et al. A Radiomics Nomogram for Preoperative Prediction of Microvascular Invasion in Hepatocellular Carcinoma. Liver Cancer. 2019;8(5):373–386.
Goecks J, Jalili V, Heiser LM, Gray JW. How Machine Learning Will Transform Biomedicine. Cell. 2020;181(1):92–101.
Dong J, Li L, Liang S, et al. Differentiation Between Ependymoma and Medulloblastoma in Children with Radiomics Approach. Acad Radiol. 2021;28(3):318–327.
Bathla G, Priya S, Liu Y, et al. Radiomics-based differentiation between glioblastoma and primary central nervous system lymphoma: a comparison of diagnostic performance across different MRI sequences and machine learning techniques. Eur Radiol. 2021;31(11):8703–8713.
Chen Y, Li Z, Wu G, et al. Primary central nervous system lymphoma and glioblastoma differentiation based on conventional magnetic resonance imaging by high-throughput SIFT features. Int J Neurosci. 2018;128(7):608–618.
Priya S, Ward C, Locke T, et al. Glioblastoma and primary central nervous system lymphoma: differentiation using MRI derived first-order texture analysis - a machine learning study. Neuroradiol J. 2021;34(4):320–328.
Suh HB, Choi YS, Bae S, et al. Primary central nervous system lymphoma and atypical glioblastoma: Differentiation using radiomics approach. Eur Radiol. 2018;28(9):3832–3839.
Xia W, Hu B, Li H, et al. Multiparametric-MRI-Based Radiomics Model for Differentiating Primary Central Nervous System Lymphoma From Glioblastoma: Development and Cross-Vendor Validation. J Magn Reson Imaging. 2021;53(1):242–250.
Yun J, Park JE, Lee H, Ham S, Kim N, Kim HS. Radiomic features and multilayer perceptron network classifier: a robust MRI classification strategy for distinguishing glioblastoma from primary central nervous system lymphoma. Sci Rep. 2019;9(1):5746.
Wang S, Wang G, Zhang W, et al. MRI-based whole-tumor radiomics to classify the types of pediatric posterior fossa brain tumor. Neurochirurgie. 2022.
Artzi NS, Shilo S, Hadar E, et al. Prediction of gestational diabetes based on nationwide electronic health records. Nat Med. 2020;26(1):71–76.
Huang W, Jiang Y, Xiong W, et al. Noninvasive imaging of the tumor immune microenvironment correlates with response to immunotherapy in gastric cancer. Nat Commun. 2022;13(1):5095.
Senturk S, Oguz KK, Soylemezoglu F, Inci S. Cerebral alveolar echinoccosis mimicking primary brain tumor. AJNR Am J Neuroradiol. 2006;27(2):420–422.
Bulakçı M, Kartal MG, Yılmaz S, et al. Multimodality imaging in diagnosis and management of alveolar echinococcosis: an update. Diagn Interv Radiol. 2016;22(3):247–256.
Pope WB. Brain metastases: neuroimaging. Handb Clin Neurol. 2018;149:89–112.
van Griethuysen JJM, Fedorov A, Parmar C, et al. Computational Radiomics System to Decode the Radiographic Phenotype. Cancer Res. 2017;77(21):e104-e107.
Krstajic D, Buturovic LJ, Leahy DE, Thomas S. Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminform. 2014;6(1):10.
Cheng J, Dekkers JCM, Fernando RL. Cross-validation of best linear unbiased predictions of breeding values using an efficient leave-one-out strategy. J Anim Breed Genet. 2021;138(5):519–527.
Cheng J, Pan Y, Huang W, et al. Differentiation between immune checkpoint inhibitor-related and radiation pneumonitis in lung cancer by CT radiomics and machine learning. Med Phys. 2022;49(3):1547–1558.
Kalendralis P, Shi Z, Traverso A, et al. FAIR-compliant clinical, radiomics and DICOM metadata of RIDER, interobserver, Lung1 and head-Neck1 TCIA collections. Med Phys. 2020;47(11):5931–5940.
Zheng M, Chen Q, Ge Y, et al. Development and validation of CT-based radiomics nomogram for the classification of benign parotid gland tumors. Med Phys. 2022.
Zhao M, Wen F, Shi J, et al. MRI-based radiomics nomogram for the preoperative prediction of deep myometrial invasion of FIGO stage I endometrial carcinoma. Med Phys. 2022;49(10):6505–6516.
Vuitton DA, Azizi A, Richou C, et al. Current interventional strategy for the treatment of hepatic alveolar echinococcosis. Expert Rev Anti Infect Ther. 2016;14(12):1179–1194.
Piarroux M, Piarroux R, Knapp J, et al. Populations at risk for alveolar echinococcosis, France. Emerg Infect Dis. 2013;19(5):721–728.
Reuter S, Seitz HM, Kern P, Junghanss T. Extrahepatic alveolar echinococcosis without liver involvement: a rare manifestation. Infection. 2000;28(3):187–192.
Baldolli A, Bonhomme J, Yera H, et al. Isolated Cerebral Alveolar Echinococcosis. Open Forum Infect Dis. 2019;6(1):ofy349.
Debourgogne A, Goehringer F, Umhang G, et al. Primary cerebral alveolar echinococcosis: mycology to the rescue. J Clin Microbiol. 2014;52(2):692–694.
Cheng J, Meng J, He W, Hui X. Alveolar echinococcosis presenting with simultaneous cerebral and spinal involvement. Neurology. 2017;88(22):2153–2154.
Ozdemir NG, Kurt A, Binici DN, Ozsoy KM. Echinococcus alveolaris: presenting as a cerebral metastasis. Turk Neurosurg. 2012;22(4):448–451.
Tyagi DK, Balasubramaniam S, Sawant HV. Primary calcified hydatid cyst of the brain. J Neurosci Rural Pract. 2010;1(2):115–117.
Bresson-Hadni S, Vuitton DA, Bartholomot B, et al. A twenty-year history of alveolar echinococcosis: analysis of a series of 117 patients from eastern France. Eur J Gastroenterol Hepatol. 2000;12(3):327–336.
Aydinli B, Aydin U, Yazici P, Oztürk G, Onbaş O, Polat KY. Alveolar echinococcosis of liver presenting with neurological symptoms due to brain metastases with simultaneous lung metastasis: a case report. Turkiye Parazitol Derg. 2008;32(4):371–374.
Faucher JF, Descotes-Genon C, Hoen B, et al. Hints for control of infection in unique extrahepatic vertebral alveolar echinococcosis. Infection. 2017;45(3):365–368.
Torgerson PR, Schweiger A, Deplazes P, et al. Alveolar echinococcosis: from a deadly disease to a well-controlled infection. Relative survival and economic analysis in Switzerland over the last 35 years. J Hepatol. 2008;49(1):72–77.
Nell M, Burgkart RH, Gradl G, et al. Primary extrahepatic alveolar echinococcosis of the lumbar spine and the psoas muscle. Ann Clin Microbiol Antimicrob. 2011;10:13.
Kern P. Clinical features and treatment of alveolar echinococcosis. Curr Opin Infect Dis. 2010;23(5):505–512.
Brown PD, Ahluwalia MS, Khan OH, Asher AL, Wefel JS, Gondi V. Whole-Brain Radiotherapy for Brain Metastases: Evolution or Revolution? J Clin Oncol. 2018;36(5):483–491.
Nayak L, Lee EQ, Wen PY. Epidemiology of brain metastases. Curr Oncol Rep. 2012;14(1):48–54.
Cagney DN, Martin AM, Catalano PJ, et al. Incidence and prognosis of patients with brain metastases at diagnosis of systemic malignancy: a population-based study. Neuro Oncol. 2017;19(11):1511–1521.
Suh JH, Kotecha R, Chao ST, Ahluwalia MS, Sahgal A, Chang EL. Current approaches to the management of brain metastases. Nat Rev Clin Oncol. 2020;17(5):279–299.
Nassif EF, Arsène-Henry A, Kirova YM. Brain metastases and treatment: multiplying cognitive toxicities. Expert Rev Anticancer Ther. 2019;19(4):327–341.
Li S, Chen J, He Y, et al. Clinical Features, Radiological Characteristics, and Outcomes of Patients With Intracranial Alveolar Echinococcosis: A Case Series From Tibetan Areas of Sichuan Province, China. Front Neurol. 2020;11:537565.
Wang F, Gao X, Rong J, et al. The Significance of Perfusion-Weighted Magnetic Resonance Imaging in Evaluating the Pathological Biological Activity of Cerebral Alveolar Echinococcosis. J Comput Assist Tomogr. 2022;46(1):131–139.
Gillies RJ, Kinahan PE, Hricak H. Radiomics: Images Are More than Pictures, They Are Data. Radiology. 2016;278(2):563–577.
Alcaide-Leon P, Dufort P, Geraldo AF, et al. Differentiation of Enhancing Glioma and Primary Central Nervous System Lymphoma by Texture-Based Machine Learning. AJNR Am J Neuroradiol. 2017;38(6):1145–1150.
Lao J, Chen Y, Li ZC, et al. A Deep Learning-Based Radiomics Model for Prediction of Survival in Glioblastoma Multiforme. Sci Rep. 2017;7(1):10353.
Parvandeh S, Yeh HW, Paulus MP, McKinney BA. Consensus features nested cross-validation. Bioinformatics. 2020;36(10):3093–3098.
Krstajic D, Buturovic LJ, Leahy DE, Thomas S. Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminform. 2014;6(1):10.
Baumann D, Baumann K. Reliable estimation of prediction errors for QSAR models under model uncertainty using double cross-validation. J Cheminform. 2014;6(1):47.
Gui J, Li H. Penalized Cox regression analysis in the high-dimensional and low-sample size settings, with applications to microarray gene expression data. Bioinformatics. 2005;21(13):3001–3008.
Han H, Jiang X. Overcome support vector machine diagnosis overfitting. Cancer Inform. 2014;13(Suppl 1):145–158.
Zhou Q. Computer-aided detection and diagnosis/radiomics/machine learning/deep learning in medical imaging. Med Phys. 2022.

Editorial decision: Revision requested
09 Nov, 2023
Reviews received at journal
21 Sep, 2023
Reviews received at journal
17 Sep, 2023
Reviewers agreed at journal
17 Sep, 2023
Reviewers agreed at journal
16 Sep, 2023
Reviewers invited by journal
16 Sep, 2023
Editor assigned by journal
01 Sep, 2023
Submission checks completed at journal
31 Aug, 2023
First submitted to journal
28 Aug, 2023
