In the present study, ML models were applied to predict metastasis in CRC patients. The clinical efficacy of our models was determined through ROC curve analysis and other indices of predictive performance, including sensitivity, specificity, and precision. Classifier performance was assessed for six machine learning approaches. Although all models had acceptable performance, the NN model had greater predictive efficiency than the others.
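The evaluation indices named above can be illustrated with a minimal sketch. It assumes binary labels (1 = metastasis, 0 = no metastasis) and a 0.5 decision threshold; the function names and the toy labels and scores are hypothetical and are not taken from the study data. AUC is computed here through its rank interpretation (the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case), which is equivalent to the area under the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and precision from the confusion matrix."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return {
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        "precision": tp / (tp + fp) if (tp + fp) else float("nan"),
    }


def auc_score(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: true labels and predicted probabilities of metastasis.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(classification_metrics(y_true, y_pred))
print(auc_score(y_true, scores))
```

Sensitivity, specificity, and precision depend on the chosen threshold, whereas AUC summarizes discrimination across all thresholds, which is why the two kinds of index are reported together.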
A number of ML-based modeling techniques have been proposed for CRC datasets. Other studies have applied DT, SVM, NN, RF, and LR [14-17].
One study investigated prediction of the tumor stage in TNM (tumor, node, and metastasis) staging in colon cancer patients [18]. The authors applied ML techniques, namely random forest, logistic regression, support vector machine, artificial neural network, K-nearest neighbor, and adaptive boosting, after grouping the Tumor Aggression Score (TAS) into two categories (>9.8 and <9.8). They concluded that when tumor size alone was regarded as a prognostic factor, the random forest model outperformed the other approaches, with accuracies of 84% and 74% in the training and test sets, respectively. In our study, we applied six ML-based approaches to colorectal cancer data. Neural network and random forest had the highest sensitivity, and neural network and decision tree had the highest specificity. In their study, patients’ age was the most important variable in the neural network, and tumor stage was the most important in the other approaches. Tumor stage was therefore identified as an essential variable, which is consistent with our study.
Boyne et al. (2020) predicted early discontinuation of adjuvant chemotherapy among individuals older than 17 years with high-stage colon cancer, using LR and RF models [17]. Their results revealed that time from surgery to chemotherapy initiation and distance from the treatment facility appeared to be the most important predictors. They concluded that the RF algorithm may help predict early discontinuation of chemotherapy among patients with stage III colon cancer. In our study, neural network and random forest ranked first and second, respectively. The primary outcome of their study was chemotherapy discontinuation, defined by receipt of chemotherapy for <5 months versus >5 months, whereas metastasis was the dependent variable in our study. In addition, RF was considered a better model than LR in their study, whereas all ML-based approaches performed well in our study.
A study conducted in South Africa used LR, NB, C5.0, RF, SVM, and ANN algorithms for predictive analytics of recurrence and survival outcomes in CRC patients [16]. Three datasets were considered: simulated, recurrence, and survival data. The authors examined the significant variables in all models and compared the models using AUC, which evaluates the discriminatory power of predictive models, supported by a threshold (accuracy) metric. Their results demonstrated that all models had an AUC greater than 80%, but the ANN model was considered the best method, with an AUC of approximately 100%, which is consistent with our study. In contrast to our findings, however, histology and CRC complications were the priority variables in the six methods of the South African study, whereas tumor stage was the leading variable in our study.
A survey was carried out on an Indonesian population with CRC in four hospitals from 2012 to 2015 [19]. The predictors comprised comorbidity, cancer stage, age, type of treatment, cancer location, gender, and metastasis. In that survey, the RF algorithm was used for classification by combining trees trained on samples of the data, and the accuracy of the models was assessed using AUC. The most important variables for the survival of colorectal cancer patients were metastasis history, cancer location, and gender, in that order. In our investigation, metastasis was the outcome variable, whereas survival of patients with CRC was the dependent variable in the Indonesian study. Moreover, in their study, tumor stage and age were the least important variables, which is inconsistent with our findings.
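Because the comparisons above turn on how each study ranks its predictors, a minimal sketch of one model-agnostic ranking method, permutation importance, may help. This is an illustrative technique and not necessarily the importance measure used in the cited studies or in our models. The rule-based classifier `stage_rule`, the two features (tumor stage, age), and the toy records are all hypothetical.

```python
import random


def accuracy(model, X, y):
    """Fraction of rows for which the model's prediction equals the label."""
    return sum(1 for row, t in zip(X, y) if model(row) == t) / len(y)


def permutation_importance(model, X, y, n_repeats=20, seed=0):
    """Mean drop in accuracy when each feature column is shuffled in turn.

    A larger drop means the model relies more heavily on that feature.
    """
    rng = random.Random(seed)
    baseline = accuracy(model, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(model, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


def stage_rule(row):
    """Hypothetical classifier: predict metastasis (1) when stage >= 3."""
    stage, _age = row
    return 1 if stage >= 3 else 0


# Hypothetical toy records: [tumor stage, age], with label 1 = metastasis.
X = [[1, 70], [2, 55], [3, 62], [4, 48], [1, 66], [4, 71], [2, 59], [3, 50]]
y = [0, 0, 1, 1, 0, 1, 0, 1]

stage_importance, age_importance = permutation_importance(stage_rule, X, y)
print("stage:", stage_importance, "age:", age_importance)
```

In this toy setting the classifier ignores age, so shuffling age leaves accuracy unchanged, while shuffling stage degrades it. Rankings obtained this way depend on the outcome being predicted, which is one reason studies with different dependent variables (metastasis versus survival) can report different leading predictors.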