Hepatitis C Virus Prediction Based on Machine Learning Framework: a Real-world Case Study in Egypt

DOI: https://doi.org/10.21203/rs.3.rs-1292024/v1


Predicting and classifying diseases is essential in medical science because it helps contain their spread and identify affected regions at an early stage. Machine Learning (ML) approaches are commonly used for these tasks and serve as an efficient tool for doctors and specialists. To predict Hepatitis C Virus (HCV) infection among Egyptian healthcare workers (HCWs), we developed a prediction system based on machine learning methods. We used data from the National Liver Institute (NLI), established at Menoufiya University (Menoufiya, Egypt). The dataset includes 859 patients described by 12 distinct features. To test the robustness and reliability of the proposed framework, we ran two scenarios: one without feature selection and one with feature selection based on Sequential Forward Selection (SFS). In addition, a feature subset evaluation based on the SFS-generated features is carried out. The induction algorithms and classifiers used for model evaluation are Naïve Bayes (NB), Random Forest (RF), K-Nearest Neighbor (KNN), and Logistic Regression (LR). The effect of parameter tuning on the learning techniques is then measured. The experimental results indicate that the proposed framework achieved higher accuracy with SFS than without feature selection. Moreover, the RF classifier achieved 94.06% accuracy with the minimum elapsed learning time of 0.54 s. Finally, after tuning the hyperparameters of the RF classifier, the classification accuracy improved to 94.88% using only four features.
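
The two scenarios and the tuning step described in the abstract can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data is a synthetic stand-in with the same shape as the NLI dataset (859 rows, 12 features), Gaussian Naive Bayes is assumed as the NB variant, RF is assumed as the estimator wrapped by SFS, the subset size of four is borrowed from the final reported result, and the hyperparameter grid is hypothetical. The scores it produces say nothing about the accuracies reported above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# ASSUMPTION: synthetic stand-in for the NLI data (859 patients, 12 features).
X, y = make_classification(n_samples=859, n_features=12, n_informative=4,
                           n_redundant=2, random_state=0)

# The four classifiers named in the abstract (Gaussian NB is an assumption).
classifiers = {
    "NB": GaussianNB(),
    "RF": RandomForestClassifier(n_estimators=30, random_state=0),
    "KNN": KNeighborsClassifier(),
    "LR": LogisticRegression(max_iter=1000),
}


def mean_accuracy(model, features, labels):
    """Mean 3-fold cross-validated accuracy of a model."""
    return cross_val_score(model, features, labels, cv=3,
                           scoring="accuracy").mean()


# Scenario 1: all 12 features, no feature selection.
baseline = {name: mean_accuracy(clf, X, y)
            for name, clf in classifiers.items()}

# Scenario 2: Sequential Forward Selection down to four features.
# ASSUMPTION: RF is the wrapped estimator; the abstract does not say.
sfs = SequentialFeatureSelector(
    RandomForestClassifier(n_estimators=30, random_state=0),
    n_features_to_select=4, direction="forward", cv=3)
X_sel = sfs.fit_transform(X, y)
selected = {name: mean_accuracy(clf, X_sel, y)
            for name, clf in classifiers.items()}

# Hyperparameter tuning of RF on the selected subset (hypothetical grid).
grid = GridSearchCV(RandomForestClassifier(random_state=0),
                    {"n_estimators": [30, 100], "max_depth": [None, 5]},
                    cv=3, scoring="accuracy")
grid.fit(X_sel, y)

for name in classifiers:
    print(f"{name}: all features {baseline[name]:.3f}, "
          f"SFS subset {selected[name]:.3f}")
print("Tuned RF:", grid.best_params_, f"{grid.best_score_:.3f}")
```

To try this on real data, replace the `make_classification` call with code that loads the feature matrix and labels; `sfs.get_support()` then returns a boolean mask showing which of the 12 columns were kept.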
