Heart disease is recognized by the World Health Organization (WHO) as one of the most significant diseases in the world [1]. It incurs high patient care costs [2], a growing concern in countries with aging populations. Early diagnosis and risk prediction, followed by timely intervention, can play an important role in disease management. However, identifying the disease is complex because many factors are involved, and unidentified cases may result in premature death. Over the last decade, as large data sets have become available, Machine Learning (ML) techniques have been applied to health data, including heart disease data, for prediction and classification, with promising results. Research on ML for heart disease prediction involves building a model [3] and using that model to determine the risk of developing or having heart disease [4].
ML techniques have been shown to perform well in classification problems with large data sets and many predictive factors. ML algorithms used for diagnosing and predicting heart disease include conventional algorithms such as Support Vector Machine (SVM), k-Nearest Neighbor (k-NN), Decision Tree (DT), and Logistic Regression (LR); ensemble algorithms such as Random Forest (RF), Bagging, and Adaptive Boosting (AdaBoost); and deep learning algorithms such as Convolutional Neural Network (CNN) [5–11].
Using an appropriate dataset and applying pre-processing techniques, such as Exploratory Data Analysis (EDA) to exclude irrelevant data, can improve the performance of machine learning algorithms [5, 6]. The performance of each algorithm may vary depending on the dataset used, the parameters applied, and the pre-processing performed before the model is created [7].
Senan et al. predicted heart disease using five machine learning algorithms and applied the synthetic minority oversampling technique (SMOTE) to resolve class imbalance in the dataset [8]. The SelectKBest function was used with the chi-squared statistic to determine the most important features. Test-set accuracy was 90.16% for SVM, 90.16% for k-NN, 81.97% for DT, 85.25% for RF, and 88.52% for LR. Shah et al. predicted heart disease using four machine learning algorithms, Naive Bayes, k-NN, DT, and RF, with default parameters, and reported accuracy scores of 88.16%, 90.79%, 80.26%, and 86.84%, respectively [9]. Reddy et al. applied ten algorithms with three attribute evaluators (correlation-based feature selection, chi-squared attribute evaluation, and ReliefF attribute evaluation) [10]. Their study showed that attribute evaluators could improve model performance: some classifiers performed better after an attribute evaluator was applied. The Sequential Minimal Optimization (SMO) classifier with the chi-squared attribute evaluator performed best, with an accuracy of 86.47%. Arooj et al. showed that a deep learning algorithm could be used to predict heart disease [11]; their CNN model achieved an accuracy of 91.7%.
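To illustrate the kind of chi-squared feature selection mentioned above, the following is a minimal sketch in plain Python. It is not the implementation used in any of the cited studies. The function names `chi2_scores` and `select_k_best` and the toy data are hypothetical, and the sketch assumes non-negative feature values with a non-zero total per feature.

```python
def chi2_scores(X, y):
    """Chi-squared score of each non-negative feature against the class labels.

    Assumes every feature column has a non-zero total (otherwise the
    expected value would be zero and the division would fail).
    """
    n_samples = len(X)
    n_features = len(X[0])
    classes = sorted(set(y))
    # Total of each feature over all samples.
    feature_totals = [sum(row[j] for row in X) for j in range(n_features)]
    scores = []
    for j in range(n_features):
        score = 0.0
        for c in classes:
            class_prob = sum(1 for label in y if label == c) / n_samples
            # Observed: feature total within class c.
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            # Expected: feature total if the feature were independent of class.
            expected = class_prob * feature_totals[j]
            score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_k_best(X, y, k):
    """Return the indices of the k highest-scoring features, in column order."""
    scores = chi2_scores(X, y)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


# Hypothetical toy data: three made-up non-negative features, two classes.
X = [
    [1, 0, 3],
    [1, 1, 2],
    [0, 5, 3],
    [0, 6, 2],
]
y = [0, 0, 1, 1]

selected = select_k_best(X, y, k=2)
print(selected)
```

In this toy example the third feature has the same total in both classes, so it receives a score of zero and is the one dropped.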
Several studies have performed pre-processing prior to building models [8, 10]; however, the improvement in performance was not significant when compared to studies without pre-processing [9, 11]. Nevertheless, pre-processing steps, including normalization, are expected to improve the performance of machine learning models. Singh et al. studied the impact of normalizing data before building a model [12] and showed that models built on normalized data performed better than those built on un-normalized data. Jo normalized the dataset using Min-Max normalization and concluded that normalization improves the accuracy of models created by the SVM algorithm [13].
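Min-Max normalization rescales each feature linearly to the range [0, 1], so that features measured on large scales do not dominate distance-based algorithms such as SVM and k-NN. The following is a minimal sketch under the assumption that each feature is a plain list of numbers. The function name `min_max_normalize` and the sample values are hypothetical and are not taken from the dataset.

```python
def min_max_normalize(column):
    """Rescale a list of numbers linearly to the range [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map every value to 0.0.
        return [0.0 for _ in column]
    return [(value - lo) / (hi - lo) for value in column]


# Hypothetical values for two features on very different scales.
ages = [29, 45, 61, 77]
cholesterol = [126, 240, 354, 564]

ages_scaled = min_max_normalize(ages)
chol_scaled = min_max_normalize(cholesterol)
print(ages_scaled)
print(chol_scaled)
```

After scaling, both features span the same range, so neither contributes disproportionately to a Euclidean distance.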
Choosing appropriate parameter values also affects the performance of machine learning models. Building models with multiple tuned parameters can produce optimal performance [14]. Fuadah et al. applied hyper-parameter tuning using grid search to heart sound classification [15]. Building the model with the best parameters selected by grid search resulted in greater accuracy than that reported in other studies [15].
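Grid search evaluates every combination of candidate hyper-parameter values and keeps the combination with the best validation score. The sketch below illustrates the idea with a small k-NN classifier and leave-one-out cross-validation, written in plain Python. It is an illustration under stated assumptions, not the procedure of the cited study: all function names, the parameter grid, and the toy data are hypothetical.

```python
from collections import Counter
from itertools import product


def knn_predict(train_X, train_y, point, k, p):
    """Majority vote among the k nearest training points,
    using the Minkowski distance of order p."""
    distances = sorted(
        (sum(abs(a - b) ** p for a, b in zip(row, point)) ** (1 / p), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k, p):
    """Leave-one-out cross-validated accuracy for one parameter setting."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if knn_predict(train_X, train_y, X[i], k, p) == y[i]:
            correct += 1
    return correct / len(X)


def grid_search(X, y, param_grid):
    """Score every combination in param_grid; return the best one.

    Ties are resolved in favor of the combination evaluated first.
    """
    names = list(param_grid)
    best_params, best_score = None, -1.0
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = loo_accuracy(X, y, **params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical toy data: two well-separated clusters of three points each.
X = [[0.0, 0.1], [0.1, 0.0], [0.2, 0.2], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
y = [0, 0, 0, 1, 1, 1]

# Hypothetical grid: number of neighbors k and Minkowski order p.
param_grid = {"k": [1, 3, 5], "p": [1, 2]}
best_params, best_score = grid_search(X, y, param_grid)
print(best_params, best_score)
```

The cost of an exhaustive search grows with the product of the grid sizes, so in practice the candidate values for each parameter are kept to a small, informed set.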
This study conducts an empirical analysis of the performance of machine learning algorithms for predicting heart disease, comparing the classical algorithms DT, SVM, and k-NN with the ensemble algorithms RF and AdaBoost. It also investigates the effectiveness of normalization and hyper-parameter selection.
The heart disease dataset from UCI has been used in many ML studies, including those referenced in this paper, and is used in this study. Most studies restrict their use to the data provided by the Cleveland Clinic, which allows outcomes to be compared across studies.