Heart disease prediction using scaled conjugate gradient backpropagation of artificial neural network

Heart disease is one of the deadliest diseases affecting human life, and its mortality rate is the highest of any disease in the world. All precautionary measures must therefore be taken before the disease reaches its final stage. If heart disease can be diagnosed scientifically at an early stage through decision support systems, rather than by traditional methods alone, the death rate from this disease should decrease worldwide. Many researchers have investigated the diagnosis of heart disease by building intelligent medical decision support systems. Artificial neural networks have shown the highest predictive accuracy on medical data compared with other decision support systems. In this paper, we propose an improved method for predicting the existence of heart disease using scaled conjugate gradient backpropagation of artificial neural networks with K-fold cross-validation. The cardiac datasets are taken from the University of California Irvine (UCI) Machine Learning Repository and the IEEE data port. For the Cleveland processed heart dataset, the proposed system uses 13 input attributes and achieves a minimum accuracy of 63.3803% and a maximum accuracy of 100%; similarly, for the Cleveland Hungarian Statlog heart dataset, the proposed system uses 11 input attributes and achieves a minimum accuracy of 88.4754% and a maximum accuracy of 100% when estimating the presence or absence of heart disease during testing.


Introduction
The heart is an important part of the human body. It distributes blood to the other parts of the body through the blood vessels of the circulatory system. Any disorder that affects the heart is called heart disease, and it also affects other organs such as the brain, lungs, liver, and kidneys, with a profound effect on human survival. Several factors such as anxiety, mental depression, smoking, lack of physical exercise, blood pressure, cholesterol, and obesity increase the risk of heart disease. The World Health Organization (WHO) estimates that by 2030, nearly 23.6 million people will die due to heart disease. Because the disease is often not identified in its earlier stages, it causes a large number of deaths. If the disease is predicted at an early stage, the deaths of many patients can be prevented. The prediction depends on the patient's symptoms, a physical check-up, and the clinical signs of heart disease, such as the presence of many functional and pathological factors. Sometimes these functional and pathological factors delay and complicate the prediction of heart disease, which can lead to false assumptions and unpredictable effects. Overcoming these false assumptions requires the development of an early prediction and medical diagnosis expert system that offers high accuracy with low operating costs. We therefore propose a new method that applies the scaled conjugate gradient backpropagation algorithm with artificial neural networks to the Cleveland processed database (David 1998) and the Cleveland Hungarian Statlog heart database (Siddhartha 2020) to determine the absence or presence and levels of heart disease. The Cleveland processed database contains 76 features and 303 instances, but all published experiments use a subset of 14 of these features.
For the Cleveland processed dataset, we have taken all 14 attributes without applying any feature selection method; of these, the system uses 13 as input attributes and 1 as the output or 'Goal' field.
The Cleveland Hungarian Statlog dataset, on the other hand, consists of 1190 instances with 11 input attributes and 1 output attribute. In this method, we have taken all 12 attributes. The details of the Cleveland processed dataset and the Cleveland Hungarian Statlog heart dataset are explained in the subsection 'dataset' under the 'Experimentation' section.

Literature review
Previous works have used a variety of methodologies and reached various classification accuracies. Ghosh et al. (2021) developed an effective method for predicting cardiovascular disease using machine learning algorithms with Relief and LASSO feature selection techniques and achieved 99.05% prediction accuracy. Kavitha et al. (2021) predicted heart disease using hybrid machine learning techniques such as random forest with decision tree and obtained 88.7% accuracy. Li et al. (2020) proposed a model for heart disease diagnosis using machine learning classification techniques and achieved 92.37% accuracy. Syed Arslan Ali et al. (2020) established an optimally configured and improved deep belief network (OCI-DBN) approach to predict heart disease using Ruzzo-Tompa and a stacked genetic algorithm and obtained 94.61% prediction accuracy. Wang et al. (2020) developed a stacking-based model for non-invasive detection of heart disease and obtained 95.43% accuracy. Ayoub Khan (2020) proposed an IoT framework for evaluating heart disease more accurately using a modified deep convolutional neural network (MDCNN) and achieved an accuracy of 98.2%. Vankara et al. (2020) developed a predictive analysis by ensemble learning and classification for heart disease detection and obtained 93% accuracy. Senthil kumar Mohan et al. (2019) Ruhin Kouser et al. (2018) proposed a model for a cardiovascular prediction system using artificial neural networks, radial basis functions, and case-based reasoning and obtained 97% to 98% accuracy. Ashok Kumar Dwivedi (2018) established a method for performance evaluation using different machine learning techniques for the proper diagnosis of heart disease and achieved 85% accuracy. Ji Zhang et al. (2017) introduced a fast Fourier transformation-coupled machine learning ensemble model for heart disease prediction; the experimental results showed 91-94% prediction accuracy. Leema et al. (2016) designed a computer-aided diagnostic (CAD) system for the prediction of heart disease using differential evolution with worldwide information and backpropagation algorithms and obtained 86% accuracy. Rajeswari et al. (2012) designed a model for feature selection in the detection of ischemic heart disease using a feedforward neural network and achieved 89% accuracy on the training data and 82% accuracy on the test data. Khemphila et al. (2011) achieved 89% accuracy with their model for classifying heart disease and training datasets using neural networks and feature selection. Asadi et al. (2009) worked on a supervised multilayer feedforward neural network model to accelerate the classification problem and obtained 94% prediction accuracy. Yan et al. (2006) reported accuracies in the interval [88.6, 93.2]% with 91% mean accuracy for a multilayer perceptron-based medical decision support system for the diagnosis of heart disease. The above-mentioned related works are summarized in the following table.

Proposed model and its design
The overall methods of the proposed model are shown in Fig. 1. The original datasets are taken from the Cleveland processed database (David 1998) and Cleveland Hungarian Statlog heart database (Siddhartha 2020), respectively. After processing, clinical datasets are normalized using the following mathematical formula.
$$\text{Normalized}(X) = \frac{\text{Original value in the given set}}{\text{Maximum value in the given range}}$$

where the normalized value lies in the interval [0, 1]. For example, if patients' blood pressure lies between 50 and 200 and a patient's blood pressure is 145, it is normalized as 145/200 = 0.725.

After normalization, the data are loaded and imported into MATLAB in table format, with 14 columns and 303 rows for the Cleveland processed dataset and 12 columns and 1190 rows for the Cleveland Hungarian Statlog heart dataset. The table is then converted into an array, and the scaled conjugate gradient backpropagation algorithm is applied. The data are sorted and stored in the variable 'X': the rows of the matrix are sorted in ascending order based on the elements in the first column, and when the first column contains repeated elements, the tied rows are sorted according to the values in the next column, repeating this process for any remaining ties. Next, the user is prompted for the percentage of data to use for training, and the input is stored in the variable 'percentages'. Cross-validation partitions are then created for training and testing. 'cvpartition' constructs an object c of the 'cvpartition' class defining a random nonstratified partition for k = 5 (fivefold) cross-validation on $n_1 = 303$ and $n_2 = 1190$ observations. For a holdout partition, 'cvpartition' randomly selects approximately $p \cdot n$ observations for the test set when $0 < p < 1$ (the default value of p is 1/10). The k-fold partition divides the observations into k disjoint subsamples (or folds), chosen randomly but of roughly equal size. The data are then prepared as training data and testing data for neural network training. The learning rate and the numbers of input, hidden, and output neurons are entered at the 'Prompt', the place in MATLAB where the user types commands, formulas, and functions or performs tasks. The network is then ready for training.
Thereafter, the trained network is tested on the test data. Finally, the errors, performance, percentage errors, MSE, accuracy, percentage accuracy, and elapsed time are calculated, and the confusion matrix is plotted.
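The normalization step described above can be illustrated with a minimal Python sketch. The function name `normalize_by_max` and the sample readings are hypothetical and chosen only for illustration; the original work performs this step on the full clinical datasets before importing them into MATLAB.

```python
def normalize_by_max(values, max_value=None):
    """Scale non-negative values into [0, 1] by dividing by the maximum.

    max_value is the maximum of the attribute's range; if it is omitted,
    the largest observed value is used instead (an assumption of this sketch).
    """
    if max_value is None:
        max_value = max(values)
    if max_value <= 0:
        raise ValueError("maximum value must be positive")
    return [v / max_value for v in values]


# Hypothetical blood pressure readings in the range 50 to 200.
blood_pressure = [50, 145, 200]
normalized = normalize_by_max(blood_pressure, max_value=200)
print(normalized)
```

Dividing by the range maximum, rather than applying min-max scaling, maps the reading of 145 to 145/200 = 0.725, which matches the worked example in the text.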
Pseudocode of the proposed model is explained below. Here, the Cleveland heart dataset is taken from the UCI Machine Learning Repository and the Cleveland Hungarian Statlog heart dataset is taken from the IEEE data port.

i. Data set
The cardiac datasets used for the experiment are taken from the UCI Machine Learning Repository and the IEEE data port. The presence or absence and levels of heart disease are determined using the Cleveland processed heart database and the Cleveland Hungarian Statlog heart database, respectively. The Cleveland processed heart dataset contains 76 attributes and 303 instances, but all published experiments use a subset of 14 of these attributes. This system uses the 13 most significant attributes as input and one attribute as the output or 'target' field. The input attributes are age, gender, chest pain, blood pressure, cholesterol, blood sugar, ECG, maximum heart rate, exercise-induced angina, old peak, slope (the slope of the peak exercise ST segment), CA (number of major vessels (0-3) colored by fluoroscopy), and thallium scan (3 = normal, 6 = fixed defect, 7 = reversible defect). The output field is named 'Diagnosis of Heart Disease' and takes 5 integer values (0-4). The value 0 means 'No Presence of Heart Disease (healthy)'; similarly, 1 means 'sick 1', 2 means 'sick 2', 3 means 'sick 3', and 4 means 'sick 4'.
The Cleveland Hungarian Statlog dataset, on the other hand, consists of 1190 instances with 11 input attributes and 1 target output. The input attributes are age, gender, chest pain, resting blood pressure, cholesterol, fasting blood sugar, resting ECG, maximum heart rate, exercise angina, old peak, and ST-slope. The target output has two levels (0 and 1): 0 represents 'Normal' and 1 represents 'Heart Disease'.
The two databases are described in Tables 1 and 2, respectively.
ii. Data normalization
The clinical datasets used in this work are normalized.

iii. Scaled conjugate gradient (SCG) algorithm
This algorithm was introduced by Martin F. Møller in 1991. It is a supervised learning algorithm for feedforward neural networks and does not contain any user-dependent parameters. The algorithm avoids a time-consuming line search in each learning iteration. It leads to better performance than the standard backpropagation algorithm, the conjugate gradient algorithm with line search, and the 'Broyden-Fletcher-Goldfarb-Shanno' (BFGS) algorithm. It requires a larger number of iterations but has lower computational complexity. The SCG algorithm (Møller 1993) is described below.
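1. Choose weight vector $w_1$ and scalars $\sigma > 0$, $\lambda_1 > 0$ and $\bar{\lambda}_1 = 0$. Set $p_1 = r_1 = -E'(w_1)$, $k = 1$ and success = true.
2. If success = true, then calculate second-order information: $\sigma_k = \sigma / |p_k|$, $s_k = \dfrac{E'(w_k + \sigma_k p_k) - E'(w_k)}{\sigma_k}$, $\delta_k = p_k^T s_k$.
3. Scale $s_k$: $s_k = s_k + (\lambda_k - \bar{\lambda}_k)\, p_k$, $\delta_k = \delta_k + (\lambda_k - \bar{\lambda}_k)\, |p_k|^2$.
4. If $\delta_k \le 0$, then make the Hessian matrix positive definite: $s_k = s_k + \left(\lambda_k - 2\dfrac{\delta_k}{|p_k|^2}\right) p_k$, $\bar{\lambda}_k = 2\left(\lambda_k - \dfrac{\delta_k}{|p_k|^2}\right)$, $\delta_k = -\delta_k + \lambda_k |p_k|^2$, $\lambda_k = \bar{\lambda}_k$.
5. Calculate the step size: $\mu_k = p_k^T r_k$, $\alpha_k = \mu_k / \delta_k$.
6. Calculate the comparison parameter: $\Delta_k = \dfrac{2\delta_k \left[E(w_k) - E(w_k + \alpha_k p_k)\right]}{\mu_k^2}$.
7. If $\Delta_k \ge 0$, then a successful reduction in error can be made: $w_{k+1} = w_k + \alpha_k p_k$, $r_{k+1} = -E'(w_{k+1})$, $\bar{\lambda}_k = 0$, success = true.
7a. If $k \bmod N = 0$, then restart the algorithm: $p_{k+1} = r_{k+1}$. Else create a new conjugate direction: $\beta_k = \dfrac{|r_{k+1}|^2 - r_{k+1}^T r_k}{\mu_k}$, $p_{k+1} = r_{k+1} + \beta_k p_k$.
7b. If $\Delta_k \ge 0.75$, then reduce the scale parameter: $\lambda_k = \tfrac{1}{2}\lambda_k$.
Else, a reduction in error is not possible: $\bar{\lambda}_k = \lambda_k$, success = false.
8. If $\Delta_k < 0.25$, then increase the scale parameter: $\lambda_k = 4\lambda_k$.
9. If the steepest descent direction $r_k \ne 0$, then set $k = k + 1$ and go to 2. Else, terminate and return $w_{k+1}$ as the desired minimum.

Here $\sigma$ is kept small ($\le 10^{-4}$), in which case its exact value is not critical for SCG's performance.
There is one call of $E(w)$ and two calls of $E'(w)$ in each iteration. The calculation complexity per iteration of this algorithm is $O(3N^2)$, and it can be reduced to $O(2N^2)$.
All symbols in the SCG algorithm are explained in Table 4.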
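To make the control flow of the steps above concrete, the following is a minimal Python sketch of an SCG-style minimizer, applied to a small quadratic error function rather than to a neural network. It is an illustration under stated assumptions, not the MATLAB implementation used in this work: the names `scg_minimize`, `dot` and `add_scaled`, the starting values `sigma=1e-4` and `lam=1e-6`, the tolerance-based stopping test, and the toy error function are all hypothetical. Only the scaled curvature $\delta_k$ is tracked, since the scaled vector $s_k$ is not needed after $\delta_k$ has been computed.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def add_scaled(a, c, b):
    """Return a + c * b elementwise."""
    return [x + c * y for x, y in zip(a, b)]


def scg_minimize(E, grad, w, sigma=1e-4, lam=1e-6, max_iter=200, tol=1e-6):
    n = len(w)
    lam_bar = 0.0
    r = [-g for g in grad(w)]                 # step 1: r_1 = -E'(w_1)
    p = list(r)
    success = True
    delta = 0.0
    if math.sqrt(dot(r, r)) < tol:            # already at a stationary point
        return w
    for k in range(1, max_iter + 1):
        p2 = dot(p, p)
        if success:                           # step 2: second-order information
            sigma_k = sigma / math.sqrt(p2)
            g_shift = grad(add_scaled(w, sigma_k, p))
            s = [(a - b) / sigma_k for a, b in zip(g_shift, grad(w))]
            delta = dot(p, s)
        delta += (lam - lam_bar) * p2         # step 3: scale delta
        if delta <= 0:                        # step 4: force positive curvature
            lam_bar = 2.0 * (lam - delta / p2)
            delta = -delta + lam * p2
            lam = lam_bar
        mu = dot(p, r)                        # step 5: step size
        alpha = mu / delta
        w_try = add_scaled(w, alpha, p)
        comparison = 2.0 * delta * (E(w) - E(w_try)) / mu ** 2   # step 6
        if comparison >= 0:                   # step 7: accept the step
            r_new = [-g for g in grad(w_try)]
            lam_bar = 0.0
            success = True
            if k % n == 0:                    # 7a: restart with steepest descent
                p = list(r_new)
            else:                             # 7a: new conjugate direction
                beta = (dot(r_new, r_new) - dot(r_new, r)) / mu
                p = add_scaled(r_new, beta, p)
            w, r = w_try, r_new
            if comparison >= 0.75:            # 7b: reduce the scale parameter
                lam *= 0.5
        else:                                 # error reduction not possible
            lam_bar = lam
            success = False
        if comparison < 0.25:                 # step 8: increase the scale parameter
            lam *= 4.0
        if math.sqrt(dot(r, r)) < tol:        # step 9, with a tolerance
            break
    return w


# Toy error function (an assumption): E(w) = 0.5 * w^T A w - b^T w
# with A = [[3, 1], [1, 2]] and b = [1, 1]; its minimum is at (0.2, 0.4).
def E(w):
    return 0.5 * (3 * w[0] ** 2 + 2 * w[0] * w[1] + 2 * w[1] ** 2) - w[0] - w[1]


def grad(w):
    return [3 * w[0] + w[1] - 1, w[0] + 2 * w[1] - 1]


w_min = scg_minimize(E, grad, [0.0, 0.0])
print(w_min)
```

In a neural network setting, `E` would be the training error as a function of the weight vector and `grad` its backpropagated gradient; the step size comes from the scaled curvature estimate, so no line search is performed.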

iv. K-fold cross-validation
Here, the initial dataset is randomly partitioned into $K$ mutually exclusive subsets or folds $D_1, D_2, \ldots, D_K$.
In the 1st iteration, $D_1$ is reserved as the test set and the remaining $D_2, D_3, D_4, \ldots, D_K$ serve as the training set.
In the 2nd iteration, $D_2$ serves as the test set and the remaining $D_1, D_3, D_4, \ldots, D_K$ serve as the training set.
Similarly, in the $i$th iteration, $D_i$ serves as the test set and the remaining $D_1, D_2, \ldots, D_{i-1}, D_{i+1}, \ldots, D_K$ serve as the training set. Each sample is used the same number of times for training and once for testing. For a classification problem,

$$\text{Accuracy} = \frac{\text{overall number of correct classifications from the } K \text{ iterations}}{\text{total number of tuples in the initial data}}$$

The classification accuracy of the proposed method is evaluated with fivefold ($k = 5$) cross-validation of the samples. The results obtained from each fold are averaged and used for comparative analysis.
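The partitioning and the accuracy formula above can be sketched in Python as follows. This is a simplified stand-in for MATLAB's 'cvpartition', not a reproduction of it: the helper names `k_fold_indices` and `cross_validated_accuracy`, the fixed random seed, and the callback `train_and_count_correct` (which is assumed to train on the given training indices and return the number of correctly classified test samples) are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Randomly split range(n_samples) into k disjoint folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(n_samples, k, train_and_count_correct):
    """Overall correct classifications over the k iterations / total tuples."""
    folds = k_fold_indices(n_samples, k)
    total_correct = 0
    for i, test_idx in enumerate(folds):
        # Every fold except the i-th one is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        total_correct += train_and_count_correct(train_idx, test_idx)
    return total_correct / n_samples


# Fold sizes for the 303 instances of the Cleveland processed dataset.
folds = k_fold_indices(303, k=5)
print([len(fold) for fold in folds])
```

Each index appears in exactly one fold, so every sample is tested once and used for training in the other k - 1 iterations.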

v. Classification accuracy
Classification accuracy is used to evaluate the performance of the proposed method in terms of accurate diagnostic results. The accuracy of the classification is measured using the confusion matrix, a tool that analyzes how well the classifier can recognize tuples of different classes. The processing outcome of the confusion matrix is shown in Fig. 2.

Confusion matrix
There are four additional terms 'True Positive,' 'True Negative,' 'False Positive' and 'False Negative,' which are explained below.
True Positive (TP): It is an outcome where the model correctly predicts the positive class.
True Negative (TN): It is an outcome where the model correctly predicts the negative class.
False Positive (FP): It is an outcome where the model incorrectly predicts the positive class.
False Negative (FN): It is an outcome where the model incorrectly predicts the negative class.
It is described in Table 6.
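As a small illustration of the four terms above, the following Python sketch counts them for a binary problem and derives the accuracy as (TP + TN) / (TP + TN + FP + FN). The function names and the label vectors are hypothetical and are not taken from the experiments; the labels follow the convention of the Cleveland Hungarian Statlog target (0 = 'Normal', 1 = 'Heart Disease').

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, TN, FP, FN) for binary labels."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1   # correctly predicted positive class
            else:
                fp += 1   # incorrectly predicted positive class
        else:
            if actual == positive:
                fn += 1   # incorrectly predicted negative class
            else:
                tn += 1   # correctly predicted negative class
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical labels: 1 = 'Heart Disease', 0 = 'Normal'.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
tp, tn, fp, fn = confusion_counts(y_true, y_pred)
print(tp, tn, fp, fn, accuracy(tp, tn, fp, fn))
```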

Performance analysis between two datasets based on the experimental results
The following graphical representations (Figs. 35, 36, 37) are presented based on Tables 5 and 6.

Observation
During our experimentation, we found that the accuracy of the result is 100% when the hidden layer contains 10 or more neurons for the Cleveland heart dataset and 13 or more neurons for the Cleveland Hungarian Statlog dataset.

Conclusion
The proposed heart disease prediction system has been developed using the scaled conjugate gradient backpropagation neural network. Because this algorithm does not contain any user-dependent parameters whose values are crucial to its success, and because it uses a step size scaling process, it avoids a time-consuming line search in each learning iteration, which makes it faster than other adaptive learning algorithms. The algorithm is repeated until the minimum error rate is observed. From Fig. 37 (histogram), it is clear that the proposed method has the highest maximum accuracy rate compared with the various other methods. The experimental results show that the prediction accuracy varies with the number of hidden neurons, from 63.3803% to 100% for the Cleveland processed dataset and from 88.4754% to 100% for the Cleveland Hungarian Statlog heart dataset. Thus, the experimental results are good and encouraging for predicting heart disease with improved accuracy (Fig. 38).