Risk Assessment for Cardiac Arrest: A Deep Learning Approach With Channel Selection

Background: The risk of developing cardiovascular disease is increasing worldwide, and the number of deaths due to heart attacks is remarkable. Early risk assessment and diagnosis of heart disease are vital to preventing heart attacks through effective treatment planning and evaluation of outcomes. When a patient at high risk of heart attack is not treated correctly, the chances of survival may drop dramatically. For this reason, artificial intelligence-assisted systems can support doctors' decisions and anticipate risk before fatal consequences occur. Methods: In this study, individuals at risk of heart attack are identified using the proposed CNNs method. A set of medical data from patients with heart attacks and from healthy individuals was obtained from the UCI database. Reinforced deep learning and ANFIS architectures were also applied to the same problem in order to compare the results and demonstrate the efficiency of the proposed method. In addition, ROC analysis and measurements of processing times for the applied methods were performed to reveal the performance, accuracy and efficiency of the study. Results: The proposed CNNs method and the other methods were tested and evaluated. The accuracy of the methods was 94.34% for the proposed CNNs method, 91.58% for ANFIS, and 92.66% for the deep multilayer neural network; the highest accuracy was therefore obtained by the proposed CNNs method. The reasons why the proposed CNNs method outperforms the other methods are the use of a channel selection layer, the number of convolution and pooling layers, the filter sizes used in these layers, and the functions used in the loss and activation layers. Conclusions: In this study, a channel selection formula is introduced in the proposed CNNs model to select the most discriminative feature filters. In addition, the applicability of the proposed CNNs method to images obtained from numerical data is demonstrated.
With the proposed early prediction system, it is now possible to take precautionary measures against possible cardiac arrest. In this study, a new CNNs-based method is proposed for early detection of a possible heart attack, which is a great risk to human life. Different from studies in the literature, a channel selection formula is presented in the proposed CNNs method to select the most discriminative feature filters. In another departure from previous work, all numerical data from the dataset were converted into 2D images before being used in the proposed CNNs method. Afterwards, to show whether the proposed method is applicable, the dataset in its numerical form was applied to the other methods and the results were compared.


Background
The heart is a vital organ whose task is to pump blood into the circulatory system. In humans, it is located in the chest under the rib cage and, although it is only about the size of a fist, it has the strongest muscles in the body. The heart weighs less in women than in men and beats about 100,000 times per day. Circulation of blood through the lungs, where it is oxygenated, is called pulmonary circulation; circulation through the body to deliver oxygenated blood is called systemic circulation [1]. Although the heart is composed of very strong muscles, it is vulnerable to many risk factors. A heart attack is damage to the heart tissue caused by sudden blockage of the arteries that feed the heart. In the process called arteriosclerosis, the arteries narrow over time and clots develop on cracks, clogging the vessels. Without timely intervention, the damage results in loss of heart tissue; if this loss is widespread, it reduces the pumping power of the heart and causes heart failure. Heart attack is one of the leading causes of death both in Turkey and in the world. The World Health Organization (WHO) has stated that 30% of deaths worldwide are due to heart attack. In Turkey, studies show that the situation is even more serious, with a ratio of 46%. The Turkish Society of Cardiology, in a message issued on World Heart Day in 2015, stated that 300,000 heart attacks are observed every year in Turkey and that 125,000 of these cases result in death [2]. Nausea, vomiting and cold sweat can be symptoms of a heart attack; fainting and loss of consciousness can be added to these. For patients over a certain age, shortness of breath is also among the symptoms. These symptoms are experienced by approximately 80% of individuals who have had a heart attack; in the other 20%, the attack occurs silently, without any prior symptoms.
Eighty percent of cardiovascular diseases develop due to smoking, hypertension, genetic predisposition, obesity, a sedentary lifestyle and diabetes. Other causes are high cholesterol (low-density lipoprotein, LDL, sometimes called "bad" cholesterol, and triglycerides), low HDL (high-density lipoprotein, sometimes called "good" cholesterol), alcohol consumption, the presence of various heart conditions (such as vascular occlusion, arrhythmia or a previous heart attack) and stress.
Another statistic shows that women have a lower risk of heart attack before menopause because of the hormone estrogen; however, when heart attacks do occur in women, the rate of complications is greater than in men. The earlier a heart attack is detected, the lower the risk of loss of life. It is therefore very important to determine the risk of heart attack in a timely manner in order to save the individual's life. Today, only doctors can diagnose cardiac diseases. Studies in the field of health informatics help doctors in this regard, but they are not completely reliable. A misleading interpretation or diagnosis by a doctor for a disease with a low error threshold, such as heart attack, might have irreversible consequences. For this reason, artificial intelligence-assisted systems can support doctors' decisions in this field.
When analyzed in terms of computer and statistical sciences, the health sector is one of the richest areas in terms of data. However, effective analysis methods are not used sufficiently to discover the information and relationships within these data. In recent years, pattern recognition, data mining and machine learning techniques have been applied to many areas of health informatics. Information discovery and data mining can be applied to many sectors and businesses, and valuable information can be extracted by applying artificial intelligence techniques in the health system. To support the diagnostics and treatments made by doctors and to prevent human errors, the use of artificial intelligence-based decision support systems is becoming more widespread. In addition to algorithmic studies to detect cardiovascular diseases in advance, various statistical studies have been conducted, and there are many different studies within the disciplines of artificial intelligence and pattern recognition.

Related Works
Nabeel Al-Milli developed a heart disease prediction system using a neural network [3]. In this study, Al-Milli used 4 output classes and 14 parameters; the back-propagation algorithm was used for training the network, and the experiments performed show that the proposed algorithm performs very well. K. Rani analyzed a heart disease dataset using a neural network approach [4], adopting a parallel approach in the training phase in order to increase the efficiency of the classification process. Dangare et al. [5] proposed a heart disease prediction system using a neural network. In order to achieve better accuracy, the authors used 13 medical features such as gender, cholesterol level and stress testing, in addition to smoking and obesity. Their system estimates the probability of a patient having cardiac disease, and its accuracy is stated to be close to 100%. Chowdhury et al. applied an artificial neural network model to neonatal disease [6]; the proposed system accurately predicted the diagnosis of 75% of neonatal diseases. Shankar et al. proposed an algorithm for the detection and segmentation of S1 and S2 heart sounds using the discrete wavelet transform and the Shannon energy envelope [7]; for classification, an artificial neural network was used to detect heart murmurs in heart sounds. Bahekar et al. also used the discrete wavelet transform for the detection and segmentation of S1 and S2 heart sounds [8], with an Adaptive-Network-Based Fuzzy Inference System (ANFIS) used for classification.
M. Anbarasi tried to identify the important factors that cause heart diseases with the help of genetic algorithms [9]. Among the 13 factors responsible for heart diseases, the genetic algorithm also determines the common factors, which are then used in different classification algorithms. K. Srinavas et al. and M. Ghani et al. conducted two separate studies to estimate health status and heart attack risk using data mining techniques [10,11]. The Heart-c, Heart-h and Heart-statlog datasets were selected from the UCI repository, and classification was performed using the Weka software [12]. In the experiments performed with the UCI datasets, the best performance was obtained using Information Gain and ANFIS [13]. Jin et al. monitored electrocardiogram (ECG) signals in real time and continuously with a mobile-phone-based wearable device, and the generated signal records were used to detect and prevent abnormal conditions [14]. Link et al. proposed an image-processing-based method that uses video sequences to obtain dynamic heart rate measurements [15]. Panboonyuen et al. proposed a novel CNNs architecture for semantic segmentation with three main contributions: a global convolutional network, channel attention and domain-specific transfer learning [16]. Some studies focus on selecting the most effective features in a dataset and classifying with less complexity; similarly, there are various studies on the detection of heart diseases by analyzing ECG signals, and other studies offer methods for grouping the factors affecting different heart diseases.
In this study, an advanced deep neural network approach is developed to predict coronary heart disease and improve diagnostic accuracy using deep-learning-based classification and prediction models. The classification and diagnostic models developed for this research consist of two parts: a deep-neural-network-based training model and a heart disease diagnostic model. Based on the training model, the diagnostic model is used to predict whether patients have coronary heart disease. The performance of the deep learning model in the diagnosis of heart disease is then evaluated in terms of performance measurement parameters. A computer-assisted heart attack prediction system is proposed to help potential patients at risk of heart attack, and specialists, identify that risk from personal data. The proposed system is based on the Convolutional Neural Network architecture, and the feasibility of the proposed system and its diagnostic tests are presented. In Section II, the dataset and background on deep learning methods and CNNs are presented. In Section III, the proposed architecture and application method are presented, and in the last section, the analysis of the study and its results are shared.

Dataset
Data are very important for artificial intelligence and machine learning. The data selected for a problem are the basic building blocks for the learning and adaptation of the model. Data selection from raw data is made from a set that contains the parameters affecting the problem. Parameters that appear to be related to the problem in the raw data but in fact have no direct effect on it should be removed and not used for training.
The dataset used in this study is the heart attack dataset from the UCI machine learning database [17]. The data set contains 270 instances; in 150 of them no heart attack was observed, while the other 120 had a heart attack. The data set contains 13 parameters that trigger the heart attack outcome, and the output column refers to the person's heart attack status. In this study, the 270 heart disease patients are considered with 14 factors or variables. Descriptions of the covariates, factors and their levels, and summary statistics such as the mean, standard deviation, and proportion of the levels are given in Figure 1. The data contain 5 continuous variables and 9 attribute characters; the description of each variable or attribute, its levels, and how they are operationalized in the present report are also displayed in this figure. The presence or absence of heart disease in a patient is considered the dependent variable (for regression) or output variable (for classification), and the rest of the variables are considered independent variables/cofactors. Finally, the entire data set of 270 clinical instances is randomly separated into a training set of 162 clinical instances (60%) and a testing set of 108 clinical instances (40%).
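The 60%/40% partition described above can be sketched as a random index split. This is a minimal illustration; the actual sampling procedure used in the study is not specified, and the `split_indices` helper is hypothetical.

```python
import random

def split_indices(n_samples, train_ratio, seed=0):
    """Randomly partition sample indices into train and test sets."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    n_train = round(n_samples * train_ratio)
    return indices[:n_train], indices[n_train:]

# 270 clinical instances -> 162 training (60%) and 108 testing (40%)
train_idx, test_idx = split_indices(270, 0.6)
print(len(train_idx), len(test_idx))  # 162 108
```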

Deep learning
Artificial neural networks are an artificial intelligence method that aims to gain the ability to learn and adapt like the human brain by modeling it [18]. Deep learning can be defined as an architecture that works with more layers than multilayer artificial neural networks [19]. What distinguishes deep learning from other artificial intelligence and machine learning methods in the literature is that, because it contains many processing units, it can respond to large amounts of data and to complex problems.
Due to their architecturally complex structure and high number of calculations, deep artificial neural networks require hardware with very high computational power. This need is fulfilled by using the GPUs (Graphics Processing Units) of computers. Considering the latest technological developments, the amount of data in the field of health can today be described by the concept of Big Data. Although artificial intelligence and machine learning methods have been used for a long time, the ability of artificial neural networks to process big data with powerful simultaneous calculation on GPUs has made them quite different from other methods. With GPUs, training of the model can be completed in a shorter time with many more training samples. Many studies have been conducted in the literature and in practice using deep learning, with applications developed on a wide range of subjects such as natural language processing, image and video processing, biomedical signal and image processing, pattern recognition, health diagnosis support systems, robotics, chemistry, advertising, finance, search engines and autonomous vehicle systems. Deep learning is based on learning from representations of data.

Convolutional neural networks (CNNs)
Convolutional Neural Networks (CNNs), in contrast to standard artificial neural networks, are a deep learning approach that has a layer which allows the extraction of features [20,21,22]. CNNs, which produce good performance results in image processing, are based on multilayer artificial neural networks and have a customized deep learning architecture [23,24]. In the first layers of a CNN, local features are extracted from the image; in later layers, these features are combined to detect geometric shapes or feature symbols. These processing steps are repeated until the complete input image is covered. The CNNs architecture (Figure 2) consists of the following layers [19]: Convolution Layer: This layer extracts features from images by using filters. It performs the convolution of the input image with an image filter (which has random values at the beginning). This layer distinguishes CNNs from artificial neural networks: instead of fully interconnecting the neurons between layers, convolution is performed over small regions. The size of the output feature map is defined as

W_r = (W_{r-1} - K_w)/S_w + 1,  H_r = (H_{r-1} - K_h)/S_h + 1  (1)

In equation 1, (W_r, H_r) are the width and height of the output feature map, (K_w, K_h) is the kernel size, (S_w, S_h) defines the number of pixels skipped by the kernel in the horizontal and vertical directions, and the index r indicates the layer. The convolution of input x producing pixel value y in layer l-1 for each image is shown in equation 2. For the convolution, w is a filter of size n*n with randomly selected initial values; this filter is adjusted according to the input-output relationship during training. At the end of the training, the coefficients that model the problem are obtained.
In the proposed model, five convolution layers were used with rectified linear unit (ReLU) layers and response normalization layers to extract the maximum number of feature maps from the input frames and train on the dataset with maximum accuracy.
Activation Function: Activation functions add non-linearity to networks, providing a higher-level understanding of the input data. In addition, the activation function produces a feature map that increases the independence of neurons in the next layer by bounding extreme values, thereby increasing the stability of the entire network. The sigmoid function suppresses values to the range between 0 and 1: large negative values tend towards 0, whereas large positive values tend towards 1. The hyperbolic tangent function, similar to the sigmoid function, suppresses values to a real value in the range [-1, 1]; this function also suffers from the saturated activation problem. The Rectified Linear Unit (ReLU) function is defined as max(0, x) and is the most commonly used activation function in CNNs architectures.
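The three activation functions described above can be written as simple scalar sketches; these are the standard definitions, not code from the study.

```python
import math

def sigmoid(x):
    """Squashes x into (0, 1); large negatives -> 0, large positives -> 1."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes x into (-1, 1); saturates like the sigmoid."""
    return math.tanh(x)

def relu(x):
    """max(0, x): the most common activation in CNNs."""
    return max(0.0, x)

print(sigmoid(0.0))  # 0.5
print(relu(-3.2))    # 0.0
print(relu(2.5))     # 2.5
```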
Pooling Layer: This layer is used between successive convolution layers. It reduces the number of weights and operations, which also reduces computational complexity. The filters used in this layer take either the mean or the maximum value: the mean value is found by dividing the sum of the pixel values within the filter window by the window size, while the maximum value is the largest pixel value within the filter window. For input data of size X*Y*Z, filter size F and stride S, the size of the activation map output is ((X - F)/S + 1) * ((Y - F)/S + 1) * Z, as shown in equation 5.
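The output-size formulas for convolution (equation 1) and pooling (equation 5) share the same form and can be checked with a small helper. This is a sketch assuming no zero-padding (padding is not mentioned in the text); the 227/11/stride-4 example values come from the standard AlexNet configuration, not from the paper.

```python
def output_size(input_size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution or pooling stage:
    floor((N - F + 2P) / S) + 1."""
    return (input_size - kernel_size + 2 * padding) // stride + 1

# e.g. an 11x11 kernel with stride 4 on a 227-pixel input dimension:
print(output_size(227, 11, stride=4))  # 55
# followed by a 3x3 max-pool with stride 2:
print(output_size(55, 3, stride=2))    # 27
```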
Fully Connected Layer: The layer performs the classification process and produces the output.
Besides image processing problems, CNNs have also produced very successful results for problems such as search queries, estimation, classification and semantic decomposition. Zhou et al. worked on emotion analysis using CNNs methods; five different emotion data sets were used in their model, which is called an active deep network and uses a reinforced learning structure. Kim worked on a sentence classification problem using a CNNs-based architecture; in his study, classification of sentences yielded successful results on 7 different subjects. Collobert et al. also applied CNNs to natural language processing tasks. AlexNet is a CNNs model developed by Hinton, Krizhevsky and Sutskever; it won the ImageNet competition in 2012 and promoted deep learning around the world. It consists of a total of 25 layers with consecutive convolution and pooling layers. The ReLU activation function was used for the nonlinear stages; it was chosen to shorten the training time, as it is faster than the classical functions. A dropout layer was used to prevent overfitting during the training process, and gradient descent with weight decay and momentum was used for training. Generally, after each convolution layer, a ReLU activation layer is used. In addition, there are an input layer, normalization layers, pooling layers, dropout layers, fully connected layers, a softmax layer and an output layer. In this multilayered structure, each layer has to complete its own processing and then transfer the data to the next layer. When the input data are transmitted within the network, the amount of data between the layers is quite high, and it takes a long time to execute these operations on a normal processor. GPUs perform more operations at the same time, reducing the processing time; therefore, GPU (Graphical Processing Unit) graphics processors are used in deep learning architectures.
ZF Net is an improved version of the AlexNet architecture and won the 2013 ImageNet competition. While AlexNet uses an 11x11 filter for convolution in the input layer, ZFNet uses a 7x7 filter. The ReLU function was used for activation, gradient descent for training and cross-entropy for the error loss. A visualization technique called the Deconvolutional Network was developed in the ZFNet architecture; this technique brought a different dimension to the architecture and carried deep learning to a more successful point. VGG-16 is a simple network model developed for better results. The most important difference that distinguishes VGG-16 from previous models is that the pooling layers follow double or triple convolution layers and, unlike in other deep learning methods, 2x2 and 3x3 filters are applied in the convolutions. GoogleNet, which won the ImageNet 2014 competition, has a complex structure built from combinations of Inception modules. GoogleNet is 22 layers deep and has a structure consisting of 144 layers in total. With the Inception module, which performs filtering at different sizes, it created a formation different from the deep learning architectures that came before it; some of these filters are used to reduce dimensionality. It contains 12 times fewer parameters than AlexNet. The number of layers used may differ according to the independent building blocks: unlike the strictly sequential layered structure of other deep learning architectures, a depth structure was created through this modular filtering logic.
Microsoft ResNet was designed to be deeper than all previous architectural structures and won the 2015 ImageNet competition. It was the first network with 34 layers built from residual blocks, and its depth exceeds the number of layers in other deep learning architectures. In the Microsoft ResNet architecture, a residual block was created, fed with a residual value between two ReLU and linear layers. With this architecture, the learning process is expected to take place more quickly.
The dates of emergence of these deep learning architectures, their numbers of parameters and their error rates are shown in Table 1.

Proposed CNNs Method
In this section, the channel selection formula is introduced into the proposed CNNs model for selecting the most discriminative feature filters. In the proposed architecture, a convolution operator gives the probability of each class at each pixel. In equation 6, the final score is obtained by multiplying over all channels of the feature maps.
In equation 6, x is the output feature of the network, w represents the convolution kernel and P is the set of pixel positions. It is not correct to evaluate the standard deviation value alone, because the standard deviation describes the distribution of all available scores. The essential step is to calculate, using the standard deviation and the scores, how many standard deviations each score lies below or above the average score. In this way, raw scores that are not directly comparable at the outset are placed on a common scale. As a result, the relative position (performance) of each score with respect to all scores can be expressed mathematically, and these scores can be compared with one another. The standardized grade of the final score is a function of the average of all final scores and their standard deviation; this function gives the success of the final score (equation 7).
In these equations, K is the number of channels and the resulting term is the prediction probability (the success of the final score). The feature selection performed by the channel selection block is formulated in equation 8: as shown there, a parameter is added that shifts the highest probability value towards the actual label.
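Since equations 6-8 are not reproduced in this text, the score-standardization idea can only be sketched generically: each channel's raw final score is expressed as the number of standard deviations it lies above or below the mean of all channel scores, making channels directly comparable. This is a hypothetical illustration, not the authors' exact formula.

```python
import math

def standardize_scores(scores):
    """Z-score each channel's final score against the mean and
    standard deviation of all channel scores."""
    mean = sum(scores) / len(scores)
    std = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    return [(s - mean) / std for s in scores]

z = standardize_scores([2.0, 4.0, 6.0])
print([round(v, 3) for v in z])  # [-1.225, 0.0, 1.225]
```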

Application
The computer on which the application was modeled has a 3.50 GHz processor and 16 GB of memory, and Python was used for programming. From the UCI dataset, the numerical data of 270 patients at risk of heart attack were converted to images; of the 270 images, 162 were used for training and 108 for testing. The selected inputs from the UCI dataset and other details are presented in Figure 1. The inputs of the proposed model are: age, sex, chest pain, resting blood pressure, cholesterol, fasting blood sugar, resting ECG results, maximum heart rate achieved, exercise-induced angina, ST depression induced by exercise, the slope of the peak exercise segment, number of major vessels and thallium heart scan. To create a model input pattern for each individual, an image is obtained from the numerical values of that individual's inputs in the data set (Figure 3); each image column consists of the values of the individual's input parameters. When creating the image from the numerical data, each value is normalized into the 0-255 range. The obtained images are then used for training the CNNs model.
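The numeric-to-image step described above can be sketched as column-wise min-max scaling into the 0-255 pixel range. The helper name, the feature values and the per-feature minima/maxima below are made up for illustration; they are not taken from the dataset.

```python
def to_pixel_column(values, col_min, col_max):
    """Min-max scale one patient's feature values into 0-255
    pixel intensities (per-feature minima/maxima over the dataset)."""
    pixels = []
    for v, lo, hi in zip(values, col_min, col_max):
        scaled = 0.0 if hi == lo else (v - lo) / (hi - lo)
        pixels.append(int(round(scaled * 255)))
    return pixels

# Hypothetical values for four features (e.g. age, sex, resting blood
# pressure, cholesterol) with made-up dataset minima and maxima:
patient = [63.0, 1.0, 145.0, 233.0]
mins = [29.0, 0.0, 94.0, 126.0]
maxs = [77.0, 1.0, 200.0, 564.0]
print(to_pixel_column(patient, mins, maxs))  # [181, 255, 123, 62]
```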
In the proposed work, an AlexNet-style CNNs deep learning architecture was used for heart attack risk assessment. The network is deeper than standard CNNs, with five convolution layers and three pooling layers. A dropout value of 0.5 was used on the fully connected layers, preventing the model from overfitting. The architecture consists of the following components (Figure 4). In the proposed CNNs model, the AlexNet deep learning algorithm was used for classification; in addition, a deep reinforced network and ANFIS were applied to the same problem for comparison. All algorithms were run in the Python development environment 'Jupyter Notebook', and the Keras, Tensorflow and matplotlib libraries were used for all models. The proposed model consists of 5 convolution layers, 3 pooling layers, rectified linear units and 3 fully connected layers. 96 filters of the relatively large size 11×11×3 are used for the first convolution layer; 256 filters of size 5×5 are used for the second convolution layer; and 384 filters of size 3×3 are used for the third, fourth and fifth layers. Each convolution layer creates feature maps. The feature maps of the first, second and fifth convolution layers are followed by 3×3 pooling layers with a stride of 2. The fully connected part of this eight-layer architecture uses 4096 nodes; the feature maps are tied to the fully connected layers, and softmax activation is then performed to determine the classification probabilities. The total number of parameters is 62,378,344.
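The layer sizes above can be sanity-checked with a simple parameter-counting helper. This is only a sketch: the exact total of 62,378,344 depends on padding, grouping and input-size choices not fully specified in the text, so only individual layer counts are illustrated.

```python
def conv_params(kernel_w, kernel_h, in_channels, n_filters):
    """Weights plus biases of one convolution layer."""
    return kernel_w * kernel_h * in_channels * n_filters + n_filters

def dense_params(in_units, out_units):
    """Weights plus biases of one fully connected layer."""
    return in_units * out_units + out_units

# First convolution layer: 96 filters of size 11x11 over 3 channels.
print(conv_params(11, 11, 3, 96))  # 34944
# A 4096-to-4096 fully connected layer.
print(dense_params(4096, 4096))    # 16781312
```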

Results
In this study, a new CNNs-based method is proposed for the early detection of a possible heart attack, which is a great risk to human life. Using the heart attack data set obtained from the UCI machine learning database, the classification performance of different classifiers and attribute groups on balanced and unbalanced datasets is examined for the features triggering heart attack. An experimental pattern consisting of 13 properties, five different feature groups, four different classifiers and a data balancing approach was created, giving 13x5x4x2 = 520 experiments, and the results are evaluated from different perspectives. In addition to the proposed CNNs architecture, other architectures were tested and evaluated numerically. Heart attack risk assessment statistics of the proposed and other architectures are presented in Table 2. The mean accuracy of the architectures was 94.34% for the proposed deep learning method, 91.58% for ANFIS, and 92.66% for the deep multilayer neural network. These results show that the proposed CNNs architecture is more successful than the other architectures; the reasons are the number of convolution and pooling layers, the filter sizes used in these layers, and the functions used in the loss and activation layers. The performance of the proposed method was also compared with similar previous studies and found to be higher. Traditional features commonly used in the literature, features obtained by transfer learning from a pretrained model, and their combinations were evaluated. As a result of the experiments, it is concluded that different classification approaches and different feature groups should be considered separately for each heart attack feature. Moreover, variability is observed in the classifiers' performance criteria, and no single classifier stands out in all criteria.
The data set obtained from the UCI database has an unbalanced class distribution. This unbalanced distribution leads to higher prediction rates for the dominant class and lower rates for the minority class.
High classification performance combined with low sensitivity measurements confirms this conclusion. When the data sets are balanced, the decrease in classification performance and the increase in sensitivity show that the minority classes are better predicted while the relative prediction rate of the dominant class decreases. Balancing the minority class contributes little to overall classification performance but substantially to the sensitivity rate. The reason is that classification performance is calculated over the sum of the samples in the whole data set, whereas sensitivity is calculated per class and averaged. The certainty value is not affected much by the balancing process; this is due to the low false-negative rate in both conditions (balanced/unbalanced data). In terms of classification methods, the designed deep learning architectures provide high classification performance, ANFIS has a high sensitivity rate, and there is no significant difference between the classifiers in the certainty value. The ROC curve of the proposed CNN architecture is given in Figure 5. Unlike studies that use a single model structure when estimation mechanisms are established, this study will contribute to the literature by examining various mechanisms and combinations thereof.
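The accuracy-versus-sensitivity behavior described above can be illustrated with a hypothetical confusion matrix on an unbalanced two-class problem; the counts below are made up for illustration, loosely echoing the 150/120 class split of the dataset.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)

def sensitivity(tp, fn):
    """Recall of the positive (minority) class."""
    return tp / (tp + fn)

# Dominant class mostly right, minority class mostly missed:
tp, tn, fp, fn = 10, 140, 10, 110
print(round(accuracy(tp, tn, fp, fn), 3))  # 0.556
print(round(sensitivity(tp, fn), 3))       # 0.083
```

The accuracy looks moderate because the dominant class carries the total, while the minority-class sensitivity exposes how poorly that class is actually predicted.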
The proposed CNNs method obtained better accuracy than both the feed-forward deep reinforced learning architecture and the ANFIS architecture. However, its temporal performance during model estimation, in both training and live systems, is lower than that of the other architectures applied in this study (Table 3). Together, the performance criteria presented in Table 2 and the processing times given in Table 3 indicate that the proposed architecture is applicable, subject to the temporal criticality of the target problem.

Discussion
In this study, differently from studies in the literature, a channel selection formula is presented in the proposed CNNs model to select the most discriminative feature filters. This formulation determines a suitable weight for each feature extracted at different stages of the proposed CNNs model. Also, differently from other studies, all numerical data from the dataset were converted into 2D images for use in the proposed CNNs method. Afterwards, to show whether the proposed method is applicable, the dataset in its numerical form was applied to the other methods and the results were compared. When the results are analyzed, in terms of both accuracy and temporal efficiency, the proposed method is found to be applicable. In addition, this study shows that the risk of heart attack, which ranks at the top of death rates all over the world, can be predicted before cardiac arrest with a deep learning method. The proposed model, whose applicability has been demonstrated by the findings obtained, can be integrated into wearable technology so that heart attack risk can be continuously monitored from people's instantaneous data.

Conclusions
In this study, a new CNNs-based method is proposed for early detection of a possible heart attack before cardiac arrest, which is a great risk to human life. Differently from studies in the literature, a channel selection layer is used in the proposed CNNs model to select the most discriminative feature filters. Also, in order to be usable in CNNs methods, all numerical data were converted to 2D images. The problem was applied to the proposed CNNs method as well as to ANFIS and a deep multilayer network, and statistical comparisons are given to show that the proposed CNNs method is applicable. If the deep-learning-based heart attack risk model proposed in this study is embedded in a device with wearable technology (for example, a watch), the likelihood of saving the lives of people who will experience a heart attack will increase.

Ethics approval and consent to participate
The dataset used in this study is the heart attack dataset from the UCI machine learning database, which is openly shared and contains no personal information. Therefore, there are no ethical concerns.

Consent for publication
I understand that the text and any pictures or videos published in the article will be freely available on the internet and may be seen by the general public.

Availability of data and material
The datasets analysed during the current study are available in the UCI repository, [https://archive.ics.uci.edu/ml/datasets.php]

Figure Legends
Figure 1. Dataset description.
Figure 2. CNN-Deep learning method.
Figure 3. Sample input pattern.
Figure 4. Architecture of the proposed CNN.
Figure 5. ROC curve of the proposed CNN architecture.