A new intelligent ECG recognition approach based on CNN and improved ALO-SVM

Cardiovascular disease is one of the most common diseases and seriously threatens people's life and health. Its prevention has therefore become one of the most active research topics in health care system design. Intelligent recognition of electrocardiogram (ECG) signals is an effective method for the rapid diagnosis and evaluation of cardiovascular diseases, and efficient real-time classification of ECG signals plays a major role in detecting them. This paper proposes an intelligent ECG signal recognition method based on a convolutional neural network (CNN) and a support vector machine (SVM) optimized by an improved antlion optimization (ALO) algorithm. First, the ECG signal is denoised and preprocessed with a lifting wavelet. Subsequently, the CNN extracts features from the denoised signal, and these features serve as the input to the SVM. Finally, the improved ALO algorithm optimizes the relevant parameters of the SVM to achieve better signal classification. In addition, the threshold estimation method of the lifting wavelet is improved to enhance the filtering effect. The proposed architecture is tested with multi-lead ECG signals from the MIT-BIH ECG data set. 
The results show that the method achieves average accuracy, sensitivity, and specificity values of 99.97%, 99.97%, and 99.99%, respectively. Compared with existing results, the proposed approach has better recognition performance.


Introduction
The heart is the most important organ in the human body, as it maintains blood circulation. However, owing to poor diet and exercise habits, the incidence of heart disease has been rising in recent years, which seriously threatens people's life and health. Electrocardiography (ECG) is a technique that records the changes of myocardial bioelectric current in graphical form [1]. At present, ECG is still an important means of medical diagnosis and evaluation of cardiovascular diseases. The waveform of an ECG signal is complex and contains a huge amount of information. Generally, doctors have to distinguish ECG signals by manual labeling, which is prone to misjudgment and therefore greatly affects the identification and diagnosis of patients suffering from cardiovascular diseases. It is thus urgent to realize automatic, computer-based recognition and classification of ECG signals.
At present, many automatic ECG analysis methods based on signal processing and machine learning have been proposed, using techniques such as the wavelet transform (WT), support vector machines (SVM), and convolutional neural networks (CNN) [2]. Sangaiah et al. designed an enhanced denoising filter to improve ECG signal quality, then used WT and a hidden Markov model (HMM) to achieve signal classification [3]. Mousavi et al. proposed a method based on NLP to classify ECG signals and achieved good performance [4]. Salem et al. used a spectrogram to convert a one-dimensional ECG signal into a two-dimensional image, then designed a 2D-CNN to classify ECG signals [5]. Acharya et al. developed a 9-layer deep CNN, which can automatically identify five different types of heartbeats in ECG signals [6].
In this paper, a new deep learning method is designed to effectively classify arrhythmias. Firstly, the improved threshold lifting wavelet is used to preprocess the raw ECG signal, so that the noise interference in the original signal is filtered out as much as possible. Secondly, a CNN model with strong feature learning capability is used to extract the relevant features of the signal. Thirdly, an SVM classifier whose parameters are optimized by the antlion algorithm (ALO) [7] is used to classify the ECG signal. Finally, the MIT-BIH ECG signal data set provided by the Massachusetts Institute of Technology is used to test the performance of the model, and comparison experiments are carried out using several practical indicators, including precision, sensitivity, and specificity. The comparison with existing methods shows that our method further improves classification accuracy.
The rest of this paper is organized as follows. Section 2 presents the structure of the ECG recognition method developed in the article. The performance of the proposed algorithm and the comparison with the classification results of existing methods are provided in Sect. 3. Section 4 concludes this paper. Figure 1 shows the proposed ECG classification system architecture. First, the system receives the original ECG signal from the public data set and classifies the signals. Second, an improved lifting wavelet method is used to filter the noise. Then, CNN is utilized to extract ECG signal features. In order to improve the system performance, an improved ALO algorithm is used to adjust the relevant parameters of SVM with the main goal to provide a more accurate classification performance.

Data preprocessing
The original ECG signal contains a lot of noise interference, which will seriously affect the accuracy of ECG data analysis. Therefore, it is necessary to denoise and preprocess the ECG data before model analysis to effectively filter out the noise and improve the accuracy of spectral analysis. In this paper, an improved threshold lifting wavelet is used to filter noise.
Note that the lifting wavelet [8] refers to a second-generation wavelet algorithm based on the time-domain lifting method. Compared with the first-generation wavelet algorithm, the method adopts simple multiplication operations in place of the computationally complex Fourier transform. Therefore, its calculation efficiency is much higher than that of the first-generation wavelet algorithm.
After the original signal is decomposed by the wavelet, the amplitude of the wavelet coefficients of the noise is smaller than that of the original signal, so a threshold denoising method can be adopted to eliminate the noise. Soft and hard threshold denoising are the two methods most commonly used in practice. Despite their popularity, both have problems. The reconstructed signal obtained by the soft threshold method has good continuity, but it can partially distort the original signal. The reconstructed signal obtained by the hard threshold method is closer to the original signal, but it can oscillate locally.
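To make the trade-off concrete, the following minimal sketch applies the textbook hard and soft threshold rules to a handful of coefficients. The function names, the coefficient values, and the threshold `lam` are illustrative assumptions, not values from this paper.

```python
def hard_threshold(w, lam):
    """Keep a coefficient unchanged if |w| >= lam, otherwise set it to zero."""
    return w if abs(w) >= lam else 0.0


def soft_threshold(w, lam):
    """Zero small coefficients and shrink the remaining ones toward zero by lam."""
    if abs(w) < lam:
        return 0.0
    sign = 1.0 if w > 0 else -1.0
    return sign * (abs(w) - lam)


# Illustrative wavelet coefficients and threshold (made-up values).
coeffs = [0.2, -0.8, 1.5, -3.0]
lam = 1.0
hard = [hard_threshold(w, lam) for w in coeffs]
soft = [soft_threshold(w, lam) for w in coeffs]
```

The hard rule leaves the surviving coefficients untouched but jumps at |w| = lam, which is the source of the local oscillation; the soft rule is continuous but biases every surviving coefficient by lam, which is the source of the distortion.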
To address the inherent defects of the soft and hard thresholds, Zhou et al. proposed an improved threshold function [9], which is defined in Eq. (1).
where the parameter α stands for the adjustment factor of the threshold function, which can be flexibly selected within the range (0, 1]. When α is close to 0, the improved threshold function tends toward the soft threshold method. When α is close to 1, it tends toward the hard threshold method. At the same time, when the signal amplitude is close to or smaller than the noise, the traditional threshold estimation method treats the signal as noise, which degrades the denoising performance. To avoid this problem, the traditional threshold estimation method needs to be adjusted. For the wavelet transform value at scale m, the correlation with the adjacent scale, corr(m, n), is calculated and normalized to obtain k(m, n). Across the scales of the wavelet transform, the wavelet coefficients of the noise and those of the original signal follow opposite trends. Therefore, when k(m, n) < 1, the output is regarded as noise. In order to better filter out noise interference, the following adjustments are made to the traditional threshold:

Remark 1
Lifting wavelet transform is the second generation of the wavelet transform, which inherits the multi-resolution characteristics of the wavelet. It does not need the Fourier transform, which improves the operation efficiency. The algorithm mainly consists of three steps: decomposition, prediction, and update. This paper adopts the method of Zhou [9] to adjust the threshold function and threshold estimation method in the algorithm.
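The three steps named in Remark 1 can be illustrated with the simplest lifting scheme, the Haar wavelet. This is only a sketch of the general idea: the paper does not state which wavelet is lifted, so the Haar predict and update rules below are an assumption chosen for brevity, and the function names are hypothetical.

```python
def haar_lifting_forward(signal):
    """One level of the Haar lifting scheme on an even-length sequence."""
    # Decomposition (split): separate even- and odd-indexed samples.
    even = signal[0::2]
    odd = signal[1::2]
    # Prediction: predict each odd sample from its even neighbor;
    # the prediction error is the detail (high-frequency) coefficient.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: correct the even samples so they carry the pairwise mean,
    # giving the approximation (low-frequency) coefficient.
    approx = [e + d / 2.0 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo the update and prediction steps, then merge the samples."""
    even = [a - d / 2.0 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    merged = []
    for e, o in zip(even, odd):
        merged.extend([e, o])
    return merged


x = [2.0, 4.0, 6.0, 8.0, 5.0, 1.0]
approx, detail = haar_lifting_forward(x)
restored = haar_lifting_inverse(approx, detail)
```

In a denoising pipeline, a threshold function would be applied to `detail` between the forward and inverse transforms. Every step is an in-place addition or subtraction, which is why the lifting scheme needs no Fourier transform and inverts exactly.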

Signal detection system
Other scholars have analyzed all five heartbeat types in the MIT-BIH database [10][11][12]: normal (N), ventricular (V), supraventricular (S), fusion of normal and ventricular (F), and unknown beats (Q). Since Q beats are of poor quality and usually carry no meaningful information, only N, S, V, and F are analyzed in detail. In this work, we use a CNN to extract the characteristics of the ECG data. Then, an SVM is used to classify the features extracted from the heartbeats. In addition, the antlion algorithm is adopted to optimize the model parameters to further improve the classification accuracy. CNN is one of the most widely used artificial neural networks [13]. Compared with other classification models, its local perception and weight sharing greatly reduce the number of model parameters and significantly improve training efficiency, which makes it easier for a CNN to process high-dimensional data. In this study, we propose a CNN framework for extracting the characteristics of the signal, and its architecture is shown in Fig. 2.
However, CNN is prone to overfitting when samples are scarce. In this paper, the SVM [14] is combined with the CNN to make up for this deficiency. When an SVM is used to solve a specific classification problem, its parameters must be tuned, including the parameter γ of the RBF kernel function and the penalty coefficient C. Traditionally, these parameters are adjusted manually. Although manual adjustment is guided by theory, it still involves great uncertainty and can easily miss the optimal solution. To solve this problem, this paper adopts an improved ALO [15,16] to adjust C and γ automatically, which avoids the inaccuracy of manual tuning and finds the optimal parameters more reliably. The improved ALO is easy to implement and avoids the tendency of the traditional ALO [12,17] to fall into a local optimum. The improved ALO optimization process is as follows:

Logistic chaotic map initialization population
Chaos is a universal nonlinear phenomenon in nature, which fully reflects the complexity of the system. The process of chaos seems to be chaotic but contains inherent regularity. The application of chaos theory is very extensive, and its coverage involves almost every branch of natural and social sciences.
Logistic mapping is a widely used one-dimensional chaotic system. As early as the 1950s, several ecologists used this simple difference equation to describe population changes. It is defined in Eq. (3): X(n + 1) = μ X(n) (1 − X(n)).
In the formula, X(n) ∈ [0, 1] and μ is the control parameter; when μ = 4, the system is in a fully chaotic state. In this article, the idea of logistic chaotic mapping is introduced into the individual position initialization of the antlion optimization algorithm to enhance the uniformity and ergodicity of the population.
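As a minimal sketch of this initialization, the logistic map can be iterated to fill a population of candidate (C, γ) pairs. The function names, the seed value `x0`, and the search bounds are illustrative assumptions and are not taken from the paper; the sketch uses μ = 4, the value at which the logistic map is fully chaotic on [0, 1].

```python
def logistic_sequence(x0, mu, length):
    """Iterate X(n+1) = mu * X(n) * (1 - X(n)) and return `length` values."""
    values = []
    x = x0
    for _ in range(length):
        x = mu * x * (1.0 - x)
        values.append(x)
    return values


def chaotic_population(n_agents, bounds, x0=0.37, mu=4.0):
    """Map a chaotic sequence in [0, 1] onto the search bounds.

    bounds is a list of (low, high) pairs, one per dimension.
    x0 must avoid fixed or periodic points such as 0, 0.25, 0.5, 0.75, 1.
    """
    dim = len(bounds)
    chaos = logistic_sequence(x0, mu, n_agents * dim)
    population = []
    for i in range(n_agents):
        agent = []
        for j, (low, high) in enumerate(bounds):
            agent.append(low + chaos[i * dim + j] * (high - low))
        population.append(agent)
    return population


# Hypothetical search ranges for the SVM parameters C and gamma.
svm_bounds = [(0.01, 100.0), (0.0001, 10.0)]
population = chaotic_population(5, svm_bounds)
```

Each row of `population` is one initial position in the (C, γ) search space; the same routine can seed both the ant and the antlion populations.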

Real-time update of Antlion population
In the traditional ALO algorithm, the routes of the ants do not affect each other, which leads to large differences in the fitness values of the corresponding individuals. The fitness values of some antlions always remain above the population average, which weakens the optimization performance of the overall algorithm to a certain extent. To solve this problem, a tournament selection strategy is added to the optimization process to eliminate a certain proportion of individuals with poor fitness and to update the population in real time. This keeps more good individuals in the antlion population and reduces the possibility of the algorithm falling into a local optimum.
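One common form of tournament selection is sketched below. The paper does not specify the tournament size or the proportion of individuals replaced, so the k-way tournament, the assumption that lower fitness is better, and the function name are all illustrative choices.

```python
import random


def tournament_select(population, fitness, k=2, rng=random):
    """Rebuild the population from winners of k-way tournaments.

    Assumes a minimization problem: the contestant with the lowest
    fitness value wins. Contestants are drawn without replacement.
    """
    n = len(population)
    selected = []
    for _ in range(n):
        contestants = rng.sample(range(n), k)
        winner = min(contestants, key=lambda i: fitness[i])
        selected.append(population[winner])
    return selected


# Toy individuals and fitness values (made-up numbers).
individuals = ["a", "b", "c", "d", "e", "f"]
scores = [0.3, 0.9, 0.1, 0.5, 0.7, 0.2]
survivors = tournament_select(individuals, scores, k=2, rng=random.Random(0))
```

Because each tournament compares distinct individuals, the single worst individual can never win and is always eliminated, while better individuals tend to appear several times in the new population.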
In addition, Levy flight replaces the random walk as the movement of the ants, and the improved random walk is shown in Eq. (4).
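A Levy-flight step is often generated with Mantegna's algorithm, sketched below. This is an assumption about how such steps are typically drawn and is not a reproduction of the paper's Eq. (4); the stability index `beta`, the step `scale`, and the function names are hypothetical.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step using Mantegna's algorithm."""
    numerator = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    denominator = (
        math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    )
    sigma_u = (numerator / denominator) ** (1.0 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1.0 / beta)


def levy_walk(start, n_steps, scale=0.01, beta=1.5, rng=random):
    """Accumulate Levy steps into a one-dimensional walk."""
    path = [start]
    for _ in range(n_steps):
        path.append(path[-1] + scale * levy_step(beta, rng))
    return path


path = levy_walk(0.0, 50, rng=random.Random(1))
```

Most steps are small, but occasional long jumps let an ant leave the neighborhood of a local optimum, which is the motivation for replacing the uniform random walk.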

Antlion constructs trap
Each ant corresponds to an antlion hunting. The antlions hide in the retrieval space and construct sand traps around them. Ants' random walk could be affected by the traps made by the antlions, and they could slide toward the center of the trap, which is defined in Eq. (5).
where the proportional coefficient I simulates the falling speed of the sand trap. When the ants are on the edge of the sand trap, the random walk is only slightly affected by the trap, and I tends to 1. When the ants walk toward the center of the sandpit, the random walk is strongly affected by the trap, so the ants caught in it can hardly leave the trap range and can only slide toward its center. At the center of the trap, I tends to infinity. When simulating the influence of traps, the coefficient I increases piecewise, which simplifies the algorithm and achieves rapid convergence. However, this may miss the global optimum, because the search range shrinks abruptly and skips part of the search space. To solve this problem, Meng et al. proposed a contraction model with smooth and convergent boundaries to simulate the sinking of the sand pit [15].
In the formula, the two adjustment factors ψ and ω allow the improved contraction model to explore the search space more comprehensively while maintaining rapid convergence, so as to avoid missing the optimal solution. The relationship between the adjustment factors and the convergence is shown in Fig. 3.
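For reference, the piecewise behavior of I in the traditional ALO can be sketched as follows. The schedule (I = 10^w · t/T with w stepping from 2 to 6) is the one commonly given for the standard algorithm and is stated here as an assumption; it is not the smooth model of Meng et al., whose formula is not reproduced in this sketch.

```python
def shrink_ratio(t, T):
    """Piecewise ratio I of the traditional ALO at iteration t of T."""
    # Thresholds are checked from latest to earliest stage.
    for fraction, exponent in ((0.95, 6), (0.9, 5), (0.75, 4), (0.5, 3), (0.1, 2)):
        if t > fraction * T:
            return (10 ** exponent) * t / T
    return 1.0


def shrunk_bounds(lower, upper, t, T):
    """Divide the walk boundaries by I so the trap tightens over time."""
    ratio = shrink_ratio(t, T)
    return lower / ratio, upper / ratio


early = shrunk_bounds(-10.0, 10.0, 5, 100)
later = shrunk_bounds(-10.0, 10.0, 20, 100)
```

The jump of I from 1 to more than 10 as soon as t exceeds 0.1·T shows the abrupt contraction that can skip part of the search space, which is what a smooth contraction model is meant to avoid.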

Remark 2
The antlion optimization algorithm is inspired by the hunting mechanism of antlions in nature. When an antlion hunts ants, it uses its lower jaw to dig a conical sand pit as a trap and lurks at the bottom of the pit. Once an ant falls into the trap, the antlion throws sand outward so that the prey quickly slides to the bottom of the pit, where it is easily caught. The ALO algorithm simulates the interaction between antlions and ants in the hunting process: it regards the parameters to be optimized as randomly walking ants and imitates the way antlions use sand pits to capture ants in order to find the optimal parameters. To enhance the performance of the algorithm, this paper improves the initialization, random walk, and sand trap construction of the traditional ALO algorithm.

Experimental configurations
In this study, a laptop with AMD Ryzen 7 4800H with Radeon Graphics processor and 16GB memory is used to train the proposed model, and the Keras framework is utilized to build, train, and test the used model. The overall training time of this method is about 26 hours.

Database
In this study, we utilize the MIT-BIH arrhythmia database to evaluate the wave detection and arrhythmia classification of our system. The database includes 48 records obtained from 47 subjects in the BIH arrhythmia laboratory between 1975 and 1979. Moreover, all heartbeat signals in the database were labeled beat by beat by more than one cardiologist. There are 15 major arrhythmia labels in the MIT-BIH database. In order to compare the performance of our classification algorithm with that of other scholars, the AAMI EC57 classification standard specified by the Association for the Advancement of Medical Instrumentation is adopted for evaluation. According to the standard, the ECG signals are classified into five categories: N, S, V, F, and Q. Because analysis of the Q class is of little value, it is common to analyze ECG signals using only the other four classes.
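The grouping of the 15 beat labels into the five AAMI classes can be expressed as a lookup table. The symbol assignment below is the mapping commonly used with the MIT-BIH annotation symbols and is given here as an assumption, since the paper does not list it; the helper name `to_aami` is hypothetical.

```python
# Commonly used grouping of MIT-BIH beat annotation symbols (assumed here).
AAMI_CLASSES = {
    "N": ["N", "L", "R", "e", "j"],
    "S": ["A", "a", "J", "S"],
    "V": ["V", "E"],
    "F": ["F"],
    "Q": ["/", "f", "Q"],
}

# Invert the table: annotation symbol -> AAMI class.
SYMBOL_TO_CLASS = {
    symbol: aami for aami, symbols in AAMI_CLASSES.items() for symbol in symbols
}


def to_aami(symbols, drop=("Q",)):
    """Map beat symbols to AAMI classes, skipping non-beat marks and dropped classes."""
    classes = []
    for symbol in symbols:
        aami = SYMBOL_TO_CLASS.get(symbol)
        if aami is not None and aami not in drop:
            classes.append(aami)
    return classes


# "+" is a rhythm-change mark, not a beat, so it is skipped.
labels = to_aami(["N", "L", "A", "V", "/", "F", "+"])
```

With the default `drop=("Q",)`, the result contains only the four classes analyzed in this work.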

Performance evaluation
In order to further demonstrate the superiority of this method, the MIT-BIH arrhythmia database is also used to evaluate the wave detection and arrhythmia classification of our system, and a confusion matrix is adopted to show the accuracy for each of the four kinds of arrhythmia data. The confusion matrix is a standard tool for evaluating a classification algorithm. By tabulating the actual categories against the predicted ones, it shows in detail how the samples are classified, and it is the basis for a series of classification indicators. Figure 4 illustrates the confusion matrix for the four categories of ECG signals, namely N, S, V, and F. Based on the confusion matrix, three secondary indicators are used to evaluate the classification results: positive predictive value (PPV), sensitivity (SEN), and specificity (SPE), which are defined in Eqs. (8)-(10). Table 1 shows the PPV, SEN, and SPE results of each category based on the confusion matrix. It can be seen from Fig. 4 and Table 1 that the model classifies ECG signals very well, with the PPV, SEN, and SPE of each category approaching 1. This reflects the high accuracy of the model and its suitability for ECG signal classification. However, PPV and SEN are slightly lower for F beats. Further experiments suggest that this is likely due to the imbalance of the data.
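The three indicators follow from the per-class counts of true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) using their standard definitions: PPV = TP/(TP + FP), SEN = TP/(TP + FN), and SPE = TN/(TN + FP). The sketch below computes them in a one-vs-rest manner; the row/column convention and the toy matrix are assumptions for illustration and are not results from the paper.

```python
def per_class_metrics(cm):
    """Compute PPV, SEN, and SPE for each class of a square confusion matrix.

    Assumes cm[i][j] counts samples of true class i predicted as class j,
    and that every class has at least one true and one predicted sample.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp   # other classes predicted as class k
        tn = total - tp - fn - fp
        results.append({
            "PPV": tp / (tp + fp),
            "SEN": tp / (tp + fn),
            "SPE": tn / (tn + fp),
        })
    return results


# Toy two-class counts (made-up numbers, not taken from Fig. 4).
toy_cm = [[8, 2],
          [1, 9]]
metrics = per_class_metrics(toy_cm)
```

For a rare class such as F, a handful of misclassified beats moves PPV and SEN noticeably while SPE barely changes, because TN is dominated by the large N class.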
In the MIT-BIH database, the data imbalance is significant: the N, S, and V classes contain about 90,000, 2800, and 7000 heartbeats, respectively. This severe imbalance can degrade the performance of the classifier, especially the sensitivity and positive predictive value of the minority classes. Therefore, it is necessary to expand the ECG data, especially the classes with few samples. In this paper, the ECG data are extended based on Z-score normalization, z = (x − μ)/σ, where μ and σ are the mean and standard deviation of the ECG segment x. New data samples are then synthesized after preprocessing by changing the standard deviation and mean of the Z score calculated from the original ECG signal [18]. We keep the segments of class N unchanged (because they are the most abundant) and increase the number of segments of the remaining classes to approach that of class N. After augmentation, the data volumes of the N, S, V, and F classes are similar: the S, V, and F heartbeats are expanded to 84,000, 70,000, and 80,000, respectively, and the total number of segments increases to 324,000. The experiment is then repeated with the expanded data, and the results are recorded in Table 2.
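A minimal sketch of this kind of augmentation is given below. The jitter ranges, the use of uniform random perturbations, and the function names are assumptions for illustration; the paper does not state how much the mean and standard deviation are changed.

```python
import math
import random


def z_score(segment):
    """Return the normalized segment with its mean and (population) std.

    Assumes the segment is not constant, so the std is non-zero.
    """
    mean = sum(segment) / len(segment)
    variance = sum((x - mean) ** 2 for x in segment) / len(segment)
    std = math.sqrt(variance)
    return [(x - mean) / std for x in segment], mean, std


def augment(segment, rng=random, mean_jitter=0.05, std_jitter=0.1):
    """Synthesize a new segment by perturbing the mean and std of the Z score."""
    z, mean, std = z_score(segment)
    new_std = std * (1.0 + rng.uniform(-std_jitter, std_jitter))
    new_mean = mean + std * rng.uniform(-mean_jitter, mean_jitter)
    # Map the normalized samples back with the perturbed statistics.
    return [new_mean + new_std * value for value in z]


beat = [1.0, 2.0, 3.0, 4.0]  # toy stand-in for one heartbeat segment
synthetic = augment(beat, rng=random.Random(0))
```

The synthetic segment keeps the morphology of the original beat (its Z score is unchanged) and differs only in offset and amplitude, so repeated calls can expand a minority class without altering the waveform shape.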
Comparing Tables 1 and 2, it is easy to see that the PPV and SEN of the classes with few samples are improved, and thus the performance of the whole model is enhanced.
In addition, in order to demonstrate the necessity of the improved ALO-SVM algorithm in this study, we compared the proposed method with traditional methods, namely CNN-SVM, LWT-CNN-SVM, and ALO-SVM, and the results are shown in Figs. 5, 6 and 7.
It can be seen from Figs. 5, 6 and 7 that the proposed algorithm improves accuracy, sensitivity, and specificity compared with these algorithms. Moreover, it performs well on every indicator for N, S, V, and F, with no obvious shortcoming.

Comparison of results
In order to demonstrate the superiority of this method, it is compared with state-of-the-art methods. Table 3 presents some classification results reported in recent literature. Sangaiah et al. used an enhanced filter to remove noise, then used WT and HMM to achieve signal classification [3]. Another study proposed an efficient DL model based on the LSTM network for recognizing four ECG heartbeat classes, in which the synthetic minority oversampling technique (SMOTE) was used to improve the accuracy of the minority classes [19]. Shaker et al. utilized a generative adversarial network (GAN) to balance the data set and then used CNN for classification, which effectively improves the classification accuracy of the minority classes [20]. Nurmaini et al. used a stacked denoising autoencoder (DAE) and an autoencoder (AE) for feature learning, then used a DNN to achieve signal classification [21]. Oh et al. combined the CNN model with the LSTM architecture to identify five heart conditions; the system showed high classification performance in processing variable-length data [22]. Kora et al. directly applied the hybrid firefly and particle swarm optimization algorithm (FFPSO) to optimize the original ECG signal, then used the Levenberg-Marquardt neural network (LMNN) to classify ECG signals [23]. Table 3 and Figs. 5, 6 and 7 show that the CNN combined with the improved ALO-SVM model proposed in this paper outperforms the other models. Furthermore, the large number of data samples helps ensure that the model has high generalization ability and robustness.

Discussions on the advantage of our method
From Table 3, it can be seen that the proposed method provides better performance. The reason is that it makes up for shortcomings of the existing classification methods. More specifically, Shaker et al. [20] utilized CNN to classify signals directly, without considering the insufficient generalization performance of CNN on small samples. Oh et al. [22] used raw data without denoising, which increases the possibility of prediction errors. The data sets used in [21,23] are too small for signal classification, which seriously hinders the improvement of classification accuracy. Compared with previous studies, the merits of the proposed model are summarized as follows:
1. The improved ALO-SVM is used to process the signal features extracted by CNN, which improves the generalization ability of the model.
2. The lifting wavelet is used in the preprocessing stage, and the new threshold function and adjusted threshold estimation method make up for the shortcomings of traditional threshold methods, which makes the proposed method more accurate and efficient.
3. The Z-score method is used to expand the data, which greatly improves the accuracy on classes with few samples.

Conclusion
Extracting the features of ECG signals well and completing the classification task are the keys to improving the accuracy of abnormal ECG signal detection. In this paper, an ECG signal detection method based on a CNN and an improved ALO-SVM model has been proposed. We first adopted an improved threshold lifting wavelet to denoise the original ECG data, so as to avoid the interference of noise with feature extraction and classification. Then a convolutional neural network with powerful feature extraction capability was used to extract ECG features. Finally, the SVM model optimized by the improved ALO algorithm was used to classify the extracted ECG features, so that the specific ECG signal type can be accurately identified. The proposed method was tested on the MIT-BIH data set, and its effectiveness was verified in actual classification. The experimental study shows that the proposed method can effectively distinguish different types of ECG signals, with an accuracy close to 1. Finally, the advantage of our method was demonstrated via a comparison study, which showed that it classifies ECG signals more accurately than the compared methods. In future work, attention-based techniques [24,25] will be introduced to further improve the accuracy of the algorithm.