As illustrated in Fig. 1, the ECG data to be classified by the proposed algorithm are first read and pre-processed, a dataset is constructed from the processed data, and the classification model is then trained and used for classification. During the pre-processing stage, adaptive filtering is employed for noise reduction. After pre-processing, the resulting dataset is divided into a training set and a test set. The network model is then trained from scratch under the Caffe framework using randomly initialised weights. Finally, arrhythmias are classified using the trained model.
Pre-processing
In this study, the MIT-BIH arrhythmia database provided by the Massachusetts Institute of Technology was used to ensure data authenticity and reliability. This database was compiled from over 4,000 long-term recordings of patients and contains common ECG signals as well as rare abnormal ECG signals, and each recording was annotated and verified by expert cardiologists. Five categories of ECG signals were selected for classification, namely, the normal heartbeat, left bundle branch block beat (LBBB), right bundle branch block beat (RBBB), premature atrial contraction (PAC) and premature ventricular contraction (PVC), which account for 66.63%, 7.17%, 6.44%, 2.26% and 6.33% of the MIT-BIH arrhythmia database, respectively, while other abnormal signals each account for less than 1% [11-12].
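To make the data selection concrete, the following minimal sketch shows how beats of the five chosen classes could be pulled from the database, assuming the open-source wfdb Python package; the record name and lead choice are illustrative only.

```python
import wfdb  # open-source reader for PhysioNet/MIT-BIH records

# MIT-BIH annotation symbols for the five classes used in this study:
# N = normal, L = LBBB, R = RBBB, A = PAC, V = PVC
TARGET_SYMBOLS = {'N', 'L', 'R', 'A', 'V'}

def load_labelled_beats(record_name):
    """Return (signal, [(sample_index, symbol), ...]) for one record."""
    record = wfdb.rdrecord(record_name, pn_dir='mitdb')    # e.g. record '100'
    ann = wfdb.rdann(record_name, 'atr', pn_dir='mitdb')   # expert annotations
    signal = record.p_signal[:, 0]                         # first lead
    return signal, [(s, sym) for s, sym in zip(ann.sample, ann.symbol)
                    if sym in TARGET_SYMBOLS]

signal, beats = load_labelled_beats('100')
```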
Different acquisition equipment and methods can introduce interference into the ECG signal during recording; consequently, the acquired signal may contain noise such as power line interference and baseline wander [13]. To avoid large errors in ECG signal recognition, pre-processing was performed before signal detection to reduce the impact of this interference on the original ECG signal.
In this study, the ECG signal pre-processing stage consisted of the following steps. First, a band-pass filter with a passband of 0.05–100 Hz was applied to the original signal. Then, the dual-slope method was used to locate the R peak. Next, a low-pass filter with a cut-off frequency of 5 Hz was used to remove the interference signal. Finally, to prevent the misrecognition of noise as a QRS complex, moving window integration and an adaptive self-defined threshold were used.
Normally, the frequency content of an ECG signal lies in the range 0.05–100 Hz. To remove unnecessary information from the ECG signal, a 40th-order finite impulse response (FIR) band-pass filter with a passband of 0.05–100 Hz was designed in this study to filter the original signal. Fig. 2(a) presents the original ECG signal while Fig. 2(b) presents the ECG signal after band-pass filtering.
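As an illustration, the band-pass stage could be implemented with SciPy as below; the 360 Hz sampling rate is that of the MIT-BIH database, and the filter design call is a sketch rather than the authors' exact implementation.

```python
import numpy as np
from scipy.signal import firwin, lfilter

FS = 360.0  # MIT-BIH arrhythmia database sampling rate (Hz)

# 40th-order (41-tap) FIR band-pass with a 0.05-100 Hz passband,
# matching the filter described in the text.
bp_taps = firwin(numtaps=41, cutoff=[0.05, 100.0], pass_zero=False, fs=FS)

def bandpass(ecg):
    """Apply the FIR band-pass filter to a 1-D ECG array."""
    return lfilter(bp_taps, 1.0, np.asarray(ecg, dtype=float))
```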
After filtering the ECG signal, dual-slope pre-processing was employed to detect the peak of the QRS complex, thereby obtaining the effective waveform of the ECG signal. The fundamental principle of dual-slope pre-processing rests on a characteristic of the QRS complex, namely that the signal is steep on both sides of the R peak. Exploiting this characteristic, the method locates the point exhibiting the largest slope contrast between the intervals on its two sides. First, the maximum and minimum average slopes are calculated over intervals on both the left and right sides of each point. Then, a slope difference is obtained by subtracting the minimum average slope on one side from the maximum average slope on the other side. Finally, the two slope differences are compared and the larger one is retained. Through this dual-slope calculation, as sketched below, the peak of the QRS complex can be located. Fig. 3(a) presents the signal after band-pass filtering while Fig. 3(b) presents the signal after dual-slope pre-processing.
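The following sketch implements this calculation; the interval of 8-24 samples over which the average slopes are searched is an assumed, tunable choice, as the text does not specify it.

```python
import numpy as np

def dual_slope(ecg, d_min=8, d_max=24):
    """Dual-slope transform: for each sample, the largest difference between
    the average slopes measured on its two sides (d_min/d_max assumed)."""
    ecg = np.asarray(ecg, dtype=float)
    out = np.zeros(len(ecg))
    dists = np.arange(d_min, d_max + 1)
    for i in range(d_max, len(ecg) - d_max):
        left = (ecg[i] - ecg[i - dists]) / dists    # average slopes, left side
        right = (ecg[i] - ecg[i + dists]) / dists   # average slopes, right side
        # max average slope on one side minus min on the other; keep the larger
        out[i] = max(left.max() - right.min(), right.max() - left.min())
    return out
```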
After the dual-slope calculation, a low-pass filter with a cut-off frequency of 5 Hz was used to remove the interference signal. Figs. 2 and 3 indicate that the amplitude of the ECG signal gradually decreased after band-pass filtering and dual-slope pre-processing; however, an excessively small amplitude is not conducive to the subsequent detection and segmentation of the ECG signal.
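The low-pass stage is a straightforward FIR design; the 41-tap order below is an assumption, while the 5 Hz cut-off follows the text.

```python
from scipy.signal import firwin, lfilter

lp_taps = firwin(numtaps=41, cutoff=5.0, fs=360.0)  # 5 Hz low-pass FIR

def lowpass(sig):
    """Remove residual interference after the dual-slope transform."""
    return lfilter(lp_taps, 1.0, sig)
```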
To highlight the characteristic points of the QRS complex, moving window integration was used in this study. A window of fixed width is moved from the initial point of the signal, the signal within the window is integrated at each position, and the integrated value is used to represent the amplitude of the signal. The moving window integration algorithm magnifies the effective information in the signal, thereby increasing its absolute amplitude. In addition, moving window integration smooths the peak of the waveform and makes its slope less sharp. In the experimental tests in this study, a moving window width of 17 sampling points achieved the most desirable integration effect. Fig. 4(a) presents the low-pass filtered signal while Fig. 4(b) presents the signal with a magnified amplitude after moving window integration.
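A minimal implementation of this step is a windowed sum, sketched below with the 17-sample width reported in the text.

```python
import numpy as np

def moving_window_integration(sig, width=17):
    """Replace each sample by the sum of the signal inside a moving window,
    magnifying the effective information and raising the absolute amplitude."""
    return np.convolve(sig, np.ones(width), mode='same')
```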
The signal after moving window integration displayed prominent characteristics; hence, the location of the QRS complex could be determined according to a QRS threshold, and the ECG signal could be further segmented. In this study, a dual-threshold method was designed to locate the QRS complex in the integrated signal. When the peak amplitude of the integrated signal exceeded the lower threshold, it was compared with the higher threshold to determine whether a QRS complex had been detected [14]. Furthermore, to ensure that the thresholds could flexibly adapt to ECG signals of different forms, the higher and lower thresholds were designed to change independently according to variations in the previously detected peak amplitudes. This combination of dual and adaptive thresholds mainly aimed to reduce missed and false detections of the QRS complex. In addition, if two peaks were too close to each other, only the peak with the larger amplitude was retained as an R peak, thereby preventing the misrecognition of some types of noise.
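Since the paper does not give its exact adaptation formulas, the sketch below uses Pan-Tompkins-style running averages as an assumed stand-in for the adaptive dual threshold, including the rule that keeps the larger of two peaks that fall too close together.

```python
def detect_qrs(integrated, fs=360.0):
    """Adaptive dual-threshold QRS detection on the integrated signal.
    The 0.2 s refractory period and the update coefficients are assumptions."""
    refractory = int(0.2 * fs)      # minimum spacing between two QRS peaks
    spk = npk = 0.0                 # running signal-peak / noise-peak levels
    qrs = []
    for i in range(1, len(integrated) - 1):
        x = integrated[i]
        if x > integrated[i - 1] and x >= integrated[i + 1]:   # local peak
            high = npk + 0.25 * (spk - npk)   # higher threshold (adaptive)
            low = 0.5 * high                  # lower threshold (adaptive)
            if x > low and x > high:          # passes both thresholds -> QRS
                if qrs and i - qrs[-1] < refractory:
                    if x > integrated[qrs[-1]]:
                        qrs[-1] = i           # keep the larger of close peaks
                else:
                    qrs.append(i)
                spk = 0.125 * x + 0.875 * spk
            else:
                npk = 0.125 * x + 0.875 * npk
    return qrs
```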
After detection, the locations of the R peaks were known. Based on these locations, the input ECG signal was segmented into a series of single ECG beats, which were fed to the subsequent 1D CNN for the classification of ECG signals.
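Segmentation then reduces to cutting a fixed window around each detected R peak; the window of 100 samples before and 200 samples after the peak is an assumed choice, since the paper does not state the beat length.

```python
import numpy as np

def segment_beats(ecg, r_peaks, pre=100, post=200):
    """Cut one fixed-length single beat around each R peak (pre/post assumed)."""
    beats = [ecg[r - pre:r + post] for r in r_peaks
             if r - pre >= 0 and r + post <= len(ecg)]
    return np.stack(beats) if beats else np.empty((0, pre + post))
```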
1D CNN
A neural network is an abstract mathematical model, inspired by the human brain, that has been developed on the basis of modern neuroscience. Neural network models have wide applications and can be used for classification and prediction in problems whose rules cannot be explicitly described [15].
A CNN is a widely used neural network model that extracts local features of data to establish local connections. Each convolutional layer contains multiple filters, each of which extracts its own feature parameters. The shared weights in the convolutional layers and the pooling operation in the pooling layers reduce both the difficulty of network training and the dimensionality of the data, thus avoiding excessive computational complexity during parameter extraction. In an image, local details are connected to the global area, and combinations of low-level features form high-level feature representations. This principle also applies to ECG signal processing [16-17]. Therefore, CNNs have advantages in ECG signal processing. Because an ECG signal consists of 1D data, a 1D CNN is adopted in this study for classification.
The most commonly used CNN in image processing is the two-dimensional (2D) CNN; however, a 1D CNN is more suitable for processing time-series data derived from sensors, such as an ECG signal. A 1D CNN shares the characteristics and processing methods of a 2D CNN, but its kernel width is fixed, while its length can be set to different values according to the required processing [18]. As illustrated in Fig. 5, a 1D convolution kernel slides in a single direction, from left to right, without repeated passes, whereas a 2D kernel must return to the start of each row and slide again. As a result, during feature extraction from the ECG signal, a 1D CNN reduces redundant computation more effectively than a 2D CNN, thus greatly increasing the computation speed.
The core of a 1D CNN lies in the 1D convolutional layer. Suppose that there is an input sequence $x_i$ ($i = 1, 2, \ldots, n$) and the weights are set to $w_j$ ($j = 1, 2, \ldots, m$). The kernel (also known as a filter) in the current layer performs a convolution operation on the input signal of the previous layer. Then, the output of the current convolutional layer is as follows [19]:

$$y_i = \sum_{j=1}^{m} w_j\, x_{i+j-1}, \qquad i = 1, 2, \ldots, n - m + 1.$$
In a CNN, each neuron in the current layer forms a connection network only with neurons in the local window of the previous layer. Usually, an activation function is required for non-linear feature mapping before the output of the convolutional layer [20].
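The following toy example illustrates the convolution above followed by ReLU activation; the input and kernel values are arbitrary.

```python
import numpy as np

def conv1d(x, w):
    """Valid 1-D convolution: y_i = sum_j w_j * x_{i+j-1}."""
    n, m = len(x), len(w)
    return np.array([np.dot(w, x[i:i + m]) for i in range(n - m + 1)])

x = np.array([0.1, 0.5, 1.8, 0.4, -0.2, 0.0])   # toy input   (n = 6)
w = np.array([0.25, 0.5, 0.25])                  # toy kernel  (m = 3)
y = np.maximum(conv1d(x, w), 0.0)                # ReLU non-linear mapping
print(y)                                         # n - m + 1 = 4 outputs
```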
After feature extraction by the convolutional layer, feature selection and information filtering are performed by the pooling layer. Generally, there are two types of pooling operations, which output either the maximum value or the average value of each cluster of neurons. The task performed by the pooling layer is in fact a subsampling process that reduces the high computational complexity generated by the convolutional layer while preserving the integrity of the extracted features and preventing overfitting of the neural network output [21].
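As an illustration, non-overlapping average pooling (the variant used in the proposed network) can be written as follows.

```python
import numpy as np

def avg_pool(x, width):
    """Non-overlapping average pooling: each output is the mean of `width`
    consecutive inputs; leftover samples at the end are discarded."""
    n = (len(x) // width) * width
    return x[:n].reshape(-1, width).mean(axis=1)

print(avg_pool(np.arange(10.0), 3))   # -> [1. 4. 7.]
```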
Based on the feature data and the expected target, a five-layer 1D CNN was designed in this study and trained on the pre-processed data. This CNN has the capacity to learn useful features through the training process. As illustrated in Fig. 6, the proposed 1D CNN consisted of two convolutional layers, two pooling layers and a fully connected layer. Considering the size of the input ECG beat, the filter length in the first convolutional layer was set to 31, the number of filters was set to 4, and the rectified linear unit (ReLU) function was used as the activation function [22]. The window size in the first pooling layer was set to 5, and the average pooling method was used. The filter size in the second convolutional layer was set to 6, the number of filters was set to 8, and the ReLU function was again used as the activation function. The window size in the second pooling layer was set to 3, also with average pooling. Finally, the output obtained through the convolutional and pooling layers was sent to a fully connected layer for the final output.
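For concreteness, the sketch below expresses this architecture in PyTorch (the paper itself trained under Caffe); the input beat length of 300 samples is an assumption used only to size the fully connected layer.

```python
import torch
import torch.nn as nn

class ECGNet(nn.Module):
    """Five-layer 1D CNN as described in the text (PyTorch sketch)."""
    def __init__(self, n_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 4, kernel_size=31),   # conv1: 4 filters, length 31
            nn.ReLU(),
            nn.AvgPool1d(kernel_size=5),       # pool1: average, window 5
            nn.Conv1d(4, 8, kernel_size=6),    # conv2: 8 filters, length 6
            nn.ReLU(),
            nn.AvgPool1d(kernel_size=3),       # pool2: average, window 3
        )
        # assumed input length 300: 300 -> 270 -> 54 -> 49 -> 16 samples
        self.classifier = nn.Linear(8 * 16, n_classes)

    def forward(self, x):                      # x: (batch, 1, 300)
        return self.classifier(self.features(x).flatten(1))

model = ECGNet()
scores = model(torch.randn(2, 1, 300))         # -> (2, 5) class scores
```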
Batch normalisation (BN) was also used in the proposed network. BN is a training optimisation method proposed by Google [23-24]. Normalisation refers to data standardisation, while a batch refers to a group of data; therefore, BN refers to the standardisation of a group of data. Applying BN to the input data and to the outputs of intermediate network layers reduces the shifts produced by internal neurons and the differences between samples. Most of the data can therefore be kept in the unsaturated region of the activation function, ensuring effective gradient back-propagation and preventing the vanishing and exploding gradient problems [25].
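For reference, the standard BN transform normalises each activation over a mini-batch $B$ and then applies a learned scale and shift:

$$\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}, \qquad y_i = \gamma \hat{x}_i + \beta,$$

where $\mu_B$ and $\sigma_B^2$ are the mini-batch mean and variance, $\epsilon$ is a small constant for numerical stability, and $\gamma$ and $\beta$ are learned parameters.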