Automatic Arrhythmia Detection Using One-Dimensional Convolutional Neural Network

Background: Cardiovascular diseases (CVDs) are common diseases that pose significant threats to human health. Statistics have demonstrated that a large number of individuals die unexpectedly from sudden CVDs. Therefore, real-time monitoring and diagnosis of abnormal changes in cardiac activity are critical, as they can help the elderly and patients handle emergencies in a timely manner. To this end, a round-the-clock electrocardiogram (ECG) monitoring system can be developed with the quick detection of an ECG signal, segmentation of the detected ECG signal, and rapid diagnosis of a single segmented ECG beat. In this paper, to achieve the automatic detection and diagnosis of an ECG signal, five common types of ECG signals are used for recognition. For pre-processing the original ECG signal, the dual-slope detection algorithm is proposed and developed. Then, with the pre-processed ECG data, a five-layer one-dimensional convolutional neural network is constructed to classify five categories of heartbeats, namely, a normal heartbeat and four types of abnormal heartbeats. Results: To enable comparison of the experimental results, the experimental data used in this study are obtained from the open-source MIT-BIH arrhythmia database. This database is authoritative, as each ECG signal cycle is annotated by at least two cardiologists, and abnormal ECG signals are classified into different categories. By comparing the detection and recognition results in this study with the annotations in the MIT-BIH arrhythmia database, an overall accuracy of 96.20% is achieved in the classification of normal ECG signals and four categories of abnormal ECG signals. Conclusions: This paper provides an accurate method with low computational complexity for 24-hour dynamic monitoring and automated diagnosis of heartbeat conditions. With wearable devices, this method can be used at home for the initial screening of CVDs.
In addition, it can perform diagnosis and warning for postoperative patients or patients with chronic CVDs.


Background
Arrhythmia is a common clinical disease that occurs not only in people with cardiovascular diseases (CVDs), but also in people with other diseases and in a small number of healthy people. Correctly diagnosing arrhythmia and recognising complex arrhythmia have been significant goals in the history of modern electrocardiology [1]. Arrhythmia monitoring is an important type of long-term electrocardiogram (ECG) monitoring that is mainly used to detect arrhythmias, diagnose CVD in the early stages, evaluate anti-arrhythmic treatment, understand the relationship between premature ventricular contraction (PVC) and sudden death, perform electrophysiological studies of arrhythmia, and evaluate drug treatment effects [2][3]. Therefore, arrhythmia monitoring is critical for clinical analysis. Heart conditions can be monitored using 24-hour real-time dynamic monitoring devices and analysed by the obtained ECG signal. Based on the obtained ECG signal, the patient's ECG feature values can be determined to detect and analyse arrhythmia [4].
In the past, detection and analysis of ECG signals were generally performed by human experts, who carried out diagnostic analysis through visual observation using their own clinical experience. Although manual diagnosis has high accuracy for short-term ECG diagnosis, it fails in long-term dynamic ECG detection [5]. Hence, engineering methods have potential for achieving high-precision and high-speed automatic analysis of ECG signals. Currently, there are two main methods for ECG classification: one based on the feature vector space and another based on the ECG feature image [6]. The most commonly used methods include the logical discriminant tree, neural network, linear discriminant analysis (LDA), least square support vector machine, hidden Markov tree based model, particle swarm optimisation, rough rule set based model, and a combination of multiple methods [7].
Shen et al. [8] employed adaptive feature extraction combined with the support vector machine to perform diagnosis. This method was computationally efficient but had low accuracy; namely, a recognition accuracy of only 58.61% was achieved for PVC. Hosseini et al. [9] constructed intelligent ECG diagnosis software using a neural network. By comparing the performance of various neural network structures, they selected and combined the two types of neural network structures that demonstrated the highest performance in the recognition of six types of abnormal ECG cases. Subsequently, they proposed a multi-level neural network algorithm that achieved higher accuracy; however, its computational complexity was very high. Yeh et al. [10] utilised LDA to diagnose arrhythmia. Their method could accurately recognise common abnormal ECG signals; however, it was unable to recognise premature atrial contraction (PAC) with high accuracy.
ECG is a graphical representation that displays changes in the electrical activity of the heart with time. A convolutional neural network (CNN) can effectively extract image features for recognition; however, it is too computationally complex to process ECG data. In general, a one-dimensional (1D) CNN can better process and analyse time-series data derived from sensors, considerably increasing the computation speed. In this paper, a deep learning algorithm based on a 1D CNN is proposed to classify five categories of labelled ECG data, and a comparison experiment is conducted to evaluate its accuracy.

Results
This study used the MIT-BIH arrhythmia database provided by the Massachusetts Institute of Technology.
Five categories of ECG signals were selected from this database for classification, namely, the normal sinus rhythm, LBBB, RBBB, PAC and PVC. After pre-processing, the five categories of signals were divided into a training set and a test set. The sample sizes of the selected training set and test set are presented in Table 1. After the divided training data were read, the ECG beats were first pre-processed to achieve data standardisation, and then the pre-processed data were sent to the network for training. To avoid vast fluctuations in the training process, the initial learning rate of the neural network was set to 0.001 in this study, and the objective error of the loss function was set to 1.0 × 10^-15. In addition, to reduce the likelihood of falling into a local minimum, the cross-validation method was adopted. After multiple cross-validation runs, an optimised neural network was obtained.
To facilitate convergence during training, the neural network designed in this study was trained using the stochastic gradient descent method. The batch size was set to 16, and the number of iterations was set to 15,000. As illustrated in Fig. 7, the loss function of the CNN eventually converged to approximately 0.94.
By using cross-validation, underfitting and overfitting of the network could be observed, and the parameters could thereby be adjusted. The obtained network parameters were then retained and used for the subsequent testing. The final classification results derived from the testing were used as the key indicator for measuring the performance of the neural network classifier.

Discussion
Automatic arrhythmia detection and diagnosis is a very challenging task that requires integrating knowledge from many subjects, including medicine and engineering. In this paper, ECG data from the MIT-BIH arrhythmia database were used for detection, and methods of filtering, detection and automatic classification were studied. A five-layer 1D CNN diagnosis algorithm was provided, which improved on current methods for the automatic detection and analysis of arrhythmia.
There were some limitations in this study. The data used in this paper are from the MIT-BIH database, and no real-time clinical ECG data were included at present; further studies on real-time clinical ECG data are therefore needed.
In addition, the classification of more types of arrhythmia beyond PVC, RBBB, LBBB and PAC will be studied next. Regarding the work in this paper, the following problems need to be taken into further consideration: (1) Some training samples are insufficient. The training sample sizes of some types of arrhythmia in the MIT-BIH arrhythmia database are not adequate, which could influence the classification accuracy of the 1D CNN algorithm. To alleviate this problem and reduce the risk of falling into a local minimum, the cross-validation method was adopted to obtain the optimal neural network.
(2) The arrhythmia detection method needs to be faster and more effective. Technologies for detecting and locating low-amplitude, low-frequency features of the waveform, such as the P wave and U wave, remain immature and need improvement. In this paper, the dual-slope method was used to detect the QRS complex for quick positioning; it could be further improved on this basis.

Conclusions
Motivated by the widespread problem of CVDs worldwide and the demand for screening and diagnosis, this study examined the automated diagnosis of arrhythmias. ECG data recorded in a real clinical setting from the MIT-BIH arrhythmia database were used for detection and classification. For automatically analysing signals of heartbeats, this study proposed a five-layer 1D CNN method for classifying the ECG signal.
For the pre-processing of the original ECG signal, this study used the dual-slope method and moving window integration to detect the R wave of the ECG signal, followed by a decision-making process using the dual-threshold method. Then, the detected data were segmented into different categories of ECG beats as the input for subsequent diagnosis. To diagnose abnormal heartbeats, this study proposed and designed a five-layer 1D CNN for analysis and classification, employing a convolutional layer, pooling layer, BN, ReLU layer and other modules to test the classification performance on the test dataset. The results revealed an average recognition accuracy of 96.20% for the five categories of heartbeats (normal sinus beats and four categories of abnormal heartbeats).
This paper thus provides an accurate method with low computational complexity for 24-hour dynamic monitoring and automated diagnosis of heartbeat conditions. In conjunction with wearable devices, this method can be used at home for the initial screening of CVDs. In addition, it can perform comprehensive diagnosis and warning for postoperative patients or patients with chronic CVDs. Through further detection of the feature waveform, the proposed method improved the recognition accuracy and reduced false and missing detection. Future work will focus on examining additional arrhythmia categories to be able to recognise a larger variety of abnormal heartbeats.

Methods
As illustrated in Fig. 1, the ECG data to be classified by the proposed algorithm are read and pre-processed. Then, a dataset is constructed with the processed data. Next, the classification model is trained and used for classification. During the pre-processing stage, adaptive filtering is employed for noise reduction. After pre-processing, the obtained dataset is divided into a training set and a test set. The network model is then trained under the Caffe framework from the beginning using random initial weights. Finally, arrhythmias are classified using the trained model.

Pre-processing
In this study, the MIT-BIH arrhythmia database provided by the Massachusetts Institute of Technology was used to ensure data authenticity and reliability. This database contains 48 half-hour ECG recordings selected from a collection of over 4,000 long-term recordings from patients, covering common ECG signals as well as rare abnormal ECG signals, and each recording was annotated and verified by expert cardiologists. In this study, five categories of ECG signals were selected for classification, namely, normal heartbeat, left bundle branch block beat (LBBB), right bundle branch block beat (RBBB), PAC and PVC, which accounted for 66.63%, 7.17%, 6.44%, 2.26% and 6.33% of the MIT-BIH arrhythmia database, respectively, while other abnormal signals accounted for less than 1% [11][12].
Due to the different data acquisition equipment and methods used in the ECG signal acquisition process, interference may have been introduced into the acquired ECG signal, resulting in additional noise. Thus, the acquired ECG signal may have contained noise such as power line interference and baseline wander [13]. To avoid large errors in ECG signal recognition, pre-processing was performed before signal detection to reduce the impact of interference on the original ECG signal.
In this study, the ECG signal pre-processing stage consisted of the following steps. First, a band-pass filter with a bandwidth in the range 0.05-100 Hz was adopted to filter the original signal. Then, the dual-slope method was used to locate the R peak. Next, a low-pass filter with a cut-off frequency of 5 Hz was used to remove the interference signal. Finally, to prevent the misrecognition of noise as a QRS complex, moving window integration and a self-defined threshold were used.
Normally, the frequency of an ECG signal is in the range 0.05-100 Hz. To remove unnecessary information from the ECG signal, a 40th-order finite impulse response (FIR) band-pass filter with a pass band in the range 0.05-100 Hz was designed in this study to filter the original signal. Fig. 2(a) presents the original ECG signal, while Fig. 2(b) presents the ECG signal after band-pass filtering.
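The band-pass stage can be sketched with SciPy's FIR design utilities. The 360 Hz sampling rate is that of the MIT-BIH database; the one-second test signal below is purely illustrative and not part of the study:

```python
import numpy as np
from scipy.signal import firwin, lfilter

fs = 360.0  # MIT-BIH sampling rate (Hz)

# 40th-order (41-tap) FIR band-pass filter, 0.05-100 Hz, as described above.
taps = firwin(41, [0.05, 100.0], pass_zero=False, fs=fs)

# Apply the filter to a synthetic one-second test signal containing an
# in-band 10 Hz component and an out-of-band 150 Hz component.
t = np.arange(0, 1, 1 / fs)
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 150 * t)
filtered = lfilter(taps, 1.0, raw)
```

Any FIR design method (window, least-squares, Parks-McClellan) could be substituted here; the paper does not state which was used.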
After filtering the ECG signal, dual-slope pre-processing was employed to detect the peak of the QRS complex, thereby obtaining the effective waveform of the ECG signal. The fundamental principle of dual-slope pre-processing is based on the characteristics of the QRS complex, namely, that it is steep on both sides of the R peak. Using these characteristics, the point with the largest slope difference between the intervals on its two sides is determined. In this method, first, the maximum and minimum average slopes in the intervals on both the left and right sides of a point are calculated. Then, a slope difference is obtained by subtracting the minimum average slope of one side from the maximum average slope of the other side. Finally, the two average slope differences are compared, and the maximum slope difference is used. Through the dual-slope calculation, the peak of the QRS complex can be located. Fig. 3(a) presents the signal after band-pass filtering, while Fig. 3(b) presents the signal after dual-slope pre-processing.
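The dual-slope calculation described above can be sketched as follows. The interval bounds `a` and `b` (in samples) are illustrative placeholders, since the paper does not state its exact interval settings:

```python
import numpy as np

def dual_slope(sig, a=2, b=5):
    """Dual-slope feature: for each sample, the largest difference between
    the steepest average slope on one side and the shallowest average slope
    on the other. `a` and `b` bound the interval widths and are assumed
    values, not the paper's settings."""
    n = len(sig)
    out = np.zeros(n)
    for i in range(b, n - b):
        # Average slopes over intervals of width a..b on each side.
        left = [(sig[i] - sig[i - k]) / k for k in range(a, b + 1)]
        right = [(sig[i + k] - sig[i]) / k for k in range(a, b + 1)]
        # A peak rises on the left (large positive left slope) and falls
        # on the right (large negative right slope), so this difference
        # is maximised at the R peak; the second difference handles
        # inverted (downward) peaks.
        out[i] = max(max(left) - min(right), max(right) - min(left))
    return out
```

Applied to a filtered beat, the output takes its maximum at the sample where the waveform is steep on both sides, i.e. the R peak.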
After the dual-slope calculation, a low-pass filter with a cut-off frequency of 5 Hz was used to remove the interference signal. Figs. 2 and 3 indicate that the amplitude of the ECG signal gradually decreased after the band-pass filtering and dual-slope pre-processing. However, an amplitude that is too small is not conducive to the subsequent detection and segmentation of the ECG signal.
To highlight the characteristic points of the QRS complex, moving window integration was used in this study. A window with a certain width is selected and moved from the initial point of the signal; the signal within the window is integrated during the moving process, and the integrated value is used to represent the amplitude of the signal. The moving window integration algorithm can magnify the effective information in the signal, thereby increasing its absolute amplitude. In addition, moving window integration allows the peak of the waveform to become smooth and the slope to become less sharp. In the experimental tests in this study, the width of the moving window was set to 17 sampling points, which achieved the most desirable window integration effect. Fig. 4(a) presents the low-pass filtered signal, while Fig. 4(b) presents the signal with a magnified amplitude after moving window integration.
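Moving window integration amounts to convolving the signal with a rectangular window; a minimal sketch using the 17-point width stated above:

```python
import numpy as np

def moving_window_integration(sig, width=17):
    """Sum the signal over a sliding window of `width` samples (17 points,
    the experimentally chosen width in the text), centred on each sample."""
    window = np.ones(width)
    return np.convolve(sig, window, mode="same")
```

Because each output sample is a sum over 17 inputs, the absolute amplitude grows and isolated sharp peaks are spread into smoother, wider bumps, exactly the effect described above.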
The signal after moving window integration displayed prominent characteristics; hence, the location of the QRS complex could be determined according to the QRS threshold, and the ECG signal could be further segmented. In this study, the dual-threshold method was designed to locate the QRS complex in the integrated signal. When the peak amplitude of the integrated signal exceeded the lower threshold, it was compared with the higher threshold to determine whether a QRS complex was detected [14]. Furthermore, to ensure that the thresholds could be flexibly adapted to ECG signals with different forms, the higher and lower thresholds in this study were designed to independently change according to the variations in the previously detected peak amplitudes. The design of the dual threshold and adaptive threshold mainly aimed to reduce the number of missing and false detections of the QRS complex. In addition, if two peaks were too close to each other, the peak with the larger amplitude was retained as an R peak according to the dual threshold, thereby preventing the misrecognition of some types of noise.
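A minimal sketch of this adaptive dual-threshold decision follows. The threshold fractions, the exponential adaptation rule, and the 200 ms refractory period are illustrative assumptions; the paper does not publish its exact values:

```python
import numpy as np

def detect_qrs(integrated, fs=360, low_frac=0.3, high_frac=0.5):
    """Adaptive dual-threshold peak picking on the integrated signal.
    `low_frac`/`high_frac` set the lower and higher thresholds as fractions
    of a running peak-amplitude estimate (assumed values)."""
    refractory = int(0.2 * fs)          # ignore peaks closer than 200 ms
    level = float(integrated.max())     # running peak-amplitude estimate
    peaks = []
    for i in range(1, len(integrated) - 1):
        s = integrated[i]
        # Candidate: a local maximum above the lower threshold.
        if s > integrated[i - 1] and s >= integrated[i + 1] and s > low_frac * level:
            # Confirm against the higher threshold.
            if s > high_frac * level:
                if peaks and i - peaks[-1] < refractory:
                    # Two peaks too close: keep the larger amplitude.
                    if s > integrated[peaks[-1]]:
                        peaks[-1] = i
                else:
                    peaks.append(i)
                # Adapt the thresholds to recently detected amplitudes.
                level = 0.9 * level + 0.1 * s
    return peaks
```

Small bumps below the lower threshold are rejected outright, while the adaptation step lets both thresholds track slow changes in signal amplitude.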
After detection, the peak of the ECG signal was determined. Based on the location of the peak, the input ECG signal was segmented into a series of single ECG beats. The single ECG beats derived from segmentation were fed to the subsequent 1D CNN for the classification of ECG signals.
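Segmentation around the detected R peaks can be sketched as below; the window lengths (99 samples before and 201 after the R peak, giving 300-sample beats) are assumptions for illustration, since the paper does not state its exact segment length:

```python
import numpy as np

def segment_beats(sig, r_peaks, before=99, after=201):
    """Cut one fixed-length window around each R peak. Beats whose window
    would run past either end of the recording are dropped. The window
    sizes are illustrative, not the paper's settings."""
    beats = []
    for r in r_peaks:
        if r - before >= 0 and r + after <= len(sig):
            beats.append(sig[r - before : r + after])
    return np.array(beats)
```

The resulting array of equal-length beats is exactly the shape of input a fixed-size 1D CNN expects.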

1D CNN
A neural network is an abstract mathematical model inspired by the human brain that has been developed by modern neuroscience. Neural network models have wide applications and can be used for the classification and prediction of problems with indescribable rules [15].
A CNN is a widely used neural network model that is able to extract local features of data to establish local connections. There are multiple filters in each convolutional layer that can extract multiple feature parameters. The shared weights in the convolutional layer and the pooling operation in the pooling layer can reduce the difficulty of network training and reduce the data dimensions, thus avoiding excessive computational complexity during parameter extraction. For an image, there is a connection between local details and the global area, and the combination of low-level features forms a high-level feature representation. This principle also applies to ECG signal processing [16][17]. Therefore, CNNs have advantages in ECG signal processing. Because an ECG signal consists of 1D data, a 1D CNN is adopted in this study for classification.
The most commonly used CNN in image processing is a two-dimensional (2D) CNN; however, a 1D CNN is more suitable for processing time-series data derived from sensors, such as an ECG signal. A 1D CNN has the same characteristics and processing methods as a 2D CNN. The width of the 1D CNN is fixed, whereas the length can be set to different values according to the required processing [18]. As illustrated in Fig. 5, a 1D CNN slides from left to right without repeating sliding, whereas a 2D CNN must return to the start position and repeat sliding. As a result, during the feature extraction process of the ECG signal, a 1D CNN can reduce redundant computation more effectively than a 2D CNN, thus greatly increasing the computation speed.
The core of a 1D CNN lies in the 1D convolutional layer. Suppose that there is an input sequence x_i (i = 1, 2, ..., n) and the weight is set to w_j (j = 1, 2, ..., m). The kernel (also known as a filter) in the current layer performs a convolution operation on the input signal of the previous layer. The output of the current convolutional layer is then y_i = f(Σ_{j=1}^{m} w_j x_{i+j-1} + b), where b is the bias term and f(·) is the activation function [19]. In a CNN, each neuron in the current layer forms a connection network only with neurons in the local window of the previous layer. Usually, an activation function is required for non-linear feature mapping before the output of the convolutional layer [20].
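The 1D convolution just described can be written directly in NumPy; this is a generic sketch of the standard operation, not the study's Caffe implementation:

```python
import numpy as np

def conv1d(x, w, b=0.0, activation=np.tanh):
    """Valid-mode 1D convolution followed by a non-linear activation:
    y_i = f(sum_{j=1..m} w_j * x_{i+j-1} + b)."""
    n, m = len(x), len(w)
    return activation(
        np.array([np.dot(w, x[i : i + m]) for i in range(n - m + 1)]) + b
    )
```

Each output y_i depends only on the m inputs in its local window, which is the local connectivity property noted above, and the same weights w are reused at every position (weight sharing).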
After feature extraction by the convolutional layer, feature selection and information filtering are performed by the pooling layer. Generally, there are two types of pooling operations, which output either the maximum value or the average value of clustered neurons. The task performed by the pooling layer is actually a subsampling process that reduces the high computational complexity generated by the convolutional layer while ensuring the integrity of feature extraction and preventing overfitting of the neural network output [21].
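Average pooling, the variant used in this study's network, can be sketched as non-overlapping window means:

```python
import numpy as np

def avg_pool1d(x, window):
    """Non-overlapping average pooling: each output is the mean of
    `window` consecutive inputs; trailing samples that do not fill a
    complete window are dropped."""
    n = (len(x) // window) * window
    return x[:n].reshape(-1, window).mean(axis=1)
```

The output is `window` times shorter than the input, which is the subsampling that keeps the computational cost of later layers down.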
Based on the feature data and expected target, a five-layer 1D CNN was designed in this study to train the pre-processed data. This CNN has the capacity to learn useful features through a training process. As illustrated in Fig. 6, the proposed 1D CNN consisted of two convolutional layers, two pooling layers and a fully connected layer. Considering the size of the input ECG beat, the filter length in the first convolutional layer was set to 31, the number of filters was set to 4, and the rectified linear unit (ReLU) function was used as the activation function [22]. The window size in the first pooling layer was set to 5, and the average pooling method was used. The filter size in the second convolutional layer was set to 6, the number of filters was set to 8, and the ReLU function was again used as the activation function. The window size in the second pooling layer was set to 3, also using the average pooling method. Finally, the output obtained through the convolutional and pooling layers was sent to a fully connected layer for the final output.
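A forward pass through this architecture can be sketched in NumPy with random (untrained) weights. The filter counts, filter lengths, and pooling windows are those stated above; the 300-sample input beat length is an assumption, and the study itself used Caffe rather than this hand-rolled sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, kernels):
    """x: (in_channels, length); kernels: (out_channels, in_channels, k)."""
    out_ch, in_ch, k = kernels.shape
    n = x.shape[1] - k + 1
    out = np.zeros((out_ch, n))
    for o in range(out_ch):
        for i in range(n):
            out[o, i] = np.sum(kernels[o] * x[:, i : i + k])
    return out

def relu(x):
    return np.maximum(x, 0.0)

def avg_pool(x, w):
    """Non-overlapping average pooling along the time axis."""
    n = (x.shape[1] // w) * w
    return x[:, :n].reshape(x.shape[0], -1, w).mean(axis=2)

beat = rng.standard_normal((1, 300))                      # one segmented beat (assumed length)
h = avg_pool(relu(conv1d(beat, rng.standard_normal((4, 1, 31)))), 5)   # conv 1 (4 filters, length 31) + pool 1 (window 5)
h = avg_pool(relu(conv1d(h, rng.standard_normal((8, 4, 6)))), 3)       # conv 2 (8 filters, length 6) + pool 2 (window 3)
logits = rng.standard_normal((5, h.size)) @ h.reshape(-1)              # fully connected: 5 heartbeat classes
```

Tracing the shapes: 300 → 270 after conv 1 → 54 after pool 1 → 49 after conv 2 → 16 after pool 2, leaving 8 × 16 = 128 features for the fully connected layer.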
Batch normalisation (BN) was also used in the proposed network. BN is a training optimisation method proposed by Google [23][24]. Normalisation refers to data standardisation while batch refers to a group of data; therefore, BN refers to the standardisation of a group of data. After applying BN to the input data and the output of the intermediate network layer, the changes produced by the internal neurons and the sample differences can be reduced. Therefore, most of the data can be maintained in the unsaturated region, thus ensuring gradient back-propagation to prevent the gradient vanishing and exploding problems [25].
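The BN transform itself (ignoring the running statistics used at inference time) is a per-feature standardisation over the batch, followed by a learnable scale and shift:

```python
import numpy as np

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-5):
    """Standardise a batch of activations (rows = samples) to zero mean
    and unit variance per feature, then rescale by the learnable gamma
    and shift by beta."""
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

Keeping activations near zero mean and unit variance is what holds them in the unsaturated region of the non-linearity, supporting the gradient-propagation benefit described above.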

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The dataset supporting the conclusions of this article is available in the MIT-BIH arrhythmia database.

Figure 6
Proposed one-dimensional convolutional neural network structure.

Figure 7
Loss function curve.
