Enhanced Local Patterns Using Deep Learning Techniques for ECG Based Identity Recognition System

Electrocardiogram (ECG) signals exhibit features of the electrical activity of the heart that are unique to each individual and have recently emerged as a potential biometric tool for human identification. This paper proposes a new non-fiducial approach for ECG-based biometric person identification using an unsupervised classifier and deep learning techniques. This work investigates the ability of the local binary pattern to extract relevant patterns that distinctively describe the heartbeat activity of each person's ECG. Stacked autoencoders are then used to further enhance the extracted features, which directly improves the performance of the deep belief network classifier in the identification process based on heartbeat activity. The proposed approach is validated through experimental tests performed on datasets from two publicly available databases, MIT-BIH Normal Sinus Rhythm and ECG-ID. The results show that the proposed approach is robust and achieves consistent subject identification in comparison with existing results.


Introduction
Over the past few years, there has been significant technological advancement in wearable devices. Ease of use, integration, reliability and affordability are key factors for wearable devices to become a viable technology across different sectors. It is believed that the greatest potential of wearable technology lies in the healthcare sector, where its acceptance is already gaining momentum [1].
Data originating from biosignals such as the electrocardiogram (ECG), electroencephalogram (EEG), electromyogram (EMG), etc., which can be used to monitor subjects' physiological characteristics, can be considered hidden and require processing tools for their analysis and exploitation. These hidden data can represent an alternative solution to many issues that face persistent challenges closely related to data storage, equipment complexity, data security failures, etc. One of the promising areas for the exploitation of these biosignals, and one that has gained a lot of interest in recent years, is biometrics. Biometric technologies play a key role in enhancing the security of commercial businesses, government organisations and financial institutions. Biometric authentication refers to the process of verifying the identity of a person using metrics that are unique and specific to an individual. There are currently several useful commercial identification systems based on morphological patterns of the iris, fingerprints and face, and approaches based on DNA, voice, hand geometry, ear, signature and gait [9]. To date, none of these approaches has been established as an ideal authentication method, and they are still the subject of ongoing research and development to cope with the challenges occurring in real-world applications.
Designing a robust biometric system that satisfies all requirements and constraints, such as high accuracy, enhanced security and reduced computational complexity, is still a challenging task [10]. Traditionally, ECG signals have been mainly associated with medical diagnosis, such as the analysis of heart arrhythmias and related diseases, and have also been extensively used to model the heart activity with advanced signal processing techniques for the detection and classification of arrhythmias [11][12][13][14]. Recently, biometric recognition using heartbeat activity has emerged as a new alternative technology in the behavioral biometric field [15]. ECG-based biometric identification has several advantages over other existing biometric systems, such as robustness, memory saving and reduced complexity. The recognition mechanism through the heartbeat is quite similar to that of voice recognition; therefore, ECG signals are very sensitive to factors associated with the ECG recording system itself, due to the artifacts generated by the electronic instrumentation, and also to the variability of the signal, which can significantly reduce the performance of the ECG-based biometric system. Like any practical biometric system, the proposed system consists of four stages: data preparation, preprocessing, feature extraction and classification. The approach proposed in this research work combines two stages, namely the 1D local binary pattern (feature extraction stage) and a deep network (classification stage).

Related Research Work
In recent years, the ECG has quickly evolved as a potential tool for biometrics and is projected to become one of the most important technologies in the future [16][17][18][19]. The key requirements of a biometric system are accuracy, security and computational complexity [9,10]. Significant research efforts have been conducted to propose an operational ECG-based biometric application that might satisfy the above requirements.
The common goal of all of these proposed approaches is to achieve high accuracy using an optimized system. Many of the existing related works focus on feature extraction techniques [20] while others deal with classification; few studies have dealt with the preprocessing stage and the types of ECG biometric measurements [21,22]. ECG feature extraction techniques can be divided into two main categories: (1) fiducial methods, which aim at detecting the PQRST points of the signal; (2) non-fiducial methods, which extract the significant information in both the time and frequency domains. The latter category is mainly suitable for online detection or prediction of ECG behavior when the subjects are constantly monitored, and is well suited for the identification of humans. A good survey of the techniques widely deployed in ECG processing can be found in [23]. The main challenge is therefore how to represent the features of the ECG heartbeat in a discriminative feature space.
In the fiducial category, Biel et al. [17] proposed an approach based on the fiducial points of a standard 12-lead ECG and used a classification algorithm called soft independent modeling by class analogy (SIMCA). In [18], the features are located based on the identified fiducial points, and a linear discriminant analysis (LDA) classifier was used. Irvine et al. [19] applied a feature selection method to extract the fiducial points, and the decision was based on a linear discriminant analysis classifier. Shen et al. [21] exploited a few fiducial points as input to hybrid classifiers, namely a decision-based neural network (DBNN) and template matching. The same authors extracted 17 temporal and amplitude attributes from a relatively large database; using template matching and neural networks, the accuracy achieved was 95% and 80%, respectively. Kyoso et al. [22] used an algorithm based on the second derivative to extract the morphological features of the heartbeat segment, namely the duration of the P wave, the PQ interval, the duration of the QRS complex, and the QT interval. These authors then used discriminant analysis for the recognition procedure. As a result, acceptable levels of accuracy (94.2%) were achieved on nine subjects after merging the QT interval features with the QRS duration. In the real world, ECG-based biometrics are practically constrained by either the ECG sensor device or various heart-related conditions such as physical activity, emotions, age, health or disorders, etc. [20]. Those constraints increase the computational complexity of the fiducial point extraction itself and thus pose further challenges to the extraction of temporal features.
For the second category, the process relies on the signal energy, which is considered a key factor of success in many applications, and on signal processing involving the detection and processing of information in the time and frequency domains. A non-fiducial method was used to extract features in Wubbeler et al. [23], where a two-dimensional heart vector signal was generated from the well-known three Einthoven leads. Wang et al. [24] defined non-fiducial features using autocorrelation and discrete cosine transform (DCT) methods. Other studies, such as Irvine et al. [25], used a non-fiducial method based on principal component analysis (PCA) to define the feature vector. Islam et al. [26] investigated a new technique based on the fusion of different types of templates relying on fiducial and non-fiducial approaches, combined into a new feature termed TempType. Agrafioti and Hatzinakos [27] employed the autocorrelation function to calculate the coefficients, and the DCT or LDA methods were used to reduce the number of features. Yu and Chou [28] used independent component analysis (ICA) to extract 100 coefficients from an ECG heartbeat segment composed of 200 sampling points centered at the identified R peak. Sandeep et al. [29] proposed a multitask learning algorithm to extract and classify features for single-lead ECG-based biometric recognition.
In the case of multimodal approaches, the ECG signal has also been exploited with other physiological biometrics in a fusion strategy, such as face, voice and fingerprint [30][31][32].
The ECG is a physiological signal known to be inherently non-stationary, with both local and global variations. Such characteristics can be a drawback for different signal processing tools in both the time and frequency domains. To overcome this drawback, this work proposes an efficient approach for biometric discrimination based on ECG signals. This approach is motivated by the 1D-LBP technique, which has been successfully used for signal analysis in various applications, including the classification of EMG and EEG signals [33][34][35][36]. Similarly to our research work, Louis et al. [37] developed a new version of the 1D-LBP, named the one-dimensional multi-resolution local binary pattern (1DMRLBP), which was used to identify the heartbeat activity of different subjects. Hereafter, it is shown that the LBP technique provides a good alternative feature space for classifying the heartbeats of distinct subjects. Moreover, the statistical aspect of its computation gives it the ability to effectively capture local variations, which makes it suitable for applications involving the classification of signals based on their underlying patterns. Traditionally, histograms derived from the computed LBPs have been employed as features. Consequently, statistical features computed from LBP histograms hold a promising place as they considerably reduce the length of the original raw signal [37]. However, the 1D-LBP can be sensitive to local changes in the timing of the ECG heartbeat activity, and this may negatively affect its discriminating ability. Therefore, this work contributes an enhancement of the local binary pattern that extracts pertinent features more sensitive to the local and global variations of the heartbeat ECG dynamics.
As a second contribution of this research work, the ability of a deep learning architecture to further enhance the performance of the proposed framework is also addressed. Our research builds on recently published work where different architectures have been adopted, using either the 1D signal (short ECG segments) or a 2D spectrum of the signal (the short-time Fourier transform) for feature extraction and, mostly, CNNs and RNNs (e.g., LSTM) as classifiers, as well as hybrid deep neural networks combining a CNN for training on the ECG features and an LSTM for learning patterns directly from the CNN-trained hidden features that characterize ECG signals. More details can be found in [38][39][40], where the authors introduced several standard and modified deep learning (DL) architectures for ECG abnormality and variability detection.
Few authors have proposed the use of DL for ECG biometrics as an alternative to methods based on classical machine learning (ML) such as SVM, KNN, Random Forest, NN, etc. [41]. Eduardo Luz [42] investigated the performance of a CNN trained on both the raw ECG signal and its spectrogram representation (the short-time Fourier transform) to identify a heartbeat of an ECG signal. Zhang et al. [43] proposed a new architecture called multiresolution CNN for human identification; they jointly exploited the powerful representation of the wavelet-domain multiresolution analysis and the discrimination ability of the CNN to classify randomly selected ECG signal segments. Despite the promising results obtained using deep CNN approaches for anomaly detection and ECG-based recognition, some tasks, such as the optimal choice of signal segmentation, denoising, the need for hand-engineered features, producing new data with a given joint distribution, and unsupervised learning, remain challenging [44]. Some powerful deep architectures can overcome most of these limitations, such as restricted Boltzmann machines (RBM) and autoencoders [45]. Gang Zheng et al. [46] applied a stacked denoising autoencoder (SAE) to learn the information entropy feature of the ECG signal. Rahhal et al. [47] used an SAE to learn features from the raw ECG data for ECG classification. Wang and Shang [48] used a DBN to automatically extract features from raw unlabeled physiological data. In addition, the advantages of the DBN and of the stacked autoencoder, namely abstract data representation and discrimination ability, respectively, are both investigated. The proposed DBN architecture is stacked with RBMs of two types, Gaussian-Bernoulli (GBRBM) and Bernoulli-Bernoulli (BBRBM), which can better learn the bin histogram of the 1D-LNDP of the ECG segment, previously reconstructed by the SAE.
Several training algorithms, including contrastive divergence (CD) and persistent contrastive divergence (PCD), are used to estimate the RBM parameters. Once the RBM-by-RBM pre-training of the DBN is performed, the back-propagation technique is applied through the whole classifier to fine-tune the weights and achieve optimal classification. The proposed approach is tested and evaluated on two databases, ECG-ID and MIT-BIH [49]. The methods proposed in this paper are compared with existing algorithms in this field, with experiments performed under the same conditions as these previous studies. The obtained results are compared in terms of classification accuracy, ROC metrics and boxplots. The remainder of the paper is structured as follows: Section 2 introduces the preprocessing of the recorded ECG signal and the adopted feature extraction approach. The proposed hybrid architecture combining SAE and DBN is described in Section 3. The datasets, experiments and results are presented and discussed, along with extensive comparisons with other results, in Section 4. Finally, Section 5 summarises the conclusions of this study.
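As an illustration of the contrastive-divergence pre-training mentioned above, the following sketch shows a single CD-1 update for a Bernoulli-Bernoulli RBM. This is an illustrative NumPy implementation, not the paper's code; the function names and the learning rate are ours.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b_v, b_h, lr=0.05):
    """One contrastive-divergence (CD-1) update for a Bernoulli-Bernoulli RBM.

    v0: batch of binary visible vectors, shape (batch, n_visible).
    W: weights of shape (n_visible, n_hidden); b_v, b_h: bias vectors."""
    # Positive phase: hidden probabilities and a binary sample given the data
    ph0 = sigmoid(v0 @ W + b_h)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again
    pv1 = sigmoid(h0 @ W.T + b_v)
    ph1 = sigmoid(pv1 @ W + b_h)
    # Gradient approximation: data correlations minus model correlations
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    b_v += lr * (v0 - pv1).mean(axis=0)
    b_h += lr * (ph0 - ph1).mean(axis=0)
    return W, b_v, b_h
```

In a DBN, this update would be iterated over mini-batches for each RBM in turn before the supervised fine-tuning pass.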

The Proposed Approach-Framework

The proposed ECG-based biometric system is depicted in Figure 1. It consists of four main modules: (i) ECG signal preprocessing; (ii) segmentation of the heartbeats; (iii) feature extraction from the resulting signal (this step represents the core of our contribution); and (iv) classification of ECG patterns. The first module serves as a filter to enhance the quality of the ECG signal. In the segmentation module, the ECG signal is separated into a series of heartbeats, where each heartbeat is characterized by a set of points P, Q, R, S, T, called fiducial points, whose central portion forms the most important dynamic characteristic of the signal, known as the QRS complex. The feature extraction module extracts significant features from each heartbeat segment using an algorithm closely related to the 1D local binary pattern approach proposed in [38][39][40]. After that, every ECG heartbeat is represented by this feature in the form of histogram bins. Thus, the output of the proposed feature extraction module is a 1D feature vector, seen here as a signature of the ECG, which can then be processed by a hybrid deep learning module for enhancement and classification. In this research work, two databases publicly shared by PhysioNet [38] are used. A brief description of these data is given below.

The ECG-ID Database
This database contains 310 ECG recordings obtained from 90 subjects. Each recording contains: ECG lead I, recorded for 20 seconds, digitized at 500 Hz with 12-bit resolution over a nominal ±10 mV range; 10 annotated beats (unaudited R- and T-wave peak annotations from an automated detector); and information (in the .hea file for the record) containing age, gender and recording date. The records were obtained from volunteers (44 men and 46 women aged between 13 and 75 years). The number of recordings per person varies from 2 (collected in one day) to 20 (collected periodically over 6 months).

The MIT BIH Normal Sinus Rhythm Database
The database contains long-term (about 24 hours) two-lead ECG recordings of 18 subjects referred to the Arrhythmia Laboratory at Boston's Beth Israel Hospital. All the ECG records are sampled at 128 Hz. Subjects include 5 men, aged between 26 and 45, and 13 women, aged between 20 and 50. These 18 subjects were found not to have had any significant arrhythmias. Only lead I ECG data are used in our validation experiments, as it is the most commonly used lead because it can easily be set up to acquire the ECG in various situations [50]. For each subject, the ECG signal from hour 1 to hour 2 is used as the training dataset, and the ECG signal from hour 13 to hour 14 is used as the test dataset.
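Given the hour-based split described above, the corresponding sample-index ranges at 128 Hz can be computed directly. This is an illustrative helper; the function name is ours.

```python
FS = 128  # sampling rate of the MIT-BIH Normal Sinus Rhythm records (Hz)

def hour_slice(start_hour, end_hour, fs=FS):
    """Sample-index range covering [start_hour, end_hour) of a recording."""
    return start_hour * 3600 * fs, end_hour * 3600 * fs

train_range = hour_slice(1, 2)    # hour 1 to hour 2 for training
test_range = hour_slice(13, 14)   # hour 13 to hour 14 for testing
```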

ECG pre-processing
First, the ECG signals obtained from the well-known public databases are normalized with respect to the sampling frequency using conventional linear interpolation. The preprocessing of the raw signal aims at reducing noise and removing artifacts originating from various sources, such as muscular interference or, more commonly, the mains power supply (50 Hz or 60 Hz). Digital filters, such as finite impulse response (FIR) and recursive (IIR) filters, are the most commonly used for the attenuation of these artifacts [51]. In addition, ECG signal amplitude normalization is also performed in the preprocessing step.
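The filtering and normalization steps above can be sketched with SciPy. This is a minimal illustrative pipeline, assuming a 0.5-40 Hz band-pass and a 50 Hz notch; the specific cutoff choices are ours, not prescribed by the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt, iirnotch

def preprocess_ecg(sig, fs, notch_hz=50.0):
    """Denoise and amplitude-normalize a raw ECG trace (illustrative sketch)."""
    # Band-pass to suppress baseline wander (<0.5 Hz) and high-frequency noise (>40 Hz)
    b, a = butter(4, [0.5 / (fs / 2), 40.0 / (fs / 2)], btype="band")
    sig = filtfilt(b, a, sig)
    # Notch out power-line interference (50 Hz or 60 Hz)
    if notch_hz < fs / 2:
        bn, an = iirnotch(notch_hz / (fs / 2), Q=30.0)
        sig = filtfilt(bn, an, sig)
    # Amplitude normalization to zero mean, unit variance
    return (sig - sig.mean()) / (sig.std() + 1e-12)
```

Zero-phase filtering (`filtfilt`) is used here so the fiducial-point timing needed later for segmentation is not shifted by the filter delay.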

Heartbeat signal segmentation
The ECG waveform contains much information related to the heart activity, which can be characterized by either fiducial or non-fiducial techniques. The QRS complex describes one of the most important features, able to provide a clear measure of the local and global variation of ECG signals, which simplifies the heartbeat segmentation task. Based on this fact, the Pan-Tompkins algorithm [52] is employed to isolate the fiducial points (P, Q, R, S, and T) of each beat segment (due to the non-stationary and periodic nature of the ECG signal, beat lengths are not equalized across the ECG records). In our design, the beginning of the QRS complex is marked by the end of the P wave; the segment is tailored by taking 94 samples before the identified R-peak and 150 samples after. Therefore, each ECG heartbeat is characterized by 245 samples, equivalent to a 490 ms segment duration, as shown in the third block of Figure 1. Note that the vectors of information are defined inside this normalized segment.
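The fixed-window segmentation described above can be sketched as follows. This is illustrative code; `r_peaks` is assumed to come from a Pan-Tompkins-style QRS detector, and the function name is ours.

```python
import numpy as np

def segment_heartbeats(ecg, r_peaks, before=94, after=150):
    """Cut one fixed-length window per detected R-peak.

    Each returned segment has before + after + 1 = 245 samples
    (490 ms at 500 Hz). Peaks too close to the record edges are skipped."""
    beats = []
    for r in r_peaks:
        if r - before >= 0 and r + after < len(ecg):
            beats.append(ecg[r - before : r + after + 1])
    return np.array(beats)
```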

Enhanced LBP Features extraction
Techniques based on the local binary pattern (LBP) have gained remarkable interest in recent years [33]. LBP has been successfully used in texture image analysis and in biometric systems. For ECG-based feature extraction, LBP is selected here as the baseline. The success of LBP in ECG description is due to its discriminative power, the computational simplicity of the operator, and its robustness to monotonic amplitude changes caused by, for example, artifacts and noise due to intra- and inter-subject variability. Once the fiducial points are detected and the segmentation is performed, the LBP approach is applied to discern the most important features that describe the heartbeat activity. In our work, the performance of the 1D rather than the 2D LBP is investigated. An enhanced form of the LBP is then proposed to improve the performance in terms of sensitivity and accuracy within the scope of the presented biometric framework.
1D-LBP, the local binary pattern applied to a one-dimensional signal such as the ECG, is a transformation code for the signal point $S_c$ computed as follows:

$$\mathrm{1D\text{-}LBP}(S_c) = \sum_{i=0}^{P-1} \Psi(S_i - S_c)\, 2^{i}$$

where $S_i$, $i = 0, \ldots, P-1$, are the $P$ neighbors of $S_c$ ($P/2$ on each side) and $\Psi$ is the threshold function defined as

$$\Psi(x) = \begin{cases} 1, & x \geq 0 \\ 0, & x < 0. \end{cases}$$

Once the transformation codes are obtained for all the signal points, the bin histogram of these codes forms the feature vector of the considered ECG signal, which is then fed to a classifier to perform the classification.
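A minimal sketch of this 1D-LBP transformation and its bin histogram, for P = 8 neighbors. This is our illustrative implementation; the function names are ours.

```python
import numpy as np

def lbp_1d(signal, p=8):
    """Vanilla 1D-LBP: threshold p/2 neighbors on each side of the center
    against the center value and pack the results into a p-bit code."""
    half = p // 2
    codes = []
    for c in range(half, len(signal) - half):
        neighbors = np.concatenate([signal[c - half : c], signal[c + 1 : c + 1 + half]])
        bits = (neighbors >= signal[c]).astype(int)   # the Psi threshold function
        codes.append(int((bits * (1 << np.arange(p))).sum()))
    return np.array(codes)

def lbp_histogram(codes, p=8):
    """Bin histogram of the transformation codes: the feature vector."""
    return np.bincount(codes, minlength=2 ** p)
```

On a monotonically increasing signal, every left neighbor is below the center and every right neighbor is above it, so all points map to the same code.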

Shifted One-Dimensional Binary Patterns
The shifted one-dimensional local binary pattern, inspired by the LBP method, has been employed for extracting features from bioelectric signals. Although it is considered an improved version, the basic process of the Shifted 1D-LBP is quite similar to that of the 1D-LBP [34,36]. They differ in the configuration used to select the neighbors, while both analyze the neighborhood of samples in the time series. In the traditional 1D-LBP, half of the neighbors (P/2) are assigned to each of the left and right sides of the central point; therefore, the number of detectable micro and macro patterns is limited by the number of neighbors. In the Shifted 1D-LBP (PL, PR), there is no such limitation, because the neighbors on the left and right sides of the central point can simply be shifted to increase the potential of capturing different micro and macro patterns.
The mathematical formulation of the Shifted 1D-LBP, with $P_L$ neighbors to the left and $P_R$ neighbors to the right of the central point $S_c$ ($P = P_L + P_R$), is given as:

$$\mathrm{S1DLBP}(S_c) = \sum_{i=0}^{P_L-1} \Psi(S_{c-P_L+i} - S_c)\, 2^{i} + \sum_{j=0}^{P_R-1} \Psi(S_{c+1+j} - S_c)\, 2^{P_L+j}$$

where $\Psi$ is the same threshold function as in the conventional 1D-LBP.
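Under our reading of the shifted scheme, a sketch with PL = 2 left and PR = 6 right neighbors looks like this. The code and the particular (2, 6) split are illustrative choices of ours.

```python
import numpy as np

def shifted_lbp_1d(signal, p_left=2, p_right=6):
    """Shifted 1D-LBP: p_left + p_right = P neighbors, but the window may be
    unbalanced (shifted) around the central point, unlike the vanilla 1D-LBP."""
    p = p_left + p_right
    codes = []
    for c in range(p_left, len(signal) - p_right):
        neighbors = np.concatenate(
            [signal[c - p_left : c], signal[c + 1 : c + 1 + p_right]]
        )
        bits = (neighbors >= signal[c]).astype(int)   # same Psi threshold
        codes.append(int((bits * (1 << np.arange(p))).sum()))
    return np.array(codes)
```

Varying (p_left, p_right) while keeping P fixed changes which micro/macro patterns become visible, which is the point of the shifted variant.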

1D-Local Difference Pattern
In this section, our proposed feature extraction method is explained. The 1D-local difference pattern (1D-LDP) method relies on the enhancement brought by the local variation and the relative connection between neighboring points. This method has been inspired by the LBP method [34]. Starting from the original LBP formulation, different variants of LBP have been proposed in many applications related to 2D forms, face recognition, texture, iris, etc. [33]. Even though 1D-LBP feature extraction techniques were proposed for signal processing and successfully applied to epileptic EEG signal classification [34][35][36][53], LGP- and LDP-based techniques are yet to be developed in the same context. For ECG applications, Louis [37] proposed a new algorithm, 1DMRLBP, for ECG-based biometric identification. In this work, a 1D-LNDP-based feature extraction technique is introduced for heartbeat ECG feature extraction. Due to the transitions shown over time in each heartbeat activity and the non-stationary nature of the ECG waveform, the 1D-LBP is highly sensitive to these phenomena. To overcome this intra- and inter-subject structural variation of the ECG, the neighboring information (the difference of consecutive points) is used instead of the original signal before applying the remaining steps of the conventional 1D-LBP. Our technique preserves the structural property of the processed pattern. The various steps of the proposed 1D-LNDP (equally applicable to the shifted 1D-LBP, yielding the shifted 1D-LNDP) feature extraction technique are illustrated in Figure 10 and explained hereafter. 1. Set the number of neighboring points m for each ECG segment, as shown in Figure 3. 2. Compute the difference of consecutive points (starting from the top-left neighbor), which results in the pattern depicted in Figure 4: $\nu_i = s_{i+1} - s_i$, for $i = 1, \ldots, m-1$, with $m$ the size of the data vector ($m = 9$ in our case).
3. Compute the center $C_\nu$ of the data vector $\nu$, and then create the information vector from each neighboring point and its center: $S_{\nu_i} = \nu_i - C_\nu$. 4. Compute the new centre of the $S_{\nu_i}$ as follows: $C_S = \frac{1}{m-1} \sum_{i=1}^{m-1} S_{\nu_i}$. 5. The transformation code for the signal point $g_{c_i}$ can be computed as follows: $g_{c_i} = \Psi(S_{\nu_i} - C_S)$, for $i = 1, \ldots, m-1$. This yields the vector of information $G_c = [g_{c_1}, \ldots, g_{c_{m-1}}]$, which gives a measure of the local variation between successive points and of the global variation for the current segment.
6. Compute the bin histogram of the ECG signal's 1D-LDP codes.
The function $\Psi(x)$ used above is the threshold function defined previously. 7. Create histograms of 1D-LNDP features. To date, different types of LBP histograms have been formed from the mapped image/signal. In this work, two types of LBP histogram are used: (a) the overall LBP histogram, which collects the occurrences of the 256 possible patterns of the 1D-LNDP of the considered ECG signal; (b) the uniform-pattern histogram, which reduces the whole 1D-LNDP histogram by retaining only the patterns whose binary representation, considered circular, contains at most two bitwise transitions from 0 to 1 or 1 to 0.
8. Hence, the ECG signal is divided into segments, the LBP histogram is calculated from each segment, and the histograms are then concatenated. The histogram of the obtained 1D-LDP signal shows how often each of the 256 possible patterns appears in its corresponding ECG segment, as shown in Figure 6.
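Our reading of steps 1-8 can be condensed into a short sketch: differences of consecutive samples replace the raw amplitudes, the usual LBP thresholding produces 8-bit codes, and a 256-bin (or 58-bin uniform) histogram is collected. This is an illustrative, simplified variant of the proposed 1D-LDP; the names and the center-thresholding details are ours.

```python
import numpy as np

def is_uniform(code, bits=8):
    """True if the circular binary string of `code` has at most two 0<->1 transitions."""
    b = [(code >> i) & 1 for i in range(bits)]
    return sum(b[i] != b[(i + 1) % bits] for i in range(bits)) <= 2

def ldp_1d_histogram(segment, p=8, uniform_only=False):
    """Simplified 1D-LDP descriptor: LBP codes computed on the consecutive
    differences of the segment, collected into a bin histogram."""
    diff = np.diff(segment)                      # step 2: local differences
    half = p // 2
    hist = np.zeros(2 ** p, dtype=int)
    for c in range(half, len(diff) - half):
        neighbors = np.concatenate([diff[c - half : c], diff[c + 1 : c + 1 + half]])
        bits = (neighbors >= diff[c]).astype(int)     # Psi threshold on differences
        hist[int((bits * (1 << np.arange(p))).sum())] += 1
    if uniform_only:                             # keep only the 58 uniform patterns
        hist = hist[[c for c in range(2 ** p) if is_uniform(c, p)]]
    return hist
```

For P = 8 there are exactly 58 uniform patterns, so the uniform histogram reduces the descriptor from 256 to 58 bins.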
In terms of signal pattern matching, Figures 6-9 demonstrate the effectiveness of using the derivative of the signal as a tool to distinguish between subjects. This interpretation supports the fact that the human heart is unique to each human being and that the heart activity has its own timbre, shaped by the phases forming the PQRST wave. The organ goes through polarization and depolarization phases marked by an accentuated "difference", i.e., a certain dynamic velocity of transition between successive phases of the cycle. Using the local differentiation, the R wave is the part giving the best distinction; it coincides with the depolarization of the ventricles and, simultaneously, the repolarization of the atria. To consolidate the above reasoning, a pictorial representation of the various steps involved in the proposed 1D-LDP technique is shown in Figure 8.
Table 1 gives a comparison of the state-of-the-art approaches based on the 1D-LBP and their application fields, in chronological order.
The common aim of the different approaches reported in Table 1 is to find a unique pattern, invariant over time and robust to the challenges associated with each signal and its application. Obviously, the choice of the selected neighborhood and of the central point are the key factors for the success of each approach. Similarly to our work, Louis et al. [37] studied the performance of LBP on the ECG as a biometric modality. They modified the neighborhood selection mechanism by using a multi-resolution temporal approach considering local and global variations, to extract a more robust pattern, invariant to the temporal transitions and morphological characterization of the ECG signal. Moreover, in this work, the natural variability associated with the ECG signal, such as the presence of artifacts and other sources of variability, has been investigated. In our work, we have used the properties generated from the difference of consecutive points of each ECG segment, which may lead to the minimization of the intra-class variation and the maximization of the inter-class variation (see Figures 7(a)-(d)); this trade-off is in fact the most influential challenge facing the robustness and effectiveness of the biometric task addressed in this paper.

Deep Learning ECG Biometric Identification with Enhanced Feature Extraction
Due to the overfitting produced in many pattern recognition and dimensionality reduction tasks, different approaches relying on various regularization rules have been developed to overcome this drawback. In this paper, two established methods are adopted, namely autoencoders (AE) and restricted Boltzmann machines (RBM), to construct a DNN that better learns the pattern features extracted from each ECG heartbeat. The features of an ECG segment can be biased in the sense that the extraction method used in the ECG preprocessing phase may or may not account for the intra- and inter-subject variations produced by multiple factors, such as sampling selection and quantization errors, mis-segmentation errors, and other noise detrimental to the heartbeat structure. The proposed model is shown in Figure 2: a stacked autoencoder for feature enhancement, whose output is fed into the proposed DBN to identify and assign the ECG signal to a specific individual.

Stacked autoencoder (SAE) for features enhancements
Space reduction is considered an important step in many pattern recognition applications. One of the most successful approaches is principal component analysis (PCA), which has been reported in many research works [54]. Despite its benefits, PCA has several shortcomings, such as over-fitting, eigenvalue selection, etc. Recently, auto-encoders have been proposed as an alternative solution that achieves better results in terms of reconstruction error and computation time [55].
To efficiently extract the features of the ECG signal in a convenient manner, this paper proposes to map them into a new space, termed the sparse space, generated by the sparsity property. Before introducing the latter, consider the auto-encoder, whose architecture is illustrated in Figure 9. Similarly to PCA, an auto-encoder is a feedforward neural network that learns from input data and aims to reproduce the data as an unbiased vector via a nonlinear activation function (here, a sigmoid). In fact, the training algorithm of the AE seeks to minimize the error function at the output of each node of the hidden layer.
Unlike PCA, the AE is an unsupervised feature learning algorithm [55] that aims to produce a better feature representation of high-dimensional input data by finding a set of features that maximize the inter-class variance and minimize the intra-class variance in the resulting feature space. The proposed network topology consists of equally sized input and output layers separated by at least one hidden layer. Successive layers are fully connected through weight matrices, a bias vector, and a nonlinear function. It is common to use tied weights, i.e., the weight matrix connecting the hidden and output layers is simply the transpose of the weight matrix connecting the input and hidden layers. This limits the number of model parameters and acts as a kind of regularization.
Using the feature vector $x \in \mathbb{R}^n$ as input, the hidden layer is calculated as:

$$h = f(Wx + b_h)$$

where $f(z) = 1/(1 + \exp(-z))$ is the nonlinear activation function applied component-wise, $h \in \mathbb{R}^m$ is the vector of node activations, $W \in \mathbb{R}^{m \times n}$ is a weight matrix, and $b_h \in \mathbb{R}^m$ is a bias vector. The network output is given by:

$$y = f(W^{T} h + b_y)$$

where $b_y$ is the output bias vector (the tied weights $W^{T}$ are used). The complete model parameter vector is $\theta = \{W, b_h, b_y\}$.
The hidden layers can be stacked to create a deep network, and the dimensions of the hidden layers constitute the model hyper-parameters. Typically, at least one hidden layer should be smaller than the input layer to achieve a compressed representation and avoid the problem of learning the identity function. Parameters are adjusted by back-propagating gradients from the squared-error loss function:

$$L(x, y) = \| x - y \|^2$$

where $y$ is the reconstructed output. The most frequently used cost function is

$$J(\theta) = \frac{1}{|D|} \sum_{x \in D} L(x, y)$$

where $D$ is the set of training examples. The bias parameters are initialized to 0 and the weights are initialized from a random uniform distribution such that the scale of the gradients in each layer is roughly the same. During testing, the score of a query sample is given by the negative reconstruction error $-L(x, y)$.
Since the AE is trained to minimize reconstruction error, the learned representation is not necessarily discriminative, which matters for the subsequent use of the resulting feature space. The authors in [56] addressed this by adding a sparsity constraint to the cost function that drives the hidden-layer activations toward zero. The average activation of the j-th hidden node over the training set is

ρ̂_j = (1/m) Σ_{i=1}^{m} a_j(x^{(i)}),

where m is the number of training samples and a_j(x^{(i)}) is the activation of the j-th hidden node for the i-th sample. To avoid learning the identity function and improve the ability to capture important information, ρ̂_j should be zero or close to zero for every hidden node j, which is enforced by an additional penalty term added to the cost function. The Kullback-Leibler (KL) divergence, one of the most efficient tools for optimizing a cost function under such constraints, measures the difference between two Bernoulli distributions with means ρ (the desired sparsity level) and ρ̂_j (the actual average activation):

KL(ρ ∥ ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j)).

Adding this penalty over all hidden nodes yields the sparse autoencoder (SAE) cost function

J_SAE(θ) = J(θ) + β Σ_j KL(ρ ∥ ρ̂_j),

where β controls the weight of the sparsity penalty and J_SAE(θ) drives the output of the encoder to be as close as possible to the target. The parameters {W_1, b_h, b_y} are adjusted using gradient descent, a form of the backpropagation learning algorithm.
Figure 13 illustrates the architecture of the stacked AE.
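The KL-divergence sparsity penalty described above can be sketched as follows (a minimal numpy illustration; the clipping constant is our own numerical-safety assumption, not part of the formula):

```python
import numpy as np

def kl_sparsity_penalty(activations, rho=0.05):
    # activations: (num_samples, num_hidden) matrix of hidden-node outputs.
    # Average activation rho_hat_j of each hidden node over the training set.
    rho_hat = activations.mean(axis=0)
    rho_hat = np.clip(rho_hat, 1e-8, 1 - 1e-8)  # avoid log(0)
    # KL divergence between Bernoulli(rho) and Bernoulli(rho_hat_j),
    # summed over all hidden nodes j.
    kl = rho * np.log(rho / rho_hat) \
        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))
    return float(kl.sum())

# The penalty vanishes when every node's average activation equals rho,
# and grows as the hidden nodes become less sparse.
sparse_acts = np.full((100, 32), 0.05)   # average activation == rho
dense_acts = np.full((100, 32), 0.5)     # far from the sparsity target
```

In the SAE cost function this term would be scaled by β and added to the reconstruction cost J(θ).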

Deep Belief Networks (DBN) classifier
Deep belief networks (DBNs) are generative graphical models [44] organized as deep directed acyclic graphs formed by a sequence of restricted Boltzmann machine (RBM) architectures. A DBN is trained by training its RBMs layer by layer from bottom to top. Since each RBM can be trained rapidly with the contrastive divergence algorithm, this layer-wise scheme avoids much of the computational complexity of training the whole DBN at once. Studies on DBNs have shown that they can overcome the low convergence speed and local-optimum problems of classical back-propagation algorithms for training multilayer neural networks. Figure 10 shows the architecture of the DBN, with RBMs trained layer by layer from bottom to top, that is used in our ECG-based biometric system. To avoid overfitting in the classification stage, which usually causes misclassification errors, multiple SAEs are stacked to form a stacked AE, such that the output of each layer is fed as the input to the successive layer. In this work, the features extracted from multiple ECG segments are input into the two-layer SAEs for feature enhancement, and the SAEs are trained one by one in an unsupervised way. The extracted local features are then passed to the feature extraction units of the DBN in order to learn additional, complementary representations. In our design, the DBN stacks 3 RBMs (3 hidden layers). The first two RBMs are used as generative models, while the last one is used as a discriminative model combined with back-propagation (BP) for the multiclass classification task. The hidden layers of the DBN are trained one layer at a time in a bottom-up manner, using a greedy layer-wise training algorithm. In this work, the methodology used to train the SAE-DBN model is divided into four stages: feature enhancement, pre-training, supervised training, and fine-tuning.
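The layer-wise RBM training mentioned above relies on contrastive divergence; a minimal CD-1 update step for a binary RBM might look as follows (a generic sketch under standard RBM conventions, not the authors' exact implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_step(v0, W, b_vis, b_hid, rng, lr=0.1):
    # One contrastive-divergence (CD-1) update for a binary RBM.
    # Positive phase: hidden probabilities and a sample given the data v0.
    p_h0 = sigmoid(v0 @ W + b_hid)
    h0 = (rng.random(p_h0.shape) < p_h0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    p_v1 = sigmoid(h0 @ W.T + b_vis)
    p_h1 = sigmoid(p_v1 @ W + b_hid)
    # Gradient approximation: <v h>_data - <v h>_model.
    W += lr * (np.outer(v0, p_h0) - np.outer(p_v1, p_h1))
    b_vis += lr * (v0 - p_v1)
    b_hid += lr * (p_h0 - p_h1)
    return W, b_vis, b_hid

rng = np.random.default_rng(1)
W = 0.01 * rng.standard_normal((6, 3))   # 6 visible, 3 hidden units
b_vis, b_hid = np.zeros(6), np.zeros(3)
v0 = np.array([1.0, 0.0, 1.0, 0.0, 1.0, 0.0])
W, b_vis, b_hid = cd1_step(v0, W, b_vis, b_hid, rng)
```

Training an RBM amounts to repeating this step over the training vectors; in the DBN the hidden activations of one trained RBM become the visible data of the next.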
1. Feature enhancement: after initializing the SAE parameters W and b to random values and setting the maximum number of epochs, the learning rate, and the sparsity parameter, the two-layer SAEs are trained by minimizing the reconstruction error, and the output of the last hidden layer is taken as the new feature representation.
2. Pre-training: the main purpose of pre-training is to obtain the network parameters of each RBM layer by layer. In this paper, the LBP features enhanced by the SAE are fed to the bottom visible layer to initialize the DBN. The weights between the first (visible) layer and the first hidden layer are trained with the contrastive divergence algorithm. Once the first layer's parameters are obtained, its output is used as the input data of the second layer. Through this process, the parameters of each RBM are obtained layer by layer. In this paper, the number of DBN layers is set to three.
3. Supervised training: the last RBM is trained as a nonlinear classifier using the training and validation sets along with their associated labels, and its performance is monitored at each epoch.
4. Fine-tuning: after pre-training the multi-layer DBN, labelled training data are fed to the network constructed in the pre-training step, and the back-propagation algorithm is used to optimize the parameters of the whole network.
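The greedy stacking used in stages 1 and 2, where the output of each trained layer becomes the input of the next, can be sketched as follows (weights are assumed already learned by the per-layer training above; class and function names are our own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class EncoderLayer:
    # One trained SAE encoder layer (weights assumed already learned).
    def __init__(self, W, b):
        self.W, self.b = W, b

    def encode(self, x):
        return sigmoid(self.W @ x + self.b)

def stack_encode(layers, x):
    # Greedy stacking: the output of each layer is the input to the next;
    # the final hidden activation is the enhanced feature vector that is
    # handed to the DBN for classification.
    for layer in layers:
        x = layer.encode(x)
    return x

# Illustrative sizes: a 256-bin LBP histogram compressed to 128, then 64.
rng = np.random.default_rng(0)
layers = [
    EncoderLayer(rng.uniform(-0.1, 0.1, (128, 256)), np.zeros(128)),
    EncoderLayer(rng.uniform(-0.1, 0.1, (64, 128)), np.zeros(64)),
]
features = stack_encode(layers, rng.uniform(0.0, 1.0, 256))
```

In the full pipeline, `features` would then initialize the bottom visible layer of the DBN for pre-training.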

Results and Discussion
The proposed ECG biometric authentication algorithm is tested and its performance evaluated on the two databases ECG-ID and MIT-BIH [47], which are among the most commonly used benchmarks in ECG-based biometric recognition research. In the experiments, for ECG-ID the whole ECG signal is used for both training and testing, whereas for the MIT-BIH database, for each subject, the ECG signal from hour 1 to hour 2 is used as training data and the ECG signal from hour 13 to hour 14 as test data. The proposed method is evaluated in terms of standard biometric measures and schemes [57], such as the recognition rate and the equal error rate, along with receiver operating characteristic (ROC) and detection error trade-off (DET) curves. The evaluation is performed in two modes: verification and identification. In verification mode, the ROC curve and the equal error rate (EER) are used as performance indices. The ROC curve can be read as a false acceptance rate (FAR) versus false rejection rate (FRR) curve. A false acceptance error is made by the system when the Euclidean distance between an ECG feature and a template corresponding to two different subjects is lower than the designed threshold. Similarly, a false rejection error is made when the Euclidean distance between an ECG feature and a template corresponding to the same subject is higher than the threshold. The EER is the point where the false acceptance rate and the false rejection rate are equal in value; the smaller the EER, the better the performance of the algorithm. Furthermore, another useful performance measure is used: the cumulative match score (CMS).
The CMS measure assesses the ranking capability of the biometric recognition system by producing a list of scores indicating the probability that the correct class for a given test sample is within the top n matched class labels. The CMS at rank one equals the correct recognition rate (CRR). Figure 11 shows the CMS curve of the ECG identification for the different LBP variants.
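As a reference for how the EER is obtained from the distance scores described above, a minimal sketch (the threshold sweep and the averaging at the crossing point are our own conventions; a match is declared when the distance falls below the threshold):

```python
import numpy as np

def compute_eer(genuine_dists, impostor_dists):
    # Sweep a distance threshold over all observed scores.
    thresholds = np.sort(np.concatenate([genuine_dists, impostor_dists]))
    eer, best_gap = 1.0, np.inf
    for t in thresholds:
        far = np.mean(impostor_dists < t)    # impostors wrongly accepted
        frr = np.mean(genuine_dists >= t)    # genuine samples wrongly rejected
        # The EER is where FAR and FRR cross; take the closest point.
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Well-separated scores give EER = 0; overlapping scores give EER > 0.
genuine = np.array([0.10, 0.12, 0.15, 0.20])   # same-subject distances
impostor = np.array([0.70, 0.80, 0.85, 0.90])  # cross-subject distances
```

Plotting FAR against FRR over the same threshold sweep yields the DET curve used in Figure 12.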

Comparison using different LBP variants
To begin the experiment, all ECG signals were preprocessed with filters to remove the baseline wander, as depicted in Figure 2. We then utilized the heartbeat fiducial point times provided with the MIT-BIH arrhythmia database to isolate the R peak of the QRS complex. In the ECG segmentation step, every heartbeat sample has 245 data points: 94 points before the R peak and 150 points after it. The segmented ECG signals are shown in Figure 3. The raw ECG segment is then transformed into the LBP domain using 1D-LDP. This procedure generates a histogram per ECG segment, with codes ranging between 0 and 255 that correspond to variations and significant heartbeat activity. K-Nearest Neighbor (KNN) classification was performed for identity authentication. The first experiment compares the performance on ECG-ID of different LBP variants (Adaptive LBP, 1D LBP, Modified 1D shifted LNDP, LNDP, Mrlbp, 1D-LDP, 1D shifted LBP) with the KNN classifier. Note that each LBP variant has its own configuration [31,32,33,34,35,36], which may either decrease or increase its performance. In this work, a series of experiments was carried out to choose the best parameters for each LBP variant.
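For reference, a generic 1D-LBP histogram computation on a 245-point heartbeat segment can be sketched as follows (this illustrates the classic variant with 8 neighbors; the authors' 1D-LDP and 1D-LNDP operators differ in how neighbors are compared):

```python
import numpy as np

def lbp_1d(signal, radius=4):
    # Classic 1D-LBP: compare each sample against `radius` neighbours on
    # each side and pack the 2*radius sign bits into a code (0..255 for
    # radius=4, i.e. 8 neighbours).
    codes = []
    for i in range(radius, len(signal) - radius):
        center = signal[i]
        neighbours = np.concatenate(
            [signal[i - radius:i], signal[i + 1:i + radius + 1]])
        bits = (neighbours >= center).astype(int)
        codes.append(int(np.dot(bits, 2 ** np.arange(len(bits)))))
    return codes

def lbp_histogram(signal, radius=4):
    # 256-bin histogram of codes: the feature vector for one ECG segment.
    hist = np.zeros(256, dtype=int)
    for c in lbp_1d(signal, radius):
        hist[c] += 1
    return hist

segment = np.sin(np.linspace(0.0, 6.28, 245))  # stand-in for a heartbeat
features = lbp_histogram(segment)
```

The resulting 256-bin histogram is the feature vector that is fed to the KNN classifier (or, in the hybrid pipeline, to the SAE for enhancement).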
The Detection Error Trade-off (DET) for all different LBP variants are plotted in Figure 12.
As shown in Figures 12-14, the performance of the proposed approach is satisfactory and superior to the other existing LBP methods; the EER is only 3.33%. In identification mode, the algorithm is measured by the correct recognition rate (CRR); the CRR of the proposed algorithm is 93.33%.
Table II reports the EER as a performance index for the different LBP methods applied to ECG-ID; it shows that the EER of the proposed method is distinctly lower than that of the other LBP variants discussed in the previous section. Figure 15 illustrates the impact of the number of neighbor points selected in each ECG segment. The recognition rate reaches its best value when the number of neighbors is n = 9, and decreases when the neighbor selection is 3, 5, 15, 16, or 17. In general, for all the 1D-LBP variants used, the performance index does not change uniformly with the selected number of neighbors. In our experiments, a few segments selected from the whole ECG signal (segmentation failures, e.g., misdetection of the PQRST complex) affected the performance of 1D-LDP. In other words, the segmentation method is an important step for reliably capturing the information that describes the ECG heartbeat activity.

Evaluating the impact of neighbor selection
In this context, the outcome of the proposed approach is assessed using deep learning, which has recently been applied to classification problems and constitutes the second contribution of this paper. To demonstrate the robustness of the approach, a comparative study was carried out against the existing ML algorithms described in [41], including SVM, KNN, RBF, and PNN, using the recognition rate as the comparison metric. The training and testing sets were processed according to the considered methods, and the resulting feature vectors were fed into the RBF, PNN, SVM, and DBN classifiers for training and parameter adjustment. For the SAE-DBN, the processed ECG signal is divided into segments, and the histograms of the 1D-LNDPs of these segments are further enhanced through the SAE before the DBN training phase. Boxplot representations summarize the results in Figures 17 and 18. Several experiments were performed on the proposed architecture, designed as a hybrid of two deep learning models, to select the common parameters or hyper-parameters needed to learn our data correctly: the number of units per layer, the sparsity penalty, the learning rate, and the number of epochs. Indeed, no existing DL application can claim an optimal architecture purely on the basis of its hyper-parameters; in the authors' experience, the best hyper-parameters are found by training the classifier efficiently. To determine appropriate 1D-LNDP features and SAE-DBN topologies, a series of experiments was performed to select the best combination on the ECG database. Table III lists the hyper-parameters and the values used during the simulations. In all simulations, the input layer size n corresponds to the size of the 1D-LNDP feature vector.
The number of output-layer nodes depends on the number of subjects recorded in the ECG database. Two algorithms, contrastive divergence (CD) and persistent contrastive divergence (PCD), are used to train the DBN. The remaining parameters are summarized in Table III. The experiments were carried out on a desktop computer with the following characteristics: Intel(R) Core(TM) i7-3632QM CPU @ 2.20 GHz, 8 GB RAM, no GPU.
As shown in Figures 16, 17 and 18, a higher identification rate is obtained with the proposed approach than with all the other classifiers (PNN, RBF, SVM, KNN, DBN). The proposed SAE-DBN reaches the best recognition rate on both databases considered. Furthermore, the proposed method proves more robust and reliable than the other conventional classification methods, which helps achieve better generalization performance.

Discussions and Concluding Remarks
After the series of experiments carried out on the two databases, where the proposed configurations were evaluated and compared with the existing techniques mentioned in the previous section, several important observations can be made:
· The major problems that ECG-based person recognition faces are the sources of variability in the ECG signal, which are closely associated with the ECG measurement and acquisition method. To overcome this drawback, many techniques have been proposed to enhance the performance of ECG-based biometric systems. Several factors can be considered to further improve the quality of the ECG signal, such as the signal-to-noise ratio, smoothing, and re-sampling (i.e., selecting a new sampling frequency rather than the original one). In our work, a Savitzky-Golay filter was applied to further enhance the quality of the raw ECG signals [58].
· The most relevant information of the ECG is located along the QRS complex, which describes the heartbeat template of the ECG and therefore represents the signature of any ECG-based biometric system. Thus, the segmentation methodology is one of the most crucial steps of ECG processing and can significantly increase or decrease the performance of an ECG-based biometric system. In this paper, a simple procedure was employed to locate the heartbeat segment by applying a popular algorithm for PQRST detection [52].
· The main challenge in this work was to find an efficient feature extraction approach that yields information that is relevant, robust, and invariant to most factors affecting the inter- and intra-subject variability associated with ECG measurement techniques. Consequently, we investigated the ability of the local binary pattern, a well-known method that has proved efficient in many 2D image applications.
By formulating a 1D version of the local pattern approach and proposing a new algorithm termed 1D-LNDP, we obtained, through a series of tests and experiments, very interesting results that demonstrate the effectiveness of the proposed approach, with better performance than the existing methods reported in this work. We sought to improve LBP performance by exploiting the difference between consecutive points, which makes the operator more robust to local transitions, a known drawback of the classic LBP.
· Few works in the literature have adopted LBP for ECG-based biometrics, so this contribution provides significant results and new perspectives in the field. Our results confirm the effectiveness of the local binary pattern in general, and we succeeded in producing an enhanced version that gives better performance of the ECG-based biometric system in both verification and identification modes.
· To the best of our knowledge, the fundamental properties of the ECG and their applications keep attracting particular attention to this field. Much research has started to focus on this new biometric technology, due to its reduced amount of data and, more importantly, to the fact that it can be used to monitor human health and remains more secure than other existing biometric methods.
· The proposed model was compared with state-of-the-art ML-based models. Furthermore, the feature extraction ability of 1D-LDP was compared with that of the 1D-LBP technique and better performance was achieved. To enhance the quality of the extracted features and mitigate over-fitting, the benefits of both deep models were combined for the classification task: one (the SAE) enhanced the 1D-LDP features and the other (the DBN) was trained on the SAE output.
It has been shown that this hybrid approach achieves higher accuracy than state-of-the-art ML-based models or the use of a DBN alone.
· Although the proposed method achieved good classification accuracy, it does have some limitations. First, a relatively small database (18 subjects) was used so far. Moreover, the proposed framework for discovering high-level feature representations still relies on hand-engineered features, which might not be suitable for real-time applications. To the best of our knowledge, most existing ECG-based biometric systems developed recently have been performed successfully on different databases; however, before applying the classification algorithms, the vast majority of current approaches similarly rely on hand-engineered feature extraction that is assumed to capture the pertinent information describing the heartbeat activity. Therefore, the effectiveness and precision of these classification approaches depend heavily on the quality of the extracted features and on the segmentation stage. To avoid this drawback, in future work we will explore extracting the ECG features in an unsupervised manner using deep learning, without a hand-engineered feature extraction stage. Finally, it should be emphasized that our study did not consider the presence of different physical and mental conditions. It is well known that these conditions are responsible for a person's heartbeat variability, which poses a major challenge to the generalization of ECG-based biometrics for confident and reliable use in real-world applications. This problem is worth investigating in our further research.

Conclusion
In this paper, a hybrid framework is presented to further improve the identification accuracy of ECG-based biometrics. The proposed model combines non-fiducial feature extraction with a deep learning model to obtain better and more robust performance. The contributions of this study are twofold: (1) 1D local binary pattern features are extracted and fed into the deep learning model, where over-fitting is significantly reduced by utilizing stacked autoencoders; (2) a stack of RBMs combining the contrastive divergence algorithm and fine-tuning is used to optimize the DBN network parameters. The hybrid framework is designed to cope with the non-stationary and quasi-periodic variability of ECG heart activity signals, which are usually considered the major challenges that evolving ECG-based biometric identification systems will face in the near future. As expected, the experimental and comparison results confirm the effectiveness and superiority of the proposed hybrid biometric system. Since our implementation was assessed only on a CPU, the overall algorithmic process could be parallelized on a GPU for use in real-time ECG biometric classification.

Declarations
· No funding was received to assist with the realization of this study.
· The authors have no conflicts of interest to declare that are relevant to the content of this article.
· The authors accept ethics responsibility.
· The authors consent to participate.
· The authors fully consent to the publication of this work.
· All data and materials presented in this paper are available.
· The code used in the experimental tests is available.
· All authors contributed to this research work.