A novel method based on Adaptive Periodic Segment Matrix and Singular Value Decomposition for removing EMG artifact in ECG signal

Background: Electrocardiogram (ECG) signals are widely used to detect and monitor human health. However, Electromyogram (EMG) artifact is often picked up during measurement, which makes correct diagnosis difficult for doctors. In general, the ECG signal is periodic, whereas the EMG artifact is non-stationary and overlaps with the ECG in the frequency domain. Given these characteristics, a periodic separation method is needed to extract the clean ECG signal from recordings contaminated by EMG artifact. Methods: A novel Adaptive Periodic Segment Matrix (APSM) based on Singular Value Decomposition (SVD) is proposed for extracting clean ECG from EMG artifact. First, a periodic segment estimation method is proposed that obtains an average periodic length and an RR intervals constraint from the envelope spectrum of the measured signal. Second, the R wave peaks of the ECG signal and their positions are detected using these quantities. The APSM, a rank-one matrix, is then formed from the R wave peaks and the calculated RR intervals constraint. Finally, SVD is applied to this matrix, and the ECG signal is reconstructed from the first (largest) singular value of the formed matrix. The proposed method is validated by applying the algorithm to ECG records from the MIT-BIH Arrhythmia Database. The zero-mean percent root-mean-square difference (PRD1), the cross-correlation coefficient and the output signal-to-noise ratio (SNR_output) are calculated to compare the performance of the algorithm with that of other methods. In addition, two heart disease cases are studied for P wave and ST segment detection in ECG contaminated by EMG artifact. Results: The proposed method achieves a significant improvement in output signal-to-noise ratio and percentage root-mean-square difference, and leads to a higher cross-correlation coefficient between the original (clean) ECG and the denoised ECG signal.
In addition, the reconstructed ECG signal follows the trend of the original (clean) ECG signal more closely under EMG noise. Conclusion: The proposed periodic segment estimation method can adaptively find the periodic length of the ECG signal by using the envelope spectrum. In addition, a stricter rank-one trajectory matrix is formed in APSM by using the R wave peaks and the RR intervals constraint. The results show that the proposed APSM-SVD method is effective for removing EMG artifact and extracting the clean ECG signal. The R peak, P wave, QRS complex and ST segment are preserved in the reconstructed ECG signal.

Among those methods, the wavelet transform requires a mother wavelet to be selected, and a suitable choice often depends on the type of ECG signal [25]. The EMD method suffers from mode mixing, and the EMG artifact is distributed over a number of intrinsic mode functions (IMFs). Among the existing denoising and separation approaches, singular value decomposition (SVD) can effectively separate the signal of interest from various noises [26]. A hybrid ECG compression method based on SVD and the discrete wavelet transform has been used to extract the ECG from mixed noise [27]. The ECG signal has a periodic component [28]. A method based on a periodic trajectory matrix and SVD has been applied to extract the fetal ECG from maternal ECG signals [29]. Later, the periodic segment matrix (PSM) was introduced and applied with SVD to detect and extract the periodic impulse component in vibration signals [30]. The embedding dimension of the periodic segment matrix can be determined from the singular value ratio (SVR) spectrum, and the effective rank of the matrix is equal to one [31]. In practice, however, the random disturbance of the EMG artifact makes the detected intervals between R wave peaks fluctuate. The ECG signal is then a pseudo-cyclostationary signal, and the traditional SVR spectrum can no longer determine the embedding dimension. Hence, an adaptive periodic segment matrix is needed to extract clean ECG from EMG artifact.
In this paper, an Adaptive Periodic Segment Matrix (APSM) based on Singular Value Decomposition is proposed to separate the EMG artifact from the ECG signal. The average periodic length is first calculated from the maximum of the envelope spectrum of the input signal, which also yields the RR intervals constraint for R wave peak selection. The R wave peaks of the ECG signal and their positions are then found using these quantities. Next, the Adaptive Periodic Segment Matrix is constructed as a trajectory matrix from the R peaks and the RR intervals constraint. Singular Value Decomposition is applied to this matrix to obtain the first (largest) singular value, from which the ECG signal is reconstructed.
This paper is organized as follows. The method section provides the theoretical background, introduces the Periodic Segment Matrix (PSM) and the proposed Adaptive Periodic Segment Matrix (APSM), and discusses the RR intervals constraint and the embedding dimension of the proposed matrix. The process of applying the proposed method to ECG is then presented. The result section validates the proposed method using the MIT-BIH Arrhythmia Database [32]. A comparative analysis of existing methods and the proposed method is also carried out in terms of the output signal-to-noise ratio (SNR_output), the zero-mean percent root-mean-square difference (PRD1), and the cross-correlation coefficient. Finally, the discussion section compares the time domain waveform of the signal reconstructed by the proposed method with those of other methods. Two heart disease cases are also analyzed with respect to the P wave and ST segment to show how the proposed method can support diagnosis.

Simulated signals
In this section, simulated signals for several different cases are used to evaluate and validate the proposed method. The simulated noisy signal combines a clean ECG with EMG artifact, as shown in Eq.1:

$$ s = s_{ECG} + s_{EMG} \tag{1} $$

where $s_{ECG}$ is the clean ECG signal taken from the MIT-BIH Arrhythmia Database [32] and $s_{EMG}$ is the EMG noise taken from the muscle (EMG) artifact record 'ma' of the MIT-BIH Noise Stress Test Database [37]. Figure 1 shows the constructed signals based on record 103 of the MIT-BIH Arrhythmia Database. Every data file in the database consists of two lead recordings sampled at 360 Hz with a resolution of 11 bits per sample. The simulation experiment is performed with the EMG noise.

Validation studies in EMG artifact cases
In this sub-section, the EMG artifact in ECG signals is discussed. The proposed APSM-SVD method is used to obtain the clean ECG signal from EMG artifact and is compared with EEMD, DWT, PSM-SVD and SSA, with both qualitative and quantitative evaluations. The quantitative evaluation is carried out at different input signal-to-noise ratios (SNR_input) using the output signal-to-noise ratio (SNR_output), the modified percentage root-mean-square difference PRD1 and the cross-correlation coefficient [38]. The output signal-to-noise ratio is defined as:

$$ \mathrm{SNR}_{output} = 10 \log_{10} \frac{\sum_{l} y[l]^2}{\sum_{l} \left( y[l] - \hat{y}[l] \right)^2} \tag{2} $$

where $y[l]$ and $\hat{y}[l]$ represent the original clean ECG signal and the reconstructed ECG signal, respectively. In most ECG compression algorithms, the percentage root-mean-square difference (PRD) measure is employed [13]. It can be defined as:

$$ \mathrm{PRD} = \sqrt{\frac{\sum_{n} \left( x(n) - \hat{x}(n) \right)^2}{\sum_{n} x(n)^2}} \times 100 \tag{3} $$

This error estimate is the one most commonly used in the scientific literature on ECG compression techniques. The clinical acceptability of the reconstructed signal should be as high as possible. The main drawbacks of PRD are its inability to cope with baseline fluctuations and its inability to discriminate between the diagnostic portions of an ECG curve. However, its simplicity and relative accuracy make it a popular error estimate among researchers. As the PRD depends heavily on the mean value of the signal, it is more appropriate to use a modified criterion. Therefore, PRD1 is used for accuracy evaluation.
PRD1 is defined as:

$$ \mathrm{PRD}_1 = \sqrt{\frac{\sum_{n} \left( x(n) - \hat{x}(n) \right)^2}{\sum_{n} \left( x(n) - \bar{x} \right)^2}} \times 100 \tag{4} $$

where $x(n)$ is the original signal, $\hat{x}(n)$ is the reconstructed signal, and $\bar{x}$ is the mean of the original signal. If the PRD1 value is between 0% and 9%, the quality of the reconstructed signal is either 'very good' or 'good' [38], whereas if the value is greater than 9%, its quality group cannot be determined. As we are strictly interested in very good and good reconstructions, the PRD1 value, as measured with Eq.4, must be less than 9%.

We compared the performance of the proposed method with three existing techniques, Ensemble Empirical Mode Decomposition (EEMD), Discrete Wavelet Transform (DWT) and Singular Spectrum Analysis (SSA), as well as with PSM-SVD. In the EEMD method, the noisy ECG signal is decomposed into intrinsic mode functions (IMFs) and the last three IMFs are discarded. DWT uses filter banks to construct the multi-resolution analysis with relatively low computation time. The SSA procedure is usually divided into the following steps: embedding, singular value decomposition, grouping and reconstruction. The PSM-SVD method is similar to the proposed APSM-SVD, but its periodic segment is fixed.

It is evident from Table 1 and Fig.2 that the SNR_output obtained by every method is larger than the SNR_input. The proposed method has a higher output signal-to-noise ratio than the other methods at the same input level. As SNR_input increases, the obtained SNR_output also increases, and APSM-SVD still performs better than the other methods. The cross-correlation coefficient with respect to the clean ECG is also calculated and shown in Table 2. The cross-correlation coefficient of the noisy input signal is smaller when SNR_input is lower, which is consistent with the EMG noise being larger at lower signal-to-noise ratios. Nevertheless, the proposed method still achieves a better cross-correlation coefficient than the other methods.
The PRD1 values under different SNR_input levels have also been calculated and are shown in Table 3. A PRD1 in the range 0%-9% indicates a good reconstruction. In particular, when SNR_input is -20 dB, the PRD1 exceeds 9% for EEMD, DWT, SSA and PSM-SVD, whereas APSM-SVD achieves 6.0375%. At the other SNR_input levels, APSM-SVD also has a lower PRD1 than the other methods. Therefore, the proposed APSM-SVD is a promising method for separating clean ECG from EMG artifact. Different records may show different denoising performance, so the PRD1 values for different MIT-BIH records have been calculated and are shown in Table 4. The lower the PRD1, the better the performance, and the 0%-9% range for good reconstructions applies here as well. The results show that APSM-SVD achieves better performance than the other methods across the records.
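The evaluation measures used in this sub-section can be sketched in a few lines of NumPy. The following is a minimal illustration, not the authors' implementation. The function names are hypothetical, the noise scaling in `mix_at_snr` is an assumed way of reaching a target SNR_input (the text only states that clean ECG and EMG noise are added), and the cross-correlation coefficient is assumed to be the zero-lag Pearson coefficient. A sinusoid and white noise stand in for the database records.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Add noise to clean, scaled so the mixture has the requested input SNR (dB).
    The scaling rule is an assumption made for this sketch."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    gain = np.sqrt(np.sum(clean**2) / (np.sum(noise**2) * 10 ** (snr_db / 10)))
    return clean + gain * noise

def snr_output(y, y_hat):
    """Output SNR in dB between the clean signal y and the reconstruction y_hat."""
    y, y_hat = np.asarray(y, dtype=float), np.asarray(y_hat, dtype=float)
    return 10 * np.log10(np.sum(y**2) / np.sum((y - y_hat) ** 2))

def prd(x, x_hat):
    """Percentage root-mean-square difference."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return 100 * np.sqrt(np.sum((x - x_hat) ** 2) / np.sum(x**2))

def prd1(x, x_hat):
    """Zero-mean PRD: the denominator uses the mean-removed original signal."""
    x, x_hat = np.asarray(x, dtype=float), np.asarray(x_hat, dtype=float)
    return 100 * np.sqrt(np.sum((x - x_hat) ** 2) / np.sum((x - x.mean()) ** 2))

def cross_correlation(x, x_hat):
    """Zero-lag Pearson correlation coefficient (assumed definition)."""
    return np.corrcoef(x, x_hat)[0, 1]

# Stand-in data: a sinusoid with an offset instead of a clean ECG record,
# and white noise instead of the 'ma' EMG record.
rng = np.random.default_rng(0)
n = np.arange(3600)
clean = 0.5 + np.sin(2 * np.pi * n / 300)
noise = rng.standard_normal(n.size)
noisy = mix_at_snr(clean, noise, snr_db=-5)

print(snr_output(clean, noisy), prd(clean, noisy), prd1(clean, noisy),
      cross_correlation(clean, noisy))
```

Because the stand-in signal has a non-zero mean, PRD1 comes out larger than PRD for the same error, which illustrates why PRD can look optimistic when a baseline offset is present.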

Discussion
To better show the effect of noise reduction, the time domain waveforms after denoising and separation by the proposed method and the other methods are shown in Fig.3. The signals reconstructed by the EEMD, DWT, SSA and PSM-SVD methods still contain some EMG artifact compared with the clean ECG signal of record 103. In contrast, the ECG signal reconstructed by the proposed APSM-SVD method follows the clean ECG signal closely. In many cases, the R peak of the original (clean) ECG needs to be recognized along with the P wave, T wave, QRS complex and ST segment for correct heart disease identification. It can be seen from Fig.4 (a) and (b) that all the methods can identify the R peak. There is a small phase shift in the DWT results. Owing to the added EMG artifact, the noisy ECG signal has a baseline shift relative to the clean ECG signal. As a result, the baselines of the DWT and SSA reconstructions resemble that of the noisy ECG signal and do not recover the actual baseline of the clean ECG signal. The PSM-SVD reconstruction also has a phase shift compared with APSM-SVD, but it handles the baseline shift better than the other methods. The EEMD method recovers the baseline of the clean ECG fairly well, but it still misidentifies the P wave and T wave. The proposed APSM-SVD follows the original (clean) ECG signal more closely under EMG noise, and the R peak, QRS complex and ST segment are detected correctly in comparison with the other methods.
Generally speaking, the R peak and QRS complex of the ECG signal are periodic, whereas the EMG signal is random with no periodicity. Therefore, the rank-one Adaptive Periodic Segment Matrix is formed from periodic segments of the noisy ECG signal, and singular value decomposition is then applied to reconstruct the clean ECG signal. The left singular vector is a single waveform of the QRS complex and the right singular vector holds the coefficients, from which a new ECG signal is reconstructed. By comparison, EEMD cannot separate two signals that occupy similar frequency bands, and the result of the DWT method depends on the similarity between the wavelet basis and the target signal, which is often difficult to achieve in practice. The quality of SSA depends on the segment length of the trajectory matrix, and it is difficult to determine a suitable segment length without prior knowledge. The core innovation of this paper is a suitable period segment estimation method that ensures the formed trajectory matrix is a strict rank-one matrix, using the average period length from the envelope spectrum and the RR intervals constraint in R wave peak selection. The Periodic Segment Matrix also has an embedding dimension, but it cannot be changed as the length of the signal increases. The proposed APSM-SVD can separate the EMG artifact from the ECG signal, whereas the traditional methods cannot.

The application in heart disease identification

Left bundle branch block (LBBB) case studies
The previous section validated the proposed method. This section discusses its application to heart disease identification, starting with an LBBB case. LBBB is a cardiac abnormality that is mainly caused by a delay in activation of the left ventricle [39]. ECG recordings of patients suffering from LBBB have the following characteristics: (1) the QRS duration is greater than 120 ms; (2) the lead V1 signal shows a slurring of the QRS with an initial R wave; (3) the ST segment is displaced; (4) the direction of the T wave is opposite to that of the R wave. Record 214 of the MIT-BIH Arrhythmia Database contains an ECG signal with LBBB. The identification is shown in Fig.5. Compared with the noisy ECG, the proposed APSM-SVD removes the EMG artifact and reconstructs the target ECG, while the signal reconstructed by PSM-SVD still contains EMG artifact. With APSM-SVD, the R waves can be detected and match those of the clean ECG. The P wave is difficult to identify in the noisy ECG signal and in the signal reconstructed by PSM-SVD. This may generate a false alarm for the heart abnormality known as atrial fibrillation [40].

ST-elevation myocardial infarction (STEMI) is a type of heart attack in which a coronary artery is completely blocked by a blood clot. Heart muscle that receives oxygen from that coronary artery begins to die. A highly elevated ST segment indicates the amount of heart muscle damage. EMG noise is known to affect the ST segment. However, for the identification of diseases such as STEMI, it is necessary to preserve the ST segment in the denoised signal. To test the efficacy of the proposed method, we use Record 231, which has a straight elevation (characteristic of STEMI) in the ST segment (Fig.6). The noisy ECG signal is created by adding EMG noise to the clean ECG signal, and the reconstructed signals are obtained by the proposed APSM-SVD and by PSM-SVD.
It can be seen in Fig.6 that, although the R wave peaks are detected correctly in all signals, the EMG artifact in the noisy ECG seriously affects recognition of the ST elevation. After denoising with the proposed method, the ST segment is preserved, whereas that in PSM-SVD is not clear. Therefore, the proposed method can be used to raise the alarm for atrial myocardial infarction.

Conclusion
Real-time ECG signals often suffer from EMG artifact, which needs to be removed before a doctor can use the ECG signal for analysis. The ECG signal has periodic components, whereas the EMG is random in nature. Therefore, a denoising method based on periodic segments can extract the clean ECG signal. However, the choice of periodic segments in the measured ECG signal affects the quality of the reconstruction, and the periodic segments change with the type of ECG. In this paper, a new periodic segment estimation method and an Adaptive Periodic Segment Matrix based on SVD have been proposed to extract the clean ECG signal from a noisy ECG signal with EMG artifact. In contrast to the Periodic Segment Matrix, the envelope spectrum is used to calculate the average period length. An RR intervals constraint is then set for selecting the R wave peaks, and the embedding dimension is adaptively selected from it. Furthermore, the adaptive periodic segment matrix is constructed by applying the RR intervals constraint from each R wave peak to both sides, so as to ensure that a stricter rank-one matrix is formed for different ECG signals. The proposed method has been validated and compared with other methods. The performance comparison shows that the proposed method provides a significant improvement in output signal-to-noise ratio and percentage root-mean-square difference, and leads to a higher cross-correlation coefficient between the original (clean) ECG and the denoised ECG signal. The time domain waveforms after denoising have also been compared. The signal reconstructed by the proposed APSM-SVD follows the clean ECG signal, and the P wave, QRS complex, ST segment and R wave are preserved. EMG artifact affects the ST segment and the small-amplitude waves, i.e. the P wave, of the clean diseased (original) ECG signal.
Thus, two heart disease cases (LBBB and STEMI) have also been processed with the PSM-SVD and APSM-SVD methods. The results show that the P waves and the ST segment are preserved by APSM-SVD. Therefore, the Adaptive Periodic Segment Matrix can be used to form the trajectory matrix, to which SVD is applied to reconstruct the clean ECG while preserving the R peak, P wave, T wave and ST segment for correct diagnosis. We have also demonstrated that the proposed APSM-SVD method is able to avoid the effect of EMG noise on these features. Therefore, the proposed APSM-SVD can meet the requirements of extracting clean ECG from noisy ECG signals with EMG artifact. In future work, it should be noted that under the measurement conditions of wearable devices the amplitude of the EMG signal may far exceed that of the ECG signal, so that the R wave peaks cannot be accurately identified. How to extract the clean signal with the proposed method in this situation needs further research and discussion.

Periodic Segment Matrix
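Many methods can be used to construct the trajectory matrix. One of the best known is the Hankel matrix [33] [34]. However, the Hankel matrix is unsuitable for strengthening periodic impulse responses. Therefore, a novel trajectory matrix, the Periodic Segment Matrix (PSM), which has no accumulative error [29], is used as the trajectory matrix for SVD. The trajectory matrix with PSM properties can be expressed as:

$$ Y = \begin{bmatrix} s(c_1+1) & s(c_1+2) & \cdots & s(c_1+l) \\ s(c_2+1) & s(c_2+2) & \cdots & s(c_2+l) \\ \vdots & \vdots & & \vdots \\ s(c_a+1) & s(c_a+2) & \cdots & s(c_a+l) \end{bmatrix} \tag{5} $$

where $s$ is any periodic impulse component, $a$ is the number of periods, $l$ is the embedding dimension, $c_i = \langle (i-1)p \rangle$ and $c_a + l \le N$, where $N$ is the signal length, $\langle \cdot \rangle$ is a rounding operator, $p$ is $h$ times $T$, and $h \in \mathbb{N}^*$.
$T$ is the period length of the periodic impulse component and can be determined from the singular value ratio (SVR) spectrum [31]. Peaks at higher multiples of this length must be monitored. Therefore, the embedding dimension $l$ can be expressed as

$$ l = \langle hT \rangle \tag{6} $$

Naturally, the row number $a$ of $Y$ is determined as

$$ a = \arg\max_{a} \left\{ \langle (a-1)hT \rangle + \langle hT \rangle \le N \right\} \tag{7} $$

$h$ can be obtained by maximizing the rank of the matrix $Y$, i.e.,

$$ h = \arg\max_{h} \operatorname{rank}(Y) \tag{8} $$

where $\operatorname{rank}(Y)$ is the rank of the matrix $Y$ and is equal to $\min(a, l)$. Because the rank for a pure periodic signal is equal to 1, the trajectory matrix $Y$ can be reconstructed by using the first (largest) singular value, i.e.

$$ \hat{Y} = \sigma_1 u_1 v_1^{T} \tag{9} $$

where $\sigma_1$ is the largest singular value of $Y$, and $u_1$ and $v_1$ are the corresponding left and right singular vectors.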
Finally, the periodic impulse component is extracted from $\hat{Y}$ by inverting the construction of $Y$ in Eq.5. Generally speaking, the periodic segment matrix can separate strictly periodic signals, which can be extracted by SVD because the rank of the trajectory matrix equals one. However, the periodic segment in PSM may not be appropriate for noisy ECG with EMG artifact, because the peaks in the SVR spectrum cannot be found. As a result, the trajectory matrix is no longer a rank-one matrix, and the first singular value will not preserve the P wave, QRS complex, T wave and ST segment of the ECG component. Therefore, the ECG signal will fail to be reconstructed.
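The PSM construction and rank-one reconstruction described above can be illustrated with a small NumPy sketch. It assumes that the period T is already known (the SVR spectrum step is not reproduced), and the helper names `build_psm`, `rank_one` and `unfold` are hypothetical. Python's `round` is used as the rounding operator, and a Gaussian pulse train stands in for the periodic impulse component.

```python
import numpy as np

def build_psm(s, period, h=1):
    """Stack consecutive segments of length l = round(h * period) as rows."""
    s = np.asarray(s, dtype=float)
    l = int(round(h * period))
    starts = []
    i = 0
    # Segment i starts at c_i = round(i * h * period); stop when it no longer fits.
    while int(round(i * h * period)) + l <= s.size:
        starts.append(int(round(i * h * period)))
        i += 1
    Y = np.stack([s[c:c + l] for c in starts])
    return Y, np.array(starts)

def rank_one(Y):
    """Keep only the first (largest) singular component of Y."""
    U, sv, Vt = np.linalg.svd(Y, full_matrices=False)
    return sv[0] * np.outer(U[:, 0], Vt[0]), sv

def unfold(Y_hat, starts, N):
    """Inverse of build_psm: write every row back to its segment position."""
    out = np.zeros(N)
    l = Y_hat.shape[1]
    for row, c in zip(Y_hat, starts):
        out[c:c + l] = row
    return out

# Stand-in data: a strictly periodic pulse train with period T = 40 samples.
rng = np.random.default_rng(0)
T = 40
template = np.exp(-(((np.arange(T) - 20) / 3.0) ** 2))
clean = np.tile(template, 10)
noisy = clean + 0.2 * rng.standard_normal(clean.size)

Y, starts = build_psm(noisy, T)
Y_hat, sv = rank_one(Y)
rec = unfold(Y_hat, starts, noisy.size)

print("error before:", np.sum((noisy - clean) ** 2))
print("error after: ", np.sum((rec - clean) ** 2))
```

For a strictly periodic signal every row of the matrix is the same waveform, so a single singular component captures it. Once the period drifts, as with noisy ECG, the rows no longer line up and this property is lost.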

Adaptive Periodic Segment Matrix
As mentioned above, a clean and periodic ECG signal can be detected by using the Periodic Segment Matrix. Once EMG artifact is added, however, the embedding dimension and periodic segment in PSM are no longer suitable, even when a particular embedding dimension is used. In other words, the rank of the matrix is no longer equal to one, which affects the signal recovery after SVD. Therefore, to address this shortcoming of PSM and meet the requirement of extracting clean ECG from EMG noise, a new matrix named the Adaptive Periodic Segment Matrix is proposed to form a strict rank-one matrix. In the ECG research field, scholars usually pay most attention to the P wave, R wave, ST segment and QRS complex. Accordingly, the R wave peaks are found and pursued in the proposed method, and the embedding dimension of the matrix is selected based on the RR intervals constraint. These steps choose a suitable periodic segment for the ECG signal so that the rank of the trajectory matrix is strictly equal to one, and they guarantee the correct P wave, R wave, ST segment and QRS complex in the reconstructed ECG signal.

Peak pursuit
In the ECG signal, the R wave peaks need to be recognized. However, because of the interference of EMG noise, directly searching for R wave peaks will also pick up other peaks that do not belong to the R wave, so a suitable RR intervals constraint needs to be set when pursuing the correct R wave peaks. If the selected interval is too large, the R wave peaks will not be fully recognized. Conversely, if the interval is too small, peaks that do not belong to the R wave will be wrongly selected. Therefore, the RR intervals constraint is assumed to satisfy Eq.10, where Z is the selected RR intervals constraint. In reality, the maximum and minimum RR intervals are not clear because of the noise, which makes it difficult to choose Z. The fundamental frequency of the envelope spectrum reflects the average period of the main components of the signal. Hence, the position corresponding to the maximum value of the envelope spectrum can be taken as mean(RR intervals). In addition, the maximum and minimum RR intervals differ little in normal ECG signals, so both can be approximated by mean(RR intervals), and Eq.10 can be rewritten in terms of this mean, where mean(·) is the mean value of the RR intervals. Therefore, the RR intervals constraint Z can be rewritten as:

$$ Z = \left\lceil \alpha \cdot \mathrm{mean}(RR_{intervals}) \right\rceil \tag{12} $$

In this paper, we choose $\alpha = 2/3$ as the coefficient for Z. Hence, the R wave peaks can be found as $[R_1, \ldots, R_n]$ with positions $[X_1, \ldots, X_n]$.
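The peak pursuit step can be sketched as follows. This is an illustrative reading of the description above rather than the authors' code. Several choices are assumptions: the envelope is taken as the magnitude of the FFT-based analytic signal, the envelope spectrum is searched only between 0.5 Hz and 3 Hz, peaks are kept greedily from the highest downward subject to a minimum spacing of Z, and a relative height threshold (`min_rel_height`) is added to suppress noise peaks near the record edges. All function names are hypothetical, and a jittered Gaussian pulse train stands in for the ECG.

```python
import numpy as np

def envelope(x):
    """Magnitude of the analytic signal, computed with an FFT-based Hilbert transform."""
    x = np.asarray(x, dtype=float)
    N = x.size
    h = np.zeros(N)
    h[0] = 1.0
    if N % 2 == 0:
        h[N // 2] = 1.0
        h[1:N // 2] = 2.0
    else:
        h[1:(N + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

def mean_rr_from_envelope_spectrum(x, fs, f_min=0.5, f_max=3.0):
    """Average RR interval in samples, from the largest envelope-spectrum line.
    The search band [f_min, f_max] is an assumption of this sketch."""
    env = envelope(x)
    spec = np.abs(np.fft.rfft(env - env.mean()))
    freqs = np.fft.rfftfreq(env.size, d=1.0 / fs)
    band = (freqs >= f_min) & (freqs <= f_max)
    f0 = freqs[band][np.argmax(spec[band])]
    return fs / f0

def pursue_peaks(x, Z, min_rel_height=0.5):
    """Keep local maxima from highest to lowest, at least Z samples apart."""
    x = np.asarray(x, dtype=float)
    is_max = (x[1:-1] > x[:-2]) & (x[1:-1] >= x[2:])
    cand = np.flatnonzero(is_max) + 1
    cand = cand[x[cand] >= min_rel_height * x.max()]
    kept = []
    for i in cand[np.argsort(x[cand])[::-1]]:
        if all(abs(i - k) >= Z for k in kept):
            kept.append(int(i))
    return np.array(sorted(kept))

# Stand-in data: pulses with slightly varying RR intervals at fs = 360 Hz.
fs = 360
intervals = np.array([290, 310, 300, 295, 305, 300, 290, 310, 300, 305, 295])
true_peaks = 150 + np.concatenate(([0], np.cumsum(intervals)))
n = np.arange(3600)
rng = np.random.default_rng(0)
clean = sum(np.exp(-0.5 * ((n - p) / 10.0) ** 2) for p in true_peaks)
noisy = clean + 0.05 * rng.standard_normal(n.size)

mean_rr = mean_rr_from_envelope_spectrum(noisy, fs)
Z = int(np.ceil(2 / 3 * mean_rr))      # alpha = 2/3 as in the text
peaks = pursue_peaks(noisy, Z)
print(mean_rr, Z, peaks)
```

With a coefficient above 1/2, no sample lying between two neighbouring R peaks can be at least Z away from both of them, which is what lets the spacing constraint reject in-between noise peaks in this greedy scheme.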

Trajectory matrix construction
To ensure that the trajectory matrix is a strict rank-one matrix, a position matrix B is first built from the R wave peak positions (Eq.13), where $Z = \lceil \frac{2}{3} \cdot \mathrm{mean}(RR_{intervals}) \rceil$. The trajectory matrix $Y_{apsm}$ is then formed from the input signal at the positions given by B (Eq.14), where $s$ is the input signal and $n$ is the number of R wave peaks. Two points should be noted when selecting the embedding dimension:
(1) If the embedding dimension is smaller than any R wave peak interval, the reconstructed periodic ECG component may not cover all QRS complexes, P waves and ST segments, so features of the ECG signal will be lost.
(2) If the embedding dimension is larger than any R wave peak interval, the two ends of the reconstructed ECG signal will exceed the length of the input signal, so the ECG components either cannot be reconstructed or are reconstructed falsely.
Therefore, the embedding dimension of the proposed matrix is selected as twice the length of Z to satisfy both requirements above.
For the periodic segment matrix, as the signal length increases and the signal stability decreases, every periodic component in the trajectory matrix shifts and the periodic segment pattern breaks, so a rank-one matrix is no longer formed. In contrast, the trajectory matrix in APSM is traversed from each R wave peak to both sides, so the constructed matrix is a stricter rank-one matrix. The trajectory matrices of APSM (Eq.14) and PSM (Eq.5) are compared in Fig.7. It can be seen that the periodic components detected in PSM are shifted between rows. As a result, the trajectory matrix formed in PSM is not a strict rank-one matrix, and the first (largest) singular value recovers a false ECG signal.
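The difference between the two trajectory matrices can be made concrete with a short NumPy sketch. The exact column indexing of the position matrix B is not recoverable from the text, so the sketch assumes rows that run from Z samples before each R peak to Z - 1 samples after it (2Z columns), and it drops peaks whose window would leave the signal. The names `build_apsm`, `build_psm_fixed` and `first_sv_energy` are hypothetical, and a clean jittered pulse train stands in for the ECG.

```python
import numpy as np

def build_apsm(s, peaks, Z):
    """Rows are 2Z-sample windows centred on the R peaks (assumed layout of B)."""
    s = np.asarray(s, dtype=float)
    peaks = np.asarray(peaks, dtype=int)
    inside = (peaks - Z >= 0) & (peaks + Z <= s.size)
    kept = peaks[inside]
    B = kept[:, None] + np.arange(-Z, Z)[None, :]   # position matrix
    return s[B], B

def build_psm_fixed(s, period):
    """Fixed-period segmentation, as in PSM with an integer period."""
    s = np.asarray(s, dtype=float)
    a = s.size // period
    return s[: a * period].reshape(a, period)

def first_sv_energy(Y):
    """Share of the matrix energy carried by the first singular value."""
    sv = np.linalg.svd(Y, compute_uv=False)
    return sv[0] ** 2 / np.sum(sv**2)

# Stand-in data: the RR intervals fluctuate around 300 samples.
intervals = np.array([290, 310, 300, 295, 305, 300, 290, 310, 300, 305, 295])
peaks = 150 + np.concatenate(([0], np.cumsum(intervals)))
n = np.arange(3600)
clean = sum(np.exp(-0.5 * ((n - p) / 10.0) ** 2) for p in peaks)

Z = int(np.ceil(2 / 3 * intervals.mean()))          # Z = 200
Y_apsm, B = build_apsm(clean, peaks, Z)
Y_psm = build_psm_fixed(clean, 300)

energy_apsm = first_sv_energy(Y_apsm)
energy_psm = first_sv_energy(Y_psm)
print("APSM:", energy_apsm, "PSM:", energy_psm)
```

In this toy case the peak-aligned rows are identical, so almost all energy sits in the first singular value, while the fixed-period rows contain shifted pulses and spread their energy over several singular values.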

Matrix Reconstruction and Signal Recovery
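Similar to the periodic segment matrix reconstruction, the rank of the trajectory matrix $Y_{apsm}$ is equal to 1, so $Y_{apsm}$ can be reconstructed by using the first (largest) singular value, i.e.

$$ \hat{Y}_{apsm} = \sigma_1 u_1 v_1^{T} \tag{15} $$

where $\sigma_1$ is the largest singular value of $Y_{apsm}$, and $u_1$ and $v_1$ are the corresponding left and right singular vectors.
Finally, the reconstructed signal $\hat{s}$ is extracted by the inverse process of $\hat{Y}_{apsm}$ in Eq.14.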
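A minimal sketch of this reconstruction and recovery step is given below. The text does not say how samples covered by two overlapping windows are handled in the inverse process, so averaging them is an assumption of this sketch, as is leaving uncovered samples at zero. The window layout (Z samples before each peak to Z - 1 after it) and the names `rank_one_approx` and `recover_signal` are likewise hypothetical, and a noisy Gaussian pulse train stands in for the ECG with EMG artifact.

```python
import numpy as np

def rank_one_approx(Y):
    """First (largest) singular component: sigma_1 * u_1 * v_1^T."""
    U, sv, Vt = np.linalg.svd(Y, full_matrices=False)
    return sv[0] * np.outer(U[:, 0], Vt[0])

def recover_signal(Y_hat, B, N):
    """Write the rows of Y_hat back to the sample positions in B.
    Samples covered by several rows are averaged (an assumption)."""
    acc = np.zeros(N)
    cnt = np.zeros(N)
    np.add.at(acc, B, Y_hat)
    np.add.at(cnt, B, 1)
    covered = cnt > 0
    out = np.zeros(N)
    out[covered] = acc[covered] / cnt[covered]
    return out, covered

# Stand-in data: R-peak-like pulses with fluctuating spacing plus white noise.
rng = np.random.default_rng(0)
peaks = np.array([440, 750, 1050, 1345, 1650, 1950, 2240, 2550, 2850, 3155])
n = np.arange(3600)
clean = sum(np.exp(-0.5 * ((n - p) / 10.0) ** 2) for p in peaks)
noisy = clean + 0.1 * rng.standard_normal(n.size)

Z = 200
B = peaks[:, None] + np.arange(-Z, Z)[None, :]      # assumed position matrix
Y_hat = rank_one_approx(noisy[B])
s_hat, covered = recover_signal(Y_hat, B, noisy.size)

err_noisy = np.sum((noisy[covered] - clean[covered]) ** 2)
err_rec = np.sum((s_hat[covered] - clean[covered]) ** 2)
print("error before:", err_noisy, "error after:", err_rec)
```

Averaging is only one possible way to resolve overlaps. Keeping the row whose peak is nearest to the sample would be an equally plausible reading of the inverse process.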

The process of the Adaptive Periodic Segment Matrix based on SVD
The whole process is shown in Fig.8. First, the measured signal is pre-processed. The envelope spectrum of the input signal is calculated to select the average period length. Then the R wave peaks and their position coordinates are found by using the RR intervals constraint. After this, the trajectory matrix $Y_{apsm}$ is formed based on matrix B in Eq.13, and SVD is applied to it, the rank of the matrix being equal to one. The signal is then recovered from the first (largest) singular value.
(2) Use the RR intervals constraint for R wave peak selection.
(3) Form the position matrix B based on the R wave peaks and the RR intervals constraint.
(4) The trajectory matrix can be a strict rank-one matrix due to B.
(5) The first (largest) singular value is used for signal recovery.