In this section, the results obtained from the denoising of artificial and real noisy ECG signals using the proposed approach will be discussed. In addition, the effects of different levels of Gaussian and ectopic noise on HRV features will be analyzed.

Validation of the pre-processing tool

Figure 1 shows the performance of the proposed approach when denoising ECGgau with 2 dB of added Gaussian noise. In the time-frequency domain, the partially reconstructed denoised ECG (ECGden) shows high visual quality and accuracy, with a significant reduction in noise. The spectrograms in Fig. 1 demonstrate that the energy profile of the reconstructed ECG signal, specifically that of the QRS complex, is preserved after denoising with the CEEMDAN-WD method.

The performance indicators for denoising ECGgau with the proposed method are presented in Fig. 2. High mean correlation coefficient values indicate good reconstruction of ECGden in comparison to the original ECG. The correlation coefficient values increase as the SNR level increases (that is, as the Gaussian noise decreases). The accuracy of the designed filter is further confirmed by the small RMSE values seen in Fig. 2b. The trendline shows that the RMSE values decrease as the level of Gaussian noise decreases (higher SNR), indicating good overall performance of the proposed approach.
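The two performance indicators used above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names `pearson_r` and `rmse` and the toy signals are assumptions made for this example, not the authors' implementation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def rmse(x, y):
    """Root mean square error between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Toy stand-ins for the original ECG and ECGden (hypothetical data):
# a sine wave and a copy with a small alternating residual error.
clean = [math.sin(2 * math.pi * i / 50) for i in range(200)]
denoised = [v + 0.01 * (-1) ** i for i, v in enumerate(clean)]

print(f"correlation = {pearson_r(clean, denoised):.4f}")
print(f"RMSE        = {rmse(clean, denoised):.4f}")
```

A correlation close to 1 together with a small RMSE corresponds to the "good reconstruction" case described for higher SNR levels.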

A comparative analysis, in terms of RMSE, of the CEEMDAN-WD filtering and several EMD-domain denoising techniques was performed using real ECG signals. Three ECG recordings from the MIT-BIH Arrhythmia database, '111', '116', and '205', were embedded with 10 dB Gaussian noise and filtered using CEEMDAN-WD. The RMSE values for the EMD and EEMD based direct subtraction (EMD-DS, EEMD-DS) [52], EMD with Kullback-Leibler divergence (EMD-KLD) [53], and CEEMDAN with interval thresholding and higher order statistics (CEEMDAN-HIT) for the same ECG recordings were taken from the literature [54]. The performance of each method is shown in Table 1. As seen from this table, the CEEMDAN-WD method gives the lowest RMSE values for all three recordings.
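Embedding Gaussian noise at a prescribed SNR, as done for the recordings above, can be sketched as follows. This is an illustrative example only: it assumes that signal power is the mean of the squared samples and that the noise is zero-mean white Gaussian noise, which may differ from the exact procedure used in the study. The helper names `add_gaussian_noise` and `measured_snr_db` are hypothetical.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with white Gaussian noise at `snr_db` dB.

    Assumption: SNR = 10 * log10(P_signal / P_noise), with power taken
    as the mean of the squared samples.
    """
    rng = random.Random(seed)
    signal_power = sum(v * v for v in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, sigma) for v in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of `noisy` relative to `clean`."""
    p_signal = sum(c * c for c in clean) / len(clean)
    p_noise = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(p_signal / p_noise)


# Toy signal standing in for an ECG recording (hypothetical data).
clean = [math.sin(2 * math.pi * i / 100) for i in range(5000)]
noisy = add_gaussian_noise(clean, snr_db=10, seed=0)
print(f"measured SNR: {measured_snr_db(clean, noisy):.2f} dB")
```

Because the noise is random, the measured SNR only approximates the target value; the agreement improves with longer signals.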

Table 1

RMSE performance values for comparative analysis of the proposed CEEMDAN-WD method versus the previously developed methods EMD-DS, EEMD-DS, EMD-KLD and CEEMDAN-HIT

| ECG Recording | EMD-DS | EEMD-DS | EMD-KLD | CEEMDAN-HIT | CEEMDAN-WD |
|---|---|---|---|---|---|
| 111 | 7.02 | 8.38 | 14.26 | 4.94 | 0.0439 |
| 116 | 11.86 | 10.91 | 11.86 | 8.55 | 0.1652 |
| 205 | 8.19 | 13.87 | 8.19 | 4.96 | 0.0732 |

The second type of noise, ectopic beats, was processed through two different filtering algorithms and their combinations to evaluate the robustness and performance of each approach. Figure 3 illustrates the denoising results of ECGect with 2% of added ectopic noise for each approach. As the figure shows, the filter combinations, SDROM-ADF and ADF-SDROM, outperform the individual algorithms, which fail to account for all the ectopic beats in the signal. Between the two filter combinations, however, SDROM-ADF shows better visual performance in removing the ectopic noise.

*Figure 3: Comparison of different ectopic filters. (a) Artificial RR interval with 2% of ectopic beats, (b) denoised RR interval by SDROM, (c) denoised RR interval by ADF, (d) denoised RR interval by combination filter SDROM-ADF, (e) denoised RR interval by combination filter ADF-SDROM.*

The performance of each approach is quantified by the mean correlation coefficient and RMSE values. Figure 4 shows the correlation coefficient and RMSE values for all four methods. SDROM has the lowest correlation coefficients for all noise levels, whereas the ADF filter and the combination filter ADF-SDROM show similar results at low levels of ectopic noise. However, the ADF-SDROM filter performs better than the ADF filter at higher levels of noise, as seen in Fig. 4a. The SDROM-ADF method has the highest correlation coefficient values for all levels of noise and shows the best performance.

Figure 4b illustrates the RMSE values for all four filters. The SDROM-ADF filter has the lowest RMSE values for all levels of noise, and the SDROM filter has the highest RMSE values for all percentages of added ectopic beats. The ADF and ADF-SDROM filters perform similarly at lower levels of noise, with ADF-SDROM performing better at higher percentages of added ectopic beats.

Therefore, a two-step denoising method was applied: the ECG signal is first denoised with the CEEMDAN-WD algorithm to remove Gaussian noise, and the SDROM-ADF filter is then applied to remove ectopic noise. The complete denoising algorithm was applied to 40 artificial ECG signals with embedded Gaussian noise of 6 dB and 6% of added ectopic beats. The proposed method showed good performance, with a mean correlation coefficient of 0.846 ± 0.114 (mean ± SD) and an RMSE of $7.69 \times 10^{-5} \pm 4.86 \times 10^{-4}$.

Figure 5 illustrates the denoising approach applied to one ECG signal from each of the real ECG signal databases selected above. Figures 5a to 5c show the denoising of the '16539', '119', and '52' ECG recordings from the MIT-BIH Normal Sinus Rhythm, the MIT-BIH Arrhythmia, and the Sudden Cardiac Death databases, respectively. These figures show that the denoising performance on real ECG signals is comparable to that on the artificial signals. The CEEMDAN-WD successfully removes the Gaussian noise without altering the original ECG signal, and the SDROM-ADF approach removes most of the ectopic noise seen in the HRV.

Effects of noise type on HRV measures

Twenty-four HRV features were extracted from the RRIart, RRIgau and RRIect files. Time-domain features, which define the variability of interbeat intervals, include the average of the N-to-N intervals (AVNN), the standard deviation of the N-to-N intervals (SDNN), the root mean square of differences between successive N-to-N intervals (RMSSD), the percentage of successive N-to-N intervals that differ by more than 50 ms (pNN50), and the standard error of the mean N-to-N interval (SEM). Frequency-domain features measure the distribution of power into the following discrete frequency bands: high-frequency (HF, 0.15–0.4 Hz), low-frequency (LF, 0.04–0.15 Hz), and very low-frequency (VLF, 0.0033–0.04 Hz). The frequency-domain features calculated were total power (combined power in all three bands), VLF power, LF power, HF power, normalized VLF power (VLF Norm), normalized LF power (LF Norm), normalized HF power (HF Norm), the ratio between LF and HF power (LF to HF), LF peak and HF peak [4,23]. Nonlinear measures reflect the complexity and unpredictability of the HRV signal. These measures included SD1 (standard deviation of N-to-N intervals along the perpendicular to the line of identity), SD2 (standard deviation of N-to-N intervals along the line of identity), alpha1 (low scale slope of detrended fluctuation analysis), alpha2 (high scale slope of detrended fluctuation analysis) and sample entropy [55,56]. Fragmentation features include PIP (percentage of inflection points in the N-to-N interval), IALS (inverse average length of segments), PSS (percentage of short segments), and PAS (percentage of alternation segments) [57].

*Figure 6: Heat maps of $-\log_{10}$-transformed p-values for twenty-four of the most common HRV measures for real and artificial signals. (a) Heat map of $-\log_{10}$(p-value) of original clean and noisy artificial HRV measures for different levels of ectopic and Gaussian noise. The last column shows the artificial HRV measures for both types of noise combined at 6 dB of Gaussian and 6% of ectopic noise. (b) Heat map of $-\log_{10}$(p-value) of noisy and denoised (by the proposed algorithm) real HRV measures obtained from three PhysioNet databases: MIT-BIH Arrhythmia, MIT-BIH Normal Sinus Rhythm, and Sudden Cardiac Death database.*
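The time-domain and Poincaré measures defined above can be computed from a series of N-to-N intervals as in the following minimal sketch. It uses common textbook formulas (sample standard deviation; SD1 and SD2 derived from the variance of successive differences), which are assumptions here and may differ in detail from the toolbox used in the study. The function name `hrv_basic_features` and the example intervals are hypothetical.

```python
import math
import statistics


def hrv_basic_features(nn_ms):
    """Time-domain and Poincare HRV features from N-to-N intervals in ms."""
    if len(nn_ms) < 3:
        raise ValueError("need at least three N-to-N intervals")
    # Differences between successive N-to-N intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]

    avnn = statistics.fmean(nn_ms)
    sdnn = statistics.stdev(nn_ms)
    rmssd = math.sqrt(statistics.fmean([d * d for d in diffs]))
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    sem = sdnn / math.sqrt(len(nn_ms))

    # Poincare descriptors (assumed standard relations):
    # SD1^2 = Var(diffs) / 2, SD2^2 = 2 * SDNN^2 - Var(diffs) / 2.
    var_diff = statistics.variance(diffs)
    sd1 = math.sqrt(0.5 * var_diff)
    sd2 = math.sqrt(max(0.0, 2.0 * sdnn ** 2 - 0.5 * var_diff))

    return {"AVNN": avnn, "SDNN": sdnn, "RMSSD": rmssd,
            "pNN50": pnn50, "SEM": sem, "SD1": sd1, "SD2": sd2}


# Hypothetical N-to-N intervals in milliseconds.
nn = [800, 810, 790, 860, 800, 805]
for name, value in hrv_basic_features(nn).items():
    print(f"{name}: {value:.2f}")
```

In this toy series, two of the five successive differences exceed 50 ms in magnitude, so pNN50 is 40%. A single ectopic-like interval such as the 860 ms value inflates SDNN and RMSSD noticeably, which is consistent with the sensitivity of these measures reported later in this section.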

The results of the paired t-test are shown in Fig. 6. *p* values with large variability cannot serve as accurate and reliable measures of evidence against the null hypothesis [58]. One way to resolve this issue is to express the *p* value, written as $p = c \times 10^{-k}$, on a log scale as $-\log_{10}(p) = -\log_{10}(c) + k$, where $c$ is a constant and $k$ an integer. This implies that the magnitude $k$ is a continuous measure of the actual strength of evidence [58,59]. Using the $-\log_{10}(p)$ value, the statistically significant differences between the HRV of clean and noisy signals could be ranked for each HRV measure and each level of noise.
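The log-scale transformation described above can be illustrated with a short Python sketch. The significance level of 0.05, the normalization $1 \le c < 10$, the helper names, and the example *p* values are assumptions made for this illustration and are not taken from the study.

```python
import math

ALPHA = 0.05                     # assumed conventional significance level
THRESHOLD = -math.log10(ALPHA)   # about 1.301


def neg_log10_p(p):
    """Return -log10(p) for a p value in (0, 1]."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    return -math.log10(p)


def decompose_p(p):
    """Split p into (c, k) with p = c * 10**(-k), assuming 1 <= c < 10."""
    k = -math.floor(math.log10(p))
    c = p * 10 ** k
    return c, k


def rank_by_evidence(p_values):
    """Sort (measure, -log10(p)) pairs from strongest to weakest evidence."""
    scored = {name: neg_log10_p(p) for name, p in p_values.items()}
    return sorted(scored.items(), key=lambda item: item[1], reverse=True)


# Hypothetical p values for three HRV measures.
example = {"SDNN": 1e-6, "LF to HF": 0.03, "HF peak": 0.40}
for name, score in rank_by_evidence(example):
    flag = "significant" if score > THRESHOLD else "not significant"
    print(f"{name}: -log10(p) = {score:.2f} ({flag})")
```

Under these assumptions, a $-\log_{10}(p)$ value above roughly 1.3 corresponds to $p < 0.05$, and larger values indicate stronger evidence of a difference.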

The comparison of original and noisy artificial HRV measures is shown in Fig. 6a. All the HRV measures, with the exception of HF peak and LF peak ($-\log_{10}(p) < 1.3$), changed significantly with ectopic noise. VLF norm (except for 4% ectopic noise) and alpha2 resulted in the highest $k$ values, followed by AVNN, SDNN, RMSSD, SD1, SD2, and sample entropy. In contrast, LF to HF, alpha1, and PAS at 2% of added ectopic beats showed the smallest order of difference.

AVNN (except at 2 dB of Gaussian noise), HF peak, LF peak, LF power, and alpha2 (except at 2 dB of Gaussian noise) were not significantly affected by the addition of Gaussian noise. The highest order of significant change was observed for the fragmentation measures at higher levels of Gaussian noise. In addition, SDNN, RMSSD, SEM, and the non-linear measures SD1 and sample entropy showed the most significant changes at lower levels of Gaussian noise.

The last column in Fig. 6a shows the change in HRV measures when both Gaussian and ectopic noise are present, as would be the case with real signals. All the HRV measures changed significantly except HF peak and LF peak, which were not affected by either type of noise.

Figure 6b illustrates the findings of the paired t-test between the noisy and denoised real HRV measures. The results reaffirm that most HRV measures are sensitive to noise and that denoising the signal before performing HRV analysis will alter the outcome. The HRV measures obtained from the MIT-BIH Arrhythmia database showed no significant change in AVNN, HF peak, LF norm, and LF power between the noisy and denoised signals, whereas HF norm, VLF norm, alpha1, and alpha2 showed the highest statistical significance. The MIT-BIH Normal Sinus Rhythm database showed no effect on the fragmentation measures, sample entropy, HF peak, LF norm, LF peak, and pNN50. The biggest change for this database was in the nonlinear measure alpha1, followed by alpha2, VLF norm, and HF norm. The most significant differences for the Sudden Cardiac Death database were in HF norm, VLF norm, and alpha1, whereas most of the time-domain and frequency-domain measures, as well as SD1 and SD2, did not change significantly.

Table 2 shows the results of the linear regression analysis ranked by the absolute value of the slope (*B* value). The majority of the HRV features examined for added ectopic noise indicate a linear relationship between the relative percentage change in HRV features (RHRV) and the percentage of added ectopic noise. Some HRV measures, such as HF norm, alpha1, and sample entropy, do not show a linear change with an increasing percentage of artifacts. When plotted against the increasing percentage of artifacts, RHRV for these measures showed a distinct widening of the spread of data points with small slope values, as shown in Fig. 7. Consequently, the *p* values of the regression models for these measures were not significant.
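The regression of RHRV against the percentage of added ectopic noise can be sketched as an ordinary least-squares fit. In this illustration, RHRV is assumed to be computed as 100 × (noisy − clean) / clean, which is an assumption about the definition rather than a formula given in the text. The function names and the SDNN values below are hypothetical.

```python
def relative_change_percent(noisy, clean):
    """Relative percentage change of a feature (assumed RHRV definition)."""
    return 100.0 * (noisy - clean) / clean


def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    r_squared = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return slope, intercept, r_squared


# Hypothetical SDNN values (ms) at increasing percentages of ectopic beats.
ectopic_percent = [2, 4, 6, 8, 10]
clean_sdnn = 50.0
noisy_sdnn = [51.2, 52.5, 53.6, 55.0, 56.1]

rhrv = [relative_change_percent(v, clean_sdnn) for v in noisy_sdnn]
slope, intercept, r2 = linear_fit(ectopic_percent, rhrv)
print(f"B = {slope:.3f}, Abs(B) = {abs(slope):.3f}, R^2 = {r2:.3f}")
```

Repeating the fit for every HRV measure and sorting by `abs(slope)` would reproduce the kind of ranking reported in Table 2.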

Time-domain and frequency-domain measures, except for AVNN, LF peak, and HF norm, showed increasing variability with the increase in the percentage of added ectopic beats. Among the time-domain measures, SDNN and RMSSD were more sensitive to added artifacts than the other measures, whereas SEM was robust to ectopic noise. No clear distinction between short-term and long-term time-domain metrics was observed. In addition, absolute power measures from the frequency domain were more sensitive to added artifacts than normalized power metrics.

All non-linear and fragmentation measures except alpha1 increased with the increase in added artifacts. Among the non-linear measures, SD2 and SD1 were the most sensitive to ectopic noise, whereas the fragmentation measures showed the greatest sensitivity to ectopic noise overall.

For the majority of HRV measures, the variability in RHRV with increasing SNR of Gaussian noise was comparatively large. This results in an increased spread of data, especially at lower SNR values, similar to what was observed for some HRV measures with ectopic noise. These measures do not conform well to straight-line fitting despite having highly significant *p* values, as displayed in Fig. 8.

Table 2

HRV measures ranked by absolute value of slope (B) on linear regression, grouped by time-domain measures, frequency-domain measures, and non-linear and fragmentation measures

| ECTOPIC NOISE | | | | | GAUSSIAN NOISE | | | | |
|---|---|---|---|---|---|---|---|---|---|
| **HRV measures** | **R²** | **p** | **B** | **Abs(B)** | **HRV measures** | **R²** | **p** | **B** | **Abs(B)** |
| SEM | 0.889 | < .0001 | 0.016 | 0.016 | SEM | 0.111 | < .0001 | -4.41E-04 | 4.41E-04 |
| pNN50 | 0.557 | < .0001 | 0.646 | 0.646 | pNN50 | 0.088 | < .0001 | 0.002 | 0.002 |
| AVNN | 0.999 | < .0001 | -0.930 | 0.930 | SDNN | 0.112 | < .0001 | -0.034 | 0.034 |
| SDNN | 0.970 | < .0001 | 1.225 | 1.225 | RMSSD | 0.136 | < .0001 | -0.046 | 0.046 |
| RMSSD | 0.948 | < .0001 | 1.355 | 1.355 | pNN50 | 0.208 | < .0001 | -0.270 | 0.270 |
| LF PEAK | 0.000 | **0.943** | -0.006 | 0.006 | HF PEAK | 0.011 | **0.133** | 0.005 | 0.005 |
| HF PEAK | 0.013 | **0.104** | 0.342 | 0.342 | LF PEAK | 0.002 | **0.487** | 0.005 | 0.005 |
| LF TO HF | 0.010 | **0.155** | 0.403 | 0.403 | LF TO HF | 0.000 | **0.919** | 0.011 | 0.011 |
| VLF NORM | 0.010 | **0.155** | 7.208 | 7.208 | LF NORM | 0.096 | < .0001 | 15.287 | 15.287 |
| LF NORM | 0.007 | **0.249** | 13.873 | 13.873 | HF NORM | 0.032 | 0.011 | 21.845 | 21.845 |
| HF NORM | 0.014 | **0.096** | -21.098 | 21.098 | VLF NORM | 0.061 | 4.02E-04 | -35.935 | 35.935 |
| VLF POWER | 0.611 | < .0001 | 17416.08 | 17416.08 | VLF POWER | 0.030 | 0.014 | -91.591 | 91.591 |
| LF POWER | 0.644 | < .0001 | 51760.12 | 51760.12 | LF POWER | 0.057 | 0.001 | -138.717 | 138.717 |
| HF POWER | 0.622 | < .0001 | 102963.27 | 102963.27 | HF POWER | 0.071 | 1.42E-04 | -245.000 | 245.000 |
| TOTAL POWER | 0.636 | < .0001 | 172579.35 | 172579.35 | TOTAL POWER | 0.060 | 4.84E-04 | -477.852 | 477.852 |
| alpha1 | 4.969E-05 | **0.921** | -0.026 | 0.026 | SD1 | 0.136 | < .0001 | -0.033 | 0.033 |
| SAMPLE ENTROPY | 0.004 | **0.346** | 0.210 | 0.210 | SD2 | 0.102 | < .0001 | -0.037 | 0.037 |
| alpha2 | 0.014 | **0.092** | 0.297 | 0.297 | alpha1 | 0.130 | < .0001 | 0.603 | 0.603 |
| SD1 | 0.948 | < .0001 | 0.958 | 0.958 | IALS | 0.309 | < .0001 | -0.903 | 0.903 |
| IALS | 0.751 | < .0001 | 1.307 | 1.307 | SAMPLE ENTROPY | 0.050 | 0.002 | -1.681 | 1.681 |
| SD2 | 0.969 | < .0001 | 1.443 | 1.443 | alpha2 | 0.107 | < .0001 | -1.774 | 1.774 |
| PAS | 0.826 | < .0001 | 51.035 | 51.035 | PAS | 0.323 | < .0001 | -47.178 | 47.178 |
| PIP | 0.751 | < .0001 | 130.640 | 130.640 | PIP | 0.309 | < .0001 | -90.261 | 90.261 |
| PSS | 0.575 | < .0001 | 199.337 | 199.337 | PSS | 0.329 | < .0001 | -106.038 | 106.038 |