In this study, we have shown that CAN can be detected from short, 10-second 12-lead ECG recordings, using the more cumbersome gold-standard CARTs as the reference. We observed robust performance, particularly in identifying dsCAN, when motifs and discords were used with an SVM classifier. This approach yielded high accuracy and precision. However, in the broader context of detecting any stage of CAN, our models’ performance was more modest.
These results highlight the potential of ML to detect the subtle physiological variations characteristic of CAN in 12-lead ECGs, and support the clinical relevance of this ML-driven approach to CAN diagnostics. A recent study applying ML to infant ECGs showed that they contain important information about the familial likelihood of autism spectrum disorder, carried by HRV and by sympathetic and parasympathetic activity [22]. This points to the possible wide-ranging utility of ML for ECG analysis beyond cardiac rhythm disorders.
With regard to classifying patients with eCAN, we achieved good recall, but at the cost of accuracy, as reflected in an AUC of 0.68. Jelinek et al. [23] attempted to detect eCAN using HRV features derived with the fast Fourier transform and the Lomb-Scargle periodogram from ECG recordings lasting from 10 seconds to 5 minutes. They achieved similar results, with AUC values ranging from 0.6 to 0.75 depending on the ECG duration and the method of analysis. This highlights the inherent challenges of ECG-based eCAN detection.
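The relationship between recall, accuracy, and AUC can be illustrated with a minimal sketch. The labels and scores below are hypothetical toy values, not data from this study, and the function names are our own. The sketch computes AUC as the fraction of positive-negative pairs ranked correctly, and shows that lowering the decision threshold raises recall while accuracy falls and AUC stays the same.

```python
def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_and_accuracy(labels, scores, threshold):
    """Recall and accuracy when scores >= threshold are called positive."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    correct = sum(1 for y, p in zip(labels, preds) if y == p)
    return tp / (tp + fn), correct / len(labels)


# Hypothetical classifier outputs (1 = eCAN, 0 = no CAN); illustrative only.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.35, 0.7, 0.5, 0.4, 0.3, 0.2]

print("AUC:", auc(labels, scores))
for threshold in (0.5, 0.3):
    recall, accuracy = recall_and_accuracy(labels, scores, threshold)
    print(f"threshold={threshold}: recall={recall:.2f}, accuracy={accuracy:.2f}")
```

In a screening setting, a lower threshold of this kind may be acceptable because missed cases are costlier than false alarms, but the AUC still reflects how well the underlying scores separate the two groups.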
Numerous studies have leveraged ECG as a diagnostic tool for CAN, primarily through HRV analysis. Notably, Tang et al. [24] explored the diagnostic utility of short-term HRV in CAN detection using Bayesian estimation, challenging the conventional assumption of CARTs as the gold standard. Their findings indicated that short-term HRV analysis was non-inferior to CARTs for CAN diagnosis. Additionally, specific HRV features such as the root mean square of successive differences (RMSSD) and the standard deviation of NN intervals (SDNN) were identified as crucial for CAN detection when employing ensemble classification with random forest [25]. These HRV features are known to be diminished in CAN, which is attributed to parasympathetic withdrawal leading to sympathetic dominance [26, 27]. However, relying solely on SDNN for CAN detection demonstrated limited efficacy, yielding an AUC of 0.73 [26]. Our study was not limited to extracting HRV features from the ECG. The ML algorithms we employed for feature extraction were capable of discerning a broader range of ECG differences between groups.
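For reference, the two time-domain HRV features named above can be computed directly from a sequence of RR intervals. The following is a minimal sketch using the standard definitions; the RR values are hypothetical, the function names are our own, and real pipelines would first detect R peaks and exclude ectopic beats, which is not shown here.

```python
import math


def sdnn(rr_ms):
    """Standard deviation of the RR (NN) intervals in ms (sample SD, n - 1)."""
    n = len(rr_ms)
    mean = sum(rr_ms) / n
    return math.sqrt(sum((x - mean) ** 2 for x in rr_ms) / (n - 1))


def rmssd(rr_ms):
    """Root mean square of successive RR-interval differences in ms."""
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals (ms) from one short recording; illustrative only.
rr = [800, 810, 790, 805, 795]
print(f"SDNN  = {sdnn(rr):.2f} ms")
print(f"RMSSD = {rmssd(rr):.2f} ms")
```

A 10-second recording contains only around ten beats, so estimates of this kind rest on very few intervals, which is one reason features beyond HRV may be useful at this recording length.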
Most of the existing ECG-based biometric recognition methods in the literature are based on the identification of key points, referenced using the notation P, QRS, and T. These are specific points of interest within an ECG signal. The accuracy of these methods depends on the precise detection of these points, which is considered challenging in real-world biometric applications [28]. To address this limitation, we used two alternative feature extraction methods that do not require any form of point detection before extraction. The first was founded on motif and discord extraction. A motif is a time series sub-sequence that repeats frequently, while a discord is a time series sub-sequence that occurs infrequently. Both, it is argued [19], are good discriminators of class. The second was founded on the LSTM algorithm, a deep learning algorithm that is increasingly used in health care [29, 30]. LSTMs have frequently been adopted in recent work on ECG data analysis because of their reported success [18, 31].
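To make the motif and discord definitions concrete, the sketch below uses one common operationalization: each fixed-length window is z-normalized and compared with every non-overlapping window, the window with the closest match is taken as a motif, and the window whose closest match is farthest away is taken as a discord. This brute-force approach is an assumption chosen for illustration and is not necessarily the procedure used in this study or in [19]; the toy series, window length, and function names are hypothetical.

```python
import math


def znorm(seq):
    """Z-normalize a window so that matching ignores offset and amplitude."""
    mean = sum(seq) / len(seq)
    sd = math.sqrt(sum((x - mean) ** 2 for x in seq) / len(seq))
    if sd == 0:
        return [0.0] * len(seq)
    return [(x - mean) / sd for x in seq]


def nearest_neighbour_distances(series, m):
    """For each length-m window, Euclidean distance to its closest non-overlapping window."""
    windows = [znorm(series[i:i + m]) for i in range(len(series) - m + 1)]
    profile = []
    for i, w in enumerate(windows):
        best = math.inf
        for j, v in enumerate(windows):
            if abs(i - j) < m:  # exclusion zone: skip overlapping (trivial) matches
                continue
            best = min(best, math.sqrt(sum((a - b) ** 2 for a, b in zip(w, v))))
        profile.append(best)
    return profile


def motif_and_discord(series, m):
    """Return the start indices of the top motif and the top discord."""
    profile = nearest_neighbour_distances(series, m)
    ranked = [(d, i) for i, d in enumerate(profile) if d != math.inf]
    return min(ranked)[1], max(ranked)[1]


# Toy signal: a repeating 4-sample pattern with one anomalous beat at indices 12-15.
series = [0, 1, 3, 1] * 3 + [0, 5, -2, 4] + [0, 1, 3, 1] * 3
motif_start, discord_start = motif_and_discord(series, m=4)
print("motif starts at", motif_start, "discord starts at", discord_start)
```

The brute-force search is quadratic in the number of windows, which is acceptable for a short recording but would usually be replaced by a dedicated matrix profile implementation for longer signals.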
This study builds upon existing research and enhances ECG-based CAN diagnosis by incorporating ML algorithms. While prior studies have achieved promising results in detecting CAN using ECG-derived features such as QT or RR intervals [32] and HRV parameters [33], they often suffered from methodological shortcomings. Common limitations include small sample sizes, the use of non-gold-standard CAN criteria or failure to clearly define these criteria, and the lack of a robust validation process.
Recent studies have predicted CAN from biochemical and demographic data and patient history using ML techniques [34]. Nedergaard et al. [25] developed an ensemble classifier using established ECG-derived features along with clinical and biochemical data. Our study suggests that such models could be enhanced by integrating novel ECG features identified through motifs, discords, or LSTM along with other types of features.
The potential medical applicability of our findings, particularly in screening and CV disease (CVD) prevention, is noteworthy. Utilizing ML techniques in conjunction with ECG analysis, as demonstrated in this study, provides a novel approach for the early, non-invasive detection of CAN. This condition often precedes more severe cardiovascular complications, making its early identification crucial [5]. The ability of our method to discern subtle ECG alterations indicative of CAN, through the analysis of motifs, discords, and LSTM feature vectors, holds significant promise for identifying individuals at an elevated risk of developing CVD. Timely detection can facilitate earlier intervention, including lifestyle changes and preventive medical therapies, potentially slowing or preventing the progression to overt CVD [35, 36]. Furthermore, the non-invasive nature of ECG screening, combined with the efficiency and accuracy of AI algorithms, makes this approach highly feasible for widespread implementation in clinical practice, potentially enhancing the effectiveness of CVD prevention strategies.
Given that individuals with CAN are at much higher CV risk, newer therapies such as sodium-glucose cotransporter 2 inhibitors (SGLT2i) and glucagon-like peptide-1 receptor agonists (GLP-1 RA) may be important in future clinical pathways which define dsCAN. For instance, SGLT2i reduce CV death and heart failure through mechanisms that extend beyond improving glycaemic control [37] and may include a reduction in sympathetic tone [38]. However, the development of point-of-care screening for CAN will be essential for risk-stratifying those with the highest CV risk.
Limitations
This study enrolled a modestly sized cohort (n = 205). The restricted participant pool from a single centre may not adequately represent the broader, diverse population affected by CAN, potentially limiting the generalizability of our findings. Moreover, the neural network models used in our analysis might have been hindered by the small dataset, as such models typically require extensive data to learn and generalize effectively. Additionally, without external validation across various clinical settings and populations, the robustness and applicability of our ML models remain to be fully established. These factors underscore the need for further, larger studies to validate and enhance the reliability of our ML-driven diagnostic approach. Future studies could also focus on ECG-based classification enhanced by additional clinical or biochemical data.