This is the first clinical study to investigate the ability of AI to differentiate between benign and pathological heart murmurs in children to this extent using echocardiography as the gold standard. The algorithm was able to distinguish a murmur caused by a CHD from innocent murmurs with good sensitivity and specificity when echocardiography was used as the reference. Since the method is based on the identification of different heart sound signals, it is not suitable for screening CHDs without a murmur.
Despite the routine use of pulse oximetry screening, which is highly sensitive in detecting critical cyanotic CHDs, some newborns with acyanotic CHDs may still go undiagnosed before discharge from the maternity ward 11. Indeed, many CHDs do not cause symptoms during the first week of life 4, and some are only diagnosed after normal postnatal adaptation has taken place. For example, a VSD causes a murmur only after a reduction in pulmonary vascular resistance has led to a pressure difference between the ventricles, and a coarctation causes a murmur only once the arterial duct has closed and caused narrowing of the aorta. CHDs diagnosed in children over six months of age are usually mild and asymptomatic, including ASD, PDA, and valvular defects 6. Some valvular defects and hypertrophic cardiomyopathy may manifest later in adolescence. Auscultation during childhood plays a vital role in identifying these CHDs, but the accuracy of interpretation depends on the experience and skills of the physician. In previous studies, the sensitivity and specificity of clinical assessment of CHD have varied widely, with specificity generally increasing with the clinical experience of the physician. For example, medical students and paediatric residents had a sensitivity of 82% but a specificity of only 56% in assessing CHD 12. Among paediatricians, sensitivity was found to be better (93%), but specificity remained low (59%) 6. Meanwhile, clinical assessment by paediatric cardiologists had a sensitivity of 81% and a specificity of 91% in identifying neonates with CHD 13.
Auscultation using a conventional stethoscope or AI requires optimal conditions, including good patient co-operation and a quiet environment. If a child is crying, both the physician and the algorithm may struggle to recognise murmurs accurately. In our study, all children were co-operative, ensuring a reliable evaluation of the algorithm’s performance. The AI used in this study includes a quality algorithm that screens raw phonocardiogram signals and removes noise and other artifacts, so that only high-quality phonocardiogram segments are used for the analysis of heart sounds.
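The idea of screening a recording for usable segments can be illustrated with a minimal sketch. This is not the quality algorithm used in the study, whose internals are not described here. It assumes a crude amplitude-based gate in which a segment is rejected if it is too loud overall (for example, because of crying) or if too many samples are clipped. The function names, thresholds, and the toy signal are all hypothetical.

```python
import math


def rms(segment):
    """Root-mean-square amplitude of a list of samples."""
    return math.sqrt(sum(x * x for x in segment) / len(segment))


def select_quality_segments(signal, segment_len, max_rms,
                            max_clip_fraction=0.01, clip_level=0.99):
    """Return (start, end) index pairs of segments that pass a crude quality gate.

    Hypothetical criteria, for illustration only:
    - overall loudness (RMS) must not exceed max_rms
    - the fraction of samples at or above clip_level must not exceed max_clip_fraction
    """
    kept = []
    # Walk over non-overlapping, fixed-length segments; a trailing partial segment is ignored.
    for start in range(0, len(signal) - segment_len + 1, segment_len):
        segment = signal[start:start + segment_len]
        clipped = sum(1 for x in segment if abs(x) >= clip_level) / segment_len
        if rms(segment) <= max_rms and clipped <= max_clip_fraction:
            kept.append((start, start + segment_len))
    return kept


# Toy signal (made up): quiet segment, loud clipped segment, quiet segment.
quiet = [0.1, -0.1] * 50
loud = [1.0, -1.0] * 50
signal = quiet + loud + quiet

kept = select_quality_segments(signal, segment_len=100, max_rms=0.5)
print(kept)
```

In this toy example only the first and last segments are kept, and the loud middle segment is discarded before any heart sound analysis would take place. A real quality algorithm would likely rely on richer features than amplitude alone.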
A prerequisite for the development and utilisation of AI in health care is the ability to reliably listen to and record heart sounds. In recent decades, electronic stethoscopes have undergone significant development, resulting in enhanced capabilities for analysing murmurs. These devices improve sound signals and reduce background noise, facilitating auscultation. Most available models not only aid in listening but also can record sounds and store data on murmurs for future reference 14.
The accuracy of algorithms primarily hinges on the quantity and quality of the heart sound samples used in their development. The training data must encompass a sufficient variety of normal heart sounds, innocent murmurs, and abnormal murmurs associated with different CHDs. AI algorithms trained with large high-quality datasets outperform interpretations made by inexperienced listeners when assessing murmurs. In this study, the AI was trained on a dataset comprising heart sound samples from 1413 patients, after which it was prospectively tested with 98 new cases. Compared to other studies that have used AI in the analysis of heart sounds in children, the algorithm used in our study was trained with a larger number of samples 9,15.
In our study, electronic stethoscope recording was performed at four standard anterior auscultation points. This technique differs from that used in a small pilot study, in which the recording was made at the loudest location of the murmur 15. Murmurs caused by some CHDs are best heard in areas not covered by standard auscultation points. For example, a PDA murmur is often heard just below the left clavicle, and small VSD murmurs are only heard in very small areas. A coarctation of the aorta (CoA) murmur is usually heard most clearly from the back near the left scapula. In our study, the algorithm identified all CoAs as abnormal even though the back was not included in the recording areas. However, the possibility of a false negative result increases if heart sounds are analysed in an area where the murmur is not heard best. The ability of an algorithm to identify CHD is also weakened if the murmur associated with it does not clearly differ from innocent murmurs. ASD can occur without a murmur, or the murmur of ASD can mimic an innocent murmur from the pulmonary artery area, which was also observed in this study as a false negative finding for ASD. The ability of AI to recognise murmurs outside standard areas can be improved by directing the recording to the point where the physician hears the murmur best. To improve the reliability of murmur examinations in children, a promising approach would be to combine the results of an AI algorithm with findings from a clinical examination.
AI algorithms based on the recognition of murmurs and normal heart sounds are unable to recognise CHDs without an audible murmur. Heart defects without murmurs could therefore not be used to train our algorithm and were excluded from the training dataset. In this study, AI failed to recognise ASDs, small VSDs, and bicuspid aortic valves with normal function (no stenosis or insufficiency) that presented without a murmur. These defects represent “false negatives” and explain the decrease in sensitivity in the entire study population, which also included patients without a murmur. The age group over four years had more CHDs without a murmur, which explains the lower sensitivity in the older age group in our study.
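The effect described above can be made concrete with a short worked example. The counts below are hypothetical and chosen only to illustrate the arithmetic. They are not data from this study. The sketch assumes that every CHD without an audible murmur is scored as a false negative, because a murmur-based algorithm has nothing to detect.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Proportion of echocardiography-confirmed CHDs that were flagged as abnormal."""
    return true_positives / (true_positives + false_negatives)


# Hypothetical counts for illustration only (not study data).
chd_with_murmur_detected = 45  # CHDs with a murmur, flagged as abnormal
chd_with_murmur_missed = 5     # CHDs with a murmur, not flagged
chd_without_murmur = 10        # CHDs with no audible murmur, assumed to be missed

# Sensitivity when only CHDs with a murmur are counted.
sens_murmur_only = sensitivity(chd_with_murmur_detected, chd_with_murmur_missed)

# Sensitivity in the whole population: murmur-free CHDs add to the false negatives.
sens_all = sensitivity(
    chd_with_murmur_detected,
    chd_with_murmur_missed + chd_without_murmur,
)

print(f"CHDs with a murmur only: {sens_murmur_only:.2f}")
print(f"All CHDs:                {sens_all:.2f}")
```

With these made-up numbers, sensitivity falls from 0.90 to 0.75 once murmur-free defects are included, even though the algorithm's performance on audible murmurs is unchanged. The same mechanism would lower sensitivity in any subgroup with a larger share of murmur-free CHDs.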
Breathing sounds and heart rate can affect the quality of heart sound recordings 10. Both are faster in children than in adults, and both decrease as the child ages. The effect of breathing sounds on the quality of murmur recordings can be mitigated by performing the recordings during breath holding, as reported in a small pilot study 15. However, breath holding requires good co-operation and is not possible for small children. Algorithms based on adult heart sound samples cannot be used for screening children, as heart diseases and murmurs differ between children and adults. Our algorithm was developed with samples from different age groups of children and adolescents (from 0 to 18 years), so it could be a promising method for broader use in the field of paediatrics and adolescent medicine.
Previous validation studies assessing AI algorithms in the identification of pathologic murmurs in children have reported results similar to ours. In a virtual clinical study (n = 120, age 2–17 years) based on a database of recordings of children’s heart sounds and murmurs, AI identified Still’s innocent murmur with a sensitivity of 90% and a specificity of 98%. Unlike our study, this selected patient sample had no other innocent murmurs or normal heart sounds, and only sound samples recorded at the lower left sternal border were used in the analysis. The performance of the algorithm worsened when sound samples without a murmur and all auscultation areas were also included, resulting in a sensitivity of 83% and a specificity of 89% 16. A small (n = 34) AI pilot study in children over 3.5 years of age reported a sensitivity of 87% and a specificity of 100% in identifying pathologic murmurs 15. A virtual clinical trial (n = 603) using AI identified pathologic murmurs with a sensitivity of 93% and a specificity of 81%. In that study, previously recorded patient heart sounds from a sound bank were analysed with the help of AI. Pathologic cases had at least one pathologic diagnosis by echocardiogram and at least one murmur considered to be caused by the pathology. CHD patients without a murmur or with innocent murmurs were excluded, which increased the accuracy of the algorithm 10. However, due to the differences in patient selection, these results cannot be directly compared with those of our study. The virtual clinical trial by Thompson et al. also included adults, and the age range was wide (0.3–80.9 years), with a median age of 8.8 years and 34% of the patients being over 12 years of age 10.
The strength of our clinical study is the use of echocardiography combined with AI analysis and clinical examination. Another strength of our study is that versatile data were collected from different age groups, covering normal heart sounds as well as innocent and pathologic murmurs related to CHDs. In addition, a large dataset of over 1400 sound samples, validated with echocardiography, was used in developing the algorithm.
One of the limitations of this study was the small number of children under one month of age with fast heart and respiratory rates, raising questions about the algorithm’s utility in that age group, which warrants further evaluation. In addition, the exclusion of children with prior heart surgeries means it was not possible to assess the algorithm’s effectiveness in identifying murmurs in this specific paediatric population. The inclusion of children with innocent murmurs makes it difficult to compare the results to those of previous studies 10,15.
In Finland, most referrals due to murmurs or suspicion of CHD come from primary health care, so our algorithm could be most useful in screening for murmurs in that context 7. The high prevalence of innocent murmurs detected in primary health care strains the limited resources of specialised care. Evaluations of auscultatory findings by inexperienced listeners lead to increased numbers of referrals to specialised medical services. Thus, if innocent murmurs could be reliably diagnosed in primary care settings using AI as an aid to clinical examination, the costs of specialised care could be reduced. Specialised health care resources could then be targeted directly to those patients who need them most.
In conclusion, the AI algorithm developed in this study showed promising results among paediatric cardiology outpatients in distinguishing between innocent and pathologic murmurs, exhibiting good sensitivity and specificity. It could be used as an aid to identify murmurs that require further analysis by echocardiography. In addition, when combined with clinical examination, the use of this AI algorithm could increase the number of accurate diagnoses of benign murmurs without a need for echocardiography, thus decreasing health care expenses. Additional research is needed to investigate the potential application of AI algorithms in primary health care settings for screening murmurs in children. A working algorithm could be most useful in developing countries, in which the availability of echocardiography can be limited 17.