Congenital Heart Disease Detection from Children’s Heart Sounds using Vision Transformer-based Model

Date

2025

Publisher

Nazarbayev University School of Engineering and Digital Sciences

Abstract

Congenital heart disease (CHD) is one of the most common birth defects and a major source of morbidity and mortality among infants. Early diagnosis allows for early intervention and better patient outcomes. Traditional diagnostic methods are effective but have limitations, including high cost, dependence on expert knowledge, and sensitivity to noise. Recent developments in deep learning (DL) show the potential to address these limitations and automate diagnosis with high accuracy. This study proposes a hybrid model that combines Convolutional Neural Networks (CNNs) with Vision Transformers (ViTs) for the detection of CHD from heart sounds. The proposed CNN-ViT architecture leverages both local feature extraction by the CNN and global pattern modeling by the ViT, so the model can capture both local and global dependencies in spectrogram representations of heart sounds. The proposed approach improves the accuracy of CHD detection by overcoming some of the limitations of traditional approaches and CNN-based methods. This study uses the recently released open-source ZCHSound dataset [1] of pediatric heart sound recordings. Accuracy is used as the evaluation measure. The model performed very well on the binary classification task on clean heart sounds, reaching an accuracy of 95%, and achieved moderate results on multi-class classification (74% accuracy). The multi-class results are not yet convincing, and further research is needed. Nonetheless, the proposed approach sets a new benchmark in CHD diagnostics, contributing to AI-driven healthcare solutions and furthering the application of deep learning in medical research.

Keywords

Congenital heart disease, type of access: open access

Citation

Kabdualiyev, Damir (2025). Congenital Heart Disease Detection from Children’s Heart Sounds using Vision Transformer-based Model. Nazarbayev University School of Engineering and Digital Sciences

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States