DEEP LEARNING APPROACHES FOR CLASSIFYING SIGN LANGUAGE GESTURES
Publisher
Nazarbayev University School of Engineering and Digital Sciences
Abstract
Communication is a basic human need and one of the cornerstones of human interaction, allowing people to express their feelings, needs, emotions, and opinions clearly. Many platforms offer real-time video messaging, but only for spoken language, so users who rely on sign language cannot communicate through these technologies without difficulty. This work addresses that gap by training deep networks to recognize the 25-letter Russian Sign Language (RSL) fingerspelling alphabet from RGB photos. Using a 2,500-image Kaggle dataset, we apply standardised preprocessing (resizing to 224×224, flipping, rotation, and colour jitter) and evaluate five convolutional pipelines: a tailored lightweight CNN, VGG-16, ResNet-18, ResNet-50, and EfficientNet-B0. A fine-tuned VGG-16 achieves the best trade-off, with 92% macro-F1 and 91% accuracy on a stratified 15% test set; the tailored CNN lags by roughly 10 percentage points. Confusion-matrix analysis shows that errors are concentrated among visually similar hand configurations (e.g., А/Б), suggesting that multi-view capture or hand segmentation could yield further gains. These findings show that moderate data and standard vision networks suffice for consistent RSL letter recognition, providing a practical foundation for inclusive, real-time sign-to-text in video telephony and public-service kiosks.
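To make the described pipeline concrete, below is a minimal sketch (not the authors' code) of the preprocessing and VGG-16 fine-tuning setup outlined in the abstract, written with PyTorch and torchvision. The dataset path, directory layout, rotation angle, jitter strengths, learning rate, and epoch count are all assumptions for illustration; only the 224×224 resize, the augmentations named above, and the 25-class output are taken from the abstract.

```python
# Hedged sketch of the abstract's pipeline: preprocess RGB photos
# (resize 224x224, flip, rotate, colour-jitter) and fine-tune an
# ImageNet-pretrained VGG-16 with a 25-way RSL fingerspelling head.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

train_tf = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(15),           # rotation angle is an assumption
    transforms.ColorJitter(0.2, 0.2, 0.2),   # jitter strengths are assumptions
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])

# Hypothetical ImageFolder layout: one sub-directory per RSL letter.
train_ds = datasets.ImageFolder("rsl_dataset/train", transform=train_tf)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=32, shuffle=True)

# Replace VGG-16's final classifier layer with a 25-class head.
model = models.vgg16(weights=models.VGG16_Weights.IMAGENET1K_V1)
model.classifier[6] = nn.Linear(model.classifier[6].in_features, 25)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # lr is an assumption

model.train()
for epoch in range(10):  # epoch count is an assumption
    for images, labels in train_dl:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```

One caveat on the design: random horizontal flipping mirrors hand shapes, which for some fingerspelled letters changes their identity; the abstract lists flipping among its augmentations, so it is reproduced here, but a real implementation would want to verify it does not confuse mirror-sensitive letters.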
Citation
Otegen, N. (2025). Deep learning approaches for classifying sign language gestures. Nazarbayev University School of Engineering and Digital Sciences.
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NoDerivs 3.0 United States.
