Towards Large-Vocabulary Kazakh Sign Language Processing: Corpus Collection, Semi-Automatic Annotation, Recognition, And Translation

Publisher

Nazarbayev University School of Engineering and Digital Sciences

Abstract

Sign language (SL) is the primary communication mode for Deaf communities globally, yet automatic Sign Language Processing (SLP) technologies lag significantly behind those for spoken languages, particularly for under-resourced languages like Kazakh Sign Language (KSL). Progress is hindered by critical challenges: severe scarcity of large-scale datasets capturing diverse signers and continuous, natural signing; the difficulty in computationally representing both manual and crucial non-manual linguistic features (e.g., facial expressions, mouthing); and the laborious, time-consuming nature of manual data annotation. This thesis directly confronts these obstacles by developing foundational resources and methodologies specifically tailored for large-vocabulary, continuous KSL processing. We address data scarcity by introducing two novel, large-scale KSL datasets: FluentSigners-50, collected via community crowdsourcing to maximize signer and environmental diversity, and KSL-OnlineSchool, leveraging extensive online interpreted educational content to achieve a large vocabulary. Together, these provide over 900 hours of video data, forming an unprecedented resource for KSL. To tackle representation challenges, we propose and evaluate a framework encompassing both manual components, including an extensive study on automatic handshape classification, and non-manual components like head movements, facial expressions, and mouthing. Addressing the annotation bottleneck, we developed SLAN-tool, an open-source, web-based platform employing machine learning models for semi-automatic signing segmentation and handshape classification, designed to accelerate corpus creation. Finally, the utility of these resources is demonstrated by establishing baseline performance metrics for state-of-the-art Sign Language Recognition (SLR) and Translation (SLT) models evaluated on challenging, purpose-built splits of the FluentSigners-50 dataset. 
Together, the primary contributions (the creation and release of the first large-scale continuous KSL datasets, the proposed sign representation framework, and the open-source semi-automatic annotation tool) provide essential infrastructure to catalyze future research and development in KSL processing and related low-resource SLP tasks.

Citation

Mukushev, Medet. (2025). Towards Large-Vocabulary Kazakh Sign Language Processing: Corpus collection, Semi-automatic annotation, Recognition, and Translation. Nazarbayev University School of Engineering and Digital Sciences

Creative Commons license

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 United States