DSpace Repository

MULTILINGUAL TEXT-TO-SPEECH ENGINE

dc.contributor.author Isturlayeva, Aidana
dc.date.accessioned 2021-07-29T10:28:05Z
dc.date.available 2021-07-29T10:28:05Z
dc.date.issued 2021-07
dc.identifier.citation Isturlayeva, A. (2021). Multilingual Text-To-Speech Engine (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan en_US
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/5620
dc.description.abstract Serenity and fluency are the most important synthesis qualities expected from text-to-speech. This project introduces a multilingual text-to-speech (TTS) engine capable of reproducing high-quality speech in English, Kazakh, and Russian. The main idea is to address a limitation of existing TTS systems, which offer one voice in one language, by supporting three languages at the same time. A text-to-speech synthesis system usually consists of several stages: a text analysis interface, an acoustic model, and a sound synthesis module. For synthesis, we use Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from symbols. We also describe a high-quality speech dataset for Kazakh, Russian, and English. The dataset contains 40 hours per language of transcribed audio recordings spoken by a professional female speaker. This publicly available, large-scale resource was developed to promote multilingual text-to-speech (TTS) applications in academia and industry. This paper outlines our experience by describing the dataset development procedures and the challenges faced, and by discussing important future directions. To evaluate the resulting system, we conducted subjective assessment tests based on the Likert scale. en_US
dc.language.iso en en_US
dc.publisher Nazarbayev University School of Engineering and Digital Sciences en_US
dc.rights Attribution-NonCommercial-ShareAlike 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/us/ *
dc.subject text-to-speech en_US
dc.subject TTS en_US
dc.subject Research Subject Categories::TECHNOLOGY en_US
dc.subject Type of access: Gated Access en_US
dc.subject speech recognition en_US
dc.title MULTILINGUAL TEXT-TO-SPEECH ENGINE en_US
dc.type Master's thesis en_US
workflow.import.source science


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States