MULTILINGUAL TEXT-TO-SPEECH ENGINE

dc.contributor.author: Isturlayeva, Aidana
dc.date.accessioned: 2021-07-29T10:28:05Z
dc.date.available: 2021-07-29T10:28:05Z
dc.date.issued: 2021-07
dc.description.abstract: Serenity and fluency are the most important synthesis qualities expected from text-to-speech. This project introduces a multilingual text-to-speech (TTS) engine capable of producing high-quality speech in English, Kazakh, and Russian. The main idea is to address the limitation of existing TTS systems, which offer one voice in one language, by supporting all three languages at the same time. A text-to-speech synthesis system usually consists of several stages: a text analysis interface, an acoustic model, and a sound synthesis module. For synthesis, we use Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from symbols. We also describe a high-quality speech dataset for Kazakh, Russian, and English. The dataset contains 40 hours per language of transcribed audio recordings spoken by a professional female speaker. The publicly available large-scale synthesis was developed to promote multilingual text-to-speech (TTS) applications in academia and industry. This paper outlines our experience by describing the dataset development procedures, the challenges faced, and important future directions. To evaluate the resulting system, we conducted subjective assessment tests based on the Likert scale. [en_US]
dc.identifier.citation: Isturlayeva, A. (2021). Multilingual Text-To-Speech Engine (Unpublished master's thesis). Nazarbayev University, Nur-Sultan, Kazakhstan [en_US]
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/5620
dc.language.iso: en [en_US]
dc.publisher: Nazarbayev University School of Engineering and Digital Sciences [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/ [*]
dc.subject: text-to-speech [en_US]
dc.subject: TTS [en_US]
dc.subject: Research Subject Categories::TECHNOLOGY [en_US]
dc.subject: Type of access: Gated Access [en_US]
dc.subject: speech recognition [en_US]
dc.title: MULTILINGUAL TEXT-TO-SPEECH ENGINE [en_US]
dc.type: Master's thesis [en_US]
workflow.import.source: science
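
The abstract mentions subjective assessment tests based on the Likert scale. Such ratings are commonly summarized as a mean opinion score (MOS) with a confidence interval. The sketch below illustrates that general practice only. It is not taken from the thesis: the function name `mean_opinion_score`, the 1 to 5 rating range, the normal-approximation interval, and all rating values are assumptions made for illustration.

```python
import math
from statistics import mean, stdev


def mean_opinion_score(ratings):
    """Return (MOS, 95% CI half-width) for a list of 1-5 Likert ratings.

    Hypothetical helper for illustration; not the thesis's evaluation code.
    """
    if len(ratings) < 2:
        raise ValueError("need at least two ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    mos = mean(ratings)
    # Normal-approximation 95% confidence interval around the mean,
    # using the sample standard deviation.
    half_width = 1.96 * stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Placeholder ratings invented for this example (NOT results from the thesis).
ratings_by_language = {
    "English": [4, 5, 4, 3, 5],
    "Kazakh": [5, 4, 4, 4, 5],
    "Russian": [3, 4, 5, 4, 4],
}

for language, ratings in ratings_by_language.items():
    mos, ci = mean_opinion_score(ratings)
    print(f"{language}: MOS = {mos:.2f} +/- {ci:.2f}")
```

With only a handful of listeners per language, the normal approximation is rough; a t-distribution or bootstrap interval may be more appropriate for small panels.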

Files

Original bundle

Name: Thesis - Aidana Isturlayeva.pdf
Size: 4.14 MB
Format: Adobe Portable Document Format
Description: Thesis