Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration
Publisher
ISCA
Abstract
This work aims to build a multilingual text-to-speech (TTS) synthesis system for ten lower-resourced Turkic languages: Azerbaijani, Bashkir, Kazakh, Kyrgyz, Sakha, Tatar, Turkish, Turkmen, Uyghur, and Uzbek. We specifically target the zero-shot learning scenario, where a TTS model trained using the data of one language is applied to synthesise speech for other, unseen languages. An end-to-end TTS system based on the Tacotron 2 architecture was trained using only the available data of the Kazakh language. To generate speech for the other Turkic languages, we first mapped the letters of the Turkic alphabets onto the symbols of the International Phonetic Alphabet (IPA), which were then converted to the Kazakh alphabet letters. To demonstrate the feasibility of the proposed approach, we evaluated the multilingual Turkic TTS model subjectively and obtained promising results. To enable replication of the experiments, we make our code and dataset publicly available in our GitHub repository.
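The two-stage mapping described in the abstract (source alphabet to IPA, then IPA to Kazakh letters) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `transliterate` and both lookup tables are hypothetical, the tables cover only a handful of Turkish letters, and the specific letter-to-IPA and IPA-to-Cyrillic pairs are simplified assumptions rather than the mappings used in the paper.

```python
import unicodedata

# Hypothetical, partial table: Turkish Latin letters -> IPA symbols.
# Simplified for illustration; not the mapping used in the paper.
TURKISH_TO_IPA = {
    "a": "\u0251",       # a -> ɑ
    "k": "k",
    "l": "l",
    "m": "m",
    "s": "s",
    "t": "t",
    "\u015f": "\u0283",  # ş -> ʃ
}

# Hypothetical, partial table: IPA symbols -> Kazakh Cyrillic letters.
IPA_TO_KAZAKH = {
    "\u0251": "а",
    "k": "к",
    "l": "л",
    "m": "м",
    "s": "с",
    "t": "т",
    "\u0283": "ш",
}


def transliterate(text, source_to_ipa, ipa_to_target):
    """Map each character to IPA, then map the IPA symbol to the target alphabet.

    Input is assumed to be lowercase. Characters missing from either
    table are passed through unchanged.
    """
    # Normalise so that precomposed letters such as "ş" match the table keys.
    text = unicodedata.normalize("NFC", text)
    out = []
    for ch in text:
        ipa = source_to_ipa.get(ch, ch)
        out.append(ipa_to_target.get(ipa, ipa))
    return "".join(out)


if __name__ == "__main__":
    # Turkish "taş" rendered with Kazakh letters for a Kazakh-trained TTS model.
    print(transliterate("ta\u015f", TURKISH_TO_IPA, IPA_TO_KAZAKH))  # таш
```

Routing every language through IPA means each new language needs only one table into the shared intermediate alphabet, rather than a separate table for every language pair. A real system would also have to handle multi-character graphemes and context-dependent pronunciations, which this character-by-character sketch ignores.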
Citation
Yeshpanov Rustem, Mussakhojayeva Saida, Khassanov Yerbolat. (2023). Multilingual Text-to-Speech Synthesis for Turkic Languages Using Transliteration. INTERSPEECH 2023. https://doi.org/10.21437/interspeech.2023-249