Initial Experiments on Russian to Kazakh SMT
dc.contributor.author | Myrzakhmetov, Bagdat | |
dc.contributor.author | Makazhanov, Aibek | |
dc.date.accessioned | 2017-01-11T03:53:33Z | |
dc.date.available | 2017-01-11T03:53:33Z | |
dc.date.issued | 2016 | |
dc.description.abstract | We present our initial experiments on Russian to Kazakh phrase-based statistical machine translation. Following a common approach to SMT between morphologically rich languages, we employ morphological processing techniques. Namely, for our initial experiments, we perform source-side lemmatization. Given the rather modest-sized parallel corpus at hand, we also put some effort into data cleaning and investigate the impact of the data quality vs. quantity trade-off on overall performance. Although our experiments mostly focus on source-side preprocessing, we achieve a substantial, statistically significant improvement over the baseline that operates on raw, unprocessed data. | ru_RU |
dc.identifier.citation | Myrzakhmetov, Bagdat, Makazhanov, Aibek (2016) Initial Experiments on Russian to Kazakh SMT. Research in Computing Science 117, pp. 153–160. http://www.rcs.cic.ipn.mx/rcs/2016_117/ | ru_RU |
dc.identifier.uri | http://nur.nu.edu.kz/handle/123456789/2233 | |
dc.language.iso | en | ru_RU |
dc.publisher | Research in Computing Science 117 | ru_RU |
dc.rights | Attribution-NonCommercial-ShareAlike 3.0 United States | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/3.0/us/ | * |
dc.subject | statistical machine translation | ru_RU |
dc.subject | SMT | ru_RU |
dc.subject | machine translation | ru_RU |
dc.subject | Research Subject Categories::SOCIAL SCIENCES::Statistics, computer and systems science | ru_RU |
dc.title | Initial Experiments on Russian to Kazakh SMT | ru_RU |
dc.type | Article | ru_RU |