Initial Experiments on Russian to Kazakh SMT
Authors
Myrzakhmetov, Bagdat
Makazhanov, Aibek
Publisher
Research in Computing Science 117
Abstract
We present our initial experiments on Russian to Kazakh phrase-based
statistical machine translation. Following a common approach to SMT between
morphologically rich languages, we employ morphological processing techniques.
Namely, for our initial experiments, we perform source-side lemmatization. Given
a rather humble-sized parallel corpus at hand, we also put some effort into data
cleaning and investigate the impact of the data quality vs. quantity trade-off on
overall performance. Although our experiments mostly focus on source-side
preprocessing, we achieve a substantial, statistically significant improvement over the
baseline that operates on raw, unprocessed data.
Citation
Myrzakhmetov, Bagdat, Makazhanov, Aibek (2016). Initial Experiments on Russian to Kazakh SMT. Research in Computing Science 117, pp. 153–160. http://www.rcs.cic.ipn.mx/rcs/2016_117/
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States
