Toleu, Alymzhan; Tolegen, Gulmira; Makazhanov, Aibek
(Tatarstan Academy of Sciences, 2017-10-21)
In this work, we address the problems of sentence segmentation and tokenization. Informally, the task of sentence segmentation involves splitting a given text into units that satisfy a certain definition (or a number of ...