DSpace Repository

STATISTICAL METHODS IN NATURAL LANGUAGE PROCESSING

dc.contributor.author Nurkhan, Laiyk
dc.contributor.editor Nurkhan, Laiyk
dc.date.accessioned 2024-06-05T11:24:09Z
dc.date.available 2024-06-05T11:24:09Z
dc.date.issued 2024-04-28
dc.identifier.citation Nurkhan, Laiyk. (2024). Statistical Methods in Natural Language Processing. Nazarbayev University School of Sciences and Humanities en_US
dc.identifier.issn -
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/7753
dc.description.abstract This capstone project explores the application of statistical methodologies to two distinct natural language processing (NLP) tasks: machine translation between the Ukrainian and Russian languages, and the classification of comments for hate speech detection. The study shows that the strategic integration of statistical approaches can improve performance on both the machine translation and text classification tasks. For machine translation, implementing linear regression with an added orthogonality constraint on the weight vectors resulted in higher precision scores. For the classification of hate speech within textual comments, logistic regression with TF-IDF features was identified as the most effective model in terms of the AUC-ROC metric. en_US
dc.publisher Nazarbayev University School of Sciences and Humanities en_US
dc.relation.ispartofseries -;-
dc.subject machine translation en_US
dc.subject Type of access: Restricted en_US
dc.title STATISTICAL METHODS IN NATURAL LANGUAGE PROCESSING en_US
dc.type Capstone Project en_US
workflow.import.source science

