STATISTICAL METHODS IN NATURAL LANGUAGE PROCESSING

dc.contributor.author: Nurkhan, Laiyk
dc.contributor.editor: Nurkhan, Laiyk
dc.date.accessioned: 2024-06-05T11:24:09Z
dc.date.available: 2024-06-05T11:24:09Z
dc.date.issued: 2024-04-28
dc.description.abstract: This capstone project explores the application of statistical methodologies to two distinct natural language processing (NLP) tasks: machine translation between the Ukrainian and Russian languages and the classification of comments for hate speech detection. The study shows that the strategic integration of statistical approaches can improve performance on both the machine translation and the text classification problems. For machine translation, implementing linear regression with an added orthogonal constraint on the weight vectors resulted in higher precision scores. For the classification of hate speech within textual comments, logistic regression with TF-IDF features was identified as the most effective model in terms of the AUC-ROC metric. [en_US]
dc.identifier.citation: Nurkhan, Laiyk. (2024). Statistical Methods in Natural Language Processing. Nazarbayev University School of Sciences and Humanities [en_US]
dc.identifier.issn: -
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/7753
dc.publisher: Nazarbayev University School of Sciences and Humanities [en_US]
dc.relation.ispartofseries: -;-
dc.subject: machine translation [en_US]
dc.subject: Type of access: Restricted [en_US]
dc.title: STATISTICAL METHODS IN NATURAL LANGUAGE PROCESSING [en_US]
dc.type: Capstone Project [en_US]
workflow.import.source: science

Files

Original bundle
Name: capstone (7).pdf
Size: 856.99 KB
Format: Adobe Portable Document Format
Description: capstone project