DSpace Repository

PARALLEL NEWS CLUSTERING AND TOPIC MODELING APPROACHES

dc.contributor.author Shomanov, A S
dc.contributor.author Mansurova, M E
dc.date.accessioned 2021-09-16T07:30:38Z
dc.date.available 2021-09-16T07:30:38Z
dc.date.issued 2021
dc.identifier.citation Shomanov, A. S., & Mansurova, M. E. (2021). Parallel news clustering and topic modeling approaches. Journal of Physics: Conference Series, 1727, 012018. https://doi.org/10.1088/1742-6596/1727/1/012018 en_US
dc.identifier.uri http://nur.nu.edu.kz/handle/123456789/5793
dc.description.abstract There is an urgent need for massively scalable and efficient tools for Big Data processing. Even the smallest companies now require ever more resources for data processing routines that can enhance decision making and reliably predict and simulate different scenarios. In this paper we present our combined work on several massively scalable approaches to clustering and topic modeling of a dataset collected by crawling Kazakhstan news websites. In particular, we propose Apache Spark parallel solutions to the news clustering and topic modeling problems and, additionally, describe the results of implementing document clustering with our partitioned global address space (PGAS) MapReduce system. We describe our experience in solving these problems and investigate the efficiency and scalability of the proposed solutions. en_US
dc.language.iso en en_US
dc.publisher Journal of Physics: Conference Series en_US
dc.rights Attribution-NonCommercial-ShareAlike 3.0 United States *
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/us/ *
dc.subject Type of access: Open Access en_US
dc.subject Big Data processing en_US
dc.title PARALLEL NEWS CLUSTERING AND TOPIC MODELING APPROACHES en_US
dc.type Article en_US
workflow.import.source science


Except where otherwise noted, this item's license is described as Attribution-NonCommercial-ShareAlike 3.0 United States