A Distributed Software Architecture for Performing Text Analysis on Web content

dc.contributor.author: Aldabergenov, Aibek
dc.date.accessioned: 2018-10-31T03:24:14Z
dc.date.available: 2018-10-31T03:24:14Z
dc.date.issued: 2017-05
dc.description.abstract: With the high availability of data on the World Wide Web, researchers are actively using Web content for performing various text analysis operations. The large amount of data introduces challenges in data acquisition, storage and processing for researchers who want to use data from different sources on the Internet. In an environment where several people might want to share their data and code, the problem is further complicated by researchers' use of different software applications for performing data collection, storage and analysis tasks. The goal of this thesis is to study the components that make up different parts of web mining systems, and present a scalable software architecture for large-scale Web content analytics tasks performed in a multi-user setting. Additionally, an implementation of the proposed software architecture using modern open source software frameworks and tools is presented in this work. [en_US]
dc.identifier.citation: Aibek Aldabergenov. A Distributed Software Architecture for Performing Text Analysis on Web content. 2017. Department of Computer Science, School of Science and Technology, Nazarbayev University [en_US]
dc.identifier.uri: http://nur.nu.edu.kz/handle/123456789/3565
dc.language.iso: en [en_US]
dc.publisher: Nazarbayev University School of Science and Technology [en_US]
dc.rights: Attribution-NonCommercial-ShareAlike 3.0 United States [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/us/ [*]
dc.subject: World Wide Web [en_US]
dc.subject: data [en_US]
dc.title: A Distributed Software Architecture for Performing Text Analysis on Web content [en_US]
dc.type: Master's thesis [en_US]
workflow.import.source: science

Files

Original bundle
Name: MS_Thesis_Aibek_Aldabergenov_Spring_2017.pdf
Size: 3.01 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 6 KB
Format: Item-specific license agreed upon to submission