Abstract:
In today’s data-driven world, it is possible to gain access to significant amounts
of data from different sources, and even share this data for various purposes. In
modern everyday life, people use social networks extensively: reading tweets and posts,
leaving comments, sharing their views, and getting feedback from other users. As social
networks become a source of ever more abundant information flows, filtering that
information is becoming difficult and time-consuming. Analyzing the information
correctly is important, since the way we communicate and establish various kinds of
relationships can rely heavily on correct interpretation.
This thesis aims to introduce methods for sentiment analysis and to investigate the
application of the Machine Learning approach to the sentiment classification problem.
It compares the Machine Learning and Statistical approaches, with particular attention
to the importance of the Machine Learning approach for this task. Moreover,
this thesis intends to explore the effectiveness of pre-trained models relative to
other approaches. Logistic Regression, Long Short-Term Memory (LSTM), and BERT
models will be used to demonstrate these methods, and the performance of existing
Python libraries will be observed alongside them. The analysis will show how the
results differ and how the Machine Learning approach outperforms the Statistical one,
and will test whether ML algorithms deliver good performance and the best results.
The experiments will be carried out on the open-source Sentiment140 dataset, which
was extracted from Twitter.