Sentiment analysis is the automated process of interpreting an opinion about a subject expressed in
text or speech. The Bag-of-Words (BOW) model, although a simple and efficient model for
representing textual data in sentiment analysis, often gives inaccurate predictions under polarity
shift because of some fundamental deficiencies. The BOW model considers two sentiment-opposite
texts to be very similar, because it disrupts the word order, breaks the syntactic structures, and
discards some semantic information.
To address the inability of the Bag-of-Words model to handle the polarity shift problem, which in
turn leads to the failure of statistical machine learning algorithms, a coupled sentiment analysis
is developed in which a sentiment-reversed review is created for each training and testing review
in a dataset. The data collected in this project is a cellphone reviews dataset.
From the results, after performing coupled sentiment analysis on the dataset, we observe an
improved accuracy in predicting the review sentiment in comparison to predicting the sentiment
using sentiment analysis alone.
Sentiment is a perspective, and sentiment analysis is the study of individuals' perceptions of
specific entities. Users post their content on social media through social networking sites,
forums, and online journals. Researchers and developers, in turn, collect data for analysis from
application programming interfaces (APIs) made available by social media sites. Data available
online is often noisy and does not convey clear information, which creates difficulty and
obstructs the process of sentiment analysis. The quality of comments posted online cannot be
taken for granted, since anyone is free to post online. Hence, it is essential to process and
analyze textual data for sentiment analysis to determine the orientation of the sentiments of
online comments.
Product reviews available online are widely used in opinion mining, as customers rely heavily on
the sentiments indicated in the text (Pak, A., & Paroubek, P. 2010, May). With the growing
popularity and availability of opinion-rich sources of user-generated text, such as online review
sites and personal blogs, the volume of such text has increased significantly, and with it the
need for efficient techniques for analyzing all that data.
Sentiment analysis, or sentiment classification, has been on the rise since 2000 (Liu,
B. 2012). Its goal is to classify text according to the sentiment polarity of the users' thoughts
or opinions, e.g., positive or negative, which is generally present in the form of
Data mining tools and algorithms are used to discover and analyze the sentiments and attitudes of
consumers toward products they have purchased or want to buy (Jack, L., & Tsai,
2015). Sentiment analysis, or opinion mining, draws on a wide range of fields such as natural
language processing, decision making, and linguistics. It is a type of text analysis that
classifies text and supports decision making by differentiating between products and highlighting
customer priorities for specific items.
In statistical machine learning techniques, the bag-of-words (BOW) model is the most broadly
used approach for representing text data in sentiment classification. Also commonly known as the
vector space model, the BOW model treats a text as an orderless collection of words, disregarding
grammar and word order but keeping multiplicity (the number of times each word occurs). Despite
being the most widely used representation in topic-based text classification, the BOW model has a
few fundamental deficiencies, such as requiring a lexicon to be assessed manually to evaluate
words. It also analyzes sentiment with lower precision because it ignores the semantics of words
and disrupts the word order and grammar (El-Din, D. M. 2016). Because of these inadequacies, the
BOW model is unable to resolve the polarity shift problem.
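The effect of discarding word order can be illustrated with a minimal sketch. The tokenizer, the
helper names bag_of_words and cosine_similarity, and the example sentences are assumptions made
for illustration only; they are not the feature extraction used in this project.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Return an orderless word-count mapping for the given text."""
    # Deliberately simple tokenizer (an assumption): lowercase, keep letters and apostrophes.
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count mappings."""
    # Counter returns 0 for words missing from b, so only shared words contribute.
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Illustrative sentences: a positive review, its negated form, and a reordering.
positive = bag_of_words("The phone's camera is good.")
negative = bag_of_words("The phone's camera is not good.")
shuffled = bag_of_words("Good is the camera phone's.")

# Word order is lost: the reordered sentence has the same representation.
print(positive == shuffled)

# The two sentiment-opposite sentences differ by a single count.
print(round(cosine_similarity(positive, negative), 3))
```

In this sketch the reordered sentence produces exactly the same counts as the original, and the
two sentiment-opposite sentences have a cosine similarity of about 0.91, since their
representations differ only in the count for "not".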
Polarity shift is a sentiment classification problem in which the sentiment polarity of a
review's text is reversed, i.e., negative to positive or positive to negative. Certain polarity
shifters, also known as valence shifters, such as negation expressions, are capable of shifting
the sentiment polarity of the text entirely (Ikeda, D., Takamura, H., Ratinov, L. A., & Okumura, M. 2008).
Once negation words such as “no”, “don’t”, or “not” are added to a positive text, the orientation
of that text is reversed from positive to negative. For instance, adding the negation trigger word
“not” before “good” in the positive sentence “The phone’s camera is good.” reverses the sentiment
of the sentence from positive to negative. The BOW model considers such sentiment-reversed texts
to be fundamentally the same, which causes machine-learning-based systems to fail under polarity
shift. Therefore, the polarity shift problem needs to be handled, as it is imperative to
improve the performance and execution of machine learning models to give more accurate