Implementation of Multinomial Naive Bayes Algorithm For Sentiment Analysis of Applications' User Feedback
Abstract—User feedback can be used as a tool for application developers to find out and understand users' needs, preferences, and complaints. It is important for developers to identify the problems that arise from user-given feedback, but this is very difficult to do considering the amount of feedback received every day: reading and classifying every single feedback takes a long time and is very ineffective. To overcome this problem, a sentiment analysis system based on the Multinomial Naïve Bayes classification algorithm was built to determine whether a feedback carries a positive or negative sentiment. The Naïve Bayes algorithm is widely used for classification because it is simple and effective, and previous research has shown that the Multinomial Naïve Bayes algorithm gives the best performance compared to other traditional machine learning algorithms. This study aims to implement the Multinomial Naïve Bayes classification algorithm in a web application and measure the accuracy of the class predictions made by the system. Based on the results of several tests, evaluated using a confusion matrix, the model with a 70:30 training-testing split, balanced datasets, and 30% over-sampling of each dataset produces the best performance: 71.6% accuracy, 76.92% precision, 61.73% recall and 68.49% F1 score.

Keywords—Bahasa Indonesia; Multinomial Naïve Bayes; NLP; sentiment analysis; user feedback

I. INTRODUCTION

System objectives, conceptual structures, requirements and assumptions that have been elicited, evaluated, specified and analyzed may need to be changed for a variety of reasons, including defects to be fixed, project fluctuations in terms of priorities and constraints, better customer understanding of the system's actual features, and so on [1]. As a developer, it is important to anticipate changes that might occur. One way to identify the changes that have to be made is through feedback given by users. User feedback is direct opinion from users who have experienced the app, and it reflects the instant user experience [2]. Through user feedback, developers can understand users' needs, preferences, and complaints. The emerging issues detected from user feedback can provide informative evidence for developers in maintaining their apps and scheduling app updates [3].

Identifying issues from user feedback based on its characteristics can be challenging. First, user feedback contains numerous noise words, such as misspelled words, repetitive words, and slang, which is popular among Indonesian youngsters; this makes it difficult to categorize whether the user is satisfied or dissatisfied with the application. Second, for some popular apps, the volume of user feedback generated each day is simply too large to be manually read and analyzed. Third, not all feedback contains useful information for developers: only a third of user feedback contains information that can directly help developers improve their apps [4].

As a first step towards a comprehensive support tool for analyzing user feedback, this research proposes a tool to automatically classify user feedback according to its overall sentiment. Machine learning classifiers have been widely used for sentiment analysis, with good results [5]. Among the various machine learning algorithms, the Naïve Bayes (NB) algorithm is commonly used for classification problems due to its simplicity and effectiveness [6].

User feedback contains many words and sentences, which are expressed in various ways. Therefore, the feedback is processed first, before conducting sentiment analysis. Preprocessing the user feedback is done with common Natural Language Processing (NLP) techniques. For text written in English, there are several NLP toolkits available, such as the Stanford Core NLP Toolkit [7], OpenNLP [8], or NLTK [9]. For Bahasa Indonesia, at the time our research was conducted, it was still hard to find an integrated NLP toolkit. The available NLP modules for Bahasa Indonesia are separate modules such as a stemmer, a POS tagger and a syntactic parser; thus, we conducted the text pre-processing activities step by step. Details on these activities are explained in the following sections.

This research aims to study, analyze and implement the Multinomial Naïve Bayes algorithm for analyzing the sentiment of application user feedback in Bahasa Indonesia. The paper is organized as follows. In the following section we review works related to ours, then we present a brief overview of research in sentiment analysis. Section IV describes our experiment and findings in implementing Multinomial Naïve Bayes for sentiment analysis of application user feedback. Section V concludes our work and proposes several future works.
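The step-by-step pre-processing for Bahasa Indonesia outlined above (and detailed in Section III) could look roughly like the following Python sketch. This is an illustrative fragment, not the system's actual pipeline: the regular expressions, the tiny slang dictionary and the stop-word list are our own assumptions, and stemming is left to an external library as in the paper.

```python
import re

# Hypothetical examples; a real slang dictionary and stop-word list
# for Bahasa Indonesia would be far larger.
SLANG = {"bgt": "banget", "gak": "tidak", "bgs": "bagus"}
STOPWORDS = {"yang", "di", "ke", "dan"}

def preprocess(feedback):
    """Tokenize and clean one feedback string: strip URLs, numbers and
    punctuation; map slang to root words; drop stop-words and duplicates."""
    text = feedback.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[^a-z\s]", " ", text)       # remove numbers, punctuation
    tokens = [SLANG.get(t, t) for t in text.split()]
    seen, cleaned = set(), []
    for t in tokens:                            # drop stop-words and duplicates
        if t not in STOPWORDS and t not in seen:
            seen.add(t)
            cleaned.append(t)
    return cleaned
```

For example, `preprocess("Aplikasinya bagus bgt! http://t.co/x 123")` yields `["aplikasinya", "bagus", "banget"]`.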
II. RELATED WORK

Sentiment analysis is a method for enabling computers to recognize and classify opinions in large unstructured text datasets by means of machine learning and computer programming [10]. It involves classifying opinions into categories such as "positive", "negative" or "neutral". Many approaches and algorithms have been used for sentiment analysis, most of them machine learning techniques. A comparative analysis of the performance of NB, Support Vector Machine and Maximum Entropy shows that NB classifiers outperform the others [11]. Another comparative study of machine learning classifiers for Twitter sentiment analysis shows that Multinomial NB outperforms other classifiers [5]. Much research focuses on sentiment analysis of Twitter data due to the simple and easy accessibility to the massive amount of data generated in real time [12, 13, 14].

Fig. 1. The operation flow of the proposed approach [6]
There is not as much research on sentiment analysis of application user feedback using the NB algorithm, let alone on application user feedback in Bahasa Indonesia.

III. METHODOLOGY

Several pieces of literature and existing tools are used in this research. A previous study [15] has pointed out the potential of using Naïve Bayes to classify feedback in Bahasa Indonesia. In our research, in order to classify feedback for sentiment analysis, we utilized a classification method based on the Naïve Bayes algorithm proposed in another study [6], and some adjustments were made to accommodate natural language processing in Bahasa Indonesia. Research related to natural language processing has also been conducted using other algorithms [16, 17], displaying the potential of natural language processing steps to process texts written in Bahasa Indonesia.

A. Natural Language Processing

Natural language processing (NLP) is a computer science field dealing with human language processing in either text or speech form [18]. Preprocessing of feedback includes stemming, tokenization, and the removal of URLs, numbers, punctuation and special characters, stop-words and duplicate words. In addition, common slang and spelling mistakes are replaced with the root word. Stemming is done using an open-source stemmer library [19] based on Enhanced Confix Stripping for Bahasa Indonesia.

B. Multinomial Naïve Bayes

The classification method used in our research exploits a novel attribute weighting and feature selection approach based on the Multinomial NB. Fig. 1 illustrates the operation flow of the proposed approach. The first method (attribute weighting) calculates the weights of words based on the training set divided into positive and negative classes, while the second method (feature selection) modifies the weights using the average of weight differences for automatic feature selection.

Multinomial NB expands the use of the Naïve Bayes algorithm. It applies NB to multinomially distributed data and is a frequency-based model proposed for text classification, in which word counts are used to represent data. Methods for enhancing Naïve Bayes performance are classified into five categories: structure extension, feature selection, attribute weighting, local learning and data expansion [20]. Among these methods, Song et al. [6] used an attribute weighting method that assigns a different weight to each attribute and a feature selection method that selects a subset of attributes based on the weights. MNB with attribute weighting can be modeled using (1):

c(d) = arg max_{c ∈ C} [ log P(c) + Σ_{i=1..m} WT_i · f_i · log P(w_i | c) ]   (1)

where WT_i is the weight of each word w_i (i = 1, 2, …, m) and f_i is the frequency of w_i in document d.

C. Attribute Weighting

Song et al. [6] proposed an attribute weighting approach similar to GRW. In the proposed approach, the training set is first divided into a positive and a negative set, and then the numbers of positive and negative words are counted separately from each set. The weight WT_{i,c} of each word w_{i,c} in D_c can be defined as:

WT_{i,c} = ( IGR(c, w_{i,c}) × m_c ) / ( Σ_{i=1..m_c} IGR(c, w_{i,c}) )   (2)

where c is the class label (c ∈ {positive, negative}) and m_c is the number of different words in D_c. IGR(c, w_{i,c}), the information gain ratio of w_{i,c}, can be obtained by (3):

IGR(c, w_{i,c}) = IG(c, w_{i,c}) / H(w_{i,c})   (3)
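As an illustration of how the weighted decision rule (1) and the weight normalization (2) fit together, the following Python sketch computes the per-class weights and scores a document. This is not the authors' implementation: the variable names are our own, and the IGR values are treated as precomputed inputs, since their derivation depends on the information-gain quantities defined in the surrounding equations.

```python
import math
from collections import Counter

def normalize_weights(igr):
    """Eq. (2): WT_{i,c} = IGR(c, w_i) * m_c / sum_j IGR(c, w_j),
    where m_c is the number of distinct words seen in class c."""
    m_c = len(igr)
    total = sum(igr.values())
    return {w: g * m_c / total for w, g in igr.items()}

def classify(doc, priors, cond_prob, weights):
    """Eq. (1): c(d) = argmax_c [log P(c) + sum_i WT_i * f_i * log P(w_i|c)].
    doc is a list of tokens; f_i is each token's frequency in doc."""
    freqs = Counter(doc)
    best, best_score = None, -math.inf
    for c in priors:
        score = math.log(priors[c])
        for w, f in freqs.items():
            if w in cond_prob[c]:  # skip out-of-vocabulary words
                score += weights[c].get(w, 1.0) * f * math.log(cond_prob[c][w])
        if score > best_score:
            best, best_score = c, score
    return best
```

With toy values such as `priors = {"positive": 0.5, "negative": 0.5}` and conditional probabilities that favour "bagus" in the positive class, `classify(["aplikasi", "bagus", "bagus"], ...)` selects the positive class.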
IG(c, w_{i,c}) and H(w_{i,c}) are the information gain and the entropy of w_{i,c}, respectively. They can be obtained by (4) and (5):

IG(c, w_{i,c}) = H(c) − H(c | w_{i,c})   (4)

H(w_{i,c}) = − Σ_v ( |D_{v,c}| / |D| ) · log2( |D_{v,c}| / |D| )   (5)

where H(c) is the entropy of D_c, H(c | w_{i,c}) is the conditional entropy of D_c given the word w_{i,c}, and |D_{v,c}| is the size of D_c in terms of the number of w_{i,c} for which v (∈ {0, 1}). H(c) and H(c | w_{i,c}) are calculated as:

When a word never occurs in a class, its estimated probability P(w|c) becomes zero; a common way to overcome this problem is to use Laplace smoothing. This technique assumes that the training dataset is large enough that adding 1 to each count makes only a negligible difference to the estimated probabilities, while eliminating zero estimates. The smoothing is calculated as follows:

P(w | c) = ( count(w, c) + 1 ) / ( count(c) + |V| )   (11)

where count(w, c) is the number of occurrences of word w in class c, count(c) is the total number of words in class c, and |V| is the number of distinct words (the vocabulary) in the training documents.
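The smoothed estimate (11) can be sketched in a few lines of Python. This is an illustrative fragment under our own naming, not the system's actual code.

```python
from collections import Counter

def laplace_cond_probs(docs_by_class):
    """Eq. (11): P(w|c) = (count(w,c) + 1) / (count(c) + |V|).
    docs_by_class maps each class label to a list of tokenized documents;
    |V| is the size of the vocabulary over all classes."""
    vocab = {w for docs in docs_by_class.values() for d in docs for w in d}
    probs = {}
    for c, docs in docs_by_class.items():
        counts = Counter(w for d in docs for w in d)
        total = sum(counts.values())  # count(c): total words in class c
        probs[c] = {w: (counts[w] + 1) / (total + len(vocab)) for w in vocab}
    return probs
```

Note that every word in the vocabulary receives a non-zero probability, and the estimates for each class still sum to one.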