Sentiment Analysis On Grab User Reviews Using Support Vector Machine and Maximum Entropy Methods
Abstract: One business that has developed along with the growth of information and communication technology is the transportation service business. The last decade has seen the emergence of technology-based transportation business innovations such as Grab. It is important for a company or organization to find out how people respond to its services or products. Public opinion about a product is voluminous, and it undeniably affects the company's image. A technique for analyzing that opinion is therefore needed so that the company can monitor and manage its services. In this study, public opinion is classified into positive or negative sentiment classes using the SVM and Maximum Entropy methods. The labeled results are then analyzed by text association to find the relationships among the information obtained. Classification with the SVM method produces 89.01% accuracy, whereas Maximum Entropy obtains a higher accuracy of 90.46%. Text associations are obtained for both the positive and the negative sentiment class. The negative reviews are analyzed for causes and consequences using fishbone diagrams for problem solving.

Keywords: sentiment analysis, Grab, SVM, Maximum Entropy

I. INTRODUCTION

Digital technology is currently developing very fast, driven by the continued development of internet technology and the growth in its users. The Association of Indonesian Internet Services (APJII) reported that in 2017 the number of internet users in Indonesia was 143.3 million out of a total population of 262 million. This number has more than tripled compared with the number of internet users in Indonesia in 2010, as shown in Fig. 1, and it is predicted to keep increasing every year [1]. The growing number of internet users drives growth in data of high volume, velocity and variety. This condition encourages people to develop new technologies so that data can be processed more easily and quickly [2].

Fig. 1. Graph of internet users in Indonesia, 2010-2017, in millions: 42 (2010), 55 (2011), 63 (2012), 82 (2013), 88.1 (2014), 110.2 (2015), 132.7 (2016), 143.3 (2017) (source: Association of Indonesian Internet Services, 2017)

One business that has developed along with the growth of information and communication technology is the transportation service business. The development of internet technology also affects how people use transportation facilities. Today, online motorcycle taxis are a very popular form of public transportation, especially in Indonesia. The online motorcycle taxi is a transformation of the conventional motorcycle taxi.

Out of these developments, application-based transportation business innovations have emerged, one of which is Grab. In the "Consumers' Awareness" survey of 40 drivers and 280 consumers selected randomly on a national scale by Spire Research and Consulting, 75% and 61% of respondents said that Grab was the brand they had used in the past 6 and 3 months, respectively. Meanwhile, 62% and 58% of respondents chose Go-Jek for the same category in the last 6 and 3 months [3].

To use Grab, a consumer must first download the Grab application. Android users can download the application from Google Play. Google Play is a digital service operated by Google. It can be accessed through websites, Android applications (Play Store), or Google TV [4]. An interesting feature of Google Play is that users can review the applications it offers. These reviews can be used to evaluate how users respond to a particular application, such as Grab.

User reviews, such as those of Grab, generally contain positive and negative comments. A good brand image forms a good opinion among the users of a product or service, which is then expected to encourage purchases by consumers, and vice versa. Positive and negative responses from users can be influenced by a number of things that Grab has not yet paid attention to. Negative responses may stem from several factors that need to be corrected but are unknown to Grab.

An analysis is needed to determine the patterns in public responses to the Grab application on Google Play. One analysis that can be used is sentiment analysis. Sentiment analysis can reveal which topics users discuss most often on Google Play. It is applied to classify positive, negative, and neutral feedback from consumers so as to accelerate and simplify a company's task of reviewing its product shortcomings. If negative sentiment is found, the company can take action to mitigate it.

In this study, user reviews of Grab are processed with sentiment analysis using the Support Vector Machine (SVM) and Maximum Entropy (ME) methods. After classification, information is extracted and explored for each of the positive and negative sentiment classes. The negative sentiment obtained is then analyzed using a fishbone diagram to determine the causal factors. In the process of
2019 International Conference on Information and Communications Technology (ICOIACT)
the best accuracy results in SVM. The following are the results of a comparison of the four kernels that have been tested:

TABLE I. COMPARISON OF KERNEL USE ON SVM
Kernel        Accuracy
Linear        89.01%
Polynomial    63.26%
RBF           85.35%
Sigmoid       83.57%

Table I shows that, of the four kernels tested, the Linear kernel has the highest accuracy. The Linear kernel was therefore used in the classification process in this study. In addition to the SVM algorithm, this study also uses the Maximum Entropy algorithm to obtain classification accuracy values.

The classification process was carried out as a series of experiments with randomly selected training and testing data. This study uses the confusion matrix in the evaluation process. The confusion matrix is one of the important evaluation tools in machine learning and usually covers two or more categories [23]. Each matrix element shows the number of test samples for the actual class, which is given by the rows, while the columns give the predicted class. In evaluating the model, five dataset experiments were conducted to find the best predictive accuracy. The results of each experiment using the Support Vector Machine and Maximum Entropy methods are as follows:

TABLE II. COMPARISON OF ACCURACY VALUE OF EXPERIMENT DATASET USING SVM AND ME
Trial      Model Accuracy
           SVM       ME
Trial 1    88.79%    90.46%
Trial 2    89.01%    90.46%
Trial 3    85.79%    88.46%
Trial 4    87.35%    85.24%
Trial 5    87.35%    89.68%

Based on Table II, of the five dataset experiments conducted using the SVM and Maximum Entropy methods, Experiment 2 produced the highest accuracy: 89.01% for the SVM method and 90.46% for the Maximum Entropy method. Accuracy is calculated as the number of test items that are correctly classified divided by the total number of items tested. The average accuracy of SVM was 87.66% and the average accuracy of Maximum Entropy was 88.86%.

The confusion matrix was used to facilitate the accuracy calculation by showing how many test items are correctly classified and how many are misclassified. The confusion matrices of the two methods obtained in Experiment 2 are compared in Table III.

TABLE III. CONFUSION MATRIX
Prediction    SVM                     Maximum Entropy
              Positive   Negative     Positive   Negative
Positive      522        25           505        44
Negative      74         280          42         310
Accuracy      89.01%                  90.46%

With the SVM method, of the 596 positive reviews tested, 522 reviews were correctly classified and 74 reviews were misclassified as negative. Of the 305 negative reviews tested, 280 reviews were correctly classified as negative and 25 reviews were misclassified as positive. The confusion matrix thus gives an accuracy of 89.01%, meaning that 802 (522 + 280) of the 901 reviews tested were correctly classified by the Support Vector Machine (SVM) model.

With the Maximum Entropy method, of the 549 positive reviews, 505 reviews were correctly classified and 44 reviews were misclassified as negative. Of the 352 negative reviews, 310 reviews were correctly classified as negative and 42 reviews were misclassified as positive. The confusion matrix thus gives an accuracy of 90.46%, meaning that 815 (505 + 310) of the 901 reviews tested were correctly classified by the Maximum Entropy (Maxent) model. Compared with the Support Vector Machine (SVM) method, the Maximum Entropy (Maxent) method has a higher accuracy.

E. Visualization and Text Associations

Visualization was carried out for each sentiment class. Its purpose is to extract the topics that Grab users discuss or review most often, so that, from the large number of review texts available, the information that matters and the associations between words that most often appear together can be drawn out to strengthen the information search. The results of word visualization and association for each sentiment class are explained below.

1) Positive Reviews
The positive review data are the reviews labeled positive, either with the lexicon dictionary or manually. Information extraction on the positive reviews is repeated to find what Grab users review or discuss most often in positive terms.

Among the 2,736 reviews classified as positive, the most frequent words include "application" with a frequency of 515, "driver" with 458, "great" with 400, "promo" with 366, and so on. These words carry positive sentiment and represent the topics most widely reviewed by Grab users. They are then used as a basis for finding associations with other words, so that better information can be obtained. Associations were sought between each of these words and the words that most often appear together with it; the results are shown in Fig. 3.
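The paper does not state how the association scores between words were computed. One common choice in text-mining toolkits is the Pearson correlation between the per-review counts of two terms, and the following is a minimal sketch under that assumption. The function names, the toy reviews, and the 0.15 threshold are all hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        # A term with constant counts has no defined correlation; treat as 0.
        return 0.0
    return cov / sqrt(var_x * var_y)


def associations(docs, target, min_corr=0.15):
    """Return (word, score) pairs whose per-review counts correlate with
    those of `target` at or above `min_corr`, highest score first.

    `min_corr` is an assumed cut-off, not a value reported in the paper.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = {word for doc in tokenized for word in doc} - {target}
    target_counts = [doc.count(target) for doc in tokenized]
    scored = []
    for word in vocab:
        word_counts = [doc.count(word) for doc in tokenized]
        score = pearson(target_counts, word_counts)
        if score >= min_corr:
            scored.append((word, round(score, 2)))
    # Sort by descending score, then alphabetically for a stable order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical, already-preprocessed toy reviews (not data from the study).
reviews = [
    "driver polite order fast",
    "driver polite",
    "application easy promo",
    "promo coupon application",
    "driver order late",
]

for word, score in associations(reviews, "driver"):
    print(word, score)
```

Words that tend to occur in the same reviews as the target term receive scores close to 1, while words that occur mostly in other reviews receive low or negative scores and are filtered out by the threshold.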
application: competitor (0.44), completeness (0.32), suitability (0.29), clarity (0.17)
driver: money (0.30), pay (0.28), order (0.24), customer (0.21), polite (0.22)
promotion: evaluation (0.30), coupon (0.24), often (0.22), maintain (0.22)
great: reward (0.26), order (0.22), completeness (0.16)
food: completeness (0.27), order (0.25), appropriate (0.22), favorite (0.22)
good: driver (0.28), service (0.24), application (0.17)
order: advice (0.22), seller (0.20), available (0.18), driver (0.18), confirm (0.15)
ovo: balance (0.51), pay (0.29), via (0.26), estimation (0.22), matching (0.22)
price: matching (0.38), menu (0.38), sell (0.31), discount (0.30)
like: completeness (0.25), attitude (0.19), price (0.17)

Fig. 3. Text association of Grab's positive sentiment review.

The analysis of the positive class text associations yields the following information. Users like the Grab application and say that Grab deserves a reward because it is more complete, suitable and clear than competing applications. Grab drivers are considered polite in serving customer orders. Promotions in the form of coupons should be maintained and offered more often. The list of food menus in Grab's service is considered interesting, popular, complete and appropriate. Grab users advise drivers to confirm with users whether orders are available from sellers. Payment for Grab orders via ovo balance matches the price estimates provided in the application.

2) Negative Reviews
Information extraction on the negative reviews is repeated to find what Grab users review or discuss most often in negative terms. From a total of 4,505 reviews, 1,769 negative reviews were identified. The information extracted from the negative reviews is identified based on the frequency of words in the reviews and on the relevance of each word to topics that reflect negative sentiment.

Among the 1,769 reviews classified as negative, the most frequent words include "driver" with a frequency of 783, "application" with 780, "message" with 384, "promotion" with 327, and so on. These words carry negative sentiment and represent the topics most widely reviewed by Grab users. They are then used as a basis for finding relationships with other words, so that better information can be obtained. Associations were sought between each of these words and the words that most often appear together with it, and the following results were obtained:

driver: far (0.32), reason (0.29), customer service (0.25), spoiled (0.19), reject (0.18)
application: connect (0.20), closed (0.17), error (0.17), network (0.17), update (0.16)
order: food (0.30), Grabfood (0.25), Grabpay (0.20), closed (0.17), wrong (0.16)
promotion: code (0.33), function (0.30), go away (0.22), run out (0.21)
difficult: connect (0.24), reach out (0.21), network (0.21)
login: closed (0.25), restart (0.22), severe (0.18), consistent (0.16)
ovo: balance (0.64), cash (0.42), used (0.28), forced (0.25), problem (0.24)
account: blocked (0.35), procedure (0.35), tricky (0.32), pay (0.28)
disappointed: pay (0.28), uninstall (0.22), order (0.22), driver (0.18)
cancel: customer (0.30), reason (0.29), service (0.27), waiting (0.22)

Fig. 4. Text association of Grab's negative sentiment review

The analysis of the negative class text associations yields the following information. Users consider Grab's drivers spoiled, because drivers reject orders when the distance is far. Users are disappointed and feel unappreciated because drivers do not deliver orders that have already been paid for. The Grab application cannot connect to the internet, always asks for system updates, and always exits when it is opened (so the smartphone must be restarted). Users cannot order from Grabfood. Grabfood information is inaccurate: a store that should be open is shown as closed by the Grabfood service. Paying with a promo code through Grabpay does not work. Payment with ovo is problematic: the user's ovo balance is deducted even though the driver does not receive the payment, so the user pays the driver in cash. User accounts are blocked by the Grab system, and there is no account recovery procedure.

F. Factors for Improving Negative Review Grab Problems
The factors that cause Grab to receive negative reviews are examined from the 6P aspects, namely price, people, process, promotion, place, and product. These factors are obtained from the fishbone diagram analysis. A solution is then determined for each problem. The problem-solving plan for Grab can be seen in the following table:

TABLE IV. PLAN TO SOLVE THE GRAB PROBLEMS OF NEGATIVE REVIEW
No.  Factor   Problem                          Solution
1.   Price    Price is not appropriate         Update the latest product prices from the seller.
2.   People   Fraudulent driver                Create SOPs and strict penalties for drivers so that they do not commit fraud.
              Unilateral cancellation          Create SOPs and strict penalties for drivers so that they do not unilaterally cancel consumers' orders.
3.   Process  Always asks for a system update  The developer should immediately repair the application system so that system update errors no longer occur.
              Network-to-server error          The developer should immediately repair the application system so that system update errors no longer occur.