Sentiment Analysis On Grab User Reviews Using Support Vector Machine and Maximum Entropy Methods

2019 International Conference on Information and Communications Technology (ICOIACT)
Bella Azis Dewanti Putri, Annisa Uswatun Khasanah, Abdullah ‘Azzam
Department of Industrial Engineering, Universitas Islam Indonesia, Yogyakarta, Indonesia
15522076@students.uii.ac.id, annisa.uswatun@uii.ac.id, abdullah.azzam@uii.ac.id

Abstract— One business that has developed alongside advances in information and communication technology is the transportation service business. The last decade has seen the emergence of technology-based transportation business innovations, for example Grab. It is important for a company or organization to find out how people respond to its services or products. Public opinion on a product is large in volume, and it undeniably affects the company's image. Therefore, a technique for analyzing this opinion is needed so that the company can monitor and organize its services. Public opinion is classified into positive or negative sentiment classes using the SVM and Maximum Entropy methods. The labeling results are then analyzed by text association to find the relationships among the information obtained. Classification with the SVM method produces 89.01% accuracy, whereas Maximum Entropy obtains a higher accuracy of 90.46%. Text associations are obtained from the positive sentiment class and the negative sentiment class. The negative reviews are then analyzed for causes and consequences using fishbone diagrams for problem solving.

Keywords—sentiment analysis, Grab, SVM, Maximum Entropy

978-1-7281-1655-6/19/$31.00 ©2019 IEEE

I. INTRODUCTION

Digital technology is currently developing very fast, driven by the continued growth of internet technology and its users. The Association of Indonesian Internet Services (APJII) reported that in 2017 the number of internet users in Indonesia was 143.3 million out of a total population of 262 million. This is more than triple the number of internet users in Indonesia in 2010, as shown in Fig. 1, and the number is predicted to keep increasing every year [1]. The growing number of internet users drives the growth of data in high volume, velocity, and variety. This condition encourages people to develop new technologies so that data can be processed more easily and quickly [2].

Fig. 1. Graph of internet users in Indonesia, 2010-2017, in millions (source: Association of Indonesian Internet Services, 2017)
Year    2010  2011  2012  2013  2014  2015   2016   2017
Users   42    55    63    82    88.1  110.2  132.7  143.3

One business that has developed alongside advances in information and communication technology is the transportation service business. The development of internet technology also affects how people use transportation facilities. Today, online motorcycle taxis are a very popular form of public transportation, especially in Indonesia. The online motorcycle taxi is a transformation of the conventional motorcycle taxi.

Based on the above phenomena, application-based transportation business innovations have emerged, one of which is Grab. Based on the "Consumers' Awareness" survey of 40 drivers and 280 consumers selected randomly on a national scale by Spire Research and Consulting, 75% and 61% of respondents said that Grab was the brand they used in the past 6 and 3 months, respectively. Meanwhile, 62% and 58% of respondents chose Go-Jek for the same category in the last 6 and 3 months [3].

To use Grab, consumers must first download the Grab application. Android users can download it from Google Play, a digital distribution service operated by Google. The service can be accessed through websites, Android applications (Play Store), or Google TV [4]. An interesting feature of Google Play is that users can review the applications they use. These reviews can be used to evaluate how users perceive a particular application, such as Grab.

User reviews of applications such as Grab generally contain both positive and negative comments. A good brand image forms a good opinion among the users of products/services, which is expected to encourage purchases by consumers, and vice versa. Positive and negative responses from users can be influenced by a number of things that have not yet been of concern to Grab. These may stem from factors that need to be corrected but are not yet known to Grab.

An analysis is needed to determine the patterns of public response to the Grab application on Google Play; one analysis that can be used is sentiment analysis. Sentiment analysis can reveal which topics users often discuss on Google Play. It is applied to classify positive, negative, and neutral feedback from consumers so as to accelerate and simplify the company's task of reviewing its product shortcomings. If negative sentiment is found, the company can take action to mitigate it.

The user reviews of Grab are processed with sentiment analysis using the Support Vector Machine (SVM) and Maximum Entropy (ME) methods. After classification, information is extracted and explored for each of the positive and negative sentiment classes. The negative sentiment obtained is then analyzed using a fishbone diagram to determine the causal factors. In the process of
extraction and exploration of information, descriptive statistics and inter-group associations are used to find topics that are often discussed by Grab users.

II. LITERATURE REVIEW

A. Support Vector Machine
Support Vector Machine (SVM) is a classification method with the ability to generalize when classifying patterns. SVM determines the best hyperplane with a maximum margin and generalizes to data not used in the learning phase [5]. It can handle a high-dimensional input space, and text categorization problems can often be separated linearly; data that are not linearly separable can be mapped with a kernel into a high-dimensional space, where they become linearly separable by a hyperplane [6]. SVM also has a high degree of accuracy in text classification [7].

Wulandini and Nugroho studied spatio-temporal information on tropical diseases; their results showed that SVM is capable of classifying text because it works well on high-dimensional data and avoids the curse of dimensionality [8]. Al-Harbi investigated the effect of feature selection methods and their combinations on dialectal Arabic sentiment classification. The feature selection methods were Information Gain (IG), Correlation, Support Vector Machine (SVM), Gini Index (GI), and Chi-Square. A number of experiments were carried out on dialectal Jordanian reviews using an SVM classifier. The experimental results showed that the best performance of the SVM classifier was obtained when the SVM and Correlation feature selection methods were combined with the uni-gram model [9]. Joachims explores the use of Support Vector Machines (SVMs) for learning text classifiers from examples. SVMs achieve substantial improvements over the currently best performing methods and behave robustly over a variety of different learning tasks. Furthermore, they are fully automatic, eliminating the need for manual parameter tuning [10].

B. Maximum Entropy
Maximum Entropy is a classification method that finds the distribution p(a | b) with the maximum entropy value in order to get the probability distribution that is closest to reality. Maximum entropy can incorporate history-dependent probabilities, creates a "smooth" model that satisfies all empirical constraints, and can integrate various sources of information into a single language model [11].

Xie, et al. used the Maximum Entropy method combined with a probabilistic latent semantic analysis (PLSA) model. Their experiments show that the proposed classification method has an ideal classification effect [12]. Kumar, et al. compared four classification methods, namely SVM, KNN, Naive Bayes Classifier, and Maximum Entropy. The results showed that Maximum Entropy gave the best results in terms of accuracy and F-measure when compared with the other classifiers [13]. Naiknaware, et al. compared the SVM, Naive Bayes Classifier (NBC), and Maximum Entropy classification methods on seven datasets (Budget2017, Demonetization, GST2017, Digital India, Kashmir, Make in India, Indian Startup). Naïve Bayes is the best performer on the Budget2017 and Demonetization datasets, SVM performs best on GST2017, and Maximum Entropy performs best on Digital India, Kashmir, Make in India, and Indian Startup [14].

III. METHODS

A. Population and Sample Research
The population in this study is the Google Play website database, that is, all Grab review data. The sample is the set of Grab reviews posted between the mid-cycle upgrade of the Grab application on January 1, 2019 and the last upgrade on January 31, 2019.

B. Types and Data Sources
The type of data used in this study is primary data. Primary data is information obtained first-hand by researchers on the variables of interest for the specific purposes of the study [15]. The data in this study were obtained by scraping the Grab page on Google Play with the default Scraper application from Google Chrome, using the website address https://play.google.com/store/apps/details?id=com.Grabtaxi.passenger. The data obtained consist of 4,505 user reviews.

C. Research Variable
A research variable is anything, in any form, that the author determines to study in order to obtain information from which conclusions can be drawn [16]. Two variables are used in this study: date (the date on which a comment was made) and review (the content of the user's comment).

D. Data Collection Method
Data in this study were collected with web scraping techniques. Web scraping is a combination of techniques used to get information from a website automatically without having to copy it manually [17]. The tool used for web scraping is the Scraper software in Google Chrome.

E. Data Analysis Method
The methods of data analysis used in this study are, among others:
1) Descriptive analysis, which is used to give a general description of the Grab reviews found on the Google Play website.
2) Sentiment analysis based on the Lexicon dictionary, used to label data into positive and negative sentiment classes.
3) Machine learning methods, namely Support Vector Machine and Maximum Entropy, which in this study are used to classify the user reviews.
4) Barplot and Wordcloud, which are used to visualize the most frequently used words in reviews.
5) Text association, which is used in this study to identify word patterns that can be associated with other words in order to obtain important information.
6) The causal diagram (Fishbone), which is used to identify the most dominant causal factors for the problems found in negative reviews.

IV. RESULTS AND DISCUSSION

Data processing in this study began with a descriptive analysis of the 4,505 Grab user reviews on the Google Play site. The data were then pre-processed, which included translating foreign-language reviews, spelling normalization, case folding, tokenizing, and filtering. After that, the user review data were ready for the sentiment class labeling process and the search for text associations.
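The pre-processing steps listed above can be illustrated with a minimal sketch. The study itself used R, so the Python code below is only an illustration under stated assumptions: the stopword list, the normalization map, and the function name `preprocess` are hypothetical, the translation step is omitted, and spelling normalization is applied per token after case folding for simplicity.

```python
import re
from typing import List

# Hypothetical stopword list for illustration only; the stopword list
# actually used in the study is not given in the text.
STOPWORDS = {"the", "is", "a", "and", "to"}

# Hypothetical spelling-normalization map (non-standard form -> standard form).
NORMALIZATION = {"gr8": "great", "drvr": "driver"}


def preprocess(review: str) -> List[str]:
    """Apply case folding, tokenizing, spelling normalization, and filtering."""
    folded = review.lower()                                  # case folding
    tokens = re.findall(r"[a-z0-9]+", folded)                # tokenizing on alphanumeric runs
    normalized = [NORMALIZATION.get(t, t) for t in tokens]   # spelling normalization
    return [t for t in normalized if t not in STOPWORDS]     # filtering out stopwords


if __name__ == "__main__":
    print(preprocess("The drvr is GR8, and polite!"))
```

Each review becomes a list of cleaned tokens, which is the form that word-frequency counts and lexicon-based labeling typically operate on.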
A. Descriptive Analysis
Descriptive analysis in this study aims to give a general description of the Grab application based on user review data from the Google Play site that were previously obtained by scraping. The aspects examined are the number of incoming reviews over time and the comparison between the number of reviews in two categories, positive reviews and negative reviews.

The number of reviews in January tended to increase. It rose dramatically from 82 reviews on January 1 to its peak on January 30. This was presumably because Grab offered promos in early 2019, which brought more users and reviews. The Grab website stated that there was a 50% discount for Grabcar and a 70%-80% discount for Grabbike from 1 to 31 January 2019.

B. Pre-processing Data
The Grab user review data obtained from scraping the Google Play site cannot be used directly because they are in free-form sentence or text form that contains a lot of noise. There were also foreign-language (English) reviews that had to be translated into Indonesian, because their content would otherwise be difficult to process, so the data had to be cleaned first. The pre-processing stage was carried out with RStudio 1.1.463 and R 3.5.3. The pre-processing steps were translation of foreign-language reviews, spelling normalization, case folding, tokenizing, and filtering.

C. Labeling and Weighting of the Sentiment Class
The next stage after pre-processing was sentiment analysis for data labeling. The data were labeled automatically with the Lexicon dictionary by calculating sentiment scores. Word weighting is done by calculating the frequency of occurrence of words in a text document. The more often a word appears in a text document, the greater its weight, and the more strongly the word is considered to represent the document [18].

The labeling process divides the data into three sentiment classes by scoring: positive, neutral, and negative. Whether a document belongs to the positive or negative sentiment class is determined using a collection of Indonesian words consisting of a list of positive words and a list of negative words. Based on this collection, automatic labeling is carried out in R by calculating the number of positive words minus the number of negative words in a review sentence [19]. A sentence with a score > 0 is classified as positive, a sentence with a score = 0 as neutral, and a sentence with a score < 0 as negative [20].

The labeled data in this study fell into three classes: positive sentiment with 2,649 reviews, neutral sentiment with 129 reviews, and negative sentiment with 1,727 reviews. However, only the positive and negative sentiment classes were used, because the neutral class is considered to provide less input and benefit for Grab. Reviews containing positive statements such as pride, expressions of gratitude, and words of praise were included in the positive sentiment category. Reviews containing negative statements such as insults and dissatisfaction were included in the negative sentiment category. Reviews that contain neither positive nor negative statements, and reviews with the same number of negative and positive statements, belong to the neutral class. These reviews included questions without sentiment, advertisements, and so on.

In this study, class reduction was done by manually assigning the neutral reviews to either the positive or the negative class. A neutral review that contains neither positive nor negative statements is assigned to the positive class, whereas a neutral review that contains a balance of positive and negative words is assigned to the negative class [21]. This follows earlier practice, on the consideration that negative information is more easily extracted and interpreted as user complaints or dissatisfaction, so that Grab can make improvements.

Fig. 2. Comparison of the number of positive and negative reviews of Grab users on Google Play (January 2019): positive reviews 2,736 (60.7%), negative reviews 1,769 (39.3%).

Based on Fig. 2, the sentiment class labeling produced more positive reviews than negative reviews. Of a total of 4,505 reviews, 2,736 (60.7%) were positive and 1,769 (39.3%) were negative.

D. Classification Analysis
The training process produces a classification model, and the accuracy of the model is then tested using testing data. Two algorithms, Support Vector Machine (SVM) and Maximum Entropy, were used to build the classification model from positive and negative training data.

The classification algorithm uses the training data to form a classifier model. This model is a representation of knowledge that is used to predict the classes of new, unseen data; the more training data are used, the better the algorithm learns the data pattern. The testing data are used to measure whether the classifier assigns classes correctly. Both the training data and the testing data have class labels, and they are split in a ratio of 80%:20%. Although extensive research has not been carried out on the optimal ratio between these data sets, there are several common practices for choosing their sizes [22]. Based on the Pareto Principle, the ratio commonly used is 80:20 for training and testing. With this ratio, of the total 4,505 reviews, 3,604 reviews were used as training data and 901 reviews as testing data.

In this study, an experiment was carried out using several kernels, namely Linear, Polynomial, Radial Basis Function (RBF), and Sigmoid, to obtain the classification with the best accuracy in SVM. Table I compares the four kernels that were tested.
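The 80:20 split described above can be sketched as follows. This is a minimal illustration rather than the study's R code; the function name `split_reviews`, the `seed` parameter, and the integer placeholders standing in for labeled reviews are assumptions made for the example.

```python
import random
from typing import List, Sequence, Tuple


def split_reviews(items: Sequence, train_ratio: float = 0.8, seed: int = 0) -> Tuple[List, List]:
    """Shuffle the labeled items and split them into training and testing sets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)       # random assignment, reproducible via seed
    cut = round(len(shuffled) * train_ratio)    # size of the training set
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    reviews = list(range(4505))                 # placeholders for the 4,505 labeled reviews
    train, test = split_reviews(reviews)
    print(len(train), len(test))                # sizes of the two sets
```

Calling the function with different `seed` values gives different random splits, which is one simple way to set up repeated trials on the same labeled data.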
TABLE I. COMPARISON OF KERNEL USE ON SVM
Kernel       Accuracy
Linear       89.01%
Polynomial   63.26%
RBF          85.35%
Sigmoid      83.57%

Based on Table I, of the four kernels tested, the Linear kernel has the highest accuracy. Thus, the Linear kernel was used in the classification process in this study. In addition to the SVM algorithm, this study also uses the Maximum Entropy algorithm to obtain classification accuracy values.

The classification process was carried out through experiments in which the training data and testing data were selected randomly. This study uses the confusion matrix for evaluation. The confusion matrix is one of the important evaluation tools in machine learning and usually covers two or more categories [23]. Each matrix element shows the number of test samples, with rows describing the actual class and columns describing the predicted class. To evaluate the model, five dataset experiments (trials) were conducted to find the best predictive accuracy. The results of each trial using the Support Vector Machine and Maximum Entropy methods are as follows:

TABLE II. COMPARISON OF ACCURACY VALUE OF EXPERIMENT DATASET USING SVM AND ME
Trial     SVM      ME
Trial 1   88.79%   90.46%
Trial 2   89.01%   90.46%
Trial 3   85.79%   88.46%
Trial 4   87.35%   85.24%
Trial 5   87.35%   89.68%

Based on Table II, of the five trials conducted using the SVM and Maximum Entropy methods, Trial 2 produced the highest accuracy: 89.01% for the SVM method and 90.46% for the Maximum Entropy method. Accuracy is calculated as the number of correctly classified testing data divided by the total number of data tested. The average accuracy of SVM was 87.66% and the average accuracy of Maximum Entropy was 88.86%.

The confusion matrix simplifies the accuracy calculation by showing the number of test data that are correctly classified and the number that are misclassified. The confusion matrices of the two methods obtained in Trial 2 are shown in Table III.

TABLE III. CONFUSION MATRIX
Prediction   SVM                   Maximum Entropy
             Positive   Negative   Positive   Negative
Positive     522        25         505        44
Negative     74         280        42         310
Accuracy     89.01%                90.46%

For the SVM method, of the 596 positive reviews tested, 522 were correctly classified and 74 were misclassified as negative. Of the 305 negative reviews tested, 280 were correctly classified as negative and 25 were misclassified as positive. The confusion matrix thus gives an accuracy of 89.01%, meaning that of the 901 reviews tested, 802 were correctly classified by the Support Vector Machine (SVM) model.

With the Maximum Entropy method, of 549 positive reviews, 505 were correctly classified and 44 were misclassified as negative. Of 352 negative reviews, 310 were correctly classified as negative and 42 were misclassified as positive. The confusion matrix thus gives an accuracy of 90.46%, meaning that of the 901 reviews tested, 815 were correctly classified by the Maximum Entropy (Maxent) model. Compared with the Support Vector Machine (SVM) method, the Maximum Entropy (Maxent) method has a higher accuracy.

E. Visualization and Text Associations
Visualization was carried out on each sentiment class. Its purpose is to extract the topics most often discussed or reviewed by Grab users so that, from the large number of review texts available, important information and associations between words that most often appear together can be identified to strengthen the information search. The word visualization and association results for each sentiment class are explained below.

1) Positive Reviews
The positive review data used are those labeled either with the Lexicon dictionary or manually. Information extraction on positive reviews is repeated to find the topics that Grab users review or discuss most often.

In the positive reviews of the e-commerce service Grab, from a total of 2,736 positive reviews, the most frequent words included "application" (515 times), "driver" (458 times), "great" (400 times), "promo" (366 times), and so on. These words carry positive sentiment and represent the topics most widely reviewed by Grab users. They are then used as a basis for finding associations with other words, so that better information can be obtained. Associations with words that frequently appear together with them were then computed, and the results are shown in Fig. 3.
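The text does not state how the association scores are computed. As a hedged illustration, the sketch below assumes that the association between two words is the Pearson correlation of their presence across reviews, a measure commonly used for term associations in text-mining toolkits. The function name `association` and the toy reviews are hypothetical and are not taken from the study's data.

```python
from math import sqrt
from typing import List, Set


def association(docs: List[Set[str]], word_a: str, word_b: str) -> float:
    """Pearson correlation between the presence of two words across documents."""
    xs = [1.0 if word_a in d else 0.0 for d in docs]   # presence vector of word_a
    ys = [1.0 if word_b in d else 0.0 for d in docs]   # presence vector of word_b
    n = len(docs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0                                     # undefined correlation treated as 0
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy tokenized reviews, invented for the example.
    toy_reviews = [
        {"ovo", "balance"},
        {"ovo", "balance"},
        {"driver"},
        {"driver", "polite"},
    ]
    print(association(toy_reviews, "ovo", "balance"))
    print(association(toy_reviews, "driver", "polite"))
```

Under this assumption, a score close to 1 means the two words almost always occur in the same reviews, while a score near 0 means their occurrences are unrelated.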
Fig. 3. Text association of Grab's positive sentiment reviews. For each seed word, the figure lists associated words with their association scores. The strongest association for each seed word is: application: competitor (0.44); driver: money (0.30); promotion: evaluation (0.30); great: reward (0.26); food: completeness (0.27); good: driver (0.28); order: advice (0.22); ovo: balance (0.51); price: matching (0.38); like: completeness (0.25).

Information obtained from the analysis of positive class text associations is as follows. Users like the Grab application and say that Grab deserves a reward because of its completeness, suitability, and clarity compared with competing applications. Grab drivers are considered polite in serving customer orders. Promotions in the form of coupons should be maintained and offered more frequently. The list of food menus in Grab's service is considered interesting, favorite, complete, and appropriate. Grab users advise drivers to confirm with users the availability of orders from sellers. Payment for Grab orders via ovo balance matches the price estimates provided through the application.

2) Negative Reviews
Information extraction on negative reviews is repeated to find the topics that Grab users review or discuss most often. From a total of 4,505 reviews, 1,769 negative reviews were identified. The extracted information is identified based on the frequency of words in the reviews and on the relevance of each word to topics that reflect negative sentiment.

In the negative reviews of the e-commerce service Grab, from a total of 1,769 negative reviews, the most frequent words are "driver" (783 times), "application" (780 times), "message" (384 times), "promotion" (327 times), and so on. These words carry negative sentiment and represent the topics most widely reviewed by Grab users. They are then used as a basis for finding relationships with other words, so that better information can be obtained. Associations with words that frequently appear together with them were then computed, and the results are shown in Fig. 4.

Fig. 4. Text association of Grab's negative sentiment reviews. For each seed word, the figure lists associated words with their association scores. The strongest association for each seed word is: driver: far (0.32); application: connect (0.20); order: food (0.30); promotion: code (0.33); difficult: connect (0.24); login: closed (0.25); ovo: balance (0.64); account: blocked (0.35); disappointed: pay (0.28); cancel: customer (0.30).

Information obtained from the analysis of negative class text associations is as follows. Users judge Grab's drivers to be spoiled, because drivers reject orders when the distance is far. Users are disappointed and feel unappreciated because some drivers do not deliver orders that have been paid for. The Grab application cannot connect to the internet, always asks for system updates, and always closes when it is opened (so the smartphone must be restarted). Users cannot order Grabfood. Grabfood information is inaccurate: a store that should be open is shown as closed by the Grabfood service. Payment of promo codes with Grabpay does not work. Payment with ovo is problematic: the user's ovo balance is deducted even though the driver does not receive it, so the user pays the driver in cash. User accounts are blocked by the Grab system, and there is no account recovery procedure.

F. Factors for Improving Negative Review Grab Problems
The factors that cause the e-commerce service Grab to receive negative reviews are viewed from the 6P aspects, namely price, people, process, promotion, place, and product. These factors are obtained from fishbone diagram analysis, after which solutions to the problems are determined. The problem-solving plan for Grab is shown in Table IV.

TABLE IV. PLAN TO SOLVE THE GRAB PROBLEMS OF NEGATIVE REVIEW
No. | Factor | Problem | Solution
1 | Price | Price is not appropriate | Update the latest prices of products from the seller.
2 | People | Fraudulent driver | Make SOPs and strict penalties for drivers so that they do not commit fraud.
2 | People | Unilateral cancellation | Make SOPs and strict penalties for drivers so that they do not unilaterally cancel consumers' orders.
3 | Process | Always asks for a system update | The developer should immediately repair the application system so that system update errors do not occur.
3 | Process | Network-to-server error | The developer should immediately repair the application system so that system update errors do not occur.
3 | Process | The procedure for account recovery is unclear | Improve the application system so that it does not block accounts without reason. Provide a written explanation of how to recover a blocked account.
4 | Promotion | Promo is active but cannot be used | Maintain the active promotions.
4 | Promotion | Promotion is not active | Maintain the active promotions.
4 | Promotion | Few promotions | Increase the number of promotions.
5 | Place | System error | System errors usually occur when there is an application update. Therefore, before the latest version is launched, the application must be tested carefully until no errors are detected.
6 | Product | Consumers do not get their orders | Make SOPs and strict penalties for drivers so that they do not carry away consumer orders.
6 | Product | Incorrect product description | Update the product information in the application.

V. CONCLUSIONS
In January 2019 there were 4,505 Grab user reviews; based on sentiment class labeling, 2,736 reviews were positive and 1,769 reviews were negative. Classification using the Support Vector Machine (SVM) has an accuracy of 89.01%, and Maximum Entropy (Maxent) has a higher accuracy of 90.46%. Based on the results of classification and text associations conducted, the majority of Grab users talk

REFERENCES
[1] Asosiasi Penyelenggaraan Jasa Internet Indonesia. "Penetrasi dan Perilaku Pengguna Internet Indonesia", in press.
[2] Josi, A., Abdillah, L.A., and Suryayusra. "Penerapan Teknik Web Scraping pada Mesin Pencari Artikel Ilmiah" Jurnal Sistem Informasi, 5(2), 2014, p. 159-164.
[3] ASA News. "Spire Research and Consulting Rilis Survei Layanan Transportasi Online", in press.
[4] Karch, M. "What Is Google Play?", in press.
[5] Nugroho, A. S., Wranto, A. B., and Handoko, D. "Support Vector Machine Teori dan Aplikasinya dalam Bioinformatika", in press.
[6] Liu, Yi and Zheng, Y.F. "One-Against-All Multi-Class SVM Classification using Reliability Measures" [IEEE International Joint Conference, January, 2005].
[7] Naradhipa, R.A. and Purwarianti, A. "Sentiment Classification for Indonesian Message in Social Media" [International Conference on Electrical Engineering and Informatics, July, 2012].
[8] Wulandini, F. and Nugroho, A.S. "Text Classification Using Support Vector Machine for Webmining Based Spatio Temporal Analysis of the Spread of Tropical Diseases" [International Conference on Rural Information and Communication Technology, p. 189-192, 2009].
[9] Al-Harbi, O. "A Comparative Study of Feature Selection Methods for Dialectal Arabic Sentiment Classification Using Support Vector Machine" IJCSNS International Journal of Computer Science and Network Security, 19(1), 2019, p. 167-176.
[10] Joachims, T. "Text Categorization with Support Vector Machines: Learning with Many Relevant Features" [European Conference on Machine Learning, p. 137-142, 2005].
[11] Wu, J. "Maximum Entropy Language Modeling with Syntactic, Semantic, and Collocational Dependencies" [Center for Language and Speech Processing, Baltimore, April, 2001].
[12] Xie, X., Ge, S., Hu, F., Xie, M., and Jiang, N. "An improved algorithm for sentiment analysis based on Maximum Entropy" Soft Computing, 23(2), 2019, p. 599-611.
[13] Kumar, H.M.K., Harish, B.S., and Darshan, H.K. "Sentiment Analysis on IMDb Movie Reviews Using Hybrid Feature Extraction Method" International Journal of Interactive Multimedia and Artificial Intelligence, 2018, p. 1-7.
[14] Naiknaware, B., Kushwaha, B., and Kawathekar, S. "Social Media Sentiment Analysis using Machine learning Classifiers" International Journal of Computer Science and Mobile Computing, 6(6), 2017, p. 465-472.
[15] Sekaran, U. Research Methods for Business Edisi I and 2. Jakarta: Salemba Empat, 2011.
[16] Sugiyono. Metode Penelitian Kualitatif dan R&D. Bandung: Alfabeta, 2011.
[17] Vargiu, E. and Urru, M. "Exploiting Web Scraping In a Collaborative Filtering-based Approach to Web Advertising" Artificial Intelligence Research, 2(1), 2012, p. 44-54.
[18] Basnur, P.W. 2009. "Pengklasifikasian Artikel Berita Berbahasa Indonesia Secara Otomatis Menggunakan Ontologi" Program Ilmu Komputer Fakultas Ilmu Komputer Universitas Indonesia, unpublished.
[19] Susanti, A.R. 2016. "Analisis Klasifikasi Sentimen Twitter Terhadap Kinerja Layanan Provider Telekomunikasi Menggunakan Varian
about drivers, applications, messages, promos, and ovo. Based Naive Bayes” Institut Pertanian Bogor, unpublished.
on fishbone diagram analysis there are 12nd problems on [20] Buntoro, G.A. “Analisis Sentimen Calon Gubernur DKI Jakarta 2017
negative reviews by Grab user. We classify that problems into di Twitter“ Jurnal Integer, 2(1), 2017, p. 32-41.
6P factors, there ae Price, People, Process, Promotion , Place, [21] Gumilang, Z.A. 2018. “Implementasi Naive-Bayes Classifier dan
Asosiasi Untuk Analisis Sentimen Data Ulasan Aplikasi E-commerce
and Product. Shopee pada Situs Google Play” Program Studi Statistika FMIPA
Universitas Islam Indonesia, unpublished.
ACKNOWLEDGEMENT [22] Suthaharan, S. “Machine Learning Models and Algorithms for Big
Data Classification: Thinking with Examples for Effective Learning“
The author would like to thanks for Grab Indonesia who Springer, 2015, p. 10.
provide the data and Universitas Islam Indonesia for the [23] Manning, C.D., Raghavan, P., dan Schutze, H. “An Introduction to
financial support. Information Retrieval” Cambridge: Cambridge University Press, 2009.
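As a supplementary check on the figures reported in the conclusions, the short sketch below verifies that the positive and negative counts add up to the total and compares the two reported accuracies against a majority-class baseline. The baseline is an assumption introduced here for context and is not part of the paper's method; it also assumes the evaluation data had the same class balance as the full set of 4,505 reviews, which the paper does not state. The names `majority_baseline` and `gain_over_baseline` are illustrative only.

```python
# Figures reported in the conclusions of the paper.
positive_reviews = 2736
negative_reviews = 1769
total_reviews = 4505
reported_accuracy = {"SVM": 0.8901, "Maxent": 0.9046}

# The two sentiment classes should account for every labelled review.
assert positive_reviews + negative_reviews == total_reviews

# Assumption (not from the paper): a baseline that labels every review with
# the larger class, used here only as a reference point.
majority_baseline = max(positive_reviews, negative_reviews) / total_reviews


def gain_over_baseline(accuracy, baseline):
    """Return the accuracy improvement over the baseline in percentage points."""
    return (accuracy - baseline) * 100


print(f"Positive share of reviews: {positive_reviews / total_reviews:.2%}")
print(f"Majority-class baseline accuracy: {majority_baseline:.2%}")
for name, accuracy in reported_accuracy.items():
    gain = gain_over_baseline(accuracy, majority_baseline)
    print(f"{name}: {accuracy:.2%} ({gain:.2f} points above the baseline)")
```

Under these assumptions, both classifiers sit well above what the class imbalance alone would give, and the gap between Maxent and SVM is small by comparison.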