Available online at www.sciencedirect.com
ScienceDirect
Procedia Computer Science 216 (2023) 682–690
www.elsevier.com/locate/procedia
7th International Conference on Computer Science and Computational Intelligence 2022
Sentiment analysis for customer review: Case study of Traveloka
Ziedhan Alifio Diekson a, Muhammad Rivyan Bagas Prakoso a, Muhammad Savio Qalby Putra a, Muhammad Shaden Al Fadel Syaputra a, Said Achmad a, Rhio Sutoyo a,*
a Computer Science Department, School of Computer Science, Bina Nusantara University, Jakarta 11480, Indonesia
Abstract
This study discusses customers' satisfaction with the services of Traveloka by analyzing how many people are satisfied and unhappy with the services that Traveloka has to offer. This study uses Twitter to acquire all the data we need, focusing only on tweets about Traveloka. The dataset is gathered from the Twitter API and consists of 1200 tweets related to Traveloka. The scikit-learn library is used through Python to do the analysis process. This research employs three classification methods: Support Vector Machine (SVM), Logistic Regression, and Naïve Bayes. The steps in this research were data retrieval, transformation, classification training and predicting the test data, and finally, the result analysis. This research therefore looks at how most Twitter users feel about the performance of this mobile traveling application. The result shows that SVM has better accuracy in determining the sentiment of tweets about Traveloka.
© 2023 The Authors. Published by Elsevier B.V.
This is an open access article under the CC BY-NC-ND license (https://creativecommons.org/licenses/by-nc-nd/4.0)
Peer-review under responsibility of the scientific committee of the 7th International Conference on Computer Science and Computational Intelligence 2022
Keywords: Sentiment analysis, Twitter, Traveloka
1. Introduction
Machine learning addresses the question of how to build computers that improve automatically through experience [1]. Machine learning is growing quite rapidly lately, and it also develops every time we gather data worldwide. There are many examples and types of machine learning, one of which is sentiment analysis.
Sentiment analysis is a growing field at the intersection of linguistics and computer science that attempts to determine the sentiment that is contained inside the sentence automatically [2]. The goal of sentiment analysis is to determine what kind of sentiment we have just acquired from the dataset. Those sentiments could be either negative or positive.
* Corresponding author.
E-mail address: rsutoyo@binus.edu
1877-0509 © 2023 The Authors. Published by Elsevier B.V.
10.1016/j.procs.2022.12.184
The analysis of these sentiments and opinions has spread across many fields, such as consumer information, marketing, books, applications, websites, and social media [3]. In this study, we take our example from the social media platform Twitter.
Sentiment analysis is a method that determines the sentiment of a sentence, that is, whether the sentence expresses a negative or a positive sentiment.
A sentence is positive when it shows that the writer is happy and satisfied with the app's performance. The criterion for negative sentiment is the exact opposite: an expression that indicates dissatisfaction with the performance of Traveloka is classified as negative.
A positive sentence typically contains positive words, such as "Bagus", "Enak", and "Nyaman", while a negative sentence contains negative words, such as "Jelek", "Sedih", "Kurang", and numerous others.
As mentioned, sentiment analysis can be used on many platforms, one of which is Twitter. Nowadays, people tend to pour out what is on their minds through social media such as Twitter. A tweet can be anything from plain text to an attached video, but Twitter mainly serves as a place where people say what they think.
Traveloka is a mobile-based app focusing on traveling services such as ticket or hotel booking. It was launched in 2012 and mainly serves to make traveling more accessible: people can now book tickets from their smartphones. Because of Traveloka, travelers and backpackers can choose suitable accommodations and the best transportation available.
The company also launched its own food delivery service, Traveloka Eats, in 2018 to compete against other mobile application companies that had launched food delivery services earlier. However, it could not compete against these far stronger competitors. Although the service was launched in 2018, advertisements for Traveloka Eats can be seen everywhere recently, for example as YouTube advertisements or online banner advertisements.
This study aims to identify the satisfaction of Twitter users with the services provided by Traveloka by using sentiment analysis. The satisfied, or positive, class covers people who are happy and satisfied with the app's performance. In contrast, the unhappy, or negative, class is the exact opposite, covering dissatisfied people or bad experiences.
2. Literature Review
Sentiment analysis is a text classification task that aims to determine subjective information from a sentence, namely whether the sentence has positive, negative, or neutral sentiment. Sentiment analysis extracts contextual information from a text and then determines the sentiment of the text using algorithms such as machine learning or deep learning. Both the methods that can be applied and the platforms that can be analyzed vary widely, which shows how broad this area of research is. Applications range from analyzing customer reviews to predicting a presidential election from Twitter [4]. Sentiment analysis is also helpful for noticing how people feel during the COVID-19 outbreak. For example, the research conducted by A. D. Dubey compares the emotions expressed in tweets from 12 countries [5]. Those emotions are anger, anticipation, disgust, fear, joy, sadness, surprise, and trust. The results show which country scores highest for each emotion; for example, people from France expressed the most anger toward the COVID-19 outbreak. The results are mixed across emotions.
Besides people's reactions to the COVID-19 outbreak, sentiment analysis can also be used to find out what people think about the vaccines or the lockdowns of the past few years. The research that S. Almotiri conducted aims to find out what New Zealanders think about the lockdown in their country [6]. Most of the people there were surprised, took it positively, and supported the government's actions for the betterment of others. While lockdowns were taken positively, the question remains how people view vaccines. The research that C. Villavicencio et al. conducted shows that people in the Philippines were happy to be vaccinated [7], even though some people elsewhere might perceive the vaccine as dangerous. We can also use sentiment analysis to find out what people think about certain companies or products. For example, the research that D. Das et al. conducted shows how many people think positively or negatively about an airline company [8]. Beyond airlines, the research conducted by A. R. Prananda et al. shows that the approach can be used for other companies or products [9]. They used sentiment analysis to study customers' views about the performance of the Go-Jek app, which was new at the time they conducted the research.
As mentioned before, sentiment analysis can be done using various methods. The research that R. Patel et al. conducted used a lexicon-based method to find out what people think about the World Cup held in Brazil in 2014 [10]. They gathered all the data they needed by themselves. Similarly, the research that B. Thapa conducted uses two platforms as the source of the data [11]: Twitter and Reddit posts about how people feel about cybersecurity. They used Python Text Processing as the tool for data gathering and the VADER (Valence Aware Dictionary and sEntiment Reasoner) algorithm as the classifier.
In sentiment analysis, the classification is divided into three levels: document level, sentence level, and aspect level [12]. This research works at the sentence level. Preprocessing the data involves steps such as cleaning, removing stopwords, tokenization, and stemming [13]. After preprocessing, we can determine which words carry sentiment [14]. As a result, we can find out the sentiment of the data we are looking for [15].
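The preprocessing steps named above can be sketched in plain Python. This is a minimal illustration and not the pipeline of any cited paper: the stopword set, the suffix list, and the function names are hypothetical placeholders, and a real Indonesian pipeline would rely on a full stopword list and a dedicated stemmer.

```python
import re

# Hypothetical, tiny stopword list used only for this illustration.
STOPWORDS = {"di", "yang", "yg", "ini", "ada", "dan"}

# Hypothetical suffixes; real Indonesian stemming needs a dedicated stemmer.
SUFFIXES = ("nya", "kan", "lah")


def clean(text):
    """Lowercase the text and strip URLs, mentions, hashtags, and punctuation."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # URLs
    text = re.sub(r"[@#]\w+", " ", text)       # mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)      # digits and punctuation
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text):
    """Split cleaned text on whitespace."""
    return text.split()


def remove_stopwords(tokens):
    """Drop tokens that appear in the stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def toy_stem(token):
    """Strip one known suffix when enough of the word remains."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) > len(suffix) + 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Run cleaning, tokenization, stopword removal, and stemming in order."""
    tokens = remove_stopwords(tokenize(clean(text)))
    return [toy_stem(token) for token in tokens]


tokens = preprocess("Promonya bagus di @traveloka! https://t.co/x")
print(tokens)
```

Keeping each step as its own function makes it easy to swap the toy stemmer or stopword list for a library-backed one without touching the rest of the pipeline.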
Based on information published by the Tetra Pak Index (2017), there are around 132 million internet users in Indonesia, and 40 percent of the population are social media users [16]. Sentiment analysis analyzes opinions, sentiments, evaluations, judgments, attitudes, and emotions towards entities such as products, services, organizations, individuals, problems, events, topics, and their attributes [17]. Sentiment analysis is the process of reading text to get sentiment information from the text [18]. In this study, similar to existing research, we collect data from tweets on Twitter [19]. Although we use different methods, we also use Twitter's API and scikit-learn.
Sentiment analysis is therefore a broad field in which different methods can be applied to different platforms. The result of our literature review can be seen in Table 1.
3. Methodology
Our stages of work are shown in Figure 1. In the data retrieval stage, we first need authorization from Twitter to gain permission to gather data from the Twitter API. After we gathered the data, we performed the data transformation step. This step transforms the whole dataset from text into numerical data using the TF-IDF Vectorizer, and we split the data into train and test datasets. After the transformation, we began the classification training and predicting on the test data step, which calculates the accuracy score and the f1-score. The f1-score formula can be seen in Equation 1. The last step is results analysis. The f1-scores obtained from each model are presented in a table, broken down by sentiment. We then explore the classification result of the model with the highest f1-score further by presenting its confusion matrix.
3.1. Data Retrieval
In this step, we create our dataset, which consists of 1200 tweets related to Traveloka. The dataset is collected using the following rules:
1. Twitter posts are filtered only to show posts from Indonesia
2. All Twitter posts that contain the "traveloka" keyword
3. All Twitter posts that contain the "traveloka eats" keyword
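The three rules can be expressed as a small filter over tweets that have already been retrieved. The sketch below does not call the Twitter API; the record fields `text` and `country` and the function name `matches_rules` are assumptions made for illustration, and the sample records are invented.

```python
# Keywords from rules 2 and 3. Any text containing "traveloka eats" also
# contains "traveloka", so rule 3 is a special case of rule 2.
KEYWORDS = ("traveloka eats", "traveloka")


def matches_rules(tweet, country_code="ID"):
    """Return True when a tweet record satisfies the collection rules."""
    # Rule 1: keep only posts from Indonesia (hypothetical `country` field).
    if tweet.get("country") != country_code:
        return False
    # Rules 2 and 3: case-insensitive keyword match on the tweet text.
    text = tweet.get("text", "").lower()
    return any(keyword in text for keyword in KEYWORDS)


# Invented sample records, for illustration only.
sample = [
    {"text": "Diskon Traveloka Eats mantep", "country": "ID"},
    {"text": "traveloka promo", "country": "SG"},
    {"text": "liburan ke bali", "country": "ID"},
]
kept = [tweet for tweet in sample if matches_rules(tweet)]
print(len(kept), "of", len(sample), "records kept")
```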
With the rules stated above, the dataset will be processed through a series of analysis methods. Below are examples of negative tweets within our dataset:
First example:
traveloka eat promonya makin ga menarik, auto uninstall deh. (traveloka eat’s promos these days are even more
uninteresting, automatically uninstalled it.)
Table 1. Literature Review Table

[4]  Objective: To analyze prediction of Indonesia presidential election from Twitter.
     Method: They used the same platform as ours, but they use the R programming language instead of Python.
[5]  Objective: To identify the sentiments of the citizens from 12 different countries regarding COVID-19 and identify what emotions have been shared by people from different parts of the world.
     Method: They used the same platform as ours, but they use the R programming language instead of Python, just like the first paper did.
[8]  Objective: To know people's thoughts about various airline services.
     Method: They used Rcurl as the IDE, sentR as the classifier, and several base methods of natural language processing.
[10] Objective: To know how satisfied the spectators of the 2014 World Cup are by finding the hashtags "#brazil2014" and "#worldcup2014" on Twitter.
     Method: They used the lexicon method.
[6]  Objective: To identify and analyze how people in New Zealand feel about the lockdown during the COVID-19 pandemic.
     Method: They used RapidAPI for the data collector and the AFINN lexicon method as the analyzer.
[7]  Objective: To know how many Filipinos are enthusiastic about the COVID-19 vaccine.
     Method: They used NLP and sentiment classification using the Naïve Bayes classifier algorithm.
[20] Objective: This study focuses on VAA, which is a tool to recommend candidates and parties to the people during the election.
     Method: They used the Dynamic Virtual Advice method.
[21] Objective: Sentiment analysis of two international apparel brands to determine which brand is most popular.
     Method: They used the same method as us, but they used the streaming API instead of the regular Twitter API.
[11] Objective: To analyze sentiments related to cybersecurity posted by people on Twitter and Reddit.
     Method: They used the VADER algorithm as the classifier and Python text processing as the data collector and analyzer.
[22] Objective: To analyze sentiments from Twitter posts related to electricity.
     Method: They used Natural Language Toolkit (NLTK) combined with scikit-learn in Python.
[23] Objective: To analyze the sentiment about reviews that are posted in the application page in Google Play Store about Shopee.
     Method: Gather data from the Shopee review page on Google Play Store and use Naive Bayes to perform sentiment analysis.
[14] Objective: To analyze the sentiment about reviews that are posted in the application page in Google Play Store about Go-Jek.
     Method: They used the Naive Bayes method to identify.
[24] Objective: To know whether Trump supporters or Hillary supporters are positive, neutral, or negative.
     Method: The methods that they are using are the same as ours, but they use the streaming API.
[25] Objective: To see public satisfaction with digital payment services available in Indonesia.
     Method: They used Naïve Bayes and K-Nearest Neighbour methods, which is different from ours.
[9]  Objective: To identify the business intelligence analysis in GO-JEK.
     Method: They used Microsoft Analytic Text Analytics instead of pandas.
Second example:
Jangan pesan tiket di @traveloka bikin emosi dan tidak punya tanggung jawab sama sekali #travelokakecewa
(Don’t book your tickets at @traveloka it makes you frustrated and they also don’t have any responsibility at
all.)
Both examples can be identified as negative sentiment. The first example contains negative words such as "uninteresting" and "uninstall", from which we can learn that the writer is dissatisfied with the services of Traveloka Eats and wanted to uninstall the app. The second example is similar. It contains the word "frustrated", from which we can learn that the writer was clearly not happy with the booking system at Traveloka; the writer also encouraged other people not to use the platform.
Next are some examples of positive tweets within our dataset:
First example:
barusan beli tix jakarta bali, totally 5jt, tapi karna reedem point, dapet potongan 300rb, trus ada promo
traveloka discount 100rb, mayan bgt diskonan 400rb. (I just bought a ticket from jakarta to bali, totally 5
million, but because I redeemed some points, I got 300 thousand off. Moreover, there’s a discount from
traveloka 100 thousand, it’s really nice that I got 400 thousand off.)
Second example:
Diskon traveloka eats manteepppp (The discounts in traveloka eats are greatttt)
Both examples can be identified as positive sentiment. The first contains the phrase "mayan bgt" (really nice), from which we can learn that the writer is satisfied with the services. The second example contains the word "mantep" (great). The difference is that the first example tells us that the writer is generally satisfied with Traveloka, while the second example tells us that the writer is happy with the discounts that Traveloka has to offer.
Because machine learning algorithms generally operate on numerical data, we need to convert the text into a set of numerical vectors, a process commonly known as vectorization. The vectorizer converts the input by calculating the TF-IDF score of each word in our dataset and storing the scores in vector form.
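To make the TF-IDF calculation concrete, the following from-scratch sketch computes the scores with the standard library only. It uses the smoothed IDF, idf(t) = ln((1 + n) / (1 + df(t))) + 1, followed by L2 normalization, which scikit-learn documents as the defaults of its `TfidfVectorizer`; whether the authors kept those defaults is not stated, and the whitespace tokenization and the function name `tfidf_vectors` are simplifying assumptions.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return (vocabulary, vectors) with one L2-normalized TF-IDF row per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(token for doc in tokenized for token in set(doc))

    # Smoothed inverse document frequency.
    idf = {
        token: math.log((1 + n_docs) / (1 + doc_freq[token])) + 1.0
        for token in vocabulary
    }

    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        row = [counts[token] * idf[token] for token in vocabulary]
        norm = math.sqrt(sum(value * value for value in row)) or 1.0
        vectors.append([value / norm for value in row])
    return vocabulary, vectors


# Invented two-document corpus, for illustration only.
vocabulary, vectors = tfidf_vectors(
    ["diskon traveloka mantep", "traveloka bikin emosi"]
)
for token, weight in zip(vocabulary, vectors[0]):
    print(f"{token}: {weight:.3f}")
```

In this toy corpus, "traveloka" appears in both documents and therefore receives a lower weight than "diskon" or "mantep", which is the behavior that makes TF-IDF useful for down-weighting words shared by every tweet.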
3.2. Classification Training and Predicting on the Test Data
In the next step, we split the data into training and test subsets: 80% of the dataset forms the train set and 20% forms the test set. We then train each classifier on the train set and predict the labels of the test set. We used Support Vector Machine (SVM), Naive Bayes, and Logistic Regression models to see which classifier has the best accuracy.
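The split-train-predict loop can be sketched with scikit-learn as follows. This is an illustration under stated assumptions rather than the authors' script: the ten sentences are invented toy data, the linear SVM kernel, `max_iter`, and `random_state` values are arbitrary choices, and the vectorizer is fitted on the training portion only, which is a common precaution against leaking test vocabulary and may differ from the order described in this paper.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.svm import SVC

# Invented toy sentences, for illustration only (not the paper's dataset).
texts = [
    "pelayanan bagus dan nyaman",
    "diskon mantep banget",
    "promo bagus harga nyaman",
    "aplikasi bagus proses cepat",
    "hotel nyaman diskon mantep",
    "pelayanan jelek bikin kecewa",
    "refund lama bikin emosi",
    "aplikasi jelek sering error",
    "promo kurang menarik kecewa",
    "tiket batal bikin emosi",
]
labels = ["positive"] * 5 + ["negative"] * 5

# 80% train, 20% test, keeping the class ratio in both parts.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=42, stratify=labels
)

# Fit TF-IDF on the training texts, then reuse it for the test texts.
vectorizer = TfidfVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="linear"),  # kernel choice is an assumption
    "Naive Bayes": MultinomialNB(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predicted = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predicted),
        "f1": f1_score(y_test, predicted, average="macro", zero_division=0),
    }

for name, scores in results.items():
    print(name, scores)
```

With only ten toy sentences the printed scores carry no meaning; the point is the shape of the workflow, which stays the same when the toy lists are replaced by the 1200 labeled tweets.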
The SVM model can also be used for sufficient data reduction. Shenglong Zhou [26] managed to reduce memory and storage use with a kernel-based SVM model. We used the SVM model because it reduces the need for feature selection, which makes text classification considerably easier.
According to a paper by Ying Guan et al., logistic regression is frequently used in the medical world to develop predictive models based upon binary data, for example to predict the likelihood of a patient's health status, such as healthy or diseased [27].
We used the Naive Bayes classifier because it has been used in various research. For example, a paper by Guoliang Ou et al. states that the Naive Bayesian classifier (NBC) has been used in numerous domains. The main advantages of the NBC are its simple model structure, which makes it easy to implement, and its good theoretical interpretability [28]. This simplicity and ease of implementation are also why we chose this classifier for our research.
After training the classifiers, we classified the test data. In this step, we obtain the accuracy and the f1-score for our dataset, and we print a classification report that shows both values. The f1-score is calculated using the equation:
$$F1\text{-Score} = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \qquad (1)$$
3.3. Results Analysis
After we obtained the accuracy and f1-score from each method, the next step was comparing and analyzing the results. After that, we pick the model with the highest accuracy for further analysis, such as the confusion matrix. Moreover, we summarize the performance results in a table that shows the precision, the recall, the f1-score, and the support value for each sentiment.
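The per-sentiment scores in such a table follow directly from confusion-matrix counts. The sketch below shows the arithmetic with the standard library only; the function names and the counts passed to `precision_recall_f1` are hypothetical, while the last call reuses the Logistic Regression precision and recall for the negative class as reported in Table 3.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (Equation 1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for one class from raw counts.

    tp: tweets of this class predicted as this class
    fp: tweets of the other class predicted as this class
    fn: tweets of this class predicted as the other class
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall, f1_from(precision, recall)


# Hypothetical counts, for illustration only.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=2)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")

# Precision 0.93 and recall 0.64 (Logistic Regression, negative class).
print(round(f1_from(0.93, 0.64), 2))  # 0.76
```

The support column is simply `tp + fn`, the number of test tweets that truly belong to the class.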
Fig. 1. The methodology of this research
4. Result and Discussion
The dataset contains a total of 133,227 words. Fig. 2a shows how many tweets we collected for each sentiment. We collected about 690 tweets categorized as positive and 510 tweets categorized as negative. Next, we performed the classification step with three different methods.
The results from each method can be seen in Table 2. The table shows that the SVM model achieved the highest accuracy of the three models. The lowest is Logistic Regression, with an accuracy of 82.50%, followed by the Naive Bayes method with 82.91%. We therefore used the results of the SVM model for further analysis, such as the confusion matrix.
The results of the experiment using Logistic Regression, SVM, and Naïve Bayes can be seen in Table 3. The precision, recall, and f1-score of SVM are relatively high for both sentiments.
Table 2. Table of Three Different Methods Results
          LR        SVM       NB
TF-IDF    82.50%    84.58%    82.91%
Fig. 2. Data Exploratory: (a) Number of tweets based on its sentiment (Positive and Negative; chart labels 590 and 610); (b) Word cloud representation from the dataset
The word cloud in Fig. 2b shows which words were used most in the dataset. We excluded several stopwords to keep unwanted words out of the word cloud. Examples of stopwords are "traveloka", "traveloka eat", "travelokaeat", "traveloka health", "ada", "di", "ini", "aku", "yg", "yang", "ga", "saya", and many other words that we acquired from the NLTK library. We removed these words because they would not be helpful for the analysis process.
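A word cloud is driven by word frequencies, so the effect of the stopword list can be sketched with a simple counter. The snippet below uses only the single-word stopwords quoted above; multi-word entries such as "traveloka eat" would need phrase-level handling that this sketch omits. The function name `top_words` and the sample tweets are invented for illustration.

```python
from collections import Counter

# Single-word stopwords taken from the examples listed in the text.
STOPWORDS = {
    "traveloka", "travelokaeat", "ada", "di", "ini",
    "aku", "yg", "yang", "ga", "saya",
}


def top_words(tweets, n=5):
    """Return the n most frequent non-stopword tokens across all tweets."""
    counts = Counter(
        word
        for tweet in tweets
        for word in tweet.lower().split()
        if word not in STOPWORDS
    )
    return counts.most_common(n)


# Invented sample tweets, for illustration only.
sample_tweets = [
    "diskon traveloka mantep",
    "promo diskon ada di traveloka",
]
print(top_words(sample_tweets, n=3))
```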
We also provide a confusion matrix for the model with the highest accuracy, the SVM model. The confusion matrix in Fig. 3 shows the errors made by the model. Based on the values shown in Fig. 3, we consider the result satisfactory for our experiment.
We have learned that Traveloka is not a topic often used for conducting sentiment analysis. Despite its rising popularity, only a few researchers have conducted a sentiment analysis about Traveloka. The dataset we created and used in this experiment can be accessed through one of the authors' GitHub links in footnote 1. With the accuracy score we acquired through our experiment, we can conclude that our experiment's results are good enough to meet our expectations.
Table 3. Table of Results
Method                 Sentiment   Precision   Recall   F1 Score   Support
Logistic Regression    Negative    0.93        0.64     0.76       102
                       Positive    0.78        0.96     0.86       138
SVM                    Negative    0.86        0.78     0.82       98
                       Positive    0.86        0.92     0.88       142
Naïve Bayes            Negative    0.81        0.77     0.79       102
                       Positive    0.84        0.87     0.85       138
1 https://github.com/PapihBagas/RMCS
Fig. 3. Confusion matrix based on the SVM model
5. Conclusion and Future Works
This study shows public opinion on the Traveloka application based on data collected from Twitter. On the 1,200 tweets collected (610 positive and 590 negative), our classification methods achieve relatively high scores for both sentiments, with positive tweets scoring higher than negative tweets. We also use a word cloud to find which vocabulary or keywords are frequently used in the dataset to describe Traveloka's performance and user satisfaction. The dataset shows that Traveloka gets positive feedback on the promotions, campaigns, and discounts it provides to users. For further research, we would apply different topics and methods, such as other algorithms, to get more accurate results in assessing public sentiment. Furthermore, this research provides a basis for conducting similar research in the future with better results and methodology.
References
[1] Jordan MI, Mitchell TM. Machine learning: Trends, Perspectives, and prospects. Science. 2015;349(6245):255–260.
[2] Taboada M. Sentiment analysis: An overview from linguistics. Annual Review of Linguistics. 2016;2(1):325–347.
[3] Hussein DM. A survey on sentiment analysis challenges. Journal of King Saud University - Engineering Sciences. 2018;30(4):330–338.
[4] Budiharto W, Meiliana M. Prediction and analysis of Indonesia presidential election from Twitter using sentiment analysis. Journal of Big
Data. 2018;5(1).
[5] Dubey AD. Twitter sentiment analysis during covid19 outbreak. SSRN Electronic Journal. 2020.
[6] Almotiri SD. Twitter Sentiment Analysis during the Lockdown on New Zealand. International Journal of Computer and Information Engi-
neering. 2022;15(12):649-54.
[7] Villavicencio C, Macrohon JJ, Inbaraj XA, Jeng JH, Hsieh JG. Twitter sentiment analysis towards covid-19 vaccines in the Philippines
using naïve Bayes. Information. 2021;12(5):204.
[8] Das D, Sharma S, Natani S, Khare N, Singh B. Sentimental Analysis for Airline Twitter data. IOP Conference Series: Materials Science and
Engineering. 2017 11;263:042067.
[9] Prananda AR, Thalib I. Sentiment Analysis for Customer Review: Case Study of Go-Jek Expansion. Journal of Information Systems Engi-
neering and Business Intelligence. 2020;6(1):1.
[10] Patel R, Passi K. Sentiment analysis on Twitter data of World Cup Soccer Tournament using machine learning. IoT. 2020;1(2):218–239.
[11] Thapa B. Sentiment Analysis of Cybersecurity Content on Twitter and Reddit. arXiv preprint arXiv:2204.12267. 2022.
[12] Pratmanto D, Rousyati R, Wati FF, Widodo AE, Suleman S, Wijianto R. App Review Sentiment Analysis Shopee Application In Google
Play Store Using Naive Bayes Algorithm. In: Journal of Physics: Conference Series. vol. 1641. IOP Publishing; 2020. p. 012043.
[13] Watrianthos R. Sentiment analysis of traveloka app using naïve bayes classifier method. 2019.
[14] Handani SW, Saputra DIS, Arino RM, Ramadhan GFA, et al. Sentiment Analysis for Go-Jek on Google Play Store. In: Journal of Physics:
Conference Series. vol. 1196. IOP Publishing; 2019. p. 012032.
[15] Kuntoro AY, Asra T, Pratama EB, Effendi L, Ocanitra R, et al. Gojek and Grab User Sentiment Analysis on Google Play Using Naive
Bayes Algorithm And Support Vector Machine Based Smote Technique. In: Journal of Physics: Conference Series. vol. 1641. IOP
Publishing; 2020. p. 012102.
[16] Ramadhan F, Sukmana H, Oh L, Wardhani L. ANALYSIS OF WARGANET COMMENTS ON IT SERVICES IN MANDIRI BANK
USING K-NEAREST NEIGHBOR (K-NN) ALGORITHM BASED ON ITSM CRITERIA. ADI Journal on Recent Innovation (AJRI).
2019 09;1:14-9.
[17] Sari P, Alamsyah A, Wibowo S. Measuring e-Commerce service quality from online customer review using sentiment analysis. Journal of
Physics: Conference Series. 2018 03;971:012053.
[18] Zidny Naf’an M, Bimantara A, Larasati A, Risondang E, Nugraha N. Sentiment Analysis of Cyberbullying on Instagram User Comments.
Journal of Data Science and Its Applications. 2019 04;2:88-98.
[19] Damanik F, Setyohadi D. Analysis Of Public Sentiment About Covid-19 In Indonesia On Twitter Using Multinomial Naive Bayes And
Support Vector Machine. IOP Conference Series: Earth and Environmental Science. 2021 03;704:012027.
[20] Terán L, Mancera J. Dynamic Profiles Using Sentiment Analysis for VAA's Recommendation Design. Procedia Computer Science.
2017 12;108:384-93.
[21] Rasool A, Tao R, Kamyab A, Naveed T. Twitter Sentiment Analysis: A Case Study for Apparel Brands. vol. 1176; 2019. p. 022015.
[22] Kaur P, Edalati M. Sentiment analysis on electricity twitter posts. arXiv preprint arXiv:2206.05042. 2022.
[23] Pratmanto D, Rousyati R, Wati F, Widodo A, Suleman S, Wijianto R. App Review Sentiment Analysis Shopee Application In Google Play
Store Using Naive Bayes Algorithm. vol. 1641; 2020.
[24] Caetano JA, Lima HS, Santos MF, Marques-Neto HT. Using sentiment analysis to define twitter political users’ classes and their homophily
during the 2016 American presidential election. Journal of Internet Services and Applications. 2018;9(1).
[25] Wisnu H, Afif M, Ruldevyani Y. Sentiment analysis on customer satisfaction of digital payment in Indonesia: A comparative study using
KNN and Naïve Bayes. Journal of Physics: Conference Series. 2020 01;1444:012034.
[26] Zhou S. Sparse SVM for Sufficient Data Reduction. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2022 09;44:5560-71.
[27] Guan Y, Fu GH. A Double-Penalized Estimator to Combat Separation and Multicollinearity in Logistic Regression. Mathematics. 2022
10;10:3824.
[28] Ou G, He Y, Fournier Viger P, Huang J. A Novel Mixed-Attribute Fusion-Based Naive Bayesian Classifier. Applied Sciences. 2022
10;12:10443.
