
Ambo University Woliso Campus

School of Technology and Informatics

Department of Information Technology

Thesis presentation on Sentiment Analysis of Afaan Oromoo
social media content Using Machine Learning Approaches

Under the supervision of Dr. Kuulaa Q.

By Megersa Oljira
ID No: IT/RPG/010/10
Outlines

 Introduction
 Motivation
 Statement of the problem
 Methodology
 Experiment/results
 Discussions
 Conclusion
 Recommendation
What is Sentiment?

• Sentiment: people's judgments, attitudes, and
appreciation of government, politics, products,
places, events, etc. around the world.
• Sentiment analysis, also called opinion mining, is
the field of study that analyzes people's opinions,
sentiments, evaluations, appraisals, attitudes, and
emotions towards entities such as products,
services, organizations, individuals, issues,
events, topics, and their attributes.
What is Sentiment Analysis?

Identify and analyze the underlying opinion at:

• document level
• sentence level
• entity/aspect level
Sentiment Analysis?

• Baayyee nama jibbisiisa. (negative)
• Wow baayyee namatti tola! (positive)

• Identifying positive / negative sentiments
Why Sentiment Analysis?

• Movie: is this review positive or negative?
• Products: what do people think about the new iPhone?
• Public sentiment: how is consumer confidence? Is despair increasing?
• Politics: what do people think about this candidate or issue?
• Prediction: predict election outcomes or market trends from sentiment

[Figure: example of election prediction, 2016 USA]
Motivation

• Tracking opinions in social media has attracted an
increasing level of interest in the research community.
• The limited availability of previous studies
• The lack of state-of-the-art machine learning approaches
Statement of the problem

• Information from social media, blogs and forums, as
well as news sources, has recently been widely used in SA.
• This media content plays a great role in expressing
people's feelings or opinions about a certain topic or product.
• Companies, governments and other organizations typically receive
high volumes of electronic public feedback every day.
• The volume of received data is huge.
• The amount of Afaan Oromoo (AO) text on the web is also
increasing over time.
Social media usage

[Figure: global social media usage statistics]
Social media usage in Ethiopia

• About 4,500,000 Facebook users as of June 2017, a
4.3% penetration rate.
• There were 5,392,500 Facebook users in Ethiopia in
September 2019, an increase of 892,500.

[Pie chart: social media usage in Ethiopia — Facebook 89%;
Pinterest, YouTube, Twitter, Instagram, VKontakte, Google+,
Tumblr, LinkedIn, reddit and others make up the remainder]
The Previous Studies

(Tariku, 2017), MSc Thesis, DBU
• Objective: sentiment classification & aspect summarization
• Features: lexicon features
• Model: lexical + rule based
• Classes: +ve & -ve
• Selected model: rule based with summarization
• Afaan Oromoo preprocessing: general lexicon database or dictionary
• Data source: ORTO news service reviews
• Dataset: 400 reviews

(Abate, 2019), Journal Article
• Objective: feature extraction & polarity classification
• Features: lexicon, POS, N-gram
• Model: unsupervised
• Classes: +ve, -ve & neutral
• Selected model: bigram
• Data source: (OPDO) official Facebook page, political bloggers' pages
• Dataset: 600 reviews
Gap

o Rule based and lexicon based approaches
o very time consuming, and not used alone
o (Hailong, Wenyan, & Bo, 2014) and (Medhat, Hassan, & Korashy, 2014)
o Granularity (Liu B., 2012) and the differences in opinion expression
among textual genres (Haddi, 2015), complex opinion expression,
idiomatic expression
o Word matching approach
 Example: ODPn kanaan booda biyya lamuu lammata deebitee Oromoo hin
xiqqeessinee fi Oromoon ala eessa iyyuu gahuu hin dandeenye ijaaruuf kutatee
kaheera!
 ejjannoon keessan garaa nu guute.
 Sareen ni dutti gaalli deemsa isaa itti fufeera.
 Dhugaa jallisuu malee yoom kabiraa barte opdo odpn.
 ibsa baasuu maalee gaaffii keenya nuuf deebisaa waan hin jirreef, yaroo dhihoo keessatti wal
Our Method

• Deep learning and state-of-the-art machine learning
• (Kim, 2014), (Moschitti & Severyn, 2015),
(Wang et al., 2015), (Miedema, 2018)
• Successful machine learning algorithms
• (Pang, Lee, & Vaithyanathan, 2002), (Gautam
& Yadav, 2014), (Jurafsky & Martin, 2018)
• Word embedding
• (Mikolov, 2013)
• Semantic representation of words
• Complex composition of words, and complex sentiment
expression
Research questions

• How does the selected algorithm perform
for SA of Afaan Oromoo social media
texts?
• How can a reliable text corpus be
developed for Facebook comments in
the Afaan Oromoo language in order to
design and develop an ML-based SA?
• Which classification algorithms can be
successfully designed and used for SA
of Afaan Oromoo?
Objective of the study

• The main objective of this study is to
design and develop a machine learning
approach for sentiment analysis of
Afaan Oromoo.
 To explore and understand what has been done
in previous SA work related to the
application of ML approaches in sentiment
analysis.
 To preprocess and prepare a text corpus
of AO social media post comments.
Objective of the study

 To develop a classifier model using the
proposed algorithms.
 To evaluate their performance and
conduct a comparative analysis of the
applied algorithms.
 Based on the obtained results, to
propose a set of improvements and to
suggest recommendations.
Scope of the study

• Document level
• Binary (positive and negative)
• Facebook domain
• We focus on text only; other content, such as
emoticons and emoji, is out of scope.
Methodology of the study

Literature review
Dataset collection and preparation
Tools (for experimental purposes)
 Python notebook, Anaconda 3.7.3
 Keras deep learning library
 Scikit-learn feature-extraction Python library
Feature selection for Naïve Bayes
• TF-IDF
• N-grams
Classification algorithms
• CNN
• LSTM
• NB
Evaluation techniques
• Precision
• Recall
• Accuracy
• F1-score
• Confusion matrix
Data Collection

• Source: https://www.facebook.com/
Oromo Democratic Party (ODP)
• Period: June - October
Methodology: Classifiers

• Naïve Bayes (Multinomial Naïve Bayes): to perform
classification, features are first selected for the
classifier.
• N-gram and TF-IDF feature selection are employed.
• After the features are built, they are fed into the
classifier.
Feature selection for Naïve Bayes Classifier

• The TfidfVectorizer tokenizes comments with different n-grams, and
learns the vocabulary and inverse document frequency weightings.

• For example, TF-IDF for unigrams
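As a sketch of this step, scikit-learn's TfidfVectorizer can learn the vocabulary and IDF weightings from comments; the two comments below are short placeholders, not the study's actual corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy Afaan Oromoo comments (placeholders, not the study's dataset)
comments = [
    "baayyee nama jibbisiisa",
    "wow baayyee namatti tola",
]

# Unigram TF-IDF: tokenize, learn vocabulary and IDF weightings
vectorizer = TfidfVectorizer(ngram_range=(1, 1))
X = vectorizer.fit_transform(comments)

print(sorted(vectorizer.vocabulary_))  # learned unigram vocabulary
print(X.shape)                         # (n_comments, n_features)
```

Changing `ngram_range` to `(1, 2)` or `(2, 2)` yields the unigram+bigram and bigram feature sets used in the experiments.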
Naïve Bayes classifier

• After this step is completed, different experiments are
performed using unigrams, bigrams, trigrams, and their
combinations.
• The features are fed into the classifier for classification.

label = argmax_c P(c) × ∏_i P(w_i | c)
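A minimal sketch of this classification step, chaining TF-IDF features into a Multinomial Naive Bayes classifier with scikit-learn (the comments and labels are invented placeholders, not the study's data):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy labelled comments (invented placeholders, not the study's corpus)
comments = ["baayyee nama jibbisiisa", "wow baayyee namatti tola",
            "nama jibbisiisa", "baayyee namatti tola"]
labels = ["negative", "positive", "negative", "positive"]

# Unigram+bigram TF-IDF features fed into Multinomial Naive Bayes
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), MultinomialNB())
model.fit(comments, labels)

print(model.predict(["nama jibbisiisa"]))
```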
Proposed CNN

• We propose a CNN deep learning model using
multiple parallel convolutional neural
networks that read the source document with
different kernel sizes.
• Reads the text with different n-gram sizes, or
groups of words.
• This captures high-level features.
• We have five input channels for processing
unigrams, bigrams, trigrams, four-grams and
five-grams.
 Take an input sequence
 Map its words to embeddings
 Take convolution windows of size 1, 2, 3, 4, 5 by the length of the
embedding vector
 Convolve with that filter sliding in one direction
 Take the maximum feature
 Feed to the fully connected layer for classification
Each channel is comprised of

1. Input layer: defines the length of the input sequences
2. Embedding layer: set to the size of the vocabulary and a
fixed-dimensional real-valued representation of words.
• This layer represents words so that words with the
same meaning have a similar representation.
• This matrix is formed by simply concatenating the
embeddings of all words in V.
3. One-dimensional convolutional layer
4. Pooling layer
5. Fully connected or final layer
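A minimal Keras sketch of this five-channel architecture, assuming an illustrative vocabulary size and sequence length (the embedding dimension 10, kernel sizes 1-5, 64 filters, and 0.1 dropout follow the CNN hyperparameter table later in the slides):

```python
from tensorflow.keras.layers import (Input, Embedding, Conv1D,
                                     GlobalMaxPooling1D, Dense,
                                     Dropout, concatenate)
from tensorflow.keras.models import Model

VOCAB_SIZE = 1000  # assumed vocabulary size (illustrative)
SEQ_LEN = 50       # assumed input sequence length (illustrative)
EMB_DIM = 10       # embedding dimension from the hyperparameter table

inputs, channels = [], []
for kernel_size in (1, 2, 3, 4, 5):          # unigram ... five-gram channels
    inp = Input(shape=(SEQ_LEN,))
    x = Embedding(VOCAB_SIZE, EMB_DIM)(inp)  # map words to embeddings
    x = Conv1D(filters=64, kernel_size=kernel_size, activation="relu")(x)
    x = GlobalMaxPooling1D()(x)              # take the maximum feature
    inputs.append(inp)
    channels.append(x)

merged = concatenate(channels)               # join the five channels
merged = Dropout(0.1)(merged)                # dropout at the fully connected layer
output = Dense(1, activation="sigmoid")(merged)  # positive vs negative
model = Model(inputs=inputs, outputs=output)
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```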
Example of embedding layer word embedding

Vocabulary (sample): ['wow', 'ODP', 'baga', 'gammaddan',
'uummanni', 'hammam', 'akka', 'si', 'duukaa', 'jiru',
'asumaa', 'hubachuu', 'dandeessa', 'hojii', 'kee', 'cimsi',
'ati', 'dhuguma', 'kallacha', 'keenya', '!!!!!',
'gammachuun', 'kooti.!', 'baayyee', 'jibbisiisa', 'nama',
'gaddisiisa', 'oduu', 'haattota', 'OPDO', 'warra', 'abbaa',
'garaa']

Embedding for "gaddisiisa":
[-0.03731419  0.0066715  -0.02384737  0.00387553  0.04606169
 -0.02425868  0.02675312 -0.00208833  0.02460078  0.0052022 ]
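Conceptually, the embedding layer is a lookup table mapping each vocabulary index to a dense vector; a minimal numpy sketch of that lookup (random vectors for illustration, not the trained embeddings above):

```python
import numpy as np

# Sample vocabulary (a subset of the slide's vocabulary)
vocab = ['wow', 'baayyee', 'jibbisiisa', 'nama', 'gaddisiisa']
word_to_index = {word: i for i, word in enumerate(vocab)}

emb_dim = 10  # matches the 10-value embedding vector on the slide
rng = np.random.default_rng(0)
# Embedding matrix: one row per vocabulary word. Initialized randomly
# here; in the model these weights are learned during training.
embedding_matrix = rng.uniform(-0.05, 0.05, size=(len(vocab), emb_dim))

# Looking up a word's embedding is a single row index
vector = embedding_matrix[word_to_index['gaddisiisa']]
print(vector.shape)  # (10,)
```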
The third proposed work is the LSTM algorithm

• The LSTM receives a sequence of
vectors as input and considers the
order of the vectors to generate a
prediction.
• From the embedding layer, the new
representations are passed to the
LSTM cells.
• These add recurrent connections to
the network, so we can include
information about the sequence of
words in the data.
LSTM

• Finally, the LSTM cells feed into a sigmoid
output layer.
• We use the sigmoid because we are trying to
predict whether the text has positive or negative
sentiment.
• The output layer is then just a single unit with a
sigmoid activation function.
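A minimal Keras sketch of this architecture, using the values from the LSTM hyperparameter table later in the slides (embedding dimension 256, two LSTM layers of 250 units, dropout 0.3 with recurrent dropout 0.2); the vocabulary size and sequence length are illustrative assumptions:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

VOCAB_SIZE = 1000  # assumed vocabulary size (illustrative)
SEQ_LEN = 50       # assumed input sequence length (illustrative)

model = Sequential([
    Embedding(VOCAB_SIZE, 256),  # embedding dimension 256
    LSTM(250, dropout=0.3, recurrent_dropout=0.2,
         return_sequences=True),                 # first LSTM layer
    LSTM(250, dropout=0.3, recurrent_dropout=0.2),  # second LSTM layer
    Dense(1, activation="sigmoid"),  # single unit: positive vs negative
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
```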
Results and Discussion

• The performance of the classifiers is measured
using different metrics.

Precision = TP / (TP + FP) ..... (1)

Recall = TP / (TP + FN) ..... (2)

Accuracy = (TP + TN) / (TP + TN + FP + FN) ..... (3)

F1-score = 2 × (Precision × Recall) / (Precision + Recall) ..... (4)
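These metrics follow directly from confusion-matrix counts; a small sketch with invented counts for illustration:

```python
def precision(tp, fp):
    # Fraction of predicted positives that are truly positive
    return tp / (tp + fp)

def recall(tp, fn):
    # Fraction of actual positives that were found
    return tp / (tp + fn)

def accuracy(tp, tn, fp, fn):
    # Fraction of all predictions that are correct
    return (tp + tn) / (tp + tn + fp + fn)

def f1_score(p, r):
    # Harmonic mean of precision and recall
    return 2 * p * r / (p + r)

# Invented confusion-matrix counts, for illustration only
tp, tn, fp, fn = 90, 85, 10, 15
p, r = precision(tp, fp), recall(tp, fn)
print(p)                        # 0.9
print(accuracy(tp, tn, fp, fn))  # 0.875
```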
Result: MNB

N-grams          Accuracy  F1-score  Class     Precision  Recall
Unigram          91%       91%       Positive  91%        90%
                                     Negative  90%        91%
Bigram           71%       76.53%    Positive  48%        62%
                                     Negative  94%        77%
Trigram          54.64%    68.7%     Positive  10%        100%
                                     Negative  100%       52%
Unigram+Bigram   93%       93%       Positive  95%        91%
                                     Negative  90%        95%
Unigram+Trigram  92.4%     92.2%     Positive  95%        90%
                                     Negative  90%        95%
Bigram+Trigram   75%       79.1%     Positive  55%        92%
                                     Negative  95%        68%
Result of Convolutional Neural Network

Hyperparameter                        Training parameters
Embedding dimension                   10
Convolutional filters (kernel sizes)  1, 2, 3, 4, 5
Dropout                               0.1, at fully connected layer
Pooling                               Max-pooling
Number of filters                     64
Epochs                                10
Learning rate                         Default (0.001), beta_1=0.9, beta_2=0.999
Batch size                            32
Result of Convolutional Neural Network

Class      Positive  Negative
Precision  0.90      0.86
Recall     0.87      0.91

Accordingly, the proposed CNN system achieved an
accuracy of 89% and an F1 score of 87%.
Results of LSTM

Hyperparameter       Training parameters
Embedding dimension  256
Dropout              0.3, recurrent dropout = 0.2
Memory units         250 for both LSTM layers
Epochs               10
Learning rate        Default (0.001), beta_1=0.9, beta_2=0.999
Batch size           20
Results of LSTM

Class      Positive  Negative
Precision  0.876     0.88
Recall     0.876     0.88

Our LSTM model achieved an accuracy of 87.6%
and an F1 score of 87.7%.
Discussion

Comparison of the Three Algorithms

• In this study, the Multinomial Naïve
Bayes with Unigram+Bigram and
TF-IDF outperforms the CNN and LSTM.

[Bar chart: accuracy of MNB, LSTM and CNN, y-axis 85-93%]
Discussion

 MNB is simple and needs fewer computational resources compared to neural
networks.

 CNN is able to handle longer stretches of text through its different
convolutional filters.

 In addition, the two deep learning models (CNN and LSTM) require no special
feature selection.

 The LSTM by its nature has the capability to hold information relevant to the
task at hand.

 In general, both LSTM and CNN can handle more complex sentiment expressions
than NB.
Discussion

• When the context of the word, rather than the probability of
the word's occurrence, is used to determine the polarity of
the text, both CNN and LSTM are the best approaches.
• However, our data is small, and both LSTM and CNN need
huge amounts of data to learn the important features and
perform well.
• Whereas MNB can achieve good results even with small
datasets.
Conclusion

• We studied three methods: first, Multinomial
Naïve Bayes, which uses term frequency-inverse
document frequency representation and n-gram
features for training the classifier.
• Secondly, the Long Short-Term Memory deep
learning method, which uses word embeddings and
two different hidden layers to further refine
the polarity of the reviews/comments.
Conclusion

• Thirdly, the Convolutional Neural Network deep
learning technique, which uses word embeddings and
applies different convolutional filters to extract the
sentiment of the text, is studied.
• Therefore, we aimed to perform experiments and
investigate the performance of the three different
algorithms in detecting positive and negative comments.
• Furthermore, the algorithm that gives the best
results is identified.
Recommendation

• A spelling corrector can be applied to exclude errors.
• A well-prepared standard corpus.
• This study considers document-level sentiment
analysis; other levels, such as sentence level, can be
considered.
• Emoticon and emoji expressions that refer to a
positive or negative meaning.
• Pre-trained word vectors.