megersa, thesis presentation (2)
By Megersa Oljira
ID_No IT/RPG/010/10
Outline
Introduction
Motivation
Methodology
Experiment/results
Discussions
Conclusion
Recommendation
What is Sentiment?
What is Sentiment Analysis?
Sentiment Analysis?
Why Sentiment Analysis?
Statement of the problem
• Information from social media, blogs, and forums, as well as news sources, has recently been widely used in SA.
• This media content plays a major role in expressing people’s feelings or opinions about a certain topic or product.
• Companies, governments, and other organizations typically receive high volumes of electronic public feedback every day.
• The volume of data received is huge.
• The amount of AO text on the web is also increasing over time.
Social media usage
Social media usage in Ethiopia
The Previous Studies
Gap
Research question
Objective of the study
Scope of the study
• Document level
• Binary (positive and negative)
• Facebook domain
• We focus on text only; other content, such as emoticons and emoji, is out of scope.
Methodology of the study
Literature review
Dataset collection and preparation
Tools (for experiment purposes)
• Python notebook, Anaconda 3.7.3
• Keras deep learning library
• Scikit-learn FE Python library
Feature selection for naïve Bayes
• TF-IDF
• N-grams
Classification algorithms
• CNN
• LSTM
• NB
Evaluation techniques
• Precision
• Recall
• Accuracy
• F1-score
• Confusion matrix
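As a rough illustration of the N-gram and TF-IDF features listed above, the following Python sketch builds unigram+bigram features and weights them. It is not the thesis code: the function names `ngrams` and `tfidf`, the whitespace tokenizer, the `log(N/df) + 1` IDF variant, and the English placeholder texts are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=2):
    """Return all word n-grams with n_min <= n <= n_max, joined by spaces."""
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf(docs, n_min=1, n_max=2):
    """Compute one {n-gram: weight} dict per document.

    tf  = count of the n-gram in the document / number of n-grams in it
    idf = log(N / df) + 1, where df = number of documents containing it
    (one common variant, chosen here as an assumption)
    """
    doc_grams = [ngrams(d.lower().split(), n_min, n_max) for d in docs]
    df = Counter()
    for grams in doc_grams:
        df.update(set(grams))  # count each n-gram once per document
    n_docs = len(docs)
    weights = []
    for grams in doc_grams:
        counts = Counter(grams)
        total = len(grams)
        weights.append({
            g: (c / total) * (math.log(n_docs / df[g]) + 1.0)
            for g, c in counts.items()
        })
    return weights


# Hypothetical placeholder texts, not taken from the thesis dataset.
docs = ["good work good result", "bad work"]
features = tfidf(docs)
```

Here "work" occurs in both texts, so it receives a lower weight than "good", which occurs in only one. Scikit-learn's `TfidfVectorizer` applies a smoothed IDF and L2 normalization by default, so its values would differ from this sketch.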
Data Collection
https://www.facebook.com/
Oromo Democratic Party (ODP)
June - October
Methodology: Classifiers
Naïve Bayes classifier
𝑙𝑎𝑏𝑒𝑙𝑚𝑎𝑝 =
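A minimal sketch of a multinomial naïve Bayes classifier, assuming the usual maximum a posteriori (MAP) rule: choose the label that maximizes log P(label) plus the sum of log P(word | label). Laplace smoothing (`alpha=1.0`), the function names `train_nb` and `predict_nb`, and the toy English tokens are assumptions for illustration, not details from the thesis.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels, alpha=1.0):
    """Estimate log priors and smoothed log likelihoods from tokenized docs."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    model = {}
    for label, n_docs in class_docs.items():
        total = sum(word_counts[label].values())
        denom = total + alpha * len(vocab)  # Laplace smoothing (assumed)
        model[label] = {
            "log_prior": math.log(n_docs / len(docs)),
            "log_like": {
                w: math.log((word_counts[label][w] + alpha) / denom)
                for w in vocab
            },
        }
    return model


def predict_nb(model, tokens):
    """Return argmax over labels of log P(label) + sum_i log P(token_i | label)."""
    scores = {}
    for label, params in model.items():
        score = params["log_prior"]
        for t in tokens:
            if t in params["log_like"]:  # tokens unseen in training are skipped
                score += params["log_like"][t]
        scores[label] = score
    return max(scores, key=scores.get)


# Hypothetical toy data, not from the thesis dataset.
train_docs = [["good", "service"], ["great", "good"],
              ["bad", "service"], ["poor", "bad"]]
train_labels = ["positive", "positive", "negative", "negative"]
model = train_nb(train_docs, train_labels)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.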
Proposed CNN
Example of embedding layer word embedding
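To make the idea of an embedding layer concrete, here is a small Python sketch of the lookup it performs: each token is mapped to an integer id, and each id selects one row of an embedding matrix. The dimension of 10 follows the embedding dimension reported in the CNN hyperparameters; the helper names, the random initialization, the padding scheme, and the sample texts are assumptions. In a trained model the rows are learned rather than left random.

```python
import random


def build_vocab(texts):
    """Map each distinct token to an integer id; id 0 is reserved for padding."""
    vocab = {}
    for text in texts:
        for tok in text.lower().split():
            vocab.setdefault(tok, len(vocab) + 1)
    return vocab


def init_embeddings(vocab_size, dim, seed=0):
    """Random embedding matrix with one row per id; row 0 (padding) is all zeros."""
    rng = random.Random(seed)
    table = [[0.0] * dim]
    for _ in range(vocab_size):
        table.append([rng.uniform(-0.05, 0.05) for _ in range(dim)])
    return table


def embed(text, vocab, table, max_len):
    """Turn a text into max_len vectors, truncating or padding with id 0.

    Tokens missing from the vocabulary also map to id 0 in this sketch.
    """
    ids = [vocab.get(tok, 0) for tok in text.lower().split()][:max_len]
    ids += [0] * (max_len - len(ids))
    return [table[i] for i in ids]


# Hypothetical placeholder texts, not from the thesis dataset.
texts = ["good work", "bad work today"]
vocab = build_vocab(texts)
table = init_embeddings(len(vocab), dim=10)
matrix = embed("good work", vocab, table, max_len=4)
```

The result is a 4 x 10 matrix: two rows for the real tokens and two zero rows for padding.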
Results and Discussion
… (2)
… (3)
.(4)
Result: MNB
N-grams    Accuracy    F1-score    Precision       Recall
Unigram    91%         91%         Positive 91%    90%
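The four measures reported above can all be derived from the counts in a binary confusion matrix. The sketch below shows one way to compute them in Python; the function name `binary_metrics` and the toy labels are hypothetical, and the numbers it produces are unrelated to the thesis results.

```python
def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall and F1 for the positive class."""
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1  # true positive
        elif p == positive:
            fp += 1  # false positive
        elif t == positive:
            fn += 1  # false negative
        else:
            tn += 1  # true negative
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy labels, not thesis data.
y_true = ["positive", "positive", "positive", "negative", "negative", "negative"]
y_pred = ["positive", "positive", "negative", "negative", "negative", "positive"]
metrics = binary_metrics(y_true, y_pred)
```

Precision and recall are defined per class, which may be why the table labels its precision value with "Positive".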
Result of Convolutional Neural Network
Hyperparameter                         Value
Embedding dimension                    10
Convolutional filters (kernel size)    1, 2, 3, 4, 5
Dropout                                0.1, at fully connected layer
Pooling                                Max-pooling
Number of filters                      64
Epochs                                 10
Learning rate                          Default (0.001), beta_1=0.9, beta_2=0.999
Batch size                             32
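The combination of several kernel sizes with max-pooling can be sketched in plain Python as follows. Each filter slides over the sequence of token embeddings, and only its strongest response is kept. This is an illustrative sketch, not the thesis model: the ReLU activation, the function names, and the tiny 2-dimensional toy embeddings are assumptions, and the filter weights would normally be learned by the deep learning library.

```python
def conv1d_max(seq, kernel, bias=0.0):
    """Slide one filter over a sequence of embedding vectors, apply ReLU,
    and return the maximum activation (max-over-time pooling).

    seq:    list of token vectors, each of length d
    kernel: list of k weight vectors, each of length d (kernel size k)
    """
    k = len(kernel)
    best = 0.0  # ReLU outputs are never negative, so 0.0 is a safe floor
    for start in range(len(seq) - k + 1):
        total = bias
        for offset in range(k):
            total += sum(w * x for w, x in zip(kernel[offset], seq[start + offset]))
        best = max(best, max(0.0, total))
    return best


def multi_kernel_features(seq, kernels):
    """One pooled feature per filter; filters may have different kernel sizes."""
    return [conv1d_max(seq, kern) for kern in kernels]


# Toy input: 4 tokens with 2-dimensional embeddings (the table above uses 10).
seq = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernels = [
    [[1.0, 1.0]],              # kernel size 1: looks at single tokens
    [[1.0, 0.0], [0.0, 1.0]],  # kernel size 2: looks at token pairs
]
features = multi_kernel_features(seq, kernels)
```

Because pooling keeps one value per filter, the feature vector has a fixed length regardless of how long the input text is.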
Results of LSTM
epochs 10
Batch size 20
Discussion
Multinomial Naïve Bayes with Unigram+bigram and TF-IDF outperforms the …
[Chart; axis ticks 86 to 91]
Discussion
CNN is able to handle longer sequences of words through different convolutional filters.
In addition, the two deep learning models (CNN and LSTM) require no special feature selection.
The LSTM by its nature has the capability to retain information relevant to the task at hand.
In general, both LSTM and CNN can handle more complex sentiment expressions than NB.
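The point about the LSTM keeping task-relevant information can be illustrated with a single cell step. In the Python sketch below, the forget gate decides how much of the previous cell state survives and the input gate decides how much new information is written. The scalar formulation, the function names, and the hand-picked gate weights are assumptions for illustration; a real model learns these weights and works with vectors.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, W):
    """One step of a scalar LSTM cell (input, hidden and cell state are numbers).

    W maps each gate name to a tuple (w_x, w_h, b).
    """
    def pre(gate):
        w_x, w_h, b = W[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))        # how much of the old cell state to keep
    i = sigmoid(pre("input"))         # how much new information to write
    o = sigmoid(pre("output"))        # how much of the cell state to expose
    g = math.tanh(pre("candidate"))   # candidate value to write
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hand-picked weights (hypothetical): forget gate nearly open, input gate nearly shut.
W_keep = {
    "forget": (0.0, 0.0, 10.0),
    "input": (0.0, 0.0, -10.0),
    "output": (0.0, 0.0, 0.0),
    "candidate": (1.0, 0.0, 0.0),
}

h, c = 0.0, 1.0  # start with a stored value of 1.0 in the cell state
for x in [0.5, -0.3, 0.9]:
    h, c = lstm_step(x, h, c, W_keep)
```

With these settings the cell state stays close to its initial value across all three inputs, which is the "holding" behavior described above. Setting the forget gate bias to a large negative value instead would erase the stored value in one step.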
Conclusion
Recommendation