3.sentiment Analysis - 09082020
1
Introduction
Terminology:
Sentiment analysis vs. opinion mining:
"Sentiment analysis" is the term more widely used in industry.
Both terms are widely used in academia,
and they can be used interchangeably.
2
APPLICATIONS
Problem statement
4
Opinion definition (Liu, Ch. in NLP handbook, 2010)
An opinion is a quintuple
(e_j, a_jk, so_ijkl, h_i, t_l),
where
e_j is a target entity.
a_jk is an aspect/feature of the entity e_j.
so_ijkl is the sentiment value of the opinion from the
opinion holder h_i on feature a_jk of entity e_j at time t_l.
so_ijkl is positive, negative, or neutral, or a more granular rating.
h_i is an opinion holder.
t_l is the time when the opinion is expressed.
5
Our example blog in quintuples
Id: Abc123 on 5-1-2008 “I bought an iPhone a few days
ago. It is such a nice phone. The touch screen is really
cool. The voice quality is clear too. It is much better than
my old Blackberry, which was a terrible phone and so
difficult to type with its tiny keys. However, my mother was
mad with me as I did not tell her before I bought the phone.
She also thought the phone was too expensive, …”
In quintuples
(iPhone, GENERAL, +, Abc123, 5-1-2008)
(iPhone, touch_screen, +, Abc123, 5-1-2008)
….
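One way to hold such quintuples in code is sketched below. The class name `Opinion`, its field names, and the helper `aspects_with_sentiment` are illustrative choices, not part of Liu's definition, and only the two quintuples listed above are included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opinion:
    """One opinion quintuple (hypothetical field names)."""
    entity: str     # e_j: target entity
    aspect: str     # a_jk: aspect of the entity, "GENERAL" for the entity as a whole
    sentiment: str  # so_ijkl: "+", "-", or "neu"
    holder: str     # h_i: opinion holder
    time: str       # t_l: when the opinion was expressed


# The two quintuples given for the example blog post.
opinions = [
    Opinion("iPhone", "GENERAL", "+", "Abc123", "5-1-2008"),
    Opinion("iPhone", "touch_screen", "+", "Abc123", "5-1-2008"),
]


def aspects_with_sentiment(ops, entity, sentiment):
    """Return the aspects of `entity` that carry the given sentiment."""
    return [o.aspect for o in ops if o.entity == entity and o.sentiment == sentiment]


positive_aspects = aspects_with_sentiment(opinions, "iPhone", "+")
print(positive_aspects)
```

Keeping the holder and time as separate fields makes it straightforward to group opinions by author or to track sentiment over time.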
6
Sentiment Analysis
7
Different Approaches to sentiment analysis
8
Sentiment Lexicons- Polarity Based
9
Sentiment Lexicons (Valence Based)
Many applications need not just the binary polarity, but also the
strength of the sentiment.
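The difference can be sketched with two tiny dictionaries. The valence figures are the ones quoted in these slides for "good", "nice", and "ok"; the polarity dictionary, the `score` helper, and the sample sentences are assumptions made for illustration.

```python
# Polarity lexicon: only the direction of the sentiment is recorded.
POLARITY = {"good": 1, "nice": 1, "ok": 1}

# Valence lexicon: direction and strength (values quoted in these slides).
VALENCE = {"good": 1.9, "nice": 1.8, "ok": 0.9}


def score(text, lexicon):
    """Sum the lexicon entries of the words in `text` (unknown words count as 0)."""
    tokens = (tok.strip('.,!?"') for tok in text.lower().split())
    return sum(lexicon.get(tok, 0) for tok in tokens)


for sentence in ("The food is ok", "The food is good"):
    print(sentence, score(sentence, POLARITY), score(sentence, VALENCE))
```

With the polarity lexicon both sentences receive the same score, while the valence lexicon separates "ok" from "good".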
10
Lexicons and context awareness
11
Limitations of Lexicon based approach
12
Machine Learning Approaches
13
Introducing Vader
Valence Aware Dictionary and sEntiment
Reasoner
Developed in 2014 by Hutto and Gilbert
(Georgia Institute of Technology)
Supports both polarity and intensity
14
Vader: Why is it preferred?
Works well on social media text
15
Introducing Vader
16
Introducing Vader
17
Example
The word ‘ok’ has a positive valence of 0.9
18
Generalizable Heuristics- 1
19
Generalizable Heuristics- 2
20
Generalizable Heuristics- 3
The effect of the modifier in the first sentence is to increase the intensity of
cute, while in the second sentence, it is to decrease the intensity.
The effect of a degree modifier also depends on its distance from the word
it modifies: modifiers farther away have a relatively smaller intensifying
effect on the base word.
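A minimal sketch of this heuristic follows. The boost of 0.293 and the damping factors 0.95 and 0.9 are modelled on the open-source VADER implementation but should be read as assumptions here, as should the modifier list, the valence of 2.0 for "cute", and the function name `adjusted_valence`.

```python
BOOST = 0.293  # assumed size of one intensity step

# Illustrative degree modifiers: positive values intensify, negative values dampen.
MODIFIERS = {"extremely": BOOST, "very": BOOST, "marginally": -BOOST, "slightly": -BOOST}

# Assumed damping by distance (in tokens) between the modifier and the base word.
DAMPING = {1: 1.0, 2: 0.95, 3: 0.9}

LEXICON = {"cute": 2.0}  # illustrative valence, not taken from the slides


def adjusted_valence(tokens, index, lexicon=LEXICON):
    """Valence of tokens[index] after applying up to three preceding modifiers."""
    base = lexicon[tokens[index]]
    total = base
    for distance in (1, 2, 3):
        pos = index - distance
        if pos < 0:
            break
        shift = MODIFIERS.get(tokens[pos])
        if shift is None:
            continue
        shift *= DAMPING[distance]
        # Intensifiers push the valence away from zero, dampeners pull it back.
        total += shift if base > 0 else -shift
    return total


print(adjusted_valence(["extremely", "cute"], 1))
print(adjusted_valence(["marginally", "cute"], 1))
print(adjusted_valence(["extremely", "so", "cute"], 2))
```

The third call places the modifier two tokens away from "cute", so its contribution is smaller than in the first call.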
21
Generalizable Heuristics- 4
For example, “I love you, but I don’t want to be with you anymore.” The first
clause “I love you” is positive, but the second one “I don’t want to be with
you anymore.” is negative and obviously more dominant sentiment-wise.
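The shift in weight around "but" can be sketched as follows. The weights 0.5 and 1.5 follow the open-source VADER implementation as far as I know, and the per-token valences are made-up numbers chosen only to show the effect; `apply_but_rule` is a hypothetical helper name.

```python
def apply_but_rule(tokens, valences, before=0.5, after=1.5):
    """Down-weight sentiment before the first 'but' and up-weight sentiment after it."""
    lowered = [t.lower() for t in tokens]
    if "but" not in lowered:
        return list(valences)
    pivot = lowered.index("but")
    adjusted = []
    for i, v in enumerate(valences):
        if i < pivot:
            adjusted.append(v * before)
        elif i > pivot:
            adjusted.append(v * after)
        else:
            adjusted.append(v)
    return adjusted


tokens = ["I", "love", "you", "but", "I", "don't", "want", "to", "be", "with", "you", "anymore"]
# Made-up valences: "love" is positive, the negated "want" is treated as negative.
raw = [0.0, 2.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

adjusted = apply_but_rule(tokens, raw)
print(sum(raw), sum(adjusted))
```

With these numbers the plain sum is positive, while the reweighted sum is negative, which matches the reading that the second clause dominates.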
22
Quantifying the sentiment
VADER sentiment analysis returns a sentiment score in
the range -1 to 1, from most negative to most positive.
23
Scores- Example
“The food is good and the atmosphere is nice”
It has two words in the Vader lexicon (good and nice) with
ratings of 1.9 and 1.8 respectively
24
Quantifying the sentiment
The sentiment score of a sentence is the sum of the
sentiment scores of its sentiment-bearing words.
However, a normalization is applied to the total to map it
to a value between -1 and 1.
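As a worked sketch, assume the normalization used in the open-source VADER implementation, where x is the summed valence and α is a constant (15 by default there; neither symbol is defined in these slides). For the example sentence with ratings 1.9 and 1.8 this gives:

```latex
\[
\text{score} = \frac{x}{\sqrt{x^{2} + \alpha}}, \qquad
x = 1.9 + 1.8 = 3.7,\quad \alpha = 15
\;\Rightarrow\;
\frac{3.7}{\sqrt{13.69 + 15}} = \frac{3.7}{\sqrt{28.69}} \approx 0.69
\]
```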
25
Problem with Vader Model
As the summed valence x grows in magnitude, the normalized score gets closer and closer to -1 or 1.
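The saturation can be seen numerically in the sketch below. It assumes the normalization x / sqrt(x² + α) with α = 15 from the open-source VADER implementation, and it pretends that a text contains n words that each carry the valence 1.9 quoted earlier for "good"; both are assumptions for illustration.

```python
import math


def normalize(x, alpha=15):
    """Map a raw valence sum to the open interval (-1, 1); alpha is an assumed constant."""
    return x / math.sqrt(x * x + alpha)


word_counts = (1, 5, 20, 100)
scores = [normalize(1.9 * n) for n in word_counts]

for n, s in zip(word_counts, scores):
    print(f"{n:>3} sentiment words -> {s:.4f}")
```

Beyond a handful of sentiment-bearing words the score barely changes, so long texts with different amounts of sentiment become hard to tell apart.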
26
Concordant and Discordant Pairs
[Figure: processing pipeline with the stages Data Acquisition and Labelling, Data Cleaning, Data Preprocessing, and Feature Extraction (unigram, bigram, trigram)]
0.5 No discrimination
Pair  Observations      First value  Second value  Result
1     STS-2, STS-1      0.23         0.43          Discordant
2     STS-2, STS-3      0.23         0.27          Discordant
3     STS-2, STS-4      0.23         0.03          Concordant
4     STS-61C, STS-1    0.83         0.43          Concordant
5     STS-61C, STS-3    0.83         0.27          Concordant
6     STS-61C, STS-4    0.83         0.03          Concordant
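The six pairs listed above can be reproduced with a short sketch. It assumes the usual definition: each pair combines one observation where the event occurred with one where it did not, and the pair is concordant when the event observation has the higher predicted value. Treating STS-2 and STS-61C as the event observations is an inference from the table, not something the slides state.

```python
from itertools import product

# Predicted values taken from the table; the event / non-event split is assumed.
events = {"STS-2": 0.23, "STS-61C": 0.83}
non_events = {"STS-1": 0.43, "STS-3": 0.27, "STS-4": 0.03}


def classify_pairs(pos, neg):
    """Label every (event, non-event) pair as Concordant, Discordant, or Tied."""
    results = {}
    for (p_name, p), (n_name, n) in product(pos.items(), neg.items()):
        if p > n:
            label = "Concordant"
        elif p < n:
            label = "Discordant"
        else:
            label = "Tied"
        results[(p_name, n_name)] = label
    return results


pairs = classify_pairs(events, non_events)
concordant = sum(1 for label in pairs.values() if label == "Concordant")
discordant = sum(1 for label in pairs.values() if label == "Discordant")
concordance = concordant / len(pairs)

for pair, label in pairs.items():
    print(pair, label)
print(f"concordant={concordant}, discordant={discordant}, share={concordance:.2f}")
```

A share of concordant pairs near 0.5 would indicate a model that ranks events no better than chance.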
What is the classification accuracy?
Confusion Matrix
Accuracy
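As a minimal sketch, accuracy can be read off a binary confusion matrix as the share of correct predictions. The counts below are hypothetical and are not results from these slides.

```python
def accuracy(tp, fp, fn, tn):
    """Accuracy = correctly classified cases / all cases."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


# Hypothetical counts for a positive/negative sentiment classifier.
acc = accuracy(tp=40, fp=10, fn=5, tn=45)
print(f"accuracy = {acc:.2f}")
```

Accuracy alone can mislead when one class is much more frequent than the other, which is why the full confusion matrix is usually reported alongside it.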