3. Sentiment Analysis - 09082020


Introduction

 Opinion mining or sentiment analysis

 Computational study of opinions, sentiments, subjectivity, evaluations, attitudes, appraisals, affect, views, emotions, etc., expressed in text.

 Sources include reviews, blogs, discussions, news, comments, feedback, or any other documents.

1
Introduction
 Terminology:
 "Sentiment analysis" is more widely used in industry.
 Both terms are widely used in academia.
 They can be used interchangeably.

2
APPLICATIONS
Problem statement

 Opinion definition. What is an opinion?

 Can we provide a structured definition?
 If we cannot structure a problem, we probably do not understand the problem.

4
Opinion definition (Liu, Ch. in NLP handbook, 2010)
 An opinion is a quintuple
(e_j, a_jk, so_ijkl, h_i, t_l),
where
 e_j is a target entity.
 a_jk is an aspect/feature of the entity e_j.
 so_ijkl is the sentiment value of the opinion from the opinion holder h_i on aspect a_jk of entity e_j at time t_l. so_ijkl is positive, negative, or neutral, or a more granular rating.
 h_i is an opinion holder.
 t_l is the time when the opinion is expressed.
5
Our example blog in quintuples
 Id: Abc123 on 5-1-2008 “I bought an iPhone a few days
ago. It is such a nice phone. The touch screen is really
cool. The voice quality is clear too. It is much better than
my old Blackberry, which was a terrible phone and so
difficult to type with its tiny keys. However, my mother was
mad with me as I did not tell her before I bought the phone.
She also thought the phone was too expensive, …”
 In quintuples
(iPhone, GENERAL, +, Abc123, 5-1-2008)
(iPhone, touch_screen, +, Abc123, 5-1-2008)
….

6
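The quintuple structure above maps naturally onto a record type. A minimal sketch in Python (the `Opinion` type and the list of extracted opinions are illustrative, not from any real extraction system):

```python
from collections import namedtuple

# Liu's opinion quintuple: (entity, aspect, sentiment, holder, time).
# Field names are illustrative choices for this sketch.
Opinion = namedtuple("Opinion", ["entity", "aspect", "sentiment", "holder", "time"])

# The example blog, hand-converted to quintuples as on the slide
review_opinions = [
    Opinion("iPhone", "GENERAL", "+", "Abc123", "5-1-2008"),
    Opinion("iPhone", "touch_screen", "+", "Abc123", "5-1-2008"),
    Opinion("iPhone", "voice_quality", "+", "Abc123", "5-1-2008"),
    Opinion("Blackberry", "GENERAL", "-", "Abc123", "5-1-2008"),
]

# Aspect-level queries become simple filters over the structured records
positives = [o for o in review_opinions if o.sentiment == "+"]
```

Once opinions are in this form, document-, sentence-, and aspect-level questions all reduce to aggregations over the records.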
Sentiment Analysis

 One can analyze this review/blog at the:
 document level, i.e., is this review + or -?
 sentence level, i.e., is each sentence + or -?
 entity and feature/aspect level

7
Different Approaches to sentiment analysis

 Sentiment Lexicons
 Depends on underlying sentiment lexicons
 Lexicons can be polarity based (positive/negative)
 Can be based on intensity of the sentiment (valence)

 Context-Aware Lexicons
 Use parts of speech for more context awareness
 When a word has multiple meanings, determine which sense of the word is being used

 Machine Learning
 Primarily supervised
 Requires extensive training data

 Manual and Rule-Based Approach
 Expensive
 Depends on skills of raters
 Time consuming
 May not be able to represent the short, sparse text of social media

8
Sentiment Lexicons - Polarity Based

 A sentiment lexicon is a list of lexical features (words) which are labelled according to their semantic orientation as either positive or negative.
 Positive emotions: love, nice, good, great.
 Negative emotions: hurt, ugly, sad, bad.
 This approach does not specify the intensity of emotions. It is important to know the changes in sentiment intensity in order to understand when rhetoric is heating up or cooling down.
 As manually creating this type of list takes time, preexisting manual lexicons are used and enhanced.

9
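Polarity-based scoring can be sketched in a few lines; the word sets below are a toy lexicon built from the slide's examples, not a real published lexicon:

```python
# Toy polarity lexicon (illustrative only)
POSITIVE = {"love", "nice", "good", "great"}
NEGATIVE = {"hurt", "ugly", "sad", "bad", "terrible"}

def polarity_score(text):
    """Return (#positive hits - #negative hits) for the text."""
    words = text.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
```

Note that a score like this is purely a count: "good" and "great" each contribute +1, which is exactly the missing-intensity limitation the slide describes.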
Sentiment Lexicons (Valence Based)

 Many applications need not just the binary polarity, but also the strength of the sentiment.

 How favorably or unfavorably do people feel about a new product, movie or legislation bill?

 Analysts and researchers want to be able to recognize changes in sentiment intensity over time in order to detect when rhetoric is heating up or cooling down.

10
Lexicons and context awareness

 The word "catch" has a negative sentiment in "At first glance, the offer looks good, but there is a catch."
 But the word "catch" is neutral in "The fisherman plans to sell his catch in the market."

 These lexicons try to find which sense of a word is used in a sentence, when the word has multiple meanings.

 In NLP, the most difficult task is to handle language ambiguity.

11
Limitations of Lexicon-based approaches

 They have trouble with coverage; they cover a limited set of words.
 Specifically, they ignore important lexical features relevant to social texts.
 Some of them do not keep intensity differential information.
 Adding a new set of human-validated lexical features is time consuming and expensive.

12
Machine Learning Approaches

 Classification problem using Logistic Regression, Naïve Bayes, Support Vector Machines (SVM) and other algorithms.

 They require training data, which can be difficult to acquire.

 The training dataset should contain all types of features that are expected in production data.
 Very expensive in terms of computing resources.
 Sometimes they derive features inside a black box, which are difficult to explain; hence it becomes difficult to modify or extend the models.

13
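As a minimal sketch of the supervised approach, here is a multinomial Naive Bayes classifier with Laplace smoothing in pure Python; the four-document training set and its "+"/"-" labels are invented for illustration:

```python
import math
from collections import Counter, defaultdict

# Tiny illustrative training set (real systems need far more data)
train = [
    ("good great phone", "+"), ("nice clear screen", "+"),
    ("terrible battery", "-"), ("bad ugly keys", "-"),
]

class NaiveBayes:
    def fit(self, data):
        self.word_counts = defaultdict(Counter)   # per-class word counts
        self.class_counts = Counter()             # documents per class
        for text, label in data:
            self.class_counts[label] += 1
            self.word_counts[label].update(text.split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best, best_lp = None, -math.inf
        for label in self.class_counts:
            # log prior + log likelihood with add-one (Laplace) smoothing
            lp = math.log(self.class_counts[label] / total_docs)
            n = sum(self.word_counts[label].values())
            for w in text.split():
                lp += math.log((self.word_counts[label][w] + 1) / (n + len(self.vocab)))
            if lp > best_lp:
                best, best_lp = label, lp
        return best

model = NaiveBayes().fit(train)
```

With so few documents the model only generalizes to vocabulary it has seen, which illustrates the slide's point that the training data must cover the features expected in production.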
Introducing Vader
 Valence Aware Dictionary and sEntiment Reasoner
 Developed in 2014 by Hutto and Gilbert (Georgia Institute of Technology)
 Supports both polarity and intensity

 Gold-standard lexicon: built using human raters and the wisdom of crowds

 Uses a lexicon as well as heuristic rules to calculate the sentiment score

14
Vader – Why is it preferred?
 Works well on social media text

 Rules based on the conventional behavior of people in social media

 No training data set is required, but the lexicon is valence based and gold standard

 Does not suffer from a speed-performance trade-off

15
Example
 The word 'ok' has a positive valence of 0.9

 Good 1.9, great 3.1

 Horrible -2.5, frowning emoticon -2.2, sucks -1.5

18
Generalizable Heuristics - 1

 VADER amplifies the sentiment score of the sentence in proportion to the number of exclamation points and question marks ending the sentence.

 If the score is positive, VADER adds an empirically obtained quantity for every exclamation point (0.292) and question mark (0.18). If the score is negative, VADER subtracts.

19
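The punctuation heuristic can be sketched as below. The increments (0.292 per "!", 0.18 per "?") come from the slide; as a simplification, this sketch counts all such marks in the sentence rather than only trailing ones, and it omits the cap VADER places on repeated punctuation:

```python
def amplify_for_punctuation(score, sentence):
    """Push a nonzero sentiment score away from zero based on '!' and '?'."""
    boost = sentence.count("!") * 0.292 + sentence.count("?") * 0.18
    if score > 0:
        return score + boost   # positive scores become more positive
    if score < 0:
        return score - boost   # negative scores become more negative
    return score               # neutral stays neutral

print(amplify_for_punctuation(1.9, "Good!!"))      # 1.9 + 2 * 0.292 = 2.484
print(amplify_for_punctuation(-2.5, "Horrible!"))  # -2.5 - 0.292 = -2.792
```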
Generalizable Heuristics - 2

 The second heuristic is capitalization. "AMAZING performance." is definitely more intense than "amazing performance."

 VADER takes this into account by incrementing or decrementing the sentiment score of the word by 0.733, depending on whether the word is positive or negative, respectively.

20
Generalizable Heuristics - 3

 The third heuristic is the use of degree modifiers: intensifiers such as "extremely" and dampeners such as "slightly" (e.g., "extremely cute" vs. "slightly cute").

 An intensifier increases the intensity of "cute", while a dampener decreases it.

 VADER maintains a booster dictionary which contains a set of boosters and dampeners.

 The effect of the degree modifier also depends on its distance from the word it's modifying.

 Farther words have a relatively smaller intensifying effect on the base word.
21
Generalizable Heuristics - 4

 Sentiment shifter: contrastive conjunction

 "but" connects two clauses with contrasting sentiments. The dominant sentiment, however, is the latter one.

 For example, "I love you, but I don't want to be with you anymore." The first clause "I love you" is positive, but the second one "I don't want to be with you anymore" is negative and obviously more dominant sentiment-wise.

 VADER implements a "but" checker. Basically, all sentiment-bearing words before the "but" have their valence reduced to 50% of their values, while those after the "but" increase to 150% of their values.

22
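The "but" reweighting can be sketched as follows; the token valences used in the example are made-up numbers for illustration, not actual VADER lexicon values:

```python
def apply_but_checker(words, valences):
    """Scale valences before 'but' to 50% and after 'but' to 150%."""
    if "but" not in words:
        return valences
    i = words.index("but")
    return [v * (0.5 if k < i else 1.5 if k > i else 1.0)
            for k, v in enumerate(valences)]

words = ["i", "love", "you", "but", "i", "hate", "waiting"]
valences = [0, 3.2, 0, 0, 0, -2.7, 0]   # hypothetical word valences
print(apply_but_checker(words, valences))  # "love" -> 1.6, "hate" -> -4.05
```

After reweighting, the negative clause dominates the summed score, matching the slide's intuition that the sentiment after "but" wins.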
Quantifying the sentiment
 VADER sentiment analysis returns a sentiment score in the range -1 to 1, from most negative to most positive.

 The sentiment score of a sentence is calculated by summing up the sentiment scores of each VADER-dictionary-listed word in the sentence.

 Individual words have a sentiment score between -4 and 4, but the returned sentiment score of a sentence is between -1 and 1.

 How do we explain this apparent contradiction?

23
Scores - Example
 "The food is good and the atmosphere is nice"

 It has two words in the Vader lexicon (good and nice) with ratings of 1.9 and 1.8 respectively.

 The first three metrics, positive, neutral and negative, represent the proportion of the text that falls into those categories.

 This example sentence was rated as 45% positive, 55% neutral and 0% negative.

 The final metric, the compound score, is the sum of all of the lexicon ratings (1.9 and 1.8 in this case), standardised to range between -1 and 1.

24
Quantifying the sentiment
 The sentiment score of a sentence is the sum of the sentiment scores of each sentiment-bearing word. However, a normalization is applied to the total to map it to a value between -1 and 1.

 The normalization used by Hutto is

    compound = x / sqrt(x^2 + alpha)

 where x is the sum of the sentiment scores of the constituent words of the sentence/document and alpha is a normalization parameter that is generally set to 15.

25
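Hutto's normalization, compound = x / sqrt(x^2 + alpha) with alpha = 15, is a one-liner; applying it to the example above (good 1.9 + nice 1.8 = 3.7) gives a compound score of about 0.69:

```python
import math

def normalize(x, alpha=15):
    """Map an unbounded word-score sum into the open interval (-1, 1)."""
    return x / math.sqrt(x * x + alpha)

print(round(normalize(3.7), 4))    # ~0.6908 for "good" + "nice"
print(round(normalize(50.0), 4))   # ~0.997: large sums saturate near 1
```

The second call shows the saturation behavior discussed next: as x grows, the score pins near -1 or 1, which is why VADER suits short texts.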
Problem with the Vader Model
 As x grows larger, the normalized score gets closer and closer to -1 or 1.

 If there are a lot of words in the document you're applying VADER sentiment analysis to, you get a score close to -1 or 1. Thus, VADER sentiment analysis works best on short documents, like tweets and sentences, not on large documents.

26
Concordant and Discordant Pairs

 Discordant pair: a pair of positive and negative observations for which the model has no cut-off probability that classifies both of them correctly.

 Concordant pair: a pair of positive and negative observations for which the model has a cut-off probability that classifies both of them correctly.

Receiver Operating Characteristics (ROC) Curve
The ROC curve is a plot of sensitivity (true positive rate) on the vertical axis against 1 - specificity (false positive rate) on the horizontal axis.
M/C Model Development

Data Acquisition → Data Cleaning and Labelling → Data Preprocessing →
Feature Extraction (Unigram, Bigram, Trigram) →
Feature Selection (BOW, TF*IDF) →
Model Development →
Model Evaluation (AUC, Confusion Matrix) →
Model Deployment
ROC Curve / AUC
General rule for acceptance of the model:

If the area under the ROC curve (AUC) is:

 AUC = 0.5 → no discrimination

 0.7 ≤ AUC < 0.8 → acceptable discrimination (below 0.7 is not acceptable as a good model)

 0.8 ≤ AUC < 0.9 → excellent discrimination

 AUC ≥ 0.9 → outstanding discrimination
Calculate AUC

Flt No   | Temp (Feature/IV) | Actual Damage to O-Ring (Labelled Data) | Predicted Probability by model
STS-1    | 66                | 0                                       | 0.43
STS-2    | 70                | 1                                       | 0.23
STS-3    | 69                | 0                                       | 0.27
STS-4    | 80                | 0                                       | 0.03
STS-61C  | 58                | 1                                       | 0.83
Calculate AUC

Sl. No | Pair Details    | Probability of category 1 | Probability of category 0 | Pair Status
1      | STS-2, STS-1    | 0.23                      | 0.43                      | Discordant
2      | STS-2, STS-3    | 0.23                      | 0.27                      | Discordant
3      | STS-2, STS-4    | 0.23                      | 0.03                      | Concordant
4      | STS-61C, STS-1  | 0.83                      | 0.43                      | Concordant
5      | STS-61C, STS-3  | 0.83                      | 0.27                      | Concordant
6      | STS-61C, STS-4  | 0.83                      | 0.03                      | Concordant
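The pair table above can be reproduced programmatically. A minimal sketch using the rank-based view of AUC (concordant pairs plus half of any ties, over all positive-negative pairs):

```python
from itertools import product

# (flight, predicted probability) for each actual class, from the table
ones  = [("STS-2", 0.23), ("STS-61C", 0.83)]                 # actual = 1
zeros = [("STS-1", 0.43), ("STS-3", 0.27), ("STS-4", 0.03)]  # actual = 0

concordant = discordant = ties = 0
for (_, p1), (_, p0) in product(ones, zeros):
    if p1 > p0:
        concordant += 1   # the 1-case is ranked above the 0-case
    elif p1 < p0:
        discordant += 1
    else:
        ties += 1

auc = (concordant + 0.5 * ties) / (concordant + discordant + ties)
print(concordant, discordant, auc)  # 4 2 0.666...
```

Four of the six pairs are concordant, giving an AUC of 4/6 ≈ 0.67, which by the earlier rule of thumb falls just below acceptable discrimination.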
What is the classification accuracy?
Confusion Matrix
Accuracy (weighted average of the per-class accuracies):

(135 × 95.6% + 414 × 65.9%) / 549 = 73.2%
