Computational Knowledge Analysis –

Natural Language Processing with Python


Session 7
Sentiment Analysis
26.06.2023
Dr. Maria Becker
Summer Term 2023

What is Sentiment Analysis?
• A technique that identifies the sentiment expressed in a text or passage
• Task: given a sentence or text, decide whether it is positive, negative, or neutral
• (Still) a hot topic in NLP with lots of new developments
• Very application-oriented area of NLP

Labels for Sentiment Analysis
• Common labels:
• Categorical: positive vs. negative (binary), sometimes with a third label, neutral
• Ranges (e.g. between -3 and +3)
• Usually a model outputs a probability for each label (between 0 and 1, e.g. pos=0.367, neg=0.456)
• A text can be assigned an overall polarity by summing the values of its positively scored words and subtracting the values of the negatively scored ones
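The summation rule above can be sketched in a few lines of Python. The lexicon and its values are invented for illustration (using the -3 to +3 range mentioned above) and do not come from any published resource.

```python
import re

# Hypothetical toy lexicon: word -> polarity value in the range -3..+3.
TOY_LEXICON = {
    "best": 3,
    "love": 3,
    "good": 2,
    "dislike": -2,
    "terrible": -3,
    "hate": -3,
}


def overall_polarity(text, lexicon=TOY_LEXICON):
    """Sum the lexicon values of all words in the text.

    Negative words carry negative values, so adding them subtracts from
    the total. Words missing from the lexicon count as 0.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(token, 0) for token in tokens)


def label(score):
    """Map a numeric score to a coarse label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for sentence in [
    "Netflix has the best selection of films.",
    "I dislike the new crime series.",
    "The final episode was surprising with a terrible twist at the end.",
]:
    score = overall_polarity(sentence)
    print(f"{score:+d} {label(score)}: {sentence}")
```

A scorer like this looks at words in isolation, so it has no way to handle negation or sarcasm.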

Related fields/subfields

• Opinion mining: mostly used synonymously with sentiment analysis

• Emotion detection: goes beyond polarity to detect emotions such as happiness, frustration, anger, and sadness

• Stance recognition: determines whether someone is in favor of or against something, e.g. a topic

Subtasks of Sentiment Analysis

Challenges in Analyzing Sentiment
Easy examples of sentiment analysis
• Netflix has the best selection of films.
• I dislike the new crime series.
• I hate waiting for the next series to come out.

More challenging examples of sentiment analysis
• I do not dislike horror movies.
• Sometimes I really hate the show.
• I love having to wait two months for the next series to come out!
• The final episode was surprising with a terrible twist at the end.
• I LOL’d at the end of the cake scene.

Challenges in Analyzing Sentiment
• Negations
• Modifiers
• Ambiguous words
• Negative terms used in a positive way
• New terms (e.g. in social media)
• Domain dependence
• Noisy text
• Multimodality
• Sarcasm
• Fake reviews
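
To see why negation is a challenge, the following sketch compares a naive word-count scorer with a variant that flips the polarity of a word directly preceded by a negator. The word lists and the one-word negation window are simplifying assumptions, not an established method. Sarcasm, as in the "wait two months" example, still fools both scorers.

```python
import re

# Hypothetical mini word lists, for illustration only.
POSITIVE = {"love", "best", "like"}
NEGATIVE = {"hate", "dislike", "terrible"}
NEGATORS = {"not", "never", "no"}


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


def naive_score(text):
    """+1 per positive word, -1 per negative word, ignoring context."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokenize(text))


def negation_aware_score(text):
    """Like naive_score, but flip the sign of a word that directly follows a negator."""
    tokens = tokenize(text)
    score = 0
    for i, token in enumerate(tokens):
        value = (token in POSITIVE) - (token in NEGATIVE)
        if i > 0 and tokens[i - 1] in NEGATORS:
            value = -value
        score += value
    return score


for sentence in [
    "I do not dislike horror movies.",
    "I love having to wait two months for the next series to come out!",
]:
    print(naive_score(sentence), negation_aware_score(sentence), sentence)
```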

Methods for Performing Sentiment Analysis
• Lexicon-based methods (also called rule-based or dictionary-based methods):
• Use lexicons of words that are pre-annotated for the sentiment they express (sentiment-bearing words)
• Ways to create such lexicons:
• Crowdsourcing
• Expert annotations
• Semi-automatic approaches
• Machine learning approaches:
• Supervised training of neural or feature-based models on sentiment-annotated corpora
• Often make use of other linguistic features such as dependency structures or POS tags
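
As an illustration of the supervised route, here is a minimal sketch of a feature-based classifier: a multinomial Naive Bayes model with bag-of-words features and add-one smoothing, written in plain Python. Naive Bayes is just one possible choice of model. The four training sentences are invented; a real experiment would use a sentiment-annotated corpus and typically a library implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over bag-of-words features with add-one smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, lab in zip(texts, labels):
            self.word_counts[lab].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def log_scores(self, text):
        """Return an unnormalized log-probability for each label."""
        total_docs = sum(self.label_counts.values())
        scores = {}
        for lab, n_docs in self.label_counts.items():
            counts = self.word_counts[lab]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for token in tokenize(text):
                if token in self.vocab:  # skip words never seen in training
                    score += math.log((counts[token] + 1) / denom)
            scores[lab] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy training data; far too small for real use.
train_texts = [
    "I love this show",
    "the best film of the year",
    "I hate this series",
    "a terrible and boring film",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(train_texts, train_labels)
print(model.predict("I love the best film"))
print(model.predict("what a terrible series"))
```

Unlike a lexicon, the model learns which words signal which label from the annotated examples, so its quality depends directly on the size and domain of the training corpus.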

Applications of Sentiment Analysis
• Monitoring social media mentions of brands etc.
• Analyzing feedback from surveys and product reviews
• Analyzing incoming support tickets, e.g. to detect angry customers
• … and many more!

Sentiment Analysis – Assignments

Zip all completed tasks (code, outputs, PDF documents, etc.) into one archive and send it to me (maria.becker@gs.uni-heidelberg.de) by 09/07/2023.

ASSIGNMENT 1: Choose between option 1 and 2
Option 1 (with coding):
• Write a small Python program that counts all occurrences of sentiment-bearing words in each article of your text corpus and outputs the number of positive vs. negative words per article.
• Use a sentiment lexicon of your choice, e.g. http://mpqa.cs.pitt.edu/lexicons/subj_lexicon/
• Send me the code (in a .py/.ipynb file) and its output (in a PDF file).
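
One possible starting point for Option 1 is sketched below. It assumes that the lexicon file has one entry per line, made up of space-separated key=value fields that include word1 and priorpolarity. Check your downloaded file before relying on this assumption. The sample lexicon lines and the article texts are made up.

```python
import re
from collections import Counter


def parse_lexicon(lines):
    """Build a word -> polarity dict from 'key=value key=value ...' lines.

    Assumed line format (verify against your lexicon file):
    type=strongsubj len=1 word1=terrible pos1=adj stemmed1=n priorpolarity=negative
    """
    lexicon = {}
    for line in lines:
        fields = dict(part.split("=", 1) for part in line.split() if "=" in part)
        word, polarity = fields.get("word1"), fields.get("priorpolarity")
        if word and polarity in ("positive", "negative"):
            lexicon[word] = polarity
    return lexicon


def count_sentiment_words(text, lexicon):
    """Return (positive_count, negative_count) for one article."""
    counts = Counter()
    for token in re.findall(r"[a-z']+", text.lower()):
        if token in lexicon:
            counts[lexicon[token]] += 1
    return counts["positive"], counts["negative"]


# Made-up lexicon lines and articles; replace with your lexicon file and corpus.
sample_lines = [
    "type=strongsubj len=1 word1=terrible pos1=adj stemmed1=n priorpolarity=negative",
    "type=strongsubj len=1 word1=hate pos1=verb stemmed1=y priorpolarity=negative",
    "type=strongsubj len=1 word1=best pos1=adj stemmed1=n priorpolarity=positive",
]
articles = {
    "article_1": "The best selection of films. The best!",
    "article_2": "I hate waiting. A terrible twist, but the best cast.",
}

lexicon = parse_lexicon(sample_lines)
for name, text in articles.items():
    pos, neg = count_sentiment_words(text, lexicon)
    print(f"{name}: positive={pos} negative={neg}")
```

If a word appears more than once in the lexicon (for example with different parts of speech), this sketch keeps only the last entry; whether that is acceptable depends on your lexicon.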

Option 2 (without coding):
• Analyze three or more of the texts in your corpus using at least two sentiment analysis demos of your choice (e.g. https://text2data.com/Demo, http://text-processing.com/demo/sentiment/, https://monkeylearn.com/sentiment-analysis-online/)
• Find out how the models work: Who developed them? Which algorithms do they use? For what purpose were they developed? Have they been released or published, e.g. at NLP conferences?
• Analyze the output of the systems and compare the results of the different tools. What are the challenges and (frequent) mistakes? Is the sentiment analysis helpful? Did the tools help you gain further insights into your data?
• Send me your observations (key points are sufficient) in a PDF file.

ASSIGNMENT 2 (Everybody)

• TextBlob and NLTK both offer sentiment analysis modules. Find out how the models work and which methods/algorithms they are based on.
• As a starting point, you can use these webpages:
• https://investigate.ai/investigating-sentiment-analysis/comparing-sentiment-analysis-tools/
• https://towardsdatascience.com/my-absolute-go-to-for-sentiment-analysis-textblob-3ac3a11d524
• Send me your observations (key points are sufficient) in a PDF file.

ASSIGNMENT 3 (Everybody)

• Apply the TextBlob and NLTK methods provided in the Jupyter Notebook in Moodle to a sample set of sentences from your corpus (about 15 sentences).
• Evaluate the results manually: What are the differences between the models? Are the sentiment scores correct? What are the challenges and (frequent) mistakes? What are possible sources of error?
• Send me your observations (key points are sufficient) in a PDF file.

ASSIGNMENT 4:
Choose between option 1, 2 or 3
Option 1 (with coding):
• Modify the Jupyter Notebook so that it can be applied to your whole corpus, and analyze the results.
• This includes the following steps:
• Split each article into sentences, using punctuation marks as separators (. ! : ?)
• Iterate over each sentence
• Calculate the mean sentiment score per article
• Evaluate the results manually: Are the sentiment scores per sentence/per article correct? What are possible sources of error?
• Send me the code (in a .py/.ipynb file), its output (in a text or Word file) and your observations (key points are sufficient, in a PDF file)
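
The steps listed for this option could be organized as in the sketch below. The function toy_score is a hypothetical stand-in for whichever scorer the course notebook provides (the notebook itself is not reproduced here); any function that maps a sentence to a number can be passed in instead. The per-article score is computed as the arithmetic average of the sentence scores.

```python
import re
from statistics import mean


def split_sentences(article):
    """Split on the separators named in the task: . ! : ?"""
    parts = re.split(r"[.!:?]+", article)
    return [p.strip() for p in parts if p.strip()]


def toy_score(sentence):
    """Hypothetical placeholder scorer: +1 / -1 per listed word, clipped to [-1, 1]."""
    positive, negative = {"love", "best"}, {"hate", "terrible"}
    tokens = re.findall(r"[a-z']+", sentence.lower())
    raw = sum((t in positive) - (t in negative) for t in tokens)
    return max(-1.0, min(1.0, float(raw)))


def article_sentiment(article, score_fn=toy_score):
    """Return (sentence, score) pairs and the average score for one article."""
    sentences = split_sentences(article)
    scores = [score_fn(s) for s in sentences]
    average = mean(scores) if scores else 0.0
    return list(zip(sentences, scores)), average


# Made-up article text; replace with the articles of your corpus.
article = "I love this show! The cast is the best. The ending was terrible."
per_sentence, average = article_sentiment(article)
for sentence, score in per_sentence:
    print(f"{score:+.1f}  {sentence}")
print(f"average: {average:+.2f}")
```

Splitting on every period also cuts abbreviations such as "e.g." and decimal numbers into fragments, which is one of the error sources worth checking in the manual evaluation.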

Option 2 (with coding):
• What other sentiment analysis libraries are available for Python? Choose one.
• Example: https://huggingface.co/blog/sentiment-analysis-python
• What are the algorithms behind it?
• Apply it to your data and compare the results with those of the TextBlob and NLTK methods. What are the differences? Which model works best for your data?
• Send me the code (in a .py file), its output (in a text or Word file) and your observations (key points are sufficient, in a text or Word file)

Option 3 (without coding):
• Write a short survey (1-2 pages) about the history and development of sentiment analysis in NLP. Which methods were applied in the past, and which methods are used today? What are earlier and recent challenges? What are common applications?
• Refer to at least 3 papers from the ACL Anthology (https://aclanthology.org/) and to both recent and older blog posts.
• Send me the survey in a text or Word file.

Next Session

- Next session: 03.07.2023 online
- Deep Learning for NLP – An Introduction
- No preparation required
- No elevator talks
