Twitter Sentiment Analysis (NLP)
Workflow
Explanation in Brief
• An NLP data science project to find out how people feel about 2021.
• After 2020 turned out to be a disaster, we’ve all been looking forward to 2021 with hope. I
decided to perform a Twitter sentiment analysis to find out whether the new year is treating us well.
• With this project I wanted to get familiar with Natural Language Processing (NLP) techniques
and answer the following questions:
• What are the most common words people use to describe 2021?
• How many tweets have positive, negative, and neutral sentiment?
• What are the most common words used in positive, neutral, and negative tweets?
• What are the most liked and retweeted posts?
Tweets Mining
• In this case, data cleaning was a short step. First, I checked the
shape of the data and the names of the columns, and dropped a column
that contained a second index (created during merging).
Second, I checked for duplicates using the column “id” as the
subset and dropped all of them. Lastly, I checked for NaN values,
and since ‘Location’ contained only missing values I decided to drop that
column. After a bit of research, I found out that Twitter had disabled
automatic location tagging for its users, which is why the
location data is missing from the tweets.
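The cleaning steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the function name clean_tweets, the leftover index column name "Unnamed: 0", the "text" column, and the sample rows are all assumptions made for the example. Only the "id" and "Location" column names come from the description above.

```python
import pandas as pd


def clean_tweets(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the three cleaning steps described above to a merged tweets frame."""
    # Step 1: inspect the shape and column names.
    print(df.shape, list(df.columns))

    # Drop the second index left over from merging. "Unnamed: 0" is an
    # assumed name; errors="ignore" skips the drop if the column is absent.
    df = df.drop(columns=["Unnamed: 0"], errors="ignore")

    # Step 2: remove duplicate tweets, using "id" as the subset.
    # The default keep="first" retains one copy of each duplicated tweet.
    df = df.drop_duplicates(subset="id")

    # Step 3: drop every column that contains only NaN values
    # (in the project this was the "Location" column).
    empty_cols = [col for col in df.columns if df[col].isna().all()]
    df = df.drop(columns=empty_cols)

    return df.reset_index(drop=True)


# Tiny made-up sample standing in for the merged tweets data.
raw = pd.DataFrame(
    {
        "Unnamed: 0": [0, 1, 0, 1],
        "id": [101, 102, 102, 103],
        "text": ["hello 2021", "new year", "new year", "so far so good"],
        "Location": [float("nan")] * 4,
    }
)

cleaned = clean_tweets(raw)
print(cleaned)
```

Detecting the empty columns programmatically, rather than hard-coding ‘Location’, keeps the sketch usable if another column turns out to be entirely missing as well.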
Tweets Processing (Tokenization/POS tagging)
References
all-possible-pos-tags-of-nltk
• https://docs.tweepy.org/en/latest/
• https://developer.twitter.com/en
• https://github.com/s/preprocessor