NLP Workshop Assignment - Feedback Data - Apr 20, 2019
Problem Statement
Microsoft has collected feedback from its users for a flagship event.
It has provided us with feedback comments and asked the MAQ team to generate insights from this
data.
Name Values
MAQ Software - Confidential
Text Pre-processing – Spell Checker / Abbreviations
1. Perform a spell check on the comments data
- Run a spell checker on the comment text and correct any misspelled words
2. Comments might contain words or phrases which are not present in any standard dictionaries. These pieces
are not recognized by search engines and models. With the help of regular expressions and manually
Ex: acronyms, hashtags with attached words, etc.
Before    After
MS        Microsoft
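The two pre-processing steps above can be sketched with the Python standard library alone. This is a minimal illustration, not the workshop's reference solution: the vocabulary, the `cutoff` value, and the helper names (`correct_spelling`, `expand_abbreviations`, `split_hashtags`) are all assumptions, and `difflib` closest-match lookup stands in for a real spell-checking library. Only the `MS` to `Microsoft` mapping comes from the table above.

```python
import re
from difflib import get_close_matches

# Hypothetical vocabulary; a real run would load a full dictionary.
VOCABULARY = ["the", "session", "was", "excellent", "speaker", "microsoft", "great"]

# Manually curated mapping; only the MS entry comes from the assignment.
ABBREVIATIONS = {"MS": "Microsoft"}


def correct_spelling(text, vocabulary=VOCABULARY, cutoff=0.8):
    """Replace each unknown word with its closest vocabulary entry, if one exists."""
    def fix(match):
        word = match.group(0)
        if word.lower() in vocabulary:
            return word  # already a known word, keep original casing
        candidates = get_close_matches(word.lower(), vocabulary, n=1, cutoff=cutoff)
        return candidates[0] if candidates else word

    return re.sub(r"[A-Za-z]+", fix, text)


def expand_abbreviations(text, mapping=ABBREVIATIONS):
    """Replace whole-word abbreviations with their expanded form."""
    pattern = r"\b(" + "|".join(re.escape(key) for key in mapping) + r")\b"
    return re.sub(pattern, lambda m: mapping[m.group(1)], text)


def split_hashtags(text):
    """Turn '#GreatSession' into 'Great Session' by splitting on case changes."""
    def split(match):
        return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", match.group(1))

    return re.sub(r"#(\w+)", split, text)


print(correct_spelling("The sesion was excelent"))
print(expand_abbreviations("MS hosted a great event"))
print(split_hashtags("#GreatSession"))
```

Expanding abbreviations before spell checking is a reasonable ordering, since short acronyms are otherwise likely to be treated as misspellings.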
- Lemmatization is an organized, step-by-step procedure for obtaining the root form (lemma) of a word
Before         After
am, are, is    be
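As a rough illustration of the table above, lemmatization can be approximated with a lookup table. This is a deliberately simplified sketch: the `LEMMA_TABLE` contents and the `lemmatize` helper are hypothetical, and only the `am`, `are`, `is` entries come from the assignment. A production pipeline would normally rely on a dictionary-backed lemmatizer from an NLP library instead.

```python
import re

# Hypothetical lookup table; only am/are/is -> be comes from the slide.
LEMMA_TABLE = {
    "am": "be",
    "are": "be",
    "is": "be",
    "sessions": "session",
    "speakers": "speaker",
}


def lemmatize(text, table=LEMMA_TABLE):
    """Lowercase the text, tokenize it, and map each token to its lemma if known."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [table.get(token, token) for token in tokens]


print(lemmatize("The speakers are great"))
```

Tokens missing from the table pass through unchanged, so the sketch never damages words it does not know.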
1. Extract the top keywords (uni-grams and bi-grams) from all the comments
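One way to approach keyword extraction is to count n-grams with `collections.Counter`. The sketch below is an assumption-laden example rather than the expected solution: the stop-word set, the sample comments, and the `tokenize` and `top_ngrams` helpers are invented for illustration.

```python
import re
from collections import Counter

# Small illustrative stop-word set; a real run would use a fuller list.
STOPWORDS = {"the", "was", "a", "and", "were"}


def tokenize(comment):
    """Lowercase a comment and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z']+", comment.lower()) if t not in STOPWORDS]


def top_ngrams(comments, n, k):
    """Return the k most frequent n-grams across all comments."""
    counts = Counter()
    for comment in comments:
        tokens = tokenize(comment)
        # Slide a window of n tokens over each comment separately,
        # so n-grams never span two different comments.
        counts.update(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return counts.most_common(k)


# Made-up comments, used only to exercise the functions.
comments = [
    "The keynote speaker was great",
    "Great keynote speaker and demos",
]
print(top_ngrams(comments, n=1, k=3))
print(top_ngrams(comments, n=2, k=1))
```

Note that removing stop words before building bi-grams joins words that were not adjacent in the original text; whether that is desirable depends on the insight being sought.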
3. Create a bar chart showing the positive sentiment score for each Technology Pillar.
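A possible shape for this step is to average the positive score per pillar and then plot the result. The sketch assumes each comment already carries a pillar label and a positive sentiment score between 0 and 1; the pillar names and scores below are placeholders, and `mean_positive_score` and `plot_positive_scores` are hypothetical helpers. Plotting assumes matplotlib is installed.

```python
from collections import defaultdict


def mean_positive_score(records):
    """Average the positive sentiment score for each Technology Pillar."""
    scores_by_pillar = defaultdict(list)
    for pillar, score in records:
        scores_by_pillar[pillar].append(score)
    return {pillar: sum(s) / len(s) for pillar, s in scores_by_pillar.items()}


def plot_positive_scores(scores, path="positive_sentiment_by_pillar.png"):
    """Save a bar chart of mean positive score per pillar (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    # Sort pillars from highest to lowest score for easier reading.
    pillars = sorted(scores, key=scores.get, reverse=True)
    fig, ax = plt.subplots()
    ax.bar(pillars, [scores[p] for p in pillars])
    ax.set_xlabel("Technology Pillar")
    ax.set_ylabel("Mean positive sentiment score")
    ax.set_title("Positive sentiment by Technology Pillar")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


# Placeholder records (pillar, positive score); not real feedback data.
records = [("Pillar A", 0.75), ("Pillar A", 0.5), ("Pillar B", 0.25)]
scores = mean_positive_score(records)
print(scores)
```

Calling `plot_positive_scores(scores)` would then write the chart to a PNG file in the working directory.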