Defense Project Slide Final
ON
Analysis of Twitter Data Mining on Genres of Netflix
Web Series
Presented By
Biplove Pokhrel
MSCmKE007
INTRODUCTION
PROBLEM STATEMENT
OBJECTIVES
LITERATURE REVIEW
METHODOLOGY
RESULTS
ANALYSIS
REFERENCES
INTRODUCTION
[Diagram: relationships between USER and TWEET. A tweet is posted by, retweeted by, liked by, or replied to by a user; a tweet replies to, or is replied from, another tweet.]
PROBLEM STATEMENT
OBJECTIVES
LITERATURE REVIEW
S.N.: 1
Title: Measuring user influence on Twitter: A survey
Published Year: 2016 AD
Feature: Different Twitter metrics to discuss the activity of popular users on Twitter
Authors: F. Riquelme and P. González-Cantergiani
METHODOLOGY
[Block diagram: Twitter Data Collection; Data Preprocessing; Feature Extraction and Selection; Features (Activity Metrics, Popularity Metrics, Sentiment Polarity); Classifier; Calculation; Analysis of Genres]
METHODOLOGY
Data Acquisition
• Twitter data sets are collected using the Tweepy library.
• Tweets are collected from the Twitter handles of popular users of each genre, along with hashtag tweets for popular keywords.
METHODOLOGY
Text Preprocessing
• Converting all letters to Lower or Upper case
• Converting Numbers into Words or Removing Numbers
• Removing Punctuation, Accent Marks, and Other Diacritics
• Removing White Spaces
• Expanding Abbreviations
• Removing Stop Words, Sparse Terms, and Particular Words
• Removing URLs and Unnecessary Emojis
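The preprocessing steps listed above can be sketched roughly as follows. This is a minimal illustration and not the pipeline used in the project: the `preprocess` function, the tiny `STOP_WORDS` and `ABBREVIATIONS` tables, and the order of the steps are all assumptions made for the example. Numbers are removed rather than converted to words, and every non-ASCII character (including all emojis) is dropped.

```python
import re
import string
import unicodedata

# Hypothetical, deliberately tiny word lists; a real pipeline would use fuller ones.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "on", "to", "and", "in", "it"}
ABBREVIATIONS = {"u": "you", "gr8": "great", "pls": "please"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def preprocess(tweet: str) -> list[str]:
    """Return the cleaned tokens of a single tweet."""
    text = tweet.lower()                      # convert to lower case
    text = URL_RE.sub(" ", text)              # remove URLs before touching punctuation
    # Strip accent marks and other diacritics ("café" becomes "cafe").
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Drop emojis and any other remaining non-ASCII characters.
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.translate(PUNCT_TO_SPACE)     # replace punctuation with spaces
    # Expand abbreviations token by token, before digits are removed ("gr8").
    text = " ".join(ABBREVIATIONS.get(tok, tok) for tok in text.split())
    text = re.sub(r"\d+", " ", text)          # remove numbers
    tokens = text.split()                     # split() also collapses white space
    return [tok for tok in tokens if tok not in STOP_WORDS]


if __name__ == "__main__":
    print(preprocess("Stranger Things is GR8!!! 😍 Watch it: https://t.co/abc123 #Netflix 2019 café"))
```

Expanding abbreviations before removing digits matters here, since an abbreviation such as "gr8" would otherwise be reduced to "gr" and never matched.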
METHODOLOGY
Source: http://www2.imm.dtu.dk/pubdb/views/publication_details.php?id=6010
METHODOLOGY
• Each word in the AFINN-111 lexicon is used to categorize unigrams into four categories: Very Positive, Positive, Negative, and Very Negative.
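As a rough sketch of this categorization: AFINN assigns each word an integer valence from -5 to +5, so a unigram can be placed in one of the four categories by the sign and magnitude of its score. The cut-off between "Positive" and "Very Positive" is not stated in the slides; the threshold of 3 below is an assumption, as are the function names. `SAMPLE_LEXICON` contains illustrative placeholder scores, not entries copied from the AFINN-111 file.

```python
from typing import Optional

# Illustrative placeholder scores on the AFINN -5..+5 scale (not taken from AFINN-111).
SAMPLE_LEXICON = {"superb": 5, "amazing": 4, "fine": 2, "boring": -2, "worst": -4}


def categorize(score: int, strong_threshold: int = 3) -> Optional[str]:
    """Map a valence score to one of four categories; 0 has no category.

    strong_threshold is an assumed cut-off for the "Very" categories.
    """
    if score >= strong_threshold:
        return "Very Positive"
    if score > 0:
        return "Positive"
    if score <= -strong_threshold:
        return "Very Negative"
    if score < 0:
        return "Negative"
    return None


def categorize_unigrams(tokens: list[str], lexicon: dict[str, int]) -> dict[str, str]:
    """Return {unigram: category} for every token found in the lexicon."""
    categories = {}
    for tok in tokens:
        label = categorize(lexicon[tok]) if tok in lexicon else None
        if label is not None:
            categories[tok] = label
    return categories


if __name__ == "__main__":
    tokens = ["the", "plot", "was", "superb", "but", "pacing", "boring"]
    print(categorize_unigrams(tokens, SAMPLE_LEXICON))
```

Tokens that do not appear in the lexicon are simply skipped, so only sentiment-bearing unigrams contribute to the categories.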
METHODOLOGY
Metric: Definition
OT1: Original tweets of the user
RP1: Replies from the user
RT1: Retweets from the user
FT1: Favorite tweets from the user
GA: General Activity (OT1 + RP1 + RT1 + FT1)
F1: Followers of the user
F3: Accounts that the user follows
Follower Rank: Defined as F1/(F1+F3)
Popularity: Defined as the number of in-links in the network
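The two formula-based metrics can be worked through with a short sketch. The `UserCounts` class and its field names are hypothetical, and the numbers in the usage example are made up. Follower Rank is computed here as followers / (followers + followees), the convention of the cited survey by Riquelme and González-Cantergiani; treat the mapping of F1 and F3 onto these two counts as an assumption of the example.

```python
from dataclasses import dataclass


@dataclass
class UserCounts:
    """Per-user counts; field names are illustrative, not from the project code."""
    original_tweets: int  # OT1
    replies: int          # RP1
    retweets: int         # RT1
    favorites: int        # FT1
    followers: int        # accounts following the user
    followees: int        # accounts the user follows


def general_activity(u: UserCounts) -> int:
    """GA = OT1 + RP1 + RT1 + FT1."""
    return u.original_tweets + u.replies + u.retweets + u.favorites


def follower_rank(u: UserCounts) -> float:
    """Followers as a share of followers plus followees; 0.0 if both are zero."""
    total = u.followers + u.followees
    return u.followers / total if total else 0.0


if __name__ == "__main__":
    user = UserCounts(original_tweets=120, replies=30, retweets=45,
                      favorites=5, followers=900, followees=100)
    print(general_activity(user))  # 120 + 30 + 45 + 5 = 200
    print(follower_rank(user))     # 900 / (900 + 100)
```

A Follower Rank close to 1 indicates an account with many more followers than followees, which is the pattern expected of a popular handle.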
RESULTS
[Figures: confusion matrices for Genre 1, Genre 2, Genre 3, and Genre 4, each with predicted classes Positive, Negative, and Neutral]
RESULTS
[Figure: confusion matrix for Genre 5, with predicted classes Positive, Negative, and Neutral]
[Table columns: Genre, Mean General Activity, Mean Popularity, Follower]
[Bar charts by genre: Genre 1 (Comedy), Genre 2 (Drama), Genre 3 (Sci-Fi), Genre 4 (Romance), Genre 5 (Action); one chart shows General Activity]
REFERENCES
• A. Pak and P. Paroubek, “Twitter for Sentiment Analysis: When Language Resources are Not Available,”
2011 22nd International Workshop on Database and Expert Systems Applications, 2011
• A. U. Khan, M. Khan, and M. B. Khan, “Naïve Multi-label Classification of YouTube Comments Using
Comparative Opinion Mining,” Procedia Computer Science, vol. 82, pp. 57–64, 2016.
• A. Rahman and M. S. Hossen, “Sentiment analysis on movie review data using machine learning
approach,” in 2019 International Conference on Bangla Speech and Language Processing (ICBSLP),
2019, pp. 1–4.
• U. Kumari, A. K. Sharma and D. Soni, "Sentiment analysis of smart phone product review using SVM
classification technique," 2017 International Conference on Energy, Communication, Data Analytics and
Soft Computing (ICECDS), Chennai, India, 2017, pp. 1469-1474, doi: 10.1109/ICECDS.2017.8389689.
• F. Riquelme and P. González-Cantergiani, “Measuring user influence on Twitter: A survey,” Information
Processing & Management, vol. 52, no. 5, pp. 949–975, 2016.
• M. Cha, H. Haddadi, F. Benevenuto, and K. Gummadi, “Measuring User Influence in Twitter: The Million Follower Fallacy,” 2010.
• J. Sun and J. Tang, “A Survey of Models and Algorithms for Social Influence Analysis,” Social Network
Data Analytics, pp. 177–214, 2011.
• S. Abe, Support Vector Machines for Pattern Classification, Springer-Verlag London Limited, 2008, 350 pp.
• I. Steinwart and C. Scovel, "Fast rates for support vector machines using Gaussian kernels", The Annals
of Statistics, vol. 35, no. 2, pp. 575-607, 2007. Available: 10.1214/009053606000001226.
• A. Pal and S. Counts, “Identifying topical authorities in microblogs,” Proceedings of the fourth ACM
international conference on Web search and data mining - WSDM '11, 2011.
APPENDIX
SERIES AND TWITTER HANDLES