Detect Sarcastic

Detecting the Polarity of Sarcastic Reviews to Identify Highly Profitable Products Using Machine Learning
DEVAGANTHAN B (192071014)
ANAND R (192071005)
BHARATH R (192071011)
JOHN PAUL D (192071033)

Domain
• Natural Language Processing
• The branch of artificial intelligence (AI) concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
• Sentiment Analysis
• Sentiment analysis, also referred to as opinion mining, is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text.
• Sarcastic Review
• A review in which customers express a negative feeling using positive words.
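The definition of a sarcastic review above can be illustrated with a small sketch. It assumes a sarcasm flag is already available (for example from a classifier) and uses two tiny, hypothetical word lists that are not part of the project; it only shows how positive surface wording might be read as negative polarity once a review is known to be sarcastic.

```python
# Hypothetical word lists, chosen only for illustration.
POSITIVE_WORDS = {"love", "great", "excellent", "perfect", "amazing"}
NEGATIVE_WORDS = {"hate", "bad", "terrible", "broken", "awful"}


def surface_polarity(text):
    """Return 'positive', 'negative' or 'neutral' from simple word counts."""
    words = [w.strip(".,!?\"'") for w in text.lower().split()]
    score = sum(w in POSITIVE_WORDS for w in words) - sum(
        w in NEGATIVE_WORDS for w in words
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def effective_polarity(text, is_sarcastic):
    """Treat positive wording in a sarcastic review as negative polarity."""
    surface = surface_polarity(text)
    if is_sarcastic and surface == "positive":
        return "negative"
    return surface
```

For instance, `effective_polarity("I love this product, it broke in a day!", True)` reads the positive wording as negative because the review is flagged as sarcastic.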
Abstract
• Most people express their ideas, views and opinions about products on online platforms such as Amazon and Flipkart.
• This feedback carries emotion. A comment can be either straightforward or sarcastic.
• Sentiment analysis is used to analyze the perspective of a text.
• Sarcasm, a bitter way of conveying information, can also be present in the text.
• Selecting the dataset is the initial task. The dataset is retrieved from Amazon datasets.
• Feature extraction is then performed, which includes term frequency, inverse document frequency and n-grams.
• A Support Vector Machine (SVM) is implemented as the classification algorithm.
• Using accuracy as the evaluation parameter, sarcastic and fake reviews of various products are processed for better efficiency.
• Every customer is given a unique ID with which they can submit reviews about the products they purchased. Based on those reviews, sarcastic information is gathered and optimized.
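The feature extraction step named in the abstract (term frequency, inverse document frequency and n-grams) could look roughly like the following. This is a minimal standard-library sketch, not the project's implementation: the whitespace tokenizer, the unsmoothed `log(N / df)` weighting and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text, max_n=2):
    """Unigrams up to max_n-grams from a lower-cased, whitespace-split text."""
    tokens = text.lower().split()
    terms = []
    for n in range(1, max_n + 1):
        terms.extend(ngrams(tokens, n))
    return terms


def tfidf(documents, max_n=2):
    """Return one {term: tf-idf weight} dictionary per document."""
    term_lists = [extract_terms(doc, max_n) for doc in documents]
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))
    n_docs = len(documents)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors
```

With this weighting, a term that occurs in every review gets weight zero, while terms and bigrams specific to one review get positive weights.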
Objectives
• The main objective of this project is to prevent fake reviews by sending the review ID to the customer's registered email ID.
• The project also aims to detect fake and sarcastic reviews in a set of product reviews by simulating spam reviews using SVM.
Introduction
• Retail websites such as Amazon.com offer reviewers different options for writing their reviews.
• For instance, the consumer can give a numerical rating from 1 to 5 or write comments about the product.
• As innumerable products are manufactured by many different brands, providing relevant reviews to consumers is the need of the hour.
• The number of reviews associated with a product or a brand is increasing at an alarming rate, and handling them is effectively a big data problem.
• Classifying reviews by customer sentiment into positive, negative, fake and sarcastic reviews provides the sentiment orientation of each review and hence results in better judgment.
• Segregating reviews by sentiment can help future buyers evaluate positive and negative feedback constructively and reach better decisions as per their requirements.
• This evaluation acts as a testimony for users who want to know the details and specifications of the products, thereby increasing user credibility.
Literature Review
1. Title: Effective Text Data Preprocessing Technique for Sentiment Analysis in Social Media Data
Author/Journal/Year: Saurav Pradha et al. (2019), IEEE
Remarks:
• Preprocessing techniques: stemming, lemmatization and spelling correction
• Stemming performed best in terms of computational speed
2. Title: Data Preprocessing for Efficient Sentimental Analysis
Author/Journal/Year/Page No: Shreyas Wankhede et al. (2018), International Conference on Inventive Communication and Computational Technologies, pg 723-727
Remarks:
• Uses the n-gram method and a Hidden Markov Model for spell-checking and correction of tweets
• Emoji Sentiment Ranking is used to evaluate the sentiment mapping of emojis by sentiment polarity
3. Title: Feature Extraction and Classification of Movie Reviews using Advanced Machine Learning Models
Author/Journal/Year/Page No: Sai Chandra Rachiraju & Madamala Revanth (2020), International Conference on Intelligent Computing and Control Systems, pg 814-817
Remarks:
• Used IMDB review datasets
• Word2vec with SVM gives the better accuracy (83.4)
4. Title: Deep LSTM-RNN with Word Embedding for Sarcasm Detection on Twitter
Author/Journal/Year: Sayed Saniya Salim et al. (2020), International Conference for Emerging Technology
Remarks:
• Logistic regression used to detect sarcasm
• A dataset of 15000 tweets was used
• Long Short Term Memory (LSTM) and word embeddings can make sarcasm detection efficient
5. Title: Sentiment Analysis using Feature Extraction and Dictionary-Based Approaches
Author/Journal/Year: D. Deepa and A. Tamilarasi (2019), IEEE, Third International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud)
Remarks:
• Detects the polarity of words from Twitter using feature extraction and dictionary-based methods
• CountVectorizer gives 81% accuracy
6. Title: Sentiment Classification of Online Reviews Based on LDA and Semantic Analysis of Sentimental Words
Author/Journal/Year: Yongxia Jing et al. (2019), IEEE, International Symposium on Computational Intelligence and Design (ISCID)
Remarks:
• Text documents are obtained through the LDA method
• The k-Nearest Neighbors (KNN) algorithm is used for sentiment classification of online reviews
Problem Description
• Interpreting review features such as ratings and brand-name references is hard for humans, let alone machines.
• When only one review is available for a particular item, it is difficult to identify rating behaviors.
• When fake and sarcastic reviews are intentionally fabricated to resemble genuine reviews, it is hard to tell which reviews are genuine.
Existing System
• This section discusses the problem of automatic sarcasm detection.
• Sarcasm detection can be formulated as a classification task. The objective is to find out whether the sentences within a text are sarcastic or not.
• Existing computational models for detecting sarcasm in product ratings are not up to the mark. They consider lexical features and do not take word patterns as features.
• A similar approach performs pair-wise categorization: using an existing set of textual features for extracting sarcastic information, it constructs a new irony detection model.
• These approaches make use of historical reviews and sarcastic comments to detect sarcasm.
• They do not consider dialogues and cannot pick out words with multiple meanings and determine their sense, which results in low sarcasm detection performance.
Disadvantages
• Reviews written to change users’ perception of how good a product or a service is are considered spam, and are often written in exchange for money.
• Time complexity.
• Because anyone with any identity can leave comments as reviews, spammers have a tempting opportunity to write fake reviews designed to mislead users’ opinions.
• Misclassification rate.
• A neutral score is assigned when the method fails to detect any fake reviews.
• Applied to reviews written in languages other than English.
• Only average performance on simple aspect ratings.
• Time-consuming task.
Proposed System
• In this work, we focus on detecting sarcastic sentences, such as “I love this product”, posted on online product selling platforms such as Amazon and Flipkart, using SVM techniques.
• The central issue in sarcasm detection is the availability of datasets and their analysis.
• The principal sources of information are product reviews.
• These analyses are critical to business owners, who can make business decisions based on the results of analyzing users' opinions about their products.
• Sarcasm detection can be applied to product reviews, and also to stock markets, news articles or political debates.
• In the past few years remarkable work has been done in the field of sentiment analysis (SA), but sarcasm detection in textual sentences is not yet up to the mark.
• In this proposal, every user who purchases a product is given a unique ID so that they can post reviews for the products they purchased.
• All information is collected and stored in a dataset, then compared and classified using SVM, which provides accurate, optimized results.
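To make the SVM classification step more concrete, here is a minimal from-scratch sketch of a linear SVM trained with subgradient descent on the hinge loss. The slides do not specify the kernel, solver or hyperparameters, so the linear model, the learning rate, the regularization constant and the function names are assumptions; a real system would more likely use an established library.

```python
def train_linear_svm(samples, labels, epochs=200, learning_rate=0.1, reg=0.01):
    """Train a linear SVM.

    samples: list of equal-length numeric feature lists.
    labels: +1 (e.g. sarcastic) or -1 (e.g. not sarcastic) per sample.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (sum(w * xi for w, xi in zip(weights, x)) + bias)
            # L2 regularization shrinks the weights on every step.
            weights = [w - learning_rate * reg * w for w in weights]
            if margin < 1:
                # Sample is inside the margin or misclassified:
                # move the hyperplane towards classifying it correctly.
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on the side of the hyperplane."""
    score = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if score >= 0 else -1
```

In the proposed system, the feature lists would come from the feature extraction stage, and the two labels would correspond to the sarcastic and non-sarcastic classes.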
Advantages
• Best, fast and simple prevention method.
• Simpler, more efficient and takes less convergence time.
• Proposes a sentiment classification technique that uses both a supervised deep neural network and an unsupervised probabilistic generative model.
• A convolutional neural network (CNN) is good at capturing local patterns and plays an important role in NLP.
• The performance is evaluated on two short text review datasets.
• Achieves better performance.
• Using features with higher weights makes detecting fake reviews easier, with less time complexity.
• Recommendation system.
System Requirements
Hardware Requirements
• Processor : i3, i5 or i7
• RAM : 2 GB
• Hard disk : 500 GB
• Compact Disk : 650 MB
• Keyboard : Standard keyboard
• Mouse : Logitech mouse
• Monitor : 15 inch color monitor
Software Requirements
• Front End : PHP
• Back End : MySQL
• Operating System : Windows OS
• Server : WAMP Server
• System type : 32 or 64-bit OS
• IDE : Dreamweaver 8.0
• DLL : Depends upon the title
Architecture Diagram
[Figure: architecture diagram. A Customer is issued a Review ID. The Amazon Review Data Set and the User Review Data Set each pass through Preprocessing, Semantic Representation, Feature Extraction and Classification. The results go to Matching, then Eliminate Fake Reviews, followed by the Storage Server and Product Recommendation, ending with a Satisfied Customer.]
Dataflow Diagram
[Figure: dataflow diagram, with the following elements at each level.]
Level 1: Amazon Dataset, Analyzing & Extracting, Preprocessing, Data Storage
Level 2: Data Storage, Feature Selection, Feature Extraction, Sarcastic Reviews
Level 3: Feature Extracted Data, SVM Classifier, Classifier Storage
Level 4: Sarcastic Prediction, Positive Reviews, Negative Reviews
Modules
• Fake Reviews Analysis Modeler
• E-Commerce Website
– Review ID Generation System
– Review ID Issue
• Dataset Annotation
– Training Phase
• Amazon Review Dataset
– Testing Phase
• Real Time Dataset Annotation by Creating E-Commerce Website
• Preprocessing
• Feature Extraction
• Classification
• Prediction
• Recommendation
• Performance Analysis
Module Description
Fake Reviews Analysis Modeler
• Online reviews play an integral part in the success or failure of businesses. Before purchasing goods or services, customers first review the online comments submitted by previous customers.
• However, it is possible to artificially boost or hinder a business by posting counterfeit and sarcastic reviews.
• This module explores a natural language processing approach to identifying sarcastic reviews.
• We present a detailed analysis of linguistic features for distinguishing sarcastic from trustworthy online reviews.
• Our results indicate that sarcastic reviews tend to include more redundant terms and pauses, and generally contain longer sentences.
• Applying the SVM classification algorithm, we were able to discriminate sarcastic from real reviews with high accuracy using these linguistic features.
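The linguistic features mentioned above (redundant terms, pauses and sentence length) could be computed along the following lines. The slides do not define these features precisely, so this sketch makes its own assumptions: redundancy is measured as the share of repeated words, and pauses are approximated by counting commas and ellipses. The function name and feature names are hypothetical.

```python
import re
from collections import Counter


def linguistic_features(review):
    """Return simple per-review features for a sarcasm classifier."""
    # Sentences are split on runs of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", review) if s.strip()]
    words = re.findall(r"[a-z']+", review.lower())
    counts = Counter(words)
    # Every occurrence of a word beyond its first counts as redundant.
    repeated = sum(c - 1 for c in counts.values() if c > 1)
    return {
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "redundancy_ratio": repeated / len(words) if words else 0.0,
        # Assumption: commas and ellipses stand in for "pauses".
        "pause_count": len(re.findall(r"\.{3}|…|,", review)),
    }
```

The resulting dictionary values can be placed in a fixed order to form the numeric feature vector passed to the classifier.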
E-Commerce Website
• E-commerce refers to the buying and selling of goods or services using the internet, and the transfer of money and data to execute these transactions.
• An e-commerce website, by definition, is a website that allows you to buy and sell tangible goods, digital products or services online.
• Trade, be it barter exchange or the buying and selling of goods and services, has been prevalent for centuries.
• No one can be self-sufficient, and this creates the need for demand and supply of goods and services.
• Transactions have been going on all over the world for centuries, locally and across locations.
• Keeping the same concept in mind, now think electronic.
• However, also bear in mind that, with the whole world going online, data privacy laws have become increasingly stringent.
Review ID Generation System
• To post a review for a purchased product, each customer is provided with a unique ID at the time of purchase.
• With this ID the customer can post reviews of the purchased products.
Review ID Issue
• By providing such IDs, the administrator can prevent unwanted negative reviews and statements about their products from unauthorized customers.
• As a result, future customers can get accurate reviews of the products on e-commerce sites such as Amazon, Flipkart and Snapdeal.
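The review ID mechanism can be sketched as a small registry that issues an ID per purchase and accepts a review only when the ID matches. The slides implement the system in PHP and MySQL and send the ID by email; this in-memory Python version leaves out email delivery, and the class name, method names and the single-use rule are assumptions made for illustration.

```python
import secrets


class ReviewIdRegistry:
    """Hypothetical in-memory store of issued review IDs."""

    def __init__(self):
        self._issued = {}   # review_id -> (customer_email, product_id)
        self._used = set()  # review IDs that have already been redeemed

    def issue(self, customer_email, product_id):
        """Create an unguessable review ID for one purchase."""
        review_id = secrets.token_hex(8)
        self._issued[review_id] = (customer_email, product_id)
        return review_id

    def accept_review(self, review_id, product_id):
        """Accept a review only for a known, unused ID on the right product."""
        record = self._issued.get(review_id)
        if record is None or record[1] != product_id:
            return False
        if review_id in self._used:
            return False  # assumption: each ID allows a single review
        self._used.add(review_id)
        return True
```

A review submitted without an issued ID, with an ID for a different product, or with an ID that was already used is rejected, which is how unauthorized reviews would be filtered out before classification.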
Data Set Annotation
Training Phase
• We used the Amazon review dataset from Kaggle. One dataset consisted of two labels, positive and negative, while the other was composed of three labels: positive, neutral and negative.
Testing Phase
• In this module, users log in to the e-commerce site, purchase products and provide reviews about the purchased products.
Real Time Dataset Annotation by Creating E-Commerce Website
Annotation is performed on the data and reviews obtained in the testing phase.
As a result, sarcastic reviews are identified among the various reviews provided by the customers involved in the system.
Preprocessing
The preprocessing step is essential in sarcastic review detection. The entire Amazon review dataset and the reviews provided by customers are loaded and preprocessed.
As a result, both the training and the testing data are preprocessed, and from them a sequence of sarcastic reviews written by customers who purchased products is extracted.
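A preprocessing step of this kind might be sketched as follows. The slides do not list the individual operations, so lower-casing, HTML tag removal, punctuation stripping and stopword removal are assumptions, and the stopword list is a short hypothetical sample rather than a complete one.

```python
import re

# Hypothetical, deliberately short stopword list for illustration.
STOPWORDS = {"a", "an", "the", "is", "it", "this", "of", "and", "to", "in"}


def preprocess(review):
    """Return the cleaned list of tokens for one review."""
    text = review.lower()
    text = re.sub(r"<[^>]+>", " ", text)    # drop HTML tags
    text = re.sub(r"[^a-z\s']", " ", text)  # drop digits and punctuation
    tokens = text.split()
    return [t for t in tokens if t not in STOPWORDS]
```

The cleaned tokens would then be handed to the feature extraction stage. Note that stripping punctuation discards cues such as exclamation marks, which may matter for sarcasm, so which steps to keep is a design choice.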
