
Detecting the Polarity Of Sarcastic Reviews To Identify High Profitable Product Using Machine Learning
DEVAGANTHAN B (192071014)

ANAND R (192071005)

BHARATH R (192071011)

JOHN PAUL D (192071033)


Domain
• Natural Language Processing
• The branch of artificial intelligence (AI) concerned with giving computers the ability to understand text and spoken words in much the same way human beings can.
• Sentiment Analysis
• Sentiment analysis, also referred to as opinion mining, is an approach to natural language processing (NLP) that identifies the emotional tone behind a body of text.
• Sarcastic Review
• A review in which the writer expresses a negative feeling using positive words.
Abstract

• Most people express their ideas, views and opinions about products on online platforms such as Amazon and Flipkart.
• This feedback carries emotion, and it can be either straightforward or sarcastic.
• Sentiment analysis is used to analyze the perspective of a text.
• Sarcasm, a bitter way of conveying information, can also be present in the text. Selecting the dataset is the initial task.
• The dataset is retrieved from Amazon datasets. Feature extraction follows, covering term frequency, inverse document frequency and n-grams.
• A classification algorithm, the Support Vector Machine (SVM), is implemented.
• Using accuracy as the evaluation parameter, the sarcastic and fake reviews of various products are processed for better efficiency.
• Every customer is given a unique ID, with which they can submit reviews of the products they purchased. Sarcastic information is gathered from those reviews and optimized.
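
The feature extraction step named above (term frequency, inverse document frequency and n-grams) can be illustrated with a minimal sketch. The slides do not give the exact weighting formula, so this assumes the common form TF = count / document length and IDF = log(N / df); the function names and the two sample reviews are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined with a space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text, max_n=2):
    """Lowercase, split on whitespace, and emit all 1..max_n-grams."""
    tokens = text.lower().split()
    terms = []
    for n in range(1, max_n + 1):
        terms.extend(ngrams(tokens, n))
    return terms


def tfidf(documents, max_n=2):
    """Compute one {term: weight} dict per document.

    Assumed weighting: TF = count / number of terms in the document,
    IDF = log(N / df), where df counts the documents containing the term.
    """
    term_lists = [extract_terms(doc, max_n) for doc in documents]
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))
    n_docs = len(documents)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical sample reviews, not taken from the Amazon dataset.
reviews = [
    "great phone great battery",
    "great phone broke in a day",
]
vectors = tfidf(reviews)
```

With this weighting, terms that occur in every review (here "great" and "great phone") get weight zero, while terms unique to one review carry the signal.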
Objectives

• The main objective of this project is to prevent fake reviews by sending the review ID to the customer’s registered email ID.
• The project also aims to detect fake and sarcastic reviews in a set of product reviews by simulating spam reviews and using SVM.
Introduction

• Retail websites like Amazon.com offer reviewers different options for writing their reviews.
• For instance, the consumer can give a numerical rating from 1 to 5 or write comments about the product.
• Because innumerable products are manufactured by many different brands, providing relevant reviews to consumers is the need of the hour.
• The number of reviews associated with a product or brand is increasing at an alarming rate, so handling them is no less a task than handling big data.
• Classifying reviews by customer sentiment into positive, negative, fake and sarcastic gives the sentiment orientation of each review and hence supports better judgment.
• Segregating reviews by sentiment can help future buyers evaluate positive and negative feedback constructively and reach better decisions according to their requirements.
• This evaluation serves as a testimonial for users who want to know the details and specifications of a product, thereby increasing user credibility.
Literature Review
S.No | Title | Author / Journal / Year / Volume / Page No | Remarks
1 | Effective Text Data Preprocessing Technique for Sentiment Analysis in Social Media Data | Saurav Pradha et al. (2019), IEEE | Preprocessing techniques: stemming, lemmatization and spelling correction; stemming performed best in terms of computational speed
2 | Data Preprocessing for Efficient Sentimental Analysis | Shreyas Wankhede et al. (2018), International Conference on Inventive Communication and Computational Technologies, pp. 723-727 | Uses the n-gram method and a Hidden Markov Model for spell-checking and correction of tweets; the Emoji Sentiment Ranking method is used to evaluate the sentiment mapping of emojis by sentiment polarity
3 | Feature Extraction and Classification of Movie Reviews using Advanced Machine Learning Models | Sai Chandra Rachiraju & Madamala Revanth (2020), International Conference on Intelligent Computing and Control Systems, pp. 814-817 | Used IMDB review datasets; Word2vec and SVM give better accuracy (83.4)
Literature review (Cont..)
S.No | Title | Author / Journal / Year / Volume / Page No | Remarks
4 | Deep LSTM-RNN with Word Embedding for Sarcasm Detection on Twitter | Sayed Saniya Salim et al. (2020), International Conference for Emerging Technology | Logistic regression used to detect sarcasm; a dataset of 15000 tweets is used; Long Short Term Memory (LSTM) and word embeddings can make sarcasm detection efficient
5 | Sentiment Analysis using Feature Extraction and Dictionary-Based Approaches | D. Deepa and A. Tamilarasi (2019), IEEE, Third International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) | Detects the polarity of words from Twitter using feature extraction and dictionary-based methods; CountVectorizer gives 81% accuracy
6 | Sentiment Classification of Online Reviews Based on LDA and Semantic Analysis of Sentimental Words | Yongxia Jing et al. (2019), IEEE, International Symposium on Computational Intelligence and Design (ISCID) | Text documents are obtained through the LDA method; the k-Nearest Neighbors (KNN) algorithm is used for sentiment classification of online reviews
Problem Description

• Judging review features such as ratings and brand-name references is hard for humans, let alone machines.
• When only one review is available for a particular item, it is difficult to identify rating behaviors.
• When fake and sarcastic reviews are deliberately written to resemble genuine reviews, it is hard to tell which reviews are genuine.
Existing System

• This section discusses the problem of automatic sarcasm detection.
• Sarcasm detection can be formulated as a classification task. The objective is to find out whether the sentences within a text are sarcastic or not.
• Computational models for detecting sarcasm in product ratings are not yet up to the mark. Existing models consider lexical features and do not take word patterns as features.
• A similar approach performs pair-wise categorization, using an existing set of textual features for extracting sarcastic information to construct a new irony detection model.
• These approaches make use of historical reviews and sarcastic comments to detect sarcasm.
• They do not consider dialogues and cannot pick out words with multiple meanings and determine their sense, which results in poor sarcasm detection.
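
The limitation of purely lexical features can be seen in a small sketch. This is not one of the existing systems surveyed above; it is a hypothetical word-counting baseline with a toy lexicon, shown only to illustrate why a sarcastic review built from positive words is mislabeled.

```python
# Toy lexicon, assumed for illustration only.
POSITIVE_WORDS = {"love", "great", "good", "excellent", "perfect"}
NEGATIVE_WORDS = {"hate", "bad", "poor", "terrible", "broke"}


def lexical_polarity(text):
    """Label a text by counting positive and negative lexicon words."""
    tokens = [t.strip(".,!?\"'") for t in text.lower().split()]
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(
        t in NEGATIVE_WORDS for t in tokens
    )
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


plain = "Terrible phone, it broke in a week."
sarcastic = "Great, I just love how the battery dies in an hour."
print(lexical_polarity(plain))      # the plain complaint is scored as negative
print(lexical_polarity(sarcastic))  # the sarcastic complaint is scored as positive
```

The second review is a complaint, but because it contains only positive lexicon words the baseline calls it positive. Catching it requires features that look at word patterns rather than single words.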
Disadvantages

• Reviews written to change users’ perception of how good a product or service is are considered spam, and are often written in exchange for money.
• Time complexity.
• Anyone, under any identity, can leave comments as a review, which gives spammers a tempting opportunity to write fake reviews designed to mislead users.
• Misclassification rate.
• A neutral score is assigned when the method fails to detect any fake reviews.
• Applied to reviews written in languages other than English.
• Only average performance on simple aspect ratings.
• Time-consuming task.
Proposed System

• In this work, we focus on detecting sarcastic sentences, such as “I love this product”, on online shopping platforms such as Amazon and Flipkart using SVM techniques.
• The central issues in sarcasm detection are the availability of datasets and their analysis.
• The principal sources of information are product reviews.
• These studies are critical to business owners, who can make business decisions based on the results of analyzing users’ opinions about their products.
• Sarcasm detection can be applied to product reviews, and also to stock markets, news articles or political debates.
• In the past few years remarkable work has been done in the field of sentiment analysis (SA), but sarcasm detection in textual sentences is not yet up to the mark.
• In this proposal, every user who purchases a product is given a unique ID so that they can post reviews of the products they purchased.
• All the information is collected and stored in a dataset, then compared and classified using SVM, which provides accurate, optimized results.
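
The SVM classification step can be sketched in plain Python. This is a minimal illustration, not the project's implementation: it assumes a linear SVM trained by subgradient descent on the hinge loss with L2 regularization, uses bag-of-words counts as a stand-in for the extracted features, and the four training reviews and their labels are invented toy data.

```python
from collections import Counter


def featurize(text):
    """Bag-of-words counts; a stand-in for the TF-IDF/n-gram features."""
    return Counter(text.lower().split())


def score(weights, bias, features):
    """Linear decision value w.x + b for one feature dict."""
    return sum(weights.get(f, 0.0) * v for f, v in features.items()) + bias


def train_linear_svm(samples, labels, epochs=50, eta=0.1, lam=0.01):
    """Minimize lam/2 * ||w||^2 + hinge loss by subgradient descent.

    samples: list of feature dicts; labels: +1 or -1 for each sample.
    """
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * score(weights, bias, x)
            # Shrink all weights (gradient of the L2 regularizer).
            for f in weights:
                weights[f] *= 1.0 - eta * lam
            # The hinge loss contributes only when the margin is below 1.
            if margin < 1.0:
                for f, v in x.items():
                    weights[f] = weights.get(f, 0.0) + eta * y * v
                bias += eta * y
    return weights, bias


def predict(weights, bias, features):
    """Return +1 (sarcastic) or -1 (not sarcastic)."""
    return 1 if score(weights, bias, features) >= 0 else -1


# Toy training data, assumed for illustration only.
texts = [
    "great it broke in a day",
    "great phone works well",
    "love how it stopped working",
    "love the battery life",
]
labels = [1, -1, 1, -1]  # +1 = sarcastic, -1 = not sarcastic
samples = [featurize(t) for t in texts]
weights, bias = train_linear_svm(samples, labels)
```

After training, words shared by both classes ("great", "love") end up with weights near zero, while words that co-occur with the complaint ("broke", "stopped") carry the sarcastic label. A real system would train on the labeled Amazon reviews rather than four sentences.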
Advantages
• Best, fast and simple prevention method.
• Simpler and more efficient, with less convergence time.
• Proposes a sentiment classification technique that uses both a supervised deep neural network and an unsupervised probabilistic generative model.
• A convolutional neural network (CNN) is good at capturing local patterns and plays an important role in NLP.
• Performance is evaluated on two short-text review datasets.
• Achieves better performance.
• Using features with higher weights makes it easier to detect fake reviews, with lower time complexity.
• Recommendation system.
System Requirements

Hardware Requirements

• Processor : i3,i5,i7
• RAM : 2GB
• Hard disk : 500 GB
• Compact Disk : 650 MB
• Keyboard : Standard keyboard
• Mouse : Logitech mouse
• Monitor : 15 inch color monitor
Software Requirements

• Front End : PHP
• Back End : MySQL
• Operating System : Windows OS
• Server : WAMP Server
• System type : 32 or 64-bit OS
• IDE : Dreamweaver 8.0
• DLL : Depends upon the title
Architecture Diagram

[Figure: architecture diagram. The customer, holding a review ID, supplies the User Review Data Set. Both the Amazon Review Data Set and the User Review Data Set pass through Preprocessing, Semantic Representation, Feature Extraction and Classification. The two results go to Matching, followed by Eliminate Fake Reviews, the Storage Server and Product Recommendation, ending with a satisfied customer.]
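
The last stages of the architecture (Matching, Eliminate Fake Reviews, Product Recommendation) can be sketched as follows. The slides do not define how products are ranked, so this assumes a simple rule: reviews whose ID was never issued are dropped as fake, sarcastic reviews count as negative, and products are ranked by their share of positive reviews. All names and sample records are hypothetical.

```python
from collections import defaultdict


def recommend(reviews, issued_ids):
    """Rank products by their share of positive reviews.

    reviews: list of dicts with keys "review_id", "product" and "label",
             where label is "positive", "negative" or "sarcastic".
    issued_ids: set of review IDs that the site actually issued.
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for review in reviews:
        if review["review_id"] not in issued_ids:
            continue  # matching failed: treat the review as fake and drop it
        total[review["product"]] += 1
        # Sarcastic reviews express a negative feeling, so only
        # reviews labeled "positive" count in a product's favour.
        if review["label"] == "positive":
            positive[review["product"]] += 1
    scores = {product: positive[product] / total[product] for product in total}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


classified = [
    {"review_id": "R1", "product": "phone", "label": "positive"},
    {"review_id": "R2", "product": "phone", "label": "sarcastic"},
    {"review_id": "R3", "product": "laptop", "label": "positive"},
    {"review_id": "X9", "product": "phone", "label": "positive"},  # never issued
]
ranking = recommend(classified, issued_ids={"R1", "R2", "R3"})
```

Here the unissued review "X9" is ignored, so the phone is scored on one positive and one sarcastic review and ranks below the laptop.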
Dataflow Diagram
Level 1: Amazon Dataset, Preprocessing (Analyzing & Extracting), Data Storage
Level 2: Data Storage, Feature Extraction (Feature Selection), Sarcastic Reviews
Level 3: Data Storage, Classifier (Feature Extracted), SVM Classifier
Level 4: Sarcastic Prediction, Positive Reviews, Negative Reviews
Modules
• Fake Reviews Analysis Modeler
• E-Commerce Website
  – Review ID Generation System
  – Review ID Issue
• Dataset Annotation
  – Training Phase
    • Amazon Review Dataset
  – Testing Phase
• Real time dataset annotation by creating e-commerce website
• Preprocessing
• Feature Extraction
• Classification
• Prediction
• Recommendation
• Performance Analysis
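
The Performance Analysis module is only named in this list. A minimal sketch of what it might compute, assuming the standard accuracy, precision, recall and F1 measures for the sarcastic class; the function name and the sample labels are hypothetical.

```python
def evaluate(true_labels, predicted_labels, positive_class="sarcastic"):
    """Return accuracy, precision, recall and F1 for one class of interest."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(t == positive_class and p == positive_class for t, p in pairs)
    fp = sum(t != positive_class and p == positive_class for t, p in pairs)
    fn = sum(t == positive_class and p != positive_class for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    accuracy = correct / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Invented labels for illustration, not results from the project.
truth = ["sarcastic", "sarcastic", "positive", "negative"]
predicted = ["sarcastic", "positive", "positive", "sarcastic"]
metrics = evaluate(truth, predicted)
```

Accuracy alone can hide poor sarcasm detection when sarcastic reviews are rare, which is why precision and recall for that class are reported alongside it.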
Module Description

Fake Reviews Analysis Modeler

• Online reviews play an integral part in the success or failure of businesses. Before purchasing goods or services, customers first read the online comments submitted by previous customers.
• However, it is possible to artificially boost or hinder a business by posting counterfeit and sarcastic reviews.
• This module explores a natural language processing approach to identifying sarcastic reviews.
• We present a detailed analysis of linguistic features for distinguishing sarcastic reviews from trustworthy online reviews.
• Our results indicate that sarcastic reviews tend to include more redundant terms and pauses, and generally contain longer sentences.
• Applying the SVM classification algorithm shows that sarcastic reviews can be discriminated from real reviews with high accuracy using these linguistic features.
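
The three linguistic cues mentioned above (redundant terms, pauses and sentence length) can be turned into numeric features along these lines. The slides do not define how each cue is measured, so the definitions here are assumptions: redundancy is the share of repeated words, and pauses are ellipses plus a small, hypothetical list of filler words.

```python
import re

PAUSE_WORDS = {"um", "uh", "hmm", "well"}  # assumed filler list


def linguistic_features(text):
    """Compute sentence length, redundancy and pause features for a review."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    n_words = len(words)
    return {
        # Average number of words per sentence.
        "avg_sentence_length": n_words / len(sentences) if sentences else 0.0,
        # Share of words that repeat an earlier word.
        "redundancy": 1.0 - len(set(words)) / n_words if n_words else 0.0,
        # Ellipses plus filler words.
        "pauses": text.count("...") + sum(w in PAUSE_WORDS for w in words),
    }


features = linguistic_features("Well... great. Just great, really great.")
```

The resulting dictionary can be merged with the term-based features before classification.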
E-Commerce Website

• E-commerce refers to the buying and selling of goods or services over the internet, and the transfer of money and data to execute these transactions.
• An e-commerce website, by definition, is a website that allows you to buy and sell tangible goods, digital products or services online.
• Trade, be it barter or the buying and selling of goods and services, has been prevalent for centuries.
• No one can be self-sufficient, and this creates demand for and supply of goods and services.
• Transactions have been going on all over the world for centuries, both locally and across locations.
• E-commerce is the same concept carried out electronically.
• However, with the whole world going online, data privacy laws have become increasingly stringent.
Review ID Generation System

• To post a review of a purchased product, each customer is provided with a unique ID at the time of purchase.
• With this ID the customer can post reviews of the purchased products.

Review ID Issue

• By issuing such IDs, the administrator can prevent unwanted negative reviews and statements about their products from unauthorized customers.
• As a result, future customers get genuine and accurate reviews of the products on e-commerce sites such as Amazon, Flipkart and Snapdeal.
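
Review ID generation and checking can be sketched as below. The project's front end is PHP with MySQL; this Python sketch only illustrates the logic. It assumes each ID is random, tied to one customer and one product, and valid for a single review. The class and method names are hypothetical, and the email delivery mentioned in the objectives is left out.

```python
import secrets


class ReviewIdRegistry:
    """Issues one review ID per purchase and checks it on submission."""

    def __init__(self):
        self._pending = {}  # review_id -> (customer_email, product)

    def issue(self, customer_email, product):
        """Create an unguessable ID for this purchase."""
        review_id = secrets.token_hex(8)  # 16 hexadecimal characters
        self._pending[review_id] = (customer_email, product)
        return review_id  # the described system would email this ID

    def submit(self, review_id, customer_email, product, text):
        """Accept the review only if the ID matches this customer and product."""
        if self._pending.get(review_id) != (customer_email, product):
            return False
        del self._pending[review_id]  # assumption: each ID is single-use
        return True


registry = ReviewIdRegistry()
rid = registry.issue("buyer@example.com", "phone")
accepted = registry.submit(rid, "buyer@example.com", "phone", "Works fine.")
rejected = registry.submit("not-issued", "other@example.com", "phone", "Bad!")
```

Using `secrets` rather than a counter keeps IDs from being guessed, which matters if the ID is the only proof of purchase.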
Data Set Annotation

Training Phase

• We used the Amazon review dataset from Kaggle. One dataset consisted of two labels, positive and negative, while another was composed of three labels: positive, neutral and negative.

Testing Phase

• In this module, users log in to the e-commerce site, purchase products and provide reviews of the products they purchased.
Real Time Dataset Annotation by Creating E-Commerce Website

Annotation is carried out on the data and reviews obtained from the testing phase.

As a result, sarcastic reviews are accurately identified among the reviews provided by the customers involved in the system.

Preprocessing

The preprocessing step is essential in sarcastic review detection. The entire Amazon review dataset and the reviews provided by customers are taken as input and preprocessed efficiently.

Both the training data and the testing data are preprocessed, which yields the sequence of sarcastic reviews written by customers who purchased products.
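
The slides do not list the individual preprocessing operations. A minimal sketch, assuming lowercasing, removal of HTML tags, punctuation and digits, and removal of stop words; the stop-word list is a small hypothetical sample.

```python
import re

# Small sample stop-word list, assumed for illustration only.
STOPWORDS = {"a", "an", "the", "is", "it", "in", "of", "to", "and", "this"}


def preprocess(text):
    """Clean a raw review and return its remaining tokens."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)   # drop HTML tags
    text = re.sub(r"[^a-z\s]", " ", text)  # drop punctuation and digits
    return [token for token in text.split() if token not in STOPWORDS]


tokens = preprocess("This phone is GREAT!!! <br> Broke in 2 days.")
```

Punctuation such as "!!!" or "..." can itself signal sarcasm, so a pipeline that uses such cues would compute them before this cleaning step.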
