3 authors:
Anuja Arora
Jaypee Institute of Information Technology
All content following this page was uploaded by Shaurya Uppal on 13 October 2021.
Abstract: E-commerce applications give customers an added advantage: products come with suggestions in the form of reviews. Reviews are useful and influential for customers who are about to buy a product. However, the enormous number of reviews also creates a problem, because customers cannot separate the useful ones from the rest. There is therefore a need for an approach that shows customers only the relevant reviews. This paper addresses that problem, which is a less explored area, and proposes a pairwise review relevance ranking method. The approach sorts reviews by their relevance to the product and avoids showing irrelevant reviews. The work is done in three phases: feature extraction, pairwise review ranking, and classification. The outcome is a sorted list of reviews, a review ranking accuracy, and a classification accuracy. Four classifiers (SVM, random forest, neural network, and logistic regression) are applied to validate ranking accuracy. Of the four, random forest gives the best result: the proposed system achieves 99.76% classification accuracy and 99.56% ranking accuracy on the complete dataset.

I. INTRODUCTION
Review ranking has become an extremely challenging problem with the growth of e-commerce sites. Every product on an e-commerce site has reviews, and a customer who intends to buy that product will almost certainly want to read them. But the sheer number of reviews creates a problem and leaves the customer in a conflicted state.
To resolve this issue, e-commerce sites have started filtering reviews according to customers' wishes, so that customers are satisfied while reading reviews. Some popular ways of filtering reviews are as follows:
- Amazon filters reviews by ‘Top Reviews’, based on a review helpfulness score, and by ‘Most Recent’, based on review post time.
- Flipkart has four ways to showcase reviews: Most Helpful, Most Recent, Positive First, and Negative First.
- Many other e-commerce sites offer similar options.
However, all of the ways discussed above are based either on statistical values provided by users or on a sentiment score of the reviews. Market demand has driven many improvements in this direction, and relevance-based review ranking is one of the more recent approaches. It is used by Google Maps. The review filters commonly offered by Google are Newest, Highest rating, and Lowest rating, but relevance-based review ranking is the cutting-edge option. A snapshot of Google Maps reviews ranked by relevance is shown in Figure 1, which lists reviews according to the relevance mapping of their content. Ranking fundamentally means ordering a set of instances by their relative relevance. It is useful for various applications such as recommendation, text mining, document retrieval, and text summarization.

Fig 1. Google Maps reviews ranking snapshot based on relevance

There are three categories of ranking algorithms in the literature: pointwise ranking, pairwise ranking [3], and listwise ranking [2]. Table I describes the ranking algorithms in each category and their output spaces.
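As a rough illustration of how the three categories differ in what the model outputs, the sketch below ranks the same items three ways. The function names (`pointwise_rank`, `pairwise_rank`, `listwise_rank`) and the toy scoring rules are hypothetical and are not taken from the paper or from Table I. They only show that a pointwise model scores one item at a time, a pairwise model decides a preference between two items, and a listwise model sees the whole list at once.

```python
from typing import Callable, List, Sequence


def pointwise_rank(items: Sequence, score: Callable[[object], float]) -> List:
    # Pointwise: each item gets its own score, independent of the others.
    return sorted(items, key=score, reverse=True)


def pairwise_rank(items: Sequence, prefers: Callable[[object, object], bool]) -> List:
    # Pairwise: the model only answers "is a better than b?".
    # Here the answers are aggregated by counting wins (an assumed aggregation).
    wins = [
        sum(prefers(a, b) for j, b in enumerate(items) if j != i)
        for i, a in enumerate(items)
    ]
    order = sorted(range(len(items)), key=lambda i: wins[i], reverse=True)
    return [items[i] for i in order]


def listwise_rank(items: Sequence, permute: Callable[[Sequence], List[int]]) -> List:
    # Listwise: the model takes the whole list and returns a permutation of indices.
    return [items[i] for i in permute(items)]


# Toy usage with numbers standing in for reviews.
items = [3, 1, 2]
by_score = pointwise_rank(items, score=lambda x: float(x))
by_pairs = pairwise_rank(items, prefers=lambda a, b: a > b)
by_list = listwise_rank(
    items, permute=lambda xs: sorted(range(len(xs)), key=lambda i: -xs[i])
)
```

With these toy rules all three calls agree on the ordering; with learned models they generally would not, which is why the choice of category matters.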
Rc1 = …(2)
Here corpus is the total number of reviews for each product.

5) Review complexity each (Rc2)
Single review complexity is the percentage of unique words in a review relative to the total number of words in that review. Equation 3 is used to compute the complexity of each review.

Rc2 = …(3)

6) Word Length (Rw): The number of words in a review.

7) Service Tagger (Rd):
Reviews are basically meant to describe the product. So, a dictionary of words is created that marks reviews as service based, delivery reviews, and customer support.
Every word in a review is fuzzy matched against the words in the dictionary with a Levenshtein distance of at most 1. Levenshtein distance measures the difference between two sequences and handles spelling errors in reviews. For example, instead of “My delivery was on time”, a review may be wrongly written as “My dilivery was on time”. In this case, fuzzy matching helps to match both reviews.

8) Compound Score (Rsc)
To improve the efficiency of the system, another metric, the compound score (Rsc), is used to determine the sentiment of reviews. It is computed using the VaderSentimentAnalyser library (footnote 3). The system gives Rsc values in the following ranges:

Rsc value >= 0.5 Positive review
Rsc value <= -0.5 Negative review
-0.5 < Rsc value < 0.5 Neutral

This library is taken from VADER (Valence Aware Dictionary and sEntiment Reasoner). It is a lexicon and rule-based sentiment analysis tool that is specifically tuned to determine sentiments expressed in social media content. It can detect the sentiment of slang (e.g. SUX!), emoji, and emoticons ( :), :D ), as well as the difference between capitalized and uncapitalized expressions (I am HAPPY and I am happy are different expressions).

3 https://pypi.org/project/vaderSentiment/2.1/

Fig 5. Compound Score stats

IV. PAIRWISE RANKING
Ranking is a canonical problem, and learning to rank as semi-supervised machine learning is a challenging issue in the current interactive web era. It is undoubtedly easy for humans to analyze and distinguish a set of reviews: a person can readily tell whether a review is good and informative or bad and uninformative. Ranking these reviews, on the other hand, is a troublesome task even for humans. Our proposed method therefore provides a computational way to rank reviews. A pairwise ranking approach is applied to rank reviews in a semi-supervised learning setting. The pairwise ranking approach looks at a pair of documents at a time in the loss function and predicts a relative ordering. The objective is not to determine a relevance score but to find which document is more relevant than the other. This relevance is used to judge the preference of one review over another.
In the semi-supervised learning method, a mapping is constructed between input and output. These input-output pairs in the training data are used to train the system.

Reviews Segregation: The reviews are segregated into two sets.
Set 1 contains reviews with label 1, i.e. reviews that are informative and are better than all reviews of Set 0.
Set 0 contains reviews with label 0, i.e. reviews that are not informative. Reviews that are more subjective and discuss delivery, service, and customer support but do not talk about the product also fall in this set.
We compare each review of Set 1 pairwise with all reviews of Set 0 and vice versa:
(Ri, Rj, 1) where i∈Set1 and j∈Set0 → Ri is better than Rj
(Rj, Ri, 0) where i∈Set1 and j∈Set0 → Rj is worse than Ri
This now becomes a classification problem.

Review Score Computation: For a given product, we compare each review (Ri) with every other review (Rj) of that product and get a win/lose score, where win means (Ri) is better than (Rj) and lose means (Ri) is worse than (Rj). A review score is then computed using equation 4.

Review Score = …..(4)

Reviews are then sorted by this review score, which places the most relevant review at the top and the most irrelevant review at the bottom.
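The pair construction and win/lose scoring described in Section IV can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the body of equation 4 is not available here, so `review_scores` uses a hypothetical score (the fraction of comparisons a review wins), and `prefers` is a stand-in for the trained classifier that decides whether one review is better than another. The function names and the sample reviews are illustrative only.

```python
from itertools import product


def build_pairs(set1, set0):
    """Build labelled training pairs from informative (Set 1) and
    uninformative (Set 0) reviews, in both orders."""
    pairs = []
    for ri, rj in product(set1, set0):
        pairs.append((ri, rj, 1))  # (Ri, Rj, 1): Ri is better than Rj
        pairs.append((rj, ri, 0))  # (Rj, Ri, 0): Rj is worse than Ri
    return pairs


def review_scores(reviews, prefers):
    """Hypothetical review score: share of pairwise comparisons won.
    This stands in for equation 4, whose exact form is an assumption."""
    n = len(reviews)
    scores = []
    for i in range(n):
        wins = sum(
            1 for j in range(n) if j != i and prefers(reviews[i], reviews[j])
        )
        scores.append(wins / (n - 1) if n > 1 else 0.0)
    return scores


def rank_reviews(reviews, prefers):
    """Sort reviews by score, most relevant first; returns (review, score)."""
    scores = review_scores(reviews, prefers)
    order = sorted(range(len(reviews)), key=lambda i: scores[i], reverse=True)
    return [(reviews[i], scores[i]) for i in order]


# Toy usage. The preference rule (longer review wins) is only a placeholder
# for a classifier trained on the pairs returned by build_pairs.
informative = [
    "Tablet reduced my fatigue within a week",
    "Strip readings match my lab results",
]
uninformative = ["Delivery was late", "ok"]


def toy_prefers(a, b):
    return len(a.split()) > len(b.split())


training_pairs = build_pairs(informative, uninformative)
ranking = rank_reviews(informative + uninformative, toy_prefers)
```

Emitting both orderings of each pair gives the classifier a balanced set of label 1 and label 0 examples, which matches the two tuple forms listed in the text.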
V. CLASSIFICATION MODELS
Different classification models use different techniques to fit data based on its properties, so their predicted outcomes may differ from each other and from the actual ground truth. We have therefore applied four classification models to find the best-performing model for this relevance-based review ranking problem: Support Vector Classifier, Neural Network, Logistic Regression, and Random Forest Classifier.
Logistic Regression is basically a binary classifier that transforms its output using the logistic sigmoid function. The logistic sigmoid function returns a probability value, which is then mapped to a binary class.
Further, other nonlinear classifiers are applied and tested in order to validate the pairwise ranking outcome. A MultiLayer Perceptron Neural Network classifier is used. It uses multiple layers (input layer, hidden layer, and output layer) and a non-linear activation function, and it is trained with the back propagation method. Support Vector Classifier (SVM) is also tested; it performs classification by finding the hyperplane that separates the two classes. Finally, Random Forest Classifier, an ensemble learning algorithm, is used. It can be used for both classification and regression. A random forest creates decision trees on randomly selected data points, gets a prediction from each tree, and selects the best solution by means of voting.

VI. EXPERIMENTAL SETUP AND RESULTS
The experimental setup, dataset, and outcome of the proposed approach are presented in this section in the context of performance evaluation.

A. Dataset
Medicine reviews are used for the experiments. The approach has been tested on various categories of datasets, such as the Amazon product dataset from an e-commerce site, medicine reviews, etc., and it works effectively with all of them. We had some reviews labeled by different people to learn which types of reviews people like. To showcase the efficiency of the approach, the reviews of a few medicinal categories (footnote 4) have been taken. The dataset statistics are shown in Figure 6. The ranking of reviews is then a semi-supervised approach. Here, we have taken a training dataset that contains 503 reviews from 5 categories under medicine: Vitamin B Tablet, Vitamin D Tablet, Accu-check, Omega 3 Fatty acid, and a medicinal shampoo. Each category contains between 83 and 107 reviews.

4 http://bit.ly/2UMHaSf

Fig 6. Dataset Statistics

Using this model, we can rank the reviews of any product, bringing the most relevant reviews to the top and pushing down the irrelevant ones.

B. Performance measures
The approach needs to be validated at two points: first the ranking accuracy, and second the classification accuracy. The system demonstrates its effectiveness when both accuracy measures give good results. The two measures are as follows:

1) Ranking Accuracy Measure
A sorted list of reviews based on review score (computed using equation 4) is the outcome of the pairwise ranking algorithm. To test it, a ranking metric is designed as follows:

Let the number of reviews labeled as 1 in our dataset be Nlabel=1.

Accuracy = …….(5)

2) Classification Accuracy Measure: Classification accuracy is computed from the true positive, true negative, false positive, and false negative outcomes. Equation 6 is used to validate it:

ClassificationAccuracyreview = …….(6)

where TP is True Positive, TN is True Negative, FP is False Positive, and FN is False Negative.

C. Results
Sample results of pairwise review ranking and classification are shown in Figure 7, which depicts the results after experimenting with different models. Figure 7 shows sample training data for Neurobion (Vitamin B) and Evion (Vitamin E). The sizes of the training and test sets for both cases are also displayed in the table. The system is validated on four models: Random Forest, Neural Network, Logistic Regression, and Support Vector Machine. We found the Random Forest Classifier to be the best, giving a classification accuracy of 99.76% and a ranking accuracy of 99.56% for the complete dataset. The sorted review list is also given along with the review ranking score.
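The two performance measures from Section VI-B can be sketched as follows. The bodies of equations 5 and 6 are not available here, so both functions rest on assumptions: `classification_accuracy` uses the standard definition of accuracy from TP, TN, FP, and FN, and `top_n_ranking_accuracy` assumes that ranking accuracy is the share of label-1 reviews found in the top Nlabel=1 positions of the sorted list. All function names and sample labels are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = informative, 0 = not)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def classification_accuracy(tp, tn, fp, fn):
    """Standard accuracy: correct predictions over all predictions.
    Assumed to correspond to equation 6."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def top_n_ranking_accuracy(ranked_labels):
    """Assumed reading of equation 5: of the top N positions in the sorted
    list, where N is the number of label-1 reviews, what share is label 1."""
    n_label1 = sum(ranked_labels)
    if n_label1 == 0:
        return 0.0
    return sum(ranked_labels[:n_label1]) / n_label1


# Toy usage: true vs. predicted pair labels, and the true labels of reviews
# listed in the order produced by the ranking step.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 1]
acc = classification_accuracy(*confusion_counts(y_true, y_pred))
rank_acc = top_n_ranking_accuracy([1, 1, 0, 1, 0, 0])
```

Under this reading, a perfect ranking places every label-1 review above every label-0 review and scores 1.0.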
Fig 7. Pairwise Ranking and Classification Results

A sample of the top 5 sorted reviews is shown in Figure 8. The dataset along with all features and the sorted list after ranking of reviews for the Vitamin E dataset is available at the following URL (footnote 5).

5 http://bit.ly/2PyUF29

● Readability Score can be added in the future if personalized, per-user ranking is to be done.
● Extend the approach to the same product on multiple e-commerce applications.

REFERENCES
[1] Seki, Y. (2002). Sentence Extraction by tf/idf and position weighting from Newspaper Articles.
[2] Ravikumar, P., Tewari, A., & Yang, E. (2011, June). On NDCG consistency of listwise ranking methods. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (pp. 618-626).
[3] Yu, R., Zhang, Y., Ye, Y., Wu, L., Wang, C., Liu, Q., & Chen, E. (2018, October). Multiple Pairwise Ranking with Implicit Feedback. In Proceedings of the 27th ACM International Conference on Information and Knowledge Management (pp. 1727-1730). ACM.
[4] Yu, L., Zhang, C., Pei, S., Sun, G., & Zhang, X. (2018, April). Walkranker: A unified pairwise ranking model with multiple relations for item recommendation. In Thirty-Second AAAI Conference on Artificial Intelligence.
[5] Bai, T., Zhao, W. X., He, Y., Nie, J. Y., & Wen, J. R. (2018). Characterizing and predicting early reviewers for effective product marketing on e-commerce websites. IEEE Transactions on Knowledge and Data Engineering, 30(12), 2271-2284.
[6] Yan, Y., Liu, Z., Zhao, M., Guo, W., Yan, W. P., & Bao, Y. (2018, September). A Practical Deep Online Ranking System in E-commerce Recommendation. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 186-201). Springer, Cham.
[7] Li, H. (2011). A short introduction to learning to rank. IEICE Transactions on Information and Systems, 94(10), 1854-1862.