Research Paper
Article info
Keywords: Cosine similarity; Movie recommendation; Sentiment analysis; CNN; LSTM

Abstract
Despite the availability of large volumes of movie reviews on Rotten Tomatoes, there is still a need to explore the patterns and insights that can be gleaned from the data. Specifically, there is a lack of research that examines the relationship between Rotten Tomatoes' review scores and other factors such as movie genre, release year, and audience demographics. This study aims to fill this gap by conducting an exploratory data analysis of Rotten Tomatoes movie reviews to uncover trends and patterns that shed light on the factors that influence review scores and help movie makers make better decisions. Our findings include an average sentiment score, the movies and genres that tend to have better ratings and positive sentiment, and whether the frequency of movie reviews has increased or decreased over the years. On the basis of these findings, we conclude that positive reviews have decreased over the last few years.
1. Introduction:
Considerable work has been done in recent years on the data analysis of movie reviews. One example is "The Rotten Tomatoes effect: Word-of-mouth and pre-release movie buzz" by Brett Danaher and Michael D. Smith (2014). This study analysed the relationship between Rotten Tomatoes' movie scores and pre-release buzz on social media platforms. It found that Rotten Tomatoes scores are significantly correlated with pre-release buzz, and that the impact of word-of-mouth on movie success is stronger for movies with high Rotten Tomatoes scores. However, the study did not examine the impact of other factors on review scores.

Another study examined the credibility of online review platforms: "A study of hotel booking websites in the United States" by Xueming Luo, Jie Zhang, and Yong Liu (2014). This study analysed the credibility of online review platforms, including Rotten Tomatoes, and found that users are more likely to trust reviews from platforms that have higher ratings and more reviews. It also found that users are more likely to trust reviews from other users who have similar demographic characteristics. However, this study did not specifically focus on movies or examine the relationship between review scores and other factors.

Moreover, we found no research that examines the impact of other factors on review scores, such as movie genre, release year, and audience demographics; analyses the relationship between review scores and movie success metrics, such as box office revenue and critical acclaim; investigates the role of reviewer characteristics, such as expertise and credibility, on review scores; or compares review scores and trends between Rotten Tomatoes and other movie review platforms, such as IMDb and Metacritic. This paper aims to fill these gaps.
2. Related works
3. Methodology
Dataset:
The dataset used is the "IMDB Dataset"; it contains a collection of movie reviews that were scraped from the popular movie review website IMDb. The dataset consists of a single CSV file with 50,000 rows and two columns. The first column, "review", contains the text of the movie review, while the second column, "sentiment", contains a binary label indicating whether the review is positive or negative. Reviews with a sentiment label of 1 are positive, while reviews with a sentiment label of 0 are negative. The dataset can be used for a variety of natural language processing (NLP) tasks such as sentiment analysis, text classification, and language modeling. It is well-suited for training and testing machine learning models that identify the sentiment of movie reviews based on their text. It is important to note that the dataset may contain some biased or inaccurate reviews, as they were scraped from a public website and were not necessarily written by professional movie critics. It is therefore recommended to clean and preprocess the data before using it for any analysis or modeling task.

[Figure: Word cloud of negative words]
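As a minimal sketch of the cleaning and preprocessing recommended above, the following code reads a two-column "review"/"sentiment" CSV, checks that every label is 0 or 1, and reduces each review to a list of tokens. Several details are assumptions made for illustration and are not taken from the study: the function names `clean_review` and `load_reviews`, the regex tokenizer (a stand-in for a library tokenizer), the small stopword set, the presence of HTML remnants in the scraped text, and the two sample rows, which are invented.

```python
import csv
import io
import re

# Illustrative stopword subset (an assumption; a real run would use a full list).
STOPWORDS = {"a", "an", "the", "is", "was", "it", "this", "and", "of", "to", "in"}


def clean_review(text):
    """Strip HTML remnants, lowercase, tokenize, and drop stopwords."""
    text = re.sub(r"<[^>]+>", " ", text)           # remove tags such as <br />
    tokens = re.findall(r"[a-z']+", text.lower())  # simple regex tokenizer
    return [tok for tok in tokens if tok not in STOPWORDS]


def load_reviews(csv_file):
    """Yield (tokens, label) pairs from a two-column review/sentiment CSV."""
    for row in csv.DictReader(csv_file):
        label = int(row["sentiment"])
        if label not in (0, 1):
            raise ValueError(f"unexpected sentiment label: {label}")
        yield clean_review(row["review"]), label


# Two made-up rows standing in for the real 50,000-row file.
SAMPLE_CSV = io.StringIO(
    "review,sentiment\n"
    '"This movie was a delight.<br />Great cast!",1\n'
    '"The plot is dull and the pacing is slow.",0\n'
)

dataset = list(load_reviews(SAMPLE_CSV))
for tokens, label in dataset:
    print(label, tokens)
```

To run the same sketch on the real file, an open file handle would replace `SAMPLE_CSV`; the file name and encoding depend on how the dataset was obtained.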
Tokenization is a critical stage in natural language processing. It is the process of breaking down a sentence, paragraph, or even an entire text document into smaller parts, such as individual words or phrases. Each of these smaller components is referred to as a token. In this project we use word tokenize in the clean reviews function.

Stopwords Removal :
The stopwords removal step removes unnecessary words that do not add any important information to the reviews.

Lemmatization :

• Output layer with 1 unit and a sigmoid activation function, which provides the labels.
With an LSTM we do not need preprocessing tasks such as stopwords elimination, because the network has its own mechanism for discarding unnecessary information. An LSTM can also memorise the data sequence. These features make LSTM a powerful tool for text classification.

Results :
In this part, I will test the models with custom reviews and compare their accuracy to see which model has the highest accuracy.
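To illustrate how a single sigmoid output unit can provide a label for a custom review, the sketch below maps a raw output activation to a probability and then to a 0/1 label, using the dataset's convention that 1 is positive and 0 is negative. The 0.5 decision threshold, the helper names `sigmoid` and `to_label`, and the sample activations are assumptions for illustration; they are not values produced by the trained models.

```python
import math


def sigmoid(z):
    """Map a raw output-unit activation to a probability in [0, 1]."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # branch avoids overflow for large negative z
    return e / (1.0 + e)


def to_label(z, threshold=0.5):
    """Return 1 (positive) if the probability reaches the threshold, else 0."""
    return 1 if sigmoid(z) >= threshold else 0


# Hypothetical raw activations for three custom reviews.
for z in (2.0, 0.0, -1.5):
    print(f"z={z:+.1f}  p={sigmoid(z):.3f}  label={to_label(z)}")
```

Comparing these predicted labels with the true labels of a held-out set is one straightforward way to compute the accuracy of each model.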
N. Pavitha, V. Pungliya, A. Raut et al. Global Transitions Proceedings 3 (2022) 279–284
This paper is divided into two major parts: one focuses on the movie recommendation system and the other on sentiment analysis. The study discusses both systems in detail and reaches some important conclusions. For the movie recommendation system, the Cosine Similarity algorithm has been used to recommend the movies most closely related to the movie entered by the user, based on factors such as the genre of the movie, its overview, the cast, and the ratings given to it. Cosine Similarity gave fair results across several tests and was quite accurate at recommending movies.

Sentiment analysis also plays an important role in this study. It aims to classify the reviews as positive or negative. Two algorithms have been used for this purpose: NB and SVC. The main reason for using two algorithms is to find out which one classifies the reviews better; because the reviews are highly diverse, it is very important to choose the right algorithm for classification. Finally, the experimental results show that the SVM algorithm has better accuracy than NB by a very small margin. Some prospects of this study are listed below:

1 Increasing the accuracy of sentiment analysis for better classification of sarcastic or ironic reviews.
2 Sentiment analysis of reviews in languages other than English.
3 Movie recommendation according to users' preferences (cast, genre, year of release, etc.).

Although the system is very accurate, it does have some limitations. One is that if the movie entered by the user is not present in the dataset, or if the user does not enter the name of the movie in the same form as it appears in the dataset, the system fails to recommend movies. Another limitation is the language barrier in sentiment analysis: as of now, only reviews written in English can be analyzed. The sentiment analysis also gives wrong classifications if the reviews are sarcastic or ironic.
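The cosine-similarity recommendation step summarised above can be sketched as follows. This is an illustration under stated assumptions rather than the system's actual implementation: the text does not say how movie features are vectorised, so plain bag-of-words counts over a combined metadata string are assumed here, and the catalogue `MOVIES`, its titles, and the `recommend` helper are hypothetical. The exact-title lookup reproduces the limitation noted above, where an unknown or differently written title yields no recommendations.

```python
import math
from collections import Counter

# Hypothetical catalogue: title -> combined metadata (genre, cast, overview keywords).
MOVIES = {
    "Space Quest": "sci-fi adventure space crew alien",
    "Star Voyage": "sci-fi adventure space captain alien",
    "Love in Paris": "romance drama paris couple",
    "Paris Nights": "romance comedy paris friends",
}


def cosine_similarity(a, b):
    """Cosine of the angle between two bag-of-words count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(title, top_n=1):
    """Return the top_n titles most similar to `title`, or [] if it is unknown."""
    if title not in MOVIES:
        return []  # mirrors the limitation: unknown or misspelled titles fail
    vectors = {t: Counter(text.split()) for t, text in MOVIES.items()}
    query = vectors[title]
    scores = [
        (cosine_similarity(query, vec), other)
        for other, vec in vectors.items()
        if other != title
    ]
    # Highest similarity first; ties are broken alphabetically by title.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scores[:top_n]]


print(recommend("Space Quest"))
print(recommend("space quest"))  # different casing, so no match is found
```

Normalising titles before lookup (for example, by lowercasing both the query and the catalogue keys) would be one way to soften the exact-match limitation, at the cost of merging titles that differ only in case.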