Smriti Swar210259 90960
Semester-1 Level 5
College ID:np01cp4a210259
I confirm that I understand my coursework needs to be submitted online via Google Classroom under the
relevant module page before the deadline in order for my assignment to be accepted and marked. I am
fully aware that late submissions will be treated as non-submission and a mark of zero will be awarded.
Table of Contents
1. Introduction
2. Background
2.2.3. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler: Twitter
2.4.2. Disadvantages
3. Solution
4. Conclusion
4.1. How the solution addresses the real-world problems
5. References
List of Tables
Table 1: Analysis of the work done
Table 2: Further work
List of Figures
1. Introduction
Machine learning (ML) is a branch of artificial intelligence that studies how machines can improve
their perception, knowledge, reasoning, or actions based on experience or data. To do this, it draws
on concepts from economics, statistics, psychology, neurology, and control theory. Machine learning
algorithms are trained to identify patterns and relationships in data. They reduce dimensionality,
categorize information, cluster data points, and make predictions. They can even help create new
content by using past data as input. Machine learning has broad applications in numerous industries.
For example, news agencies, social media platforms, and e-commerce sites use recommendation engines
to suggest content based on a user's past activity. (Linda Tucci, 2023)
The primary goal of a sentiment analysis system is to categorize the sentiment of a text as positive,
negative, or neutral. It uses natural language processing tools and machine learning algorithms for
this purpose. It analyzes the words, phrases, context and emojis to decide the expressed sentiment.
The system is useful in many domains. Businesses use sentiment analysis to gauge customer
satisfaction and monitor brand image. Researchers use it to understand public opinion on social or
political issues. (Nirmal Varghese Babu, 2021)
Overall, sentiment analysis systems play an important role in identifying and measuring sentiments in
textual data, which helps businesses, researchers and analysts to deepen their insights and make
informed decisions based on public opinion. (Liu, 2023)
Smriti Swar
CS6004NI Application Development
Without a proper sentiment analysis system, understanding customer feedback becomes time-consuming
and confusing. Reviewing feedback manually is impractical at scale, so responses to customer concerns
are likely to be delayed and the risk to brand reputation grows. It also limits the ability to adapt
to evolving consumer preferences and trends, leading to uninformed and irrational decisions.
Businesses might struggle to detect potential crises in public opinion early. Political decisions,
policy-making and the understanding of social issues may lack crucial insights. In the same way, it
becomes a challenge to provide timely and personalized customer service, which affects the overall
quality of service. (Mayur Wankhade, 2022)
2. Background
Author
Findings
The study focused on understanding the sentiments expressed in online product reviews from
Amazon.com. The authors categorized these sentiments as positive, negative, or neutral. They
developed a procedure to do this and tested it on both individual sentences and complete reviews.
The experiments showed that their approach could accurately determine whether a whole review or a
single sentence was positive, negative, or neutral. The report also suggested future work, including
improving accuracy and analyzing further information. (Zhan, 2015)
Conclusion
Author:
Findings
The study collected tweets directly from Twitter using the Twitter API and a Python tool known as a
Twitter scraper. The dataset was then divided into two sections: one for training the models and the
other for testing them. The authors used lexicon-based and machine learning classifiers to identify
whether the sentiment of each tweet was positive, negative, or neutral. Their methodology used both
supervised and unsupervised machine learning models to predict sentiment labels. Performance was
evaluated using measures such as F1 score, precision, and recall. By combining several techniques,
they sought to determine which model fits the data best. (Kaur & Sharma, 2020)
Conclusion
Overall, the study demonstrated that they could reliably identify sentiments using both lexicon-
based and machine learning classifiers.
2.2.3. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler: Twitter
Author
Meylan Wongkar
Department of Informatics Engineering, De La Salle Catholic University, Manado, Indonesia
Apriandy Angdresey
Department of Informatics Engineering, De La Salle Catholic University, Manado, Indonesia
Findings
The study uses Twitter data to explore public opinion toward Indonesia's presidential candidates
for 2019–2024. It compared Naive Bayes, SVM, and KNN techniques for sentiment classification.
Naive Bayes achieved the highest accuracy at 80.90%, ahead of KNN (75.58%) and SVM (63.99%).
Future plans involve extending the analysis to assess public satisfaction with the elected
president's performance. They also include incorporating data from Facebook and Instagram for a
more comprehensive outlook beyond Twitter. (Wongkar & Angdresey, 2019)
Conclusion
From the study, it can be concluded that Naive Bayes works better than the SVM and KNN techniques
for classifying people's sentiments. Future work could also examine how people feel about the
elected president on Facebook and Instagram.
2.3.2. Locobuzz
Locobuzz is a complete customer experience platform that combines technologies such as
Artificial Intelligence, Machine Learning, Big Data, and Analytics. It helps organizations build
closer relationships with their customers and improve customer lifetime value.
2.3.3. Idiomatic
Idiomatic builds customer sentiment analysis models that fit a business's sector. The sentiment
labels it provides are customized to the specific channels through which consumer feedback arrives.
Idiomatic also tracks changes in sentiment by channel and customer segment over time.
This helps to resolve the issues causing negative sentiment. (Fontanella, 2023)
2.3.4. Meltwater
Meltwater sentiment analysis uses advanced natural language processing (NLP) algorithms to
analyze the sentiment expressed in social media posts, articles or any other online content. It
provides text processing, lexicon-based analysis, and aggregation and summarization. It offers
accurate sentiment interpretation, customizable sentiment analysis, and real-time monitoring and
reporting. (Fontanella, 2023)
2.3.5. Reputation
Reputation is powered by natural language processing, which breaks down customer sentiment
and analyzes feedback to highlight trending topics. Its advanced text analytics determines what
people are commenting on. It helps to differentiate comments as negative or positive. It clearly
shows customer pain points across multiple channels.
2.4.1. Advantages
1. Real-Time Insights: Provides real-time insights into public sentiment, which helps businesses
stay updated on current trends.
2. Maximum Customer Satisfaction: Offers a direct way to understand consumers’ opinions,
which helps businesses respond to consumer concerns quickly.
3. Hate Speech Identification: Identifies words that denote hate speech, finds the source of the
hate speech, and supports remedial action.
4. Discover New Marketing Strategies: Specific marketing strategies can be built by analyzing
customers' discussions on social media.
5. Manage Crises Better: Sentiment analysis tracks customer opinions regularly, which helps
prevent complaints from growing and enables swift crisis management for brands.
2.4.2. Disadvantages
1. Bias and Noise: Tweets contain noise, abusive language and hate speech, which makes it
difficult to interpret sentiments accurately.
2. Limited Context: Understanding sarcasm, irony or context-specific texts can be a
problem, leading to misinterpretation.
3. Language Variations: Sentiment analysis models trained on one language may not
work well on other languages, which can lead to inaccuracies.
4. Neutral Sentiment: Many social media posts are neutral or express no sentiment at all,
which makes them hard to classify.
5. Data Quality: Data can sometimes be noisy, with misspellings and grammatical errors
that negatively impact the performance of sentiment analysis models.
3. Solution
The technology then applies a technique known as Naive Bayes to learn from these cleaned-up tweets.
It determines how likely it is that a given word will appear in happy, sad, or neutral tweets. When
it receives new tweets, it applies what it has learned to classify the sentiment of each one as
happy, sad or neutral. Finally, the tool checks how good it is at predicting. To do that, it compares
its predictions with the actual sentiments expressed in tweets that it did not learn from.
(Garg, 2013)
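The "cleaned-up tweets" mentioned above could be produced by a small pre-processing step. The sketch
below is only illustrative: this section does not list the cleaning rules, so the rules shown
(lower-casing, removing links, user mentions, digits and punctuation) and the function name
clean_tweet are assumptions rather than the project's actual procedure.

```python
import re


def clean_tweet(text):
    """Return a simplified, lower-case version of a tweet (assumed cleaning rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = text.replace("#", " ")              # keep the hashtag word, drop the symbol
    text = re.sub(r"[^a-z\s]", " ", text)      # drop digits, punctuation and other symbols
    return " ".join(text.split())              # collapse repeated whitespace


cleaned = clean_tweet("Loving the new phone!!! @shop http://t.co/abc #happy")
print(cleaned)
```

Note that these assumed rules also strip emojis, so a system that uses emojis as sentiment signals
would need a gentler final substitution.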
Once trained, this tool will serve as a user-friendly platform where input tweets go through
sentiment analysis. It will help businesses and individuals analyze the tweets related to their
brand and understand public sentiment on Twitter.
Naive Bayes assumes that the words in a tweet are independent of one another given the sentiment
class. This keeps things simple: the algorithm treats the words in a tweet as if they do not affect
each other and looks at each word separately to estimate the sentiment it expresses. During training,
the algorithm learns the probabilities of words occurring in each class. It learns from a set of
labeled tweets, where each tweet is already marked as positive, negative or neutral, and uses this
set to work out which words are associated with which sentiment. When a new tweet comes in, the
algorithm uses what it learned from the training tweets to predict the sentiment. It checks the
words within the text against the learned patterns and decides whether it is a happy, sad, or
neutral tweet. (Dr. Yi Shang, 2014)
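The training and prediction steps described above can be illustrated with a minimal sketch. It is
not the project's implementation: the function names, the six toy tweets, and the use of add-one
(Laplace) smoothing for words unseen in a class are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_tweets):
    """Count tweets per class and words per class from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_tweets:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_counts, word_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-probability for the given text."""
    class_counts, word_counts, vocabulary = model
    total_tweets = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Start from the log prior of the class.
        score = math.log(class_counts[label] / total_tweets)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Each word contributes independently (the "naive" assumption);
            # add-one smoothing avoids zero probabilities.
            likelihood = (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training examples, for illustration only.
training_tweets = [
    ("i love this phone", "positive"),
    ("what a great happy day", "positive"),
    ("i hate this traffic", "negative"),
    ("such a sad terrible day", "negative"),
    ("the meeting is at noon", "neutral"),
    ("train arrives at noon", "neutral"),
]
model = train_naive_bayes(training_tweets)
print(predict(model, "i love this great day"))
```

Working with log-probabilities rather than multiplying raw probabilities avoids numerical underflow
when a tweet contains many words.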
The algorithm is tested with new, unseen tweets to see how well it predicts their sentiments. It helps to
determine the accuracy and reliability of the algorithm.
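As an illustration of this testing step, the sketch below compares predicted labels with the true
labels of held-out tweets and computes accuracy together with precision, recall and F1 score for one
class. The function name and the example labels are hypothetical and are not results from this
project.

```python
def classification_metrics(true_labels, predicted_labels, target):
    """Compute accuracy plus precision, recall and F1 for the `target` class."""
    pairs = list(zip(true_labels, predicted_labels))
    true_pos = sum(1 for t, p in pairs if t == target and p == target)
    false_pos = sum(1 for t, p in pairs if t != target and p == target)
    false_neg = sum(1 for t, p in pairs if t == target and p != target)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for six unseen tweets, for illustration only.
actual = ["positive", "positive", "negative", "neutral", "negative", "positive"]
predicted = ["positive", "negative", "negative", "neutral", "positive", "positive"]
metrics = classification_metrics(actual, predicted, target="positive")
print(metrics)
```

Accuracy alone can be misleading when one sentiment dominates the data, which is why per-class
precision and recall are worth reporting alongside it.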
The algorithm’s goal is to determine the sentiment of a tweet quickly and accurately based on the
words within it. This helps in understanding public opinions, trends and reactions on Twitter
related to various topics or brands.
The dataset is imported from Kaggle and contains 1,600,000 tweets extracted using the Twitter API.
The tweets have been annotated (0 = negative, 4 = positive) and can be used to detect positive and
negative sentiment.
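A minimal sketch of how the numeric annotations could be turned into readable labels is shown below.
It assumes that the label is the first column of each CSV row and the tweet text is the last; this
layout should be checked against the downloaded file. The two sample rows are made up for
illustration, and the function name is hypothetical.

```python
import csv
import io

# Annotation scheme stated for the dataset: 0 = negative, 4 = positive.
LABEL_NAMES = {"0": "negative", "4": "positive"}


def read_labeled_tweets(csv_file):
    """Return (text, label) pairs from an open CSV file object."""
    labeled = []
    for record in csv.reader(csv_file):
        if not record:
            continue  # skip blank lines
        target, text = record[0], record[-1]  # assumed column positions
        if target in LABEL_NAMES:
            labeled.append((text, LABEL_NAMES[target]))
    return labeled


# Made-up rows standing in for the real file.
sample_file = io.StringIO(
    '"0","1","Mon Apr 06 2009","NO_QUERY","user_a","missed the bus again"\n'
    '"4","2","Mon Apr 06 2009","NO_QUERY","user_b","lovely sunny morning"\n'
)
labeled_tweets = read_labeled_tweets(sample_file)
print(labeled_tweets)
```

For the real dataset, the in-memory sample would be replaced by the downloaded file opened with
Python's built-in open function.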
3.4. Pseudocode
START
END
3.5. Flowchart
4. Conclusion
Companies can use sentiment analysis to monitor how their brand is perceived on Twitter. By analyzing
the sentiment around their brand mentions, they can understand public opinion, identify issues and
address concerns effectively. It can provide insights into emerging trends, consumer preferences and
changes in public opinion. This kind of information can be extremely useful for market research and
strategic decision-making. It can also be used to detect potential crises or spikes in negative
sentiment around certain topics or brands. Acting on these early warnings can help prevent risks and
manage crises effectively. Furthermore, in the customer service field, sentiment analysis allows
companies to engage with customers in real time, promoting a positive brand image and enhancing
customer relationships. (Ameen Abdullah Qaid Aqlan, 2019)
Different businesses and companies can monitor how their brand is perceived on Twitter.
The system's predictive feature anticipates trends, allowing proactive decision-making as sentiments
evolve. Overall, the sentiment analysis of Twitter data proposed in this project can provide useful
insights across sectors and assist enterprises, organizations, and experts alike.
S. N Tasks Status
In order to learn about sentiment analysis systems, various research was conducted. Starting with
Artificial Intelligence, we researched the definitions and meanings of AI. After a brief overview of
artificial intelligence and the branches of machine learning (ML) connected to it, we moved to the
sentiment analysis system. We studied what a sentiment analysis system is, how it works and what
purpose it serves in today’s world. Various existing sentiment analysis systems were also
researched, along with their advantages and disadvantages. It was found that there are multiple
types of sentiment analysis systems, such as rule-based systems, aspect-based sentiment analysis,
fine-grained sentiment analysis and many more. Of these, a machine-learning based system (the
Naïve Bayes model) was chosen for this project. This model can adapt to different contexts
and suits the dynamic nature of Twitter data. Existing projects and systems were researched
and explored after understanding the problem domain. A Twitter dataset was sourced from Kaggle
to work with, and finally the pseudocode and flowchart of the system were developed.
S. N Tasks Status
In the future, the sentiment analysis system will be implemented in Python with the help of
libraries such as pandas and NumPy. The system will use the Sentiment140 dataset available on
Kaggle, which contains 1.6 million tweets. Training on this dataset will help the system learn to
recognize the sentiments in Twitter posts. To keep proper track of progress, the development process
will also be documented. The main goal is to build a robust and efficient system that can read
tweets and predict their sentiments.
5. References
Ameen Abdullah Qaid Aqlan, B. M. a. R. L. N., 2019. A Study of Sentiment Analysis: Concepts, Techniques,
and Challenges. Research Gate.
Dr. Yi Shang, D. D. X. D. J. U., 2014. NAIVE BAYES ALGORITHM FOR TWITTER SENTIMENT ANALYSIS
AND ITS IMPLEMENTATION IN MAPREDUCE, s.l.: CORE.
Garg, B., 2013. DESIGN AND DEVELOPMENT OF NAÏVE BAYES CLASSIFIER, Fargo, North Dakota:
NDSU libraries.
Hong Chen, S. H. R. H. &. X. Z., 2021. Improved naive Bayes classification algorithm for traffic risk
management. EURASIP Journal on Advances in Signal Processing.
Kaur, C. & Sharma, A., 2020. Social Issues Sentiment Analysis using Python, s.l.: IEEE.
Liu, B., 2023. Sentiment Analysis. 2nd ed. s.l.:Cambridge University Press.
Mayur Wankhade, A. C. S. R. &. C. K., 2022. A survey on sentiment analysis methods, applications, and
challenges. Springer Link.
Mejova, Y., 2012. Sentiment analysis within and across social media streams, s.l.: Iowa Research Online.
Nirmal Varghese Babu, E. G. M. K., 2021. Sentiment Analysis in Social Media Data for Depression
Detection Using Artificial Intelligence. s.l.:s.n.
Sourav De, S. D., S. B., S. B., 2022. Advanced Data Mining Tools and Methods for Social Computing.
s.l.:Academic Press.
Wongkar, M. & Angdresey, A., 2019. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler:
Twitter, s.l.: IEEE.
Yang, C.-S. a. S. H.-P., 2012. A Rule-Based Approach For Effective Sentiment Analysis. s.l.:Pacific Asia
Conference on Information Systems.
Yoonjung Choi, Y. K. S.-H. M., 2009. Domain-specific sentiment analysis using contextual feature
generation. Research Gate.
Zhan, X. F. &. J., 2015. Sentiment analysis using product review data. Journal of Big Data.