
Module Code & Module Title

CU6051NI Artificial Intelligence

Assessment Weightage & Type

25% Coursework I - Proposal

Semester-1 Level 5

Student Name: Smriti Swar

London Met ID: 210259

College ID: np01cp4a210259

Assignment Due Date: 20th December 2023

Assignment Submission Date: 19th December 2023

Word Count (Where Required): 3086

I confirm that I understand my coursework needs to be submitted online via Google Classroom under the
relevant module page before the deadline in order for my assignment to be accepted and marked. I am
fully aware that late submissions will be treated as non-submission and a mark of zero will be awarded.
Table of Contents
1. Introduction
1.1. Artificial Intelligence
1.2. Sentiment Analysis System
1.3. Problem Domain
2. Background
2.1. Research on Topic
2.1.1. Rule-based sentiment system
2.1.2. Aspect-based sentiment analysis
2.2. Review and Analysis of Existing Work
2.2.1. Sentiment analysis using product review data
2.2.2. Social Issues Sentiment Analysis using Python
2.2.3. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler: Twitter
2.3. Existing Systems
2.3.1. Brand24
2.3.2. Locobuzz
2.3.3. Idiomatic
2.3.4. Meltwater
2.3.5. Reputation
2.4. Advantages and Disadvantages of the Problem Domain
2.4.1. Advantages
2.4.2. Disadvantages
3. Solution
3.1. Explanation of the proposed solution
3.2. Explanation of AI algorithm implemented
3.3. Data Set: Twitter Dataset
3.4. Pseudocode
3.5. Flowchart
4. Conclusion
4.1. How the solution addresses the real-world problems
4.2. Analysis of work done
4.3. Future Work
5. References

List of Tables
Table 1: Analysis of the work done.
Table 2: Further work

List of Figures
Figure 1: Sentiment Analysis Steps
Figure 2: Rule based sentiment analysis
Figure 3: Aspect based sentiment analysis.
Figure 4: Sentiment Polarity Categorization Process.
Figure 5: Text processing for Twitter
Figure 6: Brand24: social listening tool
Figure 7: Locobuzz: social listening platform
Figure 8: Idiomatic, a sentiment analysis tool
Figure 9: Meltwater sentiment analysis tool
Figure 10: Reputation: sentiment analysis tool.
Figure 11: Flowchart of the system.

CS6004NI Application Development

1. Introduction

1.1. Artificial Intelligence


Artificial Intelligence is a vast field that has grown from simple beginnings to widespread
application. AI is something of a moving target: its definition is unstable and has changed over
time. It involves the study, design, and building of intelligent entities that can achieve goals.
In simpler terms, it is the intelligence demonstrated by machines, as opposed to the natural
intelligence displayed by humans. Examples of applications of artificial intelligence include
machine translation, robots, search engines, spam detection, recommendation systems, and many
more. (Christoph Bartneck, 2021)

Machine Learning is a branch of artificial intelligence that studies how machines can improve
their perception, knowledge, reasoning, or actions based on experience or data. To do this, machine
learning (ML) draws on concepts from economics, statistics, psychology, neurology, and control
theory. Machine learning algorithms are trained to identify patterns and relationships in data.
They reduce dimensionality, categorize information, cluster data points, and make predictions. They
even help create new content by using past data as input. Machine learning has broad applications
in numerous industries. For example, news agencies, social media platforms, and e-commerce sites
use recommendation engines to suggest content based on a user's past activity. (Linda Tucci,
2023)

1.2. Sentiment Analysis System


Sentiment analysis, often known as opinion mining, is a technique designed to comprehend and
classify the sentiment expressed in text data. It is the study of people's opinions, attitudes,
and emotions regarding entities and their properties, as expressed in written language. The
entities may be services, goods, groups, individuals, events, problems, or ideas. Essentially, it
is a technology that reads through written text, such as social media posts and customer reviews,
to determine the emotions conveyed.

The primary goal of this system is to categorize the sentiment of a text as positive, negative, or
neutral. It uses natural language processing tools and machine learning algorithms for this
purpose, analyzing the words, phrases, context, and emojis to decide which sentiment is expressed.
The system is useful in many domains. Businesses use sentiment analysis to gauge customer
satisfaction and monitor brand image. Researchers use it to understand public opinion on social or
political issues. (Nirmal Varghese Babu, 2021)

Overall, sentiment analysis systems play an important role in identifying and measuring sentiment
in textual data, which helps businesses, researchers, and analysts deepen their insights and make
informed decisions based on public opinion. (Liu, 2023)


Figure 1: Sentiment Analysis Steps

1.3. Problem Domain


It can be difficult to understand and manage a brand's reputation online, particularly on sites
like Twitter where opinions can differ dramatically. Monitoring brand mentions and filtering through
a wide range of opinions in real time to address issues and extract valuable data is a huge task
for businesses. Twitter is a goldmine of information about consumer preferences and market trends.
Customers are free to express their opinions, which range from warm appreciation to sharp
criticism. It is extremely tough for businesses to keep up with such a wide range of sentiments,
maintain a positive image, and enhance customer satisfaction. (Yoonjung Choi, 2009)

Without a proper sentiment analysis system, understanding customer feedback becomes
time-consuming and confusing. Manual review quickly becomes impractical, and responses to customer
concerns are likely to be delayed, which puts the brand's reputation at greater risk. It also
limits the ability to adapt to evolving consumer preferences and trends, leading to uninformed and
irrational decisions. Businesses might struggle to detect potential crises in public opinion early.
Political decisions, policy-making, and the understanding of social issues may lack crucial
insights. In the same way, it becomes a challenge to provide timely and personalized customer
service, which affects the overall quality of service. (Mayur Wankhade, 2022)


2. Background

2.1. Research on Topic


Sentiment analysis systems are designed to predict the sentiment or emotion that a statement is
most likely trying to express. These systems classify statements as positive, negative, or neutral.
Several types of sentiment analysis systems exist today, such as aspect-based sentiment analysis,
fine-grained sentiment analysis, rule-based sentiment systems, and emotion detection. Some of them
are explained below.

2.1.1. Rule-based sentiment system:


These systems rely on predefined patterns to identify sentiment. They use lexicons, dictionaries, or
linguistic rules to assign sentiment scores to words, which gives a structured approach to analyzing
sentiment. However, this kind of system can struggle with context-specific sentiment or sarcasm.
It may misinterpret expressions whose meaning depends on context, which leads to inaccuracies.
These systems work best on straightforward, easily interpretable content, but they adapt poorly to
the dynamic nature of human language and the variation present in textual data. (Yang, 2012)
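
The lexicon-and-rules idea can be sketched in a few lines of Python. This is a minimal illustration, not a production approach: the lexicon entries, their scores, the negation list, and the function name `rule_based_sentiment` are all assumptions made up for this example.

```python
# Tiny hand-made lexicon; the words and scores are illustrative assumptions.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "hate": -2}
NEGATIONS = {"not", "no", "never"}


def rule_based_sentiment(text: str) -> str:
    """Sum lexicon scores, flipping the sign of a word that follows a negation."""
    score = 0
    negate = False
    for token in text.lower().split():
        word = token.strip(".,!?")
        if word in NEGATIONS:
            negate = True
            continue
        value = LEXICON.get(word, 0)
        score += -value if negate else value
        negate = False  # a negation only affects the very next word
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With these rules, "This is not good" is scored as negative, but a sarcastic tweet such as "Oh great, another delay" is still scored as positive, which is exactly the kind of context-dependent error described above.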

Figure 2: Rule based sentiment analysis


2.1.2. Aspect-based sentiment analysis:


Instead of analyzing overall sentiment, these systems focus on specific aspects or features within
the text. They classify the sentiment related to each aspect separately, which provides more
detailed insights. It is a multi-step process that aims to detect and extract sentiment towards a
specific component of a product. First, it identifies the features discussed in the text; sentiment
classification is then applied to the text fragments that mention each feature. Finally, the
results are aggregated and scored by aspect. (Sourav De, 2022)
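
As a rough sketch of those three steps (find aspects, classify fragments, aggregate per aspect), the following Python splits a review into clauses and scores each aspect from the opinion words in the same clause. The aspect keywords, the opinion lexicon, and the function name `aspect_sentiment` are hypothetical choices for illustration; real systems typically use trained models for both steps.

```python
import re
from collections import defaultdict

# Illustrative assumptions: aspect keywords and a tiny opinion lexicon.
ASPECTS = {"battery": "battery", "screen": "screen", "display": "screen", "price": "price"}
OPINIONS = {"great": 1, "good": 1, "sharp": 1, "poor": -1, "bad": -1, "expensive": -1}


def aspect_sentiment(text: str) -> dict:
    """Score each aspect from the opinion words found in the same clause."""
    scores = defaultdict(int)
    # Split into fragments at punctuation and at the contrastive word "but".
    for clause in re.split(r"[.,;!?]|\bbut\b", text.lower()):
        words = clause.split()
        found = {ASPECTS[w] for w in words if w in ASPECTS}
        polarity = sum(OPINIONS.get(w, 0) for w in words)
        for aspect in found:
            scores[aspect] += polarity  # aggregate the score per aspect
    return dict(scores)
```

For example, `aspect_sentiment("The battery is great but the screen is poor. Price is good.")` gives the battery and price a positive score and the screen a negative one, rather than a single overall label.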

Figure 3: Aspect based sentiment analysis.


2.2. Review and Analysis of Existing Work


2.2.1. Sentiment analysis using product review data.

Author

Xing Fang and Justin Zhan

Figure 4: Sentiment Polarity Categorization Process.

Findings

The study focused on understanding sentiment in online product reviews from Amazon.com.
The authors categorized the sentiment as positive, negative, or neutral, which is important for
comprehending opinions expressed in language. They developed a procedure to do this and tested it
on both individual sentences and complete reviews. The experiments showed that their approach
could accurately determine whether a whole review or a single sentence was neutral, negative, or
positive. The report also suggested future work, including improving accuracy and analyzing
further information. (Zhan, 2015)

Conclusion

To conclude, their approach appears to be an effective way of detecting sentiment in reviews.
The researchers also plan to refine it further in the future.


2.2.2. Social Issues Sentiment Analysis using Python.

Author:

Chhinder Kaur; Anand Sharma

Findings

The study collected tweets directly from Twitter using the Twitter API and a Python tool known as
a Twitter scraper. The dataset was then divided into two sections: one for training the models and
the other for testing them. The authors used lexicon-based and machine learning classifiers to
identify whether the sentiment of each tweet was positive, negative, or neutral. Their methodology
used both supervised and unsupervised machine learning models to make predictions. Model
performance was evaluated using measures such as F1 score, precision, and recall. By combining
several techniques, they sought to determine which model fits the data best.
(Kaur & Sharma, 2020)
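
To make the evaluation measures concrete, here is a small sketch of how precision, recall, and F1 score can be computed for one class from true and predicted labels. The function name and the example labels are assumptions for illustration and are not taken from the study.

```python
def precision_recall_f1(y_true, y_pred, positive_label):
    """Compute precision, recall and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in pairs if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in pairs if t == positive_label and p != positive_label)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Made-up labels: two true positives, one false positive, one false negative.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "pos")
```

Precision answers "of the tweets predicted positive, how many really were?", recall answers "of the truly positive tweets, how many were found?", and F1 is the harmonic mean of the two.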

Conclusion

Overall, the study demonstrated that they could reliably identify sentiments using both lexicon-
based and machine learning classifiers.


2.2.3. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler: Twitter

Author

Meylan Wongkar
Department of Informatics Engineering, De La Salle Catholic University, Manado, Indonesia

Apriandy Angdresey
Department of Informatics Engineering, De La Salle Catholic University, Manado, Indonesia

Figure 5: Text processing for Twitter

Findings

The study uses Twitter data to explore public opinion toward Indonesia's presidential candidates
for 2019–2024. It compares Naive Bayes, SVM, and KNN for sentiment classification: Naive Bayes
achieved the highest accuracy (80.90%), ahead of KNN (75.58%) and SVM (63.99%). Future plans
involve extending the analysis to assess public satisfaction with the elected president's
performance and incorporating data from Facebook and Instagram for a more comprehensive outlook
beyond Twitter. (Wongkar & Angdresey, 2019)

Conclusion

From the study, it can be concluded that Naive Bayes classified people's sentiments more accurately
than the SVM and KNN techniques. The same approach could also be extended to check how people feel
about the elected president on Facebook and Instagram.


2.3. Existing Systems


2.3.1. Brand24
Brand24 is a tool that monitors what people say online about a brand or any topic linked to a
business. It gathers information from various places on the internet, such as social media, blogs,
forums, and customer reviews, to work out how people feel about the brand or the topics of
interest.

Figure 6: Brand24: social listening tool


2.3.2. Locobuzz:
Locobuzz is a complete customer experience platform that combines technologies such as
Artificial Intelligence, Machine Learning, Big Data, and Analytics. It helps organizations build
closer relationships with their customers and improve customer lifetime value.

Figure 7: Locobuzz: social listening platform


2.3.3. Idiomatic:
Idiomatic builds customer sentiment analysis models tailored to a company's sector. The sentiment
labels provided by Idiomatic are customized to the specific channels through which consumer
feedback arrives. With Idiomatic, users can also track changes in sentiment by channel and customer
segment over time, which helps to resolve the issues causing negative sentiment. (Fontanella, 2023)

Figure 8: Idiomatic, a sentiment analysis tool

2.3.4. Meltwater
Meltwater sentiment analysis uses advanced natural language processing (NLP) algorithms to
analyze the sentiment expressed in social media posts, articles, or any other online content. It
provides text processing, lexicon-based analysis, and aggregation and summarization. It offers
accurate sentiment interpretation, customizable sentiment analysis, and real-time monitoring and
reporting. (Fontanella, 2023)


Figure 9: Meltwater sentiment analysis tool

2.3.5. Reputation
Reputation is powered by Natural Language Processing that breaks down customer sentiment
and analyzes feedback to highlight trending topics. Its advanced text analytics determines what
people are commenting on and helps to differentiate negative comments from positive ones. It
clearly shows customer pain points across multiple channels.

Figure 10: Reputation: sentiment analysis tool.


2.4. Advantages and Disadvantages of the Problem Domain


The system will have various advantages and disadvantages. A list of them is provided below.

2.4.1. Advantages
1. Real-Time Insights: Provides real-time insight into public sentiment, which helps businesses
stay updated on current trends.
2. Maximum Customer Satisfaction: Offers a direct way to understand consumers’ opinions,
which helps businesses respond to consumer concerns quickly.
3. Hate speech identification: Identifies words that denote hate speech, finds the source of the
hate speech, and supports remedial action.
4. Discover new marketing strategies: Specific marketing strategies can be built by analyzing
customers' discussions on social media.
5. Manage crises better: Sentiment analysis tracks customer opinions regularly, which helps
prevent complaints from escalating and enables swift crisis management for brands.

2.4.2. Disadvantages
1. Bias and Noise: Tweets contain noise, abusive language, and hate speech, which makes it
challenging to interpret sentiment accurately.
2. Limited Context: Understanding sarcasm, irony, or context-specific text can be a
problem, leading to misinterpretation.
3. Language Variations: Sentiment analysis models trained on one language may not
transfer well to others, which can lead to inaccuracies.
4. Neutral Sentiment: Many social media posts are neutral or express no sentiment at all,
which makes them harder to classify.
5. Data Quality: Data can sometimes be noisy, with misspellings and grammatical errors
that negatively affect the performance of sentiment analysis models.


3. Solution

3.1. Explanation of the proposed solution


A sentiment analysis system that uses the Naïve Bayes classifier algorithm is proposed as the
solution. This tool analyzes Twitter data to decide whether tweets are neutral, negative, or
positive. Initially, it collects tweets that are already labeled with their sentiment. After that,
it cleans the tweets by deleting redundant phrases, correcting spelling errors, and improving
their readability. (Hong Chen, 2021)
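
A cleaning step of this kind might look like the sketch below. It is only one plausible set of rules, chosen here for illustration: the function name `clean_tweet` and each regular expression are assumptions, and spelling correction is deliberately left out because it needs a dictionary or a dedicated library.

```python
import re


def clean_tweet(text: str) -> str:
    """Normalize a raw tweet into lowercase words separated by single spaces."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)                   # drop @mentions
    text = text.replace("#", " ")                       # keep the hashtag word, drop '#'
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)          # squeeze "soooo" to "soo"
    text = re.sub(r"[^a-z\s]", " ", text)               # keep letters only
    return " ".join(text.split())                       # collapse whitespace
```

Removing every non-letter character also discards emojis and punctuation; a system that wants to use emojis as sentiment signals would need to relax that last rule.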

The system then applies a technique known as Naive Bayes to learn from these cleaned-up tweets.
It determines how likely it is that a given word will appear in happy, sad, or neutral tweets. When
it receives new tweets, it applies what it has learned to classify their sentiment as happy, sad,
or neutral. Finally, the tool checks how good it is at predicting. To do that, it compares its
predictions with the actual sentiment labels of tweets that it did not learn from. (Garg, 2013)

Once trained, this tool will serve as a user-friendly platform where input tweets go through
sentiment analysis. It will help businesses and individuals analyze the tweets related to their
brand and so understand public sentiment on Twitter.

3.2. Explanation of AI algorithm implemented.


The Naïve Bayes algorithm was chosen for this sentiment analysis project. Naïve Bayes is a
classification algorithm that operates on the principle of Bayes' theorem. In the context of
sentiment analysis, it serves as a powerful tool for classifying the sentiment of text. It computes
the likelihood of each sentiment a text may be expressing based on the occurrence of words in it.
This makes it well suited to our project. (Samsir, 2019)
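
Written out, the calculation takes the following standard form. The notation is introduced here for illustration and is not taken from the cited sources: c stands for a sentiment class and w_1, ..., w_n for the words of a tweet.

```latex
% Bayes' theorem: posterior probability of class c given the words of a tweet
P(c \mid w_1, \dots, w_n) = \frac{P(c)\, P(w_1, \dots, w_n \mid c)}{P(w_1, \dots, w_n)}

% Naive assumption: words are conditionally independent given the class
P(w_1, \dots, w_n \mid c) = \prod_{i=1}^{n} P(w_i \mid c)

% Predicted sentiment: the class with the highest log score
% (the denominator is the same for every class, so it can be dropped)
\hat{c} = \arg\max_{c} \Big[ \log P(c) + \sum_{i=1}^{n} \log P(w_i \mid c) \Big]
```

Working with logarithms avoids multiplying many small probabilities together, which would otherwise underflow for longer texts.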

Naïve Bayes assumes that the words in a text are conditionally independent of one another given the
class. This keeps the model simple: it treats each word in a tweet separately and estimates how
strongly that word indicates each sentiment. During training, the algorithm learns the probability
of each word occurring in each class from a set of labeled tweets, where each tweet is already
marked as positive, negative, or neutral. When a new tweet comes in, the algorithm uses what it
learned from the training tweets to predict the sentiment. It combines the probabilities of the
words in the text and chooses the most likely class, deciding whether the tweet is happy, sad, or
neutral. (Dr. Yi Shang, 2014)
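
The training and prediction steps can be illustrated with a small from-scratch classifier. This is a hedged sketch rather than the project's implementation: the class name `NaiveBayesSentiment`, the whitespace tokenization, and the use of add-one (Laplace) smoothing are assumptions chosen for the example, and the four training sentences are made up.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of tweets per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.vocab.update(words)
        self.total = len(labels)
        return self

    def predict(self, text):
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / self.total)  # log prior of the class
            for word in text.lower().split():
                if word in self.vocab:  # skip words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training tweets, purely for demonstration.
model = NaiveBayesSentiment().fit(
    ["i love this phone", "what a great day", "i hate waiting", "this is terrible service"],
    ["positive", "positive", "negative", "negative"],
)
```

Smoothing matters because a word that never appeared with a class would otherwise get probability zero and wipe out that class's score entirely.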

The algorithm is tested with new, unseen tweets to see how well it predicts their sentiments. It helps to
determine the accuracy and reliability of the algorithm.


The algorithm’s goal is to determine the sentiment of a tweet quickly and accurately based on the
words within it. This helps in understanding public opinions, trends, and reactions on Twitter
related to various topics or brands.

3.3. Data Set: Twitter Dataset


A sentiment analysis system requires a dataset in order to train and test the algorithm. The dataset
chosen for implementation in the system is the Sentiment140 dataset with 1.6 million tweets.

This dataset is obtained from Kaggle, and it contains 1,600,000 tweets extracted using the Twitter
API. The tweets have been annotated (0 = negative, 4 = positive) and can be used to detect positive
and negative sentiment.
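
As an illustration of how the annotations could be read, the sketch below parses rows in the Sentiment140 layout and maps the numeric targets to label names. The column order (`target`, `ids`, `date`, `flag`, `user`, `text`) follows the dataset's published description but should be checked against the downloaded file; the function name and the two sample rows are invented for this example.

```python
import csv
import io

# Assumed column order of the Kaggle CSV, which ships without a header row.
COLUMNS = ["target", "ids", "date", "flag", "user", "text"]
LABELS = {"0": "negative", "4": "positive"}


def load_sentiment140(file_obj):
    """Yield (text, label) pairs from a Sentiment140-style CSV stream."""
    reader = csv.DictReader(file_obj, fieldnames=COLUMNS)
    for row in reader:
        label = LABELS.get(row["target"])
        if label is not None:  # skip any unexpected target value
            yield row["text"], label


# Two made-up rows in the same layout, standing in for the real file.
sample = io.StringIO(
    '"0","1","Mon Apr 06 22:19:45 PDT 2009","NO_QUERY","user_a","my flight got cancelled"\n'
    '"4","2","Mon Apr 06 22:20:01 PDT 2009","NO_QUERY","user_b","loving the new album"\n'
)
pairs = list(load_sentiment140(sample))
```

For the real file, the same function can be passed an opened file handle instead of the in-memory sample; the file is usually read with a Latin-1 encoding, though that too is worth verifying on the actual download.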

3.4. Pseudocode
START

LOAD twitter data FROM Kaggle (Sentiment140 Dataset)

EXPLORE twitter data

CLEAN twitter data (Preprocess and clean the tweets)

SPLIT twitter data INTO training data and testing data

EXTRACT features FROM training data

TRAIN Naive_Bayes_Model USING training data and features

TEST Naive_Bayes_Model USING testing data

PREDICT_SENTIMENT OF new tweet USING Naive_Bayes_Model and features

EVALUATE_MODEL PERFORMANCE WITH testing data

END
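
The SPLIT and EVALUATE steps of the pseudocode above can be sketched in Python as follows. The helper names, the 80/20 ratio, the fixed seed, and the keyword-based placeholder predictor are all assumptions for illustration; in the real system the predictor would be the trained Naive Bayes model.

```python
import random


def train_test_split(pairs, test_ratio=0.2, seed=42):
    """Shuffle a copy of the data and cut it into training and testing parts."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = round(len(shuffled) * test_ratio)
    cut = len(shuffled) - n_test
    return shuffled[:cut], shuffled[cut:]


def accuracy(predict, test_pairs):
    """Fraction of held-out examples whose predicted label matches the true one."""
    if not test_pairs:
        return 0.0
    correct = sum(1 for text, label in test_pairs if predict(text) == label)
    return correct / len(test_pairs)


def placeholder_predict(text):
    """Stand-in classifier used only to exercise the evaluation code."""
    return "positive" if "good" in text else "negative"


# Ten made-up labeled tweets.
data = [(f"good tweet {i}", "positive") for i in range(5)] + [
    (f"bad tweet {i}", "negative") for i in range(5)
]
train, test = train_test_split(data)
```

Keeping the test tweets out of training is what makes the accuracy figure meaningful: it measures how the model behaves on tweets it has never seen.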


3.5. Flowchart

Figure 11: Flowchart of the system.


4. Conclusion

4.1. How the solution addresses the real-world problems.


A sentiment analysis project that uses Twitter data provides solutions that apply across
industries. It is important for organizations to manage their brands, quickly gauge opinions,
address problems, and improve consumer satisfaction. Twitter is a goldmine of information about
consumer preferences and market trends, which enables flexible strategy.

Companies can use sentiment analysis to monitor how their brand is perceived on Twitter. By
analyzing the sentiment around their brand mentions, they can understand public opinion, identify
issues, and address concerns effectively. It can provide insights into emerging trends, consumer
preferences, and changes in public opinion. This kind of information can be extremely useful for
market research and strategic decision making. It can also be used to detect potential crises or
spikes in negative sentiment around certain topics or brands. Acting on these early warnings can
help prevent risks and manage crises effectively. Furthermore, in the customer service field,
sentiment analysis allows companies to engage with customers in real time, promoting a positive
brand image and enhancing customer relationships.
(Ameen Abdullah Qaid Aqlan, 2019)

Businesses and companies of all kinds can therefore monitor how their brand is perceived on Twitter.

Its predictive capability helps anticipate trends, allowing proactive decision-making as
sentiments evolve. Overall, the sentiment analysis of Twitter data proposed in this project is
expected to provide useful insights across sectors that assist enterprises, organizations, and
experts alike.


4.2. Analysis of work done

S.N.  Tasks                                             Status
1     Research on Artificial Intelligence               Completed
2     Research on existing Sentiment Analysis Systems   Completed
3     Research on Natural Language Processing           Completed
4     Research on Naïve Bayes Algorithm                 Completed
5     Research on Problem Domain                        Completed
6     Research on Analysis of existing projects         Completed
7     Research on Dataset                               Completed
8     Build Pseudocode                                  Completed
9     Draw flowchart                                    Completed


Table 1: Analysis of the work done.

To learn about sentiment analysis systems, research was conducted in several areas. Starting with
Artificial Intelligence, we studied its definitions and meanings. After a brief overview of
artificial intelligence and the branches of ML (Machine Learning) connected to it, we moved on to
sentiment analysis systems: what a sentiment analysis system is, how it works, and what purpose it
serves in today’s world. Various existing sentiment analysis systems were also researched, along
with their advantages and disadvantages. It was found that there are multiple types of sentiment
analysis systems, such as rule-based systems, aspect-based sentiment analysis, fine-grained
sentiment analysis, and many more. Of these, a machine-learning-based system (the Naïve Bayes
model) was chosen for this project because it can adapt to different contexts and suits the
dynamic nature of Twitter data. Existing projects and systems were researched and explored after
the problem domain was understood. A Twitter dataset was sourced from Kaggle, and finally the
pseudocode and flowchart of the system were developed.


4.3. Future Work

S.N.  Tasks                                             Status
1     Implementation of the concept                     Pending
2     Development of the system                         Pending
3     Documentation of the development process          Pending
4     Resulting system                                  Pending


Table 2: Further work

In the future, the sentiment analysis system will be implemented in Python with the help of
libraries such as pandas and NumPy. The system will use the Sentiment140 dataset available on
Kaggle, which contains 1.6 million tweets. This dataset will allow the system to learn to recognize
the sentiment in Twitter posts. To keep proper track of progress, the development process will
also be documented. The main goal is to build a robust and efficient system that can read tweets
and predict their sentiment.


5. References

Ameen Abdullah Qaid Aqlan, B. M. a. R. L. N., 2019. A Study of Sentiment Analysis: Concepts, Techniques,
and Challenges. Research Gate.

Christoph Bartneck, C. L., 2021. What is AI?. s.l.:Research Gate.

Dr. Yi Shang, D. D. X. D. J. U., 2014. NAIVE BAYES ALGORITHM FOR TWITTER SENTIMENT ANALYSIS
AND ITS IMPLEMENTATION IN MAPREDUCE, s.l.: CORE.

Fontanella, C., 2023. HubSpot. [Online].

Garg, B., 2013. DESIGN AND DEVELOPMENT OF NAÏVE BAYES CLASSIFIER, Fargo, North Dakota:
NDSU libraries.

Hong Chen, S. H. R. H. &. X. Z., 2021. Improved naive Bayes classification algorithm for traffic risk
management. EURASIP Journal on Advances in Signal Processing.

Kaur, C. & Sharma, A., 2020. Social Issues Sentiment Analysis using Python, s.l.: IEEE.

Linda Tucci, 2023. Tech Accelerator, s.l.: Tech Target.

Liu, B., 2023. Sentiment Analysis. 2nd ed. s.l.:Cambridge University Press.

Mayur Wankhade, A. C. S. R. &. C. K., 2022. A survey on sentiment analysis methods, applications, and
challenges. Springer Link.

Mejova, Y., 2012. Sentiment analysis within and across social media streams, s.l.: Iowa Research Online.

Nirmal Varghese Babu, E. G. M. K., 2021. Sentiment Analysis in Social Media Data for Depression
Detection Using Artificial Intelligence. s.l.:s.n.

Samsir, D. I. F. E. J. M. H. J. R. K. R. B. U. R. W., 2019. Naives Bayes Algorithm for Twitter Sentiment
Analysis. Journal of Physics: Conference Series.

Sourav De, S. D., S. B., S. B., 2022. Advanced Data Mining Tools and Methods for Social Computing.
s.l.:Academic Press.

Wongkar, M. & Angdresey, A., 2019. Sentiment Analysis Using Naive Bayes Algorithm Of The Data Crawler:
Twitter, s.l.: IEEE.

Yang, C.-S. a. S. H.-P., 2012. A Rule-Based Approach For Effective Sentiment Analysis. s.l.:Pacific Asia
Conference on Information Systems.

Yoonjung Choi, Y. K. S.-H. M., 2009. Domain-specific sentiment analysis using contextual feature
generation. Research Gate.


Zhan, X. F. &. J., 2015. Sentiment analysis using product review data. Journal of Big Data.
