
A PROJECT REPORT ON

WHATSAPP CHAT ANALYZER



A Project report submitted in partial fulfilment of the requirement for
the award of the Degree of

MASTER OF COMPUTER APPLICATION

Submitted By
Soumya Ranjan Jena
(2224100031)

Under the esteemed guidance of

Mrs. Swarna Lata Pati

School of Computer Science


ODISHA UNIVERSITY OF TECHNOLOGY AND RESEARCH
(Techno Campus, Ghatikia, Mahalaxmi Vihar, Bhubaneswar-751003)
Academic Year 2023-2024

CERTIFICATE

This is to certify that the project report entitled “WHATSAPP CHAT ANALYZER”,
submitted by Mr. Soumya Ranjan Jena, bearing registration no. 2224100031, in
partial fulfilment of the requirements for the award of the Degree of Master of Computer
Application of the Odisha University of Technology and Research, is a record of bonafide
work carried out by him under my guidance and supervision.

The results embodied in this project report have not been submitted to any other
University or Institute for the award of any Degree or Diploma.

Internal Guide: Mrs. Swarna Lata Pati

Head of School of Computer Science: Dr Jibitesh Mishra

DECLARATION

I, Soumya Ranjan Jena, bearing Registration No. 2224100031, a bonafide student of
Odisha University of Technology and Research, hereby declare that the project titled
“WhatsApp Chat Analyzer”, submitted in partial fulfilment of the MCA degree course of Odisha
University of Technology and Research, is my original work, carried out in the year 2024 under
the guidance of Mrs. Swarna Lata Pati, School of Computer Science, and that it has not
previously formed the basis for any degree, diploma, or other similar title submitted to any
university.

Date: 05th Jan 2024 Soumya Ranjan Jena

Place: OUTR, BBSR (2224100031)



ACKNOWLEDGEMENT

I would like to express my sincere gratitude to my advisor, Mrs. Swarna Lata Pati, School of
Computer Science, whose knowledge and guidance have motivated me to achieve my goals. She has
consistently been a source of motivation, encouragement, and inspiration. The time I have spent
working under her supervision has truly been a pleasure.
I take it as a great privilege to express my heartfelt gratitude to Dr Jibitesh Mishra, Head of the
School of Computer Science, for his valuable support, and to all senior faculty members of the
School of Computer Science for their help during my course. Thanks also to the programmers and
non-teaching staff of the School of Computer Science.

Finally, special thanks to my parents for their support and encouragement throughout my life
and this course. Thanks to all my friends and well-wishers for their constant support.

Date: 05th Jan 2024 Soumya Ranjan Jena

Place: OUTR, BBSR (2224100031)


ABSTRACT

WhatsApp is one of the most widely used and efficient means of communication in recent
times. WhatsApp chats consist of many kinds of conversations held among groups of people,
covering a wide range of topics. This information can provide a large amount of data for
technologies such as machine learning. The most important requirement for a machine learning
model is the right learning experience, which is shaped by the data that we provide to the
model. This tool aims to provide an in-depth analysis of the data exported from WhatsApp.
Irrespective of the topic of the conversation, the code we developed can be applied to obtain a
better understanding of the data. The advantage of this tool is that it is implemented using
simple Python modules such as pandas, matplotlib, and seaborn, together with sentiment
analysis, which are used to create data frames and plot different graphs. The results are then
displayed in the Flutter application. Because the approach is efficient and consumes few
resources, it can easily be applied to larger datasets.
CONTENTS

1. INTRODUCTION
1.1. Introduction
1.2. Problem Statement
1.3. Existing System
1.4. Proposed System
1.5. Objectives
2. LITERATURE SURVEY
2.1. Feasibility Study
2.1.1. Technical Feasibility
2.1.2. Economical Feasibility
2.1.3. Operational Feasibility
3. REQUIREMENT ANALYSIS
3.1. Requirements Analysis
3.2. Platform Specification
3.3. Functional Requirements
3.4. Non-Functional Requirements
3.5. System Specification
3.5.1. Software Requirements
3.5.2. Hardware Requirements
4. DESIGN
4.1. Software Requirement Specification
4.1.1. Glossary
4.1.2. Use Case Model
4.2. Conceptual Level Class Diagram
4.3. Conceptual Level Activity Diagram
5. SYSTEM MODELING
5.1. Conceptual Level Sequence Diagram
5.2. Conceptual Level Collaboration Diagram
5.3. Conceptual Level State Diagram
5.4. Conceptual Level Component Diagram
5.5. Conceptual Level Deployment Diagram
5.6. Methodology and Implementation Phase
5.6.1. Methodology
5.6.1.1. Description: Incremental Model
5.6.1.2. Advantages and Disadvantages
5.6.1.3. Reason for Use
5.6.2. Implementation Phase
5.6.2.1. Language Used: Characteristics
5.7. Testing
5.7.1. Testing Objectives
5.7.2. Testing Methods & Strategies used along with Test Data
6. CONCLUSION & FUTURE WORK
6.1. Conclusion
6.2. Limitation of Project
6.3. Future Enhancement Suggestion
7. BIBLIOGRAPHY & REFERENCES
7.1. Reference
7.2. Screenshots
1. INTRODUCTION

1.1. Introduction
This tool is based on data analysis and processing. The first step in implementing a machine
learning algorithm is to identify the right learning experience from which the model can
begin to improve. Data pre-processing plays a major role in machine learning. To make a
model more efficient we need a lot of data, so we turned our focus to one of the large-scale
data producers owned by Facebook, namely WhatsApp. WhatsApp claims that nearly 55 billion
messages are sent each day. The average user spends 195 minutes per week on WhatsApp and
is a member of many groups. With this treasure house of data right under our noses, it is
imperative that we set out to gain insights from the messages that pass through our phones.

1.2. Problem Statement


WhatsApp-Analyzer is a statistical analysis tool for WhatsApp chats. Working on
the chat files that can be exported from WhatsApp, it generates various plots
showing, for example, which other participant a user responds to the most. We
propose to employ dataset manipulation techniques to gain a better understanding of
the WhatsApp chats present on our phones.

1.3. Existing Systems


 Chat Stats
 Whatsanalyze
 Chatilyzer
 Chat analyzer
1.4. Proposed system
Data pre-processing: the initial part of the project is to understand the
implementation and usage of various Python modules. This helps us to understand
why using existing modules is preferable to implementing the same functions from
scratch. These modules make the code clearer and easier for users to understand.
The libraries used include numpy, scipy, pandas, csv, sklearn, matplotlib, sys, re,
emoji, nltk, and seaborn.

Exploratory data analysis: the first step is to apply a sentiment analysis
algorithm that classifies the chat into positive, negative, and neutral parts, which
are then plotted as a pie chart. The remaining steps are:
 Plot a line graph of author and message count for each date.
 Plot a line graph of message count for each author.
 Plot an ordered graph of date vs message count.
 Show the media sent by each author and their count.
 Display the messages that do not have an author.
 Plot a graph of hour vs message count.
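
The counting steps above can be sketched in a few lines of Python. The following is a minimal illustration that uses only the standard library rather than the pandas pipeline of the project. It assumes a 24-hour export layout of the form "dd/mm/yyyy, HH:MM - Author: message" (the real layout varies with the phone's locale) and assumes that media messages appear as the placeholder "<Media omitted>". The names parse_chat and summarize are hypothetical.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout: "dd/mm/yyyy, HH:MM - Author: message" (varies by locale).
LINE_RE = re.compile(r"^(\d{1,2}/\d{1,2}/\d{4}), (\d{1,2}:\d{2}) - (.*)$")
MEDIA_PLACEHOLDER = "<Media omitted>"  # assumed English-language placeholder


def parse_chat(text):
    """Turn exported chat text into a list of {when, author, message} dicts."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp: treat the line as a continuation of the previous message.
            if rows:
                rows[-1]["message"] += "\n" + line
            continue
        date_part, time_part, rest = match.groups()
        when = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M")
        # Simplification: a line without "Author: " is treated as a system notice.
        author, sep, message = rest.partition(": ")
        if not sep:
            author, message = None, rest
        rows.append({"when": when, "author": author, "message": message})
    return rows


def summarize(rows):
    """Count messages per author, date, and hour, plus media and author-less rows."""
    authored = [r for r in rows if r["author"] is not None]
    return {
        "by_author": Counter(r["author"] for r in authored),
        "by_date": Counter(r["when"].date() for r in rows),
        "by_hour": Counter(r["when"].hour for r in rows),
        "media_by_author": Counter(
            r["author"] for r in authored if r["message"] == MEDIA_PLACEHOLDER
        ),
        "without_author": [r["message"] for r in rows if r["author"] is None],
    }


sample = """05/01/2024, 09:15 - Messages and calls are end-to-end encrypted.
05/01/2024, 09:16 - Asha: Good morning
05/01/2024, 09:17 - Ravi: Morning!
See you at ten.
05/01/2024, 21:40 - Asha: <Media omitted>
06/01/2024, 21:05 - Asha: Done"""

stats = summarize(parse_chat(sample))
print(stats["by_author"].most_common())
print(sorted(stats["by_hour"].items()))
```

Each counter corresponds to one of the plots listed above; for example, by_hour holds the data for the hour vs message count graph, and without_author holds the messages that have no author.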
1.5. Objectives

 This project aims to provide a better understanding of various types of chats.
This analysis provides better input to machine learning models that explore the
chat data; such models require proper learning instances to achieve good accuracy.
Our project provides an in-depth exploratory data analysis of various types of
WhatsApp chats.

 Sentiment Analysis: Determine the overall sentiment of the conversations, whether they
are positive, negative, or neutral. This can help in understanding the emotional tone of
the discussions.

 Topic Modeling: Identify and categorize the main topics or themes discussed in the
conversations. This can be achieved through techniques such as topic modeling
algorithms (e.g., Latent Dirichlet Allocation).

 User Behavior Analysis: Analyze the behavior of individual users within the chat,
including the frequency of messages, response times, and participation levels. This can
provide insights into user engagement.

 Keyword Extraction: Extract relevant keywords or phrases that are frequently used in
the conversations. This can help in identifying key topics or trends.

 Named Entity Recognition (NER): Recognize and classify named entities such as
people, locations, organizations, and dates mentioned in the conversations. This can
assist in understanding the context of the discussions.

 Anomaly Detection: Identify unusual patterns or outliers in the chat data. This can be
useful for detecting potential issues or abnormal behavior within the group.

 Language Understanding: Develop a model that understands the language nuances,
abbreviations, and emojis commonly used in WhatsApp chats to improve the accuracy
of analysis.

 Time Series Analysis: Explore how conversation patterns change over time, identifying
peak activity periods, and understanding the dynamics of the group or individual
interactions.

 User Profiling: Create profiles for individual users based on their language use, topics
of interest, and overall behavior in the chat. This can be valuable for personalized
insights.

 Privacy Considerations: Implement measures to respect privacy and ensure that
sensitive information is not exposed during the analysis process.
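
As an illustration of the keyword extraction objective listed above, the following minimal sketch counts the most frequent words after dropping common stop words. The stop-word list is a tiny hand-written assumption made for this example (a real analysis would use a fuller list, such as the one shipped with nltk), and the function name top_keywords is hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; an assumption, not the project's actual list.
STOP_WORDS = {"the", "is", "on", "a", "an", "and", "to", "of", "in", "out", "at"}


def top_keywords(messages, n=3):
    """Return the n most frequent non-stop-words as (word, count) pairs."""
    words = []
    for message in messages:
        for word in re.findall(r"[a-z']+", message.lower()):
            # Skip stop words and very short tokens such as "ok" or "hm".
            if word not in STOP_WORDS and len(word) > 2:
                words.append(word)
    return Counter(words).most_common(n)


messages = [
    "The exam is on Monday",
    "Is the exam syllabus out?",
    "Monday exam, bring the syllabus",
]
print(top_keywords(messages))
```

The same word counts can also feed a word cloud, since a word cloud is drawn from word frequencies.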
Diagrammatic representation of the “Project Work Flow”
2. LITERATURE SURVEY
2.1 Feasibility Study

2.1.1. Technical Feasibility

The technical feasibility study reports whether the resources and technologies required
for project development are available. It is a measure of the specific technical solution
and of the availability of technical resources and expertise. In our project we use
Jupyter Notebook (a web-based application) and VS Code (a text editor), both of which
are open source software, along with various Python libraries.

2.1.2. Economical Feasibility


The costs and benefits of the project are analyzed in economic feasibility, that is,
what the final cost of developing the product will be. This project has no
development cost, since all the software and technologies used are open source. This
project is not economical as it mainly depends on the analysis of data between two
or more devices.

2.1.3. Operational Feasibility


Operational feasibility determines whether the system will be used after development and
implementation. It analyzes the degree to which the system serves the requirements, which
involves studying the utilization and performance of the product. Our project shows the
whole analysis of a chat among people, whether two people or a group, and presents the
information using charts in an easily readable format.
3. REQUIREMENT ANALYSIS
 Describe the logical and physical characteristics of each interface between our
software and the hardware components of the system.
 Hardware required: any device that supports a web browser.
 Supported device types: the software is developed for Windows (32-bit/64-bit)
and Android.
3.5.2 Software specification
The software depends on the following libraries:
 Streamlit
 Pandas
 Wordcloud
 Plotly
 Matplotlib
 Emoji
4. DESIGN
4.1. Software Requirement Specification

A software requirement specification (SRS) is a technical specification of the requirements for a software
product. An SRS presents an overview of the product and its features, and summarizes the processing
environments for development, operation, and maintenance of the product.

Requirement Specification:

Conceptually, every SRS should have the following components:

 Functionality
 Performance
 Design constraints imposed on implementation
 External interfaces

4.1.1 Use Case Model

 In the use case diagram the actor is the User.
 The user can use the Upload chat use case to give input to the system.
 The Select time format use case describes how the user specifies the time format of the
file to the system.
 The Select user use case is used to select whose analysis result is desired.
 The user can use the Show analysis use case to see the result of the entire analysis done by the
system.
4.2. Class Diagram

Description

The class diagram has the following two classes with their respective attributes and methods:
 DataFrame
o Attributes: user, message, date, time, year, month, day, dayname,
dayofweek, weeknum, hour, minute, meridian
o Methods: separateDateTime
 Generate report
o Attributes: selectedUser, message, dataFrame, timeFormat
o Methods: fetch_stats, chat_form, most_talkative, hourly_timeline,
daily_timeline, weekly_timeline, most_busy_day, most_busy_month,
create_wordcloud
The DataFrame class creates the Generate report class, so the DataFrame class includes
the Generate report class.
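
The two classes can be sketched as follows. This is a minimal illustration and not the project's implementation: the class names ChatRow and ReportGenerator, the "Overall" option, and the choice to derive the date fields from a single timestamp are assumptions made for the example, and only two of the listed methods are shown.

```python
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


@dataclass
class ChatRow:
    """One row of the DataFrame class: a message plus its derived date fields."""
    user: str
    message: str
    timestamp: datetime
    year: int = field(init=False)
    month: int = field(init=False)
    day: int = field(init=False)
    dayname: str = field(init=False)
    dayofweek: int = field(init=False)
    weeknum: int = field(init=False)
    hour: int = field(init=False)
    minute: int = field(init=False)
    meridian: str = field(init=False)

    def __post_init__(self):
        self.separate_date_time()

    def separate_date_time(self):
        """Split the timestamp into the individual attributes of the diagram."""
        ts = self.timestamp
        self.year, self.month, self.day = ts.year, ts.month, ts.day
        self.dayofweek = ts.weekday()           # Monday is 0
        self.dayname = DAY_NAMES[self.dayofweek]
        self.weeknum = ts.isocalendar()[1]      # ISO week number
        self.hour, self.minute = ts.hour, ts.minute
        self.meridian = "AM" if ts.hour < 12 else "PM"


class ReportGenerator:
    """Generate report class: statistics for one user or for the whole chat."""

    def __init__(self, rows, selected_user="Overall"):
        self.selected_user = selected_user
        self.rows = [r for r in rows
                     if selected_user == "Overall" or r.user == selected_user]

    def most_talkative(self):
        counts = Counter(r.user for r in self.rows)
        return counts.most_common(1)[0][0] if counts else None

    def most_busy_day(self):
        counts = Counter(r.dayname for r in self.rows)
        return counts.most_common(1)[0][0] if counts else None


rows = [
    ChatRow("Asha", "Good morning", datetime(2024, 1, 5, 9, 16)),
    ChatRow("Ravi", "Morning!", datetime(2024, 1, 5, 9, 17)),
    ChatRow("Asha", "Done", datetime(2024, 1, 6, 21, 5)),
]
report = ReportGenerator(rows)
print(report.most_talkative(), report.most_busy_day())
```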
4.3. Activity Diagram
 When the initial activity starts, the user uploads the file as input; in the next
action the time format is selected.
 The decision box, Check chat format, represents the validity check of the time format of
the file.
 If the time format is correct, the analysis is done and the process ends.
 If the time format is wrong, the user has to check again for the correct format.
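
The decision step can be sketched as a check of one timestamp from the uploaded chat against the format the user selected. This is a minimal sketch: the option names "12-hour" and "24-hour", the two format strings, and the function names are assumptions made for the example, since the report does not list the formats the tool offers.

```python
from datetime import datetime

# Hypothetical options offered to the user, mapped to strptime patterns.
TIME_FORMATS = {"12-hour": "%I:%M %p", "24-hour": "%H:%M"}


def time_format_matches(sample_time, selected_format):
    """Return True if sample_time parses under the selected format."""
    try:
        datetime.strptime(sample_time.strip(), TIME_FORMATS[selected_format])
    except ValueError:
        return False
    return True


def next_action(first_timestamp, selected_format):
    """Mirror the decision box: analyse on a match, otherwise ask again."""
    if time_format_matches(first_timestamp, selected_format):
        return "run analysis"
    return "select time format again"


print(next_action("9:16 PM", "12-hour"))
print(next_action("21:16", "12-hour"))
```

Checking only the first timestamp keeps the validation cheap; a stricter variant could test every line before starting the analysis.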
5. SYSTEM MODELING
5.1. Sequence Diagram
 The sequence diagram starts with Upload chat in the front end. Check time format is
then executed, matching the time format of the uploaded chat against the time format
the user selected. The request then goes to the server, which performs the analysis
and sends the result back to the user end.
 If the time format of the chat and the time format the user selected do not match, an
"invalid time format selected" error is displayed.

5.2. Collaboration Diagram


 This collaboration diagram shows the relationship between the objects in a system.
 An object consists of several features. Multiple objects present in the system are
connected to each other.

Figure: 5.2 COLLABORATION DIAGRAM


5.3. Conceptual Level State Diagram

 The state diagram starts with the uploading of the file. In the next state the time
format is selected; if the time format is valid, the analysis is done in the following
state. The analysis state completes when the overall result is shown on the user
interface.
 In the analysis state the user can select whose analysis he or she wants to see,
which leads to the corresponding next state, Display result.

Figure: 5.3 STATE CHART
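
The states described in this section can be written down as a small transition table. The sketch below is only an illustration of the diagram: the state names follow the text, while the event names (file_uploaded, format_valid, format_invalid, user_selected) are hypothetical labels chosen for the example.

```python
from enum import Enum, auto


class State(Enum):
    UPLOAD_FILE = auto()
    SELECT_TIME_FORMAT = auto()
    ANALYSIS = auto()
    DISPLAY_RESULT = auto()


# (current state, event) -> next state; event names are assumptions.
TRANSITIONS = {
    (State.UPLOAD_FILE, "file_uploaded"): State.SELECT_TIME_FORMAT,
    (State.SELECT_TIME_FORMAT, "format_valid"): State.ANALYSIS,
    (State.SELECT_TIME_FORMAT, "format_invalid"): State.SELECT_TIME_FORMAT,
    (State.ANALYSIS, "user_selected"): State.DISPLAY_RESULT,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"no transition from {state.name} on {event!r}") from None


state = State.UPLOAD_FILE
for event in ["file_uploaded", "format_invalid", "format_valid", "user_selected"]:
    state = next_state(state, event)
    print(event, "->", state.name)
```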

5.4. Conceptual Level Component Diagram

WhatsApp Chat Analyzer has the following components:

 Chat export, which is connected to the other components via the input file.
 The data of the input file is accessed by the following components:
 Top stats
 Message sent
 Most busy
 Most common word
 Emoji analysis
 Sentiment analysis
Figure: 5.4 Component Diagram
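
Two of the components listed above, Top stats and Sentiment analysis, can be sketched as follows. This is a hedged illustration only. The report does not say which figures Top stats reports, so the four counts here (messages, words, media, links) are an assumption, as is the "<Media omitted>" placeholder. The sentiment function is a toy word-list classifier standing in for whichever sentiment algorithm the project uses, and its word lists are invented for the example.

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")
MEDIA_PLACEHOLDER = "<Media omitted>"  # assumed placeholder for media messages

# Toy lexicons, invented for this example only.
POSITIVE = {"good", "great", "happy", "thanks", "love"}
NEGATIVE = {"bad", "sad", "late", "sorry", "hate"}


def top_stats(messages):
    """Headline counts for a list of message strings."""
    text_messages = [m for m in messages if m != MEDIA_PLACEHOLDER]
    return {
        "messages": len(messages),
        "words": sum(len(m.split()) for m in text_messages),
        "media": len(messages) - len(text_messages),
        "links": sum(len(URL_RE.findall(m)) for m in messages),
    }


def sentiment_label(message):
    """Label a message positive, negative, or neutral by counting lexicon hits."""
    tokens = re.findall(r"[a-z']+", message.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def sentiment_shares(messages):
    """Fraction of messages per label, suitable as input to a pie chart."""
    labels = Counter(sentiment_label(m) for m in messages)
    total = sum(labels.values())
    if total == 0:
        return {}
    return {name: labels[name] / total for name in ("positive", "negative", "neutral")}


messages = [
    "Good morning",
    "Sorry, running late",
    "<Media omitted>",
    "Slides: https://example.com/a thanks",
]
print(top_stats(messages))
print(sentiment_shares(messages))
```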

5.5. Deployment Diagram


 The deployment diagram has two nodes: the browser, which is on the user end, and the
server.
 Both nodes connect using the HTTP protocol.
 The browser node contains the user interface, while all operations such as
managing and analysing are done on the server node.
 The data generated is sent to the browser node for the user.

Figure: 5.5 Deployment Diagram


5.6. Methodology and Implementation Phase
5.6.1. Methodology

First, a simple working system implementing only a few basic features is built and delivered to the
customer. Thereafter, many successive iterations are implemented and delivered to the customer until
the desired system is realised. At any time, the plan is made only for the next increment and not for
the long term, so it is easier to modify the version as per the needs of the customer. The development
team first undertakes to develop the core features of the system.

5.6.2. Implementation
Python is a general-purpose and very popular programming language. It is used in
web development, machine learning applications, and many other cutting-edge
technologies in the software industry. Python is also well suited to beginners.
1. Python is currently one of the most widely used multi-purpose, high-level
programming languages.
2. Python allows programming in both object-oriented and procedural paradigms.
3. Python programs are generally shorter than equivalent programs in languages like
Java. Programmers have to type relatively little, and the language's indentation
requirement keeps the code readable.
4. Python is used by many technology companies such as Google, Amazon, Facebook,
Instagram, Dropbox, and Uber.
5.6.2.1. Software requirements for developing application
 Jupyter notebook
 VS code
Technologies
 Python and its libraries (streamlit)
 ML algorithm
 NLTK
5.7. Testing
Testing is the major quality control measure used during software development. Its
basic function is to detect errors in the software. During requirement analysis and
design, the output is a document that is usually textual and non-executable. After the
coding phase, a computer program is available that can be executed for testing
purposes.

5.7.1. Testing Objectives


 To check if the application is working as expected.
 To check the errors of different scenarios by using different cases.

5.7.2. Testing Methods & Strategies used along with Test Data
Software Testing Strategies: Software testing is defined as an activity to check
whether the actual results match the expected results and to ensure that the software
system is defect free. It involves executing a software or system component
to evaluate one or more properties of interest. Software testing also helps to identify
errors, gaps, or missing requirements in contrast to the actual requirements. It can be
done either manually or using automated tools.
In simple terms, software testing means verification of the Application Under Test (AUT).
Two types of testing are considered:
1. Functional Testing
2. Non-Functional Testing
1. Functional Testing
Functional testing is defined as a type of testing which verifies that each function of the
software application operates in conformance with the requirement specification. This
testing involves checking the user interface, APIs, database, security, client/server
applications, and functionality of the Application Under Test. The testing can be done either
manually or using automation.

2. Non-Functional Testing
Non-functional testing is defined as a type of software testing that checks the
non-functional aspects of a software application. It is designed to test the readiness
of a system against non-functional parameters that are never addressed by functional
testing. A good example of a non-functional test is checking how many people can log
in to the software simultaneously. Non-functional testing is as important as
functional testing and affects client satisfaction. It should improve usability,
efficiency, and maintainability.
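
A functional test of the kind described above can be written with Python's built-in unittest module. The sketch below is an assumption-laden example, not the test suite of the project: parse_line is a hypothetical helper that splits one exported chat line, and the line layout "dd/mm/yyyy, HH:MM - User: message" is assumed.

```python
import re
import unittest

# Assumed line layout: "dd/mm/yyyy, HH:MM - User: message".
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}), (\d{2}:\d{2}) - ([^:]+): (.*)$")


def parse_line(line):
    """Split one chat line into its fields, or raise ValueError if malformed."""
    match = LINE_RE.match(line)
    if match is None:
        raise ValueError(f"not a chat line: {line!r}")
    date, time, user, message = match.groups()
    return {"date": date, "time": time, "user": user, "message": message}


class ParseLineTests(unittest.TestCase):
    def test_valid_line_is_split_into_fields(self):
        row = parse_line("05/01/2024, 09:16 - Asha: Good morning")
        self.assertEqual(row["user"], "Asha")
        self.assertEqual(row["time"], "09:16")
        self.assertEqual(row["message"], "Good morning")

    def test_message_may_contain_a_colon(self):
        row = parse_line("05/01/2024, 09:16 - Asha: Note: bring ID")
        self.assertEqual(row["message"], "Note: bring ID")

    def test_malformed_line_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_line("this is not a chat line")
```

Saved in a file, these tests can be run with "python -m unittest". Each test compares the actual result with the expected result for one scenario, which is the activity defined at the start of this section.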
6. CONCLUSION & FUTURE WORK
6.1. Conclusion

6.2. Limitation of Project


 The maximum size of an uploaded file is 200 MB.
 Only the English language is supported.
 Only files with the .txt extension are supported.
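
The file size and extension limits above can be expressed as a simple pre-upload check. This is a minimal sketch: the function name validate_upload is hypothetical, and it assumes that 200 MB means 200 × 1024 × 1024 bytes.

```python
from pathlib import Path

# Assumption: "200 MB" is interpreted as 200 * 1024 * 1024 bytes.
MAX_UPLOAD_BYTES = 200 * 1024 * 1024


def validate_upload(filename, size_bytes):
    """Return a list of problems; an empty list means the file is acceptable."""
    problems = []
    if Path(filename).suffix.lower() != ".txt":
        problems.append("only .txt chat exports are supported")
    if size_bytes > MAX_UPLOAD_BYTES:
        problems.append("file is larger than 200 MB")
    return problems


print(validate_upload("chat.txt", 1024))
print(validate_upload("chat.zip", MAX_UPLOAD_BYTES + 1))
```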

6.3. Future Enhancement Suggestions


 Add multiple languages for analysis.
7. BIBLIOGRAPHY & REFERENCES
7.1. References

Available from: WhatsApp: number of monthly active users 2020 | Statista.


7.2. Screenshots
