FYP Final Document

Web App to Classify Shopify User Reviews
using Textual Features
Abdul Samad (BSE173024)

Waleed Khalid (BSE173016)
Supervised By
DR. SHAHID IQBAL
BS Software Engineering
Department of Computer Science
Capital University of Science & Technology, Islamabad
Capital University of Science & Technology, Islamabad
Department of Software Engineering
Submission Form for Final-Year
PROJECT REPORT
VERSION: V 3.0
NUMBER OF MEMBERS: 2
TITLE: Web App to Classify Shopify User Reviews using Textual Features
SUPERVISOR’S NAME: Dr. M. Shahid Iqbal Malik
MEMBER NAME REG. NO. EMAIL ADDRESS
Abdul Samad BSE173024 Bse173024@cust.pk
Waleed Khalid BSE173016 Bse173016@cust.pk
MEMBERS’ SIGNATURES
Supervisor’s Signatures
APPROVAL CERTIFICATE
This project, entitled “Web App to Classify Shopify User Reviews using Textual
Features”, has been approved for the award of
Bachelor of Engineering in Software Engineering
Committee Signatures:
Supervisor: __________________________
(Dr. M. Shahid Iqbal Malik)
Project Coordinator: __________________________
(Mr. Ibrar Arshad)
Head of Department: __________________________
(Dr. Nadeem Anjum)
DECLARATION
We hereby declare that no portion of the work referred to in this project has been
submitted in support of an application for another degree or qualification of this or any other
university, institute, or other institution of learning. It is further declared that this
undergraduate project, neither as a whole nor in part, has been copied from any
source, except where references have been provided.
MEMBERS’ SIGNATURES
Contents
Chapter 1
1.1 Project Introduction
1.2 Problem Statement
1.3 Business Scope
1.4 Objectives
1.5 Useful Tools and Technologies
1.6 Project Work Break Down
1.7 Project Time Lapse
Chapter 2
Requirement Specification and Analysis
2.1. Functional Requirements
2.2. Non-Functional Requirements
2.3. Use Case Modeling
2.4. Use Case Diagram
2.5. Use Case Descriptions
2.5.1. User Registration Use Case Description
2.5.2. User Login Use Case Description
2.5.3. Select Dataset Use Case Description
2.5.4. View Dataset Use Case Description
2.5.5. Select feature extraction method Use Case Description
2.5.6. Select Preprocessing technique Use Case Description
2.5.7. Select Classifier Use Case Description
2.5.8. Select Validation Technique Use Case Description
2.5.9. Select Evaluation Metric Use Case Description
2.5.10. View History Use Case Description
2.5.11. Enter unseen text Use Case Description
Chapter 3
System Design
3.1. Layer Definition
3.1.1. Presentation Layer
3.1.2. Business Logic Layer
3.2. System Design Diagrams
3.2.1. High Level Design
3.2.2. System Sequence Diagrams
3.2.2.1. User Register SSD
3.2.2.2. User Login SSD
3.2.2.3. Load Dataset SSD
3.2.2.4 View Dataset SSD
3.2.2.5 Applying Feature SSD
3.2.2.6 Apply Preprocessing SSD
3.2.2.7 Apply Classification SSD
3.2.2.10 Test Trained Model SSD
3.2.2.11 Logout SSD
3.3 Domain Model
3.4 Flow Chart
3.4.1 User Registration Flow Chart
3.4.2 User Login Flow Chart
3.4.3 Features Computation Flow Chart
3.4.4 Pre-Processing Flow Chart
3.4.5 Machine Learning Modeling Flow Chart
3.4.6 Evaluation Metrics Flow Chart
3.4.7 Validation Method Flow Chart
3.4.8 Save Model Flow Chart
3.4.9 Test Saved Model Flow Chart
3.5 User Interface Design
Chapter 4
Software Development
4.1. Coding Standards
4.1.1 Indentation
4.1.2 Declaration
4.1.3 Statement Standards
4.1.4 Naming Convention
4.2 Front End Development Environment
4.3 Back End Development Environment
4.4 Software Description
Chapter 5
Software Testing
5.1 Testing Methodology
5.2 Test Cases
5.2.1 User Registration Test case
5.2.2 User Login Test case
5.2.3 Choose Dataset Test case
5.2.4 View Dataset Test Case 1
5.2.5 View Dataset Test Case 2
5.2.6 Train Model Test Case
5.2.7 Apply Feature Extraction method on Dataset Test Case
5.2.8 Apply Part of speech on Dataset Test Case
5.2.9 Remove Special Characters Test Case
5.2.10 Apply Preprocessing Technique on Dataset Test Case
5.2.11 View processed Feature data Test Case
5.2.12 View processed Feature data Test Case
5.2.13 Moving to Classifier Test case
5.2.14 Machine Learning Model Test case
5.2.15 Evaluation Metrics Test case
5.2.16 Apply Classifier Test case
5.2.17 Save Model Test Case 1
5.2.18 Save Model Test Case 2
5.2.19 Test Model Test case
5.2.20 Test Model Test Case 2
5.2.21 Unseen Prediction Test Case 1
5.2.24 User Logout Test case
5.2.25 Contact Us Test Case
5.2.26 Contact Us Test Case
5.2.27 About Page Test case
Chapter 6
Software Deployment
6.1 Installation / Deployment Process Description
Chapter 7
REPORT APPROVAL CERTIFICATE
References
Figures:
Figure 1 Work Breakdown Chart
Figure 2 Project Time-Lapse
Figure 3 Use case Diagram
Figure 4 User Registration SSD
Figure 5 User Login SSD
Figure 6 Load Data SSD
Figure 7 View Dataset SSD
Figure 8 Feature Computation SSD
Figure 9 Preprocessing Technique SSD
Figure 10 Select Classification SSD
Figure 11 Save Result SSD
Figure 12 View History SSD
Figure 13 Test Model SSD
Figure 14 Logout SSD
Figure 15 Domain Model
Figure 16 Flow Chart
Figure 17 User Registration Flow Chart
Figure 18 User Login Flow Chart
Figure 19 Features computation flowchart
Figure 20 Pre-Processing Flow Chart
Figure 21 Machine Learning Modeling Flow Chart
Figure 22 Evaluation Metrics Flow Chart
Figure 23 Validation Method Flow Chart
Figure 24 Save model Flow Chart
Figure 25 Test Saved Model Flow Chart
Figure 26 Signup Page Interface
Figure 27 Login Page Interface
Figure 28 About Page Interface
Figure 29 Dataset Selection Interface
Figure 30 View Dataset Interface
Figure 31 Features selection Interface
Figure 32 Data Preprocessing Interface
Figure 33 Classifier Selection Interface
Figure 34 Classifier Result Interface
Figure 35 History Interface
Figure 36 Unseen Review Interface
Figure 37 Text Result Interface
Figure 38 Text Result Interface
Tables:
Table 1 Functional Requirements
Table 2 Non-Functional Requirements
Table 3 User Registration Use Case Description
Table 4 User Login Use Case Description
Table 5 Select Dataset Use Case Description
Table 6 View Dataset Use Case Description
Table 7 Select feature extraction method Use Case Description
Table 8 Selection of preprocessing technique Use Case Description
Table 9 Select Classifier Use Case Description
Table 10 Select validation technique Use Case Description
Table 11 Select Evaluation Metric Use Case Description
Table 12 View History Use Case Description
Table 13 Enter unseen text Use Case Description
Table 14 Layers Definition
Table 15 User Registration Test case
Table 16 User Login Test Case
Table 17 Choose Dataset Test Case
Table 18 View Dataset Test Case
Table 19 View Dataset Test Case
Table 20 Train Model Test Case
Table 21 Apply Feature Extraction method on Dataset
Table 22 Apply Part of speech on Dataset Test Case
Table 23 Remove special characters Test Case
Table 24 Apply Preprocessing Technique on Dataset Test Case
Table 25 View processed Feature data Test Case
Table 26 View processed Feature data Test Case
Table 27 Moving to Classifier Test Case
Table 28 Machine Learning Model Test Case
Table 29 Evaluation Metrics Test Case
Table 30 Apply classifier Test Case
Table 31 Save Model Test Case
Table 32 Save Model Test Case
Table 33 Test model Test Case
Table 34 Test model Test Case
Table 35 Unseen Prediction Test Case
Table 39 Display Button Test Case
Table 40 Display Button Test Case
Table 41 About Page Test Case
Table 42 Project Evaluation Guidelines
Chapter 1
This chapter gives a brief summary of the project scope and the project specification. It
describes the existing system and the technologies used to develop the software, and it
presents the project timeline and the work breakdown structure.
1.1 Project Introduction

App stores usually allow users to give reviews and ratings that developers use to resolve
issues and make plans for their apps. In this way, app stores collect large amounts of data
for analysis. However, analyzing such feedback is challenging because of its volume and
redundancy. Therefore, our work investigates an efficient way to analyze such feedback and
to classify Shopify app reviews. It uses different machine learning approaches, combined with
different feature engineering techniques, to solve the review classification problem.
Classifiers such as Naive Bayes and Random Forest were trained on text reviews to predict
whether a user’s review of a Shopify app is happy or unhappy. [1]
1.2 Problem Statement

Several challenges related to the redundancy and volume of the data must be addressed before
the reviews can be analyzed with machine learning. This study conducts experiments on datasets
containing reviews of Shopify apps. To overcome these challenges, we first categorize user
reviews into two groups, happy and unhappy, and then preprocess the reviews to clean the
data. At a later stage, several feature engineering techniques, such as bag-of-words and term
frequency-inverse document frequency (TF-IDF), are used singly and in combination to preserve
meaningful information. Finally, the random forest and logistic regression models are used to
classify the reviews as happy or unhappy. The experiments reveal that a combination of features
can improve a machine learning model’s performance. [1]
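The idea of using bag-of-words and TF-IDF features "singly and in combination" can be sketched in a few lines of plain Python. This is only an illustration, not the project's implementation: the toy reviews, the helper names (tokenize, build_vocabulary, bag_of_words, tf_idf, combined_features) and the particular TF-IDF weighting (tf = count / document length, idf = log(N / df)) are all assumptions made for the example.

```python
import math
from collections import Counter

# Toy reviews invented for illustration; they are not from the project dataset.
reviews = [
    "great app easy to use",
    "app keeps crashing bad support",
    "great support great app",
]

def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()

def build_vocabulary(docs):
    """Return the sorted list of distinct tokens across all documents."""
    return sorted({token for doc in docs for token in tokenize(doc)})

def bag_of_words(doc, vocabulary):
    """Raw term counts of `doc`, one entry per vocabulary term."""
    counts = Counter(tokenize(doc))
    return [counts[term] for term in vocabulary]

def tf_idf(doc, docs, vocabulary):
    """TF-IDF weights with tf = count / len(doc) and idf = log(N / df).

    Assumes `doc` is non-empty and that the vocabulary was built from `docs`,
    so every term occurs in at least one document (df >= 1).
    """
    tokens = tokenize(doc)
    counts = Counter(tokens)
    n_docs = len(docs)
    weights = []
    for term in vocabulary:
        df = sum(1 for d in docs if term in tokenize(d))
        tf = counts[term] / len(tokens)
        weights.append(tf * math.log(n_docs / df))
    return weights

def combined_features(doc, docs, vocabulary):
    """Concatenate the bag-of-words and TF-IDF vectors for one document."""
    return bag_of_words(doc, vocabulary) + tf_idf(doc, docs, vocabulary)

vocabulary = build_vocabulary(reviews)
features = [combined_features(r, reviews, vocabulary) for r in reviews]
```

Each row of features is twice as long as the vocabulary: the first half holds the raw counts and the second half the TF-IDF weights. A classifier trained on such rows sees both representations at once, which is what "in combination" means here.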
1.3 Business Scope
Amazon Shopify app reviews will be used as the dataset, and the system will classify reviews
into two categories (Happy/Unhappy). The app will be useful to sellers, who can improve their
products to increase future sales, and to customers, who can decide whether or not to buy a
particular product.
1.4 Objectives
This project has the following objectives:
• To help developers resolve problems and make plans for their apps.
• To compute textual features.
• To apply machine learning algorithms for classification.
• To evaluate model performance using 10-fold cross-validation and the hold-out method.
• To present model performance using evaluation metrics such as accuracy, precision,
recall, and F-measure.
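The validation methods and evaluation metrics named in these objectives can be illustrated with a short, library-free Python sketch. It is an assumption-laden example rather than the project's code: the function names (confusion_counts, evaluate, hold_out_split, k_fold_splits), the 20% hold-out fraction, the choice of "happy" as the positive class, and the toy labels are all hypothetical.

```python
def confusion_counts(y_true, y_pred, positive="happy"):
    """Count true/false positives and negatives for one positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

def evaluate(y_true, y_pred, positive="happy"):
    """Return accuracy, precision, recall and F-measure as a dictionary."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}

def hold_out_split(n_samples, test_fraction=0.2):
    """Single train/test split: the last `test_fraction` of indices is held out."""
    n_test = max(1, round(n_samples * test_fraction))
    indices = list(range(n_samples))
    return indices[:-n_test], indices[-n_test:]

def k_fold_splits(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of k contiguous folds."""
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# Toy labels invented for illustration.
y_true = ["happy", "happy", "unhappy", "unhappy", "happy"]
y_pred = ["happy", "unhappy", "unhappy", "happy", "happy"]
scores = evaluate(y_true, y_pred)
```

With k-fold validation every sample is used for testing exactly once, whereas the hold-out method tests on a single fixed portion. Neither helper shuffles the data, so in practice the samples would normally be shuffled before splitting.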
1.5 Useful Tools and Technologies
PyCharm is an integrated development environment used in computer programming,
specifically for the Python language. [2]
Python is an interpreted high-level general-purpose programming language.
Python's design philosophy emphasizes code readability with its notable use of
significant indentation. [3]
The Hypertext Markup Language, or HTML, is the standard markup language for
documents designed to be displayed in a web browser. It can be assisted by
technologies such as Cascading Style Sheets and scripting languages. [4]
Django is a high-level Python web framework that encourages rapid
development and clean, pragmatic design. Built by experienced developers, it
takes care of much of the hassle of web development. [5]
JavaScript is a scripting or programming language that allows you to
implement complex features on web pages, that is, whenever a web page does more
than just sit there and display static information. [6]
SQLite is a relational database management system contained in a C library. In
contrast to many other database management systems, SQLite is not a client–
server database engine. Rather, it is embedded into the end program. SQLite
generally follows PostgreSQL syntax. [7]
1.6 Project Work Break Down
Figure 1 Work Breakdown Chart
1.7 Project Time Lapse
Figure 2 Project Time-Lapse
Chapter 2
Requirement Specification and Analysis
Requirements analysis is the process of determining user expectations for a new or modified
product. These expectations, called requirements, must be quantifiable, relevant, and detailed.
In software engineering, such requirements are often called functional specifications. In this
chapter we list the functional and non-functional requirements and model the functional
requirements in the form of a use case model.
2.1. Functional Requirements

Functional requirements define functionalities of a system or its components. Functional
requirements may be calculations, technical details, data manipulation and processing and other
specific functionality that define what a system is supposed to accomplish.
Table 1 Functional Requirements
S. No. Functional Requirement Type Status
1. User can register to the application. Core Completed
2. User can log in to the application. Core Completed
3. User can select Shopify product dataset. Core Completed
4. User can view dataset. Intermediate Completed
5. User can select training using Part of Speech features. Core Completed
6. User can select training using Bag of Words features. Core Completed
7. User can select training using TF-IDF (term frequency-inverse document frequency) features. Core Completed
8. User can select training using discrete positive emotion features. Intermediate Completed
9. User can select training using discrete negative emotion features. Intermediate Completed
10. User can select training using polarity features. Core Completed
11. User can select training using sentiment features. Core Completed
12. User can select all features for training. Core Completed
13. User can select lemmatization and stop words removal preprocessing technique. Core Completed
14. User can select stop words removal preprocessing technique. Core Completed
15. User can select special character removal preprocessing technique. Core Completed
16. User can select random forest (RF) machine learning model for experiment. Core Completed
17. User can select Naïve Bayes machine learning model for experiment. Core Completed
18. User can select Accuracy evaluation metric for experiment. Core Completed
19. User can select Precision evaluation metric for experiment. Core Completed
20. User can select Recall evaluation metric for experiment. Core Completed
21. User can select F-measure evaluation metric for experiment. Core Completed
22. User can select 10-fold cross validation technique for experiment. Core Completed
23. User can select hold-out method for experiment. Intermediate Completed
24. User can save trained model. Core Completed
25. User can test model by entering unseen text and predict label. Core Completed
2.2. Non-Functional Requirements

Following is the list of the non-functional requirements.
Table 2 Non-Functional Requirements
S. No. | Non-Functional Requirement | Category
1. | The user should reach the classified text with one button press, if possible. | Usability
2. | The system should also be user friendly for admins, because an admin may be anyone rather than a programmer. | Usability
3. | The system will predict the class label (happy/unhappy) with maximum accuracy. | Accuracy
4. | The application is built on review features and machine learning techniques; therefore, no fixed reliability percentage can be measured. | Reliability
5. | Computation time and response time should be as short as possible, because saving time is one of the software's features. The whole cycle of classifying a dataset should not take more than 40 seconds. | Performance
6. | After unseen review text is entered, the system should classify it within the defined time. | Accuracy
2.3. Use Case Modeling

A use case depicts how actors interact with the system. Use case modeling is a methodology used in system analysis to identify, clarify and organize system requirements. A use case is made up of a set of possible sequences of interactions between the system and its users in a particular environment, related to a particular goal. The following use case diagram depicts how our system works.
2.4. Use Case Diagram:
Figure 3 Use case Diagram
2.5. Use Case Descriptions
2.5.1. User Registration Use Case Description:
Table 3 User Registration Use Case Description
Use Case ID | UC 3
UC Name | User Registration
Actors | User, Database
Description | User must register in the application.
Trigger | Registration button
Pre-condition | User must not be registered previously.
Post-condition | User will be registered in the system.
Basic Flow | User: 1. User must enter the credentials. | System: User data will be stored in the database.
Alternative Flow | 1. User must first visit the website. 2. User must then click on register to perform registration.
2.5.2. User Login Use Case Description

Table 4 User Login Use Case Description
Use Case ID | UC 1
UC Name | User Login
Actors | User, Database
Description | User must log in to the application.
Trigger | Login button
Pre-condition | User must be registered.
Post-condition | User will be logged in to the system.
Basic Flow | User: 1. User will enter the application. | System: User will be redirected to the home page.
Alternative Flow | 1. User must first register in the application.
2.5.3. Select Dataset Use Case Description

Table 5 Select Dataset Use Case Description
Use Case ID | UC 4
UC Name | Select Dataset
Actors | User
Description | User must select the dataset from the given options.
Trigger | Dropdown menu
Pre-condition | User must select dataset.
Post-condition | System will load the dataset.
Basic Flow | User: 1. User must select the dataset. | System: System will load the dataset.
Alternative Flow | 1. Selected dataset had some miscellaneous information. 2. During loading of the dataset, the system stops responding.
2.5.4. View Dataset Use Case Description
Table 6 View Dataset Use Case Description
Use Case ID | UC 5
UC Name | View Dataset
Actors | User
Description | User can view the selected dataset.
Trigger | View button
Pre-condition | User must have selected a dataset.
Post-condition | User is able to view the selected dataset.
Basic Flow | User: 1. User will view the loaded dataset. | System: System displays the selected dataset.
Alternative Flow | 1. Selected dataset had some miscellaneous information. 2. During loading of the dataset, the system stops responding.
2.5.5. Select feature extraction method Use Case Description

Table 7 Select feature extraction method Use Case Description
Use Case ID | UC 6
UC Name | Select feature extraction method
Actors | User
Description | User must select one of the feature extraction methods from the given features.
Trigger | Radio button
Pre-condition | User must load the dataset.
Post-condition | Feature selection is successfully marked.
Basic Flow | User: 1. User will select one of the feature extraction methods. | System: System will note which feature extraction method to apply.
Alternative Flow | 1. Service may not be available. 2. System may crash while posting the request. 3. Data may not be loaded into the system.
2.5.6. Select Preprocessing technique Use Case Description

Table 8 Selection of preprocessing technique Use Case Description
Use Case ID | UC 7
UC Name | Select preprocessing technique
Actors | Users
Description | User must select one of the preprocessing techniques.
Trigger | Radio button
Pre-condition | User must select a feature extraction method.
Post-condition | Preprocessing and feature extraction will be applied to the selected dataset.
Basic Flow | User: 1. User must select one of the preprocessing techniques. | System: System will apply the preprocessing and feature extraction method on the selected dataset.
Alternative Flow | 1. User may select the wrong feature and preprocessing method. 2. Service is not currently available.
2.5.7. Select Classifier Use Case Description

Table 9 Select Classifier Use Case Description
Use Case ID | UC 8
UC Name | Select Classifier
Actors | Users
Description | User must select one of the machine learning models.
Trigger | Dropdown menu
Pre-condition | Preprocessing and feature extraction must be applied.
Post-condition | System will apply the machine learning model on the dataset.
Basic Flow | User: 1. User must select one of the machine learning models. | System: System will apply that model on the selected dataset.
Alternative Flow | 1. Preprocessing and feature extraction method may not be applied. 2. System may not be responding.
2.5.8. Select Validation Technique Use Case Description
Table 10 Select validation technique Use Case Description
Use Case ID | UC 9
UC Name | Select validation technique
Actors | Users
Description | User can select one or more validation techniques to evaluate the performance of the classifiers.
Trigger | Check boxes
Pre-condition | User must select one of the machine learning models.
Post-condition | System displays the different results in graphs.
Basic Flow | User: 1. User can select one or multiple validation techniques. | System: System will display the results in graphs.
Alternative Flow | 1. User may not select any of the given options. 2. System is not responding.
2.5.9. Select Evaluation Metric Use Case Description

Table 11 Select Evaluation Metric Use Case Description
Use Case ID | UC 10
UC Name | Select Evaluation metric
Actors | User
Description | User can view the quality of the machine learning model.
Trigger | Drop-down menu
Pre-condition | User must select one of the machine learning models.
Post-condition | System will display the results as graphs.
Basic Flow | User: 1. User must select one of the evaluation metrics. | System: System will display the results in the form of graphs.
Alternative Flow | 1. Unknown error occurred while displaying graphs. 2. System not responding.
2.5.10. View History Use Case Description

Table 12 View History Use Case Description
Use Case ID | UC 11
UC Name | View History
Actors | User
Description | User can view the history of trained models.
Trigger | Button
Pre-condition | Some history must exist.
Post-condition | User will see the full history of model training.
Basic Flow | User: 1. User will train a model and then go to history. | System: System displays the history of trained models.
Alternative Flow | 1. An unknown error occurred while displaying the list. 2. System may not be responding at the moment. 3. No past history.
2.5.11. Enter unseen text Use Case Description

Table 13 Enter unseen text Use Case Description
Use Case ID | UC 12
UC Name | Enter Unseen Text
Actors | User
Description | User can predict the label of a review.
Trigger | Text field
Pre-condition | User must train a model.
Post-condition | User will be able to view the result.
Basic Flow | User: 1. User must enter an unseen review. | System: System will display the result.
Alternative Flow | 1. An unknown error occurred while updating the status. 2. System may not be responding.
Chapter 3
System Design
The purpose of this chapter is to provide information that is complementary to the development
phase. Without an adequate design, that delivers required function as well as quality attributes,
the project will fail. However, communicating architecture to its stakeholders is as important a
job as creating it in the first place.
3.1. Layer Definition

Table 14 Layers Definition
Layers | Description
Presentation Layer | This layer is used for interaction with the user through a graphical user interface.
Business Logic Layer | This layer contains the business logic. All the constraints and the majority of the functions reside in this layer.
3.1.1. Presentation Layer

The presentation layer occupies the top level and displays information related to the services available on the website. This tier communicates with the other tiers by sending results to the browser and to the other tiers in the network.
3.1.2. Business Logic Layer

The business logic layer, also called the application layer, middle tier or logic tier, is pulled from the presentation tier. It controls application functionality by performing detailed processing.
3.2. System Design Diagrams

System design is divided into two parts:
3.2.1. High Level Design
High-level design provides a view of the system at an abstract level. It shows how the major
pieces of the finished application will fit together and interact with each other. The high-level
design does not focus on the details of how the pieces of the application will work. Those details
can be worked out later during low-level design and implementation.
3.2.2. System Sequence Diagrams

A system sequence diagram (SSD) is a sequence diagram that shows, for a particular scenario of a use case, the events that external actors generate, their order, and possible inter-system events.
3.2.2.1. User Register SSD

The User Registration System Sequence Diagram shows the flow of activities that takes place when a user registers with the system.
Figure 4 User Registration SSD
3.2.2.2. User Login SSD

The Login System Sequence Diagram shows the flow of activities that takes place when a user tries to log in.
Figure 5 User Login SSD
3.2.2.3. Load Dataset SSD

The Load Dataset System Sequence Diagram shows that the user first selects the dataset type and then uploads the dataset from the device.
Figure 6 Load Data SSD
3.2.2.4 View Dataset SSD

The View Dataset System Sequence Diagram shows that a user who wants to view the dataset can do so by sending a request to the system.
Figure 7 View Dataset SSD
3.2.2.5 Applying Feature SSD

The Compute Feature System Sequence Diagram shows the flow of activities that takes place when a user wants to compute features of the given dataset.
Figure 8 Feature Computation SSD
3.2.2.6 Apply Preprocessing SSD
The Applying Pre-processing System Sequence Diagram shows that the user selects one of the pre-processing techniques, and that technique is then applied to the given dataset.
Figure 9 Preprocessing Technique SSD
3.2.2.7 Apply Classification SSD

The Applying Classifier System Sequence Diagram shows that the user first selects the machine learning model from the given options, then selects the evaluation metrics and the validation technique, and finally applies the classifier, after which the system displays the result of the model.
Figure 10 Select Classification SSD
3.2.2.8 Save Result SSD
The Save Result System Sequence Diagram shows that, once the result of the model is displayed, the model is saved in the system.
Figure 11 Save Result SSD
3.2.2.9 View History SSD
The View History System Sequence Diagram shows the flow of activities that takes place when a user tries to view the history of saved models.
Figure 12 View History SSD
3.2.2.10 Test Trained Model SSD
The Test Saved Model System Sequence Diagram shows that, to test a saved model, the user enters an unseen review and the system then displays the predicted result.
Figure 13 Test Model SSD
3.2.2.11 Logout SSD

The Logout System Sequence Diagram shows the flow of activities that takes place when a user tries to log out.
Figure 14 Logout SSD
3.3 Domain Model

The Domain Model is the organized and structured knowledge of the problem. It should represent the vocabulary and key concepts of the problem domain, and it should identify the relationships among all of the entities within the scope of the domain.
In our system we have twelve entities. The user entity is used to register and log in to the system, and the train model entity is used to train the model. To train a model we first need a dataset, so we have a choose dataset entity. After choosing the data we have to apply a feature extraction method and a preprocessing technique, so we have a feature extraction entity and a preprocessing entity. After this we need the machine learning model, evaluation metrics and validation technique entities. After getting the result of the trained model we have to store that model in our database, so we have a save model entity. We can also test our model by giving it an unseen review, so we have a test model entity; the system then gives us the prediction result, so we also have a prediction entity.
Figure 15 Domain Model
(Note: We removed the Software Architecture diagram, Class diagram, Sequence diagram and ER diagram and added a detailed flow chart, because our panel and supervisor suggested that a detailed flow chart was needed and those diagrams were not.)
3.4 Flow Chart

• First, the user registers with the system.
• Then the user logs in to the system.
• After login, the system takes the user to the home page.
• The user then selects the dataset and can optionally view it.
• Next, the user selects one of the feature extraction methods from the given options.
• After selecting the feature, the user selects the preprocessing techniques from the given options.
• The user can also view the data to which the feature extraction method and preprocessing techniques have been applied.
• Then the user selects one machine learning model from the given options.
• Then the user selects the evaluation metrics from the given options.
• The user also selects one of the validation techniques from the given options.
• When the user applies these to the dataset, the system trains the model and displays the results for the trained model. The user can also save the trained model in the system.
• The user can go to the History tab and view all the saved trained models.
• From the History tab, the user can test a saved trained model by giving it an unseen review.
• The system then predicts the class label of the unseen review (Happy/Unhappy) using the selected trained model.
Figure 16 Flow Chart
3.4.1 User Registration Flow Chart

User will Register by giving following credentials:
➢ Name
➢ Username
➢ Email
➢ Password
➢ Confirm Password
Figure 17 User Registration Flow Chart
3.4.2 User Login Flow Chart

User will log in by giving the following credentials:
➢ Username
➢ Password
Figure 18 User Login Flow Chart
3.4.3 Features Computation Flow Chart

User will select one of the feature extraction methods from the given options, e.g.
➢ Bag of words.
➢ Part of speech tagging.
➢ Discrete positive emotion.
➢ Discrete negative emotion.
➢ Sentiment.
➢ Polarity.
➢ Term frequency-inverse document frequency (TF-IDF).
Figure 19 Features computation flowchart
3.4.4 Pre-Processing Flow Chart

User will select preprocessing techniques from the given options, e.g.
➢ Stopwords Removal.
➢ Stopwords Removal and Special Character Removal.
➢ Stopwords Removal and Lemmatization.
➢ Stopwords Removal, Special Character Removal and Lemmatization.
Figure 20 Pre-Processing Flow Chart
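As a minimal sketch of how the options above could be chained, the following Python snippet applies special character removal and stop word removal to one review. The small stop word list and the function name preprocess_review are assumptions made only for illustration; the project's own Pre_Processing module is not reproduced here, and lemmatization is left out because it needs an external lexicon such as WordNet.

```python
import re

# Illustrative stop word list (an assumption); a full list would come from a library such as NLTK.
STOP_WORDS = {"a", "an", "and", "is", "it", "the", "this", "to", "was"}


def preprocess_review(text, remove_special=True, remove_stopwords=True):
    """Return the tokens left after the selected pre-processing steps."""
    text = text.lower()
    if remove_special:
        # Replace every character that is not a letter, digit or underscore with a space.
        text = re.sub(r"\W", " ", text)
    tokens = text.split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return tokens


tokens = preprocess_review("This app is GREAT!!! It saved the day.")
print(tokens)
```

Switching remove_special off keeps the punctuation attached to the tokens, which is the difference between the first two options in the list.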
3.4.5 Machine Learning Modeling Flow Chart

User will select one machine learning model from the given options:
➢ Naïve Bayes
➢ Random Forest
Figure 21 Machine Learning Modeling Flow Chart
3.4.6 Evaluation Metrics Flow Chart
User will select evaluation metrics from the given options:
➢ Accuracy
➢ F-measure
➢ Precision
➢ Recall
Figure 22 Evaluation Metrics Flow Chart
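To make the four metrics concrete, the sketch below computes them from true and predicted labels for a two-class problem. It treats "happy" as the positive class, which is an assumption for illustration; the function name evaluation_metrics and the sample labels are hypothetical and are not results from the project. The project code itself relies on scikit-learn with weighted averaging, so its numbers can differ from this positive-class form.

```python
def evaluation_metrics(y_true, y_pred, positive="happy"):
    """Accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F-measure is the harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels used only to exercise the function.
y_true = ["happy", "happy", "unhappy", "unhappy", "happy"]
y_pred = ["happy", "unhappy", "unhappy", "happy", "happy"]
metrics = evaluation_metrics(y_true, y_pred)
print(metrics)
```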
3.4.7 Validation Method Flow Chart

User will also select one of the validation techniques from the given options, e.g.
➢ 10-Fold Cross Validation Method
➢ Hold-Out Method
Figure 23 Validation Method Flow Chart
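The difference between the two validation methods can be sketched in a few lines of Python. The hold-out method makes a single split into training and test rows, while 10-fold cross validation uses every row for testing exactly once. The function names below are hypothetical, and the 30% test share is taken from the test_size=0.3 setting used in the classifier module; the project itself relies on scikit-learn for both methods.

```python
import random


def hold_out_split(n_samples, test_size=0.3, seed=0):
    """Shuffle the row indices once and cut them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=10):
    """Yield (train, test) index lists; each row is in exactly one test fold."""
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


train_rows, test_rows = hold_out_split(20)
folds = list(k_fold_splits(25, k=10))
print(len(train_rows), len(test_rows), len(folds))
```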
3.4.8 Save Model Flow Chart

User will save the trained model into the system with the following details:
➢ Current date and time
➢ Dataset Name
➢ Feature Computation
➢ Pre-Processing Technique
➢ Machine Learning Model
➢ Accuracy
➢ Precision
➢ Recall
➢ F-measure
➢ Validation Technique
Figure 24 Save model Flow Chart
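One possible way to hold these details together is sketched below: a small record type with one field per item in the list, plus the serialized model. This is only an illustration under stated assumptions. The class SavedModelRecord, the helper make_record and the use of pickle are hypothetical and do not describe the project's actual Django models, and the metric values are placeholders rather than measured results.

```python
import pickle
from dataclasses import dataclass, asdict
from datetime import datetime


@dataclass
class SavedModelRecord:
    """Hypothetical record mirroring the details listed above."""
    saved_at: datetime
    dataset_name: str
    feature_computation: str
    preprocessing: str
    ml_model: str
    validation_technique: str
    accuracy: float
    precision: float
    recall: float
    f_measure: float
    model_blob: bytes  # the trained model, serialized with pickle


def make_record(model, **details):
    """Serialize the model and stamp the record with the current date and time."""
    return SavedModelRecord(saved_at=datetime.now(),
                            model_blob=pickle.dumps(model), **details)


# A plain dict stands in for a trained classifier; all values are placeholders.
stand_in_model = {"kind": "placeholder"}
record = make_record(
    stand_in_model,
    dataset_name="example_dataset",
    feature_computation="Bag of words",
    preprocessing="Stopwords Removal",
    ml_model="Naive Bayes",
    validation_technique="Hold-Out Method",
    accuracy=0.0, precision=0.0, recall=0.0, f_measure=0.0,
)
restored_model = pickle.loads(record.model_blob)
print(asdict(record)["ml_model"], restored_model)
```

Storing the serialized model next to its settings means a history entry carries everything needed to test the model later on an unseen review.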
3.4.9 Test Saved Model Flow Chart

User will test the saved model by giving unseen review.
Figure 25 Test Saved Model Flow Chart
3.5 User Interface Design
Figure 26 Signup Page Interface
Figure 27 Login Page Interface
Figure 28 About Page Interface
Figure 29 Dataset Selection Interface
Figure 30 View Dataset Interface
Figure 31 Features selection Interface
Figure 32 Data Preprocessing Interface
Figure 33 Classifier Selection Interface
Figure 34 Classifier Result Interface
Figure 35 History Interface
Figure 36 Unseen Review Interface
Figure 37 Text Result Interface
Chapter 4
Software Development
4.1. Coding Standards
4.1.1 Indentation
Proper code indentation is used in this project. Indenting blocks of code enhances readability and understandability and makes the hierarchy of the lines of code visible.
4.1.2 Declaration
In this project we have used one declaration per line to increase the clarity and understandability of the code. The following order of declaration is used:
• All the widgets are imported at the beginning.
• Class variables follow the sequence: first public, then protected, then private.
• Instance variables follow the sequence: first public, then private.
• Class constructors are then declared with proper names.
• Class methods are grouped by functionality rather than by scope or accessibility, to make the code easier to read and understand.
• Local variables are declared only at the beginning of the code, after the packages and libraries have been imported.
4.1.3 Statement Standards

Each line of code contains one declaration at most. Compound statements in this project
contain lines of code enclosed in braces. The inner block of code of compound statements
begins after the opening braces from next line. Proper indentation is also followed for lines of
codes inside the compound statements. Proper braces are used in code around all statements
such as if-else, try-catch etc.
4.1.4 Naming Convention
Proper naming conventions are followed in the implementation of this project, which makes the programs easier to read and understand.
While implementing this project, we have used words from natural language (English) to assign understandable names to classes, variables and methods, such as Requests, DocumentCollection and BasicInformation, instead of cryptic names like myc method, a1 or b1.
Terminology applicable to the domain of the project is used. This means that if the user refers to Email as Registration Number, then the term Registration Number is used.
Mixed case is used to make names readable: names are written in lower case in general, with the first letter of class names and interface names capitalized.
4.2 Front End Development Environment
The Hypertext Markup Language, or HTML is the standard markup language

for documents designed to be displayed in a web browser. It can be assisted
by technologies such as Cascading Style Sheets and scripting languages such
as JavaScript. [4]
JavaScript is a scripting or programming language that allows you to implement complex features on web pages. It is at work whenever a web page does more than just sit there and display static information for you to look at, for example by displaying timely content updates or interactive maps. [5]
4.3 Back End Development Environment
PyCharm is an integrated development environment used in computer

programming, specifically for the Python language. [2]
Python is an interpreted high-level general-purpose programming
language. Python's design philosophy emphasizes code readability with
its notable use of significant indentation. [3]
Django is a high-level Python web framework that encourages rapid

development and clean, pragmatic design. Built by experienced
developers, it takes care of much of the hassle of web development. [6]
SQLite is a relational database management system contained in a C

library. In contrast to many other database management systems, SQLite
is not a client–server database engine. Rather, it is embedded into the
end program. SQLite generally follows PostgreSQL syntax. [7]
4.4 Software Description

Module Classifier
Code
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn import svm
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
# from mlxtend.classifier import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn import model_selection
from sklearn import tree
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
import time
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import cross_val_score
from sklearn.metrics import (precision_score, recall_score, confusion_matrix,
                             classification_report, accuracy_score, f1_score)


def get_preprocessing(pre_processing):
    # Map the pre-processing option to the suffix used in the feature file names
    if pre_processing == 'Stopwords Removal':
        return "a1"
    elif pre_processing == 'Stopwords + Special Characters':
        return "a2"
    else:
        return "a3"


def read_dataset(dataset_name, feature_type, pre_processing):
    # Translate the pre-processing option into its file-name suffix (a1, a2 or a3)
    pre_processing = get_preprocessing(pre_processing)
    if feature_type.lower() == 'part of speech tagging':
        dataset = pd.read_csv("features/feature/part_of_speech/" + dataset_name.lower() +
                              "_pos_" + pre_processing.lower() + '.csv', encoding="ISO-8859-1")
        dataset.drop('text', axis=1, inplace=True)
        dataset.drop('Tweet #', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
        print(dataset.head())
    elif feature_type.lower() == 'bag of words technique':
        dataset = pd.read_csv("features/feature/bag_of_words/" + dataset_name.lower() +
                              "_bog_" + pre_processing.lower() + '.csv')
        dataset.drop('Tweet #', axis=1, inplace=True)
    elif feature_type.lower() == 'tf-idf technique':
        dataset = pd.read_csv("features/feature/tfidf/" + dataset_name.lower() +
                              "_tf_idf_" + pre_processing.lower() + '.csv')
        # Drop the first and the last column of the TF-IDF table
        dataset.drop(dataset.columns[[0, -1]], axis=1, inplace=True)
    elif feature_type.lower() == 'unigram':
        # Note: the 'unigram' option loads the polarity feature files
        dataset = pd.read_csv("features/feature/polarity/" + dataset_name.lower() +
                              "_polarity_" + pre_processing.lower() + '.csv')
    elif feature_type.lower() == 'sentiment':
        dataset = pd.read_csv("features/feature/sentiment/" + dataset_name.lower() +
                              "_sentiment_" + pre_processing.lower() + '.csv')
    else:
        raise Exception('Unknown Feature Type')
    return dataset


def generate_random_forest(dataset):
    label_Label = LabelEncoder()
    # Convert the text labels into numbers
    dataset["label"] = label_Label.fit_transform(dataset['label'])
    X = dataset.drop("label", axis=1)
    y = dataset['label']
    # Hold-out split: 70% of the rows for training, 30% for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)
    start = time.time()
    classifier = RandomForestClassifier(n_estimators=42, criterion='entropy')
    classifier.fit(X_train, y_train)
    y_pred = classifier.predict(X_test)
    # 10-fold cross validation on the full, unscaled feature table
    scores = cross_val_score(classifier, X, y, cv=10)
    print(classification_report(y_test, y_pred))
    print("Random Forest accuracy after 10 fold CV: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)
          + ", " + str(round(time.time() - start, 3)) + "s")
    print("******************************")
    print("******************************")
    print("******************************")
    # Accuracy is the mean of the 10 cross-validation scores
    accuracy = scores.mean()
    print('scores.mean:', accuracy)
    print("______________________________")
    # Precision, recall and F1 score are computed on the 30% hold-out split
    precision = precision_score(y_test, y_pred, average='weighted')
    print('Precision:', precision)
    print("______________________________")
    recall = recall_score(y_test, y_pred, average='weighted')
    print('Recall:', recall)
    print("______________________________")
    f1score = f1_score(y_test, y_pred, average='weighted')
    print('F1 score:', f1score)
    print("______________________________")
    print("*" * 90)
    return accuracy, precision, recall, f1score, classifier


def generateNaiveBayes(dataset):
    start = time.time()
    label_Label = LabelEncoder()
    # Convert the text labels into numbers
    dataset["label"] = label_Label.fit_transform(dataset['label'])
    X = dataset.drop("label", axis=1)
    y = dataset['label']
    # Hold-out split: 70% of the rows for training, 30% for testing
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    nb = GaussianNB()
    nb.fit(X_train, y_train)
    y_pred = nb.predict(X_test)
    # 10-fold cross validation on the full feature table
    scores = cross_val_score(nb, X, y, cv=10)
    print(classification_report(y_test, y_pred))
    print("Naive Bayes accuracy after 10 fold CV: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)
          + ", " + str(round(time.time() - start, 3)) + "s")
    print("******************************")
    print("******************************")
    print("******************************")
    # All four returned metrics are computed on the 30% hold-out split
    accuracy = accuracy_score(y_test, y_pred)
    print('Accuracy:', accuracy)
    print("______________________________")
    precision = precision_score(y_test, y_pred, average='weighted')
    print('Precision:', precision)
    print("______________________________")
    recall = recall_score(y_test, y_pred, average='weighted')
    print('Recall:', recall)
    print("______________________________")
    f1score = f1_score(y_test, y_pred, average='weighted')
    print('F1 score:', f1score)
    print("______________________________")
    return accuracy, precision, recall, f1score, nb


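if __name__ == "__main__":
    print('NB')
    # read_dataset() takes a dataset name, a feature type and a pre-processing option.
    # '<dataset name>' is a placeholder for the name of one of the stored feature files.
    dataset = read_dataset('<dataset name>', 'sentiment', 'Stopwords Removal')
    # generateNaiveBayes() returns the four metrics and the fitted model
    accuracy, precision, recall, f1score, model = generateNaiveBayes(dataset)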
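In the module above, the 10-fold cross validation score covers accuracy only, while precision, recall and F1 score come from the single 70/30 hold-out split. As a hedged sketch of an alternative, scikit-learn's cross_validate can report all four metrics on the same ten folds. The synthetic data from make_classification is an assumption that stands in for one of the project's feature tables, and the random_state values are added only to make the sketch repeatable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate

# Synthetic stand-in for a feature table (assumption; not project data).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = RandomForestClassifier(n_estimators=42, criterion="entropy", random_state=0)
scoring = ["accuracy", "precision_weighted", "recall_weighted", "f1_weighted"]

# One call evaluates every metric on the same 10 folds.
results = cross_validate(model, X, y, cv=10, scoring=scoring)

# Average each metric over the folds.
summary = {name: results["test_" + name].mean() for name in scoring}
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Reporting all metrics from the same folds keeps the numbers shown to the user comparable with each other.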
Module Feature Computation

Code
Part-of-Speech Tagging
A Part-Of-Speech Tagger (POS Tagger) is a piece of software that reads text in some language and assigns a part of speech, such as noun, verb or adjective, to each word (and other token). The following code computes the POS features.
CODE
import nltk
from nltk import word_tokenize, pos_tag
from nltk.corpus import wordnet as wn
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import xlrd
import xlwt
import re
from collections import Counter
from nltk.stem import WordNetLemmatizer

wordnet_lemmatizer = WordNetLemmatizer()
# Column headers: one column per part-of-speech tag (index 0 is kept for the text)
prepList = ["", "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "LS", "MD", "NN",
            "NNS", "NNP", "NNPS", "PDT", "POS", "PRP", "PRP$", "RB", "RBR", "RBS", "RP",
            "RP", "TO", "UH", "VB", "VBD", "VBG", "VBN", "VBP", "VBZ", "WDT", "WP", "WP$",
            "WRB"]
# Read the XLSX-formatted source file (note: xlrd 2.0 and later can only read .xls files)
loc = (r"C:\CUST 07\FYP\POS tagging\2\abc.xlsx")
wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
wbWrite = xlwt.Workbook()
style = xlwt.easyxf('font: bold 1')
sheetToWrite = wbWrite.add_sheet('Part_of_speech')
# Build the stop word set once instead of reloading the list for every token
stop_words = set(stopwords.words())


def process1():
    try:
        # Write the tag names as the header row
        for i in range(1, len(prepList)):
            sheetToWrite.write(0, i, prepList[i], style)
        for read in range(sheet.nrows):
            tempIndexList = []
            txt1 = sheet.cell_value(read, 0)
            # Remove special characters from the source text
            tokenizer = RegexpTokenizer(r'\w+')
            x = tokenizer.tokenize(txt1)
            print(x)
            # Stop word removal
            tokens_without_sw = [word for word in x if word not in stop_words]
            print(tokens_without_sw)
            # Lemmatization: the lemmas are printed for inspection only;
            # tagging below runs on the tokens as they are
            for token in tokens_without_sw:
                token = wordnet_lemmatizer.lemmatize(token, pos="v")
                print(token)
            tagged = nltk.pos_tag(tokens_without_sw)
            counts = Counter(tag for word, tag in tagged)
            print(counts)
            # xlwt cannot write a list to a cell, so the tokens are joined into one string
            sheetToWrite.write(read + 1, 0, " ".join(tokens_without_sw))
            for i in counts.elements():
                if i in prepList:
                    column = prepList.index(i)
                    if column not in tempIndexList:
                        sheetToWrite.write(read + 1, column, counts[i])
                        tempIndexList.append(column)
            # Write 0's to columns with no values
            for i in range(1, len(prepList)):
                if i not in tempIndexList:
                    sheetToWrite.write(read + 1, i, 0)
        wbWrite.save("POS Count.xls")
        # print(nltk.help.upenn_tagset())
    except Exception as e:
        print(str(e))


process1()
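The core of the POS feature is the step that turns the tags of one review into a fixed-length row of counts. The sketch below isolates that step with the standard library only. The short TAG_COLUMNS list and the hand-written tags are assumptions chosen for illustration; they are not the output of a tagger, and the function name pos_count_vector is hypothetical.

```python
from collections import Counter

# A small subset of Penn Treebank tags, chosen only to keep the example short.
TAG_COLUMNS = ["JJ", "NN", "RB", "VB", "VBD"]


def pos_count_vector(tagged_tokens, columns=TAG_COLUMNS):
    """Count each tag and return the counts in a fixed column order."""
    counts = Counter(tag for _word, tag in tagged_tokens)
    # Tags that do not occur in the review get a count of 0.
    return [counts.get(tag, 0) for tag in columns]


# Hand-written (word, tag) pairs standing in for the result of a POS tagger.
tagged = [("great", "JJ"), ("app", "NN"), ("saved", "VBD"), ("time", "NN")]
vector = pos_count_vector(tagged)
print(vector)
```

Because every review is mapped onto the same columns, the rows can be stacked into one feature table for the classifier.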
Bag-of-words :
A bag-of-words model, or BoW for short, is a way of extracting features from text for use in modeling, such as with machine learning algorithms. The approach is very simple and flexible, and can be used in a myriad of ways for extracting features from documents.
Code
import re

import nltk
import pandas as pd
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

from n_gram import output_to_csv
from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization

#wordnet_lemmatizer = WordNetLemmatizer()
#def stopword_rem(token):
#    tokens_without_sw = [word for word in token if not word in stopwords.words()]
#    return tokens_without_sw
#def lemmitization(token):
#    token = wordnet_lemmatizer.lemmatize(token, pos="v")
#    return token

def main():
    Review_df = pd.read_csv("C:/FYP/POS tagging/bagofwords/abc.csv")
    texts_list = Review_df['text'].tolist()
    # texts_list[0] = "Playing...."
    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        # Replace every non-word character (e.g. "!", "?") with a space
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        # Collapse runs of white-space characters into a single space
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])
        # TODO Number remove
    bag_of_words_list = []
    count = 0
    for sentence in texts_list:
        wordfreq = {}
        # List of words/tokens
        tokens = nltk.word_tokenize(sentence)
        # Stop-word removal
        ftoken = stopword_rem(tokens)
        """
        Example: for the tokens ['The', 'The', 'Samad'], wordfreq becomes
        {'The': 2, 'Samad': 1}.
        sentence_1 = ['The', 'The', 'Samad']
        sentence_2 = ['The', 'BAG', 'Samad']
        bag_of_words_list holds one such dictionary per sentence: [{}, {}, {}]
        """
        for token in ftoken:
            # Lemmatize one token at a time
            token = lemmitization(token)
            #token = wordnet_lemmatizer.lemmatize(token, pos="v")
            if token not in wordfreq.keys():
                wordfreq[token] = 1
            else:
                wordfreq[token] += 1
        count += 1
        bag_of_words_list.append(wordfreq)
    output_to_csv('bag_of_words_output.csv', bag_of_words_list, Review_df)

if __name__ == "__main__":
    main()
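To make the idea of the listing above concrete, the following is a minimal, self-contained sketch of a bag-of-words representation. It is not the project code: it assumes plain white-space tokenization instead of `nltk.word_tokenize`, skips stop-word removal and lemmatization, and the function name `bag_of_words` and the two sample reviews are made up for illustration.

```python
import re
from collections import Counter

def bag_of_words(texts):
    """Return (vocabulary, vectors): one count vector per text over a shared vocabulary."""
    tokenized = []
    for text in texts:
        # Lower-case and replace runs of non-word characters with a single space.
        cleaned = re.sub(r"\W+", " ", text.lower())
        tokenized.append(cleaned.split())
    # The shared vocabulary fixes the column order of every vector.
    vocabulary = sorted({tok for tokens in tokenized for tok in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

vocab, vectors = bag_of_words(["The app is great, great support!", "The app is slow"])
print(vocab)
print(vectors)
```

Each review becomes one row of counts, which is the same shape that `output_to_csv` writes: one column per word and one row per review.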
TF-IDF
TF-IDF is a statistical measure that evaluates how relevant a word is to a document in a
collection of documents. ... It has many uses, most importantly in automated text analysis, and is
very useful for scoring words in machine learning algorithms for Natural Language Processing
(NLP).
Code
import math
import re

import nltk
import pandas as pd

from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization

punctuations = "?:!.,;"

def compute_tf(token):
    # Term frequency: occurrences of a word divided by the number of tokens in the document
    num_of_words = len(token)
    freq = {}
    tf = {}
    for word in token:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1
    for value in freq:
        tf[value] = freq[value] / num_of_words
    return tf, freq

def compute_idf(doc_list):
    # Inverse document frequency: log(N / number of documents containing the word)
    idf_dict = {}
    N = len(doc_list)  # [{}, {}, {}]
    for doc in doc_list:
        for word, val in doc.items():
            if val > 0:
                if idf_dict.get(word):
                    idf_dict[word] += 1
                else:
                    idf_dict[word] = 1
    for word, val in idf_dict.items():
        idf_dict[word] = math.log(N / float(val))
    return idf_dict

def compute_tf_idf(tf_list, idf):
    for tf_dict in tf_list:
        for word in tf_dict:
            # Tf = doc[word]
            # idf = idf[word]
            tf_dict[word] = tf_dict[word] * idf[word]
    return tf_list

def output_to_csv(file_name, data_list, review_df=None):
    df = pd.DataFrame(data_list)
    df = df.fillna(0)
    df.index.name = "Review #"
    if review_df is not None:
        df['text'] = review_df['text']
        # Move the 'text' column to the front
        cols = df.columns.tolist()
        cols = cols[-1:] + cols[:-1]
        df = df[cols]
    df.to_csv(file_name)

def main():
    texts_list = ["it is going to rain today",
                  "today i am not going outside",
                  "i am going to watch the season premiere"]
    # corpus = ['This is the first document.',
    #           'This document is the second document.',
    #           'And this is the third one.',
    #           'Is this the first document?',
    #           ]
    # train_set = ["sky is blue", "sun is bright", "sun in the sky is bright"]
    # reviews_df = pd.read_csv("abc.csv")
    # texts_list = reviews_df['text'].tolist()
    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])
    all_tfs = []
    all_freqs = []
    for text in texts_list:
        token = nltk.word_tokenize(text)
        # Remove punctuation (build a new list instead of removing while iterating)
        token = [word for word in token if word not in punctuations]
        # Lemmatization
        for i in range(len(token)):
            token[i] = lemmitization(token[i])
        tf, freq = compute_tf(token)
        all_tfs.append(tf)
        all_freqs.append(freq)
    idf = compute_idf(all_freqs)
    tfs_final = compute_tf_idf(all_tfs, idf)
    output_to_csv('tf_idf_output.csv', tfs_final, None)

if __name__ == '__main__':
    main()
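The calculation performed by `compute_tf`, `compute_idf` and `compute_tf_idf` can be condensed into one short function. The sketch below is an illustrative restatement, not the project code: it uses the same definitions (term frequency is the count divided by the document length, inverse document frequency is the natural logarithm of N divided by the number of documents containing the word), while the function name `tf_idf` and the white-space tokenization are assumptions made for brevity.

```python
import math
from collections import Counter

def tf_idf(documents):
    """documents: list of token lists. Returns one {word: score} dict per document."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each word.
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))
    scores = []
    for tokens in documents:
        counts = Counter(tokens)
        scores.append({
            word: (count / len(tokens)) * math.log(n_docs / doc_freq[word])
            for word, count in counts.items()
        })
    return scores

docs = [
    "it is going to rain today".split(),
    "today i am not going outside".split(),
    "i am going to watch the season premiere".split(),
]
scores = tf_idf(docs)
print(scores[0]["rain"])   # (1 / 6) * ln(3 / 1)
print(scores[0]["going"])  # ln(3 / 3) = 0, so the score is 0.0
```

With this definition, a word that appears in every document (here "going") always scores 0, because it does not help to tell the documents apart.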
Pre-processing
In pre-processing, we perform stop-word removal, special character removal, and lemmatization.
Code
from nltk import WordNetLemmatizer
from nltk.corpus import stopwords

wordnet_lemmatizer = WordNetLemmatizer()

def stopword_rem(token):
    tokens_without_sw = [word for word in token if not word in stopwords.words()]
    return tokens_without_sw

def lemmitization(token):
    token = wordnet_lemmatizer.lemmatize(token, pos="v")
    return token
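As an illustration of the stop-word and special character removal steps, the sketch below chains them in one function. It is a simplified sketch under stated assumptions: `STOP_WORDS` is a deliberately tiny, hypothetical set (the project uses NLTK's stop-word list), lemmatization is left out so that the example runs without NLTK data, and the name `preprocess` is not part of the project code. Building the stop-word set once, rather than calling `stopwords.words()` for every token as in the listing above, would probably also be noticeably faster on large datasets.

```python
import re

# Hypothetical, deliberately tiny stop-word set (assumption for illustration only).
STOP_WORDS = frozenset({"the", "is", "a", "an", "and", "to", "i", "this"})

def preprocess(text, stop_words=STOP_WORDS):
    """Lower-case the text, strip special characters, tokenize, and drop stop words."""
    # Special character removal: keep only letters, digits and white-space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    # Stop-word removal against a set, which gives constant-time lookups.
    return [tok for tok in cleaned.split() if tok not in stop_words]

tokens = preprocess("This app is GREAT!!! Easy to set-up & use.")
print(tokens)
```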
MODULE WEB
APP CODE
Home.html
<!DOCTYPE html>
 <html class="no-js"> 

<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title></title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T"
crossorigin="anonymous">
<style>
html, body {
max-width: 100%;
overflow-x: hidden;
}
</style>
</head>
<body>
<header id="main-header" class="py-2 bg-primary text-white" >

<div class="wrapper container">
<div class="row">
<div class="col-md-12 text-center">
<h1 ><i class="fa fa-gear"></i> Shopify User Reviews Classification</h1>
</div>
</div>
</div>
</header>
<nav class="navbar navbar-expand-sm navbar-dark bg-dark p-0">

<div class="container">
<button
class="navbar-toggler"
data-toggle="collapse"
data-target="#navbarNav"
>
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarNav">
<ul class="navbar-nav">
<li><a href="index2"><strong>Home</strong> </a></li>
<li><a href="dataset"><strong>Dataset Selection</strong> </a></li>
<li><a href="feature_selection" disabled=""><strong>Feature Extraction</strong> </a></li>
<li><a href="data_preprocessing" disabled=""><strong>Data Preprocessing</strong> </a></li>
<li><a href="classifier" disabled=""><strong>Classifier</strong> </a></li>
<li><a href="history" ><strong>History</strong> </a></li>
<li><a href="contact_us"><strong>Contact Us</strong> </a></li>
<li><a href="about"><strong>About Us</strong> </a></li>
<li><a href="login"><strong>Logout</strong> </a></li>
</ul>
<ul class="navbar-nav ml-auto">

<li class="nav-item dropdown mr-3">
<a
href="#"
class="nav-link dropdown-toggle"
data-toggle="dropdown"
>
<i class="fa fa-user"></i> Welcome
</a>
<div class="dropdown-menu">
<a href="/profile" class="dropdown-item">
<i class="fa fa-user-circle"></i> Profile
</a>
<a href="/settings" class="dropdown-item">
<i class="fa fa-gear"></i> Settings
</a>
</div>
</li>
<li class="nav-item">
<a href="/logout" class="nav-link">
<i class="fa fa-user-times"></i> Logout
</a>
</li>
</ul>
</div>
</div>
</nav>
<br><br>
<div class="row justify-content-center">
<div class="col-md-10 ">
<h1 align="center">Shopify User Reviews Classification</h1>
<br><br>
</div>
</div>
<footer class="footer fixed-bottom container" style="text-align: center;">
<hr>
<p>© 2021 Shopify User Reviews Classification, Inc.</p>
</footer>
<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js"
integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo"
crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js"
integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1"
crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js"
integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM"
crossorigin="anonymous"></script>
<script src="" async defer></script>

</body>
</html>
Header.html
<header id="main-header" class="py-2 bg-primary text-white" >

<div class="row">
<h1 ><i class="fa fa-gear"></i> Shopify User Reviews Classification</h1>
</div>
</div>
</div>
</header>

<nav class="navbar navbar-expand-lg navbar-light bg-light">
<div class="collapse navbar-collapse" id="navbarSupportedContent">

<ul class="navbar-nav mr-auto">
<li><a href="index"><strong>HOME</strong> </a></li>
<li><a href="dataset"><strong>DATASET SELECTION</strong> </a></li>
<li><a href="feature_selection"><strong>FEATURE EXTRACTION</strong> </a></li>
<li><a href="data_preprocessing"><strong>DATASET PREPROCESSING</strong> </a></li>
<li><a href="classifier"><strong>CLASSIFIER</strong> </a></li>
<li><a href="history"><strong>HISTORY</strong> </a></li>
<li><a href="unseen_review"><strong>UNSEEN REVIEW</strong> </a></li>
<li><a href="contact_us"><strong>CONTACT US</strong> </a></li>
<li><a href="about"><strong>ABOUT US</strong> </a></li>
</ul>
</div>
</nav>
</div></div>
</br></br>

<h1 align="center">About Us</h1>
</br></br>
<img src="waleed.png" />
<br><br>
<img src="samad.jpeg" alt="HTML5 Icon" width="128" height="128">
<br><br>
</div></div>
</br></br>
Preprocessing.html
</script>
<header id="main-header" class="py-2 bg-dark text-white" >

<div class="row" >

<img src="https://i.ibb.co/1sb0Rt1/logo.png" alt="logo" border="0" width="80px" height="80px"
align="left" >
<h1 style="padding-top:15px"><i class="fa fa-gear"></i> Classification of Shopify Apps User
Reviews</h1>
</div>
</div>
</div>
</header>
<div class="col-md-11.5 ">

<li><a href="index2"><strong><h6>Home</h6></strong> </a></li>
<li><a href="dataset"><strong><h6>Dataset Selection</h6></strong> </a></li>
<li><a href="feature_selection" ><strong><h6>Feature Extraction</h6></strong> </a></li>
<li><a href="data_preprocessing" ><strong><h6>Data Preprocessing</h6></strong> </a></li>
<li><a href="classifier" disabled=""><strong><h6>Classifier</h6></strong> </a></li>
<li><a href="history" ><strong><h6>History</h6></strong> </a></li>

<li><a href="contact_us"><strong><h6>Contact Us</h6></strong> </a></li>
<li><a href="about"><strong><h6>About Us</h6></strong> </a></li>
<li><a href="login"><strong><h6>Logout</h6></strong> </a></li>
</ul>
<input type="hidden" id="namea" name="variable" value="{{ test }}">

</div>
</nav>
</div></div>
</br></br>

<h1 align="center">Data Preprocessing</h1>
</div></div>
</br></br>

<div class="dropdown">
<h3>
Preprocessing {{ system }}
</h3>
Features selectio.html
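<script type="text/javascript">
// Restore the previously selected feature radio button, if one was saved.
function codeAddress() {
var myradioValue = localStorage.getItem("feature");
if (myradioValue !== null) {
$('input[name="feature"][value="' + myradioValue + '"]').prop('checked', true);
}
}
// Save the currently selected feature radio button before leaving the page.
function saveradio()
{
var radiovalue = $('input[name="feature"]:checked').val();
if (radiovalue !== undefined) {
localStorage.setItem("feature", radiovalue);
}
}
</script>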
</head>
<body onload="codeAddress();" onbeforeunload="saveradio();">
<header id="main-header" class="py-2 bg-dark text-white" >

<div class="row" >

<img src="https://i.ibb.co/1sb0Rt1/logo.png" alt="logo" border="0" width="80px" height="80px"
align="left" >
<h1 style="padding-top:15px"><i class="fa fa-gear"></i> Classification of Shopify Apps User
Reviews</h1>
</div>
</div>
</div>
</header>
<div class="col-md-11.5 ">
<li><a href="index2"><strong><h6>Home</h6></strong> </a></li>
<li><a href="dataset"><strong><h6>Dataset Selection</h6></strong> </a></li>
<li><a href="feature_selection" ><strong><h6>Feature Extraction</h6></strong> </a></li>
<li><a href="data_preprocessing" disabled=""><strong><h6>Data Preprocessing</h6></strong> </a></li>
<li><a href="classifier" disabled=""><strong><h6>Classifier</h6></strong> </a></li>
<li><a href="history" ><strong><h6>History</h6></strong> </a></li>

<li><a href="contact_us"><strong><h6>Contact Us</h6></strong> </a></li>
<li><a href="about"><strong><h6>About Us</h6></strong> </a></li>
<li><a href="login"><strong><h6>Logout</h6></strong> </a></li>
</ul>
</div>
</nav>
</div></div>
</br></br>

<h1 align="center">Feature Selection</h1>
</div></div>
</br></br>

<h3>Feature Extraction</h3>
<form method="post" id="myfeature" action="feature_extraction">

{% csrf_token %}
<input type="radio" id="Bag" name="feature" value="Bag Of Words">
<label for="Bag">Bag Of Words</label><br>
<input type="radio" id="part" name="feature" value="Part of Speech Tagging">
<label for="part">Part of Speech Tagging</label><br>
<input type="radio" id="tf" name="feature" value="TF-IDF">
<label for="tf">TF-IDF</label><br>
<input type="radio" id="pos" name="feature" value="Discrete Positive">
<label for="pos">Discrete Positive</label><br>
<input type="radio" id="neg" name="feature" value="Discrete negative">
<label for="neg">Discrete negative</label><br>
<input type="radio" id="Polarity" name="feature" value="Polarity">
<label for="Polarity">Polarity</label><br>
<input type="radio" id="sent" name="feature" value="Sentiments">
<label for="sent">Sentiments</label><br>
<input type="radio" id="all" name="feature" value="All">
<label for="all">All</label><br>
History.html
<table class="table table-striped">
<thead>
<tr>
<th scope="col">Data Type</th>
<th scope="col">Feature Extraction</th>
<th scope="col">Preprocessing</th>
<th scope="col">Machine Learning Model</th>
<th scope="col">Accuracy</th>
<th scope="col">F-Measure</th>
<th scope="col">Precision</th>
<th scope="col">Recall</th>
<th scope="col">Validation Technique</th>
<th scope="col">Action</th>
</tr>
</thead>
<tbody>
{% if dataset %}
{% for ml,prep,feat in dataset %}
<tr>

<td>Product Review</td>
<td>{{ feat.feature }}</td>
<td>{{ prep.prep }}</td>
<td>{{ ml.classifier }}</td>
<td>{{ ml.accuracy }}</td>
<td>{{ ml.fmeasure }}</td>
<td>{{ ml.precision }}</td>
<td>{{ ml.recall }}</td>
<td>{{ ml.val_tech }}</td>
<td>
<form method="post" action="delrec">
{% csrf_token %}
<input id="prodId" name="prodId" type="hidden" value="{{ ml.id }}">

<button onclick="return confirm('Are you sure you want to delete this?')" type="submit" class="delete-row" style="background-color: red;color:white">Delete</button><br><br>
</form>
<a href="unseen_review" style="background-color: #228B22">Test</a><br></td>
</tr>
{% endfor %}
{% endif %}
</tbody>
</table>
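The History page above lists accuracy, F-measure, precision and recall for every saved model. As a minimal sketch of how these four values relate to each other, the example below computes them from true and predicted labels. The function name `classification_metrics`, the sample labels and the choice of "Happy" as the positive class are illustrative assumptions and are not taken from the project code.

```python
def classification_metrics(y_true, y_pred, positive="Happy"):
    """Compute accuracy, precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "fmeasure": f_measure}

# Made-up labels for illustration only.
y_true = ["Happy", "Happy", "Unhappy", "Happy", "Unhappy"]
y_pred = ["Happy", "Unhappy", "Unhappy", "Happy", "Happy"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

The F-measure is the harmonic mean of precision and recall, so it is high only when both of them are high.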
Chapter 5
Software Testing
This chapter describes the adopted testing procedure, including the selected testing
methodology, the test suite, and the test results of the developed software.
5.1 Testing Methodology

After implementation, the process flow manager was tested for functional errors. We performed
Black Box Testing, which tests the functional requirements implemented in our system without
regard to the code, by passing randomly selected values and comparing the results against the
expected output of the normal flow. We also performed Unit and Integration Testing.
All test cases were executed manually, without the use of any testing tool.
5.2 Test Cases
5.2.1 User Registration Test case

Table 15 User Registration Test case
Date: 1/8/2021
System: Classify Shopify User Reviews
Objective: Registration Test ID: 2
Version: 2 Test Type: Black Box Testing
Input
Name=Abdul Samad
Username=abdul.samad
Email=bse173024@cust.pk
Password=173024
Confirm Password=173024
Expected Output
New user is registered into the system.
Actual Output
Registration completed successfully.
Expected Exceptions
Invalid email.
Password doesn’t match.
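Although the test cases in this chapter were executed manually, a black box case such as the one above can also be written as an automated check. The sketch below is purely illustrative: `validate_registration` is a hypothetical stand-in for the registration logic of the web app (whose real implementation is not shown in this chapter), and its error messages simply mirror the expected exceptions listed in the test cases.

```python
import re

def validate_registration(name, username, email, password, confirm_password):
    """Hypothetical validator: return a list of error messages (empty when valid)."""
    errors = []
    if not all([name, username, email, password, confirm_password]):
        errors.append("Empty field.")
    # Very loose e-mail shape check: something@something.something
    if email and not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        errors.append("Invalid email.")
    if password != confirm_password:
        errors.append("Password doesn't match.")
    return errors

# Normal flow, using the input values of the registration test case.
ok = validate_registration("Abdul Samad", "abdul.samad",
                           "bse173024@cust.pk", "173024", "173024")
# Exception flows: malformed e-mail and mismatching passwords.
bad_email = validate_registration("Abdul Samad", "abdul.samad",
                                  "bse173024cust.pk", "173024", "173024")
mismatch = validate_registration("Abdul Samad", "abdul.samad",
                                 "bse173024@cust.pk", "173024", "173025")
print(ok, bad_email, mismatch)
```

Each call corresponds to one row of a test table: the inputs are fixed, and only the returned messages are compared against the expected output.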
5.2.2 User Login Test case

Table 16 User Login Test Case
Date: 1/8/2021
Objective: Login Test ID: 3
Version: 1 Test Type: Black Box Testing
Input
Username=abdul.samad
Password=173024
Expected Output
New user is registered into the system.
Actual Output
Registration Successful.
Expected Exceptions
Invalid email.
Password doesn’t match.
Empty Field.
5.2.3 Choose Dataset Test case

Table 17 Choose Dataset Test Case
Date: 1/8/2021
Objective: Choose Dataset Test ID: 4

Version: 1 Testing
Input
Select csv file from system
Expected Output
Dataset should be selected
Actual Output
Dataset selected
Expected Exceptions
Corrupt csv File
5.2.4 View Dataset Test Case 1

Table 18 View Dataset Test Case
Date: 1/8/2021
Objective: View Dataset Test ID: 5

Version: 1 Testing
Input
Product review.
Upload csv file from system
Expected Output
Corrupted csv file
Actual Output
Corrupt csv File
Expected Exceptions
Dataset is displayed
5.2.5 View Dataset Test Case 2

Table 19 View Dataset Test Case
Date: 1/8/2021
Objective: View Dataset Test ID: 6
Input
Product review.
Upload csv file from system
Expected Output
Dataset should be displayed
Actual Output
Dataset is displayed
Expected Exceptions
Corrupt csv File
5.2.6 Train Model Test Case

Table 20 Train Model Test Case
Date: 1/8/2021
Objective: Train Model Test ID: 7
Input
Product review
Expected Output
System will show Feature Extraction
Actual Output
User will be able to select a Feature Extraction method.
Expected Exceptions
Backend exception
5.2.7 Apply Feature Extraction method on Dataset Test Case
Table 21 Apply Feature Extraction method on Dataset
Date: 4/8/2021
Objective: Apply Feature Extraction method on Dataset Test ID: 8
Version: 1 Test Type: Black Box Testing
Input
Part of Speech Tagging
Expected Output
System will apply part of speech tagging on dataset.
Actual Output
User will be able to select a Preprocessing technique.
Expected Exceptions
System may not be responding.
5.2.8 Apply Part of speech on Dataset Test Case

Table 22 Apply Part of speech on Dataset Test Case
Date: 4/8/2021
Objective: Apply Part of speech on Dataset Test ID: 9
Input
Part of Speech Tagging
Expected Output
System will apply part of speech tagging on dataset.
Actual Output
User will be able to select a Preprocessing technique.
Expected Exceptions
5.2.9 Remove Special Characters Test Case

Table 23 Remove special characters Test Case
Date: 4/8/2021
Objective: Remove special characters Test ID: 10
Input
Special Character removal
Expected Output
System will apply special character removal from the given dataset.
Actual Output
System not responding.
Expected Exceptions
5.2.10 Apply Preprocessing Technique on Dataset Test Case

Table 24 Apply Preprocessing Technique on Dataset Test Case
Date: 4/8/2021
Objective: Apply Preprocessing Technique on Dataset Test ID: 11
Version: 1 Test Type: Black Box Testing
Input
Stopword Removal
Expected Output
User will be able to select Stopword Removal.
Actual Output
System applied Stopword Removal on dataset.
Expected Exceptions
5.2.11 View processed Feature data Test Case

Table 25 View processed Feature data Test Case
Date: 1/8/2021
Objective: View Feature Test ID: 12
Input
Stopword Removal
Expected Output
System will show applied stopword removal .csv file
Actual Output
Application error 404
Expected Exceptions
Stopword removal is not applied properly
5.2.12 View processed Feature data Test Case

Table 26 View processed Feature data Test Case
Date: 1/8/2021
Objective: View Feature Test ID: 13
Input
Stopword Removal
Expected Output
System will show applied stopword removal .csv file
Actual Output
System showed applied stopword removal .csv file
Expected Exceptions
Stopword removal is not applied properly
5.2.13 Moving to Classifier Test case

Table 27 Moving to Classifier Test Case
Date: 4/8/2021
Objective: Moving to Classifier Test ID: 14
Version: 1 Test Type: Unit Testing
Input
No input
Expected Output
System will display classification method
Actual Output
System displayed the Machine learning model and Evaluation Metrics.
Expected Exceptions
Backend exception
5.2.14 Machine Learning Model Test case

Table 28 Machine Learning Model Test Case
Date: 4/8/2021
Objective: Machine Learning Model Test ID: 15
Input
Naive Bayes
Expected Output
System will apply machine learning model
Actual Output
System applies machine learning model.
Expected Exceptions
5.2.15 Evaluation Metrics Test case

Table 29 Evaluation Metrics Test Case
Date: 4/8/2021
Objective: Evaluation Metrics Test ID: 16
Input
Accuracy, F-measure, 10-fold cross validation.
Expected Output
System will apply evaluation metrics.
Actual Output
System applies evaluation metrics.
Expected Exceptions
5.2.16 Apply Classifier Test case

Table 30 Apply classifier Test Case
Date: 4/8/2021
Objective: Apply Classifier Test ID: 17
Input
• Machine learning model

• Evaluation metrics
• Validation techniques
Expected Output
System will apply Machine learning model, Evaluation metrics, Validation techniques
Actual Output
System displayed the Machine learning model and Evaluation Metrics results.
Expected Exceptions
5.2.17 Save Model Test Case 1
Table 31 Save Model Test Case
Date: 4/8/2021
Objective: Save Model Test ID: 18

Version: 1 Testing
Input
• Machine learning modeling

• Preprocessing Techniques
• Feature Computation
Expected Output
System will save Machine learning modeling, Preprocessing Techniques, Feature Computation
Actual Output
System didn’t save the model in history tab
Expected Exceptions
5.2.18 Save Model Test Case 2

Table 32 Save Model Test Case
Date: 4/8/2021
Objective: Save Model Test ID: 19
Input
• Machine learning modeling

• Preprocessing Techniques
• Feature Computation
Expected Output
System will save Machine learning modeling, Preprocessing Techniques, Feature Computation
Actual Output
System saved Machine learning modeling, Preprocessing Techniques, Feature Computation.
Expected Exceptions
5.2.19 Test Model Test case

Table 33 Test model Test Case
Date: 1/8/2021
Objective: Test Model Test ID: 20
Input
Test model button Click
Expected Output
Redirect to unseen review prediction.
Actual Output
System not responding
Expected Exceptions
Test model button may not work properly
5.2.20 Test Model Test Case 2

Table 34 Test model Test Case
Date: 1/8/2021
Objective: Test Model Test ID: 20
Input
Test model button Click
Expected Output
Redirect to unseen review prediction.
Actual Output
Redirected to unseen review prediction.
Expected Exceptions
Test model button may not work properly
5.2.21 Unseen Prediction Test Case 1

Table 35 Unseen Prediction Test Case
Date: 4/8/2021
Objective: Unseen Prediction Test ID: 21
Version: 1 Test Type: Black Box Testing.
Input
Text field e.g. I am happy with the product
Expected Output
System will predict unseen review (Happy)
Actual Output
Unhappy
Expected Exceptions
System may not respond.

Date: 4/8/2021
Input
Text field e.g., Nice fast checkout app. Easy to set up and works as promised. I
like this wiggling button. Really cool feature!
Expected Output
Actual Output
Unhappy
Expected Exceptions

Date: 4/8/2021
Input
Text field e.g. I am happy with the product
Expected Output
Actual Output
Happy
Expected Exceptions
5.2.24 User Logout Test case

Date: 4/8/2021
Objective: Logout Test ID: 24
Version: 1 Test Type: Black box Testing
Input
Logout Button Click
Expected Output
Redirect to Login page
Actual Output
Redirected to login
Expected Exceptions
Logout button may not work properly
5.2.25 Contact Us Test Case

Table 39 Display Button Test Case
Date: 4/8/2021
Objective: Contact us Page Test ID: 25
Input
Contact us Button Click and type query
Expected Output
Redirect to Contact us and query sent to email
Actual Output
Redirect to Contact us but email not sent
Expected Exceptions
Email is not received
5.2.26 Contact Us Test Case

Table 40 Display Button Test Case
Date: 4/8/2021
Objective: Contact us Page Test ID: 26
Input
Contact us Button Click and type query
Expected Output
Redirect to Contact us and query sent to email
Actual Output
Redirect to Contact us and email sent
Expected Exceptions
Email is not received
5.2.27 About Page Test case

Table 41 About Page Test Case
Date: 1/8/2021
Objective: About Page Test ID: 27
Input
About Button Click
Expected Output
Redirect to About Page
Actual Output
Redirect to About Page
Expected Exceptions
None
Chapter 6
Software Deployment
6.1 Installation / Deployment Process Description

• GitHub
First, we install Git on the system and then create an account on GitHub.
• Then we install the GitToolBox plugin in PyCharm.
• Then we push the project to GitHub using this tool.
• Heroku
Then we create an account on Heroku, a cloud application platform.
• We create an application on Heroku and then select Python as the language.
• Then we connect our Heroku account to the GitHub account.
• Then we click Deploy Branch on the Heroku website to deploy the project.
• Then we check the status of the deployment on Heroku and also from PyCharm.
• Check the status from PyCharm.
• The project is available at shopifyappreviewprediction.herokuapp.com
Chapter 7
REPORT APPROVAL CERTIFICATE
The report of the project, “Web App to classify Shopify User Review Using Textual
Features” has been approved based on the following evaluation guideline.
Table 42 Project Evaluation Guidelines
Artifacts Guidelines
Analysis and Design artifacts are syntactically correct (use-case model, SSDs,
domain model, class diagram, SDs, ERDs, Flow charts, Activity Diagram,
DFDs)
Consistency and traceability have been maintained among different artifacts
General Guidelines
Formatting (font style, indentation) is according to the FYP template and
consistent throughout the document
Captions are added to all the figures and tables. Figure captions must be placed
below each figure, and table captions must be provided above the table
Each figure or table is followed by some text describing what it represents
____________________ ____________________ ____________________

Name & Signature Name & Signature Name & Signature
(Examiner 1) (Examiner 2) (Examiner 3)
_________________
Name & Signature
(Supervisor)
References
Research Paper
[1] F. Rustam, A. Mehmood, M. Ahmad, S. Ullah, D. M. Khan and G. S. Choi, "Classification of
Shopify App User Reviews Using Novel Multi Text Features," in IEEE Access, vol. 8, pp.
30234-30244, 2020, doi: 10.1109/ACCESS.2020.2972632.
Webpage
[2] https://www.jetbrains.com/pycharm/features/, last accessed July 24, 2021.
[3] https://www.python.org/doc/, last accessed July 24, 2021.
[4] https://www.w3schools.com/html/, last accessed July 24, 2021.
[5] https://docs.djangoproject.com/en/3.2/, last accessed July 24, 2021.
[6] https://www.javascript.com/about, last accessed July 25, 2021.
[7] https://www.sqlite.org/index.html, last accessed July 26, 2021
