FYP Final Document


Web App to Classify Shopify User Reviews using Textual Features

Abdul Samad (BSE173024)


Waleed Khalid (BSE173016)

Supervised By
DR. SHAHID IQBAL

BS Software Engineering
Department of Computer Science
Capital University of Science & Technology, Islamabad

Capital University of Science & Technology, Islamabad Department of Software Engineering
Submission Form for Final-Year Project Report

Version: V 3.0
Number of Members: 2
Title: Web App to Classify Shopify User Reviews using Textual Features
Supervisor's Name: Dr. M. Shahid Iqbal Malik

Member Name      Reg. No.     Email Address
Abdul Samad      BSE173024    Bse173024@cust.pk
Waleed Khalid    BSE173016    Bse173016@cust.pk

MEMBERS’ SIGNATURES

Supervisor’s Signatures

APPROVAL CERTIFICATE
This project, entitled “Web App to Classify User Reviews using Textual
Features”, has been approved for the award of

Bachelor of Engineering in Software Engineering

Committee Signatures:

Supervisor: __________________________

(Dr. M. Shahid Iqbal Malik)

Project Coordinator: __________________________

(Mr. Ibrar Arshad)

Head of Department: __________________________

(Dr. Nadeem Anjum)

DECLARATION

We hereby declare that no portion of the work referred to in this project has been
submitted in support of an application for another degree or qualification of this or any other
university, institute, or other institution of learning. It is further declared that this
undergraduate project, neither as a whole nor in part, has been copied from any
source, except where references have been provided.

MEMBERS’ SIGNATURES

Contents
Chapter 1
  1.1 Project Introduction
  1.2 Problem Statement
  1.3 Business Scope
  1.4 Objectives
  1.5 Useful Tools and Technologies
  1.6 Project Work Break Down
  1.7 Project Time Lapse
Chapter 2
Requirement Specification and Analysis
  2.1. Functional Requirements
  2.2. Non-Functional Requirements
  2.3. Use Case Modeling
  2.4. Use Case Diagram
  2.5. Use Case Descriptions
    2.5.1. User Registration Use Case Description
    2.5.2. User Login Use Case Description
    2.5.3. Select Dataset Use Case Description
    2.5.4. View Dataset Use Case Description
    2.5.5. Select feature extraction method Use Case Description
    2.5.6. Select Preprocessing technique Use Case Description
    2.5.7. Select Classifier Use Case Description
    2.5.8. Select Validation Technique Use Case Description
    2.5.9. Select Evaluation Metric Use Case Description
    2.5.10. View History Use Case Description


    2.5.11. Enter unseen text Use Case Description
Chapter 3
System Design
  3.1. Layer Definition
    3.1.1. Presentation Layer
    3.1.2. Business Logic Layer
  3.2. System Design Diagrams
    3.2.1. High Level Design
    3.2.2. System Sequence Diagrams
      3.2.2.1. User Register SSD
      3.2.2.2. User Login SSD
      3.2.2.3. Load Dataset SSD
      3.2.2.4 View Dataset SSD
      3.2.2.5 Applying Feature SSD
      3.2.2.6 Apply Preprocessing SSD
      3.2.2.7 Apply Classification SSD
      3.2.2.10 Test Trained Model SSD
      3.2.2.11 Logout SSD
  3.3 Domain Model
  3.4 Flow Chart
    3.4.1 User Registration Flow Chart
    3.4.2 User Login Flow Chart
    3.4.3 Features Computation Flow Chart
    3.4.4 Pre-Processing Flow Chart
    3.4.5 Machine Learning Modeling Flow Chart
    3.4.6 Evaluation Metrics Flow Chart


    3.4.7 Validation Method Flow Chart
    3.4.8 Save Model Flow Chart
    3.4.9 Test Saved Model Flow Chart
  3.5 User Interface Design
Chapter 4
Software Development
  4.1. Coding Standards
    4.1.1 Indentation
    4.1.2 Declaration
    4.1.3 Statement Standards
    4.1.4 Naming Convention
  4.2 Front End Development Environment
  4.3 Back End Development Environment
  4.4 Software Description
Chapter 5
Software Testing
  5.1 Testing Methodology
  5.2 Test Cases
    5.2.1 User Registration Test case
    5.2.2 User Login Test case
    5.2.3 Choose Dataset Test case
    5.2.4 View Dataset Test Case 1
    5.2.5 View Dataset Test Case 2
    5.2.6 Train Model Test Case
    5.2.7 Apply Feature Extraction method on Dataset Test Case
    5.2.8 Apply Part of speech on Dataset Test Case


    5.2.9 Remove Special Characters Test Case
    5.2.10 Apply Preprocessing Technique on Dataset Test Case
    5.2.11 View processed Feature data Test Case
    5.2.12 View processed Feature data Test Case
    5.2.13 Moving to Classifier Test case
    5.2.14 Machine Learning Model Test case
    5.2.15 Evaluation Metrics Test case
    5.2.16 Apply Classifier Test case
    5.2.17 Save Model Test Case 1
    5.2.18 Save Model Test Case 2
    5.2.19 Test Model Test case
    5.2.20 Test Model Test Case 2
    5.2.21 Unseen Prediction Test Case 1
    5.2.22 Unseen Prediction Test Case 2
    5.2.23 Unseen Prediction Test Case 3
    5.2.24 User Logout Test case
    5.2.25 Contact Us Test Case
    5.2.26 Contact Us Test Case
    5.2.27 About Page Test case
Chapter 6
Software Deployment
  6.1 Installation / Deployment Process Description
Chapter 7
REPORT APPROVAL CERTIFICATE
References

Figures:
Figure 1 Work Breakdown Chart
Figure 2 Project Time-Lapse
Figure 3 Use case Diagram
Figure 4 User Registration SSD
Figure 5 User Login SSD
Figure 6 Load Data SSD
Figure 7 View Dataset SSD
Figure 8 Feature Computation SSD
Figure 9 Preprocessing Technique SSD
Figure 10 Select Classification SSD
Figure 11 Save Result SSD
Figure 12 View History SSD
Figure 13 Test Model SSD
Figure 14 Logout SSD
Figure 15 Domain Model
Figure 16 Flow Chart
Figure 17 User Registration Flow Chart
Figure 18 User Login Flow Chart
Figure 19 Features computation flowchart
Figure 20 Pre-Processing Flow Chart
Figure 21 Machine Learning Modeling Flow Chart
Figure 22 Evaluation Metrics Flow Chart
Figure 23 Validation Method Flow Chart
Figure 24 Save model Flow Chart
Figure 25 Test Saved Model Flow Chart
Figure 26 Signup Page Interface
Figure 27 Login Page Interface
Figure 28 About Page Interface
Figure 29 Dataset Selection Interface
Figure 30 View Dataset Interface
Figure 31 Features selection Interface
Figure 32 Data Preprocessing Interface
Figure 33 Classifier Selection Interface
Figure 34 Classifier Result Interface
Figure 35 History Interface
Figure 36 Unseen Review Interface
Figure 37 Text Result Interface
Figure 38 Text Result Interface
Figure 39 Text Result Interface
Figure 40 Text Result Interface
Figure 41 Text Result Interface
Figure 42 Text Result Interface

Tables:
Table 1 Functional Requirements
Table 2 Non-Functional Requirements
Table 3 User Registration Use Case Description
Table 4 User Login Use Case Description
Table 5 Select Dataset Use Case Description
Table 6 View Dataset Use Case Description
Table 7 Select feature extraction method Use Case Description
Table 8 Selection of preprocessing technique Use Case Description
Table 9 Select Classifier Use Case Description
Table 10 Select validation technique Use Case Description
Table 11 Select Evaluation Metric Use Case Description
Table 12 View History Use Case Description
Table 13 Enter unseen text Use Case Description
Table 14 Layers Definition
Table 15 User Registration Test case
Table 16 User Login Test Case
Table 17 Choose Dataset Test Case
Table 18 View Dataset Test Case
Table 19 View Dataset Test Case
Table 20 Train Model Test Case
Table 21 Apply Feature Extraction method on Dataset
Table 22 Apply Part of speech on Dataset Test Case
Table 23 Remove special characters Test Case
Table 24 Apply Preprocessing Technique on Dataset Test Case
Table 25 View processed Feature data Test Case
Table 26 View processed Feature data Test Case
Table 27 Moving to Classifier Test Case
Table 28 Machine Learning Model Test Case
Table 29 Evaluation Metrics Test Case

Table 30 Apply classifier Test Case
Table 31 Save Model Test Case
Table 32 Save Model Test Case
Table 33 Test model Test Case
Table 34 Test model Test Case
Table 35 Unseen Prediction Test Case
Table 36 Unseen Prediction Test Case
Table 37 Unseen Prediction Test Case
Table 38 Unseen Prediction Test Case
Table 39 Display Button Test Case
Table 40 Display Button Test Case
Table 41 About Page Test Case
Table 42 Project Evaluation Guidelines

Chapter 1
This chapter gives a brief summary of the project scope and specification. It describes the
existing system and the technologies used to develop the software, and it presents the
project timeline and the work breakdown structure.

1.1 Project Introduction


App stores usually allow users to post reviews and ratings, which developers use to resolve
issues and plan future work on their apps. In this way, app stores accumulate large amounts of
data for analysis. However, analyzing such feedback is difficult because of its volume and
redundancy. Our work therefore investigates an efficient way to analyze this feedback and to
address the classification of Shopify app reviews. It applies different machine learning
approaches, combined with different feature engineering techniques, to the review
classification problem. Classifiers such as Naive Bayes and Random Forest were trained on
review text to predict whether a user's review of a Shopify app is happy or unhappy. [1]
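
As a rough illustration of how a classifier can be trained on review text to predict happy or unhappy, the following is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing, written in plain Python. The sample reviews, the labels, and the function names are hypothetical and are not taken from the project's dataset or code.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a review and split it on whitespace."""
    return text.lower().split()


def train_naive_bayes(reviews, labels):
    """Count words per class and documents per class."""
    word_counts = defaultdict(Counter)
    class_counts = Counter(labels)
    for text, label in zip(reviews, labels):
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    word_counts, class_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data, for illustration only.
train_reviews = [
    "great app works perfectly",
    "love this app great support",
    "terrible app keeps crashing",
    "bad support and slow app",
]
train_labels = ["happy", "happy", "unhappy", "unhappy"]

model = train_naive_bayes(train_reviews, train_labels)
print(predict(model, "great support"))
print(predict(model, "slow and crashing"))
```

Working in log-probabilities keeps the product of many small word probabilities from underflowing, which matters once reviews become longer than a few words.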

1.2 Problem Statement


Several challenges, related to the redundancy and volume of the data, must be addressed
before machine learning can be applied. This study conducts experiments on datasets
containing reviews of Shopify apps. To address these challenges, we first categorize user
reviews into two groups, happy and unhappy, and then preprocess the reviews to clean the
data. Next, several feature engineering techniques, such as bag-of-words and term
frequency-inverse document frequency (TF-IDF), are used singly and in combination to preserve
meaningful information. Finally, random forest and logistic regression models are used to
classify the reviews as happy or unhappy. The experiments reveal that a combination of features
can improve a machine learning model's performance. [1]
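
The two feature engineering techniques named above, and their use in combination, can be sketched as follows. This is a simplified illustration in plain Python, assuming whitespace tokenization and the unsmoothed weighting tf × log(N / df). Libraries often use smoothed variants, so the exact numbers may differ from the project's results. The sample reviews and function names are hypothetical.

```python
import math
from collections import Counter


def bag_of_words(documents):
    """Return the sorted vocabulary and one raw term-count vector per document."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


def tf_idf(vectors):
    """Weight raw counts by inverse document frequency: tf * log(N / df)."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]


def combine(features_a, features_b):
    """Concatenate two feature vectors per document into one longer vector."""
    return [a + b for a, b in zip(features_a, features_b)]


reviews = ["good app good support", "bad app"]  # hypothetical reviews
vocabulary, bow = bag_of_words(reviews)
tfidf = tf_idf(bow)
combined = combine(bow, tfidf)

print(vocabulary)
print(bow)
print(len(combined[0]))
```

Note that the word "app" occurs in both reviews, so its TF-IDF weight is zero: a term that appears everywhere carries no information for telling the two classes apart, while the raw count in the bag-of-words half of the combined vector is retained.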

1.3 Business Scope
Shopify app reviews will be used as the dataset, and the system will classify reviews into
two categories (Happy/Unhappy). The app will be useful for sellers, who can improve their
products for better future sales, and for customers, who can decide whether or not to buy a
particular product.

1.4 Objectives
This project has the following objectives:

• To help developers resolve problems and make plans for their apps.

• To compute textual features.

• To apply machine learning algorithms for classification.

• To evaluate model performance using 10-fold cross-validation and the hold-out method.

• To present model performance using evaluation metrics such as accuracy, precision,
recall, and F-measure.
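
The validation methods and evaluation metrics listed in the objectives can be sketched in plain Python as follows. This is an illustrative sketch only: the function names, the sample labels, and the choice of "happy" as the positive class are assumptions, and shuffling before splitting is omitted to keep the example deterministic.

```python
def hold_out_split(n_samples, test_fraction=0.3):
    """Split sample indices into a training part and a held-out test part."""
    n_train = n_samples - round(n_samples * test_fraction)
    indices = list(range(n_samples))
    return indices[:n_train], indices[n_train:]


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Distribute the remainder so fold sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def evaluate(y_true, y_pred, positive="happy"):
    """Compute accuracy, precision, recall, and F-measure for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels and predictions, for illustration only.
y_true = ["happy", "happy", "unhappy", "unhappy", "happy"]
y_pred = ["happy", "unhappy", "unhappy", "happy", "happy"]
scores = evaluate(y_true, y_pred)

folds = list(k_fold_indices(25, k=10))
train_idx, test_idx = hold_out_split(10, test_fraction=0.2)
print(scores)
print(len(folds), train_idx, test_idx)
```

In 10-fold cross-validation every sample is used for testing exactly once, and the metrics are typically averaged over the ten folds. The hold-out method trains once and tests once, which is cheaper but depends more on how the single split happens to fall.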

1.5 Useful Tools and Technologies

PyCharm is an integrated development environment used in computer programming,
specifically for the Python language. [2]

Python is an interpreted high-level general-purpose programming language.


Python's design philosophy emphasizes code readability with its notable use of
significant indentation. [3]

The Hypertext Markup Language, or HTML is the standard markup language for
documents designed to be displayed in a web browser. It can be assisted by
technologies such as Cascading Style Sheets and scripting languages. [4]

Django is a high-level Python web framework that encourages rapid development and clean,
pragmatic design. Built by experienced developers, it takes care of much of the hassle of web
development. [5]

JavaScript is a scripting or programming language that allows you to implement complex
features on web pages. Whenever a web page does more than just sit there and display static
information, JavaScript is probably involved. [6]

SQLite is a relational database management system contained in a C library. In contrast to
many other database management systems, SQLite is not a client–server database engine.
Rather, it is embedded into the end program. SQLite generally follows PostgreSQL syntax. [7]

1.6 Project Work Break Down

Figure 1 Work Breakdown Chart

1.7 Project Time Lapse

Figure 2 Project Time-Lapse

Chapter 2
Requirement Specification and Analysis

Requirements analysis is the process of determining user expectations for a new or modified
product. These features, called requirements, must be quantifiable, relevant and detailed. In
software engineering, such requirements are often called functional specifications. This chapter
lists the functional and non-functional requirements and models the functional requirements
in the form of a use case model.

2.1. Functional Requirements


Functional requirements define functionalities of a system or its components. Functional
requirements may be calculations, technical details, data manipulation and processing and other
specific functionality that define what a system is supposed to accomplish.

Table 1 Functional Requirements

S. No. Functional Requirement Type Status

1. User can register to the application. Core Completed

2. User can log in to the application. Core Completed

3. User can select the Shopify product dataset. Core Completed

4. User can view the dataset. Intermediate Completed

5. User can select training using Part of Speech features. Core Completed

6. User can select training using Bag of Words features. Core Completed

7. User can select training using TF-IDF (term frequency-inverse document frequency) features. Core Completed

8. User can select training using discrete positive emotion features. Intermediate Completed

9. User can select training using discrete negative emotion features. Intermediate Completed

10. User can select training using polarity features. Core Completed

11. User can select training using sentiment features. Core Completed

12. User can select all features for training. Core Completed

13. User can select the lemmatization and stop words removal preprocessing technique. Core Completed

14. User can select the stop words removal preprocessing technique. Core Completed

15. User can select the special character removal preprocessing technique. Core Completed

16. User can select the Random Forest (RF) machine learning model for the experiment. Core Completed

17. User can select the Naïve Bayes machine learning model for the experiment. Core Completed

18. User can select the Accuracy evaluation metric for the experiment. Core Completed

19. User can select the Precision evaluation metric for the experiment. Core Completed

20. User can select the Recall evaluation metric for the experiment. Core Completed

21. User can select the F-measure evaluation metric for the experiment. Core Completed

22. User can select the 10-fold cross validation technique for the experiment. Core Completed

23. User can select the hold-out method for the experiment. Intermediate Completed

24. User can save the trained model. Core Completed

25. User can test the model by entering unseen text and predicting its label. Core Completed

2.2. Non-Functional Requirements


Following is the list of the non-functional requirements.

Table 2 Non-Functional Requirements

S. No. Non-Functional Requirements Category

1. The user should reach the classified text with one button press if possible. Usability

2. The system should also be user friendly for admins, because an admin may be anyone, not necessarily a programmer. Usability

3. The system will predict the class label (happy/unhappy) with maximum accuracy. Accuracy

4. This application is developed using review features and machine learning techniques. Therefore, there is no fixed reliability percentage that can be measured. Reliability

5. Computation time and response time should be as short as possible, because one of the software’s features is timesaving. The whole cycle of classifying a dataset should not take more than 40 seconds. Performance

6. After unseen review text is entered, the system should classify it within the defined time. Accuracy

2.3. Use Case Modeling


A use case depicts how actors interact with the system. It is a methodology used in system
analysis to identify, clarify and organize system requirements. A use case is made up of a set of
possible sequences of interactions between systems and users in a particular environment,
related to a particular goal. The following use case diagram depicts how our system works.

2.4. Use Case Diagram:

Figure 3 Use case Diagram

2.5. Use Case Descriptions
2.5.1. User Registration Use Case Description:
Table 3 User Registration Use Case Description

Use Case ID: UC 3


UC Name User Registration
Actors User, Database
Description User must register in the application
Trigger Registration button
Pre-condition User must not be registered previously
Post-condition User will be registered in the system
Basic Flow (User) 1. User enters the registration credentials.
Basic Flow (System) User data is stored in the database.
Alternative Flow 1. User must first visit the website.
2. User must then click on register to perform registration.

2.5.2. User Login Use Case Description


Table 4 User Login Use Case Description

Use Case ID: UC 1

UC Name User Login

Actors User, Database

Description User must login into the application

Trigger Login button

Pre-condition User must be registered

Post-condition User will be logged in the system

Basic Flow (User) 1. User enters the application.

Basic Flow (System) User is redirected to the home page.

Alternative Flow 1. User must first register in the application.

2.5.3. Select Dataset Use Case Description


Table 5 Select Dataset Use Case Description

Use Case ID: UC 4

UC Name Select Dataset

Actors User

Description User must select the dataset from the given options

Trigger Dropdown menu

Pre-condition User must select dataset

Post-condition System will load the dataset

Basic Flow (User) 1. User selects the dataset.

Basic Flow (System) System loads the dataset.

Alternative Flow 1. Selected dataset has some miscellaneous information.
2. While loading the dataset, the system stops responding.

2.5.4. View Dataset Use Case Description
Table 6 View Dataset Use Case Description

Use Case ID: UC 5

UC Name View Dataset

Actors User

Description User can view the selected dataset.

Trigger View Button

Pre-condition User must select a dataset

Post-condition User is able to view the selected dataset

Basic Flow (User) 1. User views the loaded dataset.

Basic Flow (System) System displays the selected dataset.

Alternative Flow 1. Selected dataset has some miscellaneous information.
2. While loading the dataset, the system stops responding.

2.5.5. Select feature extraction method Use Case Description


Table 7 Select feature extraction method Use Case Description

Use Case ID: UC 6

UC Name Select feature extraction method

Actors User

Description User must select one of the feature extraction methods from the given features

Trigger Radio Button

Pre-condition User must load the dataset

Post-condition Feature selection is successfully marked

Basic Flow (User) 1. User selects one of the feature extraction methods.

Basic Flow (System) System notes which feature extraction method to apply.

Alternative Flow 1. Service may not be available.
2. System may crash while posting the request.
3. Data may not be loaded into the system.

2.5.6. Select Preprocessing technique Use Case Description


Table 8 Selection of preprocessing technique Use Case Description

Use Case ID: UC 7

UC Name Select preprocessing technique

Actors User

Description User must select one of the preprocessing techniques.

Trigger Radio Button

Pre-condition User must select feature extraction method.

Post-condition Preprocessing and feature extraction will be applied to the selected dataset.

Basic Flow (User) 1. User selects one of the preprocessing techniques.

Basic Flow (System) System applies the preprocessing and feature extraction method to the selected dataset.

Alternative Flow 1. User may select the wrong feature and preprocessing method.
2. Service is not currently available.

2.5.7. Select Classifier Use Case Description


Table 9 Select Classifier Use Case Description

Use Case ID: UC 8

UC Name Select Classifier

Actors User

Description User must select one of the machine learning models.

Trigger Dropdown menu

Pre-condition Preprocessing and feature extraction must be applied.

Post-condition System will apply the machine learning model on the dataset

Basic Flow (User) 1. User selects one of the machine learning models.

Basic Flow (System) System applies that model to the selected dataset.

Alternative Flow 1. Preprocessing and feature extraction method may not be applied.
2. System may not be responding.

2.5.8. Select Validation Technique Use Case Description
Table 10 Select validation technique Use Case Description

Use Case ID: UC 9

UC Name Select validation technique

Actors User

Description User can select one or more validation techniques to evaluate the performance of the classifiers.

Trigger Check boxes

Pre-condition User must select one of the machine learning models.

Post-condition Results are displayed in graphs.

Basic Flow (User) 1. User selects one or more validation techniques.

Basic Flow (System) System displays the results in graphs.

Alternative Flow 1. User may not select any of the given options.
2. System is not responding.

2.5.9. Select Evaluation Metric Use Case Description


Table 11 Select Evaluation Metric Use Case Description

Use Case ID: UC 10

UC Name Select Evaluation metric

Actors User

Description User can view the quality of the machine learning model.

Trigger Drop Down Menu

Pre-condition User must select one of the machine learning models.

Post-condition System will display results as graphs.

Basic Flow (User) 1. User must select one of the validation technique methods.

Basic Flow (System) System displays results in the form of graphs.

Alternative Flow 1. Unknown error occurred while displaying graphs.
2. System not responding.

2.5.10. View History Use Case Description


Table 12 View History Use Case Description

Use Case ID: UC 11

UC Name View History

Actors User

Description User can view the history of trained models

Trigger Button

Pre-condition Some history must exist

Post-condition User will see the full history of model training.

Basic Flow (User) 1. User trains a model and then goes to history.

Basic Flow (System) System displays the history of trained models.

Alternative Flow 1. An unknown error occurred while displaying the list.
2. System may not be responding at the moment.
3. No past history.

2.5.11. Enter unseen text Use Case Description


Table 13 Enter unseen text Use Case Description

Use Case ID: UC 12

UC Name Enter Unseen Text

Actors User

Description User is able to predict the label of a review

Trigger Text Field

Pre-condition User must train a model

Post-condition User will be able to view the result

Basic Flow (User) 1. User enters an unseen review.

Basic Flow (System) System displays the result.

Alternative Flow 1. An unknown error occurred while updating the status.
2. System may not be responding.

Chapter 3
System Design
The purpose of this chapter is to provide information that is complementary to the development
phase. Without an adequate design, that delivers required function as well as quality attributes,
the project will fail. However, communicating architecture to its stakeholders is as important a
job as creating it in the first place.

3.1. Layer Definition


Table 14 Layers Definition

Layers Description

Presentation Layer This layer will be used for the interaction with the user
through a graphical user interface.

Business Logic Layer This layer contains the business logic. All the
constraints and majority of the functions reside under
this layer.

3.1.1. Presentation Layer


Occupies the top level and displays information related to services available on a website. This
tier communicates with other tiers by sending results to the browser and other tiers in the
network.

3.1.2. Business Logic Layer


Also called the application layer, middle tier or logic tier, this tier is pulled from the
presentation tier. It controls application functionality by performing detailed processing.

3.2. System Design Diagrams


System design is divided into two parts:
3.2.1. High Level Design
High-level design provides a view of the system at an abstract level. It shows how the major
pieces of the finished application will fit together and interact with each other. The high-level
design does not focus on the details of how the pieces of the application will work. Those details
can be worked out later during low-level design and implementation.

3.2.2. System Sequence Diagrams


A system sequence diagram (SSD) is a sequence diagram that shows, for a particular scenario
of a use case, the events that external actors generate, their order, and possible inter-system
events.

3.2.2.1. User Register SSD


This is the User Registration System Sequence Diagram, which shows the flow of activities
that takes place when a user registers in the system.

Figure 4 User Registration SSD

3.2.2.2. User Login SSD


This is the Login System Sequence Diagram, which shows the flow of activities that takes
place when a user tries to log in.

Figure 5 User Login SSD

3.2.2.3. Load Dataset SSD


This is the Load Dataset System Sequence Diagram, which shows that the user first selects
the dataset type and then uploads the dataset from the device.

Figure 6 Load Data SSD

3.2.2.4 View Dataset SSD


This is the View Dataset System Sequence Diagram, which shows that a user who wants to
view the dataset can do so by sending a request to the system.

Figure 7 View Dataset SSD

3.2.2.5 Applying Feature SSD


This is the Compute Feature System Sequence Diagram, which shows the flow of activities
that takes place when a user computes features of the given dataset.

Figure 8 Feature Computation SSD

3.2.2.6 Apply Preprocessing SSD
This is the Applying Pre-processing System Sequence Diagram, which shows that the user
selects one of the pre-processing techniques and that technique is then applied to the given
dataset.

Figure 9 Preprocessing Technique SSD

3.2.2.7 Apply Classification SSD


This is the Applying Classifier System Sequence Diagram, which shows that the user first
selects the machine learning model from the given options, then selects the evaluation
metrics and validation technique, and finally applies the classifier, after which the system
displays the result of the model.

Figure 10 Select Classification SSD

3.2.2.8 Save Result SSD

This is the Save Result System Sequence Diagram, which shows that once the result of the
model is displayed, the model is saved in the system.

Figure 11 Save Result SSD

3.2.2.9 View History SSD

This is the View History System Sequence Diagram, which shows the flow of activities that
takes place when a user views the history of previously saved models.

Figure 12 View History SSD

3.2.2.10 Test Trained Model SSD
This is the Test Saved Model System Sequence Diagram, which shows that to test a saved
model, the user enters an unseen review and the system then displays the predicted result.

Figure 13 Test Model SSD

3.2.2.11 Logout SSD


This is the Logout System Sequence Diagram, which shows the flow of activities that takes
place when a user tries to log out.

Figure 14 Logout SSD

3.3 Domain Model


The Domain Model is your organized and structured knowledge of the problem. The Domain
Model should represent the vocabulary and key concepts of the problem domain and it should
identify the relationships among all of the entities within the scope of the domain.

Our system has twelve entities. The user entity is used to register and log in to the system,
and the train model entity is used to train the model. Training first requires a dataset, so
there is a choose dataset entity. After the data is chosen, a feature extraction method and a
preprocessing technique must be applied, so there are feature extraction and preprocessing
entities. Next come the machine learning model, evaluation metrics and validation technique
entities. Once the training result is available, the model is stored in our database, so there is
a save model entity. A model can also be tested with an unseen review, so there is a test model
entity, and the system then returns a prediction, so there is also a prediction entity.

Figure 15 Domain Model
(Note: We removed the software architecture diagram, class diagram, sequence diagram and ER
diagram and added a detailed flow chart instead, because our panel and supervisor suggested
that these diagrams were not needed.)

3.4 Flow Chart


• First, the user registers in the system.
• Then the user logs in to the system.
• After login, the system takes the user to the home page.
• The user then selects the dataset and can optionally view it.
• Then the user selects one of the feature extraction methods from the given options.
• After selecting the feature, the user selects preprocessing techniques from the given options.
• The user can also view the data to which the feature extraction method and preprocessing
techniques have been applied.
• Then the user selects one machine learning model from the given options.
• Then the user selects evaluation metrics from the given options.
• The user also selects one of the validation techniques from the given options.
• When the user applies these to the dataset, the system trains the model and displays the
results for the trained model. The user can also save the trained model in the system.
• The user can go to the history tab and view all the saved trained models.
• From the history tab, the user can test a saved trained model by giving an unseen review.
• The system then predicts the class label of the unseen review (Happy/Unhappy) using the
selected trained model.

Figure 16 Flow Chart

3.4.1 User Registration Flow Chart


User will Register by giving following credentials:

➢ Name
➢ Username
➢ Email
➢ Password
➢ Confirm Password

Figure 17 User Registration Flow Chart
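
The registration step with the five fields listed above could be checked along the following lines. This is only a sketch under stated assumptions: the report does not show the project's own validation code, so the function name `validate_registration`, the field keys, the error messages and the specific rules (all fields required, a simple e-mail pattern, matching passwords) are hypothetical.

```python
import re

# Field keys are assumed names for the five inputs of the sign-up form.
REQUIRED_FIELDS = ("name", "username", "email", "password", "confirm_password")
# A deliberately simple pattern: something@something.something
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_registration(form):
    """Return a list of error messages; an empty list means the form is valid."""
    errors = []
    for field in REQUIRED_FIELDS:
        if not form.get(field, "").strip():
            errors.append(f"{field} is required")
    email = form.get("email", "")
    if email and not EMAIL_PATTERN.match(email):
        errors.append("email is not valid")
    if form.get("password") != form.get("confirm_password"):
        errors.append("passwords do not match")
    return errors


# Hypothetical example input.
good_form = {
    "name": "Test User",
    "username": "testuser",
    "email": "user@example.com",
    "password": "s3cret-pass",
    "confirm_password": "s3cret-pass",
}
errors = validate_registration(good_form)
```

In a Django project this kind of check would normally live in a form class; the plain function here only illustrates the rules.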

3.4.2 User Login Flow Chart


User will log in by giving the following credentials:

➢ Username
➢ Password

Figure 18 User Login Flow Chart

3.4.3 Features Computation Flow Chart


User will select one of the feature extraction methods from the given options, e.g.

➢ Bag of words.
➢ Part of speech tagging.
➢ Discrete Positive emotion.
➢ Discrete Negative emotion.
➢ Sentiment.
➢ Polarity.
➢ Term frequency inverse document frequency (TF-IDF).

Figure 19 Features computation flowchart

3.4.4 Pre-Processing Flow Chart


User will select preprocessing techniques from the given options, e.g.

➢ Stopwords Removal.
➢ Stopwords Removal and Special Character Removal.
➢ Stopwords Removal and Lemmatization.
➢ Stopwords Removal, Special Character Removal and Lemmatization.

Figure 20 Pre-Processing Flow Chart
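
The preprocessing options above can be illustrated with a short, dependency-free sketch. It covers stop-word removal and special character removal only. Lemmatization is left out because it needs a lexical resource (for example NLTK's `WordNetLemmatizer`). The stop-word list, the function names and the regular expression are assumptions for illustration, not the project's code.

```python
import re

# A tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {"a", "an", "and", "is", "it", "the", "this", "to"}


def remove_special_characters(text):
    """Replace every run of characters other than letters, digits or spaces with a space."""
    return re.sub(r"[^A-Za-z0-9 ]+", " ", text)


def remove_stop_words(text):
    """Drop stop words (case-insensitive) and collapse repeated whitespace."""
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)


def preprocess(text, special_characters=False):
    """Apply optional special-character removal, then stop-word removal."""
    if special_characters:
        text = remove_special_characters(text)
    return remove_stop_words(text)


# Hypothetical review text.
cleaned = preprocess("This app is great!!! It saved the day :)", special_characters=True)
```

Special characters are removed before stop words so that a token such as "great!!!" is reduced to "great" before the stop-word filter runs.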

3.4.5 Machine Learning Modeling Flow Chart


User will select one machine learning model from the given options:
➢ Naïve Bayes
➢ Random Forest

Figure 21 Machine Learning Modeling Flow Chart
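
To make the Naïve Bayes option concrete, the sketch below trains a very small classifier on word counts. It is a stand-in for illustration only. The project's imports suggest scikit-learn's `GaussianNB`, whereas this example is a hand-written multinomial variant with add-one smoothing so that it runs without dependencies. The class name `TinyNaiveBayes` and the training sentences are hypothetical.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Start from the log prior of the class.
            score = math.log(doc_count / total_docs)
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, not taken from the project's dataset.
train_texts = ["love this app", "great support love it",
               "app keeps crashing", "terrible slow app"]
train_labels = ["happy", "happy", "unhappy", "unhappy"]
model = TinyNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("love the support")
```

Working in log space avoids numeric underflow when many word probabilities are multiplied together.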

3.4.6 Evaluation Metrics Flow Chart
User will select evaluation metrics from the given options:
➢ Accuracy
➢ F-measure
➢ Precision
➢ Recall

Figure 22 Evaluation Metrics Flow Chart
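
The four metrics listed above can be computed from the counts of correct and incorrect predictions. The following sketch treats "happy" as the positive class and is written without libraries. The project's own code calls scikit-learn's `precision_score` and `recall_score` with `average='weighted'`, which averages over both classes, so its numbers can differ from this single-class illustration. The function names and the sample labels are assumptions.

```python
def confusion_counts(y_true, y_pred, positive="happy"):
    """Count true/false positives and negatives for a binary problem."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    return tp, fp, fn, tn


def evaluate(y_true, y_pred, positive="happy"):
    """Return accuracy, precision, recall and F-measure as a dict."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels for five reviews.
y_true = ["happy", "happy", "happy", "unhappy", "unhappy"]
y_pred = ["happy", "happy", "unhappy", "happy", "unhappy"]
scores = evaluate(y_true, y_pred)
```

For these five labels there are two true positives, one false positive, one false negative and one true negative, giving an accuracy of 3/5 and a precision, recall and F-measure of 2/3 each.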

3.4.7 Validation Method Flow Chart


User will also select one of the validation techniques from the given options, e.g.
➢ 10-Fold Cross Validation Method
➢ Hold-Out Method

Figure 23 Validation Method Flow Chart
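
The difference between the two validation methods lies in how the sample indices are divided. The sketch below shows both splits in plain Python. The function names, the fixed seed and the 20-sample example are assumptions. The 0.3 test fraction mirrors the `test_size=0.3` used in the project's classifier code, which itself relies on scikit-learn's `train_test_split` and `cross_val_score` rather than on functions like these.

```python
import random


def hold_out_split(n_samples, test_size=0.3, seed=0):
    """Shuffle the indices 0..n-1 once and split them into train and test parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n_samples * test_size))
    return indices[n_test:], indices[:n_test]


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train, test) index lists; each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


train_idx, test_idx = hold_out_split(20, test_size=0.3)
folds = list(k_fold_splits(20, k=10))
```

With hold-out, a model is trained once and scored on the single held-back part. With 10-fold cross validation, it is trained ten times and the ten scores are averaged, which costs more time but uses every sample for testing.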

3.4.8 Save Model Flow Chart


User will save the trained model into the system with the following details:
➢ Current date and time
➢ Dataset Name
➢ Feature Computation
➢ Pre-Processing Technique
➢ Machine Learning Model
➢ Accuracy
➢ Precision
➢ Recall
➢ F-measure
➢ Validation Technique

Figure 24 Save model Flow Chart
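
The details stored with a saved model can be pictured as one record per training run. The sketch below uses a plain dataclass. The class name `SavedModelRecord` and its field names are hypothetical, the metric values are placeholders rather than measured results, and in the actual Django application such a record would presumably be a database model instead.

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime


@dataclass
class SavedModelRecord:
    """One row of training history; field names are illustrative assumptions."""
    dataset_name: str
    feature_computation: str
    preprocessing_technique: str
    machine_learning_model: str
    validation_technique: str
    accuracy: float
    precision: float
    recall: float
    f_measure: float
    # Filled in automatically with the current date and time.
    created_at: datetime = field(default_factory=datetime.now)


record = SavedModelRecord(
    dataset_name="shopify",
    feature_computation="TF-IDF",
    preprocessing_technique="Stopwords Removal",
    machine_learning_model="Random Forest",
    validation_technique="10-Fold Cross Validation",
    accuracy=0.0,   # placeholder, not a measured result
    precision=0.0,  # placeholder
    recall=0.0,     # placeholder
    f_measure=0.0,  # placeholder
)
row = asdict(record)
```

Converting the record to a dictionary with `asdict` makes it straightforward to show in the history view or to pass on to a storage layer.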

3.4.9 Test Saved Model Flow Chart


User will test the saved model by entering an unseen review.

Figure 25 Test Saved Model Flow Chart

3.5 User Interface Design

Figure 26 Signup Page Interface

Figure 27 Login Page Interface

Figure 28 About Page Interface

Figure 29 Dataset Selection Interface

Figure 30 View Dataset Interface

Figure 31 Features selection Interface

Figure 32 Data Preprocessing Interface

Figure 33 Classifier Selection Interface

Figure 34 Classifier Result Interface

Figure 35 History Interface

Figure 36 Unseen Review Interface

Figure 37 Text Result Interface

Chapter 4
Software Development

4.1. Coding Standards

4.1.1 Indentation
Proper code indentation is used in this project. Indenting blocks of code makes the code
easier to read and understand and makes its hierarchy visible.

4.1.2 Declaration
In this project we use one declaration per line to make the code clearer and easier to
understand. The order of declaration is as follows:
• All the widgets are imported at the beginning.
• Class variables are ordered public first, then protected, then private.
• Instance variables are ordered public first, then private.
• Class constructors are then declared with proper names.
• Class methods are grouped by functionality rather than by scope or accessibility, to make
the code easier to read and understand.
• Local variables are declared only at the beginning of the code, after the packages and
libraries are imported.

4.1.3 Statement Standards


Each line of code contains at most one declaration. Compound statements in this project
contain lines of code enclosed in braces. The inner block of a compound statement begins on
the line after the opening brace. Proper indentation is also followed for the lines of code
inside compound statements. Braces are used around all statements such as if-else and
try-catch.

4.1.4 Naming Convention
Proper naming conventions were followed during the implementation of this project, which
makes the programs easier to read and understand.

While implementing this project, we used words from natural language (English) to give
understandable names to classes, variables and methods, such as Requests,
DocumentCollection and BasicInformation, instead of unclear names like myc method, a1 or
b1.

Terminology from the project's domain is used. For example, if the user refers to Email as
Registration Number, then the term Registration Number is used.

Mixed case is used to make names readable: names are generally lower case, with the first
letter of class names and interface names capitalized.

4.2 Front End Development Environment

The Hypertext Markup Language, or HTML, is the standard markup language for documents
designed to be displayed in a web browser. It can be assisted by technologies such as
Cascading Style Sheets and scripting languages such as JavaScript. [4]

JavaScript is a scripting or programming language that allows you to implement complex
features on web pages. Whenever a web page does more than just sit there and display static
information for you to look at (for example, displaying timely content updates or interactive
maps), JavaScript is probably involved. [5]

4.3 Back End Development Environment

PyCharm is an integrated development environment used in computer programming,
specifically for the Python language. [2]

Python is an interpreted high-level general-purpose programming language. Python's design
philosophy emphasizes code readability with its notable use of significant indentation. [3]

Django is a high-level Python web framework that encourages rapid development and clean,
pragmatic design. Built by experienced developers, it takes care of much of the hassle of web
development. [6]

SQLite is a relational database management system contained in a C library. In contrast to
many other database management systems, SQLite is not a client–server database engine.
Rather, it is embedded into the end program. SQLite generally follows PostgreSQL syntax. [7]

4.4 Software Description


Module Classifier
Code
import time

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn import svm
from sklearn.neural_network import MLPClassifier
from sklearn.linear_model import SGDClassifier
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.naive_bayes import GaussianNB
# from mlxtend.classifier import StackingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn import model_selection
from sklearn import tree
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import ShuffleSplit
from sklearn.model_selection import cross_val_score
from sklearn.metrics import (precision_score, recall_score, confusion_matrix,
                             classification_report, accuracy_score, f1_score)

def get_preprocessing(pre_processing):
    # Map the preprocessing option chosen in the UI to the code used in file names.
    if pre_processing == 'Stopwords Removal':
        return "a1"
    elif pre_processing == 'Stopwords + Special Characters':
        return "a2"
    else:
        return "a3"

def read_dataset(dataset_name, feature_type, pre_processing):
    # Load the precomputed feature file for the chosen dataset, feature and preprocessing.
    pre_processing = get_preprocessing(pre_processing)
    if feature_type.lower() == 'part of speech tagging':
        dataset = pd.read_csv("features/feature/part_of_speech/" + dataset_name.lower() +
                              "_pos_" + pre_processing.lower() + '.csv', encoding="ISO-8859-1")
        dataset.drop('text', axis=1, inplace=True)
        dataset.drop('Tweet #', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
        print(dataset.head())
    elif feature_type.lower() == 'bag of words technique':
        dataset = pd.read_csv("features/feature/bag_of_words/" + dataset_name.lower() +
                              "_bog_" + pre_processing.lower() + '.csv')
        dataset.drop('text', axis=1, inplace=True)
        dataset.drop('Tweet #', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
    elif feature_type.lower() == 'tf-idf technique':
        dataset = pd.read_csv("features/feature/tfidf/" + dataset_name.lower() +
                              "_tf_idf_" + pre_processing.lower() + '.csv')
        dataset.drop(dataset.columns[[0, -1]], axis=1, inplace=True)
        dataset.drop('text', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
    elif feature_type.lower() == 'unigram':
        dataset = pd.read_csv("features/feature/polarity/" + dataset_name.lower() +
                              "_polarity_" + pre_processing.lower() + '.csv')
        dataset.drop(dataset.columns[[0, -1]], axis=1, inplace=True)
        dataset.drop('text', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
    elif feature_type.lower() == 'sentiment':
        dataset = pd.read_csv("features/feature/sentiment/" + dataset_name.lower() +
                              "_sentiment_" + pre_processing.lower() + '.csv')
        dataset.drop(dataset.columns[[0, -1]], axis=1, inplace=True)
        dataset.drop('text', axis=1, inplace=True)
        dataset.reset_index(drop=True, inplace=True)
    else:
        raise Exception('Unknown Feature Type')
    return dataset

def generate_random_forest(dataset):
    label_Label = LabelEncoder()
    # converting text labels into numbers
    dataset["label"] = label_Label.fit_transform(dataset['label'])

    X = dataset.drop("label", axis=1)
    y = dataset['label']

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    sc = StandardScaler()
    X_train = sc.fit_transform(X_train)
    X_test = sc.transform(X_test)

    start = time.time()
    classifier = RandomForestClassifier(n_estimators=42, criterion='entropy')
    classifier.fit(X_train, y_train)
    y_pred = classifier.predict(X_test)
    # NOTE: this ShuffleSplit object is not passed on; cross_val_score below uses 10 folds.
    cv = ShuffleSplit(n_splits=5, test_size=0.3)
    scores = cross_val_score(classifier, X, y, cv=10)
    print(classification_report(y_test, y_pred))
    print("Random Forest accuracy after 10 fold CV: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)
          + ", " + str(round(time.time() - start, 3)) + "s")
    print("******************************")
    print("******************************")
    print("******************************")

    # print (' Accuracy:', accuracy_score(y_test, y_pred))

    # Accuracy is the mean of the cross-validation scores;
    # precision, recall and F1 come from the 70/30 hold-out split.
    print('scores.mean:', scores.mean())
    accuracy = scores.mean()
    print("______________________________")
    print('Precision:', precision_score(y_test, y_pred, average='weighted'))
    precision = precision_score(y_test, y_pred, average='weighted')
    # print ('Precision:', precision_score(y_test, y_pred))
    print("______________________________")
    print('Recall:', recall_score(y_test, y_pred, average='weighted'))
    recall = recall_score(y_test, y_pred, average='weighted')
    # print ('Recall:', recall_score(y_test, y_pred))
    print("______________________________")
    print('F1 score:', f1_score(y_test, y_pred, average='weighted'))
    f1score = f1_score(y_test, y_pred, average='weighted')
    # print ('F1 score:', f1_score(y_test, y_pred))
    print("______________________________")

    print("*" * 90)

    return accuracy, precision, recall, f1score, classifier

def generateNaiveBayes(dataset):
    start = time.time()
    label_Label = LabelEncoder()
    # converting text labels into numbers
    dataset["label"] = label_Label.fit_transform(dataset['label'])
    X = dataset.drop("label", axis=1)
    y = dataset['label']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    nb = GaussianNB()
    nb.fit(X_train, y_train)
    y_pred = nb.predict(X_test)
    # NOTE: this ShuffleSplit object is not passed on; cross_val_score below uses 10 folds.
    cv = ShuffleSplit(n_splits=5, test_size=0.3)
    scores = cross_val_score(nb, X, y, cv=10)
    print(classification_report(y_test, y_pred))
    print("Naive Bayes accuracy after 10 fold CV: %0.2f (+/- %0.2f)" % (scores.mean(), scores.std() * 2)
          + ", " + str(round(time.time() - start, 3)) + "s")
    print("******************************")
    print("******************************")
    print("******************************")

    # Unlike the random forest function, accuracy here comes from the 70/30 hold-out split.
    print('Accuracy:', accuracy_score(y_test, y_pred))
    accuracy = accuracy_score(y_test, y_pred)

    print("______________________________")
    print('Precision:', precision_score(y_test, y_pred, average='weighted'))
    precision = precision_score(y_test, y_pred, average='weighted')
    # print ('Precision:', precision_score(y_test, y_pred))

    print("______________________________")
    print('Recall:', recall_score(y_test, y_pred, average='weighted'))
    recall = recall_score(y_test, y_pred, average='weighted')
    # print ('Recall:', recall_score(y_test, y_pred))

    print("______________________________")
    print('F1 score:', f1_score(y_test, y_pred, average='weighted'))
    f1score = f1_score(y_test, y_pred, average='weighted')
    # print ('F1 score:', f1_score(y_test, y_pred))

    print("______________________________")
    return accuracy, precision, recall, f1score, nb

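if __name__ == "__main__":
    print('NB')
    # read_dataset() takes three arguments (dataset name, feature type, pre-processing option)
    # and generateNaiveBayes() returns five values. The dataset name is a placeholder here.
    dataset = read_dataset('<dataset_name>', 'sentiment', 'Stopwords Removal')
    accuracy, precision, recall, f1score, nb = generateNaiveBayes(dataset)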
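
The classifier functions above report precision, recall and F1 score with average='weighted'. As a rough illustration of what that averaging does, the sketch below recomputes the three values in plain Python. The helper name weighted_scores and the toy labels are assumptions made for this example only; they are not part of the project code.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (precision, recall, f1), each averaged over classes weighted by support."""
    support = Counter(y_true)          # number of true samples per class
    total = len(y_true)
    precision = recall = f1 = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        n_pred = sum(1 for p in y_pred if p == cls)
        p_cls = tp / n_pred if n_pred else 0.0      # precision for this class
        r_cls = tp / n_true                         # recall for this class
        f_cls = 2 * p_cls * r_cls / (p_cls + r_cls) if (p_cls + r_cls) else 0.0
        weight = n_true / total                     # share of samples in this class
        precision += weight * p_cls
        recall += weight * r_cls
        f1 += weight * f_cls
    return precision, recall, f1


# Toy labels: three classes with 3, 2 and 1 true samples.
y_true = [0, 0, 0, 1, 1, 2]
y_pred = [0, 0, 1, 1, 1, 2]
print(weighted_scores(y_true, y_pred))
```

With this weighting, the recall value equals the plain accuracy (5 of 6 predictions are correct in the toy data), which is one reason the report lists accuracy and the weighted scores side by side.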

Module Feature Computation


Code
Part-of-Speech Tagging
A part-of-speech tagger (POS tagger) is a piece of software that reads text in some
language and assigns a part of speech, such as noun, verb or adjective, to each word
(and other token). The following code counts the POS tags in each review.
CODE

import nltk
from nltk import word_tokenize, pos_tag
from nltk.corpus import wordnet as wn
from nltk.tokenize import RegexpTokenizer
from nltk.corpus import stopwords
import xlrd
import xlwt
import re
from collections import Counter
from nltk.stem import WordNetLemmatizer

wordnet_lemmatizer = WordNetLemmatizer()

prepList = ["", "CC", "CD", "DT", "EX", "FW", "IN", "JJ", "JJR", "JJS", "LS", "MD", "NN",
"NNS", "NNP",
"NNPS", "PDT", "POS",
"PRP", "PRP$", "RB", "RBR", "RBS", "RP", "RP", "TO", "UH", "VB", "VBD",
"VBG", "VBN", "VBP",
"VBZ", "WDT", "WP", "WP$", "WRB"]

# Reading the XLSX-formatted source file

loc = (r"C:\CUST 07\FYP\POS tagging\2\abc.xlsx")

wb = xlrd.open_workbook(loc)
sheet = wb.sheet_by_index(0)
wbWrite = xlwt.Workbook()

style = xlwt.easyxf('font: bold 1')
sheetToWrite = wbWrite.add_sheet('Part_of_speech')

def process1():
    try:
        # Header row: one column per POS tag
        for i in range(1, len(prepList)):
            sheetToWrite.write(0, i, prepList[i], style)

        for read in range(sheet.nrows):
            tempIndexList = []
            txt1 = sheet.cell_value(read, 0)

            # Special character removal from the source text
            tokenizer = RegexpTokenizer(r'\w+')
            x = tokenizer.tokenize(txt1)
            print(x)
            # Stop word removal
            tokens_without_sw = [word for word in x if not word in stopwords.words()]
            print(tokens_without_sw)

            #x = txt1.lower()
            #x = re.sub(r'\W', ' ', txt1)
            #x = re.sub(r'\s+', ' ', txt1)
            #print(x)

            #word_tokens = word_tokenize(x)
            #print(word_tokens)
            # Remove Punctuation
            # Lemmatization (the lemma is only printed; tagging below uses the original tokens)
            for token in tokens_without_sw:
                token = wordnet_lemmatizer.lemmatize(token, pos="v")
                print(token)

            tagged = nltk.pos_tag(tokens_without_sw)
            counts = Counter(tag for word, tag in tagged)
            print(counts)
            sheetToWrite.write(read + 1, 0, tokens_without_sw)

            # for i in range(1, len(prepList)):
            #     sheetToWrite.write(read + 1, i, 0)

            for i in counts.elements():
                if i in prepList:
                    column = prepList.index(i)
                    if not column in tempIndexList:
                        sheetToWrite.write(read + 1, column, counts[i])
                        tempIndexList.append(column)

            """
            Writing 0's to columns with no values
            """
            for i in range(1, len(prepList)):
                if i not in tempIndexList:
                    sheetToWrite.write(read + 1, i, 0)

        wbWrite.save("POS Count.xls")
        # print(nltk.help.upenn_tagset())

    except Exception as e:
        print(str(e))

process1()
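
The core of the POS feature is a count of each tag per review, written into a fixed column order. A minimal sketch of that step alone is given below. It uses hand-written (word, tag) pairs in place of the output of nltk.pos_tag so that it runs without any NLTK data; the shortened tag list and the helper name pos_count_vector are assumptions for illustration.

```python
from collections import Counter

# Shortened tag list for the example; the listing above uses the full Penn Treebank set.
TAG_COLUMNS = ["NN", "VB", "JJ", "RB"]


def pos_count_vector(tagged_tokens, tag_columns=TAG_COLUMNS):
    """Count each tag and return the counts in column order (0 when a tag is absent)."""
    counts = Counter(tag for _word, tag in tagged_tokens)
    return [counts.get(tag, 0) for tag in tag_columns]


# Hand-written (word, tag) pairs standing in for the output of a POS tagger.
tagged = [("great", "JJ"), ("app", "NN"), ("works", "VB"), ("support", "NN")]
print(pos_count_vector(tagged))  # [2, 1, 1, 0]
```

Building the whole row in one pass avoids the separate step of writing zeros into the columns that received no value.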

Bag-of-Words
A bag-of-words model, or BoW for short, is a way of extracting features from text for use in
modeling, such as with machine learning algorithms. The approach is very simple and flexible,
and can be used in a myriad of ways for extracting features from documents.
Code
import re

import nltk
import pandas as pd
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

from n_gram import output_to_csv
from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization

#wordnet_lemmatizer = WordNetLemmatizer()

#def stopword_rem(token):
#tokens_without_sw = [word for word in token if not word in stopwords.words()]
#return tokens_without_sw

#def lemmitization(token):
#token = wordnet_lemmatizer.lemmatize(token, pos="v")
#return token

def main():
    Review_df = pd.read_csv("C:/FYP/POS tagging/bagofwords/abc.csv")
    texts_list = Review_df['text'].tolist()
    # texts_list[0] = "Playing...."
    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        # Replace every non-word character (anything other than a letter, digit or
        # underscore, such as "!" or "?") with a space
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        # Collapse each run of white-space characters into a single space
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])
        # TODO Number remove

    bag_of_words_list = []
    count = 0

    for sentence in texts_list:
        wordfreq = {}
        tokens = nltk.word_tokenize(sentence)
        # List of words/tokens
        # Stop word removal
        ftoken = stopword_rem(tokens)

        """
        Example: the tokens ['The', 'The', 'Samad'] give
        wordfreq = {'The': 2, 'Samad': 1}

        sentence_1 = ['The', 'The', 'Samad']
        sentence_2 = ['The', 'BAG', 'Samad']

        bag_of_words_list holds one such dictionary per sentence: [{}, {}, {}]
        """

        for token in ftoken:
            # One word per token
            token = lemmitization(token)
            #token = wordnet_lemmatizer.lemmatize(token, pos="v")
            if token not in wordfreq.keys():
                wordfreq[token] = 1
            else:
                wordfreq[token] += 1

        count += 1
        bag_of_words_list.append(wordfreq)

    output_to_csv('bag_of_words_output.csv', bag_of_words_list, Review_df)

if __name__ == "__main__":
    main()
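
The listing above builds one word-frequency dictionary per review and leaves the alignment into columns to output_to_csv. The following sketch shows the same idea end to end on the two example sentences from the code comment, using collections.Counter. The function name bag_of_words and the sorted vocabulary order are assumptions for this illustration, not the project's own layout.

```python
from collections import Counter


def bag_of_words(token_lists):
    """Return (vocabulary, rows): one count row per document, aligned to the vocabulary."""
    counts = [Counter(tokens) for tokens in token_lists]
    vocabulary = sorted(set().union(*counts))       # every distinct word, in a fixed order
    rows = [[c[word] for word in vocabulary] for c in counts]
    return vocabulary, rows


docs = [["the", "the", "samad"], ["the", "bag", "samad"]]
vocab, rows = bag_of_words(docs)
print(vocab)  # ['bag', 'samad', 'the']
print(rows)   # [[0, 1, 2], [1, 1, 1]]
```

Each row has the same length, so the rows can be fed directly to a classifier as a feature matrix.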
TF-IDF
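TF-IDF (term frequency-inverse document frequency) is a statistical measure that evaluates how
relevant a word is to a document in a collection of documents. The score is the product of the
term frequency (how often the word occurs in the document, relative to the document length) and
the inverse document frequency (the logarithm of the number of documents divided by the number
of documents that contain the word). It has many uses, most importantly in automated text
analysis, and is very useful for scoring words in machine learning algorithms for Natural
Language Processing (NLP).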
CODE
import pandas as pd
import re
import nltk
from Pre_Processing import stopword_rem
from Pre_Processing import lemmitization

punctuations = "?:!.,;"

def compute_tf(token):
    # Term frequency: count of each word divided by the number of words in the document
    num_of_words = len(token)

    freq = {}
    tf = {}
    for word in token:
        if word in freq:
            freq[word] += 1
        else:
            freq[word] = 1

    for value in freq:
        tf[value] = freq[value] / num_of_words

    return tf, freq

def compute_idf(doc_list):
    import math
    idf_dict = {}
    N = len(doc_list)
    # doc_list has the form [{}, {}, {}]: one frequency dictionary per document
    for doc in doc_list:
        for word, val in doc.items():
            if val > 0:
                if idf_dict.get(word):
                    idf_dict[word] += 1
                else:
                    idf_dict[word] = 1

    # Replace each document count with log(N / document count)
    for word, val in idf_dict.items():
        idf_dict[word] = math.log(N / float(val))

    return idf_dict

def compute_tf_idf(tf_list, idf):
    for tf_dict in tf_list:
        for word in tf_dict:
            # Tf = doc[word]
            # idf = idf[word]
            tf_dict[word] = tf_dict[word] * idf[word]
    return tf_list

def output_to_csv(file_name, data_list, review_df=None):
    df = pd.DataFrame(data_list)
    df = df.fillna(0)
    df.index.name = "Review #"
    if review_df is not None:
        # Add the review text and move it to the first column
        df['text'] = review_df['text']
        cols = df.columns.tolist()
        cols = cols[-1:] + cols[:-1]
        df = df[cols]
    df.to_csv(file_name)

def main():
    texts_list = ["it is going to rain today",
                  "today i am not going outside",
                  "i am going to watch the season premiere"]
    # corpus = ['This is the first document.',
    #           'This document is the second document.',
    #           'And this is the third one.',
    #           'Is this the first document?',
    #           ]
    # train_set = ["sky is blue", "sun is bright", "sun in the sky is bright"]

    # reviews_df = pd.read_csv("abc.csv")

    # texts_list = reviews_df['text'].tolist()

    for i in range(len(texts_list)):
        texts_list[i] = texts_list[i].lower()
        texts_list[i] = re.sub(r'\W', ' ', texts_list[i])
        texts_list[i] = re.sub(r'\s+', ' ', texts_list[i])

    all_tfs = []
    all_freqs = []
    for text in texts_list:
        token = nltk.word_tokenize(text)
        # Remove punctuation (a new list is built, because removing items from a
        # list while iterating over it skips elements)
        token = [word for word in token if word not in punctuations]

        # Lemmatization
        for i in range(len(token)):
            token[i] = lemmitization(token[i])

        tf, freq = compute_tf(token)
        all_tfs.append(tf)
        all_freqs.append(freq)

    idf = compute_idf(all_freqs)
    tfs_final = compute_tf_idf(all_tfs, idf)

    output_to_csv('tf_idf_output.csv', tfs_final, None)

if __name__ == '__main__':
    main()
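
As a worked example of the formulas used in compute_tf and compute_idf, the sketch below scores the three sample sentences from main() with the standard library only. It skips lemmatization and tokenizes on whitespace, which is an assumption made to keep the example self-contained; the helper name tf_idf is likewise illustrative.

```python
import math
from collections import Counter

docs = [
    "it is going to rain today".split(),
    "today i am not going outside".split(),
    "i am going to watch the season premiere".split(),
]

n_docs = len(docs)
# Document frequency: the number of documents that contain each word.
doc_freq = Counter(word for doc in docs for word in set(doc))


def tf_idf(doc):
    """Return {word: tf * idf} with tf = count / len(doc) and idf = log(N / df)."""
    counts = Counter(doc)
    return {w: (c / len(doc)) * math.log(n_docs / doc_freq[w]) for w, c in counts.items()}


scores = tf_idf(docs[0])
print(round(scores["rain"], 4))   # 0.1831  (1/6 * ln(3/1))
print(round(scores["today"], 4))  # 0.0676  (1/6 * ln(3/2))
print(scores["going"])            # 0.0     (appears in every document)
```

The word "going" occurs in all three sentences, so its inverse document frequency is log(3/3) = 0 and it contributes nothing to the feature vector, while "rain", which occurs in one sentence only, receives the highest score.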

Pre-processing
In pre-processing we perform stop-word removal, special-character removal and lemmatization.
Code
from nltk import WordNetLemmatizer
from nltk.corpus import stopwords

wordnet_lemmatizer = WordNetLemmatizer()

def stopword_rem(token):
    # Keep only the tokens that are not in NLTK's stop-word lists
    tokens_without_sw = [word for word in token if not word in stopwords.words()]
    return tokens_without_sw

def lemmitization(token):
    # Lemmatize the token as a verb
    token = wordnet_lemmatizer.lemmatize(token, pos="v")
    return token
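
To show how special-character removal and stop-word removal fit together, here is a minimal sketch that runs without NLTK. The small stop-word set and the function name preprocess are assumptions for illustration (the project uses NLTK's stopwords corpus), and lemmatization is left out because it needs the WordNet data.

```python
import re

# Small illustrative stop-word sample; the project uses NLTK's stopwords corpus instead.
STOP_WORDS = frozenset({"a", "an", "the", "is", "it", "to", "and", "this"})


def preprocess(text, stop_words=STOP_WORDS):
    """Lower-case, strip special characters, tokenize on whitespace, drop stop words."""
    text = re.sub(r"\W", " ", text.lower())   # non-word characters become spaces
    tokens = text.split()                      # split() also collapses repeated spaces
    return [tok for tok in tokens if tok not in stop_words]


print(preprocess("This app is GREAT!!! Easy to set-up & use."))
# ['app', 'great', 'easy', 'set', 'up', 'use']
```

Holding the stop words in a set built once makes each membership test constant-time, whereas calling stopwords.words() inside the list comprehension rebuilds the word list for every token.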
Module Web App
Code
Home.html
<!DOCTYPE html>

<!--[if gt IE 8]><!--> <html class="no-js"> <!--<![endif]-->


<head>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<title></title>
<meta name="description" content="">
<meta name="viewport" content="width=device-width, initial-scale=1">
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css"
integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T"
crossorigin="anonymous">

<style>
html, body {
max-width: 100%;
overflow-x: hidden;
}

</style>

</head>
<body>

<header id="main-header" class="py-2 bg-primary text-white" >


<div class="wrapper container">

<div class="row">
<div class="col-md-12 text-center">

<h1 ><i class="fa fa-gear"></i> Shopify User Reviews Classification</h1>

</div>
</div>

</div>
</header>

<nav class="navbar navbar-expand-sm navbar-dark bg-dark p-0">


<div class="container">

<button
class="navbar-toggler"
data-toggle="collapse"
data-target="#navbarNav"
>
<span class="navbar-toggler-icon"></span>
</button>
<div class="collapse navbar-collapse" id="navbarNav">
<ul class="navbar-nav">

<li><a href="index2"><strong>Home</strong> </a></li>

<li><a href="dataset"><strong>Dataset Selection</strong> </a></li>

<li><a href="feature_selection" disabled=""><strong>Feature Extraction</strong> </a></li>

<li><a href="data_preprocessing" disabled=""><strong>Data Preprocessing</strong> </a></li>

<li><a href="classifier" disabled=""><strong>Classifier</strong> </a></li>

<li><a href="history" ><strong>History</strong> </a></li>

<li><a href="contact_us"><strong>Contact Us</strong> </a></li>

<li><a href="about"><strong>About Us</strong> </a></li>

<li><a href="login"><strong>Logout</strong> </a></li>

</ul>

<ul class="navbar-nav ml-auto">


<li class="nav-item dropdown mr-3">
<a
href="#"
class="nav-link dropdown-toggle"
data-toggle="dropdown"
>
<i class="fa fa-user"></i> Welcome
</a>
<div class="dropdown-menu">
<a href="/profile" class="dropdown-item">
<i class="fa fa-user-circle"></i> Profile
</a>
<a href="/settings" class="dropdown-item">
<i class="fa fa-gear"></i> Settings
</a>
</div>
</li>
<li class="nav-item">
<a href="/logout" class="nav-link">
<i class="fa fa-user-times"></i> Logout
</a>
</li>
</ul>
</div>
</div>
</nav>

</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<h1 align="center">Shopify User Reviews Classification</h1>
</br></br>
<p>

<footer class="footer fixed-bottom container" style="text-align: center;">


<hr>
<p>&copy; 2021 Shopify User Reviews Classification, Inc.</p>
</footer>

<script src="https://code.jquery.com/jquery-3.3.1.slim.min.js" integrity="sha384-q8i/X+965DzO0rT7abK41JStQIAqVgRVzpbzo5smXKp4YfRvH+8abtTE1Pi6jizo"
crossorigin="anonymous"></script>
<script src="https://cdnjs.cloudflare.com/ajax/libs/popper.js/1.14.7/umd/popper.min.js" integrity="sha384-UO2eT0CpHqdSJQ6hJty5KVphtPhzWj9WO1clHTMGa3JDZwrnQq4sF86dIHNDz0W1"
crossorigin="anonymous"></script>
<script src="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/js/bootstrap.min.js" integrity="sha384-JjSmVgyd0p3pXB1rRibZUAYoIIy6OrQ6VrjIEaFf/nJGzIxFDsf4x0xIM+B07jRM"
crossorigin="anonymous"></script>

<script src="" async defer></script>


</body>
</html>

Header.html

<header id="main-header" class="py-2 bg-primary text-white" >


<div class="wrapper container">

<div class="row">
<div class="col-md-12 text-center">

<h1 ><i class="fa fa-gear"></i> Shopify User Reviews Classification</h1>

</div>
</div>

</div>
</header>

<div class="row justify-content-center">


<div class="col-md-10 ">
<nav class="navbar navbar-expand-lg navbar-light bg-light">

<div class="collapse navbar-collapse" id="navbarSupportedContent">


<ul class="navbar-nav mr-auto">

<li><a href="index"><strong>HOME</strong> </a></li>

<li><a href="dataset"><strong>DATASET SELECTION</strong> </a></li>

<li><a href="feature_selection"><strong>FEATURE EXTRACTION</strong> </a></li>

<li><a href="data_preprocessing"><strong>DATASET PREPROCESSING</strong> </a></li>

<li><a href="classifier"><strong>CLASSIFIER</strong> </a></li>

<li><a href="history"><strong>HISTORY</strong> </a></li>

<li><a href="unseen_review"><strong>UNSEEN REVIEW</strong> </a></li>

<li><a href="contact_us"><strong>CONTACT US</strong> </a></li>

<li><a href="about"><strong>ABOUT US</strong> </a></li>

</ul>

</div>
</nav>
</div></div>

</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<h1 align="center">About Us</h1>
</br></br>

<img src="waleed.png" />

<br><br>
<img src="samad.jpeg" alt="HTML5 Icon" width="128" height="128">
<br><br>

</div></div>
</br></br>

Preprocessing.html
</script>

<header id="main-header" class="py-2 bg-dark text-white" >


<div class="wrapper container">

<div class="row" >


<div class="col-md-12 text-center">
<img src="https://i.ibb.co/1sb0Rt1/logo.png" alt="logo" border="0" width="80px" height="80px"
align="left" >
<h1 style="padding-top:15px"><i class="fa fa-gear"></i> Classification of Shopify Apps User
Reviews</h1>
</div>
</div>

</div>
</header>
<div class="row justify-content-center">
<div class="col-md-11.5 ">
<nav class="navbar navbar-expand-lg navbar-light bg-light">

<div class="collapse navbar-collapse" id="navbarSupportedContent">


<ul class="navbar-nav mr-auto">

<li><a href="index2"><strong><h6>Home</h6></strong> </a></li>

<li><a href="dataset"><strong><h6>Dataset Selection</h6></strong> </a></li>

<li><a href="feature_selection" ><strong><h6>Feature Extraction</h6></strong> </a></li>

<li><a href="data_preprocessing" ><strong><h6>Data Preprocessing</h6></strong> </a></li>

<li><a href="classifier" disabled=""><strong><h6>Classifier</h6></strong> </a></li>

<li><a href="history" ><strong><h6>History</h6></strong> </a></li>

<!-- <li><a href="unseen_review" disabled=""><strong>Unseen Review</strong> </a></li>-->

<li><a href="contact_us"><strong><h6>Contact Us</h6></strong> </a></li>

<li><a href="about"><strong><h6>About Us</h6></strong> </a></li>

<li><a href="login"><strong><h6>Logout</h6></strong> </a></li>

</ul>

<input type="hidden" id="namea" name="variable" value="{{ test }}">


</div>
</nav>
</div></div>

</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<h1 align="center">Data Preprocessing</h1>
</div></div>
</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<div class="dropdown">
<h3>
Preprocessing {{ system }}
</h3>

Features selectio.html
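<script type="text/javascript">
function codeAddress() {
if(localStorage.getItem("feature") != undefined){
var myradioValue = localStorage.getItem("feature")
// The radio buttons are named "feature"; the value is quoted because it may contain spaces
$('input[name=feature][value="' + myradioValue + '"]').attr('checked', true);
}
}

function saveradio()
{
// Read the currently selected radio button before saving it
var radiovalue = $("input[name=feature]:checked").val();
if (radiovalue !== undefined) {
localStorage.setItem("feature", radiovalue);
}
}

</script>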

</head>
<body onload="codeAddress();" onbeforeunload="saveradio();">

<header id="main-header" class="py-2 bg-dark text-white" >


<div class="wrapper container">

<div class="row" >


<div class="col-md-12 text-center">
<img src="https://i.ibb.co/1sb0Rt1/logo.png" alt="logo" border="0" width="80px" height="80px"
align="left" >
<h1 style="padding-top:15px"><i class="fa fa-gear"></i> Classification of Shopify Apps User
Reviews</h1>
</div>
</div>

</div>
</header>
<div class="row justify-content-center">
<div class="col-md-11.5 ">
<nav class="navbar navbar-expand-lg navbar-light bg-light">

<div class="collapse navbar-collapse" id="navbarSupportedContent">
<ul class="navbar-nav mr-auto">

<li><a href="index2"><strong><h6>Home</h6></strong> </a></li>

<li><a href="dataset"><strong><h6>Dataset Selection</h6></strong> </a></li>

<li><a href="feature_selection" ><strong><h6>Feature Extraction</h6></strong> </a></li>

<li><a href="data_preprocessing" disabled=""><strong><h6>Data Preprocessing</h6></strong> </a></li>

<li><a href="classifier" disabled=""><strong><h6>Classifier</h6></strong> </a></li>

<li><a href="history" ><strong><h6>History</h6></strong> </a></li>

<!-- <li><a href="unseen_review" disabled=""><strong>Unseen Review</strong> </a></li>-->

<li><a href="contact_us"><strong><h6>Contact Us</h6></strong> </a></li>

<li><a href="about"><strong><h6>About Us</h6></strong> </a></li>

<li><a href="login"><strong><h6>Logout</h6></strong> </a></li>

</ul>

</div>
</nav>
</div></div>

</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<h1 align="center">Feature Selection</h1>
</div></div>
</br></br>

<div class="row justify-content-center">


<div class="col-md-10 ">
<h3>Feature Extraction</h3>

<form method="post" id="myfeature" action="feature_extraction">


{% csrf_token %}
<input type="radio" id="Bag" name="feature" value="Bag Of Words">

<label for="Bag">Bag Of Words</label><br>
<input type="radio" id="part" name="feature" value="Part of Speech Tagging">
<label for="part">Part of Speech Tagging</label><br>
<input type="radio" id="tf" name="feature" value="TF-IDF">
<label for="tf">TF-IDF</label><br>
<input type="radio" id="pos" name="feature" value="Discrete Positive">
<label for="pos">Discrete Positive</label><br>
<input type="radio" id="neg" name="feature" value="Discrete negative">
<label for="neg">Discrete negative</label><br>
<input type="radio" id="Polarity" name="feature" value="Polarity">
<label for="Polarity">Polarity</label><br>
<input type="radio" id="sent" name="feature" value="Sentiments">
<label for="sent">Sentiments</label><br>
<input type="radio" id="all" name="feature" value="All">
<label for="all">All</label><br>
History.html
<div class="row justify-content-center">
<div class="col-md-10 ">
<table class="table table-striped">
<thead>
<tr>
<th scope="col">Data Type</th>
<th scope="col">Feature Extraction</th>
<th scope="col">Preprocessing</th>
<th scope="col">Machine Learning Model</th>
<th scope="col">Accuracy</th>
<th scope="col">F-Measure</th>
<th scope="col">Precision</th>
<th scope="col">Recall</th>
<th scope="col">Validation Technique</th>
<th scope="col">Action</th>
</tr>
</thead>
<tbody>
{% if dataset %}
{% for ml,prep,feat in dataset %}
<tr>
<!-- <th scope="row">1</th>-->
<td>Product Review</td>
<td>{{ feat.feature }}</td>
<td>{{ prep.prep }}</td>
<td>{{ ml.classifier }}</td>
<td>{{ ml.accuracy }}</td>
<td>{{ ml.fmeasure }}</td>
<td>{{ ml.precision }}</td>

<td>{{ ml.recall }}</td>
<td>{{ ml.val_tech }}</td>
<td>
<form method="post" action="delrec">
{% csrf_token %}

<input id="prodId" name="prodId" type="hidden" value="{{ ml.id }}">


<button onclick="return confirm('Are you sure you want to delete this?')" type="submit" class="delete-row" style="background-color: red;color:white">Delete</button><br><br>
</form>
<a href="unseen_review" style="background-color: #228B22">Test</a><br></td>
</tr>
{% endfor %}
{% endif %}

</tbody>
</table>

Chapter 5
Software Testing
This chapter describes the testing procedure that was adopted. It covers the selected
testing methodology, the test suite and the test results of the developed software.

5.1 Testing Methodology


After implementation, the process flow manager was tested for functional errors. We performed
Black Box Testing (passing randomly selected values and comparing the result against the
expected output of the normal flow) together with Unit and Integration Testing. Black box
testing checks the functional requirements implemented in our system without regard to the code.

The test cases were executed manually, without the use of any tool.

5.2 Test Cases

5.2.1 User Registration Test case


Table 15 User Registration Test case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: Registration Test ID: 2

Version: 2 Test Type: Black Box Testing

Input

Name=Abdul Samad

Username=abdul.samad

Email=bse173024@cust.pk

Password=173024

Confirm Password=173024

Expected Output

New user is registered into the system.

Actual Output

Registration completed successfully.

Expected Exceptions

Invalid email.

Password doesn’t match.
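
The test cases in this chapter were run by hand. As a sketch of how the registration case could also be checked automatically, the example below encodes the input and the two expected exceptions from Table 15 as unit tests. The function validate_registration is a hypothetical stand-in written for this example, not the web app's actual registration code, and its email pattern is a deliberately simple assumption.

```python
import re
import unittest


def validate_registration(name, username, email, password, confirm_password):
    """Hypothetical stand-in for the registration check; returns a list of error messages."""
    errors = []
    # Deliberately simple pattern: something@something.something
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", email):
        errors.append("Invalid email.")
    if password != confirm_password:
        errors.append("Password doesn't match.")
    return errors


class RegistrationTest(unittest.TestCase):
    def test_valid_input_is_accepted(self):
        errors = validate_registration(
            "Abdul Samad", "abdul.samad", "bse173024@cust.pk", "173024", "173024")
        self.assertEqual(errors, [])

    def test_invalid_email_is_rejected(self):
        errors = validate_registration(
            "Abdul Samad", "abdul.samad", "not-an-email", "173024", "173024")
        self.assertIn("Invalid email.", errors)

    def test_password_mismatch_is_rejected(self):
        errors = validate_registration(
            "Abdul Samad", "abdul.samad", "bse173024@cust.pk", "173024", "000000")
        self.assertIn("Password doesn't match.", errors)
```

Saved to a file, the tests can be run with python -m unittest followed by the file name. Each test treats the function as a black box: it supplies input and checks only the returned messages.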

5.2.2 User Login Test case


Table 16 User Login Test Case

Date: 1/8/2021
System: Classify Shopify User Reviews

Objective: Login Test ID: 3

Version: 1 Test Type: Black Box Testing

Input

Username=abdul.samad

Password=173024

Expected Output

New user is registered into the system.

Actual Output

Registration Successful.

Expected Exceptions

Invalid email.

Password doesn’t match.

Empty Field.

5.2.3 Choose Dataset Test case


Table 17 Choose Dataset Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: Choose Dataset Test ID: 4

Version: 1 Test Type: Black Box Testing

Input

Select csv file from system

Expected Output

Dataset should be selected

Actual Output

Dataset selected

Expected Exceptions

Corrupt csv File
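
The expected exception for this test case is a corrupt CSV file. One possible way to detect such a file before it is accepted is sketched below with the standard csv module. The required column names text and label are taken from the code listings in this report; the function name check_reviews_csv and the exact error messages are assumptions for illustration.

```python
import csv
import io

REQUIRED_COLUMNS = {"text", "label"}  # column names used by the listings in this report


def check_reviews_csv(handle):
    """Return the number of data rows, or raise ValueError if the file is unusable."""
    reader = csv.DictReader(handle)
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        raise ValueError("Corrupt csv file: missing column(s) " + ", ".join(sorted(missing)))
    rows = 0
    for row in reader:
        # DictReader stores surplus fields under the key None and fills
        # missing fields with the value None.
        if None in row or None in row.values():
            raise ValueError("Corrupt csv file: wrong field count on line %d" % reader.line_num)
        rows += 1
    return rows


good = io.StringIO("text,label\nGreat app,positive\nCrashes a lot,negative\n")
print(check_reviews_csv(good))  # 2

bad = io.StringIO("review\nGreat app\n")
try:
    check_reviews_csv(bad)
except ValueError as exc:
    print(exc)  # Corrupt csv file: missing column(s) label, text
```

Raising a single exception type lets the web layer catch it and show the "Corrupt csv File" message to the user.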

5.2.4 View Dataset Test Case 1


Table 18 View Dataset Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: View Dataset Test ID: 5

Version: 1 Test Type: Black Box Testing

Input

Product review.

Upload csv file from system

Expected Output

Corrupted csv file

Actual Output

Corrupt csv File

Expected Exceptions

Dataset is displayed

5.2.5 View Dataset Test Case 2


Table 19 View Dataset Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: View Dataset Test ID: 6

Version: 2 Test Type: Black Box Testing

Input

Product review.

Upload csv file from system

Expected Output

Dataset should be displayed

Actual Output

Dataset is displayed

Expected Exceptions

Corrupt csv File

5.2.6 Train Model Test Case


Table 20 Train Model Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: Train Model Test ID: 7

Version: 1 Test Type: Black Box Testing

Input

Product review

Expected Output

System will show Feature Extraction

Actual Output

User will be able to select the Feature Extraction method.

Expected Exceptions

Backend exception

5.2.7 Apply Feature Extraction method on Dataset Test Case

Table 21 Apply Feature Extraction method on Dataset

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Apply Feature Extraction method on Dataset Test ID: 8

Version: 1 Test Type: Black Box Testing

Input

Part of Speech Tagging

Expected Output

System will apply part of speech tagging on dataset.

Actual Output

User will be able to select the Preprocessing technique.

Expected Exceptions

System may not be responding.

5.2.8 Apply Part of speech on Dataset Test Case


Table 22 Apply Part of speech on Dataset Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Apply Part of speech on Dataset Test ID: 9

Version: 1 Test Type: Black Box Testing

Input

Part of Speech Tagging

Expected Output

System will apply part of speech tagging on dataset.

Actual Output

User will be able to select the Preprocessing technique.

Expected Exceptions

System may not be responding.

5.2.9 Remove Special Characters Test Case


Table 23 Remove special characters Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Remove special characters Test ID: 10

Version: 1 Test Type: Black Box Testing

Input

Special Character removal

Expected Output

System will apply special character removal from the given dataset.

Actual Output

System not responding.

Expected Exceptions

System may not be responding.

5.2.10 Apply Preprocessing Technique on Dataset Test Case


Table 24 Apply Preprocessing Technique on Dataset Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Apply Preprocessing Technique on Dataset Test ID: 11

Version: 1 Test Type: Black Box Testing

Input

Stopword Removal

Expected Output

User will be able to select Stopword Removal.

Actual Output

System applied Stopword Removal on dataset.

Expected Exceptions

System may not be responding.

5.2.11 View processed Feature data Test Case


Table 25 View processed Feature data Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: View Feature Test ID: 12

Version: 1 Test Type: Black Box Testing

Input

Stopword Removal

Expected Output

System will show applied stopword removal .csv file

Actual Output

Application error 404

Expected Exceptions

Stop word removal was not applied properly

5.2.12 View processed Feature data Test Case


Table 26 View processed Feature data Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: View Feature Test ID: 13

Version: 2 Test Type: Black Box Testing

Input

Stopword Removal

Expected Output

System will show applied stopword removal .csv file

Actual Output

System showed applied stopword removal .csv file

Expected Exceptions

Stop word removal was not applied properly

5.2.13 Moving to Classifier Test case


Table 27 Moving to Classifier Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Moving to Classifier Test ID: 14

Version: 1 Test Type: Unit Testing

Input

No input

Expected Output

System will display classification method

Actual Output

System displayed the Machine learning model and Evaluation Metrics.

Expected Exceptions

Backend exception

5.2.14 Machine Learning Model Test case


Table 28 Machine Learning Model Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Machine Learning Model Test ID: 15

Version: 1 Test Type: Black Box Testing

Input

Naive Bayes

Expected Output

System will apply machine learning model

Actual Output

System applied the machine learning model.

Expected Exceptions

System may not be responding.
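
To make the input of this test case concrete, the sketch below shows how a Naive Bayes classifier can label a review as Happy or Unhappy. It is a from-scratch multinomial Naive Bayes with add-one smoothing, written only for illustration; the class name NaiveBayesSketch and the four training reviews are hypothetical, and the application itself may rely on a library implementation instead.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSketch:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # number of reviews per class
        self.word_counts = defaultdict(Counter)  # word frequencies per class
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = text.lower().split()
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, doc_count in self.class_counts.items():
            # Log prior of the class plus the log likelihood of each token.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, not taken from the project dataset.
train_texts = [
    "happy with the product",
    "great app easy to set up",
    "terrible app does not work",
    "bad support and slow",
]
train_labels = ["Happy", "Happy", "Unhappy", "Unhappy"]
model = NaiveBayesSketch().fit(train_texts, train_labels)

if __name__ == "__main__":
    print(model.predict("happy with the app"))
    print(model.predict("terrible slow support"))
```

Working in log space avoids numeric underflow on long reviews, and the add-one smoothing keeps a word unseen in one class from forcing that class's probability to zero.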

5.2.15 Evaluation Metrics Test case


Table 29 Evaluation Metrics Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Evaluation Metrics Test ID: 16

Version: 1 Test Type: Black Box Testing

Input

Accuracy, F-measure, 10-fold cross validation.

Expected Output

System will apply evaluation metrics.

Actual Output

System applied the evaluation metrics.

Expected Exceptions

System may not be responding.
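
The three inputs of this test case (Accuracy, F-measure, and 10-fold cross validation) can be sketched in plain Python as follows. The function names are assumptions made for this example, and the folds are contiguous and unshuffled for simplicity; a real evaluation would normally shuffle or stratify the data first.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f_measure(y_true, y_pred, positive):
    """F1 score (harmonic mean of precision and recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # also avoids division by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, test
        start += size


if __name__ == "__main__":
    y_true = ["Happy", "Happy", "Unhappy", "Unhappy"]
    y_pred = ["Happy", "Unhappy", "Unhappy", "Unhappy"]
    print(accuracy(y_true, y_pred))            # 0.75
    print(f_measure(y_true, y_pred, "Happy"))  # precision 1.0, recall 0.5
```

With 10-fold cross validation, the model is trained ten times, each time on nine folds and tested on the remaining one, and the metrics are averaged over the ten runs.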

5.2.16 Apply Classifier Test case


Table 30 Apply classifier Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Apply Classifier Test ID: 17

Version: 1 Test Type: Black Box Testing

Input

• Machine learning model
• Evaluation metrics
• Validation techniques

Expected Output

System will apply Machine learning model, Evaluation metrics, Validation techniques

Actual Output

System displayed the Machine learning model and Evaluation Metrics results.

Expected Exceptions

System may not be responding.

5.2.17 Save Model Test Case 1


Table 31 Save Model Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Save Model Test ID: 18

Version: 1 Test Type: Black Box Testing

Input

• Machine learning modeling
• Preprocessing Techniques
• Feature Computation

Expected Output

System will save Machine learning modeling, Preprocessing Techniques, Feature Computation

Actual Output

System did not save the model in the history tab

Expected Exceptions

System may not be responding.

5.2.18 Save Model Test Case 2


Table 32 Save Model Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Save Model Test ID: 19

Version: 1 Test Type: Black Box Testing

Input

• Machine learning modeling
• Preprocessing Techniques
• Feature Computation

Expected Output

System will save Machine learning modeling, Preprocessing Techniques, Feature Computation

Actual Output

System saved Machine learning modeling, Preprocessing Techniques, Feature Computation.

Expected Exceptions

System may not be responding.
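
One possible way to persist the three inputs of the Save Model test cases together is sketched below, using Python's standard pickle module. The functions save_model and load_model, the bundle layout, and the example values are assumptions made for illustration; the report does not describe how the application stores saved models for the history tab.

```python
import os
import pickle
import tempfile


def save_model(path, model, preprocessing, features):
    """Store the model with the preprocessing and feature settings used to train it."""
    bundle = {
        "model": model,
        "preprocessing": list(preprocessing),
        "features": list(features),
    }
    with open(path, "wb") as handle:
        pickle.dump(bundle, handle)


def load_model(path):
    """Load a bundle written by save_model. Only unpickle files you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, "model.pkl")
        # Hypothetical values; a real call would pass the trained classifier object.
        save_model(path, {"name": "Naive Bayes"}, ["Stopword Removal"], ["TF-IDF"])
        print(load_model(path)["preprocessing"])
```

Saving the preprocessing and feature settings alongside the model matters because an unseen review must pass through exactly the same steps before prediction.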

5.2.19 Test Model Test case


Table 33 Test model Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: Test Model Test ID: 20

Version: 1 Test Type: Black Box Testing

Input

Test model button Click

Expected Output

Redirect to unseen review prediction.

Actual Output

System did not respond

Expected Exceptions

Test model button may not work properly

5.2.20 Test Model Test Case 2


Table 34 Test model Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: Test Model Test ID: 20

Version: 1 Test Type: Black Box Testing

Input

Test model button Click

Expected Output

Redirect to unseen review prediction.

Actual Output

Redirected to unseen review prediction.

Expected Exceptions

Test model button may not work properly

5.2.21 Unseen Prediction Test Case 1


Table 35 Unseen Prediction Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Unseen Prediction Test ID: 21

Version: 1 Test Type: Black Box Testing.

Input

Text field e.g. I am happy with the product

Expected Output

System will predict unseen review (Happy)

Actual Output

Unhappy

Expected Exceptions

System may not respond.

5.2.22 Unseen Prediction Test Case 2


Table 36 Unseen Prediction Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Unseen Prediction Test ID: 22

Version: 1 Test Type: Black Box Testing.

Input

Text field e.g., Nice fast checkout app. Easy to set up and works as promised. I
like this wiggling button. Really cool feature!

Expected Output

System will predict unseen review (Happy)

Actual Output

Unhappy

Expected Exceptions

System may not respond.

5.2.23 Unseen Prediction Test Case 3


Table 37 Unseen Prediction Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Unseen Prediction Test ID: 23

Version: 2 Test Type: Black Box Testing.

Input

Text field e.g. I am happy with the product

Expected Output

System will predict unseen review (Happy)

Actual Output

Happy

Expected Exceptions

System may not respond.

5.2.24 User Logout Test case


Table 38 User Logout Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Logout Test ID: 24

Version: 1 Test Type: Black box Testing

Input

Logout Button Click

Expected Output

Redirect to Login page

Actual Output

Redirected to login

Expected Exceptions

Logout button may not work properly

5.2.25 Contact Us Test Case


Table 39 Contact Us Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Contact us Page Test ID: 25

Version: 1 Test Type: Black Box Testing

Input

Contact us Button Click and type query

Expected Output

Redirect to Contact us and query sent to email

Actual Output

Redirected to Contact us, but the email was not sent

Expected Exceptions

Email is not received

5.2.26 Contact Us Test Case


Table 40 Contact Us Test Case

Date: 4/8/2021

System: Classify Shopify User Reviews

Objective: Contact us Page Test ID: 26

Version: 1 Test Type: Black Box Testing

Input

Contact us Button Click and type query

Expected Output

Redirect to Contact us and query sent to email

Actual Output

Redirected to Contact us and the email was sent

Expected Exceptions

Email is not received

5.2.27 About Page Test case


Table 41 About Page Test Case

Date: 1/8/2021

System: Classify Shopify User Reviews

Objective: About Page Test ID: 27

Version: 1 Test Type: Black Box Testing

Input

About Button Click

Expected Output

Redirect to About Page

Actual Output

Redirected to About Page

Expected Exceptions

None

Chapter 6
Software Deployment

6.1 Installation / Deployment Process Description


• GitHub

First, we install Git on the system and create an account on GitHub.
• Then we install the GitToolBox plugin in PyCharm.

Figure 38 Text Result Interface

• Then we push the project to GitHub using this tool.
• Heroku

Then we create an account on Heroku, a cloud application platform.
• We create an application on Heroku and select Python as its language.
• Then we connect our Heroku account with our GitHub account.

Figure 39 Text Result Interface

• Then we click Deploy Branch on the Heroku website to deploy the project.

Figure 40 Text Result Interface

• Then we check the status of the deployment on Heroku and also from PyCharm.

Figure 41 Text Result Interface

• Check status from PyCharm.

Figure 42 Text Result Interface

• The deployed project is available at shopifyappreviewprediction.herokuapp.com

Chapter 7
REPORT APPROVAL CERTIFICATE

The report of the project, “Web App to Classify Shopify User Reviews using Textual
Features”, has been approved based on the following evaluation guidelines.

Table 42 Project Evaluation Guidelines

Artifacts Guidelines
Analysis and Design artifacts are syntactically correct (use-case model, SSDs,
domain model, class diagram, SDs, ERDs, Flow charts, Activity Diagram,
DFDs)
Consistency and traceability have been maintained among different artifacts
General Guidelines
Formatting (font style, indentation) is according to the FYP template and
consistent throughout the document
Captions are added to all the figures and tables. Figure captions must be placed
below each figure, and table captions must be provided above the table
Each figure or table is followed by some text describing what it represents

____________________ ____________________ ____________________


Name & Signature Name & Signature Name & Signature
(Examiner 1) (Examiner 2) (Examiner 3)

_________________
Name & Signature
(Supervisor)

References
Research Paper
[1] F. Rustam, A. Mehmood, M. Ahmad, S. Ullah, D. M. Khan and G. S. Choi, "Classification of
Shopify App User Reviews Using Novel Multi Text Features," in IEEE Access, vol. 8, pp.
30234-30244, 2020, doi: 10.1109/ACCESS.2020.2972632.

Webpage
[2] https://www.jetbrains.com/pycharm/features/, last accessed July 24, 2021.

[3] https://www.python.org/doc/, last accessed July 24, 2021.

[4] https://www.w3schools.com/html/, last accessed July 24, 2021.

[5] https://docs.djangoproject.com/en/3.2/, last accessed July 24, 2021.

[6] https://www.javascript.com/about, last accessed July 25, 2021.

[7] https://www.sqlite.org/index.html, last accessed July 26, 2021
