1. INTRODUCTION
1.1 Introduction
2. PROBLEM DEFINITION AND METHODOLOGY
2.1 Problem definition
2.2 Objectives
2.4 Methodology
3. REQUIREMENT ANALYSIS AND SPECIFICATION
3.1 System analysis
3.1.1 Requirement Analysis
3.1.2 Existing system
3.1.3 Proposed System
3.1.4 Feasibility Study
3.1.4.1 Technical Feasibility
3.1.4.2 Economic Feasibility
3.1.4.3 Behavioural Feasibility
3.1.5 Cost Benefit Analysis
3.1.5.1 Hardware Cost
3.1.5.2 Software Cost
3.1.5.3 Personnel Cost
3.2 Requirement specification
PRODREV
3.2.1 Functional requirements
3.2.2 Non-functional requirements
3.2.3 Environmental requirements
4. SYSTEM DESIGN
4.1 System Design
4.1.1 Architecture Diagrams
4.1.1.1 Data Flow Diagrams
4.2 Database Design
4.2.1 Normalization
4.2.2 ER diagram
4.3 Table design
5. IMPLEMENTATION
5.1 System implementation
6. SYSTEM TESTING
6.1 System testing
6.1.1 Unit testing
6.1.2 Integration testing
6.1.2.1 Top-down integration
6.1.2.2 Bottom-up integration
6.2 Future enhancement
7. CONCLUSION
7.1 Conclusion
8. BIBLIOGRAPHY
8.1 Bibliography
9. APPENDIX
9.1 Screenshots
ABSTRACT
PROJECT ABSTRACT
E-shopping is an integral part of present-day life. Numerous
e-commerce platforms exist to meet shopping needs, and each of them
allows users to post a review of a product they have purchased. Every
e-commerce user today can therefore read reviews of the same product on
different shopping sites. The most common form of feedback is the
product rating, a value on a scale from 1 to 5 in which a higher rating
indicates a better-quality product. Apart from the rating, buyers also
express their views on the product. Each user review has three main
aspects: feeling, experience and sentiment. These factors vary from
individual to individual. Before purchasing a specific product, a user
has to spend a good amount of time on different websites to understand
the reviews, and many users lose interest in purchasing the product as
a result.
To overcome this, we have some core objectives which serve as the
fundamentals of the proposed model.
They are as follows:
• Understanding the sentiments expressed in the product reviews
• An automated system which collects reviews from different sources
• Reducing manual intervention
These core objectives will enable the buyer to make fast and
reliable decisions while shopping.
LIST OF FIGURES
Fig 2.4
Fig 4.1.1.1a
Fig 4.1.1.1b
Fig 4.1.1.1c
LIST OF TABLES
Table 4.3
Table 4.3a
Table 4.3b
Table 4.3c
Table 4.3d
Table 4.3e
Table 4.3f
1. INTRODUCTION
1.1 INTRODUCTION
E-shopping is an integral part of present-day life. Numerous
e-commerce platforms exist to meet shopping needs, and each of them
allows users to post a review of a product they have purchased. Every
e-commerce user today can therefore read reviews of the same product on
different shopping sites. The most common form of feedback is the
product rating, a value on a scale from 1 to 5 in which a higher rating
indicates a better-quality product. Apart from the rating, buyers also
express their views on the product. Each user review has three main
aspects: feeling, experience and sentiment. These factors vary from
individual to individual. Before purchasing a specific product, a user
has to spend a good amount of time on different websites to understand
the reviews, and many users lose interest in purchasing the product as
a result.
2.2 OBJECTIVES
Our objective is to build an intelligent decision-making system by
implementing sentiment analysis techniques, thus helping the buyers to
2.4 METHODOLOGY
In earlier days the iterative Waterfall model was very popular for
completing a project, but developers nowadays face various problems
when using it to develop software. The main difficulties include
handling change requests from customers during project development
and the high cost and time required to incorporate these changes. To
overcome these drawbacks of the Waterfall model, Agile software
development methods began to be proposed in the mid-1990s.
The Agile model was primarily designed to help a project adapt
to change requests quickly. The main aim of the Agile model is
therefore to facilitate quick project completion. To accomplish this,
agility is required. Agility is achieved by fitting the process to the
project and removing activities that may not be essential for a
specific project. Anything that wastes time and effort is also avoided.
The Agile model actually refers to a group of development processes.
These processes share some basic characteristics but have certain
subtle differences among themselves. A few Agile SDLC models are
given below. The Agile model is a combination of the iterative and
incremental process models.
The steps involved in Agile SDLC models are:
• Requirement gathering
• Requirement analysis
• Design
• Coding
• Unit testing
• Acceptance testing
The time allotted to complete an iteration is known as a time-box:
the maximum amount of time available to deliver an iteration to
customers. The end date of an iteration therefore does not change,
though the development team can decide to reduce the functionality
delivered in a time-box if that is necessary to deliver it on time.
The central principle of the Agile model is the delivery of an
increment to the customer after each time-box.
Fig 2.4
PYTHON
MySQL
MySQL has become the world's most popular open-source database
because of its consistency, fast performance, high reliability and
ease of use. It has also become the database of choice for a new
generation of applications built on the LAMP stack (Linux, Apache,
MySQL, PHP / Perl / Python). MySQL runs on more than 20 platforms,
including Linux and Windows, and is offered with a comprehensive
range of certified software, support, training and consulting. MySQL
is a multithreaded, multi-user SQL database management system.
MySQL's implementation of a relational database is an abstraction on
top of a computer’s file system. The relational database abstraction
allows a collection of data items to be organized as a set of formally
described tables. Data can be accessed or reassembled from these tables
in many different ways, without any reorganization of the database
tables themselves.
MySQL Features:
3. REQUIREMENT ANALYSIS AND SPECIFICATION
and easy to give a sentiment score for a phrase. The sentiment scores
of all the reviews are taken into account, and the average of those
scores represents the quality of the product as experienced by
previous customers.
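The averaging step can be sketched in Python as follows. This is a minimal illustration and not the project's actual code: the function name `product_score` and the convention that each review's sentiment score lies between -1 (negative) and +1 (positive) are assumptions made for the example.

```python
from statistics import mean


def product_score(review_scores):
    """Return the average of the per-review sentiment scores.

    review_scores: iterable of numbers, one per review
    (assumed scale for this example: -1 to +1).
    Returns None when there are no reviews, since an average is undefined.
    """
    scores = list(review_scores)
    if not scores:
        return None
    return mean(scores)


# Hypothetical scores for four reviews of one product.
scores = [0.8, 0.6, -0.2, 0.4]
overall = product_score(scores)
```

Returning `None` for a product without reviews keeps "no data" distinct from a genuinely neutral average of zero.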
A. Scraping
To gather data for the proposed model, websites that carry product
reviews, for example amazon.in, are crawled and the data is saved on
the local computer.
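Once a page has been saved locally, the review text has to be separated from the surrounding markup. The sketch below uses Python's standard `html.parser` module for this. The page structure is an assumption made purely for illustration: the class name `review-text` and the sample HTML are hypothetical and do not describe the real markup of amazon.in or any other site.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class ReviewExtractor(HTMLParser):
    """Collect the text inside elements whose class is `review-text` (hypothetical)."""

    def __init__(self):
        super().__init__()
        self.reviews = []   # finished review strings
        self._depth = 0     # > 0 while inside a review element
        self._parts = []    # text fragments of the current review

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1            # nested tag inside a review
        elif "review-text" in classes:
            self._depth = 1             # a new review starts here
            self._parts = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:            # the review element just closed
            text = " ".join("".join(self._parts).split())
            self.reviews.append(text)

    def handle_data(self, data):
        if self._depth:
            self._parts.append(data)


# Hypothetical saved page: one navigation block and two reviews.
sample_page = """
<html><body>
  <div class="nav">Home | Cart</div>
  <span class="review-text">Battery life is <b>excellent</b>.</span>
  <span class="review-text">Stopped working after a week.</span>
</body></html>
"""

parser = ReviewExtractor()
parser.feed(sample_page)
reviews = parser.reviews
```

Only the text inside the review elements is kept; navigation text and tags are discarded before the data is passed on to pre-processing.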
B. Pre-processing
Quite often, the data obtained from scraping is not ready to be fed
into an algorithm. The scraped data may contain spelling errors, data
that is not useful to the algorithm, data of a different data type,
stop words, etc. One of the practices of pre-processing involves
tokenization and the removal of stop words. Tokens can be the words in
a sentence, or even the sentences of a document. For example, the
sentence ‘Natural language processing is one branch of computer
science’ will be tokenized into ‘Natural’, ‘language’, ‘processing’,
‘is’, etc. Sometimes, extremely common words which appear to be of
little value in helping to select documents that match a user
requirement are excluded from the vocabulary entirely. These words are
called stop words: words that occur too frequently in a given document
or paragraph, for example ‘a’, ‘the’, etc. One of the major forms of
pre-processing is to filter out data that does not conform to the
parameters of the algorithm. Crawlers in some search engines ignore
stop words to reduce the amount of memory consumed by data. Hence such
stop words are removed from the text data by comparing the tokens with
a pre-existing set of stop words. The text data obtained after the
stop words have been removed is used for evaluation.
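Tokenization and stop-word removal can be sketched as follows. This is an illustrative example only: the regular expression used for tokenization and the small `STOP_WORDS` set are assumptions, and a real system would compare against a much larger pre-existing stop-word list.

```python
import re

# A deliberately small, hypothetical stop-word list for the example.
STOP_WORDS = {"a", "an", "the", "is", "of", "one", "and", "to", "in"}


def tokenize(text):
    """Split text into lower-case word tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Keep only the tokens that are not in the stop-word set."""
    return [t for t in tokens if t not in stop_words]


sentence = "Natural language processing is one branch of computer science"
tokens = tokenize(sentence)
filtered = remove_stop_words(tokens)
# tokens   -> ['natural', 'language', 'processing', 'is', 'one',
#              'branch', 'of', 'computer', 'science']
# filtered -> ['natural', 'language', 'processing', 'branch',
#              'computer', 'science']
```

Storing the stop words in a set makes each membership check a constant-time lookup, which matters when many reviews are processed.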
C. Phrase matching
After the pre-processing of the data is complete, a phrase matching
process is undertaken. We have a dataset of previously analysed
phrases, and this phase of the proposed model compares the current
phrase with those in the existing dataset. Levenshtein distance is
used to compare the phrases. This means that the more phrases have
been analysed previously, the better the dataset becomes, which allows
phrases to be scored more accurately against historical data. A block
diagram of the proposed model is given. The user just needs to enter
the ASIN (Amazon Standard Identification Number), a code that uniquely
identifies each Amazon product. The algorithm gathers all the web data
for that product. All unwanted tags are stripped, irrelevant data
(data other than reviews) is removed, and the data is sent for
pre-processing. After pre-processing:
• Phrases are split up into n-grams
• The phrase array is sorted by word length in descending order
(10 words, 9 words, etc.)
• The phrases are compared against three categories: positive,
negative and neutral
• Only matches that meet the Levenshtein distance threshold and the
similarity threshold are kept
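The first two steps of this list can be sketched as below. The function name `ngrams`, the example tokens and the upper limit of 3 words (rather than 10) are assumptions chosen to keep the example short.

```python
def ngrams(tokens, max_n=10):
    """Return every contiguous phrase of 1..max_n tokens, longest phrases first."""
    phrases = []
    for n in range(1, min(max_n, len(tokens)) + 1):
        for start in range(len(tokens) - n + 1):
            phrases.append(" ".join(tokens[start:start + n]))
    # Sort by number of words, descending. sorted() is stable, so phrases
    # of equal length keep their original left-to-right order.
    return sorted(phrases, key=lambda p: len(p.split()), reverse=True)


tokens = ["battery", "life", "excellent"]
phrases = ngrams(tokens, max_n=3)
# phrases -> ['battery life excellent', 'battery life', 'life excellent',
#             'battery', 'life', 'excellent']
```

Sorting the longest phrases first means the most specific phrase is available for matching before its shorter sub-phrases.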
D. Levenshtein Distance
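Levenshtein distance is the minimum number of single-character insertions, deletions and substitutions needed to turn one string into another. The sketch below computes it with the standard dynamic-programming method and then uses it to assign a phrase to the closest of the three categories. The reference phrases, the `max_distance` threshold and the function names are hypothetical; they only illustrate the kind of comparison described in section C and are not the project's actual data or code.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))   # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


# Hypothetical, previously scored phrases for each category.
KNOWN_PHRASES = {
    "positive": ["works great", "highly recommended"],
    "negative": ["stopped working", "waste of money"],
    "neutral": ["as described"],
}


def classify(phrase, max_distance=3):
    """Return (category, distance) of the closest known phrase, or None
    when no known phrase is within max_distance edits."""
    best = None
    for category, examples in KNOWN_PHRASES.items():
        for example in examples:
            d = levenshtein(phrase, example)
            if d <= max_distance and (best is None or d < best[1]):
                best = (category, d)
    return best


result = classify("stopped workin")   # one character short of a known phrase
```

Because the comparison tolerates a few edits, a misspelt phrase can still be matched against the historical data instead of being discarded.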
each screen
-flows performed by the system
3.2.2 NON-FUNCTIONAL REQUIREMENTS
A non-functional requirement is a requirement that specifies
criteria that can be used to judge the operation of a system, rather than
specific behaviours.
a. An Internet connection is needed.
b. Usability
The system’s interface is user-friendly and easy to get familiar with.
c. Reliability
d. Performance
Response time for a transaction - maximum.
e. Security
The system shall protect itself and its sensitive data and communications
from accidental, malicious, or unauthorized access, use, modification, or
destruction.
f. Safety
When selecting hardware, the size and capacity requirements are also
important.
• Processor : Intel Pentium Core i3 and above
• Primary Memory : 4GB RAM and above
• Storage : 320 GB hard disk and above
• Display : VGA Color Monitor
• Keyboard : Windows compatible
• Mouse : Windows compatible
Software Specifications
One of the most difficult tasks is selecting software for the system.
Once the system requirements are found out, we have to determine
whether a particular software package fits those requirements. The
application requirements:
OPERATING SYSTEM : Windows 10
FRONT END : Python
BACK END : MySQL
SOFTWARE USED : JetBrains PyCharm, SQLyog
4. SYSTEM DESIGN
Source/Destination of Data
Data flow
Process
Storage
LEVEL 0 DFD
As shown in the figure, the two actors in this system are the admin
and the user. Both actors can interact with the system.
Fig 4.1.1.1a
LEVEL 1 FOR ADMIN
Fig 4.1.1.1b
LEVEL 1 FOR USER
Fig 4.1.1.1c
Ease of use
Data independence
Accuracy and integrity
4.2.1 NORMALIZATION
Normalization theory is built around the concept of normal forms.
Normalization reduces unnecessary redundancy of data in the database.
Redundancy can cause problems with the storage and retrieval of data.
During the process of normalization, dependencies that can cause
problems when deleting from and updating the database are identified.
Normalization theory is based on the fundamental concept of functional
dependency. Normalization helps in simplifying the structure of tables.
FIRST NORMAL FORM
Data is moved into separate tables so that the data in each table
is of a similar type, and each table is given a primary key. This
eliminates repeating groups of data.
SECOND NORMAL FORM
A table in first normal form can be converted into second normal
form by taking out data that depends on only part of the key.
THIRD NORMAL FORM
This means removing anything in the table that does not depend solely
on the primary key. Data that is in third normal form is automatically
in second normal form. There must be no indirect (transitive)
dependency between attributes.
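The effect of normalization can be illustrated with a small example. The sketch below uses Python's built-in `sqlite3` module as a stand-in for MySQL, and the table and column names (`product`, `review`, `product_name` and so on) are hypothetical; they are not the project's actual schema. Product details are stored once in `product`, and each review refers to its product through a key instead of repeating the product name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")        # throwaway in-memory database
conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clause

conn.executescript("""
CREATE TABLE product (
    product_id   INTEGER PRIMARY KEY,
    product_name TEXT NOT NULL
);
CREATE TABLE review (
    review_id   INTEGER PRIMARY KEY,
    product_id  INTEGER NOT NULL REFERENCES product(product_id),
    review_text TEXT NOT NULL
);
""")

conn.execute("INSERT INTO product VALUES (1, 'Wireless Mouse')")
conn.executemany(
    "INSERT INTO review (product_id, review_text) VALUES (?, ?)",
    [(1, "Works great"), (1, "Stopped working after a week")],
)

# Renaming the product touches exactly one row; every review sees the change.
conn.execute(
    "UPDATE product SET product_name = 'Wireless Mouse v2' WHERE product_id = 1"
)

rows = conn.execute("""
    SELECT p.product_name, r.review_text
    FROM review AS r JOIN product AS p ON p.product_id = r.product_id
    ORDER BY r.review_id
""").fetchall()
```

Had the product name been repeated in every review row, the rename would have required updating each of those rows, which is the kind of update problem normalization is meant to avoid.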
4.2.2 ER DIAGRAM
An entity-relationship diagram is a data modeling technique that
creates a graphical representation of the entities, and the
relationships between entities, within an information system.
There are three basic elements in ER models:
• Entities are the “things” about which we seek information.
• Attributes are the data we collect about entities.
• Relationships provide the structure needed to draw information
from multiple entities.
ER Diagram Symbols:
Entity
Attribute
Relation
4.3 TABLE DESIGN
In the database, all information is stored in the form of
tables. A table is simply a way of storing data in rows and columns.
In this system, data is stored in many tables.
Table 4.3
1. Table name: feedback
Field    Data type    Description    Constraints
Table 4.3a
2. Table name: login
Field    Data type    Description    Constraints
Table 4.3b
3. Table name: product
Field    Data type    Description    Constraints
Table 4.3c
4. Table name: product_rate
Field    Data type    Description    Constraints
Table 4.3d
5. Table name: review
Field    Data type    Description    Constraints
Table 4.3e
6. Table name: usertb
Field    Data type    Description    Constraints
Table 4.3f
5. IMPLEMENTATION
6. SYSTEM TESTING
7. CONCLUSION
7.1 CONCLUSION
The project was successfully completed within the time span
allotted. Every effort has been made to present the system in a
user-friendly manner, so that every activity feels like an easy
walk-through to the user interacting with the system. The main purpose
of the system is to collect all reviews of the same product from
different online shopping sites and to condense the reviews from all
sites into a single review instead of a simple product rating value.
The system can be further improved to take the above into account and
update the server database accordingly.
8. BIBLIOGRAPHY
8.1 BIBLIOGRAPHY
[1] Shriya Se et al. “AMRITA-CEN@ SAIL2015: sentiment analysis in
Indian languages”. In: International Conference on Mining Intelligence
and Knowledge Exploration. Springer. 2015, pp. 703–710.
[2] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. “Recognizing
contextual polarity in phrase-level sentiment analysis”. In: Proceedings
of human language technology conference and conference on empirical
methods in natural language processing. 2005, pp. 347–354.
[3] Rui Xia et al. “Polarity shift detection, elimination and ensemble: A
three-stage model for document-level sentiment analysis”. In:
Information Processing & Management 52.1 (2016), pp. 36–45.
[4] Cäcilia Zirn et al. “Fine-grained sentiment analysis with structural
features”. In: Proceedings of 5th International Joint Conference on
Natural Language Processing. 2011, pp. 336–344.
[5] KS Krishnaveni, Rohit R Pai, and Vignesh Iyer. “Faculty rating
system based on student feedbacks using sentimental analysis”. In: 2017
International Conference on Advances in Computing, Communications
and Informatics (ICACCI). IEEE. 2017, pp. 1648–1653.
9. APPENDIX
9.1 SCREENSHOTS
[Screenshots: ProdRev welcome page, showing the text “Welcome to ProdRev. A perfect platform to know what's best to choose.....”]