Project Report Sem 1

Machine Learning Based Prediction of

Consumer Purchasing Decisions:



(Information Technology)
Shubham P. Waghmare Exam no: 71716124C

Baliram N. Pinate Exam no: 71715996F

Makrand A. Deshmukh Exam no: 71715669K

Aditya A. Mahalle Exam no: 71715863C

Under The Guidance of

Prof. V. P. Tonde


Sinhgad Institute of Technology Lonavala



This is to certify that the Project Entitled

Machine Learning Based Prediction of

Consumer Purchasing Decisions:
Submitted by

is a bonafide work carried out by them under the supervision of Prof. V. P.
Tonde and is approved in partial fulfillment of the requirements of
Savitribai Phule Pune University, Pune, for the award of the degree of Bachelor of
Engineering (Information Technology).

Prof. V. P. Tonde Prof. T. J. Parvat

Guide H.O.D
Dept. of Info Tech. Dept. of Info Tech.

Sinhgad Institute of Technology, Lonavala, Pune – 410401

Signature of Internal Examiner Signature of External Examiner


We would like to express our sincere gratitude to our guide, Prof. V. P. Tonde, for her
continuous support, patience, motivation, and immense knowledge. Her guidance
helped us throughout the work and the writing of this report. We could not have imagined
having a better supervisor and mentor for this project.

We would like to thank our friends for their feedback, cooperation and of course
friendship. In addition, we would like to express our gratitude to the faculty members of
the Department of Information Technology (IT), Sinhgad Institute of Technology, Lonavala
who helped us with their feedback.

Last but not least, we would like to thank our families, our parents, and our brothers
and sisters for supporting us.

Baliram Pinate
Aditya Mahalle
(B.E. Information Technology.)


Every day consumers make decisions on whether or not to buy a product. In some
cases the decision is based solely on price but in many instances the purchasing decision is
more complex, and many more factors might be considered before the final commitment is
made. In an effort to make purchasing more likely, in addition to considering the asking price,
companies frequently introduce additional elements to the offer which are aimed at increasing
the perceived value of the purchase.
The goal of the present work is to examine, using data-driven machine learning,
whether specific objective and readily measurable factors influence customers’ decisions.
These factors inevitably vary to a degree from consumer to consumer, so a combination of
external factors, together with the details processed at the time the price of a product is
learnt, forms a set of independent variables that contextualize purchasing behavior.
Using a large real-world data set (which will be made public following the publication
of this work), we present a series of experiments, analyze and compare the performance of
different machine learning techniques, and discuss the significance of the findings in the
context of public policy and consumer education.


1 Introduction
1.1 Problem Statement
1.2 Motivation of the Project

2 Literature Survey
2.1 Introduction
2.2 Literature Survey Papers

3 Objectives
3.1 Objectives
3.2 Proposed System

4 Software & Hardware requirement specification

5 Application
5.1 Application
5.2 Advantages

6 System Design
6.1 Architectural Design
6.1.1 Component Diagram
6.1.2 Deployment Diagram
6.1.3 Sequence Diagram
6.1.4 Use Case Diagram
6.1.5 Class Diagram
6.1.6 Activity Diagram
6.1.7 State Diagram
6.1.8 Data Flow Diagram
6.1.9 Flow Chart Diagram

7 Conclusion

8 References

List of Figures

6.1 Architecture diagram
6.2 Component diagram
6.3 Deployment diagram
6.4 Sequence diagram
6.5 Use Case diagram
6.6 Class diagram
6.7 Activity diagram
6.8 State Diagram
6.9 Data Flow Diagram
6.10 Flow Chart Diagram


1.1 Problem Statement
One of the most common financial decisions that each of us makes on a
nearly daily basis involves the purchasing of various products, goods, and
services. In some cases the decision on whether or not to purchase is
based largely on price, but in many instances the purchasing decision is more
complex, with many more considerations affecting the decision-making process
before the final commitment is made.

1.2 Motivation of the Project

Globalization has increased competition in sales, and a wide variety of
products has been introduced to the market. As a result, consumers can become
confused and struggle to decide which product to buy. To address this issue, our
group decided to predict consumers’ purchasing decisions so that they can decide
more easily. Our project aims to predict the best decision for a consumer.


2.1 Introduction
Machine learning is an application of artificial intelligence (AI) that gives
systems the ability to automatically learn and improve from experience without being
explicitly programmed. Machine learning focuses on the development of computer
programs that can access data and use it to learn for themselves. Machine learning enables
the analysis of massive quantities of data. While it generally delivers faster, more accurate
results when identifying profitable opportunities or dangerous risks, it may also require
additional time and resources to train properly.
In machine learning and statistics, classification is the problem of identifying to which
of a set of categories (sub-populations) a new observation belongs, on the basis of a training
set of data containing observations (or instances) whose category membership is known.
Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a
diagnosis to a given patient based on observed characteristics of the patient (sex, blood
pressure, presence or absence of certain symptoms, etc.). Classification is an example
of pattern recognition.
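
As a minimal sketch of classification, the following Python example assigns a new observation to the category of its closest training example (a 1-nearest-neighbour rule). The feature values, the labels and the function name `predict_nearest` are illustrative assumptions and are not data or code from this project.

```python
import math

# Hypothetical training set: (price in thousands, average review score) -> known category.
TRAINING_SET = [
    ((10.0, 4.5), "buy"),
    ((12.0, 4.2), "buy"),
    ((30.0, 3.0), "not buy"),
    ((28.0, 2.5), "not buy"),
]


def predict_nearest(observation, training_set=TRAINING_SET):
    """Return the category of the training example closest to the observation."""
    _, nearest_label = min(
        training_set,
        key=lambda example: math.dist(example[0], observation),
    )
    return nearest_label


print(predict_nearest((11.0, 4.0)))  # a cheap, well-reviewed product
print(predict_nearest((29.0, 3.5)))  # an expensive, poorly reviewed product
```

In practice the features would usually be scaled before distances are compared, and a more robust classifier would be trained on far more observations.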
The buying decision process is the decision-making process used by consumers
regarding market transactions before, during, and after the purchase of a good or service. It
can be seen as a particular form of a cost–benefit analysis in the presence of multiple
alternatives.

2.2 Literature Survey Papers
Paper No: 2

Paper Name: Research on the influence of online reviews on internet
consumer purchasing decision

Author Name: Lei Zhu; Wei Zhang; Yanchun Zhu

Description: As an emerging part of online shopping platforms, online reviews
have changed the way online consumers make purchases. In this paper, the
authors use a questionnaire to study how potential online purchasers are
influenced by online reviews. The study shows that online consumers making
purchase decisions are mostly affected by the average scores of products,
and are not significantly affected by the following three factors: the
proportion of buyers who leave comments among all buyers, the timing of
online reviews, and whether or not the content reflects the latest
product information.

Paper No: 3

Paper Name: Online consumer's optimal purchasing decisions under contingent free
delay shipping

Author Name: Qianqian Liu; Wenliang Bian

Description: This paper considers the optimal purchasing decisions of a
consumer when online retailers provide a contingent free shipping service
under delayed delivery. A consumption decision model for online consumers
is proposed by establishing a net consumption surplus function under delayed
delivery. The function comprehensively considers the product price, the
delay time, and the free shipping threshold set by online retailers, as well
as the consumer's individual characteristics relating to preference for the
product and sensitivity to delayed delivery, and can be used to analyze the
best purchasing decisions of an online consumer. Finally, a numerical
example supports the reliability of the model, and the result shows that a
consumer's optimal purchasing decisions mainly depend on the comparison
between the free shipping cut-off level under delayed delivery and the
consumer's specific parameters for product preference and sensitivity to
delay time. It also indicates that a free delayed shipping service with a
reasonable threshold could attract additional consumers with a moderate
desire to consume to enlarge their purchase amount, which can serve as a
reference for online retailers setting an effective free delayed shipping
strategy.

Paper No: 4

Paper Name: Shaping Customer Confidence in Online Purchasing Decision

Author Name: Otavio Sanchez; Paulo Henrique Silva E Costa; Paulo Goes

Description: E-commerce gives customers access to a vast quantity of
products but, at the same time, significantly increases the volume of
information they need to deal with in the buying decision process. Aggregators
of information offer comparison shopping solutions that reduce consumers'
cognitive overload by adding Decision Support Systems (DSS) to aid
consumers with their purchase decisions. The success of these
service providers depends on the customers' confidence in decisions made
with such systems. Although knowing how customers' confidence is
shaped by DSS in information aggregators is crucial to the provider's
survival, it is still unclear which specific factors contribute to shaping
customer confidence. This study, conducted with a major information
aggregator in the Brazilian market, proposes a model to
analyze consumer confidence in decisions about products and dealers,
and their antecedents.

Paper No: 5

Paper Name: Factors Affecting Consumer Satisfaction of Online Purchase

Author Name: Theresa Lauraéus; Timo Saarinen; Anssi Öörni

Description: Consumers frequently engage in pre-purchase search to extract
up-to-date information for their purchase decisions. Search is an essential part
of the online comparison-shopping and decision-making process, as it reduces
purchase-related uncertainty and increases the likelihood of purchase
satisfaction. In this paper, the authors study how the determinants of
pre-purchase search, purchase-related uncertainty and the type of the search
process influence consumers' perceived satisfaction with the online purchase.
Their analysis of 351 consumers shows that the classic determinants, such as
product class knowledge, time availability, attitudes toward shopping, and
search effort, do not significantly affect perceived purchase satisfaction.
Instead, involvement and purchase-related uncertainties have a stronger effect
on satisfaction. However, the type of the search process turned out to be the
most important factor behind perceived purchase satisfaction.


3.1 Objectives

How and why do people decide to buy? This is a difficult question to
answer, as there are many different types of buyers: some are impulse buyers,
while others follow a rigorous process and investigate thoroughly
before making a purchase decision. The objective of this project is to design and build a
tool that predicts whether or not a consumer will decide to buy a product.
In some cases the decision on whether or not to purchase is based largely
on price, but in many instances the purchasing decision is more complex, with many more
considerations affecting the decision-making process before the final commitment is made.

3.2 Proposed System

We gather information about products (mobile phones) from an e-commerce website.
The gathered information includes the specification, brand, and cost of each
product. The proposed extension of the existing system lets the user search for a product
based on price and, in the results, suggests the best product within the given budget. The
user can then decide whether or not to purchase the suggested product.
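
The suggestion step described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Product` fields, the sample catalog and the rule that "best" means the highest review rating within the budget (ties broken by lower price) are hypothetical choices, since the report does not define how the best product is selected.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Product:
    """Hypothetical record for one mobile phone gathered from an e-commerce site."""
    name: str
    brand: str
    price: float
    rating: float  # assumed average review score, e.g. on a 0 to 5 scale


# Made-up sample data, used only to demonstrate the function below.
CATALOG = [
    Product("Phone A", "BrandX", 9000.0, 4.1),
    Product("Phone B", "BrandY", 14000.0, 4.4),
    Product("Phone C", "BrandZ", 22000.0, 4.7),
]


def suggest_product(catalog: List[Product], budget: float) -> Optional[Product]:
    """Return the highest-rated product costing at most `budget`, or None."""
    affordable = [p for p in catalog if p.price <= budget]
    if not affordable:
        return None
    # Assumed ranking: higher rating first, then lower price as a tie-breaker.
    return max(affordable, key=lambda p: (p.rating, -p.price))


suggestion = suggest_product(CATALOG, budget=15000.0)
print(suggestion.name if suggestion else "No product within budget")
```

A learned model could later replace the fixed ranking rule, for example by scoring each affordable product with a predicted probability of purchase.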


This document describes, in detail, the software requirements for the machine learning
project. These requirements detail the functionality and interfaces of the software.

Hardware Requirements
• Processor: 1.5 GHz or above.
• RAM: 4 GB or more.
• HDD: 100 GB or above.

Software Requirements
• Operating System: Windows XP or higher.
• Languages and technologies: Python, HTML, CSS, NLP.
• Software: Anaconda, Jupyter Notebook.
• Database: MySQL Server.

There are two users or actors, which are as follows:

• Developer: creates the project, accepts real-time data, and builds and deploys the
project.

• End user: uses the analyzed data for analysis purposes.


5.1 Application
The application to be developed will be useful in several kinds of systems, including:

• E-commerce.

5.2 Advantages
• Shows products to users ranked by reviews.

• Saves money by showing the best product at the lowest cost.

• Buy decisions give you the flexibility to weigh and evaluate a range of

• Choose best quality product.


6.1 Architectural Design
In the new system, the user searches on the web page for the product he or she
wants to buy. The user can also specify the maximum price he or she is willing
to pay. The request is passed to the database, which processes it. If any
relevant data is found, the database sends an acknowledgement with that data
back to the web page. These details are displayed to the user, who then decides
on that basis whether or not to buy the product.
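
The request flow above (web page to database and back) can be sketched as a single parameterized query. This is a hedged illustration, not the project's implementation: Python's built-in `sqlite3` module with an in-memory database stands in for the MySQL server named in the requirements, and the `products` table, its columns and the function names are assumptions.

```python
import sqlite3


def create_demo_database() -> sqlite3.Connection:
    """Build an in-memory database with a hypothetical `products` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, brand TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products (name, brand, price) VALUES (?, ?, ?)",
        [
            ("Phone A", "BrandX", 9000.0),
            ("Phone B", "BrandY", 14000.0),
            ("Phone C", "BrandZ", 22000.0),
        ],
    )
    conn.commit()
    return conn


def search_products(conn: sqlite3.Connection, keyword: str, max_price: float):
    """Return (name, brand, price) rows matching the keyword within the price limit."""
    # Placeholders keep user input out of the SQL text and so avoid SQL injection.
    cursor = conn.execute(
        "SELECT name, brand, price FROM products "
        "WHERE name LIKE ? AND price <= ? ORDER BY price",
        (f"%{keyword}%", max_price),
    )
    return cursor.fetchall()


connection = create_demo_database()
for row in search_products(connection, "Phone", 15000.0):
    print(row)  # each row would be rendered on the web page for the user
```

An empty result list corresponds to the case where the database finds no relevant data for the request.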

Figure 6.1: Architecture diagram

6.1.1 Component Diagram

A component diagram, also known as a UML component diagram, describes the
organization and wiring of the physical components in a system.
Component diagrams are often drawn to help model implementation details
and double-check that every aspect of the system's required functions is covered
by planned development.

Figure 6.2: Component diagram

6.1.2 Deployment Diagram

A deployment diagram is a type of diagram that specifies the physical hardware on
which the software system will execute. It also determines how the software
is deployed on the underlying hardware.

Figure 6.3: Deployment diagram

6.1.3 Sequence Diagram
The sequence diagram shows the sequence of activities performed by the user,
owner, TTP, and cloud.

Figure 6.4: Sequence diagram

6.1.4 Use-case diagram
A use case diagram at its simplest is a representation of a user’s
interaction with the system that shows the relationship between the user
and the different use cases in which the user is involved.

Figure 6.5: Use-case diagram

6.1.5 Class diagram
In software engineering, a class diagram in the Unified Modeling
Language is a type of static structure diagram that describes the structure
of a system by showing the system’s classes, their attributes, operations,
and the relationships among objects.

Figure 6.6: Class diagram

6.1.6 Activity Diagram
Activity diagrams are graphical representations of workflows of stepwise
activities and actions with support for choice, iteration and
concurrency. In the Unified Modeling Language, activity diagrams are
intended to model both computational and organizational processes (i.e.
workflows). Activity diagrams show the overall flow of control.

Figure 6.7: Activity Diagram

6.1.7 State Diagram
A state diagram is a type of diagram used in computer science and related fields to
describe the behavior of systems.

Figure 6.8: State Diagram

6.1.8 Control Data Flow Diagram

Figure 6.9: Control Data Flow Diagram

6.1.9 Flow Chart

Figure 6.10: Flow Chart


Our results provide a number of novel insights into consumer behavior, among them
evidence that different thought processes underlie the committal action of buying and
the conservative decision not to go ahead with the purchase. The
presented findings and the accompanying discussion highlight avenues for future research
and provide valuable knowledge to consumers.


[1] Xie, Y.; Li, X.; Ngai, E.; and Ying, W. 2009. Customer churn prediction using improved
balanced random forests. Expert Systems with Applications 36(3):5445–5449.

[2] Larivière, B., and Van den Poel, D. 2005. Predicting customer retention and profitability
by using random forests and regression forests techniques.

[3] Neslin, S.; Gupta, S.; Kamakura, W.; Junxiang, L.; and Mason, C. H. 2006. Defection
detection: Measuring and understanding the predictive accuracy of customer churn models.
Journal of Marketing Research 43:204–211.

[4] Breiman, L. 1996. Bagging predictors. Machine Learning 24(2):123–140.

[5] Breiman, L. 2001. Random forests. Machine Learning 45(1):5–32.


