Final Year Project


Ajeenkya D.Y. Patil School of Engineering


Department of Computer Engineering
BE Project Semester-I A.Y. 2022-23

PREDICTIVE ANALYSIS FOR BIG MART USING MACHINE LEARNING ALGORITHMS

PRESENTED BY:
Janhavi Mudaliar
Pooja Bahirat
Samruddhi Jaiswal

UNDER THE GUIDANCE OF:
Prof. Anita Mahajan
Contents
Introduction to Domain

Motivation

Literature Review

Taxonomy chart

Problem Statement

Objectives / Scope

System Architecture

Software Hardware Requirements

Functionality Provided

Algorithm

UML Diagrams

Project Plan

Result

Advantages, Limitations, Applications

Conclusion

Future Work

References
Introduction(1)
• The growth of international malls and online shopping has increased the competition between them.

• Machine learning algorithms are highly sophisticated and offer opportunities for forecasting demand for any type of organization, improving on low-cost prediction methods.

• A machine learning algorithm can be extraordinarily effective when applied to a particular problem.
Introduction(2)
Machine learning allows software applications to become more accurate at predicting outcomes without being explicitly programmed to do so.
Motivation
• In today's world there is huge competition between companies.
• The main goals of each company are:
- maximum customer satisfaction
- a boost to the business
- maximum profit
• A traditional approach to improving sales would be time consuming and would not provide the expected results.

• Therefore, to overcome this issue, we propose a system that provides more benefit to the company and would ultimately increase its business.
Literature Survey
The table below shows various existing systems for sales prediction.
Table I: Literature Survey

Sr. no. | Paper name | Advantages | Limitations | Year of publishing
1 | Sales prediction using Machine Learning Association rule | 1. Accurate output 2. Easy-to-use interface | 1. Requires high internet speed for working | September 2020
2 | Predictive Analysis for Big Mart Sales Using Machine Learning Algorithm | 1. Speed, portability; efficient to use and easy interface | 1. Less cost-efficient 2. User needs to enter correct data or else it behaves abnormally | August 2022
3 | Intelligent Sales Prediction Using Machine Learning Techniques | 1. Easy-to-use interface | 1. Requires high internet speed for working | August 2018
4 | A Study of Demand and Sales Forecasting Model using Machine Learning Algorithm | 1. Usability is good | 1. Less accurate | September 2021
5 | Leveraging Comparables for New Product Sales Forecasting | 1. Easy-to-use interface 2. High speed and portability | 1. Requires higher internet speed | December 2017
Taxonomy Chart
Parameters compared:
- Customer segmentation
- Download result
- Sales prediction
- Ease of access

Systems compared:
- Sales prediction using Machine Learning Association Rule
- Predictive Analysis for Big Mart Sales Using Machine Learning Algorithm
- Intelligent Sales Prediction Using Machine Learning Techniques
- Proposed System
Problem statement
• To develop a sales prediction system for companies like Big Mart so that they can maximize customer satisfaction, which would ultimately lead to growth of their business.

• The system comprises different functions provided to a company's salesperson, who can obtain predictions for increasing revenue based on past customer records. Machine learning is performed at the backend, and the predicted output is displayed to the user in the form of a result.
Objectives/Scope
• Inspecting data collected from a retail store.

• Predicting future sales for a retail store.

• Benefiting retail stores by:
- Managing stock more effectively.
- Providing an economic advantage.
Software/Hardware Requirements
• Database/file requirements
1. MySQL database
2. CSV file

• Software requirements (platform choice)
1. APIs that can be used:
- Flask/Django API
- JSON for transfer of data
2. Jupyter Notebook (Anaconda)
3. Spyder
4. Sublime Text
5. PyCharm

• Hardware requirements
1. CPU
2. PC/laptop
3. Updated browser
4. Operating system
System Architecture
Functionality provided in the project to the user

1. Future sales prediction of an item (Regression)

2. Customer segmentation (Clustering)

3. Association between items to help increase sales (Apriori)


Mathematical Model

Simple Linear Regression
Association Rule Learning & Apriori Algorithm
K-means Clustering
ML Algorithms used
1.Regression
- Linear Regression
- Random Forest

2. Apriori Algorithm

3. Hierarchical Clustering
1. Regression
1.1 Linear Regression
Linear regression is one of the easiest and most popular Machine Learning algorithms. It is a statistical method that is
used for predictive analysis. Linear regression makes predictions for continuous/real or numeric variables such as
sales, salary, age, product price, etc.

X VALUES:
1. Item_Weight
2. Item_Fat_Content
3. Item_Visibility
4. Item_Type
5. Item_MRP
6. Outlet_Size
7. Outlet_Location_Type
8. Outlet_Type

Y VALUES:
SALES OF ITEM
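As a minimal sketch of the idea, the snippet below fits a simple (one-feature) linear regression with the closed-form least-squares formulas, using Item_MRP as the only input. The data values and the function names are hypothetical and chosen for illustration; the project itself uses all the X values listed above, and its actual implementation is not shown in these slides.

```python
from statistics import mean


def fit_simple_linear_regression(xs, ys):
    """Return (intercept, slope) minimising squared error for y = b0 + b1 * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope = sum of cross-deviations / sum of squared x-deviations.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, x):
    """Predict y for a single x value with the fitted line."""
    return intercept + slope * x


# Hypothetical toy data (not from the project dataset): Item_MRP vs. item sales.
item_mrp = [50.0, 100.0, 150.0, 200.0, 250.0]
item_sales = [700.0, 1300.0, 1900.0, 2500.0, 3100.0]

b0, b1 = fit_simple_linear_regression(item_mrp, item_sales)
predicted_sales = predict(b0, b1, 120.0)
print(f"sales = {b0:.1f} + {b1:.1f} * Item_MRP; prediction at MRP 120: {predicted_sales:.1f}")
```

Categorical inputs such as Item_Fat_Content or Outlet_Type would need to be encoded as numbers before a regression of this kind can use them.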
1. Regression
1.2 Random Forest Regression
Random Forest Regression is a supervised learning algorithm that uses an ensemble learning method for regression. Ensemble learning is a technique that combines the predictions of multiple machine learning models to make a more accurate prediction than a single model.
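The ensemble idea can be sketched in plain Python. The example below is a deliberately simplified stand-in for a full random forest: each "tree" is a depth-one regression tree (a stump) trained on a bootstrap sample with one randomly chosen feature, and the ensemble prediction is the average over all stumps. The data, feature choice, and function names are hypothetical.

```python
import random
from statistics import mean


def fit_stump(rows, targets, feature):
    """Fit a depth-one regression tree on one feature by minimising squared error."""
    best = None
    values = sorted({row[feature] for row in rows})
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        l_mean, r_mean = mean(left), mean(right)
        error = sum((t - l_mean) ** 2 for t in left) + sum((t - r_mean) ** 2 for t in right)
        if best is None or error < best[0]:
            best = (error, threshold, l_mean, r_mean)
    if best is None:  # only one distinct value in the sample: predict the mean
        overall = mean(targets)
        return (feature, float("inf"), overall, overall)
    return (feature, best[1], best[2], best[3])


def fit_forest(rows, targets, n_trees=25, seed=0):
    """Train n_trees stumps, each on a bootstrap sample and a random feature."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feature = rng.randrange(len(rows[0]))  # random feature per stump
        forest.append(fit_stump([rows[i] for i in idx], [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, row):
    """Average the predictions of all stumps."""
    return mean(left if row[feature] <= threshold else right
                for feature, threshold, left, right in forest)


# Hypothetical toy data: [Item_MRP, Item_Visibility] -> item sales.
rows = [[40.0, 0.02], [80.0, 0.04], [120.0, 0.05], [160.0, 0.07], [200.0, 0.09], [240.0, 0.12]]
targets = [500.0, 900.0, 1400.0, 1800.0, 2300.0, 2900.0]

forest = fit_forest(rows, targets)
low_prediction = predict_forest(forest, rows[0])
high_prediction = predict_forest(forest, rows[-1])
print(low_prediction, high_prediction)
```

A real random forest grows deeper trees and samples a feature subset at every split; a library implementation would normally be used rather than this sketch.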
Screenshots
2. Apriori Algorithm
The Apriori algorithm is used for mining frequent itemsets and the relevant association rules. It generally operates on a database containing a large number of transactions, for example the items customers buy at a Big Bazaar. The Apriori algorithm helps customers buy their products with ease and increases the sales performance of the store.

Example: if a person buys an item X (such as bread), what are the chances that they also buy an item Y (such as butter)?
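The bread and butter question can be made concrete with a short sketch. The code below is an illustrative plain-Python version of Apriori, not the project's implementation: it finds itemsets whose support (the fraction of transactions containing them) meets a threshold, then computes the confidence of the rule bread -> butter as support(bread and butter) / support(bread). The transactions and the threshold of 0.3 are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = {frozenset([item]) for item in items}
    frequent = {}
    k = 1
    while current:
        # Keep only the candidates that meet the support threshold.
        level = {c: s for c, s in ((c, support(c)) for c in current) if s >= min_support}
        frequent.update(level)
        # Join step: build (k+1)-item candidates from frequent k-itemsets.
        k += 1
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return frequent[both] / frequent[frozenset(antecedent)]


# Hypothetical transactions for illustration only.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "butter"},
    {"bread", "butter", "jam"},
]

frequent = apriori(transactions, min_support=0.3)
bread_to_butter = confidence(frequent, {"bread"}, {"butter"})
print(f"confidence(bread -> butter) = {bread_to_butter:.2f}")
```

Here bread appears in 4 of 5 transactions and bread with butter in 3 of 5, so the rule's confidence is about 0.75.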
Screenshots
3. Hierarchical Clustering
Agglomerative hierarchical clustering is a popular example of hierarchical cluster analysis (HCA). To group the data points into clusters, it follows a bottom-up approach: the algorithm treats each data point as a single cluster at the beginning, and then repeatedly combines the closest pair of clusters. It does this until all the clusters are merged into a single cluster that contains all the data points.
This hierarchy of clusters is represented in the form of a dendrogram.

1. Customers are segmented based on their interests, and further recommendations can then be provided.
2. Customers can be segmented as regular and non-regular customers.
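The bottom-up merging described above can be sketched as follows. This is a minimal illustration that assumes single linkage (cluster distance is the distance between the closest members) and stops at a requested number of clusters instead of merging all the way to one. The customer features (visits per month, average spend) and their values are hypothetical.

```python
from math import dist


def agglomerative(points, n_clusters):
    """Bottom-up clustering with single linkage; returns (clusters, merge history)."""
    clusters = [[i] for i in range(len(points))]  # every point starts as its own cluster

    def linkage(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(dist(points[i], points[j]) for i in a for j in b)

    merges = []  # the merge history is what a dendrogram would plot
    while len(clusters) > n_clusters:
        # Find the closest pair of clusters.
        a, b = min(
            ((x, y) for x in range(len(clusters)) for y in range(x + 1, len(clusters))),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        merges.append((clusters[a], clusters[b], linkage(clusters[a], clusters[b])))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters, merges


# Hypothetical customers: (visits per month, average spend per visit).
customers = [(12, 40), (10, 35), (11, 45), (2, 20), (1, 15), (3, 25)]

clusters, merges = agglomerative(customers, n_clusters=2)
segments = sorted(sorted(c) for c in clusters)
print(segments)
```

With this toy data the two resulting groups correspond to frequent visitors (indices 0 to 2) and infrequent visitors (indices 3 to 5), which matches the regular versus non-regular segmentation mentioned above.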
Screenshots
UML DIAGRAMS(1)
Use case diagram:
UML DIAGRAMS(2)
Class diagram
UML DIAGRAMS(3)
Activity Diagram
UML DIAGRAMS(4)
Sequence Diagram
UML DIAGRAMS(5)
Deployment Diagram
UML DIAGRAMS(6)
State Diagram and Component Diagram
SDLC ( Waterfall model)
PROJECT PLAN
RESULTS(1)

Login Page Home Page


RESULTS(2)

Search Images

Feedback page
RESULTS(3)

Items sales prediction output page


Items sales prediction input page
RESULTS(4)

Customer Segmentation input page Customer Segmentation output page


RESULTS(5)

MBA Input page MBA Output page


TESTING
Types of testing done:
• Unit testing
• Integration testing
• System testing
• User Acceptance Testing (Alpha testing)
• User Interface testing

Test cases developed:
- Test Scenario 1: Login function
- Test Scenario 2: Prediction of number of items sold
- Test Scenario 3: Customer segmentation
- Test Scenario 4: Market Basket Analysis
- Test Scenario 5: Search field
- Test Scenario 6: Feedback field
COMPUTATIONAL COMPLEXITY
The computational complexities of the different ML models are:
Assumptions:
n = number of training examples, m = number of features, n' = number of support vectors,
k = number of neighbours, k' = number of trees

- Linear Regression
Train time complexity = O(n*m^2 + m^3)
Test time complexity = O(m)
Space complexity = O(m)

- Random Forest
Train time complexity = O(k'*n*log(n)*m)
Test time complexity = O(m*k')
Space complexity = O(k' * depth of tree)

- Hierarchical Clustering
Time complexity = O(n³), where n is the number of data points
Space complexity = O(n²), where n is the number of data points
Advantages, Limitations and Applications
• Advantages:
Predicting future sales in supermarkets.
Prediction of customer behavior by the company.
More economic profit.
Better management of stock investment.

• Limitations:
Limited to small supermarkets.
A proper dataset is needed, consisting of proper dependent and independent variables.

• Applications:
Useful for businesses to predict customer behavior from past transactions.
Future implementation in companies like Amazon, Flipkart, Netflix.
Conclusion

Every shopping mall needs to know customer demand in advance to avoid shortages of items for sale. Using machine learning approaches to predict sales with respect to various factors helps businesses adopt suitable strategies for increasing sales.
Future Work
• Sales prediction is a complex yet essential component of business intelligence, like purchasing and budgeting.
• With more device storage and proper RAM management, the system can hold the data of larger supermarkets and retain it for a longer duration.
• Implementation of the project for big companies like Amazon, Netflix, and Flipkart will then be possible.
References
[1] Singh, Manpreet, Bhawick Ghutla, Reuben Lilo Jnr, Aesaan FS Mohammed, and Mahmood A. Rashid. "Walmart's Sales Data Analysis - A Big Data Analytics Perspective." In 2017 4th Asia-Pacific World Congress on Computer Science and Engineering (APWC on CSE), pp. 114-119. IEEE, 2017.

[2] Panjwani, Mansi, Rahul Ramrakhiani, Hitesh Jumnani, Krishna Zanwar, and Rupali Hande. "Sales Prediction System Using Machine Learning." No. 3243. EasyChair, 2020.

[3] https://levelup.gitconnected.com/random-forest-regression-209c0f354c84

[4] Trnka. "Market Basket Analysis with Data Mining Methods." International Conference on Networking and Information Technology (ICNIT), 2010.

[5] "Customer Segmentation Using Machine Learning." December 2021. DOI: 10.3233/APC210200.

[6] Cheriyan, Sunitha, Shaniba Ibrahim, Saju Mohanan, and Susan Treesa. "Intelligent Sales Prediction Using Machine Learning Techniques." In 2018 International Conference on Computing, Electronics & Communications Engineering (iCCECE), pp. 53-58. IEEE, 2018.
