Project Presentation

GLOBAL ACADEMY OF TECHNOLOGY

Department of Information Science & Engineering

Final Year Project Presentation

Redefining Success Metrics: B2B Sales in the Machine Learning Era
Guided By: Vijay kumar S

Presented By: 18 Presentation Date : 23-05-2024


Deepthi M – 1GA20IS040
Rakshitha R – 1GA20IS098
Agenda
1. Overview / Background
2. Motivation
3. Project Relevance
4. Literature Review
5. Key Challenges
6. Objectives
7. Problem Statement
8. Proposed System with System Architecture
9. Data Flow Diagram
10. Detailed Implementation
11. Results and Discussion
12. Future Scope for Enhancement
13. Conclusion
14. References
5/19/2024 DEPARTMENT OF INFORMATION SCIENCE & ENGINEERING
Overview
● Machine Learning has become a transformative force across industries, including education,
healthcare, engineering, sales, and entertainment. Its applications are reshaping traditional
approaches and providing valuable insights.
● Traditional sales and marketing approaches are no longer sufficient in a competitive market.
● The project uses Linear Regression, K-Neighbors Regressor, XGBoost Regressor, and
Random Forest Regressor to extract patterns and insights from the collected data.
● Sales forecasting is crucial for effective business management and resource allocation.
● Machine Learning, rooted in mathematical principles, can outperform manual estimation in
tasks like sales forecasting.

Motivation
● The motivation behind this project stems from the growing complexity of today's business
landscape, especially in the retail sector represented by Big Mart Companies.
● Traditional sales methods fall short in navigating the complexities of modern business
environments.
● The aim is to introduce a more efficient way to forecast sales, one that is rooted in the
capabilities of Machine Learning.
● Losing customers to competitors is a big risk in fast-changing markets.
● Machine Learning acts as a strategic asset in mitigating customer churn by providing predictive
insights into purchasing patterns.

Project Relevance
● Economic Impact: B2B sales are a fundamental aspect of business operations. Predicting and
improving sales success rates directly contribute to economic growth by enhancing revenue
generation, job creation, and overall business expansion.
● Market Audience Targeting: Businesses focus on targeting the market audience to meet societal
needs efficiently.
● The incorporation of ensemble techniques indicates a practical approach to improving model
robustness and reliability.
● This project aligns well with the principles of Industry 4.0, which emphasizes the integration of
digital technologies, data-driven decision-making, and advanced analytics into industrial
processes.
● The use of four distinct machine learning algorithms together with feature selection is directly
applicable to real-world industrial processes. This versatility allows the resulting models to adapt
to different operational contexts, addressing the practical needs of various industries.

Key Findings from Literature
1. Predicting and Defining B2B Sales Success with Machine Learning [10]
1. Introduction:
This study uses machine learning methods to help a paper and packaging company improve its sales performance and
profitability.
2. Methodology:
● The team used four supervised machine learning algorithms (Logistic Regression, Decision Tree, Random Forest, and
XGBoost) to predict win propensity for sales opportunities based on data from the company's CRM system.
● The team performed an iterative variable selection process to identify the most important features that influenced
sales success, and validated them with the company.
● The team evaluated the performance of the models using accuracy, precision, and recall metrics, and chose the best
model based on its predictive power and ability to generate insights.

3. Findings
Random Forest had the highest accuracy, precision, and recall among all the algorithms used.

Future scope includes implementing a non-meta-variable model for divisions with sufficient
accuracy, and developing a system to capture periodic snapshots of opportunity fields.

Figure-1
Consolidated Table of Literature Review

Key Challenges
1. Data Quality and Availability
Gathering reliable and sufficient data for training the sales forecasting model can be challenging,
particularly when big companies are reluctant to share their sales records. In such cases, additional
strategies and approaches become necessary.
2. Interpretability and Explainability
Providing businesses with understandable insights and explanations behind the sales predictions is
essential for gaining their trust and facilitating decision-making.
3. Privacy and Security
The confidentiality and security of sensitive business data entered into the web application must be ensured.

Objectives
● Develop a robust machine learning-based sales forecasting tool capable of accurately predicting
product sales for available items.
● Create a user-friendly web application interface that allows businesses to input relevant product
features and obtain sales estimation.
● Improve decision-making processes for businesses by providing actionable insights derived from the
analysis of key factors influencing consumer behavior and purchasing decisions.

Problem Statement

Create a user-friendly web application that uses advanced machine learning techniques to analyze
various product data, aiding informed market entry decisions through seamless integration while
prioritizing reliability and scalability during deployment.

Project Requirements
Hardware / Software Requirements Identified
Hardware Requirements:
● Intel Core i5/i7, 10th generation
● 8 GB RAM, etc.
Software Requirements:
● Windows Operating System
● Git, VS Code, etc.
Languages:
● Python, JavaScript, CSS, HTML

Proposed Methodology

1. Data Collection and Integration
● Collected data from manufacturer specifications, sales records, and market research reports.
● Developed a structured framework for data integration.
● Merged the collected datasets using standardized formats and protocols.
2. Data Preprocessing
● Cleansing and Handling Missing Values
● Categorical to Numerical Conversion
● Normalization and Scaling
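
The three preprocessing steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the column names and values are hypothetical, mean imputation and min-max scaling are one possible choice among several, and the slides do not say which techniques the project actually used.

```python
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def label_encode(categories):
    """Map each distinct category to an integer code (sorted for determinism)."""
    codes = {c: i for i, c in enumerate(sorted(set(categories)))}
    return [codes[c] for c in categories], codes


def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical columns: a numeric feature with one gap and a categorical feature.
weights = [10.0, None, 20.0, 30.0]
fat_content = ["Low Fat", "Regular", "Low Fat", "Regular"]

weights_filled = impute_mean(weights)            # missing value handled
fat_codes, mapping = label_encode(fat_content)   # categorical to numerical
weights_scaled = min_max_scale(weights_filled)   # normalization and scaling
```

In practice the imputation value and the scaling bounds would be computed on the training split only and then reused on the test split, so that no information from the test data leaks into training.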

Proposed Methodology
3. Feature Engineering
● Integrated findings from correlation analysis and tree-based models to prioritize features.
● Identified features consistently ranked high across different methods.
● Retained the features that met the selection criteria and discarded the rest.
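
As a rough illustration of the correlation part of this step, the sketch below ranks features by the absolute Pearson correlation with the sales target. The feature names and numbers are hypothetical, and the tree-based importance mentioned above is not covered here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rank_features(features, target):
    """Return (name, |correlation|) pairs sorted from strongest to weakest."""
    scores = {name: abs(pearson(col, target)) for name, col in features.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical feature columns and sales target.
features = {
    "item_mrp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "item_visibility": [5.0, 3.0, 4.0, 1.0, 2.0],
    "item_weight": [2.0, 1.0, 2.0, 1.0, 2.0],
}
sales = [2.0, 4.0, 6.0, 8.0, 10.0]

ranking = rank_features(features, sales)
top_two = [name for name, _ in ranking[:2]]
```

Correlation only captures linear relationships, which is one reason to cross-check the ranking against a tree-based importance measure before discarding a feature.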
4. Dataset Splitting and Training
● Divided the dataset into training (70-80%) and testing (20-30%) sets to enable robust model
evaluation.
● Ensured a randomized split to prevent any biases in the subsequent training of the model.
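
A randomized split of this kind can be sketched as follows. The function name, the 80/20 ratio, and the seed are illustrative assumptions; the project may well have used a library helper instead.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (train, test) lists."""
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records; only the row count matters for the split.
rows = [{"id": i, "sales": 100 + i} for i in range(10)]
train, test = split_rows(rows, test_fraction=0.2, seed=42)
```

Shuffling before slicing matters when the source file is ordered (for example by store or date), because an unshuffled split would then train and test on systematically different records.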
Proposed Methodology
5. Model Selection and Training
● Methodically chose regression models suitable for the problem domain, considering options like
linear regression, decision trees, or ensemble methods.
● Trained the selected model using the training dataset, adjusting hyperparameters for
optimal performance.
● Implemented cross-validation techniques to assess the model's robustness and generalization,
ensuring reliable predictions on unseen data.
6. Model Evaluation
● Performance Metrics Application
● Comparative Model Analysis
● User-friendly Interface Design
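
The cross-validation mentioned under Model Selection and Training can be sketched as a k-fold loop. Everything here is an assumption for illustration: the slides do not state the number of folds, and the mean-predicting baseline and the toy data merely stand in for a real regressor and dataset.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        val_idx = list(range(start, stop))
        train_idx = [i for i in range(n_samples) if i < start or i >= stop]
        yield train_idx, val_idx
        start = stop


def fit_mean(xs, ys):
    """Hypothetical baseline model: always predicts the training mean."""
    avg = sum(ys) / len(ys)
    return lambda x: avg


def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate(xs, ys, fit, score, k=5):
    """Train on k-1 folds, score on the held-out fold, and collect the scores."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(xs), k):
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        preds = [model(xs[i]) for i in val_idx]
        scores.append(score([ys[i] for i in val_idx], preds))
    return scores


# Toy data, assumed to be already shuffled.
xs = [float(i) for i in range(10)]
ys = [2.0 * x + 1.0 for x in xs]
scores = cross_validate(xs, ys, fit_mean, mean_absolute_error, k=5)
```

A large spread between the per-fold scores is a sign that the model's performance depends heavily on which records it happens to see, which is exactly the robustness question cross-validation is meant to answer.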
System Architecture

Figure-2
Proposed Algorithms
● Multiple Logistic Regression — a generalized linear model (GLM) that describes the
relationship between a binary dependent variable and more than one predictor.
● Decision Tree — a non-parametric algorithm that makes sequential, hierarchical
decisions about the outcomes based on the predictors.
● Random Forest — an ensemble algorithm that constructs a multitude of decision trees
and outputs the mode of the classes, correcting the overfitting habit of decision trees.

Data Flow Diagram
Data Flow Diagram - Level 0

Figure-3: Data Flow Diagram

Data Flow Diagram
Data Flow Diagram - Level 1

Figure-4: Data Flow Diagram


Project Implementation
Detailed Project Implementation
● DATASET COLLECTION:
○ Data Sources: Collected data from various sources, including manufacturer specifications, sales records, and market research.
○ Data Integration: Merged diverse datasets to create a comprehensive dataset with all relevant features.
○ Data Cleaning: Addressed missing values, outliers, and inconsistencies to ensure data quality.
○ Feature Importance Analysis: Utilized methods like correlation analysis or tree-based feature importance to identify influential
features.
● DATASET MANAGEMENT AND TRAINING PROCESS:
○ Selection Criteria: Selected features based on their impact on the target variable (sales).
○ Splitting Dataset: Divided the dataset into training (70-80%) and testing (20-30%) sets.
○ Randomization: Ensured randomization during the split to avoid biased model training.
○ Model Selection: Chose a regression model based on the nature of the problem (e.g., linear regression, decision tree, or ensemble
methods).
○ Training Procedure: Trained the selected model using the training dataset, adjusting parameters for optimal performance.

Detailed Project Implementation
● MODEL EVALUATION AND DEPLOYMENT:
○ Performance Metrics: Evaluated the model's performance using metrics like Mean Squared Error
(MSE), R-squared, or others.
○ Model Comparison: Experimented with different models or variations to identify the most
efficient for sales prediction.
○ Model Deployment: Serialized the trained models into a format that can be saved to disk, such as
Pickle or joblib, for later use. Deployed the web application, ensuring its accessibility and
responsiveness.
● ALGORITHMS USED:
○ Decision Tree: Uses tree structures to model non-linear relationships by splitting the data based on
feature values.
○ Linear Regression: A statistical method that models the relationship between a
dependent variable (sales) and one or more independent variables (features) by fitting a linear
equation to observed data.
○ Random Forest: An ensemble of decision trees that reduces overfitting by averaging their
predictions.
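
The serialization step described under Model Deployment can be illustrated with Python's standard `pickle` module. In this sketch a plain dictionary of linear-model parameters stands in for the trained model object, and the file name `sales_model.pkl`, the helper names, and the parameter values are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_model(model, path):
    """Write the model object to disk in pickle format."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)


def load_model(path):
    """Read a previously pickled model object back from disk."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def predict(model, x):
    """Apply the stand-in linear model: intercept + slope * x."""
    return model["intercept"] + model["slope"] * x


# Stand-in for a trained model (hypothetical parameter values).
trained = {"intercept": 50.0, "slope": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "sales_model.pkl"
    save_model(trained, model_path)      # done once, after training
    restored = load_model(model_path)    # done by the web application at startup

estimate = predict(restored, 10.0)
```

Pickle files can execute arbitrary code when loaded, so the web application should only load model files that it produced itself.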
Detailed Project Implementation
● Model Selection: Selected the best-performing model as the one with the highest R-squared value.
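
A minimal sketch of this selection rule, assuming each candidate model has already produced predictions for the same test targets. The model names and numbers below are made up for illustration and are not the project's results.

```python
from math import sqrt


def regression_metrics(y_true, y_pred):
    """Compute R-squared, MSE, RMSE and MAE for one set of predictions."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total variance of the target
    r2 = 1.0 - (mse * n) / ss_tot
    return {"r2": r2, "mse": mse, "rmse": sqrt(mse), "mae": mae}


def select_best(predictions, y_true):
    """Score every candidate and return the name with the highest R-squared."""
    scored = {name: regression_metrics(y_true, preds)
              for name, preds in predictions.items()}
    best = max(scored, key=lambda name: scored[name]["r2"])
    return best, scored


# Hypothetical test targets and candidate predictions.
y_true = [10.0, 20.0, 30.0, 40.0]
predictions = {
    "model_a": [12.0, 18.0, 33.0, 37.0],
    "model_b": [11.0, 19.0, 31.0, 39.0],
}

best, scored = select_best(predictions, y_true)
```

Because all candidates are scored on the same test targets, ranking by R-squared here is equivalent to ranking by lowest MSE; the two can only disagree when models are evaluated on different data.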
Results and Discussion
Results

Figure-6: User Registration
Figure-7: User Login
Figure-8: About Page

Results

Figure-9: Sales Prediction Page
Figure-10: Sales Estimation Result
Performance Analysis

Model               R-squared   MSE     RMSE    MAE
Linear Regression   0.75        1200    34.64   25.12
Decision Tree       0.78        1100    33.17   24.80
Random Forest       0.85        900     30.00   22.50

MSE: Mean Square Error; RMSE: Root Mean Square Error; MAE: Mean Absolute Error.
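
The values in the Performance Analysis table can be sanity-checked with a few lines of Python: RMSE should equal the square root of MSE, and MAE can never exceed RMSE. The sketch below only re-uses the reported numbers; the dictionary layout and variable names are illustrative.

```python
from math import sqrt

# Values copied from the Performance Analysis table.
results = {
    "Linear Regression": {"r2": 0.75, "mse": 1200, "rmse": 34.64, "mae": 25.12},
    "Decision Tree": {"r2": 0.78, "mse": 1100, "rmse": 33.17, "mae": 24.80},
    "Random Forest": {"r2": 0.85, "mse": 900, "rmse": 30.00, "mae": 22.50},
}

# Recompute RMSE from MSE, rounded to the two decimals used in the table.
recomputed_rmse = {name: round(sqrt(m["mse"]), 2) for name, m in results.items()}

# MAE is bounded above by RMSE for any set of errors.
mae_within_rmse = all(m["mae"] <= m["rmse"] for m in results.values())

# The model with the highest reported R-squared.
best_model = max(results, key=lambda name: results[name]["r2"])
```

All three reported RMSE values agree with the square roots of the reported MSE values, and Random Forest has both the highest R-squared and the lowest error on every metric.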

Scope of Enhancement
Project Future Scope

● Investigate more advanced machine learning algorithms beyond traditional regression
models.
● Develop robust time series forecasting models that account for seasonality, trends, and
irregularities.
● Integrate external data sources (e.g., social media trends, news articles, economic reports)
to enhance predictive models.

Presentation Summary
Conclusion
● The project aims to create a machine learning-driven sales forecasting tool integrated into
a user-friendly web application, facilitating sales predictions.
● By enabling businesses to input product features and receive immediate sales forecasts for
the available products, the tool enhances decision-making processes.
● The Random Forest model, chosen because it achieved the highest R-squared value (0.85)
among the compared models, powers the sales predictions.
● The web application interface is designed to be intuitive and user-friendly, making
advanced sales forecasting accessible to business users without technical expertise.
● Overall, this tool empowers businesses to make smarter choices, optimize their sales
strategies, and improve efficiency in their sales processes.

Publication Details
Papers Published
https://doi.org/10.48175/IJARSCT-15375

References
[1] Altuncu MA, Tastan MH, Özcan T. Machine Learning Based Approaches for Short Term Sales
Forecasting in E-Commerce. In The International Symposium for Production Research 2022 Oct 6
(pp. 16-24). Cham: Springer International Publishing.
[2] Anil GA, Shankar CR, Santosh BP, Rajendra GA, Thorat BD. Sales Forecasting Using
Machine Learning Techniques.
[3] Cakir A, Akın Ö, Deniz HF, Yılmaz A. Enabling real time big data solutions for manufacturing at scale.
Journal of Big Data. 2022 Dec;9(1):1-24.
[4] Wisesa O, Adriansyah A, Khalaf OI. Prediction analysis sales for corporate services
telecommunications company using gradient boost algorithm. In 2020 2nd International Conference on
Broadband Communications, Wireless Sensors and Powering (BCWSP) 2020 Sep 28 (pp. 101-106). IEEE.
[5] Saraswathi K, Renukadevi NT, Nandhinidevi S, Gayathridevi S, Naveen P. Sales prediction using
machine learning approaches. In AIP Conference Proceedings 2021 Nov 1 (Vol. 2387, No. 1). AIP
Publishing.

References
[6] Jiang H, Ruan J, Sun J. Application of machine learning model and hybrid model in retail sales forecast.
In 2021 IEEE 6th International Conference on Big Data Analytics (ICBDA) 2021 Mar 5 (pp. 69-75). IEEE.
[7] Rezazadeh A. A generalized flow for B2B sales predictive modeling: An azure machine-learning approach.
Forecasting. 2020 Aug 6;2(3):267-83.
[8] Thiess T, Müller O, Tonelli L. Design Principles for Explainable Sales Win-Propensity Prediction Systems. In
Wirtschaftsinformatik (Zentrale Tracks) 2020 (pp. 326-340).
[9] Hu Y. Sales Prediction Based on State-of-art Machine Learning Scenarios. In Proceedings of the
International Conference on Financial Innovation, FinTech and Information Technology, FFIT 2022, October
28-30, 2022, Shenzhen, China 2023 Apr 14.
[10] Mortensen S, Christison M, Li B, Zhu A, Venkatesan R. Predicting and defining B2B sales success with
machine learning. In 2020 Systems and Information Engineering Design Symposium (SIEDS) 2020 Apr 26
(pp. 1-5). IEEE.
