Final Report-Inqer Printers and Spares-Final - Edited
TEAM MEMBERS
ABHYUDAYA MARYA
JEEVA RAMPRASAD
ROMA DADHIRAO
SUDARSHAN RAKHMAJI KADGE
G TAROON SUBRAMANIAM
Table of Contents
1. Introduction
2. Project Description & Tools Used
3. Role of Machine Learning
4. Data Exploration
5. Data Manipulation
6. Feature Engineering
7. Building Training-Test Sample
8. Model Selection and Evaluation
9. Key Contributions
10. Analysis of Results
11. Tableau Visualizations
Final Report-Amity Capstone Demand Forecasting-Printer Spare Parts
Introduction
Through this project, we aim to help an organization (Inqer Printers and Spares*) that sells
printer spare parts by integrating its stock/ inventory and demand data, collated over the past
15 months, in order to track its supply chain effectively, improve decision making and expedite
grievance redressal using Machine Learning algorithms. Using statistical techniques, we forecast
demand in the short, medium and long term. In addition, we optimize the inventory to maximize
profits (while remaining cognizant of the need for a buffer stock). This helps in drawing up
better sales and operations plans across departments and improves resourcing efficiency
by creating supply plans based on prioritized demands, allocations and supply chain
constraints.
Dataset Description:
1. The data set consists of the sales of a company dealing in a large number of
components.
2. It includes the inventory of the components in 3 different warehouses: AME, APJ and
EMEA.
3. It also includes parameters for prioritizing inventory planning: Local Area
Stock Code, PSMS*, D-Chain Status**, SPT***
D-Chain    PSMS
25         C2, C4
60, 61     S9
*The data is real-world data from an organization (with certain realistic changes); a fictitious name is used to protect its privacy.
D-Chain Description
Time it takes to receive a part after a purchase order (PO) is placed with a supplier
1. Part is repairable
2. Part is due to be returned to the OEM for warranty coverage
3. If a part is non-returnable, it is assumed it cannot be repaired
PSMS Description
S6 Sustaining: supported part which is > 180 past the SAP FCS date
C8 Supplier (or ‘Supply’) EOL, POs still possible with potential limitations
d. Visualization
i. Matplotlib
ii. Seaborn
e. Other
i. Statistics
ii. Itertools
Objective
1. In this project, the target is to predict demand for a product based on its prior sales.
2. We are also trying to create a system to manage the inventory of the different
warehouses, depending on the sales of a product.
3. This will give an idea of how much quantity of a product should be ordered.
*PSMS->Plant Specific Material Status; it indicates the part's current position in the life cycle
**D-Chain->Determines whether a part is for sale and available for immediate delivery
***SPT->Special Procurement Type->Returnable or non-returnable
booming, given the huge variance that is to be expected in a business of such a wide ambit.
The manipulation is done using Python and the scikit-learn library.
Given the labelled data and the categorical nature of the target (predicted) variable,
we employed the following supervised learning techniques:
a. GaussianNB
b. Random Forest Classifier
c. Decision Tree Classifier
d. MultinomialNB
e. SVM
f. Bernoulli NB
The accuracies obtained (post feature engineering) are mentioned below.
As is evident, the tree-based classifiers, Random Forest (an ensemble method) and Decision Tree,
give us the best results, hence we base our results on them. Over several trials, including
shuffling of the data in the hold-out split and using random samples from the dataset, Random Forest is
consistently the best method.
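The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sales data is not public, so a synthetic six-class dataset from make_classification stands in for it, and the split ratio, the number of repeats and the default hyperparameters are all assumptions.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB, GaussianNB, MultinomialNB
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the sales data (assumption): 6 demand classes.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
# MultinomialNB needs non-negative features, so scale everything to [0, 1].
# Scaling before the split is a simplification made for brevity.
X = MinMaxScaler().fit_transform(X)

models = {
    "GaussianNB": GaussianNB(),
    "RandomForest": RandomForestClassifier(random_state=0),
    "DecisionTree": DecisionTreeClassifier(random_state=0),
    "MultinomialNB": MultinomialNB(),
    "SVM": SVC(),
    "BernoulliNB": BernoulliNB(),
}


def holdout_accuracies(seed):
    """Fit every model on one shuffled hold-out split and return test accuracies."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, shuffle=True, stratify=y, random_state=seed
    )
    return {name: m.fit(X_tr, y_tr).score(X_te, y_te) for name, m in models.items()}


# Repeat the hold-out with different shuffles and average the accuracies.
scores = {name: [] for name in models}
for seed in range(3):
    for name, acc in holdout_accuracies(seed).items():
        scores[name].append(acc)

mean_scores = {name: float(np.mean(vals)) for name, vals in scores.items()}
for name, acc in sorted(mean_scores.items(), key=lambda kv: -kv[1]):
    print(f"{name:<14s} {acc:.3f}")
```

Averaging over several shuffled splits mirrors the "several trials" mentioned above; the ranking on synthetic data need not match the one reported for the real dataset.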
Data Exploration
This was the first step of the demand forecasting process, as it is of most data analytics life cycles
in general. After the required libraries (NumPy, Pandas, scikit-learn, Matplotlib etc.) are imported,
the steps are:
a. Load the csv file and check the datatypes
b. Read the first few rows (head() function)
f. One-hot encoded the month, since our initial assumption/ null hypothesis is that
the month per se will not have an impact. If it does, the machine learning model
will account for it, since OHE gives every month equal weight to begin with. Similarly,
applied OHE to the region too.
g. For the target (predicted) variable, we bin it as very low, low, medium, high, very high and
booming, based on the needs of the lead. By trial and error based on the 25th, 50th, 75th,
85th, 95th and 99th percentiles of the target variable values, the bins were chosen as:
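The chosen bin edges were given in a figure that is not available in this text. As a minimal sketch of the exploration, one-hot encoding and percentile binning steps, the example below uses a small synthetic table. The column names (month, region, demand) and the five cut points are assumptions for illustration only, not the values used in the project.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the csv file (assumption); the real file is not available.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "month": rng.choice(["Jan", "Feb", "Mar"], size=n),
    "region": rng.choice(["AME", "APJ", "EMEA"], size=n),
    "demand": rng.gamma(shape=1.5, scale=40.0, size=n),
})

# Exploration: check the datatypes and read the first few rows.
print(df.dtypes)
print(df.head())

# One-hot encode month and region; the original columns are replaced
# by indicator columns such as month_Jan and region_AME.
df = pd.get_dummies(df, columns=["month", "region"])

# Six classes need five interior cut points. The percentiles picked here are
# hypothetical; the report settled on its own edges by trial and error.
labels = ["very low", "low", "medium", "high", "very high", "booming"]
cut_points = np.percentile(df["demand"], [25, 50, 75, 95, 99])
edges = [-np.inf, *cut_points, np.inf]
df["demand_class"] = pd.cut(df["demand"], bins=edges, labels=labels)

print(df["demand_class"].value_counts(sort=False))
```

Using open-ended outer edges (-inf and +inf) guarantees that every row falls into some class, including values outside the range seen when the cut points were computed.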
Data Manipulation
a. Dropping redundant/ non-numerical columns. Drop month and region, since they were
already one-hot encoded.
b. Normalization of all columns except the local stock advice code (it is essentially a weight)
c. Price, DChain and Inventory had a few NaN/ invalid values, so those were replaced
with the column median
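The manipulation steps above could look roughly like this in Pandas. It is a sketch under assumptions: the column names and the toy values are hypothetical, and min-max scaling is used because the report does not say which normalization was applied.

```python
import numpy as np
import pandas as pd

# Toy data with hypothetical column names; "n/a" stands for an invalid entry.
df = pd.DataFrame({
    "part_id": ["P1", "P2", "P3", "P4", "P5"],      # non-numerical column
    "price": [12.0, "n/a", 30.0, 18.0, 24.0],
    "d_chain": [25.0, 60.0, np.nan, 61.0, 25.0],
    "inventory": [100.0, 40.0, 75.0, np.nan, 10.0],
    "local_stock_code": [1, 3, 2, 1, 2],            # treated as a weight, not scaled
})

# a. Drop the non-numerical column.
df = df.drop(columns=["part_id"])

# c. Turn invalid entries into NaN, then replace NaN with the column median.
for col in ["price", "d_chain", "inventory"]:
    df[col] = pd.to_numeric(df[col], errors="coerce")
    df[col] = df[col].fillna(df[col].median())

# b. Min-max normalize every column except the local stock code (assumed scheme).
to_scale = [c for c in df.columns if c != "local_stock_code"]
col_min = df[to_scale].min()
col_range = df[to_scale].max() - col_min
df[to_scale] = (df[to_scale] - col_min) / col_range

print(df)
```

Imputation is done before scaling here so that the filled-in medians are scaled together with the observed values.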
Feature Engineering
Before delving into the feature engineering process, we first present an analysis of the various
supervised learning methods (after the hold-out method was applied)
(Only three examples of basic code snippets are shown, else they would fill the whole page)
Hence, it was imperative to filter out the important features. Random Forest was used for this, since
it was the most accurate method here.
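One common way to filter features with a random forest is to rank them by the forest's impurity-based importances and keep those above a threshold. The sketch below assumes that approach; the synthetic data, the feature names and the "mean" threshold are illustrative choices, not taken from the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel

# Synthetic stand-in (assumption): with shuffle=False the first 4 of the
# 12 columns are the informative ones, the rest are noise.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=4, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, shuffle=False, random_state=0,
)
feature_names = [f"f{i}" for i in range(X.shape[1])]  # hypothetical names

# Fit a forest inside SelectFromModel and keep the features whose
# importance is at least the mean importance.
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=200, random_state=0),
    threshold="mean",
).fit(X, y)

importances = selector.estimator_.feature_importances_
for name, imp in sorted(zip(feature_names, importances), key=lambda kv: -kv[1]):
    print(f"{name:<4s} {imp:.3f}")

kept = [name for name, keep in zip(feature_names, selector.get_support()) if keep]
X_reduced = selector.transform(X)
print("kept features:", kept)
print("shape before/after:", X.shape, X_reduced.shape)
```

Training the classifiers again on X_reduced is what allows accuracy and computation time to be compared before and after feature selection.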
The process was performed twice->first on the dataset as it is and then again, post the feature
engineering process.
Initially, only the accuracy was tested, to check the potential for feature engineering:
While the accuracy is decent for a few methods (Random Forest, DT, KNeighbors, SVM),
there is potential for improvement in both accuracy and computation time; hence, it is
imperative to resort to feature selection.
This did not bode well for the Naïve Bayes and SVM based methods; however, in terms of the tree-based
(Random Forest and DT) and neighbors-based (KNN) methods, the improvement is highly significant.
While any of the three could be chosen, we stuck to the Random Forest Classifier.
Confusion Matrix
With values of over 75% in all metrics, this method can be given the go-ahead for the
purpose.
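For reference, a confusion matrix and the per-class metrics for a Random Forest classifier can be produced along the following lines. This is only a sketch: the data is synthetic, the class names reuse the six demand categories from the report, and the split and hyperparameters are assumptions, so the numbers it prints are not the project's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split

class_names = ["very low", "low", "medium", "high", "very high", "booming"]
class_ids = list(range(len(class_names)))

# Synthetic stand-in for the engineered dataset (assumption).
X, y = make_classification(
    n_samples=900, n_features=10, n_informative=6,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
y_pred = clf.predict(X_te)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_te, y_pred, labels=class_ids)
print(cm)

# Precision, recall and F1 per demand class, plus overall accuracy.
print(classification_report(
    y_te, y_pred, labels=class_ids, target_names=class_names, zero_division=0
))
accuracy = accuracy_score(y_te, y_pred)
print(f"accuracy: {accuracy:.3f}")
```

The diagonal of the matrix counts correct predictions, so its sum divided by the total number of test rows equals the accuracy.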
Key Contributions
Advances in subsets of Data Science, viz. Machine Learning, Artificial Intelligence and Deep
Learning, already contribute to our day-to-day activities, and increasingly so every day; perhaps
the present millennial generation and the generations to come would find a life without them
unfathomable. While it may be far-sighted at this stage, the zenith of this is expounded by
Murray Shanahan as the “Technological Singularity”, which in very rudimentary terms would involve
our entire life processes being driven by technology, to the point of it being the sole decision maker.
Without digressing into the philosophical aspects, Machine Learning is used in a
swathe of industries, ranging from efficient assembly lines in core industries, to demand and
inventory planning in the service and e-commerce sectors, to a range of processes in the
primary sector of agriculture. Here, using real-world data, we
exemplified its significance for a small to medium scale printers and spares organization,
showing how it could aid effective decision making and inventory planning. Today,
employing machine learning for a business may be a luxury, but in the coming years it will
be a sine qua non. The following aspects were covered, as they usually are in any analytics
process:
a. Business Acumen=>Often domain-specific knowledge
b. Comprehensive knowledge of statistics as well as Machine Learning mathematics
c. Data Exploration processes, both technical and logical/ face value based
d. Data preparation, both for valuable insights and to improve computation time
e. Data Modelling and Evaluation: choose the right techniques and evaluate which is the most
appropriate
f. Deployment->Export the learnings into a csv file for further analysis
g. Descriptive and Predictive visualizations: Tableau/ Excel/ Power BI
Analysis of Results
With an accuracy of at least 75%, we can give the proprietor of Inqer Printers and Spares
a reasonably reliable prediction of demand. (Not to be too pedantic, but the data runs up to Feb-2020, right
before the economic slump due to the pandemic, so realistically, the analysts might have had
a shock from how far the actual results diverged, which further highlights the importance of
dynamically accounting for such factors (economists use the phrases “animal spirits” and “black
swan events”); this is a learning in itself from the project.) Nevertheless, the graph shows
how the expectations have been classified into the various demand categories.
As the trend suggested, a lot of products are indeed expected to be in “Low”
demand, as they followed those metrics; however, very few are in “Very Low” and an
encouraging number are in “Very High” and “High”. It is alarming, however, that there are
more products with low demand than with medium demand, which makes it imperative to clear
out the stock that has not moved off the shelves, through one of:
a. Sale at throwaway prices
b. Depending on e-commerce platforms for sale
c. On-the-shelf marketing in local stores rather than solely in the company's own store
d. Chain marketing
e. Grassroots-level lead generation for potential bulk orders (although executing this may
be expensive and might not offer a great trade-off)
For the few within “Very Low”, it undoubtedly has to be through sale at throwaway
prices.
The goal is to bring as many products into the booming category as possible; this is, however,
not a realistic aim and would in any case only raise the yardsticks further.
A high amount of demand within the “medium” category indicates that Inqer is not threatened
by competition, or at most that Inqer has upped the ante; however, it must always
be on the lookout.
Tableau Visualizations
Descriptive
Predictions