
ISSN 2089-8673 (Print) | ISSN 2548-4265 (Online)

Volume 12, Issue 1, March 2023

DEMAND FORECASTING FOR IMPROVED INVENTORY MANAGEMENT IN SMALL AND MEDIUM-SIZED BUSINESSES

Dian Indri Purnamasari1, Vynska Amalia Permadi2*, Asep Saepudin3, Riza Prapascatama Agusdin4
1 Department of Accounting, Universitas Pembangunan Nasional Veteran Yogyakarta
2*,4 Department of Informatics, Universitas Pembangunan Nasional Veteran

Yogyakarta
3 Department of International Relations, Universitas Pembangunan Nasional Veteran

Yogyakarta
email: dian_indri@upnyk.ac.id1, vynspermadi@upnyk.ac.id2*, asep.saepudin@upnyk.ac.id3,
rizapra@upnyk.ac.id4

Abstract
Small and medium-sized businesses are constantly seeking new methods to increase productivity
across all service areas in response to increasing consumer demand. Research has shown that
inventory management significantly affects regular operations, particularly in providing the best
customer relationship management (CRM) service. Demand forecasting is a popular inventory
management solution that many businesses are interested in because of its impact on day-to-day
operations. However, no single forecasting approach outperforms all others in every scenario, so
examining the data and its properties first is necessary for modeling the most accurate forecasts.
This study provides a preliminary comparative analysis of three different machine learning
approaches and two classic projection methods for demand forecasting in small and medium-sized
leathercraft businesses. First, using K-means clustering, we attempted to group products into three
clusters based on the similarity of product characteristics, using the elbow method's
hyperparameter tuning. This step was conducted to summarize the data and represent various
products into several categories obtained from the clustering results. Our findings show that
machine learning algorithms outperform classic statistical approaches, particularly the ensemble
learner XGB, which had the least RMSE and MAPE scores, at 55.77 and 41.18, respectively. In
the future, these results can be utilized and tested against real-world business activities to help
managers create precise inventory management strategies that can increase productivity across
all service areas.
Keywords: Demand Forecasting, Inventory Management, Machine Learning, Small and Medium-
Sized Businesses

Received: 09-01-2023 | Revised: 28-03-2023 | Accepted: 31-01-2023


DOI: https://doi.org/10.23887/janapati.v12i1.57144

INTRODUCTION
Data analytics is no longer limited to huge multinational companies due to advancements in data processing and storage capacity. However, it can be critical in developing strategies to support businesses of all sizes. On the other hand, small and medium-sized businesses should be able to use data analytics to uncover information such as client purchase habits, demand forecasting, and effective customer relationship management. According to Bokman et al. [1], businesses that use consumer analytics beat their competition by 126%. In contrast, failing to consider data analysis in strategy formulation may put a business at a competitive disadvantage.

Furthermore, increased consumer demand for goods and services has encouraged small and medium-sized

Jurnal Nasional Pendidikan Teknik Informatika : JANAPATI | 56



businesses to look for ways to improve operational efficiency in any service domain. Internal logistics and warehousing are high-growth sectors that can increase an organization's operational efficiency [2], [3] and have been the subject of several studies, as evidenced by the fact that influential organizations have conducted them [1]. Businesses have encouraged and supported the fulfillment of customer requirements and expectations throughout the supply chain and warehouse activities, which are critical to a company's operational performance. They discovered that inventory management significantly impacts logistics performance indices, particularly for organizations looking to reduce costs and improve product formulation and delivery procedures.

Inventory management refers to administering and maintaining a company's stock or inventory. In this context, the word "inventory" refers to raw materials, auxiliary materials, items in production, finished goods, and spare parts. Several intelligent inventory management solutions have been investigated to increase inventory management effectiveness [4]–[6]. Inventory management is closely linked to customer relationship management, and several technologies are used to meet consumer demand by shifting from traditional to innovative inventory management. Various approaches are available for forecasting client demand. Two possible strategies are to employ statistical analytic methodology and data mining approaches. A prediction based on statistics or mathematical analysis strongly relies on the quality of historical data (such as transaction history/product orders). As a result, if an organization has archives of high-quality data, making more accurate estimations will be easier. Organizations that use this inventory management technique typically have a quantitative analytic staff that analyzes historical data on a regular basis to uncover potential patterns and trends in market demand. The requirements analysis is based on historical customer demand transactions and will be used to manage the firm's inventories.

X. Guo's research [7] observed past order data to estimate client demand. The initial investigation consisted primarily of scanning the company's website for information on product orders. Then, utilizing data mining tools, the study was undertaken on projected future consumer requests until a more efficient inventory management policy was established. Inventory management using data mining and sales history data may significantly reduce total inventory expenses. As previously stated, data quality is critical for achieving high-significance estimations and efficiency. However, the previous study showed that the gathered findings might still be improved because web-collected data is typically poor-quality and requires numerous additional pre-processing procedures. Another study [8] used data mining techniques to control inventories and focused on associating historical data with decision-making. A complete analysis was conducted based on the observed intercorrelation to boost forecast accuracy, and the authors believe that the company's expenses and consumption will be lowered. Moreover, a predictive analytics-based approach for inventory management that uses machine learning algorithms to analyze sales data and predict future demand has already been conducted. The authors focus on the use of time-series forecasting models to make accurate demand forecasts and optimize inventory levels. The proposed approach also incorporates a dynamic pricing model that adjusts prices based on demand patterns, which can help to further optimize inventory and reduce costs. The paper presents a case study demonstrating the proposed approach's effectiveness in improving inventory management for an online retailer.

Moreover, Zhang's [9] literature review comprehensively reviews the existing literature on inventory management for spare parts. The author discusses various approaches and techniques for spare parts inventory management, including forecasting demand, setting inventory levels, and determining reorder points. The paper also examines the challenges and opportunities associated with spare parts inventory management, such as intermittent demand, obsolescence, and service level


agreements. Overall, the paper provides a useful resource for researchers and practitioners seeking to improve spare parts inventory management in various industries, including manufacturing, aerospace, and healthcare.

This research aims to utilize various analysis techniques to determine the most appropriate model for one of the Small and Medium-Sized Leathercraft Businesses in Yogyakarta, Indonesia, as in previous studies. The selected leathercraft industry is among the largest SMEs in Yogyakarta, established in 2011, and sells its products through various channels. The company has not yet employed a data analytics approach in its decision-making process. Thus, this study seeks to provide valuable insights into the business by suggesting a demand forecasting model that aligns with service requirements, such as avoiding stock-outs or over-orders. To achieve the company's goal of delivering the best services to its customers, we will identify the most suitable demand forecasting technique that allows it to predict future monthly demand.

Since our objective is to develop the most relevant forecasting model concepts to assist the company in managing its inventory of handcrafted products across multiple parameters, the main takeaway of our study is that it is crucial to construct the forecasting model with particular consideration for the type of dataset in order to achieve higher accuracy. We will use a transactional historical time-series dataset to compare several demand forecasting algorithms. This work makes several contributions, including: (1) employing machine learning clustering to categorize various products with similar characteristics for efficiency, (2) utilizing demand forecasting based on statistical computation and machine learning algorithms to predict demand and provide a comparative analysis of each model's performance, and (3) suggesting the best forecasting model for the inventory management of one of the largest leathercraft SMEs in Yogyakarta.

METHOD
Forecasting models can be classified into two types: qualitative and quantitative, each with three approaches based on analytical methodology: statistical, data mining/machine learning-based, and hybrid. The literature on business forecasting outlines continuous and intermittent demand techniques based on underlying market demands for products to manage inventory. Identifying trends such as intermittency is critical for selecting the best forecasting strategy. A typical approach that emphasizes expert judgment or client viewpoints above quantitative analysis is qualitative forecasting, also known as judgmental forecasting. Qualitative forecasting is desirable and frequently necessary in the absence of historical data to support quantitative methodologies or when past values have little or no influence on future events. However, qualitative forecasting is prone to bias due to its reliance on human opinion, which can be influenced by personal and political objectives. Experts and forecasters also tend to place greater emphasis on recent historical happenings, resulting in estimates that are close to the current reference point, adding another challenge to appraising predictions.

Quantitative forecasting uses mathematical (statistical) models to predict the future. Since quantitative forecasting models are objective, they should be used whenever there is a substantial amount of previous data that can be logically associated with predicted values (i.e., past historical data have unique trends and continuous values). Cross-sectional or time-series data can be used for quantitative forecasting, with time-series data being the most common type used for predictions. Quantitative forecasting uses various models based on unique combinations of predictive parameters, which fall into one of two categories: time-series or explanatory. Explanatory forecasting models aim to identify the factors that affect the target variable, such as inflation, while not taking previous trends into consideration. Regression analysis is the most well-known method in the field of forecasting. Regression projections explore the


relationships between one or more independent variables and a dependent variable.

The following explanation covers a variety of time-series forecasting methodologies and current publications on demand forecasting applications, split into three categories: statistical, machine learning, and hybrid.

Statistical Procedures
Time-series forecasting, particularly demand and sales forecasting, is often accomplished through statistical methods. Furthermore, statistical methods can be broadly classified as either continuous, where a constant time-series pattern captures the demand history, or non-continuous. Additionally, there is an ad hoc intermittent analytic technique for slow-moving items. As part of our literature review, in this section we discuss several examples of well-known classic statistical forecasting approaches and some of the straightforward methodologies that are frequently used.

The implementation of forecasting methods on time series data relies mainly on Exponential Smoothing models. This method generates projections as the weighted average of previous data. There are several approaches to exponential smoothing, including simple, double, and triple Exponential Smoothing models. The Autoregressive Integrated Moving Average model [10] combines the best aspects of autoregressive and moving average models by differencing time series data. ARIMA iteratively executes three stages using the Box-Jenkins method: model detection, parameter estimation, and diagnostic checking. The Theta Model [11] is frequently applied in time series forecasting, particularly in supply chain management and planning, because of the precision of its point forecasts [12]. The Vector Autoregressive (VAR) model [13] extends the univariate autoregressive model's capabilities to multivariate time series prediction. It has become a common tool for time series forecasting due to its ease of use and versatility. However, selecting the variables and lags employed in a VAR model is essential. To keep this method performing well, limit the number of variables to the correlated ones.

Machine Learning Methods
Machine learning (ML) approaches were first used for forecasting in 1964, but little follow-up work was done for several decades. Since then, several studies have been conducted on applying ML systems to demand forecasting. The most popular time-series forecasting models include CART regression trees [14], Generalized Regression Neural Networks [15], K-Nearest Neighbor regression [16], Bayesian Neural Networks [17], Gaussian Process regression [18], Long Short-Term Memory networks [19], Multi-Layer Perceptrons [20], Recurrent Neural Networks [21], and Radial Basis Functions [22]. Several large-scale comparative studies, most of which are empirical, have reviewed different approaches to regression or time-series forecasting challenges. In a significant comparative study [23], the Multi-Layer Perceptron and Gaussian Process regression were the most effective algorithms, followed by Bayesian Neural Networks and Support Vector Regression. Another study [24] using time-series data discovered that the model with the best performance was Radial Basis Functions, followed by Recurrent Neural Networks and Multi-Layer Perceptron. Generalized Regression Neural Networks, on the other hand, had the worst performance according to recent comparison research [25], and the Multi-Layer Perceptron is the most viable forecasting strategy among Machine Learning models. Although each method has advantages and weaknesses, data quality is critical for any empirical study seeking to evaluate the performance of a specific forecasting methodology. As such, there is no "generic" guaranteed technique for making predictions. The nature of the data and the situation in which the forecast is being made should affect the methodology chosen. Research also shows that Neural Networks and their variants perform the best of all machine learning algorithms when it comes to predicting time series.
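As a concrete illustration of the exponential smoothing family described above, here is a minimal simple-exponential-smoothing sketch; the smoothing factor `alpha` and the demand values are hypothetical, and double/triple smoothing would add trend and seasonal terms on top of this recursion:

```python
# Simple exponential smoothing: each one-step-ahead forecast is a weighted
# average of the most recent observation and the previous forecast:
# F(t+1) = alpha * y(t) + (1 - alpha) * F(t)

def simple_exp_smoothing(series, alpha):
    """Return one-step-ahead forecasts; forecasts[i] predicts series[i]."""
    forecasts = [series[0]]  # initialize the first forecast with the first value
    for t in range(1, len(series)):
        forecasts.append(alpha * series[t - 1] + (1 - alpha) * forecasts[-1])
    return forecasts

# hypothetical monthly demand values
demand = [120, 130, 110, 140, 150, 135]
print(simple_exp_smoothing(demand, alpha=0.5))
```

With `alpha` close to 1 the forecast tracks the latest observation; with `alpha` close to 0 it averages over a long history.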


Hybrid Approaches
This approach aims to bring together the best features of many statistical and ML-based forecasting techniques. Hybrid approaches include methods like SOM-SVR and ANN-ARIMA. To provide better learning and more precise prediction results, SOM-SVR uses a Self-Organizing Map to first split the dataset into clusters, then a Support Vector Regressor models the grouped data. Tay and Cao's research [26] utilized the hybrid SOM-SVR for financial time series forecasting. Moreover, ANN-ARIMA has produced more accurate forecasts because of its ability to cover linear and nonlinear time series components [27]. Time-series forecasting problems, including the prediction of energy prices and stock market movements [28], have benefited from this hybrid approach.

We concluded from our thorough investigation that no single forecasting approach outperforms all others in every scenario. For the most accurate forecasts, it is necessary to first examine the data and its properties. This research aimed to evaluate which forecasting methodologies would be the most efficient for one of the leathercraft Small and Medium-Sized Businesses in Yogyakarta, Indonesia. The steps we took to complete our contribution are detailed in the following parts.

Figure 1. CRISP-DM methodology [29]

First, we applied a Cross Industry Standard Process for Data Mining (CRISP-DM) [29] approach to resolve our demand forecasting comparative research. The stages of CRISP-DM are depicted in Fig. 1. CRISP-DM is the most widely disseminated standard process model describing common data mining techniques and phases. The initial phase is to analyze the business process and available data, followed by data preparation and modeling, and lastly, model evaluation. This study, however, did not encompass the deployment step. The procedures involved at each level are outlined below.

Business & Data Understanding
After collecting the necessary transactional data from various sales channels in the leathercraft industry, we first evaluate the data that will be used to create a solution as the initial step in developing a demand forecast. Therefore, the exploratory data analysis process involves the following steps: first, visualizing historical sales and demand data; second, analyzing the primary characteristics of demand and sales time series; and third, examining the cross-correlation between demand and other time-dependent variables (such as product price). Additionally, we have incorporated certain features into the datasets. For example, we included vacation dates in our analysis to determine if external factors could aid in demand forecasting. While compiling holiday information, we considered both general public holidays and observable school holiday periods.

Data Preparation
After completing the above stages, and considering that numerous products have already been sold, we recommend grouping the products into specific clusters to generate a demand prediction forecast for each group. To classify products with similar characteristics into multiple categories, we will use K-Means, a machine learning clustering approach. We will utilize hyperparameter tuning techniques such as the elbow method to determine the optimal number of k clusters that reflect the best number of product groups. We do not wish to specify the exact number of categories required, as we aim to use a machine learning implementation to find the most


suitable number of clusters based on their characteristics.

Our objective here is to identify the optimal number of clusters that exhibit similar characteristics in the data. The parameter k defines the number of groups into which the data should be divided. The elbow method is utilized as a hyperparameter tuning technique to determine the ideal number of clusters. The appropriate number of clusters is achieved when the SSE sharply declines initially and then levels off as k increases. This phenomenon can be evaluated through the SSE plot of each k iteration.

Modeling
Table 1 summarizes the models investigated in this comparative analysis, aimed at comparing machine learning and traditional (classical) forecasting methods. Therefore, we chose SARIMA, a common univariate methodology, as well as SARIMAX, a multivariate classical alternative, to be compared with machine learning algorithms in our study. We also investigated machine learning approaches capable of capturing nonlinear relationships, taking into account the possibility of temporal dependencies. For this purpose, we built a Recurrent Neural Network (RNN) using Long Short-Term Memory (LSTM) network cells. Additionally, we included Extreme Gradient Boosting in this comparative study, as integrated or ensemble approaches have been found to perform better than standalone ones in certain cases [30], [31]. Lastly, we included a Bayesian approach to model uncertainty using Gaussian Process Regression.

Table 1. Comparative analysis algorithm list
Algorithm | Abbreviation
Seasonal Autoregressive Integrated Moving Average | SARIMA
Seasonal Autoregressive Integrated Moving Average with external factors | SARIMAX
Long Short-Term Memory Network | LSTM
Extreme Gradient Boosting (XGBoost) | XGB
Gaussian Process Regression | GPR

We used time series cross-validation with regular model fitting to simulate the operational demand forecasting scenario with frequently updated business sales reports and a potentially shifting data distribution. Firstly, the 30 most recent months were selected as the training set to determine the best hyperparameter combination. The remaining data was used as the testing set. The model configuration with the best performance across the entire test set was chosen based on the evaluation metrics.

Evaluation
We used the Root Mean Squared Error (RMSE) in Equation 1 and the Mean Absolute Percent Error (MAPE) in Equation 2 for model evaluation. A lower score indicates improved performance across all assessment metrics. We added a constant value to the denominator to avoid division by zero when using MAPE. MAPE values can potentially increase significantly when demand numbers are close to zero, which is a common situation in daily observation datasets.

RMSE = sqrt( (1/n) * Σ_{i=1}^{n} (y_i − ŷ_i)² )    (1)

MAPE = (100% / n) * Σ_{i=1}^{n} | (y_i − ŷ_i) / y_i |    (2)

where n is the number of data points, y_i is the i-th measurement, and ŷ_i is its corresponding prediction.

RESULT AND DISCUSSION
This section summarizes our findings, followed by an in-depth analysis of our experimental research. After studying the characteristics of historical sales data, we selected several product characteristics (Product Size, Product Motif, Product Color), sales amount, sales type, and unit pricing as clustering features. An overview of the data, including the attributes and value distributions for each attribute, is provided in Table 2.
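The K-means-plus-elbow procedure described earlier can be sketched with a minimal K-means on toy one-dimensional data. The data values, the deterministic seeding, and the small k range are illustrative assumptions only; the study clusters multi-feature product records for k = 1 to 8:

```python
# Minimal K-means (1-D) plus an SSE curve for the elbow method.
# Seeds are evenly spaced over the sorted data so the run is reproducible.

def kmeans_sse(points, k, iters=50):
    pts = sorted(points)
    centers = [pts[int(i * (len(pts) - 1) / max(k - 1, 1))] for i in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in pts:
            # assign each point to its nearest center
            idx = min(range(k), key=lambda c: (p - centers[c]) ** 2)
            clusters[idx].append(p)
        # move each center to the mean of its assigned points
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    # SSE: total squared distance from each point to its nearest center
    return sum(min((p - c) ** 2 for c in centers) for p in pts)

# three well-separated toy groups -> the SSE curve should "elbow" at k = 3
data = [1.0, 1.2, 0.8, 5.0, 5.1, 4.9, 9.0, 9.2, 8.8]
sse = {k: kmeans_sse(data, k) for k in range(1, 6)}
print(sse)
```

Plotting `sse` against k would reproduce the elbow shape: a steep drop up to k = 3, then a flat tail, which is exactly the criterion used to pick the number of product groups.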


Then, we employed K-means clustering to automatically cluster the data based on similar characteristics. Our aim was to determine the number of clusters that best represents data groups with similar features. The k parameter indicates the number of groups into which the data should be aggregated. This research applies the elbow approach as hyperparameter tuning to determine the best number of clusters. The SSE is the elbow method's core index, representing the clustering inaccuracy of all samples under evaluation for different k possibilities. When the number of clusters is fewer than the true number of groups, increasing k amplifies the degree of aggregation of each cluster, hence reducing the SSE. k is considered the correct number of clusters when the SSE initially decreases sharply and then levels off as k increases.

Table 2. Data Overview
Attributes | Value Distributions
Product Name | More than a thousand unique product names
Product Size | Small, Medium, Large
Product Motif | More than fifty unique product motifs (batik motifs, e.g., Sidomukti, Parang, etc.)
Product Color | More than thirty unique product colors (e.g., Havana, Borduk, etc.)
Sales Amount | 12 ~ 200
Sales Type | In stock, Pre-order
Unit Pricing | 75.000 ~ 1.195.000

For each iteration, test results to determine the proper number of k clusters and their SSE values are represented in the k-SSE diagram in Fig. 2. We varied the value of k from 1 to 8, and the plot shows that three clusters are sufficient for grouping our transactional dataset. Three is chosen since the SSE initially decreases sharply, as seen in the figure, then levels off as k increases.

Figure 2. Elbow method SSE plot results, showing the best number of clusters k is three

Then, the obtained k is used to group all the products into three clusters using the K-Means algorithm. Through cluster result analysis, we concluded that the third category of products has the highest sales volume, the highest sales frequency, and the lowest price. We labeled this category as "Prioritized product" since these products have the highest customer demand and must always be kept in stock. The first category of products ranks second in terms of sales volume and frequency, and the unit price is also lower. So we named this group "Popular product," which should be kept in stock in moderate quantities. The second category of products is more expensive than the first and third categories, and its sales are less frequent; hence we labeled these products as an "Exclusive Collection," which should be kept in limited quantities to provide alternative options while not interrupting cash flow. These classification results can be used to analyze whether the business's stock inventory is adequate on an operational basis.

The next step to achieve successful inventory management is to establish the exact stock levels for each category mentioned above. The most straightforward strategy is to guess the proportion of stock in each category by trial and error. Unfortunately, such approaches impair a company's management team's ability to recognize critical moments. Therefore, the following section explores several demand forecasting methodologies to provide the manager with the best demand forecasting method, completing the inventory management process. Our analysis will only provide a rough estimate of which algorithm is most likely to produce the best results in our case study. In the future, the manager will need to implement a system that utilizes one of these demand forecasting methods to be


adequately prepared to develop inventory management strategies.

Before machine learning algorithms can use the series data, which contains historical transactional data from the last three years (2020-2022), it must be turned into a supervised learning scenario. In other words, we convert the series into input-output pairs by taking the features of the earlier time steps t-2 and t-1 as input and the value at the current time step t as the supervised output. Then, we divide the dataset into training and testing data. We use the first 30 months of data for training and the last 6 months for testing.

Hyperparameter tuning is a machine-learning technique that takes a snapshot of a model's performance at a given time and compares it to earlier snapshots. Every machine learning method requires the establishment of hyperparameters before developing a model into a real intelligence system. By adjusting the hyperparameters of the model, we can boost our model's performance and validate the chosen parameters using the validation dataset.

Before we describe our procedure further, it is important to understand the concept of cross-validation, which is a critical step in the hyperparameter tuning process. Cross-validation (CV) is a statistical technique used to evaluate the efficacy of machine learning models. Normally, we can predict how well a model will do on unknown data only after it has been trained. In other words, we cannot tell if the model is underfitting, overfitting, or performing well without evaluating how it performs on unseen data. When the data supplied is limited, cross-validation is considered a highly beneficial way to establish a machine learning model's performance. To perform cross-validation, we set aside some of the data for testing and validation, meaning not all subsets will be used to train the model; several data points will be kept to validate the model's performance. K-Fold is a popular cross-validation strategy, and we used it to validate our model, with a value of 10 for K for each algorithm. Tables 3–5 display the hyperparameter search spaces used by SARIMAX, LSTM, and XGB in this study.

Table 3. Seasonal Autoregressive Integrated Moving Average (with external factors) Hyperparameters
Parameter | Values
p/q/P/Q | [0, 1, 2, 3]
d/D | [0, 1]
exog | [False, True]
transf | [False, 'log', 'pw']

Table 4. Long Short-Term Memory Network Hyperparameters
Parameter | Values
activation_function | ReLU
optimizer | Adam
dropout_rate | [0.0, 0.5]
batch_size | [4, 8, 16, 32]
learning_rate | [1e-3, 1e-2, 1e-1]
min_val_loss_improvement | [0.01, 0.1]
max_epochs_wo_improvement | [10, 30, 50, 100]
lstm_hidden_dim | [5, 10, 15, 50]
lstm_num_layers | [1, 2]
seq_length | [1, 4]

Table 5. Extreme Gradient Boosting Hyperparameters
Parameter | Values
learning_rate | [0.05, 0.1, 0.3]
max_depth | [3, 5, 7]
subsample | [0.3, 0.7, 1]
n_estimators | [10, 100, 500, 1000]
gamma | [0, 1, 10]
alpha | [0, 0.1, 1, 10]
reg_lambda | [0, 0.1, 1, 10]

Below is a summary outlining several hyperparameters that can be adjusted while optimizing GPR. Since the kernel function is the most influential GPR parameter, we experimented with a wide variety of kernel functions as well as specific combinations of a maximum of three different kernels.

1. SquaredExponential()
2. Matern52()
3. White()
4. RationalQuadratic()
5. Polynomial()
6. Periodic(kernels=SquaredExponential(), period=seasonal_periods)
7. Periodic(kernels=Matern52(), period=seasonal_periods)
8. Periodic(kernels=RationalQuadratic(), period=seasonal_periods)
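The lag-based supervised transformation and the chronological train/test split described above can be sketched as follows; the 36-month series here is synthetic, standing in for the study's 2020-2022 monthly transaction data:

```python
# Turn a monthly demand series into supervised (X, y) pairs: the inputs are
# the two previous time steps (t-2, t-1) and the target is the value at t.

def to_supervised(series, n_lags=2):
    X, y = [], []
    for t in range(n_lags, len(series)):
        X.append(series[t - n_lags:t])  # [y(t-2), y(t-1)]
        y.append(series[t])             # y(t)
    return X, y

series = list(range(36))            # 36 synthetic monthly demand values
X, y = to_supervised(series)

# chronological split: targets in the first 30 months train, last 6 test
X_train, y_train = X[:28], y[:28]   # 28 samples (targets at months 3..30)
X_test, y_test = X[28:], y[28:]     # 6 samples (targets in months 31..36)
print(len(X_train), len(X_test))
```

A forecaster such as XGB can then be fit on (X_train, y_train) and scored with RMSE/MAPE on the held-out final six months, mirroring the evaluation protocol above.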


forecast methods investigated and the best evaluation metric results for each hyperparameter iteration. These results are shown in Table 6. SARIMA, the classic projection method, has an RMSE of 93.13 and a MAPE of 85.29. The SARIMAX algorithm, an extension of SARIMA, performs better with an RMSE of 48.96 but has a higher MAPE of 89.85. The LSTM algorithm has an RMSE of 85.36 and a MAPE of 82.41. The ensemble learner XGB outperforms the other algorithms with an RMSE of 55.77 and a MAPE of 41.18, making it the most accurate algorithm for demand forecasting in this study. Finally, Gaussian process regression (GPR) has an RMSE of 63.41 and a MAPE of 77.69.

Table 6. Comparison of metric evaluation results of all demand forecast models

Algorithm   RMSE    MAPE
SARIMA      93.13   85.29
SARIMAX     48.96   89.85
LSTM        85.36   82.41
XGB         55.77   41.18
GPR         63.41   77.69

Our findings indicate that the machine learning-based techniques outperform the classic statistical approach in forecasting demand. XGB yields the best performance in terms of RMSE and MAPE, as highlighted in the table, and the XGB ensemble outperformed the rest of the machine learning models. These findings demonstrate that the ensemble learning approach is preferable for capturing the data phenomena in this use case. Compared to the other models evaluated, the results also reveal that XGB's superior performance can overcome the limited quantity of data used in this study, where only data from the previous three years were used. LSTM and GPR may not perform well in this case because the amount of information available significantly impacts these machine learning algorithms' performance. SARIMAX's comparably strong performance might be due to extrinsic variables, such as a distinct data distribution in the training set, which could inhibit prediction. However, since all the superior approaches were multivariate, the findings show that external inputs, such as holiday data and the clusters' generated characteristics, lead to increased predictive ability. This extra information is also considered valuable in machine learning-based approaches. Based on the evaluation metrics, we recommend XGB projection as the demand forecasting model for our case study, one of Yogyakarta's leathercraft small and medium-sized businesses. However, it is crucial to consider the impact of the new normal era after COVID-19: we suggest first using historical data as training data to determine whether it fits well, as sales trends during the pandemic may have differed.

CONCLUSION
This study presents a preliminary comparative analysis of three machine learning approaches and two classic projection methods for demand forecasting in leathercraft small and medium-sized businesses. Firstly, we utilized K-means clustering to group the products into three clusters based on the similarity of product characteristics, using the elbow method for hyperparameter tuning. Furthermore, the data was summarized to represent the significance of the clustering results. The "Prioritized product" cluster had the highest sales volume and frequency, and its prices were relatively low. The products in the "Popular product" cluster ranked second in terms of sales volume and frequency, and their prices were also relatively low. The "Exclusive collection" cluster contained more expensive products than the first two categories; its sales were less frequent and lower in volume. According to the strategic inventory management recommendations, "Popular products" should be retained in moderation, "Priority products" should be kept in large quantities, and "Exclusive collections" should be preserved in limited quantities to provide alternative options while not interrupting cash flow.

The evaluation results of five different demand forecasting algorithms, including a classic projection method, are presented. SARIMA, the traditional projection method, produces an RMSE of 93.13 and a MAPE of 85.29. The SARIMAX algorithm, a modification of SARIMA, performs better with an RMSE of 48.96, but its MAPE is higher at 89.85. The LSTM algorithm generates an RMSE of 85.36 and a MAPE of 82.41. In comparison, the ensemble learner XGB stands out as the most accurate algorithm for demand forecasting in this study, with an RMSE of 55.77 and a MAPE of 41.18. Lastly, Gaussian process regression (GPR) has an RMSE of 63.41 and a MAPE of 77.69. Overall, the results show that machine learning algorithms outperform the classic projection method, with XGB the most accurate for demand forecasting in this study.

Furthermore, these results can in the future be utilized and tested against real-world business activities to help managers create accurate inventory management strategies. As a suggestion for subsequent implementation, considering the new normal era after COVID-19, historical data should first be used as training data to determine whether it fits well, since sales may have followed a different trend during the pandemic.
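The 10-fold cross-validation and the two reported metrics can be sketched as follows. This is a minimal illustration, not the authors' code: the linear stand-in model, the synthetic features, and the scikit-learn calls are all assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 4))                       # stand-in predictors (e.g. lagged sales)
y = X @ np.array([3.0, -2.0, 1.5, 0.5]) + 50 + rng.normal(scale=0.1, size=120)

kf = KFold(n_splits=10, shuffle=True, random_state=0)   # K = 10, as in the study
rmse_scores, mape_scores = [], []
for train_idx, test_idx in kf.split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    rmse_scores.append(mean_squared_error(y[test_idx], pred) ** 0.5)             # RMSE
    mape_scores.append(mean_absolute_percentage_error(y[test_idx], pred) * 100)  # MAPE, %

print(f"RMSE {np.mean(rmse_scores):.2f} | MAPE {np.mean(mape_scores):.2f}")
```

Note that for time-series data, an ordered splitter such as scikit-learn's TimeSeriesSplit is often preferred over shuffled K-Fold, so that no future observations leak into the training folds.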
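The GPR kernel scenarios listed earlier resemble a GPflow-style API. As a rough, library-swapped sketch on synthetic data, scikit-learn's kernel algebra can express similar candidates: RBF for SquaredExponential, Matern(nu=2.5) for Matern52, DotProduct standing in for Polynomial, and a product with ExpSineSquared approximating Periodic(base, period). The seasonal period of 12 is an assumption.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import (
    RBF, Matern, WhiteKernel, RationalQuadratic, ExpSineSquared, DotProduct)

seasonal_periods = 12.0  # assumed monthly seasonality

# scikit-learn analogues of the eight kernel scenarios above.
kernels = [
    RBF(),                                                        # ~ SquaredExponential
    Matern(nu=2.5),                                               # ~ Matern52
    WhiteKernel(),                                                # ~ White
    RationalQuadratic(),
    DotProduct(),                                                 # ~ Polynomial
    RBF() * ExpSineSquared(periodicity=seasonal_periods),         # ~ Periodic(SqExp)
    Matern(nu=2.5) * ExpSineSquared(periodicity=seasonal_periods),
    RationalQuadratic() * ExpSineSquared(periodicity=seasonal_periods),
]

# Toy data: three years of monthly demand with yearly seasonality.
X = np.arange(36, dtype=float).reshape(-1, 1)
y = 100 + 10 * np.sin(2 * np.pi * X.ravel() / 12) \
    + np.random.default_rng(1).normal(0, 1, 36)

# Fit one GPR per candidate kernel and record the in-sample R^2.
scores = {str(k): GaussianProcessRegressor(kernel=k, normalize_y=True,
                                           random_state=0).fit(X, y).score(X, y)
          for k in kernels}
```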


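The clustering step recapped in the conclusion (K-means with k chosen via the elbow method, k = 3) can be sketched as below. The three feature columns and their values are hypothetical stand-ins for the product characteristics described above.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
# Hypothetical product characteristics: [sales volume, sales frequency, unit price]
products = np.vstack([
    rng.normal([200.0, 50.0, 10.0], [20.0, 5.0, 2.0], size=(30, 3)),  # "prioritized"-like
    rng.normal([100.0, 25.0, 12.0], [15.0, 4.0, 2.0], size=(30, 3)),  # "popular"-like
    rng.normal([10.0, 2.0, 80.0], [3.0, 1.0, 10.0], size=(15, 3)),    # "exclusive"-like
])

# Elbow method: compute the within-cluster sum of squares (inertia) for a range
# of k and pick the k where the curve bends; the study settled on k = 3.
inertias = {k: KMeans(n_clusters=k, n_init=10, random_state=0).fit(products).inertia_
            for k in range(1, 8)}

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(products)
print(inertias)
print(np.bincount(labels))  # cluster sizes
```

In practice the features would first be standardized, since K-means is sensitive to differing feature scales.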
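XGB (gradient-boosted trees) was the study's best model, but its exact feature set and implementation are not shown here. As a library-agnostic stand-in, the sketch below trains scikit-learn's GradientBoostingRegressor on assumed lag and calendar features; the holiday flag and the synthetic monthly series are illustrative only, echoing the external inputs discussed above.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(7)
t = np.arange(48)                                   # four years of monthly periods
sales = 100 + 10 * np.sin(2 * np.pi * t / 12) + 0.5 * t + rng.normal(0, 2, t.size)

months = t % 12
holidays = (months == 11).astype(int)               # assumed year-end holiday flag

def make_features(series, months, holidays, n_lags=3):
    """Build lagged-sales rows plus calendar/external inputs."""
    rows, targets = [], []
    for i in range(n_lags, len(series)):
        rows.append([*series[i - n_lags:i], months[i], holidays[i]])
        targets.append(series[i])
    return np.array(rows), np.array(targets)

X, y = make_features(sales, months, holidays)
split = len(y) - 12                                 # hold out the final year
model = GradientBoostingRegressor(random_state=0).fit(X[:split], y[:split])
pred = model.predict(X[split:])
rmse = float(np.sqrt(np.mean((y[split:] - pred) ** 2)))
mape = float(np.mean(np.abs((y[split:] - pred) / y[split:])) * 100)
print(f"hold-out RMSE {rmse:.2f} | MAPE {mape:.2f}")
```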