
4. The Link Between Inventory Management and Forecasting

As we discovered in Chapter 3, “Inventory Control,” there is a need to forecast demand over the protection period. For inventory management
purposes, demand is often forecasted on a daily or weekly basis and
combined with the lead time and/or the review interval to come up with the
forecast of demand over the protection period. Inventory process
performance is impacted in part by forecasting performance because the
optimal timing or quantity of when inventory should be ordered and how
much should be ordered depends upon the magnitude and uncertainty of
demand, both of which are contingent upon the forecasting method and
accuracy.
Forecasting is a vast field1 ranging from macroeconomic forecasting of GDP,
interest rates, and inflation, to forecasting of long-term demographic
trends, the weather, and the outcomes of political elections. However, in
this chapter we focus primarily on short-term forecasting, because this is
the crux of most inventory management performance challenges, especially
for replenished items. For nonreplenished items, such as fashion apparel,
forecasts must be made many months into the future for a selling season
that might last a few weeks. In that case, the crux of inventory management
is a longer term forecast. In this book we do not address long-term
forecasting.
Although in this chapter we set out to discuss various forecasting methods
used in managing inventory, we also set out to help you understand what is
actually going on with forecasting. We want to remove some of the mystery
behind forecasting. Forecasting seems scientific because of the
mathematics, probability theory, statistics, and so on, involved in it.
However, successful forecasting requires knowledge and understanding of
the methods, which ones perform well under various conditions, which
ones do not, and why they do not perform well. Technology, databases,
data, and software can improve forecasting and are necessary, but they are
not sufficient for accurate forecasts. Technology in the hands of a skilled
forecasting analyst who is knowledgeable of the domain for which the
forecast is being used can make progress toward improved forecasts.
Understanding the domain requires experience in the business, knowledge
of the industry, and understanding of the variables in the company and the
competitors that drive sales. One challenge in forecasting is finding these
rare individuals who understand forecasting technology, forecasting
methods, databases, and the data per se, and also have sufficient business
domain knowledge. In reality, it takes a team of individuals, each of whom has at least one of these skills, collaborating to develop the forecasting method and resolve forecasting problems.

UNCERTAINTY IN DEMAND AND FORECASTING
We begin with a highly stylized example to illuminate an important point.
Imagine that one and only one person, Julie, shops at a particular retail
store for a certain type of candy, a peppermint blueberry fizz hard candy,
called Pepfiz. Every day before Julie goes to this store, she rolls a six-sided
die with numbers 1 through 6 on each of the six sides. The number that
comes up on the die is the number of pieces of candy she buys when she
goes to the store, but of course the store replenishment manager does not
know what is going on. If the store replenishment manager knew how
demand was being generated, he would know that the best forecast is 3.5—
the expected value of the roll of a die. Let X be a random variable

representing the demand,2 then the expected value is E[X] = x1p1 + x2p2 + ... + x6p6, where xi is the ith outcome of the random variable and pi is the probability of that outcome.

A forecast of 3.5 units is the best forecast. In the long run, no other forecast
would be better. Unfortunately, the replenishment manager doesn’t know
how demand is being generated.
Figure 4-1 shows the purchases over the past month.3
Figure 4-1 Average POS

The solid lines in Figure 4-1 are the actual purchases (POS), and the dashed
lines represent the average. In this case, a simple average is the best
forecast. A simple average is just the average of all POS, starting from the
beginning, and ending with the most recent observation. Let xi be the ith
observation of a total of n observations; then the simple average is

(x1 + x2 + ... + xn) / n
By about ten days, in this case, the simple average is close to the mean of
the probability distribution. However, if you look at the first nine days, you
might think there is a positive trend even though there is not. In fact, a
number of forecasting methods would actually suggest a trend using the
first nine days of this data.
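A quick way to see this behavior for yourself is to simulate the die-roll demand. The following minimal Python sketch (the 200-day horizon, seed, and variable names are our own choices, not from the text) generates daily demand as the roll of one die and tracks the running simple average, which drifts toward 3.5 even when the first several days look trended.

import random

random.seed(42)                      # fix the seed so the run is repeatable
demands, running_avg = [], []
for day in range(1, 201):            # simulate 200 days of POS
    demands.append(random.randint(1, 6))             # one die roll = one day's demand
    running_avg.append(sum(demands) / len(demands))  # simple average to date

print(running_avg[8])     # the average after 9 days can look "trended"
print(running_avg[-1])    # after 200 days it is close to 3.5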
Figure 4-2 also comes from the same distribution. But in this case, the first
few days look like there is a negative trend. However, it doesn’t take long
for the simple average to get close to the mean of the distribution.
Figure 4-2 Appearance of a negative trend at first

Now let’s assume the replenishment manager does know how the demand
is being generated. For simplicity, let’s also assume that the lead time is
overnight and certain. That is, at the end of each day, any pieces of candy
sold during the day are replaced the next morning.4 Table 4-1 shows the
cumulative probability distribution of demand. The cumulative probability
up to a demand during lead time of three units is 1/6 + 1/6 + 1/6 = 0.5. So,
if the manager stocked up to three pieces of candy each day, the manager
would be in stock half the time; that is, the PPIS = 0.5.

Table 4-1 Cumulative Probability of Demand During Lead Time


Using Table 4-1, the replenishment manager could determine his PPIS by
selecting his OUL. This is really a (T,OUL) replenishment process where T
= 1 day and L = 0. That is, before the store opens, the manager reviews and replenishes from the backroom.5 Let’s suppose he chooses a PPIS = 0.83; then
he sets the OUL = 5. What is the safety stock? Recall that safety stock is the
expected number of units on hand when the replenishment arrives and is
available for use. Expected demand is 3.5, so if we have 5 units at the
beginning of the replenishment interval, the expected number of units on
hand at the beginning of the next day, when the replenishment arrives and
is available for use, is 5 – 3.5 = 1.5 units.
Although we have much more to say on this topic, the bottom line is that
the connection between forecasting and inventory management is that we
attempt to understand the distribution of demand to set replenishment
policy parameters and thus achieve various metric targets, such as PPIS
and ILFR.6
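As a small illustration of that connection, the following Python sketch builds the cumulative distribution of one-die demand from Table 4-1, finds the smallest order-up-to level (OUL) that meets a target PPIS, and computes the implied safety stock as OUL minus expected demand. The helper name choose_oul and the 0.83 target are our own illustration, not the book's notation.

from fractions import Fraction

pmf = {d: Fraction(1, 6) for d in range(1, 7)}    # one-die demand distribution
mean_demand = sum(d * p for d, p in pmf.items())  # expected demand = 3.5

def choose_oul(pmf, target_ppis):
    # smallest stock level whose cumulative probability reaches the target PPIS
    cum = Fraction(0)
    for d in sorted(pmf):
        cum += pmf[d]
        if cum >= target_ppis:
            return d
    return max(pmf)

oul = choose_oul(pmf, Fraction(5, 6))    # PPIS of about 0.83
safety_stock = oul - mean_demand         # expected units on hand when the replenishment arrives
print(oul, float(safety_stock))          # 5 and 1.5, as in the example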

TIME SERIES METHODS


As we have mentioned, under these circumstances, just using a simple
average is the best way to forecast. The problem is we don’t know when we
begin forecasting which method is best.
Now, imagine that after 30 days, Julie starts rolling two dice, adds the
results, and purchases the corresponding number of pieces of candy.
Figure 4-3 is a discrete event simulation of this demand process, where
from day 1 to day 30 demand is the roll of one die and from day 31 to day
60 demand is the sum of the rolls of two dice. As mentioned earlier, the
expected value for the first 30 days is 3.5; for the second 30 days, the
expected value of demand is 7 units.
Figure 4-3 Roll two dice and add

Moving Average
The solid line is POS, and the dotted line is the simple average. Notice that
by day 60, the simple average is at about 5 units, well below the expected
value of 7 units. The dashed line is a ten-day moving average.7 The ten-day moving average on day t is given by the following:

(xt + xt−1 + ... + xt−9) / 10
Notice in Figure 4-4 that the ten-day moving average catches up more


quickly than the simple average. However, let’s look at just the first 30 days.
Figure 4-4 Ten-day moving average, average, and POS

Figure 4-4 shows day 11 to day 30. The solid line is POS, the dotted line is the
simple average, and the dashed line is the ten-day moving average.
Comparing Figures 4-3 and 4-4 reveals the trade-off between the two forecasting methods—namely, although a ten-day moving average adapts more quickly to the change in the underlying demand, it overreacts to random changes, compared to the simple average. In Figure 4-4 we see the ten-day moving average is more erratic than the simple average. If the underlying demand process doesn’t change, the simple average does very well, but if there is a change in the underlying process, the simple average adapts more slowly than the moving average.
Figure 4-5 illustrates the point even further. Again, this is a discrete event
simulation of days 11 through 30. The solid line is POS, the dotted line is
the simple average, the dash-dotted line is the five-day moving average, and the dashed line is the ten-day moving average. In Figure 4-4 we saw that
the ten-day moving average was more erratic than the simple average;
in Figure 4-5 we see that the five-day moving average is more erratic than
the ten-day moving average.
Figure 4-5 Comparison with five-day moving average

Naïve Forecast
If we go to a one-day moving average, it really isn’t a moving average, it is
just taking what happened the previous day as the forecast. This is referred
to as a naïve forecast (see Figure 4-6).

Figure 4-6 Naïve forecast


In Figure 4-6, the solid line is the POS and the dotted line is the naïve
forecast. As you can see, the naïve forecast simply trails the POS by a day.
The performance of the naïve forecast is a good benchmark for comparing
to other forecasts, because if a forecasting method cannot do better than
the naïve forecast, it must not be very good.
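The three forecasting rules discussed so far are simple enough to write in a few lines. The Python sketch below (the POS list is made-up data for illustration) computes the naïve forecast, an n-day moving average, and the simple average from a demand history.

def naive(history):
    return history[-1]                       # yesterday's sales become today's forecast

def moving_average(history, n):
    window = history[-n:]                    # most recent n observations
    return sum(window) / len(window)

def simple_average(history):
    return sum(history) / len(history)       # every observation from the beginning

pos = [4, 1, 6, 3, 2, 5, 6, 1, 3, 4, 2, 6]   # made-up daily POS
print(naive(pos), moving_average(pos, 5), simple_average(pos))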

Simple Average
One problem with a simple average is that a serious outlier will have an
effect on the forecast for a long time. In Figure 4-6, there are no outliers.
Figure 4-7 is an example of where there is a spike in demand on the first
day. This could simply be a data error. The spike is included in the five-day moving average at first, but it drops out of that calculation by day 7. However, it remains in the calculation of the simple average
forever. Eventually its effect is negligible, but throughout the range
on Figure 4-7, it has an impact. You can see that the simple average is above
the five-day moving average and the sales from day 7 through day 16. This
illustrates a problem with the simple average—outliers have a lasting
impact that can bias the forecast for many periods into the future.

Figure 4-7 Outlier

Over Fitting Observations


Here are the methods we have considered so far: naïve, n-day moving average, and simple average. We look at others as well. Some
forecasting methods are mathematically complex. The uninformed can be
drawn to these methods because they are impressive. However, research
has shown that in many situations, simple models perform better.8 This is
due in part to the notion of over fitting the data. A time series of sales data
has two components, one of which can be explained, and the other is
random. The problem is that it is possible to fit random fluctuations and
then project them into the future, which actually makes the forecast
perform more poorly.
Figure 4-8 compares sales data generated by rolling a die with a simple
average and a sixth degree polynomial, which is written out above at the top
of the graph.9

Figure 4-8 Overfitting

The x’s in the sixth degree polynomial stand for the day number. So, for example, if
you were to put a 1 into all of the x’s and evaluate the equation, you would
get a forecast of two pieces of candy. The sixth degree polynomial is much
more impressive than the simple average, and it is a better fit of the sales
data; however, we already know that the best forecast is 3.5 pieces of candy.
The real question is: How will this forecast into the future? With the simple
average, the most recent forecast is the forecast of each period into the
future. In this example, the most recent forecast of the simple average
model is 3.2 pieces of candy; therefore, the forecast of the 30th day out is
3.2 pieces as is the 60th day out.10 Now, if you want to forecast 30 days out with the sixth degree polynomial, you put a 60 in the x variable, and you get a
number that is around negative six million. In fact, this sixth degree
polynomial quickly goes negative when it is outside the data that it was fit
to. So, this seems like a paradox: The sixth degree polynomial fits the data
much better than the simple average, but when the sixth degree polynomial
is projected into the future, it quickly performs horribly. It performs
horribly because it was fitting randomness. You can’t forecast randomness.
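You can reproduce the flavor of this experiment with a few lines of Python: fit a sixth degree polynomial to 30 days of die-roll demand and then evaluate it outside the range it was fit to. The coefficients will differ from those in the figure, and the exact numbers depend on the random seed, but the extrapolation is typically wildly unreasonable while the simple average stays near 3.5. This is a sketch under those assumptions, not the book's exact data.

import numpy as np

rng = np.random.default_rng(7)
days = np.arange(1, 31)
sales = rng.integers(1, 7, size=30)       # 30 days of die-roll demand

coeffs = np.polyfit(days, sales, deg=6)   # sixth degree polynomial fit
poly = np.poly1d(coeffs)

print(poly(15))        # inside the fitted range: tracks the data closely
print(poly(60))        # 30 days beyond the data: typically an absurd value
print(sales.mean())    # the simple average stays near 3.5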

Hold Out Data


This also helps to illuminate an important point: Forecasting models
should be judged based on how well they forecast hold out data, not on how
well they do on the data they were fit to. In this case, we see that the sixth
degree polynomial does well on the data it was fit to, but it shouldn’t be
judged based on that. Rather, the models should be judged based on how
well they perform on future sales data, also referred to as hold out data.
This is an important concept to remember. The other important concept to
remember is that we should compare the performance of forecasting
models to the naïve model as well.
There is always a trade-off with hold out data. For example, if I have one
year of data, I might want to use all of it to create a forecasting model, and
this is fine; however, for purposes of comparing the model to other models,
we need to have hold out data. Suppose I fit the data on 364 days and use
one day of sales data as hold out data. This seems insufficient. On the other
hand, suppose I use the first half of the year to fit the model and the second
half of the year as hold out data. If there is no trend and seasonality, that
might work well. If there is trend, this also might work well. If there is
seasonality, we might need all of the data, and more. So, unfortunately,
there is no unequivocal answer, but one good solution is to compare the
models on varying degrees of hold out data. If one model consistently does
the best with varying degrees of hold out data, then we can have more
confidence that the model actually performs better.
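One way to operationalize this is to score each candidate model on holdout windows of several sizes and see whether the ranking is stable. The Python sketch below compares a simple average against a naïve model using MAPE on the holdout; the toy sales list and the particular window sizes are our own illustration.

def mape(actuals, forecasts):
    return sum(abs(a - f) / a for a, f in zip(actuals, forecasts)) / len(actuals)

def fit_simple_average(train):
    level = sum(train) / len(train)
    return lambda horizon: [level] * horizon   # flat forecast at the average level

def fit_naive(train):
    last = train[-1]
    return lambda horizon: [last] * horizon    # flat forecast at the last observation

sales = [4, 1, 6, 3, 2, 5, 6, 1, 3, 4, 2, 6, 5, 3, 4, 2, 6, 1, 5, 4]   # toy data
for holdout in (3, 5, 8):                      # varying degrees of hold out data
    train, test = sales[:-holdout], sales[-holdout:]
    for name, fit in (("simple average", fit_simple_average), ("naive", fit_naive)):
        forecasts = fit(train)(len(test))
        print(holdout, name, round(mape(test, forecasts), 3))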
From an inventory management perspective, if you can forecast accurately,
you do not need much safety stock. So, in this case, over fitting with the
sixth degree polynomial can make it look like very little safety stock is
needed. Consequently, over fitting can result in lower PPIS than desired.
The simple average, on the other hand, in this example, will have a more
accurate assessment of the level of demand as well as the level of
randomness. In fact, in this example, the level of demand is 3.5 pieces of
candy per day. There is no trend or seasonality, only level demand. The
variance from 3.5 pieces is randomness. This is another reason why you
want to make sure you don’t over fit—over fitting produces both an
erroneous level of demand as well as an erroneous assessment of the level
of randomness. The amount of safety stock should be a function of the
amount of randomness as well as the desired fill rate.

Measuring Uncertainty
Later in this chapter, we discuss other methods of forecasting and how they
relate to inventory management, but first we need to talk about assessing
the level of randomness, which is a function of how well you can forecast
demand from the historical sales data. We need to do this because there is a
relationship between how much safety stock we need and how much
randomness there is.
Before we do that, it is important to distinguish between variability and
randomness. For example, suppose a certain type of cinnamon roll in a
grocery store sells 90 percent of its volume on Saturday. So, suppose you
sell 0 units on Sunday, 2 units per day from Monday through Friday, and
90 units on Saturday. Also, suppose this holds day in and day out
throughout the entire year. Well, there is a lot of variability throughout the
week, but there is perfect predictability, so there is no need for safety
stock.11 So, we need to know how much uncertainty there is in sales data. In
other words, we need to know how predictable the sales are based on the
sales data. To that end, we now discuss measures of uncertainty based on
forecasts of sales.
There are many measures of forecast error, but we are going to look at bias,
mean absolute deviation (MAD), mean absolute percent error (MAPE), and
standard deviation of the forecast error (σFE). Before talking about
measuring forecast error (FE), we need to define forecast error for one
forecast. The forecast error for period i is defined as FEi = ai – fi where ai is
the actual realized sales for period i and fi is the forecast for period i.

Bias is the average forecast error, Bias = (FE1 + FE2 + ... + FEn) / n. On average, if bias is positive, it means that we are under forecasting, whereas, if it is negative, we are over forecasting. If bias is positive, it is
possible that there is a positive trend that is not being accounted for in the
forecasting model, whereas if bias is negative, it is possible that there is
negative trend. If it oscillates in regular intervals, there may be seasonality
that is not being taken into account by the forecasting model. It is not a
good measure of the overall accuracy because positive forecast errors are
cancelled by negative forecast errors. MAD overcomes this because the
absolute value is taken of each forecast error. MAD is the average
magnitude of the error, regardless of the direction of the error:

MAD = (|FE1| + |FE2| + ... + |FEn|) / n
So, suppose bias = 2 and MAD = 4. That means that, on average, the
forecasting technique is under forecasting by two units, but overall the
forecast is off by four units. The problem with this is that we cannot
compare different SKUs. For example, suppose one SKU has a MAD of 10
but sells on average 100 units per day and another SKU has a MAD of 10
but sells a 1,000,000 units per day. Clearly the latter forecast is more
accurate than the former. So MAD is not good for comparing different
SKUs. MAPE overcomes this problem.

Most textbooks use ai in the denominator of each term, but if there are days of zero demand, this causes MAPE to be undefined. The nice thing about MAPE is
that you can compare different SKUs. In addition, MAPE is intuitive.
Finally, we consider the standard deviation of the forecast error (σFE), which
is useful for setting safety stock. Although this is possibly the least intuitive
of the forecast error metrics, it is most useful for measuring the amount of
uncertainty in a forecast such that the measure can be used in setting safety
stock and estimating the expected units out of stock per replenishment
cycle.
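All four measures can be computed directly from the forecast errors FEi = ai − fi. In the Python sketch below, MAPE is computed by dividing MAD by the average of the actuals; treat that denominator as one reasonable way to sidestep zero-demand days rather than as the book's exact formula.

import statistics

def error_metrics(actuals, forecasts):
    errors = [a - f for a, f in zip(actuals, forecasts)]   # FEi = ai - fi
    bias = sum(errors) / len(errors)                       # positive bias means under forecasting
    mad = sum(abs(e) for e in errors) / len(errors)        # average magnitude of the error
    mape = mad / (sum(actuals) / len(actuals))             # scale-free, comparable across SKUs
    sigma_fe = statistics.stdev(errors)                    # standard deviation of forecast error
    return bias, mad, mape, sigma_fe

actuals = [3, 5, 2, 6, 4, 1, 5, 3]
forecasts = [3.5] * len(actuals)
print(error_metrics(actuals, forecasts))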
Figure 4-9 shows four discrete event simulations of demand based on the
roll of a die for 60 days, using a simple average forecast and measuring
error with MAPE and bias. Notice how quickly MAPE moves closely to
about 10 percent in every discrete event simulation, whereas bias is much
more unstable. In the upper-left corner, bias is negative for the entire 60-
day horizon, whereas in the lower-left corner, bias is positive for the entire
60-day horizon.
Figure 4-9 Bias, MAPE, and simple average

Figure 4-10 shows four discrete event simulations of demand based on the


roll of a die for 60 days, using a five-day moving average forecast and
measuring error with MAPE and bias. This is similar to Figure 4-9 but with
a different forecasting technique. MAPE is a little higher for the five-day
moving average, but it is less biased, and bias converges more quickly to a
steady state level.
Figure 4-10 Bias, MAPE, and moving average

The point of this is that there can be trade-offs among these measures of
forecast error. Since they all contain unique information, or
characterizations of the information, it is important to understand each of
them and to avoid using them in ways that might be misleading, such as
comparing two SKUs with MAD when the SKUs have different levels of
demand.

Exponential Smoothing
We now explore another class of forecasting models that are integral with
inventory management, namely, exponential smoothing12 models. We begin
with the simplest in this genre, first order exponential smoothing.
First order exponential smoothing weights the last forecast against the
forecast error of the last period. So, for example, if the last forecast was 15
and the error that period was 5, it would adjust 15 up by some fraction of
the error 5. That is, first order exponential smoothing takes the forecast and
adjusts it by a fraction of how much it was off from actual sales. This
fraction is referred to as the smoothing constant, usually designated by the Greek lowercase letter alpha, α, which varies between zero and one, α ∈ (0, 1). So, if α is small, the adjustment is small, but if α is large, the adjustment
is large. If ft is the forecast for period t, and at is the actual sales for period t,
then the exponentially smoothed forecast for period t+1 is
ft + 1 = ft + α(at – ft)    α ∈ (0,1)
If there is a lot of randomness in the data, alpha should be low. If the level
of demand is changing, alpha should be higher, at least for a time. For
example, suppose you have a product with fairly level demand but a
competitive new item is being introduced. Then alpha should be higher for
a time, until the competitive effects cause the demand for the item to be at a
new level. Alpha should usually be between 0.1 and 0.3, but it actually
depends on the data and situation. Let’s consider the extremes. Suppose
alpha is zero; then you always use the first forecast.
ft + 1 = ft + 0(at − ft)
ft + 1 = ft
ft + n = ft
On the other hand, suppose alpha is equal to one; then you have the naïve
forecast.
ft + 1 = ft + 1(at − ft)
ft + 1 = ft + at − ft
ft + 1 = at
One challenge associated with using first order exponential smoothing is
that you have to start with a previous forecast. So, for the first forecast you
need a previous forecast. If you have data, you could use an average. If you
don’t have data, you could make an estimate with your judgment or you
could get a panel of experts and use an average of their estimates. Whatever
you start with will have a significant impact for quite a while, especially if
you are using a low alpha. For example, suppose you are forecasting POS
generated from the roll of a die, but as before, you do not know how the
demand is being generated. Suppose you start with an estimate of one piece
and are using an alpha of 0.1.
Figure 4-11 is a graph of the forecast error over 60 days. You can see that
for the first third, the bias is positive; that is, we are under forecasting. That
is due to the fact that we started with an estimate that was too low to begin
with and since alpha is low, it took about 20 days, in this example, to
overcome this low initial estimate. Now, let’s change the alpha to 0.5.
Figure 4-11 Forecast error with exponential smoothing

In Figure 4-12, there is no clear bias in the first 20 days. This is a result of
having a higher alpha; it allows the forecast to adjust more quickly.

Figure 4-12 Forecast error with a higher alpha

Figure 4-13 shows a discrete event simulation run if we start with a forecast


of 6 and have alpha set to 0.1. You can see that, again we have a clear bias in
the first third of the forecasts since the forecast errors are on average below
zero.
Figure 4-13 Low alpha and impact of initial forecast

So, back to the problem of not having any data for the first forecast. One
solution is to start with an estimate but to keep alpha on the higher end of
the range and slowly adjust it back to a lower level.
So far we have only looked at a one period ahead forecast. What is the
forecast if you are using first order exponential smoothing and you need a
forecast ten periods into the future? It turns out that the most recent
forecast is the forecast for all future periods. So, if the first order
exponentially smoothed forecast for the next period is 34, then at this point
in time, the first order exponentially smoothed forecast for period 20 is 34
units. Of course, as we move forward in time, this will change. By the time
we get to period 19, the forecast for period 20 might be different from 34
units, but that is what it would be in the current period given that the
forecast for the next period is 34 units.
Let’s look at an example. Suppose the forecast for the current period is 20
but actual sales was 30 and that alpha is 0.1. Then the forecast for the next
period is
fperiod2 = fperiod1 + 0.1 * (aperiod1 − fperiod1)
fperiod2 = 20 + 0.1 * (30 − 20)
fperiod2 = 20 + 1 = 21
That is also the forecast for the period after that:
fperiod3 = 21
However, suppose now that we come to the end of the next period and
actual sales turned out to be 11.
fperiod3 = fperiod2 + 0.1 * (aperiod2 – fperiod2 )
fperiod3 = 21 + 0.1 * (11 − 21) = 21 − 1 = 20
So, the Period 3 forecast at the end of Period 1 is 21, but the Period 3
forecast at the end of Period 2 is 20. In addition to making a decision about
the initial forecast and the level of alpha, another question that has to be
addressed in first order exponential smoothing is the following: How often
should the forecast be updated? In the example we just examined, it was
being updated every period, but that may not be practical or optimal. It
might not be practical because you cannot change your order every period,
for example. It might not be optimal because it creates too much erratic
behavior in the system. Again, this is an issue that can be analyzed with
discrete event simulation.
The alpha that yields a bias closest to zero may not be the same as the one
that minimizes the standard deviation of forecast error over a given period
of time. A high bias, meaning you are under forecasting, may result in an
expected demand during the protection period being too low, resulting in
more stockouts. An alpha that is too high will cause the standard deviation
of forecast error to be high, resulting in more safety stock. The point here is
that your initial estimate and your selection of alpha both have a lasting
impact on the performance of your inventory management. Discrete event
simulation is a great tool for assessing these decisions, their impact on
forecasting performance, and the resulting impact on inventory
management performance.
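A first order exponential smoothing run takes only a few lines of code. The Python sketch below updates the forecast each period and returns the whole forecast history; the last value is also the forecast for every future period. The numbers reproduce the worked example above.

def first_order_smoothing(actuals, alpha, initial_forecast):
    forecasts = [initial_forecast]
    for a in actuals:
        f = forecasts[-1]
        forecasts.append(f + alpha * (a - f))    # adjust by a fraction of the error
    return forecasts                             # forecasts[i] is made before seeing actuals[i]

history = [30, 11]   # the worked example: sales of 30 in period 1, then 11 in period 2
print(first_order_smoothing(history, alpha=0.1, initial_forecast=20))
# [20, 21.0, 20.0] -- the final 20.0 is the forecast for period 3 and all later periods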
First order exponential smoothing assumes there is no trend or seasonality
in the demand. If there is upward trend and first order exponential
smoothing is used, there will be a negative bias. Increasing alpha will
reduce the bias because the forecast will adjust up more quickly, but there
will still be a bias. Similarly, if there is a downward trend, there will always
be a positive bias, since you will on average be forecasting too high.

Trend Adjusted Exponential Smoothing


We now consider trend adjusted exponential smoothing13 (second order
exponential smoothing) to address the possibility of trend in the demand.
If there is trend in the demand, it is important to be able to estimate it for
inventory management purposes, whether it be an upward or a downward
trend.
In essence, second order exponential smoothing estimates an equation with
an intercept and a slope. The intercept is referred to as the level of demand,
and the slope is referred to as the trend component of demand. Once these
are estimated, to forecast one period into the future, you simply take the
level and add to it the trend. To forecast two periods into the future you
take the level and add two times the trend component. To forecast n
periods into the future, you take the level and add n times the trend
component. Based on this, it is easy to see a potential problem with second
order exponential smoothing, namely, it assumes that the trend is linear
and that there is no end to it into the future. There is a way to address this
that we discuss later in this chapter. The forecasting formula for second
order exponential smoothing is
ft + n = Lt + nTt
Where Lt is the intercept, or level component of the forecast, Tt is the slope
or trend component, n is the number of periods into the future of the
forecast. So, if the level is 20 and the trend is 5, the forecast for the next
period is
ft + 1 = Lt + 1 * Tt
ft + 1 = 20 + 1 * 5 = 25
However, if it is 30 days into the future, it is
ft + 30 = Lt + 30 * Tt
ft + 1 = 20 + 30 * 5 = 170
which might very well be too optimistic. Again, we address this problem
later in this chapter.
The formula we provided previously for second order exponential
smoothing is insufficient because it does not address how you estimate Lt,
the level component of the forecast, and Tt the trend component. To
estimate the level component, there is an associated smoothing constant,
which we again refer to as alpha, α ∈ (0,1). To estimate the trend component, there is again another smoothing constant, which we refer to as beta, β ∈ (0,1). As with first order exponential smoothing, you
must have an initial estimate of not only the level, but also the trend. And,
as with first order exponential smoothing, higher levels of alpha result in
more dramatic adjustment of the estimate of the level from one period to
the other. Similarly, higher levels of beta result in more rapid adjustment of
the trend component.
To estimate the level in second order exponential smoothing, we have the
following:
Lt = (at−1 + Tt−1) + α[at − (at−1 + Tt−1)]
Look back at the formula for first order exponential smoothing and notice
the similarity. In this case, (at−1 + Tt−1) is the previous “forecast” of the current level component, and [at − (at−1 + Tt−1)] can be thought of as the error in the forecast of the level. That is, at is the actual level and (at−1 + Tt−1) was the forecast of the level and the difference is the error. So, in essence,

we are taking the previous forecast of the level and adjusting by a fraction
of the error, similar to what we did with the first order exponential
smoothing formula.
The trend component can be thought of as a forecast in the change of the
level of demand. So, the updating of the forecast in the change in the level
in demand is an adjustment to the previous trend estimate based on a
fraction of the error.
Tt = Tt − 1 + β([Lt − Lt−1] − Tt−1)
Again, it is instructive to look back at the formula for first order exponential
smoothing as well as the formula for the estimate of the level in the second
order exponential smoothing. You see the pattern the previous estimate
plus a fraction of the error. We can think of [Lt – Lt – 1] as the actual change
in the level and Tt – 1 as the previous forecast of the change in the level. So,
the difference is the error in the previous estimate.
Let’s look at an example. Suppose the previous trend estimate was 1, the actual sales last period were 110 and this period were 120, and the previous level estimate was 111; and that alpha and beta are both 0.1.
Lt = (at−1 + Tt−1) + α[at − (at−1 + Tt−1)]
Lt = (110 + 1) + 0.1[120 − (110 + 1)]
= (111) + 0.1[9] ≈ 112
Tt = Tt – 1 + β([Lt − Lt−1] − Tt−1)
Tt = 1+0.1([112 − 111] − 1) = 1
Now, let’s forecast 10 periods into the future.
ft + n = Lt + nTt
ft + 10 = 112 + 10 * 1 = 122
Suppose we wanted to forecast 365 periods out, then
ft + 365 = 112 + 365 * 1 = 477
This doesn’t seem reasonable. It seems that at some point the trend amount should increase at a decreasing rate.
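The level and trend updates can be scripted directly from the two equations above. The Python sketch below follows the chapter's formulation (the level update is anchored on the previous actual plus the previous trend); with the example's numbers it reproduces essentially the same values, up to the rounding used in the text.

def holt_update(a_now, a_prev, level_prev, trend_prev, alpha, beta):
    baseline = a_prev + trend_prev                     # previous "forecast" of the level
    level = baseline + alpha * (a_now - baseline)      # adjust by a fraction of the error
    trend = trend_prev + beta * ((level - level_prev) - trend_prev)
    return level, trend

def holt_forecast(level, trend, n):
    return level + n * trend                           # linear trend projected n periods out

level, trend = holt_update(a_now=120, a_prev=110, level_prev=111,
                           trend_prev=1, alpha=0.1, beta=0.1)
print(round(level, 1), round(trend, 2))        # about 111.9 and 0.99 (the text rounds to 112 and 1)
print(round(holt_forecast(level, trend, 10)))  # about 122, matching the example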
Damped Trend
The way of dealing with this is through damped trend14 adjusted exponential smoothing. This uses a damping factor, ϕ ∈ (0,1). Instead of multiplying the trend by n, we multiply it by the sum ϕ + ϕ2 + ... + ϕn:

ft + n = Lt + (ϕ + ϕ2 + ... + ϕn)Tt

Let’s use the numbers from the previous example but apply damped trend adjusted exponential smoothing forecasting 10 periods out and let ϕ = 0.9. Notice that ϕ + ϕ2 + ... + ϕ10 ≈ 5.9, so instead of multiplying by 10, we multiply by 5.9:

ft + 10 = 112 + 5.9 * 1 ≈ 118

Now let’s forecast out 365 periods. For ϕ = 0.9, the sum ϕ + ϕ2 + ... + ϕ365 is about 9, so

ft + 365 = 112 + 9 * 1 ≈ 121
Recall before we were multiplying by 365. Now, the damping factor selected
has a large impact on how quickly this sum converges. If we would have set
ϕ= 0.98, then we would have multiplied by 49 instead of 9, but that is still
significantly less than 365.

The vertical axis of Figure 4-14 is the sum ϕ + ϕ2 + ... + ϕn, the horizontal axis is n, and the lines, starting with the lowest line, represent ϕ = 0.9, 0.91, 0.92, 0.93, 0.94, and 0.98, respectively. You can see that for ϕ = 0.9, 0.91, 0.92, 0.93, and 0.94 they begin to converge fairly quickly, around 30 to 45 periods out. For ϕ = 0.98 the sum is still only at about 37 within the range shown, whereas at n = 365 it is at 49. This method is important for preventing unreasonably high forecasts far out into the future.
Figure 4-14 Damping factor
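The damped sum is a finite geometric series, so it can be computed in closed form. A short Python sketch of the damped trend forecast, using the level and trend from the earlier example:

def damped_sum(phi, n):
    # phi + phi^2 + ... + phi^n, the multiplier that replaces n in the forecast
    return phi * (1 - phi ** n) / (1 - phi)

def damped_trend_forecast(level, trend, phi, n):
    return level + damped_sum(phi, n) * trend

print(round(damped_sum(0.9, 10), 1))                    # about 5.9
print(round(damped_sum(0.9, 365)))                      # about 9
print(round(damped_sum(0.98, 365)))                     # about 49
print(round(damped_trend_forecast(112, 1, 0.9, 365)))   # roughly 121 instead of 477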

Seasonally Adjusted Forecasts


We now look at seasonality.15 Seasonality exists when demand increases
significantly at particular intervals of time each year. Seasonality can be a
complex phenomenon because it is sometimes caused by dates, such as
Christmas; other times it is the result of causal variables, such as increased
temperatures; and sometimes both dates and causal variables. Let’s look at
some specific examples. The demand for candy increases at Halloween
because candy is part of the celebration of the holiday. The demand for
bottled water increases when the temperature increases because people
consume more water when the temperature goes up. However, people also
consume more bottled water on Independence Day because people buy it to
be available for Fourth of July parties, but even more is consumed if it is
particularly hot outside. If you forecast the seasonality of bottled water
sales strictly with dates (e.g., summer), and then it turns out to be an
unusually cool summer, you will probably over forecast the demand for
bottled water. So, for bottled water, you probably want time and
temperature in the equation.
Forecasting seasonality with time requires a time series forecasting method
that directly addresses seasonality. Forecasting seasonality with a causal
variable such as temperature requires a method such as regression, which
we discuss later in this chapter. Using both a time series forecasting
method combined with a causal method such as regression can be done by
using a linear combination of the two, which we also discuss later in this
chapter.
One simple way of forecasting quarterly demand with seasonality is with
seasonal factors applied to annual forecasts. For example, you could
forecast annual demand and not have to worry about seasonality because
seasonality occurs within the year. So, with this approach you look back
over several years and see what percentage of the demand occurs in each
quarter. Suppose you did this and found that annual demand is divided
between the first, second, third, and fourth quarters, by 10 percent, 40
percent, 30 percent, and 20 percent, respectively. Then you would forecast
annual demand with second order exponential smoothing if there were
trend, and then apply the seasonal factors. Suppose that you forecast
annual demand to be $100 million, then you would allocate $10 million to
the first quarter (10% × $100 million), $40 million to the second quarter
(40% × $100 million), $30 million to the third quarter (30% × $100
million), and $20 million to the fourth quarter (20% × $100 million). Of
course, this method only works if the seasonal factors are stable from year
to year.
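This two-step approach, forecast the year and then split it with seasonal factors, is easy to express in code. A short Python sketch with the percentages from the example (amounts in $ millions):

seasonal_factors = {"Q1": 0.10, "Q2": 0.40, "Q3": 0.30, "Q4": 0.20}   # must sum to 1.0

def allocate_annual_forecast(annual_forecast, factors):
    return {q: annual_forecast * share for q, share in factors.items()}

print(allocate_annual_forecast(100, seasonal_factors))
# {'Q1': 10.0, 'Q2': 40.0, 'Q3': 30.0, 'Q4': 20.0}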
You can also have a type of seasonality that is within the week. For
example, many grocery items sell more on the weekend than they do during
the regular week. This is due in part to the fact that more people shop on
the weekend. However, some grocery items sell more evenly throughout the
week; it depends on the product, the location, and the retailer. Consider a
product such as bags of ice. People tend to buy these on the weekend for parties,
special events, and so on. Perhaps as much as 80 percent of the product is
sold in some stores on the weekends in the summer. Then you could use
such a seasonal factors approach to forecasting demand throughout the
week. Now, it could be that you would sell more ice in the summer
than in the winter. In that case you might want to forecast at a weekly level,
including annual seasonality, and then allocate the weekly demand by day
using a type of within-week seasonal factor. The accuracy of this would
depend on (1) the stability of the within-week seasonal factor, and (2) the
accuracy of the weekly forecast.
We now look at an exponential smoothing approach to seasonality
called additive seasonality exponential smoothing without trend. With this
method, rather than having a static estimate of the seasonal factors and the
static estimate of the level of demand, these are updated. This method
updates the level estimate and the seasonal estimates similarly to the way
the estimates of the levels were estimated in the first order and second
order exponential smoothing methods, and also similarly to the way the
trend estimates were updated in the second order exponential smoothing
method. Let p = number of periods between seasons, Lt = level estimate
for period t, St = incremental season demand for period t; then the level
estimate is given by
Lt = Lt−1 + α[(at − St−p) − Lt−1]
So, we take the previous estimate of the level of demand and adjust it by a
fraction of the error. We can think of at – St – p as the actual level based on
the actual demand and the incremental seasonal demand from the previous
season, and Lt – 1 as the previous estimate of the level.
Now we look at the update of the incremental seasonal demand.
St = St−p + β[(at − Lt) − St−p]
So, the previous estimate of the incremental seasonal demand p periods ago
is updated by a fraction of the error in the estimate. We can think
of at – Lt as the actual incremental seasonal demand and St–p as the estimate,
so (at − Lt) − St−p is the error.
Finally, to estimate the demand for the next periods, we use the following
formulation:
Ft + n = Lt + St + n−p
Suppose Lt = 300 and you want to forecast out to the next period, so n=1.
Suppose the number of periods between seasons is 12 months (p=12),
and St+1–12 = 100, then
Ft + 1 = 300 + 100 = 400
Now, let’s look at the previous example again: Forecast annual demand to
be $100 million, then you would allocate $10 million to the first quarter
(10% × $100 million), $40 million to the second quarter (40% × $100
million), $30 million to the third quarter (30% × $100 million), and $20
million to the fourth quarter (20% × $100 million). The current level of
demand is $100 million per year / 4 quarters = $25 million / quarter.
So, Lt = 25. Then we have St+1–4 = 10 – 25 = – 15, St+2–4 = 40 – 25 = 15, St+3–4=
30 – 25 = 5, and St+4–4 = 20 – 25 = – 5.
Now, let’s suppose the actual demand for the first quarter turned out to be
$15 million and α = 0.1 and β = 0.1.
LQ1 = 25 + 0.1[(15 + 15) − 25] = 25 + 0.5 = 25.5
And
SQ1 = −15 + 0.1[(15 − 25.5) + 15] = −15 + 0.45 ≈ −14.5
So we see that since the actual demand turned out to be $5 million higher
than the previous year, it reduced the magnitude of the negative seasonal
factor and increased the estimate of the level by about half a million each in
the positive direction. So if we want to forecast for Q2
FQ2 = 25.5 + 15 = 40.5
This is half a million higher than Q2 of the previous year. Keep in mind,
this increase is the result of the fact that Q1 of this year was $5 million
higher than Q1 of last year.
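These updates are mechanical enough to script. The Python sketch below reproduces the quarterly example: a previous level of 25, a Q1 seasonal increment of −15, actual Q1 demand of 15, and alpha and beta both 0.1.

def additive_seasonal_update(actual, level_prev, season_prev, alpha, beta):
    level = level_prev + alpha * ((actual - season_prev) - level_prev)    # level update
    season = season_prev + beta * ((actual - level) - season_prev)        # seasonal update
    return level, season

def additive_seasonal_forecast(level, season_for_period):
    return level + season_for_period

level, s_q1 = additive_seasonal_update(actual=15, level_prev=25,
                                       season_prev=-15, alpha=0.1, beta=0.1)
print(round(level, 2), round(s_q1, 2))        # 25.5 and -14.55 (the text rounds to -14.5)
print(additive_seasonal_forecast(level, 15))  # Q2 forecast of 40.5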
We now look at third order exponential smoothing with damped trend, a variation on Winters’ model. We define the seasonal factors St as multiplicative ratios that sum to p over a season, S1 + S2 + ... + Sp = p. For example, suppose p = 4; then one possibility is S1 = .5, S2 = 1.5, S3 = .3, and S4 = 1.7. Notice that these add to 4.

Here is the model to forecast with a linear trend (traditional application of Winters’ model):

ft + n = (Lt + nTt)St + n−p

Here is the model to forecast with a damped trend:

ft + n = (Lt + (ϕ + ϕ2 + ... + ϕn)Tt)St + n−p
Figure 4-15 is an example of the difference between the multiplicative


seasonal model with trend and the multiplicative seasonal model with
damped trend. The horizontal axis is time in quarters, going out three
years, and the vertical axis is the forecast in units of sales. The solid line is
the multiplicative seasonal model with trend, and the dotted line is the
multiplicative seasonal model with damped trend. As you can see, the
dampening of the trend also damps the degree of swings in seasonality.
Figure 4-15 Difference between the multiplicative seasonal model with trend
and the multiplicative seasonal model with damped trend

In Figure 4-16 we have a downward trend, and we see again that the
downward trend of the multiplicative seasonal trend model (solid line) is
more dramatic than the multiplicative seasonal damped trend model
(dotted line). In fact, as we forecast out further into the future we see less
dramatic swings in the nondamped model, the opposite of what we saw
with the upward trend.

Figure 4-16 Downward trend


Notice that the three equations for estimating the level, trend, and seasonal
components of the time series are similar to the other smoothing models we
have looked at in the sense that they start with the previous estimate and
adjust the previous estimate by some fraction (the smoothing constant) of
the error in the previous estimate. For example, in the level equation

Lt = (Lt−1 + Tt−1) + α[(at / St−p) − (Lt−1 + Tt−1)]

(Lt−1 + Tt−1) is the estimate of the level in period t, but at / St−p is the actual level. You see at is the actual sales for period t, so dividing by St−p takes the seasonal component out of the sales, leaving the actual new level. For example, if at = 10 and St−p = 0.5, then this means sales for period t was 10 units and that, as a result of seasonality, this period is half of the level of demand, so dividing 10 by 0.5 inflates it to the actual level of 20 for this period. Consequently, (at / St−p) − (Lt−1 + Tt−1) is the error, the difference between the actual level of this period and the estimate, made in period t − 1, of what the level would be this period.
Let’s now look at the seasonal estimate:

St = St−p + γ[(at / Lt) − St−p]

Here again, we are taking the previous estimate of the seasonal factor St−p and adjusting it by a fraction, γ, of the error (at / Lt) − St−p. The actual sales for period t is at, and is divided by the level to get the actual seasonal component for this period. For example, suppose that the actual sales for this period were 75 units but the level for this period was 100 units; then the actual seasonal component for this period is 75 / 100 = 0.75. Let’s suppose that the smoothing constant for the seasonal component is γ = 0.1 and that St−p = 0.80; then St = 0.80 + 0.1(0.75 − 0.80) = 0.795.
So far we have looked at four types of exponential smoothing models: (1)
first order exponential smoothing, which only forecasts level, (2)
exponential smoothing with trend, (3) additive seasonality with no trend,
and (4) multiplicative seasonality with trend. Let’s compare these, first
from the perspective of the estimate of the level component of the time
series. Here they are:

Notice that for the first order exponential smoothing and the additive
seasonality with no trend that the last estimate is only the level, but with
the trend adjusted exponential smoothing and the multiplicative seasonal
with trend there is the last level plus the trend. The reason for that is that
the previous level plus the previous trend is the estimate of what the level
would be the next period, and that is precisely what must be adjusted based
on the error. Now, the additive seasonality model and the multiplicative
seasonal model can include trend or not. In our exposition we did not
include trend with the additive seasonal model but we did with the
multiplicative model, but that was arbitrary.
Now, the trend equation for both the trend adjusted exponential smoothing
and the multiplicative seasonal model with trend are the
same: Tt = Tt−1 + β([Lt − Lt−1] − Tt−1). In addition, if we would have had a trend
component in the additive seasonal, it would also be this same equation.
The seasonal components are similar except in the case of additive
seasonality; the seasonal component is in terms of the units demand,
whereas in the case of multiplicative seasonality, the seasonal component is
in terms of a ratio. Other than that they are the same:

Additive:          St = St−p + β[(at − Lt) − St−p]
Multiplicative:    St = St−p + γ[(at / Lt) − St−p]
Recall that the forecast n periods ahead for the seasonal forecast was
Ft + n = Lt + St + n−p
If it would have had linear trend, then it would have been
Ft + n = Lt + nTt + St + n−p
Whereas, for the multiplicative seasonal with trend is
ft + n = (Lt + nTt) St + n−p
One thing to notice is that with the additive model with trend, regardless of
the size of the trend, the seasonal quantities stay the same. That doesn’t
seem reasonable. Imagine you are at 100 units per month and the seasonal
factor is 10. Now imagine that at 24 months out it is forecasting 1,000 units
per month. It will still have that seasonal factor of 10. On the other hand,
with the multiplicative model we find that the seasonal factors are
magnified as n grows. That might be reasonable to a point, but it might over
magnify eventually. One solution to this is to use the damped trend. We have already shown how the damped trend is incorporated into the multiplicative model, but here is how it is incorporated into the additive model:

Ft + n = Lt + (ϕ + ϕ2 + ... + ϕn)Tt + St + n−p
Seasonal indices are usually based on very little data. Suppose p=12 months
for St + n – p. It is unusual to have two years of representative data. There are
situations where there are many years of representative data, but that is
often the exception. If you only have two years of data, they are based on
few observations of those particular seasons. It is possible that the
uncertainty introduced by using seasonal factors that are full of stochastic
error could be worse than the benefit it brings. So you could instead use
trend adjusted forecast with a high beta so that it would adjust quickly to
the seasonality, but using a high beta will make it incorporate more
stochastic noise into the estimate of the seasonal factor. Or you could use a
level method such as first order exponential smoothing. If you have
seasonality and use a level model, you might have (1) too much16 safety
stock throughout the year, (2) too little cycle stock during the peak season,
and (3) too much cycle stock during the trough.
The solid line in Figure 4-17 is a deterministic additive seasonal model with
a level of 100. The dotted line is the same deterministic model but with a
stochastic term added to it that is simulated from a normal distribution
with a mean of zero and a standard deviation of 20. The only difference
between the graph on the top and the graph on the bottom is that they are
two different discrete event simulations from the same distribution; the
deterministic model is the same in both graphs. These graphs demonstrate
the challenge associated with finding the seasonal components from just
two years of data—that is, with even a little stochastic disturbance, the
estimates will be off.
Figure 4-17 Seasonality with standard deviation equal to 20

Figure 4-18 is identical to Figure 4-17, except that the stochastic term has a
standard deviation of 50 instead of 20.
Figure 4-18 Seasonality with standard deviation equal to 50

We can see from this that the estimates of the seasonal components would
be completely unreasonable. The key point of this is that great care must be
taken in applying seasonal models. In fact, in Figure 4-18, a first order
exponential smoothing model would outperform a seasonal model when
forecasting into the next year. The seasonal components of the seasonal
models would add more noise into the forecasting model that would detract
from the forecast accuracy. Figure 4-18 might be the level of uncertainty
you would see for an item in one store where you sell about 100 units on
average per month. However, if you were to average sales from 200 stores
with the same seasonality, you might be able to estimate the seasonal
component of the time series more accurately.
This idea is even more exaggerated when trying to estimate both trend and
seasonal components as illustrated in Figure 4-19.

Figure 4-19 Seasonality and trend with standard deviation equal to 50

In Figure 4-19, the solid line is the deterministic model and the dotted line is the same model but with a stochastic term added to it, which is simulated from a normal distribution with a mean of zero and a standard deviation of 50. As you can see again,
the seasonality would be difficult to detect in both graphs. The dashed lines
are the regression lines for the corresponding lines. (The thin dashed line
corresponds to the solid line, and the bold dashed line corresponds to the
dotted line.) The regression model uses the month number as the
independent variable and the level of line as the dependent variable. Hence,
the thin dashed line is the actual trend component and the bold dashed line
is the estimated trend component. In general, it is easier to estimate the
trend component than seasonal components of time series. Consequently,
in these two examples, using trend adjusted exponential smoothing with a
damped trend, being off on the initial estimate of the trend would not have
as deleterious an effect as if the trend was not damped.
Figure 4-20 is identical to Figure 4-19, except that the stochastic term has a standard deviation of 20 instead of 50. In Figure 4-20, it is still difficult to estimate the seasonal factors, but the trend estimates are very close to the actual trend components.
Figure 4-20 Seasonality and trend with standard deviation equal to 20

If the trend is sufficiently pronounced, even if the demand has a lot of


noise, the trend component can often be estimated relatively accurately.
However, you must make sure you have sufficient data to estimate a
trend. Figure 4-21 is an example of this.
Figure 4-21 An illusion of trend

Figure 4-21 is simulated demand from demand that only has a level
component. The level component is 10, and the stochastic term is from a
normal distribution with a mean of zero and a standard deviation of 5. In
this example, there appears to be a trend but obviously, based on the
underlying model, there is no trend.
Figure 4-22 is simulated data from the same demand distribution
from Figure 4-21 except that it is simulated for 40 periods. Of course this is
just one discrete event simulation, but it illustrates the point that even at 40
periods, you could detect a trend that does not exist.
Figure 4-22 Another illusion of trend

It sounds as if we have shown you trend and seasonal models and now we
are saying you probably shouldn’t use them, but that is not the case. We are
simply explaining an important caveat in using trend and seasonal models.
In Figure 4-22, if we were to use damped trend, the effect of the small trend
we seemed to detect (that didn’t really exist) would be minimized. However,
it is a good idea to go further than just using damped trend. In forecasting it
is good to use logic and other empirical data prior to developing forecasting
models. Is there a reason to believe that trend and seasonality exist? Why?
What is the logic? Answering these questions may require getting others
involved, possibly from sales and marketing. In addition, there may be an
upward trend if you are gaining market share. Such information can be
obtained from companies that sell syndicated sales data such as AC Nielsen.

CAUSAL MODELS
There are other types of time series forecasting models, but we now go on to
talk some about causal models, and in particular, regression models.
Building effective regression models requires more skill than building time
series smoothing models. We use regression later in the book for other
purposes, so it is worthwhile to learn it here for multiple reasons.
Regression is a complex topic, and we just skim the surface and discuss it
from an applied perspective.

Regression
To build a regression model you must define your dependent and
independent variables. Since we are forecasting sales, the dependent
variable is sales. Regression finds a line that fits the data by minimizing the
sum of the squared errors. Errors are forecast errors. As we discussed
earlier, the forecast error for period i is defined as FEi = ai – fi, where ai is
the actual realized sales for period i and fi is the forecast for period i. In
regression, they are not referred to as forecast errors, but are referred to as
residuals, because most of the time regression is not used for forecasting
but for testing hypotheses. In particular, regression minimizes the sum of
the squared residuals for n observations:

SSE = (a1 − f1)2 + (a2 − f2)2 + ... + (an − fn)2

Regression selects the regression coefficients to minimize this sum. In general, the forecasting regression equation is

Ft = b0 + b1x1 + b2x2 + ... + bmxm

So regression chooses b0, the intercept, and bi, the slope values (referred to
as regression coefficients) for each of the m independent variables, to
minimize the sum of the squared residuals. Many software packages,
including Microsoft Excel, can be used to estimate regression equations.
We begin with one of the simplest regression based forecasting methods,
trend forecasting. Figure 4-23 is the data we use to develop a regression
based forecasting model.

Figure 4-23 Trended demand

Table 4-2 has the numbers represented in Figure 4-23.


Table 4-2 Weekly Demand with Trend

Using regression, we get the following forecasting model:


Ft = 22 + 0.95t
That is, regression estimates the intercept to be 22 and the slope to be 0.95.
So, if we were to forecast for week 31, we would get
F31 = 22 + 0.95(31) = 51
And a forecast for week 50 would be
F50 = 22 + 0.95(50) = 70
Regression shows that the model is statistically significant and has an R-
square of 0.43, meaning that 43 percent of the variance in demand is explained by the week number.
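With only the week number as the independent variable, this is ordinary least squares on (t, demand) pairs. A Python sketch using numpy; since Table 4-2's numbers are not reproduced here, the demand series below is a stand-in generated around the same intercept and slope, so the estimates will only land near 22 and 0.95.

import numpy as np

weeks = np.arange(1, 31)                                  # t = 1..30
demand = 22 + 0.95 * weeks + np.random.default_rng(1).normal(0, 8, size=30)   # stand-in data

slope, intercept = np.polyfit(weeks, demand, deg=1)       # least squares line
print(round(intercept, 1), round(slope, 2))               # should land near 22 and 0.95

def trend_forecast(t):
    return intercept + slope * t

print(round(trend_forecast(31)), round(trend_forecast(50)))   # compare to the 51 and 70 above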
There are ways with regression to estimate something similar to a damped
trend. We can estimate a model of the form
Ft = atb
To estimate this with regression, we must first take the natural log of both
sides of the equation.
lnFt = ln(atb)
lnFt = lna + ln(tb)
lnFt = lna + blnt
In this case the R-square has gone from 0.43 to 0.64, meaning the “power”
form of the regression actually explains more of the variance in sales. The
actual model we estimate from regression is
Ft = 12t0.42
After running the regression, 0.42 was the estimate of the slope, but to get
12, you must raise e to the power of the intercept. The intercept was
estimated to be 2.48, so e2.48 ≈ 12.
Now, let’s forecast for weeks 31 and 50 as we did before with the linear
regression model.
F31 = 12(31)0.42 = 51
This is the same forecast as with the linear model. Now let’s look at the
forecast for week 50.
F50 = 12(50)0.42 = 62
So the forecast of week 50 with the power model is 62, whereas with the
linear model it was 70. In this regard, the power model is like a damped
trend. However, it is only like a damped trend when the estimate of b is
between zero and one. If it is greater than one, it will make forecasts that
shoot off into the stratosphere when the forecasts are very far into the
future.

Additive and Multiplicative Models


Now, sometimes you need to make forecasts that are based on price,
promotional spending, advertising, the price of substitutes, and other
variables. In these cases, you might not only be interested in forecasting but
also interested in testing hypotheses, such as increased promotional
spending leads to higher sales. When we are testing hypotheses, we need to
be more careful in using the regression model than when we are just
forecasting.
Suppose we want to forecast with a model that takes into account trend,
advertising spending, promotional spending, and the price of the product.
We could estimate a forecasting model with regression using a
multiplicative power model, such as the following:
Ft = a * tb1 * Advertisingb2 * Promotionb3 * Priceb4

Alternatively, we could estimate a linear model, such as the following:


Ft = a + b1t + b2 Advertising + b3Promotion + b4Price
There are several benefits of the multiplicative power model over the linear
model. The first benefit is that the multiplicative power model takes into
account that there may be interactions between the independent variables.
That is, the effect of spending on advertising may depend on the price. The
second benefit of the multiplicative power model is that many of the
relationships between the independent variable and sales would not be
expected to be linear but rather nonlinear. We have already discussed this
with respect to time, but we would also expect it with respect to other
independent variables such as advertising. That is, we would expect that as
we spend more on advertising, we would get more sales, but the increase
would eventually be at a decreasing rate.17 The third benefit of the
multiplicative power model is that the regression coefficients can be
interpreted as elasticity values. Suppose you run the regression model and
find b2=0.3; then we can interpret this to mean that if we increase spending
on advertising by 1 percent, there will be a 0.3 percent increase in
sales.18 Similarly, if b4=–2, then for a 1 percent reduction in price, sales will
increase by 2 percent, highly elastic. This is assuming that the assumptions
of regression hold and are not violated. Now to prepare for the regression,
the natural log must be taken of the dependent and independent variables.
Ft = a * tb1 * Advertisingb2 * Promotionb3 * Priceb4

lnFt = ln(a * tb1 * Advertisingb2 * Promotionb3 * Priceb4)

lnFt = lna + lntb1 + lnAdvertisingb2 + lnPromotionb3 + lnPriceb4

lnFt = lna + b1lnt + b2lnAdvertising + b3lnPromotion + b4lnPrice
This is similar to how we estimated the regression for the power model of
trend. Now, suppose we want to take into account the Christmas season. To
do so we can use a dummy variable, D, where D = 1 if the period falls in the Christmas season and D = 0 otherwise.
So, it could be included in the equation as


lnFt = lna + b1lnt + b2lnAdvertising + b3lnPromotion + b4lnPrice + b5D
Notice that we did not take the natural log of D. The reason for this is that
the natural log of zero is undefined. Of course b5 cannot be interpreted as an
elasticity as before, but the other regression coefficients still can be
interpreted that way.
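In practice this model is estimated by taking logs of the continuous variables, leaving the dummy as is, and running ordinary least squares. The Python sketch below builds such a design matrix and solves it with numpy's least squares routine; every data value is a hypothetical placeholder, so the coefficients themselves are meaningless and only the mechanics are illustrated.

import numpy as np

# hypothetical weekly data: trend index, advertising, promotion, price, Christmas dummy, sales
t     = np.array([1, 2, 3, 4, 5, 6, 7, 8], dtype=float)
adv   = np.array([10, 12, 11, 15, 14, 13, 20, 22], dtype=float)
promo = np.array([5, 5, 6, 8, 7, 6, 12, 14], dtype=float)
price = np.array([3.0, 3.0, 2.9, 2.8, 3.1, 3.0, 2.5, 2.4])
xmas  = np.array([0, 0, 0, 0, 0, 0, 1, 1], dtype=float)    # dummy variable: not logged
sales = np.array([100, 105, 108, 130, 118, 115, 220, 245], dtype=float)

X = np.column_stack([np.ones_like(t), np.log(t), np.log(adv),
                     np.log(promo), np.log(price), xmas])
y = np.log(sales)

coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
ln_a, b1, b2, b3, b4, b5 = coefs
print(b2)   # under the model's assumptions, the advertising elasticity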
Assumptions of Regression
If we want to do more than forecast with this model, such as use the
elasticity estimates, or test hypotheses about whether one of these variables
is statistically significant, we need to make sure the assumptions of
regression are satisfied. Since this is not a book on regression, we just give
some high level assumptions to consider:
1. Regression assumes that the residuals are normally distributed with a
mean of zero. So, if you were to make a histogram of the residuals, they
should look somewhat bell shaped.
2. Regression assumes that the residuals are not correlated over time. If
residuals are correlated over time, it is referred to as autocorrelation.
3. Regression assumes that the variance of residuals is constant for various
levels of the dependent variables. This assumption is known
as homoscedasticity. If the assumption is violated, that is, if the variance of
the residuals changes for different levels of the dependent variable, it is
known as heteroscedasticity.
4. Regression assumes that the independent variables are statistically
independent. If they are not statistically independent, they will be
correlated. This problem is referred to as multicollinearity.
5. There are other assumptions as well and when many of them are
violated, there are a number of methods to address the violations, but these
are beyond the scope of this book.

ENDNOTES
1. One of the best repositories of information on forecasting
is http://www.forecastingprinciples.com/.
2. The roll of a six-sided die is a discrete uniform distribution, where each
outcome is equally likely.
3. You can simulate the roll of a die in Excel using the following function:
=RANDBETWEEN(1,6). In general, =RANDBETWEEN(a,b) generates
random numbers between a and b using a uniform distribution, meaning
that each of the outcomes is equally likely.
4. This could happen if they were coming out of the backroom and put on
the shelf in the morning.
5. Which has plenty of candy, in our example.
6. Recall, ILFR is item-level fill rate.
7. The ten-day moving average doesn’t start until the 11th day because it takes ten days of demand to get started.
8. Armstrong, Jon Scott, ed. Principles of Forecasting: A Handbook for
Researchers and Practitioners. Vol. 30. New York: Springer, 2001.
9. You will notice R square below the equation. We discuss that later in the
chapter.
10. We are again assuming that the replenishment manager does not know
how the demand is being generated.
11. Assuming there is no uncertainty in the lead time.
12. Brown, Robert G. Exponential Smoothing for Predicting Demand.
Cambridge, MA: Arthur D. Little, 1956.
13. Holt, Charles C. “Forecasting Seasonals and Trends by Exponentially
Weighted Moving Averages.” International Journal of Forecasting 20.1
(2004): 5-10.
14. Taylor, James W. “Exponential Smoothing with a Damped
Multiplicative Trend.” International Journal of Forecasting 19.4 (2003):
715-725.
15. Winters, Peter R. “Forecasting Sales by Exponentially Weighted Moving
Averages.” Management Science 6.3 (1960): 324-342.
16. Too much in the sense that if you took seasonality into account in an
effective way, there would be less forecast error. The challenge is that
having the seasonal component per se might introduce more forecast error.
17. Eventually there should be diminishing returns to advertising.
18. Alternatively, we could say that for a 10 percent increase in spending on
advertising, there will be a 3 percent increase in sales.
