
Forecasting using simple models
Outline
• Basic forecasting models
– The basic ideas behind each model
– When each model may be appropriate
– Illustrate with examples
• Forecast error measures
• Automatic model selection
• Adaptive smoothing methods
– (automatic alpha adaptation)
• Ideas in model based forecasting techniques
– Regression
– Autocorrelation
– Prediction intervals

2
Basic Forecasting Models

• Moving average and weighted moving average
• First order exponential smoothing
• Second order exponential smoothing
• First order exponential smoothing with trends and/or seasonal patterns
• Croston's method

3
M-Period Moving Average

P_{t+1}(t) = ( Σ_{j=0}^{M-1} V_{t-j} ) / M

• i.e. the average of the last M data points
• Basically assumes a stable (trend free) series
• How should we choose M?
– Advantages of large M?
– Advantages of small M?
• Average age of data = M/2

4
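The M-period moving average above can be sketched in a few lines of Python (the function name is my own, not from the slides):

```python
def moving_average_forecast(values, M):
    """Forecast the next period as the average of the last M observations."""
    if len(values) < M:
        raise ValueError("need at least M data points")
    return sum(values[-M:]) / M

# Example: forecast the next period from the last 3 observations
print(moving_average_forecast([10, 12, 11, 13, 14], M=3))  # average of 11, 13, 14
```

Note the trade-off the slide asks about: a large M averages out more noise but reacts slowly to real changes, since the average age of the data grows with M.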
Weighted Moving Averages

P_{t+1}(t) = Σ_{j=0}^{n} W_j V_{t-j}

• The W_j are weights attached to each historical data point
• Essentially all known (univariate) forecasting schemes are weighted moving averages
• Thus, don't screw around with the general versions unless you are an expert

5
Simple Exponential Smoothing

• P_{t+1}(t) = Forecast for time t+1 made at time t
• V_t = Actual outcome at time t
• 0 < α < 1 is the "smoothing parameter"
6
Two Views of Same Equation

• P_{t+1}(t) = P_t(t-1) + α[V_t − P_t(t-1)]
– Adjust forecast based on last forecast error

OR

• P_{t+1}(t) = (1-α)P_t(t-1) + αV_t
– Weighted average of last forecast and last actual

7
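The update rule above is easy to implement directly. A minimal Python sketch (function and variable names are my own):

```python
def exp_smooth_forecast(values, alpha, init=None):
    """One-step-ahead simple exponential smoothing:
    P[t+1] = P[t] + alpha * (V[t] - P[t])."""
    p = values[0] if init is None else init   # initialize with the first observation
    forecasts = []
    for v in values:
        forecasts.append(p)        # forecast made before seeing v
        p = p + alpha * (v - p)    # equivalently (1-alpha)*p + alpha*v
    return forecasts, p            # p is now the forecast for the next period
```

Both "views" of the equation are the same line of code: `p + alpha*(v - p)` expands to `(1-alpha)*p + alpha*v`.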
Simple Exponential Smoothing

• Is appropriate when the underlying time series behaves like a constant + noise
– X_t = μ + N_t
– Or when the mean μ is wandering around
– That is, for a quite stable process
• Not appropriate when trends or seasonality are present

8
ES would work well here

[Chart: "Typical Behavior for Exponential Smoothing" — Demand vs. Period]

9
Simple Exponential Smoothing

• We can show by recursive substitution that ES can also be written as:

P_{t+1}(t) = αV_t + α(1-α)V_{t-1} + α(1-α)²V_{t-2} + α(1-α)³V_{t-3} + …

• It is a weighted average of past observations
• Weights decay geometrically as we go backwards in time

10
Weights on past data

[Chart: weight placed on each past observation — exponential smoothing (α=0.6) vs. a 5-period moving average]

11
Simple Exponential Smoothing

F_{t+1}(t) = αA_t + α(1-α)A_{t-1} + α(1-α)²A_{t-2} + α(1-α)³A_{t-3} + …

• Large α adjusts more quickly to changes
• Smaller α provides more "averaging" and thus lower variance when things are stable
• Exponential smoothing is intuitively more appealing than moving averages

12
Exponential Smoothing
Examples

13
Zero Mean White Noise

[Charts (slides 14–16): a zero-mean white-noise series, then the same series with one-step exponential-smoothing forecasts overlaid for α=0.1 and α=0.3]
Shifting Mean + Zero Mean White Noise

[Charts (slides 17–19): a series whose mean shifts over time plus white noise, with forecasts overlaid for α=0.1 and α=0.3]
Automatic selection of α

• Using historical data
• Apply a range of α values
• For each, calculate the error in one-step-ahead forecasts
– e.g. the root mean squared error (RMSE)
• Select the α that minimizes RMSE

20
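The grid-search procedure above can be sketched as follows (function names and the grid of candidate α values are my own choices):

```python
import math

def one_step_rmse(values, alpha):
    """RMSE of one-step-ahead ES forecasts, initialized at the first value."""
    p = values[0]
    sq_errs = []
    for v in values[1:]:
        sq_errs.append((v - p) ** 2)   # error of the forecast made last period
        p = p + alpha * (v - p)        # exponential-smoothing update
    return math.sqrt(sum(sq_errs) / len(sq_errs))

def best_alpha(values, grid=None):
    """Pick the alpha on a grid that minimizes ex-post one-step RMSE."""
    grid = grid or [a / 20 for a in range(1, 20)]   # 0.05, 0.10, ..., 0.95
    return min(grid, key=lambda a: one_step_rmse(values, a))
```

This is exactly the "RMSE vs Alpha" curve on the next slide, evaluated at grid points and minimized.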
RMSE vs Alpha

[Chart: one-step forecast RMSE (roughly 1.15–1.45) as a function of α from 0 to 1]

21
Recommended Alpha

• Typically alpha should be in the range 0.05 to 0.3
• If RMSE analysis indicates a larger alpha, exponential smoothing may not be appropriate

22
23

[Chart: "Original Data" — Time Series Value vs. Period]
Actual vs Forecast for Various Alpha

[Chart: demand and one-step forecasts for α=0.1, 0.3, and 0.9 vs. Period]

24
Series and Forecast using Alpha=0.9

Might look good, but is it?

[Chart: the series and its α=0.9 forecast vs. Period]

25
Series and Forecast using Alpha=0.9

[Charts (slides 26–28): the same series and α=0.9 forecast, zoomed in to periods 1–16]
Forecast RMSE vs Alpha

[Chart: one-step forecast RMSE (roughly 0.57–0.67) as a function of α from 0 to 1]

29
Exponential Smoothing on Lake Huron Level Data, Various Alphas

[Chart: Huron level with forecasts overlaid for α=0.1, 0.3, and 0.9 vs. Period]

30
Forecast Errors for Lake Huron Data, Various Alphas

[Chart: one-step forecast errors for α=0.1 and α=0.9 vs. Period]

31
Forecast RMSE vs Alpha for Lake Huron Data

[Chart: RMSE (roughly 0.6–1.1) as a function of α from 0 to 1]

32
Monthly Furniture Demand vs Forecast, Various Alphas

[Chart: monthly furniture orders with forecasts for α=0.1, 0.3, and 0.9 vs. Period]

33
Monthly Furniture Demand Forecast Errors, Various Alphas

[Chart: forecast errors for α=0.1, 0.3, and 0.9 vs. Period]

34
Forecast RMSE vs Alpha for Monthly Furniture Demand Data

[Chart: RMSE as a function of α from 0 to 1]

35
Exponential smoothing will lag behind a trend

• Suppose X_t = b_0 + b_1 t
• And S_t = (1-α)S_{t-1} + αX_t
• Can show that

E[S_t] = E[X_t] − ((1-α)/α) b_1
36
Exponential Smoothing on a Trend

[Chart: trend data with exponential smoothing (α=0.2 and α=0.5) lagging behind]

37
Double Exponential Smoothing

• Modifies exponential smoothing for following a linear trend

Let S_t = (1-α)S_{t-1} + αX_t

Let S_t[2] = (1-α)S_{t-1}[2] + αS_t

• i.e. smooth the smoothed value

Let X̂_t = 2S_t − S_t[2]

38
Single and Double smoothed values

[Chart: trend data with single and double smoothing (α=0.5) — S_t lags the trend, and S_t[2] lags even more]
39
Double Smoothing

[Chart: trend data with 2S_t − S_t[2] overlaid — the combination doesn't lag]

40
E[S_t] = E[X_t] − ((1-α)/α) b_1

E[S_t[2]] = E[S_t] − ((1-α)/α) b_1

E[S_t] − E[S_t[2]] = ((1-α)/α) b_1

Thus estimate the slope at time t as

b̂_1(t) = (α/(1-α)) (S_t − S_t[2])
41
E[X_t] = E[S_t] + ((1-α)/α) b_1

E[X_t] = E[S_t] + ((1-α)/α)·(α/(1-α))·(E[S_t] − E[S_t[2]]) = E[S_t] + (E[S_t] − E[S_t[2]])

E[X_t] = 2E[S_t] − E[S_t[2]]

X̂_t = 2S_t − S_t[2]
42
X̂_t = 2S_t − S_t[2]

X̂_{t+τ} = X̂_t + τ b̂_1(t)

X̂_{t+τ} = 2S_t − S_t[2] + τ (α/(1-α)) (S_t − S_t[2])

X̂_{t+τ} = (2 + τα/(1-α)) S_t − (1 + τα/(1-α)) S_t[2]
43
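Putting the level estimate 2S_t − S_t[2] and the slope estimate together gives a short forecasting routine. A sketch in Python (function name is my own; initialization at the first data point is one simple choice):

```python
def double_smoothing_forecast(values, alpha, tau=1):
    """Brown's double exponential smoothing, per the slides:
    level = 2*S - S2, slope = alpha/(1-alpha) * (S - S2),
    forecast tau ahead = level + tau*slope."""
    s = s2 = values[0]                       # initialize both at the first point
    for v in values:
        s = (1 - alpha) * s + alpha * v      # single smoothing
        s2 = (1 - alpha) * s2 + alpha * s    # smooth the smoothed value
    slope = alpha / (1 - alpha) * (s - s2)
    return 2 * s - s2 + tau * slope
```

On a clean linear trend the transients die out and the forecast recovers the trend exactly, which is the point of the correction terms.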
44

Example

[Chart: a trend + noise series vs. Period]
α=0.2

[Chart: trend series data with single and double smoothing overlaid]
45
[Chart: same data — single smoothing lags a trend]
46
[Chart: same data — double smoothing over-shoots a change (must "re-learn" the slope)]
47
Holt-Winters Trend and Seasonal Methods

• "Exponential smoothing for data with trend and/or seasonality"
– Two models, Multiplicative and Additive
• Models contain estimates of trend and seasonal components
• Models "smooth", i.e. place greater weight on more recent data

48
Winters Multiplicative Model

• X_t = (b_1 + b_2 t)c_t + ε_t
• Where the c_t are seasonal terms and

Σ_{t=1}^{L} c_t = L, where L is the season length

• Note that the amplitude depends on the level of the series
• Once we start smoothing, the seasonal components may not add to L

49
Holt-Winters Trend Model

• X_t = (b_1 + b_2 t) + ε_t
• Same except no seasonal effect
• Works the same as the trend + season model, except simpler

50
• Example:

X_t = (1 + 0.04t) · c_t, where the seasonal factors c_t cycle through (1.5, 0.5, 1)
51
Xt = (1 + 0.04t)(1.5, 0.5, 1)

[Chart: the seasonal series with the trend line (1+0.04t) overlaid]
52
Xt = (1 + 0.04t)(1.5, 0.5, 1)

[Chart: the high-season points sit at 150% of the trend line]
53
Xt = (1 + 0.04t)(1.5, 0.5, 1)

[Chart: the low-season points sit at 50% of the trend line]
54
• The seasonal terms average 100% (i.e. 1)
• Thus, summed over a season, the c_t must add to L
• Each period we go up or down some percentage of the current level value
• The amplitude increasing with level seems to occur frequently in practice

55
Recall Australian Red Wine Sales

[Chart: monthly sales (roughly 500–3000) over 140 periods]

56
Smoothing

• In the Winters model, we smooth the "permanent component", the "trend component" and the "seasonal component"
• We may have a different smoothing parameter for each (α, β, γ)
• Think of the permanent component as the current level of the series (without trend)

57
Step 1. Update the Permanent Component

Let a_1(T) = b_1 + b_2 T be the permanent component.

The update step is:

â_1(T) = α [ V_T / ĉ_T(T−L) ] + (1−α) [ â_1(T−1) + b̂_2(T−1) ]
58
Step 1, annotated:

â_1(T) = α [ V_T / ĉ_T(T−L) ] + (1−α) [ â_1(T−1) + b̂_2(T−1) ]

• V_T / ĉ_T(T−L) is the current observation, "deseasonalized" by last season's factor
• â_1(T−1) + b̂_2(T−1) is the estimate of the permanent component from last time = last level + slope×1
• So â_1(T) = α(current observed level) + (1−α)(forecast of current level)
62
Step 2. Update the Trend Component

b̂_2(T) = β [ â_1(T) − â_1(T−1) ] + (1−β) b̂_2(T−1)

• â_1(T) − â_1(T−1) is the "observed" slope
• b̂_2(T−1) is the "previous" slope
65
Step 3. Update the Seasonal Component for this period

ĉ_T(T) = γ [ V_T / â_1(T) ] + (1−γ) ĉ_T(T−L)

Since V_T ≈ a_1(T) c_T(T)
66
To forecast τ ahead at time T, use current values of a, b, and c:

V̂_{T+τ}(T) = [ â_1(T) + b̂_2(T)·τ ] · ĉ_{T+τ}(T+τ−L)

• â_1(T) + b̂_2(T)·τ extends the trend out τ periods ahead
• ĉ_{T+τ}(T+τ−L) uses the proper seasonal adjustment
69
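The three update steps and the forecast equation can be sketched directly in Python. This is a minimal sketch of the Winters multiplicative recursions (function names, default smoothing constants, and the `t mod L` seasonal indexing are my own choices; real implementations also need a careful initialization phase):

```python
def winters_update(level, trend, seasonals, v, t, L, alpha=0.2, beta=0.1, gamma=0.1):
    """One Winters-multiplicative update at time t.
    seasonals is a list of L factors, indexed by t mod L (updated in place)."""
    i = t % L
    # Step 1: permanent component = alpha*(deseasonalized obs) + (1-alpha)*(level+slope)
    new_level = alpha * (v / seasonals[i]) + (1 - alpha) * (level + trend)
    # Step 2: trend = beta*(observed slope) + (1-beta)*(previous slope)
    new_trend = beta * (new_level - level) + (1 - beta) * trend
    # Step 3: seasonal factor for this position in the season
    seasonals[i] = gamma * (v / new_level) + (1 - gamma) * seasonals[i]
    return new_level, new_trend, seasonals

def winters_forecast(level, trend, seasonals, t, tau, L):
    """Forecast tau periods ahead: extend the trend, apply the right seasonal."""
    return (level + trend * tau) * seasonals[(t + tau) % L]
```

On a noiseless trend-times-seasonal series with exact starting values, the recursions simply track the series, which is a useful sanity check.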
Winters Additive Method

• X_t = b_1 + b_2 t + c_t + ε_t
• Where the c_t are seasonal terms and

Σ_{t=1}^{L} c_t = 0, where L is the season length

• Similar to the previous model except we "smooth" estimates of b_1, b_2, and the c_t

70
Croston's Method

• Can be useful for intermittent, erratic, or slow-moving demand
– e.g. when demand is zero most of the time (say 2/3 of the time)
• Might be caused by
– Short forecasting intervals (e.g. daily)
– A handful of customers that order periodically
– Aggregation of demand elsewhere (e.g. reorder points)

71
Demand Distribution

[Chart: probability mass function of per-period demand, concentrated at zero]

72
Typical situation

• Central spare parts inventory (e.g. military)
• Orders from manufacturer
– in batches (e.g. EOQ)
– periodically, when inventory is nearly depleted
– long lead times may also affect batch size

73
Example

Demand   Prob
0        0.85        Demand each period follows a
1        0.1275      distribution that is usually zero
2        0.0191
3        0.0029
4        0.0004
5        0.00006
6        0.00001
7        0.000002
74
75

Example

[Chart: "An intermittent Demand Series" — demand vs. Period]
Example

• Exponential smoothing applied (α=0.2)

[Chart: "Exponential Smoothing Applied" — the smoothed forecast vs. Period]

76
Using Exponential Smoothing:

• Forecast is highest right after a non-zero demand occurs
• Forecast is lowest right before a non-zero demand occurs

77
Croston's Method

• Separately tracks
– Time between (non-zero) demands
– Demand size when not zero
• Smoothes both time between and demand size
• Combines both for forecasting

Forecast = Demand Size / Time Between Demands
78
Define terms

• V(t) = actual demand outcome at time t
• P(t) = predicted demand at time t
• Z(t) = estimate of demand size (when it is not zero)
• X(t) = estimate of time between (non-zero) demands
• q = a variable used to count the number of periods between non-zero demands

79
Forecast Update

• For a period with zero demand
– Z(t) = Z(t-1)
– X(t) = X(t-1)
• No new information about
– order size Z(t)
– time between orders X(t)
• q = q + 1
– Keep counting time since last order
80
Forecast Update

• For a period with non-zero demand
– Z(t) = Z(t-1) + α(V(t) − Z(t-1))
– X(t) = X(t-1) + α(q − X(t-1))
– q = 1
• Update size of order via smoothing (V(t) is the latest order size)
• Update time between orders via smoothing (q is the latest time between orders)
• Reset the counter of time between orders
84
Forecast

• Finally, our forecast is:

P(t) = Z(t) / X(t) = Non-zero Demand Size / Time Between Demands
85
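The update rules above fit in one small function. A sketch in Python (the function name and the choice to initialize Z and X from the first non-zero demand are my own):

```python
def croston(demands, alpha=0.2):
    """Croston's method: smooth non-zero demand size Z and the
    inter-demand interval X separately; per-period forecast is Z/X."""
    z = x = None
    q = 1
    forecasts = []
    for v in demands:
        if v == 0:
            q += 1                          # keep counting since the last order
        else:
            if z is None:
                z, x = v, q                 # initialize from the first demand
            else:
                z = z + alpha * (v - z)     # smooth order size
                x = x + alpha * (q - x)     # smooth time between orders
            q = 1                           # reset the counter
        forecasts.append(z / x if z is not None else 0.0)
    return forecasts
```

Note the forecast only changes in periods with non-zero demand, exactly as the "Behavior" slide below describes.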
Recall example

• Exponential smoothing applied (α=0.2)

[Chart: "Exponential Smoothing Applied to Example Data" vs. Period]
86
Recall example

• Croston's method applied (α=0.2)

[Chart: "Croston's Method Applied to Example Data" vs. Period]
87
What is it forecasting?

• Average demand per period

[Chart: Croston's forecast hovering near the true average demand per period = 0.176]
88
Behavior

• Forecast only changes after a demand
• Forecast constant between demands
• Forecast increases when we observe
– A large demand
– A short time between demands
• Forecast decreases when we observe
– A small demand
– A long time between demands
89
Croston's Method

• Croston's method assumes demand is independent between periods
– That is, one period looks like the rest (or changes slowly)

90
Counter Example

• One large customer
• Orders using a reorder point
– The longer we go without an order
– The greater the chances of receiving an order
• In this case we would want the forecast to increase between orders
• Croston's method may not work too well

91
Better Examples

• Demand is a function of intermittent random events
– Military spare parts depleted as a result of military actions
– Umbrella stocks depleted as a function of rain
– Demand depending on start of construction of a large structure

92
Is demand independent?

• If enough data exists we can check the distribution of time between demands
• It should "tail off" geometrically

93
Theoretical behavior

[Chart: "Theoretical Time Between Demands Distribution" — frequency vs. time between, tailing off geometrically]
94
In our example:

[Chart: "Time Between Demands in Example" — observed frequency vs. time between]
95
Comparison

[Chart: observed vs. theoretical time-between-demands distributions]
96
Counterexample

• Croston's method might not be appropriate if the time-between-demands distribution looks like this:

[Chart: "Distribution of Time Between Demand" — frequency concentrated around 20 periods rather than tailing off]
97
Counterexample

• In this case, as time approaches 20 periods without demand, we know demand is coming soon
• Our forecast should increase in this case
98
Error Measures

• Errors: the difference between actual and predicted (one period earlier)
• e_t = V_t − P_t(t-1)
– e_t can be positive or negative
• Absolute error |e_t|
– Always positive
• Squared error e_t²
– Always positive
• The percentage error PE_t = 100 e_t / V_t
– Can be positive or negative

99
Bias and error magnitude

• Forecasts can be:
– Consistently too high or too low (bias)
– Right on average, but with large deviations both positive and negative (error magnitude)
• Should monitor both for changes

100
Error Measures

• Look at errors over time
• Cumulative measures summed or averaged over all data
– Error Total (ET)
– Mean Percentage Error (MPE)
– Mean Absolute Percentage Error (MAPE)
– Mean Squared Error (MSE)
– Root Mean Squared Error (RMSE)
• Smoothed measures reflect errors in the recent past
– Mean Absolute Deviation (MAD)

101
Of these measures:

• ET and MPE measure bias
• MAPE, MSE, RMSE, and MAD measure error magnitude
103
Error Total

• Sum of all errors

ET = Σ_{t=1}^{n} e_t

• Uses raw (positive or negative) errors
• ET can be positive or negative
• Measures bias in the forecast
• Should stay close to zero, as we saw in the last presentation

104
MPE

• Average of percent errors

MPE = (1/n) Σ_{t=1}^{n} PE_t

• Can be positive or negative
• Measures bias, should stay close to zero

105
MSE

• Average of squared errors

MSE = (1/n) Σ_{t=1}^{n} e_t²

• Always positive
• Measures "magnitude" of errors
• Units are "demand units squared"

106
RMSE

• Square root of MSE

RMSE = √[ (1/n) Σ_{t=1}^{n} e_t² ]

• Always positive
• Measures "magnitude" of errors
• Units are "demand units"
• Standard deviation of forecast errors

107
MAPE

• Average of absolute percentage errors

MAPE = (1/n) Σ_{t=1}^{n} |PE_t|

• Always positive
• Measures magnitude of errors
• Units are "percentage"

108
Mean Absolute Deviation

• Smoothed absolute errors

MAD_t = (1 − 0.3) MAD_{t-1} + 0.3 |e_t|

• Always positive
• Measures magnitude of errors
• Looks at the recent past

109
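The measures above can all be computed in one pass. A minimal sketch (function name is my own; the MAD smoothing constant 0.3 and its initialization at the first absolute error follow the slide's recursion):

```python
import math

def error_measures(actuals, forecasts):
    """Cumulative bias/magnitude measures from the slides, plus smoothed MAD."""
    e = [a - f for a, f in zip(actuals, forecasts)]
    n = len(e)
    pe = [100 * err / a for err, a in zip(e, actuals)]   # percentage errors
    mad = abs(e[0])                                      # initialize smoothed MAD
    for err in e[1:]:
        mad = 0.7 * mad + 0.3 * abs(err)
    return {
        "ET": sum(e),                                    # bias
        "MPE": sum(pe) / n,                              # bias, in percent
        "MAPE": sum(abs(p) for p in pe) / n,             # magnitude, in percent
        "MSE": sum(err ** 2 for err in e) / n,           # magnitude, squared units
        "RMSE": math.sqrt(sum(err ** 2 for err in e) / n),
        "MAD": mad,                                      # recent-past magnitude
    }
```

Note that an unbiased but noisy forecast gives ET ≈ 0 and MPE ≈ 0 while RMSE, MAPE, and MAD stay large, which is why both families are monitored.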
Percentage or Actual units

• Often errors naturally increase as the level of the series increases
• Natural, thus no reason for alarm
• If true, percentage-based measures are preferred
• Actual units are more intuitive

110
Squared or Absolute Errors

• Absolute errors are more intuitive
• Standard deviation units less so
– 66% within ±1 S.D.
– 95% within ±2 S.D.
• When using measures for automatic model selection, there are statistical reasons for preferring measures based on squared errors

111
Ex-Post Forecast Errors

• Given
– A forecasting method
– Historical data
• Calculate (some) error measure using the historical data
• Some data is required to initialize the forecasting method
• The rest of the data (if enough) is used to calculate the ex-post forecast errors and the measure

112
Automatic Model Selection

• For all possible forecasting methods
– (and possibly for all parameter values, e.g. smoothing constants – but not in SAP?)
• Compute the ex-post forecast error measure
• Select the method with the smallest error

113
Automatic α Adaptation

• Suppose an error measure indicates behavior has changed
– e.g. the level has jumped up
– The slope of the trend has changed
• We would want to base forecasts on more recent data
• Thus we would want a larger α

114
Tracking Signal (TS)

TS_t = ET_t / MAD_t

TS_t = 0 if MAD_t is zero

• Bias/Magnitude = "standardized bias"

115
α Adaptation

α_t = α_{t-1} + 0.2 |TS_t − TS_{t-1}|

or

α_t = 0.8 α_{t-1} + 0.2 |TS_t|

subject to 0.05 ≤ α_t ≤ 0.9

• If TS increases, bias is increasing, thus increase α
• I don't like these methods due to instability

116
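One step of the second adaptation rule can be sketched as follows (function name is my own; this implements α_t = 0.8α_{t-1} + 0.2|TS_t| with the 0.05–0.9 clipping, one of the candidate rules on the slide):

```python
def adapt_alpha(alpha_prev, et_total, mad):
    """One adaptive-alpha step: TS = ET/MAD ("standardized bias"),
    then move alpha toward |TS| and clip to [0.05, 0.9]."""
    ts = 0.0 if mad == 0 else et_total / mad   # tracking signal, 0 if MAD is zero
    alpha = 0.8 * alpha_prev + 0.2 * abs(ts)
    return min(0.9, max(0.05, alpha))
```

Because |TS| can jump around a lot, α can too — which is the instability complained about above.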
Model Based Methods

• Find and exploit "patterns" in the data
• Trend and Seasonal Decomposition
– Time based regression
• Time Series Methods (e.g. ARIMA Models)
• Multiple Regression using leading indicators
• Assumes series behavior stays the same
• Requires analysis (no "automatic model generation")

117
Univariate Time Series Models Based on Decomposition

• V_t = the time series to forecast
• V_t = T_t + S_t + N_t
• Where
– T_t is a deterministic trend component
– S_t is a deterministic seasonal/periodic component
– N_t is a random noise component

118
Raw Material Price

[Charts (slides 119–120): raw material price ($/Unit) over 24 periods; σ(V_t) = 0.257]
120
Simple Linear Regression Model:
Vt = 2.877174 + 0.020726t

SUMMARY OUTPUT

Regression Statistics
  Multiple R          0.569724
  R Square            0.324585
  Adjusted R Square   0.293884
  Standard Error      0.21616
  Observations        24

ANOVA
              df   SS        MS        F         Significance F
  Regression   1   0.494006  0.494006  10.57257  0.003659
  Residual    22   1.027956  0.046725
  Total       23   1.521963

                Coefficients  Std Error  t Stat    P-value   Lower 95%  Upper 95%
  Intercept     2.877174      0.091079   31.58978  7.99E-20  2.688287   3.066061
  X Variable 1  0.020726      0.006374   3.251549  0.003659  0.007507   0.033945

121
Use Model to Forecast into the Future

[Chart: "Actuals and Forecasts" — price and the fitted-line forecast extended through period 36]

122
Residuals = Actual − Predicted
e_t = V_t − (2.877174 + 0.020726t)
σ(e_t) = 0.211

[Chart: "Residuals After Regression" — residuals vs. Period]

123
Simple Seasonal Model

• Estimate a seasonal adjustment factor for each period within the season
• e.g. S_September

124
Residuals sorted by season, with season averages:

Season 1: 0.1521, 0.27992, 0.22774, 0.19557, 0.18339, 0.28121, 0.33903, 0.34685 → average 0.250726055
Season 2: −0.24863, −0.21080, −0.28298, −0.19516, −0.28734, −0.20952, −0.24170, −0.26387 → average −0.242500035
Season 3: 0.03065, −0.07153, 0.03629, 0.00411, −0.00807, −0.00024, −0.04242, −0.01460 → average −0.008226125
125
Trend + Seasonal Model

• V_t = 2.877174 + 0.020726t + S_mod(t,3)
• Where
– S_1 = 0.250726055
– S_2 = −0.242500035
– S_3 = −0.008226125

126
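The whole procedure — fit a line by least squares, average the residuals by position within the season, then forecast with trend plus seasonal adjustment — can be sketched in Python (function names are my own; t is 1-indexed as on the slides):

```python
def fit_trend_seasonal(v, L):
    """OLS line fit, then seasonal factor = average residual per season position."""
    n = len(v)
    t = list(range(1, n + 1))
    tbar, vbar = sum(t) / n, sum(v) / n
    b1 = (sum((ti - tbar) * (vi - vbar) for ti, vi in zip(t, v))
          / sum((ti - tbar) ** 2 for ti in t))          # OLS slope
    b0 = vbar - b1 * tbar                               # OLS intercept
    resid = [vi - (b0 + b1 * ti) for ti, vi in zip(t, v)]
    season = [sum(resid[i::L]) / len(resid[i::L]) for i in range(L)]
    return b0, b1, season

def forecast_ts(b0, b1, season, t):
    """Forecast for period t (1-indexed): trend plus seasonal adjustment."""
    return b0 + b1 * t + season[(t - 1) % len(season)]
```

When n is a whole number of seasons, the seasonal averages sum to zero (residuals from OLS with an intercept sum to zero), matching the additive-model constraint Σ c_t = 0.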
Actual vs Forecast (Trend + Seasonal Model)

[Chart: price and the trend + seasonal forecast extended through period 36]

127
et = Vt - (2.877174 + 0.020726t + Smod(t,3))

(et)=0.145
Residuals from Trend+Season

0.15

0.1

0.05
Residuals

0 Residuals2

-0.05

-0.1

-0.15
1

11

13

15

17

19

21

23
Period

128
Can use other trend models

• V_t = β_0 + β_1 sin(2πt/k) (where k is the period)
• V_t = β_0 + β_1 t + β_2 t² (multiple regression)
• V_t = β_0 + β_1 e^{kt}
• etc.
• Examine the plot, pick a reasonable model
• Test the model fit, revise if necessary

129
130

[Charts (slides 130–131): a noisy series containing the signal S(t) = cos(2πt/12), and the extracted signal]
131
Model: V_t = T_t + S_t + N_t

• After extracting the trend and seasonal components we are left with "the noise": N_t = V_t − (T_t + S_t)
• Can we extract any more predictable behavior from the "noise"?
• Use time series analysis
– Akin to signal processing in EE

132
Zero Mean, and Aperiodic:
Is our best forecast N̂_{t+1} = 0?

[Chart: the noise series vs. Period]
133
AR(1) Model

• This data was generated using the model
• N_t = 0.9 N_{t-1} + Z_t
• Where Z_t ~ N(0, σ²)
• Thus, to forecast N_{t+1}, we could use:

N̂_{t+1} = 0.9 N_t

N̂_{t+2} = 0.9 N̂_{t+1} = 0.9² N_t
134
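Iterating the one-step rule gives the h-step-ahead forecasts, which decay geometrically toward the zero mean. A one-line sketch (function name is my own; phi=0.9 matches the slide's model):

```python
def ar1_forecasts(n_last, phi=0.9, steps=3):
    """h-step-ahead AR(1) forecasts: N-hat(t+h) = phi**h * N(t)."""
    return [phi ** h * n_last for h in range(1, steps + 1)]
```

This is why the multi-step forecast curve on the later slide relaxes back toward zero instead of staying flat.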
AR(1): Actual vs 1-Step Ahead Forecast

[Chart: actual and one-step-ahead forecast vs. Period]

135
Forecasting N Steps Ahead

[Chart: actual series and multi-step forecasts decaying toward the mean vs. Period]

136
Time Series Models

• Examine the correlation of the time series to past values
• This is called "autocorrelation"
• If N_t is correlated to N_{t-1}, N_{t-2}, …
• Then we can forecast better than N̂_{t+1} = 0

137
Sample Autocorrelation Function

[Charts: Sample ACF and Sample PACF, lags 0–40]
138
Back to our Demand Data

[Chart: "Residuals from Trend+Season" — residuals vs. Period]
139
No Apparent Significant Autocorrelation

[Charts: Sample ACF and Sample PACF of the residuals, lags 0–40 — no significant spikes]

140
Multiple Linear Regression

• V = β_0 + β_1 X_1 + β_2 X_2 + … + β_p X_p + ε
• Where
– V is the dependent variable you want to predict
– The X_i's are the independent variables you use for prediction (known)
• The model is linear in the β_i's

141
Examples of MLR in Forecasting

• V_t = β_0 + β_1 t + β_2 t² + β_3 sin(2πt/k) + β_4 e^{kt}
– i.e. a trend model, a function of t
• V_t = β_0 + β_1 X_{1t} + β_2 X_{2t}
– Where X_{1t} and X_{2t} are leading indicators
• V_t = β_0 + β_1 V_{t-1} + β_2 V_{t-2} + β_12 V_{t-12} + β_13 V_{t-13}
– An autoregressive model

142
Example: Sales and Leading Indicator

[Charts: two series over 140 periods — sales and a leading indicator]
143
Example: Sales and Leading Indicator

Fitted model:
Sales(t) = −3.93 + 0.83·Sales(t−3) − 0.78·Sales(t−2) + 1.22·Sales(t−1) − 5.0·Lead(t)

[Charts: the same sales and leading-indicator series]
144
