Business Forecasting - Time Series Forecasting
Contents
Forecasting Perspective
Why Forecasting?
Plot
Identify Components
Identify Techniques
Contents (cont)
Time Series Decomposition
Principles of decomposition
Additive Decomposition
Multiplicative Decomposition
Contents (cont)
Smoothing Methods
Averaging Methods
Regression
Simple Linear Regression
Forecasting Perspective
Why Forecasting?
The future is uncertain
Whatever will happen, will happen
There is a time lag between awareness of an impending event and its occurrence
If the time lag is long, planning can be useful
Forecasting reduces uncertainty, allowing better decisions by management
Assumption: the FUTURE is an EXTENSION of the PAST, or any difference can be explained by other variables
Why Forecasting? (Cont)
Typical Major Applications
Scheduling Resources
Acquiring Resources
Overview of Forecasting Techniques
Overview of Forecasting Techniques (Cont)
Quantitative
Time Series
Explanatory / Causal
Apply when:
Numerical data about the past are available
Some aspects of the past pattern will continue into the future
Overview of Forecasting Techniques (Cont)
Time Series vs. Explanatory
Time series forecasting treats the system as a black box
Basic Steps in Forecasting
Problem Definition and identifying output to be
forecasted
Gathering Information and Input
Preliminary (Exploratory) Analysis
Descriptive Statistics
Visualization
Examples
Fig 1-2a Australian Monthly Electricity Consumption
Increasing Trend, Increasing Variation, Strong Seasonal
Application?
Will it continue?
Examples (Cont)
Basic Forecasting Tools
Time Series and Cross-Sectional Data
Time series data – a sequence of observations over time, equally spaced (e.g. Table 2-2 Beer – Monthly Australian Beer Production)
Cross-sectional data – all observations at the same time (e.g. Table 2-1 Car – Price, Mileage and Country of Origin)
Time Series
A time series is made up of the values of a variable recorded at regular time intervals.
The time interval can be years, quarters, months, weeks, days, or any other length of time that is important.
For example, the marketing research department of a company might record the company's sales of a product on a daily basis.
These daily time series values could then be combined over two-week periods to create a bi-weekly time series, and so on.
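The aggregation described above can be sketched in a few lines of Python; the daily sales figures here are made-up illustration data, not from any of the text's tables.

```python
# Aggregate a daily sales series into bi-weekly (14-day) totals.
daily_sales = list(range(1, 29))  # 28 days of hypothetical daily sales

biweekly = [sum(daily_sales[i:i + 14])
            for i in range(0, len(daily_sales), 14)]
print(biweekly)  # → [105, 301]
```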
Basic Steps in Time Series Forecasting
Transformation and Adjustments (optional)
Plot
Identify Components
Identify Techniques
Evaluate Techniques and Choose appropriate one
Components of Time Series
Most time series techniques consider the time series to
be made up of four components.
Trend - a long-term upward or downward change in the
time series.
Seasonal - periodic increases or decreases that repeat with the calendar.
Cyclical - longer-term rises and falls that are not of fixed period.
Irregular - the remaining random variation.
Normally Y(t) = T(t) + S(t) + C(t) + I(t)
Components vs Techniques

Component    Technique                                          Base
Irregular    Moving Average MA(n), n an integer                 NF1
             Single Exponential Smoothing SES(α), 0 < α <= 1
             ARIMA(p, q), p and q integers
Measuring Forecast Accuracy
Goodness of fit (how well the forecasting model reproduces the known data) vs. accuracy of future forecasts (what matters to the business user, i.e. us)
For Table 2-2, take the last 8 months of actuals and forecast for 10 months using the average of the same month's data over the last 4 years
Calculate ME (Mean Error), MAE (Mean Absolute Error), MSE (Mean Square Error), RMSE (Root Mean Square Error), MPE (Mean Percentage Error) and MAPE (Mean Absolute Percentage Error)
ME and MPE tend to be small because positive and negative errors cancel out. A large value suggests that we are consistently over-estimating or under-estimating
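The accuracy measures listed above can be sketched directly from their definitions; the actual/forecast numbers below are illustrative, not taken from Table 2-2. Note how ME stays near zero while MAE reveals the true size of the errors.

```python
import math

def accuracy_metrics(actual, forecast):
    """Compute the standard forecast-accuracy measures for paired series."""
    errors = [a - f for a, f in zip(actual, forecast)]
    n = len(errors)
    me = sum(errors) / n                        # Mean Error
    mae = sum(abs(e) for e in errors) / n       # Mean Absolute Error
    mse = sum(e * e for e in errors) / n        # Mean Square Error
    rmse = math.sqrt(mse)                       # Root Mean Square Error
    # Percentage measures require actual values away from zero.
    mpe = sum(e / a for e, a in zip(errors, actual)) / n * 100
    mape = sum(abs(e / a) for e, a in zip(errors, actual)) / n * 100
    return {"ME": me, "MAE": mae, "MSE": mse,
            "RMSE": rmse, "MPE": mpe, "MAPE": mape}

m = accuracy_metrics([100, 110, 120, 130], [105, 105, 125, 125])
print(m["ME"], m["MAE"])  # → 0.0 5.0 (errors cancel in ME, not in MAE)
```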
Other Important Concepts
MAPE is meaningful only if the scale has a meaningful origin and none of the values are near zero
Interpretation of RMSE and MAPE
RMSE is problematic if the range of the values is large
Over-fitting
Training, (Validation) and Test (or Holdout) sets
Size of Training vs. Test set: problems and advantages
Naive Forecast Methods
NF1 – Naïve Forecast 1 Model
Ŷ(t+1) = Y(t)
This means the forecast for the next period is the same as the current actual value
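NF1 is simple enough to state in one line of code; the series below is illustrative.

```python
def nf1_forecasts(series):
    """NF1: the forecast for period t+1 is the actual value at period t."""
    return series[:-1]  # forecasts for periods 2..n

y = [10, 12, 11, 13]
print(nf1_forecasts(y))  # → [10, 12, 11] (forecasts for periods 2, 3, 4)
```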
Transformations and Adjustments
See Table 2-16 “Monthly Australian Electricity Consumption”
Variation increases with level
Any forecasting method needs to take care of trend, seasonality, and also variation that increases with level
A mathematical transformation is useful to smooth out the variation
The most important transformations are Square Root and Logarithmic
General “Power” Transformation:
W(t) = -Y(t)^p for p < 0
W(t) = log_e(Y(t)) for p = 0
W(t) = Y(t)^p for p > 0
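The three cases of the power transformation can be sketched as follows (the negation for p < 0 preserves the ordering of the values); the sample series is illustrative.

```python
import math

def power_transform(y, p):
    """General power transformation W(t) for a positive series Y(t)."""
    if p < 0:
        return [-(v ** p) for v in y]    # negate so ordering is preserved
    if p == 0:
        return [math.log(v) for v in y]  # natural logarithm
    return [v ** p for v in y]

y = [1.0, 4.0, 9.0]
print(power_transform(y, 0.5))  # → [1.0, 2.0, 3.0] (square-root transform)
```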
Transformations and Adjustments (Cont)
Other transformations exist for cross-sectional data, depending upon the problem
After forecasting, apply the reverse transformation
The prediction interval may not remain symmetric after back-transformation
Adjustments
Month Length Adjustments (e.g. sales for February vs. sales for a 31-day month). The number of days in the month of the date in A169 can be computed in Excel as
=DAY(DATE(YEAR(A169),MONTH(A169)+1,1)-1)
Exponential Smoothing Methods
Various Smoothing Methods
Stationary Time Series
Averaging Methods
Simple Average
Moving Average
One Parameter
Adaptive Parameter
Trend
Holt’s Linear Method
Approach - Smoothing Methods for Forecasting
Divide data into “Training” and “Test” sets
Plot the time series and identify its components
Choose appropriate smoothing methods
Use the “Training” data to build the model
Apply the model on the “Test” data for forecasting
Measure MAPE, MSE, etc. using the test data
Optimize (minimize) MAPE and MSE
Decide on a final smoothing method
Averaging Methods
Simple Average
Take the simple average of all observed data (Eq 4.1)
Pegels’ table
Moving Average – MA(k)
Different from the moving average discussed in an earlier chapter: here the average of the most recent k observations is used as the forecast for the next period
Averaging Methods (Cont)
Moving Average – MA(k) (Cont)
Does not handle trend or seasonality, but better than Simple Average
Do the Table 4.2 exercise (using MA(3) and MA(5)) on the time plot
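MA(k) forecasting as defined above can be sketched as follows; the series is illustrative rather than the Table 4.2 data.

```python
def moving_average_forecasts(series, k):
    """MA(k): the forecast for period t+1 is the mean of the k most
    recent observations."""
    return [sum(series[t - k:t]) / k
            for t in range(k, len(series) + 1)]

y = [3, 5, 4, 6, 8, 7]
print(moving_average_forecasts(y, 3))  # → [4.0, 5.0, 6.0, 7.0]
```

The last value in the list is the forecast for the first unobserved period.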
Averaging Methods (Cont)
If the time series changes suddenly, averaging methods do not give very good results (they break down; in fact, the larger the k, the more time they take to catch up)
Exponential smoothing methods are better for forecasting
SES: Single Exponential Smoothing
Weighted average with exponentially decreasing weights
F(t+1) = F(t) + α (Y(t) – F(t)) = F(t) + α * ε(t)
The new forecast is the old forecast plus an adjustment for the error
α varies from 0 to 1; α nearer to 0 implies very little adjustment (heavier smoothing)
Single Exponential (Cont)
F1 needs to be decided
Typically one takes F1 = Y1; as time increases, the weight of F1 decreases. However, if α is nearer to 0, the initialization will have a considerable effect on forecasting
Other methods take the average of the first few values of Yt
Calculate MSE, MAPE and Theil’s U-statistic for Table 4.2 “Electric Can Opener” by forecasting values from the 2nd to the 11th month (Feb to Dec) for α = 0.1, 0.5 and 0.9 (use F1 = Y1). Plot them
Larger α implies less smoothing (see the plot in the previous case)
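The SES recursion with F1 = Y1 can be sketched as follows; the sample series is illustrative, not the Table 4.2 data.

```python
def ses_forecasts(series, alpha):
    """Single exponential smoothing with F1 = Y1.
    Returns forecasts F1..F(n+1) for a series of length n."""
    forecasts = [series[0]]  # F1 = Y1
    for y in series:
        f = forecasts[-1]
        forecasts.append(f + alpha * (y - f))  # F(t+1) = F(t) + alpha * error
    return forecasts

y = [200.0, 135.0, 195.0, 197.5]
print(ses_forecasts(y, 0.5))  # → [200.0, 200.0, 167.5, 181.25, 189.375]
```

The final value is the one-step-ahead forecast for the period after the data ends.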
Single Exponential (Cont)
For longer-range forecasts, F(t+h) = F(t+1), h = 1, 2, ...
If there is a trend, the forecasts lag behind the trend, farther behind for smaller α
See example Table 4.4 “Inventory Demand”
Holt’s Linear Method
Used for data having a trend
Also called Double Exponential Smoothing
Two parameters α and β (range: 0 to 1)
L(t) = α*Y(t) + (1-α)*(L(t-1) + b(t-1)) = α*Y(t) + (1-α)*F(t)   (for m = 1)
b(t) = β*(L(t) – L(t-1)) + (1-β)*b(t-1)
F(t+1) = L(t) + b(t)
L(t) denotes the level of the series, while b(t) denotes the slope
Initialization:
L1 = Y1, b1 = Y2 – Y1; or L1 = Y1, b1 = (Y4 – Y1)/3; or use least-squares regression on the first few values of Yt
Problem with initialization: if the trend is upward but Y2 – Y1 is negative, it takes a long time to overcome the influence
Do the example of Table 4.6 “Inventory Demand Data”
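The two Holt recursions can be sketched as follows, using the simple L1 = Y1, b1 = Y2 – Y1 initialization; the perfectly linear sample series is illustrative.

```python
def holt_forecasts(series, alpha, beta, horizon=1):
    """Holt's linear method with L1 = Y1 and b1 = Y2 - Y1.
    Returns h-step-ahead forecasts made at the end of the series."""
    level, slope = series[0], series[1] - series[0]
    for y in series[1:]:
        prev_level = level
        level = alpha * y + (1 - alpha) * (prev_level + slope)
        slope = beta * (level - prev_level) + (1 - beta) * slope
    # F(t+h) = L(t) + h * b(t)
    return [level + h * slope for h in range(1, horizon + 1)]

y = [10.0, 12.0, 14.0, 16.0]  # a perfectly linear series
print(holt_forecasts(y, 0.5, 0.5, horizon=2))  # → [18.0, 20.0]
```

On a noiseless linear series the method recovers the trend exactly, extending the line.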
Holt’s Linear Method (Cont)
To find the optimal values of α and β (minimizing the error), either use a non-linear optimization method or a grid-search approach (i.e. try out different values)
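The grid-search idea can be sketched for SES, choosing the α that minimizes one-step-ahead MAPE over a grid; the series is illustrative and the grid step of 0.01 is an arbitrary choice.

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual) * 100

def ses(series, alpha):
    """One-step-ahead SES forecasts F2..Fn with F1 = Y1."""
    f = [series[0]]
    for y in series[:-1]:
        f.append(f[-1] + alpha * (y - f[-1]))
    return f[1:]

y = [20.0, 21.0, 19.0, 22.0, 24.0, 23.0, 25.0]
# Try alpha = 0.01, 0.02, ..., 0.99 and keep the best.
best = min((a / 100 for a in range(1, 100)),
           key=lambda a: mape(y[1:], ses(y, a)))
print(round(best, 2))
```

The same loop over a two-dimensional grid of (α, β) pairs applies to Holt's method.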
Holt-Winters’ Trend & Seasonality Method
See example in Table 4.7
Multiplicative (season length s):
Level:    L(t) = α*Y(t)/S(t-s) + (1-α)*(L(t-1) + b(t-1))
Trend:    b(t) = β*(L(t) – L(t-1)) + (1-β)*b(t-1)
Seasonal: S(t) = γ*Y(t)/L(t) + (1-γ)*S(t-s)
Forecast: F(t+m) = (L(t) + b(t)*m) * S(t-s+m)
Holt-Winters’ Trend & Seasonality Method (Cont)
See example in Table 4.7
Additive (season length s):
Level:    L(t) = α*(Y(t) - S(t-s)) + (1-α)*(L(t-1) + b(t-1))
Trend:    b(t) = β*(L(t) – L(t-1)) + (1-β)*b(t-1)
Seasonal: S(t) = γ*(Y(t) - L(t)) + (1-γ)*S(t-s)
Forecast: F(t+m) = L(t) + b(t)*m + S(t-s+m)
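A minimal sketch of the additive variant follows. The initialization used (level = mean of the first season, slope = 0, seasonal indices = first-season deviations) is one simple choice among several, an assumption rather than the text's prescription; the flat series with a repeating ±5 pattern is illustrative.

```python
def holt_winters_additive(series, s, alpha, beta, gamma, horizon=1):
    """Additive Holt-Winters with season length s."""
    level = sum(series[:s]) / s
    slope = 0.0
    seasonal = [y - level for y in series[:s]]
    for t in range(s, len(series)):
        y = series[t]
        prev_level = level
        # Each update uses the seasonal index from s periods ago.
        level = alpha * (y - seasonal[t % s]) + (1 - alpha) * (prev_level + slope)
        slope = beta * (level - prev_level) + (1 - beta) * slope
        seasonal[t % s] = gamma * (y - level) + (1 - gamma) * seasonal[t % s]
    n = len(series)
    return [level + h * slope + seasonal[(n + h - 1) % s]
            for h in range(1, horizon + 1)]

# A flat series with a repeating +5/-5 pattern (season length 2).
y = [10.0, 20.0, 10.0, 20.0, 10.0, 20.0]
print(holt_winters_additive(y, 2, 0.2, 0.1, 0.3, horizon=2))  # → [10.0, 20.0]
```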
General Aspects of Smoothing Methods
Initialization, optimization and prediction intervals are the major issues
Initialization
How you initialize becomes less and less important as we move further along the series
Back-forecasting: reverse the data series, run the estimation procedure starting from the latest (most recent) value and obtain a value for the first period. Use this as the initial value
Least-squares estimates: fit the first few values (say 10) to a straight line and get the initial value from it
Decomposition: use decomposition methods
Others: for the initial period, use high values of α, β, γ
General Aspects of Smoothing Methods (Cont)
Prediction Intervals
Since smoothing methods are not based on statistical models, giving a prediction interval and interpreting it is difficult. For example, justifying a statement such as “sales will be between 23 and 27 with 66% probability” is difficult.
ARIMA: Box-Jenkins
Autocorrelation Function: ACF
ACF measures the correlation between Y(t) and Y(t-1), Y(t-1) and Y(t-2), Y(t) and Y(t-2), and so on
PACF(1) = ACF(1)
PACF(2) is the correlation between Y(t) and Y(t-2) after removing the effect of the intermediate value Y(t-1)
ARIMA(p,q)
Y(t) = b0 + b1*Y(t-1) + ε(t);  ε(t) = Y(t) – F(t)
F(t) = b0 + b1*Y(t-1)
This is the case p = 1, q = 0, i.e. a pure AR(1) model
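The sample ACF and the AR(1) coefficients b0, b1 can be sketched from first principles (b0 and b1 via ordinary least squares on the lagged series); the noiseless AR(1) series below is illustrative, chosen so the fit recovers the coefficients exactly.

```python
def acf(y, max_lag):
    """Sample autocorrelation function for lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    c0 = sum((v - mean) ** 2 for v in y)
    return [sum((y[t] - mean) * (y[t + k] - mean) for t in range(n - k)) / c0
            for k in range(1, max_lag + 1)]

def fit_ar1(y):
    """Least-squares estimates of b0, b1 in Y(t) = b0 + b1*Y(t-1) + e(t)."""
    x, z = y[:-1], y[1:]          # regress Y(t) on Y(t-1)
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    b1 = (sum((a - mx) * (b - mz) for a, b in zip(x, z))
          / sum((a - mx) ** 2 for a in x))
    b0 = mz - b1 * mx
    return b0, b1

# A noiseless AR(1) series Y(t) = 1 + 0.5*Y(t-1) recovers its coefficients.
y = [4.0]
for _ in range(9):
    y.append(1 + 0.5 * y[-1])
b0, b1 = fit_ar1(y)
print(round(b0, 6), round(b1, 6))  # → 1.0 0.5
```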
[Figure: sample ACF plotted against time lag]
[Figure: sample PACF plotted against time lag]
ARIMA: Box-Jenkins
AR(p): Y(t) = φ1*Y(t-1) + ··· + φp*Y(t-p) + c + e(t);  e(t) = Y(t) – F(t)