Ch2 Forecaster's Toolbox
Agenda
2.1 Graphics
2.2 Numerical data summaries
2.3 Some simple forecasting methods
2.4 Transformations and adjustments
2.5 Evaluating forecast accuracy
2.6 Residual diagnostics
2.7 Prediction intervals
2.8 Exercises
2.9 Further reading
2.10 The forecast package in R
Components of Time Series Data
[Figure: example series illustrating cyclical and irregular fluctuations, plotted over Years 1-13]
install.packages("fpp") #all data in book
[Figure: weekly economy class passenger load (in thousands) on the Melbourne-Sydney route, plotted by Year]
There was a period in 1989 when no passengers were carried --- this was due to an industrial dispute.
There was a period of reduced load in 1992. This was due to a trial in which some economy class seats were replaced by business class seats.
A large increase in passenger load occurred in the second half of 1991.
There are some large dips in load around the start of each year. These are due to holiday effects.
There is a long-term fluctuation in the level of the series which increases during 1987, decreases in 1989 and increases again through 1990 and 1991.
There are some periods of missing observations.
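The plot these points describe can be reproduced from the fpp package's data; a sketch following the book's example (melsyd and its Economy.Class column come from fpp):
library(fpp)
plot(melsyd[,"Economy.Class"],
     main="Economy class passengers: Melbourne-Sydney",
     xlab="Year", ylab="Thousands")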
Graphics: Look at the Data!
plot(a10, ylab="$ million", xlab="Year",
     main="Antidiabetic drug sales")
[Figure: Antidiabetic drug sales, $ million, plotted by Year]
Here there is a clear and increasing trend. There is also a strong seasonal pattern that
increases in size as the level of the series increases. The sudden drop at the end of
each year is caused by a government subsidization scheme that makes it cost-effective
for patients to stockpile drugs at the end of the calendar year. Any forecasts of this
series would need to capture the seasonal pattern, and the fact that the trend is
changing slowly.
Seasonal plot: antidiabetic drug sales
[Figure: seasonal plot, $ million, one line per year]
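The seasonal plot comes from the forecast package's seasonplot() (loaded with fpp); a call matching the book's example:
seasonplot(a10, ylab="$ million", xlab="Year",
           main="Seasonal plot: antidiabetic drug sales",
           year.labels=TRUE, year.labels.left=TRUE, col=1:20, pch=19)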
[Figure: month plot of antidiabetic drug sales, $ million by Month]
monthplot(a10, ylab="$ million", xlab="Month", xaxt="n",
          main="Seasonal deviation plot: antidiabetic drug sales")
axis(1, at=1:12, labels=month.abb, cex=0.8)
[Figure: scatterplot of Carbon footprint vs City mpg]
plot(jitter(fuel[,5]), jitter(fuel[,8]),
     xlab="City mpg", ylab="Carbon footprint")
[Figure: pairs scatterplot matrix of Litres, City, Highway, and Carbon]
pairs(fuel[,-c(1:2,4,7)], pch=19)
[Figure: kdepairs matrix of City, Highway, and Carbon with density contours; pairwise correlations: City-Highway 0.821, City-Carbon -0.908, Highway-Carbon -0.927]
library(ResourceSelection)
kdepairs(fuel[,-c(1:2,4,7)], pch=19)
Statistics: Look for Issues!
myfun = function(x){
  m  = mean(x)
  md = median(x)
  s  = sd(x)
  f  = fivenum(x)            # min, lower hinge, median, upper hinge, max
  par(mfrow=c(2,1))
  h  = hist(x, main="Hist")  # histogram of the data
  all = as.list(c(m, md, s, f))
  return(all)
}
cardata = c(4, 4.4, 5.9, 5.9, 6.1, 6.1, 6.1, 6.3, 6.3, 6.3,
            rep(6.6,7), rep(6.8,3))
myfun(cardata)
Correlation: Look for Relationships!
par(mfrow=c(2,2))
plot(anscombe$y1 ~ anscombe$x1)
plot(anscombe$y2 ~ anscombe$x2)
plot(anscombe$y3 ~ anscombe$x3)
plot(anscombe$y4 ~ anscombe$x4)
[Figure: the four Anscombe-quartet scatterplots; every x-y pair has the same Pearson correlation, 0.816, despite strikingly different patterns]
cor(anscombe$y1, anscombe$x1, method="pearson")
cor(anscombe$y2, anscombe$x2, method="pearson")
cor.test() adds a significance test for the correlation.
A Word About Stationarity
[Diagram: time path of Y_t; after a shock, a stationary series returns to its equilibrium level]
Autocorrelations
Just the correlation between any two observations of a time series.
If Cov(Y_t, Y_{t-k}) is the autocovariance, then
cor(Y_t, Y_{t-k}) = Cov(Y_t, Y_{t-k}) / Var(Y_t).
newfun = function(x){
  ar = rep(0, 10)
  for (i in 1:10){
    # correlation between the series and itself shifted by i periods
    ar[i] = cor(x[1:(length(x)-i)], x[(i+1):length(x)])
  }
  return(ar)
}
newfun(co2)
acf(co2)
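(A note on the comparison: acf() centers on the full-series mean and divides by the overall variance, while newfun() computes a plain Pearson correlation on the two overlapping segments, so the two sets of values agree only approximately.)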
Autocorrelation (Serial Correlation)
Autocorrelation occurs in data when the error terms of a
regression forecasting model are correlated.
Potential Problems
Estimates of the regression coefficients no longer have
the minimum variance property and may be inefficient.
The variance of the error terms may be greatly
underestimated by the mean square error value.
The true standard deviation of the estimated regression
coefficient may be seriously underestimated.
The confidence intervals and tests using the t and F
distributions are no longer strictly applicable.
First-order autocorrelation occurs when there is correlation
between the error terms of adjacent time periods.
Overcoming the Autocorrelation Problem
Addition of Independent Variables
Transforming Variables
First-differences approach
Percentage change from period to period
Use autoregression
Neural network / MCMC simulation
changeit = co2[2:468] - co2[1:467]   # first differences (equivalently: diff(co2))
newfun(changeit)
Autocorrelation
beer2 <- window(ausbeer, start=1992, end=2006-.1)
acf(beer2)
[Figure: ACF of the beer2 series, lags up to 16]
Average Method
F_t = mean(X): the forecast is the average of all observed values.
meanf(co2)
Naive Forecasting
F_t = X_{t-1}
where: F_t = the forecast for time period t
       X_{t-1} = the value for time period t - 1
Last = Next
naive(co2)
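A quick check of what these two methods return (a sketch; the horizon h=10 is arbitrary):
library(forecast)       # loaded automatically with fpp
meanf(co2, h=10)$mean   # every forecast equals mean(co2)
naive(co2, h=10)$mean   # every forecast equals the last observed value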
Seasonal Naive Forecasting
F_t = X_{t-s}, where s is the length of the seasonal period.
Moving Average (can be equivalent to exponential smoothing)
Updated (recomputed) for every new time period
May be difficult to choose optimal number of periods
May not adjust for trend, cyclical, or seasonal effects
F_t = (X_{t-1} + X_{t-2} + X_{t-3} + ... + X_{t-n}) / n
library(TTR)
mysma = SMA(co2, n=5)
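The forecast package also implements the seasonal naive method directly as snaive() (not shown on the original slide), and the SMA values can be verified by hand:
library(TTR); library(forecast)
mysma <- SMA(co2, n=5)
c(mysma[5], mean(co2[1:5]))   # the first non-NA SMA value is the mean of the first 5 observations
snaive(co2, h=12)$mean        # each forecast repeats the value from the same month last year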
Mean method, Naive method, Drift method
[Figure: Dow Jones Index (daily ending 15 Jul 94), days 1-250, with mean, naive, and drift forecasts for the next 42 days]
dj2 <- window(dj, end=250)
plot(dj2, main="Dow Jones Index (daily ending 15 Jul 94)",
     ylab="", xlab="Day", xlim=c(2,290))
lines(meanf(dj2, h=42)$mean, col=4)
lines(rwf(dj2, h=42)$mean, col=2)
lines(rwf(dj2, drift=TRUE, h=42)$mean, col=3)
legend("topleft", lty=1, col=c(4,2,3),
       legend=c("Mean method","Naive method","Drift method"))
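For reference (from the fpp text, not on the slide): the drift forecast used above, rwf(..., drift=TRUE), extrapolates the line joining the first and last observations:
F_{T+h} = X_T + h * (X_T - X_1) / (T - 1)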
Transformations: Help Normalize Data, Which Is Useful for Prediction Intervals, etc.
Box-Cox transformation
library(car)
BoxCox.lambda(elec)    # from the forecast package (loaded with fpp)
[1] 0.2654076
powerTransform(elec)   # from the car package
Estimated transformation parameters
elec
0.3896253
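A minimal sketch of actually applying the estimated lambda, using the forecast package's BoxCox() and InvBoxCox() (these calls are not in the original slides):
library(forecast)
lambda <- BoxCox.lambda(elec)    # about 0.27
elec_bc <- BoxCox(elec, lambda)  # transformed series
plot(elec_bc, main="Box-Cox transformed elec")
# InvBoxCox(elec_bc, lambda) reverses the transformation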
Which is "better?"
https://www.otexts.org/fpp/2/4
Which is better? Get the Time Component Right!
Measurement of Forecasting Error: Model Comparison Statistics
(example: nonfarm partnership data, forecasts from simple exponential smoothing with α = .7)
Mean Error for the Nonfarm Partnership Forecasted Data
Year  Actual  Forecast  Error
 1    1402.0     -         -
 2    1458.0  1402.0     56.0
 3    1553.0  1441.2    111.8
 4    1613.0  1519.5     93.5
 5    1676.0  1584.9     91.1
 6    1755.0  1648.7    106.3
 7    1807.0  1723.1     83.9
 8    1824.0  1781.8     42.2
 9    1826.0  1811.3     14.7
10    1780.0  1821.6    -41.6
11    1759.0  1792.5    -33.5
                 Sum:   524.3
ME = (sum of e_i) / (number of forecasts) = 524.3 / 10 = 52.43
Mean Absolute Deviation: Nonfarm Partnership Forecasted Data
Year  Actual  Forecast  Error   |Error|
 1    1402.0     -         -       -
 2    1458.0  1402.0     56.0     56.0
 3    1553.0  1441.2    111.8    111.8
 4    1613.0  1519.5     93.5     93.5
 5    1676.0  1584.9     91.1     91.1
 6    1755.0  1648.7    106.3    106.3
 7    1807.0  1723.1     83.9     83.9
 8    1824.0  1781.8     42.2     42.2
 9    1826.0  1811.3     14.7     14.7
10    1780.0  1821.6    -41.6     41.6
11    1759.0  1792.5    -33.5     33.5
                          Sum:   674.5
MAD = (sum of |e_i|) / (number of forecasts) = 674.5 / 10 = 67.45
Mean Square Error: Nonfarm Partnership Forecasted Data
Year  Actual  Forecast  Error   Error^2
 1    1402       -         -        -
 2    1458    1402.0     56.0    3136.0
 3    1553    1441.2    111.8   12499.2
 4    1613    1519.5     93.5    8749.7
 5    1676    1584.9     91.1    8292.3
 6    1755    1648.7    106.3   11303.6
 7    1807    1723.1     83.9    7038.5
 8    1824    1781.8     42.2    1778.2
 9    1826    1811.3     14.7     214.6
10    1780    1821.6    -41.6    1731.0
11    1759    1792.5    -33.5    1121.0
                          Sum:  55864.2
MSE = (sum of e_i^2) / (number of forecasts) = 55864.2 / 10 = 5586.42
Mean Percentage Error: Nonfarm Partnership Forecasted Data
Year  Actual  Forecast  Error   Error %
 1    1402       -         -       -
 2    1458    1402.0     56.0     3.8%
 3    1553    1441.2    111.8     7.2%
 4    1613    1519.5     93.5     5.8%
 5    1676    1584.9     91.1     5.4%
 6    1755    1648.7    106.3     6.1%
 7    1807    1723.1     83.9     4.6%
 8    1824    1781.8     42.2     2.3%
 9    1826    1811.3     14.7     0.8%
10    1780    1821.6    -41.6    -2.3%
11    1759    1792.5    -33.5    -1.9%
                          Sum:   31.8%
MPE = (sum of (e_i / X_i) x 100) / (number of forecasts) = 31.8% / 10 = 3.18%, where X_i is the actual value.
Mean Absolute Percentage Error: Nonfarm Partnership Forecasted Data
Year  Actual  Forecast  Error   |Error %|
 1    1402       -         -       -
 2    1458    1402.0     56.0     3.8%
 3    1553    1441.2    111.8     7.2%
 4    1613    1519.5     93.5     5.8%
 5    1676    1584.9     91.1     5.4%
 6    1755    1648.7    106.3     6.1%
 7    1807    1723.1     83.9     4.6%
 8    1824    1781.8     42.2     2.3%
 9    1826    1811.3     14.7     0.8%
10    1780    1821.6    -41.6     2.3%
11    1759    1792.5    -33.5     1.9%
                          Sum:   40.2%
MAPE = (sum of |e_i / X_i| x 100) / (number of forecasts) = 40.2% / 10 = 4.02%
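All five measures are easy to compute directly; a minimal sketch using the actual and forecast values from the tables above (values re-typed here, so results match only up to rounding):
actual   <- c(1458, 1553, 1613, 1676, 1755, 1807, 1824, 1826, 1780, 1759)
forecast <- c(1402.0, 1441.2, 1519.5, 1584.9, 1648.7, 1723.1, 1781.8, 1811.3, 1821.6, 1792.5)
e <- actual - forecast
mean(e)                    # ME, approx. 52.4
mean(abs(e))               # MAD, approx. 67.5
mean(e^2)                  # MSE, approx. 5586
mean(e/actual) * 100       # MPE, approx. 3.2%
mean(abs(e/actual)) * 100  # MAPE, approx. 4.0%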
The size of the test set is typically about 20% of the total sample, although this
value depends on how long the sample is and how far ahead you want to forecast.
The size of the test set should ideally be at least as large as the maximum forecast
horizon required. The following points should be noted.
A model which fits the data well does not necessarily forecast well.
A perfect fit can always be obtained by using a model with enough parameters.
Over-fitting a model to data is as bad as failing to identify the systematic pattern
in the data.
Cross-Validation: Cross-Sectional Data
1. Select observation i for the test set, and use the remaining observations in the training set. Compute the error on the test observation.
2. Repeat the above step for i = 1, 2, ..., N, where N is the total number of observations.
3. Compute the forecast accuracy measures based on the errors obtained.
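A minimal sketch of these three steps in R, using the built-in anscombe data and a simple linear model (the model and data choice are illustrative, not from the slides):
df <- data.frame(x = anscombe$x1, y = anscombe$y1)
errors <- rep(NA, nrow(df))
for (i in 1:nrow(df)){
  fit <- lm(y ~ x, data = df[-i, ])              # train without observation i
  errors[i] <- df$y[i] - predict(fit, df[i, ])   # error on the held-out observation
}
mean(abs(errors))                                # e.g., MAD of the cross-validated errors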
Cross-Validation: Time Series (one-step forecasts)
1. Select the observation at time k+i for the test set, and use the observations at times 1, 2, ..., k+i-1 to estimate the forecasting model. Compute the error on the forecast for time k+i.
2. Repeat the above step for i = 1, 2, ..., T-k, where T is the total number of observations.
3. Compute the forecast accuracy measures based on the errors obtained.
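A minimal sketch of this rolling-origin procedure, assuming a naive forecaster (the function name and k = 100 are illustrative; recent versions of the forecast package provide tsCV() for the general case):
tscv_naive <- function(x, k){
  n <- length(x)
  e <- rep(NA, n - k)
  for (i in 1:(n - k)){
    train <- x[1:(k + i - 1)]           # observations 1, ..., k+i-1
    e[i] <- x[k + i] - tail(train, 1)   # error of the naive forecast for time k+i
  }
  sqrt(mean(e^2))                       # RMSE of the cross-validated errors
}
tscv_naive(co2, k = 100)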
Cross-Validation: Time Series (h-step forecasts)
1. Select the observation at time k+h+i-1 for the test set, and use the observations at times 1, 2, ..., k+i-1 to estimate the forecasting model. Compute the h-step error on the forecast for time k+h+i-1.
2. Repeat the above step for i = 1, 2, ..., T-k-h+1, where T is the total number of observations.
3. Compute the forecast accuracy measures based on the errors obtained.