Solutions to Time Series Analysis: With Applications in R

Johan Larsson

2017-04-05
Contents

Preface
  Dependencies

1 Introduction
  1.1 Larain
  1.2 Colors
  1.3 Random, normal time series
  1.4 Random, χ²-distributed time series
  1.5 t(5)-distributed, random values
  1.6 Dubuque temperature series

2 Fundamental concepts
  2.1 Basic properties of expected value and covariance
  2.2 Dependence and covariance
  2.3 Strict and weak stationarity
  2.4 Zero-mean white noise
  2.5 Zero-mean stationary series
  2.6 Stationary time series
  2.7 First and second-order difference series
  2.8 Generalized difference series
  2.9 Zero-mean stationary difference series
  2.10 Zero-mean, unit-variance process
  2.11 Drift
  2.12 Periods
  2.13 Drift, part 2
  2.14 Stationarity, again
  2.15 Random variable, zero mean
  2.16 Mean and variance
  2.17 Variance of sample mean

3 Trends
  3.1 Least squares estimation for linear regression trend
  3.2 Variance of mean estimator
  3.3 Variance of mean estimator #2
  3.4 Hours
  3.5 Wages
  3.6 Beer sales
  3.7 Winnebago
  3.8 Retail
  3.9 Prescriptions
Preface
This book contains solutions to the problems in Time Series Analysis: With Applications in R, third edition, by Cryer and Chan. It is provided as a GitHub repository so that anybody may contribute to its development.
Dependencies
You will need these packages to reproduce the examples in this book:
install.packages(c(
"latticeExtra",
"lattice",
"TSA",
"pander"
))
In order for some of the content in this book to be reproducible, the random seed is set as follows.
set.seed(1234)
Chapter 1
Introduction
1.1 Larain
Use software to produce the time series plot shown in Exhibit 1.2, on page 2. The data are in
the file named larain.
library(TSA)
library(latticeExtra)

data(larain)
# Plotting call reconstructed from the figure's axis labels.
xyplot(larain, ylab = "Inches", xlab = "Year", type = "p")

[Figure: annual rainfall in Los Angeles, inches by year.]
1.2 Colors
Produce the time series plot displayed in Exhibit 1.3, on page 3. The data file is named color.
data(color)
xyplot(color, ylab = "Color property", xlab = "Batch", type = "o")
[Figure: time series plot of the color property by batch.]

1.3 Random, normal time series
Simulate a completely random process of length 48 with independent, normal values. Plot the
time series plot. Does it look “random”? Repeat this exercise several times with a new simulation
each time.
xyplot(as.ts(rnorm(48)))
xyplot(as.ts(rnorm(48)))
[Figure: two independent simulations of the normal series.]

1.4 Random, χ²-distributed time series
xyplot(as.ts(rchisq(48, 2)))
xyplot(as.ts(rchisq(48, 2)))
[Figure: two independent simulations of the χ²(2) series.]

1.5 t(5)-distributed, random values
Simulate a completely random process of length 48 with independent, t-distributed values each
with 5 degrees of freedom. Construct the time series plot. Does it look “random” and nonnormal?
Repeat this exercise several times with a new simulation each time.
xyplot(as.ts(rt(48, 5)))
[Figure: first simulation of the t(5) series.]
xyplot(as.ts(rt(48, 5)))
[Figure: second simulation of the t(5) series.]
It looks random but not normal, though it should be approximately so, considering the distribution that we
have sampled from.
1.6 Dubuque temperature series
data(tempdub)
xyplot(tempdub, ylab = "Temperature", xlab = "Year")
[Figure: monthly average temperatures in Dubuque, Iowa.]
Chapter 2
Fundamental concepts
2.1 Basic properties of expected value and covariance

Cov[X, Y] = Corr[X, Y] √(Var[X] Var[Y]) = 0.25 × √(9 × 4) = 1.5

Var[X + Y] = Var[X] + Var[Y] + 2 Cov[X, Y] = 9 + 4 + 2 × 1.5 = 16
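These identities are easy to check numerically. The sketch below simulates a pair (X, Y) with the stated moments (Var[X] = 9, Var[Y] = 4, Corr[X, Y] = 0.25); the particular construction of y is an assumption made for illustration only:

```r
# Simulate (X, Y) with Var[X] = 9, Var[Y] = 4, Cov[X, Y] = 1.5 and check
# the identities by Monte Carlo (results are only approximate).
set.seed(1)
n <- 1e6
x <- rnorm(n, sd = 3)                   # Var[X] = 9
y <- x / 6 + rnorm(n, sd = sqrt(3.75))  # Cov[X, Y] = 9/6 = 1.5, Var[Y] = 4
cov(x, y)   # approximately 1.5
var(x + y)  # approximately 9 + 4 + 2 * 1.5 = 16
```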
a
We have that
b
The autocovariance is given by
library(lattice)
tstest <- ts(runif(100))
lattice::xyplot(tstest,
panel = function(x, y, ...) {
panel.abline(h = mean(y), lty = 2)
panel.xyplot(x, y, ...)
})
For k = 1 we have
Figure 2.1: A white noise time series: no drift, independence between observations.
given that all terms are independent. Taken together, we have that
Corr[Yt, Yt−k] = 1 for k = 0, θ/(1 + θ²) for k = 1, and 0 for k > 1.
And, as required,
Corr[Yt, Yt−1] = 3/(1 + 3²) = 3/10 if θ = 3, and (1/3)/(1 + (1/3)²) = (1/3)/(10/9) = 3/10 if θ = 1/3.
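The same numbers fall out of stats::ARMAacf, which uses the matching parameterization Yt = et + θet−1:

```r
# Theoretical autocorrelations of an MA(1) with theta = 3 and theta = 1/3;
# both give rho_1 = 3/10, so the ACF cannot distinguish the two models.
ARMAacf(ma = 3, lag.max = 2)      # lags 0, 1, 2: 1.0, 0.3, 0.0
ARMAacf(ma = 1 / 3, lag.max = 2)  # identical: 1.0, 0.3, 0.0
```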
b
No, probably not. Given that ρ is standardized, we will not be able to detect any difference in the variance
regardless of the values of k.
No: the mean function (µt) is constant and the autocovariance (γt,t−k) is free of t.
µt = E[Yt] = E[Xt] for odd t, and 3 + E[Xt] for even t
because Yt is stationary.
In (a), we discovered that the difference between two stationary processes, ∇Yt, was itself stationary. It follows that the difference between two of these differences, ∇²Yt, is also stationary.
∇ᵐYt = ∇(∇ᵐ⁻¹Yt)
Currently unsolved.
b
First, we have
Corr[Yt, Yt−k] = σt σt−k ρk / √(Var[Yt] Var[Yt−k]) = σt σt−k ρk / (σt σt−k) = ρk,
which depends only on the time lag, k. However, {Yt } is not necessarily stationary since µt may depend on
t.
c
Yes, ρk might be free from t but if σt is not, we will have a non-stationary time series with autocorrelation
free from t and constant mean.
2.11 Drift
Cov[Xt, Xt−k] = γk
E[Xt] = 3t
E[Yt] = E[7 − 3t + Xt] = 7 − 3t + 3t = 7
Cov[Yt, Yt−k] = Cov[7 − 3t + Xt, 7 − 3(t − k) + Xt−k] = Cov[Xt, Xt−k] = γk

Since the mean function of {Yt} is constant (7) and its autocovariance free of t, {Yt} is stationary.
2.12 Periods
Cov[Yt, Yt−12] = Cov[et, et−12] − Cov[et, et−24] − Cov[et−12, et−12] + Cov[et−12, et−24]
              = −Var[et−12] ≠ 0  for k = 12,

Cov[Yt, Yt−k] = Cov[et, et−k] − Cov[et, et−12−k] − Cov[et−12, et−k] + Cov[et−12, et−12−k]
              = 0 + 0 + 0 + 0 = 0  for k ≠ 0, 12,
where
gives us
and
which means that the autocorrelation function γt,s also has to be zero.
The autocorrelation of {Yt} is zero and its mean function is constant, thus {Yt} must be stationary.
Both the covariance and the mean function are zero, hence the process is stationary.
Cov[Yt, Yt−k] = Cov[(−1)ᵗX, (−1)ᵗ⁻ᵏX] = (−1)²ᵗ⁻ᵏ Cov[X, X] = (−1)ᵏ Var[X]
2.17 Variance of sample mean

Var[Ȳ] = Var[(1/n) ∑_{t=1}^n Yt] = (1/n²) Var[∑_{t=1}^n Yt]
       = (1/n²) Cov[∑_{t=1}^n Yt, ∑_{s=1}^n Ys]
       = (1/n²) ∑_{t=1}^n ∑_{s=1}^n γ_{t−s}

Setting k = t − s, j = t gives us

Var[Ȳ] = (1/n²) (∑_{k=1}^{n−1} ∑_{j=k+1}^{n} γ_k + ∑_{k=−n+1}^{0} ∑_{j=1}^{n+k} γ_k)
       = (1/n²) (∑_{k=1}^{n−1} (n − k) γ_k + ∑_{k=−n+1}^{0} (n + k) γ_k)
       = (1/n²) ∑_{k=−n+1}^{n−1} (n − |k|) γ_k
       = (1/n) ∑_{k=−n+1}^{n−1} (1 − |k|/n) γ_k  □
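A simulation sketch agrees with the final expression; here an MA(1) with θ = 0.5 (an arbitrary choice for illustration), for which γ0 = 1.25, γ1 = 0.5, and γk = 0 for k > 1:

```r
# Compare the empirical variance of the sample mean with the formula
# (1/n) * sum_{k=-(n-1)}^{n-1} (1 - |k|/n) * gamma_k for an MA(1) process.
set.seed(42)
n <- 50
ybar <- replicate(2e4, mean(arima.sim(list(ma = 0.5), n)))
theory <- (1.25 + 2 * (1 - 1 / n) * 0.5) / n
c(simulated = var(ybar), theoretical = theory)  # both around 0.045
```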
∑_{t=1}^n (Yt − µ)² = ∑_{t=1}^n ((Yt − Ȳ) + (Ȳ − µ))²
                    = ∑_{t=1}^n ((Yt − Ȳ)² + 2(Yt − Ȳ)(Ȳ − µ) + (Ȳ − µ)²)
                    = ∑_{t=1}^n (Yt − Ȳ)² + 2(Ȳ − µ) ∑_{t=1}^n (Yt − Ȳ) + n(Ȳ − µ)²
                    = ∑_{t=1}^n (Yt − Ȳ)² + n(Ȳ − µ)²  □

b

E[s²] = E[(1/(n − 1)) ∑_{t=1}^n (Yt − Ȳ)²] = (1/(n − 1)) E[∑_{t=1}^n (Yt − µ)² − n(Ȳ − µ)²]
      = (1/(n − 1)) (∑_{t=1}^n E[(Yt − µ)²] − nE[(Ȳ − µ)²])
      = (1/(n − 1)) (nγ0 − nVar[Ȳ])
      = (1/(n − 1)) (nγ0 − γ0 − 2 ∑_{k=1}^{n−1} (1 − k/n) γk)
      = γ0 − (2/(n − 1)) ∑_{k=1}^{n−1} (1 − k/n) γk  □

For white noise, γk = 0 for k ≥ 1, so

E[s²] = γ0 − (2/(n − 1)) ∑_{k=1}^{n−1} (1 − k/n) × 0 = γ0
Y1 = θ0 + e1
Y2 = θ0 + Y1 + e2 = 2θ0 + e2 + e1
⋮
Yt = θ0 + Yt−1 + et = tθ0 + et + et−1 + ⋯ + e1  □
2.20 Random walk
µ1 = E[Y1] = E[e1] = 0
µ2 = E[Y2] = E[Y1 − e2] = E[Y1] − E[e2] = 0 − 0 = 0
⋮
µt = E[Yt] = E[Yt−1 − et] = E[Yt−1] − E[et] = 0,

Var[Y1] = σe²
Var[Y2] = Var[Y1 − e2] = Var[Y1] + Var[e2] = σe² + σe² = 2σe²
⋮
Var[Yt] = Var[Yt−1 − et] = Var[Yt−1] + Var[et] = (t − 1)σe² + σe² = tσe²  □
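The linear growth of the variance is easy to verify by simulation (a sketch with σe² = 1):

```r
# Simulate many random walks and check that Var[Y_t] grows like t * sigma_e^2.
set.seed(123)
t_max <- 200
walks <- replicate(5000, cumsum(rnorm(t_max)))  # one walk per column
var(walks[t_max, ])  # close to t_max = 200
var(walks[50, ])     # close to 50
```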
Corr[Yt, Ys] = (σ0² + tσe²) / √((σ0² + tσe²)(σ0² + sσe²)) = √((σ0² + tσe²)/(σ0² + sσe²))  □
E[Y1 ] = E[e1 ] = 0
E[Y2 ] = E[cY1 + e2 ] = cE[Y1 ] + E[e2 ] = 0
...
E[Yt ] = E[cYt−1 + et ] = cE[Yt−1 ] + E[et ] = 0 □
giving
Corr[Yt, Yt−k] = cᵏ Var[Yt−k] / √(Var[Yt] Var[Yt−k]) = cᵏ √(Var[Yt−k]/Var[Yt])  □
Var[Yt] = σe²(1 + c² + c⁴ + ⋯ + c²ᵗ⁻²) = σe² ∑_{j=0}^{t−1} c²ʲ = σe² (1 − c²ᵗ)/(1 − c²)

And because

lim_{t→∞} σe² (1 − c²ᵗ)/(1 − c²) = σe²/(1 − c²)  since |c| < 1,
Yt = c(cYt−2 + et−1) + et = ⋯ = et + c et−1 + c² et−2 + ⋯ + cᵗ⁻² e2 + (cᵗ⁻¹/√(1 − c²)) e1

Var[Yt] = Var[et] + c² Var[et−1] + c⁴ Var[et−2] + ⋯ + c²⁽ᵗ⁻²⁾ Var[e2] + (c²⁽ᵗ⁻¹⁾/(1 − c²)) Var[e1]
        = σe² (1 + c² + c⁴ + ⋯ + c²⁽ᵗ⁻²⁾ + c²⁽ᵗ⁻¹⁾/(1 − c²))
        = σe² ((1 − c²⁽ᵗ⁻¹⁾)/(1 − c²) + c²⁽ᵗ⁻¹⁾/(1 − c²))
        = σe²/(1 − c²)  □
Furthermore,

E[Y1] = E[e1/√(1 − c²)] = E[e1]/√(1 − c²) = 0
E[Y2] = E[cY1 + e2] = cE[Y1] = 0
⋮
E[Yt] = E[cYt−1 + et] = cE[Yt−1] = 0,
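Numerically, the sample variance of a long simulated AR(1) series settles near σe²/(1 − c²); a sketch with c = 0.8 (an arbitrary choice for illustration):

```r
# Sample variance of a long stationary AR(1) series versus the limiting
# value sigma_e^2 / (1 - c^2) = 1 / (1 - 0.64).
set.seed(7)
c_ <- 0.8
y <- arima.sim(list(ar = c_), n = 1e5)
c(simulated = var(y), limit = 1 / (1 - c_^2))
```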
Since both processes are stationary, their mean functions are constant and their autocovariances free of t; the sum of the two processes therefore also has a constant mean function and an autocovariance free of t, and must be stationary.
E[Yt] = E[β0 + ∑_{i=1}^k (Ai cos(2πfi t) + Bi sin(2πfi t))]
      = β0 + ∑_{i=1}^k (E[Ai] cos(2πfi t) + E[Bi] sin(2πfi t)) = β0

Cov[Yt, Ys] = Cov[∑_{i=1}^k (Ai cos(2πfi t) + Bi sin(2πfi t)), ∑_{j=1}^k (Aj cos(2πfj s) + Bj sin(2πfj s))]
            = ∑_{i=1}^k (Var[Ai] cos(2πfi t) cos(2πfi s) + Var[Bi] sin(2πfi t) sin(2πfi s))
            = ∑_{i=1}^k σi² (cos(2πfi t) cos(2πfi s) + sin(2πfi t) sin(2πfi s))
            = ∑_{i=1}^k σi² cos(2πfi (t − s)) = ∑_{i=1}^k σi² cos(2πfi k), where k = t − s,

using the independence of the Ai and Bi and the identity cos(x − y) = cos x cos y + sin x sin y.
2.26 Semivariogram
Γt,s = ½E[(Yt − Ys)²] = ½E[Yt² − 2Yt Ys + Ys²]
     = ½(E[Yt²] − 2E[Yt Ys] + E[Ys²]) = ½γ0 + ½γ0 − γ|t−s| = γ0 − γ|t−s|,

since Cov[Yt, Ys] = E[Yt Ys] − µt µs = E[Yt Ys] = γ|t−s|  □
Yt − Ys = et + et−1 + ⋯ + e1 − (es + es−1 + ⋯ + e1) = et + et−1 + ⋯ + es+1,  for t > s

Γt,s = ½E[(Yt − Ys)²] = ½Var[et + et−1 + ⋯ + es+1] = ½σe²(t − s)  □
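A quick Monte Carlo check of Γt,s = σe²(t − s)/2 for the random walk (a sketch with t = 20, s = 5, and σe² = 1, all arbitrary choices):

```r
# Estimate the semivariogram of a random walk at (t, s) = (20, 5) and
# compare with sigma_e^2 * (t - s) / 2 = 7.5.
set.seed(5)
w <- replicate(5000, cumsum(rnorm(30)))  # one walk per column
mean((w[20, ] - w[5, ])^2) / 2           # close to 7.5
```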
2.27 Polynomials
Hence, because of the zero mean and covariance free of t, it is a stationary process.
which is free of t.
2.29 Random cosine wave, further
E[Yt] = ∑_{j=1}^m E[Rj] E[cos(2π(fj t + ϕ))] = ∑_{j=1}^m E[Rj] × 0 = 0, via 2.28

γk = ∑_{j=1}^m E[Rj] cos(2πfj k), also from 2.28.
with Jacobian

−2πR = −2π√(X² + Y²),

so that the transformation contributes the factor

1/(2π√(X² + Y²)).

Furthermore,

f(r, Φ) = re^{−r²/2}

and

f(x, y) = √(x² + y²) e^{−(x²+y²)/2} / (2π√(x² + y²)) = (e^{−x²/2}/√(2π)) × (e^{−y²/2}/√(2π))  □
Chapter 3
Trends
3.1 Least squares estimation for linear regression trend

∂Q(β0, β1)/∂β0 = −2 ∑_{t=1}^n (Yt − β0 − β1 t)

Setting this to zero gives

−2 ∑_{t=1}^n (Yt − β0 − β1 t) = 0  ⟹
∑_{t=1}^n Yt − nβ0 − β1 ∑_{t=1}^n t = 0  ⟹
β0 = (∑_{t=1}^n Yt − β1 ∑_{t=1}^n t)/n = Ȳ − β1 t̄

∂Q(β0, β1)/∂β1 = −2 ∑_{t=1}^n t(Yt − β0 − β1 t)

Setting this to 0 as well, multiplying both sides with −1/2 and rearranging results in

β1 ∑_{t=1}^n t² = ∑_{t=1}^n Yt t − β0 ∑_{t=1}^n t

Substituting β0 = Ȳ − β1 t̄ then gives

β1 ∑_{t=1}^n t² = ∑_{t=1}^n Yt t − ((∑_{t=1}^n Yt)/n − β1 (∑_{t=1}^n t)/n) ∑_{t=1}^n t  ⟺
β1 (∑_{t=1}^n t² − (∑_{t=1}^n t)²/n) = ∑_{t=1}^n Yt t − (∑_{t=1}^n Yt)(∑_{t=1}^n t)/n  ⟺
β1 = (n ∑_{t=1}^n Yt t − ∑_{t=1}^n Yt ∑_{t=1}^n t) / (n ∑_{t=1}^n t² − (∑_{t=1}^n t)²) = ∑_{t=1}^n (Yt − Ȳ)(t − t̄) / ∑_{t=1}^n (t − t̄)²  □
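These closed-form estimates match what lm() returns; a sketch on simulated data with arbitrarily chosen true coefficients:

```r
# Compare the derived closed-form OLS estimates with lm() on a linear trend.
set.seed(1)
t <- 1:100
y <- 2 + 0.5 * t + rnorm(100)
b1 <- sum((y - mean(y)) * (t - mean(t))) / sum((t - mean(t))^2)
b0 <- mean(y) - b1 * mean(t)
rbind(closed_form = c(b0, b1), lm = coef(lm(y ~ t)))  # rows agree
```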
3.2 Variance of mean estimator

Var[Ȳ] = Var[µ + (1/n)(en − e0)] = (1/n²)(σe² + σe²) = 2σe²/n²

It is uncommon for the sample size to have such a large impact on the variance of the sample mean.
Setting Yt = µ + et instead gives
Ȳ = (1/n) ∑_{t=1}^n Yt = (1/n) ∑_{t=1}^n (µ + et) = µ + (1/n) ∑_{t=1}^n et

Var[Ȳ] = Var[µ + (1/n) ∑_{t=1}^n et] = 0 + (1/n²) × nσe² = σe²/n.
3.3 Variance of mean estimator #2

Var[Ȳ] = (1/n²)(σe² + σe² + 4(n − 1)σe²) = (2/n²)(2n − 1)σe²

Setting Yt = µ + et instead gives the result from 3.2. We note that for large n the variance is approximately four times larger with Yt = µ + et + et−1.
3.4 Hours
library(TSA)
data("hours")
xyplot(hours)
Figure 3.1: Monthly values of the average hours worked per week in the U.S. manufacturing sector.
In Figure 3.1 we see a steep incline between 1983 and 1984. There also appears to be a seasonal trend, with generally longer work hours later in the year apart from the summer; 1984, however, does not exhibit as clear a pattern.
months <- c("J", "A", "S", "O", "N", "D", "J", "F", "M", "A", "M", "J")
Here, in Figure 3.2, our interpretation is largely the same. It is clear that December stands out as the month with the longest weekly work hours whilst February and January are low points, demonstrating a clear seasonal pattern.
3.5 Wages
data("wages")
xyplot(wages, panel = function(x, y, ...) {
  panel.xyplot(x, y, ...)
})
Figure 3.2: Monthly values of average hours worked per week with superposed initials of months.
There is a positive trend with seasonality: August is a low point for wages. Generally, there seem to be larger increases in the fall.
##
## Call:
## lm(formula = wages ~ time(wages))
##
## Residuals:
## Min 1Q Median 3Q Max
## -0.2383 -0.0498 0.0194 0.0585 0.1314
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -5.49e+02 1.11e+01 -49.2 <2e-16 ***
## time(wages) 2.81e-01 5.62e-03 50.0 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
3.5. WAGES 35
9.5
A
DJFM MJ
J N
DJ FMAMJ SO
A
N
J SO
AMJ A
JFM
9.0
D
N
O
J S
MJ A
J FMA
D
N
J SO
8.5
F
J MAMJ A
D
N
J SO
J A
M
J A
FM
8.0
D
N
SO
JA
Time
Figure 3.3: Monthly average hourly wages for workers in the U.S. apparel and textile industry.
The residuals still seem to show time-related autocorrelation rather than white noise.
##
## Call:
## lm(formula = wages ~ time(wages) + I(time(wages)^2))
##
## Residuals:
[Figure: studentized residuals over time.]
This looks more like random noise, but there is still clear autocorrelation in the residuals that we have yet to capture in our model.
[Figure: studentized residuals over time.]

3.6 Beer sales
data(beersales)
xyplot(beersales)
Clear seasonal trends. There is an initial positive trend from 1975 to around 1981 that then levels out.
months <- c("J", "F", "M", "A", "M", "J", "J", "A", "S", "O", "N", "D")
xyplot(beersales,
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.text(x, y, labels = months)
})
It is now evident that the peaks are in the warm months and the slump in the winter and fall months.
December is a particular low point, while May, June, and July seem to be the high points.
[Figure: monthly U.S. beer sales.]
Figure 3.7: Monthly U.S. beer sales annotated with the months’ initials.
All comparisons are made against January. The model explains approximately 0.71 of the variance and is statistically significant. Most of the factors are significant (mostly the winter months, as expected). Looking at the residuals in Figure 3.8, we see that we don't have a good fit to our data; in particular, we're not capturing the long-term trend.
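The seasonal-means-plus-trend construction discussed here can be sketched on a simulated monthly series; base R's factor(cycle(y)) plays the role of TSA's season() month factor (the simulated series is an assumption for illustration, not the beersales data):

```r
# Seasonal-means + linear-trend regression on a simulated monthly series.
set.seed(2)
n <- 192  # 16 years of monthly data
y <- ts(rep(sin(2 * pi * (1:12) / 12), n / 12) + 0.01 * (1:n) +
          rnorm(n, sd = 0.2), frequency = 12)
fit <- lm(y ~ factor(cycle(y)) + time(y))  # month factor + linear trend
summary(fit)$r.squared  # high, since season and trend carry most variation
```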
[Figure 3.8: studentized residuals from the seasonal-means model of beer sales.]
This model fits the data better, explaining roughly 0.91 of the variance.
Figure 3.9: Beer sales residual plot from the quadratic fit.
3.7 Winnebago

[Figures: time series plot of the winnebago sales series and studentized residuals from the linear fit.]
Figure 3.11: Residuals for the linear fit for the winnebago data.
The fit is poor (Figure 3.11). The residuals are not random and it is clear that we're making worse predictions for later years.
To produce a better fit, we transform the outcome with the natural logarithm.
This looks more like random noise (Figure ??). Values still cling together somewhat, but it is certainly better than the linear model. We're still systematically overpredicting the values for some months, however.
The fit is improved further. We have an R² of 0.89 and significance for most of our seasonal means as well as the time trend.
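The log transformation turns an exponential trend into a linear one, so a trend fit on the log scale recovers the growth rate; a sketch on simulated data (the chapter itself fits log(winnebago); the 5% growth rate below is an arbitrary choice):

```r
# Fit a linear trend to the log of an exponentially growing series and
# recover the assumed growth rate of 5% per period.
set.seed(3)
t <- 1:72
sales <- exp(0.05 * t + rnorm(72, sd = 0.3))
fit_log <- lm(log(sales) ~ t)
coef(fit_log)[["t"]]  # close to 0.05
```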
[Figure: studentized residuals with month labels.]
This is acceptable even if our residuals are quite large for some of the values, notably at the start of the
series.
[Figure: studentized residuals with month labels.]
3.8 Retail
data(retail)
xyplot(retail, panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.xyplot(x, y, pch = as.vector(season(retail)), col = 1)
})
Plotting the retail sales series, there seems to be a long-term linear trend as well as heavy seasonality, in that December (and to a slighter extent also November and October) exhibits regular surges in retail sales.
This seems like an effective model, explaining 0.98 of the variance in retail sales.
The residual plot (Figure 3.14) tells a different story: we're underpredicting values for the early period and overpredicting values for the later years; however, this should be an easy fix.
3.9 Prescriptions
data(prescrip)
xyplot(prescrip, ylab = "Prescription costs",
panel = function(x, y, ...) {
panel.xyplot(x, y, ...)
panel.xyplot(x, y, pch = as.vector(season(prescrip)), col = 1)
})
Figure 3.15 shows a clear, smooth, and cyclical seasonal trend. Values are generally higher for the summer months and there seems to be an exponential increase long-term.
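A smooth cyclical pattern like this suits a cosine (harmonic) trend; TSA's harmonic() amounts to regressing on cos(2πt) and sin(2πt) at the annual frequency. A self-contained sketch on simulated data (the amplitude 2 and phase 1 are arbitrary choices for illustration):

```r
# Harmonic regression: recover the amplitude of a cosine trend from the
# fitted coefficients of cos(2*pi*t) and sin(2*pi*t).
set.seed(4)
t <- seq(1, 6, by = 1 / 12)  # six "years" of monthly observations
y <- 20 + 2 * cos(2 * pi * t + 1) + rnorm(length(t), sd = 0.3)
fit_cos <- lm(y ~ cos(2 * pi * t) + sin(2 * pi * t))
sqrt(sum(coef(fit_cos)[-1]^2))  # estimated amplitude, close to 2
```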
[Figure: monthly retail sales with superposed month initials.]
Figure 3.14: Studentized residuals for our seasonality + linear model of retail sales.
[Figure 3.15: monthly prescription costs with superposed month initials.]
[Figure: percentage change (pchange) in prescription costs over time.]
[Figure: studentized residuals from the cosine-trend model (pres_cos).]