Characteristics of Time Series
Process/Objectives
1. Description
• Graphical/nonparametric methods (e.g. the sample autocorrelation).
• Suggest possible probability models.
• Identify "unusual" observations.
• Classify as stationary/nonstationary stochastic processes.
2. Modelling and Inference
• Use data to build a model.
• Check model fit.
3. Forecasting and Prediction
• Forecast future values of the series.
Graphical Methods
Preliminary goal: describe the overall structure of the data; we also want to identify the memory type of the time series.
1. Short-Memory
• The immediate past may give some information about the immediate future, but less information about the "long-term" future.
2. Long-Memory
• The past gives more information about the "long-term" future.
• Examples: series with polynomial trends or cycles/seasonality.
Examples of Time Series Models
1. Bernoulli Trials
• $x_t = \begin{cases} 1 & \text{w.p. } p \\ -1 & \text{w.p. } 1-p \end{cases}$.
2. Random Walk
• Start with $w_t \sim \text{iid}$.
• Set $x_0 = 0$. Let $x_t = w_1 + w_2 + \cdots + w_t$ by cumulatively summing (or "integrating") the $w_t$'s.
• It can be expressed as $x_t = x_{t-1} + w_t$.
• Can also add a drift: $x_t = \delta + x_{t-1} + w_t$ (see the simulation sketch after this list).
3. Linear Trend
• $x_t = a_0 + a_1 t + w_t$ where $w_t \sim \text{iid}$.
• $a_0$ and $a_1$ are constants that can be estimated through least squares.
• Plot the residuals $e_t = x_t - \hat a_0 - \hat a_1 t$. They should look like the $w_t$'s.
4. Seasonal Model
• Example: series affected by the weather.
• May need to include a periodic component with some fixed known period $T$.
• $x_t = a_0 + a_1 \cos(2\pi\nu t) + a_2 \sin(2\pi\nu t) + w_t$ where $w_t \sim \text{iid}$ and $\nu = \frac{1}{T}$ is a fixed frequency.
5. White Noise
• $w_t \sim WN(0, \sigma_w^2)$, i.e. $E(w_t) = 0$, $\text{var}(w_t) = \sigma_w^2$, and the $w_t$'s are uncorrelated.
6. First-Order Moving Average or MA(1) Process
• $x_t = w_t + \theta w_{t-1}$, $t = 0, \pm 1, \ldots$, with $w_t \sim WN(0, \sigma_w^2)$.
• Then $E(x_t) = 0$ and $\text{var}(x_t) = \sigma_w^2 (1 + \theta^2)$.
7. First-Order Autoregressive or AR(1) Process
• $x_t = \phi x_{t-1} + w_t$, $t = 0, \pm 1, \ldots$, with $w_t \sim WN(0, \sigma_w^2)$.
• Special case: If $x_t \sim \text{iid } N(0, 1)$, then $F(c_1, \ldots, c_n) = \prod_{t=1}^{n} \Phi(c_t)$, where $\Phi(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{x} e^{-z^2/2}\,dz$.
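Most of these models are easy to simulate. A minimal sketch, assuming numpy and illustrative parameter values (drift $\delta = 0.2$, $a_0 = 1$, $a_1 = 0.5$, $\theta = 0.8$, $\phi = 0.9$), not values from the notes:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
w = rng.standard_normal(n)              # w_t ~ iid N(0, 1)

x_rw = np.cumsum(w)                      # random walk: x_t = x_{t-1} + w_t
x_drift = np.cumsum(0.2 + w)             # random walk with drift delta = 0.2
t = np.arange(1, n + 1)
x_trend = 1.0 + 0.5 * t + w              # linear trend: a0 = 1, a1 = 0.5
x_ma1 = w + 0.8 * np.r_[0.0, w[:-1]]     # MA(1): x_t = w_t + 0.8 w_{t-1}

x_ar1 = np.zeros(n)                      # AR(1): x_t = 0.9 x_{t-1} + w_t
for i in range(1, n):
    x_ar1[i] = 0.9 * x_ar1[i - 1] + w[i]
```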
The mean function is defined by $\mu_x(t) = E(x_t) = \int_{-\infty}^{\infty} x f_t(x)\,dx$, where $f_t(x) = \frac{\partial F_t(x)}{\partial x}$ and $F_t(x) = P(x_t \le x)$.
Remark
If $x_t = \beta_0 + \beta_1 x_s$, then $\beta_1 > 0 \Rightarrow \rho(s, t) = 1$ and $\beta_1 < 0 \Rightarrow \rho(s, t) = -1$.
Remark
• Strict stationarity implies weak stationarity, but the converse is not true in general.
• Special case: If $\{x_t\}$ is Gaussian, then weak stationarity implies strict stationarity.
Notation
In a stationary time series, write $\gamma(t+h, t) \stackrel{\text{def}}{=} \gamma(h)$.
Remark
For a linear process $x_t = \sum_{j=-\infty}^{\infty} \psi_j w_{t-j}$, one can show that $\gamma(h) = \sigma_w^2 \sum_{j=-\infty}^{\infty} \psi_j \psi_{j+h}$.
ESTIMATION OF AUTOCORRELATION
Assumptions
$x_1, \ldots, x_n$ are data points (one realization) of a stationary process.
Note
$\nabla x_t$ is an example of a linear filter. Differencing plays a central role in ARIMA modelling.
General Setup
$x_t = f_t + y_t$ where $f_t$ is a smooth function of time and $y_t$ is stationary. A kernel-smoothing estimate of the trend is
$\hat f_t = \sum_{i=1}^{n} w_i(t)\, x_i$, where $w_i(t) = \dfrac{k\left(\frac{t-i}{b}\right)}{\sum_{j=1}^{n} k\left(\frac{t-j}{b}\right)}$, $b$ is a bandwidth, and $k(z) = \frac{1}{\sqrt{2\pi}} e^{-z^2/2}$ is the Gaussian kernel.
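A sketch of the Gaussian kernel smoother above, assuming numpy; the bandwidth value is an illustrative choice, not one from the notes:

```python
import numpy as np

def kernel_smooth(x, b=5.0):
    """Estimate f_t = sum_i w_i(t) x_i with normalized Gaussian kernel weights."""
    n = len(x)
    t = np.arange(n)
    f_hat = np.empty(n)
    for j in range(n):
        k = np.exp(-((t - j) / b) ** 2 / 2) / np.sqrt(2 * np.pi)  # k((t - i)/b)
        f_hat[j] = np.sum(k * x) / np.sum(k)                       # normalized weights
    return f_hat
```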
ARIMA Models
AUTOREGRESSIVE MOVING AVERAGE (ARMA) MODELS
Previously we examined a classical regression approach, which is not sufficient for all time series encountered in practice.
Note
• $E(x_t) = \mu$.
• Also, $x_t = \alpha + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + w_t$ where $\alpha = \mu(1 - \phi_1 - \cdots - \phi_p)$.
Note
The AR(p) model can be written as $(1 - \phi_1 B - \cdots - \phi_p B^p) x_t = w_t$, or $\phi(B) x_t = w_t$.
• If $|\phi| < 1$ and $x_t$ is stationary, then $\lim_{k \to \infty} E\left[\left(x_t - \sum_{j=0}^{k-1} \phi^j w_{t-j}\right)^2\right] = 0$ (convergence in mean square), and so $x_t = \sum_{j=0}^{\infty} \phi^j w_{t-j}$ and $E(x_t) = \sum_{j=0}^{\infty} \phi^j E(w_{t-j}) = 0$.
• $\gamma(h) = \text{cov}(x_{t+h}, x_t) = \dfrac{\sigma_w^2 \phi^h}{1 - \phi^2}$, $h \ge 0$, and $\rho(h) = \dfrac{\gamma(h)}{\gamma(0)} = \phi^h$, $h \ge 0$ (compare with the sketch below).
• If $\phi > 0$, smooth path; if $\phi < 0$, choppy path.
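A quick numerical check of $\rho(h) = \phi^h$ against the sample ACF of a simulated path; $\phi = 0.9$ and $n = 500$ are illustrative choices:

```python
import numpy as np

def sample_acf(x, max_lag):
    x = x - x.mean()
    n = len(x)
    gamma0 = np.sum(x * x) / n
    return np.array([np.sum(x[h:] * x[:n - h]) / n / gamma0
                     for h in range(max_lag + 1)])

rng = np.random.default_rng(1)
phi, n = 0.9, 500
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.standard_normal()

print(sample_acf(x, 5))        # sample ACF at lags 0..5
print(phi ** np.arange(6))     # theoretical rho(h) = phi^h
```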
Note
The MA(q) process can be written as $x_t = \theta(B) w_t$.
Definition: ARMA(p, q)
A time series $\{x_t\}$ is ARMA(p, q) if it is stationary and $x_t = \alpha + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + w_t + \theta_1 w_{t-1} + \cdots + \theta_q w_{t-q}$, where $w_t \sim \text{iid } N(0, \sigma_w^2)$ and $\alpha = \mu(1 - \phi_1 - \cdots - \phi_p)$.
Note: We can write $\phi(B) x_t = \theta(B) w_t$.
Definition: Causal
An ARMA(p, q) model $\phi(B) x_t = \theta(B) w_t$ is said to be causal if $\{x_t\}$ can be written as a linear process $x_t = \sum_{j=0}^{\infty} \psi_j w_{t-j} = \psi(B) w_t$, where $\psi(B) = \sum_{j=0}^{\infty} \psi_j B^j$ and $\sum_{j=0}^{\infty} |\psi_j| < \infty$.
• Example: for AR(1), the root of $\phi(z) = 1 - \phi z$ is $z = 1/\phi$, which is greater than 1 in absolute value if and only if $|\phi| < 1$.
• $E(x_t) = \sum_{j=0}^{q} \theta_j E(w_{t-j}) = 0$.
• $\gamma(h) = \text{cov}(x_{t+h}, x_t) = E\left[\left(\sum_{j=0}^{q} \theta_j w_{t+h-j}\right)\left(\sum_{j=0}^{q} \theta_j w_{t-j}\right)\right] = \begin{cases} \sigma_w^2 \sum_{j=0}^{q-h} \theta_j \theta_{j+h} & 0 \le h \le q \\ 0 & h > q \end{cases}$ (with $\theta_0 = 1$).
• $\rho(h) = \dfrac{\gamma(h)}{\gamma(0)} = \begin{cases} \dfrac{\sum_{j=0}^{q-h} \theta_j \theta_{j+h}}{\sum_{j=0}^{q} \theta_j^2} & 0 \le h \le q \\ 0 & h > q \end{cases}$ (see the helper sketch below).
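A small helper implementing the MA(q) ACF formula above; $\theta_0 = 1$ is the usual convention, and the example coefficient is illustrative:

```python
import numpy as np

def ma_acf(theta, max_lag):
    """rho(h) = sum_j theta_j theta_{j+h} / sum_j theta_j^2 for h <= q, else 0."""
    th = np.r_[1.0, np.asarray(theta, dtype=float)]   # (theta_0, ..., theta_q)
    q = len(th) - 1
    denom = np.sum(th ** 2)
    rho = np.zeros(max_lag + 1)
    for h in range(min(q, max_lag) + 1):
        rho[h] = np.sum(th[: q - h + 1] * th[h:]) / denom
    return rho

print(ma_acf([0.8], 3))   # MA(1): rho(1) = 0.8 / (1 + 0.8^2), zero beyond lag 1
```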
Example: ARMA(1, 1)
For an ARMA(1, 1) process, $\gamma(h) = \dfrac{\sigma_w^2 (1 + \theta\phi)(\phi + \theta)}{1 - \phi^2}\,\phi^{h-1}$, $h \ge 1$, and $\rho(h) = \dfrac{(1 + \theta\phi)(\phi + \theta)}{1 + 2\theta\phi + \theta^2}\,\phi^{h-1}$, $h \ge 1$.
Yule-Walker Estimation
• $\begin{bmatrix} \gamma(0) & \cdots & \gamma(p-1) \\ \vdots & \ddots & \vdots \\ \gamma(p-1) & \cdots & \gamma(0) \end{bmatrix} \begin{bmatrix} \phi_1 \\ \vdots \\ \phi_p \end{bmatrix} = \begin{bmatrix} \gamma(1) \\ \vdots \\ \gamma(p) \end{bmatrix}$, or $\Gamma_p \boldsymbol{\phi} = \boldsymbol{\gamma}_p$, and $\sigma_w^2 = \gamma(0) - \boldsymbol{\phi}^T \boldsymbol{\gamma}_p$.
• These equations can be used to determine $\gamma(0), \ldots, \gamma(p)$ from $\boldsymbol{\phi}$ and $\sigma_w^2$.
• Conversely, we can substitute $\hat\gamma(h)$, $h = 0, \ldots, p$, into the equations to estimate $\boldsymbol{\phi}$ and $\sigma_w^2$ (see the sketch below).
• Alternatively, minimize $\sum_{t=p+1}^{n} f(x_t - \phi_1 x_{t-1} - \cdots - \phi_p x_{t-p})$ where $f$ is some function with $f(x) \to \infty$ as $|x| \to \infty$.
• Examples of $f$: $f(x) = x^2$ (least squares), $f(x) = |x|$.
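A sketch of Yule-Walker estimation for AR(p): substitute sample autocovariances $\hat\gamma(h)$ into the equations and solve for $\boldsymbol{\phi}$ and $\sigma_w^2$ (assumes numpy and a 1-D array of observations):

```python
import numpy as np

def yule_walker(x, p):
    x = x - x.mean()
    n = len(x)
    gamma = np.array([np.sum(x[h:] * x[:n - h]) / n for h in range(p + 1)])
    Gamma = np.array([[gamma[abs(i - j)] for j in range(p)] for i in range(p)])
    phi = np.linalg.solve(Gamma, gamma[1:])       # solve Gamma_p phi = gamma_p
    sigma2 = gamma[0] - phi @ gamma[1:]           # sigma_w^2 = gamma(0) - phi' gamma_p
    return phi, sigma2
```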
Maximum Likelihood
• Assume $w_t \sim \text{iid } N(0, \sigma_w^2)$.
• Use $x_1, \ldots, x_n$ to obtain the likelihood function, the joint density of $x_1, \ldots, x_n$, which is a function of the unknown parameters.
PARTIAL AUTOCORRELATION (PACF)
Define $\phi_{00} = 1$ and, for $h \ge 1$, $\phi_{hh}$ via $\begin{bmatrix} \rho(1) \\ \vdots \\ \rho(h) \end{bmatrix} = \begin{bmatrix} \rho(0) & \cdots & \rho(h-1) \\ \vdots & \ddots & \vdots \\ \rho(h-1) & \cdots & \rho(0) \end{bmatrix} \begin{bmatrix} \phi_{h1} \\ \vdots \\ \phi_{hh} \end{bmatrix}$, or $\boldsymbol{\phi}_h = R_h^{-1} \boldsymbol{\rho}_h$ (Yule-Walker). $\phi_{hh}$ is the partial autocorrelation at lag $h$.
Note: This is the correlation between $x_t$ and $x_{t+h}$ with the dependence on $x_{t+1}, \ldots, x_{t+h-1}$ removed.
Lemma
If $x_t$ is an AR(p) process, then $\phi_{hh} = 0$ for all $h > p$.
Procedure
Choose $p$ such that $\hat\phi_{p+1, p+1}, \hat\phi_{p+2, p+2}, \ldots$ are close to zero (statistically zero).
• We have $\begin{bmatrix} \gamma(0) & \cdots & \gamma(n-1) \\ \vdots & \ddots & \vdots \\ \gamma(n-1) & \cdots & \gamma(0) \end{bmatrix} \begin{bmatrix} \phi_1 \\ \vdots \\ \phi_n \end{bmatrix} = \begin{bmatrix} \gamma(1) \\ \vdots \\ \gamma(n) \end{bmatrix}$, or $\Gamma_n \boldsymbol{\phi}_n = \boldsymbol{\gamma}_n$.
ARMA MODELLING
When do we fit an ARMA(p, q) model as opposed to a pure AR(p) or MA(q) model?
• Sample correlations are the key.
• The data appear to be stationary with no obvious trends (polynomial or seasonal), and the sample ACF $\hat\rho(h)$ decays fairly rapidly (geometrically).
• However, the sample ACF and PACF do not "cut off" after some small lag, i.e. $\hat\rho(h)$ and $\hat\phi_{hh}$ are not "small" for moderate $h$.
Diagnostics
• Observations: $x_1, \ldots, x_n$.
• Fit an ARMA(p, q) model, i.e. choose the model order and obtain the MLEs of the parameters $\hat{\boldsymbol{\phi}}, \hat{\boldsymbol{\theta}}, \hat\sigma_w^2$.
• Then obtain the empirical predicted values $\hat x_t$, $t = 1, \ldots, n$, which depend upon the MLEs.
• Define the residuals $e_t = \dfrac{x_t - \hat x_t}{\sqrt{\hat\sigma_w^2\, \hat r_{t-1}}}$, $t = 1, \ldots, n$, where $r_{t-1} = \dfrac{E[(x_t - \hat x_t)^2]}{\sigma_w^2} = \dfrac{P_{t-1}}{\sigma_w^2}$. The $e_t$'s should behave like white noise (see the sketch below).
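A self-contained sketch of a white-noise check on the residuals via the Ljung-Box statistic; $H = 10$ lags and the chi-squared reference distribution are conventional choices, not values from the notes:

```python
import numpy as np
from scipy.stats import chi2

def ljung_box(e, H=10, fitted_params=0):
    """Ljung-Box Q statistic and p-value for residuals e_1, ..., e_n."""
    e = e - e.mean()
    n = len(e)
    gamma0 = np.sum(e * e) / n
    rho = np.array([np.sum(e[h:] * e[:n - h]) / n / gamma0
                    for h in range(1, H + 1)])
    Q = n * (n + 2) * np.sum(rho ** 2 / (n - np.arange(1, H + 1)))
    pval = chi2.sf(Q, df=H - fitted_params)   # df reduced by # of ARMA parameters
    return Q, pval
```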
ARIMA MODELS
Definition
Let $d$ be a nonnegative integer. $\{x_t\}$ is an ARIMA(p, d, q) process if $y_t = (1 - B)^d x_t$ is an ARMA(p, q) process.
Note
Don't over-difference. If $x_t$ is a stationary ARMA(p, q) process, then $\nabla x_t$ is ARMA(p, q + 1).
Note
Recall the conditions for an ARMA(p, q) model to be stationary and causal: $\phi(B) x_t = \theta(B) w_t$, $w_t \sim WN(0, \sigma_w^2)$. $x_t$ is stationary if and only if $\phi(z) \ne 0$ for all $|z| = 1$, i.e. $\phi(z)$ has no unit root. If $\phi(z) \ne 0$ for all $|z| \le 1$, then $x_t$ is causal and $x_t = \sum_{j=0}^{\infty} \psi_j w_{t-j}$ (an MA($\infty$) representation).
Note
ARIMA(p, d, q): $\phi(B)(1 - B)^d x_t = \theta(B) w_t$, or equivalently, $\phi^*(B) x_t = \theta(B) w_t$. $\phi^*(z)$ has a root (of order $d$) at $z = 1$ since $\phi^*(z) = \phi(z)(1 - z)^d$.
• Very specific type of non-stationarity.
• Useful for slowly decaying sample ACF's where the data has a polynomial trend.
Note
• $d$ and $D$ are nonnegative integers (the ordinary and seasonal differencing orders in the SARIMA model below).
• Usually $d = 1$ and $D = 1$ is sufficient.
Note
$x_t$ is expressed as $\phi(B)\Phi(B^s)(1 - B)^d (1 - B^s)^D x_t = \theta(B)\Theta(B^s) w_t$.
• $(1 - B)^d$ handles the trend or low-frequency component.
• $(1 - B^s)^D$ handles the seasonal component.
• SARIMA models both polynomial trend and seasonality (periodicity) at the same time.
Note
• Models become complicated very quickly, making interpretation difficult.
• Potentially many models to assess for a given set of data.
• Usually low-order models suffice: $d, D \in \{0, 1\}$, $p, q \le 2$.
Strategy A
1. Difference data so it appears stationary.
2. Assume $P, Q = 0$; choose $p$ and $q$ using AIC, as in the ARMA case.
3. Carry out diagnostics, i.e. white noise checks.
Strategy B
1. Difference data so it appears stationary.
2. Assume $p = P = 0$; choose $q$ and $Q$ by looking at the sample ACF $\hat\rho(1), \hat\rho(2), \ldots$ and $\hat\rho(s), \hat\rho(2s), \ldots$, and choose $q$ and $Q$ to be the "cut-off" points.
3. Carry out diagnostics, i.e. white noise checks.
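A sketch of Strategy A; using statsmodels is an assumption (the notes do not name a software package), and the differencing orders $d = D = 1$ and the small $(p, q)$ search grid are illustrative:

```python
from statsmodels.tsa.statespace.sarimax import SARIMAX

def fit_by_aic(y, s=12, d=1, D=1, max_pq=2):
    """Fit SARIMA models over a small (p, q) grid and keep the lowest AIC."""
    best = None
    for p in range(max_pq + 1):
        for q in range(max_pq + 1):
            res = SARIMAX(y, order=(p, d, q),
                          seasonal_order=(0, D, 0, s)).fit(disp=False)
            if best is None or res.aic < best[0]:
                best = (res.aic, p, q, res)
    return best   # then check the residuals for whiteness (diagnostics)
```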
Spectral Analysis
Consider a stationary series built from sinusoids with random amplitudes: $x_t = \sum_{k=1}^{q} \left[U_{k1} \cos(2\pi\nu_k t) + U_{k2} \sin(2\pi\nu_k t)\right]$, where $U_{k1}$ and $U_{k2}$ are independent random variables with $E(U_{k1}) = E(U_{k2}) = 0$ and $\text{var}(U_{k1}) = \text{var}(U_{k2}) = \sigma_k^2$.
Notation
• $\nu$ is frequency: cycles per unit time.
• $T$ is period: time per cycle, $T = \frac{1}{\nu}$.
Example
$x_t = A \cos(2\pi\nu t + \phi)$ where $A$ and $\phi$ are independent random variables. We can rewrite $x_t$ in regression form as $x_t = U_1 \cos(2\pi\nu t) + U_2 \sin(2\pi\nu t)$ where $U_1 = A\cos\phi$ and $U_2 = -A\sin\phi$.
PERIODOGRAM
Identify possible periodicities or cycles in the data.
Note: $\mathcal{F}_n \subseteq \left[-\frac{1}{2}, \frac{1}{2}\right]$, where $\mathcal{F}_n$ denotes the set of Fourier frequencies $\nu_k = \frac{k}{n}$.
Let $e_k = \frac{1}{\sqrt{n}} \begin{bmatrix} e^{2\pi i \nu_k} \\ \vdots \\ e^{2\pi i n \nu_k} \end{bmatrix}$, $k = -\left\lfloor \frac{n-1}{2} \right\rfloor, \ldots, \left\lfloor \frac{n}{2} \right\rfloor$, where $\nu_k = \frac{k}{n}$. It can be shown that $\{e_k\}$ is an orthonormal basis for $\mathbb{C}^n$. Any vector $x \in \mathbb{C}^n$ can be written as $x = \sum_{k=-\lfloor (n-1)/2 \rfloor}^{\lfloor n/2 \rfloor} a_k e_k$, where $a_k = \frac{1}{\sqrt{n}} \sum_{t=1}^{n} x_t e^{-2\pi i \nu_k t}$.
$\{a_k\}$ is called the discrete Fourier transform of the sequence $\{x_1, \ldots, x_n\}$ (think of $\{x_t\}$ as the sample).
Definition: Periodogram
Define $I_n(\nu) = \frac{1}{n}\left|\sum_{t=1}^{n} x_t e^{-2\pi i \nu t}\right|^2$. If $\nu_k = \frac{k}{n}$, then the periodogram is defined to be
$I_n(\nu_k) = \frac{1}{n}\left|\sum_{t=1}^{n} x_t e^{-2\pi i \nu_k t}\right|^2 = |a_k|^2 = \frac{1}{n}\left(\sum_{t=1}^{n} x_t \cos(2\pi\nu_k t)\right)^2 + \frac{1}{n}\left(\sum_{t=1}^{n} x_t \sin(2\pi\nu_k t)\right)^2.$
Note
• This measures the dependence (correlation) of the data $x_1, \ldots, x_n$ on sinusoids oscillating at a given frequency $\nu_k$.
• If the data contain strong periodic behaviour corresponding to a frequency $\nu_k$, then $I_n(\nu_k)$ will be large.
Note
It can be shown that if $x_1, \ldots, x_n \in \mathbb{R}$ and $\nu_k = \frac{k}{n} \ne 0$ is a Fourier frequency, then $I_n(\nu_k) = \sum_{h=-n+1}^{n-1} \hat\gamma(h)\, e^{-2\pi i \nu_k h}$, where $\hat\gamma(h)$ is the sample ACV function.
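A sketch computing $I_n(\nu_k)$ at the Fourier frequencies with the FFT, assuming numpy; np.fft.fft uses $e^{-2\pi i k t / n}$ with $t = 0, \ldots, n-1$, which differs from the definition above only by a unit-modulus phase, so $|a_k|^2$ is unchanged:

```python
import numpy as np

def periodogram(x):
    n = len(x)
    a = np.fft.fft(x) / np.sqrt(n)      # a_k = n^{-1/2} sum_t x_t e^{-2 pi i k t / n}
    I = np.abs(a) ** 2                   # I_n(nu_k) = |a_k|^2
    freqs = np.arange(n) / n             # nu_k = k/n
    return freqs[: n // 2 + 1], I[: n // 2 + 1]   # keep nu_k in [0, 1/2]
```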
SPECTRAL DENSITIES
Definition: Spectral Density
Suppose $x_t$ is a zero-mean stationary time series with an absolutely summable ACV function $\gamma(\cdot)$, i.e. $\sum_{h=-\infty}^{\infty} |\gamma(h)| < \infty$. The spectral density of $x_t$ is the function $f(\nu) = \sum_{h=-\infty}^{\infty} \gamma(h)\, e^{-2\pi i \nu h}$, $\nu \in \left[-\frac{1}{2}, \frac{1}{2}\right]$.
Note
• Compare $f(\nu)$ with $I_n(\nu)$; the periodogram is used to estimate the spectral density.
SPECTRAL REPRESENTATION OF $x_t$
Theorem
Suppose $x_t$ is a zero-mean stationary process with ACV $\gamma$ and spectral distribution function $F$. Then there exist two continuous real-valued processes $C(\nu)$ and $S(\nu)$, $-\frac{1}{2} \le \nu \le \frac{1}{2}$, with
• $\text{cov}(C(\nu_1), S(\nu_2)) = 0$, $\forall \nu_1, \nu_2$,
• $\text{var}(C(\nu_2) - C(\nu_1)) = \text{var}(S(\nu_2) - S(\nu_1)) = F(\nu_2) - F(\nu_1)$ for $\nu_1 \le \nu_2$,
• $\text{cov}(C(\nu_4) - C(\nu_3), C(\nu_2) - C(\nu_1)) = \text{cov}(S(\nu_4) - S(\nu_3), S(\nu_2) - S(\nu_1)) = 0$ for disjoint intervals $(\nu_1, \nu_2]$ and $(\nu_3, \nu_4]$,
such that
$x_t = \lim_{n \to \infty} \sum_{j=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} \cos\left(\frac{2\pi j t}{n}\right)\left[C\left(\frac{j}{n}\right) - C\left(\frac{j-1}{n}\right)\right] + \lim_{n \to \infty} \sum_{j=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} \sin\left(\frac{2\pi j t}{n}\right)\left[S\left(\frac{j}{n}\right) - S\left(\frac{j-1}{n}\right)\right].$
This is equivalent to $x_t = \int_{-1/2}^{1/2} e^{-2\pi i \nu t}\, dz(\nu)$, where $z(\nu)$ is a complex-valued stochastic process on $\left[-\frac{1}{2}, \frac{1}{2}\right]$ having stationary uncorrelated increments and $\text{var}(z(\nu_2) - z(\nu_1)) = F(\nu_2) - F(\nu_1)$ for $\nu_1 \le \nu_2$.
Riemann sums: $x_t \approx \sum_{k=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} A_k \cos(2\pi\nu_k t) + \sum_{k=-\lfloor n/2 \rfloor}^{\lfloor n/2 \rfloor} B_k \sin(2\pi\nu_k t)$, and hence
$\text{var}(x_t) = \sum_k \cos^2(2\pi\nu_k t)\,\text{var}(A_k) + \sum_k \sin^2(2\pi\nu_k t)\,\text{var}(B_k) \approx \int_{-1/2}^{1/2} f(\nu)\, d\nu.$
So the spectral density $f$ provides the decomposition of $\text{var}(x_t)$ across frequencies.
FILTERING
Theorem: Filtering Theorem
Let $x_t$ be a stationary time series with $E(x_t) = 0$ and spectral density $f_x(\nu)$. Let $y_t = \sum_{j=-\infty}^{\infty} \psi_j x_{t-j}$ where $\sum_{j=-\infty}^{\infty} |\psi_j| < \infty$. Then $y_t$ is stationary with $E(y_t) = 0$ and
$f_y(\nu) = \left|\psi(e^{-2\pi i \nu})\right|^2 f_x(\nu) = \psi(e^{-2\pi i \nu})\,\psi(e^{2\pi i \nu})\, f_x(\nu)$, where $\psi(e^{-2\pi i \nu}) = \sum_{j=-\infty}^{\infty} \psi_j e^{-2\pi i \nu j}$. $\psi$ is called the transfer or frequency response function; $|\psi|^2$ is called the squared frequency response.
Linear Filters
Given a time series $x_1, \ldots, x_n$, we want to replace $x_t$ by a linear combination of past and/or future values, i.e. $x_t \mapsto y_t = \sum_u c_u x_{t-u}$. Two common examples (sketched in code below):
1. Moving average filter: $y_t = \sum_{u=-r}^{r} c_u x_{t-u}$ where $c_u > 0$ and $\sum_{u=-r}^{r} c_u = 1$. Used to estimate underlying/hidden trends.
2. Differencing: $y_t = \nabla x_t = x_t - x_{t-1} = \sum_{u=-\infty}^{\infty} c_u x_{t-u}$ where $c_1 = -1$, $c_0 = 1$, $c_u = 0$ for $u \ne 0, 1$.
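A sketch of the two filters above, assuming numpy; the equal weights $c_u = \frac{1}{2r+1}$ are one common choice, not the only one:

```python
import numpy as np

def moving_average(x, r=2):
    c = np.ones(2 * r + 1) / (2 * r + 1)     # c_u > 0, sum of c_u = 1
    return np.convolve(x, c, mode="valid")   # y_t = sum_u c_u x_{t-u}

def difference(x):
    return x[1:] - x[:-1]                    # y_t = x_t - x_{t-1}
```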
GARCH Modelling
ARCH MODELLING
• Autoregressive conditionally heteroscedastic.
• Model time-varying second moments, i.e. variances.
• Very important for modelling financial time series.
Note
• ARCH(1) model: $y_t = \sigma_t \epsilon_t$ with $\epsilon_t \sim \text{iid } N(0, 1)$ and $\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2$, where $y_t = \ln P_t - \ln P_{t-1}$ is the log-return of a price series $P_t$.
• If $\sigma_t = \sigma$ $\forall t$ (constant), then $y_t = \sigma \epsilon_t$. This is equivalent to $\ln P_t$ following a simple random walk: $\ln P_t = \ln P_{t-1} + \sigma \epsilon_t$.
• Also, in this case where $\sigma_t = \sigma$ $\forall t$, $y_t \sim \text{iid } N(0, \sigma^2)$. This is not the case for ARCH(1), where $\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2$.
Note
• $E(y_t \mid y_{t-1}) = \sigma_t E(\epsilon_t \mid y_{t-1}) = 0$ since $E(\epsilon_t) = 0$, and $\text{var}(y_t \mid y_{t-1}) = \sigma_t^2\,\text{var}(\epsilon_t \mid y_{t-1}) = \alpha_0 + \alpha_1 y_{t-1}^2$ since $\text{var}(\epsilon_t) = 1$. Thus $y_t \mid y_{t-1} \sim N(0, \sigma_t^2)$ (conditional distribution).
• From $\begin{cases} y_t^2 = \sigma_t^2 \epsilon_t^2 \\ \alpha_0 + \alpha_1 y_{t-1}^2 = \sigma_t^2 \end{cases}$, we get $y_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \sigma_t^2(\epsilon_t^2 - 1)$. Here $\epsilon_t^2 \sim \chi_1^2$ with $E(\epsilon_t^2 - 1) = 0$. Let $u_t = \sigma_t^2(\epsilon_t^2 - 1)$. Then $E(u_t \mid y_s, s \le t-1) = 0$ and hence $E(u_t) = E(E(u_t \mid y_s, s \le t-1)) = 0$. Therefore $y_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + u_t$ is a non-Gaussian AR(1) process.
• By recursively substituting, we can write $y_t^2 = \alpha_0 \sum_{j=0}^{\infty} \alpha_1^j\, \epsilon_t^2 \epsilon_{t-1}^2 \cdots \epsilon_{t-j}^2$. Therefore $E(y_t^2) = \dfrac{\alpha_0}{1 - \alpha_1}$ if $|\alpha_1| < 1$. Also, $y_t = \epsilon_t \sqrt{\alpha_0 \left(1 + \sum_{j=1}^{\infty} \alpha_1^j\, \epsilon_{t-1}^2 \cdots \epsilon_{t-j}^2\right)}$.
• So now $E(y_t \mid \epsilon_s, s \le t-1) = 0$, and therefore $E(y_t) = 0$ and $\text{var}(y_t) = \dfrac{\alpha_0}{1 - \alpha_1}$. Also, $E(y_{t+h} y_t \mid \epsilon_s, s \le t+h-1) = 0$, $h > 0$, and so $\text{cov}(y_{t+h}, y_t) = 0$, $h > 0$.
• It can be shown that $E(y_t^4) = \dfrac{3\alpha_0^2 (1 - \alpha_1^2)}{(1 - \alpha_1)^2 (1 - 3\alpha_1^2)}$ if $3\alpha_1^2 < 1$. Therefore the kurtosis $K$ of $y_t$ is $K = \dfrac{E(y_t^4)}{[E(y_t^2)]^2} = 3\,\dfrac{1 - \alpha_1^2}{1 - 3\alpha_1^2} > 3$.
• Hence the solution $y_t$ of the ARCH(1) equations is strictly stationary white noise. However, $y_t$ is not iid noise and is not Gaussian.
Remark
• With ARCH(1), $y_t^2$ is an AR(1) process with heteroscedastic non-Gaussian noise. Thus $y_t^2$ is serially correlated; this should show up on a time series plot and the sample ACF of $y_t^2$.
• Also note that $E(\sigma_t^2) = \dfrac{\alpha_0}{1 - \alpha_1} = \text{var}(y_t)$. This is the "expected variance" or "long-term average variance".
Remark
1. Regression with ARCH errors: $x_t = \boldsymbol{\beta}' \mathbf{z}_t + y_t$ where $y_t \sim \text{ARCH}(1)$.
2. AR($p$) with ARCH errors: $x_t = \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + y_t$ where $y_t \sim \text{ARCH}(1)$.
3. ARCH(1) extends to ARCH($p$): $y_t = \sigma_t \epsilon_t$ where $\epsilon_t \sim \text{iid } N(0, 1)$ and $\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \cdots + \alpha_p y_{t-p}^2$.
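A sketch simulating ARCH(1), assuming numpy; the parameter values are illustrative and satisfy $3\alpha_1^2 < 1$, so the kurtosis formula above applies. The path $y_t$ should look like white noise while $y_t^2$ is serially correlated:

```python
import numpy as np

rng = np.random.default_rng(2)
a0, a1, n = 0.5, 0.4, 2000
y = np.zeros(n)
y[0] = np.sqrt(a0 / (1 - a1)) * rng.standard_normal()  # start near stationarity
for t in range(1, n):
    sig2 = a0 + a1 * y[t - 1] ** 2       # sigma_t^2 = alpha_0 + alpha_1 y_{t-1}^2
    y[t] = np.sqrt(sig2) * rng.standard_normal()

print(np.var(y), a0 / (1 - a1))          # sample vs theoretical var(y_t)
```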
GARCH MODELLING
Definition: GARCH(1, 1) Model
$y_t = \sigma_t \epsilon_t$ where $\epsilon_t \sim \text{iid } N(0, 1)$ and $\sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \beta_1 \sigma_{t-1}^2$, with $\alpha_0 > 0$, $\alpha_1 \ge 0$, $\beta_1 \ge 0$, $\alpha_1 + \beta_1 < 1$.
Note
$\sigma_t^2 = \alpha_0 + (\alpha_1 + \beta_1)\sigma_{t-1}^2 + \alpha_1 \sigma_{t-1}^2(\epsilon_{t-1}^2 - 1)$. Here $\epsilon_{t-1}^2 - 1 \sim \chi_1^2 - 1$. Let $u_t = \sigma_t^2(\epsilon_t^2 - 1)$. Then $y_t^2 = \alpha_0 + (\alpha_1 + \beta_1) y_{t-1}^2 + u_t - \beta_1 u_{t-1}$. Hence $y_t^2$ is a non-Gaussian ARMA(1, 1) process with heteroscedastic noise.
Note
One can forecast volatility. Let $w = \dfrac{\alpha_0}{1 - \alpha_1 - \beta_1}$. Then $\sigma_t^2$ can be expressed as $\sigma_t^2 = w + \alpha_1(y_{t-1}^2 - w) + \beta_1(\sigma_{t-1}^2 - w)$.
Forecast function: If $\alpha_1 + \beta_1 < 1$, then one can show $E(\sigma_{t+j}^2 \mid \sigma_t^2) = w + (\alpha_1 + \beta_1)^j (\sigma_t^2 - w)$, $j \ge 1$. Hence volatilities are either above or below their long-run average $w$. They mean-revert geometrically to this value; the speed of mean-reversion is determined by $\alpha_1 + \beta_1$ (see the sketch below).
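A sketch of the mean-reverting forecast recursion above, assuming numpy and illustrative parameter values (here $w = 1$ and the current volatility is deliberately above it):

```python
import numpy as np

a0, a1, b1 = 0.1, 0.1, 0.8
w = a0 / (1 - a1 - b1)                         # long-run average variance
sig2_t = 2.5                                   # current (high) volatility
forecasts = [w + (a1 + b1) ** j * (sig2_t - w) for j in range(1, 11)]
print(np.round(forecasts, 3))                  # decays geometrically toward w = 1.0
```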
Note
Maximum likelihood methods can be used to estimate ARCH/GARCH parameters.
Example
Transfer function model: $x_{t,1}$ is an input and $x_{t,2}$ is an output. Model: $x_{t,2} = \alpha_0 x_{t,1} + \alpha_1 x_{t-1,1} + \cdots + \alpha_p x_{t-p,1} + w_t$.
Example
Vector ARMA and ARIMA models.
Vector AR model:
$\begin{bmatrix} x_{t,1} \\ x_{t,2} \end{bmatrix} = \Phi_1 \begin{bmatrix} x_{t-1,1} \\ x_{t-1,2} \end{bmatrix} + \cdots + \Phi_p \begin{bmatrix} x_{t-p,1} \\ x_{t-p,2} \end{bmatrix} + \begin{bmatrix} w_{t,1} \\ w_{t,2} \end{bmatrix},$ where the $\Phi_i$'s are $2 \times 2$ (possibly diagonal) matrices.
Now we have $E(\mathbf{x}_t) = \begin{bmatrix} E(x_{t,1}) \\ E(x_{t,2}) \end{bmatrix}$ and $\text{cov}(\mathbf{x}_{t+h}, \mathbf{x}_t) = \begin{bmatrix} \text{cov}(x_{t+h,1}, x_{t,1}) & \text{cov}(x_{t+h,1}, x_{t,2}) \\ \text{cov}(x_{t+h,2}, x_{t,1}) & \text{cov}(x_{t+h,2}, x_{t,2}) \end{bmatrix}.$ The series $\mathbf{x}_t$ is weakly stationary if $E(\mathbf{x}_t)$ and $\text{cov}(\mathbf{x}_{t+h}, \mathbf{x}_t)$ are independent of time $t$; in this case, $\Gamma(h) = \begin{bmatrix} \gamma_{1,1}(h) & \gamma_{1,2}(h) \\ \gamma_{2,1}(h) & \gamma_{2,2}(h) \end{bmatrix}.$
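A sketch simulating the bivariate VAR(1) special case of the model above, assuming numpy; the coefficient matrix $\Phi_1$ is an illustrative, stable (eigenvalues inside the unit circle) choice:

```python
import numpy as np

rng = np.random.default_rng(3)
Phi1 = np.array([[0.5, 0.2],
                 [0.1, 0.4]])
n = 300
x = np.zeros((n, 2))
for t in range(1, n):
    x[t] = Phi1 @ x[t - 1] + rng.standard_normal(2)   # x_t = Phi_1 x_{t-1} + w_t
```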
Continuous-Time Models
Brownian Motion
1. $B_t$ is continuous and $B_0 = 0$.
2. $B_t \sim N(0, t)$.
3. $B_{t+s} - B_t \sim N(0, s)$, $s \ge 0$, and non-overlapping increments are independent.
Continuous-Time AR(1)
$dx_t = (b - a x_t)\,dt + \sigma\,dB_t$. This is a stochastic differential equation (SDE).
The solution is given by $x_t = e^{-at} x_0 + b \int_0^t e^{-a(t-u)}\,du + \sigma \int_0^t e^{-a(t-u)}\,dB_u$, where $x_0$ is independent of $B_t$.
Let $I_t = \int_0^t e^{-a(t-u)}\,dB_u$ (Itô stochastic integral). Then it can be shown that $E(I_t) = 0$ and $\text{cov}(I_{t+h}, I_t) = e^{-a(2t+h)} \int_0^t e^{2au}\,du$, $t, h \ge 0$.
Suppose now that $a > 0$, $E(x_0) = \frac{b}{a}$, and $\text{var}(x_0) = \frac{\sigma^2}{2a}$. Then $E(x_t) = \frac{b}{a}$ and $\text{cov}(x_{t+h}, x_t) = \frac{\sigma^2}{2a} e^{-ah}$, $t, h \ge 0$, and hence $x_t$ is stationary.
If $x_0 \sim N\left(\frac{b}{a}, \frac{\sigma^2}{2a}\right)$, then $x_t$ is a strictly stationary Gaussian process.
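A sketch simulating this SDE (an Ornstein-Uhlenbeck process) with the Euler-Maruyama scheme, assuming numpy; $a$, $b$, $\sigma$, and the step size $dt$ are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)
a, b, sigma = 2.0, 4.0, 1.0
dt, n = 0.01, 5000
x = np.zeros(n)
x[0] = b / a                                   # start at the stationary mean
for t in range(1, n):
    dB = np.sqrt(dt) * rng.standard_normal()   # Brownian increment
    x[t] = x[t - 1] + (b - a * x[t - 1]) * dt + sigma * dB

print(x.mean(), b / a)                         # ~ stationary mean b/a
print(x.var(), sigma ** 2 / (2 * a))           # ~ stationary variance sigma^2/(2a)
```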