Chapter 11 Part 1 - ARIMA (Box-Jenkins) - 2023
General Overview
• An ARIMA model is a mathematical model for time series data, fitted through an iterative approach.
• George Box and Gwilym Jenkins developed a systematic approach for fitting these models to data, so these models are often called Box-Jenkins models.
• Iterative approach: identify the model → estimate the parameters → check the model (diagnostic checking), repeating until an adequate model is found.
We always use statistical or forecasting programs to fit these models; the programs fit the models and produce forecasts for us.
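As a minimal sketch of what such a program does (the simulated series, the ARIMA order, and the forecast horizon below are illustrative placeholders, not values from these slides), Python's statsmodels library can fit an ARIMA model and produce forecasts:

```python
# Minimal sketch: fitting an ARIMA model and forecasting with statsmodels.
# The simulated series, the (p, d, q) order, and the 5-step horizon are
# illustrative placeholders, not values taken from these slides.
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(0)
y = np.cumsum(rng.normal(size=200))   # an artificial series (a random walk)

model = ARIMA(y, order=(1, 1, 1))     # ARIMA with p=1, d=1, q=1
fitted = model.fit()                  # estimation by maximum likelihood
print(fitted.summary())               # parameter estimates and diagnostics
print(fitted.forecast(steps=5))       # forecasts for the next 5 periods
```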
ARIMA Models
• ARIMA
Special cases: AR models, MA models, ARMA models,
IMA models.
Models generalize regression, but the "independent" variables are past values of the series itself and unobservable random disturbances.
• Estimation is based on maximum likelihood, not least squares.
• We distinguish between seasonal and non-seasonal
models.
Objective: find the black box that most closely fits the data.
What are the inputs to the black box? In ARIMA analysis the inputs are the observed series and the output is "white noise".
Notation
• Y1, Y2, …, Yt denotes a series of values for a time series.
These are observable.
• ε1, ε2, …, εt denotes a series of random disturbances, or error terms, at times 1, 2, …, t.
These are not observable.
Usually they are assumed to be generated from a Normal distribution with mean 0 and standard deviation σ, and to be uncorrelated with each other.
They are often called “white noise”.
• E[Yt] = μ, var(Yt) = σ², and cov(Yi, Yj) = 0 for i ≠ j.
• Yt = c + εt
• Since these values are constants, this type of time series is stationary.
White Noise
White noise (residuals): purely random data with no relation between consecutively observed values, i.e., zero "serial correlation". Previous values do not help in predicting future values (example: the tosses of a fair coin).
Characteristics:
• the pattern through time is completely random, with a mean of zero
• no correlation between its values at different times, i.e., zero autocorrelation at all nonzero lags
A simple random model, often called a white noise model: the observation Yt is composed of two parts, c, the overall level, and the random error component εt, which is assumed to be uncorrelated from period to period.
Yt = c + εt
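A minimal sketch of this white noise model, assuming an arbitrary level c = 10 and error standard deviation 2 purely for illustration:

```python
# Sketch of the white noise model Yt = c + eps_t.
# c = 10 and the error standard deviation of 2 are arbitrary illustrative values.
import numpy as np

rng = np.random.default_rng(42)
c = 10.0                                        # overall level
eps = rng.normal(loc=0.0, scale=2.0, size=300)  # uncorrelated Normal errors, mean 0
y = c + eps                                     # the white noise series

print(y.mean())                                 # close to c
print(np.corrcoef(y[:-1], y[1:])[0, 1])         # lag-1 correlation, close to 0
```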
pth-order autoregressive models:
Yt = φ1Yt-1 + φ2Yt-2 + … + φpYt-p + εt
where:
Yt = time series generated;
φ1, φ2, …, φp = coefficients;
Yt-1, Yt-2, …, Yt-p = lagged values of the time series;
εt = white noise
An AR model of order 2 (AR(2)) uses the two most recent values: Yt = φ1Yt-1 + φ2Yt-2 + εt.
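A minimal sketch of such a model: the AR(2) simulation below uses illustrative coefficients (φ1 = 0.6, φ2 = −0.3, chosen so the process is stationary), not values from these slides:

```python
# Sketch of an AR(2) process: Yt = phi1*Y(t-1) + phi2*Y(t-2) + eps_t.
# phi1, phi2 and the noise scale are illustrative values chosen so the
# process is stationary; they are not taken from these slides.
import numpy as np

rng = np.random.default_rng(1)
phi1, phi2 = 0.6, -0.3
n = 500
y = np.zeros(n)                           # start the first two values at zero
for t in range(2, n):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal(scale=1.0)

print(y.mean())                           # close to 0, the process mean here
```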
ACF
• Autocorrelation function (ACF) is a measure of the correlation
between observations of a time series that are separated by k
time units (Yt and Yt–k).
The autocorrelation (or serial correlation) relates observations of the series (Yt) to observations of the same series at previous time steps (Yt–k); these previous steps are called lags, and k is the number of lags.
Confidence intervals are drawn as a cone.
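A minimal sketch of computing and plotting the sample ACF with statsmodels; the white noise input series and the number of lags are illustrative choices:

```python
# Sketch: computing and plotting the sample ACF with statsmodels.
# The white noise input series and the number of lags are illustrative choices.
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.stattools import acf
from statsmodels.graphics.tsaplots import plot_acf

rng = np.random.default_rng(2)
y = rng.normal(size=200)                  # white noise, for illustration

print(np.round(acf(y, nlags=10), 2))      # sample autocorrelations at lags 0..10
plot_acf(y, lags=20)                      # ACF plot; the shaded cone is the confidence band
plt.show()
```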
PACF
• Partial autocorrelation function (PACF) is a measure of the correlation between observations of a time series that are separated by k time units (Yt and Yt–k), after adjusting for the presence of all the other terms of shorter lag (Yt–1, Yt–2, ..., Yt–k+1).
The partial autocorrelation at lag k is the correlation that results after
removing the effect of any correlations due to the terms at shorter
lags.
For an AR model, the theoretical PACF “shuts off” past the order of
the model.
The phrase "shuts off" means that in theory the partial autocorrelations are equal to 0 beyond that point; in other words, the number of non-zero partial autocorrelations gives the order of the AR model.
By the "order of the model" we mean the most extreme lag of the series that is used as a predictor.
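A minimal sketch of this idea: for a simulated AR(2) series (same illustrative coefficients as the earlier sketch), the sample PACF should be clearly nonzero at lags 1 and 2 and near zero afterwards:

```python
# Sketch: using the sample PACF to suggest the order of an AR model.
# The AR(2) coefficients below are illustrative, not from these slides.
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(3)
phi1, phi2 = 0.6, -0.3
y = np.zeros(600)
for t in range(2, len(y)):
    y[t] = phi1 * y[t - 1] + phi2 * y[t - 2] + rng.normal()

# For an AR(2) process the PACF should "shut off" after lag 2:
print(np.round(pacf(y, nlags=6), 2))      # lags 1-2 clearly nonzero, later lags near 0
```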
It must be remembered that the sample ACF will differ from these theoretical functions because of sampling variation.
Example
For MA(1) and MA(2) models, the autocorrelation coefficients drop to zero after the 1st and 2nd time lags, respectively, while the partial autocorrelation coefficients trail off to zero gradually.
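A minimal sketch of this pattern, using an MA(2) series simulated with arbitrary illustrative coefficients (θ1 = 0.7, θ2 = 0.4): the sample ACF should be near zero past lag 2 while the sample PACF trails off gradually.

```python
# Sketch of an MA(2) process: Yt = eps_t + theta1*eps(t-1) + theta2*eps(t-2).
# theta1 and theta2 are arbitrary illustrative values.
import numpy as np
from statsmodels.tsa.stattools import acf, pacf

rng = np.random.default_rng(4)
theta1, theta2 = 0.7, 0.4
eps = rng.normal(size=1000)
y = eps[2:] + theta1 * eps[1:-1] + theta2 * eps[:-2]

print(np.round(acf(y, nlags=5), 2))       # drops to near zero after lag 2
print(np.round(pacf(y, nlags=5), 2))      # trails off to zero gradually
```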
The sample autocorrelation functions will differ from these theoretical functions because of sampling variation.
For a mixed ARMA process, both the autocorrelations and the partial autocorrelations die out; neither cuts off.
Stationarity
• A time series is stationary if:
Its mean is the same at every time
Its variance is the same at every time
Its autocorrelations are the same at every time
• Examples:
A series of outcomes from independent identical trials is stationary.
A series with a trend is not stationary.
A random walk is not stationary.
• If a time series is non-stationary, its ACF dies off slowly and the first partial
autocorrelation is near 1.
In such cases we can sometimes create a stationary series by differencing the original series. This is the source of the "I" in an ARIMA model.
If Yt is a random walk, then its differences are white noise, which is stationary.
• A unit root test is a formal test for non-stationarity
One such test is the Dickey-Fuller test.
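A minimal sketch of both ideas, using statsmodels' augmented Dickey-Fuller test on a simulated random walk and on its first differences (the series itself is an arbitrary illustration):

```python
# Sketch: differencing a random walk and checking stationarity with the
# augmented Dickey-Fuller test from statsmodels. The series is illustrative.
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(5)
y = np.cumsum(rng.normal(size=300))       # a random walk: non-stationary

print(adfuller(y)[1])                     # p-value: typically large, cannot reject a unit root
dy = np.diff(y)                           # first differences: white noise
print(adfuller(dy)[1])                    # p-value: small, differences look stationary
```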
Example
We difference the data to see if we can eliminate the trend and create a stationary series.