Time series regression: Regression and Smoothing Splines, Local Polynomial Regression and Averaging

Tommaso Proietti

RoME Econometric Theory, AA 2020-21
Introduction

Time series and their characteristics

A time series is a collection of observations on an economic stock or flow variable indexed by time: {y_t, t = 1, 2, . . . , n}.
Time is discrete and observations are equally spaced.
If y is measured on a ratio scale, common transformations are

log(y_t),

∆ log y_t = log(y_t) − log(y_{t−1}) ≈ (y_t − y_{t−1}) / y_{t−1},

∆_s log y_t = log(y_t) − log(y_{t−s}) ≈ (y_t − y_{t−s}) / y_{t−s},   s = 4, 12.

The log transformation is a special case of the Box-Cox transformation:

y_t(λ) = (y_t^λ − 1) / λ for λ ≠ 0,   y_t(λ) = log y_t for λ = 0.
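As a minimal numerical sketch (Python/NumPy; the series y and the value λ = 0.5 are made up for illustration), the transformations above can be computed as follows:

```python
import numpy as np

def box_cox(y, lam):
    """Box-Cox transformation; reduces to the log for lam = 0."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y**lam - 1.0) / lam

# Hypothetical quarterly series, so s = 4 for annual log changes
y = np.array([100.0, 102.0, 101.5, 103.0, 105.0, 106.2, 107.0, 108.5])
s = 4
dlog = np.diff(np.log(y))                 # Delta log y_t, approx. growth rate
dlog_s = np.log(y[s:]) - np.log(y[:-s])   # Delta_s log y_t, annual log change
print(box_cox(y, 0.5)[:3], dlog[:3], dlog_s[:2])
```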

We take a look at some economic time series first.


Some stylized facts emerge:
Some (most) series are seasonal.
Some (most) display trends.
The changes are strongly autocorrelated (we will explain this).

Figure: U.S. Quarterly Gross Domestic Product at chain-linked volumes, ref. year 2009. Panels: original series; logarithmic transformation; quarterly logarithmic changes; annual logarithmic changes.
Figure: U.S. Unemployment rate, monthly, seasonally adjusted. Panels: original series; logarithms; monthly changes; ∆ log y_t; annual changes; ∆_12 log y_t.
Figure: Italy, Index of Industrial Production, Manufacturing, 2010 = 100. Panels: series (ip, ip_sa); annual log changes (D12Lip, D12Lip_sa).
Figure: Consumer Price Index for All Urban Consumers, All Items, 1982-84 = 100. Panels: consumer price index; monthly inflation; yearly inflation.
Approaches to economic time series analysis

Objectives of time series analysis: interpretation and forecasting.

There are various approaches to TSA. A (rather fuzzy) classification:
1 Parametric models (ARIMA, state space models, etc.). A useful categorisation is due to Cox (1981):
  Observation-driven models: dynamic stochastic difference equations (the behaviour is explained in terms of the past values of the process and of the innovations): ARIMA models, VAR and VECM, GARCH, etc.
  Parameter-driven models: the series is generated by a combination of unobserved components: structural time series models, stochastic volatility models.
2 Semiparametric approach (splines, random fields).
3 Nonparametric approach (kernel methods, local polynomial regression, neural networks, wavelets, etc.).

Time Series Regression

Let us consider the regression model with a single explanatory variable t, representing time:

Y = f(t) + ε,

where f(t) is an unknown conditional mean function and, at least, E(ε|t) = 0, E(ε²|t) = σ².

f(t) is possibly nonlinear and non-additive.

In the linear regression framework we can consider the (global) polynomial approximation

f(t) = β_0 + β_1 t + β_2 t² + · · · + β_p t^p.

Here global means that the coefficients of the polynomial are constant across the sample span of t, and it is not possible to control the influence of the individual observations on the fit.
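A brief sketch of a global polynomial fit of this kind (Python/NumPy; the simulated trend is hypothetical). NumPy's Polynomial.fit performs the OLS fit on a rescaled time index:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
t = np.arange(1.0, n + 1)
y = 0.01 * t + np.log1p(t) + 0.2 * rng.standard_normal(n)  # made-up trend + noise

# Global cubic approximation f(t) = b0 + b1 t + b2 t^2 + b3 t^3, fitted by OLS
poly = np.polynomial.Polynomial.fit(t, y, deg=3)
trend = poly(t)   # same coefficients over the whole sample span of t
```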

Global polynomials are amenable to mathematical treatment, but are not very flexible: they can provide bad local approximations and behave rather weirdly at the extremes of the sample.
This point is illustrated by the first panel of the figure below, which plots the original series, representing the industrial production index for the Italian automotive sector, and the estimate of the trend arising from fitting cubic and quintic polynomials in time.
In particular, it can be seen that a high order is needed to provide a reasonable fit (the cubic fit being very poor).

Figure: Industrial Production Index, Manufacture and Assembly of Motor Vehicles, seasonally adjusted, Italy, January 1990 - October 2005. Panels: global polynomial fit (y_t with cubic and quintic fits); cubic spline (λ = 1600).
We will discuss the approximation

f(t) = Σ_{m=1}^{M} β_m h_m(t),

which retains linearity in the regression coefficients and uses a suitable set of transformations of t, the basis functions h_m(t).
In the case of polynomial splines the idea is to add to a global polynomial of degree p further polynomial pieces at given points, called knots, so that the sections are joined together ensuring that certain continuity properties are fulfilled.
Given the set of points ξ_1 < · · · < ξ_k < · · · < ξ_K, a polynomial spline function of degree p with K knots {ξ_k, k = 1, . . . , K} is a polynomial of degree p in each of the K + 1 intervals [ξ_k, ξ_{k+1}), with p − 1 continuous derivatives, whereas the p-th derivative has jumps at the knots.

The spline can be represented as follows:

f(t) = β_0 + β_1 t + · · · + β_p t^p + Σ_{k=1}^{K} β_{p+k} (t − ξ_k)_+^p,   (1)

where the set of functions

(t − ξ_k)_+^p = (t − ξ_k)^p if t − ξ_k ≥ 0, and 0 if t − ξ_k < 0,

defines what is usually called the truncated power basis of degree p.
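Representation (1) maps directly into a regression design matrix, so the spline can be fitted by OLS. A minimal sketch (Python/NumPy; data and knot locations are hypothetical):

```python
import numpy as np

def truncated_power_basis(t, knots, p=3):
    """Columns: 1, t, ..., t^p, (t - xi_1)_+^p, ..., (t - xi_K)_+^p."""
    t = np.asarray(t, dtype=float)
    cols = [t**j for j in range(p + 1)]
    cols += [np.where(t >= xi, (t - xi)**p, 0.0) for xi in knots]
    return np.column_stack(cols)

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 120)
y = np.sin(6.0 * t) + 0.2 * rng.standard_normal(t.size)
X = truncated_power_basis(t, knots=[0.25, 0.5, 0.75], p=3)  # K = 3, df = 4 + K
beta, *_ = np.linalg.lstsq(X, y, rcond=None)                # OLS spline fit
fhat = X @ beta
```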

According to (1) the spline is a linear combination of polynomial pieces; at each knot a new polynomial piece, starting off at zero, is added so that the derivatives at that point are continuous up to order p − 1.
The truncated power representation has the advantage of representing the spline as a multivariate regression model.
The piecewise nature of the spline "reflects the occurrence of structural change" (Poirier, 1973). The knot ξ_k is the location of a structural break. The change is "smooth", since certain continuity conditions are ensured.
The coefficients β_{p+k} determine the size of the break.

Cubic splines and natural boundary conditions

The cubic spline model arises when p = 3:

f(t) = β_0 + β_1 t + β_2 t² + β_3 t³ + β_4 (t − ξ_1)_+³ + · · · + β_{K+3} (t − ξ_K)_+³.

The model has K + 4 parameters and the number of effective degrees of freedom is df = 4 + K.

A cubic spline (like any other high-order spline) behaves too erratically at the boundary of the t support, i.e. the variance of the fit is very high there.

Figure: Training sample (t_i, y_i) and true regression function f(t) = [exp(1.2t) + 1.5 sin(7t) − 1]/3.
Figure: Comparison of cubic and linear spline fits with 4 internal equally spaced knots at 0.2, 0.4, 0.6, 0.8.
Figure: Cubic spline fit with different K (K = 2, 5, 10, 20) and global cubic fit P3 (knots are located automatically at quantiles of the distribution of time t).
This is the reason why it is preferable to impose the so-called natural boundary conditions, which constrain the spline to be linear outside the boundary knots.

The natural boundary conditions require that the second and third derivatives are zero for t ≤ ξ_1 and t ≥ ξ_K. This amounts to imposing 4 restrictions, which frees 4 df.

The complexity of the spline model is measured by the degrees of freedom, i.e. the trace of the hat matrix. This is equal to p + 1 + K for polynomial splines and K for a natural cubic spline.

Natural boundary conditions

The 2nd and 3rd derivatives of f(t) = β_0 + β_1 t + β_2 t² + β_3 t³ + Σ_{k=1}^{K} β_{k+3} (t − ξ_k)_+³ are, respectively,

f″(t) = 2β_2 + 6β_3 t + 6 Σ_{k=1}^{K} β_{k+3} (t − ξ_k)_+,

f‴(t) = 6β_3 + 6 Σ_{k=1}^{K} β_{k+3} (t − ξ_k)_+⁰.

The natural boundary conditions imply the following constraints on the coefficients:

β_2 = β_3 = 0;   Σ_{k=1}^{K} β_{k+3} = 0;   Σ_{k=1}^{K} ξ_k β_{k+3} = 0.
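One way to impose these four constraints in estimation is to drop the t² and t³ columns and confine the knot coefficients to the null space of the two remaining restrictions. A sketch under these assumptions (Python/NumPy; practical implementations use a natural-spline or B-spline basis instead):

```python
import numpy as np

def natural_cubic_design(t, knots):
    """Truncated-power design with beta_2 = beta_3 = 0 and the knot
    coefficients confined to the null space of C = [[1,...,1], [xi_1,...,xi_K]]."""
    t = np.asarray(t, dtype=float)
    knots = np.asarray(knots, dtype=float)
    P = np.column_stack([np.ones_like(t), t])                      # 1, t
    T = np.where(t[:, None] >= knots, (t[:, None] - knots)**3, 0.0)
    C = np.vstack([np.ones_like(knots), knots])                    # 2 x K constraint matrix
    _, _, Vt = np.linalg.svd(C)
    Z = Vt[2:].T                                                   # K x (K - 2) null-space basis
    return np.column_stack([P, T @ Z])

t = np.linspace(0.0, 1.0, 100)
X = natural_cubic_design(t, knots=np.linspace(0.1, 0.9, 5))
print(X.shape)  # (100, 5): with K = 5 knots the fit has K degrees of freedom
```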

Figure: Natural cubic splines. Comparison of fit with different K (K = 2, 5, 10, 20) and global cubic fit P3.
Model selection: how many knots?

Model selection is carried out by the same methods considered for regression. It entails not only the selection of the number of knots, K, but also their location along the support of t.

The automatic option is to locate them at the 100k/(K + 1)-th percentiles of the distribution of t, k = 1, . . . , K. If there are K candidate knots, there are 2^K possible models to select from. Stepwise selection has been proposed. An alternative is to use a regularization approach, i.e. smoothing splines.

Model complexity is measured by the degrees of freedom, e.g. df = trace(H) = p + 1 + K for a polynomial spline.

Note: the truncated power basis is easily interpretable; however, for computational efficiency a linear transformation of this basis, the B-spline basis, is used for estimation (the regressors are less collinear).
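The automatic knot placement is a one-liner; a sketch (Python/NumPy, with hypothetical data):

```python
import numpy as np

def percentile_knots(t, K):
    """Interior knots at the 100k/(K+1)-th percentiles of t, k = 1, ..., K."""
    return np.percentile(t, 100.0 * np.arange(1, K + 1) / (K + 1))

t = np.sort(np.random.default_rng(2).uniform(0.0, 1.0, 150))
print(percentile_knots(t, K=4))  # roughly the 20th, 40th, 60th, 80th percentiles
```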

Figure: Truncated power basis for polynomial spline models. Panels: (x − ξ_k)_+⁰; (x − ξ_k)_+¹; (x − ξ_k)_+³ (cubic spline); smoothing spline.
Smoothing splines

A smoothing spline is a natural cubic spline with n knots placed at each observation t_i, i = 1, . . . , n. Hence each new observation carries "news". Obviously, such a model is overparameterized, as the number of parameters is n, i.e. there is one parameter for each observation.

Hence, we choose f(t) = Σ_{i=1}^{n} θ_i N_i(t), where the N_i(t) are the elements of the natural spline basis corresponding to the knots t_i. The coefficients θ_i are estimated by minimising the following penalised residual sum of squares function:

PRESS(λ) = min { Σ_{i=1}^{n} [y_i − f(t_i)]² + λ ∫ [f″(t)]² dt },   (2)

where λ ≥ 0 is the smoothness parameter, f″(t) is the second derivative of the function, and ∫ [f″(t)]² dt measures its curvature.

The parameter λ regulates the complexity of the model.

For λ = 0, the spline fits the observations perfectly (y_i = f̂(t_i)).
For λ → ∞ we obtain the linear fit f̂(t_i) = β̂_0 + β̂_1 t_i (the linear function has zero second derivative, so ∫ [f″(t)]² dt = 0).

In vector notation,

PRESS(λ) = (y − Nθ)′(y − Nθ) + λ θ′Ωθ,

where N is the regression matrix of the natural spline and Ω has elements ∫ N_h″(t) N_k″(t) dt.

The solution is

θ̂ = (N′N + λΩ)⁻¹ N′y

(notice the analogy with ridge regression).
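The ridge-type solution can be illustrated numerically. As a simplifying assumption, the sketch below (Python/NumPy) replaces the natural-spline basis N with the identity and the curvature matrix Ω with D′D, where D takes second differences: a discrete, Whittaker-type analogue that is valid for equally spaced t, not the exact natural-spline computation:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100
t = np.linspace(0.0, 1.0, n)
y = np.sin(12.0 * (t + 0.2)) / (t + 0.2) + rng.standard_normal(n)

D = np.diff(np.eye(n), n=2, axis=0)   # (n-2) x n second-difference operator
Omega = D.T @ D                       # discrete analogue of the curvature penalty
lam = 5.0
H = np.linalg.solve(np.eye(n) + lam * Omega, np.eye(n))  # hat matrix H_lambda
fhat = H @ y              # theta_hat = (N'N + lam*Omega)^{-1} N'y with N = I
print("df(lambda) =", np.trace(H))    # effective degrees of freedom tr(H_lambda)
```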

The estimated regression function is f̂ = H_λ y, where H_λ = N(N′N + λΩ)⁻¹N′.

The effective degrees of freedom of the spline fit is

df(λ) = tr(H_λ).

When λ → ∞, df(λ) → 2 (minimal complexity), whereas as λ → 0, df(λ) → n (maximum complexity).

The estimation of λ is carried out either by minimizing an information criterion or by crossvalidation. Denoting

RSS(λ) = Σ_{i=1}^{n} [y_i − f̂(t_i)]²,

the AIC is AIC(λ) = ln[RSS(λ)] + 2 df(λ)/n.

Crossvalidation

We seek the value of λ which minimises

CV(λ) = Σ_{i=1}^{n} [(y_i − f̂(t_i)) / (1 − h_{i,λ})]²,

where h_{i,λ} is the i-th diagonal element of H_λ, or

GCV(λ) = RSS(λ) / [1 − n⁻¹ df(λ)]².
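A sketch of GCV selection over a grid of λ values, reusing the discrete penalized smoother assumed in the previous sketch (constant factors in GCV do not affect the minimiser):

```python
import numpy as np

def gcv(y, H):
    """GCV(lambda) = RSS(lambda) / [1 - df(lambda)/n]^2 for hat matrix H."""
    resid = y - H @ y
    df = np.trace(H)
    return (resid @ resid) / (1.0 - df / y.size) ** 2

rng = np.random.default_rng(4)
n = 100
t = np.linspace(0.0, 1.0, n)
y = np.sin(12.0 * (t + 0.2)) / (t + 0.2) + rng.standard_normal(n)
D = np.diff(np.eye(n), n=2, axis=0)
Omega = D.T @ D

grid = 10.0 ** np.arange(-3.0, 4.0)   # candidate smoothing parameters
scores = [gcv(y, np.linalg.solve(np.eye(n) + lam * Omega, np.eye(n)))
          for lam in grid]
print("GCV-chosen lambda:", grid[int(np.argmin(scores))])
```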

Figure: Simulated example: n = 100, f(t) = sin[12(t + 0.2)]/(t + 0.2), ε ∼ N(0, 1), Y = f(t) + ε. Scatter of the data together with the true f(x).
Figure: Simulated example, smoothing spline fit: the CV-selected fit and fits with df = 2, df = 20, df = 50.
Local polynomial regression

Consider the regression model Y = f(t) + ε, where f(t) is an unknown deterministic function of the input t, so that E(Y|t) = f(t).
A training sample of observations (y_i, t_i), i = 1, . . . , n, is available.
Suppose that we are interested in estimating the value of the function at a point t_0, f(t_0).
If f(t) is "smooth" (i.e. continuous and differentiable), it can be locally approximated around t_0 by a polynomial of degree p, using a Taylor expansion:

f̃(t) = β_0 + β_1 (t − t_0) + · · · + β_p (t − t_0)^p.

Setting t = t_0, it is clear that β_0 = f(t_0), β_1 = f′(t_0), and in general β_j = f^(j)(t_0)/j! (the scaled j-th derivative of the function).

As f̃(t) is valid only as a local approximation, the p + 1 unknown coefficients β_k, k = 0, . . . , p, are estimated by weighted least squares (WLS), giving more weight to the observations with values of t close to t_0.
The WLS objective function to be minimized is:

Σ_{i=1}^{n} K((t_i − t_0)/h) [y_i − β_0 − β_1 (t_i − t_0) − · · · − β_p (t_i − t_0)^p]²,

where K((t − t_0)/h) is a kernel function, depending on a bandwidth parameter h > 0, providing a set of weights that decline with the distance of t from t_0.

Denoting by X_0 the n × (p + 1) design matrix with i-th row

[1, (t_i − t_0), . . . , (t_i − t_0)^p],

by β = (β_0, β_1, . . . , β_p)′ the coefficient vector, and

K_0 = diag{K((t_1 − t_0)/h), K((t_2 − t_0)/h), . . . , K((t_n − t_0)/h)},

the WLS estimator is

β̂ = (X_0′ K_0 X_0)⁻¹ X_0′ K_0 y.

The fitted value is then ŷ_0 = β̂_0.
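A compact sketch of the WLS computation at a single point t_0 (Python/NumPy; the Gaussian kernel, the bandwidth and the local linear choice p = 1 are illustrative):

```python
import numpy as np

def local_poly(t, y, t0, h, p=1, kernel=lambda u: np.exp(-0.5 * u**2)):
    """beta_hat = (X0' K0 X0)^{-1} X0' K0 y; the fit at t0 is beta_hat[0]."""
    w = kernel((t - t0) / h)
    X0 = np.vander(t - t0, N=p + 1, increasing=True)  # 1, (t-t0), ..., (t-t0)^p
    XtW = X0.T * w                                    # X0' K0 without forming diag(w)
    beta = np.linalg.solve(XtW @ X0, XtW @ y)
    return beta[0]                                    # beta_hat[1] estimates f'(t0)

rng = np.random.default_rng(5)
t = np.sort(rng.uniform(0.0, 1.0, 150))
y = np.sin(12.0 * (t + 0.2)) / (t + 0.2) + 0.5 * rng.standard_normal(t.size)
grid = np.linspace(0.05, 0.95, 50)
fhat = np.array([local_poly(t, y, t0, h=0.08, p=1) for t0 in grid])  # local linear fit
```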

Notice that

f̂(t_0) = β̂_0 = w_h y_{t−h} + w_{h−1} y_{t−h+1} + · · · + w_1 y_{t−1} + w_0 y_t + w_{−1} y_{t+1} + · · · + w_{−h} y_{t+h}.

The linear combination yielding our trend estimate is often termed a (linear) filter, and the weights w_j constitute its impulse responses.
The latter are time invariant and carry essential information on the nature of the estimated signal.
Two important properties of the weights are symmetry (if the design is regular, i.e., observations are equally spaced) and reproduction of p-th degree polynomials.

The nature of the fit is determined by 3 quantities:


The polynomial order, p.
The kernel function.
The bandwidth parameter, h. The bandwidth determines the width of the
local neighbourhood.

Kernels

A kernel is a symmetric and non-negative function with a maximum at zero. There is a large literature on kernels and their properties.

The uniform kernel is K(u) = I(|u| ≤ 1).
The Epanechnikov kernel has certain optimality properties and is defined as follows: K(u) = (3/4)(1 − u²) if |u| ≤ 1, and 0 otherwise.
The tricube kernel (used by loess, a k-nearest-neighbour local polynomial method) is defined as follows: K(u) = (1 − |u|³)³ if |u| ≤ 1, and 0 otherwise.
The Gaussian kernel takes K(u) to be the standard normal density function, so that h plays the role of the standard deviation.
In the k-nearest-neighbour approach h is set equal to the distance of the k-th closest t_i from t_0. Hence, it varies with t_0.
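The kernels listed above translate directly into code; a sketch (Python/NumPy):

```python
import numpy as np

def uniform(u):      return (np.abs(u) <= 1).astype(float)
def epanechnikov(u): return np.where(np.abs(u) <= 1, 0.75 * (1.0 - u**2), 0.0)
def tricube(u):      return np.where(np.abs(u) <= 1, (1.0 - np.abs(u)**3)**3, 0.0)
def gaussian(u):     return np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)

u = np.linspace(-2.0, 2.0, 9)
print(epanechnikov(u))  # zero outside [-1, 1], maximum 0.75 at u = 0
```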
Figure: Kernel functions. Panels: uniform; Epanechnikov; Gaussian; tricube.
Bias-Variance Trade-off

The degree p of the polynomial and the bandwidth h are crucial quantities.
They regulate the complexity of the fit, which increases with p and decreases
with h.
The mean square estimation error is

MSE[f̂(t_0)] = Bias²[f̂(t_0)] + Var[f̂(t_0)].

Both p and h affect the MSE.


For a given h, by increasing p we reduce the bias (we improve the Taylor
approximation by including higher order terms), but we increase the variance
(there are more parameters to be estimated).
For a given p, a large h will capture more observations and the fit tends to be
the same as the global polynomial fit (with p + 1 df). The variance is small,
but the bias is high.

Usually, f̂(t_0) is biased for f(t_0), unless the true regression function is a polynomial of order p.
The bias arises from neglecting higher order terms in the Taylor expansion.
The bias is inversely related to p and positively related to the bandwidth h.
As far as the variance is concerned, for a given p, Var[f̂(t_0)] decreases as h increases.
The shape of the kernel has much less impact on the bias-variance trade-off.
The optimal choice of (p, h) should minimise MSE[f̂(t_0)], which can be estimated using a variety of methods (see e.g. Fan and Gijbels, 1994).
It is usually most effective to choose a low-degree polynomial (e.g. p = 1 or p = 3) and concentrate the effort on the selection of the bandwidth.
The optimal h can be obtained by cross-validation.

Special case: local averaging (p = 0) and kernel smoothing

The local constant fit (p = 0) is an important special case.

The local weighted average of Y,

f̂(t_0) = Σ_{i=1}^{n} K((t_i − t_0)/h) y_i / Σ_{i=1}^{n} K((t_i − t_0)/h),

is known as the Nadaraya-Watson average.

Denoting by N_k(t) the set of k points nearest to t, the k-nearest-neighbour average is

f̂(t_0) = (1/k) Σ_{t_i ∈ N_k(t_0)} y_i.

This corresponds to the use of a uniform kernel.
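Both averages are short computations; a sketch (Python/NumPy; kernel, bandwidth and k are illustrative choices):

```python
import numpy as np

def nadaraya_watson(t, y, t0, h, kernel=lambda u: np.exp(-0.5 * u**2)):
    """Kernel-weighted local average: sum_i K((t_i-t0)/h) y_i / sum_i K((t_i-t0)/h)."""
    w = kernel((t - t0) / h)
    return np.sum(w * y) / np.sum(w)

def knn_average(t, y, t0, k):
    """Mean of y over the k observations with t_i nearest to t0 (uniform kernel)."""
    idx = np.argsort(np.abs(t - t0))[:k]
    return y[idx].mean()

rng = np.random.default_rng(6)
t = np.sort(rng.uniform(0.0, 1.0, 200))
y = np.sin(12.0 * (t + 0.2)) / (t + 0.2) + 0.5 * rng.standard_normal(t.size)
print(nadaraya_watson(t, y, t0=0.5, h=0.05), knn_average(t, y, t0=0.5, k=15))
```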
