Variational Heteroscedastic Gaussian Process Regression

Miguel Lázaro-Gredilla (1)    Michalis K. Titsias (2)

(1) Dept. Communication Engineering, Universidad de Cantabria, Spain
(2) School of Computer Science, University of Manchester, UK

ICML 2011
Contents

1. Heteroscedastic regression
2. Heteroscedastic Gaussian process (HGP)
3. Variational HGP (VHGP)
4. Experiments
5. Summary and further work
Problem setting

Available data $\mathcal{D} = \{x_i \in \mathbb{R}^D,\ y_i \in \mathbb{R}\}_{i=1}^n$ is modeled as

$$y_i = f(x_i) + \varepsilon_i, \qquad \varepsilon_i \sim \mathcal{N}(0, r(x_i)).$$

- If r(x) is assumed constant: homoscedastic regression (widely used, but the assumption rarely holds)
- In any other case: heteroscedastic regression
Example: Motorcycle dataset

[Figure: scatter plot of the motorcycle dataset (y vs. x), exhibiting heteroscedastic noise]
Example: Motorcycle dataset

[Figure: the same dataset with a regression solution assuming homoscedastic noise]
Example: Motorcycle dataset

[Figure: the same dataset with a regression solution assuming heteroscedastic noise]
The model

Observation model (likelihood):
$$p(y_i \mid f(x_i), g(x_i)) = \mathcal{N}(y_i \mid f(x_i), r(x_i)), \qquad r(x) = e^{g(x)}$$

Gaussian process priors:
$$f(x) \sim \mathcal{GP}(0, k_f(x, x')), \qquad g(x) \sim \mathcal{GP}(\mu_0, k_g(x, x'))$$

Model hyperparameters: $\theta = \{\theta_f, \theta_g, \mu_0\}$
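To make the generative model concrete, here is a minimal NumPy sketch that samples a dataset from the HGP prior. The squared-exponential kernel and every hyperparameter value are illustrative assumptions; the talk does not fix a kernel family.

```python
import numpy as np

def se_kernel(x1, x2, lengthscale, variance):
    """Squared-exponential kernel; an assumed choice, not fixed by the talk."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

rng = np.random.default_rng(0)
n = 200
x = np.linspace(0.0, 10.0, n)
jitter = 1e-8 * np.eye(n)

# GP priors: f ~ GP(0, k_f), g ~ GP(mu0, k_g); hyperparameter values illustrative.
Kf = se_kernel(x, x, lengthscale=1.0, variance=1.0)
Kg = se_kernel(x, x, lengthscale=2.0, variance=0.5)
mu0 = -2.0

f = rng.multivariate_normal(np.zeros(n), Kf + jitter)
g = rng.multivariate_normal(mu0 * np.ones(n), Kg + jitter)

# Observation model: y_i = f(x_i) + eps_i, eps_i ~ N(0, r(x_i)), r(x) = exp(g(x)).
y = f + np.sqrt(np.exp(g)) * rng.standard_normal(n)
```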
It has been used before

- Goldberg, P., Williams, C., and Bishop, C. (1998). Regression with input-dependent noise: A Gaussian process treatment. In Advances in Neural Information Processing Systems.
- Kersting, K., Plagemann, C., Pfaff, P., and Burgard, W. (2007). Most likely heteroscedastic Gaussian process regression. In Proc. of the International Conference on Machine Learning.
- Quadrianto, N., Kersting, K., Reid, M., Caetano, T., and Buntine, W. (2009). Kernel conditional quantile estimation via reduction revisited. In Proc. of the International Conference on Data Mining.
Variational inference

Joint density at the observations: $p(y, f, g) = p(y \mid f, g)\, p(f)\, p(g)$

- The exact posterior $p(f, g \mid y)$ is intractable
- Apply mean field: approximate $p(f, g \mid y)$ with $q(f)\, q(g)$
- Lower-bound the log-evidence and maximize w.r.t. $q(f)$ and $q(g)$:
  $$\log p(y) \geq F(q(f), q(g)) = \log p(y) - \mathrm{KL}(q(f)\, q(g) \,\|\, p(f, g \mid y))$$
- Standard VBEM would update $q(f)$ and $q(g)$ (and $\theta$) iteratively, but it is very slow, and there is no closed-form update for $q(g)$ (a non-linear equation)
- Instead, simplify the problem by removing $q(f)$ optimally
Marginalized Variational (MV) bound (I/II)

The bound at some $q(f)$, $q(g)$ can be written as
$$F(q(f), q(g)) = \int q(f) \left[ \int q(g) \log p(y \mid f, g)\, dg + \log p(f) - \log q(f) \right] df - \mathrm{KL}(q(g) \,\|\, p(g))$$

Using calculus of variations, the optimum is (a standard result):
$$\log q^*(f) = \operatorname*{argmax}_{\log q(f)} F(q(f), q(g)) = \int q(g) \log p(y \mid f, g)\, dg + \log p(f) - \log Z(q(g))$$

with normalization factor
$$Z(q(g)) = \int e^{\int q(g) \log p(y \mid f, g)\, dg}\, p(f)\, df$$
Marginalized Variational (MV) bound (II/II)

Inserting $q^*(f)$ back into the bound,
$$F(q(g)) \equiv F(q^*(f), q(g)) = \log Z(q(g)) - \mathrm{KL}(q(g) \,\|\, p(g)) = \log \int e^{\int q(g) \log p(y \mid f, g)\, dg}\, p(f)\, df - \mathrm{KL}(q(g) \,\|\, p(g))$$

we get the Marginalized Variational (MV) bound.

Bound chain:
$$\log p(y) \geq F(q(g)) = F(q^*(f), q(g)) \geq F(q(f), q(g))$$

Now we only have to search for the optimal $q(g)$.
MV bound for heteroscedastic GPs

Restrict the search to Gaussians $q(g) = \mathcal{N}(g \mid \mu, \Sigma)$. The MV bound for the HGP model is
$$F(\mu, \Sigma) = \log \int e^{\int \mathcal{N}(g \mid \mu, \Sigma) \log p(y \mid f, g)\, dg}\, \mathcal{N}(f \mid 0, K_f)\, df - \mathrm{KL}(\mathcal{N}(g \mid \mu, \Sigma) \,\|\, \mathcal{N}(g \mid \mu_0 \mathbf{1}, K_g))$$

The term inside the exponential is $\log \mathcal{N}(y \mid f, R) - \frac{1}{4}\operatorname{trace}(\Sigma)$, with $[R]_{ii} = e^{[\mu]_i - [\Sigma]_{ii}/2}$ ($R$ is diagonal)

...so the MV bound for HGPs is analytical:
$$F(\mu, \Sigma) = \log \mathcal{N}(y \mid 0, K_f + R) - \tfrac{1}{4}\operatorname{trace}(\Sigma) - \mathrm{KL}(\mathcal{N}(g \mid \mu, \Sigma) \,\|\, \mathcal{N}(g \mid \mu_0 \mathbf{1}, K_g))$$
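Since the bound is analytical, it is short to implement. The sketch below evaluates F(μ, Σ) exactly as written above, assuming NumPy/SciPy; a careful implementation would reuse Cholesky factors of K_f + R and K_g rather than calling solve and slogdet repeatedly.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mv_bound(mu, Sigma, y, Kf, Kg, mu0):
    """MV bound F(mu, Sigma) = log N(y|0, Kf+R) - (1/4) tr(Sigma) - KL,
    with R diagonal, R_ii = exp(mu_i - Sigma_ii / 2)."""
    n = len(y)
    R = np.diag(np.exp(mu - 0.5 * np.diag(Sigma)))
    log_marg = multivariate_normal.logpdf(y, mean=np.zeros(n), cov=Kf + R)

    # KL( N(mu, Sigma) || N(mu0*1, Kg) ) between two Gaussians.
    d = mu - mu0 * np.ones(n)
    _, logdet_Kg = np.linalg.slogdet(Kg)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    kl = 0.5 * (np.trace(np.linalg.solve(Kg, Sigma))
                + d @ np.linalg.solve(Kg, d)
                - n + logdet_Kg - logdet_Sigma)
    return log_marg - 0.25 * np.trace(Sigma) - kl
```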
Reparameterizing the HGP's MV bound

At local optima, the following conditions hold for some positive semidefinite diagonal $\Lambda$:
$$\frac{\partial F(\mu, \Sigma)}{\partial \Sigma} = -\frac{1}{2}\Lambda + \frac{1}{2}\Sigma^{-1} - \frac{1}{2}K_g^{-1} = 0$$
$$\frac{\partial F(\mu, \Sigma)}{\partial \mu} = \left(\Lambda - \tfrac{1}{2}I\right)\mathbf{1} - K_g^{-1}(\mu - \mu_0 \mathbf{1}) = 0$$

Reparameterization with $n$ variational parameters:
$$\mu(\Lambda) = K_g\left(\Lambda - \tfrac{1}{2}I\right)\mathbf{1} + \mu_0 \mathbf{1}, \qquad \Sigma(\Lambda) = \left(K_g^{-1} + \Lambda\right)^{-1}$$

- The bound $F(\Lambda)$ and all derivatives can be computed in $O(n^3)$
- $F(\mu, \Sigma)$ can also be used to optimize hyperparameters!
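A sketch of the reparameterization, under the same assumptions as the previous snippet; lam holds the n nonnegative entries of the diagonal Λ, and the usage comment assumes the mv_bound function defined earlier.

```python
import numpy as np

def reparam(lam, Kg, mu0):
    """Map the n variational parameters (diagonal of Lambda, lam >= 0) to
    mu = Kg (Lambda - I/2) 1 + mu0 1 and Sigma = (Kg^-1 + Lambda)^-1."""
    n = Kg.shape[0]
    mu = Kg @ (lam - 0.5) + mu0 * np.ones(n)
    Sigma = np.linalg.inv(np.linalg.inv(Kg) + np.diag(lam))  # O(n^3), as on the slide
    return mu, Sigma

# Usage sketch: maximize F over lam >= 0, e.g. with scipy.optimize.minimize on
#   lambda lam: -mv_bound(*reparam(lam, Kg, mu0), y, Kf, Kg, mu0)
```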
Predictive distribution

Densities of $f_* = f(x_*)$ and $g_* = g(x_*)$ at a test location $x_*$:
$$q(f_*) = \int p(f_* \mid f)\, q(f)\, df = \mathcal{N}(f_* \mid a_*, c_*^2)$$
$$q(g_*) = \int p(g_* \mid g)\, q(g)\, dg = \mathcal{N}(g_* \mid \mu_*, \sigma_*^2)$$

Predictive distribution for the observations:
$$p(y_* \mid y) \approx q(y_*) = \iint p(y_* \mid f_*, g_*)\, q(f_*)\, q(g_*)\, df_*\, dg_* = \int \mathcal{N}(y_* \mid a_*, c_*^2 + e^{g_*})\, \mathcal{N}(g_* \mid \mu_*, \sigma_*^2)\, dg_*$$

The first two moments are analytical (but this is not a Gaussian!):
$$\mathbb{E}_q[y_* \mid y] = a_*, \qquad \mathbb{V}_q[y_* \mid y] = c_*^2 + e^{\mu_* + \sigma_*^2/2}$$
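The predictive density itself has no closed form, but the remaining integral is one-dimensional, so Gauss-Hermite quadrature handles it; the moments use the slide's analytical formulas. The function names and the quadrature order are assumptions.

```python
import numpy as np

def predictive_density(y_star, a, c2, mu, s2, n_quad=40):
    """q(y*) = integral of N(y* | a, c2 + e^g) N(g | mu, s2) dg, evaluated by
    Gauss-Hermite quadrature with respect to the Gaussian q(g*)."""
    t, w = np.polynomial.hermite_e.hermegauss(n_quad)   # weight exp(-t^2/2)
    var = c2 + np.exp(mu + np.sqrt(s2) * t)             # noise variance at each node
    dens = np.exp(-0.5 * (y_star - a) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return (dens @ w) / np.sqrt(2 * np.pi)

def predictive_moments(a, c2, mu, s2):
    """Analytical moments from the slide: E[y*] = a*, V[y*] = c*^2 + exp(mu* + s*^2/2)."""
    return a, c2 + np.exp(mu + 0.5 * s2)
```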
Posterior on toy data

[Figure: Observed data y(x) and the inferred posterior on the toy dataset; MAPHGP, VHGP, and MCMC estimates overlaid]
Posterior on toy data

[Figure: Marginal posterior of y(x) at x = 0.9; MAPHGP, VHGP, MCMC, and a Gaussian matching VHGP's moments overlaid]
Posterior on toy data

[Figure: Latent g(x) and the inferred posterior; true noise level, MAPHGP, VHGP, and MCMC overlaid]
Several one-dimensional problems

Table: Problems: Goldberg, Cawley, Motorcycle, Toy. Splits: 90% training and 10% testing, 300 independent runs.

Problem    | GP          | MAPHGP      | VHGP
G. (NMSE)  | 0.40±0.21   | 0.39±0.21   | 0.39±0.21
G. (NLPD)  | 1.51±0.28   | 1.53±0.44   | 1.45±0.28
C. (NMSE)  | 0.08±0.06   | 0.11±0.08   | 0.10±0.07
C. (NLPD)  | -0.44±0.52  | -0.44±0.61  | -0.59±0.31
M. (NMSE)  | 0.26±0.18   | 0.26±0.17   | 0.26±0.17
M. (NLPD)  | 4.59±0.22   | 4.32±0.60   | 4.32±0.30
T. (NMSE)  | 0.78±0.33   | 0.77±0.33   | 0.77±0.32
T. (NLPD)  | 2.22±1.16   | 2.10±1.15   | 1.91±0.97
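The talk does not define NMSE and NLPD; the sketch below gives the standard definitions assumed here (NMSE normalized by the variance of the test targets, NLPD averaged over test points, for Gaussian predictive marginals). For VHGP's non-Gaussian predictive, the NLPD would instead use the quadrature density sketched earlier.

```python
import numpy as np

def nmse(y_true, y_pred):
    """Normalized MSE: mean squared error divided by the test-target variance."""
    return np.mean((y_true - y_pred) ** 2) / np.var(y_true)

def gaussian_nlpd(y_true, mean, var):
    """Average negative log predictive density under Gaussian marginals."""
    return np.mean(0.5 * np.log(2.0 * np.pi * var)
                   + 0.5 * (y_true - mean) ** 2 / var)
```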
Several multi-dimensional problems

Table: Single run on predefined train and test sets.

Problem           | Dim. | GP      | VHGP
Abalone (NMSE)    | 8    | 0.4359  | 0.4259
Abalone (NLPD)    | 8    | 2.1265  | 2.0130
Pole T. (NMSE)    | 26   | 0.0237  | 0.0934
Pole T. (NLPD)    | 26   | 2.9082  | 1.8047
Elevators (NMSE)  | 17   | 0.0905  | 0.0939
Elevators (NLPD)  | 17   | -4.7997 | -4.8450
Volatility forecasting

Problem: given a price series $p[x]$, define the log-returns $y[x] = \log(p[x]) - \log(p[x-1])$. The series $y[x]$ is regarded as noise-only. Objective: forecast the future noise power (volatility).

Volatility model used to illustrate RMHMC (Girolami & Calderhead, 2011):
$$y[x] = \beta\, \epsilon[x]\, \exp(g[x]/2), \qquad \epsilon[x] \sim \mathcal{N}(0, 1)$$
$$g[x+1] = \phi\, g[x] + \eta[x+1], \qquad \eta[x] \sim \mathcal{N}(0, \sigma^2), \qquad g[1] \sim \mathcal{N}(0, \sigma^2/(1-\phi^2))$$

This is an HGP model with
$$k_f(x, x') = 0, \qquad k_g(x, x') = \frac{\sigma^2}{1-\phi^2}\, \phi^{|x-x'|}, \qquad \mu_0 = 2 \log \beta$$

Since $g$ is an AR(1) process, VHGP can be implemented in linear time!
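Under the stated equivalence, the only nonzero kernel is the stationary AR(1) covariance. Below is a naive O(n²) construction as a sketch; the linear-time implementation mentioned on the slide would instead exploit the banded (tridiagonal) precision matrix that an AR(1) process induces.

```python
import numpy as np

def ar1_kernel(x, sigma, phi):
    """Stationary AR(1) covariance from the slide:
    k_g(x, x') = sigma^2 * phi^|x - x'| / (1 - phi^2)."""
    lags = np.abs(x[:, None] - x[None, :])
    return sigma ** 2 * phi ** lags / (1 - phi ** 2)

# Ground-truth hyperparameters from the experiment; k_f = 0, mu0 = 2*log(beta).
x = np.arange(400.0)
Kg = ar1_kernel(x, sigma=0.15, phi=0.98)
mu0 = 2.0 * np.log(0.65)
```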
Girolami & Calderhead experiment

[Figure: Latent g(x) and inferred posterior, with hyperparameters integrated out; true noise level, VHGP, and RMHMC overlaid]
Girolami & Calderhead experiment

[Figure: Latent g(x) and inferred posterior, with hyperparameters fixed to the VHGP ML-II estimates; true noise level, VHGP, and RMHMC overlaid]
Girolami & Calderhead experiment

Table: Hyperparameter estimation.

Parameter        | σ      | φ      | β
Ground truth     | 0.1500 | 0.9800 | 0.6500
Expected (RMHMC) | 0.1714 | 0.9771 | 0.6654
Initial value    | 0.5000 | 0.5000 | 0.5000
VHGP ML-II       | 0.1483 | 0.9814 | 0.6662
Comparison with GARCH(1,1)

GARCH(1,1) model:
$$y[x] \sim \mathcal{N}(0, r[x]), \qquad r[x] = a_0 + a_1\, y^2[x-1] + b_1\, r[x-1]$$

Data: daily DEM/GBP exchange rate (1974 trading days).

Table: MSE (×10⁻⁹) for three different forecast horizons.

Method     | 1 day ahead | 7 days ahead | 30 days ahead
GARCH(1,1) | 3.092       | 3.312        | 5.043
VHGP       | 3.087       | 3.092        | 3.118
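For reference, a sketch of the GARCH(1,1) variance recursion from the slide; initializing at the unconditional variance a₀/(1 − a₁ − b₁) is an assumption, since the slide does not specify the starting value.

```python
import numpy as np

def garch11_variance(y, a0, a1, b1):
    """Conditional-variance recursion r[x] = a0 + a1*y[x-1]^2 + b1*r[x-1].
    r[0] is set to the unconditional variance (an assumed initialization)."""
    r = np.empty(len(y))
    r[0] = a0 / (1.0 - a1 - b1)
    for t in range(1, len(y)):
        r[t] = a0 + a1 * y[t - 1] ** 2 + b1 * r[t - 1]
    return r
```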
Summary and further work

Heteroscedastic regression has significant practical importance. The proposed method has several advantages:

- Variationally integrates out g(x) (unlike MAP methods)
- Fast (roughly the cost of two standard GPs)
- Quite accurate (as assessed against MCMC)
- Allows for hyperparameter learning
- Provides an analytical bound to optimize
- Makes volatility prediction in linear time possible

Future lines:

- Robust regression using $k_g(x, x') = \sigma_0^2\, \delta_{x x'}$ (leptokurtic noise)
- Sparse version for big datasets

Code available at http://www.tsc.uc3m.es/~miguel
Extra: Elliptical slice sampling

The MCMC simulations for the toy data were obtained using the recently proposed elliptical slice sampling (Murray et al., 2010). We drew posterior samples from p(g|y) while f was integrated out analytically:
$$p(y, g) = \mathcal{N}(y \mid 0, K_f + \operatorname{diag}(e^g))\, \mathcal{N}(g \mid \mu_0 \mathbf{1}, K_g).$$

Notice the similarity of this expression with the MV bound, in which q(f) has been optimally removed.
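As a sketch, here is the collapsed log-density that elliptical slice sampling targets in this setup: the Gaussian prior N(μ₀1, K_g) supplies the sampling ellipse, and the function below plays the role of the log-likelihood.

```python
import numpy as np
from scipy.stats import multivariate_normal

def collapsed_loglik(g, y, Kf):
    """log N(y | 0, Kf + diag(exp(g))): the ESS log-likelihood once f is
    integrated out analytically; the prior N(mu0*1, Kg) defines the ellipse."""
    n = len(y)
    return multivariate_normal.logpdf(y, mean=np.zeros(n),
                                      cov=Kf + np.diag(np.exp(g)))
```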
Extra: MV bound for HGP

A more explicit form of the MV bound for the HGP model is
$$-2F(\mu, \Sigma) = y^\top (K_f + R)^{-1} y + \log|K_f + R| + n \log(2\pi) + \tfrac{1}{2}\operatorname{trace}(\Sigma) - \log|K_g^{-1}\Sigma| + \operatorname{trace}(K_g^{-1}\Sigma) + (\mu - \mu_0 \mathbf{1})^\top K_g^{-1} (\mu - \mu_0 \mathbf{1}) - n$$

with $[R]_{ii} = e^{[\mu]_i - [\Sigma]_{ii}/2}$ ($R$ is diagonal).