Gaussian Process Vine Copulas For Multivariate Dependence
José Miguel Hernández-Lobato 1,2
joint work with David López-Paz 2,3 and Zoubin Ghahramani 1
1 Department of Engineering, University of Cambridge
3 Max Planck Institute for Intelligent Systems
[Figure: a bivariate copula density and the corresponding marginal densities.]

By Sklar's theorem, any joint density can be written as

    f(x1, . . . , xd) = c(F1(x1), . . . , Fd(xd)) ∏_{i=1}^{d} fi(xi) ,

where c(u1, . . . , ud) and f1(x1), . . . , fd(xd) are the copula and marginal densities.
[Figure: density contours of the Clayton, Frank, Student-t, Gumbel and Joe bivariate copulas.]
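As a sketch of Sklar's factorization in code (assuming a bivariate Gaussian copula; the function names and the choice of standard normal marginals are illustrative, not from the talk):

```python
import numpy as np
from scipy import stats

def gaussian_copula_density(u, v, rho):
    """c(u, v) for a bivariate Gaussian copula with correlation rho."""
    x, y = stats.norm.ppf(u), stats.norm.ppf(v)
    det = 1.0 - rho ** 2
    return np.exp(-(rho ** 2 * (x ** 2 + y ** 2) - 2 * rho * x * y)
                  / (2 * det)) / np.sqrt(det)

def joint_density(x1, x2, rho, m1=stats.norm, m2=stats.norm):
    """Sklar's theorem in code: joint density = copula density of the
    cdf-transformed inputs times the product of the marginal densities."""
    return (gaussian_copula_density(m1.cdf(x1), m2.cdf(x2), rho)
            * m1.pdf(x1) * m2.pdf(x2))
```

With standard normal marginals this recovers the bivariate normal density, which gives a quick sanity check.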
Vine Copulas
They are hierarchical graphical models that factorize c(u1, . . . , ud) into a product of d(d − 1)/2 bivariate conditional copula densities.
We can factorize c(u1, u2, u3) using the product rule of probability as

    c(u1, u2, u3) = f3|12(u3|u1, u2) f2|1(u2|u1) ,

and we can express each factor in terms of bivariate copula functions, e.g.

    F1|2(u1|u2) = ∂C21(x, u1)/∂x |_{x=u2} .
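For a concrete instance, this conditional-cdf ("h-function") has a closed form for the Gaussian copula (a sketch; the helper name is illustrative):

```python
import numpy as np
from scipy import stats

def h_gaussian(u1, u2, rho):
    """F_{1|2}(u1 | u2): the partial derivative of the bivariate Gaussian
    copula C(x, u1) with respect to x, evaluated at x = u2."""
    x1, x2 = stats.norm.ppf(u1), stats.norm.ppf(u2)
    return stats.norm.cdf((x1 - rho * x2) / np.sqrt(1.0 - rho ** 2))
```

For rho = 0 the variables are independent and the h-function reduces to F1|2(u1|u2) = u1.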
Regular Vines
A regular vine specifies a factorization of c(u1 , . . . , ud ).
Formed by d − 1 trees T1, . . . , Td−1 with node and edge sets Vi and Ei.

Each edge e in any tree has three associated sets of variables C(e), D(e), N(e) ⊆ {1, . . . , d}, called the conditioned, conditioning and constraint sets.

V1 = {1, . . . , d} and E1 forms a spanning tree over a complete graph G1 over V1. For any e ∈ E1, C(e) = N(e) = e and D(e) = ∅.

For i > 1, Vi = Ei−1 and Ei forms a spanning tree over a graph Gi with nodes Vi and edges e = {e1, e2} such that e1, e2 ∈ Ei−1 and e1 ∩ e2 ≠ ∅.

For any e = {e1, e2} ∈ Ei, i > 1, we have C(e) = N(e1) △ N(e2), D(e) = N(e1) ∩ N(e2) and N(e) = N(e1) ∪ N(e2).
    c(u1, . . . , ud) = ∏_{i=1}^{d−1} ∏_{e∈Ei} c_{C(e)|D(e)} .
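In practice the first tree T1 is often chosen greedily. A minimal sketch (the heuristic of weighting edges by |Kendall's τ| and selecting a maximum spanning tree is a common choice in the vine literature, not something prescribed by the factorization itself):

```python
import itertools
import numpy as np
from scipy import stats

def first_vine_tree(U):
    """Kruskal-style maximum spanning tree over the d variables of U
    (an n-by-d array on copula scale), weighted by |Kendall's tau|.
    Returns the d - 1 edges of a candidate first tree T1."""
    n, d = U.shape
    edges = sorted(
        ((abs(stats.kendalltau(U[:, i], U[:, j])[0]), i, j)
         for i, j in itertools.combinations(range(d), 2)),
        reverse=True)
    parent = list(range(d))
    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:              # keep edge only if it joins two components
            parent[ri] = rj
            tree.append((i, j))
    return tree
```

Later trees Ti are built analogously over the edges of Ti−1, subject to the intersection condition above.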
Bayesian Inference on f
We are given a sample DUV = {(Ui, Vi)}_{i=1}^n from C_{C(e)|D(e)}, with corresponding values for the variables in D(e) given by Dz = {zi}_{i=1}^n.
We want to identify the value of f that was used to generate the data.
[Figure: the latent function f and the observed data.]

We place a GP prior on f; the likelihood of the observed sample is

    p(DUV | f, Dz) = ∏_{i=1}^n c(Ui, Vi | f(zi)) .
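The resulting unnormalized log posterior is cheap to write down, and it is this quantity that EP (next slide) approximates. A sketch assuming a zero-mean GP prior with kernel matrix K and a generic `copula_loglik` callback (both names illustrative):

```python
import numpy as np

def log_posterior_unnorm(f, U, V, K, copula_loglik):
    """log p(f | D_UV, D_z) up to an additive constant: GP prior term
    plus the sum of per-point copula log likelihoods log c(Ui, Vi | f_i)."""
    prior = -0.5 * f @ np.linalg.solve(K, f)
    lik = sum(copula_loglik(u, v, fi) for u, v, fi in zip(U, V, f))
    return prior + lik
```

The prior term is the usual Gaussian quadratic form; the likelihood term is the product over data points written in log space.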
Expectation Propagation
EP approximates p(f|DUV, Dz) by Q(f) = N(f|m, V), built from Gaussian approximate factors q̃i(fi) with parameters m̃i and ṽi.

EP tunes m̃i and ṽi by minimizing KL[ qi(fi) Q(f) [q̃i(fi)]^{-1} || Q(f) ]. We use numerical integration methods for this task.

Kernel parameters are fixed by maximizing the EP approximation of p(DUV|Dz).

The total cost is O(n³).
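The numerical-integration step can be sketched as one-dimensional moment matching on a grid (names are illustrative; a full EP loop would also convert the matched moments back into site parameters m̃i, ṽi):

```python
import numpy as np

def match_moments_1d(cav_mean, cav_var, log_factor, grid=4001):
    """Mean and variance of the tilted distribution
    N(f | cav_mean, cav_var) * exp(log_factor(f)),
    computed by brute-force quadrature on a dense grid."""
    s = np.sqrt(cav_var)
    f = np.linspace(cav_mean - 8 * s, cav_mean + 8 * s, grid)
    logp = -0.5 * (f - cav_mean) ** 2 / cav_var + log_factor(f)
    p = np.exp(logp - logp.max())   # shift for numerical stability
    w = p / p.sum()                 # grid spacing cancels in the ratio
    mean = np.sum(w * f)
    var = np.sum(w * (f - mean) ** 2)
    return mean, var
```

With a Gaussian factor the matched moments have a closed form, which makes the quadrature easy to validate.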
Implementation Details
We choose the following covariance function for the GP prior:
    Cov[f(zi), f(zj)] = exp{ −(zi − zj)^T diag(λ)(zi − zj) } + σ0 .

The mean of the GP prior is constant and equal to σ^{-1}((θ̂MLE + 1)/2), where θ̂MLE is the MLE of the parameter of an unconditional Gaussian copula and σ is the sigmoid mapping f to the copula parameter.

We use the FITC approximation:

    K approximated by K′ = Q + diag(K − Q), where Q = K_{n,n0} K_{n0,n0}^{-1} K_{n,n0}^T .
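A minimal numeric sketch of the FITC construction (assuming the exponentiated-quadratic kernel above without the σ0 offset; the n0 inducing inputs Z0 are illustrative):

```python
import numpy as np

def eq_kernel(A, B, lam):
    """exp(-(a - b)^T diag(lam) (a - b)) for all pairs of rows of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2 * lam).sum(-1)
    return np.exp(-d2)

def fitc_approx(Z, Z0, lam):
    """FITC: K' = Q + diag(K - Q) with Q = K_{n,n0} K_{n0,n0}^{-1} K_{n,n0}^T."""
    Knm = eq_kernel(Z, Z0, lam)
    Kmm = eq_kernel(Z0, Z0, lam) + 1e-8 * np.eye(len(Z0))  # jitter
    Q = Knm @ np.linalg.solve(Kmm, Knm.T)
    K = eq_kernel(Z, Z, lam)
    return Q + np.diag(np.diag(K - Q))
```

By construction the approximation is exact on the diagonal: diag(K′) = diag(K), which is the characteristic property of FITC.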
Experiments I
We compare the proposed method GPVINE with two baselines:
1 - SVINE, based on the simplifying assumption.
2 - MLLVINE, based on the maximization of the local likelihood.
    - Can only capture dependencies on a single random variable.
    - Limited to regular vines with at most two trees.
All the data are mapped to [0, 1]^d using the empirical cdfs.
Synthetic data: Z uniform in [−6, 6] and (U, V) Gaussian with correlation (3/4) sin(Z). Data set of size 50.
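The synthetic generator above can be sketched as follows (the slide does not say how the Gaussian pair is pushed to copula scale; mapping through the Gaussian cdf is an assumption here):

```python
import numpy as np
from scipy.stats import norm

def sample_synthetic(n=50, seed=0):
    """Toy data: Z ~ U[-6, 6]; (U, V) from a Gaussian pair whose
    correlation varies as rho(Z) = 0.75 * sin(Z), then mapped to
    uniform marginals with the Gaussian cdf."""
    rng = np.random.default_rng(seed)
    Z = rng.uniform(-6.0, 6.0, size=n)
    rho = 0.75 * np.sin(Z)
    x = rng.normal(size=n)
    y = rho * x + np.sqrt(1.0 - rho ** 2) * rng.normal(size=n)
    return Z, norm.cdf(x), norm.cdf(y)
```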
[Figure: estimated conditional Kendall's τ_{U,V|Z} as a function of PZ(Z) for GPVINE, MLLVINE and the ground truth (TRUE).]
Experiments II
Real-world data: UCI datasets, meteorological data, mineral
concentrations and financial data
Data split into training and test sets (50 times), using half of the data for each.
Average test log-likelihood when limited to two trees in the vine:

[Table: average test log-likelihood of GPVINE and SVINE on each dataset.]
References
López-Paz, D., Hernández-Lobato, J. M. and Ghahramani, Z. Gaussian Process Vine Copulas for Multivariate Dependence. International Conference on Machine Learning (ICML 2013).
Acar, E. F., Craiu, R. V., and Yao, F. Dependence calibration in conditional copulas: A nonparametric approach. Biometrics, 67(2):445-453, 2011.
Bedford, T. and Cooke, R. M. Vines - a new graphical model for dependent random variables. The Annals of Statistics, 30(4):1031-1068, 2002.
Minka, T. P. Expectation Propagation for approximate Bayesian inference. Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence, pp. 362-369, 2001.
Naish-Guzman, A. and Holden, S. B. The generalized FITC approximation. In Advances in Neural Information Processing Systems 20, 2007.
Patton, A. J. Modelling asymmetric exchange rate dependence. International Economic Review, 47(2):527-556, 2006.