Copula Vine Approach
We use the protocol presented in [Kurowicka D., Cooke R.M. 2004] to specify the (conditional) rank
correlations to be elicited from experts [see Cooke R.M., 1991]. As already stated, these correlations
are assigned to the directed arcs of the BBN.
First we choose the sampling order 1, 2, 3, 4, 5, 6, 7, 8, 9 for the BBN structure, such that the ancestors
of a node appear before that node in the ordering. This order is not unique; we could have chosen a
different sampling order. For example, in Figure C-1 the node “Prescribed spacing”, numbered 4, has as
ancestors the nodes “Error ATC Supervisor”, “Separation Mode Planner Failure”, and “Wind Prediction”;
they were therefore placed before node 4 in the ordering, as nodes 3, 2 and 1, respectively.
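The requirement on the sampling order can be checked mechanically. The following is a minimal sketch, assuming the parent sets that the rest of this appendix works with (node 2 has parent 1; node 4 has parents 3 and 2; node 7 has parents 6 and 5; node 9 has parents 8, 7 and 4). The names `PARENTS`, `sampling_order` and `is_sampling_order` are illustrative and do not come from the report.

```python
from graphlib import TopologicalSorter

# Parents of each node, as read off the (conditional) correlations that are
# specified in this appendix (an assumption about Figure C-1, which is not shown).
PARENTS = {
    1: [], 2: [1], 3: [], 4: [3, 2], 5: [], 6: [],
    7: [6, 5], 8: [], 9: [8, 7, 4],
}

def sampling_order(parents):
    """Return one ordering in which every node follows all of its ancestors."""
    return list(TopologicalSorter(parents).static_order())

def is_sampling_order(order, parents):
    """Check that each parent (and hence each ancestor) precedes its child."""
    position = {node: k for k, node in enumerate(order)}
    return all(position[p] < position[c]
               for c, ps in parents.items() for p in ps)

order = sampling_order(PARENTS)
print(order)                                        # one valid order, not unique
print(is_sampling_order(list(range(1, 10)), PARENTS))
```

Any order accepted by `is_sampling_order` would serve; the order 1, ..., 9 used in the text is one of them.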
We write the complete factorization and underscore the nodes that have no direct “influence” on
the conditioned variable, i.e., that are not its parents and hence are not needed in sampling
it. This factorization is
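$$P(1, 2, 3, 4, 5, 6, 7, 8, 9) = P(1)\, P(2 \mid 1)\, P(3 \mid \underline{2\,1})\, P(4 \mid 3\,2\,\underline{1})\, P(5 \mid \underline{4\,3\,2\,1})\, P(6 \mid \underline{5\,4\,3\,2\,1})\, P(7 \mid 6\,5\,\underline{4\,3\,2\,1})\, P(8 \mid \underline{7\,6\,5\,4\,3\,2\,1})\, P(9 \mid 8\,7\,4\,\underline{6\,5\,3\,2\,1}) \qquad (1)$$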
If we drop the underscored variables, we obtain the standard factorization for the BBN given as follows
[Pearl J. 1988, Jensen F.V. 1996]:
$$P(X_1, X_2, \ldots, X_9) = \prod_{i=1}^{9} P\bigl(X_i \mid pa(X_i)\bigr) \qquad (2)$$
To sample a distribution specified by a continuous BBN we use the sampling procedure for the D-vine
[Kurowicka D., Cooke R.M. 2005]. For each term of the factorization we build a D-vine on K variables,
denoted by $D_K = D(K, C_K, I_K)$. The ordering of the variables is very important: we start with the
variable K, then the dependent variables $C_K$, and at the end the independent variables $I_K$.
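The construction of $D_K = D(K, C_K, I_K)$ can be sketched in a few lines. This is only an illustration, assuming the nodes are indexed 1, ..., n in sampling order and that the parent sets are those used throughout this appendix; `PARENTS` and `dvine_ordering` are hypothetical names.

```python
# Parent sets assumed from the worked terms a) to i) of this appendix.
PARENTS = {
    1: [], 2: [1], 3: [], 4: [3, 2], 5: [], 6: [],
    7: [6, 5], 8: [], 9: [8, 7, 4],
}

def dvine_ordering(k, parents):
    """Return (C_K, I_K, ordering of D_K) for term K of the factorization."""
    c_k = sorted(parents[k], reverse=True)                    # dependent variables
    i_k = [j for j in range(k - 1, 0, -1) if j not in c_k]    # underscored variables
    return c_k, i_k, [k] + c_k + i_k                          # K, then C_K, then I_K

for k in sorted(PARENTS):
    c_k, i_k, ordering = dvine_ordering(k, PARENTS)
    print(f"D{k} = D({', '.join(map(str, ordering))})  C{k} = {c_k}  I{k} = {i_k}")
```

For K = 9 this gives the ordering 9, 8, 7, 4, 6, 5, 3, 2, 1, in which the dependent variables sit next to node 9 and the underscored ones are pushed to the end.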
a) Let us start with the first term of the factorization, $P(1)$. Since variable $X_1$ has neither
dependent nor independent variables, $C_1 = I_1 = \emptyset$. The D-vine for $X_1$ is then trivial; we
denote it by $D_1 = D(1)$. To sample $X_1$, we simply sample a uniform random variable,
$x_1 = u_1$. (3)
b) $P(2 \mid 1)$, with $C_2 = \{1\}$, $I_2 = \emptyset$ and rank correlation $r_{21}$ on the edge between 2 and 1
ATC-WAKE D3_6B APPENDIX, 20/02/2005
Figure C-2: D2 for the BBN for the aircraft separation time with 9 variables
In Figure C-2 we can see the D-vine $D_2$ and the sets of independent and dependent variables for $X_2$.
There are no underscored variables, hence $I_2 = \emptyset$. The set of dependent variables $C_2$ consists of
the variable $X_1$, so the ordering of $D_2$ is as in Figure C-2. To specify the dependence between $X_1$ and
$X_2$, a rank correlation $r_{12}$ must be assigned to the edge between $X_1$ and $X_2$ in $D_2$ and,
equivalently, to the corresponding arc in the BBN in Figure C-1. The graphical representation of the
sampling procedure is shown in Figure C-3:
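[Figure C-3: Sampling procedure for $X_2$. Horizontal axis: $X_2$; vertical axis: $X_1$. The copula with rank correlation $r_{12}$ is conditioned on $u_1 = x_1$, and $u_2$ is inverted through $F_{2|1}$ to give $x_2$.]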
We obtain a value of variable $X_2$, say $x_2$, in $D_2$. The horizontal axis represents the random
variable $X_2$, and its parent $X_1$ is placed on the vertical axis. The diagonal band copula¹ [Cooke R.M.,
Waij R. 1986] realizes the correlation $r_{12}$ between these random variables. The value $X_1 = x_1$ is known
from the first term of the factorization; this allows us to calculate the conditional distribution of
$X_2$ given $X_1 = x_1$, denoted by $F_{2|1}$. If we sample a value $u_2$ of the independent uniform variable
$U_2$ and invert it with respect to $F_{2|1}$, we get the desired value $x_2$. So the sampled value of
variable $X_2$ is obtained as
$$x_2 = F^{-1}_{2|1:x_1}(u_2). \qquad (4)$$
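Equation 4 can be illustrated with Frank's copula, which footnote 1 names as the copula used in applications because its conditional and inverse conditional distributions have closed forms. The sketch below is not the report's implementation. It parameterizes the copula directly by Frank's parameter `theta` (nonzero) rather than by the rank correlation $r_{12}$, since the mapping between the two is not given here, and the names `frank_h`, `frank_h_inv` and `sample_x1_x2` are hypothetical.

```python
import math
import random

def frank_h(v, u, theta):
    """Conditional CDF F_{V|U=u}(v) of Frank's copula, theta != 0."""
    a = math.exp(-theta * u) - 1.0
    b = math.exp(-theta * v) - 1.0
    g = math.exp(-theta) - 1.0
    return (a + 1.0) * b / (g + a * b)

def frank_h_inv(t, u, theta):
    """Inverse of frank_h in v: the value v with F_{V|U=u}(v) = t."""
    g = math.exp(-theta) - 1.0
    b = t * g / (t + (1.0 - t) * math.exp(-theta * u))
    return -math.log1p(b) / theta

def sample_x1_x2(theta_12, rng):
    """Equations 3 and 4: x1 = u1, then x2 = F^{-1}_{2|1:x1}(u2)."""
    x1 = rng.random()          # u1
    u2 = rng.random()          # independent uniform for the second term
    return x1, frank_h_inv(u2, x1, theta_12)

rng = random.Random(0)
print(sample_x1_x2(3.0, rng))  # theta_12 = 3.0 is an arbitrary illustrative value
```

A positive `theta` gives positive dependence between $X_1$ and $X_2$, a negative one gives negative dependence.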
c) $P(3 \mid \underline{2\,1})$
[D-vine on the nodes 3, 2, 1; the edge between 3 and 2 carries rank correlation 0.]
Figure C-4: D3 for the BBN for the aircraft separation time with 9 variables
¹ This copula is used in the text only to visualize the sampling procedure, since it can be easily drawn. For
applications, however, we use Frank’s copula [Frank M.J. 1979], as it does not add much information to the product of
margins, has the zero independence property, and has closed-form conditional and inverse conditional distributions.
For the third term of the factorization K = 3, and variables $X_1$ and $X_2$ are underscored, that is,
$X_3$ is independent of $X_1$ and $X_2$. Thus $C_3 = \emptyset$ and $I_3 = \{2, 1\}$, and the order of the variables
is $D_3 = D(3, 2, 1)$. Variables $X_1$ and $X_2$ were already sampled, so we are now interested only in
information about variable $X_3$, that is, the information in the left-most part of the vine (the
highlighted area in Figure C-4). Both $r_{32}$ and $r_{31|2}$ are equal to zero because $X_3$ is independent
of $X_1$ and $X_2$.
Therefore, to sample the random variable $X_3$ we just sample a value of the independent uniform
variable $U_3$, say $u_3$:
$x_3 = u_3$. (5)
We turn to the fourth part of the factorization.
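d) $P(4 \mid 3\,2\,\underline{1})$
[D-vine on the nodes 4, 3, 2, 1 with $r_{43}$ on the edge between 4 and 3, 0 on the edge between 3 and 2, and $r_{42|3}$ in the second tree.]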
Figure C-5: D4 for the BBN for the aircraft separation time with 9 variables
For the fourth term of the factorization K = 4; the set of dependent variables consists of variables
$X_2$ and $X_3$, hence $C_4 = \{3, 2\}$; and variable $X_1$ is underscored, $I_4 = \{1\}$, i.e., variable $X_1$ is
independent of variable $X_4$ given $X_2$ and $X_3$. We have $D_4 = D(4, 3, 2, 1)$. Notice that the order of
the variables stays the same as in $D_3$.
We are only interested in information about variable $X_4$, as
variables $X_1$, $X_2$ and $X_3$ were already sampled. We have $r_{41|32} = 0$, due to the independence
between variables $X_1$ and $X_4$ given $X_2$ and $X_3$. The correlations $r_{43}$ and $r_{42|3}$
need to be specified².
² Note that we can change the ordering in $D_4$ to 4, 2, 3, 1, which allows another possibility to specify conditional rank
correlations, given as $r_{42}$ and $r_{43|2}$. Hence, with K = 4, $C_4 = \{3, 2\}$ and $I_4 = \{1\}$, we have the following two
possibilities to specify (conditional) rank correlations in $D_4$: ($r_{43}$, $r_{42|3}$) or ($r_{42}$, $r_{43|2}$).
Recall from $D_3$ that the rank correlation $r_{32}$ is equal to zero. The sampling procedure for obtaining a
value $x_4$ of $X_4$ is shown in Figure C-6.
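[Figure C-6: Sampling procedure for $X_4$. The copulas with rank correlations $r_{34}$ and $r_{24|3}$ are conditioned on $x_3$ and $F_{2|3}(x_2)$; $u_4$ is inverted through $F_{4|23}$ and then $F_{4|3}$ to give $x_4$.]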
Since $X_2$ and $X_3$ were already sampled, the values $X_3 = x_3$ and $F_{2|3}(x_2)$ are known. We
conditionalize the copulas with correlations $r_{34}$ and $r_{24|3}$ on the values $X_3 = x_3$ and
$F_{2|3}(x_2)$, respectively, and calculate the conditional cumulative distribution functions $F_{4|3}$ and
$F_{4|23}$ (see Figure C-6). We sample a value $u_4$ of the independent uniform variable $U_4$, invert it with
respect to $F_{4|23}$ to get the value of the quantile of $F_{4|3}$, and invert that in turn to obtain $x_4$.
Hence, $x_4$ is sampled as follows:
$$x_4 = F^{-1}_{4|3:x_3}\Bigl(F^{-1}_{4|23:x_2}(u_4)\Bigr). \qquad (6)$$
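The nested inversion of Equation 6 can be sketched as two successive conditional inversions. As an illustration only, the code below again uses Frank's copula with hypothetical parameters `theta_43` and `theta_42_3` standing in for the rank correlations $r_{43}$ and $r_{42|3}$. It also uses the fact that $r_{32} = 0$, so that $F_{2|3}(x_2) = x_2$ on the uniform scale. The function names are assumptions, not part of the report.

```python
import math

def frank_h(v, u, theta):
    """Conditional CDF F_{V|U=u}(v) of Frank's copula, theta != 0."""
    a = math.exp(-theta * u) - 1.0
    b = math.exp(-theta * v) - 1.0
    return (a + 1.0) * b / (math.exp(-theta) - 1.0 + a * b)

def frank_h_inv(t, u, theta):
    """Inverse of frank_h in v."""
    b = t * (math.exp(-theta) - 1.0) / (t + (1.0 - t) * math.exp(-theta * u))
    return -math.log1p(b) / theta

def sample_x4(x2, x3, u4, theta_43, theta_42_3):
    """Equation 6: invert u4 through F_{4|23}, then through F_{4|3}."""
    f2_given_3 = x2                                  # r_32 = 0, so F_{2|3}(x2) = x2
    q = frank_h_inv(u4, f2_given_3, theta_42_3)      # value of F_{4|3}(x4)
    return frank_h_inv(q, x3, theta_43)              # x4

x4 = sample_x4(x2=0.2, x3=0.6, u4=0.35, theta_43=2.0, theta_42_3=-1.5)
# Going forward again (F_{4|3}, then F_{4|23}) should recover u4.
print(x4, frank_h(frank_h(x4, 0.6, 2.0), 0.2, -1.5))
```

The inner inversion yields a quantile level of $F_{4|3}$, not a value of $X_4$; only the outer inversion returns to the scale of $X_4$.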
e) $P(5 \mid \underline{4\,3\,2\,1})$
In this term we have K = 5, the set of dependent variables is empty ($C_5 = \emptyset$) and the rest of the
variables are underscored, $I_5 = \{4, 3, 2, 1\}$; that is, variable $X_5$ is independent of $X_1$, $X_2$, $X_3$
and $X_4$. We can then use the ordering $D_5 = D(5, 4, 3, 2, 1)$, which, after incorporating all zero
correlations in the left-most part of the vine, simplifies to $D(5)$. We are not required to specify any
(conditional) rank correlation. The value $x_5$ of $X_5$ in $D_5$ is found by simply sampling a value of the
independent uniform random variable $U_5 = u_5$:
$x_5 = u_5$. (7)
Similarly, we can get value x6 for the sixth term of the factorization.
f) $P(6 \mid \underline{5\,4\,3\,2\,1})$
$x_6 = u_6$. (8)
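g) $P(7 \mid 6\,5\,\underline{4\,3\,2\,1})$
[D-vine on the nodes 7, 6, 5 with $r_{76}$ on the edge between 7 and 6, 0 on the edge between 6 and 5, and $r_{75|6}$ in the second tree.]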
Figure C-7: D7 for the BBN for the aircraft separation time with 9 variables
This part of the factorization has K = 7. The set of dependent variables consists of the two variables
$X_5$ and $X_6$, so $C_7 = \{6, 5\}$, and there are four underscored variables, $I_7 = \{4, 3, 2, 1\}$. Hence
$D_7 = D(7, 6, 5, 4, 3, 2, 1)$; the order of the variables stays the same as for the previous
vines. So far we have sampled variables $X_1$, $X_2$, $X_3$, $X_4$, $X_5$ and $X_6$, so we only need to
incorporate the information about variable $X_7$ given in the left-most part of $D_7$. Notice that we have
reduced $D_7$ to $D(7, 6, 5)$, as we did for $D_4$. We must assign a rank correlation $r_{76}$ to the edge that
connects variables $X_7$ and $X_6$ in $D_7$ and, equivalently, to the corresponding arc in the BBN in Figure
C-1. We must also incorporate the information about the conditional dependence of variables $X_5$ and
$X_7$ given variable $X_6$ in the form of the conditional rank correlation $r_{75|6}$³; hence $r_{75|6}$ is
assigned to the arc between $X_7$ and $X_5$ in the BBN in Figure C-1. From the previous terms of the
factorization we find that $r_{65}$ is equal to zero.
³ As we mentioned for $D_4$, the variables in $D_7$ can be given in a different order (7, 5, 6), in which case $r_{75}$ and
$r_{76|5}$ are needed. Hence, with $C_7 = \{6, 5\}$ and $I_7 = \{4, 3, 2, 1\}$, we have the following possibilities to specify
(conditional) rank correlations in $D_7$: ($r_{76}$, $r_{75|6}$) or ($r_{75}$, $r_{76|5}$).
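[Figure C-8: Sampling procedure for $X_7$. The copulas with rank correlations $r_{76}$ and $r_{75|6}$ are conditioned on $x_6$ and $F_{5|6}(x_5)$; $u_7$ is inverted through $F_{7|65}$ and then $F_{7|6}$ to give $x_7$.]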
Figure C-8 shows how the value $x_7$ is sampled in $D(7, 6, 5)$. It is obtained in a way analogous to
the value $x_4$. We get
$$x_7 = F^{-1}_{7|6:x_6}\Bigl(F^{-1}_{7|65:x_5}(u_7)\Bigr). \qquad (9)$$
Now, we shall explain the case of the eighth part of the factorization.
h) $P(8 \mid \underline{7\,6\,5\,4\,3\,2\,1})$
In this term K = 8, the set of dependent variables is empty, $C_8 = \emptyset$, and $I_8 = \{7, 6, 5, 4, 3, 2, 1\}$; that
is, variable $X_8$ is independent of $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$ and $X_7$. Hence we use the
ordering $D_8 = D(8, 7, 6, 5, 4, 3, 2, 1)$, which reduces to $D(8)$. The value $x_8$ is obtained
by just sampling a value of the independent uniform variable $U_8$, say $u_8$:
$x_8 = u_8$. (10)
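i) $P(9 \mid 8\,7\,4\,\underline{6\,5\,3\,2\,1})$
[D-vine on the nodes 9, 8, 7, 4 with $r_{98}$, 0, 0 on the edges of the first tree, $r_{97|8}$ and 0 in the second tree, and $r_{94|78}$ in the third tree.]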
Figure C-9: D9 for the BBN for the aircraft separation time with 9 variables
In this term of the factorization K = 9; the set of dependent variables has three
variables, $C_9 = \{8, 7, 4\}$, and the underscored variables are $I_9 = \{6, 5, 3, 2, 1\}$. Hence, the ordering
of the variables is given as $D_9 = D(9, 8, 7, 4, 6, 5, 3, 2, 1)$. Following the same procedure as
above, $D_9$ is reduced to a sub-vine on four variables, namely $D(9, 8, 7, 4)$. We are only interested
in the information about variable $X_9$. We assign a rank correlation $r_{98}$ to the corresponding edge of
$D_9$ and, equivalently, to the arc between variables $X_8$ and $X_9$ in the BBN in Figure C-1. We also need to
incorporate the information about the two conditional dependences $r_{97|8}$ and $r_{94|87}$ (we know the values
of variables $X_7$ and $X_8$ from $D_7$ and $D_8$, respectively; see Equations 9 and 10)⁴.
Figure C-10 shows the sampling procedure to realize (conditional) correlations in D9.
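[Figure C-10: Sampling procedure for $X_9$. The copulas with rank correlations $r_{98}$, $r_{97|8}$ and $r_{94|87}$ are conditioned on $x_8$, $F_{7|8}(x_7)$ and $F_{4|87}(x_4)$; $u_9$ is inverted through $F_{9|874}$, $F_{9|87}$ and $F_{9|8}$ in turn to give $x_9$.]
We conditionalize the copulas with correlations $r_{98}$, $r_{97|8}$ and $r_{94|87}$ on the values $X_8 = x_8$,
$F_{7|8}(x_7)$, and $F_{4|87}(x_4)$, respectively. We calculate the conditional cumulative distribution
functions $F_{9|8}$, $F_{9|87}$ and $F_{9|874}$ (see Figure C-10). We sample a value $u_9$ of the independent uniform
variable $U_9$ and invert it with respect to $F_{9|874}$ to get the value of the quantile of $F_{9|87}$, which is
used to get the quantile of $F_{9|8}$, which leads to $x_9$.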
⁴ Again, if we change the order of the parents, we have several possibilities to specify the conditional rank correlations.
With $C_9 = \{8, 7, 4\}$ and $I_9 = \{6, 5, 3, 2, 1\}$ these are:
($r_{98}$, $r_{97|8}$, $r_{94|87}$), ($r_{98}$, $r_{94|8}$, $r_{97|84}$), ($r_{97}$, $r_{98|7}$, $r_{94|78}$), ($r_{97}$, $r_{94|7}$, $r_{98|74}$),
($r_{94}$, $r_{98|4}$, $r_{97|48}$) or ($r_{94}$, $r_{97|4}$, $r_{98|47}$).
$$x_9 = F^{-1}_{9|8:x_8}\Bigl(F^{-1}_{9|87:x_7}\bigl(F^{-1}_{9|874:x_4}(u_9)\bigr)\Bigr). \qquad (11)$$
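Equations 3 to 11 follow one pattern: draw an independent uniform value and invert it through the conditional distributions, starting with the one that conditions on all parents. The sketch below puts this together for the whole BBN under stated assumptions. It uses Frank's copula with arbitrary illustrative parameters `THETA` per arc instead of elicited rank correlations, and it relies on the parents of each node being mutually independent in this BBN (the zero correlations noted in the text), so that each conditioning value such as $F_{7|8}(x_7)$ reduces to the sampled value itself. None of the names come from the report.

```python
import math
import random

# Parents listed in the order of C_K; Frank parameters per arc (child, parent)
# are arbitrary illustrative values, not elicited correlations.
PARENTS = {1: [], 2: [1], 3: [], 4: [3, 2], 5: [], 6: [],
           7: [6, 5], 8: [], 9: [8, 7, 4]}
THETA = {(2, 1): 3.0, (4, 3): 2.0, (4, 2): -1.5, (7, 6): 4.0,
         (7, 5): 1.0, (9, 8): 2.5, (9, 7): -2.0, (9, 4): 1.5}

def frank_h_inv(t, u, theta):
    """Inverse conditional CDF of Frank's copula: v with F_{V|U=u}(v) = t."""
    b = t * (math.exp(-theta) - 1.0) / (t + (1.0 - t) * math.exp(-theta * u))
    return -math.log1p(b) / theta

def sample_bbn(rng):
    """Draw one sample (x1, ..., x9) on the uniform scale."""
    x = {}
    for k in sorted(PARENTS):              # sampling order 1, ..., 9
        q = rng.random()                   # u_k
        # Innermost inversion first: the last parent in C_K, then back to the first.
        for p in reversed(PARENTS[k]):
            q = frank_h_inv(q, x[p], THETA[(k, p)])
        x[k] = q
    return x

def pearson(draws, i, j):
    """Empirical product-moment correlation between nodes i and j."""
    a = [d[i] for d in draws]
    b = [d[j] for d in draws]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((s - ma) * (t - mb) for s, t in zip(a, b))
    return cov / math.sqrt(sum((s - ma) ** 2 for s in a) * sum((t - mb) ** 2 for t in b))

rng = random.Random(1)
draws = [sample_bbn(rng) for _ in range(2000)]
print(pearson(draws, 1, 2), pearson(draws, 1, 3))
```

For node 9 the loop inverts through $F_{9|874}$, $F_{9|87}$ and $F_{9|8}$ in that order, as in Equation 11. If the parents of a node were dependent, the conditioning values would have to be computed from the parents' own conditional distributions instead of being read off directly.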
We have specified eight (conditional) correlations for the BBN structure shown in Figure C-1, which is
the same as the number of arcs in the BBN. The conditional independence properties of the BBN were
used to simplify the sampling procedure in the D-vines.
In principle, it is not necessary to draw D -vines to see which (conditional) correlations are necessary
for calculations. One can follow the algorithm presented below:
• Find a sampling ordering, i.e., an ordering such that all ancestors of node i appear before i in
the ordering. A sampling ordering begins with a source node and ends with a sink node.
• Index the nodes according to the sampling order 1, …, n.
• Factorize the joint in the standard way (Equation 2) following the sampling order.
• Underscore those nodes in each condition which are not parents of the conditioned variable
and thus are not necessary in sampling it.
The underscored nodes could be omitted, thereby yielding the familiar factorization of the
BBN as a product of conditional probabilities, with each node conditionalized on its parents
(for source nodes the set of parents is empty).
• For each term i with parents (non-underscored variables) $i_1, \ldots, i_{p(i)}$, associate the arc $i_{p(i)-k} \to i$
with the conditional rank correlation
$$\begin{cases} r\bigl(i, i_{p(i)}\bigr), & k = 0 \\ r\bigl(i, i_{p(i)-k} \mid i_{p(i)}, \ldots, i_{p(i)-k+1}\bigr), & 1 \le k \le p(i) - 1 \end{cases} \qquad (13)$$
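The last step of the algorithm can be sketched as a short routine that lists, for every arc, the (conditional) rank correlation to be elicited. This is an illustration under one assumption that the text does not state explicitly but that matches the worked terms d), g) and i): the parents of node i are indexed in increasing order, $i_1 < \ldots < i_{p(i)}$. The names `PARENTS` and `required_correlations` are hypothetical.

```python
# Parent sets assumed from the worked terms of this appendix.
PARENTS = {1: [], 2: [1], 3: [], 4: [3, 2], 5: [], 6: [],
           7: [6, 5], 8: [], 9: [8, 7, 4]}

def required_correlations(parents):
    """Return (child, parent, conditioning list) triples following Equation 13."""
    triples = []
    for i in sorted(parents):
        ps = sorted(parents[i])                  # i_1 < ... < i_p(i)  (assumption)
        p = len(ps)
        for k in range(p):                       # k = 0, ..., p(i) - 1
            parent = ps[p - 1 - k]               # i_{p(i)-k}
            given = ps[p - k:][::-1]             # i_{p(i)}, ..., i_{p(i)-k+1}
            triples.append((i, parent, given))
    return triples

for child, parent, given in required_correlations(PARENTS):
    cond = "|" + "".join(map(str, given)) if given else ""
    print(f"arc {parent} -> {child}: r_{child}{parent}{cond}")
```

With these parent sets the routine returns one correlation per arc, eight in total, including $r_{94|87}$ for the arc from node 4 to node 9.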
The rank correlation specification on a regular vine, together with a copula, determines the whole joint
distribution [Kurowicka D., Cooke R.M., 2005].