Broderick Mlss Cadiz Part II
Statistics: Part II
Tamara Broderick
ITT Career Development Assistant Professor
Electrical Engineering & Computer Science
MIT
Distributions
• Beta → random distribution over 1, 2
• Dirichlet → random distribution over 1, 2, …, K
• GEM / Dirichlet process stick-breaking → random distribution over 1, 2, …
• Dirichlet process → random distribution over $\Phi$:
$$\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$$
$$\phi_k \stackrel{iid}{\sim} G_0$$
$$G = \sum_{k=1}^{\infty} \rho_k \delta_{\phi_k}$$
[Ferguson 1973]
[Figure: atom weights (e.g. $\rho_2$) plotted over locations $\phi \in \Phi$]
• Infinity of parameters: components
• Growing number of parameters: clusters
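The construction above can be sketched in code. This is a minimal illustration, not the lecture's own demo: it truncates the infinite sequence at a hypothetical `num_sticks`, uses the stick-breaking recipe $V_k \sim \mathrm{Beta}(1, \alpha)$ that appears later in these slides, and assumes a standard normal base measure $G_0$ purely as an example. All function and parameter names are made up for this sketch.

```python
import random


def sample_gem_truncated(alpha, num_sticks, rng):
    """Return the first `num_sticks` weights of rho ~ GEM(alpha) and the leftover mass."""
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)  # V_k ~ Beta(1, alpha)
        weights.append(remaining * v)    # rho_k = V_k * prod_{j<k} (1 - V_j)
        remaining *= 1.0 - v
    return weights, remaining


def sample_dp_truncated(alpha, base_sampler, num_sticks, rng):
    """Truncated draw of G = sum_k rho_k * delta_{phi_k}, with phi_k iid from G_0."""
    weights, leftover = sample_gem_truncated(alpha, num_sticks, rng)
    atoms = [base_sampler(rng) for _ in range(num_sticks)]  # phi_k ~ G_0
    return weights, atoms, leftover


rng = random.Random(0)
# Assumption for illustration only: G_0 is a standard normal on the real line.
weights, atoms, leftover = sample_dp_truncated(
    alpha=2.0, base_sampler=lambda r: r.gauss(0.0, 1.0), num_sticks=50, rng=rng
)
```

The returned `leftover` is the mass of all sticks beyond the truncation point, so `sum(weights) + leftover` equals 1 up to floating-point error.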
Dirichlet process mixture model
• Gaussian mixture model
$$\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$$
$$\mu_k \stackrel{iid}{\sim} \mathcal{N}(\mu_0, \Sigma_0), \quad k = 1, 2, \ldots$$
• i.e. $G = \sum_{k=1}^{\infty} \rho_k \delta_{\mu_k} \stackrel{d}{=} \mathrm{DP}(\alpha, \mathcal{N}(\mu_0, \Sigma_0))$
$$z_n \stackrel{iid}{\sim} \mathrm{Categorical}(\rho)$$
$$\mu_n^* = \mu_{z_n}$$
• i.e. $\mu_n^* \stackrel{iid}{\sim} G$
$$x_n \stackrel{indep}{\sim} \mathcal{N}(\mu_n^*, \Sigma)$$
[Figure: atom weights (e.g. $\rho_2$) plotted over locations $\mu \in \mathbb{R}^D$]
[demo]
• More generally
$$\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$$
$$\phi_k \stackrel{iid}{\sim} G_0, \quad k = 1, 2, \ldots$$
• i.e. $G = \sum_{k=1}^{\infty} \rho_k \delta_{\phi_k} \stackrel{d}{=} \mathrm{DP}(\alpha, G_0)$
$$z_n \stackrel{iid}{\sim} \mathrm{Categorical}(\rho)$$
$$\theta_n = \phi_{z_n}$$
• i.e. $\theta_n \stackrel{iid}{\sim} G$
$$x_n \stackrel{indep}{\sim} F(\theta_n)$$
[Figure: atom weights (e.g. $\rho_2$) plotted over locations $\phi \in \Phi$]
[Antoniak 1974; Ferguson 1983; West, Müller, Escobar 1994; Escobar, West 1995; MacEachern, Müller 1998]
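The generative steps of the Gaussian case can be sketched as follows. This is a hedged illustration under simplifying assumptions that are not in the slides: the data are one-dimensional, so $\Sigma_0$ and $\Sigma$ become the standard deviations `sigma0` and `sigma`, and sticks are broken lazily (only when a draw of $z_n$ needs them) using $V_k \sim \mathrm{Beta}(1, \alpha)$. The function name and its parameters are hypothetical, and component labels are zero-based here.

```python
import random


def sample_dp_mixture(num_points, alpha, mu0, sigma0, sigma, rng):
    """Draw x_1..x_N from a 1-D DP Gaussian mixture using lazy stick-breaking."""
    weights, means = [], []  # rho_k and mu_k instantiated so far
    remaining = 1.0          # stick mass not yet assigned to any component
    assignments, data = [], []
    for _ in range(num_points):
        u = rng.random()     # invert the Categorical(rho) CDF at u
        cumulative, k = 0.0, 0
        while True:
            if k == len(weights):
                # Break a new stick and draw its mean mu_k ~ N(mu0, sigma0^2).
                v = rng.betavariate(1.0, alpha)
                weights.append(remaining * v)
                remaining *= 1.0 - v
                means.append(rng.gauss(mu0, sigma0))
            cumulative += weights[k]
            # Second condition is a floating-point guard once the stick is used up.
            if u < cumulative or remaining <= 0.0:
                break
            k += 1
        assignments.append(k)                   # z_n (zero-based label)
        data.append(rng.gauss(means[k], sigma))  # x_n ~ N(mu_{z_n}, sigma^2)
    return assignments, data, weights, means


z, x, rho, mu = sample_dp_mixture(200, alpha=1.0, mu0=0.0, sigma0=5.0, sigma=1.0,
                                  rng=random.Random(0))
```

Breaking sticks on demand avoids choosing a truncation level in advance: only the components that some data point actually reaches are ever instantiated.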
DP or not DP, that is the question
• GEM: …
• Compare to:
  • Finite (small K) mixture model
  • Finite (large K) mixture model
  • Time series
Marginal cluster assignments
• Integrate out the frequencies
$$\rho_1 \sim \mathrm{Beta}(a_1, a_2), \quad z_n \stackrel{iid}{\sim} \mathrm{Cat}(\rho_1, \rho_2)$$
$$
\begin{aligned}
p(z_n = 1 \mid z_1, \ldots, z_{n-1})
&= \int p(z_n = 1, \rho_1 \mid z_1, \ldots, z_{n-1}) \, d\rho_1 \\
&= \int p(z_n = 1 \mid \rho_1) \, p(\rho_1 \mid z_1, \ldots, z_{n-1}) \, d\rho_1 \\
&= \int \rho_1 \, \mathrm{Beta}(\rho_1 \mid a_{1,n}, a_{2,n}) \, d\rho_1 \\
&= \int \rho_1 \, \frac{\Gamma(a_{1,n} + a_{2,n})}{\Gamma(a_{1,n}) \Gamma(a_{2,n})} \, \rho_1^{a_{1,n} - 1} (1 - \rho_1)^{a_{2,n} - 1} \, d\rho_1 \\
&= \frac{\Gamma(a_{1,n} + a_{2,n})}{\Gamma(a_{1,n}) \Gamma(a_{2,n})} \cdot \frac{\Gamma(a_{1,n} + 1) \Gamma(a_{2,n})}{\Gamma(a_{1,n} + a_{2,n} + 1)} \\
&= \frac{a_{1,n}}{a_{1,n} + a_{2,n}}
\end{aligned}
$$
where
$$a_{1,n} := a_1 + \sum_{m=1}^{n-1} 1\{z_m = 1\}, \quad a_{2,n} := a_2 + \sum_{m=1}^{n-1} 1\{z_m = 2\}$$
Recall: $\Gamma(x + 1) = x \Gamma(x)$
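As a sanity check on the last two steps of the derivation, the Gamma-function ratio can be evaluated numerically and compared with the closed form $a_{1,n} / (a_{1,n} + a_{2,n})$. The helper names below are hypothetical and the hyperparameters are arbitrary example values.

```python
import math


def posterior_counts(a1, a2, z_history):
    """a_{1,n} and a_{2,n}: prior parameters plus counts of labels 1 and 2 so far."""
    a1n = a1 + sum(1 for z in z_history if z == 1)
    a2n = a2 + sum(1 for z in z_history if z == 2)
    return a1n, a2n


def predictive_via_gamma(a1n, a2n):
    """Integral of rho_1 * Beta(rho_1 | a1n, a2n), written as a ratio of Gamma functions."""
    normalizer = math.gamma(a1n + a2n) / (math.gamma(a1n) * math.gamma(a2n))
    integral = math.gamma(a1n + 1.0) * math.gamma(a2n) / math.gamma(a1n + a2n + 1.0)
    return normalizer * integral


def predictive_closed_form(a1n, a2n):
    """Simplified result after applying Gamma(x + 1) = x * Gamma(x)."""
    return a1n / (a1n + a2n)


a1n, a2n = posterior_counts(0.5, 1.5, [1, 2, 1, 1])
gamma_value = predictive_via_gamma(a1n, a2n)
closed_value = predictive_closed_form(a1n, a2n)
```

The two values agree up to floating-point rounding, which is the content of the final simplification step.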
Marginal cluster assignments
• Integrate out the frequencies
$$\rho_1 \sim \mathrm{Beta}(a_1, a_2), \quad z_n \stackrel{iid}{\sim} \mathrm{Cat}(\rho_1, \rho_2)$$
$$p(z_n = 1 \mid z_1, \ldots, z_{n-1}) = \frac{a_{1,n}}{a_{1,n} + a_{2,n}}$$
$$a_{1,n} := a_1 + \sum_{m=1}^{n-1} 1\{z_m = 1\}, \quad a_{2,n} := a_2 + \sum_{m=1}^{n-1} 1\{z_m = 2\}$$
• Pólya urn
  • Choose any ball with prob proportional to its mass (with equal probability when all balls have the same mass)
  • Replace and add ball of same color
$$\lim_{n \to \infty} \frac{\#\,\text{orange}}{\#\,\text{total}} = \rho_{\text{orange}} \stackrel{d}{=} \mathrm{Beta}(a_{\text{orange}}, a_{\text{green}})$$
  • Denote this urn scheme by PolyaUrn($a_{\text{orange}}$, $a_{\text{green}}$)
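The two-color urn can be simulated directly. The sketch below is an illustration under stated assumptions: the initial masses `a_orange` and `a_green` may be non-integer, each added ball has unit mass, and the function name is hypothetical. It returns the orange fraction of the total mass after a fixed number of draws.

```python
import random


def polya_urn_fraction(a_orange, a_green, num_draws, rng):
    """Run a two-color Polya urn and return the final orange fraction of the mass."""
    mass_orange, mass_green = a_orange, a_green
    for _ in range(num_draws):
        # Pick a color with probability proportional to its current mass.
        if rng.random() * (mass_orange + mass_green) < mass_orange:
            mass_orange += 1.0  # replace the ball and add one more orange ball
        else:
            mass_green += 1.0   # replace the ball and add one more green ball
    return mass_orange / (mass_orange + mass_green)


rng = random.Random(0)
fractions = [polya_urn_fraction(2.0, 1.0, 200, rng) for _ in range(2000)]
mean_fraction = sum(fractions) / len(fractions)
```

Across many independent runs the final fractions should be spread out roughly like draws from Beta(2, 1), so their average should sit near 2/3 rather than each run converging to the same value.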
Marginal cluster assignments
• Integrate out the frequencies
$$\rho_{1:K} \sim \mathrm{Dirichlet}(a_{1:K}), \quad z_n \stackrel{iid}{\sim} \mathrm{Cat}(\rho_{1:K})$$
$$p(z_n = k \mid z_1, \ldots, z_{n-1}) = \frac{a_{k,n}}{\sum_{j=1}^{K} a_{j,n}}$$
$$a_{k,n} := a_k + \sum_{m=1}^{n-1} 1\{z_m = k\}$$
• multivariate Pólya urn
  • Choose any ball with prob proportional to its mass
  • Replace and add ball of same color
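The K-color predictive rule above amounts to adding counts to the prior parameters and normalizing. A minimal sketch, with a hypothetical function name and labels taken to be 1, …, K as on the slide:

```python
def predictive_probs(a, z_history):
    """Return [p(z_n = k | z_1..z_{n-1}) for k = 1..K] under a Dirichlet(a) prior."""
    masses = list(a)           # start from the prior parameters a_k
    for z in z_history:
        masses[z - 1] += 1.0   # a_{k,n} = a_k + number of earlier draws equal to k
    total = sum(masses)
    return [m / total for m in masses]


probs = predictive_probs([1.0, 1.0, 1.0], [1, 1, 3])
```

With a uniform prior `[1, 1, 1]` and history `[1, 1, 3]`, the updated masses are `[3, 1, 2]`, so the next draw is label 1 with probability 1/2, label 2 with probability 1/6, and label 3 with probability 1/3.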
[Figure: panels Step 0, Step 1, Step 2, Step 3, Step 4]
$$V_k \stackrel{iid}{\sim} \mathrm{Beta}(1, \alpha)$$
$$\rho_1 = V_1$$
$$\rho_2 = (1 - V_1) V_2$$
$$\rho_3 = \Big[ \prod_{k=1}^{2} (1 - V_k) \Big] V_3$$
• (#orange, #other) = PolyaUrn(1, α)
• not orange: (#green, #other) = PolyaUrn(1, α)
• not orange, green: (#red, #other) = PolyaUrn(1, α)
Chinese restaurant process
[Figure: customers 1 to 8 seated at tables 1, 2, 3]
• Same thing we just did
• Each customer walks into the restaurant
  • Sits at existing table with prob proportional to # people there
  • Forms new table with prob proportional to α
• Marginal for the Categorical likelihood with GEM prior
$$z_1 = z_2 = z_7 = z_8 = 1, \quad z_3 = z_5 = z_6 = 2, \quad z_4 = 3$$
$$\Rightarrow \Pi_8 = \{\{1, 2, 7, 8\}, \{3, 5, 6\}, \{4\}\}$$
• Partition of [8]: set of mutually exclusive & exhaustive subsets of [8] = {1, …, 8}
[Aldous 1983]
• So far: Dirichlet process, Chinese restaurant process
• Infinity of parameters, growing number of parameters
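The seating rule can be sketched as a short sampler, together with a helper that turns assignments $z_1, \ldots, z_N$ into the induced partition. This is one possible implementation with hypothetical names, not the lecture's code; tables are labeled 1, 2, … in order of creation.

```python
import random


def sample_crp(num_customers, alpha, rng):
    """Seat customers one at a time; return table labels z_1..z_N."""
    counts = []       # counts[t] = number of customers at table t + 1
    assignments = []
    for n in range(num_customers):
        # Total unnormalized weight is n (already seated) + alpha (new table).
        u = rng.random() * (n + alpha)
        table = len(counts)  # falls through to a new table unless u lands earlier
        cumulative = 0.0
        for t, c in enumerate(counts):
            cumulative += c  # existing table t has weight equal to its occupancy
            if u < cumulative:
                table = t
                break
        if table == len(counts):
            counts.append(1)
        else:
            counts[table] += 1
        assignments.append(table + 1)
    return assignments


def assignments_to_partition(assignments):
    """Group customer indices 1..N by table label, in order of table label."""
    blocks = {}
    for n, z in enumerate(assignments, start=1):
        blocks.setdefault(z, []).append(n)
    return [blocks[z] for z in sorted(blocks)]


z = sample_crp(100, alpha=1.5, rng=random.Random(1))
partition = assignments_to_partition(z)
slide_example = assignments_to_partition([1, 1, 2, 3, 2, 2, 1, 1])
```

Applied to the assignments on the slide, `assignments_to_partition` returns `[[1, 2, 7, 8], [3, 5, 6], [4]]`, matching $\Pi_8$.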
Exercises
• Review Gibbs sampling
• What are the advantages and disadvantages of the DP and CRP representations?
• What is the expected number of clusters generated by a CRP(α) after N data points?
• What do you think about the answer to the previous question when it comes to real-life data modeling?
• Code a CRP sampler. Examine the empirical distribution of the number of clusters after N customers.
References
A full reference list is provided at the end of the “Part III” slides.