Broderick Mlss Cadiz Part II

Nonparametric Bayesian Statistics: Part II

Tamara Broderick
ITT Career Development Assistant Professor
Electrical Engineering & Computer Science
MIT
Distributions
• Beta → random distribution over 1, 2
• Dirichlet → random distribution over 1, 2, ..., K
• GEM / Dirichlet process stick-breaking → random distribution over 1, 2, ...
  • Infinity of parameters: components
  • Growing number of parameters: clusters
• Dirichlet process → random distribution over $\Phi$:
  $\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$
  $\phi_k \overset{iid}{\sim} G_0$
  $G = \sum_{k=1}^{\infty} \rho_k \delta_{\phi_k}$
  [Ferguson 1973]
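The construction of $G$ can be sketched in code. This is a minimal illustration, not the speaker's implementation: it assumes the standard stick-breaking form of GEM($\alpha$), namely $V_k \sim \mathrm{Beta}(1, \alpha)$ and $\rho_k = V_k \prod_{j<k} (1 - V_j)$, and it truncates the infinite sum at a hypothetical level `num_sticks`. The base measure $G_0 = \mathcal{N}(0, 1)$ and the names `sample_gem` and `sample_truncated_dp` are choices made only for this example.

```python
import random


def sample_gem(alpha, num_sticks, rng):
    """Return the first `num_sticks` weights of a GEM(alpha) draw (a truncation)."""
    weights = []
    remaining = 1.0  # length of the stick that has not been broken off yet
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)  # V_k ~ Beta(1, alpha)
        weights.append(remaining * v)    # rho_k = V_k * prod_{j<k} (1 - V_j)
        remaining *= 1.0 - v
    return weights


def sample_truncated_dp(alpha, base_sampler, num_sticks, rng):
    """Return (weights, atoms) approximating G = sum_k rho_k * delta_{phi_k}."""
    weights = sample_gem(alpha, num_sticks, rng)
    atoms = [base_sampler(rng) for _ in range(num_sticks)]  # phi_k ~ G0, iid
    return weights, atoms


rng = random.Random(0)
weights, atoms = sample_truncated_dp(
    alpha=2.0,
    base_sampler=lambda r: r.gauss(0.0, 1.0),  # illustrative G0 = N(0, 1)
    num_sticks=50,
    rng=rng,
)
```

The truncated weights sum to slightly less than 1; the missing mass is what the unbroken remainder of the stick would carry.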
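Dirichlet process mixture model
• Gaussian mixture model
  $\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$
  $\mu_k \overset{iid}{\sim} \mathcal{N}(\mu_0, \Sigma_0)$, $k = 1, 2, \ldots$, with $\mu_k \in \mathbb{R}^D$
  • i.e. $G = \sum_{k=1}^{\infty} \rho_k \delta_{\mu_k} \overset{d}{=} \mathrm{DP}(\alpha, \mathcal{N}(\mu_0, \Sigma_0))$
  $z_n \overset{iid}{\sim} \mathrm{Categorical}(\rho)$
  $\mu^*_n = \mu_{z_n}$
  • i.e. $\mu^*_n \overset{iid}{\sim} G$
  $x_n \overset{indep}{\sim} \mathcal{N}(\mu^*_n, \Sigma)$
[demo]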
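The generative steps of the Gaussian mixture model can be sketched as follows. This is an illustrative sketch under stated assumptions: the data are one-dimensional ($D = 1$), so $\Sigma_0$ and $\Sigma$ reduce to the standard deviations `sigma0` and `sigma`; GEM($\alpha$) is drawn by standard stick-breaking and truncated at a hypothetical `num_sticks`. The function name and parameter values are invented for the example.

```python
import random


def sample_dp_gaussian_mixture(n, alpha, mu0, sigma0, sigma, num_sticks, rng):
    """Draw (z, x) from a truncated, one-dimensional DP Gaussian mixture."""
    # rho ~ GEM(alpha), truncated to num_sticks components.
    rho, remaining = [], 1.0
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)
        rho.append(remaining * v)
        remaining *= 1.0 - v
    # mu_k ~ N(mu0, sigma0^2), iid.
    mu = [rng.gauss(mu0, sigma0) for _ in range(num_sticks)]
    # z_n ~ Categorical(rho); choices() renormalizes the truncated weights.
    z = rng.choices(range(num_sticks), weights=rho, k=n)
    # x_n ~ N(mu_{z_n}, sigma^2), independently given the assignments.
    x = [rng.gauss(mu[k], sigma) for k in z]
    return z, x


rng = random.Random(1)
z, x = sample_dp_gaussian_mixture(
    n=200, alpha=1.0, mu0=0.0, sigma0=5.0, sigma=0.5, num_sticks=100, rng=rng
)
num_clusters = len(set(z))  # number of components that the data actually use
```

Only a few of the instantiated components end up with data points, which is the distinction between components and clusters.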
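Dirichlet process mixture model
• More generally
  $\rho = (\rho_1, \rho_2, \ldots) \sim \mathrm{GEM}(\alpha)$
  $\phi_k \overset{iid}{\sim} G_0$, $k = 1, 2, \ldots$, with $\phi_k \in \Phi$
  • i.e. $G = \sum_{k=1}^{\infty} \rho_k \delta_{\phi_k} \overset{d}{=} \mathrm{DP}(\alpha, G_0)$
  $z_n \overset{iid}{\sim} \mathrm{Categorical}(\rho)$
  $\theta_n = \phi_{z_n}$
  • i.e. $\theta_n \overset{iid}{\sim} G$
  $x_n \overset{indep}{\sim} F(\theta_n)$
[Antoniak 1974; Ferguson 1983; West, Müller, Escobar 1994; Escobar, West 1995; MacEachern, Müller 1998]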
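The general model can be simulated without fixing a truncation level by breaking sticks only when a draw needs them. The sketch below is one possible way to do this, not a method taken from the slides: `base_sampler` stands in for $G_0$, `likelihood_sampler` stands in for $F$, and the Gamma base measure with exponential likelihood in the usage lines is an arbitrary illustrative choice.

```python
import random


def sample_dp_mixture(n, alpha, base_sampler, likelihood_sampler, rng):
    """Draw (thetas, xs) from a DP mixture, instantiating sticks lazily."""
    rho, phi = [], []   # sticks and atoms instantiated so far
    remaining = 1.0     # mass not yet assigned to any stick
    thetas, xs = [], []
    for _ in range(n):
        u = rng.random()
        # Break more sticks until the instantiated mass (1 - remaining) covers u.
        while 1.0 - remaining <= u:
            v = rng.betavariate(1.0, alpha)
            rho.append(remaining * v)
            phi.append(base_sampler(rng))  # phi_k ~ G0
            remaining *= 1.0 - v
        # z_n is the first index whose cumulative weight exceeds u
        # (falls back to the last stick if rounding leaves a tiny gap).
        cumulative = 0.0
        for k, w in enumerate(rho):
            cumulative += w
            if u < cumulative:
                break
        thetas.append(phi[k])                       # theta_n = phi_{z_n}
        xs.append(likelihood_sampler(phi[k], rng))  # x_n ~ F(theta_n)
    return thetas, xs


rng = random.Random(1)
thetas, xs = sample_dp_mixture(
    n=100,
    alpha=1.0,
    base_sampler=lambda r: r.gammavariate(2.0, 1.0),      # illustrative G0
    likelihood_sampler=lambda th, r: r.expovariate(th),   # illustrative F
    rng=rng,
)
```

Repeated values in `thetas` correspond to data points that share a cluster.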
DP or not DP, that is the question
• GEM: …
• Compare to:
  • Finite (small K) mixture model
  • Finite (large K) mixture model
  • Time series
Marginal cluster assignments
• Integrate out the frequencies
  $\rho_1 \sim \mathrm{Beta}(a_1, a_2)$, $z_n \overset{iid}{\sim} \mathrm{Cat}(\rho_1, \rho_2)$
  $p(z_n = 1 \mid z_1, \ldots, z_{n-1})$
  $= \int p(z_n = 1, \rho_1 \mid z_1, \ldots, z_{n-1}) \, d\rho_1$
  $= \int p(z_n = 1 \mid \rho_1) \, p(\rho_1 \mid z_1, \ldots, z_{n-1}) \, d\rho_1$
  $= \int \rho_1 \, \mathrm{Beta}(\rho_1 \mid a_{1,n}, a_{2,n}) \, d\rho_1$
    where $a_{1,n} := a_1 + \sum_{m=1}^{n-1} 1\{z_m = 1\}$, $a_{2,n} := a_2 + \sum_{m=1}^{n-1} 1\{z_m = 2\}$
  $= \int \rho_1 \, \frac{\Gamma(a_{1,n} + a_{2,n})}{\Gamma(a_{1,n}) \, \Gamma(a_{2,n})} \, \rho_1^{a_{1,n} - 1} (1 - \rho_1)^{a_{2,n} - 1} \, d\rho_1$
  $= \frac{\Gamma(a_{1,n} + a_{2,n})}{\Gamma(a_{1,n}) \, \Gamma(a_{2,n})} \cdot \frac{\Gamma(a_{1,n} + 1) \, \Gamma(a_{2,n})}{\Gamma(a_{1,n} + a_{2,n} + 1)}$
    Recall $\Gamma(x + 1) = x \, \Gamma(x)$
  $= \frac{a_{1,n}}{a_{1,n} + a_{2,n}}$
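The last simplification can be checked numerically. The sketch below evaluates the Gamma-function expression from the derivation in log space and compares it with the closed form $a_{1,n} / (a_{1,n} + a_{2,n})$. The function names and the example values of $a_1$, $a_2$ and the previous assignments are hypothetical.

```python
import math


def predictive_prob_via_gamma(a1n, a2n):
    """Evaluate the Gamma-function expression for p(z_n = 1 | z_1, ..., z_{n-1})."""
    log_p = (
        math.lgamma(a1n + a2n) - math.lgamma(a1n) - math.lgamma(a2n)
        + math.lgamma(a1n + 1.0) + math.lgamma(a2n) - math.lgamma(a1n + a2n + 1.0)
    )
    return math.exp(log_p)


def predictive_prob_closed_form(a1, a2, z_prev):
    """Return a_{1,n} / (a_{1,n} + a_{2,n}) from the prior and earlier labels."""
    a1n = a1 + sum(1 for z in z_prev if z == 1)
    a2n = a2 + sum(1 for z in z_prev if z == 2)
    return a1n / (a1n + a2n)


# Illustrative values: a_1 = 0.5, a_2 = 2.0, and four earlier assignments,
# so a_{1,n} = 3.5 and a_{2,n} = 3.0.
closed = predictive_prob_closed_form(0.5, 2.0, [1, 2, 1, 1])
via_gamma = predictive_prob_via_gamma(3.5, 3.0)
```

Both quantities agree up to floating-point error, as the identity $\Gamma(x + 1) = x \, \Gamma(x)$ implies.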
Marginal cluster assignments
• Integrate out the frequencies
  $\rho_1 \sim \mathrm{Beta}(a_1, a_2)$, $z_n \overset{iid}{\sim} \mathrm{Cat}(\rho_1, \rho_2)$
  $p(z_n = 1 \mid z_1, \ldots, z_{n-1}) = \frac{a_{1,n}}{a_{1,n} + a_{2,n}}$
  $a_{1,n} := a_1 + \sum_{m=1}^{n-1} 1\{z_m = 1\}$, $a_{2,n} := a_2 + \sum_{m=1}^{n-1} 1\{z_m = 2\}$
• Pólya urn
  • Choose any ball with equal probability
  • Replace and add ball of same color
  $\lim_{n \to \infty} \frac{\#\,\text{orange}}{\#\,\text{total}} = \rho_{\text{orange}} \overset{d}{=} \mathrm{Beta}(a_{\text{orange}}, a_{\text{green}})$
Marginal cluster assignments
• Pólya urn
  • Choose any ball with prob proportional to its mass
  • Replace and add ball of same color
  $\lim_{n \to \infty} \frac{\#\,\text{orange}}{\#\,\text{total}} = \rho_{\text{orange}} \overset{d}{=} \mathrm{Beta}(a_{\text{orange}}, a_{\text{green}})$
  $\mathrm{PolyaUrn}(a_{\text{orange}}, a_{\text{green}})$
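The urn scheme is easy to simulate. The sketch below is illustrative only: it assumes each added ball has mass 1 and that the initial masses are $a_{\text{orange}}$ and $a_{\text{green}}$, and it tracks the orange share of the total mass. The function name and the parameter values in the usage lines are hypothetical.

```python
import random


def polya_urn_fraction(a_orange, a_green, num_draws, rng):
    """Run a two-color Polya urn and return the final orange share of the mass."""
    mass_orange, mass_green = float(a_orange), float(a_green)
    for _ in range(num_draws):
        # Choose a color with probability proportional to its current mass.
        p_orange = mass_orange / (mass_orange + mass_green)
        if rng.random() < p_orange:
            mass_orange += 1.0  # replace and add one ball of the same color
        else:
            mass_green += 1.0
    return mass_orange / (mass_orange + mass_green)


rng = random.Random(0)
# Each run approximates one draw of rho_orange; across runs the fractions
# should look like Beta(2, 3), whose mean is 2 / (2 + 3) = 0.4.
fractions = [polya_urn_fraction(2.0, 3.0, 200, rng) for _ in range(1000)]
mean_fraction = sum(fractions) / len(fractions)
```

A single run settles to a random limit, so the spread of `fractions` stays wide no matter how long each run is; only the average across runs concentrates near 0.4.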
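Marginal cluster assignments
• Integrate out the frequencies
  $\rho_{1:K} \sim \mathrm{Dirichlet}(a_{1:K})$, $z_n \overset{iid}{\sim} \mathrm{Cat}(\rho_{1:K})$
  $p(z_n = k \mid z_1, \ldots, z_{n-1}) = \frac{a_{k,n}}{\sum_{j=1}^{K} a_{j,n}}$
  $a_{k,n} := a_k + \sum_{m=1}^{n-1} 1\{z_m = k\}$
• multivariate Pólya urn
  • Choose any ball with prob proportional to its mass
  • Replace and add ball of same color
  $\lim_{n \to \infty} \frac{(\#\,\text{orange}, \#\,\text{green}, \#\,\text{red}, \#\,\text{yellow})}{\#\,\text{total}} \to (\rho_{\text{orange}}, \rho_{\text{green}}, \rho_{\text{red}}, \rho_{\text{yellow}}) \overset{d}{=} \mathrm{Dirichlet}(a_{\text{orange}}, a_{\text{green}}, a_{\text{red}}, a_{\text{yellow}})$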
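The predictive rule above translates directly into code. The sketch below is a minimal illustration with hypothetical function names: it computes $a_{k,n} / \sum_j a_{j,n}$ from the prior masses and the earlier labels, then uses that rule to generate assignments one at a time, which is the multivariate Pólya urn. Labels are 1-based to match the slide.

```python
import random


def dirichlet_categorical_predictive(a, z_prev):
    """Return [p(z_n = k | z_1, ..., z_{n-1}) for k = 1..K]."""
    masses = [float(a_k) for a_k in a]
    for z in z_prev:
        masses[z - 1] += 1.0  # a_{k,n} = a_k + number of earlier z_m equal to k
    total = sum(masses)
    return [m / total for m in masses]


def sample_assignments(a, n, rng):
    """Sample z_1, ..., z_n sequentially from the multivariate Polya urn."""
    z = []
    for _ in range(n):
        probs = dirichlet_categorical_predictive(a, z)
        z.append(rng.choices(range(1, len(a) + 1), weights=probs)[0])
    return z


# Illustrative values: K = 3 with a = (1, 1, 2) and earlier labels 3, 3, 1,
# so the updated masses are (2, 1, 4).
probs = dirichlet_categorical_predictive([1.0, 1.0, 2.0], [3, 3, 1])
z = sample_assignments([1.0, 1.0, 2.0], 50, random.Random(0))
```

Recomputing the masses at every step keeps the example short; an implementation that cared about speed would update the counts incrementally instead.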
Marginal cluster assignments
• Hoppe urn / Blackwell-MacQueen urn
  • Choose ball with prob proportional to its mass
  • If black, replace and add ball of new color
  • Else, replace and add ball of same color
[Figure: urn contents at Step 0, Step 1, Step 2, Step 3, Step 4]
$(\#\,\text{orange}, \#\,\text{other}) = \mathrm{PolyaUrn}(1, \alpha)$
• not orange: $(\#\,\text{green}, \#\,\text{other}) = \mathrm{PolyaUrn}(1, \alpha)$
• not orange, green: $(\#\,\text{red}, \#\,\text{other}) = \mathrm{PolyaUrn}(1, \alpha)$
$V_k \overset{iid}{\sim} \mathrm{Beta}(1, \alpha)$
$\rho_1 = V_1$
$\rho_2 = (1 - V_1) V_2$
[Blackwell, MacQueen 1973; Hoppe 1984]
Marginal cluster assignments
• Hoppe urn / Blackwell-MacQueen urn
• Choose ball with prob proportional to its mass
• If black, replace and add ball of new color
• Else, replace and add ball of same color

[Figure: urn contents at Step 0 through Step 4]

(#orange, #other) = PolyaUrn(1, α)
• not orange: (#green, #other) = PolyaUrn(1, α)
• not orange, green: (#red, #other) = PolyaUrn(1, α)

$V_k \overset{iid}{\sim} \mathrm{Beta}(1, \alpha)$
$\rho_1 = V_1$
$\rho_2 = (1 - V_1) V_2$
$\rho_3 = \left[ \prod_{k=1}^{2} (1 - V_k) \right] V_3$
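The stick-breaking formulas on this slide can be computed directly. The sketch below truncates the infinite sequence at a fixed number of sticks, which is an assumption made only so the example terminates; `stick_breaking` and `num_sticks` are hypothetical names.

```python
import random


def stick_breaking(alpha, num_sticks, seed=None):
    """Return the first num_sticks weights rho_k and the unbroken remainder.

    V_k ~ Beta(1, alpha) i.i.d., and rho_k = V_k * prod_{j<k} (1 - V_j).
    Truncating at num_sticks is an approximation to the infinite sequence.
    """
    rng = random.Random(seed)
    remaining = 1.0  # length of the stick not yet broken off
    rho = []
    for _ in range(num_sticks):
        v = rng.betavariate(1.0, alpha)  # V_k ~ Beta(1, alpha)
        rho.append(remaining * v)        # rho_k = V_k * prod_{j<k} (1 - V_j)
        remaining *= 1.0 - v
    return rho, remaining


rho, leftover = stick_breaking(alpha=2.0, num_sticks=10, seed=0)
print(rho[:3], leftover)
```

The returned weights plus the leftover mass sum to 1, so the leftover indicates how much probability the truncation ignores.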
Chinese restaurant process

[Figure: customers 1 through 8 seated at tables 1, 2, and 3]

• Same thing we just did
• Each customer walks into the restaurant
• Sits at existing table with prob proportional to # people there
• Forms new table with prob proportional to α
• Marginal for the Categorical likelihood with GEM prior
$z_1 = z_2 = z_7 = z_8 = 1,\ z_3 = z_5 = z_6 = 2,\ z_4 = 3$
$\Rightarrow \Pi_8 = \{\{1, 2, 7, 8\}, \{3, 5, 6\}, \{4\}\}$
• Partition of [8]: set of mutually exclusive & exhaustive subsets of [8] = {1, . . . , 8} [Aldous 1983]
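The seating rule and the step from assignments z to the partition Π can be sketched as follows. This is an illustrative sampler under the rule stated on the slide (existing table with weight equal to its size, new table with weight α); the names `sample_crp` and `labels_to_partition` are hypothetical.

```python
import random


def sample_crp(num_customers, alpha, seed=None):
    """Return table assignments z_1, ..., z_N (tables numbered from 1)."""
    rng = random.Random(seed)
    table_sizes = []  # table_sizes[t] = number of customers at table t + 1
    z = []
    for _ in range(num_customers):
        # Existing tables are weighted by size; the last weight is a new table.
        weights = table_sizes + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(table_sizes):
            table_sizes.append(1)  # customer starts a new table
        else:
            table_sizes[table] += 1
        z.append(table + 1)
    return z


def labels_to_partition(z):
    """Group customer indices (1-based) by their table label."""
    blocks = {}
    for customer, table in enumerate(z, start=1):
        blocks.setdefault(table, []).append(customer)
    return [blocks[t] for t in sorted(blocks)]


# The assignment from the slide: z1 = z2 = z7 = z8 = 1, z3 = z5 = z6 = 2, z4 = 3
print(labels_to_partition([1, 1, 2, 3, 2, 2, 1, 1]))
# [[1, 2, 7, 8], [3, 5, 6], [4]]

print(labels_to_partition(sample_crp(num_customers=8, alpha=1.0, seed=0)))
```

Only the partition matters for clustering; the table numbers themselves are arbitrary labels assigned in order of appearance.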
So far: Dirichlet process, Chinese restaurant process
• Infinity of parameters, growing number of parameters
Exercises
• Review Gibbs sampling
• What are the advantages and disadvantages of the DP
and CRP representations?
• What is the expected number of clusters generated by a
CRP(α) after N data points?
• What do you think about the answer to the previous
question when it comes to real-life data modeling?
• Code a CRP sampler. Examine the empirical distribution of
the number of clusters after N customers.
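One possible route into the expected-number-of-clusters exercise, offered as a sketch rather than the intended solution: write the number of clusters $K_N$ as a sum of indicators, one per customer, for the event that the customer starts a new table. Under the CRP rule, customer $n$ starts a new table with probability $\alpha / (\alpha + n - 1)$, so linearity of expectation gives

```latex
E[K_N]
  = \sum_{n=1}^{N} P(\text{customer } n \text{ starts a new table})
  = \sum_{n=1}^{N} \frac{\alpha}{\alpha + n - 1}
  \approx \alpha \log\!\left(1 + \frac{N}{\alpha}\right)
  \quad \text{for large } N.
```

The logarithmic growth in $N$ is a useful starting point for the follow-up question about real-life data modeling.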
References
A full reference list is provided at the end of the “Part III” slides.
