Professional Documents
Culture Documents
CS411 Tut4 Handout
CS411 Tut4 Handout
Mengye Ren
mren@cs.toronto.edu
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 1 / 21
Naive Bayes
Bayes Rules:
p(x|t)p(t)
p(t|x) =
p(x)
Naive Bayes Assumption:
D
Y
p(x|t) = p(xj |t)
j=1
Likelihood function:
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 2 / 21
Example: Spam Classification
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 3 / 21
Example: Spam Classification
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 3 / 21
Example: Spam Classification
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 3 / 21
Bernoulli Naive Bayes
Assuming all data points x (i) are i.i.d. samples, and p(xj |t) follows a
Bernoulli distribution with parameter µjt
D
Y (i)
x (i)
p(x (i) |t (i) ) = µjtj(i) (1 µjt (i) )(1 xj )
j=1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 4 / 21
Bernoulli Naive Bayes
Assuming all data points x (i) are i.i.d. samples, and p(xj |t) follows a
Bernoulli distribution with parameter µjt
D
Y (i)
x (i)
p(x (i) |t (i) ) = µjtj(i) (1 µjt (i) )(1 xj )
j=1
N
Y N
Y D
Y (i)
x (i)
p(t|x) / p(t (i) )p(x (i) |t (i) ) = p(t (i) ) µjtj(i) (1 µjt (i) )(1 xj )
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 4 / 21
Bernoulli Naive Bayes
Assuming all data points x (i) are i.i.d. samples, and p(xj |t) follows a
Bernoulli distribution with parameter µjt
D
Y (i)
x (i)
p(x (i) |t (i) ) = µjtj(i) (1 µjt (i) )(1 xj )
j=1
N
Y N
Y D
Y (i)
x (i)
p(t|x) / p(t (i) )p(x (i) |t (i) ) = p(t (i) ) µjtj(i) (1 µjt (i) )(1 xj )
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 4 / 21
Derivation of maximum likelihood estimator (MLE)
✓ = [µ, ⇡]
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 5 / 21
Derivation of maximum likelihood estimator (MLE)
✓ = [µ, ⇡]
P
Want: arg max✓ log L(✓) subject to k ⇡k = 1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 5 / 21
Derivation of maximum likelihood estimator (MLE)
Take derivative w.r.t. µ
0 1
N ⇣ ⌘ (i) (i)
@ log L(✓) X xj 1 xj
=0) (i)
1 t =k @ A=0
@µjk µjk 1 µjk
i=1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 6 / 21
Derivation of maximum likelihood estimator (MLE)
Take derivative w.r.t. µ
0 1
N ⇣ ⌘ (i) (i)
@ log L(✓) X xj 1 xj
=0) (i)
1 t =k @ A=0
@µjk µjk 1 µjk
i=1
N
X ⇣ ⌘h ⇣ ⌘ i
(i) (i)
1 t (i) = k xj (1 µjk ) 1 xj µjk = 0
i=1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 6 / 21
Derivation of maximum likelihood estimator (MLE)
Take derivative w.r.t. µ
0 1
N ⇣ ⌘ (i) (i)
@ log L(✓) X xj 1 xj
=0) (i)
1 t =k @ A=0
@µjk µjk 1 µjk
i=1
N
X ⇣ ⌘h ⇣ ⌘ i
(i) (i)
1 t (i) = k xj (1 µjk ) 1 xj µjk = 0
i=1
N
X ⇣ ⌘ N
X ⇣ ⌘
(i)
1 t (i) = k µjk = 1 t (i) = k xj
i=1 i=1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 6 / 21
Derivation of maximum likelihood estimator (MLE)
Take derivative w.r.t. µ
0 1
N ⇣ ⌘ (i) (i)
@ log L(✓) X xj 1 xj
=0) (i)
1 t =k @ A=0
@µjk µjk 1 µjk
i=1
N
X ⇣ ⌘h ⇣ ⌘ i
(i) (i)
1 t (i) = k xj (1 µjk ) 1 xj µjk = 0
i=1
N
X ⇣ ⌘ N
X ⇣ ⌘
(i)
1 t (i) = k µjk = 1 t (i) = k xj
i=1 i=1
PN (i) (i)
i=1 1 t = k xj
µjk = PN
i=1 1 t (i) = k
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 6 / 21
Derivation of maximum likelihood estimator (MLE)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 7 / 21
Derivation of maximum likelihood estimator (MLE)
PN
i=1 1 t (i) = k)
⇡k =
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 7 / 21
Derivation of maximum likelihood estimator (MLE)
PN
i=1 1 t (i) = k)
⇡k =
P
Apply constraint: k ⇡k = 1 ) = N
PN
i=1 1 t (i) = k)
⇡k =
N
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 7 / 21
Spam Classification Demo
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 8 / 21
Gaussian Bayes Classifier
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 9 / 21
Gaussian Bayes Classifier
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 9 / 21
Derivation of maximum likelihood estimator (MLE)
q
✓ = [µ, ⌃, ⇡], Z = (2⇡)D det(⌃)
✓ ◆
1 1 T 1
p(x|t) = exp (x µ) ⌃ (x µ)
Z 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 10 / 21
Derivation of maximum likelihood estimator (MLE)
q
✓ = [µ, ⌃, ⇡], Z = (2⇡)D det(⌃)
✓ ◆
1 1 T 1
p(x|t) = exp (x µ) ⌃ (x µ)
Z 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 10 / 21
Derivation of maximum likelihood estimator (MLE)
q
✓ = [µ, ⌃, ⇡], Z = (2⇡)D det(⌃)
✓ ◆
1 1 T 1
p(x|t) = exp (x µ) ⌃ (x µ)
Z 2
1 ⇣ (i) ⌘T ⇣ ⌘
N
X
= log ⇡t (i) log Z x µt (i) ⌃t (i)1 x (i) µt (i)
2
i=1
P
Want: arg max✓ log L(✓) subject to k ⇡k = 1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 10 / 21
Derivation of maximum likelihood estimator (MLE)
@ log L
N
X ⇣ ⌘
= 1 t (i) = k ⌃ 1
(x (i) µk ) = 0
@µk
i=0
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 11 / 21
Derivation of maximum likelihood estimator (MLE)
@ log L
N
X ⇣ ⌘
= 1 t (i) = k ⌃ 1
(x (i) µk ) = 0
@µk
i=0
PN
i=1 1 t (i) = k x (i)
µk = PN
i=1 1 t (i) = k
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 11 / 21
Derivation of maximum likelihood estimator (MLE)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 12 / 21
Derivation of maximum likelihood estimator (MLE)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 12 / 21
Derivation of maximum likelihood estimator (MLE)
" #
@ log L
N
X ⇣ ⌘ @ log Zk 1 (i)
= 1 t (i) = k (x µk )(x (i)
µk ) T
=0
@⌃k 1 i=0
@⌃k 1 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 12 / 21
Derivation of maximum likelihood estimator (MLE)
" #
@ log L
N
X ⇣ ⌘ @ log Zk 1 (i)
= 1 t (i) = k (x µk )(x (i)
µk ) T
=0
@⌃k 1 i=0
@⌃k 1 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 12 / 21
Derivation of maximum likelihood estimator (MLE)
q
Zk = (2⇡)D det(⌃k )
1
@ log Zk 1 @Zk D 1 D @ det ⌃k 1 2
1
= = (2⇡) 2 det(⌃k ) 2 (2⇡) 2
@⌃k Zk @⌃k 1 @⌃k 1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 13 / 21
Derivation of maximum likelihood estimator (MLE)
q
Zk = (2⇡)D det(⌃k )
1
1
D @ det ⌃
2
@ log Zk 1 @Zk D 1
k
= = (2⇡) 2 det(⌃ ) 2 (2⇡) 2
k
@⌃k 1 Zk @⌃k 1 @⌃k 1
✓ ◆
1 12 1 3 1
= det(⌃k ) det ⌃k 1 2 det ⌃k 1 ⌃T k = ⌃k
2 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 13 / 21
Derivation of maximum likelihood estimator (MLE)
q
Zk = (2⇡)D det(⌃k )
1
1
D @ det ⌃
2
@ log Zk 1 @Zk D 1
k
= = (2⇡) 2 det(⌃ ) 2 (2⇡) 2
k
@⌃k 1 Zk @⌃k 1 @⌃k 1
✓ ◆
1 12 1 3 1
= det(⌃k ) det ⌃k 1 2 det ⌃k 1 ⌃T k = ⌃k
2 2
@ log L
N
X ⇣ ⌘ 1 1 (i)
(i)
= 1 t =k ⌃k (x µk )(x (i) µk ) T = 0
@⌃k 1 i=0
2 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 13 / 21
Derivation of maximum likelihood estimator (MLE)
q
Zk = (2⇡)D det(⌃k )
1
1
D @ det ⌃
2
@ log Zk 1 @Zk D 1
k
= = (2⇡) 2 det(⌃ ) 2 (2⇡) 2
k
@⌃k 1 Zk @⌃k 1 @⌃k 1
✓ ◆
1 12 1 3 1
= det(⌃k ) det ⌃k 1 2 det ⌃k 1 ⌃T k = ⌃k
2 2
@ log L
N
X ⇣ ⌘ 1 1 (i)
(i)
= 1 t =k ⌃k (x µk )(x (i) µk ) T = 0
@⌃k 1 i=0
2 2
PN T
i=1 1 t (i) = k x (i) µk x (i) µk
⌃k = PN
i=1 1 t (i) = k
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 13 / 21
Derivation of maximum likelihood estimator (MLE)
PN
t (i) = k)
i=1 1
⇡k =
N
(Same as Bernoulli)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 14 / 21
Gaussian Bayes Classifier Demo
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 15 / 21
Gaussian Bayes Classifier
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 16 / 21
Gaussian Bayes Classifier
D
Y ✓ ◆ D
Y
1 1
= p exp ||xj µjt ||22 = p(xj |t)
(2⇡)D ⌃t,jj 2⌃t,jj
j=1 j=1
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 16 / 21
Gaussian Bayes Classifier
p(x|t) = N (x|µt , ⌃)
p(x|t) = N (x|µt , ⌃t )
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 17 / 21
Gaussian Bayes Binary Classifier Decision Boundary
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 18 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 ) T ⌃ 1
(x µ1 ) = log ⇡0 (x µ0 ) T ⌃ 1
(x µ0 )
2 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 18 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 ) T ⌃ 1
(x µ1 ) = log ⇡0 (x µ0 ) T ⌃ 1
(x µ0 )
2 2
C + xT ⌃ 1
x 2µT
1⌃
1
x + µT 1 T 1
1 ⌃ µ1 = x ⌃ x 2µT 1 T
0 ⌃ x + µ0 ⌃
1
µ0
h i
2(µ0 µ1 )T ⌃ 1 x (µ0 µ1 )T ⌃ 1 (µ0 µ1 ) = C
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 18 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 ) T ⌃ 1
(x µ1 ) = log ⇡0 (x µ0 ) T ⌃ 1
(x µ0 )
2 2
C + xT ⌃ 1
x 2µT
1⌃
1
x + µT 1 T 1
1 ⌃ µ1 = x ⌃ x 2µT 1 T
0 ⌃ x + µ0 ⌃
1
µ0
h i
2(µ0 µ1 )T ⌃ 1 x (µ0 µ1 )T ⌃ 1 (µ0 µ1 ) = C
) aT x b=0
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 18 / 21
Relation to Logistic Regression
p(x, t = 0) ⇡0 N (x|µ0 , ⌃)
=
p(x, t = 0) + p(x, t = 1) ⇡0 N (x|µ0 , ⌃) + ⇡1 N (x|µ1 , ⌃)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 19 / 21
Relation to Logistic Regression
p(x, t = 0) ⇡0 N (x|µ0 , ⌃)
=
p(x, t = 0) + p(x, t = 1) ⇡0 N (x|µ0 , ⌃) + ⇡1 N (x|µ1 , ⌃)
⇢ 1
⇡1 1 1
= 1+ exp (x µ1 ) T ⌃ 1
(x µ1 ) + (x µ0 ) T ⌃ 1
(x µ0 )
⇡0 2 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 19 / 21
Relation to Logistic Regression
p(x, t = 0) ⇡0 N (x|µ0 , ⌃)
=
p(x, t = 0) + p(x, t = 1) ⇡0 N (x|µ0 , ⌃) + ⇡1 N (x|µ1 , ⌃)
⇢ 1
⇡1 1 1
= 1+ exp (x µ1 )T ⌃ 1 (x µ1 ) + (x µ0 )T ⌃ 1 (x µ0 )
⇡0 2 2
⇢
⇡1 1⇣ T 1 ⌘ 1
= 1 + exp log + (µ1 µ0 )T ⌃ 1 x + µ1 ⌃ µ1 µT 1
0 ⌃ µ0
⇡0 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 19 / 21
Relation to Logistic Regression
p(x, t = 0) ⇡0 N (x|µ0 , ⌃)
=
p(x, t = 0) + p(x, t = 1) ⇡0 N (x|µ0 , ⌃) + ⇡1 N (x|µ1 , ⌃)
⇢ 1
⇡1 1 1
= 1+ exp (x µ1 )T ⌃ 1 (x µ1 ) + (x µ0 )T ⌃ 1 (x µ0 )
⇡0 2 2
⇢
⇡1 1⇣ T 1 ⌘ 1
= 1 + exp log + (µ1 µ0 )T ⌃ 1 x + µ1 ⌃ µ1 µT 1
0 ⌃ µ0
⇡0 2
1
=
1 + exp( w T x b)
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 19 / 21
Gaussian Bayes Binary Classifier Decision Boundary
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 20 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 )T ⌃1 1 (x µ1 ) = log ⇡0 (x µ0 )T ⌃0 1 (x µ0 )
2 2
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 20 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 )T ⌃1 1 (x µ1 ) = log ⇡0 (x µ0 )T ⌃0 1 (x µ0 )
2 2
⇣ ⌘ ⇣ ⌘
x T ⌃1 1 ⌃0 1 x 2 µT
1 ⌃1
1
µT ⌃
0 0
1
x + µ T
0 ⌃ 0 µ 0 µ T
1 ⌃ 1 µ 1 =C
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 20 / 21
Gaussian Bayes Binary Classifier Decision Boundary
1 1
log ⇡1 (x µ1 )T ⌃1 1 (x µ1 ) = log ⇡0 (x µ0 )T ⌃0 1 (x µ0 )
2 2
⇣ ⌘ ⇣ ⌘
x T ⌃1 1 ⌃0 1 x 2 µT
1 ⌃1
1
µT ⌃
0 0
1
x + µ T
0 ⌃ 0 µ 0 µ T
1 ⌃ 1 µ 1 =C
) x T Qx 2b T x + c = 0
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 20 / 21
Thanks!
Mengye Ren Naive Bayes and Gaussian Bayes Classifier September 29, 2017 21 / 21