INTRODUCTION TO MACHINE LEARNING
3RD EDITION
ETHEM ALPAYDIN
The MIT Press, 2014
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml3e
CHAPTER 4:
PARAMETRIC METHODS
Parametric Estimation

$\mathcal{X} = \{x^t\}_{t=1}^N$ where $x^t \sim p(x)$

Parametric estimation: assume a form for $p(x|\theta)$ and estimate $\theta$, its sufficient statistics, using $\mathcal{X}$,
e.g., $\mathcal{N}(\mu, \sigma^2)$ where $\theta = \{\mu, \sigma^2\}$
Maximum Likelihood Estimation

Log likelihood:
$$\mathcal{L}(\theta|\mathcal{X}) = \log l(\theta|\mathcal{X}) = \sum_t \log p(x^t|\theta)$$

For $p(x) = \mathcal{N}(\mu, \sigma^2)$:
$$p(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{(x-\mu)^2}{2\sigma^2}\right]$$

ML estimates:
$$m = \frac{\sum_t x^t}{N}, \qquad s^2 = \frac{\sum_t (x^t - m)^2}{N}$$
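The ML estimates above can be sketched in plain Python. A minimal example (the sample values are invented for illustration):

```python
import math

def ml_gaussian(xs):
    """ML estimates for a Gaussian: m is the sample mean, s2 the sample
    variance with an N (not N - 1) denominator, as on the slide."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return m, s2

def log_likelihood(xs, m, s2):
    """L(m, s2 | X) = sum_t log N(x^t; m, s2)."""
    return sum(-0.5 * math.log(2 * math.pi * s2) - (x - m) ** 2 / (2 * s2)
               for x in xs)

m, s2 = ml_gaussian([1.0, 2.0, 3.0, 4.0])
print(m, s2)  # 2.5 1.25
```

Perturbing either estimate lowers the log likelihood, which is one quick way to check that $(m, s^2)$ is indeed the maximizer.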
Bias and Variance

Unknown parameter $\theta$
Estimator $d_i = d(\mathcal{X}_i)$ on sample $\mathcal{X}_i$
Bias: $b_\theta(d) = E[d] - \theta$
Variance: $E\left[(d - E[d])^2\right]$
Mean square error: $r(d, \theta) = E\left[(d - \theta)^2\right] = \left(E[d] - \theta\right)^2 + E\left[(d - E[d])^2\right] = \text{bias}^2 + \text{variance}$
Bayes' Estimator

$x^t \sim \mathcal{N}(\theta, \sigma_0^2)$ and prior $\theta \sim \mathcal{N}(\mu, \sigma^2)$
$$\theta_{ML} = m$$
$$\theta_{MAP} = \theta_{Bayes'} = E[\theta|\mathcal{X}] = \frac{N/\sigma_0^2}{N/\sigma_0^2 + 1/\sigma^2}\, m + \frac{1/\sigma^2}{N/\sigma_0^2 + 1/\sigma^2}\, \mu$$
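The Bayes' estimator above is a precision-weighted average of the sample mean and the prior mean. A short sketch, with invented numbers:

```python
def map_estimate(xs, mu, sigma2, sigma0_2):
    """theta_MAP for x^t ~ N(theta, sigma0^2) with prior theta ~ N(mu, sigma^2):
    weights N/sigma0^2 (data) and 1/sigma^2 (prior), as in the slide formula."""
    n = len(xs)
    m = sum(xs) / n
    w_data = n / sigma0_2      # N / sigma0^2
    w_prior = 1.0 / sigma2     # 1 / sigma^2
    return (w_data * m + w_prior * mu) / (w_data + w_prior)

# A vague prior (large sigma^2) leaves the estimate near the ML estimate m;
# a sharp prior (small sigma^2) pulls it toward the prior mean mu.
print(map_estimate([4.0, 6.0], mu=0.0, sigma2=1e6, sigma0_2=1.0))   # ~5.0
print(map_estimate([4.0, 6.0], mu=0.0, sigma2=1e-6, sigma0_2=1.0))  # ~0.0
```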
Parametric Classification

$$g_i(x) = p(x|C_i)\,P(C_i)$$
or
$$g_i(x) = \log p(x|C_i) + \log P(C_i)$$

$$p(x|C_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i}\exp\left[-\frac{(x-\mu_i)^2}{2\sigma_i^2}\right]$$

$$g_i(x) = -\frac{1}{2}\log 2\pi - \log \sigma_i - \frac{(x-\mu_i)^2}{2\sigma_i^2} + \log P(C_i)$$

Given the sample $\mathcal{X} = \{x^t, r^t\}_{t=1}^N$ where
$$r_i^t = \begin{cases} 1 & \text{if } x^t \in C_i \\ 0 & \text{if } x^t \in C_j,\ j \neq i \end{cases}$$

ML estimates are
$$\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}, \qquad m_i = \frac{\sum_t x^t r_i^t}{\sum_t r_i^t}, \qquad s_i^2 = \frac{\sum_t (x^t - m_i)^2\, r_i^t}{\sum_t r_i^t}$$

Discriminant:
$$g_i(x) = -\frac{1}{2}\log 2\pi - \log s_i - \frac{(x - m_i)^2}{2 s_i^2} + \log \hat{P}(C_i)$$
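A minimal sketch of this Gaussian classifier in Python (the two toy classes, their samples, and the equal priors are invented for illustration):

```python
import math

def fit_gaussian_class(xs):
    """ML estimates m_i and s_i^2 from the training examples of one class."""
    n = len(xs)
    m = sum(xs) / n
    s2 = sum((x - m) ** 2 for x in xs) / n
    return m, s2

def discriminant(x, m, s2, prior):
    """g_i(x) = -(1/2) log 2pi - log s_i - (x - m_i)^2 / (2 s_i^2) + log P(C_i).
    Note -log s_i == -0.5 * log s_i^2."""
    return (-0.5 * math.log(2 * math.pi) - 0.5 * math.log(s2)
            - (x - m) ** 2 / (2 * s2) + math.log(prior))

# Two invented classes; classify by the largest discriminant value.
c0 = fit_gaussian_class([1.0, 2.0, 3.0])   # class 0 centered near 2
c1 = fit_gaussian_class([7.0, 8.0, 9.0])   # class 1 centered near 8
x = 2.5
label = 0 if discriminant(x, *c0, 0.5) > discriminant(x, *c1, 0.5) else 1
print(label)  # 0
```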
[Figure] Equal variances: a single boundary halfway between the means.
[Figure] Variances are different: two boundaries.
Regression

$$r = f(x) + \epsilon, \qquad \epsilon \sim \mathcal{N}(0, \sigma^2)$$
Estimator: $g(x|\theta)$
$$p(r|x) \sim \mathcal{N}\left(g(x|\theta), \sigma^2\right)$$
$$\mathcal{L}(\theta|\mathcal{X}) = \log \prod_{t=1}^N p(x^t, r^t) = \sum_{t=1}^N \log p(r^t|x^t) + \sum_{t=1}^N \log p(x^t)$$
Regression: From LogL to Error

$$\mathcal{L}(\theta|\mathcal{X}) = \log \prod_{t=1}^N \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left[-\frac{\left(r^t - g(x^t|\theta)\right)^2}{2\sigma^2}\right]$$
$$= -N \log\left(\sqrt{2\pi}\,\sigma\right) - \frac{1}{2\sigma^2}\sum_{t=1}^N \left(r^t - g(x^t|\theta)\right)^2$$

Maximizing the log likelihood is therefore equivalent to minimizing the squared error:
$$E(\theta|\mathcal{X}) = \frac{1}{2}\sum_{t=1}^N \left(r^t - g(x^t|\theta)\right)^2$$
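The equivalence above can be checked numerically: the log likelihood equals $-N\log(\sqrt{2\pi}\,\sigma)$ minus $E(\theta|\mathcal{X})/\sigma^2$. A sketch, where both the toy data and the fitted linear model are invented:

```python
import math

def log_lik(data, g, sigma):
    """L(theta|X) = sum_t log N(r^t; g(x^t), sigma^2)."""
    return sum(math.log(1.0 / (math.sqrt(2 * math.pi) * sigma))
               - (r - g(x)) ** 2 / (2 * sigma ** 2) for x, r in data)

def sq_error(data, g):
    """E(theta|X) = (1/2) sum_t (r^t - g(x^t))^2."""
    return 0.5 * sum((r - g(x)) ** 2 for x, r in data)

data = [(0.0, 1.1), (1.0, 2.9), (2.0, 5.2)]   # invented sample
g = lambda x: 2.0 * x + 1.0                    # hypothetical fitted model
sigma = 0.5
lhs = log_lik(data, g, sigma)
rhs = (-len(data) * math.log(math.sqrt(2 * math.pi) * sigma)
       - sq_error(data, g) / sigma ** 2)
print(abs(lhs - rhs) < 1e-9)  # True
```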
Linear Regression

$$g(x^t|w_1, w_0) = w_1 x^t + w_0$$
$$\sum_t r^t = N w_0 + w_1 \sum_t x^t$$
$$\sum_t r^t x^t = w_0 \sum_t x^t + w_1 \sum_t (x^t)^2$$

In matrix form, $A\mathbf{w} = \mathbf{y}$:
$$A = \begin{bmatrix} N & \sum_t x^t \\ \sum_t x^t & \sum_t (x^t)^2 \end{bmatrix}, \quad \mathbf{w} = \begin{bmatrix} w_0 \\ w_1 \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} \sum_t r^t \\ \sum_t r^t x^t \end{bmatrix}, \qquad \mathbf{w} = A^{-1}\mathbf{y}$$
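For the 2×2 case, $A^{-1}$ has a closed form, so $\mathbf{w} = A^{-1}\mathbf{y}$ can be written out directly. A pure-Python sketch (the sample points are made up):

```python
def fit_line(data):
    """Solve the normal equations A w = y for w = (w0, w1) using the
    closed-form inverse of the 2x2 matrix A."""
    n = len(data)
    sx = sum(x for x, _ in data)         # sum_t x^t
    sx2 = sum(x * x for x, _ in data)    # sum_t (x^t)^2
    sr = sum(r for _, r in data)         # sum_t r^t
    srx = sum(r * x for x, r in data)    # sum_t r^t x^t
    det = n * sx2 - sx * sx              # det(A)
    w0 = (sx2 * sr - sx * srx) / det
    w1 = (n * srx - sx * sr) / det
    return w0, w1

# Points lying exactly on r = 2x + 1 recover w0 = 1, w1 = 2.
print(fit_line([(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]))  # (1.0, 2.0)
```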
Polynomial Regression

$$g(x^t|w_k, \ldots, w_2, w_1, w_0) = w_k (x^t)^k + \cdots + w_2 (x^t)^2 + w_1 x^t + w_0$$
$$D = \begin{bmatrix} 1 & x^1 & (x^1)^2 & \cdots & (x^1)^k \\ 1 & x^2 & (x^2)^2 & \cdots & (x^2)^k \\ \vdots & & & & \vdots \\ 1 & x^N & (x^N)^2 & \cdots & (x^N)^k \end{bmatrix}, \qquad \mathbf{r} = \begin{bmatrix} r^1 \\ r^2 \\ \vdots \\ r^N \end{bmatrix}$$
$$\mathbf{w} = (D^T D)^{-1} D^T \mathbf{r}$$
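A pure-Python sketch of $\mathbf{w} = (D^T D)^{-1} D^T \mathbf{r}$: build the design matrix, form the normal equations $(D^T D)\mathbf{w} = D^T \mathbf{r}$, and solve them by Gaussian elimination rather than forming the inverse explicitly. The data in the check are invented:

```python
def fit_poly(xs, rs, k):
    """Least-squares polynomial fit: solve (D^T D) w = D^T r for the
    (k+1)-vector w = (w0, w1, ..., wk)."""
    N = len(xs)
    D = [[x ** j for j in range(k + 1)] for x in xs]   # N x (k+1) design matrix
    A = [[sum(D[t][i] * D[t][j] for t in range(N)) for j in range(k + 1)]
         for i in range(k + 1)]                         # D^T D
    b = [sum(D[t][i] * rs[t] for t in range(N)) for i in range(k + 1)]  # D^T r
    n = k + 1
    # Forward elimination with partial pivoting.
    for col in range(n):
        piv = max(range(col, n), key=lambda rw: abs(A[rw][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for rw in range(col + 1, n):
            f = A[rw][col] / A[col][col]
            for j in range(col, n):
                A[rw][j] -= f * A[col][j]
            b[rw] -= f * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in range(n - 1, -1, -1):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, n))) / A[i][i]
    return w

# Points drawn exactly from r = x^2 recover w ~ [0, 0, 1].
print(fit_poly([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 4.0, 9.0], 2))
```

In practice the normal equations are ill-conditioned for high orders; library least-squares routines are preferable, but the sketch follows the slide's formula.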
Other Error Measures

Square error:
$$E(\theta|\mathcal{X}) = \frac{1}{2}\sum_{t=1}^N \left(r^t - g(x^t|\theta)\right)^2$$

Relative square error:
$$E(\theta|\mathcal{X}) = \frac{\sum_{t=1}^N \left(r^t - g(x^t|\theta)\right)^2}{\sum_{t=1}^N \left(r^t - \bar{r}\right)^2}$$
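Relative square error normalizes by the error of always predicting the mean $\bar{r}$: a perfect fit scores 0, a mean predictor scores 1. A short sketch with invented values:

```python
def relative_square_error(rs, preds):
    """sum (r^t - g(x^t))^2 / sum (r^t - rbar)^2, per the slide formula."""
    rbar = sum(rs) / len(rs)
    num = sum((r - p) ** 2 for r, p in zip(rs, preds))
    den = sum((r - rbar) ** 2 for r in rs)
    return num / den

rs = [1.0, 2.0, 3.0, 4.0]
print(relative_square_error(rs, rs))         # 0.0  (perfect fit)
print(relative_square_error(rs, [2.5] * 4))  # 1.0  (mean predictor)
```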
Bias and Variance

$$E\left[(r - g(x))^2 \mid x\right] = \underbrace{E\left[(r - E[r|x])^2 \mid x\right]}_{\text{noise}} + \underbrace{\left(E[r|x] - g(x)\right)^2}_{\text{squared error}}$$
$$E_\mathcal{X}\left[\left(E[r|x] - g(x)\right)^2 \mid x\right] = \underbrace{\left(E[r|x] - E_\mathcal{X}[g(x)]\right)^2}_{\text{bias}} + \underbrace{E_\mathcal{X}\left[\left(g(x) - E_\mathcal{X}[g(x)]\right)^2\right]}_{\text{variance}}$$
Estimating Bias and Variance

$M$ samples $\mathcal{X}_i$, $i = 1, \ldots, M$, are used to fit $g_i(x)$, $i = 1, \ldots, M$:
$$\bar{g}(x) = \frac{1}{M}\sum_i g_i(x)$$
$$\text{Bias}^2(g) = \frac{1}{N}\sum_t \left(\bar{g}(x^t) - f(x^t)\right)^2$$
$$\text{Variance}(g) = \frac{1}{NM}\sum_t \sum_i \left(g_i(x^t) - \bar{g}(x^t)\right)^2$$
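Given the $M$ fitted models evaluated at the $N$ evaluation points, both estimates reduce to two averages. A sketch; `g_values` and `f_values` are hypothetical tables holding $g_i(x^t)$ and $f(x^t)$:

```python
def bias2_and_variance(g_values, f_values):
    """g_values[i][t] holds g_i(x^t) for M models; f_values[t] holds f(x^t)."""
    M, N = len(g_values), len(f_values)
    g_bar = [sum(g_values[i][t] for i in range(M)) / M for t in range(N)]
    bias2 = sum((g_bar[t] - f_values[t]) ** 2 for t in range(N)) / N
    variance = sum((g_values[i][t] - g_bar[t]) ** 2
                   for i in range(M) for t in range(N)) / (N * M)
    return bias2, variance

# Two models disagreeing symmetrically around f: zero bias, unit variance.
print(bias2_and_variance([[0.0, 2.0], [2.0, 0.0]], [1.0, 1.0]))  # (0.0, 1.0)
```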
Bias/Variance Dilemma

As we increase complexity,
bias decreases (a better fit to data) and
variance increases (fit varies more with data).
Bias/variance dilemma (Geman et al., 1992)

[Figure] The target $f$, the fitted functions $g_i$, and their average $\bar{g}$, illustrating bias and variance.
Polynomial Regression

Coefficients increase in magnitude as order increases:
1: [-0.0769, 0.0016]
2: [0.1682, -0.6657, 0.0080]
3: [0.4238, -2.5778, 3.4675, -0.0002]
4: [-0.1093, 1.4356, -5.5007, 6.0454, -0.0019]