Empirical Risk Minimization 2
$$R(h) \le R_{T^m}(h) + (b - a)\sqrt{\frac{\log 2|\mathcal{H}| + \log\frac{1}{\delta}}{2m}}, \quad \forall h \in \mathcal{H}$$
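To get a feel for the size of the second term in the bound, it can be evaluated numerically. The following is a minimal sketch, assuming a finite hypothesis class and a loss with values in [a, b]; the function name `generalization_gap` and its parameter names are illustrative and do not come from the lecture.

```python
import math


def generalization_gap(num_hypotheses, m, delta, a=0.0, b=1.0):
    """Evaluate (b - a) * sqrt((log(2|H|) + log(1/delta)) / (2m)).

    num_hypotheses: |H|, the size of the finite hypothesis class (assumed finite)
    m:              number of training examples
    delta:          allowed failure probability, 0 < delta < 1
    a, b:           lower and upper bound on the loss values
    """
    if num_hypotheses < 1 or m < 1 or not (0.0 < delta < 1.0):
        raise ValueError("need |H| >= 1, m >= 1 and 0 < delta < 1")
    numerator = math.log(2 * num_hypotheses) + math.log(1.0 / delta)
    return (b - a) * math.sqrt(numerator / (2 * m))


if __name__ == "__main__":
    # The gap shrinks like 1/sqrt(m): four times more data halves it.
    for m in (1_000, 4_000, 16_000):
        print(m, generalization_gap(num_hypotheses=1000, m=m, delta=0.05))
```

Because |H| enters only through a logarithm, even a large finite class adds little to the gap, while the 1/sqrt(m) rate dominates.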
This lecture answers the following questions:
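$$h(x; w, b) = \operatorname{sign}(\langle w, \phi(x)\rangle + b) = \begin{cases} +1 & \text{if } \langle w, \phi(x)\rangle + b \ge 0 \\ -1 & \text{if } \langle w, \phi(x)\rangle + b < 0 \end{cases}$$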
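The decision rule above can be sketched in a few lines. This is only an illustration: it assumes the feature vector φ(x) has already been computed and is passed in as a NumPy array, and the name `linear_classifier` is hypothetical. Note that `numpy.sign` returns 0 at a score of exactly zero, so the tie is handled explicitly to match the definition (+1 when the score is ≥ 0).

```python
import numpy as np


def linear_classifier(phi_x, w, b):
    """Return +1 if <w, phi(x)> + b >= 0 and -1 otherwise.

    phi_x: feature vector phi(x), assumed to be precomputed
    w:     weight vector of the same length
    b:     scalar bias
    """
    score = float(np.dot(w, phi_x)) + b
    return 1 if score >= 0 else -1


if __name__ == "__main__":
    w = np.array([1.0, -1.0])
    print(linear_classifier(np.array([1.0, 2.0]), w, b=1.0))  # score is exactly 0
    print(linear_classifier(np.array([1.0, 2.0]), w, b=0.5))  # score is negative
```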
$$T^m = \{(x^i, y^i) \in (\mathcal{X} \times \mathcal{Y}) \mid i = 1, \dots, m\}$$
to
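$$(w^*, b^*) \in \operatorname*{Argmin}_{h \in \mathcal{H}} R^{0/1}_{T^m}(h) = \operatorname*{Argmin}_{(w,b) \in (\mathbb{R}^n \times \mathbb{R})} R^{0/1}_{T^m}\big(h(\cdot; w, b)\big) \tag{1}$$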
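The empirical 0/1 risk in (1) is simply the fraction of training examples that the classifier labels incorrectly. The sketch below computes it and then picks a minimizer from a finite list of candidate parameters. Restricting the search to a finite candidate list is an assumption made purely for illustration (the problem in (1) ranges over all of R^n × R); the function names are hypothetical.

```python
import numpy as np


def empirical_01_risk(w, b, features, labels):
    """Fraction of examples with h(x; w, b) != y.

    features: array of shape (m, n), row i holds phi(x^i)
    labels:   array of shape (m,) with entries in {-1, +1}
    """
    predictions = np.where(features @ w + b >= 0, 1, -1)
    return float(np.mean(predictions != labels))


def erm_over_candidates(candidates, features, labels):
    """Return the (w, b) pair with the smallest empirical 0/1 risk.

    candidates: finite list of (w, b) pairs; an illustrative stand-in
                for the full parameter space R^n x R.
    """
    return min(
        candidates,
        key=lambda wb: empirical_01_risk(wb[0], wb[1], features, labels),
    )
```

For example, on four one-dimensional points with labels given by the sign of x, the candidate (w, b) = ([1], 0) has empirical risk 0 and is the one returned.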
Recall the ULLN:
$$\forall \varepsilon > 0: \quad \lim_{m \to \infty} P\Big(\sup_{h \in \mathcal{H}} \big|R^{0/1}(h) - R^{0/1}_{T^m}(h)\big| \ge \varepsilon\Big) = 0$$
Vapnik-Chervonenkis (VC) dimension
VC dimension is a concept for measuring the complexity of an infinite hypothesis space.
We learned how to bound the deviation between the empirical and the true risk:
$$P\Big(\max_{h \in \mathcal{H}} |R_{T^m}(h) - R(h)| \ge \varepsilon\Big) \le 2|\mathcal{H}|\, e^{-\frac{2m\varepsilon^2}{(b-a)^2}} = B_1(m, |\mathcal{H}|, \varepsilon)$$
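The bound B1 can be checked empirically on a toy problem. The sketch below is an assumption-laden illustration rather than part of the lecture: inputs are drawn uniformly from [0, 1], the true label is 1[x ≥ 0.5], and the hypothesis class is a finite grid of threshold classifiers 1[x ≥ t], whose true 0/1 risk is |t − 0.5|. It estimates how often the largest deviation reaches ε and compares that frequency with B1.

```python
import math

import numpy as np


def hoeffding_union_bound(m, num_hypotheses, eps, a=0.0, b=1.0):
    """B1(m, |H|, eps) = 2 |H| exp(-2 m eps^2 / (b - a)^2)."""
    return 2 * num_hypotheses * math.exp(-2 * m * eps**2 / (b - a) ** 2)


def simulate_max_deviation(m, thresholds, eps, trials=2000, seed=0):
    """Estimate P(max_h |R_T(h) - R(h)| >= eps) on the toy problem.

    Toy setup (an assumption of this sketch): x ~ Uniform[0, 1],
    true label 1[x >= 0.5], hypotheses 1[x >= t] for t in thresholds.
    """
    rng = np.random.default_rng(seed)
    x = rng.uniform(0.0, 1.0, size=(trials, m))  # one training set per row
    y = x >= 0.5
    max_dev = np.zeros(trials)
    for t in thresholds:
        emp_risk = np.mean((x >= t) != y, axis=1)  # empirical 0/1 risk per trial
        true_risk = abs(t - 0.5)                   # P(x lies between t and 0.5)
        max_dev = np.maximum(max_dev, np.abs(emp_risk - true_risk))
    return float(np.mean(max_dev >= eps))


if __name__ == "__main__":
    thresholds = np.linspace(0.0, 1.0, 11)
    freq = simulate_max_deviation(m=200, thresholds=thresholds, eps=0.1)
    bound = hoeffding_union_bound(m=200, num_hypotheses=len(thresholds), eps=0.1)
    print(f"observed frequency {freq:.4f} vs. bound {bound:.4f}")
```

The observed frequency is typically far below B1, since the union bound ignores the strong correlation between neighbouring thresholds.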
h∈H
minimal risk R(h)?
Excess error = Estimation error + Approximation error
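Written out in symbols, the decomposition takes the following form. This is a sketch under assumed notation: h_m denotes the hypothesis returned by ERM on T^m, h_H a best hypothesis within H, and R^* the best attainable risk over all predictors (the symbol R^* is introduced here for illustration).

```latex
% Assumed notation: h_m is the ERM hypothesis, h_H the best hypothesis in H,
% R^* the best attainable risk (symbol introduced for this sketch).
\[
\underbrace{R(h_m) - R^*}_{\text{excess error}}
= \underbrace{R(h_m) - R(h_{\mathcal{H}})}_{\text{estimation error}}
+ \underbrace{R(h_{\mathcal{H}}) - R^*}_{\text{approximation error}}
\]
```

Only the first term depends on the training set; the second is fixed once H is chosen.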
Note that:
The approximation error depends on H.
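$$R(h_m) - R(h_{\mathcal{H}}) \le \varepsilon$$
$$\forall \varepsilon > 0: \quad \lim_{m \to \infty} P\Big(R(h_m) - R(h_{\mathcal{H}}) \ge \varepsilon\Big) = 0$$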
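Therefore $\varepsilon \le R(h_m) - R(h_{\mathcal{H}})$ implies $\frac{\varepsilon}{2} \le \sup_{h \in \mathcal{H}} |R(h) - R_{T^m}(h)|$ and
$$P\Big(R(h_m) - R(h_{\mathcal{H}}) \ge \varepsilon\Big) \le P\Big(\sup_{h \in \mathcal{H}} |R(h) - R_{T^m}(h)| \ge \frac{\varepsilon}{2}\Big),$$
so if the RHS converges to zero (ULLN), so does the LHS (estimation error).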
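The implication used above follows from a standard three-term split. The derivation below is a sketch that assumes h_m minimizes the empirical risk over H and h_H minimizes the true risk over H, so the middle bracket is non-positive.

```latex
% Assumes h_m minimizes R_{T^m} over H and h_H minimizes R over H.
\begin{align*}
R(h_m) - R(h_{\mathcal{H}})
&= \big[R(h_m) - R_{T^m}(h_m)\big]
 + \big[R_{T^m}(h_m) - R_{T^m}(h_{\mathcal{H}})\big]
 + \big[R_{T^m}(h_{\mathcal{H}}) - R(h_{\mathcal{H}})\big] \\
&\le \big|R(h_m) - R_{T^m}(h_m)\big| + 0
 + \big|R_{T^m}(h_{\mathcal{H}}) - R(h_{\mathcal{H}})\big| \\
&\le 2 \sup_{h \in \mathcal{H}} \big|R(h) - R_{T^m}(h)\big|
\end{align*}
```

Hence an estimation error of at least ε forces the uniform deviation to be at least ε/2.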
Universal consistency for ERM algorithms
We have shown the relation between the estimation error and the uniform deviation:
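$$P\Big(R(h_m) - R(h_{\mathcal{H}}) \ge \varepsilon\Big) \le P\Big(\sup_{h \in \mathcal{H}} |R(h) - R_{T^m}(h)| \ge \frac{\varepsilon}{2}\Big) \le 2|\mathcal{H}|\, e^{-\frac{m\varepsilon^2}{2(b-a)^2}}$$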
$$R(h_m) - R(h_{\mathcal{H}}) \le \varepsilon.$$
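Setting the right-hand side of the bound, 2|H| exp(−mε² / (2(b−a)²)), to be at most δ and solving for m gives m ≥ (2(b−a)²/ε²) · log(2|H|/δ). The sketch below evaluates this sample size; it assumes a finite hypothesis class, and the function names are illustrative.

```python
import math


def estimation_error_bound(m, num_hypotheses, eps, a=0.0, b=1.0):
    """Upper bound 2 |H| exp(-m eps^2 / (2 (b - a)^2)) on P(R(h_m) - R(h_H) >= eps)."""
    return 2 * num_hypotheses * math.exp(-m * eps**2 / (2 * (b - a) ** 2))


def erm_sample_size(num_hypotheses, eps, delta, a=0.0, b=1.0):
    """Smallest integer m with estimation_error_bound(m, ...) <= delta.

    Obtained by solving 2 |H| exp(-m eps^2 / (2 (b - a)^2)) <= delta for m.
    """
    if not (0.0 < delta < 1.0) or eps <= 0:
        raise ValueError("need 0 < delta < 1 and eps > 0")
    m = 2 * (b - a) ** 2 / eps**2 * math.log(2 * num_hypotheses / delta)
    return math.ceil(m)


if __name__ == "__main__":
    m = erm_sample_size(num_hypotheses=1000, eps=0.1, delta=0.05)
    print(m, estimation_error_bound(m, 1000, 0.1))
```

The required m grows like 1/ε² but only logarithmically in |H| and 1/δ.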