Non-Parametric Estimation: Rajkumar Saha (WS)
Non-Parametric Estimation
Rajkumar Saha(ws)
University of South Carolina
Non-parametric
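Empirical CDF:
Definition: The empirical distribution function $\hat{F}_n(x)$ is the CDF that puts mass $1/n$ at each data point $X_i$. Formally,
\[
\hat{F}_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x)
\]
where
\[
I(X_i \le x) =
\begin{cases}
1, & \text{if } X_i \le x; \\
0, & \text{if } X_i > x.
\end{cases}
\]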
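The definition translates almost directly into code. The following is a minimal Python sketch, not part of the original slides; the function name `ecdf` and the sample values are hypothetical. It sorts the data once and counts the observations that are at most $x$ with a binary search.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return the function x -> F_n(x) = (1/n) * #{i : X_i <= x}."""
    xs = sorted(sample)
    n = len(xs)

    def F(x):
        # bisect_right gives the number of sorted observations that are <= x
        return bisect_right(xs, x) / n

    return F


data = [2.0, 1.0, 3.0, 2.0]  # hypothetical sample, n = 4
F = ecdf(data)
print(F(0.5), F(1.0), F(2.0), F(3.5))  # 0.0 0.25 0.75 1.0
```

The tied value 2.0 contributes a jump of 2/n at that point, consistent with placing mass 1/n on each observation.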
[Figure: empirical CDF (Fhat) plotted against t in four panels, for n = 10, n = 50, n = 100, and n = 1000.]
Useful inequalities
Hoeffding's Inequality: Let $Y_1, Y_2, \ldots, Y_n$ be independent observations such that $E(Y_i) = 0$ and $a_i \le Y_i \le b_i$. Let $\epsilon > 0$. Then, for any $t > 0$,
\[
P\left( \sum_{i=1}^{n} Y_i \ge \epsilon \right) \le e^{-t\epsilon} \prod_{i=1}^{n} e^{t^2 (b_i - a_i)^2 / 8}.
\]
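As a quick numerical sanity check, the bound $e^{-t\epsilon} \prod_i e^{t^2 (b_i - a_i)^2/8}$ can be compared with a simulated tail probability. The Python sketch below is illustrative only: the choice of $Y_i$ uniform on $[-1, 1]$, the values of `n`, `eps`, and `reps`, and the helper name `hoeffding_bound` are all assumptions made for the example. It uses $t = 4\epsilon / \sum_i (b_i - a_i)^2$, the value that minimises the right-hand side over $t$.

```python
import math
import random


def hoeffding_bound(eps, t, widths):
    """Right-hand side: exp(-t*eps) * prod_i exp(t^2 * (b_i - a_i)^2 / 8)."""
    return math.exp(-t * eps + sum(t * t * w * w / 8 for w in widths))


random.seed(0)
n, eps, reps = 50, 10.0, 20000            # hypothetical settings
widths = [2.0] * n                        # Y_i ~ Uniform(-1, 1), so b_i - a_i = 2
t_best = 4 * eps / sum(w * w for w in widths)  # minimiser of the bound over t

# Monte Carlo estimate of P(sum Y_i >= eps)
hits = sum(
    sum(random.uniform(-1, 1) for _ in range(n)) >= eps
    for _ in range(reps)
)
empirical = hits / reps
bound = hoeffding_bound(eps, t_best, widths)
print(empirical, bound)
```

With these settings the optimised bound equals $e^{-2\epsilon^2 / \sum_i (b_i - a_i)^2} = e^{-1}$, and the simulated frequency should fall below it.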
$\sum_{i=1}^{n} X_i$
Density estimation
Underlying idea: Let $F$ be a distribution with probability density $f = F'$ and let $X_1, X_2, \ldots, X_n \sim F$ be an IID sample from $F$. The goal of nonparametric density estimation is to estimate $f$ with as few assumptions about $f$ as possible.
Histograms: Assume $f$ is supported on $[0, 1]$. Let $m$ be an integer and define the bins
\[
B_1 = \left[0, \tfrac{1}{m}\right), \quad B_2 = \left[\tfrac{1}{m}, \tfrac{2}{m}\right), \quad \ldots, \quad B_m = \left[\tfrac{m-1}{m}, 1\right].
\]
Thus the binwidth is $h = 1/m$.
Let $Y_j$ be the number of observations in $B_j$, let $\hat{p}_j = Y_j / n$ and let $p_j = \int_{B_j} f(u)\,du$.
The histogram estimator is defined by
\[
\hat{f}_n(x) = \sum_{j=1}^{m} \frac{\hat{p}_j}{h} I(x \in B_j).
\]
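The histogram estimator can be sketched in a few lines of Python. This is an illustrative sketch under the slide's assumption that the data lie in $[0, 1]$; the function name `histogram_estimator` and the sample values are hypothetical. Each observation is assigned to one of $m$ equal-width bins, and the estimate on a bin is the bin proportion divided by the binwidth $h = 1/m$.

```python
def histogram_estimator(sample, m):
    """Histogram density estimate on [0, 1] with m equal-width bins."""
    n = len(sample)
    h = 1.0 / m
    counts = [0] * m
    for x in sample:
        j = min(int(x * m), m - 1)  # the point x = 1 is placed in the last bin
        counts[j] += 1
    p_hat = [c / n for c in counts]  # bin proportions Y_j / n

    def f_hat(x):
        if not 0.0 <= x <= 1.0:
            return 0.0               # f is assumed to be supported on [0, 1]
        j = min(int(x * m), m - 1)
        return p_hat[j] / h

    return f_hat


data = [0.05, 0.10, 0.30, 0.35, 0.40, 0.60, 0.80, 0.95]  # hypothetical sample
f_hat = histogram_estimator(data, m=4)
print(f_hat(0.1), f_hat(0.3), f_hat(0.6), f_hat(0.9))  # 1.0 1.5 0.5 1.0
```

Because the bin proportions sum to 1 and each bin has width $h$, the estimate integrates to 1 over $[0, 1]$.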
[Figure: histograms of Height (inches) by voice part, with vertical axis Percent of Total; panels: Soprano 1, Soprano 2, Alto 1, Alto 2, Tenor 1, Tenor 2, Bass 1, Bass 2.]
Kernel density estimator:
\[
\hat{f}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h} K\left( \frac{x - X_i}{h} \right)
\]
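A kernel density estimator of the form $\hat{f}_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h} K\left(\frac{x - X_i}{h}\right)$ can be sketched as follows. The Gaussian kernel, the bandwidth `h=0.5`, the sample values, and the function names are assumptions chosen for illustration; the slides do not fix a particular kernel. The last lines check numerically that the estimate integrates to approximately 1.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(sample, h, kernel=gaussian_kernel):
    """Return x -> (1/n) * sum_i (1/h) * K((x - X_i) / h)."""
    n = len(sample)

    def f_hat(x):
        return sum(kernel((x - xi) / h) for xi in sample) / (n * h)

    return f_hat


data = [1.2, 1.9, 2.4, 3.1, 4.8]  # hypothetical sample
f_hat = kde(data, h=0.5)

# Riemann sum over a grid that comfortably covers the data
step = 0.01
grid = [k * step for k in range(-500, 1200)]
area = sum(f_hat(x) for x in grid) * step
print(area)  # should be very close to 1
```

A smaller `h` gives a spikier estimate concentrated near the data points and a larger `h` gives a smoother one, which is why the bandwidth has to be chosen with care.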
[Figure: two plots, one labelled "Density function" against x, the other the R output density.default(x = x) with N = 50 and Bandwidth = 2.15 (vertical axis: Density).]
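We have to choose the bandwidth $h$.
Cross-validation estimator:
Definition. The cross-validation estimator of risk is
\[
\hat{J}(h) = \int \left( \hat{f}_n(x) \right)^2 dx - \frac{2}{n} \sum_{i=1}^{n} \hat{f}_{(-i)}(X_i)
\]
where $\hat{f}_{(-i)}$ is the density estimator computed with the $i$-th observation removed.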
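The cross-validation score $\hat{J}(h) = \int \hat{f}_n(x)^2\,dx - \frac{2}{n}\sum_{i} \hat{f}_{(-i)}(X_i)$ can be evaluated directly from its definition. The Python sketch below is illustrative: it assumes a Gaussian kernel, approximates the integral with a Riemann sum, and picks the bandwidth from a small hypothetical grid of candidates. The function names, sample, and grid are not from the slides.

```python
import math


def gauss(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde_at(x, sample, h):
    """Kernel density estimate at x (Gaussian kernel assumed)."""
    return sum(gauss((x - xi) / h) for xi in sample) / (len(sample) * h)


def cv_score(sample, h, step=0.01):
    """Integral of f_hat^2 minus (2/n) * sum of leave-one-out estimates."""
    n = len(sample)
    lo, hi = min(sample) - 6 * h, max(sample) + 6 * h  # tails beyond are negligible
    steps = int((hi - lo) / step)
    integral = sum(
        kde_at(lo + k * step, sample, h) ** 2 for k in range(steps + 1)
    ) * step
    # f_hat_(-i)(X_i): estimate at X_i from the sample without X_i
    loo = sum(
        kde_at(sample[i], sample[:i] + sample[i + 1:], h) for i in range(n)
    )
    return integral - 2.0 * loo / n


data = [1.1, 1.9, 2.3, 2.8, 3.0, 3.4, 4.1, 5.2]  # hypothetical sample
candidates = [0.2, 0.4, 0.8, 1.6]                # hypothetical bandwidth grid
best_h = min(candidates, key=lambda h: cv_score(data, h))
print(best_h)
```

In practice the minimisation would run over a finer grid or use a numerical optimiser; the coarse grid here only keeps the example short.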
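Stone's Theorem
Theorem: Suppose that $f$ is bounded. Let $\hat{f}_h$ denote the kernel estimator with bandwidth $h$ and let $\hat{h}$ denote the bandwidth chosen by cross-validation. Then,
\[
\frac{\int \left( f(x) - \hat{f}_{\hat{h}}(x) \right)^2 dx}{\inf_h \int \left( f(x) - \hat{f}_h(x) \right)^2 dx} \xrightarrow{\text{a.s.}} 1.
\]
The cross-validation score can be computed as
\[
\hat{J}(h) = \frac{1}{h n^2} \sum_{i} \sum_{j} K^*\left( \frac{X_i - X_j}{h} \right) + \frac{2}{nh} K(0) + O\left( \frac{1}{n^2} \right)
\]
where $K^*(x) = K^{(2)}(x) - 2K(x)$ and $K^{(2)}(z) = \int K(z - y) K(y)\,dy$.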
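For a Gaussian kernel the convolution $K^{(2)}$ has a closed form, because the convolution of two $N(0, 1)$ densities is the $N(0, 2)$ density. Under that assumed kernel choice, the approximation $\hat{J}(h) \approx \frac{1}{hn^2}\sum_i\sum_j K^*\left(\frac{X_i - X_j}{h}\right) + \frac{2}{nh}K(0)$ needs no numerical integration. The Python sketch below is illustrative; the function names, sample, and candidate grid are hypothetical.

```python
import math


def K(u):
    """Gaussian kernel (assumed choice): the N(0, 1) density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def K2(u):
    """K convolved with itself; for the Gaussian kernel this is the N(0, 2) density."""
    return math.exp(-0.25 * u * u) / math.sqrt(4 * math.pi)


def K_star(u):
    return K2(u) - 2.0 * K(u)


def cv_score_approx(sample, h):
    """Approximate cross-validation score, dropping the O(1/n^2) term."""
    n = len(sample)
    double_sum = sum(K_star((xi - xj) / h) for xi in sample for xj in sample)
    return double_sum / (h * n * n) + 2.0 * K(0.0) / (n * h)


data = [1.1, 1.9, 2.3, 2.8, 3.0, 3.4, 4.1, 5.2]  # hypothetical sample
candidates = [0.2, 0.4, 0.8, 1.6]                # hypothetical bandwidth grid
best_h = min(candidates, key=lambda h: cv_score_approx(data, h))
print(best_h)
```

The double sum costs $O(n^2)$ kernel evaluations per bandwidth, which is manageable for moderate $n$ and avoids refitting the estimator $n$ times.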
R-code: