Communications in Statistics - Theory and Methods
To cite this article: Hill, Peter D. (1985): Kernel estimation of a distribution function, Communications in Statistics - Theory and
Methods, 14:3, 605-620
KERNEL ESTIMATION OF A DISTRIBUTION FUNCTION
Peter D. Hill
Department of Mathematics
University of Waikato
Hamilton, New Zealand
1. INTRODUCTION
He showed that as $n \to \infty$,
where
and $(-h, h)$ is the region of support of the kernel function.
($h = \sqrt{5}$ for the Epanechnikov kernel.)
Taking the right-hand side of (1), the minimum asymptotic MSE
occurs at
in which case
If we write (2) as $\lambda = cn^{-\alpha}$ then $\alpha = 1/3$. This value was
suggested by Wertz (1978) when estimating the density function but does
not coincide with the result of Rosenblatt (1956) and Parzen (1962),
which suggests an optimal $\lambda$ for density estimation of the order
$n^{-1/5}$.
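To give a feel for how differently the two rates behave, the following sketch tabulates $cn^{-1/3}$ and $cn^{-1/5}$ for a few sample sizes. The constant c = 1 and the function name `bandwidth` are illustrative assumptions and are not taken from the paper; in practice c depends on the unknown density.

```python
def bandwidth(n, c=1.0, alpha=1.0 / 3.0):
    """Smoothing parameter of the form c * n**(-alpha).

    c = 1.0 is a placeholder (assumption); the true constant depends
    on the unknown density and on the kernel.
    """
    return c * n ** (-alpha)


# Compare the distribution-function rate (alpha = 1/3) with the
# density-estimation rate (alpha = 1/5) for a few sample sizes.
for n in (20, 100, 1000):
    lam_cdf = bandwidth(n, alpha=1.0 / 3.0)
    lam_pdf = bandwidth(n, alpha=1.0 / 5.0)
    print(f"n = {n:5d}  n^(-1/3) = {lam_cdf:.4f}  n^(-1/5) = {lam_pdf:.4f}")
```

For every n > 1 the $n^{-1/3}$ rate gives the smaller value, so with equal constants the smoothing parameter for distribution function estimation shrinks faster than the one for density estimation.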
Taking n = 1090 and concentrating on the tails of the distribution
(namely, the 5th, 90th and 95th percentiles), Azzalini
chose a few particular densities (Normal, Gamma and Beta) and
Downloaded by [Monash University Library] at 10:49 07 June 2013
$\lambda = \mathrm{const} \cdot \left( \dfrac{f(x)}{n\,(f'(x))^{2}} \right)^{1/3}$
$E[\hat{F}_n(x)] =$
$E[\hat{F}_n(0)] =$
3. RESULTS
3.1 Introduction
Tapia and Thompson (1978, p.76) refer to difficulties which
may arise with the use of the Epanechnikov kernel owing to its
non-differentiability at the end-points. On the other hand, the
Normal kernel suffers from having infinite support. A desirable
alternative which we have chosen is the kernel called K3 by Tapia
and Thompson (1978, p.SO), which has smoothness properties and
finite support. The K3 kernel is given by
Method          x        $\lambda$   MSE
Azzalini        1.282    0.130       .08193
Epanechnikov    1.282    0.155       .08179
K3 kernel       1.282    0.310       .08172
Azzalini        1.645    0.130       .04305
Epanechnikov    1.645    0.151       .04295
K3 kernel       1.645    0.303       .04290
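As a minimal sketch of the kind of estimator being compared, the code below evaluates the standard kernel estimate of a distribution function, $\hat{F}_n(x) = n^{-1} \sum_i W((x - X_i)/\lambda)$, where W is the integrated kernel. Two integrated kernels are given: the Epanechnikov kernel on $(-\sqrt{5}, \sqrt{5})$ and a quartic (biweight) kernel on (-1, 1). The quartic kernel is used only as a stand-in for a smooth, finite-support kernel; that it coincides with the K3 kernel of Tapia and Thompson is an assumption, and the sample values and function names are hypothetical.

```python
import math

SQRT5 = math.sqrt(5.0)


def epanechnikov_cdf(t):
    """Integrated Epanechnikov kernel, support (-sqrt(5), sqrt(5))."""
    if t <= -SQRT5:
        return 0.0
    if t >= SQRT5:
        return 1.0
    return 0.5 + 3.0 * t / (4.0 * SQRT5) - t ** 3 / (20.0 * SQRT5)


def biweight_cdf(t):
    """Integrated quartic kernel (15/16)(1 - t^2)^2 on (-1, 1).

    Assumption: used as a stand-in for a smooth finite-support kernel;
    not confirmed to be the K3 kernel of the paper.
    """
    if t <= -1.0:
        return 0.0
    if t >= 1.0:
        return 1.0
    return 0.5 + (15.0 / 16.0) * (t - 2.0 * t ** 3 / 3.0 + t ** 5 / 5.0)


def kernel_cdf_estimate(x, sample, lam, kernel_cdf=epanechnikov_cdf):
    """Kernel estimate of F(x): average of W((x - X_i) / lam)."""
    return sum(kernel_cdf((x - xi) / lam) for xi in sample) / len(sample)


# Hypothetical data, for illustration only.
sample = [-1.3, -0.4, 0.1, 0.6, 1.9]
print(kernel_cdf_estimate(0.5, sample, lam=0.3))
print(kernel_cdf_estimate(0.5, sample, lam=0.3, kernel_cdf=biweight_cdf))
```

Because the two kernels are scaled differently (different support and variance), the same numerical value of $\lambda$ does not imply the same amount of smoothing, which is in line with the different optimal $\lambda$ values reported for different kernels.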
Section 2.5) that the choice of kernel type is not crucial in
[Figure: horizontal axis, smoothing parameter $\lambda$, from 0.0 to 2.0.]
Similar graphs were obtained for other x values such as the 5th,
60th and 80th percentile points, and for various values of n
as low as n = 20, and for the other distributions.
From Figures 1 and 2 we observe that the MSE function is
smooth and rather flat near the minimum. This is an important
feature since it shows that a fairly wide range of $\lambda$ values
will give comparable MSE's. Hence some slight inaccuracy in
finding the optimal $\lambda$ may not be too serious.
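The flatness of the MSE curve can be explored with a small Monte Carlo sketch such as the one below. It estimates the MSE of the kernel estimate of F(x) at x = 1.645 for N(0,1) data over a short grid of smoothing parameters, using the Epanechnikov kernel on $(-\sqrt{5}, \sqrt{5})$. The sample size n = 100, the 300 replications, the grid and the seed are arbitrary assumptions chosen to keep the run short; they are not the settings used in the paper, and simulation is not necessarily how the paper's MSE values were obtained.

```python
import math
import random

SQRT5 = math.sqrt(5.0)


def epan_cdf(t):
    """Integrated Epanechnikov kernel, support (-sqrt(5), sqrt(5))."""
    if t <= -SQRT5:
        return 0.0
    if t >= SQRT5:
        return 1.0
    return 0.5 + 3.0 * t / (4.0 * SQRT5) - t ** 3 / (20.0 * SQRT5)


def kernel_cdf_estimate(x, sample, lam):
    """Kernel estimate of F(x): average of W((x - X_i) / lam)."""
    return sum(epan_cdf((x - xi) / lam) for xi in sample) / len(sample)


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def mc_mse(x, lam_grid, n=100, reps=300, seed=0):
    """Monte Carlo estimate of the MSE at x for each lam in lam_grid.

    n, reps and seed are illustrative assumptions.
    """
    rng = random.Random(seed)
    truth = normal_cdf(x)
    sq_err = [0.0] * len(lam_grid)
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Reuse the same sample for every lam so the curve is comparable.
        for j, lam in enumerate(lam_grid):
            sq_err[j] += (kernel_cdf_estimate(x, sample, lam) - truth) ** 2
    return [s / reps for s in sq_err]


lam_grid = [0.1, 0.2, 0.3, 0.4]
for lam, mse in zip(lam_grid, mc_mse(1.645, lam_grid)):
    print(f"lambda = {lam:.1f}  estimated MSE = {mse:.6f}")
```

The printed values depend on the seed and on the number of replications, so they should be read only as a rough picture of how the MSE changes with $\lambda$.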
Fig. 3 - MSE-optimal $\lambda$ vs n for the N(0,1) and Beta(2,5)
distributions at x = 95th percentile point.
[Figure: horizontal axis, location point x, from 0.1 to 0.9.]
Fig. 4 - MSE-optimal $\lambda$ vs the location point x for the
Beta(2,10) distribution, n = 1000.
[Figure: horizontal axis, location point x.]
4. CONCLUSIONS
We list our conclusions from the results in Section 3.
(1) The $\lambda$ which minimizes MSE($\hat{F}_n(x)$) varies considerably
with the unknown distribution, as stated in Section 3.1, where the
optimal $\lambda$ values for the Gamma distributions, for example, were found
to be much larger than for the Beta distributions.
(2) Our results in Table I accord with the general consensus that
the choice of the kernel is not crucial for kernel estimation of
a distribution function. The size of the optimal $\lambda$ naturally
depends very much on the type of kernel chosen.
(3) The K3 kernel performs slightly better than the Epanechnikov
kernel.
(4) From the MSE versus $\lambda$ plots referred to in Section 3.2.1,
we observe that the MSE function is rather flat near the minimum.
Hence an approximate calculation of the optimal $\lambda$ may give
satisfactory MSE as long as it falls within the near flat portion
of the MSE function.
this paper.
BIBLIOGRAPHY