Communications in Statistics - Theory and Methods


Publisher: Taylor & Francis

Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/lsta20
Kernel estimation of a distribution function

Peter D. Hill
Department of Mathematics, University of Waikato, Hamilton, New Zealand
Published online: 27 Jun 2007.
To cite this article: Hill Peter D. (1985): Kernel estimation of a distribution function, Communications in Statistics - Theory and
Methods, 14:3, 605-620
To link to this article: http://dx.doi.org/10.1080/03610928508828937
KERNEL ESTIMATION OF A DISTRIBUTION FUNCTION
Peter D. Hill
Department of Mathematics
University of Waikato
Hamilton, New Zealand
Key Words and Phrases: distribution function; kernel estimation;
mean squared error; centre of symmetry; reference range; medical
diagnosis.
ABSTRACT
A distribution function is estimated by a kernel method with
a pointwise mean squared error criterion at a point x. Relationships
between the mean squared error, the point x, the sample
size and the required kernel smoothing parameter are investigated
for several distributions treated by Azzalini (1981). In
particular it is noted that at a centre of symmetry or near a
mode of the distribution the kernel method breaks down. Pointwise
estimation of a distribution function is motivated as a
more useful technique than a reference range for preliminary
medical diagnosis.
1. INTRODUCTION
This paper investigates some properties of kernel estimates

of distribution functions. The motivation for the work comes
from the field of preliminary medical diagnosis. The reference
range (sometimes called normal range) is a standard tool in this
context. A data base of values of a particular clinical variable
is built up from a group of healthy individuals; a reference
range of values is constructed, within which 95%, say, of the
healthy population of values is intended to lie. If the variable
value of a new patient is outside the reference range the clinician
may initiate further diagnostic tests.

Copyright © 1985 by Marcel Dekker, Inc.

One of the many weaknesses of the reference range as a
diagnostic tool is that it does not determine just how unusual a
new patient's variable value is among healthy individuals. For
inorganic phosphorus, it would be useful to know how unusual is a
value of 4.1 or 4.7, say. Instead of constructing a reference
range from a given database and reporting it together with the
new patient's value, we could estimate the distribution of the
database and quote the percentile estimate of the new patient's
value. Elveback (1972) proposed that "each clinical report
include, along with the value determined from the test, the
percentile corresponding to this value for healthy persons of the
patient's age and sex."
Here we will concentrate on percentile estimates, that is,
for a given variable value x on a patient we want to estimate
(by means of the healthy data base) the percentage of healthy
patients whose variable value would be no larger than x. We are
therefore interested in the distribution function F rather than
the density function f of the variable in question. Our
approach is to obtain a kernel estimate $\hat{F}$ of the distribution
function. The literature of kernel estimation of density
functions is extensive. Hand (1982), Tapia and Thompson (1978)
and Wertz (1978) are recent books on the subject. We briefly
review the limited literature on kernel estimation of a
distribution function.
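Suppose $X_1, X_2, \ldots, X_n$ is a sample of independent observations
from a random variable X with continuous distribution
function F(x) and probability density function f(x). We may
estimate F(x) by integrating a kernel estimator of f(x), i.e.

$$\hat{F}_n(x) = \int_{-\infty}^{x} \hat{f}_n(t)\,dt
  = \frac{1}{n}\sum_{i=1}^{n} W\!\left(\frac{x - X_i}{\lambda}\right),$$

where $\hat{f}_n$ is the kernel density estimate with kernel function K
and smoothing parameter $\lambda$, and W is the distribution function
of K. So, given a new x, we estimate its percentile value by $\hat{F}_n(x)$.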
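The estimator just described can be sketched in a few lines. This is a minimal illustration rather than the paper's own program: the kernel (an Epanechnikov kernel scaled to unit variance on $(-\sqrt{5}, \sqrt{5})$), the simulated "healthy" sample, the value `lam=0.13` and all function names are assumptions made here for concreteness.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)


def epanechnikov_cdf(t):
    """W(t): distribution function of the Epanechnikov kernel with unit
    variance, supported on (-sqrt(5), sqrt(5)) (an assumed kernel choice)."""
    t = np.clip(np.asarray(t, dtype=float), -SQRT5, SQRT5)
    return 0.5 + 3.0 / (4.0 * SQRT5) * (t - t**3 / 15.0)


def kernel_cdf_estimate(x, sample, lam):
    """F_hat_n(x) = (1/n) * sum_i W((x - X_i) / lam)."""
    sample = np.asarray(sample, dtype=float)
    return float(np.mean(epanechnikov_cdf((x - sample) / lam)))


# Hypothetical healthy database: 1000 simulated N(0,1) values.
rng = np.random.default_rng(0)
healthy = rng.normal(size=1000)

# Percentile estimate for a hypothetical new patient's value.
new_value = 1.645
percentile = 100.0 * kernel_cdf_estimate(new_value, healthy, lam=0.13)
print(f"estimated percentile of {new_value}: {percentile:.1f}")
```

Because W is itself a distribution function, the estimate is a smooth, nondecreasing function of x that lies between 0 and 1.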
Nadaraya (1964) showed that, under certain conditions on K
and $\lambda$, $\hat{F}_n(x)$ has the same asymptotic distribution as $F_n(x)$,
the usual empirical distribution function. Other asymptotic
properties of $\hat{F}_n(x)$ can be found in Winter (1973, 1979).

Rather than the usual Normal kernel, Azzalini (1981) chose
the finite range Epanechnikov kernel (Epanechnikov, 1969) which
has support on the interval $(-\sqrt{5}, +\sqrt{5})$. This kernel is known to
have certain optimal properties in estimating the density
function (Rosenblatt, 1971). Azzalini's criterion for selecting
the smoothing parameter $\lambda$ was the pointwise mean squared error
(MSE),

$$\mathrm{MSE}(\hat{F}_n(x)) = E[(\hat{F}_n(x) - F(x))^2].$$
He showed that, as $n \to \infty$,

[equation (1) and its accompanying definition are not legible in this copy]

where $(-h, h)$ is the region of support of the kernel function
($h = \sqrt{5}$ for the Epanechnikov kernel). Taking the right-hand
side of (1), the minimum asymptotic MSE occurs at

[equation (2) and the resulting minimum MSE are not legible in this copy]

If we write (2) as $\lambda = c n^{-\delta}$ then $\delta = 1/3$. This value was
suggested by Wertz (1978) when estimating the density function but does
not coincide with the result of Rosenblatt (1956) and Parzen (1962)
which suggests an optimal $\lambda$ for density estimation of the order
$n^{-1/5}$.
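For orientation, the kind of expansion behind equations (1) and (2) can be sketched as follows. This is a sketch under stated assumptions, not a transcription of the original displays: it assumes a symmetric kernel K with unit variance and support $(-h, h)$, with distribution function W, and a smooth F. The symbol u and the arrangement of terms are introduced here for illustration only.

```latex
\begin{align*}
\operatorname{MSE}\bigl(\hat F_n(x)\bigr)
  &\approx \frac{F(x)\{1-F(x)\}}{n}
     - \frac{u\,\lambda\,f(x)}{n}
     + \frac{\lambda^{4}\{f'(x)\}^{2}}{4},
  \qquad u = h - \int_{-h}^{h} W^{2}(t)\,dt, \\[4pt]
\frac{\partial}{\partial\lambda}\operatorname{MSE} = 0:\quad
  & -\frac{u\,f(x)}{n} + \lambda^{3}\{f'(x)\}^{2} = 0
  \;\Longrightarrow\;
  \lambda = \left[\frac{u\,f(x)}{n\,\{f'(x)\}^{2}}\right]^{1/3}, \\[4pt]
\operatorname{MSE}_{\min}
  &\approx \frac{F(x)\{1-F(x)\}}{n}
     - \frac{3}{4}\,
       \frac{\{u\,f(x)\}^{4/3}}{n^{4/3}\,\lvert f'(x)\rvert^{2/3}} .
\end{align*}
```

Under these assumptions the minimizing $\lambda$ is of order $n^{-1/3}$, which agrees with the exponent $\delta = 1/3$ quoted in the text, and the second term shows why some smoothing lowers the MSE below that of the empirical distribution function.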
Taking n = 1000 and concentrating on the tails of the distribution
(namely, the 5th, 90th and 95th percentiles), Azzalini
chose a few particular densities (Normal, Gamma and Beta) and
numerically computed the optimum value of $\lambda$ by equation (2). He
found that $\lambda = 1.30 n^{-1/3}$ is a satisfactory choice for the "long tail
of the distribution", but that for estimation of F not in the
tails $\lambda = 0.50 n^{-1/3}$ would give better results than
$\lambda = 1.30 n^{-1/3}$. His results also showed that the kernel estimator
generally has MSE lower than the empirical distribution function whose
MSE at the pth percentile is just p(1-p)/n.
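As a benchmark for the kernel results quoted later, the MSE of the empirical distribution function is easy to evaluate. The short sketch below simply codes the formula p(1-p)/n; the function name is an assumption made for illustration.

```python
def edf_mse(p, n):
    """MSE of the empirical distribution function at the p-th quantile."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n <= 0:
        raise ValueError("n must be positive")
    return p * (1.0 - p) / n


# The percentiles and sample size considered in the text.
for p in (0.05, 0.90, 0.95):
    print(f"p = {p:.2f}, n = 1000: MSE = {edf_mse(p, 1000):.3e}")
```

Any kernel estimate that improves on the empirical distribution function must fall below these values at the corresponding percentile.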
Azzalini (1981) also considered the inverse problem: given
p, estimate $\xi_p$ which is such that $p = F(\xi_p)$. The kernel
estimator $x_p$ of $\xi_p$ is given by $p = \hat{F}_n(x_p)$. Nadaraya (1964),
by means of first-order approximations, found that $x_p$ is
asymptotically Normal with mean $\xi_p$ and variance
$p(1-p)/\{n f^2(\xi_p)\}$ which is the same asymptotic distribution as that
of the sample percentile. Azzalini (1979) used second order
approximations and performed some numerical comparisons of $x_p$ with the
sample percentile for estimating the 95 percentile of the Gamma(1)
distribution. His results showed the superiority of $x_p$ over the
sample percentile.
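The inverse problem can be sketched numerically by solving $p = \hat{F}_n(x_p)$ with bisection, which works because the kernel estimate is continuous and nondecreasing in x. The kernel (unit-variance Epanechnikov), the simulated Gamma(1) sample, the value `lam=0.13` and the function names are illustrative assumptions, not the procedure used by Azzalini.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)


def epanechnikov_cdf(t):
    """Distribution function W of the unit-variance Epanechnikov kernel."""
    t = np.clip(np.asarray(t, dtype=float), -SQRT5, SQRT5)
    return 0.5 + 3.0 / (4.0 * SQRT5) * (t - t**3 / 15.0)


def kernel_cdf_estimate(x, sample, lam):
    """F_hat_n(x) = (1/n) * sum_i W((x - X_i) / lam)."""
    return float(np.mean(epanechnikov_cdf((x - sample) / lam)))


def kernel_quantile(p, sample, lam, tol=1e-10):
    """Solve p = F_hat_n(x_p) for x_p by bisection."""
    sample = np.asarray(sample, dtype=float)
    lo = sample.min() - SQRT5 * lam  # F_hat_n(lo) = 0
    hi = sample.max() + SQRT5 * lam  # F_hat_n(hi) = 1
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if kernel_cdf_estimate(mid, sample, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical data: 1000 draws from Gamma(1), i.e. the unit exponential.
rng = np.random.default_rng(1)
sample = rng.gamma(shape=1.0, size=1000)
x_95 = kernel_quantile(0.95, sample, lam=0.13)
print(f"kernel estimate of the 95th percentile: {x_95:.3f}")
```

The bracketing interval uses the finite support of the kernel: the estimate is exactly 0 below the smallest observation minus $\sqrt{5}\lambda$ and exactly 1 above the largest plus $\sqrt{5}\lambda$.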
2. OPTIMUM $\lambda$ AT THE CENTRE OF SYMMETRY

Equation (2) in Section 1 gives Azzalini's $\lambda$ for minimum
(asymptotic) MSE estimation of F(x). It reduces to

$$\lambda^{*} = \mathrm{const}(n) \left[ \frac{f(x)}{(f'(x))^{2}} \right]^{1/3}.$$
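The x-dependent factor in this formula can be evaluated directly for a known density. The sketch below does so for the N(0,1) density, for which $f'(x) = -x f(x)$; the choice of density, the grid of x values and the function names are assumptions made for illustration.

```python
import numpy as np


def normal_pdf(x):
    """Density of the N(0,1) distribution."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)


def lambda_shape_factor(x):
    """[f(x) / f'(x)^2]^(1/3) for the N(0,1) density, with f'(x) = -x f(x)."""
    f = normal_pdf(x)
    f_prime = -x * f
    return (f / f_prime**2) ** (1.0 / 3.0)


# Points approaching the centre of symmetry x = 0.
xs = np.array([1.0, 0.5, 0.1, 0.01])
factors = lambda_shape_factor(xs)
for x, g in zip(xs, factors):
    print(f"x = {x:5.2f}: factor = {g:8.3f}")
```

The factor grows without bound as x approaches 0, because the denominator $(f'(x))^2$ vanishes there while f(x) stays positive.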
Clearly if $f'(x) = 0$ it is not possible to minimize the MSE
function in equation (1) in this way. Now if f(x) is a symmetric
density with centre at the origin (without loss of generality)
then $f'(x) = 0$ at x = 0. The inference is that at the centre
of symmetry, the MSE function $\mathrm{MSE}(\hat{F}_n(0))$ can be made arbitrarily
small by increasing $\lambda$. This inference is borne out in our detailed
empirical study (reported in the next section) of the
relation between x and the $\lambda$ which minimizes $\mathrm{MSE}(\hat{F}_n(x))$. As
x approaches the centre of a symmetric density the optimal $\lambda$
becomes arbitrarily large. The following investigation reveals
the cause of this phenomenon. It is easily seen that
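$$E[\hat{F}_n(x)] = \int_{-\infty}^{\infty} W\!\left(\frac{x - y}{\lambda}\right) f(y)\,dy,
  \qquad
  E[\hat{F}_n(0)] = \int_{-\infty}^{\infty} W\!\left(-\frac{y}{\lambda}\right) f(y)\,dy.$$

But W(.) is the distribution function of the chosen kernel
function K(.) which is generally chosen to be a symmetric
density; for example the Normal kernel, or the Epanechnikov
kernel used by Azzalini, or the K3 kernel used in the next
section. Consequently $W(z) = \tfrac{1}{2} + A(z)$ where A(z) is an odd
function and so

$$E[\hat{F}_n(0)] = \tfrac{1}{2} + \int_{-\infty}^{\infty} A\!\left(-\frac{y}{\lambda}\right) f(y)\,dy
  = \tfrac{1}{2} = F(0)$$

when f is symmetric about zero. So a kernel estimate of the
distribution function of a density symmetric about the origin has
zero bias at the origin if a symmetric kernel is used.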
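The zero-bias property at the centre of symmetry can be checked numerically. The sketch below evaluates $E[\hat{F}_n(x)] = \int W((x-y)/\lambda) f(y)\,dy$ by a Riemann sum for the N(0,1) density and a unit-variance Epanechnikov kernel; the kernel, the integration grid and the function names are assumptions made for illustration.

```python
import math

import numpy as np

SQRT5 = np.sqrt(5.0)


def epanechnikov_cdf(t):
    """Distribution function W of the unit-variance Epanechnikov kernel."""
    t = np.clip(np.asarray(t, dtype=float), -SQRT5, SQRT5)
    return 0.5 + 3.0 / (4.0 * SQRT5) * (t - t**3 / 15.0)


def normal_pdf(y):
    return np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Symmetric integration grid; the N(0,1) mass outside it is negligible.
Y = np.linspace(-10.0, 10.0, 20001)
DY = Y[1] - Y[0]


def expected_estimate(x, lam):
    """E[F_hat_n(x)] = integral of W((x - y) / lam) f(y) dy."""
    return float(np.sum(epanechnikov_cdf((x - Y) / lam) * normal_pdf(Y)) * DY)


def bias(x, lam):
    return expected_estimate(x, lam) - normal_cdf(x)


bias_at_origin = [bias(0.0, lam) for lam in (0.1, 1.0, 10.0)]
bias_off_centre = bias(1.0, 1.0)
print("bias at x = 0:", bias_at_origin)
print("bias at x = 1 with lam = 1:", bias_off_centre)
```

At x = 0 the bias vanishes (up to rounding) whatever the value of the smoothing parameter, whereas away from the centre heavy smoothing introduces a clear bias.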

The bias being zero at x = 0 means that the MSE function
reduces to only the variance term which can be shown, at the
origin, to be

$$\mathrm{Var}[\hat{F}_n(0)] = \frac{1}{n}\left[\int_{-\infty}^{\infty}
  W^{2}\!\left(-\frac{y}{\lambda}\right) f(y)\,dy - \frac{1}{4}\right].$$

This variance can be made arbitrarily small by increasing $\lambda$,
since $W(-y/\lambda) \to W(0) = \tfrac{1}{2}$ for each fixed y. So
if f(x) is a symmetric density function and the kernel
function is also taken to be symmetric then $\mathrm{MSE}[\hat{F}_n(x)]$ can
be reduced to zero at the centre of symmetry of f(x) by taking
an infinite smoothing parameter $\lambda$. The practical consequences
of this result will be seen in the next section.
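The shrinking variance at the centre of symmetry can be illustrated numerically. The sketch below computes the variance of $\hat{F}_n(0)$ as $\{E[W^2] - (E[W])^2\}/n$ for the N(0,1) density with a unit-variance Epanechnikov kernel; the kernel, the values of the smoothing parameter, n = 1000 and the function names are illustrative assumptions.

```python
import numpy as np

SQRT5 = np.sqrt(5.0)


def epanechnikov_cdf(t):
    """Distribution function W of the unit-variance Epanechnikov kernel."""
    t = np.clip(np.asarray(t, dtype=float), -SQRT5, SQRT5)
    return 0.5 + 3.0 / (4.0 * SQRT5) * (t - t**3 / 15.0)


def normal_pdf(y):
    return np.exp(-0.5 * y**2) / np.sqrt(2.0 * np.pi)


Y = np.linspace(-10.0, 10.0, 20001)
DY = Y[1] - Y[0]


def variance_at_origin(lam, n):
    """Var[F_hat_n(0)] = {E[W(-X/lam)^2] - (E[W(-X/lam)])^2} / n."""
    w = epanechnikov_cdf(-Y / lam)
    first = np.sum(w * normal_pdf(Y)) * DY
    second = np.sum(w**2 * normal_pdf(Y)) * DY
    return float((second - first**2) / n)


lams = [0.1, 1.0, 10.0, 100.0]
variances = [variance_at_origin(lam, n=1000) for lam in lams]
for lam, v in zip(lams, variances):
    print(f"lam = {lam:6.1f}: variance = {v:.3e}")
```

With zero bias at the origin, this variance is the whole MSE there, so the pointwise criterion keeps rewarding larger smoothing parameters.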
3. RESULTS
3.1 Introduction
Tapia and Thompson (1978, p.76) refer to difficulties which
may arise with the use of the Epanechnikov kernel owing to its
non-differentiability at the end-points. On the other hand, the
Normal kernel suffers from having infinite support. A desirable
alternative which we have chosen is the kernel called K3 by Tapia
and Thompson (1978, p.SO) which has smoothness properties and
finite support. The K3 kernel is given by

[formula not legible in this copy]

We assume here that we know the true density, f, and proceed to
find by direct numerical means the smoothing parameter $\lambda$ that
minimizes the MSE of the kernel estimate of the distribution
function. Obviously the assumption that f is known is
artificial - if the true distribution were known there would be
no point in estimating it. Our purpose, however, is to uncover
the properties of kernel distribution function estimates under
various distributional assumptions. Robustness of certain
properties to varying distributions will be commented on as they
emerge. In practice an iterative re-estimation of the unknown
distribution may be feasible (Scott, Tapia and Thompson, 1977).
For comparison we investigate the same distributions reported by
Azzalini (1981), i.e. N(0,1), Gamma(.5), Gamma(1), Gamma(2),
Gamma([illegible]), Beta(1,1), Beta(1,4), Beta(2,5), Beta(2,10), and
Beta(5,5). We used FORTRAN programming together with NAG (1978)
algorithms for the required integrations and MSE function
minimization, and the NCAR graphics package.
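The direct approach described above can be sketched as follows. This is not the FORTRAN/NAG program used in the paper: it assumes a known N(0,1) density, uses a unit-variance Epanechnikov kernel (the K3 kernel is not reproduced here), evaluates the integrals by Riemann sums and minimizes over a simple grid. All names and grid settings are assumptions made for illustration.

```python
import math

import numpy as np

SQRT5 = np.sqrt(5.0)


def epanechnikov_cdf(t):
    """Distribution function W of the unit-variance Epanechnikov kernel."""
    t = np.clip(np.asarray(t, dtype=float), -SQRT5, SQRT5)
    return 0.5 + 3.0 / (4.0 * SQRT5) * (t - t**3 / 15.0)


def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


# Integration grid and the (assumed known) N(0,1) density on it.
Y = np.linspace(-10.0, 10.0, 20001)
DY = Y[1] - Y[0]
FY = np.exp(-0.5 * Y**2) / np.sqrt(2.0 * np.pi)


def mse_known_density(x, lam, n):
    """MSE of F_hat_n(x) = bias^2 + variance, with f known."""
    w = epanechnikov_cdf((x - Y) / lam)
    mean = np.sum(w * FY) * DY          # E[W((x - X) / lam)]
    second = np.sum(w**2 * FY) * DY     # E[W((x - X) / lam)^2]
    bias = mean - normal_cdf(x)
    variance = (second - mean**2) / n
    return float(bias**2 + variance)


def best_lambda(x, n, grid):
    """Grid search for the smoothing parameter minimizing the MSE."""
    mses = np.array([mse_known_density(x, lam, n) for lam in grid])
    k = int(np.argmin(mses))
    return float(grid[k]), float(mses[k])


x, n = 1.645, 1000
grid = np.linspace(0.02, 1.0, 197)      # step 0.005
lam_opt, mse_opt = best_lambda(x, n, grid)
edf = normal_cdf(x) * (1.0 - normal_cdf(x)) / n
print(f"lam_opt = {lam_opt:.3f}, MSE = {mse_opt:.3e}, EDF MSE = {edf:.3e}")
```

A grid search is used only for transparency; a one-dimensional minimizer would serve equally well, since the MSE is a smooth function of the smoothing parameter.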
The following comments relate Azzalini's results to those
from our method which directly minimized $\mathrm{MSE}(\hat{F}_n(x))$ using a K3
kernel and with x at the 5, 90 and 95 percentiles of the various
distributions.

(1) The MSE's obtained from using the K3 kernel and the direct
minimization of MSE are somewhat smaller than those reported by
Azzalini in almost every case.

(2) Azzalini's rule of thumb $\lambda = 1.30 n^{-1/3}$ (or $\lambda = 0.50 n^{-1/3}$)
does not depend on x. Our work showed that $\lambda$ should depend on
the x being considered. We will examine the relationship between
x and $\lambda$ more closely in Section 3.2.

(3) Both Azzalini's and the K3 results show that the $\lambda$ estimates
for the various families of distributions are quite different.
For instance, the Gamma distributions tend to have fairly large
$\lambda$'s whereas the $\lambda$ values for the Beta distributions are
comparatively smaller.

(4) As an interesting aside point we replaced the K3 kernel in
our MSE minimization algorithm with the Epanechnikov kernel.
(Recall that Azzalini used the Epanechnikov kernel but with $\lambda$
given by $1.30 n^{-1/3}$ or $0.50 n^{-1/3}$.) For the 90th and 95th
percentile points of the N(0,1) distribution we compare, in Table I,
the $\lambda$ and MSE of Azzalini, minimizing MSE with an Epanechnikov
kernel, and minimizing MSE with a K3 kernel.
Table I - $\lambda$ and MSE for Azzalini, Epanechnikov and K3 kernel
(n = 1000)

    x        Method          lambda    MSE
    1.282    Azzalini        0.130     .08193
             Epanechnikov    0.155     .08179
             K3 kernel       0.310     .08172
    1.645    Azzalini        0.130     .04305
             Epanechnikov    0.151     .04295
             K3 kernel       0.303     .04290

We note from Table I that the MSE's for the Epanechnikov
kernel and the K3 kernel are almost the same, with the K3 MSE's
being just slightly smaller. This accords with the consensus of
opinion of many other authors (Hand Chapter 3, Tapia and Thompson
Section 2.5) that the choice of kernel type is not crucial in
kernel estimation, at least when the sample size is large. But
note also that the $\lambda$ estimates for the K3 kernel for both x's
are about twice those for the Epanechnikov kernel. (This is to
be expected in view of the scale difference in the region of
support of the two kernels.) So it is the evaluation of optimal
$\lambda$ for the particular kernel chosen, not the kernel itself, which
is important. The $\lambda$ estimate for the Epanechnikov kernel is
close to that used by Azzalini, justifying his 'rule of thumb'
formula, but Azzalini's MSE's are the largest of the three.

3.2 Examining Relationships between Pairs of Variables

In this Section we present graphs of the relationship
between $\lambda$ and each of the other three variables of interest:
MSE, n, and x. Since the graphs for some of the 10 distributions
are similar, for each sub-section we only show a small
selection.

3.2.1 Relationship between MSE and $\lambda$

We take n = 1000, compute the MSE's for a range of $\lambda$ at
the 95th percentile point, and plot the graph of MSE against $\lambda$.
Figures 1 and 2 are for two of the distributions studied.
Fig. 1 - Mean squared error of $\hat{F}_n(x)$ vs smoothing parameter $\lambda$
for the N(0,1) distribution, n = 1000, x = 95 percentile point.

Fig. 2 - Mean squared error of $\hat{F}_n(x)$ vs smoothing parameter $\lambda$
for the Gamma(2) distribution, n = 1000, x = 95 percentile point.

Similar graphs were obtained for other x values such as the 5th,
60th and 80th percentile points, and for various values of n
as low as n = 20, and for the other distributions.

From Figures 1 and 2 we observe that the MSE function is
smooth and rather flat near the minimum. This is an important
feature since it shows that a fairly wide range of $\lambda$ values
will give comparable MSE's. Hence some slight inaccuracy in
finding the optimal $\lambda$ may not be too serious.

3.2.2 Relationship between $\lambda$ and n

To see how $\lambda$ and n are related, we take n from 100 to
3000 (in steps of 100), and compute the MSE optimal $\lambda$ at the
95th percentile point.

Fig. 3 - MSE-optimal $\lambda$ vs n for the N(0,1) and Beta(2,5)
distributions at x = 95 percentile point.

Figure 3 shows a smooth monotonically decreasing dependence
of the MSE optimal $\lambda$ on n for two of the distributions studied.
Similar graphs were obtained for other x values and other
distributions. Azzalini has suggested an exponent of -1/3 for n.
Assuming the model $\lambda = \alpha n^{\beta}$ for the relationship between $\lambda$ and
n we fitted a straight line to the log $\lambda$ versus log n relationship
for each of the situations considered. We found the exponent
$\beta$ to vary from about -.29 to -.30 over the various situations,
suggesting that in general the appropriate exponent may be
slightly larger than -.3.
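The straight-line fit on the log scale can be sketched as below. The data are synthetic, generated from an exact power law with exponent -1/3 purely to show the mechanics; they are not the values obtained in the study, and the function name is an assumption.

```python
import numpy as np


def fit_power_law(n_values, lam_values):
    """Least-squares fit of log(lam) = log(alpha) + beta * log(n).

    Returns the pair (alpha, beta) of the model lam = alpha * n**beta.
    """
    log_n = np.log(np.asarray(n_values, dtype=float))
    log_lam = np.log(np.asarray(lam_values, dtype=float))
    beta, log_alpha = np.polyfit(log_n, log_lam, deg=1)
    return float(np.exp(log_alpha)), float(beta)


# Synthetic illustration: n from 100 to 3000 in steps of 100.
n_values = np.arange(100, 3001, 100).astype(float)
lam_values = 1.3 * n_values ** (-1.0 / 3.0)

alpha, beta = fit_power_law(n_values, lam_values)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

With MSE-optimal values in place of the synthetic ones, the fitted slope is the exponent of n discussed in the text.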
3.2.3 Relationship between $\lambda$ and the point x

Azzalini's suggestion for near optimal $\lambda$ does not take
into consideration the point x. Our results, however, show that
$\lambda$ is dependent on the x value being considered. (We comment
later on the desirability in practice of this pointwise selection
of $\lambda$.) Taking n = 1000, we compute $\lambda$ for a range of x
values. We have also taken n as small as 20 and found the
graphs to be similar to those presented here.

Fig. 4 - MSE-optimal $\lambda$ vs the location point x for the N(0,1)
distribution, n = 1000.

It can be seen from Figure 4 that for a symmetric distribution
like the N(0,1), the MSE-optimal $\lambda$ follows a symmetric
pattern, rising sharply at the centre of the distribution to the
upper limit of the $\lambda$ range which the minimization routine allows.
The optimal $\lambda$ has been shown in Section 2 to be infinite at the
centre of a symmetric distribution. Note that for the N(0,1)
distribution, there is a slight hump in the $\lambda$ versus x
relationship at about x = 4 standard deviations from the mean.
We cannot explain this phenomenon.

We next consider skewed distributions like the Beta(2,10) -
the density is graphed in Figure 5 to show the degree of skew.
The graph of $\lambda$ versus x in Figure 6 again shows an "explosion"
of $\lambda$ (but not to infinity) for values of x in the vicinity of
the mode of the distribution. We note again the slight hump in
$\lambda$ for the right-hand tail values of x.

The $\lambda$ versus x picture in Figure 4 for the symmetric
N(0,1) distribution is reinforced by some additional work on
mixtures of Normal distributions. In particular, as an example,
for the distribution .5 N(0,1) + .5 N(5,1) the $\lambda$ versus x
relationship is shown in Figure 8 together with a graph of the
density in Figure 7.

Fig. 5 - The Beta(2,10) density function.

Fig. 6 - MSE-optimal $\lambda$ vs the location point x for the
Beta(2,10) distribution, n = 1000.

Fig. 7 - Mixture of Normals density function, .5 N(0,1) + .5 N(5,1).

Fig. 8 - MSE-optimal $\lambda$ vs the location point x for the
.5 N(0,1) + .5 N(5,1) distribution, n = 1000.

The points which emerge from Figure 8 are that $\lambda \to \infty$ at the
centre of symmetry, "explodes" to a very large, but not infinite,
value at the two modes, and increases to a hump in the tails of
the distribution.
4. CONCLUSIONS

We list our conclusions from the results in Section 3.

(1) The $\lambda$ which minimizes $\mathrm{MSE}(\hat{F}_n(x))$ varies considerably
with the unknown distribution as stated in Section 3.1 where the
optimal $\lambda$ for the Gamma distributions, for example, were found
to be much larger than for the Beta distributions.

(2) Our results in Table I accord with the general consensus that
the choice of the kernel is not crucial for kernel estimation of
a distribution function. The size of the optimal $\lambda$ naturally
depends very much on the type of kernel chosen.

(3) The K3 kernel performs slightly better than the Epanechnikov
kernel.

(4) From the MSE versus $\lambda$ plots referred to in Section 3.2.1,
we observe that the MSE function is rather flat near the minimum.
Hence an approximate calculation of the optimal $\lambda$ may give
satisfactory MSE as long as it falls within the near flat portion
of the MSE function.

(5) $\lambda$ is a monotone decreasing function of the sample size n.
From some of our results in 3.2.2, we have found that an appropriate
exponent of n is a little larger than -.3. It would be
desirable to consider more varied distributions to check on this.

(6) The $\lambda$ versus x graphs referred to in Section 3.2.3
suggest a rather complicated relationship between $\lambda$ and x. So,
a rule of thumb value, as in Azzalini (1981), which is independent
of x, may not be a good idea. In particular, $\lambda$ is infinite
at a centre of symmetry. For a non-symmetric distribution, the
value of $\lambda$ is also large near the mode. Also, the $\lambda$ versus
x graphs for distributions like the N(0,1) and the Beta(2,10),
show a slight hump in the tail regions of x.

Some of these results are mainly of theoretical interest. We
expand on findings with practical implications.

In Section 1 we explained the practical motivation for this
research in connection with reference ranges for preliminary
medical diagnoses. A data base of values on a variable for
healthy patients is available. A new patient has a variable
value x and the diagnostic question is how this new patient's
value relates to the distribution of values for the healthy data
base. Our approach here has been a theoretical one, namely to
assume knowledge of the distribution of data base values. Despite
this assumption we then considered estimation of F(x), the
distribution function value of x, via a kernel approach. Our
purpose in this non-practical approach was to uncover the relationship
of the MSE optimal $\lambda$ to the underlying density f. In
Section 3.1 we saw that choice of $\lambda$ was not robust against
variation in f. This establishes the necessity in practice for
an iterative re-estimation of the unknown true f as suggested in
Section 3.1. Scott, Tapia and Thompson (1977) have tested an
algorithm which iteratively re-estimates the true density in the
context of density estimation. For our context we could adapt
their integrated mean square error of $\hat{f}_n(x)$ criterion to our
$\mathrm{MSE}(\hat{F}_n(x))$ with no apparent difficulties.

As seen in Section 3.2.3, $\lambda$ varies with x in a complicated
way. In particular the pointwise MSE optimal $\lambda$ for symmetric
distributions tends to infinity at the centre of symmetry. So for
a particular x, we obtain from the minimization algorithm a
certain $\lambda$ and hence an estimate of the distribution of the
database. With another new x we get a new $\lambda$ and another estimate
of f. This pointwise selection of $\lambda$ is not appealing intuitively.
The new point "tail" is wagging the database "dog". The database
should give rise to a kernel estimate of its own distribution
without being manipulated by new x's. Hence, it might be easier
and more practical to choose a compromise value for $\lambda$ which is
near-optimal in terms of MSE of $\hat{F}$ over a range of practical x
values. In practice, the infinite $\lambda$ problem for a new x at
the centre of symmetry, or the very large $\lambda$ at the mode of
non-symmetric distributions, may not be too important because we are
generally not interested in patients near the centre or mode of
the distribution.
this paper.
BIBLIOGRAPHY

Azzalini, A. (1979). Efficiency of the kernel method for
estimating a distribution function and percentage points.
Unpublished.

Azzalini, A. (1981). A note on the estimation of a distribution
function and quantiles by a kernel method. Biometrika, 68,
326-328.

Elveback, L.R. (1972). How high is high? A proposed alternative
to the normal range. Mayo Clinic Proc., 47, 93-97.

Epanechnikov, V.A. (1969). Nonparametric estimates of a
multivariate probability density. Th. Prob. Applic., 14, 153-158.

Hand, D.J. (1982). Kernel Discriminant Analysis. Chichester:
Wiley.

Nadaraya, E.A. (1964). Some new estimates for distribution
functions. Th. Prob. Applic., 9, 497-500.

NAG (1978). Numerical Algorithms Group Manual. Mark (8).
7 Banbury Road, Oxford.

Rosenblatt, M. (1956). Remarks on some non-parametric estimates
of a density function. Ann. Math. Statist., 27, 832-837.

Rosenblatt, M. (1971). Curve estimates. Ann. Math. Statist., 42,
1815-1842.

Scott, D.W., Tapia, R.A. and Thompson, J.R. (1977). Kernel
density estimation revisited. J. Nonlinear Anal., 1, 339-372.

Tapia, R.A. and Thompson, J.R. (1978). Nonparametric Probability
Density Estimation. Baltimore: Johns Hopkins University Press.

Wertz, W. (1978). Statistical Density Estimation: a Survey.
Göttingen: Vandenhoeck and Ruprecht.

Winter, B.B. (1973). Strong uniform consistency of integrals of
density estimators. Can. J. Statist., 1, 247-253.

Winter, B.B. (1979). Convergence rate of perturbed empirical
distribution functions. J. Appl. Prob., 16, 163-173.
