Kernel density estimation


In statistics, kernel density estimation (KDE) is the
application of kernel smoothing for probability density
estimation, i.e., a non-parametric method to estimate the
probability density function of a random variable based on
kernels as weights. KDE answers a fundamental data
smoothing problem where inferences about the population
are made, based on a finite data sample. In some fields
such as signal processing and econometrics it is also
termed the Parzen–Rosenblatt window method, after
Emanuel Parzen and Murray Rosenblatt, who are usually
credited with independently creating it in its current form.[1][2] One of the famous applications
of kernel density estimation is in estimating the class-conditional marginal densities of data
when using a naive Bayes classifier,[3][4] which can improve its prediction accuracy.[3]

[Figure: Kernel density estimation of 100 normally distributed random numbers using different
smoothing bandwidths.]

Definition
Let (x1, x2, ..., xn) be independent and identically distributed samples drawn from some univariate
distribution with an unknown density ƒ at any given point x. We are interested in estimating the
shape of this function ƒ. Its kernel density estimator is

    f̂_h(x) = (1/n) Σ_{i=1}^{n} K_h(x − x_i) = (1/(nh)) Σ_{i=1}^{n} K((x − x_i)/h),

where K is the kernel — a non-negative function — and h > 0 is a smoothing parameter called the
bandwidth. A kernel with subscript h is called the scaled kernel and is defined as
K_h(x) = (1/h) K(x/h).
Intuitively one wants to choose h as small as the data will allow; however, there is always a trade-
off between the bias of the estimator and its variance. The choice of bandwidth is discussed in
more detail below.

A range of kernel functions are commonly used: uniform, triangular, biweight, triweight,
Epanechnikov, normal, and others. The Epanechnikov kernel is optimal in a mean square error
sense,[5] though the loss of efficiency is small for the kernels listed previously.[6] Due to its
convenient mathematical properties, the normal kernel is often used, which means K(x) = ϕ(x),
where ϕ is the standard normal density function.
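
This estimator is short to implement directly. The following is a minimal sketch in Python with a
Gaussian kernel (NumPy only; the function names are illustrative, not from any particular library):

    import numpy as np

    def gaussian_kernel(u):
        """Standard normal density, K(x) = phi(x)."""
        return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

    def kde(x, data, h):
        """Evaluate f_h(x) = (1/(n h)) * sum_i K((x - x_i)/h)."""
        x = np.asarray(x, dtype=float)
        data = np.asarray(data, dtype=float)
        # Pairwise scaled distances between evaluation points and samples.
        u = (x[:, None] - data[None, :]) / h
        return gaussian_kernel(u).sum(axis=1) / (len(data) * h)

    # Example: estimate the density of 100 standard normal draws.
    rng = np.random.default_rng(0)
    sample = rng.standard_normal(100)
    grid = np.linspace(-4, 4, 9)
    print(kde(grid, sample, h=0.337))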

The construction of a kernel density estimate finds interpretations in fields outside of density
estimation.[7] For example, in thermodynamics, this is equivalent to the amount of heat generated
when heat kernels (the fundamental solution to the heat equation) are placed at each data point
location xi. Similar methods are used to construct discrete Laplace operators on point clouds for
manifold learning (e.g. diffusion map).

Example

Kernel density estimates are closely related to histograms, but can be endowed with properties
such as smoothness or continuity by using a suitable kernel. The diagram below based on these 6
data points illustrates this relationship:

Sample 1 2 3 4 5 6

Value -2.1 -1.3 -0.4 1.9 5.1 6.2

For the histogram, first, the horizontal axis is divided into sub-intervals or bins which cover the
range of the data: In this case, six bins each of width 2. Whenever a data point falls inside this
interval, a box of height 1/12 is placed there. If more than one data point falls inside the same bin,
the boxes are stacked on top of each other.

For the kernel density estimate, normal kernels with a standard deviation of 1.5 (indicated by the
red dashed lines) are placed on each of the data points xi. The kernels are summed to make the
kernel density estimate (solid blue curve). The smoothness of the kernel density estimate
(compared to the discreteness of the histogram) illustrates how kernel density estimates converge
faster to the true underlying density for continuous random variables.[8]

Comparison of the histogram (left) and kernel density estimate (right) constructed
using the same data. The six individual kernels are the red dashed curves, the
kernel density estimate the blue curves. The data points are the rug plot on the
horizontal axis.
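
The construction described above can be reproduced numerically. A minimal sketch under the same
assumptions (the six data points, normal kernels with standard deviation 1.5):

    import numpy as np

    data = np.array([-2.1, -1.3, -0.4, 1.9, 5.1, 6.2])  # the six sample values
    h = 1.5  # standard deviation of each normal kernel, as in the text

    grid = np.linspace(-6, 11, 341)
    # One normal "bump" of mass 1/6 per data point, then summed.
    bumps = np.exp(-0.5 * ((grid[:, None] - data[None, :]) / h) ** 2)
    bumps /= h * np.sqrt(2 * np.pi) * len(data)
    estimate = bumps.sum(axis=1)

    # The estimate integrates to one (up to grid discretization).
    print(np.trapz(estimate, grid))  # ~1.0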

Bandwidth selection
The bandwidth of the kernel is a free parameter which exhibits a strong influence on the resulting
estimate. To illustrate its effect, we take a simulated random sample from the standard normal
distribution (plotted at the blue spikes in the rug plot on the horizontal axis). The grey curve is the
true density (a normal density with mean 0 and variance 1). In comparison, the red curve is
undersmoothed since it contains too many spurious data artifacts arising from using a bandwidth
h = 0.05, which is too small. The green curve is oversmoothed since using the bandwidth h = 2
obscures much of the underlying structure. The black curve with a bandwidth of h = 0.337 is
considered to be optimally smoothed since its density estimate is close to the true density. An
extreme situation is encountered in the limit h → 0 (no smoothing), where the estimate is a sum of
n delta functions centered at the coordinates of the analyzed samples. In the other extreme limit
h → ∞ the estimate retains the shape of the used kernel, centered on the mean of the samples
(completely smooth).
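
The effect of these bandwidth choices can be reproduced with, for example,
scipy.stats.gaussian_kde. Note that SciPy interprets a scalar bw_method as a multiplier of the
sample standard deviation, so an absolute h must be rescaled (a minimal sketch):

    import numpy as np
    from scipy.stats import gaussian_kde

    rng = np.random.default_rng(42)
    sample = rng.standard_normal(100)

    grid = np.linspace(-4, 4, 201)
    for h in (0.05, 0.337, 2.0):
        # gaussian_kde treats a scalar bw_method as a factor multiplying the
        # sample standard deviation, so divide to obtain an absolute h.
        kde = gaussian_kde(sample, bw_method=h / sample.std(ddof=1))
        f_hat = kde(grid)
        print(f"h = {h}: estimate at 0 is {f_hat[100]:.3f}")  # true density phi(0) ~ 0.399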


The most common optimality criterion used to select this parameter is the expected L2 risk
function, also termed the mean integrated squared error:

    MISE(h) = E[ ∫ ( f̂_h(x) − f(x) )² dx ].

Under weak assumptions on ƒ and K (ƒ is the, generally unknown, real density function),[1][2]

    MISE(h) = AMISE(h) + o( 1/(nh) + h⁴ ),

where o is the little-o notation and n the sample size (as above). The AMISE is the asymptotic
MISE, i.e. the two leading terms,

    AMISE(h) = R(K)/(nh) + (1/4) m₂(K)² h⁴ R(f″),

where R(g) = ∫ g(x)² dx for a function g, m₂(K) = ∫ x² K(x) dx, f″ is the second derivative of ƒ,
and K is the kernel. The minimum of this AMISE is the solution to the differential equation

    (∂/∂h) AMISE(h) = − R(K)/(nh²) + m₂(K)² h³ R(f″) = 0,

or

    h_AMISE = R(K)^(1/5) / ( m₂(K)^(2/5) R(f″)^(1/5) n^(1/5) ) = C n^(−1/5).

[Figure: Kernel density estimate (KDE) with different bandwidths of a random sample of 100
points from a standard normal distribution. Grey: true density (standard normal). Red: KDE with
h = 0.05. Black: KDE with h = 0.337. Green: KDE with h = 2.]
Neither the AMISE nor the h_AMISE formulas can be used directly since they involve the unknown
density function ƒ or its second derivative f″. To overcome that difficulty, a variety of automatic,
data-based methods have been developed to select the bandwidth. Several review studies have
been undertaken to compare their efficacies,[9][10][11][12][13][14][15] with the general consensus that
the plug-in selectors[7][16][17] and cross-validation selectors[18][19][20] are the most useful over a
wide range of data sets.
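
As an illustration of cross-validation selection, one can score candidate bandwidths by held-out
log-likelihood. A minimal sketch using scikit-learn's KernelDensity (the bandwidth grid is an
arbitrary choice):

    import numpy as np
    from sklearn.model_selection import GridSearchCV
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)
    sample = rng.standard_normal(100).reshape(-1, 1)  # sklearn expects 2-D input

    # Score each candidate bandwidth by cross-validated log-likelihood.
    search = GridSearchCV(
        KernelDensity(kernel="gaussian"),
        {"bandwidth": np.logspace(-1, 1, 30)},
        cv=5,
    )
    search.fit(sample)
    print(search.best_params_["bandwidth"])  # data-driven bandwidth for this sample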

Substituting any bandwidth h which has the same asymptotic order n^(−1/5) as h_AMISE into the
AMISE gives AMISE(h) = O(n^(−4/5)), where O is the big O notation. It can be shown that, under
weak assumptions, there cannot exist a non-parametric estimator that converges at a faster rate
than the kernel estimator.[21] Note that the n^(−4/5) rate is slower than the typical n^(−1)
convergence rate of parametric methods.

If the bandwidth is not held fixed, but is varied depending upon the location of either the estimate
(balloon estimator) or the samples (pointwise estimator), this produces a particularly powerful
method termed adaptive or variable bandwidth kernel density estimation.
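
One common sample-point variant uses Abramson's square-root rule, shrinking the bandwidth where a
fixed-bandwidth pilot estimate is large. A minimal sketch (the pilot bandwidth h0 is an arbitrary
choice, and this is only one of several adaptive schemes):

    import numpy as np

    def gauss(u):
        return np.exp(-0.5 * u**2) / np.sqrt(2 * np.pi)

    def adaptive_kde(x, data, h0):
        """Sample-point adaptive KDE: each x_i gets its own bandwidth h0 * lam_i."""
        x, data = np.asarray(x, float), np.asarray(data, float)
        n = len(data)
        # Pilot estimate with fixed bandwidth h0, evaluated at the samples.
        pilot = gauss((data[:, None] - data[None, :]) / h0).sum(axis=1) / (n * h0)
        # Abramson's rule: lam_i = (pilot(x_i) / g)^(-1/2), g = geometric mean.
        g = np.exp(np.mean(np.log(pilot)))
        lam = (pilot / g) ** -0.5
        h_i = h0 * lam  # per-sample bandwidths
        u = (x[:, None] - data[None, :]) / h_i[None, :]
        return (gauss(u) / h_i[None, :]).sum(axis=1) / n

    rng = np.random.default_rng(1)
    data = np.concatenate([rng.normal(0, 1, 150), rng.normal(6, 0.3, 50)])
    print(adaptive_kde(np.array([0.0, 6.0]), data, h0=0.5))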


Bandwidth selection for kernel density estimation of heavy-tailed distributions is relatively
difficult.[22]

A rule-of-thumb bandwidth estimator


If Gaussian basis functions are used to approximate univariate data, and the underlying density
being estimated is Gaussian, the optimal choice for h (that is, the bandwidth that minimises the
mean integrated squared error) is:[23]

    h = ( 4σ̂⁵ / (3n) )^(1/5) ≈ 1.06 σ̂ n^(−1/5),

where σ̂ is the standard deviation of the samples. An h value is considered more robust when it
improves the fit for long-tailed and skewed distributions or for bimodal mixture distributions.
This is often done empirically by replacing the standard deviation σ̂ by the parameter A below:

    A = min( σ̂, IQR/1.34 ),

where IQR is the interquartile range. Another modification that will improve the model is to
reduce the factor from 1.06 to 0.9. Then the final formula would be:

    h = 0.9 min( σ̂, IQR/1.34 ) n^(−1/5),

where n is the sample size.
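
Both forms of the rule translate directly into code. A minimal sketch:

    import numpy as np

    def silverman_bandwidth(data, robust=True):
        """Rule-of-thumb bandwidth: 0.9 * min(std, IQR/1.34) * n^(-1/5) if robust,
        otherwise 1.06 * std * n^(-1/5)."""
        data = np.asarray(data, dtype=float)
        n = len(data)
        std = data.std(ddof=1)
        if robust:
            iqr = np.subtract(*np.percentile(data, [75, 25]))
            return 0.9 * min(std, iqr / 1.34) * n ** (-1 / 5)
        return 1.06 * std * n ** (-1 / 5)

    rng = np.random.default_rng(0)
    print(silverman_bandwidth(rng.standard_normal(100)))  # ~0.36 for standard normal, n = 100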

This approximation is termed the normal distribution approximation, Gaussian approximation,
or Silverman's rule of thumb.[23] While this rule of thumb is easy to compute, it should be used
with caution as it can yield widely inaccurate estimates when the density is not close to being
normal. For example, when estimating a bimodal Gaussian mixture model from a sample of 200
points, the figure on the right shows the true density and two kernel density estimates — one
using the rule-of-thumb bandwidth, and the other using a solve-the-equation bandwidth.[7][17] The
estimate based on the rule-of-thumb bandwidth is significantly oversmoothed.

[Figure: Comparison between rule-of-thumb and solve-the-equation bandwidth.]

Relation to the characteristic function density estimator


Given the sample (x1, x2, ..., xn), it is natural to estimate the characteristic function
φ(t) = E[e^(itX)] as

    φ̂(t) = (1/n) Σ_{j=1}^{n} e^(it x_j).

Knowing the characteristic function, it is possible to find the corresponding probability density
function through the Fourier transform formula. One difficulty with applying this inversion
formula is that it leads to a diverging integral, since the estimate φ̂(t) is unreliable for large
t's. To circumvent this problem, the estimator φ̂(t) is multiplied by a damping function
ψ_h(t) = ψ(ht), which is equal to 1 at the origin and then falls to 0 at infinity. The "bandwidth
parameter" h controls how fast we try to dampen the function φ̂(t). In particular when h is small,
then ψ_h(t) will be approximately one for a large range of t's, which means that φ̂(t) remains
practically unaltered in the most important region of t's.

The most common choice for the function ψ is either the uniform function ψ(t) = 1{−1 ≤ t ≤ 1},
which effectively means truncating the interval of integration in the inversion formula to
[−1/h, 1/h], or the Gaussian function ψ(t) = e^(−πt²). Once the function ψ has been chosen, the
inversion formula may be applied, and the density estimator will be

    f̂(x) = (1/2π) ∫ φ̂(t) ψ_h(t) e^(−itx) dt = (1/(nh)) Σ_{j=1}^{n} K((x − x_j)/h),

where K is the Fourier transform of the damping function ψ. Thus the kernel density estimator
coincides with the characteristic function density estimator.
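
This equivalence can be checked numerically: with the Gaussian damping ψ(t) = e^(−πt²), the
implied kernel K works out to a Gaussian with standard deviation h√(2π). A sketch comparing the
inversion integral against the direct kernel sum (the t-grid truncation is an ad hoc numerical
choice):

    import numpy as np

    rng = np.random.default_rng(0)
    data = rng.standard_normal(50)
    n, h = len(data), 0.4
    x = np.linspace(-3, 3, 7)

    # Route 1: damped characteristic-function inversion (numerical integral).
    t = np.linspace(-40, 40, 4001)  # truncation of the diverging integral
    phi_hat = np.exp(1j * t[None, :] * data[:, None]).mean(axis=0)  # empirical CF
    psi = np.exp(-np.pi * (h * t) ** 2)  # Gaussian damping psi(ht)
    integrand = phi_hat[None, :] * psi[None, :] * np.exp(-1j * t[None, :] * x[:, None])
    f_cf = np.trapz(integrand, t, axis=1).real / (2 * np.pi)

    # Route 2: direct Gaussian-kernel KDE; the implied kernel has std h*sqrt(2*pi).
    s = h * np.sqrt(2 * np.pi)
    f_kde = (np.exp(-0.5 * ((x[:, None] - data[None, :]) / s) ** 2)
             / (s * np.sqrt(2 * np.pi))).sum(axis=1) / n

    print(np.max(np.abs(f_cf - f_kde)))  # ~0: the two estimators coincide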

Geometric and topological features


We can extend the definition of the (global) mode to a local sense and define the local modes:

    M = { x : ∇f(x) = 0, λ₁(∇²f(x)) < 0 },

where λ₁(∇²f(x)) denotes the largest eigenvalue of the Hessian ∇²f(x). Namely, M is the
collection of points for which the density function is locally maximized. A natural estimator of
M is a plug-in from KDE,[24][25]

    M̂ = { x : ∇f̂(x) = 0, λ₁(∇²f̂(x)) < 0 },

where ∇f̂ and ∇²f̂ are the KDE versions of ∇f and ∇²f. Under mild assumptions, M̂ is a
consistent estimator of M. Note that one can use the mean shift algorithm[26][27][28] to compute
the estimator M̂ numerically.
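
A minimal sketch of the mean shift iteration for locating a local mode of a Gaussian-kernel KDE
(starting points, bandwidth, and tolerance are arbitrary choices):

    import numpy as np

    def mean_shift_mode(x0, data, h, tol=1e-8, max_iter=500):
        """Iterate the mean shift update until it converges to a local mode of the KDE."""
        x = float(x0)
        for _ in range(max_iter):
            w = np.exp(-0.5 * ((x - data) / h) ** 2)  # Gaussian kernel weights
            x_new = np.sum(w * data) / np.sum(w)      # weighted mean of the samples
            if abs(x_new - x) < tol:
                break
            x = x_new
        return x

    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-2, 0.5, 100), rng.normal(3, 0.8, 100)])
    # Starting from distinct initial points, the iteration finds the two local modes.
    print(mean_shift_mode(-1.0, data, h=0.4), mean_shift_mode(2.0, data, h=0.4))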

Statistical implementation
A non-exhaustive list of software implementations of kernel density estimators includes:

In Analytica release 4.4, the Smoothing option for PDF results uses KDE, and from
expressions it is available via the built-in Pdf function.
In C/C++, FIGTree (http://www.umiacs.umd.edu/~morariu/figtree/) is a library that can be used
to compute kernel density estimates using normal kernels. MATLAB interface available.
In C++, libagf (http://libagf.sf.net) is a library for variable kernel density estimation.
In C++, mlpack is a library that can compute KDE using many different kernels. It allows setting
an error tolerance for faster computation. Python and R interfaces are available.


In C# and F#, Math.NET Numerics is an open source library for numerical computation which
includes kernel density estimation (https://numerics.mathdotnet.com/api/MathNet.Numerics.Sta
tistics/KernelDensity.htm)
In CrimeStat, kernel density estimation is implemented using five different kernel functions –
normal, uniform, quartic, negative exponential, and triangular. Both single- and dual-kernel
density estimate routines are available. Kernel density estimation is also used in interpolating a
Head Bang routine, in estimating a two-dimensional Journey-to-crime density function, and in
estimating a three-dimensional Bayesian Journey-to-crime estimate.
In ELKI, kernel density functions can be found in the package
de.lmu.ifi.dbs.elki.math.statistics.kernelfunctions
In ESRI products, kernel density mapping is managed out of the Spatial Analyst toolbox and
uses the Quartic (biweight) kernel.
In Excel, the Royal Society of Chemistry has created an add-in to run kernel density estimation
based on their Analytical Methods Committee Technical Brief 4 (http://www.rsc.org/Membershi
p/Networking/InterestGroups/Analytical/AMC/Software/kerneldensities.asp).
In gnuplot, kernel density estimation is implemented by the smooth kdensity option, the
datafile can contain a weight and bandwidth for each point, or the bandwidth can be set
automatically[29] according to "Silverman's rule of thumb" (see above).
In Haskell, kernel density is implemented in the statistics (http://hackage.haskell.org/package/st
atistics) package.
In IGOR Pro, kernel density estimation is implemented by the StatsKDE operation (added in
Igor Pro 7.00). Bandwidth can be user specified or estimated by means of Silverman, Scott or
Bowman and Azzalini. Kernel types are: Epanechnikov, Bi-weight, Tri-weight, Triangular,
Gaussian and Rectangular.
In Java, the Weka machine learning package provides weka.estimators.KernelEstimator (http://
weka.sourceforge.net/doc.stable/weka/estimators/KernelEstimator.html), among others.
In JavaScript, the visualization package D3.js offers a KDE package in its science.stats
package.
In JMP, the Graph Builder platform utilizes kernel density estimation to provide contour plots
and high density regions (HDRs) for bivariate densities, and violin plots and HDRs for
univariate densities. Sliders allow the user to vary the bandwidth. Bivariate and univariate
kernel density estimates are also provided by the Fit Y by X and Distribution platforms,
respectively.
In Julia, kernel density estimation is implemented in the KernelDensity.jl (https://github.com/Juli
aStats/KernelDensity.jl) package.
In KNIME, 1D and 2D Kernel Density distributions can be generated and plotted using nodes
from the Vernalis community contribution, e.g. 1D Kernel Density Plot (https://hub.knime.com/v
ernalis/extensions/com.vernalis.knime.feature/latest/com.vernalis.knime.plot.nodes.kerneldensi
ty.KernelDensity1DPlotNodeFactory/), among others. The underlying implementation is written
in Java.
In MATLAB, kernel density estimation is implemented through the ksdensity function
(Statistics Toolbox). As of the 2018a release of MATLAB, both the bandwidth and kernel
smoother can be specified, including other options such as specifying the range of the kernel
density.[30] Alternatively, a free MATLAB software package which implements an automatic
bandwidth selection method[7] is available from the MATLAB Central File Exchange for
1-dimensional data (http://www.mathworks.com/matlabcentral/fileexchange/14034-kernel-d
ensity-estimator)
2-dimensional data (http://www.mathworks.com/matlabcentral/fileexchange/17204-kernel-d
ensity-estimation)
n-dimensional data (http://www.mathworks.com/matlabcentral/fileexchange/58312-kernel-d
ensity-estimator-for-high-dimensions)
A free MATLAB toolbox with implementation of kernel regression, kernel density estimation,
kernel estimation of hazard function and many others is available on these pages (http://www.math.muni.cz/english/science-and-research/developed-software/232-matlab-toolbox.html)
(this toolbox is a part of the book [31]).
In Mathematica, numeric kernel density estimation is implemented by the function
SmoothKernelDistribution[32] and symbolic estimation is implemented using the function
KernelMixtureDistribution,[33] both of which provide data-driven bandwidths.
In Minitab, the Royal Society of Chemistry has created a macro to run kernel density estimation
based on their Analytical Methods Committee Technical Brief 4.[34]
In the NAG Library, kernel density estimation is implemented via the g10ba routine (available
in both the Fortran[35] and the C[36] versions of the Library).
In Nuklei (http://nuklei.sourceforge.net/), C++ kernel density methods focus on data from the
Special Euclidean group SE(3).
In Octave, kernel density estimation is implemented by the kernel_density option
(econometrics package).
In Origin, a 2D kernel density plot can be made from its user interface, and two functions,
Ksdensity for 1D and Ks2density for 2D can be used from its LabTalk (http://wiki.originlab.com/
~originla/ltwiki/index.php?title=Category:LabTalk_Programming), Python, or C code.
In Perl, an implementation can be found in the Statistics-KernelEstimation module (https://sear
ch.cpan.org/~janert/Statistics-KernelEstimation-0.05)
In PHP, an implementation can be found in the MathPHP library (https://github.com/markrogoy
ski/math-php)
In Python, many implementations exist: pyqt_fit.kde Module (https://pythonhosted.org/PyQt-Fit/
mod_kde.html) in the PyQt-Fit package (https://pypi.python.org/packages/source/P/PyQt-Fit/Py
Qt-Fit-1.3.4.tar.gz), SciPy (scipy.stats.gaussian_kde), Statsmodels (KDEUnivariate
and KDEMultivariate), and scikit-learn (KernelDensity) (see comparison;[37] a usage
sketch follows this list). KDEpy (https://kdepy.readthedocs.io/en/latest/) supports weighted
data and its FFT implementation is orders of magnitude faster than the other implementations.
The commonly used pandas library (https://pandas.pydata.org/) offers kde plotting through the
plot method (df.plot(kind='kde') (https://pandas.pydata.org/pandas-docs/stable/reference/api/pan
das.DataFrame.plot.kde.html)). The getdist (https://getdist.readthedocs.io) package for
weighted and correlated MCMC samples supports optimized bandwidth, boundary correction
and higher-order methods for 1D and 2D distributions. The seaborn package also provides
kernel density estimation via sns.kdeplot().[38] A GPU implementation of KDE also exists.[39]
In R, kernel density estimation is implemented through the density function in the base
distribution; its default bandwidth comes from the bw.nrd0 function in the stats package,
which uses the optimized formula from Silverman's book. Further implementations include
bkde in the KernSmooth library (https://cran.r-project.org/web/packages/KernSmooth/index.html),
ParetoDensityEstimation in the DataVisualizations library (https://CRAN.R-project.org/pa
ckage=DataVisualizations) (for Pareto distribution density estimation), kde in the ks library (http
s://cran.r-project.org/web/packages/ks/index.html), dkden and dbckden in the evmix library (h
ttps://cran.r-project.org/web/packages/evmix/index.html) (the latter for boundary-corrected kernel
density estimation for bounded support), npudens in the np library (https://cran.r-project.org/w
eb/packages/np/index.html) (numeric and categorical data), and sm.density in the sm library (htt
ps://cran.r-project.org/web/packages/sm/index.html). For an implementation of the kde.R
function, which does not require installing any packages or libraries, see kde.R (http://web.mat
hs.unsw.edu.au/~zdravkobotev/kde.R). The btb library (https://cran.r-project.org/web/packages/
btb/index.html), dedicated to urban analysis, implements kernel density estimation through
kernel_smoothing.
In SAS, proc kde can be used to estimate univariate and bivariate kernel densities.
In Apache Spark, kernel density estimation is available through the KernelDensity() class.[40]
In Stata, it is implemented through kdensity;[41] for example histogram x, kdensity.
Alternatively a free Stata module KDENS is available[42] allowing a user to estimate 1D or 2D
density functions.


In Swift, it is implemented through SwiftStats.KernelDensityEstimation in the open-source
statistics library SwiftStats (https://github.com/r0fls/swiftstats).
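
A short usage sketch of two of the Python implementations listed above (SciPy's gaussian_kde
and scikit-learn's KernelDensity):

    import numpy as np
    from scipy.stats import gaussian_kde
    from sklearn.neighbors import KernelDensity

    rng = np.random.default_rng(0)
    sample = rng.standard_normal(200)
    grid = np.linspace(-3, 3, 5)

    # SciPy: bandwidth chosen by Scott's rule unless bw_method is given.
    print(gaussian_kde(sample)(grid))

    # scikit-learn: expects 2-D arrays and returns log-densities.
    kd = KernelDensity(kernel="gaussian", bandwidth=0.35).fit(sample.reshape(-1, 1))
    print(np.exp(kd.score_samples(grid.reshape(-1, 1))))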

See also
Kernel (statistics)
Kernel smoothing
Kernel regression
Density estimation (with presentation of other examples)
Mean-shift
Scale space: The triplets {(x, h, KDE with bandwidth h evaluated at x) : all x, h > 0} form a scale
space representation of the data.
Multivariate kernel density estimation
Variable kernel density estimation
Head/tail breaks

Further reading
Härdle, Müller, Sperlich, Werwatz, Nonparametric and Semiparametric Methods, Springer-
Verlag Berlin Heidelberg 2004, pp. 39–83

References
1. Rosenblatt, M. (1956). "Remarks on Some Nonparametric Estimates of a Density Function" (htt
ps://doi.org/10.1214%2Faoms%2F1177728190). The Annals of Mathematical Statistics. 27 (3):
832–837. doi:10.1214/aoms/1177728190 (https://doi.org/10.1214%2Faoms%2F1177728190).
2. Parzen, E. (1962). "On Estimation of a Probability Density Function and Mode" (https://doi.org/
10.1214%2Faoms%2F1177704472). The Annals of Mathematical Statistics. 33 (3): 1065–
1076. doi:10.1214/aoms/1177704472 (https://doi.org/10.1214%2Faoms%2F1177704472).
JSTOR 2237880 (https://www.jstor.org/stable/2237880).
3. Piryonesi S. Madeh; El-Diraby Tamer E. (2020-06-01). "Role of Data Analytics in Infrastructure
Asset Management: Overcoming Data Size and Quality Problems". Journal of Transportation
Engineering, Part B: Pavements. 146 (2): 04020022. doi:10.1061/JPEODX.0000175 (https://do
i.org/10.1061%2FJPEODX.0000175). S2CID 216485629 (https://api.semanticscholar.org/Corp
usID:216485629).
4. Hastie, Trevor; Tibshirani, Robert; Friedman, Jerome H. (2001). The Elements of Statistical
Learning : Data Mining, Inference, and Prediction : with 200 full-color illustrations. New York:
Springer. ISBN 0-387-95284-5. OCLC 46809224 (https://www.worldcat.org/oclc/46809224).
5. Epanechnikov, V.A. (1969). "Non-parametric estimation of a multivariate probability density".
Theory of Probability and Its Applications. 14: 153–158. doi:10.1137/1114019 (https://doi.org/1
0.1137%2F1114019).
6. Wand, M.P; Jones, M.C. (1995). Kernel Smoothing. London: Chapman & Hall/CRC. ISBN 978-
0-412-55270-0.
7. Botev, Zdravko (2007). Nonparametric Density Estimation via Diffusion Mixing (https://espace.li
brary.uq.edu.au/view/UQ:120006) (Technical report). University of Queensland.
8. Scott, D. (1979). "On optimal and data-based histograms". Biometrika. 66 (3): 605–610.
doi:10.1093/biomet/66.3.605 (https://doi.org/10.1093%2Fbiomet%2F66.3.605).


9. Park, B.U.; Marron, J.S. (1990). "Comparison of data-driven bandwidth selectors". Journal of
the American Statistical Association. 85 (409): 66–72. CiteSeerX 10.1.1.154.7321 (https://cites
eerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.154.7321).
doi:10.1080/01621459.1990.10475307 (https://doi.org/10.1080%2F01621459.1990.1047530
7). JSTOR 2289526 (https://www.jstor.org/stable/2289526).
10. Park, B.U.; Turlach, B.A. (1992). "Practical performance of several data driven bandwidth
selectors (with discussion)" (https://ideas.repec.org/p/cor/louvco/1992005.html). Computational
Statistics. 7: 251–270.
11. Cao, R.; Cuevas, A.; Manteiga, W. G. (1994). "A comparative study of several smoothing
methods in density estimation". Computational Statistics and Data Analysis. 17 (2): 153–176.
doi:10.1016/0167-9473(92)00066-Z (https://doi.org/10.1016%2F0167-9473%2892%2900066-
Z).
12. Jones, M.C.; Marron, J.S.; Sheather, S. J. (1996). "A brief survey of bandwidth selection for
density estimation". Journal of the American Statistical Association. 91 (433): 401–407.
doi:10.2307/2291420 (https://doi.org/10.2307%2F2291420). JSTOR 2291420 (https://www.jsto
r.org/stable/2291420).
13. Sheather, S.J. (1992). "The performance of six popular bandwidth selection methods on some
real data sets (with discussion)". Computational Statistics. 7: 225–250, 271–281.
14. Agarwal, N.; Aluru, N.R. (2010). "A data-driven stochastic collocation approach for uncertainty
quantification in MEMS" (https://webhost.engr.illinois.edu/~aluru/Journals/IJNME10.pdf) (PDF).
International Journal for Numerical Methods in Engineering. 83 (5): 575–597.
Bibcode:2010IJNME..83..575A (https://ui.adsabs.harvard.edu/abs/2010IJNME..83..575A).
doi:10.1002/nme.2844 (https://doi.org/10.1002%2Fnme.2844). S2CID 84834908 (https://api.se
manticscholar.org/CorpusID:84834908).
15. Xu, X.; Yan, Z.; Xu, S. (2015). "Estimating wind speed probability distribution by diffusion-
based kernel density method". Electric Power Systems Research. 121: 28–37.
doi:10.1016/j.epsr.2014.11.029 (https://doi.org/10.1016%2Fj.epsr.2014.11.029).
16. Botev, Z.I.; Grotowski, J.F.; Kroese, D.P. (2010). "Kernel density estimation via diffusion".
Annals of Statistics. 38 (5): 2916–2957. arXiv:1011.2602 (https://arxiv.org/abs/1011.2602).
doi:10.1214/10-AOS799 (https://doi.org/10.1214%2F10-AOS799). S2CID 41350591 (https://ap
i.semanticscholar.org/CorpusID:41350591).
17. Sheather, S.J.; Jones, M.C. (1991). "A reliable data-based bandwidth selection method for
kernel density estimation". Journal of the Royal Statistical Society, Series B. 53 (3): 683–690.
doi:10.1111/j.2517-6161.1991.tb01857.x (https://doi.org/10.1111%2Fj.2517-6161.1991.tb0185
7.x). JSTOR 2345597 (https://www.jstor.org/stable/2345597).
18. Rudemo, M. (1982). "Empirical choice of histograms and kernel density estimators".
Scandinavian Journal of Statistics. 9 (2): 65–78. JSTOR 4615859 (https://www.jstor.org/stable/
4615859).
19. Bowman, A.W. (1984). "An alternative method of cross-validation for the smoothing of density
estimates". Biometrika. 71 (2): 353–360. doi:10.1093/biomet/71.2.353 (https://doi.org/10.109
3%2Fbiomet%2F71.2.353).
20. Hall, P.; Marron, J.S.; Park, B.U. (1992). "Smoothed cross-validation" (https://doi.org/10.1007%
2FBF01205233). Probability Theory and Related Fields. 92: 1–20. doi:10.1007/BF01205233 (h
ttps://doi.org/10.1007%2FBF01205233). S2CID 121181481 (https://api.semanticscholar.org/Co
rpusID:121181481).
21. Wahba, G. (1975). "Optimal convergence properties of variable knot, kernel, and orthogonal
series methods for density estimation" (https://doi.org/10.1214%2Faos%2F1176342997).
Annals of Statistics. 3 (1): 15–29. doi:10.1214/aos/1176342997 (https://doi.org/10.1214%2Fao
s%2F1176342997).


22. Buch-Larsen, T. (2005). "Kernel density estimation for heavy-tailed distributions using the
Champernowne transformation". Statistics. 39 (6): 503–518. CiteSeerX 10.1.1.457.1544 (http
s://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.457.1544).
doi:10.1080/02331880500439782 (https://doi.org/10.1080%2F02331880500439782).
S2CID 219697435 (https://api.semanticscholar.org/CorpusID:219697435).
23. Silverman, B.W. (1986). Density Estimation for Statistics and Data Analysis (https://archive.org/
details/densityestimatio00silv_0/page/45). London: Chapman & Hall/CRC. p. 45 (https://archiv
e.org/details/densityestimatio00silv_0/page/45). ISBN 978-0-412-24620-3.
24. Chen, Yen-Chi; Genovese, Christopher R.; Wasserman, Larry (2016). "A comprehensive
approach to mode clustering" (https://doi.org/10.1214%2F15-ejs1102). Electronic Journal of
Statistics. 10 (1): 210–241. arXiv:1406.1780 (https://arxiv.org/abs/1406.1780). doi:10.1214/15-
ejs1102 (https://doi.org/10.1214%2F15-ejs1102). ISSN 1935-7524 (https://www.worldcat.org/is
sn/1935-7524).
25. Chazal, Frédéric; Fasy, Brittany Terese; Lecci, Fabrizio; Rinaldo, Alessandro; Wasserman,
Larry (2014). "Stochastic Convergence of Persistence Landscapes and Silhouettes" (https://joc
g.org/index.php/jocg/article/view/2982). Proceedings of the thirtieth annual symposium on
Computational geometry. Vol. 6. New York, New York, USA: ACM Press. pp. 474–483.
doi:10.1145/2582112.2582128 (https://doi.org/10.1145%2F2582112.2582128). ISBN 978-1-
4503-2594-3. S2CID 6029340 (https://api.semanticscholar.org/CorpusID:6029340).
26. Fukunaga, K.; Hostetler, L. (January 1975). "The estimation of the gradient of a density
function, with applications in pattern recognition". IEEE Transactions on Information Theory. 21
(1): 32–40. doi:10.1109/tit.1975.1055330 (https://doi.org/10.1109%2Ftit.1975.1055330).
ISSN 0018-9448 (https://www.worldcat.org/issn/0018-9448).
27. Yizong Cheng (1995). "Mean shift, mode seeking, and clustering". IEEE Transactions on
Pattern Analysis and Machine Intelligence. 17 (8): 790–799. CiteSeerX 10.1.1.510.1222 (http
s://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.510.1222). doi:10.1109/34.400568 (http
s://doi.org/10.1109%2F34.400568). ISSN 0162-8828 (https://www.worldcat.org/issn/0162-882
8).
28. Comaniciu, D.; Meer, P. (May 2002). "Mean shift: a robust approach toward feature space
analysis". IEEE Transactions on Pattern Analysis and Machine Intelligence. 24 (5): 603–619.
doi:10.1109/34.1000236 (https://doi.org/10.1109%2F34.1000236). ISSN 0162-8828 (https://ww
w.worldcat.org/issn/0162-8828). S2CID 691081 (https://api.semanticscholar.org/CorpusID:6910
81).
29. Janert, Philipp K (2009). Gnuplot in action : understanding data with graphs. Connecticut, USA:
Manning Publications. ISBN 978-1-933988-39-9. See section 13.2.2 entitled Kernel density
estimates.
30. "Kernel smoothing function estimate for univariate and bivariate data - MATLAB ksdensity" (htt
ps://www.mathworks.com/help/stats/ksdensity.html). www.mathworks.com. Retrieved
2020-11-05.
31. Horová, I.; Koláček, J.; Zelinka, J. (2012). Kernel Smoothing in MATLAB: Theory and Practice
of Kernel Smoothing. Singapore: World Scientific Publishing. ISBN 978-981-4405-48-5.
32. "SmoothKernelDistribution—Wolfram Language Documentation" (https://reference.wolfram.co
m/language/ref/SmoothKernelDistribution.html.en). reference.wolfram.com. Retrieved
2020-11-05.
33. "KernelMixtureDistribution—Wolfram Language Documentation" (https://reference.wolfram.co
m/language/ref/KernelMixtureDistribution.html.en). reference.wolfram.com. Retrieved
2020-11-05.
34. "Software for calculating kernel densities" (https://www.rsc.org/Membership/Networking/Interest
Groups/Analytical/AMC/Software/kerneldensities.asp). www.rsc.org. Retrieved 2020-11-05.
35. The Numerical Algorithms Group. "NAG Library Routine Document:
nagf_smooth_kerndens_gauss (g10baf)" (http://www.nag.co.uk/numeric/fl/nagdoc_fl23/pdf/G1
0/g10baf.pdf) (PDF). NAG Library Manual, Mark 23. Retrieved 2012-02-16.


36. The Numerical Algorithms Group. "NAG Library Routine Document: nag_kernel_density_estim
(g10bac)" (https://web.archive.org/web/20111124062333/http://nag.co.uk/numeric/cl/nagdoc_cl
09/pdf/G10/g10bac.pdf) (PDF). NAG Library Manual, Mark 9. Archived from the original (http://
www.nag.co.uk/numeric/CL/nagdoc_cl09/pdf/G10/g10bac.pdf) (PDF) on 2011-11-24. Retrieved
2012-02-16.
37. Vanderplas, Jake (2013-12-01). "Kernel Density Estimation in Python" (https://jakevdp.github.i
o/blog/2013/12/01/kernel-density-estimation/). Retrieved 2014-03-12.
38. "seaborn.kdeplot — seaborn 0.10.1 documentation" (https://seaborn.pydata.org/generated/sea
born.kdeplot.html). seaborn.pydata.org. Retrieved 2020-05-12.
39. "Kde-gpu: We implemented nadaraya waston kernel density and kernel conditional probability
estimator using cuda through cupy. It is much faster than cpu version but it requires GPU with
high memory" (https://pypi.org/project/kde-gpu/#description).
40. "Basic Statistics - RDD-based API - Spark 3.0.1 Documentation" (http://spark.apache.org/docs/
latest/mllib-statistics.html#kernel-density-estimation). spark.apache.org. Retrieved 2020-11-05.
41. https://www.stata.com/manuals15/rkdensity.pdf
42. Jann, Ben (2008-05-26), "KDENS: Stata module for univariate kernel density estimation" (http
s://ideas.repec.org/c/boc/bocode/s456410.html), Statistical Software Components, Boston
College Department of Economics, retrieved 2022-10-15

External links
Introduction to kernel density estimation (http://www.mvstat.net/tduong/research/seminars/semi
nar-2001-05) A short tutorial which motivates kernel density estimators as an improvement
over histograms.
Kernel Bandwidth Optimization (https://www.neuralengine.org/res/kernel.html) A free online tool
that generates an optimized kernel density estimate.
Free Online Software (Calculator) (http://www.wessa.net/rwasp_density.wasp) computes the
Kernel Density Estimation for a data series according to the following Kernels: Gaussian,
Epanechnikov, Rectangular, Triangular, Biweight, Cosine, and Optcosine.
Kernel Density Estimation Applet (http://pcarvalho.com/things/kerneldensityestimation/index.ht
ml) An online interactive example of kernel density estimation. Requires .NET 3.0 or later.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Kernel_density_estimation&oldid=1210998018"
