Lecture 4


1 Parzen window estimation

n = 200 random samples were generated from the given triangle distribution with parameters a = 1.0, b = 8.0, and c = 6.0.
The first plot displays the exponential and Epanechnikov kernel functions, while the
second plot shows the estimated densities using these kernels via Parzen window
estimation, along with the true triangle density for comparison.

In the density estimation plot, the estimates produced by the Parzen window method with each kernel are plotted against the true density of the triangular distribution. The comparison shows how each kernel smooths the sample set and where the two estimates differ.
For n = 10000 samples, the corresponding plot illustrates the density estimates obtained with the Epanechnikov kernel for different window widths h and compares them to the true triangle density.
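As an illustration, here is a minimal Python sketch of the Parzen window estimation described above. The triangle parameters a = 1.0, b = 8.0, c = 6.0 and the two kernels follow the text; the window width h = 1.0, the random seed, and the plotting details are assumptions for illustration only.

```python
import numpy as np
import matplotlib.pyplot as plt

# Triangle distribution parameters from the exercise: a (left), b (right), c (mode).
a, b, c = 1.0, 8.0, 6.0
n = 200
rng = np.random.default_rng(0)  # seed is an arbitrary choice
samples = rng.triangular(left=a, mode=c, right=b, size=n)

def exponential_kernel(u):
    # Two-sided exponential (Laplace) kernel; integrates to 1 over the real line.
    return 0.5 * np.exp(-np.abs(u))

def epanechnikov_kernel(u):
    # Epanechnikov kernel: 3/4 * (1 - u^2) on [-1, 1], zero elsewhere.
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def parzen_estimate(x_grid, samples, kernel, h):
    # p_hat(x) = (1/n) * sum_i (1/h) * K((x - x_i) / h)
    u = (x_grid[:, None] - samples[None, :]) / h
    return kernel(u).sum(axis=1) / (len(samples) * h)

def triangle_pdf(x, a, b, c):
    # True triangular density for comparison.
    pdf = np.zeros_like(x)
    up = (x >= a) & (x < c)
    down = (x >= c) & (x <= b)
    pdf[up] = 2 * (x[up] - a) / ((b - a) * (c - a))
    pdf[down] = 2 * (b - x[down]) / ((b - a) * (b - c))
    return pdf

x_grid = np.linspace(0.0, 9.0, 400)
h = 1.0  # assumed window width
plt.plot(x_grid, triangle_pdf(x_grid, a, b, c), label="true triangle density")
plt.plot(x_grid, parzen_estimate(x_grid, samples, exponential_kernel, h), label="exponential kernel")
plt.plot(x_grid, parzen_estimate(x_grid, samples, epanechnikov_kernel, h), label="Epanechnikov kernel")
plt.legend()
plt.show()
```

Rerunning the sketch with a larger n or a different h reproduces the qualitative behaviour discussed above: small h gives a noisy estimate, large h oversmooths the peak at the mode.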

2 Non-parametric classification

a) Choice between Parzen Window, K-Nearest Neighbor, and Posterior Estimation:

● Parzen Window Estimation: Ideal for smaller datasets because every training sample must be stored and evaluated when the density is computed. It is more flexible in capturing the shape of the distribution but requires more memory and computational resources.
● K-Nearest Neighbor (KNN) Estimation: Suitable for medium-sized datasets.
KNN is less computationally intensive compared to Parzen windows.
However, it may not perform as well with very large datasets or in
high-dimensional spaces due to the curse of dimensionality.
● Posterior Estimation: Best used when you have a large amount of data and a
good understanding of the underlying distribution. It's computationally
efficient for large datasets, especially if the class distributions are known or
can be approximated well.

b) Requirements for a Valid Parzen Window (Kernel):

● The kernel function φ(u) must be non-negative, φ(u) ≥ 0 for all u.
● It must integrate to one over its entire range, ∫ φ(u) du = 1, so that the resulting estimate is itself a valid probability density function (a quick numerical check is sketched below).
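A short numerical check of both requirements for the Epanechnikov kernel used in Section 1; this is only a sketch, and the grid resolution is an arbitrary choice.

```python
import numpy as np

def epanechnikov_kernel(u):
    # 3/4 * (1 - u^2) on [-1, 1], zero elsewhere.
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

u = np.linspace(-2.0, 2.0, 100001)
values = epanechnikov_kernel(u)
du = u[1] - u[0]
print("non-negative everywhere:", bool(np.all(values >= 0.0)))  # requirement 1
print("integral (Riemann sum):", float(values.sum() * du))      # requirement 2, ~1.0
```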

c) Worst-Case Error-Rate for K-NN with k = n:

● In the worst case, if you choose k equal to n (the total number of samples),
the k-NN classifier will always predict the most frequent class in the dataset,
regardless of the input.
● If the dataset has c categories, the error rate equals the proportion of the dataset that does not belong to the most frequent class. In the worst case, when the classes are evenly distributed, this is 1 - 1/c (for example, 0.5 for c = 2 balanced classes).

d) Explanation of Pn(ωi | x) = ki/k:

● This formula gives the estimated posterior probability that a sample belongs to class ωi given the feature vector x, in the context of k-NN classification.
● Intuitively, it is the fraction of the k nearest neighbors of x that belong to class ωi: counting the ki neighbors of class ωi among the k nearest gives a direct estimate of how likely x is to belong to ωi, as illustrated in the sketch below.
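For intuition, a tiny sketch with hypothetical neighbor labels; the labels and k = 5 are made up for illustration.

```python
from collections import Counter

# Hypothetical class labels of the k = 5 nearest neighbors of a query x.
neighbor_labels = ["w1", "w2", "w1", "w1", "w3"]
k = len(neighbor_labels)

counts = Counter(neighbor_labels)                      # k_i for each class
posteriors = {c: ki / k for c, ki in counts.items()}   # P_n(w_i | x) = k_i / k
print(posteriors)   # {'w1': 0.6, 'w2': 0.2, 'w3': 0.2}
```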

e) Naive Approach for k-NN and Computational Complexity:

● The naive approach computes the distance from the query sample x to every sample in the dataset and then selects the k closest samples (sketched below).
● Computing these n distances costs O(n·d), where n is the number of samples and d is the dimensionality of the data; selecting the k closest adds at most O(n log k), e.g. with a heap, so the distance computations dominate.
● To improve this, you can use data structures such as KD-trees or ball trees to avoid computing the distance to every sample, although their benefit diminishes as the dimensionality grows.
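A minimal sketch of this naive approach; the toy data, the function name knn_classify, and the choice of Euclidean distance are assumptions for illustration.

```python
import numpy as np
from collections import Counter

def knn_classify(x, X_train, y_train, k):
    # Naive k-NN: compute all n distances (O(n*d)), then pick the k smallest.
    dists = np.linalg.norm(X_train - x, axis=1)       # n Euclidean distances
    nearest = np.argpartition(dists, k - 1)[:k]       # indices of the k closest samples
    votes = Counter(y_train[i] for i in nearest)      # k_i counts among the neighbors
    return votes.most_common(1)[0][0]                 # majority vote

# Toy usage with hypothetical 2-D data.
X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [1.1, 0.9]])
y_train = np.array([0, 0, 1, 1])
print(knn_classify(np.array([0.9, 1.0]), X_train, y_train, k=3))   # -> 1
```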
