A4 Task: Name: Piyush Vishwakarma Reg No: 18BCE10186
Q1. Using K-means clustering, cluster the following data into two clusters and
show each step.
{3, 5, 10, 13, 4, 21, 31, 12, 26}.
Give a clear step-by-step analysis.
Ans.
Given - {3, 5, 10, 13, 4, 21, 31, 12, 26}
First, we assign alternate values to two initial clusters:
k1 = {3, 10, 4, 31, 26} and k2 = {5, 13, 21, 12}
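Next, we compute the centroid (mean) of each cluster: c1 = 74/5 = 14.8 for k1
and c2 = 51/4 = 12.75 for k2. We then assign each value to the cluster whose
centroid is nearest, i.e. for which the distance to c1 or c2 is minimum.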
After assigning 3: k1 = {} and k2 = {3}
After assigning 10: k1 = {} and k2 = {3, 10}
After assigning 4: k1 = {} and k2 = {3, 10, 4}
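The assignment steps above stop after the first three values. As a rough check, the whole procedure can be sketched in Python. This is only a minimal sketch under stated assumptions: it uses the standard batch form of K-means (recompute both centroids, then reassign every value), which may differ from the one-value-at-a-time bookkeeping shown above, and the names `kmeans_1d`, `final_clusters` and `final_centroids` are illustrative, not part of the original answer.

```python
def kmeans_1d(data, clusters, max_iter=100):
    """Batch K-means on 1-D data, starting from a given initial partition.

    Assumption: no cluster ever becomes empty (true for this data set).
    """
    for _ in range(max_iter):
        # Centroid of each cluster is the arithmetic mean of its members.
        centroids = [sum(c) / len(c) for c in clusters]
        # Reassign every value to the cluster with the nearest centroid.
        new_clusters = [[] for _ in centroids]
        for x in data:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(x - centroids[i]))
            new_clusters[nearest].append(x)
        if new_clusters == clusters:
            break  # assignments no longer change: converged
        clusters = new_clusters
    # Recompute centroids so they match the returned clusters.
    centroids = [sum(c) / len(c) for c in clusters]
    return clusters, centroids


data = [3, 5, 10, 13, 4, 21, 31, 12, 26]
# Alternate values form the initial clusters k1 and k2, as in the answer.
k1, k2 = data[0::2], data[1::2]
final_clusters, final_centroids = kmeans_1d(data, [k1, k2])
print(final_clusters)   # [[21, 31, 26], [3, 5, 10, 13, 4, 12]]
print(final_centroids)
```

Under these assumptions the run settles on k1 = {21, 31, 26} with centroid 26 and k2 = {3, 5, 10, 13, 4, 12} with centroid 47/6 (about 7.83), and a further pass leaves the assignments unchanged.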
Q2.
Formulate any four cluster scenarios to Calculate purity to measure the quality
of each cluster.
Ans.
The above image has 4 clusters (A, B, C, D) with three kinds of data items
coloured in aqua, green and yellow. So now we will find purity for all the
clusters.
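Purity is the ratio of the number of items of the dominant class in a cluster
to the size of that cluster.
W(i) = (1/n(i)) * max[n(ij)]
where n(i) is the number of items in cluster i and n(ij) is the number of items
of class j in cluster i.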
For A:
W(A) = (1/12) * max(2,4,6) = (1/12)*6 = 50%
For B:
W(B) = (1/12) * max(5,2,5) = (1/12)*5 = 41.7%
For C:
W(C) = (1/12) * max(3,6,3) = (1/12)*6 = 50%
For D:
W(D) = (1/12) * max(3,5,4) = (1/12)*5 = 41.7%
Comparing the purity values of the four clusters, clusters A and C have the
highest purity (50%), so they have the best quality in this scenario.
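The purity calculation above can be reproduced with a short Python sketch. It assumes each cluster is described only by its per-class item counts as used in the calculations; the mapping of each count to a specific colour is not stated in the text, so the lists are treated as unlabeled class counts. The names `purity`, `class_counts` and `purities` are illustrative.

```python
def purity(counts):
    """Purity of one cluster: dominant class count / cluster size."""
    return max(counts) / sum(counts)


# Per-class item counts for each cluster, taken from the calculations above.
# Assumption: which count belongs to which colour is unknown and not needed.
class_counts = {
    "A": [2, 4, 6],
    "B": [5, 2, 5],
    "C": [3, 6, 3],
    "D": [3, 5, 4],
}

purities = {name: purity(counts) for name, counts in class_counts.items()}
for name, value in purities.items():
    print(f"{name}: {value:.1%}")

# Clusters that reach the highest purity.
best = [name for name, value in purities.items()
        if value == max(purities.values())]
print(best)  # ['A', 'C']
```

Each cluster holds 12 items, so the purities come out as 6/12 for A and C and 5/12 for B and D, matching the hand calculation.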