ML Cat 2 - 4
Dr. R. Bhargavi
School of Computing Science & Engineering
VIT University, Chennai
Introduction - Hierarchical Clustering
• Hierarchical clustering is an unsupervised learning algorithm.
• K-means clustering requires us to pre-specify the number of clusters (K).
• In most real-world applications it is difficult to pre-specify the value of K.
• Hierarchical clustering does not require committing to a particular K.
Hierarchical Clustering - Approaches
• Strategies for hierarchical clustering generally fall into two types:
  • Agglomerative, or bottom-up, clustering
  • Divisive, or top-down, clustering
Agglomerative Clustering
• Agglomerative clustering results in a dendrogram: an inverted tree representation of the observations.
• The dendrogram is built starting from the leaves and combining clusters up to the trunk.
• A single dendrogram can be used to obtain any number of clusters.
Agglomerative Clustering - Working
• Start with each observation as an individual cluster.
• Merge the two most similar clusters, based on some dissimilarity measure.
• Repeat until all the observations are merged into a single cluster (a minimal code sketch follows below).
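As a quick illustration (not part of the original slides), here is a minimal sketch of this procedure using scikit-learn's AgglomerativeClustering; the toy 1-D data (the same values as the worked example below) and the choice of two final clusters are assumptions made for the demo.

```python
# Minimal sketch: agglomerative clustering on toy 1-D data (illustrative only).
import numpy as np
from sklearn.cluster import AgglomerativeClustering

X = np.array([[5.0], [8.0], [14.0], [25.0], [27.0]])  # 5 observations, 1 feature

# linkage="complete" merges clusters by their maximal intercluster dissimilarity
model = AgglomerativeClustering(n_clusters=2, linkage="complete")
labels = model.fit_predict(X)
print(labels)  # e.g. [0 0 0 1 1]: clusters {5, 8, 14} and {25, 27}
```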
Similarity measure
• Similarity measure between points: Euclidean distance, Manhattan distance, correlation, etc.
• Similarity measure between clusters: linkage (illustrated in the sketch below)

Linkage    Description
Complete   Maximal intercluster dissimilarity.
Single     Minimal intercluster dissimilarity.
Average    Mean intercluster dissimilarity.
Centroid   Dissimilarity between the cluster centroids.
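To make the linkage definitions concrete, here is an illustrative NumPy sketch (not from the slides) that evaluates each linkage between two small 1-D clusters:

```python
# Illustrative sketch: the four linkage values between two 1-D clusters.
import numpy as np

A = np.array([5.0, 8.0])    # cluster 1
B = np.array([25.0, 27.0])  # cluster 2

d = np.abs(A[:, None] - B[None, :])  # all pairwise point-to-point distances

print(d.max())                   # complete: maximal intercluster dissimilarity -> 22.0
print(d.min())                   # single:   minimal intercluster dissimilarity -> 17.0
print(d.mean())                  # average:  mean intercluster dissimilarity    -> 19.5
print(abs(A.mean() - B.mean()))  # centroid: distance between centroids -> 19.5
                                 # (coincides with average here: separated 1-D clusters)
```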
Example
• Data: 5, 8, 14, 25, 27
• Compute the dissimilarity between the initial clusters.

[Diagram: the points 5, 8, 14, 25, 27 on a number line, each starting as its own cluster]
Example (cont…)
Cluster 1   Cluster 2   Distance
{5}         {8}         3
{5}         {14}        9
{5}         {25}        20
{5}         {27}        22
{8}         {14}        6
{8}         {25}        17
{8}         {27}        19
{14}        {25}        11
{14}        {27}        13
{25}        {27}        2   ← min. dissimilarity

• Step 1: {25} and {27} are the least dissimilar clusters (distance 2), so they are merged into {25, 27}.

[Diagram: the points on a number line, with 25 and 27 joined]

Example (cont…)
• Dissimilarity between clusters (and between clusters and single elements) is computed with complete linkage: first compute the distances from each element of one cluster to every element of the other cluster, then take the maximum. For example, d({5}, {25, 27}) = max(20, 22) = 22.
• Step 2: recompute the inter-cluster distances, treating {25, 27} as a single cluster (distances among the remaining singletons are unchanged).

Cluster 1   Cluster 2   Distance
{5}         {25, 27}    22
{8}         {25, 27}    19
{14}        {25, 27}    13
{5}         {8}         3   ← min. dissimilarity

{5} and {8} are merged.
Example (cont…)
[Diagram: number line with {5, 8} and {25, 27} each joined]

• Step 3: recompute the inter-cluster distances, treating {5, 8} as a single cluster.

Cluster 1   Cluster 2   Distance
{5, 8}      {14}        9   ← min. dissimilarity
{5, 8}      {25, 27}    22
{25, 27}    {14}        13

{5, 8} and {14} are merged.

[Diagram: number line with {5, 8, 14} and {25, 27} joined]
Example (cont…)
• Step 4: only two clusters remain, {5, 8, 14} and {25, 27}; they are merged at distance 22, completing the hierarchy (a SciPy check follows below).

[Diagram: number line with all five points joined into a single cluster]
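The full sequence of merges can be checked with SciPy (an illustrative verification, not part of the slides); the third column of the linkage matrix reproduces the merge heights 2, 3, 9, 22:

```python
# Illustrative check of the worked example using SciPy's complete linkage.
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[5.0], [8.0], [14.0], [25.0], [27.0]])

# Each row of Z records (cluster_i, cluster_j, merge_distance, new_cluster_size)
Z = linkage(X, method="complete")
print(Z[:, 2])  # merge heights: [ 2.  3.  9. 22.]

# To draw the dendrogram shown on the next slide (requires matplotlib):
# from scipy.cluster.hierarchy import dendrogram
# import matplotlib.pyplot as plt
# dendrogram(Z, labels=[5, 8, 14, 25, 27]); plt.show()
```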
Dendrogram interpretation
[Dendrogram: leaves 5, 8, 14, 25, 27; fusion heights marked at 2 ({25, 27}), 3 ({5, 8}), 9 ({5, 8} with {14}), and 22 (the final merge)]
Dendrogram interpretation (cont…)
• The higher the similarity of two clusters, the sooner their fusion occurs (lower in the tree) during the construction of the dendrogram.
• Clusters that get fused near the top of the tree are the most dissimilar.
• The height of a fusion, as measured on the vertical axis, indicates how different the two merged clusters are.
Algorithm
• Begin with n observations and a measure (e.g., Euclidean distance) of all the nC2 = n(n − 1)/2 pairwise dissimilarities. Treat each observation as its own cluster.
• For i = n, n − 1, ..., 2:
  • Examine all pairwise inter-cluster dissimilarities among the i clusters and identify the pair of clusters that are least dissimilar (that is, most similar). Fuse these two clusters.
  • Compute the new pairwise inter-cluster dissimilarities among the i − 1 remaining clusters (a from-scratch sketch of this loop follows below).
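Here is a from-scratch sketch of that loop; the complete-linkage choice and the restriction to 1-D data are assumptions made for brevity, not requirements of the algorithm:

```python
# Naive agglomerative clustering sketch (complete linkage, 1-D data; illustrative only).
def agglomerate(points):
    clusters = [[p] for p in points]  # each observation starts as its own cluster
    merges = []
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest complete-linkage distance
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = max(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        clusters[i] = clusters[i] + clusters[j]  # fuse the two closest clusters
        del clusters[j]
    return merges

for a, b, d in agglomerate([5, 8, 14, 25, 27]):
    print(a, "+", b, "at distance", d)
# Reproduces the worked example: merges at distances 2, 3, 9, 22.
```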
Practical Issues in Hierarchical Clustering
• In order to perform clustering, some decisions must be made, such as:
  • Should the observations or features first be standardized in some way?
  • What dissimilarity measure should be used?
  • What type of linkage should be used?
  • Where should the dendrogram be cut in order to obtain clusters? (A sketch of cutting the dendrogram follows this list.)
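On the last question, here is an illustrative sketch of cutting the dendrogram at a chosen height using SciPy's fcluster; the cut height of 10 is an assumption made for the demo:

```python
# Illustrative sketch: cutting a dendrogram to obtain flat clusters.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

X = np.array([[5.0], [8.0], [14.0], [25.0], [27.0]])
Z = linkage(X, method="complete")

# Cutting at height 10 undoes every fusion above 10,
# leaving the two clusters {5, 8, 14} and {25, 27}.
labels = fcluster(Z, t=10, criterion="distance")
print(labels)  # e.g. [1 1 1 2 2]
```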