
AI Bootcamp – ML & EDA

Prof Dr. Hammad Afzal


hammad.afzal@mcs.edu.pk

Asst Prof Khawir Mahmood


khawir@mcs.edu.pk

Unsupervised Learning
Agenda
• Unsupervised Learning

– K-Means Clustering

– Agglomerative Clustering
Clustering
Clustering is the partitioning of a data set into subsets (clusters),
so that the data in each subset (ideally) share some common trait,
often according to some defined distance measure.

Clustering is unsupervised classification

Clustering
The notion of a cluster can be ambiguous
Clustering Applications
Types of Clustering
• A clustering is a set of clusters
• An important distinction is between hierarchical and partitional sets of
clusters

• Partitional Clustering
– A division of data objects into non-overlapping subsets (clusters)

• Hierarchical clustering
– A set of nested clusters organized as a hierarchical tree
Hierarchical Clustering
These find successive clusters using previously established clusters.

1. Agglomerative ("bottom-up"):
Agglomerative algorithms begin with each element as a separate cluster and merge
them into successively larger clusters.

2. Divisive ("top-down"):
Divisive algorithms begin with the whole set and proceed to divide it into
successively smaller clusters.
Hierarchical Clustering
Partitional Clustering
– Construct a partition of the data set into several clusters at once

– The process is repeated iteratively until a termination condition is met

– Examples
▪ K-means clustering (see the sketch below)
▪ Fuzzy c-means clustering
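For reference (not part of the original slides), a minimal scikit-learn sketch of k-means as a partitional method; the toy data and k = 2 are made-up assumptions:

```python
# Illustrative k-means with scikit-learn; the toy data and k = 2 are assumptions
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]])   # four 2-D points

km = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = km.fit_predict(X)       # cluster index of each point
print(labels)                    # e.g. [0 0 1 1]
print(km.cluster_centers_)       # the two centroids
```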
Partitional Clustering
K-Means Clustering

K-Means Example

K-Means: In-Class Practice
K-Means– Example 2
– Suppose we have 4 medicines and each has two attributes (pH
  and weight index).
– Our goal is to group these objects into K = 2 clusters of medicine.

Medicine   Weight   pH-Index
A          1        1
B          2        1
C          4        3
D          5        4
K-Means– Example 2
– Compute the distance between all samples and the K centroids.

Initial centroids: c1 = A = (1, 1), c2 = B = (2, 1)

d(D, c1) = √((5 − 1)² + (4 − 1)²) = 5
d(D, c2) = √((5 − 2)² + (4 − 1)²) = 4.24
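The same step can be sketched in NumPy (illustrative; the array layout and names are my own):

```python
# Distances from every sample to every centroid (NumPy sketch)
import numpy as np

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)   # A, B, C, D
centroids = np.array([[1, 1], [2, 1]], dtype=float)           # c1 = A, c2 = B

# dist[i, j] = Euclidean distance from sample i to centroid j (shape 4 x 2)
dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
print(dist.round(2))
# row for D: [5.   4.24]  -> D is closer to c2
```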
K-Means– Example 2
– Assign each sample to its closest cluster.

– An element in a row of the Group matrix below is 1 if and only if
  the object is assigned to that group.
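A corresponding sketch of the assignment step (illustrative; names are my own):

```python
# Group (membership) matrix: entry [k, i] is 1 iff object i belongs to cluster k
import numpy as np

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)   # A, B, C, D
centroids = np.array([[1, 1], [2, 1]], dtype=float)           # c1 = A, c2 = B
dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)

assignments = dist.argmin(axis=1)                 # closest centroid per object
group = np.zeros((len(centroids), len(X)), dtype=int)
group[assignments, np.arange(len(X))] = 1
print(group)
# [[1 0 0 0]
#  [0 1 1 1]]   -> A alone in cluster 1; B, C, D in cluster 2
```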
K-Means– Example 2
– Re-calculate the K centroids.
– Knowing the members of each cluster, we compute the new centroid
  of each group based on these new memberships.

c1 = (1, 1)
c2 = ((2 + 4 + 5)/3, (1 + 3 + 4)/3) = (11/3, 8/3) = (3.67, 2.67)
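A sketch of the centroid update (illustrative; names are my own):

```python
# Centroid update: each new centroid is the mean of its cluster's members
import numpy as np

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)   # A, B, C, D
assignments = np.array([0, 1, 1, 1])    # membership found in the previous step

new_centroids = np.array([X[assignments == k].mean(axis=0) for k in (0, 1)])
print(new_centroids.round(2))
# [[1.   1.  ]
#  [3.67 2.67]]
```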
K-Means– Example 2
• Repeat the above steps: compute the distance of all objects to the
  new centroids.
K-Means– Example 2

• Assign the new membership to the objects.
K-Means– Example 2

Knowing the members of each cluster, now we


compute the new centroid of each group based
on these new memberships.

 1+ 2 1+1 1
c1 =  ,  = (1 , 1)
 2 2  2
 4+5 3+4 1 1
c2 =  ,  = ( 4 ,3 )
 2 2  2 2
K-Means– Example 2
• We obtain G2 = G1: comparing the grouping from the last iteration
  with this iteration shows that the objects do not change groups
  anymore.

• Thus the k-means clustering has reached stability and no more
  iterations are needed (a short sketch of the full loop follows).
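Putting the steps of this worked example together, a minimal sketch of the full loop (illustrative; the function and variable names are my own):

```python
# Minimal k-means loop for the worked example; stops when the grouping is stable
import numpy as np

def kmeans(X, centroids, max_iter=100):
    labels = None
    for _ in range(max_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                              # G(t) == G(t-1): no change, stop
        labels = new_labels
        centroids = np.array([X[labels == k].mean(axis=0)
                              for k in range(len(centroids))])
    return labels, centroids

X = np.array([[1, 1], [2, 1], [4, 3], [5, 4]], dtype=float)    # A, B, C, D
labels, centroids = kmeans(X, np.array([[1., 1.], [2., 1.]]))  # start at A and B
print(labels)      # [0 0 1 1]     -> clusters {A, B} and {C, D}
print(centroids)   # [[1.5 1. ]
                   #  [4.5 3.5]]
```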
K-Means– Exercise
Two Different K-Means Clustering
Importance of Initial Centroids
Hierarchical Clustering
• Let us consider a gene measured in a set of 5 experiments:
A, B, C, D and E.
• The values measured in the 5 experiments are:

• A=100 B=200 C=500 D=900 E=1100

• We will construct the hierarchical clustering of these values using
  Euclidean distance, centroid linkage and an agglomerative approach.
Hierarchical Clustering
SOLUTION:
1. The closest two values are 100 and 200
▪ =>the centroid of these two values is 150.
2. Now we are clustering the values: 150, 500, 900, 1100
3. The closest two values are 900 and 1100
▪ =>the centroid of these two values is 1000.
4. The remaining values to be joined are: 150, 500, 1000.
5. The closest two values are 150 and 500
▪ =>the centroid of these two values is 325.

6. Finally, the two resulting subtrees (with centroids 325 and 1000)
   are joined at the root of the tree (see the SciPy check below).
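As a cross-check (not in the slides), SciPy's agglomerative linkage with centroid linkage reproduces this merge order; the variable names are my own:

```python
# Cross-check with SciPy: agglomerative clustering, centroid linkage, 1-D values
import numpy as np
from scipy.cluster.hierarchy import linkage

values = np.array([[100.], [200.], [500.], [900.], [1100.]])   # A, B, C, D, E
Z = linkage(values, method='centroid')     # Euclidean distance by default
print(Z)
# Each row is one merge: [cluster_i, cluster_j, distance, new_size].
# The merge order matches the solution above: {100, 200}, then {900, 1100},
# then {150, 500}, and finally the root joining the two subtrees.
```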
Hierarchical Clustering
• Agglomerative and divisive clustering on the data set {a, b, c, d, e}

  [Diagram: agglomerative merging runs left to right over steps 0–4
  (a and b form ab, d and e form de, c joins de to form cde, and ab
  and cde form abcde); divisive splitting runs right to left over the
  same tree, from step 4 back to step 0.]
Agglomerative Clustering
  [Diagram: documents d1–d5 are merged step by step: d1 and d2 form
  {d1, d2}, d4 and d5 form {d4, d5}, and d3 then joins to form
  {d3, d4, d5}.]
Agglomerative Clustering - Example
     X1    X2
A    1     1
B    1.5   1.5
C    5     5
D    3     4
E    4     4
F    3     3.5

Data matrix

Euclidean distance, e.g. dAB = ((1 − 1.5)² + (1 − 1.5)²)^(1/2) = 0.707

Dist   A     B     C     D     E     F
A      0.00  0.71  5.66  3.61  4.24  3.20
B      0.71  0.00  4.95  2.92  3.54  2.50
C      5.66  4.95  0.00  2.24  1.41  2.50
D      3.61  2.92  2.24  0.00  1.00  0.50
E      4.24  3.54  1.41  1.00  0.00  1.12
F      3.20  2.50  2.50  0.50  1.12  0.00
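This distance matrix can be reproduced with SciPy (illustrative sketch; names are my own):

```python
# Reproducing the 6 x 6 Euclidean distance matrix with SciPy (illustrative)
import numpy as np
from scipy.spatial.distance import pdist, squareform

points = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])  # A..F
D = squareform(pdist(points))      # pairwise Euclidean distances as a full matrix
print(np.round(D, 2))              # e.g. d(A, B) = 0.71, d(D, F) = 0.50
```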
Merge two closest clusters

Agglomerative Clustering - Example

• Find the two closest clusters in the distance matrix: D and F are
  the closest pair, at distance 0.50.
• Merge them into a single cluster (D, F).
Update Distance Matrix

Agglomerative Clustering - Example

• After merging D and F, the matrix loses one row and one column;
  the distances from the new cluster (D, F) to the remaining objects
  still have to be computed.

Dist   A     B     C     D,F   E
A      0.00  0.71  5.66  ?     4.24
B      0.71  0.00  4.95  ?     3.54
C      5.66  4.95  0.00  ?     1.41
D,F    ?     ?     ?     0.00  ?
E      4.24  3.54  1.41  ?     0.00
Update Distance Matrix

Agglomerative Clustering - Example


Min Distance – Single Linkage:

D(D,F)→A = min(dDA, dFA) = min(3.61, 3.20) = 3.20
D(D,F)→B = min(dDB, dFB) = min(2.92, 2.50) = 2.50
D(D,F)→C = min(dDC, dFC) = min(2.24, 2.50) = 2.24
D(D,F)→E = min(dDE, dFE) = min(1.00, 1.12) = 1.00

Dist   A     B     C     D,F   E
A      0.00  0.71  5.66  3.20  4.24
B      0.71  0.00  4.95  2.50  3.54
C      5.66  4.95  0.00  2.24  1.41
D,F    3.20  2.50  2.24  0.00  1.00
E      4.24  3.54  1.41  1.00  0.00
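A small sketch of this single-linkage update (illustrative; names are my own):

```python
# Single-linkage update after merging D and F: the distance from the new cluster
# to any other object is the minimum over its members (illustrative sketch)
import numpy as np
from scipy.spatial.distance import pdist, squareform

points = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])
names = list("ABCDEF")
D = squareform(pdist(points))

iD, iF = names.index("D"), names.index("F")
for other in ("A", "B", "C", "E"):
    j = names.index(other)
    print(other, round(min(D[iD, j], D[iF, j]), 2))
# A 3.2   B 2.5   C 2.24   E 1.0
```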
Merge two closest clusters

Agglomerative Clustering - Example


• The closest pair is now A and B (distance 0.71); merge them into
  cluster (A, B).

Dist   A,B   C     D,F   E
A,B    0.00  ?     ?     ?
C      ?     0.00  2.24  1.41
D,F    ?     2.24  0.00  1.00
E      ?     1.41  1.00  0.00
Update Distance Matrix

Agglomerative Clustering - Example


D(A,B)→C = min(dCA, dCB) = min(5.66, 4.95) = 4.95

D(A,B)→(D,F) = min(dDA, dDB, dFA, dFB)
             = min(3.61, 2.92, 3.20, 2.50) = 2.50

D(A,B)→E = min(dAE, dBE) = min(4.24, 3.54) = 3.54

Dist   A,B   C     D,F   E
A,B    0.00  4.95  2.50  3.54
C      4.95  0.00  2.24  1.41
D,F    2.50  2.24  0.00  1.00
E      3.54  1.41  1.00  0.00
Merge two closest clusters/Update Distance Matrix

Agglomerative Clustering - Example


• The closest pair is now (D, F) and E (distance 1.00); merge them
  into ((D, F), E) and update the remaining distances with single
  linkage.

Dist      (A,B)  C     (D,F),E
(A,B)     0.00   4.95  2.50
C         4.95   0.00  1.41
(D,F),E   2.50   1.41  0.00
Merge two closest clusters/Update Distance Matrix

Agglomerative Clustering - Example


• The closest pair is now ((D, F), E) and C (distance 1.41); merge
  them into (((D, F), E), C).

Dist          (A,B)  ((D,F),E),C
(A,B)         0.00   2.50
((D,F),E),C   2.50   0.00
Final Result

Agglomerative Clustering - Example

X1 X2
A 1 1
B 1.5 1.5
C 5 5
D 3 4
E 4 4
F 3 3.5

Data matrix

Dendrogram Representation

Agglomerative Clustering - Example


1. In the beginning we have 6 clusters: A, B, C, D, E and F.
2. We merge clusters D and F into (D, F) at distance 0.50.
3. We merge clusters A and B into (A, B) at distance 0.71.
4. We merge cluster E and (D, F) into ((D, F), E) at distance 1.00.
5. We merge clusters ((D, F), E) and C into (((D, F), E), C) at
   distance 1.41.
6. We merge clusters (((D, F), E), C) and (A, B) into
   ((((D, F), E), C), (A, B)) at distance 2.50.
7. The last cluster contains all the objects, which concludes the
   computation (a SciPy sketch reproducing this dendrogram follows).
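As noted in step 7, a SciPy sketch that reproduces this dendrogram (illustrative; matplotlib is assumed to be available for the plot):

```python
# Reproducing the dendrogram with SciPy single linkage (illustrative sketch)
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.hierarchy import linkage, dendrogram

points = np.array([[1, 1], [1.5, 1.5], [5, 5], [3, 4], [4, 4], [3, 3.5]])  # A..F
Z = linkage(points, method='single')   # merge heights: 0.50, 0.71, 1.00, 1.41, 2.50
dendrogram(Z, labels=list("ABCDEF"))
plt.ylabel("merge distance")
plt.show()
```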

Thank You
