
Cluster Analysis

SD
@SPJIMR Courage . Heart
Hierarchical Cluster
[Figure: points numbered 1 to 9 arranged into groups]
Give Examples

Agglomerative / Divisive



Non Hierarchical - Cluster (K=2)
[Figure: scatter plot of points 1 to 9 on axes X1 and X2, with markers a and b]



Non Hierarchical - Cluster (K=3)
[Figure: scatter plot of points 1 to 9 on axes X1 and X2, with markers a and b]
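The slides do not name a specific non-hierarchical algorithm. As an illustration only, the sketch below assumes a k-means style procedure: K starting centres are chosen, every point joins its nearest centre, and each centre then moves to the mean of its members. The points and starting centres are hypothetical, since the figures give no coordinates.

```python
def nearest(point, centres):
    """Index of the centre closest to the point (squared Euclidean distance)."""
    return min(range(len(centres)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centres[i])))

def k_means(points, start_centres, rounds=10):
    """Alternate between assigning points to centres and re-centring."""
    centres = list(start_centres)
    for _ in range(rounds):
        labels = [nearest(p, centres) for p in points]
        for i in range(len(centres)):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:  # keep the old centre if a cluster ends up empty
                centres[i] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return [nearest(p, centres) for p in points], centres

# Hypothetical (X1, X2) points; K=2 because two starting centres are supplied.
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
labels, centres = k_means(points, [(1, 1), (8, 8)])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Supplying three starting centres instead of two would give the K=3 case.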



Steps for clustering

Formulate the Problem: Pick variables

Select a distance measure between customers: Euclidean Distance, Manhattan Distance = city block (uses Modulus), Chebychev Distance

Select measure/procedure to generate clusters: Single, Complete, Ward’s Method

Decide on number of clusters & Name them

Interpret and profile clusters

Assess the reliability and validity
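The three distance measures named in the steps can be sketched as follows. This is a minimal illustration; the two customers and their scores are hypothetical.

```python
def euclidean(p, q):
    """Straight-line distance: square root of the summed squared differences."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan(p, q):
    """City-block distance: sum of the absolute (modulus) differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebychev(p, q):
    """Chebychev distance: the largest absolute difference on any one variable."""
    return max(abs(a - b) for a, b in zip(p, q))

# Hypothetical customers scored on two variables.
customer_1 = (1, 2)
customer_2 = (4, 6)

print(euclidean(customer_1, customer_2))  # 5.0
print(manhattan(customer_1, customer_2))  # 7
print(chebychev(customer_1, customer_2))  # 4
```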



Single vs Complete Linkage



Single Linkage: the most similar pair joins
Complete Linkage: the least of the most dissimilar pairs (Most dissimilar-1, Most dissimilar-2) joins
[Figure: two panels of points numbered 1 to 9, one per linkage rule]



Single Linkage – nearest neighbor
Distances across item pairs are given in next slide



Table 1 X: Agglomerative – start with separate groups - SL

     M    P    R    S
M         10   20   13
P              14   15
R                   6
S

Who forms a group?

Table 2

     M    P    SR
M         10   13?
P              14?
SR

Who forms a group?

Table 3

     MP   SR
MP        13 ?
SR
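The single-linkage steps in Tables 1 to 3 can be reproduced with a short sketch. The distances are the ones in Table 1; the function names are illustrative, not part of the slides.

```python
from itertools import combinations

# Pairwise distances from Table 1 (single-linkage example).
DIST = {
    ("M", "P"): 10, ("M", "R"): 20, ("M", "S"): 13,
    ("P", "R"): 14, ("P", "S"): 15,
    ("R", "S"): 6,
}

def item_distance(a, b):
    """Look up the distance between two items, in either order."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]

def single_linkage(group_a, group_b):
    """Nearest neighbour: smallest distance between any cross-group pair."""
    return min(item_distance(a, b) for a in group_a for b in group_b)

def agglomerate(items):
    """Repeatedly join the two closest groups and return the merge history."""
    groups = [(item,) for item in items]
    history = []
    while len(groups) > 1:
        g1, g2 = min(combinations(groups, 2), key=lambda pair: single_linkage(*pair))
        history.append((g1, g2, single_linkage(g1, g2)))
        groups = [g for g in groups if g not in (g1, g2)] + [g1 + g2]
    return history

for left, right, distance in agglomerate("MPRS"):
    print("".join(left), "+", "".join(right), "at", distance)
# R + S at 6
# M + P at 10
# RS + MP at 13
```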



Complete Linkage – farthest neighbor
Distances across item pairs are given in next slide



Table 1 X: Agglomerative - CL

     M    P    R    S
M         10   20   13
P              9    15
R                   6
S

Who forms a group?

M will start & R too will start.
First: S is pulled by R, since M and P lose out due to 6 < 10.
Second: M is then pulled by P, at scaled distance = 10.

Table 2 (Max of the Row)

     M    P    SR
M         10   20?
P              15?
SR

Who forms a group?

Table 3

     MP   SR
MP        20* ?
SR

* Last group (Min/Max)
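The "max of the row" update used for Table 2 and Table 3 can be checked with a few lines. This sketch uses the Table 1 distances of the complete-linkage example; the function name is illustrative.

```python
# Pairwise distances from Table 1 (complete-linkage example).
DIST = {
    frozenset("MP"): 10, frozenset("MR"): 20, frozenset("MS"): 13,
    frozenset("PR"): 9, frozenset("PS"): 15,
    frozenset("RS"): 6,
}

def complete_linkage(group_a, group_b):
    """Farthest neighbour: largest distance between any cross-group pair."""
    return max(DIST[frozenset((a, b))] for a in group_a for b in group_b)

# Table 2: after R and S join, each SR entry is the max of the two old entries.
print(complete_linkage("M", "SR"))   # max(13, 20) = 20
print(complete_linkage("P", "SR"))   # max(15, 9) = 15
print(complete_linkage("M", "P"))    # 10, the smallest entry, so M and P join

# Table 3: the last remaining distance, between MP and SR.
print(complete_linkage("MP", "SR"))  # max(13, 20, 15, 9) = 20
```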


Complete Linkage
Another example



Table 1 X: Agglomerative - CL

     M    P    R    S    T
M         9    20   13   11
P              10   15   16
R                   6    7
S                        11
T

Who forms a group?

M will start & R too will start.
Why did we choose the maximum? Because we are measuring how far the elements of each group are from other groups, i.e. how heterogeneous the groups are.

Table 2

     MP   RS   T
MP        20   16
RS             11
T

Who forms a group?

Table 3

     MP   RST
MP        20 ?
RST
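The full complete-linkage sequence for the five-item example can be traced with the sketch below. It uses the Table 1 distances of this example; the helper names are illustrative.

```python
from itertools import combinations

# Pairwise distances from Table 1 of the five-item example.
DIST = {
    frozenset("MP"): 9, frozenset("MR"): 20, frozenset("MS"): 13, frozenset("MT"): 11,
    frozenset("PR"): 10, frozenset("PS"): 15, frozenset("PT"): 16,
    frozenset("RS"): 6, frozenset("RT"): 7,
    frozenset("ST"): 11,
}

def complete_linkage(group_a, group_b):
    """Farthest neighbour: largest distance between any cross-group pair."""
    return max(DIST[frozenset((a, b))] for a in group_a for b in group_b)

def agglomerate(items, linkage):
    """Join the two closest groups (under the given linkage) until one remains."""
    groups = list(items)  # each group is a string of item labels
    history = []
    while len(groups) > 1:
        g1, g2 = min(combinations(groups, 2), key=lambda pair: linkage(*pair))
        history.append((g1, g2, linkage(g1, g2)))
        groups = [g for g in groups if g not in (g1, g2)] + [g1 + g2]
    return history

for left, right, distance in agglomerate("MPRST", complete_linkage):
    print(left, "+", right, "at", distance)
# R + S at 6
# M + P at 9
# T + RS at 11
# MP + TRS at 20
```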



Problems of Single Linkage

Even the “farthest” of friends may become “close” members in the same group
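A small sketch of this chaining effect, under assumed data: five hypothetical friends stand in a row, each 2 units from the next. Under single linkage every newcomer joins at distance 2, yet the two ends of the group finish 8 units apart.

```python
# Hypothetical positions of five friends along a line (illustrative data).
positions = {"A": 0, "B": 2, "C": 4, "D": 6, "E": 8}

def single_linkage(group_a, group_b):
    """Nearest neighbour: smallest distance between any cross-group pair."""
    return min(abs(positions[a] - positions[b]) for a in group_a for b in group_b)

def diameter(group):
    """Largest distance between any two members of the same group."""
    return max(abs(positions[a] - positions[b]) for a in group for b in group)

group = ["A"]
for newcomer in "BCDE":
    join_distance = single_linkage(group, [newcomer])
    group.append(newcomer)
    print(newcomer, "joins at", join_distance, "| farthest pair is now", diameter(group), "apart")
# B joins at 2 | farthest pair is now 2 apart
# C joins at 2 | farthest pair is now 4 apart
# D joins at 2 | farthest pair is now 6 apart
# E joins at 2 | farthest pair is now 8 apart
```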



HIERARCHICAL CLUSTERING



Ward’s Method : Using Total Sum of Squared Distances (TSSD) – Div B
Variance : (like Anova), Within Sum of Squares of Distance - Minimize

[Figure: points A, B, C (green) and D, E, F (pink) on axes X – meetha (sweet) and Y – khatta (tangy), each marked at 0, 5 and 10. Names shown: Ananya, Shashank, Nupur (N). Where will she go?]

Pink without N: TSSD go pink = DE^2 + EF^2 + FD^2
Pink with N: TSSD go pink = DE^2 + EF^2 + FD^2 + ND^2 + NE^2 + NF^2
Green without N: TSSD go green = AB^2 + BC^2 + CA^2
Green with N: TSSD go green = AB^2 + BC^2 + CA^2 + NA^2 + NB^2 + NC^2

Conclusion:
If TSSD go pink <= TSSD go green, join pink; otherwise join green.

Note: Centroid in Non-Hier Cluster / Ward’s SSD in Hier Cluster



Note:
Centroid in Non-Hierarchical Cluster / Ward’s SSD in Hierarchical Cluster



Ward’s Method : Using Total Sum of Squared Distances (TSSD) - Div A
Variance : Within Sum of Squares of Distance (like Anova) - Minimize

[Figure: points A, B, C (green) and D, E, F (pink) on axes X – meetha (sweet) and Y – khatta (tangy), each marked at 0, 5 and 10. Names shown: Priya, Souptik, Jayant (J). Where will he go?]

Pink without J: TSSD go pink = DE^2 + EF^2 + FD^2
Pink with J: TSSD go pink = DE^2 + EF^2 + FD^2 + JD^2 + JE^2 + JF^2
Green without J: TSSD go green = AB^2 + BC^2 + CA^2
Green with J: TSSD go green = AB^2 + BC^2 + CA^2 + JA^2 + JB^2 + JC^2

Conclusion:
If TSSD go pink <= TSSD go green, join pink; otherwise join green.

Note: Centroid in Non-Hier Cluster / Ward’s SSD in Hier Cluster
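The TSSD comparison can be sketched in a few lines. The slides give no coordinates, so every point below is hypothetical. The sketch also assumes one reading of the rule: "TSSD go pink" is the total over both clusters when the newcomer (N or J in the slides) joins pink, and "TSSD go green" is the total when the newcomer joins green.

```python
from itertools import combinations

def tssd(points):
    """Total of squared distances over every pair of points in one cluster."""
    return sum((x1 - x2) ** 2 + (y1 - y2) ** 2
               for (x1, y1), (x2, y2) in combinations(points, 2))

# Hypothetical (sweet, tangy) coordinates; not taken from the slides.
green = [(1, 1), (2, 1), (1, 2)]   # A, B, C
pink = [(8, 8), (9, 8), (8, 9)]    # D, E, F
newcomer = (3, 3)                  # the new point, N or J

# Total within-cluster TSSD under each possible assignment of the newcomer.
total_if_pink = tssd(pink + [newcomer]) + tssd(green)
total_if_green = tssd(green + [newcomer]) + tssd(pink)

choice = "pink" if total_if_pink <= total_if_green else "green"
print(total_if_pink, total_if_green, choice)  # 180 26 green
```

Because both totals share the unchanged terms, comparing them is the same as comparing only the added terms (ND^2 + NE^2 + NF^2 against NA^2 + NB^2 + NC^2).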



AVOID Centroid rule in Hierarchical Clusters

Suppose you take centroid, what can happen?

[Figure: points A to F with two cluster centroids, Green-C and Pink-C, and marked distances 6.8 and 7.1]

In hierarchical you preferably DO NOT take Centroid to form clusters



Oxford Happiness Questionnaire
29 variables



Cluster Assessment
- Hierarchical
Dendrogram
Vertical red bar
