
CHAPTER-18

CLUSTER ANALYSIS

DR DEEPAK CHAWLA
DR NEENA SONDHI

RESEARCH CONCEPTS AND


SLIDE 18-1

What is Cluster analysis?


Cluster analysis is a technique for grouping objects, cases or entities on the basis of multiple variables. The advantage of the technique is that it is applicable to both metric and non-metric data.

Secondly, the grouping can be done post hoc, i.e. after the primary data survey is over. The technique has wide applications in all branches of management. However, it is most often used for market segmentation analysis.
SLIDE 18-2

Cluster analysis- basic tenets


Can be used to cluster objects, individuals and entities

Similarity is based on multiple variables

Measures proximity between objects on the study variables

Objects grouped in one cluster are homogeneous as compared to those in other clusters

Can be conducted on metric, non-metric as well as mixed data
SLIDE 18-3

Usage of cluster analysis


Market segmentation – customers/potential customers can be split into smaller, more homogeneous groups by using the method.

Segmenting industries – the same grouping principle can be applied to industrial consumers.

Segmenting markets – cities or regions with similar or common traits can be grouped on the basis of climatic or socio-economic conditions.


SLIDE 18-4

Usage of cluster analysis


Career planning and training analysis – for human resource planning, people can be grouped into clusters on the basis of their education/experience or aptitude and aspirations.

Segmenting financial sector/instruments – factors such as raw material cost, financial allocations and seasonality are used to group sectors together to understand the growth and performance of a group of industries.


SLIDE 18-5

Statistics associated with cluster analysis


Metric data analysis

\[ d_{ij} = \sum_{k=1}^{3} \left( X_{ik} - X_{jk} \right)^{2} \]

Where,
d_{ij} = distance between persons i and j
X_{ik}, X_{jk} = values of objects i and j on variable k
k = variable (interval / ratio)
i = object
j = object


SLIDE 18-6
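
A minimal sketch of a distance computation for metric data, assuming the squared Euclidean form (the sum of squared differences across the clustering variables). The function name and the two respondents' scores are hypothetical and only illustrate the arithmetic.

```python
def squared_euclidean(x_i, x_j):
    """Sum of squared differences between two objects across all variables."""
    if len(x_i) != len(x_j):
        raise ValueError("both objects need a score on every variable")
    return sum((a - b) ** 2 for a, b in zip(x_i, x_j))


# Hypothetical scores of two respondents on three interval-scaled variables.
person_i = [4, 2, 5]
person_j = [1, 2, 3]

# (4 - 1)^2 + (2 - 2)^2 + (5 - 3)^2
d_ij = squared_euclidean(person_i, person_j)
print(d_ij)
```

A smaller value means the two respondents are more alike on the chosen variables; identical profiles give a distance of zero.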

Statistics associated with cluster analysis

Non-metric data

\[ \text{Simple matching coefficient} = \frac{P + N}{P + N + M} \]

\[ \text{Jaccard coefficient} = \frac{P}{P + M} \]

Where
P = positive matches
N = negative matches
M = mismatches


SLIDE 18-8
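
A small sketch of the two association measures for binary (yes/no) data, assuming the usual definitions: simple matching = (P + N) / (P + N + M) and Jaccard = P / (P + M). The function names and the two answer profiles are hypothetical.

```python
def match_counts(a, b):
    """Return (P, N, M): positive matches, negative matches and mismatches."""
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    p = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    n = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    m = len(a) - p - n
    return p, n, m


def simple_matching(a, b):
    """Share of variables on which the two objects agree (1-1 or 0-0)."""
    p, n, m = match_counts(a, b)
    return (p + n) / (p + n + m)


def jaccard(a, b):
    """Like simple matching, but joint absences (0-0) are ignored."""
    p, _, m = match_counts(a, b)
    return p / (p + m) if (p + m) else 0.0


# Hypothetical yes/no answers of two respondents on six items.
resp_a = [1, 0, 1, 1, 0, 0]
resp_b = [1, 0, 0, 1, 0, 1]
print(match_counts(resp_a, resp_b), simple_matching(resp_a, resp_b), jaccard(resp_a, resp_b))
```

The Jaccard coefficient is the stricter of the two here because the shared "no" answers do not count as evidence of similarity.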

Key concepts in cluster analysis


Agglomeration schedule: The output of a hierarchical method that provides information on the objects being combined, starting with the most similar pair and then, at each later stage, on the object or cluster that joins it.

ANOVA table: The univariate or one-way ANOVA statistics for each clustering variable. The higher the F value, the greater the difference between the clusters on that variable.

Cluster variate: The variables or parameters representing the objects to be clustered and used to calculate the similarity between objects.

Cluster centroid: The average values of the objects on all the variables in the cluster variate.


SLIDE 18-9
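
To make the idea of an agglomeration schedule concrete, here is a rough sketch that merges the closest pair of clusters at each stage and records what was joined and at what distance. It assumes single linkage and squared Euclidean distance; both choices, the function names and the five sample points are illustrative assumptions, not the procedure of any particular package.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def single_linkage_schedule(points):
    """Return a list of (cluster_a, cluster_b, distance) merges, in order."""
    clusters = {i: [i] for i in range(len(points))}  # every object starts alone
    schedule = []
    while len(clusters) > 1:
        ids = sorted(clusters)
        best = None
        for pos, a in enumerate(ids):
            for b in ids[pos + 1:]:
                # Single linkage: distance between the closest members.
                d = min(sq_dist(points[p], points[q])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        clusters[a].extend(clusters.pop(b))  # cluster b joins cluster a
        schedule.append((a, b, d))
    return schedule


# Hypothetical objects measured on two variables.
points = [(0, 0), (0, 1), (5, 5), (5, 7), (10, 0)]
for stage, (a, b, d) in enumerate(single_linkage_schedule(points), start=1):
    print(stage, a, b, d)
```

Each row of the printed schedule corresponds to one stage; a sharp jump in the merge distance between consecutive stages is a common hint about where to stop merging.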

Key concepts in cluster analysis


Cluster seeds: The initial cluster centres in non-hierarchical clustering, i.e. the points from which one starts. The clusters are then created around these seeds.

Cluster membership: This indicates the address or the cluster to which a particular person/object belongs.

Dendrogram: This is a tree-like diagram that is used to graphically present the cluster results. The vertical axis represents the objects and the horizontal axis represents the inter-respondent distance. The figure is to be read from left to right.

Distances between final cluster centres: These are the distances between the individual pairs of clusters. A robust solution that demarcates the groups distinctly is one where the inter-cluster distance is large; the larger the distance, the more distinct the clusters.


SLIDE 18-10

Key concepts in cluster analysis


Entropy group: The individuals or small groups that do not seem to fit into any cluster.

Final cluster centres: The mean value of the cluster on each of the variables that are part of the cluster variate.

Hierarchical methods: A step-wise process that starts with the most similar pair and builds a tree-like structure composed of separate clusters.

Non-hierarchical methods: Cluster seeds or centres are the starting points, and individual clusters are built around them based on some pre-specified distance from the seeds.


SLIDE 18-11
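
The non-hierarchical idea of building clusters around seeds can be sketched as one assign-and-update pass: each object goes to its nearest seed, and each centre is then recomputed as the mean of its members. This is a simplified, k-means-style illustration under assumed squared Euclidean distance; the seeds, points and function names are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two objects."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_seeds(points, seeds):
    """Cluster membership: index of the nearest seed for every object."""
    return [min(range(len(seeds)), key=lambda c: sq_dist(p, seeds[c]))
            for p in points]


def update_centres(points, membership, seeds):
    """Recompute each centre as the mean of its members on every variable."""
    centres = []
    for c in range(len(seeds)):
        members = [p for p, m in zip(points, membership) if m == c]
        if not members:
            centres.append(list(seeds[c]))  # an empty cluster keeps its seed
        else:
            centres.append([sum(col) / len(members) for col in zip(*members)])
    return centres


# Hypothetical objects on two variables and two starting seeds.
points = [(1, 1), (1, 2), (5, 5), (6, 5)]
seeds = [(0, 0), (6, 6)]

membership = assign_to_seeds(points, seeds)
centres = update_centres(points, membership, seeds)
print(membership, centres)
```

In practice the two steps are repeated until the membership stops changing; the centres at that point play the role of the final cluster centres.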

Key concepts in cluster analysis


Proximity matrix: A data matrix that consists of pair-wise distances/similarities between the objects. It is an N × N matrix, where N is the number of objects being clustered.

Summary: The number of cases in each cluster, as reported in the non-hierarchical clustering method.

Vertical icicle diagram: Quite similar to the dendrogram, it is a graphical method to demonstrate the composition of the clusters. The objects are individually displayed at the top. At any given stage, the columns correspond to the objects being clustered and the rows correspond to the number of clusters. An icicle diagram is read from bottom to top.


SLIDE 18-12
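
As an illustration of the N × N proximity matrix, the sketch below fills a symmetric matrix of pair-wise distances for three hypothetical objects. Squared Euclidean distance is assumed here; any other distance or similarity measure could be substituted.

```python
def proximity_matrix(objects):
    """Build the symmetric N x N matrix of pair-wise squared Euclidean distances."""
    n = len(objects)
    matrix = [[0.0] * n for _ in range(n)]  # diagonal stays 0: an object vs. itself
    for i in range(n):
        for j in range(i + 1, n):
            d = float(sum((a - b) ** 2 for a, b in zip(objects[i], objects[j])))
            matrix[i][j] = matrix[j][i] = d  # distance is symmetric
    return matrix


# Hypothetical objects measured on two variables.
objects = [(1, 2), (2, 4), (5, 1)]
prox = proximity_matrix(objects)
for row in prox:
    print(row)
```

Because the matrix is symmetric with a zero diagonal, only N(N - 1)/2 distinct distances actually need to be computed.
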
Cluster analysis process
Stage 1  RESEARCH OBJECTIVES
         Exploratory versus confirmatory objectives
         Select variables used to cluster objects

Stage 2  CLUSTER ASSUMPTIONS
         Are the cluster variables metric or nonmetric?
         Metric data: distance measures of similarity (squared Euclidean distance)
         Nonmetric data: association measures of similarity (matching coefficients)

Stage 3  CLUSTERING ALGORITHM
         Is a hierarchical, nonhierarchical, or combination of the two methods used?
         HIERARCHICAL METHODS: Single Linkage, Complete Linkage, Average Linkage, Ward's Method, Centroid Method
         NONHIERARCHICAL METHODS: Sequential Threshold, Parallel Threshold, Optimization
         TWO STEP CLUSTER
         COMBINATION: Use a hierarchical method to specify cluster seeds for a nonhierarchical method

Stage 4  NUMBER OF CLUSTERS
         Hierarchical methods
         Examine dendrogram
         Cluster membership
         Conceptual consideration

Stage 5  INTERPRETING THE CLUSTERS
         Examine cluster variables
         Name clusters

Stage 6  VALIDATING AND PROFILING THE CLUSTERS
         Validation
         Profiling


SLIDE 18-13

Illustration : Nano study


[Figure: Dendrogram for the Nano study. Horizontal axis: Inter respondent Distance Cluster Combine, scaled 0 to 25. Vertical axis: cases, listed from top to bottom as 18, 25, 7, 13, 11, 21, 6, 3, 8, 5, 10, 17, 22, 15, 2, 16, 20, 12, 19, 14, 9, 24, 1, 23, 4.]


SLIDE 18-14

Illustration: Nano study


ANOVA

Statement                                                                           F         Sig.
I think in India we have been able to achieve technological standard of high order  39.036    .000
I prefer to buy things made in India                                                44.896    .000
I usually buy things which provide value for money                                  53.716    .000
Convenience is more important than style                                            65.008    .000
I do not like wasteful expenditure                                                  92.103    .000
When it comes to safety I believe there should be no compromises.                   50.579    .000
I'm a "saver" rather than a "spender."                                              23.468    .000
I like to try new and different things.                                             164.223   .000
I always want to be a part of changing world                                        96.749    .000


SLIDE 18-15
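
For readers who want to see where an F value of this kind comes from, here is a minimal sketch of the one-way ANOVA F ratio for a single clustering variable: the between-cluster mean square divided by the within-cluster mean square. The ratings below are hypothetical and are not the Nano survey data.

```python
def one_way_f(groups):
    """F ratio for one variable, given the scores of each cluster as a list."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the cluster means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of the individual scores around their own cluster mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)  # zero if clusters have no spread
    return ms_between / ms_within


# Hypothetical ratings on one statement, split by three clusters.
ratings_by_cluster = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
f_value = one_way_f(ratings_by_cluster)
print(f_value)
```

A large F indicates that the cluster means differ a lot relative to the spread inside each cluster, which is why such variables discriminate well between clusters.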

Illustration : Nano study


Cluster centroids for Nano sample survey

                                                                                    Cluster
Statement                                                                           1      2      3
I think in India we have been able to achieve technological standard of high order  2.17   2.00   4.40
I prefer to buy things made in India                                                1.67   2.22   4.70
I usually buy things which provide value for money                                  4.67   1.44   2.70
Convenience is more important than style                                            4.67   1.78   2.10
I do not like wasteful expenditure                                                  4.33   1.00   2.80
When it comes to safety I believe there should be no compromises.                   4.67   1.22   2.60
I'm a "saver" rather than a "spender."                                              4.17   1.00   2.60
I like to try new and different things.                                             1.50   4.78   1.20
I always want to be a part of changing world                                        1.33   4.33   1.40


SLIDE 18-16
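
The centroid values of the Nano survey can be used to illustrate distances between final cluster centres. The sketch below copies the nine centroid scores for each cluster and computes the plain Euclidean distance for every pair; the choice of Euclidean distance and the variable names are assumptions made for illustration.

```python
import math

# Centroid scores for the nine statements, in the order they are listed.
centroids = {
    1: [2.17, 1.67, 4.67, 4.67, 4.33, 4.67, 4.17, 1.50, 1.33],
    2: [2.00, 2.22, 1.44, 1.78, 1.00, 1.22, 1.00, 4.78, 4.33],
    3: [4.40, 4.70, 2.70, 2.10, 2.80, 2.60, 2.60, 1.20, 1.40],
}


def euclidean(a, b):
    """Euclidean distance between two cluster centres."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# One distance per pair of clusters: (1, 2), (1, 3) and (2, 3).
centre_distances = {
    (a, b): euclidean(centroids[a], centroids[b])
    for a in centroids for b in centroids if a < b
}

for pair, dist in sorted(centre_distances.items()):
    print(pair, round(dist, 2))
```

On these centroids, clusters 1 and 2 come out farthest apart and clusters 1 and 3 closest, which matches the reading that the larger the distance, the more distinct the clusters.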

Illustration: Nano study


Cluster summary: Nano sample survey

Cluster 1 (cautious consumer)      6.000
Cluster 2 (innovative consumer)    9.000
Cluster 3 (patriotic consumer)    10.000
Valid                             25.000
Missing                             .000


SLIDE 18-17

Validating the Cluster Solution


Use two-step clustering to measure the stability of the obtained solution.

Split the data in half, conduct clustering on each half and compare the cluster centroids.

Use subjective judgment to evaluate both the group formation and the clusters' potential for managerial decision-making.


END OF CHAPTER
