
CLUSTER ANALYSIS

PREPARED BY:
DR. POONAM KHURANA



Cluster analysis: Basic applications
Can be used to cluster objects, individuals and entities

Similarity is based on multiple variables

Measures proximity between objects on the study variables

Objects grouped in one cluster are homogeneous relative to
objects in other clusters

Can be conducted on metric, non-metric as well as mixed
data



What is Cluster analysis?
Cluster analysis is a technique for grouping objects, cases, or
entities on the basis of multiple variables. The advantage
of the technique is that it is applicable to both metric and
non-metric data.
Secondly, the grouping can be done post hoc, i.e. after the
primary data survey is over. The technique has wide
applications in all branches of management. However, it
is most often used for market segmentation analysis.

Usage of cluster analysis
Market segmentation: customers or potential customers can be
split into smaller, more homogeneous groups by using the
method.
Segmenting industries: the same grouping principle can be
applied to industrial consumers.
Segmenting markets: cities or regions with similar or common
traits can be grouped on the basis of climatic or socio-economic
conditions.
Usage of cluster analysis

Career planning and training analysis: for human resource
planning, people can be grouped into clusters on the basis of their
education, experience, aptitude and aspirations.


Segmenting financial sectors/instruments: factors such as
raw material cost, financial allocations and seasonality
are used to group sectors together in order to understand
the growth and performance of a group of industries.
Key concepts in cluster analysis

Agglomeration schedule: A schedule produced by a hierarchical method. It starts with
the most similar pair of objects and, at each later stage, shows which object or cluster
is joined next.


ANOVA table: The univariate or one-way ANOVA statistics for each clustering
variable. The higher the ANOVA (F) value, the greater the difference between the
clusters on that variable.

Cluster variate: The variables or parameters representing the objects to be clustered
and used to calculate the similarity between objects.

Cluster centroid: The average values of the objects on all the variables in the cluster
variate.
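
As a rough illustration of how an agglomeration schedule is built, the sketch below merges the closest pair of clusters at each stage and records the merge. It assumes Euclidean distance and single linkage (nearest neighbour), which are choices made here for simplicity and are not prescribed by the slides. The function name `agglomeration_schedule` and the data are hypothetical.

```python
from itertools import combinations
from math import dist

def agglomeration_schedule(points):
    """Return a list of (stage, cluster_a, cluster_b, distance) merges.

    Assumption: single linkage with Euclidean distance.
    """
    # Each cluster starts as a single object, labelled by its index.
    clusters = {i: [i] for i in range(len(points))}
    schedule = []
    stage = 1
    while len(clusters) > 1:
        # Find the pair of clusters with the smallest nearest-member distance.
        best = None
        for a, b in combinations(sorted(clusters), 2):
            d = min(dist(points[i], points[j])
                    for i in clusters[a] for j in clusters[b])
            if best is None or d < best[2]:
                best = (a, b, d)
        a, b, d = best
        # Merge b into a; the merged cluster keeps the lower label.
        clusters[a].extend(clusters.pop(b))
        schedule.append((stage, a, b, round(d, 3)))
        stage += 1
    return schedule

# Hypothetical data: five respondents measured on two variables.
points = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 6.0), (9.0, 9.0)]
schedule = agglomeration_schedule(points)
for row in schedule:
    print(row)
```

The first row of the schedule is the most similar pair; a large jump in the distance column between two consecutive stages is often read as a hint about where to stop merging.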

Key concepts in cluster analysis
Cluster seeds: The initial cluster centres in non-hierarchical clustering,
i.e. the points from which one starts. The clusters are then
created around these seeds.

Cluster membership: This indicates the address or the cluster to which
a particular person/object belongs.

Dendrogram: This is a tree-like diagram that is used to graphically
present the cluster results. The vertical axis represents the objects and the
horizontal represents the inter-respondent distance. The figure is to be read
from left to right.

Distances between final cluster centres: These are the distances
between the individual pairs of clusters. A robust solution that
demarcates the groups distinctly is one where the inter-cluster distance
is large; the larger the distance, the more distinct the clusters.
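
The distances between final cluster centres can be worked out by hand once cluster membership is known. The sketch below assumes Euclidean distance and uses hypothetical membership data; the helper name `centroid` is illustrative only.

```python
from itertools import combinations
from math import dist

def centroid(members):
    """Mean of each variable across the objects in one cluster."""
    n = len(members)
    return tuple(sum(values) / n for values in zip(*members))

# Hypothetical cluster membership: cluster label -> member objects.
clusters = {
    1: [(1.0, 1.0), (1.5, 1.0), (1.0, 2.0)],
    2: [(5.0, 5.0), (5.0, 6.0)],
    3: [(9.0, 9.0), (10.0, 9.0)],
}
centres = {label: centroid(members) for label, members in clusters.items()}

# Pairwise distance between every pair of cluster centres.
centre_distances = {
    (a, b): dist(centres[a], centres[b]) for a, b in combinations(centres, 2)
}
for pair, d in centre_distances.items():
    print(pair, round(d, 2))
```

Pairs of clusters with a small centre-to-centre distance are the ones most likely to overlap and are worth inspecting first.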

Key concepts in cluster analysis
Entropy group: The individuals or small groups that do
not seem to fit into any cluster.

Final cluster centres: The mean value of the cluster on
each of the variables that is a part of the cluster variate.

Hierarchical methods: A step-wise process that starts
with the most similar pair and formulates a tree-like structure
composed of separate clusters.

Non-hierarchical methods: Cluster seeds or centres are
the starting points, and one builds individual clusters around
them based on some pre-specified distance from the seeds.
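
A minimal sketch of a non-hierarchical procedure is given below. It uses k-means, a common non-hierarchical algorithm in which each object joins its nearest centre; the slides do not name a specific algorithm, so this choice, the Euclidean distance, the function name `k_means` and the data are all assumptions made for illustration.

```python
from math import dist

def k_means(points, seeds, max_iter=100):
    """Assign objects to the nearest centre, then recompute centres, until stable."""
    centres = [tuple(s) for s in seeds]
    membership = None
    for _ in range(max_iter):
        # Assignment step: each object joins the cluster with the nearest centre.
        new_membership = [
            min(range(len(centres)), key=lambda k: dist(p, centres[k]))
            for p in points
        ]
        if new_membership == membership:
            break  # no object changed cluster, so the solution is stable
        membership = new_membership
        # Update step: move each centre to the mean of its members.
        for k in range(len(centres)):
            members = [p for p, m in zip(points, membership) if m == k]
            if members:  # keep the old centre if a cluster is empty
                centres[k] = tuple(sum(v) / len(members) for v in zip(*members))
    return membership, centres

# Hypothetical data and seeds (three objects picked as initial centres).
points = [(1.0, 1.0), (1.5, 1.0), (1.0, 2.0), (5.0, 5.0),
          (5.0, 6.0), (9.0, 9.0), (10.0, 9.0)]
seeds = [points[0], points[3], points[5]]
membership, final_centres = k_means(points, seeds)

# Number of cases in each cluster, as reported in a typical summary.
sizes = [membership.count(k) for k in range(len(seeds))]
print(membership, final_centres, sizes)
```

The returned values correspond to the cluster membership, the final cluster centres and the number of cases per cluster described in these slides.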

Key concepts in cluster analysis
Proximity matrix: A data matrix that consists of pairwise
distances/similarities between the objects. It is an N
x N matrix, where N is the number of objects being
clustered.

Summary: In non-hierarchical clustering output, the summary
indicates the number of cases in each cluster.

Vertical icicle diagram: Quite similar to the
dendrogram, it is a graphical method to demonstrate the
composition of the clusters. The objects are individually
displayed at the top. At any given stage the columns
correspond to the objects being clustered, and the rows
correspond to the number of clusters. An icicle diagram
is read from bottom to top.
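
The proximity matrix defined above can be sketched in a few lines. The example assumes metric data and Euclidean distance; for non-metric or mixed data a different proximity measure would be needed. The function name `proximity_matrix` and the three objects are hypothetical.

```python
from math import dist

def proximity_matrix(objects):
    """N x N matrix of pairwise Euclidean distances between objects."""
    return [[dist(a, b) for b in objects] for a in objects]

# Hypothetical data: three objects measured on two variables.
objects = [(1.0, 1.0), (4.0, 5.0), (1.0, 5.0)]
matrix = proximity_matrix(objects)
for row in matrix:
    print([round(d, 2) for d in row])
```

The matrix is symmetric with zeros on the diagonal, so only the N(N-1)/2 entries above the diagonal carry information.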


Validating the cluster solution
Use two-step clustering to measure the stability of
the obtained solution.

Split the data in half and conduct clustering on each
and check cluster centroids.

Use subjective judgment to evaluate both the group
formation and the clusters' potential for
managerial decisions.
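
The split-half check can be sketched as follows: cluster each half separately and compare the resulting centroids. The sketch assumes k-means with Euclidean distance and the same illustrative seeds for both halves, so that cluster k in one half can be compared with cluster k in the other; the data, seeds and function name are hypothetical.

```python
from math import dist

def k_means(points, seeds, max_iter=100):
    """Return final centres after nearest-centre assignment and mean updates."""
    centres = [tuple(s) for s in seeds]
    for _ in range(max_iter):
        groups = [[] for _ in centres]
        for p in points:
            nearest = min(range(len(centres)), key=lambda k: dist(p, centres[k]))
            groups[nearest].append(p)
        # Recompute each centre; keep the old one if its group is empty.
        new_centres = [
            tuple(sum(v) / len(g) for v in zip(*g)) if g else c
            for g, c in zip(groups, centres)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres

# Hypothetical data with two visible groups.
data = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (1.0, 1.4),
        (6.0, 6.0), (6.2, 5.8), (5.8, 6.2), (6.0, 6.4)]

# Split into halves by alternating rows so each half sees both groups.
half_a, half_b = data[0::2], data[1::2]
seeds = [(0.0, 0.0), (7.0, 7.0)]  # illustrative seeds, shared by both halves

centres_a = k_means(half_a, seeds)
centres_b = k_means(half_b, seeds)

# Distance between matching centroids from the two halves.
gaps = [dist(a, b) for a, b in zip(centres_a, centres_b)]
print(centres_a, centres_b, [round(g, 2) for g in gaps])
```

Small gaps relative to the distance between the clusters themselves suggest a stable solution. In practice the split would usually be random instead of alternating.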
