
Multivariate Data Analysis

Chapter 9 - Cluster Analysis

Section 3: Interdependence Techniques


Chapter 9
 What Is Cluster Analysis (Q analysis)?
 Define groups of homogeneous objects (i.e., individuals,
firms, products, or behaviors)
 Maximize the homogeneity of objects within the clusters
while also maximizing the heterogeneity between clusters
 Segmentation and target marketing
 Compare with Factor Analysis
 How Does Cluster Analysis Work?
 Measuring Similarity (Euclidean distance)
 Forming Clusters (hierarchical procedure, e.g., the
agglomerative method)
 Determining the Number of Clusters in the Final
Solution (entropy group)
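
The first two steps above (measuring similarity, then forming clusters) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the four two-variable objects are made-up data, and `closest_pair` is a hypothetical helper that finds the pair an agglomerative procedure would merge first.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_pair(objects):
    """Return (distance, i, j) for the two most similar objects."""
    best = None
    for i in range(len(objects)):
        for j in range(i + 1, len(objects)):
            d = euclidean(objects[i], objects[j])
            if best is None or d < best[0]:
                best = (d, i, j)
    return best

# Hypothetical data: four objects measured on two variables.
objects = [(1.0, 1.0), (1.5, 1.0), (5.0, 5.0), (5.0, 8.0)]

# The smallest distance identifies the first pair to be merged.
dist, i, j = closest_pair(objects)
print(f"merge objects {i} and {j} first (distance {dist})")
```

Smaller distances mean greater similarity, so the pair returned here is the natural first merge in an agglomerative procedure.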
Cluster Analysis Decision Process

 Stage One: Objectives of Cluster Analysis


 Taxonomy description
 Data simplification
 Relationship identification
 Selection of Clustering Variables
 Characterize the objects being clustered
 Relate specifically to the objectives of the cluster
analysis
Cluster Analysis Decision Process (Cont.)
 Stage 2: Research Design in Cluster Analysis
 Detecting Outliers
 Similarity Measures (Interobject similarity)
 Correlational Measures
 Distance Measures
 Comparison to Correlational Measures
 Types of Distance Measures (Euclidean distance)
 Impact of Unstandardized Data Values (Mahalanobis Distance, D²)
 Association Measures
 Standardizing the Data
 Standardizing By Variables (normalized distance
function)
 Standardizing By Observation (within-case or
row-centering standardization)
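
To illustrate why unstandardized values matter, the sketch below compares Euclidean distances before and after standardizing by variable, and also shows row-centering by observation. The data (income in dollars and a 1-10 rating) and the function names are assumptions made up for this example. The Mahalanobis distance mentioned above is not implemented here.

```python
import math
import statistics

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def standardize_by_variable(rows):
    """Convert each column to z-scores (mean 0, sample std. dev. 1)."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, means, sds))
            for row in rows]

def center_by_observation(rows):
    """Row-center: subtract each observation's own mean from its scores."""
    return [tuple(x - statistics.mean(row) for x in row) for row in rows]

# Hypothetical data: (income in dollars, satisfaction rating 1-10).
rows = [(30000.0, 2.0), (30500.0, 9.0), (60000.0, 2.0)]

# Raw distances are dominated by the variable with the larger scale.
d_raw_01 = euclidean(rows[0], rows[1])
d_raw_02 = euclidean(rows[0], rows[2])

# After z-scoring, both variables contribute on a comparable footing.
z = standardize_by_variable(rows)
d_std_01 = euclidean(z[0], z[1])
d_std_02 = euclidean(z[0], z[2])

# Row-centering suits variables that share one response scale.
centered = center_by_observation([(7.0, 8.0, 9.0)])
```

In this made-up data, object 0 looks far closer to object 1 than to object 2 on the raw values, because income swamps the rating. After z-scoring, the two distances are nearly equal.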
Cluster Analysis Decision Process (Cont.)

 Stage 3: Assumptions in Cluster Analysis


 Representativeness of the Sample
 Impact of Multicollinearity
Cluster Analysis Decision Process (Cont.)
 Stage 4: Deriving Clusters and Assessing Overall Fit
 Clustering Algorithms
 Hierarchical Cluster Procedures
 Single Linkage
 Complete Linkage
 Average Linkage
 Ward's Method
 Centroid Method
 Nonhierarchical Clustering Procedures
 Sequential Threshold
 Parallel Threshold
 Optimization
 Selecting Seed Points
 Should Hierarchical or Nonhierarchical Methods Be Used?
 Pros and Cons of Hierarchical Methods
 Emergence of Nonhierarchical Methods
 A Combination of Both Methods
 How Many Clusters Should Be Formed?
 Should the Cluster Analysis Be Respecified?
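
The hierarchical procedures listed above differ mainly in how they define the distance between two clusters. The following minimal sketch covers single, complete, and average linkage only; Ward's method and the centroid method are omitted. The points and the function names `linkage_distance` and `agglomerate` are assumptions for illustration, not taken from the source.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def linkage_distance(c1, c2, method):
    """Distance between two clusters (lists of points) under a linkage rule."""
    pair_d = [euclidean(p, q) for p in c1 for q in c2]
    if method == "single":      # closest pair of members
        return min(pair_d)
    if method == "complete":    # farthest pair of members
        return max(pair_d)
    if method == "average":     # mean over all between-cluster pairs
        return sum(pair_d) / len(pair_d)
    raise ValueError(f"unknown linkage: {method}")

def agglomerate(points, n_clusters, method="single"):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        _, i, j = min(
            (linkage_distance(clusters[i], clusters[j], method), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical data: two tight pairs and one distant object.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0), (10.0, 0.0)]
three = agglomerate(points, 3, method="single")
two = agglomerate(points, 2, method="complete")
```

Recording the distance at which each merge happens would give the information usually read off a dendrogram when deciding how many clusters to form. This sketch leaves that out for brevity.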
Cluster Analysis Decision Process (Cont.)

 Stage 5: Interpretation of the Clusters


 Stage 6: Validation and Profiling of the Clusters
 Validating the Cluster Solution
 Criterion or predictive validity
 Profiling the Cluster Solution

 Summary of the Decision Process


An Illustrative Example
 Stage 1: Objectives of the Cluster Analysis
 Segment objects (customers) into groups with
similar perceptions of HATCO
 HATCO can then formulate strategies with
different appeals for the separate groups.
 Stage 2: Research Design of the Cluster
Analysis
 Identify any outliers
 Similarity measure (multicollinearity: Mahalanobis D²)
 Stage 3: Assumptions in Cluster Analysis
An Illustrative Example (Cont.)
 Stage 4: Deriving Clusters and Assessing
Overall Fit
 Step 1: Hierarchical Cluster Analysis
 Step 2: Nonhierarchical Cluster Analysis
 Stage 5: Interpretation of the Clusters
 Two-cluster solution
 Four-cluster solution
 Stage 6: Validation and Profiling of the Clusters
 Managerial view
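
The two-step approach in Stage 4 of the example (a hierarchical analysis followed by a nonhierarchical one) can be sketched as follows. The sketch assumes that the hierarchical step has already supplied the number of clusters and a set of seed points. The refinement loop is a generic k-means-style reassignment used as a stand-in, which may differ from the procedure used in the example itself. All data and names are hypothetical.

```python
import math

def euclidean(a, b):
    """Straight-line distance between two equal-length numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(cluster):
    """Mean profile of the points in a non-empty cluster."""
    return tuple(sum(v) / len(cluster) for v in zip(*cluster))

def assign(points, seeds):
    """Assign each point to its nearest seed; return one list per seed."""
    clusters = [[] for _ in seeds]
    for p in points:
        nearest = min(range(len(seeds)), key=lambda k: euclidean(p, seeds[k]))
        clusters[nearest].append(p)
    return clusters

def refine(points, seeds, max_iter=100):
    """Alternate assignment and centroid updates until the seeds stop moving."""
    clusters = assign(points, seeds)
    for _ in range(max_iter):
        new_seeds = [centroid(c) if c else s for c, s in zip(clusters, seeds)]
        if new_seeds == seeds:
            break
        seeds = new_seeds
        clusters = assign(points, seeds)
    return clusters, seeds

# Hypothetical data and seed points (assumed to come from a hierarchical step).
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0), (10.0, 0.0)]
seeds_a = [(2.0, 0.5), (10.0, 0.0)]
clusters_a, final_a = refine(points, seeds_a)

# A different choice of seed points can lead to a different final solution.
seeds_b = [(0.0, 0.0), (4.0, 0.0)]
clusters_b, final_b = refine(points, seeds_b)
```

The two runs end with different cluster memberships on the same data, which is one reason the choice of seed points deserves attention.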
