Session 12 - Discriminant Analysis
SESSION 12
BRM
A quick revision of techniques and related scales
Methodology   Purpose                                                         Independent variable              Dependent variable
T-tests       Group difference between two groups                             Categorical (nominal)             Metric (interval or ratio)
ANOVA         Group difference between more than two groups                   Categorical (nominal)             Metric (interval or ratio)
Regression    Variation in a variable explained by a set of other variables   Metric (except dummy variables)   Metric
Double cross-validation
◦ Cross-validation: estimate the function on the estimation sample and check it on the validation sample
◦ Then exchange the roles of the estimation and validation samples, and conduct the validation again
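The two steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-variable "midpoint of the group means" rule stands in for a fitted discriminant function, and the function names and data are made up for the example.

```python
from statistics import mean

def fit_midpoint_rule(sample):
    """Estimate a one-variable rule: the cut-off is the midpoint of the two group means."""
    mean_0 = mean(x for x, group in sample if group == 0)
    mean_1 = mean(x for x, group in sample if group == 1)
    return {"cutoff": (mean_0 + mean_1) / 2, "high_group": 1 if mean_1 > mean_0 else 0}

def predict(rule, x):
    """Assign a case to the high-scoring group if it lies above the cut-off."""
    if x > rule["cutoff"]:
        return rule["high_group"]
    return 1 - rule["high_group"]

def hit_ratio(rule, sample):
    """Share of cases in `sample` that the rule classifies correctly."""
    hits = sum(predict(rule, x) == group for x, group in sample)
    return hits / len(sample)

def double_cross_validate(sample_a, sample_b):
    """Estimate on A and validate on B, then swap the roles of the two samples."""
    a_to_b = hit_ratio(fit_midpoint_rule(sample_a), sample_b)
    b_to_a = hit_ratio(fit_midpoint_rule(sample_b), sample_a)
    return a_to_b, b_to_a

# Made-up (value, group) pairs, for illustration only.
sample_a = [(2.0, 0), (3.0, 0), (4.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]
sample_b = [(1.0, 0), (3.5, 0), (5.5, 0), (4.5, 1), (7.5, 1), (9.0, 1)]

print(double_cross_validate(sample_a, sample_b))
```

If the two hit ratios differ widely, the estimated function may depend heavily on which cases happened to fall into the estimation sample.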
Key Terms and Steps Used in Discriminant Analysis
Estimation
◦ Assessing the basics: descriptives, correlations (to check for multicollinearity), and significance of the F-ratios (income, vacation, and hsize are each individually significant)
◦ Eigenvalue: ratio of the between-group to the within-group sum of squares for a function (the higher, the better)
◦ Canonical correlation: simple correlation between the discriminant scores and the corresponding group membership (visited/not visited)
◦ The square of the canonical correlation is the proportion of variance in the dependent variable that is explained by the independent variables, i.e. by the model (64% in this case)
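As a rough illustration of how the eigenvalue and the canonical correlation relate, the sketch below computes both from the discriminant scores of two groups. The scores are made up, and the function names are hypothetical; the eigenvalue is taken here as the between-group over within-group sum of squares of the scores.

```python
from math import sqrt
from statistics import mean

def sums_of_squares(scores_by_group):
    """Return (between-group SS, within-group SS) of the discriminant scores."""
    all_scores = [s for scores in scores_by_group.values() for s in scores]
    grand_mean = mean(all_scores)
    ss_between = sum(len(scores) * (mean(scores) - grand_mean) ** 2
                     for scores in scores_by_group.values())
    ss_within = sum((s - mean(scores)) ** 2
                    for scores in scores_by_group.values() for s in scores)
    return ss_between, ss_within

def eigenvalue_and_canonical_corr(scores_by_group):
    """Eigenvalue = SS_between / SS_within; canonical corr = sqrt(SS_between / SS_total)."""
    ss_between, ss_within = sums_of_squares(scores_by_group)
    eigenvalue = ss_between / ss_within
    canonical_corr = sqrt(ss_between / (ss_between + ss_within))
    return eigenvalue, canonical_corr

# Made-up discriminant scores, for illustration only.
scores = {"visited": [0.5, 1.0, 1.5], "not visited": [-1.5, -1.0, -0.5]}

eigenvalue, canonical_corr = eigenvalue_and_canonical_corr(scores)
print(eigenvalue, canonical_corr, canonical_corr ** 2)
```

Under this definition the squared canonical correlation equals eigenvalue / (1 + eigenvalue), so a squared canonical correlation of 0.64 would correspond to an eigenvalue of about 0.64 / 0.36 ≈ 1.78.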
Significance
◦ Wilks' lambda - H0: the means of all discriminant functions are equal across all groups
◦ For each predictor, it is the ratio of the within-group sum of squares to the total sum of squares (the lower, the better)
◦ Takes values between 0 and 1; a smaller value indicates that the group means appear to differ, and a larger value (close to 1) indicates that the group means may not differ
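The per-predictor ratio described above can be checked by hand with a small sketch. The data and the function name `univariate_wilks_lambda` are made up for illustration; the code covers only the single-predictor ratio, not the significance test.

```python
from statistics import mean

def univariate_wilks_lambda(values_by_group):
    """Within-group sum of squares divided by total sum of squares for one predictor."""
    all_values = [v for values in values_by_group.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = sum((v - mean(values)) ** 2
                    for values in values_by_group.values() for v in values)
    return ss_within / ss_total

# Made-up predictor values, for illustration only.
well_separated = {"visited": [60, 62, 64], "not visited": [40, 42, 44]}
identical_means = {"visited": [40, 50, 60], "not visited": [42, 52, 56]}

print(univariate_wilks_lambda(well_separated))   # close to 0: group means differ
print(univariate_wilks_lambda(identical_means))  # equal group means give exactly 1
```

When the group means coincide, all of the variation is within groups, so the ratio reaches its upper limit of 1.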
Interpretations
Unstandardized coefficients -> Canonical discriminant function coefficients (in the table)
◦ Interpreted the same way as regression coefficients; used to compute the discriminant score
Standardized coefficients -> Standardized discriminant function coefficients
◦ Indicate the relative contribution of each variable in discriminating between the two groups
◦ There is no constant term
Structure matrix -> Correlations between the discriminant score and each predictor variable
◦ Arranged in descending order; each represents the variance that the predictor shares with the discriminant function
Group centroid -> Mean discriminant score of a group (visited/not visited)
Developing a characteristic profile -> Based on the group mean values of the significant variables
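To make the link between unstandardized coefficients, discriminant scores, and group centroids concrete, here is a minimal sketch. The constant, the coefficients, and the cases are hypothetical values chosen for the example, not output from the resort data; only the variable names income and hsize are borrowed from the session.

```python
from statistics import mean

# Hypothetical unstandardized coefficients and constant (NOT taken from the slides).
CONSTANT = -5.0
COEFFICIENTS = {"income": 0.08, "hsize": 0.25}

def discriminant_score(case):
    """Constant plus the coefficient-weighted sum of the predictors, as in regression."""
    return CONSTANT + sum(COEFFICIENTS[name] * case[name] for name in COEFFICIENTS)

def group_centroids(cases):
    """Mean discriminant score of each group."""
    scores = {}
    for case in cases:
        scores.setdefault(case["group"], []).append(discriminant_score(case))
    return {group: mean(values) for group, values in scores.items()}

# Made-up cases, for illustration only.
cases = [
    {"group": "visited", "income": 60, "hsize": 4},
    {"group": "visited", "income": 70, "hsize": 4},
    {"group": "not visited", "income": 40, "hsize": 2},
    {"group": "not visited", "income": 30, "hsize": 2},
]

print(group_centroids(cases))
```

The standardized coefficients would be applied to standardized predictors instead, which is why that version of the function carries no constant.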
Classification of cases using discriminant function
◦ Cut-off score for classification:
▪ With equal sample sizes: the average of the two group centroids, i.e. (-1.219 + 1.219)/2 = 0, is taken as the criterion
▪ With unequal sample sizes
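The cut-off rule can be sketched as follows. The equal-size case reproduces the centroid average given above. The session does not state a formula for unequal sample sizes, so `cutoff_weighted` is an assumption: it uses a size-weighted cutting score found in some textbooks, which may differ from what was presented. The group sizes 20 and 10 are made up.

```python
def cutoff_equal_sizes(centroid_a, centroid_b):
    """Simple average of the two group centroids."""
    return (centroid_a + centroid_b) / 2

def cutoff_weighted(centroid_a, n_a, centroid_b, n_b):
    """Size-weighted cutting score (assumption, not taken from the slides).

    Each centroid is weighted by the size of the OTHER group, which moves
    the cut-off toward the centroid of the smaller group.
    """
    return (n_a * centroid_b + n_b * centroid_a) / (n_a + n_b)

def classify(score, cutoff, low_group, high_group):
    """Assign a case to the high-scoring group if its score exceeds the cut-off."""
    return high_group if score > cutoff else low_group

equal_cutoff = cutoff_equal_sizes(-1.219, 1.219)
unequal_cutoff = cutoff_weighted(-1.219, 20, 1.219, 10)  # hypothetical group sizes

print(equal_cutoff, unequal_cutoff)
print(classify(0.5, equal_cutoff, "not visited", "visited"))
```

With equal group sizes the weighted version reduces to the simple average, so the two functions agree in that case.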
Assessing classification accuracy
Classification matrix:
◦ Hit ratio = Number of correct predictions / Total number of cases
▪ The data are divided into an estimation sample (size 30) and a validation sample (size 12), with equal representation of Visit/Did not visit resort
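A classification matrix and its hit ratio can be tabulated with a short sketch like the one below. The 12 validation cases mirror the sample size mentioned above, but the predicted labels are made up for illustration and are not the actual results.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count the cases for every (actual group, predicted group) pair."""
    return Counter(zip(actual, predicted))

def hit_ratio(actual, predicted):
    """Number of correct predictions divided by the total number of cases."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Made-up validation sample of 12 cases, 6 per group.
actual = ["visit"] * 6 + ["no visit"] * 6
predicted = (["visit"] * 5 + ["no visit"] * 1      # 5 of 6 visitors classified correctly
             + ["no visit"] * 4 + ["visit"] * 2)   # 4 of 6 non-visitors classified correctly

matrix = classification_matrix(actual, predicted)
for (a, p), count in sorted(matrix.items()):
    print(f"actual={a:9s} predicted={p:9s} n={count}")
print("hit ratio:", hit_ratio(actual, predicted))
```

The diagonal cells of the matrix (actual equals predicted) are the hits; everything off the diagonal is a misclassification.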
Take Home Exercise
Do a two-group discriminant analysis with the loyalty data set (shared in the drive)
a. Analyze and interpret the results
b. Remove the insignificant independent variable(s) from the model, rerun the analysis, and interpret the results
c. Keep the first 20 observations as the estimation sample and the remaining 10 as the validation sample. Run the analysis and interpret the results