Discriminant Analysis For Classification and Prediction

Discriminant analysis for
classification and prediction

Discriminant Analysis : A
procedure for analysing M.R. data wh
en the dependent variable is categori
cal and the independent variables are
interval in nature.
Objective : To find linear combinatio
n of the independent variables that m
aximises the discrimination etween t
wo groups and minimises the probabil
ity ofmisclassifying individuals or obje
cts into their reference group.
Applications
How do consumers who are loyal to brand differs fro
m those who are not in their demographic profiles fro
m those who are not loyal?
Dependent variable: Brand loyalty customer class
Independent variable:Demographics
Are significant demographic differences observed am
ong purchasers of Sears, Goodyear, Goodrich, and Fir
estone tires?
Dep:Different tires
Ind:Demographic characterstics
Do heavy, medium and light users of soft drinks diffe
r in terms of their consumption of frozen food?
Dep:Softdrink usage class
Ind: Consumption of Frozen Food
Applications
How do consumers who are loyal to brand differs fro
m those who are not in their demographic profiles fro
m those who are not loyal?
Dependent variable: Brand loyalty customer class
Independent variable:Demographics
Are significant demographic differences observed am
ong purchasers of Sears, Goodyear, Goodrich, and Fir
estone tires?
Dep:Different tires
Ind:Demographic characterstics
Do heavy, medium and light users of soft drinks diffe
r in terms of their consumption of frozen food?
Dep:Softdrink usage class
Ind: Consumption of Frozen Food
Similarities and Differences between
ANOVA, Regression, and Discriminant
Analysis
Table 18.1
ANOVA REGRESSION DISCRIMINANT ANALYSIS
Similarities
Number of One One One
dependent
variables
Number of
independent Multiple Multiple Multiple
variables
Differences
Nature of the
dependent Metric Metric Categorical
variables
Nature of the
independent Categorical Metric Metric
variables
The objective of discriminant Analysi
s
Development of discriminant functions, or linear c
ombinations of the independent variables, that will
best discriminate between the categories of the de
pendent variable( group ).
Examination of whether significant differences exi
st among the groups, in terms of the independent v

ariables.
Determination of which independent variables con
tribute to most of the inter-group differences.

Classification of cases to one of the the groups on
the basis of the values of the predictor variables.

Evaluation of the accuracy of classification.
 Discriminant Function : The linear combination of indep
endent variables developed by discriminant analysis that w
ill best discriminate between the categories of the depende
nt variable.
 Two-Group Discriminant Analysis : Discriminant analys

is technique in which the independent variable has two cat
egories.
 Multiple Discriminant Analysis : Discriminant analysis t

echnique in which the criterion variable involves three or
more categories.
Two group discriminant analysis

Scatter plot of data that appears to be from two
groups

.

Cutting
score=ZCE
Group A Group
B
ZA ZB
Classify as
Classify as B
A (Purchaser
(Nonpurch )ZA
aser)
Optimal Cutting Score with Equal

Sample Sizes.
Group 0 Group 1
Specification buying Top Value approach
Z0=1.836 Z0=1.063
- - 0 N=3 4
4 2 8
Plot of Group Centroids (Z)
The overlap between A and B
is smaller than any other line
drawn through the scatter
plot. Z axis represents the two
 variable profiles of A and B as


 single numbers. By finding a


 linear composites of the
 original profile scores we can
 

potray each profile as a point
in the line. Thus Z axis
“condenses” the information
about group separability into a
set of points on the Z axis.Z-
A
axis is the discriminant AXIS.
V
B 1
Discriminant Function
Z
Two-Group Discriminant
Analysis
XXOXOOO
XXXOXXOOOO
Price
XXXXOOOXOOO
Sensitivity XXOXXOXOOOO
XXOXOOOOOOO
X-segment Need for Data Storage
O-segment
x = high propensity to buy
o = low propensity to buy
Evaluation of new product
*(1=very poor, 10=excellent)
Group based on purchase X1 X2 X3
intention Durability Performance Style
Group I: Would purchase
Subject 1 8 9 6
Subject 2 6 7 5
Subject 3 10 6 3
Subject 4 9 4 4
Subject 5 4 8 2
Group Mean 7.4 6.8 4.0
Group II: Would not
purchase
Subject 6 5 4 7
Subject 7 3 7 2
Subject 8 4 5 5
Subject 9 2 4 3
Subject 10 2 2 2
Group mean 3.2 4.4 3.8
Difference between 4.2 2.4 0.2
group means
1 8
0
X1 9 7 5 6 2 1 4 3
Durabili
ty
6 7
X2 5 1
Durabili 1 4 8 3 2
ty 0
1
0
7 9 8
X3
5 3 4 2 1 6
Style
Consumer Evaluations (Like Versus Dislike) of Ten Cereals Varying in Nutritional Content
Evaluation X1 X2
Protein Vitamin D
Group I: Dislike
Person 1 2 4
Person 2 3 2
Person 3 4 5
Person 4 5 4
Person 5 6 7
Group Mean 4.0 4.4
Group II: Dislike
Person 6 7 6
Person 7 8 4
Person 8 9 7
Person 9 10 6
Person10 11 9
Group mean 9.0 6.4
12  Disliker
 s
Likers
10

8
ta
 
m
Vi
in
X
2
6  

4   
2 
0
4 8 10 12
2 6
Protein, X2
Centroid(Liker)

10
Total sample 
centroid
Centroid 
(Liker) 
 *



1
0
.824
1.596 2.368
Discriminant Analysis Model
The Discriminant Analysis Model involves linear combinati
ons of the following form :
D = b 0 + b 1 X 1 + b 2 X 2 + b 3 X3 + . . . + b k X k
D = discriminant score
b = discriminant coefficient or weight
X = independent variable.
Statistics Associated with Discriminant Analys
is
 Canonical correlation : measures the extent of associati
on between the discriminant scores and the groups. It is a
measure of association between the single discriminant fu
nction and the set of dummy variables that define the gro
up membership.
 Centroid : is the mean values for the discriminant scores
for a particular group. There are as many centroids as ther
e are groups, as there is one for each group. The means fo
r a group on all the functions are the group centroids.
 Classification matrix : sometimes also called confusion
matrix, the classification matrix contains the number of co
rrectly classified and mis-classified cases.
 Discriminant function coefficients( unstandardized ) :
are the multipliers of variables, when the variables are in t
he original units of measurement.
 Discriminant scores : The unstandardized coefficients ar
e multiplied by the values of the variables. These products
are summed and added to the content term to obtain the
discriminant scores.
 Eigenvalues : For each discriminant function, the eigenva
lue is the ratio of between-group to within-group sums of
squares.
 F values and their significance : These are calculated fr
om a one-way ANOVA, with the grouping variable serving
as the categorical independent variable. Each predictor, in
turn, serves as the metric dependent variable in the ANOV
A.
 Group means and group stand deviations : These are
computed for each predictor for each group.
 Pooled within-group correlation matrix : is computed
by averaging the separate covariance matrices for all the
groups.
 Standardized discriminant function coefficients : use
d as the multipliers when the variables have been standar
dized to a mean of 0 and a variance of 1.
 Structure correlations : Also referred to as discriminant
loadings, the structure correlation represent the simple co
rrelations between the predictors and the discriminant fun
ction.
 Total correlation matrix : If the cases are treated as if
they were from a single sample and the correlations comp
uted, a total correlation matrix is obtained.
 Analysis Sample : part of the total sample that is u
sed for estimation of the discriminant function.
 Validation Sample : That part of the total sample
used to check the results of the estimation sample.
 Direct Method : An approach to discriminant analy
sis that involves estimating the discriminant function
so that all the predictors are included simultaneousl
y.
D = w1Z1 + w2Z2 + w3Z3 + .... wiZi
D = discriminant score;
w i = weighting (coefficient) for variable i;
Z i = standardised score for variable i ( a standardised
variable has a mean of 0 and a standard deviation of 1 ).
 Stepwise Discriminant Analysis : Discriminant analysis
in which the predictors are entered sequentially based on
their ability to discriminate between the groups.
 Territorial Map : A tool for assessing discriminant analys
is results by plotting the graph membership of each case
on a graph.
Conducting Discriminant Analysis
There are 5 steps procedure as following :
1.Formulating the discriminant problem:
 Indentification of the dependent and independent variable
s.
 The sample is divided into two parts. One part, the analysis
sample, is used to estimate the discriminant function.
 The other part, the holdout sample, is used for validation.
2. Estimation:
 developing a linear combination of theindependent
variables, called discriminant functions, so that the groups
differ as much as possible on the values of independent
variables.
3. Determination of Statistical significance
 testing the null hypothesis that, in the population, the mea
ns of all discriminant functions in all groups are equal.
 If the null hypothesis is rejected, it is meaningful to interpr
et the results.
4. The interpretation of discriminant weights or coefficients
 similar to that in multiple regression analysis.
 The relative importance of the variables may be
understood by examining the absolute magnitude of the st
andardized discriminant function coefficients
 also by examining the structure correlations or discriminant
loading.
 The simple correlations between each independent and the
discriminant function represent the variance that the
independent shares with the function.
5. Validation:
 developing the classification matrix.
 The discriminant weights estimated by using the ana
lysis sample are multiplied by the values of the
independent variables in the hold-out sample.
 The cases are then assigned to groups based on thei
r discriminant scores and an appropriate decision rul
e.
 The percentage of cases correctly classified is deter
mined and compared to the rate that would be expe
cted by chance classification.
Steps for Statistical Interpretation:
 Step1:How Good is this model?
Interpretation from the output table “Classification Matrix”
Step2:
 Statistical significance of the discriminant functions
This can be answered by looking Wilks’ Lambda (proportio
n of discriminant score not explained by difference amo
ng group)and Probability value of F-test.The value of Wi
lk’s Lambda ranges between 0 and 1 with a lower value
indicating better discriminating power of the model.
 Step3:Which Variables are relatively better in discrimin
ating between consumers for national and local brand?
From the output table for “Standardised coefficients for ca
nonical variables’ compare the absolute value of standa
rdised coefficients of each independent variable
Steps for Statistical Interpretation:
Step4: How to classify a new customer into one of these two

groups?
Refer the output table of “Raw Coefficient for Canonical Vari
bles” and “Means of Canonical Variables”.
The means of canonical variables gives us the new means of
transformed group centroids.The new mean of Group 1 an
d Group2 will vary from positive to -ve. Two groups will b
e in the as per the position of discriminant score in the rig
ht and left of midpoint .
Predicted Discriminant Score= constant+ raw coefficient of i
nd. variable1*(x1)+ raw coefficient of independent varia
ble2*(x2)
Match predicted value with means of Canonical variables.
Conducting Discriminant Analysis
Estimate the Discriminant Function
Coefficients
 The direct method involves estimating the
discriminant function so that all the predictors are
included simultaneously.
 In stepwise discriminant analysis, the predictor
variables are entered sequentially, based on their
ability to discriminate among groups.
Information on Resort Visits: Analysis
Sample
Table 18.2
Attitude Importance Household Age of Amount

Resort Family Toward Attached Size Head of Spent on
Visit Income to Family Household Family

($000) Vacation
1 1 50.2 5 8 3 43 M (2)
2 1 70.3 6 7 4 61 H (3)
3 1 62.9 7 5 6 52 H (3)
4 1 48.5 7 5 5 36 L (1)
5 1 52.7 6 6 4 55 H (3)
6 1 75.0 8 7 5 68 H (3)
7 1 46.2 5 3 3 62 M (2)
8 1 57.0 2 4 6 51 M (2)
9 1 64.1 7 5 4 57 H (3)
10 1 68.1 7 6 5 45 H (3)
11 1 73.4 6 7 5 44 H (3)
12 1 71.9 5 8 4 64 H (3)
13 1 56.2 1 8 6 54 M (2)
14 1 49.3 4 2 3 56 H (3)
15 1 62.0 5 6 2 58 H (3)
Information on Resort Visits: Analysis
Sample
Table 18.2 cont.
Annual Attitude Importance Household Age of Amount

Resort Family Toward Attached Size Head of Spent on No.
Visit Income Travel to Family Household Family
($000) Vacation Vacation
16 2 32.1 5 4 3 58 L (1)
17 2 36.2 4 3 2 55 L (1)
18 2 43.2 2 5 2 57 M (2)
19 2 50.4 5 2 4 37 M (2)
20 2 44.1 6 6 3 42 M (2)
21 2 38.3 6 6 2 45 L (1)
22 2 55.0 1 2 2 57 M (2)
23 2 46.1 3 5 3 51 L (1)
24 2 35.0 6 4 5 64 L (1)
25 2 37.3 2 7 4 54 L (1)
26 2 41.8 5 1 3 56 M (2)
27 2 57.0 8 3 2 36 M (2)
28 2 33.4 6 8 2 50 L (1)
29 2 37.5 3 2 3 48 L (1)
30 2 41.3 3 3 2 42 L (1)
Results of Two-Group Discriminant
Analysis
Table 18.4
GROUP MEANS
VISIT INCOME TRAVEL VACATION HSIZE AGE
1 60.52000 5.40000 5.80000 4.33333 53.73333
2 41.91333 4.33333 4.06667 2.80000 50.13333
Total 51.21667 4.86667 4.9333 3.56667 51.93333
Group Standard Deviations
1 9.83065 1.91982 1.82052 1.23443 8.77062
2 7.55115 1.95180 2.05171 .94112 8.27101
Total 12.79523 1.97804 2.09981 1.33089 8.57395
Pooled Within-Groups Correlation Matrix
INCOME TRAVEL VACATION HSIZE AGE
INCOME 1.00000
TRAVEL 0.19745 1.00000
VACATION 0.09148 0.08434 1.00000
HSIZE 0.08887 -0.01681 0.07046 1.00000
AGE - 0.01431 -0.19709 0.01742 -0.04301 1.00000
Wilks' (U-statistic) and univariate F ratio with 1 and 28 degrees of freedom
Variable Wilks' F Significance
INCOME 0.45310 33.800 0.0000
TRAVEL 0.92479 2.277 0.1425
VACATION 0.82377 5.990 0.0209
HSIZE 0.65672 14.640 0.0007
AGE 0.95441 1.338 0.2572
Contd.
Analysis
Table 18.4 cont.
CANONICAL DISCRIMINANT FUNCTIONS
% of Cum Canonical After Wilks'
Function Eigenvalue Variance % Correlation Function  Chi-square df Significance
: 0 0 .3589 26.130 5 0.0001
1* 1.7862 100.00 100.00 0.8007 :
* marks the 1 canonical discriminant functions remaining in the analysis.
Standard Canonical Discriminant Function Coefficients
FUNC 1
INCOME 0.74301
TRAVEL 0.09611
VACATION 0.23329
HSIZE 0.46911
AGE 0.20922
Structure Matrix:
Pooled within-groups correlations between discriminating variables & canonical discriminant functions (variables ordered by size of
correlation within function)
FUNC 1
INCOME 0.82202
HSIZE 0.54096
VACATION 0.34607
TRAVEL 0.21337
AGE 0.16354
Contd.
Analysis
Table 18.4 cont.
Unstandardized Canonical Discriminant Function Coefficients

FUNC 1
INCOME 0.8476710E-01
TRAVEL 0.4964455E-01
VACATION 0.1202813
HSIZE 0.4273893
AGE 0.2454380E-01
(constant) -7.975476
Canonical discriminant functions evaluated at group means (group centroids)
Group FUNC 1
1 1.29118
2 -1.29118
Classification results for cases selected for use in analysis
Predicted Group Membership
Actual Group No. of Cases 1 2
Group 1 15 12 3
80.0% 20.0%
Group 2 15 0 15
0.0% 100.0%
Percent of grouped cases correctly classified: 90.00%
Contd.
Analysis
Table 18.4 cont.
Classification Results for cases not selected for use in the analysis (holdout sample)
Actual Group No. of Cases 1 2
Group 1 6 4 2
66.7% 33.3%
Group 2 6 0 6
0.0% 100.0%
Percent of grouped cases correctly classified: 83.33%.
Models with More than one Group
 For two group model above, Discriminant
analysis found best projection of plot
points (cases) for separating the groups
and introduced Linear Discriminant
Function.
 When there are more than two groups,
Discriminant Function become the focus
of Analysis.
Models with More than one Group
 The first Discriminant Function is the Linear
combination of the variables that maximizes the
differences between the means of k groups in
one dimension.
 The second Discriminant Function represents the
maximum dispersion of the means in a direction
orthogonal (perpendicular) to the first function.
 Third represents the disperson in a dimension
independent of the first two dimensions.
 DF’s are factors that discriminate optimally
among group centroids relative to disperson
within Group.
Group based on Switching Intention X1 X2 Service Level
Price Competitiveness
Group I: Definitely switch
Subject 1 2 2
Subject 2 1 2
Subject 3 3 2
Subject 4 2 1
Subject 5 2 3
Group Mean 2.0 2.0
Group II: undecided
Subject 6 4 2
Subject 7 4 3
Subject 8 5 1
Subject 9 5 2
Subject 10 5 3
Group mean 4.6 2.2
Group III: Definitely not switch
Subject 11 2 6
Subject 12 3 6
Subject 13 4 6
Subject 14 5 6
Subject 115 5 7
Group mean 3.8 6.2
1
5
1
1
1 1
1 4
1 1
1 1 1
5
5 3 11
X2 1
1 00
1 7 19
4 2 1
1 00
1
2
1 3 6 8
1
4
19
1
15
4
1 1
1
7 10
3
1 2 10
1
1 1
1 1
X1 1 1 1 1
2 11 2 6
3 8
4
5
|_______|_______|_______|______|_______|_______|
1 1 1 1
_______| 1 1 1 1
0 1 2 3 4 5 6
7
Discriminant Function 2
1
7 5
1
1
6 1
1
1
2
1
3
1
4
1 1 1 1
1 1 1 1
5
3 5 7 1
0
2
2 1 3 6 9
1 8
4
Discrimiant Function 1
Graphical Representation of Potential Discriminating Variables

for a Three Group Discriminant Analysis
Two-Dimensional Representation of Discriminant Functions

Discriminant Function 1=1.0*X1+0*X2
Discriminant Function 2=1.0*X1+0*X2
Results of Three-Group Discriminant
Analysis
Table 18.5
Group Means
AMOUNT INCOME TRAVEL VACATION HSIZE AGE
1 38.57000 4.50000 4.70000 3.10000 50.30000

2 50.11000 4.00000 4.20000 3.40000 49.50000
3 64.97000 6.10000 5.90000 4.20000 56.00000
Total 51.21667 4.86667 4.93333 3.56667 51.93333
Group Standard Deviations
1 5.29718 1.71594 1.88856 1.19722 8.09732

2 6.00231 2.35702 2.48551 1.50555 9.25263
3 8.61434 1.19722 1.66333 1.13529 7.60117
Total 12.79523 1.97804 2.09981 1.33089 8.57395
Pooled Within-Groups Correlation Matrix

INCOME TRAVEL VACATION HSIZE AGE
INCOME 1.00000
TRAVEL 0.05120 1.00000
VACATION 0.30681 0.03588 1.00000
HSIZE 0.38050 0.00474 0.22080 1.00000
AGE -0.20939 -0.34022 -0.01326 -0.02512 1.00000
Contd.
Analysis
Table 18.5 cont.
Wilks' (U-statistic) and univariate F ratio with 2 and 27 degrees of freedom.
Variable Wilks' Lambda F Significance
INCOME 0.26215 38.00 0.0000

TRAVEL 0.78790 3.634 0.0400
VACATION 0.88060 1.830 0.1797
HSIZE 0.87411 1.944 0.1626
AGE 0.88214 1.804 0.1840
CANONICAL DISCRIMINANT FUNCTIONS
% of Cum Canonical After Wilks'

Function Eigenvalue Variance % Correlation Function  Chi-square df Significance
: 0 0.1664 44.831 10 0.00
1* 3.8190 93.93 93.93 0.8902 : 1 0.8020 5.517 4 0.24
2* 0.2469 6.07 100.00 0.4450 :
* marks the two canonical discriminant functions remaining in the analysis.
Standardized Canonical Discriminant Function Coefficients
FUNC 1 FUNC 2
INCOME 1.04740 -0.42076
TRAVEL 0.33991 0.76851
VACATION -0.14198 0.53354
HSIZE -0.16317 0.12932
AGE 0.49474 0.52447 Contd.
Analysis
Table 18.5 cont.
Structure Matrix:
Pooled within-groups correlations between discriminating variables and canonical discriminant
functions (variables ordered by size of correlation within function)
FUNC 1 FUNC 2
INCOME 0.85556* -0.27833
HSIZE 0.19319* 0.07749
VACATION 0.21935 0.58829*
TRAVEL 0.14899 0.45362*
AGE 0.16576 0.34079*
Unstandardized canonical discriminant function coefficients

FUNC 1 FUNC 2
INCOME 0.1542658 -0.6197148E-01
TRAVEL 0.1867977 0.4223430
VACATION -0.6952264E-01 0.2612652
HSIZE -0.1265334 0.1002796
AGE 0.5928055E-01 0.6284206E-01
(constant) -11.09442 -3.791600
Canonical discriminant functions evaluated at group means (group centroids)

Group FUNC 1 FUNC 2
1 -2.04100 0.41847
2 -0.40479 -0.65867
3 2.44578 0.24020 Contd.
Analysis
Table 18.5 cont.
Classification Results:
Actual Group No. of Cases 1 2 3
Group 1 10 9 1 0
90.0% 10.0% 0.0%
Group 2 10 1 9 0
10.0% 90.0% 0.0%
Group 3 10 0 2 8
0.0% 20.0% 80.0%
Classification results for cases not selected for use in the analysis
Actual Group No. of Cases 1 2 3
Group 1 4 3 1 0
75.0% 25.0% 0.0%
Group 2 4 0 3 1
0.0% 75.0% 25.0%
Group 3 4 1 0 3
25.0% 0.0% 75.0%
All-Groups
Fig. 18.2 Scattergram
Across: Function 1
Down: Function 2
4.0
1 1
1 *1 3
23 3 *3 3
1 1 12 * 3 3
0.0 1 1 2 2
3
1 2 2
2
-4.0
* indicates a group
centroid
-6.0 -4.0 -2.0 0.0 2.0 4.0 6.0

Territorial
Fig. 18.3 Map
13
13
13 Across: Function 1
8.0 Down: Function 2
13
13 * Indicates a
13
group centroid
13
4.0 113
112 3
112233
*1 1 1 2 2 2 2 3 3 *
1 1 2 2* 223
0.0 1122 233
1122 2233
11122 223
233
-4.0 11222
1122 223
11122 233
11122 2233
1122 223
-8.0 11122 233
-8.0 -6.0 -4.0 -2.0 0.0 2.0 4.0 6.0 8.0

-1.5
Taste
Cadbury’s
Price
Nestle
0
-1 3
-2 1 2
0
Packaging Availability
-0.5
Quality
1
Amul -1.5
Interpretation
 Vectors represent the effect of discriminating

each dimension. Larger arrows pointing more
closely towards a given group centroid represent
variables most strongly associated with that
particular group.
 Vectors pointing in the opposite direction from a

group centroid represents lower association with
the concerned group.
 Variables with longer vectors in a given
dimension and closer to a given axis are
contributing more to the interpretation of that
dimension.
Screen
Thomson
Internet
0.5
Fuzzy
Colour
-2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2

2.5
Weight
Remote
0.5
Audio
Sansug Price
Perceptual Map of Lipstick Brands and Attributes
2.00
Lakme
1.50
Packaging Maybelline
1.00
Finish Price
0.50
0.00
-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00 4.00
-0.50
Long lasting
colour
-1.00
-1.50
-2.00
Revion
-2.50
Dimension 1
1.5
B4
1
A1
A3 A6
0.5
A4 B2
0
A5
-4 -3 B1 -1 0 1
-2 2
A2
-0.5
-1
B3
-1.5
Stepwise Discriminant Analysis
 Stepwise discriminant analysis is analogous to stepwise
multiple regression (see Chapter 17) in that the
predictors are entered sequentially based on their ability
to discriminate between the groups.
 An F ratio is calculated for each predictor by conducting
a univariate analysis of variance in which the groups are
treated as the categorical variable and the predictor as
the criterion variable.
 The predictor with the highest F ratio is the first to be
selected for inclusion in the discriminant function, if it
meets certain significance and tolerance criteria.
 A second predictor is added based on the highest
adjusted or partial F ratio, taking into account the
predictor already selected.
Stepwise Discriminant Analysis
 Each predictor selected is tested for retention based on
its association with other predictors selected.
 The process of selection and retention is continued until
all predictors meeting the significance criteria for
inclusion and retention have been entered in the
discriminant function.
 The selection of the stepwise procedure is based on the
optimizing criterion adopted. The Mahalanobis
procedure is based on maximizing a generalized
measure of the distance between the two closest
groups.
 The order in which the variables were selected also
indicates their importance in discriminating between the
groups.
Export Data set
 Y1= Willingness to Export
 X1=Employee Size
 x2= Firm Revenue
 X3= Years of Operation in the Domestic
Market
 X4=Number of Products
 X5=Training of Employees
 X6=Management Experience in
International Operations

Discriminant Analysis For Classification and Prediction

Uploaded by

Document Information

Original Description:

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Discriminant Analysis For Classification and Prediction

Uploaded by

Copyright:

Available Formats

Discriminant analysis for

classification and prediction

ANOVA REGRESSION DISCRIMINANT ANALYSIS

st among the groups, in terms of the independent v

tribute to most of the inter-group differences.

the basis of the values of the predictor variables.

 Two-Group Discriminant Analysis : Discriminant analys

 Multiple Discriminant Analysis : Discriminant analysis t

Optimal Cutting Score with Equal

X-segment Need for Data Storage

Step4: How to classify a new customer into one of these two

Attitude Importance Household Age of Amount

Visit Income to Family Household Family

Annual Attitude Importance Household Age of Amount

Unstandardized Canonical Discriminant Function Coefficients

Graphical Representation of Potential Discriminating Variables

Two-Dimensional Representation of Discriminant Functions

1 38.57000 4.50000 4.70000 3.10000 50.30000

Group Standard Deviations

1 5.29718 1.71594 1.88856 1.19722 8.09732

Pooled Within-Groups Correlation Matrix

Wilks' (U-statistic) and univariate F ratio with 2 and 27 degrees of freedom.

Variable Wilks' Lambda F Significance

INCOME 0.26215 38.00 0.0000

CANONICAL DISCRIMINANT FUNCTIONS

% of Cum Canonical After Wilks'

2* 0.2469 6.07 100.00 0.4450 :

* marks the two canonical discriminant functions remaining in the analysis.

Standardized Canonical Discriminant Function Coefficients

Unstandardized canonical discriminant function coefficients

Canonical discriminant functions evaluated at group means (group centroids)

-6.0 -4.0 -2.0 0.0 2.0 4.0 6.0

-8.0 -6.0 -4.0 -2.0 0.0 2.0 4.0 6.0 8.0

 Vectors represent the effect of discriminating

 Vectors pointing in the opposite direction from a

-2.5 -2 -1.5 -1 -0.5 0 0.5 1 1.5 2

-3.00 -2.00 -1.00 0.00 1.00 2.00 3.00 4.00

You might also like