MDA Session 14
Canonical Discriminant Analysis
[Figure: scatter plot of Return on Investment (vertical axis) against Current Ratio (horizontal axis), with Good Accounts and Bad Accounts plotted as separate groups]
27-07-2023
Why Line?
$Z = aX + bY$, with $X$ = Current Ratio (CR) and $Y$ = Return on Investment (ROI)
Choose a,b so that Z-values of ‘good accounts’ are as
‘different’ from the Z-values of ‘bad accounts’ as possible
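$$a = \frac{\sigma_y^2\,\Delta_x - \sigma_{xy}\,\Delta_y}{\sigma_x^2\,\sigma_y^2 - \sigma_{xy}\,\sigma_{xy}}, \qquad b = \frac{\sigma_x^2\,\Delta_y - \sigma_{xy}\,\Delta_x}{\sigma_x^2\,\sigma_y^2 - \sigma_{xy}\,\sigma_{xy}}$$
where $\Delta_x = \bar{X}_1 - \bar{X}_2$ and $\Delta_y = \bar{Y}_1 - \bar{Y}_2$ are the differences between the two group means.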
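$$Z = aX + bY = \begin{pmatrix} a & b \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$$
$$\begin{pmatrix} a \\ b \end{pmatrix} = \begin{pmatrix} \sigma_x^2 & \sigma_{xy} \\ \sigma_{xy} & \sigma_y^2 \end{pmatrix}^{-1} \begin{pmatrix} \bar{X}_1 - \bar{X}_2 \\ \bar{Y}_1 - \bar{Y}_2 \end{pmatrix}$$
$$(\bar{\mathbf{X}}_1 - \bar{\mathbf{X}}_2)'\,\Sigma^{-1}\,\mathbf{X} = Z$$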
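The weights in the formulas above can be sketched in a few lines of Python. This is only an illustration: the (CR, ROI) pairs are hypothetical numbers made up for the example, the function names are invented here, and the pooled within-group covariance (divisor n1 + n2 - 2) is assumed as the estimate of the covariance matrix.

```python
# Hypothetical (CR, ROI) pairs, made up purely for illustration.
good = [(2.0, 10.0), (1.5, 14.0), (2.5, 12.0)]
bad = [(0.5, 2.0), (1.0, -2.0), (1.5, 6.0)]


def mean(values):
    return sum(values) / len(values)


def centroid(group):
    """Group means of X and Y."""
    return mean([x for x, _ in group]), mean([y for _, y in group])


def pooled_covariance(g1, g2):
    """Pooled within-group variances and covariance: (s_xx, s_yy, s_xy)."""
    sxx = syy = sxy = 0.0
    for group in (g1, g2):
        mx, my = centroid(group)
        for x, y in group:
            sxx += (x - mx) ** 2
            syy += (y - my) ** 2
            sxy += (x - mx) * (y - my)
    dof = len(g1) + len(g2) - 2  # assumed divisor for the pooled estimate
    return sxx / dof, syy / dof, sxy / dof


def fisher_weights(g1, g2):
    """Solve (a, b)' = S^{-1} (mean difference) using the 2x2 closed form."""
    sxx, syy, sxy = pooled_covariance(g1, g2)
    (x1, y1), (x2, y2) = centroid(g1), centroid(g2)
    dx, dy = x1 - x2, y1 - y2
    det = sxx * syy - sxy * sxy
    a = (syy * dx - sxy * dy) / det
    b = (sxx * dy - sxy * dx) / det
    return a, b


a, b = fisher_weights(good, bad)
z_good = [a * x + b * y for x, y in good]
z_bad = [a * x + b * y for x, y in bad]
print(f"a = {a:.4f}, b = {b:.4f}")
print("Z (good):", [round(z, 2) for z in z_good])
print("Z (bad): ", [round(z, 2) for z in z_bad])
```

With these made-up points every Z-value in the first group lies above every Z-value in the second, which is the separation the choice of a and b is meant to achieve.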
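[Figure: group centroids $(\bar{X}_1, \bar{Y}_1)$ and $(\bar{X}_2, \bar{Y}_2)$ in the $(x, y)$ plane and their projections $\bar{Z}_1$ and $\bar{Z}_2$ on the discriminant axis, with a point $z$ between them]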
Indicators of group-memberships
Objective
• Predict/explain the different categories of amount spent on the basis of:
  • Annual family income
  • Attitude towards travel
  • Importance given to family vacation
  • Household size
  • Age of the head of household
Eigenvalues
Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
1          3.82         93.93           93.93          0.89
2          0.25         6.07            100.00         0.44
a. First 2 canonical discriminant functions were used in the analysis.
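$\lambda_i = SSB/SSW$ for the $i$-th discriminant function
Canonical correlation $= \sqrt{\dfrac{\lambda_i}{1 + \lambda_i}}$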
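The eigenvalue table can be checked with a short calculation. The sketch below is in Python and assumes the usual definitions: the percentage of variance for a function is its eigenvalue divided by the sum of all eigenvalues, and the canonical correlation is the square root of λ/(1 + λ). The function names are invented for this illustration, and the inputs are the eigenvalues as printed (rounded to two decimals), so the percentages differ slightly from the table.

```python
import math

# Eigenvalues as printed in the table (rounded to two decimals).
eigenvalues = [3.82, 0.25]


def percent_of_variance(values):
    """Share of the total eigenvalue sum carried by each function, in %."""
    total = sum(values)
    return [100.0 * v / total for v in values]


def canonical_correlation(lam):
    """Canonical correlation implied by an eigenvalue lam = SSB/SSW."""
    return math.sqrt(lam / (1.0 + lam))


percents = percent_of_variance(eigenvalues)
correlations = [canonical_correlation(v) for v in eigenvalues]

for i, (lam, pct, rc) in enumerate(zip(eigenvalues, percents, correlations), 1):
    print(f"Function {i}: eigenvalue={lam:.2f}  % of variance={pct:.2f}  "
          f"canonical correlation={rc:.2f}")
```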
Un-standardized Discriminant Function Coefficients
                              Function 1   Function 2
family income                   0.1543      -0.0620
importance family vacation     -0.0695       0.2613
travel attitude                 0.1868       0.4223
household size                 -0.1265       0.1003
age of household head           0.0593       0.0628
(Constant)                    -11.0944      -3.7916
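To show how the un-standardized coefficients are used, the following Python sketch computes both discriminant scores for one case as the constant plus the weighted sum of the raw predictor values. The dictionary keys and the function name are invented for the illustration. The sample case is the first row of the hold-out listing that appears later in these slides, on the assumption that its columns map onto the coefficient names as labelled.

```python
# Un-standardized coefficients copied from the table above.
COEFFICIENTS = {
    "function_1": {
        "family_income": 0.1543,
        "importance_family_vacation": -0.0695,
        "travel_attitude": 0.1868,
        "household_size": -0.1265,
        "age_of_household_head": 0.0593,
        "constant": -11.0944,
    },
    "function_2": {
        "family_income": -0.0620,
        "importance_family_vacation": 0.2613,
        "travel_attitude": 0.4223,
        "household_size": 0.1003,
        "age_of_household_head": 0.0628,
        "constant": -3.7916,
    },
}


def discriminant_scores(case):
    """Return {function name: constant + sum(coefficient * raw value)}."""
    scores = {}
    for name, coefs in COEFFICIENTS.items():
        score = coefs["constant"]
        for variable, value in case.items():
            score += coefs[variable] * value
        scores[name] = score
    return scores


# First row of the hold-out listing (column mapping is an assumption).
first_holdout_case = {
    "family_income": 50.8,
    "travel_attitude": 4,
    "importance_family_vacation": 7,
    "household_size": 3,
    "age_of_household_head": 45,
}

scores = discriminant_scores(first_holdout_case)
print({name: round(value, 3) for name, value in scores.items()})
```

The two scores locate the case in the plane of the two discriminant functions, which is the space in which the territorial map is drawn.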
Standardized Discriminant Function Coefficients
                              Function 1   Function 2
family income                   1.0474      -0.4208
importance family vacation     -0.1420       0.5335
travel attitude                 0.3399       0.7685
household size                 -0.1632       0.1293
age of household head           0.4947       0.5245
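library(candisc)   # plot and heplot methods for candisc objects
library(heplots)   # heplot() generic
# travel.can1 is assumed to be a candisc object fitted earlier (the fit is not shown here)
plot(travel.can1)
heplot(travel.can1, scale=6, fill=TRUE)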
Territorial Map
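[Figure: territorial map showing the regions assigned to groups 1, 2 and 3; asterisks mark the group centroids]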
prepared by S. Das
Hold-out sample
resort visit   family income   attitude travel   importance fam vac   HH size   age of head of HH   amount spent
1              50.8            4                 7                    3         45                  2
1              63.6            7                 4                    7         55                  3
1              54.0            6                 7                    4         58                  2
1              45.0            5                 4                    3         60                  2
1              68.0            6                 6                    6         46                  3
1              62.1            5                 6                    3         56                  3
2              35.0            4                 3                    4         54                  1
2              49.6            5                 3                    5         39                  1
2              39.4            6                 5                    3         44                  3
2              37.0            2                 6                    5         51                  1
2              54.5            7                 3                    3         37                  2
2              38.2            2                 2                    3         49                  1
• Principle of cross-validation (leave-one-out)
• When classifying a specific case, estimate the discriminant functions from all observations except that case
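The leave-one-out idea can be sketched as below in Python. To keep the example short it does not fit discriminant functions. As a deliberate simplification it uses a one-variable nearest-group-mean rule as a stand-in classifier, and it borrows the family income and resort visit columns of the hold-out listing only as convenient numbers. The function names are invented for this illustration.

```python
# (family income, resort visit group) pairs copied from the hold-out listing.
cases = [
    (50.8, 1), (63.6, 1), (54.0, 1), (45.0, 1), (68.0, 1), (62.1, 1),
    (35.0, 2), (49.6, 2), (39.4, 2), (37.0, 2), (54.5, 2), (38.2, 2),
]


def group_means(training):
    """Mean of the predictor within each group of the training cases."""
    sums, counts = {}, {}
    for value, group in training:
        sums[group] = sums.get(group, 0.0) + value
        counts[group] = counts.get(group, 0) + 1
    return {group: sums[group] / counts[group] for group in sums}


def classify(value, means):
    """Stand-in rule: assign the case to the group with the nearest mean."""
    return min(means, key=lambda group: abs(value - means[group]))


def leave_one_out(data):
    """Classify each case with a rule estimated from all the other cases."""
    predictions = []
    for i, (value, _) in enumerate(data):
        training = data[:i] + data[i + 1:]  # every case except case i
        predictions.append(classify(value, group_means(training)))
    return predictions


predictions = leave_one_out(cases)
hits = sum(pred == actual for pred, (_, actual) in zip(predictions, cases))
print(f"{hits} of {len(cases)} cases classified correctly")
```

The point of the design is that the case being classified never contributes to the rule that classifies it, so the resulting hit count is not inflated by fitting and testing on the same observation.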
Hold-out samples
Hit Ratio = 75%
                 Predicted group membership
Actual group     1     2     3
1                3     1     0
2                0     3     1
3                1     0     3
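The hit ratio is the share of hold-out cases on the diagonal of the classification table. The Python sketch below takes the middle row as 0, 3, 1, which is what the stated 75% and the 12 hold-out cases (four per amount-spent group) imply. The function name is invented for this illustration.

```python
# Rows = actual group, columns = predicted group (groups 1, 2, 3).
confusion = [
    [3, 1, 0],
    [0, 3, 1],
    [1, 0, 3],
]


def hit_ratio(matrix):
    """Correctly classified cases (the diagonal) divided by all cases."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


ratio = hit_ratio(confusion)
print(f"Hit ratio = {ratio:.0%}")
```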
https://www.rdocumentation.org/packages/candisc/versions/0.8-6/topics/candisc
Data
The dataset ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ was used for the
analysis.
There are 569 observations on 32 variables.
Features are computed from a digitized image of a fine needle aspirate (FNA) of a
breast mass. They describe characteristics of the cell nuclei present in the image.
The diagnostic classification of the breast mass is given as either Benign or
Malignant.
This dataset is suitable for understanding how characteristics of the FNA image of
the breast mass relate to the diagnosis of whether the mass is benign or malignant.
Data
ID number
Diagnosis (M = malignant, B = benign)
Ten real-valued features are computed for each cell nucleus; mean, standard
error and worst values are given for each of the features, thus bringing the
number of features to 30:
radius (mean of distances from center to points on the perimeter)
texture (standard deviation of gray-scale values)
perimeter
area
smoothness (local variation in radius lengths)
compactness (perimeter^2 / area - 1.0)
concavity (severity of concave portions of the contour)
concave points (number of concave portions of the contour)
symmetry
fractal dimension ("coastline approximation" - 1)
Why LDA?
LDA looks for the linear decision boundary that separates the classes most
successfully.
For example, with only two dimensions and two distinct clusters, LDA projects
the clusters down to one dimension, chosen so that they stay as far apart as possible.