Multiple-Discriminant-Analysis-1.3
Discriminant Analysis
Group 2 (May 2023)
Members: Mikhaela G. Carpio, John Gil S. Castro, Erica Joy S. Estrella, Kristianne S. Garcia
Ronald Aylmer Fisher (1936)
British statistician and geneticist who pioneered the application of statistical procedures to the design of scientific experiments. Fisher developed the original technique, known as Linear Discriminant Analysis or Fisher's Discriminant Analysis. The original linear discriminant was described as a two-class technique; the multi-class version was later generalized by C. R. Rao as Multiple Discriminant Analysis.
History
Ronald Fisher (1936) introduced discriminant analysis primarily to offer a way to classify an object into one of two distinct populations of objects.
Although discriminant analysis was later generalized to include more than two populations (e.g., Rao, 1948), its original classification purpose remained fundamental until the mid-1960s.
The application of discriminant analysis was then expanded to include additional aspects beyond classification.
Discriminant Analysis
Discriminant Analysis is a dependence-based statistical technique whose main application is predicting group membership.
The method uses a set of predictor variables (independent variables) to classify individuals or objects into one of several possible groups.
In discriminant analysis, the dependent variable is categorical and measured on a nominal scale, while the independent variables are typically on an interval or ratio scale.
There are various approaches to performing discriminant analysis, including Multiple Discriminant Analysis.
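As a minimal sketch of the idea above, the following (not from the slides; the data and group labels are made up for illustration) uses interval-scale predictors to classify objects into one of two nominal groups with scikit-learn's linear discriminant analysis:

```python
# Illustrative sketch: predicting group membership from two
# interval-scale predictors. All data here is synthetic.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Two hypothetical groups, each measured on two predictors
group0 = rng.normal(loc=[50, 70], scale=5, size=(40, 2))
group1 = rng.normal(loc=[65, 85], scale=5, size=(40, 2))
X = np.vstack([group0, group1])        # independent variables (ratio scale)
y = np.array([0] * 40 + [1] * 40)      # dependent variable (nominal groups)

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[48, 68], [66, 88]]))  # predicted group membership
```

Points near a group's typical values are assigned to that group by the fitted discriminant function.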
Multiple Discriminant Analysis
When the dependent variable is to be distinguished across more than two groups, the technique is called Multiple Discriminant Analysis (MDA).
It is also a statistical method used by investment analysts, financial planners, and advisors to assess potential investments involving numerous variables.
It is an extension of discriminant analysis that shares methodologies and principles with multivariate analysis of variance (MANOVA). MDA aims to categorize cases into three or more groups, using either continuous or dummy categorical variables as predictors.
Multiple Discriminant Analysis
MDA is also referred to as discriminant factor analysis or canonical discriminant analysis.
In two-group discriminant analysis, a single function is capable of classifying objects.
In Multiple Discriminant Analysis, several discriminant functions are used: the first function is the most relevant for distinguishing between groups, followed by the second, and so on.
The functions operate independently, meaning that the scores obtained from one function are uncorrelated with those obtained from another function.
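The point about multiple, ordered discriminant functions can be checked directly. This sketch (synthetic data, illustrative only) fits an MDA-style model on three groups: it produces two discriminant functions (one less than the number of groups), and the first explains the larger share of between-group variation:

```python
# Three groups -> two discriminant functions, ordered by importance.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(1)
centers = [[0, 0, 0], [5, 0, 0], [0, 5, 0]]   # three hypothetical group means
X = np.vstack([rng.normal(c, 1.0, size=(50, 3)) for c in centers])
y = np.repeat([0, 1, 2], 50)

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.transform(X)              # discriminant function scores per case
print(scores.shape[1])                 # number of functions: groups - 1 = 2
print(lda.explained_variance_ratio_)   # first function carries the most variance
```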
Purposes of Multiple Discriminant Analysis
SPSS COMMANDS
1. Click on ANALYZE at the SPSS menu bar.
2. Click on CLASSIFY, followed by DISCRIMINANT.
3. On the dialog box that appears, select the GROUPING VARIABLE (the dependent categorical variable in discriminant analysis) by clicking on the right arrow to transfer it from the variable list on the left to the grouping variable box on the right.
4. Define the range of values of the grouping variable by clicking on DEFINE RANGE just below the grouping variable box. Fill in the minimum and maximum values (the codes used in our problem are 1 to 3) of the variable in the box that appears, then click CONTINUE.
5. Select all the independent variables for discriminant analysis from the variable list by clicking on the arrow that transfers them to the INDEPENDENTS box on the right.
6. Just below the INDEPENDENTS box, select 'Enter independents together' if you want all the selected independent variables (those in the box) in the discriminant model. (Here you have the option to use a stepwise discriminant analysis by selecting 'Use Stepwise Method' instead of 'Enter independents together'.)
7. Click on STATISTICS on the lower part of the main dialog box. This opens a smaller dialog box. Under DESCRIPTIVES, click on MEANS, UNIVARIATE ANOVAS and BOX'S M. Under FUNCTION COEFFICIENTS, choose FISHER'S and UNSTANDARDIZED; these are used to classify a new object in a discriminant analysis. Under MATRICES, click on WITHIN-GROUPS CORRELATION. Click on CONTINUE to return to the main dialog box.
8. Click on CLASSIFY on the lower part of the main dialog box. Select SUMMARY TABLE and LEAVE-ONE-OUT CLASSIFICATION under DISPLAY in the smaller dialog box that appears. This gives you the classification table (also called the confusion matrix) that judges the accuracy of the discriminant model when applied to the input data points. Click on CONTINUE to return to the main dialog box.
10. Click OK to get the discriminant analysis output.
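For readers without SPSS, the same run can be approximated in Python. This is a hedged sketch on synthetic data, not the slides' dataset; it mirrors the grouping variable coded 1 to 3 (step 4) and the leave-one-out classification table (step 8):

```python
# Sketch of an SPSS-style discriminant run in scikit-learn (synthetic data).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import LeaveOneOut, cross_val_predict

rng = np.random.default_rng(2)
# Three groups (coded 1 to 3) with two predictors each
X = np.vstack([rng.normal(m, 1.0, size=(30, 2))
               for m in ([0, 0], [4, 0], [0, 4])])
y = np.repeat([1, 2, 3], 30)

lda = LinearDiscriminantAnalysis()
# Leave-one-out classification, mirroring SPSS's LEAVE-ONE-OUT option
pred = cross_val_predict(lda, X, y, cv=LeaveOneOut())
print(confusion_matrix(y, pred))   # the classification (confusion) table
```

The diagonal of the printed table counts correctly classified cases, exactly what SPSS's summary table reports.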
Example
A researcher wanted to study the relationship of blood pressure (BP) status (Normal, High) with four other variables: Age, Weight, Body Surface Area (BSA) and Pulse. He recruited 110 adults and recorded their BP, Age, Weight, BSA and Pulse values. He recoded the SBP values into Normal (<140) and High (≥140).
The p-values for all variables except Pulse were below 0.05. Thus, Age, Weight and BSA may be significant determinants of blood pressure status.
Output
Box's Test of Equality of Covariance Matrices:
Classification (Age = 60, Weight = 90, BSA = 2.0, Pulse = 70):
F = -22.368 + 0.195(60) + 0.352(90) - 3.842(2.0) - 0.018(70) = 12.03 > 0.511, so the case is classified as High BP.
Output
Results from Stepwise Method (Wilks' Lambda):
F = -23.894 + 0.197(Age) + 0.226(Weight)
Output
Centroids and Classification results:
Classification (Age = 60, Weight = 90, BSA = 2.0, Pulse = 70, Stress = 80):
F = -23.894 + 0.197(60) + 0.226(90) = 8.29 > 0.501, so the case is classified as High BP.
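The cutoff comparison above is plain arithmetic and can be reproduced directly. The coefficients and the 0.501 cutoff are taken from the slide's stepwise output; with the rounded coefficients shown, the score evaluates to about 8.27 (the slide's 8.29 presumably comes from unrounded coefficients):

```python
# Discriminant score from the slide's stepwise function:
# F = -23.894 + 0.197*Age + 0.226*Weight, cutoff 0.501.
def discriminant_score(age, weight):
    return -23.894 + 0.197 * age + 0.226 * weight

score = discriminant_score(age=60, weight=90)
print(round(score, 2))                          # about 8.27 with rounded coefficients
print("High BP" if score > 0.501 else "Normal BP")
```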
Key Concepts
Discriminant Function - the number of functions computed is one less than the number of groups in the dependent variable.
Discriminant Coefficients - partial coefficients that reflect the unique contribution of each variable to the classification of the groups in the dependent variable.
Group Centroids - the mean discriminant scores for each group in the dependent variable on each of the discriminant functions.
Eigenvalue - also called the characteristic root, it is the ratio between the explained and unexplained variation in a model. For a good model the eigenvalue must be more than one.
Key Concepts
Canonical Correlation - a measure of the association between the groups in the dependent variable and the discriminant function.
Wilks' Lambda - used to test the significance of the discriminant functions.
Bhattacharya, A., & Dutta, A. (2020). Predicting a Model for the Financial Risk Tolerance of Retail Investors of Durgapur City on Their Demographic Factors Using Multiple Discriminant Analysis. In Smart Innovation, Systems and Technologies. Springer Nature, 685–692. https://doi.org/10.1007/978-981-13-9282-5_65
Journal Article Critique
Title: "Predicting a Model for the Financial Risk Tolerance of Retail Investors of Durgapur City on Their Demographic Factors Using Multiple Discriminant Analysis"