
Multiple Discriminant Analysis
Group 2, May 2023

Members: Mikhaela G. Carpio, John Gil S. Castro, Erica Joy S. Estrella, Kristianne S. Garcia
Ronald Aylmer Fisher (1936)
British statistician and geneticist who pioneered
the application of statistical procedures to the
design of scientific experiments. He also developed
the original technique, known as Linear Discriminant
Analysis or Fisher's Discriminant Analysis, which was
described as a two-class technique. The multi-class
version was later generalized by C. R. Rao as
Multiple Discriminant Analysis.
History
Ronald Fisher (1936) introduced discriminant analysis primarily to
offer a way to classify an object into one of two distinct populations
of objects.
The fundamental purpose of conducting discriminant analysis is to
enable the categorization of an object based on which of two clearly
defined populations it belongs to.
Although Discriminant Analysis was later generalized to include more
than two populations (e.g., Rao, 1948), its original purpose remained
fundamental until the mid-1960s.
After that, the application of Discriminant Analysis was expanded to
include additional aspects beyond classification.
Discriminant Analysis
Discriminant Analysis is a dependence technique: it relates a
dependent variable to a set of independent variables. Its main
application is in predicting group membership.
The method involves using a set of predictor variables
(independent variables) to classify individuals or objects into one
of several possible groups.
In Discriminant Analysis, the dependent variable is categorical and
measured on a nominal scale, while the independent variables are
typically interval or ratio scale.
There are various approaches to performing Discriminant Analysis,
including Multiple Discriminant Analysis.
Multiple Discriminant Analysis
When the dependent variable distinguishes more than two groups,
the technique is called Multiple Discriminant Analysis (MDA).
It is also a statistical method utilized by investment analysts,
financial planners, and advisors to assess potential investments
involving numerous variables.
It is an expansion of discriminant analysis that shares
methodologies and principles with multivariate analysis of variance
(MANOVA). MDA aims to categorize cases into three or more
groups, utilizing either continuous or dummy categorical variables
as predictors.
MDA is also referred to as discriminant factor analysis or canonical
discriminant analysis.
In Two-Group Discriminant Analysis, a single function is capable of
classifying objects.
In Multiple Discriminant Analysis, several discriminant functions
are used, with the first function being the most relevant for
distinguishing between groups, followed by the second, and so
on.
The functions are uncorrelated with one another, meaning that the
scores obtained from one function are unrelated to those
obtained from another function.
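As a rough illustration of how many functions to expect, the sketch below uses the standard rule that the number of discriminant functions is the smaller of (number of groups - 1) and the number of predictors. The helper name is hypothetical and is not part of any statistics package.

```python
def n_discriminant_functions(n_groups: int, n_predictors: int) -> int:
    """Maximum number of discriminant functions: min(groups - 1, predictors)."""
    if n_groups < 2 or n_predictors < 1:
        raise ValueError("need at least 2 groups and 1 predictor")
    return min(n_groups - 1, n_predictors)


# Two groups always give a single function; three groups with
# four predictors give two functions.
print(n_discriminant_functions(2, 4))  # 1
print(n_discriminant_functions(3, 4))  # 2
```

With two groups the result is always 1, which matches the single function used in Two-Group Discriminant Analysis.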
Purposes of Multiple Discriminant Analysis

To investigate differences among groups.
To determine the most parsimonious way to distinguish
among groups.
To discard variables that contribute little to group
distinction.
To classify cases into groups.
To test theory by observing whether cases are classified as
predicted.
Assumptions
Discriminant analysis is subject to certain assumptions. It is
sensitive to outliers, and it requires the smallest group size to
be larger than the number of predictor variables.
Non-Multicollinearity
Multivariate Normality
Independence of Observations
Homoscedasticity
No Outliers
Adequate Sample Size
Note: DA is fairly robust to violations of most of these
assumptions, but it is highly sensitive to violations of
Multivariate Normality and to Outliers.
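The sample-size requirement can be checked before running the analysis. The following is a minimal sketch in plain Python. The function name and the group codes are hypothetical and serve only as an illustration.

```python
from collections import Counter


def smallest_group_ok(group_labels, n_predictors):
    """Return (passes, smallest_size): the rule passes when the smallest
    group has more cases than there are predictor variables."""
    counts = Counter(group_labels)
    smallest = min(counts.values())
    return smallest > n_predictors, smallest


# Hypothetical grouping variable coded 1 to 3 (illustration only).
labels = [1] * 12 + [2] * 9 + [3] * 4

print(smallest_group_ok(labels, n_predictors=3))  # (True, 4)
print(smallest_group_ok(labels, n_predictors=5))  # (False, 4)
```

In the second call the smallest group has 4 cases but there are 5 predictors, so the rule is violated.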
SPSS COMMANDS
To obtain the output for a Discriminant Analysis problem in SPSS, follow the steps
outlined below after entering the input data along with the variable and value
labels in the file.

1. Click on ANALYZE at the SPSS menu bar.
2. Click on CLASSIFY, followed by DISCRIMINANT.
3. In the dialog box that appears, select the GROUPING VARIABLE (the dependent
categorical variable in discriminant analysis) by clicking on the right arrow to transfer it
from the variable list on the left to the grouping variable box on the right.
4. Define the range of values of the grouping variable by clicking on DEFINE RANGE just
below the grouping variable box. Fill in the minimum and maximum values of the variable
(the codes used in our problem are 1 to 3) in the box that appears. Then click
5. Select all the independent variables for discriminant analysis from the
variable list by clicking on the arrow that transfers them to the
INDEPENDENTS box on the right.
6. Just below the INDEPENDENTS box, select 'Enter independents together' if you want all
the selected independent variables (that are in the box) in the discriminant model. (Here
you have the option to run a STEPWISE discriminant analysis by selecting 'Use Stepwise
Method' instead of 'Enter independents together'.)
7. Click on STATISTICS on the lower part of the main dialog box. This opens up a smaller
dialog box. Under DESCRIPTIVES, click on MEANS, UNIVARIATE ANOVAS and BOX'S M.
Under the title FUNCTION COEFFICIENTS, choose FISHER'S and UNSTANDARDIZED.
These are used to classify a new object in a discriminant analysis. Under MATRICES, click
on WITHIN GROUP CORRELATION. Click on CONTINUE to return to the main dialog box.
8. Click on CLASSIFY on the lower part of the main dialog box. Select SUMMARY TABLE
and LEAVE-ONE-OUT CLASSIFICATION under the heading DISPLAY in the smaller dialog
box that appears. This gives you the classification table (also called the confusion matrix)
that judges the accuracy of the discriminant model when applied to the input data points.
Click on CONTINUE to return to the main dialog box.
9. Click on SAVE and then select PREDICTED GROUP MEMBERSHIP and
DISCRIMINANT SCORES.
10. Click OK to get the discriminant analysis output.
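The classification table requested in step 8 is a cross-tabulation of observed against predicted group codes. The sketch below shows how such a table and the proportion of correctly classified cases could be computed by hand. The function names and the data are hypothetical; this is not SPSS's own routine.

```python
from collections import Counter


def classification_table(observed, predicted):
    """Cross-tabulate observed (rows) against predicted (columns) group codes."""
    groups = sorted(set(observed) | set(predicted))
    counts = Counter(zip(observed, predicted))
    table = [[counts[(o, p)] for p in groups] for o in groups]
    return groups, table


def hit_ratio(observed, predicted):
    """Proportion of cases whose predicted group equals the observed group."""
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


# Hypothetical group codes (1 to 3), for illustration only.
observed = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3]
predicted = [1, 1, 2, 2, 2, 3, 3, 3, 3, 1]

groups, table = classification_table(observed, predicted)
for group, row in zip(groups, table):
    print(group, row)
print(f"hit ratio = {hit_ratio(observed, predicted):.2f}")
```

The diagonal of the table holds the correctly classified cases; here 7 of 10 cases fall on the diagonal, so the hit ratio is 0.70.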
Example
A researcher wanted to study the relationship of blood pressure (BP) status
(Normal, High) with four other variables: Age, Weight, Body Surface Area (BSA)
and Pulse. He recruited 110 adults and recorded their BP, Age, Weight, BSA and
Pulse values. He recoded the SBP values into: Normal (<140) and High (> 140).

Objective: To identify the significant determinants of BP status among Age,
Weight, BSA and Pulse.
Output
Descriptive statistics and test of means:

The p-values for all variables except Pulse are below 0.05. Thus, Age, Weight and
BSA may be significant determinants of Blood Pressure status.
Box's Test of Equality of Covariance Matrices:

The p-value for Box's M is > 0.05, so equality of the variance-covariance
matrices can be assumed.
Summary of Canonical Discriminant Functions:

There are two groups; therefore, the number of functions = 1.
The eigenvalue is 1.021 (> 1). Canonical correlation, rc = 0.711 (> 0.35).
Wilks' Lambda = 0.495, p-value < 0.001. Thus, Function 1 explains the
variation well.
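The three summary statistics are linked. For a single function, the canonical correlation is sqrt(eigenvalue / (1 + eigenvalue)), and Wilks' Lambda is the product of 1 / (1 + eigenvalue) over the functions. The sketch below reproduces the reported figures from the eigenvalue alone. The function names are hypothetical.

```python
import math


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by one function's eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def wilks_lambda(eigenvalues):
    """Wilks' Lambda as the product of 1 / (1 + eigenvalue) over the functions."""
    result = 1.0
    for value in eigenvalues:
        result *= 1.0 / (1.0 + value)
    return result


eigenvalue = 1.021  # eigenvalue reported for Function 1
print(round(canonical_correlation(eigenvalue), 3))  # 0.711
print(round(wilks_lambda([eigenvalue]), 3))         # 0.495
```

Both values agree with the output, which is a quick consistency check on the reported figures.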
The function:

Correlation between the determinants and F

The discriminant function:

F = -22.368 + 0.195(Age) + 0.352(Weight) - 3.842(BSA) - 0.018(Pulse)
Centroids:

Between -0.613 and 1.635, the midpoint is 0.511.

Classification:
Age = 60, Weight = 90, BSA = 2.0, Pulse = 70
F = -22.368 + 0.195(60) + 0.352(90) - 3.842(2.0) - 0.018(70) = 12.07 > 0.511
High BP
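The classification step can be sketched in code. The coefficients and centroids are the rounded values printed above. The assignment of the centroid 1.635 to the High group is inferred from the rule "F above the midpoint means High BP", and the function names are hypothetical. With these rounded coefficients the score comes to about 12.07.

```python
# Rounded unstandardized coefficients from the full (enter) model above.
INTERCEPT = -22.368
COEFFS = {"Age": 0.195, "Weight": 0.352, "BSA": -3.842, "Pulse": -0.018}

# Group centroids; which group each belongs to is inferred from the slides.
CENTROIDS = {"Normal": -0.613, "High": 1.635}


def discriminant_score(case):
    """Evaluate F for one case given as a dict of predictor values."""
    return INTERCEPT + sum(COEFFS[name] * case[name] for name in COEFFS)


def classify(score):
    """Assign the group using the midpoint between the two centroids."""
    cutoff = (CENTROIDS["Normal"] + CENTROIDS["High"]) / 2
    return "High" if score > cutoff else "Normal"


case = {"Age": 60, "Weight": 90, "BSA": 2.0, "Pulse": 70}
score = discriminant_score(case)
print(round(score, 2), classify(score))
```

The midpoint cutoff follows the slides. It treats the two groups alike and does not weight the cutoff by group size.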
Results from the Stepwise Method (Method: Wilks' Lambda)

Summary of Canonical Discriminant Functions

There are two groups; therefore, the number of functions = 1.
The eigenvalue is 0.981 (< 1). Canonical correlation, rc = 0.704 (> 0.35).
Wilks' Lambda = 0.505, p-value < 0.001. Thus, Function 1 explains the
variation well.
The function:

F = -23.894 + 0.197(Age) + 0.226(Weight)
Centroids and Classification results:

Between -0.601 and 1.603, the midpoint is 0.501.

Classification:
Age = 60, Weight = 90, BSA = 2.0, Pulse = 70, Stress = 80
F = -23.894 + 0.197(60) + 0.226(90) = 8.27 > 0.501
High BP
Key Concepts
Discriminant Function - the number of functions computed is one less
than the number of groups in the dependent variable (or equal to the
number of predictors, if that is smaller).
Discriminant Coefficients - partial coefficients that reflect the unique
contribution of each variable to the classification of the groups in the
dependent variable.
Group Centroids - the mean discriminant scores for each group in the
dependent variable on each of the discriminant functions.
Eigenvalue - also called the characteristic root; the ratio between the
explained and unexplained variation in a model. For a good model, the
eigenvalue should be more than one.
Canonical Correlation - a measure of the association between the
groups in the dependent variable and the discriminant function.
Wilks' Lambda - used to test the significance of the discriminant
functions.
Classification Matrix - a simple cross-tabulation of the observed and
predicted group memberships.
Box's M - tests the assumption of equality of variance-covariance
matrices across the groups.
Sample Size - as a rule, the size of the smallest group should exceed
the number of independent variables.
Journal

Bhattacharya, A., & Dutta, A. (2020). Predicting a Model for the Financial Risk Tolerance
of Retail Investors of Durgapur City on Their Demographic Factors Using Multiple
Discriminant Analysis. In Smart Innovation, Systems and Technologies (pp. 685-692).
Springer Nature. https://doi.org/10.1007/978-981-13-9282-5_65
Journal Article Critique
Title: "Predicting a Model for the Financial Risk Tolerance of
Retail Investors of Durgapur City on Their Demographic Factors
Using Multiple Discriminant Analysis"

The title is not concise, but it summarizes the article. It is
long for a title, but it uses strong wording that draws the
reader's attention.
The title is informative: it communicates both the main
concept of the study and the specific location that the
study covers.
Abstract:

The researchers used simple writing that readers can
readily understand.
Their informative abstract presented the main arguments,
important results, and supporting evidence clearly and
concisely.
The complete article thoroughly presents and explains all
key points.
Introduction:

The introduction written by Amrita Bhattacharya and
Avijan Dutta incorporates the latest information and
understanding in the field.
It identifies the primary topic and its importance to the
study.
The introduction of the study is too short. It would be
helpful to include more details to assist the reader in
gaining a more thorough comprehension of the research.
Literature Review:

The extensive research conducted by the researchers indicates their
knowledge and understanding of the topic and the relevant literature.
The literature review was comprehensive, covering a wide range
of perspectives.
The researchers grouped the literature and compared and
contrasted the varying opinions of different authors on certain
topics.
However, the review lacks literature on some other aspects of the
study, and additional sources should be found for those.
Research Methodology:

The section is well organized, which avoids confusion while reading.
The authors used quantitative research, which quantifies data and
generalizes results from a sample to the population of interest.
The questionnaire is one generally used by financial advisors, and
its validity and reliability had been established.
The researchers applied the Multiple Discriminant Analysis approach
appropriately.
The variables used are identified in the given table.
Result:

All the statements are supported by evidence from the
statistical output.
The summarized results required for the study and all the
necessary tables are displayed in the study.
The process of obtaining the necessary data for the study was
thoroughly discussed.
The discussion is clear, so readers can easily
understand the results.
The results obtained using multiple discriminant analysis were
discussed thoroughly and in detail.
Conclusion:

The findings of the research are clearly explained in the
conclusion, especially how the multiple discriminant
analysis approach predicts a model for the financial risk
tolerance behavior of the retail investors living in the city of
Durgapur.
We recommend gathering additional information to
support the study; expanding it further would have helped
to attain better results. Overall, the study
conducted was commendable.
