Download as pdf or txt
Download as pdf or txt
You are on page 1of 18

STATISTICAL METHODS FOR MARKET

RESEARCH
Chapter 23
Discriminant and logit analysis

Discriminant analysis is used to


estimate the relationship between
a categorical dependent variable
and a set of interval-scaled
independent variables.
Logit Analysis
We will focus on: (Chapter 23 textbook - from page 696)

• describe the binary logit model and its advantages over discriminant and regression analysis.
The logit analysis - overview

Regression analysis enables you to characterise the relationship between a


response variable and one or more predictor variables.
• In linear regression, the response variable is continuous.
• In logistic regression, the response variable is categorical.

4
The logit analysis - overview
• If the response variable is
dichotomous (binary), the appropriate
logistic regression model is the binary
logistic model.
If the response variable has more than
two categories, then:
• If the response variable is nominal,
you fit a nominal logistic regression
model.
• If the response variable is ordinal, you
fit an ordinal logistic regression
model.

5
Binary logit model
The binary logit model commonly deals with the issue of how
likely an observation is to belong to each group. It estimates the
probability of an observation belonging to a particular group.

• Discriminant analysis deals with the issue of which group an observation is likely
to belong to.
• On the other hand, the binary logit model commonly deals with the issue of how
likely an observation is to belong to each group.
• Logit model falls somewhere between regression and discriminant analysis in
application. We can estimate the probability of a binary event taking place using
the binary logit model, also called logistic regression.

6
Why not conduct a standard regression
analysis?
• Arbitrary nature of the coding
• Predicted values will take on values which have no intrinsic meaning with
regards to your response variable
• Unable to assume normality and a constant variance for the error term when
the response variable has only two values.
• Even if we build a linear probability model this can take any value (not bounded
between 0 and 1)
• Also, can you assume a linear relationship between X and the probability?
• Hence, standard regression would not work, even if we try to model just the
probability of one outcome

7
Lo gist ic regressio n cu rv e a n d lo git t ra n sfo rm a t io n

Logistic regression models transform


probabilities called logits, defined as:

𝜋
𝑙𝑜𝑔𝑖𝑡(𝜋 ) = 𝑙𝑛
1−𝜋
where:
• 𝑖 – is the index of a case (observation)
• 𝜋 - is the probability the event (a sale,
for example) occurs in the 𝑖th case
• ln is the natural log (to the base e).
Relationship between a continuous predictor and
the probability of an event or outcome.

A logistic regression model applies a logit


transformation to the probabilities 8
Conducting binary logit analysis

9
Conducting binary logit analysis

• The assumption in logistic regression is that the logit transformation of the


probabilities results in a linear relationship with the predictor variables.
• If the thoughts about the nature of the direct relationship between X and π are
correct, then the logit will have a linear relationship with X.
• In other words, a linear function of X can be used to model the logit. In that
way, you can indirectly model the probability.

10
Binary logit model
Logistic regression models transform probabilities called logits, defined as:

𝜋
logit 𝜋 = 𝑙𝑛 = 𝛽 +𝛽 𝑋 +⋯+𝛽 𝑋
1−𝜋
where:
• logit 𝜋 – the logit of the probability of the event
• 𝛽 - the intercept of the regression equation
• 𝛽 - the parameter estimate of the 𝑘th predictor variable, 𝑋 .

Unlike linear regression, the logit is not normally distributed and the variance
is not constant.
Logistic regression usually requires a more complex estimation method called
maximum likelihood to estimate the parameters than linear regression 11
Binary logit model – odds ratio
An odds ratio indicates how much more likely, with respect to odds, a
particular event occurs in one group relative to its occurrence in another
group. We define odds as:

𝜋
𝑜𝑑𝑑𝑠 =
1−𝜋

12
Binary logit model – odds ratio
The odds ratio shows the strength of the association between the predictor
variable and the outcome variable.
If the odds ratio is 1, then there is no association between the predictor variable
and the outcome.
If the odds ratio is greater than 1, then Group B is more likely to have the
outcome.
If the odds ratio is less than 1, then Group A is more likely to have the outcome.

13
Binary logit model - example
We illustrate the logit model by analysing the data of Table 23.6.
This table gives the data for 30 participants, 15 of whom are brand loyal (indicated by 1) and
15 of whom are not (indicated by 0). We also measure attitude towards the brand (Brand),
attitude towards the product category (Product) and attitude towards shopping (Shopping),
all on a 1 (unfavourable) to 7 (favourable) scale.
The objective is to estimate the probability of a consumer being brand loyal as
a function of attitude towards the brand, the product category and shopping.

14
Binary logit model - example
A multiple regression line top estimate probability would give us

𝑝 = −0.684 + 0.183 𝐵𝑟𝑎𝑛𝑑 + 0.020 𝑃𝑟𝑜𝑑𝑢𝑡 + 0.074 𝑆ℎ𝑜𝑝𝑝𝑖𝑛𝑔

The limitations are clear as p needs to be between 0 and 1. This is overcome


by logistic regression.

15
Binary logit model - example

p-value for
of

/ each one

variables
This
of popu level ,

isthe only
>
-
-ok suitable van
to
consider
Dr
>
- no

>
- no

Classification table that reveals that 24 of the 30 cases, i.e. 80%, are correctly classified. The significance of the
- -

estimated coefficients is based on Wald’s statistic. We note that only attitude towards the brand is significant in
explaining brand loyalty.
Thus, a manager seeking to increase brand loyalty should focus on fostering a more positive attitude towards
the brand and not worry about attitude towards the product category or attitude towards shopping. 16
Binary logit model – SPSS example
The data file Honours.sav contains data on the following variables for a
sample of 200 students.
Gender – ‘0 = male’ and ‘1 = female’.
Reading – metric of a student’s reading ability.
Science – metric of a student’s scientific ability.
Honours – ‘0 = no honours’ and ‘1 = honours’.
A researcher wants to use the binary logit model to build a model to help
investigate how the odds that a student achieves honours is affected by
certain predictors. Perform a binary logistic regression in SPSS to assist the
researcher.
Use Analyze > Regression > Binary Logistic…. Make ‘Honours’ the (binary)
dependent variable and the others as ‘Covariates’. Click ‘Options…’ and select
‘CI for exp(B)’.
17

You might also like