CH 5 Limited dependent variable models jan 2023

Categorical dependent variable models
Introduction
• Standard linear regression models are applied when the
dependent variable is continuous, such as asset
returns, rental value of properties, saving, expenditure,
output, etc.
• But there are many situations in which the dependent
variable in a regression equation simply represents a
discrete choice assuming only a limited number of
values.
• Models involving dependent variables of this kind are
called categorical (limited, discrete or qualitative)
dependent variable models.
• In discrete choice models, the values that the dependent
variables may take are limited to certain integers (e.g. 0,
1, 2, 3, and 4) or even binary (only 0 or 1).
• Throughout our discussion we shall restrict ourselves to
cases of qualitative choice where the set of alternatives
is binary.
• For the sake of convenience, the dependent variable is
given a value of 0 or 1.
• For example, when modelling the success or failure of
companies, the independent variables (that is, indicators
of financial status) may be:
– working capital to total assets ratio
– retained earnings to total assets ratio
– earnings before interest and taxes to total assets ratio
– sales to total assets ratio.
• Thus, we would predict the probability of failure of
companies on the basis of these explanatory variables.
The linear probability model (LPM)
• The problem with this model is that for any individual
whose income is more than birr 60,000, the model-predicted
probability of defaulting is negative.
• For instance, with income measured in thousands of birr,
the predicted probability of defaulting of an individual
whose income is birr 80,000 is:
0.15 – 0.0025(80) = -0.05

• Clearly, such predictions cannot be allowed to stand
since we know that the probability of an event is
always a number between 0 and 1 (inclusive), that is,
probabilities can never be negative.
• The LPM can also produce probabilities that are
greater than one.
• Thus, the use of the LPM when the dependent variable
is categorical may lead to nonsense probabilities.
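The arithmetic above can be checked with a short Python sketch. It assumes the fitted LPM implied by the worked example, namely a predicted probability of 0.15 – 0.0025 × income with income measured in thousands of birr. The function name is illustrative only.

```python
def lpm_default_probability(income_thousand_birr):
    """Fitted LPM implied by the worked example (income in thousands of birr)."""
    return 0.15 - 0.0025 * income_thousand_birr


for income in (20, 40, 80):
    p = lpm_default_probability(income)
    # A valid probability must lie in [0, 1]; the LPM does not enforce this.
    status = "valid" if 0.0 <= p <= 1.0 else "outside [0, 1]"
    print(f"income = birr {income},000: predicted probability = {p:.4f} ({status})")
```

Nothing in the linear form stops the fitted line from crossing 0 (or 1), which is why incomes above birr 60,000 give negative values here.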
The logit model
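A minimal Python sketch of how the logit model avoids nonsense probabilities: it passes a linear index through the logistic function, which always returns a value strictly between 0 and 1. The intercept and slope below are hypothetical values chosen for illustration, not estimates from the chapter's data.

```python
import math


def logistic(z):
    """Logistic function: maps any real index z to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical coefficients (assumed for illustration only).
intercept, slope = 1.5, -0.05

for income in (20, 80, 200):  # income in thousands of birr
    z = intercept + slope * income
    print(f"income = {income}: index = {z:.2f}, P(default) = {logistic(z):.4f}")
```

However large or small the index becomes, the predicted probability stays inside the unit interval.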
Illustration
EViews procedure
First, import the data from Excel into EViews.
Click on Quick and then select Estimate Equation…
Under Method: in the Equation Estimation pop-up window,
select Binary – Binary Choice (Logit, Probit, Extreme Value).
From Binary estimation method:, select Logit.
Click on OK to view the results of the fitted multiple logistic
regression model.
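For readers who want to see what such an estimation does, the sketch below fits a logit model with a single regressor by maximum likelihood using Newton-Raphson in plain Python. It is not the EViews procedure and does not use the chapter's data set: the function name, the one-regressor setup and the data are all assumptions made for illustration.

```python
import math


def fit_logit(x, y, iterations=25):
    """Maximum-likelihood fit of P(y=1) = 1 / (1 + exp(-(b0 + b1*x))).

    Uses Newton-Raphson on the log-likelihood with one regressor.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # score (gradient) terms
        h00 = h01 = h11 = 0.0  # information matrix terms
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve (information matrix) * step = score for the 2x2 case.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic data, invented for illustration: income in thousands of birr and
# a default indicator (1 = defaulted, 0 = did not default).
income = [10, 20, 30, 40, 50, 60, 70, 80]
default = [1, 1, 0, 1, 0, 1, 0, 0]

b0, b1 = fit_logit(income, default)
print(f"intercept = {b0:.4f}, income coefficient = {b1:.4f}")
```

With these made-up observations the income coefficient comes out negative, matching the direction discussed in the interpretation of results.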
• Is the fitted model adequate (is the model a good fit to
the data)?
• To answer this we can use the Hosmer and Lemeshow
Test.
• The null hypothesis of this test is that the model fits
the data well.
• If the null is rejected, we have to re-specify the model.

In the equation window, click on View and select
Goodness-of-Fit Test (Hosmer-Lemeshow).
Click on OK in the new dialog box to view the results of the Hosmer and
Lemeshow Test.
The p-value of the H-L Statistic (0.3867) is greater than 0.05.
Thus, we do not reject the null hypothesis at the 5% level of significance
and conclude that the model fits the data adequately.
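As a rough sketch of what the test computes, the Python function below calculates a Hosmer-Lemeshow-type statistic: it sorts observations by fitted probability, splits them into groups, and compares observed with expected counts in each group. The grouping rule, function names and toy data are assumptions for illustration. EViews' own grouping options may differ, and the p-value (obtained from a chi-square distribution) is not computed here.

```python
def hosmer_lemeshow_statistic(probs, outcomes, groups=10):
    """Sum over groups of (observed - expected)^2 / expected,
    for both events (y = 1) and non-events (y = 0).

    Assumes fitted probabilities lie strictly between 0 and 1.
    """
    pairs = sorted(zip(probs, outcomes))
    n = len(pairs)
    stat = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        expected_1 = sum(p for p, _ in chunk)
        observed_1 = sum(y for _, y in chunk)
        expected_0 = len(chunk) - expected_1
        observed_0 = len(chunk) - observed_1
        stat += (observed_1 - expected_1) ** 2 / expected_1
        stat += (observed_0 - expected_0) ** 2 / expected_0
    return stat


def fits_adequately(p_value, alpha=0.05):
    """Decision rule: do not reject the null of adequate fit when p-value > alpha."""
    return p_value > alpha


# Toy fitted probabilities and outcomes (invented for illustration).
toy_probs = [0.2, 0.2, 0.8, 0.8]
toy_outcomes = [0, 0, 1, 1]
print(hosmer_lemeshow_statistic(toy_probs, toy_outcomes, groups=2))

# Decision for the p-value reported in the text.
print(fits_adequately(0.3867))
```

A small statistic (and hence a large p-value) means the observed and expected counts agree closely across the groups.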
• If the coefficient of a qualitative explanatory variable is:
– negative, then the odds (likelihood) of defaulting are higher
for the reference category (the category that is assigned the
value zero) than for the non-reference category.
– positive, then the odds of defaulting are higher for the
non-reference category than for the reference
category.
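One common way to read such a coefficient is through the odds ratio, exp(coefficient), which compares the odds of defaulting in the non-reference category with those in the reference category. The sketch below uses hypothetical coefficient values, not the chapter's estimates.

```python
import math


def odds_ratio(coefficient):
    """Odds of the event for the non-reference category relative to the
    reference category, holding other covariates fixed."""
    return math.exp(coefficient)


for beta in (-0.7, 0.0, 0.7):  # hypothetical dummy-variable coefficients
    ratio = odds_ratio(beta)
    if ratio < 1:
        verdict = "odds higher for the reference category"
    elif ratio > 1:
        verdict = "odds higher for the non-reference category"
    else:
        verdict = "odds equal in both categories"
    print(f"coefficient = {beta:+.1f}: odds ratio = {ratio:.3f} ({verdict})")
```

An odds ratio below 1 corresponds to a negative coefficient and an odds ratio above 1 to a positive one.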
Interpretation of results
Debt-to-Income ratio
• Debt-to-income ratio is a quantitative explanatory
variable.
• The coefficient of debt-to-income ratio is positive.
• This implies that an increase in the debt-to-income ratio
increases the probability of defaulting, keeping all
other covariates fixed.
Household income
• Household income is a quantitative variable.
• The coefficient of income is negative.
• Thus, an increase in income decreases the probability
of defaulting, keeping all other covariates fixed.
Number of residents in the household
• This variable is again quantitative.
• Since the coefficient is positive, an increase in the number
of residents leads to an increase in the probability of
defaulting on a bank loan, keeping all other covariates fixed.
Level of education
• The positive coefficient implies that an increase in the
level of education of an individual increases the
probability of his/her defaulting.
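The sign rule used in these interpretations can be illustrated with a small Python sketch: with the other covariates held fixed, raising a covariate by one unit moves the predicted probability in the direction of its coefficient's sign. The baseline index and the coefficients are hypothetical, not the estimates reported by EViews.

```python
import math


def predicted_probability(index):
    """Logit-model probability for a given linear index."""
    return 1.0 / (1.0 + math.exp(-index))


def change_from_unit_increase(baseline_index, coefficient):
    """Change in predicted probability when one covariate rises by one unit,
    with the rest of the index held fixed."""
    return (predicted_probability(baseline_index + coefficient)
            - predicted_probability(baseline_index))


baseline = -1.0  # hypothetical index for a reference individual
for name, beta in (("debt-to-income ratio", 0.10), ("household income", -0.03)):
    delta = change_from_unit_increase(baseline, beta)
    print(f"{name}: coefficient {beta:+.2f} -> change in probability {delta:+.4f}")
```

The size of the change depends on the baseline index, but its direction always matches the sign of the coefficient.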
