Chapter 4
4. Introduction
This chapter describes models in which the dependent variable Y is dichotomous. Such models are called
limited dependent variable models, or qualitative or categorical variable models. We concentrate on the
binary case, where Yi can take only two values. One example is a model of women's labor force participation
(LFP). The dependent variable in this case takes the value one (1) if the woman participates in the labor
force and zero (0) if she does not.
Various explanatory variables could be included: both continuous variables such as age and dichotomous
variables such as gender or educational achievement. Other examples are models of the determinants of
willingness to pay (WTP) for pure water supply in the rural areas of Chencha Woreda in SNNP Regional State
and determinants of house ownership by households in Arba Minch town. In both cases the dependent variables
are dichotomous/dummy/binary/qualitative/limited.
Models in which the response variable (Y) is dummy, categorical, limited, binary or qualitative are
commonly used in social science and medical research, and they pose interesting estimation and interpretation
challenges. The most commonly used binary dependent variable models are the linear probability model (LPM),
the Logit model and the Probit model.
The linear probability model (LPM) simply applies the ordinary least squares (OLS) method, discussed in
chapters 2 and 3, to a dichotomous dependent variable. Assume that we want to study the determinants of labor
force participation (LFP) of adult men in a particular town. Since the dependent variable, labor force
participation, is a nominal variable, it takes the value 1 (participates) or 0 (does not participate).
Suppose we routinely apply OLS to the linear model
$$Y_i = X_i'\beta + u_i$$
The conditional expectation of the dependent variable is then equal to the probability that the event occurs,
given the values of the explanatory variables:
$$E(Y_i \mid X_i) = \Pr(Y_i = 1 \mid X_i) = X_i'\beta$$
From these equations, the probability distributions of the dependent variable and the error term are as
follows:

Value of Yi    Value of ui    Probability of ui    Probability of Yi
1              1 - Xi'β       Xi'β                 Xi'β
0              -Xi'β          1 - Xi'β             1 - Xi'β
Total                         1                    1
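As a check on the table above, the mean and variance of the error term follow directly from its two possible values and their probabilities. The derivation below is a sketch that uses only the quantities in the table; the conditioning notation is added here for clarity.

```latex
\begin{aligned}
E(u_i \mid X_i) &= (1 - X_i'\beta)\,X_i'\beta + (-X_i'\beta)\,(1 - X_i'\beta) = 0 \\
\operatorname{Var}(u_i \mid X_i) &= (1 - X_i'\beta)^2\,X_i'\beta + (X_i'\beta)^2\,(1 - X_i'\beta)
  = X_i'\beta\,(1 - X_i'\beta)
\end{aligned}
```

Because the variance depends on Xi, the error term of the LPM is heteroskedastic.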
Assume that we want to study the determinants of labor force participation (LFP) of adult men in a particular
town. The mathematical model is given as:
Figure 1: Regression results of the LPM with Labor Force Participation as dependent variable
The serious limitation of the LPM, however, is that the predicted probability of an event occurring can lie
outside the natural limit, 0 ≤ Pi ≤ 1. This happens because the LPM assumes a linear relationship between the
predicted probability and the level of the explanatory variable (Xi), so sufficiently small or large values of
Xi push the fitted value below 0 or above 1.
In reality, probabilities cannot fall below 0 or rise above 1. We therefore need estimation techniques that
guarantee that the predicted probability lies within the natural limit, 0 ≤ Pi ≤ 1.
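The problem can be seen with a minimal sketch. The intercept and slope below are hypothetical values chosen only for illustration (they are not estimates from this chapter), and the single regressor is taken to be years of schooling.

```python
# Hypothetical LPM coefficients, chosen only for illustration (not estimates from the text).
BETA_0 = -0.40   # intercept
BETA_1 = 0.10    # change in the fitted probability per extra year of schooling


def lpm_predict(schooling_years: float) -> float:
    """Fitted value of a linear probability model: P_i = b0 + b1 * X_i."""
    return BETA_0 + BETA_1 * schooling_years


for years in (0, 5, 10, 16):
    p_hat = lpm_predict(years)
    inside = 0.0 <= p_hat <= 1.0
    print(f"schooling={years:2d}  fitted P={p_hat:5.2f}  inside [0, 1]: {inside}")
```

Because the fitted line has no floor or ceiling, small and large values of the regressor produce fitted "probabilities" below 0 and above 1.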
This requires a nonlinear functional form for the probability, which is obtained by assuming that the error
term (ui) follows some cumulative distribution function (CDF). The two nonlinear functions most commonly
proposed for this purpose are the logistic CDF and the normal CDF. The logistic CDF is given as follows:
$$\Pr(Y_i = 1 \mid X_i) = P_i = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$
$$\Pr(Y_i = 0 \mid X_i) = 1 - P_i = \frac{1}{1 + e^{Z_i}}$$
It is easy to verify that as Zi ranges from −∞ to +∞, Pi ranges between 0 and 1, and that Pi is non-linearly
related to Zi. That is, 0 ≤ Pi ≤ 1 for all real numbers Zi, which ensures that the predicted probability (Pi)
lies between 0 and 1. Thus, the Logit model satisfies the two conditions:
A. 0 ≤ Pi ≤ 1
B. Pi is non-linearly related to Xi
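The two conditions can be checked numerically. The sketch below evaluates the logistic CDF at a few arbitrary values of Zi; the function name `logistic_cdf` and the chosen Z values are illustrative assumptions, not part of the original text.

```python
import math


def logistic_cdf(z: float) -> float:
    """P_i = 1 / (1 + exp(-Z_i)), written so that exp() never overflows for large |Z_i|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


for z in (-30.0, -2.0, 0.0, 2.0, 30.0):
    p = logistic_cdf(z)
    print(f"Z={z:6.1f}  P={p:.6f}  1-P={1 - p:.6f}")
```

Equal steps in Zi do not give equal steps in Pi: the curve is steepest around Zi = 0 and flattens toward 0 and 1 in the tails.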
Take the ratio of the probability that the event occurs, in our case being employed (Pi), to the probability
that it does not occur (1 − Pi). The resulting ratio is called the odds ratio.
$$\frac{P_i}{1 - P_i} = \frac{e^{Z_i} / (1 + e^{Z_i})}{1 / (1 + e^{Z_i})} = e^{Z_i}$$
To linearize the above odds ratio, take the natural log of both sides of the equation. The resulting
equation is called the log of the odds ratio (Logit).
Arba Minch University 2022
$$\ln\left(\frac{P_i}{1 - P_i}\right) = \ln\left(e^{Z_i}\right) \Rightarrow L_i = Z_i$$
where Li is the Logit (which is linearly related to Xi), Xi is a matrix including all values of the explanatory
variables and β is a vector including all coefficients (the βs), so that Zi = Xi'β.
a. As Zi goes from −∞ to +∞, the predicted probability (Pi) goes from 0 to 1. In other words, for a positive
slope coefficient, as Xi goes from −∞ to +∞, the predicted probability (Pi) goes from 0 to 1. This implies
that in the Logit model the predicted probability (Pi) lies within the natural limit, 0 ≤ Pi ≤ 1.
b. Even if the Logit (L) is linear in X, the probabilities themselves are not. This property is in contrast with the
LPM model where the probabilities increase linearly with X.
c. If the Logit (L) is positive, it means that, when the value of the regressor increases, the odds that the
regressand equals 1 (meaning some event of interest happens) increases. If L is negative, the odds that the
regressand equals 1 decreases as the value of X increases. To put it differently, the Logit becomes negative
and increasingly large in magnitude as the odds ratio decreases from 1 to 0 and becomes increasingly large
and positive as the odds ratio increases from 1 to infinity.
d. More formally, the interpretation of the Logit model given above is as follows: β1, the slope, measures the
change in L for a unit change in X, that is, it tells how the log-odds in favor of success (being employed)
change as X1 changes by a unit. The intercept β0 is the value of the log of odds in favor of success (being
employed) when all the explanatory variables are zero. Like most interpretations of intercepts, this
interpretation may not have any physical meaning.
e. Given a certain value of the explanatory variable, say X*, if we actually want to estimate not the odds in
favor of success but the probability of success itself, this can be done directly from
$P_i = e^{Z_i} / (1 + e^{Z_i})$, once the estimates of β0 and β1 are available.
f. Whereas the LPM assumes that Pi is linearly related to Xi, the Logit model assumes that the log of the odds
ratio is linearly related to Xi.
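The relationship among the probability, the odds and the Logit described in points a to f can be sketched as follows. The function names and the three probabilities are illustrative assumptions.

```python
import math


def odds(p: float) -> float:
    """Odds in favor of the event: P / (1 - P)."""
    return p / (1.0 - p)


def logit(p: float) -> float:
    """Log of the odds, L = ln(P / (1 - P))."""
    return math.log(odds(p))


def inverse_logit(z: float) -> float:
    """Recover the probability from the Logit: P = 1 / (1 + exp(-Z))."""
    return 1.0 / (1.0 + math.exp(-z))


for p in (0.1, 0.5, 0.9):
    l = logit(p)
    print(f"P={p:.1f}  odds={odds(p):.3f}  logit={l:+.3f}  back to P={inverse_logit(l):.1f}")
```

The Logit is negative when the odds are below 1, zero when they equal 1, and positive when they exceed 1, which is the pattern described in point c.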
For estimation purpose, we can write a simple Logit model (with one independent variable, X) as follows:
$$\ln\left(\frac{P_i}{1 - P_i}\right) = L_i = \beta_0 + \beta_1 X_i + u_i$$
This model is estimated using the maximum likelihood method (MLM). The estimation is not done manually but
with software such as Stata. In this section we will see how to interpret the regression results of the binary
Logit model. There are three relevant Stata outputs for a Logit model: the coefficient (log-odds) output, the
odds ratio output and the marginal effects output.
Take the example given in the LPM discussion. Assume that we want to study the determinants of labor force
participation (LFP) of adult men in a particular town. Suppose that we have 30 observations on labor force
participation (employment), marital status, age and years of schooling. The three regression results are as
follows:
Figure 2: Regression results of the Logit Model with Labor Force Participation as dependent variable
Logistic regression Number of obs = 30
LR chi2(3) = 13.89
Prob > chi2 = 0.0031
Log likelihood = -13.244611 Pseudo R2 = 0.3440
β0 = −5.26 is the value of the log of the odds in favor of being employed when all explanatory variables
(marriage, age and schooling) are zero. Marriage has a significant negative effect on employment at the 5%
level of significance. Because its coefficient is negative, married individuals are less likely to be employed
than unmarried individuals. The value β1 = −2.61 implies that, moving from unmarried to married individuals,
the log of the odds in favor of being employed decreases by 2.61, keeping other things constant.
β2 = −0.0098 means that a one-year increase in age leads to a 0.0098 decrease in the log-odds in favor of
success (being employed), keeping other things constant. Similarly, β3 = 0.67 indicates that, keeping other
things constant, a one-year increase in years of schooling raises the log-odds in favor of being employed by
0.67.
Therefore, from this Logit output we can see which explanatory variables significantly affect the dependent
variable, and whether the effect is positive or negative.
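To see how these coefficients translate into a probability, the sketch below plugs the reported estimates into the logistic CDF. The individual's characteristics (age 30, 10 years of schooling) are hypothetical values chosen for illustration and do not come from the data set.

```python
import math

# Coefficients as reported in the interpretation of the Logit output.
COEF = {"const": -5.26, "married": -2.61, "age": -0.0098, "schooling": 0.67}


def predicted_probability(married: int, age: float, schooling: float) -> float:
    """P_i = 1 / (1 + exp(-Z_i)) with Z_i = b0 + b1*married + b2*age + b3*schooling."""
    z = (COEF["const"] + COEF["married"] * married
         + COEF["age"] * age + COEF["schooling"] * schooling)
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical individual: 30 years old with 10 years of schooling.
p_unmarried = predicted_probability(married=0, age=30, schooling=10)
p_married = predicted_probability(married=1, age=30, schooling=10)
print(f"unmarried: {p_unmarried:.3f}   married: {p_married:.3f}")
```

The difference between the two printed probabilities depends on the chosen age and schooling, which is why the coefficients themselves cannot be read as changes in probability.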
Also note the Pseudo R-squared in the output. It is the counterpart of the R² used in linear regression
(chapters 2 and 3). The Pseudo R-squared compares the unrestricted log likelihood l_ur of the model we are
estimating with the restricted log likelihood l_r of a model with only an intercept. If the independent
variables have no explanatory power, the restricted model will be the same as the unrestricted model and the
Pseudo R-squared will be 0.
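The reported value can be reproduced from the figures in the Logit output. The sketch assumes that the Pseudo R-squared is McFadden's measure, 1 − l_ur / l_r, and it recovers the restricted log likelihood, which the output does not list, from the reported LR statistic through LR = 2(l_ur − l_r).

```python
LOG_LIK_UNRESTRICTED = -13.244611   # "Log likelihood" in the Logit output
LR_STATISTIC = 13.89                # "LR chi2(3)" in the Logit output

# Assumption: LR = 2 * (l_ur - l_r), so l_r = l_ur - LR / 2.
log_lik_restricted = LOG_LIK_UNRESTRICTED - LR_STATISTIC / 2.0


def mcfadden_pseudo_r2(l_ur: float, l_r: float) -> float:
    """Assumed definition (McFadden): 1 - l_ur / l_r; equals 0 when l_ur == l_r."""
    return 1.0 - l_ur / l_r


pseudo_r2 = mcfadden_pseudo_r2(LOG_LIK_UNRESTRICTED, log_lik_restricted)
print(f"restricted log likelihood = {log_lik_restricted:.4f}")
print(f"pseudo R-squared          = {pseudo_r2:.4f}")
```

Under these assumptions the result agrees with the Pseudo R2 shown in the output to four decimal places.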
In addition to the Pseudo R2, the Likelihood Ratio (LR) test can be used to judge whether the model as a whole
significantly explains the variation in the regressand (Y). H0: the model is no better than a restricted model
with only an intercept. H1: the explanatory variables jointly explain some variation in Y. In this case, the
LR test shows that Married, Age and Schooling are able to explain some variation in Employment
(chi2(3) = 13.89, p = 0.0031), so H0 is rejected in favor of H1.
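The p-value of the LR test is the upper-tail probability of a chi-square distribution with 3 degrees of freedom (one per explanatory variable). A minimal sketch using only the Python standard library is given below; the closed-form expression is specific to 3 degrees of freedom, and the function name is an illustrative assumption.

```python
import math


def chi2_sf_3df(x: float) -> float:
    """Upper-tail probability Pr(chi2 > x) for 3 degrees of freedom.

    Closed form valid only for 3 degrees of freedom:
    erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2.0)) + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)


lr_statistic = 13.89                      # LR chi2(3) from the Logit output
p_value = chi2_sf_3df(lr_statistic)
print(f"LR = {lr_statistic}, p-value = {p_value:.4f}")
print("reject H0 at the 5% level:", p_value < 0.05)
```

For other degrees of freedom a statistical library or Stata's own output should be used instead of this special-case formula.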
Figure 3: Regression results of the Odds ratio of Logit Model with Labor Force Participation as dependent variable
These values are the exponentials of the Logit coefficients. An odds ratio greater than 1 means that an increase in the
explanatory variable raises the odds in favor of success, an odds ratio less than 1 means that it lowers them, and an odds
ratio of exactly 1 means that the variable leaves the odds unchanged. For example, the odds ratio for marriage is 0.074.
This indicates that the odds in favor of being employed for married individuals are 0.074 times those of their unmarried
counterparts. The odds ratio for age is 0.99, which is less than one: as age increases by one year, the odds in favor of
being employed are multiplied by 0.99, that is, they fall by about 1%.
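The link between the coefficient output and the odds ratio output can be verified by exponentiating the coefficients quoted earlier in the text. This is a small sketch; the dictionary keys are illustrative labels.

```python
import math

# Logit coefficients quoted in the interpretation of the coefficient output.
coefficients = {"married": -2.61, "age": -0.0098, "schooling": 0.67}

# Odds ratio = exp(coefficient).
odds_ratios = {name: math.exp(b) for name, b in coefficients.items()}

for name, ratio in odds_ratios.items():
    pct_change = (ratio - 1.0) * 100.0   # percentage change in the odds per unit increase
    print(f"{name:10s} odds ratio = {ratio:.3f}  change in odds = {pct_change:+.1f}%")
```

A negative coefficient always gives an odds ratio below 1 and a positive coefficient gives one above 1, so the two outputs never disagree on the direction of an effect.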
iii. Probability interpretation (Marginal Effect Methods)
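This shows how the probability of success changes as an independent variable changes. As specified above,
$$p = \frac{1}{1 + e^{-Z_i}} = \frac{e^{Z_i}}{1 + e^{Z_i}}$$
Differentiating with respect to an explanatory variable x with coefficient β gives the marginal effect:
$$\frac{\partial p}{\partial x} = \beta\, p\,(1 - p)$$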
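The marginal effect of a continuous regressor in the Logit model is the coefficient times p(1 − p). The sketch below evaluates this expression and checks it against a numerical derivative; the one-regressor model and its values (b0, b1, x) are hypothetical and serve only to illustrate the formula.

```python
import math


def logistic_cdf(z: float) -> float:
    """p = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def logit_marginal_effect(beta: float, z: float) -> float:
    """Marginal effect of a continuous regressor in a Logit model: beta * p * (1 - p)."""
    p = logistic_cdf(z)
    return beta * p * (1.0 - p)


# Hypothetical one-regressor model, Z = b0 + b1 * x (illustrative values only).
b0, b1, x = -1.0, 0.5, 3.0
analytic = logit_marginal_effect(b1, b0 + b1 * x)

# Central finite difference of p with respect to x, as an independent check.
h = 1e-6
numeric = (logistic_cdf(b0 + b1 * (x + h)) - logistic_cdf(b0 + b1 * (x - h))) / (2.0 * h)

print(f"analytic = {analytic:.6f}   numeric = {numeric:.6f}")
```

Since p(1 − p) is largest at p = 0.5, the same coefficient produces its largest change in probability for individuals near the middle of the distribution, which is why marginal effects are reported at a stated point such as the sample means.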
The Stata result given below is the marginal effect after logit of the above employment data.
. mfx
Figure 4: Regression results of the marginal effect of Logit Model with Labor Force Participation as dependent variable
The marginal effects after logit show the effect of each explanatory variable on the probability of being
employed. dy/dx of marriage = −0.4849 indicates that, keeping the other variables constant at their average
levels, moving from unmarried to married decreases the probability of being employed by 48.49 percentage
points. Similarly, dy/dx of age = −0.0021 indicates that as age increases by one year, the probability of
being employed decreases by 0.21 percentage points (assuming that all other explanatory variables are at their
averages).
Pi =1
$$I_i = \beta_0 + \beta_1 X_i$$
where Xi is the explanatory variable affecting the unobservable continuous variable Ii (the difference between
the utility obtained from being employed and the utility obtained from being unemployed).
How is the unobservable variable related to the actual decision to be employed? As before, let Y = 1 if the
individual is employed and Y = 0 if he or she is not. It is reasonable to assume that there is a critical or
threshold level of the index, call it u* (the utility obtained from being unemployed), such that if u (the
utility obtained from being employed) exceeds u*, the individual will be employed; otherwise he or she will
not. The threshold u*, like Ii, is not observable, but if we assume that it is normally distributed with the
same mean and variance, it is possible not only to estimate the parameters of the index given above, but also
to get some information about the unobservable index itself.
where Pi = Pr(Yi = 1 | Xi) is the probability that an individual is employed, Zi is the standard normal
variable, which is normally distributed with a mean of 0 and a variance of 1, and F is the standard normal CDF,
which can be written explicitly as follows.
$$P_i = F(I_i) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{I_i} e^{-z^2/2}\, dz = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\beta_0 + \beta_1 X_i} e^{-z^2/2}\, dz$$
This model is estimated using the maximum likelihood method (MLM). The coefficients from the Probit model are
difficult to interpret because they measure the change in the unobservable index Ii associated with a change in
one of the explanatory variables. A more useful measure is the marginal effect.
The coefficients estimated by Logit and Probit differ in magnitude because different mathematical functions
are being fitted. However, the two models give the same signs for the coefficients of the independent
variables and approximately the same marginal effects. Since the coefficients of the Probit model are
difficult to interpret, we will interpret the marginal effects.
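For the Probit model, the probability is the standard normal CDF evaluated at the index, and the marginal effect of a continuous regressor is its coefficient times the standard normal density at that index. The sketch below uses only the Python standard library; the function names and the index values are illustrative assumptions.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, F(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_pdf(z: float) -> float:
    """Standard normal density, the derivative of F(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def probit_marginal_effect(beta: float, index: float) -> float:
    """Marginal effect of a continuous regressor in a Probit model: beta * pdf(index)."""
    return beta * normal_pdf(index)


for index in (-2.0, 0.0, 2.0):
    print(f"I={index:+.1f}  P={normal_cdf(index):.4f}  "
          f"effect per unit of beta={probit_marginal_effect(1.0, index):.4f}")
```

At an index of zero the Probit scaling factor is about 0.40, while the Logit factor p(1 − p) is 0.25. This is one reason the raw coefficients of the two models differ in size even when their marginal effects are similar.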
Suppose the probit regression for the above employment data is given as follows:
Probit regression Number of obs = 30
LR chi2(3) = 13.85
Prob > chi2 = 0.0031
Log likelihood = -13.26742 Pseudo R2 = 0.3429
Figure 5: Regression results of the Probit Model with Labor Force Participation as dependent variable
The above result is the output of the probit model. We can interpret the sign of each coefficient and its significance
(using the z-test). However, it is difficult to interpret the magnitude of the coefficients. This output can also be used
to evaluate the significance of the whole model (Pseudo R2 and the LR test; see the example given for the Logit model).
Figure 6: Regression results of the Marginal Effects of Probit Model with Labor Force Participation as dependent variable