Probit Model
Sarvesh JP
2022504011
M.Sc. Agricultural Economics
TABLE OF CONTENTS
• DUMMY DEPENDENT VARIABLE – OVERVIEW
• DUMMY DEPENDENT VARIABLE – KEY TAKEAWAYS
• PROBIT (OR) NORMIT REGRESSION MODEL – BRIEF
• DIFFERENCES BETWEEN LOGIT & PROBIT / NORMIT MODEL
• STEPS INVOLVED IN PROBIT REGRESSION MODEL
• FEATURES OF PROBIT REGRESSION MODEL
• ASSUMPTIONS OF PROBIT REGRESSION MODEL
• DERIVATION OF PROBIT REGRESSION MODEL (EXAMPLE)
• MERITS – PROBIT REGRESSION MODEL
• DEMERITS – PROBIT REGRESSION MODEL
• APPLICATIONS OF PROBIT REGRESSION MODEL
• CONCLUSION
• REFERENCES AND FURTHER READINGS
DUMMY DEPENDENT VARIABLE
DIFFERENCES BETWEEN LOGIT & PROBIT / NORMIT MODEL
LINK FUNCTION
• Logit: The Logit model uses the logit link function, which is the natural logarithm of the odds.
• Probit: The Probit model uses the probit link function, which is the inverse of the cumulative
distribution function (CDF) of the standard normal distribution.
INTERPRETATION OF COEFFICIENTS
• Logit: The coefficients represent the change in the log-odds of the outcome for a one-unit change
in the corresponding independent variable.
• Probit: The coefficients represent the change in the latent variable (a z-score index) associated
with a one-unit change in the corresponding independent variable.
ERROR DISTRIBUTION
• Logit: The errors of the latent variable equation are assumed to follow a standard logistic
distribution.
• Probit: The errors of the latent variable equation are assumed to follow a standard normal
distribution.
LATENT VARIABLE DISTRIBUTION
• Logit: Conditional on the independent variables, the latent variable underlying the binary
outcome follows a logistic distribution.
• Probit: Conditional on the independent variables, the latent variable underlying the binary
outcome follows a normal distribution with unit variance.
CALCULATION & ESTIMATION
• Logit: The logistic function converts the linear predictor into the probability of the binary
outcome.
• Probit: The CDF of the standard normal distribution converts the linear predictor into the
probability of the binary outcome.
COMPUTATIONAL COMPLEXITY
• Logit: Computationally simpler and faster to estimate, because the logistic function has a
closed form.
• Probit: Computationally more intensive to estimate, because the normal CDF is an integral that
must be evaluated numerically.
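The two link functions compared above can be put side by side in a short sketch. The function names `probit_probability` and `logit_probability` are illustrative and not from the slides; the example only assumes Python's standard library.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def probit_probability(index: float) -> float:
    """P(Y = 1) under the Probit model: standard normal CDF of the index."""
    return STD_NORMAL.cdf(index)


def logit_probability(index: float) -> float:
    """P(Y = 1) under the Logit model: logistic function of the index."""
    return 1.0 / (1.0 + math.exp(-index))


# Evaluate both models at the same values of the linear predictor.
for index in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(
        f"index={index:+.1f}  "
        f"probit={probit_probability(index):.4f}  "
        f"logit={logit_probability(index):.4f}"
    )
```

Both functions map any real-valued index into the interval (0, 1) and agree at an index of zero. For the same index the normal CDF approaches 0 and 1 faster, which is why Logit and Probit coefficients are on different scales and should not be compared directly.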
STEPS INVOLVED IN PROBIT REGRESSION MODEL
• Step 1 (Data Preparation)
• Collect and organize the data required for
analysis. This includes the binary
dependent variable, independent variables,
and any additional relevant variables.
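A minimal sketch of the data preparation step, assuming the data are already held in plain Python lists. The helper `prepare_probit_data` and the three sample records are hypothetical and serve only to show the checks that are usually worth making before estimation.

```python
from typing import List, Sequence, Tuple


def prepare_probit_data(
    outcomes: Sequence[int],
    regressors: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Validate a binary outcome and build a design matrix with an intercept."""
    if len(outcomes) != len(regressors):
        raise ValueError("outcomes and regressors must have the same length")
    if any(value not in (0, 1) for value in outcomes):
        raise ValueError("the dependent variable must be coded 0 or 1")
    # Prepend a constant 1.0 so the first coefficient acts as the intercept.
    design = [[1.0, *map(float, row)] for row in regressors]
    return list(outcomes), design


# Hypothetical records: an admitted flag and (GPA, test score) per student.
y, X = prepare_probit_data(
    [1, 0, 1],
    [[3.8, 1450], [2.9, 1100], [3.5, 1320]],
)
print(y)
print(X)
```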
DERIVATION OF PROBIT REGRESSION MODEL (EXAMPLE)
Suppose you are interested in studying the factors that influence the likelihood of a student being
admitted to a prestigious university. The binary outcome variable Y takes the value 1 if the student
is admitted and 0 otherwise. You hypothesize that the student’s GPA (X1) and their standardized
test score (X2) influence the admission decision. The Probit model can be specified as follows:
P(Y = 1 | X) = Φ(Xβ)
where Φ represents the CDF of the standard normal distribution and X collects the regressors.
The linear predictor is given by:
Xβ = β0 + β1X1 + β2X2
To estimate the Probit model, you would use maximum likelihood estimation (MLE) to estimate
the coefficients β0, β1, and β2. The estimated coefficients measure the effects of GPA and test
score on the latent index Xβ; their signs give the direction of the effect on the probability of
admission, while the size of that effect depends on where Xβ is evaluated. The predicted
probabilities Φ(Xβ) from the Probit model can be used to assess the likelihood of admission for
students with different GPAs and test scores.
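The specification above can be evaluated directly once coefficient values are available. In the sketch below the three coefficients are hypothetical numbers chosen for illustration; they are not estimates from any data set.

```python
from statistics import NormalDist

# Hypothetical coefficients, chosen only for illustration (not estimated).
BETA_0 = -7.0   # intercept
BETA_1 = 1.2    # effect of GPA (X1) on the latent index
BETA_2 = 0.002  # effect of the test score (X2) on the latent index


def admission_probability(gpa: float, test_score: float) -> float:
    """Return P(Y = 1 | X) = Phi(b0 + b1*GPA + b2*score)."""
    linear_predictor = BETA_0 + BETA_1 * gpa + BETA_2 * test_score
    return NormalDist().cdf(linear_predictor)


for gpa, score in [(3.0, 1200), (3.5, 1350), (3.9, 1500)]:
    probability = admission_probability(gpa, score)
    print(f"GPA={gpa}, score={score}: P(admit)={probability:.3f}")
```

With positive slope coefficients, the predicted probability rises with both GPA and test score, but not by a constant amount per unit, because the normal CDF is S-shaped.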
This example demonstrates the basic derivation and application of the Probit model to analyze a
binary outcome variable based on the assumption of a standard normal distribution for the error
term of the latent variable.
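The derivation behind the admission example can be written out step by step. The symbols Y* (latent propensity to be admitted) and ε (error term) are not used on the slides and are introduced here as assumptions, along with the usual normalizations of a zero threshold and unit error variance.

```latex
\begin{align*}
Y_i^{*} &= \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \varepsilon_i,
  \qquad \varepsilon_i \sim N(0, 1), \\
Y_i &= \begin{cases} 1 & \text{if } Y_i^{*} > 0, \\ 0 & \text{otherwise,} \end{cases} \\
P(Y_i = 1 \mid X_i) &= P(\varepsilon_i > -X_i\beta)
  = 1 - \Phi(-X_i\beta) = \Phi(X_i\beta), \\
\ln L(\beta) &= \sum_{i=1}^{n} \Big[ Y_i \ln \Phi(X_i\beta)
  + (1 - Y_i) \ln\big(1 - \Phi(X_i\beta)\big) \Big].
\end{align*}
```

The third line uses the symmetry of the standard normal distribution, and the last line is the log-likelihood that MLE maximizes over β0, β1, and β2.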
FEATURES OF PROBIT MODEL
• Binary Outcome: The Probit model is designed for situations where the
dependent variable is binary, meaning it can take only two values: 0 or 1. It
is commonly used to model yes/no decisions or binary outcomes, such as the
presence or absence of a disease, the success or failure of an event, or the
adoption or non-adoption of a behavior.
• Latent Variable: The Probit model assumes that the binary outcome variable
is determined by an unobserved (latent) variable whose error term follows a
standard normal distribution. This latent variable represents the underlying
propensity or tendency for the outcome to occur.
• Cumulative Distribution Function (CDF): The Probit model employs the
cumulative distribution function (CDF) of the standard normal distribution
to model the relationship between the independent variables and the
probability of the dependent variable being 1. Evaluated at the linear
predictor, the CDF gives the probability that the latent variable exceeds
its threshold, that is, the probability that the outcome equals 1.
• Linear Relationship: The Probit model assumes a linear relationship between the independent
variables and the latent variable. The coefficients associated with the independent variables
indicate the change in the latent variable corresponding to a one-unit change in the independent
variable.
• Maximum Likelihood Estimation: To estimate the parameters of the Probit model, maximum
likelihood estimation (MLE) is commonly used. MLE finds the values of the model parameters
that maximize the likelihood of observing the actual binary outcomes given the specified
model and the data.
• Interpretation of Coefficients: The coefficients in the Probit model represent the change in the
latent variable (a z-score index) for a one-unit change in the corresponding independent
variable, holding other variables constant. Their signs show the direction of the effect on the
probability of the dependent variable being 1, but they do not measure changes in odds; the size
of the effect on the probability must be obtained from marginal effects or predicted probabilities.
• Predicted Probabilities: The Probit model allows for the estimation of predicted probabilities,
which represent the probability of the dependent variable being 1 for given values of the
independent variables. These predicted probabilities provide insights into the likelihood of
the binary outcome occurring.
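The maximum likelihood step listed among the features can be sketched with one regressor and an intercept. This is a minimal illustration using Fisher scoring (a Newton-type method that uses the expected information matrix), not the routine of any particular statistics package; the eight data points and all function names are hypothetical.

```python
import math
from statistics import NormalDist

NORMAL = NormalDist()


def fit_probit(x, y, iterations=100):
    """Fit P(Y=1) = Phi(b0 + b1*x) by Fisher scoring on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0          # score vector (gradient of the log-likelihood)
        i00 = i01 = i11 = 0.0  # expected information matrix (symmetric 2x2)
        for xi, yi in zip(x, y):
            z = b0 + b1 * xi
            p = min(max(NORMAL.cdf(z), 1e-10), 1 - 1e-10)  # keep p inside (0, 1)
            density = NORMAL.pdf(z)
            residual = density * (yi - p) / (p * (1.0 - p))
            weight = density * density / (p * (1.0 - p))
            g0 += residual
            g1 += residual * xi
            i00 += weight
            i01 += weight * xi
            i11 += weight * xi * xi
        # Solve the 2x2 system (information * step = score) in closed form.
        det = i00 * i11 - i01 * i01
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < 1e-10:
            break
    return b0, b1


def log_likelihood(b0, b1, x, y):
    """Probit log-likelihood for the given coefficients."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = NORMAL.cdf(b0 + b1 * xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total


# Hypothetical data: outcomes overlap across x, so the MLE is finite.
x_values = [1, 2, 3, 4, 5, 6, 7, 8]
y_values = [0, 0, 1, 0, 1, 0, 1, 1]

b0_hat, b1_hat = fit_probit(x_values, y_values)
print(f"intercept={b0_hat:.4f}, slope={b1_hat:.4f}")
print(f"log-likelihood={log_likelihood(b0_hat, b1_hat, x_values, y_values):.4f}")
```

The slope estimate is a change in the latent index per unit of x. Predicted probabilities follow by passing the fitted index through the normal CDF, for example `NORMAL.cdf(b0_hat + b1_hat * 5)`.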
ASSUMPTIONS
• Binary Outcome: The dependent variable is binary and takes only two values: 0
or 1. It represents a dichotomous outcome or a yes/no decision.
• Linearity: There is a linear relationship between the independent variables and
the latent variable, which represents the unobserved propensity for the binary
outcome. Changes in the independent variables therefore have a constant effect
on the latent variable, while their effect on the probability of the outcome
being 1 varies with the values of the independent variables.
• Independence: The observations in the dataset are assumed to be independent of
each other. This assumption implies that the probability of one observation being
1 does not depend on the outcomes of other observations.
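These assumptions can be checked against each other with a small simulation: if the latent variable is a linear index plus an independent standard normal error, the share of ones should match the normal CDF of the index. The coefficients, regressor value, and sample size below are hypothetical.

```python
import random
from statistics import NormalDist

random.seed(0)  # fixed seed so the sketch is reproducible

BETA_0, BETA_1 = -0.5, 1.0  # hypothetical coefficients
X_VALUE = 1.0               # evaluate at a single regressor value
N_DRAWS = 100_000

index = BETA_0 + BETA_1 * X_VALUE

# Latent propensity = linear index + independent standard normal error.
# The observed outcome is 1 whenever the latent propensity is positive.
outcomes = [
    1 if index + random.gauss(0.0, 1.0) > 0 else 0
    for _ in range(N_DRAWS)
]

simulated_share = sum(outcomes) / N_DRAWS
theoretical_probability = NormalDist().cdf(index)

print(f"simulated share of Y=1: {simulated_share:.4f}")
print(f"Phi(index):             {theoretical_probability:.4f}")
```

The two numbers should agree up to simulation noise, which illustrates why the Probit probability is written as Φ(Xβ).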
MERITS
• Non-Linearity: Although the latent index is linear in the independent variables, the normal CDF
maps it into an S-shaped probability curve. The effect of a predictor on the probability of the
dependent variable being 1 is therefore largest near a probability of 0.5 and shrinks towards 0 and
1, and predicted probabilities always stay between 0 and 1.
• Maximum Likelihood Estimation: The Probit model employs maximum likelihood estimation
(MLE) to estimate the model parameters. MLE is a widely used and well-established method that
provides efficient and consistent estimates under appropriate assumptions. It allows for hypothesis
testing, model comparison, and the calculation of standard errors and confidence intervals for the
estimated coefficients.
• Goodness of Fit: The Probit model allows for assessing the goodness of fit of the model to the data.
Various statistical tests and measures, such as the likelihood ratio test or the Akaike information
criterion (AIC), can be employed to evaluate the overall fit of the model and compare alternative
specifications.
• Application Versatility: The Probit model has found extensive applications in numerous fields,
including economics, social sciences, public health, psychology, marketing, and more. It has been
successfully used to study a wide range of binary outcomes, including behaviors, choices, health
outcomes, market decisions, and policy impacts.
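The goodness-of-fit measures named above can be computed from the fitted probabilities alone. The sketch below assumes the probabilities come from a model estimated by MLE with `n_parameters` coefficients; the outcomes, the fitted values, and the function names are hypothetical. McFadden's pseudo-R² is included as a further common measure that is not mentioned on the slide.

```python
import math


def bernoulli_log_likelihood(y, probabilities):
    """Log-likelihood of binary outcomes given predicted P(Y = 1)."""
    return sum(
        math.log(p) if yi == 1 else math.log(1.0 - p)
        for yi, p in zip(y, probabilities)
    )


def fit_statistics(y, fitted_probabilities, n_parameters):
    """Return AIC, the LR statistic against an intercept-only model,
    and McFadden's pseudo-R2."""
    ll_model = bernoulli_log_likelihood(y, fitted_probabilities)
    # The intercept-only MLE predicts the sample share of ones for everyone.
    share = sum(y) / len(y)
    ll_null = bernoulli_log_likelihood(y, [share] * len(y))
    return {
        "aic": 2 * n_parameters - 2 * ll_model,
        "lr_statistic": 2 * (ll_model - ll_null),
        "mcfadden_r2": 1 - ll_model / ll_null,
    }


# Hypothetical outcomes and fitted probabilities from some estimated model.
y = [0, 0, 1, 0, 1, 1]
p_hat = [0.1, 0.3, 0.6, 0.4, 0.7, 0.9]

stats = fit_statistics(y, p_hat, n_parameters=2)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

A lower AIC indicates a better trade-off between fit and complexity when comparing specifications on the same data. The LR statistic is compared with a chi-square distribution whose degrees of freedom equal the number of slope coefficients.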
DEMERITS
• Distributional Assumption: The Probit model assumes that the error term of the latent
variable follows a standard normal distribution. However, this assumption may not hold in all
cases. If the true error distribution deviates significantly from normality, it can affect the
accuracy and reliability of the coefficient estimates and the model’s predictions.
• Non-Linear Interpretation: While the Probit model allows for the modelling of non-linear
relationships, the interpretation of the estimated coefficients is not as straightforward as in
linear models. The coefficients in the Probit model represent the change in the latent
variable, which is not directly interpretable in terms of probabilities or odds ratios.
Interpreting the results requires transforming the coefficients using techniques like marginal
effects or predicted probabilities.
• Non-Trivial Likelihood Maximization: Estimating the parameters of the Probit model
involves maximizing the likelihood function, which requires evaluating the normal CDF for
every observation. The maximizer of the Probit likelihood has no closed-form solution, so
iterative numerical optimization techniques (such as Newton-Raphson) are employed. This can
make the estimation process more complex and time-consuming than for simpler models such as
the linear probability model.
• Multicollinearity: Like any regression model, the Probit model can be sensitive to
multicollinearity, which occurs when independent variables are highly correlated.
Multicollinearity can lead to unstable and imprecise coefficient estimates, making it
challenging to identify the unique effects of individual predictors on the binary outcome.
• Overfitting and Model Complexity: The Probit model allows for the inclusion of multiple
independent variables and interactions, which can increase the complexity of the model.
Including a large number of predictors relative to the sample size may lead to overfitting,
where the model becomes too specific to the data used for estimation and performs poorly on
new data.
• Limited Predictive Power: While the Probit model provides estimates of the probability of the
dependent variable being 1, it may have limited predictive power when it comes to individual-
level predictions. The model focuses on estimating population-level effects and may not
capture all the heterogeneity and individual-level variation in the binary outcome.
• Alternative Specifications: The Probit model is one of several approaches for modelling binary
outcomes, and different specifications may yield different results. Depending on the specific
research question and data characteristics, alternative models such as the Logit model or
other non-linear models might be more appropriate or provide better fits.
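The marginal effects mentioned under "Non-Linear Interpretation" can be sketched as follows. For a continuous regressor, the marginal effect on P(Y = 1) is the standard normal density at the index multiplied by the coefficient. The coefficients and evaluation points are hypothetical, and `marginal_effects` is an illustrative helper rather than a library function.

```python
from statistics import NormalDist

NORMAL = NormalDist()


def marginal_effects(coefficients, regressor_values):
    """Marginal effects dP(Y=1)/dx_k = phi(index) * b_k at the given point.

    `coefficients` starts with the intercept; `regressor_values` excludes it.
    """
    intercept, *slopes = coefficients
    index = intercept + sum(b * x for b, x in zip(slopes, regressor_values))
    density = NORMAL.pdf(index)  # standard normal density at the index
    return [density * b for b in slopes]


# Hypothetical coefficients (intercept, GPA, test score), for illustration only.
beta = [-7.0, 1.2, 0.002]

for point in ([3.0, 1200], [3.5, 1350]):
    effects = marginal_effects(beta, point)
    print(point, [round(effect, 5) for effect in effects])
```

The same coefficient produces different marginal effects at different points, which is why Probit results are usually reported as marginal effects at the means or as averages over the sample.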
APPLICATIONS
CONCLUSION
• In conclusion, the Probit model is a widely used statistical model for analyzing
binary dependent variables.
• The Probit model estimates the relationship between the observed binary outcome
and the linear predictor of the latent variable using the cumulative distribution
function (CDF) of the standard normal distribution as the link function.
• The Probit model offers several advantages as it allows for a flexible and nonlinear
relationship between the independent variables and the probability of the binary
outcome.
• The Probit model also has some limitations. It requires the assumption of normality
for the latent variable, which may not always hold in practice.
• Overall, the Probit model is a valuable tool for analyzing binary dependent
variables, particularly when the normality assumption is more appropriate or
when the estimation of predicted probabilities is of interest.
• Researchers should carefully consider the assumptions, computational requirements,
and interpretational aspects of the Probit model in their specific research context.
REFERENCES AND FURTHER READINGS
• Gujarati, D. N., & Porter, D. C. (2009). Basic
Econometrics (5th ed.). McGraw-Hill. Chapter 16
• Long, J. S. (1997). Regression Models for Categorical
and Limited Dependent Variables. Sage
Publications.
• Wooldridge, J. M. (2010). Econometric Analysis of
Cross Section and Panel Data. MIT Press. Chapter 15
• Greene, W. H. (2012). Econometric Analysis (7th ed.).
Prentice Hall. Chapter 19
• Cameron, A. C., & Trivedi, P. K. (2009).
Microeconometrics: Methods and Applications.
Cambridge University Press. Chapter 20
ANY QUERIES?
THANKS FOR YOUR ATTENTION