Probit Model

TAMILNADU AGRICULTURAL UNIVERSITY

CENTRE FOR AGRICULTURAL AND RURAL DEVELOPMENT STUDIES


DEPARTMENT OF AGRICULTURAL ECONOMICS

DUMMY DEPENDENT VARIABLE


PROBIT (OR) NORMIT MODEL

Sarvesh JP
2022504011
M.Sc. Agricultural Economics
TABLE OF CONTENTS
• DUMMY DEPENDENT VARIABLE – OVERVIEW
• DUMMY DEPENDENT VARIABLE – KEY TAKEAWAYS
• PROBIT (OR) NORMIT REGRESSION MODEL – BRIEF
• DIFFERENCES BETWEEN LOGIT & PROBIT / NORMIT MODEL
• STEPS INVOLVED IN PROBIT REGRESSION MODEL
• FEATURES OF PROBIT REGRESSION MODEL
• ASSUMPTIONS OF PROBIT REGRESSION MODEL
• DERIVATION OF PROBIT REGRESSION MODEL (EXAMPLE)
• MERITS – PROBIT REGRESSION MODEL
• DEMERITS – PROBIT REGRESSION MODEL
• APPLICATIONS OF PROBIT REGRESSION MODEL
• CONCLUSION
• REFERENCES AND FURTHER READINGS
DUMMY DEPENDENT VARIABLE

• A dummy dependent variable, also known as a binary dependent
variable, is a type of variable that can take only two values: 0 or 1.
• A dummy variable is created by assigning a value of 1 to represent the
presence or occurrence of an event or condition, and a value of 0 to
represent the absence or non-occurrence of that event or condition.
• It is used in situations where the outcome of interest is a binary
outcome or a yes/no decision.
• For example, it can be used to model whether a customer will make a
purchase (1) or not (0), or whether a patient will respond positively to
a treatment (1) or not (0).
KEY TAKEAWAYS

• A dummy dependent variable, also known as a binary dependent
variable, is a type of dependent variable that can take only two
values: 0 or 1.
• The dummy variable serves as the dependent variable in the
analysis, and the probability that it equals 1 is modelled as a
function of the independent variables.
• Analyzing data with a dummy dependent variable often involves
using statistical models that are specifically designed for binary
outcomes, such as the Logit and Probit models.
• These models estimate the probability of the dependent variable
being 1 given a set of independent variables and provide insights
into the factors that influence the binary outcome.
PROBIT REGRESSION MODEL

• The Probit model is a statistical model commonly
used to analyze binary dependent variables.
• It is based on the assumption that an unobserved
(latent) variable underlies the binary outcome and
that the error term of this latent variable follows a
standard normal distribution.
• In the Probit model, the dependent variable is a
binary variable that can take the value of 0 or 1.
• It represents a yes/no outcome or a choice
between two mutually exclusive alternatives.
• The Probit model estimates the probability of the
dependent variable being 1 given a set of
independent variables.
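
The latent-variable idea behind these bullets can be checked with a small simulation. The sketch below is illustrative only: the value 0.4 for the linear predictor, the number of draws, and the seed are assumptions, not values from the slides. It draws standard normal errors, records how often the latent variable is positive, and compares that share with the standard normal CDF.

```python
import random
from statistics import NormalDist

def simulate_share_of_ones(index, n_draws=200_000, seed=0):
    """Share of draws with latent y* = index + e > 0, where e ~ N(0, 1)."""
    rng = random.Random(seed)
    ones = sum(1 for _ in range(n_draws) if index + rng.gauss(0.0, 1.0) > 0)
    return ones / n_draws

index = 0.4  # hypothetical value of the linear predictor
simulated = simulate_share_of_ones(index)
theoretical = NormalDist().cdf(index)  # standard normal CDF at the index
print(f"simulated {simulated:.4f} vs CDF value {theoretical:.4f}")
```

The two numbers should agree up to simulation noise, which is the sense in which the Probit model "estimates the probability of the dependent variable being 1".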
LOGIT VS PROBIT
LINK FUNCTION
• Logit model: uses the logit link function, which is the natural
logarithm of the odds, ln[p / (1 – p)].
• Probit (or Normit) model: uses the inverse of the cumulative
distribution function (CDF) of the standard normal distribution,
Φ^(-1)(p), as the link function.

INTERPRETATION OF COEFFICIENTS
• Logit model: the coefficients represent the change in the log-odds of
the outcome for a one-unit change in the corresponding independent
variable.
• Probit (or Normit) model: the coefficients represent the change in the
latent variable associated with a one-unit change in the corresponding
independent variable.

ERROR DISTRIBUTION
• Logit model: assumes that the errors of the latent variable follow a
standard logistic distribution.
• Probit (or Normit) model: assumes that the errors of the latent
variable follow a standard normal distribution.

LATENT VARIABLE DISTRIBUTION
• Logit model: given the independent variables, the latent variable
underlying the binary outcome follows a logistic distribution.
• Probit (or Normit) model: given the independent variables, the latent
variable underlying the binary outcome follows a normal distribution.

CALCULATION & ESTIMATION
• Logit model: uses the logistic function to estimate the probabilities
of the binary outcome.
• Probit (or Normit) model: uses the CDF of the standard normal
distribution to estimate the probabilities of the binary outcome.

COMPUTATIONAL COMPLEXITY
• Logit model: the logistic CDF has a closed form, so the model is
somewhat simpler and faster to estimate.
• Probit (or Normit) model: the normal CDF has no closed form and must be
evaluated numerically, so estimation is somewhat more intensive. With
modern software the difference is usually negligible for a binary
outcome.
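
To give a feel for how close the two models are in practice, the sketch below compares the standard normal CDF with a rescaled logistic CDF on a grid of index values. The scaling constant 1.702 and the grid are assumptions chosen for illustration. A common rule of thumb is that logit coefficients are roughly 1.6 to 1.8 times the corresponding probit coefficients.

```python
import math
from statistics import NormalDist

def logistic_cdf(z):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))

SCALE = 1.702  # conventional rescaling constant (an assumption of this sketch)
std_normal = NormalDist()

# Evaluate both curves on a grid from -4 to 4 in steps of 0.01.
grid = [i / 100 for i in range(-400, 401)]
max_gap = max(abs(std_normal.cdf(z) - logistic_cdf(SCALE * z)) for z in grid)
print(f"largest gap between the two curves on [-4, 4]: {max_gap:.4f}")
```

After rescaling, the two curves differ by about one percentage point at most, which is why Logit and Probit usually lead to very similar predicted probabilities.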
STEPS INVOLVED IN PROBIT REGRESSION MODEL
• Step 1 (Data Preparation)
• Collect and organize the data required for
analysis. This includes the binary
dependent variable, independent variables,
and any additional relevant variables.

• Step 2 (Model Specification)
• Specify the Probit model by identifying the
dependent variable and the independent
variables that you believe are related to the
binary outcome. Formulate the linear
relationship between the independent
variables and the latent variable.
• Step 3 (Estimation Method)
• Choose an appropriate estimation method.
The most common method for estimating
Probit models is Maximum Likelihood
Estimation (MLE). MLE finds the parameter
values that maximize the likelihood of
observing the actual binary outcomes given
the specified Probit model and the data.
• Step 4 (Model Estimation)
• Use software or statistical packages capable
of estimating Probit models to obtain the
parameter estimates. Implement the MLE
procedure to estimate the coefficients of the
independent variables.
• Step 5 (Interpretation of Coefficients)
• Interpret the estimated coefficients by assessing their sign,
magnitude, and statistical significance. The coefficients represent the
change in the latent variable associated with a one-unit change in the
corresponding independent variable, holding other variables constant.
• Step 6 (Goodness of Fit)
• Assess the goodness of fit of the Probit model to the data.
• This involves evaluating model fit statistics, such as the likelihood
ratio test, AIC, BIC, or pseudo R-squared measures, to assess the
overall fit of the model and compare alternative specifications.
• Step 7 (Predicted Probabilities)
• Use the estimated model to calculate predicted probabilities.
Predicted probabilities represent the probability of the dependent
variable being 1 for given values of the independent variables. These
probabilities provide insights into the likelihood of the binary
outcome occurring.
• Step 8 (Diagnostic Checking)
• Perform diagnostic checks to assess the validity of
the Probit model. This may involve examining
residuals, checking for influential observations,
assessing multicollinearity, and conducting
sensitivity analyses to evaluate the robustness of
the results.
• Step 9 (Interpretation and Reporting)
• Interpret the results of the Probit model and report
the findings, including the estimated coefficients,
their standard errors, statistical significance, and
predicted probabilities. Provide an interpretation of
the relationship between the independent variables
and the probability of the binary outcome based on
the estimated coefficients.
• It is worth noting that software
packages such as R, Stata, or Python
provide specific functions and
procedures for estimating Probit
models, making the estimation
process more convenient.
• These software packages handle the
mathematical optimization required
for maximum likelihood estimation
and provide output summaries for
interpretation and analysis.
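
Steps 3 to 5 can be sketched without any statistical package. The example below is a minimal illustration, not the procedure of any particular software: the ten data points are hypothetical, the model has one intercept and one slope, and the likelihood is maximized with Newton-Raphson iterations written out by hand.

```python
import math
from statistics import NormalDist

N01 = NormalDist()

# Hypothetical data: one regressor x and a binary outcome y.
x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
y = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

def clipped_cdf(eta):
    """Normal CDF kept away from 0 and 1 so the logs stay finite."""
    return min(max(N01.cdf(eta), 1e-12), 1.0 - 1e-12)

def log_likelihood(b0, b1):
    total = 0.0
    for xi, yi in zip(x, y):
        p = clipped_cdf(b0 + b1 * xi)
        total += math.log(p) if yi == 1 else math.log(1.0 - p)
    return total

def score_and_hessian(b0, b1):
    """Gradient and (symmetric 2x2) Hessian of the probit log-likelihood."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        eta = b0 + b1 * xi
        p = clipped_cdf(eta)
        lam = N01.pdf(eta) / p if yi == 1 else -N01.pdf(eta) / (1.0 - p)
        w = lam * (lam + eta)  # per-observation weight in the negative Hessian
        g0 += lam
        g1 += lam * xi
        h00 -= w
        h01 -= w * xi
        h11 -= w * xi * xi
    return (g0, g1), (h00, h01, h11)

b0, b1 = 0.0, 0.0  # starting values
for _ in range(50):
    (g0, g1), (h00, h01, h11) = score_and_hessian(b0, b1)
    det = h00 * h11 - h01 * h01
    # Newton step: new beta = beta - inverse(Hessian) * gradient
    step0 = (h11 * g0 - h01 * g1) / det
    step1 = (h00 * g1 - h01 * g0) / det
    b0, b1 = b0 - step0, b1 - step1
    if max(abs(step0), abs(step1)) < 1e-10:
        break

print(f"b0 = {b0:.4f}, b1 = {b1:.4f}, logL = {log_likelihood(b0, b1):.4f}")
```

In applied work the same estimates would normally come from a package routine; the hand-written loop is only meant to show what "maximizing the likelihood" involves.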
DERIVATION OF PROBIT REGRESSION MODEL
• Step 1: Model Assumption
• Assume there is a binary dependent variable Y that takes the value of 1 with
probability P(Y = 1) = Φ(Xβ) and the value of 0 with probability P(Y = 0) = 1 –
Φ(Xβ), where Φ represents the cumulative distribution function (CDF) of the
standard normal distribution.
• X represents a matrix of independent variables, and β represents a vector of
coefficients to be estimated.
• Step 2: Linear Predictor
• Assume that the probability of Y = 1, denoted as P(Y = 1), can be expressed as the
CDF of a linear predictor Xβ, i.e., P(Y = 1) = Φ(Xβ).
• Step 3: Inverse of CDF
• Take the inverse of the CDF to transform the probability into a linear predictor:
Φ^(-1)(P(Y = 1)) = Xβ.
• The inverse of the CDF of the standard normal distribution is denoted as Φ^(-1)
which is also called the probit function.
• Step 4: Link Function
• The probit function serves as the link function
that connects the probability of the observed
binary outcome, P(Y = 1), to the linear predictor Xβ.
• Step 5: Estimation
• The coefficients β are estimated using
maximum likelihood estimation (MLE) to find
the parameter values that maximize the
likelihood of observing the actual binary
outcomes given the Probit model and the data.
• The likelihood function is constructed based
on the assumption that the observations are
independent and follow a Bernoulli
distribution.
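
Written out under the independence and Bernoulli assumptions of Step 5, the likelihood and log-likelihood take the standard form below. The observation index i, the sample size n, and x_i (the i-th row of X) are notation introduced here for illustration.

```latex
\[
L(\beta) = \prod_{i=1}^{n} \Phi(x_i\beta)^{y_i}\,\bigl[1 - \Phi(x_i\beta)\bigr]^{1 - y_i}
\]
\[
\ln L(\beta) = \sum_{i=1}^{n} \Bigl\{ y_i \ln \Phi(x_i\beta)
  + (1 - y_i) \ln\bigl[1 - \Phi(x_i\beta)\bigr] \Bigr\}
\]
```

MLE chooses β to maximize ln L(β). No closed-form solution exists, so the maximum is found by numerical iteration.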
EXAMPLE

Suppose you are interested in studying the factors that influence the likelihood of a student being
admitted to a prestigious university. The binary outcome variable Y takes the value 1 if the student
is admitted and 0 otherwise. You hypothesize that the student’s GPA (X1) and their standardized
test score (X2) influence the admission decision. The Probit model can be specified as follows:
P(Y = 1) = Φ(Xβ)
where Φ represents the CDF of the standard normal distribution
The linear predictor is given by:
Xβ = β0 + β1X1 + β2X2
To estimate the Probit model, you would use maximum likelihood estimation (MLE) to estimate
the coefficients β0, β1, and β2. The signs of the estimated coefficients show the direction of the
effects of GPA and test score: a positive coefficient raises the linear predictor Xβ and therefore
the probability of admission. The predicted probabilities from the Probit model can be used
to assess the likelihood of admission for students with different GPA and test scores.
This example demonstrates the basic derivation and application of the Probit model to analyze a
binary outcome variable based on the assumption of a standard normal distribution for the latent
variable.
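
As an illustration of how predicted probabilities follow from the specification above, the sketch below plugs made-up coefficients into Φ(β0 + β1X1 + β2X2). The coefficient values and the student profiles are hypothetical and are not estimates from any data set.

```python
from statistics import NormalDist

N01 = NormalDist()

# Hypothetical coefficients, NOT estimates from any data set.
B0, B_GPA, B_TEST = -7.0, 1.2, 0.002

def admission_probability(gpa, test_score):
    """P(Y = 1) = CDF(b0 + b1*GPA + b2*score) for the assumed coefficients."""
    index = B0 + B_GPA * gpa + B_TEST * test_score
    return N01.cdf(index)

p_low = admission_probability(gpa=3.0, test_score=1200)
p_high = admission_probability(gpa=3.8, test_score=1200)
print(f"GPA 3.0: {p_low:.3f}   GPA 3.8: {p_high:.3f}")

# The probit (inverse CDF) maps a probability back to the linear predictor.
recovered_index = N01.inv_cdf(p_high)
print(f"linear predictor recovered from the probability: {recovered_index:.3f}")
```

With a positive GPA coefficient, the student with the higher GPA receives the higher predicted probability, holding the test score fixed.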
FEATURES OF PROBIT MODEL

• Binary Outcome: The Probit model is designed for situations where the
dependent variable is binary, meaning it can take only two values: 0 or 1. It
is commonly used to model yes/no decisions or binary outcomes, such as the
presence or absence of a disease, the success or failure of an event, or the
adoption or non-adoption of a behavior.
• Latent Variable: The Probit model assumes that the binary outcome variable
is determined by an unobserved (latent) variable whose error term follows a
standard normal distribution. This latent variable represents the underlying
propensity or tendency for the outcome to occur.
• Cumulative Distribution Function (CDF): The Probit model employs the
cumulative distribution function (CDF) of the standard normal distribution
to model the relationship between the independent variables and the
probability of the dependent variable being 1. The CDF gives the probability
that the latent variable falls below a certain threshold.
• Linear Relationship: The Probit model assumes a linear relationship between the independent
variables and the latent variable. The coefficients associated with the independent variables
indicate the change in the latent variable corresponding to a one-unit change in the independent
variable.
• Maximum Likelihood Estimation: To estimate the parameters of the Probit model, maximum
likelihood estimation (MLE) is commonly used. MLE finds the values of the model parameters
that maximize the likelihood of observing the actual binary outcomes given the specified
model and the data.
• Interpretation of Coefficients: The sign of a coefficient in the Probit model shows the
direction of the effect of the corresponding independent variable on the probability of the
dependent variable being 1. Its magnitude is the change in the latent variable, not in the
odds or the probability, for a one-unit change in the independent variable, holding other
variables constant.
• Predicted Probabilities: The Probit model allows for the estimation of predicted probabilities,
which represent the probability of the dependent variable being 1 for given values of the
independent variables. These predicted probabilities provide insights into the likelihood of
the binary outcome occurring.
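
Because the coefficients act on the latent variable, the effect of a continuous independent variable on the probability itself is usually reported as a marginal effect, equal to the standard normal density at Xβ times the coefficient. The sketch below uses a hypothetical coefficient of 0.5 and three illustrative values of the linear predictor.

```python
from statistics import NormalDist

N01 = NormalDist()

def marginal_effect(index, beta_j):
    """Change in P(Y = 1) per unit of x_j: normal density at the index times beta_j."""
    return N01.pdf(index) * beta_j

beta_j = 0.5  # hypothetical probit coefficient
for index in (-2.0, 0.0, 2.0):
    effect = marginal_effect(index, beta_j)
    print(f"linear predictor {index:+.1f}: marginal effect = {effect:.4f}")
```

The same coefficient produces its largest effect on the probability where the linear predictor is near 0 (probability near 0.5) and a much smaller effect in the tails.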
ASSUMPTIONS

• Binary Outcome: The dependent variable is binary and takes only two values: 0
or 1. It represents a dichotomous outcome or a yes/no decision.
• Linearity: There is a linear relationship between the independent variables and
the latent variable, which represents the unobserved propensity for the binary
outcome. Changes in the independent variables therefore have a constant effect
on the latent variable, but not on the probability of the outcome being 1.
• Independence: The observations in the dataset are assumed to be independent of
each other. This assumption implies that the probability of one observation being
1 does not depend on the outcomes of other observations.
• No Perfect Multicollinearity: The independent variables should not exhibit
perfect multicollinearity, which means they should not be perfectly linearly
related to each other. Perfect multicollinearity can lead to estimation
problems and unstable coefficient estimates.
• Normality: The Probit model assumes that the error term of the latent
variable follows a standard normal distribution. This assumption enables the
use of the cumulative distribution function (CDF) of the standard normal
distribution to model the probability of the outcome being 1.
• Correct Specification: The Probit model assumes that the functional form of
the model is correctly specified, meaning that the included independent
variables and their relationship with the outcome are correctly specified.
Misspecification of the model can lead to biased and inefficient parameter
estimates.
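
The linearity and normality assumptions together give the Probit probability. A short derivation is sketched below; the symbols Y* (latent variable) and ε (error term) are notation introduced here, and the threshold of 0 and error variance of 1 are the usual normalizations.

```latex
\[
\begin{aligned}
Y^{*} &= X\beta + \varepsilon, \qquad \varepsilon \sim N(0, 1), \\
Y &= \begin{cases} 1 & \text{if } Y^{*} > 0, \\ 0 & \text{otherwise,} \end{cases} \\
P(Y = 1) &= P(\varepsilon > -X\beta) = 1 - \Phi(-X\beta) = \Phi(X\beta).
\end{aligned}
\]
```

The last equality uses the symmetry of the standard normal distribution around zero.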
MERITS

• Flexible Modelling: The Probit model provides a flexible framework for
modelling binary outcomes. It allows for the inclusion of multiple independent
variables, capturing complex relationships and interactions between predictors
and the probability of the outcome being 1. This flexibility makes it suitable for
a wide range of research questions and applications.
• Probability Interpretation: The Probit model estimates the probability of the
dependent variable being 1, given a set of independent variables. This
probability interpretation allows for a straightforward understanding of the
relationship between the predictors and the binary outcome. Researchers can
easily interpret and communicate the results in terms of the likelihood of the
event occurring.
• Distributional Assumption: The Probit model assumes that the error term of
the latent variable follows a standard normal distribution. This assumption
provides a clear theoretical foundation for the model and allows for the
estimation of probabilities based on the cumulative distribution function. It
also facilitates comparisons across different studies using the Probit model.
• Non-Linearity: The Probit model relates the independent variables to the probability of the
dependent variable being 1 through the S-shaped normal CDF. Predicted probabilities therefore
always lie between 0 and 1, and the effect of a predictor on the probability varies with the level
of the linear predictor instead of being constant.
• Maximum Likelihood Estimation: The Probit model employs maximum likelihood estimation
(MLE) to estimate the model parameters. MLE is a widely used and well-established method that
provides efficient and consistent estimates under appropriate assumptions. It allows for hypothesis
testing, model comparison, and the calculation of standard errors and confidence intervals for the
estimated coefficients.
• Goodness of Fit: The Probit model allows for assessing the goodness of fit of the model to the data.
Various statistical tests and measures, such as the likelihood ratio test or the Akaike information
criterion (AIC), can be employed to evaluate the overall fit of the model and compare alternative
specifications.
• Application Versatility: The Probit model has found extensive applications in numerous fields,
including economics, social sciences, public health, psychology, marketing, and more. It has been
successfully used to study a wide range of binary outcomes, including behaviors, choices, health
outcomes, market decisions, and policy impacts.
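
The goodness-of-fit measures mentioned above are simple functions of the maximized log-likelihood. The sketch below shows the usual formulas; the log-likelihood values, parameter count, and sample size are hypothetical, and the "null" model is assumed to be the intercept-only model.

```python
import math

def fit_statistics(loglik_model, loglik_null, n_params, n_obs):
    """Common likelihood-based summaries for a binary-outcome model."""
    return {
        # McFadden pseudo R-squared: 1 - lnL(model) / lnL(intercept only)
        "mcfadden_r2": 1.0 - loglik_model / loglik_null,
        # Likelihood ratio statistic against the intercept-only model
        "lr_statistic": 2.0 * (loglik_model - loglik_null),
        "aic": 2.0 * n_params - 2.0 * loglik_model,
        "bic": n_params * math.log(n_obs) - 2.0 * loglik_model,
    }

# Hypothetical values, for illustration only.
stats = fit_statistics(loglik_model=-45.0, loglik_null=-60.0, n_params=3, n_obs=100)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

Lower AIC and BIC values indicate a preferred specification, and the likelihood ratio statistic is compared with a chi-square distribution whose degrees of freedom equal the number of slope coefficients being tested.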
DEMERITS

• Distributional Assumption: The Probit model assumes that the error term of the latent
variable follows a standard normal distribution. However, this assumption may not hold in
all cases. If the underlying distribution deviates significantly from normality, it can affect
the accuracy and reliability of the model’s predictions.
• Non-Linear Interpretation: While the Probit model allows for the modelling of non-linear
relationships, the interpretation of the estimated coefficients is not as straightforward as in
linear models. The coefficients in the Probit model represent the change in the latent
variable, which is not directly interpretable in terms of probabilities or odds ratios.
Interpreting the results requires transforming the coefficients using techniques like marginal
effects or predicted probabilities.
• Non-Trivial Likelihood Maximization: Estimating the parameters of the Probit model
involves maximizing the likelihood function, which can be computationally intensive and
may require specialized software. The likelihood function of the Probit model does not have a
closed-form solution, and numerical optimization techniques are typically employed. This can
make the estimation process more complex and time-consuming compared to other simpler
models.
• Multicollinearity: Like any regression model, the Probit model can be sensitive to
multicollinearity, which occurs when independent variables are highly correlated.
Multicollinearity can lead to unstable and imprecise coefficient estimates, making it
challenging to identify the unique effects of individual predictors on the binary outcome.
• Overfitting and Model Complexity: The Probit model allows for the inclusion of multiple
independent variables and interactions, which can increase the complexity of the model.
Including a large number of predictors relative to the sample size may lead to overfitting,
where the model becomes too specific to the data used for estimation and performs poorly on
new data.
• Limited Predictive Power: While the Probit model provides estimates of the probability of the
dependent variable being 1, it may have limited predictive power when it comes to individual-
level predictions. The model focuses on estimating population-level effects and may not
capture all the heterogeneity and individual-level variation in the binary outcome.
• Alternative Specifications: The Probit model is one of several approaches for modelling binary
outcomes, and different specifications may yield different results. Depending on the specific
research question and data characteristics, alternative models such as the Logit model or
other non-linear models might be more appropriate or provide better fits.
APPLICATIONS

• Economics: In economics, the Probit model has been employed to study
various binary outcomes, such as labor force participation, unemployment,
job choices, business investment decisions, consumer choices, and
financial decisions.
• Social Sciences: In social sciences, the Probit model finds applications in
studying binary outcomes related to social behavior, political
participation, educational choices, marriage and fertility decisions,
criminal behavior, and survey responses.
• Marketing and Consumer Research: The Probit model is used in marketing
and consumer research to analyze binary outcomes such as purchase
decisions, brand choice, product adoption, customer satisfaction, response
to advertising campaigns, and customer retention.
• Finance and Investments: In finance, the Probit model has been applied to
analyze binary outcomes related to investment decisions, such as the
probability of default, credit risk assessment, bankruptcy prediction, and
financial market participation.
• Quality Control: In manufacturing and industrial settings, probit models
can be used for quality control and defect prediction. These models help
assess the probability of product defects based on manufacturing
parameters, allowing companies to identify potential issues and
implement corrective measures.
• Environmental Studies: Probit models are useful in environmental studies
for analyzing binary outcomes related to environmental risks, pollution,
and habitat choice. Researchers can use these models to estimate the
probability of specific events occurring, such as the presence of
endangered species in certain habitats or the likelihood of pollution
levels exceeding thresholds.
• Public Health: Probit models are applied in public health research for
analyzing binary outcomes related to disease diagnosis, treatment
outcomes, and health behaviours. They can be used to model the
probability of disease occurrence based on risk factors or to evaluate the
effectiveness of interventions or treatments.
CONCLUSION

• In conclusion, the Probit model is a widely used statistical model for analyzing
binary dependent variables.
• The Probit model estimates the relationship between the observed binary outcome
and the linear predictor of the latent variable using the cumulative distribution
function (CDF) of the standard normal distribution as the link function.
• The Probit model offers several advantages as it allows for a flexible and nonlinear
relationship between the independent variables and the probability of the binary
outcome.
• The Probit model also has some limitations. It requires the assumption of normality
for the latent variable, which may not always hold in practice.
• Overall, the Probit model is a valuable tool for analyzing binary dependent
variables, particularly when the normality assumption is more appropriate or
when the estimation of predicted probabilities is of interest.
• Researchers should carefully consider the assumptions, computational requirements,
and interpretational aspects of the Probit model in their specific research context.
REFERENCES AND FURTHER READINGS
• Gujarati, D. N., & Porter, D. C. (2009). Basic
Econometrics (5th ed.). McGraw-Hill. Chapter 16
• Long, J. S. (1997). Regression Models for Categorical
and Limited Dependent Variables. Sage
Publications.
• Wooldridge, J. M. (2010). Econometric Analysis of
Cross Section and Panel Data. MIT Press. Chapter 15
• Greene, W. H. (2012). Econometric Analysis (7th ed.).
Prentice Hall. Chapter 19
• Cameron, A. C., & Trivedi, P. K. (2009).
Microeconometrics: Methods and Applications.
Cambridge University Press. Chapter 20
ANY QUERIES????
THANKS FOR YOUR ATTENTION
