

Ordinary Least Square

Single Independent Variable Models


Regression Models
Instructor
Taimoor Naseer Waraich
Estimating Single-Independent-Variable
Models with OLS

• Recall that the objective of regression analysis is to start from:

  Yi = β0 + β1Xi + εi    (2.1)

• And, through the use of data, to get to:

  Ŷi = β̂0 + β̂1Xi    (2.2)

• Recall that equation (2.1) is purely theoretical, while equation (2.2)
  is its empirical counterpart
• How to move from (2.1) to (2.2)?
Estimating Single-Independent-Variable Models
with OLS (cont.)
• One of the most widely used methods is Ordinary Least
Squares (OLS)

• OLS minimizes Σ eᵢ²    (i = 1, 2, …, N)    (2.3)

• That is, OLS minimizes the sum of the squared residuals (i.e. the
  estimated error terms): the squared vertical distances between the
  observed points and the estimated regression line
• We also denote this term the “Residual Sum of Squares” (RSS)
Estimating Single-Independent-Variable
Models with OLS (cont.)

• Similarly, OLS minimizes: Σ (Yi − Ŷi)²    (i = 1, 2, …, N)

• Why use OLS?


• Relatively easy to use
• The goal of minimizing RSS is intuitively and
theoretically appealing
• This basically says we want the estimated regression
equation to be as close as possible to the observed data
• OLS estimates have a number of useful characteristics
Estimating Single-Independent-Variable
Models with OLS (cont.)
• OLS estimates have at least two useful
characteristics:
• The sum of the residuals is exactly zero
• OLS can be shown to be the “best” estimator
when certain specific conditions hold
• Ordinary Least Squares (OLS) is an estimator
– A given β̂ produced by OLS is an estimate
Estimating Single-Independent-Variable
Models with OLS (cont.)
How does OLS work?
• First recall from (2.3) that OLS minimizes the sum of the squared
residuals
• Next, it can be shown (see Exercise 12) that the coefficient estimates
that minimize the summed squared residuals for the case of just one
independent variable are:

β̂1 = Σ[(Xi − X̄)(Yi − Ȳ)] / Σ(Xi − X̄)²    (2.4)

β̂0 = Ȳ − β̂1X̄    (2.5)
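
As a concrete illustration, here is a minimal Python sketch of equations (2.4) and (2.5); the data and variable names are hypothetical, chosen only to show the arithmetic:

```python
# Minimal sketch of OLS for one independent variable, equations (2.4) and (2.5).
# The data below are hypothetical, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Equation (2.4): slope estimate
beta1_hat = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))

# Equation (2.5): intercept estimate
beta0_hat = y_bar - beta1_hat * x_bar

# Residuals e_i = Y_i - Yhat_i
residuals = [yi - (beta0_hat + beta1_hat * xi) for xi, yi in zip(x, y)]
print(beta0_hat, beta1_hat, sum(residuals))
```

Note that the residuals sum to (numerically) zero, consistent with the first OLS property listed above.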
Estimating Multivariate Regression Models
with OLS

• In the “real world” one explanatory variable is not enough


• The general multivariate regression model with K independent
variables is:
Yi = β0 + β1X1i + β2X2i + ... + βKXKi + εi (i = 1,2,…,N) (1.13)
• Biggest difference with single-explanatory variable regression
model is in the interpretation of the slope coefficients
– Now a slope coefficient indicates the change in the dependent variable
associated with a one-unit increase in the explanatory variable holding
the other explanatory variables constant
Estimating Multivariate Regression Models
with OLS (cont.)
• Omitted (and relevant!) variables are therefore not
held constant
• The intercept term, β0, is the value of Y when all the
Xs and the error term equal zero
• Nevertheless, the underlying principle of minimizing
the summed squared residuals remains the same
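
As an illustration, the following sketch estimates a multivariate model with NumPy's least-squares solver, which minimizes the same summed squared residuals; the data and the choice of K = 2 regressors are hypothetical:

```python
import numpy as np

# Hypothetical data: N = 6 observations, K = 2 explanatory variables
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.1, 4.2, 8.0, 8.9, 13.2, 13.8])

# Design matrix with a column of ones for the intercept beta0
X = np.column_stack([np.ones_like(x1), x1, x2])

# OLS: choose b to minimize the summed squared residuals ||y - Xb||^2
b, rss, rank, _ = np.linalg.lstsq(X, y, rcond=None)
print("beta0, beta1, beta2 estimates:", b)
```

Each estimated slope in b indicates the change in y associated with a one-unit increase in its regressor, holding the other regressor constant.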
Evaluating the Quality of a Regression
Equation
Checkpoints here include the following:
1. Is the equation supported by sound theory?
2. How well does the estimated regression fit the data?
3. Is the data set reasonably large and accurate?
4. Is OLS the best estimator to be used for this equation?
5. How well do the estimated coefficients correspond to the expectations
developed by the researcher before the data were collected?
6. Are all the obviously important variables included in the equation?
7. Has the most theoretically logical functional form been used?
8. Does the regression appear to be free of major econometric problems?
*These numbers roughly correspond to the relevant chapters in the book
Describing the Overall Fit of the Estimated
Model
• The simplest commonly used measure of overall fit is
the coefficient of determination, R2:

R2 = 1 − RSS/TSS = 1 − Σeᵢ² / Σ(Yi − Ȳ)²    (2.14)

• Since OLS selects the coefficient estimates that
minimize RSS, OLS provides the largest possible R2
(within the class of linear models)
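
For example, R2 can be computed directly from observed and fitted values (a minimal sketch with hypothetical numbers, using R2 = 1 − RSS/TSS from equation (2.14)):

```python
import numpy as np

# Hypothetical observed values and fitted values from some regression
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
y_hat = np.array([2.0, 4.0, 6.0, 8.0, 10.0])

rss = np.sum((y - y_hat) ** 2)     # residual sum of squares
tss = np.sum((y - y.mean()) ** 2)  # total sum of squares
r_squared = 1 - rss / tss          # equation (2.14)
print(r_squared)
```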
Figure 2.4: Illustration of Case Where R2 = 0
Figure 2.5: Illustration of Case Where R2 = .95
Figure 2.6: Illustration of Case Where R2 = 1
The Simple Correlation Coefficient, r

• This is a measure related to R2


• r measures the strength and direction of the linear
relationship between two variables:
– r = +1: the two variables are perfectly positively correlated
– r = –1: the two variables are perfectly negatively correlated
– r = 0: the two variables are totally uncorrelated
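
A minimal sketch of computing r with hypothetical data (np.corrcoef returns the 2x2 correlation matrix; r is its off-diagonal element):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

r = np.corrcoef(x, y)[0, 1]  # simple correlation coefficient
print(r)                     # close to +1: strong positive linear relationship
```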
The adjusted coefficient of determination

• A major problem with R2 is that it can never decrease
if another independent variable is added
• An alternative to R2 that addresses this issue is the
adjusted R2, denoted R̄2:

R̄2 = 1 − [Σeᵢ² / (N − K − 1)] / [Σ(Yi − Ȳ)² / (N − 1)]    (2.15)

Where N – K – 1 = degrees of freedom
The adjusted coefficient of determination
(cont.)
• So, R̄2 measures the share of the variation of Y around its mean
that is explained by the regression equation, adjusted for degrees
of freedom

• R̄2 can be used to compare the fits of regressions with the same
dependent variable and different numbers of independent
variables
• As a result, most researchers automatically use R̄2 instead of R2
when evaluating the fit of their estimated regression equations
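
For example, R̄2 can be computed from the same ingredients as R2 plus the degrees of freedom (a minimal sketch; the data and the choice of K are hypothetical):

```python
import numpy as np

# Hypothetical observed and fitted values from a regression with K regressors
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 11.8])
y_hat = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0])
K = 2  # hypothetical number of explanatory variables

N = len(y)
rss = np.sum((y - y_hat) ** 2)
tss = np.sum((y - y.mean()) ** 2)

# Equation (2.15): penalize RSS using the degrees of freedom N - K - 1
adj_r_squared = 1 - (rss / (N - K - 1)) / (tss / (N - 1))
print(adj_r_squared)
```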
The Classical Assumptions
1. Linear, correctly specified and additive
error term
2. Zero Population Mean
Fig 4.1: Error Term distribution with a
Mean of Zero
3. All Explanatory Variables are
uncorrelated with the Error Term
4. No Serial Correlation of Error Term
5. Constant Variance/ No
Heteroscedasticity in Error Term
Fig 4.2: An Error term whose variance
increases as Z Increases (Heteroskedasticity)
6. No Perfect Multicollinearity
7. The Error Term is Normally Distributed
Fig 4.3: Normal Distribution
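
To make Assumption 5 concrete, the following hypothetical simulation generates one error term with constant variance and one whose variance grows with Z, the pattern sketched in Fig 4.2:

```python
import numpy as np

rng = np.random.default_rng(0)
z = np.linspace(1.0, 10.0, 200)

# Homoskedastic errors: constant variance regardless of Z (Assumption 5 holds)
eps_homo = rng.normal(loc=0.0, scale=1.0, size=z.size)

# Heteroskedastic errors: standard deviation grows with Z (Assumption 5 fails)
eps_hetero = rng.normal(loc=0.0, scale=0.5 * z)

# Compare error spread in the low-Z and high-Z halves of the sample
print("homoskedastic std (low Z, high Z):", eps_homo[:100].std(), eps_homo[100:].std())
print("heteroskedastic std (low Z, high Z):", eps_hetero[:100].std(), eps_hetero[100:].std())
```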

The Sampling Distribution of β̂
Properties of the Mean
Properties of Variance

Fig 4.4: Distribution of β̂

Fig 4.5: Sampling Distribution of β̂ for
Various Observations
Properties of Standard Error
The Gauss Markov Theorem and
Properties of OLS Estimator
• Given the classical assumptions, the OLS estimator of βk is the
minimum-variance estimator among all linear unbiased estimators
of βk: that is, OLS is BLUE (Best Linear Unbiased Estimator)
The Gauss Markov Theorem and
Properties of OLS Estimator (cont.)
Notation Conventions
The Slope and Elasticity
Coefficients of the Various Models
