
Lecture 2 & 3: Simple Linear Regression

Gumilang Aryo Sahadewo

Department of Economics
Universitas Gadjah Mada

October 9, 2017

Gumilang Aryo Sahadewo (MEP UGM) Applied ECM: Lecture 2 & 3 October 9, 2017 1 / 52
Logistics

Textbook:
JW: Introductory Econometrics: A Modern Approach by Jeffrey M.
Wooldridge, sixth edition, required.
Lecture notes
Class notes

Review

Econometrics is a useful tool for estimating the effect of changing one
variable on another.
Given two random variables Y and X, we are interested in studying
how Y varies with changes in X, keeping everything else constant.
To address this question, we study the simple linear regression model.

Motivation

Consider the following questions.

A government program increases teachers' salaries: what is the
effect on their students' test scores?
The GoI implements a higher excise tax on cigarettes: what is the effect
on layoffs?
Quantitative answers to these questions are useful for making decisions
and policy recommendations.
The simple regression model is the base model for addressing all these
questions from a simple and general perspective.

Simple Regression Model

In a simple regression model that studies how Y varies with changes in
X:

Y = β₀ + β₁X + u

Y is the dependent variable (also: explained variable, LHS variable).
X is the independent variable (also: explanatory variable, control
variable, RHS variable).
The parameters of the model are β₀ and β₁. The former is the
intercept, while the latter is the slope. Interest usually centers on β₁.

The term u in a regression model

Textbooks define u as the error term or disturbance.
I personally don't like that name! So, what is u?
In a simple regression model that studies how Y varies with changes
in X:

Y = β₀ + β₁X + u

The term u includes variables that we do not control for in this model
but that also affect Y.
Consider the following variables:
Y = monthly wage (in dollars);
X = years of education.

Interpreting Coefficients

If the other factors in u are held fixed, so that the change in u is
zero (Δu = 0), then X has a linear effect on Y:

ΔY = β₁ΔX, if Δu = 0

The change in Y is simply β₁ multiplied by the change in X. This
means that β₁ is the slope parameter in the relationship between Y
and X, holding the other factors in u fixed; it is of primary interest in
applied economics.
The intercept parameter β₀, sometimes called the constant term, also
has its uses, although it is rarely central to an analysis.

An Example (A Simple Wage Equation)

A model relating a person's wage to observed education and other
unobserved factors is

Wage = β₀ + β₁Education + u

If wage is measured in dollars per hour and Education measures years
of education, then β₁ measures the change in hourly wage given
another year of education.
What does β₀ measure?
Let's look at a scatter graph.

Example: Test Score and Student-Teacher Ratio

Now consider the model Y = β₀ + β₁X + u, where

Y = average test scores;
X = student/teacher ratio.
Again, u may represent intelligence or social background.
What is the expected sign of β₁? Why?
β₁ has the following interpretation: if the student/teacher ratio
increases by one, then average test scores change by β₁, keeping
everything else constant.

Two Questions

In what follows, we focus on two questions:

When is it possible to recover (β₀, β₁) from data?
How can we do it?
The answers are:
Key assumptions.
The ordinary least squares (OLS) estimator.

Assumptions in Linear Regression Model

The key assumption involves the conditional expectation of u given X,
E(u | X).
We impose the zero conditional mean assumption:

E[u | X] = 0

This assumption implies that u and X are uncorrelated.
Since the key assumption involves an unobservable variable, it cannot
be tested.

Zero conditional mean assumption: an example

In a wage equation

Wage = β₀ + β₁Education + u

the zero conditional mean assumption implies: E[u | Education] = 0.
This means u and Education are uncorrelated.
To simplify the discussion, assume that u is the same as innate ability
in the wage equation.
The zero conditional mean assumption then requires that the average
level of ability is the same regardless of years of education.
If, for example, we think that average ability increases with years of
education, then the assumption is false. (This would happen if, on
average, people with more ability choose to become more educated.)

Assumptions in Linear Regression Model
E[Y | X] is a linear function of X; for any given value of X, the
distribution of Y is centered about E[Y | X].

Linear regression model

We write the basic model as

Y = β₀ + β₁X + u

β₀ is the intercept; β₁ is the coefficient on X, also called the
slope.
Estimated βs are denoted β̂₀ and β̂₁.
Fitted values of Y are denoted Ŷ: Ŷᵢ = β̂₀ + β̂₁Xᵢ.
Residuals are denoted û: ûᵢ = Yᵢ − Ŷᵢ.

How to estimate the model: data

In order to estimate the model, one needs data:
a random sample of n observations.

Data: Test Score and Student-Teacher Ratio

How to estimate the model: ordinary least squares
The most popular approach to estimating β₀ and β₁ is ordinary least
squares (OLS).
We choose β̂₀ and β̂₁ to minimize the sum of squared residuals, i.e., we
solve

min over β̂₀, β̂₁ of Σᵢ₌₁ⁿ (Yᵢ − β̂₀ − β̂₁Xᵢ)²

This is a calculus/optimization problem.
The proof is beyond the scope of this course.
The OLS estimators are:

β̂₁ = Σᵢ₌₁ⁿ (Xᵢ − X̄)(Yᵢ − Ȳ) / Σᵢ₌₁ⁿ (Xᵢ − X̄)²
β̂₀ = Ȳ − β̂₁X̄
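As a quick illustration (not from the lecture), the two OLS formulas can be computed by hand in a few lines of Python on a small made-up sample:

```python
# Toy illustration of the OLS formulas; the data below are made up.
from statistics import mean

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]

x_bar, y_bar = mean(X), mean(Y)

# Slope: sample covariance of X and Y over sample variance of X
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(X, Y)) / sum(
    (x - x_bar) ** 2 for x in X
)
# Intercept: Ybar minus slope times Xbar
b0 = y_bar - b1 * x_bar

print(b1, b0)  # roughly 1.96 and 0.14 for this sample
```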

Some comments on OLS estimators

The slope estimator is the sample covariance of X and Y divided by
the sample variance of X.
If X and Y are positively correlated, then the slope is positive.
If X and Y are negatively correlated, then the slope is negative.

Example: CEO Salary and Return on Equity

Let Y be annual salary (in thousands of dollars) and X be the
average roe (in percent) for the CEO's firm.
The model is salary = β₀ + β₁roe + u.
The OLS estimates are β̂₀ = 963.191 and β̂₁ = 18.501:

salary-hat = 963.191 + 18.501 roe

If roe equals zero, the predicted annual salary is 963.191 thousand
dollars, i.e., $963,191 = 963.191 × 1,000.
If roe increases by one percentage point, then predicted salary
increases by $18,501 = 18.501 × 1,000.
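The arithmetic in these two bullets is easy to verify; here is a small sketch (the function name is mine, the coefficients are the slide's):

```python
# Fitted line from the slide: salary measured in thousands of dollars.
def predicted_salary(roe):
    """Predicted annual salary (thousands of dollars) given roe in percent."""
    return 963.191 + 18.501 * roe

at_zero = predicted_salary(0)                            # the intercept
one_point = predicted_salary(11) - predicted_salary(10)  # +1 point of roe

print(at_zero * 1000)    # about $963,191
print(one_point * 1000)  # about $18,501
```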

Estimated Regression Line
Example: Monthly Wage and Years of Education

Dataset: wage.dta
Command: graph twoway (scatter wage educ) (lfit wage educ)

Estimated Regression Line
Example: Test Score and Student-Teacher Ratio

Dataset: caschool.dta
Command: graph twoway (scatter testscr str) (lfit testscr str)

Some important quantities
Total sum of squares (SST):

SST ≡ Σᵢ₌₁ⁿ (Yᵢ − Ȳ)²

Explained sum of squares (SSE):

SSE ≡ Σᵢ₌₁ⁿ (Ŷᵢ − Ȳ)²

Residual sum of squares (SSR):

SSR ≡ Σᵢ₌₁ⁿ (Yᵢ − Ŷᵢ)² = Σᵢ₌₁ⁿ ûᵢ²

SST = SSE + SSR.

Goodness of fit: R²

One measure of the goodness-of-fit of a regression model is R²:

R² = SSE/SST = 1 − SSR/SST

R² measures the predictive power of our model.
0 ≤ R² ≤ 1. R² = 1 means a perfect fit. If R² is near one, the
regressor X is good at predicting Y; if R² is close to zero, the
regressor is not good at predicting Y.
R² does not depend on the units of measurement.
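To make the decomposition concrete, here is a small Python check on made-up data (the stated intercept and slope are the OLS estimates for this toy sample, obtainable from the OLS formulas):

```python
# Toy check that SST = SSE + SSR and R^2 = SSE/SST = 1 - SSR/SST.
from statistics import mean

X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = 0.14, 1.96  # OLS intercept and slope for this made-up sample

y_bar = mean(Y)
Y_hat = [b0 + b1 * x for x in X]             # fitted values
resid = [y - yh for y, yh in zip(Y, Y_hat)]  # residuals

SST = sum((y - y_bar) ** 2 for y in Y)
SSE = sum((yh - y_bar) ** 2 for yh in Y_hat)
SSR = sum(e ** 2 for e in resid)

R2 = SSE / SST
print(SST, SSE + SSR)  # equal (up to float rounding)
print(R2, 1 - SSR / SST)
```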

Example

CEO salary and return on equity:

salary-hat = 963.191 + 18.501 roe
n = 209, R² = 0.0132

Wage and education:

wage-hat = −0.90 + 0.54 educ
n = 526, R² = 0.163

Caution: a high R-squared does not necessarily mean that the
regression has a causal interpretation!

Incorporating Nonlinearities

So far the population regression line was assumed to be linear:
E(Y | X) = β₀ + β₁X.
This assumption is very strong, and in many cases there are reasons
to believe that the relationship between X and Y is nonlinear.
In applied work, for example, you will often encounter regression
equations where the dependent variable appears in logarithmic form.
For instance, the simple Mincer equation is given by
log(wage) = β₀ + β₁educ + u.

Incorporating nonlinearities in simple regression

Suppose, instead, that the percentage increase in wage is the same
given one more year of education. A model that gives (approximately)
a constant percentage effect is

log(wage) = β₀ + β₁educ + u

where log(·) denotes the natural logarithm.
In particular, if Δu = 0, then

%Δwage ≈ (100 · β₁)Δeduc

Log-Linear Model

The log-linear regression model is

log(y) = β₀ + β₁x + u.

Note that the predicted value of y is always positive because
y = exp(β₀ + β₁x + u) > 0. This is a desirable property in many cases,
e.g., the wage-education example.
Suppose that x increases by one unit, i.e., Δx = 1. Then y changes
as follows:

log(y + Δy) = β₀ + β₁(x + Δx) + u = log(y) + β₁Δx = log(y) + β₁,

which implies log(y + Δy) − log(y) = β₁.
Then Δy/y ≈ β₁, so y increases by approximately β₁ · 100%.
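The quality of this approximation can be checked numerically. A small sketch using the 0.083 estimate that appears later in the lecture (the exact percent change implied by the log model uses exp):

```python
# How good is the 100*b1 approximation in a log-linear model?
import math

b1 = 0.083  # slope from the log(wage) education example

approx_pct = 100 * b1                 # approximate % change in y per unit of x
exact_pct = 100 * (math.exp(b1) - 1)  # exact % change implied by the model

print(approx_pct)           # close to 8.3
print(round(exact_pct, 2))  # about 8.65, so the approximation is decent
```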

Example

Suppose we use log(wage) as the dependent variable. We obtain the
following relationship:

log(wage)-hat = 0.584 + 0.083 educ
n = 526, R² = 0.186

The coefficient on educ has a percentage interpretation when it is
multiplied by 100: wage increases by 8.3 percent for every additional
year of education. This is what economists mean when they refer to
the return to another year of education.

Log-Log Model
Interpretation of Coefficients: Elasticity

The second case is a model where both y and x are transformed by
taking their logarithms:

log(y) = β₀ + β₁ log(x) + u.

Now β₁ represents a constant elasticity, i.e., a one percent increase in
x leads to a β₁% increase in y.
E.g., if β₁ = 0.312, a 1% increase in x leads to a 0.312% increase in
y.
In a quantity-price model with y = qᵈ and x = price, if |β₁| < 1, the
demand is said to be inelastic.
The interpretation of β₀ is rarely of interest.

Linear-Log Model

The third case is a model where x is transformed by taking its
logarithm:

y = β₀ + β₁ log(x) + u.

This model is usually called the linear-log model.
The interpretation of β₁ is as follows: a one percent increase in x
leads to an increase of β₁ · 0.01 units in y.
When β₁ = 27, e.g., a one percent increase in x leads to a 0.27 unit
increase in y.
Note that β₀ represents the expected value of y given x = 1, because
log(1) = 0.

Advantages of linear regression

Easy.
Computationally simple/fast.
Speed insensitive to dimension of X .
Relatively easy to interpret.
Recognizable & cross-disciplinary.
Somewhat flexible.
Standardized.

Disadvantages of linear regression

Does not allow one to discover nonlinear structure in the data.
Any nonlinear structure must be known and specified a priori.

How to Run Regressions in Stata?

reg Y X

Standard assumptions for the simple linear regression model

Assumption SLR.1 (Linear in parameters)
The data generating process can be written as

Y = β₀ + β₁X + u

which is linear in parameters.

Assumption SLR.2 (Random sampling)
We have a random sample {(Yᵢ, Xᵢ) : i = 1, 2, . . . , n} with n ≥ 2.

Assumption SLR.3 (Sample variation in the explanatory variable)
The values of the explanatory variable are not all the same:

Σᵢ₌₁ⁿ (Xᵢ − X̄)² > 0

Standard assumptions for the simple linear regression model

Assumption SLR.4 (Zero conditional mean)
Zero conditional mean of the error term: E[uᵢ | Xᵢ] = 0.

Assumption SLR.5 (Homoskedasticity)
Homoskedasticity: Var(uᵢ | Xᵢ) = σ².

OLS estimator is unbiased

Theorem (Unbiasedness)
Under Assumptions SLR.1-SLR.4:

E(β̂₀) = β₀ and E(β̂₁) = β₁

for any values of β₀ and β₁. In other words, β̂₀ is an unbiased estimator
for β₀, and β̂₁ is an unbiased estimator for β₁.

Interpretation of unbiasedness

The estimated coefficients may be smaller or larger, depending on the
sample, which is the result of a random draw.
However, on average, they will be equal to the values that
characterize the true relationship between y and x in the population.
"On average" means: if sampling were repeated, i.e., if drawing the
random sample and doing the estimation were repeated many times.
In a given sample, estimates may differ considerably from the true values.
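This repeated-sampling idea is easy to simulate. A hedged sketch (the data generating process and parameter values are made up for illustration):

```python
# Monte Carlo illustration of unbiasedness: average the OLS slope over many
# random samples drawn from a known DGP and compare it to the true slope.
import random
from statistics import mean

random.seed(42)
beta0, beta1 = 1.0, 2.0   # true (made-up) population parameters
n, reps = 50, 2000

def ols_slope(X, Y):
    xb, yb = mean(X), mean(Y)
    return sum((x - xb) * (y - yb) for x, y in zip(X, Y)) / sum(
        (x - xb) ** 2 for x in X
    )

slopes = []
for _ in range(reps):
    X = [random.gauss(0, 1) for _ in range(n)]
    U = [random.gauss(0, 1) for _ in range(n)]  # E[u | X] = 0 by construction
    Y = [beta0 + beta1 * x + u for x, u in zip(X, U)]
    slopes.append(ols_slope(X, Y))

print(mean(slopes))  # close to the true slope 2.0
```

Individual draws of the slope vary noticeably around 2.0, but their average over many samples does not.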

Some Comments

Remember that unbiasedness is a feature of the sampling distributions
of β̂₀ and β̂₁; it says nothing about the estimate that we obtain
for a given sample.
Unbiasedness generally fails if any of our four assumptions fails.
Think about examples of how SLR.1-SLR.4 could fail.

Variances of the OLS Estimators

In addition to knowing that the sampling distribution of β̂₁ is
centered about β₁ (β̂₁ is unbiased), it is important to know how far
we can expect β̂₁ to be from β₁ on average.
This also allows us to choose the best estimator among all, or at least
a broad class of, unbiased estimators. The estimator that has the
smallest variance is called the efficient estimator.
The derivation is much simpler under SLR.5, the homoskedasticity
assumption:

Var(uᵢ | Xᵢ) = σ², for all Xᵢ

Variances of the OLS Estimators

Under the zero conditional mean assumption, E(u | X) = E(u) = 0,
we have

Var(u | X) = E(u² | X) − [E(u | X)]² = E(u² | X)

Under homoskedasticity, σ² is also the unconditional expectation of u²:
σ² = E(u² | X) = E(u²) = Var(u).
It is useful to write the model in terms of the conditional mean and
conditional variance of Y:

E(Y | X) = β₀ + β₁X
Var(Y | X) = Var(β₀ + β₁X + u | X) = Var(u | X) = σ²

When Var(u | X) depends on X, the error term is said to exhibit
heteroskedasticity (or nonconstant variance). Will this make β̂₁ biased?

Graphical illustration of homoskedasticity
The variability of the unobserved influences does not depend on
the value of the explanatory variable.

Graphical illustration of heteroskedasticity
The variability of the unobserved influences depends on the value of
the explanatory variable (e.g., wage vs. educ).

Homoskedasticity
Discussion

Now we discuss whether the predictive power of the regression line,
E(Yᵢ | Xᵢ) = β₀ + β₁Xᵢ, differs across Xᵢ.
Recall that the model is Y = β₀ + β₁X + u and the population
regression function is E(Y | X) = β₀ + β₁X.
The conditional variance of the error term, i.e., Var(u | X), determines
the predictive power of the latter.
When Var(u | X) is constant, the predictive power of E(Y | X) does not
vary with X.
When Var(u | X) is constant, we say that the error term exhibits
homoskedasticity, or that the error term is homoskedastic.

Variances of the OLS Estimators

Under SLR.1-SLR.5:

Var(β̂₁) = σ² / Σᵢ₌₁ⁿ (Xᵢ − X̄)² = σ²/SSTₓ

Var(β̂₀) = σ² (n⁻¹ Σᵢ₌₁ⁿ Xᵢ²) / Σᵢ₌₁ⁿ (Xᵢ − X̄)²

The standard deviation of β̂₁ is sd(β̂₁) = √Var(β̂₁) = σ/√SSTₓ.
The sampling variability of the estimated regression coefficients is
higher the larger the variability of the unobserved factors, and
lower the higher the variation in the explanatory variable.

Variances of the OLS Estimators

In most cases, however, σ² is unknown; therefore we need to use the
data to estimate σ².

Estimating the Error Variance

The unbiased estimator σ̂² of σ² is

σ̂² = (1/(n − 2)) Σᵢ₌₁ⁿ ûᵢ² = SSR/(n − 2)

(n − 2) is the degrees of freedom.
Under SLR.1-SLR.5, E(σ̂²) = σ².

Calculation of standard errors for regression coefficients


= 2 is called standard error of the regression
Standard error of 1
r s
\ 2
se(1 ) = Var (1 ) = = qP
SSTx n 2
i=1 (Xi X )

Var (1 ) and/or se(1 )measure how precisely the regression


coefficients are estimated
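Putting the two formulas together, a small numeric sketch (the values of n, SSR, and SSTₓ below are toy numbers, not from the lecture):

```python
# Compute sigma-hat^2 = SSR/(n-2) and se(b1) = sigma-hat / sqrt(SST_x).
import math

n = 5          # toy sample size
SSR = 0.092    # toy residual sum of squares
SST_x = 10.0   # toy total variation in X

sigma2_hat = SSR / (n - 2)             # unbiased estimate of the error variance
se_b1 = math.sqrt(sigma2_hat / SST_x)  # standard error of the slope estimate

print(sigma2_hat, se_b1)
```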

Standard errors in Stata regression output
