Econ 131 Lecture 2.1 2 Assumptions Deriving The Estimators
𝐸(𝑢) = 0
Zero conditional mean
• We need to make a crucial assumption about how 𝑢 and 𝑥 are related.
• We want it to be the case that knowing something about 𝑥 does not give us any information about 𝑢, so that they are completely unrelated. That is,
𝐸(𝑢|𝑥) = 𝐸(𝑢) = 0
Zero conditional mean
𝐸(𝑢|𝑥) = 𝐸(𝑢) = 0, which implies
𝐸(𝑦|𝑥) = 𝛽0 + 𝛽1𝑥
Show!
Let 𝑦 = 𝛽0 + 𝛽1𝑥 + 𝑢. Then
𝐸(𝑦|𝑥) = 𝐸(𝛽0 + 𝛽1𝑥 + 𝑢 | 𝑥)
= 𝐸(𝛽0|𝑥) + 𝐸(𝛽1𝑥|𝑥) + 𝐸(𝑢|𝑥)
= 𝛽0 + 𝛽1𝑥
since 𝐸(𝑢|𝑥) = 0.
Zero Conditional Mean Assumption: Implications
Population regression function (PRF):
𝐸(𝑦|𝑥) = 𝛽0 + 𝛽1𝑥
• 𝐸(𝑦|𝑥) is a linear function of 𝑥.
• The linearity means that a one-unit increase in 𝑥 changes the expected value of 𝑦 by the amount 𝛽1.
• For any given value of 𝑥, the distribution of 𝑦 is centered about 𝐸(𝑦|𝑥).
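The implication can be checked in a simulation. Below is a minimal Python sketch (the parameter values 2.0 and 0.5 are made up for illustration): 𝑢 is drawn independently of 𝑥, so 𝐸(𝑢|𝑥) = 0 holds by construction, and the average of 𝑦 among observations with 𝑥 near any value tracks 𝛽0 + 𝛽1𝑥.

```python
import numpy as np

rng = np.random.default_rng(0)
beta0, beta1 = 2.0, 0.5          # hypothetical population parameters
x = rng.uniform(0, 10, 100_000)
u = rng.normal(0, 1, 100_000)    # drawn independently of x, so E(u|x) = 0
y = beta0 + beta1 * x + u

# Average y among observations with x near 4: should be close to
# E(y|x=4) = beta0 + beta1*4 = 4.0
window = (x > 3.9) & (x < 4.1)
print(y[window].mean())
```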
PRF and the zero conditional mean
[Figure: 𝐸(𝑦|𝑥) = 𝛽0 + 𝛽1𝑥 drawn as a linear function of 𝑥; at each value 𝑥1, 𝑥2 the distribution 𝑓(𝑦) of 𝑦 is centered about 𝐸(𝑦|𝑥).]
Population regression line and data points
[Figure: four data points (𝑥𝑖, 𝑦𝑖), 𝑖 = 1, …, 4, scattered about the population regression line 𝒚 = 𝜷𝟎 + 𝜷𝟏𝒙, i.e. 𝑬(𝒚|𝒙) = 𝜷𝟎 + 𝜷𝟏𝒙; each point satisfies 𝑦𝑖 = 𝛽0 + 𝛽1𝑥𝑖 + 𝑢𝑖, with 𝑢𝑖 the vertical distance from the line.]
Independence of 𝒙 and 𝒖
• To derive the OLS estimates we need to realize that our main assumption 𝐸(𝑢|𝑥) = 𝐸(𝑢) = 0 also implies that
𝐶𝑜𝑣(𝑥, 𝑢) = 𝐸(𝑥𝑢) = 0
• Substituting 𝑢 = 𝑦 − 𝛽0 − 𝛽1𝑥, these become the two population moment conditions:
𝐸(𝑦 − 𝛽0 − 𝛽1𝑥) = 0
𝐸[𝑥(𝑦 − 𝛽0 − 𝛽1𝑥)] = 0
[Figure: sample data points with a fitted line; 𝑢ො𝑖 denotes the residual, the vertical distance from (𝑥𝑖, 𝑦𝑖) to the fitted line.]
Goal: Derive a line that has the shortest distance – line of best fit
• Given the intuitive idea of fitting a line, we can set up a formal minimization problem: minimize the sum of the error terms across all data points:
min Σᵢ₌₁ⁿ 𝑢ො𝑖
• But note that some 𝑢ො𝑖 could be large and negative, so the sum could be small even when the distance of the line from the data points is large.
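The cancellation problem is easy to see numerically. In the sketch below (toy data of my own choosing), a flat line through the mean of 𝑦 ignores 𝑥 entirely, yet its residuals sum to zero, exactly like the residuals of the perfect fit:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])   # exactly y = 2x

# A deliberately bad "fit": the flat line yhat = mean(y)
resid_flat = y - y.mean()
print(resid_flat.sum())   # negative and positive residuals cancel to 0

# The perfect line y = 2x also has residuals summing to zero,
# so the sum-of-residuals criterion cannot distinguish the two fits.
resid_true = y - 2.0 * x
print(resid_true.sum())
```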
Goal: Derive a line that has the shortest distance – line of best fit
• We can instead express the minimization problem in terms of the squared error terms:
min Σᵢ₌₁ⁿ 𝑢ො𝑖² = Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖)²
Goal: Derive a line that has the shortest distance – line of best fit
1. Take the derivative of each term with respect to 𝛽መ0:
𝑖 = 1: (𝑦1 − 𝛽መ0 − 𝛽መ1𝑥1)² → 𝑑(.)/𝑑𝛽መ0 = −2(𝑦1 − 𝛽መ0 − 𝛽መ1𝑥1)
𝑖 = 2: (𝑦2 − 𝛽መ0 − 𝛽መ1𝑥2)² → 𝑑(.)/𝑑𝛽መ0 = −2(𝑦2 − 𝛽መ0 − 𝛽መ1𝑥2)
⋮
𝑖 = 𝑛: (𝑦𝑛 − 𝛽መ0 − 𝛽መ1𝑥𝑛)² → 𝑑(.)/𝑑𝛽መ0 = −2(𝑦𝑛 − 𝛽መ0 − 𝛽መ1𝑥𝑛)
Goal: Derive a line that has the shortest distance – line of best fit
Hence,
𝑑[Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖)²]/𝑑𝛽መ0 = −2 Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖)
Goal: Derive a line that has the shortest distance – line of best fit
Take the derivative of each term with respect to 𝛽መ1:
𝑖 = 1: (𝑦1 − 𝛽መ0 − 𝛽መ1𝑥1)² → 𝑑(.)/𝑑𝛽መ1 = −2𝑥1(𝑦1 − 𝛽መ0 − 𝛽መ1𝑥1)
𝑖 = 2: (𝑦2 − 𝛽መ0 − 𝛽መ1𝑥2)² → 𝑑(.)/𝑑𝛽መ1 = −2𝑥2(𝑦2 − 𝛽መ0 − 𝛽መ1𝑥2)
⋮
𝑖 = 𝑛: (𝑦𝑛 − 𝛽መ0 − 𝛽መ1𝑥𝑛)² → 𝑑(.)/𝑑𝛽መ1 = −2𝑥𝑛(𝑦𝑛 − 𝛽መ0 − 𝛽መ1𝑥𝑛)
Goal: Derive a line that has the shortest distance – line of best fit
Hence,
𝑑[Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖)²]/𝑑𝛽መ1 = −2 Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖)
Goal: Derive a line that has the shortest distance – line of best fit
2. Equate the derivatives to zero (these are the first order conditions).
FOCs:
−2 Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
−2 Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
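As a numerical check (a sketch with simulated data; the data-generating parameters are made up), both FOCs vanish at the OLS solution:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(5, 2, 500)
y = 1.0 + 0.7 * x + rng.normal(0, 1, 500)   # hypothetical process

# OLS estimates via the closed-form solutions derived in this lecture
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

resid = y - b0 - b1 * x
print(resid.sum())        # first FOC: ~0 up to floating-point error
print((x * resid).sum())  # second FOC: ~0 up to floating-point error
```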
Goal: Derive a line that has the shortest distance – line of best fit
Then solve for 𝛽መ0 and 𝛽መ1.
From the first FOC, we have:
Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
→ Σᵢ₌₁ⁿ 𝑦𝑖 − Σᵢ₌₁ⁿ 𝛽መ0 − Σᵢ₌₁ⁿ 𝛽መ1𝑥𝑖 = 0
→ 𝛽መ0 = (1/𝑛) Σᵢ₌₁ⁿ 𝑦𝑖 − 𝛽መ1 (1/𝑛) Σᵢ₌₁ⁿ 𝑥𝑖
→ 𝜷𝟎 = 𝒚ഥ − 𝜷𝟏𝒙ഥ
** We have solved for 𝜷𝟎.
Goal: Derive a line that has the shortest distance – line of best fit
A result from this is that 𝒚ഥ = 𝜷𝟎 + 𝜷𝟏𝒙ഥ, which implies that the estimated sample regression line passes through the sample means of 𝒚 and 𝒙.
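This property is easy to verify numerically. A minimal sketch (the data values below are arbitrary, chosen only for illustration): plugging 𝒙ഥ into the fitted line returns 𝒚ഥ.

```python
import numpy as np

x = np.array([1.0, 3.0, 4.0, 7.0, 9.0])
y = np.array([2.1, 3.9, 5.2, 8.3, 9.8])   # arbitrary illustrative values

b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# The fitted line evaluated at the mean of x equals the mean of y
print(b0 + b1 * x.mean(), y.mean())   # identical up to floating-point error
```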
Goal: Derive a line that has the shortest distance – line of best fit
Solve for 𝛽መ1.
From the second FOC and the derived 𝛽መ0, we have:
Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
→ Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − (𝑦ത − 𝛽መ1𝑥ҧ) − 𝛽መ1𝑥𝑖) = 0
→ Σᵢ₌₁ⁿ 𝑥𝑖[(𝑦𝑖 − 𝑦ത) − 𝛽መ1(𝑥𝑖 − 𝑥ҧ)] = 0
Goal: Derive a line that has the shortest distance – line of best fit
→ Σᵢ₌₁ⁿ 𝑥𝑖[(𝑦𝑖 − 𝑦ത) − 𝛽መ1(𝑥𝑖 − 𝑥ҧ)] = 0
→ Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝑦ത) − 𝛽መ1 Σᵢ₌₁ⁿ 𝑥𝑖(𝑥𝑖 − 𝑥ҧ) = 0
→ 𝛽መ1 Σᵢ₌₁ⁿ 𝑥𝑖(𝑥𝑖 − 𝑥ҧ) = Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝑦ത)
→ 𝛽መ1 = Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝑦ത) / Σᵢ₌₁ⁿ 𝑥𝑖(𝑥𝑖 − 𝑥ҧ)
Goal: Derive a line that has the shortest distance – line of best fit
Since Σᵢ₌₁ⁿ (𝑦𝑖 − 𝑦ത) = 0 and Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ) = 0, we can subtract these zero-valued terms from the numerator and denominator without changing anything:
→ 𝛽መ1 = [Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝑦ത) − 𝑥ҧ Σᵢ₌₁ⁿ (𝑦𝑖 − 𝑦ത)] / [Σᵢ₌₁ⁿ 𝑥𝑖(𝑥𝑖 − 𝑥ҧ) − 𝑥ҧ Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)]
→ 𝛽መ1 = [Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝑦ത) − Σᵢ₌₁ⁿ 𝑥ҧ(𝑦𝑖 − 𝑦ത)] / [Σᵢ₌₁ⁿ 𝑥𝑖(𝑥𝑖 − 𝑥ҧ) − Σᵢ₌₁ⁿ 𝑥ҧ(𝑥𝑖 − 𝑥ҧ)]
→ 𝛽መ1 = Σᵢ₌₁ⁿ [𝑥𝑖(𝑦𝑖 − 𝑦ത) − 𝑥ҧ(𝑦𝑖 − 𝑦ത)] / Σᵢ₌₁ⁿ [𝑥𝑖(𝑥𝑖 − 𝑥ҧ) − 𝑥ҧ(𝑥𝑖 − 𝑥ҧ)]
Goal: Derive a line that has the shortest distance – line of best fit
→ 𝛽መ1 = Σᵢ₌₁ⁿ [𝑥𝑖(𝑦𝑖 − 𝑦ത) − 𝑥ҧ(𝑦𝑖 − 𝑦ത)] / Σᵢ₌₁ⁿ [𝑥𝑖(𝑥𝑖 − 𝑥ҧ) − 𝑥ҧ(𝑥𝑖 − 𝑥ҧ)]
→ 𝛽መ1 = Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)(𝑦𝑖 − 𝑦ത) / Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)(𝑥𝑖 − 𝑥ҧ)
→ 𝛽መ1 = Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)(𝑦𝑖 − 𝑦ത) / Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)²
Goal: Derive a line that has the shortest distance – line of best fit
→ 𝜷𝟏 = Σᵢ₌₁ⁿ (𝒙𝒊 − 𝒙ഥ)(𝒚𝒊 − 𝒚ഥ) / Σᵢ₌₁ⁿ (𝒙𝒊 − 𝒙ഥ)²
** We have solved for 𝜷𝟏.
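As a check on the derivation, the closed-form estimators can be compared against a standard least-squares routine. A minimal sketch with simulated data (the true parameters 1.5 and −0.8 are made up):

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.uniform(-3, 3, 1000)
y = 1.5 - 0.8 * x + rng.normal(0, 0.5, 1000)

# Estimators derived above
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Reference: NumPy's least-squares polynomial fit of degree 1
# (polyfit returns coefficients highest degree first: [slope, intercept])
b1_ref, b0_ref = np.polyfit(x, y, 1)

print(b0, b1)           # should agree with the reference fit
print(b0_ref, b1_ref)
```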
Recap: Deriving the estimators: Least squares method
• If one uses calculus to solve the minimization problem for the two parameters, one obtains the following first order conditions, which are the same as the Method of Moments restrictions multiplied by 𝑛:
Σᵢ₌₁ⁿ (𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
Σᵢ₌₁ⁿ 𝑥𝑖(𝑦𝑖 − 𝛽መ0 − 𝛽መ1𝑥𝑖) = 0
Recap: Deriving the estimators: Least squares method
• Given the definition of a sample mean, and properties of summation, we can rewrite the first condition as follows:
𝑦ത = 𝛽መ0 + 𝛽መ1𝑥ҧ
or
𝛽መ0 = 𝑦ത − 𝛽መ1𝑥ҧ
Recap: Deriving the estimators: Least squares method
Solving for 𝛽መ1, we will get:
𝛽መ1 = Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)(𝑦𝑖 − 𝑦ത) / Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)²
provided that
Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)² > 0
(i.e., the 𝑥𝑖 are not all equal in the sample).
Exercise
Summary: Ordinary Least Squares (OLS) estimates
• The slope estimate is the sample covariance between 𝑥 and 𝑦 divided by the sample variance of 𝑥.
• If 𝑥 and 𝑦 are positively correlated, the slope will be positive.
• If 𝑥 and 𝑦 are negatively correlated, the slope will be negative.
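A quick sketch of the sign property with arbitrary toy data (values are mine): computing the slope as sample covariance over sample variance gives a positive slope for positively related data and a negative slope when the relationship is reversed.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = np.array([1.2, 2.1, 2.9, 4.3, 5.1])   # positively related to x
y_down = y_up[::-1].copy()                   # same values, negatively related

def slope(x, y):
    # OLS slope = sample covariance(x, y) / sample variance(x)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

print(slope(x, y_up))    # positive
print(slope(x, y_down))  # negative
```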
Summary: OLS estimates
Recall: correlation coefficient
𝜌 = Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)(𝑦𝑖 − 𝑦ത) / √[Σᵢ₌₁ⁿ (𝑥𝑖 − 𝑥ҧ)² · Σᵢ₌₁ⁿ (𝑦𝑖 − 𝑦ത)²]
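Comparing this with the slope formula shows 𝛽መ1 = 𝜌 · (𝑠𝑦 / 𝑠𝑥), where 𝑠𝑥 and 𝑠𝑦 are the sample standard deviations. A numerical sketch with simulated data (the parameters 3.0 and 1.2 are made up):

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.normal(0, 2, 200)
y = 3.0 + 1.2 * x + rng.normal(0, 1, 200)

# Slope via the formula derived in this lecture
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)

# Slope via correlation times ratio of sample standard deviations
rho = np.corrcoef(x, y)[0, 1]
b1_via_rho = rho * y.std(ddof=1) / x.std(ddof=1)

print(b1, b1_via_rho)   # equal up to floating-point error
```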