Econometrics Lecture 1


THE NATURE OF ECONOMETRICS
AND REGRESSION ANALYSIS

LECTURE 1
Chapter 0

Introduction

What is Econometrics?

...the social science in which the tools of
economic theory, mathematics, and
statistical inference are applied to the
analysis of economic phenomena.
1. Economic Theory:
   "Keynes postulated a positive relationship between consumption
   and income."

2. Mathematical Expression:
   Consumption = f(Income)  ==>  C = f(Y)
   MPC = dC/dY = f'(Y) > 0;  assume 0 < MPC < 1

3. Statistics:
   Year    C        Y
   1980    2447.1   3776.3      Find the mean, variance,
   1981    2476.9   3841.1      standard deviation,
   ....    ....     ....        correlation, etc.

4. Econometrics - Regression model:
   C_t = b1 + b2 Y_t + u_t  ==>  ΔC/ΔY = b2
   ==> estimating the relationship
The Role of Econometrics

Provide measurement and quantitative
analysis of actual economic phenomena
or economic relationships, based on:

1. Economic theory
2. Economic data
3. Methods of model construction
Economic Relationships:

[Diagram: a web of interrelated economic variables: stock market index,
money supply, government budget, interest rate, exchange rate, inflation,
trade deficit, property market, unemployment, wage, capital gains tax,
rent control laws, crime rate.]
Economic Decisions

To use information effectively:

economic theory + economic data  ==>  economic decisions

*Econometrics* helps us combine
economic theory and economic data.
How much?

Listing the variables in an economic relationship is not enough.

For effective policy we must know the amount of change


needed for a policy instrument to bring about the desired
effect:

• By how much should the Federal Reserve


raise interest rates to prevent inflation?

• By how much can the price of football tickets


be increased and still fill the stadium?

Methodology of Econometrics
• Statement of theory or hypothesis
• Specification of the mathematical model of the
theory
• Specification of the econometric model of the
theory
• Obtaining data for the analysis
• Estimation with statistical properties
• Hypothesis testing
• Analyze and evaluate implications of the results
• Forecasting or prediction
• Using the model for control or policy purpose

Example: The Consumption Function

Consumption, C, is some function of income, inc :

C = f(inc)

For applied econometric analysis


this consumption function must be
specified more precisely.
Economic Empirical Study

Economic theory, past experience, and previous studies:
C = f(Inc)  ==>

Formulating a model:  C_t = b1 + b2 Inc_t + u_t

Gathering data: monthly, quarterly, or yearly statistics

Estimating the model: simple OLS method or more advanced methods

Testing the hypothesis: e.g., H0: b2 > 0 (is the relationship
positive or not?)

Interpreting the results

Forecasting, policy implications, and decisions
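The steps above can be sketched in a few lines of Python. The first two (C, Y) pairs echo the table earlier in the lecture; the remaining rows are hypothetical, added only so the regression has enough data.

```python
# Illustrative sketch of the workflow: formulate, estimate, forecast.
# First two (C, Y) pairs are from the slides; the rest are hypothetical.
income      = [3776.3, 3841.1, 3950.0, 4100.5, 4280.8, 4410.2]   # Y_t
consumption = [2447.1, 2476.9, 2550.4, 2652.3, 2775.0, 2860.1]   # C_t

n = len(income)
ybar = sum(income) / n
cbar = sum(consumption) / n

# OLS estimates for C_t = b1 + b2*Y_t + u_t
b2 = sum((y - ybar) * (c - cbar) for y, c in zip(income, consumption)) \
     / sum((y - ybar) ** 2 for y in income)
b1 = cbar - b2 * ybar

print(f"estimated MPC (b2) = {b2:.3f}")        # should lie between 0 and 1
print(f"forecast C at Y = 4500: {b1 + b2 * 4500:.1f}")
```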
Chapter 1

The Nature of Regression Analysis
Regression Analysis

...the study of the dependence of one variable
(called the dependent variable) on one or more
other variables (called independent variables),
with a view to estimating or predicting
the mean or average value of the former in terms of
the known or fixed values of the latter.

Purpose of Regression Analysis

• 1. Estimate a relationship among


economic variables, such as Y = f(X).

• 2. Forecast or predict the value of


one variable, Y, based on the value of
another variable, X.

Terminology and Notation
Y = b1 + b2 X + u

Left hand-side Right hand-side


Variable: Variable:

Dependent Explanatory
Explained Independent
Predictand Predictor
Regressand Regressor
Response Stimulus or control
Endogenous Exogenous
Some Examples
• Consumption on Income
• GDP on Export, Money Supply
• Sales on Season, Advertising
Expenditures
• Income on Gender, Education, Years
of Training

Some Comparisons

• Statistical vs. Deterministic Relationships


• Regression vs. Causation
• Regression vs. Correlation

Data

• Time-series Data
• Cross-Sectional Data
• Pooled or Panel Data

Time series data

Cross-section data and Pool (Panel) data

Chapter 2

Two-Variable Regression Analysis: Some Basic Ideas
Population Regression Function

Weekly Food Expenditures

Y = dollars spent each week on food items
X = consumer's family weekly income

The relationship between X and the expected value
of Y, given X, might be linear:

E(Y|Xi) = f(Xi) = b1 + b2 Xi

This means that each conditional mean E(Y|Xi) is a
function of Xi; this equation is known as
the population regression function (PRF).
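A minimal sketch of such conditional means. The expenditure lists are hypothetical, chosen so the conditional means come out at 65, 101, and 149.

```python
# Hypothetical weekly food expenditures (Y) observed at three fixed
# income levels (X); values are made up so the conditional means are
# 65, 101, and 149.
data = {
    80:  [55, 60, 65, 70, 75],
    140: [85, 95, 101, 107, 117],
    220: [135, 143, 149, 155, 163],
}

for x, ys in data.items():
    print(f"E(Y|X={x}) = {sum(ys) / len(ys):.1f}")
```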
[Figure: scatter of Y against X with the population regression line
(PRF) through the conditional means E(Y|Xi) (65 at X = 80, 101 at
X = 140, 149 at X = 220), and the distribution of Y shown at X = 220.]
[Figure: average consumption E(Y|x) plotted against income X, with the
line E(Y|x) = b1 + b2x; b1 is the intercept and the slope is
b2 = ΔE(Y|X) / ΔX.]

The Econometric Model: a linear relationship
between average consumption and income.
Linearity

• Linear in variables
• Linear in Parameters

Stochastic Specification of the PRF

Given any income level Xi, a family's consumption is
clustered around the average of all families at that Xi, that
is, around its conditional expectation, E(Y|Xi).

The deviation of any individual Yi is:

ui = Yi - E(Y|Xi)

or   Yi = E(Y|Xi) + ui

or   Yi = b1 + b2 Xi + ui      (ui is the stochastic error or
                                stochastic disturbance)
The Error Term

Y is a random variable composed of two parts:

I. Systematic component:  E(Y) = b1 + b2X
   This is the mean of Y.

II. Random component:  u = Y - E(Y)
                         = Y - b1 - b2X
    This is called the random or stochastic error.

Together E(Y) and u form the model:

Y = b1 + b2X + u
For example:
given X = $80, the individual consumption values are

Y1 = 55 = b1 + b2(80) + u1
Y2 = 60 = b1 + b2(80) + u2
Y3 = 65 = b1 + b2(80) + u3
Y4 = 70 = b1 + b2(80) + u4
Y5 = 75 = b1 + b2(80) + u5

Estimated average:

Ŷ1 = 65 = b̂1 + b̂2(80)
Ŷ2 = 65 = b̂1 + b̂2(80)
Ŷ3 = 65 = b̂1 + b̂2(80)
Ŷ4 = 65 = b̂1 + b̂2(80)
Ŷ5 = 65 = b̂1 + b̂2(80)
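The deviations in this example can be checked directly; this sketch recomputes the conditional mean at X = 80 and the five deviations u_i.

```python
# Recompute the conditional mean at X = 80 and the deviations u_i for
# the five individual consumption values from the example.
ys = [55, 60, 65, 70, 75]
mean = sum(ys) / len(ys)              # E(Y|X=80) = 65.0
us = [y - mean for y in ys]           # u_i = Y_i - E(Y|X=80)
print(us)                             # [-10.0, -5.0, 0.0, 5.0, 10.0]
print(sum(us))                        # deviations sum to zero
```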
The reasons for stochastic disturbance

• Vagueness of theory
• Unavailability of data
• Direct effect vs indirect effect
• (Core variables vs peripheral variables)
• Intrinsic randomness in human behaviour
• Poor proxy variables
• Principle of parsimony
• Wrong functional form
Unobservable Nature of the Error Term

• Unspecified factors or explanatory variables
  not in the model may be in the error term.
  For example: a final exam score depends not only on
  classes attended but also on other unobserved factors such as
  student ability, maths background, hard work and effort, etc.

• Approximation error is in the error term if the
  relationship between Y and X is not exactly linear.

• Strictly unpredictable random behavior that
  may be unique to that observation is in the error.
[Figure: the sample regression function (SRF), Ŷ = b̂1 + b̂2x, and the
population regression function (PRF), E(Y|x) = b1 + b2x, with an
observation Y2 showing its error u2 about the PRF and its residual û2
about the SRF.]

The relationship among Yi, ui, and the true regression line.
The Sample Regression Function (SRF)

[Figure: two sample regression functions, SRF1 and SRF2, fitted from
different samples, with the residuals û1, ..., û4 of the observations
about the fitted line.]
Different samples will have different SRFs.

SRF:
Ŷi = b̂1 + b̂2 Xi
or  Yi = b̂1 + b̂2 Xi + ûi        (ûi is the residual)
or  Yi = b1 + b2 Xi + ei

PRF:
E(Y|X) = b1 + b2 Xi
Yi = b1 + b2 Xi + ui             (ui is the error term or disturbance)

Ŷi = estimator of E(Y|Xi)
b̂i (or bi) = estimator of bi
Chapter 3

Two-Variable Regression Model: The Problem of Estimation
Ordinary Least Squares (OLS) Method

Yi = b1 + b2 Xi + ui
ûi = Yi - b̂1 - b̂2 Xi

Minimize the sum of squared residuals:

Σ ûi² = Σ (Yi - b̂1 - b̂2 Xi)²  =  f(b̂1, b̂2),   summing over i = 1, ..., n
Minimize with respect to b̂1 and b̂2:

f(b̂1, b̂2) = Σ (Yi - b̂1 - b̂2 Xi)² = f(.)

∂f(.)/∂b̂1 = -2 Σ (Yi - b̂1 - b̂2 Xi)

∂f(.)/∂b̂2 = -2 Σ Xi (Yi - b̂1 - b̂2 Xi)

Set each of these two derivatives equal to zero and
solve the two equations for the two unknowns, b̂1 and b̂2.
To minimize f(.), set the two derivatives equal to zero:

∂f(.)/∂b̂1 = -2 Σ (Yi - b̂1 - b̂2 Xi) = 0

∂f(.)/∂b̂2 = -2 Σ Xi (Yi - b̂1 - b̂2 Xi) = 0
-2 Σ (Yi - b̂1 - b̂2 Xi) = 0

-2 Σ Xi (Yi - b̂1 - b̂2 Xi) = 0

Σ Yi - n b̂1 - b̂2 Σ Xi = 0

Σ Xi Yi - b̂1 Σ Xi - b̂2 Σ Xi² = 0

The normal equations:

n b̂1 + b̂2 Σ Xi = Σ Yi
b̂1 Σ Xi + b̂2 Σ Xi² = Σ Xi Yi

Solving the two equations for the two unknowns:

b̂2 = [n Σ Xi Yi - Σ Xi Σ Yi] / [n Σ Xi² - (Σ Xi)²]

   = Σ(Xi - X̄)(Yi - Ȳ) / Σ(Xi - X̄)²  =  Σ xi yi / Σ xi²

b̂1 = Ȳ - b̂2 X̄

(lowercase xi, yi denote deviations from the sample means)
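The closed-form solution above translates directly into code; a minimal sketch with hypothetical (X, Y) data:

```python
# A minimal implementation of the closed-form OLS solution,
# using hypothetical (X, Y) data.
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]

n = len(X)
xbar = sum(X) / n
ybar = sum(Y) / n

# b2 = sum((Xi - xbar)(Yi - ybar)) / sum((Xi - xbar)^2);  b1 = ybar - b2*xbar
b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) \
     / sum((x - xbar) ** 2 for x in X)
b1 = ybar - b2 * xbar
print(f"b1 = {b1:.4f}, b2 = {b2:.4f}")   # b1 = 24.4545, b2 = 0.5091
```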
[Figure: observations Y1, ..., Y4 with the OLS line Ŷ = b̂1 + b̂2X and
an alternative line Ŷ* = b̂*1 + b̂*2X, showing the residuals
û*1, ..., û*4 about the alternative line.]

Why is the SRF the best one?

The sum of squared residuals from any other line Ŷ* is larger.
The Classical Regression Model:
Assumptions of the LS Method

1. The linear regression model: linear in parameters,
   Y = b1 + b2X + u

2. X values are fixed in repeated sampling
   (X is nonstochastic).

3. Zero mean value of the error term (disturbance) ui:
   E(ui | xi) = 0

4. Homoscedasticity, or equal variance of ui: the
   conditional variances of ui are identical, i.e.,
   var(ui | xi) = σ²
Homoscedasticity Case

[Figure: the probability density function f(Yi) of expenditure Yi at
two levels of family income, x1 = 80 and x2 = 100.]

The probability density function for Yi at two
levels of family income, Xi, is identical.
Heteroscedasticity Case

[Figure: the probability density function f(Yi) of expenditure Yi at
income levels x1, x2, x3, spreading out as income rises.]

The variance of Yi increases as family income,
xi, increases.
Assumptions (continued)

5. No autocorrelation between the disturbances:
   cov(ui, uj | xi, xj) = 0   for i ≠ j

6. Zero covariance between ui and xi, i.e.,
   cov(ui, xi) = E(ui xi) = 0

7. The number of observations (n) must be greater
   than the number of parameters (k) to be estimated:
   n > k
Assumptions (continued)

8. Variability in X values: the X values in a
   given sample must not all be the same;
   at least two must differ.

9. No specification bias or error: the regression
   model is correctly specified.

10. No perfect multicollinearity: no perfect
    linear relationship among the independent
    variables, i.e.,
    Xk ≠ λ Xm
One more assumption that is often used in
practice but is not required for least squares:

• (Optional) The values of Y are
  normally distributed about their
  mean for each value of X:

  Y ~ N[(b1 + b2X), σ²]
The Error Term Assumptions

• 1. The value of Y, for each value of X, is:
  Y = b1 + b2X + u
• 2. The average value of the random error u is:
  E(u) = 0
• 3. The variance of the random error u is:
  var(u) = σ² = var(Y)
• 4. The covariance between any pair of u's is:
  cov(ui, uj) = cov(Yi, Yj) = 0
• 5. u is normally distributed with mean 0 and var(u) = σ²:
  u ~ N(0, σ²)
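Assumptions 2 and 3 in the list above can be illustrated by simulation; σ = 3 is an arbitrary choice for this sketch.

```python
import random

# Simulation sketch of error-term assumptions 2 and 3: draw errors
# u ~ N(0, sigma^2) and check that the sample mean is near 0 and the
# sample variance is near sigma^2 (sigma = 3 is an arbitrary choice).
random.seed(42)
sigma = 3.0
u = [random.gauss(0, sigma) for _ in range(100_000)]

mean_u = sum(u) / len(u)
var_u = sum(e * e for e in u) / len(u) - mean_u ** 2
print(f"mean(u) = {mean_u:.4f}  (theory: 0)")
print(f"var(u)  = {var_u:.4f}  (theory: {sigma ** 2})")
```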
Simple Linear Regression Model

Yi = b1 + b2 Xi + ui

Yi = family weekly food expenditures
Xi = family weekly income

For a given level of Xi, the expected
level of food expenditures is:

E(Yi | Xi) = b1 + b2 Xi
The population parameters b1 and b2
are unknown population constants.

The formulas that produce the
sample estimates b̂1 and b̂2 are
called the estimators of b1 and b2.

When b̂1 and b̂2 are used to represent
the formulas rather than specific values,
they are called estimators of b1 and b2;
they are random variables because
they differ from sample to sample.
Estimators are Random Variables

• If the least squares estimators b̂1 and b̂2
  are random variables, then what are their
  means, variances, covariances, and
  probability distributions?

• Compare the properties of alternative
  estimators to the properties of the
  least squares estimators.
Precision or Standard Errors of the LS Estimates

Given that both Yi and ui have variance σ², the
variance of the estimator b̂2 is:

var(b̂2) = σ² / Σ(Xi - X̄)²  =  σ² / Σ xi²

b̂2 is a function of the Yi values, but
var(b̂2) does not involve Yi directly.
Variance of b̂1

Given b̂1 = Ȳ - b̂2 X̄, the
variance of the estimator b̂1 is:

var(b̂1) = σ² Σ Xi² / [n Σ(Xi - X̄)²]  =  σ² Σ Xi² / (n Σ xi²)
Covariance of b̂1 and b̂2

cov(b̂1, b̂2) = -X̄ σ² / Σ(Xi - X̄)²  =  -X̄ σ² / Σ xi²
Estimating the Variance of the Error Term, σ²

ûi = Yi - b̂1 - b̂2 Xi

σ̂² = Σ ûi² / (n - 2)

σ̂² is an unbiased estimator of σ².
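The formulas above for σ̂² and the standard errors can be sketched directly, again with hypothetical (X, Y) data:

```python
# Residual variance sigma2_hat = sum(u^2)/(n-2) and the standard errors
# of b1 and b2, computed on hypothetical (X, Y) data.
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
Sxx = sum((x - xbar) ** 2 for x in X)            # sum of squared deviations
b2 = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / Sxx
b1 = ybar - b2 * xbar

rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
sigma2_hat = rss / (n - 2)                       # unbiased estimator of sigma^2
se_b2 = (sigma2_hat / Sxx) ** 0.5
se_b1 = (sigma2_hat * sum(x * x for x in X) / (n * Sxx)) ** 0.5
print(f"sigma2_hat = {sigma2_hat:.4f}, se(b1) = {se_b1:.4f}, se(b2) = {se_b2:.4f}")
```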
Properties of the LS Estimators:
The Gauss-Markov Theorem

Given the assumptions of the classical linear
regression model, the ordinary least
squares (OLS) estimators b̂1 and b̂2
are the best linear unbiased estimators
(BLUE) of b1 and b2. This means that b̂1
and b̂2 have the smallest variance of all
linear unbiased estimators of b1 and b2.

Note: the Gauss-Markov theorem does not
apply to non-linear estimators.
The Coefficient of Determination, R² -
A Measure of "Goodness of Fit"


(Yi - Ȳ) = (Yi - Ŷi) + (Ŷi - Ȳ),   where ûi = Yi - Ŷi

To measure variation, square both sides and sum over the sample:

Σ(Yi - Ȳ)² = Σ[(Yi - Ŷi) + (Ŷi - Ȳ)]²

Σ(Yi - Ȳ)² = Σ ûi² + Σ(Ŷi - Ȳ)²     (the cross term sums to zero)

   TSS     =   RSS  +    ESS

Total sum of squares = residual sum of squares + explained sum of squares
R² - Measure of "Goodness of Fit"

Define R² = ESS / TSS = 1 - RSS / TSS

          = 1 - Σ ûi² / Σ(Yi - Ȳ)²

R² = Σ(Ŷi - Ȳ)² / Σ(Yi - Ȳ)²,    0 ≤ R² ≤ 1
Alternative R² Expressions

R² = ESS / TSS = Σ(Ŷi - Ȳ)² / Σ(Yi - Ȳ)² = Σ ŷi² / Σ yi²

   = Σ(b̂2 xi)² / Σ yi² = b̂2² Σ xi² / Σ yi²

   = (Σ xi yi / Σ xi²)² (Σ xi² / Σ yi²) = (Σ xi yi)² / (Σ xi² Σ yi²)

(lowercase letters denote deviations from the sample means)
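The algebra above says the three expressions for R² must agree; a sketch with hypothetical data checks this numerically.

```python
# Numerical check that three equivalent R^2 formulas agree,
# using hypothetical (X, Y) data.
X = [80, 100, 120, 140, 160, 180, 200, 220, 240, 260]
Y = [70, 65, 90, 95, 110, 115, 120, 140, 155, 150]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n
Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))
Sxx = sum((x - xbar) ** 2 for x in X)
Syy = sum((y - ybar) ** 2 for y in Y)
b2 = Sxy / Sxx
b1 = ybar - b2 * xbar

tss = Syy
rss = sum((y - b1 - b2 * x) ** 2 for x, y in zip(X, Y))
ess = sum((b1 + b2 * x - ybar) ** 2 for x in X)

r2_a = ess / tss                 # ESS / TSS
r2_b = 1 - rss / tss             # 1 - RSS / TSS
r2_c = Sxy ** 2 / (Sxx * Syy)    # (sum xy)^2 / (sum x^2 * sum y^2)
print(f"{r2_a:.4f}  {r2_b:.4f}  {r2_c:.4f}")   # all three agree
```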
Explanation of R2
• R² tells us how well the sample regression
  line fits the data.
• The value of R² tells us what percentage of the
  variation in the dependent variable can be explained
  by the independent variable, i.e., by the regression model.
• Ex: R² = 0.8 means 80% of the variation in Y can be
  explained by X.

Which SRF?

[Figure: two scatter plots of Y against X with fitted lines. When
R² = 0, the SRF explains none of the variation in Y; when R² = 1,
the SRF goes through all the points.]
Coefficient of Correlation R

Numerical Example

• Manual Calculations
• Demonstration of using computer software
• Use the data in Table 3.8

Chapter 4

Classical Normal Linear Regression Model
Summary of BLUE Estimators

Mean:
E(b̂1) = b1  and  E(b̂2) = b2

Variance:
var(b̂1) = σ² Σ Xi² / (n Σ xi²)   and   var(b̂2) = σ² / Σ xi²

Standard error (standard deviation):
se(b̂k) = √var(b̂k)

(lowercase xi denotes deviations from the sample mean)
Estimated Error Variance

σ̂² = Σ ûi² / (n - 2),   with E(σ̂²) = σ²

Standard Error of the Regression (Estimate), SEE:

σ̂ = √σ̂² = √[Σ ûi² / (n - K)]

where K = the number of independent
variables plus the constant term
(here K = 2)
Normality Assumption for ui

ui ~ N(0, σ²)

Y ~ N[(b1 + b2X), σ²]
Probability Distribution
of Least Squares Estimators

b̂1 ~ N( b1,  σ² Σ Xi² / (n Σ xi²) )

b̂2 ~ N( b2,  σ² / Σ xi² )
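A Monte Carlo sketch of these sampling distributions; the true parameter values (b1 = 10, b2 = 0.5, σ = 2) and the X grid are assumptions for illustration.

```python
import random

# Monte Carlo sketch of the sampling distribution of the OLS slope:
# with normal errors, b2_hat ~ N(b2, sigma^2 / sum(xi^2)), where xi are
# deviations from the mean.  True values (b1 = 10, b2 = 0.5, sigma = 2)
# and the X grid are assumptions for illustration.
random.seed(1)
X = list(range(10, 110, 10))
n = len(X)
xbar = sum(X) / n
Sxx = sum((x - xbar) ** 2 for x in X)

estimates = []
for _ in range(5000):
    Y = [10 + 0.5 * x + random.gauss(0, 2) for x in X]
    ybar = sum(Y) / n
    estimates.append(sum((x - xbar) * (y - ybar) for x, y in zip(X, Y)) / Sxx)

mean_b2 = sum(estimates) / len(estimates)
var_b2 = sum((b - mean_b2) ** 2 for b in estimates) / len(estimates)
print(f"mean of b2_hat = {mean_b2:.4f}  (true b2 = 0.5)")
print(f"var of b2_hat = {var_b2:.6f}  (theory sigma^2/Sxx = {4 / Sxx:.6f})")
```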
THE END
