Topic 07 - OLS

FINANCE 361 – Topic 7 – OLS
Paul Geertsema
FINANCE 361 Class Notes – University of Auckland – Copyright (C) Dr Paul Geertsema
Contents
1 Readings
2 What are we doing today
3 Ordinary least squares (OLS) regression
4 OLS Intuition
5 OLS assumptions (all 5)
6 OLS “facts”
7 Coefficients
8 Economic significance
9 Statistical significance
10 t-Statistic and R-square
11 Correlation is not causation
12 Using OLS regression to do portfolio optimisation
1 Readings
• These notes
2 What are we doing today
• An introduction to Ordinary Least Squares regression

– https://en.wikipedia.org/wiki/Ordinary_least_squares
• Use OLS regression to estimate CAPM betas using real data (in
Excel)
• Use OLS regressions to calculate Markowitz optimal portfolios
3 Ordinary least squares (OLS) regression
• Notation
– $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$
• Nomenclature (the names of things)
– Observations
◦ Set of values that logically belongs together
◦ For instance, because they have all been “observed” at the
same time
◦ Indicated by subscript t
– Dependent variable (y)
◦ the variable you are seeking to explain
◦ conventionally on the left hand side of the regression equation
– Independent variable(s) ($x_1, \dots, x_N$)
◦ the variables you are using to explain the dependent variable
◦ conventionally on the right hand side of the regression equation
◦ aka explanatory variables
– White noise (ε)
◦ randomness that is uncorrelated with the variables in the regression and with its own values at other times
◦ constant variance
◦ often denoted by ε
– Predicted value
◦ After estimating coefficients, the value obtained by multiplying each independent variable by its estimated coefficient and adding these up, together with the estimated constant
– Residual
◦ The difference between the dependent variable and the predicted value
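The nomenclature above can be made concrete with a minimal sketch. It assumes NumPy and uses simulated data; the coefficient values (1.0, 2.0, -0.5), the sample size and the variable names are illustrative assumptions, not part of the course material.

```python
import numpy as np

# Simulated data (assumption): two independent variables plus white noise.
rng = np.random.default_rng(42)
T = 200                                  # number of observations
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
eps = rng.normal(scale=0.5, size=T)      # white noise
y = 1.0 + 2.0 * x1 - 0.5 * x2 + eps      # dependent variable

# Design matrix: a column of ones for the constant, then x1 and x2.
X = np.column_stack([np.ones(T), x1, x2])

# Estimate the coefficients by least squares.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

predicted = X @ coef          # predicted value
residual = y - predicted      # residual = dependent variable - predicted value

print("estimated alpha, beta1, beta2:", coef)
print("mean residual:", residual.mean())
```

With a constant in the regression, the residuals average to zero and are uncorrelated with each independent variable in the sample.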
4 OLS Intuition
• The “best fit line” through a collection of points

• Minimises the sum of squared errors
– error = actual value of dependent variable - predicted value (the residual)
– hence “least squares”
• Just one of many different ways to look at data, but very common
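A small sketch of the "least squares" idea, assuming NumPy and simulated data: the fitted line has a smaller sum of squared errors than nearby lines with slightly different intercepts or slopes. The helper `sse` and the perturbation size 0.2 are illustrative choices.

```python
import numpy as np

# Simulated data (assumption): one explanatory variable plus noise.
rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.5 + 1.5 * x + rng.normal(size=100)

def sse(alpha, beta):
    """Sum of squared errors for the line alpha + beta * x."""
    return float(np.sum((y - (alpha + beta * x)) ** 2))

# np.polyfit with degree 1 returns [slope, intercept].
beta_ols, alpha_ols = np.polyfit(x, y, 1)
sse_ols = sse(alpha_ols, beta_ols)

# Sum of squared errors for eight nearby lines.
sse_others = [
    sse(alpha_ols + da, beta_ols + db)
    for da in (-0.2, 0.0, 0.2)
    for db in (-0.2, 0.0, 0.2)
    if (da, db) != (0.0, 0.0)
]

print("SSE of OLS line:", sse_ols)
print("smallest SSE among nearby lines:", min(sse_others))
```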
5 OLS assumptions (all 5)
1. Dependent variable (y) is a linear function of independent variables ($x_1, \dots, x_N$) plus white noise
(a) In other words, $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$
2. Expected value of white noise is zero
(a) $E[\varepsilon_t] = 0$
3. White noise realisations are 1) uncorrelated and 2) drawn from distributions with constant variance
(a) $E[\varepsilon_t \varepsilon_{t+j}] = 0$ and $E[\varepsilon_t^2] = \sigma^2$ for all $t$ and for $j \neq 0$
4. Independent variables are measured without error (that is if you “measure them
again”, you get the same value)
5. No exact (or in practice, close to exact) linear relationship between any sub-set of independent variables
(a) No $x_k = \gamma_j x_j + \dots + \gamma_m x_m$
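To illustrate assumption 5, the sketch below (NumPy, simulated data) builds a third variable as an exact linear combination of two others. The design matrix then has fewer independent columns than coefficients to estimate, so the coefficients cannot be pinned down uniquely. The weights 2.0 and -3.0 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
x3 = 2.0 * x1 - 3.0 * x2        # exact linear combination of x1 and x2

# Constant plus two independent variables: fine.
X_ok = np.column_stack([np.ones(T), x1, x2])
# Adding x3 violates assumption 5.
X_bad = np.column_stack([np.ones(T), x1, x2, x3])

rank_ok = np.linalg.matrix_rank(X_ok)
rank_bad = np.linalg.matrix_rank(X_bad)

print("columns / rank without x3:", X_ok.shape[1], rank_ok)
print("columns / rank with x3:   ", X_bad.shape[1], rank_bad)
```

In the second case the rank stays below the number of columns, which is how an exact linear relationship shows up numerically.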
6 OLS “facts”
• If the OLS assumptions hold, then OLS is

– Computationally “cheap” - just some matrix calculations
– Least squares - by design
– Highest R-square - follows from least squares
– Unbiased - so $\hat{\beta}$ is an unbiased estimator of the true β
– Best linear unbiased estimator (compared to all possible linear
unbiased estimators)
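Unbiasedness can be illustrated by simulation. The sketch below assumes NumPy and a data-generating process that satisfies the OLS assumptions, with a known true beta of 2.0 (an arbitrary choice). Individual estimates vary from sample to sample, but their average should sit close to the true value.

```python
import numpy as np

rng = np.random.default_rng(3)
true_alpha, true_beta = 0.5, 2.0     # assumed "true" parameters
T, n_sims = 50, 2000                 # observations per sample, number of samples

estimates = np.empty(n_sims)
for i in range(n_sims):
    x = rng.normal(size=T)
    y = true_alpha + true_beta * x + rng.normal(size=T)
    # np.polyfit with degree 1 returns [slope, intercept]; keep the slope.
    estimates[i] = np.polyfit(x, y, 1)[0]

mean_estimate = estimates.mean()
print("true beta:", true_beta)
print("average of estimated betas:", mean_estimate)
print("spread (std dev) of estimated betas:", estimates.std())
```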
7 Coefficients
• $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$
• the coefficients are $\beta_1, \dots, \beta_N$
• special case: N = 1
– Only one explanatory / independent variable plus a constant
– Then $\beta_1 = \mathrm{COV}[x, y] / \mathrm{VAR}[x]$
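The special case N = 1 can be checked numerically. This sketch assumes NumPy and simulated data; it compares the covariance-over-variance formula with the slope from a least squares fit.

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=250)
y = 0.1 + 1.2 * x + rng.normal(scale=0.3, size=250)

# Slope from the formula: covariance of x and y divided by variance of x.
# np.cov uses the N-1 divisor by default, so use ddof=1 for the variance too.
beta_cov = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

# Slope from a least squares fit of y on a constant and x.
beta_ols = np.polyfit(x, y, 1)[0]

print("beta from COV/VAR:", beta_cov)
print("beta from OLS fit:", beta_ols)
```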
8 Economic significance
• Is the coefficient big enough to make a real difference?

– In cold hard cash terms ...
• For instance, we run the regression
– $r_t - r_f = \alpha + \beta (r_{M,t} - r_f) + \varepsilon_t$
– to estimate the CAPM beta
– and get β = 0.000004023
– Not economically significant, since even if the market risk premium doubles from 5% to 10%, the impact on the expected return for this firm is
– 0.000004023 × 5% = 0.00000020115, or less than 1 ten thousandth of 1%.
– It may even be statistically significant! (see below)
– But if it is not economically significant, it isn’t relevant in a
practical sense
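The arithmetic in the example above, written out as a short Python check. The variable names are illustrative.

```python
beta = 0.000004023

# Market risk premium doubles from 5% to 10%: a change of 5 percentage points.
premium_change = 0.10 - 0.05

# Change in the firm's expected return, as a decimal and in percent.
impact = beta * premium_change
impact_in_percent = impact * 100

print(f"impact as a decimal: {impact:.11f}")
print(f"impact in percent:   {impact_in_percent:.9f}%")

# One ten thousandth of 1% is 0.0001%.
print("less than 0.0001%?", impact_in_percent < 0.0001)
```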
9 Statistical significance
• To calculate statistical significance in the OLS context, we need one more assumption
• ASSUMPTION 6: White noise error term (ε) is normally distributed
• “Statistical significance”: real meaning – an estimate this far from zero would be unlikely if the true coefficient were zero
• “Unlikely” = less than a 5% chance (by convention)
• This chance, calculated on the assumption that the true coefficient is zero, is the p-value
• Even so, 1-in-20 times we should expect to declare something “statistically significant” when in fact it is just randomness
• Measured by t-statistic or p-value
• Warning: Data-mining. If you run 1 million regressions of y = α + βx + ε where both x and y are independent random variables you will get approximately 50,000 regressions where the β estimate is significant at a 5% level. But this is nonsense, since you know that the true β is zero (x and y are independent!).
– If you torture the data enough, it will confess to anything.
– aka “snooping”, “p-hacking”
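The data-mining warning can be reproduced on a smaller scale. The sketch below assumes NumPy, uses 2,000 regressions instead of 1 million to keep the run time short, and computes the t-statistic with the standard univariate OLS formulas. Roughly 5% of the regressions should come out "significant" even though x and y are independent by construction.

```python
import numpy as np

rng = np.random.default_rng(5)
n_regressions, T = 2000, 100     # scaled-down version of the 1 million example

significant = 0
for _ in range(n_regressions):
    x = rng.normal(size=T)       # x and y are independent by construction
    y = rng.normal(size=T)

    # Univariate OLS: slope, intercept, residuals.
    xc = x - x.mean()
    beta_hat = (xc @ (y - y.mean())) / (xc @ xc)
    alpha_hat = y.mean() - beta_hat * x.mean()
    resid = y - alpha_hat - beta_hat * x

    # Standard error of the slope and its t-statistic.
    s2 = (resid @ resid) / (T - 2)
    se = np.sqrt(s2 / (xc @ xc))
    if abs(beta_hat / se) >= 1.96:
        significant += 1

share = significant / n_regressions
print("share of 'significant' betas:", share)
```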
10 t-Statistic and R-square
• t-statistic
– t-statistic = (Estimated coefficient) / (Standard error) = $\hat{\beta} / SE_{\hat{\beta}}$
– Standard error: $SE_{\hat{\beta}} = \sigma_{\hat{\beta}} / \sqrt{N}$, or Standard deviation of estimated coefficient / ((number of observations)^0.5)
– Abs(t-statistic) >= 1.96 implies p-value < 5%
– Or: if the true coefficient were zero, a t-statistic this large would occur less than 5% of the time
• R-Square measure
– Reported when running OLS regressions
– Roughly, it is the % of variation in the dependent variable (y)
that is explained by the independent variables (x1, ..., xN )
– R-square will never decrease (and will almost always increase) if you add more independent variables to a regression, even if the variable you are adding is clearly nonsense (like random numbers)
– If it is a univariate regression (only 1 explanatory variable) then $R^2_{y|x} = R^2_{x|y} = (\rho_{xy})^2$, where $R^2_{y|x}$ is the R-square obtained from regressing y on x and $\rho_{xy}$ is Pearson’s correlation coefficient between x and y.
• “Adjusted” R-square makes an adjustment for the number of explanatory variables.
• WARNING: Don’t regress a variable with a trend on other variables with a trend
– Result: Spurious regression - high t-stats, high R-squares, but total nonsense
– See next section
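The R-square properties in this section can be checked with a short sketch. It assumes NumPy and simulated data; the helper `r_squared` is illustrative, and the adjusted R-square uses the common formula 1 - (1 - R²)(T - 1)/(T - k - 1) with k explanatory variables, which is an assumption about the adjustment meant here.

```python
import numpy as np

def r_squared(X, y):
    """R-square from regressing y on the columns of X (X includes a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def adjusted(r2, T, k):
    """Adjusted R-square for T observations and k explanatory variables."""
    return 1.0 - (1.0 - r2) * (T - 1) / (T - k - 1)

rng = np.random.default_rng(6)
T = 120
x = rng.normal(size=T)
y = 0.2 + 0.8 * x + rng.normal(size=T)
const = np.ones(T)

# Univariate case: R-square is the squared correlation, in either direction.
r2_y_on_x = r_squared(np.column_stack([const, x]), y)
r2_x_on_y = r_squared(np.column_stack([const, y]), x)
rho = np.corrcoef(x, y)[0, 1]

# Add three columns of pure random numbers as extra "explanatory" variables.
junk = rng.normal(size=(T, 3))
r2_with_junk = r_squared(np.column_stack([const, x, junk]), y)

print("R2 (y on x), R2 (x on y), rho^2:", r2_y_on_x, r2_x_on_y, rho ** 2)
print("R2 with junk regressors:", r2_with_junk)
print("adjusted R2 without / with junk:",
      adjusted(r2_y_on_x, T, 1), adjusted(r2_with_junk, T, 4))
```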
11 Correlation is not causation
• OLS regression simply shows the degree of association between variables. It does not show that one variable causes another
• Example:
– variable $x_t$ – total population of storks in Sweden at year t
– variable $y_t$ – births in Sweden in year t
– run y = α + βx + ε
– Find: β = 0.8 and highly statistically significant (t-stat > 4)
• Implication?
• What is really going on

– Both the population of storks and births are declining over the
sample period
– but for different reasons
◦ Storks - habitat loss
◦ Birth rate - urbanisation and lifestyle changes
– Since both variables trend, they have a high covariance and
hence beta
• How to fix it
– Remove the trend
– For instance, take differences so that you run
– ∆y = α + β∆x + ε
– Even better, lag the change in stork population
– $\Delta y_t = \alpha + \beta \Delta x_{t-1} + \varepsilon_t$
• Is this a problem when estimating CAPM beta? Motivate your answer.
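The trend problem and the differencing fix can be sketched with simulated data. The two series below are hypothetical stand-ins (not actual stork or birth figures): each declines over time for its own reason, with independent noise. The helper `slope_and_t` uses the standard univariate OLS formulas.

```python
import numpy as np

def slope_and_t(x, y):
    """Slope and t-statistic from regressing y on a constant and x."""
    n = len(x)
    xc = x - x.mean()
    beta = (xc @ (y - y.mean())) / (xc @ xc)
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x
    se = np.sqrt((resid @ resid) / (n - 2) / (xc @ xc))
    return beta, beta / se

rng = np.random.default_rng(7)
t = np.arange(100)

# Hypothetical series: both trend downwards, noise terms are independent.
x = 1000.0 - 1.0 * t + rng.normal(scale=5.0, size=100)   # stand-in for storks
y = 2000.0 - 2.0 * t + rng.normal(scale=5.0, size=100)   # stand-in for births

# Regression in levels: the shared trend drives the result.
beta_levels, t_levels = slope_and_t(x, y)

# Regression in differences: the trend is removed.
beta_diff, t_diff = slope_and_t(np.diff(x), np.diff(y))

print("levels:      beta =", beta_levels, " t-stat =", t_levels)
print("differences: beta =", beta_diff, " t-stat =", t_diff)
```

The levels regression reports a very large t-statistic even though the two series are unrelated apart from the trend; after differencing, the t-statistic is far smaller.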
12 Using OLS regression to do portfolio optimisation
• Britten-Jones (1999), “The Sampling Error in Estimates of Mean-Variance Efficient Portfolio Weights”, Journal of Finance demonstrates a close equivalence between Markowitz portfolio optimisation and OLS regression.
• Specifically:
– Regress a constant (= 1) on the excess returns of N stocks, without an intercept (this bit is important).
◦ Specification: $1 = \beta_1 r_{1,t} + \dots + \beta_N r_{N,t} + \varepsilon_t$
– Collect the coefficients on each of the stocks in vector β.
– Rescale β so that the sum of the elements in β is 1. (So $\beta^{\star} = \beta / (\mathbf{1}'\beta)$.)
– The re-scaled coefficients are the Markowitz optimal portfolio
weights!
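The steps above can be sketched as follows, assuming NumPy and simulated excess returns for three hypothetical stocks (the means and volatilities are made up for illustration). For comparison, the sketch also computes weights proportional to the inverse sample covariance matrix times the sample mean excess returns, rescaled to sum to 1, which is the usual expression for the tangency portfolio; the two sets of weights should agree.

```python
import numpy as np

# Simulated excess returns (assumption): T periods, N = 3 hypothetical stocks.
rng = np.random.default_rng(0)
T, N = 500, 3
mu = np.array([0.010, 0.008, 0.012])
R = mu + rng.normal(scale=[0.05, 0.04, 0.06], size=(T, N))

# Step 1: regress a constant (= 1) on the excess returns, without an intercept.
ones = np.ones(T)
b, *_ = np.linalg.lstsq(R, ones, rcond=None)

# Step 2: rescale the coefficients so they sum to 1.
w_ols = b / b.sum()

# Comparison: weights proportional to inverse covariance times mean excess return.
mu_hat = R.mean(axis=0)
sigma_hat = np.cov(R, rowvar=False)
raw = np.linalg.solve(sigma_hat, mu_hat)
w_mv = raw / raw.sum()

print("weights from OLS regression:  ", w_ols)
print("weights from mean-variance:   ", w_mv)
```

Leaving out the intercept matters: `R` is passed to the regression without a column of ones, so the constant appears only on the left hand side.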
