
FINANCE 361 – Topic 7 – OLS

Paul Geertsema

Contents

1 Readings
2 What are we doing today
3 Ordinary least squares (OLS) regression
4 OLS Intuition
5 OLS assumptions (all 5)
6 OLS “facts”
7 Coefficients
8 Economic significance
9 Statistical significance
10 t-Statistic and R-square
11 Correlation is not causation
12 Using OLS regression to do portfolio optimisation

FINANCE 361 Class Notes – University of Auckland – Copyright (C) Dr Paul Geertsema

1 Readings

• These notes

2 What are we doing today

• An introduction to Ordinary Least Squares regression
– https://en.wikipedia.org/wiki/Ordinary_least_squares
• Use OLS regression to estimate CAPM betas using real data (in Excel)
• Use OLS regressions to calculate Markowitz optimal portfolios

3 Ordinary least squares (OLS) regression

• Notation
– $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$
• Nomenclature (the names of things)
– Observations
◦ Set of values that logically belong together
◦ For instance, because they have all been “observed” at the same time
◦ Indicated by subscript t
– Dependent variable (y)
◦ the variable you are seeking to explain
◦ conventionally on the left hand side of the regression equation


– Independent variable(s) ($x_1, \dots, x_N$)
◦ the variables you are using to explain the dependent variable
◦ conventionally on the right hand side of the regression equation
◦ aka explanatory variables
– White noise (ε)
◦ randomness that is uncorrelated with the variables in the regression and with itself at other points in time
◦ constant variance
◦ often denoted by ε
– Predicted value
◦ After estimating the coefficients, the value obtained by multiplying each independent variable by its estimated coefficient and adding these up, together with the estimated intercept
– Residual
◦ The difference between the dependent variable and the predicted value
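
The nomenclature above can be sketched in a few lines of NumPy. This is only an illustration on simulated data: the sample size, the "true" coefficients (1.0, 2.0, -0.5) and the noise scale are made-up assumptions, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the sketch is repeatable
T = 200                                  # number of observations (illustrative)
X = rng.normal(size=(T, 2))              # two independent variables x1 and x2
eps = rng.normal(scale=0.5, size=T)      # white noise
y = 1.0 + 2.0 * X[:, 0] - 0.5 * X[:, 1] + eps    # dependent variable

A = np.column_stack([np.ones(T), X])     # the column of ones estimates alpha
coef, *_ = np.linalg.lstsq(A, y, rcond=None)     # [alpha, beta1, beta2]

predicted = A @ coef                     # predicted value for each observation
residual = y - predicted                 # residual = dependent variable - predicted
```

Because the regression includes a constant, the residuals sum to (numerically) zero and are uncorrelated with each independent variable in the sample.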
4 OLS Intuition

• The “best fit line” through a collection of points


• Minimises the sum of squared errors
– error = actual value of dependent variable - predicted value (i.e. the residual)
– hence “least squares”
• Just one of many different ways to look at data, but very common
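
A quick way to see "least squares" at work is to move the fitted line slightly and check that the sum of squared errors goes up. The sketch below uses simulated data; the coefficients and the size of the nudges are arbitrary assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=100)
y = 0.5 + 1.5 * x + rng.normal(size=100)

A = np.column_stack([np.ones_like(x), x])
best, *_ = np.linalg.lstsq(A, y, rcond=None)     # OLS estimates of (alpha, beta)

def sse(coef):
    """Sum of squared errors for a candidate (alpha, beta) pair."""
    return float(np.sum((y - A @ coef) ** 2))

sse_best = sse(best)
# Nudge the fitted line in a few directions; each nudge should raise the SSE.
nudges = [np.array([0.1, 0.0]), np.array([0.0, 0.1]), np.array([-0.1, -0.1])]
sse_nudged = [sse(best + d) for d in nudges]
```

Every entry of `sse_nudged` is larger than `sse_best`, which is what "best fit" means here.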

5 OLS assumptions (all 5)

1. Dependent variable (y) is a linear function of the independent variables ($x_1, \dots, x_N$) plus white noise

(a) In other words, $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$

2. Expected value of white noise is zero

(a) $E[\varepsilon_t] = 0$

3. White noise realisations are 1) uncorrelated and 2) drawn from distributions with constant variance

(a) $E[\varepsilon_t \varepsilon_{t+j}] = 0$ for all $t$ and $j \neq 0$, and $E[\varepsilon_t^2] = \sigma^2$ for all $t$

4. Independent variables are measured without error (that is, if you “measure them again”, you get the same value)
5. No exact (or in practice, close to exact) linear relationship between any sub-set of the independent variables

(a) That is, no $x_k$ can be written as $x_k = \gamma_j x_j + \dots + \gamma_m x_m$ using the other independent variables
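
Assumption 5 can be checked numerically. In the sketch below (simulated data, with the weights 2.0 and -3.0 chosen arbitrarily), a third variable is built as an exact linear combination of the first two, and the matrix of regressors loses a rank as a result.

```python
import numpy as np

rng = np.random.default_rng(2)
T = 50
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
x3 = 2.0 * x1 - 3.0 * x2                 # exact linear combination of x1 and x2

ok = np.column_stack([np.ones(T), x1, x2])        # constant, x1, x2
bad = np.column_stack([np.ones(T), x1, x2, x3])   # adds the redundant x3

rank_ok = np.linalg.matrix_rank(ok)      # equals the number of columns (3)
rank_bad = np.linalg.matrix_rank(bad)    # still 3, although there are 4 columns
```

When the rank is lower than the number of columns, the individual coefficients cannot be separately identified from the data.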

6 OLS “facts”

• If the OLS assumptions hold, then OLS is


– Computationally “cheap” - just some matrix calculations
– Least squares - by design
– Highest R-square - follows from least squares
– Unbiased - so β̂ is an unbiased estimator of the true β
– Best linear unbiased estimator (compared to all possible linear
unbiased estimators)

7 Coefficients

• $y_t = \alpha + \beta_1 x_{1,t} + \dots + \beta_N x_{N,t} + \varepsilon_t$
• the coefficients are $\beta_1, \dots, \beta_N$
• special case: N = 1
– Only one explanatory / independent variable plus a constant
– Then $\beta_1 = \mathrm{Cov}[x, y] / \mathrm{Var}[x]$
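
The N = 1 special case is easy to verify numerically. The sketch below uses simulated data (the coefficients 0.2 and 1.3 are arbitrary assumptions) and compares the covariance / variance ratio with the slope from a least-squares fit that includes a constant.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=250)
y = 0.2 + 1.3 * x + rng.normal(scale=0.8, size=250)

# Slope from the covariance / variance formula (same ddof in both terms)
beta_formula = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

# Slope from a least-squares fit with an intercept
A = np.column_stack([np.ones_like(x), x])
alpha_ols, beta_ols = np.linalg.lstsq(A, y, rcond=None)[0]
```

The two slopes agree up to floating-point rounding, provided the covariance and the variance use the same divisor.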

8 Economic significance

• Is the coefficient big enough to make a real difference?


– In cold hard cash terms ...
• For instance, we run the regression
– $r_t - r_f = \alpha + \beta (r_M - r_f) + \varepsilon_t$
– to estimate the CAPM beta
– and get β = 0.000004023
– Not economically significant, since even if the market risk premium doubles from 5% to 10%, the impact on the expected return for this firm is
– 0.000004023 × 5% = 0.00000020115, or less than 1 ten-thousandth of 1%.
– It may even be statistically significant! (see below)
– But if it is not economically significant, it isn’t relevant in a
practical sense
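
The arithmetic in this example, written out step by step. The numbers are the ones used above; only the variable names are introduced here for illustration.

```python
beta = 0.000004023            # estimated CAPM beta from the example
premium_before = 0.05         # market risk premium of 5%
premium_after = 0.10          # market risk premium of 10%

# Change in the firm's expected excess return implied by beta
impact = beta * (premium_after - premium_before)
impact_pct = impact * 100     # the same number expressed in percent
```

Here `impact` is about 0.00000020115, so `impact_pct` is about 0.00002%, well below one ten-thousandth of 1% (0.0001%).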
9 Statistical significance

• To calculate statistical significance in the OLS context, we need one more assumption
• ASSUMPTION 6: White noise error term (ε) is normally distributed
• “Statistical significance”: real meaning – the estimated coefficient is unlikely to have arisen by chance if the true coefficient were zero
• “Unlikely” = less than a 5% chance (by convention)
• That chance is the p-value: the probability of seeing an estimate at least this far from zero when the true coefficient is zero
• Even so, 1-in-20 times we should expect to declare something “statistically significant” when in fact it is just randomness
• Measured by t-statistic or p-value


• Warning: Data-mining. If you run 1 million regressions of y = α + βx + ε where x and y are independent random variables, you will get approximately 50,000 regressions where the β estimate is significant at a 5% level. But this is nonsense, since you know that the true β is zero (x and y are independent!).
– If you torture the data enough, it will confess to anything.
– aka “snooping”, “p-hacking”
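
The data-mining warning can be reproduced on a small scale. The sketch below is an illustration under stated assumptions: it runs 2,000 regressions rather than 1 million, uses 100 normally distributed observations per regression, and computes the t-statistic with the usual textbook formula for a one-variable regression with a constant. The helper name `slope_t_stat` is made up for this example.

```python
import numpy as np

rng = np.random.default_rng(4)
n_regressions = 2000     # far fewer than 1 million, to keep the sketch quick
T = 100                  # observations per regression

def slope_t_stat(x, y):
    """t-statistic on beta in y = alpha + beta * x + eps."""
    xc = x - x.mean()
    beta = (xc @ (y - y.mean())) / (xc @ xc)
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x
    s2 = (resid @ resid) / (len(x) - 2)      # residual variance estimate
    return beta / np.sqrt(s2 / (xc @ xc))    # beta divided by its standard error

# x and y are drawn independently, so the true beta is zero in every regression
t_stats = np.array([
    slope_t_stat(rng.normal(size=T), rng.normal(size=T))
    for _ in range(n_regressions)
])
share_significant = np.mean(np.abs(t_stats) >= 1.96)
```

`share_significant` should come out in the neighbourhood of 5%, even though none of the relationships is real.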

10 t-Statistic and R-square

• t-statistic
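– t-statistic = (Estimated coefficient) / (Standard error) = $\hat{\beta} / SE_{\hat{\beta}}$
– Standard error: $SE_{\hat{\beta}} = \sigma_{\hat{\beta}} / \sqrt{N}$, or Standard deviation of estimated coefficient / ((number of observations)^0.5), where N here is the number of observations
– Abs(t-statistic) >= 1.96 implies p-value < 5% (two-sided test, large sample)
– That is, if the true coefficient were zero, there would be less than a 5% chance of an estimate this far from zero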
• R-Square measure
– Reported when running OLS regressions
– Roughly, it is the % of variation in the dependent variable (y)
that is explained by the independent variables (x1, ..., xN )
– R-square never decreases (and almost always increases) if you add more independent variables to a regression, even if the variable you are adding is clearly nonsense (like random numbers)


– If it is a univariate regression (only 1 explanatory variable) then $R^2_{y|x} = R^2_{x|y} = (\rho_{xy})^2$, where $R^2_{y|x}$ is the R-square obtained from regressing y on x and $\rho_{xy}$ is Pearson’s correlation coefficient between x and y.
• “Adjusted” R-square makes an adjustment for the number of explanatory variables.
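
The R-square points above can be checked on simulated data. In this sketch the data, the helper name `r_squared` and the "junk" regressor are illustrative assumptions. The adjusted R-square uses the common textbook formula 1 - (1 - R²)(T - 1)/(T - k - 1), with T observations and k explanatory variables; these notes do not state a formula, so treat that choice as an assumption.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 60
x = rng.normal(size=T)
y = 1.0 + 0.7 * x + rng.normal(size=T)
junk = rng.normal(size=T)                # unrelated random numbers

def r_squared(columns, y):
    """R-square and adjusted R-square of y on the given columns plus a constant."""
    A = np.column_stack([np.ones(len(y))] + columns)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    k = len(columns)                     # number of explanatory variables
    adj = 1.0 - (1.0 - r2) * (len(y) - 1) / (len(y) - k - 1)
    return r2, adj

r2_small, adj_small = r_squared([x], y)          # y on x
r2_big, adj_big = r_squared([x, junk], y)        # y on x and random numbers
r2_reverse, _ = r_squared([y], x)                # x on y instead
rho = np.corrcoef(x, y)[0, 1]                    # Pearson correlation
```

`r2_big` is at least as large as `r2_small` although `junk` is pure noise, and both `r2_small` and `r2_reverse` equal `rho ** 2`. Comparing `adj_small` with `adj_big` shows how the adjustment penalises the extra variable.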


• WARNING: Don’t regress a variable with a trend on other variables with a trend
– Result: Spurious regression - high t-stats, high R-squares, but total nonsense
– See next section

11 Correlation is not causation

• OLS regression simply shows the degree of association between variables. It does not show that one variable causes another
• Example:
– variable xt – total population of storks in Sweden at year t
– variable yt – births in Sweden in year t
– run y = α + βx + ε
– Find: β = 0.8 and highly statistically significant (t-stat > 4)
• Implication?


• What is really going on


– Both the population of storks and births are declining over the
sample period
– but for different reasons
◦ Storks - habitat loss
◦ Birth rate - urbanisation and lifestyle changes
– Since both variables trend, they have a high covariance and hence a large estimated beta
• How to fix it
– Remove the trend
– For instance, take differences so that you run
– $\Delta y_t = \alpha + \beta \Delta x_t + \varepsilon_t$
– Even better, lag the change in stork population
– $\Delta y_t = \alpha + \beta \Delta x_{t-1} + \varepsilon_t$
• Is this a problem when estimating CAPM beta? Motivate your answer.
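
The stork example can be imitated with made-up numbers. In the sketch below both series decline over time but are otherwise unrelated; the levels, slopes and noise scale are invented for illustration and are not Swedish data. The helper `slope_t_stat` is a hypothetical name and uses the standard one-variable OLS t-statistic.

```python
import numpy as np

rng = np.random.default_rng(6)
T = 50
t = np.arange(T)
# Two declining series whose noise terms are independent of each other
storks = 100.0 - 0.5 * t + rng.normal(scale=2.0, size=T)
births = 50.0 - 0.3 * t + rng.normal(scale=2.0, size=T)

def slope_t_stat(x, y):
    """t-statistic on beta in y = alpha + beta * x + eps."""
    xc = x - x.mean()
    beta = (xc @ (y - y.mean())) / (xc @ xc)
    alpha = y.mean() - beta * x.mean()
    resid = y - alpha - beta * x
    s2 = (resid @ resid) / (len(x) - 2)      # residual variance estimate
    return beta / np.sqrt(s2 / (xc @ xc))    # beta divided by its standard error

t_levels = slope_t_stat(storks, births)                    # levels on levels
t_diff = slope_t_stat(np.diff(storks), np.diff(births))    # changes on changes
```

The regression in levels produces a very large t-statistic purely because of the shared trend. After differencing, the t-statistic is much smaller in absolute value and will typically no longer look significant.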
12 Using OLS regression to do portfolio optimisation

• Britten-Jones (1999), “The Sampling Error in Estimates of Mean-Variance Efficient Portfolio Weights”, Journal of Finance, demonstrates a close equivalence between Markowitz portfolio optimisation and OLS regression.
• Specifically:
– Regress a constant (= 1) on the excess returns of N stocks, without an intercept (this bit is important).
◦ Specification: $1 = \beta_1 r_{1,t} + \dots + \beta_N r_{N,t} + \varepsilon_t$
– Collect the coefficients on each of the stocks in vector β.
– Rescale β so that the sum of the elements in β is 1. (So $\beta^{\star} = \beta / (\mathbf{1}'\beta)$.)
– The re-scaled coefficients are the Markowitz optimal portfolio
weights!
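
The recipe above can be tried on simulated excess returns. This is a sketch under assumptions: the mean returns, the volatility and the sample size are made up, and the comparison weights are the textbook tangency weights, proportional to $\Sigma^{-1}\mu$ computed from sample moments. It is not taken from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(7)
T, N = 500, 4                                    # periods and stocks (illustrative)
mean = np.array([0.010, 0.008, 0.012, 0.006])    # made-up mean excess returns
R = mean + rng.normal(scale=0.05, size=(T, N))   # simulated excess returns

# Regress a vector of ones on the excess returns, with no intercept column
ones = np.ones(T)
beta, *_ = np.linalg.lstsq(R, ones, rcond=None)
weights_ols = beta / beta.sum()                  # rescale so the weights sum to 1

# Textbook tangency weights from sample moments, for comparison
mu = R.mean(axis=0)                              # sample mean excess returns
sigma = np.cov(R, rowvar=False)                  # sample covariance matrix
raw = np.linalg.solve(sigma, mu)                 # proportional to inv(sigma) @ mu
weights_mv = raw / raw.sum()
```

`weights_ols` and `weights_mv` agree up to floating-point rounding. Adding a column of ones to `R` (an intercept) would break the equivalence, which is why the no-intercept detail matters.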
