Professional Documents
Culture Documents
Week 1
Week 1
Week 1
Katerina Chrysikou
KBS
Week 1
Introduction
Typical Problems
Data and Notation
Model formulation
The classical Linear Regression Model
Univariate regression Models
Ordinary Least Squares
Deriving the Estimator
Assumptions
Properties of Estimators
Statistical Inference
Testing Hypotheses
Empirical analysis
References
Series Frequency
GDP, Industrial Production/unemployment monthly, or quarterly
government budget deficit annually
money supply weekly
value of a stock market index end of day
value of a stock price as transactions occur
Example:
The stock price of asset i across time: pit .
SPY
250
200
Prices ($)
150
100
50
Time
Figure:
Katerina Chrysikou (KBS) Week 1 10 / 56
Returns for Financial Assets
The returns are also known as log price relatives. There is a number of
reasons for this:
They have the nice property that they can be interpreted as
continuously compounded returns.
Can add them up, e.g. if we want a weekly return and we have
calculated daily log returns:
r1 = log p1 /p0 = log p1 − log p0
r2 = log p2 /p1 = log p2 − log p1
r3 = log p3 /p2 = log p3 − log p2
r4 = log p4 /p3 = log p4 − log p3
r5 = log p5 /p4 = log p5 − log p4
log p5 − log p0 = log p5 /p0
But this does not work for the continuously compounded returns.
but we will limit ourselves to the case where there is only one x
variable to start with.
The first stage would be to form a scatter plot of the two variables.
Scatterplot
40
●
30
Fund XXX
●
20
● ●
●
10
0
0 5 10 15 20 25
Market Index
y = a + bx
40
●
30
Fund XXX
●
20
● ●
●
10
0
0 5 10 15 20 25
Market Index
Figure:
Katerina Chrysikou (KBS) Week 1 24 / 56
Ordinary Least Squares
So
∂L
= −2 ∑(yt − α − βxt ) = 0, (1)
∂α
∂L
= −2 ∑ xt (yt − α − βxt ) = 0. (2)
∂β
From (1)
∑ yt − T α̂ − β̂ ∑ xt = 0.
α̂ = y − β̂x. (3)
∑ xt yt − Tyx
β̂ = 2
.
∑ xt2 − Tx
> lm(mp~mi)
Call:
lm(formula = mp ~ mi)
Coefficients:
(Intercept) mi
-1.737 1.642
Question: If an analyst tells you that she expects the market to yield a
return 20% higher than the risk-free rate next year, what would you
expect the return on fund XXX to be?
Solution: We can say that the expected value of
y = −1.74 + 1.64 ∗ (value of) x, so plug x = 20 into the equation to get
the expected value for y:
Yt = eα Xt eut .
β
yt = α + βxt + ut .
β
yt = α + + ut
xt
then the regression can be estimated using OLS by substituting
1
zt = .
xt
But some models are intrinsically non-linear, e.g.
β
yt = α + x t + u t .
> SEa
[1] 4.113994
> SEb
[1] 0.2647783
Call:
lm(formula = mp ~ mi)
Residuals:
1 2 3 4 5
-2.955 2.648 3.209 -1.645 -1.257
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.7366 4.1140 -0.422 0.70135
mi 1.6417 0.2648 6.200 0.00845 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
1 Both SE(α̂) and SE( β̂) depend on s2 . The greater the variance s2 ,
then the more dispersed the errors are about their mean value and
therefore the more dispersed y will be about its mean value.
10
8
●
6
● ●●●
●●
●●
y
● ●
● ● ●●
●
●●
●
4
●
2
0
0 2 4 6 8 10
Figure:
Katerina Chrysikou (KBS) Week 1 41 / 56
Large ∑(xt − x)2
y <- rnorm(20, mean=5, sd=0.5)
x <- rnorm(20, mean=5, sd=2)
plot(x, y, type="p", pch=19, col="blue", cex=1.2, xlim=c(0,10), ylim=c(0,10))
abline(lm(y~x), col="red", lwd=2)
10
8
●
6
● ●
● ●● ● ●
● ●
● ● ●
y
●
● ●
●
● ●●
4
2
0
0 2 4 6 8 10
Figure:
Katerina Chrysikou (KBS) Week 1 42 / 56
Some Comments on the Standard Error Estimators II
The larger the sample size, T, the smaller will be the coefficient
variances.
We want to make inference about the likely population values from the
regression parameters.
α̂ ∼ N(α, var(α̂))
β̂ ∼ N( β, var( β̂))
Yes, if the other assumptions of the CLRM hold, and the sample size is
sufficiently large.
α̂ − α β̂ − β
p ∼ N(0, 1) and q ∼ N(0, 1)
var(α̂) var( β̂)
So we have
α̂ − α β̂ − β
∼ tT −2 and ∼ tT −2 .
SE(α̂) SE( β̂)
yt = α + βxt + ut , t = 1, 2, . . . , T.
β̂ − β∗
,
SE( β̂)
This is also sometimes called the size of the test and it determines the
region where we will reject or not reject the null hypothesis that we are
testing.
Finally, perform the test. If the test statistic lies in the rejection region
then reject the null hypothesis (H0 ), else do not reject H0 .
β̂ − β∗
,
SE( β̂)
If the test is
H0 : β = 0,
H1 : β 6 = 0
i.e. a test that the population coefficient is zero against a two-sided
alternative, this is known as a t-ratio test.
β̂
.
SE( β̂)
> tra
-0.4221322
> trb
6.200453
Papers:
Jensen, M. (1968). The performance of mutual funds in the period
1945-1964. Journal of Finance, 23, 389-416.