Econometrics For Actuary: Minh Nguyen
Minh Nguyen
COURSE CONTENTS
Session 0: Review Econometrics I
- Example:
- $\widehat{\ln sale} = 0.5 - 0.1\,hhsize + 0.2\,\ln income$
- $\widehat{sale} = 0.5 + 0.15\,income + 0.1\,female$
Session 0: Assumptions
- F1. $E(y \mid X) = X\beta$
Section 1: Categorical dependent variables - Ch11
1.1 Binary dependent variables (1)
1.1 Binary dependent variables (2)
- $E(y \mid x) = p = \beta_0 + \beta_1\,loss$
1.2 Logistic regression models (1)
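- $p = \frac{e^{\beta_0 + \beta_1 loss}}{1 + e^{\beta_0 + \beta_1 loss}}$
- The RHS lies in (0, 1)
- It is non-linear
- $p = \frac{e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}}{1 + e^{\beta_0 + \beta_1 x_1 + \dots + \beta_k x_k}} = \frac{e^{X\beta}}{1 + e^{X\beta}}$
- Logit model
- Meaning of $\beta_j$?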
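A minimal Python sketch of the logit formula, meant only to show that the fitted probability stays inside (0, 1) whatever the value of the linear index. The function name `logistic_prob` and the coefficient values are illustrative assumptions, not taken from the slides.

```python
import math

def logistic_prob(beta, x):
    """Return p = exp(X beta) / (1 + exp(X beta)) for one observation.

    beta: coefficients [b0, b1, ..., bk]; x: regressors [x1, ..., xk].
    """
    index = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    # Numerically stable form, algebraically equal to exp(index) / (1 + exp(index)).
    if index >= 0:
        return 1.0 / (1.0 + math.exp(-index))
    e = math.exp(index)
    return e / (1.0 + e)

# Hypothetical coefficients, chosen only for illustration.
beta = [0.5, -0.2, 0.1]
for x in ([0.0, 0.0], [10.0, 1.0], [-30.0, 5.0]):
    print(x, round(logistic_prob(beta, x), 4))
```

Unlike the linear probability model, no choice of regressors pushes the result below 0 or above 1.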
1.2 Logistic regression models: example (2)
At loss = 10, $P(y = 1) = \frac{\exp(-1 + 0.38)}{1 + \exp(-1 + 0.38)} = 0.35$
1.2 Logistic regression models: marginal effect (3)
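- $p = \frac{e^{\beta_0 + \beta_1 loss}}{1 + e^{\beta_0 + \beta_1 loss}}$
- Marginal effect of loss:
- $\frac{\partial p}{\partial loss} = p(1 - p)\beta_1$
- The marginal effect is not constant: it depends on $p$, and hence on the level of loss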
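The marginal effect follows from the quotient rule. A short derivation, writing $z = \beta_0 + \beta_1 loss$ (the symbol $z$ is introduced here only as shorthand):

```latex
\frac{\partial p}{\partial loss}
  = \frac{\partial}{\partial z}\left(\frac{e^{z}}{1+e^{z}}\right)\frac{\partial z}{\partial loss}
  = \frac{e^{z}(1+e^{z}) - e^{z}e^{z}}{(1+e^{z})^{2}}\,\beta_1
  = \frac{e^{z}}{1+e^{z}}\cdot\frac{1}{1+e^{z}}\,\beta_1
  = p(1-p)\,\beta_1
```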
1.2 Logistic regression models: marginal effect (4)
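At loss = 10, $P(y = 1) = \frac{\exp(-1 + 0.38)}{1 + \exp(-1 + 0.38)} = 0.35$
At loss = 10, if loss increases by one unit, $P(y = 1)$ changes by approximately
$0.35(1 - 0.35)(-0.1) \approx -0.02$
At loss = 10, if loss increases by a small amount $m$, $P(y = 1)$ changes by approximately
$0.35(1 - 0.35)(-0.1)m \approx -0.02m$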
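The arithmetic on this slide can be reproduced with a few lines of Python. This is a sketch under the assumption that the linear index at loss = 10 is $-1 + 0.38$ and that the slope on loss is $-0.1$, both read off the slide; the helper name `logistic` is illustrative. It also compares the derivative-based approximation with the exact change in probability for a one-unit increase.

```python
import math

def logistic(z):
    """Logistic function exp(z) / (1 + exp(z))."""
    return math.exp(z) / (1.0 + math.exp(z))

beta1 = -0.1               # slope on loss used in the marginal-effect line
index_at_10 = -1 + 0.38    # linear index at loss = 10, as written on the slide

p = logistic(index_at_10)
marginal = p * (1 - p) * beta1    # derivative-based approximation p(1 - p) * beta1
# Exact change in p when loss rises from 10 to 11 (the index shifts by beta1).
exact = logistic(index_at_10 + beta1) - p

print(round(p, 2), round(marginal, 4), round(exact, 4))
```

The approximation and the exact change are close but not identical, because $p(1-p)\beta_1$ is a local slope rather than a finite difference.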
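1.2 Logistic regression models: odds ratio (5)
- $odds = \frac{p}{1 - p} = e^{\beta_0 + \beta_1 loss}$
- Meaning of the odds:
- $\ln(odds) = \ln\left(\frac{p}{1 - p}\right) = \beta_0 + \beta_1 loss$
- If loss increases by 1 unit, $\ln(odds)$ increases by $\beta_1$ and the odds are multiplied by $e^{\beta_1}$
- $\ln(odds_n) = \ln(odds_o) + \beta_1 \;\Rightarrow\; \ln\left(\frac{odds_n}{odds_o}\right) = \beta_1 \;\Rightarrow\; \frac{odds_n}{odds_o} = e^{\beta_1} \;\Rightarrow\; odds_n = e^{\beta_1}\,odds_o$, where $odds_o$ and $odds_n$ are the odds before and after the increase
- The odds are multiplied by $e^{\beta_1}$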
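A small numerical check of the odds-ratio result, as a sketch: whatever the starting level of loss, raising it by one unit multiplies the odds by $e^{\beta_1}$. The coefficient values and the function name `odds` are assumptions chosen for illustration.

```python
import math

def odds(beta0, beta1, loss):
    """Odds p / (1 - p) implied by the logit model at a given loss."""
    z = beta0 + beta1 * loss
    p = math.exp(z) / (1.0 + math.exp(z))
    return p / (1.0 - p)

beta0, beta1 = 0.38, -0.1   # illustrative values (assumed)
for loss in (0.0, 10.0, 25.0):
    ratio = odds(beta0, beta1, loss + 1) / odds(beta0, beta1, loss)
    # The ratio should match exp(beta1) at every starting level of loss.
    print(loss, round(ratio, 6), round(math.exp(beta1), 6))
```

This contrasts with the marginal effect on $p$, which does vary with the level of loss.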
1.2 Logistic regression models: example 6
1.3 Estimation and Inference: the b
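- Use the ML method
- Likelihood of the i-th observation: $L_i = \begin{cases} 1 - p_i & \text{if } y_i = 0 \\ p_i & \text{if } y_i = 1 \end{cases}$
- or log-likelihood: $LL_i = \begin{cases} \ln(1 - p_i) & \text{if } y_i = 0 \\ \ln p_i & \text{if } y_i = 1 \end{cases}$
- Or: $LL_i = y_i \ln(p_i) + (1 - y_i)\ln(1 - p_i)$
- Given an independent data set $\{(y_1, x_1), \dots, (y_n, x_n)\}$
- $\ln L(data) = LL_1 + LL_2 + \dots + LL_n = \sum_{i=1}^{n} \left[ y_i \ln p_i + (1 - y_i)\ln(1 - p_i) \right] = LL(\beta)$
- $I(b) = \sum_i p_i(1 - p_i)\,x_i x_i'$: information matrix (estimated)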
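The log-likelihood and the estimated information matrix can be written out directly. The following is a minimal sketch for a model with an intercept and one regressor, so that $x_i = (1, x)'$. The data are made up and the function names are illustrative assumptions. It evaluates both quantities at a given $\beta$ and does not maximise the likelihood.

```python
import math

def prob(b0, b1, x):
    """Logit probability p_i = 1 / (1 + exp(-(b0 + b1 * x)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def log_likelihood(b0, b1, data):
    """LL(beta) = sum_i [ y_i ln p_i + (1 - y_i) ln(1 - p_i) ]."""
    total = 0.0
    for y, x in data:
        p = prob(b0, b1, x)
        total += y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total

def information(b0, b1, data):
    """Estimated information sum_i p_i (1 - p_i) x_i x_i', with x_i = (1, x)."""
    info = [[0.0, 0.0], [0.0, 0.0]]
    for _, x in data:
        p = prob(b0, b1, x)
        w = p * (1.0 - p)
        xi = (1.0, x)
        for r in range(2):
            for c in range(2):
                info[r][c] += w * xi[r] * xi[c]
    return info

# Made-up (y, x) pairs, for illustration only.
data = [(1, 0.5), (0, 2.0), (1, 1.0), (0, 3.5), (0, 2.5), (1, 3.0)]
print(log_likelihood(0.0, 0.0, data))   # at beta = 0 every p_i is 0.5
print(information(0.0, 0.0, data))
```

In practice the maximiser $b$ is found iteratively, and the inverse of $I(b)$ gives the estimated variance of $b$.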
1.3 Estimation and Inference: pseudo $R^2$
- Goodness of fit:
- McFadden pseudo-$R^2 = 1 - \frac{LL(b_{model})}{LL(null)}$
- pseudo-$R^2 = \frac{LL(b_{model}) - LL_{null}}{LL_{max} - LL_{null}}$
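A sketch of how McFadden's pseudo-$R^2$ could be computed from fitted probabilities. It assumes the null model is the intercept-only model, which predicts the sample mean of $y$ for every observation, and that $y$ contains both zeros and ones. The data and function names are made up for illustration. For ungrouped binary data the saturated model has $LL_{max} = 0$, so the second formula reduces to the first.

```python
import math

def bernoulli_ll(y, p):
    """Sum of y_i ln p_i + (1 - y_i) ln(1 - p_i)."""
    return sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
               for yi, pi in zip(y, p))

def mcfadden_r2(y, p_model):
    """1 - LL(model) / LL(null), with an intercept-only null model."""
    p_bar = sum(y) / len(y)          # null model: same probability for everyone
    ll_null = bernoulli_ll(y, [p_bar] * len(y))
    ll_model = bernoulli_ll(y, p_model)
    return 1.0 - ll_model / ll_null

y = [1, 0, 1, 0, 0, 1]                       # made-up outcomes
p_model = [0.8, 0.3, 0.7, 0.2, 0.4, 0.6]     # made-up fitted probabilities
print(round(mcfadden_r2(y, p_model), 4))
```

A model that does no better than the null gives a value of 0, and values closer to 1 indicate a better fit.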
1.3 Estimation and Inference: the whole model
- $H_0: \beta = 0$ vs $H_1: \beta \neq 0$
1.3-1 Use of the result
1.3-2 Practice with R
- Command:
glm(y ~ x1 + x2, data = dataset, family = "binomial")
1.3-3 Tutorial 1
Using the "autobi" data set, fit a logistic regression with
"attorney" as y, and marital status and loss as x.
5. Retrieve the McFadden $R^2$
1.3-4 Model testing - evaluation (1)
- Other variables: $X^2$; $X_i \times X_j$
1.3-5 Model testing - evaluation (2)
- Step 1: predict $p$
1.2* Probit model
1.2* Probit model- practice
1.4 Nominal dependent variable - Multinomial logit model
1.5 Ordinal dependent variable - ordinal logit model
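- Situation:
$C$ different choices that can be ranked as 1, 2, ..., $C$
$P(y_i = 1 \mid X) = \frac{1}{1 + e^{-\alpha_1 + X\beta}} - \frac{1}{1 + e^{-\alpha_0 + X\beta}}$
$P(y_i = 2 \mid X) = \frac{1}{1 + e^{-\alpha_2 + X\beta}} - \frac{1}{1 + e^{-\alpha_1 + X\beta}}$
...
$P(y_i = C \mid X) = \frac{1}{1 + e^{-\alpha_C + X\beta}} - \frac{1}{1 + e^{-\alpha_{C-1} + X\beta}}$
- In which $\alpha_0 = -\infty \le \alpha_1 \le \dots \le \alpha_C = \infty$
- $odds_i(s) := \frac{P(y_i \le s)}{P(y_i > s)} = \exp\left(-(-\alpha_s + X\beta)\right)$
- Ordered logit model
- $\ln(odds(j)) = \alpha_j - (\beta_1 X_1 + \dots + \beta_k X_k)$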
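A short sketch of the ordered logit probabilities, built from the cumulative probability $P(y_i \le s \mid X) = 1/(1 + e^{-\alpha_s + X\beta})$ with $\alpha_0 = -\infty$ and $\alpha_C = \infty$. The cutpoints, the value of $X\beta$ and the function names are hypothetical, chosen only to show that the category probabilities are differences of adjacent cumulative probabilities and sum to one.

```python
import math

def cum_prob(alpha, xb):
    """P(y <= s | X) = 1 / (1 + exp(-alpha_s + X beta)); alpha may be +/- infinity."""
    if alpha == math.inf:
        return 1.0
    if alpha == -math.inf:
        return 0.0
    return 1.0 / (1.0 + math.exp(-alpha + xb))

def category_probs(cutpoints, xb):
    """Return [P(y = 1), ..., P(y = C)] from interior cutpoints alpha_1..alpha_{C-1}."""
    alphas = [-math.inf] + list(cutpoints) + [math.inf]
    return [cum_prob(alphas[s], xb) - cum_prob(alphas[s - 1], xb)
            for s in range(1, len(alphas))]

cutpoints = [-1.0, 0.5, 2.0]    # hypothetical alpha_1..alpha_3, so C = 4
probs = category_probs(cutpoints, xb=0.3)
print([round(p, 4) for p in probs], round(sum(probs), 10))
```

The cutpoints must be non-decreasing. Otherwise some of the differences would be negative.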
Section 2: GLM models - Ch12
2.1 Introduction
2.4 Estimation
2.1 Introduction
- y: normally distributed
2.2 GLM - Linear exponential family distribution
- Definition: $f(y, \theta, \Phi) = \exp\left(\frac{y\theta - b(\theta)}{\Phi} + S(y, \Phi)\right)$
- $y$: dependent variable
- $\theta$: parameter of interest
- $\Phi$: scale parameter
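- Check with the Normal distribution
- $f(y, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - \mu)^2}{2\sigma^2}\right)$
- $= \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{y^2 - 2y\mu + \mu^2}{2\sigma^2}\right)$
- $= \exp\left(\frac{y\mu - 0.5\mu^2}{\sigma^2} - \frac{y^2}{2\sigma^2} - 0.5\ln(2\pi\sigma^2)\right)$
- Hence, $\theta = \mu$, $\Phi = \sigma^2$, $b(\theta) = 0.5\theta^2$, $S(y, \Phi) = -\frac{y^2}{2\sigma^2} - 0.5\ln(2\pi\sigma^2)$
- The Normal distribution belongs to the family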
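The identification can be checked numerically. The sketch below evaluates the Normal density in its usual form and in the exponential-family form with $\theta = \mu$, $\Phi = \sigma^2$, $b(\theta) = 0.5\theta^2$, and with $S(y, \Phi)$ taken as the part of the exponent that does not involve $\theta$, namely $-y^2/(2\Phi) - 0.5\ln(2\pi\Phi)$. The function names and test points are illustrative assumptions.

```python
import math

def normal_pdf(y, mu, sigma2):
    """Normal density in its usual form."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def exp_family_pdf(y, mu, sigma2):
    """Same density written as exp((y*theta - b(theta)) / Phi + S(y, Phi))."""
    theta, phi = mu, sigma2
    b = 0.5 * theta ** 2
    s = -(y ** 2) / (2 * phi) - 0.5 * math.log(2 * math.pi * phi)
    return math.exp((y * theta - b) / phi + s)

# Arbitrary test points (y, mu, sigma^2), chosen for illustration.
for y, mu, sigma2 in [(0.0, 0.0, 1.0), (1.5, 0.7, 2.0), (-3.0, 1.0, 0.5)]:
    print(normal_pdf(y, mu, sigma2), exp_family_pdf(y, mu, sigma2))
```

Both columns agree up to floating-point rounding.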
- Mean: $E(y) = b'(\theta)$
- Variance: $V(y) = \Phi\, b''(\theta)$
- Then, for the Normal distribution with $b(\theta) = 0.5\theta^2$: $E(y) = \theta = \mu$
- $Var(y) = \Phi \times 1 = \sigma^2$
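The two moment formulas can be illustrated with numerical derivatives of $b(\theta)$. The sketch covers the Normal case, $b(\theta) = 0.5\theta^2$, and, as an assumption not shown on these slides, the Bernoulli case with the standard choice $b(\theta) = \ln(1 + e^{\theta})$ and $\Phi = 1$, which gives back the logit probability $p$ and the variance $p(1 - p)$. The helper names `d1` and `d2` are illustrative.

```python
import math

def d1(f, t, h=1e-5):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

def d2(f, t, h=1e-4):
    """Central-difference approximation of f''(t)."""
    return (f(t + h) - 2 * f(t) + f(t - h)) / (h * h)

def b_normal(t):
    return 0.5 * t * t

def b_bernoulli(t):
    # Assumed cumulant function for the Bernoulli case (not from the slides).
    return math.log(1.0 + math.exp(t))

theta = 1.3                                    # arbitrary evaluation point
p = math.exp(theta) / (1.0 + math.exp(theta))  # logit probability at theta

print(d1(b_normal, theta), d2(b_normal, theta))        # close to theta and 1
print(d1(b_bernoulli, theta), d2(b_bernoulli, theta))  # close to p and p(1 - p)
```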
2.3 GLM - Link functions (1)
- LM: $E(y \mid X) = \mu = X\beta$
- GLM: $g(\mu) = X\beta$
2.3 GLM - Canonical Link functions (2)
- We have $\mu = b'(\theta)$ and $\mu = g^{-1}(\eta)$
2.3 GLM - Canonical Link functions (3)
- $g(\mu) = \mu$
2.3 GLM - Canonical Link functions (4)
2.4 GLM - Estimation (1)
- Model: $g(\mu) = X\beta$
- Distribution: $f(y, \theta, \Phi) = \exp\left(\frac{y\theta - b(\theta)}{\Phi} + S(y, \Phi)\right)$
- With the canonical link, we have: $\theta = \eta = X\beta$
- So: $f(y, \theta, \Phi) = \exp\left(\frac{yX\beta - b(X\beta)}{\Phi} + S(y, \Phi)\right)$
- Use MLE
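As a sketch of what MLE looks like under the canonical link, assume independent observations and write $x_i$ for the regressor vector of observation $i$ (the index $i$ and the shorthand $\mu_i = b'(x_i'\beta)$ are notation added here). Summing the log densities and differentiating with respect to $\beta$ gives:

```latex
LL(\beta) = \sum_{i=1}^{n} \left[ \frac{y_i\, x_i'\beta - b(x_i'\beta)}{\Phi} + S(y_i, \Phi) \right]

\frac{\partial LL(\beta)}{\partial \beta}
  = \frac{1}{\Phi} \sum_{i=1}^{n} \left( y_i - b'(x_i'\beta) \right) x_i
  = \frac{1}{\Phi} \sum_{i=1}^{n} \left( y_i - \mu_i \right) x_i = 0
```

The scale parameter $\Phi$ drops out of the first-order condition, so $\beta$ can be estimated without knowing it.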
2.4 GLM - Estimation (2)
2.5 GLM - Assumptions revisited
LM                               GLM
$E(y) = X\beta$                  $g(\mu) = X\beta$
$X$: non-stochastic              Same
$V(y_i) = \sigma^2$              Not required
$\{y_i\}$: independent           Same
$\{y_i\} \sim N(\mu, \sigma^2)$  Linear exponential family
2.6 GLM - goodness of fit