GLM Theory Slides
7/17/19
GRM Product
Course outline
GLM theory
Data preparation
Emblem hands on
Evaluation techniques
Agenda – GLM theory
Linear modeling
Model parameterization
GLM extensions to ordinary linear models
Modeling diagnostics
GLMs in Emblem
Linear modeling terminology
Y = 2 + 0.7 X + N(0, 0.2)
[Figure: scatter of observations, Response (Y) on the vertical axis, with the fitted line E[Y] = β0 + β1X]
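Classical linear regression – least squares
Yi = β0 + β1Xi + εi
Yi: observation
β0, β1: parameters to be estimated
Model fit by minimizing the sum of squared errors, Σ εi² (under the normality assumption below, this is equivalent to maximizing the likelihood, the product of f(εi))
Assumption: εi follows a normal distribution with mean zero and constant variance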
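As a rough illustration of the fit described on this slide, the sketch below simulates data from the earlier example Y = 2 + 0.7 X + N(0, 0.2) and recovers β0 and β1 by least squares. The sample size, the range of X, the seed, and the reading of 0.2 as a standard deviation are all assumptions made for the example, not values from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)          # fixed seed so the sketch is repeatable
x = rng.uniform(0.0, 3.0, size=500)     # assumed range for X
# N(0, 0.2): 0.2 is treated here as the standard deviation (an assumption)
y = 2.0 + 0.7 * x + rng.normal(0.0, 0.2, size=500)

# Closed-form least squares estimates for a single predictor
b1 = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

residuals = y - (b0 + b1 * x)
sse = np.sum(residuals ** 2)            # the quantity least squares minimizes
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, SSE = {sse:.3f}")
```

The estimates should land close to 2 and 0.7; they will not match exactly because of the simulated noise.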
Multiple linear regression
Yi = β0 + β1Xi1 + β2Xi2 + β3Xi3 + … + βnXin + εi
Y = X ∙ β + ε
Where:
Y is a vector of our observed values (response)
X is our “Design Matrix”
β is the vector of indicated parameters
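The matrix form can be sketched in a few lines. The example below builds a small design matrix with an intercept column and three predictors, then solves for the parameter vector by least squares. The data, the "true" parameter values and the noise level are invented for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
# Design matrix X: a column of ones for the intercept plus three predictors
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
beta_true = np.array([1.0, 0.5, -2.0, 0.25])   # illustrative values only
Y = X @ beta_true + rng.normal(0.0, 0.1, size=n)

# Least squares solution of Y = X . beta + eps
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(np.round(beta_hat, 3))
```

Each row of X is one observation and each column corresponds to one parameter, so the length of the parameter vector equals the number of columns.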
Continuous/Numeric variables
− If a variable is continuous, we can instead define a variate, which can reduce the parameterization of the model, with each order of the variate being represented by one term in the formula
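To make the saving in parameters concrete, the sketch below counts parameters for a hypothetical continuous variable (here called age, with 50 distinct values) when it is treated as a categorical factor versus a second-order variate. The variable and the choice of order are assumptions for illustration.

```python
import numpy as np

age = np.arange(18, 68)                 # hypothetical continuous variable, 50 distinct values

# Treated as a categorical factor: one parameter per level other than the base level
n_factor_params = len(np.unique(age)) - 1

# Treated as a second-order variate: one column (term) per order, age and age^2
variate_terms = np.column_stack([age, age ** 2])
n_variate_params = variate_terms.shape[1]

print(n_factor_params, n_variate_params)   # 49 2
```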
Gender
This categorical variable requires a two-parameter model.
Level   Weight   Av Severity
M       2105     $ 2,093
F       2832     $ 1,733
Binary Variable: Parameter Estimates
Predicted Severity: E[Y] = β0 + β1X1
X1: 1 for male, 0 for female
β0 = 1733: Base (female) Severity
β1 = 360: Additional Severity for being male
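The two parameter values follow directly from the average severities in the table. The short sketch below reproduces them; the function name is made up for the example.

```python
severity = {"M": 2093.0, "F": 1733.0}    # average severities from the table

beta0 = severity["F"]                    # base level (female): X1 = 0
beta1 = severity["M"] - severity["F"]    # additional severity when X1 = 1

def predicted_severity(x1):
    """E[Y] = beta0 + beta1 * X1, with X1 = 1 for male and 0 for female."""
    return beta0 + beta1 * x1

print(beta0, beta1)                      # 1733.0 360.0
print(predicted_severity(1))             # 2093.0
```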
Gender
Level   Weight   Av Severity
M       2105     $ 2,093
F       2832     $ 1,733
Area
Level   Weight   Av Severity
A       1181     $ 1,754
B       1021     $ 1,758
C       1493     $ 1,919
D       524      $ 1,739
E       413      $ 2,104
F       305      $ 2,629
E[Y] = β0 + β1X1 + β2X2 + β3X3 + β4X4 + β5X5 + β6X6
β0: Base severity (Female from Area C)
X1: Binary variable (Male = 1, Female = 0)     β1: Add’l severity from Male
X2: Binary variable (Area A = 1, else = 0)     β2: Add’l severity from Area A
X3: Binary variable (Area B = 1, else = 0)     β3: Add’l severity from Area B
X4: Binary variable (Area D = 1, else = 0)     β4: Add’l severity from Area D
X5: Binary variable (Area E = 1, else = 0)     β5: Add’l severity from Area E
X6: Binary variable (Area F = 1, else = 0)     β6: Add’l severity from Area F
Parameter Number   Name         Value       Standard Error   Standard Error (%)   Weight   Weight (%)
1                  Mean         1,769.64    99.14477         5.6                  4,937    100
-                  gender (F)                                                     2,832    57.4
2                  gender (M)   361.864     100.18706        27.7                 2,105    42.6
3                  area (A)     -174.419    135.53147        77.7                 1,181    23.9
4                  area (B)     -170.408    141.33498        82.9                 1,021    20.7
-                  area (C)                                                       1,493    30.2
5                  area (D)     -174.622    176.69103        101.2                524      10.6
6                  area (E)     173.704     193.48309        111.4                413      8.4
7                  area (F)     707.8558    218.65201        30.9                 305      6.2
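With an identity link, a prediction is the Mean plus the parameter of each non-base level that applies. The sketch below does that arithmetic with the values from the parameter table; the dictionary layout and the function name are assumptions for the example.

```python
params = {
    "Mean": 1769.64,
    "gender (M)": 361.864,
    "area (A)": -174.419,
    "area (B)": -170.408,
    "area (D)": -174.622,
    "area (E)": 173.704,
    "area (F)": 707.8558,
}

def predict(gender, area):
    # Base levels (gender F, area C) have no parameter, so they contribute 0
    return (params["Mean"]
            + params.get(f"gender ({gender})", 0.0)
            + params.get(f"area ({area})", 0.0))

print(round(predict("F", "C"), 2))   # 1769.64
print(round(predict("M", "F"), 2))   # 2839.36
```

A female in Area C gets the Mean alone, since both of her levels are base levels.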
Normal • Not restricted to be positive.
• Often inappropriate for insurance
Link Function, g( )      Equation with g( )         Inverse Link Function, h( )      Equation with h( )
identity, z              Y = Xβ + ε                 identity, z                      Y = Xβ + ε
log, ln(z)               ln(Y) = Xβ + ε             exponential, e^z                 Y = e^(Xβ) + ε
logit, ln(z / (1 − z))   ln(Y / (1 − Y)) = Xβ + ε   inverse logit, e^z / (1 + e^z)   Y = e^(Xβ) / (1 + e^(Xβ)) + ε
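The three link and inverse-link pairs in the table can be written as small functions. This is a minimal sketch; the dictionary layout is an assumption for the example, and the check confirms that h undoes g for each pair.

```python
import numpy as np

# Each entry is (link g, inverse link h)
links = {
    "identity": (lambda z: z, lambda z: z),
    "log": (np.log, np.exp),
    "logit": (lambda z: np.log(z / (1 - z)),
              lambda z: np.exp(z) / (1 + np.exp(z))),
}

mu = np.array([0.1, 0.5, 0.9])       # values in (0, 1) so every link is defined
for name, (g, h) in links.items():
    round_trip = h(g(mu))            # applying h after g should return mu
    print(name, np.allclose(round_trip, mu))
```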
Model output: normal error / log link
[GLM fit: Log Link Function, Normal Error Structure]
Parameter Number   Name   Value   Standard Error   Standard Error (%)   Weight   Weight (%)   Exp(Value)
1                  Mean   7.47    0.0539           0.7                  4,937    100          1,749.90
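Base prediction = exp(Mean) = exp(7.47) = 1,749.90 (7.47 is the rounded display of the fitted value; the exponential of exactly 7.47 is about 1,754.6)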
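Predicted value = 1,749.90 · 1.2277 · 0.9085 = 1,951.78 (with a log link, the base Exp(Value) is multiplied by the Exp(Value) of each applicable factor level)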
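The arithmetic on this slide can be checked in a few lines. The sketch below multiplies the base value by the two relativities and shows the equivalent calculation on the log scale. The slides do not say which factor levels 1.2277 and 0.9085 belong to, so they are treated as unnamed relativities.

```python
import math

base = 1749.90                    # Exp(Value) of the Mean row
relativities = [1.2277, 0.9085]   # factors from the slide's arithmetic

prediction = base * math.prod(relativities)
print(round(prediction, 2))       # 1951.78

# Equivalent on the linear-predictor (log) scale: add the logs, then exponentiate
eta = math.log(base) + sum(math.log(r) for r in relativities)
print(round(math.exp(eta), 2))    # 1951.78
```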
Generalizing the distribution of Y (error)
We may assume that Y is distributed according to any member of
the Exponential family of distributions
Classical linear regression – max. likelihood
[Figure: Y against X, showing how Y is distributed about the fitted line]
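Variance function
Variance of an exponential family distribution is given by:
Var[Y] = φ · V(μ) / ω
Where:
μ is the mean of Y, V(μ) is the variance function, φ is the scale parameter and ω is the prior weight
Distribution       V(x)
Normal             1
Poisson            x
Gamma              x²
Binomial           x(1−x)
Inverse Gaussian   x³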
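The variance functions in the table are easy to tabulate. The sketch below evaluates each V(x) at one illustrative mean; the value 0.5 is an assumption, chosen so that the Binomial case (which needs x between 0 and 1) is valid.

```python
variance_functions = {
    "Normal": lambda x: 1.0,
    "Poisson": lambda x: x,
    "Gamma": lambda x: x ** 2,
    "Binomial": lambda x: x * (1 - x),
    "Inverse Gaussian": lambda x: x ** 3,
}

x = 0.5   # illustrative mean
values = {name: V(x) for name, V in variance_functions.items()}
for name, v in values.items():
    print(f"{name}: V({x}) = {v}")
```

For a Normal error the variance does not depend on the mean at all, while for Gamma and Inverse Gaussian it grows quickly as the mean rises.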
Effect of variance function
[Figure: Y against X, with the line of best fit under assumed constant variance]
Target                         Link Function   Error
Claim Frequency                Log             Poisson, Negative Binomial
Claim Severity                 Log             Gamma, Inverse Gaussian
Loss Costs                     Log             Tweedie
Combined Loss Costs            Log             Gamma
Large Claim Propensity         Logit           Binomial
Policy Renewal Probability     Logit           Binomial
Quote Conversion Probability   Logit           Binomial
Assumption   Linear Regression Model   Generalized Linear Model
GLM assumptions
Random Component (Distribution of Y, error function)
− Related to what you are trying to model
− What kind of distribution do you expect the process to follow? Poisson? Gamma?
Link Function
− What kind of relationship do you expect to exist between X and Y?
− Additive (Identity)? Multiplicative (Log)?
Weight
− Claim frequency model: weight is earned exposure
− Claim Severity model: weight is claim count
Finally, once the model is specified, we solve using Maximum Likelihood Estimation
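To show how the three choices come together, here is a minimal sketch of a claim frequency model: Poisson error, log link, and earned exposure as the weight, solved by iteratively reweighted least squares (IRLS), a standard way of computing the maximum likelihood estimates. This is not Emblem's implementation. The simulated data, the function name and the fixed iteration count are assumptions for the example.

```python
import numpy as np

def fit_poisson_log(X, y, weight, n_iter=25):
    """IRLS for a Poisson error / log link GLM with prior weights.

    Assumes the first column of X is the intercept.
    """
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(np.average(y, weights=weight))   # start at the weighted mean
    for _ in range(n_iter):
        eta = X @ beta                   # linear predictor
        mu = np.exp(eta)                 # inverse log link
        z = eta + (y - mu) / mu          # working response
        W = weight * mu                  # working weight for Poisson / log
        XtW = X.T * W
        beta = np.linalg.solve(XtW @ X, XtW @ z)
    return beta

rng = np.random.default_rng(2)
n = 2000
exposure = rng.uniform(0.5, 1.0, size=n)            # earned exposure per record
x1 = rng.integers(0, 2, size=n)                     # one binary rating factor
X = np.column_stack([np.ones(n), x1])
true_beta = np.array([np.log(0.10), np.log(1.5)])   # illustrative values only
claims = rng.poisson(exposure * np.exp(X @ true_beta))
frequency = claims / exposure                       # response: claim frequency

beta_hat = fit_poisson_log(X, frequency, exposure)
print(np.exp(beta_hat))   # base frequency and relativity for x1 = 1
```

With a single binary factor, the fitted base frequency equals total claims divided by total exposure for the base level, which makes a convenient sanity check.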
Model diagnostics
Variable level
− Does this variable or level make the model more predictive?
Model level
− Do we have the right link and error function?
Standard Error (%) = ( Standard Error / |Value| ) × 100%
The smaller the Standard Error (%), the more significant the parameter
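The percentages in the earlier parameter table can be reproduced from their Value and Standard Error columns. The sketch below does so for three rows. Taking the absolute value is an assumption, made because the table reports positive percentages for negative parameter values.

```python
# (name, value, standard error) taken from the identity-link parameter table
rows = [
    ("gender (M)", 361.864, 100.18706),
    ("area (A)", -174.419, 135.53147),
    ("area (D)", -174.622, 176.69103),
]

se_pct = {name: abs(se / value) * 100 for name, value, se in rows}
for name, pct in se_pct.items():
    print(f"{name}: {pct:.1f}%")
# gender (M): 27.7%
# area (A): 77.7%
# area (D): 101.2%
```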
Deviance in GLM, for instance
Number of parameters in GLM
Creating Emblem Files
%CrteFile(Store=GLMLIB.EmblemStore,
Dataset = final_table,
Dirname = . . ./Intl/LI/Asia/India/Models,
Weight = claim_count,
Title = model_v9,
response = Total_losses);
Build Models
There will be a PRIZE!!!
Scored on holdout data (random)
Lowest Deviance wins!
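For reference, deviance can be computed from actual and predicted values once an error structure is chosen. The slides do not state which deviance the contest uses, so the sketch below gives the standard Poisson and Gamma forms as two possibilities. The function names and the sample numbers are assumptions.

```python
import numpy as np

def poisson_deviance(y, mu, weight=None):
    """2 * sum(w * (y * ln(y / mu) - (y - mu))), with y * ln(y / mu) = 0 at y = 0."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    w = np.ones_like(y) if weight is None else np.asarray(weight, float)
    ratio = np.where(y > 0, y / mu, 1.0)     # ln(1) = 0 handles the y = 0 case
    return 2.0 * np.sum(w * (y * np.log(ratio) - (y - mu)))

def gamma_deviance(y, mu, weight=None):
    """2 * sum(w * (-ln(y / mu) + (y - mu) / mu)); requires y > 0."""
    y, mu = np.asarray(y, float), np.asarray(mu, float)
    w = np.ones_like(y) if weight is None else np.asarray(weight, float)
    return 2.0 * np.sum(w * (-np.log(y / mu) + (y - mu) / mu))

actual = np.array([1500.0, 2100.0, 1800.0])        # made-up holdout severities
model_a = np.array([1600.0, 2000.0, 1750.0])
model_b = np.array([1200.0, 2600.0, 2300.0])
print(gamma_deviance(actual, model_a) < gamma_deviance(actual, model_b))
```

A perfect prediction gives a deviance of zero, and the further predictions drift from the actual values, the larger the deviance becomes.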
Edit categories: right click to add new; use the drop-downs to select and change color
Enable categories
Rule of thumb: 40/40/20
Random
Out of time
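A random 40/40/20 split might look like the sketch below. The slides do not name the three partitions, so the labels train, validate and holdout are assumptions, as are the record count and the seed. An out-of-time split would instead assign records by date rather than at random.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000                              # hypothetical number of records
shuffled = rng.permutation(n)         # random ordering of the record indices

n_train = int(0.4 * n)
n_validate = int(0.4 * n)

train_idx = shuffled[:n_train]
validate_idx = shuffled[n_train:n_train + n_validate]
holdout_idx = shuffled[n_train + n_validate:]   # remaining 20%

print(len(train_idx), len(validate_idx), len(holdout_idx))   # 400 400 200
```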
Vehicle Characteristics
Policy Characteristics
Review the Beta Page
Compare metrics to the reference model