Lab 7
References
Chapter 6.6, [ISLR] An Introduction to Statistical Learning (with Applications in R). Free access to download
the book: http://www-bcf.usc.edu/~gareth/ISL/
To see the help file of a function funcname, type ?funcname.
1. Preparation
Load dataset
library(ISLR)
data(Hitters)
Hitters <- na.omit(Hitters)
glmnet expects its inputs in a specific format (a numeric predictor matrix and a response vector), so we construct them first.
# x, the predictor, has to be a numerical matrix
# model.matrix converts factors to a set of dummy variables
x <- model.matrix(Salary ~ ., Hitters)[, -1]
head(x)
# y, the output, has to be a vector
y <- Hitters$Salary
2. Ridge Regression
The elastic-net penalty is defined as (1 − α)/2 ||β||_2^2 + α ||β||_1. Therefore, for ridge regression, alpha = 0.
library(glmnet)
grid <- 10 ^ seq(10, -2, length = 100) # lambda from 10^10 to 10^-2, logarithmically scaled
ridge.mod <- glmnet(x, y, alpha = 0, lambda = grid)
names(ridge.mod) # read ?glmnet for details
dim(coef(ridge.mod))
par(mfrow = c(1,2))
plot(ridge.mod, xvar = 'norm')
plot(ridge.mod, xvar = 'lambda')
MH4510 - Statistical Learning and Data Mining - AY1819 S1 Lab 07
Large lambda
ridge.mod$lambda[50]
coef(ridge.mod)[, 50]
sqrt(sum(coef(ridge.mod)[-1, 50] ^ 2)) # l2-norm of the coefficients (excluding the intercept)
Small lambda
ridge.mod$lambda[60]
coef(ridge.mod)[, 60]
sqrt(sum(coef(ridge.mod)[-1, 60] ^ 2))
For lambda = 0 or lambda = Inf, what model does the algorithm produce?
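As a hint for the two extremes, one can fit with a very large lambda and with lambda = 0 and compare against ordinary least squares (a sketch; it re-creates x and y so it runs on its own):

```r
library(ISLR)
library(glmnet)

data(Hitters)
Hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary

# Huge lambda: every slope is shrunk to (essentially) zero,
# so the model reduces to the intercept-only fit, mean(y)
big <- coef(glmnet(x, y, alpha = 0, lambda = 1e10))
max(abs(as.numeric(big)[-1]))         # practically zero
as.numeric(big)[1]; mean(y)           # intercept is close to mean(y)

# lambda = 0: no penalty, so the fit approaches ordinary least squares
small <- coef(glmnet(x, y, alpha = 0, lambda = 0))
cbind(as.numeric(small), coef(lm(y ~ x)))  # compare with the OLS coefficients
```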
3. LASSO
Same as ridge regression, but with alpha = 1. Notice that the coefficients can now be exactly zero.
lasso.mod <- glmnet(x, y, alpha = 1, lambda = grid)
par(mfrow = c(1,2))
plot(lasso.mod, xvar = 'norm')
plot(lasso.mod, xvar = 'lambda')
Large lambda
lasso.mod$lambda[50]
coef(lasso.mod)[, 50]
sqrt(sum(coef(lasso.mod)[-1, 50] ^ 2))
Small lambda
lasso.mod$lambda[80]
coef(lasso.mod)[, 80]
sqrt(sum(coef(lasso.mod)[-1, 80] ^ 2))
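To see the sparsity explicitly, one can count how many coefficients are exactly zero at a large versus a small lambda (a sketch that re-creates the model so it runs on its own):

```r
library(ISLR)
library(glmnet)

data(Hitters)
Hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary

grid <- 10 ^ seq(10, -2, length = 100)
lasso.mod <- glmnet(x, y, alpha = 1, lambda = grid)

sum(coef(lasso.mod)[, 50] == 0)  # large lambda: most coefficients are exactly zero
sum(coef(lasso.mod)[, 80] == 0)  # small lambda: fewer zeros
```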
Cross validation
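One way to choose lambda by cross-validation is cv.glmnet (a sketch for ridge; set alpha = 1 for the LASSO):

```r
library(ISLR)
library(glmnet)

data(Hitters)
Hitters <- na.omit(Hitters)
x <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary

set.seed(1)
cv.out <- cv.glmnet(x, y, alpha = 0)   # 10-fold CV by default
plot(cv.out)                           # CV error against log(lambda)
bestlam <- cv.out$lambda.min           # lambda with the smallest CV error
bestlam
```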
4. Tutorial
Explain how K-fold cross-validation is implemented for ridge regression / LASSO with scaling. Please specify
how to compute the cross-validation error and how the scaling is implemented.
Below is pseudo-code for CV without scaling. Note that it does not represent the cv.glmnet implementation.
Modify the pseudo-code to answer the question above.
Suppose L is a vector containing the lambda values to try, and X is our data.
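As one illustration (our sketch, not the handout's omitted pseudo-code and not how cv.glmnet is implemented), here is a runnable K-fold CV loop for ridge without scaling; adding the scaling step is left for the tutorial:

```r
library(ISLR)
library(glmnet)

data(Hitters)
Hitters <- na.omit(Hitters)
X <- model.matrix(Salary ~ ., Hitters)[, -1]
y <- Hitters$Salary
L <- 10 ^ seq(4, -2, length = 20)    # lambda values to try

K <- 10
set.seed(1)
folds <- sample(rep(1:K, length.out = nrow(X)))  # assign each row to a fold
cv.error <- numeric(length(L))
for (i in seq_along(L)) {
  fold.error <- numeric(K)
  for (k in 1:K) {
    # fit on all folds except k, at the fixed lambda L[i]
    fit <- glmnet(X[folds != k, ], y[folds != k], alpha = 0, lambda = L[i])
    pred <- predict(fit, newx = X[folds == k, ], s = L[i])
    fold.error[k] <- mean((y[folds == k] - pred)^2)  # held-out MSE
  }
  cv.error[i] <- mean(fold.error)    # average over the K folds
}
best.lambda <- L[which.min(cv.error)]
```

To add scaling, standardise using the training folds only — compute the means and standard deviations on `X[folds != k, ]` and apply those same values to the held-out fold — so no information leaks from the test fold into the fit.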