
ISYE6501 Office Hours

Week 8, Monday
Agenda
• Housekeeping
• Homework 8 Deep Dive
• Up Next
Housekeeping
• MT1 is now over!
• Congratulations on making it this far
• Discussion hubs are / will be available on Piazza – please post all exam-
related questions there so everyone can learn together
• OMS - @2040
• edX – TBD; discussion hubs and exam results will be released in the next few days after
proctor review is complete
• Don’t rest too long though, because…
• MT2 opens November 10th (2am EDT)
HW8 Q11.1 – Advanced Regression
Question 11.1
• Using the crime data set uscrime.txt from Questions 8.2, 9.1, and
10.1, build a regression model using:
1. Stepwise regression
2. Lasso
3. Elastic net
• For Parts 2 and 3, remember to scale the data first – otherwise, the
regression coefficients will be on different scales and the constraint
won’t have the desired effect (a scaling sketch follows after this list).
• For Parts 2 and 3, use the glmnet function in R.
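• A minimal sketch of that scaling step in R (the column names So and Crime
are assumed from uscrime.txt; scale() centers each column and divides it by
its standard deviation):

    # read the crime data; assumes uscrime.txt is in the working directory
    data <- read.table("uscrime.txt", header = TRUE)

    # scale every predictor except the binary factor So; leave the response Crime alone
    predictors <- data[, !(names(data) %in% c("So", "Crime"))]
    scaled_data <- data.frame(scale(predictors), So = data$So, Crime = data$Crime)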
HW8 Q11.1 – Advanced Regression
• Notes on R:
• For the elastic net model, what we called λ in the videos, glmnet calls
“alpha”; you can get a range of results by varying alpha from 1 (lasso) to 0
(ridge regression) [and, of course, other values of alpha in between].
• In a function call like glmnet(x, y, family="mgaussian", alpha=1) the
predictors x need to be in R’s matrix format, rather than data frame format.
You can convert a data frame to a matrix using as.matrix – for example, x
<- as.matrix(data[,1:(n-1)]). (Note the parentheses: R parses 1:n-1 as
(1:n)-1, so writing 1:(n-1) is safer.)
• Rather than specifying a value of T, glmnet returns models for a variety of
values of T.
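• Putting those notes together, a sketch of the conversion and the call. This
sketch uses family="gaussian", since the response here is a single vector
("mgaussian" is glmnet’s multi-response family), and assumes scaled_data from
the earlier sketch:

    library(glmnet)

    # glmnet wants a numeric matrix of predictors and a numeric response vector
    x <- as.matrix(scaled_data[, names(scaled_data) != "Crime"])
    y <- scaled_data$Crime

    # alpha = 1 is lasso; glmnet fits models along a whole path of lambda values
    lasso_path <- glmnet(x, y, family = "gaussian", alpha = 1)
    plot(lasso_path, xvar = "lambda")  # coefficient paths as lambda grows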
HW8 Q11.1 – Stepwise Regression
• Backward Elimination
• Starts with the full model and removes the variable that reduces AIC the most
• Forward Selection
• Starts with a model that is just the intercept and adds the variable that
reduces AIC the most
• Stepwise Selection
• Starts with the full model and looks at both additions and removals to the
model at each step
• All three methods stop once AIC no longer decreases
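• A minimal sketch of all three directions using base R’s step(), which
compares models by AIC (assumes scaled_data from the earlier sketch;
MASS::stepAIC behaves the same way):

    full_model <- lm(Crime ~ ., data = scaled_data)
    null_model <- lm(Crime ~ 1, data = scaled_data)

    # backward: start full, drop whichever variable lowers AIC the most
    backward <- step(full_model, direction = "backward")

    # forward: start from the intercept, add whichever variable lowers AIC the most
    forward <- step(null_model, direction = "forward", scope = formula(full_model))

    # stepwise: consider both an addition and a removal at every step
    stepwise <- step(full_model, direction = "both")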
HW8 Q11.1 – Regularized Regression
• Constrains or shrinks the coefficients towards zero
• Shrinking the coefficients can reduce the variance of the model
• This reduces overfitting
• The glmnet vignette is an excellent resource for this assignment
• Specifically, the Quick Start section
• https://glmnet.stanford.edu/articles/glmnet.html
• Lambda determines how powerful the regularization effect is
• A lambda of zero means no penalty is applied
• For all of the regularization methods, you will need to find an optimal value
of lambda using cross-validation (a cv.glmnet sketch follows below)
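• A sketch of that cross-validation with cv.glmnet (alpha = 1 and the default
10 folds are just one choice; x and y are from the earlier sketch):

    # cv.glmnet cross-validates over the entire lambda path in one call
    cv_fit <- cv.glmnet(x, y, alpha = 1)

    cv_fit$lambda.min   # lambda with the lowest cross-validated error
    cv_fit$lambda.1se   # most regularized lambda within one SE of the minimum
    plot(cv_fit)        # CV error as a function of log(lambda)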
HW8 Q11.1 – Ridge Regression
• A regularization method that uses an L2 regularization parameter to
shrink the coefficients
• L2 regularization is the sum of the squares of the coefficients
• Cannot shrink a coefficient to zero, thus cannot be used for variable selection
• Tends to shrink correlated variables towards each other
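• For example (a sketch; alpha = 0 selects ridge in glmnet):

    ridge_cv <- cv.glmnet(x, y, alpha = 0)
    coef(ridge_cv, s = "lambda.min")  # every coefficient shrinks, but none is exactly zero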
HW8 Q11.1 – LASSO
• A regularization method that uses an L1 regularization parameter to
shrink the coefficients
• L1 regularization is the sum of the absolute values of the coefficients
• LASSO can reduce coefficients to zero thereby performing variable selection
• If you have correlated variables, LASSO will usually shrink one of them to
zero, and which one it drops is essentially arbitrary rather than meaningful.
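• For example (a sketch; alpha = 1 selects lasso):

    lasso_cv <- cv.glmnet(x, y, alpha = 1)
    coef(lasso_cv, s = "lambda.min")  # entries shown as "." are exactly zero, i.e. dropped variables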
HW8 Q11.1 – Elastic Net
• This method uses both the L1 and L2 penalty terms
• The L1 term allows variable coefficients to shrink to zero
• The L2 term helps to shrink correlated variables towards each other
• Alpha is the mixing parameter
• When alpha = 1, you have a LASSO model
• When alpha = 0, you have a Ridge Regression model
• An alpha of 0.5 means the L1 and L2 terms are weighted equally
• You will need to find the best value of alpha for your data
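• One way to search for alpha (a sketch; the grid, seed, and 10 folds are
arbitrary choices – fixing foldid makes every alpha use the same CV splits):

    set.seed(42)
    foldid <- sample(rep(1:10, length.out = nrow(x)))  # fixed fold assignment

    alphas <- seq(0, 1, by = 0.1)
    cv_errors <- sapply(alphas, function(a) {
      min(cv.glmnet(x, y, alpha = a, foldid = foldid)$cvm)  # best CV MSE at this alpha
    })
    best_alpha <- alphas[which.min(cv_errors)]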
Up Next…
• Same schedule as previous weeks (as always, check the syllabus for
official dates/times)
• HW8 due next Wednesday / Thursday
• HW8 peer reviews due Sunday / Monday
