
Exercise Lecture 5

• Exercise: Explain the differences and similarities between:
  • LinR, LogR, MLP
  • Perceptron

PLA (Perceptron Learning Algorithm).
Advantages: convergence is guaranteed when the data are linearly
separable.
Disadvantages: valid only when the data are linearly separable;
non-separable data requires the pocket algorithm.
Linear regression.
Pros: easy to optimize (closed-form solution).
Disadvantages: its squared error only upper-bounds the 0/1 error,
so the resulting bound on classification error is relatively loose.
Logistic regression.
Pros: easy to optimize (convex cross-entropy error).
Disadvantages: its cross-entropy error likewise only upper-bounds the
0/1 error, giving a relatively loose bound.
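
A quick way to see why the squared and cross-entropy errors only give loose bounds on the 0/1 error is to plot all three against the signed score ys (a minimal sketch; scaling the cross-entropy by 1/ln 2 is the usual choice that makes it an upper bound):

```python
import numpy as np
import matplotlib.pyplot as plt

# Signed score y*s: positive means the example is classified correctly.
ys = np.linspace(-2, 2, 400)

err01 = (ys <= 0).astype(float)      # 0/1 error
err_sqr = (ys - 1) ** 2              # squared error (linear regression)
err_ce = np.log2(1 + np.exp(-ys))    # scaled cross-entropy (logistic regression)

plt.plot(ys, err01, label="0/1 error")
plt.plot(ys, err_sqr, label="squared error")
plt.plot(ys, err_ce, label="scaled cross-entropy")
plt.xlabel("ys")
plt.ylabel("error")
plt.legend()
plt.show()
```

Both smooth curves sit above the step function everywhere, which is what makes them usable, but loose, surrogates in the bound.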

• Exercise: We’ve shown how the MLP is used for classification.
Consider and explain in detail how an MLP for regression would
look; show the structure, CFG, the equations and relevant
gradients.

When an MLP is changed from classification to regression, only the
output layer changes: instead of squashing the final linear combination
through the logistic function to a value in (0, 1), the output unit is
left linear, so the network can produce any real value. The loss
correspondingly changes from cross-entropy to squared error.
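
As a minimal sketch of the equations (notation mine: h is the last hidden layer's output vector, w(2) and b(2) the output layer's weights and bias), with an identity output activation and squared-error loss:

```latex
\hat{y} = \mathbf{w}^{(2)\top}\mathbf{h} + b^{(2)}, \qquad
E = \tfrac{1}{2}(\hat{y} - y)^2, \qquad
\frac{\partial E}{\partial \hat{y}} = \hat{y} - y, \qquad
\frac{\partial E}{\partial \mathbf{w}^{(2)}} = (\hat{y} - y)\,\mathbf{h}
```

Backpropagation through the hidden layers then proceeds exactly as in the classification case, starting from this output delta.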

• Exercise: Plot the logistic function, and the derivative of the
logistic function; note the magnitudes.
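
A minimal plotting sketch (variable names are mine); note that the derivative never exceeds 0.25:

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-6, 6, 400)
sigma = 1 / (1 + np.exp(-z))     # logistic function
dsigma = sigma * (1 - sigma)     # its derivative, peaks at 0.25 at z = 0

plt.plot(z, sigma, label="sigma(z)")
plt.plot(z, dsigma, label="sigma'(z)")
plt.legend()
plt.show()
```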

• Exercise: Plot tanh and its derivative, and compare it with the
logistic function.
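
A companion sketch (same assumptions as above): tanh's derivative peaks at 1, four times the logistic peak of 0.25, which is one reason tanh-based networks propagate larger gradients:

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-6, 6, 400)
t = np.tanh(z)
dt = 1 - t ** 2                  # tanh'(z), peaks at 1
sigma = 1 / (1 + np.exp(-z))
dsigma = sigma * (1 - sigma)     # logistic derivative, peaks at 0.25

plt.plot(z, t, label="tanh(z)")
plt.plot(z, dt, label="tanh'(z)")
plt.plot(z, dsigma, label="sigma'(z)")
plt.legend()
plt.show()
```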

• ReLU(z) = { z for z ≥ 0; 0 otherwise }


• Exercise: Plot this function and its derivative.
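
A minimal sketch (ReLU is not differentiable at 0; the code takes the common convention of using 1 there):

```python
import numpy as np
import matplotlib.pyplot as plt

z = np.linspace(-3, 3, 400)
relu = np.maximum(0, z)
drelu = (z >= 0).astype(float)   # subgradient; 1 at z = 0 by convention

plt.plot(z, relu, label="ReLU(z)")
plt.plot(z, drelu, label="ReLU'(z)")
plt.legend()
plt.show()
```
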
• lReLU(z) = { z for z ≥ 0; kz otherwise }, where k ∈ (0, 1)

• Exercise: Plot this function and its derivative.
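
A minimal sketch; the slope k = 0.1 below is an arbitrary choice in (0, 1):

```python
import numpy as np
import matplotlib.pyplot as plt

k = 0.1                             # assumed leak slope
z = np.linspace(-3, 3, 400)
lrelu = np.where(z >= 0, z, k * z)
dlrelu = np.where(z >= 0, 1.0, k)   # derivative: 1 for z >= 0, k otherwise

plt.plot(z, lrelu, label="lReLU(z)")
plt.plot(z, dlrelu, label="lReLU'(z)")
plt.legend()
plt.show()
```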


• Exercise: Construct an MLP-N network for regression, where ReLU
is used in the hidden layers, and the output layer is a linear
combination of the prior hidden layer outputs. Generate a
polynomial dataset (e.g., a cubic function with noise), and
train your MLP with it.
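
A minimal NumPy sketch under these assumptions: one hidden layer of 32 ReLU units, a plain gradient-descent loop, and arbitrary cubic coefficients for the dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic cubic dataset with noise (coefficients are arbitrary choices).
X = np.linspace(-2, 2, 200).reshape(-1, 1)
y = 0.5 * X**3 - X + 1 + 0.2 * rng.standard_normal(X.shape)

# One hidden ReLU layer, linear output, squared-error loss.
n_hidden = 32
W1 = 0.5 * rng.standard_normal((1, n_hidden)); b1 = np.zeros(n_hidden)
W2 = 0.5 * rng.standard_normal((n_hidden, 1)); b2 = np.zeros(1)
lr = 0.01
N = len(X)

for epoch in range(2000):
    # Forward pass.
    Z1 = X @ W1 + b1
    H1 = np.maximum(0, Z1)          # ReLU hidden layer
    Yhat = H1 @ W2 + b2             # linear output for regression

    # Backward pass for the mean squared error.
    dY = (Yhat - y) / N
    dW2 = H1.T @ dY;  db2 = dY.sum(axis=0)
    dZ1 = (dY @ W2.T) * (Z1 > 0)    # ReLU derivative gates the backward signal
    dW1 = X.T @ dZ1;  db1 = dZ1.sum(axis=0)

    # Gradient-descent update.
    W1 -= lr * dW1;  b1 -= lr * db1
    W2 -= lr * dW2;  b2 -= lr * db2

print("final MSE:", float(np.mean((Yhat - y) ** 2)))
```
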
• Exercise: Show how to extend our logistic regression and MLP
equations with vectorization.
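
A sketch of the idea for logistic regression (assuming {0, 1} labels): all N per-example gradient contributions collapse into one matrix product.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def logreg_gradient(w, X, y):
    """Vectorized gradient: X is (N, d), y is (N,) in {0, 1}, w is (d,)."""
    p = sigmoid(X @ w)               # all N predictions at once
    return X.T @ (p - y) / len(y)    # replaces the per-example summation loop
```

The same idea carries over to the MLP: stacking the examples as rows turns each layer's per-example outer products into a single matrix product such as H.T @ delta.
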
• Exercise: Sketch how you would re-code your MLP solver to use
mini-batch gradient descent, and test this on Colab.
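
A sketch of the re-coding (a generic loop; grad_fn and the parameter handling are placeholders, not the lecture's exact solver):

```python
import numpy as np

def minibatch_gd(params, grad_fn, X, y, lr=0.01, batch_size=32, epochs=10):
    """grad_fn(params, Xb, yb) must return the gradient on one mini-batch."""
    rng = np.random.default_rng(0)
    N = len(X)
    for _ in range(epochs):
        order = rng.permutation(N)              # reshuffle every epoch
        for start in range(0, N, batch_size):
            idx = order[start:start + batch_size]
            params = params - lr * grad_fn(params, X[idx], y[idx])
    return params
```
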
• Bonus: Determine the vector unit architecture available on
Colab, and test out your MLP solver using these accelerators
and quantify the speedup.
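
A couple of quick checks that work in a Colab notebook (assuming the preinstalled PyTorch; the flag names below are the common x86 SIMD extensions):

```python
import torch

if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
else:
    print("no CUDA GPU in this runtime")

# CPU SIMD (vector-unit) flags exposed by the kernel:
flags = open("/proc/cpuinfo").read()
print("SIMD:", [f for f in ("sse4_2", "avx", "avx2", "avx512f") if f in flags])
```
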
• Exercise: What is the purpose of the regularization penalty?
• Bonus: Use the above formulation and code an SVM solver. Test
this out on a dataset that is linearly separable (LSD) with added
noise. Plot the margins of your dataset.
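
The lecture's own formulation is not reproduced here; as a stand-in sketch, scikit-learn's linear SVC on synthetic blobs shows the boundary and the two margins (levels -1 and +1 of the decision function):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two nearly separable blobs; the overlap plays the role of noise.
X, y = make_blobs(n_samples=100, centers=2, cluster_std=1.2, random_state=0)
clf = SVC(kernel="linear", C=1.0).fit(X, y)

plt.scatter(X[:, 0], X[:, 1], c=y)
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 200),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 200))
Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
plt.contour(xx, yy, Z, levels=[-1, 0, 1], linestyles=["--", "-", "--"], colors="k")
plt.show()
```
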
• Exercise: Apply a regularization penalty to the MLP-3. Show the
CFG (weights to L), and apply backpropagation on this.
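
In equation form (a sketch, assuming an L2 weight-decay penalty; λ is the penalty strength and W(l) a layer's weight matrix), the augmented error adds one term per layer, and backpropagation gains the corresponding extra gradient term while biases are usually left unpenalized:

```latex
E_{\text{aug}} = E + \lambda \sum_{l} \lVert W^{(l)} \rVert_F^2, \qquad
\frac{\partial E_{\text{aug}}}{\partial W^{(l)}} =
\frac{\partial E}{\partial W^{(l)}} + 2\lambda\, W^{(l)}
```
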
• Bonus: Code an MLP-3 solver with and without the regularization
penalty, and test both solvers on the same dataset (non-LSD,
with noise). Comment on the performance.
• Once your MLP-3 solver without the regularization penalty
converges (training and validation), increase the number
of hidden units in the MLP-3 hidden layer until you see
overfitting occurring (i.e., training converges, but
validation starts to fail).
• Try the same model parameters but in the version with the
regularization penalty. Does this model generalize
better?
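
One quick way to run this comparison, as a sketch, is scikit-learn's MLPRegressor, whose alpha parameter is an L2 penalty (the dataset, hidden-layer size, and alpha values below are arbitrary choices):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(300, 1))
y = 0.5 * X[:, 0] ** 3 - X[:, 0] + 0.3 * rng.standard_normal(300)   # noisy cubic

X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.3, random_state=0)

for alpha in (0.0, 1e-2):   # without and with the regularization penalty
    mlp = MLPRegressor(hidden_layer_sizes=(200,), alpha=alpha,
                       max_iter=5000, random_state=0).fit(X_tr, y_tr)
    print(f"alpha={alpha}: train R^2 = {mlp.score(X_tr, y_tr):.3f}, "
          f"validation R^2 = {mlp.score(X_va, y_va):.3f}")
```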
