
CISC 867: Deep Learning

Assignment #1

1. Consider a neural network with two input neurons, x1 and x2; three hidden neurons, h1, h2 and h3; and two
output neurons, y1 and y2. The network acts as a two-class classifier. The following equations govern
the network's operation:

where σ(𝑥) = 1 if 𝑥 ≥ 0 and σ(𝑥) = 0 otherwise.


Draw the decision region for each class and the decision boundary.
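
One way to check a hand-drawn decision region is to evaluate the network on a grid of input points and mark which output fires. The sketch below assumes a threshold network of the shape described in the question (2 inputs, 3 hidden units, 2 outputs). The weights and biases in it are hypothetical placeholders, not the values from the assignment's equations, and the names `step`, `forward` and `ascii_region` are made up for illustration. Substitute the real coefficients before relying on the picture.

```python
def step(x):
    """Threshold activation from the question: 1 if x >= 0, else 0."""
    return 1 if x >= 0 else 0

# HYPOTHETICAL parameters, used only so the sketch runs.
# Replace them with the coefficients from the assignment's equations.
W_HIDDEN = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0)]   # one (w_x1, w_x2) pair per hidden unit
B_HIDDEN = [0.0, 0.0, 1.5]
W_OUT = [(1.0, 1.0, 1.0), (-1.0, -1.0, -1.0)]       # one (w_h1, w_h2, w_h3) triple per output
B_OUT = [-2.5, 2.5]

def forward(x1, x2):
    """Return (hidden activations, output activations) for one input point."""
    h = [step(w1 * x1 + w2 * x2 + b) for (w1, w2), b in zip(W_HIDDEN, B_HIDDEN)]
    y = [step(sum(w * hi for w, hi in zip(ws, h)) + b) for ws, b in zip(W_OUT, B_OUT)]
    return h, y

def ascii_region(lo=-1.0, hi=2.0, n=13):
    """Label each grid point '1' if y1 fires, '2' if only y2 fires, '.' if neither."""
    rows = []
    for i in range(n):
        x2 = hi - i * (hi - lo) / (n - 1)        # top row corresponds to the largest x2
        row = ""
        for j in range(n):
            x1 = lo + j * (hi - lo) / (n - 1)
            _, y = forward(x1, x2)
            row += "1" if y[0] == 1 else ("2" if y[1] == 1 else ".")
        rows.append(row)
    return rows

if __name__ == "__main__":
    print("\n".join(ascii_region()))
```

Each hidden threshold unit splits the input plane along a line, so the boundary between the printed '1' and '2' cells should follow segments of those lines; a finer grid (larger `n`) gives a sharper picture.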

2. Consider a three-layer neural network whose structure is shown in Figure 1. You are required to
calculate the sensitivity $\delta_k = -\partial J / \partial net_k$ at the output node k, where J is the
objective function to be minimized and $net_k$ is the net activation of the output node k. We consider two
cases, in which the objective function J and the nonlinear activation function at the output layer are
chosen differently.
a. In the first case, J is chosen as the squared error $J(W) = \frac{1}{2}\lVert t - z \rVert^2$, where
$z_k$ is the prediction at the output node k and $t_k$ is the corresponding target value. In the
classification problem, exactly one $t_k$ equals 1 (corresponding to the ground-truth class) and all the
other $t_k$ are zero. The sigmoid $f(net_k) = 1/(1 + e^{-net_k})$ is chosen as the activation function at
the output layer. Calculate the sensitivity $\delta_k$ in terms of $t_k$, $z_k$ and $net_k$. Show that all
the $\delta_k$ can be close to zero even if the prediction error is large, and explain why this is bad.
b. In the second case, the objective function is chosen as the cross entropy,
$J(W) = -\sum_{k=1}^{C} t_k \log(z_k)$, and the nonlinear activation function at the output layer is
chosen as the softmax, $f(net_k) = e^{net_k} / \sum_{j=1}^{C} e^{net_j}$. Calculate the sensitivity
$\delta_k$ again. Prove that if the prediction error is large, at least one of the $\delta_k$ will be large.
Figure 1
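
A hand-derived sensitivity for either case in question 2 can be sanity-checked numerically. The sketch below approximates $\delta_k = -\partial J / \partial net_k$ with central finite differences, for the squared-error/sigmoid pairing and for the cross-entropy/softmax pairing. It is an assumption-laden check rather than a derivation: the helper names, the step size `h`, and the two-class example values of `net` and `t` are all illustrative choices, not part of the assignment.

```python
import math

def sigmoid_outputs(net):
    """Element-wise sigmoid, z_k = 1 / (1 + exp(-net_k))."""
    return [1.0 / (1.0 + math.exp(-n)) for n in net]

def softmax_outputs(net):
    """Softmax over all output nodes; the max is subtracted for numerical stability."""
    m = max(net)
    exps = [math.exp(n - m) for n in net]
    total = sum(exps)
    return [e / total for e in exps]

def squared_error(net, t):
    """Case (a): J = 1/2 * ||t - z||^2 with sigmoid outputs."""
    z = sigmoid_outputs(net)
    return 0.5 * sum((tk - zk) ** 2 for tk, zk in zip(t, z))

def cross_entropy(net, t):
    """Case (b): J = -sum_k t_k log(z_k) with softmax outputs."""
    z = softmax_outputs(net)
    # Terms with t_k == 0 contribute nothing, so they are skipped.
    return -sum(tk * math.log(zk) for tk, zk in zip(t, z) if tk != 0)

def numerical_sensitivity(objective, net, t, h=1e-5):
    """Approximate delta_k = -dJ/dnet_k for every output node k by central differences."""
    deltas = []
    for k in range(len(net)):
        up = list(net)
        up[k] += h
        down = list(net)
        down[k] -= h
        deltas.append(-(objective(up, t) - objective(down, t)) / (2 * h))
    return deltas

# Illustrative (assumed) example: the true class is node 0, but the net
# activations strongly favour node 1, so the prediction error is large.
net = [-10.0, 10.0]
t = [1.0, 0.0]
delta_mse = numerical_sensitivity(squared_error, net, t)
delta_ce = numerical_sensitivity(cross_entropy, net, t)

if __name__ == "__main__":
    print("squared error + sigmoid:", delta_mse)
    print("cross entropy + softmax:", delta_ce)
```

Comparing `delta_mse` and `delta_ce` against the closed-form expressions you derive, for a few confidently wrong and confidently right settings of `net`, is one way to catch sign or index mistakes.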

3. Which of the following two figures represents stochastic gradient descent and which represents
true gradient descent? Explain.
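
To build intuition for question 3, the two update rules can be run side by side on a small problem. The sketch below fits a line to synthetic data twice: once with the gradient averaged over the full data set ("true" gradient descent) and once with the gradient of a single randomly chosen sample per step (stochastic gradient descent). The data, learning rate, step count and function names are assumptions made for illustration and are unrelated to the figures in the assignment.

```python
import random

def make_data(n=50, seed=0):
    """Synthetic (assumed) data: y = 2x + 1 plus small Gaussian noise."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [2.0 * x + 1.0 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys

def loss(w, b, xs, ys):
    """Mean squared error (with the conventional 1/2 factor) over the given samples."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))

def grad(w, b, xs, ys):
    """Gradient of the loss with respect to (w, b), averaged over the given samples."""
    n = len(xs)
    gw = sum((w * x + b - y) * x for x, y in zip(xs, ys)) / n
    gb = sum((w * x + b - y) for x, y in zip(xs, ys)) / n
    return gw, gb

def run(xs, ys, steps=200, lr=0.1, stochastic=False, seed=1):
    """Return the full-data loss after every update, starting from w = b = 0."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    losses = [loss(w, b, xs, ys)]
    for _ in range(steps):
        if stochastic:
            i = rng.randrange(len(xs))                 # one random sample per step
            gw, gb = grad(w, b, [xs[i]], [ys[i]])
        else:
            gw, gb = grad(w, b, xs, ys)                # gradient over all samples
        w, b = w - lr * gw, b - lr * gb
        losses.append(loss(w, b, xs, ys))
    return losses

xs, ys = make_data()
batch_losses = run(xs, ys, stochastic=False)
sgd_losses = run(xs, ys, stochastic=True)

if __name__ == "__main__":
    for step_index in range(0, 201, 40):
        print(step_index, batch_losses[step_index], sgd_losses[step_index])
```

Plotting the two loss lists, or the (w, b) pairs if you record them, shows how the shape of each trajectory differs, which is the property the question asks you to recognise in the figures.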
4. Which of the following options most closely relates to the following graph? Green crosses are samples of
Class A, mustard rings are samples of Class B, and the red line is the separating line between the
two classes. Justify your answer.

a. Overfitting
b. Underfitting
c. Appropriate fit
d. Cannot comment
