18CSE013T4061
The aim of weight initialization is to prevent layer activation outputs from exploding or vanishing during the course of a forward pass through a deep neural network.
For deep networks, we can use a ____ heuristic ____ to initialize the weights depending on the non-linear
activation function.
The current standard approach for initializing the weights of neural network layers and nodes that
use the Sigmoid or TanH activation function is called ____ Xavier ____ initialization (also known as Glorot initialization).
The Xavier initialization method is calculated as a random number with a uniform probability
distribution (U) in the range ____ -(sqrt(6)/sqrt(n + m)) to sqrt(6)/sqrt(n + m) ____,
where n is the number of inputs to the node and m is the number of outputs from the layer.
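The range above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation; the names xavier_limit, xavier_uniform, n_in, n_out and rng are hypothetical and chosen here for the example.

```python
import math
import random


def xavier_limit(n_in, n_out):
    """Bound of the uniform range: sqrt(6) / sqrt(n + m)."""
    return math.sqrt(6.0) / math.sqrt(n_in + n_out)


def xavier_uniform(n_in, n_out, rng=None):
    """Return an n_in x n_out list of weights drawn from U(-limit, limit).

    n_in  : number of inputs to the node (n in the text)
    n_out : number of outputs from the layer (m in the text)
    rng   : optional random.Random instance (an assumption, for reproducibility)
    """
    rng = rng or random.Random()
    limit = xavier_limit(n_in, n_out)
    return [[rng.uniform(-limit, limit) for _ in range(n_out)]
            for _ in range(n_in)]


# Example: a layer with 10 inputs and 20 outputs.
weights = xavier_uniform(10, 20, rng=random.Random(0))
limit = xavier_limit(10, 20)
print(f"limit = {limit:.4f}")
print(f"min = {min(min(row) for row in weights):.4f}")
print(f"max = {max(max(row) for row in weights):.4f}")
```

Every sampled weight falls inside the symmetric range, and the range narrows as n + m grows, which keeps the scale of the layer outputs from growing with the layer width.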
Assume a simple MLP model with 3 input neurons and inputs 1, 2 and 3. The weights on the input neurons are
4, 5 and 6 respectively. Assume the activation function is linear with a constant factor of 3, that is, f(x) = 3x. What will be the
output? 96, since 3 × (1×4 + 2×5 + 3×6) = 3 × 32 = 96.
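The arithmetic behind this answer can be checked with a short Python sketch. It assumes the three weighted inputs are summed into a single output and that the "linear constant value of 3" means the activation f(x) = 3x; the function name linear_neuron_output and its parameters are hypothetical.

```python
def linear_neuron_output(inputs, weights, slope):
    """Weighted sum of the inputs passed through a linear activation f(x) = slope * x."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return slope * weighted_sum


# Inputs 1, 2, 3 with weights 4, 5, 6 and a linear activation with constant 3.
weighted_sum = 1 * 4 + 2 * 5 + 3 * 6   # 4 + 10 + 18 = 32
output = linear_neuron_output([1, 2, 3], [4, 5, 6], 3)
print(weighted_sum)  # 32
print(output)        # 96
```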