
AIM : Analysis and Implementation of Deep Neural Network.

DEFINITIONS :

• Machine Learning (ML) : Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed.
• Deep Learning (DL) : Deep learning is a subset of machine learning where artificial neural networks, algorithms inspired by the human brain, learn from large amounts of data.
• Neural Network (NN) : Neural networks are a set of algorithms, modeled loosely after the human brain, that are designed to recognize patterns.
• Deep Neural Network (DNN) : A deep neural network is a neural network with a certain level of complexity, i.e. a neural network with more than two layers. Deep neural networks use sophisticated mathematical modeling to process data in complex ways.
• Convolutional Neural Network (CNN) : In deep learning, a convolutional neural network (CNN, or ConvNet) is a class of deep neural networks, most commonly applied to analyzing visual imagery.

INTRODUCTION :
Deep learning is a class of machine learning algorithms that use multiple layers to progressively
extract higher level features from raw input.
ARCHITECTURE OVERVIEW : Neural Networks receive an input (a single
vector), and transform it through a series of hidden layers. Each hidden layer is made up
of a set of neurons, where each neuron is fully connected to all neurons in the previous
layer, and where neurons in a single layer function completely independently and do not
share any connections. The last fully-connected layer is called the “output layer” and in
classification settings it represents the class scores. Convolutional Neural Networks
(ConvNets) have their neurons arranged in 3 dimensions: width, height, depth. By the end
of the ConvNet architecture, the final output layer reduces the full image to a single
vector of class scores, arranged along the depth dimension.

There are three main types of layers used to build ConvNet architectures. Each layer accepts
an input 3D volume and transforms it to an output 3D volume through a differentiable
function.

1. Convolutional Layer
The CONV layer is the core building block of a ConvNet. The CONV layer's
parameters consist of a set of learnable filters.
• Accepts a volume of size W1×H1×D1
• Requires four hyperparameters:
  • the number of filters K,
  • their spatial extent F,
  • the stride S,
  • the amount of zero padding P.
• Produces a volume of size W2×H2×D2 where:
  • W2 = (W1 − F + 2P)/S + 1
  • H2 = (H1 − F + 2P)/S + 1 (i.e. width and height are computed equally by symmetry)
  • D2 = K
• With parameter sharing, it introduces F⋅F⋅D1 weights per filter, for a total of (F⋅F⋅D1)⋅K weights and K biases.
• In the output volume, the d-th depth slice (of size W2×H2) is the result of performing a valid convolution of the d-th filter over the input volume with a stride of S, and then offsetting by the d-th bias.
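
The following short Python sketch (illustrative, not part of the original report) works through these output-size and parameter-count formulas for a hypothetical 32×32×3 input volume with 16 filters of size 3×3, stride 1 and zero padding 1:

# Checks the CONV output-size formula W2 = (W1 - F + 2P)/S + 1 and the
# parameter count of (F*F*D1)*K weights plus K biases for example numbers.
def conv_output_size(W1, H1, D1, K, F, S, P):
    W2 = (W1 - F + 2 * P) // S + 1
    H2 = (H1 - F + 2 * P) // S + 1
    D2 = K
    params = (F * F * D1) * K + K
    return (W2, H2, D2), params

print(conv_output_size(32, 32, 3, K=16, F=3, S=1, P=1))
# -> ((32, 32, 16), 448)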

2. Pooling Layer
It progressively reduces the spatial size of the representation to reduce the amount of
parameters and computation in the network, and hence to also control overfitting.
• Accepts a volume of size W1×H1×D1.
• Requires two hyperparameters:
  • their spatial extent F,
  • the stride S.
• Produces a volume of size W2×H2×D2 where:
  • W2 = (W1 − F)/S + 1
  • H2 = (H1 − F)/S + 1
  • D2 = D1
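
A similar illustrative Python sketch (not from the original report) for the pooling formulas, using the common 2×2 max pooling with stride 2:

# Checks the POOL output-size formulas W2 = (W1 - F)/S + 1, H2 = (H1 - F)/S + 1, D2 = D1.
def pool_output_size(W1, H1, D1, F, S):
    return ((W1 - F) // S + 1, (H1 - F) // S + 1, D1)

print(pool_output_size(32, 32, 16, F=2, S=2))
# -> (16, 16, 16), i.e. width and height are halved while depth is unchanged
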
3. Fully-Connected Layer (Dense)
Fully connected layers connect every neuron in one layer to every neuron in another layer.
It is in principle the same as the traditional multi-layer perceptron neural network (MLP).
The flattened matrix goes through a fully connected layer to classify the images (see the combined layer sketch after the additional layers below).

Other additional layers


1. Flatten layer.
Flattening converts the data into a 1-dimensional array for input to the next
layer. We flatten the output of the convolutional layers to create a single long feature
vector.

2. Batch Normalisation.
Batch Normalization normalizes the output of a previous activation layer by subtracting
the batch mean and dividing by the batch standard deviation. It increases the stability of a
neural network.

3. Leaky ReLU
It is an activation function which helps the network learn non-linear decision boundaries.
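
The minimal Keras sketch below (layer names taken from the Keras API listed under SOFTWARE USED; the filter counts and input shape are illustrative, not from the original report) shows how the convolutional, pooling and fully-connected (Dense) layers, together with the additional Flatten, Batch Normalization and Leaky ReLU layers described above, can be stacked into one model:

from tensorflow import keras

model = keras.models.Sequential([
    keras.layers.Conv2D(16, kernel_size=3, padding='same',
                        input_shape=(28, 28, 1)),       # convolutional layer
    keras.layers.BatchNormalization(),                   # normalize the previous output
    keras.layers.LeakyReLU(),                            # non-linear activation
    keras.layers.MaxPooling2D(pool_size=2, strides=2),   # pooling layer
    keras.layers.Flatten(),                              # 3D volume -> 1D feature vector
    keras.layers.Dense(10, activation='softmax'),        # fully-connected output layer
])
model.summary()                                          # prints the output shape of each layer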

Setting up the data and the model

1. Data Preprocessing
The preprocessing is to center the data to have a mean of zero (Mean Subtraction) and to
normalize its scale to [-1, 1] along each feature (Normalisation).
2. Weight Initialization
Initialize the weights by drawing them from a Gaussian distribution with standard
deviation √(2/n), where n is the number of inputs to the neuron. If the weights are
initialized to the same value or all to zero, there is no source of asymmetry between
neurons, so they are initialized randomly. Calibrating the variances with 1/√n
ensures that all neurons in the network initially have approximately the same output
distribution and empirically improves the rate of convergence.
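
A small NumPy sketch (illustrative, with a hypothetical data matrix; not from the original report) of these two setup steps:

import numpy as np

X = np.random.rand(100, 784)             # hypothetical data: 100 samples, 784 features

# 1. Data preprocessing: mean subtraction, then scale each feature to [-1, 1]
X = X - X.mean(axis=0)
X = X / np.abs(X).max(axis=0)

# 2. Weight initialization: Gaussian with standard deviation sqrt(2/n)
n = X.shape[1]                           # number of inputs to each neuron
W = np.random.randn(n, 64) * np.sqrt(2.0 / n)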

3. Regularization
Regularization is a technique which makes slight modifications to the learning algorithm
such that the model generalizes better. Regularization penalizes the coefficients i.e. the
weight matrices of the nodes. L1 and L2 are the most common types of regularization.
These update the general cost function by adding another term known as the
regularization term.

Cost function = Loss + Regularization term

In L2, we have:

Cost function = Loss + λ ⋅ Σ w²

In L1, we have:

Cost function = Loss + λ ⋅ Σ |w|

Here, lambda (λ) is the regularization parameter, a hyperparameter whose value
is tuned for better results, and "w" represents the weights. The loss term itself
could be either the Softmax (cross-entropy) loss (preferred) or the SVM (hinge) loss.
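
In Keras this term can be attached per layer; a minimal sketch (assumed Keras usage with an illustrative lambda of 0.01, not taken from the original report):

from tensorflow import keras

# kernel_regularizer adds lambda * sum(w^2) to the cost function for this layer's weights
dense = keras.layers.Dense(64, activation='relu',
                           kernel_regularizer=keras.regularizers.l2(0.01))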

Dropout
This is one of the most interesting types of regularization techniques. While training,
dropout is implemented by keeping a neuron active only with some probability p (a
hyperparameter), and setting it to zero otherwise.

It can also be thought of as an ensemble technique in machine learning. We must scale
the activations by p at test time. Since test-time performance is so critical, it is always
preferable to use inverted dropout, which performs the scaling at train time, leaving the
forward pass at test time untouched.
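
A minimal NumPy sketch of inverted dropout (illustrative, not from the original report): the mask is divided by p at train time, so the test-time forward pass needs no scaling.

import numpy as np

p = 0.5                                        # probability of keeping a neuron active

def dropout_train(h):
    mask = (np.random.rand(*h.shape) < p) / p  # drop neurons and rescale in one step
    return h * mask

def dropout_test(h):
    return h                                   # forward pass untouched at test time

h = np.random.randn(4, 8)                      # hypothetical hidden-layer activations
print(dropout_train(h).shape, dropout_test(h).shape)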

STEPS INVOLVED :
1. Loading and Analyzing the data
2. Training the classifier by Data preprocessing, Weight initialization and computing the loss.
3. Computing the Analytic Gradient with Backpropagation and performing parameter update.
4. Modeling the data by introducing Convolutional Layer, Pooling Layer, Fully-Connected Layer
and adding dropout.
5. Evaluating the test set and predicting the labels.
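
A minimal end-to-end sketch of these steps using the Keras API (module paths, layer sizes and epoch count are assumptions for illustration, not the exact code used in the experiment):

from tensorflow import keras

# 1. Load and preprocess the data
(x_train, y_train), (x_test, y_test) = keras.datasets.fashion_mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1) / 255.0
x_test = x_test.reshape(-1, 28, 28, 1) / 255.0

# 2-4. Build the model: convolutional, pooling and fully-connected layers plus dropout
model = keras.models.Sequential([
    keras.layers.Conv2D(32, kernel_size=3, activation='relu', input_shape=(28, 28, 1)),
    keras.layers.MaxPooling2D(pool_size=2),
    keras.layers.Flatten(),
    keras.layers.Dropout(0.5),
    keras.layers.Dense(10, activation='softmax'),
])

# 3. Loss computation and parameter updates (backpropagation handled by the optimizer)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5, batch_size=64)

# 5. Evaluate on the test set and predict the labels
print(model.evaluate(x_test, y_test))
predictions = model.predict(x_test)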

SOFTWARE USED : Miniconda3, Python 3.6.0, TensorFlow 1.5.0

EXPERIMENTAL RESULTS :

1. Performed the experiment on an available dataset (fashion_mnist, imported
from the Keras API) and computed accuracy and loss, the result of which is
attached.
2. Performed the experiment on a practical dataset containing pictures of two different
types of leaves and computed accuracy and loss, the result of which is attached.

SOURCE OF INFORMATION : Wikipedia, GitHub, DataCamp.
