Deep Learning


What is Deep Learning?

Deep learning is a branch of machine learning based on artificial neural network architectures. An artificial neural network (ANN) uses layers of interconnected nodes, called neurons, that work together to process and learn from the input data.

In a fully connected deep neural network, there is an input layer and one or more hidden layers connected one after the other. Each neuron receives input from the neurons of the previous layer or from the input layer. The output of one neuron becomes the input to the neurons in the next layer of the network, and this process continues until the final layer produces the output of the network. The layers of the neural network transform the input data through a series of nonlinear transformations, allowing the network to learn complex representations of the input data.
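
As a quick illustration of this forward pass, here is a minimal sketch in NumPy; the layer sizes, the ReLU activation, and the random weights are illustrative assumptions rather than anything prescribed above.

# minimal sketch of a forward pass through a fully connected network
# (layer sizes, ReLU activation, and random weights are illustrative)
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    # the nonlinear transformation applied at each hidden layer
    return np.maximum(0.0, x)

# weights and biases for three layers: 4 inputs -> 8 -> 8 -> 1 output
W1, b1 = rng.standard_normal((4, 8)), np.zeros(8)
W2, b2 = rng.standard_normal((8, 8)), np.zeros(8)
W3, b3 = rng.standard_normal((8, 1)), np.zeros(1)

x = rng.standard_normal(4)    # data arriving at the input layer
h1 = relu(x @ W1 + b1)        # first hidden layer
h2 = relu(h1 @ W2 + b2)       # its output becomes the next layer's input
output = h2 @ W3 + b3         # the final layer produces the network output
print(output)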

Deep learning can be used for supervised, unsupervised, and reinforcement learning.
Artificial neural networks

Artificial neural networks are built on the principles of the structure and operation of human neurons; they are also known as neural networks or neural nets. An artificial neural network's input layer, which is the first layer, receives input from external sources and passes it on to the hidden layer, which is the second layer. Each neuron in the hidden layer receives information from the neurons in the previous layer, computes a weighted sum, and then passes it to the neurons in the next layer. These connections are weighted: each input from the preceding layer is scaled by a distinct weight that controls how strongly it influences the neuron. The weights are then adjusted during the training process to enhance the performance of the model.

Artificial neurons, also known as units, are the building blocks of an artificial neural network and are arranged in a series of layers. Whether a layer has a dozen units or millions of units depends on the complexity of the underlying patterns in the dataset. Commonly, an artificial neural network has an input layer, an output layer, and one or more hidden layers. The input layer receives data from the outside world that the neural network needs to analyze or learn about.
After passing through one or more hidden layers, this data is transformed into a representation that is valuable to the output layer. Finally, the output layer produces the network's response to the incoming data.

In most neural networks, units in one layer are linked to units in the next. Each of these links has a weight that controls how much one unit influences another. As the data moves from unit to unit, the network learns more and more about it, ultimately producing an output from the output layer.

Xavier Initialization
The goal of Xavier initialization is to initialize the weights such that the variance of the activations is the same across every layer. This constant variance helps prevent the gradients from exploding or vanishing. For a layer with n incoming connections, a common form draws each weight uniformly from the range [-1/sqrt(n), 1/sqrt(n)], as in the code example further below.
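
One way to see the effect is to push random data through a stack of linear layers and track the activation variance. The sketch below is a toy under stated assumptions: arbitrary depth and width, linear layers only, and the normal-distribution variant of Xavier with standard deviation 1/sqrt(n) (which is what the usual formula reduces to when fan-in and fan-out are both n).

# sketch: activation variance across 10 linear layers, Xavier vs unscaled
# (depth, width, and the linear-layer simplification are assumptions)
import numpy as np

rng = np.random.default_rng(1)
n = 100                                   # fan-in = fan-out = n units
x = rng.standard_normal((1000, n))

for label, std in [("xavier, std=1/sqrt(n)", 1.0 / np.sqrt(n)),
                   ("unscaled, std=1", 1.0)]:
    h = x
    for _ in range(10):
        W = rng.normal(0.0, std, size=(n, n))
        h = h @ W                         # one linear layer
    # Xavier keeps the variance near 1; unscaled weights blow it up
    print(label, "-> variance after 10 layers:", h.var())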
What is variance?

Variance is a measure of how data points differ from the mean. In layman's terms, variance measures how far a set of numbers is spread out from its mean (average) value. Formally, the variance is the expected squared deviation from the mean.

What is NumPy?

NumPy is a Python library used for working with arrays. It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
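
A minimal sketch of what that looks like in practice (the array values are made up):

# creating an array and using a linear algebra routine
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])
v = np.array([1.0, 1.0])
print(a @ v)               # matrix-vector product: [3. 7.]
print(np.linalg.inv(a))    # matrix inverse from the linear algebra module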
What is the mean?
The mean is the average of the given numbers, calculated by dividing the sum of the numbers by how many there are.
What is standard deviation?
Standard deviation is a measure that shows how much variation (spread, dispersion) from the mean exists.

What is the difference between variance and standard deviation?
Standard deviation is the square root of the variance and is expressed in the same units as the data, so it reads as a typical distance of the numbers from the mean. Variance, on the other hand, measures the average squared deviation of the numbers in a data set from their mean.
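
A small NumPy sketch ties the last few definitions together (the data values are made up):

# mean, variance, and standard deviation of a small data set
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(data.mean())    # 5.0 (sum of the numbers divided by their count)
print(data.var())     # 4.0 (average squared deviation from the mean)
print(data.std())     # 2.0 (square root of the variance)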

What is Matplotlib?

Matplotlib is a low-level graph-plotting library in Python that serves as a visualization utility.

Matplotlib was created by John D. Hunter.

Matplotlib is open source and can be used freely.

Matplotlib is mostly written in Python; a few segments are written in C, Objective-C, and JavaScript for platform compatibility.
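
A minimal sketch of Matplotlib in use (the data points are made up):

# plotting a handful of points with pyplot
from matplotlib import pyplot

xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]
pyplot.plot(xs, ys)         # draw a line through the points
pyplot.xlabel("x")
pyplot.ylabel("x squared")
pyplot.show()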

What is errorbar in Python?

Error bars are used to show the uncertainty or variability in the data being plotted. In Python, you can add error bars to a plot using the errorbar function of the Matplotlib library. For example, setting yerr to 10% of the y-values creates a scatter-style plot with error bars, as in the sketch below.
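
Here is a minimal sketch of that description (the data is made up; the 'o' format string draws markers for a scatter-style look):

# error bars set to 10% of the y-values
from matplotlib import pyplot

x = [1, 2, 3, 4, 5]
y = [2.0, 4.1, 5.9, 8.2, 9.8]
yerr = [0.1 * v for v in y]                # 10% of each y-value
pyplot.errorbar(x, y, yerr=yerr, fmt='o')  # markers with error bars
pyplot.show()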
# example of the xavier weight initialization
from math import sqrt
from numpy.random import rand
# number of nodes in the previous layer
n = 10
# calculate the range for the weights
lower, upper = -(1.0 / sqrt(n)), (1.0 / sqrt(n))
# generate 1000 uniform random numbers in [0, 1)
numbers = rand(1000)
# scale them to the desired range [lower, upper]
scaled = lower + numbers * (upper - lower)
# summarize the bounds and the resulting sample statistics
print(lower, upper)
print(scaled.min(), scaled.max())
print(scaled.mean(), scaled.std())

# plot of the bounds on xavier weight initialization for different numbers of inputs
from math import sqrt
from matplotlib import pyplot
# define the number of inputs from 1 to 100
values = [i for i in range(1, 101)]
# calculate the weight range 1/sqrt(n) for each number of inputs
results = [1.0 / sqrt(n) for n in values]
# create an error bar plot centered on 0 for each number of inputs
pyplot.errorbar(values, [0.0 for _ in values], yerr=results)
pyplot.show()

Sample output from the weight initialization example above (exact values vary from run to run):

-0.31622776601683794 0.31622776601683794
-0.3162093939621687 0.3159240408550899
0.012495820714597164 0.1806882270049287
Syntax: matplotlib.pyplot.errorbar(x, y, yerr=None, xerr=None, fmt='', ecolor=None, elinewidth=None, capsize=None, barsabove=False, lolims=False, uplims=False, xlolims=False, xuplims=False, errorevery=1, capthick=None, *, data=None, **kwargs)

Parameters: This method accepts the following parameters, described below:
- x, y: The horizontal and vertical coordinates of the data points.
- fmt: Optional. A format string controlling how the data points are drawn.
- xerr, yerr: Optional. Arrays of error values; the values should be positive.
- ecolor: Optional. The color of the error bar lines; defaults to None.
- elinewidth: Optional. The line width of the error bar lines; defaults to None.
- capsize: Optional. The length of the error bar caps in points; defaults to None.
- barsabove: Optional. A boolean; if True, the error bars are plotted above the plot symbols. Defaults to False.
- lolims, uplims, xlolims, xuplims: Optional. Booleans indicating that a value gives only an upper or lower limit.
- errorevery: Optional. An integer used to draw error bars on only a subset of the data points.
- capthick: Optional. The thickness of the error bar caps; defaults to None.

Returns: This returns a container comprising the following:
- plotline: The Line2D instance of the x, y plot markers and/or line.
- caplines: A tuple of Line2D instances of the error bar caps.
- barlinecols: A tuple of LineCollection with the horizontal and vertical error ranges.
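
Since the returned container behaves like a tuple, its three parts can be unpacked directly; a small sketch (the data is made up):

# unpacking the container returned by errorbar
from matplotlib import pyplot

plotline, caplines, barlinecols = pyplot.errorbar(
    [1, 2, 3], [2.0, 4.0, 6.0], yerr=[0.2, 0.4, 0.6], capsize=3)
print(type(plotline).__name__)    # Line2D for the x, y markers/line
print(len(caplines))              # Line2D instances for the caps
print(len(barlinecols))           # LineCollections for the error ranges
pyplot.show()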

What is bias in machine learning?

Bias is a phenomenon that skews the result of an algorithm in favor of or against an idea.

Bias is considered a systematic error that occurs in the machine learning model itself due to incorrect assumptions in the ML process.

Technically, we can define bias as the error between the average model prediction and the ground truth. Moreover, it describes how well the model matches the training data set:

- A model with higher bias would not match the data set closely.
- A low-bias model will closely match the training data set.

Characteristics of a high bias model include:

- Failure to capture proper data trends
- Potential towards underfitting
- More generalized/overly simplified
- High error rate

What is variance in machine learning?

Variance refers to the changes in the model when using different portions of the training data set.

Simply stated, variance is the variability in the model prediction: how much the ML function can adjust depending on the given data set. Variance comes from highly complex models with a large number of features.

- Models with high bias will have low variance.
- Models with high variance will have low bias.

All these contribute to the flexibility of the model. For instance, a high-bias model that does not match the data set will be an inflexible, low-variance model, resulting in a suboptimal machine learning model.

Characteristics of a high variance model include:

- Noise in the data set
- Potential towards overfitting
- Complex models
- Trying to put all data points as close as possible

Underfitting & overfitting

The terms underfitting and overfitting refer to how the model fails to match the data. The fit of a model directly correlates with whether it will return accurate predictions from a given data set.

- Underfitting occurs when the model is unable to match the input data to the target data. This happens when the model is not complex enough to match all the available data, so it performs poorly even on the training dataset.
- Overfitting occurs when the model tries to match non-existent patterns, such as noise. This happens with highly complex models that match almost all the given data points and perform well on the training dataset, but cannot generalize to the test data set to predict the outcome accurately. The sketch below illustrates both failure modes.
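
A small sketch makes the contrast concrete: fitting polynomials of increasing degree to noisy data, where a low degree underfits (high bias) and a high degree tends to overfit (high variance). The data, noise level, and degrees are arbitrary choices.

# underfitting vs overfitting with polynomial fits of different degree
# (the data, noise level, and chosen degrees are arbitrary)
import numpy as np

rng = np.random.default_rng(2)
x_train = np.linspace(0.0, 1.0, 20)
x_test = np.linspace(0.025, 0.975, 20)
truth = lambda x: np.sin(2 * np.pi * x)
y_train = truth(x_train) + rng.normal(0.0, 0.2, x_train.size)
y_test = truth(x_test) + rng.normal(0.0, 0.2, x_test.size)

for degree in (1, 3, 9):
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    # degree 1 underfits (both errors high); degree 9 tends to overfit
    # (low train error, higher test error); degree 3 balances the two
    print(f"degree {degree}: train MSE {train_mse:.3f}, test MSE {test_mse:.3f}")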
Bias vs variance: A trade-off
Bias and variance are inversely connected. It is impossible to have an ML model with both low bias and low variance.

When a data engineer modifies the ML algorithm to better fit a given data set, it will lead to low bias, but it will increase variance. This way, the model will fit the data set while the chances of inaccurate predictions increase.

The same applies when creating a low-variance model with higher bias. While this reduces the risk of inaccurate predictions, the model will not properly match the data set.

It’s a delicate balance between bias and variance. Importantly, however, having a higher variance does not indicate a bad ML algorithm. Machine learning algorithms should be able to handle some variance.

We can tackle the trade-off in multiple ways.

Increasing the complexity of the model to account for bias and variance decreases the overall bias while increasing the variance to an acceptable level. This aligns the model with the training dataset without incurring significant variance errors.

Increasing the training data set can also help to balance this trade-off, to some extent. This is the preferred method when dealing with overfitting models, since a large data set allows users to increase the model's complexity without variance errors polluting it.

A large data set offers more data points for the algorithm to generalize from easily. However, underfitting (high bias) models are not very sensitive to the size of the training data set, so increasing the data is the preferred solution for dealing with high variance models.

This table lists common algorithms and their expected behavior regarding bias and variance:

Algorithm            Bias    Variance
Linear Regression    High    Low
Logistic Regression  High    Low
Decision Tree        Low     High
Bagging              Low     High (lower than a single decision tree)
Random Forest        Low     High (lower than bagging)
