A Guide To Deep Learning and Neural Networks
We are going to talk about them in more detail later in this text.
Machine learning attempts to extract new knowledge from a large set of pre-processed
data loaded into the system. Programmers need to formulate the rules for the machine,
and it learns based on them. Sometimes, a human might intervene to correct its errors.
In machine learning, the logic behind the machine’s decision is relatively clear. In deep learning, by contrast, you can’t always know what particular features the neurons represent, so the decision logic is harder to interpret.
Now that you know what the difference between DL and ML is, let us look at some
advantages of deep learning.
Today, deep learning is applied across many industries for a wide variety of use cases.
“Artificial neural networks” and “deep learning” are often used interchangeably, which
isn’t really correct. Not all neural networks are “deep”, meaning “with many hidden
layers”, and not all deep learning architectures are neural networks: there are
also deep belief networks, for example.
However, since neural networks are the most hyped algorithms right now and are, in
fact, very useful for solving complex tasks, we are going to talk about them in this post.
Definition of an ANN
An artificial neural network represents the structure of a human brain modeled on the
computer. It consists of neurons and synapses organized into layers.
An ANN can have millions of neurons connected into one system, which makes it extremely
successful at analyzing and even memorizing all kinds of information.
There are different types of neural networks, but they always consist of the same
components: neurons, synapses, weights, biases, and functions.
Neurons
A neuron, or node, is the basic unit of a neural network: it receives information,
performs simple calculations, and passes the result on.
In a large neural network with many neurons and connections between them, neurons
are organized in layers. There is an input layer that receives information, a number of
hidden layers, and the output layer that provides valuable results. Every neuron
performs transformation on the input information.
Neurons typically operate on numbers in the range [0, 1] or [−1, 1]. To turn raw data into
something a neuron can work with, we need normalization. We talked about what it
is in the post about regression analysis.
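As a minimal sketch of that normalization step (the function name is my own), min-max scaling maps raw values into the range [0, 1]:

```python
def min_max_normalize(values):
    """Scale a list of numbers into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

# Raw feature values of any magnitude become neuron-friendly:
print(min_max_normalize([10, 20, 30, 40]))  # [0.0, 0.33..., 0.66..., 1.0]
```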
Synapses and weights
A synapse is what connects the neurons, like an electrical cable. Every synapse has a
weight, and the weights change the input information: the results of a
neuron with a greater weight will dominate in the next neuron, while information
from less ‘weighty’ neurons will not be passed on. One can say that the matrix of
weights governs the whole neural system.
How do you know which neuron has the biggest weight? During initialization (the first
launch of the NN), the weights are assigned randomly; they are then optimized during
training.
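A minimal sketch of what one neuron computes, with randomly initialized weights as described above (function names and the seed are my own, for illustration):

```python
import random

def init_weights(n_inputs, seed=42):
    """Randomly initialize one neuron's weights (the 'first launch' of the NN)."""
    rng = random.Random(seed)
    return [rng.uniform(-1, 1) for _ in range(n_inputs)]

def weighted_sum(inputs, weights, bias=0.0):
    """The neuron's core calculation: sum of input * weight, plus a bias term."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

w = init_weights(3)
print(weighted_sum([0.5, 0.1, 0.9], w))  # a single number passed on to the next layer
```

During training, these random starting weights are gradually adjusted; the random values here are just a starting point, not meaningful parameters.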
Bias
A bias neuron allows more variations of weights to be stored. Biases give the model a
richer representation of the input space.
In the case of neural networks, a bias neuron is added to every layer. It plays a vital role
by making it possible to move the activation function to the left or right on the graph.
It is true that ANNs can work without bias neurons. However, they are almost always
added and counted as an indispensable part of the overall model.
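A toy sketch of why the bias matters (the weights here are invented for illustration): with the same input and the same weight, changing only the bias shifts the activation curve along the x-axis, which shifts the neuron's output.

```python
import math

def sigmoid(x):
    """A common activation function with output in (0, 1)."""
    return 1 / (1 + math.exp(-x))

def neuron(x, weight, bias):
    """With a bias, the activation curve shifts left or right on the graph."""
    return sigmoid(weight * x + bias)

# Same input, same weight: the bias alone moves the output.
print(neuron(0.0, 1.0, 0.0))  # 0.5 (curve centered at zero)
print(neuron(0.0, 1.0, 2.0))  # > 0.5 (curve shifted to the left)
```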
Every neuron processes input data to extract a feature. Imagine that we have
three features and three neurons, each of which is connected to all these features.
Each neuron has its own weights that are applied to the features. During
training, you need to select weights for each neuron such that the output produced by
the whole network is true to life.
To perform transformations and produce an output, every neuron has an activation function.
The combination of these functions performs a transformation described by a common
function F, which is the formula behind the NN’s magic.
There are a lot of activation functions. The most common ones are linear, sigmoid, and
hyperbolic tangent. Their main difference is the range of values they work with.
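The three functions named above can be sketched directly; the comments note the output range that distinguishes each one:

```python
import math

def linear(x):
    """Linear activation: output range is unbounded, (-inf, inf)."""
    return x

def sigmoid(x):
    """Sigmoid activation: squashes any input into (0, 1)."""
    return 1 / (1 + math.exp(-x))

def tanh(x):
    """Hyperbolic tangent activation: squashes any input into (-1, 1)."""
    return math.tanh(x)

# The same inputs land in very different ranges depending on the function:
for f in (linear, sigmoid, tanh):
    print(f.__name__, f(-2.0), f(0.0), f(2.0))
```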
Neural networks are trained like any other algorithm: you define the results you want and
provide the network with information to learn from. For example, if we want our neural
network to distinguish between photos of cats and dogs, we provide it with plenty of
labeled examples.
Delta is the difference between the expected data and the output of the neural network. We use
calculus magic to repeatedly optimize the weights of the network until the delta
approaches zero. Once the delta is zero or close to it, our model can correctly predict our
example data.
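A toy illustration of that calculus magic, reduced to a single weight (all numbers invented): gradient descent nudges the weight until the delta between target and output is near zero.

```python
def train(x, target, weight=0.0, lr=0.1, epochs=100):
    """Minimize (target - weight * x)^2 by gradient descent on one weight."""
    for _ in range(epochs):
        output = weight * x
        delta = target - output   # how far off the network is
        weight += lr * delta * x  # step the weight toward the target
    return weight

w = train(x=2.0, target=1.0)
print(w, w * 2.0)  # weight converges near 0.5, so the output is near 1.0
```

Real networks do exactly this across millions of weights at once, using the same gradient idea (backpropagation).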
Iteration
This is a counter that increases every time the neural network goes through one
batch of training examples. In other words, it is the total number of batches completed by the
neural network.
Epoch
The epoch counter increases each time we go through the entire training set. The more
epochs, the longer the model trains, and generally the better it fits the data, up to the
point where it starts to overfit.
Batch
Batch size is equal to the number of training examples in one forward/backward pass.
The higher the batch size, the more memory space you’ll need.
- One epoch is one forward pass and one backward pass of all the training examples.
- The number of iterations is the number of passes, each pass using [batch size] examples. To be clear, one pass equals one forward pass plus one backward pass (we do not count the forward pass and backward pass as two different passes).
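The arithmetic connecting these three terms can be sketched directly (the example numbers are invented):

```python
def iterations_per_epoch(num_examples, batch_size):
    """Each iteration processes one batch (one forward + one backward pass)."""
    return num_examples // batch_size

# With 1,000 training examples and a batch size of 100:
n_examples, batch = 1000, 100
print(iterations_per_epoch(n_examples, batch))  # 10 iterations per epoch
```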
Error is a deviation that reflects the discrepancy between the expected and the actual output.
The error should become smaller after every epoch. If this does not happen, then you
are doing something wrong.
The error can be calculated in different ways, but we will consider only two main ways:
Arctan and Mean Squared Error.
There is no restriction on which one to use and you are free to choose whichever
method gives you the best results. But each method counts errors in different ways:
Arctan: $\frac{\arctan^2(i_1-a_1)+\dots+\arctan^2(i_n-a_n)}{n}$

Mean Squared Error: $\frac{(i_1-a_1)^2+(i_2-a_2)^2+\dots+(i_n-a_n)^2}{n}$
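Both formulas above translate directly into code, where `ideal` holds the expected outputs and `actual` the network's outputs (the sample values are invented):

```python
import math

def arctan_error(ideal, actual):
    """Mean of the squared arctangents of the differences."""
    return sum(math.atan(i - a) ** 2 for i, a in zip(ideal, actual)) / len(ideal)

def mse(ideal, actual):
    """Mean squared error: mean of the squared differences."""
    return sum((i - a) ** 2 for i, a in zip(ideal, actual)) / len(ideal)

ideal = [1.0, 0.0, 1.0]
actual = [0.9, 0.2, 0.8]
print(arctan_error(ideal, actual), mse(ideal, actual))
```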
Feed-forward neural networks
This is the simplest neural network architecture. A feed-forward network doesn’t have any
memory; that is, there is no going back in a feed-forward network. In many tasks, this
approach is not very applicable. For example, when we work with text, the words form a
certain sequence, and we want the machine to understand it.
Feedforward neural networks can be applied in supervised learning when the data that
you work with is not sequential or time-dependent. You can also use it if you don’t know
how the output should be structured but want to build a relatively fast and easy NN.
Recurrent neural networks
A recurrent neural network can process texts, videos, or sets of images and become
more precise each time because it remembers the results of the previous iteration and
can use that information to make better decisions.
Recurrent neural networks are widely used in natural language processing and speech
recognition.
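A minimal sketch of that memory idea (the weights are invented for illustration): at each step, the new hidden state mixes the current input with the previous state, so earlier inputs keep echoing through later steps.

```python
import math

def rnn_step(x, h_prev, w_in=0.5, w_h=0.5):
    """One recurrent step: the new state depends on the input AND the previous state."""
    return math.tanh(w_in * x + w_h * h_prev)

h = 0.0
for x in [1.0, 0.0, 0.0]:  # a short input sequence
    h = rnn_step(x, h)
    print(h)  # stays nonzero even for zero inputs: the first input is "remembered"
```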
Convolutional neural networks
Convolutional neural networks are the workhorse of today’s deep learning and
are used to solve the majority of problems in the field. Convolutional neural networks can be either
feed-forward or recurrent.
Let’s see how they work. Imagine we have an image of Albert Einstein. We could assign a
neuron to every pixel of the input image.
But there is a big problem with this approach. First, if you connect each neuron to all pixels,
you get an enormous number of weights, so it becomes a very computationally intensive operation
that takes a very long time. Second, with so many weights, the method becomes very
prone to overfitting: it will predict everything well on the training examples but
work badly on other images.
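To see the scale of the problem, compare the parameter counts of the two approaches (the image size, neuron count, kernel size, and filter count below are invented for illustration):

```python
def fully_connected_weights(height, width, num_neurons):
    """Every neuron connects to every pixel of the image."""
    return height * width * num_neurons

def conv_weights(kernel_size, num_filters):
    """A convolutional layer shares one small kernel across the whole image."""
    return kernel_size * kernel_size * num_filters

# A 224x224 grayscale image, 1,000 fully connected neurons:
print(fully_connected_weights(224, 224, 1000))  # 50,176,000 weights
# Versus 64 convolutional filters of size 3x3:
print(conv_weights(3, 64))                      # 576 weights
```

Weight sharing is what makes convolutional layers both cheap to compute and far less prone to overfitting.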
Generative adversarial networks
GANs are used, for example, to generate photographs that the human eye perceives as
natural images, or deepfakes (videos where real people say and do things they
have never done in real life).
What kind of problems do NNs solve?
Neural networks are used to solve complex problems that require analytical calculations
similar to those of the human brain, such as image recognition, speech recognition, and
natural language processing.
Summary
Deep learning and neural networks are useful technologies that expand human
intelligence and skills. Neural networks are just one type of deep learning architecture.
However, they have become widely known because NNs can effectively solve a huge
variety of tasks, often better than other algorithms.
If you want to learn more about applications of machine learning in real life and
business, continue reading our blog:

- In our post about best ML applications, you can discover the most stunning use cases of machine learning algorithms.
- Read this Medium post if you want to learn more about GPT-3 and creative computers.
- If you want to know how to choose ML techniques for your project, you will find the answer in our blog post.