Deep Learning 10 Hours


Module 3

Deep Learning 10 Hours

• Artificial Neural Networks (ANN): architecture


• feed-forward and back propagation
• Activation functions
• Optimizers in deep learning
• Regularization techniques
• Recurrent neural networks
• Transfer learning

19-03-2024
Deep Learning

[Figure: nested circles showing Deep Learning as a subset of Machine Learning, which is a subset of Artificial Intelligence]

• Deep learning is a subfield of machine learning that focuses on artificial
neural networks and algorithms inspired by the structure and function of the
human brain.
• It involves training models on large
amounts of data to learn hierarchical
representations of features, allowing the
system to make predictions or decisions
without explicit programming for specific
tasks.
• Deep learning automates most of the feature-extraction
process from raw data, so it requires little human
involvement.

Deep Learning

• When there is a huge amount of data, the performance of deep
learning algorithms is far better than that of most ML algorithms

Applications

• Image and Speech Recognition

• Natural Language Processing (NLP): language translation, sentiment analysis, and chatbots

• Autonomous Systems: self-driving cars and drones, to perceive and make decisions based on
the environment.

Deep Learning
Popular Architectures:
• Convolutional Neural Networks (CNNs): Suited for image-related tasks
• Recurrent Neural Networks (RNNs): Effective for sequence data
• Transformers: for NLP tasks
Challenges:
• Data Requirements: Deep learning models often require large labeled datasets for training,
which may be challenging to obtain in some domains.
• Computational Resources: Training deep networks can be computationally intensive,
requiring powerful hardware such as GPUs or TPUs.
• Interpretability: The black-box nature of some deep learning models makes it challenging to
interpret and understand their decision-making processes.
Advancements:
• Transfer Learning: Leveraging pre-trained models on one task for improved performance on
a related task.
• Generative Adversarial Networks (GANs): Used for generating new data instances, GANs
have applications in image synthesis and style transfer.
Neural network application
ALVINN: An Autonomous Land Vehicle In a Neural Network
(Carnegie Mellon University Robotics Institute, 1989-1997)

ALVINN drives at 70 mph on highways!
Human Brain

Artificial Neural Networks (ANN)
• An artificial neural network (ANN) is a
computational model inspired by the structure
and functioning of the human brain.
• It is a key component of machine learning and
artificial intelligence, designed to process
information and make decisions in a manner
similar to biological neural networks.
• Neurons - are the basic building blocks of ANNs
• Dendrites act as a receiver for other neurons’
signals and the axon is the transmitter of the
neuron's signal.
• Dendrites of neurons receive input in the
form of electric signals from the axons of
other neurons.

Perceptron (McCulloch & Pitts Neuron Model)
• A Perceptron is an Artificial Neuron
• It is the simplest possible Neural Network
• Neural Networks are the building blocks of Machine Learning.
• A perceptron receives one or more inputs.
• Perceptron inputs are called nodes.
• The nodes have both a value and a weight.
• Input nodes have a binary value of 1 or 0.
• Weights are values assigned to each input.
• Weights show the strength of each node.
• A higher value means that the input has a stronger influence on
the output.
• The perceptron calculates the weighted sum of its inputs.
• It multiplies each input by its corresponding weight and sums up
the results.
• After the summation, the perceptron applies an activation function
to the sum.
• The purpose is to introduce non-linearity into the output. It
determines whether the perceptron should fire or not based on the
aggregated input.
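The weighted sum and activation described above can be sketched as a few lines of Python. The function name, example weights, and threshold below are illustrative, not from the slides.

```python
# Minimal perceptron sketch: weighted sum of inputs, then a step activation.

def perceptron_output(inputs, weights, threshold):
    # Multiply each input by its corresponding weight and sum the results
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    # Fire (output 1) if the weighted sum exceeds the threshold, else output 0
    return 1 if weighted_sum > threshold else 0

# Example: two binary inputs behaving like a logical AND
print(perceptron_output([1, 1], [0.6, 0.6], 1.0))  # 1
print(perceptron_output([1, 0], [0.6, 0.6], 1.0))  # 0
```

With weights of 0.6 each and a threshold of 1.0, only the input pair (1, 1) produces a weighted sum (1.2) large enough to fire.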
Perceptron
[Figure: structure of an artificial neuron]

• The activation function is typically accompanied by a Threshold Value.
• If the result of the activation function exceeds the threshold, the
perceptron fires (outputs 1); otherwise it remains inactive (outputs 0).
• The final output of the perceptron is the result of the activation function.
• It represents the perceptron's decision or prediction based on the input
and the weights.
• The activation function maps the weighted sum into a binary value.
• The binary 1 or 0 can be interpreted as true or false / yes or no.

A perceptron will fire if the weighted sum of its inputs is greater than the threshold (-w0).
Perceptron Architecture
[Figure: perceptron architecture showing input signals, weights, bias,
summation, activation function, and output, with the weighted sum also
written in matrix form]
Neural Network

• Perceptrons are often used as the building blocks for more complex neural networks, such as
multi-layer perceptrons (MLPs) or deep neural networks (DNNs).
• By combining multiple perceptrons in layers and connecting them in a network structure, these
models can learn and represent complex patterns and relationships in data, enabling tasks such
as image recognition, natural language processing, and decision making.

Neural Network architecture

[Figure: network with an Input Layer, Hidden Layers 1 to 3, and an Output Layer]

Input Layer
• The input layer takes raw input from the domain. No computation is
performed at this layer.
• Nodes here just pass on the information (features) to the hidden layer.

Hidden Layer
• As the name suggests, the nodes of this layer are not exposed.
• They provide an abstraction to the neural network.
• The hidden layer performs all kinds of computation on the features entered
through the input layer and transfers the result to the output layer.
• All hidden layers usually use the same activation function.

Output Layer
• It's the final layer of the network that brings together the information
learned through the hidden layers and delivers the final value as a result.
• The output layer will typically use a different activation function from
the hidden layers.
• The choice depends on the goal or type of prediction made by the model.
TYPES OF ACTIVATION FUNCTIONS

• Identity function - It is a linear function
• Also known as "no activation" or the "identity function"; the activation
is proportional to the input: f(x) = x

• Limitations
• It’s not possible to use backpropagation as the derivative of
the function is a constant and has no relation to the input x.
• All layers of the neural network will collapse into one if a
linear activation function is used.
• No matter the number of layers in the neural network, the
last layer will still be a linear function of the first layer.
• So, essentially, a linear activation function turns the neural
network into just one layer.
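The collapse argument above can be checked numerically: composing two linear layers y = B(Ax) gives exactly the same result as one layer with weights BA. The 2x2 matrices here are arbitrary example values.

```python
# Why linear activations collapse a multi-layer network into one layer:
# two stacked linear layers equal a single layer whose weights are BA.

def matvec(M, v):
    # Matrix-vector product
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def matmul(A, B):
    # Matrix-matrix product
    n, k, m = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

A = [[1.0, 2.0], [0.0, 1.0]]   # "layer 1" weights (example values)
B = [[0.5, 0.0], [1.0, 1.0]]   # "layer 2" weights (example values)
x = [3.0, 4.0]

two_layers = matvec(B, matvec(A, x))   # pass x through both linear layers
one_layer = matvec(matmul(B, A), x)    # single equivalent layer with weights BA
print(two_layers, one_layer)           # [5.5, 15.0] [5.5, 15.0]
```

No matter how many linear layers are stacked, the result is always a single matrix multiply, which is why a non-linear activation is essential.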

TYPES OF ACTIVATION FUNCTIONS
• Step function - step function gives 1 as output if the input is either 0 or positive.
If the input is negative, the step function gives 0 as output.

• Binary step function depends on a threshold value that decides whether a neuron
should be activated or not.

• The input fed to the activation function is compared to a certain threshold; if the
input is greater than it, then the neuron is activated, else it is deactivated, meaning
that its output is not passed on to the next hidden layer.
Limitations of binary step function:
• It cannot provide multi-value outputs; for example, it cannot be used for
multi-class classification problems.
• The gradient of the step function is zero, which causes a hindrance in the
backpropagation process.

Not smooth, not continuous (at w0), not differentiable
TYPES OF ACTIVATION FUNCTIONS
• Threshold function - same as the step function, but θ is used as the
threshold value instead of 0

Not smooth, not continuous (at the threshold θ), not differentiable

TYPES OF ACTIVATION FUNCTIONS
• Sigmoid function - used when we require the activation function to be
differentiable and hence continuous
1. Binary sigmoid function
2. Bipolar sigmoid function

Smooth, continuous, differentiable

k = steepness or slope parameter of the sigmoid function
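Both sigmoid variants with the steepness parameter k can be written directly from their standard definitions (a sketch; k = 1 is the common default):

```python
import math

def binary_sigmoid(x, k=1.0):
    # Output in (0, 1); larger k makes the curve steeper
    return 1.0 / (1.0 + math.exp(-k * x))

def bipolar_sigmoid(x, k=1.0):
    # Output in (-1, 1); algebraically equal to tanh(k*x/2)
    return (1.0 - math.exp(-k * x)) / (1.0 + math.exp(-k * x))

print(binary_sigmoid(0.0))    # 0.5
print(bipolar_sigmoid(0.0))   # 0.0
print(binary_sigmoid(1.0, k=2.0) > binary_sigmoid(1.0, k=1.0))  # True
```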

TYPES OF ACTIVATION FUNCTIONS

• ReLU (Rectified Linear Unit) function - the most popularly used activation
function in the areas of convolutional neural networks and deep learning:
f(x) = max(0, x)

TYPES OF ACTIVATION FUNCTIONS

• Hyperbolic tangent function: tanh(x)
• Used in backpropagation networks

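ReLU and tanh can be sketched as plain functions from their standard definitions:

```python
import math

def relu(x):
    # Passes positive values through unchanged, zeroes out negatives
    return max(0.0, x)

def tanh(x):
    # Output in (-1, 1), zero-centred; math.tanh computes the same value
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

print([relu(x) for x in [-2.0, 0.0, 3.0]])  # [0.0, 0.0, 3.0]
print(round(tanh(1.0), 4))                  # 0.7616
```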
Layers of Neural Network

Neural network with a single hidden layer
Calculate the output from the hidden layer: apply the activation function to
each hidden node's weighted sum (shown in the figure for the first and second
nodes of the hidden layer).

Determine the outputs of the output layer

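The two steps above can be sketched as one reusable layer function: the hidden-layer outputs feed the output layer exactly as the raw inputs fed the hidden layer. All weights, biases, and sizes below are illustrative example values.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer_forward(inputs, weights, biases, activation):
    # One output per node: weighted sum of inputs plus bias, then activation
    return [activation(sum(w * x for w, x in zip(node_w, inputs)) + b)
            for node_w, b in zip(weights, biases)]

x = [0.5, -1.0]                        # input layer (no computation here)
W_hidden = [[0.2, 0.8], [-0.5, 0.3]]   # weights for 2 hidden nodes
b_hidden = [0.1, 0.0]
W_out = [[1.0, -1.0]]                  # weights for 1 output node
b_out = [0.2]

h = layer_forward(x, W_hidden, b_hidden, sigmoid)  # hidden-layer outputs
y = layer_forward(h, W_out, b_out, sigmoid)        # output-layer outputs
print(y)
```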

The hidden layer becomes ineffective when the hidden nodes have linear
activation functions.
Supervised learning of NN
1. Initialize the weights with adequate values.
2. Take the “input” from the training data, which is formatted as
{ input, correct output }, and enter it into the neural network.
Obtain the output from the neural network and calculate the
error from the correct output.
3. Adjust the weights to reduce the error.
4. Repeat Steps 2-3 for all training data
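The four steps above can be sketched with the classic perceptron learning rule: predict, compare against the correct output, and nudge the weights to reduce the error. The training data (logical OR), learning rate, and seed are illustrative.

```python
import random

def predict(weights, bias, inputs):
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if s > 0 else 0

# Step 1: initialise the weights with adequate (small random) values
random.seed(0)
weights = [random.uniform(-0.5, 0.5) for _ in range(2)]
bias = 0.0
lr = 0.1

# Training data formatted as { input, correct output } (logical OR)
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]

for epoch in range(20):                 # Step 4: repeat for all training data
    for inputs, target in data:
        output = predict(weights, bias, inputs)  # Step 2: obtain the output
        error = target - output                  # ...and calculate the error
        # Step 3: adjust the weights to reduce the error
        weights = [w + lr * error * x for w, x in zip(weights, inputs)]
        bias += lr * error

print([predict(weights, bias, xi) for xi, _ in data])  # [0, 1, 1, 1]
```

After training, the network reproduces the correct output for every training example.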
