Chapter 9 - ANNs


Department of Computer Sciences

College of Computers & Information Technology

501481-3 Summer 2022

Artificial Intelligence
Chapter 9: Artificial Neural Networks

Introduction
The domain of artificial intelligence is vast in both breadth and depth. In this chapter, we
consider one of the most widely used and actively developing research areas in AI:
Artificial Neural Networks.
Important Definitions
• A neural network is a massively parallel distributed processor
consisting of nodes connected by adaptable weights, which stores
experiential knowledge gained from task examples through a process
of learning.

• Knowledge is acquired by the network through a learning
(training) process; the strengths of the interconnections between
neurons, implemented by means of synaptic weights, are used
to store the knowledge.

• Learning is a procedure of adapting the weights with a
learning algorithm in order to capture the knowledge. More
mathematically, the aim of the learning process is to learn a given
mapping between the inputs and output(s) of the network.
The Artificial Neuron Model
• A set of inputs is applied, each representing
the output of another neuron. Each input is
multiplied by a corresponding weight,
analogous to a synaptic strength, and the
weighted inputs are summed to determine
the activation level of the neuron.

• The purpose of an activation function is to
ensure that the neuron's response is
bounded; that is, the actual response of
the neuron is conditioned or damped for
very large or very small activating stimuli
and thus remains controllable.

• The connection strengths, or weights,
represent the knowledge in the system.
Information processing takes place through
the interaction among these units.
[Figure: a 2-input single perceptron]
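As a concrete illustration of the neuron model above, here is a minimal Python/NumPy sketch of a two-input neuron; the inputs, weights, and bias are illustrative assumptions, not values from the slides.

```python
import numpy as np

def neuron_output(x, w, b, activation):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    a = np.dot(w, x) + b          # activation level (net input) of the neuron
    return activation(a)          # bounded response of the neuron

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Two-input single perceptron with assumed example values
x = np.array([0.5, -1.0])         # inputs (outputs of other neurons)
w = np.array([0.8, 0.3])          # synaptic weights
b = 0.1                           # bias
print(neuron_output(x, w, b, sigmoid))
```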
Classical Activation Functions
Linear activation:
φ(z) = z

Logistic activation:
φ(z) = 1 / (1 + e^(−z)), bounded between 0 and 1

Threshold activation:
φ(z) = sign(z) = +1 if z ≥ 0, −1 if z < 0

Hyperbolic tangent activation:
φ(u) = tanh(u) = (1 − e^(−2u)) / (1 + e^(−2u)), bounded between −1 and 1
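These four classical activation functions can be written directly in NumPy; the function names below are mine, chosen only for this sketch.

```python
import numpy as np

def linear(z):
    return z                                   # unbounded identity

def logistic(z):
    return 1.0 / (1.0 + np.exp(-z))            # outputs in (0, 1)

def threshold(z):
    return np.where(z >= 0, 1.0, -1.0)         # sign function, outputs -1 or +1

def hyperbolic_tangent(u):
    return (1.0 - np.exp(-2.0 * u)) / (1.0 + np.exp(-2.0 * u))   # outputs in (-1, 1)

z = np.linspace(-3.0, 3.0, 7)
for f in (linear, logistic, threshold, hyperbolic_tangent):
    print(f.__name__, f(z))
```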
Perceptron Network Example
For the network shown below, find the output of the neurons Y1
and Y2 using the following non-linear activation functions:
(a) binary sigmoidal (b) bipolar sigmoidal
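The network figure referenced above is not reproduced in this text version, so the sketch below uses assumed inputs and weights purely to show the mechanics: compute each neuron's net input, then apply the binary sigmoid f(a) = 1 / (1 + e^(−a)) or the bipolar sigmoid f(a) = 2 / (1 + e^(−a)) − 1.

```python
import math

def binary_sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))          # output in (0, 1)

def bipolar_sigmoid(a):
    return 2.0 / (1.0 + math.exp(-a)) - 1.0    # output in (-1, 1)

# Assumed inputs and weights (not the values from the original figure)
x1, x2 = 0.3, 0.5
w11, w21 = 0.2, 0.4      # weights from x1, x2 into Y1
w12, w22 = 0.3, 0.1      # weights from x1, x2 into Y2

a1 = x1 * w11 + x2 * w21             # net input of Y1
a2 = x1 * w12 + x2 * w22             # net input of Y2

print("(a) binary sigmoidal:", binary_sigmoid(a1), binary_sigmoid(a2))
print("(b) bipolar sigmoidal:", bipolar_sigmoid(a1), bipolar_sigmoid(a2))
```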
Training Artificial Neural Networks
In general, learning or training algorithms can be
categorized as:
•Supervised Learning
•Unsupervised Learning
•Reinforcement Learning
Supervised Learning

• During the training session, input vectors are applied to the
network, and the resulting output vector is compared with the
desired response. If the actual response differs from the target
response, the generated error signal adjusts the weights. The
error-minimization process is supervised by a teacher.
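A minimal sketch of this error-driven weight adjustment, using the delta (Widrow-Hoff) rule on a single training example for a linear neuron; the learning rate and data are assumptions made for illustration.

```python
import numpy as np

def supervised_update(w, x, target, lr=0.1):
    """One error-driven weight adjustment (delta rule) for a linear neuron."""
    actual = np.dot(w, x)             # network's actual response
    error = target - actual           # error signal supplied by the "teacher"
    return w + lr * error * x         # adjust weights to reduce the error

w = np.array([0.0, 0.0])              # initial weights
x = np.array([1.0, 0.5])              # input vector
print(supervised_update(w, x, target=1.0))
```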
Unsupervised Learning

Example: nearest-neighbor classification

• This type of training is used in self-organizing neural nets, and it
doesn't require a teacher. In this method of training, input
vectors of similar types are grouped together without a teacher.
Reinforcement Learning

• During training, the weights of the connections in the
hidden layers are initialized randomly. Reinforcement training is
similar to supervised training: there is a teacher, but one who
only scores the performance on the training examples rather
than supplying the correct outputs.
Machine Learning
• Learning investigates the mechanisms by which knowledge is acquired through
experience.
• When the agent is a computer, we call it machine learning:
– a computer observes some data, builds a model based on the data, and uses the model as both a
hypothesis about the world and a piece of software that can solve problems.
• A machine learning algorithm is an algorithm that is able to learn from data.
– Learning from examples.
Machine Learning
• Why would we want a machine to learn?
– First, the designers cannot anticipate all possible future situations.
• a program for predicting stock market prices must learn to adapt when conditions change from boom to bust.
– Second, sometimes the designers have no idea how to program a solution themselves.
• Most people are good at recognizing the faces of family members, but they do it subconsciously, so even the best
programmers don’t know how to program a computer to accomplish that task, except by using machine learning.
• What do we mean by learning for machines?
– “A computer program is said to learn from experience E with respect to some class of tasks T and
performance measure P, if its performance at tasks in T, as measured by P, improves with
experience E.” Mitchell (1997)
Experience, Tasks and Performance
The experience E:
• Learning algorithms can be understood as being allowed to experience an entire dataset.
• Dataset: a collection of many examples (a.k.a. data points, instances, samples, observations).
• The learning approach depends on the given dataset (labelled or unlabelled data).
• An instance is a collection of features (attributes, measurements, dimensions) that are represented by
a vector
– e.g., the features of an image are the values of the pixels in the image
– the features of a document are the numbers of occurrences of the words in the document
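For instance, a toy document can be turned into a count-based feature vector over an assumed vocabulary (both made up for this sketch):

```python
from collections import Counter

# Assumed toy vocabulary; a real system would build this from a corpus
vocabulary = ["movie", "great", "food", "cold", "bad"]

def document_features(text):
    """Represent a document as a vector of word-occurrence counts."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]

print(document_features("the movie was great great"))   # [1, 2, 0, 0, 0]
```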
Experience, Tasks and Performance
The performance P:
• Design a quantitative measure of the system's performance (task-specific).
• Accuracy of the learnt model: the proportion of examples for which the model produces the correct output.
• Accuracy is measured using test data
– data that is separate from the data used for training the machine learning system.
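A minimal sketch of measuring accuracy on held-out test data (the labels and predictions below are made up):

```python
def accuracy(predictions, labels):
    """Proportion of test examples for which the model's output is correct."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

# Made-up test labels and model predictions
y_test = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_pred, y_test))   # 0.8
```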
Experience, Tasks and Performance
The Tasks T:
• Learning is our means of attaining the ability to perform the task.
• if we want a robot to be able to walk, then walking is the task.
• Machine learning tasks: how the machine learning system should process an example/instance.
Experience, Tasks and Performance
The Classification Task:
– When the output is one of a finite set of values (e.g., sunny/cloudy/rainy or true/false), the
learning problem is called classification.
– Specify which of k categories some input belongs to.
– Learn a function that maps feature vectors of instances to their labels.
– The agent learns a function that, when given a new image, predicts the appropriate label.
– Examples:

• Image classification
• Sentiment classification:
The movie was great +1
The food was cold and tasted bad -1
Experience, Tasks and Performance
The Classification Task:
– Classification is supervised learning (predictive models): learning with a teacher.
– You are given a data set of pairs (x, y), where the first element of each pair is a feature vector x and
the second element is a label y (e.g., y ∈ {−1, +1} in a binary classification problem).
– Binary classification: two classes
• positive/negative reviews
• Benign/malignant cancer
– Multiclass classification: three or more classes
• Animal classification (cat, dog, zebra, etc.).
• Movie classification (thriller, adventure, action, etc).
• Iris plant classification:
– 150 iris plants, features (sepal length, sepal width, petal length and petal width), three classes (Iris
Setosa, Iris Versicolour, Iris Virginica)
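As a sketch of multiclass classification on the Iris data, assuming scikit-learn is available (the model and its settings below are illustrative choices, not prescribed by the slides):

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# 150 iris plants, 4 features each, 3 classes
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# A small feed-forward neural network classifier (illustrative settings)
clf = MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```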
Experience, Tasks and Performance
The Regression Task:
– When the output is a number (e.g., tomorrow’s temperature or house price, measured either as
an integer or a real number), the learning problem is called regression.
– Regression is supervised learning (predictive models): learning with a teacher
– You are given a data set of pairs (x, y), where the first element of each pair is a feature vector x and
the second element is a numerical label y.
– Examples:
• House price prediction
• Temperature prediction
• Student’s score prediction
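A minimal regression sketch fitting a straight line by least squares to made-up (size, price) pairs for the house-price example:

```python
import numpy as np

# Made-up training pairs: feature = house size (m^2), label = price (thousands)
sizes  = np.array([50.0, 80.0, 120.0, 160.0])
prices = np.array([150.0, 230.0, 330.0, 440.0])

# Fit price = w * size + b by least squares
w, b = np.polyfit(sizes, prices, deg=1)
print("predicted price for a 100 m^2 house:", w * 100.0 + b)
```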
Experience, Tasks and Performance
The Clustering Task:
– In this task, the input vectors of similar types are grouped together without a teacher
• Dividing the population or data points into a number of groups
– Clustering is unsupervised learning (descriptive models): learning without a teacher
– You are given a data set of examples without labels
– Examples:
• Customer grouping based on their behaviour
• Group similar words in the vocabulary together
– Unsupervised algorithms:
• Self-organizing feature map (SOM neural network)
• Principal Component Analysis (PCA)
• K-means clustering (a minimal sketch follows below)

[Figure: words that express similar sentiments are grouped into the same cluster (Yogatama et al., 2014)]
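A minimal K-means sketch, assuming scikit-learn is available; the 2-D data points are made up and form two loose groups:

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up 2-D points forming two loose groups
X = np.array([[1.0, 1.1], [0.9, 0.8], [1.2, 1.0],
              [5.0, 5.2], [5.1, 4.9], [4.8, 5.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("cluster labels:", kmeans.labels_)
print("cluster centres:", kmeans.cluster_centers_)
```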
Biological and Artificial Neural Networks

Biological Neural Networks      Artificial Neural Networks
Soma                            Neuron (or node)
Dendrite                        Input
Axon                            Output
Synapse                         Weight
Exercise
• For the network shown below, find the output of the neurons Y1 and Y2 using the
sigmoid non-linear activation function:
Exercise
• It is easy to use vectors and matrices to solve the previous exercise:

y = f(W x), where y is the vector of output nodes, W is the weight matrix, x is the vector of
input nodes, and f is the activation (transfer) function.
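A sketch of that vectorised computation, y = f(Wx + b), with assumed numbers (the original exercise figure is not reproduced here):

```python
import numpy as np

def forward(W, x, b, f):
    """Layer outputs: activation function applied to weight matrix times input vector."""
    return f(W @ x + b)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Assumed weights, biases, and inputs for a 3-input, 2-output layer
W = np.array([[0.1, 0.4, -0.2],
              [0.3, -0.1, 0.5]])
x = np.array([1.0, 0.5, -0.5])
b = np.array([0.2, -0.1])
print(forward(W, x, b, sigmoid))      # outputs of Y1 and Y2
```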
The Artificial Perceptron

a = b + Σ_{i=1}^{D} x_i w_i    (weighted sum of the D inputs plus a bias b)

• The perceptron is an algorithm for supervised learning of binary linear classifiers:
functions that can decide whether an input (represented by a vector of numbers)
belongs to one class or another.
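A minimal sketch of the perceptron learning rule on a toy, linearly separable problem; the data, learning rate, and epoch count are assumptions made for illustration:

```python
import numpy as np

def train_perceptron(X, y, lr=0.1, epochs=20):
    """Learn weights and bias so that sign(w.x + b) matches the +1/-1 labels."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            predicted = 1.0 if np.dot(w, x) + b >= 0 else -1.0
            if predicted != target:            # update weights only on mistakes
                w += lr * target * x
                b += lr * target
    return w, b

# Toy linearly separable data: label +1 only when both inputs are 1 (AND-like)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0])
w, b = train_perceptron(X, y)
print(w, b, [1 if np.dot(w, xi) + b >= 0 else -1 for xi in X])
```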
