
Neural networks

Eric Postma IKAT Universiteit Maastricht

Overview
Introduction: The biology of neural networks

the biological computer, brain-inspired models, basic notions

Interactive neural-network demonstrations

Perceptron

Multilayer perceptron
Kohonen's self-organising feature map
Examples of applications

A typical AI agent

Two types of learning

Supervised learning
curve fitting, surface fitting, ...

Unsupervised learning
clustering, visualisation...

An input-output function

Fitting a surface to four points

(Artificial) neural networks
The digital computer versus the neural computer

The Von Neumann architecture

The biological architecture

Digital versus biological computers


Five distinguishing properties: speed, robustness, flexibility, adaptivity, context-sensitivity

Speed: The hundred time steps argument


"The critical resource that is most obvious is time. Neurons whose basic computational speed is a few milliseconds must be made to account for complex behaviors which are carried out in a few hundred milliseconds (Posner, 1978). This means that entire complex behaviors are carried out in less than a hundred time steps." (Feldman & Ballard, 1982)

Graceful Degradation
(plot: performance as a function of damage)

Flexibility: the Necker cube

vision = constraint satisfaction

Adaptivity
processing implies learning in biological computers, whereas processing does not imply learning in digital computers

Context-sensitivity: patterns

emergent properties

Robustness and context-sensitivity coping with noise

The neural computer

Is it possible to develop a model based on the natural example?


Brain-inspired models:
models based on a restricted set of structural and functional properties of the (human) brain

The Neural Computer (structure)

Neurons, the building blocks of the brain

Neural activity

(plot: output activity as a function of input)

Synapses, the basis of learning and memory

Learning: Hebb's rule


(figure: neuron 1 → synapse → neuron 2)
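A minimal sketch of a Hebbian weight update, assuming a learning rate eta and scalar activities for the pre- and postsynaptic neurons (the names and values are illustrative, not taken from the slides):

```python
# Hebbian learning: the weight grows when pre- and postsynaptic
# neurons are active together ("cells that fire together wire together").
def hebb_update(w, pre, post, eta=0.01):
    """Return the updated synaptic weight (illustrative sketch)."""
    return w + eta * pre * post

w = 0.2
for pre, post in [(1.0, 1.0), (1.0, 0.0), (0.8, 0.9)]:
    w = hebb_update(w, pre, post)
print(w)   # the weight has increased only for co-active presentations
```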

Connectivity
An example:
The visual system is a feedforward hierarchy of neural modules. Every module is (to a certain extent) responsible for a certain function.

(Artificial) Neural Networks

Neurons
activity, nonlinear input-output function

Connections
weight

Learning
supervised, unsupervised

Artificial Neurons

input (vectors), summation (excitation), output (activation)

(figure: inputs i1, i2, i3 → excitation e → activation a = f(e))

Input-output function

nonlinear (sigmoid) function:

f(x) = 1 / (1 + e^(-x/a)),   with steepness parameter a > 0

(plot: f(e) for different values of a)
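A minimal sketch of an artificial neuron with this sigmoid activation: the excitation e is the weighted sum of the inputs, and the activation is a = f(e). The weights, inputs, and steepness parameter below are illustrative:

```python
import math

def sigmoid(x, a=1.0):
    # f(x) = 1 / (1 + e^(-x/a)); a controls the steepness of the curve
    return 1.0 / (1.0 + math.exp(-x / a))

def neuron(inputs, weights, a=1.0):
    # excitation e = weighted sum of the inputs; activation = f(e)
    e = sum(w * i for w, i in zip(weights, inputs))
    return sigmoid(e, a)

print(neuron([0.5, 1.0, -0.3], [0.8, 0.2, 0.5]))
```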

Artificial Connections (Synapses)

wAB: the weight of the connection from neuron A to neuron B

The Perceptron

Learning in the Perceptron

Delta learning rule


the difference between the desired output t and the actual output o, given input x

Global error E is a function of the differences between the desired and actual outputs
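A minimal perceptron training sketch using the delta rule: each weight is adjusted in proportion to the difference (t - o). The threshold output unit, learning rate, and the AND task are illustrative choices, not taken from the slides:

```python
import random

def predict(w, x):
    # threshold unit: output 1 if the weighted sum (incl. bias w[0]) exceeds 0
    s = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1 if s > 0 else 0

def train_perceptron(data, eta=0.1, epochs=50):
    n = len(data[0][0])
    w = [random.uniform(-0.5, 0.5) for _ in range(n + 1)]
    for _ in range(epochs):
        for x, t in data:
            o = predict(w, x)                    # actual output o for input x
            w[0] += eta * (t - o)                # delta rule: adjust in proportion to (t - o)
            for i in range(n):
                w[i + 1] += eta * (t - o) * x[i]
    return w

# logical AND is linearly separable, so a single perceptron can learn it
data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = train_perceptron(data)
print([predict(w, x) for x, _ in data])          # should print [0, 0, 0, 1]
```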

Gradient Descent

Linear decision boundaries

The history of the Perceptron


Rosenblatt (1959)
Minsky & Papert (1961)
Rumelhart & McClelland (1986)

The multilayer perceptron

(figure: input layer → hidden layer → output layer)

Training the MLP

supervised learning
each training pattern: input + desired output
in each epoch: present all patterns
at each presentation: adapt weights
after many epochs: convergence to a local minimum
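A minimal backpropagation training sketch for a one-hidden-layer MLP, using numpy. The layer sizes, learning rate, and the XOR task are illustrative choices, not taken from the slides:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# XOR is not linearly separable: a single perceptron fails, an MLP succeeds
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.5, (2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(0, 0.5, (4, 1)), np.zeros(1)   # hidden -> output

eta = 0.5
for epoch in range(10000):                 # one epoch = present all patterns
    H = sigmoid(X @ W1 + b1)               # forward pass: hidden activations
    O = sigmoid(H @ W2 + b2)               # forward pass: output activations
    d_o = (T - O) * O * (1 - O)            # output error term
    d_h = (d_o @ W2.T) * H * (1 - H)       # error backpropagated to the hidden layer
    W2 += eta * H.T @ d_o;  b2 += eta * d_o.sum(axis=0)   # adapt hidden->output weights
    W1 += eta * X.T @ d_h;  b1 += eta * d_h.sum(axis=0)   # adapt input->hidden weights
    # (the slides adapt weights at each presentation; a batch update per epoch is used here for brevity)

print(np.round(O, 2))   # typically ends close to [0, 1, 1, 0]; convergence is to a local minimum
```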

Phoneme recognition with an MLP


Output: pronunciation

Input: frequencies

Non-linear decision boundaries

Compression with an MLP: the autoencoder

hidden representation

Learning in the MLP

Preventing Overfitting
GENERALISATION = performance on test set

Early stopping
Training, test, and validation sets
k-fold cross-validation
leave-one-out procedure
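A minimal sketch of a k-fold split for estimating generalisation performance. The helper name, fold count, and the stand-in data are illustrative, not taken from the slides:

```python
def k_fold_splits(patterns, k=5):
    """Yield (training set, test set) pairs for k-fold cross-validation."""
    folds = [patterns[i::k] for i in range(k)]   # k roughly equal folds
    for i in range(k):
        test = folds[i]
        train = [p for j, f in enumerate(folds) if j != i for p in f]
        yield train, test

data = list(range(10))                 # stand-in for (input, target) patterns
for train, test in k_fold_splits(data, k=5):
    print(len(train), len(test))       # 8 2 for each of the five folds
# the leave-one-out procedure is the special case k = len(data)
```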

Image Recognition with the MLP

Hidden Representations

Other Applications

Practical: OCR, financial time series, fraud detection, process control, marketing, speech recognition
Theoretical: cognitive modeling, biological modeling

Some mathematics

Perceptron

Derivation of the delta learning rule


(equations: target output t, actual output o)
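The equations on this slide did not survive conversion. A reconstruction of the standard derivation for a single linear output unit, with target output t, actual output o, inputs x_i, and learning rate \eta (not copied from the original slide):

$$E = \tfrac{1}{2}(t - o)^2, \qquad o = \sum_i w_i x_i$$

$$\frac{\partial E}{\partial w_i} = -(t - o)\,x_i \quad\Longrightarrow\quad \Delta w_i = -\eta\,\frac{\partial E}{\partial w_i} = \eta\,(t - o)\,x_i$$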

MLP

Sigmoid function

May also be the tanh function


(range (-1, +1) instead of (0, 1))

Derivative: f'(x) = f(x) [1 - f(x)]

Derivation of the generalized delta rule

Error function (LMS)

Adaptation hidden-output weights

Adaptation input-hidden weights
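The accompanying equations were lost in conversion. The standard form of the generalized delta rule, with learning rate \eta, activations a, excitations e, outputs o_k, and targets t_k (a reconstruction, not copied from the slides):

$$E = \tfrac{1}{2}\sum_k (t_k - o_k)^2$$

$$\Delta w_{jk} = \eta\,\delta_k\,a_j, \qquad \delta_k = (t_k - o_k)\,f'(e_k) \qquad \text{(hidden-output weights)}$$

$$\Delta w_{ij} = \eta\,\delta_j\,a_i, \qquad \delta_j = f'(e_j)\sum_k \delta_k\,w_{jk} \qquad \text{(input-hidden weights)}$$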

Forward and Backward Propagation

Decision boundaries of Perceptrons

Straight lines (surfaces), linearly separable

Decision boundaries of MLPs

Convex areas (open or closed)

Decision boundaries of MLPs

Combinations of convex areas

Learning and representing similarity

Alternative conception of neurons

Neurons do not take the weighted sum of their inputs (as in the perceptron), but measure the similarity of the weight vector to the input vector
The activation of the neuron is a measure of similarity: the more similar the weight vector is to the input vector, the higher the activation. Neurons represent prototypes.
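A minimal sketch contrasting the two conceptions, assuming Euclidean distance as the (dis)similarity measure (an illustrative choice, not taken from the slides):

```python
import math

def weighted_sum_activation(w, x):
    # perceptron-style neuron: excitation is the weighted sum of the inputs
    return sum(wi * xi for wi, xi in zip(w, x))

def similarity_activation(w, x):
    # prototype-style neuron: activation grows as the weight vector
    # (the prototype) gets closer to the input vector
    dist = math.sqrt(sum((wi - xi) ** 2 for wi, xi in zip(w, x)))
    return math.exp(-dist)

prototype = [1.0, 0.0]
print(similarity_activation(prototype, [1.0, 0.1]))   # close input: high activation
print(similarity_activation(prototype, [0.0, 1.0]))   # distant input: low activation
```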

Coarse Coding

2nd order isomorphism

Prototypes for preprocessing

Kohonen's SOFM (Self-Organising Feature Map)


Unsupervised learning
Competitive learning

(figure: n-dimensional input feeding into a layer of output neurons; the winner is highlighted)

Competitive learning

Determine the winner: the neuron whose weight vector has the smallest distance to the input vector
Move the weight vector w of the winning neuron towards the input i

(figure: before learning, the weight vector w lies far from the input i; after learning, w has moved towards i)
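A minimal sketch of one competitive-learning step, using numpy. The number of neurons, learning rate, and random input data are illustrative:

```python
import numpy as np

def competitive_step(weights, x, eta=0.1):
    """weights: one row per output neuron; x: input vector."""
    dists = np.linalg.norm(weights - x, axis=1)      # distance of every weight vector to the input
    winner = int(np.argmin(dists))                   # the closest weight vector wins
    weights[winner] += eta * (x - weights[winner])   # move the winner towards the input
    return winner

rng = np.random.default_rng(0)
W = rng.random((4, 2))            # 4 competitive neurons, 2-dimensional input
for x in rng.random((100, 2)):    # present 100 random inputs
    competitive_step(W, x)
print(W)                          # the weight vectors have moved into the input distribution
```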

Kohonen's idea

Impose a topological order onto the competitive neurons (e.g., a rectangular map)
Let the neighbours of the winner share the prize (the "postcode lottery" principle)
After learning, neurons with similar weights tend to cluster on the map

Topological order: neighbourhoods

Square grid: winner (red) and its nearest neighbours
Hexagonal grid: winner (red) and its nearest neighbours

A simple example

A topological map of 2 x 3 neurons and two inputs

(figure: two-dimensional input, the weights, and the map visualisation)

Weights before training

Input patterns (note the 2D distribution)

Weights after training

Another example

Input: uniformly randomly distributed points


Output: map of 202 neurons
Training: starting with a large learning rate and neighbourhood size, both are gradually decreased to facilitate convergence
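A minimal SOFM training sketch on a one-dimensional chain of neurons, with the learning rate and neighbourhood radius decaying over time. The map size, decay schedules, and uniform input data are illustrative choices, not taken from the slides:

```python
import numpy as np

def train_sofm(data, n_neurons=20, epochs=50):
    rng = np.random.default_rng(0)
    W = rng.random((n_neurons, data.shape[1]))         # one weight vector per map neuron
    positions = np.arange(n_neurons)                   # 1-D topology: neuron index = map position
    for epoch in range(epochs):
        eta = 0.5 * (1 - epoch / epochs) + 0.01            # learning rate decays over time
        radius = n_neurons / 2 * (1 - epoch / epochs) + 1  # neighbourhood radius decays too
        for x in data:
            winner = np.argmin(np.linalg.norm(W - x, axis=1))
            # neighbours of the winner share the prize, weighted by distance on the map
            h = np.exp(-((positions - winner) ** 2) / (2 * radius ** 2))
            W += eta * h[:, None] * (x - W)
    return W

data = np.random.default_rng(1).random((500, 2))       # uniformly distributed 2-D inputs
W = train_sofm(data)
print(np.round(W, 2))   # neighbouring neurons end up with similar weight vectors
```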

Dimension reduction

Adaptive resolution

Application of SOFM

Examples (input)

SOFM after training (output)

Visual features (biologically plausible)

Relation with statistical methods 1

Principal Components Analysis (PCA)


(figure: data projected onto the first two principal components, pca1 and pca2)
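A minimal PCA sketch via the eigendecomposition of the covariance matrix, using numpy. The random data are illustrative:

```python
import numpy as np

def pca(data, n_components=2):
    centered = data - data.mean(axis=0)              # centre the data
    cov = np.cov(centered, rowvar=False)             # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvectors = principal directions
    order = np.argsort(eigvals)[::-1]                # sort by explained variance
    components = eigvecs[:, order[:n_components]]
    return centered @ components                     # project the data onto pca1, pca2, ...

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5))
print(pca(data).shape)    # (200, 2): each point described by its first two components
```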

Relation with statistical methods 2


Multi-Dimensional Scaling (MDS)
Sammon mapping

Distances in high-dimensional space
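The Sammon mapping minimises a stress measure that compares the distances d*_{ij} in the high-dimensional input space with the distances d_{ij} on the low-dimensional map; its standard form (not copied from the slides) is:

$$E = \frac{1}{\sum_{i<j} d^{*}_{ij}} \sum_{i<j} \frac{\left(d^{*}_{ij} - d_{ij}\right)^2}{d^{*}_{ij}}$$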

Image mining: the right feature

Fractal dimension in art

Jackson Pollock (Jack the Dripper)

Taylor, Micolich, and Jonas (1999). Fractal analysis of Pollock's drip paintings. Nature, 399, 422 (3 June).

Fractal dimension

(plot: fractal dimension versus creation date, with the range for natural images marked)
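A minimal box-counting estimate of the fractal dimension of a binary image, one common way of computing this kind of feature (numpy; the test pattern and box sizes are illustrative, and this is not the procedure used in the cited study):

```python
import numpy as np

def box_counting_dimension(image, box_sizes=(2, 4, 8, 16, 32)):
    """Estimate the fractal dimension of a binary (0/1) square image."""
    counts = []
    for s in box_sizes:
        n = image.shape[0] // s
        # count the s x s boxes that contain at least one foreground pixel
        boxes = image[:n * s, :n * s].reshape(n, s, n, s).any(axis=(1, 3))
        counts.append(boxes.sum())
    # the slope of log(count) versus log(1/box size) estimates the dimension
    slope, _ = np.polyfit(np.log(1.0 / np.array(box_sizes)), np.log(counts), 1)
    return slope

rng = np.random.default_rng(0)
image = (rng.random((256, 256)) < 0.5).astype(int)   # dense random pattern: dimension near 2
print(round(box_counting_dimension(image), 2))
```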

Our Van Gogh research


Two painters

Vincent Van Gogh paints Van Gogh
Claude-Emile Schuffenecker paints Van Gogh

Sunflowers

Is it made by Van Gogh or by Schuffenecker?

Approach

Select appropriate features (skipped here, but very important!)
Apply neural networks

van Gogh

Schuffenecker

Training Data

Van Gogh (5000 textures)

Schuffenecker (5000 textures)

Results

Generalisation performance: 96% correct classification on untrained data

Results, continued

Trained art-expert network applied to Yasuda sunflowers


89% of the textures are classified as genuine Van Gogh

A major caveat

Not only are the painters different

but also the material

and maybe many other things
