Chapters 1-4
                                  Brain                 Computer
No. of processing units           10^11                 10^9
Type of processing units          neurons               transistors
Type of calculation               massively parallel    usually serial
Data storage                      associative           address-based
Possible switching operations     10^13 s^-1            10^18 s^-1
Actual switching operations       10^12 s^-1            10^10 s^-1
Nervous system:
Central:
brain
spinal cord
Peripheral:
nerves outside of the brain and spinal cord
[see neuron figure]
Neuron components:
Dendrites:
receive electrical signals from many different sources, which are then transferred into the
nucleus of the cell.
Nucleus (soma):
accumulates signals from dendrites or synapses
when the accumulated signal exceeds a certain threshold value, the nucleus emits an electrical pulse
Axon:
long, slender extension of the soma
electrically insulated
transfers the electrical signal to the dendrites of other neurons (via synapses)
Synapse:
connects neurons to one another
Electrical synapse: direct, strong, non-adjustable connection
Chemical synapse:
the synaptic cleft electrically separates the pre-synaptic side from the post-synaptic one
the electrical signal is converted into a chemical signal, passes through the cleft, and is then converted back into a (modified) electrical signal
one-way connection
adjustable
Activation functions:
Heaviside (binary threshold function):
if the input is above a certain threshold, the function jumps from one value to another; otherwise it remains constant
Fermi (logistic function): maps to the range (0, 1)
can be extended by a temperature parameter T, giving f(x) = 1 / (1 + e^(-x/T)): the smaller T, the more the curve is compressed along the x-axis (i.e. the steeper it becomes)
Hyperbolic tangent (tanh): maps to the range (-1, 1)
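A minimal sketch of the three functions above in Python (NumPy and the parameter names are assumptions for illustration):

import numpy as np

def heaviside(x, threshold=0.0):
    # binary threshold function: jumps from 0 to 1 once the input exceeds the threshold
    return np.where(x > threshold, 1.0, 0.0)

def fermi(x, T=1.0):
    # logistic function with temperature T; maps to (0, 1);
    # smaller T compresses the curve along the x-axis (steeper transition)
    return 1.0 / (1.0 + np.exp(-x / T))

def tanh(x):
    # hyperbolic tangent; maps to (-1, 1)
    return np.tanh(x)

x = np.linspace(-5.0, 5.0, 5)
print(heaviside(x), fermi(x, T=0.5), tanh(x))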
Topologies:
Feedforward network:
One input layer, one output layer, and one or more hidden layers.
Connections are only permitted towards neurons of the following layer (a minimal forward pass is sketched after this list).
Feedforward with shortcut connections:
connections may not only be directed towards the next layer but also towards any other
subsequent layer.
Recurrent networks:
Direct recurrent: a neuron can connect to itself
Indirect recurrent: a neuron can connect to neurons in a preceding layer
Lateral recurrent: a neuron can connect to neurons in the same layer
Completely linked:
Every neuron is connected to every other neuron except itself
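To make the feedforward constraint concrete, a minimal forward pass with one hidden layer (the layer sizes, random weights, and the use of the Fermi function are illustrative assumptions):

import numpy as np

def fermi(x):
    # logistic activation function, as defined earlier
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W_ih = rng.normal(size=(4, 3))  # input layer (3 neurons) -> hidden layer (4 neurons)
W_ho = rng.normal(size=(2, 4))  # hidden layer (4 neurons) -> output layer (2 neurons)

x = np.array([0.5, -1.0, 2.0])  # input pattern
h = fermi(W_ih @ x)             # connections only towards the following layer
y = fermi(W_ho @ h)             # no shortcuts, no recurrence
print(y)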
A bias neuron:
is a neuron whose output value is always 1
is connected to neurons j1, j2, ..., jn with weights equal to the negative thresholds -theta_j1, -theta_j2, ..., -theta_jn
we can modify these weights to learn, instead of trying to modify the thresholds themselves (which is difficult)
Advantage: easier to implement
Disadvantage: the network representation already becomes ugly with only a few neurons, let alone with a great number of them
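A small sketch of why the bias trick works (weights and inputs are made up for illustration): a neuron with threshold theta is equivalent to a zero-threshold neuron with an extra input fixed at 1 whose weight is -theta.

import numpy as np

def fires_with_threshold(w, x, theta):
    # original formulation: compare the weighted sum against the threshold theta
    return w @ x > theta

def fires_with_bias(w, x, theta):
    # bias-neuron formulation: append a constant input of 1 with weight -theta
    # and compare against a fixed threshold of 0
    return np.append(w, -theta) @ np.append(x, 1.0) > 0

w = np.array([0.4, -0.2, 0.7])
x = np.array([1.0, 2.0, 0.5])
print(fires_with_threshold(w, x, theta=0.3))  # True
print(fires_with_bias(w, x, theta=0.3))       # same result; -theta is now a learnable weight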
Chapter 4
A neural network could learn by:
1. developing new connections,
2. deleting existing connections,
3. changing connection weights,
4. changing the threshold values of neurons,
5. varying one or more of the three neuron functions (activation function, propagation function
and output function),
6. developing new neurons, or
7. deleting existing neurons (and so, of course, existing connections).
Unsupervised learning:
The training set only consists of input patterns
The network tries by itself to detect similarities and to generate pattern classes
Reinforcement learning:
The training set consists of input patterns
After completion of a sequence, a value is returned to the network indicating whether the result was right or wrong and, possibly, how right or wrong it was
Supervised learning:
The training set consists of input patterns with correct results
The network can receive a precise error vector
Offline learning:
Several training patterns are entered into the network at once,
the errors are accumulated and it learns for all patterns at the same time.
Online learning:
The network learns directly from the errors of each training sample.
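A sketch contrasting offline and online learning on a tiny least-squares problem (the linear model, the data, and the learning rate eta are assumptions for illustration):

import numpy as np

def grad(w, x, t):
    # gradient of the squared error 0.5 * (w @ x - t)^2 with respect to w
    return (w @ x - t) * x

X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])  # training patterns
T = np.array([1.0, 2.0, 3.0])                        # teaching inputs
eta = 0.1                                            # learning rate (assumed)

# offline (batch): accumulate the errors of all patterns, then learn once per epoch
w = np.zeros(2)
for epoch in range(100):
    w -= eta * sum(grad(w, x, t) for x, t in zip(X, T))
print("offline:", w)

# online: learn directly from the error of each training sample
w = np.zeros(2)
for epoch in range(100):
    for x, t in zip(X, T):
        w -= eta * grad(w, x, t)
print("online:", w)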
Teaching input:
the desired output vector for a training sample
the teaching input t_j is the desired and correct value that neuron j should output after the input of a certain training pattern
Error vector (Ep): for several output neurons Omega_1, Omega_2, ..., Omega_n, the difference between the output vector and the teaching input under a training input p:
Ep = (t_Omega_1 - y_Omega_1, ..., t_Omega_n - y_Omega_n)
Learning curve:
Indicates the progress of the error over time t
A perfect learning curve looks like a negative exponential function, i.e. it is proportional to e^(-t)
Specific error:
Err_p is based on a single training sample, which means it is generated online
Total error:
based on all training samples, that means it is generated offline.
Total error = sum[p \in P]: Err_p
Euclidean error:
sqrt( sum[Omega \in O]: (t_Omega - y_Omega)^2 )
RMS error:
sqrt( {sum[Omega \in O]: (t_Omega - y_Omega)^2}/{|O|} )
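The two error measures above, computed for a single training sample (t and y are made-up example vectors):

import numpy as np

t = np.array([1.0, 0.0, 1.0])  # teaching input for the output neurons in O
y = np.array([0.8, 0.1, 0.6])  # actual output vector

euclidean = np.sqrt(np.sum((t - y) ** 2))     # Euclidean error
rms = np.sqrt(np.sum((t - y) ** 2) / t.size)  # RMS error: averaged over |O|
print(euclidean, rms)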
Gradient:
the gradient g of a function f is a vector that points from any point of f towards the steepest ascent from that point,
with |g| corresponding to the degree of this ascent
Gradient descent:
going from f(s) against the direction of g, i.e. towards -g, with steps of the size of |g|, towards smaller and smaller values of f.
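A minimal gradient-descent sketch (the function f, the starting point s, and the step-size factor eta are illustrative assumptions; a raw step of size |g| can overshoot, so a small factor is used here):

import numpy as np

def f(x):
    # example function to minimize: a simple paraboloid
    return np.sum(x ** 2)

def grad_f(x):
    # gradient of f; points towards the steepest ascent, with |g| giving its degree
    return 2.0 * x

s = np.array([3.0, -2.0])  # starting point
eta = 0.1                  # step-size factor (assumed)

x = s.copy()
for step in range(50):
    x = x - eta * grad_f(x)  # move against the gradient, towards smaller values of f
print(x, f(x))               # approaches the minimum at the origin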