Learning Basics of Artificial Intelligence Through Neural Networks
[Figure: neural networks as black-box mappers; a voice signal goes through an N.Net to a transcription, an image to a text caption, and a game state to the next move]
Connectionist Machines
[Figure: a von Neumann machine, with program and data in memory driving a processor, contrasted with a processor network in which processing and memory are distributed across the units; alongside, a biological neuron with its dendrites and axon]
The Universal Model
• Originally assumed to be able to represent any Boolean circuit and perform any logic
– "the embryo of an electronic computer that [the Navy] expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence," New York Times, 8 July 1958
– "Frankenstein Monster Designed by Navy That Thinks," Tulsa, Oklahoma Times, 1958
Also provided a learning algorithm
Sequential Learning:
w ← w + η (d(x) − y(x)) x
where d(x) is the desired output in response to input x, and y(x) is the actual output in response to x
• Boolean tasks
• Update the weights whenever the perceptron output is wrong
– Update the weight by the product of the input and the error between the desired and actual outputs
• Proved convergence for linearly separable classes
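A minimal sketch of this update rule in Python (NumPy assumed; the function name and the Boolean-AND demo are illustrative, not from the slides):

```python
import numpy as np

def train_perceptron(X, d, eta=1.0, max_epochs=100):
    """Sequential perceptron learning: w <- w + eta * (d(x) - y(x)) * x."""
    X = np.hstack([X, np.ones((len(X), 1))])    # absorb the bias as a constant input
    w = np.zeros(X.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for x, target in zip(X, d):
            y = 1 if w @ x >= 0 else 0          # threshold activation
            if y != target:                     # update only when the output is wrong
                w += eta * (target - y) * x
                mistakes += 1
        if mistakes == 0:                       # no errors: converged, as guaranteed
            break                               # for linearly separable classes
    return w

# Boolean AND is linearly separable, so training converges
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
d = np.array([0, 0, 0, 1])
w = train_perceptron(X, d)
```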
Perceptron
[Figure: perceptrons as logic gates; X AND Y (weights 1, 1; threshold 2), NOT X (weight -1; threshold 0), X OR Y (weights 1, 1; threshold 1), and a gate whose weights and threshold are left as "?"]
A single neuron is not enough
[Figure: a two-layer network for XOR; a hidden layer of two perceptrons, an OR gate (weights 1, 1; threshold 1) and a NAND gate (weights -1, -1; threshold -1), feeds an AND gate (weights 1, 1; threshold 2) whose output is X XOR Y]
• XOR
– The first layer is a "hidden" layer
– Also originally suggested by Minsky and Papert 1968
A more generic model
[Figure: a multi-layer perceptron over inputs X, Y, Z, A, built from threshold units with weights of ±1 and small integer thresholds]
• A "multi-layer" perceptron
• Can compose arbitrarily complicated Boolean functions!
– In cognitive terms: can compute arbitrary Boolean functions over sensory input
– More on this in the next class
But our brain is not Boolean
[Figure: a unit with inputs x1, x2, x3, …, xN]
• Alternate view:
– A threshold "activation" operates on the weighted sum of inputs plus a bias
z = Σ_i w_i x_i + b
• An affine function of the inputs
– Outputs a 1 if z is non-negative, 0 otherwise
• Unit "fires" if the weighted input matches or exceeds a threshold
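The affine-then-threshold view, as a sketch in Python (the function name is illustrative; note that a unit with threshold T is the same as one with bias b = -T):

```python
import numpy as np

def perceptron_fire(w, b, x):
    """Threshold activation on an affine function of the inputs."""
    z = np.dot(w, x) + b         # affine: weighted sum plus bias
    return 1 if z >= 0 else 0    # fires iff the weighted input meets the threshold
```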
The perceptron with real inputs and a real output
[Figure: a perceptron with real-valued inputs x1 … xN, a bias b, and a sigmoid activation; in two dimensions its decision boundary is the line w1x1 + w2x2 = T, with output 1 on one side and 0 on the other]
• A perceptron operates on real-valued vectors
– This is a linear classifier
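A sketch of the real-output version (the sigmoid smooths the hard threshold; names are illustrative):

```python
import numpy as np

def sigmoid_perceptron(w, b, x):
    """A perceptron with real inputs and a real output in (0, 1)."""
    z = np.dot(w, x) + b              # same affine function as before
    return 1.0 / (1.0 + np.exp(-z))   # sigmoid: a smooth version of the threshold
```

The classification boundary is still the hyperplane w·x + b = 0, which is why the unit is a linear classifier.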
Boolean functions with a real perceptron
[Figure: Boolean functions of two variables realized as linear decision boundaries over the corners (0,0), (0,1), (1,0), (1,1) of the unit square]
Booleans over the reals
[Figure: five hidden perceptrons, one per edge of a pentagon over (x1, x2); their outputs y1 … y5 are summed, and the sum reaches 5 only inside the pentagon (4 or 3 outside). ANDing the edge units captures one polygon; the outputs of two such polygon detectors are then ORed]
• Network to fire if the input is in the yellow area
– "OR" two polygons
– A third layer is required
Complex decision boundaries
[Figure: over inputs x1, x2, several polygonal regions are each captured by ANDing linear units, and the polygons are then ORed]
• Can compose arbitrarily complex decision boundaries
– With only one hidden layer!
– How?
Exercise: compose this with one hidden layer
[Figure: a target region over (x1, x2); the hidden units' outputs sum to y, and the output unit checks y ≥ 4]
Composing a pentagon
[Figure: five hidden perceptrons, one per edge; their outputs y1 … y5 sum to 5 inside the pentagon and to 4, 3, or 2 outside, so the output unit fires if y ≥ 5]
Composing a hexagon
[Figure: six hidden perceptrons, one per edge; their outputs y1 … y6 sum to 6 inside the hexagon and to 5, 4, or 3 outside, so the output unit fires if y ≥ 6]
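A sketch of this construction in NumPy (one hidden unit per polygon edge, ANDed by a threshold at the output; the regular-pentagon example is illustrative, not the slide's exact figure):

```python
import numpy as np

def polygon_net(vertices, x):
    """One hidden unit per edge of a convex polygon (counter-clockwise
    vertices); the output unit fires only if all K edge units fire."""
    K = len(vertices)
    fired = 0
    for i in range(K):
        p, q = vertices[i], vertices[(i + 1) % K]
        # hidden unit: linear boundary along edge p->q, firing on the inner side
        inward = np.array([p[1] - q[1], q[0] - p[0]])   # inward normal for CCW order
        fired += int(np.dot(inward, x - p) >= 0)
    return int(fired >= K)                              # output unit: y >= K

# regular pentagon, counter-clockwise vertices on the unit circle
angles = 2 * np.pi * np.arange(5) / 5
pentagon = [np.array([np.cos(a), np.sin(a)]) for a in angles]
print(polygon_net(pentagon, np.array([0.0, 0.0])))   # 1: the centre is inside
print(polygon_net(pentagon, np.array([2.0, 0.0])))   # 0: outside
```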
The multi-layer perceptron
• A network of perceptrons
– Perceptrons "feed" other perceptrons
– We will give the formal definition of a layer later
Defining “depth”
Deep Structures
• In any directed graph with input source nodes and output sink nodes, "depth" is the length of the longest path from a source to a sink
– A "source" node in a directed graph is a node that has only outgoing edges
– A "sink" node is a node that has only incoming edges
[Figure: a network whose input nodes are black and whose successive layers 1 through 4 are red, green, yellow, and blue]
[Figure: the multi-layer perceptron over X, Y, Z, A, annotated with its hidden layers]
The perceptron as a Boolean gate
[Figure: X AND Y with weights 1, 1 and threshold 2; NOT X with weight -1 and threshold 0; X OR Y with weights 1, 1 and threshold 1. Values in the circles are thresholds; values on edges are weights]
[Figure: a generalized gate with weights +1 on X1 … XL and -1 on XL+1 … XN; with threshold L it fires only if X1 … XL are all 1 and XL+1 … XN are all 0, and with threshold L-N+1 it fires unless X1 … XL are all 0 and XL+1 … XN are all 1]
Perceptron as a Boolean Gate
[Figure: the same weights, +1 on X1 … XL and -1 on XL+1 … XN, with threshold L-N+K]
Will fire only if the total number of X1 … XL that are 1 and XL+1 … XN that are 0 is at least K
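The gate constructions above, checked in Python (a sketch; helper names are illustrative):

```python
import numpy as np

def fires(w, inputs, T):
    """A perceptron fires if the weighted sum meets threshold T."""
    return int(np.dot(w, inputs) >= T)

X, Y = 1, 0
AND = fires([1, 1], [X, Y], 2)    # threshold 2: both inputs must be 1
OR  = fires([1, 1], [X, Y], 1)    # threshold 1: at least one input is 1
NOT = fires([-1], [X], 0)         # threshold 0: fires iff X is 0

def general_gate(first, rest, K):
    """Weights +1 on X1..XL, -1 on X(L+1)..XN, threshold L-N+K:
    fires iff (#ones among the first L) + (#zeros among the rest) >= K."""
    L, N = len(first), len(first) + len(rest)
    w = [1] * L + [-1] * len(rest)
    return fires(w, list(first) + list(rest), L - N + K)
```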
The perceptron is not enough
[Figure: X XOR Y, with weights and threshold marked "?"; no single perceptron can compute it, because XOR is not linearly separable]
Multi-layer perceptron
[Figure: the XOR network; a hidden layer of an OR gate (weights 1, 1; threshold 1) and a NAND gate (weights -1, -1; threshold -1) feeding an AND gate (weights 1, 1; threshold 2)]
Hidden Layer
Multi-layer perceptron XOR
[Figure: XOR with 2 neurons; a hidden unit with weights 1, 1 and threshold 1.5 computes X AND Y, and the output unit takes X and Y with weight 1 each and the hidden unit with weight -2, threshold 0.5]
• With 2 neurons
– 5 weights and two thresholds
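A quick check of this 2-neuron construction in Python (a sketch; the function name is illustrative):

```python
def xor_net(X, Y):
    """Hidden unit computes X AND Y (weights 1, 1; threshold 1.5);
    the output unit computes X + Y - 2*(X AND Y) against threshold 0.5."""
    h = int(1 * X + 1 * Y >= 1.5)              # X AND Y
    return int(1 * X + 1 * Y - 2 * h >= 0.5)   # equals X XOR Y

for X in (0, 1):
    for Y in (0, 1):
        print(X, Y, xor_net(X, Y))   # prints the XOR truth table
```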
Multi-layer perceptron
[Figure: the multi-layer perceptron over X, Y, Z, A from before]
• MLPs can compute more complex Boolean functions
• MLPs can compute any Boolean function
– Since they can emulate individual gates
• MLPs are universal Boolean functions
MLP as Boolean Functions
[Figure: the multi-layer perceptron over X, Y, Z, A]
X1 X2 X3 X4 X5 | Y
 0  0  1  1  0 | 1
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 1  0  0  0  1 | 1
 1  0  1  1  1 | 1
 1  1  0  0  1 | 1
How many layers for a Boolean MLP?
Truth table shows all input combinations for which output is 1
X1 X2 X3 X4 X5 | Y
 0  0  1  1  0 | 1
 0  1  0  1  1 | 1
 0  1  1  0  0 | 1
 1  0  0  0  1 | 1
 1  0  1  1  1 | 1
 1  1  0  0  1 | 1
[Figure: a one-hidden-layer network over X1 … X5, with one hidden AND unit per row of the table whose output is 1, all ORed by the output unit]
• Any truth table can be expressed in this manner!
• A one-hidden-layer MLP is a Universal Boolean Function
• DNF form:
– Find groups
– Express as reduced DNF
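A sketch of this construction in Python: one hidden unit per row with output 1, detecting exactly that input pattern, ORed at the output (the helper name is illustrative; the rows are the table above):

```python
import numpy as np

def truth_table_mlp(ones_rows, x):
    """Hidden unit for pattern p: weights +1 where p_i = 1 and -1 where
    p_i = 0, with threshold = number of ones in p, so it fires only on
    exactly that pattern. The output unit ORs the hidden units."""
    h = []
    for p in ones_rows:
        w = np.where(np.array(p) == 1, 1, -1)
        h.append(int(np.dot(w, x) >= sum(p)))
    return int(sum(h) >= 1)     # OR: threshold 1

# the six rows of the truth table with Y = 1
ones_rows = [(0,0,1,1,0), (0,1,0,1,1), (0,1,1,0,0),
             (1,0,0,0,1), (1,0,1,1,1), (1,1,0,0,1)]
print(truth_table_mlp(ones_rows, np.array([0,1,1,0,0])))  # 1: a listed row
print(truth_table_mlp(ones_rows, np.array([1,1,1,1,1])))  # 0: not listed
```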
Reducing a Boolean Function
[Figure: a Karnaugh map with rows WX and columns YZ, each in the order 00, 01, 11, 10]
Basic DNF formula will require 7 terms
Reducing a Boolean Function
[Figure: the same Karnaugh map with adjacent 1-cells grouped to reduce the DNF, and a 6-variable map over U, V, W, X, Y, Z built from stacked 4-variable maps]
• How many neurons in a DNF (one-hidden-layer) MLP for this Boolean function of 6 variables?
Width of a one-hidden-layer Boolean MLP
[Figure: the checkerboard Karnaugh map over 6 variables, in which no two 1-cells can be grouped]
Can be generalized: will require 2^(N-1) perceptrons in the hidden layer; exponential in N
• How many neurons in a DNF (one-hidden-layer) MLP for this Boolean function?
Poll 2
• Piazza thread @94
How many neurons will be required in the hidden layer of a one-hidden-layer network that models a Boolean function over 10 inputs, where the output for two input bit patterns that differ in only one bit is always different? (I.e., the checkerboard Karnaugh map)
20
256
512
1024
Width of a one-hidden-layer Boolean MLP
[Figure: the checkerboard Karnaugh map again]
Can be generalized: will require 2^(N-1) perceptrons in the hidden layer; exponential in N
• How many neurons in a DNF (one-hidden-layer) MLP for this Boolean function?
• How many units if we use multiple hidden layers?
Size of a deep MLP
[Figure: two checkerboard Karnaugh maps, one over W, X, Y, Z and one over U, V, W, X, Y, Z]
Size of a deep MLP
[Figure: the XOR subnetwork from before (two hidden units and an output unit, 3 perceptrons) used as a building block. The 4-variable checkerboard map, which is the parity W XOR X XOR Y XOR Z, needs 9 perceptrons; the 6-variable checkerboard over U, V, W, X, Y, Z needs 15 perceptrons]
O = X1 XOR X2 XOR … XOR XN
• Only 2 log2(N) layers
– By pairing terms
– 2 layers per XOR …
A better representation
[Figure: a binary tree of XOR subnetworks computing O = X1 XOR X2 XOR … XOR XN]
• Only 2 log2(N) layers
– By pairing terms
– 2 layers per XOR …
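A sketch comparing the two sizes for the parity (checkerboard) function, using the counts from the slides (assumes one 3-perceptron, 2-layer subnetwork per XOR; helper names are illustrative):

```python
import math

def dnf_width(N):
    """One-hidden-layer DNF for N-input parity: 2^(N-1) hidden perceptrons."""
    return 2 ** (N - 1)

def xor_tree(N):
    """Binary tree of XOR subnetworks: N-1 XORs, 3 perceptrons each,
    in ceil(log2(N)) stages of 2 layers each."""
    perceptrons = 3 * (N - 1)
    layers = 2 * math.ceil(math.log2(N))
    return perceptrons, layers

print(dnf_width(4), xor_tree(4))   # 8 hidden units vs (9 perceptrons, 4 layers)
print(dnf_width(6), xor_tree(6))   # 32 hidden units vs (15 perceptrons, 6 layers)
```

The deep network trades exponential width for depth: the perceptron count grows linearly in N rather than exponentially.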
The human perspective
[Figure: the same tasks as before; a voice signal through an N.Net to a transcription, an image to a text caption, a game state to the next move. An MNIST image input has 784 dimensions]