6. Multilayer Perceptron
courses.d2l.ai/berkeley-stat-157
Perceptron
[Photo: Mark I Perceptron, 1960 (wikipedia.org)]
Perceptron
o = σ(⟨w, x⟩ + b), where σ(x) = 1 if x > 0 and 0 otherwise
• Binary classification (0 or 1)
• Vs. a scalar real value for regression
• Vs. probabilities for logistic regression
Perceptron
[Figure: a linear decision boundary separating Spam from Ham examples]
Training the Perceptron
initialize w = 0 and b = 0
repeat
  if yᵢ(⟨w, xᵢ⟩ + b) ≤ 0 then
    w ← w + yᵢxᵢ and b ← b + yᵢ
  end if
until all classified correctly
This is equivalent to SGD with batch size 1 and the loss
ℓ(y, x, w) = max(0, −y⟨w, x⟩)
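A minimal runnable sketch of this loop in NumPy, with labels yᵢ ∈ {−1, +1}; the toy dataset and the epoch cap are assumptions for illustration, not from the slides:

```python
import numpy as np

def train_perceptron(X, y, max_epochs=100):
    """Perceptron updates: SGD with batch size 1 on max(0, -y * (<w, x> + b))."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(max_epochs):                 # cap epochs in case data is not separable
        errors = 0
        for xi, yi in zip(X, y):
            if yi * (np.dot(w, xi) + b) <= 0:   # misclassified (or on the boundary)
                w = w + yi * xi                 # w <- w + y_i x_i
                b = b + yi                      # b <- b + y_i
                errors += 1
        if errors == 0:                         # until all classified correctly
            break
    return w, b

# Toy linearly separable data (an assumption for the demo)
X = np.array([[1.0, 1.0], [2.0, 0.5], [-1.0, -1.0], [-0.5, -2.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
w, b = train_perceptron(X, y)
```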
[Figure sequence: perceptron updates on example data, from Wikipedia]
Convergence Theorem
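A standard statement of this result, given as a hedged sketch: assume every input satisfies ‖xᵢ‖ ≤ r and some (w∗, b∗) with ‖w∗‖² + (b∗)² ≤ 1 separates the data with margin ρ; the exact form on the original slide may differ.

```latex
\[
y_i \bigl( \langle w^*, x_i \rangle + b^* \bigr) \ge \rho \;\; \forall i
\quad\Longrightarrow\quad
\#\text{updates} \;\le\; \frac{r^2 + 1}{\rho^2}
\]
```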
Consequences
The perceptron converges after finitely many updates whenever the data is linearly separable; if it is not separable, the update loop never terminates.
XOR Problem (Minsky & Papert, 1969)
XOR is not linearly separable, so no single perceptron can represent it.
Multilayer Perceptron
Learning XOR
[Figure: the plane split into four regions, numbered 1 and 2 on top, 3 and 4 below]
Learn XOR with two layers: one linear classifier splits the plane left/right, another splits it top/bottom, and the product of the two outputs gives the XOR pattern.

Region             1   2   3   4
First classifier   +   -   +   -
Second classifier  +   +   -   -
Product            +   -   -   +
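A quick numeric check of the table above; reading the figure as a 2×2 grid with regions 1, 2 on top and 3, 4 below, the first classifier is the sign of −x and the second is the sign of y (this reading is an assumption):

```python
import numpy as np

# One representative point per region: 1 = top-left, 2 = top-right,
# 3 = bottom-left, 4 = bottom-right (assumed layout of the slide's figure).
points = np.array([[-1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [1.0, -1.0]])

f1 = np.sign(-points[:, 0])    # first linear classifier: + on the left half
f2 = np.sign(points[:, 1])     # second linear classifier: + on the top half
print(f1)       # [ 1. -1.  1. -1.]
print(f2)       # [ 1.  1. -1. -1.]
print(f1 * f2)  # [ 1. -1. -1.  1.]  <- the XOR pattern from the table
```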
Single Hidden Layer
[Figure: a network with one hidden layer between inputs and output]
Single Hidden Layer
• Input x ∈ ℝn
• Hidden W1 ∈ ℝm×n, b1 ∈ ℝm
• Output w2 ∈ ℝm, b2 ∈ ℝ
h = σ(W1x + b1)
o = wT2 h + b2
σ is an element-wise
activation function
courses.d2l.ai/berkeley-stat-157
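A NumPy sketch of this forward pass; the shapes follow the bullets above, while the sigmoid choice for σ and the random parameters are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n, m = 4, 8                                      # input and hidden sizes (arbitrary)
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(m, n)), np.zeros(m)    # hidden layer: W1 in R^{m x n}
w2, b2 = rng.normal(size=m), 0.0                 # output layer: w2 in R^m

x = rng.normal(size=n)        # input x in R^n
h = sigmoid(W1 @ x + b1)      # h = sigma(W1 x + b1), applied element-wise
o = w2 @ h + b2               # o = w2^T h + b2, a scalar output
```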
Single Hidden Layer
Why do we need a nonlinear activation? Drop σ and the two layers collapse into a single linear model:
h = W1x + b1
o = w2⊤h + b2
hence o = w2⊤W1x + b′, which is still linear in x
Sigmoid Activation
sigmoid(x) = 1 / (1 + exp(−x)), mapping inputs to (0, 1)
Tanh Activation
tanh(x) = (1 − exp(−2x)) / (1 + exp(−2x)), mapping inputs to (−1, 1)
ReLU Activation
ReLU(x) = max(0, x)
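The three activations above as NumPy one-liners; the definitions are standard and the sample grid is arbitrary:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))   # output in (0, 1)

def tanh(x):
    return np.tanh(x)                 # output in (-1, 1)

def relu(x):
    return np.maximum(0.0, x)         # max(0, x): zero for negative inputs

x = np.linspace(-4.0, 4.0, 9)         # sample grid for comparison
for f in (sigmoid, tanh, relu):
    print(f.__name__, f(x))
```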
Multiclass Classification
y1, y2, …, yd = softmax(o1, o2, …, od)
• Input x ∈ ℝn
• Hidden W1 ∈ ℝm×n and b1 ∈ ℝm
• Output W2 ∈ ℝm×d and b2 ∈ ℝd
h = σ(W1x + b1)
o = W2⊤h + b2
y = softmax(o)
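A NumPy sketch of this multiclass forward pass; the max-subtraction inside softmax is a standard numerical-stability trick added here (not from the slide), and ReLU for σ is an assumption:

```python
import numpy as np

def softmax(o):
    e = np.exp(o - o.max())   # subtract max for numerical stability (assumption)
    return e / e.sum()

n, m, d = 4, 8, 3                                  # input, hidden, class counts
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(m, n)), np.zeros(m)      # hidden: W1 in R^{m x n}
W2, b2 = rng.normal(size=(m, d)), np.zeros(d)      # output: W2 in R^{m x d}

x = rng.normal(size=n)
h = np.maximum(0.0, W1 @ x + b1)   # h = sigma(W1 x + b1) with sigma = ReLU
o = W2.T @ h + b2                  # o = W2^T h + b2, one score per class
y = softmax(o)                     # class probabilities, summing to 1
```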
Multiple Hidden Layers
h1 = σ(W1x + b1)
h2 = σ(W2h1 + b2)
h3 = σ(W3h2 + b3)
o = W4h3 + b4
Hyperparameters (see the sketch below):
• # of hidden layers
• Hidden size for each layer
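A sketch of stacking hidden layers in a loop, with the two hyperparameters above exposed as a size list; the particular sizes and the ReLU choice are arbitrary assumptions:

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """h_k = sigma(W_k h_{k-1} + b_k) for each hidden layer, then a linear output."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, W @ h + b)    # ReLU activation (an assumption)
    return weights[-1] @ h + biases[-1]   # o = W4 h3 + b4: no output activation

rng = np.random.default_rng(0)
sizes = [4, 16, 16, 16, 3]   # input, three hidden layers, output (arbitrary choices)
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

o = mlp_forward(rng.normal(size=sizes[0]), weights, biases)
```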
Summary
• Perceptron
  • Simple updates
  • Limited function complexity
• Multilayer Perceptron
  • Multiple layers add more complexity
  • Nonlinearity is needed
  • Simple composition (but architecture search needed)