
Statistical Methods in Artificial Intelligence

CSE471 - Monsoon 2015 : Lecture 03

Avinash Sharma
CVIT, IIIT Hyderabad

Lecture 03: Plan

Linear Algebra Recap
Linear Discriminant Functions (LDFs)
The Perceptron
Generalized LDFs
The Two-Category Linearly Separable Case
Learning LDF: Basic Gradient Descent
Perceptron Criterion Function

Basic Linear Algebra

Vector Operations:
Scaling
Transpose
Addition
Subtraction
Dot Product
Equation of a Plane

Vector Operations

Transpose

Scaling: Only Magnitude Changes

Dot Product (Inner Product) of two vectors is a scalar.
The dot product of two perpendicular vectors is 0.

Equation of a Plane
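The dot-product facts and the plane equation above can be sketched numerically; the vectors below are illustrative choices, not taken from the slides:

```python
import numpy as np

# Dot (inner) product of two vectors is a scalar.
u = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, -5.0, 2.0])
print(np.dot(u, v))  # 1*4 + 2*(-5) + 3*2 = 0.0, so u and v are perpendicular

# Equation of a plane: the set of points x with np.dot(w, x) + b == 0,
# where w is the normal vector and b the offset (chosen here for illustration).
w = np.array([0.0, 0.0, 1.0])   # normal along the z-axis
b = -2.0                        # the plane z = 2
x = np.array([5.0, 7.0, 2.0])   # a point with z = 2
print(np.dot(w, x) + b)         # 0.0, so x lies on the plane
```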

Linear Discriminant Functions

Assumes a 2-class classification setup.
The decision boundary is represented explicitly in terms of the components of the feature vector x.
The aim is to seek parameters of a linear discriminant function which minimize the training error.
Why Linear?
Simplest possible model
Can be generalized

Linear Discriminant Functions

[Figure: two classes of points (Class A, Class B) separated by a linear decision boundary]

The Perceptron

Perceptron Decision Boundary

Perceptron Summary

The decision boundary surface (hyperplane) divides the feature space into two regions.
The orientation of the boundary surface is decided by the normal vector w.
The location of the boundary surface is determined by the bias term w0.
g(x) is proportional to the distance of x from the boundary surface.
g(x) > 0 on the positive side and g(x) < 0 on the negative side.
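A minimal sketch of a perceptron decision function, with hypothetical weights chosen so the numbers are easy to check: the sign of g(x) gives the side of the boundary, and g(x)/||w|| gives the signed distance from it.

```python
import numpy as np

# Hypothetical perceptron parameters: w is the normal vector, w0 the bias term.
w = np.array([3.0, 4.0])   # ||w|| = 5
w0 = -10.0

def g(x):
    """Linear discriminant g(x) = w . x + w0."""
    return np.dot(w, x) + w0

x = np.array([2.0, 3.0])
print(g(x))                      # 8.0 -> x falls on the positive side
print(g(x) / np.linalg.norm(w))  # 1.6 -> signed distance of x from the hyperplane
```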

Generalized LDFs

Linear:
g(x) = w0 + sum_i w_i x_i

Non-linear (Quadratic):
g(x) = w0 + sum_i w_i x_i + sum_i sum_j w_ij x_i x_j

Generalized:
g(x) = sum_i a_i y_i(x) = a^T y, with y_1 = 1

Generalized LDFs Summary

The mapping y = phi(x) can be any arbitrary function that projects original data points x to points y, where y_1 = 1.
The hyperplane decision surface a^T y = 0 passes through the origin of the mapped space.
Advantage: In the mapped higher-dimensional space the data might be linearly separable.
Disadvantage: The mapping is computationally intensive and learning the classification parameters can be non-trivial.
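A toy sketch of the advantage above, with an assumed quadratic mapping and hand-picked data: three 1-D points that no threshold on the line can separate become separable by a hyperplane after mapping to (1, x, x^2).

```python
import numpy as np

def quad_map(x):
    """Hypothetical quadratic mapping phi: x -> y = (1, x, x^2)."""
    return np.array([1.0, x, x * x])

# 1-D data: class A at {-2, 2}, class B at {0} -- not linearly separable on the line.
# After the mapping, a = (-1, 0, 1) gives g(x) = a.y = x^2 - 1, a hyperplane
# through the origin of y-space that separates the two classes.
a = np.array([-1.0, 0.0, 1.0])
for x in (-2.0, 2.0):                  # class A: g(x) > 0
    print(np.dot(a, quad_map(x)))      # 3.0
print(np.dot(a, quad_map(0.0)))        # -1.0, class B: g(x) < 0
```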

The Two-Category Linearly Separable Case

If the samples are linearly separable, there exists a weight vector a such that a^T y_i > 0 for every data vector y_i of class 1 and a^T y_i < 0 for every data vector y_i of class 2.

Normalized Case: replace every sample of class 2 by its negative; a solution vector a then satisfies the single condition a^T y_i > 0 for every (normalized) data vector y_i.
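The normalization step can be sketched as follows; the data vectors and the solution vector a below are hypothetical, chosen only to make the check concrete:

```python
import numpy as np

# Augmented data vectors y = (1, x) for two classes (toy example).
class1 = np.array([[1.0, 2.0], [1.0, 3.0]])
class2 = np.array([[1.0, -1.0], [1.0, -2.0]])

# "Normalization": replace every class-2 sample by its negative, so a
# separating vector a must satisfy a.y > 0 for ALL normalized samples.
ys = np.vstack([class1, -class2])

a = np.array([0.5, 1.0])   # a hypothetical solution vector
print(np.all(ys @ a > 0))  # True -> a separates the two classes
```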

Learning LDF: Basic Gradient Descent

Define a scalar criterion function J(a) which captures the classification error for a specific boundary plane described by the parameter vector a.
Minimize J(a) using gradient descent:
Start with an arbitrary value a(1) for a.
Iteratively refine the estimate of a:
a(k+1) = a(k) - eta(k) * grad J(a(k))
eta(k) is a positive scale factor, also known as the learning rate.
A too-small eta makes the convergence very slow.
A too-large eta can diverge due to overshooting of the minimum.
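The update rule above, combined with the perceptron criterion Jp(a) = sum over misclassified samples of (-a.y), can be sketched as follows. The gradient of Jp is -sum of the misclassified y, so the descent step adds eta times that sum to a; the data and learning rate are illustrative assumptions.

```python
import numpy as np

# Normalized, augmented samples: a solution must give a.y > 0 for every row.
ys = np.array([[1.0, 2.0], [1.0, 0.5], [-1.0, 1.0], [-1.0, 3.0]])

def perceptron_criterion_grad(a, ys):
    """Gradient of Jp(a) = sum over misclassified y of (-a.y): minus the sum of misclassified y."""
    mis = ys[ys @ a <= 0]
    return -mis.sum(axis=0), len(mis)

a = np.zeros(2)   # arbitrary starting point a(1)
eta = 0.1         # learning rate: too small -> slow, too large -> overshoot
for k in range(100):
    grad, n_mis = perceptron_criterion_grad(a, ys)
    if n_mis == 0:           # all samples correctly classified: done
        break
    a = a - eta * grad       # a(k+1) = a(k) - eta * grad Jp(a(k))

print(np.all(ys @ a > 0))    # True once the loop converges
```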
