
Chapter 7: Introduction to Object (Pattern) Recognition

7. Pattern Recognition
 A pattern (object) is an entity, vaguely defined, that could
be given a name, e.g.,
– fingerprint image,
– handwritten word,
– human face,
– speech signal,
– DNA sequence,
...
 Pattern recognition is the study of how machines can
– observe the environment,
– learn to distinguish patterns of interest,
– make sound and reasonable decisions about the
categories of the patterns.

7. Pattern Recognition Applications
Example pattern recognition applications:

7. Process of a Pattern Recognition System (1)

7. Process of a Pattern Recognition System (2)
 Data acquisition and sensing:
– Measurements of physical variables.
– Important issues: bandwidth, resolution, sensitivity,
distortion, SNR, latency, etc.
 Pre-processing:
– Removal of noise in data.
– Isolation of patterns of interest from the background.
 Feature extraction:
– Finding a new representation in terms of features.
 Model learning and estimation:
– Learning a mapping between features and pattern groups
and categories.
7. Process of a Pattern Recognition System (3)
 Classification:
– Using features and learned models to assign a pattern to
a category.
 Post-processing:
– Evaluation of confidence in decisions.
– Exploitation of context to improve performance.
– Combination of experts.

7. Design Cycle (1)

7. Design Cycle (2)
 Data collection:
– Collecting training and testing data.
– How can we know when we have an adequately large and
representative set of samples?
 Feature selection:
– Domain dependence and prior information.
– Computational cost and feasibility.
– Discriminative features.
• Similar values for similar patterns.
• Different values for different patterns.
– Invariant features with respect to translation, rotation and
scale.
– Robust features with respect to occlusion, distortion,
deformation, and variations in environment.
7. Design Cycle (3)
 Model selection:
– Domain dependence and prior information.
– Definition of design criteria.
– Parametric vs. non-parametric models.
– Handling of missing features.
– Computational complexity.
– Types of models: templates, decision-theoretic or
statistical, syntactic or structural, neural, and hybrid.
– How can we know how close we are to the true model
underlying the patterns?

7. Design Cycle (4)
 Training:
– How can we learn the rule from data?
– Supervised learning: a teacher provides a category label
or cost for each pattern in the training set.
– Unsupervised learning: the system forms clusters or
natural groupings of the input patterns.
– Reinforcement learning: no desired category label is given,
but the teacher provides feedback to the system on whether
each decision is right or wrong.
 Evaluation:
– How can we estimate the performance with training
samples?
– How can we predict the performance with future data?
– Problems of overfitting and generalization.
7. Pattern Recognition Techniques (1)
 Pattern is an arrangement of descriptors (features).
 Pattern class is a family of patterns that share some
common properties.
 The approaches to pattern recognition are divided into two
principal areas: decision-theoretic and structural.
– The first category deals with patterns described using
quantitative descriptors, such as length, area, and
texture.
– The second category deals with patterns best described
by qualitative descriptors, such as relational descriptors.

7. Pattern Recognition Techniques (2)

7. Pattern Recognition Techniques (3)
Example: three pattern classes (three types of iris flowers),
each pattern represented as a vector of two descriptors
(two measurements: the width and the length of its petals).

7. Recognition Based on Decision-Theoretic Methods (1)
 Let x = [x_1, x_2, ..., x_n]^T denote a pattern vector. For W
pattern classes ω_1, ω_2, ..., ω_W, we seek W decision functions
d_1(x), ..., d_W(x) with the property that, if x belongs to class ω_i,

$$d_i(x) > d_j(x), \qquad j = 1, 2, \ldots, W;\; j \ne i$$

 In other words, an unknown pattern x is said to belong to
the i-th pattern class if, upon substitution of x into all
decision functions, d_i(x) yields the largest numerical value.
 Suppose that we define the prototype of each pattern class
to be the mean vector of the patterns of that class:
$$m_j = \frac{1}{N_j} \sum_{x \in \omega_j} x, \qquad j = 1, 2, \ldots, W$$

where N_j is the number of pattern vectors from class ω_j.

 Using the Euclidean distance to determine closeness
reduces the problem to computing the distance measures:

$$D_j(x) = \lVert x - m_j \rVert$$

7. Recognition Based on Decision-Theoretic Methods (2)
 Matching with Minimum Distance Classifier
 Selecting the smallest distance is equivalent to evaluating the
functions:

$$d_j(x) = x^T m_j - \frac{1}{2} m_j^T m_j, \qquad j = 1, 2, \ldots, W$$

 The decision boundary between classes ω_i and ω_j for a
minimum distance classifier is

$$d_{ij}(x) = d_i(x) - d_j(x) = x^T (m_i - m_j) - \frac{1}{2} (m_i - m_j)^T (m_i + m_j) = 0$$
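To make the procedure concrete, here is a minimal NumPy sketch (the function names and the toy data are ours, not from the slides): it computes the class means m_j from labeled training patterns, then classifies a new pattern by maximizing d_j(x).

```python
import numpy as np

def train_means(X, y):
    """Compute the prototype (mean vector) m_j of each class from
    training patterns X (N x n) with integer labels y."""
    return {c: X[y == c].mean(axis=0) for c in np.unique(y)}

def classify_min_distance(x, means):
    """Assign x to the class whose mean is closest in Euclidean distance,
    i.e., the class maximizing d_j(x) = x^T m_j - 0.5 * m_j^T m_j."""
    scores = {c: x @ m - 0.5 * (m @ m) for c, m in means.items()}
    return max(scores, key=scores.get)

# Toy example: two 2-D classes (e.g., petal width/length measurements).
X = np.array([[1.0, 1.2], [0.9, 1.0], [4.0, 4.2], [4.1, 3.9]])
y = np.array([0, 0, 1, 1])
means = train_means(X, y)
print(classify_min_distance(np.array([1.1, 1.1]), means))  # -> 0
```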

7. Recognition Based on Decision-Theoretic Methods (3)
 Decision boundary of minimum distance classifier:

7. Recognition Based on Decision-Theoretic Methods (4)
 Matching by correlation
 Correlation can be used as the basis for finding matches of a
subimage w(x, y) of size J × K within an image f(x, y) of size
M × N, where we assume J ≤ M and K ≤ N:

$$c(x, y) = \sum_s \sum_t f(s, t)\, w(x + s, y + t)$$

for x = 0, 1, 2, ..., M−1 and y = 0, 1, 2, ..., N−1.

7. Recognition Based on Decision-Theoretic Methods (5)

7. Recognition Based on Decision-Theoretic Methods (6)
 The correlation function has the disadvantage of being
sensitive to changes in the amplitude of f and w.
 For example, doubling all values of f doubles the value of
c(x, y).
 An approach frequently used to overcome this difficulty is
to perform matching via the correlation coefficient
$$\gamma(x, y) = \frac{\displaystyle\sum_s \sum_t \bigl[f(s, t) - \bar{f}(s, t)\bigr]\bigl[w(x + s, y + t) - \bar{w}\bigr]}{\Bigl\{\displaystyle\sum_s \sum_t \bigl[f(s, t) - \bar{f}(s, t)\bigr]^2 \sum_s \sum_t \bigl[w(x + s, y + t) - \bar{w}\bigr]^2\Bigr\}^{1/2}}$$

where w̄ is the average value of the template w (computed only
once) and f̄(s, t) is the average value of f in the region
coincident with the current location of w.
 The correlation coefficient is scaled in the range -1 to 1,
independent of scale changes in the amplitude of f and w.
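A minimal sketch of template matching with the correlation coefficient, as a direct (unoptimized) implementation of the formula above; the function name and the synthetic test data are ours.

```python
import numpy as np

def corr_coeff_map(f, w):
    """Slide template w over image f and compute gamma(x, y) at every
    valid location. f is M x N, w is J x K, with J <= M and K <= N."""
    M, N = f.shape
    J, K = w.shape
    wz = w - w.mean()                  # w-bar is computed only once
    w_energy = np.sum(wz ** 2)
    out = np.zeros((M - J + 1, N - K + 1))
    for x in range(M - J + 1):
        for y in range(N - K + 1):
            region = f[x:x + J, y:y + K]
            fz = region - region.mean()    # f-bar over the region under w
            denom = np.sqrt(np.sum(fz ** 2) * w_energy)
            out[x, y] = np.sum(fz * wz) / denom if denom > 0 else 0.0
    return out

# The best match is the location where gamma is largest:
f = np.random.rand(64, 64)
w = f[20:28, 30:38].copy()                 # template cut from f itself
g = corr_coeff_map(f, w)
print(np.unravel_index(np.argmax(g), g.shape))  # -> (20, 30)
```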
7. Recognition Based on Decision-Theoretic Methods (7)

7. Recognition Based on Decision-Theoretic Methods (8)
 Optimum Statistical Classifiers
 The probability that a particular pattern x comes from class
ω_i is denoted p(ω_i | x).
 If the pattern classifier decides that x came from ω_j when it
actually came from ω_i, it incurs a loss, denoted L_ij. The
average loss incurred in assigning x to class ω_j is

$$r_j(x) = \sum_{k=1}^{W} L_{kj}\, p(\omega_k \mid x)$$

 From basic probability theory, we know that
p(A | B) = p(A) p(B | A) / p(B), so

$$r_j(x) = \frac{1}{p(x)} \sum_{k=1}^{W} L_{kj}\, p(x \mid \omega_k)\, P(\omega_k)$$
7. Recognition Based on Decision-Theoretic Methods (9)
 Since 1/p(x) is positive and common to all the r_j(x), j = 1, 2,
..., W, it can be dropped without affecting the relative order
of these functions from the smallest to the largest value:

$$r_j(x) = \sum_{k=1}^{W} L_{kj}\, p(x \mid \omega_k)\, P(\omega_k)$$

 Thus the Bayes classifier assigns an unknown pattern x to
class ω_i if r_i(x) < r_j(x) for all j ≠ i, that is, if

$$\sum_{k=1}^{W} L_{ki}\, p(x \mid \omega_k)\, P(\omega_k) < \sum_{q=1}^{W} L_{qj}\, p(x \mid \omega_q)\, P(\omega_q)$$

 For the 0-1 loss L_ij = 1 − δ_ij (where δ_ij = 1 if i = j and 0
otherwise),

$$r_j(x) = \sum_{k=1}^{W} (1 - \delta_{kj})\, p(x \mid \omega_k)\, P(\omega_k) = p(x) - p(x \mid \omega_j)\, P(\omega_j)$$

7. Recognition Based on Decision-Theoretic Methods (10)
 The Bayes classifier then assigns a pattern x to class ω_i if

$$p(x) - p(x \mid \omega_i)\, P(\omega_i) < p(x) - p(x \mid \omega_j)\, P(\omega_j)$$

or, equivalently, if

$$p(x \mid \omega_i)\, P(\omega_i) > p(x \mid \omega_j)\, P(\omega_j), \qquad j = 1, 2, \ldots, W;\; j \ne i$$

or, in the form of decision functions:

$$d_j(x) = p(x \mid \omega_j)\, P(\omega_j), \qquad j = 1, 2, \ldots, W$$

7. Recognition Based on Decision-Theoretic Methods (11)
 Bayes classifier for Gaussian pattern classes
Let us consider a 1-D problem (n = 1) involving two pattern
classes (W = 2) governed by Gaussian densities:

$$d_j(x) = p(x \mid \omega_j)\, P(\omega_j) = \frac{1}{\sqrt{2\pi}\,\sigma_j}\, e^{-\frac{(x - m_j)^2}{2\sigma_j^2}}\, P(\omega_j), \qquad j = 1, 2$$

7. Recognition Based on Decision-Theoretic Methods (12)

7. Recognition Based on Decision-Theoretic Methods (13)
In the n-dimensional case, the Gaussian density of the
vectors in the j-th pattern class has the form

$$p(x \mid \omega_j) = \frac{1}{(2\pi)^{n/2}\, |C_j|^{1/2}}\, e^{-\frac{1}{2}(x - m_j)^T C_j^{-1} (x - m_j)}$$

where the mean vector and covariance matrix of class ω_j are

$$m_j = E_j\{x\}, \qquad C_j = E_j\{(x - m_j)(x - m_j)^T\}$$

Approximating them from the N_j training samples of class ω_j:

$$m_j = \frac{1}{N_j} \sum_{x \in \omega_j} x, \qquad C_j = \frac{1}{N_j} \sum_{x \in \omega_j} x x^T - m_j m_j^T$$

7. Recognition Based on Decision-Theoretic Methods (14)
• The Bayes decision function for class ω_j is

$$d_j(x) = \ln[\,p(x \mid \omega_j)\, P(\omega_j)\,]$$

• If all covariance matrices are equal (C_j = C for j = 1, 2, ..., W),
dropping the terms common to all classes gives the linear
decision function

$$d_j(x) = \ln P(\omega_j) + x^T C^{-1} m_j - \frac{1}{2}\, m_j^T C^{-1} m_j$$

7. Recognition Based on Decision-Theoretic Methods (15)

7. Recognition Based on Decision-Theoretic Methods (16)
 Artificial Neural Networks
– Ideas stem from the operation of human neural
networks.
– Networks of interconnected nonlinear computing
elements called neurons.

7. Recognition Based on Decision-Theoretic Methods (17)
• Perceptron:
A perceptron computes a sum of products of an input pattern
with a set of weights, w^T x + w_{n+1}, where w and x are
n-dimensional column vectors and w^T x is the dot product of the
two vectors. We refer to w as a weight vector and, as above, to
w_{n+1} as a bias. Given any pattern vector x from a vector
population, we want to find a set of weights with the property:

$$w^T x + w_{n+1} > 0 \ \text{if}\ x \in c_1, \qquad w^T x + w_{n+1} < 0 \ \text{if}\ x \in c_2$$

7. Recognition Based on Decision-Theoretic Methods (18)
When the classes are linearly separable, the perceptron
algorithm for finding (or training) w is simple:

Let α > 0 denote a correction increment (also called the
learning increment or the learning rate), let w(1) be a vector
with arbitrary values, and let w_{n+1}(1) be an arbitrary
constant. Then, do the following for k = 2, 3, ...: for a pattern
vector x(k) at step k,

1. If x(k) ∈ c_1 and w^T(k) x(k) + w_{n+1}(k) ≤ 0, let
   w(k+1) = w(k) + α x(k) and w_{n+1}(k+1) = w_{n+1}(k) + α.
2. If x(k) ∈ c_2 and w^T(k) x(k) + w_{n+1}(k) ≥ 0, let
   w(k+1) = w(k) − α x(k) and w_{n+1}(k+1) = w_{n+1}(k) − α.
3. Otherwise, let w(k+1) = w(k) and w_{n+1}(k+1) = w_{n+1}(k).
7. Recognition Based on Decision-Theoretic Methods (19)

7. Recognition Based on Decision-Theoretic Methods (20)
The notation in the previous equations can be simplified if we
add a 1 at the end of every pattern vector and include the
bias in the weight vector. That is, we define
x = [x_1, x_2, ..., x_n, 1]^T
and
w = [w_1, w_2, ..., w_n, w_{n+1}]^T
Then the two conditions above become

$$w^T x > 0 \ \text{if}\ x \in c_1, \qquad w^T x < 0 \ \text{if}\ x \in c_2$$
7. Recognition Based on Decision-Theoretic Methods (21)
With this notation, the perceptron algorithm becomes: for any
pattern vector x(k) at step k,

$$w(k+1) = \begin{cases} w(k) + \alpha\, x(k) & \text{if } x(k) \in c_1 \text{ and } w^T(k)\, x(k) \le 0 \\ w(k) - \alpha\, x(k) & \text{if } x(k) \in c_2 \text{ and } w^T(k)\, x(k) \ge 0 \\ w(k) & \text{otherwise} \end{cases}$$
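A minimal sketch of this training rule using augmented pattern vectors; the epoch limit, starting weights, and toy data are illustrative choices of ours.

```python
import numpy as np

def train_perceptron(X1, X2, alpha=0.5, epochs=100):
    """Perceptron training: a 1 is appended to every pattern so the
    bias w_{n+1} becomes the last component of w."""
    A = np.hstack([X1, np.ones((len(X1), 1))])   # class c1: want w^T x > 0
    B = np.hstack([X2, np.ones((len(X2), 1))])   # class c2: want w^T x < 0
    w = np.zeros(A.shape[1])                     # w(1): arbitrary start
    for _ in range(epochs):
        changed = False
        for x in A:                              # misclassified c1 pattern
            if w @ x <= 0:
                w, changed = w + alpha * x, True
        for x in B:                              # misclassified c2 pattern
            if w @ x >= 0:
                w, changed = w - alpha * x, True
        if not changed:                          # converged (separable data)
            break
    return w

X1 = np.array([[2.0, 2.0], [3.0, 2.5]])
X2 = np.array([[0.0, 0.0], [0.5, -1.0]])
w = train_perceptron(X1, X2)
print(np.sign(w @ np.array([2.5, 2.0, 1.0])))    # -> 1.0 (class c1)
```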

7. Recognition Based on Decision-Theoretic Methods (22)
For nonseparable pattern classes:
Let r denote the response we want the perceptron to have for
any pattern during training; the desired response r is either
+1 or −1. We want to find the augmented weight vector w that
minimizes the mean squared error (MSE) between the desired and
actual responses of the perceptron. The algorithm for finding
w is based on the least-mean-squared-error (LMSE) rule:

$$w(k+1) = w(k) + \alpha\,\bigl[r(k) - w^T(k)\, x(k)\bigr]\, x(k), \qquad k = 1, 2, \ldots$$

A typical range for α is 0.1 < α < 1.0; w(1) is arbitrary.
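A corresponding sketch of the LMSE (delta) rule under the same augmented-vector convention; again, the names and the fixed epoch count are our illustrative choices.

```python
import numpy as np

def train_lmse(X, r, alpha=0.1, epochs=50):
    """LMSE training for possibly nonseparable classes: the desired
    response r(k) is +1 or -1 for each (augmented) pattern x(k)."""
    A = np.hstack([X, np.ones((len(X), 1))])  # append 1: bias lives in w
    w = np.zeros(A.shape[1])                  # w(1) is arbitrary
    for _ in range(epochs):
        for x, rk in zip(A, r):
            err = rk - w @ x                  # desired minus actual response
            w = w + alpha * err * x           # w(k+1) = w(k) + a[r - w^T x]x
    return w

X = np.array([[2.0, 2.0], [3.0, 2.5], [0.0, 0.0], [0.5, -1.0]])
r = np.array([1, 1, -1, -1])
print(np.sign(train_lmse(X, r) @ np.array([2.5, 2.0, 1.0])))  # -> 1.0
```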

7. Recognition Based on Decision-Theoretic Methods (23)
• Artificial Neuron:
Neural networks are interconnected perceptron-like computing
elements called artificial neurons. These neurons perform the
same computations as the perceptron but use a different
activation function.

Activation function: instead of the perceptron's hard-limiting
function, a smooth nonlinearity is used, such as the sigmoid
h(z) = 1/(1 + e^{−z}).

7. Recognition Based on Decision-Theoretic Methods (24)

7. Recognition Based on Decision-Theoretic Methods (25)
• Fully Connected Neural Network

7. Recognition Based on Decision-Theoretic Methods (26)
The outputs of layer 1 are the components of the input vector
x, where n_1 = n is the dimensionality of x:

$$a_i(1) = x_i, \qquad i = 1, 2, \ldots, n_1$$

The computation performed by neuron i in layer l is given by

$$z_i(l) = \sum_{j=1}^{n_{l-1}} w_{ij}(l)\, a_j(l-1) + b_i(l)$$

for i = 1, 2, ..., n_l and l = 2, ..., L. The quantity z_i(l) is called the
net (or total) input to neuron i in layer l. The reason for this
terminology is that z_i(l) is formed using all outputs from layer
(l − 1).
The output (activation value) of neuron i in layer l is given by

$$a_i(l) = h\bigl(z_i(l)\bigr)$$
7. Recognition Based on Decision-Theoretic Methods (27)
Matrix form: implementing the previous equations with matrix
operations is computationally faster. The number of outputs in
layer 1 always equals the dimension of the input pattern x, so
its matrix (vector) form is simple:

$$a(1) = x$$

The matrix W(l) contains all the weights in layer l; each row
contains the weights for one of the nodes in layer l:

$$W(l) = \begin{bmatrix} w_{11}(l) & w_{12}(l) & \cdots & w_{1 n_{l-1}}(l) \\ \vdots & \vdots & & \vdots \\ w_{n_l 1}(l) & w_{n_l 2}(l) & \cdots & w_{n_l n_{l-1}}(l) \end{bmatrix}$$
7. Recognition Based on Decision-Theoretic Methods (28)
Then we can obtain all the sum-of-products computations z_i(l)
for layer l simultaneously:

$$z(l) = W(l)\, a(l-1) + b(l)$$

where a(l−1) is a column vector of dimension n_{l−1}×1
containing the outputs of layer l−1, b(l) is a column vector
of dimension n_l×1 containing the bias values of all the
neurons in layer l, and z(l) is an n_l×1 column vector
containing the net input values z_i(l), i = 1, 2, ..., n_l, to all the
nodes in layer l.

7. Recognition Based on Decision-Theoretic Methods (29)
Because the activation function is applied to each net input
independently of the others, the outputs of the network at
any layer can be expressed in vector form as:

$$a(l) = h\bigl(z(l)\bigr)$$
7. Recognition Based on Decision-Theoretic Methods (30)

7. Recognition Based on Structural Methods (1)
 Matching Shape Numbers
 All shapes of order 4, 6, and 8
Order 4:
Chain code: 0321 | Difference: 3333 | Shape no.: 3333

Order 6:
Chain code: 003221 | Difference: 303303 | Shape no.: 033033

Order 8:
Chain code: 00332211 | Difference: 30303030 | Shape no.: 03030303
Chain code: 03032211 | Difference: 33133030 | Shape no.: 03033133
Chain code: 00032221 | Difference: 30033003 | Shape no.: 00330033
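A small sketch (function names are ours) of how the circular first difference and the shape number can be computed from a 4-directional chain code; it reproduces the order-4 and order-6 examples above.

```python
def first_difference(chain, directions=4):
    """Circular first difference: each element minus its predecessor,
    counted counterclockwise, modulo the number of directions."""
    return [(chain[i] - chain[i - 1]) % directions for i in range(len(chain))]

def shape_number(chain, directions=4):
    """Shape number: the circular rotation of the first difference
    that forms the smallest integer."""
    d = first_difference(chain, directions)
    return min(d[i:] + d[:i] for i in range(len(d)))

print(shape_number([0, 3, 2, 1]))        # -> [3, 3, 3, 3]       (order 4)
print(shape_number([0, 0, 3, 2, 2, 1]))  # -> [0, 3, 3, 0, 3, 3] (order 6)
```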

7. Recognition Based on Structural Methods (2)
 Advantages:
1. Shape-number matching suits simple graph structures,
especially those formed by combinations of line segments.
2. It solves the rotation problem: the first difference of the
chain code is rotation invariant.
3. It emphasizes the outline of a shape, so similarity of
shape can be assessed.
4. It solves the displacement problem, because the method
works with relative positions rather than absolute position.

7. Recognition Based on Structural Methods (3)
 Disadvantages:
1. It cannot be used for hollow structures.
2. Scaling is a shortcoming: the sampling grid (or the
coordinates) must be renormalized.
3. Intensity information is not used.
4. The mirror (reflection) problem.
5. Color cannot be recognized.

7. Recognition Based on Structural Methods (4)
String Matching
 Suppose that two region boundaries, a and b, are coded
into strings denoted a1, a2, …, an and b1, b2, …, bm,
respectively.
 Let  represent the number of matches between the
two strings, where a match occurs in the k th position if
ak = bk
  max( a , b )  

7. Recognition Based on Structural Methods (5)
 A simple measure of similarity between a and b is the
ratio:

$$R = \frac{\alpha}{\beta} = \frac{\alpha}{\max(|a|, |b|) - \alpha}$$

Hence R is infinite for a perfect match and 0 when none of
the corresponding symbols in a and b match (α = 0 in this case).
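A minimal sketch of this similarity measure (the function name and the test strings are ours):

```python
def string_similarity(a, b):
    """R = alpha / beta, where alpha counts positionwise matches and
    beta = max(|a|, |b|) - alpha counts the remaining mismatches."""
    alpha = sum(1 for s, t in zip(a, b) if s == t)
    beta = max(len(a), len(b)) - alpha
    return float('inf') if beta == 0 else alpha / beta

print(string_similarity("abcabc", "abcabc"))  # -> inf (perfect match)
print(string_similarity("abcabc", "abcabd"))  # -> 5.0
print(string_similarity("abc", "xyz"))        # -> 0.0
```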

7. Recognition Based on Structural Methods (6)

