Harandi IJCV2009 Published

Int J Comput Vis (2009) 81: 191–204
DOI 10.1007/s11263-008-0161-5
Optimal Local Basis: A Reinforcement Learning Approach

for Face Recognition
Mehrtash T. Harandi · Majid Nili Ahmadabadi ·
Babak N. Araabi
Received: 9 January 2008 / Accepted: 22 July 2008 / Published online: 19 August 2008
© Springer Science+Business Media, LLC 2008
Abstract This paper presents a novel learning approach for face, Fisherface, Subclass Discriminant Analysis, and Ran-
Face Recognition by introducing Optimal Local Basis. Op- dom Subspace LDA methods as well.
timal local bases are a set of basis derived by reinforcement
learning to represent the face space locally. The reinforce- Keywords Face recognition · Feature selection ·
ment signal is designed to be correlated to the recognition Reinforcement learning
accuracy. The optimal local bases are derived then by find-
ing the most discriminant features for different parts of the
1 Introduction
face space, which represents either different individuals or
different expressions, orientations, poses, illuminations, and
Individual identification through face recognition is an im-
other variants of the same individual. Therefore, unlike most
portant and crucial ability of humans in effective com-
of the existing approaches that solve the recognition prob- munication and interactions. Interest in producing artificial
lem by using a single basis for all individuals, our proposed face recognizers has been grown rapidly during the past
method benefits from local information by incorporating dif- ten years, mainly because of its various practical applica-
ferent bases for its decision. We also introduce a novel clas- tions such as surveillance, identification, authentication, ac-
sification scheme that uses reinforcement signal to build a cess control, and Mug shot searching. Among different ap-
similarity measure in a non-metric space. proaches devised for face recognition, the widely studied
Experiments on AR, PIE, ORL and YALE databases indi- ones are the statistical learning methods that try to derive
cate that the proposed method facilitates robust face recogni- an appropriate basis for face representation (Turk and Pent-
tion under pose, illumination and expression variations. The land 1991; Belhumeur et al. 1997; Bartlett et al. 2002). The
performance of our method is compared with that of Eigen- reason behind deriving a basis is because a complete ba-
sis makes the derivation of unique image representations
suitable for processes like image retrieval and object recog-
nition. Statistical learning theory also offers a lower di-
M.T. Harandi () · M. Nili Ahmadabadi · B.N. Araabi
Control and Intelligent Processing Center of Excellence, mensional description of faces. The lower dimensional de-
School of Electrical and Computer Engineering, scription is crucial in learning, as the number of examples
University of Tehran, Tehran, Iran required for achieving a given performance exponentially
e-mail: mharandi@ece.ut.ac.ir grows with the dimension of the representation space. On
M. Nili Ahmadabadi the other hand, low dimensional representations of visual
e-mail: mnili@ut.ac.ir objects have some biological roots as suggested by Edel-
B.N. Araabi man and Intrator (1990): “perceptual tasks such as similar-
e-mail: araabi@ut.ac.ir ity judgment tend to be performed on a low-dimensional
M.T. Harandi · M. Nili Ahmadabadi · B.N. Araabi
representation of the sensory data.” Principal Component
School of Cognitive Sciences, Institute for Studies in Theoretical Analysis (PCA) (Turk and Pentland 1991; Yang et al. 2004;
Physics and Mathematics, Tehran, Iran Feng et al. 2002), Linear Discriminate Analysis (LDA)
192 Int J Comput Vis (2009) 81: 191–204
(Belhumeur et al. 1997), Subclass Discriminant Analysis

(SDA) (Manli and Martinez 2006), Independent Compo-
nent Analysis (ICA) (Bartlett et al. 2002; Liu and Wech-
sler 2003), Locality Preserving Projections (LPP) (He et al.
2005; Cai et al. 2006), kernel machines (Kim et al. 2002;
Liu 2004; Lu et al. 2003) and hybrid methods (Yang et al.
2005) are successful examples of applying statistical learn-
ing theory to face recognition by introducing a single ba-
sis for face representation. This is where some important
questions come to mind: Is a holistic basis sufficient for a
complicated task like face recognition or we have to intro-
duce different bases for different parts of face space (in-
dividuals) to achieve best practically possible recognition?
How does a single basis may interpret the holistic and fea-
ture analysis behaviors (Zhao et al. 2003) of human be-
ings? Apparently using the same projection for all indi-
viduals is in contrast with human feature analysis behav-
ior in face recognition. Several studies showed that even
the derived basis can be further improved by searching for
the most discriminant directions using evolutionary algo-
rithms like evolutionary pursuit (Liu and Wechsler 2000; Fig. 1 An example of the proposed idea where, each subject is repre-
Harandi et al. 2004), GA-Fisher (Zheng et al. 2005) and in- sented by its most discriminant features in the face space
dividual/combinatorial feature selection methods (Liu and
Motoda 1998; Ekenel and Sankur 2004). of face space can be corresponds to different individuals or
Several other approaches also exist in the area of face even different expressions/poses/illuminations of an individ-
recognition to tackle the problem by introducing different ual. These learned discriminant features are varied in num-
bases. The very first approach was View-based Eigenspaces ber for different parts of the face space although they are
proposed by Moghaddam (Pentland et al. 1994). In his work, selected from a unique feature pool. This pool can be pro-
Moghaddam suggested to group images taken from a spe- vided by any holistic algorithm. As the learned discriminant
cific view together and build a specific space for each view. features formed local basis for face spaces derived by statis-
The idea is extended by Kim to LDA sp aces with soft clus- tical learning methods, we called our method Optimal Local
tering idea in 2005 (Kim and Kittler 2005). Wang introduced Basis (OLB). The proposed method can be considered as
random subspace LDA to generate a number of subspaces an extension to evolutionary algorithms (Liu and Wechsler
randomly followed by fusing the result of recognition (Wang 2000) in which only one optimal basis is sought for all in-
and Tang 2006). In addition, mutual subspaces are also pro- dividuals. However, in contrast, our approach tries to learn
posed when several images from each class are available one or several separate optimal basis/bases for each individ-
(Yamaguchi et al. 1998). ual. We also provide a novel classification scheme to use RL
Nevertheless each space in these mixture models is still reward signal for taking its final decision in a non-metric
holistic in terms of individual representation and identifica- space. The proposed method is successfully tested on AR
tion. Furthermore, the clustering method used to form the (Martinez and Kak 2001), PIE (Sim et al. 2003), ORL (ORL
samples of each space, does not necessarily group the im- database) and YALE (Yale University Face Image Database)
ages to optimally extract their features (LDA in LLDA and datasets, which cover a wide range of variations such as dif-
LDA mixture model and PCA features in View-based ap- ferent expressions, illuminations and poses. We apply our
proach). One can also expect unacceptable generalization method to face spaces derived by LDA and compare with
ability if the number of samples in some clusters are not Eigenface, Fisherface, SDA and random subspace LDA in
sufficient. terms of recognition accuracy.
In this paper, inspired by biological findings in human The rest of this paper is organized as follows: Section 2
face recognition, we present a novel idea to locally repre- provides general background on RL algorithm. In Sect. 3 our
sent the face space by using the reinforcement learning (RL) method for optimal basis derivation using RL is introduced.
method. The main idea proposed in this paper is illustrated Section 4 presents the proposed class similarity measure and
as an example in Fig. 1. Considering the complex face space OLB classifier. Section 5 describes the computational com-
manifold, our learning method tries to find the most discrim- plexity of the proposed method. Experimental results are de-
inant features for different parts of the space. Different parts scribed in Sect. 6. Section 7 provides some interesting prop-
Int J Comput Vis (2009) 81: 191–204 193
erties of the proposed algorithm followed by conclusion re- N is the number of Eigenfaces, and Ti is the binary vector
marks and suggestions for future in Sect. 8. demonstrating which Eigenfaces are the optimal subset to
describe xi . For example in Fig. 1, the optimal binary vec-
tor for the subject A is a binary vector with four elements
2 Reinforcement Learning equal to one whereas the OLB optimal binary vector for the
subject B has only two ones. The location of the ones in the
Reinforcement Learning is a machine learning technique OLB optimal binary vector determines which features are
for solving sequential decision problems. Various decision optimal for describing the underlying OLB. In this sense,
problems in real life are sequential in nature. In these prob- the binary vector Ti can be considered as a feature selection
lems the received reward does not depend on an isolated mapping from a high dimensional space into a lower dimen-
decision but rather on a sequence of decisions. Therefore, sional one. In sequel, the set of optimum local features, OLB
learning agent maximizes a combination of its immediate optimal binary vector, and Ti are used interchangeably.
and delayed rewards. In solving a classification problem, the problem is mod-
In RL, at each moment t, learning agent senses the en- eled by a set of OLBs. The OLB learning algorithm tries
vironment in state st ∈ S and takes an action at ∈ A where to find the OLB optimal binary vector Ti for each OLB by
S and A denote the sets of agent’s states and its available maximizing the discrimination power of its representative
actions, respectively. Agent’s action causes a state transition point xi from other representative points xj , j = i. More
and reward rt from the environment. The expected value of specifically the learning process tries to find the best Ti so
the received reward by following an arbitrary policy π from that all the other representative points xj , j = i with the
an initial state st is given by same class label ωi are seen closer to xi than the representa-
∞ tive points with different class labels ωj in the space defined
by Ti .
V (st ) = E
π i
γ rt+i (1)
In its formal description, OLB derivation algorithm can
i=0
be regarded as a manifold learning (Roseis and Saul 2000)
where rt+i is the reward received in state st+i using pol- approach; however, here we would like to explore another
icy π , and γ ∈ [0, 1) is a discount factor that balances de- view of the OLB algorithm; which is closely related to
layed versus instant rewards. The learning problem is mod- some of interesting findings in Biology and Neuroscience.
eled as the best mapping from states into actions π : S → A Considering a classification problem defined over a feature
i.e. finding the policy that maximizes the sum of expected space, features can be categorized either as global or local
rewards: based on the way they are used in the classification task.
In most classification algorithms, a unique set of features
π ∗ = arg max Vπ (s), ∀s ∈ S (2) is identically selected to separate all the classes over the
π
feature space. Such features are called global features. On
There are different RL methods to find the optimum pol- the other hand, most often if we consider a portion of fea-
icy; e.g. Q-learning and Sarsa (Sutton and Barto 1998). In ture space, it is possible to find a handful of features to
this paper we use Q-learning to estimate optimal local bases. characterize that portion more efficiently as compared with
global features. Such features may be labeled as local fea-
tures. Global features are ideal, provided that features ap-
3 Optimal Local Basis Learning Using RL propriateness for classification does not vary much across
different classes. That is, the same set of features performs
We will introduce our approach in this section. We as- equally well to separate all the classes. However, in most
sume that we are encountering a classification problem classification problems, it is very difficult or even impracti-
where the feature vectors are represented by N dimen- cal to find such an optimal set of global features. Moreover,
sional vectors and each datum belongs to one of the classes all features may not always be processed and used due to
{ω1 , ω2 , . . . , ωC }. An OLB is defined as a three-tuple computational and/or time resource limitations. Unlike the
(xi , ωi , Ti ), where xi ∈ RN is the OLB representative point, global features, the local features may vary on different por-
ωi ∈ {ω1 , ω2 , . . . , ωC } is the OLB class label, Ti ∈ N is a tions of the feature space. The OLB algorithm is a practical
binary vector expressing the set of features associated with approach to obtain the local features in a complex classifica-
the OLB, and N is the set of all the N dimensional bi- tion task like face recognition; because Ti can be considered
nary vectors excluding the null binary vector. Therefore, as the subset of most discriminant features from the OLB
N have 2N − 1 members. For instance, in a face recog- representative point of view. It is interesting to know that
nition task where the features are obtained by PCA, xi is the opportunistic behavior of human face recognition (Zhao
the representation of a sample face in the Eigenface space, et al. 2003) and recent theories of face spaces can also be
194 Int J Comput Vis (2009) 81: 191–204
partially modeled by the OLB algorithm. The theory of face In sequel, we introduce the proposed learning method in
space is the most common theoretical framework proposed more details. In particular Sect. 3.1 proposes the learning al-
for face recognition in both computational and psycholog- gorithm based on Q-learning—a widely used implementa-
ical literature (O’Toole et al. 1999). In most computational tion of reinforcement learning (Sutton and Barto 1998)—to
researches, a fixed face space with holistic features is as- estimate the discrimination factors. In Sect. 3.2 we provide
sumed however some psychological findings challenge the different methods for selecting optimal features for each
proposition that the feature space is ever holistic or fixed. It OLB. Estimated OLBs are integrated into a classification
means, an expert’s feature space may become reorganized schema in Sect. 4. A preliminary study on the OLB learn-
and tuned to perform a categorization task more efficiently ing algorithm is reported in Harandi et al. (2004).
(Carbon 2003). In a similar way, the OLB algorithm man-
ages to find the most discriminant features of each represen- 3.1 The Learning Method
tative point. As a result, it, tunes and reorganizes the face
space to boost the recognition performance from each local In this section, our method for assigning discrimination
observer’s point of view. factor to each feature of an OLB using RL is presented.
Person-specific image-based approaches are recently in- We assume that the learning process is applied to the
troduced with promising results for face recognition (Ma et OLB(xi , ωi , Ti ) where the classification problem is mod-
al. 2007; Bicego et al. 2006; Ahonen et al. 2006). In Ma eled by the set {OLB(x1 , ω1 , T1 ), OLB(x2 , ω2 , T2 ), . . . ,
et al. (2007) and Bicego et al. (2006), the SIFT operator— OLB(xi , ωi , Ti ), . . . , OLB(xm , ωm , Tm )}. Here m is the
introduced by Lowe (2004)—is used to detect the key points number of OLBs and m ≥ C.
of each gallery image and to extract the scale/rotation in- In order to use RL, we should model the problem as
variant features for each key point. For identifying a probe a Markov decision process and design the reward func-
image, its SIFT features are compared with those of each tion. Figure 2 shows a schematic view of the proposed
gallery image and the closest match is selected as the system where, the learning process is modeled as agent-
recognition result. Our approach and the mentioned person- environment interaction. For each OLB, the corresponding
specific methods resemble each other at the first glance; agent traverses the feature space and at each step selects
because, in both algorithms, each training sample is repre- a feature from the available features. The environment re-
sented by a set of specific features. However, our method dif- sponds the agent completing an action (feature selection) by
fers from the person-specific methods in two major aspects.
Firstly, our method is independent of the feature space and
theoretically can be used with different features. Secondly,
the nature of the feature selection approaches is different.
In fact, we find a set of most discriminant features for each
sample image by looking at its neighbor images while dis-
crimination power is not directly considered in extraction of
features in the person-specific methods.
Selection of the best feature subset in an N dimensional
space demands for examining 2N − 1 possibilities per OLB.
Searching such a space is computationally exhaustive even
for medium size feature spaces. In order to search the feature
space more efficiently, the feature selection task is modeled
as an expected reward maximization problem in an nth order
Markov Decision Process (MDP). This MDP model drasti-
cally reduces the computational complexity of the problem;
as in an nth order MDP the learner needs to keep its n last
states and decisions. To derive the OLB optimal binary vec-
tor Ti , the problem is divided into two steps. In the first
step we obtain a sorted feature set by learning discrimina-
tion factors of all the features describing the OLB represen-
tative point using RL. The discrimination factor shows how
discriminative the corresponding feature is for classifying its
representative point along with the other features. In the sec-
ond step, the best subset of features is selected as the OLB
optimum local features. Fig. 2 A schematic view of the learning system
Int J Comput Vis (2009) 81: 191–204 195
Fig. 3 A nth order MDP model of the environment, where fjs is the j th selected feature and ai is the agent’s action at state si
a reward signal. The agent’s task is to learn those features The reward signal is designed to satisfy the following cri-
for OLB(xi , ωi , Ti ) that result in the collection of maximum teria:
expected reward.
• Narrowing the distance between xi and the other rep-
The environment is modeled as an nth order MDP where
resentative points xj ; j = i with the same class labels
the current state Si is represented merely by n last selected
s ,fs s (ωi = ωj ) in an Euclidean space. This criterion is equiv-
features (fi−n i−n+1 , . . . , fi ) as shown in Fig. 3. In state alent to minimize intra-set distance for each class.
Si agent chooses its decision ai based on its n last de-
s ,fs s • Maximizing the discrimination of xi from the representa-
cision (fi−n i−n+1 , . . . , fi ) and available remaining fea- tive points xj ; j = i with different class labels (ωi = ωj ).
tures. The selected features are kept in binary vector h of
This criterion is equivalent to maximize inter-set dis-
length N , where N is the dimension of feature space. This
tances for classes. In this process, those features that seg-
is to ensure that a feature is not selected more than once.
regate xi from the representative points with different
Whenever a feature is selected the corresponding element
class labels are preferred.
of vector h becomes one. The agent can select only those
features that their corresponding elements in h are zero. The critic (the environment) evaluates the selected fea-
The value of agent’s state-action pairs are modeled by tures by examining the class labels of the K nearest neigh-
n + 1 Q-tables {Q0 , Q1 , Q2 , . . . , Qn } of size {N, N × bors of xi . To do this, all representative points xj , j =
N, N × N × N, . . . , N × N × · · · × N}. The element 1, . . . , m are projected into the space defined by the selected
n+1 features, pj = diag(xj ⊗ h), j = 1, . . . , m. Then the K near-
Qj (i0 , i1 , . . . , ij −1 , ij ) of the Q-table Qj demonstrates the est neighbors of pi is obtained using L2 norm. The received
expected value of received reward by selecting feature ij reward is formulated as
when the features i0 , i1 , . . . , ij −1 are already selected in j
previous steps. For example in a first order MDP, element
K
r= (fj (ωi ) × RC (j ) + (1 − fj (ωi )) × PN C (j )) (4)
(i0 , i1 ) of Q1 demonstrate the expected reward of selecting j =1
direction i1 when i0 is previously selected. The updating
equations for Q-learning algorithm is In (4) fj (ωi ) is a binary value function representing
whether the j th neighbor has the same class label as ωi or
Qj (i0 , i1 , . . . , ij −1 , ij ) not:
= Qj (i0 , i1 , . . . , ij −1 , ij ) ⎧
⎨1 if the j th neighbor has the same class
N fj (ωi ) = label as ωi (5)
+ α r + γ max Qj (i1 , i2 , . . . , ij , il ) ⎩
l=1 0 otherwise

− Qj (i0 , i1 , . . . , ij −1 , ij ) (3) RC (j ) and PN C (j ) are the reward and punishment that

the agent receives for correct and incorrect hits, respectively.
where Qj (i0 , i1 , . . . , ij −1 , ij ) is the expected reward of It means that, the agent receives reward RC (j ) when the
selecting feature ij when the agent is in the state (i0 , j th neighbor has the same class label as ωi and punishment
i1 , . . . , ij −1 ); Qj (i1 , i2 , . . . , ij , il ) is the expected reward of PN C (j ) otherwise. The maximum and minimum expected
K
selecting il one step after selecting feature ij ; r is the re- rewards are K j =1 RC (j ) and j =1 PN C (j ), respectively.
ceived reward of selecting feature ij ; α (0 < α ≤ 1) is the Each episode of RL has at last N steps (the number of avail-
learning rate; and γ (0 ≤ γ ≤ 1) is the discount factor. In able features). The episode could be terminated sooner if
the learning phase, the agent traverses the environment with the agent receives the maximum reward in Chits consequent
the ε Greedy policy, i.e. in each state the agent selects the steps. The agent must visit the feature space sufficiently
best action with probability 1 − ε while it is possible for the in order to learn effectively. The learning algorithm can be
agent to choose a random action with the probability of ε. summarized as shown in Table 1.
196 Int J Comput Vis (2009) 81: 191–204
Table 1 OLB derivation algorithm Table 2 Sorting the features for a nth order MDP
Algorithm OLB Learning Algorithm Sorting the features according to their appropriateness
Select randomly an OLB OLB(xi , ωi , Ti ) from the training dataset
{OLB(x1 , ω1 , T1 ), OLB(x2 , ω2 , T2 ), . . . , OLB(xi , ωi , Ti ), . . . , fC =1
OLB(xm , ωm , Tm )}. N

Initialize the corresponding Q-tables randomly. h = 1, 1, . . . 1

for iteration = 1 to _number_of_Episodes do
N a0 = arg max(Q0 (aj ))

j
h = 0, 0, . . . , 0
ordered_ features = (a0 )
repeat
h(a0 ) = 0
Select a feature ai = fi+1
s by ε greedy policy.
Update the selected feature vector, h(fi+1 s

) = 1. cState = (a0 )
Project all the representative points xj , j = 1, . . . , m
in training dataset into space defined by h using while f C < n do
pj = diag(xj ⊗ h). Select all the Q values described by cState in Qf C .
Find the class labels of the K-nearest neighbors of the This is a vector of length N which is called
projected data from pi . Row(Qf C ) = Qf C (cState, aj ), j = 1, . . . , N.
Update the corresponding cell of Q-table according to (4). af C = arg maxj (Row(Qf C )|h(j ) = 1)
until in Chits consequent steps, the agent receives the maximum ordered_ features = (ordered_ features, af C )
reward or if all the features are selected. rem_ features(af C ) = 0
end for cState = (a0 , a1 , . . . , af C )
fC =fC +1
end while
while f C ≤ N do
In this paper, we confined our experiments to the first or- Select all the Q values described by State of Qn , Row(Qn )
der MDPs and face spaces derived by applying statistical = (cState, aj ), j = 1, . . . , N. This vector has length N.
learning methods. As a result, in the context of our proposed af C = arg maxj (Row(Qn )|h(j ) = 1)
algorithm the words direction and feature are used inter- ordered_ features = (ordered_ features, af C )
h(af C ) = 0
changeably. The updating equations for Q-tables are given cState = (af C−n , af C−n+1 , . . . , af C )
in (6) and (7). Since the expected reward of selecting a direc- fC =fC +1
tion right after i0 is coded in Q1 , Q1 table is used to update end while
the values in Q0 .

N
Q0 (i0 ) = Q0 (i0 ) + α r + γ max Q1 (i0 , il ) − Q0 (i0 ) (6) in Qj , j = 0, 1, 2, . . . , n − 1, respectively. In each selection
l=1 step, the corresponding Q-table and the selected features are
used. After using the above process to find the first n features
Q1 (i0 , i1 ) = Q1 (i0 , i1 ) in order of appropriateness, the rest of features are selected
N

using the last n selected features and tracing the position
+ α r + γ max Q1 (i1 , il ) − Q1 (i0 , i1 ) (7) of maximum in the Qn table. Therefore, selection of each
l=1
feature beyond nth selected one depends only on Qn table
3.2 Selecting the Most Appropriate Features along with the last n selected features.
As an example, for a second order MDP, we use the
After completing the learning process, the set of most appro- Q0 to select the first optimal feature a0 . Then the se-
priate features for each OLB should be selected. Here, the lected feature and Q1 is used and the second appro-
term “appropriate” is shorthand for “appropriate for discrim- priate feature, a1 , is found by locating the maximum
ination of an OLB”. Not only less appropriate features can- considering a0 , i.e. a1 = arg maxj (Q1 (a0 , aj )). The re-
not provide better discrimination for the underlying OLB, maining appropriate features are obtained from Q2 using
but also they may deteriorate the effect of more appropriate al = arg maxj (Q2 (al−2 , al−1 , aj ) | h(j ) = 1), l = 2, 3, . . . ,
ones. Now, the question is how to determine the appropri- N − 1, recursively, where h is the binary vector that keeps
ate features for each OLB. To do this, firstly the features are track of the selected features in the previous steps. The
sorted according to their discrimination using the available pseudo code of acquiring the ordered set of appropriate fea-
Q-tables and then from the sorted features the optimal bi- tures is shown in Table 2.
nary vector Ti is extracted. After obtaining the ordered features, we have to select a
To obtain a set of features in descending order of ap- subset of them. Three different methods can be devised here:
propriateness, we use the already learned Q-tables in recall • Static method: Features with Q-values above a predefined
mode. For an nth order MDP, the first n appropriate fea- threshold are used in the decision process. A fixed thresh-
tures are obtained by selecting the position of the maximum old is used for all OLBs.
Int J Comput Vis (2009) 81: 191–204 197
• Adaptive method: Features with Q-values above a vary- 1, . . . , m into the space defined by Ti , pj = diag(xj ⊗ Ti ),
ing threshold are used in the decision process. The finding the label of K-nearest neighbors of pq in that space
varying threshold is selected proportional to the largest and calculating the following similarity measure:
Q-value or by a clustering method. In this method the
threshold varies for each OLB.
K
S(xq , OLB(xi , ωi , Ti )) = RC (j )fj (ωi ) (8)
• Validation method: It is also possible to split the data
j =1
into validation and test subsets and then find the optimal
features using the validation subset. In this method the fj (ωi ) in (8) is a binary value function that demonstrates
threshold varies for each OLB too. whether the class label of the j th neighbor of pq is ωi or
In this paper we use an adaptive method, which is based not (5). RC (j ) is the reward function defined in (4).
on a two classes clustering. This is partly due to its perfor- Based on (8), class similarity measure is defined
mance advantages over the static method and also its relative by fusing the similarity measures of all the OLBs be-
simplicity compared to the validation method. longing to it, i.e. S(xq , OLB(xi , , Ti )). For fusing the
S(xq , OLB(xi , , Ti )) different methods can be utilized in-
cluding max, median, sum and product rule. In our exper-
iments sum rule as shown in (9) led to higher recognition
4 Class Similarity Measure and OLB Classifier
accuracies.
When the training phase is finished and the most appro-
ni
priate features for each OLB are selected, we are ready to S = S(xq , OLB(xi , , Ti )) (9)
build an OLB-based Classifier. To this end, we need to as- i=1
sess the similarity of a query datum to all stored classes. where ni is the number of representative points in class i.
Simple similarity judgment based on ordinary distance mea- In order to classify a query input, we adopt a single stage
sures on feature space does not work here since OLBs decision making strategy where the similarity measure for
have different dimensions and features; as illustrated in all the classes are computed using (9) and the most similar
Fig. 4. In this figure, the Euclidean distance between xq match is assigned as the output of the classifier.
and OLB(x1 , ω1 , T1 ) is d1 = (xq,1 − x1,1 )2 + (xq,4 − x1,4 )2 At semantic level, reward and punishment terms in (4)
while the corresponding distance for OLB(x2 , ω2 , T2 ) is are comparable with within and between scatters or dis-
tances in a separability matrix or measure. In the classifi-
d2 = (xq,1 − x2,1 )2 + (xq,2 − x2,2 )2
cation stage we want to calculate the similarity of query xq
+ (xq,4 − x2,4 )2 + (xq,5 − x2,5 )2 . to class ωi represented by OLB(xi , ωi , Ti ). Since the pun-
ishment shows similarity of xq to the other classes and sim-
In order to make different bases comparable, we use the ilarity of xq to ωj , j = i is considered when (8) is calcu-
reward signal as the similarity measure. The similarity be- lated for OLB(xj , ωj , Tj ), j = i, we do not incorporate the
tween a query datum xq and OLB(xi , ωi , Ti ) is defined by punishments in (8), while an alternative version of (8) with
first projecting xq and all the representative points xj , j = punishment term is still conceivable.
Fig. 4 An illustration of
selected features for each OLB.
Due to differences in the
dimensions and the optimal
features of OLBs, it is
impractical to use Euclidean
norms as similarity measures
198 Int J Comput Vis (2009) 81: 191–204
5 Computational Complexity the gallery size permits) and the highest recognition rates
are reported. For random subspace LDA, Wang’s sugges-
The computational complexity of the learning algorithm de- tions for the number of subspaces and fixed dimensions are
pends on the order of MDP used for training. The computa- adopted and the majority voting fusion rule was used (Wang
tional complexity of the first order MDP is analyzed here as and Tang 2006). Again for random subspace LDA, the high-
this particular model is utilized in experiments of this paper. est recognition rates are reported.
Consider a first order MDP with N features. The OLB learn- In OLB algorithm, most appropriate features were se-
ing algorithm creates a Q1 table of size N × N . As every lected by the adaptive thresholding method with k-means
cell of this table must be visited by the agent sufficiently, we clustering. In the learning process of the experiments, we
chose the number of episodes of the RL algorithm equal to employed a k-NN classifiers with k = 3 whenever there
the number of cells of Q1 as N 2 . Since each episode maxi- were more than two prototypes per individual in the train-
mally consists of N steps, the average number of visits per ing set. The reward signal were RC = [3 2 1.5] and PN C =
cell is bounded by N . This ensures that every cells is visited [−1 − 1 − 1] in these cases. For the cases where there
almost N times in average. were only two prototypes per individual in the training set,
In each step of an episode, the agent finds a maximum a k = 2 k-NN classifier was used with RC = [3 2] and
and updates appropriate cells. Therefore, the computational PN C = [−1 − 1].
complexity of a first order MDP is equal to the number of In order to make a fair comparison with the SDA and ran-
episodes, O(N 2 ), multiplied by the number of steps, O(N ), dom subspace LDA, we trained the proposed method over
at most. This results in O(N 3 ) for each OLB. For higher LDA space. The number of features passed to OLB algo-
order MDPs, the computational cost is higher. To avoid rithm was C − 1, where C was the number of classes in each
high computational cost, authors are currently working on database. Consequently the number of learning episodes was
function approximation methods for estimation of Q-values. chosen as C 2 .
This approach drastically decreases the computational cost The studied methods are compared in terms of the recog-
(Sutton and Barto 1998). nition rate, its standard deviation and the number of features.
For every partition of train/test sets, we found the maximum
recognition rate of each benchmarking method over all of
6 Experimental Results its recognition-affecting parameters. The reported recogni-
tion rate for each method is the average of that maximum
We used AR (Martinez and Kak 2001), PIE (Sim et al. accuracy. The recognition-affecting parameters are the num-
2003), ORL (ORL database) and YALE (Yale University ber of features in Eigenface, the numbers of features and
Face Image Database) databases to evaluate the performance sub-clusters in SDA, and the numbers of subspaces and ran-
of the proposed method in different face orientations, ex- domly selected Eigenfaces in addition to the number of the
pressions, illumination situations and occlusions. In all ex- preserved ones in the random subspace LDA. For methods in
periments, no preprocessing except downsampling was per-
those the number of features affects the recognition rate, the
formed on the images. In each experiment, the image set was
recognition rate is checked for the top k features as k varies
partitioned into training and testing sets. For ease of rep-
from one to the maximum number of available features.
resentation, the experiments are named as Gm/P n which
The reported numbers of features are the average of num-
means that m images per individual are randomly selected
ber of features associated to the above mentioned maximum
for training and the remaining n images are used for testing.
accuracy over ten runs of each train/test partition. The num-
The experiments are repeated ten times for every randomly
ber of most appropriate features in the proposed method dif-
partitioned dataset.
fers from one OLB to another. Therefore, the reported values
In our method, Gm = 2 is the smallest possible num-
here are obtained by first averaging the number of most ap-
ber of training data per class. The reason is that the pro-
propriate features over all the OLBs and then averaging over
posed algorithm employs RL—which is a semi-supervised
ten runs of each train/test partition. The number of features
method—and like other semi-supervised/supervised algo-
for all cases was rounded to the nearest integer.
rithms needs more than one sample per class for learning.
The proposed method is benchmarked against the well-
known Eigenface (Turk and Pentland 1991), Fisherface 6.1 AR Database
(Belhumeur et al. 1997) and two state of the art methods;
SDA (Manli and Martinez 2006) and random subspace LDA The AR face database consists of frontal view images of
(Wang and Tang 2006). In SDA, the leave one out version over 100 people. The subjects were photographed twice at
was used. The number of subclasses in SDA—which is a a 2-week interval. During each session 13 conditions with
free parameter—was set to two, three and five (whenever varying facial expressions, illumination and occlusion were
Int J Comput Vis (2009) 81: 191–204 199
captured. Figure 5 shows an example of each condition. Ex- LDA is very significant. The proposed method can also be
periments on this database were carried out on the down- used in each subspace derived by random subspace LDA.
sampled images of size 96 × 64. Sixty randomly selected Another interesting property of the obtained results on
individuals of this database were used in our simulations. AR dataset is that when the number of gallery samples per
A random subset with Gm(= 3, 7, 11) images per individ- individual is increased, the number of features in OLB al-
ual was taken to form the training set, and the rest of the gorithm is decreased. This shows that when the number of
database was used for testing. The gallery images were se- gallery samples is increased, the OLB agent extracts the lo-
lected merely from the first imaging session. cal basis with fewer features; this is of course the result of
The recognition accuracies, standard deviations and the OLB local representation characteristic.
number of features are compared in Tables 3 and 4, respec-
tively. Table 3 reveals that the proposed algorithm outper-
6.2 PIE Database
forms the other studied methods in all cases. The difference
between the OLB and the other algorithms become more
significant when the number of gallery samples per individ- The CMU PIE face database contains 68 individuals with
ual is smaller. The improvements are obtained even by using 41368 face images. Examples of pose and illumination vari-
fewer features as shown in Table 4. ation in this database are shown in Fig. 6. The near frontal
The performance of the random subspace LDA and OLB poses (C05, C07, C09, C27 and C29) were used in our ex-
algorithm were close for Gm(= 7, 11) samples per individ- periments which resulted in having 170 images per subject
ual, however; the improvement that OLB has caused over under different illuminations and expressions. Experiments
Fig. 5 Images of one of the subjects in the AR database taken during one session
Table 3 Recognition rate and its standard deviation of the studied methods for AR database with different training set size
Method G3/P23 G7/P19 G11/P15
Eigenface (Turk and Pentland 1991) 63.66, σ = 7.1 74.7, σ = 5.1 77.15, σ = 2.8
Fisherface (Belhumeur et al. 1997) 75.75, σ = 5.6 86.0, σ = 1.1 87.70, σ = 3.0
SDA on Eigenface (Manli and Martinez 2006) 75.18, σ = 5.3 83.79, σ = 3.3 87.13, σ = 1.5
Random subspace LDA (Wang and Tang 2006) 77.72, σ = 6.4 89.74, σ = 1.9 91.04, σ = 0.8
Proposed method 82.47, σ = 5.2 90.33, σ = 1.8 92.07, σ = 0.7
Table 4 The number of

features employed by the Method G3/P23 G7/P19 G11/P15
studied methods for AR
database Eigenface (Turk and Pentland 1991) 163 372 359
Fisherface (Belhumeur et al. 1997) 59 59 59
SDA on Eigenface (Manli and Martinez 2006) 67 72 119
Random subspace LDA (Wang and Tang 2006) 20 × 59 20 × 59 20 × 59
Proposed method 28 16 13
200 Int J Comput Vis (2009) 81: 191–204
Fig. 6 Example images of PIE database
Table 5 Recognition rate and its standard deviation of the studied methods for PIE database with different training set size
Method G3/P167 G7/P163 G11/P159

studied methods for PIE
Fig. 7 Example images of ORL database
on this database were carried out on the cropped and down- ten different images for each of the 40 distinct subjects. The
sampled images of size 119 × 98. Similar to the previous ex- variations of the images are across pose, size, time, and fa-
periment, a random subset with Gm(= 3, 7, 11) images per cial expressions. Some of the images of this database are
individual was taken to form the training set, and the rest shown in Fig. 7. Experiments on this database were carried
of the images formed the testing set. The recognition accu- out on the original images of size 112 × 92. For ORL data-
racies, standard deviations and the number of features are base, a random subset with Gm(= 2, 3, 5) images per in-
compared in Tables 5 and 6, respectively. Table 5 demon- dividual was taken to form the training set, and the rest of
strates that the proposed approach performs better than the the database was used as the testing set. The recognition ac-
other studied methods. In addition, the improvement that curacies, standard deviations along with the number of fea-
OLB has caused over LDA is very significant. tures are compared in Tables 7 and 8, respectively. Similar
6.3 ORL Database to previous experiments, it can be observed that the OLB al-
gorithm is performing better than the other studied methods
ORL face database (developed at the Olivetti Research Lab- with fewer features. For small number of samples per indi-
oratory, Cambridge, UK) is composed of 400 images with vidual, the difference between the OLB algorithm and the
Int J Comput Vis (2009) 81: 191–204 201
Table 7 Recognition rate and its standard deviation of the studied methods for ORL database with different training set size

studied methods for ORL
Eigenface (Turk and Pentland 1991) 73 63 59
Fig. 8 Example images of YALE database
Table 9 Recognition rate and its standard deviation of the studied methods for YALE database with different training set size
Fisherface (Belhumeur et al. 1997) 46.37, σ = 3.7 63.58%, σ = 5.2 70.67, σ = 8.3
other algorithms is more significant. As in AR database, the to form the training set and the rest of the database were used
number of features in OLB is proportional to the inverse size for testing. The recognition accuracies, standard deviations
of the training samples per class. along with the number of features are compared in Tables 9
and 10, respectively. Like the previous experiments, the pro-
6.4 YALE Database posed algorithm outperforms the other studied methods with
fewer features.
Yale face database contains 165 images with 11 different im-
ages for each of the 15 distinct subjects. The 11 images per
subject are taken under different facial expression or config-
7 Further Discussions
uration: center-light, with glasses, happy, left-light, without
glasses, normal, right-light, sad, sleepy, surprised, and wink.
Some of the images of this database are shown in Fig. 8. In Sect. 6, we reported the performance of the proposed
Experiments on this database were carried out on the down- method on different databases. Now, we take a closer look
sampled images of size 32×32. For Yale database, a random at the properties of our proposed algorithm. Firstly the effect
subset with Gm(= 2, 3, 5) images per individual was taken of different face spaces is studied.
202 Int J Comput Vis (2009) 81: 191–204
studied methods for YALE
The OLB algorithm was trained on PIE database with

two different face spaces; Eigenface and Fisherface. In this
experiment, three images per individual were used as the
gallery data. A sample plot of the Q-values vs. PCA/LDA
feature numbers for the first 65 features for one of the gallery
samples is shown in Fig. 9. Examining these plots reveals the
following points:
• The directions corresponding to higher eigenvalues are
not usually the first and the best choice for discrimination.
For example the second and third directions in the upper
plot of Fig. 9 (PCA space) have negative values. This im-
plies that by selecting these features, the OLB agent has
received punishment, i.e. the agent has not successfully
discriminated its class from the other classes.
• Based on the Q-values demonstrated in the Fig. 9(b), it
can be seen that the LDA space is much more discrimina- (a)
tive for this sample of gallery. This observation suggests
that one can define a feature pool with different clusters
(like LDA cluster and PCA cluster). The OLB agent must
first decide which cluster is more appropriate for each
sample and then derive the optimal features based on the
selected cluster.
Another interesting observation is to study the order of
features in Eigenface space. We trained the OLB algorithm
over the first fifty directions of the space defined by PCA
for ORL database where three images per individual were
used for training. In this case, the Eigenface method reached
its best accuracy around 50 features. As a result, we did
not train the OLB algorithm for all the Eigenface space. In
Fig. 10 a contour map of the Q-values vs. feature number
for this experiment is shown. Each row of this figure corre-
sponds to a training image. The brighter regions of this plot
mean that the agent has received greater rewards by select- (b)
ing the corresponding features. Looking globally, the fea-
Fig. 9 Q-values vs. feature number for a gallery sample of PIE data-
tures correspond to larger eigenvalues are usually more im- base trained over (a) PCA space, (b) LDA space
portant than those correspond to smaller eigenvalues. This
observation is supported by random subspace LDA method
where the directions corresponding to higher eigenvalues are fect of this parameter, a sample experiment was performed
kept intact. However, the existence of samples with brighter by selecting thirteen images of the first session of the AR
regions seen for lower eigenvalues supports our idea to de- database as training data and all thirteen images of the sec-
velop OLB. ond session as the testing set and trained the OLB algorithm
Finally it is good to study the effect of choosing differ- with k(= 2, 3, 4, . . . , 12) k-NN classifiers. The recognition
ent k-NN in the training phase. In order to study the ef- accuracy vs. parameter k is shown in Fig. 11. It is evident
Int J Comput Vis (2009) 81: 191–204 203
8 Conclusion
In this paper, we introduced an object called Optimal Local

Basis (OLB) as a tool for face recognition. Each OLB was
characterized by a representative point in feature space, a
class label, and a set of locally optimal features. The locally
optimal features were derived from a larger set of features by
using a Reinforcement Learning (RL) approach. The rein-
forcement signal was designed to be correlated to the recog-
nition accuracy. For each representative point, the maximum
reward was obtained when those data with the same class la-
bel got closer to it in a Euclidean space.
In general, finding discriminant local bases can be mod-
eled as an optimization problem. There is a variety of meth-
ods for modeling and solving such optimization problems;
like evolutionary methods, dynamic programming, RL, etc.
The key differences of those methods are in the complex-
Fig. 10 Q-values vs. feature number for all the training images of
ity of the model and the optimization cost, along with the
ORL images where three images per individual is used as gallery. The
face space is derived by Eigenface method. Brighter regions corre- challenges in defining the cost (fitness) function. By assum-
spond to higher Q-values. Note that for some images the agent did ing the learning task as a Markov Decision Process (MDP),
not receive high rewards we reduced the search space considerably. Since the MDP
model is not fully known, we solved the optimization prob-
lem by using RL; which benefits from bootstrapping and
Monte-Carlo estimation of optimal values. In addition, by
using RL, defining a fitness function becomes simpler and
more flexible, as we just need local and very simple and dis-
crete reward and punishment signals. We also exploited the
reinforcement signal to construct a new non-metric classi-
fier. In our approach to classification, by utilizing the rein-
forcement signal, we do not need to define and tune a dis-
tance measure for each OLB.
The OLB method perfectly suits the face recognition
dilemma and problems with high dimensional feature space
and small number of training samples. The performance of
the proposed learning algorithm was examined on the face
recognition problem in different databases and with differ-
ent number of training images where LDA transform was
used to extract the face features. Nevertheless the proposed
method is theoretically independent from the type of feature
Fig. 11 Maximum recognition accuracy for different k-NN classifiers
space.
when the thirteen images of the first session of AR database are used
for training Our proposed method can be compared with “person-
specific” methods, where a set of exclusive features is ex-
tracted for each individual. Basically we can call the OLBs
“person-specific”, since each OLB is derived to represent an
that when the number of samples per class is small, the de- individual. However, the features associated to each OLB
signer has to choose a small k too. Interestingly, this figure are the most discriminant subset of features in a neighbor-
reveals that it is possible to get higher recognition accuracy hood of its representative point and each person may have
by selecting a small k even when the number of samples per several OLBs due to the complexity of face manifold. That
class is large. The reason for this observation is because, by means OLBs are local in addition to being person-specific.
choosing a large k, the OLB agent is forced to derive the We are working on an incremental version of our learning
best discriminant features for each representative point as algorithm, which enables the OLB method to easily adapt
global as possible. It contradicts OLB’s local nature and it to the changes cause by adding new classes to the data-
may result in reducing its generalization ability. base. Pattern classification can be grouped into close-set and
204 Int J Comput Vis (2009) 81: 191–204
open-set applications. In practice, the open-set classifica- Liu, C., & Wechsler, H. (2000). Evolutionary pursuit and its applica-
tion is a more challenging problem. The incremental version tion to face recognition. IEEE Transactions Pattern Analysis and
Machine Intelligence, 22(6), 570–582.
of the proposed learning algorithm can tackle the open-set
Liu, C., & Wechsler, H. (2003). Independent component analysis of
identification problem as well. In addition, as the proposed Gabor features for face recognition. IEEE Transactions on Neural
method extracts the most discriminant features for each pro- Networks, 14(4), 919–928.
totype in the training database, it could also be a suitable Lowe, D. (2004). Distinct image features from scale-invariant key-
candidate for face verification task. We plan to investigate points. International Journal of Computer Vision, 60(2), 91–110.
Lu, J., Plataniotis, K. N., & Venetsanopoulos, A. N. (2003). Face recog-
the appropriateness of our proposed algorithm for this task nition using kernel direct discriminant analysis algorithms. IEEE
in near future. Transactions on Neural Networks, 14(1), 117–126.
Ma, J. L., Takikawa, Y., Lao, E., Kawade, S., & Bao-Liang Lu, M.
(2007). Person-specific SIFT features for face recognition. In
IEEE international conference on acoustics, speech and signal
References processing (pp. 593–596).
Manli, Z., & Martinez, A. M. (2006). Subclass discriminant analysis.
Ahonen, T., Hadid, A., & Pietikäinen, M. (2006). Face descrip- IEEE Transactions on Pattern Analysis and Machine Intelligence,
tion with local binary patterns: Application to face recognition. 28(8), 1274–1286.
IEEE Transactions on Pattern Analysis and Machine Intelligence, Martinez, A. M., & Kak, A. C. (2001). PCA versus LDA. IEEE Trans-
28(12), 2037–2041. actions on Pattern Analysis and Machine Intelligence, 23(2), 228–
Bartlett, M. S., Movellan, J. R., & Sejnowski, T. J. (2002). Face recog- 233.
nition by independent component analysis. IEEE Transactions on O’Toole, A. J., Wenger, M. J., & Townsend, J. T. (1999). Quantita-
Neural Network, 13(6), 1450–1464. tive models of perceiving and remembering faces: precedents and
Belhumeur, P. N., Hespanha, J. P., & Kriegman, D. J. (1997). Eigen- possibilities. In M. J. Wenger & J. T. Townsend (Eds.), Compu-
faces vs. Fisherfaces: recognition using class specific linear pro- tational, geometric, and process perspectives on facial cognition:
jection. IEEE Transactions on Pattern Analysis and Machine In- Contexts and challenges (pp. 1–38).
telligence, 19(7), 711–720. Pentland, A., Moghaddam, B., & Starner, T. (1994). View-based and
Bicego, M., Lagorio, A., Grosso, E., & Tistarelli, M. (2006). On the modular Eigenspaces for face recognition. In IEEE international
use of SIFT features for face authentication. In IEEE international conference on computer vision and pattern recognition (pp. 84–
workshop on biometrics, in association with CVPR. 91).
Cai, D., He, X., Han, J., & Zhang, H. (2006). Orthogonal Laplacian- ORL database. Publicly available http://www.uk.research.att.com/
faces for face recognition. IEEE Transactions on Image Process- facedatabase.html.
ing, 15(11), 3608–3614. Yale University Face Image Database. Publicly available for non-
Carbon, C. C. (2003). Face processing, Early processing in the recog- commercial use http://cvc.yale.edu/projects/yalefaces/yalefaces.
nition of faces. Ph.D. Dissertation, Free University of Berlin. html.
Edelman, S., & Intrator, N. (1990). Learning as extraction of low- Roseis, S. T., & Saul, L. K. (2000). Nonlinear dimensionality reduction
dimensional representations. In D. Medin, R. Goldstone, & by locally linear embedding. Science, 290, 2323–2326.
P. Schyns (Eds.), Mechanisms of perceptual learning (pp. 353–
Sim, T., Baker, S., & Bsat, M. (2003). The CMU pose, illumination
380). New York: Academic Press.
and expression database. IEEE Transactions on Pattern Analysis
Ekenel, H. K., & Sankur, B. (2004). Feature selection in the indepen-
and Machine Intelligence, 25(12), 1615–1618.
dent component subspace for face recognition. Pattern Recogni-
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning, an intro-
tion Letters, 25(12), 1377–1388.
duction. Cambridge: MIT Press.
Feng, G. C., Yuen, P. C., & Dai, D. Q. (2002). Human face recognition
using PCA on wavelet subband. Electronic Imaging, 9, 226–233. Turk, M., & Pentland, A. (1991). Eigenfaces for recognition. Cognitive
Harandi, M., Nili Ahmadabadi, M., & Araabi, B. N. (2004). Face Neuroscience, 3, 71–86.
recognition using reinforcement learning. In IEEE international Wang, X., & Tang, X. (2006). Random sampling for subspace face
conference on image processing (pp. 2709–2712). recognition. International Journal of Computer Vision, 70(1), 91–
Harandi, M., Nili Ahmadabadi, M., Araabi, B. N., & Lucas, C. (2004). 104.
Feature selection using genetic algorithm and its application to Yamaguchi, O., Fukui, K., & Maeda, K. I. (1998). Face recognition
face recognition. In IEEE International Conference on Cybernet- using temporal image sequence. In IEEE international conference
ics and Intelligent Systems (pp. 1367–1372). on automatic face and gesture recognition (pp. 318–323).
He, X., Yan, S., Hu, Y., Niyogi, P., & Zhang, H. (2005). Face recogni- Yang, J., Zhang, D., Frangi, A. F., & Yang, J.-Y. (2004). Two-
tion using Laplacianfaces. IEEE Transactions on Pattern Analysis dimensional PCA: a new approach to appearance-based face rep-
and Machine Intelligence, 27(3), 328–340. resentation and recognition. IEEE Transactions on Pattern Analy-
Kim, T. K., & Kittler, J. (2005). Locally linear discriminant analysis sis and Machine Intelligence, 26(1), 131–137.
for multimodally distributed classes for face recognition with a Yang, J., Frangi, A. F., Yang, J. Y., Zhang, D., & Jin, Z. (2005). KPCA
single model image. IEEE Transactions on Pattern Analysis and plus LDA: a complete kernel Fisher discriminant framework for
Machine Intelligence, 27(3), 318–327. feature extraction and recognition. IEEE Transactions on Pattern
Kim, K. I., Jung, K., & Kim, H. J. (2002). Face recognition using ker- Analysis and Machine Intelligence, 27(2), 230–244.
nel principal component analysis. IEEE Signal Processing Let- Zhao, W., Chellappa, R., Phillips, P., & Rosenfeld, A. (2003). Face
ters, 9(2), 40–42. recognition: a literature survey. ACM Computing Surveys, 35(4),
Liu, C. (2004). Gabor-based kernel PCA with fractional power polyno- 399–458.
mial models for face recognition. IEEE Transactions on Pattern Zheng, W. S., Lai, J. H., & Yuen, P. C. (2005). GA-fisher: a new LDA-
Analysis and Machine Intelligence, 26(5), 572–581. based face recognition algorithm with selection of principal com-
Liu, H., & Motoda, H. (1998). Feature extraction, construction and ponents. IEEE Transactions on Systems, Man and Cybernetics,
selection: a data mining perspective. Norwell: Kluwer Academic. Part B, 35(5), 1065–1078.

Harandi IJCV2009 Published

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Harandi IJCV2009 Published

Uploaded by

Copyright:

Available Formats

Int J Comput Vis (2009) 81: 191–204

Optimal Local Basis: A Reinforcement Learning Approach

(Belhumeur et al. 1997), Subclass Discriminant Analysis

− Qj (i0 , i1 , . . . , ij −1 , ij ) (3) RC (j ) and PN C (j ) are the reward and punishment that

Initialize the corresponding Q-tables randomly. h = 1, 1, . . . 1

Update the selected feature vector, h(fi+1 s

Method G3/P23 G7/P19 G11/P15

Table 4 The number of

Fig. 6 Example images of PIE database

Method G3/P167 G7/P163 G11/P159

Table 6 The number of

Fig. 7 Example images of ORL database

Method G2/P8 G3/P7 G5/P5

Table 8 The number of

Fig. 8 Example images of YALE database

Method G2/P9 G3/P8 G5/P6

The OLB algorithm was trained on PIE database with

In this paper, we introduced an object called Optimal Local

You might also like