Introduction To The First IEEE Workshop On Face Processing in Video


The First IEEE Workshop on Face Processing in Video


June 28, 2004, Washington, D.C., USA
www.visioninterface.net/fpiv04


Introduction to the First IEEE Workshop on Face Processing in Video

What makes face processing in video special 

Since video cameras became affordable and computers became powerful enough to process video in real time, we have seen
tremendous interest from both academia and industry in vision-based, human-oriented applications. These applications include
public surveillance, information security, biometrics, computer-human interaction, multimedia, immersive and collaborative
environments, video conferencing, video coding and annotation, computer games, and entertainment, to name a few.

A task of prime importance in all of these applications is analyzing video data for the presence of information about human faces. This
involves such problems as face detection, face tracking, and, of course, face recognition. The problem of recognizing faces from video,
however, should not be considered a mere extension of the problem of recognizing faces in photographs, since there are a few
principal differences between the two, in terms of both the nature of the processed data and the approaches used.

On the one hand, because of real-time, bandwidth, and environmental constraints, video processing has to deal with much lower resolution
and image quality than photograph processing. Even assuming that the lighting conditions are perfect when taking a
video snapshot, which is rarely true, the object of interest may be located too far from the camera or at an angle that makes recognition
very difficult. On the other hand, video images can be easily acquired and they can capture the motion of a person. This makes it
possible to track people until they are in a position convenient for recognition.

Besides that, image-based face recognition traditionally belongs to the field of pattern recognition, and as such is mainly driven by
mathematical principles. By contrast, video-based face recognition can also be approached by using neurobiological principles, the
study of which may hopefully result in making the performance of computer vision systems closer to that of biological vision systems.

The difference between recognizing faces from photographs and recognizing them from video can be easily seen in Figure
1, which shows a photograph and a snapshot of a News video program downloaded from the Internet. This figure can also be
used to discover the way biological vision systems (such as that of the reader of this paper) approach a face recognition problem. For
this purpose, the reader is invited to recognize the faces in the figure.

[Figure 1: (a) a photograph; (b) a snapshot of a News video program]

Figure 1. A test for examining the way facial recognition is performed by biological systems.
Try to recognize the faces in these images.
(When trying to recognize the faces shown in (a) and (b), we first detect face-looking regions. Then, for the face in photograph (a), we rotate our face (or the
page) to align our eyes with the eyes in the photograph, after which we might be able to recognize John Lennon in his last year of life. This is also the position
we would use, should we wish to memorize this face. For image (b), which is a snapshot of a News video program downloaded from the
Internet, we can easily locate two faces but need to look very closely in order to identify the two persons as Paul McCartney and Vladimir Putin (the video was taken
shortly after the singer's concert on Red Square in May last year). We also note the difference in resolution and quality between the photograph
image (a) and the video image (b). The face orientation is another factor that makes recognition in the video difficult.)
The test is aimed at showing the classification and the hierarchy of face processing tasks as presented at and covered by this workshop.

As we try to recognize a face in an image or a scene, we notice the following division and hierarchy of face processing tasks. First we
scan a scene to localize the areas where the face is located, which defines the face segmentation task. Then we approach the area of
interest and detect the presence of a face there (the face detection task). Then we follow the face (the tracking task), until it appears in
the position convenient for recognition, which, in the case of faces, is an eye-to-eye position (eye detection and face modeling tasks).
Only then do we attempt to assert whether the face is familiar or not. If it is familiar, we recognize it (the recognition task), and if it does
not look familiar, we memorize it (the memorization task). These and other face processing tasks are summarized in Figure 2.
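The ordering of tasks described above can be illustrated with a small sketch. This is not code from the workshop or from any of its papers: the `Frame` fields, the `process_video` function, and the `person-N` labels are hypothetical stand-ins, and a real system would compute each flag from pixel data rather than receive it ready-made.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Frame:
    """Hypothetical per-frame observations (assumed, not computed from pixels)."""
    has_face_region: bool   # segmentation found a face-like area
    face_confirmed: bool    # detection confirmed a face in that area
    eyes_frontal: bool      # face is in an eye-to-eye position
    signature: str = ""     # stand-in for a canonical face representation


def process_video(frames: List[Frame], memory: Dict[str, str]) -> List[str]:
    """Return the tasks performed, in the order described in the text."""
    log: List[str] = []
    for frame in frames:
        log.append("segmentation")
        if not frame.has_face_region:
            continue
        log.append("detection")
        if not frame.face_confirmed:
            continue
        log.append("tracking")
        if not frame.eyes_frontal:
            continue  # keep following the face until the pose is convenient
        log.append("eye detection and face modeling")
        if frame.signature in memory:
            log.append("recognition: " + memory[frame.signature])
        else:
            # Unfamiliar face: store it under a made-up label.
            memory[frame.signature] = f"person-{len(memory) + 1}"
            log.append("memorization")
        break  # stop once the face has been recognized or memorized
    return log


if __name__ == "__main__":
    known: Dict[str, str] = {}
    clip = [Frame(True, True, False), Frame(True, True, True, "sig-a")]
    print(process_video(clip, known))  # first viewing ends in memorization
    print(process_video(clip, known))  # second viewing ends in recognition
```

The sketch makes one point from the text explicit: recognition or memorization is attempted only after tracking has brought the face into an eye-to-eye position.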

We do not claim that this is the exact order in which humans recognize faces (for example, facial expression and orientation can be retrieved
without retrieving the face position), but this is the order used to organize the papers presented at the workshop.

Figure 2. Categorization and hierarchy of tasks performed in face processing in video. 

Papers

Thirty papers were selected out of 43 submissions for presentation at the workshop. The papers can now be retrieved from the
IEEE Xplore digital library.

As it might be difficult to evaluate the video-based approaches presented in the papers by viewing only the video snapshots shown,
many authors have also submitted links to the actual video demos, which can be downloaded from the Internet for viewing. These links,
as well as the links to the related project websites, are made available at the workshop's website at
http://www.visioninterface.net/fpiv04. The bibtex file with the list of all workshop papers is also made available at
http://www.visioninterface.net/fpiv04/fpiv04.bib.

A summary of the papers can also be found at the workshop's website.

About the workshop logo 

The logo designed for the workshop, which appears as an animated image at the workshop's website, was developed by the workshop's
chair to illustrate some peculiarities of processing faces in video. In video, a face is often arbitrarily oriented and
captured in low resolution and under poor lighting conditions. It can also be blurred because of motion. At the same time, video allows
one to capture facial motion, which makes it possible to localize and recognize a face from blinking, for example. The canonical face
representation, which is the base face representation used to memorize and recognize faces from video, is often eye-centered and uses
only the central part of the face. Commonly it is also chosen to be of the lowest possible resolution at which the face is still
recognizable. In particular, one of the most frequently used canonical face sizes is 24 x 24 pixels, which allows one to describe the
natural symmetry of a human face using 16 equal blocks, with the eyes located at the intersection of the upper blocks and the mouth
located at the intersection of the lower blocks. Face recognition on black-and-white images is just as good as recognition on colour images.
Besides, many recognition techniques work on binary features extracted from the face. The image also shows that the eyes are the
most salient features in a human face, immediately capturing the observer's attention, while hair is not. The image also shows that
despite the low-resolution, binary representation of the face, it is still possible for humans to classify it as the face of a man or a woman, and
to see that it is the face of the same person, even ... as the age difference between the two images is almost thirty years.
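As a rough illustration of the 24 x 24 canonical size, the sketch below splits such an image into 16 equal blocks and derives one binary value per block. It assumes that "16 equal blocks" means a 4 x 4 grid of 6 x 6 pixel blocks, which is one natural reading of the text. The function names and the feature rule (block mean compared with the whole-image mean) are illustrative assumptions, not a technique taken from the workshop papers.

```python
from typing import List

FACE_SIZE = 24             # canonical face size in pixels (from the text)
GRID = 4                   # assumed 4 x 4 layout, giving 16 equal blocks
BLOCK = FACE_SIZE // GRID  # 6 pixels per block side

Image = List[List[int]]    # grey-level image as rows of pixel intensities


def split_into_blocks(image: Image) -> List[Image]:
    """Return the 16 blocks in row-major order (top-left block first)."""
    if len(image) != FACE_SIZE or any(len(row) != FACE_SIZE for row in image):
        raise ValueError("expected a 24 x 24 image")
    blocks: List[Image] = []
    for br in range(GRID):
        for bc in range(GRID):
            rows = image[br * BLOCK:(br + 1) * BLOCK]
            blocks.append([row[bc * BLOCK:(bc + 1) * BLOCK] for row in rows])
    return blocks


def binary_block_features(image: Image) -> List[int]:
    """Illustrative rule: 1 if a block is at least as bright as the whole image."""
    image_mean = sum(map(sum, image)) / (FACE_SIZE * FACE_SIZE)
    features: List[int] = []
    for block in split_into_blocks(image):
        block_mean = sum(map(sum, block)) / (BLOCK * BLOCK)
        features.append(1 if block_mean >= image_mean else 0)
    return features


if __name__ == "__main__":
    # Synthetic test pattern: dark upper half, bright lower half.
    pattern = [[0] * FACE_SIZE for _ in range(12)] + [[255] * FACE_SIZE for _ in range(12)]
    print(binary_block_features(pattern))
```

With this layout the face is summarized by only 16 binary values, which shows how compact a low-resolution, binary representation can be.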

Acknowledgements 

Finally, I would like to thank all authors of the submitted papers. Their participation has made the First IEEE Workshop on Face Processing in
Video a real success and an inspiration for future workshops in this new and exciting area of research.

Dmitry O. Gorodnichy, FPIV'04 Program Chair

Copyright © 2004 
