
Optical flow

Optical flow is a concept for estimating the motion of objects within a visual
representation. Typically the motion is represented as vectors originating or terminating
at pixels in a digital image sequence.
Optical flow is useful in pattern recognition, computer vision, and other image processing
applications. It is closely related to motion estimation and motion compensation. Often
the term optical flow is used to describe a dense motion field with vectors at each pixel,
as opposed to motion estimation or compensation which uses vectors for blocks of pixels,
as in video compression methods such as MPEG. Optical flow has also been considered for
collision avoidance and altitude-acquisition systems for unmanned air vehicles (UAVs).

Motion Flow in Computer Vision

Introduction
Generation of Optical Flow
Optical Flow and 3-D Motion

Introduction
Until now we have dealt primarily with a single image, the type of information encoded in
it, and the means of extracting that information. In some evolutionary circles, it is
believed that estimating the motion of predators advancing on a mobile animal was
important to the animal's ability to take flight from the predator and hence survive. In this
lecture, we will deal with the problem of recovering the motion of objects in the 3-D
world from the motion of segments on the 2-D image plane.
The technical problem with estimating the motion of objects in 3-D is that in the image
formation process, due to the perspective projection of the 3-D world onto the 2-D image
plane, some of the information is lost. We will now address several ways of recovering
the 3-D information from 2-D images using various ``cues''. These cues are motion,
binocular stereopsis, texture, shading and contour. In this lecture we will content
ourselves with studying motion flow.
If the projection of a 3-D point P = (X, Y, Z) onto the image plane is a point p with image
coordinates (x, y), simple inversion of the perspective projection equation in order to
obtain 3-D information will not work, since there are infinitely many points in 3-D which
would get projected to the same point (x, y) in the image plane, all lying on a line going
through the center of projection and the point p (see Figure 1). Thus, some additional
information is needed in order to recover the 3-D structure from 2-D images. One
possible way to extract this 3-D information is from time-varying image sequences. This
3-D information is crucial for performing certain tasks, such as manipulation, navigation,
and recognition.
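To see the ambiguity explicitly, recall the perspective projection equation (a standard form with focal length f; the notes do not fix the notation, so the symbols here are assumptions):

$$x = f\,\frac{X}{Z}, \qquad y = f\,\frac{Y}{Z}.$$

Every 3-D point on the ray P(λ) = λ(x, y, f), λ > 0, satisfies these equations for the same image point (x, y), so the depth Z cannot be recovered from a single view.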

Figure 1: Displacement of a point in the environment causes a displacement of the
corresponding image point. The relationship between the velocities can be found by
differentiating the perspective projection equation. For more details see B. Horn, ``Robot
Vision'', MIT Press, 1986, Chapters 12 and 17.
There is a lot of biological motivation for studying motion flow (besides avoiding being
eaten by sabre-toothed tigers, that is):
1. Human beings do it all the time without even realizing it; for example, this is why
we have saccadic eye movements (that is, our eyes jump from focusing on one spot
to another). Thus, even if the scene has no motion and we are still, our eyes are
moving. There are celebrated experiments on the eye movements of people looking
at the Mona Lisa, for instance, showing the gaze darting from the eyes to the lips
to the mouth and then to the hair, and so on.
2. A simple experiment can demonstrate how motion can reveal something about
3-D structure. Fixate on something close by and something very far away while
moving your head (either sideways or forward and backward); you will notice that
the retinal image of a nearby tree moves more than that of a distant tree, i.e., the
motion in the retinal plane is inversely proportional to the distance from the
retinal plane.
3. There are a few examples from the animal world where motion helps animals
obtain information about the structure of the environment, e.g., pigeons move
their necks to obtain the so-called ``motion parallax''.

Generation of Optical Flow


If the camera (or human eye) moves in the 3-D scene, the resulting apparent motion in
the image is called the optical flow. The optical flow describes the direction and the
speed of motion of the features in the image. We will now show how to compute the
optical flow from a sequence of images and then derive equations relating the 3-D
structure of the world to our 2-D measurements.

An example of the optical flow pattern computed from a sequence of images of a rotating
Rubik's cube (shown in Figure 2) is given in Figure 3.

Figure 2: A Rubik's cube on a rotating turntable, taken from Russell and Norvig, ``AI, A
Modern Approach'', Prentice Hall, 1995, Figure 24.8, pg. 736.

Figure 3: Flow vectors calculated from comparing the two images of a Rubik's cube,
taken from Russell and Norvig, ``AI, A Modern Approach'', Prentice Hall, 1995, Figure
24.9, pg. 737.
The optical flow vector (u, v) has two components, u and v, describing the motion of a
point feature in the x and y directions in the image plane, respectively. In order to be able
to measure optical flow we need to find corresponding points between two frames. One
possibility is to exploit the similarity between the image patches surrounding the
individual points. There are two measures of similarity that are commonly used for
computing optical flow vectors. One is the sum of squared differences (SSD) between an
image patch centered at a point (x, y) at time t and various other candidate locations
(x + δx, y + δy) to which that patch could have moved between the two frames at times t
and t + δt. The goal here is to find the displacement (δx, δy) in the image plane which
minimizes the SSD criterion:

$$\mathrm{SSD}(\delta x, \delta y) = \sum_{(x', y')} \bigl[ E(x', y', t) - E(x' + \delta x,\, y' + \delta y,\, t + \delta t) \bigr]^2,$$

where the summation ranges over the image patch centered at the feature of interest, and
E(x, y, t) denotes the image brightness at location (x, y) and time t. The optical flow of
the chosen point feature is then (u, v) = (δx/δt, δy/δt). An alternative measure of
similarity would be to maximize the cross-correlation between the corresponding patches
in the two images, expressed as

$$C(\delta x, \delta y) = \sum_{(x', y')} E(x', y', t)\; E(x' + \delta x,\, y' + \delta y,\, t + \delta t).$$
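As a concrete illustration, here is a minimal sketch of the SSD search in Python with NumPy (an assumption; the notes contain no code). The function name and parameters are hypothetical, and the feature is assumed to lie far enough from the image border that every candidate patch fits inside both frames:

    import numpy as np

    def ssd_flow(frame1, frame2, x, y, half_patch=7, search=10, dt=1.0):
        """Estimate the flow of the feature at (x, y) by minimizing the SSD
        between a patch of frame1 and shifted patches of frame2."""
        patch = frame1[y - half_patch : y + half_patch + 1,
                       x - half_patch : x + half_patch + 1].astype(float)
        best_ssd, best_shift = np.inf, (0, 0)
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                candidate = frame2[y + dy - half_patch : y + dy + half_patch + 1,
                                   x + dx - half_patch : x + dx + half_patch + 1].astype(float)
                ssd = np.sum((patch - candidate) ** 2)
                if ssd < best_ssd:
                    best_ssd, best_shift = ssd, (dx, dy)
        # The flow vector is the minimizing displacement divided by the time step.
        return best_shift[0] / dt, best_shift[1] / dt

Maximizing the cross-correlation instead amounts to replacing the squared-difference sum with np.sum(patch * candidate) and keeping the maximum rather than the minimum.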

It is important to note that the optical flow is not uniquely determined by local
information. For example, consider two local patches of the images observed at two
consecutive times (see Figure 4).

Figure 4: In spite of the fact that the dark square moved between the two consecutive
frames, observing purely the cut-out patch we cannot observe any change, or we may
assume that the observed pattern moved arbitrarily along the direction of the edge.

The fact that one cannot determine the optical flow along the direction of the brightness
pattern is known as the aperture problem. Have you ever wondered what causes the
barber pole illusion, where the pole appears to spiral up and out? That, too, is the
aperture problem at work. Indeed, full optical flow is best determined at corners.
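A tiny numerical experiment (a sketch in Python with NumPy; the image and aperture sizes are made up) makes the aperture problem concrete: viewed through a small aperture, a vertical edge shifted along its own direction is indistinguishable from the original, while a shift across the edge is detected immediately.

    import numpy as np

    # A vertical edge: left half dark, right half bright.
    frame = np.zeros((100, 100))
    frame[:, 50:] = 1.0

    aperture = (slice(40, 60), slice(40, 60))       # a small cut-out patch
    moved_along_edge = np.roll(frame, 5, axis=0)    # shift 5 px along the edge
    moved_across_edge = np.roll(frame, 5, axis=1)   # shift 5 px across the edge

    # Motion along the edge is invisible through the aperture (SSD = 0),
    # while motion across the edge changes the patch (SSD > 0).
    print(np.sum((frame[aperture] - moved_along_edge[aperture]) ** 2))   # 0.0
    print(np.sum((frame[aperture] - moved_across_edge[aperture]) ** 2))  # 100.0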

Optical Flow and 3-D Motion


Now we want to relate the motion of the camera frame to the optical flow pattern which it
generates. For simplicity we will again concentrate on a single point with 3-D coordinates
(X, Y, Z). Assume that the camera moves with translational velocity (U, V, W) and
angular velocity ω = (A, B, C), and that the observed scene is stationary (see Figure 5).
This is referred to as egomotion.

Figure 5: Showing the motion of the camera relative to a point.

By differentiating the equation for the perspective projection (taking unit focal length for
simplicity), we find that the optical flow (u, v) of a point caused by the motion of the
camera is:

$$u = \frac{-U + xW}{Z} + Axy - B(1 + x^2) + Cy, \qquad
v = \frac{-V + yW}{Z} + A(1 + y^2) - Bxy - Cx, \qquad (1)$$

where Z = Z(x, y) gives the depth of the scene point corresponding to the image point
(x, y). This equation is complicated, but can be understood better for the case of pure
translation, that is, ω = (0, 0, 0), in which case the flow field becomes:

$$u = \frac{-U + xW}{Z}, \qquad v = \frac{-V + yW}{Z}. \qquad (2)$$

It is of interest to note that (u, v) = (0, 0) at the point (x₀, y₀) = (U/W, V/W). This
point is called the focus of expansion. Using the focus of expansion as the origin, that is,
defining x' = x − U/W and y' = y − V/W, it follows that

$$u = \frac{x'W}{Z}, \qquad v = \frac{y'W}{Z}. \qquad (3)$$

The useful thing about equation (3) is that it enables us to determine the time to impact
with an object at distance Z away, given by τ = Z/W. You can imagine that this is
important for navigation and also for your well-being.
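To make this concrete, here is a minimal sketch in Python (an assumption; the notes contain no code) that evaluates the pure-translation flow field of equation (2), its focus of expansion, and the time to impact. The velocity and depth values are made up for illustration, and unit focal length is assumed as above:

    U, V, W = 0.2, 0.1, 1.0   # hypothetical camera translation
    Z = 5.0                   # hypothetical depth of the observed point

    def translational_flow(x, y):
        """Flow (u, v) at image point (x, y) under pure translation, eq. (2)."""
        return (-U + x * W) / Z, (-V + y * W) / Z

    x0, y0 = U / W, V / W                  # focus of expansion: flow vanishes here
    print(translational_flow(x0, y0))      # -> (0.0, 0.0)
    print("time to impact:", Z / W)        # tau = Z / W, from equation (3)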
Another use of motion flow is in rendering. Here you take multiple 2-D views of a fixed
scene and use them to reconstruct the 3-D shape of the object. An example of this is given
in Figure 6.

Figure 6: 3-D reconstruction of a house from four photographs taken from different
locations (left) and the photograph of the house taken from the same location (right).
Figure taken from Russell and Norvig, ``Artificial Intelligence: A Modern Approach'',
Prentice Hall, 1995, Figure 24.11, page 738.

Fancier versions of this are due to Debevec, Taylor and Malik.

Introduction to Optical Flow


Much of the image registration algorithm we chose to use centers on the idea of
projective flow between images. Here we will look at the somewhat simpler concept of
optical flow, which has been used in computer vision and other applications for decades.
Let's say you only have a one-dimensional image, A(x), where A gives the pixel intensity
at each possible location x. You could then translate all the pixels in that image by a
value Δx to generate the image B(x). Another way to view this is to think of both images
as parts of one function E with an additional parameter t, for time. This extension comes
naturally because a difference between the image captured in A and that captured in B
implies that time has elapsed between capturing them as well. This gives us the following
equations for E(x, t):
E(x, t) = A(x)
(1)
E(x + Δx, t + Δt) = B(x)
(2)
Extending this into two dimensions is not too hard to imagine, as we would simply have
an additional parameter, y. From this information, we can calculate derivatives at each
point to determine the optical flow between the two images; assuming brightness is
conserved between frames, the derivatives of E are linked to the flow (u, v) by the
standard constraint E_x·u + E_y·v + E_t = 0. One way to view these derivatives is to plot
a vector field using the x and y derivatives. Vector fields give a decent picture of the
motion from one image to the next, an example of which is included below.
Sample Optical Flow Field

Figure 1: Optical flow field from a taxi driving down a street. (Source:
http://www.kfunigraz.ac.at/imawww/invcon/pictures/flowfield.gif)
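Here is a minimal sketch of this derivative-based approach in Python with NumPy (an assumption; the original text shows no code). It uses the brightness-constancy constraint above and solves for one flow vector over a small window by least squares, in the spirit of the classic Lucas-Kanade method; the function name and parameters are hypothetical:

    import numpy as np

    def window_flow(frame1, frame2, y, x, half_win=7):
        """Least-squares flow at (x, y) from image derivatives, assuming the
        flow is constant inside the window and the window fits in the image."""
        f1 = frame1.astype(float)
        f2 = frame2.astype(float)
        # Spatial and temporal derivatives via finite differences.
        Ex = np.gradient(f1, axis=1)
        Ey = np.gradient(f1, axis=0)
        Et = f2 - f1
        win = (slice(y - half_win, y + half_win + 1),
               slice(x - half_win, x + half_win + 1))
        # Stack one constraint Ex*u + Ey*v = -Et per pixel and solve for (u, v).
        A = np.stack([Ex[win].ravel(), Ey[win].ravel()], axis=1)
        b = -Et[win].ravel()
        (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
        return u, v

Evaluating window_flow on a grid of points and plotting the resulting (u, v) vectors yields a flow field like the one shown in Figure 1.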
With this in mind, we can now take a look at how the image registration algorithm we
used builds upon the concept introduced here.
