Digital Processing Tools: 2.1 Spatial Transformations
This section gives an overview of spatial transformations, with emphasis on the techniques most useful for pixel registration; the focus is on those transforms used in the relevant areas to correct or perturb the imagery.
2.1.1 Definitions
Each point in the input and output images is related by a geometric relationship defined by a spatial transformation. An input image is an array of discrete intensity values whose coordinates are given by the row and column numbers. The output image comprises the transformed data. There are two possible ways of mapping the transformation: forward mapping and inverse mapping. In forward mapping, the value of each pixel in the input image is copied onto the output image at real-valued positions determined by the transformation function. In inverse mapping, the value of each pixel in the output image is calculated from the values of the input image, where the contributing pixels are determined by the mapping function. In the world of discrete images made up of pixels, forward mapping may leave “holes” and “hot spots” in the output image. This problem is remedied by inverse mapping, which guarantees that each and every pixel of the output image is filled. A detailed description of each process is given in the following sections for grayscale images, where each pixel is represented by a single intensity value; the same methods can be applied to each channel of colour (RGB) images, namely the Red, Green and Blue channels.
The forward mapping can be expressed as

Out(Xf(x, y), Yf(x, y)) = In(x, y)    (2.1)

where In and Out refer to the input and the output images, and Xf and Yf refer to the forward mapping functions in the x and y directions, respectively. Since the output array
is also a discrete one, a point-to-point mapping will lead to complications. Each pixel lies on an integer lattice; even though it has a finite area, a single value represents that pixel, and the only place it can go in the output array is again on the integer lattice. Therefore, fractional displacements in the mapping function have to be either truncated or rounded to the nearest integer value. Figure 2-1 illustrates these two cases: pixels A and B are mapped with exactly the same translation, except that pixel A’s translation has been truncated to its integer value and pixel B’s translation has been rounded to the nearest integer value. Their translation in the y direction, being less than half a pixel, equates to zero under either the truncation or the rounding scheme. However, their translation in the x direction, which is about 1.9 pixels, is realized as 1 pixel for pixel A and 2 pixels for pixel B. The only difference between the two schemes is that rounding to the nearest integer yields half-pixel precision, compared to full-pixel precision with truncation.
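As a minimal sketch of the two snapping schemes (the function name and the pure-translation warp are illustrative assumptions, not from the text), point-to-point forward mapping might look like:

```python
import numpy as np

def forward_map_translate(src, dx, dy, mode="round"):
    """Point-to-point forward mapping of a fractional translation.

    Each input pixel value is copied to the output at its translated
    position, snapped to the integer lattice either by truncation or
    by rounding.  (Hypothetical helper for illustration only.)
    """
    h, w = src.shape
    out = np.zeros_like(src)
    snap = np.floor if mode == "trunc" else np.round
    for y in range(h):
        for x in range(w):
            xo = int(snap(x + dx))
            yo = int(snap(y + dy))
            if 0 <= xo < w and 0 <= yo < h:
                out[yo, xo] = src[y, x]
    return out

# A translation of (1.9, 0.3) is realized as (1, 0) under truncation
# and as (2, 0) under rounding, mirroring pixels A and B in Figure 2-1.
```

Note that the snapped destinations of several input pixels can collide, which is exactly the “hot spots” problem described above; equally, some output pixels may never be written, leaving “holes”.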
[Figure 2-1: Point-to-point forward mapping on a 5×5 pixel grid; the translation of pixel A is truncated and that of pixel B is rounded to the integer lattice.]
An alternative is to map the four corners of each input pixel, so that the pixel is projected as a quadrilateral patch onto the output image. In Figure 2-2, the transformation of pixel A results in a patch straddling six output pixels, and a strategy other than truncating or rounding needs to be used in order to properly integrate the input contributions at each output pixel. This is done by the use of an accumulator array, which collects at each output pixel the contributed intensity in proportion to the area overlapping that output pixel.
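A sketch of the accumulator-array idea, under the simplifying assumption of a pure fractional translation (for which the area overlaps of the projected unit square reduce to bilinear weights over at most four output pixels — a general warp would need the polygon intersection tests discussed below):

```python
import numpy as np

def splat_translate(src, dx, dy):
    """Forward mapping into an accumulator array.

    Each input pixel is treated as a unit square whose intensity is
    distributed over the output pixels it overlaps, weighted by the
    overlap area.  For a pure translation the projected square covers
    at most four output pixels, so the area weights are bilinear.
    (Illustrative sketch; the function name is an assumption.)
    """
    h, w = src.shape
    acc = np.zeros((h, w))
    for y in range(h):
        for x in range(w):
            xf, yf = x + dx, y + dy                # projected position
            x0, y0 = int(np.floor(xf)), int(np.floor(yf))
            fx, fy = xf - x0, yf - y0              # fractional overlaps
            for xi, yi, wgt in (
                (x0,     y0,     (1 - fx) * (1 - fy)),
                (x0 + 1, y0,     fx * (1 - fy)),
                (x0,     y0 + 1, (1 - fx) * fy),
                (x0 + 1, y0 + 1, fx * fy),
            ):
                if 0 <= xi < w and 0 <= yi < h:
                    acc[yi, xi] += wgt * src[y, x]
    return acc
```

Because the weights for each input pixel sum to one, the total intensity of interior pixels is conserved, which is the property the accumulator array is meant to guarantee.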
Whilst this four-corner approach allows us to avoid holes in the output image, it also brings in costly intersection tests. Magnification and reduction, which are special cases of warping, can cause the same input pixel value to be applied to multiple output pixels unless additional filtering is employed [92]. A solution to these problems is to adaptively oversample the input image up to a level where the projected pixel does not cover more than an acceptable number of output pixels, ideally one, since in that case the accumulator array simply accumulates a single value without intersection tests.
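The oversampling idea can be sketched for a uniform magnification (an assumption made here for brevity — an adaptive scheme would instead choose the subdivision per pixel from the local behaviour of the warp; the function name is also illustrative):

```python
import numpy as np
from math import ceil

def oversampled_forward_scale(src, s):
    """Forward mapping of a magnification by factor s with oversampling.

    Each input pixel is subdivided so that every projected sub-sample
    covers roughly one output pixel; the accumulator then gathers only
    single values, with no intersection tests.  (Sketch under the
    assumption of a uniform scale s >= 1.)
    """
    h, w = src.shape
    n = max(1, ceil(s))                      # sub-samples per axis
    H, W = int(h * s), int(w * s)
    acc = np.zeros((H, W))
    cnt = np.zeros((H, W))
    for y in range(h):
        for x in range(w):
            for j in range(n):
                for i in range(n):
                    # centre of each sub-sample, projected by the scale
                    xs = (x + (i + 0.5) / n) * s
                    ys = (y + (j + 0.5) / n) * s
                    xo, yo = int(xs), int(ys)
                    if xo < W and yo < H:
                        acc[yo, xo] += src[y, x]
                        cnt[yo, xo] += 1
    cnt[cnt == 0] = 1
    return acc / cnt
```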
[Figure 2-2: Four-corner forward mapping on a 5×5 pixel grid; the projected patches of pixels A and B straddle several output pixels.]
Despite its shortcomings, forward mapping is useful when the input image has to be read sequentially or when it does not reside entirely in memory; these are rarely limiting issues in modern computer architectures, although embedded implementations may still have a use for it.
The inverse mapping is expressed as

Out(x, y) = In(Xb(x, y), Yb(x, y))    (2.2)
It operates by projecting each output coordinate into the input image via Xb and Yb, thereby guaranteeing that each and every output pixel is filled without the need for an accumulator array. Again, the simplest implementation is to directly copy the value of the input pixel onto which the output pixel is mapped, by truncating or rounding the real-valued transformation to an integer. In Figure 2-3, the projections of output pixels A and B onto the input image are illustrated by light gray squares with a dot at their centre. The transformation for both A and B is about the same (a little more than a half-pixel shift); however, the value of A is obtained by truncating its mapping to its integer parts, so A appears undisplaced, whereas the value of B is obtained by rounding its mapping to the nearest integers, so B appears to have moved by one pixel. Instead of settling for the values at integer locations, an interpolation stage can be introduced to obtain pixel values at real-valued coordinates. Pixel C in Figure 2-3 illustrates the interpolation scheme.
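A minimal sketch of inverse mapping with bilinear interpolation, assuming a pure translation as the backward warp (so Xb(x, y) = x − dx and Yb(x, y) = y − dy; the function name and the choice of bilinear interpolation are illustrative assumptions):

```python
import numpy as np

def inverse_map_translate(src, dx, dy):
    """Inverse mapping in the sense of Eq. 2.2, with bilinear interpolation.

    Each output pixel (x, y) is projected back into the input image and
    its value is interpolated from the four surrounding input pixels.
    Output pixels whose back-projection falls outside the input are
    left at zero.  (Illustrative sketch.)
    """
    h, w = src.shape
    out = np.zeros_like(src, dtype=float)
    for y in range(h):
        for x in range(w):
            xs, ys = x - dx, y - dy            # back-projected coordinate
            x0, y0 = int(np.floor(xs)), int(np.floor(ys))
            fx, fy = xs - x0, ys - y0          # fractional parts
            if 0 <= x0 and x0 + 1 < w and 0 <= y0 and y0 + 1 < h:
                out[y, x] = ((1 - fx) * (1 - fy) * src[y0, x0]
                             + fx * (1 - fy) * src[y0, x0 + 1]
                             + (1 - fx) * fy * src[y0 + 1, x0]
                             + fx * fy * src[y0 + 1, x0 + 1])
    return out
```

Unlike the forward-mapped sketches above, every in-range output pixel here receives exactly one value, so no accumulator array or intersection test is needed.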
[Figure 2-3: Inverse mapping on a 5×5 pixel grid; the back-projections of output pixels A (truncated) and B (rounded) are shown in the input image, and pixel C is obtained by interpolation.]
Interpolation plays the same role as the accumulator array in forward mapping, but is less costly to implement since no intersection tests are performed.