
Image Processing Primer

David Doria
daviddoria@gmail.com

Tuesday 16th December, 2008

Contents

1 Prerequisites 5

2 What is Image Processing? 5

3 Color Models 5
3.1 Additive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.1 RGB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
3.1.2 HSV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
3.2 Subtractive . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.2.1 CMYK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

4 Image Basics 8
4.1 Pixel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.2 Binary Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.3 Greyscale Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.4 Color Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
4.5 Pseudocolor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

5 Thresholding 8
5.1 Thresholding by Clustering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.2 Otsu’s Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
5.2.1 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

6 Filtering Terminology 9

7 Morphological Operations 9
7.1 Erosion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.1.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.1.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
7.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.2 Dilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.2.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.2.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

7.3 Combined Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.3.1 Opening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
7.3.2 Closing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
7.4 Thickening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
7.4.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
7.4.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.5 Thinning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
7.5.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
7.5.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

8 Edge Detectors/Filters 13
8.1 Differential Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
8.1.1 Prewitt . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
8.1.2 Sobel . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
8.1.3 Canny . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
8.2 Zero Crossing Family . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
8.2.1 Laplacian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
8.2.2 Laplacian of Gaussian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
8.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

9 Corner Detection 17
9.1 Hit or Miss Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9.1.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9.1.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9.2 Harris Corner Detector . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
9.2.1 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

10 Point Operations 18
10.1 Histogram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
10.2 Contrast Stretching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
10.2.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
10.2.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
10.3 Histogram Equalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
10.3.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
10.3.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

11 Image Enhancement 20
11.1 Point Spread Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
11.2 Deconvolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
11.3 Sharpening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
11.3.1 Edge Enhancement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
11.3.2 Unsharp Masking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

12 Algebraic Operations 21
12.1 Motion (Change) Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
12.2 Noise Removal by Image Averaging . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

13 Image Noise 21
13.1 Salt and Pepper Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
13.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
13.3 Gaussian Noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
13.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

14 Spatial Transformations 22
14.1 Averaging Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
14.2 Median Filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

15 Geometric Operations 23
15.1 Grey-Level Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
15.1.1 Nearest Neighbor (Zero Order) Interpolation . . . . . . . . . . . . . . . . . . 23
15.1.2 Bilinear Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
15.1.3 Bicubic Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
15.2 Translation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.2.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.3 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.4 Rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.4.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.4.2 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.5 Compound Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
15.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

16 Transforms 25
16.1 Hough Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
16.1.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
16.1.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
16.1.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
16.1.4 Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
16.2 Skeletonization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
16.2.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
16.2.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
16.2.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
16.3 Medial Axis Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
16.3.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
16.4 Distance Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
16.5 Radon Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
16.5.1 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
16.6 Fourier Slice Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
16.6.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
16.7 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
16.7.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
16.7.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
16.7.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
16.8 Discrete Cosine Transform (DCT) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
16.8.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
16.8.2 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

16.8.3 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
16.9 Hadamard Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
16.10 KL, Principal Component, Hotelling, and Eigenvector Transforms . . . . . . . . . 33

17 Compression 34
17.1 Lossless . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
17.1.1 Run Length Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
17.1.2 Lempel-Ziv Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
17.1.3 Huffman Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
17.2 Lossy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
17.2.1 Transform Coding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
17.3 Theoretical Compression Limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

18 Segmentation 35
18.1 Idea . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
18.2 Connected Components Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
18.2.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
18.3 Region Growing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
18.3.1 Procedure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
18.4 Region Splitting and Merging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
18.5 Watershed Segmentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36

19 JPEG 37

20 TV Standards 37
20.1 Color . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
20.2 NTSC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
20.3 PAL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

1 Prerequisites
This paper assumes a working knowledge of pattern recognition, probability, and basic linear algebra.

2 What is Image Processing?

Image processing is the manipulation and analysis of digital images, either to improve their quality (enhancement, noise removal) or to extract useful information from them (edges, corners, regions). The sections that follow survey the standard techniques.
3 Color Models
There are two ways to represent a color C. The first is to ask “Which colors of light would I have to combine to produce C?”. This is called the additive color model. The second is to ask “Which colors of ink should be combined so that they absorb the colors I don’t want from incident white light?”. This is called the subtractive color model.

3.1 Additive

Figure 1: Additive Color

3.1.1 RGB
Red, green, and blue are the three additive primary colors. By combining these three colors of light, any color can be produced. R, G, and B are specified as relative amounts, which describe how much of each color to combine (e.g. [1, 0, 0] is pure red, [1, 1, 0] means to combine red and green in equal quantities, etc.). These combinations can be represented as a cube.

Figure 2: RGB Cube

3.1.2 HSV
H, S, and V correspond to the hue, saturation, and value of the color. These values describe colors
as points in a cylinder with the following properties.

• The angle around the axis corresponds to hue. This is the actual “color” of the color.

• The distance from the axis corresponds to saturation. Near the axis a blue is pale and washed out (“light blue”), while at the edge of the cylinder it is vivid and fully saturated.

• The central axis ranges from black at the bottom to white at the top. The distance along the axis corresponds to value (also called lightness or brightness, as in HSL and HSB).

Figure 3: HSV Cylinder

3.2 Subtractive

Figure 4: Subtractive Color

The subtractive model describes which inks must be applied to a white background so that the reflected light produces a particular color.

3.2.1 CMYK
Cyan, Magenta, Yellow, and blacK. With these four colors of ink any color can be produced. Since these colors are the exact inverse of the additive color model, the two systems can be interchanged with

\begin{pmatrix} C \\ M \\ Y \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} - \begin{pmatrix} R \\ G \\ B \end{pmatrix}

In theory black is not needed: CMY alone should cover the entire range of possible colors. In practice, however, it is much better to use a fourth ink, black. Some reasons are as follows:

• It is cheaper to apply 1 ink (black) than 3 inks (CMY)

• The paper gets wet if too much ink is applied, which often happens when C, M, and Y are
applied. This is inefficient because it adds drying time to the printing process.

• Text is often black. Since text requires very fine detail, it should be easy to produce this detail in black. If it were produced with CMY, the C, M, and Y print heads would have to be very accurately aligned, which is much more difficult than simply using a fourth ink.

4 Image Basics
4.1 Pixel
A single element of a digital image. The word pixel is short for Picture (abbreviated “pix”) Element (“el”).

4.2 Binary Image


A binary image is a two dimensional array of binary pixels. If the value is 0, the pixel is black. If the value is 1, the pixel is white.

4.3 Greyscale Image


A greyscale image is a two dimensional array of values indicating the brightness at each point. The brightness values are generally stored as a value between 0 (black) and 255 (white). Values in between are different shades of grey.

4.4 Color Image


A color image can be viewed in two equivalent ways. The first is as a two dimensional array of pixels, just like a greyscale image, but instead of a brightness value, each pixel has a specific color given by an (R, G, B) triple. The alternative view is that the image is composed of three separate 2D arrays of pixels (one for red, one for green, and one for blue), where each element contains the amount of only that layer’s color present in the image at that point. Each of these 2D arrays is called a layer. When the layers are overlaid, the color image is produced.

4.5 Pseudocolor
A map can be produced from a grey level to a color spectrum. This can help to visually identify
different intensities more easily.

5 Thresholding
Choose a value (the threshold) below which one action will be taken, and above which a different action will be taken.

5.1 Thresholding by Clustering


Use K-Means with K = 2 to try to divide the histogram values into foreground and background
pixels (white and black).

5.2 Otsu’s Method


This method is generally used to reduce a greyscale image to a binary image. In this case, pixels
above the threshold will be set to white, and pixels below the threshold will be set to black. Otsu’s
method produces the optimal threshold because it separates the pixels so that their within-class
variance is minimized.

5.2.1 Details
The within-class variance is defined as the weighted sum of the variances of each cluster:

\sigma^2_{within}(T) = n_B(T)\,\sigma^2_B(T) + n_W(T)\,\sigma^2_W(T)

where T is the threshold and n_B(T) = \sum_{i=0}^{T-1} p(i).
Computing this directly for every candidate T is very computationally expensive. An equivalent method is to maximize the between-class variance: the variance of the combined distribution minus the within-class variance,

\sigma^2_{between}(T) = \sigma^2 - \sigma^2_{within}(T)

This can be expressed as

\sigma^2_{between}(T) = n_B(T)\,n_W(T)\,[\mu_B(T) - \mu_W(T)]^2

This expression is very simple, involving only the means of each cluster weighted by the number of points in each cluster. Otsu’s method updates these quantities incrementally as T is swept across the grey levels, instead of recomputing the entire expression at each step, which makes it very fast.
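A minimal sketch of this method in Python with NumPy, assuming an 8-bit greyscale image stored as a 2D array. For clarity it recomputes the class statistics for every candidate T rather than updating them incrementally; the selected threshold is the same.

import numpy as np

def otsu_threshold(image):
    """Return the threshold T that maximizes the between-class variance."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                        # normalized histogram p(i)
    levels = np.arange(256, dtype=float)
    best_T, best_var = 0, -1.0
    for T in range(1, 256):
        n_B, n_W = p[:T].sum(), p[T:].sum()      # class weights n_B(T), n_W(T)
        if n_B == 0 or n_W == 0:
            continue
        mu_B = (levels[:T] * p[:T]).sum() / n_B  # class means
        mu_W = (levels[T:] * p[T:]).sum() / n_W
        var_between = n_B * n_W * (mu_B - mu_W) ** 2
        if var_between > best_var:
            best_T, best_var = T, var_between
    return best_T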

6 Filtering Terminology
A linear filter computes each output pixel as a weighted sum of the input pixels in a neighborhood:

g(x, y) = \sum_{s=-a}^{a} \sum_{t=-b}^{b} w(s, t)\, f(x + s, y + t)

where a = (m - 1)/2 and b = (n - 1)/2 for an m x n filter w.
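A direct sketch of this formula in Python with NumPy, assuming an odd-sized filter; border pixels, where the window would fall off the image, are left at zero for simplicity.

import numpy as np

def apply_filter(f, w):
    """Correlate image f with the odd-sized filter w."""
    m, n = w.shape
    a, b = (m - 1) // 2, (n - 1) // 2
    g = np.zeros(f.shape, dtype=float)
    for x in range(a, f.shape[0] - a):
        for y in range(b, f.shape[1] - b):
            # g(x, y) = sum over s, t of w(s, t) * f(x + s, y + t)
            g[x, y] = (w * f[x - a:x + a + 1, y - b:y + b + 1]).sum()
    return g

In practice a library routine such as scipy.ndimage.correlate would be used instead; it also handles the border properly.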

7 Morphological Operations
A morphological operation is one which changes the shape of a region in an image. Morphological
operations operate on binary images. Because of this, we can define a “background” pixel to be
black, and a “foreground” pixel to be white. This type of operation is used to remove extraneous
information before further processing.
These operations require the notion of the “neighborhood” of a pixel. A structuring element (S) is a small grid of pixels (e.g. 3x3 or 5x5). In procedures besides morphological operations, the same concept is called a kernel or a filter. We require that a filter have odd x odd size, so that it can be centered exactly at a particular pixel.

7.1 Erosion
7.1.1 Idea
Areas of foreground pixels shrink, holes within foreground areas grow. This is commonly used to
separate touching objects so that they can be counted.

7.1.2 Procedure
For each pixel p in the image, if S centered at p overlaps ONLY foreground pixels, p remains a foreground pixel; otherwise p is set to background. This makes objects smaller. It can break a single object into multiple objects.

7.1.3 Example

Figure 5: Erosion

7.2 Dilation
7.2.1 Idea
Gradually enlarge the boundaries of regions of foreground pixels. This makes objects larger. It can
merge multiple objects into one, and remove holes inside objects. Dilation can be used for edge
detection. The dilated image is subtracted from the original, resulting in the edge image.

7.2.2 Procedure
For each pixel p, if S centered at p overlaps ANY foreground pixel, p is labeled a foreground pixel; otherwise p is labeled background.
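A sketch of both erosion and dilation in Python with NumPy, assuming a boolean image (True = foreground) and a boolean structuring element S; the border is padded with background pixels.

import numpy as np

def _windows(img, S):
    """Yield (x, y, neighborhood) for every pixel, padding the border with False."""
    a, b = S.shape[0] // 2, S.shape[1] // 2
    padded = np.pad(img, ((a, a), (b, b)), constant_values=False)
    for x in range(img.shape[0]):
        for y in range(img.shape[1]):
            yield x, y, padded[x:x + S.shape[0], y:y + S.shape[1]]

def erode(img, S):
    out = np.zeros_like(img)
    for x, y, win in _windows(img, S):
        out[x, y] = np.all(win[S])   # S overlaps ONLY foreground pixels
    return out

def dilate(img, S):
    out = np.zeros_like(img)
    for x, y, win in _windows(img, S):
        out[x, y] = np.any(win[S])   # S overlaps ANY foreground pixel
    return out

Opening and closing (next section) are then simply erode-then-dilate and dilate-then-erode.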

7.2.3 Example

Figure 6: Dilation

7.3 Combined Operations


7.3.1 Opening
Idea

Remove small islands and thin filaments of object pixels. The effect is basically the same as
erosion, but less destructive. The effect can be visualized by “rolling” the structuring element
around the inner boundary of the foreground regions.

Procedure

Erosion followed by dilation.

Example

Figure 7: Opening

7.3.2 Closing
Idea

Remove islands and thin filaments of background pixels. The effect can be visualized by “rolling”
the structuring element around the outer boundary of the foreground regions.

Procedure

Dilation followed by erosion.

Example

Figure 8: Closing

7.4 Thickening
Uses:

• Finding the convex hull of an object.

7.4.1 Procedure
The hit-and-miss transform is added to the original image.

7.4.2 Example

Figure 9: Original Image

Figure 10: Thickened Image

7.5 Thinning
Uses:
• Reduce the output of an edge detector to single pixel wide edges while preserving the length
of the edges.

• Produce a skeleton.

7.5.1 Procedure
The hit-and-miss transform is subtracted from the original image. Consider all pixels on the boundaries of foreground regions (i.e. foreground points that have at least one background neighbor).
Delete any such point that has more than one foreground neighbor, as long as doing so does not
locally disconnect (i.e. split into two) the region containing that pixel. Iterate until convergence.

7.5.2 Example

Figure 11: Thinning

8 Edge Detectors/Filters
The goal is to find edges in the image. An edge is a place in the image with a strong intensity contrast. Edge detection significantly reduces the amount of data (from millions of pixels to only hundreds of edge pixels). This filtering out of unimportant information, while preserving the important structural properties, is generally a very helpful first step in the analysis of an image. Edges are often used in segmentation because they generally occur at natural object boundaries.

Figure 12: Original Image

Figure 13: Edge Image

This is generally performed by creating a mask (a.k.a. kernel or filter) which outputs the desired information (usually something about the gradient). The constant in front of the mask simply normalizes the output values.

8.1 Differential Family


Looks for values in the first derivative above a threshold. If you use a vertical and horizontal mask pair, producing gradient components G_x and G_y, then the magnitude of the gradient can be calculated as \sqrt{G_x^2 + G_y^2}.

8.1.1 Prewitt
The gradient in the x direction is approximated by

\frac{[f(x+1) - f(x)] + [f(x) - f(x-1)]}{2} = \frac{f(x+1) - f(x-1)}{2}

This is the average of the change to the right and to the left of the current pixel. Therefore, if we multiply the pixel to the right of p by 1, multiply the pixel to the left of p by -1, sum the results, and divide by 2, we have an approximation of the gradient.

Simple Vertical Edge Detector

By applying the filter

\frac{1}{2}\begin{pmatrix} -1 & 0 & 1 \end{pmatrix}

at each pixel, we get an approximation of the gradient at that pixel. If the gradient is high, we can infer there is an edge at this location. However, to be more robust to noise, this measurement is averaged over the other dimension (i.e. while looking for vertical edges, average the gradient estimate in the y direction). This is done simply by stacking several (-1 0 1) rows together, and normalizing by the sum of the absolute values of the filter coefficients.

Vertical Edge Detector

\frac{1}{6}\begin{pmatrix} -1 & 0 & 1 \\ -1 & 0 & 1 \\ -1 & 0 & 1 \end{pmatrix}

The factor of 1/6 is there because the sum of the absolute values of the filter coefficients is 6.

Horizontal Edge Detector

This is exactly the same concept as the vertical edge detector.

\frac{1}{6}\begin{pmatrix} -1 & -1 & -1 \\ 0 & 0 & 0 \\ 1 & 1 & 1 \end{pmatrix}

8.1.2 Sobel
This is a Prewitt filter, but with approximate Gaussian averaging instead of uniform averaging over the dimension that is not being differentiated. Since 3x3 filters are shown, the Gaussian approximation is very coarse ([1 2 1] does not look very Gaussian!), but if a 20x20 filter were used, the filter coefficients should resemble a sampled Gaussian function.

Vertical Edge Detector

\frac{1}{8}\begin{pmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{pmatrix}

Horizontal Edge Detector

\frac{1}{8}\begin{pmatrix} -1 & -2 & -1 \\ 0 & 0 & 0 \\ 1 & 2 & 1 \end{pmatrix}
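A sketch of Sobel edge detection with NumPy and SciPy (assumed available); the two kernels above produce the gradient components, and the magnitude is thresholded to give a binary edge map. The threshold value here is an illustrative assumption.

import numpy as np
from scipy import ndimage

def sobel_edges(image, threshold=0.1):
    """Binary edge map from the Sobel gradient magnitude."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]) / 8.0   # vertical edges
    ky = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]]) / 8.0   # horizontal edges
    gx = ndimage.correlate(image.astype(float), kx)
    gy = ndimage.correlate(image.astype(float), ky)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)   # sqrt(Gx^2 + Gy^2)
    return magnitude > threshold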

8.1.3 Canny
The Canny edge detector generally gives the best results. This comes at the cost of significantly more complexity. First, the image is blurred with a Gaussian kernel. Then Sobel edge detection is performed. After that, pixels that were labeled as edge pixels are partitioned into horizontal, right diagonal, vertical, or left diagonal edges. Double thresholding is done on each of these partitions: a value below the bottom threshold is definitely not an edge, and a value above the top threshold definitely is an edge. The pixels between the thresholds are checked to see if there is a path of adjacent pixels which have the same edge orientation and were labeled as “definitely an edge”. This has the effect of connecting the edges, which overcomes the main drawback of most of the other types of edge detection.
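In practice one would rarely implement Canny from scratch. A sketch using scikit-image, assuming the library is available; the sigma and threshold values are illustrative only.

import numpy as np
from skimage import feature

image = np.random.rand(128, 128)          # stand-in for a real greyscale image
edges = feature.canny(image,
                      sigma=2.0,          # width of the Gaussian blur
                      low_threshold=0.1,  # below this: definitely not an edge
                      high_threshold=0.3) # above this: definitely an edge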

8.2 Zero Crossing Family
8.2.1 Laplacian
The second derivative in x can be approximated with

[f(x+1) - f(x)] - [f(x) - f(x-1)] = f(x+1) - 2f(x) + f(x-1)

This can be read “the difference between A and B” where A is the difference between the current
pixel and the one to the right of it, and B is the difference between the current pixel and the one
to the left of it.
Combine this with the second derivative in y to obtain

\frac{1}{8}\begin{pmatrix} 0 & 1 & 0 \\ 1 & -4 & 1 \\ 0 & 1 & 0 \end{pmatrix}

or

\frac{1}{16}\begin{pmatrix} 1 & 1 & 1 \\ 1 & -8 & 1 \\ 1 & 1 & 1 \end{pmatrix}

8.2.2 Laplacian of Gaussian


Blur the image using a Gaussian kernel. Then use the Laplacian kernel. The blurring is to prevent
responses to noise instead of an actual edge. Another method to achieve the same result is to first
convolve the Laplacian filter with the Gaussian filter, and then apply the result to the image.

LoG(x, y) = -\frac{1}{\pi\sigma^4}\left[1 - \frac{x^2 + y^2}{2\sigma^2}\right] e^{-\frac{x^2 + y^2}{2\sigma^2}}
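A sketch that samples this formula into a discrete kernel with NumPy; the kernel size and sigma are illustrative choices.

import numpy as np

def log_kernel(size=9, sigma=1.4):
    """Sample the Laplacian of Gaussian on a size x size grid."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r2 = x ** 2 + y ** 2
    k = -(1.0 / (np.pi * sigma ** 4)) * (1 - r2 / (2 * sigma ** 2)) \
        * np.exp(-r2 / (2 * sigma ** 2))
    return k - k.mean()   # force zero total response on flat regions

Convolving the image with this kernel and then locating zero crossings of the result yields the edge map.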

8.2.3 Example

Figure 14: Laplacian Of Gaussian Kernel

9 Corner Detection
9.1 Hit or Miss Transform
9.1.1 Idea
Find sections of the image that identically match a particular foreground/background configuration.

9.1.2 Procedure
A filter is created that contains 1’s and 0’s as its coefficients. We run this filter over the image,
looking for positions where the foreground/background configuration of the mask exactly matches
that of the image. We can detect four different corner orientations using the four filters below.

Figure 15: Hit Or Miss Corner Filters

9.1.3 Example

Figure 16: Hit Or Miss Corner Detection

9.2 Harris Corner Detector


Looks for corners in an image by examining the autocorrelation function in a neighborhood of each
pixel.

9.2.1 Details
The autocorrelation function is given by

c(x, y) = \sum_{w} \left[ I(x_i, y_i) - I(x_i + \Delta x, y_i + \Delta y) \right]^2

where w is some choice of window function.


After approximating the shifted image function with a first order Taylor polynomial, c(x, y) can be written as

c(x, y) \approx \begin{pmatrix} \Delta x & \Delta y \end{pmatrix} H(x, y) \begin{pmatrix} \Delta x \\ \Delta y \end{pmatrix}

The matrix H captures the intensity structure of the local neighborhood. The eigenvalues of H are considered. There are three cases.

1. Both λ1 and λ2 are small. This means the autocorrelation function is flat, and therefore there
are no edges or corners.

2. One of the eigenvalues is large. This means shifts in one direction cause little change in the
autocorrelation function. This means that there is an edge present.

3. Both eigenvalues are large. This means that shifts in both directions cause large changes in
the autocorrelation function. This means that there is a corner present.
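A sketch of the eigenvalue test in NumPy/SciPy (assumed available). H is built from products of the image gradients averaged over a Gaussian window; the window width sigma is an illustrative choice.

import numpy as np
from scipy import ndimage

def harris_eigenvalues(image, sigma=1.0):
    """Per-pixel eigenvalues (lam_small, lam_large) of the 2x2 matrix H."""
    Ix = ndimage.sobel(image.astype(float), axis=1)   # gradient in x
    Iy = ndimage.sobel(image.astype(float), axis=0)   # gradient in y
    # entries of H, averaged over the window w
    Ixx = ndimage.gaussian_filter(Ix * Ix, sigma)
    Iyy = ndimage.gaussian_filter(Iy * Iy, sigma)
    Ixy = ndimage.gaussian_filter(Ix * Iy, sigma)
    # closed-form eigenvalues of the symmetric matrix [[Ixx, Ixy], [Ixy, Iyy]]
    half_trace = (Ixx + Iyy) / 2
    root = np.sqrt(((Ixx - Iyy) / 2) ** 2 + Ixy ** 2)
    return half_trace - root, half_trace + root

A pixel is declared a corner when both returned eigenvalues are large (case 3 above).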

10 Point Operations
This is also called contrast enhancement, contrast stretching, or greyscale transformation. The new grey level of each pixel depends only on the old grey level of that same pixel; no neighborhood information is used.

10.1 Histogram
Represents the relative frequency of occurrence of the various grey levels in an image.

Figure 17: Histogram

10.2 Contrast Stretching


10.2.1 Idea
Improve the contrast in an image by stretching the range of intensity values it contains. This is
done with a linear scaling function. This can help to bring out detail in an image which has mostly
light pixels or mostly dark pixels.

10.2.2 Procedure

P_{out} = (P_{in} - c)\left(\frac{b - a}{d - c}\right) + a

a is the lower limit after the stretching (usually 0) and b is the upper limit (usually 255). c is the lowest intensity value in the input image, and d is the highest intensity value.
Outlier rejection is very important: c and d can instead be chosen by first discarding the lower and upper five percent of the input histogram and then selecting the lowest and highest remaining values.
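A sketch with NumPy, using the percentile form of outlier rejection described above (the five percent figure follows the text; a and b default to the usual 0 and 255).

import numpy as np

def stretch_contrast(image, a=0, b=255, clip_percent=5):
    """Linearly map the robust input range [c, d] onto [a, b]."""
    c = np.percentile(image, clip_percent)          # robust darkest value
    d = np.percentile(image, 100 - clip_percent)    # robust brightest value
    out = (image.astype(float) - c) * (b - a) / (d - c) + a
    return np.clip(out, a, b).astype(np.uint8)      # clamp the rejected outliers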

10.3 Histogram Equalization
10.3.1 Idea
Produce an output histogram with a uniform distribution of intensities. This is essentially contrast
stretching with a nonlinear function.

10.3.2 Procedure
We try to find a transformation T that maps grey levels to different grey levels such that the result is spread uniformly over the entire range of grey levels. We construct T by looking at the cumulative distribution function of the image intensities (the integral of the histogram):

c(i) = \sum_{j=0}^{i} p(x_j)

The transformation is simply y_i = c(i). Be careful to scale the output histogram so that it is in the desired range. This is done with a simple normalization.
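A sketch for an 8-bit image with NumPy; the CDF is normalized so that the output again spans 0 to 255, and the mapping is applied as a lookup table.

import numpy as np

def equalize(image):
    """Map grey levels through the normalized CDF of the image histogram."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = hist.cumsum().astype(float)
    cdf /= cdf[-1]                          # normalize so that c(255) = 1
    lut = np.round(cdf * 255).astype(np.uint8)
    return lut[image]                       # y_i = c(i) applied per pixel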

Figure 18: Histogram Equalization

Figure 19: Unequalized Image

Figure 20: Equalized Image

11 Image Enhancement
11.1 Point Spread Function
The point spread function (PSF) is the response of the camera to a point source of light. If a point of light exists in the world, we would like it to occupy a very small region (maybe only a pixel) in the image. However, optical systems are not perfect, and this point of light usually gets blurred across many pixels in the image. Often, a Gaussian function is a good approximation to this blurring.

Figure 21: Point Spread Function

11.2 Deconvolution
If the PSF is known, we can multiply by a (regularized) inverse filter in the frequency domain (Wiener filtering) to remove the effect of the blur.

11.3 Sharpening
11.3.1 Edge Enhancement
Idea

Boosting the high frequency content in an image to make its edges appear sharper.

Procedure

Filter the image with an edge filter and then add the edge image to the original image.

11.3.2 Unsharp Masking


Idea

Boost the high frequency content in an image to make its edges appear sharper.

Procedure

First, blur the image. Then, subtract the blurred image from the original. Add the difference
back to the original.

12 Algebraic Operations
12.1 Motion (Change) Detection
The goal is to identify the set of pixels that are “significantly different” between two images. These
pixels are called the “change mask”. It is not assumed that the two pictures are taken at the
same time, so the change mask should not include nuisance forms of change such as changes in
illumination.
The simplest method is known as “simple differencing”. The two images are subtracted, and if the difference at any pixel is greater than a threshold τ, that pixel is labeled as a changed pixel.
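A sketch of simple differencing with NumPy; the threshold τ is application dependent, and the value used here is an arbitrary assumption.

import numpy as np

def change_mask(image1, image2, tau=30):
    """Label pixels whose absolute difference exceeds the threshold tau."""
    diff = np.abs(image1.astype(int) - image2.astype(int))
    return diff > tau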

12.2 Noise Removal by Image Averaging


The assumption is that the noise is random. If we can align multiple images of the same scene, we can then simply average the values at pixel (i, j) across the images to obtain an image with less noise (the noise is “averaged out”).

13 Image Noise
Just as in signal processing, unwanted noise is often present in an image. The two main noise
models are described in the following.

13.1 Salt and Pepper Noise


This type of noise is modeled as randomly occurring white or black pixels.

13.2 Example

Figure 22: Salt and Pepper Noise

This type of noise can be removed with a median filter (described below).

13.3 Gaussian Noise


Noise in the more usual sense, where each pixel value is perturbed by a Gaussian random variable.

13.4 Example

Figure 23: Gaussian Noise

14 Spatial Transformations
Performed on a local neighborhood of image pixels. Generally the image is convolved with a small (e.g. 3x3 or 5x5) filter (a.k.a. mask or kernel).

14.1 Averaging Filter


The value at each pixel p is set to the average of the pixels which overlap the mask centered at p.
This has a smoothing effect, and is often called a low pass filter.

14.2 Median Filter


Each pixel is given the median value of the pixels in some neighborhood. This has a smoothing
effect. This is an excellent way to remove salt and pepper noise.
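A sketch using scipy.ndimage (assumed available); a 3x3 neighborhood is a typical choice for salt and pepper noise.

import numpy as np
from scipy import ndimage

noisy = np.random.randint(0, 256, (128, 128), dtype=np.uint8)  # stand-in image
denoised = ndimage.median_filter(noisy, size=3)  # median over each 3x3 window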

15 Geometric Operations
Geometric operations are applied globally to an image. Typical transformations include scaling,
rotation, and reflection.

15.1 Grey-Level Interpolation


When a geometric operation is applied to an image, each pixel in the input image does not necessarily
map to an integer pixel in the output image. Therefore, we must choose a method for calculating
the intensity for each pixel in the new image.

15.1.1 Nearest Neighbor (Zero Order) Interpolation


Use the grey value of the nearest pixel. This is very computationally easy, but the resulting image
may be very blocky.

Example

Figure 24: Nearest Neighbor Interpolation

15.1.2 Bilinear Interpolation


Interpolate the values of the 4 closest pixels to the desired output pixel and assign the new pixel
this value. These 4 pixels can be weighted based on their distance to the desired location. This is
more computationally expensive, but produces better results than Nearest Neighbor Interpolation.
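A sketch of the interpolation at a single non-integer location (x, y) with NumPy, assuming (x, y) lies strictly inside the image and using the indexing convention image[x, y].

import numpy as np

def bilinear(image, x, y):
    """Interpolate the image intensity at the non-integer location (x, y)."""
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - x0, y - y0
    # weight the 4 surrounding pixels by their closeness to (x, y)
    return ((1 - dx) * (1 - dy) * image[x0, y0]
            + dx * (1 - dy) * image[x0 + 1, y0]
            + (1 - dx) * dy * image[x0, y0 + 1]
            + dx * dy * image[x0 + 1, y0 + 1])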

Example

Figure 25: Bilinear Interpolation

15.1.3 Bicubic Interpolation


Interpolate the values of the 16 closest pixels to the desired output pixel and assign the new pixel
this value. These 16 pixels are weighted based on their distance to the desired location. This is more
computationally expensive, but produces better results than Bilinear Interpolation. This method
is the best trade off between quality and computational expense.

Example

Figure 26: Bicubic Interpolation

15.2 Translation
To translate the image, multiply by the following matrix:

\begin{pmatrix} x_{new} \\ y_{new} \\ 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 & x_0 \\ 0 & 1 & y_0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

15.2.1 Example
!!!

15.3 Scaling

\begin{pmatrix} x_{new} \\ y_{new} \\ 1 \end{pmatrix} = \begin{pmatrix} \frac{1}{c} & 0 & 0 \\ 0 & \frac{1}{d} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

15.3.1 Example

Figure 27: Scaled Image

15.4 Rotation

\begin{pmatrix} x_{new} \\ y_{new} \\ 1 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix}

15.4.1 Example

Figure 28: Rotated Image

15.4.2 Details
This is derived by drawing a coordinate axes and a new coordinate axes rotated by θ:

x' = x\cos\theta + y\sin\theta
y' = -x\sin\theta + y\cos\theta

(Rotating the axes by θ is equivalent to rotating the points by -θ, hence the sign difference from the matrix above.)

15.5 Compound Transformations


To rotate around a point other than the origin, first translate the image so the point you want to
rotate around is at the origin. Then rotate the image by the desired amount. Then perform the
inverse translation (translate the image back to where it was originally).
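A sketch composing these homogeneous matrices with NumPy to rotate about an arbitrary point (cx, cy):

import numpy as np

def rotate_about(cx, cy, theta):
    """Homogeneous matrix that rotates by theta around the point (cx, cy)."""
    to_origin = np.array([[1, 0, -cx], [0, 1, -cy], [0, 0, 1]], dtype=float)
    rotate = np.array([[np.cos(theta), -np.sin(theta), 0],
                       [np.sin(theta),  np.cos(theta), 0],
                       [0, 0, 1]])
    back = np.array([[1, 0, cx], [0, 1, cy], [0, 0, 1]], dtype=float)
    return back @ rotate @ to_origin   # applied right to left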

15.5.1 Example
!!!

16 Transforms
16.1 Hough Transform
16.1.1 Idea
The Hough transform is generally used to find lines in an image, although it can easily be modified to find other simple shapes. It is very robust to noise and to discontinuities in the appearance of the object being searched for. The transform does not make any hard decisions about the location of the object, but rather provides a grid of values in the parameter space (which could be interpreted as a probability density if normalized) which indicates the likelihood of the object appearing in the form described by the parameters.
The idea is that each point in the edge image could possibly have come from an infinite number of lines.

16.1.2 Procedure
First, use an edge detector. For each edge point, we increment every cell in the accumulator space corresponding to a line the point could have come from. Cells with very high counts indicate that many points in the image fell on the same line, indicating that there is in fact a line in the image. Since there are infinitely many lines through each point in the image, to implement this transform we must pick an angular resolution and, for each angle in the now finite set, solve for the corresponding r value of the line that goes through the point.
Generally, the transform is thresholded to find the maxima.
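A sketch of the accumulator loop in NumPy, using the (θ, r) parameterization from the Details section below; the angular resolution is an illustrative choice.

import numpy as np

def hough_lines(edges, n_theta=180):
    """Accumulate votes in (theta, r) space for a binary edge image."""
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    r_max = int(np.ceil(np.hypot(*edges.shape)))
    acc = np.zeros((n_theta, 2 * r_max + 1), dtype=int)  # r in [-r_max, r_max]
    ys, xs = np.nonzero(edges)
    for x, y in zip(xs, ys):
        for i, theta in enumerate(thetas):
            r = int(round(x * np.cos(theta) + y * np.sin(theta)))
            acc[i, r + r_max] += 1   # one vote per line through (x, y)
    return acc, thetas

Thresholding acc and reading off the surviving (θ, r) pairs gives the detected lines.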

16.1.3 Example

Figure 29: Original Image

Figure 30: Edge Image

Figure 31: Hough Transform

The bright regions in this image represent the points in the parameter space which the most edge
pixels contributed to. These points correspond to all of the lines in the original image.

16.1.4 Details
The parametric form of a line,

x \cos\theta + y \sin\theta = r

is generally used to prevent numerical problems (division by zero) in the case of vertical lines. The accumulator space is generally θ vs. r, where θ is the angle of the line through the origin perpendicular to the line in question, and r is the perpendicular distance from the origin to the line.

Figure 32: Hough Transform Line Parameterization

16.2 Skeletonization
16.2.1 Idea
Remove most of the foreground pixels while preserving the size and connectedness of the original
image.

16.2.2 Procedure
There are two ways to produce a skeleton.

• Perform successive erosions without changing the connectedness.

• Find all circles of any size that are tangent to the boundary in at least two places. The centers
of these circles are the skeleton.

Figure 33: Skeletonization With Bi-Tangent Circles

16.2.3 Example

Figure 34: Original Image

Figure 35: Skeleton

16.3 Medial Axis Transform


This is skeletonization, but rather than producing a binary image of the skeleton, a greyscale image
is produced with the intensity of each pixel on the skeleton representing the distance to a boundary
in the original image.

16.3.1 Example

Figure 36: Original Image

Figure 37: Medial Axis Transform

16.4 Distance Transform


Produce an image that looks like the original image, but instead of binary pixels, each foreground pixel p takes the grey value of the distance from p to the closest boundary.

Figure 38: Original Image

Figure 39: Distance Transform

16.5 Radon Transform


The Radon space is parameterized the same way as the Hough space (with an angle and a perpendicular distance). However, in the Radon transform, rather than incrementing many accumulator bins for each point in the image, you simply take the line integral of the image over the line given by the current parameter pair and assign that value to the location (θ, r).

16.5.1 Example
!!!

16.6 Fourier Slice Theorem


Reconstruct an area by collecting projections onto multiple lines.

16.6.1 Procedure
We shoot rays (usually x-rays) through the object at every translation of a fixed angle. Each ray integrates the absorption properties of the object along its path (a line integral). This set of line integral values forms a 1D function (the projection onto the line perpendicular to the current angle). The Fourier transform of this function is a single slice of the full 2D Fourier transform of the object. We rotate the angle and repeat the process at the desired angular resolution. After the multiple 1D Fourier transforms have been assembled, we can take the inverse 2D Fourier transform to obtain an image of the object.

16.7 Fourier Transform
16.7.1 Idea
Decompose an image into a sum of complex exponential functions. This is a direct extension of the
1D DFT into 2D.

16.7.2 Procedure
M −1 N −1
1 XX ux vy
F (u, v) = f (x, y)e−j2π( M + N )
M N x=0 y=0
M −1 N −1
X X ux vy
f (x, y) = F (u, v)ej2π( M + N )
u=0 v=0
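In practice the 2D DFT is computed with the FFT. A sketch with NumPy; the log scaling of the magnitude is a common display convention, not part of the transform.

import numpy as np

image = np.random.rand(128, 128)          # stand-in for a real greyscale image
F = np.fft.fft2(image)                    # forward 2D DFT
F_centered = np.fft.fftshift(F)           # zero frequency moved to the center
magnitude = np.log1p(np.abs(F_centered))  # log scale for display
recovered = np.fft.ifft2(F).real          # inverse transform recovers the image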

16.7.3 Example

Figure 40: Original Image

Figure 41: Fourier Transform

16.8 Discrete Cosine Transform (DCT)


16.8.1 Idea
Just like the Fourier transform, but the output is real valued. The DCT is also much faster than the DFT. It is often used in compression: the very high frequency components can be discarded, resulting in less information to store.

16.8.2 Procedure
C(k, n) = \alpha(k, n) \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} f(i, j) \cos\left( \frac{(2i+1)k\pi}{2N} \right) \cos\left( \frac{(2j+1)n\pi}{2N} \right)

where \alpha(k, n) = \frac{1}{N} for k, n = 0 and \alpha(k, n) = \frac{2}{N} for k, n = 1, 2, \ldots, N-1.

16.8.3 Example
!!!

16.9 Hadamard Transform


The transform matrix contains only 1’s and -1’s: square-wave basis functions instead of sinusoidal ones!

16.10 KL, Principal Component, Hotelling, and Eigenvector Transforms

These are all names for the same transform. The new basis is an orthogonal set of eigenvectors (the eigenvectors of the covariance matrix of the data).

17 Compression
Compression is a way to represent information in a more compact way, for a variety of reasons
including storage capacity restrictions, or transfer rate requirements. There are two main divisions
of compression techniques, lossless and lossy, which are described below.
To measure how much compression has been achieved, we consider the compression ratio:

R = \frac{\text{number of bits before}}{\text{number of bits after}}
An entire field, information theory, is dedicated to precisely defining what is meant by “infor-
mation”. Because of this, it is wise to leave the word “information” at the door when talking about
compression. Instead, we consider the data to be compressed as a list of “symbols”. Sometimes the
system is binary, in which case the symbols are ’0’ and ’1’. The symbols could also be the digits
0-9, or the letters a-z.

17.1 Lossless
Compression techniques from which the original image can be recovered exactly.

17.1.1 Run Length Encoding


Used for images of few grey levels. The coding is line by line. It stores the grey level and how many
adjacent pixels are the same level.
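A sketch of run length encoding a single scanline in plain Python; the decoder simply expands each pair back into a run.

def run_length_encode(row):
    """Encode a sequence of grey levels as (value, run length) pairs."""
    runs = []
    for value in row:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1           # extend the current run
        else:
            runs.append([value, 1])    # start a new run
    return runs

# A mostly flat scanline compresses well:
print(run_length_encode([0, 0, 0, 255, 255, 0]))   # [[0, 3], [255, 2], [0, 1]]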

Example

!!!

17.1.2 Lempel-Ziv Coding


Single symbols are assigned a code and placed in a table. When a string not already in the table
occurs, it is stored in the table along with the code assigned to it.

Example

!!!

17.1.3 Huffman Coding


Huffman coding is an excellent compression method; however, it requires prior knowledge of the rate of occurrence of each symbol.

Procedure

Build a tree by repeatedly joining the two nodes with the lowest probability of occurrence. Label the top branch 0 and the bottom branch 1. Continue until a single node remains. The code for each character can then be read from the tree directly.
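A sketch of the tree-building procedure in Python using heapq, assuming the symbol probabilities are known in advance; each heap entry carries the codes of the symbols under it, and every merge prepends one bit.

import heapq

def huffman_codes(probs):
    """Build prefix codes from {symbol: probability} by merging the two rarest."""
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)          # avoids comparing dicts on equal probabilities
    while len(heap) > 1:
        p0, _, codes0 = heapq.heappop(heap)   # lowest probability
        p1, _, codes1 = heapq.heappop(heap)   # second lowest
        merged = {s: "0" + c for s, c in codes0.items()}
        merged.update({s: "1" + c for s, c in codes1.items()})
        heapq.heappush(heap, (p0 + p1, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

print(huffman_codes({"a": 0.5, "b": 0.25, "c": 0.15, "d": 0.10}))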

Example

!!!

17.2 Lossy
Compression techniques from which the original image cannot be recovered exactly.

17.2.1 Transform Coding


One of many transforms (DFT, DCT, eigenvector, etc.) can be used, after which the least important information in the image is discarded. This loss of information clearly yields compression, but the compression is irreversible.

17.3 Theoretical Compression Limit


This is called the Shannon limit: the entropy of the source gives the maximum possible lossless compression.

18 Segmentation
18.1 Idea
Group pixels into regions based on connectedness.

18.2 Connected Components Labeling


18.2.1 Procedure
Scan the image pixel by pixel (from top to bottom and left to right) to identify connected pixel regions, i.e. regions of adjacent pixels which share the same set of intensity values V. For the following we assume binary input images and 8-connectivity. Move along each row until reaching a white point p (where p denotes the pixel to be labeled at any stage in the scanning process). Examine the four neighbors of p which have already been encountered in the scan (i.e. the left neighbor, the top neighbor, and the two upper diagonals). Based on this information, the labeling of p occurs as follows:
If all four neighbors are black, assign a new label to p.
If only one neighbor is white, assign its label to p.
If more than one of the neighbors is white, assign one of their labels to p and record that the labels involved are equivalent (equivalent labels are merged in a second pass).
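In practice one typically calls a library routine. A sketch using scipy.ndimage (assumed available), with a 3x3 structuring element of ones to get 8-connectivity:

import numpy as np
from scipy import ndimage

binary = np.random.rand(64, 64) > 0.7    # stand-in binary image
eight = np.ones((3, 3), dtype=int)       # 8-connectivity
labels, num_regions = ndimage.label(binary, structure=eight)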

Example

!!!

18.3 Region Growing


18.3.1 Procedure
Start with a seed pixel. Check the neighboring pixels and add them to the region if they are similar
to the seed. Repeat for each newly added pixel. Stop if no more pixels can be added.

This is better than any type of histogram segmentation because it considers the connectedness of the grey levels, not just the similarity of intensity. It does not work well with heavily textured images because the grey levels vary too quickly.
Seed point selection is extremely important, and is heavily application dependent. As an example, if the application is to segment lit regions in an image from dark regions, a point from the highest range of the histogram may be selected as a seed point.
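A sketch with NumPy using a breadth-first search. Here “similar to the seed” means within a fixed tolerance of the seed's grey level; the tolerance and the 4-connectivity are illustrative assumptions, since the similarity test is application dependent.

import numpy as np
from collections import deque

def region_grow(image, seed, tol=10):
    """Grow a region from seed, adding 4-neighbors within tol of the seed value."""
    region = np.zeros(image.shape, dtype=bool)
    seed_value = int(image[seed])
    region[seed] = True
    queue = deque([seed])
    while queue:
        x, y = queue.popleft()
        for nx, ny in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if (0 <= nx < image.shape[0] and 0 <= ny < image.shape[1]
                    and not region[nx, ny]
                    and abs(int(image[nx, ny]) - seed_value) <= tol):
                region[nx, ny] = True
                queue.append((nx, ny))
    return region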

Example

!!!

18.4 Region Splitting and Merging


Treat the entire image as a region. Decide if all pixels contained in the region satisfy a similarity
constraint. If they do, then this is labeled as a region. If not, split the region into four equal
sub-regions and perform the same test on each sub-region.
This arbitrary choice of how to subdivide a non-uniform region often results in adjacent regions
which should not be separate. To remedy this, a merging test is performed after each split to decide
if adjacent regions should be re-combined.

Example

!!!

18.5 Watershed Segmentation


We view a greyscale image as a topographic surface. We then “flood” the surface starting at its minima, preventing the waters coming from different sources from merging. The resulting segmentation regions are called “catchment basins”. The dividing lines between the basins are called the “watersheds”: the points from which water would run down to either side if poured from above (the peaks of a mountain ridge).
An alternate procedure is to find the downstream path from each pixel to a local minimum. A
catchment basin is then defined as the set of pixels for which their downstream paths end up at the
same minimum.

Figure 42: 1D Watershed Segmentation

19 JPEG
1. Decompose the RGB image into YCbCr (luminance/chrominance) form.

2. Optionally downsample the color information.

3. Split the image into 8x8 blocks.

4. Perform the DCT on each block.

5. Quantize the DCT coefficients, then Huffman-code them in a zigzag order.

20 TV Standards
20.1 Color
Backwards compatibility is obtained by sending the luminance (brightness) information in the same place that the black and white signal used to be. The chrominance information is added to the signal.

20.2 NTSC
US standard: 525 lines (496 visible; the rest carry closed captioning, synchronization info, and vertical retrace), 30 fps (actually 29.97 = 30/1.001), 4:3 aspect ratio (to be compatible with early film). The picture is interlaced, drawn in two “fields”, so the effective field rate is 60 Hz. This matches the 60 Hz AC on the power lines, which avoids the interference that produces rolling bars.

20.3 PAL
European standard: 625 lines (576 visible), 25 fps; the European power grid is 50 Hz.
