Figure 6.5 Result of highpass filter modified by adding 0.75 to the filter
The ILPF passes all frequencies inside a circle of radius D0 without attenuation and completely attenuates all frequencies outside this circle. Its transfer function is

H(u, v) = 1  if D(u, v) ≤ D0
H(u, v) = 0  if D(u, v) > D0

where D(u, v) is the distance from point (u, v) to the center of the frequency rectangle and D0 is the cutoff frequency.
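To make this concrete, here is a minimal NumPy sketch (the function name and variables are illustrative) that applies an ideal lowpass filter to a grayscale image via the 2-D FFT:

```python
import numpy as np

def ideal_lowpass(img, D0):
    """Apply an ideal lowpass filter with cutoff frequency D0 to a grayscale image."""
    M, N = img.shape
    F = np.fft.fftshift(np.fft.fft2(img))            # centered Fourier spectrum
    u = np.arange(M) - M // 2
    v = np.arange(N) - N // 2
    D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)   # distance to spectrum center
    H = (D <= D0).astype(float)                      # ILPF: 1 inside radius D0, 0 outside
    g = np.fft.ifft2(np.fft.ifftshift(F * H))        # filter and transform back
    return np.real(g)
```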
The next figure shows a gray image with its Fourier spectrum. The circles
superimposed on the spectrum represent cutoff frequencies 5, 15, 30, 80
and 230.
Figure 7.2 (a) Original image. (b) Its Fourier spectrum
The figure below shows the results of applying ILPF with the previous
cutoff frequencies.
Figure 7.3 (a) Original image. (b) - (f) Results of ILPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.
Unlike the ILPF, the GLPF transfer function does not have a sharp transition that establishes a clear cutoff between passed and filtered frequencies. Instead, the GLPF has a smooth transition between low and high frequencies.
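For reference, the GLPF transfer function has the standard Gaussian form

H(u, v) = e^(−D²(u, v) / (2·D0²))

where D(u, v) is again the distance to the center of the frequency rectangle and D0 is the cutoff frequency (the point at which H drops to about 0.607 of its maximum). In the sketch above, replacing the ILPF mask with H = np.exp(-D**2 / (2 * D0**2)) turns it into a GLPF.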
The figure below shows the results of applying GLPF on the image in
Figure 7.2(a) with the same previous cutoff frequencies.
Figure 7.5 (a) Original image. (b) - (f) Results of GLPF with cutoff frequencies 5, 15, 30, 80,
and 230 respectively.
Figure 7.6 (a) Text of poor resolution. (b) Result of applying GLPF with cutoff=80 on (a)
GLPF can also be used for cosmetic processing prior to printing and
publishing as shown below.
Figure 7.7 (a) Original image. (b) Result of filtering with GLPF with cutoff=80
The IHPF sets to zero all frequencies inside a circle of radius D0 while
passing, without attenuation, all frequencies outside the circle.
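In general, a highpass transfer function can be obtained from the corresponding lowpass one:

H_HP(u, v) = 1 − H_LP(u, v)

so in the earlier sketch, H = (D > D0).astype(float) gives the IHPF and H = 1 - np.exp(-D**2 / (2 * D0**2)) gives the GHPF.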
The figure below shows the results of applying IHPF with cutoff
frequencies 15, 30, and 80.
Figure 7.9 (a) Original image. (b) - (d) Results of IHPF with cutoff frequencies 15, 30, and 80
respectively.
The figure below shows the results of applying GHPF with cutoff
frequencies 15, 30 and 80.
Figure 7.11 (a) Original image. (b) - (d) Results of GHPF with cutoff frequencies 15, 30, and
80 respectively.
For example, if f = {f1, f2, f3, f4, f5, f6, f7, f8} is a time signal of length 8, then the HWT decomposes f into an approximation subband containing the low frequencies and a detail subband containing the high frequencies:

Low = a = { (f1 + f2)/√2, (f3 + f4)/√2, (f5 + f6)/√2, (f7 + f8)/√2 }

High = d = { (f1 − f2)/√2, (f3 − f4)/√2, (f5 − f6)/√2, (f7 − f8)/√2 }
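A minimal NumPy sketch of one decomposition level (using the orthonormal 1/√2 normalization shown above; a plain average/difference convention would divide by 2 instead):

```python
import numpy as np

def haar_1d(f):
    """One level of the 1-D Haar wavelet transform (orthonormal normalization)."""
    f = np.asarray(f, dtype=float)
    a = (f[0::2] + f[1::2]) / np.sqrt(2)   # approximation subband (low frequencies)
    d = (f[0::2] - f[1::2]) / np.sqrt(2)   # detail subband (high frequencies)
    return a, d

a, d = haar_1d([1, 2, 3, 4, 5, 6, 7, 8])   # length-8 signal -> two length-4 subbands
```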
Figure 8.2 Example of a Haar wavelet transformed image: (b) level 1; (c) level 2; (d) level 3.
Wavelet-transformed images can be perfectly reconstructed from the four subbands using the inverse wavelet transform.
Figure 8.3 Histograms of the (a) LL subband, (b) HL subband, (c) LH subband, and (d) HH subband of the transform in Figure 8.2 (b).
Figure 8.4 (a) Gray image. (b) Its one-level wavelet transform.
Note that the horizontal edges of the original image are present in the HL subband in the upper-right quadrant of the figure above. The vertical edges of the image can similarly be identified in the LH subband in the lower-left quadrant.
To combine this information into a single edge image, we simply zero the
LL subband of the transform, compute the inverse transform, and take the
absolute value.
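A sketch of this edge-extraction procedure using the PyWavelets package (assuming pywt is available and img is a 2-D grayscale array):

```python
import numpy as np
import pywt

def wavelet_edges(img):
    """Edge image obtained by zeroing the LL subband of a one-level Haar transform."""
    cA, details = pywt.dwt2(img, 'haar')        # cA is the LL (approximation) subband
    cA = np.zeros_like(cA)                      # zero the LL subband
    edges = pywt.idwt2((cA, details), 'haar')   # compute the inverse transform
    return np.abs(edges)                        # take the absolute value
```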
The next figure shows the modified transform and the resulting edge image.
Figure 8.5 (a) Transform modified by zeroing the LL subband. (b) Resulting edge image.
The figure below shows a noisy image and its wavelet transform for two-
levels of decomposition.
Figure 8.6 (a) Noisy image. (b) Its two-level wavelet transform.
Image Restoration
Image restoration attempts to reconstruct or recover an image that has
been degraded by a degradation phenomenon. As in image enhancement,
the ultimate goal of restoration techniques is to improve an image in some
predefined sense.
Noise Models
Spatial noise is described by the statistical behavior of the gray-level
values in the noise component of the degraded image. Noise can be
modeled as a random variable with a specific probability distribution
function (PDF). Important examples of noise models include:
1. Gaussian Noise
2. Rayleigh Noise
3. Gamma Noise
4. Exponential Noise
5. Uniform Noise
6. Impulse (Salt & Pepper) Noise
Gaussian Noise
The PDF of Gaussian noise is given by

p(z) = (1 / (√(2π) σ)) e^(−(z − μ)² / (2σ²))

where z is the gray value, μ is the mean, and σ is the standard deviation.
Rayleigh Noise
The PDF of Rayleigh noise is given by

p(z) = (2/b)(z − a) e^(−(z − a)² / b)  for z ≥ a,  and p(z) = 0 for z < a.

Impulse (Salt & Pepper) Noise
The PDF of (bipolar) impulse noise is given by

p(z) = Pa for z = a,  p(z) = Pb for z = b,  and 0 otherwise.

If b > a, then gray level b appears as a light dot (salt), otherwise the gray level a appears as a dark dot (pepper).
The next figure shows degraded (noisy) images resulting from adding the previous noise models to the above test pattern image.
Figure 9.6 Images and histograms from adding Gaussian, Rayleigh, Gamma, Exponential,
Uniform, and Salt & Pepper noise.
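A brief NumPy sketch of how such noisy images can be generated (the parameter values are illustrative; img is a grayscale array in [0, 255]):

```python
import numpy as np

rng = np.random.default_rng()

def add_gaussian(img, mu=0.0, sigma=20.0):
    return np.clip(img + rng.normal(mu, sigma, img.shape), 0, 255)

def add_rayleigh(img, scale=30.0):
    return np.clip(img + rng.rayleigh(scale, img.shape), 0, 255)

def add_salt_pepper(img, p=0.05):
    out = img.copy()
    r = rng.random(img.shape)
    out[r < p / 2] = 0        # pepper: dark dots
    out[r > 1 - p / 2] = 255  # salt: light dots
    return out
```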
Order-Statistics Filters
We have used one of these filters (the median) in image enhancement. We now use additional filters (min and max) in image restoration.
Min filter
This filter is useful for finding the darkest points in an image. Also, it
reduces salt noise as a result of the min operation.
Figure 9.11 (a) image corrupted by salt noise. (b) Result of filtering (a) with a 3×3 min filter.
Max filter
This filter is useful for finding the brightest points in an image. Also,
because pepper noise has very low values, it is reduced by this filter as a
result of the max operation.
Figure 9.12 (a) image corrupted by pepper noise. (b) Result of filtering (a) with a 3×3 max
filter.
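Both filters are one-liners with SciPy (noisy_salt and noisy_pepper are assumed input arrays):

```python
from scipy import ndimage

min_filtered = ndimage.minimum_filter(noisy_salt, size=3)    # 3x3 min filter reduces salt
max_filtered = ndimage.maximum_filter(noisy_pepper, size=3)  # 3x3 max filter reduces pepper
```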
Adaptive Filters
The previous spatial filters are applied regardless of local image
variation. Adaptive filters change their behavior using local statistical
parameters in the mask region. Consequently, adaptive filters outperform
the non-adaptive ones.
Figure 9.13 (a) Image corrupted by salt&pepper noise with density 0.25. (b) Result obtained
using a 7×7 median filter. (c) Result obtained using adaptive median filter with Smax = 7.
From this example, we find that the adaptive median filter has three main
purposes:
1. to remove salt-and-pepper (impulse) noise.
2. to provide smoothing of other noise that may not be impulsive.
3. to reduce distortion, such as excessive thinning or thickening of
object boundaries.
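A straightforward (unoptimized) sketch of a standard adaptive median filter consistent with this description (the window grows from 3×3 up to Smax; the stage logic is assumed, since the text only summarizes the filter's behavior):

```python
import numpy as np

def adaptive_median(img, s_max=7):
    """Adaptive median filter with maximum (odd) window size s_max."""
    pad = s_max // 2
    padded = np.pad(img, pad, mode='reflect')
    out = img.astype(float)
    for i in range(img.shape[0]):
        for j in range(img.shape[1]):
            s = 3                                      # start with a 3x3 window
            while True:
                r = s // 2
                w = padded[i + pad - r:i + pad + r + 1,
                           j + pad - r:j + pad + r + 1]
                zmin, zmed, zmax = w.min(), np.median(w), w.max()
                if zmin < zmed < zmax:                 # median is not an impulse
                    if not (zmin < img[i, j] < zmax):  # center pixel is an impulse
                        out[i, j] = zmed
                    break                              # otherwise keep the pixel
                s += 2                                 # enlarge the window
                if s > s_max:
                    out[i, j] = zmed                   # give up: output the median
                    break
    return out
```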
Structuring Element
A morphological operation is based on the use of a filter-like binary
pattern called the structuring element of the operation. A structuring element is represented by a matrix of 0s and 1s; for simplicity, the zero entries are often omitted.
Symmetric with respect to its origin:
Lines:
0 0 0 0 1
0 0 0 1 0
0 0 1 0 0
0 1 0 0 0
1 0 0 0 0

(with the zero entries omitted, only the diagonal line of 1s is written)
Diamond:
0 1 0
1 1 1
0 1 0
Non-symmetric:
A non-symmetric structuring element is not equal to its reflection about the origin; the reflection rotates the pattern by 180°:

B̂ = { (−x, −y) : (x, y) ∈ B }
Dilation
Dilation is an operation used to grow or thicken objects in binary images.
The dilation of a binary image A by a structuring element B is defined as:

A ⊕ B = { z | (B̂)z ∩ A ≠ ∅ }

where B̂ is the reflection of B about its origin and (B̂)z is B̂ translated by z. That is, A ⊕ B is the set of all structuring element origin locations where the reflected and translated B overlaps with A by at least one element.
Solution:
We first find the reflection B̂ of the structuring element B, a vertical line of 1s:

B = B̂ =
1
1
1
1
1
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1 1 1 1 1 1 0
0 0 0 0 1 1 1 1 1 1 1 0
0 0 0 1 1 1 1 1 1 1 1 0
0 0 1 1 1 1 1 1 1 1 0 0
0 1 1 1 1 1 1 1 1 0 0 0
0 1 1 1 1 1 1 1 0 0 0 0
0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
Erosion
Erosion is used to shrink or thin objects in binary images. The erosion of a binary image A by a structuring element B is defined as:

A ⊖ B = { z | (B)z ⊆ A }

that is, the set of all structuring element origin locations where the translated B is entirely contained in A.
Solution
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 1 1 1 1 1 1 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0
Figure 10.4 (a) Binary image. (b) Eroded image.
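These operations are available in scipy.ndimage; a quick sketch with a small example object (the arrays here are illustrative):

```python
import numpy as np
from scipy import ndimage

A = np.zeros((9, 12), dtype=bool)
A[2:7, 3:9] = True                       # a simple rectangular object
B = np.ones((5, 1), dtype=bool)          # 5x1 vertical-line structuring element

dilated = ndimage.binary_dilation(A, structure=B)  # grows the object vertically
eroded = ndimage.binary_erosion(A, structure=B)    # shrinks the object vertically
```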
Opening (an erosion followed by a dilation) smooths object contours, breaks thin connections, and removes small protrusions, while closing (a dilation followed by an erosion) fills thin gaps and small holes.
Figure 10.5 (a) Original binary image. (b) Result of opening with square structuring element
of size 10 pixels. (c) Result of opening with square structuring element of size 20 pixels.
Figure 10.6 (a) Original binary image. (b) Result of opening with square structuring element
of size 13 pixels.
Figure 10.7 (a) Result of closing with square structuring element of size 10 pixels. (b) Result of closing with square structuring element of size 20 pixels.
Figure 10.8 (a) Noisy fingerprint. (b) Result of opening (a) with square structuring element of
size 3 pixels. (c) Result of closing (b) with the same structuring element.
Note that the noise was removed by opening the image, but this process
introduced numerous gaps in the ridges of the fingerprint. These gaps can
be filled by following the opening with a closing operation.
This transform is useful in locating all pixel configurations that match the B1 structure (i.e., a hit) but do not match that of B2 (i.e., a miss). Thus, the hit-or-miss transform is used for shape detection. It is defined as:

A ⊛ B = (A ⊖ B1) ∩ (Aᶜ ⊖ B2)

where B = (B1, B2) and Aᶜ is the complement of A.
Example: locate the plus-shaped configurations in the image A below using the hit-or-miss transform with the structuring elements B1 (the shape to hit) and B2 (the background corners to miss):

Image A:
0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 1 0 0 1 1 1 1 0 0
0 1 1 1 0 0 0 0 0 0 0
0 0 1 0 0 0 0 1 1 0 0
0 0 0 0 1 0 0 1 1 1 0
0 0 0 1 1 1 0 0 1 0 0
0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0

B1:
0 1 0
1 1 1
0 1 0

B2:
1 0 1
0 0 0
1 0 1
Solution:

A ⊖ B1 =
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0

Aᶜ =
1 1 1 1 1 1 1 1 1 1 1
1 1 0 1 1 1 1 1 1 1 1
1 1 0 1 1 0 0 0 0 1 1
1 0 0 0 1 1 1 1 1 1 1
1 1 0 1 1 1 1 0 0 1 1
1 1 1 1 0 1 1 0 0 0 1
1 1 1 0 0 0 1 1 0 1 1
1 1 1 1 0 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1

Aᶜ ⊖ B2 =
1 0 1 0 1 1 1 1 1 1 1
1 0 1 0 0 0 0 0 0 0 1
0 0 0 0 0 1 1 1 1 1 1
1 0 1 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 0 1
1 1 1 0 1 0 0 0 0 0 0
1 1 0 0 0 0 0 0 1 0 1
1 1 1 0 1 0 1 1 1 1 1

A ⊛ B = (A ⊖ B1) ∩ (Aᶜ ⊖ B2) =
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0
Figure 11.1 (a) Binary image. (b) Result of applying hit-or-miss transform.
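SciPy provides this transform directly; a sketch using the worked example above:

```python
import numpy as np
from scipy import ndimage

A = np.array([[0,0,0,0,0,0,0,0,0,0,0],
              [0,0,1,0,0,0,0,0,0,0,0],
              [0,0,1,0,0,1,1,1,1,0,0],
              [0,1,1,1,0,0,0,0,0,0,0],
              [0,0,1,0,0,0,0,1,1,0,0],
              [0,0,0,0,1,0,0,1,1,1,0],
              [0,0,0,1,1,1,0,0,1,0,0],
              [0,0,0,0,1,0,0,0,0,0,0],
              [0,0,0,0,0,0,0,0,0,0,0]], dtype=bool)

B1 = np.array([[0,1,0], [1,1,1], [0,1,0]], dtype=bool)  # "hit" pattern (plus shape)
B2 = np.array([[1,0,1], [0,0,0], [1,0,1]], dtype=bool)  # "miss" pattern (corners)

hm = ndimage.binary_hit_or_miss(A, structure1=B1, structure2=B2)
# hm should be True only where the plus configurations occur, as in the solution
```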
Boundary Extraction
The boundary of a set A, denoted by β(A), can be obtained by:

β(A) = A − (A ⊖ B)

that is, by first eroding A by a structuring element B and then taking the set difference between A and its erosion.
Figure 11.2 (a) Binary image. (b) Object boundary extracted
using the previous equation and 3×3 square structuring element.
Note that, because the size of the structuring element is 3×3 pixels, the resulting boundary is one pixel thick. Using a 5×5 structuring element instead produces a boundary between 2 and 3 pixels thick, as shown in the next figure.
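A sketch of boundary extraction with SciPy (A is an assumed boolean object mask):

```python
import numpy as np
from scipy import ndimage

def boundary(A, size=3):
    """beta(A) = A - (A eroded by a size x size square structuring element)."""
    eroded = ndimage.binary_erosion(A, structure=np.ones((size, size), dtype=bool))
    return A & ~eroded      # set difference between A and its erosion
```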
Thinning
Thinning means reducing binary objects or shapes in an image to strokes that are a single pixel wide. The thinning of a set A by a structuring element B is defined as:

A ⊗ B = A − (A ⊛ B) = A ∩ (A ⊛ B)ᶜ

Since we only match the pattern (shape) with the structuring element, no background operation is required in the hit-or-miss transform here.

In practice, B is a sequence of structuring elements:

{B} = {B1, B2, B3, ..., Bn}

where Bi is a rotated version of Bi−1. Thus, the thinning equation can be written as:

A ⊗ {B} = ((...((A ⊗ B1) ⊗ B2)...) ⊗ Bn)
The entire process is repeated until no further changes occur. The next
figure shows an example of thinning the fingerprint ridges so that each is
one pixel thick.
Figure 11.4 (a) Original fingerprint image. (b) Image thinned once. (c) Image thinned twice.
(d) Image thinned until stability (no changes occur).
Morphological operations can also be used to extract the skeleton of a binary object, as shown in the next figure.
Figure 11.5 (a) Bone image. (b) Skeleton extracted from (a).
Gray-scale Morphology
The basic morphological operations of dilation, erosion, opening and
closing can also be applied to gray images.
Gray-scale Dilation
The gray-scale dilation of a gray-scale image f by a structuring element b is defined as:

(f ⊕ b)(s, t) = max{ f(s − x, t − y) + b(x, y) : (x, y) ∈ Db }

where Db is the domain of b. For a flat structuring element (b = 0 over its support), this is simply the maximum of f over the neighborhood defined by b.
The figure below shows the result of dilating a gray image using a 3×3
square structuring element.
Figure 11.6 (a) Original gray image. (b) Dilated image.
Gray-scale Erosion
The gray-scale erosion of a gray-scale image f by a structuring element b is defined as:

(f ⊖ b)(s, t) = min{ f(s + x, t + y) − b(x, y) : (x, y) ∈ Db }

For a flat structuring element, this is the minimum of f over the neighborhood defined by b.
The next figure shows the result of eroding a gray image using a 3×3
square structuring element.
Figure 11.7 (a) Original gray image. (b) Eroded image.
We can see that gray-scale erosion produces the following:
1. A darker image overall.
2. Reduced sizes of the small, bright details.
The figure below shows the result of opening (an erosion followed by a dilation) a gray image.
Figure 11.8 (a) Original gray image. (b) Opened image.
Note the decreased sizes of the small, bright details, with no appreciable
effect on the darker gray levels.
The figure below shows the result of closing a gray image.
Figure 11.9 (a) Original gray image. (b) Closed image.
Note the decreased sizes of the small, dark details, with relatively little
effect on the bright features.
Applying an opening followed by a closing produces morphological smoothing, which attenuates both bright and dark artifacts and noise, as shown below.
Figure 11.10 (a) Original gray image. (b) Morphological smoothed image.
Morphological gradient
is produced by subtracting an eroded image from its dilated version. It is defined as:

g = (f ⊕ b) − (f ⊖ b)

The result highlights sharp gray-level transitions (edges) in the input image.
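With flat structuring elements, gray-scale dilation and erosion reduce to local max and min filters, so the gradient can be sketched as follows (img is an assumed grayscale array):

```python
from scipy import ndimage

dilated = ndimage.grey_dilation(img, size=(3, 3))   # local maximum (flat 3x3 SE)
eroded = ndimage.grey_erosion(img, size=(3, 3))     # local minimum (flat 3x3 SE)
gradient = dilated.astype(float) - eroded           # morphological gradient
```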
Figure 11.11 (a) Original gray image. (b) Morphological gradient.
Image Segmentation
is an image analysis method used to subdivide an image into its constituent regions or objects, depending on the types of shapes and objects searched for in the image. Image segmentation is an essential first step in most automatic pictorial pattern recognition and scene analysis tasks.
Segmentation Approaches
Image segmentation algorithms are based on one of two basic properties
of gray-level values: discontinuity and similarity.
• In the first category, the approach is to partition an image based on
abrupt discontinuity (i.e. change) in gray level, such as edges in an
image.
• In the second category, the approaches are based on partitioning an
image into regions that are similar according to a set of predefined
criteria.
Point Detection
This is concerned with detecting isolated image points relative to their neighborhood, which is an area of nearly constant gray level.
1. Simple method
The simplest point detection method works in two steps:
1. Filter the image with the mask:

-1 -1 -1
-1  8 -1
-1 -1 -1

2. Apply a threshold T to the absolute value of the filtered image R: a point is detected at (x, y) if |R(x, y)| ≥ T.
Figure 12.1 Example of point detection using simple method. (a) Original face image.
(b)-(g) Results with different thresholds
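A rough sketch of the simple method (img and the threshold T are assumed inputs):

```python
import numpy as np
from scipy import ndimage

mask = np.array([[-1, -1, -1],
                 [-1,  8, -1],
                 [-1, -1, -1]])
R = ndimage.convolve(img.astype(float), mask)  # step 1: filter with the mask
points = np.abs(R) >= T                        # step 2: threshold |R|
```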
2. Alternative method
An alternative approach to the simple method is to locate the points in a
window of a given size where the difference between the max and the
min value in the window exceeds a given threshold. This can be done
again in two steps:
1. Obtain the difference between the max value (obtained with the
order statistics max filter) and the min value (obtained with the
order statistics min filter) in the given size mask.
2. On the output image apply an appropriate threshold (e.g. the
maximum pixel value).
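A similar sketch of this max-min method (same assumed inputs):

```python
from scipy import ndimage

local_range = (ndimage.maximum_filter(img, size=3).astype(float)
               - ndimage.minimum_filter(img, size=3))  # step 1: max - min
points = local_range >= T                              # step 2: threshold
```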
The figure below shows an example of point detection in a face image
using the alternative method.
Figure 12.2 Example of point detection using alternative method. (a) Original face image.
(b)-(e) Results with different thresholds
Line Detection
Detecting a line in a certain direction requires detecting adjacent points in the image along the given direction. This can be done using filters that yield a significant response at points aligned in the given direction.
For example, the following filters respond most strongly to lines in the vertical, horizontal, +45°, and −45° directions, respectively:

Vertical:
-1 2 -1
-1 2 -1
-1 2 -1

Horizontal:
-1 -1 -1
 2  2  2
-1 -1 -1

+45°:
-1 -1  2
-1  2 -1
 2 -1 -1

−45°:
 2 -1 -1
-1  2 -1
-1 -1  2
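Each mask is applied by convolution and the response is thresholded; a sketch for the vertical mask (img and T assumed):

```python
import numpy as np
from scipy import ndimage

vertical_mask = np.array([[-1, 2, -1],
                          [-1, 2, -1],
                          [-1, 2, -1]])
R = ndimage.convolve(img.astype(float), vertical_mask)
vertical_lines = np.abs(R) >= T   # repeat with the other masks for other directions
```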
The next figure illustrates an example of line detection using the filters
above.
Figure 12.3 Example of line detection. (a) Original image. (b)-(e) Detected lines in the vertical, horizontal, +45°, and −45° directions, respectively.
Edge detection
Edge detection in images aims to extract meaningful discontinuities in pixel gray-level values. Such discontinuities can be detected using first-order derivatives (the gradient) and second-order derivatives (e.g., the Laplacian).
The 1st-order derivative of an image f(x, y) is defined as the gradient vector:

∇f = [Gx, Gy] = [∂f/∂x, ∂f/∂y]

The Sobel detector (used in the figure below) approximates Gx and Gy with the following left and right masks, respectively:

-1 -2 -1      -1 0 1
 0  0  0      -2 0 2
 1  2  1      -1 0 1

To detect:
• Horizontal edges, we filter the image f using the left mask above.
• Vertical edges, we filter the image f using the right mask above.
• Edges in both directions, we do the following:
1. Filter the image f with the left mask to obtain Gx
2. Filter the image f again with the right mask to obtain Gy
3. Compute g = √(Gx² + Gy²) or g = |Gx| + |Gy|
In all cases, we then take the absolute values of the filtered image and apply an appropriate threshold.
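SciPy has the Sobel masks built in; a sketch of the both-directions case (img and T assumed):

```python
import numpy as np
from scipy import ndimage

Gx = ndimage.sobel(img.astype(float), axis=0)  # derivative across rows: horizontal edges
Gy = ndimage.sobel(img.astype(float), axis=1)  # derivative across columns: vertical edges
g = np.abs(Gx) + np.abs(Gy)                    # approximate gradient magnitude
edges = g >= T
```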
The next figure shows an example of edge detection using the Sobel
detector.
Figure 12.4 Example of Sobel edge detection. (a) Original image.
(b)-(d) Edges detected in vertical, horizontal, and both directions, respectively.
Figure 12.5 Example of Prewitt edge detection. (a) Original image.
(b)-(d) Edges detected in vertical, horizontal, and both directions, respectively.
We can see that the Prewitt detector produces noisier results than the
Sobel detector. This is because the coefficient with value 2 in the Sobel
detector provides smoothing.
Image Compression
• Image compression means the reduction of the amount of data
required to represent a digital image by removing the redundant data.
It involves reducing the size of image data files, while retaining
necessary information.
• Mathematically, this means transforming a 2D pixel array (i.e. image)
into a statistically uncorrelated data set. The transformation is applied
prior to storage or transmission of the image. At a later time, the
compressed image is decompressed to reconstruct the original
(uncompressed) image or an approximation of it.
• The ratio of the original (uncompressed) image size to the compressed image size is referred to as the compression ratio CR:

CR = Usize / Csize

where Usize is the size of the uncompressed image and Csize is the size of the compressed image.
Example:
Consider an 8-bit image of 256×256 pixels. After compression, the image
size is 6,554 bytes. Find the compression ratio.
Solution:
Usize = (256 × 256 × 8) / 8 = 65,536 bytes
Compression Ratio = 65536 / 6554 = 9.999 ≈ 10 (also written 10:1)
This means that the original image has 10 bytes for every 1 byte in the
compressed image.
Fidelity Criteria
These criteria are used to assess (measure) image fidelity. They quantify
the nature and extent of information loss in image compression. Fidelity
criteria can be divided into two classes:
1. Objective fidelity criteria
2. Subjective fidelity criteria
Lossless compression
• It allows an image to be compressed and decompressed without losing
information (i.e. the original image can be recreated exactly from the
compressed image).
• This is useful in image archiving (as in the storage of legal or medical
records).
• For complex images, the compression ratio is limited (2:1 to 3:1). For
simple images (e.g. text-only images) lossless methods may achieve
much higher compression.
• An example of lossless compression techniques is Huffman coding.
Huffman Coding
is a popular technique for removing coding redundancy. The result of Huffman coding is a variable-length code, in which the code words have unequal lengths. Huffman coding yields the smallest possible number of bits per gray-level value.
Example:
Consider the 8-bit gray image shown below. Use Huffman coding
technique for eliminating coding redundancy in this image.
119 123 168 119
123 119 168 168
119 119 107 119
107 107 119 119
Solution:
Gray level Histogram Probability
119 8 0.5
168 3 0.1875
107 3 0.1875
123 2 0.125
Huffman source reduction proceeds by repeatedly combining the two smallest probabilities:

0.125 (123) + 0.1875 (107) = 0.3125
0.3125 + 0.1875 (168) = 0.5
0.5 + 0.5 (119) = 1

Code assignment then works back from the root, appending 0 and 1 at each split: 119 gets 1; the remaining group gets the prefix 0, within which 168 gets 00 and the last pair gets 01, which splits into 011 for 107 and 010 for 123.
Lookup table:
Gray level Probability Code
119 0.5 1
168 0.1875 00
107 0.1875 011
123 0.125 010
We use this code to represent the gray level values of the compressed
image:
1 010 00 1
010 1 00 00
1 1 011 1
011 011 1 1
Hence, the total number of bits required to represent the gray levels of the compressed image is 29 bits (reading the code words column by column): 10101011010110110000011110011.
Whereas the original (uncompressed) image requires 4*4*8 = 128 bits.
Compression ratio = 128 / 29 ≈ 4.4
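A compact sketch of Huffman code construction for this example using Python's heapq (tie-breaking may produce a different, but equally optimal, assignment of code words):

```python
import heapq
from collections import Counter

def huffman_codes(values):
    """Build a Huffman code table {gray level: bit string} from a list of values."""
    counts = Counter(values)
    # heap items: (probability, tie-break id, {symbol: code so far})
    heap = [(n / len(values), i, {g: ''}) for i, (g, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # pop the two least probable groups
        p2, _, c2 = heapq.heappop(heap)
        merged = {g: '0' + code for g, code in c1.items()}
        merged.update({g: '1' + code for g, code in c2.items()})
        heapq.heappush(heap, (p1 + p2, next_id, merged))
        next_id += 1
    return heap[0][2]

codes = huffman_codes([119] * 8 + [168] * 3 + [107] * 3 + [123] * 2)
# total encoded length is 29 bits as in the example; ties may swap the 168/107 codes
```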
Lossy compression
• It allows a loss in the actual image data, so the original uncompressed image cannot be recreated exactly from the compressed image.
• Lossy compression techniques provide higher levels of data reduction
but result in a less than perfect reproduction of the original image.
• This is useful in applications such as broadcast television and
videoconferencing. These techniques can achieve compression ratios
of 10 or 20 for complex images, and 100 to 200 for simple images.
• An example of lossy compression techniques is JPEG compression
and JPEG2000 compression.
Color Fundamentals
Colors are seen as variable combinations of the primary colors of light:
red (R), green (G), and blue (B). The primary colors can be mixed to
produce the secondary colors: magenta (red+blue), cyan (green+blue),
and yellow (red+green). Mixing the three primaries, or a secondary with
its opposite primary color, produces white light.
RGB colors are used for color TV, monitors, and video cameras.
However, the primary colors of pigments are cyan (C), magenta (M), and
yellow (Y), and the secondary colors are red, green, and blue. A proper
combination of the three pigment primaries, or a secondary with its
opposite primary, produces black.
Color characteristics
The characteristics used to distinguish one color from another are:
• Brightness: means the amount of intensity (i.e. color level).
• Hue: represents dominant color as perceived by an observer.
• Saturation: refers to the amount of white light mixed with a hue.
Color Models
The purpose of a color model is to facilitate the specification of colors in
some standard way. A color model is a specification of a coordinate
system and a subspace within that system where each color is represented
by a single point. Color models most commonly used in image processing are the RGB, CMY/CMYK, and HSI models.

The RGB Color Model
All color values R, G, and B are normalized to the range [0, 1]. However, each of R, G, and B can also be represented in the range 0 to 255.
Each RGB color image consists of three component images, one for each
primary color as shown in the figure below. These three images are
combined on the screen to produce a color image.
The total number of bits used to represent each pixel in RGB image is
called pixel depth. For example, in an RGB image if each of the red,
green, and blue images is an 8-bit image, the pixel depth of the RGB
image is 24 bits. The figure below shows the component images of an
RGB image.
The CMY Color Model
The CMY values are obtained from the RGB values by C = 1 − R, M = 1 − G, and Y = 1 − B, where all color values have been normalized to the range [0, 1].
In printing, combining equal amounts of cyan, magenta, and yellow
produce muddy-looking black. In order to produce true black, a fourth
color, black, is added, giving rise to the CMYK color model.
The figure below shows the CMYK component images of an RGB image.
Figure 14.6 A full-color image and its CMYK component images
The HSI Color Model
Converting from RGB to HSI is done as follows:

I = (R + G + B) / 3

S = 1 − 3 · min(R, G, B) / (R + G + B)

H = θ if B ≤ G, and H = 360° − θ if B > G, where
θ = cos⁻¹{ ½[(R − G) + (R − B)] / [(R − G)² + (R − B)(G − B)]^(1/2) }

The next figure shows the HSI component images of an RGB image.
In the RGB color space, each pixel is a color vector c(x, y) = [R(x, y), G(x, y), B(x, y)]ᵀ. For an image of size M×N, there are MN such vectors, c(x, y), for x = 0, 1, 2, ..., M−1 and y = 0, 1, 2, ..., N−1.
Color Transformation
As with the gray-level transformation, we model color transformations
using the expression
where f(x, y) is a color input image, g(x, y) is the transformed color output
image, and T is the color transform.
This color transform can also be written componentwise as

gi(x, y) = Ti[fi(x, y)],  i = 1, 2, ..., n

where fi and gi are the color components of f and g, n is the number of color components (n = 3 for RGB), and {T1, ..., Tn} is the set of transformation functions.
Figure 14.8 (a) Original image. (b) Result of decreasing its intensity
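For the intensity decrease of Figure 14.8, a minimal sketch (rgb is an M×N×3 float array in [0, 1]; the factor k is illustrative):

```python
import numpy as np

def scale_intensity(rgb, k=0.7):
    """g(x, y) = k * f(x, y): decrease the intensity of all three RGB components."""
    return np.clip(k * rgb, 0.0, 1.0)
```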