
INDEX

Unit 1 Digital IMAGE FUNDAMENTALS

Unit 2 IMAGE TRANSFORMS

Unit 3 IMAGE ENHANCEMENT

Unit 4 (PART – 1) IMAGE SEGMENTATION

Unit 4 (PART – 2) IMAGE MORPHOLOGY

Unit 5 IMAGE RESTORATION



Unit 1
Digital IMAGE FUNDAMENTALS
Image: An image may be defined as a two-dimensional function f(x, y), where x & y are spatial
coordinates and the amplitude of f at any pair of coordinates (x, y) is called the intensity or grey
level of the image at that point.

Digital Image: when x, y & amplitude values of f are all finite, discrete quantities we call the image a
digital image.
 Digital image is composed of a finite number of elements.
 These elements are known as pixel (picture element).
 Pixels have particular location and value.

Analog image: an analog image can be mathematically represented as a continuous range of values
representing position & intensity.

1. Digital Image Processing:


- The processing of an image by means of a computer is generally termed digital image
processing. Processing covers different techniques, such as image enhancement, restoration,
fusion, watermarking etc., applied to an image.
- The advantages of DIP are:
1. Flexibility & adaptability - no hardware modification is required.
2. Easy data storage & transmission.
- Limitations: memory & processing speed.

2. Digital Image Representation:


f(x,y) = [ f(0,0)     f(0,1)    ⋯  f(0, N−1)
           ⋮           ⋮        ⋱  ⋮
           f(M−1,0)   f(M−1,1)  ⋯  f(M−1, N−1) ]

For a binary image f(x,y) takes only the values 0 or 1, so a 3 × 3 image may be

          0 0 0
3 × 3 =   0 0 0      here, 0 – black & 1 – white
          1 1 1

3. Basic components of Image Processing System

- Computer: In an image processing system is a general purpose computer and can range from
a PC to a super computer. In dedicated applications, sometimes specially designed computers
are used to achieve a required level of performance.

- Software: for image processing consists of specialized modules that perform specific tasks.


- Mass storage: a mass storage capability is a must in image processing applications. Digital
storage for image processing applications falls into three principal categories: (1) short-term
storage for use during processing, (2) on-line storage for relatively fast recall, and (3) archival
storage, characterized by infrequent access.

- Image display: It displays the images.

- Hardcopy devices: Used for recording images include laser printers, film cameras, heat
sensitive devices, inkjet units and digital units such as optical and CDROM disks.

4. Fundamental steps in Digital Image Processing

1. Image Acquisition:
- This is the first of the fundamental steps of digital image processing. Image acquisition could
be as simple as being given an image that is already in digital form. Generally, the image
acquisition stage also involves pre-processing, such as scaling.

2. Image Enhancement:
- Image enhancement is among the simplest and most appealing areas of digital image
processing. Basically, the idea behind enhancement techniques is to bring out detail that is


obscured, or simply to highlight certain features of interest in an image, such as by changing
brightness & contrast.

3. Image Restoration:
- Image restoration is an area that also deals with improving the appearance of an image.
However, unlike enhancement, which is subjective, image restoration is objective, in the sense
that restoration techniques tend to be based on mathematical or probabilistic models of
image degradation.

4. Color Image Processing:


- Color image processing is an area that has been gaining in importance because of the
significant increase in the use of digital images over the Internet. It includes color
modeling and processing in a digital domain.

5. Wavelets and Multi-Resolution Processing:


- Wavelets are the foundation for representing images in various degrees of resolution. Images
are subdivided successively into smaller regions for data compression and for pyramidal
representation.
6. Compression:
- Compression deals with techniques for reducing the storage required to save an image or the
bandwidth needed to transmit it. Compressing data is particularly necessary for internet use.

7. Morphological Processing:
- Morphological processing deals with tools for extracting image components that are useful in
the representation and description of shape.

8. Segmentation:
- Segmentation procedures partition an image into its constituent parts or objects. In general,
autonomous segmentation is one of the most difficult tasks in digital image processing. A
rugged segmentation procedure brings the process a long way toward successful solution of
imaging problems that require objects to be identified individually.

9. Representation and Description:


- Representation and description almost always follow the output of a segmentation stage,
which usually is raw pixel data, constituting either the boundary of a region or all the points
in the region itself.
- Choosing a representation is only part of the solution for transforming raw data into a form
suitable for subsequent computer processing.
- Description deals with extracting attributes that result in some quantitative information of
interest or are basic for differentiating one class of objects from another.

10. Object recognition:


- Recognition is the process that assigns a label, such as, “vehicle” to an object based on its
descriptors.

11. Knowledge Base:


- Knowledge may be as simple as detailing regions of an image where the information of
interest is known to be located, thus limiting the search that has to be conducted in seeking
that information. The knowledge base also can be quite complex, such as an interrelated list
of all major possible defects in a materials inspection problem or an image database


containing high-resolution satellite images of a region in connection with change-detection
applications.

5. Image Sensing and Acquisition


- The images are generated by the combination of an “illumination” source and the
reflection or absorption of energy from that source by the elements of the “scene”
being imaged.
- There are 3 principal sensor arrangements (each produces an electrical output proportional
to light intensity):
(i) Single imaging sensor
(ii) Line sensor
(iii) Array sensor

- Incoming energy is transformed into a voltage by the combination of input


electrical power and sensor material.
- The output voltage waveform is the response of the sensor(s), and a digital quantity is
obtained from each sensor by digitizing its response

 Image Acquisition Using a Single Sensor:


- The most common sensor of this type is the photodiode, which is made of silicon
materials and whose output voltage waveform is proportional to light.


- The use of a filter in front of a sensor improves selectivity. For example, a green (pass)
filter in front of a light sensor favours light in the green band of the color spectrum.
- As a consequence, the sensor output will be stronger for green light than for other
components in the visible spectrum.
- 2D image generated by displacement in x- and y directions between the sensor and
the area to be imaged

- Fig. shows an arrangement used in high-precision scanning, where a film negative is


mounted onto a drum whose mechanical rotation provides displacement in one
dimension.
- The single sensor is mounted on a lead screw that provides motion in
the perpendicular direction, because mechanical motion can be controlled with high
precision. This method is an inexpensive (but slow) way to obtain high-resolution
images.

 Image Acquisition Using Sensor Strips:


- Fig (a) shows the strip provides imaging elements in one direction.
- Fig (b) shows motion perpendicular to the strip provides imaging in the other
direction.
- This is the type of arrangement used in most flatbed scanners. Sensing devices with
4000 or more in-line sensors are possible.
- In-line sensors are used routinely in airborne imaging applications, in which
the imaging system is mounted on an aircraft that flies at a constant altitude and
speed over the geographical area to be imaged.
- One-dimensional imaging sensor strips that respond to various bands of the
electromagnetic spectrum are mounted perpendicular to the direction of flight.
- The imaging strip gives one line of an image at a time, and the motion of the strip
completes the other dimension of a two-dimensional image.
- Sensor strips mounted in a ring configuration are used in medical and industrial
imaging to obtain cross-sectional (“slice”) images of 3-D objects.
- A rotating X-ray source provides illumination, and the portion of the sensors opposite
the source collects the X-ray energy that passes through the object.
- This is the basis for medical and industrial computerized axial tomography (CAT)
imaging.


 Image Acquisition using Sensor Arrays:

Fig. An example of the digital image acquisition process. (a)Energy (“illumination”) source.
(b) An element of a scene. (c) Imaging system. (d) Projection of the scene onto the image
plane. (e) Digitized image.

- This type of arrangement is found in digital cameras. A typical sensor for these
cameras is a CCD array, which can be manufactured with a broad range of sensing
properties and can be packaged in rugged arrays of 4000 * 4000 elements or more.


- The response of each sensor is proportional to the integral of the light energy
projected on to the surface of the sensor, a property that is used in astronomical and
other applications requiring low noise images.
- The first function performed by the imaging system in Fig.(c) is to collect the incoming
energy and focus it onto an image plane.
- If the illumination is light, the front end of the imaging system is a lens, which projects
the viewed scene onto the lens focal plane as Fig.(d) shows.
- The sensor array, which is coincident with the focal plane, produces outputs
proportional to the integral of the light received at each sensor.
- The output is a digital image, as shown diagrammatically in Fig.(e)

6. Sampling and Quantization:


- The output of most sensors is a continuous voltage waveform whose amplitude & spatial
behaviour are related to the physical phenomenon being sensed.
- To create a digital image, we need to convert the continuous sensed data into digital form.
This involves two processes: sampling and quantization.

- Fig (a) shows a continuous image f(x,y) that we want to convert into a digital image.
- It is continuous in the x and y coordinates and also in amplitude.
- We have to sample the function in both coordinates and quantize it in amplitude.
- Digitizing the coordinate values is called sampling; digitizing the amplitude values is called
quantization.
- Consider the line segment AB in fig (b): it is a plot of the grey levels (amplitude) of the
continuous image along AB.


- Random variation is due to image noise.


- Fig (c) shows location of each samples which is given by vertical mark on bottom part.
- Samples are shown as small white squares superimposed on the function.
- Set of discrete samples gives sampling function.
- Vertically it represents grey level values.
- Right side on Fig c shows different grey level values, ranging from black to white.
- Digital samples resulting from both sampling and quantization are shown in fig (d).
- After this process we will get image which shown in below figure.

- Quantization: the sample values are represented by a finite set of integer values. This is
known as quantization.

Question: Justify “Quality of picture depends on the number of pixels & grey levels”.
OR

Justify “Quality of an image is decided by its tonal and spatial resolution”.

Answer:

- Every image we see on a screen is actually a matrix; each element of the matrix is called a
pixel, so if the matrix is N × M the total number of pixels is N × M (see the matrix form of
f(x,y) in topic 2).
- If N × M is large, there are more pixels and the sampling rate is higher, so we get better
(spatial) resolution. The value of each pixel is known as its grey level.
- A computer understands only 0's and 1's, hence these grey levels need to be represented in
terms of 0's and 1's.
- If we have two bits to represent the grey levels, only 4 different grey levels are available:
00, 01, 10, 11. Here 00 is black, 11 is white and the remaining values are shades of grey.
Similarly, if 8 bits are used per pixel, 2^8 = 256 grey levels are available.


- So more bits means more grey levels and better (tonal) resolution. The total size of an image
is N × M × m bits, where m is the number of bits used per pixel.
- So we can say the quality of an image depends on its number of pixels & grey levels.

Question: Explain image sampling & quantization. A medical image has a size of 8 × 8
inches. The sampling resolution is 5 cycles/mm. How many pixels are required? Will an
image of size 256 × 256 be enough?
Solution:
- 1 cycle/mm = 1 line pair /mm
- 1 line pair means 1 line white and 1 line black
- For 1 line pair at least we require 2 pixels/mm
- So 5 cycle/mm = 10 pixels/mm
- Size is 8 inch × 8 inch
- 1 inch = 25.4 mm, so 8 × 25.4 = 203.2 mm
- The image is therefore 203.2 mm × 203.2 mm
- In each mm there are 10 pixels
- Total pixels = 2032 × 2032
- We require 2032 × 2032 pixels to represent the image, so 256 × 256 pixels will not be enough
to represent the given image.
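The same arithmetic can be checked with a few lines of Python (a minimal sketch; the variable names are ours):

size_inch = 8
mm_per_inch = 25.4
cycles_per_mm = 5                       # given sampling resolution
pixels_per_mm = 2 * cycles_per_mm       # 2 pixels per line pair
side_mm = size_inch * mm_per_inch       # 203.2 mm
side_pixels = int(side_mm * pixels_per_mm)
print(side_pixels)                      # 2032, so a 2032 x 2032 image is needed
print(256 >= side_pixels)               # False: 256 x 256 is not enough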

7. Isopreference Curves:


- We have seen the effect of reducing N & m in the previous topic. We still do not know the
ideal values of N & m for an image.
- T. S. Huang carried out a number of experiments varying N and k (bits per pixel)
simultaneously. Fig (a) has a woman's face, (b) a cameraman, (c) a crowd of people.
- The results were drawn on a graph. Each curve on the graph represents one image; the x axis
represents N (the number of samples) and the y axis represents k (bits per pixel). Each curve
joins combinations of N and k judged to be of equal subjective quality and is known as an
isopreference curve.
- We conclude that for more detailed images the isopreference curves become more and
more vertical. This means that for an image with a large amount of detail, only a few grey
levels are needed.

8. Image types
1) Binary image / monochrome image:
A binary image, as its name states, contains only two pixel values: 0 and 1. Here 0 refers to black
and 1 refers to white. It is also known as a monochrome image.
2) Grey scale image:
An 8-bit grey scale image has 256 different shades of grey in it. The range of values varies from
0-255, where 0 stands for black, 255 stands for white and 127 stands for mid-grey.
3) Colour image(24 bit):
24 bit colour format is also known as true colour format. In a 24 bit colour format, the 24 bits are
again distributed in three different formats of Red, Green and Blue.


- Since 24 bits divide equally among the three channels, the distribution is 8 bits for R, 8 bits
for G and 8 bits for B. A 24-bit colour image therefore supports 2^24 = 16,777,216 different
combinations of colours. A colour image can be converted to a grey scale image using the equation
- X = 0.30 R + 0.59 G + 0.11 B.
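As a sketch of how this conversion looks in code (assuming a numpy RGB array; the function name is ours):

import numpy as np

def rgb_to_grey(img):
    # X = 0.30 R + 0.59 G + 0.11 B, applied per pixel
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return (0.30 * r + 0.59 * g + 0.11 * b).astype(np.uint8)

# a single pure-red pixel maps to grey value 76 (0.30 * 255)
print(rgb_to_grey(np.array([[[255, 0, 0]]], dtype=np.uint8)))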

9. Image file format:


1) BMP (bitmap image): stores colour as well as monochrome images.
- Ex. Paint and images saved on a computer.
- Quality is good but more storage is required.
2) TIFF (also known as TIF) : TIFF stands for Tagged Image File Format.
- TIFF images create very large file sizes.
- TIFF images are uncompressed and thus contain a lot of detailed image data (which is
why the files are so big). TIFFs are also extremely flexible in terms of colour.
3) JPEG (also known as JPG): JPEG stands for Joint Photographic Experts Group, which created
this standard for this type of image formatting.
- JPEG files are images that have been compressed to store a lot of information in a small-
size file.
- Most digital cameras store photos in JPEG format, because then you can take more
photos on one camera card than you can with other formats. A JPEG is compressed in a
way that loses some of the image detail during the compression in order to make the file
small (and thus called “lossy” compression).
- JPEG files are usually used for photographs on the web, because they create a small file
that is easily loaded on a web page and also looks good.
4) GIF (Graphic Interchange Format): This format compresses images but, as different from
JPEG, the compression is lossless (no detail is lost in the compression, but the file can’t be
made as small as a JPEG).
- This format is never used for photography, because of the limited number of colours. GIFs
can also be used for animations.
5) PNG (Portable Network Graphics): It was created as an open format to replace GIF, because
the patent for GIF was owned by one company and nobody else wanted to pay licensing fees.
- It also allows for a full range of colour and better compression. It’s used almost exclusively
for web images, never for print images.
- For photographs, PNG is not as good as JPEG, because it creates a larger file. But for
images with some text, or line art, it’s better, because the images look less “bitmappy.”

10. Image resolution:


1. Spatial resolution: It depends on the numbers of pixels. The principal factor determining
spatial resolution is sampling.
2. Grey level resolution: it depends on the number of grey levels, and is the smallest
discernible change in grey level.


11. Colour models:


1. RGB models:

This is an additive model, i.e. the colours present in the light add to form new colours; it is
appropriate for the mixing of coloured light, for example. Red, green and blue are the primary
colours; they combine to form the three secondary colours yellow (red + green), cyan (blue +
green) and magenta (red + blue), and white (red + green + blue).

2. CMY model:

The CMYK color model (process color, four color) is a subtractive color model, used in color
printing, and is also used to describe the printing process itself. CMYK refers to the four inks
used in some color printing: cyan, magenta, yellow, and key (black).
C= 1-R
M=1-G
Y=1-B
3. HSI Model:
Hue: the dominant colour as perceived by an observer.
Saturation: the amount of white light mixed with the hue (its purity).
Intensity: the overall brightness of the colour.


I = (R + G + B) / 3

S = 1 − min(R,G,B) / I = 1 − 3 · min(R,G,B) / (R + G + B)

4. YIQ color model:

This model is used for colour TV. Here Y is the luminance (the only component necessary for
B&W TV). The conversion from RGB to YIQ is given by the standard NTSC matrix:

Y = 0.299 R + 0.587 G + 0.114 B
I = 0.596 R − 0.274 G − 0.322 B
Q = 0.211 R − 0.523 G + 0.312 B

The advantage of this model is that more bandwidth can be assigned to the Y component
(luminance), to which the human eye is more sensitive than to colour information.

12. Basic relationship between pixels:


Neighbours of Pixels:

- A pixel p at coordinates (x, y) has four horizontal and vertical neighbors whose coordinates
are given by (x+1, y), (x-1, y), (x, y+1), (x, y-1). This set of pixels, called the 4-neighbors of


p, is denoted by N4 (p). Each pixel is a unit distance from (x, y), and some of the neighbors
of p lie outside the digital image if (x, y) is on the border of the image.

- The four diagonal neighbors of p have coordinates (x+1, y+1), (x+1, y-1), (x-1, y+1), (x-1, y-
1) and are denoted by ND (p). These points, together with the 4-neighbors, are called the
8-neighbors of p, denoted by N8 (p). As before, some of the points in ND (p) and N8 (p)
fall outside the image if (x, y) is on the border of the image.

Adjacency:

- Two pixels are connected if they are neighbours and their grey levels satisfy some
specified criterion of similarity.

- For example, in a binary image two pixels are connected if they are 4-neighbors and have
same value (0/1).

- Let V be set of gray levels values used to define adjacency

4-adjacency: Two pixels p and q with values from V are 4- adjacent if q is in the set N4(p)

8-adjacency: Two pixels p and q with values from V are 8- adjacent if q is in the set N8(p).

m-adjacency: Two pixels p and q with values from V are m-adjacent if

i) q is in N4(p), or

ii) q is in ND(p) and the set N4(p) ∩ N4(q) has no pixels whose values are from V (i.e. is empty).


connectivity:

- To determine whether the pixels are adjacent in some sense. Let V the set of grey level
values used to define connectivity then two pixels p & q that have values from the set v are:

4 connected: if q is in set of N4(p)

8 connected: if q is in set of N8(p)

m-connected: if

i) q is in N4(p), or

ii) q is in ND(p) and the set N4(p) ∩ N4(q) is empty.

(In the accompanying figure example, V = {1, 2}.)

13. Distance Transform:


1) Euclidean Distance: the straight-line distance between two pixels. If p and q are two
pixels with coordinates (x1,y1) and (x2,y2), then

DE(p,q) = [(x1 − x2)² + (y1 − y2)²]^(1/2)

2) City block distance (D4 distance): If p and q are the two pixels with coordinates(x1,y1) and
(x2,y2) then

D(city) = D(p,q) = │ x1-x2 │ + │ y1-y2 │

3) Chess Board Distance (D8 Distance):


If p and q are the two pixels with coordinates(x1,y1) and (x2,y2) then

D8(p,q) =Max( │ x1-x2 │ , │ y1-y2 │)

4) Dm Distance: this distance is measured based on m-adjacency. Pixels p and q are m-adjacent
if i) q is in N4(p), or

ii) q is in ND(p) and the set N4(p) ∩ N4(q) is empty.

Example: Let V = {0,1}. Compute the DE, D4, D8 and Dm distances between two pixels p and q,
where the pixel coordinates of p and q are (3,0) and (2,3) respectively for the image shown.

Solution: V = {0,1} implies that the distance traversed can pass through 0 and 1.

i) Euclidean Distance:

DE(p,q) = [(3 − 2)² + (0 − 3)²]^(1/2) = √10 ≈ 3.16

(The Euclidean distance is the straight-line distance, independent of any path through the pixels.)

ii) D4 distance: D4 (p,q)= = │ x1-x2 │ + │ y1-y2 │

Coordinates of p ( 3,0) and q (2,3)

=│ 3-2 │ + │ 0-3 │

=4

iii) D8 Distance:

D8(p,q) =Max( │ x1-x2 │ , │ y1-y2 │)

= Max(│ 3-2│ , │ 0-3 │)

= Max(1,3)

=3

iv) Dm Distance: This distance is measured based on m adjacency. Pixel p and q are m adjacent if
i) q is in set of N4(p)

ii)q is in ND(p) and the set [N4(p) ∩ N4(q)] is empty


Here V = {0,1}, so we traverse the m-path shown in the accompanying figure.

Dm (p,q) = 1+1+1+1= 4

Example: Let V = {2,4}. Compute D4, D8, Dm distance between two pixels p and q.

Solution: here p(0,2) and q(3,0)


i) D4 distance: D4 (p,q) = │ x1-x2 │ + │ y1-y2 │ = │ 0-3 │ + │ 2-0 │ = 5
ii) D8 Distance : D8(p,q) =Max( │ x1-x2 │ , │ y1-y2 │)

= Max(│ 0-3│ , │ 2-0 │)

= Max(3,2)

=3

iii) Dm Distance: This distance is measured based on m adjacency. Pixel p and q are m
adjacent if i) q is in set of N4(p)

ii)q is in ND(p) and the set [N4(p) ∩ N4(q)] is empty

here V = {2,4}, so we traverse the m-path shown in the accompanying figure.

Dm = 1.4 + 1 + 1 + 1 = 4.4  (the first, diagonal step is counted with length √2 ≈ 1.4)
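The three closed-form distance measures are easy to compute directly (Dm is omitted because it depends on the pixel values along the path, i.e. on the image itself). A minimal sketch:

import numpy as np

def distances(p, q):
    # p = (x1, y1), q = (x2, y2)
    dx, dy = abs(p[0] - q[0]), abs(p[1] - q[1])
    de = np.sqrt(dx**2 + dy**2)   # Euclidean distance
    d4 = dx + dy                  # city-block (D4) distance
    d8 = max(dx, dy)              # chessboard (D8) distance
    return de, d4, d8

print(distances((3, 0), (2, 3)))  # (3.162..., 4, 3), as in the first example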



Unit 2
IMAGE TRANSFORMS
- A transform is basically a mathematical tool which allows us to move from one domain to another
(e.g. time to frequency).
- A transform does not change the information content present in the signal.
- There are two reasons for transforming an image from one representation to another:
- 1) It may isolate critical components of the image pattern so that they are directly accessible for
analysis.
- 2) It may place the image data in a more compact form so that it can be stored &
transmitted efficiently.
- Applications: image enhancement, image compression, image filtering etc.

2.1 Classification of Transform

2.1.1 1D Fourier Transform

If x(t) ↔ X(Ω), the continuous-time Fourier transform is

X(Ω) = ∫_{−∞}^{+∞} x(t) e^{−jΩt} dt

If x(t) is sampled as x(nT), the discrete-time Fourier transform is

X(e^{jω}) = Σ_{n=−∞}^{+∞} x(nT) e^{−jωn}

Here ω = ΩT = 2πf·T. With sampling interval T = 1/fs, ω = 2πf/fs; evaluating at the N discrete
frequencies f = k·fs/N gives ω = 2πk/N, i.e. ω/2π = k/N.

- The DFT of a finite-duration sequence x(n) is defined as

X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N},  k = 0, 1, 2, …, N−1
     = Σ_{n=0}^{N−1} x(n) W_N^{nk},  where W_N = e^{−j2π/N}

Example: find the DFT of the sequence x(n) = {0, 1, 2, 1}

Solution:
- We know X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N}
- Here W is the twiddle factor, W_N^{nk} = e^{−j2πkn/N}
- Here N = 4, so W_4^{nk} = e^{−j2πkn/4}; arranged as a matrix over n, k = 0, 1, 2, 3:

W = [ w4^0  w4^0  w4^0  w4^0
      w4^0  w4^1  w4^2  w4^3
      w4^0  w4^2  w4^4  w4^6
      w4^0  w4^3  w4^6  w4^9 ]

- Now find the values of the entries:
- w4^0 = e^{−j2π(0)/4} = e^0 = 1
- w4^1 = e^{−j2π/4} = cos(2π/4) − j sin(2π/4) = 0 − j1 = −j
- w4^2 = e^{−j4π/4} = e^{−jπ} = cos π − j sin π = −1
- w4^3 = e^{−j6π/4} = cos(6π/4) − j sin(6π/4) = 0 + j1 = j
- The remaining values repeat with period 4 (w4^4 = w4^0, w4^6 = w4^2, w4^9 = w4^1), so the
final twiddle-factor matrix is

W = [ 1   1   1   1
      1  −j  −1   j
      1  −1   1  −1
      1   j  −1  −j ]

So X(k) = [ 1   1   1   1 ] [ 0 ]
          [ 1  −j  −1   j ] [ 1 ]
          [ 1  −1   1  −1 ] [ 2 ]
          [ 1   j  −1  −j ] [ 1 ]

X(k) = [4  −2  0  −2]
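The twiddle-factor matrix and the matrix product can be verified in a few lines of numpy (a sketch, not part of the original notes):

import numpy as np

N = 4
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)   # W_N^{nk}
x = np.array([0, 1, 2, 1])
print(np.round(W @ x).real)            # [ 4. -2.  0. -2.]
print(np.round(np.fft.fft(x)).real)    # same result from the library FFT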

2.1.2 2D Discrete Fourier Transform

- For an N × N image, f(m,n) ↔ F(k,l):

F(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πnl/N}

- The inverse 2D DFT is

f(m,n) = (1/N²) Σ_{k=0}^{N−1} Σ_{l=0}^{N−1} F(k,l) e^{j2πmk/N} e^{j2πnl/N}

- F(k,l) = R(k,l) + j·I(k,l)

- Polar form: F(k,l) = |F(k,l)| e^{jφ(k,l)}
- where |F(k,l)| = [R²(k,l) + I²(k,l)]^{1/2}
- and φ(k,l) = tan⁻¹[I(k,l) / R(k,l)]

2.1.3 Properties of 2D DFT

1) Separable property:

Proof: we know

F(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πnl/N}

Grouping the inner sum over n,

F(k,l) = Σ_{m=0}^{N−1} [ Σ_{n=0}^{N−1} f(m,n) e^{−j2πnl/N} ] e^{−j2πmk/N}

       = Σ_{m=0}^{N−1} F(m,l) e^{−j2πmk/N}

where F(m,l) is the 1D DFT of row m. Hence the 2D DFT separates into 1D DFTs.

- Performing a 2D DFT = performing a 1D DFT 2 times:
- I) perform a 1D transform on each row of the image f(m,n) to get F(m,l);
- II) perform a 1D transform on each column of F(m,l) to get F(k,l).

Question: State and prove the translation property of the DFT.

2) Spatial Shift property (Translation property):

f(m − m0, n) ↔ e^{−j2πm0k/N} F(k,l)

Proof: we know

F(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πnl/N}

Adding & subtracting m0 in the exponent (mk = (m − m0)k + m0k):

F[f(m − m0, n)] = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m − m0, n) e^{−j2π(m−m0+m0)k/N} e^{−j2πnl/N}

= e^{−j2πm0k/N} Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m − m0, n) e^{−j2π(m−m0)k/N} e^{−j2πnl/N}

= e^{−j2πm0k/N} (DFT of f(m − m0, n) over the shifted index)

= e^{−j2πm0k/N} F(k,l)
3) Periodicity Property:
The 2D DFT F(k,l) of f(m,n) is periodic with period N:

F(k,l) = F(k + pN, l + qN)

Proof:

F(k + pN, l + qN) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πm(k+pN)/N} e^{−j2πn(l+qN)/N}

= Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πmp} e^{−j2πnl/N} e^{−j2πnq}

For any integer values of m, n, p & q, e^{−j2πmp} e^{−j2πnq} = 1,

so F(k + pN, l + qN) = F(k,l)
4) Convolution Property:
- Convolution in the spatial domain equals multiplication in the frequency domain:

f(m,n) * g(m,n) ↔ F(k,l) · G(k,l)

Proof: by the definition of (circular) convolution,

f(m,n) * g(m,n) = Σ_{a=0}^{N−1} Σ_{b=0}^{N−1} f(a,b) g(m−a, n−b)

LHS = F{f(m,n) * g(m,n)}
    = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} [ Σ_{a=0}^{N−1} Σ_{b=0}^{N−1} f(a,b) g(m−a, n−b) ] e^{−j2πmk/N} e^{−j2πnl/N}

Writing mk = (m−a)k + ak and nl = (n−b)l + bl and interchanging the order of summation,

    = Σ_{a=0}^{N−1} Σ_{b=0}^{N−1} f(a,b) e^{−j2πak/N} e^{−j2πbl/N} · Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} g(m−a, n−b) e^{−j2π(m−a)k/N} e^{−j2π(n−b)l/N}

    = F(k,l) · G(k,l)
    = RHS
5) Correlation: correlation gives the similarity between two signals. The DFT of the correlation of
two sequences x(n) & h(n) is X(−k)H(k).

Proof: the correlation is R_{x,h}(m) = Σ_{n=0}^{N−1} x(n) h(n+m), so

DFT{R_{x,h}} = Σ_{m=0}^{N−1} [ Σ_{n=0}^{N−1} x(n) h(n+m) ] e^{−j2πmk/N}

Writing m = (m+n) − n in the exponent and interchanging the sums,

= Σ_{n=0}^{N−1} x(n) e^{−j2π(−n)k/N} · Σ_{m=0}^{N−1} h(n+m) e^{−j2π(m+n)k/N}

= X(−k) H(k)
6) Scaling property: this is basically used to increase & decrease the size of an image.

DFT{f(am, bn)} = (1/ab) F(k/a, l/b)

Proof:

F[f(am, bn)] = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(am, bn) e^{−j2πmk/N} e^{−j2πnl/N}

Writing mk = (am)(k/a) and nl = (bn)(l/b),

= Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(am, bn) e^{−j2π(am)(k/a)/N} e^{−j2π(bn)(l/b)/N}

Substituting m′ = am, n′ = bn (the sums are thinned by factors a and b),

= (1/ab) F(k/a, l/b)

7) Conjugate symmetry:

If f(m,n) ↔ F(k,l), then f*(m,n) ↔ F*(−k,−l).

Proof:

F(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πnl/N}

Taking the conjugate of both sides,

F*(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f*(m,n) e^{j2πmk/N} e^{j2πnl/N}

Replacing k by −k and l by −l,

F*(−k,−l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f*(m,n) e^{−j2πmk/N} e^{−j2πnl/N} = DFT{f*(m,n)}

8) Orthogonality: the 2D DFT basis functions a_{k,l}(m,n) = (1/N) e^{j2π(mk+nl)/N} are orthonormal:

Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} a_{k,l}(m,n) a*_{k′,l′}(m,n) = δ(k − k′, l − l′)

9) Multiplication by exponential:

f(m,n) e^{j2πmk0/N} e^{j2πnl0/N} ↔ F(k − k0, l − l0)

Proof: we know

F(k,l) = Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πmk/N} e^{−j2πnl/N}

Multiplying f(m,n) by e^{j2πmk0/N} and e^{j2πnl0/N} inside the transform gives

Σ_{m=0}^{N−1} Σ_{n=0}^{N−1} f(m,n) e^{−j2πm(k−k0)/N} e^{−j2πn(l−l0)/N}

= F(k − k0, l − l0)
Example: Compute the 2D DFT of the given 3 × 3 image.

[  1 −1  1
  −1  1  1
   1  1  1 ]

Solution: 1) compute the 2D DFT.

- The DFT kernel matrix T is symmetric, so we use F = TfT.
- Here T is the twiddle-factor matrix, W_N^{nk} = e^{−j2πkn/N}:

T = [ w3^0  w3^0  w3^0
      w3^0  w3^1  w3^2
      w3^0  w3^2  w3^4 ]

- Using W_3^{nk} = e^{−j2πkn/3} (so w3^1 = −0.5 − 0.866j, w3^2 = −0.5 + 0.866j, w3^4 = w3^1):

T = [ 1   1               1
      1  −0.5 − 0.866j   −0.5 + 0.866j
      1  −0.5 + 0.866j   −0.5 − 0.866j ]

Now apply F = TfT.

- First find Tf:

Tf = [ 1            1    3
       1 + 1.732j  −2    0
       1 − 1.732j  −2    0 ]

- Then F = (Tf)T:

F = [  5            −1 + 1.732j   −1 − 1.732j
      −1 + 1.732j    2 + 3.464j    2
      −1 − 1.732j    2             2 − 3.464j ]

(Check: F(0,0) equals the sum of all pixel values, which is 5.)

Example: compute the 2D DFT of the given 4 × 4 image, and then recover the image by the
inverse 2D DFT.

[ 1 1 1 1
  1 1 1 1
  1 1 1 1
  1 1 1 1 ]

Solution:

F = TfT = [ 1  1  1  1 ] [ 1 1 1 1 ] [ 1  1  1  1 ]
          [ 1 −j −1  j ] [ 1 1 1 1 ] [ 1 −j −1  j ]
          [ 1 −1  1 −1 ] [ 1 1 1 1 ] [ 1 −1  1 −1 ]
          [ 1  j −1 −j ] [ 1 1 1 1 ] [ 1  j −1 −j ]

  = [ 16 0 0 0
       0 0 0 0
       0 0 0 0
       0 0 0 0 ]

- For the inverse 2D DFT,

- f = (1/N²)[T* F T*]; here N = 4, so the factor is 1/16. (T* is the conjugate of T; since only
the DC coefficient F(0,0) is nonzero and the first row and column of T are real, using T in
place of T* gives the same result here.)

f = (1/16) [T][F][T]

  = [ 1 1 1 1
      1 1 1 1
      1 1 1 1
      1 1 1 1 ]
Example: find 2 D DFT of following image.

f(x,y) = [ 0 1 2 1
           1 2 3 2
           2 3 4 3
           1 2 3 2 ]
Solution: F = TfT

Tf = [ 1  1  1  1 ] [ 0 1 2 1 ]
     [ 1 −j −1  j ] [ 1 2 3 2 ]
     [ 1 −1  1 −1 ] [ 2 3 4 3 ]
     [ 1  j −1 −j ] [ 1 2 3 2 ]

   = [  4   8  12   8
       −2  −2  −2  −2
        0   0   0   0
       −2  −2  −2  −2 ]

TfT = [  4   8  12   8 ] [ 1  1  1  1 ]
      [ −2  −2  −2  −2 ] [ 1 −j −1  j ]
      [  0   0   0   0 ] [ 1 −1  1 −1 ]
      [ −2  −2  −2  −2 ] [ 1  j −1 −j ]

    = [ 32  −8   0  −8
        −8   0   0   0
         0   0   0   0
        −8   0   0   0 ]

Example: find 2 D DFT of following image using 1 D DFT

f(x,y) = [ 0 1 2 1
           1 2 3 2
           2 3 4 3
           1 2 3 2 ]

Solution: we know T = [ 1  1  1  1
                        1 −j −1  j
                        1 −1  1 −1
                        1  j −1 −j ]

We shall use the DFT along the rows and then along the columns.

1 1 1 1 0 4
1 −𝑗 −1 𝑗 1 −2
DFT of first row [ ][ ] = [ ]
1 −1 1 −1 2 0
1 𝑗 −1 −𝑗 1 −2

1 1 1 1 1 8
1 −𝑗 −1 𝑗 2 −2
DFT of second row [ ][ ] = [ ]
1 −1 1 −1 3 0
1 𝑗 −1 −𝑗 2 −2

1 1 1 1 2 12
1 −𝑗 −1 𝑗 3 −2
DFT of third row [ ][ ] = [ ]
1 −1 1 −1 4 0
1 𝑗 −1 −𝑗 3 −2

1 1 1 1 1 8
1 −𝑗 −1 𝑗 2 −2
DFT of fourth row [ ][ ] = [ ]
1 −1 1 −1 3 0
1 𝑗 −1 −𝑗 2 −2

4 −2 0 −2
8 −2 0 −2
Hence we have an intermediate stage = [ ]
12 −2 0 −2
8 −2 0 −2

Now using 1 D DFT along the columns of this intermediate image we get

1 1 1 1 4 32
1 −𝑗 −1 𝑗 8 −8
DFT of first column [ ][ ] = [ ]
1 −1 1 −1 12 0
1 𝑗 −1 −𝑗 8 −8
1 1 1 1 −2 −8
1 −𝑗 −1 𝑗 −2 0
DFT of second column [ ][ ] = [ ]
1 −1 1 −1 −2 0
1 𝑗 −1 −𝑗 −2 0

1 1 1 1 0 0
1 −𝑗 −1 𝑗 0 0
DFT of third column [ ][ ] = [ ]
1 −1 1 −1 0 0
1 𝑗 −1 −𝑗 0 0

1 1 1 1 −2 −8
1 −𝑗 −1 𝑗 −2 0
DFT of fourth column [ ][ ] = [ ]
1 −1 1 −1 −2 0
1 𝑗 −1 −𝑗 −2 0

The final DFT of the entire image is

[ 32  −8   0  −8
  −8   0   0   0
   0   0   0   0
  −8   0   0   0 ]
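The row-then-column procedure is exactly what a library 2D FFT computes, so the result can be cross-checked with numpy (a sketch):

import numpy as np

f = np.array([[0, 1, 2, 1],
              [1, 2, 3, 2],
              [2, 3, 4, 3],
              [1, 2, 3, 2]])

rows = np.fft.fft(f, axis=1)           # 1D DFT of each row
F = np.fft.fft(rows, axis=0)           # 1D DFT of each column
print(np.round(F.real))                # matches the matrix above
print(np.allclose(F, np.fft.fft2(f)))  # True: separability in action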

2.1.4 Fast Fourier Transform

No. of samples N    No. of computations N² (DFT)    No. of computations N log2 N (FFT)
32                  1024                            160
128                 16384                           896
2048                4194304                         22528

- Two algorithms: i) Decimation In Time (DIT-FFT)
                  ii) Decimation In Frequency (DIF-FFT)

(4 point DIT-FFT)
Example: find DIT- FFT of given input image

0 1 2 1
1 2 3 2
2 3 4 3
1 2 3 2

Solution:

- It is a 4 × 4 image, so we need the 4-point butterfly diagram. First apply the DFT to the rows,
then to the columns.

(4-point DIT-FFT butterfly diagram: inputs f(0) and f(2) feed the first 2-point butterfly,
giving G(0) and G(1); inputs f(1) and f(3) feed the second, giving H(0) and H(1); the outputs
are then combined with the twiddle factors W4^0 and W4^1 as follows.)

G(0) = f(0) + f(2)


G(1) = f(0) - f(2)
H(0) = f(1) + f(3)
H(1) = f(1) - f(3)
X(0) = G(0) + W40H(0)
X(1) = G(1) + W41H(1)
X(2) = G(0) - W40H(0)
X(3) = G(1) -W41H(1)

- Taking first row of input image [0 1 2 1]


- Here f(0)= 0, f(1)= 1 , f(2)= 2, f(3)= 1 using above equation we can find X(0),X(1),X(2),X(3).
- So we will get value of first row of output image X and repeat the process for 2nd, 3rd, 4th row.

4 -2 0 -2
8 -2 0 -2
12 -2 0 -2
8 -2 0 -2
- Repeat this process with individual column of above matrix X and final output matrix is

32 -8 0 -8
-8 0 0 0
0 0 0 0
-8 0 0 0
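The butterfly equations above translate directly into code. A minimal 4-point sketch (the function name is ours; W4^0 = 1 and W4^1 = −j):

def dit_fft4(f):
    # stage 1: two 2-point DFTs on the even and odd samples
    G0, G1 = f[0] + f[2], f[0] - f[2]
    H0, H1 = f[1] + f[3], f[1] - f[3]
    # stage 2: combine with twiddle factors
    return [G0 + H0, G1 - 1j * H1, G0 - H0, G1 + 1j * H1]

print(dit_fft4([0, 1, 2, 1]))   # 4, -2, 0, -2 — the first row of X above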

2.1.5 Discrete Cosine Transform (DCT)

- It is used for image compression, e.g. JPEG images.
- It has only real values.
- The one-dimensional DCT of a sequence f(x) is defined as

F(u) = α(u) Σ_{x=0}^{N−1} f(x) cos[ (2x+1)uπ / 2N ],  0 ≤ u ≤ N−1

- where α(0) = √(1/N)
- and α(u) = √(2/N), 1 ≤ u ≤ N−1

- The inverse transform is given by

f(x) = Σ_{u=0}^{N−1} α(u) F(u) cos[ (2x+1)uπ / 2N ],  0 ≤ x ≤ N−1

- The two-dimensional DCT of f(x,y) is defined as

F(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[ (2x+1)uπ / 2N ] cos[ (2y+1)vπ / 2N ],  u, v = 0, 1, 2, …, N−1

- In matrix form:
- for the 1D DCT, F = c·f
- for the 2D DCT, F = c f c' (the transpose is needed because the c matrix is not symmetric)
- where c(u,v) = √(1/N),  u = 0, 0 ≤ v ≤ N−1
- and c(u,v) = √(2/N) cos[ (2v+1)uπ / 2N ],  1 ≤ u ≤ N−1, 0 ≤ v ≤ N−1.

Example: find the DCT of the sequence f(x) = {1, 2, 4, 7}

Solution: the given sequence is 1D, so F = c·f.

- N = 4, so c(u,v) is a 4 × 4 matrix.
- Using the equations above,

                  v = 0     1       2       3
c(u,v) =  u = 0 [ 0.5      0.5     0.5     0.5
          1       0.653    0.270  −0.270  −0.653
          2       0.5     −0.5    −0.5     0.5
          3       0.270   −0.653   0.653  −0.270 ]

- Now F = c·f:

F = c [1; 2; 4; 7] = [  7
                       −4.461
                        1
                       −0.317 ]
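The c(u,v) matrix and the product F = c·f can be generated programmatically (a sketch using the definitions above):

import numpy as np

def dct_matrix(N):
    C = np.zeros((N, N))
    for u in range(N):
        a = np.sqrt(1.0 / N) if u == 0 else np.sqrt(2.0 / N)
        for v in range(N):
            C[u, v] = a * np.cos((2 * v + 1) * u * np.pi / (2 * N))
    return C

C = dct_matrix(4)
print(np.round(C @ np.array([1, 2, 4, 7]), 3))
# [ 7.    -4.461  1.    -0.317]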

Example: find DCT of given 4 * 4 image

2 4 4 2
4 6 8 3
2 8 10 4
3 8 6 2

Solution: the given matrix is not symmetric, so F = c f c'

0.5 0.5 0.5 0.5 2 4 4 2 0.5 0.653 0.5 0.270


0.653 0.270 −0.270 −0.653 4 6 8 3 0.5 0.270 −0.5 −0.653
=[ ][ ][ ]
0.5 −0.5 −0.5 0.5 2 8 10 4 0.5 −0.270 −0.5 0.653
0.270 −0.653 0.653 −0.270 3 8 6 2 0.5 −0.653 0.5 −0.270
19 −0.2705 −8 0.653
2.69 −0.24 2.30 0.89
= [ ]
−3.3 1.46 1.5 −1.68
0.03 −1.60 −0.95 −0.24

Example: write the expression for a two-dimensional DCT. Also find the DCT of the given 4 × 4 image

1 2 2 1
2 1 2 1
1 2 2 1
2 1 2 1

Solution: the two-dimensional DCT is given by

F(u,v) = α(u) α(v) Σ_{x=0}^{N−1} Σ_{y=0}^{N−1} f(x,y) cos[(2x+1)uπ/2N] cos[(2y+1)vπ/2N],  u, v = 0, 1, …, N−1

- The given input matrix is not symmetric, so we have to use F = c f c':

    [ 0.5    0.5    0.5    0.5   ] [ 1 2 2 1 ] [ 0.5  0.653  0.5  0.270 ]
F = [ 0.653  0.270 −0.270 −0.653 ] [ 2 1 2 1 ] [ 0.5  0.270 −0.5 −0.653 ]
    [ 0.5   −0.5   −0.5    0.5   ] [ 1 2 2 1 ] [ 0.5 −0.270 −0.5  0.653 ]
    [ 0.270 −0.653  0.653 −0.270 ] [ 2 1 2 1 ] [ 0.5 −0.653  0.5 −0.270 ]

F = [ 6   0.3827  −1       0.9239
      0  −0.1464  −0.3827 −0.3536
      0   0        0       0
      0  −0.3536  −0.9239 −0.8536 ]

2.1.6 KL Transform
Question: write short note on KL transform.

Question: Find the KL transform of the image [ 4 −2; −1 3 ].
- In an image, neighbouring pixels are highly correlated with the centre pixel. When a compression
algorithm is applied directly to such correlated pixel values, picture quality suffers.
- The KL transform first decorrelates the data, so compression is applied to uncorrelated
coefficients and image quality is better preserved than with other compression approaches.
- For the KL transform we have to follow the steps given below.
- Step 1: Formation of vectors from the given matrix.
- Suppose matrix X = [4 −2; −1 3]; then the column vectors are X0 = [4; −1] & X1 = [−2; 3].
- Step 2: Determination of the covariance matrix.
- cov(x) = E[XXᵀ] − X̄ X̄ᵀ
- Here X̄ is the mean vector.
- X̄ = (1/M) Σ_{k=0}^{M−1} X_k, where M is the number of vectors in X
- So X̄ = (1/2){X0 + X1} = (1/2){[4; −1] + [−2; 3]} = (1/2)[2; 2] = [1; 1]
- Now X̄ X̄ᵀ = [1; 1][1 1] = [1 1; 1 1]
1
- E[XXᵀ] = (1/M) Σ_{k=0}^{M−1} X_k X_kᵀ
= (1/2)[(X0 X0ᵀ) + (X1 X1ᵀ)]
= (1/2){[4; −1][4 −1] + [−2; 3][−2 3]}
= (1/2){[16 −4; −4 1] + [4 −6; −6 9]}
= (1/2)[20 −10; −10 10]
= [10 −5; −5 5]

- So cov(x) = E[XXᵀ] − X̄ X̄ᵀ
= [10 −5; −5 5] − [1 1; 1 1]
= [9 −6; −6 4]

- Step 3: Determination of the eigenvalues using |cov(x) − λI| = 0:

| [9 −6; −6 4] − λ[1 0; 0 1] | = 0

| 9−λ   −6  |
| −6    4−λ | = 0

(9 − λ)(4 − λ) − 36 = 0
36 − 4λ − 9λ + λ² − 36 = 0
λ² − 13λ = 0
λ(λ − 13) = 0
- So λ0 = 0 & λ1 = 13
- Step 4: Determination of the eigenvectors of the covariance matrix.
- First eigenvector ϕ0: [cov(x) − λ0·I] ϕ0 = 0

( [9 −6; −6 4] − 0·[1 0; 0 1] ) [ϕ00; ϕ01] = [0; 0]

[9 −6; −6 4] [ϕ00; ϕ01] = [0; 0]

- Take ϕ01 = 1:
- 9ϕ00 − 6ϕ01 = 0
- 9ϕ00 − 6(1) = 0
- ϕ00 = 6/9 = 0.66
- So eigenvector ϕ0 = [0.66; 1]
- Similarly find eigenvector ϕ1: [cov(x) − λ1·I] ϕ1 = 0

( [9 −6; −6 4] − 13·[1 0; 0 1] ) [ϕ10; ϕ11] = [0; 0]

[−4 −6; −6 −9] [ϕ10; ϕ11] = [0; 0]

- Take ϕ11 = 1:
- −4ϕ10 − 6ϕ11 = 0
- −4ϕ10 − 6(1) = 0
- ϕ10 = −6/4 = −1.5
- So eigenvector ϕ1 = [−1.5; 1]

- Step 5: Normalization of the eigenvectors

ϕ0/‖ϕ0‖ = (1/√(ϕ00² + ϕ01²)) [ϕ00; ϕ01]
        = (1/√((0.66)² + (1)²)) [0.66; 1]
        = 0.83 [0.66; 1]
        = [0.55; 0.83]

- Similarly, normalizing the second eigenvector:

ϕ1/‖ϕ1‖ = (1/√((−1.5)² + (1)²)) [−1.5; 1]
        = 0.55 [−1.5; 1]
        = [−0.83; 0.55]
- Step 6: KL transform matrix formed from the normalized eigenvectors of the covariance
matrix (taken as columns):

T = [0.55 −0.83; 0.83 0.55]

- We check that this matrix is (approximately) unitary:

T·Tᵀ = [0.55 −0.83; 0.83 0.55] [0.55 0.83; −0.83 0.55] = [0.99 0; 0 0.99] ≈ [1 0; 0 1]
- Step 7: KL transformation of the input matrix
- Y = T[X]

Y0 = T[X0] = [0.55 −0.83; 0.83 0.55] [4; −1] = [2.2 + 0.83; 3.32 − 0.55] = [3.03; 2.77]

Y1 = T[X1] = [0.55 −0.83; 0.83 0.55] [−2; 3] = [−1.1 − 2.49; −1.66 + 1.65] = [−3.59; −0.01]

Y = [3.03 −3.59; 2.77 −0.01]

- Step 8: Reconstruction of the input values from the transformed coefficients

X0 = Tᵀ Y0 = [0.55 0.83; −0.83 0.55] [3.03; 2.77] ≈ [4; −1]

X1 = Tᵀ Y1 = [0.55 0.83; −0.83 0.55] [−3.59; −0.01] ≈ [−2; 3]

So X ≈ [4 −2; −1 3]

- This is the given input matrix (up to rounding).
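The whole example can be verified with numpy's eigendecomposition (a sketch; numpy may return the eigenvectors in a different order or with opposite signs, which is equally valid):

import numpy as np

X = np.array([[4, -2],
              [-1, 3]])                 # columns are X0 and X1
mean = X.mean(axis=1, keepdims=True)
cov = (X @ X.T) / 2 - mean @ mean.T
print(cov)                              # [[ 9. -6.] [-6.  4.]]
vals, vecs = np.linalg.eigh(cov)
print(np.round(vals))                   # eigenvalues 0 and 13
T = vecs.T                              # rows = normalized eigenvectors
Y = T @ X                               # forward KL transform
print(np.allclose(T.T @ Y, X))          # True: exact reconstruction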
Question: Justify - KL transform is known as Principal component Analysis (PCA).

Solution:

- The KL transform is based on the statistical properties of the image and has several important
properties that make it useful for image processing, particularly image compression.
- Since data from neighbouring pixels in an image is highly correlated, compressing an image
without ruining its subjective quality is quite challenging.
- By decorrelating the data, more compression can be achieved, since it is advantageous to remove
redundancies from a decorrelated data sequence. The KL transform performs this task of
decorrelating the data.
- The KL transform is used in clustering analysis to determine a new coordinate system for sample
data in which the largest variance of a projection of the data lies on the first axis, and so on.
- Because these axes are orthogonal, this approach allows the dimensionality of the data set to be
reduced by eliminating the coordinate axes with small variances.
- This data reduction technique is known as Principal Component Analysis (PCA).

2.1.7 Walsh-Hadamard transform

- The Hadamard transform is based on the Hadamard matrix, a square array whose entries
are +1 and −1.
- The Hadamard matrix of order 2 is given by H(2) = [1 1; 1 −1]
- It is an orthogonal matrix.
- For normalization we multiply the matrix by a constant factor.
- Hadamard matrices of order 2^n can be generated recursively through the Kronecker product:
H(2^n) = H(2) ⊗ H(2^{n−1})
Suppose n = 1:
H(2) = H(2) ⊗ H(2^0) = H(2)
Suppose n = 2: H(2²) = H(2) ⊗ H(2).
From the Kronecker product we get

H(4) = [ 1·H(2)   1·H(2)
         1·H(2)  −1·H(2) ]

H(4) = [ 1  1  1  1
         1 −1  1 −1
         1  1 −1 −1
         1 −1 −1  1 ]

- If x(n) is an N-point sequence of finite real values arranged in a column, the
Hadamard-transformed sequence is given by
X = T·x, i.e. X(n) = [H(N) · x(n)]
- H(N) is an N × N matrix & x(n) is the data sequence.
- The inverse Hadamard transform is given by x(n) = (1/N)[H(N) X(n)].
- If f is an N × N image & F is the transformed image, the Hadamard transform is given by
F = T f T, i.e. F = [H(N) f H(N)]

Example: compute the Hadamard transform of the data sequence {1, 2, 0, 3}

Solution: here N=4

X[n] = [H(N) . x(n)]


1 1 1 1 1
1 −1 1 −1 2
X[n] =[ ][ ]
1 1 −1 −1 0
1 −1 −1 1 3
6
−4
X[n] =[ ]
0
2

Example: compute the Hadamard transform of the image shown below

2 1 2 1
1 2 3 2
2 3 4 3
1 2 3 2

Solution: here F= TfT

F= [H (N) f H (N)]

1 1 1 1 2 1 2 1 1 1 1 1
1 −1 1 −1 1 2 3 2 1 −1 1 −1
F= [ ][ ][ ]
1 1 −1 −1 2 3 4 3 1 1 −1 −1
1 −1 −1 1 1 2 3 2 1 −1 −1 1

6 8 12 8 1 1 1 1
2 0 0 0 1 −1 1 −1
F=[ ][ ]
0 −2 −2 −2 1 1 −1 −1
0 −2 −2 −2 1 −1 −1 1

34 2 −6 −6
2 2 2 2
F= [ ]
−6 2 2 2
−6 2 2 2
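The Kronecker recursion makes the Hadamard transform a one-liner to build and apply (a sketch; np.kron implements the Kronecker product used above):

import numpy as np

def hadamard(N):
    H2 = np.array([[1, 1], [1, -1]])
    H = H2
    while H.shape[0] < N:              # H(2n) = H(2) (x) H(n)
        H = np.kron(H2, H)
    return H

H4 = hadamard(4)
print(H4 @ np.array([1, 2, 0, 3]))     # [ 6 -4  0  2]

f = np.array([[2, 1, 2, 1], [1, 2, 3, 2], [2, 3, 4, 3], [1, 2, 3, 2]])
print(H4 @ f @ H4)                     # matches the 4 x 4 result above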
2.1.8 The Haar Transform
- The Haar transform is based on a class of orthogonal matrices whose elements are 1, −1 and 0,
multiplied by powers of √2.
- Algorithm to generate the Haar basis:
- Step 1: Determine the order N of the Haar basis.
- Step 2: Determine n, where n = log2 N.
- Step 3: Determine p & q.
- Step 4: Determine k: k = 2^p + q − 1
- Step 5: Determine z: z ∈ [0,1) ⇒ z ∈ {0/N, 1/N, …, (N−1)/N}
- Step 6: if k = 0 then H0(z) = 1/√N;
otherwise

Hk(z) = Hpq(z) = (1/√N) × { +2^{p/2},  (q−1)/2^p ≤ z < (q−½)/2^p
                            −2^{p/2},  (q−½)/2^p ≤ z < q/2^p
                              0,       otherwise }

- Suppose N = 2.
- Step 1: N = 2
- Step 2: n = log2 2 = 1.
- Step 3: i) since n = 1, the only value of p is 0.
ii) So q takes the values 0 & 1.
- Step 4: Determine the values of k using k = 2^p + q − 1:

p  q  k
0  0  0
0  1  1

- Step 5: Determine the z values: z ∈ [0,1) ⇒ z ∈ {0/2, 1/2}

- Step 6: for k = 0, H0(z) = 1/√2;
otherwise

H1(z) = H01(z) = (1/√2) × { +2^0 = +1,  0 ≤ z < 1/2
                            −2^0 = −1,  1/2 ≤ z < 1
                              0,        otherwise }

- So we can write matrices for N=2


1 1
1/√𝟐 [ ]
1 −1
- Similarly we can find N=4 matrices
1 1 1 1
1 1 −1 −1
1/√𝟒 [ ]
√𝟐 −√𝟐 0 0
0 0 √𝟐 −√𝟐

- Here the Haar matrix is not symmetric, so
- for 1D, F = H·f
- for 2D, F = H·f·H'

Example: compute the Haar transform of the image shown below

2 1 2 1
1 2 3 2
2 3 4 3
1 2 3 2

Solution: here F = H f H'

F = [Haar(N)] f [Haar(N)']

With H = (1/√4) × [ 1 1 1 1 ; 1 1 −1 −1 ; √2 −√2 0 0 ; 0 0 √2 −√2 ], first (with H_u the
un-normalized Haar matrix)

H_u f = [ 6    8   12   8
          0   −2   −2  −2
          √2  −√2  −√2 −√2
          √2   √2   √2  √2 ]

and then, including both 1/√4 factors,

F = (1/4)(H_u f)H_uᵀ = [  8.5     −1.5    −0.7071   1.4142
                         −1.5      0.5     0.7071   0
                         −0.7071   0.7071  1        0
                          1.4142   0       0        0 ]
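This transform is easy to confirm numerically (a sketch):

import numpy as np

r2 = np.sqrt(2)
H = 0.5 * np.array([[1, 1, 1, 1],
                    [1, 1, -1, -1],
                    [r2, -r2, 0, 0],
                    [0, 0, r2, -r2]])   # (1/sqrt(4)) times the Haar basis
f = np.array([[2, 1, 2, 1],
              [1, 2, 3, 2],
              [2, 3, 4, 3],
              [1, 2, 3, 2]])
print(np.round(H @ f @ H.T, 4))         # matches F above (symmetric, since f is)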

2.1.9 Wavelet transform

- It is an efficient tool to represent an image.
- The wavelet transform allows multi-resolution analysis of an image.
- It divides a signal into a set of bands, each corresponding to a different frequency range.
- Applications: image compression, image denoising & clustering.
2.1.9.1 Evolution of the wavelet transform
- The Fourier transform is a powerful tool that has been available for signal analysis for many
years. It gives information about the frequency content of a signal, but not about when those
frequencies occur in time.
- (Figure: two melodies containing the same notes SA RE GA MA PA DH in different orders, plotted
against time.)
- If we change the order of the notes, the Fourier transform gives a similar output for both
melodies, because it carries no time information. This problem is overcome by the STFT
(short-time Fourier transform), which computes the Fourier transform over a sliding window.
- (Figure: STFT time-frequency tiling.)
- The drawback of the STFT is that once we choose a particular window size, it remains the same
for all frequencies.
- Many signals need a more flexible approach, where the window size can vary.
- This is known as multi-resolution analysis, which is given by the wavelet transform.
- Wavelet: a wave is an oscillating function of time or space that is periodic: an infinite-length
continuous function. A wavelet, by contrast, is a wave-like oscillation of effectively limited
duration that has an average value of zero.
- ψ(x) is called a wavelet if it has the following properties:

I) ∫_{−∞}^{+∞} ψ(x) dx = 0

II) ∫_{−∞}^{+∞} |ψ(x)|² dx < ∞

III) C = ∫_{−∞}^{+∞} (|Ψ(ω)|² / |ω|) dω < ∞  (the admissibility condition, where Ψ(ω) is the
Fourier transform of ψ(x))

- Wavelet transforms are classified into 2 categories:
- I) Discrete Wavelet Transform (DWT)
- II) Continuous Wavelet Transform (CWT)
- The CWT is given by

W_f(a,b) = (1/√a) ∫_{−∞}^{+∞} x(t) ψ*[(t − b)/a] dt

- a is the scaling parameter; it gives the frequency information in the wavelet transform.
- b is the shifting parameter; it gives the time information, as it indicates the location of the
window as it is shifted through the signal.
- The expression for the 2D CWT of an image f(x, y) is

W(m, n, a, b) = (1/√(ab)) ∫∫_{−∞}^{+∞} f(x,y) ψ*[(x − m)/a, (y − n)/b] dx dy

- where m, n are shifting parameters & a, b are scaling parameters.

2.1.9.2 Discrete Wavelet Transform (DWT)

- It is obtained by filtering the signal through a series of digital filters at different scales.
- The input signal is decomposed into low-pass & high-pass sub-bands, each consisting of half the
number of samples in the original sequence.

(Figure: x(n), 500 samples → low-pass filter → x1(n), 250 samples; x(n) → high-pass filter →
x2(n), 250 samples.)

- Because the filtering uses a convolution operation, the number of samples at the filter outputs
would grow, so we down-sample the filtered signals by 2.

(Figure: x(n), 500 samples → low pass → ↓2 → x1(n); x(n) → high pass → ↓2 → x2(n).)

- This process can be repeated to get a multi-resolution decomposition.
- A two-dimensional convolution breaks down into one-dimensional convolutions on the rows and
columns.
- Let the size of the image be N × N.
- At the first stage we convolve the rows of the image with h(n) & g(n) and discard alternate
columns (down-sample by 2).
- The columns of each of the N/2 × N arrays are then convolved with h(n) & g(n) and alternate
rows are discarded.
- The result of the entire operation is four N/2 × N/2 sub-bands.
- The upper-left square represents the smooth information (a blurred version of the image).
- The other squares represent detail information (edges) in different directions & at different
scales.
- We can also reconstruct the original image using the reverse process.
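A single level of this 2D decomposition can be written directly for the Haar case, where the low-pass and high-pass filters are a scaled sum and difference; rows are filtered and down-sampled first, then columns (a minimal sketch; the function name and sub-band labels are ours, and naming conventions vary):

import numpy as np

def haar_dwt2(img):
    img = img.astype(float)
    s = np.sqrt(2)
    # rows: low pass = scaled average, high pass = scaled difference,
    # keeping every second sample (down-sampling by 2)
    lo = (img[:, 0::2] + img[:, 1::2]) / s
    hi = (img[:, 0::2] - img[:, 1::2]) / s
    # columns: repeat on each half to get the four sub-bands
    LL = (lo[0::2, :] + lo[1::2, :]) / s   # smooth (blurred) version
    LH = (lo[0::2, :] - lo[1::2, :]) / s   # detail, one direction
    HL = (hi[0::2, :] + hi[1::2, :]) / s   # detail, another direction
    HH = (hi[0::2, :] - hi[1::2, :]) / s   # diagonal detail
    return LL, LH, HL, HH                  # each N/2 x N/2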

Unit 3
IMAGE ENHANCEMENT
- Image enhancement is one of the first steps in image processing.
- In this technique the original image is processed so that the resultant image is more
suitable for specific application.
- It is a subjective processing technique; subjective means the result may vary from person
to person.
- It does not add any extra information to the original image.
- It can be done in two domains:
1. The Spatial domain
2. The Frequency domain

3.1.IMAGE ENHANCEMENT IN SPATIAL DOMAIN


(Operates directly on Pixels)

- Suppose f(x,y) be original image where f can take values from 0-255.
- The modified image can be expressed as:

𝒈(𝒙, 𝒚) = 𝑻[𝒇(𝒙, 𝒚)]


Here, T is Transformation

- Spatial domain enhancement can be carried out in 2 different ways:


1. Point Processing
2. Neighbourhood processing


3.1.1 Point Processing


Question: write short note on Point processing
- In this processing we work with a single pixel at a time.
- So the new value depends upon the operator T and the present value f(x,y) (T could be,
e.g., the absolute-value operator |x|).
- Some examples:
1. Digital Negative
2. Contrast Stretching
3. Thresholding
4. Grey Level Slicing
5. Bit Plan Slicing
6. Dynamic Range Compression (Log Transformation)
7. Power Law Transformation

s = T(r)
r = input pixel Intensity
s = output pixel Intensity
T is a function

1. Digital Negative

𝑺 = (𝑳 − 𝟏) − 𝒓
L is number of Grey levels
- In this case, L=256, thus, S = (256 – 1) – r
- So we can write, S = 255- r
- Thus, S = 255 for r=0


2. Contrast Stretching

Fig. a) Form of T function b) Low-contrast image c) Contrast stretching d) Threshold function

- In contrast stretching, we increase the contrast of an image by making the dark portions
darker and the bright portions brighter.

𝛼𝑟, 0 ≤ 𝑟 < 𝑟1
𝑆 = { 𝛽(𝑟 − 𝑟1 ) + 𝑆1 , 𝑟1 ≤ 𝑟 < 𝑟2
𝛾(𝑟 − 𝑟2 ) + 𝑆2 , 𝑟2 ≤ 𝑟 < 𝐿 − 1

- We make dark area darker by assigning a slope less than 1 &make bright area brighter
by assigning a slope greater than 1.

3. Thresholding
 Extreme contrast is known as Thresholding.

- In contrast stretching figure a), 𝑖𝑓 𝑟1 = 𝑟2 , 𝑠1 = 0 & 𝑠2 = 𝐿 − 1

- We get Thresholding function,


𝑠 = 0, 𝑖𝑓 𝑟 ≤ 𝑎
𝑠 = 𝐿 − 1, 𝑖𝑓 𝑟 > 𝑎

- Thresholding has only 2 values: black or white. (Threshold image has maximum
contrast)

4. Grey Level Slicing (Intensity Slicing)

- When we have to highlight a specific range of grey values, such as enhancing flaws in an
X-ray or CT image, we use a transformation known as Grey Level Slicing.

𝐿 − 1, 𝑎 ≤ 𝑟 ≤ 𝑏
𝑠={ (Without background)
0, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒

𝐿 − 1, 𝑎 ≤ 𝑟 ≤ 𝑏
𝑠={ (With background)
𝑟, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒

(Transfer functions: without background, left; with background, right)
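These point operations are one-liners on an image array. A minimal sketch (assuming 8-bit images, so L − 1 = 255; the function names are ours):

import numpy as np

def negative(img, L=256):
    return (L - 1) - img                     # S = (L-1) - r

def threshold(img, a, L=256):
    return np.where(img > a, L - 1, 0)       # extreme contrast

def grey_slice(img, a, b, L=256, background=False):
    # highlight the range [a, b]; keep or suppress the rest
    rest = img if background else 0
    return np.where((img >= a) & (img <= b), L - 1, rest)

img = np.array([[0, 100, 200]])
print(negative(img))          # [[255 155  55]]
print(threshold(img, 128))    # [[  0   0 255]]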

5. Bit Plane Slicing


Question: Explain Bit plane slicing with application.
- Suppose we consider 256*256*8 image pixels.
- Black – 00000000 0 level
- White – 11111111 255 level
- Remaining 254 are shades of grey
- In Bit Plane Slicing, we have to consider particular bit of pixel and draw an image.
- Suppose make image of only LSB, only MSB like total 8 different images.
- All 8 images will be binary.

Original image:  1 2 0      Binary image:  001 010 000
                 4 3 2                     100 011 010
                 7 5 2                     111 101 010

LSB plane:  1 0 0     Middle plane:  0 1 0     MSB plane:  0 0 0
            0 1 0                    0 1 1                 1 0 0
            1 1 0                    1 0 1                 1 1 0

- Observing the images, we conclude that the higher-order bits contain the majority of the
visually significant data, while the lower-order bits contain the subtle details of the
image.


- Bit plane slicing can be used for image compression: we can remove the lower-order bits and
transmit only the higher-order bits.
- Application: Steganography
- Steganography is the art of hiding information. It is a technique in which secret data is hidden
in a carrier signal.

- The LSB of the carrier image is replaced by the MSB of the secret data.

Carrier image pixel: 50 (00110010)        Secret image pixel: 150 (10010110)

- The MSB of the secret image is 1, which replaces the LSB bit 0 of the carrier image. So the
final value of the stego image pixel is 00110011 = 51.

6. Dynamic Range Compression (Log Transformation)


Question: Explain Dynamic range compression with application.

- Dynamic range of the image exceeds the capability of the display devices.
- Some images have pixels with high value (intensity) and some images with low value.
- So we cannot see the low value pixels in the image. For example, in day time we cannot
see stars because sun has high intensity compare with stars so that the eye cannot
adjust to such a large dynamic range.
- In image processing, a classic example of such large differences in grey levels is the
Fourier spectrum. Only some of its values are large, while most of the values are too small;
the dynamic range of the pixels is of the order of 10^6. Hence, when we plot the spectrum, we
see only small dots which represent the large values.


- Sometimes we need to be able to see the small values as well. The technique used to
compress the dynamic range of pixels is known as Dynamic Range Compression.
- For this technique we could use LOG operator.
𝑆 = 𝑐 log(1 + 𝑟)
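A sketch of the log operator, with c chosen so that the largest possible input maps to the top of the display range (an assumption; any positive c works):

import numpy as np

def log_transform(img, L=256):
    c = (L - 1) / np.log(L)                   # scale so that r = L-1 maps to L-1
    return c * np.log1p(img.astype(float))    # S = c * log(1 + r)

print(log_transform(np.array([0, 255])))      # [  0. 255.]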

7. Power Law Transformation


Formula
𝑔(𝑥, 𝑦) = 𝑐 ∗ 𝑓(𝑥, 𝑦)𝛾

𝑆 = 𝑐𝑟 𝛾

𝑐 𝑎𝑛𝑑 𝛾 𝑎𝑟𝑒 𝑝𝑜𝑠𝑖𝑡𝑖𝑣𝑒 𝑐𝑜𝑛𝑠𝑡𝑎𝑛𝑡𝑠

- γ is the gamma value; for different values of γ we get different curves.

3.1.2 Neighborhood Processing


- In point processing we consider one pixel at a time and modify it.
- In neighbourhood processing we change the value of pixel f(x,y) based on the values
of its 8 neighbours.
y-1 y y+1
x-1 f(x-1,y-1) f(x-1,y) f(x-1, y+1)
x f(x, y-1) f(x, y) f(x, y+1)
x+1 f(x+1, y-1) f(x+1,y) f(x+1, y+1)
(3 x 3 neighbourhood)

𝒘𝟏 𝒘𝟐 𝒘𝟑
𝒘𝟒 𝒘𝟓 𝒘𝟔
𝒘𝟕 𝒘𝟖 𝒘𝟗
(3 x 3 Mask)

- Most of images background is considered to be a low frequency region and edges are
considered to be high frequency regions.
- Low Pass Filter removes noise and edges


- High Pass Filter removes background


- Noise:
 Gaussian Noise
 Salt and Pepper
 Rayleigh
 Gamma
 Exponential
 Uniform

- Salt (White) and Pepper (Black) Noise


- Salt-and-pepper noise is a form of noise sometimes seen on images. This noise can be
caused by sharp and sudden disturbances in the image signal. It presents itself as
sparsely occurring white and black pixels.
- In this noise, black dots appear on a white background and white dots on a black
background.

3.1.2.1. Low Pass Averaging Filter

The low-pass filtering mask is

(1/9) [1 1 1
       1 1 1
       1 1 1]

10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50
50 50 50 50 50 50 50 50

- Multiply each pixel value of image with corresponding pixel value of mask.


- The centre value of each 3 × 3 neighbourhood is replaced by the average of the 3 × 3 values.
Move the mask from the top-left corner to the bottom-right corner of the input image,
replacing each centre value with the average.
- Final Matrix
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
10 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 10
23.3 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 23.3
36.6 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 36.6
50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50
50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50
[ 50 ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ ⋯ 50 ]
- From final matrix we can conclude that the edges (where pixel values are changed
from 10 to 50 in input image) are blurred due to this type of filtering.

3.1.2.2. Low Pass Median Filter


- The averaging filter removes the noise but also blurs the edges.
- So if the image with Salt & Pepper noise we have to use Median filter instead of
averaging filter.
- Example
1 5 7
[2 4 6]
3 2 1

- First arrange pixels in ascending order.


- 1 1 2 2 3 4 5 6 7
- Middle value of above ascending order is 3. So centre value of given matrix 4 is
replaced by 3.

1 5 7
[2 3 6]
3 2 1
Example: Apply a median filter to the given input matrix using a 3 x 3 window
18 22 33 25 32 24
[34 128 24 172 26 23]
22 19 32 31 28 26
Solution:
- First consider left top 3 x 3 matrix of input matrix
- Arrange all value of this 3 x 3 matrix in ascending order 18 19 22 22 24 32 33 34 128
- Middle value of above ascending order is 24. So centre value of taken 3 x 3 input matrix
128 is replaced by 24.
- Now consider next 3 x 3 matrix of input matrix
- Arrange all value of this 3 x 3 matrix in ascending order 19 22 24 25 31 32 33 128 172


- Middle value of above ascending order is 31. So centre value of taken 3 x 3 input
matrix 24 is replaced by 31.
- Repeat this process from top to bottom row and left to right column.
- So Final Matrix is

18 22 33 25 32 24
[34 24 31 31 26 23]
22 19 32 31 28 26
- From the result we can conclude that if a pixel's value is very different from its
neighbours in the input image, it is replaced by a value correlated with the neighbourhood.
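The procedure above, applied only to interior pixels (no padding), looks like this in code (a sketch):

import numpy as np

def median_filter3(img):
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            # median of the 3 x 3 neighbourhood replaces the centre
            out[i, j] = np.median(img[i-1:i+2, j-1:j+2])
    return out

img = np.array([[18, 22, 33, 25, 32, 24],
                [34, 128, 24, 172, 26, 23],
                [22, 19, 32, 31, 28, 26]])
print(median_filter3(img))   # middle row becomes [34 24 31 31 26 23]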

3.1.2.3. High Pass Filter


- It removes background information and highlight edges of image.

10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
10 10 10 10 10 10 10 10
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100
100 100 100 100 100 100 100 100

(1/9) [−1 −1 −1
       −1  8 −1
       −1 −1 −1]
- Apply the mask to the input image: multiply each neighbourhood pixel by the corresponding
coefficient, add the products, take 1/9 of the sum and replace the centre value with the result.
- Repeat this process from top to bottom row and left to right column.
- Negative values in the output image should be considered zero.

0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
30 30 30 30 30 30 30 30
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0


- From the output image we can say High Pass filter removes background detail by
placing zero values and highlight only edges.

3.1.2.4. High Boost Filter


- Using High Pass Filter we lost background information but sometimes we need this
information for some applications. So we have to use High Boost Filter.
- In High Boost filter we pass some background information along with high frequency
content.
𝐻𝑖𝑔ℎ 𝐵𝑜𝑜𝑠𝑡 = (𝐴) 𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙 − 𝐿𝑜𝑤 𝑃𝑎𝑠𝑠
= (𝐴 − 1) 𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙 + 𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙 − 𝐿𝑜𝑤 𝑃𝑎𝑠𝑠
= (𝐴 − 1) 𝑂𝑟𝑖𝑔𝑖𝑛𝑎𝑙 + 𝐻𝑖𝑔ℎ 𝑃𝑎𝑠𝑠

𝑖𝑓 𝐴 = 1, 𝑡ℎ𝑒𝑛
𝐻𝑖𝑔ℎ 𝐵𝑜𝑜𝑠𝑡 = 𝐻𝑖𝑔ℎ 𝑃𝑎𝑠𝑠

- This technique is known as Unsharp Masking.

-1 -1 -1
-1 x -1
-1 -1 -1
- Suppose x= 9A-1
- If A = 1 then x= 8
- So mask matrix becomes
-1 -1 -1
-1 8 -1
-1 -1 -1
- If A = 1.1 then x= 8.9
- So mask matrix becomes
-1 -1 -1
-1 8.9 -1
-1 -1 -1

- For different value of A we can make different mask for high boost filter.

Example: Consider the following image


0 2 1
1 100 2
2 0 1

 Perform Low Pass filtering


 Perform Median filtering
 Find High Pass filtered output
 Comment on result.


Solution: zero padding

0 0 0 0 0
0 0 2 1 0
0 1 100 2 0
0 2 0 1 0
0 0 0 0 0

 1. Low-pass averaging filter


Repeat the process of the low-pass averaging filter, considering the 3 × 3 mask

(1/9) [1 1 1
       1 1 1
       1 1 1]
So resultant matrix is
11.44 11.77 11.66
11.66 12.11 11.77
11.44 11.77 11.44
2. Median filter
Use a similar 3 × 3 window and repeat the process of the median filter.
The first 3 × 3 neighbourhood arranged in ascending order:
0 0 0 0 0 0 1 2 100, so the centre value 0 is replaced by 0.
Next 3 × 3 neighbourhood: 0 0 0 0 1 1 2 2 100, so 2 is replaced by 1.
Repeating this process, the resultant matrix is

0 1 0
0 1 1
0 1 0

3. High Pass Filter

        -1 -1 -1
(1/9) × -1  8 -1
        -1 -1 -1

Using this above mask repeat process of high pass filter


So resultant matrix is

-11.44  -9.77 -10.66
-10.66  87.88  -9.77
 -9.44 -11.77 -10.44
4. Compared with the low pass averaging and high pass filters, the median filter gives more correlated data.

Example: Consider the following image


0 5 4
7 120 5
4 3 7

 Perform Low Pass filtering


 Perform Median filtering
 Find High Pass filtered output
 Compare result of 1 & 2.

Solution: zero padding


0 0 0 0 0
0 0 5 4 0
0 7 120 5 0
0 4 3 7 0
0 0 0 0 0

 1. Repeat process of low pass averaging filter. Consider 3 x 3 mask

        1 1 1
(1/9) × 1 1 1
        1 1 1

So resultant matrix is
14.67 15.67 14.89
15.44 17.22 16.00
14.89 16.22 15.00

2. Median filter
- Use similar 3 x 3 mask matrix and repeat process of median filter

0 4 0
3 5 4
0 4 0
3. High Pass Filter

        -1 -1 -1
(1/9) × -1  8 -1
        -1 -1 -1
Using this above mask repeat process of high pass filter

For the centre pixel: (1/9)[8(120) − (0 + 5 + 4 + 7 + 5 + 4 + 3 + 7)] = (1/9)(960 − 35) = 102.78

- Similarly we can find all the values, so the resultant matrix is

-14.66 -10.66 -10.88
 -8.44 102.78 -11
-10.88 -13.22  -8

4. Comparing the results of 1 & 2, we can say the median filter gives more correlated pixel values.

Example: Obtain the digital negative of following 8 bits per pixel image

121 205 217 156 151


139 127 157 117 125
252 117 236 138 142
227 182 178 197 242
201 106 119 251 240

 8-bit image, thus 2^8 = 256 levels


Minimum grey level = 0
Maximum grey level = 255
𝑆(𝑥, 𝑦) = 255 − 𝑟(𝑥, 𝑦)

Here r(x,y) is input image so first pixel value is 255-121=134 & so on.

134 50 38 99 104
116 128 98 138 130
3 138 19 117 113
28 73 77 58 13
54 149 136 4 15
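The same computation in a couple of lines of Python (a sketch; numpy assumed):

    import numpy as np

    img = np.array([[121, 205, 217, 156, 151],
                    [139, 127, 157, 117, 125],
                    [252, 117, 236, 138, 142],
                    [227, 182, 178, 197, 242],
                    [201, 106, 119, 251, 240]], dtype=np.uint8)

    negative = 255 - img   # s(x,y) = (L-1) - r(x,y) for an 8-bit image
    print(negative)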

Example: For a given image find: 1) Digital Negative of an image. 2) Bit Plane Slicing.
4 3 2 1
3 1 2 4
5 1 6 2
2 3 5 6
 Digital Negative
Max value of pixel is 6 so we need 3 bit for binary representation of this pixel value.
2^3 = 8, so the total grey levels are 0 to 7


𝑆(𝑥, 𝑦) = 7 − 𝑟(𝑥, 𝑦)
3 4 5 6
4 6 5 3
2 6 1 5
5 4 2 1
 Bit Plane Slicing

(The slicing below is applied to the negated image obtained above.)

011 100 101 110
100 110 101 011
010 110 001 101
101 100 010 001

MSB plane:    Middle plane:    LSB plane:
0 1 1 1       1 0 0 1          1 0 1 0
1 1 1 0       0 1 0 1          0 0 1 1
0 1 0 1       1 1 0 0          0 0 1 1
1 1 0 0       0 0 1 0          1 0 0 1
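Bit planes can be pulled out with shifts and masks; a short sketch (variable names are ours):

    import numpy as np

    pixels = np.array([[3, 4, 5, 6],
                       [4, 6, 5, 3],
                       [2, 6, 1, 5],
                       [5, 4, 2, 1]])   # the negated image sliced above

    msb    = (pixels >> 2) & 1   # bit 2, the MSB of a 3-bit value
    middle = (pixels >> 1) & 1
    lsb    = pixels & 1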

Example : For following image find Contrast Stretching

r2 = 5 , r1 = 3 , s2 = 6 , s1 = 2

4 3 2 1
3 1 2 4
f(x,y)=
5 1 6 2
2 3 5 6

 The image is 3 bit: 2^3 = 8 grey levels, 0 to 7.

α = s1/r1 = 2/3 = 0.66

β = (s2 − s1)/(r2 − r1) = (6 − 2)/(5 − 3) = 2

γ = ((L−1) − s2)/((L−1) − r2) = (7 − 6)/(7 − 5) = 0.5

s = { α·r,              0 ≤ r < 3
    { β(r − r1) + s1,   3 ≤ r < 5
    { γ(r − r2) + s2,   5 ≤ r ≤ 7
r    s
0    s = αr = 0.66 × 0 = 0
1    s = αr = 0.66 × 1 = 0.66
2    s = αr = 0.66 × 2 = 1.32
3    s = β(r − r1) + s1 = 2(3 − 3) + 2 = 2
4    s = β(r − r1) + s1 = 2(4 − 3) + 2 = 4
5    s = γ(r − r2) + s2 = 0.5(5 − 5) + 6 = 6
6    s = γ(r − r2) + s2 = 0.5(6 − 5) + 6 = 6.5
7    s = γ(r − r2) + s2 = 0.5(7 − 5) + 6 = 7

          4    2    1.32 0.66        4 2 1 1
S(x,y) =  2    0.66 1.32 4      =>   2 1 1 4
          6    0.66 6.5  1.32        6 1 7 1
          1.32 2    6    6.5         1 2 6 7
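A sketch of this piecewise-linear mapping in Python (parameter defaults taken from this example):

    def stretch(r, r1=3, s1=2, r2=5, s2=6, L=8):
        # Piecewise-linear contrast stretching with break points (r1,s1), (r2,s2)
        if r < r1:
            return (s1 / r1) * r                                   # slope alpha
        elif r < r2:
            return (s2 - s1) / (r2 - r1) * (r - r1) + s1           # slope beta
        return ((L - 1) - s2) / ((L - 1) - r2) * (r - r2) + s2     # slope gamma

    print([round(stretch(r), 2) for r in range(8)])
    # [0.0, 0.67, 1.33, 2, 4, 6.0, 6.5, 7.0]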

Example: For 3 bit 4x4 image, perform the following operation:


1. Negation
2. Thresholding with T=4
3. Clipping with r1 = 2 and r2 = 5
4. Intensity level slicing with and without back ground r1 = 2 and r2 = 5

1 2 3 0
2 4 6 7
5 2 4 3
3 2 6 1

 The image is 3 bit: 2^3 = 8 grey levels, 0 to 7.


1. S = 7 - r

6 5 4 7
5 3 1 0
2 5 3 4
4 5 1 6
2.

s = { 0,      r < T
    { L − 1,  r ≥ T        (with T = 4)
0 0 0 0
0 7 7 7
7 0 7 0
0 0 7 0

3. Clipping with r1 = 2 and r2 = 5


s = { L − 1,  2 ≤ r ≤ 5
    { 0,      otherwise

0 7 7 0
7 7 0 0
7 7 7 7
7 7 0 0

4 Intensity level slicing with and without back ground r1 = 2 and r2 = 5

s = { L − 1,  2 ≤ r ≤ 5     (without background)
    { 0,      otherwise

0 7 7 0
7 7 0 0
7 7 7 7
7 7 0 0

s = { L − 1,  2 ≤ r ≤ 5     (with background)
    { r,      otherwise

1 7 7 0
7 7 6 7
7 7 7 7
7 7 6 1
Question: Show that original image - LPF image = HPF image.

Solution:

Z1 Z2 Z3 1/9 1/9 1/9 -1/9 -1/9 -1/9

Z4 Z5 Z6 1/9 1/9 1/9 -1/9 8/9 -1/9

Z7 Z8 Z9 1/9 1/9 1/9 -1/9 -1/9 -1/9

Original Image Low pass filter High pass filter

- When we apply the LPF on the image, the center pixel z5 changes to
- 1/9[Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9]
- Original − Low Pass = Z5 − 1/9[Z1 + Z2 + Z3 + Z4 + Z5 + Z6 + Z7 + Z8 + Z9]
= Z5 − Z1/9 − Z2/9 − Z3/9 − Z4/9 − Z5/9 − Z6/9 − Z7/9 − Z8/9 − Z9/9
= 8Z5/9 − 1/9[Z1 + Z2 + Z3 + Z4 + Z6 + Z7 + Z8 + Z9]
- This is nothing but a high pass filter mask
        -1 -1 -1
(1/9) × -1  8 -1
        -1 -1 -1

3.2 IMAGE ENHANCEMENT IN FREQUENCY DOMAIN


- Enhancement can be carried out either in the spatial domain or in the frequency domain (the figure shows block diagrams of the two approaches).
- To perform filtering, we need to know where the frequencies reside in the


Fourier plot.
- We have seen that for a 1D signal f(t), the 1D DFT places the d.c. term at 0, and the frequency increases as we move away from it, the maximum being at N/2 (figure: spectrum plotted from −N/2 to N/2).
- By using the translation property, the spectrum can be shifted to run from 0 to N, with the low frequencies at the centre and the high frequencies towards 0 and N (figure: shifted spectrum).

- Hence we conclude that in the (centred) Fourier spectrum, the centre is where the low frequencies lie, and as we go away from the centre we encounter the high frequencies.
- The slowly varying content of an image corresponds to low frequencies, while the edges of the image correspond to high frequencies.

3.2.1. Low frequency domain filters


i) Ideal low pass filter: this filter is simplest of three low pass filters.
- This filter cut off all high frequency components of Fourier transform that are at
distance greater than a specified distance.

H(u,v) = { 1,  D(u,v) ≤ D0
         { 0,  D(u,v) > D0

- D(u, v) is the distance from the point (u, v) to the origin of the frequency rectangle
for an MxN image.
- D(u,v) = [(u − M/2)² + (v − N/2)²]^(1/2)
- For u = M/2, v = N/2, we get D(u,v) = 0.
- How do we decide the value of D0 (one that gives a better output)?
- One way is to compute circles that enclose a specified fraction of the total image power Ptotal:
P(u,v) = |F(u,v)|² = R²(u,v) + I²(u,v)
Ptotal = Σ (u = 0 to M−1) Σ (v = 0 to N−1) P(u,v)

ii) Butterworth low pass filter: Transfer function is given by


H(u,v) = 1 / (1 + [D(u,v)/D0]^(2n))

- For low values of n the Butterworth low pass filter approaches a Gaussian low pass filter.
- For high values of n the Butterworth low pass filter approaches the ideal low pass filter.

iii) Gaussian low pass filter: the transfer function is given by

H(u,v) = e^(−D²(u,v) / 2σ²)

- Here σ is the standard deviation, a measure of the spread of the Gaussian curve.
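The three low pass transfer functions can be generated on a centred frequency grid; a sketch (the function name, the D0 default and the choice σ = D0 are ours):

    import numpy as np

    def lowpass_transfer(M, N, D0=30, kind="ideal", n=2):
        # Centred distance D(u,v) from the middle of an M x N frequency grid
        u = np.arange(M) - M / 2
        v = np.arange(N) - N / 2
        D = np.sqrt(u[:, None] ** 2 + v[None, :] ** 2)
        if kind == "ideal":
            return (D <= D0).astype(float)
        if kind == "butterworth":
            return 1.0 / (1.0 + (D / D0) ** (2 * n))
        return np.exp(-D ** 2 / (2 * D0 ** 2))   # Gaussian, taking sigma = D0

Filtering itself is then G(u,v) = H(u,v) · F(u,v) on the centred spectrum, e.g. np.fft.fftshift(np.fft.fft2(img)), followed by the inverse transform.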
3.2.2. High frequency domain filters


- Edges and other abrupt changes in the grey levels are associated with high frequency components; these filters pass them while removing the low frequency background:
Hhp(u,v) = 1 − Hlp(u,v)
i) Ideal high pass filter:
H(u,v) = { 0,  D(u,v) ≤ D0
         { 1,  D(u,v) > D0

ii) Butterworth high pass filter:


Hhp(u,v) = 1 − Hlp(u,v)
Hhp,BW(u,v) = 1 − Hlp,BW(u,v)
We know
Hlp,BW(u,v) = 1 / (1 + [D(u,v)/D0]^(2n))
Hhp,BW(u,v) = 1 − 1 / (1 + [D(u,v)/D0]^(2n))

Take [D(u,v)/D0]^(2n) = X:
= 1 − 1/(1 + X)
= (1 + X − 1)/(1 + X)
= X/(1 + X)
= 1/(1 + 1/X)

Hhp,BW(u,v) = 1 / (1 + [D0/D(u,v)]^(2n))
iii) Gaussian High Pass Filter


H_Gaussian(HP)(u,v) = 1 − H_Gaussian(LP)(u,v) = 1 − e^(−D²(u,v) / 2σ²)

iv) High Boost Filtering (Unsharp masking)


- We know,
High Boost = (A − 1) Original + High Pass
fHB(x,y) = (A − 1) f(x,y) + fHP(x,y)

For the frequency domain,

FHP(u,v) = F(u,v) − FLP(u,v), and FLP(u,v) = HLP(u,v) · F(u,v)

∴ FHP(u,v) = [1 − HLP(u,v)] F(u,v)

∴ HHP(u,v) · F(u,v) = [1 − HLP(u,v)] F(u,v)

∴ HHP(u,v) = 1 − HLP(u,v)

- In a similar manner,
HHB(u,v) = (A − 1) + HHP(u,v) ;  A > 1

3.2.3. Laplacian filtering in frequency domain


- The Laplacian can be implemented in the frequency domain using the filter
H(u,v) = −4π²(u² + v²)
- Or, with respect to the centre of the frequency rectangle, using the filter
H(u,v) = −4π²[(u − P/2)² + (v − Q/2)²] = −4π²D²(u,v)
- where D(u,v) is the distance function. The Laplacian image is then obtained as
∇²f(x,y) = F⁻¹{H(u,v) F(u,v)}
We know g(x,y) = f(x,y) + c∇²f(x,y)
If c = −1 then g(x,y) = f(x,y) − ∇²f(x,y)

g(x,y) = F⁻¹{F(u,v) − H(u,v) F(u,v)}
       = F⁻¹{[1 − H(u,v)] F(u,v)}
       = F⁻¹{[1 + 4π²D²(u,v)] F(u,v)}

3.2.4. Homomorphic Filtering


- An image can be modelled as the product of an illumination function and a reflectance function at every point.
- So the image can be represented as
f(x,y) = i(x,y) · r(x,y)
- This model is known as the Illumination–Reflectance Model and can be used to improve the quality of the image.
- In the above equation, i(x,y) = illumination component and r(x,y) = reflectance component.
- For many images, the illumination is the primary contributor to the dynamic range and varies slowly in space, while the reflectance component represents the details of the objects and varies rapidly.
- To handle the two components separately, the logarithm of the input is taken:
ln[f(x,y)] = ln[i(x,y)] + ln[r(x,y)]
Taking the Fourier transform,
F(u,v) = FI(u,v) + FR(u,v)
- Now apply a filter function H(u,v) that treats the illumination and reflectance components differently:
F(u,v) · H(u,v) = FI(u,v) · H(u,v) + FR(u,v) · H(u,v)
- To come back to the space domain, we take the inverse Fourier transform:
f'(x,y) = F⁻¹[F(u,v) · H(u,v)]
- The desired enhanced image is obtained by taking the exponential:
g(x,y) = e^(f'(x,y))
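A minimal sketch of the whole homomorphic pipeline (H is any centred transfer function of the image's shape; log1p/expm1 are a practical tweak of ours to avoid log 0):

    import numpy as np

    def homomorphic(img, H):
        # log -> FFT -> multiply by H(u,v) -> inverse FFT -> exp
        z = np.log1p(img.astype(float))                 # ln f = ln i + ln r
        Z = np.fft.fftshift(np.fft.fft2(z))
        g = np.real(np.fft.ifft2(np.fft.ifftshift(H * Z)))
        return np.expm1(g)                              # undo the logarithm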
3.3 Histogram

3.3.1. Histogram
- It is a plot of the number of occurrences of each grey level in the image against the grey level values.
- Histogram provides more information about brightness & contrast of image.
- Histogram of dark image will be clustered towards lower grey levels.
- Histogram of bright image will be clustered towards higher grey levels.
- For low contrast image the histogram will not be spread equally, that is, the histogram
will be narrow.
- For high contrast image the histogram will have an equal spread in the grey level.
- Image brightness may be improved by modifying the histogram of the image.
- Histogram can be plotted in two different ways
Method 1:

Grey level No of pixels


0 40
1 20
2 10
3 15
4 10
5 3
6 2
-

Method 2:
Instead of plotting no of pixels, we directly plot its probability values.
Pr(k) = nk/n

Grey level No of pixels Pr(k)


0 40 0.40
1 20 0.20
2 10 0.10
3 15 0.15
4 10 0.10
5 3 0.03
6 2 0.02
n = 100
- This is known as normalized histogram.


- Advantage of 2nd method is that maximum value will be 1.

3.3.2. Histogram stretching


- One way to increase the dynamic range of the histogram is a technique known as histogram stretching.
- In this, the histogram is spread over the entire dynamic range.

s = T(r) = ((smax − smin) / (rmax − rmin)) · (r − rmin) + smin
Example: Perform histogram stretching on the following image. So that new image has dynamic
range [0,7].

Gray Levels 0 1 2 3 4 5 6 7
No.of Pixels 0 0 50 60 50 20 10 0
Solution:
s = T(r) = ((smax − smin) / (rmax − rmin)) · (r − rmin) + smin

Here rmin = 2, rmax = 6, smin = 0, smax = 7, so s = (7/4)(r − 2):

r    s
2    0
3    1.75 ≈ 2
4    3.5 ≈ 4
5    5.25 ≈ 5
6    7

Modified Histogram:
Gray Levels(s) 0 1 2 3 4 5 6 7
No.of Pixels 50 0 60 0 50 20 0 10

(Figure: original histogram and stretched histogram)
3.3.3 Histogram Equalization


- Using some technique we can get flat histogram this technique known as histogram
equalization.
- We can say its perfect image when all grey level has equal number of pixels.
- Our objective is not only to spread the dynamic range but also to have equal pixels in
all the grey levels.

Example: Perform histogram equalization on the following image histogram. Plot the original
and equalized histogram.
Gray 0 1 2 3 4 5 6 7
Levels
No.of 790 1023 850 656 329 245 122 81
Pixels

Solution:
Grey level (r) | nk   | Pr(k) = nk/n | Sk = Σ Pr(k) | (L−1)Sk | Rounded | New level (s)
0              | 790  | 0.19         | 0.19         | 1.33    | 1       | 1
1              | 1023 | 0.25         | 0.44         | 3.08    | 3       | 3
2              | 850  | 0.21         | 0.65         | 4.55    | 5       | 5
3              | 656  | 0.16         | 0.81         | 5.67    | 6       | 6
4              | 329  | 0.08         | 0.89         | 6.23    | 6       | 6
5              | 245  | 0.06         | 0.95         | 6.65    | 7       | 7
6              | 122  | 0.03         | 0.98         | 6.86    | 7       | 7
7              | 81   | 0.02         | 1            | 7       | 7       | 7
n = 4096

gray level (s) | 0 | 1   | 2 | 3    | 4 | 5   | 6             | 7
No. of pixels  | 0 | 790 | 0 | 1023 | 0 | 850 | 656+329 = 985 | 245+122+81 = 448
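The same table in a few lines of Python (a sketch; the function name is ours):

    import numpy as np

    def equalize(hist, L=8):
        # s = round((L-1) * CDF(r)), exactly as in the table above
        cdf = np.cumsum(hist / hist.sum())
        return np.round((L - 1) * cdf).astype(int)

    hist = np.array([790, 1023, 850, 656, 329, 245, 122, 81])
    print(equalize(hist))   # [1 3 5 6 6 7 7 7]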
Example: consider given image and find out equalized histogram

4 4 4 4 4
3 4 5 4 3
F(x,y)= 3 5 5 5 3
3 4 5 4 3
4 4 4 4 4
Solution:
Gray Levels 0 1 2 3 4 5 6 7
No.of Pixels 0 0 0 6 14 5 0 0

Grey level (r) | nk | Pr(k) = nk/n | Sk = Σ Pr(k) | (L−1)Sk | Rounded | New level (s)
0              | 0  | 0            | 0            | 0       | 0       | 0
1              | 0  | 0            | 0            | 0       | 0       | 0
2              | 0  | 0            | 0            | 0       | 0       | 0
3              | 6  | 0.24         | 0.24         | 1.68    | 2       | 2
4              | 14 | 0.56         | 0.8          | 5.6     | 6       | 6
5              | 5  | 0.20         | 1            | 7       | 7       | 7
6              | 0  | 0            | 1            | 7       | 7       | 7
7              | 0  | 0            | 1            | 7       | 7       | 7
n = 25

Gray levels   | 0 | 1 | 2 | 3 | 4 | 5 | 6  | 7
No. of pixels | 0 | 0 | 6 | 0 | 0 | 0 | 14 | 5
Question: Justify-the entropy of an image is maximum by histogram equalization.


Solution: histogram equalization gives flat histogram in continuous domain
- As a result of this the probability of occurrence of each grey level in the image is equal.
- If all grey levels are equal probable, the entropy is maximized.
- Consider 256 grey level image with equal probability of occurrence for each grey level
- the entropy is given by

H = −Σ (i = 0 to L−1) pi log2 pi

= −Σ (i = 0 to 255) (1/256) log2(1/256) = 8

- So we require 8 bits/pixel.


- This simply Means that an equal length code can be used an image that has uniform
pdf.
Example: what effect would setting to zero i) the lower order bit planes, and ii) the higher order bit planes have? The image is given below.

0 1 2 3
4 5 6 7
8 9 10 11
12 13 14 15
Solution: Maximum number is 15 so we require 4 bits for binary representation
Binary representation:

0000 0001 0010 0011


0100 0101 0110 0111
1000 1001 1010 1011
1100 1101 1110 1111
i) Setting the two lower order bit planes to 0 in the above image:

0000 0000 0000 0000
0100 0100 0100 0100
1000 1000 1000 1000
1100 1100 1100 1100
0 0 0 0
4 4 4 4
8 8 8 8
12 12 12 12

- In this case the variability is reduced: the number of distinct grey levels drops from 16 to 4.

ii) Setting the two higher order bit planes to 0:

0000 0001 0010 0011


0000 0001 0010 0011
0000 0001 0010 0011
0000 0001 0010 0011
0 1 2 3
0 1 2 3
0 1 2 3
0 1 2 3

- In this case the number of grey levels is also reduced, but the important thing is that the image becomes much darker.

3.3.4 Histogram specification


- If we want to highlight some specific grey levels, we can use histogram specification.
- It should be noted that if we modify the grey levels of an image that has a uniform PDF using the inverse transformation r = T⁻¹(s), we get back the original histogram pr(r).
- For that we have to create some intermediate level k:
k = T1(r) = ∫ (0 to r) p(r) dr
k = T2(s) = ∫ (0 to s) p(s) ds

Example: Given histogram (a) and (b) modify histogram (a) as given (b).
Histogram (a):
Gray 0 1 2 3 4 5 6 7
Levels
No.of 790 1023 850 656 329 245 122 81
Pixels

Histogram (b):
Gray 0 1 2 3 4 5 6 7
Levels
No.of 0 0 0 614 819 1230 819 614
Pixels

Solution: first equalized histogram (a)
Grey level (r) | nk   | Pr(k) = nk/n | Sk = Σ Pr(k) | (L−1)Sk | Rounded | New level (s)
0              | 790  | 0.19         | 0.19         | 1.33    | 1       | 1
1              | 1023 | 0.25         | 0.44         | 3.08    | 3       | 3
2              | 850  | 0.21         | 0.65         | 4.55    | 5       | 5
3              | 656  | 0.16         | 0.81         | 5.67    | 6       | 6
4              | 329  | 0.08         | 0.89         | 6.23    | 6       | 6
5              | 245  | 0.06         | 0.95         | 6.65    | 7       | 7
6              | 122  | 0.03         | 0.98         | 6.86    | 7       | 7
7              | 81   | 0.02         | 1            | 7       | 7       | 7
n = 4096

gray level (s) | 0 | 1   | 2 | 3    | 4 | 5   | 6             | 7
No. of pixels  | 0 | 790 | 0 | 1023 | 0 | 850 | 656+329 = 985 | 245+122+81 = 448

Now equalized histogram (b):

Grey level (r) | nk   | Pr(k) = nk/n | Sk = Σ Pr(k) | (L−1)Sk | Rounded
0              | 0    | 0            | 0            | 0       | 0
1              | 0    | 0            | 0            | 0       | 0
2              | 0    | 0            | 0            | 0       | 0
3              | 614  | 0.15         | 0.15         | 1.05    | 1
4              | 819  | 0.20         | 0.35         | 2.45    | 2
5              | 1230 | 0.30         | 0.65         | 4.55    | 5
6              | 819  | 0.20         | 0.85         | 5.95    | 6
7              | 614  | 0.15         | 1            | 7       | 7
n = 4096

Applying the inverse transform, i.e. matching each equalized level of (a) with the closest equalized level of (b):
1 → 3
3 → 4 (the closest equalized value of (b) to 3 is 2, which corresponds to grey level 4)
5 → 5
6 → 6
7 → 7
gray level (s) | 0 | 1 | 2 | 3   | 4    | 5   | 6   | 7
No. of pixels  | 0 | 0 | 0 | 790 | 1023 | 850 | 985 | 448
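A sketch of the whole matching step, reusing the equalize() function from the earlier sketch (the nearest-level matching rule is the one used in the table above):

    import numpy as np

    def specify(src_hist, target_hist, L=8):
        # Equalize both histograms, then send each source level to the
        # target level whose equalized value is closest.
        s = equalize(src_hist, L)
        g = equalize(target_hist, L)
        return np.array([int(np.argmin(np.abs(g - sk))) for sk in s])

    a = np.array([790, 1023, 850, 656, 329, 245, 122, 81])
    b = np.array([0, 0, 0, 614, 819, 1230, 819, 614])
    print(specify(a, b))   # [3 4 5 6 6 7 7 7]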

Example: For given image, perform the following operation:


1. Contrast stretching as per the characteristics given in the figure
2. Draw the original and new histograms
3. Equalize the histogram

10 2 13 7
11 14 6 9
4 7 3 2
0 5 10 7

Solution: 1)

α = s1/r1 = 2/5 = 0.4

β = (s2 − s1)/(r2 − r1) = (12 − 2)/(10 − 5) = 2

γ = ((L−1) − s2)/((L−1) − r2) = (15 − 12)/(15 − 10) = 0.6
r    s
0    s = αr = 0.4 × 0 = 0
1    s = αr = 0.4 × 1 = 0.4
2    s = αr = 0.4 × 2 = 0.8
3    s = αr = 0.4 × 3 = 1.2
4    s = αr = 0.4 × 4 = 1.6
5    s = β(r − r1) + s1 = 2(5 − 5) + 2 = 2
6    s = β(r − r1) + s1 = 2(6 − 5) + 2 = 4
7    s = β(r − r1) + s1 = 2(7 − 5) + 2 = 6
8    s = β(r − r1) + s1 = 2(8 − 5) + 2 = 8
9    s = β(r − r1) + s1 = 2(9 − 5) + 2 = 10
10   s = γ(r − r2) + s2 = 0.6(10 − 10) + 12 = 12
11   s = γ(r − r2) + s2 = 0.6(11 − 10) + 12 = 12.6
12   s = γ(r − r2) + s2 = 0.6(12 − 10) + 12 = 13.2
13   s = γ(r − r2) + s2 = 0.6(13 − 10) + 12 = 13.8
14   s = γ(r − r2) + s2 = 0.6(14 − 10) + 12 = 14.4
15   s = γ(r − r2) + s2 = 0.6(15 − 10) + 12 = 15

          12   0.8  13.8 6          12 1  14 6
S(x,y) =  12.6 14.4 4    10    =>   13 14 4  10
          1.6  6    1.2  0.8        2  6  1  1
          0    2    12   6          0  2  12 6
2) (Figure: the original and stretched histograms.)

3)

Note that the grey level 2 occurs twice in the image, so n = 16.

Grey level (r) | nk | Pr(k) = nk/n | Sk = Σ Pr(k) | (L−1)Sk | Rounded | New level (s)
0              | 1  | 0.0625       | 0.0625       | 0.94    | 1       | 1
1              | 0  | 0            | 0.0625       | 0.94    | 1       | 1
2              | 2  | 0.125        | 0.1875       | 2.81    | 3       | 3
3              | 1  | 0.0625       | 0.25         | 3.75    | 4       | 4
4              | 1  | 0.0625       | 0.3125       | 4.69    | 5       | 5
5              | 1  | 0.0625       | 0.375        | 5.63    | 6       | 6
6              | 1  | 0.0625       | 0.4375       | 6.56    | 7       | 7
7              | 3  | 0.1875       | 0.625        | 9.38    | 9       | 9
8              | 0  | 0            | 0.625        | 9.38    | 9       | 9
9              | 1  | 0.0625       | 0.6875       | 10.31   | 10      | 10
10             | 2  | 0.125        | 0.8125       | 12.19   | 12      | 12
11             | 1  | 0.0625       | 0.875        | 13.13   | 13      | 13
12             | 0  | 0            | 0.875        | 13.13   | 13      | 13
13             | 1  | 0.0625       | 0.9375       | 14.06   | 14      | 14
14             | 1  | 0.0625       | 1            | 15      | 15      | 15
15             | 0  | 0            | 1            | 15      | 15      | 15
n = 16

grey levels   | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15
No. of pixels | 0 | 1 | 0 | 2 | 1 | 1 | 1 | 1 | 0 | 3 | 1  | 0  | 2  | 1  | 1  | 1
Question: Histogram is a unique representation of image.

Solution: Histogram is not a unique representation of an image.

- Histogram is a graph of grey value vs frequency of occurrence of grey value.

- It depends only on the frequency of occurrence of each grey value. So no matter how the grey values are distributed over the image, if the frequency of occurrence of each grey value is unchanged, the histogram will not change.

- Therefore, Histogram is not unique representation of images. That means it is possible


that two or more different images can have same Histogram.

- For Example: Consider a set of binary images having same Histogram.

Question: Continuous image histogram can be perfectly equalized but it may not be so for digital
image.

Solution: This statement is true.

- The cumulative density function ensures that we get a flat histogram in the continuous domain.

- The discrete domain, as we are aware, is an approximation of the continuous domain, i.e. values between integers are not available, because of which redistribution takes place.

- For example values such as 1.1, 1.2, and 1.3, are all grouped together and placed in
value 1. Due to this perfectly flat histograms are never obtained in the discrete
domain.
Question: Difference between Histogram and Contrast Stretching.

Solution:
Histogram Equalization | Contrast Stretching
1. It modifies the intensity values of all pixels in the image equally. | 1. It increases the difference between the minimum and maximum intensity values in the image.
2. The transformation function is selected automatically from the PDF of the image. | 2. The transformation function is selected manually based on the requirement of the application.
3. It is reliable. | 3. It is unreliable.
4. It is a non-linear normalization. | 4. It is a linear normalization.
5. The original image cannot be restored from the equalized image. | 5. The original image can be restored from the contrast-stretched image.
6. It is obtained using the cumulative distribution function. | 6. It is obtained by changing the slopes of the various sections.
Unit 4 (PART – 1)
IMAGE SEGMENTATION
- The main objective of image segmentation is to extract various features of the image
which can be merged or split in order to build objects of interest on which analysis
and interpretation can be performed.
- Segmentation forms a section of computer vision; we use segmentation when we want the computer to make decisions.
- Segmentation algorithms divide into two categories:
1) Segmentation based on discontinuities in intensity
2) Segmentation based on similarities in intensity

4.1. Segmentation based on discontinuities in intensity


- In this first method, the approach is to partition an image based on abrupt changes
in intensity, such as edges.
- It occurs in three different ways i) Point detection
ii) Edge detection
iii) Line detection
- These three components (points, edges, lines) are the high frequency components of an image.

4.1.1 Point Detection


- A point is a high frequency component, so we use a high pass filter mask whose coefficients sum to zero.
- Detection of a point is simple.
- We use the standard high pass mask. So that this mask detects only points and not lines, we set a threshold value, i.e. we say a point has been detected at the location on which the mask is centred only if

-1 -1 -1
-1 8 -1
-1 -1 -1

│R │≥ T
- where R is derived from
R = W1Z1 + W2Z2 + W3Z3 + … + W9Z9, i.e. R = Σ (i = 1 to 9) wi zi
- We take │R │ because we want to detect both the kinds of points i.e. white points on
black background as well as black points on a white background.
- T is non negative threshold which is defined by the user.
4.1.2 Line Detection


- Detection of lines can be done using the masks shown below.
- In an image, lines can be in any direction, and detecting these lines needs different masks.

-1 -1 -1 -1 -1 2
2 2 2 -1 2 -1
-1 -1 -1 2 -1 -1

(Horizontal) (+45▫)

-1 2 -1 2 -1 -1
-1 2 -1 -1 2 -1
-1 2 -1 -1 -1 2

(Vertical) (-45▫)
- All these masks have a sum equal to zero, and hence all of them are high pass masks.
- The first mask detects horizontal lines, the second a line at +45°, the third vertical lines, and the fourth a line at −45°.
- Consider this example

0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 10 10 10 10 10 10 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0
0 0 0 10 0 0 0 0

- After applying horizontal mask we will get

0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
0  -20 -20 -20 -20 -30 -20  0
0   40  40  40  40  60  40  0
0  -20 -20 -20 -20 -30 -20  0
0   0   0   0   0   0   0   0
0   0   0   0   0   0   0   0
Setting the negative values to zero gives:
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 40 40 40 40 60 40 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0

- The mask has detected only the horizontal line and removed the vertical line.

4.1.3 Edge Detection


- More than isolated points and lines, it is the detection of edges that form an
important part of image segmentation. An edge can be defined as a set of connected
pixels that form a boundary between two disjoint regions.
- Here the set of pixels that separate the black region from the white region is called
an edge. This image shown has a step edge.

- Practically such a sharp transition is not possible; in a digital image the edge appears as a ramp, as in the profile below.
- The slope of the ramp is inversely proportional to the degree of blurring.


- The process of edge detection is carried out by the derivative approach; it is achieved with first order derivative operators.

4.1.3.1 Computing the gradient


Case i) o/p is a function of one variable y= f(x)

dy/dx = lim (h→0) [f(x + h) − f(x)] / h

Case ii) o/p is function of two variable


First keep y fixed & alter only x
∂f/∂x = lim (h→0) [f(x + h, y) − f(x, y)] / h
Similarly,

∂f/∂y = lim (k→0) [f(x, y + k) − f(x, y)] / k
Hence the final gradient is
∇f = î ∂f/∂x + ĵ ∂f/∂y

Case iii) output is a function of three variables, p = f(x, y, z):

∂f/∂x = lim (h→0) [f(x + h, y, z) − f(x, y, z)] / h
∂f/∂y = lim (k→0) [f(x, y + k, z) − f(x, y, z)] / k
∂f/∂z = lim (l→0) [f(x, y, z + l) − f(x, y, z)] / l

So the gradient is
∇f = î ∂f/∂x + ĵ ∂f/∂y + k̂ ∂f/∂z

Magnitude of the vector for 2-D:
|∇f| = [(∂f/∂x)² + (∂f/∂y)²]^(1/2)
4.1.3.2 Finding gradients using Masks
- Consider a 3×3 neighbourhood matrix with z5 as the origin:
∂f/∂x = lim (h→0) [f(x + h, y) − f(x, y)] / h
∂f/∂y = lim (k→0) [f(x, y + k) − f(x, y)] / k
- In the discrete domain h = k = 1, so
∂f/∂x = f(x + 1, y) − f(x, y) and ∂f/∂y = f(x, y + 1) − f(x, y)
- Hence ∂f/∂x = z8 − z5 and ∂f/∂y = z6 − z5

|∇f| = [(z8 − z5)² + (z6 − z5)²]^(1/2)

|∇f| ≈ |z8 − z5| + |z6 − z5|


- This is the first order difference gradient. It can be implemented using two masks.

Mask 1, |z5 − z8|:
 1  0
-1  0

Mask 2, |z5 − z6|:
 1 -1
 0  0

- This is known as the ordinary operator.


- Steps to compute the gradient of image are as follow.
i) Convolve the original image with mask 1. This gives us gradient along with x-
direction.
ii) Convolve the original image with mask 2. This gives us gradient along with y-
direction.
iii) Add the result of (i) and (ii).
- Alternate method
i) Add mask 1 and mask 2
ii) Convolve input image with resultant mask.

Roberts operator:
- It states that better results are obtained if cross differences are taken instead of straight differences:

|∇f| = |z5 − z9| + |z6 − z8|

Mask 1, |z5 − z9|:
 1  0
 0 -1
+
Mask 2, |z6 − z8|:
 0  1
-1  0

Resultant mask:
 1  1
-1 -1
Prewitt operator:
|∇f| = |z7 + z8 + z9| − |z1 + z2 + z3|   (X-gradient)

Mask 1:
-1 -1 -1
 0  0  0
 1  1  1

|∇f| = |z3 + z6 + z9| − |z1 + z4 + z7|   (Y-gradient)

Mask 2:
-1  0  1
-1  0  1
-1  0  1

Resultant mask = mask 1 + mask 2:
-2 -1  0
-1  0  1
 0  1  2

Sobel operator: in the 3×3 mask, higher weights are assigned to the pixels closest to the centre pixel z5.

|∇f| = |z7 + 2z8 + z9| − |z1 + 2z2 + z3|   (X-gradient)

Mask 1:
-1 -2 -1
 0  0  0
 1  2  1

|∇f| = |z3 + 2z6 + z9| − |z1 + 2z4 + z7|   (Y-gradient)

Mask 2:
-1  0  1
-2  0  2
-1  0  1

Resultant mask = mask 1 + mask 2:
-2 -2  0
-2  0  2
 0  2  2

- The sum of the coefficients of any such mask should be zero. A small sketch of applying these masks is given below.
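A sketch of the mask-application procedure and the Sobel gradient magnitude (borders are left at zero here; the helper name is ours):

    import numpy as np

    def apply_mask(img, mask):
        # Slide the 3x3 mask over the image; border pixels stay zero.
        M, N = img.shape
        out = np.zeros((M, N), dtype=float)
        for i in range(1, M - 1):
            for j in range(1, N - 1):
                out[i, j] = np.sum(img[i-1:i+2, j-1:j+2] * mask)
        return out

    sobel_x = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]])
    sobel_y = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])

    img = np.zeros((8, 8)); img[4:, :] = 100          # a step-edge image
    grad = np.abs(apply_mask(img, sobel_x)) + np.abs(apply_mask(img, sobel_y))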

Compass operator: it is seen that edges in the horizontal as well as in the vertical direction
are enhanced when prewitt’s or sobel’s operator is used.

- There are applications in which we need edges in all the directions.
- A simple method is to rotate the Prewitt or Sobel mask in all the possible directions.
- Consider a Prewitt operator:

-1 -1 -1
 0  0  0
 1  1  1

- Rotating it in steps of 45° gives the eight compass masks:

-1 -1 -1    -1 -1  0    -1  0  1     0  1  1
 0  0  0    -1  0  1    -1  0  1    -1  0  1
 1  1  1     0  1  1    -1  0  1    -1 -1  0

 1  1  1     1  1  0     1  0 -1     0 -1 -1
 0  0  0     1  0 -1     1  0 -1     1  0 -1
-1 -1 -1     0 -1 -1     1  0 -1     1  1  0

- This operator is known as the compass operator and is very useful for detecting weak edges. The compass operator can also be implemented using the Sobel operator.

4.1.3.3 Image segmentation using the second derivative: the Laplacian

- We know
∂f/∂x = f(x + 1, y) − f(x, y) and ∂f/∂y = f(x, y + 1) − f(x, y)

∂²f/∂x² = f(x + 1, y) − f(x, y) + f(x − 1, y) − f(x, y)
        = f(x + 1, y) + f(x − 1, y) − 2f(x, y)
and
∂²f/∂y² = f(x, y + 1) + f(x, y − 1) − 2f(x, y)

∇²f = ∂²f/∂x² + ∂²f/∂y²
∇²f = f(x + 1, y) + f(x − 1, y) + f(x, y + 1) + f(x, y − 1) − 4f(x, y)

|∇²f| = |z2 + z4 + z6 + z8 − 4z5|

0  1  0
1 -4  1
0  1  0

- This is known as the Laplacian operator.

Question: Justify - Laplacian is good edge detector.

Solution: The Laplacian is an isotropic filter, meaning its response is independent of the direction of the discontinuities in the image.

- But we cannot directly apply Laplacian on image due to following reasons


i) It is very sensitive to noise so if noise is present in image then Laplacian gives
very large value and ruins the image.
ii) The magnitude of Laplacian produces double edge which is undesirable effect
- To remove these unwanted effects we use the Laplacian of Gaussian (LoG) algorithm.
- We know the Gaussian function

h(r) = e^(−r² / 2σ²),  taking x² + y² = r²

- Differentiating,

∂h/∂r = −(r/σ²) e^(−r² / 2σ²)

∂²h/∂r² = −(1/σ²) ∂/∂r [r e^(−r² / 2σ²)]
        = −(1/σ²) [e^(−r² / 2σ²) − (r²/σ²) e^(−r² / 2σ²)]
        = (1/σ²) e^(−r² / 2σ²) (r²/σ² − 1)

∇²h = ((r² − σ²) / σ⁴) e^(−r² / 2σ²)
Consider 5 x 5 mask

0 0 -1 0 0
0 -1 -2 -1 0
-1 -2 16 -2 -1
0 -1 -2 -1 0
0 0 -1 0 0

- The mask is not unique; according to the shape of the function we can design different masks.
- The Gaussian smoothing step makes the operator much less sensitive to noise, and the LoG gives very thin edges in the output image, so we can say it is a good edge detector.

4.1.3.4 Edge linking:

- With the previous filters we practically get discontinuities in the detected lines, so a linking algorithm has to follow the edge detector.
- Local processing: all pixels that share some common properties are linked together.
i) Strength of the response of the gradient operator:
- we know |∇f| = [Fx² + Fy²]^(1/2)
- The pixel (x',y') in the neighbourhood of the pixel (x,y) is linked to (x,y) if |∇f(x,y) − ∇f(x',y')| ≤ T, where T is a non-negative threshold.
ii) The direction of the gradient: it is given by
α(x,y) = tan⁻¹(Fy/Fx)
- The pixel (x',y') in the neighbourhood of (x,y) is linked to (x,y) if |α(x,y) − α(x',y')| ≤ A, where A is an angle threshold.
- Only if both these conditions are satisfied are the pixels linked together.
- Hence for every pixel, all its 8 neighbours are checked for these two conditions.

4.2 Hough Transform


- Consider a point (x1,y1). A line passing through this point can be written in the slope-
intercept form as y1 = ax1 + b.
- Using this equation and varying the values of a and b, infinite number of lines pass
through this point (x1,y1).
- However if we write this equation as b = -ax1+y1
- Consider ab plane instead of xy plane, we get a single line for a point (x1,y1).
- This entire line in the ab plane is due to a single point in the xy plane and different
values of a and b.
- Now consider another point (x2,y2) in the xy plane.
- Slope intercept equation of this line is y2 = ax2 + b.
- Writing this equation in ab plane b = -ax2+y2.
- This is another line in the ab plane. The two lines intersect at a point (a',b') in the ab plane, which gives the slope and intercept of the straight line passing through both points in the xy plane.
- A similar process is repeated for all the given points; parameter cells that collect many intersections correspond to straight lines in the image, as sketched below.
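A sketch of accumulator voting in the (a, b) slope–intercept space (grids and names are ours; practical implementations usually use the ρ–θ form to avoid infinite slopes):

    import numpy as np

    def hough_lines(points, a_range, b_range):
        # Each point (x, y) votes along the line b = -a*x + y in (a, b) space;
        # peaks in the accumulator correspond to lines through many points.
        acc = np.zeros((len(a_range), len(b_range)), dtype=int)
        for x, y in points:
            for i, a in enumerate(a_range):
                b = -a * x + y
                j = int(np.argmin(np.abs(b_range - b)))   # nearest b cell
                acc[i, j] += 1
        return acc

    pts = [(1, 1), (2, 2), (3, 3)]                # collinear: y = x
    acc = hough_lines(pts, np.linspace(-2, 2, 41), np.linspace(-5, 5, 101))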
Example 1 and Example 2: (worked Hough transform examples, given as figures in the original notes.)
4.3 Global processing via graph: (Theoretic Technique)

- Using this approach we find the lowest cost path, which eventually corresponds to the most significant edge.

Example: Using graph theoretical approach, find the edge corresponding to minimum cost path

5 6 1
I= 6 7 0
7 1 3

Solution:

C(p,q) = Max(I) – [f(p)-f(q)]


- Edges start from the top row and terminate in the last row.
- Consider p to be on the right side of the direction of the path and q on the left side.
- Several candidate paths (A to H) are available for the given image.
- For path A

So cost of path A = 8+8+1 = 17


- For path B

- Cost of path B = 8+ 6+ 0+1 +1= 16


- Similar process repeat for remaining paths.
- Cost of path C = 23
- Cost of path D = 38
- Cost of path E = 11
- Cost of path F = 40
- Cost of path G = 17
- Cost of path H = 4
- The least cost path (here path H, with cost 4) represents the most significant edge.

4.4 Region Based segmentation


- Region based segmentation is a technique in which segmentation is carried out
based on the similarities in the given image.
- The region based approach to segmentation seeks to create regions directly by
grouping together pixels which share common features into area or regions of
uniformity.
- Let R represent the entire image.
- Segmentation is process to partition R in to sub regions R1,R2,…Rn.

- Region based segmentation can be carried out in four different ways.


1) Region Growing
2) Region merging
3) Region splitting
4) Region split and merge.
4.4.1 Region Growing


- Take any pixel (x1, y1) from the image that needs to be segmented. This pixel is called the seed pixel.
- Then check the neighbours of the seed (using N4(p) or N8(p) connectivity): a neighbour that satisfies the similarity condition is accepted into the region of the seed and is taken as a new seed pixel in turn. Repeat this process until no new pixel is accepted.
- Steps (see the sketch after this list):
I) First decide the seed pixel.
II) Decide the threshold value.
III) Decide the connectivity, N4(p) or N8(p).
IV) Check the condition: Max{g(x,y)} − Min{g(x,y)} ≤ Th
V) Give the same label to all pixels of a specific region.
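A sketch of 4-connected region growing under the max–min range condition (function and variable names are ours):

    import numpy as np
    from collections import deque

    def region_grow(img, seed, thresh=3):
        # Grow from one seed: a neighbour joins only while the region's
        # max-min grey range would stay within thresh.
        M, N = img.shape
        region = np.zeros((M, N), dtype=bool)
        region[seed] = True
        lo = hi = int(img[seed])
        q = deque([seed])
        while q:
            x, y = q.popleft()
            for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):   # N4 neighbours
                nx, ny = x + dx, y + dy
                if 0 <= nx < M and 0 <= ny < N and not region[nx, ny]:
                    v = int(img[nx, ny])
                    if max(hi, v) - min(lo, v) <= thresh:
                        region[nx, ny] = True
                        lo, hi = min(lo, v), max(hi, v)
                        q.append((nx, ny))
        return region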

Example: Consider an 8 x 8 image, the grey level range from 0 to 7. Segment this image
using the region growing technique.

Solution: seed pixels are 6 and 0 & Threshold value is 3 & connectivity N4(p)
- Check condition Max {g(x,y)} - min {g(x,y)} ≤ Th

4.4.2 Region splitting


- If the grey levels present in a region do not satisfy the homogeneity property, the region is divided into four equal parts.
- Any quadrant that does not satisfy the homogeneity property is again split into 4 parts, and so on.
Example: Consider an 8 x 8 image, the grey level range from 0 to 7. Segment this image
using the region splitting technique. consider Th ≤ 3. Also draw quad tree.

5 6 6 6 7 7 6 6
6 7 6 7 5 5 4 7
6 6 4 4 3 2 5 6
5 4 5 4 2 3 4 6
0 3 2 3 3 2 4 7
0 0 0 0 2 2 5 6
1 1 0 1 0 3 4 4
1 0 1 0 2 3 5 4

- Solution: condition Max{g(x,y)} − Min{g(x,y)} ≤ 3
- Here the max value is 7 and the min value is 0, so the given image does not satisfy the condition and is split into 4 parts.
- For R1: 7 − 4 ≤ 3, the condition is satisfied.
- For R2: 7 − 2 ≤ 3 is false, the condition is not satisfied.
- For R3: 3 − 0 ≤ 3, the condition is satisfied.
- For R4: 7 − 0 ≤ 3 is false, the condition is not satisfied.
- So split R2 and R4 into their quadrants.
- Now R21, R22, R23, R24, R41 , R42, R43, R44 all are satisfied this condition so no
more splitting is required.
4.4.3 Region Merging


- It is exactly opposite to region splitting.
- An individual region is merged with an adjacent region if the merged region satisfies the condition Max{g(x,y)} − Min{g(x,y)} ≤ Th
4.4.4 Region Splitting and Merging
- Here we first apply splitting on the input image and then apply merging on the result, repeating this process.
Example: Segment the following image using split and merge technique. Draw quad
tree representation for the corresponding segmentation.

Solution:
Example: Segment the following image using split and merge technique. Draw
quad tree representation for the corresponding segmentation.

Solution:
Example: Segment the following image using split and merge technique. Draw quad
tree representation for the corresponding segmentation.

Solution: Add one row & column to given image. Then apply split & merge technique.
4.5 Thresholding
- It produces segments having pixels with similar intensities.
- It is a useful technique for establishing boundaries in images that contain solid objects on a contrasting background.
- This technique requires that the object has homogeneous intensity and the background has a different intensity level.
4.5.1 Global Thresholding
g(x,y) = { 1,  f(x,y) ≥ T
         { 0,  otherwise
- Steps for global thresholding (a sketch follows):
1) Read the given image.
2) Plot the histogram of the image.
3) Based on the histogram, choose the value T.
4) Using this value of T, segment the image into objects and background.
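The segmentation step itself is one line of numpy (a sketch):

    import numpy as np

    def global_threshold(img, T):
        # 1 for object pixels (>= T), 0 for background
        return np.where(img >= T, 1, 0)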

4.5.2 Local Thresholding


- Suppose the image has a number of different grey levels, i.e. little similarity in the intensity levels.
- For such an image it is very difficult to decide a single value of T.
- So we cannot apply global thresholding to the given image.
- First we split the image into different parts, then plot the histogram of each part.
- Now decide a value of T for each individual segment:
T = T{A(x,y), f(x,y)}
- Here A(x,y) is the neighbourhood of the pixel.

Question: explain different types of edges.


Solution:
1) Step edge: define a perfect transition from one segment to another.
- In step edge, the image intensity abruptly changes from one value to one side of the
discontinuity to a different value on the opposite side.

2) Line Edge: if a segment of image is very narrow, it necessarily has two edges in
close proximity. This arrangement is called a line.
3) Ramp Edge: Ramp allows for a smoother transition between segments.

4) Roof edge: two nearby ramp edges result in line structure called a roof. basically
there are two types of roof edges
i) Convex roof edges

ii) Concave roof edges

Question: Difference between image enhancement and image restoration.


Solution:

image enhancement | image restoration

- It gives a better visual representation. | - It removes the effect of the sensing environment.
- No model is required. | - A mathematical degradation model is required.
- It is a subjective process. | - It is an objective process.
- Contrast stretching and histogram techniques are enhancement techniques. | - Inverse filtering, Wiener filtering and denoising are some restoration techniques.
- Image enhancement is an attempt to improve an image beyond what the camera took, by adding colour, contrast etc. | - Image restoration is an attempt to restore an image to some ideal (sometimes fictional) fidelity, such as by removing scratches, blur or noise.
Unit 4 (PART – 2)
IMAGE MORPHOLOGY
- Morphology is the science of appearance, shape & organization.
- Mathematical morphology is a collection of nonlinear processes which can be applied to an image to remove details smaller than a certain reference shape, called the structuring element.
- The operations of mathematical morphology were originally defined as set operations.
- Morphology is mostly applied to binary images.
- In this unit, consider 1 for black and 0 for white.

4.1. Basic set theory


- 1) Union: A ∪ B = {x | x ∈ A or x ∈ B}
- 2) Intersection: A ∩ B = {x | x ∈ A and x ∈ B}
- 3) Difference: A − B
- 4) Sum: A + B
- 5) Complement of A, etc.

- Logical operators: A OR B, A AND B, NOT A, A XOR B, A NAND B
4.2 Standard binary morphological operations

4.2.1 Dilation:
- It is a process in which the binary image is expanded from its original shape.
- The expanded binary image is determined by the structuring element.
- The structuring element is smaller in size compared to the image itself; normally the size used for the structuring element is 3 × 3.
- The dilation is similar to the convolution process.
- The dilation process will move the structuring element from left to right and top to
bottom.
- The process looks for whether at least one dark (1) pixel of the structuring element overlaps a dark pixel of the input image.
- If there is no overlap at all, the pixel of the input image behind the position of the origin (centre) of the structuring element is set to 0 (white).
- If there is at least a single overlap, the pixel of the input image behind the position of the origin (centre) of the structuring element is set to 1 (black).
- Let us define X as the reference image and B as the structuring element, so the dilation operation is given by
X ⊕ B = {z | (B̂)z ∩ X ≠ ∅}
Example: For given input image A apply dilation technique using given structuring element B.

Solution:
4.2.2 Erosion
- It is the counter process of dilation: if dilation enlarges the image, then erosion shrinks the image.
- The erosion process will move the structuring element from left to right and top to
bottom.
- The process looks for whether the dark portion of the structuring element overlaps the input image completely.
- If the overlap is not complete, the pixel of the input image behind the position of the origin (centre) of the structuring element is set to 0 (white).
- If the overlap is complete, the pixel of the input image behind the position of the origin (centre) of the structuring element is set to 1 (black).
- Let us define X as the reference image and B as the structuring element, so the erosion operation is given by
X ⊖ B = {z | (B)z ⊆ X}
Example: For given input image A apply erosion technique using given structuring element B.

Solution:

- Similarly we have move structuring element on input image and we will get final erode
image which shown below.
Example: A={(1.0),(1,1),(1,2),(0,3),(1,3),(2,3),(3,3),(1,4)} & B= {(0,0),(1,0)} Apply dilation and
erosion on given input image.
0 1 2 3 4

Solution:

A ⊕ B=

AѲB=

- Practical formulas for dilation and erosion (taken over the support of B):
Dilation: (A ⊕ B)(x,y) = Max{A(x − i, y − j) : B(i,j) = 1}
Erosion: (A ⊖ B)(x,y) = Min{A(x − i, y − j) : B(i,j) = 1}
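A sketch of binary dilation and erosion with a flat structuring element (border pixels are left unchanged here, matching the worked examples that follow; for a symmetric SE the reflection B̂ equals B):

    import numpy as np

    def dilate(img, se=np.ones((3, 3), dtype=int)):
        # 1 wherever the SE covers at least one foreground pixel
        pr, pc = se.shape[0] // 2, se.shape[1] // 2
        out = img.copy()
        for i in range(pr, img.shape[0] - pr):
            for j in range(pc, img.shape[1] - pc):
                win = img[i-pr:i+pr+1, j-pc:j+pc+1]
                out[i, j] = int(np.any(win[se == 1] == 1))
        return out

    def erode(img, se=np.ones((3, 3), dtype=int)):
        # 1 only where the SE fits entirely inside the foreground
        pr, pc = se.shape[0] // 2, se.shape[1] // 2
        out = img.copy()
        for i in range(pr, img.shape[0] - pr):
            for j in range(pc, img.shape[1] - pc):
                win = img[i-pr:i+pr+1, j-pc:j+pc+1]
                out[i, j] = int(np.all(win[se == 1] == 1))
        return out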
Example: Apply dilation and erosion on the given input image.

S.E. = 1 1 1

     1 0 0 0 0
     0 1 0 0 0
A =  0 0 1 0 0
     0 0 0 1 0
     0 0 0 0 1
Solution:

1 1 0 0 0
Dilation= 0 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 0 1 1

1 0 0 0 0
Erosion= 0 0 0 0 0
0 0 0 0 0
0 0 0 0 0
0 0 0 0 1

Example: Suppose the given image is a grey image; apply dilation and erosion using the flat structuring element S.E. = 1 1 1

     16  14  14  17  19
A =  53  57  61  62  64
     132 130 133 132 131
     138 142 137 132 138

Solution: (border pixels are left unchanged; grey dilation takes the max and grey erosion the min over the 1×3 window)

           16  16  17  19  19
Dilation = 53  61  62  64  64
           132 133 133 133 131
           138 142 142 138 138

          16  14  14  14  19
Erosion = 53  53  57  61  64
          132 130 130 131 131
          138 137 132 132 138
4.2.3 Opening & closing operation
- Opening is basically erosion followed by dilation using structuring element.
- A ο B = (AѲB) ⊕ B
- Closing is basically dilation followed by erosion using structuring element.
- A • B = (A⊕B) Ѳ B
Example: perform opening and closing operation on given input image using structuring element
B.

S.E. = 1 1 1

     1 0 0 0 0
     0 1 0 0 0
A =  0 0 1 0 0
     0 0 0 1 0
     0 0 0 0 1
Solution:
I) Opening:
A ο B = (AѲB) ⊕ B

1 0 0 0 0
0 0 0 0 0
AѲB = 0 0 0 0 0
0 0 0 0 0
0 0 0 0 1

1 1 0 0 0
0 0 0 0 0
(AѲB) ⊕ B= 0 0 0 0 0
0 0 0 0 0
0 0 0 1 1

2) Closing:
A • B = (A⊕B) Ѳ B

1 1 0 0 0
A⊕B = 0 1 1 0 0
0 1 1 1 0
0 0 1 1 1
0 0 0 1 1

              1 0 0 0 0
              0 0 0 0 0
(A⊕B) Ѳ B =   0 0 1 0 0
              0 0 0 1 1
              0 0 0 0 1
4.2.4 Boundary Detection
- Morphology operations are very effective in the detection of boundaries in binary
image. The following boundaries detection are widely used.
- G(x,y) = f(x,y) − (f(x,y) Ѳ SE)
- G(x,y) = (f(x,y) ⊕ SE) − f(x,y)
- G(x,y) = (f(x,y) ⊕ SE) − (f(x,y) Ѳ SE)

4.2.5 Region Filling


- It is process of coloring in define image region.
- It is defined as
- Xk = ( X k-1 ⊕ B)⋂Ā , K= 1,2,3…..
- A-input image
- B – structuring element
- K – number of iterations.
- Start the procedure of region filling from a point inside the region (e.g. near its top left corner) and then follow the formula.
- The algorithm terminates when Xk = Xk−1; then take the union of the last Xk with the input image.
Example: for given input image apply region filling technique

.
Solution: X0⊕B =

- Now (X0⊕B) ⋂ Ā = X1

- Now X1⊕B

- Now (X1⊕B) ⋂ Ā = X2
- A similar process is repeated until Xk = Xk−1.
- For this example X6 = X5, so the final output image = (X5 ∪ A). A sketch of the iteration is given below.
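A self-contained sketch of the iteration Xk = (Xk−1 ⊕ B) ∩ Aᶜ with a cross-shaped (N4) structuring element implemented by array shifts:

    import numpy as np

    def region_fill(A, seed):
        # A: binary boundary image (1 = boundary); seed: (row, col) inside it
        comp = 1 - A
        X = np.zeros_like(A)
        X[seed] = 1
        while True:
            d = X.copy()
            d[1:, :] |= X[:-1, :]; d[:-1, :] |= X[1:, :]   # dilate vertically
            d[:, 1:] |= X[:, :-1]; d[:, :-1] |= X[:, 1:]   # dilate horizontally
            nxt = d & comp                                  # stay inside the region
            if np.array_equal(nxt, X):
                return X | A                                # union with the boundary
            X = nxt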

4.2.6 Hit & Miss Transform


Question: Explain Hit & Miss transform using an example.
- The hit & miss transform is used for template matching. The transformation involves
two templates set B and W-B which are disjoint.
- Template B is used to match the foreground image while W-B is used to match the
background of image.
- The hit miss transform is defined as HM(X,B) = (X Ѳ B)⋂(Xc Ѳ(W_B)).
- The small window W is assumed to be at least one pixel thicker than B.
- The i/p image X & structuring element B are shown in fig.
- Accordingly equation first we have to do erosion of input image with structuring
element B then erosion of complement of input image with structuring element W-B.
- Take intersection of this two output which gives o/p of hit miss transform.
Example: given 7 X 7 image, use hit & miss transform to find top edge of the 5 X 5 square. Use
two structuring element.

     0 0 0 0 0 0 0
     0 1 1 1 1 1 0
     0 1 1 1 1 1 0
A =  0 1 1 1 1 1 0
     0 1 1 1 1 1 0
     0 1 1 1 1 1 0
     0 0 0 0 0 0 0

     0 0 0        0 1 0
B1 = 0 1 0   B2 = 0 0 0
     0 1 0        0 0 0
Solution: consider

        0 0 0 0 0 0 0
        0 1 1 1 1 1 0
        0 1 1 1 1 1 0
AθB1 =  0 1 1 1 1 1 0
        0 1 1 1 1 1 0
        0 0 0 0 0 0 0
        0 0 0 0 0 0 0
0 0 0 0 0 0 0
1 1 1 1 1 1 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
Ac= 1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 1 1 1 1 1 1

1 1 1 1 1 1 1
1 1 1 1 1 1 1
1 0 0 0 0 0 1
AcθB2 = 1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
1 0 0 0 0 0 1
0 0 0 0 0 0 0
0 1 1 1 1 1 0
(AθB1)⋂( AcθB2) =
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0

Example: For given image, use hit & miss transform find out output image.

B1 B2

Solution:

AθB1 =

Ac=

AcθB2 =
(AθB1)⋂( AcθB2) =

4.2.7 Thinning
- Thinning a binary image down to a unit width skeleton is useful not only to reduce the
amount of pixels, but also to simplify the computational procedure, required for shape
description.
- It is based on the hit or miss transformation as
X⊗B = X- HM(X,B) OR X⊗B = X⋂HM(X,B)c
- There are different possibilities of structuring element are used for thinning.
- Consider origin at center of structuring element.

X X X X
X X X X
X X X X

X X

X X

4.2.8 Thickening
- Thickening is the morphological operation which is used to grow selected region of
foreground pixels in binary images.
- It is defined as X⨀B = X ⋃ HM(X,B)
- The thickened image consists of the original image plus any additional foreground
pixels switched on by hit or miss transform.
- This process is normally applied repeatedly until it causes no further changes in the image.
- The different structuring element that can be used in thickening process.
- Consider origin at centre of structuring element.
X X X X
X X X X
X X X X

X X

X X

Question: Comparison between dilation and erosion

Dilation Erosion
- It is non linear operation related to - It is non linear operation related to
shape of image shape of image
- Add the pixels to the boundaries of - Remove the pixels from the
objects in image. boundaries of objects in image.
- It grows or thicken objects in - It shrink or thin object in a binary
binary image image.
- Dilation is given by the formula - Erosion is given by the formula
X ⊕ B = {z | (B̂)z ∩ X ≠ ∅}        X ⊖ B = {z | (B)z ⊆ X}
- Replaced pixels in i/p image which - Replaced pixels in i/p image which
behind origin of structuring with behind origin of structuring with
max valued pixel of its min valued pixel of its
neighborhood. neighborhood.
- Dilation of image B is equivalent of - Erosion of image B is equivalent of
the erosion of the complement of the Dilation of the complement of
the image B. the image B.
Unit 5
IMAGE RESTORATION
- Image restoration can be defined as the process of removal or reduction of degradation
in an image through linear & nonlinear filtering.
- It is objective process.
- Degradation can be due to:
I) image sensor noise
II) blur due to mis-focus
III) blur due to motion
IV) noise from transmission
V) blur due to the transmission channel

5.1 Degradation Model

- Degradation o/p is given by

g(x,y) = ∫ (−∞ to ∞) ∫ (−∞ to ∞) f(k,l) h(x − k, y − l) dk dl + n(x,y)
g(x,y) = f(x,y) * h(x,y) + n(x,y)
- Original image gets convolved with degradation function h(x,y) & noise gets added to it.
- Hence to remove this degradation function we need to apply inverse filtering to the
degradation image.
- Discrete degradation model is given by
g(x,y)=∑𝑘 ∑𝑙 ℎ(𝑥 − 𝑘, 𝑦 − 𝑙)𝑓(𝑘, 𝑙) + 𝑛(𝑥, 𝑦)

5.2 Degradation Function


- There are a few degradation functions, known as blurring functions:
- I) Blur due to atmospheric turbulence
- II) Blur due to scanning:
h(x,y) = rect(x/α, y/β)
- III) Blur due to CCD image acquisition:
h(x,y) = (1/K²) Σ (k = 0 to K−1) Σ (l = 0 to K−1) δ(x − k, y − l)
- IV) Blur due to horizontal motion:
h(x,y) = (1/α₀) rect(x/α₀ − 1/2) δ(y)

5.3 Discrete Degradation Model


- In absence of noise, degradation is given by g(x,y) =f(x,y )* h(x,y)
- For one dimensional g(x)= f(x)*h(x)
- This equation can be solved using the matrix notation g=Hf
- Here
f = [f(0), f(1), …, f(M−1)]ᵀ,  g = [g(0), g(1), …, g(M−1)]ᵀ
and H is the M × M matrix

H = [ h(0)    h(M−1)  …  h(1)
      h(1)    h(0)    …  ⋮
      ⋮       ⋮           ⋮
      h(M−1)  h(M−2)  …  h(0) ]

5.4 Inverse Filtering


Question: Short note on Inverse Filter.
Solution:
- We know that the i/p image f(x,y) gets convolved by a blurring function h(x,y) & changes
to g(x,y).
- Hence to retrieve f(x,y) from g(x,y) ,we take the inverse of h(x,y).
- Assume noise term to be zero.
- So g(x,y)=∑𝑘 ∑𝑙 ℎ(𝑥 − 𝑘, 𝑦 − 𝑙)𝑓(𝑘, 𝑙)
- Take Fourier transform of above equation
- G(u,v) = H(u,v) × F(u,v)
- F(u,v) = G(u,v) / H(u,v)
- Let HI(u,v) = 1 / H(u,v)

- So image inverse filtering is given by


- F(u,v) = HI(u,v) × G(u,v)
- Taking the inverse Fourier transform, we get
- f(x,y) = hI(x,y) * g(x,y)
- If we convolve the blurred image with the inverse of the blurring function, we get the original image.

- In the presence of noise,
- G(u,v) = [H(u,v) × F(u,v)] + N(u,v)
- F̂(u,v) = [G(u,v) − N(u,v)] / H(u,v)
- F̂(u,v) = G(u,v)/H(u,v) − N(u,v)/H(u,v)
- F̂(u,v) = G(u,v) HI(u,v) − N(u,v) HI(u,v)
- Wherever H(u,v) is zero (or very small), the term N(u,v)/H(u,v) becomes very large, since HI(u,v) = 1/H(u,v) → ∞.
- This is highly undesirable, so we have to modify it.

Pseudo- Inverse filtering


- For linear time invariant system with frequency response H(u,v), the pseudo inverse
filter is defined as
HPI(u,v) = { 1/H(u,v),  H(u,v) ≠ 0
           { 0,         otherwise
- Disadvantage: it is very sensitive to noise.
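A sketch of pseudo-inverse restoration in numpy (the threshold eps replaces the exact H(u,v) ≠ 0 test, a practical choice of ours):

    import numpy as np

    def pseudo_inverse_restore(g, h_psf, eps=1e-3):
        # Invert H(u,v) only where |H| exceeds eps; elsewhere the filter is 0
        G = np.fft.fft2(g)
        H = np.fft.fft2(h_psf, s=g.shape)        # zero-pad the PSF to image size
        keep = np.abs(H) > eps
        Hinv = np.where(keep, 1.0 / np.where(keep, H, 1.0), 0.0)
        return np.real(np.fft.ifft2(G * Hinv))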

5.5 Wiener Filter


Question: Short note on wiener filter.
Solution:
- Wiener filter has the capability of handling both the degradation function as well as
noise.
- From the image degradation model shown in fig , the error between the i/p signal f(x,y)
& the estimate signal 𝑓̂(x,y) is given by
e( x,y ) = f(x,y) - 𝑓̂(x,y)
- Square error is given by [f(x,y) - 𝑓̂(x,y)]2
- Mean square is given by E{[f(x,y) - 𝑓̂(x,y)]2}
- The objective of wiener filter is to minimize E{[f(x,y) - 𝑓̂(x,y)]2}
- According to the principle of orthogonality, E{[f(x,y) − f̂(x,y)] v(k',l')} = 0
- But f̂(x,y) = v(x,y) * g(x,y), i.e.
f̂(x,y) = Σk Σl g(x − k, y − l) v(k,l)
- So E{f(x,y) v(k',l')} = E{Σk Σl g(x − k, y − l) v(k,l) v(k',l')}
- The left hand side is the cross-correlation rfv, and the right hand side reduces to a convolution with the autocorrelation rvv:
rfv(x,y) = g(x,y) * rvv(x,y)
- Taking the Fourier transform,
Sfv(u,v) = G(u,v) × Svv(u,v)
G(u,v) = Sfv(u,v) / Svv(u,v)
- This is the frequency response of the Wiener filter.


- For restoration
- f̂(x,y) = g(x,y) * v(x,y)
- F̂(u,v) = G(u,v) × V(u,v)
- With noise,
- g(x,y) = f(x,y) * h(x,y) + n(x,y)
- Since f(x,y) and n(x,y) are uncorrelated, E[f(x,y) n(x,y)] = 0.
- This gives Svv(u,v) = |H(u,v)|² Sff(u,v) + Snn(u,v) and Sfv(u,v) = H*(u,v) Sff(u,v)
- We know G(u,v) = Sfv(u,v) / Svv(u,v), so

G(u,v) = H*(u,v) Sff(u,v) / (|H(u,v)|² Sff(u,v) + Snn(u,v))

G(u,v) = H*(u,v) / (|H(u,v)|² + Snn(u,v)/Sff(u,v))

- This is known as the Wiener filter response.
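A sketch of applying this response with the common constant-ratio approximation K ≈ Snn/Sff (K is a tuning parameter introduced here, not part of the derivation):

    import numpy as np

    def wiener_restore(g, h_psf, K=0.01):
        # W(u,v) = H*(u,v) / (|H(u,v)|^2 + K), applied to the degraded image g
        G = np.fft.fft2(g)
        H = np.fft.fft2(h_psf, s=g.shape)
        W = np.conj(H) / (np.abs(H) ** 2 + K)
        return np.real(np.fft.ifft2(W * G))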


- From the fig, it is evident that the image gets degraded due to the degradation function
and additive noise component.
- Let us study the effects of the two separately.

1) No blur only additive noise:

- If H(u,v) = 1, then

G(u,v) = 1 / (1 + Snn(u,v)/Sff(u,v))

G(u,v) = Sff(u,v) / (Sff(u,v) + Snn(u,v))

- Dividing numerator and denominator by Snn(u,v):

G(u,v) = SNR(u,v) / (SNR(u,v) + 1)

- where SNR(u,v) = Sff(u,v)/Snn(u,v) is the signal to noise ratio at frequency (u,v).
- From this equation we realize that when the SNR is large, G(u,v) ≈ 1, and when the SNR is small, G(u,v) ≈ SNR.
- Hence G(u,v) acts as low pass filter. It is called the wiener smoothing filter.
- An important property of this filter is that signal attenuation is in proportional to the
signal to noise ratio.

2) No noise only blur:

Since Snn(u,v) = 0,

G(u,v) = H*(u,v) / |H(u,v)|²

G(u,v) = 1 / H(u,v)
- This is the inverse filter. Since blurring is usually a low pass operation, the Wiener filter in the absence of noise acts as a high pass filter.
- When both noise and blur are present, the Wiener filter acts as a band pass filter.

5.6 Noise Model


Question: Short note on image noise model.

Solution:

1) Gaussian Noise:
- It provides a good model of noise. Gaussian models are very popular and are at times used when all other noise models fail.

p(z) = (1 / (σ√(2π))) e^(−(z − μ)² / 2σ²)     z = grey level, σ = standard deviation, μ = mean

2) Rayleigh Noise:
p(z) = (2/b)(z − a) e^(−(z − a)²/b),  z ≥ a
     = 0,                             z < a

3) Gamma Noise:
p(z) = (a^b z^(b−1) / (b − 1)!) e^(−az),  z ≥ 0
     = 0,                                 z < 0

4) Exponential Noise:
p(z) = a e^(−az),  z ≥ 0
     = 0,          z < 0

5) Salt & Pepper Noise:
p(z) = Pa,  z = a
     = Pb,  z = b
     = 0,   otherwise

6) Uniform Noise:
p(z) = 1/(b − a),  a ≤ z ≤ b
     = 0,          otherwise
Question: What are the different types of order statistics filters? Discuss their
advantages.

Solution:
- Median, Max- Min all are consider as statistics filters.
- Max-Min filter:
- These are actually two separate filters; like the median filter, they work within a neighbourhood. The max filter is given by,
f^(x,y) = max {g(m,n)}.
- This simply means that we take the maximum value from the neighbourhood and
replace it at the centre.
- The min filter is given by,
f^(x,y) = min{g(m,n)}.
- This simply means that we take the minimum value from the neighbourhood and
replace it at the centre.
- These filters help us identify the brightest and darkest points within the image.
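All three order-statistics filters share the same sliding-window skeleton; a sketch (the choice of op and the border-unchanged convention are ours):

    import numpy as np

    def order_statistic_filter(img, op=np.max, ksize=3):
        # op = np.max, np.min or np.median; border pixels are left unchanged
        pad = ksize // 2
        out = img.copy()
        for i in range(pad, img.shape[0] - pad):
            for j in range(pad, img.shape[1] - pad):
                out[i, j] = op(img[i-pad:i+pad+1, j-pad:j+pad+1])
        return out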

Question: Difference between image restoration and image enhancement.


Solution:

image restoration image enhancement


- Image restoration is an attempt - Image enhancement is an attempt to
to restore an image to its some ideal improve an image beyond what the
(sometimes fictional) fidelity, such by camera took, by adding colour,
removing scratches, blur or noise. contrast or detail that wasn't really
there.
- It removes effect of sensing - It gives better visual representation
environment
- Mathematical degradation model is - No model is required.
require
- It is objective process. - It is subjective process.
- Inverse filtering, wiener filtering, - Contrast stretching, histogram
denoising are some restoration equalization are some enhancement
techniques. techniques.