
Fundamental Steps in Digital Image Processing

1. Image Acquisition:
In image processing, image acquisition is defined as the action of retrieving an image from some source, usually a hardware-based source, for processing.
It is the first step in the workflow sequence because, without an image, no processing is possible. The image that is acquired is completely unprocessed. Pre-processing such as scaling is typically performed as part of image acquisition.
2. Image Enhancement:
It is the process of adjusting digital images so that the results are more suitable for display or further image analysis. Usually it includes sharpening of images, brightness and contrast adjustment, removal of noise, etc. In image enhancement, we generally try to modify the image so as to make it more pleasing to the eye.
3. Image Restoration:
It is the process of recovering an image that has been degraded, using some knowledge of the degradation function H and of the additive noise term. Unlike image enhancement, image restoration is completely objective in nature.
4. Color Image Processing:
This part handles the image processing of colored images either as indexed images or RGB images.
5. Wavelets and multiresolution processing:
Wavelets are small waves of limited duration which are used to calculate wavelet transform which provides
time-frequency information.
Wavelets lead to multiresolution processing in which images are represented in various degrees of resolution.
6. Compression:
Compression deals with the techniques for reducing the storage space required to save an image or the
bandwidth required to transmit it.
This is particularly useful for displaying images on the internet: if an image is large, it uses more bandwidth (data) to fetch from the server and increases the loading time of the website.
7. Morphological Processing:
It deals with extracting image components that are useful in representation and description of shape.
It includes basic morphological operations like erosion and dilation. As seen from the block diagram, the outputs of morphological processing are generally image attributes.
8. Segmentation:
It is the process of partitioning a digital image into multiple segments. It is generally used to locate objects and their boundaries in images.
9. Representation and Description:
Representation deals with converting the data into a form suitable for computer processing.
Boundary representation: used when the focus is on external shape characteristics, e.g. corners.
Regional representation: used when the focus is on internal properties, e.g. texture.
Description deals with extracting attributes that result in some quantitative information of interest and that can be used to differentiate one class of objects from another.
10. Recognition:
It is the process that assigns a label (e.g. car) to an object based on its description.
Knowledge Base:
Knowledge about a problem domain is coded into an image processing system in the form of a knowledge base.
Digital Image Processing Basics:
Digital Image Processing means processing a digital image by means of a digital computer. We can also say that it is the use of computer algorithms to obtain an enhanced image or to extract useful information from it.
Digital image processing is the use of algorithms and mathematical models to process and analyze digital
images. The goal of digital image processing is to enhance the quality of images, extract meaningful
information from images, and automate image-based tasks.
The basic steps involved in digital image processing are:
Image acquisition: This involves capturing an image using a digital camera or scanner, or importing an
existing image into a computer.
Image enhancement: This involves improving the visual quality of an image, such as increasing contrast,
reducing noise, and removing artifacts.
Image restoration: This involves removing degradation from an image, such as blurring, noise, and distortion.
Image segmentation: This involves dividing an image into regions or segments, each of which corresponds
to a specific object or feature in the image.
Image representation and description: This involves representing an image in a way that can be analyzed and
manipulated by a computer, and describing the features of an image in a compact and meaningful way.
Image analysis: This involves using algorithms and mathematical models to extract information from an
image, such as recognizing objects, detecting patterns, and quantifying features.
Image synthesis and compression: This involves generating new images or compressing existing images to
reduce storage and transmission requirements.
Digital image processing is widely used in a variety of applications, including medical imaging, remote
sensing, computer vision, and multimedia.
What is an image?
An image is defined as a two-dimensional function, F(x, y), where x and y are spatial coordinates, and the amplitude of F at any pair of coordinates (x, y) is called the intensity of the image at that point. When x, y, and the amplitude values of F are all finite, we call it a digital image.
In other words, an image can be defined by a two-dimensional array specifically arranged in rows and
columns.
A digital image is composed of a finite number of elements, each of which has a particular value at a particular location. These elements are referred to as picture elements, image elements, and pixels, with "pixel" being the term most widely used to denote the elements of a digital image.
Types of an image
BINARY IMAGE– The binary image, as its name suggests, contains only two pixel values, i.e. 0 and 1, where 0 refers to black and 1 refers to white. This image is also known as a monochrome image.
BLACK AND WHITE IMAGE– An image which consists of only black and white pixels is called a black and white image.
8-bit COLOR FORMAT– It is one of the most widely used image formats. It has 256 different shades and is commonly known as the grayscale format. In this format, 0 stands for black, 255 stands for white, and 127 stands for gray.
16-bit COLOR FORMAT– It is a color image format with 65,536 different colors, also known as the High Color format. In this format the distribution of bits is not the same as in a grayscale image: the 16 bits are split across three channels, Red, Green and Blue, the familiar RGB format.
Differences between RGB and CMYK color schemes
Both RGB and CMYK are color schemes used for mixing color in graphic design. CMYK and RGB colors are
rendered differently depending on which medium they are used for, mainly electronic-based or print based.
1. RGB Color Scheme :
RGB stands for Red Green Blue. It is the color scheme for digital images. RGB color mode is used if the
project is to be displayed on any screen. RGB color scheme is used in electronic displays such as LCD, CRT,
cameras, scanners, etc.
This color scheme is an additive mode that combines the colors red, green, and blue in varying degrees to create a variety of different colors. When all three colors are combined at their fullest degree, the result is white; for example, white is RGB (255, 255, 255). When all three colors are combined at their lowest value, the result is black; for example, black is RGB (0, 0, 0).
The RGB color scheme offers a wider range of colors than CMYK and hence is preferred in most computer software.
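As a quick illustration of the additive mixing just described (a minimal MATLAB sketch, assuming the Image Processing Toolbox is available for imshow/imresize; the pixel values are the ones quoted above):

% Additive RGB mixing: a 1x5 swatch -> red, green, blue, white, black
swatch = zeros(1, 5, 3, 'uint8');
swatch(:, :, 1) = [255   0   0 255   0];   % red channel values
swatch(:, :, 2) = [  0 255   0 255   0];   % green channel values
swatch(:, :, 3) = [  0   0 255 255   0];   % blue channel values
imshow(imresize(swatch, 40, 'nearest'))    % enlarge with nearest-neighbor so the patches stay flat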
Uses of RGB Color Scheme –
Used when project involves digital screens like computers, mobile, TV etc.
Used in web and application design.
Used in online branding.
Used in social media.
2. CMYK Color Scheme :
CMYK stands for Cyan, Magenta, Yellow, Key (Black). It is the color scheme used for printed materials. This color mode uses cyan, magenta, yellow and black as primary colors, which are combined to different extents to produce different colors.
This color scheme is a subtractive mode that combines cyan, magenta, yellow and black in varying degrees to create a variety of different colors. A printing machine creates images by combining these colors with physical ink. When all colors are at 0%, the result is white (the paper color), e.g. CMYK (0%, 0%, 0%, 0%); when all colors are combined at full strength, the result is black.
Uses of CMYK Color scheme –
Used when project involves physically printed designs etc.
Used in physical branding like business cards etc.
Used in advertising like posters, billboards, flyers etc.
Used in cloth branding like t-shirts etc.
When to use which color scheme?
If the project involves printing something, such as a business card, poster, or newsletter, use the CMYK scheme.
If the project involves something that will only be seen digitally, use the RGB scheme.
Table: RGB Color Scheme vs. CMYK Color Scheme
RGB: used for digital work. CMYK: used for print work.
RGB: primary colors are Red, Green, Blue. CMYK: primary colors are Cyan, Magenta, Yellow, Black.
RGB: additive type mixing. CMYK: subtractive type mixing.
RGB: colors of images are more vibrant. CMYK: colors are less vibrant.
RGB: has a wider range of colors than CMYK. CMYK: has a narrower range of colors than RGB.
RGB: file formats include JPEG, PNG, GIF, etc. CMYK: file formats include PDF, EPS, etc.
RGB: basically used for online logos, online ads, digital graphics, and photographs for websites, social media, or apps. CMYK: basically used for business cards, stationery, stickers, posters, brochures, etc.
Histogram
A digital image is a two-dimensional matrix over two spatial coordinates, with each cell specifying the intensity level of the image at that point. So, we have an N x N matrix with integer values ranging from a minimum intensity level of 0 to a maximum level of L-1, where L denotes the number of intensity levels. Hence, the intensity level r of a pixel can take on values from 0, 1, 2, ..., (L-1). Generally, L = 2^m, where m is the number of bits required to represent the intensity levels. An intensity of 0 denotes complete black, whereas level L-1 indicates complete white.

Intensity Transformation:
Intensity transformation is a basic digital image processing technique, where the pixel intensity levels of an image are transformed to new values using a mathematical transformation function, so as to get a new output image. In essence, an intensity transformation simply implements the following function:

s = T(r)

where s is the new pixel intensity level and r is the original pixel intensity value of the given image, with r ≥ 0.

With different forms of the transformation function T(r), we get different output images.

Common Intensity Transformation Functions:

1. Image negation: This reverses the grayscale of an image, making dark pixels brighter and bright pixels darker. It is completely analogous to a photographic negative, hence the name.

s=L-1-r

2. Log Transform: Here c is some constant. It is used for expanding the dark pixel values in an image.

s = c log(1+r)

3. Power-law Transform: Here c and γ are arbitrary constants. This transform can be used for a variety of purposes by varying the value of γ.

s = c r^γ
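These three transformations can be sketched in a few lines of MATLAB. This is a minimal illustration, assuming the Image Processing Toolbox and its bundled sample image cameraman.tif; intensities are normalized to [0, 1], so (L-1) becomes 1 and the constants are chosen only to keep the output in range:

% Intensity transformations on a grayscale image, intensities in [0, 1]
r = im2double(imread('cameraman.tif'));

neg = 1 - r;                    % image negation: s = (L-1) - r
c   = 1 / log(2);               % constant so that the log output stays within [0, 1]
lg  = c * log(1 + r);           % log transform: s = c*log(1 + r)
gma = 0.4;                      % gamma < 1 expands (brightens) dark values
pw  = r .^ gma;                 % power-law transform: s = c*r^gamma, with c = 1

figure
subplot(2,2,1), imshow(r),   title('Original')
subplot(2,2,2), imshow(neg), title('Negative')
subplot(2,2,3), imshow(lg),  title('Log transform')
subplot(2,2,4), imshow(pw),  title('Power-law, gamma = 0.4')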

Histogram Equalization:

The histogram of a digital image with intensity levels between 0 and (L-1) is a function h(rk) = nk, where rk is the kth intensity level and nk is the number of pixels in the image having that intensity level. We can also normalize the histogram by dividing it by the total number of pixels in the image. For an N x N image, the normalized histogram function is defined as:

p(rk) = nk / N^2

This p(rk) function is the probability of occurrence of a pixel with intensity level rk. Clearly,

Σk p(rk) = 1

The histogram of an image, as shown in the figure, consists of the x-axis representing the intensity levels rk and the y-axis
denoting the h(rk) or the p(rk) functions.

The histogram of an image gives important information about the grayscale and contrast of the image. If the entire histogram of an image is concentrated towards the left end of the x-axis, the image is dark. If the histogram is concentrated towards the right end, the image is bright. A narrow histogram centered on the intensity axis indicates a low-contrast image, since only a few grayscale levels are used. On the other hand, a histogram distributed evenly over the entire x-axis gives a high-contrast effect to the image.

In image processing, there frequently arises the need to improve the contrast of the image. In such cases, we use an intensity
transformation technique known as histogram equalization. Histogram equalization is the process of uniformly distributing
the image histogram over the entire intensity axis by choosing a proper intensity transformation function. Hence, histogram
equalization is an intensity transformation process.
The choice of the ideal transformation function for uniform distribution of the image histogram is mathematically explained
below.

Mathematical Derivation of Transformation Function for Histogram Equalization:

Let us consider that the intensity level r of the image is continuous, unlike the discrete case in digital images. We limit the values that r can take to between 0 and L-1, that is, 0 ≤ r ≤ L-1; r = 0 represents black and r = L-1 represents white. Let us consider an arbitrary transformation function:

s = T( r )

where s denotes the intensity levels of the resultant image. We have certain constraints on T(r).

T(r) must be a strictly increasing function. This makes it injective (one-to-one).

0 ≤ T(r) ≤ L-1, so that the output intensities remain in the valid range.

Together these conditions make T(r) a bijection between the input and output intensity ranges. We know that such functions are invertible, so we can get back the r values from s through a function r = T^(-1)(s).

Let us now say that the probability density function (pdf) of r is pr(x) and the cumulative distribution function (CDF) of r is Fr(x). Then the CDF of s is:

Fs(x) = P(s ≤ x) = P(T(r) ≤ x) = P(r ≤ T^(-1)(x)) = Fr(T^(-1)(x)).

The first condition on T(r) is precisely what makes the above step hold true. The second condition is needed because s is the intensity value of the output image and so must lie between 0 and (L-1).

The pdf of s can be obtained by differentiating Fs(x) with respect to x. We get the following relation:

ps(s) = pr(r) |dr/ds|

Now, if we define the transformation function as:

s = T(r) = (L-1) ∫_0^r pr(x) dx

then this function gives us a uniform pdf for s. By Leibnitz's integral rule,

ds/dr = (L-1) pr(r)

and substituting this derivative into the relation above gives:

ps(s) = pr(r) / [(L-1) pr(r)] = 1/(L-1),   for 0 ≤ s ≤ L-1

So the pdf of s is uniform. This is what we want.

Now, we extend the above continuous case to the discrete case. The natural replacement of the integral is a summation, which leaves us with the following histogram equalization transformation function:

sk = T(rk) = (L-1) Σ(j=0 to k) pr(rj) = ((L-1)/N^2) Σ(j=0 to k) nj

Since s must take integer values, any non-integer value obtained from the above function is rounded to the nearest integer.

Matlab example:

% Histogram Equalization in MATLAB
% reading image
I = imread("GFG.jfif");
figure
subplot(1,3,1)
imshow(I)
subplot(1,3,2:3)
imhist(I)
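The example above only reads an image and displays its histogram. To perform the equalization itself, the discrete transformation function derived earlier can be implemented directly; MATLAB's built-in histeq performs essentially the same mapping. A minimal sketch, assuming the Image Processing Toolbox and its low-contrast sample image pout.tif:

% Histogram equalization from the discrete formula (manual sketch)
I  = imread('pout.tif');              % low-contrast sample image bundled with the toolbox
L  = 256;                             % number of intensity levels
nk = imhist(I, L);                    % nk = number of pixels at each level rk
pk = nk / numel(I);                   % normalized histogram p(rk)
sk = round((L - 1) * cumsum(pk));     % sk = (L-1) * sum_{j=0..k} p(rj), rounded to integers
J  = uint8(sk(double(I) + 1));        % map every pixel r -> s through the lookup table

figure
subplot(2,2,1), imshow(I), subplot(2,2,2), imhist(I)   % original image and histogram
subplot(2,2,3), imshow(J), subplot(2,2,4), imhist(J)   % equalized image and histogram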
Region and edge-based segmentation are two fundamental techniques in digital image processing used to
partition an image into meaningful segments. These segments typically correspond to objects or regions of
interest within the image. Each approach has its own methodology and applications. Here's a detailed look at
both:
Region-Based Segmentation
Region-based segmentation divides an image into regions that are similar according to a set of predefined
criteria. The main goal is to group pixels into regions with similar properties, such as intensity, color, or
texture. The following are some common methods used in region-based segmentation:
Thresholding:
Global Thresholding: A single threshold value is used for the entire image. Pixels with intensity values above
the threshold are grouped into one region, and those below it into another.
Adaptive Thresholding: The threshold value varies over different parts of the image, making it useful for
images with varying lighting conditions.
Region Growing:
This method starts with a set of seed points and grows regions by appending neighboring pixels that have
similar properties.
The similarity criteria can be based on intensity, color, or texture.
Region growth continues until no more pixels satisfy the criteria for inclusion in any region.
Region Splitting and Merging:
The image is initially divided into a set of disjoint regions.
Splitting divides regions that do not meet the homogeneity criterion into smaller regions.
Merging combines adjacent regions that are similar according to a predefined criterion.
Watershed Segmentation:
Treats the image as a topographic surface, where pixel values represent the altitude.
The algorithm finds the lines that separate different catchment basins (regions) on this topographic surface.
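The region-based methods above map directly onto Image Processing Toolbox functions. The following is a minimal sketch, assuming the toolbox and its sample image coins.png; the unmarked watershed call is purely illustrative, since in practice markers are imposed first to avoid over-segmentation:

% Global thresholding with Otsu's method
I   = imread('coins.png');                 % grayscale sample image bundled with the toolbox
T   = graythresh(I);                       % Otsu threshold, normalized to [0, 1]
BWg = imbinarize(I, T);                    % pixels above T -> 1, below -> 0

% Adaptive (locally varying) thresholding
BWa = imbinarize(I, 'adaptive', 'Sensitivity', 0.5);

% Watershed of the gradient magnitude (over-segments without markers)
gmag = imgradient(I);
Lw   = watershed(gmag);

figure
subplot(2,2,1), imshow(I),             title('Original')
subplot(2,2,2), imshow(BWg),           title('Global (Otsu) threshold')
subplot(2,2,3), imshow(BWa),           title('Adaptive threshold')
subplot(2,2,4), imshow(label2rgb(Lw)), title('Watershed regions')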
Edge-Based Segmentation
Edge-based segmentation focuses on detecting edges, which represent boundaries between different
regions or objects in an image. Edges are detected based on discontinuities in image intensity or color. The
primary techniques used in edge-based segmentation include:
Gradient-Based Methods:
Use gradient operators (e.g., Sobel, Prewitt, and Roberts) to detect edges by finding the maximum change in
intensity.
These methods calculate the gradient magnitude and direction at each pixel, highlighting regions of rapid
intensity change.
Laplacian-Based Methods:
The Laplacian of an image highlights regions of rapid intensity change and can be used to detect edges.
Often used in conjunction with Gaussian smoothing (Laplacian of Gaussian) to reduce noise before edge
detection.
Canny Edge Detector:
A multi-stage algorithm that includes steps for noise reduction, gradient calculation, non-maximum
suppression, and edge tracking by hysteresis.
The Canny detector is known for its ability to detect true weak edges while minimizing the detection of false
edges.
Zero-Crossing Methods:
Detect edges by finding zero-crossings in the second derivative of the image intensity, which often
correspond to the locations of edges.
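All four families of detectors described above are available through the toolbox's edge function. A minimal sketch, assuming the Image Processing Toolbox and its sample image cameraman.tif:

I = imread('cameraman.tif');

BW_sobel = edge(I, 'sobel');        % gradient-based detector
BW_log   = edge(I, 'log');          % Laplacian of Gaussian
BW_canny = edge(I, 'canny');        % multi-stage Canny detector
BW_zero  = edge(I, 'zerocross');    % zero-crossings of a filtered second derivative

figure
subplot(2,2,1), imshow(BW_sobel), title('Sobel')
subplot(2,2,2), imshow(BW_log),   title('Laplacian of Gaussian')
subplot(2,2,3), imshow(BW_canny), title('Canny')
subplot(2,2,4), imshow(BW_zero),  title('Zero-crossing')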
Comparison and Applications
Region-Based Segmentation:
Better suited for images where regions are clearly distinguishable by certain properties.
Commonly used in medical imaging, object recognition, and texture analysis.
Edge-Based Segmentation:
Effective for images where the boundaries between regions are well-defined and can be characterized by
sharp intensity changes.
Widely used in applications such as image enhancement, object detection, and computer vision tasks.

Difference between Dilation and Erosion


Dilation and Erosion are basic morphological processing operations that produce contrasting results when
applied to either gray-scale or binary images.
Dilation:
Dilation causes regions to grow outwards from their boundaries.
The dilation of A by a structuring element B is written A ⊕ B (Minkowski addition).
Erosion:
Erosion involves the removal of pixels at the edges of a region.
Erosion is the dual of dilation.
Both dilation and erosion are produced by the interaction of the image with a set called a structuring element (SE).
Difference between Dilation and Erosion:
Dilation: increases the size of objects. Erosion: decreases the size of objects.
Dilation: fills holes and broken areas. Erosion: removes small anomalies.
Dilation: connects areas that are separated by spaces smaller than the structuring element. Erosion: removes objects smaller than the structuring element.
Dilation: increases the brightness of bright objects. Erosion: reduces the brightness of bright objects.
Dilation: follows the distributive, duality, translation and decomposition properties. Erosion: likewise follows properties such as duality.
Dilation: denoted A ⊕ B. Erosion: denoted A ⊖ B, the dual of dilation.
Dilation: applied first in the Closing operation. Erosion: applied second in the Closing operation.
Dilation: applied second in the Opening operation. Erosion: applied first in the Opening operation.
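A minimal MATLAB sketch of both operations, assuming the Image Processing Toolbox and its sample binary image circles.png:

BW = imread('circles.png');          % binary sample image bundled with the toolbox
se = strel('square', 3);             % 3x3 structuring element (SE)

D = imdilate(BW, se);                % objects grow; holes and broken areas are filled
E = imerode(BW, se);                 % objects shrink; small anomalies disappear

figure
subplot(1,3,1), imshow(BW), title('Original')
subplot(1,3,2), imshow(D),  title('Dilated')
subplot(1,3,3), imshow(E),  title('Eroded')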
Introduction to JPEG Compression
JPEG is an image compression standard which was developed by the Joint Photographic Experts Group. In 1992, it was accepted as an international standard. JPEG is a lossy image compression method. JPEG compression uses the DCT (Discrete Cosine Transform) method for the coding transformation. It allows the tradeoff between storage size and image quality to be adjusted.

Following are the steps of JPEG Image Compression-

Step 1: The input image is divided into small blocks of 8x8 pixels, i.e. 64 pixels per block. Each unit of the image is called a pixel.

Step 2: JPEG uses the [Y, Cb, Cr] color model instead of the [R, G, B] model, so in the second step RGB is converted into YCbCr.

Step 3: After the color conversion, each block is forwarded to the DCT. The DCT uses a cosine function and does not use complex numbers. It converts the information in a block of pixels from the spatial domain to the frequency domain.

DCT Formula (for one 8x8 block):

F(u, v) = (1/4) C(u) C(v) Σ(x=0 to 7) Σ(y=0 to 7) f(x, y) cos[(2x+1)uπ/16] cos[(2y+1)vπ/16]

where C(u) = 1/√2 for u = 0 and C(u) = 1 otherwise (and similarly for C(v)).
Step 4: Humans are unable to perceive the fine, high-frequency details of an image, so these can be discarded. After the DCT, the visually significant information is concentrated in the low-frequency coefficients. Quantization is used to reduce the number of bits per sample.

There are two types of Quantization:

1. Uniform Quantization
2. Non-Uniform Quantization

Step 5: A zigzag scan is used to map the 8x8 matrix to a 1x64 vector. Zigzag scanning groups the low-frequency coefficients at the top of the vector and the high-frequency coefficients towards the bottom, so that the long runs of zeros in the quantized matrix end up together and can be encoded efficiently.

Step 6: The next step is coding of the DC coefficients: differential pulse code modulation (DPCM) is applied to the DC component. DC components are large and varied, but they are usually close to the DC value of the previous block, so DPCM encodes the difference between the DC coefficient of the current block and that of the previous block.

Step 7: In this step, Run Length Encoding (RLE) is applied to the AC components. This is done because the AC components contain many zeros. They are encoded as (skip, value) pairs, where skip is the number of zeros preceding a non-zero coefficient and value is that non-zero coefficient.
Step 8: Finally, the DPCM-coded DC coefficients and the run-length-coded AC coefficients are entropy coded using Huffman coding.
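The transform-and-quantize core of the pipeline (Steps 1, 3 and 4) can be sketched with the toolbox's dct2/idct2 and blockproc. This is only a simplified illustration, not the JPEG codec itself: it works on a grayscale stand-in for the Y channel, replaces the standard 8x8 quantization tables with a single uniform step, and omits the zigzag scan and entropy coding; cameraman.tif is assumed as a sample image:

I = double(imread('cameraman.tif'));        % 8-bit grayscale, values 0..255, size divisible by 8
Q = 16;                                     % one uniform quantization step (real JPEG uses 8x8 tables)

fwd = @(b) round(dct2(b.data) / Q);         % per 8x8 block: DCT followed by quantization
bwd = @(b) idct2(b.data * Q);               % per 8x8 block: dequantization followed by inverse DCT

C = blockproc(I, [8 8], fwd);               % quantized DCT coefficients
R = blockproc(C, [8 8], bwd);               % reconstructed image

figure
subplot(1,2,1), imshow(uint8(I)), title('Original')
subplot(1,2,2), imshow(uint8(R)), title('Reconstructed after quantization')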
Discrete Fourier Transform

The discrete Fourier transform (DFT) is "the Fourier transform for finite-length sequences" because, unlike
the (discrete-space) Fourier transform, the DFT has a discrete argument and can be stored in a finite
number of infinite word-length locations. Yet, it turns out that the DFT can be used to exactly implement
convolution for finite-size arrays. Our approach to the DFT will be through the discrete Fourier series DFS,
which is made possible by the isomorphism between rectangular periodic and finite-length, rectangular-
support sequences.

Definition (discrete Fourier transform)

For a finite-support sequence x(n1, n2) with support 0 ≤ n1 ≤ N1-1, 0 ≤ n2 ≤ N2-1, we define its DFT, for integers 0 ≤ k1 ≤ N1-1 and 0 ≤ k2 ≤ N2-1, as follows:

X(k1, k2) = Σ(n1=0 to N1-1) Σ(n2=0 to N2-1) x(n1, n2) exp[-j2π(k1 n1/N1 + k2 n2/N2)]      (4.2.1)


We note that the DFT maps finite-support rectangular sequences into themselves. Pictures of the DFT basis functions of size 8 x 8 are shown with real parts in Figure (4.1) and imaginary parts in Figure (4.2). The real parts of the basis functions represent the components of x that are symmetric with respect to the 8 x 8 square, while the imaginary parts represent the nonsymmetric parts. In these figures, white is maximum positive (+1), mid-gray is 0, and black is minimum negative (-1). Each basis function occupies a small square, all of which are then arranged into an 8 x 8 mosaic. Note that the highest frequencies are in the middle, at (k1, k2) = (4, 4), and correspond to the Fourier transform at (ω1, ω2) = (π, π). So the DFT is seen as a projection of the finite-support input sequence x(n1, n2) onto these basis functions, and the DFT coefficients are the representation coefficients for this basis. The inverse DFT (IDFT) exists and is given by

x(n1, n2) = (1 / (N1 N2)) Σ(k1=0 to N1-1) Σ(k2=0 to N2-1) X(k1, k2) exp[+j2π(k1 n1/N1 + k2 n2/N2)]      (4.2.2)

The DFT can thus be seen as a representation of x in terms of the basis functions exp[+j2π(k1 n1/N1 + k2 n2/N2)] and the expansion coefficients X(k1, k2).


The correctness of this 2-D IDFT formula can be seen in a number of different ways. Perhaps the easiest, at this point, is to realize that the 2-D DFT is a separable transform, and as such, we can realize it as the concatenation of two 1-D DFTs.

Since the 2-D IDFT is just the inverse of each of these 1-D DFTs in the given order, say rows first and then columns, we have the desired result based on the known validity of the 1-D DFT/IDFT transform pair, applied twice.

A second method of proof is to rely on the DFS. The key concept is that rectangular periodic and rectangular finite-support sequences are isomorphic to one another: given a periodic sequence x̃, we can define a finite-support x by restricting x̃ to one period, and given a finite-support x, we can find the corresponding periodic sequence as x̃(n1, n2) = x(<n1>N1, <n2>N2), where the notation <n>N means "n mod N". Still a third method is simply to insert (4.2.1) into (4.2.2) and perform the 2-D proof directly.
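The DFT pair (4.2.1)-(4.2.2) and its use for exact convolution of finite-size arrays can be checked numerically with MATLAB's fft2/ifft2 (no toolbox required). The 8x8 array and the small kernel below are arbitrary illustrative choices:

% 2-D DFT, inverse DFT, and DFT-based linear convolution via zero padding
x = magic(8);                          % an 8x8 finite-support "image"
h = [1 2 1; 2 4 2; 1 2 1] / 16;        % a small smoothing kernel

X  = fft2(x);                          % forward 2-D DFT, equation (4.2.1)
xr = real(ifft2(X));                   % inverse 2-D DFT, equation (4.2.2), recovers x
fprintf('max reconstruction error: %g\n', max(abs(x(:) - xr(:))));

% Multiplying DFTs gives circular convolution; padding both arrays to
% (N1+M1-1) x (N2+M2-1) makes it equal to ordinary linear convolution.
P1 = size(x,1) + size(h,1) - 1;
P2 = size(x,2) + size(h,2) - 1;
y_dft    = real(ifft2(fft2(x, P1, P2) .* fft2(h, P1, P2)));
y_direct = conv2(x, h);                % direct linear convolution for comparison
fprintf('max difference vs conv2: %g\n', max(abs(y_dft(:) - y_direct(:))));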
Spatial filtering is a technique applied directly to the pixels of an image. The mask (filter kernel) is usually odd in size (e.g. 3x3) so that it has a specific center pixel. This mask is moved over the image such that the center of the mask traverses all image pixels.

Classification on the basis of linearity:


There are two types:

1. Linear Spatial Filter
2. Non-linear Spatial Filter


General Classification:

Smoothing Spatial Filter: A smoothing filter is used for blurring and noise reduction in an image. Blurring is a pre-processing step for the removal of small details, and noise reduction is accomplished by blurring.

Types of Smoothing Spatial Filter:

1. Linear Filter (Mean Filter)
2. Order-Statistics (Non-linear) Filter


These are explained below.

1. Mean Filter:
A linear spatial (mean) filter simply replaces each pixel value with the average of the pixels contained in the neighborhood of the filter mask. The idea is to replace the value of every pixel in an image by the average of the grey levels in the neighborhood defined by the filter mask.
Types of Mean filter:
 (i) Averaging filter: It is used to reduce the detail in an image. All coefficients are equal.
 (ii) Weighted averaging filter: In this, pixels are multiplied by different coefficients, with the center pixel multiplied by a higher value than in the averaging filter.

2. Order-Statistics Filter:
It is based on an ordering of the pixels contained in the image area encompassed by the filter. It replaces the value of the center pixel with the value determined by the ranking result. Edges are better preserved by this kind of filtering.
Types of Order-Statistics filter:
 (i) Minimum filter: The 0th percentile filter is the minimum filter. The value of the center pixel is replaced by the smallest value in the window.
 (ii) Maximum filter: The 100th percentile filter is the maximum filter. The value of the center pixel is replaced by the largest value in the window.
 (iii) Median filter: Each pixel in the image is considered in turn. The neighboring pixels are first sorted, and the original value of the pixel is replaced by the median of the sorted list.
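A minimal sketch of the mean, weighted-mean, and median filters, assuming the Image Processing Toolbox and its sample image eight.tif, with salt-and-pepper noise added so the differences are visible:

I = imread('eight.tif');                      % grayscale sample image bundled with the toolbox
J = imnoise(I, 'salt & pepper', 0.05);        % add impulse noise for illustration

h_avg = fspecial('average', 3);               % 3x3 averaging mask, all coefficients equal
h_wav = [1 2 1; 2 4 2; 1 2 1] / 16;           % weighted averaging mask, center weighted most

M1 = imfilter(J, h_avg, 'replicate');         % mean filter
M2 = imfilter(J, h_wav, 'replicate');         % weighted averaging filter
M3 = medfilt2(J, [3 3]);                      % median (order-statistics) filter

figure
subplot(2,2,1), imshow(J),  title('Noisy')
subplot(2,2,2), imshow(M1), title('3x3 mean')
subplot(2,2,3), imshow(M2), title('3x3 weighted mean')
subplot(2,2,4), imshow(M3), title('3x3 median')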
Sharpening Spatial Filter: It is also known as a derivative filter. The purpose of the sharpening spatial filter is just the opposite of that of the smoothing spatial filter: its main focus is on removing blurring and highlighting edges. It is based on first- and second-order derivatives.
First order derivative:
• Must be zero in flat segments.
• Must be non-zero at the onset of a grey level step.
• Must be non-zero along ramps.
First order derivative in 1-D is given by:

f' = f(x+1) - f(x)


Second order derivative:
• Must be zero in flat areas.
• Must be non-zero at the onset and end of a grey level step or ramp.
• Must be zero along ramps of constant slope.
Second order derivative in 1-D is given by:

f'' = f(x+1) + f(x-1) - 2f(x)
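A minimal sharpening sketch using the discrete Laplacian (a second-derivative mask), assuming the Image Processing Toolbox and its sample image moon.tif:

I = im2double(imread('moon.tif'));          % grayscale sample image bundled with the toolbox

lap   = [0 1 0; 1 -4 1; 0 1 0];             % discrete Laplacian mask
Lresp = imfilter(I, lap, 'replicate');      % second-derivative response
S     = I - Lresp;                          % subtract: the mask's center coefficient is negative

figure
subplot(1,3,1), imshow(I),         title('Original')
subplot(1,3,2), imshow(Lresp, []), title('Laplacian')
subplot(1,3,3), imshow(S),         title('Sharpened')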
