
lOMoARcPSD|3696570

ImageProcessing

Computer Science (Indian Institute of Technology Madras)

StuDocu is not sponsored or endorsed by any college or university


Downloaded by King KGP (kingofallkgp@gmail.com)

CHAPTER 4

ANALYSIS OF GRAPH THEORETICAL IMAGE SEGMENTATION METHODS

4.1 Introduction

In graph based methods, an image is represented by an undirected weighted graph whose nodes are pixels or pixel regions. The methods differ in the graph algorithms they use, such as min-cut/max-flow, Dijkstra's and Prim's algorithms.

Graph based approaches to segmentation generally depend on local properties of the graph without considering global impressions of the image, which ultimately limits segmentation quality. In each method, an image is represented as a weighted undirected graph $G = (V, E)$, where $V$ is the set of nodes (the pixels) and the edge set $E$ contains the edges formed by joining every pair of nodes [98]. The weight $w(i, j)$ of each edge $(i, j)$ is a function of the similarity between nodes $i$ and $j$. Segmentation partitions the set of nodes $V$ into disjoint sets $V_1, V_2, V_3, \ldots$ such that the nodes in each $V_i$ have strong affinities between them.

Fig. 4.1: Graphical representation of a 4 x 4 pixel image: (a) partitions A and B of equal size, (b) partitions A and B of unequal size.


Partitioning to achieve segmentation poses several challenges, such as defining precise criteria for a good partition and computing it efficiently. Graph based methods for image segmentation have been broadly studied within the field of image processing. In these methods, segmentation problems are converted by analogy into graphs and solved as graph partitioning problems. These graph based methods can be classified as graph cut based, minimal spanning tree based, and shortest path based methods.

4.2 Graph Based Methods


Several image segmentation methods have been proposed over the last several decades. An accurate formulation of the image segmentation problem and a computationally efficient implementation are both crucial. This section covers the reported formulations and implementation strategies for each class of graph based methods. What distinguishes these methods is the way in which they define the desired quality of a segmentation and how they achieve it by means of different graph properties.

4.2.1 Normalized Cut Methods

Any graph $G = (V, E)$ can be partitioned into two disjoint sets A and B provided that $|V|$ is greater than 1. The degree of dissimilarity between the sets A and B is the sum of the weights of the edges between nodes in A and nodes in B, called the cut value.

$$\mathrm{cut}(A, B) = \sum_{u \in A,\, v \in B} w(u, v) \tag{4.1}$$

The objective of partitioning is to minimize the cut value. The minimum cut of a graph could be found by considering every possible partition, but that exhaustive search is a very complex problem. Minimizing the cut value is nevertheless a well studied problem, and a variety of efficient algorithms exist for solving it. Fig. 4.2 shows image pixels with a four-neighborhood system and the pixels corresponding to cuts, where thick lines represent discontinuities between neighboring pixels.

Fig. 4.2: (a) Image pixels with a four-neighborhood system, (b) pixels corresponding to cuts, where thick lines represent discontinuities between neighboring pixels.

Wu et al. [3] proposed a clustering approach based on the minimum cut criterion. However, this criterion favors partitions that cut off small sets of vertices, and so yields bad segmentation. From equation (4.1) we can observe that the cut value grows with the number of edges crossing between the two partitions. Equally sized partitions are connected by more edges than unequally sized ones. In Fig. 4.1(a), A and B have 8 vertices each with 64 edge crossings, while in Fig. 4.1(b) there are 15 vertices in A and only one vertex in B, with 15 crossings. To overcome this imbalance in partitioning, Shi et al. proposed a new measure of disassociation, the normalized cut (Ncut) [4].
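The crossing counts quoted above can be checked with a short sketch. It assumes, as the text does, that all 16 pixels of the 4 x 4 image are connected to each other, and it further assumes unit edge weights so that the cut value equals the number of crossings. The names `cut_value` and `weights` are illustrative only.

```python
def cut_value(weights, part_a, part_b):
    """Sum of the weights of all edges running between part_a and part_b, as in (4.1)."""
    return sum(weights[u][v] for u in part_a for v in part_b)


# Fully connected graph on 16 pixels with unit weights (assumption for illustration).
n = 16
weights = [[0 if u == v else 1 for v in range(n)] for u in range(n)]

equal_cut = cut_value(weights, range(0, 8), range(8, 16))      # 8 + 8 split
unequal_cut = cut_value(weights, range(0, 15), range(15, 16))  # 15 + 1 split
print(equal_cut, unequal_cut)
```

Minimizing the plain cut value would therefore prefer the unbalanced 15 + 1 split, which is the bias the normalized cut is meant to remove.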

For a graph partition $V = P \cup Q$, the normalized cut cost is

$$\mathrm{Ncut}(P, Q) = \frac{\mathrm{cut}(P, Q)}{\mathrm{assoc}(P, V)} + \frac{\mathrm{cut}(P, Q)}{\mathrm{assoc}(Q, V)} \tag{4.2}$$

where $\mathrm{cut}(P, Q)$ is the sum of the weights of the edges removed to split the graph, and $\mathrm{assoc}(P, V)$ and $\mathrm{assoc}(Q, V)$ are the sums of the weights of the edges from the nodes of P and Q, respectively, to all nodes in the original graph G. Under this measure, a cut that separates a small set of isolated points no longer has a small Ncut value, because its cut value is a high percentage of the entire connection between that small set and all other nodes.

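Similarly, the total normalized association within the groups for a given partition is

$$\mathrm{Nassoc}(P, Q) = \frac{\mathrm{assoc}(P, P)}{\mathrm{assoc}(P, V)} + \frac{\mathrm{assoc}(Q, Q)}{\mathrm{assoc}(Q, V)} \tag{4.3}$$

where $\mathrm{assoc}(P, P)$ and $\mathrm{assoc}(Q, Q)$ are the sums of the weights of the edges connecting nodes within P and Q respectively. This measure determines how strongly the nodes within a group are connected to each other.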

These unbiased measures of association and disassociation of a partition are related as

$$\mathrm{Ncut}(P, Q) = 2 - \mathrm{Nassoc}(P, Q) \tag{4.4}$$

Hence the partitioning criteria of minimizing the disassociation between the groups and maximizing the association within the groups are equivalent and can be satisfied simultaneously.
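The relation between the two measures can be verified numerically. The sketch below assumes a symmetric weight matrix and sums weights over ordered node pairs, so that assoc(P, V) = assoc(P, P) + cut(P, Q). The small matrix `W` and the function names are made up for illustration.

```python
def assoc(W, group_a, group_b):
    """Total weight of connections from nodes in group_a to nodes in group_b."""
    return sum(W[u][v] for u in group_a for v in group_b)


def ncut(W, P, Q):
    """Normalized cut of the partition (P, Q)."""
    V = list(P) + list(Q)
    cut = assoc(W, P, Q)
    return cut / assoc(W, P, V) + cut / assoc(W, Q, V)


def nassoc(W, P, Q):
    """Normalized association within the groups P and Q."""
    V = list(P) + list(Q)
    return assoc(W, P, P) / assoc(W, P, V) + assoc(W, Q, Q) / assoc(W, Q, V)


# Hypothetical 5-node graph: nodes 0-2 are strongly connected, as are nodes 3-4.
W = [
    [0.0, 3.0, 2.0, 0.1, 0.0],
    [3.0, 0.0, 1.0, 0.0, 0.0],
    [2.0, 1.0, 0.0, 0.2, 0.0],
    [0.1, 0.0, 0.2, 0.0, 4.0],
    [0.0, 0.0, 0.0, 4.0, 0.0],
]
P, Q = [0, 1, 2], [3, 4]
print(ncut(W, P, Q), 2 - nassoc(W, P, Q))  # the two printed values agree
```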

Optimization of Ncut:

Let G be a graph whose vertex set V is divided between two sets P and Q. The minimum Ncut for a graph with N nodes is then calculated as below:

i. Let $d(i) = \sum_j w(i, j)$ be the total weight of all the edges connecting node i to all other nodes j.

ii. Let $D = \mathrm{diag}(d_1, d_2, \ldots, d_N)$ be the $N \times N$ diagonal matrix of degrees and let $W = [w_{ij}]$ be the $N \times N$ affinity matrix. Then the minimum Ncut between P and Q is given by the relation

$$\min \mathrm{Ncut}(P, Q) = \min_y \frac{y^T (D - W)\, y}{y^T D\, y} \tag{4.5}$$

where y is taken orthogonal to the smallest eigenvectors of the system. The ratio on the right-hand side is called the Rayleigh quotient [100].

iii. If y is relaxed to take real values, the Rayleigh quotient is minimized by solving the generalized eigenvalue problem

$$(D - W)\, y = \lambda D y \tag{4.6}$$

The eigenvector corresponding to the second smallest eigenvalue gives the solution to the normalized cut problem.

Recursive Two Way Cut:

The graph nodes are partitioned into two subsets by thresholding an eigenvector. The cut is then applied recursively to the two parts and stops when a previously specified Ncut value is reached. For a given weighted graph G, summarize the information in the affinity matrix W and the degree matrix D. Solve $(D - W)\, y = \lambda D y$ for the eigenvectors with the smallest eigenvalues. Minimize Ncut using the eigenvector corresponding to the second smallest eigenvalue, and bipartition the graph by determining the point of division. The number of partitions produced by this approach is controlled directly by the highest acceptable Ncut. This technique is known as the recursive two way cut [99].
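A single bipartition step can be sketched in plain Python. This is only an illustration under stated assumptions: instead of a proper sparse eigen-solver it uses power iteration with deflation on the normalized affinity matrix $D^{-1/2} W D^{-1/2}$, whose second largest eigenvector corresponds to the second smallest generalized eigenvector of $(D - W)y = \lambda D y$, and it splits at zero rather than searching for the best splitting point. The function name `ncut_bipartition` and the test graph are hypothetical.

```python
import math


def ncut_bipartition(W, iterations=500):
    """Split the nodes of a connected graph using the second smallest generalized eigenvector."""
    n = len(W)
    d = [sum(row) for row in W]                       # degrees d(i)
    inv_sqrt_d = [1.0 / math.sqrt(x) for x in d]
    # M = D^{-1/2} W D^{-1/2}; its top eigenvector is proportional to D^{1/2} 1.
    M = [[inv_sqrt_d[i] * W[i][j] * inv_sqrt_d[j] for j in range(n)] for i in range(n)]
    z0 = [math.sqrt(x) for x in d]
    norm0 = math.sqrt(sum(x * x for x in z0))
    z0 = [x / norm0 for x in z0]

    def deflate_and_normalize(v):
        # Remove the component along the trivial eigenvector, then rescale to unit length.
        dot = sum(a * b for a, b in zip(v, z0))
        v = [a - dot * b for a, b in zip(v, z0)]
        norm = math.sqrt(sum(a * a for a in v))
        return [a / norm for a in v]

    z = deflate_and_normalize([float(i + 1) for i in range(n)])
    for _ in range(iterations):
        # Multiply by (M + I) so that every eigenvalue is non-negative.
        z = [z[i] + sum(M[i][j] * z[j] for j in range(n)) for i in range(n)]
        z = deflate_and_normalize(z)

    y = [z[i] * inv_sqrt_d[i] for i in range(n)]      # back to the generalized eigenvector
    part_a = [i for i in range(n) if y[i] >= 0]
    part_b = [i for i in range(n) if y[i] < 0]
    return part_a, part_b


# Two triangles joined by one weak edge (2-3).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
part_a, part_b = ncut_bipartition(W)
print(sorted(part_a), sorted(part_b))
```

Applying the same function again to the sub-matrix of each part, and stopping once the Ncut of a proposed split exceeds the accepted limit, would give the recursive scheme described above.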

Illustration of Recursive Two Way Cut:

- Construct a weighted graph $G = (V, E)$ for the given image by considering each pixel as a node and connecting each pair of pixels by an edge. The weight of an edge is the similarity between the pair of pixels. Define the weight $w_{ij}$ of the edge connecting two nodes i and j using the brightness values of the pixels and their spatial locations as

$$w_{ij} = \begin{cases} e^{-\|F(i) - F(j)\|^2 / \sigma_I^2} \ast e^{-\|X(i) - X(j)\|^2 / \sigma_X^2}, & \|X(i) - X(j)\| < r \\ 0, & \text{otherwise} \end{cases} \tag{4.7}$$

  where $X(i)$ is the spatial location of node i, $F(i)$ is a feature vector based on the intensity, color or texture of node i, $\sigma_I$ and $\sigma_X$ are the feature and spatial tuning parameters respectively, and $w_{ij}$ is an entry in the affinity matrix W.

- Solve for the eigenvectors with the smallest eigenvalues of the system

$$(D - W)\, y = \lambda D y \tag{4.8}$$

  Convert this generalized eigensystem to the following standard eigenvalue problem:

$$D^{-1/2} (D - W) D^{-1/2}\, z = \lambda z, \qquad z = D^{1/2} y \tag{4.9}$$

  Solving this system for all eigenvectors needs $O(p^3)$ operations for p nodes in the graph, and such a large number of operations is impractical for segmentation applications. However, the graphs to be partitioned are only locally connected, so the resulting eigensystems are sparse; only the few top eigenvectors are required for partitioning; and the precision requirement for the eigenvectors is low. Together these properties reduce the computation to $O(p^{3/2})$.

- After computing the eigenvectors, split the graph into two parts using the second smallest eigenvector.

- Repeat the algorithm on each of the two parts or, alternatively, use the top eigenvectors together to subdivide the graph. The recursion stops when the Ncut value exceeds a given limit.

These steps of the recursive two way cut are applied to a sample image, shown in Fig. 4.3(a). The segmented image obtained using the second smallest through ninth eigenvectors is shown in Fig. 4.3(b).


Fig. 4.3 (a): Original image Fig. 4.3 (b): Segmented image
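The first step of the illustration, building the affinity matrix, can be sketched as follows. The sketch assumes scalar intensities as features and the Gaussian form of the weight with squared scale parameters; `affinity_matrix` and the sample row of four pixels are illustrative names and data, not part of the original work.

```python
import math


def affinity_matrix(features, positions, sigma_i, sigma_x, r):
    """Affinity w_ij from feature similarity and spatial proximity, zero beyond radius r."""
    n = len(features)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            dist = math.hypot(dx, dy)
            if dist < r:
                feature_term = math.exp(-((features[i] - features[j]) ** 2) / sigma_i ** 2)
                spatial_term = math.exp(-(dist ** 2) / sigma_x ** 2)
                W[i][j] = feature_term * spatial_term
    return W


# One row of four pixels: two dark pixels followed by two bright pixels.
features = [0.1, 0.1, 0.9, 0.9]
positions = [(0, 0), (0, 1), (0, 2), (0, 3)]
W = affinity_matrix(features, positions, sigma_i=0.5, sigma_x=2.0, r=2)
print(W[0][1], W[1][2], W[0][2])
```

Neighboring pixels with equal intensity get a much larger weight than neighbors across the intensity step, and pixels at distance r or more are not connected at all.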

Shi et al. [8] discussed multi-class partitioning in combination with an iterative process of two-way partitioning until an acceptable result is achieved. This technique is computationally expensive, and Ncut tends to generate regions of the same size, which rarely occurs in natural images.

Important segmentation approaches for distinct graph types using the normalized cut are discussed below.

4.2.1.1 Pixel Affinity Graph Method

In this method, each pixel is considered as a vertex and an edge is obtained by connecting two pixels within distance r, as shown in Fig. 4.4. The weight of an edge reflects the similarity between the connected pixels. The overall quality of the segmentation depends on the grouping cues used in the pixel similarity, such as intensity, position and contours [101].


Fig. 4.4 Segmentation by pixel affinity graph

The measure of similarity for the intensity and position grouping cue is given by

$$W_I(l, m) = \begin{cases} e^{-\|X_l - X_m\|^2 / \sigma_X - \|I_l - I_m\|^2 / \sigma_I}, & \|X_l - X_m\| < r \\ 0, & \text{otherwise} \end{cases} \tag{4.10}$$

where $\|X_l - X_m\|$ is the position difference and $\|I_l - I_m\|$ is the intensity difference between pixels l and m, r is the graph connection radius, and $\sigma_X$ and $\sigma_I$ are the corresponding scale parameters, which control the tradeoff between brightness similarity and spatial proximity. Used on its own, this grouping cue gives bad segmentation because of the texture clutter in natural images. Hence another grouping cue, related to intervening contours, is given by

$$W_C(l, m) = \begin{cases} e^{-\max_{x \in line(l, m)} \|Edge(x)\|^2 / \sigma_C}, & \|X_l - X_m\| < r \\ 0, & \text{otherwise} \end{cases} \tag{4.11}$$

where $line(l, m)$ is the straight line connecting pixels l and m, and $Edge(x)$ indicates the edge strength at location x. Pixels have high affinity if the straight line between them does not cross an image edge. These two grouping cues can be combined as

$$W(l, m) = \sqrt{W_I(l, m) \cdot W_C(l, m)} + \alpha\, W_C(l, m) \tag{4.12}$$

where $\alpha$ is a constant. For a larger radius r, objects with weak contours can be detected more easily; however, the graph affinity matrix becomes denser. Segmentation quality is better for a bigger graph radius, but computation is much slower. Long range graph links help propagate local grouping cues across larger image regions, so objects with weak boundaries can be identified more easily in a cluttered background [102].
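The effect of the intervening contour cue can be seen on a single row of pixels. The sketch below is an assumption-laden illustration: it treats pixel indices as positions, takes the maximum edge strength over the pixels lying between l and m, and combines the two cues as sqrt(W_I * W_C) + alpha * W_C, the form commonly used for this method. All function names and parameter values are hypothetical.

```python
import math


def intensity_affinity(intensity, l, m, sigma_x, sigma_i, r):
    """Cue based on spatial proximity and intensity similarity; zero beyond radius r."""
    dist = abs(l - m)
    if dist >= r:
        return 0.0
    return math.exp(-dist ** 2 / sigma_x - (intensity[l] - intensity[m]) ** 2 / sigma_i)


def contour_affinity(edge, l, m, sigma_c, r):
    """Cue based on the strongest edge crossed by the straight line from l to m."""
    if abs(l - m) >= r:
        return 0.0
    lo, hi = min(l, m), max(l, m)
    strongest = max(edge[lo:hi + 1])
    return math.exp(-strongest ** 2 / sigma_c)


def combined_affinity(intensity, edge, l, m, sigma_x, sigma_i, sigma_c, r, alpha):
    """Combination of both cues (assumed form: sqrt(W_I * W_C) + alpha * W_C)."""
    w_i = intensity_affinity(intensity, l, m, sigma_x, sigma_i, r)
    w_c = contour_affinity(edge, l, m, sigma_c, r)
    return math.sqrt(w_i * w_c) + alpha * w_c


# Uniform intensity, but a strong image edge detected at pixel 3.
intensity = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
edge = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
params = dict(sigma_x=4.0, sigma_i=0.1, sigma_c=0.1, r=3, alpha=1.0)

same_side = combined_affinity(intensity, edge, 0, 2, **params)
across_edge = combined_affinity(intensity, edge, 2, 4, **params)
too_far = combined_affinity(intensity, edge, 0, 3, **params)
print(same_side, across_edge, too_far)
```

Although pixels 2 and 4 have identical intensity, their affinity is low because the line between them crosses the edge at pixel 3.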

4.2.1.2 Multiscale Graph Decomposition

To collect sufficient grouping information, the affinity graph needs long range connections. These can be compressed on a multi-scale grid, which can produce precise object boundaries with constrained segmentation. To enhance the normalized cut, Sharon et al. [5] proposed an algebraic multi-grid technique in which effective graph coarsening is used to generate an irregular pyramid encoding region based grouping cues.

First, the finest grid $\Omega_0$ is defined and a series of grids $\Omega_1, \Omega_2, \ldots, \Omega_n$ is constructed such that $\Omega_n \subset \Omega_{n-1} \subset \cdots \subset \Omega_1 \subset \Omega_0$. The principle of the multi-grid method is to relax on the fine grid, project the error onto the coarser grid, and continue the relaxation and projection on ever coarser grids until the coarsest grid is reached, as shown in Fig. 4.5.

Fig. 4. Sequence of increasingly coarsened grids used in multi-grid (vertex centered)


Fig. 4.5 Relaxation and Projection in Multigrid Method


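Benezit et al. [103] proposed a multiscale decomposition which separates the graph links into different scales,

$$W = W_1 + W_2 + \cdots + W_S \tag{4.13}$$

where $W_s$ contains the affinities between pixels within a specific range of spatial separation: $W_s(i, j) \neq 0$ only if $G_{r,s-1} < G_{ij} \leq G_{r,s}$, where $G_{ij} = \|X_i - X_j\|$ is the distance between pixels i and j. The first scale is $W_1$, in which every pixel is a graph node and two pixels are connected if they are within distance r of each other. In the second scale $W_2$, the connected pixels are sampled $2r + 1$ apart, and in scale s the pixels are sampled $(2r + 1)^{s-1}$ apart. The representative pixels in each scale are denoted by $I_s$, and the affinity matrix with connections between the representative pixels in $I_s$ is denoted by $W_s^c$, called the compressed affinity matrix. For parallel segmentation across scales, form the partitioning matrix X and the multi-scale affinity matrix W as below,

$$X = \begin{bmatrix} X_1 \\ \vdots \\ X_S \end{bmatrix}, \qquad W = \begin{bmatrix} W_1^c & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & W_S^c \end{bmatrix} \tag{4.14}$$

The cross scale interpolation matrix $C_{s,s+1}$ between the nodes in layer $I_s$ and the nodes in the coarser layer $I_{s+1}$ is defined as

$$C_{s,s+1}(i, j) = \begin{cases} \dfrac{1}{|N_i|}, & j \in N_i \\ 0, & \text{otherwise} \end{cases} \tag{4.15}$$

where $N_i$ is the set of nodes in layer $I_s$ represented by node i of layer $I_{s+1}$. The cross scale segmentation constraint matrix M is written as

$$M = \begin{bmatrix} C_{1,2} & -\mathrm{Id}_2 & \cdots & 0 \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & C_{S-1,S} & -\mathrm{Id}_S \end{bmatrix} \tag{4.16}$$

where $\mathrm{Id}_s$ is the identity matrix on the nodes of layer s, and $MX = 0$ is the segmentation constraint.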
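The cross scale constraint can be illustrated on a one-dimensional signal. The sketch assumes that coarse nodes are sampled every 2r + 1 fine nodes and that the neighborhood of a coarse node consists of the fine nodes within distance r of it, so the constraint says that each coarse value equals the average of the fine values it represents. The helper `interpolation_matrix` and the sample labels are hypothetical.

```python
def interpolation_matrix(n_fine, r):
    """Rows: coarse nodes sampled every 2r+1 fine nodes; entries 1/|N_i| on the neighborhood."""
    step = 2 * r + 1
    centers = range(r, n_fine, step)
    C = []
    for c in centers:
        neighborhood = [j for j in range(n_fine) if abs(j - c) <= r]
        row = [1.0 / len(neighborhood) if j in neighborhood else 0.0 for j in range(n_fine)]
        C.append(row)
    return C


C = interpolation_matrix(n_fine=9, r=1)

# Fine-scale indicator of one segment, and the coarse indicator implied by C X_s - X_{s+1} = 0.
x_fine = [1, 1, 1, 1, 1, 1, 0, 0, 0]
x_coarse = [sum(c_ij * x for c_ij, x in zip(row, x_fine)) for row in C]
print(x_coarse)
```

A coarse node whose neighborhood straddles a fine-scale boundary would receive a fractional value, which is how the constraint ties the segmentations at neighboring scales together.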


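The constrained normalized cut criterion is given by:

$$\varepsilon(X) = \frac{1}{K} \sum_{l=1}^{K} \frac{X_l^T W X_l}{X_l^T D X_l} \tag{4.17}$$

which is maximized subject to $MX = 0$, where K is the number of segments, $X_l$ is the l-th column of X and D is the degree matrix of W.

This algorithm works concurrently across the graph scales, with an inter-scale constraint to guarantee communication and consistency between the segmentations at every level. This segmentation approach is fit for segmenting large images but is computationally not efficient.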

4.2.1.3 Watershed Regions Based Similarity Graph

Watershed transformation is a morphology based tool for image segmentation and can be classified as a region-based segmentation approach. The idea can be viewed as a landscape immersed in a lake: catchment basins fill up with water starting at each local minimum, and dams must be built where water coming from different catchment basins would meet, in order to avoid the merging of catchment basins. The watershed lines are defined by the dams that divide the catchment basins at the highest level the water can reach in the landscape. As a result, watershed lines separate the individual catchment basins in the landscape [104]. Among the various features that can be extracted from an image, the maxima and the minima are of primary importance. Because of the large number of regional minima in images, this technique suffers from over-segmentation.

To overcome this, Meyer [105] suggested the hierarchical watershed with the steps given below:

- Choose local minima as region seeds.

- Add their neighbors to a priority queue, sorted by value.

- Take the top priority pixel from the queue; if all its labeled neighbors have the same label, assign that label to the pixel.

- Add all non-marked neighbors of the pixel to the queue.

- Repeat the process until the queue is empty.
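The flooding steps above can be sketched with a priority queue. This is a minimal illustration, assuming a 4-neighborhood, seeds supplied by the caller rather than detected automatically, and label 0 for pixels left on a watershed line. The function name `meyer_watershed` is illustrative.

```python
import heapq


def meyer_watershed(image, seeds):
    """Flood a 2D gray-level image from labeled seeds; returns a label image (0 = watershed)."""
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    queued = set(seeds)          # pixels that are seeds or were already put in the queue
    heap = []
    counter = 0                  # tie-breaker: equal gray levels leave in insertion order

    def neighbors(r, c):
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                yield nr, nc

    def push_neighbors(r, c):
        nonlocal counter
        for nr, nc in neighbors(r, c):
            if (nr, nc) not in queued:
                queued.add((nr, nc))
                heapq.heappush(heap, (image[nr][nc], counter, nr, nc))
                counter += 1

    for (r, c), label in seeds.items():
        labels[r][c] = label
    for (r, c) in seeds:
        push_neighbors(r, c)

    while heap:
        _, _, r, c = heapq.heappop(heap)
        neighbor_labels = {labels[nr][nc] for nr, nc in neighbors(r, c)} - {0}
        if len(neighbor_labels) == 1:
            labels[r][c] = neighbor_labels.pop()
            push_neighbors(r, c)
        # A pixel touching two different basins keeps label 0 and acts as a dam.
    return labels


# A single ridge between two minima at the ends of a one-row image.
result = meyer_watershed([[0, 1, 2, 3, 2, 1, 0]], {(0, 0): 1, (0, 6): 2})
print(result)
```

The ridge pixel in the middle is reached from both basins at once and is therefore left unlabeled as a watershed line.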

The flooding process starts from a given threshold value that represents some relief feature, so some initial regions are flooded, which yields the desired number of partitions. The hierarchical watershed regions can be modeled using a graph: the flooded gradient image is represented by a connected weighted neighborhood graph in which each node is a catchment basin of the topographic surface. After conversion, one weighting function proposed in [105], based on mean density, can be used:

( , )= (4.18)

Here the weight depends on the densities of the two watershed regions joined by the edge. Another interesting approach, adopted by Monteiro et al. [6], combines edge and region based approaches with spectral techniques through watershed algorithms. A pre-processing step is used to reduce the spatial resolution without losing important image information. The rainfall watershed algorithm is applied to the image gradient magnitude to obtain an initial partitioning of the image into primitive regions. This initial partition is the input to a computationally efficient region segmentation process which produces the final segmentation. The latter process uses a region based similarity graph representation of the image regions. The segmentation produced by this approach is clear and simple. Most of the methods in this category are computationally expensive, since the underlying problem has been proved to be NP-complete, and they might not be suitable for many real time applications.


4.2.2 Minimal Spanning Tree based Methods:

The Minimal Spanning Tree (MST) is an important concept in graph theory. A spanning tree T of a graph G = (V, E) is a tree T = (V, E') where E' ⊆ E. A graph may have several spanning trees; the minimal spanning tree is the one with minimum total weight. In the MST of an image graph, nodes are pixels and edges represent the affinities between the nodes they connect. There are several algorithms to construct a minimal spanning tree. In Prim's algorithm, the MST is constructed by repeatedly adding the frontier edge with the smallest weight. This algorithm is greedy and runs in polynomial time. The minimal spanning tree for an image is constructed as shown in Fig. 4.6.

Fig 4.6: MST construction
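Prim's construction can be sketched with a binary heap holding the frontier edges. The adjacency-list format and the small example graph are assumptions made for illustration; in an image graph the nodes would be pixels and the weights would come from a dissimilarity measure.

```python
import heapq


def prim_mst(graph, start):
    """graph: {node: [(weight, neighbor), ...]} for an undirected graph. Returns MST edges."""
    visited = {start}
    frontier = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(frontier)
    tree = []
    while frontier and len(visited) < len(graph):
        w, u, v = heapq.heappop(frontier)      # frontier edge with the smallest weight
        if v in visited:
            continue                           # both ends already in the tree
        visited.add(v)
        tree.append((u, v, w))
        for w2, x in graph[v]:
            if x not in visited:
                heapq.heappush(frontier, (w2, v, x))
    return tree


graph = {
    "a": [(1, "b"), (3, "c")],
    "b": [(1, "a"), (2, "c"), (5, "d")],
    "c": [(3, "a"), (2, "b"), (4, "d")],
    "d": [(5, "b"), (4, "c")],
}
mst = prim_mst(graph, "a")
print(mst, sum(w for _, _, w in mst))
```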

MST based segmentation is related to graph based clustering, where the data is represented by an undirected adjacency graph. Clusters with stronger inherent affinities can be obtained by suitably removing the lowest weight edges. Most MST based approaches to segmentation emphasize the importance of Gestalt theory [7]. Earlier MST based methods perform image segmentation in an implicit way, based on the inherent relationship between the MST and the cluster structure. Morris et al. [8] used the MST to hierarchically partition images on the principle that the most similar pixels should be grouped together and dissimilar pixels should be separated. They also proposed a recursive MST algorithm which splits the MST built from an image into many sub-trees representing homogeneous regions, such that each sub-tree has a certain number of nodes and neighboring sub-trees have significantly dissimilar average gray levels. It yields low quality results for noisy images, since noise can produce a wrong MST configuration in which an object is contained in more than one sub-tree.

An advanced MST based algorithm is proposed in [106], using both the differences across sub-graphs and the differences inside a sub-graph. The internal difference of a segment S is the highest weight in the minimal spanning tree of the segment, given by the relation $Int(S) = \max_{e \in MST(S, E)} w(e)$. The difference between two segments is the minimum weight among the edges connecting them. Two segments can be merged if the difference between them is less than or equal to the smaller of their internal differences, each relaxed by a size dependent term. The formal definition of the merging criterion is

$$|e| \leq \min\left( Int(C_1) + \frac{K}{|C_1|},\; Int(C_2) + \frac{K}{|C_2|} \right) \tag{4.19}$$

where K is a constant, $|C_1|$ and $|C_2|$ are the sizes of components $C_1$ and $C_2$ respectively, $Int(C)$ is the largest weight in the MST of C, and $|e|$ is the weight of the smallest-weight edge connecting $C_1$ and $C_2$. From (4.19) we can see that the algorithm is sensitive to edges in smooth areas and less sensitive in areas with high variability. Felzenszwalb et al. [107] showed that the segmentation produced by this method is neither too coarse nor too fine. Since two segments are merged on the basis of a single low weight edge between them, the result can be considerably affected by noise if the image is not filtered properly beforehand. To improve performance, Fahad et al. [108] suggested some modifications to this graph theoretic image segmentation algorithm. Kruskal's algorithm is used to build the MST for segmentation, which reflects global properties of the image. The algorithm makes greedy decisions to produce the final segmentation by defining a predicate for measuring the evidence of a boundary between two regions. They modified the algorithm by reducing the number of edges that need to be sorted, which produces an over-segmented result, and suggested a statistical merge process which reduces the over-segmentation. The algorithm was evaluated by segmenting various video clips, and both performance and segmentation quality improved. Jia et al. [109] proposed multi-atlas-based multi-image segmentation, in which an image registration framework based on a combinative and incremental tree is used for better registration.
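The merging criterion can be sketched as a Kruskal-style pass over the edges with a union-find structure. This is an illustration only: it assumes edges are visited in increasing weight order, applies the test with "less than or equal", and uses a one-dimensional signal with absolute intensity differences as edge weights. The names `segment_mst`, `signal` and the value of K are hypothetical.

```python
def segment_mst(num_nodes, edges, k):
    """Merge components when the joining edge passes the size-relaxed internal-difference test."""
    parent = list(range(num_nodes))
    size = [1] * num_nodes
    internal = [0.0] * num_nodes      # Int(C): largest MST edge weight inside each component

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for w, a, b in sorted(edges):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        threshold = min(internal[ra] + k / size[ra], internal[rb] + k / size[rb])
        if w <= threshold:
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], w)
    return [find(x) for x in range(num_nodes)]


# Two flat regions separated by a large intensity step.
signal = [0, 0, 0, 10, 10, 10]
edges = [(abs(signal[i] - signal[i + 1]), i, i + 1) for i in range(len(signal) - 1)]
labels = segment_mst(len(signal), edges, k=1.0)
print(labels)
```

Because the threshold shrinks as components grow, the step of height 10 is not merged once each flat region has three pixels, whereas a larger K would tolerate larger differences.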

In practice it is difficult to acquire noise-free images because of complex imaging environments. Since MST based methods are very susceptible to noise, applying them to noisy images without preprocessing such as filtering may yield unacceptable segmentation.

4.2.3 Shortest Path based Methods:

Finding the shortest path between two nodes is a classical problem in graph theory. For connected weighted graphs, the shortest path between a pair of nodes is the path whose total edge weight is minimum. The most well known algorithm for finding shortest paths is Dijkstra's algorithm. Consider a directed graph $G = (V, E)$ with edge length $l(e) \geq 0$ for each edge e in E, and a vertex $u \in V$ called the source. To find the shortest path from u to each vertex $v \in V$, the steps are as below:

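- Set U = V, L(u) = 0 and $L(v) = \infty$ for $v \in V - \{u\}$.

- Choose $x \in U$ with $L(x) = \min\{L(v) / v \in U\}$ and set $X = U - \{x\}$.

- If $X = \emptyset$, then stop; for $v \in V$, L(v) is the shortest path length from u to v.

- Set U = X. For $v \in U$ the new label is

  $L(v) = \min\{L(v), \min\{L(x) + l(x, v) / (x, v) \in E,\ x \in V - U\}\}$.

- Repeat from the second step for the newly generated X.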


Dijkstra’s algorithm is illustrated in fig. 4.7

Fig 4.7: Shortest Path Computation by Dijkstra’s Algorithm
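The labeling procedure can be sketched with a priority queue, which selects the unsettled vertex with the smallest label at each step. The adjacency-list format and the example graph are assumptions made for illustration.

```python
import heapq


def dijkstra(graph, source):
    """graph: {node: [(neighbor, length), ...]} with non-negative lengths. Returns labels L(v)."""
    dist = {node: float("inf") for node in graph}
    dist[source] = 0
    heap = [(0, source)]
    settled = set()
    while heap:
        d, x = heapq.heappop(heap)         # unsettled vertex with the smallest label
        if x in settled:
            continue
        settled.add(x)
        for v, length in graph[x]:
            if d + length < dist[v]:       # relax edge (x, v)
                dist[v] = d + length
                heapq.heappush(heap, (dist[v], v))
    return dist


graph = {
    "u": [("a", 4), ("b", 1)],
    "b": [("a", 2), ("v", 6)],
    "a": [("v", 1)],
    "v": [],
}
distances = dijkstra(graph, "u")
print(distances)
```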


To find the shortest path between nodes u and v, grow Dijkstra's tree starting at node u; at each iteration, add the frontier edge whose non-tree end point is closest to u, and stop once v has been added. After each iteration, the node set of Dijkstra's tree is extended with a node to which the shortest path from u has been obtained.

In shortest path based segmentation methods, the problem of finding the best boundary segment is converted into finding the minimum cost path between two nodes. In the Live-wire method, an initial point is selected by the user and the subsequent point is selected in such a way that the shortest path between the initial point and the current position best fits the object of interest [9]. The boundary is represented by a sequence of oriented pixel edges, where each oriented edge has a single cost value measuring the quality of the boundary. The boundary wraps around the object at real time speed. In comparison with tedious manual tracing, Live-wire provides more accurate segmentation. However, selecting a proper initial seed near the desired boundary is a difficult task, so Live-wire is hard to apply to blurred images or weak boundaries. When segmenting high resolution images, Live-wire needs a large amount of computational resources to search for the shortest path over the whole graph. Live lane [10] overcomes this limitation by limiting the search space to a much smaller range of 5 to 100 pixels, which largely reduces the computation time. Falcao et al. [11] exploited some known properties of graphs to avoid unnecessary shortest path computation and proposed a fast algorithm called Live-wire-on-the-fly. The path search is sped up by reusing the results computed from the selected point for the previous position of the cursor. It has the advantage that there is no restriction on the shape or size of the boundary, and the boundary can be oriented so that it has a well defined inside and outside. Bai et al. [110] developed an image region based algorithm using geodesic distance. Since geodesics can be computed in linear time, the algorithm can be implemented very efficiently; however, it is strongly dependent on the seed locations and is more likely to leak through weak boundaries.

The increasing demand for 3D data motivated researchers to extend 2D shortest path techniques to 3D. A 3D version of Live-wire was proposed for medical image segmentation [111]. Other 3D extensions of the shortest path algorithm discussed in [112, 113] are not straightforward and remain fundamentally path based techniques; they do not guarantee that the shortest paths will lie on the minimal surface. To overcome this, Grady [114] adopted a mathematically elegant method to find minimal surfaces and then used them for segmentation of 3D data.

In comparison with MST based methods, which focus on the clustering property of a segment, shortest path methods can better describe the nature of object boundaries in an image. Live-wire gives the user more freedom to control the segmentation process. Shortest path methods might be more suitable than other graph based methods for extracting complex objects with relatively explicit boundaries. As a robust technique for interactive segmentation, the approach can be extended to 2D sequences or 3D data.

4.2.4 Other Methods:

There are many other graph based segmentation methods which do not belong to the above mentioned categories. In the pyramid based methods proposed in [115, 116], a graph is created from the original image, and from this graph a set of graphs defined at multiple levels of resolution is built, which looks like a pyramid. Using a reduction function, the vertices and edges at level L are reduced to form the vertices and edges at level L+1. The level of the pyramid responsible for yielding the segmentation is called the working level. These methods are categorized into regular pyramid and irregular pyramid based methods. For regular pyramids, the reduction factor, which is the ratio between the number of vertices at level L and at level L+1, is constant and fixed. Hence the size and layout of the pyramid structure are predictable. Ping et al. [117] used Gaussian filters with adjustable filter scales to build a pyramid. A pyramid linking approach is used for segmentation in [118], which depends on proper selection of the working level. To overcome this drawback, a modified pyramid linking approach is proposed by Zilan et al. [119]. Regular pyramid methods fail to segment elongated regions, and the structure of the pyramid also varies under small rotations, shifts and scalings of the input image. In contrast, the reduction factor is not constant for irregular pyramids, so their size and layout are not predictable. Montanvert et al. [120] proposed a hierarchy of region adjacency graphs which performs stochastic decimation to achieve segmentation. For the same input image, different outcomes of the random variable yield distinct segmentation results. To overcome the random variation in the decimation process, the random variable is replaced by an interest variable. The bounded irregular pyramid proposed in [121] combines features of regular and irregular pyramids, and it was shown that the irregular pyramid yields better results than the regular pyramid.

The random walker [122] is an interactive segmentation method on a weighted graph that assigns a label to each pixel of an image. Each edge of the graph is assigned the weight

$$w_{ij} = e^{-\beta (g_i - g_j)^2} \tag{4.20}$$

where $g_i$ is the image intensity at pixel i and $\beta$ is a constant. This weight determines the probability that a random walker will cross the edge joining $v_i$ and $v_j$. The label of a pixel is given by the seed point that a random walker starting at that pixel will reach first. Computing the random walker probabilities is the same as minimizing the Dirichlet function given below.

$$D[x] = \frac{1}{2} \sum_{e_{ij} \in E} w_{ij} (x_i - x_j)^2 \tag{4.21}$$

Minimizing $D[x]$ is the same as finding a harmonic function that satisfies the boundary conditions, with the seed point value assumed to be equal to one. Using this function, the seeds can be reached in the fewest steps, which avoids segmentation leakage and shrinking bias. In comparison with graph cuts, the random walker exhibits the greatest robustness to seed quantity but the least robustness to seed placement.
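The harmonic solution can be approximated on a one-dimensional chain of pixels. The sketch below assumes Gaussian edge weights of the form exp(-beta * (g_i - g_j)^2), fixes the seed values at 1 and 0, and uses simple Gauss-Seidel sweeps in which every unseeded pixel is replaced by the weighted average of its neighbors. A real implementation would solve the sparse linear system directly. The name `random_walker_1d` and all parameter values are hypothetical.

```python
import math


def random_walker_1d(intensities, seeds, beta=10.0, sweeps=500):
    """Probability, per pixel, of first reaching a seed with value 1 (seeds: {index: value})."""
    n = len(intensities)
    # Weight of the edge between pixel i and pixel i + 1.
    w = [math.exp(-beta * (intensities[i] - intensities[i + 1]) ** 2) for i in range(n - 1)]
    x = [seeds.get(i, 0.5) for i in range(n)]
    for _ in range(sweeps):
        for i in range(n):
            if i in seeds:
                continue                     # seed values are boundary conditions
            num = den = 0.0
            if i > 0:
                num += w[i - 1] * x[i - 1]
                den += w[i - 1]
            if i < n - 1:
                num += w[i] * x[i + 1]
                den += w[i]
            x[i] = num / den                 # harmonic: weighted mean of the neighbors
    return x


# Dark pixels on the left, bright pixels on the right, one seed at each end.
probabilities = random_walker_1d([0, 0, 0, 1, 1, 1], {0: 1.0, 5: 0.0})
labels = [1 if p >= 0.5 else 2 for p in probabilities]
print(labels)
```

The weak edge across the intensity step acts as a barrier, so the probabilities change sharply there and the label boundary follows the step.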

The image segmentation method based on dominant sets proposed by Pavan et al. [123] is a generalization of the maximal clique to the context of weighted graphs. The dominant set clustering method has good classification performance in intensity, color and texture image segmentation. It is competitive with the normalized cut method in both clustering quality and computational cost.

4.3 Evaluation and Analysis of Segmentation Methods


4.3.1 Performance evaluation of graph based segmentation methods using
BSDS
For performance evaluation of the methods discussed above, we used the Berkeley Segmentation Dataset and Benchmark [124], which eases comparison of manual and machine based image segmentation. To compare the results to ground truth boundaries, the boundary maps are thresholded at multiple levels; each level yields two values, precision (P) and recall (R) [125], which are the metrics used in the benchmark. Precision is the probability that a machine generated boundary pixel is a true boundary pixel, whereas recall is the probability that a true boundary pixel marked by a human is detected by the machine. Precision and recall are summarized by their harmonic mean, the F-measure:

$$F = \frac{2\, P \cdot R}{P + R} \tag{4.22}$$

The experimental performance of five graph based segmentation methods, viz. pixel affinity, multiscale decomposition, watershed regions, minimal spanning tree and shortest path, is discussed in terms of three characteristics: precision, recall and F-measure. Fig. 4.8, 4.9, 4.10 and 4.11 show precision and recall values for some images from the Berkeley Segmentation Dataset and Benchmark.
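The three measures can be sketched for a single threshold level. For simplicity the sketch assumes that boundaries are given as sets of pixel coordinates and that a machine pixel counts as correct only if it coincides exactly with a human-marked pixel; the benchmark itself tolerates small localization errors. The function names and the sample pixel sets are illustrative.

```python
def precision_recall(machine, human):
    """Precision and recall of machine boundary pixels against human boundary pixels."""
    true_positives = len(machine & human)
    precision = true_positives / len(machine) if machine else 0.0
    recall = true_positives / len(human) if human else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


machine = {(0, 0), (0, 1), (0, 2), (1, 2)}
human = {(0, 1), (0, 2), (0, 3)}
p, r = precision_recall(machine, human)
print(p, r, f_measure(p, r))
```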


Fig. 4.8: Precision Recall for Image No. 208001

Fig. 4.9: Precision Recall for Image No. 3096


Fig. 4.10: Precision Recall for Image No. 196027

Fig. 4.11: Precision Recall for Image No. 78004


Table 4.1 shows the corresponding F-measure values for these images for all five methods.

                              F-Measure
Image     Pixel      Multiscale      Watershed   Minimal         Shortest
          Affinity   Decomposition   Regions     Spanning Tree   Path
208001    0.5261     0.1686          0.2487      0.6412          0.7874
3096      0.7468     0.8782          0.5123      0.7416          0.2076
196027    0.0815     0.7014          0.5098      0.6832          0.4454
78004     0.2252     0.8016          0.3050      0.2885          0.5094

Table 4.1: F-measure of selected images for various segmentation methods

4.3.2 Enhancement in Ncut Methods

To improve the performance of Ncut, we proposed modifications in tuning

parameter and carried out performance the analysis.

The Ncut algorithm first reads an image and constructs an intensity matrix
whose entries are the feature values, or intensity values, of the
corresponding pixels. The graph function then computes the affinity matrix of
the image with the parameters set to the default values $\sigma_I = 0.1$,
$\sigma_X = 0.3$ and $r = 10$. The tuning parameter $\sigma_I$ controls the
magnitude of the feature intensity difference involved in computing $w_{ij}$.
From (4.7), it can be observed that for smaller values of $\sigma_I$ the
weight $w_{ij}$ is smaller, resulting in closely grouped pixels and a more
local segmentation, and vice versa. The tuning parameter $\sigma_X$ controls
the degree to which the spatial feature is involved in computing $w_{ij}$.
However, because $\sigma_I$ and $\sigma_X$ are fixed in the two-way recursive
cut method, the quality of segmentation is compromised in many cases. As a
result, the method achieves a global segmentation that is not sensitive to
local variations in the image. To achieve improved performance, we correlated
the feature values around pixels i and j by modeling $\sigma_I$ as

$\sigma_I = [\operatorname{std}(F(i), r) \cdot \operatorname{std}(F(j), r)]$        (4.23)

where $\operatorname{std}(F(i), r)$ and $\operatorname{std}(F(j), r)$ are the
standard deviations of the neighborhood features around pixel i and pixel j,
respectively, within radius r. The $\sigma_I$ defined in (4.23) captures the
correlation of neighboring features between pixel i and pixel j when the edge
weights are determined. For a fixed radius, smaller values of
$\operatorname{std}(F(i), r)$ mean smaller local variations of the features
around pixel i, and likewise smaller values of $\operatorname{std}(F(j), r)$
mean smaller local variations around pixel j. When the combined local
variation around pixels i and j is low, the product
$\operatorname{std}(F(i), r) \cdot \operatorname{std}(F(j), r)$ is also
smaller, and hence $w_{ij}$ is improved. This meets the aim of strong weight
connections between identical neighboring pixels in the affinity matrix W,
resulting in better segmentation quality with linear complexity.
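
The idea behind (4.23) can be illustrated with a short sketch. It is not the
implementation used in this work: it assumes the common Shi-Malik form of the
edge weight (a Gaussian in intensity difference multiplied by a Gaussian in
spatial distance, zero beyond radius r), approximates the radius-r
neighborhood by a square window, and adds a small constant `eps` to avoid
division by zero in flat regions. The names `local_std` and `adaptive_weight`
and the default `sigma_x=4.0` are hypothetical.

```python
import numpy as np


def local_std(image, r):
    """Standard deviation of intensities in the (2r+1) x (2r+1) window around each pixel."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, r, mode="edge")  # replicate border pixels
    height, width = image.shape
    size = 2 * r + 1
    out = np.empty((height, width))
    for y in range(height):
        for x in range(width):
            out[y, x] = padded[y:y + size, x:x + size].std()
    return out


def adaptive_weight(image, std_map, p, q, sigma_x=4.0, r=10, eps=1e-6):
    """Edge weight between pixels p and q, given as (row, col) tuples.

    The intensity scale is the product of the local standard deviations
    around p and q, in the spirit of (4.23); eps is an added assumption.
    """
    dist = float(np.hypot(p[0] - q[0], p[1] - q[1]))
    if dist > r:
        return 0.0  # pixels farther apart than r are not connected
    sigma_i = std_map[p] * std_map[q] + eps
    diff = float(image[p]) - float(image[q])
    intensity_term = np.exp(-diff ** 2 / sigma_i ** 2)
    spatial_term = np.exp(-dist ** 2 / sigma_x ** 2)
    return float(intensity_term * spatial_term)


# Toy image: dark left half, bright right half.
img = np.zeros((6, 6))
img[:, 3:] = 1.0
stds = local_std(img, r=1)
print(adaptive_weight(img, stds, (2, 0), (2, 1)))  # two pixels inside the flat region
print(adaptive_weight(img, stds, (2, 2), (2, 3)))  # two pixels across the intensity step
```

In this toy case the weight inside the flat region stays close to 1, while the
weight across the step is close to 0, which is the behavior the adaptive
intensity scale is meant to produce.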

By using (4.23), we varied the tuning parameter over the values {1.0, 0.5,
0.1, 0.05, 0.01, 0.005} and observed the segmentation results shown in
Fig. 4.12 (a) - (f). They illustrate that as the parameter decreases, the
segmentation becomes more detailed. The algorithm is sensitive to this value,
and different values can give sound segmentations in different parts of the
image.

(a) 1.0    (b) 0.5    (c) 0.1    (d) 0.05    (e) 0.01    (f) 0.005

Fig. 4.12 (a) – (f): Segmentation results for different values of the tuning parameter

Similarly, for the pixel affinity graph, by using (10), we considered a range
of $\sigma_I$ between 1.0 and 0.001 and of r between 1 and 10. The best
segmentation was obtained for $\sigma_I = 0.1$ and r = 10. For multiscale
graph decomposition, (11) was solved for the same range of $\sigma_I$, whereas
$\sigma_X$ was varied from 0.1 to 0.005 with $\alpha = 1$. The best
segmentation was achieved at $\sigma_I = 0.09$ and $\sigma_X = 0.005$. The
watershed region affinity matrix is generated by using, as the input graph, a
connected weighted graph of the many regions obtained from a hierarchical
watershed. For the analysis, we used the Berkeley Segmentation Dataset and
Benchmark to ease the comparison of manual and machine based image
segmentation. Precision, Recall, and F-measure as well as time complexity were
calculated for each segmented image for all the methods discussed above.
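
To make the two scores concrete, the following sketch computes precision and
recall for a machine boundary map against a single human-marked boundary map.
It is a simplified illustration, not the Berkeley benchmark procedure: the
benchmark matches boundary pixels within a small distance tolerance, whereas
this sketch assumes exact pixel overlap. The function name and the toy arrays
are hypothetical.

```python
import numpy as np


def precision_recall(machine, human):
    """Precision and recall of a binary boundary map against a reference map.

    Assumption: a machine boundary pixel counts as correct only if the same
    pixel is marked in the human map (no distance tolerance).
    """
    machine = np.asarray(machine, dtype=bool)
    human = np.asarray(human, dtype=bool)
    true_pos = np.logical_and(machine, human).sum()
    n_machine = machine.sum()
    n_human = human.sum()
    precision = true_pos / n_machine if n_machine else 0.0
    recall = true_pos / n_human if n_human else 0.0
    return float(precision), float(recall)


# Toy 2 x 3 boundary maps (1 marks a boundary pixel).
machine_map = np.array([[1, 1, 0],
                        [0, 0, 0]])
human_map = np.array([[1, 0, 0],
                      [1, 0, 0]])
print(precision_recall(machine_map, human_map))  # (0.5, 0.5)
```

Precision is the fraction of machine boundary pixels that are correct, and
recall is the fraction of human boundary pixels that were found.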

To determine the overall performance of the algorithm, Berkeley's benchmark
combines the individual scores from all local segmentations of each image into
a single final score. The results in Fig. 4.13 show the final scores obtained
by using our approach for the Ncut based segmentation methods. They show that
multiscale graph decomposition performs better than the other methods. The
performance of multiscale graph decomposition is even better than that of the
combined hierarchical multiscale graph decomposition demonstrated by [126].
For the other Ncut based methods, our approach also achieves fairly good
performance for most of the images considered.

Fig. 4.13: F-Measure for various images (208001, 3096, 196073, 78004, 54082)

The time complexity is an important parameter in Ncut image segmentation
methods. The time complexity computed for different images with the methods
discussed above is shown in Fig. 4.14.

Fig. 4.14: Time Complexity for various images (208001, 3096, 196073, 78004, 54082)

It shows that the multiscale and watershed segmentation methods consume less
computational power, that their performance is almost the same for all the
images considered, and that it is better than that reported by [126]. It also
indicates that the time complexity of the pixel affinity and recursive two-way
cut methods is sensitive to the image.

4.4 Discussion and Conclusion


The latest graph based image segmentation methods and their variations, such
as pixel affinity, multiscale decomposition, watershed regions, minimal
spanning tree and shortest path based methods, were studied analytically. Such
study and evaluation is crucial for improving the performance of existing
segmentation algorithms and for developing new, powerful segmentation
algorithms. Their performance can be enhanced by the use of a hybrid approach
and correct optimization.

Graph based methods generally perform segmentation on the basis of local
properties of the image. For segmenting images in applications where detailed
extraction of features is necessary, the global impression must be considered
along with the local properties. We have proposed an enhanced technique that
considers both local and global features during normalized cut based
segmentation to meet the requirement of precise segmentation. This was
achieved by correlating the feature values around neighboring pixels when
determining the weights of the edges of the graph. This creates strong weight
connections between identical neighboring pixels in the affinity matrix,
resulting in better segmentation quality with linear complexity. The results
show that the final score of multiscale graph decomposition is superior to the
scores obtained for the other methods and even better than that of the
combined hierarchical multiscale graph decomposition. The technique also has
lower computational time complexity.
