M.Sc. THESIS
TESFAHUN TIGISTU
NOVEMBER 2013
HARAMAYA UNIVERSITY
SCHOOL OF GRADUATE STUDIES
HARAMAYA UNIVERSITY
As thesis research advisors, we hereby certify that we have read and evaluated this thesis, prepared under our guidance by Tesfahun Tigistu Lejibo, entitled:
As members of the Examination Board of the Final M.Sc. Thesis Open Defense, we certify that we have read and evaluated the thesis prepared by Tesfahun Tigistu Lejibo and examined the candidate. We recommend that the Thesis be accepted as fulfilling the Thesis requirement for the degree of Master of Science in Physics.
DEDICATION
I dedicate this thesis to my father, Tigistu Lejibo, and my mother, Alemitu Daritado, for their remarkable sacrifices toward the success of my life.
STATEMENT OF THE AUTHOR
First, I declare that this thesis is my own work and that all sources of the materials used for this thesis have been duly acknowledged. This thesis has been submitted in partial fulfillment of the requirements for an M.Sc. degree at Haramaya University and is deposited at the University Library to be made available to borrowers under the rules of the Library. I solemnly declare that this thesis has not been submitted to any other institution anywhere for the award of any academic degree or diploma.
Brief quotations from this thesis are allowable without special permission, provided that accurate acknowledgement of the source is made. Requests for permission for extended quotation from, or reproduction of, this manuscript in whole or in part may be granted by the Head of the Department of Physics or the Dean of the School of Graduate Studies when, in his or her judgment, the proposed use of the material is in the interests of scholarship. In all other instances, however, permission must be obtained from the author.
Signature: _________________
BIOGRAPHICAL SKETCH OF THE AUTHOR
The author, Mr. Tesfahun Tigistu, was born to his father Tigistu Lejibo and mother Alemitu Daritado in SNNP Regional State, Kambata Tambaro Zone, in 1982 G.C. He attended primary school at Mino Primary School and his secondary education at Shinshicho Senior Secondary School.
TABLE OF CONTENTS
3.4. Image Analysis
3.4.1. Shape feature extraction
3.4.2. Fourier descriptors
3.4.3. Definition of angle measurement
3.4.4. Color feature extraction
3.5. Artificial Neural Network Setting
3.6. Classification Design
4. RESULTS AND DISCUSSIONS
4.1. Shape Analysis Results
4.2. Color Analysis Results
4.3. Experimental Results
5. SUMMARY, CONCLUSION AND RECOMMENDATIONS
6. REFERENCES
7. APPENDICES
ACKNOWLEDGEMENTS
First of all I would like to thank the almighty God for keeping me safe and for continually
blessing me in all aspects of my life.
I extend my heartfelt gratitude to my major advisor, Dr. Getachew Abebe, and my co-advisor, Dr. Girma Goro, for their positive and valuable professional guidance, constructive comments, suggestions and encouragement from the beginning to the end of this thesis work.
My thanks go to the Haramaya University Physics Department staff, Dr. Gelena Amente, Prof. Amarendra Rajput and Prof. K.A. Mohammed, for their constructive comments during the proposal defense.
Special thanks go to the Ethiopian Horticulture Agency workers for their valuable support during data collection.
Last but not least, I would like to thank my brothers and sisters, Kebebush Tigistu, Dinkinesh Tigistu, Teshale Tigistu, Belaynesh Tigistu, Dereje Tigistu and Tsegahun Tigistu, for their material and psychological support.
LIST OF ACRONYMS
LIST OF TABLES
Table
4.3 Manually constructed table for the ‘All confusion matrix’ of Figure 4.8
4.4 Manually constructed table for the ‘All confusion matrix’ of Figure 4.11
LIST OF FIGURES
Figure
4.1 (a)–(e): RGB image, gray image, binary image, boundary extracted from the binary image, and signature of the flower, respectively
4.2 (a)–(c): Reconstruction of the shape from the first zero, five and twenty FDs for the Marie-Clair flower variety; the corresponding FDs and DC component (which represents the center of the image in the frequency domain); and the graphic representation of the FDs, respectively
4.4 Neural network architecture and number of neurons in each layer of scenario-1
4.6 Neural network architecture and number of neurons in each layer of scenario-2
4.9 Neural network architecture and number of neurons in each layer of scenario-3
CLASSIFICATION OF ROSE FLOWERS USING FOURIER SHAPE
DESCRIPTORS AND COLOR MOMENT FEATURES
ABSTRACT
In this research work, digital image analysis techniques were used to classify samples of 18 rose flower varieties from the Dire-Highland, Joe-Flower and Alliance flower farms in the Holeta and Managesha areas according to their shape and color features. In total, 180 images (90 for shape and 90 for color feature extraction) were captured and used. For color analysis, the R, G and B components of the images were separated and the first three statistical moments of each component were calculated. For shape analysis, unlike color analysis, the images were first preprocessed and a signature waveform was derived. A Fourier series expansion was applied to the signature to calculate the Fourier shape descriptors. Based on the Fourier descriptors, an angle measurement was defined, and an artificial neural network was used to classify the flowers using these measurement values together with the color features. According to shape, the flowers are categorized as round, irregularly round and star-shaped. The flower varieties considered were identified with classification accuracies of 95.6%, 94.4% and 98.9% using the shape, color, and combined shape-and-color features, respectively. These experimental results show that the angle measurement (and hence the FDs) and the statistical color moments are quite efficient for rose flower variety discrimination.
1. INTRODUCTION
Plant Breeders’ Rights (PBR) are granted to respect ownership rights in new plant varieties and to allow breeders to exploit the results of their development programs free from commercial competition. Accordingly, PBR requires plant varieties to be described by a set of characters (features), unique to each species, that capture all the significant features. The characters are usually based on the guidelines published by the International Union for the Protection of New Varieties of Plants (UPOV). Plant Breeders’ Rights are granted if a new variety is shown to be Distinct, Uniform and Stable (DUS); that is to say, it differs from all varieties officially recognized at the time of application. Moreover, individual plants must share the same characteristics with each other, and the appearance of the variety must remain the same from one generation to the next (UPOV, 1990).
In order to protect plant breeders’ rights in selected plant varieties and, more generally, to accelerate plant development, a number of countries signed an international agreement for the protection of plant varieties and established UPOV on December 2, 1961. The task of UPOV is to standardize the criteria and methods for examining plant varieties when issuing new plant variety breeders’ rights certificates (Zhenjiang et al., 2006).
Among millions of plant varieties, the rose is an important one in beautifying our environment. So far, there are thousands of rose varieties in the world, and new varieties are continually being released. At present, UPOV uses 46 rose features, proposed about ten years ago, to examine and classify rose varieties, but the whole examination and classification process is carried out subjectively by human experts. However, with the development of computers and image analysis algorithms, and with the aim of examining and controlling rose varieties as precisely and reliably as possible, a new automatic computerized rose variety examination and verification scheme is required (Zhenjiang et al., 2006).
In order to discriminate one variety of rose flower from another, as many features of the rose plant as possible can be used. As mentioned above, UPOV uses 46 features in its rose classification criteria. These features can be taken from the stems and branches (plant growth type, height, horizontal section size, young shoot color, etc.), leaves and leaflets (shape, size, color, glossiness, etc.), buds (shape), flowers (flower number, type, petal number, size, shape), and so on. In this particular work, however, only two features, shape and color, are used for classification.
Zhenjiang et al. (2006) developed a method to sort flowers. In their work, they used Zernike moments to measure image roundness and Fourier descriptors to determine the star-shaped extent of the rose flowers.
Compared to the work mentioned above, this work differs in the following aspects:
- They used only shape as a feature (descriptor) of the flowers, whereas in this work both shape and color are used.
- They applied their work to rose flowers in China, which differ from Ethiopian roses because of differences in weather conditions and possibly also in varieties. Weather differences affect flower shape and size, resulting in differences in shape and size between the flowers of the two countries.
- They applied fuzzy set concepts for classifier formulation, whereas in this work an artificial neural network (ANN) is used as the classifier.
- They used the Fourier transform for shape feature extraction, whereas in this work a Fourier series expansion is used.
Generally, this work describes a rose variety examination program. To introduce its shape and color analysis scheme, a brief introduction to the rose flower, Fourier descriptors and color spaces is given first; Fourier descriptors were then applied to rose flower shape analysis, together with RGB color analysis. Based on the shape analysis results, a shape measurement called the angle measurement was proposed and defined. Finally, this measurement and the color analysis were applied in the rose flower analysis and recognition scheme.
1.2. Statement of the Problem
Ethiopia has geographical advantages for a floriculture industry because of its high altitude and agro-meteorological conditions conducive to flower growing. Besides, since the sector is labor-intensive, the country’s plentiful low-cost labor attracts investors to this industry. Because of this, and other factors such as the ambition for rapid economic growth, the country has diversified its income by increasing the farmland devoted to flowers (especially roses). This increase in production and productivity prompts the need for better and more objective determination of the quality of agricultural products and their management.
Rose varieties are very large in number. One variety differs from another with respect to each feature of the plant parts (flowers, stamens and pistils, leaves, branches, stems, etc.). This diversity of roses, along with the increasing number of flower farms in Ethiopia, makes variety identification difficult. Due to this, there is a need to recognize rose varieties on the basis of certain classification criteria, which include shape and color.
1.3. Objective
Rose varieties are relatively large in number and require classification and categorization. Therefore, the general objective of this study is the classification and recognition of rose flower varieties widely available in commercial farming. To address this main objective, the following specific objective was drawn:
- To apply Fourier descriptors and color moments, and to determine how effective these measurement techniques are in differentiating rose varieties.
The technology of image analysis is applicable in many areas. It is relatively young: its origin can be traced back to the 1960s. Since then it has experienced tremendous growth both in theory and in application. It has been applied in areas such as medical diagnosis, industrial automation, aerial surveillance, biometrics, remote sensing (satellite observation of the Earth) and the automated sorting and grading of agricultural products (Habtamu Minassie, 2008).
Some of the benefits of this work are:
- To assess customer interest in specific varieties and to increase production of these needed varieties.
- To identify varieties which preserve their original color, pigment and, generally, their beauty for a longer period after harvesting; these properties may be unique to some specific varieties, and such varieties may be grouped into one category.
- To group disease-resistant or highly susceptible varieties into one category, which helps to identify them.
2. LITERATURE REVIEW
Image processing is a rapidly growing area of computer science. Its growth has been fueled by
technological advances in digital imaging, computer processors and mass storage devices. Fields
which traditionally used analog imaging are now switching to digital systems, for their flexibility
and affordability (Seemann, 2002).
The field of digital image processing refers to the processing of digital images by means of a digital computer. Two of the major driving forces of interest in digital image processing methods are improving image data for human interpretation, and processing image data for storage, transmission and representation for machine vision (Gonzalez and Woods, 2002).
The field of image processing and analysis has grown considerably since the 1960s, with the increased utilization of imagery in several applications coupled with improvements in the size, speed and cost-effectiveness of digital computers and related signal-processing technologies. Image processing has found a significant role in scientific, industrial, space and government applications.
It is now widely used in biomedicine (Milan et al., 2004), remote sensing (Zlotnick and Carnine, 1993), document processing (Vincent and Pavlidis, 1994), and astronomy and space exploration (Jonathan, 2005). The rose flower image processing and analysis program is another recent application besides those (Zhenjiang et al., 2006).
Image shape analysis is the process undertaken to extract features which represent the shape of the image under consideration. These features can be extracted through different image processing methods.
Object description and recognition is an important task in computer vision. There are two types of shape description methods: boundary-based methods and region-based methods (Cem Direkoğlu and Mark, 2010).
In boundary-based methods, only the boundary pixels of a shape are taken into account to obtain the shape representation. These methods have gained popularity because they are usually simple to compute and are descriptive; hence they are broadly used in many applications.
The most common boundary-based shape descriptors are Fourier descriptors, wavelet descriptors and wavelet-Fourier descriptors. Shape representation using Fourier descriptors is easy to compute: the Fourier descriptors are obtained from the Fourier transform of a shape signature. The shape signature is a 1-D function representing the shape, derived from the boundary points of a 2-D binary image (Zhang and Lu, 2003).
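The boundary-to-signature-to-descriptor pipeline can be illustrated with a short sketch. The thesis work used MATLAB; the following NumPy equivalent runs on a synthetic five-petal contour and uses the centroid-distance signature, one common choice of shape signature. The synthetic shape and all function names are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def shape_signature(boundary_xy):
    """Centroid-distance signature: the distance from the shape centroid
    to each boundary point, a 1-D representation of the 2-D contour."""
    centroid = boundary_xy.mean(axis=0)
    return np.linalg.norm(boundary_xy - centroid, axis=1)

def fourier_descriptors(signature, num=10):
    """Magnitudes of the first `num` Fourier coefficients of the signature.
    Dividing by the DC term makes the descriptors scale-invariant."""
    mags = np.abs(np.fft.fft(signature))
    return mags[1:num + 1] / mags[0]

# Synthetic 5-petal "star" contour sampled at N boundary points.
N = 256
t = np.linspace(0, 2 * np.pi, N, endpoint=False)
r = 1.0 + 0.3 * np.cos(5 * t)          # radius varies with 5-fold symmetry
boundary = np.column_stack([r * np.cos(t), r * np.sin(t)])

fds = fourier_descriptors(shape_signature(boundary))
# The 5th descriptor dominates, reflecting the 5-petal symmetry.
print(np.argmax(fds) + 1)   # → 5
```

The dominant descriptor index directly exposes the rotational symmetry of the petals, which is the kind of information the star-shapedness measures in the literature build on.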
In a region-based shape method, all the pixels within a shape are used to obtain the shape
representation. Popular region based shape descriptors include moments and Generic Fourier
Descriptors (GFDs) (Zhang and Lu, 2002).
Since region-based shape representations combine information across an entire object, rather
than exploiting information just at the boundary points, they can capture the interior content of a
shape. Other advantages of region-based methods are that they can be employed to describe non-
connected and disjoint shapes. However, region-based representations do not emphasize contour
features, which are equally crucial for human perception of shapes.
2.2. Fourier Analysis
The fundamental idea of Fourier analysis is that any signal, be it a function of time, space or any other variable, may be expressed as a weighted linear combination of harmonic (i.e., sine and cosine) functions having different periods or frequencies. These are called the spatial frequency components of the signal. Here the word ‘spatial’ indicates that the frequency is the reciprocal of distance rather than of time (i.e., cycles per unit length rather than cycles per second). In this representation of the signal as a weighted combination of harmonic functions of different frequencies, the assigned weights constitute the Fourier spectrum. This spectrum extends (in principle) to infinity, and any signal can be reproduced to arbitrary accuracy. Thus, the Fourier spectrum is a complete and valid alternative representation of the signal (Chris and Toby, 2011).
In the Fourier representation of a function, harmonic terms with high frequencies (short periods)
are needed to construct small-scale (i.e., sharp or rapid) changes in the signal. Conversely,
smooth features in the signal can be represented by harmonic terms with low frequencies (long
periods). The two domains are thus reciprocal – small in the space domain maps to large in the
frequency domain and large in the space domain maps to small in the frequency domain (Chris
and Toby, 2011).
The Fourier series is the basic tool for representing periodic functions, which play an important role in applications (Kreyszig, 2006). It takes a signal and decomposes it into a sum of sines and cosines of different frequencies (Jeffrey, 2002; Karris, 2003). When processing images and signals such as audio, radio waves, light waves and seismic waves, the Fourier series can isolate individual components of a compound waveform, splitting them out for easier detection and/or removal. In order to define a Fourier expansion, we can start by considering that a continuous curve C(t) can be expressed as a summation of the form

C(t) = Σ_k c_k φ_k(t)                (1)
where c_k defines the coefficients of the expansion and the collection of functions φ_k(t) defines the basis functions. The expansion problem centers on finding the coefficients given a set of basis functions. A factor that influences the development and properties of the description of the curve is the choice of Fourier expansion. If we consider that the trace of a curve defines a periodic function, we can opt to use a Fourier series expansion. However, we could also consider that the description is not periodic; then we could develop a representation based on the Fourier transform (Mark and Alberto, 2012).
Equation (1) is very general, and different basis functions can be used. A Fourier expansion represents periodic functions by a basis defined as a set of infinitely many complex exponentials, i.e.,

C(t) = Σ_{k=-∞}^{∞} c_k e^{jkωt}                (2)

Here, ω defines the fundamental frequency; it is equal to 2π/T, where T is the period of the function. In addition to the exponential form given in Eq. (2), the Fourier expansion can also be expressed in trigonometric form:
C(t) = c_0 + Σ_{k=1}^{∞} (c_k e^{jkωt} + c_{-k} e^{-jkωt})                (3)

In this equation, the values of c_k and c_{-k} define a pair of complex-conjugate vectors; that is, c_k and c_{-k} describe a complex number and its conjugate. Let us define these numbers as

c_k = c_{k,1} + j c_{k,2},   c_{-k} = c_{k,1} - j c_{k,2}                (4)
By substituting this definition into Eq. (3) and using the identity e^{jθ} = cos(θ) + j sin(θ), we obtain

C(t) = c_0 + Σ_{k=1}^{∞} (2c_{k,1} cos(kωt) - 2c_{k,2} sin(kωt))                (5)

Let us define

a_k = 2c_{k,1},   b_k = -2c_{k,2}                (6)

We then obtain the standard trigonometric form, given by

C(t) = a_0/2 + Σ_{k=1}^{∞} (a_k cos(kωt) + b_k sin(kωt))                (7)
For a real Fourier series representation of a discrete 1-D sampled signal C(t) with N sample points, Eq. (7) becomes

C(t) = a_0/2 + Σ_{k=1}^{N/2} (a_k cos(kωt) + b_k sin(kωt))                (8)

where ω = 2π/T, C(t) is the signal representation of the trace in the time domain, and a_k and b_k are the unknown coefficients of the series. For a signal of period T these coefficients are:

a_0 = (2/T) ∫_0^T C(t) dt
a_k = (2/T) ∫_0^T C(t) cos(kωt) dt                (9)
b_k = (2/T) ∫_0^T C(t) sin(kωt) dt
According to Eqs. (4) and (6), the coefficients of the trigonometric and exponential forms are related by

c_k = (a_k - j b_k)/2                (10)

The modulus |c_k| = ½ √(a_k² + b_k²) of c_k is used in practical work (such as the present work on rose flowers) as a Fourier descriptor of the shape of a pattern. Given a boundary contour (trace) of a rose shape whose image is taken from above (i.e., the top view), we can use a one-dimensional signal representation technique. Therefore, one can represent the signal by the infinite Fourier series representation of Eq. (7).
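The relation in Eq. (10) between the trigonometric coefficients a_k, b_k and the complex coefficients c_k can be checked numerically. The sketch below (Python/NumPy, whereas the thesis itself used MATLAB) builds a signal with known coefficients, computes a_k and b_k by the discrete analogue of Eq. (9), and compares (a_k - j b_k)/2 with the DFT output; the test signal is an illustrative assumption.

```python
import numpy as np

# Sample one period T = 1 of a signal with known trigonometric coefficients:
# C(t) = a0/2 + a1*cos(wt) + b2*sin(2wt), with a0 = 6, a1 = 2, b2 = 4.
N, T = 64, 1.0
t = np.arange(N) * T / N
w = 2 * np.pi / T
C = 3 + 2 * np.cos(w * t) + 4 * np.sin(2 * w * t)

# Trigonometric coefficients via the discrete analogue of Eq. (9).
def a_k(k): return (2 / N) * np.sum(C * np.cos(k * w * t))
def b_k(k): return (2 / N) * np.sum(C * np.sin(k * w * t))

# Exponential coefficients c_k from the DFT (fft(C)[k] / N gives c_k).
c = np.fft.fft(C) / N

# Eq. (10): c_k = (a_k - j b_k) / 2 — the two forms agree.
for k in (1, 2):
    assert np.isclose(c[k], (a_k(k) - 1j * b_k(k)) / 2)
print(abs(c[1]), abs(c[2]))   # the moduli used as Fourier descriptors
```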
A finite complex exponential form of the Fourier series representation of a discrete function of one variable, C(t), t = 0, 1, 2, …, N-1, is given by (Gonzalez and Woods, 2002):

a(u) = Σ_{t=0}^{N-1} C(t) e^{-j2πut/N},   u = 0, 1, 2, …, N-1                (11)

The complex coefficients a(u) are called the Fourier Descriptors (FDs) of the boundary.
2.2.2. Fourier transform
The essential meaning and purpose of the Fourier transform is really no different from that of the
Fourier series. The Fourier transform of a function also fundamentally expresses its
decomposition into a weighted set of harmonic functions. Moving from the Fourier series to the
Fourier transform, we move from function synthesis using weighted combinations of harmonic
functions having discrete frequencies (a summation) to weighted, infinitesimal, combinations of
continuous frequencies (an integral) (Chris and Toby, 2011).
The Fourier transform provides an alternative way of representing data that vary with time: instead of representing the signal amplitude as a function of time, we represent the signal by how much information is contained at different frequencies (Attia, 1999; Mandal and Asif, 2007; Boggess and Narcowich, 2009). It does for the signal waveform representing a shape trace (contour) what a prism does for light: it breaks it up into its component parts, just as a beam of light is made up of many different colors.
The transformation of discrete data between the time and frequency domains is quite useful for extracting information from the signal. The DFT expresses signals as linear combinations of sinusoidal complex exponential signals with various angular frequencies (Attia, 1999; Elali, 2005; Orfanidis, 2010). This decomposition of signals allows one to examine the effects of a system on each signal component.
The discrete Fourier transform (DFT) plays a central role in the implementation of many signal processing algorithms. The DFT is a mathematical transform which resolves a time series x[n] into the sum of an average component and a series of sinusoids with different amplitudes and frequencies (Musoko, 2005). To compute the frequency content of a signal (or the frequency response of a system), we use the DFT (Mandal and Asif, 2007; Orfanidis, 2010).
The Fourier transform of the discrete function of one variable given by Eq. (11) is (Gonzalez and Woods, 2002):

a(u) = (1/N) Σ_{t=0}^{N-1} C(t) e^{-j2πut/N}                (12)

The 1/N multiplier in front of the Fourier transform is sometimes placed in front of the inverse instead. Other times (less often) both equations are multiplied by 1/√N. The location of the multiplier does not matter.
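The claim that the location of the 1/N multiplier does not matter can be verified directly: as long as the forward/inverse pair together contains exactly one factor of 1/N, the signal is recovered exactly. A NumPy sketch (the test signal is an arbitrary illustrative assumption):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(32)          # an arbitrary real "signature"
N = len(x)

# DFT matrix: W[u, t] = exp(-j 2*pi*u*t / N).
W = np.exp(-2j * np.pi * np.outer(np.arange(N), np.arange(N)) / N)

# Convention A: 1/N on the forward transform, as in Eq. (12).
F_a = (W @ x) / N
x_a = W.conj().T @ F_a               # inverse carries no multiplier

# Convention B: 1/N on the inverse transform (NumPy's default).
F_b = np.fft.fft(x)
x_b = np.fft.ifft(F_b)

assert np.allclose(x_a.real, x)      # both conventions reconstruct x
assert np.allclose(x_b.real, x)
assert np.allclose(F_a * N, F_b)     # spectra differ only by the factor N
```

Because the two spectra differ only by a constant factor, descriptor ratios such as |a(u)|/|a(0)| are identical under either convention, which is why the choice is immaterial for shape description.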
The coefficients c_k in Eq. (10) are known as the Fourier coefficients of the series. For a sampled image boundary represented by an image signature (signal form), these coefficients describe the image shape well. Therefore, they can be taken as descriptors of the shape, and they are known as Fourier Descriptors.
From Eq. (12) one can see that the Fourier transform accomplishes the same thing as the Fourier series expansion (i.e., Fourier descriptors can be calculated either from the Fourier series expansion or from the Fourier transform).
In the limit, the discrete summation of the Fourier series becomes a continuous summation, i.e., an integral:

F(ω) = ∫_{-∞}^{∞} C(t) e^{-jωt} dt                (13)

C(t) = (1/2π) ∫_{-∞}^{∞} F(ω) e^{jωt} dω                (14)

Unlike the Fourier series, the Fourier transform does not require the signal to be periodic; thus, this transform can represent non-periodic continuous signals.
To conclude the discussion of Fourier analysis: conceptually, the Fourier series expansion and the Fourier transform do the same thing. The difference is that the Fourier series breaks down a periodic signal into harmonic functions of discrete frequencies, whereas the Fourier transform breaks down a non-periodic signal into harmonic functions of continuously varying frequencies. The mathematics is different, but the idea is the same (Chris and Toby, 2011).
The purpose of a color model (also called a color space or color system) is to facilitate the specification of colors in some standard, generally accepted way. In essence, a color model is a specification of a coordinate system and a subspace within that system where each color is represented by a single point. Among these models, the RGB and HSI (HSV) models are the most frequently used (Gonzalez and Woods, 2002).
In the RGB model, each color appears in its primary spectral components of red, green and blue. This model is based on a Cartesian coordinate system. The color subspace of interest is a cube in which the R, G and B primaries are at three corners, while cyan, magenta and yellow are at three other corners. Black is at the origin and white is at the corner farthest from the origin. In this model, the gray scale (points of equal RGB values) extends from black to white along the line joining these two points. The different colors in this model are points on or inside the cube, defined by vectors extending from the origin. This color model is especially important in digital image processing because it is used by most digital imaging devices (i.e., it is hardware-oriented rather than perception-oriented): by simple inspection, humans cannot judge how much red, green and blue a particular color contains.
Image pixels are represented by bits in digital systems. In an eight-bit-per-channel representation of a color image, each channel value lies between 0 and 255; these values define the amounts of red, green and blue light the pixel contains.
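The abstract states that the color features were the first three statistical moments of each R, G and B channel. The following NumPy sketch of such a nine-element feature vector is illustrative (the thesis used MATLAB; the sign-preserving cube root for the third moment is one common convention, assumed here):

```python
import numpy as np

def color_moments(rgb):
    """First three statistical moments (mean, standard deviation,
    skewness) of each of the R, G, B channels of an 8-bit image,
    giving a 9-element color feature vector."""
    feats = []
    for ch in range(3):
        vals = rgb[:, :, ch].astype(float).ravel()
        mean = vals.mean()
        std = vals.std()
        # Sign-preserving cube root of the third central moment.
        m3 = np.mean((vals - mean) ** 3)
        skew = np.sign(m3) * abs(m3) ** (1 / 3)
        feats.extend([mean, std, skew])
    return np.array(feats)

# Synthetic 4x4 "image": R constant, G two-valued, B random.
rng = np.random.default_rng(1)
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :, 0] = 200                        # constant red channel
img[:2, :, 1] = 50; img[2:, :, 1] = 150   # bimodal green channel
img[:, :, 2] = rng.integers(0, 256, (4, 4))

f = color_moments(img)
print(f[:3])   # red channel: mean 200, std 0, skew 0
```

Because each moment is a single number per channel, the color feature vector stays compact regardless of image size, which keeps the classifier input small.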
Figure 2.1 RGB Color Space representation
Pattern classification is an area of science concerned with discriminating objects on the basis of
information available about these objects. The objective is to recognize objects in the image from
a set of measurements of the objects. Each object is a pattern and the measured values are the
features of the pattern. A set of similar objects possessing more or less identical features is said to belong to a certain pattern class (Habtamu Minassie, 2008).
Hence, the aim of pattern recognition is the design of a classifier: a mechanism which takes the features of an object as its input and produces a classification, a label or value indicating to which class the object belongs. This is done on the basis of the learning set, a set of objects with known labels. The classifier's performance is usually tested using a set of objects independent of the learning set, called the test set (Shapiro and Stockman, 2001).
A number of pattern classification techniques have been used for the recognition of patterns. Classification methods are mainly of two types: supervised learning and unsupervised learning (Tinku and Ajoy, 2005).
In supervised classification, the classifier is trained with a large set of labeled training pattern samples; ‘labeled’ means that the class membership of each pattern is known in advance.
In the unsupervised case, the system partitions the entire data set based on some similarity criterion, resulting in a set of clusters where each cluster of patterns belongs to a specific class.
The artificial neural network is one of the best-known pattern classifiers and is discussed in the following section.
An artificial neural network (ANN) is an adaptable system that can learn relationships through the repeated presentation of data and is capable of generalizing to new, previously unseen data. ANNs are large sets of interconnected neurons which execute in parallel to perform the task of learning (Habtamu Minassie, 2008).
The distributed computation of ANNs has the advantages of reliability, fault tolerance, high throughput (division of computation tasks) and cooperative computing. Adaptation is the ability to change a system's parameters according to some rule (normally, minimization of an error function); it enables the system to search for optimal performance. The nonlinearity of ANNs is also important for dynamic-range control of unconstrained variables and produces more powerful computation schemes than linear processing, although it complicates theoretical analysis tremendously (Tinku and Ajoy, 2005).
Unlike more analytically based information-processing methods, neural computation effectively explores the information contained within the input data without further assumptions. It builds relationships in the input data sets through the iterative presentation of the data and the intrinsic mapping characteristics of neural topologies, normally referred to as learning.
There are two basic phases in neural network operation: the training (learning) phase and the testing (recall or retrieval) phase. In the learning phase, data are repeatedly presented to the network while the weights are updated to obtain a desired response. In the testing phase, the trained network with frozen weights is applied to data it has never seen, to evaluate the network's classification performance.
One of the neural network models used in almost all fields is the back-propagation neural network. As its name indicates, the learning algorithm in this network is known as back-propagation (Gopi, 2007).
The back-propagation algorithm is used in layered feed-forward artificial neural networks: the artificial neurons are organized in layers and send their signals forward, and the errors are then propagated backward to modify the weights of the network so as to minimize the error. The network receives features extracted from the pattern as inputs at the neurons of the input layer, and the output of the network, given by the neurons of the output layer, indicates the class of the pattern.
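The forward-pass/backward-error-propagation cycle described above can be sketched with a minimal NumPy network. The thesis used the MATLAB toolbox; this standalone example, with a single hidden layer, sigmoid units, and the XOR problem as training data, is purely illustrative and not the thesis configuration.

```python
import numpy as np

# XOR: the classic problem a single-layer network cannot solve.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(42)
W1, b1 = rng.standard_normal((2, 8)), np.zeros(8)   # input  -> hidden
W2, b2 = rng.standard_normal((8, 1)), np.zeros(1)   # hidden -> output
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0
errors = []

for _ in range(10000):
    # Forward pass: signals flow from the input layer to the output layer.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    errors.append(np.mean((out - y) ** 2))
    # Backward pass: the output error is propagated back through the layers.
    d_out = (out - y) * out * (1 - out)          # sigmoid derivative applied
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates that reduce the squared error.
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(errors[0], errors[-1])   # the training error decreases
```

In the thesis setting, the input layer would receive the angle-measurement and color-moment features and the output layer one neuron per flower class, but the learning cycle is the same.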
Related work
Zhenjiang et al. (2006) used the Fourier transform to describe flower shape with Fourier descriptors and categorized the flowers as round, irregularly round and star-shaped.
3. MATERIALS AND METHODS
The samples were drawn from the Holeta and Managesha flower farms. Holeta and Managesha are 30 km and 40 km, respectively, to the west of the capital, Addis Ababa. In Holeta there are two flower farms, namely ‘Dire-Highland’ and ‘Joe flower’, from which samples were collected. Likewise, in Managesha the samples were collected from the ‘Alliance’ farm. Table 3.1 shows the names of the different varieties grown on these three farms; the topmost row gives the farm names and the other rows the names of the varieties grown on each farm.
Table 3.1. Rose flower varieties grown on the three farms

No.  Joe-flower       Dire-highland   Alliance
1    Label            N-Joe           Malibu
2    Good-times       Happy-hour      Samurai
3    High and magic   Esperance       Prima
4    Duet             High-society    Utopia
5    Marie-Clair      Marie-Clair     Red-Parries
6    Upper class      Cartoon         Upper class
7    Sweet-Candia
8    Athena
As seen from this table, the Marie-Clair variety is grown on both the ‘Joe-flower’ and ‘Dire-Highland’ farms, whereas the Upper-class variety is grown on both the ‘Joe flower’ and ‘Alliance’ farms. The total number of rose flower varieties on these three farms is 18.
In this particular study it was not necessary to implement formal sampling techniques; rather, samples were taken from all available flower varieties. This is because the aim of the study was not to estimate the properties of the population (all rose flowers on the farms) but to classify the flowers according to their shapes and colors. Accordingly, 5 images for shape and 5 images for color of each variety on the mentioned farms, i.e., 10 × 18 = 180 images in total, were captured using a Nikon 3100 digital camera and saved for further processing. Sample images, displayed using the MATLAB figure window, are shown in Appendices A and B.
Images of rose flowers were captured as samples. The snapshots were taken by placing the flowers on a dark background prepared from thick black cloth and cardboard. The camera was set perpendicular to the plane containing the sample so as to capture the top view of the flower. The black background reduces the reflection of light from the background, which in turn reduces the image noise resulting from reflected light. In this way it is easy to separate (threshold) the flower from the dark background after the image is taken; this thresholding between the flower area and the dark background area makes it easy to detect the boundary contour of the flower.
3.3. Software
MATLAB (version R2012b) was used for all computations. MATLAB is a high-level technical
computing language and interactive environment that performs computationally intensive tasks
faster than traditional programming languages. It integrates computation, visualization, and
programming in an easy-to-use environment where problems are expressed in familiar mathematical
notation, and it supports math and computation, algorithm development, data acquisition,
modeling, simulation and prototyping, data analysis, exploration and visualization, scientific
and engineering graphics, and application development including the building of graphical user
interfaces (Furtado et al., 2010).
The collected image data were loaded to a computer for further analysis, comprising image
preprocessing, image segmentation, feature extraction and pattern recognition (classification).
This was done using the MATLAB software described above; computer routine algorithms were
written in MATLAB. For the pattern classification phase, the MATLAB Artificial Neural Network
toolbox was used.
Image segmentation techniques were applied to the loaded images for shape feature extraction
only, because for color feature extraction the captured images have no background area to
segment: the foreground (flower) fills the whole image area. Segmentation was used to separate
each rose image from the background, which usually results in a binarized image.
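The thesis performs this binarization in MATLAB; purely as an illustration, the following Python sketch implements one common automatic thresholding method (Otsu's) in NumPy on a synthetic "flower on a dark background" image. The image values and geometry here are invented for the example and are not from the thesis data.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the intensity threshold that maximizes the between-class
    variance (Otsu's method) for an 8-bit grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    total = hist.sum()
    mu_total = np.dot(np.arange(256), hist)
    best_t, best_var = 0, -1.0
    cum_w = 0.0   # pixel count of the background class so far
    cum_mu = 0.0  # cumulative intensity sum of the background class
    for t in range(256):
        cum_w += hist[t]
        cum_mu += t * hist[t]
        w0 = cum_w / total
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = cum_mu / cum_w
        mu1 = (mu_total - cum_mu) / (total - cum_w)
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Synthetic sample: dark background (~20) with a bright disc (~200),
# mimicking a flower photographed on black cloth.
img = np.full((100, 100), 20, dtype=np.uint8)
yy, xx = np.mgrid[:100, :100]
img[(yy - 50) ** 2 + (xx - 50) ** 2 < 30 ** 2] = 200
t = otsu_threshold(img)
binary = img > t   # True = flower pixels, False = background
```

With a well-separated dark background, the threshold falls between the two intensity modes, which is exactly why the black cloth setup described above simplifies segmentation.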
From the segmented images, boundaries were extracted as closed curves, and signatures (wave
forms) were derived from the extracted boundaries. Then, Fourier descriptors were extracted as
shape features.
In image shape analysis using Fourier descriptors, the first step is to obtain the boundary
coordinates (x(θ), y(θ)), θ = 0, 1, 2, …, N-1, where N is the number of boundary pixels on the
sampled contour of the shape. The centroid distance function is the distance of the boundary
points from the centroid (x_c, y_c) of the shape (Zhang and Lu, 2003). The centroid distance,
which is also a function of the variable θ, is unique to each point on the boundary of the
contour and hence unique to the signature of that shape. A signature is the representation of a
2-D boundary as a 1-D function. Typically, this is achieved by calculating the centroid of the
given shape (as defined by the boundary coordinates) and plotting the distance r from the
centroid to the boundary as a function of the polar angle θ. The signature of a closed boundary
is a periodic function, repeating itself with period 2π. Therefore, the centroid distance can
represent the image boundary.
r(\theta) = \sqrt{\left(x(\theta) - x_c\right)^2 + \left(y(\theta) - y_c\right)^2}    (1)
and
x_c = \frac{1}{N}\sum_{\theta=0}^{N-1} x(\theta), \qquad y_c = \frac{1}{N}\sum_{\theta=0}^{N-1} y(\theta)    (2)
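The centroid distance of Eqs. (1)-(2) can be sketched in a few lines of NumPy (the thesis used MATLAB; this Python version is only an illustration of the same computation):

```python
import numpy as np

def centroid_distance(xs, ys):
    """Signature r(theta): distance of each boundary point from the shape
    centroid (Eq. 1), where the centroid is the mean of the boundary
    coordinates (Eq. 2)."""
    xs, ys = np.asarray(xs, float), np.asarray(ys, float)
    xc, yc = xs.mean(), ys.mean()
    return np.sqrt((xs - xc) ** 2 + (ys - yc) ** 2)

# A circular boundary yields a flat (constant) signature:
theta = np.linspace(0, 2 * np.pi, 360, endpoint=False)
r = centroid_distance(5 + 3 * np.cos(theta), -2 + 3 * np.sin(theta))
```

For the circle above, every boundary point is 3 units from the centroid, so the signature is the constant 3, illustrating the "ideal round shape" case discussed later.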
An example of a simple centroid distance function is shown below, only to illustrate how a
signature, which is one-dimensional, can represent a given two-dimensional shape. Here the
signature of the shape is a simple cosine wave, because the boundary of the flower repeats the
same pattern (shape) along its boundary. For irregularly shaped flowers this signature will be a
complex wave, which can be treated as a Fourier series.
Figure 3.1. Translation of two-dimensional shape into one-dimensional signature (Saitoh and
Kaneko, 2003).
The Fourier series expansion, in its real form, for the signature of the flower is then given by

r(\theta) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n \cos(n\theta) + b_n \sin(n\theta)\right)    (3)

r(\theta) = \sum_{n=-\infty}^{\infty} c_n e^{in\theta}    (4)

where a_n and b_n are the Fourier coefficients, which are related to c_n, the coefficients of
the complex form of the Fourier series representation given as Eq. (10) of section 2.2.1.
Here c_n represents the Fourier coefficients (Fourier descriptors), which describe the boundary
contour shape and were computed from the image data using computer routine algorithms.
Therefore, using this technique, these features were extracted as descriptors, as shown in the
next diagram.
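A standard way to obtain such coefficients from a sampled signature is the discrete Fourier transform; the NumPy sketch below (an illustration, not the thesis code) computes the descriptor magnitudes |c(n)|:

```python
import numpy as np

def fourier_descriptors(signature):
    """Magnitudes |c(n)| of the Fourier coefficients of a 1-D boundary
    signature. Dividing by the length makes c(0) the mean of the signature
    (the d.c. term); higher-order terms carry progressively finer detail."""
    c = np.fft.fft(signature) / len(signature)
    return np.abs(c)

# A constant signature (ideal circle) has energy only in the d.c. term:
fd = fourier_descriptors(np.full(256, 3.0))
```

The d.c. term recovers the mean radius (3.0 here) while every harmonic is zero, matching the statement below that low-order components capture gross shape and high-order components capture detail.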
The modulus of these coefficients, |c(n)|, is used in this work as the Fourier descriptor of the
shape of the roses.
The main idea of Fourier descriptors is to characterize a contour by a set of numbers that
represent the frequency content of the whole shape. In the most basic form, the boundary pixels
are described by their x and y coordinates, and a Fourier description of these essentially gives
the set of spatial frequencies that fit the boundary points. The first element of the Fourier
components (the d.c. component) is simply the average of the x and y coordinates, giving the
coordinates of the center point of the boundary, expressed in complex form. The second component
essentially gives the radius of the circle that best fits the points. Accordingly, a circle can
be described by its zero- and first-order components (the d.c. component and first harmonic).
The higher-order components increasingly describe detail, as they are associated with higher
frequencies, while the low-frequency components determine the global (approximate) shape. This
means that high-frequency components account for sharp points on the boundary contour.
A few low-order coefficients are able to capture the gross (approximate) shape of the boundary
in consideration, but many more high-order terms are required to accurately define shape
features such as corners. Therefore, a few Fourier descriptors can be used to capture the gross
essence of a boundary; this property is valuable because these coefficients carry shape
information. The sharp points account for the formation of angles along the boundary of the
shape, and these angles are therefore characterized by a number of higher-order FDs (FDs
corresponding to higher n).
3.4.3. Definition of angle measurement
In order to describe the star-shaped extent of the flower based on the high-order FDs, an angle
measurement φ is defined:
\varphi = \sum_{m=1}^{M} \frac{|c(m)|}{|c(0)|}    (5)
|c(m)| (m = 1, 2, …, M) are the M high-order Fourier descriptors among all |c(n)|. For an ideal
round shape φ = 0; for irregularly round and star-shaped images φ > 0, and the bigger φ is, the
more star-shaped the flower. Since it is impossible to obtain a perfectly round or perfectly
star-shaped flower, an approximation was set: below a lower cut-off value of φ the flower in
consideration is taken to be round, and above an upper cut-off it is considered star-shaped. All
the remaining flowers are categorized as irregularly round.
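The measure and the three-way grouping can be sketched as follows in Python (the thesis used MATLAB). The normalization by the d.c. component |c(0)| and the numeric cut-offs 0.6 and 1.6 are assumptions for illustration only; the thesis uses similar but not identical values.

```python
import numpy as np

def angle_measure(signature, M=6):
    """phi: sum of the M highest-order normalized Fourier descriptors of a
    boundary signature. Normalizing by the d.c. term |c(0)| (an assumption
    here) makes the measure scale invariant."""
    c = np.abs(np.fft.fft(signature)) / len(signature)
    c_norm = c / c[0]
    half = c_norm[1:len(signature) // 2 + 1]  # unique harmonic orders
    return float(half[-M:].sum())             # the M highest orders

def shape_class(phi, round_cut=0.6, star_cut=1.6):
    """Hypothetical cut-offs: small phi -> round, large phi -> star-shaped,
    intermediate -> irregularly round."""
    if phi <= round_cut:
        return "round"
    if phi >= star_cut:
        return "star-shaped"
    return "irregularly round"

phi_round = angle_measure(np.full(256, 3.0))  # constant signature: ideal circle
```

For the ideal circle every harmonic vanishes, so φ = 0 and the flower is classified as round, exactly as the text requires.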
3.4.4. Color Feature Extraction
In this study statistical color moments are extracted as color features. Color moments are
measures that can be used to differentiate images based on their color features. The RGB image
of each flower was loaded into MATLAB to separate the tri-stimulus components (the R, G and B
planes) and calculate the first three statistical moments of each.
The color moments extracted as features are the mean (μ), standard deviation (σ), and skewness
(s). These three moments are extracted from each of the R, G, and B planes, so nine color
features are obtained in total. The formulae for these moments (Abdul et al., 2011) are:
\mu = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} P_{ij}    (6)

\sigma = \left[\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(P_{ij}-\mu\right)^{2}\right]^{1/2}    (7)

s = \left[\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(P_{ij}-\mu\right)^{3}\right]^{1/3}    (8)

M and N are the dimensions of the image, and P_ij is the color value in the ith column and jth
row.
The computer routine algorithm was developed to extract these moments.
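Eqs. (6)-(8) translate directly into NumPy (the thesis implemented them in MATLAB; this Python sketch is only illustrative, and the synthetic image is invented for the example):

```python
import numpy as np

def color_moments(rgb):
    """Nine color features: mean, standard deviation and skewness (cube
    root of the third central moment, so the sign is kept) of each of the
    R, G and B planes, in that order."""
    feats = []
    for ch in range(3):
        p = rgb[:, :, ch].astype(float)
        mu = p.mean()
        sigma = np.sqrt(((p - mu) ** 2).mean())   # Eq. (7)
        skew = np.cbrt(((p - mu) ** 3).mean())    # Eq. (8)
        feats.extend([mu, sigma, skew])
    return np.array(feats)

# Uniform pure-red test image: mean 200 in R, zero spread and skew everywhere.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[:, :, 0] = 200
f = color_moments(img)
```

The result is the 9-element feature vector described above, one (μ, σ, s) triple per channel.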
In this study a feed-forward multilayer perceptron (MLP) model with the back-propagation
learning rule, which is based on supervised learning, is used. This is the most widely used
neural network model; its design consists of one input layer, one hidden layer, and one output
layer. Each layer has its own neurons, whose number depends on the specific function the layer
performs. The number of neurons in the input layer should equal the number of features in the
pattern to be classified or recognized, whereas the number of neurons in the hidden layer can be
assigned by the user according to the accuracy of performance needed. The number of neurons in
the output layer must equal the number of target classes into which the pattern is to be
classified. Figure 3.3 shows the model of the neural network structure.
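As an illustration of this architecture (the thesis used the MATLAB ANN toolbox; this NumPy sketch is not the thesis code, and the tanh/softmax choices and random weights are assumptions), a forward pass through such a three-layer network can be written as:

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_forward(x, W1, b1, W2, b2):
    """One hidden layer with tanh units and a softmax output layer: the
    layer sizes mirror the text (n_features -> n_hidden -> n_classes)."""
    h = np.tanh(x @ W1 + b1)
    z = h @ W2 + b2
    e = np.exp(z - z.max(axis=1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=1, keepdims=True)

# Sizes matching the color-feature scenario described later: 9 -> 10 -> 18.
n_features, n_hidden, n_classes = 9, 10, 18
W1 = rng.normal(size=(n_features, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_classes)); b2 = np.zeros(n_classes)
out = mlp_forward(rng.normal(size=(5, n_features)), W1, b1, W2, b2)
```

Each output row is a probability distribution over the target classes; training would adjust W1, b1, W2, b2 by back-propagation of the output error.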
Using the extracted color and shape features, the network was trained and tested: a relation was
set between the feature vectors and the target groups assigned to them, and the performance of
the classifier was evaluated on a test set. This is done to make sure the classifier network
recognizes and classifies the flowers according to their features. To train the network, input
and target data are fed into it; the extracted image features serve as input and the class label
of each flower as target. As the input and target data are introduced, the network automatically
divides the data into training, testing and validation sets. The percentage of each set can be
adjusted by the user (researcher); otherwise the network uses the default setting of 70% for
training, 15% for testing and 15% for validation.
The training data are used to train the network, which is adjusted according to its error, the
difference between target and output. The validation data are used to measure network
generalization and to halt training when generalization stops improving. The testing data then
provide an independent measure of network performance during and after training.
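The 70/15/15 division described above can be sketched as a simple shuffled index split (an illustration in Python; the MATLAB toolbox performs this division internally):

```python
import numpy as np

def split_indices(n, train=0.70, val=0.15, seed=0):
    """Shuffle sample indices and split them into training, validation and
    test sets using the 70/15/15 proportions quoted in the text."""
    idx = np.random.default_rng(seed).permutation(n)
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

tr, va, te = split_indices(180)  # the 180 captured images
```

For 180 samples this yields 126 training, 27 validation and 27 test indices with no overlap.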
After feature extraction, the next step is classification of the flowers. Image classification
is a fundamental problem in pattern recognition, the study of how machines can observe the
environment, learn to distinguish patterns of interest, and reason about the categories of
patterns.
The extracted features are arranged in matrix form such that columns specify the samples and
rows specify the feature types extracted from the sample in the corresponding column; that is, a
specific column vector contains all the feature descriptors of the sample in that column. These
column vectors, as a matrix, serve as input to the neural network classifier. For the shape
descriptor (FDs) the matrix is only a row vector, i.e., a single element (feature) represents
each sample. For color the matrix is 9 rows by 90 columns: the nine rows are the color features
described in section 3.4.4 and the ninety columns are the ninety samples used.
The target vectors were represented by binary digits: membership of a class is represented by
the binary digit 1 and non-membership by 0. For shape categorization into three groups (round,
irregularly round and star-shaped), the target classes were represented by 001, 010 and 100,
respectively; each bit indicates whether the feature data set is a member of the class or not.
For example, in (001), the 1 indicates that the feature belongs to a round flower. When
classifying each flower by its own unique shape feature, rather than categorizing into the
round, irregularly round and star-shaped groups, there are 18 target classes, because each
variety has its own class. Input and target sets for the color features are designed in a
similar manner.
Arranging the binary digits as column vectors, the neural network relates the features in the
input vectors with the classes assigned in the target vectors. For color-based classification,
and for classification based on the combination of color and shape, the binary target vectors
have 18 bits. This is shown in Table 3.2.
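This one-bit-per-class encoding can be sketched as follows (a Python illustration of the Table 3.2 scheme; the MATLAB implementation differs only in syntax):

```python
import numpy as np

def one_hot_targets(labels, n_classes=18):
    """Binary target matrix: one column per sample, one row per class,
    with a single 1 marking the class each sample belongs to."""
    t = np.zeros((n_classes, len(labels)), dtype=int)
    t[labels, np.arange(len(labels))] = 1
    return t

# 90 samples, five consecutive samples per variety -> labels 0..17
labels = np.repeat(np.arange(18), 5)
T = one_hot_targets(labels)
```

Every column sums to 1, so each sample belongs to exactly one target class, matching the 18-bit vectors of Table 3.2.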
Table 3.2. Binary digit representation of 18 target classes
In this study the result of classification is displayed as a confusion matrix (confusion table),
an output tool of the artificial neural network classifier. It displays correct classifications
along the diagonal, highlighted in green; misclassifications below and above the diagonal in
boxes highlighted in red; and the cumulative result in the bottom-right box (at the end of the
diagonal), highlighted in blue.
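The content of such a matrix is straightforward to compute; the following Python sketch (illustrative only, with made-up labels) uses the same orientation as the tables in this thesis, rows for predicted classes and columns for target classes:

```python
import numpy as np

def confusion_matrix(targets, outputs, n_classes):
    """Rows = predicted (output) class, columns = target class; the
    diagonal counts correct classifications."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, o in zip(targets, outputs):
        cm[o, t] += 1
    return cm

y_true = [0, 0, 1, 2, 2]   # invented example labels
y_pred = [0, 1, 1, 2, 2]
cm = confusion_matrix(y_true, y_pred, 3)
accuracy = np.trace(cm) / cm.sum()   # the cumulative bottom-right figure
```

Here one sample of class 0 was misclassified as class 1, so the off-diagonal cell cm[1, 0] is 1 and the overall accuracy is 4/5.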
4. RESULTS AND DISCUSSIONS
This chapter describes the extracted features of roses, classification, and discusses about the
findings of the work. As described in section 3.4, the classification of rose flower varieties has
generally four components. They are image acquisition, image segmentation, feature extraction
and pattern classification. Therefore, in this chapter the results of image segmentation, feature
extraction and classification results are presented and discussed.
4.1. Shape Features
As mentioned in section 3.4.1, shape analysis refers to the process of extracting shape
descriptors, which are Fourier coefficients. In this process, the RGB (multi-spectral) images
were first converted into gray images, and these gray images were then converted into binary
(black and white) images using the MATLAB computer routine algorithms developed. From the
resulting binary images, closed boundary curves were extracted, and the signatures (1-D signals)
of these boundaries were then derived. This process is shown in figure 4.1 for the Marie-Clair
flower variety. Finally, a Fourier series expansion was applied to the signal representation,
and the Fourier coefficients were extracted automatically as shape descriptors using the
computer routine algorithm.
Figure 4.1. (a) RGB image, (b) gray image, (c) binary image, (d) boundary extracted from the
binary image and (e) signature of the flower.
The Fourier coefficients initially calculated were the real coefficients a_n and b_n in Eq. (3)
of section 3.4.1. These coefficients were combined to give the complex coefficients c_n of the
complex exponential form of the Fourier series representation in Eq. (11) of section 2.2.1.
These coefficients are unique for the signature representation of an image shape. The
equivalence between the real and complex forms of the Fourier series is shown in section 2.2.1.
The result of running the MATLAB code for the computation of the Fourier descriptors, along with
the extracted boundary and corresponding signature, is shown in figure 4.2 for the flower
variety of figure 4.1. The MATLAB code used for the overall processes shown in figures 4.1 and
4.2 is given in Appendix C.
The boundary and the signature in the first row of figure 4.2a are the actual boundary and its
corresponding signature. The images in the other rows are the boundary shapes and corresponding
signatures that result from reconstruction using the first zero, five and twenty Fourier
descriptors. Figures 4.2b and c show the FDs resulting from the computation and the
corresponding graphic representation, respectively; in this representation c(k) are the Fourier
descriptors and k is the order.
Figure 4.2. (a) Reconstruction of shape from the first zero, five, and twenty FDs for the
Marie-Clair flower variety; (b) the corresponding FDs and d.c. component (which represents the
center of the image in the frequency domain); (c) the graphic representation of the FDs.
When the shape of a flower is reconstructed from different numbers of Fourier descriptors (for
instance the first 5, 20 or 10000), using a large number gives a good approximation of the
shape: the reconstructed shape and signature nearly fit the actual shape and signature of the
flower image. This shows that reconstruction from a large number of Fourier descriptors
approximates the actual shape well, and the more descriptors used, the better the result, as can
easily be seen from figure 4.2a. We therefore conclude that Fourier coefficients are good
descriptors of flower shape.
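The truncated reconstruction just described can be sketched with an inverse FFT (a Python illustration; the thesis performed this in MATLAB, and the test signature below is invented):

```python
import numpy as np

def reconstruct(signature, K):
    """Rebuild a signature from only its d.c. term and first K harmonics:
    coefficients above order K (and their conjugate pairs, which keep the
    result real) are zeroed before the inverse FFT."""
    c = np.fft.fft(signature)
    c_trunc = np.zeros_like(c)
    c_trunc[0] = c[0]
    if K > 0:
        c_trunc[1:K + 1] = c[1:K + 1]
        c_trunc[-K:] = c[-K:]
    return np.fft.ifft(c_trunc).real

theta = np.linspace(0, 2 * np.pi, 256, endpoint=False)
sig = 3 + 0.5 * np.cos(2 * theta) + 0.2 * np.cos(5 * theta)
```

Keeping all harmonics actually present in the signature (here, up to order 5) recovers it exactly, while keeping only two harmonics loses the order-5 detail, mirroring how low-order FDs capture gross shape and high-order FDs capture fine detail.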
The FDs are translation invariant, but they are not rotation or scale invariant (Gonzalez and
Woods, 2002; Milan et al., 2004; Mark et al., 2012). This means that if one translates the
flower (or its signature) from one position to another in a specified coordinate system and
computes the FDs, the resulting FDs will be the same as before the translation. Rotation
invariance can be achieved by fixing the starting point on the boundary of the flower; in this
work this was done during image capturing by positioning flowers of the same variety so that all
corresponding points on each flower occupied the same position. Scale invariance can be achieved
by normalizing (dividing) the resulting FDs by the d.c. component |c(0)|, which limits the size
of the FDs to a certain interval. This is shown for the flower above in figure 4.3.
Figure 4.3. Normalized FDs result for Marie-Clair flower variety shown in this section
As described in section 3.4.2, the higher-order components (FDs) increasingly describe detail,
as they are associated with higher frequencies, while the low-frequency components determine the
global (approximate) shape. This means that high-frequency components account for sharp points
on the boundary contour, and the sharp points in turn account for angle formation along the
boundary. Therefore, to describe these angles, six of the normalized highest-order coefficients
(FDs) are taken and summed to give the angle-sum φ formed along the boundary of the flower
shape, as described in section 3.4.3. Only the six highest-order FDs are used in the expression
for the angle formed, in order to avoid the accumulation of computational error.
In this study, rose flowers whose angle value φ (as described in section 3.4.3) falls below the
lower cut-off are grouped as round, and those above the upper cut-off are grouped as
star-shaped. All the remaining flowers are categorized as irregularly round.
As discussed in chapter one, Zhenjiang et al. (2006) used the Fourier transform for flower shape
analysis. In their work they considered flowers having φ ≤ 0.4, 0.4 < φ < 1.6 and φ ≥ 1.6 as
round, irregularly round and star-shaped, respectively. They assigned these φ values
subjectively to the categories relative to the flower shapes in their country (China): the
smaller values of φ for their roses were ≤ 0.4 and the higher values were ≥ 1.6, so they
assigned the smaller values (≤ 0.4) to 'Round', the higher values (≥ 1.6) to 'Star-shaped' and
the remainder to 'Irregularly round'. In the present study the ranges of the φ values differ
slightly while the categories are the same, because relativity (of the flowers to each other) is
also considered in this work; i.e., relatively round, relatively star-shaped and similar flowers
each get their own category depending on the calculated values of φ (smaller values are
categorized as 'Round', intermediate values as 'Irregularly round' and larger values as
'Star-shaped').
The sum of the six highest-order FDs of the Marie-Clair variety (shown earlier in this section),
which represents the sum of the angles formed along the boundary of the flower, was first
calculated for a single image among the five images taken for that variety; it was then
calculated in the same manner for all sampled flower images and averaged for each variety. The
corresponding average values for the Happy-hour, Upper class, Esperance and Cartoon varieties
are 0.4752, 0.5224, 1.8621 and 2.7946, respectively, and the average values for all sample
varieties in this study are shown in Table 4.1. Taking the same number (six) of highest-order
normalized FDs for all roses, their sums show that Cartoon has the highest value, followed by
Esperance, so higher-order frequency components are more predominant in the Esperance and
Cartoon signatures than in the other flowers. Accordingly, Happy-hour and Upper-class are
categorized as round, and Esperance and Cartoon as star-shaped; the remaining flowers, including
the Marie-Clair variety, are categorized as irregularly round. Note that roundness and being
star-shaped are relative to the flowers on the farms from which the data were taken.
4.2. Color Features
In this research work, 9 color features were extracted. First, the color components (the three
color channels) of the RGB images were separated, and the mean, standard deviation and skewness
of these components were calculated. The nine extracted features are shown in Table 4.2 for each
flower variety, represented by its five samples. In this table, columns specify the samples and
rows the feature types, and the sample names are abbreviated in the top row as follows: H.H -
Happy-hour, U.C - Upper-class, La - Label, R.P - Red-parries, Sa - Samurai, N.J - N-Joe, M.C -
Mari-Clair, H.S - High-society, G.T - Good-times, H.M - High and magic, Du - Duet, At - Athena,
S.C - Sweet-Candia, Pr - Prima, Ma - Malibu, Ut - Utopia, Es - Esperance and Ca - Cartoon. The
positive integers (1 to 5) attached to the abbreviations identify each sample of a specific
variety; for instance, in H.H4 the number 4 identifies the fourth of the five samples of the
Happy-hour variety.
4.3. Experimental Results
In this work a total of 180 images were used, stored in JPEG (Joint Photographic Experts Group)
format with size 1944×2592 pixels; each dimension was divided by 6 to resize the images to
324×432 so as to fit the display window. One shape feature (the sum of the six highest-order
normalized FDs, i.e., the angle measurement value φ) and nine color features (described in
section 4.2) were extracted and used to train the network classifier.
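The divide-by-6 resize works out as 1944/6 = 324 and 2592/6 = 432. As an illustration only (the thesis does not state its resampling method, so the block averaging used here is an assumption):

```python
import numpy as np

def downscale(img, f=6):
    """Downscale an image by an integer factor f using block averaging:
    each f x f block of pixels is replaced by its mean."""
    h, w = img.shape[:2]
    return (img[:h - h % f, :w - w % f]
            .reshape(h // f, f, w // f, f, -1)
            .mean(axis=(1, 3)))

img = np.ones((1944, 2592, 3))   # dummy image with the thesis dimensions
small = downscale(img)           # -> 324 x 432 x 3
```

This reduces the pixel count by a factor of 36, which keeps subsequent boundary extraction and FFT computations manageable.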
Training and testing are the two basic phases of pattern classification. During training, the
connection weights of the neural network were initialized with random values; the samples in the
training set were input to the neural network classifier, and the connection weights were
adjusted according to the error back-propagation learning rule. This process was repeated until
the mean squared error (MSE) fell below a predefined tolerance level or the maximum number of
iterations was reached. In the validation phase, network generalization is assessed on further
unseen (untrained) data. In the testing phase, the trained system is applied to data it has
never seen, to check the performance of the classification. In this work 70%, 15% and 15% of the
data were used for training, validation and testing, respectively. The experimental scenarios
are as follows:
Scenario-1: Shape-based classification
In this scenario, one shape feature, the angle measurement φ, is used as input to the network,
and three classes are assigned as targets: as described in the later part of section 4.1, these
are round, irregularly round, and star-shaped. Therefore, the input and output layers have 1 and
3 neurons, respectively. The number of neurons in the hidden layer is 10, the default value,
which was chosen because the classification is more accurate with this value than with others.
Figure 4.4 shows the network architecture and the number of neurons in each layer.
Figure 4.4 Neural network architecture and number of neurons in each layer of scenario-1
The confusion matrix is one of the output tools of the ANN and shows correct classifications and
misclassifications of the given patterns. For this specific scenario the confusion matrix is
shown in figure 4.5.
Figure 4.5 Confusion matrix of scenario-1.
As shown in figure 4.5, in the artificial neural network classification using the shape feature
φ, out of the total of 62 training samples, 2 samples (3.2%) of the irregularly round class were
misclassified as round while 48 samples (77.4%) were correctly classified; similarly, 2 samples
(3.2%) of the round class were misclassified as irregularly round and 5 samples (8.1%) were
correctly classified. For the validation and test sets all samples were correctly classified.
The overall classification performance can be seen from the 'All confusion matrix' in the
bottom-right corner of figure 4.5. Of the 90 total samples, 70 had been assigned to one class
(irregularly round) according to their angle measurement value; when the classifier was applied,
68 of these 70 samples were correctly classified as irregularly round and 2 were misclassified
as round. The remaining 10 samples had been assigned to another class (round); after applying
the classifier, 8 of them were correctly classified and 2 were misclassified as irregularly
round. The misclassifications are due to some resemblance between the features describing the
flowers. From the confusion matrix, the overall classification is 95.6% accurate. This good
accuracy shows that the angle measurement, and hence the Fourier descriptors, are efficient
identifiers of shape.
Scenario-2: Color-based classification
In this scenario, the 9 color features described in section 4.2 are used as input to the
network, and 18 classes (the number of sample varieties in the study) are assigned as targets,
represented by binary digits as shown in Table 3.2. Therefore, the numbers of neurons in the
input, hidden and output layers are 9, 10 and 18, respectively; the hidden layer has 10 neurons
because the default setting was chosen. The network architecture and the number of neurons in
each layer are shown in figure 4.6.
When the ANN is used to classify all samples into their respective variety groups using their
feature descriptors, the output comprises eighteen classes. The confusion matrix, however,
cannot display all of the correct classifications and misclassifications on screen, because
there is not enough space to show all 18 × 18 results in matrix form. For this reason, the 'All
confusion matrix' at the bottom right of figure 4.7 is zoomed out in order to display the
results, and to make the results more observable the confusion matrix (confusion table) was also
constructed manually as Table 4.3. In this table the diagonal is highlighted for easy reference,
and as usual rows specify output (predicted) classes and columns specify target classes.
Counting rows downward from the uppermost row of the zoomed-out matrix, or of the manually
constructed table, the misclassification is reached at the tenth count: the box on the diagonal
in the 10th row contains 0 (0%) correct classifications, because the 5 samples of this class
were misclassified into the 12th row. From Table 3.2, these varieties are Good-Times and
Mari-Clair, respectively; the 5 samples of the Good-Times variety were therefore misclassified
as Mari-Clair, owing to some similarity between the feature descriptors of the flowers. In
figure 4.8, the zoomed-out 'All matrix' of figure 4.7 is rotated so that the final performance
can be seen clearly: 94.4% of the samples are correctly classified and 5.6% are misclassified.
Figure 4.6 Neural network architecture and number of neurons in each layer of scenario-2
Figure 4.7 Confusion matrix of experimental scenario-2
Table 4.3 Manually constructed table for ‘All confusion matrix’ of figure 4.7.
Each of the 18 target classes contains 5 samples (5.6% of the 90 samples). Every diagonal cell
of the table holds 5 correct classifications (100% for its class) except the 10th (Good-Times),
which holds 0 (0%): its 5 samples appear in the 12th row (Mari-Clair), so that row shows a 50%
hit rate (5 correct, 5 misclassified). The bottom-right cell gives the overall result: 94.4%
correct classification and 5.6% misclassification.
Figure 4.8 Rotated confusion matrix of color based classification in scenario-2
Scenario-3: Classification based on combined shape and color features
In this experimental scenario the combination of shape and color features is used. To include
shape among the combined features, the individual shape feature φ of every variety is
considered, rather than the round, irregularly round and star-shaped groups. The target classes
are therefore related to the combined shape and color features.
As in scenario-2, there are 18 target classes, because the flower varieties are the same in
number and type in both cases and the purpose is to classify the samples into their respective
variety classes. The only difference is the number of features input to the network: in
scenario-2 there were 9 features, whereas in the present case there are 10, the nine color
features plus one shape feature. As shown in figure 4.9, the numbers of neurons in the input,
hidden and output layers are 10, 10 and 18, respectively.
As in scenario-2, the confusion matrix in figure 4.10 is zoomed out in order to see the
classification results, and for the same reason the zoomed-out confusion matrix is reconstructed
manually as Table 4.4. Counting rows from the top of the zoomed-out matrix or of Table 4.4, on
the 6th row 4 (4.4%) of the 5 (5.56%) samples of the class are correctly classified. According
to Table 3.2 the variety in this class is Sweet-Candia, and one of its samples is misclassified
into the 7th row, which according to Table 3.2 is Athena; this is the only misclassification in
the scenario. The confusion matrix in figure 4.11 is the rotated matrix of figure 4.10, shown to
visualize the overall classification results clearly: 98.9% correct classification and 1.1%
misclassification.
Figure 4.9 Neural network architecture and number of neurons in each layer of scenario-3
Figure 4.10 The zoomed out confusion matrix of scenario-3
Table 4.4 Manually constructed table for ‘All confusion matrix’ of figure 4.10.
Each of the 18 target classes contains 5 samples (5.6% of the 90 samples). Every diagonal cell
holds 5 correct classifications (100% for its class) except the 6th (Sweet-Candia), which holds
4 (4.4%); the remaining sample appears in the 7th row (Athena, 1.1%), so that row shows an 83.3%
hit rate. The bottom-right cell gives the overall result: 98.9% correct classification and 1.1%
misclassification.
Figure 4.11 Rotated confusion matrix of scenario-3
5. SUMMARY, CONCLUSION AND RECOMMENDATIONS
5.1. Summary
In recent years rose flowers have become a commercial commodity that plays a major role in foreign currency earnings among the export commodities of Ethiopia. The sub-sector is receiving governmental and non-governmental attention due to its significance in commercial activities. Among other issues, data processing and management is a concern in any business, including the floriculture industry. Hence, each rose flower variety has to be identified in some consistent manner.
In line with this, to discriminate different varieties of roses, we selected image analysis, a technology which has recently found application in many different areas. For the classification problem, shape and color features were extracted using image analysis techniques and used as inputs to the neural network classifier.
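As an illustration of the color part of the feature vector, the first three statistical moments per RGB channel can be computed as below. This is a hedged Python sketch (the thesis implementation was in MATLAB); the skewness formula, the synthetic image and the placeholder shape value are assumptions for illustration only.

```python
import numpy as np

def color_moments(rgb):
    """Mean, standard deviation and skewness of each RGB channel:
    nine color features in total."""
    feats = []
    for ch in range(3):
        v = rgb[..., ch].astype(float).ravel()
        mean, std = v.mean(), v.std()
        skew = np.cbrt(((v - mean) ** 3).mean())  # cube root of 3rd central moment
        feats += [mean, std, skew]
    return np.array(feats)

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(4, 4, 3))  # stand-in for a segmented rose region

color = color_moments(img)            # 9 color features
shape = np.array([0.37])              # placeholder FD-based shape feature
x = np.concatenate([color, shape])    # a 10-dimensional input vector
print(x.shape)                        # (10,)
```

The nine color moments correspond to the color inputs of scenario-2; appending one shape feature gives the 10-dimensional input of scenario-3.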
The experiments were conducted under three scenarios using shape features, color features, and the combination of shape and color features. The results of these experiments show that the combination of shape (described by FDs) and color features has more discriminating power for classifying roses.
5.2. Conclusion
In general, rose flowers of different varieties can be classified using image analysis. The experimental scenarios show that the 18 rose flower varieties considered in this work were identified with classification accuracies of 95.6%, 94.4% and 98.9% using shape, color, and combined shape and color features, respectively. From these results we conclude that Fourier coefficients (FDs) describing shape, together with statistical color moments, are relatively efficient in discriminating rose flowers.
5.3. Recommendation
In Ethiopia, no research has been done in other agricultural areas such as maturity (ripeness) identification of fruits like oranges, bananas and mangoes using image analysis techniques, especially color analysis. Hence, other researchers are encouraged to work in these areas.
6. REFERENCES
Abdul, K., Lukito, E.N., Adhi, S. and Paulus, I. S., 2011. Foliage plant retrieval using polar
Fourier transform, colour moments and vein features. Department of Electrical
Engineering, Gadjah Mada University, Yogyakarta, Indonesia.
Attia, J., 1999. Electronics and circuit analysis using MATLAB. CRC Press LLC, Boca Raton, London.
Boggess, A. and Narcowich, F., 2009. A course in wavelets with Fourier analysis. Prentice Hall, Upper Saddle River, NJ.
Direkoğlu, C. and Nixon, M., 2010. Shape classification via image-based multiscale description. Pattern Recognition, School of Electronics and Computer Science, University of Southampton, SO17 1BJ, UK.
Chris, S. and Toby, B., 2011. Fundamentals of digital image processing. School of Physical
Sciences, University of Kent, Canterbury, UK (pp 114-119)
Elali, T., 2005. Discrete systems and digital signal processing with MATLAB. CRC Press, Boca Raton, London. (p 143)
Furtado, J.J., Cai, Z. and Liu, X., 2010. Digital image processing: supervised classification using
genetic algorithm in MATLAB toolbox. China University of geosciences, 388 LuMo road,
Wuhan, Hubei, P.R. China.
Gonzalez, R.C. and Woods, R.E., 2002. Digital image processing, second edition. MedData Interactive, University of Tennessee.
Gopi, E.S., 2007. Algorithm collections for digital signal processing applications using
MATLAB. National Institute of Technology, Tiruchi, India (pp 24-32)
Habtamu Minassie, 2008. Image Analysis for Ethiopia Coffee Classification. MSc. Thesis,
Addis Ababa University, Addis Ababa, Ethiopia.
Jeffrey, A., 2002. Advanced engineering mathematics. Harcourt/Academic Press, San Diego.
Jonathan, M. B., 2005. Digital image processing, Mathematical and Computational methods,
Loughborough University, England.
Karris, S., 2003. Signals and systems with MATLAB applications second Edition. Orchard
publications.
Kreyszig, E., 2006. Advanced engineering mathematics 9th Edition. Ohio State University
Columbus, Ohio (p 478)
Mandal, M. and Asif, M., 2007. Continuous and discrete time signals and systems. Cambridge
University Press. Cambridge. (p 193).
Mark, S. N. and Alberto S. A., 2012. Feature Extraction & Image Processing for Computer
Vision, Third edition. Elsevier (pp 349-367)
Milan, S., Ioannis, K. and Roger, B., 2004. Computer Vision and Mathematical Methods in
Medical and Biomedical Image Analysis, University of Iowa, Iowa City (pp 339-341)
Musoko, V., 2005. Biomedical signal and image processing. Ph.D. Thesis. Prague.
Orfanidis, S., 2010. Introduction to signal processing. Prentice Hall, Inc. (p 464)
Saitoh, T. and Kaneko, T., 2003. Automatic Recognition of Wild Flowers. Department of Information and Computer Sciences, Toyohashi University of Technology, Toyohashi, 441-8580, Japan.
Shapiro, L.G. and Stockman, G.C., 2001. Computer Vision, Prentice Hall.
Seemann, T., 2002. Digital Image Processing using Local Segmentation. PhD thesis, School of Computer Science and Software Engineering, Monash University, Australia.
Shin, K. and Hammond, J., 2008. Fundamentals of Signal Processing for Sound and Vibration Engineers. John Wiley & Sons Ltd.
Acharya, T. and Ray, A.K., 2005. Image Processing: Principles and Applications. Avisere, Inc., Tucson, Arizona and Department of Electrical Engineering, Arizona State University, Tempe, Arizona. John Wiley. (pp 171-178)
UPOV, 1990. Guidelines for the conduct of tests for distinctness, homogeneity and stability-rose,
international union for the protection of new varieties of plants. Geneva, Switzerland
Vincent, L.M. and Pavlidis, T., 1994. Document Recognition, Proceedings of the SPIE 2181.
Zhang, D.S. and Lu, G., 2002. Generic Fourier descriptors for shape-based image retrieval, IEEE
International Conference on Multimedia and Expo 1.
Zhang, D. and Lu, G., 2003. Evaluation of MPEG-7 shape descriptors against other shape
descriptors. Gippsland School of Computing and Information Technology, Monash
University, Churchill, Victoria 3842, Australia.
Zhenjiang, M., Gandelin, M.H. and Baozong, Y., 2006. A new shape analysis approach and its application to flower shape analysis. Institute of Information Science, Beijing Jiaotong University, Beijing, People's Republic of China. Image and Vision Computing 24:1115-1122.
Zlotnick, A. and Carnine, P.D., 1993. Finding road seeds in aerial images. Image Understanding. IBM Israel Science & Technology Ltd.
7. APPENDICES
APPENDIX A
APPENDIX B
APPENDIX C
A=imread('DSCN1625.JPG');
b=imresize(A,[386 574]);
c=rgb2gray(b);
f=imadjust(c);
bw=im2bw(f,0.1);
bw2 = bwareaopen(bw,10);
% close small gaps in the flower boundary
se = strel('disk',2);
bw3 = imclose(bw2,se);
% fill any holes, so that regionprops can be used to estimate
% the area enclosed by each of the boundaries
img = imfill(bw3,'holes');
n=size(img);
t=img;
% edge detection using the Roberts gradient
a=double(img);
for x=2:n(1)-1
for y=2:n(2)-1
c=abs(a(x+1,y+1) - a(x,y))+ abs(a(x+1,y) - a(x,y+1)) ;
t(x,y)=c;
end
end
[i,j]=find(t>0); xcent=mean(j); ycent=mean(i); %Calculate centroid
hold on; plot(xcent,ycent,'ro');
subplot(4,2,1),imagesc(t); axis image; axis off; colormap(gray) %Plot perimeter
[th,r]=cart2pol(j-xcent,i-ycent); %Convert to polar coordinates
subplot(4,2,2); plot(th,r,'k.'); axis on; %Plot signature
%Calculate Fourier series of boundary
N=0;L=2.*pi;f=r;M=length(f); %N=0 - DC term only
[a,b,dc]=fcalculate_1D(f,L,N); %Calculate expansion coeffs
fapprox=fdevelop_1D(a,b,dc,M,L); %Build approximate function using the coeffs
[x,y]=pol2cart(th,fapprox);
x=x+xcent; x=x-min(x)+1; y=y+ycent; y=y-min(y)+1; %Convert back to cartesian coordinates
prm=zeros(round(max(y)),round(max(x))); i=sub2ind(size(prm),round(y),round(x)); prm(i)=1;
subplot(4,2,3); imagesc(prm); axis image; axis ij; axis on; colormap(gray); %Display resulting 2D boundary
subplot(4,2,4); plot(th,fapprox,'k.'); axis on; %Display corresponding signature
s=sprintf('Approximation %d terms',N); title(s);
N=5;L=2.*pi;f=r;M=length(f); %Repeat for N=5 terms
[a,b,dc]=fcalculate_1D(f,L,N); %Calculate expansion coeffs
fapprox=fdevelop_1D(a,b,dc,M,L); %Build approximate function using the coeffs
[x,y]=pol2cart(th,fapprox);
x=x+xcent; x=x-min(x)+1; y=y+ycent; y=y-min(y)+1;
prm=zeros(round(max(y)),round(max(x))); i=sub2ind(size(prm),round(y),round(x)); prm(i)=1;
subplot(4,2,5); imagesc(prm); axis image; axis ij; axis on; colormap(gray); %Display resulting 2D boundary
subplot(4,2,6); plot(th,fapprox,'k.'); axis on; %Display corresponding signature
s=sprintf('Approximation %d terms',N); title(s);
N=20;L=2.*pi;f=r;M=length(f); %Repeat for N=20 terms
[a,b,dc]=fcalculate_1D(f,L,N); %Calculate expansion coeffs
fapprox=fdevelop_1D(a,b,dc,M,L); %Build approximate function using the coeffs
[x,y]=pol2cart(th,fapprox); %Convert back to cartesian coordinates
x=x+xcent; x=x-min(x)+1; y=y+ycent; y=y-min(y)+1;
prm=zeros(round(max(y))+10,round(max(x))+10); i=sub2ind(size(prm),round(y),round(x)); prm(i)=1;
subplot(4,2,7); imagesc(prm); axis image; axis ij; axis on; colormap(gray); %Display resulting 2D boundary
subplot(4,2,8); plot(th,fapprox,'k.'); axis on; %Display corresponding signature
s=sprintf('Approximation %d terms',N); title(s);
fd=sqrt(a.^2+b.^2)./2;
disp('fd =')
disp(fd)
disp('dc =')
disp(dc)
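For comparison, the Fourier-descriptor step above can be sketched in Python. Here `fourier_descriptors` is a hypothetical NumPy stand-in for the `fcalculate_1D` helper used in the MATLAB code (whose definition is not listed in this appendix), assuming a uniformly sampled signature over one period.

```python
import numpy as np

def fourier_descriptors(r, n_terms):
    """Fourier series of a polar boundary signature r(theta), sampled
    uniformly over one period, with fd = sqrt(a^2 + b^2)/2 as in the
    MATLAB code above."""
    M = len(r)
    t = np.arange(M) * 2 * np.pi / M
    dc = r.mean()
    k = np.arange(1, n_terms + 1)[:, None]
    a = 2 / M * (np.cos(k * t) * r).sum(axis=1)
    b = 2 / M * (np.sin(k * t) * r).sum(axis=1)
    return a, b, dc, np.sqrt(a**2 + b**2) / 2

# A circle of radius 3 with a small 4-lobed ripple: only the 4th
# harmonic carries energy, so fd[3] = 0.5/2 = 0.25 and dc = 3.
theta = np.arange(360) * 2 * np.pi / 360
r = 3 + 0.5 * np.cos(4 * theta)
a, b, dc, fd = fourier_descriptors(r, n_terms=5)
print(round(dc, 3), np.round(fd, 3))
```

Because the magnitudes sqrt(a^2 + b^2) discard the phase of each harmonic, the resulting descriptors are insensitive to where the boundary trace starts.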
APPENDIX D