Agriculture is the one sector that meets the food needs of the entire human race. It
has played a key role in the development of human civilization. Plants exist everywhere we live,
as well as in places where we do not. Plant disease is one of the main causes of reduced quantity and degraded
quality of agricultural products, and it has become a serious threat because the resulting losses
in both quality and quantity can be significant.

Images are an important form of data in the biological sciences. Until recently, photography was
the only method of reproducing and reporting such data, and photographic data are difficult to quantify or treat
mathematically. Digital image processing and image analysis, built on advances in
microelectronics and computing, circumvent these problems of traditional photography. Digital
image processing is the use of computer algorithms to process digital images. As a
subfield of digital signal processing, it has many advantages over analog
image processing: it allows a much wider range of algorithms to be applied to the input data and can avoid
problems such as the build-up of noise and signal distortion during processing. Since images are defined over
two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional
systems.

This new tool helps to improve images from the microscopic to the telescopic range and also offers
scope for their analysis. It therefore has many applications in biology. However, as is the case with any new
technology, imaging technology has to be optimised for each application, since what each user is looking
for in an image is quite specific.
Images of leaves captured by a camera or a scanner can be used in colour image analysis to estimate
the normal leaf area, the infected leaf area and the chlorophyll content. A viral or fungal attack on plants often
results in the degradation of chlorophyll pigments in the leaves. Such infected leaves have patches of green and
yellow colour. In plant breeding, it is important to quantify leaf infection, and with colour image analysis the
extent of infection can be quantified without much effort.
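As a rough illustration of this colour-based quantification, the sketch below counts green and yellowed pixels and reports the infected share of the leaf. The thresholds (`background_min`, `green_margin`), the near-white scanner background and the toy pixel values are all assumptions made for the example, not values from this project.

```python
def classify_pixel(rgb, background_min=230, green_margin=30):
    """Label one RGB pixel as 'background', 'healthy' or 'infected'.

    Assumptions (illustrative only): the leaf is scanned against a
    near-white background, healthy tissue has green clearly above red,
    and chlorotic (yellowed) tissue has red close to or above green.
    """
    r, g, b = rgb
    if min(r, g, b) >= background_min:
        return "background"
    if g - r >= green_margin:
        return "healthy"
    return "infected"


def infection_percentage(pixels):
    """Return the infected share of the leaf area, in percent."""
    counts = {"background": 0, "healthy": 0, "infected": 0}
    for rgb in pixels:
        counts[classify_pixel(rgb)] += 1
    leaf_area = counts["healthy"] + counts["infected"]
    if leaf_area == 0:
        return 0.0
    return 100.0 * counts["infected"] / leaf_area


# Toy image: 2 background, 6 healthy and 2 yellowed pixels.
toy_pixels = (
    [(250, 250, 250)] * 2
    + [(40, 140, 50)] * 6
    + [(200, 190, 60)] * 2
)
print(infection_percentage(toy_pixels))
```

In practice the thresholds would have to be calibrated for the camera or scanner and the lighting in use.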
Plant leaf colour is also commonly used as an indication of health status of plants. The loss of
chlorophyll content of leaves occurs due to nutrient imbalance, excessive use of pesticides, environmental
changes and ageing.
In this project, we develop software for the automatic identification and classification of plant leaf
diseases that also provides advice for specific diseases. The end-user is the farmer. The software classifies the
plant leaves and stems at hand into infected and non-infected classes. It provides a fast
and accurate method in which leaf diseases are detected and classified using k-means based segmentation
and neural network based classification. The most common diseases seen in the leaves of tapioca and mango are
discussed here for this approach.
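The k-means segmentation step mentioned above can be sketched as follows. This is a minimal, pure-Python version of Lloyd's algorithm applied to pixel colours; the deterministic choice of initial centres, the function names and the toy pixels are assumptions made for illustration and are not taken from the project's implementation.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=20):
    """Cluster `points` into k groups; returns (labels, centres).

    Initial centres are taken at evenly spaced positions in the input,
    an illustrative, deterministic choice.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    step = len(points) // k
    centres = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centre.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centres[c]))
            for p in points
        ]
        # Update step: each centre moves to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centres.append(
                    tuple(sum(ch) / len(members) for ch in zip(*members))
                )
            else:
                new_centres.append(centres[c])  # keep an empty cluster's centre
        if new_centres == centres:
            break  # converged
        centres = new_centres
    return labels, centres


# Toy pixels: three greenish (healthy-looking) and three brownish ones.
pixels = [
    (40, 140, 50), (45, 150, 55), (42, 145, 52),
    (120, 80, 40), (125, 85, 45), (118, 78, 38),
]
labels, centres = kmeans(pixels, k=2)
print(labels)
```

Each cluster label can then be mapped back to pixel positions to obtain a segmented image; deciding which cluster corresponds to diseased tissue is a separate step.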

Diseases occur in plants under the influence of various factors—pathogens and unfavorable
environmental conditions—and are manifested in the disturbance of functions (photosynthesis, respiration,
synthesis of tissue and growth substances, and the flow of water and nutritive substances), and the structure of
the organism, causing premature destruction of the plant or affecting some of its organs.

There is not yet a precise and comprehensive definition of plant diseases. In the early stages of the
development of phytopathology, any deviation from the normal condition of a plant was considered plant
disease. The inadequacy of this definition lay in the difficulty of distinguishing a normal (healthy plant) from
an abnormal (diseased plant) condition. The determination of the presence of a pathological process in a plant
organism made it possible to redefine plant disease in a new way and to conceive it not as a static condition but
as a dynamic process that arises and develops as a result of interaction of the plant with its environment.

Plant diseases diminish yields and impair the quality of plant production. For instance, in years
favorable to the spread of phytophthora infection of potatoes, the yield of tubers is decreased by 15 to 20
percent and, in some regions, by 50 percent or more.

More than 30,000 separate plant diseases are known. They are classified by symptoms or types
(pathographic classification), by the plants affected (plant-growing classification), and by causes, or
causative agents, of the disease (etiological classification). The last, according to which plant diseases are
divided into noninfectious and infectious, plays a leading role.

Noninfectious plant diseases are caused mainly by abiotic factors in the environment: disruptions in
the regime of mineral feeding, most often by a deficiency (rarely, a unilateral excess) of macroelements
(nitrogen, phosphorus, potassium, and magnesium) or a deficiency of microelements, especially boron, zinc,
iron, copper, and molybdenum; an unfavorable water regime (deficiency or excess of water in the soil,
prolonged rains, or high relative humidity of the air), causing “bleeding” of plants, premature drying up,
premature withering of plants, or leaves falling under conditions of water deficiency; or the effects of high
or low temperatures on plants, abrupt changes in air and soil temperatures (freezing of shoots, frost cracks,
chilling of heat-loving plants in greenhouses and hotbeds or during irrigation of the soil with cold water, and
so forth). Causes of noninfectious plant diseases may be harmful impurities in the air and soil (blight and
falling of leaves from the effects of sulfur dioxide gas, for example, in the vicinity of metallurgical and
chemical plants); residual effects of certain herbicides carried into the soil; an unfavorable light regime,
mainly a deficiency of light in greenhouses and hothouses (chlorosis and lodgment or dwarfing with a
shortened day); ionizing radiation (alpha, beta, and gamma rays, X rays, and neutrons); or toxins excreted
into the soil by certain fungi (species of Fusarium, Botrytis, and so forth) and some higher plants.

Figure 2.1. Pathogen Life Cycle

Plant pathology (also phytopathology) is the scientific study of plant diseases caused by pathogens
(infectious diseases) and environmental conditions (physiological factors). Plant diseases may be broadly
classified into three types. They are bacterial, fungal and viral diseases. Some of the diseases are shown in
Figure 2.2.

Figure 2.2: Types of diseases


The majority of phytopathogenic fungi belong to the Ascomycetes and the Basidiomycetes. The fungi
reproduce both sexually and asexually via the production of spores and other structures. Spores may be spread
long distances by air or water, or they may be soil borne. Many soil-inhabiting fungi are capable of living
saprotrophically, carrying out part of their life cycle in the soil. These are known as facultative saprotrophs.

Fungal diseases may be controlled through the use of fungicides and other agricultural practices;
however, new races of fungi often evolve that are resistant to various fungicides.

Biotrophic fungal pathogens colonize living plant tissue and obtain nutrients from living host cells.
Necrotrophic fungal pathogens infect and kill host tissue and extract nutrients from the dead host cells. See
Powdery Mildew and Rice Blast images below.

(a) Powdery mildew, a Biotrophic Fungus (b) Rice blast, a necrotrophic fungus
Figure 2.3


Most bacteria that are associated with plants are actually saprotrophic, and do no harm to the plant
itself. However, a small number, around 100 known species, are able to cause disease. Bacterial diseases are
much more prevalent in sub-tropical and tropical regions of the world.

Most plant pathogenic bacteria are rod-shaped (bacilli). In order to colonize the plant they
have specific pathogenicity factors. Five main types of bacterial pathogenicity factors are known: cell
wall-degrading enzymes, toxins, effector proteins, phytohormones and exopolysaccharides.

Pathogens such as Erwinia use cell wall-degrading enzymes to cause soft rot. Agrobacterium changes
the level of auxins, a class of phytohormones, to cause tumours. Exopolysaccharides are produced by bacteria and
block xylem vessels, often leading to the death of the plant.

Bacteria control the production of pathogenicity factors via quorum sensing.

Figure 2.4 Vitis vinifera with "Ca. Phytoplasma vitis" infection


There are many types of plant virus, and some are even asymptomatic. Under normal circumstances,
plant viruses cause only a loss of crop yield. Therefore, it is not economically viable to try to control them, the
exception being when they infect perennial species, such as fruit trees.

Most plant viruses have small, single-stranded RNA genomes. These genomes may encode only three
or four proteins: a replicase, a coat protein, a movement protein that allows cell-to-cell movement through
plasmodesmata, and sometimes a protein that allows transmission by a vector.

Plant viruses must be transmitted from plant to plant by a vector. This is often by an insect (for example,
aphids), but some fungi, nematodes, and protozoa have been shown to be viral vectors.

Figure 2.5 Tobacco mosaic virus

Figure 2.6 Orchid leaves viral infections

The characteristics of the pathogenesis of plant diseases are determined first by the properties of the
causative agent, the susceptibility of the plants, and the coincident environmental conditions. Several principal
phases are distinguished in the pathogeneses of plant diseases: the preinfectious phase, infection, incubation
period, and postincubation phase. During the preinfectious phase, spores and other agents of infection fall on
leaves, flowers, fruits, and other organs with raindrops or dew. Under favorable conditions they are embedded
in plant tissues through the stoma or other paths, or they first germinate and propagate on moist, dead organic
substances found on living plants or in their immediate vicinity and then become embedded in living tissues. In
the phase of infection, the causative agent invades cells from the intercellular interstices and infects the plant.
In many infectious diseases, such as gray rot, there is another mechanism of infection. Infectious agents on dead
plant parts in contact with living organs of the plant begin to germinate under favorable conditions and secrete
toxins that penetrate the living cells of the plant, poison it, and kill or weaken it. Then hyphae grow in those
cells. The incubation period is the period of the latent development of pathological processes in the plant from
infection to the appearance of external symptoms. Its duration depends on the temperature and humidity of the
air and the resistance or susceptibility to disease of the plant or its separate organs. The postincubation phase is
characterized by the intensification of symptoms and growing intensity of infection. The causative agent
multiplies internally or on the surface of the diseased plant; infectious elements are spread through the air, by
raindrops, insects, or other means and may cause massive infection of plants. Plants have various defensive
reactions. In response to the implantation of the causative agent, the activity of the plant’s oxidative enzymes is
intensified, the quantity and activity of phytoncides are increased, cell walls become clogged and atrophy, and
the infected cells fall out together with the causative agent. As a result of this process, some groups of cells
around the primary focus of infection, and sometimes the whole plant, acquire increased resistance and become
a sort of barrier against the spread of the causative agent in the plant. If the causative agent cannot overcome
the resistance of the tissues, the disease is limited to a spot of chlorotic or atrophied tissue (necrosis).

In controlling plant disease, prophylactic measures are of decisive value. These include the creation of
the best conditions for the growth and development of agricultural crops, development of resistant varieties,
rational seed-growing, chemical treatment of seeds, and chemical treatments of vegetating plants such as
spraying and dusting. The treatment of diseased plants is also of considerable value—for example, restoration
of chlorotic trees, thermal disinfection of wheat and barley seeds infected with powdery mildew, heating of
tubers and slips, and grafting material infected with certain viruses. In order to prevent the transference of
causative agents of plant diseases from one country to another, quarantine measures are taken.

We now discuss tapioca and mango leaf diseases.


Tapioca is a starch extracted from cassava (Manihot esculenta). This species is native to Brazil but
was spread, in part by Portuguese explorers, throughout the Americas, Africa, the Philippines and most of the
West Indies, and is now cultivated worldwide. In India, the term "tapioca" is used for the root of the plant
(cassava) rather than the starch.

Figure 3.1 Tapioca

The name tapioca is derived from the word tipi'óka, the name for this starch in the Tupí language of
South America. This Tupí word refers to the process by which the starch is made edible. However, as the word
moved out of Brazil it came to refer to similar preparations made with other esculents. Tapioca is a staple food
in some regions and is used worldwide as a thickening agent, mainly in foods. Tapioca is gluten-free, almost
completely protein-free, and contains practically no vitamins.


An infected leaf has angular water-soaked spots along its veins, margin, and tip. The infected leaf blade
turns brown, with the typical water-soaked symptom at the leading edge of the brown patch. As the disease
develops further, the spots join together into large patches, killing the leaf blade as they expand. The leaf
eventually dries and falls.

Figure 3.2 Cassava Bacterial Blight


The damage is similar to that of cassava bacterial blight, in that the infected leaf has water-soaked angular
leaf spots that often extend along the veins, but without the formation of small secondary spots progressing into
the blighted areas. Larger areas of dead tissue develop on the leaf blade only when several angular spots join
together. As the infection matures, the center of each spot turns dark brown, is covered with small yellow
discharges and becomes surrounded by a narrow water-soaked line and a yellow ring.

Figure 3.3 Cassava Leaf Spot

The mango is a fleshy stone fruit belonging to the genus Mangifera, consisting of numerous tropical
fruiting trees in the flowering plant family Anacardiaceae. The mango is native to the Indian subcontinent from
where it was distributed worldwide to become one of the most cultivated fruits in the tropics. While other
Mangifera species (e.g. horse mango, M. foetida) are also grown on a more localized basis, Mangifera indica –
the 'common mango' or 'Indian mango' – is the only mango tree commonly cultivated in many tropical and
subtropical regions. It is the national fruit of India, Philippines and Pakistan. It is also the national tree of

Figure 3.4 Mango Leaf

3.2.1 GALL MIDGE (Procontarinia spp) :

Galls are abnormal growths of plant cells formed in response to egg-laying by adult insects or feeding
by immatures. Eggs are usually laid in actively growing plant tissue. The affected plant tissue quickly surrounds
the egg or immature insect, and protects and provides food for the gall-maker until it matures. Gall-makers may
live in individual or communal chambers inside the gall. The gall midge is a serious pest of mango in northern
India and is reported from Uttar Pradesh, Bihar and the Terai regions of northern India.

Figure 3.5 Gall Midge

Symptoms: The mango leaf gall midge produces wart-like galls on leaves, which reduce
photosynthesis and, if left uncontrolled, lead to leaf drop and lowered fruit production. It is spread by wind
currents and the movement of infested plant material. Midges are very small flies, 1-2 mm in length. The female
lays eggs into the tissue of young leaves, leaving a small reddish spot. The leaf tissue under the red spot becomes
swollen and soft. Gall formation begins within seven days, and galls attain a maximum diameter of 3-4 mm.

Adults usually emerge from the underside of the leaf, leaving the pupal skin protruding from the
emergence hole. Younger trees may die, while older trees fail to recover normal
growth after repeated attacks.

3.2.2 ANTHRACNOSE (Colletotrichum gloeosporioides):

Anthracnose, the most important mango disease, is caused by the fungus Colletotrichum
gloeosporioides. Flower blight, fruit rot, and leaf spots are among the symptoms of this disease. Infections on
the panicles (flower clusters) start as small black or dark-brown spots. These can enlarge, coalesce and kill the
flowers, greatly reducing yield. On leaves, anthracnose infections start as small, angular, brown to black spots
(Figure 3.6). If tissue is young when originally infected, spots can enlarge to form extensive dead areas.
Infections that begin in older leaves usually result in smaller lesions with a maximum diameter
of 1/4 inch (6 mm) that appear as glossy dark-brown to black angular spots.

Figure 3.6 Anthracnose infections start as small, angular, brown to black spots

Control: Trees may be sprayed twice with Bavistin (0.1%) at a 15-day interval during flowering to control
blossom infection. Spraying of copper fungicides (0.3%) is recommended for the control of foliar infection.


3.2.3 POWDERY MILDEW (Oidium sp.) :

Powdery mildew is caused by the fungus Oidium. Although a somewhat sporadic disease, it can cause
severe crop loss due to flower and panicle infection and subsequent failure of fruit set.

The diagnostic key in the identification of this disease is the appearance of a whitish, powdery growth
of the fungus on panicles and young fruit. Young infected fruit turn brown and fall. The white growth can also
be seen on the undersurface of young infected leaves. Severe infection of young leaves results in premature leaf
drop. On mature leaves, the spots turn purplish brown as the white fungal mass eventually disappears (Figure
3.7).

Figure 3.7 On mature leaves, powdery mildew eventually turns purplish brown.

Powdery mildew occurs in the spring and is particularly destructive in years when the weather is
cool and dry.

Control: The disease is controlled by fungicide treatment. The following three fungicide sprays, applied at
15-day intervals, are recommended for effective control of the disease:
• Wettable sulphur 0.2 per cent (2 g Sulfex per litre of water).
• Tridemorph 0.1 per cent (1 ml Calixin per litre of water).
• Dinocap 0.1 per cent (1 ml or 1 g Karathane per litre of water).

3.2.4 SOOTY MOULD (Meliola mangiferae) :

The disease is common in orchards where mealy bugs, scale insects and hoppers are not controlled
efficiently. In the field, the disease is recognised by the presence of a black velvety coating, i.e., sooty mould,
on the leaf surface. In severe cases the trees turn completely black due to the presence of mould over the entire
surface of twigs and leaves. The severity of infection depends on the honeydew secreted by these
insects. Honeydew secretions stick to the leaf surface and provide the necessary medium for fungal
growth. The fungus is essentially saprophytic and is non-pathogenic because it does not derive nutrients from
the host tissues. Although no direct damage is caused by the fungus, the photosynthetic activity of the leaf is
adversely affected due to blockage of stomata.

Figure 3.8 Sooty Mould affected Mango leaf

Control :
• Pruning of affected branches and their prompt destruction prevents the spread of the disease.
• Spraying of 2 per cent starch is found effective.
• It can also be controlled by a spray of Nottasul + Metacin + gum acacia (0.2% + 0.1% + 0.3%).
India is an agricultural country in which about 70% of the population depends on agriculture. Farmers
have a wide range of fruit and vegetable crops to choose from. However, cultivating these
crops for optimum yield and quality produce is highly technical, and it can be improved with technological
support. The management of perennial fruit crops requires close monitoring, especially for the management of
diseases that can significantly affect production and, subsequently, post-harvest life.
Periodic outbreaks of plant diseases can lead to large-scale death and famine. It is
estimated that the outbreak of helminthosporiosis of rice in north-eastern India in 1943 caused a heavy loss of
food grains and the death of a million people. Because the effects of plant diseases have been so devastating,
the cultivation of some crops has been abandoned. Plant disease losses in Georgia (USA) in 2007 were estimated
at approximately $653.06 million. No such estimate has been made for India, but the losses are believed to exceed
those of the USA because the preventive steps taken to protect our crops are not even one-tenth of those taken in
the USA.

In plants, disease is defined as any impairment of the normal physiological function of the plant that
produces characteristic symptoms. A symptom is a phenomenon that accompanies something and is regarded as
evidence of its existence. Disease is caused by a pathogen, which is any agent causing disease. In most
cases, pests or diseases are seen on the leaves or stems of the plant. Identifying the plant, its leaves and
stems, and determining the pest or disease present, the percentage incidence and the symptoms of the
attack therefore play a key role in the successful cultivation of crops. Diseases cause heavy crop
losses amounting to several billion dollars annually.
Disease management is a challenging task. Diseases are mostly seen on the leaves or stems of the plant.
Precise quantification of these visually observed diseases, pests and traits has not yet been studied because of the
complexity of the visual patterns. Hence there has been increasing demand for more specific and sophisticated
image pattern understanding.
In biological science, thousands of images are sometimes generated in a single experiment. These
images may be required for further studies such as classifying lesions, scoring quantitative traits and calculating
the area eaten by insects. Almost all of these tasks are processed manually or with separate software packages.
This is not only a tremendous amount of work but also suffers from two major issues: excessive processing time and
subjectivity arising from different individuals. Hence, to conduct high-throughput experiments, plant biologists
need efficient computer software to automatically extract and analyze significant content. Here image
processing plays an important role. This paper provides a wide survey of advances in the image
processing techniques used for studying plant diseases, traits and pests.

Brendon J. Woodford, Nikola K. Kasabov and C. Howard Wearing, in the paper titled “Fruit Image
Analysis using Wavelets”, proposed a wavelet-based image processing technique and a neural network to develop
a method for on-line identification of pest damage in pip fruit in orchards. Three pests that are prevalent in
orchards were selected as the candidates for this research: the leaf-roller, codling moth, and apple leaf curling
midge. A fast wavelet transform with a special set of Daubechies wavelets was used to extract the important
features. To retrieve the related images, the search is done in two steps. The first step matches the images by
comparing the standard deviations for the three color components. In the second step, a weighted version of the
Euclidean distance between the feature coefficients of an image selected in the first step and those of the
querying image is calculated, and the images with the smallest distances are selected and sorted as matching
images to the query.
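The two-step search described above can be sketched roughly as follows. The record layout (`pixels`, `features`), the tolerance on the standard deviations, the weights and the toy data are hypothetical; the wavelet feature extraction itself is not reproduced here, so the feature vectors are simply given.

```python
import math
import statistics


def channel_stds(pixels):
    """Population standard deviation of each colour channel."""
    return tuple(statistics.pstdev(channel) for channel in zip(*pixels))


def weighted_distance(u, v, weights):
    """Weighted Euclidean distance between two feature vectors."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))


def retrieve(query, database, weights, std_tolerance=10.0, top_n=3):
    """Two-step search; each image is a dict with 'pixels' and 'features'."""
    q_std = channel_stds(query["pixels"])
    # Step 1: keep images whose per-channel standard deviations are close
    # to those of the query (the tolerance is an assumed parameter).
    candidates = [
        (name, img)
        for name, img in database.items()
        if all(
            abs(s - q) <= std_tolerance
            for s, q in zip(channel_stds(img["pixels"]), q_std)
        )
    ]
    # Step 2: rank the survivors by weighted Euclidean feature distance.
    ranked = sorted(
        candidates,
        key=lambda item: weighted_distance(
            item[1]["features"], query["features"], weights
        ),
    )
    return [name for name, _ in ranked[:top_n]]


# Toy data: image "b" has a very different colour spread and is filtered out.
query = {"pixels": [(10, 10, 10), (30, 30, 30)], "features": [1.0, 2.0]}
database = {
    "a": {"pixels": [(12, 12, 12), (30, 30, 30)], "features": [1.5, 2.0]},
    "b": {"pixels": [(0, 0, 0), (100, 100, 100)], "features": [1.0, 2.0]},
    "c": {"pixels": [(10, 10, 10), (28, 28, 28)], "features": [1.0, 2.1]},
}
matches = retrieve(query, database, weights=[1.0, 1.0])
print(matches)
```

The cheap first step prunes the database so that the more expensive feature comparison runs only on plausible candidates.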
Mix & Pico compared a stereomicroscopic method with an image analysis method to assess the usefulness of image
analysis as an efficient and precise way to measure fruit traits such as size, shape and dispersal-related
structures. In general, fruit length obtained with image analysis was significantly greater than that recorded with a
stereomicroscope; only in some cases did the fruit length estimates not differ between the two methods.
Nevertheless, there was a highly significant correlation between the fruit length estimates obtained from both
methods for all species studied. This indicates that both stereomicroscopy and image analysis accurately
discriminated fruits of different sizes. It was concluded, however, that image analysis has the following
advantages: 1) the large number of fruit parameters obtained with a single measurement; 2) the minimization of
human error; 3) the reduction of the time needed to obtain large data sets on fruit trait variability; 4) the
possibility of estimating variability in traits of fruits with complicated shapes.
Pests leave distinctive outward effects on plants, such as rolling the leaves or destroying the whole plant.
Sucking pests reduce the moisture content of the leaves. All these effects change the chlorophyll content of
a plant, with a corresponding variation in its spectral image. In “Application of Remote Sensing in Pest Scouting:
Evaluating Options and Exploring Possibilities”, Ahsan and Umer studied the possibilities for detecting
these effects using remote sensing techniques that acquire spectral images from satellite imagery or
airborne images taken from chartered or model planes.
In 2004, Mohammad El-Helly, Ahmed Rafea, Salwa El-Gamal and Reda Abd El Whab (Central Laboratory for
Agricultural Expert Systems, CLAES) proposed a novel approach for integrating image analysis
techniques into a diagnostic expert system in “Integrating Diagnostic Expert System With Image Processing Via
Loosely Coupled Technique”. A CLAES diagnostic
model is used to manage the cucumber crop. The expert system identifies diseases from user observations. To
diagnose a disorder from a leaf image, four image processing phases are used: enhancement, segmentation,
feature extraction and classification. Three different disorders were tested: leaf miner, powdery mildew and
downy mildew. The proposed approach greatly reduced error-prone dialogue between system and user.
The morphological features of leaves are used for plant classification and in the early diagnosis of
certain plant diseases. In 2005, Panagiotis Tzionas, Stelios E. Papadakis and Dimitris Manolakis, in “Plant leaves
classification based on morphological features and fuzzy surface selection technique” (5th International
Conference on Technology and Automation, ICTA’05), presented the design and implementation of an artificial
vision system that extracts specific geometric and morphological features from plant leaves. The proposed
system consists of an artificial vision system (camera), a combination of image processing algorithms and a
feed-forward neural network based classifier. A fuzzy surface selection technique was used for feature selection.
In 2006, Rakesh & Amar proposed a prediction approach based on support vector machines for developing
weather-based prediction models of plant diseases. The performance of conventional multiple
regression, artificial neural networks (back propagation neural network, generalized regression neural network)
and support vector machines (SVM) was compared. It was concluded that the SVM-based regression approach
led to a better description of the relationship between environmental conditions and disease level, which
could be useful for disease management.
Prasad Babu & Srinivasa Rao proposed a back propagation neural network for the recognition of leaves in
2007. They reported that a back propagation network and the shape of the leaf image are enough to specify the
species of a leaf. Prewitt edge detection and a thinning algorithm are used to find leaf tokens as input to the back
propagation algorithm. It was reported that there is scope for enhancing this work through more
experimentation with large training sets, to recognize various leaves with pests or leaves damaged by insects
or diseases, and to develop an expert system.
A neural network approach for the segmentation of agricultural fields in remote sensing data was
proposed by Alexander A. Doudkin, Alexander V. Inyutin, Albert I. Petrovsky and Maxim E. Vatkin in 2007. A
neural network algorithm based on back propagation is used to segment color images of crop fields
infected by diseases that change the usual color of the plants.
Stephen Gang Wu, Forrest Sheng Bao, Eric You Xu, Yu-Xuan Wang, Yi-Fan Chang and colleagues implemented a leaf
recognition algorithm using easy-to-extract features and a highly efficient recognition algorithm in 2007. A
Probabilistic Neural Network (PNN) approach to plant leaf recognition is used. The features are extracted and
processed by PCA to form the input to the PNN. The algorithm was found to work with an accuracy of 90% on 32
kinds of plants.
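To make the PNN idea concrete, the following minimal sketch scores each class by the average Gaussian kernel response of its training examples and picks the highest score. It is not the cited authors' implementation: the PCA step is omitted, and the smoothing parameter `sigma`, the class names and the two-dimensional feature vectors are hypothetical.

```python
import math


def pnn_classify(sample, training, sigma=1.0):
    """Classify `sample` with a simple Probabilistic Neural Network.

    `training` maps a class label to a list of feature vectors. Each class
    score is the mean Gaussian kernel response over that class's examples
    (pattern and summation layers); the largest score wins (decision layer).
    """
    scores = {}
    for label, examples in training.items():
        responses = [
            math.exp(
                -sum((a - b) ** 2 for a, b in zip(sample, ex))
                / (2 * sigma ** 2)
            )
            for ex in examples
        ]
        scores[label] = sum(responses) / len(responses)
    return max(scores, key=scores.get)


# Hypothetical leaf features, e.g. (aspect ratio, form factor) per leaf.
training = {
    "species_a": [(1.0, 5.0), (1.2, 4.8)],
    "species_b": [(4.0, 1.0), (4.2, 0.9)],
}
print(pnn_classify((1.1, 5.1), training))
print(pnn_classify((3.9, 1.2), training))
```

A PNN needs no iterative training, since the training examples themselves form the pattern layer; the price is that classification time grows with the size of the training set.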
In 2008, M. T. Maliappis, K. P. Ferentinos, H. C. Passam and A. B. Sideridis described a system that
introduces computer management into the cultivation process in low-tech greenhouses. The proposed system is
implemented as a web-based application using open source technologies and subsystems comprising modules
that provide: 1) static information about the cultivation process and marketing of supported crops, 2) simulation
and forecast models of general interest, 3) a collaboration environment and 4) expert system capabilities and
support. The expert system is an adaptation of the VEGES expert system, used as a web-based application.
It can be used for the identification of pests, diseases and nutritional disorders.
Santanu & Jaya described a software prototype system in 2008 for disease detection based on
images of infected rice plants. They used image growing and image segmentation techniques to detect the
infected parts of the plants. A zooming algorithm is used to extract features of the images. A Self-Organizing
Map (SOM) neural network is used for classifying diseased rice images.
In 2008, Weizheng S., Yachun W., Zhanliang C. and Hongda W. developed a fast and accurate novel method,
based on image processing, for grading plant disease. They segmented the leaf region using
Otsu segmentation. Plant diseases are graded by calculating the quotient of the disease spot area and the leaf area.
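A minimal sketch of this grading idea is given below. It implements Otsu's threshold on grey levels and then computes the spot-to-leaf area quotient. Applying Otsu a second time inside the leaf to find the spots is an assumption made here for illustration and is not necessarily how the cited authors located the spots; the sketch also assumes a background brighter than the leaf and lesions darker than healthy tissue.

```python
def otsu_threshold(values, levels=256):
    """Return the grey level t that maximises between-class variance.

    Pixels with value <= t form one class, pixels > t the other.
    """
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    weight_low, sum_low = 0, 0
    for t in range(levels):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue  # no pixels in the low class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_var:
            best_t, best_var = t, variance
    return best_t


def disease_ratio(grey_pixels):
    """Quotient of disease-spot area and leaf area (0.0 if no leaf found)."""
    leaf_t = otsu_threshold(grey_pixels)
    leaf = [v for v in grey_pixels if v <= leaf_t]  # leaf darker than background
    if not leaf:
        return 0.0
    spot_t = otsu_threshold(leaf)
    spots = [v for v in leaf if v <= spot_t]  # lesions assumed darkest
    return len(spots) / len(leaf)


# Toy grey image: 4 background, 6 healthy and 2 lesion pixels.
toy_grey = [240] * 4 + [100] * 6 + [30] * 2
print(otsu_threshold(toy_grey), disease_ratio(toy_grey))
```

The resulting quotient could then be mapped to discrete disease grades using whatever grading scale applies to the crop in question.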
Grape leaf disease was detected from color imagery in 2008 by A. Meunkaewjinda, P. Kumsawat, K. Attakitmongcol and
A. Srikaew using a hybrid intelligent system. They used self-organizing maps and back
propagation neural networks to recognize the colors of grape leaves. This information is used to segment grape leaf
pixels within the image. Grape leaf disease segmentation is then performed using a modified self-organizing
feature map, with genetic algorithms for optimization and support vector machines for classification. The
segmented image is filtered using a Gabor wavelet, which allows the system to analyze leaf disease color features
more efficiently. Support vector machines are then applied to classify the types of grape leaf disease.
Ying & others studied methods of image preprocessing for recognition of crop diseases in 2008. They
used cucumber powdery mildew, speckle & downy mildews as study samples & reported comparative study of
effect of simple filter and median filter. They stated that Leaves with spots must be pre-processed firstly in
order to carry out the intelligent diagnosis to crop based on image processing and appropriate features should
be extracted on the basic of this. They reported following important image preprocessing methods:
1) Image clipping: separating the leaf with spots from the complex background.
2) Noise reduction: two filters, a simple filter and a median filter, were compared, and the median filter was
chosen to remove noise from the image.
3) Thresholding: segmenting or partitioning the image into spot and background.
In short, image pre-processing ensures that the subsequent extraction of characteristic parameters is not
affected by background, leaf shape and size, lighting or camera, and lays a good foundation for extracting
effective characteristic parameters for disease diagnosis as well as for setting up a pattern recognition system.
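The filter comparison in the noise-reduction step can be illustrated with a small sketch. It assumes that the "simple filter" is a 3x3 averaging (mean) filter, which the text does not state explicitly; the helper `filter3x3` is a hypothetical name, and border pixels are simply copied.

```python
from statistics import median


def filter3x3(image, reducer):
    """Apply `reducer` to every 3x3 neighbourhood; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = reducer(window)
    return out


def mean(values):
    return sum(values) / len(values)


# A uniform patch with a single impulse ("salt") noise pixel in the centre.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
median_out = filter3x3(noisy, median)  # centre becomes 10: the impulse is removed
mean_out = filter3x3(noisy, mean)      # centre becomes 335 / 9: the impulse is smeared
print(median_out[1][1], mean_out[1][1])
```

On impulse noise of this kind the median discards the outlier entirely while the mean only dilutes it, which is one plausible reason for preferring the median filter.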
In 2008, S.S. Abu-Naser, K. A. Kashkash and M. Fayyad designed and developed an expert system for
diagnosing plant diseases, presenting two different methods: 1) a step-by-step descriptive method and
2) a graphical representation method. It is reported that the expert system with the graphical representation is more
favourable. It was found that the graphical representation requires little description from users. The proposed
system saved a lot of time and effort in identifying plant diseases.
Image feature extraction is very important for the grading of flue-cured tobacco leaves. In
2008, Xinhong Zhang and Fan Zhang presented "Images Features Extraction of Tobacco Leaves", in which a system
based on machine vision techniques is proposed for the automatic inspection of flue-cured tobacco leaves.
Machine vision techniques are used in this system to solve problems of feature extraction and analysis of
tobacco leaves, covering features of color, size, shape and surface texture. The experimental results show
that this system is a viable way to extract features of tobacco leaves and can be used for their automatic
classification.
The paper "Computer Assistance Image Processing Spores Counting", by Xu Pengyun and Li Jigang (2009), presents
a method to monitor plant diseases caused by spores. The color image is first converted into a
gray image for analysis and processing, such as histogram generation, gray-level correction,
image feature extraction and image sharpening. In order to remove low-frequency components, the input
gray image is pre-processed by edge enhancement using the median filter and the Canny edge algorithm. After
thresholding, the binary image obtained is processed using morphological operations such as dilation, erosion
and opening. It was found that this method suits many tasks that involve counting or recognition under the
microscope, for example optical stripe counting, chromosome counting and the monitoring of other plant diseases.
For detecting rice disease early and accurately, Qing Yao, Zexin Guan, Yingfeng Zhou, Jian Tang,
Yang Hu, Baojun Yang presented an application of image processing techniques and Support Vector Machine
(SVM) for detecting rice diseases in 2009. Rice disease spots were segmented and their shape and texture
features were extracted. Because the color features are influenced largely by outside light, they selected shape
and color texture features of disease spot as characteristic values of classification. The SVM method was
employed to classify rice bacterial leaf blight, rice sheath blight and rice blast. The results showed that SVM
could effectively detect and classify these disease spots to an accuracy of 97.2%.
A method for fast and accurate detection and classification of plant diseases was proposed in 2009 by Di Cui,
Qin Zhang, Minzan Li, Youfu Zhao and Glen L. Hartman. They used Otsu segmentation, K-means clustering
and a back-propagation feed-forward neural network for clustering and classification of diseases that affect plant leaves.
A feasible method for detecting soybean rust and quantifying its severity is explored in "Detection of
soybean rust using a multispectral image sensor". Images of soybean leaves with different rust severity were
collected using both a multispectral CCD camera and a portable spectrometer. Three parameters, namely the ratio of
infected area, the lesion color index and the rust severity index, were extracted from the multispectral images
and used to detect leaf infection and the severity of infection.

An experiment was carried out by Helmi Zulhaidi Mohd Shafri and Nasrulhapiza Hamdan in 2009 on
oil palm trees, which require timely detection of disease because Ganoderma basal stem rot is present in
more than 50% of oil palm plantations. Airborne hyperspectral imaging can provide data according to user
requirements and can acquire data in narrow, contiguous spectral bands. This made it possible to differentiate
between healthy and diseased plants better than with multispectral imagery. It was found that airborne
hyperspectral imagery offers a better solution for detecting and mapping the oil palm trees affected by the disease.
The authors used vegetation indices and red edge techniques for this purpose and showed that the
red-edge-based techniques are more effective than vegetation indices.
A method for fast and accurate detection and classification of plant diseases was proposed in 2011 by
H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik and Z. AlRahamneh. They used Otsu segmentation, K-means
clustering and a back-propagation feed-forward neural network for clustering and classification of diseases
that affect plant leaves.

The underlying approach for all of the existing techniques of image classification is almost the same.
First, digital images are acquired from the environment around the sensor using a digital camera. Then
image-processing techniques are applied to extract useful features that are necessary for further analysis of these
images. After that, several analytical discriminating techniques are used to classify the images according to the
specific problem at hand. This constitutes the overall concept that is the framework for any vision-related
algorithm. Figure 5.1 depicts the basic procedure of the proposed vision-based detection algorithm in this paper.

Figure 5.1: The basic procedure of the proposed image processing-based disease detection solution
The first phase is the image acquisition phase. In this step, the images of the various leaves that are to
be classified are taken using a digital camera. In the second phase, image preprocessing is completed. In the third
phase, segmentation using K-means clustering is performed to discover the actual segments of the leaf in the
image. Next, feature extraction for the infected part of the leaf is performed based on specific properties
of the pixels in the image or on their texture. After this step, certain statistical analysis tasks are completed to
choose the best features that represent the given image, thus minimizing feature redundancy. Finally,
classification is completed using a neural-network-based algorithm.
The detailed step-by-step account of the image acquisition and classification process is shown in figure 5.2.
In the initial step, the RGB images of all the leaf samples were obtained. Some samples of diseased
leaf images are taken.

Figure 5.2: Image acquisition and classification

For each image in the data set, the subsequent steps were repeated. Image segmentation of the leaf is
done on each leaf sample image using K-means clustering. A sample clustered image with four clusters of
the leaf sample image is shown in figure 4.2. In this experiment, multiple values for the number of clusters were
considered. The best results were obtained when the number of clusters was 4.
Once the infected objects are determined, the image is converted from RGB format to HSI format.
The SGDM matrices are then generated for each pixel map of the image, for the H and S images only. The SGDM
is a measure of the probability that a given pixel at one particular gray level will occur at a distinct distance and
orientation angle from another pixel, given that that pixel has a second particular gray level. From the SGDM
matrices, the texture statistics for each image were generated.
A software routine was written in MATLAB that takes in .mat files representing the training and
test data, trains the classifier using the training files and then uses the test file to perform the classification
task on the test data. The MATLAB routine thus loads all the data files (training and test) and
modifies the data according to the chosen model.

We propose image-processing-based software for the automatic identification and classification of leaf
diseases. We test our software on six diseases that affect plants: cassava bacterial blight,
cassava leaf spot, phoma blight, bacterial canker, red rust and sooty mould. Identifying and recognizing
leaf diseases at an early stage is likely to give better performance and makes it possible to treat the diseases early.
Manual visual interpretation of plant diseases is both inefficient and difficult, and it requires a trained
botanist. A closer inspection of plant disease images reveals several difficulties for leaf
disease identification. The developed system classifies the leaves into infected and non-infected classes.
The proposed system can:
• Identify disease type
• Deal with other diseases
• Identify and classify diseases that infect plant leaves
• Provide advice to treat the diseases in its early stages
First, the images of various leaves are acquired using a digital camera. Then image-processing
techniques are applied to the acquired images to extract useful features that are necessary for further analysis.
The steps involved in recognition and classification are image acquisition, pre-processing, feature extraction,
segmentation and classification.

Algorithm : Basic steps describing the proposed algorithm.

• RGB image acquisition
• Create the color transformation structure
• Convert the color values in RGB to the space specified in the color transformation structure
• Apply K-means clustering
• Masking green-pixels
• Remove the masked cells inside the boundaries of the infected clusters
• Convert the infected cluster(s) from RGB to HSI
• Texture Statistics Computation
• Configuring Neural Networks for Recognition

Samples of different types of commercial crops, food grains, fruits and cereals, covering both healthy and
affected agriculture/horticulture produce, are collected for the present work using a digital camera.

The images obtained during image acquisition are usually not directly suitable for
identification and classification because of factors such as noise, lighting variations, climatic
conditions, poor image resolution and unwanted background. We wish to adopt the established
techniques and study their performance.


Figure 6.1: a) Input image infected by Bacterial Brown Spot b) Hue Component c) Saturation Component
d) Intensity Component

First, the RGB images of leaves are converted into the Hue Saturation Intensity (HSI) color space
representation. The purpose of a color space is to facilitate the specification of colors in some standard,
generally accepted way. The HSI (hue, saturation, intensity) color model is popular because it is based
on human perception [10]. Hue is a color attribute that refers to the dominant color as perceived by an observer.
Saturation refers to the relative purity, or the amount of white light added to the hue, and intensity refers to the
amplitude of the light. Color spaces can easily be converted from one to another. After the transformation
process, the H component is taken into account for further analysis. S and I are dropped since they do not give
extra information. Figure 6.1 shows the H, S and I components.


In this step, we identify the mostly green colored pixels. Then, based on a specified threshold value
that is computed for these pixels, the mostly green pixels are masked as follows: if the green component of the
pixel intensity is less than the pre-computed threshold value, the red, green and blue components of this
pixel are set to zero. This is done because the green colored pixels mostly represent the
healthy areas of the leaf and add no valuable weight to disease identification; masking them also
significantly reduces the processing time.


The pixels whose red, green and blue components are all zero are completely removed. This is helpful as it gives
more accurate disease classification and significantly reduces the processing time.


Figure 6.2 : An example of the output of K- means clustering of a leaf that is infected with early scorch
disease. (a) The infected leaf picture, (b, c, d, e) the pixels of the first, second, third and the fourth cluster
respectively and (f) a single gray-scale image with the pixels colored based on their cluster index.
K-means clustering is used to partition the leaf into four clusters, of which one or more contain
the disease; more than one cluster contains disease when the leaf is infected by more than one disease. The
K-means clustering algorithm tries to classify objects (pixels in our case) into K classes based on a set of
features. The classification is done by minimizing the sum of squares of distances between the objects and the
corresponding cluster or class centroid.
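The clustering step can be sketched in a few lines of plain Python. This is an illustration only: the initialisation (evenly spaced input points), the fixed iteration count, the two-component colour features and the names `kmeans` and `sq_dist` are assumptions, not details taken from the paper.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm on tuples of numbers; returns (labels, centroids).

    Assumes len(points) >= k.
    """
    # Deterministic initialisation for this sketch: k points spread evenly over the input.
    step = max(1, len(points) // k)
    centroids = [tuple(points[i * step]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins the cluster with the nearest centroid.
        for n, p in enumerate(points):
            labels[n] = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Toy "pixels" described by two colour features each (values are made up).
pixels = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(pixels, k=2)
print(labels, centroids)
```

For the leaf images described here, `k` would be 4 and each point would hold the colour features of one pixel; the resulting labels give the cluster index of every pixel.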


The method followed for extracting the feature set is called the color co-occurrence method, or CCM.
It is a method in which both the color and texture of an image are taken into account to arrive at unique
features that represent that image. Next, we explain this method in more detail.


The image analysis technique selected for this study is the CCM method. The use of color image
features in the visible light spectrum provides additional image characteristic features over the traditional
gray-scale representation.
The CCM methodology consists of three major mathematical processes. First, the RGB images of
leaves are converted into the HSI color space representation. Once this process is completed, each pixel map is used
to generate a color co-occurrence matrix, resulting in three CCM matrices, one for each of the H, S and I pixel maps.
Hue Saturation Intensity (HSI) is a popular color space because it is based on human color perception.
Hue is generally related to the wavelength of the light, intensity shows the amplitude of the light, and
saturation measures the colorfulness in HSI space.
Color spaces can easily be transformed from one to another. The following equations can be used to
transform the images from RGB to HSI.

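$$I = \frac{R + G + B}{3}$$

$$S = 1 - \frac{3 \min(R, G, B)}{R + G + B}$$

$$H = 2\pi - \cos^{-1}\left\{ \frac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right\}, \quad B > G$$

$$H = \cos^{-1}\left\{ \frac{\tfrac{1}{2}\left[(R - G) + (R - B)\right]}{\sqrt{(R - G)^2 + (R - B)(G - B)}} \right\}, \quad B \le G$$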
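As a rough illustration of the RGB to HSI transformation, the sketch below uses the standard textbook form of the conversion, with R, G and B assumed to be scaled to [0, 1] and hue returned in radians. The function name `rgb_to_hsi` and the handling of black and grey pixels (where saturation or hue is undefined) are assumptions made for the example.

```python
import math


def rgb_to_hsi(r, g, b):
    """Convert R, G, B in [0, 1] to (H in radians, S in [0, 1], I in [0, 1])."""
    total = r + g + b
    intensity = total / 3.0
    if total == 0:
        return 0.0, 0.0, 0.0  # black pixel: hue and saturation set to 0 by convention
    saturation = 1.0 - 3.0 * min(r, g, b) / total

    numerator = 0.5 * ((r - g) + (r - b))
    # max() guards against tiny negative values caused by floating-point rounding.
    denominator = math.sqrt(max(0.0, (r - g) ** 2 + (r - b) * (g - b)))
    if denominator == 0:
        hue = 0.0  # grey pixel: hue is undefined, set to 0 by convention
    else:
        ratio = max(-1.0, min(1.0, numerator / denominator))
        theta = math.acos(ratio)
        hue = theta if b <= g else 2.0 * math.pi - theta
    return hue, saturation, intensity


for name, rgb in [("red", (1, 0, 0)), ("green", (0, 1, 0)), ("blue", (0, 0, 1))]:
    h, s, i = rgb_to_hsi(*rgb)
    print(name, round(math.degrees(h)), s, round(i, 3))
```

Pure red, green and blue land at hues of 0, 120 and 240 degrees respectively, which is a quick sanity check on the two branches of the hue formula.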

The color co-occurrence texture analysis method is developed through the Spatial Gray-level
Dependence Matrices (SGDM). The gray-level co-occurrence methodology is a statistical way to describe shape
by statistically sampling the way certain gray levels occur in relation to other gray levels.
These matrices measure the probability that a pixel at one particular gray level will occur at a distinct
distance and orientation from any pixel, given that that pixel has a second particular gray level. For the position
operator p, we can define a matrix P_ij that counts the number of times a pixel with gray level i occurs at position
p from a pixel with gray level j. SGDMs are generated for the H image. The SGDMs are represented by the
function P(i, j, d, θ), where i represents the gray level of the location (x, y) in the image I(x, y), and j represents
the gray level of the pixel at a distance d from location (x, y) at an orientation angle of θ. The reference pixel at
image position (x, y) is shown as an asterisk. All the neighbors from 1 to 8 are numbered in a clockwise direction.
Neighbors 1 and 5 are located on the same plane at a distance of 1 and an orientation of 0 degrees. In this
research, a one-pixel offset distance and a zero-degree orientation angle are used.
The RGB image is converted to HSI and the feature set is then calculated for H and S; we drop the intensity
(I) since it does not give extra information. We use the GLCM function in MATLAB to create the Gray-Level
Co-Occurrence Matrix. The number of gray levels is set to 8, the symmetric value is set to true and the
offset is given a 0 value.
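The construction of the co-occurrence matrix can be sketched without MATLAB. The example below is an assumption-laden illustration, not the MATLAB routine used in the paper: it quantizes a channel scaled to [0, 1] into 8 levels and counts horizontally adjacent pairs (distance 1, orientation 0 degrees), optionally counting each pair in both directions to make the matrix symmetric. The names `quantize` and `cooccurrence` are hypothetical.

```python
def quantize(channel, levels=8):
    """Map channel values in [0, 1] to integer gray levels 0..levels-1."""
    return [[min(int(v * levels), levels - 1) for v in row] for row in channel]


def cooccurrence(gray, levels=8, symmetric=True):
    """Count pairs (gray[y][x], gray[y][x + 1]): distance 1, orientation 0 degrees."""
    counts = [[0] * levels for _ in range(levels)]
    for row in gray:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            if symmetric:
                counts[b][a] += 1  # also count the pair in the reverse direction
    return counts


# Tiny example with 3 gray levels so the matrix is easy to read.
gray = [
    [0, 0, 1],
    [1, 2, 2],
]
for row in cooccurrence(gray, levels=3):
    print(row)
```

With `levels=8` and `symmetric=True` the sketch mirrors the settings named in the text; applying it to the quantized H and S channels would give one matrix per channel.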


The CCM matrices are then normalized using the equation below, where p(i, j, 1, 0) represents the
intensity co-occurrence matrix:

$$p(i, j) = \frac{p(i, j, 1, 0)}{\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i, j, 1, 0)}$$

where N_g is the total number of intensity levels. Next is the marginal probability matrix:

$$p_x(i) = \sum_{j=0}^{N_g-1} p(i, j)$$

Sum and difference matrices:

$$p_{x+y}(k) = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i, j), \quad i + j = k$$

for k = 0, 1, 2, ..., 2(N_g - 1), and

$$p_{x-y}(k) = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i, j), \quad |i - j| = k$$

for k = 0, 1, 2, ..., N_g - 1, where p(i, j) is the image attribute matrix.
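The normalization and the derived distributions can be checked on a small matrix. The sketch below assumes the raw co-occurrence counts are available as a square list of lists; the function names are hypothetical, and the difference distribution is indexed by k = |i - j|, so it has N_g entries.

```python
def normalize(counts):
    """Divide every count by the grand total so that the entries sum to 1."""
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def marginal(p):
    """p_x(i): sum of row i of the normalized matrix."""
    return [sum(row) for row in p]


def sum_distribution(p):
    """p_{x+y}(k): total probability of all (i, j) with i + j = k."""
    n = len(p)
    out = [0.0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            out[i + j] += p[i][j]
    return out


def diff_distribution(p):
    """p_{x-y}(k): total probability of all (i, j) with |i - j| = k."""
    n = len(p)
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            out[abs(i - j)] += p[i][j]
    return out


counts = [[2, 1, 0], [1, 0, 1], [0, 1, 2]]  # made-up symmetric counts, total 8
p = normalize(counts)
print(marginal(p))           # [0.375, 0.25, 0.375]
print(sum_distribution(p))   # [0.25, 0.25, 0.0, 0.25, 0.25]
print(diff_distribution(p))  # [0.5, 0.5, 0.0]
```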

The angular moment (I1) is a measure of the image homogeneity and is defined as

$$I_1 = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} [p(i, j)]^2$$

The mean intensity level, I2, is a measure of image brightness and is derived from the co-occurrence
matrix as follows:

$$I_2 = \sum_{i=0}^{N_g-1} i \, p_x(i)$$

Variation of image intensity is identified by the variance texture feature (I3) and is computed as:

$$I_3 = \sum_{i=0}^{N_g-1} (i - I_2)^2 \, p_x(i)$$

Correlation (I4) is a measure of intensity linear dependence in the image:

$$I_4 = \frac{\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} i \, j \, p(i, j) - I_2^2}{I_3}$$

The product moment (I5) is analogous to the covariance of the intensity co-occurrence matrix:

$$I_5 = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} (i - I_2)(j - I_2) \, p(i, j)$$

Contrast of an image can be measured by the inverse difference moment (I6):

$$I_6 = \sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} \frac{p(i, j)}{1 + (i - j)^2}$$

The entropy feature (I7) is a measure of the amount of order in an image and is computed as:

$$I_7 = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i, j) \ln p(i, j)$$
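The first seven texture statistics can be computed directly from a normalized co-occurrence matrix. The sketch below makes two assumptions that should be checked against the intended definitions: correlation (I4) is divided by the variance I3, and entropy (I7) carries a leading minus sign so that it is non-negative. The function name and the dictionary keys are hypothetical.

```python
import math


def texture_features(p):
    """Compute I1..I7 from a normalized co-occurrence matrix p (list of lists)."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    px = [sum(row) for row in p]  # marginal probabilities p_x(i)

    i1 = sum(p[i][j] ** 2 for i, j in cells)                    # angular moment
    i2 = sum(i * px[i] for i in range(n))                       # mean intensity
    i3 = sum((i - i2) ** 2 * px[i] for i in range(n))           # variance
    cross = sum(i * j * p[i][j] for i, j in cells)
    i4 = (cross - i2 ** 2) / i3 if i3 else 0.0                  # correlation (assumed / I3)
    i5 = sum((i - i2) * (j - i2) * p[i][j] for i, j in cells)   # product moment
    i6 = sum(p[i][j] / (1 + (i - j) ** 2) for i, j in cells)    # inverse difference moment
    # Entropy with a leading minus sign (assumption); zero cells contribute nothing.
    i7 = -sum(p[i][j] * math.log(p[i][j]) for i, j in cells if p[i][j] > 0)
    return {"I1": i1, "I2": i2, "I3": i3, "I4": i4, "I5": i5, "I6": i6, "I7": i7}


# Two gray levels, all mass on the diagonal: neighbouring pixels always match.
features = texture_features([[0.5, 0.0], [0.0, 0.5]])
for name, value in features.items():
    print(name, round(value, 4))
```

For this perfectly correlated toy matrix the correlation comes out as 1 and the entropy as ln 2, which matches what the definitions predict by hand.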

The sum and difference entropies (I8 and I9) cannot be easily interpreted, yet low entropies indicate
high levels of order. I8 and I9 can be computed by:

$$I_8 = -\sum_{k=0}^{2(N_g-1)} p_{x+y}(k) \ln p_{x+y}(k)$$

$$I_9 = -\sum_{k=0}^{N_g-1} p_{x-y}(k) \ln p_{x-y}(k)$$

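The information measures of correlation (I10 and I11) do not exhibit any apparent physical
interpretation:

$$I_{10} = \frac{I_7 - HXY1}{HX}$$

$$I_{11} = \left[ 1 - e^{-2(HXY2 - I_7)} \right]^{1/2}$$

where

$$HX = -\sum_{i=0}^{N_g-1} p_x(i) \ln p_x(i)$$

$$HXY1 = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p(i, j) \ln\left[ p_x(i) \, p_x(j) \right]$$

$$HXY2 = -\sum_{i=0}^{N_g-1} \sum_{j=0}^{N_g-1} p_x(i) \, p_x(j) \ln\left[ p_x(i) \, p_x(j) \right]$$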
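A compact sketch of the entropy-based features follows. It assumes the conventional definitions from the co-occurrence texture literature, namely I10 = (I7 - HXY1) / HX and I11 = sqrt(1 - exp(-2 (HXY2 - I7))), with all entropies carrying a leading minus sign; these forms, and the function names, are assumptions that should be checked against the intended definitions.

```python
import math


def entropy(values):
    """Shannon entropy (natural log) of a sequence of probabilities; zeros are skipped."""
    return -sum(v * math.log(v) for v in values if v > 0)


def entropy_features(p):
    """Compute I8..I11 from a normalized, symmetric co-occurrence matrix p."""
    n = len(p)
    cells = [(i, j) for i in range(n) for j in range(n)]
    px = [sum(row) for row in p]

    p_sum = [0.0] * (2 * n - 1)   # p_{x+y}(k), k = i + j
    p_diff = [0.0] * n            # p_{x-y}(k), k = |i - j|
    for i, j in cells:
        p_sum[i + j] += p[i][j]
        p_diff[abs(i - j)] += p[i][j]

    i7 = entropy(p[i][j] for i, j in cells)   # joint entropy
    hx = entropy(px)
    hxy1 = -sum(p[i][j] * math.log(px[i] * px[j]) for i, j in cells if p[i][j] > 0)
    hxy2 = -sum(px[i] * px[j] * math.log(px[i] * px[j])
                for i, j in cells if px[i] * px[j] > 0)

    i8 = entropy(p_sum)
    i9 = entropy(p_diff)
    i10 = (i7 - hxy1) / hx if hx else 0.0
    i11 = math.sqrt(max(0.0, 1.0 - math.exp(-2.0 * (hxy2 - i7))))
    return {"I8": i8, "I9": i9, "I10": i10, "I11": i11}


features = entropy_features([[0.5, 0.0], [0.0, 0.5]])
for name, value in features.items():
    print(name, round(value, 4))
```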


In this paper, neural networks are used for the automatic detection of leaf diseases. The neural network
is chosen as a classification tool because it is well known as a successful classifier for many real
applications. The training and validation processes are among the important steps in developing an accurate
process model using NNs. The dataset for the training and validation processes consists of two parts: the training
feature set, which is used to train the NN model, and the testing feature set, which is used to verify the accuracy
of the trained NN model. Before the data can be fed to the ANN model, the proper network design must be set
up, including the type of network and the method of training. This was followed by the optimal parameter selection
phase, which was carried out simultaneously with the network training phase, in which the
network was trained using the feed-forward back-propagation network. In the training phase, connection weights
were updated until they reached the defined iteration number or an acceptable error. The capability
of the ANN model to respond accurately was thus assessed using the Mean Square Error (MSE) criterion, which
measures the model validity between the target and the network output.
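The training loop described above can be sketched as a tiny feed-forward network trained by back-propagation, stopping at an iteration limit or an acceptable MSE. Everything concrete here is an assumption for illustration: the single hidden layer of three sigmoid units, the learning rate, the stopping threshold, the function names and the four made-up two-feature samples; the paper's actual architecture and data are not given in this section.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(samples, targets, hidden=3, epochs=2000, lr=0.5, max_error=1e-3, seed=0):
    """Train a one-hidden-layer network; returns (predict, mse_history)."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Each weight vector stores its bias as the last element.
    w_h = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(hidden)]
    w_o = [rng.uniform(-0.5, 0.5) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(sum(w * xi for w, xi in zip(wj, x)) + wj[-1]) for wj in w_h]
        y = sigmoid(sum(w * hj for w, hj in zip(w_o, h)) + w_o[-1])
        return h, y

    def mse():
        return sum((t - forward(x)[1]) ** 2 for x, t in zip(samples, targets)) / len(samples)

    history = [mse()]
    for _ in range(epochs):
        for x, t in zip(samples, targets):
            h, y = forward(x)
            # Back-propagate the error of the squared loss through the sigmoids.
            delta_o = (y - t) * y * (1.0 - y)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j]) for j in range(hidden)]
            for j in range(hidden):
                w_o[j] -= lr * delta_o * h[j]
            w_o[-1] -= lr * delta_o
            for j in range(hidden):
                for i in range(n_in):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
                w_h[j][-1] -= lr * delta_h[j]
        history.append(mse())
        if history[-1] <= max_error:  # acceptable error reached before the iteration limit
            break
    return (lambda x: forward(x)[1]), history


# Made-up feature vectors: only the last sample is labelled "diseased" (1).
samples = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]
predict, history = train(samples, targets)
print("MSE before:", history[0], "after:", history[-1])
```

In practice the inputs would be the texture features extracted from each leaf image and the output layer would have one unit per disease class; the sketch keeps a single output to stay short.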

In this paper, K-means clustering and Neural Networks (NNs) have been applied, respectively, to the
clustering and classification of diseases that affect plant leaves. Recognizing the disease
is the main purpose of the proposed approach. The proposed algorithm was tested on five diseases that
affect plants: early scorch, cottony mold, ashen mold, late scorch and tiny whiteness. The
experimental results indicate that the proposed approach is valuable and can significantly support
accurate detection of leaf diseases with little computational effort.
An extension of this work will focus on developing hybrid algorithms, such as genetic algorithms combined with
NNs, in order to increase the recognition rate of the final classification process and to exploit the advantages of
hybrid algorithms. Future work will also address automatically estimating the severity of the
detected disease.

[1]. H. Al-Hiary, S. Bani-Ahmad, M. Reyalat, M. Braik & Z. AlRahamneh [2011] Fast & accurate detection &
classification of plant diseases, International Journal of Computer Applications (0975-8887), volume 17, no. 1,
[2]. Al-Bashish, D., M. Braik and S. Bani-Ahmad, 2011. Detection and classification of leaf diseases using
Kmeans- based segmentation and neural-networks-based classification. Inform. Technol. J., 10: 267-275. DOI:
[3]. Ali, S. A., Sulaiman, N., Mustapha, A. and Mustapha, N., (2009). K-means clustering to improve the
accuracy of decision tree response classification. Inform. Technol. J., 8: 1256-1262. DOI:
[4]. Applications of image processing in biology and agriculture J. K. Sainis, Molecular Biology and Agriculture
Division,R. Rastogi, Computer Division, and V. K. Chadda, Electronics Systems Division, BARC news letter.
[5]. S. Ananthi and S. Vishnu Varthini., Detection And Classification Of Plant Leaf Diseases, IJREAS Volume
2, Issue 2 (February 2012) ISSN: 2249-3905
[6]. Agriculture in India From Wikipedia, the free encyclopedia
[7]. Digital image processing From Wikipedia, the free encyclopedia
[8]. Mango - Wikipedia, the free encyclopedia
[9]. Tapioca - Wikipedia, the free encyclopedia
[10]. Jayamala K. Patil, Raj Kumar, Advances in image processing for detection of plant diseases, Journal
of Advanced Bioinformatics Applications and Research, ISSN 0976-2604, Vol 2, Issue 2, June 2011, pp 135-

