
2017 International Conference on Data Management, Analytics and Innovation (ICDMAI)

Zeal Education Society, Pune, India, Feb 24-26, 2017

Plant Disease Identification: A Comparative Study

Shriroop C. Madiwalar
Department of Electronics Engineering
Vishwakarma Institute of Technology
Pune, India
shriroopmadiwalar@gmail.com

Medha V. Wyawahare
Department of Electronics Engineering
Vishwakarma Institute of Technology
Pune, India
medha_w@yahoo.com

Abstract—Anthracnose and leaf spot (red rust) are common diseases affecting the mango plant. Because mango is economically important, detecting these diseases is critical for avoiding epidemics and loss of yield. A machine vision approach is proposed for plant disease identification using colour images of mango leaves. The approach uses a YCbCr converted image and creates a feature vector of textural and colour features of the input images, which is fed to the classifier during the testing phase. GLCM, a colour based technique and a Gabor filter were used for texture and colour feature extraction. The results obtained using a minimum distance classifier and a Support Vector Machine (SVM) are compared. The feature extraction techniques were also analysed to obtain individual results for each technique. The overall classification accuracy was 79.16% and 83.34% for the minimum distance classifier and the Support Vector Machine respectively over a database of 86 images.

Keywords— Mango, Plant disease, GLCM, Gabor filter, SVM

I. INTRODUCTION
The agriculture sector, being the livelihood of almost two thirds of the Indian population, is one of the driving forces of the Indian economy. Agricultural yield depends on various factors such as soil quality, rainfall, climatic conditions and plant diseases. The control of plant diseases is a major challenge faced by Indian farmers, as diseases contribute heavily to crop loss every year.

Plant disease diagnosis is a critical topic for Indian agriculture scientists and requires scientific methods and long periods of observation. The real challenge lies in the availability of agriculture experts across the country who can perform this process. Experts who can study and diagnose plant diseases are not easily available in all parts of the country, especially in rural regions where even basic amenities are hard to find. Thus, a need has arisen for an automated system which can identify plant diseases and provide efficient solutions by itself. Proper solutions include precise diagnosis, information about the correct fertilizers and pesticides, and information about the cost involved.

There is a heavy demand in the western market for the Alphonso mango, which is commonly grown in the western part of Maharashtra in India (the Konkan region). This heavy demand can only be met if the mango is kept disease free, thus ensuring the required quality and quantity. Hence, it is necessary to detect the diseases in the earliest stages so that proper preventive measures can be employed. The diseases that affect the mango plant include anthracnose, leaf spot, leaf blight, powdery mildew and leaf rust. These diseases can be detected by the different textures formed on the leaves.

The texture, colour and shape formed on leaves by these diseases are unique, so they can be used to diagnose a disease. A host of feature extraction techniques have been developed to date. One of the most efficient texture analysis approaches is the Gray Level Co-occurrence Matrix (GLCM). It describes the spatial distribution of gray values and the frequency of one gray value appearing with another gray value at a specified distance and angle. The GLCM approach gives a matrix from which a number of textural features can be evaluated. Combining features extracted using other techniques with the GLCM features might yield a more efficient diagnosis of diseases.

II. RELATED WORK
A survey of existing studies is given below. Abdullah et al. [1] presented an automated classification of rubber tree leaf diseases using the primary RGB colour model. Three leaf diseases of rubber were classified using the PCA technique to reduce the input dimension and an ANN for classification.

Three diseases of paddy found in Sri Lanka were classified by Anthonys et al. [2]. Feature extraction was done using colour texture analysis, and membership functions were used for classification. Camargo et al. [3] proposed a machine vision system for identifying the visual symptoms of cotton plant diseases from coloured images.

Palm leaf diseases were classified by Hairuddin et al. [4] using fuzzy logic. Kurniawati et al. [5] carried out texture analysis for diagnosing paddy diseases. Their methodology converts the RGB images into a binary image using variable, global and automatic thresholds based on the Otsu method. Pujari et al. [6] described Support Vector Machine (SVM) and Artificial Neural Network (ANN) based recognition and classification of visual symptoms on cereals such as wheat, maize and jowar affected by fungal disease.

978-1-5090-4083-4/17/$31.00 ©2017 IEEE


III. PROPOSED METHOD
The method proposed in this paper is represented in the block diagram (Fig. 1). It consists of two phases: the training phase and the testing phase. The training phase includes evaluating the features and creating a feature vector set. The testing phase takes an input image and uses a classifier to compare it with the feature vector set and identify the disease. Mango leaves were tested and classified into three classes: anthracnose affected, leaf spot affected and normal leaves.

[Figure: blocks are Image Database; Training Model and Testing Model; Preprocessing & Segmentation (both models); Feature Extraction with GLCM, Colour and Gabor Filter (both models); Feature vector matrix; Classifier; Detected Plant Disease]
Fig. 1. Block diagram of proposed method.

A. Image Database
The database was created using images taken in a controlled environment. Leaves affected by anthracnose and leaf spot (red rust), along with normal leaves, were obtained from the fields, and the manual disease identification for training purposes was performed with the help of pathological experts from Mahatma Phule Krishi Vidyapeeth, Pune. Photographs of the leaves were taken using a digital camera against a white background. A total of 110 images were acquired, of which 86 were used for training and 24 for testing.

[Figure: (a) Leaf Spot (b) Anthracnose]
Fig. 2.

B. Pre-processing and Segmentation
Pre-processing: Pre-processing enhances the image so that feature extraction can be done more efficiently. A colour balancing technique has been used in this paper for image enhancement and smoothening.

Segmentation: Colour based segmentation has been performed in the YCbCr colour plane. The RGB model is converted to the YCbCr model [8]. The YCbCr model is used because threshold values are known for the brown and black colours, which are the primary colours of the diseased part. This makes segmentation much easier than in the RGB model. The pixels of the diseased part are kept intact and the background is given an intensity value of 0, i.e. black. The diseased part is detected by setting threshold values for the Cb and Cr planes (the blue and red chroma components). The segmented output is depicted in Fig. 3.

[Figure: (a) Preprocessed Image (b) Segmented Image]
Fig. 3.

C. Feature Extraction
Textural features as well as colour features have been used in this study. Eight textural features are extracted using the GLCM (Gray Level Co-occurrence Matrix) technique. Colour features are extracted as the mean and standard deviation of each plane of the RGB and YCbCr representations of the image, giving 12 colour features. In addition, the mean and standard deviation of the image convolved with the Gabor filter are stored as textural features, giving a complete feature vector of 8 + 12 + 2 = 22 values.

D. Classification
A minimum distance classifier and a Support Vector Machine (SVM) classifier have been used, one at a time, in the proposed method. Their results allow a comparison across the three classes.
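The segmentation step of Section III-B can be sketched as follows. This is a minimal illustration, not the authors' implementation: the conversion uses the common full-range BT.601 coefficients, and the Cb and Cr threshold ranges are hypothetical placeholders, since the paper does not list the values it used.

```python
# Sketch of colour-based segmentation in YCbCr (Section III-B).
# ASSUMPTIONS: full-range BT.601 conversion; the threshold ranges below are
# hypothetical placeholders, not the values used in the paper.

def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def segment(image, cb_range=(70, 125), cr_range=(140, 200)):
    """Keep pixels whose Cb and Cr lie inside the ranges; set the rest to black.

    image is a list of rows, each row a list of (r, g, b) tuples.
    """
    segmented = []
    for row in image:
        out_row = []
        for (r, g, b) in row:
            _, cb, cr = rgb_to_ycbcr(r, g, b)
            inside = (cb_range[0] <= cb <= cb_range[1]
                      and cr_range[0] <= cr <= cr_range[1])
            out_row.append((r, g, b) if inside else (0, 0, 0))
        segmented.append(out_row)
    return segmented


# One brown, one green and one white pixel: only the brown one should survive.
brown, green, white = (139, 69, 19), (34, 139, 34), (255, 255, 255)
result = segment([[brown, green, white]])
```

With these placeholder ranges the brown pixel (Cb about 91, Cr about 167) is kept, while the green leaf pixel and the white background are set to black.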

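As an illustration of the 12 colour features described in Section III-C, the sketch below computes the mean and standard deviation of the R, G, B, Y, Cb and Cr planes. It assumes the full-range BT.601 conversion and the population standard deviation; the paper does not state which variants were used, and the function names are chosen here for illustration only.

```python
from statistics import mean, pstdev

# ASSUMPTIONS: full-range BT.601 conversion and population standard deviation.


def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def colour_features(pixels):
    """Return [mean, std] for each of R, G, B, Y, Cb, Cr (12 values in total).

    pixels is a flat list of (r, g, b) tuples.
    """
    # Build six planes: three from RGB and three from YCbCr.
    six_channel = [(r, g, b) + rgb_to_ycbcr(r, g, b) for (r, g, b) in pixels]
    planes = list(zip(*six_channel))
    features = []
    for plane in planes:
        features.append(mean(plane))
        features.append(pstdev(plane))
    return features


# Two-pixel example: one black and one white pixel.
features = colour_features([(0, 0, 0), (255, 255, 255)])
```

For this two-pixel input the R plane has mean 127.5 and standard deviation 127.5, while Cb and Cr are constant at 128, so their standard deviations are zero.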
IV. TECHNIQUES USED

A. Colour Balancing
Pre-processing is a critical step in image processing, as it provides a suitable image which gives optimum results for feature extraction. The colour balancing method enhances the diseased part of the leaves and reduces the intensity of the green parts of the leaves.

The algorithm for this technique is as follows:
a) Take the input image from the user and resize it to a desired level.
b) Extract the R, G and B planes of the image.
c) Calculate the mean value of pixel intensities of each plane.
d) Find the smallest mean value of the three planes.
e) Divide this smallest value by the mean value of each plane to obtain three scaling factors, one per plane.
f) Multiply each plane by its scaling factor to obtain the enhanced image.

B. Gray Level Co-occurrence Matrix (GLCM)
The Gray Level Co-occurrence Matrix (GLCM), proposed by Haralick [7], is widely used for texture segmentation in various fields of image processing. The construction of the GLCM depends on the relative distance (d) and orientation (ϕ) between pixels, where d is the distance measured in pixels and ϕ is the angle, generally one of four directions (horizontal 0°, diagonal 45°, vertical 90° and anti-diagonal 135°). A GLCM G(m,n) for an image I(x,y) can be generated as follows.

(1)

where C(.) = 1 if the argument is true, otherwise C(.) = 0.

The GLCM is a square matrix of dimension Nx, where Nx is the number of gray levels in the image. The matrix is formed by counting how many times a pixel with value i is adjacent to a pixel with value j. Each entry in the matrix gives the probability of that pair of gray values being adjacent to each other. In short, the GLCM is a tabulation of how often the different combinations of grey levels occur in an image.

28 features [7] can be evaluated from this matrix using a set of equations proposed by Haralick. Feature selection is performed by keeping the features which give distinct values for different classes. Five features from [7] and three from [13] were selected for a GLCM matrix G(i,j). The formulae for these are given in equations 2-9.

a) Contrast: (2)
b) Energy: (3)
c) Sum Variance: (4)
d) Correlation: (5)
e) Entropy: (6)
f) Dissimilarity: (7)
g) Cluster Prominence: (8)
h) Cluster Shade: (9)

Here, μx and σx are the row mean and standard deviation, and μy and σy are the column mean and standard deviation.

C. Colour Based Features
The means and standard deviations of each plane of a colour space constitute the colour features of an image. For an i×j image c, the mean (μ) and standard deviation (σ) can be calculated using eq. 10 and 11:

(10)

(11)

D. Gabor Filter
The Gabor filter is a band-pass filter and its 2D impulse response in the spatial domain is given by the following equation [9]:

(12)
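The colour balancing steps (b) to (f) of Section IV-A can be sketched as below. This is a minimal illustration under stated assumptions: the resize of step (a) is omitted, the image is a list of rows of (r, g, b) tuples, and the function name is chosen here for illustration.

```python
def colour_balance(image):
    """Scale each RGB plane so that its mean equals the smallest plane mean.

    image is a list of rows, each row a list of (r, g, b) tuples.
    Returns a new image with float channel values.
    """
    pixels = [pixel for row in image for pixel in row]
    count = len(pixels)

    # Steps (b) and (c): mean intensity of each of the three planes.
    means = [sum(pixel[c] for pixel in pixels) / count for c in range(3)]

    # Step (d): smallest of the three means.
    smallest = min(means)

    # Step (e): one scaling factor per plane (a zero mean is left unscaled).
    scales = [smallest / m if m > 0 else 1.0 for m in means]

    # Step (f): multiply every plane by its scaling factor.
    return [[tuple(pixel[c] * scales[c] for c in range(3)) for pixel in row]
            for row in image]


# A greenish image: plane means are R=100, G=200, B=50.
balanced = colour_balance([[(100, 200, 50), (100, 200, 50)]])
```

Here the scaling factors are 0.5, 0.25 and 1.0, so the dominant green plane is attenuated the most, which matches the stated aim of reducing the intensity of the green parts of the leaf.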

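To make the GLCM of Section IV-B concrete, the sketch below builds a normalised co-occurrence matrix for one distance and angle and evaluates several of the listed features using their commonly used textbook definitions. These are assumptions, not the paper's exact equations: the matrix is non-symmetric, entropy uses the natural logarithm, and sum variance is left out because its definition varies between sources.

```python
import math

# (row step, column step) for each angle, with rows numbered top to bottom.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, d=1, angle=0):
    """Normalised, non-symmetric GLCM of a 2D list of integer gray levels."""
    dr, dc = OFFSETS[angle]
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d * dr, c + d * dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts
    return [[value / total for value in row] for row in counts]


def glcm_features(p):
    """Common texture features of a normalised GLCM p (assumed definitions)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_x = sum(i * v for i, j, v in cells)
    mu_y = sum(j * v for i, j, v in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * v for i, j, v in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * v for i, j, v in cells))
    features = {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for i, j, v in cells),
        "entropy": -sum(v * math.log(v) for i, j, v in cells if v > 0),
        "dissimilarity": sum(abs(i - j) * v for i, j, v in cells),
        "cluster_shade": sum((i + j - mu_x - mu_y) ** 3 * v for i, j, v in cells),
        "cluster_prominence": sum((i + j - mu_x - mu_y) ** 4 * v for i, j, v in cells),
    }
    if sd_x > 0 and sd_y > 0:  # correlation is undefined for a constant image
        features["correlation"] = sum(
            (i - mu_x) * (j - mu_y) * v for i, j, v in cells) / (sd_x * sd_y)
    return features


# 4x4 example image with four gray levels, horizontal neighbours at distance 1.
example = [[0, 0, 1, 1],
           [0, 0, 1, 1],
           [0, 2, 2, 2],
           [2, 2, 3, 3]]
p = glcm(example, levels=4, d=1, angle=0)
features = glcm_features(p)
```

The example image has 12 horizontal neighbour pairs, of which two are (0,0) and three are (2,2), so p[0][0] is 2/12 and p[2][2] is 3/12.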
In eq. 12, σ is the spatial width, ω is the frequency and ϕ is the orientation angle. The Gabor filter is represented by a Gaussian kernel function modulated by a sinusoid, and it responds to different types of shapes depending on the parameter values.

Its applications include image enhancement, smoothening and the extraction of textural features, among others. In this case, the Gabor filter was designed for texture extraction, according to the parameters in [10]. Studies have shown that an odd-sized filter (in this case 9×9) gives the best results when detecting edge shaped textures [9].

The filter thus obtained is convolved with the input image.

E. Minimum Distance Classifier
Classification is a process in which individual items (objects, patterns, image regions or pixels) are grouped based on the similarity between the item and the description of the group. A minimum distance classifier uses a single prototype for each class (usually the class mean). Decision boundaries are created by partitioning the feature space according to the nearest mean. The difference between the mean of the input feature vector (μi) and that of the trained model (μc) is evaluated using a simple distance formula.

(13)

where ||u|| denotes the norm of u. The class whose mean is at the minimum distance from the input mean is selected.

F. Support Vector Machine
The Support Vector Machine (SVM) is a learning machine for two-group classification problems [14]. The basic idea of the SVM is the construction of a hyperplane which provides the decision boundary between two classes. The optimal hyperplane is constructed so that the two supporting hyperplanes are equidistant from it, which ensures the least probability of misclassification. The supporting hyperplanes are constructed on either side of the optimal hyperplane using the support vectors, which are the data points that lie closest to the decision boundary. Hence, the decision is fully specified by a small subset of the training samples, the support vectors.

The algorithm used in the proposed method for classification between three classes is as follows:
a) Create three feature vector sets, one each for classification between anthracnose and leaf spot, anthracnose and normal, and leaf spot and normal.
b) Create class labels (L) for each of the above three feature vector sets (FVS).
c) Use one feature vector set and its corresponding class labels at a time.
d) Accept the input image from the user and create a testing-set containing the feature values of the input image.
e) Train the classifier using the svmtrain(FVS,L) function. Save the trained classifier in a variable (here, 'svmstructure').
f) Perform classification using the function svmclassify(svmstructure, testing-set).
g) Store the obtained class value in a variable and repeat steps (e) and (f) for the other two feature vector sets.
h) After the three runs, the class obtained two times out of three is the required class.

V. RESULTS
The confusion matrices for the 24 test images, 8 of each type, are shown below. In each table, a row is the actual class of the test images and a column is the class assigned by the classifier. Tables I-VII give the accuracies obtained using the minimum distance classifier, while Tables VIII-XIV give the accuracies obtained using the SVM classifier. Table I gives the accuracy of disease identification when the GLCM, colour and Gabor features are considered together. Tables II, III and IV consider the GLCM, colour and Gabor filter features independently. Tables V, VI and VII give the accuracy for the pairwise combinations of the three methods.

TABLE I. GLCM, COLOUR AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    7       1     0       87.5%
Spot      2       5     1       62.5%     79.16%
Normal    0       1     7       87.5%

TABLE II. GLCM FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    7       1     0       87.5%
Spot      3       3     2       37.5%     62.5%
Normal    0       3     5       62.5%

TABLE III. COLOUR FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       2     0       75%
Spot      4       2     2       25%       62.5%
Normal    0       1     7       87.5%

TABLE IV. GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    5       3     0       62.5%
Spot      3       4     1       50%       58.33%
Normal    0       3     5       62.5%

TABLE V. GLCM AND COLOUR FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    7       1     0       87.5%
Spot      2       5     1       62.5%     75%
Normal    0       2     6       75%
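The accuracy columns of these tables follow directly from the confusion matrices. The short sketch below reproduces the figures of Table I; the function name is chosen here for illustration.

```python
def accuracies(confusion):
    """Return (per-class accuracies, overall accuracy), both in percent.

    confusion[i][j] is the number of test images of actual class i that
    were assigned to class j.
    """
    per_class = [100.0 * row[i] / sum(row) for i, row in enumerate(confusion)]
    correct = sum(row[i] for i, row in enumerate(confusion))
    total = sum(sum(row) for row in confusion)
    return per_class, 100.0 * correct / total


# Table I, with rows and columns in the order Anthra, Spot, Normal.
table_1 = [[7, 1, 0],
           [2, 5, 1],
           [0, 1, 7]]
per_class, overall = accuracies(table_1)
```

For Table I this gives per-class accuracies of 87.5%, 62.5% and 87.5%, and an overall accuracy of 19/24, about 79.17%, which the table reports as 79.16%.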

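The minimum distance classifier of Section IV-E can be sketched as a nearest-class-mean rule. The sketch assumes the Euclidean norm and uses made-up two-dimensional training vectors; the paper's feature vectors have 22 values and its exact distance formula is not reproduced here.

```python
import math


def class_means(training):
    """Return the mean feature vector of each class.

    training maps a class label to a list of equal-length feature vectors.
    """
    return {label: [sum(column) / len(vectors) for column in zip(*vectors)]
            for label, vectors in training.items()}


def classify(x, means):
    """Assign x to the class whose mean is nearest (Euclidean distance assumed)."""
    return min(means, key=lambda label: math.dist(x, means[label]))


# Made-up 2D training data, for illustration only.
training = {
    "anthracnose": [[0, 0], [2, 0]],
    "leaf spot": [[10, 10], [12, 10]],
    "normal": [[0, 10], [0, 12]],
}
means = class_means(training)
predicted = classify([1, 1], means)
```

The class means here are (1, 0), (11, 10) and (0, 11), so the input (1, 1) is assigned to the anthracnose class.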
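The three-class decision of Section IV-F, in which three two-class classifiers vote and the class obtained twice is selected, can be sketched as follows. The three threshold rules are hypothetical stand-ins for trained SVMs, used only so the example runs without external libraries. Returning None when all three classifiers disagree is an assumption, since the paper does not describe that case.

```python
from collections import Counter


def majority_vote(x, pairwise_classifiers):
    """Return the label chosen by at least two of the three classifiers.

    Each classifier is a function mapping a feature vector to one of its
    two class labels. Returns None when all three disagree (assumed
    behaviour, not specified in the paper).
    """
    votes = Counter(classifier(x) for classifier in pairwise_classifiers)
    label, count = votes.most_common(1)[0]
    return label if count >= 2 else None


# Hypothetical threshold rules standing in for the three trained SVMs.
def anthracnose_vs_spot(x):
    return "anthracnose" if x[0] < 5 else "leaf spot"


def anthracnose_vs_normal(x):
    return "anthracnose" if x[1] < 5 else "normal"


def spot_vs_normal(x):
    return "leaf spot" if x[1] < 5 else "normal"


classifiers = [anthracnose_vs_spot, anthracnose_vs_normal, spot_vs_normal]
decision = majority_vote([9, 1], classifiers)
```

For the input (9, 1) the votes are leaf spot, anthracnose and leaf spot, so the majority decision is leaf spot.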
TABLE VI. GLCM AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    7       1     0       87.5%
Spot      2       4     2       50%       62.5%
Normal    0       4     4       50%

TABLE VII. COLOUR AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       2     0       75%
Spot      5       2     1       25%       62.5%
Normal    0       1     7       87.5%

Tables VIII-XIV cover the same feature combinations as above, but using the SVM classifier.

TABLE VIII. GLCM, COLOUR AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       2     0       75%
Spot      1       6     1       75%       83.34%
Normal    0       0     8       100%

TABLE IX. GLCM FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       1     1       75%
Spot      0       4     4       50%       75%
Normal    0       0     8       100%

TABLE X. COLOUR FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       1     1       75%
Spot      4       0     4       0%        58.34%
Normal    0       0     8       100%

TABLE XI. GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       2     0       75%
Spot      2       5     1       62.5%     79.16%
Normal    0       0     8       100%

TABLE XII. GLCM AND COLOUR FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       1     1       75%
Spot      0       4     4       50%       75%
Normal    0       0     8       100%

TABLE XIII. GLCM AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    5       3     0       62.5%
Spot      0       7     1       87.5%     83.34%
Normal    0       0     8       100%

TABLE XIV. COLOUR AND GABOR FILTER FEATURES

          Anthra  Spot  Normal  Accuracy  Overall
Anthra    6       2     0       75%
Spot      2       5     1       62.5%     79.16%
Normal    0       0     8       100%

Fig. 4 shows a comparative analysis of the two classifiers when all three feature groups are considered together.

Fig. 4. Comparison of Minimum Distance Classifier and SVM

VI. CONCLUSION
Disease identification was performed and leaves were diagnosed for two diseases, anthracnose and leaf spot. Three feature groups were used for the feature extraction process.

• The GLCM feature extraction proved to be effective for classifying normal leaves and those affected by anthracnose. However, the detection of leaf spot was found to be difficult.
• The colour based features gave the best classification results for normal leaves and good results for anthracnose affected leaves, but very poor results for leaf spot affected leaves.
• The Gabor filter features gave the best results among the three for leaf spot detection.

Thus, it can be concluded that the boundary extraction performed by the Gabor filter identified the small spots of leaf spot more effectively, while the other two techniques are responsible for the denser texture analysis needed for anthracnose. Finally, the best results were achieved by considering all three feature sets together, at the expense of increased computational complexity.

The two classifiers, used separately in this system, gave distinct results for different classes. While the minimum distance classifier was better for one class, the SVM classifier performed better for another.

• The minimum distance classifier performed better at identifying anthracnose.
• The SVM classifier identified leaf spot much more accurately than its counterpart. The detection of normal leaves was also carried out with 100% accuracy using the SVM.

Finally, the SVM gave better overall results than the minimum distance classifier.

REFERENCES

[1] Noor Ezan Abdullah, Athirah A. Rahim, Hadzli Hashim and Mahanijah Md Kamal, “Classification of Rubber Tree Leaf Diseases Using Multilayer Perceptron Neural Network,” The 5th Student Conference on Research and Development - SCOReD 2007, 11-12 December 2007, Malaysia.
[2] G. Anthonys, N. Wickramarachchi, “An Image Recognition System for Crop Disease Identification of Paddy fields in Sri Lanka,” Fourth International Conference on Industrial and Information Systems, ICIIS 2009, 28-31 December 2009, Sri Lanka.
[3] A. Camargo, J.S. Smith, “Image pattern classification for the identification of disease causing agents in plants,” Computers and Electronics in Agriculture 66 (2009) 121-125.
[4] Muhammad Asraf Hairuddin, Nooritawati Md Tahir, Shah Rizam Shah Baki, “Overview of Image Processing Approach for Nutrient deficiencies detection in Elaeis Guineensis,” 2011 IEEE International Conference on System Engineering and Technology (ICSET), 978-1-4577-1255-5/11.
[5] Nunik Noviana Kurniawati, Siti Norul Huda Sheikh Abdullah, Salwani Abdullah, Saad Abdullah, “Texture Analysis for Diagnosing Paddy Disease,” 2009 International Conference on Electrical Engineering and Informatics, 5-7 August 2009, Selangor, Malaysia.
[6] Jagadeesh D. Pujari, Rajesh Yakkundimath, Abdulmunaf S. Byadgi, “Classification of Fungal Disease Symptoms affected on cereals using Colour texture features,” International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 6, No. 6 (2013), pp. 321-330.
[7] Robert Haralick, K. Shanmugam, Its’hak Dinstein, “Textural Features for image classification,” IEEE Transactions on Systems, Man and Cybernetics, Vol. SMC-3, No. 6, November 1973, pp. 610-621.
[8] Balkrishan Ahirwal, Mahesh Khadtare and Rakesh Mehta, “FPGA based system for color space transformation RGB to YIQ and YCbCR,” International Conference on Intelligent and Advanced Systems 2007, 1-4244-1355-9/07.
[9] Runping Han, Lingmin Zhang, “Fabric Defect Detection Method Based on Gabor Filter Mask,” Global Congress on Intelligent Systems, DOI 10.1109/GCIS.2009.356.
[10] Ajay Kumar, Grantham K. H. Pang, “Defect Detection in Textured Materials using Gabor filters,” IEEE Transactions on Industry Applications, Vol. 38, No. 2, March/April 2002.
[11] Rafael C. Gonzalez, Richard E. Woods, “Digital Image Processing,” Edition 3.
[12] Richard O. Duda, Peter E. Hart, David G. Stork, “Pattern Classification,” Edition 2.
[13] Yongsheng Yang, Fuyuan Hu, Juan Xia, “Texture statistics features of SAR oil spills imagery,” 978-1-4673-0875-5/12.
[14] Corinna Cortes, Vladimir Vapnik, “Support-Vector Networks,” Machine Learning, Vol. 20, pp. 273-297 (1995).
[15] Simon Tong, Daphne Koller, “Support Vector Machine Active Learning with Applications to Text Classification,” Journal of Machine Learning Research (2001), pp. 45-66.
