
Mango Leaf Disease Prediction using Texture Features and Data Mining Algorithms


Swapnil Shinde (Assistant Professor), Kshipra Tatkare (Assistant Professor), Madhavi Ramakant Shinde, Jaspreet Kaur Sidhu, Jidnyasa Surendra Tavate, Charushila Mahesh Thokal
Department of Information Technology
Ramrao Adik Institute of Technology
Navi Mumbai, India
Abstract— Diseases are very common among plants and affect their growth. The Indian economy depends heavily on agricultural productivity, which is one reason why disease detection in plants plays an important role in agriculture. Diseases in plants are quite natural. Mango is an important crop because demand for it is high and the profits it generates are also high. It is therefore necessary to predict diseases at an early stage, and an automatic technique is required to ease the work of identifying them. In this paper we propose a system that uses Canny edge detection along with GLCM for feature extraction, and SVM and KNN for classification. We focus on the three major diseases that affect the mango crop: Anthracnose, Powdery Mildew and Red rust. The system will be tested with test data sets of different sizes collected from various regions. The proposed system aims to give better results than the existing system.

Keywords— Edge Detection, GLCM, Gaussian filter, SVM, KNN

I. INTRODUCTION

Many people in India are farmers and depend on the production obtained from farming. Their aim is to increase production efficiency and product quality. From the customer's point of view, mango is the most important agricultural product in specific regions of India. Profit depends on product quality, which in turn depends on healthy plants and seeds, so farmers who want to increase profit focus mainly on these two things. Production is also affected by one more factor: the many types of disease that attack the mango leaf.

To increase profit we have to control diseases. It is necessary to detect and control such diseases within a specific time period, namely at their initial stage, so that they are destroyed before they affect basic functions of the mango plant such as photosynthesis, transpiration, pollination, fertilization and germination. We use data mining techniques together with image processing for prediction. Depending on the application, many of these problems may be solved, or at least reduced, by the use of digital images, pattern recognition and the classification tools of data mining. There are three main diseases of mango leaves: Anthracnose, Red rust and Powdery Mildew.

Food security is still in danger due to issues such as plant diseases and climate change. Plant diseases are not only a key hazard to food security all over the world, but can also have terrible consequences for countries like India whose economy largely depends on healthy crops. Their quick recognition is still difficult due to the lack of essential infrastructure. In image processing, although a lot of effort has been invested, plant identification is still a difficult and vague problem [5].

II. LITERATURE SURVEY

1. Identification of Mango Leaf Disease and Control Prediction using Image Processing and Neural Network [1]

• This paper analyzes the performance of a leaf disease identification and control prediction algorithm on standard digital color images taken from a test data set.
• The proposed mango leaf disease identification and control prediction system is an automated system that can identify two types of mango leaf disease: Bacterial leaf spot and Red rust.
• Edge-detection-based image segmentation is done first, and image analysis and classification of diseases are then performed using the proposed Homogeneous Pixel Counting Technique for Cotton Diseases Detection (HPCCDD) algorithm. The goal of this research work is to identify the disease-affected part of the cotton leaf spot using image analysis. After image preprocessing, including image compression, cropping and de-noising, the K-means clustering algorithm was used to segment the disease images, and then 21 color features, 4 shape features and 25 texture features were extracted from the images. A median filter and a Back Propagation Neural Network classifier are used.
• The optimal number of segments is 5 clusters, which gives the best performance of the proposed system, about 94%.
• When the number of clusters increases, the total testing time needed to identify the mango leaf disease also increases, but better results are obtained. The steps are shown in figure 1.

Fig. 1. Steps for Identification of Mango Leaf Disease and Control Prediction using Image Processing and Neural Network

2. Disease Recognition in Mango Crop Using Modified Rotational Kernel Transform Features [2]

• This paper uses an MRKT-based scheme to calculate directional feature histograms for plant parts, such as the leaf and fruit of a mango crop, in digital color images taken from a data set.
• Histogram equalization and an Artificial Neural Network classifier are used.
• The author proposes an automatic plant identification system for an image database using all three types of features: shape, color and texture. The major issue encountered was the change in the shape and pattern of the leaf with the age of the plant and the leaf composition, in addition to the usual hurdles of object recognition such as variation in pose, light and orientation. The authors use a convolutional neural network (CNN) in order to
learn unsupervised feature representations for 44 different plant classes, collected at the Royal Botanic Gardens, Kew, England.
• The results obtained using the proposed MRKT directional feature set are better, with accuracy up to 98%.
• The histogram peaks are used to distinguish among directional features of plant parts. The above diagram is from the paper Disease Recognition in Mango Crop Using Modified Rotational Kernel Transform Features.

3. OpenCV Based Disease Identification of Mango Leaves [3]

• In this paper, the diseased leaf image is acquired using a digital camera interfaced to Raspberry Pi hardware or a smartphone camera.
• The dataset is prepared with both black and white backgrounds.
• The disease name and the corresponding feature vectors are added to the Matlab or OpenCV database using the learning algorithm. Different segmentation and classification algorithms are studied for disease clustering and identification.
• K-means is one of the simple and robust segmentation algorithms to implement for low-cost development, and it uses an unsupervised learning method to solve known clustering issues. Disease classification is achieved by an SVM classifier because of its implementation simplicity in both the Matlab and OpenCV libraries compared to back propagation neural networks.
• The algorithm is implemented using both Matlab and OpenCV libraries and the results are compared. Since OpenCV libraries are open source, and implementation is possible on Android devices and other low-cost open hardware development boards such as Raspberry Pi and BeagleBone, the OpenCV implementation is fine-tuned to get accurate results compared to Matlab, so that it can be made available to farmers easily.
• An SVM classifier, histogram equalization and a hyperplane are used.

4. Automatic Robust Segmentation Scheme for Pathological Problems in Mango Crop [4]

• In this paper, the authors captured 350 images using a Nikon 16-megapixel digital camera, and a threshold-based offloading technique is used.
• K-means clustering, edge detection, a neural-network-based classification technique, a multi-layer Perceptron classifier and SVM are used.
• The work addresses the segmentation of mango plant parts such as leaves, flowers and fruits to detect two diseases, Powdery Mildew and Anthracnose.
• Initially, the captured images of plant parts undergo image preprocessing such as channel separation and illumination normalization (if non-uniform illumination exists).
• Color space conversion is then performed to get the grayness value. On this grayness image, Adaptive K-means segmentation is performed to segment the lesion region.
• To get sharp edges at the boundaries of the segmented image, the authors apply different edge detection transforms and then combine them to obtain the final segmented image.
• The success rate of classification was around 89%.
• The Adaptive K-Means clustering technique enhances the segmentation output obtained through the original K-Means technique.
• The accuracy of Adaptive K-means clustering is approximately 93%.

III. COMPARATIVE ANALYSIS

A comparative analysis of the above-mentioned papers is shown below in table 1. In each paper the methods used for enhancement and classification are different. The features extracted also vary.

Table 1. Comparative analysis of literature survey

Paper 1
  Image database: Standard digital color images taken from test data set.
  Methods: Median filter, Neural Network classifier technique.
Paper 2
  Image database: Digital color images taken from test data set.
  Methods: Histogram equalization, edge enhancement, Artificial Neural Network classifier.
Paper 3
  Image database: Digital camera interfaced to the Raspberry Pi hardware or the smartphone camera.
  Methods: Averaging filter, color transformation, histogram equalization, SVM classifier, hyperplane.
  Fourth column (truncated): Stand… deviat… varian… Kurto… skewn… entrop…
Paper 4
  Image database: The authors have captured 350 images using a Nikon 16-megapixel digital camera.
  Methods: RGB channel separation, normalization, K-means clustering, neural network, Perceptron classifier, SVM.
  Fourth column (truncated): Wave… transf… graysc… norma… bright… levels

IV. PROPOSED SYSTEM

The first step is image acquisition, in which the image is captured using a camera. The required resolution is 6 megapixels or more. The second step is preprocessing, which includes image enhancement and segmentation. Preprocessing involves reading the image, cropping it and separating the color planes. Image enhancement involves Gaussian filtering and high-boost filtering. Canny edge detection is used for segmentation. The third step is feature extraction, which is done using GLCM. The fourth step is classification, which will be done using SVM and KNN; the results of the two will be compared. We obtained a data set from the authors of the reference paper OpenCV Based Disease Identification of Mango Leaves, Jayaprakash Sethupathy and Veni S [3]. We also collected a real-time dataset by visiting nearby locations. The system will
be used on both data sets. The steps are shown below in figure 2.

Fig. 2. Flow Chart of Proposed system

1. IMAGE ACQUISITION
In the reference paper, the diseased leaf image is acquired using a digital camera interfaced to Raspberry Pi hardware or a smartphone camera. The image is acquired from a fixed, uniform distance with sufficient lighting for learning and classification. The image background should provide a proper contrast to the leaf color. The mango leaf disease dataset is prepared with both black and white backgrounds; based on the comparative study, the black background image provides better results and hence it is used for the disease identification of mango leaves.
We have also captured images of mango leaves using a digital mobile camera for the real-time dataset. Images are stored in standard JPG format. The camera resolution used is 6 megapixels.

2. IMAGE PREPROCESSING
A. CROPPING IMAGE
Read the image and crop it by specifying a crop rectangle.

B. PLANE SEPARATION
In this method, we separate the red, green and blue planes of a given RGB image.
Red_channel=I2(:,:,1);
green_channel=I2(:,:,2);
blue_channel=I2(:,:,3);

From these we consider only green_channel, as the human eye is most sensitive to green.

C. IMAGE ENHANCEMENT
i. GAUSSIAN FILTERING [6]
A Gaussian filter is a filter whose impulse response is a Gaussian function (or an approximation to it). Gaussian filters have the property of producing no overshoot to a step function input while minimizing the rise and fall time. This behavior is closely connected to the fact that the Gaussian filter has the minimum possible group delay. It is considered the ideal time domain filter, just as the sinc is the ideal frequency domain filter. The Gaussian smoothing operator is a 2-D convolution operator that is used to blur images and remove detail and noise. One example of a Gaussian filter mask is shown in Fig. 3.

Fig. 3. Gaussian filter mask [6]

The Gaussian distribution in 1-D has the form:
$$G(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-x^{2}/(2\sigma^{2})} \tag{1}$$
where $\sigma$ is the standard deviation of the distribution.

ii. HIGH BOOST FILTERING [7]
The high-boost filter can be used to enhance high frequency components while still keeping the low frequency components. It is composed of an all-pass filter and an edge detection filter (Laplacian filter), so it emphasizes edges and acts as an image sharpener. The high-boost filter is a simple sharpening operator in signal and image processing, used for amplifying the high frequency components of signals and images. The amplification is achieved by a procedure that subtracts a smoothed version of the data from the original. In image processing, this amplification sharpens the edges of an image and yields a clearer image. High-boost filtering is expressed in equation form as follows:
$$I_{\text{highboost}} = A\, I_{\text{original}} + I_{\text{highpass}} = (A\, W_{\text{allpass}} + W_{\text{highpass}}) * I_{\text{original}} \tag{2}$$

D. CANNY EDGE DETECTION
Canny edge detection is a multi-step algorithm that detects edges while suppressing noise at the same time.
(a) Smooth the image with a Gaussian filter to reduce noise and unwanted details and textures.
(b) Compute the gradient of the smoothed image using any of the gradient operators (Roberts, Sobel, Prewitt).
(c) Threshold the gradient magnitude M.
(d) Suppress non-maxima pixels in the edges obtained above to thin the edge ridges.
(e) Threshold the previous result with two different thresholds T1 and T2 (where T1<T2) to obtain two binary images T1 and T2.
(f) Link edge segments in T2 to form continuous edges. To do so, trace each segment in T2 to its end and then search its neighbors in T1 to find any edge segment in T1 that bridges the gap, until another edge segment in T2 is reached.

E. FEATURE EXTRACTION USING GLCM
The process of transforming the input data into a set of features is called feature extraction. Features often contain information relating to color, shape, texture or context. Texture feature extraction helps in the segmentation or classification of images; texture features can be extracted using statistical methods.
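As an illustration of statistical texture extraction with a GLCM, the following Python sketch builds a normalized co-occurrence matrix for horizontally adjacent pixels and derives entropy, contrast, homogeneity and correlation from it. It is a minimal sketch under stated assumptions, not the implementation used in this work: the function names `glcm` and `texture_features` and the toy image are hypothetical, the image is assumed to be already quantized to integer gray levels in the range 0 to levels-1 with at least two columns, and correlation is computed with separate row and column means and standard deviations, which coincide with a single mean and variance when the matrix is symmetric.

```python
import math


def glcm(image, levels):
    """Return the normalized co-occurrence matrix of `image`.

    Counts how often gray level i has gray level j as its immediate
    right-hand neighbour, then divides by the total number of pairs.
    Assumes integer gray levels in [0, levels) and at least two columns.
    """
    counts = [[0] * levels for _ in range(levels)]
    for row in image:
        for left, right in zip(row, row[1:]):
            counts[left][right] += 1
    total = sum(sum(r) for r in counts)
    return [[c / total for c in r] for r in counts]


def texture_features(p):
    """Compute four texture measures from a normalized GLCM `p`."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]

    # Entropy uses the natural logarithm; zero entries contribute nothing.
    entropy = -sum(v * math.log(v) for _, _, v in cells if v > 0)
    contrast = sum(v * (i - j) ** 2 for i, j, v in cells)
    homogeneity = sum(v / (1 + (i - j) ** 2) for i, j, v in cells)

    # Row and column statistics (illustrative choice, see lead-in).
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum(v * (i - mu_i) ** 2 for i, _, v in cells)
    var_j = sum(v * (j - mu_j) ** 2 for _, j, v in cells)
    cov = sum(v * (i - mu_i) * (j - mu_j) for i, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    # A constant image has zero variance; report 0.0 instead of dividing by zero.
    correlation = cov / denom if denom > 0 else 0.0

    return {
        "entropy": entropy,
        "contrast": contrast,
        "homogeneity": homogeneity,
        "correlation": correlation,
    }


if __name__ == "__main__":
    # Made-up 4x4 image with 4 gray levels, for demonstration only.
    toy_image = [
        [0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 2, 2, 2],
        [2, 2, 3, 3],
    ]
    for name, value in texture_features(glcm(toy_image, levels=4)).items():
        print(f"{name}: {value:.4f}")
```

The resulting four numbers form one feature vector per leaf image, which is the kind of input a classifier such as SVM or KNN would consume. In practice the green channel would first be quantized to a small number of gray levels so that the matrix stays compact.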
Statistical methods characterize the texture indirectly, according to the non-deterministic properties that govern the relationships between the gray levels of an image. The GLCM functions characterize the texture of an image by calculating how often pairs of pixels with specific values and in a specified spatial relationship occur in the image, creating a GLCM, and then extracting statistical measures from this matrix.
GLCM helps to reveal certain properties of the spatial distribution of the gray levels in the texture image. GLCM is effective compared to wavelet texture analysis and multivariate statistical analysis based on PCA (Principal Component Analysis). The gray co-matrix function is used to create the GLCM by calculating how often a pixel with intensity (gray-level) value i occurs in a specific spatial relationship to a pixel with value j. The spatial relationship is defined as the pixel of interest and the pixel to its immediate right (horizontally adjacent).

GLCM (Gray Level Co-occurrence Matrix) features:
- Energy
- Entropy
- Contrast
- Homogeneity
- Correlation
- Shade
- Prominence

Fig. 4. Process used to create the GLCM [8]

We will use the following 4 features:

1. Entropy [9]
In an image, entropy is defined in terms of the states of intensity level that individual pixels can adopt. It is used in the quantitative analysis and evaluation of image details; the entropy value is used because it provides a better comparison of image details.
$$\text{Entropy} = -\sum_{i,j} P_{ij} \ln P_{ij} \tag{3}$$

2. Contrast [9]
Measures the local variations in the gray-level co-occurrence matrix. Contrast is the difference in visual properties that makes an object distinguishable from other objects and the background.
$$\text{Contrast} = \sum_{i,j} P_{ij}\,(i - j)^{2} \tag{4}$$

3. Homogeneity [9]
Measures the closeness of the distribution of elements in the GLCM to the GLCM diagonal.
$$\text{Homogeneity} = \sum_{i,j} \frac{P_{ij}}{1 + (i - j)^{2}} \tag{5}$$

4. Correlation [9]
Measures the joint probability of occurrence of the specified pixel pairs.
$$\text{Correlation} = \sum_{i,j} \frac{P_{ij}\,(i - \mu)(j - \mu)}{\sigma^{2}} \tag{6}$$
Here $P_{ij}$ is the (i, j) entry of the normalized GLCM, and $\mu$ and $\sigma^{2}$ are the mean and variance of the gray levels under $P$.

F. CLASSIFICATION
i. SVM (Support Vector Machine) [10]
Definition:
A support vector machine (SVM) is a machine learning algorithm that analyzes data for classification and regression analysis. SVMs are based on the idea of finding a hyperplane that best divides a dataset into two classes.
Types of SVM:
(a) Linear Kernel SVM
(b) Polynomial Kernel SVM

(a) Linear Kernel SVM
The dot product is called the kernel and can be rewritten as:
$$K(x, x_i) = \sum (x \cdot x_i)$$
where $x$ and $x_i$ are the inputs. The kernel defines the similarity or distance measure between new data and the support vectors. The dot product is the similarity measure used for linear SVM because the distance is a linear combination of the inputs. It is desirable to use more complex kernels, as they allow the classes to be separated by boundaries that are curved or even more complex. This can lead to more accurate classifiers.

(b) Polynomial Kernel SVM
Instead of the dot product, we can use a polynomial kernel, for example:
$$K(x, x_i) = 1 + \left(\sum x \cdot x_i\right)^{d}$$
where the degree $d$ of the polynomial must be specified by hand. When $d = 1$ this is the same as the linear kernel. The polynomial kernel allows for curved lines in the input space.

ii. KNN (k-Nearest Neighbors) [10]
KNN is a non-parametric method for classification. The prediction for a test sample is made on the basis of its neighbors. k is a small integer; if k = 1, the sample is assigned to the class of its single nearest neighbor.
Similarity: calculated using a distance measure such as Euclidean distance.
$$d(p, q) = \sqrt{(q_1 - p_1)^{2} + (q_2 - p_2)^{2} + \dots + (q_n - p_n)^{2}} \tag{7}$$

KNN algorithm [11]
(a) Determine the parameter k, the number of nearest neighbors.
(b) Calculate the distance between the query instance and all the training samples.
(c) Sort the distances and determine the nearest neighbors based on the k-th minimum distance.
(d) Gather the categories of the nearest neighbors.
(e) Use the simple majority of the categories of the nearest neighbors as the prediction value of the query instance.

V. CONCLUSION
An image-processing-based solution is proposed and evaluated for the detection and classification of mango diseases. Anthracnose, Powdery Mildew and Red rust, the major diseases affecting the mango crop, have been considered. The proposed approach is composed of three main steps. In the first step, image segmentation is performed using Canny edge detection. In the second step, features are extracted. In the third step, classification is performed using SVM and KNN classifiers. The existing system uses an SVM classifier; the proposed system will compare the results of the SVM and KNN classifiers, which would help in achieving better accuracy. This system would encourage Indian farmers to adopt smart farming, which helps them take timely decisions, saves time and reduces the loss of mango fruit due to diseases. The leading objective of our system is to improve disease detection for mango.

VI. ACKNOWLEDGMENT
We would like to express profound gratitude to our project guide Mr. Swapnil Shinde and project coordinator Mrs. Reshma Gulwani of the B.E. Program for their invaluable support, encouragement, supervision and useful suggestions throughout this project work. We are very grateful for the valuable cooperation and constant encouragement from our Head of Department, Dr. Ashish Jadhav.

VII. REFERENCES
[1] Bed Prakash, Amit Yerpude, "Identification of Mango Leaf Disease and Control Prediction using Image Processing and Neural Network," IJSRD - International Journal for Scientific Research & Development, Vol. 3, Issue 05, 2015 | ISSN (online): 2321-0613

[2] S. B. Ullagaddi, Dr. S. Viswanadha Raju, "Disease Recognition in Mango Crop Using Modified Rotational Kernel Transform Features," 2017 International Conference on Advanced Computing and Communication Systems (ICACCS-2017), Jan. 06-07, 2017, Coimbatore

[3] Jayaprakash Sethupathy, Veni S, "OpenCV Based Disease Identification of Mango Leaves," International Journal of Engineering and Technology (IJET)

[4] S. B. Ullagaddi, Dr. S. Viswanadha Raju, "Automatic Robust Segmentation Scheme for Pathological Problems in Mango Crop," I.J. Modern Education and Computer Science, 2017, 1, 43-51. Published Online January 2017 in MECS. DOI: 10.5815/ijmecs.2017.01.05

[5] …i/

[6] …i/Gaussian_filter

[7] …c/119044470/High-Boost-Filtering

[8] …level-co-occurrence-matrix.html

[9] …m

[10] https://machinelearningmas…machines-for-machine-learning/

[11] …com/kardi/tutorial/KNN/KNN_Numerical-example.html
