Submitted by:
V. DHYANA MALIKA 20KP1A05A5
T. SAI CHANDRA 20KP1A05A2
Y. SRUTHI ANASURYA 20KP1A05A9
U. ANJUNATH 20KP1A05A3
N. BHAGYA LAKSHMI 20KP1A0568
A. MANOHAR 20KP1A0503

Plant Disease Detection Using Image Processing and Machine Learning

Abstract: One of the important and tedious tasks in agricultural practice is the detection of disease on crops. It requires a great deal of time as well as skilled labor. This paper proposes a smart and efficient technique for the detection of crop disease which uses computer vision and machine learning techniques. The proposed system is able to detect 20 different diseases of 5 common plants with 93% accuracy.

Keywords: Digital image processing, Foreground detection, Machine learning, Plant disease detection

1 Introduction
In India about 70% of the populace relies on agriculture. Identification of plant diseases is important in order to prevent losses in yield. It is very difficult to observe plant diseases manually. It needs a tremendous amount of labor, expertise in plant diseases, and an excessive amount of time. Hence, image processing and machine learning models can be employed for the detection of plant diseases. In this project, we have described the technique for the detection of plant diseases with the help of pictures of their leaves. Image processing is a branch of signal processing which can extract image properties or useful information from an image. Machine learning is a branch of artificial intelligence in which a system learns to perform a particular task automatically from data instead of following explicit instructions. The main aim of machine learning is to understand the training data and fit that training data into models that should be useful to people. So it can assist in making good decisions and predicting the correct output using the large amount of training
data. The color of the leaves, the amount of damage to the leaves, the area of the leaf, and texture parameters are used for classification. In this project we have analyzed different image parameters or features to identify different plant leaf diseases and achieve the best accuracy. Previously, plant disease detection was done by visual inspection of the leaves or by chemical processes carried out by experts. Doing so needs a large team of experts as well as continuous observation of the plants, which is costly for large farms. In such conditions, the recommended system proves to be helpful in monitoring large fields of crops. Automatic detection of the diseases by simply seeing the symptoms on the plant leaves makes it easier as well as cheaper. The proposed solution for plant disease detection is computationally less expensive and requires less time for prediction than other deep learning based approaches, since it uses statistical machine learning and image processing algorithms.

2 Literature Review
In 2015, S. Khirade et al. tackled the problem of plant disease detection using digital image processing techniques and a back propagation neural network (BPNN) [1]. The authors elaborated different techniques for the detection of plant disease using images of leaves. They implemented Otsu's thresholding followed by boundary detection and spot detection algorithms to segment the infected part of the leaf. After that they extracted features such as color, texture, morphology and edges for the classification of plant disease. The BPNN is used for classification, i.e. to detect the plant disease.

Shiroop Madiwalar and Medha Wyawahare analyzed different image processing approaches for plant disease detection in their research [2]. The authors analyzed color and texture features for the detection of plant disease. They tested their algorithms on a dataset of 110 RGB images.
The features extracted for classification were the mean and standard deviation of the RGB and YCbCr channels, grey level co-occurrence matrix (GLCM) features, and the mean and standard deviation of the image convolved with a Gabor filter. A support vector machine classifier was used for classification. The authors concluded that GLCM features are effective for detecting normal leaves, whereas color features and Gabor filter features are considered best for detecting anthracnose affected leaves and leaf spot respectively. They achieved a highest accuracy of 83.34% using all the extracted features.

Peyman Moghadam et al. demonstrated the application of hyperspectral imaging to the plant disease detection task [3]. Visible and near-infrared (VNIR) and short-wave infrared (SWIR) spectrums were used in this research. The authors used the k-means clustering algorithm in the spectral domain for the segmentation of the leaf. They proposed a novel grid removal algorithm to remove the grid from hyperspectral images. The authors achieved an accuracy of 83% with vegetation indices in the VNIR spectral range and 93% accuracy with the full spectrum. Though the proposed method achieved higher accuracy, it requires a hyperspectral camera with 324 spectral bands, so the solution becomes too costly.

Sharath D. M. et al. developed a Bacterial Blight detection system for the Pomegranate plant using features such as color, mean, homogeneity, SD, variance, correlation, entropy and edges. The authors implemented grab cut segmentation for segmenting the region of interest in the image [4]. The Canny edge detector was used to extract the edges from the images. The authors successfully developed a system which can predict the infection level in the fruit.

Garima Shrestha et al. deployed a convolutional neural network to detect plant disease [5]. The authors successfully classified plant diseases with 88.80% accuracy. A dataset of 3000 high resolution RGB images was used for experimentation. The network has 3 blocks of convolution and pooling layers. This makes the network computationally expensive.
Also, the F1 score of the model is 0.12, which is very low because of a higher number of false negative predictions.

3 Methodology

3.1 Dataset
For this project we have used a public dataset for plant leaf disease detection called PlantVillage, curated by Sharada P. Mohanty et al. [6]. The dataset consists of 87000 RGB images of healthy and unhealthy plant leaves in 38 classes, out of which we have selected only 25 classes for experimentation with our algorithm. These classes are shown in Table 1.

Table 1. Dataset specifications

Plant    Disease name                              No. of images
Apple    Healthy
         Diseased: Scab                            2016
         Diseased: Black rot                       1987
         Diseased: Cedar apple rust                1760
Corn     Healthy                                   1859
         Diseased: Cercospora leaf spot            1642
         Diseased: Common rust                     1907
         Diseased: Northern Leaf Blight            1908
Grapes   Healthy                                   1692
         Diseased: Black rot                       1888
         Diseased: Esca (Black Measles)            1930
         Diseased: Leaf blight (Isariopsis)
Potato   Healthy                                   1824
         Diseased: Early blight                    1939
         Diseased: Late blight                     1939
Tomato   Healthy
         Diseased: Bacterial spot
         Diseased: Early blight
         Diseased: Late blight
         Diseased: Leaf Mold
         Diseased: Septoria leaf spot
         Diseased: Two-spotted spider mite
         Diseased: Target Spot
         Diseased: Yellow Leaf Curl Virus
         Diseased: Tomato mosaic virus

Some samples from the dataset are shown in Fig. 1.

Fig. 1. Sample images in the dataset [6]

3.2 Data preprocessing and feature extraction
Data preprocessing is an important task in any computer vision based system. Fig. 2 illustrates the preprocessing steps for each image. To get precise results, background noise should be removed before the extraction of features. So first the RGB image is converted to greyscale and then a Gaussian filter is used to smooth the image. Then, to binarise the image, Otsu's thresholding algorithm is applied. Then a morphological transform is applied on the binarised image to close the small holes in the foreground part.
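
The binarisation step can be illustrated with a minimal sketch of Otsu's method, which picks the grey level that maximises the between-class variance of the two resulting pixel groups. This is not the authors' implementation, which the paper does not show. The function names otsu_threshold and binarise, the toy image, and the convention that pixels above the threshold become 255 are all assumptions made for illustration. Gaussian smoothing and the morphological closing are left out to keep the sketch short.

```python
def otsu_threshold(gray):
    """Return the grey level (0-255) that maximises between-class variance.

    `gray` is a greyscale image given as a list of rows of integers in 0..255.
    Pixels <= t form one class, pixels > t the other.
    """
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_var = 0, -1.0
    weight_low = 0   # number of pixels with value <= t
    sum_low = 0      # sum of grey levels of those pixels
    for t in range(256):
        weight_low += hist[t]
        sum_low += t * hist[t]
        if weight_low == 0:
            continue            # no pixels in the lower class yet
        weight_high = total - weight_low
        if weight_high == 0:
            break               # every pixel is already in the lower class
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        between_var = weight_low * weight_high * (mean_low - mean_high) ** 2
        if between_var > best_var:
            best_var, best_t = between_var, t
    return best_t


def binarise(gray, t):
    """Map pixels above the threshold to 255 and the rest to 0 (assumed convention)."""
    return [[255 if value > t else 0 for value in row] for row in gray]


# Toy 3x4 greyscale image with hypothetical values: a dark group and a bright group.
toy = [
    [10, 12, 11, 200],
    [9, 205, 210, 198],
    [11, 10, 202, 207],
]
t = otsu_threshold(toy)
mask = binarise(toy, t)
print("threshold:", t)
for row in mask:
    print(row)
```

On a real leaf photograph, which of the two classes is the leaf depends on the background, so the mask may need to be inverted before it is combined with the color image.
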
Now, after foreground detection, a bitwise AND operation on the binarised image and the original color image is performed to get the RGB image of the segmented leaf. After image segmentation, shape, texture and color features are extracted from the image. By using contours, the area and the perimeter of the leaf are calculated. Contours are the lines that join all the points along the edges of objects having the same color or intensity. The mean and standard deviation of each channel in the RGB image are also estimated. To obtain the amount of green color in the image, the image is first converted to HSV color space, and we calculate the ratio of the number of pixels whose hue (H) channel intensity lies between 30 and 70 to the total number of pixels in one channel. The non-green part of the image is calculated by subtracting the green part from 1. After extracting color features from the image, we have extracted texture features from the grey level co-occurrence matrix (GLCM) of the image [7].

Fig. 2. Steps for data preprocessing and feature extraction. The flowchart runs: original image, greyscale conversion, smoothening using Gaussian filter, Otsu's thresholding, morphological transform, bitwise AND operation with the original frame. Its outputs are color features, shape features, texture features (through the grey level co-occurrence matrix) and the green part in the leaf (through HSV color space conversion).

The GLCM captures the spatial relationship of pixels in the image. Extracting texture features from the GLCM is one of the traditional methods in computer vision. We have extracted the following features from the GLCM:
• Contrast
• Dissimilarity
• Homogeneity
• Energy
• Correlation
After extracting all the features from all the images in the dataset, the feature selection task is performed.

3.3 Feature selection
Feature selection is an important step in all machine learning problems. In this project we select the features on the basis of the correlation of the variables with the target variable. Fig. 3 shows the correlation of each variable with every other variable for the Apple dataset.
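
A correlation plot of this kind is typically built from pairwise Pearson coefficients. The sketch below is an illustration only: the paper does not state which correlation measure was used, so Pearson is an assumption, and the function name pearson and all feature values are hypothetical. It shows two things that matter for the selection step: a feature derived as one minus another is perfectly (negatively) correlated with it, and each feature can be scored against the target.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant feature carries no linear relationship
    return cov / (spread_x * spread_y)


# Hypothetical feature values for five leaves.
green_part = [0.82, 0.40, 0.65, 0.30, 0.91]
non_green_part = [1 - g for g in green_part]   # derived feature
label = [0, 1, 0, 1, 0]                        # 0 = healthy, 1 = diseased (toy target)

r_pair = pearson(green_part, non_green_part)
r_target = pearson(green_part, label)
print("green vs non-green:", round(r_pair, 6))
print("green vs target:", round(r_target, 6))
```

A pair of features whose coefficient has magnitude close to 1 is redundant, so one of the two can be dropped; a feature whose coefficient with the target is close to 0 is a candidate for removal as well.
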
The correlation between the green part of the leaf (F1) and the non-green part of the leaf (F2) is very high (magnitude 1), because the non-green part is computed as one minus the green part. The two variables are therefore dependent on each other, so we have dropped one of them (F2). For apple disease prediction, weakly correlated features such as green channel mean, red channel standard deviation, blue channel standard deviation, dissimilarity (f5) and correlation (f8) will not contribute much to model development, so we have dropped these variables as well. After feature selection the data is passed to machine learning classifiers to find the patterns in the data.

Fig. 3. Correlation plot for Apple dataset.

3.4 Classification Algorithm
A random forest classifier has been used for the classification or detection task. It is a form of ensemble learning, where the output is predicted from multiple base estimators [8]. Decision trees are generally used to achieve high accuracy, but they are prone to overfitting. To overcome this issue, a random forest classifier is used, which is a combination of multiple decision trees. Each tree is trained on a different subset of the whole dataset; this reduces overfitting and improves the accuracy of the classifier. We have split the dataset into a train set (80%) for fitting the model and a test set (20%) for validation. The K-fold cross validation technique is used to find the accuracy score. This method estimates the accuracy over the whole dataset rather than over a single split, which reduces the bias of the estimate. After fitting the data, F1 score, precision, recall and accuracy have been calculated from the test data to analyze the performance of the model. ROC curves and confusion matrices were plotted to analyze false positives and false negatives.

4 Results and discussion
Table 2 shows the performance metrics for each model developed for each of the plants. We can observe that the accuracy scores are nearly equal to the F1 scores. This is because of the balanced number of false negative and false positive predictions. This is considered the best case for any machine learning algorithm. The average accuracy was 93%.
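
The link between balanced errors and matching scores can be checked with a short calculation. The sketch below is a binary-class simplification (the models in this paper are multi-class), and the function name classification_metrics and all counts are hypothetical. When false positives equal false negatives, precision, recall and F1 coincide; in the first toy case accuracy matches them too because the true positive and true negative counts are also equal.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1 score from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: equal numbers of false positives and false negatives.
balanced = classification_metrics(tp=45, fp=5, fn=5, tn=45)

# Hypothetical counts: many more false negatives than false positives.
skewed = classification_metrics(tp=30, fp=2, fn=20, tn=48)

for name, scores in (("balanced", balanced), ("skewed", skewed)):
    print(name, {key: round(value, 4) for key, value in scores.items()})
```

In the skewed case the F1 score falls below the accuracy, which is the pattern the literature review attributes to a high number of false negatives.
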
Table 2. Performance metrics for all models.

Plant      Apple   Corn   Grapes   Potato   Tomato
Accuracy   0.91    0.94   0.95     0.98     0.87
F1 score   0.91    0.94   0.95     0.98     0.87

Fig. 4 shows the confusion matrices for each of the models. With the help of confusion matrices, the number of false negatives, false positives and true predictions can be analyzed. Fig. 5 shows the receiver operating characteristic (ROC) curve for each of the models. An ROC curve is a graph showing the performance of a classification model at all classification thresholds. It depends upon two parameters, the true positive rate and the false positive rate.

Fig. 4. Confusion matrices for all the models.

Fig. 5. ROC curves for all the models.

We have developed a Flask based web application for detecting plant disease and deployed it on Heroku (a free cloud hosting server). Fig. 6 shows the homepage of the deployed web application and Fig. 7 shows the input images and the corresponding predictions made by our system. It shows that the system successfully detected the disease of the leaf.

Fig. 6. Homepage of deployed API.

Beyond the web application, we can deploy an intelligent robot vehicle with a high end processor attached to it for real-time plant disease detection. Such a system can detect the diseased plants on the agricultural site. We can even automate the process of spreading fertilizers by using such robots. Our proposed algorithm is computationally inexpensive, so it can detect plant disease efficiently. Also, farmers sometimes cannot identify the disease of a plant themselves and need expert advice. So we can deploy a website which detects the plant disease from images captured and uploaded by the farmer and gives suggestions, or suggests fertilizers, based on the detected disease.
Fig. 7. Input images and outputs generated by system (predicted classes shown: Apple__Cedar_apple_rust, Tomato__Early_blight).

5 Conclusion
We have successfully developed a computer vision based system for plant disease detection with an average accuracy of 93% and an F1 score of 0.93. Also, the proposed system is computationally efficient because of the use of statistical image processing and a machine learning model. Table 3 illustrates the overall benefits of our system over the other approaches.

Table 3. Comparison of proposed system with other existing systems.

Author                           Algorithms                                  Accuracy  Computationally  Specialized hardware
                                                                                       efficient        requirement
S. Khirade et al. (2015)         Digital image processing and BPNN           -                          No
Shiroop Madiwalar et al. (2017)  Digital image processing and SVM            83.34%    Yes              No
Peyman Moghadam et al. (2017)    Hyperspectral imaging and SVM               93%       No               Yes
Sharath et al. (2019)            Digital image processing                    -         Yes              No
Garima Shrestha et al. (2020)    CNN                                         88.80%    No               No
Proposed method                  Digital image processing and random forest  93%       Yes              No

We can observe that our technique is accurate and efficient compared with the other systems. Also, it does not require specialized hardware, which makes it a cost-effective solution.

References
1. S. D. Khirade and A. B. Patil, "Plant Disease Detection Using Image Processing," 2015 International Conference on Computing Communication Control and Automation, 2015, pp. 768-771, doi: 10.1109/ICCUBEA.2015.153.
2. S. C. Madiwalar and M. V. Wyawahare, "Plant disease identification: A comparative study," 2017 International Conference on Data Management, Analytics and Innovation (ICDMAI), 2017, pp. 13-18, doi: 10.1109/ICDMAI.2017.8073478.
3. P. Moghadam, D. Ward, E. Goan, S. Jayawardena, P. Sikka and E. Hernandez, "Plant Disease Detection Using Hyperspectral Imaging," 2017 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2017, pp. 1-8, doi: 10.1109/DICTA.2017.8227476.
4. S. D. M., Akhilesh, S. A. Kumar, R. M. G. and P. C., "Image based Plant Disease Detection in Pomegranate Plant for Bacterial Blight," 2019 International Conference on Communication and Signal Processing (ICCSP), 2019, pp. 0645-0649, doi: 10.1109/ICCSP.2019.8698007.
5. G. Shrestha, Deepsikha, M. Das and N. Dey, "Plant Disease Detection Using CNN," 2020 IEEE Applied Signal Processing Conference (ASPCON), 2020, pp. 109-113, doi: 10.1109/ASPCON49795.2020.9276722.
6. Mohanty SP, Hughes DP and Salathé M (2016) Using Deep Learning for Image-Based Plant Disease Detection. Front. Plant Sci. 7:1419. doi: 10.3389/fpls.2016.01419
7. R. M. Haralick, K. Shanmugam and I. Dinstein, "Textural Features for Image Classification," in IEEE Transactions on Systems, Man, and Cybernetics, vol. SMC-3, no. 6, pp. 610-621, Nov. 1973, doi: 10.1109/TSMC.1973.4309314.
8. Breiman, L. Random Forests. Machine Learning 45, 5-32 (2001). https://doi.org/10.1023/A:1010933404324

