DILLA, ETHIOPIA
JUNE 2024
Declaration
We, the undersigned, hereby declare that the project titled "Potato Disease
Classification and Detection Using Deep Learning" is our original work, carried out
under the supervision of Mr. Darara D. (MSc.), at Dilla University, College of
Engineering and Technology, School of Electrical and Computer Engineering,
Department of Computer Engineering. This project is submitted in partial fulfillment
of the requirements for the degree of Bachelor of Science in Computer Engineering.
We affirm that the work presented in this project is our own and has not been
submitted previously for any degree or examination at any other institution. We have
properly acknowledged all sources used in the preparation of this project.
Abstract
In this project, we present a deep learning-based approach for the classification and
detection of diseases in potato plants. Potato diseases pose a significant threat to
agricultural productivity, leading to substantial economic losses. Early and accurate
detection is crucial for effective disease management and control. Leveraging
advancements in deep learning, this study aims to develop a robust and efficient
system capable of identifying common potato diseases from leaf images.
Our methodology involves the collection and preprocessing of a diverse dataset of
potato leaf images, covering various disease conditions such as early blight, late blight,
and healthy leaves. We employ convolutional neural networks (CNNs) to build and
train our model. The performance of the model is evaluated using key metrics
including accuracy, precision, recall, and F1-score.
The results demonstrate that our deep learning model achieves high accuracy in
classifying and detecting potato diseases, outperforming traditional machine learning
methods. This system not only aids farmers and agricultural professionals in timely
disease identification but also contributes to reducing the reliance on expert
knowledge and manual inspection.
In conclusion, the proposed deep learning approach, integrated with a web application,
offers a promising solution for the automatic detection and classification of potato
diseases, potentially improving crop management practices and mitigating the impact
of plant diseases on agricultural yield. Future work will focus on expanding the
dataset, incorporating additional disease types, and enhancing model performance
through continuous learning and adaptation.
Acknowledgement
First and foremost, we express our deepest gratitude to God, the Creator, for His
guidance, wisdom, and blessings throughout our academic journey and the completion
of this thesis project. Without His divine support and inspiration, this work would not
have been possible.
We would like to extend our heartfelt thanks to our academic advisor, Mr. Darara D.
(MSc.), for his invaluable guidance, continuous support, and encouragement
throughout the duration of this project. His expertise and insights were instrumental in
shaping the direction and outcome of our research.
Finally, we thank everyone who supported and contributed to this work.
Table of Contents
Declaration
Abstract
Acknowledgement
List of Tables
ACRONYMS
CHAPTER ONE
INTRODUCTION
1.1 Background of the Project
1.2 Statement of the Problem
1.3 Objectives
1.4 Methodology
1.5 Significance of the Project
1.6 Scope of the Project
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
2.2 Theoretical reviews
2.3 Summary of related works
CHAPTER THREE
METHODOLOGY AND MATERIALS USED
3.1 Introduction
3.2 Overall Methodology and Process
3.3 Model Architecture
3.4 Materials and Experimental Setup
CHAPTER FOUR
RESULTS AND DISCUSSIONS
4.1 Training the model
4.2 Evaluation Metrics Analysis
4.3 Training and validation Accuracy Analysis
CHAPTER FIVE
CONCLUSION AND RECOMMENDATION
5.1 Conclusion
5.2 Limitations
5.3 Recommendation
Reference
APPENDIX
List of Figures
Fig. 1. Proposed methodology
Fig. 2. Early blight disease of potato leaf
Fig. 3. Late blight disease of potato leaf
Fig. 4. Healthy potato leaf
Fig. 5. CNN Architecture
Fig. 6. CNN detailed Architecture
Fig. 7. Confusion matrix report of the proposed model
Fig. 8. Training, validation, and loss for 10 epochs
Fig. 9. Training, validation, and loss for 30 epochs
Fig. 10. Training, validation, and loss for 50 epochs
Fig. 11. Training, validation, and loss for 50 epochs
List of Tables
Table 1: Summary of Literature Reviews
Table 2: Training Parameters
Table 3: Hardware/Software Characteristic
Table 4: Classification report of proposed model
Table 5: Validation results for classification
ACRONYMS
API Application Programming Interface
CI/CD Continuous Integration/Continuous Deployment
CNN Convolutional Neural Network
CVL Convolutional Layer
DIP Digital Image Processing
DL Deep Learning
DOM Document Object Model
EB Early Blight
FC Fully Connected
GCF Google Cloud Function
GCP Google Cloud Platform
HTML Hypertext Markup Language
HTTP Hypertext Transfer Protocol
JS JavaScript
LB Late Blight
ML Machine Learning
MP Max Pooling
PC Personal Computer
PLD Potato Leaf Disease
RAM Random Access Memory
ReLU Rectified Linear Unit
RNN Recurrent Neural Network
UI User Interface
CHAPTER ONE
INTRODUCTION
only reduces the reliance on human expertise but also enables early disease detection,
facilitating timely interventions and improved crop management practices[4].
Furthermore, the development of robust deep learning models for potato disease
detection and classification can contribute to sustainable agriculture practices by
minimizing the use of chemical pesticides and fertilizers. By enabling targeted
interventions based on real-time disease monitoring, farmers can optimize resource
allocation, reduce environmental impact, and enhance overall crop health and
productivity.
In summary, the integration of deep learning techniques into potato disease detection
and classification represents a promising approach to addressing the challenges faced
by the potato industry. Through automated and accurate disease diagnosis, deep
learning has the potential to revolutionize potato cultivation practices, ultimately
ensuring food security and sustainability in a rapidly changing world.
the development of a deep learning-based solution for potato disease detection and
classification is imperative to address these pressing issues and improve agricultural
practices.
1.3 Objectives
The general objective of this project is to develop an automated system for potato
disease detection and classification using deep learning techniques.
1.4 Methodology
Data Collection
i. Data Source: The dataset used for this project consists of images of potato leaves
in three conditions: early blight, late blight, and healthy. The images were
collected from various sources, including the PlantVillage dataset.
ii. Data Preprocessing: The collected images were preprocessed to enhance their
quality and reduce noise. This included resizing the images to a uniform size of
256 × 256 pixels, normalizing the pixel values to a common range, and applying data
augmentation techniques such as rotation, flipping, and zooming to increase the
dataset size.
Model Architecture Design
Convolutional Neural Network (CNN) Architecture: The CNN architecture used
for this project is a custom (standard) model that consists of several
convolutional and pooling layers.
The architecture includes a series of convolutional layers with different filter
sizes and activation functions, followed by max-pooling layers to reduce the
spatial dimensions of the input data.
Model Training
The CNN model was trained on the preprocessed dataset using the Adam optimizer
with a learning rate of 0.001 and a batch size of 32. The model was trained for 50
epochs, and the validation accuracy was monitored to prevent overfitting.
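To make the role of the optimizer more concrete, the following is a minimal sketch of a single Adam update applied to one scalar parameter. It is only an illustration, not the training code used in this project: the toy loss (w - 3)^2, the function names, and the values beta1 = 0.9, beta2 = 0.999, and eps = 1e-8 (commonly used defaults) are assumptions; only the learning rate of 0.001 comes from the text above.

```python
import math


def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Apply one Adam update to a single scalar parameter w."""
    m = beta1 * m + (1 - beta1) * grad         # running mean of the gradients
    v = beta2 * v + (1 - beta2) * grad * grad  # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)               # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


def minimise_toy_loss(steps, w=0.0):
    """Run Adam on the toy loss (w - 3)^2, whose gradient is 2 * (w - 3)."""
    m = v = 0.0
    for t in range(1, steps + 1):
        grad = 2.0 * (w - 3.0)
        w, m, v = adam_step(w, grad, m, v, t)
    return w


w_after_one = minimise_toy_loss(1)
w_after_hundred = minimise_toy_loss(100)
print(w_after_one, w_after_hundred)
```

In this sketch the first step moves the parameter by roughly the learning rate, regardless of the gradient's magnitude, which is one reason a small value such as 0.001 is a common starting point.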
Model Evaluation
The trained model's performance was evaluated on the validation set using appropriate
metrics such as accuracy, precision, recall, and F1-score. The metrics were calculated
for each disease class (early blight, late blight, and healthy) and for the overall model.
The model parameters were then fine-tuned based on the evaluation results to improve
performance.
Model Deployment
A user-friendly interface (e.g., a web application) was developed for deploying the
trained model. The model is integrated into the interface so that users, such as farmers
or agriculture professionals, can upload potato images and receive real-time disease
detection and classification results.
Results
Accuracy: The proposed CNN model achieved an accuracy of 99% on the test
dataset, outperforming the state-of-the-art models.
Precision, Recall, and F1-Score: The model achieved a precision of 1, a recall of
1, and an F1-score of 1 for the overall model.
By following this methodology, the project aims to develop an accurate and efficient
system for potato disease detection and classification using deep learning techniques,
ultimately benefiting the agricultural sector and promoting sustainable crop
management practices.
cultivation is susceptible to various diseases that can drastically reduce yield and
quality, leading to substantial economic losses. Traditional methods of disease
detection rely on manual inspection by experts, which is time-consuming, labor-
intensive, and often prone to human error. Deep learning, a subset of artificial
intelligence, offers a robust solution by automating and significantly improving the
accuracy of disease identification. Through the use of convolutional neural networks
(CNNs) and other advanced algorithms, deep learning models can analyze large
datasets of potato plant images, learning to identify disease patterns with high
precision. This technology enables real-time monitoring and early detection of
diseases, allowing farmers to take swift and targeted actions to mitigate damage.
Additionally, the scalability of deep learning systems makes them accessible to
farmers on a global scale, including those in remote or resource-limited areas.
Ultimately, the implementation of deep learning for potato disease classification and
detection not only enhances crop management and productivity but also contributes to
sustainable agricultural practices, reduces dependency on chemical treatments, and
supports the broader goal of ensuring food security in the face of growing population
demands and climate challenges.
Once a high-performing model is achieved, we integrate it into a user-friendly web
application. This web app allows users to upload images of potato plants and receive
immediate diagnostic results, displaying whether the plant is affected by late blight,
early blight, or is healthy, along with actionable recommendations. The web app is
designed to be accessible across various devices, ensuring ease of use for farmers and
agricultural professionals. Additionally, the application supports continuous learning,
enabling users to contribute new images and feedback to further refine and adapt the
model over time. Through this comprehensive scope, our project aims to deliver a
practical, scalable solution that enhances potato disease management and supports
sustainable agricultural practices.
CHAPTER TWO
LITERATURE REVIEW
2.1 Introduction
Numerous studies have proposed methods for automatically identifying plant diseases.
Each work, however, addresses particular plants and particular parts of a plant. This
section therefore reviews relevant research on deep learning and machine learning
systems for automated plant disease diagnosis.
over the course of 10 epochs. These results suggest that the model is quite effective in
learning and identifying patterns within the dataset, though there is a noticeable gap
between training and validation accuracy which could hint at slight overfitting or
room for improvement in generalization[7].
In contrast, our proposed CNN model shows enhanced performance metrics, with an
accuracy of 99% and a validation accuracy of 98%. This model was trained with a
more extensive dataset division, utilizing 80% for training and 20% for validation,
and over a significantly longer period of 50 epochs. The higher accuracy and
validation accuracy indicate that our model has a better generalization capability and
robustness compared to the model in [7]. The increased training data and
extended epochs likely contributed to the model's superior performance, allowing it to
learn more intricate patterns and nuances in the dataset, thus improving its overall
predictive power.
Islam et al. (2020) developed a system that combines image segmentation with machine
learning to diagnose potato diseases from leaf images. The study used roughly 300
images drawn from the PlantVillage dataset. The authors first mask out the background
and the green portion of each leaf, so that only the regions of interest showing visible
disease symptoms are extracted. They then trained a multiclass support vector machine
(SVM) classifier on color and texture features to identify and distinguish between
diseases. Ten features (color and texture) were extracted from every leaf image in the
dataset; the statistical texture parameters, namely contrast, correlation, energy, and
homogeneity, were computed using the Gray Level Co-occurrence Matrix (GLCM). For the
experiment, the data was split into a training set of 180 images (60%) and a testing
set of 120 images (40%). The proposed model attained 95% accuracy in classifying
potato leaves into three categories: healthy, late blight affected, and early blight
affected. However, the study did not address automatic assessment of disease severity
or treatment recommendations. Its scope was also relatively narrow: it did not cover
the majority of potato leaf diseases, and the dataset did not include images captured
in real field environments[8].
In the paper "A Deep Learning Approach to Classify the Potato Leaf Disease", Md.
Ashraful Islam and Md. Hanif Sikder propose a deep learning approach using
convolutional neural networks (CNNs) to classify potato leaf diseases. The authors
aimed to develop a model that could assist farmers in identifying and taking action
against early and late blight, two common potato diseases.
The study achieved promising results. The model achieved 100% accuracy in the
testing phase for classifying healthy, early blight, and late blight potato leaves. The
authors trained the model on a dataset of 10,000 images and explored the impact of
different training epochs (iterations) on the model's performance. They found that 40
epochs yielded the best results.
However, the study has some limitations: it focuses only on training the model on the
dataset and does not deploy a web application to predict potato diseases[9].
Network
(CNN)
Algorithm
Author(s) | Title | Accuracy (%) | Method | Limitation
Islam, et al. (2020) | Detection of Potato Diseases Using Image Segmentation and Multiclass Support Vector Machine | 95 | GLCM for feature extraction and SVM for classification | They did not use real data
Md. Ashraful Islam, Md. Hanif Sikder (2022) | A Deep Learning Approach to Classify the Potato Leaf Disease | 99.80 | CNN | Not deployed with a web app to predict the diseases
CHAPTER THREE
This chapter provides a detailed overview of the approach taken to conduct the study
on potato disease detection and classification using deep learning techniques. It
outlines the methods employed for data collection, preprocessing, model development,
training, evaluation, deployment, and experimental setup. Additionally, it describes
the materials, tools, and resources utilized throughout the research process.
Data acquisition
Images of different resolutions and sizes were obtained from an open-access image
database on Kaggle. We collected 2,152 images in total, divided into three classes:
Early Blight, Late Blight, and Healthy[10].
Early Blight:
Early blight is a plant disease caused by the fungus Alternaria solani. Tiny black
dots grow into large, brown-to-black, round-to-ovoid lesions, which are sometimes
constrained by leaf veins but may also be related to lenticels. The underside of the
leaves then develops a black fungal growth. Tuber wilt in potatoes can also be brought
on by early blight. The disease starts to spread when temperatures exceed 26 °C. It
frequently appears when the plants' vigor is reduced by high-temperature drying or a
lack of fertilizer.
Fig. 3. Late blight disease of potato leaf
Healthy Leaf: A healthy leaf looks fresh and is not infected with any disease.
Random Cropping: Random sections of the image are cropped and resized to
augment the dataset.
Zooming: Random zooming in or out of the image to simulate varying
perspectives.
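The augmentation operations listed above can be illustrated with a small, dependency-free sketch. It treats an image as a nested list of pixel values and shows flipping, a 90-degree rotation, and random cropping. This is not the project's preprocessing code: the function names and the 3 × 3 toy image are hypothetical, and the crop is not resized back to the original size (resizing and zooming would normally be handled by an image library).

```python
import random


def horizontal_flip(image):
    """Mirror every row of the image from left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(column) for column in zip(*image[::-1])]


def random_crop(image, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position inside the image."""
    top = rng.randint(0, len(image) - crop_h)
    left = rng.randint(0, len(image[0]) - crop_w)
    return [row[left:left + crop_w] for row in image[top:top + crop_h]]


def augment(image, rng):
    """Return the original image together with three augmented variants."""
    return [
        image,
        horizontal_flip(image),
        rotate_90_clockwise(image),
        random_crop(image, len(image) - 1, len(image[0]) - 1, rng),
    ]


# Hypothetical 3 x 3 grayscale "image" used only for illustration.
toy_image = [[1, 2, 3],
             [4, 5, 6],
             [7, 8, 9]]
variants = augment(toy_image, random.Random(0))
print(len(variants), variants[1], variants[2])
```

Each original image yields several training samples in this way, which is how augmentation enlarges the effective dataset without collecting new photographs.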
Image classification
Deep learning (DL), also referred to as deep neural learning or deep neural networks,
is a subfield of machine learning (ML), which in turn is a component of artificial
intelligence (AI). As the word "deep" indicates, deep learning models contain more
layers than traditional machine learning models. Deep learning techniques have raised
the bar in several fields, including object detection, speech recognition, object
categorization, and image classification[11]. The convolutional neural network (CNN)
is one of the most popular classes of deep learning models, and CNNs have been used in
several studies to identify plant diseases based on the condition of the leaves. In
general, a CNN is made up of one or more convolutional layers organized into groups
according to function, each typically followed by a subsampling (pooling) layer. The
subsampling layers are frequently followed by one or more fully connected layers, as
in a standard neural network. Each feature map takes as input a set of features
located in a small neighborhood of the previous layer.
I. Input Layer:
Input images of potato leaves are fed into the network after being resized to a
standard size of 256 × 256 pixels. This ensures uniform input dimensions across all
images.
II. Convolutional Layers:
The network begins with a series of convolutional layers responsible for extracting
features from input images. Each convolutional layer comprises multiple filters
(kernels) that convolve over the input image to detect spatial patterns such as edges,
textures, and shapes. The number of filters and the size of the convolutional kernels
are hyperparameters that can be adjusted based on the complexity of the dataset and
the desired level of feature extraction.
III. Activation Function (ReLU):
After each convolutional layer, a Rectified Linear Unit (ReLU) activation function is
applied element-wise to introduce non-linearity into the network. ReLU activation
helps the model learn complex patterns by enabling it to model non-linear
relationships within the data.
IV. Pooling Layers:
Max pooling layers are interspersed between convolutional layers to downsample
feature maps and reduce computational complexity. Each pooling layer extracts the
maximum value within each pooling window, retaining important features while
discarding less relevant information. Pooling layers thus reduce spatial dimensions
and help control overfitting by abstracting spatial information.
V. Intermediate Convolutional Layers:
The network includes additional convolutional layers with increasing depth and
complexity to extract higher-level features from the input images. Deeper layers
capture more abstract representations of features learned from earlier layers, allowing
the network to encode complex patterns associated with potato diseases.
VI. Fully Connected Layers:
The global average pooling layer is followed by one or more fully connected layers
responsible for mapping extracted features to output classes (disease categories). Each
neuron in the final fully connected layer corresponds to one category, and the
activations of these neurons are combined to generate a probability distribution over
the disease classes. The final layer uses softmax activation to compute class
probabilities and generate predictions.
VII.Output Layer:
The output layer consists of neurons corresponding to the number of disease classes to
be classified. The output neuron with the highest probability indicates the predicted
disease class.
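As a rough illustration of how the layers described above transform their input, the sketch below applies ReLU and 2 × 2 max pooling to a tiny feature map and traces how the spatial size of a 256 × 256 input shrinks through repeated convolution and pooling blocks. The 3 × 3 kernel, the absence of padding, the stride values, and the choice of four blocks are assumptions made for the example; the text does not specify these details of the actual architecture.

```python
def relu(feature_map):
    """Element-wise ReLU: negative activations are replaced by zero."""
    return [[max(0.0, x) for x in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """2 x 2 max pooling with stride 2 (a trailing odd row/column is dropped)."""
    rows = len(feature_map) // 2 * 2
    cols = len(feature_map[0]) // 2 * 2
    return [[max(feature_map[r][c], feature_map[r][c + 1],
                 feature_map[r + 1][c], feature_map[r + 1][c + 1])
             for c in range(0, cols, 2)]
            for r in range(0, rows, 2)]


def conv_output_size(size, kernel=3, stride=1, padding=0):
    """Spatial size produced by a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_size(input_size, num_blocks):
    """Size after each assumed block: conv(3x3, no padding) then max-pool(2x2)."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = conv_output_size(size) // 2
        sizes.append(size)
    return sizes


# Hypothetical 4 x 4 feature map produced by some convolutional filter.
toy_map = [[-1.0, 2.0, 0.5, -3.0],
           [4.0, -2.0, 1.0, 0.0],
           [0.0, -1.0, -2.0, -0.5],
           [3.0, 1.0, -4.0, 2.5]]
pooled = max_pool_2x2(relu(toy_map))
sizes = trace_spatial_size(256, num_blocks=4)
print(pooled, sizes)
```

Under these assumptions the feature maps shrink from 256 to 127, 62, 30, and 14 pixels per side, which shows why the later layers are much cheaper to compute than the first ones.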
Fig. 5. CNN Architecture
Fig. 6. CNN detailed Architecture
Additional Considerations:
Dropout Regularization: Dropout layers may be included between fully connected
layers to prevent overfitting by randomly dropping neurons during training.
Batch Normalization: Batch normalization layers may be added to stabilize and
accelerate the training process by normalizing the inputs to each layer.
Optimization Algorithm: Gradient-based optimization algorithms such as Adam
or RMSprop can be used to optimize model parameters and minimize the loss
function during training.
Loss Function: Categorical cross-entropy loss is commonly used as the loss
function for multi-class classification tasks.
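The interplay between the softmax output and the categorical cross-entropy loss can be sketched in a few lines. The example below is illustrative only: the logits are made-up values for a single image, and the class order is an assumption.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution over classes."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(max(probs[true_index], 1e-12))  # clamp to avoid log(0)


CLASSES = ["Early Blight", "Late Blight", "Healthy"]  # assumed class order
logits = [2.0, 1.0, 0.1]                              # hypothetical network output

probs = softmax(logits)
predicted = CLASSES[probs.index(max(probs))]
loss_if_correct = categorical_cross_entropy(probs, true_index=0)
loss_if_wrong = categorical_cross_entropy(probs, true_index=2)
print(predicted, loss_if_correct, loss_if_wrong)
```

The loss is small when the true class receives a high probability and grows quickly when the model assigns it a low probability, which is the signal the optimizer uses to adjust the weights.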
The preprocessed potato leaf disease dataset was split into training (80%), validation
(10%), and test (10%) sets. The training set is used to train the deep learning model,
while the validation set is used to monitor performance and prevent overfitting during
training. The test set is used for final evaluation of the trained model's generalizability
on unseen data. The training process involves the following steps:
1. Data Loading:
Training images and their corresponding disease labels are loaded from the
training set in batches of size 32. This batch size is a hyperparameter chosen
based on factors like memory constraints and computational efficiency.
2. Forward Pass:
The model performs a series of operations including convolutions, pooling,
activation functions, and fully connected layers to generate a probability
distribution for each potato leaf disease class.
3. Loss Calculation:
The predicted probability distribution for each image in the batch is compared to
the actual disease label using the categorical cross-entropy loss function. This
function measures the difference between the model's predictions and the ground
truth, essentially calculating the error.
4. Backpropagation:
The calculated loss is propagated backward through the network using the Adam
optimization algorithm. Adam is a gradient-based optimizer that iteratively
updates the weights and biases of the model in a way that minimizes the overall
loss function. This helps the model learn from its mistakes and improve its
predictions in future iterations.
5. Epoch Iteration:
Steps 1-4 are repeated for a specified number of epochs (50 in this case).
An epoch represents one complete pass through the entire training set. With each
epoch, the model refines its ability to differentiate between healthy and diseased
potato leaves based on the features it extracts from the training data.
6. Validation:
After each epoch, the model's performance is evaluated on the validation set.
Metrics like accuracy, precision, recall, and F1-score are calculated to assess how
well the model generalizes to unseen data. This helps identify potential
overfitting and allows for early stopping if validation performance stagnates or
deteriorates.
7. Model Selection:
The model with the best performance on the validation set is selected as the final
trained model. This model represents the optimal balance between fitting the
training data and generalizing to unseen data.
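The validation and model selection steps above can be sketched as a small routine that tracks the best validation accuracy seen so far and stops once it has not improved for a given number of epochs. This is a simplified illustration rather than the project's training loop: the validation history is invented, and the patience value of 5 is an assumption, since the text does not state one.

```python
def select_best_epoch(val_accuracies, patience=5):
    """Return (best_epoch, best_accuracy, stopped_epoch); epochs are 1-based.

    Training is considered stopped once `patience` epochs have passed
    without any improvement over the best validation accuracy so far.
    """
    best_epoch, best_acc = 0, float("-inf")
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_epoch, best_acc = epoch, acc   # new best model: keep its weights
        elif epoch - best_epoch >= patience:
            return best_epoch, best_acc, epoch  # early stopping triggered
    return best_epoch, best_acc, len(val_accuracies)


# Hypothetical validation accuracy recorded after each epoch.
history = [0.80, 0.88, 0.93, 0.95, 0.97, 0.96, 0.97, 0.96, 0.95, 0.96, 0.94]
best_epoch, best_acc, stopped_at = select_best_epoch(history, patience=5)
print(best_epoch, best_acc, stopped_at)
```

In a real training run the weights saved at the best epoch would be restored and used as the final model.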
Table 2: Training Parameters
Parameter | Value
Split Ratio (Training:Validation:Test) | 80%:10%:10%
Batch Size | 32
Optimization Algorithm | Adam
Loss Function | Categorical Cross-Entropy
Epochs | 50
By following these steps and carefully monitoring the training process, the deep
learning model is trained to effectively classify potato leaf diseases based on image
features. The final trained model is then evaluated on the unseen test set to assess its
generalizability and real-world applicability.
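As a worked example of the split ratio and batch size given above, the sketch below divides the 2,152 collected images into training, validation, and test subsets and counts the batches per epoch. The shuffling seed, the rounding rule (fractions are truncated and the remainder goes to the test set), and the function names are assumptions; the actual pipeline may partition the data differently, for example by whole batches.

```python
import math
import random


def split_indices(n_images, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle image indices and split them into train/validation/test lists."""
    indices = list(range(n_images))
    random.Random(seed).shuffle(indices)
    n_train = int(n_images * train_frac)
    n_val = int(n_images * val_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])  # the remainder becomes the test set


def batches(indices, batch_size=32):
    """Yield consecutive batches; the last one may be smaller."""
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


train_idx, val_idx, test_idx = split_indices(2152)
train_batches = list(batches(train_idx, batch_size=32))
steps_per_epoch = math.ceil(len(train_idx) / 32)
print(len(train_idx), len(val_idx), len(test_idx), steps_per_epoch)
```

Under these assumptions the model sees 54 batches per epoch, so 50 epochs correspond to 2,700 weight updates.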
3.3.3 Model Evaluation
After training the deep learning model for potato leaf disease classification, the next
crucial step is evaluation. Here, we assess the model's performance on unseen data
from the test set (10% of the original dataset). This evaluation helps determine how
well the model generalizes to real-world scenarios and identifies any potential
weaknesses.
Evaluation Metrics:
True Positives (TP): The number of data points that are correctly classified as
positive.
False Positives (FP): The number of data points that are incorrectly classified as
positive.
True Negatives (TN): The number of data points that are correctly classified as
negative.
False Negatives (FN): The number of data points that are incorrectly classified as
negative.
Accuracy: This is the overall proportion of correct predictions made by the model. It
represents the percentage of images from the test set that the model classified
correctly (both healthy and diseased).
\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \]
Precision: The proportion of true positives among all positive predictions. In other
words, precision assesses a model's capacity for avoiding false positives.
\[ \text{Precision} = \frac{TP}{TP + FP} \]
Recall (also known as sensitivity or true positive rate): The proportion of true
positives among all actual positive samples. In other words, recall assesses how well
a model identifies the positive samples that are actually present.
\[ \text{Recall} = \frac{TP}{TP + FN} \]
F1-score: The F1-score is the harmonic mean of precision and recall, so both metrics
are given equal weight.
\[ \text{F1-score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Precision} + \text{Recall}} \]
The F1-score ranges from 0 to 1, with 1 denoting perfect precision and recall and 0
denoting zero precision or recall.
\[ \text{precision macro-average} = \frac{1}{N}\sum_{i=1}^{N} P_i = \frac{P_1 + P_2 + \dots + P_N}{N} \]
where P_i stands for the i-th precision score and N denotes the number of precision
scores for the trained model. Further, the recall macro-average is also defined as
\[ \text{recall macro-average} = \frac{1}{N}\sum_{i=1}^{N} R_i = \frac{R_1 + R_2 + \dots + R_N}{N} \]
where R_i represents the i-th recall score and N is the number of recall scores of the
model.
Evaluation Process:
The trained model is used to predict disease labels for the images in the test set.
The predicted labels are compared with the actual labels (ground truth) for each
image in the test set.
The chosen evaluation metrics (accuracy, precision, recall, and F1-score) are
calculated based on the comparison of predicted and actual labels.
The evaluation results are analyzed to understand the model's strengths and
weaknesses. High accuracy indicates the model is making mostly correct predictions
across all disease classes. High precision for a specific class suggests the model is
good at identifying that disease with minimal false positives. Similarly, high recall
indicates the model is effectively capturing most cases of a particular disease.
Additional Considerations:
Confusion Matrix: A confusion matrix is a visualization tool that can be helpful for
evaluating a multi-class classification model's performance. It shows how many
images from each class were predicted correctly and incorrectly by the model.
Class Imbalance: If the dataset has imbalanced classes (unequal distribution of
disease types), then metrics like accuracy might be misleading. In such cases, using
F1-score or other metrics that consider both precision and recall is crucial.
By evaluating the model's performance using these metrics and techniques, we gain
valuable insights into its effectiveness for potato leaf disease classification. This
information can be used to further refine the model or select the best performing
model for deployment in a real-world application.
While the previous sections discussed training and evaluating the initial model, this
section focuses on model optimization, that is, strategies that can potentially
improve the model's performance on the potato leaf disease classification task.
Common approaches to model optimization include:
1. Hyperparameter Tuning:
The initial training process likely involved choosing specific values for
hyperparameters like the number of convolutional filters, learning rate, or the number
of neurons in hidden layers. These hyperparameters significantly impact the model's
learning behavior and performance. Techniques like grid search, random search, or
Bayesian optimization can be used to systematically explore different hyperparameter
combinations and identify the settings that yield the best results on the validation set.
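A minimal sketch of grid search and random search is given below. Because training the
CNN is expensive, the function validation_score is a hypothetical stand-in that returns
a synthetic score; in a real run it would train the model with the given settings and
return the validation accuracy. The search space values are also assumptions chosen
only for illustration.

```python
import itertools
import math
import random


def validation_score(learning_rate, num_filters, dense_units):
    """Hypothetical stand-in for 'train the CNN and return validation accuracy'."""
    penalty = (0.1 * abs(math.log10(learning_rate) + 3)
               + abs(num_filters - 32) / 320
               + abs(dense_units - 64) / 640)
    return 1.0 - penalty


# Assumed search space; not the values used in the project.
SEARCH_SPACE = {
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "num_filters": [16, 32, 64],
    "dense_units": [32, 64, 128],
}


def grid_search(space, score_fn):
    """Evaluate every combination and keep the best one."""
    names = list(space)
    best_params, best_score = None, float("-inf")
    for combo in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def random_search(space, score_fn, n_trials, seed=0):
    """Evaluate n_trials randomly sampled combinations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


grid_params, grid_score = grid_search(SEARCH_SPACE, validation_score)
rand_params, rand_score = random_search(SEARCH_SPACE, validation_score, n_trials=8)
print("grid search:  ", grid_params, grid_score)
print("random search:", rand_params, rand_score)
```

Grid search evaluates all 27 combinations, whereas random search evaluates only the
requested number of trials and may therefore miss the best setting.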
2. Data Augmentation:
Data augmentation involves artificially expanding the training dataset by creating new
variations of existing training images. This can be done through techniques like
random cropping, flipping, rotation, or adding noise. Data augmentation helps the
model learn from a wider range of image variations and improve its generalizability to
unseen data that might have slight differences from the original training set.
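The flipping and rotation operations mentioned above can be sketched on a tiny image
stored as nested lists of pixel values. This is only an illustration using the Python
standard library; a real pipeline would normally apply the equivalent operations of an
image or deep learning library to full-size image arrays. The function names are
hypothetical.

```python
import random


def horizontal_flip(image):
    """Mirror the image left to right."""
    return [list(reversed(row)) for row in image]


def vertical_flip(image):
    """Mirror the image top to bottom."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def augment(image, rng):
    """Apply a random combination of flips and quarter-turn rotations."""
    out = [list(row) for row in image]
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    if rng.random() < 0.5:
        out = vertical_flip(out)
    for _ in range(rng.randrange(4)):
        out = rotate_90_clockwise(out)
    return out


# A 2 x 3 grayscale "image" used only for demonstration.
image = [[1, 2, 3],
         [4, 5, 6]]

rng = random.Random(0)
variants = [augment(image, rng) for _ in range(4)]
for variant in variants:
    print(variant)
```

Each variant contains exactly the same pixel values as the original image; only their
arrangement changes, so the class label of the leaf is preserved.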
3. Regularization Techniques:
Regularization helps prevent overfitting by penalizing the model for having overly
complex decision boundaries. Techniques like L1/L2 regularization or dropout can be
employed during training. These methods add constraints that encourage the model to
learn simpler representations of the data and reduce the risk of memorizing the
training set instead of learning generalizable patterns.
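The two regularization techniques named above can be sketched in a few lines. The
example assumes the common "inverted dropout" formulation, in which kept activations
are rescaled during training so that no rescaling is needed at inference time; the
weight values, penalty strength, and dropout rate are hypothetical.

```python
import random


def l2_penalty(weights, lam):
    """L2 regularization term added to the training loss: lam * sum(w^2)."""
    return lam * sum(w * w for w in weights)


def dropout(activations, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` during
    training and scale the survivors by 1 / (1 - rate)."""
    if not training or rate == 0.0:
        return list(activations)
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


weights = [1.0, -2.0, 3.0]
penalty = l2_penalty(weights, lam=0.01)

rng = random.Random(0)
activations = [0.5, 1.0, 1.5, 2.0]
train_out = dropout(activations, rate=0.5, rng=rng, training=True)
eval_out = dropout(activations, rate=0.5, rng=rng, training=False)

print("L2 penalty:", penalty)
print("training output: ", train_out)
print("inference output:", eval_out)
```

At inference time the activations pass through unchanged, while during training each
one is either dropped or doubled (for a rate of 0.5).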
4. Model Ensembling:
After the optimization techniques have been applied, the model is re-evaluated on the
validation set to assess the impact of the changes. If the performance improves, the
optimized model is then tested on the unseen test set to check whether its
generalization ability has also improved.
Based on the evaluation results, we compare the performance of the initial model with
the optimized model. The model that achieves the best balance of accuracy, precision,
recall, and F1-score on the test set is considered the optimal model for our potato leaf
disease classification task.
The deployment and integration of deep learning-based systems for potato disease
classification represent pivotal steps in modern agricultural diagnostics. These
systems leverage advanced computer vision techniques to automatically analyze
images of potato plants and identify the presence of diseases accurately. By deploying
such systems on accessible platforms like web applications or mobile apps, farmers
and agricultural professionals gain timely insights into disease outbreaks, enabling
proactive management strategies. Seamless integration of frontend interfaces, backend
APIs, and deep learning models ensures user-friendly interaction and reliable
performance. Overall, through effective deployment and integration, these systems
contribute significantly to enhancing crop protection and optimizing yields in potato
cultivation.
1. Platform Selection
Web Application: A web application was chosen as the deployment platform due to
its versatility and accessibility. By opting for a web-based solution, the system
becomes platform-independent, allowing users to access it from any device with a
web browser, including desktop computers, laptops, tablets, and smartphones. This
accessibility ensures that farmers and agricultural professionals can utilize the
application regardless of their device preferences or operating systems.
2. Model Optimization
Quantization: Quantization is a model optimization technique used to reduce the
memory footprint and computational overhead of deep learning models. By
converting the model's parameters and activations to lower precision representations
(e.g., from 32-bit floating-point to 8-bit integers), quantization significantly reduces
the storage requirements and improves inference speed, typically with only a small
loss in model accuracy. This optimization is particularly beneficial for deployment in
resource-constrained environments such as web browsers or mobile devices, where
memory and processing power are limited.
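The idea behind 8-bit quantization can be sketched with a simple affine mapping between
floating-point values and signed 8-bit integers. This is an illustrative scheme with
hypothetical weight values, not the exact procedure applied by any particular
deployment toolkit.

```python
def quantize_int8(values):
    """Map floats to integers in [-128, 127] using a scale and a zero point."""
    lo = min(min(values), 0.0)  # make sure 0.0 lies inside the range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 if all values are zero
    zero_point = round(-128 - lo / scale)
    quantized = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Recover approximate float values from the 8-bit representation."""
    return [(q - zero_point) * scale for q in quantized]


# Hypothetical 32-bit weights of one layer.
weights = [-0.8, -0.1, 0.0, 0.3, 1.2]

q_weights, scale, zero_point = quantize_int8(weights)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("quantized:", q_weights)
print("restored: ", [round(r, 4) for r in restored])
print("max error:", max_error, "scale:", scale)
```

Each weight now occupies 1 byte instead of 4, and the reconstruction error of every
value is bounded by the quantization step (the scale).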
3. Framework Selection
4. Backend Infrastructure
FastAPI: FastAPI is a modern, high-performance web framework for building APIs with
Python. It uses Python type annotations to validate request data and to automatically
generate OpenAPI documentation with an interactive interface, making it simple to
define and document API endpoints. FastAPI's built-in support for asynchronous
request handling helps keep latency low under concurrent load, making it well suited
for serving machine learning models and handling HTTP requests efficiently.
5. API Development
API Functionality: The backend API exposes an endpoint for uploading image data
and obtaining classification results. Users reach it through a drag-and-drop front-end
interface, which lets them easily upload images of potato plants for analysis. Upon
receiving an image, the API performs inference using the optimized deep learning
model and predicts the most likely disease class. The API returns the predicted
disease class along with confidence scores, providing users with actionable insights
into the health status of their potato plants.
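The last step of this flow, turning raw model outputs into a predicted class and
confidence scores, can be sketched as follows. The class order, the field names of the
response, and the example output values are assumptions made for illustration; the
sketch leaves out image decoding and the model itself.

```python
import math

# Assumed class order of the model's output layer.
CLASS_NAMES = ["Early Blight", "Late Blight", "Healthy"]


def softmax(logits):
    """Convert raw output scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def build_response(logits, class_names=CLASS_NAMES):
    """Build the JSON-serializable payload returned to the client."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "class": class_names[best],
        "confidence": round(probs[best], 4),
        "scores": {name: round(p, 4) for name, p in zip(class_names, probs)},
    }


# Hypothetical raw outputs for one uploaded leaf image.
response = build_response([2.0, 0.5, -1.0])
print(response)
```

If the model's final layer already applies softmax, the probabilities can be used
directly and only the selection of the highest-scoring class is needed.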
6. Frontend Development
7. Model Deployment
Google Cloud Platform (GCP): Google Cloud Platform is a suite of cloud
computing services offered by Google, providing a range of infrastructure and
platform services for building, deploying, and managing applications and services in
the cloud. GCP offers a scalable and reliable environment for hosting application
infrastructure, including compute, storage, networking, and machine learning services.
By deploying the system on GCP, users benefit from Google's global network
infrastructure, advanced security features, and managed services, ensuring high
availability, performance, and scalability.
This section details the hardware, software, and datasets used to develop and evaluate
the deep learning model for potato leaf disease classification.
3.4.1 Hardware
The deep learning model was trained and tested on a computer equipped with an
Intel(R) Core(TM) i3-5005U CPU @ 2.00GHz processor and 8GB of RAM. While
this setup is sufficient for basic model development, a more powerful computer with a
dedicated GPU (Graphics Processing Unit) would accelerate training times for
complex models.
3.4.2 Software
Programming Languages: Python, a versatile and widely used programming
language, was the primary language for coding the model and associated scripts.
Development Tools: PyCharm, an Integrated Development Environment (IDE),
was used for code development and editing.
Data Preprocessing Libraries: Scikit-image, a Python library for image processing,
was used for tasks like image resizing and manipulation during data
preprocessing.
Data Visualization Libraries: Matplotlib, a Python library for creating static,
animated, and interactive visualizations, was used to visualize the training data or
model performance (e.g., loss curves).
API Development Framework: FastAPI, a high-performance framework for
building APIs, was used to develop an API for the model.
HTTP Client: Postman, a tool for sending API requests and testing functionality,
was used to interact with the developed API.
3.4.3 Dataset
A publicly available potato leaf disease image dataset was obtained from Kaggle, a
platform for data science and machine learning. The dataset consists of a total of
2,152 images belonging to three classes:
Potato Early Blight (1000 images)
Potato Healthy (152 images)
Potato Late Blight (1000 images)
Together, these computational resources, software tools, and dataset define the
technical environment in which the model was developed.
CHAPTER FOUR
Our deep learning model for potato leaf disease classification leverages the power of
Convolutional Neural Networks (CNNs) to extract spatial features from pre-processed
potato leaf images. The network architecture comprises input layers tailored to the
image resolution and RGB channels, followed by multiple convolutional layers that
apply filters to extract intricate features like edges and textures. Activation functions
such as ReLU introduce non-linearity, while optional pooling layers aid in
downsampling and translation invariance. The flattened layer prepares the output for
fully connected layers that learn abstract features, with the final output layer utilizing
a softmax function to classify the potato leaves into distinct disease classes. Through
this structured CNN architecture, our model aims to accurately classify healthy leaves,
Early Blight, and Late Blight with optimized performance and efficiency.
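To show how the convolution and pooling layers shrink the feature maps before the
flatten layer, the sketch below traces the spatial size through a stack of layers. The
256 x 256 input size, the 3 x 3 kernels without padding, the 2 x 2 pooling windows, the
number of blocks, and the 64 filters in the last block are all assumptions made for
illustration and may differ from the architecture actually trained.

```python
def output_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_sizes(input_size, num_blocks, conv_kernel=3, pool_kernel=2):
    """Return the feature-map size after each conv and each pooling layer."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = output_size(size, conv_kernel)                     # convolution
        sizes.append(size)
        size = output_size(size, pool_kernel, stride=pool_kernel)  # max pooling
        sizes.append(size)
    return sizes


# Assumed configuration: 256 x 256 RGB input, three conv + pool blocks.
sizes = trace_sizes(input_size=256, num_blocks=3)
final_size = sizes[-1]
flattened_units = final_size * final_size * 64  # assuming 64 filters in the last block

print("feature-map sizes:", sizes)
print("flattened vector length:", flattened_units)
```

Under these assumptions the feature maps shrink from 256 to 30 pixels per side, and the
flatten layer passes 57,600 values to the fully connected layers.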
In this section, the proposed classifier model is trained on real data collected from
Kaggle. The detailed analysis of the performance for each class label is presented in
Table 10.
As shown in Table 4, the proposed model correctly classified 100% of the early blight
cases. In addition, every healthy image was categorized correctly, with none
misclassified. Similarly, 100% of the late blight images were correctly classified. As a
result, the classifier model has a 100% classification success rate on real data only.
As shown in Figure 7, all 124 images of potato leaves with early blight were
correctly identified as having early blight disease. Moreover, healthy potato leaves
were correctly identified in all 21 images, and all 111 late blight images were
correctly recognized as having late blight disease. We can infer that the model
classified every one of these images into its correct class.
The dataset contains 2,152 images belonging to three classes of potato leaves. The
training and validation accuracy and loss for 30, 40, and 50 epochs are given below.
Figures 8, 9, and 10 show the relationship between the number of epochs and the
learning outcomes.
Fig. 8. Training, validation, and loss for 10 epochs
Fig. 10. Training, validation, and loss for 50 epochs
According to the figures, our model performed better when trained for 50 epochs. We
used a total of 20 randomly selected images to evaluate the proposed tool and
investigate the classification accuracy. Table 5 shows the results of potato leaf
disease classification for the three classes. The dataset was split into three parts:
a training set, a test set, and a validation set. Neither the test set nor the
validation set was included in the training set.
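A three-way split with non-overlapping parts can be sketched as follows. The 80/10/10
ratio, the random seed, and the file names are assumptions made for illustration; the
exact proportions used in the project are not restated here.

```python
import random


def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle once, then cut into disjoint training, validation, and test parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * train_frac)
    n_val = int(len(items) * val_frac)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # everything that remains
    return train, val, test


# Hypothetical file names standing in for the 2,152 leaf images.
image_files = [f"leaf_{i:04d}.jpg" for i in range(2152)]

train_set, val_set, test_set = split_dataset(image_files)
print(len(train_set), len(val_set), len(test_set))

# No image appears in more than one part.
assert not set(train_set) & set(val_set)
assert not set(train_set) & set(test_set)
assert not set(val_set) & set(test_set)
```

Because the list is shuffled once and then sliced, every image lands in exactly one of
the three parts.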
Fig. 11 shows the actual and predicted classes, together with the confidence of each
prediction. A confidence of 100% means that the model assigned the full predicted
probability to that class for the given leaf. For every class, the accuracy of our
model is 99.72%, which indicates that the model is working well. The classification
results show that the proposed model achieves good accuracy at 50 epochs.
CHAPTER FIVE
5.1 Conclusion
This project aimed to develop a deep learning model for classifying potato leaf
diseases using image recognition. The model leveraged Convolutional Neural
Networks (CNNs) to extract features from pre-processed potato leaf images and
classify them into three distinct categories: Early blight, Late blight, and Healthy.
The project followed a structured approach, starting with a comprehensive literature
review to understand existing techniques for disease detection. Subsequently, a deep
learning model was designed and implemented using TensorFlow. The model was
trained on a publicly available dataset of potato leaf images obtained from Kaggle.
The dataset contained a total of 2,152 images belonging to the three aforementioned
classes.
Evaluation metrics like accuracy, precision, recall, and F1-score were employed to
assess the model's performance. The results were highly promising, with the model
achieving an overall accuracy of 99% on real-world data collected from Kaggle. This
higher accuracy indicates that our model has better generalization capability and
robustness compared to the model in the document.
The confusion matrix further corroborated this finding, indicating a high degree of
accuracy in classifying images into their respective disease categories.
5.2 Limitations
5.3 Recommendation
Based on the findings of this project, several avenues exist for further research and
development. Here are some potential directions to consider:
Multiple Disease Classification: Expanding the model's classification
capabilities to encompass a wider range of potato leaf diseases would
significantly enhance its practical value for farmers and agricultural
professionals.
Severity Level Prediction Integration: Develop an additional model stage or
refine the existing model to predict the severity level of the detected disease. This
could involve incorporating severity-specific features or leveraging techniques
like multi-class classification or regression tasks.
References
[1] G. K. Abebe, J. Bijman, S. Pascucci, and O. Omta, “Adoption of improved
potato varieties in Ethiopia: The role of agricultural knowledge and innovation
system and smallholder farmers’ quality assessment,” Agric. Syst., vol. 122, no.
November, pp. 22–32, 2013, doi: 10.1016/j.agsy.2013.07.008.
[2] IPPC Secretariat, “Plant health and food security (FAO),” FAO behalf Secr. Int.
Plant Prot. Conv. ;, vol. 1, p. 2p., 2017, [Online]. Available:
https://openknowledge.fao.org/handle/20.500.14283/i7829en
[3] U. Y. Tambe, A. Shobanadevi, A. Shanthini, and H.-C. Hsu, “Potato Leaf
Disease Classification using Deep Learning: A Convolutional Neural Network
Approach,” Nov. 2023, Accessed: May 30, 2024. [Online]. Available:
https://arxiv.org/abs/2311.02338v1
[4] M. H. Saleem, J. Potgieter, and K. M. Arif, “Plant Disease Detection and
Classification by Deep Learning,” Plants, vol. 8, no. 11, Nov. 2019, doi:
10.3390/PLANTS8110468.
[5] M. Akila and P. Deepan, “Detection and Classification of Plant Leaf Diseases
by using Deep Learning Algorithm,” Iarjset, vol. 6, no. 7, pp. 72–75, 2018.
[6] N. E. M. Khalifa, M. H. N. Taha, L. M. Abou El-Maged, and A. E. Hassanien,
Artificial Intelligence in Potato Leaf Disease Classification: A Deep Learning
Approach, vol. 77, no. January 2022. Springer International Publishing, 2021.
doi: 10.1007/978-3-030-59338-4_4.
[7] A. J. R. and Andi Sunyoto, “Identification of Disease in Potato Leaves Using
Convolutional Neural Network (CNN) Algorithm,” 3rd Int. Conf. Inf. Commun.
Technol., pp. 72–76, 2020.
[8] M. T. Islam, “Plant Disease Detection using CNN Model and Image
Processing,” no. February, 2020, [Online]. Available:
https://www.researchgate.net/publication/358768246
[9] M. A. Islam and M. H. Sikder, “A Deep Learning Approach to Classify the
Potato Leaf Disease,” J. Adv. Math. Comput. Sci., vol. 37, no. 12, pp. 143–155,
2022, doi: 10.9734/jamcs/2022/v37i121735.
[10] A. Tejaswi, “Plant village late blight, early blight and healthy dataset,” Kaggle.
[Online]. Available: https://www.kaggle.com/datasets/arjuntejaswi/plant-village
[11] H. Lee, R. Grosse, R. Ranganath, and A. Y. Ng, “Convolutional deep belief
networks for scalable unsupervised learning of hierarchical representations,”
Proc. 26th Int. Conf. Mach. Learn. ICML 2009, pp. 609–616, 2009, doi:
10.1145/1553374.1553453.
APPENDIX