DEP Report


Acknowledgement

We would like to express our sincere gratitude to our project supervisor, Dr. Subhramanyam
Murala, for his invaluable guidance and support throughout this project. Without his expertise
and encouragement, this project would not have been possible.
We would also like to extend our appreciation to our project advisor, Dr. Mahendra Sakare, for
giving us the opportunity to work on the Development Engineering Project.
We would like to acknowledge the contributions of our fellow team members, Abhay, Akshat
Ahuja, Omkar Prashant More, Rahul Chaudhary & Rohit Madan, who were essential in
completing this project. Their hard work and dedication were invaluable, and we are grateful for
their efforts.
We would also like to thank the staff at CVPR Lab, who graciously provided us with access to
the resources and equipment necessary to complete this project.
Thank you all for your contributions to this project.
Motivation

Plant disease is a major issue in the agricultural industry, with the potential to significantly
impact crop yields and farmer livelihoods. As a result, crops often go to waste for lack of
adequate disease-prevention measures, a problem that is especially prevalent in rural India.
Existing solutions include manual inspection and diagnosis by experts, which can be
time-consuming, costly, and inaccessible to many farmers. To address this challenge, we
developed a deep learning-based Plant Disease Detection Model and deployed it in a Mobile
Application.

Our model is novel compared to existing solutions, as it utilises a deep learning algorithm that is
capable of accurately detecting plant diseases from images. By training the model on a large
dataset of images of healthy and diseased plants, we were able to achieve a high level of
accuracy in disease detection.

Deploying the model on an app provides several benefits, including ease of use and accessibility
for a wider audience. Farmers and researchers can simply upload an image of a plant and receive
a diagnosis of the disease within seconds, without needing to consult an expert or invest in
expensive equipment. The application is primarily aimed at rural farmers, giving them an easy
way to identify the disease affecting their crops.

Overall, our model has the potential to make a significant impact in the agricultural industry by
providing a more efficient and cost-effective solution to plant disease detection.
Objective

The objective of our group project is to develop a deep learning-based plant detection model and
deploy it on a mobile application using Google Cloud Platform (GCP) and Flutter Framework.
Our goal is to provide an efficient and accurate method for identifying plant species, which can
aid farmers, researchers, and environmentalists in various tasks.

 Approaches currently used to achieve the end goal:


The current approaches used for plant species identification include manual identification by
human experts, field guides, and traditional image processing techniques. While manual
identification by experts is accurate, it is time-consuming and impractical for large-scale
applications. Field guides can be useful for amateurs, but their accuracy can be limited by the
user's knowledge and experience. Traditional image processing techniques can also be effective,
but they often rely on handcrafted features and can be computationally expensive.

 Limitations of current techniques:


The limitations of current techniques include the reliance on human experts, the need for
extensive knowledge and experience, and the limitations of traditional image processing
techniques. These limitations can make plant species identification challenging, time-
consuming, and inaccurate.

 Our technique's approach to overcoming the above-mentioned limitations:


Our deep learning-based plant detection model overcomes these limitations by using a
convolutional neural network (CNN) to automatically extract features from plant images and
classify them into different species. This method eliminates the need for manual feature
extraction and allows for more accurate and efficient identification. Moreover, our mobile
application provides a user-friendly interface, making it accessible to both experts and non-
experts.

 Implications of our work & new questions it raises:


Our work has significant implications for plant species identification, which can benefit farmers,
researchers, and environmentalists in various ways. Our model can help to identify plant species
quickly and accurately, which can aid in monitoring and managing plant populations, predicting
plant growth, and improving crop yield. Our work also raises new questions about the scalability
and applicability of deep learning-based plant detection models in different environments and
ecosystems. Additionally, it highlights the importance of making these technologies accessible
to everyone, regardless of their expertise or technical knowledge.
Workflow

We started by finalising the two crops for which we would build the model, and then collected
the dataset from internet resources.
Our first approach was to remove the background from each image, extract handcrafted features,
and feed those features to a machine learning model.
Upon further discussion with our supervisor, we concluded that this approach was not good
enough: background removal is not reliable, and handcrafted feature extraction is an outdated
technique.
We therefore moved on to deep learning-based techniques, as suggested by our supervisor.
This was our workflow for the project:
We first implemented a CNN using the MobileNet V1 architecture and tuned our model to get the
best result.
Then we wrote the prediction function for our model and deployed it on the Google Cloud
Platform, so that the model could be used from a mobile application.
After deploying the model on GCP, the mobile app was developed using the Flutter framework.

Deep Learning Model:

Dataset
The images of diseased and healthy leaves used to train our deep learning model were taken from
the PlantVillage dataset available on Kaggle:

https://www.kaggle.com/datasets/abdallahalidev/plantvillage-dataset

Since we implemented our model for potato and corn, we extracted the images of these two
specific crops from the available dataset.
 The following diseases were studied by our model:
 Potato - Early Blight
 Potato - Late Blight
 Corn - Cercospora Leaf Spot
 Corn - Common Rust
 Corn - Northern Leaf Blight
Data Pre-processing
We pre-processed all the images by resizing them to the shape (224 × 224 × 3) and rescaling
them, since the base model we used (discussed later in the report) expects input of shape
(224 × 224 × 3).
We split our dataset into training (80%), validation (10%) and testing (10%) sets (a minimal
sketch of this step is shown after the label list below). There are 6 labels that our model is
expected to predict. These are:
 Healthy
 Potato - Early Blight
 Potato - Late Blight
 Corn - Cercospora Leaf Spot
 Corn - Common Rust
 Corn - Northern Leaf Blight
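The exact scripts we used for this step are not reproduced in this report; the following is a
minimal sketch of how such a loading and splitting pipeline can be set up with tf.keras,
assuming the extracted potato and corn images are arranged in one sub-folder per class (the
folder name plantvillage_subset and the seed are illustrative assumptions):

import tensorflow as tf

IMG_SIZE = (224, 224)   # the base model expects (224, 224, 3) inputs
BATCH_SIZE = 32

# Assumed layout: plantvillage_subset/<class_name>/*.jpg with 6 class folders.
# 80% of the images are used for training; the remaining 20% is split again
# into validation (10%) and testing (10%).
train_ds = tf.keras.utils.image_dataset_from_directory(
    "plantvillage_subset",
    validation_split=0.2,
    subset="training",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)
val_test_ds = tf.keras.utils.image_dataset_from_directory(
    "plantvillage_subset",
    validation_split=0.2,
    subset="validation",
    seed=42,
    image_size=IMG_SIZE,
    batch_size=BATCH_SIZE,
)
# Halve the held-out 20% into validation and test sets.
n_batches = tf.data.experimental.cardinality(val_test_ds).numpy()
val_ds = val_test_ds.take(n_batches // 2)
test_ds = val_test_ds.skip(n_batches // 2)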

Then our next task was to choose the base model for our deep learning model.
We tried different implementations of different models using the following architectures for our
base model:
 MobileNet V1
 MobileNet V3

After training, we selected the architecture that gave the best results, namely MobileNet V1.

Base Model: MobileNet V1

MobileNet V1 is a convolutional neural network architecture designed specifically for mobile
and embedded vision applications. It was developed by researchers at Google in 2017 and is
known for its low computational cost and small memory footprint, making it ideal for mobile
devices with limited resources.
The major merit of this network is that it contains depthwise separable convolutional layers,
which reduce the number of parameters and computations required while maintaining high
accuracy. In traditional convolutional layers, each kernel convolves over all channels of the
input feature map, resulting in a high number of parameters and computations. Depthwise
separable convolutional layers instead first apply a depthwise convolution, followed by a
pointwise convolution. This approach significantly reduces the number of parameters and
computations required, while still maintaining a high level of accuracy.

(Figure: Depthwise separable convolution layer block diagram)
The base model we loaded was pretrained on the ImageNet dataset and expected input of shape
(224 × 224 × 3). We did not load the top (classification) layers of this pretrained model, so
its output has shape (7 × 7 × 1024).
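As an illustration of this step, the pretrained backbone can be loaded with tf.keras roughly as
follows (a sketch, not our exact code; include_top=False is what drops the top layers):

import tensorflow as tf

# MobileNet V1 pretrained on ImageNet, without its top classification layers,
# acting purely as a (224, 224, 3) -> (7, 7, 1024) feature extractor.
base_model = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3),
    include_top=False,
    weights="imagenet",
)
# We fine-tune the backbone rather than freezing it; only the batch-norm
# statistics remain non-trainable, which matches the 21,888 non-trainable
# parameters reported below.
base_model.trainable = True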

Architecture of the model


We placed two layers before the base model: one to resize the input image to the shape expected
by the base model, and one to rescale the pixel values before feeding the image to it. We also
added a data augmentation layer to the model to reduce overfitting and to train the model to
predict correctly even if the image is rotated or flipped.
After extracting features with the base model, we could not directly use a flatten layer, as it
greatly increased the number of trainable parameters. Instead, we used a combination of
pointwise convolution and depthwise convolution to reduce the number of trainable parameters.
We then used several dense layers and dropout layers (to avoid overfitting), and finally a
softmax layer for the final prediction.
This is the architecture of our best model:
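The architecture diagram itself appears as a figure in the report. As an approximate
illustration of the layer sequence described above, a Keras-style definition (reusing
base_model from the previous snippet) might look like the following; the channel counts,
dropout rate, and number of dense layers here are illustrative assumptions, not the exact
values of our best model:

from tensorflow.keras import layers, Model

NUM_CLASSES = 6

inputs = layers.Input(shape=(224, 224, 3))
# Resize and rescale the incoming image to what MobileNet V1 expects.
x = layers.Resizing(224, 224)(inputs)
x = layers.Rescaling(1.0 / 127.5, offset=-1)(x)
# Data augmentation (active only during training) to reduce overfitting
# on rotated or flipped leaves.
x = layers.RandomFlip("horizontal_and_vertical")(x)
x = layers.RandomRotation(0.2)(x)
# Pretrained feature extractor producing (7, 7, 1024) feature maps.
x = base_model(x)
# Pointwise then depthwise convolution to shrink the feature maps instead of
# flattening them directly, keeping the number of trainable parameters down.
x = layers.Conv2D(256, kernel_size=1, activation="relu")(x)
x = layers.DepthwiseConv2D(kernel_size=3, activation="relu")(x)
x = layers.Flatten()(x)
# Dense + dropout head, ending in a softmax over the 6 classes.
x = layers.Dense(256, activation="relu")(x)
x = layers.Dropout(0.3)(x)
outputs = layers.Dense(NUM_CLASSES, activation="softmax")(x)

model = Model(inputs, outputs)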

The total parameters of our model are:


 Total parameters: 18,456,262
 Trainable parameters: 18,434,374
 Non-trainable parameters: 21,888
We reduced the parameter count by gradually changing the number of channels, making sure there
was no major change in accuracy. For this we tried different combinations of pointwise and
depthwise convolution layers and found that applying a pointwise convolution first, followed by
a depthwise convolution, worked well for our case.
Model Training and Accuracy
For training we used the Adam optimizer, which is a popular choice for deep learning tasks. The
metric tracked during training and validation was accuracy. The loss function we used is
SparseCategoricalCrossentropy, a popular choice for multi-class classification problems in deep
learning.
This loss function measures the difference between the predicted class probabilities and the true
class labels. It computes the cross-entropy loss between the predicted probability distribution
and the true distribution of the classes. The loss is computed as follows:
First, the predicted class probabilities are obtained by applying a softmax activation function to
the model outputs.
Then, the cross-entropy loss is computed between the predicted probabilities and the true class
labels. Specifically, for each sample, the loss is the negative logarithm of the predicted
probability for the true class label.
The batch size we used was 32 and the number of epochs was 10; we kept the number of epochs low
to avoid overfitting.
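A minimal sketch of this training step, continuing from the earlier snippets (the exact
optimiser settings we used are not reproduced here):

# Adam optimiser, sparse categorical cross-entropy (labels are integer class
# indices rather than one-hot vectors), with accuracy tracked as the metric.
# Per sample, the loss is -log(p_true), the negative log of the probability
# the softmax assigns to the true class.
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

# Batch size 32 (fixed when the datasets were built) and 10 epochs.
history = model.fit(train_ds, validation_data=val_ds, epochs=10)

# Held-out test accuracy.
test_loss, test_acc = model.evaluate(test_ds)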
The following are the plots of training accuracy & validation accuracy vs epochs and training
loss & validation loss vs epochs.

The testing accuracy we got for our model is as follows:


Prediction Function:
The output of our model is an array of probabilities whose length equals the number of classes
(6 in our case). We pass this array to the predict function, where argmax() finds the index of
the maximum probability; the class corresponding to that index is the predicted class.
Since our model is not foolproof, there are a few wrong predictions and a few cases where the
model cannot confidently decide which class the image belongs to.
To handle this, we added a condition: only if the maximum probability is greater than 0.85 do
we directly use the argmax() result. Otherwise, we return the top two classes by probability
and ask the user to take the photo again, since a pixelated, blurry, or noisy image can cause a
wrong prediction.
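A sketch of this post-processing logic (the class names and their order are assumptions for
illustration; the 0.85 threshold is the one described above):

import numpy as np

CLASS_NAMES = [
    "Healthy",
    "Potato - Early Blight",
    "Potato - Late Blight",
    "Corn - Cercospora Leaf Spot",
    "Corn - Common Rust",
    "Corn - Northern Leaf Blight",
]

def postprocess(probs, threshold=0.85):
    # probs is the softmax output of the model for one image (length 6).
    probs = np.asarray(probs)
    best = int(np.argmax(probs))
    if probs[best] > threshold:
        # Confident prediction: return the single most likely class.
        return {"class": CLASS_NAMES[best],
                "confidence": round(float(probs[best]) * 100, 2),
                "click_again": False}
    # Low confidence: return the two most likely classes and ask the user
    # to take the photo again.
    top_two = np.argsort(probs)[::-1][:2]
    return {"class": [CLASS_NAMES[i] for i in top_two],
            "confidence": None,
            "click_again": True}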

GCP (Google Cloud Platform)


We used the Google Cloud Platform for the deployment of our model and created a server on it.
The .h5 version of the model was created and uploaded to a models folder after creating a
bucket in the console.

Further, a deployment script was also written to handle incoming requests from the backend of
the mobile application.
The script is essentially a predict function (a cloud function) that receives an image from the
app, runs the model on that input image, and generates a response. The response includes the
predicted class along with the confidence level.
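The deployment script itself is not reproduced in this report; as an illustration only, an HTTP
cloud function with this behaviour could be structured roughly as follows. The bucket and file
names are placeholders, and the thresholding mirrors the Prediction Function section above:

import numpy as np
import tensorflow as tf
from flask import jsonify
from google.cloud import storage
from PIL import Image

CLASS_NAMES = ["Healthy", "Potato - Early Blight", "Potato - Late Blight",
               "Corn - Cercospora Leaf Spot", "Corn - Common Rust",
               "Corn - Northern Leaf Blight"]

model = None  # cached between invocations served by the same instance


def _load_model():
    # Download the exported .h5 model from the bucket on the first request.
    global model
    if model is None:
        bucket = storage.Client().bucket("your-bucket-name")  # placeholder
        bucket.blob("models/plant_model.h5").download_to_filename("/tmp/model.h5")
        model = tf.keras.models.load_model("/tmp/model.h5")
    return model


def predict(request):
    # HTTP Cloud Function entry point: expects a multipart form with a "file" field.
    image = Image.open(request.files["file"]).convert("RGB").resize((224, 224))
    batch = np.expand_dims(np.array(image), axis=0)        # (1, 224, 224, 3)
    probs = _load_model().predict(batch)[0]
    best = int(np.argmax(probs))
    if probs[best] > 0.85:
        return jsonify({"class": CLASS_NAMES[best],
                        "confidence": round(float(probs[best]) * 100, 2),
                        "click_again": False})
    top_two = np.argsort(probs)[::-1][:2]
    return jsonify({"class": [CLASS_NAMES[i] for i in top_two],
                    "confidence": None,
                    "click_again": True})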

The Predict function was deployed using this command.

In our case, we allocated 1 GB of memory to this predict function, as can be seen in the above
image.

This was the URL that handled the various API requests, inside which the cloud function ran.
The response was decoded and displayed on the frontend.
Errors can be seen in the logs section of the cloud function and this helped us in debugging our
API and model.

We also tested the deployed API on Postman first, by giving it an image and checking the output
object that was retrieved.
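The same check can also be scripted; a minimal sketch using the Python requests library (the
URL and file name are placeholders) is:

import requests

# Placeholder; the real value is the trigger URL shown in the GCP console.
URL = "https://REGION-PROJECT.cloudfunctions.net/predict"

with open("sample_leaf.jpg", "rb") as f:
    response = requests.post(URL, files={"file": f})

print(response.status_code)
print(response.json())
# e.g. {'class': 'healthy', 'click_again': False, 'confidence': 100.0}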

Mobile Application
Purpose of the Application
 The purpose of the app is to provide farmers with an easy-to-use tool that allows them to
identify crop diseases quickly and accurately, simply by uploading or taking a photo of
the diseased crop. By doing so, farmers can take the necessary steps to prevent the spread of the
disease and minimize the damage it causes. This, in turn, helps to protect their
livelihoods and increase their crop yields.
 A robust convolutional neural network, deployed on the cloud-hosted server, ensures that
even minute variations in the quality of the images provided by users, often due to a lack
of quality resources, do not significantly affect the predicted classes for the healthy or
unhealthy crop samples.
 Flutter bridges the gap between the server and the client using Google’s continuously
optimized libraries, widgets and framework which provide a lightweight solution to the
problem, while ensuring a minimalistic and quicker approach.

Why Flutter?
 Flutter is fast: It uses the Dart programming language, which is compiled to native code, so
there is no need for a JavaScript bridge. This results in apps that are fast and responsive.
 Flutter creates cross-platform applications: The same code can be used to build apps
for both iOS and Android devices from a single codebase rather than switching between
different platforms. This can save a lot of time and effort when developing mobile apps.
In addition, Flutter can be used for web development to create web applications.
 Flutter has a rich set of widgets: Widgets are the building blocks of Flutter apps, and a
wide variety of them are available. This makes it easy to create beautiful and custom
user interfaces.
 Flutter is open source: Anyone can contribute to the development of Flutter, and a
growing community of developers is using it. In addition, many helpful docs/tutorials are
available online, created by the Flutter community on sites like GitHub.

Introduction to Flutter
Flutter is a popular mobile app development SDK (Software Development Kit), created by
Google, that allows for the creation of cross-platform apps. By using Flutter, we were able to
build a single codebase that could be used to create both iOS and Android versions of the app.
This saved time and resources compared to building separate apps for each platform.
Additionally, Flutter has a rich set of pre-built UI widgets that enabled us to create a beautiful
and intuitive user interface for the app. Finally, Flutter's performance and speed were essential
in providing quick and accurate predictions for the farmers. Flutter is built on the Dart
programming language and provides a fast development workflow with hot reloading, so we can
quickly iterate on our code. When creating a Flutter app, we will be working primarily with
“widgets.” Widgets are the basic building blocks of a Flutter app, and they are used to create
both the visual components of an app (like buttons and text) and the functional elements (like
Stateless Widgets).
Working of our Application:
Asynchronous Functions: In Dart, asynchronous functions are used to perform non-blocking
operations, such as network or file I/O, without blocking the main thread. Asynchronous
functions are declared using the async keyword, and they typically return a Future object.
The function is declared as async, which means it can use the await keyword to wait for
asynchronous operations to complete. In this case, we await for the response from the server in
the form of a serialised JSON object (as string).
Note that the return type of the GetPrediction() function is Future<Map<String, dynamic>>,
which means it does not return a value immediately but instead returns a Future object that
completes when the function has finished executing; the final Map<String, dynamic> value acts
as a dictionary from which our prediction class is constructed.
Future<Map<String, dynamic>> GetPrediction(File image) async {
  ...
  var response = await request.send();
  ...
}

GetPrediction(File(image!.path)).then((value) {
  if (...) {
    showDialog(
      ...
    );
  }
});
When an await statement is executed, it does not block the entire thread. Instead, it allows the
thread to be used by other functions while the asynchronous operation is being executed. Once
the asynchronous operation is completed, the thread returns to the function and continues
executing from where it left off.

HTTP Requests: An HTTP (Hypertext Transfer Protocol) request is a message that a client
(such as a web browser or mobile app) sends to a server
in order to initiate a transaction. The request typically
contains information about the resource being requested,
such as the URL of the page, and may also include
additional data, such as form data or cookies.
HTTP requests can be used to perform a wide range of
actions, such as retrieving data from a server, submitting
a form, or initiating a file download. The server then responds to the request with an HTTP
response, which contains the requested data or an error message if the request cannot be
fulfilled.

In HTTP (Hypertext Transfer Protocol), a request is a message sent from a client to a server,
requesting the server to perform an action or provide a resource. A response is the message sent
by the server back to the client in response to the request.
An HTTP request consists of the following parts:
1. Request Line: This line contains the HTTP method (such as GET, POST, PUT,
DELETE), the URI (Uniform Resource Identifier) of the requested resource, and the
HTTP version.
2. Headers: Headers provide additional information about the request, such as the content
type of the request body, the user agent making the request, and any authentication
credentials.
3. Request Body: Some requests may include a message body that contains data, such as
JSON or XML, that is sent to the server.
In an HTTP request, JSON data can be included in the request body, in the form of a string that
contains JSON-formatted data. This data can be used to pass information from the client to the
server.
As seen in the snippet below, we have initiated a POST request to the url containing the active
Google Cloud Platform server link, which serves the request with an appropriate response body
of the form:

{"class":"healthy","click_again":false,"confidence":100.0}

Or

{"class":["healthy", “Potato_Late__Blight”],"click_again":true,"confidence":”null”}

The first response depicts the case of a single high-confidence class, while the second depicts
a dual-class prediction when the maximum confidence is below the 0.85 threshold; in the latter
case the click_again flag is set to true, indicating that a better prediction may be obtained by
re-uploading or re-taking the photograph of the crop sample.
A MultipartFile is a class in Dart that represents a file being uploaded via an HTTP request.
It contains the binary data of the file as well as metadata such as the filename, content type,
and size. The metadata is used by the server to determine how to handle the file and how to
store it.
We send the serialized data of the image taken from the imagePicker utility function, as form
data for our POST request to the server.
Future<Map<String, dynamic>> GetPrediction(File image) async {
  var request = new http.MultipartRequest("POST", url);
  var multipart = await http.MultipartFile.fromPath("file", image.path);
  request.files.add(multipart);
  var response = await request.send();
  if (response.statusCode == 200) {
    final respStr = await response.stream.bytesToString();
    return jsonDecode(respStr);
  } else {
    throw Exception('Failed to connect to server\n Connection Error :(');
  }
}
Stateful Flutter Widgets: In Flutter, a Stateful Widget is a widget that has mutable state. That
is, it's a widget that can change its behavior or appearance over time in response to user
interaction, network requests, or other factors. The state of a Stateful Widget in Flutter can hold
any mutable data that the widget needs to store and manage, most of which consist of user
defined fields.
A Stateful Widget consists of two classes:
 A StatefulWidget class, which is responsible for creating and managing the widget's
mutable state. This class is immutable and should only contain properties that don't
change over time (such as configuration values).
 A State class, which is responsible for managing the widget's mutable state. This class is
mutable and contains properties that can change over time (such as user input data,
network response data, etc).
We maintained a single Stateful Widget Home corresponding to the state _HomeState, which is
responsible for displaying different widgets for the different states of our HTTP response,
tracked by a private field _gettingPrediction:
class Home extends StatefulWidget {
  @override
  _HomeState createState() => _HomeState();
}

class _HomeState extends State<Home> {
  XFile? image;
  final ImagePicker picker = ImagePicker();
  var _gettingPrediction = 0;
  late String class1, class2;
  late num conf;
  late bool reclick;

where each of these fields signifies the following:

 image: a public field of type XFile, which represents a file that has been picked by the
user using a file picker. It is typically used in file management applications that allow
users to browse and select files from their device's storage, and stores the value of the
image uploaded by the user, which is null at the start.
 picker: an instance of the ImagePicker class from the image_picker library, a Flutter
package that provides functionality for picking images and videos from the device's gallery
or camera. It is commonly used in mobile app development for capturing and uploading photos
or videos.
 _gettingPrediction: a custom integer flag describing the state of the application, used to
render different widgets accordingly.
 conf: describes the confidence of the predicted class of the provided image of the crop
sample.
 class1, class2: describes at most two classes that can be predicted from the given sample.
 reclick: set to true, if the confidence corresponding to each of the predicted classes is less
than a threshold value.
We assumed the following sequence of actions for the user, along with their corresponding
state values as well as the various widgets rendered on the application screens:

Image Not Uploaded to the Application:

When no image has yet been uploaded to the application by the user, which is the default
configuration after a cold start, a screen prompts the user to do so. Since the state
_gettingPrediction is still set to 0, the ternary statement in the build() function of
_HomeState renders this screen.

Image Uploaded to the Application but HTTP Request not yet sent:

When the image has been uploaded to the application by the user but has not yet been sent to
the server as the form-data body of a POST request. Since the image now has a non-null value,
the state _gettingPrediction is set to 1, and the ternary statement in the build() function of
_HomeState renders the corresponding screen.

Uploading Image to the Application:

Using the image_picker Flutter package, along with the required user permissions, we provide
the user with two ways to upload an image of their crop sample to the application, with the
help of an asynchronous function that receives a value once the user has selected an image:

Using Camera: photos can be taken and uploaded without needing to be saved locally.

Using Gallery: locally stored photos can also be uploaded from the device's gallery.

Image Uploaded to the Application and HTTP Request sent:

When the image has been uploaded to the application by the user and sent to the server, we ask
the user to stand by while the application waits for the server's response.

The state _gettingPrediction is now set to 2, and the ternary statement in the build() function
of _HomeState renders a CircularProgressIndicator() widget.

Response predictions are received from the server:

With the state _gettingPrediction still set to 2, the ternary statement in the build() function
of _HomeState renders a different widget depending on the click_again field of the response
body from the server.

Finally, upon clicking the 'OK' button, we set the state _gettingPrediction back to 0, so that
the farmer can upload another image and get further predictions.

Results

Above, we can see the working of our PlanDoc Mobile Application.


The user is given the option to upload a photo (camera/gallery options), and the app then
fetches results and displays the predicted class and confidence within a matter of seconds.

A few more examples of the results shown by our application are:


In the result above, we can see that our application has predicted 2 classes (because of a lack
of confidence), so it shows a comment stating
"Consider rechecking the photo or reuploading"
Merits of the app & Scope of improvement

 One of the merits of our model is that it can correctly predict a healthy leaf not only of
potato and corn but of other plants as well. A picture of a healthy leaf of a plant outside
our hostel was predicted correctly by our model with good confidence.
 From this, we can say that our model has successfully learned to differentiate between a
healthy and a diseased leaf of any plant, even though we used only healthy images of potato
and corn for training.
 If one can arrange a large dataset, then one can train a model based on MobileNetV3
architecture, which is very fast. We tried using that architecture, but due to the small
dataset, we couldn’t get a very good accuracy.
 Another merit of our application is that the user can either upload a photo or take one on
the spot using their mobile's camera.
 The deep learning model can be further implemented for a variety of other plants for
their disease classification.
 It has the capability to be made into a self-learning algorithm that could detect
diseases and notify users of any new diseases (if that disease does not already exist in
its dataset).
 The app can also be integrated with a pesticide-spraying drone, which would hover over the
farm and, upon detecting a disease, automatically spray the pesticide. This is a hands-free
solution that also protects the farmer from the harmful effects of the pesticides.
 Many more features can be added to the app, such as locating shops where the farmer can
find the appropriate pesticides.

References:

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications:
https://arxiv.org/pdf/1704.04861v1.pdf
