A PROJECT REPORT
Submitted by
Patel Kanish Vishnubhai (201130107029)
BACHELOR OF ENGINEERING
In
Computer Engineering
May, 2024
SAL College of Engineering, Ahmedabad
Computer Engineering
Summer-2024
CERTIFICATE
This is to certify that the project report submitted along with the project entitled Image
Classification Using Deep Learning has been carried out by Patel Kanish Vishnubhai
(201130107029) under my guidance in completion of the Internship (3180701) course of the
degree of Bachelor of Engineering in Computer Engineering, 8th Semester, of Gujarat
Technological University, Ahmedabad, during the academic year 2023-24.
DECLARATION
I hereby declare that the Internship report submitted along with the Internship entitled
Image Classification Using Deep Learning, submitted in partial fulfilment of the
degree of Bachelor of Engineering in Information Technology to Gujarat Technological
University, Ahmedabad, is a bonafide record of original project work carried out by me
at The Pioneer Tech under the supervision of Mr. Das Saurabh, and that no part of this
report has been directly copied from any student's report or taken from any other source
without providing due reference.
ACKNOWLEDGEMENT
ABSTRACT
LIST OF FIGURES
Fig 6.4.5 Model summary …………….……………… 33
LIST OF TABLES
Table 6.3.1 Tomato Dataset lengths for train, valid and test ………………... 29
Table 6.3.2 Corn Dataset lengths for train, valid and test …………………... 29
Table 6.3.3 Cotton Dataset lengths for train, valid and test ………………… 29
Table 7.4.1 Comparison of Loss and Accuracy for train and valid at different epochs ……… 41
Table 7.4.2 Comparison of Loss and Accuracy for train and valid at different epochs ……… 42
Table 7.4.3 Comparison of Loss and Accuracy for train and valid at different epochs ………
Table 8.3.1 Summary of internship ………………………………………….. 56
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
EB Early Blight
BB Bacterial Blight
LB Late Blight
CV Curl Virus
IC Image Classification
DL Deep Learning
TABLE OF CONTENTS
Acknowledgement………………………………………………………… i
Abstract………………………………………………………………….... ii
List of Figures…………………………………………………………….. iv
List of Tables……………………………………………………………... v
Table of Contents………………………………………………………….
3.1 Purpose…………………………………………...... 9
3.2 Objective…………………………………………... 9
3.3 Scope…………………………………………...….. 9
Chapter 4 4.0 System Analysis……………………………………….. 15
4.1 Study of Current System…………………………… 16
Chapter 9 9.0 Limitation and Future Work……………………….. 58
References ………………………………………………………………... 60
Project ID: 421735
Chapter 1
=============================================================
1. OVERVIEW OF THE COMPANY
=============================================================
History
Different Products
=============================================================
History
Since 2006, Pioneer Tech has been at the forefront of enterprise IT solutions,
specializing in website development, custom software solutions, and Google
promotion.
Organization Chart
Pioneer Tech operates with a highly skilled team proficient in website development,
AI/ML, data analytics, and graphics design.
Capacity of Plant
While Pioneer Tech primarily operates in the digital realm, its capacity extends to
serving clients across various industries and fulfilling their IT needs effectively.
Chapter 2
=============================================================
All Departments
Sequence of operators
Product stages
=============================================================
HR Department
Sales Department
Development Department
Testing Department
● HR Department
An HR department is tasked with maximizing employee productivity and
protecting the company from any issues that may arise within the
workforce. HR responsibilities include compensation and benefits,
recruitment, termination, and keeping up to date with any laws that may
affect the company and its employees.
● Development Department
The Development department comprises different teams such as Web
Development, Android Development, iOS Development, and UI/UX Design.
Web development services help create all types of web-based software
and ensure a great experience for web users.
● Testing Department
Pioneer Tech follows a systematic approach across its departments. The
HR team focuses on recruitment, onboarding, and resource allocation. Sales
engages with clients, generates proposals, and negotiates contracts. The
design team collaborates with clients and internal teams for requirement
analysis and iterative design.
Chapter 3
=============================================================
3. INTRODUCTION TO INTERNSHIP
=============================================================
Project Summary
Purpose
Objective
Scope
Planning
Scheduling
=============================================================
INTRODUCTION TO INTERNSHIP
Fig: PlantVillage dataset
3.1 Purpose
3.2 Objective
The main objective of our project is to develop a user-friendly interface for farmers and
agricultural practitioners to easily upload images, receive disease classification results, and
access corresponding treatment advice. The specific objectives are:
1. Develop a comprehensive dataset consisting of annotated images depicting various
diseases affecting tomato, cotton, and corn crops.
2. Design and implement a convolutional neural network (CNN) architecture
optimized for multi-class classification of crop diseases.
3. Train the CNN model on the dataset to achieve high accuracy in distinguishing
between healthy and diseased instances of tomato, cotton, and corn plants.
4. Develop a user-friendly interface for farmers and agricultural practitioners to easily
upload images, receive disease classification results, and access corresponding treatment
advice.
3.3 Scope
The developed deep learning model can be extended to include more crop varieties
beyond tomato, cotton, and corn, enabling broader coverage of agricultural diseases and
supporting a wider range of farmers and crops.
3.4 Planning
The waterfall model is a sequential design process, often used in software development
processes, in which progress is seen as following steadily downwards (like a waterfall)
through the phases of conception, initiation, design, analysis, construction, testing,
production/implementation, and maintenance.
Product Owner
The Product Owner is the representative of the stakeholders and customers who use
the software. They focus on the business side and are responsible for the ROI of the
project. They translate the vision of the project to the team, validate the
benefits in stories to be incorporated into the Product Backlog, and prioritize
them on a regular basis.
Team
A group of professionals with the necessary technical knowledge who develop
the software.
3.5 Scheduling
The timeline chart spans weeks 1 to 13 and covers the following tasks:
● Planning and Research
● Design and Architecture
● Backend Development
● Testing and Quality Assurance
● Optimization and Deployment
● Documentation and Training
● Project Handover
The scheduling of our project is shown in the timeline chart above, which lists the tasks
completed during this semester, such as problem identification, project definition,
requirement gathering, and analysis and design, along with the start and finish dates of
each task and the duration required for its completion.
Chapter 4
=============================================================
4 SYSTEM ANALYSIS
=============================================================
Feasibility Study
SYSTEM ANALYSIS
● Hardware
▪ 32 GB RAM
▪ 500 GB of HDD space
▪ Network related tools
● Software
o Development Machine Requirements
▪ Python (NumPy, pandas, matplotlib, PIL, TensorFlow, Streamlit)
▪ Google Colab
▪ Browser
▪ Visual Studio Code
o Client Machine Requirements
▪ Browser with Internet Connectivity
o Host Machine Requirements
▪ Python (NumPy, pandas, matplotlib, PIL, TensorFlow, Streamlit)
▪ Browser with Internet Connectivity
Chapter 5
=============================================================
5 SYSTEM DESIGN
=============================================================
============================================================
SYSTEM DESIGN
System Design: In the modified waterfall model, the system design phase
incorporates some iterative elements. Initially, high-level system
architecture and design are created based on the gathered requirements.
The dataset collection process involves gathering a comprehensive and diverse set of
annotated images depicting tomato, cotton, and corn leaf diseases. This entails
collaborating with agricultural research institutions, farming communities, and data
providers to acquire high-quality images representing various disease types, stages, and
environmental conditions. Additionally, manual collection efforts may be undertaken to
supplement existing datasets and ensure adequate coverage of relevant diseases.
Emphasis is placed on inclusivity, ensuring representation across different geographical
regions, crop varieties, and cropping systems.
No. Class Name                Number of Images
1   Tomato_Early_blight       1000
2   Tomato_Late_blight        1649
4   Tomato_healthy            1591
5   Cotton_bacterial_blight   1325
6   Cotton_curl_virus         1400
7   Cotton_fussarium_wilt     1271
8   Cotton_healthy            1589
9   Corn_Blight               1600
10  Corn_Common_Rust          1478
11  Corn_Gray_Leaf_Spot       1386
12  Corn_Healthy              1256
Description: Corn Leaf Disease Classification: the farmer can upload an already-captured
corn leaf image or capture one in real time through the camera, and the system classifies
the disease.
Description: Cotton Leaf Disease Classification: the farmer can upload an already-captured
cotton leaf image or capture one in real time through the camera, and the system classifies
the disease.
Chapter 6
==============================================================
6. IMPLEMENTATION
==============================================================
Implementation Platform
Data Preprocessing
Dataset Split
CNN Model Architecture
Model Training
Output Screenshot
==============================================================
IMPLEMENTATION
● Google Colab
Free Access to GPU/TPU: Google Colab provides free access to GPU and
TPU resources, which are essential for training deep learning models
efficiently, especially for image classification tasks like leaf disease
classification.
In the Stack Overflow 2021 Developer Survey, Visual Studio Code was
ranked the most popular developer environment tool, with 70% of 82,000
respondents reporting that they use it.
Normalization was applied to scale the pixel values from the original 0–255 range into
the range 0 to 1, to improve the convergence of the model during training.
Resizing was also applied to standardize the image size to 256x256 pixels, reducing the
computational cost of training the model and allowing the images to be fed in batches
of size 32.
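The two preprocessing steps above can be sketched with the TensorFlow layers used in this project's stack (a minimal sketch: the random batch stands in for real leaf images, and the pipeline object is illustrative, not the project's actual code):

```python
import numpy as np
import tensorflow as tf

# Resize every image to 256x256, then rescale 0-255 pixel values
# into the [0, 1] range for better training convergence.
preprocess = tf.keras.Sequential([
    tf.keras.layers.Resizing(256, 256),
    tf.keras.layers.Rescaling(1.0 / 255),
])

# A stand-in batch of 32 RGB images with an arbitrary original size.
batch = np.random.randint(0, 256, size=(32, 300, 300, 3)).astype("float32")
out = preprocess(batch)
print(out.shape)  # (32, 256, 256, 3)
```

After this step every batch has a uniform shape and pixel values in [0, 1], ready for the CNN.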
The dataset was split randomly into three subsets, with the training dataset containing
80% of the images, the validation dataset containing 10% of the images, and the test
dataset containing 10% of the images. Tables 6.3.1–6.3.3 show the dataset lengths for
the train, validation, and test datasets.
Table 6.3.1 Tomato Dataset lengths for train, valid and test
Train dataset length 149
Valid dataset length 18
Test dataset length 20
Total dataset length 149+18+20 = 187
Table 6.3.2 Corn Dataset lengths for train, valid and test
Train dataset length 112
Valid dataset length 14
Test dataset length 14
Total dataset length 112+14+14 = 140
Table 6.3.3 Cotton Dataset lengths for train, valid and test
Train dataset length 125
Valid dataset length 15
Test dataset length 17
Total dataset length 125+15+17 = 156
The dataset split ensures that the model is trained on a sufficiently large dataset
while also allowing for a fair evaluation of its performance on unseen data.
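The 80/10/10 split can be sketched in plain Python (the `split_dataset` helper is a hypothetical illustration, not the project's code; with 187 items it reproduces the tomato counts in Table 6.3.1):

```python
import random

def split_dataset(items, train_frac=0.8, valid_frac=0.1, seed=42):
    """Shuffle and split items into train/valid/test (80/10/10 by default)."""
    items = list(items)
    random.Random(seed).shuffle(items)  # fixed seed for reproducibility
    n = len(items)
    n_train = int(n * train_frac)
    n_valid = int(n * valid_frac)
    train = items[:n_train]
    valid = items[n_train:n_train + n_valid]
    test = items[n_train + n_valid:]    # remainder goes to the test set
    return train, valid, test

# 187 batches, as in the tomato dataset (Table 6.3.1).
train, valid, test = split_dataset(range(187))
print(len(train), len(valid), len(test))  # 149 18 20
```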
CNNs are inspired by the structure and function of the visual cortex in the brain.
The network is made up of a series of interconnected layers, each consisting of
several neurons that perform simple computations on the input data. The layers are
typically arranged in a specific order, including convolutional layers, pooling
layers, and fully connected layers. The following figure shows the CNN model
architecture with properly connected layers.
Convolution Layer
Convolutional layers are the core building blocks of a CNN. They apply filters or
kernels to the input image, sliding over the entire image and performing a dot
product between the filter and the input pixels. This process generates a feature
map, highlighting the regions of the input image that are most important for
recognizing a particular pattern or object.
Pooling Layers
Max pooling is a commonly used type of pooling layer in which the maximum
value within a defined region of the input feature map is selected and then passed on
to the next layer.
For example, let's say we have an input feature map with a size of 6x6 and a pooling
window size of 2x2. The max pooling operation would take place as follows:
1. The input feature map is divided into non-overlapping regions of size 2x2.
2. The maximum value within each region is identified.
3. A new feature map is created with a size of 3x3, consisting of the maximum
values from each region.
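The 6x6 to 3x3 example above can be verified with a few lines of NumPy (a toy feature map, shown only to illustrate the operation):

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2 on a 2-D feature map."""
    h, w = x.shape
    # Group the map into non-overlapping 2x2 blocks, take each block's max.
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.arange(36).reshape(6, 6)  # a toy 6x6 feature map
pooled = max_pool_2x2(fmap)
print(pooled.shape)  # (3, 3)
print(pooled)        # the maximum of each 2x2 region
```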
Fully Connected Layers
Fully connected input layer – The preceding layers' output is
"flattened" and turned into a single vector which is used as an input for
the next stage.
The first fully connected layer – adds weights to the inputs from the feature
analysis to anticipate the proper label.
The images are fed to the model in batches with a batch size of 32. The convolution layers have 32, 64, 64, 64, 64, and
64 filters, respectively, with a kernel size of 3x3 and ReLU activation. Each
convolution layer is followed by a max pooling layer with a pool size of 2x2. The
output shape of each layer is calculated based on the input size, kernel size,
padding, and stride. The output parameters of each convolution layer are calculated
based on the kernel filter size, number of filters, and number of previous filters,
while the output parameters of the dense layer are based on the input and output
channel numbers.
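A sketch of the architecture just described, in Keras: six convolution layers with 32, 64, 64, 64, 64, and 64 filters, 3x3 kernels, ReLU activation, each followed by 2x2 max pooling. The dense-head width (64) and the 4-class softmax output are assumptions for illustration; the source does not state them:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(n_classes=4, input_shape=(256, 256, 3)):
    """Six conv blocks (32, 64, 64, 64, 64, 64 filters, 3x3, ReLU),
    each followed by 2x2 max pooling, then a dense classifier head."""
    model = models.Sequential()
    for filters in (32, 64, 64, 64, 64, 64):
        model.add(layers.Conv2D(filters, (3, 3), activation="relu"))
        model.add(layers.MaxPooling2D((2, 2)))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation="relu"))
    model.add(layers.Dense(n_classes, activation="softmax"))
    model.build(input_shape=(None,) + input_shape)
    return model

model = build_model()
model.summary()  # per-layer output shapes and parameter counts
```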
1. The first step in the training process is loading and preprocessing the training
data. This involves normalizing the data, splitting it into batches, and
converting it into the appropriate format for the model.
2. The second step in the training process is defining the model architecture. This
step involves specifying the neural network's architecture, including the
number and type of layers, activation functions, optimizer, and loss function.
The architecture of the CNN model is discussed in section 6.4.
3. The third step in the training process is compiling the model. This step
involves configuring the model for training by specifying the optimizer, loss
function, and any additional metrics to track during training. The Adam
optimizer and Sparse Categorical Crossentropy loss function are discussed in the
current chapter.
4. The final step in the training process is training the model. This step involves
feeding the training data into the model, computing the output, and adjusting the
model parameters using the Adam optimizer algorithm to minimize the loss function.
The number of training epochs determines how many times the entire training dataset
is used to train the model. The trained model is used for testing on the test set in the
next chapter.
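Steps 3 and 4 can be sketched as follows. A tiny stand-in model and random tensors replace the real CNN and PlantVillage batches so the snippet runs on its own; the optimizer and loss match those named above:

```python
import numpy as np
import tensorflow as tf

# Stand-in model; the CNN from section 6.4 would be used in practice.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(4, activation="softmax"),
])

# Step 3: compile with the Adam optimizer and Sparse Categorical
# Crossentropy loss, tracking accuracy during training.
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

# Step 4: fit on (features, integer labels); epochs controls how many
# times the whole training set is seen.
x = np.random.rand(32, 12).astype("float32")
y = np.random.randint(0, 4, size=(32,))
history = model.fit(x, y, epochs=2, batch_size=32, verbose=0)
print(sorted(history.history.keys()))
```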
After training, the CNN model is saved as saved_model.h5; at prediction time it is
reloaded to make predictions. In this project, the interface provides a dynamic path for
image predictions through a web page that allows end users to select images of their
choice and perform predictions on them.
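A minimal sketch of the save/reload cycle described above. Only the saved_model.h5 filename comes from the text; the tiny stand-in model, the 32x32 input size, and the 4-class output are assumptions for brevity:

```python
import numpy as np
import tensorflow as tf

# Build and save a small stand-in model, as done once after training.
model = tf.keras.Sequential([
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(4, activation="softmax"),
])
model.build(input_shape=(None, 32, 32, 3))
model.save("saved_model.h5")

# At prediction time the web interface reloads the model and runs
# inference on the user-selected image.
reloaded = tf.keras.models.load_model("saved_model.h5")
image = np.random.rand(1, 32, 32, 3).astype("float32")
probs = reloaded.predict(image, verbose=0)[0]
print("predicted class:", int(np.argmax(probs)))
```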
Description: When you click on the Browse files button, the file explorer opens
automatically; you then select the leaf image, and it is attached below the selector.
Description: After uploading the image from the file explorer, the image is displayed
and can then be classified.
Description: When you click on the Classify button, the tomato leaf disease is
classified, a confidence score for the disease is shown, and a treatment for that
disease is suggested.
Chapter 7
==============================================================
7 TESTING
==============================================================
Testing Plan
Testing Strategy
Testing Methods
Testing Cases
==============================================================
TESTING
The goal of test planning is to establish the list of tasks that, if performed, will identify all
of the requirements that have not been met in the software. There are many standards that
can be used for developing test plans. Created early in the deployment planning phase, the
test plan describes the testing effort and identifies the methodology that the team will use
to conduct tests. It also identifies the hardware, software, and tools required for testing and
the features and functions that will be tested. A well-rounded test plan notes any risk
factors that jeopardize testing and includes a testing schedule. So I can say that test
planning details the activities, dependencies, and effort required to conduct the system test.
The purpose of the testing strategy is to define the overall context for the entire testing
process. The process is different depending on the specific characteristics of your
solution. In many respects, this is the most important part of the testing process, since all
future testing decisions will be made within the context of the strategy. As a programmer,
we have to do only unit testing, which is a part of white-box testing. Other types of testing
in each phase of the software are done by the testing department.
Unit Testing
Unit testing is a software development process in which the smallest testable parts of an
application, called units, are individually and independently scrutinized for proper
operation. Unit testing is often automated but it can also be done manually.
Unit testing involves only those characteristics that are vital to the performance of the unit
under test. This encourages developers to modify the source code without immediate
concern about how such changes might affect the functioning of other units. Once the
units are working in the most efficient and error-free manner possible, larger components
of the program can be evaluated by means of integration testing.
The unit test verifies that the requirements are being met. The unit testing generally tests
two types of requirements.
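A unit test in this spirit exercises the smallest testable part in isolation, for example a pixel-normalization helper (the `normalize` function is hypothetical, shown only to illustrate the idea):

```python
import numpy as np

def normalize(image):
    """Scale 0-255 pixel values into the [0, 1] range."""
    return image.astype("float32") / 255.0

def test_normalize():
    # Verify the unit's requirement independently of the rest of the system.
    image = np.array([[0, 128, 255]], dtype="uint8")
    out = normalize(image)
    assert out.min() >= 0.0 and out.max() <= 1.0
    assert out[0, 2] == 1.0

test_normalize()
print("ok")  # prints "ok" when the unit behaves as required
```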
Table 7.4.1 Comparison of Loss and Accuracy for train and valid at different epochs.
S.No  No. of epochs  Train Accuracy  Train Loss  Valid Accuracy  Valid Loss
1     10             0.91            0.22        0.87            0.26
Figure 7.4.1 Plotted the accuracy and loss for both train and validation
Figure 7.4.2 Uploaded image and the accuracy for tomato leaf disease
Table 7.4.2 Comparison of Loss and Accuracy for train and valid at different epochs.
S.No  No. of epochs  Train Accuracy  Train Loss  Valid Accuracy  Valid Loss
1     10             0.86            0.34        0.85            0.36
Figure 7.4.3 Plotted the accuracy and loss for both train and validation
Figure 7.4.4 Uploaded image and the accuracy for corn leaf disease
Table 7.4.3 Comparison of Loss and Accuracy for train and valid at different epochs.
S.No  No. of epochs  Train Accuracy  Train Loss  Valid Accuracy  Valid Loss
Figure 7.4.5 Plotted the accuracy and loss for both train and validation
Figure 7.4.6 Uploaded image and the accuracy for cotton leaf disease
Chapter 8
==============================================================
CONCLUSION
==============================================================
CONCLUSION
In addition to its successes, image classification using deep learning also faces ongoing
challenges. One such challenge is the need for large amounts of labeled data for training, which
can be costly and time-consuming to acquire, especially in specialized domains. Furthermore,
ensuring the robustness of models to diverse real-world conditions, such as variations in
lighting, viewpoint, and occlusion, remains an active area of research.
Problem Encountered:
Possible Solutions:
b. Summary of Internship
Chapter 9
==============================================================
9. LIMITATION AND FUTURE WORK
==============================================================
● Limitations:
● Future Enhancements:
REFERENCES