Final Project Report
Paper Title: Feature Extraction and Classification of Greater Bean Leaf Disease
Submitted by:
Name: Abdullah Al Rafi
Id: 212-15-4218
Section: 60_B
Submitted to:
Dr. Md. Taimur Ahad
Associate Professor and Associate Head
Department of Computer Science and Engineering
Dataset:
Dataset link:
https://www.kaggle.com/datasets/sivm205/Greaterbean-diseased-leaf-dataset
Reference notebook:
https://www.kaggle.com/code/lorresprz/leaf-diseases-classification-inceptionv3-network
Description:
The dataset comprises images of diseased Greater Bean leaves, covering 10 distinct ailments such as bacterial blight, brown spot, frog eye, rust, and sudden death syndrome. The images are high quality and meticulously annotated, making them well suited for training machine-learning models that classify plant diseases. The ten classes, as named in the dataset, are: Bacterial blight, Brown spot, Crestamento, Ferrugen, Mossaic Virus, Powdery mildew, Septoria, Southern blight, Sudden Death Syndrone, and Yellow Mosaic.
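For readers who download the dataset, a minimal sketch for verifying the ten class folders (the "./images" layout mirrors the paths used in the Code section below and is an assumption about the local setup):

import os

classes = ["bacterial_blight", "brown_spot", "crestamento", "ferrugen",
           "Mossaic Virus", "powdery_mildew", "septoria", "Southern blight",
           "Sudden Death Syndrone", "Yellow Mosaic"]
for c in classes:
    folder = os.path.join("./images", c)
    print(c, len(os.listdir(folder)))  # image count per class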
Usage:
The Greater Bean Diseased Plant Leaf Images Dataset is primarily valuable for developing and testing models that identify Greater Bean diseases. It also serves multiple research purposes, such as estimating the prevalence of various diseases on Greater Bean plants and examining the impact of environmental factors on leaf health. Furthermore, it holds educational significance as a learning platform for both machine learning and plant pathology. Finally, it supports agricultural applications such as disease detection and decision-making, contributing to early disease detection in Greater Bean cultivation.
Importance:
The Greater Bean Diseased Plant Leaf Images Dataset holds significant value in agricultural technology. It can help transform the control of Greater Bean diseases by enabling precise and effective detection tools, thereby increasing productivity and minimizing losses for farmers globally. The dataset also supports sophisticated methods in precision agriculture, aiding farmers in the early detection of Greater Bean crop diseases. Its impact extends beyond research facilities, fostering substantial improvements in agricultural efficiency and sustainability on a global scale.
Code:
import time
st = time.time()

import os
import numpy as np
import pandas as pd
import cv2 as cv
import matplotlib.pyplot as plt
import seaborn as sns
import tensorflow as tf
# Directory for each of the 10 disease classes
BB = "./images/bacterial_blight"
BS = "./images/brown_spot"
CR = "./images/crestamento"
FE = "./images/ferrugen"
MV = "./images/Mossaic Virus"
PM = "./images/powdery_mildew"
SE = "./images/septoria"
SB = "./images/Southern blight"
SDS = "./images/Sudden Death Syndrone"
YM = "./images/Yellow Mosaic"
img_data = []
labels = []
# The report initialises labels to NaN and never shows how they are filled;
# assigning each image its class index here (an assumption) keeps the
# pipeline runnable end to end.
for label, folder in enumerate([BB, BS, CR, FE, MV, PM, SE, SB, SDS, YM]):
    paths = get_path_image(folder)
    img_data.extend(paths)
    labels.extend([label] * len(paths))
print(len(img_data))

data = pd.DataFrame({"img_data": img_data, "labels": labels})
data["labels"] = data["labels"].astype("float64")
img_list = []

from tensorflow.keras.applications import (VGG19, ResNet50, ResNet101,
    InceptionV3, DenseNet121, MobileNetV2)
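# NOTE: the report never shows how img_list is populated. A minimal sketch
# (the preprocessing choices here are assumptions; input size differs per backbone):
for p in data["img_data"]:
    img = cv.imread(p)                       # BGR image as a NumPy array
    img = cv.resize(img, (224, 224))         # 224x224 suits most of these backbones
    img_list.append(np.expand_dims(img, 0))  # add a batch dimension for predict()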
def feature_extract(model):
    # Load the chosen ImageNet-pretrained backbone as a fixed feature
    # extractor: no classification head, global average pooling output.
    if model == "VGG19":
        model = VGG19(weights='imagenet', include_top=False, pooling="avg")
    elif model == "ResNet50":
        model = ResNet50(weights='imagenet', include_top=False, pooling="avg")
    elif model == "ResNet101":
        model = ResNet101(weights='imagenet', include_top=False, pooling="avg")
    elif model == "InceptionV3":
        model = InceptionV3(weights='imagenet', include_top=False, pooling="avg")
    elif model == "DenseNet121":
        model = DenseNet121(weights='imagenet', include_top=False, pooling="avg")
    elif model == "MobileNetV2":
        model = MobileNetV2(weights='imagenet', include_top=False, pooling="avg")
    else:
        raise ValueError("Unsupported model name: " + model)
    return model
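# Example call (hypothetical; the report does not show which backbone is
# selected at this point; DenseNet121 is one of the models it evaluates):
model = feature_extract("DenseNet121")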
features_list = []
for i in range(len(img_list)):
    image = img_list[i]
    # With pooling="avg" each backbone returns a flat feature vector,
    # e.g. features = model.predict(image).reshape(512,) when VGG19 is selected
    features = model.predict(image).reshape(-1)
    features_list.append(features)

features_df = pd.DataFrame(features_list)
features_df["labels"] = data["labels"]
x = features_df.drop(['labels'], axis=1)
y = features_df.loc[:, "labels"].values

from sklearn.impute import SimpleImputer  # corrected import; unused below but kept from the report
ets = time.time()
et = ets - st
print(f"Execution time: {et}s")
from sklearn.preprocessing import MinMaxScaler

# Rescale every feature to [0, 1]
scaler = MinMaxScaler()
scaler.fit(x)
x_ = scaler.transform(x)
x_ = pd.DataFrame(x_)
from sklearn.feature_selection import SelectKBest, f_classif, RFE, SelectFromModel
from sklearn.ensemble import RandomForestClassifier
from sklearn.decomposition import PCA

def anova_fs():
    # Univariate ANOVA F-test; k is the number of features to keep
    selector = SelectKBest(f_classif, k=500)
    selector.fit(x_, y)
    return x_.loc[:, selector.get_support()]

def RFE_fs():
    # Recursive feature elimination with a random-forest estimator
    rfe_selector = RFE(estimator=RandomForestClassifier(), n_features_to_select=200, step=0.1)
    rfe_selector.fit(x_, y)
    return x_.loc[:, rfe_selector.get_support()]

def rf_fs():
    # Keep features whose random-forest importance exceeds 1.25x the median
    embeded_rf_selector = SelectFromModel(
        RandomForestClassifier(n_estimators=200, random_state=5), threshold='1.25*median')
    embeded_rf_selector.fit(x_, y)
    embeded_rf_support = embeded_rf_selector.get_support()
    return x_.loc[:, embeded_rf_support]

def pca_fs():
    # Project the scaled features onto the first 500 principal components
    pca = PCA(n_components=500)
    return pca.fit_transform(x_)
# ANOVA selection shown here; any of the selectors above could be swapped in
fs_x = anova_fs()
print(f"Number of selected features: {fs_x.shape[1]}")
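# NOTE: the train/test split is missing from the report's code. A minimal
# sketch; test_size, random_state, and stratification are assumptions.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(
    fs_x, y, test_size=0.2, random_state=42, stratify=y)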
# Convert x_train and x_test to NumPy arrays if they are not already
x_train = np.array(x_train)
x_test = np.array(x_test)
from sklearn.neighbors import KNeighborsClassifier
neig = np.arange(1, 25)  # candidate K values (range assumed; not stated in the report)
train_accuracy = []
test_accuracy = []
for i, k in enumerate(neig):
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(x_train, y_train)
    prediction_ = knn.predict(x_test)
    train_accuracy.append(knn.score(x_train, y_train))
    test_accuracy.append(knn.score(x_test, y_test))
print("Best accuracy is {} with K = {}".format(np.max(test_accuracy),
      1 + test_accuracy.index(np.max(test_accuracy))))
knn = KNeighborsClassifier(n_neighbors=17)
knn.fit(x_train, y_train)
predicted = knn.predict(x_test)
score = knn.score(x_test, y_test)
knn_score_ = np.mean(score)
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import precision_score, recall_score, f1_score

nb_model = GaussianNB()
nb_model.fit(x_train, y_train)
predicted = nb_model.predict(x_test)
score = nb_model.score(x_test, y_test)
print('Accuracy : %.3f' % score)
p = precision_score(y_test, predicted, average='weighted')
print('precision : %.3f' % p)
r = recall_score(y_test, predicted, average='weighted')
print('recall : %.3f' % r)
f1 = f1_score(y_test, predicted, average='weighted')
print('f1-score : %.3f' % f1)
f1_w = f1_score(y_test, predicted, average='weighted')
print('weighted f1-score : %.3f' % f1_w)
ets = time.time()
et = ets - st
print(f"Execution time: {et}s")
Overview of Image Processing and Model Performance:
DenseNet121 results with KNN, SVM, Random Forest, and Naïve Bayes: (result tables/figures not recoverable from the source.)
F1-Score:
Precision and recall each have strengths and weaknesses; the F1 score is their harmonic mean, which balances the two metrics. It is especially useful when high precision is required while recall must also be maximized.
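As a quick numerical check of the harmonic-mean relationship (the toy labels below are made up purely for illustration):

from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [0, 0, 1, 1, 1]
y_pred = [0, 1, 1, 1, 0]
p = precision_score(y_true, y_pred)   # 2/3
r = recall_score(y_true, y_pred)      # 2/3
print(f1_score(y_true, y_pred))       # 0.667 = 2*p*r / (p + r)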
Analysis
To assess how well the different architectures (DenseNet121, InceptionV3, MobileNetV2, ResNet101, ResNet50, and VGG19) identify Greater Bean diseases, the evaluation metrics accuracy, precision, recall, and F1 score are computed for four classifiers: KNN, SVM, Random Forest, and Naïve Bayes.
Random Forest (best overall): Consistently exhibits robust performance across most architectures, particularly DenseNet121, MobileNetV2, and ResNet101.
SVM: Demonstrates outstanding performance, especially with ResNet101, ResNet50, and VGG19 architectures.
KNN and Naïve Bayes: Generally perform adequately but are surpassed by SVM and Random Forest, with performance varying across architectures.
DenseNet121 and ResNet101: Offer the highest accuracy and metric balance when paired with Random Forest Classifier and
SVM.
InceptionV3 and VGG19: Display diverse performance, with VGG19 exhibiting lower effectiveness when paired with KNN.
The report outlines the merits and limitations of each model for categorizing Greater Bean leaf ailments. Overall, Random Forest and SVM stand out as the leading classifiers, while DenseNet121 and ResNet101 prove to be the most effective feature-extraction backbones for classifying Greater Bean diseases.