Grapevine Disease
Introduction
About the Project
In this notebook, we are going to recognize three of the most common grapevine diseases by
the appearance of the leaf when the grapevine is infected.
The three diseases are Esca, Black Rot, and Leaf Blight.
- Esca is a fungal disease that is not caused by a single agent but by multiple agents at the same time
- Black Rot is another fungal disease, caused by Guignardia bidwellii
- Leaf Blight, caused by bacteria
Loading Libraries
In [1]
import pandas as pd
import numpy as np
import os
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
plt.style.use('seaborn-darkgrid')
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import img_to_array, to_categorical
import cv2
Loading Data
In [2]
train = pd.DataFrame(columns=["label", "path"])
test = pd.DataFrame(columns=["label", "path"])

# Walk the dataset folders: the last two path components give the split and the label
for dirname, _, filenames in os.walk('/kaggle/input/grape-disease-dataset-original/Original Data'):
    for filename in filenames:
        paths = dirname.split('/')
        batch = paths[-2]
        label = paths[-1]
        file_path = os.path.join(dirname, filename)
        dic = {'label': label, 'path': file_path}
        if batch == 'test':
            test.loc[len(test)] = dic
        elif batch == 'train':
            train.loc[len(train)] = dic

print(train.shape)
print(test.shape)
display(train.head())
(7222, 2)
(1805, 2)
   label  path
0  ESCA   /kaggle/input/grape-disease-dataset-original/O...
1  ESCA   /kaggle/input/grape-disease-dataset-original/O...
2  ESCA   /kaggle/input/grape-disease-dataset-original/O...
3  ESCA   /kaggle/input/grape-disease-dataset-original/O...
4  ESCA   /kaggle/input/grape-disease-dataset-original/O...
Data Distribution
In [3]
fig, ax = plt.subplots(1,2, figsize=(12,4))
sns.countplot(data=train, x='label', ax=ax[0])
sns.countplot(data=test, x='label', ax=ax[1]);
[Figure: count plots of label for the train split (left) and the test split (right); x-axis: label (ESCA, Healthy, Leaf Blight, Black Rot), y-axis: count]
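The balance seen in the count plots can also be checked numerically. Below is a minimal sketch in plain Python, assuming a list of labels standing in for `train['label']`. The counts are made up for illustration and are not the dataset's real counts. It compares two naive baselines: guessing uniformly at random and always predicting the largest class.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# Hypothetical labels standing in for train['label']; the counts are invented.
labels = ["ESCA"] * 30 + ["Healthy"] * 25 + ["Leaf Blight"] * 25 + ["Black Rot"] * 20
shares = class_shares(labels)

# Guessing uniformly among K classes is right 1/K of the time on average.
uniform_guess = 1 / len(shares)
# Always predicting the most frequent class is right as often as that class occurs.
majority_guess = max(shares.values())

print(shares)
print(uniform_guess, majority_guess)
```

When the classes are balanced, both baselines sit close to each other. With the notebook's DataFrame, `train['label'].value_counts(normalize=True)` gives the same shares.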
The categories are quite balanced in both splits, so randomly predicting the category
would give us an accuracy of about 0.25 (one chance in four).
Preprocessing and Encoding
Image Preprocessing
In [4]
X_train = []

for i, path in enumerate(train['path']):
    image = cv2.imread(path)
    image = cv2.resize(image, (224,224))
    image = img_to_array(image)

    X_train.append(image)

    if i % 500 == 0:
        print(f"[info] processing image number {i}")

print('[info] preprocessing over...')
X_train = np.array(X_train)
print('[info] normalizing array...')
X_train /= 255
y_train = train['label']
# Hold out 20% of the training images for validation, keeping class proportions
X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.2, stratify=y_train)
print('[info] train splits ready')

print(X_train.shape)
print(y_train.shape)
[info] processing image number 0
[info] processing image number 500
[info] processing image number 1000
[info] processing image number 1500
[info] processing image number 2000
[info] processing image number 2500
[info] processing image number 3000
[info] processing image number 3500
[info] processing image number 4000
[info] processing image number 4500
[info] processing image number 5000
[info] processing image number 5500
[info] processing image number 6000
[info] processing image number 6500
[info] processing image number 7000
[info] preprocessing over...
[info] normalizing array...
[info] train splits ready
(5777, 224, 224, 3)
(5777,)
In [5]
# Test preprocessing is disabled: the cell is kept as a string so it does not run
'''
X_test = []

for i, path in enumerate(test['path']):
    image = cv2.imread(path)
    image = cv2.resize(image, (224,224))
    image = img_to_array(image)

    X_test.append(image)

    if i % 500 == 0:
        print(f"[info] processing image number {i}")

print('[info] preprocessing over...')
X_test = np.array(X_test)
print('[info] normalizing array...')
X_test /= 255
y_test = test['label']
print('[info] test splits ready')

print(X_test.shape)
print(y_test.shape)
'''
Label Encoding
In [6]
disease_list = train.label.unique()
disease_dic = {disease:i for i, disease in enumerate(disease_list)}
decoder = {value:key for key, value in disease_dic.items()}
In [7]
y_train = np.array([disease_dic[y] for y in y_train])
y_val = np.array([disease_dic[y] for y in y_val])
#y_test = np.array([disease_dic[y] for y in y_test])
y_train = to_categorical(y_train, num_classes=4)
y_val = to_categorical(y_val, num_classes=4)
#y_test = to_categorical(y_test, num_classes=4)
bs = 32
lr = 0.0001
epoch = 10
decay = lr / epoch
Model and Training
Transfer Learning with ResNet50
In [8]
from tensorflow.keras.models import Model
from tensorflow.keras.layers import AveragePooling2D, Dense, Activation, Flatten, Dropout, Input
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.applications import resnet50

print("[info] preparing model...")
# ResNet50 pretrained on ImageNet, without its classification head
modelBase = resnet50.ResNet50(include_top=False,
                              input_tensor=Input(shape=(224,224,3)),
                              weights='imagenet')

# New classification head for the four classes
modelHead = modelBase.output
modelHead = AveragePooling2D(pool_size=(7, 7))(modelHead)
modelHead = Flatten()(modelHead)
modelHead = Dense(256)(modelHead)
modelHead = Activation('relu')(modelHead)
modelHead = Dropout(0.5)(modelHead)
modelHead = Dense(4)(modelHead)
modelHead = Activation('softmax')(modelHead)

model = Model(inputs=modelBase.input, outputs=modelHead)

for layer in modelBase.layers:
    layer.trainable =

opt = Adam(learning_rate=lr)

print('[info] compiling model...')
model.compile(optimizer=opt, loss='categorical_crossentropy',
              metrics=['accuracy'])

print('[info] training model....')
h = model.fit(X_train, y_train,
              validation_data=(X_val, y_val),
              epochs=epoch,
              batch_size=bs,
              steps_per_epoch=X_train.shape[0] // bs,
              validation_steps=X_val.shape[0] // bs,)
[info] preparing model...
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
94765736/94765736 [==============================] - 1s 0us/step
[info] compiling model...
[info] training model....
Epoch 1/10
180/180 [==============================] - 46s 195ms/step - loss: 0.0918 - accuracy: 0.9682 - val_loss: 14.9391 - val_accuracy: 0.2340
Epoch 2/10
180/180 [==============================] - 34s 186ms/step - loss: 0.0186 - accuracy: 0.9951 - val_loss: 22.6166 - val_accuracy: 0.2340
Epoch 3/10
180/180 [==============================] - 33s 186ms/step - loss: 0.0860 - accuracy: 0.9983 - val_loss: 15.9742 - val_accuracy: 0.2340
Epoch 4/10
180/180 [==============================] - 38s 186ms/step - loss: 0.0177 - accuracy: 0.9962 - val_loss: 5.9145 - val_accuracy: 0.2583
Epoch 5/10
180/180 [==============================] - 33s 186ms/step - loss: 0.0082 - accuracy: 0.9979 - val_loss: 1.0267 - val_accuracy: 0.7639
Epoch 6/10
180/180 [==============================] - 33s 185ms/step - loss: 0.0217 - accuracy: 0.9953 - val_loss: 0.0456 - val_accuracy: 0.986
Epoch 7/10
180/180 [==============================] - 33s 185ms/step - loss: 0.0829 - accuracy: 0.9991 - val_loss: 4.6787e-04 - val_accuracy: 1.0000
Epoch 8/10
180/180 [==============================] - 33s 185ms/step - loss: 9.1614e-04 - accuracy: 0.9998 - val_loss: 0.0822 - val_accuracy: 0.9986
Epoch 9/10
180/180 [==============================] - 38s 185ms/step - loss: 0.0045 - accuracy: 0.9999 - val_loss: 0.8019 - val_accuracy: 0.9993
Epoch 10/10
180/180 [==============================] - 36s 199ms/step - loss: 0.0020 - accuracy: 0.9997 - val_loss: 0.8025 - val_accuracy: 0.9986
Training Loss and Accuracy
In [9]
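plt.figure(figsize=(12,4))
plt.plot(np.arange(0, epoch), h.history['loss'], label='train_loss')
plt.plot(np.arange(0, epoch), h.history['val_loss'], label='validation_loss')

plt.title('Training Loss')
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend()
plt.show()
[Figure: Training Loss; train_loss and validation_loss against epoch]
In [10]
plt.figure(figsize=(12,4))
plt.plot(np.arange(0, epoch), h.history['accuracy'], label='train_accuracy')
plt.plot(np.arange(0, epoch), h.history['val_accuracy'], label='validation_accuracy')

plt.title('Training Accuracy')
plt.xlabel("Epoch #")
plt.ylabel("Accuracy")
plt.legend(loc="lower right")
plt.show()
[Figure: Training Accuracy; train accuracy and validation accuracy against epoch]
Conclusion and Further Improvements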
Until recently, I avoided training computer vision models because I have no GPU on my
computer.
Running the notebook on Kaggle is challenging because of the memory limit.
I still need to test the model on other data.
One option is to save the weights and run the test in another notebook;
another is to optimize the memory management.
For example:
- using tensorflow to load the images at the target resolution from the start
- encoding the images as uint8 before scaling
- and as float32 after scaling
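The memory effect of the last two points can be sketched with NumPy alone. This is an illustration under assumptions, not the notebook's code: the batch is random data, and `scaled_batches` is a hypothetical helper that keeps the full array as 8-bit integers and converts to float32 only one batch at a time.

```python
import numpy as np

# Hypothetical batch of 8 RGB images at 224x224, stored as raw 8-bit pixels.
batch_uint8 = np.random.randint(0, 256, size=(8, 224, 224, 3), dtype=np.uint8)

# Scaling to [0, 1] needs a float type: float32 uses 4 bytes per value, float64 uses 8.
batch_float32 = batch_uint8.astype(np.float32) / 255.0
batch_float64 = batch_uint8.astype(np.float64) / 255.0

for name, arr in [("uint8", batch_uint8), ("float32", batch_float32), ("float64", batch_float64)]:
    print(f"{name}: {arr.nbytes / 1e6:.1f} MB")


def scaled_batches(images_uint8, batch_size):
    """Yield float32 batches scaled to [0, 1], converting one batch at a time."""
    for start in range(0, len(images_uint8), batch_size):
        yield images_uint8[start:start + batch_size].astype(np.float32) / 255.0


first = next(scaled_batches(batch_uint8, batch_size=4))
print(first.shape, first.dtype)
```

As a rough order of magnitude, 7222 images of 224 x 224 x 3 values come to about 1.1 GB as uint8 and about 4.3 GB as float32, so delaying the conversion until each batch is needed keeps the resident array four times smaller.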
The dataset for this notebook was already augmented, which might cause data leakage between
the train and validation sets.
Next time I will use the original dataset to prevent the problem.
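If the augmented dataset had to be kept, one way to limit this kind of leakage would be to split by source image, so that all augmented copies of a leaf fall on the same side. Below is a minimal sketch in plain Python. The file naming scheme (`<source id>_<augmentation>.jpg`) and the helper names `grouped_split` and `source_id` are assumptions made for illustration; the real dataset may not expose the source image in its file names.

```python
import random
from collections import defaultdict


def grouped_split(items, group_of, val_fraction=0.2, seed=0):
    """Split items so that every group lands entirely in train or in validation."""
    groups = defaultdict(list)
    for item in items:
        groups[group_of(item)].append(item)

    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    # Pick whole groups for validation, at least one.
    n_val = max(1, round(len(group_ids) * val_fraction))
    val_ids = set(group_ids[:n_val])

    train = [x for g in group_ids if g not in val_ids for x in groups[g]]
    val = [x for g in group_ids if g in val_ids for x in groups[g]]
    return train, val


def source_id(name):
    """Hypothetical rule: the part before the first underscore identifies the source image."""
    return name.split("_")[0]


# Invented file names: 10 source leaves, each with 3 augmented variants.
files = [f"leaf{i:03d}_{aug}.jpg" for i in range(10) for aug in ("orig", "flip", "rot")]
train_files, val_files = grouped_split(files, source_id)

print(len(train_files), len(val_files))
```

scikit-learn also provides `GroupShuffleSplit` in `sklearn.model_selection` for the same purpose. Note that this sketch does not stratify by class, unlike the `train_test_split` call used earlier in the notebook.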