
Addressing Data Imbalance in Plant Disease Recognition through Contrastive Learning


Bryan Chung
2024 IEEE 3rd International Conference on AI in Cybersecurity (ICAIC) | 979-8-3503-8185-6/24/$31.00 ©2024 IEEE | DOI: 10.1109/ICAIC60265.2024.10433841

The Loomis Chaffee School


Connecticut, United States
bryan_chung@loomis.org

Abstract: The following study introduces a novel framework for recognizing plant diseases, tackling the issue of imbalanced datasets, which is prevalent in agriculture, a key sector for many economies. Plant diseases can significantly affect crop quality and yield, making early and accurate detection vital for effective disease management. Traditional Convolutional Neural Networks (CNNs) have shown promise in plant disease recognition but often fall short with non-tomato crops due to class imbalance in datasets. The proposed approach utilizes contrastive learning to train a model on the PlantDoc dataset in a self-supervised manner, allowing it to learn meaningful representations from unlabeled data by maximizing the similarity between images based on disease state rather than species. This method shows a marked improvement in accuracy, achieving 87.42% on the PlantDoc dataset and demonstrating its superiority over existing supervised learning methods. The agnostic nature of the model towards plant species allows for universal application in agriculture, offering a significant tool for disease management and enhancing productivity in both existing farms and future smart farming environments.

Index Terms: Plant Disease Recognition, Data Imbalance, Contrastive Learning, Machine Learning, Convolutional Neural Network, Plant Disease, Plant Science

I. INTRODUCTION

A. Problem Definition

Agriculture is the backbone of innumerable economies and is pivotal in ensuring food security across the globe. One of the most significant challenges faced by the agricultural sector is the prevalence of plant diseases, which can have catastrophic effects on plant quality and yield. Early and accurate detection of plant diseases is essential for effective disease management and preventing the spread of infections. However, current means of identifying plant diseases, which often rely on the expertise of farmers, are time-consuming and inconsistent by human nature.

In recent years, computer vision and machine learning technologies have shown noteworthy promise in numerous fields, one of which is agriculture. Among such technologies, CNNs (Convolutional Neural Networks) have emerged as a powerful tool for image recognition tasks [1]. CNNs, which can extract complex patterns from images, possess considerable potential to revolutionize plant disease recognition. This research aims to explore and develop a CNN-based model for accurate, accelerated, and accessible plant disease diagnosis, which could significantly contribute to not only traditional agricultural practices but also novel forms of farms.

B. Previous Methods

To address the aforementioned problem, several research studies have explored the use of CNNs, which have demonstrated comparable performance in image processing. In the early stages of research, Tm et al. utilized the LeNet-5 architecture to classify ten disease categories of tomato, achieving a 94.8% test accuracy on the PlantVillage dataset [2] [3]. Later, Karthik et al. used a residual CNN to achieve a 98% accuracy on the same dataset while classifying four diseases [4]. Nithish Kannan et al. applied transfer learning on the ResNet-50 model to achieve a 97% accuracy on tomato leaves [5] [6]. More recently, Zhao et al. once again used the ResNet architecture to achieve a 96.81% accuracy on the PlantVillage dataset, with each image placed in one of ten categories [7]. Finally, Bedi et al. proposed a hybrid model using autoencoders with a CNN to achieve a 98.38% accuracy on the same dataset [8].

Despite the success of these methods, they often yield unsatisfactory results when applied to non-tomato crops. This can be attributed to a class imbalance problem within the dataset, where more than 33% of the training samples consist of tomato leaf images, as shown in Figure 1.

Fig. 1. A Distribution of Plant Classes in a Sample Plant Dataset.

Because of such imbalances in datasets, and because each model is narrowly scoped to identify mainly tomato leaf diseases, it is difficult in practice to apply the aforementioned models on a farm where many crops and leaves coexist.

To address this issue, a much more varied dataset containing a comparable number of pictures for many plants that the model may encounter is necessary. However, the process of collecting such a dataset will be time-consuming and costly due to the diverse nature of plants and their respective diseases.

Authorized licensed use limited to: Bahria University. Downloaded on March 10,2024 at 05:20:16 UTC from IEEE Xplore. Restrictions apply.
C. Proposed Method

To address the aforementioned problem, this paper proposes a novel crop-type agnostic plant disease recognition framework. As noted above, the widely used PlantVillage dataset has many imbalances regarding the types of plants and their respective diseases. Hence, this paper will use the PlantDoc dataset, which still faces imbalance issues but offers more diversity in the labeled category of diseases.

The biggest challenge is to prevent the model from being biased by the imbalances of the dataset. This paper proposes the application of contrastive learning to a CNN model via transfer learning, so that the model does not rely heavily on the imbalanced labeled dataset but instead begins by training in an unsupervised manner.

The process begins by training a CNN through contrastive learning, where the model learns to extract meaningful representations from unlabeled data, leveraging the inherent structures and relationships within the dataset. In this process, the model is directed to map similar samples closer together and dissimilar samples farther apart in high-dimensional space, using cosine similarity as a metric.

Once the pre-training is over, transfer learning is applied to the CNN model. The model is fine-tuned like a typical CNN, now using labeled data from the PlantDoc dataset [9]. By incorporating contrastive learning and transfer learning, the proposed framework attempts to develop a plant disease recognition model that is less dependent on an imbalanced, labeled dataset. This approach allows for a more general application of a plant disease recognition framework, as the model is not limited by variations in image states, plant types, and disease categories.

Section II explains some of the key ideas used in the proposed approach, and Section III gives more specific details of how these concepts are applied in this model. Section IV discusses the results of the proposed model, and Section V summarizes the implementations, findings, and implications.

II. BACKGROUND

A. Image Classification

Fig. 2. Classical CNN Architecture. Adapted from [10]

As shown above, the model goes through a process of training where it learns to map input images into their categories via a series of convolutional, pooling, and fully connected layers. During training, the model optimizes its parameters by repeating such processes and by using an optimization algorithm, such as gradient descent, and a loss function, such as cross-entropy, to minimize the errors between the predicted and true labels. This research aims to apply such image classification techniques to successfully categorize various types of plant diseases.

B. Contrastive Learning

Contrastive learning is a technique in machine learning that focuses on distinguishing similar and dissimilar vectors in multi-dimensional space. The essence of contrastive learning is to train a model in such a way that representations, or embeddings, of similar inputs are pulled closer together, while those of dissimilar inputs are pushed farther apart. One widely used metric for evaluating the similarity of two embeddings is the cosine similarity, which outputs a value in [-1, 1]: close to 1 if the two vectors point in similar directions, and close to -1 if they point in opposite directions. To quantify the performance of a model, a loss function called the Normalized Temperature-Scaled Cross-Entropy Loss (NT-Xent Loss) is often employed. This function combines cosine similarity with softmax to output a probability array. The function then computes a cross-entropy loss between the predicted and actual labels, giving a measure of how different they are. Minimizing the loss drives the model to produce representations that are grouped together if they are similar and scattered if they are not.

Fig. 3. Example Training Process of a Contrastive Learning Model. Adapted from [11]
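The combination of cosine similarity, softmax, and cross-entropy described in this subsection can be traced with a small sketch. The code below is illustrative only and is not the paper's implementation: the function names, the `positive_of` pairing list, and the temperature value of 0.5 are assumptions introduced here, and plain Python lists stand in for the embeddings a CNN would produce.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; the result lies in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nt_xent_loss(embeddings, positive_of, temperature=0.5):
    """Average NT-Xent loss over all samples.

    embeddings  -- list of embedding vectors
    positive_of -- hypothetical pairing: positive_of[i] is the index of the
                   sample that should be pulled close to sample i
    temperature -- scaling factor (0.5 is an assumed value, not from the paper)
    """
    n = len(embeddings)
    total = 0.0
    for i in range(n):
        # Temperature-scaled similarities to every other sample (self excluded).
        logits = {
            j: cosine_similarity(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Log of the softmax denominator, computed stably via log-sum-exp.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(value - peak) for value in logits.values())
        )
        # Cross-entropy with the positive partner treated as the true class.
        total += log_denominator - logits[positive_of[i]]
    return total / n


# Two tight pairs of embeddings: (0, 1) and (2, 3).
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
matched = nt_xent_loss(embeddings, positive_of=[1, 0, 3, 2])
mismatched = nt_xent_loss(embeddings, positive_of=[2, 3, 0, 1])
print(f"loss with matching pairs:    {matched:.4f}")
print(f"loss with mismatched pairs:  {mismatched:.4f}")
```

When the designated positives are the vectors that already lie close together, the loss is lower than when the pairing is deliberately mismatched, which is the behavior that minimizing the loss rewards.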

This research utilizes contrastive learning to enable the trained network to effectively cluster similar feature spaces within the same plant disease categories.

C. The Class Imbalance Problem

The class imbalance problem is a common challenge in machine learning, particularly in classification tasks. It arises when the distribution of classes in the training dataset is highly imbalanced, with one or more classes being significantly underrepresented compared to others. This imbalance can adversely affect the performance of classification models, as they tend to favor the majority class and struggle to accurately predict minority class instances.

Various approaches have been proposed in the literature to address this problem. Some research studies have focused on sampling techniques, such as undersampling the majority class, oversampling the minority class, or generating synthetic samples. Other methods involve modifying the original images using data augmentation techniques.

One relatively new approach to handling imbalanced classification is contrastive learning, as described in Section II-B. Because a contrastive learning model trains itself based on the relationships between the representations of images rather than explicitly optimizing losses for class-based objectives, it offers a level of immunity against class imbalances.

This research addresses the imbalanced classification problem via contrastive learning, not restricting the model by the limited nature of plant datasets, but encouraging it to find relationships between images. Section IV shows through comprehensive experiments that contrastive learning suppresses the problem.

III. PROPOSED PLANT DISEASE RECOGNITION FRAMEWORK

This section presents the methodology used for contrastive learning-based plant disease recognition using convolutional neural networks. The proposed approach aims to help models find better representations of plant diseases and enhance their performance in accurately classifying many types of plant diseases. The proposed plant disease recognition framework comprises two main steps: contrastive learning and supervised learning.

A. Phase I: Contrastive Learning

Fig. 4. Contrastive Learning Process in Proposed Approach.

Figure 4 demonstrates the first main portion of the proposed architecture, the contrastive learning part. In this figure, the feature extractor F(.), a Convolutional Neural Network (CNN), takes in an input image I ∈ R^3 (a three-dimensional array) and outputs a one-dimensional feature map m ∈ R. The feature map is then passed to a projection head P(.), which is a two-layer Neural Network (NN) that outputs an embedding vector z ∈ R, also of one dimension. Here, we define P(.): m → z. This overall operation ultimately converts a plant image I into an embedding vector z, which encodes high-level features extracted from the plant, such as shapes, colors, or textures.

To find better representations for plant disease features, this research is inspired by SimCLR, which proposed a self-supervised learning algorithm for contrastive learning of visual representations, and assumes that embeddings from plants with similar diseases are similar [12]. Cosine similarity is used to measure the similarity between two embedding vectors, z_i and z_j, as shown in the following equation.

$$ S_{i,j} = \frac{z_i \cdot z_j}{\lVert z_i \rVert \, \lVert z_j \rVert} \tag{1} $$

Then, the NT-Xent loss function, which combines the softmax function (2) and the cross-entropy loss function (3), is applied to the model, which updates its parameters based on the measured loss of the two embeddings' similarity.

$$ \sigma(z_i) = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} \tag{2} $$

$$ -\sum_{i=1}^{n} y_i \cdot \log \hat{y}_i \tag{3} $$

Ultimately, this process maximizes the similarities between two images of plants with the same disease. With such processes, the model can learn to categorize similar diseases together regardless of the plant species, significantly reducing species-related bias while focusing on the disease itself to find better representations, which results in higher accuracy. The effectiveness of the proposed approach is further demonstrated in Section IV with numerical results comparing this method with other approaches.

B. Phase II: Supervised Learning

Fig. 5. Supervised Learning Process in Proposed Approach.
After the contrastive learning phase, the feature extractor is transferred to a new, supervised training stage. The feature extractor again takes in an input image I and returns a feature map m, identical to the earlier process. This time, however, the feature map is passed to a neural network, functioning as a plant disease classifier, that returns a vector of the probabilities of the plant having each disease. Here, a map PDR: m → ŷ is defined.

With this multi-step approach, the model performs notably better than one starting from scratch, as the model is already capable of finding adequate representations that maximize its ability to categorize plant images into disease categories. Consequently, the model is less biased.

IV. EXPERIMENTAL RESULTS

A. Dataset

The PlantDoc dataset, which was used in this study, is a comprehensive collection of 2,598 images across 13 plant species and 27 classes (17 disease classes and 10 healthy classes) [13]. The dataset is unique in its diversity, covering a wide range of plant species and disease types, which makes it an ideal choice for training a robust and versatile plant disease recognition model.

However, the PlantDoc dataset is not without its challenges. One significant issue is its imbalanced classes, where some classes are relatively overrepresented. This imbalance can potentially skew the model towards overrepresented classes, thereby affecting its performance on underrepresented classes. Figure 6 shows the distribution of each subclass in the PlantDoc dataset, clearly illustrating the class imbalance problem.

Fig. 6. Distribution of Plant Classes in PlantDoc Dataset. Adapted from [13]

Despite these challenges, the PlantDoc dataset's diversity and comprehensiveness make it a valuable resource for training a plant disease recognition model. The dataset's wide coverage of plant species and disease types allows the model to learn a broad range of features, thereby enhancing its generalization ability and applicability to a wide range of scenarios.

To evaluate the performance of the proposed model, 5-fold cross-validation was employed. In this setup, the dataset was randomly divided into five equal-sized subsets. The model was then trained and validated five times, each time using a different subset for validation and the remaining four subsets for training. This approach ensures that every data point is used for both training and validation, thereby providing a more robust and reliable evaluation of the model's performance.

B. Evaluation

1) Comparison with State-of-the-Art models:

TABLE I
ACCURACIES OF STATE-OF-THE-ART MODELS WITH AND WITHOUT TOMATO IN THE DATASET

Architecture       Accuracy (all plants)   Accuracy (all without tomatoes)
VGG19              0.7248                  0.6534
MobileNetV2        0.7506                  0.6723
Xception           0.7501                  0.6875
HRNet32            0.7639                  0.6925
DenseNet121        0.7707                  0.6930
Proposed Method    0.8342                  0.7964

It can clearly be seen that the models with the proposed contrastive learning phase consistently outperform models without it. The significant increase in performance can be attributed to the less biased CNN itself, which is able to generate meaningful representations of plant images by focusing only on the necessary parts of the plant rather than relying on plant types.

2) Architecture Replacement:

TABLE II
RESULTS AFTER ARCHITECTURE REPLACEMENT

Architecture                         Accuracy w/o proposed   Accuracy w/ proposed
VGG19 (Simonyan et al. 2015)         0.7248                  0.7647 (+3.9%)
MobileNetV2 (Sandler et al. 2018)    0.7506                  0.7858 (+3.5%)
Xception (Chollet 2016)              0.7501                  0.8014 (+5.1%)
HRNet32 (Wang et al. 2020)           0.7639                  0.8176 (+5.3%)
DenseNet121 (Huang et al. 2018)      0.7707                  0.8291 (+5.8%)
ResNet50 (He et al. 2015)            0.7731                  0.8342 (+6.1%)

To demonstrate its model-agnostic nature, the proposed architecture has been tested across many well-known, existing CNN models. Table II shows the performance of each CNN model with and without the proposed contrastive learning phase. Across all models, adding contrastive learning consistently increased performance. The improvement is larger for the stronger models, such as DenseNet121 and ResNet50.

3) t-SNE Evaluation:

Fig. 7. Applications of t-SNE on Plant Image Embedding (a): Before and (b): After Applying Transfer Learning

Figure 7 represents the result of applying t-distributed stochastic neighbor embedding (t-SNE) to the embeddings output by the feature extractor, essentially reducing each representation of a plant image in high-dimensional space to a lower, more comprehensible dimension. Figures 7(a) and 7(b) illustrate the vectors of each plant image without and with contrastive learning, respectively. It can be observed that with contrastive learning, embeddings of the same categories, represented by the same colors, are more clearly clustered.

4) Confusion Matrix:

Fig. 8. Confusion Matrix of the Proposed Method.

Figure 8 shows a confusion matrix for evaluating the proposed architecture's accuracy across different classes. Darker shades of blue represent accuracy values closer to 1.

5) Data Augmentation:

TABLE III
DATA AUGMENTATION EVALUATION

Model                  Accuracy
Vanilla Model          0.8342
Horizontal Flip        0.8531 (+1.89%)
Color Jitter           0.8543 (+2.01%)
Random Perspective     0.8114 (-2.28%)
Grayscale              0.8196 (-1.46%)
Gaussian Noise         0.8439 (+0.97%)
Brightness             0.8033 (-3.09%)
Proposed Method        0.8724 (+3.82%)

Various data augmentation techniques have also been tested with the proposed method. The proposed method noticeably outperforms each of the common augmentation methods.

C. Applications

The proposed plant disease recognition model, leveraging the power of contrastive learning and convolutional neural networks, holds substantial promise for real-world applications. Its ability to accurately identify a broad spectrum of plant diseases, irrespective of the plant species, makes it a versatile tool in the agricultural sector. One of the most prominent areas where this model can make a significant impact is smart farming.

Smart farming represents a paradigm shift in agriculture, integrating advanced technologies to enhance the quality and quantity of agricultural produce. This modern approach to farming employs a range of technologies, including Internet of Things (IoT) devices, drones, and machine learning algorithms, to optimize farming practices. The proposed plant disease recognition model fits into this technological landscape, offering a valuable tool for crop health monitoring.

In a smart farming setup, IoT devices equipped with cameras can be deployed across the farm to continuously monitor crop health. The images captured by these devices can be processed by our plant disease recognition model, providing real-time detection of any signs of disease. This immediate detection allows farmers to take swift action, such as applying targeted treatments to affected plants, thereby preventing disease spread and minimizing crop loss.

The framework's robustness against class imbalance further enhances its applicability in diverse farming environments. In a farm with a wide variety of crops, certain types of plants or diseases may be underrepresented in the training data. However, the model's use of contrastive learning helps it remain effective even in these scenarios, making it a broadly applicable tool for smart farming.

Moreover, its model-agnostic nature allows for its use on less robust hardware, allowing low-budget farmers to still take advantage of the algorithm.
In essence, the proposed plant disease recognition model can revolutionize smart farming practices. Providing farmers with accurate, real-time information about their crop health enables more efficient and sustainable farm management. This, in turn, may contribute to increased agricultural productivity and food security, underlining the model's potential to transform the future of agriculture.

V. CONCLUSION

This research proposes a novel plant disease recognition framework that leverages the capabilities of contrastive learning and Convolutional Neural Networks (CNNs). The model was designed to mitigate the inherent bias towards overrepresented classes prevalent in most plant image datasets, thus enhancing its ability to accurately classify a variety of plant diseases. To demonstrate its effectiveness, the model was trained and tested on the PlantDoc dataset. Despite significant class imbalances in the dataset, the model demonstrated excellent performance in recognizing and classifying plant diseases, superior to that of state-of-the-art methods. The study further established the general applicability of the proposed method by performing additional tests, including the removal of data for tomato, the most abundant species in plant disease datasets. The results of these tests highlighted the robustness of the model and its ability to adapt to various scenarios, reinforcing its suitability for real-world applications, like the smart farm.

VI. ACKNOWLEDGMENTS

I would like to thank Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton for their extensive work on SimCLR, which established a framework for contrastive learning of visual representations. This project took significant references from their work.

REFERENCES

[1] K. O’Shea and R. Nash, “An introduction to convolutional neural networks,” 2015.
[2] P. Tm, A. Pranathi, K. SaiAshritha, N. B. Chittaragi, and S. G. Koolagudi, “Tomato leaf disease detection using convolutional neural networks,” in 2018 Eleventh International Conference on Contemporary Computing (IC3), pp. 1–5, 2018.
[3] Y. Lecun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition,” Proceedings of the IEEE, vol. 86, no. 11, pp. 2278–2324, 1998.
[4] K. R., H. M., S. Anand, P. Mathikshara, A. Johnson, and M. R., “Attention embedded residual cnn for disease detection in tomato leaves,” Applied Soft Computing, vol. 86, p. 105933, Jan. 2020.
[5] N. K. E., K. M., P. P., A. R., and V. S., “Tomato leaf disease detection using convolutional neural network with data augmentation,” in 2020 5th International Conference on Communication and Electronics Systems (ICCES), pp. 1125–1132, 2020.
[6] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” 2015.
[7] S. Zhao, Y. Peng, J. Liu, and S. Wu, “Tomato leaf disease diagnosis based on improved convolution neural network by attention module,” Agriculture, vol. 11, p. 651, Jul. 2021.
[8] P. Bedi and P. Gole, “Plant disease detection using hybrid model based on convolutional autoencoder and convolutional neural network,” Artificial Intelligence in Agriculture, vol. 5, pp. 90–101, May 2021.
[9] D. Singh, N. Jain, P. Jain, P. Kayal, S. Kumawat, and N. Batra, “Plantdoc: A dataset for visual plant disease detection,” in Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, CoDS COMAD 2020, (New York, NY, USA), pp. 249–253, Association for Computing Machinery, 2020.
[10] X. Kang, B. Song, and F. Sun, “A deep similarity metric method based on incomplete data for traffic anomaly detection in iot,” Applied Sciences, vol. 9, p. 135, Jan. 2019.
[11] E. Tiu, “Understanding contrastive learning,” Jan. 2021.
[12] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton, “A simple framework for contrastive learning of visual representations,” 2020.
[13] D. Singh, N. Jain, P. Jain, P. Kayal, S. Kumawat, and N. Batra, “PlantDoc,” in Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, (New York, NY, USA), ACM, Jan. 2020.