Durio Zibethinus Ripeness Determination and Variety Identification Using Principal Component Analysis and Support Vector Machine
Jessie R. Balbin, Judy Ann I. Alday, Charmine O. Aquino, Mae Flor G. Quintana
jayebalbin@gmail.com, judyannialday@gmail.com, coaquino@mymail.mapua.edu.ph,
quintanamae14@gmail.com
Mapúa University, Manila

Abstract – This paper aims to determine the ripeness level and identify the variety of durian fruit through image processing using Principal Component Analysis (PCA) and Support Vector Machine (SVM) algorithms. The ripeness level is classified into three stages: unripe, ripe, and overripe. Numerous varieties of durian are available in the Philippines; this study specifically uses the Puyat, Arancillo, Cob, Davao Selection, and UPLB Gold varieties. The study used 100 durian fruits, 80% of which were used for data training and 20% for data testing. The study yields 95% overall accuracy and a 5% misclassification rate.

Index Terms – Durian, Principal Component Analysis, Support Vector Machine

I. INTRODUCTION

The Philippines is an agricultural country with 47% of its land allotted as agricultural land [1]. Although the country is known for its rich domestic agricultural products, this does not stop the importation of identical products, owing to Filipinos' natural affinity for food. Food, especially fruit, is best enjoyed when it is harvested and eaten at the right time. However, not everyone can tell whether a fruit is ready for consumption merely by looking at it. Durian is one fruit whose ripeness is difficult to determine by eye. The aroma, the thorns, and the sound produced when tapping the fruit are the characteristics considered in judging whether a durian is at its peak. The National Seed Industry Council recommends several varieties of durian in the Philippines, namely Alcon Fancy, Arancillo, GD 69, Duyaya, Lacson Uno, Lacson Dos, Puyat, Atabrine, Sulit, and UPLB Gold [2]. The varieties are similar in the color of the spine-covered husk and of the flesh, but each cultivar has a distinct taste and odor. The problem is that people are confused about which variety they want, because the cultivars are difficult to tell apart, especially when different varieties are displayed together in the market.

The mobile application Fruitilicious [3] can predict the ripeness of a fruit from its digital image, but it is restricted to apple, banana, and melon. A study on the maturity of durian was able to determine maturity through the transfer of linear vibration [4]. Another study used fractal analysis to determine the ripeness of durian [5]; fractal analysis was used to investigate the characteristics that distinguish mature from immature fruit, wherever different self-complexity is obvious. Another study used an appearance-based method to distinguish between two durian varieties, Chanee and Monthong [6].

Prior studies are limited to determining the maturity of durian and classifying two types of durian, Chanee and Monthong. In this research, the group extends that work by using image processing techniques for the non-destructive determination of the ripeness level of the fruit, together with the identification of its variety, using an image of the fruit as the input data. Durian cultivars have almost the same husk color and rough appearance, so it is difficult to tell the type of durian using the human eye alone. This work will benefit consumers who have a hard time identifying the ripeness of a durian fruit and the variety they are dealing with. In this manner, bruising the surface of the fruit by pinching it can be avoided.

This study aims to determine the ripeness of durian and identify its variety using Principal Component Analysis (PCA) and Support Vector Machine (SVM). Specifically, it aims to: (1) develop hardware for classifying and evaluating the maturity and variety of durian; (2) develop software that identifies the maturity and variety of durian using Python implementations of Principal Component Analysis and Support Vector Machine; (3) train and test the device on certain varieties of durian; and (4) validate the accuracy of the device with a durian expert.

This research will benefit not only consumers but sellers as well. The device will help identify the type of durian and determine its level of ripeness in a short period of time. This will keep consumers from getting wrong information about the fruit they buy, since different varieties of durian have distinct tastes and different flesh characteristics. Vendors will be helped with their inventory, knowing which durian should be disposed of first. The results can also contribute to the durian farming industry: people will be more interested in buying durian because it will no longer be difficult to choose fruit that suits their taste and consumption, which will also reduce wastage.

This study applies only to specific types of durian, namely the Puyat, Arancillo, Cob, Davao Selection, and UPLB Gold varieties. The proposed device is programmed to automatically assess the captured image of a durian. The assessment is based on a dataset that has been trained into the device and is limited to the said varieties of durian. Durian samples come from different markets located mainly in Los Baños, Laguna and Davao City. The
result regarding the ripeness level will only indicate whether the durian input is unripe, ripe, or overripe, and the number of days until maturity.

II. METHODOLOGY

A. Conceptual Framework

Figure 1. Conceptual Framework

Figure 1 shows the conceptual framework of this study. A durian fruit is placed inside the device. The input is its image, captured by a camera installed in the device. The microcontroller is the processor that interprets the data into an output. The processes involved include image segmentation using Canny Edge Detection, grayscale transformation, feature extraction through PCA, classification through SVM, data training, and data testing. The characteristics of durian, such as color, shape, and texture, are encoded in the microcontroller. The ripeness and variety of the durian are evaluated and classified based on this data. The LCD then displays the output indicating the variety and ripeness level of the durian.

B. Process Flow

Figure 2. Process Methodology of Durian Ripeness and Variety Identification

The methodology (see Figure 2) consists of phases that follow the objectives, each with subcategories to be considered. Phase 1 is hardware development, where the materials and system design are introduced. Phase 2 is software development, which contains the processes and algorithms. Phase 3 is durian data training and testing, and Phase 4 is durian data evaluation and results.

C. Hardware Development

Figure 3. Block Diagram of Durio X hardware system

Figure 3 shows the arrangement of parts used in the hardware system. The Raspberry Pi 3 Model B+ is the microcontroller used. A durian image is captured using the Raspberry Pi Camera V2 module, which is connected to its Camera Serial Interface port on the microcontroller through a ribbon cable. The microprocessor executes the image processing techniques on the captured images of durian. The Raspberry Pi is the main controller used in determining the ripeness level and variety of the durian fruit using image analysis. It is responsible for processing and analyzing the image captured by the Raspberry Pi camera through the programs configured into it. It is programmed so that the image undergoes different algorithms that result in the pre-assessment of the condition of the durian. The basis for identifying the maturity and variety of durian, such as color, shape, and texture, is saved on the SD card installed in the Raspberry Pi. A power bank is used as the power supply of the Raspberry Pi. The LCD, where the variety and ripeness level of the durian are displayed, is also connected to the microcontroller.

Figure 4. Design of the prototype

Figure 5. Actual image of the prototype (outside and inside)
Figure 4 shows the design of the prototype, and Figure 5 shows the actual prototype used in the study. The prototype is named Durio X. As seen in the figures, it is made of wood, with dimensions of 12 x 12 x 18 inches. The inclined portion was added to house the Raspberry Pi 3 Model B+, which is the microcontroller used, and the wires inside, and to make the 7-inch LCD touchscreen (placed on top) more interactive for the user. The inside of the box was painted black. The Raspberry Pi Camera V2 module and a square LED strip were embedded at the center of the top. The camera has a resolution of 8 megapixels and is a high-definition camera module compatible with all Raspberry Pi models. The Canny Edge Detection algorithm is used to isolate the image from its background while ensuring that the characteristics of the edges are not affected.

D. Software Development

i. Canny Edge Detection
An image segmentation process called Canny Edge Detection separates the durian fruit from the background. In this manner, the image is cropped, as if zoomed in, to obtain an optimized image for processing. In Canny Edge Detection, the image is first convolved with a Gaussian filter to smooth it and remove noise. Horizontal, vertical, and diagonal edges are detected using filters because the edges of an image may point in various directions. The edge gradient and direction are obtained using the formulas:

$$G = \sqrt{G_x^{2} + G_y^{2}} \tag{1}$$

$$\theta = \tan^{-1}\left(\frac{G_y}{G_x}\right) \tag{2}$$

G_x refers to the gradient in the horizontal direction while G_y refers to the gradient in the vertical direction. False responses to edge detection are removed by applying non-maximum suppression. A threshold is then applied to the non-maximum-suppressed image; it keeps the edge pixels with a high gradient value and filters out the edge pixels with a weak gradient value. Lastly, the edges are tracked by suppressing those that are weak and not connected to strong ones.

Figure 6. Original image (left); image resulting from Canny edge detection (right)

ii. Grayscale Transformation
The grayscale image is used as an input to the PCA+SVM algorithm for variety identification. In the grayscale transformation, the pure red, pure green, and pure blue components of the input image are acquired. Then the RGB values of each input pixel are obtained. The intensity of each pixel is computed as the weighted sum in equation (3):

$$I = 0.299R + 0.587G + 0.114B \tag{3}$$

Figure 7. The transformation of the original image into its pure red, pure green, and pure blue components and then into a grayscale image.

iii. RGB Values
The RGB values of the image are extracted using Python code. Example Python code is as follows:

    from PIL import Image
    im = Image.open("durian.jpg")  # hypothetical file name; the source omits this step
    img = im.load()
    x, y = 0, 0  # example pixel coordinates
    print(im.size)    # to get the width and height of the image
    print(img[x, y])  # to get the RGB (or RGBA) value of the pixel at (x, y)

Table 1 shows sample values of the average red, green, and blue values of the durian images for the unripe, ripe, and overripe stages of the Arancillo variety. These values are the basis for ripeness determination and for predicting the days until maturity.

Table 1. Sample RGB values for each cultivar when unripe, ripe, and overripe

iv. Principal Component Analysis
The ripeness level of durian was determined through Principal Component Analysis. PCA is a dimensionality reduction technique that simplifies the inputs, or features, used in the process. The component analyzed is the husk color of the durian. The average RGB values were used as the basis for deciding whether the durian is unripe, ripe, or overripe. The input image of a durian consists of 4000 features, which are reduced to 40 features. PCA is thus able to reduce a large data set to a small number of new variables [7].

v. Support Vector Machine
A technique that classifies an image by separating the data with a hyperplane is called the Support Vector Machine. The data used to classify each variety of durian are based on its texture and shape. According to [8], the development of SVMs ran opposite to that of neural networks (NNs): SVMs progressed from theory to applications and experiments, while NNs went from applications and experiments to theory. SVMs learn an unknown and nonlinear dependency (mapping, function) y = f(x), wherein the input x is
multi-dimensional and the output is y. One must perform distribution-free learning, since no information is available about the underlying joint probability functions. The only information available is a training data set:

$$D = \{(x_i, y_i) \in X \times Y\}, \quad i = 1, \ldots, l \tag{4}$$

where l is the number of training data pairs and is therefore equal to the size of the training data set. y_i is frequently denoted as d_i, where d refers to a desired value. For this reason, SVMs belong to the supervised learning techniques.

E. Data Training

Figure 8. Program Flowchart for Data Training

Figure 8 shows the program flowchart. The system starts, analyzes the parameters encoded in the Raspberry Pi, and then proceeds to image capturing using the installed camera module. Image segmentation using Canny Edge Detection is done to separate the durian from the background and obtain an optimized picture of the durian. The RGB values and the grayscale version of the segmented image are then acquired. The grayscale image is used as an input to the PCA+SVM algorithm to identify the classification of the durian. This result, along with the average RGB values of the durian, is run again through the PCA+SVM algorithm for ripeness determination and days until maturity. Lastly, the results are displayed on the LCD.

Profiling of the durian fruits is done first to secure the data needed for data testing. There are 16 samples of durian for each variety. Every day, pictures of the five cultivars of durian are taken from every angle, which yields about a thousand images. This is done to observe the changes in husk color from the unripe to the overripe condition.

F. Data Testing

Figure 9. Program Flowchart for Data Testing

Figure 9 shows the flowchart for data testing. The system starts and then proceeds to image capturing using the installed camera module. Image segmentation using Canny Edge Detection is done to separate the durian from the background and obtain an optimized picture of the durian. The RGB values and the grayscale version of the cropped image are then acquired. The grayscale image is used as an input to the PCA+SVM algorithm to identify the classification of the durian. If the image is a durian, the system proceeds to classification; otherwise, it goes back to image capturing. If the durian can be classified, the resulting variety is displayed; otherwise, the system also returns to image capturing. If the device cannot detect the image or the object after three trials, a prompt message is displayed. The classification result, along with the average RGB values of the durian, is run again through the PCA+SVM algorithm for ripeness determination and days until maturity. The PCA+SVM algorithm compares this data with the database gathered in the device. Lastly, the results are displayed on the LCD.

G. System Feedback
Figure 10 shows the graphical user interface with the output of Durio X. The process begins once the start camera button is pressed. The segmented image of the durian and
the result are displayed after pressing the capture and predict button. The load button loads pictures from the system, which are then classified through the predict button.

Figure 10. Graphical User Interface with output of Durio X

A prompt message appears for undefined objects and for varieties of durian other than Arancillo, Cob, Davao Selection, Puyat, and UPLB Gold. The prompt message is displayed after three failed trials.

III. RESULTS AND DISCUSSIONS

The system, Durio X, was trained with 80 durian fruits from unripe to overripe (16 fruits for each cultivar) and then tested with 20 durian fruits of no particular variety. The images used for testing were assessed by a durian expert to identify their ripeness level and classify their variety.

A. Data Training
The RGB values of the durian images used in data training were extracted. Table 2 shows sample RGB values extracted for Arancillo.

Table 2. Sample RGB values for Arancillo when unripe, ripe, and overripe.

B. Data Testing
The prediction results of Durio X for the 20 random samples of durian fruit are shown in Table 3. This table reflects the data for the uncontrolled, or random, testing. T and F indicate correct and wrong predictions of the ripeness level and variety of the durian, respectively.
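As a minimal sketch of how the T and F marks described above could be derived, the Python snippet below compares a prediction with the expert's assessment for one test fruit. The function names, the dictionary keys, and the sample values are hypothetical and are not taken from Table 3 or from the authors' code.

```python
def mark(predicted: str, expert: str) -> str:
    """Return 'T' when the prediction matches the expert label, else 'F'."""
    return "T" if predicted.strip().lower() == expert.strip().lower() else "F"


def mark_sample(predicted: dict, expert: dict) -> dict:
    """Mark ripeness level and variety separately for one test sample."""
    return {key: mark(predicted[key], expert[key]) for key in ("ripeness", "variety")}


# Hypothetical test sample for illustration only.
prediction = {"ripeness": "Ripe", "variety": "Puyat"}
expert = {"ripeness": "ripe", "variety": "Arancillo"}

marks = mark_sample(prediction, expert)
print(marks)  # {'ripeness': 'T', 'variety': 'F'}
```

Marking the two tasks separately matches the way the results are reported, with one accuracy figure for ripeness level and one for variety.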
Table 3. Data Testing

C. Validation of Results
The prototype was able to identify the ripeness level and classify the variety of a durian using the PCA and SVM algorithms. However, some errors occur because the thorns of the different varieties are almost the same size. Table 4 shows the comparison between the pre-assessment of Durio X and the diagnosis of the durian expert on the ripeness level and variety of the durian.

Table 4. Validation of Results for Variety Identification and Determination of Ripeness Level

D. Statistical Analysis
A statistical analysis is necessary to determine the differences between the pre-assessment done by the prototype and the diagnosis done by a durian expert. A confusion matrix contains the overall true recognition, the overall false recognition, and the recognition rate for each classification or category. It describes the performance of a classification model on a set of test data for which the true values are known.
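The confusion matrix described here can be sketched in a few lines of Python. The labels below are hypothetical: only the totals (20 test fruits, 19 of them matching the expert) come from the study, while the split across ripeness classes and the position of the single error are assumptions made for illustration.

```python
from collections import Counter


def confusion_matrix(expert_labels, predicted_labels, classes):
    """Rows are the expert (true) classes, columns are the predicted classes."""
    counts = Counter(zip(expert_labels, predicted_labels))
    return [[counts[(actual, predicted)] for predicted in classes] for actual in classes]


def overall_accuracy(matrix):
    """Percentage of observations on the diagonal (correct predictions)."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


classes = ["unripe", "ripe", "overripe"]

# Hypothetical labels for 20 test fruits; the class split is an assumption.
expert = ["unripe"] * 7 + ["ripe"] * 7 + ["overripe"] * 6
predicted = list(expert)
predicted[7] = "overripe"  # one illustrative misclassification of a ripe fruit

matrix = confusion_matrix(expert, predicted, classes)
accuracy = overall_accuracy(matrix)
print(matrix)                     # [[7, 0, 0], [0, 6, 1], [0, 0, 6]]
print(accuracy, 100.0 - accuracy)  # 95.0 5.0
```

The misclassification rate is taken as the complement of the overall accuracy, so 19 correct predictions out of 20 give 95% and 5%.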
$$\text{Overall Accuracy} = \frac{\sum \text{True Positives}}{\text{Total Number of Observations}} \times 100 \tag{5}$$

                  Overall Accuracy    Misclassification Rate
Ripeness Level    95%                 5%
Variety           95%                 5%

Out of 20 uncontrolled samples of durian, 19 were correctly assessed in both ripeness level determination and variety identification, based on the diagnosis of the durian expert. This yields 95% overall accuracy in system testing and a 5% misclassification rate for both.

IV. CONCLUSION
The study created a system called Durio X to identify the ripeness level and classify the variety of a durian, limited to the Arancillo, Puyat, Cob, Davao Selection, and UPLB Gold varieties. With the use of the Principal Component Analysis and Support Vector Machine algorithms, the main objective of the study was attained. The study yields a 95% overall accuracy for data testing and evaluation.

V. RECOMMENDATIONS
The group recommends that those who pursue further study of this work expand the application of the proposed prototype to other fruits. To increase the accuracy of the prototype, more time should be invested and more data gathered. Since the prototype uses machine learning algorithms, it is important to gather more data by profiling the fruit thoroughly. Future researchers can also experiment with different algorithms to determine the ripeness and identify the variety accurately. If future researchers have extra resources, a camera with higher specifications and better quality can be used.

VI. REFERENCES
[1] Pinas - Agriculture [Online]. Available: https://pinas.dlsu.edu.ph/gov/agriculture.html [Accessed: May 26, 2018]
[2] Bajera, Durian Tree, That Crop with the Fruit that "Smells Like Hell But Tastes Like Heaven", 2012
[3] S. Iswari, Wella, Ranny, "Fruitilicious: Mobile Application for Fruit Ripeness Determination based on Fruit Image", 2017 10th International Conference on Human System Interactions (HSI), 2017
[4] S. Kongrattanaprasert, S. Arunrungrusmi, B. Pungsiri, K. Chamnongthai, "Nondestructive Maturity Determination of Durian by Force Vibration", International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 2002
[5] M. Phothisonothai, "Nondestructive Maturity Determination of Durian based on Fractal Features", 10th International Conference on Information Science, Signal Processing and their Applications (ISSPA 2010), 2010
[6] F. Pensiri, P. Visusak, "Durian Cultivar Recognition Using Discriminant Function", 2017 2nd International Conference on Information Technology (INCIT), 2017
[7] M. Richardson, Principal Component Analysis, 2009
[8] O. Fenwa, F. Ajala, and A. Adigun, "Classification of Cancer of the Lungs Using SVM and ANN", International Journal of Computers and Technology, Vol. 15, No. 1, 2015