IET Smart Cities

Research Article

Smart agriculture: real-time classification of green coffee beans by using a convolutional neural network

eISSN 2631-7680
Received on 31st July 2020
Revised on 2nd September 2020
Accepted on 3rd September 2020
E-First on 13th October 2020
doi: 10.1049/iet-smc.2020.0068
www.ietdl.org

Nen-Fu Huang1 , Dong-Lin Chou1, Chia-An Lee1, Feng-Ping Wu1, An-Chi Chuang1, Yi-Hsien Chen1,
Yin-Chun Tsai1
1Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan
E-mail: nfhuang@cs.nthu.edu.tw

Abstract: Coffee is an important economic crop and one of the most popular beverages worldwide. The rise of speciality coffee has raised people's standards regarding coffee quality. However, green coffee beans are often mixed with impurities and unpleasant beans. This study therefore aimed to solve the problem of the time-consuming and labour-intensive manual selection of coffee beans for speciality coffee products. The second objective of the authors' study was to develop an automatic coffee bean picking system. They first used image processing and data augmentation technologies to prepare the data, then used a deep convolutional neural network to analyse the image information, and finally connected the trained model to an IP camera for recognition. Good and bad beans were successfully classified: the false-positive rate was 0.1007, and the overall coffee bean recognition rate was 93%.

1 Introduction

Coffee is an important economic crop and one of the most popular beverages in human society [1]. It is cultivated in over 70 countries, primarily in the equatorial regions of America, Southeast Asia, the Indian subcontinent, and Africa. Green coffee beans are among the most traded agricultural products in the world. With the rapid increase in speciality coffee retailers and cafes in the 1990s, speciality coffee became one of the fastest-growing markets in the foodservice industry, and many countries have developed their own speciality coffee associations. According to the Specialty Coffee Association of America (SCAA), a cup of speciality coffee is not defined simply as a cup of coffee that has been brewed and served to consumers; instead, the definition emphasises the whole process of producing that cup. However, green coffee beans are often mixed with impurities and unpleasant beans [2]. If these impurities and defective beans are not manually picked out before roasting, the overall coffee quality and flavour are affected.

At present, artificial intelligence applications can be divided into three categories: speech recognition [3], image recognition, and natural language processing [4]. Image recognition technology can be applied in areas such as smart cities, medical care, and agriculture. In smart cities [5], traffic flow analysis can be used to relieve traffic congestion and reduce traffic accidents. Medical imaging [6] through MRI or computed tomography can be used for the early detection and treatment of disease. Agricultural imaging [7] can be used to identify crop pests, which reduces crop losses during cultivation and increases yields. It is therefore hoped that image recognition technology can be applied to identifying green coffee beans, thereby improving the quality and flavour of coffee.

The definitions of defect beans [8] provided by the SCAA are presented in Tables 1 and 2. There are two types of defects: primary defects and secondary defects. Through these definitions, we can determine which green coffee beans are defective. If bad beans are roasted together with good beans, the coffee does not taste like speciality coffee. To solve this problem, green beans must be selected manually; however, this process involves considerable labour and time costs.

Table 1 Definition of a primary defect bean according to the SCAA
Primary defect | Number of occurrences equal to one full defect
full black | 1
full sour | 1
pod/cherry | 1
large stones | 2
medium stones | 5
large sticks | 2
medium sticks | 5

Plenty of methods have already been proposed to classify different species of coffee beans, including the artificial neural network (ANN), K-nearest neighbour (KNN) [9], support vector machine (SVM) [10], and near-infrared (NIR) spectroscopy [11]. Most of these methods achieve >90% accuracy in classifying different species of coffee beans, but few have been proposed to detect defects on unroasted green coffee beans. Before the application of deep learning, the area and circumference [12] of a coffee bean were considered important criteria for judging, giving about 78.5% accuracy when inspecting coffee beans for defects. Image processing and thresholding techniques [13] can also help to inspect defects on green coffee beans, with about 83% accuracy. Currently, many vision sorting systems [14] are available on the market. These systems can distinguish good and bad varieties of items such as peanuts, seeds, rice, and green coffee beans, mainly on the basis of colour. However, a range of colours must be set for these systems, and they require a robotic arm to pick out the items, which is slow and inefficient. Accordingly, we propose a method that uses deep learning technology to determine the standard of good and bad items.

We preprocessed images of green coffee beans obtained through image processing technology by using the convolutional neural network (CNN), a popular deep learning technology. A CNN is good at extracting colour and shape features from images, so we can easily obtain the features of good and bad bean images, such as partial black or broken beans. We trained a dedicated coffee bean prediction model to quickly distinguish which raw beans were good and which were bad. By using this method, the considerable time required for the manual selection of coffee beans can be reduced and the development of speciality coffee can be promoted. An automatic green coffee bean identification system

IET Smart Cities, 2020, Vol. 2 Iss. 4, pp. 167-172 167


This is an open access article published by the IET under the Creative Commons Attribution License
(http://creativecommons.org/licenses/by/3.0/)
26317680, 2020, 4, Downloaded from https://ietresearch.onlinelibrary.wiley.com/doi/10.1049/iet-smc.2020.0068 by INASP/HINARI - PAKISTAN, Wiley Online Library on [11/05/2024]. See the Terms and Conditions (https://onlinelibrary.wiley.com/terms-and-conditions) on Wiley Online Library for rules of use; OA articles are governed by the applicable Creative Commons License
Table 2 Definition of a secondary defect bean according to the SCAA
Secondary defect | Number of occurrences equal to one full defect
parchment | 2–3
hull/husk | 2–3
broken/chipped | 5
insect damage | 2–5
partial black | 2–3
partial sour | 2–3
floater | 5
shell | 5
small stones | 1
small sticks | 1
water damage | 2–5

was developed in this study to enhance the speed and accuracy of green coffee bean identification through artificial intelligence. The experimental results indicate that the developed tool can serve as a reference for related studies.

Fig. 1 Flowchart of data preprocessing

Fig. 2 Green coffee beans we bought from a shop

Table 3 List of hardware
CPU: Intel(R) Core(TM) i5-7640X | GPU: NVIDIA GeForce GTX 1080 Ti | Memory: 48 GB

Fig. 3 Vibration bucket makes a pile of coffee beans line up in a lane on the conveyor
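For illustration only, the SCAA equivalences in Tables 1 and 2 can be encoded as a small lookup that converts raw defect counts into full-defect equivalents. The dictionary values mirror the tables above; the function name and the choice to store a range such as 2–3 as its lower bound are our own assumptions, not SCAA rules.

```python
# Full-defect equivalence per the SCAA tables: N occurrences of a given
# defect count as one full defect. Ranges such as 2-3 are stored here as
# their lower bound (our conservative choice, not an SCAA rule).
PRIMARY = {
    "full black": 1, "full sour": 1, "pod/cherry": 1,
    "large stones": 2, "medium stones": 5,
    "large sticks": 2, "medium sticks": 5,
}
SECONDARY = {
    "parchment": 2, "hull/husk": 2, "broken/chipped": 5,
    "insect damage": 2, "partial black": 2, "partial sour": 2,
    "floater": 5, "shell": 5, "small stones": 1, "small sticks": 1,
    "water damage": 2,
}

def full_defect_equivalents(counts):
    """Convert raw per-defect occurrence counts to full-defect equivalents."""
    table = {**PRIMARY, **SECONDARY}
    return sum(n / table[defect] for defect, n in counts.items())

# Two full-black beans (2 full defects) plus five floaters (1 full defect)
print(full_defect_equivalents({"full black": 2, "floater": 5}))  # 3.0
```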

2 System architecture and implementation

We established the process flow displayed in Fig. 1. We first collected the data, preprocessed it, augmented it, and finally resized the images into a data format acceptable to the neural network. This entire process automatically generated the required data sets. The central server specification and each step in the process flow are explained in the following sections.
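As a minimal sketch of this flow (collect → preprocess → augment → resize), the snippet below uses NumPy arrays as stand-ins for real camera frames; every helper name is ours, only the final 180 × 180 size comes from the paper (Section 2.5), and the flips merely hint at the fuller rotation-and-flip scheme of Section 2.4.

```python
import numpy as np

def collect():
    # Stand-in for the conveyor/IP-camera capture step: two fake bean crops.
    rng = np.random.default_rng(0)
    return [rng.integers(0, 256, (h, w, 3), dtype=np.uint8)
            for h, w in [(120, 90), (150, 140)]]

def preprocess(img):
    # Greyscale conversion; segmentation and background removal omitted here.
    return img.mean(axis=2).astype(np.uint8)

def augment(img):
    # Flips only, as a stand-in for the rotation-and-flip augmentation.
    return [img, np.fliplr(img), np.flipud(img), np.flipud(np.fliplr(img))]

def resize_square(img, size=180):
    # Pad onto a black square, then nearest-neighbour resample to size x size.
    side = max(img.shape)
    canvas = np.zeros((side, side), dtype=np.uint8)
    y, x = (side - img.shape[0]) // 2, (side - img.shape[1]) // 2
    canvas[y:y + img.shape[0], x:x + img.shape[1]] = img
    idx = np.arange(size) * side // size
    return canvas[np.ix_(idx, idx)]

dataset = [resize_square(a) for img in collect() for a in augment(preprocess(img))]
print(len(dataset), dataset[0].shape)  # 8 (180, 180)
```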

2.1 Central server

(i) Operating system: In our experiment, we used Ubuntu 14.04 as the operating system of our central server.
(ii) Hardware: In recent years, central processing unit (CPU) performance and graphics processing unit (GPU) technology have improved tremendously. The training time of a learning module can be greatly reduced through GPU parallel computing. We used a GPU developed by Nvidia to accelerate training. The hardware list of our central server is shown in Table 3. Since we need to read and normalise all the coffee bean images before the GPU training process, we need enough memory to accommodate the images and a fast CPU.

2.2 Data collection

Since insufficient pictures of coffee beans were available on the Internet, we bought green coffee beans from a coffee shop; Fig. 2 shows a few of them. The shop helped us divide the green coffee beans into good and bad beans so that we could directly take pictures.

In the past, collecting data sets was a difficult job when building a neural network, sometimes taking more time than the training itself. We therefore devised an automatic coffee bean data collection mechanism built around a conveyor. After the coffee beans are dumped into a vibrating bucket, the bucket lines them up in a lane and sends them onto the conveyor consecutively by vibrating. The vibrating bucket is shown in Fig. 3.

A high-resolution IP camera is set above the conveyor to take a photo as the coffee beans pass through, and it is connected to a desktop computer. We fine-tuned a pre-trained YOLOv3 object detection model on coffee beans. When an object passes through the camera's field of view, the model recognises it and commands the camera to take a photo. With this mechanism, we can collect data sets automatically instead of taking photos manually. The environment is shown in Fig. 4.

Fig. 4 Image of the environment setting

We previously used a webcam to take the photos. However, we found that when the coffee beans moved over a certain speed on

the conveyor, motion blur appeared in the photos, which led to a severe accuracy drop in the system. Therefore, we switched from an ordinary web camera to a high-resolution IP camera, with which we can manually adjust the exposure time, white balance, and other parameters. The camera is connected to the computing unit through an RJ45 Ethernet port. Unlike the original web camera, the new IP camera's exposure time can be set as short as 1/10,000 s. With a short exposure time and a sufficient light source, we managed to photograph objects moving at high speed on the conveyor without motion blur. The values of the camera parameters we set are presented in Table 4.

These data sets are then cropped and resized to a reasonable resolution to become valid data sets for our training. The images shown below are all in relatively low resolution, because training the model with high-resolution photos requires a more powerful GPU and much more time. We decided to resize the photos of the coffee beans while still retaining most of the features of their appearance.

We photographed 1000 good beans and 1000 bad beans, as displayed in Fig. 5.

Table 4 Values of the IP camera parameters
Camera parameter | Value
brightness | 255
contrast | 225
colour tone | 134
saturation | 127
sharpness | 25
gamma | 200
backlight control | 69
exposure time | 1/500
resolution | 1080p

Fig. 5 Image of raw data taken with the IP camera

2.3 Data preprocessing

(i) Image segmentation: To reduce the burden and time of cutting images manually, we cut out the coffee beans in the raw data through program automation: we converted the raw data into greyscale and assigned threshold values. Finally, we obtained the precise positions of the green coffee beans [15].
(ii) Image background removal: We endeavoured to reduce the interference of the background on the training model. Although the background seemed to be black, its RGB values still lay between 0 and 255 and were not absolutely black, so we still needed to remove the background around the green coffee beans. We used colour detection methods to remove the background [16].

(a) Colour detection: The advantage of colour detection is that a mask can be generated from two bounds, an upper bound and a lower bound. By using the mask, we can remove the values within that range from the image. For this reason, we used an ultra-black material for our background, so that it could be easily removed from the image. Fig. 6 illustrates the bean images obtained after the background was removed: the left image displays a good bean, whereas the right image displays a bad bean.

Fig. 6 Images of coffee beans after the background was removed

2.4 Data augmentation

At the beginning of the data collection process, the data on good and bad beans were insufficient; the numbers of good and bad beans were each set at 1000. However, insufficient training data could cause overfitting of the CNN model. Data augmentation [17] provides methods such as flipping, offsetting, cutting, enlarging, and shrinking images to enhance the original data set. We did not wish to change the size, shape, or colour of the original images; therefore, we used only rotation and flip operations to enhance the collected data set.

As depicted in Fig. 7, we enhanced the data set by rotating each coffee bean about its centre in steps of 40° and by collecting the flipped images. We originally rotated the images in steps of 30°; however, the experiment indicated that at 180° the flipped and rotated pictures produced duplicate images. Finally, by rotating the images of the 1000 good and 1000 bad beans we obtained a data set nine times larger than the original, and through flipping we obtained a further factor of four. In total, our training and testing data sets were 36 times larger than the original data set.

Fig. 7 Images obtained by rotating the centre of each coffee bean by 40°

2.5 Image resizing

After the rotation and flip enhancement, we resized the data set to images of fixed length and width for use as the training material of the CNN model. In data preprocessing, we resized each image to 180 pixels in width and length. The details of the method are as follows:

(i) Resizing: The maximum of the length and width of the current image was taken, and a square with a black background of that size was created. The original image was placed on the black background, and the result was finally resized to 180 × 180 pixels.

This method requires knowledge of only the length and width of the current picture, which makes it suitable for video stream preprocessing: the length and width of all coffee beans cannot be obtained in advance when the video is streamed. Therefore, this resizing method was suitable for our implementation.
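The background-removal mask of Section 2.3 and the augmentation bookkeeping of Section 2.4 can be sketched as follows; the bounds and helper names are our own (the paper does not give its exact thresholds), and NumPy comparisons stand in for OpenCV's `inRange`-style masking.

```python
import numpy as np
from itertools import product

def remove_background(img, lower, upper):
    # Keep pixels whose channels all fall inside [lower, upper]; zero the
    # rest - the effect of an upper/lower-bound colour mask (Section 2.3).
    mask = np.all((img >= lower) & (img <= upper), axis=-1)
    out = img.copy()
    out[~mask] = 0
    return out

# A 1x2 "image": one bean-coloured pixel and one near-black background pixel.
img = np.array([[[90, 140, 120], [10, 12, 8]]], dtype=np.uint8)
clean = remove_background(img, lower=(30, 30, 30), upper=(255, 255, 255))
print(clean[0, 1])  # [0 0 0] - background suppressed, bean pixel untouched

# Augmentation bookkeeping (Section 2.4): 9 rotations in 40-degree steps
# combined with 4 flip states give the 36x enlargement reported; a
# 30-degree step would need 12 rotations and produce duplicates at 180
# degrees when combined with flipping.
variants = list(product(range(0, 360, 40), ("none", "lr", "ud", "both")))
print(len(variants))  # 36
```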

Table 5 Architecture of the CNN model
Layer | Filter | Stride | Output map size | Activation
convolution 1 | 5×5 | 1 | 180 × 180 × 32 | ReLU
pooling 1 | 4×4 | 4 | 45 × 45 × 32 | —
convolution 2 | 5×5 | 1 | 45 × 45 × 64 | ReLU
pooling 2 | 3×3 | 3 | 15 × 15 × 64 | —
fully connected 1 | — | — | 1024 | ReLU
fully connected 2 | — | — | 2 | Softmax

Table 6 Number of training and testing data sets
Coffee type | Training data | Testing data
good bean | 29,000 | 7000
bad bean | 29,000 | 7000

Fig. 8 Flow of Google object detection

Fig. 9 Result of using Google object detection API in green coffee bean identification

2.6 CNN model architecture

In this implementation, we used CNN models for greyscale coffee bean image training (Table 5). By compressing the image's third dimension into one dimension, we obtain a greyscale image, which enables easy detection of the shape of the green coffee beans and their dark colours.

In the CNN model architecture, we mainly used the rectified linear unit (ReLU) activation function for the convolution layers. The ReLU applies the non-saturating activation function [18] indicated in (1). It effectively removes negative values from an activation map by setting them to zero, and it increases the non-linear properties of the decision function and of the overall network without affecting the receptive fields of the convolution layer. Other functions, such as the saturating hyperbolic tangent (2) and the sigmoid function [19] (3), are also used to increase non-linearity. The ReLU is often preferred because it trains the neural network several times faster than the other functions without a significant decrease in generalisation accuracy

f(x) = max(0, x) (1)

f(x) = tanh(x) (2)

σ(x) = (1 + e^−x)^−1 (3)

2.7 Recognition mechanism

Streaming with a high resolution and frame rate is a heavy workload for the recognition model. As a result, we decided to apply the fine-tuned YOLOv3 object detection model to our system. When an object on the conveyor passes through the camera's field of view, the object detection model commands the camera to take a photo. The coffee bean classification model then judges the bean on the basis of this photo and instructs the air gun if the bean is judged to be bad.

3 Experiments and results

In the experiment, we prepared 72,000 images through data augmentation: 36,000 images of good beans and 36,000 images of bad beans. In total, 7000 images each of good and bad beans were selected randomly from the augmented data as testing data; the remaining data were used as training data. The numbers of training and testing images are shown in Table 6.

3.1 Google object detection Application Programming Interface (API)

The Google object detection API provides many training models, such as faster R-CNN, the single-shot multibox detector, and R-FCN. Since faster R-CNN has higher accuracy than the other models, we chose it as our model. We used the faster R-CNN inception v2 COCO model provided in the API as a pre-trained model; it gives a fair trade-off between accuracy and speed in GPU-accelerated environments. Before training the model, we needed to prepare the data set and labels, first converting the images and labels into the TFRecord format (see Fig. 8). TFRecord is a binary file format recommended by Google that can hold information in any format, and TensorFlow provides a rich API for reading and writing TFRecord files.

After preparing the data set and labels, we trained the model and applied it to identification. Fig. 9 shows a result identified with the trained model: the picture is full of various boxes, and in such identification it is impossible to determine the correctness and quality of the coffee beans accurately. Many reports of this method on the Internet describe the same problem and suggest increasing the number of training steps; however, the problem remained after we tried this suggestion.

Finally, we found that most object detection applications target large objects, such as humans, cars, and animals, which have obvious feature differences. By contrast, we have to look very closely when picking green coffee beans in order to observe them in sufficient detail. Coffee bean identification is therefore difficult to accomplish with the Google object detection API, although the API is convenient and easy to use.

3.2 Object detection based on the CNN

In the designed CNN model architecture, the preprocessed data were used as the training data. Originally, we used the RGB colour space for data preprocessing before training. However, when the coffee beans were of different colours, correctly identifying the good and bad beans became difficult. Therefore, we preprocessed the data into greyscale images, whose shapes could be detected easily. The training results are illustrated in Figs. 10 and 11. Fig. 10 displays the line chart of the training and testing accuracies over ten epochs; both increased steadily, and the final testing accuracy was ∼94.68%. Fig. 11 displays the line chart of the training and testing losses; both gradually decreased, and in the late stages of training the training loss was smaller than the testing loss. The final testing loss was ∼0.14

FPR = FP / (FP + TN) (4)

Fig. 10 depicts the training result, whose confusion matrix is presented in Table 7. In the testing data, 435 of the 7000 good beans were misidentified as bad beans and 309 of the 7000 bad

Fig. 10 Training and testing accuracy

Fig. 11 Loss of training and testing

Table 7 Confusion matrix of the result
(rows: actual value; columns: prediction outcome)
| p | n | Total
actual p′ | 6565 | 435 | P′
actual n′ | 309 | 6691 | N′
total | P | N |

Fig. 12 Results in a screenshot of the video stream

Table 8 Comparison of epochs
Epoch | 10 | 15
true positive | 6791 | 6565
false negative | 209 | 435
true negative | 6295 | 6691
false positive | 705 | 309
false positive rate | 0.1007 | 0.0441
accuracy, % | 93.47 | 94.63

Fig. 13 Prototype of our system

beans were misidentified as good beans. The fewer the bad beans, the better the quality of the speciality coffee. The measure of interest in this study was the false positive rate (FPR) [20], defined in (4). In this formula, the false positives (FP) are the bad beans predicted as good beans, and the true negatives (TN) are the bad beans predicted as bad beans; the index thus indicates the proportion of bad beans misjudged as good beans. The lower the FPR, the closer we were to the standard of speciality coffee. According to the results of our experiments, the FPR was 0.0441 when the testing data included 7000 images each of good and bad coffee beans.

Finally, we set up an IP camera with the same parameters as those used in data collection and connected it to our training model. We detected each frame of the video stream and cut out the coffee bean images by using image segmentation. Then, through the data preprocessing method described in Section 2, the captured bean images were imported into our training model for identification.

Fig. 12 displays a screenshot of the output at the time of recognition. The green frames in the figure were predicted as good beans, whereas the red frames were predicted as bad beans. Users can assess the quality of coffee beans accurately and efficiently through this instant identification of the images. In our current test, we could identify a green coffee bean in 1 ms when the video stream reached 24 frames per second [21]. The quality of the green coffee beans could be judged smoothly and accurately.

The previous training ran for only ten epochs. To ensure that the training had reached its bottleneck and to improve the overall accuracy, we added five epochs to the original training. The comparison of the results is shown in Table 8: the true positive and false positive counts decreased, while the false negative and true negative counts increased. A lower false positive rate indicates fewer misjudgements; we reduced the chance of misjudging bad coffee beans and improved the accuracy of the overall identification.

4 Conclusions and future work

In this study, we used greyscale images for training; therefore, in the identification process, we also converted the images into greyscale. However, greyscale images are vulnerable to ambient light because only one-dimensional image information is retained.

The CNN model was used to classify good and bad beans in our experiment. The overall coffee bean identification accuracy was ∼94.63% and the FPR was 0.0441. By connecting the coffee bean identification model to an IP camera, we could instantly distinguish the good and bad beans among green coffee beans otherwise selected by the human eye. By using object detection and image recognition technology, we can reduce the time and labour costs involved and help develop the speciality coffee industry. The prototype of our system is shown in Fig. 13.

The system can currently handle only one side of the coffee beans; when a defect appears on the back side of a bean, it cannot be screened out in a single pass. Since defects on the back side of the coffee beans can also be detected as bad beans by our current inspection system once that side faces the camera, we have the coffee beans labelled as good repeat the inspection process; most of the defects on the back side are screened out after a few tries. Moreover, we will need to train a new inspection model that can detect defects on both sides of the beans, and to achieve that we need to collect more data containing the back sides of the beans. After developing the solution mentioned above, the system will be capable of screening out defects on both
sides of the coffee beans, with each side inspected separately.

The model currently runs on a desktop personal computer. However, considering cost and convenience, we plan to move the whole process to an edge computing device such as an NVIDIA Jetson Nano or a Raspberry Pi 4 with an Intel Neural Compute Stick. After replacing the computing unit, the device will become easier to transport and less costly for users.

Artificial intelligence is feasible for the image recognition of green coffee beans, and it can provide accurate and efficient results. Furthermore, good and bad beans can be accurately distinguished by using a camera, which solves the problem of spending considerable time and effort on selection. In the future, we hope to connect a robotic machine to select and remove the bad beans. The blueprint of the architecture of our system is illustrated in Fig. 14.

In Fig. 14, we combine our system architecture with the architecture of a colour sorter machine. We placed the background on the track and instantly identified the beans through the original webcam. The brown objects are good coffee beans, and the yellow ones are bad coffee beans. We replaced the original CCD or infrared technology with a deep learning model and used the colour sorter machine's selection method. Then, we used an air gun to separate the coffee beans. Finally, the coffee beans were separated into two containers.

Fig. 14 Architecture of identification

5 Acknowledgment

This study was supported by the Ministry of Science and Technology (MOST) of Taiwan under grant MOST 107-2218-E-007-004.

6 References

[1] Oder, T.: 'How coffee changed the world'. Available at https://www.mnn.com/food/beverages/stories/how-coffee-changed-the-world
[2] Pinto, C., Furukawa, J., Fukai, H., et al.: 'Classification of green coffee bean images based on defect types using convolutional neural network (CNN)'. 2017 Int. Conf. on Advanced Informatics, Concepts, Theory, and Applications (ICAICTA), Bali, Indonesia, August 2017, pp. 1–5
[3] Gavat, I., Militaru, D.: 'Deep learning in acoustic modeling for automatic speech recognition and understanding – an overview'. 2015 Int. Conf. on Speech Technology and Human-Computer Dialogue (SpeD), Bucharest, Romania, October 2015, pp. 1–8
[4] Young, T., Hazarika, D., Poria, S., et al.: 'Recent trends in deep learning based natural language processing [review article]', IEEE Comput. Intell. Mag., 2018, 13, pp. 55–75
[5] Fadlullah, Z.M., Tang, F., Mao, B., et al.: 'State-of-the-art deep learning: evolving machine intelligence toward tomorrow's intelligent network traffic control systems', IEEE Commun. Surv. Tutorials, 2017, 19, pp. 2432–2455
[6] Ker, J., Wang, L., Rao, J., et al.: 'Deep learning applications in medical image analysis', IEEE Access, 2018, 6, pp. 9375–9389
[7] Kamilaris, A., Prenafeta-Boldú, F.X.: 'Deep learning in agriculture: a survey', Comput. Electron. Agric., 2018, 147, pp. 70–90
[8] 'Specialty Coffee Association of America'. Available at http://www.coffeeresearch.org/coffee/scaaclass.htm
[9] Arboleda, E.R., Fajardo, A.C., Medina, R.P.: 'Classification of coffee bean species using image processing, artificial neural network and k nearest neighbors'. 2018 IEEE Int. Conf. on Innovative Research and Development, Bangkok, Thailand, 2018, pp. 1–8
[10] Arboleda, E.R.: 'Comparing performances of data mining algorithms for classification of green coffee beans', Int. J. Eng. Adv. Technol., 2019, 8, pp. 1563–1567
[11] Okubo, N., Kurata, Y.: 'Nondestructive classification analysis of green coffee beans by using near-infrared spectroscopy', Foods, 2019, 8, p. 82
[12] Gunadi, I.G.A., Artha, I.P.M.K., Christyaditama, I.G.P., et al.: 'Detection of coffee bean damage in the roasting process based on shape features analysis'. 2019 Int. Conf. on Mathematics and Natural Sciences, Bali, Indonesia, 2020, vol. 1503
[13] Arboleda, E.R., Fajardo, A.C., Medina, R.P.: 'An image processing technique for coffee black beans identification'. 2018 IEEE Int. Conf. on Innovative Research and Development (ICIRD), Bangkok, Thailand, May 2018, pp. 1–5
[14] Tho, T.P., Thinh, N.T., Bich, N.H.: 'Design and development of the vision sorting system'. 2016 3rd Int. Conf. on Green Technology and Sustainable Development (GTSD), Kaohsiung, Taiwan, November 2016, pp. 217–223
[15] Zhu, S., Xia, X., Zhang, Q., et al.: 'An image segmentation algorithm in image processing based on threshold segmentation'. 2007 Third Int. IEEE Conf. on Signal-Image Technologies and Internet-Based System, Shanghai, China, December 2007, pp. 673–678
[16] Qin, Y., Sun, S., Ma, X., et al.: 'A background extraction and shadow removal algorithm based on clustering for vibe'. 2014 Int. Conf. on Machine Learning and Cybernetics, Lanzhou, China, July 2014, vol. 1, pp. 52–57
[17] Mikołajczyk, A., Grochowski, M.: 'Data augmentation for improving deep learning in image classification problem'. 2018 Int. Interdisciplinary PhD Workshop (IIPhDW), Swinoujscie, Poland, May 2018, pp. 117–122
[18] Zaheer, R., Shaziya, H.: 'GPU-based empirical evaluation of activation functions in convolutional neural networks'. 2018 2nd Int. Conf. on Inventive Systems and Control (ICISC), Coimbatore, India, January 2018, pp. 769–773
[19] Kalman, B.L., Kwasny, S.C.: 'Why tanh: choosing a sigmoidal function'. [Proc. 1992] IJCNN Int. Joint Conf. on Neural Networks, Beijing, China, June 1992, vol. 4, pp. 578–581
[20] Boughorbel, S., Jarray, F., El-Anbari, M.: 'Optimal classifier for imbalanced data using Matthews correlation coefficient metric', PLOS ONE, 2017, 12, p. e0177678
[21] Chou, D.L.: 'Real-time classification of green coffee beans by using a CNN'. Available at https://youtu.be/aqzs3o8z08
