
Sensing and Bio-Sensing Research 43 (2024) 100632

Contents lists available at ScienceDirect

Sensing and Bio-Sensing Research


journal homepage: www.elsevier.com/locate/sbsr

Electronic nose coupled with artificial neural network for classifying of coffee roasting profile
Suryani Dyah Astuti a, *, Ihsan Rafie Wicaksono a, Soegianto Soelistiono a, Perwira Annissa
Dyah Permatasari b, Ahmad Khalil Yaqubi c, Yunus Susilo d, Cendra Devayana Putra e,
Ardiyansyah Syahrom f
a Department of Physics, Faculty of Science and Technology, Airlangga University, 60115 Surabaya, Indonesia
b Department of Mathematic, Faculty of Science and Technology, Airlangga University, 60115 Surabaya, Indonesia
c Doctoral Degree, Faculty of Science and Technology, Airlangga University, 60115 Surabaya, Indonesia
d Faculty of Engineering, Dr Soetomo University Surabaya, Indonesia
e Institute of Information Management, National Cheng Kung University, Tainan, Taiwan
f Medical Devices and Technology Centre, Universiti Teknologi Malaysia, 81310 Johor, Malaysia

A R T I C L E  I N F O

Keywords: Electronic nose; TGS sensors; Roasting temperature; ANN

A B S T R A C T

Coffee is known for its diverse aromas, which are shaped by postharvest treatments; the roasting process in particular plays a pivotal role in determining the quality of the brewed beverage. This study focuses on classifying the aroma of Arabica coffee beans based on roasting temperature, employing an electronic nose equipped with a TGS gas array sensor. The classification methodology integrates deep learning through an artificial neural network (ANN), along with a calculation analysis utilizing the Pearson correlation coefficient. Raw Robusta coffee beans were subjected to five distinct roasting treatments (185 °C, 195 °C, 205 °C, 215 °C, and 225 °C), resulting in light roasts, light to medium roasts, medium roasts, medium to dark roasts, and dark roasts. The repeatability test affirms the TGS sensor's reliability, exhibiting a standard deviation (STD) below 20%. Notably, the TGS 2612 and TGS 2611 sensors, dedicated to odor detection, demonstrated excellent validity with an STD below 10% across various roasting temperatures. Classification results from deep learning cross-validation showcase impressive accuracy: 98.2% for Light Roasts, 98.4% for Light to Medium Roasts, 98.8% for Medium Roasts, 97.8% for Medium to Dark Roasts, and 95.9% for Dark Roasts. In conclusion, this study reveals that the E-nose, utilizing the TGS gas sensor array with deep learning analysis, effectively detects and classifies coffee types based on roasting temperature with high accuracy.

1. Introduction

The most famous and expensive thing grown on plantations is coffee, which is consumed in more than 10 million cups every year. This coffee's flavor, background, and custom all contribute to its allure. Several varieties of coffee, including Arabica, Robusta, or a mix of these two, are popular on a worldwide scale. According to a report by an international coffee group, the trade in Robusta coffee is relatively lower than Arabica coffee [1]. When compared to Robusta, which had 47.62 million bags exported in 2021, Arabica had a total of 82.72 million bags. At the end of 2021, the trade in Arabica coffee had grown by 4.1%, and the future looks promising. Arabica is promising both in terms of commerce overall and price. Compared to Robusta, Arabica coffee is generally more costly. The price differential per kg ranges from 1.26 to 2.35 USD [2].

Robusta is cultivated at lower elevations and is renowned for its strong character, while Arabica is valued for its delicate taste and subtle texture. Because Arabica is of more excellent quality and commands higher prices, it dominates worldwide trade. Market impact comes from Brazil, a significant manufacturer [3]. The sector adapts to changing customer needs by incorporating ethical and sustainable practices. Distinctive features, consumer inclinations, and ethical and economic considerations all impact the dynamics of the world coffee market.

A crucial stage in making coffee is roasting, which involves giving green coffee beans heat. The coffee is given a nice aroma by roasting, which also removes the water from the beans. Because roasting changes the chemistry, Arabica coffee has a more complex chemistry, as well as a

* Corresponding author.
E-mail address: suryanidyah@fst.unair.ac.id (S.D. Astuti).

https://doi.org/10.1016/j.sbsr.2024.100632
Received 28 November 2023; Received in revised form 6 February 2024; Accepted 16 February 2024
Available online 17 February 2024
2214-1804/© 2024 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

better smell, taste, and color. This makes the roasting process crucial [4]. This procedure, done at different temperatures, produces different roast levels, giving the coffee its particular smells and scents. The beans' chemical transformations produce a range of tastes, including Maillard browning and caramelization, from light to dark roasts. Dark roasts provide more profound, smokier overtones, while light roasts maintain the natural bean qualities. To reconcile maintaining the natural characteristics of the bean with reaching the appropriate roast degree, roasting calls for dexterity and skill. Roasters of coffee manipulate temperature and time precisely to create a harmoniously balanced taste profile. The quality of coffee is greatly influenced by roasting. For the coffee to taste harmonic and pleasurable, careful control over the process is necessary to prevent under- or over-roasting. Coffee roasting is a painstaking art form that combines scientific knowledge with sensory perception to transform raw beans into a fragrant, tasty beverage ready to be brewed. Roasting temperature and time determine roasting levels, requiring a roaster to stay close by and examine the roast. The complexity of this process accounts for the coffee roaster's high compensation. The caliber of specialists is determined by their business. According to an official survey conducted by a third party, the average salary for a coffee roaster is between 15,000 and 20,000 USD annually, with the highest salary surpassing 100,000 USD [5]. We are trying to come up with an alternative for forecasting coffee roasting levels in light of these expense and complexity issues.

The Electronic Nose (E-Nose) is a revolutionary solution to the problems with traditional coffee roasting level assessment techniques. Conventional methods are expensive and complicated since they depend on expert roasters and subjective human judgments. With deep learning and gas sensors, the E-Nose provides a reliable and effective substitute. It delivers sensitive, real-time analysis by imitating the nostrils of mammals, addressing problems such as age-dependent sensory fluctuations and subjective "cup tests." The research shows how well the E-Nose categorizes Arabica coffee beans according to roasting temperature, providing a consistent and trustworthy method for assessing coffee quality.

Therefore, to address things objectively, technology such as the Electronic Nose (E-Nose) has been developed. E-nose has been widely used for quality detection of food ingredients [6] and microbes that cause infection [7] and contaminants [8]. E-nose is a device that mimics the workings of the mammalian nostrils by using gas sensors that can respond to certain scents [9]. The signal response generated by the e-nose to certain scents is analyzed using pattern recognition software so that it can be identified [10]. Compared with other analytical techniques such as gas chromatography, electronic nose systems can be built in and can provide sensitive and selective analysis in real time. The e-nose system has four main components, namely the gas sensor array, headspace system, data acquisition and pattern recognition [11] [12]. The gas sensors used by E-Noses are made of conductive polymer gas sensors, quartz microbalances, surface acoustic vibrations, and metal oxides [13]. The sensing and purging mechanisms make up the two main components of the headspace system. This invention shows that it is possible to effectively convert the chemical reaction between volatile substances and gas sensors into a digital signal [14]. Various studies have used comparisons of various methods to see discrimination during coffee washing [15], as well as coffee recognition using E-nose [16] and the use of E-nose and E-tongue to differentiate coffee in China based on species [17].

The roasting degree can be changed to suit customer preferences [18]. Five roasting degrees were used; the roasting levels in this research are not indicative of the real roasting level [19]. Since level five requirements are the most important in this study, clients with lower-level needs can still use our method.

Coffee is classified using multispectral and random forest methods [20]. Linear Discriminant Analysis (LDA) is used to ascertain the type of coffee's flavor [14]. However, all characteristics and predictors must be assumed to have normal distributions in order to use LDA. Support Vector Machine (SVM) and Random Forest (RF) have also been used [21]. However, it is very challenging to separate data with small vector dimensions due to the spread of data that can be obtained at a particular location. Backpropagation analysis is used to forecast Robusta and Arabica coffee grinds [22]. Unfortunately, extensive study is required to determine backpropagation's ideal performance. Principal Component Analysis (PCA) has been used alongside Gas Chromatography-Mass Spectrometry to analyze the flavor of Java and Sumatra Robusta coffee [23]. PCA can be used with small data points, but requires a deep learning framework to take full advantage of data.

Six gas sensors (TGS 2620, TGS 2612, TGS 2611, TGS 2602, TGS 2600, and TGS 826) are used to examine the scents produced by Arabica coffee beans at different roasting temperatures. Every sensor provides distinct information. Organic solvents are detected by TGS 2620, combustion gases are targeted by TGS 2612, methane is detected by TGS 2611, volatile organic chemicals are identified by TGS 2602, tobacco and cooking scents are targeted by TGS 2600, and ammonia and amines are detected by TGS 826. Combined, these sensors provide a thorough grasp of coffee odors' subtleties and chemical makeup. This is the foundation for a deep learning model that evaluates coffee quality using fragrance profiles.

The model has difficulty dealing with dark roast coffee because of its natural qualities and consistent and robust fragrance profile. Difficulty in distinguishing fragrance qualities emerges from complicated chemical reactions caused by the lengthy, high-temperature roasting process. Sensitivity issues with strong dark roast scents might cause the E-Nose system's sensors to miss more minor differences. Subjectivity in human judgments and limited variety in the training dataset add to the challenge. Accurate categorization is further complicated by the possible overlap in fragrance qualities between dark roast and neighboring roast degrees. A sophisticated strategy that includes feature inclusion, sensor improvement, and dataset refining is needed to overcome these obstacles for better dark roast categorization.

The study uses an electronic nose (E-Nose) with TGS gas sensors to forecast coffee roasting levels, specifically Arabica coffee. The E-Nose mimics mammalian nostrils and uses pattern recognition software for analysis. The technology offers objectivity and is suitable for quality detection in food ingredients [24]. The E-Nose classifies Arabica coffee beans based on roasting temperature, determining final product characteristics. The deep learning classification achieves high accuracy values, making it a promising alternative to traditional methods for assessing coffee quality based on aroma profiles.

The E-Nose sensor is employed to classify Arabica coffee beans based on roasting temperature and to identify the most informative sensor for distinguishing coffee profiles. Raw Robusta coffee beans were divided into five samples, with each sample undergoing a different roasting treatment: roasting at 185 °C, 195 °C, 205 °C, 215 °C, and 225 °C. This process aimed to obtain coffee bean samples representing light roasts, light to medium roasts, medium roasts, medium to dark roasts, and dark roasts.

2. Materials and methods

2.1. Sample preparation

The sample is 2 kg of Robusta coffee beans from the Ijen area, East Java, Indonesia, which are still green in color and have a moisture content of 12.9%. The raw Robusta coffee beans were divided into five samples, with each sample going through a different roasting process [25], namely roasting at 185 °C, 195 °C, 205 °C, 215 °C, and 225 °C, in order to obtain samples of light roasted coffee beans (S1), light to medium roasted coffee beans (S2), medium roasted coffee beans (S3), medium to dark roasted coffee beans (S4), and dark roasted coffee beans (S5).

Lightly roasted Robusta coffee is produced by a mild roasting method at 185 °C. The light-to-medium roasting process is carried out at 195 °C. Roasting takes place at 205 °C for the medium


roast variety. Roasting takes place at a temperature of 215 °C for the medium to dark roast. Finally, the beans are roasted at 225 °C for the dark roast. Data was gathered for the light roast, light-to-medium roast, medium roast, medium-to-dark roast, and dark roast coffee beans, with each treatment sample analyzed with the E-Nose up to 50 times.

2.2. Set-up of experiment

The experimental setting for this study is shown in Fig. 1, with an array of gas sensors that collect Robusta coffee aromas during the data collection phase. Beforehand, the sensors were pre-heated for 30 min to stabilize them so that they run well when detecting samples. Next, the sensor response was tested with 50 replications. The output data obtained by the gas sensor array is the value of voltage over time. Meanwhile, the input value is in the form of the aroma obtained from the sample. During the pre-processing phase, an array of sensors is used to determine the importance of each sensor. Computational analysis was used to predict coffee groups using Pearson's correlation coefficients. (See Fig. 2.)

Six sensors (TGS 2620, TGS 2612, TGS 2611, TGS 2602, TGS 2600, and TGS 826) are used in this investigation. Each sensor has a particular gas detection range [20]. More specifically, the TGS 2620 detects vapors of organic solvents and alcohol gases between 50 and 5000 ppm EtOH. Methane, propane, and butane each have a detection range of 1.25% LEL for the TGS 2612. The TGS 2611 can detect methane at levels between 500 and 10,000 ppm. TGS 2602 has a 1 to 30 ppm EtOH detection range for volatile organic substances (hydrogen sulfide, VOC, ammonia, LPG, butane and propane). The TGS 2600 can detect tobacco and cooking odor gas at concentrations between 1 and 30 ppm. The TGS 826 can detect ammonia and amine compounds at concentrations between 30 and 300 ppm. Samples were observed using these sensors.

2.3. Pearson correlation analysis

Pearson correlation analysis is often used to choose which features to use or to figure out how to change new features [26,27]. Pearson correlation analysis helps to avoid overfitting and reduce computer work. The Pearson correlation coefficient is usually used to measure the linear relationship between two variables. Its values range from −1 to 1. A score of −1 indicates a perfectly negative correlation. A value of 1 indicates a perfectly positive correlation, while a value of 0 indicates no linear relationship between the two variables. The Pearson correlation coefficient is described as follows:

$$R = \frac{\sum_{i=1}^{n}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i-\bar{x})^2}\,\sqrt{\sum_{i=1}^{n}(y_i-\bar{y})^2}} \tag{1}$$

where $x_i$ and $y_i$ stand for the X and Y values of the i-th sample, $\bar{x}$ and $\bar{y}$ are the mean values of X and Y, respectively, and $n$ is the number of samples [24].

2.4. Deep learning framework

Gathering information from E-Nose sensors when roasting Robusta coffee beans is part of integrating a deep learning framework after Pearson correlation analysis. Critical sensors are identified by correlation analysis, and a double-hidden-layer backpropagation neural network is built to forecast roasting levels. After training and parameter adjustment, the model outperforms conventional techniques regarding accuracy. The method advances the evaluation of coffee quality by offering an efficient and objective way to predict coffee roasting levels based on fragrance patterns.

Neural networks can be used to model and predict nonlinear systems, mimicking any continuous nonlinear map. In this paper, a double-hidden-layer backpropagation neural network with several inputs and six outputs is built. The input system prediction parameters are thought to be representations of the neurons found in the input layer. To complete the spatially weighted aggregate of the excitation input and output signals, the first hidden layer's neuron nodes are employed. Neuron nodes in the second hidden layer are used to enhance the network's nonlinear mapping capabilities for intricate input-output interactions. Two process neuron nodes make up the output layer, which completes the system output.

The transfer function of each layer has a big effect on how well a backpropagation neural network works. Through experimentation, several transfer functions were discovered. The linear transfer function, tangent transfer function, logarithmic sigmoid transfer function, and others are frequently used transfer functions in backpropagation neural networks. The TensorFlow library defines three types of transfer functions: tangents, linear functions, and tangents56. This can be phrased as follows, under the presumption that the signal from the input layer to the first hidden layer is:

$$m_j = \sum_{k=1}^{m} x_k w_{kj} + b_j \tag{2}$$

where $x_k$ is the input neuron, representing each design parameter of the odor sensor discussed in this work, $w_{kj}$ is the weight from the input layer to the first hidden layer, and $b_j$ is the bias of the first hidden layer. The output signal for the first hidden layer is indicated by $y_j$ in Eq. (3):

$$y_j = \mathrm{tansig}(m_j) \tag{3}$$

The signal transferred from the first hidden layer to the second hidden layer is symbolized by $n_i$, and its calculation is depicted in Eq. (4):

$$n_i = \sum_{j=1}^{k} y_j w_{ji} + b_i \tag{4}$$

where $w_{ji}$ is the weight from the first hidden layer to the second hidden layer and $b_i$ is the bias of the second hidden layer. The output signal of the second hidden layer is symbolized by $z_i$ and is expressed as

Fig. 1. Experiment setup.


Fig. 2. Deep learning framework.
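The framework in Fig. 2 and the layer equations of Section 2.4 (Eqs. (2) to (6)) can be illustrated with a minimal NumPy sketch of one forward pass. Everything specific here is an assumption made for illustration: the hidden-layer widths, the random weights drawn from [0, 1), the softmax applied to the output signal, the made-up sensor reading, and the helper names `init_params` and `forward`. It is not the trained model from this study.

```python
import numpy as np

def init_params(n_in, n_h1, n_h2, n_out, seed=0):
    """Random weights in [0, 1) and zero biases (illustrative initialization only)."""
    rng = np.random.default_rng(seed)
    sizes = [(n_in, n_h1), (n_h1, n_h2), (n_h2, n_out)]
    weights = [rng.uniform(0.0, 1.0, size=s) for s in sizes]
    biases = [np.zeros(s[1]) for s in sizes]
    return weights, biases

def forward(x, weights, biases):
    """Forward pass through two tanh hidden layers and a linear output layer."""
    (w1, w2, w3), (b1, b2, b3) = weights, biases
    m = x @ w1 + b1              # Eq. (2): weighted sum into the first hidden layer
    y = np.tanh(m)               # Eq. (3): tansig is mathematically the same as tanh
    n = y @ w2 + b2              # Eq. (4): weighted sum into the second hidden layer
    z = np.tanh(n)               # Eq. (5)
    s = z @ w3 + b3              # Eq. (6): signal passed to the output layer
    exp_s = np.exp(s - s.max())  # softmax (an assumption) turns s into class probabilities
    return exp_s / exp_s.sum()

# Six inputs (one per TGS sensor) and five outputs (one per roast class).
weights, biases = init_params(n_in=6, n_h1=8, n_h2=8, n_out=5)
x = np.array([0.42, 0.10, 0.77, 0.35, 0.58, 0.21])  # made-up normalized sensor voltages
probs = forward(x, weights, biases)
predicted_class = int(np.argmax(probs))
```

The predicted roast class is the index of the largest probability; with untrained random weights the prediction itself carries no meaning.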

follows:

$$z_i = \mathrm{tansig}(n_i) \tag{5}$$

The signals transferred from the second hidden layer to the output layer are recorded as $s$, as illustrated in Eq. (6):

$$s = \sum_{i=1}^{p} z_i w_i + b \tag{6}$$

2.5. Fine-tuning deep learning procedure

Depending on the amount of data needed, deep learning is often trained using a variety of approaches. Various deep learning techniques often produce different results. The following details our deep learning weighting strategy. Step one is initializing the model, which is necessary because a model needs to be adjusted for a certain objective in order to solve the issue. The starting node of a model typically includes a random node weight and bias. These weights typically vary from 0 to 1. Depending on how challenging the problem is, deep learning uses a different number of weights.

Step two is forward propagation, the process of performing calculations on data to identify the current model's anticipated label. Step three is to determine the loss function. To determine the current model's performance, a loss function computation is necessary. Generally, the mean squared error (ratio data), categorical cross-entropy (ordinal data), and other comparable metrics serve as the foundation for the loss function. We use categorical cross-entropy as a loss function because our dataset is ordinal. For each data point, both the predicted label and the actual label are available. The following is a definition of the total loss function:

$$L(\hat{y}_i, y_i) = -\sum_{i}^{N} y_i \log \hat{y}_i \tag{7}$$

The fourth step is optimization. Optimization is a technique that lowers the overall error from the previous step. There are several optimization techniques, including Adam, Adagrad, RMSProp, etc.

The fifth step is backpropagation. The backpropagation algorithm reduces the error value in the loss function using the delta rule or gradient descent to obtain the ideal weight. Based on the loss function, this phase decides whether a weight needs to be increased or decreased. The weights are modified from the last layer back to the input. If the derivative value is positive, increasing the layer weight increases the error, so the weight must be decreased. If not, the layer weight is increased. The derivative score of the preceding layer is then calculated.

$$w_n^m = w_n^{m-1} - \epsilon \cdot \alpha \tag{8}$$

where $w_n^m$ is the weight in node $n$ at iteration $m$, $\epsilon$ is the derivative score, and $\alpha$ is the learning rate.

2.6. Data collection

For potential future analysis, our dataset is freely available for download on GitHub [1]. Six sensors' worth of raw data, representing classes, total 26,002 bytes in the repository. Fig. 1 demonstrates the normal distribution of the data we have gathered. The dataset has a clustered shape rather than being divided into classes linearly. This is consistent with our belief that natural data will result in a clustered structure. The distribution of our dataset is balanced as well: it contains 19,124 light roast data, 19,122 light to medium roast data, 19,132 medium roast data, 19,508 medium to dark roast data, and 19,116 dark roast data. Any classification model would benefit from such a data distribution. This dataset's intricacy is its biggest flaw. The data features overlap one another. These conditions are difficult and require additional research to be resolved.

3. Results

3.1. Pearson correlation coefficient

We use Pearson correlation to figure out how important each sensor is [23,25]. This helps us find the right feature and deal with the cost issue. We investigated whether the output of a single predictor could be improved by integrating various olfactory descriptors. We conduct the calculations outlined for the training dataset to get the Pearson correlation coefficient between each pair of odor descriptors. The standard for model fusion is the Pearson correlation coefficient. The Pearson correlation coefficients between the several odor descriptors are displayed in Table 1.

As the output of the regressor, each new pair of sensors is compared to a single description. The six sensors are represented by TGS 2600 to TGS 826. Sensor TGS 2600 and sensor TGS 2620 have a close relationship. Sensor TGS 2602 and sensor TGS 826 have a close relationship. Sensors TGS 2611 and TGS 2612 are not strongly related to the other sensors. Fig. 3 shows how the accuracy of predictions only changes for one or two variables based on the output of the regressor. The most effective sensor for predicting coffee is TGS 2602. The TGS 2600 and TGS 2620 sensors are closely related. Despite having similar Pearson correlation


Table 1
Pearson correlation coefficient for roasting profile coffee.

            TGS 2600    TGS 2602    TGS 2611    TGS 2620    TGS 2612    TGS 826
TGS 2600    1
TGS 2602    0.697230    1
TGS 2611    0.411828    0.686284    1
TGS 2620    0.991888    0.735301    0.438835    1
TGS 2612    0.682535    0.498495    0.243995    0.651970    1
TGS 826     0.710215    0.972227    0.651203    0.749243    0.538378    1
Class       0.750990    0.428502    0.038299    0.776075    0.440386    0.506036
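Each entry of Table 1 is a Pearson coefficient of the form given in Eq. (1). A minimal sketch of that calculation is shown below, checked against NumPy's built-in `np.corrcoef`. The three signals are synthetic stand-ins generated for this example, not the study's sensor data, and `pearson_r` is a hypothetical helper name.

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient, written term by term as in Eq. (1)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()  # deviations from the mean of X
    dy = y - y.mean()  # deviations from the mean of Y
    return float((dx * dy).sum() / (np.sqrt((dx ** 2).sum()) * np.sqrt((dy ** 2).sum())))

# Synthetic stand-ins for three sensor channels (assumed data, for illustration only).
rng = np.random.default_rng(1)
sensor_a = rng.normal(size=200)
sensor_b = 0.9 * sensor_a + 0.1 * rng.normal(size=200)  # built to track sensor_a closely
sensor_c = rng.normal(size=200)                         # generated independently

r_ab = pearson_r(sensor_a, sensor_b)
r_ac = pearson_r(sensor_a, sensor_c)

# Full pairwise matrix, analogous in layout to Table 1 (rows are variables).
corr = np.corrcoef(np.vstack([sensor_a, sensor_b, sensor_c]))
```

A pair built to move together, like `sensor_a` and `sensor_b`, yields a coefficient near 1, which is the pattern Table 1 reports for TGS 2600 and TGS 2620.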

Fig. 3. Impact of descriptor correlation on prediction results.

coefficient ratings, TGS 2602 and TGS 826 behave quite differently. This happens because, despite carrying nearly identical information, the information from sensor TGS 2602 may be masked by other sensors, whereas that from sensor TGS 826 may be less masked.

As our intuition suggests, all sensors carry useful information about the coffee classes. We employ all sensors as part of our features, including the TGS 2602, TGS 2611, TGS 2612, TGS 826, TGS 2600, and TGS 2620.

3.2. Deep learning analysis

Cross-validation was used to compare Deep Learning accuracy with conventional techniques and represent the true state of machine learning. The other nine groups functioned as training sets when one group was used for testing. The mean of the ten classification accuracies was then used. Because leave-one-out cross-validation took longer and produced results that were identical to those of 10-fold cross-validation, we opted for 10-fold cross-validation in this investigation. Cross-validation is the method we use to prevent overfitting. Node size and hidden layer size are covered in this section. The impact of the number of hidden layers on model performance is seen in Fig. 4. We tested one to four hidden layers. From one to three layers, we found that model performance improved, because model generalization increased up to the third layer. Four layers are nonetheless less accurate than three because the model is too complex for our data. The final model is therefore a neural network with three hidden layers (DHBP).

Fig. 5 shows that the time it takes to process data goes down as the

Fig. 4. Influence hidden layer number to accuracy.
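The 10-fold procedure described in Section 3.2 (one group held out for testing, the other nine used for training, accuracies averaged over the ten rounds) can be sketched as follows. This is only an index-splitting illustration under stated assumptions: `k_fold_indices` is a hypothetical helper, and the sample count and shuffling seed are arbitrary; the study's actual splitting code is not given in the text.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold is the test set exactly once."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)  # k nearly equal groups
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Example with 100 samples: every round trains on 90 and tests on 10.
splits = list(k_fold_indices(100, k=10))
fold_sizes = [(len(train), len(test)) for train, test in splits]

# In a full experiment, a model would be fitted on train_idx and scored on
# test_idx in each round, and the ten scores averaged into one accuracy.
```

Because every sample appears in exactly one test fold, the averaged score uses all of the data for evaluation without ever testing on a sample the model was trained on in that round.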


Fig. 5. Influence hidden layer to training time.

model gets closer to convergence. This happens because the appropriate model only needs to adjust its weights a few times, whereas the inappropriate model requires more time to perform backpropagation.

This study also examines the case where each layer has the same number of nodes, as shown in Fig. 6. Using 10 to 150 nodes, we assess the model's performance. The performance of the system is linearly influenced by the number of nodes in a layer, from 10 to 60 nodes. The accuracy then declines until 70 nodes. This performance degradation is the result of a time-consuming computing process that was unable to thoroughly evaluate the data. Performance then improves from 81 to 91 nodes. At this point, the model has learned everything it can. The model finally reaches the maximum limit of model learning between 91 and 150 nodes, but the slow computational process makes model learning unsatisfactory. As a result, the number of nodes was set to 91.

The effect of the number of iterations on accuracy is depicted in Fig. 7. Each iteration of our DHBP has comparatively higher accuracy. When the parameters are merely set in the second iteration, the accuracy decreases.

Other parameter settings used in this study are listed in Table 2. In order to make this paper more reproducible, we added a parameter table. We make use of a number of parameters, including optimizer, epsilon, and learning rate, that are frequently utilized in deep learning research.

Table 3 presents the classification results achieved through deep learning on cross-validation.

Predicting odor descriptor scores using electronic nose signal data is an important research topic in coffee classification. Deep learning has shown superior performance in computer vision, speech recognition, and natural language processing compared to conventional machine learning techniques. Table 4 shows the results of the TGS gas sensor repeatability test. The repeatability test results for each sensor show that the TGS 2612 and TGS 2611 sensors are odor sensors that have good validity at various roasting temperatures. This discovery could be the answer to a managerial decision to choose a cheap but powerful odor sensor. This research found that the right number of hidden layers affects processing time in deep learning. Technical insights can be used in other domains.

Triple layers are used to overcome the coffee prediction problem using E-Nose sensors, Pearson correlation analysis, and odor descriptors. We should relax the requirement that all combinations contain the same number of descriptors and allow combinations to choose descriptors adaptively to get better combination effects. In the future, we will explore other odor sensors. We also want to explore other types of sensors, such as tongue sensors, to improve machine receptors.

3.3. Comparison performance across methods

According to our review, the classification of coffee aroma is closely related to two techniques that are often used. The first approach employs Linear Discriminant Analysis (LDA) to assess corpora [14,26]. A normal distribution is a crucial premise for LDA. Only a large enough dataset will allow for the normal distribution to be reached. The second technique combines Support Vector Machine (SVM) with Principal Component Analysis (PCA) [21]. Although PCA is effective at handling generalizations of data, it may actually result in information loss. As a result, we used three hidden layers to improve data generalization and a backpropagation baseline to lessen the dependence on the normal distribution. Table 3 displays our results in terms of accuracy.

In addition, we use two methods that are common in odor categorization to address a variety of issues. A decision tree is the first technique [28]. When the data set is small, the decision tree or discriminant classification method may become unstable. The second technique is basic backpropagation [29]. Backpropagation is effective in all domains, but can be improved by adding a hidden layer to capture the essence of the smell descriptor.

Backpropagation and SVM have a 0.1 accuracy difference, but DHBP

Fig. 6. Influence of node number to accuracy.


Fig. 7. Influence of iteration to accuracy.
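Each training iteration counted in Fig. 7 evaluates the loss of Eq. (7) and then updates the weights as in Eq. (8). The sketch below shows both calculations on toy numbers. The learning rate 0.001 is the value listed in Table 2; the one-hot label, the predicted probabilities, the gradients, the clipping constant `eps`, and the function names are assumptions for illustration. The update shown is plain gradient descent, not the Adam optimizer used in the study.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    """Eq. (7): L = -sum_i y_i * log(y_hat_i) for a one-hot label y_true."""
    y_pred = np.clip(y_pred, eps, 1.0)  # eps guards against log(0); an assumption
    return float(-(y_true * np.log(y_pred)).sum())

def gradient_step(w, grad, lr=0.001):
    """Eq. (8): new weight = previous weight - derivative * learning rate."""
    return w - grad * lr

# Toy example: the true class is "medium roast" (index 2 of 5 roast classes).
y_true = np.array([0.0, 0.0, 1.0, 0.0, 0.0])
y_pred = np.array([0.05, 0.10, 0.70, 0.10, 0.05])
loss = categorical_cross_entropy(y_true, y_pred)  # equals -log(0.70)

# A positive derivative lowers the weight; a negative derivative raises it.
w = np.array([0.5, -0.2])
grad = np.array([2.0, -4.0])
w_new = gradient_step(w, grad)
```

Adam follows the same idea of stepping against the gradient, but additionally rescales each step using running estimates of the first and second moments of the gradient, as summarized in Table 2.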

still achieves higher accuracy. The developed model is applied to forecasting. This model outperforms all other machine learning techniques with a score of 0.937. We can infer from this outcome that the deep learning approach outperformed the machine learning approach. Our deep learning model outperforms the competition in terms of total accuracy, coming in at 0.97 with a 0.4 accuracy difference. Table 3 shows the average number of samples for each original class and predicted class under cross-validation. From the table, we can notice that the most difficult profile is dark roast. The accuracy average of deep learning is 95%.

Deep learning predicts light roast, medium roast, and dark roast with no misclassification. The system also achieves a perfect classification when differentiating between light to medium roast and dark roast. The same result is recorded when classifying dark roast against light to medium roast. Overall, the highest accuracy is achieved when detecting the medium roast, which is 0.98.

4. Conclusions

This research successfully used deep learning and an electronic nose with TGS gas sensors to categorize Arabica coffee scents according to roasting temperature. The results of the repeatability test showed that the TGS sensor used had an STD below 20%, which indicated that the sensor had good validity. TGS 2612 and TGS 2611 sensors are odor sensors that have good validity with an STD below 10% at various

Table 2
Parameters setting up of deep learning analysis.

Notation        Value    Definition
Optimizer       Adam     Stochastic gradient descent method based on adaptive estimates of first-order and second-order moments.
Epsilon         1e-07    A small positive constant that is typically used to avoid numerical instability and divide-by-zero errors when calculating gradients of a loss function.
Learning Rate   0.001    A training parameter used to determine the weight correction value throughout the training process.

Table 4
The result of TGS gas sensor repeatability test.

            STD (%) at roasting temperature
Sensor      185 °C     195 °C        205 °C     215 °C        225 °C
TGS 2600    23.4171    8.4409621     16.1116    16.1211121    17.0965
TGS 2602    13.8624    13.6001783    31.711     45.2457842    51.5936
TGS 2611    5.2252     6.0196961     10.0623    8.9897177     9.658
TGS 2620    19.3861    9.0462564     17.9403    17.9323349    18.912
TGS 2612    5.1299     3.2603214     3.1297     2.7724706     3.9926
TGS 826     18.3314    18.415294     36.6359    45.6665706    48.6132

Table 5
Comparison results of odor classifications.

Author        Year             Method                                                 Accuracy    Std. Dev.
Machine Learning
[9]           2011             Support Vector Machine                                 0.906       0.03
[30]          2017             Random Forest                                          0.639       0.00
[18,26]       2018 and 2019    Linear Discriminant Analysis                           0.706       0.00
[27]          2020             Decision Tree                                          0.553       0.00
[21]          2021             Random Forest + Support Vector Machine                 0.986       0.01
[6]           2022             Principal Component Analysis + Deep Neural Network     0.987       0.01
Deep Learning
[28]          2017             1-Layer Hidden Backpropagation                         0.937       0.02
[31]          2022             Principal Component Analysis + Support Vector Machine  0.985       0.01
[32]          2023             Artificial neural networks                             0.903       0.01
This study    2023             Deep Learning                                          0.978       0.01

Table 3
Classification results by deep learning on cross-validation.

Odor data              Predicted group membership                                                          Accuracy
                       Light Roast  Light to Medium Roast  Medium Roast  Medium to Dark Roast  Dark Roast
Light Roast            18,790.00    314.70                 3.20          0.20                  15.90       0.982
Light to Medium Roast  249.70       18,821.30              46.30         4.70                  0.00        0.984
Medium Roast           2.00         20.20                  18,917.30     104.50                88.00       0.988
Medium to Dark Roast   0.00         7.80                   112.00        19,085.00             303.20      0.978
Dark Roast             6.20         0.00                   160.10        616.00                18,333.70   0.959
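
The accuracy column of Table 3 can be reproduced from the counts in each row. The sketch below is illustrative and is not the authors' code: it assumes that each accuracy is the diagonal count divided by the row total (per-class recall), and the names LABELS, CONFUSION, and per_class_accuracy are hypothetical. Under that assumption the computed values agree with the table to within about 0.001, and their mean is close to the 0.978 reported for this study in Table 5.

```python
# Average cross-validation counts copied from Table 3.
# Rows are the true roast class, columns are the predicted class.
LABELS = ["Light", "Light to Medium", "Medium", "Medium to Dark", "Dark"]
CONFUSION = [
    [18790.00, 314.70, 3.20, 0.20, 15.90],
    [249.70, 18821.30, 46.30, 4.70, 0.00],
    [2.00, 20.20, 18917.30, 104.50, 88.00],
    [0.00, 7.80, 112.00, 19085.00, 303.20],
    [6.20, 0.00, 160.10, 616.00, 18333.70],
]


def per_class_accuracy(matrix):
    """Return diagonal / row sum for each row (assumed definition of accuracy)."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


accuracies = per_class_accuracy(CONFUSION)
mean_accuracy = sum(accuracies) / len(accuracies)

# Print one line per roast profile, then the unweighted mean.
for label, acc in zip(LABELS, accuracies):
    print(f"{label + ' Roast':22s} {acc:.3f}")
print(f"{'Mean':22s} {mean_accuracy:.3f}")
```

Dividing by the row total treats every true class separately, so the lowest value identifies the profile that is hardest to recognize (dark roast) regardless of how many samples each class contains.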


roasting temperatures. Classification results by deep learning on cross-validation show good accuracy values, namely 98.2% for light roasts, 98.4% for light to medium roasts, 98.8% for medium roasts, 97.8% for medium to dark roasts and 95.9% for dark roasts. As an objective and effective replacement for conventional approaches, this technology has potential uses for precise quality control and roasting level measurement in the coffee industry. The results of this study indicate that an E-nose based on the TGS gas sensor array with deep learning analysis is able to detect and classify coffee by roasting temperature with good accuracy.

Author statement

We the undersigned declare that this manuscript is original, has not been published before and is not currently being considered for publication elsewhere. We confirm that the manuscript has been read and approved by all named authors and that there are no other persons who satisfied the criteria for authorship but are not listed. We further confirm that the order of authors listed in the manuscript has been approved by all of us.

CRediT authorship contribution statement

Suryani Dyah Astuti: Writing – review & editing, Visualization, Validation, Supervision, Funding acquisition, Data curation, Conceptualization. Ihsan Rafie Wicaksono: Writing – original draft, Methodology, Investigation, Formal analysis, Data curation, Conceptualization. Soegianto Soelistiono: Writing – original draft, Visualization, Validation, Methodology, Data curation, Conceptualization. Perwira Annissa Dyah Permatasari: Visualization, Resources, Project administration, Methodology, Formal analysis, Conceptualization. Ahmad Khalil Yaqubi: Writing – review & editing, Visualization, Validation, Software, Resources, Project administration, Formal analysis, Data curation, Conceptualization. Yunus Susilo: Visualization, Validation, Resources, Project administration, Methodology, Data curation, Conceptualization. Cendra Devayana Putra: Visualization, Validation, Project administration, Data curation, Conceptualization. Ardiyansyah Syahrom: Visualization, Validation, Methodology, Data curation, Conceptualization.

Declaration of competing interest

The authors state that they do not have any competing interests.

Data availability

Data will be made available on request.

References

[1] ICO, I.C.O, Exports by exporting countries to all destinations, Mon. Trade Stat. 1 (2022).
[2] M.A. Johnson, C.P. Ruiz-Diaz, N.C. Manoukis, J.C. Verle Rodrigues, Coffee berry borer (Hypothenemus hampei), a global pest of coffee: perspectives from historical and recent invasions, and future priorities, Insects. 11 (2010) 882.
[3] W.B. Sunarharum, D.J. Williams, H.E. Smyth, Complexity of coffee flavor: A compositional and sensory perspective, Int. Food Res. J. 62 (2014) 315–325.
[4] N. Bhumiratana, K. Chambers Adhikari, IV, E., Evolution of sensory aroma attributes from coffee beans to brewed coffee, LWT Food Sci. Technol. 44 (2011) 2185–2192.
[5] A. Huang, Y. Chao, E. de la Mora Velasco, A. Bilgihan, W. Wei, When artificial intelligence meets the hospitality and tourism industry: an assessment framework to inform theory and management, J. Hosp. Tour. 5 (2022) 1080–1100.
[6] A.I.F. Al Isyrofie, M. Kashif, A.K. Aji, N. Aidatuzzahro, A. Rahmatilah, Y. Susilo, S.D. Astuti, Odor clustering using a gas sensor array system of chicken meat based on temperature variations and storage time, Sens. BioSensing Res. 37 (2022) 100508.
[7] A.A.S. Pradhana, S.D. Astuti, M. Khasanah, R.K.D. Ardianti, Detection of gas concentrations based on age on Staphylococcus aureus biofilms with gas array sensors, in: AIP Conference Proceedings vol. 2314, AIP Publishing, 2020.
[8] S.D. Astuti, Y. Mukhammad, S.A.J. Duli, A.P. Putra, E.M. Setiawatie, K. Triyana, Gas sensor array system properties for detecting bacterial biofilms, J. Med. Signals Sens. 9 (2019) 158.
[9] E. Phaisangittisagul, H.T. Nagle, Predicting odor mixture’s responses on machine olfaction sensors, Sensors Actuators B Chem. 155 (2011) 473–482.
[10] A.I.F. Isyrofie, A. Afifudin, R. Susilo, Y. Kholimatussa’diyah, S. Winarno, S.D. Astuti, Role of bacterial types and odor for early detection accuracy of bacteria with gas array, in: AIP Conf Proc 2554, 2023.
[11] S.N. Hidayat, K. Triyana, I. Fauzan, T. Julian, D. Lelono, Y. Yusuf, A.M. Peres, The electronic nose coupled with chemometric tools for discriminating the quality of black tea samples in situ, Chemosensors. 7 (2019) 29.
[12] J. Guo, Y. Cheng, D. Luo, K.Y. Wong, K. Hung, X. Li, ODRP: A deep learning framework for odor descriptor rating prediction using electronic nose, IEEE Sensors J. 21 (2021) 15012–15021.
[13] Y.H. Yu Wu, D.G. Li, H.C. Feng, Electrospun nanofibers for fast dissolution of naproxen prepared using a coaxial process with ethanol as a shell fluid, Appl. Mech. Mater. 662 (2014) 29–32.
[14] A. Sanaeifar, S. Mohtasebi, M. Ghasemi-Varnamkhasti, H. Ahmadi, J.S. Lozano Rogado, Development and Application of a New Low-Cost Electronic Nose for the Ripeness Monitoring of Banana Using Computational Techniques (PCA, LDA, SIMCA, and SVM), 2014.
[15] S. Buratti, N. Sinelli, E. Bertone, A. Venturello, E. Casiraghi, F. Geobaldo, Discrimination between washed Arabica, natural Arabica and Robusta coffees by using near infrared spectroscopy, electronic nose and electronic tongue analysis, J. Sci. Food Agric. 95 (2015) 2192–2200.
[16] K. Brudzewski, S. Osowski, A. Dwulit, Recognition of coffee using differential electronic nose, IEEE Trans. Instrum. Meas. 61 (2012) 1803–1810.
[17] W. Dong, J. Zhao, R. Hu, Y. Dong, L. Tan, Differentiation of Chinese Robusta coffees according to species, using a combined electronic nose and tongue, with the aid of chemometrics, Food Chem. 229 (2017) 743–751.
[18] S.Y. Kim, B.S. Kang, A colorimetric sensor array-based classification of coffees, Sensors Actuators B Chem. 275 (2018) 277–283.
[19] S. Romani, C. Cevoli, A. Fabbri, L. Dalla Alessandrini, M. Rosa, Evaluation of coffee roasting degree by using electronic nose and artificial neural network for off-line quality control, J. Food Sci. 77 (2018) (2012) C960–C965.
[20] S. Wang, K. Kirillova, X. Lehto, Travelers’ food experience sharing on social network sites, J. Travel Tour. Mark. 34 (5) (2017) 680–693.
[21] S. Astuti, D. Tamimi, M.H. Pradhana, A.A. Alamsyah, K.A. Purnobasuki, H. Khasanah, M. Syahrom, Gas sensor array to classify the chicken meat with E. coli contaminant by using random forest and support vector machine, Biosens. Bioelectron.: X 9 (2021) 100083.
[22] D. Rabersyah, Identification of types of coffee grounds using electronic nose with backpropagation learning method, Aust. J. Electr. Electron. Eng. 5 (2016) 332–338.
[23] Y. Arimurti, K. Triyana, S. Anggrahini, Portable electronic nose as an instrument for discrimination of Java Robusta and Sumatran Robusta coffee aromas correlated with gas chromatography mass spectrometry, J. Phys. Sci. 10 (2018) 113–124.
[24] C.D. Putra, A.I.F. Al Isyrofie, S.D. Astuti, B.D. Putri, D.R. Ummah, M. Khasanah, A. Syahrom, Variational autoencoder analysis gas sensor array on the preservation process of contaminated mussel shells (Mytilus edulis), Sens. Bio-Sens. Res. 40 (2023) 100564.
[25] A.N. Gloess, A. Vietri, F. Wieland, S. Smrke, B. Schönbächler, J.A.S. López, C. Yeretzian, Evidence of different flavour formation dynamics by roasting coffee from different origins: on-line analysis with PTR-ToF-MS, Int. J. Mass Spectrom. 365 (2014) 324–333.
[26] F. Gottwalt, E. Chang, T. Dillon, CorrCorr: A feature selection method for multivariate correlation network anomaly detection techniques, Comput. Secur. J 83 (2019) 234–245.
[27] G.Y.F. Makimori, E. Bona, Commercial instant coffee classification using an electronic nose in tandem with the ComDim-LDA approach, Food Anal. Methods 12 (2019) 1067–1076.
[28] S. Wakhid, R. Sarno, S.I. Sabilla, D.B. Maghfira, Detection and classification of Indonesian civet and non-civet coffee based on statistical analysis comparison using E-nose, Int. J. Intell. Eng. Syst 13 (2020) 56–65.
[29] W. Zhao, Q.H. Meng, M. Zeng, P.F. Qi, Stacked sparse auto-encoders (SSAE) based electronic nose for Chinese liquors classification, Sensors 17 (2017) 2855.
[30] A. Chemura, O. Mutanga, Developing detailed age-specific thematic maps for coffee (Coffea Arabica L.) in heterogeneous agricultural landscapes using random forests applied on landsat 8 multispectral sensor, Geocarto Int. 32 (2017) 759–776.
[31] S.D. Astuti, A.I.F. Al Isyrofie, R. Nashichah, M. Kashif, T. Mujiwati, Y. Susilo, A. Syahrom, Gas array sensors based on electronic nose for detection of tuna (Euthynnus Affinis) contaminated by Pseudomonas aeruginosa, J. Med. Signals Sens 12 (2022) 306.
[32] A.A.S. Pradhana, S.D. Astuti, P.A.D. Permatasari, R. Agustina, A.K. Yaqubi, H. Setyawati, C.D. Putra, Sensor Array system based on electronic nose to detect borax in meatballs with artificial neural network, J. Electr. Comput. Eng. 23 (2023) 10.
