1 s2.0 S0023643822001086 Main PDF
LWT
journal homepage: www.elsevier.com/locate/lwt
ARTICLE INFO

Keywords:
Sesame oil
Identification
Fluorescence spectroscopy
Vision transformer
Few sample learning

ABSTRACT

Sesame oil (SO), as a high-priced edible oil, is often counterfeited and adulterated. A new method for SO quality identification using Vision Transformer (ViT) network based on stereoscopic images of Excitation-emission matrix fluorescence (EEMF) and Total synchronous fluorescence (TSyF) spectroscopy was proposed. The basic samples including pure, counterfeit and adulterated SOs were characterized by fluorescence spectroscopy. A data augmentation strategy including linear interpolation, shift and noise injection was selected for few sample learning. All fluorescence spectral data were visualized as stereoscopic images with rich spectral characteristics. The ViT network architecture based on attention mechanism was designed and trained to establish four SO quality identification models. The macro averages of precision, recall and F1-score on the validation set were greater than 0.99. The values of these indicators on the test samples were equal to one. In conclusion, deep learning based on ViT using stereoscopic fluorescence spectrum image provided a new method for sesame oil identification.
* Corresponding author.
E-mail addresses: zzl18833827996@163.com (Z. Zhao), wuxijun@ysu.edu.cn (X. Wu), liuhl@ysu.edu.cn (H. Liu).
https://doi.org/10.1016/j.lwt.2022.113173
Received 23 June 2021; Received in revised form 27 January 2022; Accepted 28 January 2022
Available online 2 February 2022
0023-6438/© 2022 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license
(http://creativecommons.org/licenses/by-nc-nd/4.0/).
Z. Zhao et al. LWT 158 (2022) 113173
software and the FS920 spectrometer are provided by the Edinburgh Instruments Company in the UK.

The EEMF spectrum consists of emission spectra corresponding to different excitation wavelengths. The excitation wavelength range was 250–550 nm (10 nm step size), and the emission wavelength range was 260–750 nm (2 nm step size). In addition, the emission wavelength lagged the excitation wavelength by 10 nm to eliminate the influence of Rayleigh scattering. The TSyF spectra expressed fluorescence intensity as a function of excitation wavelength and wavelength interval. When the TSyF spectra were scanned, the excitation wavelength range was 250–750 nm (5 nm step size), and the wavelength interval range was 10–120 nm (10 nm step size). One of the main advantages of TSyF spectroscopy is that Rayleigh scattering can be effectively avoided, while the advantage of EEMF over TSyF is that it is more intuitive.

The analysis work after obtaining the spectral data was done in the Spyder integrated development environment. The fluorescence data was visualized through the Matplotlib library, and data augmentation was done with Pandas and NumPy. The Keras application programming interface was responsible for building deep neural networks. More information about these tools can be found on their official websites listed in Table S1 (Supplementary Material).

2.3. Data augmentation and data visualization

Data augmentation has been widely used in many machine learning fields, such as video processing, biometrics and text analysis, but it has very few applications in chemometrics and food science (Georgouli, Osorio, Del Rincon, & Koidis, 2018). It can be seen as injecting prior knowledge about data invariant attributes into training samples to generate additional training data. Data augmentation can well solve the problem of insufficient data for few sample learning. Combining the common data augmentation methods in deep learning and the characteristics of vegetable oil spectra, three data augmentation generators including linear interpolation, shift and noise injection were proposed. These data augmentation generators can be used alone or in combination. The label of a virtual spectrum obtained by data augmentation is the same as that of the basic data.

Linear interpolation can simulate more ratios of mixed samples, which is inspired by mixup (Zhang, Cisse, Dauphin, & Lopez-Paz, 2017). The virtual spectrum is calculated by the following formula:

$$I_{\mathrm{interpolation}} = \lambda I_a + (1 - \lambda) I_b \tag{1}$$

where $I_{\mathrm{interpolation}}$ represents the augmented virtual data, $I_a$ and $I_b$ are two basic data with the same class label, and $\lambda$ is a random mixing proportion drawn from a uniform distribution on the range 0–1.

[…]

$$m' = m + \mathrm{round}(L(a_m, b_m)) \tag{3}$$

$$n' = n + \mathrm{round}(L(a_n, b_n)) \tag{4}$$

where $I_{mn}^{\mathrm{original}}$ represents the original fluorescence intensity at the $m$-th row and $n$-th column of the spectral matrix. $I_{m'n'}^{\mathrm{shift}}$ is the fluorescence intensity of the $m'$-th row and $n'$-th column after shift. $L$ is the Laplacian function with position parameter $a$ and scale parameter $b$.

Noise injection is the third data augmentation generator which can simulate the jitter caused by the random noise of the spectrometer. The formula for multiplicative Gaussian noise is expressed as follows:

$$I_{mn}^{\mathrm{noise}} = (1 + \mathrm{norm}(\mu, \sigma))\, I_{mn}^{\mathrm{original}} \tag{5}$$

where $I_{mn}^{\mathrm{original}}$ and $I_{mn}^{\mathrm{noise}}$ denote the original data and data with noise, respectively. norm is a normal distribution function with expectation $\mu$ and standard deviation $\sigma$.

Both training data after data augmentation and test data need to be visualized as images because CV is based on image pixels. In previous reports, the three-dimensional fluorescence spectrum was described as a stereogram or a contour map. In this study, the two descriptions were unified into one image through perspective, which provided richer fluorescence signals for CV tasks. Specifically, in order to balance the burden of the computer and the representativeness of the spectral image, the fluorescence data was visualized as a 144 × 144 × 3 pixel stereoscopic fluorescence spectrum image including 40 contour lines and a 70% perspective stereogram.

2.4. Vision Transformer

The Transformer proposed by Ashish Vaswani (Vaswani et al., 2017) has great potential in artificial intelligence applications. Unlike an RNN, which uses recurrent units, the Transformer with its encoder-decoder architecture uses an attention mechanism to capture long-range relationships in sequential data in parallel. Recently, the Transformer has been applied from the NLP to the CV domain by some researchers. For image data, which is higher dimensional, noisier and more redundant than sequential data, the Transformer in the ViT proposed by Alexey Dosovitskiy (Dosovitskiy et al., 2020) is directly applied to image classification for the first time. The attention mechanism applied by ViT can focus on different regions of the image and integrate the information of the whole image, which is superior to a CNN filter with a local receptive field.

The overview of ViT for stereoscopic fluorescence spectrum image is exhibited in Fig. 1 (a). It follows the original architecture of Transformer as much as possible. In order to process the two-dimensional image, the image $x \in \mathbb{R}^{H \times W \times C}$ is divided into a series of small patches $x_p \in \mathbb{R}^{N \times (P^2 \cdot C)}$. $(H, W)$ represents the pixel size of the original image, $C$ is the number of image channels, and $(P, P)$ means the pixel size of each small patch. $N = HW/P^2$ is the number of small image patches generated by dividing and the input sequence length for ViT.

The sequence of flattened patches is linearly projected to a vector of the model dimension $D$ by a learnable embedding matrix $E$. A learnable embedding $x_{\mathrm{class}}$ is prepended to the patch embeddings, which is required for the classification task. In order to retain position information, the position embedding $E_{\mathrm{pos}}$ is added to the patch embeddings. These can be expressed by the following formula:

$$z_0 = [x_{\mathrm{class}};\ x_p^1 E;\ x_p^2 E;\ \cdots;\ x_p^N E] + E_{\mathrm{pos}}, \quad E \in \mathbb{R}^{(P^2 \cdot C) \times D} \tag{6}$$

[…] perceptron (MLP) block constitute the main structure of each layer. The MLP block includes two fully connected layers with GELU non-linearity activation functions. Each block of the encoder uses a residual connection and a layer normalization (LN). The calculation of the encoder is as follows:

$$z'_\ell = \mathrm{MSA}(\mathrm{LN}(z_{\ell-1})) + z_{\ell-1}, \quad \ell = 1 \ldots L \tag{7}$$

$$z_\ell = \mathrm{MLP}(\mathrm{LN}(z'_\ell)) + z'_\ell, \quad \ell = 1 \ldots L \tag{8}$$

Different from the literature (Dosovitskiy et al., 2020), all the outputs $z_L$ of the Transformer encoder were used as the expression of the image for subsequent classification. The $y$ represents the prediction category label:

$$y = \mathrm{LN}(z_L) \tag{9}$$
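As an illustration of Eqs. 6-9 and of the multiheaded self-attention used inside the encoder (Eqs. 10-13), the following is a minimal NumPy sketch of one forward pass. It is not the authors' Keras implementation: the sizes (144 × 144 × 3 image, 12 × 12 patches, D = 32, h = 4, MLP widths 64 and 32) are taken from the text, while the random weights, the single encoder layer (L = 1), the omission of dropout, the tanh approximation of GELU, and all function and variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Sizes quoted in the text; D_K = D / h is the per-head width.
H, W, C, P, D, HEADS = 144, 144, 3, 12, 32, 4
N = (H * W) // (P * P)      # number of patches, N = HW / P^2
D_K = D // HEADS


def split_into_patches(image, p):
    """Cut an (H, W, C) image into N flattened patches of length P^2 * C."""
    h, w, c = image.shape
    grid = image.reshape(h // p, p, w // p, p, c).transpose(0, 2, 1, 3, 4)
    return grid.reshape(-1, p * p * c)


def layer_norm(z, eps=1e-6):
    """Layer normalization over the feature axis (no learned scale or offset)."""
    mu = z.mean(axis=-1, keepdims=True)
    var = z.var(axis=-1, keepdims=True)
    return (z - mu) / np.sqrt(var + eps)


def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)


def gelu(x):
    """Tanh approximation of GELU (an assumption of this sketch)."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x ** 3)))


def msa(z, u_qkv, w_out):
    """Multiheaded self-attention: h scaled dot-product heads, concatenated."""
    heads = []
    for u in u_qkv:                                   # one (D, 3 * D_K) matrix per head
        q, k, v = np.split(z @ u, 3, axis=-1)
        a = softmax(q @ k.T / np.sqrt(q.shape[-1]))   # attention weights, rows sum to 1
        heads.append(a @ v)
    return np.concatenate(heads, axis=-1) @ w_out


def encoder_layer(z, params):
    """Residual MSA block (Eq. 7) followed by residual MLP block (Eq. 8)."""
    z_prime = msa(layer_norm(z), params["u_qkv"], params["w_out"]) + z
    hidden = gelu(layer_norm(z_prime) @ params["w1"])
    return gelu(hidden @ params["w2"]) + z_prime


def random_params(scale=0.02):
    """Random stand-ins for the learned parameters (hypothetical values)."""
    return {
        "e": rng.normal(0, scale, (P * P * C, D)),
        "x_class": rng.normal(0, scale, (1, D)),
        "e_pos": rng.normal(0, scale, (N + 1, D)),
        "u_qkv": [rng.normal(0, scale, (D, 3 * D_K)) for _ in range(HEADS)],
        "w_out": rng.normal(0, scale, (HEADS * D_K, D)),
        "w1": rng.normal(0, scale, (D, 64)),
        "w2": rng.normal(0, scale, (64, D)),
    }


def vit_features(image, params):
    """Eq. 6 (embedding), one encoder layer, then Eq. 9 (final LN)."""
    patches = split_into_patches(image, P)
    z0 = np.vstack([params["x_class"], patches @ params["e"]]) + params["e_pos"]
    return layer_norm(encoder_layer(z0, params))


image = rng.random((H, W, C))        # placeholder for a stereoscopic spectrum image
y = vit_features(image, random_params())
print(y.shape)  # (145, 32)
```

With these sizes the image yields N = 144 patches of length 432, so the sequence entering the encoder has 145 rows once the class embedding is prepended. A classifier head would then flatten or pool `y` before the softmax output layer.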
Fig. 1. The overview of Vision Transformer for stereoscopic fluorescence spectrum image (a); The Transformer encoder (b); Multiheaded Self-Attention (c); Scaled Dot-Product Attention (d). MLP: multilayer perceptron; L: linear; Q: query; K: key; V: value.

The MSA block that discovers the relative importance between patch embeddings in the sequence is the core of the Transformer. It is an extension of self-attention (SA) because it runs $h$ SA operations in parallel and connects their outputs. Its structure is shown in Fig. 1 (c), and its formula for the input sequence $z \in \mathbb{R}^{N \times D}$ is as follows:

$$\mathrm{MSA}(z) = \mathrm{Concat}(\mathrm{SA}_1(z);\ \mathrm{SA}_2(z);\ \ldots;\ \mathrm{SA}_h(z))\, W, \quad W \in \mathbb{R}^{h \cdot D_k \times D} \tag{10}$$

where $D_k$ is usually equal to $D/h$.

The particular attention in SA is called "Scaled Dot-Product Attention" as shown in Fig. 1 (d). For the input sequence $z$, the weighted sum of all values $V$ is calculated. The attention weight $A$ is obtained by the calculation of query $Q$, key $K$ and a softmax function. The three values $Q$, $K$ and $V$ are generated by multiplying the input sequence $z$ by the learned $U_{QKV}$. These operations are calculated as follows:

$$\mathrm{SA}(z) = AV \tag{11}$$

$$A = \mathrm{softmax}\left(QK^{T} / \sqrt{D_k}\right), \quad A \in \mathbb{R}^{N \times N} \tag{12}$$

$$[Q, K, V] = z\, U_{QKV}, \quad U_{QKV} \in \mathbb{R}^{D \times 3D_k} \tag{13}$$

2.5. Evaluation of performance

The loss value, category prediction probability, Accuracy, Precision, Recall, and F1-score were used to evaluate the performance of all the deep models. The value of the Categorical_crossentropy loss function is a direct indicator to evaluate the inconsistency between output and input, and it is defined as:

$$L(y, y_{\mathrm{real}}) = -\frac{1}{M} \sum_{i=1}^{M} y_{\mathrm{real},i} \times \log(\mathrm{softmax}(y_i)) \tag{14}$$

$$\mathrm{softmax}(y_i) = \frac{e^{y_i}}{\sum_{j=1}^{M} e^{y_j}} \tag{15}$$

where $y$ and $y_{\mathrm{real}}$ are the output and input values of the deep neural network, respectively. $M$ is the number of categories. The output of softmax can be regarded as the category prediction probability, and the prediction label is the category with the highest probability.

Before defining other indicators, the true positive (TP), the false negative (FN), the false positive (FP) and the true negative (TN) are exhibited taking two categories as an example in Table S2 (Supplementary Material). The overall performance of the identification model is evaluated by Accuracy:

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{16}$$

The Precision indicates how many of the predicted positive samples are really positive and it is calculated by the following formula:

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{17}$$

The Recall represents how many positive samples in the dataset are correctly predicted:

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{18}$$

The F1-score combines Precision and Recall, and it is calculated as follows:

$$\mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{19}$$

The macro average is the arithmetic average of Precision, Recall, and F1-score for all categories.

Partial least squares discriminant analysis (PLS-DA) was used to contrast the performance of the proposed method with that of common chemometrics. PLS-DA is based on the partial least squares regression (PLSR), and it combines the properties of PLSR with the discriminant ability of classification techniques. The class vector in PLS-DA is transformed into a binary Y matrix constituted by n rows (number of samples) and m columns (the class information), which is a difference from PLSR.

3. Results and discussion

3.1. Analysis of fluorescence spectra

The EEMF and TSyF spectra of all oil samples prepared in the
laboratory were scanned. The nutrient composition is not the same for different vegetable oils (Xinyan Wu, Zhao, et al., 2021). Therefore, before analyzing the fluorescence spectra of counterfeit and adulterated sesame oils, the fluorescence characteristics of pure vegetable oils are shown in Fig. 2 (a), Fig. 2 (b) and Fig. S1 (Supplementary Material). The standard fluorescence peak of vitamin E rich in SO appears in the emission band centered at 325 nm (Zandomeneghi, Carbonaro, & Caffarata, 2005). However, the characteristic peak of vitamin E appears in the emission band centered at 540 nm as shown in Fig. 2 (a), which is caused by the inner filter effect. The positions of the maximum fluorescence intensity in Fig. 2 (a) and Fig. 2 (b) are consistent, which indicates that both the EEMF and TSyF spectroscopy can well characterize oil samples. The emission spectrum between 400 and 500 nm is derived from the oxidation products of fatty acids (Kongbonga et al., 2011; Milanez et al., 2017), which can be reflected by CO (Fig. S1 (a)), SBO (Fig. S1 (c)) and SSO (Fig. S1 (d)). The spectrum of RSO shown in Fig. S1 (b) is obviously different from that of other vegetable oils because it is rich in chlorophyll that corresponds to the 650–730 nm emission band (Kyriakidis & Skarkalis, 2000).

The different counterfeit sesame oil samples are shown in Fig. 2. Comparing Fig. 2 (a), (c), (d) and (e), it is found that the EEMF spectra of pure SO and counterfeit sesame oil are quite different. As can be seen from the characteristic peak at 470 nm excitation and 540 nm emission, the vitamin E of counterfeit sesame oil is lower than that of pure SO. This phenomenon supports the judgment of whether the unknown SO sample is counterfeit. The c, d and e subgraphs illustrate that the counterfeit sesame oil samples are more or less different from each other because the raw materials are reflected in these blend samples. The difference between them provides the possibility for counterfeit sesame oil traceability, but a slight difference such as that between Fig. 2 (c) and Fig. 2 (e) increases the difficulty of traceability.

It is difficult to identify adulterated sesame oil because SO is the base oil of adulterated samples. Fig. 2 (f) displays the EEMF spectrum of an adulterated sesame oil sample with the 50% adulteration level. The spectral shape of this EEMF is wider than that of pure SO because it is enhanced by the fluorescence of SSO. As the level of adulteration increases, the EEMF spectrum of adulterated sesame oil will show more fluorescence characteristics of adulterated sesame oil. On the contrary, the EEMF spectrum of adulterated sesame oil with low adulteration level is more similar to that of pure SO, which challenges the identification.

Stereoscopic fluorescence spectra can be viewed from multiple directions, which is different from low-dimensional spectral visualization forms. As can be seen in Fig. S2 (Supplementary Material), contour maps appear when stereoscopic fluorescence spectra are viewed from overhead (full perspective for stereogram). Compared with the spectra visualized in Fig. 2, the contour maps in Fig. S2 assist in emphasizing spectral position information, but the height of the characteristic peak is not intuitive.

Overall, the differences of various oil samples are well demonstrated in EEMF and TSyF spectra. However, if the identification of fluorescence spectroscopy is to be further improved, intelligent analysis methods need to be introduced.

Fig. 2. The stereoscopic Excitation-emission matrix fluorescence (a) and Total synchronous fluorescence (b) images of SO1 sample; The stereoscopic Excitation-emission matrix fluorescence images of the CSOCO5 sample (c), the CSORSO5 sample (d), the CSOSBO5 sample (e) and the ASO2 sample (f).

3.2. Effect of data augmentation

The samples in the identification models based on EEMF and TSyF spectroscopy were all expanded by data augmentation. EEMF spectroscopy was used as an example to illustrate the effect. For the experiment of identifying counterfeit sesame oil, basic samples with odd tail numbers were selected to train the model, and samples with even tail numbers were utilized to test the model. For the experiment of identifying adulterated sesame oil, the samples used to train models were composed of ASO1 ~ ASO3 and SO1 ~ SO3 samples. The remaining samples were employed for testing.

Three data augmentation generators were applied in turn for training samples. Based on the spectra shown in Fig. S3 (a) (Supplementary Material) and Fig. S3 (b), the virtual adulterated sesame oil spectrum with a 50% adulteration level obtained by linear interpolation (λ equal to 0.5) is shown in Fig. S3 (c). The two fluorescence peaks in this virtual spectrum (Fig. S3 (c)) correspond to vitamin E in Fig. S3 (a) and fatty acids in Fig. S3 (b), which shows that the fluorophores in the basic samples are well displayed. The position parameter a and the scale parameter b of shift were 0 and 0.2, respectively. The virtual spectrum shown in Fig. S3 (d) was obtained after the basic spectrum shown in Fig. S3 (b) was
shifted by two shift steps along the excitation and emission wavelengths. The redshift phenomenon of the virtual spectrum relative to the basic spectrum proves that the shift data augmentation generator simulates the correction error well and retains the effective fluorescence information. The expectation μ and standard deviation σ of noise injection are 0 and 0.02, respectively. Compared with Fig. S3 (b), the EEMF spectrum shown in Fig. S3 (e) has more noise disturbances. It is clear that the noise injection data augmentation generator simulates the noise of the fluorometer well and does not destroy the effective spectral characteristics.

For model training for counterfeit sesame oil identification, there were a total of 860 data after data augmentation, including 20 original data, 40 data of linear interpolation, 400 data of linear interpolation & shift, and 400 data of linear interpolation & shift & noise injection. For model training for adulterated sesame oil identification, there were a total of 132 data after data augmentation, including six original data, six data of linear interpolation, 60 data of linear interpolation & shift, and 60 data of linear interpolation & shift & noise injection. Of the data after data augmentation, 70% constituted the training set, and the remaining 30% constituted the validation set.

3.3. Designed Vision Transformer network

Considering the fluorescence data of oil samples and the ViT principle, a complete ViT network architecture as shown in Table 3 was designed. The first layer of the ViT network is an input layer with the size of 144 × 144 × 3. The second layer divides the input image into patches of size 12 × 12 × 3. The third layer is responsible for encoding the patch into a vector of length 32 and appending the position embedding. The Transformer encoder includes the fourth to twelfth layers which are the core of the ViT network architecture. The number of attention heads of MSA is 4. The dimensions of the two fully connected layers in the MLP block are 64 and 32 respectively. The thirteenth layer normalizes the output of the Transformer encoder. The fourteenth network layer is responsible for flattening. The fifteenth to the eighteenth layers are the MLP block, in which the dimensions of the two fully connected layers are 512 and 256 respectively. The nineteenth layer is the classification output layer with a softmax activation function. The number of output categories is 4 for counterfeit sesame oil identification and 2 for adulterated sesame oil identification.

The optimizer in the network training process is AdamW with a learning rate of 0.0001 and a weight decay of 0.0001. The batch size is set to 32. After the loss value of the validation set no longer changes, the early stop technique of training the model for another 5 epochs is adopted to avoid overfitting.

Table 3
Layers and their connections of the Vision Transformer network for fluorescence spectroscopy.

Order  Layer                  Connected to
1      input_1                –
2      patches                input_1
3      patch_encoder          patches
4      layer_normalization    patch_encoder
5      multi_head_attention   layer_normalization, layer_normalization
6      add                    multi_head_attention, patch_encoder
7      layer_normalization_1  add
8      dense_1                layer_normalization_1
9      dropout                dense_1
10     dense_2                dropout
11     dropout_1              dense_2
12     add_1                  dropout_1, add
13     layer_normalization_2  add_1
14     flatten                layer_normalization_2
15     dense_3                flatten
16     dropout_2              dense_3
17     dense_4                dropout_2
18     dropout_3              dense_4
19     dense_5                dropout_3

3.4. Identification models

3.4.1. Identification models for counterfeit sesame oil

The models that use EEMF and TSyF spectroscopy coupled with the ViT network to identify counterfeit sesame oil samples were called Model 1 and Model 2, respectively. The subgraphs a and b in Fig. 3 demonstrate that the accuracy and loss of the two models on the training set and validation set are close to 100% and 0 respectively after iterative training. It is clear that there is no overfitting in the modeling, which can be attributed to the data augmentation. It has been found from Table S3 (Supplementary Material) that the accuracies of test set samples are 100%. All of these bold prediction probabilities have exceeded 0.97, which is very beneficial for models to output category labels. Table 4 displays the values of Precision, Recall and F1-score on the validation set and test set. These indicators demonstrate that these two models both misjudged individual CSOCO and CSOSBO samples on the validation set because the spectra of the two types of samples are similar as shown in Fig. 2. The macro averages of these indicators denote that the pure and counterfeit samples are well identified.

The interpretability of deep learning methods has always been a concern. The output of an intermediate layer in the deep network was visualized by truncated singular value decomposition (Truncated SVD) dimensionality reduction to assist in increasing the interpretability of the proposed model. As shown in the two subgraphs of Fig. 4, although SO and CSORSO samples are well clustered, CSOCO and CSOSBO samples cannot be well distinguished, which requires deeper network layers to learn more. As can be seen from the indexes in Table 4, the deeper network layers do make an important contribution to the final distinction between CSOCO and CSOSBO samples.

The stability of the deep model should not be ignored in practice because there are differences between diverse trainings under the same hyperparameters. Taking Model 1 as an example, it was trained 10 times repeatedly to evaluate the stability of the ViT network in oil spectral modeling. As can be seen from Fig. S4 (Supplementary Material), both loss and accuracy curves tend to be stable after iterative training although their convergence processes vary between runs. Therefore, the stability of the deep models in this study can meet the practical requirements.

The comparative analysis with common approaches can highlight the advantages and disadvantages of the proposed method. In the comparative experiment, the optimal emission spectrum in the EEMF was manually extracted and fed to PLS-DA. The accuracy on the test set is 93.75%, and its confusion matrix is shown in Fig. S5 (Supplementary Material). The accuracy is 87.50% for TSyF spectroscopy, which has been reported in a previous study (Xijun Wu, Zhao, et al., 2021). The hyperparameters of the two studies were determined by the same method. There are misjudgments between CSOCO and CSOSBO samples in the two comparison experiments, which does not occur in the approach proposed in this work. However, contrasted with common chemometrics, the disadvantage of deep learning relying on computer resources was exposed. For the comparison with other deep learning approaches, the authors of ViT have reported that ViT attained excellent results on popular image classification benchmarks compared with SOTA (state-of-the-art) CNN models (Dosovitskiy et al., 2020).

3.4.2. Identification models for adulterated sesame oil

The models of EEMF and TSyF spectroscopy to identify adulterated sesame oil samples were named Model 3 and Model 4, respectively. Fig. 3 (c) and (d) show the training of these two models. All the accuracies are 100% and all the loss values tend to be stable after training, which indicates that the models are convergent. There are two reasons that may explain why the curves of adulterated sesame oil identification
models have more intense jitter than those of counterfeit sesame oil identification models. One reason is the increased difficulty for adulterated sesame oil identification and the other reason is the relatively small number of samples. The accuracies of the test set samples are 100%, which can be easily inferred from the prediction categories and probabilities shown in Table S4 (Supplementary Material). For the adulterated sesame oil samples, the bold prediction probabilities decrease as the proportion of SSO decreases, because the less adulterated a sample is, the more difficult it is to identify. As can be seen from the values of Precision, Recall and F1-score for the two models in Table 4, the models for adulterated sesame oil identification are feasible.

Table 4
Precision, Recall and F1-score on the validation set and test set.

In summary, the performance of the counterfeit and adulterated sesame oil identification models evidences that the designed ViT network is suitable for EEMF and TSyF spectroscopy of vegetable oils.

4. Conclusion

In this report, the ViT network models based on stereoscopic fluorescence spectrum images completed the identification of counterfeit and adulterated SOs. Both EEMF and TSyF spectra well characterized the SO samples, which not only showed the advantages of easy operation and non-destructiveness, but also demonstrated good selectivity of high-dimensional spectra. Three data augmentation generators designed in this study made deep learning possible in oil identification. The abundant fluorescence information in stereoscopic fluorescence spectrum images provided more visual input for the ViT network. The superior performance of the ViT network in oil identification manifested that Transformer combined with fluorescence spectroscopy is available and scalable. These confirm that this novel method can provide technical support for food supervision departments. The potential of this new vegetable oil quality identification method is considered to be great because it is green, fast and intelligent. It is worth noting, however, that the characterization of vegetable oils is only realized by non-enhanced fluorescence spectroscopy in this paper. As people have higher requirements for food safety, enhanced spectroscopy based on nanotechnology can be tried in future research to realize the detection of trace substances in vegetable oil. Moreover, multivariate calibration regression based on the current research will be modeled, which is a very significant work for spectral analysis.

CRediT authorship contribution statement

Zhilei Zhao: Conceptualization, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing, Validation, Formal analysis. Xijun Wu: Funding acquisition, Project administration, Data curation, Conceptualization, Methodology, Writing – review & editing. Hailong Liu: Supervision.

Declaration of competing interest

The authors declare no conflicts of interest.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (NSFC 11674275), Natural Science Foundation of Hebei Province (F2020203110; F2021203052), Science and technology research project of Hebei higher education institutions (QN2018071).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.lwt.2022.113173.

References

Bazi, Y., Bashmal, L., Rahhal, M. M. A., Dayil, R. A., & Ajlan, N. A. (2021). Vision transformers for remote sensing image classification. Remote Sensing, 13(3), 516.
Dong, J.-E., Zhang, J., Zuo, Z.-T., & Wang, Y.-Z. (2021). Deep learning for species identification of bolete mushrooms with two-dimensional correlation spectral (2DCOS) images. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 249, 119211. https://doi.org/10.1016/j.saa.2020.119211
Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., et al. (2020). An image is worth 16x16 words: Transformers for image recognition at scale. arXiv preprint arXiv:2010.11929.
Elzey, B., Pollard, D., & Fakayode, S. O. (2016). Determination of adulterated neem and flaxseed oil compositions by FTIR spectroscopy and multivariate regression analysis. Food Control, 68, 303–309. https://doi.org/10.1016/j.foodcont.2016.04.008
Filoda, P. F., Fetter, L. F., Fornasier, F., de Souza Schneider, R.d. C., Helfer, G. A., Tischer, B., et al. (2019). Fast methodology for identification of olive oil adulterated with a mix of different vegetable oils. Food Analytical Methods, 12(1), 293–304. https://doi.org/10.1007/s12161-018-1360-5
Gan, J., Zhou, L., Cui, J., Man, B., Jia, X., Shi, S., et al. (2019). Classification of blood species using fluorescence spectroscopy combined with deep learning method. Journal of Applied Mathematics and Physics, 7(10), 2324–2332. https://doi.org/10.4236/jamp.2019.710158
Georgouli, K., Osorio, M. T., Del Rincon, J. M., & Koidis, A. (2018). Data augmentation in food science: Synthesising spectroscopic data of vegetable oils for performance enhancement. Journal of Chemometrics, 32(6), Article e3004. https://doi.org/10.1002/cem.3004
Han, K., Wang, Y., Chen, H., Chen, X., Guo, J., Liu, Z., et al. (2020). A survey on visual transformer. arXiv preprint arXiv:2012.12556.
Hu, F., Zhou, M., Yan, P., Li, D., Lai, W., Bian, K., et al. (2019). Identification of mine water inrush using laser-induced fluorescence spectroscopy combined with one-dimensional convolutional neural network. RSC Advances, 9(14), 7673–7679. https://doi.org/10.1039/C9RA00805E
Itakura, K., Saito, Y., Suzuki, T., Kondo, N., & Hosoi, F. (2019). Estimation of citrus maturity with fluorescence spectroscopy using deep learning. Horticulturae, 5(1), 2. https://doi.org/10.3390/horticulturae5010002
Ju, L., Lyu, A., Hao, H., Shen, W., & Cui, H. (2019). Deep learning-assisted three-dimensional fluorescence difference spectroscopy for rapid identification and semi-quantification of illicit drugs in bio-fluids. Analytical Chemistry, 91(15). https://doi.org/10.1021/acs.analchem.9b01315
Khan, S., Naseer, M., Hayat, M., Zamir, S. W., Khan, F. S., & Shah, M. (2021). Transformers in vision: A survey. arXiv preprint arXiv:2101.01169.
Kongbonga, Y. G. M., Ghalila, H., Onana, M. B., Majdi, Y., Lakhdar, Z. B., Mezlini, H., et al. (2011). Characterization of vegetable oils by fluorescence spectroscopy. Food and Nutrition Sciences, 2(7), 692–699. https://doi.org/10.4236/fns.2011.27095
Kyriakidis, N. B., & Skarkalis, P. (2000). Fluorescence spectra measurement of olive oil and other vegetable oils. Journal of AOAC International, 83(6), 1435–1439. https://doi.org/10.1093/jaoac/83.6.1435
Lin, H., Li, Z., Lu, H., Sun, S., Chen, F., Wei, K., et al. (2019). Robust classification of tea based on multi-channel LED-induced fluorescence and a convolutional neural network. Sensors, 19(21), 4687. https://doi.org/10.3390/s19214687
Liu, Y., Yao, L., Xia, Z., Gao, Y., & Gong, Z. (2021). Geographical discrimination and adulteration analysis for edible oils using two-dimensional correlation spectroscopy and convolutional neural networks (CNNs). Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 246, 118973. https://doi.org/10.1016/j.saa.2020.118973
Milanez, K. D. T. M., Nóbrega, T. C. A., Nascimento, D. S., Insausti, M., Band, B. S. F., & Pontes, M. J. C. (2017). Multivariate modeling for detecting adulteration of extra virgin olive oil with soybean oil using fluorescence and UV–vis spectroscopies: A preliminary approach. LWT-Food Science and Technology, 85, 9–15. https://doi.org/10.1016/j.lwt.2017.06.060
Moore, J. C., Spink, J., & Lipp, M. (2012). Development and application of a database of food ingredient fraud and economically motivated adulteration from 1980 to 2010. Journal of Food Science, 77(4), R118–R126. https://doi.org/10.1111/j.1750-3841.2012.02657.x
Ni, Y., Zhang, G., & Kokot, S. (2005). Simultaneous spectrophotometric determination of maltol, ethyl maltol, vanillin and ethyl vanillin in foods by multivariate calibration and artificial neural networks. Food Chemistry, 89(3), 465–473. https://doi.org/10.1016/j.foodchem.2004.05.037
Pan, Y., Lai, K., Fan, Y., Li, C., Pei, L., Rasco, B. A., et al. (2014). Determination of tert-butylhydroquinone in vegetable oils using surface-enhanced Raman spectroscopy. Journal of Food Science, 79(6), T1225–T1230. https://doi.org/10.1111/1750-3841.12482
Qiu, J., Hou, H.-Y., Huyen, N. T., Yang, I.-S., & Chen, X.-B. (2019). Raman spectroscopy and 2DCOS analysis of unsaturated fatty acid in edible vegetable oils. Applied Sciences, 9(14), 2807. https://doi.org/10.3390/app9142807
Sikorska, E., Górecki, T., Khmelinskii, I. V., Sikorski, M., & Kozioł, J. (2005). Classification of edible oils using synchronous scanning fluorescence spectroscopy. Food Chemistry, 89(2), 217–225. https://doi.org/10.1016/j.foodchem.2004.02.028
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., et al. (2017). Attention is all you need. arXiv preprint arXiv:1706.03762.
Wang, T., Wu, H.-L., Long, W.-J., Hu, Y., Cheng, L., Chen, A.-Q., et al. (2019). Rapid identification and quantification of cheaper vegetable oil adulteration in camellia oil by using excitation-emission matrix fluorescence spectroscopy combined with
chemometrics. Food Chemistry, 293, 348–357. https://doi.org/10.1016/j.
https://doi.org/10.3390/rs13030516
foodchem.2019.04.109
Chu, X., Wang, W., Li, C., Zhao, X., & Jiang, H. (2018). Identifying camellia oil
Wu, X., Bian, X., Lin, E., Wang, H., Guo, Y., & Tan, X. (2021). Weighted multiscale
adulteration with selected vegetable oils by characteristic near-infrared spectral
support vector regression for fast quantification of vegetable oils in edible blend oil
regions. Journal of Innovative Optical Health Sciences, 11(2), 1850006. https://doi.
by ultraviolet-visible spectroscopy. Food Chemistry, 342, 128245. https://doi.org/
org/10.1142/S1793545818500062
10.1016/j.foodchem.2020.128245
da Costa, G. B., Fernandes, D. D. S., Gomes, A. A., de Almeida, V. E., & Veras, G. (2016).
Wu, X., Zhao, Z., Tian, R., Niu, Y., Gao, S., & Liu, H. (2021). Total synchronous
Using near infrared spectroscopy to classify soybean oil according to expiration date.
fluorescence spectroscopy coupled with deep learning to rapidly identify the
Food Chemistry, 196, 539–543. https://doi.org/10.1016/j.foodchem.2015.09.076
authenticity of sesame oil. Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy, 244, 118841. https://doi.org/10.1016/j.saa.2020.118841
Xu, J., Liu, X.-F., & Wang, Y.-T. (2016). A detection method of vegetable oils in edible blended oil based on three-dimensional fluorescence spectroscopy technique. Food Chemistry, 212, 72–77. https://doi.org/10.1016/j.foodchem.2016.05.158
Yang, J., Xu, J., Zhang, X., Wu, C., Lin, T., & Ying, Y. (2019). Deep learning for vibrational spectral analysis: Recent progress and a practical guide. Analytica Chimica Acta, 1081, 6–17. https://doi.org/10.1016/j.aca.2019.06.012
Yuan, Y.-Y., Wang, S.-T., Wang, J.-Z., Cheng, Q., Wu, X.-J., & Kong, D.-M. (2020). Rapid detection of the authenticity and adulteration of sesame oil using excitation-emission matrix fluorescence and chemometric methods. Food Control, 112, 107145. https://doi.org/10.1016/j.foodcont.2020.107145
Zandomeneghi, M., Carbonaro, L., & Caffarata, C. (2005). Fluorescence of vegetable oils: Olive oils. Journal of Agricultural and Food Chemistry, 53(3), 759–766. https://doi.org/10.1021/jf048742p
Zhang, H., Cisse, M., Dauphin, Y. N., & Lopez-Paz, D. (2017). mixup: Beyond empirical risk minimization. arXiv preprint arXiv:1710.09412.