
Journal of Food Measurement and Characterization

https://doi.org/10.1007/s11694-023-01965-x

ORIGINAL RESEARCH

A feasibility quantification study of capsaicin content in chili powder for rapid evaluation using near‑infrared reflectance spectroscopy
Bowen Jing1 · Wensheng Song2 · Xin Gao1 · Ke He1 · Qinming Sun1 · Xiuying Tang1

Received: 27 January 2023 / Accepted: 7 May 2023


© The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023

Abstract
Capsaicin, a unique alkaloid mainly found in pepper (Capsicum spp.), is in high demand in the food, chemical, and medical
fields. In this study, near-infrared (NIR) reflectance spectroscopy at 940–1660 nm was used to determine the capsaicin
content of dried chili pepper powder. Quantitative calibration models relating the spectral data to the measured capsaicin
content were built by partial least squares regression (PLSR) and extreme learning machine (ELM) regression. Different
preprocessing methods were applied to the spectral data, and the successive projections algorithm (SPA) and uninformative
variables elimination (UVE) were used to select characteristic wavelengths. The PLSR model built on the full wavelength
range pretreated by the first derivative yielded the best results, with a root mean squared error for the prediction set
(RMSEP) of 1.0338 g/kg, a correlation coefficient (Rp) of 0.9003, and a residual prediction deviation (RPD) of 2.25. These
results demonstrate that NIR spectroscopy can be used as an objective tool for the rapid and accurate quantitative
determination of capsaicin content.

Keywords Chili powder · Capsaicin content · Near-infrared spectroscopy · Characteristic selection · Partial least squares
regression · Extreme learning machine regression

* Xiuying Tang
txying@cau.edu.cn

1 China Agricultural University, College of Engineering, 17 Qinghua East Road, Haidian 100083, Beijing, China
2 Xinjiang Tianjiao Hong’an Agricultural Science and Technology Co. LTD., Shihezi, Xinjiang, China

Introduction

Capsaicin has been widely used in condiments because of its unique spicy taste [1, 2]. In addition, capsaicin can act as a pain reliever, improve exercise performance, and lower blood lipids by stimulating its receptor, transient receptor potential vanilloid subfamily member 1 (TRPV1) [3, 4]. Therefore, it is also in significant demand in pharmaceutical processing [5, 6]. According to Global Info Research (GIR), global capsaicin revenue was approximately $8 million in 2021 and is expected to reach $11 million by 2028. Although capsaicin can be synthesized chemically, the synthesis is complex and costly [7] and cannot meet actual production demand. At present, chili peppers are still the most critical source of capsaicin, so it is necessary to increase the extraction rate of capsaicin to meet industrial demand. Therefore, as a pre-extraction step, achieving rapid and accurate detection of capsaicin content is of great significance.

In recent years, the prediction of capsaicin content in chili peppers has become a focus of scholarly research. In 1912, Wilbur Scoville proposed the organoleptic test method to detect capsaicin content in chili peppers. The test was easy to operate, but its crude and subjective judgment led to inaccurate and poorly reproducible results. Based on the UV absorption of the sample solution at different wavelengths and the excellent solubility of capsaicin in organic solvents, the spectrophotometric approach is used to quantify the total capsaicin concentration in the sample [8, 9]. Although the influence of pigments in chili peppers can be eliminated to some extent by calculating the absorbance of solutions at different wavelengths, the method requires that the difference between the capsaicin contents calculated at two wavelengths should not exceed 10%. Therefore, measurements often have to be repeated in practice, and when there are many impurities the results can be unusable because of the large gap between the calculated values. The approach is also prone to mistakes and has limited repeatability due to unstable absorption. On the other hand, high-performance liquid chromatography (HPLC) has been applied to the quantitative analysis of capsaicin since the late 1970s because of its high separation, sensitivity, and accuracy [10–13]. Although liquid chromatography provides more precise results, the procedure is complex and time-consuming. Furthermore, the organic extraction solvents used before detecting chili pepper samples are toxic and should only be handled by trained personnel. Therefore, exploring a rapid and accurate method for the quantitative determination of capsaicin that can be employed in practical applications is critical.

Due to its technical advantages, such as no pretreatment, simultaneous multi-component determination, fast analysis, and no contamination, NIR spectroscopy has been widely used in food [14–16] and crop [17, 18] quality testing. However, there have been fewer studies on pepper quality analysis because of practical application limitations. Penchaiya [19] used near-infrared reflectance spectroscopy to determine the soluble solids content and hardness of sweet pepper fruit and combined it with partial least squares (PLS) calibration to develop an optimal model. Lim [20] developed a spectral detection device for Korean red pepper powder that achieved relatively accurate prediction. Rahman [21] achieved hyperspectral-based detection of capsaicin content in green peppers. Although the above methods achieved fairly good detection accuracy, they have drawbacks such as complicated operating procedures for the spectroscopic equipment and the high cost of hyperspectral imaging, which make them challenging to industrialize. These methodologies nevertheless serve as a foundation for future research. Therefore, building on the above research, this study explores the feasibility of NIR spectroscopy as a rapid detection technique for the accurate measurement of capsaicin content in chili pepper powder.

In this study, the predictive ability of NIR spectroscopy for capsaicin content in peppers was evaluated. The research objectives were: (1) to build an NIR spectroscopy detection system and explore its spectral feasibility; (2) to compare the detection performance of the ELM model and the PLSR model for capsaicin content; (3) to analyze the effect of different preprocessing methods on the prediction performance of the model; (4) to select the characteristic wavelengths linked to capsaicin content and measure the accuracy of prediction models using only those selected wavelengths.

Materials and methods

Preparation of chili powder samples and determination of capsaicin content

The 88 pepper samples were obtained from XinJiang LongPing Hong'An Bio-Tech Co (Shihezi City, Xinjiang, China). The samples were washed with water and cut into small pieces. They were dried in an electric thermostatic oven at 55 ℃ to reach a moisture content of 13%. After the samples were collected, the chili peppers were crushed to powder with a plant specimen crusher and passed through a 60-mesh graded sieve. Since capsaicin in peppers is mainly concentrated in the placenta and pulp [22, 23], the peppers were ground whole without deseeding.

The synthetic capsaicin standard was dissolved in methanol and diluted to 1 mg/ml. Aliquots of 0, 0.5, 1.0, 1.5, 2.0, and 2.5 ml of this standard stock solution were each made up to 25 ml with methanol and injected into the HPLC column to establish the linear regression equation for capsaicin.

A 20.0 ± 1.0 g portion of the crushed sample was weighed, dissolved in 150 ml of 95% ethanol (analytically pure), and extracted in a Soxhlet extractor for chromatographic analysis. The analysis was performed on an Inertsil ODS-3V column (5 μm, 4.6 × 250 mm). The mobile phase consisted of acetonitrile and ultrapure water containing 0.5% acetic acid (50/50, v/v) and was pumped at a rate of 1.0 ml/min; the injection volume was 10 µl. The fluorescence detector was set at an excitation wavelength of 280 nm and an emission wavelength of 325 nm [24].

Spectral data acquisition

A self-built NIR reflectance spectroscopy system collected the reflectance spectra of each chili pepper sample, as shown in Fig. 1. The spectral acquisition was performed in a dark box connected to a computer, and a spectrometer (USB2000+, Ocean Optics Inc., USA) was used to collect the spectral data of each sample. The spectrometer measured wavelengths from 940 to 1660 nm, with 128 data points. Two 35 W halogen lamps were used as light sources and placed above the samples. Before collecting the spectral data, the lamps were warmed up for 30 min to stabilize the light. The sample was placed 15 mm from the fiber, and the spectrum of each sample was acquired at three separate places; the average of the three replicate measurements was taken as the final spectrum of the sample.

Fig. 1  Structure of visible and near-infrared reflectance spectrum acquisition device
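The HPLC calibration series described under sample preparation can be worked through numerically. The sketch below is only an illustration: the dilution arithmetic follows the volumes given in the text (aliquots of a 1 mg/ml stock made up to 25 ml), while the peak areas, the helper names `fit_calibration` and `area_to_conc`, and the use of an ordinary least-squares line are assumptions, since the paper reports neither the detector responses nor the fitted equation.

```python
import numpy as np

STOCK_MG_PER_ML = 1.0      # stock standard concentration stated in the text
FINAL_VOLUME_ML = 25.0     # each aliquot is made up to 25 ml with methanol
aliquots_ml = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5])

# Dilution: c = V_aliquot * c_stock / V_final, converted from mg/ml to ug/ml
standard_conc_ug_ml = aliquots_ml * STOCK_MG_PER_ML / FINAL_VOLUME_ML * 1000.0


def fit_calibration(conc, peak_area):
    """Least-squares line: peak_area = slope * conc + intercept."""
    slope, intercept = np.polyfit(conc, peak_area, deg=1)
    return slope, intercept


def area_to_conc(peak_area, slope, intercept):
    """Invert the calibration line to read a concentration from a peak area."""
    return (peak_area - intercept) / slope


# Hypothetical detector response, for illustration only (not measured data)
demo_area = 12.5 * standard_conc_ug_ml + 3.0
slope, intercept = fit_calibration(standard_conc_ug_ml, demo_area)
```

Under these volumes the six standards correspond to 0, 20, 40, 60, 80, and 100 µg/ml. Converting a concentration read from the line into g/kg of powder would additionally need the extract volume and sample mass, which are left out here.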


Spectral data preprocessing

In addition to the chemical information of capsaicin, the collected spectra contain irrelevant information and noise, such as electrical noise, sample background, and stray light. To obtain an accurate and reliable spectral calibration model, it is necessary to eliminate this extraneous information and noise from the spectral data. In this study, Savitzky–Golay smoothing (S–G smoothing), first and second derivatives, standard normal variate (SNV), and multiplicative scatter correction (MSC) were used as the preprocessing methods for the spectra.

Multivariable data analysis

Partial least squares

To ensure the stability and accuracy of the test results, it is also necessary to develop modeling algorithms with high stability to build accurate models for the quantitative analysis of capsaicin. Partial least squares regression (PLSR) is the most widely used method in NIR spectral modeling [25]. Two data matrices, X and Y, are decomposed to obtain their score and loading matrices, and linear regression is then performed. In contrast to traditional regression models, PLSR can analyze the relationship between a data matrix X with strong collinearity and a matrix Y of one or more response variables. It is applicable to problems with a small number of samples but a relatively large number of variables.

Extreme learning machine

PLSR is based on the assumption that the target spectral system is linearly additive. Nevertheless, the relationship between spectral variables and concentrations is somewhat nonlinear, and component interactions and instrument noise exacerbate the nonlinearity problem. Therefore, nonlinear algorithms can potentially improve the reliability and reproducibility of NIR spectral predictions [26]. Thus, the nonlinear extreme learning machine (ELM) algorithm is also compared in this study.

The extreme learning machine is a single-hidden-layer feedforward neural network learning algorithm proposed by Huang [27]. The connection weights between the input layer and the hidden layer and the threshold values of the hidden-layer neurons are generated randomly, and the remaining network weights are then determined analytically. During the training phase, only the number of neurons in the hidden layer needs to be set; no other parameter selection is required. It has greater generalization capability, runs faster than gradient-based learning algorithms, and avoids the tuning of many parameters and the problem of local minima. It has demonstrated good performance in quantitative analysis and classification problems in NIR detection [28–30]. All statistical analyses were performed in MATLAB R2020a (Mathworks Inc., Natick, MA, USA).

Selection of characteristic wavelengths

NIR spectroscopy is a technique for predicting the content and composition of substances from information on the structure and content of X–H groups. However, the spectra contain not only information about the quality of the object to be measured but also multicollinearity and redundant information between successive wavelengths, which can result in heavy computation and slow calculation speed [31, 32]. Therefore, this study uses the successive projections algorithm (SPA) and uninformative variables elimination (UVE) to extract the characteristic wavelengths related to the basic properties of capsaicin from the whole waveband to simplify the model.

SPA eliminates redundant information and multicollinearity in spectral data by finding groups of variables in the data matrix that contain the minimum amount of redundant information. UVE calculates the stability of each variable by adding random noise variables equal in number to the spectral variables, then retains variables with high stability and eliminates those not relevant to the detection information.

Model evaluation metrics

The performance of the quantitative model of capsaicin content is evaluated by the correlation coefficient (R), root mean square error (RMSE), and residual prediction deviation (RPD). These parameters are calculated by the following equations.

$$R = \frac{\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)\left(y_i - \bar{y}\right)}{\sqrt{\sum_{i=1}^{N}\left(\hat{y}_i - \bar{\hat{y}}\right)^2 \sum_{i=1}^{N}\left(y_i - \bar{y}\right)^2}} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}{N - 1}} \tag{2}$$

$$\mathrm{RPD} = \frac{\mathrm{SD}}{\mathrm{RMSEP}} \tag{3}$$

where $\hat{y}_i$ is the spectral prediction of capsaicin content, $y_i$ is the HPLC measurement of capsaicin content, $\bar{\hat{y}}$ and $\bar{y}$ are their respective means, N is the number of samples, SD is the standard deviation of the capsaicin content measurements in the validation set, and RMSEP is the root mean square error of the validation set.

As shown in Table 1, R represents the correlation between predicted and measured capsaicin content and was used to evaluate the detectability of different models. RMSE measures the prediction error of the model on the validation set and the calibration set. The closer the correlation coefficient (R) is to 1 and the closer the root mean square error (RMSE) is to 0, the more accurate the model prediction. The RPD is the ratio of the standard deviation of the validation set to RMSEP, a measure of regression model validity and overall predictability. An RPD value above 2 means that the model prediction is effective, whereas when the RPD value is below 1.5 the established model cannot be applied to the prediction of the index to be measured [33, 34].

Table 1  Evaluation indicators for quantitative models

Evaluation indicator   Function                                                                                      Optimal range
R                      Evaluation of the degree of correlation between the predicted and true values of the model    Approaching 1
RMSE                   Evaluation of deviations between the predicted and true values of the model                   Approaching 0
RPD                    Evaluation of the effectiveness of the model predictions                                      > 2

Results and discussion

Reference measurement of capsaicin content

In this study, the samples were classified by the concentration gradient method to ensure that the spectral information of the entire sample set was evenly distributed between the two subsets. The 88 samples were divided into the calibration and validation sets in a 3:1 ratio. The range and distribution of capsaicin content in these samples are shown in Table 2. As indicated in Table 2, the capsaicin content range and mean of the calibration set were 0.0451–11.9155 g/kg and 2.8152 g/kg, respectively, and the corresponding values of the prediction set were 0.1851–9.5835 g/kg and 2.7343 g/kg. The range of the reference measurements in the prediction set was covered by the range in the calibration set. The standard deviations of the calibration and prediction sets were very similar, which means that the reference data in the two sets are almost equally distributed, avoiding bias between them.

Spectrum analysis

The spectral data of 940–1660 nm were selected for modeling. The reflectance spectra of pepper powder with varying capsaicin concentrations showed comparable tendencies, as illustrated in Fig. 2. The reflectance curves showed a decreasing trend at 940–1205 nm and an increasing trend at 1205–1310 nm. After reaching the maximum value, the curves decreased until they leveled off after 1500 nm. As molecular vibration spectra, NIR spectra originate from the absorption of X–H bonds, and the spectra produced by different hydrogen-containing groups differ in the intensity and position of absorption peaks. In this study, chili powder had evident absorption peaks at 1000 nm, 1205 nm, and 1480 nm, which could be caused by the second vibration of N–H bonds, the second overtone of the C–H stretching vibration, and the first overtone of the O–H stretching vibration, respectively [21, 35, 36]. In addition, there were differences in the reflectance curves of samples with different capsaicin concentrations, especially at 1100–1350 nm. This may be due to the differences in capsaicin content of the chili pepper samples [37]. Furthermore, it can be seen that the trends of all the spectral curves in Fig. 3 were consistent. The result indicated that there were no anomalous samples, so all data were used for further processing.

Table 2  Reference measurement of capsaicin content of chili pepper samples in data sets

Data set          Number of samples   Range of capsaicin (g/kg)   Mean (g/kg)   Standard deviation (g/kg)
Total set         88                  0.0451–11.9155              2.7950        2.4315
Calibration set   66                  0.0451–11.9155              2.8152        2.4639
Validation set    22                  0.1851–9.5835               2.7343        2.3305

The statistics show the capsaicin content of the chili pepper samples in the calibration and validation sets
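The 3:1 concentration-gradient split summarized in Table 2 can be illustrated with a short sketch. The paper does not spell out the exact procedure, so the following is one common reading and should be treated as an assumption: samples are sorted by reference capsaicin content and every fourth sample is assigned to the validation set. The function name `concentration_gradient_split` and the `offset` parameter are hypothetical.

```python
import numpy as np


def concentration_gradient_split(y, every=4, offset=2):
    """Split sample indices into calibration and validation sets.

    Samples are ranked by reference value; within each run of `every`
    ranked samples, the one at position `offset` goes to validation.
    Choosing an offset that is neither first nor last keeps the lowest
    and highest samples in the calibration set.
    """
    order = np.argsort(y, kind="stable")             # indices sorted by content
    is_val = (np.arange(len(y)) % every) == offset   # every 4th ranked sample
    return order[~is_val], order[is_val]


# Illustrative reference values only (random, not the paper's HPLC data)
rng = np.random.default_rng(0)
y_demo = rng.uniform(0.05, 12.0, size=88)
cal_idx, val_idx = concentration_gradient_split(y_demo)
```

With 88 samples this yields 66 calibration and 22 validation samples, and the validation range is always contained in the calibration range, which matches the property the authors report for their own split.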


Fig. 2  Result of reflectance intensity of chili powder samples with different capsaicin contents in the spectral range of 940–1660 nm

Fig. 3  Spectral curves of the original spectrum after pretreatment. a Raw spectrum; b spectrum obtained by Savitzky–Golay smoothing pretreatment; c spectrum after multiplicative scatter correction; d spectrum after standard normal variate processing; e spectral curve obtained by first derivative; f spectral curve obtained by second derivative
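The pretreatments compared in Fig. 3 can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB code: the Savitzky–Golay window (7 points) and polynomial order (2) are placeholders because the paper does not report them, the plain first derivative is taken as a finite difference per wavelength index, and the function names are hypothetical. Spectra are assumed to be stored as rows of a floating-point array `X`.

```python
import math
import numpy as np


def snv(X):
    """Standard normal variate: scale each spectrum by its own mean and SD."""
    mu = X.mean(axis=1, keepdims=True)
    sd = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mu) / sd


def msc(X, reference=None):
    """Multiplicative scatter correction against a reference (mean) spectrum."""
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty_like(X, dtype=float)
    for i, x in enumerate(X):
        # Fit x = intercept + slope * ref, then remove offset and scaling
        slope, intercept = np.polyfit(ref, x, deg=1)
        out[i] = (x - intercept) / slope
    return out


def savgol(X, window=7, polyorder=2, deriv=0):
    """Savitzky-Golay filter (deriv=0 smooths, deriv=1 or 2 differentiates)."""
    half = window // 2
    offsets = np.arange(-half, half + 1)
    A = np.vander(offsets, polyorder + 1, increasing=True)
    # Row `deriv` of the pseudo-inverse gives the local polynomial coefficient
    coeffs = np.linalg.pinv(A)[deriv] * math.factorial(deriv)
    padded = np.pad(X, ((0, 0), (half, half)), mode="edge")
    return np.array([np.convolve(row, coeffs[::-1], mode="valid") for row in padded])


def first_derivative(X):
    """Plain first derivative along the wavelength axis (per index step)."""
    return np.gradient(X, axis=1)
```

Edge padding distorts the first and last few points of the Savitzky–Golay output, so only the interior of the filtered spectrum should be relied on in this sketch.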


Prediction of capsaicin content based on the whole spectrum with different quantitative models

ELM and PLSR algorithms were used to build prediction models from the collected sample spectra and their corresponding capsaicin contents. The prediction results based on different preprocessing methods are shown in Tables 3 and 4.

In this study, the sigmoid function was chosen as the kernel function of the network, and the sample spectrum was used as input. The number of neurons should be chosen to avoid underfitting with too few neurons or overfitting with too many. The initial number of neurons was set to 5 and gradually increased to 50 to develop and test the ELM model according to the sample size. For each number of neurons, the experiment was repeated 100 times. The 100 models obtained for each number of neurons were treated as a group, and the number of models with Rc and Rp greater than 0.8 and RMSEC greater than 1.2 RMSEP was counted according to the model evaluation indicators introduced in the section on model evaluation metrics. The number of neurons with the highest count was taken as the optimal number of neurons, and the model with the highest correlation coefficient for that number of neurons was taken as the optimal model. The optimal number of neurons and the prediction results for each preprocessing method, obtained by the same analysis, are shown in Table 3.

From the prediction results of the ELM model in Table 3, the optimal number of neurons for the original spectrum was 23, with an Rp of 0.8054, an RMSEP of 1.4565 g/kg, and an RPD of 1.60. After the various preprocessing methods, the correlation coefficients of the model improved. Among them, the optimum prediction model is the one built on first-derivative spectra, because its RPD is higher than that of the first derivative with Savitzky–Golay smoothing; it uses 35 neurons and gives an Rp of 0.8371, an RMSEP of 1.3694 g/kg, and an RPD of 1.70.
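The ELM training step described above (random hidden layer, sigmoid activation, analytically determined output weights) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the class name `ELMRegressor`, the uniform [-1, 1] range for the random weights, and the seed handling are assumptions, and the output weights are obtained with the Moore–Penrose pseudo-inverse, the usual closed-form solution in ELM.

```python
import numpy as np


class ELMRegressor:
    """Single-hidden-layer ELM with sigmoid activation (illustrative sketch)."""

    def __init__(self, n_hidden, seed=None):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid of the randomly projected inputs
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_features = X.shape[1]
        # Input weights and hidden thresholds are random and never trained
        self.W = self.rng.uniform(-1.0, 1.0, size=(n_features, self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        H = self._hidden(X)
        # Output weights: least-squares solution via the pseudo-inverse
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

Because the hidden layer is random, two fits with the same number of neurons can differ noticeably, which is presumably why the authors repeat the experiment 100 times for each neuron count between 5 and 50 before choosing a model.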

Table 3  The prediction results with five evaluation parameters of the ELM model using different preprocessing methods

Number of   Preprocessing   Calibration set          Validation set           RPD
neurons                     Rc       RMSEC (g/kg)    Rp       RMSEP (g/kg)
23          Raw             0.8129   1.4350          0.8054   1.4565          1.60
24          SG              0.8276   1.3831          0.8193   1.5796          1.48
35          1st             0.8505   1.2961          0.8371   1.3694          1.70
13          1st + SG        0.8503   1.2967          0.8412   1.4279          1.63
35          2nd             0.8192   1.4132          0.8178   2.0139          1.16
47          2nd + SG        0.8430   1.3254          0.8237   1.7838          1.31
25          MSC             0.8249   1.3928          0.8217   1.5298          1.52
35          SNV             0.8442   1.3206          0.8367   1.6127          1.45

Raw spectral data without any preprocessing method; SG Savitzky–Golay; 1st the first derivative; 2nd the
second derivative; SNV standard normal variate; MSC multiplicative scatter correction; Rc correlation coefficient
in calibration set; RMSEC root mean squared error of calibration set; Rp correlation coefficient in
prediction set; RMSEP root mean squared error in prediction set; RPD residual prediction deviation

Table 4  The prediction results with five evaluation parameters of the PLSR model using different preprocessing methods

Number of principal   Preprocessing   Calibration set          Validation set           RPD
components                            Rc       RMSEC (g/kg)    Rp       RMSEP (g/kg)
10                    Raw             0.8284   1.3800          0.8275   1.3295          1.75
10                    SG              0.8223   1.4019          0.8230   1.3479          1.73
11                    1st             0.9090   1.0270          0.9003   1.0338          2.25
11                    1st + SG        0.8976   1.0862          0.8993   1.0423          2.24
6                     2nd             0.8555   1.2764          0.8109   1.3673          1.70
6                     2nd + SG        0.8307   1.3719          0.8274   1.3132          1.77
9                     MSC             0.8086   1.4495          0.7801   1.4979          1.56
9                     SNV             0.8087   1.4494          0.7804   1.4973          1.56

Raw spectral data without any preprocessing method; SG Savitzky–Golay; 1st the first derivative; 2nd the
second derivative; SNV standard normal variate; MSC multiplicative scatter correction; Rc correlation coefficient
in calibration set; RMSEC root mean squared error of calibration set; Rp correlation coefficient in
prediction set; RMSEP root mean squared error in prediction set; RPD residual prediction deviation
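The evaluation metrics reported in Tables 3 and 4 can be reproduced with a few helper functions. The sketch below follows the definitions in the paper's equations (correlation coefficient, RMSE with an N − 1 denominator, and RPD as SD divided by RMSEP); the function names are hypothetical. As a consistency check, dividing the validation-set standard deviation from Table 2 (2.3305 g/kg) by the tabulated RMSEP values recovers the tabulated RPD values.

```python
import numpy as np


def correlation_coefficient(y_true, y_pred):
    """Pearson correlation between measured and predicted values."""
    dt = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    return float(np.sum(dp * dt) / np.sqrt(np.sum(dp ** 2) * np.sum(dt ** 2)))


def rmse(y_true, y_pred):
    """Root mean square error with the N - 1 denominator used in the paper."""
    n = len(y_true)
    return float(np.sqrt(np.sum((y_pred - y_true) ** 2) / (n - 1)))


def rpd(sd_validation, rmsep):
    """Residual prediction deviation: validation-set SD over RMSEP."""
    return sd_validation / rmsep


SD_VALIDATION = 2.3305  # g/kg, validation-set standard deviation from Table 2

rpd_plsr_1st = rpd(SD_VALIDATION, 1.0338)   # PLSR, first derivative (Table 4)
rpd_plsr_raw = rpd(SD_VALIDATION, 1.3295)   # PLSR, raw spectra (Table 4)
rpd_elm_raw = rpd(SD_VALIDATION, 1.4565)    # ELM, raw spectra (Table 3)
```

Rounded to two decimals these give 2.25, 1.75, and 1.60, in agreement with the tables, which supports reading RPD as SD/RMSEP with the validation-set SD.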


The PLSR model results in Table 4 indicated that the a narrower spectral curve peak width than the first deriva-
original spectra have a certain prediction effect without tive, producing more peaks. Still, the peak height decreases
using the pretreatment method, similar to the ELM results. significantly, resulting in a decrease in discrimination. Some
The model accuracy is higher than SNV and MSC, with pseudo-harmonic peaks are generated, introducing extrane-
Rc of 0.8284, RMSEC of 1.3800 g/kg, and Rp of 0.8275 ous noise and lowering the signal-to-noise ratio, resulting in
RMSEP of 1.3295 g/kg. And the RPD value was 1.75, indi- a reduction of model accuracy.
cating that the model can be used for quantitative analysis. The overall comparison reveals that the PLSR model out-
Also, the prediction of the spectral model after preprocess- performs the ELM in terms of overall prediction, therefore,
ing has different degrees of variation. subsequent studies will be conducted on the basis of the
As shown in Table 4, the prediction results of the PLSR PLSR model.
model using SG smoothing decreased compared to the origi-
nal spectrum. The model prediction improved after the first
derivative, first derivative-SG, second-order derivative, and Prediction of capsaicin content based
second-order derivative-SG methods were processed. The on characteristic wavelengths with PLSR
optimum prediction model was the PLSR model using the
first derivative with Rp of 0.9003, RMSEP of 1.0338 g/kg, To simplify the model, the SPA and UVE algorithms were
and RPD of 2.25. used to filter the raw and preprocessed spectra. During the
It can be seen that after the smoothing process of the SPA screening process, no restrictions on the selection of
PLSR model, the accuracy of the final model has been wavelengths are set, and the optimal combination of wave-
reduced, probably because the smoothing process led to a lengths is selected based on the minimum root mean square
decrease in signal sharpness to produce distortion. error of prediction. The aim was to screen the pepper powder
SNV (standard normal variate transform) can eliminate spectral data for the most relevant wavelengths to capsaicin
the effects of solid particle size and surface scattering on content. In the screening process of the UVE method, the
the diffuse reflectance spectrum. MSC (multiplicative scatter principal component number with the smallest root mean
correction) has similar effects to SNV, which can eliminate square error of prediction was selected as the optimal princi-
scattering effects from particle inhomogeneity and parti- pal component number. The aim was to remove information
cle size. As seen in Table 4, the spectra after the SNV and from the spectral data that is not relevant to the detection
MSC treatments did not improve in precision, although the of capsaicin content. Finally, the PLSR models were built
principal component fraction was reduced. It may be due from the characteristic wavelength spectral data obtained
to the slight difference in granularity between the samples by the two methods. The detailed model results were shown
after the two-step sample processing of crushing and siev- in Table 5.
ing. Although the pretreatment reduced the scattering of the The optimum model under SPA screening is the SG-SPA-
granular pepper samples, it did not retain enough effective PLSR model due to the overfitting of the model built after
information, leading to decreased prediction accuracy [38, SPA screening of the 1st derivative spectra. The optimal
39]. And both showed similar prediction performance, which performance results of the PLSR model based on the char-
may be related to the fact that Fig. 3c and d have similar acteristic variables screened by SPA were 0.8183 and 0.8169
wave peaks and trends. for Rc and Rp, respectively, corresponding to RMSEC and
However, the accuracy of the derivative-processed PLSR RMSEP of 1.4163 g/kg and 1.3839 g/kg. As shown in Fig. 4
model is better than that of the original spectrum. The spec- and Table 6, the number of input spectral variables has been
tral model with the first derivative preprocessing has the reduced from 128 to 22 for the full spectrum, simplifying the
highest accuracy, similar to the ELM model. The spectral model and increasing computation speed. The 22 selected
model with the second derivative has lower accuracy but is wavelengths are spread over the entire band range, with a
still higher than the original spectrum. The first and second greater concentration in the band at 1150–1350 nm, which
derivatives significantly improve the prediction accuracy may be strongly related to the second overtone region of the
of the model. Among them, the optimum prediction was carbon bond [35, 37, 40].
obtained using the first derivative as the preprocessed model The optimum model under UVE screening is the SNV-
with Rc of 0.9090, Rp of 0.9003, RMSEC of 1.0270 g/kg, UVE-PLSR model, with an Rp of 0.8698, an RMSEP of
RMSEP of 1.0338 g/kg, and RPD of 2.25. This may be 1.1815 g/kg, and RPD of 1.97. The number of input spectral
since, after the first derivative processing, the profile vari- variables has been reduced from 128 to 44 for the full spec-
ation with higher clarity than the original spectrum was trum as shown in the Fig. 5 and Table 6. In comparison to
obtained, as shown in Fig. 3e, where the peak of the spec- the characteristic wavelengths obtained by the SPA, there is
tral curve with fluctuations is more pronounced compared also a significant aggregation at 1500 nm, which is mainly
to the original spectrum (Fig. 3a). The second derivative has related to the vibration of N–H [21]. Although the number of

13
B. Jing et al.

Table 5  Performance Selection Preprocessing Variables Correction set Validation set RPD
comparison of PLSR models methods number
of SPA and UVE screening RC RMSEC (g/kg) RP RMSEP (g/kg)
characteristic variables with
different preprocessing methods SPA Raw 23 0.7918 1.5048 0.7979 1.5025 1.55
SG 22 0.8183 1.4163 0.8169 1.3839 1.68
1st 10 0.8349 1.3564 0.8858 1.0980 2.12
1st + SG 20 0.8123 1.4372 0.8117 1.4242 1.64
2nd 8 0.5197 2.1073 0.6872 1.7941 1.30
2nd + SG 6 0.5418 2.0838 0.7127 1.6883 1.38
MSC 9 0.7770 1.5510 0.7648 1.5312 1.52
SNV 18 0.8194 1.4122 0.8015 1.4060 1.66
UVE Raw 51 0.8027 1.4695 0.7710 1.5159 1.54
SG 46 0.7849 1.5268 0.7841 1.4921 1.56
1st 20 0.7934 1.4500 0.7858 1.5677 1.49
1st + SG 17 0.5932 1.9840 0.5650 1.9276 1.21
2nd 8 0.6583 1.8558 0.6161 1.9195 1.21
2nd + SG 14 0.6875 1.7955 0.6576 1.7773 1.31
MSC 50 0.7921 1.5038 0.7856 1.4812 1.57
SNV 44 0.8811 1.1652 0.8698 1.1815 1.97

SPA successive projections algorithm; UVE uninformative variables elimination; Raw spectral data with-
out any preprocessing method; SG Savitzky–Golay;1st the first derivative; 2nd the second derivative; SNV
standard normal variate; MSC multiplicative scatter correction; Rc correlation coefficient in calibration set;
RMSEC root mean squared error of calibration set; Rp correlation coefficient in prediction set; RMSEP root
mean squared error in prediction set; RPD relative percent deviation

Table 6  Optimal wavelengths selected for the capsaicin content by


successive projections algorithm (SPA) and uninformative variables
elimination (UVE)

Selection Vari- Wavelength (nm)


methods ables
number

SPA 22 939.3, 951.0, 980.1, 1020.8, 1084.5, 1159.3,


1182.3, 1205.2, 1228.0, 1256.5, 1290.6,
1307.6, 1324.6, 1341.6, 1364.2, 1414.8,
1437.3, 1465.3, 1510.0, 1548.9, 1576.7,
1659.5
UVE 44 939.3, 951.0, 956.8, 980.1, 1020.8, 1026.6,
1032.4, 1038.2, 1078.7, 1084.5, 1090.3,
1096.0, 1101.8, 1153.6, 1313.3, 1319.0,
1324.6, 1330.3, 1352.9, 1358.5, 1364.2,
1369.8, 1375.5, 1386.7, 1426.1, 1442.9,
1448.5, 1470.9, 1476.5, 1482.1, 1487.7,
1493.2, 1498.8, 1504.4, 1510.0, 1515.5,
Fig. 4  The characteristic wavelengths of capsaicin selected by SPA 1532.2, 1537.8, 1548.9, 1554.5, 1560.0,
1565.6, 1571.1, 1626.4
variables after screening is higher than SG-SPA, it is accept-
able due to the improved performance of the model.
models with different preprocessing and wavelength screen-
Selection of optimal prediction model ing algorithms were compared using multiple evaluation
metrics of RMSEC, RMSEP and RPD. Among the numer-
PLSR outperformed ELM in the modeling effect based on ous preprocessing methods, the first derivative is probably
the above model results. This may be because the relation- the optimum preprocessing method that can significantly
ship between capsaicin content and spectral information improve the model accuracy. Although SPA and UVE could
approximately obeys the Lambert–Beer law and is more remove the spectral multicollinearity and irrelevant wave-
suitable for linear modeling. The performance of PLSR length to capsaicin content and achieve data reduction, the

13
A feasibility quantification study of capsaicin content in chili powder for rapid evaluation…

Fig. 5  The characteristic wavelengths of capsaicin selected by UVE

Table 7  Comparison results of regression models for the capsaicin content prediction

| Wavelength selection | Number of wavelengths | Preprocessing | Rc (calibration set) | RMSEC (g/kg) | Rp (prediction set) | RMSEP (g/kg) | RPD |
|---|---|---|---|---|---|---|---|
| Full spectra | 128 | 1st | 0.9090 | 1.0270 | 0.9003 | 1.0338 | 2.25 |
| SPA | 22 | SG | 0.8183 | 1.4163 | 0.8169 | 1.3839 | 1.68 |
| UVE | 44 | SNV | 0.8811 | 1.1652 | 0.8698 | 1.1815 | 1.97 |

SPA successive projections algorithm; UVE uninformative variables elimination; SG Savitzky–Golay; 1st the first derivative; SNV standard normal variate; Rc correlation coefficient in calibration set; RMSEC root mean squared error of calibration set; Rp correlation coefficient in prediction set; RMSEP root mean squared error in prediction set; RPD residual prediction deviation

prediction results were less accurate than those of the full-spectrum model, as shown in Table 7. Ultimately, the first derivative-PLSR model in the full wavelength spectral range was implemented to predict the capsaicin content in chili pepper powder, with Rc of 0.9090, Rp of 0.9003, RMSEC of 1.0270 g/kg, RMSEP of 1.0338 g/kg, and RPD of 2.25; the scatter plot of the prediction results is shown in Fig. 6.
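
The evaluation metrics used for this comparison can be illustrated with a minimal sketch. It assumes that RPD is taken as the standard deviation of the reference values divided by the RMSE of the same set, which is one common definition and may differ in detail from the one used in this study. The function names and the five-sample data set are hypothetical and are not taken from the paper.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root mean squared error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(reference, predicted):
    """Pearson correlation coefficient (reported as Rc or Rp)."""
    mean_r = statistics.fmean(reference)
    mean_p = statistics.fmean(predicted)
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


def rpd(reference, predicted):
    """Assumed definition: sample SD of the reference values / RMSE."""
    return statistics.stdev(reference) / rmse(reference, predicted)


# Hypothetical capsaicin contents in g/kg (illustration only, not study data).
reference = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [2.5, 3.5, 6.5, 7.5, 10.5]

print(f"R    = {pearson_r(reference, predicted):.4f}")
print(f"RMSE = {rmse(reference, predicted):.4f} g/kg")
print(f"RPD  = {rpd(reference, predicted):.2f}")
```

Under this assumed definition, the reported RMSEP of 1.0338 g/kg and RPD of 2.25 would correspond to a standard deviation of roughly 2.25 × 1.0338 ≈ 2.33 g/kg for the reference values of the prediction set.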

Fig. 6  Scatter distribution of reference measurements and predictions using the quantitative model. The model of the total content of capsaicin was constructed under partial least square (PLS) + first derivative (1st). The results indicated an excellent fitting degree between actual and predicted values

Conclusion

The overall results demonstrate that NIR spectroscopy enables the determination of capsaicin content in chili powder and, compared to conventional HPLC and UV detection, can greatly reduce professional handling and detection time. Full-band spectral prediction models based on PLSR and ELM were developed, and a comparison of the evaluation metrics showed that the PLSR model obtained better prediction results. The feature bands relevant to capsaicin detection were subsequently screened using SPA and UVE,


and the corresponding PLSR models were developed. Satisfactory predictions were obtained while reducing the modelling variables, providing some reference for the development of subsequent multispectral detection methods. By comparing the prediction models based on full wavelength and characteristic wavelength, the full wavelength PLSR model processed by first derivative obtained the best prediction effect. The optimal performance of the model was achieved with Rp of 0.9003, RMSEP of 1.0338 g/kg, and RPD of 2.25. This study provides valuable data and a theoretical basis for the industrial application of capsaicin content determination based on NIR spectroscopy.
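
The first-derivative pretreatment credited with the best result can be sketched in a few lines. The example below uses a plain finite-difference derivative, which is an assumption: the study may have computed the derivative differently (for instance with Savitzky–Golay smoothing). The function name, wavelengths, and spectral values are hypothetical.

```python
def first_derivative(wavelengths, spectrum):
    """Finite-difference first derivative of a spectrum over wavelength.

    Interior points use central differences; the two end points use
    one-sided differences so the output keeps the input length.
    """
    n = len(spectrum)
    if n != len(wavelengths) or n < 2:
        raise ValueError("need matching sequences with at least two points")
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        slope = (spectrum[hi] - spectrum[lo]) / (wavelengths[hi] - wavelengths[lo])
        deriv.append(slope)
    return deriv


# Hypothetical 5-point spectrum (illustration only, not study data).
wavelengths = [940.0, 946.0, 952.0, 958.0, 964.0]
spectrum = [0.10, 0.13, 0.18, 0.21, 0.22]
# The same spectrum with a constant baseline offset added.
shifted = [value + 0.5 for value in spectrum]

d_raw = first_derivative(wavelengths, spectrum)
d_shifted = first_derivative(wavelengths, shifted)
print(d_raw)
print(d_shifted)
```

The two printed derivatives agree up to floating-point rounding, which shows one reason derivative pretreatment can help a linear model such as PLSR: a constant baseline shift between samples does not survive differentiation.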
Acknowledgements This study was funded by the Beijing Natural Science Foundation under Grant No. 6202020.

Data Availability The authors of this article confirm that the data supporting the study are available within the article.

Declarations

Conflict of interest The authors declare that there is no conflict of interest.
