
Spectrochimica Acta Part A: Molecular and Biomolecular Spectroscopy 196 (2018) 131–140

journal homepage: www.elsevier.com/locate/saa

Using an optimal CC-PLSR-RBFNN model and NIR spectroscopy for the starch content determination in corn
Hao Jiang, Jiangang Lu ⁎
State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China

Article info

Article history:
Received 29 March 2017
Received in revised form 17 October 2017
Accepted 5 February 2018
Available online 6 February 2018

Keywords:
NIR spectroscopy
Corn starch
Correlation coefficient method
PLSR
RBF
Neural network

Abstract

Corn starch is an important material that has traditionally been used in the food and chemical industries. To make the determination of starch content in corn faster and more reliable, a methodology is proposed in this work, using an optimal CC-PLSR-RBFNN calibration model and near-infrared (NIR) spectroscopy. The proposed model was developed based on the optimal selection of crucial parameters and the combination of the correlation coefficient method (CC), partial least squares regression (PLSR) and a radial basis function neural network (RBFNN). To test the performance of the model, a standard NIR spectroscopy data set was introduced, containing spectral information and chemical reference measurements of 80 corn samples. For comparison, several other models based on the identical data set were also briefly discussed. The root mean square error of prediction (RMSEP) and the coefficient of determination ($R_p^2$) in the prediction set were used for the evaluation. The proposed model presented the best predictive performance, with the smallest RMSEP (0.0497%) and the highest $R_p^2$ (0.9968). Therefore, the proposed method combining NIR spectroscopy with the optimal CC-PLSR-RBFNN model can be helpful for determining starch content in corn.
© 2018 Elsevier B.V. All rights reserved.

⁎ Corresponding author.
E-mail address: lujg@zju.edu.cn (J. Lu).

https://doi.org/10.1016/j.saa.2018.02.017
1386-1425/© 2018 Elsevier B.V. All rights reserved.

1. Introduction

In recent years, near-infrared (NIR) spectroscopy has been developing rapidly with the improvement of computer performance, NIR instruments and chemometrics. As a fast, accurate and non-destructive measurement method, NIR spectroscopy is widely applied in both off-line and on-line analysis. Nowadays, it covers various fields including agriculture, food, the chemical industry and biological science, generating remarkable social and economic benefits [1,2].
Corn starch is an important food and chemical material. To facilitate the breeding of high-starch corn, increase corn yields and promote the added value, it is important to establish rapid and reliable analytical methods for the determination of starch content in corn [3,4]. Generally, traditional chemical methods are time-consuming, expensive and not environment-friendly, and cannot meet the requirements of large-scale online measurement in agriculture and industry [5]. NIR spectroscopy is an alternative method that avoids these shortcomings. Since the NIR spectrum is formed from the signals due to the molecular vibrations of hydrogen-containing functional groups [6], NIR spectroscopy is well suited to determining starch content.
On the other hand, interpretation of NIR spectra is difficult, because the bands of spectra are typically broad and the spectra of different samples overlap closely. A feasible solution is to use chemometric methods and establish multivariate statistical calibration models, correlating spectral data with the content values obtained by chemical reference methods [7]. With the help of calibration models, characteristics of unknown samples can be predicted accurately.
In the procedure above, the robustness and accuracy of the prediction strongly depend on the accuracy of the data used and the performance of the calibration methods. The former is related to the sampling process, the spectral quality and the accuracy of the reference methods. The latter mainly involves the selection and optimization of calibration models.
Typically, linear calibration models include multiple linear regression, principal component regression (PCR) and partial least squares regression (PLSR). Nonlinear calibration models mainly include nonlinear PLSR, support vector machine (SVM) and neural network (NN) [8–11]. Among all these models, PLSR possesses especially good performance. It retains the ability of PCR to denoise and eliminate the collinearity of variables. Meanwhile, it takes the relation between dependent and independent variables into account, which improves its predictive capability significantly [12].
However, PLSR is essentially a linear calibration method. Although the number of principal components extracted can be increased so that the model partly approximates the nonlinear portion, this is ultimately restricted by the requirement of robustness. As a result, the nonlinear regression residual of PLSR models cannot be removed. At the same time, nonlinear models are commonly less effective in modeling the linear parts. Thus, some models integrating PLSR with
nonlinear calibration methods were developed in previous studies to combine the merits of both [13–15]. Notably, the radial basis function neural network (RBFNN) is a promising nonlinear calibration method. As an emerging local approximation network, RBFNN can approximate an arbitrary continuous function with arbitrary precision and avoid the problem of local minima found in some other networks, such as the backpropagation neural network (BPNN) [16,17].
In this research, a methodology is proposed for the starch content determination in corn, using an optimal CC-PLSR-RBFNN model and NIR spectroscopy. The proposed model was formed based on the combination of PLSR and RBFNN. Meanwhile, the correlation coefficient method (CC) was used to select informative wavelengths, and crucial parameters in the process of modeling were optimized to improve the predictive performance. For comparison, several other models were also briefly discussed in this paper. The performance of all the mentioned models was evaluated through the root mean square error of prediction (RMSEP) and the coefficient of determination ($R_p^2$) in the prediction data set. In general, smaller RMSEP values and larger $R_p^2$ values indicate a superior ability of calibration models for accurate determinations.

2. Materials and Methods

2.1. Data Description

A published data set (http://www.eigenvector.com/Data/Corn/index.html) was introduced here, aiming to evaluate the predictive ability of the proposed method more comprehensively by comparing with previous studies. This data set consists of NIR spectra obtained from 80 corn samples, and all the spectra were measured from 1100 to 2498 nm at 2 nm intervals (700 channels) on three different spectrometers. The reference contents of the samples are also included, covering moisture, oil, protein and starch.
In this work, only the NIR spectra measured on the m5 spectrometer and the corresponding starch content were used. The 80 samples in the data set were randomly divided into calibration and prediction sets at a ratio of 3:1. Statistical details about the samples' chemical reference values are described in Table 1.

Table 1
Statistical details of samples' chemical reference values.
Data sets         Range (%)        Mean (%)   Median (%)   Standard deviation (%)
Calibration set   62.884–65.903    64.678     64.772       0.808
Prediction set    62.826–66.472    64.749     64.889       0.876

2.2. Data Preprocessing

Besides the abundant information about the functional groups in samples, the raw NIR spectra also contain irrelevant information, such as background noise and baseline drift [18,19]. Therefore, preprocessing spectra properly to extract effective information is meaningful for the development of accurate and reliable calibration models. Typical preprocessing methods for NIR spectra include background and baseline corrections, normalization and smoothing [20]. In particular, the Savitzky-Golay (S-G) smoothing algorithm combined with derivative calculations performs well in eliminating irrelevant information from raw NIR spectra, and has been applied widely [21–24].
In this research, all the raw spectra were preprocessed before calibration. Specifically, S-G smoothing with a third degree polynomial was used for denoising, and the first-order numerical derivative was taken to correct the baseline drift.

2.3. Parameters Optimization on the Basic PLSR Model

After the spectral preprocessing, PLSR was applied in the calibration set to form a basic model, correlating data of the full NIR spectra with the reference starch content. In the process of preprocessing and calibration, the window width of S-G smoothing and the number of principal components extracted are especially crucial parameters. For instance, a large window of S-G smoothing may distort the spectra, while a narrow window often results in weak denoising [25]. Meanwhile, extracting too many principal components leads to overfitting and poor robustness in the calibration models, while an insufficient extraction leads to a greater prediction error due to the loss of effective information [26]. Moreover, these two parameters are mutually coupled in their effect on the predictive performance of models. Therefore, they need to be optimized together in the parameter space, according to the cost values in the calibration set. Cost values were calculated based on the root mean square error of cross-validation (RMSECV) and extra penalty terms:

\[ \mathrm{Cost}(p, q) = \mathrm{RMSECV}(p, q) + \alpha \times p + \beta \times q \qquad (1) \]

Fig. 1. Typical structure of RBFNN.



Fig. 2. Flow chart describing the optimal CC-PLSR-RBFNN modeling method.

Fig. 3. Cost values of PLSR calibration for different window widths of S-G smoothing and numbers of principal components.
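
As an illustration of Eq. (1) and of an exhaustive search over the parameter space, the following Python sketch evaluates the cost on every (p, q) pair and keeps the full cost table. It is a minimal sketch and not the MATLAB implementation used in this work: the function names, the search ranges and the `toy_rmsecv` surface are hypothetical stand-ins. In practice `rmsecv_fn` would run the S-G preprocessing with window width p and a cross-validated PLSR with q components.

```python
from itertools import product


def cost(rmsecv, p, q, alpha=0.001, beta=0.001):
    """Eq. (1): RMSECV plus small penalties on window width p and component count q."""
    return rmsecv + alpha * p + beta * q


def grid_search(rmsecv_fn, p_values, q_values, alpha=0.001, beta=0.001):
    """Evaluate every (p, q) pair; return the best pair, its cost and the full cost table."""
    table = {
        (p, q): cost(rmsecv_fn(p, q), p, q, alpha, beta)
        for p, q in product(p_values, q_values)
    }
    best = min(table, key=table.get)
    return best, table[best], table


def toy_rmsecv(p, q):
    """Hypothetical smooth error surface, used only to make the sketch runnable."""
    return 0.13 + 0.002 * (p - 17) ** 2 + 0.002 * (q - 17) ** 2


# Illustrative ranges only: odd window widths and consecutive component counts.
best, best_cost, table = grid_search(toy_rmsecv, range(5, 32, 2), range(3, 26))
print(best, round(best_cost, 4), len(table))
```

Returning the whole table, rather than only the minimum, keeps the distribution of cost values available for inspection or for a later, narrower search.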

In Eq. (1), p is the window width of S-G smoothing and q is the number of extracted principal components. α and β are the penalty coefficients of p and q, respectively. The penalty value α × p is used to make spectra lose less information in the procedure of preprocessing, and β × q to reduce the complexity of calibration models and so increase the stability of predictions.
In general, algorithms used for optimization include analytic methods, grid search and various heuristic methods. Which algorithm should be chosen for a particular problem mainly depends on the required optimization performance. The analyticity of the optimization, the scale of calculations and the distribution of cost values are also important factors to be considered. In this research, the optimization is hard to solve analytically, and the distribution of cost values in the parameter space was not known at first. Meanwhile, the cost of exhaustive calculation was acceptable due to a relatively small search space and few variables to be optimized. Thus, a grid search was used here to find the globally optimal parameters. The full distribution of cost values was also acquired in this process, and it could support future optimizations when the calibration model needs to be updated.

Fig. 4. Parameter optimizations using GA in (a) run 1, (b) run 2, (c) run 3.

2.4. Wavelength Selection and the Optimal CC-PLSR Model

In NIR spectra, data at a large number of wavelengths are sometimes redundant or irrelevant to the determination of samples, which detrimentally affects the calibrations [27]. In this sense, filtering spectral information by selecting more informative wavelengths can effectively improve the stability and interpretability of calibration models. Commonly employed approaches to select wavelengths include the correlation coefficient method, the interval partial least squares method, the wavelet transform and the successive projection algorithm [28,29]. In this research, the correlation coefficient method was used for the selection.
Firstly, a correlation analysis was carried out between the spectral data at different wavelengths and the reference starch content. In this step, all the spectra used were preprocessed with the optimal window width. Then, the wavelengths at which the correlation coefficient exceeded a predetermined threshold were selected. Only the spectral data at the selected wavelengths were used to form the calibration model. Furthermore, the threshold of the correlation coefficient was optimized by cross-validation, aiming to reach a compromise between keeping effective information and removing ineffective information. In this process, the PLSR model extracted the optimal number of principal components. Consequently, the optimal CC-PLSR model was formed with the three optimal parameters: the window width of S-G smoothing, the number of principal components and the threshold of the correlation coefficient.

2.5. Radial Basis Function Neural Network (RBFNN)

RBFNN is among the feed-forward networks with a single hidden layer [30]. Fig. 1 shows a typical structure of RBFNN including three layers of neurons and two kinds of mappings.

Fig. 5. (a) Raw spectra and (b) S-G derivative preprocessed spectra of the 60 samples in the calibration set.
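
The S-G derivative preprocessing behind Fig. 5b can be illustrated with a small, dependency-free Python sketch. It assumes the usual least-squares formulation of the Savitzky-Golay filter, uses exact rational arithmetic for clarity, and returns derivatives at interior points only. The function names are hypothetical, edge handling is left out, and this is not the MATLAB code used in this work. In practice a library routine such as `scipy.signal.savgol_filter` (with `window_length=17`, `polyorder=3`, `deriv=1`) would normally be preferred.

```python
from fractions import Fraction


def savgol_weights(window, degree, deriv=1):
    """Weights giving the deriv-th derivative of the fitted polynomial at the window center."""
    m = window // 2
    xs = range(-m, m + 1)
    n = degree + 1
    # Normal equations (A^T A) z = e_deriv for the design matrix A[i][j] = x_i ** j.
    ata = [[Fraction(sum(x ** (r + c) for x in xs)) for c in range(n)] for r in range(n)]
    rhs = [Fraction(int(r == deriv)) for r in range(n)]
    # Gauss-Jordan elimination with exact rational arithmetic.
    for col in range(n):
        pivot = next(r for r in range(col, n) if ata[r][col] != 0)
        ata[col], ata[pivot] = ata[pivot], ata[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(n):
            if r != col and ata[r][col] != 0:
                f = ata[r][col] / ata[col][col]
                ata[r] = [a - f * b for a, b in zip(ata[r], ata[col])]
                rhs[r] -= f * rhs[col]
    z = [rhs[r] / ata[r][r] for r in range(n)]
    factorial = 1
    for k in range(2, deriv + 1):
        factorial *= k
    return [factorial * sum(z[j] * x ** j for j in range(n)) for x in xs]


def savgol_derivative(y, window=17, degree=3, spacing=1):
    """First derivative at interior points only (no edge padding)."""
    w = savgol_weights(window, degree, deriv=1)
    m = window // 2
    return [
        sum(wi * yi for wi, yi in zip(w, y[k - m:k + m + 1])) / spacing
        for k in range(m, len(y) - m)
    ]


# A cubic signal is reproduced exactly by a third degree fit.
y = [t ** 3 - 2 * t for t in range(30)]
print(savgol_derivative(y)[0])
```

For spectra sampled every 2 nm, passing `spacing=2` expresses the derivative per nanometre instead of per channel.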

Fig. 6. RMSECVs of PLSR calibration for different thresholds of correlation coefficient.
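
The wavelength selection by correlation coefficient can be sketched as follows. This is a minimal Python illustration under stated assumptions: the Pearson correlation is used, the absolute value of the coefficient is compared with the threshold (the text does not state whether the sign is considered), and the function names and the small data matrix are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 for a constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def select_wavelengths(spectra, reference, threshold):
    """Return indices of channels whose |r| with the reference exceeds the threshold.

    spectra is a list of samples, each a list of channel values.
    """
    columns = zip(*spectra)
    return [j for j, col in enumerate(columns) if abs(pearson(col, reference)) > threshold]


# Hypothetical data: 4 samples, 3 channels.
spectra = [[1, 5, 2], [2, 5, 1], [3, 5, 4], [4, 5, 3]]
reference = [1.0, 2.0, 3.0, 4.0]
print(select_wavelengths(spectra, reference, threshold=0.30))
```

In a threshold search, this selection would be repeated for each candidate threshold and followed by a cross-validated PLSR on the retained channels.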

The mapping from the input layer to the hidden layer is nonlinear, using the radial basis function (RBF) as the activation function. The Gaussian form of the RBF is the most commonly applied in RBFNN, and it can be defined by:

\[ R(x, c_i) = \exp\left( -\frac{\lVert x - c_i \rVert^2}{2\sigma^2} \right) \qquad (2) \]

In the above equation, x is the input vector of the network, $c_i$ is the center of the i-th RBF, and σ is the spread parameter.
The mapping from the hidden layer to the output layer is linear. The output of the network is the linear weighted sum of the hidden neurons' outputs and can be given as follows:

\[ F(x) = \sum_{i=1}^{N} \omega_i R(x, c_i) \qquad (3) \]

where $\omega_i$ is the i-th weight factor, and N is the number of hidden neurons.
Among all the parameters in the structure of RBFNN, the numbers of input neurons and output neurons depend on the form of the samples. Centers of RBFs are calculated by corresponding algorithms, such as Fuzzy Clustering, K-means Clustering and Orthogonal Least Squares. Weight factors are the training target of the network. The number of hidden neurons and the spread parameter are usually set to empirical values or optimized by cross-validation.

Fig. 7. Entire preprocessed spectra and the wavelengths selected by CC (marked with circles).

Fig. 8. Regression plot between the reference content and the content predicted by the optimal CC-PLSR model.

2.6. Optimal CC-PLSR-RBFNN Model

The optimal CC-PLSR-RBFNN model was formed based on the combination of the optimal CC-PLSR model and RBFNN. The optimal CC-PLSR model was used to properly extract principal components of
spectral data. Meanwhile, the residuals of PLSR were calculated as follows:

\[ \mathrm{Res}^{c}_{\mathrm{plsr}} = V^{c}_{\mathrm{ref}} - V^{c}_{\mathrm{plsr}} \qquad (4) \]

where $V^{c}_{\mathrm{ref}}$ and $V^{c}_{\mathrm{plsr}}$ are the reference values and the prediction values of the optimal CC-PLSR model in the calibration set, respectively.
Subsequently, an RBFNN was established and trained, taking the principal component scores obtained by the optimal CC-PLSR model as its input variables and $\mathrm{Res}^{c}_{\mathrm{plsr}}$ as its output variable. This combined method retains the ability of the optimal CC-PLSR model to extract the most effective information from the original spectra. Meanwhile, the nonlinear residual of PLSR can be reduced due to the superior approximation performance of RBFNN.
In the prediction set, the optimal CC-PLSR model was used to get the principal component scores of the spectral data and the primary prediction values ($V^{p}_{\mathrm{plsr}}$). Then, the RBFNN model was used to obtain the predicted residuals ($\mathrm{Res}^{p}_{\mathrm{plsr}}$) as a parallel compensation. Consequently, the final prediction was made as follows:

\[ V^{p}_{\mathrm{final}} = V^{p}_{\mathrm{plsr}} + \mathrm{Res}^{p}_{\mathrm{plsr}} \qquad (5) \]

Fig. 9. Regression plot between the reference content and the content predicted by the optimal CC-PLSR-RBFNN model.

Fig. 10. Regression plots between the reference content and the content predicted by (a) the optimal CC-RBFNN model, (b) the optimal CC-BPNN model and (c) the optimal CC-PLSR-BPNN model.

2.7. Flow Chart and Software

To summarize the proposed method, a flow chart is presented in Fig. 2, showing the modeling procedure step by step. In Fig. 2, p is the window width of S-G smoothing, ranging from 5 to p_MAX with an interval of 2, q the number of extracted principal
components, ranging from 3 to q_MAX with an interval of 1, and TH is the threshold of the correlation coefficient, ranging from 0 to TH_MAX with an interval of I_TH. Besides, σ and N are the spread parameter and the number of hidden neurons, respectively. All the algorithms in this work were implemented in MATLAB 2014b (Mathworks Inc., USA).

3. Results and Discussion

3.1. Spectral Preprocessing and Wavelengths Selection

All the spectra were preprocessed by S-G smoothing with a third degree polynomial and the first-order derivative. Then, the basic PLSR model was formed using the preprocessed spectral data and the corresponding starch content of the 60 corn samples in the calibration set. In this process, the window width of S-G smoothing and the number of principal components were optimally selected by cross-validation. Cost values were calculated according to Eq. (1), and the penalty coefficients (α and β) were both set to 0.001 to apply a slight penalty to p and q.
Cost values for different window widths of S-G smoothing and numbers of principal components are shown in Fig. 3. The lowest cost value (0.1618) was achieved with the window width = 17 and the number of principal components = 17. Therefore, these two values were chosen as the optimal parameters and used in the following steps of modeling.
Fig. 3 shows the detailed distribution of cost values in the variables' search space. According to Fig. 3, many combinations of variables are acceptable for the following modeling. Furthermore, this distribution is quite smooth, and most of the local minima are close to the global minimum. Therefore, heuristic algorithms or a relatively small search space in a proper range can be used in future optimizations. Solutions of these efficient optimizations will be suboptimal but acceptable.
For example, three runs of parameter optimization using a genetic algorithm (GA) are recorded in Fig. 4. During the optimizations, the population size and the number of generations of GA were both set to 10. As can be seen in Fig. 4, the best cost values of these optimizations ranged from 0.1618 to 0.1671, close or even equal to the result of the grid search. On the other hand, the cross-validation of the PLS calibration model was calculated 100 times when using GA and 528 times when using the grid search. Therefore, GA could be an option to optimize parameters in this situation.
Fig. 5a shows the raw NIR spectra of the 60 samples in the calibration set. Fig. 5b shows the S-G derivative spectra with the first-order derivative, third degree polynomial and 17 smoothing points. Comparing Fig. 5a and b, the baseline drift of the spectra was removed after the preprocessing, and the effective part of the spectral information was emphasized more clearly.
Subsequently, the correlation coefficient method was used for the selection of wavelengths. The threshold of the correlation coefficient was optimized here by cross-validation. In this process, the step length of the search was set to 0.01, balancing accuracy and efficiency. RMSECVs for different thresholds of the correlation coefficient are shown in Fig. 6. As can be seen, the best threshold was 0.30, at which the RMSECV decreased to its lowest value of 0.0633%. Using this threshold, 127 wavelengths in total were selected by CC, as shown in Fig. 7. Compared with the 700 wavelengths in the entire spectrum, redundant spectral information was effectively reduced, making the following calibration process more stable.

3.2. Prediction by the Optimal CC-PLSR Model

By preprocessing the spectra, selecting wavelengths and performing PLSR with the three optimal parameters above, the optimal CC-PLSR model was formed from the calibration set. The 20 corn samples in the prediction set were used to test the predictive performance of the model. As a result, the RMSEP and $R_p^2$ were 0.0555% and 0.9960, respectively.
The regression plot of the predicted content against the reference content is shown in Fig. 8. As can be seen, all the predicted values of starch content are close to the corresponding reference values, which shows the optimal CC-PLSR model to be feasible for the determination.

3.3. Prediction by the Optimal CC-PLSR-RBFNN Model

The root mean square error of calibration (RMSEC) of the optimal CC-PLSR model was 0.0234%, which indicated the regression residual of PLSR. To reduce this residual, the optimal CC-PLSR-RBFNN model was formed by training an RBFNN with the principal component scores of the optimal CC-PLSR model and the reference starch content. The network was established with 17 input neurons and 1 output neuron according to the form of the training data. The Orthogonal Least Squares algorithm was used to determine the centers of the RBFs. Meanwhile, the spread parameter was set to 1.0, and the number of hidden neurons was set equal to the number of inputs (17) for a proper capacity of the network.
Using the final calibration model, the prediction presented a decreased RMSEP of 0.0497% and an increased $R_p^2$ of 0.9968. The regression plot of the predicted content against the reference content is shown in Fig. 9. As can be seen, the prediction becomes more accurate than when only the optimal CC-PLSR model is used, which indicates that the optimal CC-PLSR-RBFNN model performs better in the determination. Furthermore, this improvement could be enhanced if a larger calibration set were introduced to give the network a better generalization ability.

3.4. Evaluation and Comparison

The optimal CC-PLSR-RBFNN model was evaluated comprehensively by comparing it with some other calibration models. These models include the optimal CC-RBFNN model, the optimal CC-BPNN model and the optimal CC-PLSR-BPNN model, which shared the same calibration set and prediction set. Meanwhile, they were optimized by the same strategy, including wavelength selection and parameter optimization, to make the comparison fair. Regression plots of these predictions are shown in Fig. 10, and Table 2 shows all the prediction results in this research, with $R_p^2$ and RMSEP used for the assessment.
As can be seen in Table 2, the optimal CC-PLSR-RBFNN model generated the highest $R_p^2$ and the lowest RMSEP among all the presented models. Table 2 also shows that NN models performed better in the prediction when combined with PLSR. The reason is that the linear part of the relation between input and output was accurately modelled by PLSR. Besides, the dimension of the data was reduced in the process of PLSR calibration, which alleviated the overfitting problem in NN models. Meanwhile, the RBFNN model performed better than the BPNN model under the same conditions, due to the ability of RBFNN to avoid the problem of local minima.

Table 2
Comparison of the predictive performance using different optimal models.
Calibration models   $R_p^2$   RMSEP (%)
CC-PLSR              0.9960    0.0555
CC-BPNN              0.9668    0.1634
CC-RBFNN             0.9932    0.0741
CC-PLSR-BPNN         0.9954    0.0609
CC-PLSR-RBFNN        0.9968    0.0497

Furthermore, several previous studies are discussed here, which were based on the identical data set and splitting ratio (3:1). In these studies, different models were used for the calibration, including MMS, MLER and PCPLS [31–33]. Performance of the proposed model
and the previous ones can be simply compared for reference. Results of the comparison are listed in Table 3. As can be seen, the RMSEP of the proposed model is 45.7% to 59.3% lower than the values reported for the previous models.

Table 3
Comparison of the predictive performance with previous researches.
Calibration models   RMSEP (%)
MMS [31]             0.0915
MLER [32]            0.1147
PCPLS [32]           0.122
CC-PLSR-RBFNN        0.0497

4. Conclusions

In this paper, a methodology is proposed for determining starch content in corn, using the optimal CC-PLSR-RBFNN calibration model and NIR spectroscopy. The proposed model is based on PLSR. In the process of modeling, it was shown that using the correlation coefficient method for wavelength selection and optimizing crucial parameters could increase the reliability of prediction significantly. Meanwhile, combining PLSR with RBFNN was demonstrated to be feasible for reducing the residual of PLSR.
The predictive performance of the calibration models was evaluated through the RMSEP and $R_p^2$ in the prediction data set. Compared with several other models optimized by the same strategy, the proposed optimal CC-PLSR-RBFNN model showed the best performance, with the smallest RMSEP (0.0497%) and the highest $R_p^2$ (0.9968). Moreover, the RMSEP was 45.7% to 59.3% lower than the values presented in previous studies that were developed on the identical data set. Therefore, it can be concluded that NIR spectroscopy coupled with the optimal CC-PLSR-RBFNN model is a promising method for the starch content determination in corn.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (No. 61590925, No. U1609212, No. U1509211) and the National Key Research and Development Plan of China (No. 2017YFC0210403).

References

[1] A.J. Fernándezespinosa, Talanta 148 (2016) 216–228.
[2] L. Ning, J. Chen, T. Pan, L. Yao, Y. Han, J. Yu, Infrared Phys. Technol. 76 (2016) 648–654.
[3] M.R. Almeida, R.S. Alves, L.B. Nascimbem, R. Stephani, R.J. Poppi, L.F. de Oliveira, Anal. Bioanal. Chem. 397 (2010) 2693–2701.
[4] P.P. Samuel, T. Chinnu, M.K. Lakshmanan, Mater. Today Proc. 2 (2015) 949–953.
[5] B. Jamshidi, E. Mohajerani, J. Jamshidi, Measurement 89 (2016) 1–6.
[6] N.C. Mariani, G.H. de Almeida Teixeira, K.M. Gomes de Lima, T. Morgenstern, V. Nardini, L.C. Cunha Jr., Food Chem. 174 (2015) 643–648.
[7] J.U. Porep, A. Mattes, M.S.P. Nikfardjam, D.R. Kammerer, R. Carle, Aust. J. Grape Wine Res. 21 (2015) 69–79.
[8] K.M.F. Diesel, F.S.L.D. Costa, A.S. Pimenta, K.M.G.D. Lima, Wood Sci. Technol. 48 (2014) 949–959.
[9] W. Guo, L. Shang, X. Zhu, S.O. Nelson, Food Bioprocess Technol. 8 (2015) 1126–1138.
[10] R.M. Balabin, E.I. Lomakina, R.Z. Safieva, Fuel 90 (2011) 2007–2015.
[11] A. Morita, T. Araki, S. Ikegami, M. Okaue, M. Sumi, R. Ueda, Y. Sagara, Food Sci. Technol. Res. 21 (2015) 175–186.
[12] D. Krause, C. Holtz, M. Gastl, M.A. Hussein, T. Becker, Eur. Food Res. Technol. 240 (2015) 831–846.
[13] Z. Tian, B. Gu, L. Yang, Y. Lu, Appl. Therm. Eng. 77 (2015) 113–120.
[14] S. Hosseinpour, M. Aghbashlo, M. Tabatabaei, E. Khalife, Energy Convers. Manag. 124 (2016) 389–398.
[15] Y. Wang, M. Yang, G. Wei, R. Hu, Z. Luo, G. Li, Sensors Actuators B Chem. 193 (2014) 723–729.
[16] H. Wang, C. Kong, D. Li, N. Qin, H. Fan, H. Hong, Y. Luo, Food Bioprocess Technol. 8 (2015) 2429–2443.
[17] S. Ghandehari, M. Asghari, Sep. Sci. Technol. 48 (2012) 1324–1330.
[18] J.C.L. Alves, R.J. Poppi, Fuel 165 (2016) 379–388.
[19] Y. Tekin, Z. Tümsavas, A.M. Mouazen, Rev. Bras. Cienc. Solo 38 (2014) 1794–1804.
[20] R. Khodabakhshian, B. Emadi, M. Khojastehpour, M.R. Golzarian, Ann. Food Sci. Technol. 17 (2016) 224–238.
[21] S.R. Karunathilaka, A.F. Kia, C. Srigley, J.K. Chung, M.M. Mossoba, J. Food Sci. 81 (2016) 2390–2397.
[22] O.O. Olarewaju, I. Bertling, L.S. Magwaza, Sci. Hortic. 199 (2016) 229–236.
[23] J.M. Marques Junior, A.L. Muller, E.L. Foletto, C.A. Da, C.A. Bizzi, M.E. Irineu, J. Anal. Methods Chem. 2015 (2015) 1–6.
[24] P.A.M. Nascimento, L.C.D. Carvalho, L.C.C. Júnior, F.M.V. Pereira, Postharvest Biol. Technol. 111 (2016) 345–351.
[25] H. Guo, T. Pan, J. Chen, J. Wang, G. Cao, Anal. Methods 6 (2014) 8810–8816.
[26] Q. Kang, Q. Ru, Y. Liu, L. Xu, J. Liu, Y. Wang, Y. Zhang, H. Li, Q. Zhang, Q. Wu, Spectrochim. Acta A 152 (2016) 431–437.
[27] P.H.G.D. Diniz, M.F. Pistonesi, M.C.U.D. Araújo, Anal. Methods 7 (2015) 3379–3384.
[28] J.H. Cheng, D.W. Sun, Compr. Rev. Food Sci. Food Saf. 14 (2015) 478–490.
[29] R.C. Costa, K.M.G.D. Lima, J. Braz. Chem. Soc. 24 (2013) 1351–1356.
[30] D.S. Broomhead, D. Lowe, Complex Syst. 2 (1988) 321–355.
[31] X. Wu, Z. Liu, H. Li, Anal. Chim. Acta 801 (2013) 43–47.
[32] L. Guo, J. Peng, Q. Xie, Spectrochim. Acta A 189 (2017) 316–321.
[33] L. Xu, X.P. Yu, X.L. Lu, Y.H. Wu, H.L. Wu, J.H. Jiang, G.L. Shen, R.Q. Yu, Anal. Chim. Acta 644 (2009) 25–29.
