Identification of Soybean Varieties Based On Hyperspectral Imaging Technology and One-Dimensional Convolutional Neural Network
DOI: 10.1111/jfpe.13767
ORIGINAL ARTICLE
Hao Li1,2 | Liu Zhang1,2 | Heng Sun1,2 | Zhenhong Rao3 | Haiyan Ji1,2
1 Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing, China
2 Key Laboratory of Agricultural Information Acquisition Technology Ministry of Agriculture, China Agricultural University, Beijing, China
3 College of Science, China Agricultural University, Beijing, China

Correspondence
Haiyan Ji, Key Laboratory of Modern Precision Agriculture System Integration Research, Ministry of Education, China Agricultural University, Beijing 100083, China.
Email: yuntian@cau.edu.cn

Funding information
National Key Research and Development Program, Grant/Award Number: 2016YFD0200602

Abstract
Variety identification of seeds is essential to ensure the purity and yield of the variety. A model based on hyperspectral imaging technology and one-dimensional convolutional neural network (1D CNN) was proposed to distinguish soybean seed varieties in this article. Hyperspectral images of a total of 3,600 soybean seeds (900 seeds per variety) were collected in the spectral range of 866.4–1,701.0 nm. Traditional machine learning models (k-nearest neighbor, support vector machines, partial least squares discriminant analysis) and a 1D CNN model were established based on different numbers of sample sets. The 1D CNN model was the most stable and had the highest classification accuracy, higher than 98%. Fivefold cross-validation was used to evaluate the model, and the model achieved an accuracy of more than 95% in both the training set and the validation set. Finally, t-distributed stochastic neighbor embedding was used to visualize the feature values extracted by 1D CNN. The research results demonstrated that the 1D CNN model maintained good classification performance in soybean seed classification and had good application prospects in hyperspectral imaging technology.

Practical Applications
Variety identification of soybean seeds is essential to ensure the purity and yield of the variety. Different soybean varieties have different genetic purity, physical purity, germination ability and vigor, which are related to quality attributes, such as nutritional value, stress resistance, and final yield. In the past, most soybeans were identified using traditional machine learning algorithms combined with near-infrared hyperspectral imaging technology to build models. This requires a series of operations on the hyperspectral data, such as smoothing preprocessing and dimensionality reduction. These steps are cumbersome and are not conducive to online hyperspectral monitoring systems. The results of this study show that it is feasible to use a one-dimensional convolutional neural network combined with hyperspectral technology to identify soybean seeds, which also provides a new idea for building an online detection system in the future.

Rui, & Tao, 2018; Ze, Senior, & Schuster, 2013). Through multilayer processing, the deep learning algorithm gradually transforms the initial low-level feature representation into a high-level feature representation, and can then complete complex learning tasks such as classification with a simple model. Therefore, deep learning can be understood as feature learning or representation learning. Deep learning is used to learn the internal rules and representation levels of sample data. The information obtained in the learning process is of great help to the interpretation of data such as text, images, and sounds. Compared with the traditional neural network, its most prominent feature is that the corresponding optimization algorithm is more effective with the increase of hidden layers. Zhu et al. used hyperspectral technology combined with transfer learning to classify ten soybean varieties based on six models including AlexNet, ResNet18, Xception, InceptionV3, NASNetLarge, and DenseNet201, with accuracy rates above 90% (Zhu et al., 2020). Pang et al. built SVM, ELM, and one-dimensional convolutional neural network (1D CNN) models on hyperspectral data to classify four varieties of corn seeds. Among them, 1D CNN performed best, and the 2D CNN model based on hyperspectral images achieved a recognition accuracy of 99.96% (Pang et al., 2020). Wu et al. used a deep convolutional neural network model based on HSI to classify oat seeds, and achieved the highest accuracy of 99.19% on the testing set (Wu et al., 2019). Deep learning algorithms have gradually become the preferred choice for establishing nondestructive testing models. In this study, NIR hyperspectral technology was used to obtain spectral information of different varieties of soybean seeds. The soybean classification models were established based on deep learning and traditional machine learning algorithms, and their effects were compared.

The specific goals of this study are to: (a) test the identification ability of the model based on NIR hyperspectral technology and 1D CNN for soybean seeds; (b) compare the effects of classification models based on 1D CNN and machine learning algorithms; (c) evaluate the generalization and stability of the 1D CNN model through k-fold cross-validation; and (d) visualize the distribution characteristics of the original spectral data and the data processed by 1D CNN to prove the effectiveness of the model.

2 | MATERIALS AND METHODS

2.1 | Material preparation

The soybean seeds used in this experiment were purchased from a seed market in Beijing, China and contained four different varieties: Zhonghuang13 (ZH13), Zhonghuang37 (ZH37), Zhonghuang39 (ZH39), and Zhonghuang57 (ZH57). There were 900 seeds of each variety, totaling 3,600 seeds. All the seeds, harvested in 2020, were packed in plastic bags and their moisture content was controlled to minimize the impact of the environment on the quality of seeds. The four varieties are encoded as 0, 1, 2, 3 in order of ZH13/ZH37/ZH39/ZH57 for later data analysis.

2.2 | Hyperspectral system

The “GaiaSorter” HSI system produced by Zolix Co., Ltd. (Beijing, China) was selected for this experiment. This system is mainly composed of a uniform light source, a spectroscopic camera, an electronically controlled mobile platform, a computer, and control software. The camera used in the spectral imager is a NIR enhanced hyperspectral camera of the “image-λ” series of Zolix Co., Ltd., and its spectral range is 866.4–1,701.0 nm. The uniform light source consists of two 200 W bromine tungsten lamps. The light emitted by the light source illuminates the sample uniformly by thermal radiation, and the unevenness of the light source within a volume of 300 × 20 × 100 mm (length × width × height) is within 5%, which ensures uniform illumination when testing samples of different volumes. The entire HSI system is shown in Figure 1. The working principle of the system is to place the sample to be tested on an electric mobile platform controlled by software, and use a push-broom method to collect images (Elmasry, Mandour, Al-Rejaie, Belin, & Rousseau, 2019). With the movement of the electric platform, the hyperspectral cube data containing the spectrum information and image information of the tested samples are finally obtained.

2.3 | Spectral data extraction

Before starting this experiment, the power supply was turned on to warm up the HSI system for 30 min to eliminate errors such as baseline drift caused by the system. Then, the system's control software SpecView (SpecView Ltd., Uckfield, UK) was run to conduct a series of tests such as focusing, and finally the exposure time was set to 0.02 s and the moving speed of the electric mobile platform to 0.50 cm/s. In this experiment, all the experimental samples were placed on a black cardboard with extremely low reflectivity, so that the samples were better separated from the background when extracting data later. During data collection, the whole system was kept in a closed environment to avoid the interference of external light. After the data were collected, black and white correction of the obtained data was required to reduce the influence of dark current and other noise on the image (Zhao et al., 2018). The specific operation steps are as follows: in the same collection environment, the image collected by aiming the camera at the whiteboard is $I_W$, and then the lens cap of the camera is put on, and the collected image is $I_B$. The correction formula is:

$$I_C = \frac{I_O - I_B}{I_W - I_B} \tag{1}$$

where $I_C$ is the corrected hyperspectral image, $I_O$ is the original hyperspectral image, $I_B$ is the all-black calibration image, and $I_W$ is the all-white calibration image. The software for correcting the image is SpecView (the control software of the “GaiaSorter” HSI system). In this study, a total of 36 hyperspectral images were collected and processed. Each soybean variety included nine hyperspectral images. There were 100 samples in each hyperspectral image.
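
The black and white correction of Equation (1) can be illustrated with a minimal pure-Python sketch applied to the spectrum of a single pixel. The study itself performed this step in SpecView; the function name `correct_reflectance`, the `eps` guard, and the sample digital numbers below are hypothetical and only serve the illustration.

```python
def correct_reflectance(raw, white, dark, eps=1e-12):
    """Apply Equation (1) band by band: (raw - dark) / (white - dark)."""
    if not (len(raw) == len(white) == len(dark)):
        raise ValueError("raw, white and dark spectra must have the same number of bands")
    corrected = []
    for i_o, i_w, i_b in zip(raw, white, dark):
        denom = i_w - i_b
        # Assumption: bands where white and dark references coincide are set to 0.
        corrected.append((i_o - i_b) / denom if abs(denom) > eps else 0.0)
    return corrected


# Hypothetical digital numbers for one pixel over three bands.
raw_pixel = [1200.0, 2100.0, 3000.0]
white_ref = [4000.0, 4100.0, 4200.0]
dark_ref = [200.0, 100.0, 200.0]

reflectance = correct_reflectance(raw_pixel, white_ref, dark_ref)
print(reflectance)
```

In practice the same arithmetic would be applied to every pixel and band of the hyperspectral cube, typically with an array library rather than Python lists.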
2.4 | Spectral data collection

Since the data collected by the hyperspectral system is hyperspectral cube data, it is necessary to perform a series of processing steps on the original image to finally extract the spectral data. As shown in Figure 2: (a) using the Canny edge detector to extract the edges, then eroding and dilating the seeds in the corrected hyperspectral image; (b) binarizing to generate a mask and segmenting the original image to remove the background; (c) extracting the entire area of each soybean seed on the image as a region of interest (ROI), and then calculating the average reflectance of all pixels of the ROI as the spectral value of each soybean seed.

2.5 | Data analysis methods

In this experiment, four different methods, KNN, SVM, PLS-DA, and 1D CNN, were used to classify soybean seeds. The t-distributed stochastic neighbor embedding (t-SNE) was used to verify the ability of 1D CNN to extract features.

2.5.3 | PLS discriminant analysis

PLS-DA is a discriminant analysis method in multivariate data analysis technology. It is often used to deal with classification and discriminant problems. It is considered a supervised method that maximizes the distinction between samples (Almeida, Fidelis, Barata, & Poppi, 2013). In this study, only the average spectrum of the sample was regarded as the X variable, and the Y variable was determined by the code rule, so that ZH13 was code 0, ZH37 was code 1, ZH39 was code 2, and ZH57 was code 3. The training set and the testing set were divided at a ratio of 3:1.
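
The code rule and the 3:1 division into training and testing sets can be sketched as follows. The source does not state how the samples were assigned to each set, so the random, per-variety (stratified) shuffling used here is an assumption, and `stratified_split` and `VARIETY_CODES` are hypothetical names.

```python
import random


def stratified_split(labels, train_parts=3, test_parts=1, seed=0):
    """Return (train_idx, test_idx), splitting each class at train_parts:test_parts."""
    rng = random.Random(seed)
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)
    train_idx, test_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # assumption: samples are assigned at random
        n_train = len(indices) * train_parts // (train_parts + test_parts)
        train_idx.extend(indices[:n_train])
        test_idx.extend(indices[n_train:])
    return train_idx, test_idx


# Code rule from the text: ZH13 -> 0, ZH37 -> 1, ZH39 -> 2, ZH57 -> 3.
VARIETY_CODES = {"ZH13": 0, "ZH37": 1, "ZH39": 2, "ZH57": 3}
labels = [code for code in VARIETY_CODES.values() for _ in range(900)]

train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 2700 900
```

With 900 seeds per variety, a 3:1 ratio gives 675 training and 225 testing seeds per variety.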
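| No. | Operation | In channels | Out channels | Kernel size | Stride | Padding | Feature size | BN? | Activation function |
|---|---|---|---|---|---|---|---|---|---|
| 1 | Conv1 | 1 | 16 | (1,4) | 2 | 1 | (1,1,224) → (16,1,112) | Yes | ReLU |
| 2 | Conv2 | 16 | 32 | (1,4) | 2 | 1 | (16,1,112) → (32,1,56) | Yes | ReLU |
| 3 | Conv3 | 32 | 64 | (1,4) | 2 | 1 | (32,1,56) → (64,1,28) | Yes | ReLU |
| 4 | Conv4 | 64 | 128 | (1,4) | 2 | 1 | (64,1,28) → (128,1,14) | Yes | ReLU |
| 5 | Flatten | - | - | - | - | - | (128,1,14) → (1,1792) | - | - |
| 6 | Fc1 | - | - | - | - | - | (1,1792) → (1,512) | No | ReLU |
| 7 | Fc2 | - | - | - | - | - | (1,512) → (1,256) | No | ReLU |
| 8 | Fc3 | - | - | - | - | - | (1,256) → (1,128) | No | ReLU |
| 9 | Fc4 | - | - | - | - | - | (1,128) → (1,64) | No | ReLU |
| 10 | Softmax | - | - | - | - | - | (1,64) → (1,4) | - | - |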
Note: “Out channels” means the number of convolution kernels; “Conv” means the convolutional layer; “FC” means the fully connected layer; “BN?”
means whether to perform batch normalization.
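
The feature sizes listed in the table above follow from the standard output-length formula for a convolution without dilation, floor((L + 2p − k) / s) + 1. The short sketch below checks the four convolutional layers and the flattened size; the helper name `conv_out_length` is hypothetical, and the kernel size, stride, and padding are the values given in the table.

```python
def conv_out_length(length, kernel, stride, padding):
    """Output length of a 1D convolution (no dilation)."""
    return (length + 2 * padding - kernel) // stride + 1


length = 224  # number of spectral bands fed to the network
channels = 1
shapes = [(channels, length)]
for out_channels in (16, 32, 64, 128):  # Conv1..Conv4 in the table
    length = conv_out_length(length, kernel=4, stride=2, padding=1)
    channels = out_channels
    shapes.append((channels, length))

flattened = channels * length
print(shapes)     # [(1, 224), (16, 112), (32, 56), (64, 28), (128, 14)]
print(flattened)  # 1792
```

Each convolution halves the spectral length, so four layers reduce 224 bands to 14 positions, and 128 channels × 14 positions give the 1,792 inputs of the first fully connected layer.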
was used as the activation function unit. The Leaky ReLU function is a variant of the classic and widely used ReLU activation function. The output of this function has a small slope for negative input. Since the derivative is always nonzero, this can reduce the appearance of silent neurons, allow gradient-based learning, and solve the problem of neurons not learning after the ReLU function enters the negative interval, as shown in the following formula:

$$\text{Leaky ReLU}(x) = \begin{cases} x, & x > 0 \\ \alpha x, & x \le 0 \end{cases} \tag{2}$$

beta_2 of 0.999 was used to minimize the loss function. The optimization algorithm is an extension of the stochastic gradient descent method, which can be implemented simply and directly, is computationally efficient, has small memory usage, and is well suited to problems that are large in terms of data and/or parameters. To monitor and prevent overfitting during the training process, all data sets were divided into training sets and validation sets at a ratio of 3:1. Mini-batch training with a batch size of 5 was selected to achieve higher accuracy and faster model convergence. The number of training iterations was determined by the performance of the model.
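
Equation (2) and its derivative can be written out as a minimal pure-Python sketch. The value of α is not given in this section, so the default `alpha=0.01` below is an assumption chosen only for illustration, and the function names are hypothetical.

```python
def leaky_relu(x, alpha=0.01):
    """Equation (2): identity for positive input, small slope alpha otherwise."""
    return x if x > 0 else alpha * x


def leaky_relu_grad(x, alpha=0.01):
    """Derivative of Equation (2); it never becomes zero when alpha > 0."""
    return 1.0 if x > 0 else alpha


for value in (-2.0, 0.0, 3.0):
    print(value, leaky_relu(value), leaky_relu_grad(value))
```

Because the gradient for negative inputs equals α rather than zero, a neuron whose pre-activation is negative still receives a weight update, which is the property the text relies on.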
method was used to visualize the features of the hidden layer and the original spectral data (Nie, Zhang, Feng, Yu, & He, 2019). t-SNE is a dimension reduction technology, which is used to represent high-dimensional data sets in 2D or 3D low-dimensional space, so as to visualize them. Compared with other dimensionality reduction algorithms (such as PCA), t-SNE creates a reduced feature space. Similar samples are modeled by nearby points, and dissimilar samples are modeled by far points with high probability (Yang, Zhao, & Chan, 2017). In addition, t-SNE is a technology that integrates dimension reduction and visualization. It is an improvement on SNE visualization that addresses the crowded sample distribution and unclear boundaries of SNE, and is currently a good dimension reduction visualization method.

2.5.7 | Software tools

The spectral values of all pixels in the ROI were calculated in ENVI 5.3 (ITT Visual Information Solutions, Boulder, CO). Image processing and the establishment of the PLS-DA, KNN, and SVM models were carried out in MATLAB R2018b (The MathWorks, Natick, MA). Python 3.8.3 (https://www.python.org/) with Jupyter Notebook was used to build the 1D CNN model. The well-known deep learning framework PyTorch (https://pytorch.org/) was used to program the architecture of the 1D CNN model, which ran on the central processing unit.

3 | RESULTS AND ANALYSIS

3.1 | Raw spectra analysis

The original spectral wavelength range of soybean seeds is 866.4–1,701.0 nm. To eliminate the influence of the external environment and camera performance, the bands with obvious noise at the front and the rear were removed, and the spectral information in the range of 918.1–1,653.8 nm was retained. The spectral data were normalized before modeling. As shown in Figure 5a, a total of 224 wavelengths were included for analysis. The average spectra of each variety are shown in Figure 5b. It can be seen from Figure 5 that the trends of the average spectral curves of the four soybean varieties were very similar and difficult to distinguish. However, due to the different chemical composition of different varieties of soybeans, such as protein, fat and dietary fiber, the spectral characteristics of different varieties of soybeans are very different within certain specific wavelength ranges. For example, in the wavelength range of 918.1–1,100.0 nm, the spectral reflectance of ZH13 was significantly higher than that of the other three varieties, while in the range of 1,250.0–1,330.0 nm, the spectral reflectance of ZH39 was significantly lower than that of the other three varieties. In the wavelength ranges of 918.1–1,100.0 nm and 1,250.0–1,330.0 nm, the order of spectral reflectance from high to low was ZH13/ZH57/ZH37/ZH39, and this difference was most obvious near 1,300.0 nm. The range of 1,000.0–1,100.0 nm mainly corresponds to the third overtone of N─H stretching, and 1,100.0–1,300.0 nm mainly corresponds to the second overtone of C─H
FIGURE 5 The spectra of soybean seeds. (a) The spectrum of all soybean seeds. (b) The average spectrum of four soybean varieties
Abbreviations: 1D CNN, one-dimensional convolutional neural network; KNN, k-nearest neighbor; PLS-DA, partial least squares discriminant analysis;
SVM, support vector machine; ZH13, Zhonghuang13; ZH37, Zhonghuang37; ZH39, Zhonghuang39; ZH57, Zhonghuang57.
stretching (Ferrari et al., 2004). However, due to the overlapping of spectral curves, it is not reliable to identify different varieties of soybean seeds based on the reflectance difference of spectral curves. Therefore, it is necessary to establish models to effectively extract and utilize spectral features for soybean seed classification.

3.2 | Classification results and analysis

In this step, SVM, KNN, PLS-DA, and 1D CNN were used to establish models based on 224 bands. Table 2 shows the discrimination results of soybean seeds under four different models. For different varieties of soybean seeds, the results of the various models were obviously different. Among them, the performance of the PLS-DA model decreased as the number of soybean seed varieties increased. When the number of varieties of input seeds in the model increased from two to four, the accuracy of the training set of the model decreased from 99.41 to 73.10%, and the accuracy of the testing set decreased from 98.44 to 71.40%. PLS-DA is a linear classification method that combines PLS regression and classification technology. When the number of classes increased, the accuracy of the method was greatly reduced. The KNN model had a similar situation. When the number of soybean varieties increased to four, the accuracy of the testing set decreased from 97.33 to 85.75%, a smaller decline than that of the PLS-DA model. The KNN method mainly relies on a limited number of nearby samples, rather than on discriminating the category domain, to determine the category. Therefore, the KNN method was slightly better than the PLS-DA model for the sample set
to be divided with more cross or overlap of the class domain. However, this algorithm is often not effective for data sets with many features (a few hundred or more) and data sets with a large amount of data. SVM is a nonlinear machine learning algorithm in which nonlinear hyperplanes are used to classify complex data. With the increase in the number of soybean seed varieties, the discrimination result of the SVM model declined slightly, and the final accuracy rates on the training set and testing set were 95.67 and 92.56%, respectively. When the number of varieties of soybean seeds was four, the 1D CNN model achieved 99.67 and 98.89% accuracy on the training set and testing set, respectively, which was better than SVM when identifying four different varieties of soybean seeds. As shown in Table 2, with the increase in the number of soybean varieties, the discrimination results of 1D CNN remained stable, and the accuracy of the training set and testing set were always greater than 99 and 98%, respectively. 1D CNN model and SVM model performed better as nonlinear algorithms than linear algorithms. Although the SVM and 1D CNN models can maintain a relatively stable classification level with high performance, when the number of input seed varieties was four for the SVM model, the difference between the recognition accuracy of the training set and the testing set reached more than 3 percentage points, which indicated that the model had been overfitted. Because the spectral data contains many deep features, it is more effective to use the deep neural network model. A deep learning model is usually a deep architecture, and compared with a traditional shallow classifier, a deep architecture can extract more abstract data features, thus obtaining a better performance level. In addition, in the process of training the deep learning model, methods such as BN and dropout can be used to reduce overfitting and improve the stability of the discriminant model. In this article, because in the process of using and debugging dropout, although the model stability had increased,
FIGURE 6 The loss and accuracy curves of one-dimensional convolutional neural network (1D CNN) model of soybean seeds. Classifications of two (a), three (b), and four (c) varieties of soybean seeds
the accuracy of the model had dropped significantly, so instead of using dropout, only BN was used. With the increasing number of soybean seed varieties, the accuracy of the 1D CNN model was significantly higher than that of the SVM, KNN, and PLS-DA models. In addition, the 1D CNN model used an optimization algorithm to alleviate the overfitting problem of the model. Besides, compared to the SVM, KNN, and PLS-DA models, the 1D CNN model required longer training time. For example, in the classification of four varieties, the training times of the SVM, KNN, and PLS-DA models were 6.05, 0.87, and 0.63 s, respectively, whereas the training time of 1D CNN was 3,948.23 s. This was because many parameters in deep neural networks needed to be optimized through multiple iterations. However, the performance of the 1D CNN model was superior to theirs. What is more, when the training of the 1D CNN model
| Varieties of seeds | The number of seeds | Training fold data (%) | Validation fold data (%) |
|---|---|---|---|
| ZH13 and ZH37 | 1,800 | 99.94 | 99.16 |
| ZH13, ZH37, and ZH39 | 2,700 | 99.06 | 98.75 |
| ZH13, ZH37, ZH39, and ZH57 | 3,600 | 98.79 | 97.63 |
Abbreviations: 1D CNN, one-dimensional convolutional neural network; KNN, k-nearest neighbor; PLS-DA, partial least squares discriminant analysis;
SVM, support vector machines; ZH13, Zhonghuang13; ZH37, Zhonghuang37; ZH39, Zhonghuang39; ZH57, Zhonghuang57.
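
The fivefold cross-validation behind the results in the table above can be sketched in pure Python as follows. How the authors formed the folds is not described in this section, so the random shuffling and the helper name `k_fold_indices` are assumptions made for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs; each sample is validated exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # assumption: folds are formed at random
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, val_idx


# Four-variety case: 3,600 seeds split into five folds of 720 seeds each.
splits = list(k_fold_indices(3600, k=5))
print([(len(train), len(val)) for train, val in splits])
```

In each round the model would be trained on four folds (2,880 seeds) and evaluated on the remaining fold (720 seeds), and the reported figures would be aggregated over the five rounds.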
FIGURE 7 The loss and accuracy curves of fivefold cross-validation of one-dimensional convolutional neural network (1D CNN) model. Classifications of two (a), three (b), and four (c) varieties of soybean seeds

FIGURE 8 The visualization maps of soybean seeds using t-distributed stochastic neighbor embedding (t-SNE). (a) t-SNE figures before one-dimensional convolutional neural network (1D CNN) and (b) t-SNE figures after 1D CNN
was completed, the best performing model parameters were saved. In this way, when facing the same scene later, the saved 1D CNN model could be used directly.

The loss value and accuracy curves of the soybean seed discrimination model based on 1D CNN are shown in Figure 6. As the number of epochs increased, the loss value of the discriminant model continued to decrease, and the accuracy continued to rise and tended to stabilize, which indicated that the discriminant model based on 1D CNN had converged. To verify the generalization and stability of the final 1D CNN model, fivefold cross-validation was used. The results are shown in Table 3. The loss and accuracy curves of cross-validation are shown in Figure 7. With the increase of soybean varieties, the accuracy of training fold data and validation fold data was always higher than 98 and 97%, respectively, which showed that the generalization and stability of the classification model based on 1D CNN had certain advantages. Therefore, 1D CNN has potential applications in seed classification based on HSI technology.

3.3 | Visualize data

To visually demonstrate the effectiveness of 1D CNN in soybean seed classification based on NIR hyperspectral technology, t-SNE was used to visualize spectral data. In this article, by inputting the spectral data of the testing set into the best trained 1D CNN model, the feature values of different soybean varieties were collected from the layer preceding the softmax layer of 1D CNN. Then t-SNE was used to visualize the original spectral data and the feature values extracted from the 1D CNN model and reduce these high-dimensional data to a two-dimensional plane for more appropriate analysis. It can be clearly seen from Figure 8 that after the original high-dimensional spectral data of the testing set was visualized by t-SNE, the distribution of these data on the two-dimensional plane was messy and overlapping. This also showed that it is impractical to classify based on raw spectral data alone. After extracting features based on the discriminant model established by 1D CNN, the feature points of these different varieties of soybean seeds were clearly separated. As the number of soybean varieties increased, the visualization results became more obvious. In all the visualizations, there were obvious boundaries between the visualization data obtained from 1D CNN of different varieties of soybean seeds. Of course, some misclassified points were also seen in the visualization, and as the number of soybean varieties increased, the number of misclassified points gradually increased. This corresponds to the testing set accuracy of the discriminant analysis model based on 1D CNN in Table 2. Therefore, this method plays a very significant role in the classification of different varieties of soybean seeds based on NIR HSI technology. In addition, this method may explain the effectiveness of 1D CNN in processing hyperspectral data of mixed seeds and establish a fast visual classification method.

4 | CONCLUSIONS

In this study, a combination of NIR HSI technology and 1D CNN was used to identify different varieties of soybean seeds. Four varieties of soybean seeds were collected to obtain spectral information. A 1D CNN model was used to establish a discriminant model for classifying soybean seed varieties, and the results of 1D CNN were compared with those of discriminant models based on SVM, KNN, and PLS-DA. With the increase in the number of soybean varieties, the performance of 1D CNN remained excellent. When the number of soybean varieties increased from two to four, compared with the results (about 70%) of the PLS-DA model, the classification results of the 1D CNN model remained above 98%. When the number of soybean seeds increased from 1,800 to 3,600, compared with the results (about 85%) of the KNN model, the classification effect of the 1D CNN model was still superior. The difference between the accuracy (95.67%) of the training set and the accuracy (92.56%) of the testing set of the SVM model was more than 3 percentage points, which indicated overfitting. However, the performance of 1D CNN remained stable, which reduced the problem of overfitting. At the same time, the fivefold cross-validation method was used to verify the generalization and stability of the 1D CNN model, and the results showed that its generalization was quite good. In addition, t-SNE was used to visualize the results of combining NIR hyperspectral technology and deep learning. This method showed that it was feasible to use a deep learning model and NIR hyperspectral technology to classify soybean seeds. It also showed that t-SNE can be used to explain the effectiveness of deep convolutional neural networks in seed classification. In future work, we will collect spectra of more soybean varieties to expand the set of spectral fingerprints. As the amount of seed spectral data increases, a model based on deep learning algorithms can be used to construct a variety library to identify more different varieties of soybean seeds.

ACKNOWLEDGMENT
This research is supported by National Key Research and Development Program (Project No. 2016YFD0200602).

CONFLICT OF INTEREST
The authors declare no conflicts of interest.

AUTHOR CONTRIBUTIONS
Hao Li: Methodology; software; visualization; writing-original draft. Liu Zhang: Conceptualization; data curation. Heng Sun: Conceptualization; data curation. Zhenhong Rao: Validation. Haiyan Ji: Funding acquisition; resources; supervision.

ORCID
Hao Li https://orcid.org/0000-0002-2926-3707
Heng Sun https://orcid.org/0000-0002-2038-959X

REFERENCES
Almeida, M. R., Fidelis, C. H. V., Barata, L. E. S., & Poppi, R. J. (2013). Classification of Amazonian rosewood essential oil by Raman spectroscopy and PLS-DA with reliability estimation. Talanta, 117(15), 305–311. https://doi.org/10.1016/j.talanta.2013.09.025
Ambrose, A., Mohan, L., Kim, M. S., Lee, W., & Cho, B. (2016). Infrared physics and technology high speed measurement of corn seed viability using hyperspectral imaging. Infrared Physics & Technology, 75, 173–179. https://doi.org/10.1016/j.infrared.2015.12.008
Bao, Y., Mi, C., Wu, N., Liu, F., & He, Y. (2019). Rapid classification of wheat grain varieties using hyperspectral imaging and chemometrics. Applied Sciences, 9, 4119. https://doi.org/10.3390/app9194119
Bender, R. R., Haegele, J. W., & Below, F. E. (2015). Nutrient uptake, partitioning, and remobilization in modern soybean varieties. Agronomy Journal, 107, 563–573. https://doi.org/10.2134/agronj14.0435
Cao, X., Li, R., Ge, Y., Wu, B., & Jiao, L. (2019). Densely connected deep random forest for hyperspectral imagery classification. International Journal of Remote Sensing, 40, 3606–3622. https://doi.org/10.1080/01431161.2018.1547932
Du, J., Wang, S., He, C., Zhou, B., Ruan, Y. L., & Shou, H. (2017). Identification of regulatory networks and hub genes controlling soybean seed set and size using RNA sequencing analysis. Journal of Experimental Botany, 68, 1955–1972. https://doi.org/10.1093/jxb/erw460
Elmasry, G., Mandour, N., Al-Rejaie, S., Belin, E., & Rousseau, D. (2019). Recent applications of multispectral imaging in seed phenotyping and quality monitoring - An overview. Sensors (Switzerland), 19, 1090. https://doi.org/10.3390/s19051090
Feng, L., Zhu, S., Zhang, C., Bao, Y., Gao, P., & He, Y. (2018). Variety identification of raisins using near-infrared hyperspectral imaging. Molecules, 23, 2907. https://doi.org/10.3390/molecules23112907
Ferrari, M., Mottola, L., & Quaresima, V. (2004). Principles, techniques, and limitations of near infrared spectroscopy. Canadian Journal of Applied Physiology, 29(4), 463–487. https://doi.org/10.1139/h04-031
Guo, Y., Cao, H., Han, S., Sun, Y., & Bai, Y. (2018). Spectral-spatial hyperspectral image classification with K-nearest neighbor and guided filter. IEEE Access, 6, 18582–18591. https://doi.org/10.1109/ACCESS.2018.2820043
Huang, M., Tang, J., Yang, B., & Zhu, Q. (2016). Classification of maize seeds of different years based on hyperspectral imaging and model updating. Computers and Electronics in Agriculture, 122, 139–145. https://doi.org/10.1016/j.compag.2016.01.029
Ioffe, S., & Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. Proceedings of the 32nd International Conference on Machine Learning (ICML), 1, pp. 448–456.
Kiratiratanapruk, K. (2011). Color and texture for corn seed classification by machine vision. ISPACS, 8–12. https://doi.org/10.1109/ISPACS.2011.6146100
Lee, J.-D., Grover, J., & Choung, M.-G. (2011). Application of nondestructive measurement to improve soybean quality by near infrared reflectance spectroscopy. In Soybean - Applications and technology. IntechOpen: Tzi-Bun Ng. https://doi.org/10.5772/15842
Li, Y., Sun, J., Wu, X., Chen, Q., Lu, B., & Dai, C. (2019). Detection of viability of soybean seed based on fluorescence hyperspectra and CARS-SVM-AdaBoost model. Journal of Food Processing & Preservation, 43, 1–9. https://doi.org/10.1111/jfpp.14238
Nie, P., Zhang, J., Feng, X., Yu, C., & He, Y. (2019). Classification of hybrid seeds using near-infrared hyperspectral imaging technology combined with deep learning. Sensors and Actuators B: Chemical, 296, 126630. https://doi.org/10.1016/j.snb.2019.126630
Pang, L., Men, S., Yan, L., & Xiao, J. (2020). Rapid vitality estimation and prediction of corn seeds based on spectra and images using deep learning and hyperspectral imaging techniques. IEEE Access, 8, 123026–123036. https://doi.org/10.1109/ACCESS.2020.3006495
Satturu, V., Rani, D., Gattu, S., & Sreedhar, J. (2018). DNA fingerprinting for identification of rice varieties and seed genetic purity assessment. Agricultural Research, 7, 379–390. https://doi.org/10.1007/s40003-018-0324-8
Shrestha, S., Knapič, M., Žibrat, U., Deleuran, L. C., & Gislum, R. (2016). Single seed near-infrared hyperspectral imaging in determining tomato (Solanum lycopersicum L.) seed quality in association with multivariate data analysis. Sensors and Actuators B: Chemical, 237, 1027–1034. https://doi.org/10.1016/j.snb.2016.08.170
Sun, H., Zhang, L., Rao, Z., & Ji, H. (2020). Determination of moisture content in barley seeds based on hyperspectral imaging technology. Spectroscopy Letters, 53(10), 751–762. https://doi.org/10.1080/00387010.2020.1832531
Sun, J., Shi, X., Zhang, H., Xia, L., Guo, Y., & Sun, X. (2019). Detection of moisture content in peanut kernels using hyperspectral imaging technology coupled with chemometrics. Journal of Food Process Engineering, 42, 1–10. https://doi.org/10.1111/jfpe.13263
Tantasawat, P., Trongchuen, J., Prajongjai, T., Jenweerawat, S., & Chaowiset, W. (2011). SSR analysis of soybean (Glycine max (L.) Merr.) genetic relationship and variety identification in Thailand. Australian Journal of Crop Science, 5, 283–290.
Wu, N., Zhang, Y., Na, R., Mi, C., Zhu, S., He, Y., & Zhang, C. (2019). Variety identification of oat seeds using hyperspectral imaging: Investigating the representation ability of deep convolutional neural network. RSC Advances, 9, 12635–12644. https://doi.org/10.1039/c8ra10335f
Xia, C., Yang, S., Huang, M., Zhu, Q., Guo, Y., & Qin, J. (2019). Maize seed classification using hyperspectral image coupled with multi-linear discriminant analysis. Infrared Physics & Technology, 103, 103077. https://doi.org/10.1016/j.infrared.2019.103077
Yang, J., Zhao, Y. Q., & Chan, J. C. W. (2017). Learning and transferring deep joint spectral-spatial features for hyperspectral classification. IEEE Transactions on Geoscience and Remote Sensing, 55, 4729–4742. https://doi.org/10.1109/TGRS.2017.2698503
Yang, Y., Wang, W., Zhuang, H., Yoon, S. C., & Jiang, H. (2021). Prediction of quality traits and grades of intact chicken breast fillets by hyperspectral imaging. British Poultry Science, 62, 46–52. https://doi.org/10.1080/00071668.2020.1817326
Yu, J., Hong, C., Rui, Y., & Tao, D. (2018). Multitask autoencoder model for recovering human poses. IEEE Transactions on Industrial Electronics, 65, 5060–5068. https://doi.org/10.1109/TIE.2017.2739691
Liu, Y., Li, M., Wang, S., Wu, T., Jiang, W., & Liu, Z. (2020). Identification of
heat damage in imported soybeans based on hyperspectral imaging Ze, H., Senior, A., & Schuster, M. (2013). Statistical parametric speech syn-
technology. Journal of the Science of Food and Agriculture, 100, 1775– thesis using deep neural networks. In Proceedings of the IEEE Interna-
1786. https://doi.org/10.1002/jsfa.10214 tional Conference on Acoustics, Speech, and Signal Processing
Maria John, K. M., Natarajan, S., & Luthria, D. L. (2016). Metabolite (ICASSP), IEEE. pp. 7962–7966. https://doi.org/10.1109/ICASSP.
changes in nine different soybean varieties grown under field and 2013.6639215
greenhouse conditions. Food Chemistry, 211, 347–355. https://doi. Zhang, J., Dai, L., & Cheng, F. (2020a). Corn seed variety classification
org/10.1016/j.foodchem.2016.05.055 based on hyperspectral reflectance imaging and deep convolutional
McCarville, M. T., Marett, C. C., Mullaney, M. P., Gebhart, G. D., & neural network. Journal of Food Measurement and Characterization, 15,
Tylka, G. L. (2017). Increase in soybean cyst nematode virulence and 484–494. https://doi.org/10.1007/s11694-020-00646-3
reproduction on resistant soybean varieties in Iowa from 2001 to Zhang, J., Dai, L., & Cheng, F. (2020b). Identification of corn seeds with dif-
2015 and the effects on soybean yields. Plant Health Progress, 18, ferent freezing damage degree based on hyperspectral reflectance
146–155. https://doi.org/10.1094/PHP-RS-16-0062 imaging and deep learning method. Food Analytical Methods, 14, 389–
Min, C. W., Gupta, R., Agrawal, G. K., Rakwal, R., & Kim, S. T. (2019). Con- 400. https://doi.org/10.1007/s12161-020-01871-8
cepts and strategies of soybean seed proteomics using the shotgun Zhang, L., & Ji, H. (2019). Identification of wheat grain in different states
proteomics approach. Expert Review of Proteomics, 16, 795–804. based on hyperspectral imaging technology. Spectroscopy Letters, 52,
https://doi.org/10.1080/14789450.2019.1654860 356–366. https://doi.org/10.1080/00387010.2019.1639762
14 of 14 LI ET AL.
Zhang, N., Liu, X., Jin, X., Li, C., Wu, X., Yang, S., … Yanne, P. (2017). Deter- multivariate analysis. Molecules, 23, 1352. https://doi.org/10.3390/
mination of total iron-reactive phenolics, anthocyanins and tannins in molecules23061352
wine grapes of skins and seeds based on near-infrared hyperspectral Zhu, S., Zhang, J., Chao, M., Xu, X., Song, P., Zhang, J., & Huang, Z. (2020). A
imaging. Food Chemistry, 237, 811–817. https://doi.org/10.1016/j. rapid and highly efficient method for the identification of soybean seed
foodchem.2017.06.007 varieties: Hyperspectral images combined with transfer learning. Mole-
Zhang, Y., & Guo, W. (2020). Moisture content detection of maize seed cules, 25(1), 152. https://doi.org/10.3390/molecules25010152
based on visible/near-infrared and near-infrared hyperspectral imaging
technology. International Journal of Food Science and Technology, 55,
631–640. https://doi.org/10.1111/ijfs.14317
How to cite this article: Li, H., Zhang, L., Sun, H., Rao, Z., & Ji, H. (2021). Identification of soybean varieties based on hyperspectral imaging technology and one-dimensional convolutional neural network. Journal of Food Process Engineering, e13767. https://doi.org/10.1111/jfpe.13767