
Process Safety and Environmental Protection 132 (2019) 73–81

Contents lists available at ScienceDirect

Process Safety and Environmental Protection


journal homepage: www.elsevier.com/locate/psep

Estimation of soil pH using PXRF spectrometry and Vis-NIR spectroscopy for rapid environmental risk assessment of soil heavy metals
Mengxue Wan a,b,c,e , Mingkai Qu a,c , Wenyou Hu a , Weidong Li c , Chuanrong Zhang c ,
Hang Cheng d , Biao Huang a,∗
a Key Laboratory of Soil Environment and Pollution Remediation, Institute of Soil Science, Chinese Academy of Sciences, Nanjing, 210008, China
b University of Chinese Academy of Sciences, Beijing, 100049, China
c Department of Geography, University of Connecticut, Storrs, CT, 06269, USA
d School of Resource and Environmental Sciences, Wuhan University, Wuhan, 430079, China
e Key Laboratory of Geospatial Technology for Middle and Lower Yellow River Regions (Henan University), Ministry of Education, Kaifeng 475004, China

Article info

Article history:
Received 8 July 2019
Received in revised form 5 September 2019
Accepted 20 September 2019
Available online 24 September 2019

Keywords:
Soil pH
PXRF
Vis-NIR
Data fusion
Soil heavy metals
Environmental risk assessment

Abstract

The environmental risk of heavy metals (HMs) in soil is commonly assessed against risk screening values that differ with soil pH, as set out in soil environmental quality standards. Establishing a reliable, rapid and cost-effective method for detailed soil environmental quality surveys with high-density sampling over large areas is therefore of both theoretical and practical significance. In the present study, using data from Yunnan Province, China, soil HMs were rapidly analyzed via portable X-ray fluorescence (PXRF) spectrometry, and soil pH was estimated by applying PXRF and visible near-infrared reflectance (Vis-NIR) spectroscopy data to partial least-squares regression (PLSR) and support vector machine regression (SVMR). We then compared the soil HM contamination grades calculated from conventional laboratory analysis data with those calculated from rapid analysis data. Soil HMs (i.e., As, Pb, Cu, and Zn) were successfully estimated by PXRF, with high coefficients of determination (R2) above 0.97 (P < 0.001). SVMR with the fused sensor dataset (here PXRF and Vis-NIR) provided the best predictive model for soil pH estimation (R2 = 0.86; ratio of performance to deviation (RPD) = 2.21; ratio of performance to interquartile distance (RPIQ) = 3.09). The Kappa coefficient of the classification was 0.91, indicating very high consistency between the soil HM contamination grades calculated from rapid analysis data and those calculated from conventional laboratory analysis data. Therefore, our study suggests a promising method to rapidly detect soil HM contamination under different pH intervals, which would considerably reduce the financial burden of detailed soil HM surveys with large numbers of samples over large areas.

© 2019 Published by Elsevier B.V. on behalf of Institution of Chemical Engineers.

Abbreviations: PXRF, portable X-ray fluorescence; RPD, residual prediction deviation; RPIQ, the ratio of performance to interquartile distance; SVMR, support vector machine regression; PLSR, partial least-squares regression; SG, Savitzky-Golay; SNV, standard normal variate transform; Vis-NIR, visible near-infrared reflectance.
∗ Corresponding author.
E-mail address: bhuang@issas.ac.cn (B. Huang).

https://doi.org/10.1016/j.psep.2019.09.025
0957-5820/© 2019 Published by Elsevier B.V. on behalf of Institution of Chemical Engineers.

1. Introduction

Soil heavy metal (HM) pollution has been a matter of great concern globally (Li et al., 2017; Qu et al., 2018; Yang et al., 2019). Heavy metals accumulated in soils can be toxic to many organisms and affect vegetable quality and food security, posing a severe threat to human health (Hu et al., 2017b; Tian et al., 2017; Tong et al., 2019). Therefore, completing detailed soil HM surveys at a large scale is critical to assessing the environmental risk of soil HMs in a time- and cost-saving manner.

Conventionally, the environmental risk of soil HMs is assessed using different pH intervals to determine the joint thresholds of soil HMs according to soil environmental quality standards. Currently, soil HMs are usually determined using laboratory analytical methods, such as inductively coupled plasma-mass spectrometry (ICP-MS), inductively coupled plasma atomic emission spectrometry (ICP-AES), atomic fluorescence spectrometry (AFS) and atomic absorption spectrophotometry (AAS) (Li et al., 2018; Wang et al., 2018; Zhang et al., 2017). These techniques are highly accurate but time-consuming and costly, especially for detailed soil HM surveys with high sampling density at a large scale (Hu et al., 2017a; Ran et al., 2014). Recently, the application of portable X-ray fluorescence (PXRF) to soil analysis was sanctioned first by

the USEPA (2007) in Method 6200 and most recently by the Soil Survey Staff (2014). Moreover, PXRF has been widely used to determine soil HMs and assess soil environmental risk because of its accuracy and rapidity (Chakraborty et al., 2017; Horta et al., 2015; Qu et al., 2019; Weindorf et al., 2014, 2012; Yang et al., 2018). Wan et al. (2019) applied PXRF to determine As, Pb, Cu, and Zn at relatively low concentrations and found significant positive correlations between the ex-situ PXRF results and conventional laboratory analysis results for both certified reference materials and soil samples. Therefore, PXRF has the potential to determine soil HM concentrations ex-situ as a rapid and nondestructive tool, especially when attempting to quantify HMs with high sampling density in a large-scale area.

Soil pH is traditionally measured either colorimetrically or electrometrically in the laboratory (Zhang and Gong, 2012). It may strongly affect the relative availability of plant nutrients, microbial activity, the toxicity of contaminants and the accumulation of toxic materials. Therefore, it is one of the most critical soil properties for soil survey and assessment of soil environmental quality. More importantly, there is a tremendous need for methods that measure soil HMs and pH simultaneously, with comparable accuracy, to rapidly assess soil environmental risk (Hu et al., 2017a; Shepherd and Walsh, 2002). With the development of proximal soil sensing technology, not only can PXRF measure multiple soil elements rapidly (Kilbride et al., 2006; Wan et al., 2019; Yang et al., 2018), but pH can also be predicted at the same time by multiple linear regression (MLR) directly from PXRF-measured elements (Sharma et al., 2014). Visible near-infrared reflectance (Vis-NIR) spectroscopy has also been used to predict soil pH by partial least-squares regression (PLSR) (Hu et al., 2017a). However, a single sensor cannot always provide comprehensive information on soil properties because of the complex nature of soils. Thus, methods combining multiple-sensor fusion with advanced models have been used to improve the prediction of some soil properties. Wang et al. (2013) reported that using a Fourier transform near-infrared (FT-NIR) spectrometer and a PXRF analyzer with data fusion improved the prediction of soil texture. Aldabaa et al. (2015) showed that applying PXRF combined with Vis-NIR and remotely sensed spectral data could potentially improve the prediction of soil salinity. Wang et al. (2015) demonstrated that merging the PXRF and Vis-NIR datasets improved the power of predictive models, as indicated by improved residual prediction deviation (RPD) and R2 statistics. Given the success of these previous investigations, we hypothesize that similar approaches of sensor data fusion would be feasible for predicting soil pH when PXRF and Vis-NIR data are available for rapid soil environmental risk assessment. If the fused sensor data (PXRF and Vis-NIR) could be efficiently used as a proxy for soil pH characterization, it would result in substantial savings in cost and time for the rapid assessment of soil environmental risk.

The linear technique of partial least-squares regression (PLSR) is one of the most widely used techniques and has a good capacity to handle data multicollinearity and estimate attributes between the spectral and soil properties (Vasques et al., 2008; Wold et al., 2001). Nonetheless, PLSR is restricted by its assumption of linearity among variables. Support vector machine regression (SVMR) is a relatively new non-linear method and has been used in classification and multivariate calibration problems (Chang and Lin, 2011). Previous studies showed that this non-linear regression model outperformed linear models because the relationships between spectral data or elemental concentrations and soil characteristics are rarely linear in nature, especially for large-scale areas with a wide variability of soil properties (Araújo et al., 2014; Lucà et al., 2017). When dealing with a heterogeneous sample set in which soil composition may vary considerably, the precision of linear regression techniques decreases because of the non-linear nature of the relationship between spectral data and the dependent variable (Xu et al., 2018). Therefore, to obtain robust and reliable predictions, the choice of an appropriate model is essential, especially for sensor data fusion with a large number of potential predictor variables (Nawar et al., 2016).

This study was conducted with 138 surface soil samples from Yunnan Province, China. The specific objectives were as follows: (1) to investigate the feasibility of Vis-NIR, PXRF and sensor data fusion (PXRF + Vis-NIR) for rapid characterization of soil pH; (2) to compare, using different sensor datasets, the performance of the traditional linear method PLSR with that of the non-linear method SVMR; (3) to verify the application of PXRF for effectively quantifying soil HMs (As, Pb, Cu, and Zn); and (4) to rapidly assess soil environmental risk based on pH and HMs with rapid and non-destructive analysis. The ultimate goal is to find an optimal rapid method of soil pH characterization and HM determination that saves time and cost in rapid soil environmental risk assessment with high-density sampling in a large-scale area.

2. Materials and methods

2.1. Study area and soil sampling

The study was conducted in Yunnan Province, located in southwest China between longitudes 97° and 105° E and latitudes 20° and 28° N. Because of the great difference in elevation, from 93 to 5396 m above sea level, Yunnan Province comprises seven different climate zones (north tropical, south subtropical, middle subtropical, northern subtropical, warm temperate, middle temperate, and plateau climatic zones), which reflect the climatic effects of insolation, monsoon, and elevation. The annual average temperature in Yunnan ranges from 5 °C in the north to 24 °C in the south, and the annual rainfall is between 800 and 1200 mm, mostly falling from May to October. The whole province is highly mountainous with a rugged landscape, giving rise to high biodiversity with tropical, subtropical, temperate and even frigid plant species. Across the 390,000 km2 of Yunnan Province, the factors affecting soil formation, including climate, organisms, topography, and parent materials, vary substantially across space and time, allowing for the development of many soil types including Ferrosols, Andosols, Anthrosols, Gleyosols, Isohumosols, Argosols, Cambosols and Primosols according to Keys to Chinese Soil Taxonomy (2001).

With the consideration of soil types and transportation conditions, the spatial distribution of sampling locations was arranged as shown in Fig. 1, according to the actual performance. In this study, 138 surface soil samples were collected across Yunnan Province from the top layer of the corresponding soil profiles, which extended down to a depth of 120 cm or to the rock. The depths of the top layers varied according to the observable differences in soil color, texture, structure, or other characteristics. Each soil sample, approximately 1 kg, was composed of five sub-samples within the top layer. Gravel and plant debris were picked out by hand, and the soils were air-dried at room temperature. The soil was then separated into three parts, which were crushed to pass through < 2 mm, < 0.25 mm, and < 0.150 mm sieves, respectively, and stored in paper bags for further analysis. Air-dried and sieved soils were used to minimize the effects of variation in soil moisture, particle size and non-uniformity on reflectance (Bendor et al., 1999; Silva et al., 2018).

2.2. Conventional laboratory analysis

Soil pH was measured using a soil/water ratio of 1:2.5 with a pH meter (PHS-3C, Shanghai, China), and standard reference materials were used for data quality control (Zhang and Gong, 2012). For HM analysis, soil samples were digested in HNO3−HClO4−HF (Zhuang et al., 2009), and analyzed for Pb, Cu, and Zn using

Fig. 1. Distribution of sampling sites in Yunnan Province, China.

inductively coupled plasma-mass spectrometry (ICP-MS; American Thermo Scientific, X7). Arsenic was digested in aqua regia (HNO3 : HCl = 3:1) and analyzed by atomic fluorescence spectrometry (AFS; Beijing Jitian Instruments Co., Ltd. production, AFS-820) (Xu et al., 2013).

2.3. Rapid analysis using proximal sensing techniques

The rapid estimation of soil HMs and pH using proximal sensing techniques can be influenced by the instrument environment, soil particle size, moisture and the geographical distribution of samples (Chang et al., 2005; Wang et al., 2013). Therefore, moisture and temperature were controlled in the laboratory scanning environment to help obtain more accurate results than in-situ measurement methods. Moreover, because the soils were air-dried and sieved to the same size before scanning in the same laboratory environment, the negative influences of particle size and moisture could be minimized.

Portable X-ray fluorescence (PXRF) spectrometry (NITON XLt 960, UK) was applied to determine elemental contents in this study following the manufacturer's instructions and the recommendations of Method 6200 (USEPA, 2007). The instrument used a 40 kV X-ray tube with an Ag anode target as the excitation source, and a silicon PIN-diode detector with Peltier cooling. Prior to sample analysis, the PXRF was calibrated using the manufacturer's sets. The soil surface was then scanned directly by PXRF in Geochem mode, and the result for each sample was calculated automatically as the average of three parallel measurements and output in Excel format (Hu et al., 2014). Geochem mode consisted of three beams operating sequentially. Each beam was set to scan for 30 s so that each scan could be completed in 90 s. The standard soil GSS-3 was used to test the accuracy after every twenty sample scans.

Soil samples were oven-dried at 45 °C for 24 h prior to spectra collection. Reflectance spectra were measured in the visible and near-infrared (Vis-NIR) range of 350–2500 nm using a Cary 5000 (Agilent Technologies, CA) under controlled laboratory conditions. It took about 3.5 min to scan one sample, and for each subset of 50 samples, two samples were randomly chosen for replicate measurements. The variation between replicates was less than 0.3%. Spectra were sampled at 1-nm intervals with resolution narrower than 0.048 nm in the visible range (350–700 nm) and narrower than 0.2 nm in the NIR range (700–2500 nm) to obtain 2151 wavelengths. Detailed information about the Cary 5000 spectrometer and the protocols of spectra measurement can be found in Zeng et al. (2016).

2.4. Data preparations for pH prediction

2.4.1. PXRF data

The instrument was capable of detecting various elements, including Ca, Fe, Mn, Cr, Ni, Cu, Zn, Pb, As, K, Ti, V, Rb, Sr, Zr, Nb, Si, and Al in soil samples. The elemental data in this study were higher than the detection limits. The elemental data were then log-transformed (Sharma et al., 2014).

2.4.2. Vis-NIR data

The pretreatment of the spectral data was performed using Unscrambler 9.3 (Camo Software AS, Oslo, Norway). Spectral reflectance (R) was transformed to absorbance (A) by the equation A = log10(1/R) before modelling. Data redundancy was reduced by retaining one band in every ten. The remaining 216 wavebands were smoothed through an 11-point Savitzky-Golay (SG) filter with the first derivative using a second-order polynomial with a window size of 10 wavelengths to reduce the baseline variation and improve the spectral features (Savitzky and Golay, 1964). Derivatives were applied either alone or combined with (i) standard normal variate transform (SNV), which reduced the particle-size effect, or (ii) multiplicative scatter correction (MSC), which removed light scattering variation in the reflectance spectroscopy (Martens and Naes, 1989). Each preprocessed dataset was then calibrated for soil pH determination using PLSR and SVMR models. Overall, the SG, SNV, and MSC transforms did not improve the accuracy of the models (data not shown), and the results presented below were from the SG smoothing using the first derivative with a second-order polynomial in conjunction with MSC.

2.4.3. Sensor data fusion

The elemental concentrations measured by PXRF (18 variables) and the reflectance data by Vis-NIR (216 variables) were fused into one dataset, PXRF + Vis-NIR (234 variables).

2.5. Multivariate models for soil pH prediction

The two single-sensor datasets and the fused sensor dataset were each divided into two subsets for calibration and validation. The whole dataset (n = 138) was sorted in descending order of soil pH; the middle one of every three adjacent samples in the sequence was then assigned to the validation set (n = 46) and the remaining two thirds to the calibration dataset (n = 92).

Partial least-squares regression (PLSR) and support vector machine regression (SVMR) models were built to establish the relationship between pH and the sensor data matrix using the PLS Toolbox version 8.02 (Eigenvector Research, Inc., Wenatchee, WA, USA) under MATLAB version R2016a (The MathWorks, Inc., Natick, MA, USA). In the PLSR and SVMR models, Y was the matrix of soil pH measured by conventional laboratory analysis, and X was the sensor data matrix.

PLSR is a widely used linear multivariate regression method for relating soil properties and spectral data (Wold et al., 2001). The compression and regression steps are integrated in PLSR, which determines new orthogonal factors, namely latent variables (LVs), by reducing the dimensional space and noise of the spectral variables. The optimal number of LVs, which are used to optimize the covariance between soil property and spectral data in PLSR modelling, is the number that gives the minimum root mean squared error (RMSE) in leave-one-out cross-validation. The extracted LVs are then used as linear combinations of the predictor variables.

SVMR is a new nonlinear regression method, commonly applied in classification and regression due to its high speed of computing and excellent performance (Bao et al., 2017; Vapnik, 1999). The raw data are reduced to support vectors through SVM, and a support-vector network maps the input vectors into a high-dimensional feature space via some nonlinear mapping to construct an optimal hyperplane for nonlinear regression. The epsilon-SVM algorithm and the radial basis function (kernel function) were used for modeling in this study. The radial basis function kernel parameter (γ) and the cost parameter (C) were fine-tuned by a systematic grid search, and the optimal parameters were those that gave the minimum root mean squared error (RMSE) in leave-one-out cross-validation.

In this study, six modeling modes were performed as follows:

1. First, soil pH was predicted by both PLSR and SVMR using the single-sensor dataset of log-transformed PXRF elemental data.
2. Next, soil pH was predicted by both PLSR and SVMR using the single-sensor dataset of pretreated Vis-NIR spectral data.
3. Finally, soil pH was predicted by both PLSR and SVMR using the fused sensor dataset.

2.6. Accuracy assessment of prediction models

The measured values of soil HMs and pH were compared with the corresponding estimated values using simple linear regression analysis. The performances of the different models were evaluated using the RMSE, the coefficient of determination (R2), the ratio of performance to deviation (RPD; also termed residual prediction deviation) and the ratio of performance to interquartile distance (RPIQ). The RPD index is the standard deviation (SD) divided by the RMSE (Chang and Laird, 2002; Shepherd and Walsh, 2002). The RPIQ is the ratio of the interquartile distance to the RMSE (Bellon-Maurel et al., 2010).

2.7. Soil heavy metal contamination assessment

To evaluate the feasibility of rapid soil environmental risk assessment, soil contamination grades assessed from rapid analysis data (PXRF-measured HMs and pH predicted from PXRF and Vis-NIR) were compared with those assessed from conventional laboratory analysis. The single pollution index (SPI) for each HM was used to evaluate the effect of each HM on soil contamination, and the Nemerow composite pollution index (NCPI) was used to assess the influence of high-concentration HMs on soil environmental quality. The pollution thresholds for each soil HM were set for four pH intervals according to the soil environmental quality standards in China (GB15618-2018) (Table 1). SPI is calculated by

$$\mathrm{SPI} = \frac{C_i}{S_i} \tag{1}$$

where $C_i$ is the concentration of soil heavy metal $i$ and $S_i$ is the threshold of $i$. NCPI is calculated by

$$\mathrm{NCPI} = \sqrt{\frac{(\mathrm{SPI}_{\max})^2 + (\mathrm{SPI}_{\mathrm{mean}})^2}{2}} \tag{2}$$

where $\mathrm{SPI}_{\max}$ is the maximum of the SPI values of the individual HMs and $\mathrm{SPI}_{\mathrm{mean}}$ is their mean.

As NCPI was used to classify soil HM pollution, its values were divided into different grades (Table 2). The Kappa coefficient was used to verify the inter-rater reliability of the environmental risk assessments based on conventional laboratory analysis data and on rapid analysis data. The Kappa coefficient was calculated as Eq. 3:

$$\mathrm{Kappa} = \frac{P_o - P_e}{1 - P_e} \tag{3}$$

where $P_o$ is the observed agreement and $P_e$ is the chance agreement (Kraemer, 2014; Thompson and Walter, 1988). The value of the Kappa coefficient lies between -1 and 1. Kappa < 0.4 indicates poor consistency, 0.4 ≤ Kappa < 0.75 acceptable consistency, and Kappa ≥ 0.75 satisfactory consistency (Landis and Koch, 1977).

Table 1
Assessment of standard reference on soil heavy metal. Risk control standard for soil contamination in China (GB15618-2018); values in mg kg−1.

pH interval        As    Pb    Cu    Zn
pH ≤ 5.5           40    70    50    200
5.5 < pH ≤ 6.5     40    90    50    200
6.5 < pH ≤ 7.5     30    120   100   250
pH > 7.5           25    170   100   300

Table 2
Grades of soil heavy metal pollution (Han et al., 2018; Zhang et al., 2017).

Class   SPI            NCPI                Grade
1       SPI ≤ 1        NCPI ≤ 0.7          Safety
2       1 < SPI ≤ 2    0.7 < NCPI ≤ 1.0    Alert
3       2 < SPI ≤ 3    1.0 < NCPI ≤ 2.0    Slight pollution
4       3 < SPI ≤ 5    2.0 < NCPI ≤ 3.0    Moderate pollution
5       SPI > 5        NCPI > 3.0          Severe pollution

Fig. 2. Box-plots, histograms and descriptive statistics of measured pH data. Min.: minimum, Max.: maximum, SD: standard deviation, CV: coefficient of variation, n: the
number of soil samples.
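The calibration and validation subsets summarized in Fig. 2 come from the systematic split described in Section 2.5: samples are sorted by pH in descending order and the middle one of every three adjacent samples goes to validation. A minimal sketch of that rule is given below; the function name is hypothetical and the pH values are synthetic placeholders used only to exercise the code, not the study data.

```python
def split_calibration_validation(ph_values):
    """Split sample indices by the 'middle of every three' rule.

    Samples are ordered by descending pH; positions 1, 4, 7, ... of that
    ordering (the middle of each consecutive triplet) form the validation
    set, and all remaining samples form the calibration set.
    """
    order = sorted(range(len(ph_values)),
                   key=lambda i: ph_values[i], reverse=True)
    validation = order[1::3]
    validation_lookup = set(validation)
    calibration = [i for i in order if i not in validation_lookup]
    return calibration, validation


# Synthetic pH values spanning the reported range, for illustration only.
synthetic_ph = [4.12 + (8.59 - 4.12) * k / 137 for k in range(138)]
cal_idx, val_idx = split_calibration_validation(synthetic_ph)
print(len(cal_idx), len(val_idx))
```

With 138 samples this yields 92 calibration and 46 validation samples, and both subsets cover nearly the full pH range because every triplet along the sorted sequence contributes to each of them.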

Table 3
Descriptive statistics of heavy metals determined by conventional laboratory analysis. Min., Max., Mean and SD are in mg·kg−1; CV is dimensionless.

Heavy metals   Min.     Max.      Mean      SD      CV
Cu             5.84     331.63    50.59     47.87   0.95
Zn             24.49    858.47    103.73    87.89   0.85
Pb             14.19    489.86    46.74     52.46   1.12
As             5.97     217.33    32.42     33.24   1.03
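The CV column of Table 3 is the ratio of SD to mean, which can be checked directly from the tabulated values. The short sketch below recomputes it; the dictionary layout and names are illustrative assumptions, and the numbers are transcribed from the table.

```python
# (mean, SD, reported CV) transcribed from Table 3; mean and SD in mg/kg.
TABLE3 = {
    "Cu": (50.59, 47.87, 0.95),
    "Zn": (103.73, 87.89, 0.85),
    "Pb": (46.74, 52.46, 1.12),
    "As": (32.42, 33.24, 1.03),
}


def coefficient_of_variation(mean, sd):
    """CV = SD / mean (dimensionless)."""
    return sd / mean


recomputed_cv = {
    metal: coefficient_of_variation(mean, sd)
    for metal, (mean, sd, _) in TABLE3.items()
}

for metal, (_, _, reported) in TABLE3.items():
    print(metal, round(recomputed_cv[metal], 2), reported)
```

A CV above 1, as for Pb and As, simply means that the SD exceeds the mean, which is typical of strongly right-skewed concentration data.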

Table 4
Correlation statistics relating the elemental contents of PXRF to soil pH.

Elements   pH         Elements   pH         Elements   pH
Ca          0.820**   Zn          0.173**   Rb          0.008
Fe         −0.001     Pb          0.145**   Sr          0.458**
Mn          0.229**   As          0.041     Zr         −0.234**
Cr         −0.028     K           0.208**   Nb         −0.136**
Ni          0.166**   Ti         −0.090*    Al         −0.270**
Cu          0.044     V          −0.085*    Si         −0.116**

* Correlation is significant at the 0.05 level (2-tailed).
** Correlation is significant at the 0.01 level (2-tailed).

3. Results and discussion

3.1. Descriptive statistics

The summary statistics of soil pH measured by conventional laboratory analysis for the whole, calibration and validation datasets are provided in Fig. 2. The calibration data ranged from 4.12 to 8.59 with a mean value of 5.99, and the validation data ranged from 4.25 to 8.32 with an average of 5.99. Overall, the soil pH values of both the calibration and validation datasets were similar to those of the whole dataset, which had a wide range from 4.12 to 8.59 (acidic to alkaline), demonstrating that the subsets represented the whole dataset well.

The summary statistics of soil HMs measured by conventional laboratory analysis are provided in Table 3. The mean values of As, Pb, Cu, and Zn were 32.42, 46.74, 50.59, and 103.73 mg kg−1, respectively. The CV values of As, Pb, Cu, and Zn were between 0.85 and 1.12, much higher than 0.75, indicating relatively high variation. The SDs of As and Pb were higher than their corresponding mean values, again indicating high variation, especially for Pb, which had the highest CV of 1.12. This demonstrated that soil formation factors in Yunnan Province, especially the parent materials, varied dramatically. The parent materials, which are the main source of the elements, should contribute to the correlations between soil pH and some elements. Table 4 showed that there were significant correlations between soil pH and each of Ca, Mn, Ni, Zn, Pb, K, Ti, V, Sr, Zr, Nb, Al, and Si, with Ca having the highest positive correlation coefficient of 0.820 at the significance level of 0.01, which might contribute to the good predictive ability of the PXRF data.

Fig. 3. Mean soil reflectance spectra at different pH values from soils in Yunnan Province, China: (a) raw spectra; (b) first derivative spectra.

The raw and preprocessed Vis-NIR spectra of soil samples at different wavelengths, representing different intervals of pH values (from acid to alkaline), are summarized in Fig. 3. The overall shapes of the raw spectra in different pH intervals were generally broad and smooth. The first derivative of the reflectance enhanced the spectral features compared with the raw reflectance. As reported, the three absorption regions at approximately 1400, 1900, and 2200 nm are related to the O–H stretching, H–O–H bending, and the clay lattice Al−OH absorption band (Stenberg et al., 2010). Given that multiple linear regression with PXRF data and auxiliary input data (clay content, sand content, organic matter content) provided the best predictive model (Sharma et al., 2014), the improvement of pH prediction based on fused sensor data in

Table 5
Comparison of classification of heavy metal contamination grades based on conventional laboratory analysis data and rapid analysis data. Rows give the lab-measurement-based classification and columns the prediction-based classification; the last column is the accuracy of the prediction-based classification (%).

Lab-measurement-based                   Safety   Alert   Slight      Moderate    Severe      Total   Accuracy
classification                                           pollution   pollution   pollution           (%)

Safety                                  63       6       1           0           0           70      90
Alert                                   5        20      3           0           0           28      71
Slight pollution                        3        2       19          1           0           25      76
Moderate pollution                      0        0       2           5           3           10      50
Severe pollution                        0        0       0           1           4           5       80
Total                                   71       28      25          7           7           138
Accuracy of method classification (%)   89       71      76          71          57          Kappa coefficient: 0.80

Safety                                  66       3       1           0           0           70      94
Alert                                   3        23      2           0           0           28      82
Slight pollution                        1        1       22          1           0           25      88
Moderate pollution                      0        0       1           9           0           10      90
Severe pollution                        0        0       0           0           5           5       100
Total                                   70       27      26          10          5           138
Accuracy of method classification (%)   94       85      85          90          100         Kappa coefficient: 0.91
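As a worked illustration of Eq. (3), the sketch below computes the observed agreement Po, the chance agreement Pe and the resulting Kappa from the two blocks of counts in Table 5. The helper name and the block labels are hypothetical, and the counts are transcribed from the table as printed here, so any transcription error would carry through. Pe is taken in the usual Cohen form (sum over grades of the product of row and column proportions), which is an assumption since the text does not spell it out.

```python
def agreement_statistics(matrix):
    """Return (Po, Pe, Kappa) for a square confusion matrix.

    Po is the share of samples on the diagonal; Pe is the agreement
    expected by chance from the row and column totals (Cohen's form);
    Kappa = (Po - Pe) / (1 - Pe), as in Eq. (3).
    """
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Counts transcribed from Table 5 (rows: lab-based, columns: prediction-based).
first_block = [
    [63, 6, 1, 0, 0],
    [5, 20, 3, 0, 0],
    [3, 2, 19, 1, 0],
    [0, 0, 2, 5, 3],
    [0, 0, 0, 1, 4],
]
second_block = [
    [66, 3, 1, 0, 0],
    [3, 23, 2, 0, 0],
    [1, 1, 22, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 5],
]

po1, pe1, kappa1 = agreement_statistics(first_block)
po2, pe2, kappa2 = agreement_statistics(second_block)
print(round(po1, 3), round(kappa1, 3))
print(round(po2, 3), round(kappa2, 3))
```

With these transcribed counts, Po evaluates to about 0.80 and 0.91 for the two blocks, matching the figures printed in the table, while the chance-corrected value from Eq. (3) comes out somewhat lower (about 0.70 and 0.86). The sketch does not settle which quantity the table reports; it only shows how the two statistics relate.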

Fig. 4. Scatter plots of lab-measured pH vs predicted pH using different data sets and analysis methods (PLSR and SVMR) for surface soils (validation dataset, n = 46). The
dotted lines are the 1:1 line and the red lines are the regression lines. The colored regions in the six subplots represent 95% prediction confidence intervals. (For interpretation
of the references to colour in this figure legend, the reader is referred to the web version of this article).
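The validation statistics quoted for the models in Fig. 4 (RMSE, R2, RPD and RPIQ, defined in Section 2.6) can be sketched with the Python standard library as follows. Several details are assumptions rather than the authors' stated choices: R2 is taken as the squared Pearson correlation between measured and predicted values, SD is the sample standard deviation of the measured values, and quartiles use the "inclusive" interpolation of `statistics.quantiles`. The input values are synthetic.

```python
import math
import statistics


def validation_metrics(measured, predicted):
    """Return RMSE, R2, RPD and RPIQ for paired measured/predicted values."""
    n = len(measured)
    rmse = math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

    # R2 as squared Pearson correlation (assumed definition).
    mean_m = statistics.fmean(measured)
    mean_p = statistics.fmean(predicted)
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    var_m = sum((m - mean_m) ** 2 for m in measured)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    r2 = cov ** 2 / (var_m * var_p)

    # RPD = SD of measured values / RMSE.
    rpd = statistics.stdev(measured) / rmse

    # RPIQ = interquartile distance (Q3 - Q1) of measured values / RMSE.
    q1, _, q3 = statistics.quantiles(measured, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse

    return {"RMSE": rmse, "R2": r2, "RPD": rpd, "RPIQ": rpiq}


# Synthetic example: predictions alternate 0.5 pH units above and below.
measured_ph = [4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0]
predicted_ph = [m + (0.5 if k % 2 == 0 else -0.5)
                for k, m in enumerate(measured_ph)]
metrics = validation_metrics(measured_ph, predicted_ph)
print({name: round(value, 2) for name, value in metrics.items()})
```

RPD and RPIQ both scale the prediction error by the spread of the reference data, so they are comparable across properties with different units; RPIQ relies on quartiles and is therefore less sensitive to skewed distributions than RPD.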

this study might be due to the spectral response to soil organic matter and clay.

3.2. Soil pH estimation using different sensor datasets

For the linear regression method PLSR, soil pH was predicted with R2 = 0.68, 0.34, and 0.58, RPD = 1.50, 0.70, and 1.13, and RPIQ = 2.01, 0.81, and 1.37 for the validation dataset through PXRF, Vis-NIR, and fused sensor data, respectively (Fig. 4). The correlations between soil pH and certain elements might explain the acceptable performance of the linear regression (PLSR) based on the PXRF dataset (Sharma et al., 2014). In contrast, the poor performance of PLSR based on the Vis-NIR dataset might be caused by the indirect spectral response to soil pH (Stenberg et al., 2010). Since it was difficult for Vis-NIR spectroscopy alone to provide a proper soil characterization, the poor correlation between the Vis-NIR spectra and soil pH might complicate the PLSR analysis, leading to the decreased prediction accuracy based on the fused dataset.

Compared with PLSR, the nonlinear regression method SVMR improved the predictions of soil pH from the PXRF, Vis-NIR, and fused sensor datasets, with R2 = 0.80, 0.75, and 0.86, RPD = 2.03, 1.74, and 2.21, and RPIQ = 3.01, 2.50, and 3.09 for the validation dataset, respectively. The substantial improvement in predicting soil pH by SVMR over PLSR is likely attributable to the nonlinear relationships between soil characteristics that were not captured by PLSR (Webster, 2000). Moreover, the nonlinear regression method skipped the step of selecting specific elements for linear regression, leading to a better performance of SVMR. Therefore, when only PXRF data were available, SVMR was effective in predicting soil pH with reasonable accuracy in our study area. However, Sharma et al. (2014) reported that multiple linear regression (MLR) with PXRF and auxiliary data (clay, sand, organic matter) could improve the prediction of soil pH. These auxiliary data, however, also require conventional laboratory analysis, which is time-consuming and costly. Moreover, Vis-NIR is widely used to predict soil clay, sand, and organic matter (Nawar et al., 2016; Sharma et al., 2014; Zeng et al., 2016). Although the correlation between the standalone Vis-NIR spectra and soil pH is poor, we found that this dataset could provide valuable auxiliary data for PXRF to improve the prediction. Therefore, SVMR could be a preferred method for pH prediction based on sensor data fusion (PXRF + Vis-NIR), which further demonstrated their value in rapid soil pH estimation.

3.3. Soil heavy metals estimation using PXRF

The detection limits of As, Pb, Cu and Zn determined by PXRF (NITON XLt 960, UK) were 7 mg kg−1, 8 mg kg−1, 15 mg kg−1, and 12 mg kg−1, respectively (Wan et al., 2019), which were lower than the lowest risk screening values for soil contamination according to the soil environmental quality standards in China (GB15618-2018, 2018). Moreover, Wan et al. (2019) verified the quality of PXRF data to be "definitive" in a small-scale area. Therefore, based on this accuracy and precision, this study applied PXRF to verify its feasibility for measuring soil HM (As, Pb, Cu, and Zn) concentrations with higher variability in a large-scale area. The results showed that the determination coefficients (R2) for As, Pb, Cu and Zn were 0.97, 0.99, 0.98 and 0.98 (P < 0.001), respectively (Fig. 5). The consistent results demonstrated that it was reasonable to use PXRF to rapidly measure soil HMs in both small-scale and large-scale areas.

Fig. 5. Correlations of heavy metals concentrations between rapid analysis and laboratory analysis for surface soils in Yunnan Province, China. The dotted lines are the 1:1 line and the red lines are the regression lines. The colored regions in the six subplots represent 95% prediction confidence intervals. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article).

3.4. Rapid assessment of the environmental risk of soil heavy metals

Following the comparison of pH predictions above, the PXRF-only dataset and the fused sensor dataset (PXRF + Vis-NIR) were each used with SVMR to estimate the pH of the soil samples (the predicted pH). The samples were then classified by pH interval into four classes (pH < 5.5, 5.5–6.5, 6.5–7.5, > 7.5) according to the standard of soil environmental quality in China (GB15618-2018), as was the pH measured by conventional laboratory analysis. The rapid environmental risk assessment of soil HMs was performed with rapid analysis data (PXRF-measured HMs and predicted pH), while the conventional environmental risk assessment was performed with conventional laboratory analysis of soil HMs and soil pH. The accuracies of the classifications from the NCPI results, based on rapid analysis data and conventional laboratory analysis data, were verified using the Kappa coefficient. Results of this comparison are shown in Table 5.

According to the results of conventional laboratory analysis for the 138 surface soil samples, the number of different grades of comprehensive contamination, including safety, alert, slight pollution, moderate pollution, and severe pollution, were 70, 28, 25, 10, and

5, respectively (Table 5). The accuracies of prediction classification based on PXRF predicted pH results were 90%, 71%, 76%, 50%, and 80% for the soil composite contamination grades of safety, alert, slight pollution, moderate pollution, and severe pollution, respectively. Meanwhile, the accuracies of method classification were 89%, 71%, 76%, 71%, and 57%, respectively. The Kappa coefficient of classification was 0.80, above 0.75, which indicated that the assessment of HM contamination based on PXRF measured HMs and PXRF predicted pH had a satisfactory consistency with that based on conventional laboratory analysis (Table 5). When only PXRF data were available, it could be acceptable to use PXRF measured HMs and PXRF predicted pH for rapidly classifying soil HM contamination grades.

PXRF + Vis-NIR improved the accuracies of prediction classification substantially to 94%, 82%, 88%, 90%, and 100% for the soil composite contamination grades of safety, alert, slight pollution, moderate pollution, and severe pollution, respectively. The accuracies of method classification were improved to 94%, 85%, 85%, 90%, and 100%, respectively. The improved accuracy of classification is most likely due to the improved pH prediction. Both the accuracy of prediction-based classification of severe pollution and that of method classification were 100%, demonstrating that the “hot spots” could be identified in a time- and cost-saving manner. Moreover, the Kappa coefficient of classification was 0.91, higher than that of pure PXRF predicted pH, highlighting the usefulness of using both PXRF and Vis-NIR. These results proved the feasibility of rapidly classifying soil HM contamination grades using estimated soil pH and HMs.

Furthermore, using predicted soil pH to obtain the joint thresholds of soil HMs allowed soil environmental risk assessment to be accomplished rapidly. Considering that the Kappa coefficient of the classification of HM contamination grades was at least 0.80, it was feasible and reasonable to use PXRF individually or in combination with Vis-NIR to assess the soil environmental risk of HMs with reasonable accuracy. Therefore, the results proved the effectiveness of single sensor data (PXRF) and fused sensor data with the nonlinear regression SVMR in pH prediction, which could further serve the rapid assessment of soil environmental risk with HMs measured by PXRF, making a rapid soil survey with high-density sampling feasible. Although the rapid analysis was promising ex-situ, further studies on surveys of soil HMs with high-density sampling at a large scale should focus on the in-situ assessment of soil environmental risk. In-situ analysis of soil pH and HMs also has to overcome the negative influence of moisture and particle size (Hu et al., 2014).

4. Conclusion

This study used two different regression methods, the linear PLSR and the nonlinear SVMR, to predict soil pH via the PXRF or Vis-NIR sensor individually and via the fused sensor dataset (PXRF + Vis-NIR). The results showed that PXRF can be used to predict soil pH through SVMR with reasonable accuracy based either on pure elemental data (R² = 0.80; RPD = 2.03; RPIQ = 3.01) or in fusion with Vis-NIR data (R² = 0.86; RPD = 2.21; RPIQ = 3.09), which makes non-destructive pH measurement possible. It was also found that PXRF measured values and lab-measured values (i.e., As, Pb, Cu, and Zn) have significant positive correlations (P < 0.001) with R² ≥ 0.97. Moreover, the relatively high Kappa coefficient of 0.91 demonstrated that the assessment of the environmental risk of soil HMs using PXRF measured HMs and predicted pH gives results almost identical to those of conventional laboratory analysis. We, therefore, recommend the use of fused sensor data to estimate soil pH and PXRF to measure soil HMs for rapidly assessing soil environmental risk.

Acknowledgments

This work was supported by the National Key Research and Development Program (Grant No. 2018YFC1802601), the Key Frontier Project of Institute of Soil Science, Chinese Academy of Sciences (Grant No. ISSASIP1629), the National Science and Technology Basic Special Program (2014FY110200A10) and the China Scholarship Council. The authors thank Professor Ganlin Zhang and Doctor Rong Zeng for providing the visible near-infrared reflectance (Vis-NIR) spectroscopy data of soil samples in this study area.

References

Aldabaa, A.A.A., Weindorf, D.C., Chakraborty, S., Sharma, A., Li, B., 2015. Combination of proximal and remote sensing methods for rapid soil salinity quantification. Geoderma 239, 34–46.
Araújo, S.R., Wetterlind, J., Demattê, J.A.M., Stenberg, B., 2014. Improving the prediction performance of a large tropical vis-NIR spectroscopic soil library from Brazil by clustering into smaller subsets or use of data mining calibration techniques. Eur. J. Soil Sci. 65, 718–729.
Bao, N., Wu, L., Ye, B., Yang, K., Zhou, W., 2017. Assessing soil organic matter of reclaimed soil from a large surface coal mine using a field spectroradiometer in laboratory. Geoderma 288, 47–55.
Bellon-Maurel, V., Fernandez-Ahumada, E., Palagos, B., Roger, J.-M., McBratney, A., 2010. Critical review of chemometric indicators commonly used for assessing the quality of the prediction of soil attributes by NIR spectroscopy. TrAC Trends Anal. Chem. 29, 1073–1081.
Bendor, E., Irons, J.R., Epema, G.F., 1999. Soil reflectance. Chapter in scientific book. In: Remote Sensing for the Earth Sciences: Manual of Remote Sensing 3/3 / Rencz, A.N., pp. 111–188.
Chakraborty, S., Man, T., Paulette, L., Deb, S., Li, B., Weindorf, D., Frazier, M., 2017. Rapid assessment of smelter/mining soil contamination via portable X-ray fluorescence spectrometry and indicator kriging. Geoderma 306, 108–119.
Chang, C.C., Lin, C.J., 2011. LIBSVM: a library for support vector machines. ACM.
Chang, C.W., Laird, D.A., 2002. Near-infrared reflectance spectroscopic analysis of soil C and N. Soil Sci. 167, 110–116.
Chang, C.W., Laird, D.A., Hurburgh Jr, C.R., 2005. Influence of soil moisture on near-infrared reflectance spectroscopic measurement of soil properties. Soil Sci. 170, 244–255.
Chinese Soil Taxonomy Research Group, 2001. Keys to Chinese Soil Taxonomy. Press of University of Science and Technology of China, Hefei, China.
Han, W., Gao, G., Geng, J., Li, Y., Wang, Y., 2018. Ecological and health risks assessment and spatial distribution of residual heavy metals in the soil of an e-waste circular economy park in Tianjin, China. Chemosphere 197, 325–335.
Horta, A., Malone, B., Stockmann, U., Minasny, B., Bishop, T.F.A., Mcbratney, A.B., Pallasser, R., Pozza, L., 2015. Potential of integrated field spectroscopy and spatial analysis for enhanced assessment of soil contamination: a prospective review. Geoderma 241, 180–209.
Hu, B., Chen, S., Hu, J., Xia, F., Xu, J., Li, Y., Shi, Z., 2017a. Application of portable XRF and VNIR sensors for rapid assessment of soil heavy metal pollution. PLoS One 12, e0172438.
Hu, W., Huang, B., Weindorf, D.C., Chen, Y., 2014. Metals analysis of agricultural soils via portable X-ray fluorescence spectrometry. Bull. Environ. Contam. Toxicol. 92, 420–426.
Hu, W., Zhang, Y., Huang, B., Teng, Y., 2017b. Soil environmental quality in greenhouse vegetable production systems in eastern China: current status and management strategies. Chemosphere 170, 183–195.
Kilbride, C., Poole, J., Hutchings, T., 2006. A comparison of Cu, Pb, As, Cd, Zn, Fe, Ni and Mn determined by acid extraction/ICP–OES and ex situ field portable X-ray fluorescence analyses. Environ. Pollut. 143, 16–23.
Kraemer, H.C., 2014. Kappa Coefficient, Wiley StatsRef: Statistics Reference Online.
Landis, J.R., Koch, G.G., 1977. The measurement of observer agreement for categorical data. Biometrics 33, 159–174.
Li, F., Zhang, J., Jiang, W., Liu, C., Zhang, Z., Zhang, C., Zeng, G., 2017. Spatial health risk assessment and hierarchical risk management for mercury in soils from a typical contaminated site, China. Environ. Geochem. Health 39, 923–934.
Li, F., Zhang, J., Liu, W., Liu, J., Huang, J., Zeng, G., 2018. An exploration of an integrated stochastic-fuzzy pollution assessment for heavy metals in urban topsoil based on metal enrichment and bioaccessibility. Sci. Total Environ. 644, 649–660.
Lucà, F., Conforti, M., Castrignanò, A., Matteucci, G., Buttafuoco, G., 2017. Effect of calibration set size on prediction at local scale of soil carbon by Vis-NIR spectroscopy. Geoderma 288, 175–183.
Martens, H., Naes, T., 1989. Multivariate Calibration. John Wiley & Sons Ltd., New York.
Nawar, S., Buddenbaum, H., Hill, J., Kozak, J., Mouazen, A.M., 2016. Estimating the soil clay content and organic matter by means of different calibration methods of vis-NIR diffuse reflectance spectroscopy. Soil Tillage Res. 155, 510–522.
Qu, M., Chen, J., Li, W., Zhang, C., Wan, M., Huang, B., Zhao, Y., 2019. Correction of in-situ portable X-ray fluorescence (PXRF) data of soil heavy metal for enhancing spatial prediction. Environ. Pollut., 112993.

Qu, M., Wang, Y., Huang, B., Zhao, Y., 2018. Spatial uncertainty assessment of the environmental risk of soil copper using auxiliary portable X-ray fluorescence spectrometry data and soil pH. Environ. Pollut. 240, 184–190.
Ran, J., Wang, D., Wang, C., Zhang, G., Yao, L., 2014. Using portable X-ray fluorescence spectrometry and GIS to assess environmental risk and identify sources of trace metals in soils of peri-urban areas in the Yangtze Delta region, China. Environ. Sci. Processes Impacts 16, 1870–1877.
Savitzky, A., Golay, M.J.E., 1964. Smoothing and differentiation of data by simplified least squares procedures. Anal. Chem. 36, 1627–1639.
Sharma, A., Weindorf, D.C., Man, T., Aldabaa, A.A.A., Chakraborty, S., 2014. Characterizing soils via portable X-ray fluorescence spectrometer: 3. Soil reaction (pH). Geoderma 232, 141–147.
Shepherd, K.D., Walsh, M.G., 2002. Development of reflectance spectral libraries for characterization of soil properties. Soil Sci. Soc. Am. J. 66, 988–998.
Silva, S.H.G., Silva, E.A., Poggere, G.C., Guiherrme, L.R.G., Curi, N., 2018. Tropical soils characterization at low cost and time using portable X-ray fluorescence spectrometer (pXRF): effects of different sample preparation methods. Ciênc Agrotec 42, 80–92.
Soil Survey Staff, 2014. Soil Survey Field and Laboratory Methods Manual. Soil Survey Investigations Report No 51, Version 2. USDA-NRCS. Available at https://prodnrcsusdagov/wps/PA NRCSConsumption/download?cid=stelprdb1244466&ext=pdf (verified 9 July 2014).
Stenberg, B., Viscarra Rossel, R.A., Mouazen, A.M., Wetterlind, J., 2010. Visible and near infrared spectroscopy in soil science. Adv. Agron. 107, 163–215.
Thompson, W.D., Walter, S.D., 1988. A reappraisal of the Kappa Coefficient. J. Clin. Epidemiol. 41, 949–958.
Tian, K., Huang, B., Xing, Z., Hu, W., 2017. Geochemical baseline establishment and ecological risk evaluation of heavy metals in greenhouse soils from Dongtai, China. Ecol. Indic. 72, 510–520.
Tong, R., Cheng, M., Yang, X., Yang, Y., Shi, M., 2019. Exposure levels and health damage assessment of dust in a coal mine of Shanxi Province, China. Process Saf. Environ. Prot. 128, 184–192.
USEPA, 2007. Method 6200: Field Portable X-ray Fluorescence Spectrometry for the Determination of Elemental Concentrations in Soil and Sediment. http://www.epa.gov/osw/hazard/testmethods/sw846/pdfs/6200.pdf.
Vapnik, V.N., 1999. An overview of statistical learning theory. IEEE Trans. Neural Netw. 10, 988–999.
Vasques, G.M., Grunwald, S., Sickman, J.O., 2008. Comparison of multivariate methods for inferential modeling of soil carbon using visible/near-infrared spectra. Geoderma 146, 14–25.
Wan, M., Hu, W., Qu, M., Tian, K., Zhang, H., Wang, Y., Huang, B., 2019. Application of arc emission spectrometry and portable X-ray fluorescence spectrometry to rapid risk assessment of heavy metals in agricultural soils. Ecol. Indic. 101, 583–594.
Wang, D., Chakraborty, S., Weindorf, D.C., Li, B., Sharma, A., Paul, S., Ali, M.N., 2015. Synthesized use of VisNIR DRS and PXRF for soil characterization: total carbon and total nitrogen. Geoderma 243, 157–167.
Wang, H., Wu, Q., Hu, W., Huang, B., Dong, L., Liu, G., 2018. Using multi-medium factors analysis to assess heavy metal health risks along the Yangtze River in Nanjing, Southeast China. Environ. Pollut. 243, 1047–1056.
Wang, S., Li, W., Li, J., Liu, X., 2013. Prediction of soil texture using FT-NIR spectroscopy and PXRF spectrometry with data fusion. Soil Sci. 178, 626–638.
Webster, R., 2000. Is soil variation random? Geoderma 97, 149–163.
Weindorf, D.C., Bakr, N., Zhu, Y., 2014. Advances in portable X-ray fluorescence (PXRF) for environmental, pedological, and agronomic applications. Adv. Agron. 128, 1–45.
Weindorf, D.C., Zhu, Y., Chakraborty, S., Bakr, N., Huang, B., 2012. Use of portable X-ray fluorescence spectrometry for environmental quality assessment of peri-urban agriculture. Environ. Monit. Assess. 184, 217–227.
Wold, S., Sjöström, M., Eriksson, L., 2001. PLS-regression: a basic tool of chemometrics. Chemom. Intell. Lab. Syst. 58, 109–130.
Xu, L., Wang, T., Ni, K., Liu, S., Wang, P., Xie, S., Meng, J., Zheng, X., Lu, Y., 2013. Metals contamination along the watershed and estuarine areas of southern Bohai Sea, China. Mar. Pollut. Bull. 74, 453–463.
Xu, S., Zhao, Y., Wang, M., Shi, X., 2018. Comparison of multivariate methods for estimating selected soil properties from intact soil cores of paddy fields by Vis-NIR spectroscopy. Geoderma 310, 29–43.
Yang, K., Li, L., Xue, S., Wang, Y., Liu, J., Yang, T., 2019. Influence factors and health risk assessment of bioaerosols emitted from an industrial-scale thermophilic biofilter for off gas treatment. Process Saf. Environ. Prot. 129, 55–62.
Yang, M., Wang, C., Yang, Z.P., Yan, N., Li, F.Y., Diao, Y.W., Chen, M.D., Li, H.M., Wang, J.H., Qian, X., 2018. Use of portable X-ray fluorescence spectroscopy and geostatistics for health risk assessment. Ecotoxicol. Environ. Saf. 153, 68–77.
Zeng, R., Zhao, Y.G., Li, D.C., Wu, D.W., Wei, C.L., Zhang, G.L., 2016. Selection of “local” models for prediction of soil organic matter using a regional soil vis-nir spectral library. Soil Sci. 181, 13–19.
Zhang, G., Gong, Z., 2012. Soil Survey Laboratory Methods. Science Press.
Zhang, H., Huang, B., Dong, L., Hu, W., Akhtar, M.S., Qu, M., 2017. Accumulation, sources and health risks of trace metals in elevated geochemical background soils used for greenhouse vegetable production in southwestern China. Ecotoxicol. Environ. Saf. 137, 233–239.
Zhuang, P., Mcbride, M.B., Xia, H., Li, N., Li, Z., 2009. Health risk from heavy metals via consumption of food crops in the vicinity of Dabaoshan mine, South China. Sci. Total Environ. 407, 1551–1561.
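
Illustrative sketch. The two classification steps described in Section 3.4 (assigning each sample to one of the four pH intervals, then checking the agreement between grades from rapid and laboratory analysis with the Kappa coefficient) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the names `ph_class` and `cohen_kappa`, the class labels, the choice to assign each interval's upper bound to the lower class, and the sample grades are all hypothetical, and the formula is the standard Cohen's kappa, which the text does not spell out.

```python
from collections import Counter

# Upper bounds of the first three pH classes. Which class a boundary value
# belongs to is an assumption; the text only lists the four intervals.
PH_UPPER_BOUNDS = (5.5, 6.5, 7.5)
PH_CLASSES = ("pH<5.5", "5.5-6.5", "6.5-7.5", "pH>7.5")


def ph_class(ph):
    """Return the pH class label for one soil sample."""
    for bound, label in zip(PH_UPPER_BOUNDS, PH_CLASSES):
        if ph <= bound:
            return label
    return PH_CLASSES[-1]


def cohen_kappa(reference, predicted):
    """Cohen's kappa for two equally long sequences of class labels."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(reference)
    # Observed agreement: share of samples given the same label by both.
    observed = sum(r == p for r, p in zip(reference, predicted)) / n
    # Chance agreement from the marginal label frequencies.
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if expected == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Made-up contamination grades for eight samples (not data from the study).
lab_grades = ["safety", "safety", "alert", "slight",
              "moderate", "severe", "safety", "alert"]
rapid_grades = ["safety", "safety", "alert", "alert",
                "moderate", "severe", "safety", "slight"]

print(ph_class(6.2))
print(round(cohen_kappa(lab_grades, rapid_grades), 2))
```

With real data, `lab_grades` would hold the grades derived from conventional laboratory analysis and `rapid_grades` those derived from PXRF measured HMs and predicted pH; the resulting value could then be compared against the 0.75 level that the text treats as satisfactory.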
