
Can Suspended Fine-Sediment Transport in Shallow Lakes Be Predicted Using MVRVM with Limited Observations?


H. A. Batt, Ph.D., S.M.ASCE 1; and D. K. Stevens, Ph.D., P.E. 2
Downloaded from ascelibrary.org by LULEA UNIVERSITY OF TECHNOLOGY on 03/09/16. Copyright ASCE. For personal use only; all rights reserved.

Abstract: The study of sediment transport in natural water bodies is a challenging task. There have been several attempts to describe sediment mathematically using hydraulic characteristics of water bodies. Most researchers who developed empirical formulas to describe sediment transport performed laboratory experiments with assumptions that did not take into account variations of hydraulic parameters and the fine sediment sizes that are part of this phenomenon. Recently, new approaches for studying sediment transport have been developed involving the use of machine-learning algorithms that have proven accuracy and efficiency in predicting sediment transport. A novel machine-learning method, the Multivariate Relevance Vector Machine (MVRVM), has yet to be tested to model sediment transport and water quality in estuaries and lakes. The selection of the MVRVM is suggested by the limited field observations that present challenges for alternative statistical learning machines, and by the promise of using the MVRVM approach to inform future data-collection efforts. This paper tests the success of calibrating the MVRVM model to predict suspended fine-sediment transport and other environmental measures in Mud Lake, southeastern Idaho, United States. In addition, the authors introduce and explain the technique used to arrange the data so that the model can work. Training and validation results for turbidity, total suspended solids (TSS), pH, dissolved oxygen (DO), and water temperature are presented. These results emphasize that modeling the water-quality constituents and sediment transport with few observations is possible using the MVRVM. DOI: 10.1061/(ASCE)EE.1943-7870.0001003. © 2015 American Society of Civil Engineers.

1 Environmental Engineer, Utah Water Research Laboratory, Dept. of Civil and Environmental Engineering, Utah State Univ., Logan, UT 84321 (corresponding author). E-mail: hussein.aly.batt@aggiemail.usu.edu
2 Professor and Environmental Division Head, Utah Water Research Laboratory, Dept. of Civil and Environmental Engineering, Utah State Univ., Logan, UT 84321. E-mail: david.stevens@usu.edu
Note. This manuscript was submitted on August 13, 2014; approved on June 5, 2015; published online on August 18, 2015. Discussion period open until January 18, 2016; separate discussions must be submitted for individual papers. This paper is part of the Journal of Environmental Engineering, © ASCE, ISSN 0733-9372/04015051(10)/$25.00.

Introduction

The amount of sediment carried by river flow or deposited in a water body depends on several factors such as flow rate and sediment characteristics. Two types of sediment are transported by a flow: (1) bed load eroded from the water body’s bed, and (2) the wash load consisting of fine material coming from the banks, the watershed, overland flow, and bed. When a stream approaches a lake or estuary, flow characteristics change. The sudden increase in cross-sectional area and decrease in flow velocity often result in a significant amount of sediment deposition. The amount of sediment transport into and out of a lake is related to management requirements and beneficial use of the lake, which might not have taken into consideration the dead storage occupied by the sediment.

Over the past few decades, research concerning transport of various grain size classes in rivers focused mainly on the hydrodynamic conditions, where the transport potential of sediment sizes is based on various formulas that use one grain size or a distribution of grain sizes (e.g., Einstein method, Yang method, and Toffaleti method) (Garcia 2002). Thus the sediment sizes are an important factor in selecting or creating a model. However, this use of particle size is considered difficult in shallow lakes and natural systems because (1) the recent models developed in the last few decades perform poorly in terms of the very fine sediment sizes that dominate natural systems (Jain 2001; Nagy et al. 2002; Sen and Altunkaynak 2004), and physics-based sediment-transport models require detailed information about the temporally and spatially variable physical characteristics of the sediment; (2) alternative modeling approaches using recent advances in statistical learning theory show promise in providing predictive capability in such cases; and (3) sediment water-quality criteria are often expressed as the more easily measured turbidity (NTU), so models that address sediment particles directly will be less useful for management decision-making.

As a result, developing a model to describe the turbidity associated with fine sediments discourages consideration of physics-based models that predict sediment transport based on sediment physical characteristics rather than indirect measures such as turbidity. This paper reports on the identification of an appropriate methodology for modeling the spatial and temporal patterns of suspended fine sediment and other water-quality constituents in Mud Lake, southeastern Idaho, United States, shedding light on a technique that can be used generally to model various applications.

Data-Driven (Statistical) Methods

Statistical learning-tool algorithms have been used to estimate the sediment concentration in different water bodies, and combining these estimates with flow data produces estimates of the sediment yield. Other studies have examined statistical learning tools to predict the levels of other water-quality constituents. Three of these methods are artificial neural networks (ANN), support vector machines (SVM), and relevance vector machines (RVM).

Artificial Neural Networks (ANN)

An artificial neural network is a mathematical model loosely patterned on the human brain, whose billions of neurons function to recognize patterns and process data. Like the brain, the ANN algorithm is able to adapt as new data become available and process

© ASCE 04015051-1 J. Environ. Eng.

J. Environ. Eng., 2016, 142(1): 04015051


information, which makes it useful in the prediction of sediment transport.

Recent studies related to using ANN in sediment transport have shown success compared with traditional, physics-based methods. Jain (2001) mentioned that it is difficult to find conventional sediment rating curves that are sufficiently reliable in all cases to correctly estimate the mass of sediment transported by rivers. He proposed the use of a three-layer feedforward ANN to create and study sediment rating curves. He created integrated stage discharge–sediment concentration relations for two gauging sites to compare the ANN with conventional curve-fitting approaches for predicting suspended sediment concentration. The artificial neural network showed better results for both gauging sites, with a one-order-of-magnitude reduction in the sum of the squared errors compared to the conventional curve-fitting approach.

Taormina et al. (2012) employed artificial neural networks (ANNs) for predicting and forecasting groundwater levels using feedforward neural networks (FFNs). They concluded that such networks can be used as a viable alternative to physical-based models to simulate the responses of the aquifer for future scenarios or to reconstruct long periods of missing observations.

Chau (2007) mentioned that the numerical simulation of flow and water quality is sophisticated. He introduced a knowledge-management system that modeled flow and water quality. This technique did not focus on improving the model performance of ANN by adjusting the correlation relationship between input components and output of models introduced by Wu et al. (2009). Chen and Chau (2006) combined the simulation of human expertise during problem-solving by incorporating artificial intelligence and coupling descriptive knowledge, procedural knowledge, and reasoning knowledge involved in the coastal hydraulic and transport processes.

Cheng et al. (2005) worked on several ANN models with a feedforward, back-propagation network structure and various training algorithms to forecast daily and monthly river flow discharges in Manwan Reservoir (Mekong, China). The results indicated that the ANN models provide better accuracy in forecasting river flow than the auto-regression time-series model.

Muttil and Chau (2006) used artificial neural networks (ANN) to model algal biomass in Tolo Harbour, Hong Kong, using a weighted-trained ANN that correctly identified the ecologically significant variables required for modeling the algal biomass. They concluded that analysis of various ANN scenarios indicated that good predictions of long-term trends in algal biomass can be obtained using only chlorophyll-a as input.

Sen and Altunkaynak (2004) concluded that the different sediment-prediction models in practice, which were developed from rational formulations, suffer from having their parameters estimated using regression methods from a single historical data set. They coupled the ANN with Kalman filtering to model discharge and sediment concentration for the Mississippi River Basin in the central United States. The resulting statistical analysis of this study showed that this approach improved the prediction, reducing the residual sum of squares by 50% for the loading compared to the regression methods.

Partal and Cigizoglu (2008) proposed a different ANN approach to accurately predict the suspended sediment loads in streams. Their study was divided into two parts: (1) predicting sediment load using past data, and (2) predicting the sediment load using daily river-flow measurements. They coupled normal techniques for forming the ANN with wavelet methods (methods that use periodic functions to help capture patterns of data), and mentioned that the input for this model was selected by applying the wavelet components. These components helped in deciding which parameters have large effects on the sediment load. The output showed that the wavelet–ANN coupling provided a good fit to the observations for the testing period. The results of this research were compared to traditional ANN to show that the wavelet–ANN approach had superior predictions in all cases, including the peak estimation of sediment values.

Despite the advantage of using the ANN as just discussed, there remains a major disadvantage to this algorithm, namely that traditional ANNs can get trapped in local minima, suggesting that the best ANN models are not unique. Dogan et al. (2007) point out that because of this disadvantage, newer statistical learning algorithms have been developed.

Support Vector Machine (SVM)

SVMs are derived from statistical learning theory and have been used for classification and regression. The SVM algorithm is based on separation between input data classes to select subsets of training data that contain important information to be used for testing.

Singh et al. (2008) focused on estimation of discharge and normal depth in a trapezoidal channel having various bottom slopes using SVM. They found that the correlation coefficient to evaluate the efficiency of settling basins for all the bed slopes was higher than 0.995 for both prediction of discharge and end depth. They suggested the use of SVM to estimate discharge instead of the traditional physics-based approaches.

Lizhong et al. (2007) modeled water quality in lakes using remotely sensed images and support vector machines, finding that the relationships between remotely sensed image data and water-quality parameters are nonlinear. They recommended the use of data-driven methods that can accommodate nonlinearity. They attempted the use of ANN, but because the limited number of water-monitoring stations could not provide enough data for training, the iterations would stop at local minima of the loss function (e.g., sum of squared errors) and fail to find the optimum parameter set. A support vector machine was used for its simple structure, good generalization ability, and ability to perform well with fewer observations than ANN. The results from this study of Lake Taihu in China supported their hypothesis for using SVM, which outperformed the ANN in modeling the water quality in the lake.

Misra et al. (2009) focused on simulation of runoff and sediment yield using SVM, noting that physical models for computing runoff and sediment yield are complex. They modeled the sediment yield from a 7,820 km² watershed in India using data from the monsoon period with SVM, concluding that the SVM predicted the sediment yield and the runoff more accurately than ANNs.

Goel and Pal (2008) modeled scour and its effect on a grade control structure using both ANN and SVM. They noted that scour was represented by empirical relationships based on laboratory/field experiments on flow, time, material, and type of structure that were computed from particular situations. These empirical formulas did not offer a general computational prediction capability that can be applied to all cases. They pointed out that many scholars have started adopting the ANN algorithms to model scour; in this research, they used the SVM with available data from earlier published studies to model the scour and compared the results to those obtained from an ANN algorithm with feedforward/back-propagation. They recommended the use of the SVM modeling approach in modeling scour because it performed statistically better in comparison to both ANN and empirical relationships. Similarly, Singh et al. (2008) studied sediment removal efficiency in settling basins using ANN and SVM and reported that the performance of SVM was found to be better statistically compared to the ANN.



Relevance Vector Machine (RVM)

Tipping (2001) found that the SVM suffers from some disadvantages: (1) SVM makes excessive use of the kernel function, thus requiring the number of observations to grow with the training set; (2) estimation of a tradeoff error/margin parameter, which is accomplished using a cross-validation process (a technique that partitions the sample data into subsets, performs the analysis using one of the sets, and then validates the analysis using the other set; this is repeated using different data sets and the validation is averaged over all the sets) that is wasteful of data. He introduced the relevance vector machine (RVM) as an alternative based on a Bayesian approach, which does not suffer from the disadvantages of SVM and requires fewer kernel functions. Tipping explained that the RVM is a probabilistic sparse kernel model similar to SVM, where the sparsity in the RVM is achieved when the algorithm identifies only those observations that improve the performance of the model. The important difference between SVM and RVM is that the RVM method generally requires many fewer observations than the SVM to achieve the same degree of predictive accuracy.

Dogan et al. (2007) described RVM as a new algorithm that has not been used widely in modeling sediment transport in natural-environmental systems. Dogan et al. (2007) worked with RVM to estimate sediment concentration time series in streams and rivers. They built this model using data from a data pool compiled from riverbed loads of various kinds and sizes in the United States and Europe, without considering spatial distribution of sediment in lakes or rivers. They divided the data randomly into a training data set and a model validation data set. They used dimensional analysis to select input parameters (based on physics of sediment and hydraulics of rivers) for their model to develop the statistical estimation of sediment concentration. They concluded that the use of this technique is superior to other methods for sediment concentration prediction; however, as with ANN and SVM, it should not be used for prediction outside the range of the training data.

Huang and Wu (2008) examined the use of RVM to predict stock indices. Similarly to Partal and Cigizoglu (2008) with ANN, Huang and Wu (2008) combined the RVM algorithm with wavelet techniques to build their model, using wavelets to extract patterns from the variables’ time series. The extracted features were then used as the RVM input to make predictions. The RVM/wavelet results were statistically compared with the SVM and other traditional methods using standard deviation, skewness, and kurtosis, and it was found that the use of RVM gave better prediction results.

Wong et al. (2008) worked on a fully automated emotion recognition system on the basis of facial analysis using RVM. Their research was based on dividing the recognition system into four components (only the first two components were used in this research). Using different types of kernels to train their model, their results, using a database of facial expression data, showed detection rates of over 96% for the different kernels used, while the detection rate was less than 51% using a nonfacial database (recognition of objects).

Yuan et al. (2007) used RVM with cross validation to optimize a seed separation process. They used the cross-validation process to minimize the approximate error of the data. They then compared the results of this model to an SVM model with cross validation statistically using the root mean square error (RMSE). The results from their statistical analysis supported this approach for the proposed model.

Silva and Ribeiro (2007) examined the RVM for the purpose of text classification, using different types of kernels to help in defining a higher dimensional space. The results of the model were compared to the Reuters benchmark and showed a performance improvement of 10% when compared with a commonly adopted text classification benchmark.

The adequacy of a water body for supporting aquatic life in the presence of sediment is often assessed in terms of turbidity (NTU) (DEQ 2011), a measure related to the amount of fine sediment in water that aggregates the degree to which particulates reflect light over all particle sizes. Physics-based sediment-transport models require detailed information about the temporally and spatially variable physical characteristics of the sediment. Developing a model to describe the turbidity in shallow lakes discourages consideration of physics-based models, which predict sediment transport based on the sediment’s physical characteristics rather than indirect measures such as turbidity.

The review of the previous methods, the need to verify the ability of data-driven models to offer the public an easy framework that can model turbidity and other water-quality constituents without dealing with the complex data requirements of physics-based models, and the water-quality criteria requirements all support the exploration of the RVM to study very fine sediment using turbidity as a surrogate. Hence, the multivariate relevance vector machine (MVRVM) was used in this study to describe and model the various parameters, and it proved effective with limited observations. The authors present here a framework and a brief explanation to help scientists/users interested in using MVRVM to arrange their data in order for the model to work.

RVM Model Structure

The goal of any model is to provide predictions that faithfully represent the target observations with as simple a formulation as the observations allow. As discussed previously, RVMs are sparse data-driven models that use techniques pioneered in pattern-recognition applications. The RVM adopts a Bayesian approach to learn which observations in a data set, $x$, are key to reproducing the patterns represented by those observations, and seeks sparsity by using only those observations that contain independent useful information about the process being modeled.

The RVM model is fitted to a set of target observations of a particular type, $n$ (suspended solids, dissolved oxygen, etc.), by first creating a kernel function, $\Phi^{(n)}(x_i^{(n)}, x_{d,i})$, that represents both the influence of underlying system drivers $x_{d,i}$ and the observations for type $n$, $x_i^{(n)}$, and then defining a set of weights, $w$, that multiply the kernel function. These products are then summed to form the vector of predicted values for observation type $n$. The RVM algorithm then modifies the weights, $w$, to minimize the discrepancy between the observed target values and the corresponding predicted values. Sparsity is achieved when one or more of the estimated weights equals zero, indicating that the corresponding observations do not significantly improve the model. Mathematically, this is represented by a matrix with most of the elements equal to zero, while the nonzero elements are used for prediction. The importance of sparsity is to minimize the amount of data required for observations. Once the relevant observation vectors are identified, this information can be used to improve the design of future monitoring campaigns.

Mathematically, the predicted value for the target observations is given by

$$y^{(n)}\left(x; w^{(n)}\right) = \sum_{i=1}^{N} w_i^{(n)}\,\Phi^{(n)}\left(x_i^{(n)}, x_{d,i}\right) + w_0^{(n)} \tag{1}$$



$$= \sum_{i=0}^{N} w_i^{(n)}\,\Phi^{(n)}\left(x_i^{(n)}, x_{d,i}\right) \tag{2}$$

$$= w^{T}\,\Phi\left(x_i^{(n)}, x_{d,i}\right) \tag{3}$$

where $y^{(n)}(x; w^{(n)})$ = vector of predictions for variable of type $n$ given the observations matrix, $x$, and the vector of weights for variable of type $n$, $w^{(n)}$; and the kernel function $\Phi^{(n)}(x_i^{(n)}, x_{d,i})$ = inner product of a mapping function for observations that relates the system drivers and the target observations of variable type $n$. Although the mapping function is general, here the authors assume a Gaussian kernel yielding

$$\Phi^{(n)}\left(x_i^{(n)}, x_{d,i}\right) = \exp\left(-r^{2}\,\lVert x_{d,i} - x_i^{(n)} \rVert^{2}\right) \tag{4}$$

where $r$ = kernel width (selected and fixed for a particular RVM model) that provides the multiplane representation of $x$. The targets (observations matrix used to train the MVRVM model) are samples from the observations, which will contain errors after training: $t_n = y(x_n; w) + \varepsilon_n$, where $\varepsilon$ is independent zero-mean Gaussian noise with variance $\sigma^2$, and $\varepsilon \sim N(0, \sigma^2)$.

From this, it can be inferred that the probability distribution of $t_n$, conditioned on the observations $x$, is

$$p(t_n \mid x) = N\left[t_n \mid y(x_n), \sigma^{2}\right] \tag{5}$$

The likelihood of the complete data set is represented by

$$p(t \mid w, \sigma^{2}) = (2\pi\sigma^{2})^{-N/2} \exp\left\{-\frac{1}{2\sigma^{2}}\,\lVert t - \Phi w \rVert^{2}\right\} \tag{6}$$

where

$$t = [t_1, t_2, \ldots, t_N]^{T}; \quad N \times 1 \text{ vector} \tag{7}$$

$$w = [w_0, w_1, \ldots, w_N]^{T}; \quad (N+1) \times 1 \text{ vector} \tag{8}$$

$$\Phi = [\Phi(x_1), \Phi(x_2), \ldots, \Phi(x_N)]^{T}; \quad N \times (N+1) \text{ matrix} \tag{9}$$

The Bayesian training algorithm requires the definition of explicit prior distributions for the weights

$$p(w \mid \alpha) = \prod_{i=0}^{N} N\left(w_i \mid 0, \alpha_i^{-1}\right) \tag{10}$$

where $\alpha$ = vector of $(N+1)$ prior parameters. For a given test point $x_*$ the authors predict the probability of $t_*$

$$p(t_* \mid t) = \int p(t_* \mid w, \alpha, \sigma^{2})\, p(w, \alpha, \sigma^{2} \mid t)\, dw\, d\alpha\, d\sigma^{2} \tag{11}$$

where $p(w, \alpha, \sigma^{2} \mid t)$ is defined by Bayes rule as

$$p(w, \alpha, \sigma^{2} \mid t) = \frac{p(t \mid w, \alpha, \sigma^{2}) \cdot p(w, \alpha, \sigma^{2})}{p(t)} \tag{12}$$

Tipping (2001) mentioned that “mathematicians cannot perform these computations in full analytically, and must seek an approximation. Mathematicians cannot compute the posterior $p(w, \alpha, \sigma^{2} \mid t)$ directly; instead, mathematicians decompose the posterior as”

$$p(w, \alpha, \sigma^{2} \mid t) = p(w \mid t, \alpha, \sigma^{2}) \cdot p(\alpha, \sigma^{2} \mid t) = \frac{p(t \mid w, \sigma^{2}) \cdot p(w, \alpha)}{\int p(t \mid w, \sigma^{2})\, p(w, \alpha)\, dw} \tag{13}$$

The posterior over the weight (constrained over the distribution of weights) is expressed as

$$p(w \mid t, \alpha, \sigma^{2}) = \frac{p(t \mid w, \sigma^{2}) \cdot p(w, \alpha)}{\int p(t \mid w, \sigma^{2})\, p(w, \alpha)\, dw} \tag{14}$$

All the probability density functions are Gaussian; thus, mathematicians can obtain an analytical expression for the posterior probability density function over the weight

$$p(w \mid t, \alpha, \sigma^{2}) = (2\pi)^{-(N+1)/2}\, \lvert \Sigma \rvert^{-1/2} \exp\left\{-\frac{(w - \mu)^{T}\, \Sigma^{-1}\, (w - \mu)}{2}\right\} \tag{15}$$

$$\Sigma = \left(\sigma^{-2}\, \Phi^{T} \Phi + A\right)^{-1} \tag{16}$$

$$A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N) \tag{17}$$

$$\mu = \sigma^{-2}\, \Sigma\, \Phi^{T} t \tag{18}$$

Relevance vector learning thus becomes the search for the (multidimensional) hyperparameters that maximize

$$p(\alpha, \sigma^{2} \mid t) \propto p(t \mid \alpha, \sigma^{2})\, p(\alpha)\, p(\sigma^{2}) \tag{19}$$

with respect to $\alpha$ and $\sigma^{2}$.

Objectives and Experimental Design

This paper is focused on developing and testing the MVRVM as a mathematical algorithm that can be used to predict patterns in the concentration of suspended fine sediment and other environmental constituents, as well as to help find how many observations are required to model the complex hydraulics, sediment, and water-quality constituents. The MVRVM has not been used in many studies to predict sediment concentration in estuaries and lakes (Dogan et al. 2009).

This paper provides a detailed explanation of how the data were arranged in order to help other interested scientists replicate the experiment; also, the authors examine whether the MVRVM is able to carry out predictions for suspended fine sediment and other water-quality constituents.

Experimental Design

The study was carried out using Mud Lake, a part of the Bear River National Wildlife refuge in southeastern Idaho, United States (Fig. 1), that functions as a sediment trap for flows into the adjacent Bear Lake in addition to functioning as a habitat to support migratory species.

The evaluation of environmental quality in Mud Lake to ensure that it satisfies its beneficial uses can be improved by successful modeling of environmental constituents. However, the hydraulic operations, limited observations, and the variable particle size of sediment transported through Mud Lake present challenges to selecting an appropriate modeling technique or sediment transport function.



Fig. 1. Mud Lake map as (a) bathymetry; (b) sampling locations and zones (dots mark the sample locations of change of hydraulics) (data from Batt and Stevens 2013). Map labels: Paris Dike, Rainbow Canal, Irrigation Canal, Lifton Causeway, Zones I, II, and III, numbered sampling locations, and a scale bar from 0 to 1,900 m.

To test and validate the MVRVM for the objective, data were collected as detailed in Batt and Stevens (2013), and consist of concentrations of fine suspended sediment, turbidity, dissolved oxygen, pH, and temperature at 30 locations in Mud Lake, measured biweekly over two ice-free seasons in 2009 and 2010.

The investigation of the use of the MVRVM model requires the existence of representative patterns (spatial and temporal) of observations at many locations (Batt and Stevens 2013). Consistent patterns in the data were the key requirement to support the use of the MVRVM with limited observations. Initially, it was assumed that the flow hydrodynamics (flow velocity, depth, and direction) in Mud Lake represented the major driving force for all variables. This assumption was found to be inadequate in the case of modeling the water-quality constituents, as their successful prediction required the collection of additional data that were not considered during the preliminary data collection, namely the effect of vegetation and algae on the observations for dissolved oxygen (DO) and pH.

Mathematically, each function has a specific shape of its distribution; the collected observations fit a specific distribution that can describe and model the observations. The authors selected the Gaussian kernel, tuned by changing the kernel width, instead of the Laplace polynomial, homogeneous polynomial, and linear spline kernels. The authors made this decision after investigating the observations collected over 2 years; these observations revealed a distribution similar to Gaussian data, which strongly suggested that the Gaussian kernel can be used to model these observations. However, the authors examined the other kernel functions based on the shape of the collected observations; after testing these kernel functions, none of them worked with the data and the model crashed in the training phase. Hence, the decision to use the Gaussian kernel to model the observations was confirmed.

For the Mud Lake case study, 30 locations were selected for observation, and sediment and water-quality observations were made biweekly during the ice-free periods from April to October 2009 and 2010. Details of the data collection efforts are provided in Batt and Stevens (2013), and the observation set consists of time series for the levels of the six constituents at each of the 30 locations.

The model runs for training, verifying, and predicting required arrangement of the data according to the time and location of the observation. The time-series observations consist of 25 days of observation. Based on the preliminary analysis of the relevance vectors (RVs) for total suspended solids (TSS), it was found that 22 days were required for training the MVRVM, while the testing consisted of 3 days. The data are arranged in matrix form for all the constituents and the location of each observation. The input data consist of a time-series observation matrix for water-quality parameters, velocity vector magnitude, and turbidity for the four input locations in the lake (stations 1, 8, 7, and 25 in Fig. 1), while the MVRVM output data consist of a time-series observation matrix for the 30 locations for water-quality parameters, velocity vector magnitude, and turbidity in the lake. The algorithm is executed while changing



the model width, the kernel equation, and number of iterations. The output is then used together with the field observations to plot the RMSE and residuals in order to identify the error in the MVRVM algorithm parameters and thus what parameters should be changed to minimize the errors.

The MVRVM training and verification were done using a library created for MATLAB. The model runs in this study took 5–15 min to select the required RVs for each constituent using a dual-core 2.4-GHz machine. Jolliffe (1974) mentioned that quartile analysis is easy to understand and is considered an easy way to summarize the data; he also mentioned that the quartiles are useful to summarize data and are not influenced by extreme observations. Thus the authors chose to present the observations and modeled data using the quartiles, to avoid relying on any single extreme observation that could mislead readers about the range that prevails in the observations.

Results and Discussion

The water-quality observation matrix described in Batt and Stevens (2013) was used for training the MVRVM algorithm. During training, the MVRVM algorithm selects observations that provide relevant information to the model based on Bayesian probability theory. Once an observation is selected as relevant, the remaining observations at the same time are added to the prediction matrix. This process continues until the addition of a new observation supplies no new information for the model, thus creating a matrix of

Fig. 2. Box plots of collected observations in the 30 locations of Mud Lake during 2009–2010 and predicted MVRVM output as (a) TSS, mg/L; (b) turbidity (NTU); (c) DO, mg/L; (d) pH, std. units; (e) temperature °C; (f) velocity cm/s; the shaded bars contain the observation distributions and MVRVM prediction distributions for each location



vector observations containing the string of important vector information, after which convergence is declared. The term Relevance Vector can be misleading in some cases; it actually refers to the single observation with high probability of being selected by the model as a vector. Sparsity is measured by the fraction of the total number of observations that are significant; as an example in the case of velocity vector, when the model selected 15 RVs it meant that 15 out of 660 observations were significant. The RVs selected as a subset of the training data are then used for the prediction in the algorithm (Batt and Stevens 2014).

Batt and Stevens (2014) described the complexity of the model as proportional to the number of the selected RVs, and it was expected that the number of RVs would likely change depending on the complexity of the observed pattern. The number of relevant vectors relative to the total size of the set of observations is the measure of the model’s sparsity. Here, the authors express this measure of sparsity as the percentage of the total number of observations that are included in the set of RVs.

The results from the MVRVM model are provided as box-and-whisker plots in Fig. 2, [where Figs. 2(a–f) represent TSS, mg/L, turbidity (NTU), DO, mg/L, pH, std. units, temperature °C, and velocity cm/s, respectively]; each adjacent box-and-whisker pair represents the distributions of the observations (unshaded box, left) and predictions (shaded box, right). The MVRVM model was capable of predicting the water-quality constituents and of capturing the patterns of change in the different locations in the case study. The quality assurance/control for collecting observations (Batt and Stevens 2013) and the design of the experiment minimized any effect of changing the number of iterations that are used to run the MVRVM. For the water-quality constituents tested here, the fact that the MVRVM parameter estimation routine readily converged before all data vectors were used suggests that the data collected were sufficient for the MVRVM algorithm.

Residual plots for the tested constituents (Fig. 3) showed that the residuals (observed minus predicted levels of the constituents at each time/location) are centered around zero and do not follow a specific pattern (random). Fig. 4 shows that the observations and predictions do not perfectly fall on the 45-degree line of agreement and thus error exists but is evenly spread over all locations.
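The sparsity measure described in this section (RVs as a percentage of all observations) can be illustrated with a short Python sketch. It is not the authors' MATLAB code. The RV counts are those reported in the Results and Discussion; the total of 660 observations is quoted in the text only for velocity and is assumed here to apply to every constituent.

```python
# Sparsity = percentage of observations retained as relevance vectors (RVs).
# ASSUMPTION: 660 observations per constituent (the text quotes 660 for velocity only).
TOTAL_OBSERVATIONS = 660

# RV counts as reported in the Results and Discussion.
rv_counts = {
    "temperature": 6,
    "DO": 13,
    "velocity": 15,
    "pH": 16,
    "turbidity": 16,
    "TSS": 22,
}


def sparsity_percent(n_rvs: int, n_obs: int) -> float:
    """Return the RVs as a percentage of the total number of observations."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return 100.0 * n_rvs / n_obs


sparsity = {
    name: sparsity_percent(count, TOTAL_OBSERVATIONS)
    for name, count in rv_counts.items()
}

for name, pct in sparsity.items():
    print(f"{name:12s} {rv_counts[name]:3d} RVs  {pct:.1f}% of observations")
```

Under that assumption, the velocity case (15 of 660) works out to about 2.3%, and the temperature case (6 RVs) to about 0.9%, which agrees with the percentage quoted in the text for temperature.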

Fig. 3. Residual plots of observations in the 30 locations of Mud Lake during 2009–2010 versus the predicted MVRVM output as (a) TSS, mg/L; (b) turbidity (NTU); (c) DO, mg/L; (d) pH, std. units; (e) temperature °C; (f) velocity cm/s; the different plotting symbols represent different sampling dates
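The residuals shown in Fig. 3 are observed minus predicted values, and the RMSE summarizes their magnitude. The following Python sketch shows one way to compute both. The paper does not state how the RMSE percentage is normalized, so dividing by the observed range is an assumption made only for illustration, and the sample values are hypothetical.

```python
import math


def residuals(observed, predicted):
    """Observed minus predicted, element by element."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    return [o - p for o, p in zip(observed, predicted)]


def rmse(observed, predicted):
    """Root-mean-square error between observations and predictions."""
    res = residuals(observed, predicted)
    return math.sqrt(sum(r * r for r in res) / len(res))


def rmse_percent_of_range(observed, predicted):
    """RMSE as a percentage of the observed range.

    ASSUMPTION: normalizing by the range is illustrative; the source does
    not say which normalization the authors used.
    """
    span = max(observed) - min(observed)
    if span == 0:
        raise ValueError("observed values have zero range")
    return 100.0 * rmse(observed, predicted) / span


# Hypothetical values, not data from Mud Lake.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 31.0, 39.0, 50.0]

res = residuals(observed, predicted)
mean_residual = sum(res) / len(res)
print("residuals:", res)
print("mean residual:", mean_residual)
print("RMSE % of range:", round(rmse_percent_of_range(observed, predicted), 2))
```

A mean residual near zero with no trend across locations or dates is the behavior the text describes as residuals centered around zero without a specific pattern.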



Fig. 4. Scaled plots of observations in the 30 locations of Mud Lake during 2009–2010 versus the predicted MVRVM output as: (a) TSS, mg/L; (b) turbidity (NTU); (c) DO, mg/L; (d) pH, std. units; (e) temperature °C; (f) velocity cm/s; the different plotting symbols represent different sampling dates. [Panel axes: horizontal, scaled observation; vertical, scaled RVM output; both run from 0 to 1.]
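The scaled comparison in Fig. 4 can be sketched in Python as follows. The paper does not say how the values were scaled; this sketch assumes min-max scaling to the interval 0 to 1 using the observed minimum and maximum for both series, so that perfect predictions fall on the 45-degree line. The sample values are hypothetical.

```python
def scale_to_unit(values, lo, hi):
    """Min-max scale values using the supplied lower and upper bounds."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values, not data from Mud Lake.
observed = [0.0, 25.0, 50.0, 100.0]
predicted = [5.0, 20.0, 55.0, 100.0]

# ASSUMPTION: both series share the observed bounds so the 1:1 line is preserved.
lo, hi = min(observed), max(observed)
obs_scaled = scale_to_unit(observed, lo, hi)
pred_scaled = scale_to_unit(predicted, lo, hi)

# Vertical distance of each point from the 45-degree line of agreement.
deviations = [p - o for o, p in zip(obs_scaled, pred_scaled)]
max_abs_deviation = max(abs(d) for d in deviations)

print("scaled observations:", obs_scaled)
print("scaled predictions: ", pred_scaled)
print("largest departure from the 45-degree line:", round(max_abs_deviation, 3))
```

Scaling both series with the same bounds is a design choice: scaling each series by its own minimum and maximum would hide a systematic offset between predictions and observations.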

The range of percentile observations follows three data patterns among the data-collection sites. For discussion purposes, locations with similar characteristics were grouped in three zones [for more information about zones see Fig. 1 and Batt and Stevens (2013)]:
1. Pattern type I: [Fig. 2(c), Table 1] (temperature) is characterized by no significant change in the percentile range of observations through all the locations. The RMSE was constant in all three zones, which was expected because the meteorological conditions that affected temperature were common for all spatial locations. The temperature pattern did not vary significantly from Zone 1 to Zone 3. Since the range of the temperature across locations for each time was small, only six RVs, about 0.9% of the observations, were required to adequately represent temperature.
2. Pattern type II: [Figs. 2(a and b), Table 1] pH and DO are characterized by a change in the percentile range of observations for all locations, especially in Zone 3, where there is an increase of vegetation and algae that affects the pH and DO observations during the daytime. The MVRVM algorithm was able to capture the range of percentile change for the pH and DO across all the observations. The pH and DO patterns are represented by 16 RVs and 13 RVs, respectively. The RMSE for the pH and DO increased from Zone 1 to Zones 2 and 3, likely due to the increasing amounts of vegetation from Zone 1 through Zones 2 and 3.
3. Pattern type III: [Figs. 2(d–f), Table 1] (turbidity, TSS, and velocity magnitude) is described by a random level with larger variability for percentiles in the first zone, decreased variability in the second zone, and then dropping to near zero for both level and variability in the third zone. The MVRVM algorithm reflected this change in the output of the model compared to the field observations. The turbidity is a very interesting constituent because it is related to the TSS and is measured under the same conditions as the TSS. However, the accuracy of measuring turbidity is higher. This accuracy leads to selecting 16 RVs, which is fewer than used for the TSS (22 RVs).

In the case of velocity vector, the noise in collecting the observations and the small magnitude of velocity in Mud Lake resulted in the number of RVs being 15, implying similar results as TSS and turbidity. However, the selection of the 15 relevant vectors is an artifact of the algorithm that is not robust when the signal-to-noise ratio is low. This number would likely change in case of existence



Table 1. Observed and Modeled Constituents Percentile, and RMSE Percentage With Respect To Zones in Mud Lake, As DO mg/L, pH, Temperature °C, TSS mg/L, Turbidity NTU, and Velocity cm/s

Constituent                       Zone    Data        Minimum  25th pct  50th pct  75th pct   Maximum  RMSE %
DO ± 0.2 mg/L                     Zone 1  Observed       4.92      7.44      8.11      9.33     11.48       -
                                          MVRVM          5.04      7.17      7.90      9.14     11.46       5
                                  Zone 2  Observed       4.48      6.81      7.88      9.35     11.19       -
                                          MVRVM          6.27      7.54      8.28      9.25     11.51      10
                                  Zone 3  Observed       5.95      8.21      9.01     10.37     14.63       -
                                          MVRVM          6.31      8.38      9.25      9.76     12.30      16
pH ± 0.2                          Zone 1  Observed       6.35      7.98      8.12      8.26      8.95       -
                                          MVRVM          6.35      7.96      8.11      8.23      8.82      10
                                  Zone 2  Observed       7.74      8.17      8.33      8.56      9.12       -
                                          MVRVM          7.82      8.20      8.33      8.52      9.09      15
                                  Zone 3  Observed       5.22      8.32      8.62      9.07     10.18       -
                                          MVRVM          6.74      8.34      8.61      9.00      9.90      14
Temperature ± 0.1°C               Zone 1  Observed       1.46      7.20     14.27     17.93     23.81       -
                                          MVRVM          2.16      7.27     14.42     17.65     23.79     3.5
                                  Zone 2  Observed       1.41      7.91     14.88     18.82     25.01       -
                                          MVRVM          2.46      7.72     15.17     19.33     23.93     3.5
                                  Zone 3  Observed       0.97      8.19     15.07     19.28     25.56       -
                                          MVRVM          2.21      7.93     15.44     19.69     24.32     3.5
TSS ± 15% mg/L                    Zone 1  Observed          0     14.67     27.67     52.00    170.67       -
                                          MVRVM             0     14.80     25.15     51.14    194.60       6
                                  Zone 2  Observed          0     10.67     17.00     34.33    108.67       -
                                          MVRVM             0      8.39     14.66     28.08    108.28       6
                                  Zone 3  Observed          0      3.33      6.00     10.67    100.00       -
                                          MVRVM             0      1.85      3.80      8.47     99.78      14
Turbidity ± 3% NTU                Zone 1  Observed          0     19.38     49.75     94.55    357.30       -
                                          MVRVM             0     22.74     50.02     90.04    357.33       2
                                  Zone 2  Observed          0     13.50     29.15     60.20    175.30       -
                                          MVRVM          1.04     15.97     29.32     61.04    164.85      10
                                  Zone 3  Observed          0         0      2.95     14.48     78.40       -
                                          MVRVM             0      0.80      3.93     14.32     78.30      10
Velocity magnitude 2% ± 1.5 cm/s  Zone 1  Observed          0      1.72      5.22     10.73     55.11       -
                                          MVRVM             0      1.65      4.45      8.48     54.92      11
                                  Zone 2  Observed          0      0.07      0.40      3.14     91.64       -
                                          MVRVM             0      0.09      0.39      3.24     91.64      10
                                  Zone 3  Observed          0         0      0.01      2.22    156.66       -
                                          MVRVM             0         0      0.02      2.13    156.48       1
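The minimum, quartile, and maximum columns of Table 1 form a five-number summary. The Python sketch below shows how such a summary can be computed and why the quartiles are not affected by a single extreme observation, which is the property attributed to Jolliffe (1974) earlier in the paper. The data are hypothetical, and the inclusive interpolation rule is an assumption because the paper does not say which quartile method was used.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, 25th, 50th, 75th percentile, maximum).

    ASSUMPTION: quartiles use the 'inclusive' interpolation rule; the
    source does not state which rule the authors applied.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two observations")
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (data[0], q1, q2, q3, data[-1])


# Hypothetical samples: identical except for one extreme value.
typical = [1.0, 2.0, 3.0, 4.0, 5.0]
with_outlier = [1.0, 2.0, 3.0, 4.0, 500.0]

summary_typical = five_number_summary(typical)
summary_outlier = five_number_summary(with_outlier)

print("typical:     ", summary_typical)
print("with outlier:", summary_outlier)
```

In this example only the maximum changes when the extreme value is introduced; the three quartiles stay the same, which is why a box plot conveys the prevailing range more reliably than the extremes alone.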

of observations that are above the detection limit of the equipment used.

The turbidity RMSE in Zone 1 (2%, which is less than the device error) increased to 10% in Zones 2 and 3, which indicates that Zones 2 and 3 have the same error. The source of error can result from more complex hydraulics in the deeper waters of Zone 1 compared to the shallower regions in Zones 2 and 3. TSS is heterogeneous, and during collection of grab samples, the authors might capture clumps of sediment in either the sample or the smaller portion taken for filtering, which actually does not represent all of the water. These problems lead to errors in observing TSS as high as 10–20% of the mean concentration. The average RMSE was small for Zones 1 and 2 (6%) and larger for Zone 3 (14%); however, the errors in the three zones are within range of the typical measurement error for TSS of 10–20%. The RMSE for velocity seemed high in the first two zones, but the magnitudes of the velocities were small and close to the error in the device accuracy. Also, the RMSE in Zone 3 is 1%, but this low level of error does not mean there is no error, because the magnitude of the velocities of this zone is close to zero.

The results shown in preceding paragraphs confirmed that the MVRVM is able to model each water-quality constituent. The success of prediction varied depending on the type of constituent tested and the complexity of the hydraulics affecting the sampling location. The MVRVM demonstrated an ability to vary with the minimum and maximum range of the parameters. The MVRVM method found an average number of observations that contribute important information (RVs) between 1 and 3% of the total number of observations. However, the question remains concerning whether this number of RVs means that 3% of the observations are sufficient to model complex water-quality constituents. In their next paper, the authors examine in more detail how much data are required to successfully model the complex water-quality constituents and how the RVs are related to these data.

Conclusions

This paper is the first study to consider the use of MVRVM to model suspended fine-sediment transport and other water-quality constituents in complex natural systems as in the case of Mud Lake. The MVRVM output demonstrated the capability of the method to capture the spatial and temporal change in patterns in observations for suspended fine sediment and a variety of water-quality constituents. The assumption of using the sample location to construct the MVRVM for modeling the selected water-quality parameters and suspended fine sediment has proven to work adequately for all the tested constituents.

The MVRVM results showed a changing RMSE from zone to zone based on the type of constituent tested and the sampling



location. It is suggested that additional types of observations (e.g., algae, other vegetation) that may influence the selected constituents should be included in future work. For example, for DO and pH, including the amount of algae or other vegetation present near the sampling locations in Mud Lake may improve the predictive ability of the MVRVM model.

The number of MVRVM relevance vectors changed according to the complexity of the modeled pattern. This information could be used to inform the design of monitoring programs intended for MVRVM modeling.

Future research can be focused on identifying the amount of nutrient phosphorus that flows into Mud Lake since nutrients, especially phosphorus, are often attached to sediment particles. The MVRVM can be used in the same way as discussed in this paper to model the spatial distribution of phosphorus.

Future research can also focus on considering algae and vegetation effects on the observation of pH and DO; observations related to their effect can be used as another parameter and modeled in the same way as the parameters considered in this research.

Acknowledgments

The authors would like to thank the Utah Water Research Laboratory for funding this research. The authors would like also to thank the US Fish and Wildlife Service and PacifiCorp for their support during data collection. Special thanks to the Utah Water Lab team: Jim Millesan, Mark Winklaar, Chris Thomas, Shannon Clemens, Austin Jensen, Jeff Horsburgh, and Cody Allen.

References

Batt, H., and Stevens, D. (2013). “Relevance vector machine models of suspended fine sediment transport in a shallow lake. I: Data collection.” Environ. Eng. Sci., 30(11), 681–688.
Batt, H., and Stevens, D. (2014). “How to utilize relevance vectors to collect required data for modeling water quality constituents and fine sediment in natural systems: Case study on Mud Lake, Idaho.” J. Environ. Eng., 10.1061/(ASCE)EE.1943-7870.0000858, 06014003.
Chau, K. W. (2007). “An ontology-based knowledge management system for flow and water quality modeling.” Adv. Eng. Software, 38(3), 172–181.
Chen, W., and Chau, K. W. (2006). “Intelligent manipulation and calibration of parameters for hydrological models.” Int. J. Environ. Pollut., 28(3–4), 432–447.
Cheng, C. T., Chau, K. W., and Sun, Y. G. (2005). “Long-term prediction of discharges in Manwan reservoir using artificial neural network models.” Advances in neural networks, Vol. 3498, Springer, Berlin, 1040–1045.
DEQ (Department of Environmental Quality Idaho). (2011). “Water quality standards.” 〈http://adminrules.idaho.gov/rules/2011/58/0102.pdf〉 (Mar. 12, 2011).
Dogan, E., Tripathi, S., and Govindaraju, D. (2007). “Application of relevance vector machine for sediment transport estimation.” World Environmental and Water Resources Congress 2007, ASCE, Reston, VA, 1–10.
Garcia, M. H. (2002). “Sedimentation engineering: Processes, measurements, modeling, and practice.” ASCE Manual of Practice 110, ASCE, Reston, VA.
Goel, A., and Pal, M. (2008). “Application of support vector machines in scour prediction on grade-control structures.” Eng. Appl. Artif. Intell., 22(2009), 216–223.
Huang, S. C., and Wu, T. K. (2008). “Combining wavelet-based feature extractions with relevance vector machines for stock index forecasting.” Expert Syst., 25(2), 133–149.
Jain, S. K. (2001). “Development of integrated sediment rating curves using ANNs.” J. Hydraul. Eng., 10.1061/(ASCE)0733-9429(2001)127:1(30), 30–37.
Jolliffe, F. R. (1974). Commonsense statistics for economists and others (students library of economics), Routledge and Kegan Paul, London.
Lizhong, X., Jianying, W., Ju, G., and Huang, F. (2007). “A support vector machine model for mapping of lake water quality from remote-sensed images.” Int. J. Intell. Comput. Med. Sci. Image Proc., 1(1), 57–66.
MATLAB [Computer software]. Natick, MA, MathWorks.
Misra, D., Oommen, T., Agarwal, A., and Thompson, A. (2009). “Application and analysis of support vector machine based simulation for runoff and sediment field.” Biosyst. Eng., 103(4), 527–535.
Muttil, N., and Chau, K. (2006). “Neural network and genetic programming for modelling coastal algal blooms.” Int. J. Environ. Pollut., 28(3–4), 223–238.
Nagy, H. M., Watanabe, K., and Hirano, M. (2002). “Prediction of sediment load concentration in rivers using artificial neural network model.” J. Hydraul. Eng., 10.1061/(ASCE)0733-9429(2002)128:6(588), 588–595.
Partal, T., and Cigizoglu, K. (2008). “Estimation and forecasting of daily suspended sediment data using wavelet-neural networks.” J. Hydrol., 358(3–4), 317–331.
Sen, Z., and Altunkaynak, A. (2004). “Sediment concentration and its prediction by perceptron Kalman filtering procedure.” J. Hydraul. Eng., 10.1061/(ASCE)0733-9429(2004)130:8(816), 816–826.
Silva, C., and Ribeiro, B. (2007). “Combining active learning and relevance vector machines for text classification, machine learning and applications.” 6th Int. Conf. on ICMLA 2007, IEEE, New York.
Singh, K. K., Pal, M., Ojha, C. S. P., and Singh, V. P. (2008). “Estimation of removal efficiency for settling basins using neural networks and support vector machines.” J. Hydrol. Eng., 10.1061/(ASCE)1084-0699(2008)13:3(146), 146–155.
Taormina, R., Chau, K., and Sethi, R. (2012). “Artificial neural network simulation of hourly ground water levels in a coastal aquifer system of the Venice lagoon.” Eng. Appl. Artif. Intell., 25(8), 1670–1676.
Tipping, M. E. (2001). “Sparse Bayesian learning and the relevance vector machine.” J. Mach. Learn. Res., 1(3), 1–244.
Wong, W. S., Chan, W., Datcu, D., and Rothkrantz, L. J. M. (2008). “Using a sparse learning relevance vector machine in facial expression recognition.” Technical Rep., Man-Machine Interaction Group, Delft Univ. of Technology, Delft, Netherlands.
Wu, C. L., Chau, K. W., and Li, Y. S. (2009). “Predicting monthly streamflow using data-driven models coupled with data-preprocessing techniques.” Water Resour. Res., 45(8), W08432.
Yuan, J., Wang, K., and Yu, T. (2007). “Integrating relevance vector machines and genetic algorithms for optimization of seed-separating process.” Eng. Appl. Artif. Intell., 20(7), 970–979.
