
Wat. Res. Vol. 29, No. 4, pp. 995-1004, 1995
Elsevier Science Ltd. Printed in Great Britain
Pergamon

DYNAMIC MODELLING OF THE ACTIVATED SLUDGE PROCESS: IMPROVING PREDICTION USING NEURAL NETWORKS

MARTIN CÔTÉ 1, BERNARD P. A. GRANDJEAN 1, PAUL LESSARD 2 and JULES THIBAULT 1
1 Département de génie chimique, 2 Département de génie civil, Université Laval, Québec, Qc, Canada G1K 7P4

(First received January 1994; accepted in revised form October 1994)

Abstract: A procedure has been developed to improve the accuracy of an existing mechanistic model of
the activated sludge process, previously described by Lessard and Beck [Wat. Res. 27, 963-978 (1993)].
As a first step, optimization of the numerous model parameters has been investigated using the downhill
simplex method in order to minimize the sum of the squares of the errors between predicted and
experimental values of appropriate variables. Optimization of various sets of parameters has shown that
the accuracy of the mechanistic model, especially on the prediction of the dissolved oxygen (DO) in the
mixed liquor, can be easily improved by adjusting only the values of the overall oxygen transfer
coefficients, KLa. Then, in a second step, neural network models have been used successfully to predict
the remaining errors of the optimized mechanistic model. The coupling of the mechanistic model with
neural network models resulted in a hybrid model yielding accurate simulations of five key variables of
the activated sludge process.

Key words: dynamic modelling, activated sludge, neural networks, wastewater

INTRODUCTION

The activated sludge process, comprising a biological reactor and a secondary settler, is widely used as secondary treatment for both municipal and industrial wastewaters. The effective control of such a process depends, in part, on the ability to simulate the dynamics of both the biological reactor and the secondary clarifier. A lot of effort has therefore been devoted to the modelling of the activated sludge process since the early 1970's (Lessard and Beck, 1991).

However, even though many models and types of models have been proposed to simulate the dynamic behaviour of both the biological reactor and the secondary settler, very few studies looked at the interactions between these two units (Dupont and Henze, 1992). Moreover, very few models have been validated with real field data. Lessard and Beck (1993) presented the evaluation of a simplified version of the IAWPRC's model (Henze et al., 1987) with a set of data gathered during a 10 day intensive sampling campaign. The results were judged adequate although many improvements remained to be made for an accurate prediction of the process behaviour. Stokes et al. (1993) have also presented results from a validation process of an activated sludge plant. Their model could account for the general dynamic behaviour of the process but erratic predictions were sometimes obtained.

The problem of obtaining models to adequately represent the dynamic behaviour of field data is not an easy one. The nature of the process per se, the lack of good understanding and description of the phenomena involved, the availability of good, reliable and complete field data set, and the estimation of the numerous parameters involved are the major factors contributing to this problem. Some will argue that as long as more detailed mechanistic models are not developed, i.e. that a more profound understanding of the various phenomena taking place in the process is not available, it will be difficult to fit simulated responses with observed data. However, increase in model complexity will undoubtedly increase the number of parameters leading to problems of identifiability (Beck, 1986). On the other hand, some argue that to perform successfully process control, the use of simple empirical models (black-box) is compulsory. But oversimplification does not give insight into the process which is also necessary for better control and operation. It may suffice the short-term needs of control-system design to accept an adequate input/output representation of process dynamics, but this serves neither the urgent need of process learning for its own sake, nor the long-term needs of control (Beck, 1986).

*Author to whom all correspondence should be addressed.

This dilemma of mechanistic models vs black-box ones could be solved by approaching the problem in a spirit of compromise. For example, it is desired to model as accurately as possible the activated sludge process with a mechanistic model. However, it is neither possible nor desirable to spend considerable time and effort to describe particularities and nonidealities of a process. This is where a black-box model can play an important role by modelling the gap that exists between the mechanistic model and plant data. The advent of neural networks opens another door in the field of modelling as they can be coupled with mechanistic models to increase the prediction capabilities without necessarily increasing the mathematical complexities.

The objective of this paper is thus to report on the improvement of an existing mechanistic model of the activated sludge process using black-box models: in this case, neural networks. A short description of the mechanistic model used will first be presented, along with some brief comments on neural networks and their application in the field of wastewater treatment.

The mechanistic model

The mechanistic model used in this work is described in Lessard and Beck (1993). Roughly, it consists of a biological process model which is a simplified version of the IAWPRC's model (Henze et al., 1987) and a secondary settler model described by a clarification model consisting of an empirical relationship [based on Pflanz (1969) model] and a thickening model using the flux theory as described by Dick and Young (1972). The model can predict the final effluent concentration of suspended solids (SS), volatile suspended solids (VSS), total and soluble chemical oxygen demand (CODT and CODS), and ammonium-N (NH4). The behaviour of dissolved oxygen, and VSS in the mixed liquor and returned activated sludge is also simulated by the model. The hydraulics in the aerator are represented by four continuous stirred tank reactors (CSTR's).

This mechanistic model was used to simulate the behaviour of the activated sludge plant located at Norwich (England). A schematic flow sheet of this process is presented in Fig. 1. The main feature of this particular plant is the recycling of supernatants from the waste activated sludge consolidation tank to the head of the aerator. This sidestream is mixed with the settled sewage coming from the primary clarifiers. The hydraulic retention time is approx. 9 h for the aerator and 6 h for the secondary clarifier. The model was run with an influent data set that was collected over 10 days at the Norwich Sewage Works. The simulation results were then compared to the corresponding effluent data. The simulation was adequate for the VSS in returned activated sludge and the effluent NH4. Poorer fit was observed for effluent SS and CODT, and dissolved oxygen in mixed liquor. However, in the latter case, the simulated concentration and the experimental points followed the same trends. More details on the model, the sampling campaign, the plant and the evaluation of the model can be found in Lessard and Beck (1993).

[Figure: process flow sheet showing settled sewage, aerator, secondary clarifier, final effluent, waste activated sludge (WAS), consolidation tank, sump, sludge treatment and sludge drying beds.]
Fig. 1. The activated sludge process at Norwich, England with values to be fitted.

Neural network models

Among the various neural network models that exist (Wasserman, 1989; Lippman, 1987), backpropagation (or feedforward) neural network models are known to be very effective to capture the nonlinear relationships which exist between variables in
complex systems. In essence, a backpropagation neural network model can be viewed as a regression model between input and output variables. Various applications of these models have been recently reported in chemical engineering, mostly in process identification and control (Bhat and McAvoy, 1990; Thibault, 1991; Thibault and Grandjean, 1991; Hoskins et al., 1991; Psichogios and Ungar, 1991; Su et al., 1992; Venkatasubramanian and McAvoy, 1992; You and Nikolaou, 1993). Neural networks have also found applications in wastewater treatment, where they were used for instance to model the sludge volume index (SVI) in order to forecast sludge bulking (Capodaglio et al., 1991), or to predict the effect of heavy metals on the performance of the activated sludge process (Tyagi and Du, 1992). Boger (1992) reviewed various applications of neural networks in the field of wastewater engineering and discussed both advantages and limitations of the neural approach. According to him, the main advantage of neural networks is their ability to extract the underlying phenomena directly from historical plant data whereas other artificial intelligence systems such as expert systems need human intervention to encode knowledge about the process. However, Boger (1992) underlines the limit of using the neural network approach if the database is insufficient (neural model could lead to erroneous interpolations) or restricted to a narrow range of operating conditions (neural model could lead to erroneous extrapolations).

In this investigation, a three-layer feedforward neural network was used: that is a network with a single hidden layer as shown in Fig. 2. Using the nomenclature of Fig. 2, the neural network models consist of the following set of equations:

Hidden layer:

$$H_j = f\left(\sum_{i=1}^{I} W_{ij} U_i\right), \quad 1 \le j \le J-1 \qquad (1)$$

where U is the scaled input vector and H is the output vector of the neurons contained in the hidden layer. The last elements of these two vectors, $U_I$ and $H_J$, are the bias and they are set equal to 1.

Output layer:

$$S_k = f\left(\sum_{j=1}^{J} W_{jk} H_j\right), \quad 1 \le k \le K. \qquad (2)$$

The transfer function, f, was the sigmoid function 1/[1 + exp(-x)]. The coefficients $W_{ij}$ and $W_{jk}$ in the summations, which are usually referred to as the weights, are the fitting coefficients of the neural model.

[Figure: three-layer network with I input nodes U, J hidden nodes and K output nodes S, connected by the weights $W_{ij}$ and $W_{jk}$.]
Fig. 2. The feedforward neural network.

METHODOLOGY

The improvement of the Lessard and Beck (1993) activated sludge mechanistic model was done in two main steps. The first step consisted of optimizing the values of the various parameters used in the mechanistic model. Initially, Lessard and Beck (1993) estimated the model parameters using good judgement and a trial and error method. Parameter optimization has been performed using the downhill simplex method (Press et al., 1988) in order to minimize an objective function consisting of the sum of the squared deviations between experimental and predicted values for five variables: effluent SS, CODT, NH4, dissolved oxygen in the mixed liquor and VSS in the returned activated sludge (Fig. 1). The downhill simplex method has been chosen for its simplicity because it does not require the derivative of the objective cost function with respect to each parameter to be optimized. To have a similar weighting for each variable (x) in the objective function, their normalized values (x*) have been considered and defined as

$$x^* = \frac{x_{\max,\exp} - x}{x_{\max,\exp} - x_{\min,\exp}}. \qquad (3)$$

The criterion used to judge the fit was the sum of squared errors on these normalized values, defined as

$$S = \sum_{i=1} \sum_{m=1} \left[x^*_{i,\exp}(m) - x^*_{i,\mathrm{calc}}(m)\right]^2. \qquad (4)$$

Other criteria were also tested: the sum of squares of relative errors proved inadequate, since it did not weigh adequately the peaks observed in the experimental data. The sum of squares of absolute errors of non-normalized data also proved inadequate as it weighed more heavily the fitted variables having a greater magnitude.

The optimization termination criterion consisted of the following procedure. Starting from the initial set of parameters, the routine performed 400 iterations leading to a new set of optimized parameter values. Then, successive reruns of the routine were performed until parameters converged (Press et al., 1988).

All mechanistic models are only an approximation and cannot account for the complete dynamics of the process. The second step therefore consisted of the neural modelling of the errors between the simulated responses given by the optimized mechanistic model and the corresponding experimental values. Basically, these neural models can be viewed as error predictors of the key variables of the mechanistic model; the coupling of the neural network to the mechanistic model results in the hybrid model shown in Fig. 3. For each of the five key variables of the activated sludge process previously identified (Fig. 1), a neural network was developed to model the prediction errors between the mechanistic model and the experimental values. For each variable, the inputs of the neural models had to be selected among the available process variables; these inputs were selected using a cross-correlation algorithm between input and output variables, and a blend of trial-and-error and good judgement. Five small different neural networks have been preferred to a large neural model having five outputs, because the selection of the input variables depends on the output
variable. A large neural network model would then require too many inputs, most of which would not be relevant for the predictions of 4 outputs amongst the 5. The network's architecture being chosen, model parameters, which are the weights $W_{ij}$ and $W_{jk}$ (Fig. 2), were determined using a least-square regression algorithm. Parameter determination is usually known as the learning or training of the neural network model. In this investigation, the nonlinear regression algorithm, used for the learning procedure, was the BFGS version of a quasi-Newton method (Press et al., 1988). The quasi-Newton method was preferred to the more popular backpropagation algorithm which was found to be less accurate and to converge more slowly (Thibault and Grandjean, 1991).

[Figure: the inputs feed both the mechanistic model and the neural error predictor; the two responses are combined to give the output.]
Fig. 3. Schematic of the hybrid model.

Table 1. The average relative error on predictions (%)

Model    SS      CODT    NH4     DO      RAS VSS
1        47.8    37.1    69.2    260     6.6
2a       42.6    31.8    66.7    45.5    5.5
2b       47.8    40.2    69.1    51.2    6.6
3        17.1    16.3    46.0    18.3    2.6

1: Predictions using mechanistic model with initial parameters.
2a: Predictions using mechanistic model with 41 optimized parameters.
2b: Predictions using mechanistic model with 3 optimized KLa's.
3: Predictions using hybrid model.

A frequent problem in the use of neural networks is that they may overfit the experimental points: that is, the network learns a series of particular cases instead of computing the general tendency of the system. Such a neural network is therefore not reliable for predicting data that has not been part of its learning procedure. This may be caused either by an insufficient database (Boger, 1992) or by an excessive number of neurons in the hidden layer (Chitra, 1993). On the other hand, too few neurons do not allow the neural model to fully capture the underlying dynamics of the process. In this investigation, the number of neurons in the hidden layer has been varied between 3 and 6 and the optimum value has been retained. In order to reject neural network models that overfit, the available data set, consisting of 193 hourly samples (Lessard and Beck, 1993), was split in two parts. The first part consisted of data from time step 1 to time step 140, and was used for learning. The remaining data, from time step 141 to time step 193, served as a validation file on which the generalizing capability of the network could then be judged.

[Figure: dissolved oxygen in the mixed liquor against time T (h), 0 to 200 h. Series: experimental points, mechanistic model with initial parameters, mechanistic model with 3 optimized KLa's.]
Fig. 4. Simulation of the dissolved oxygen in the mixed liquor using mechanistic model with initial and optimized parameters.
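
The fitting criterion behind this parameter optimization, Eqs (3) and (4), can be illustrated with a short Python sketch. The function names and the dictionary layout are assumptions made for the example, and the numbers are made up; they are not data from the Norwich campaign. Each variable is normalized with the range of its own experimental series, which is one reading of Eq. (3).

```python
def normalize(x, x_min_exp, x_max_exp):
    """Eq. (3): x* = (x_max,exp - x) / (x_max,exp - x_min,exp)."""
    return (x_max_exp - x) / (x_max_exp - x_min_exp)


def objective(experimental, calculated):
    """Eq. (4): sum of squared errors on the normalized values.

    experimental, calculated : dicts mapping a variable name to a list of
    values taken at the same time steps.
    """
    total = 0.0
    for name, x_exp in experimental.items():
        lo, hi = min(x_exp), max(x_exp)
        for xe, xc in zip(x_exp, calculated[name]):
            total += (normalize(xe, lo, hi) - normalize(xc, lo, hi)) ** 2
    return total


# Made-up numbers, for illustration only.
exp_data = {"DO": [0.0, 2.0, 4.0], "RAS_VSS": [6000.0, 8000.0, 10000.0]}
calc_data = {"DO": [1.0, 2.0, 4.0], "RAS_VSS": [7000.0, 8000.0, 10000.0]}
s_value = objective(exp_data, calc_data)
```

In this toy case each variable misses one point by a quarter of its experimental range, so both contribute equally to the criterion even though their magnitudes differ by three orders. A function of this kind, evaluated on the simulator output, is what a derivative-free routine such as the downhill simplex method would minimize.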
Table 2. Values of KLa (h^-1)

             CSTR 1    CSTR 2    CSTR 3
Initial      5.4       1.7       0.74
Optimized    8.7       1.8       0.36

Table 3. Characteristics of the neural networks

Variable                                    Inputs                               I    J    K
Effluent suspended solids (t)               Q(t) settled sewage                  6    6    1
                                            Q(t) returned activated sludge
                                            Q(t) overflow consolidation tank
                                            SS(t) settled sewage
                                            SS(t - 1) settled sewage
Effluent total chemical oxygen demand (t)   Q(t) settled sewage                  6    6    1
                                            Q(t) returned activated sludge
                                            Q(t) overflow consolidation tank
                                            CODT(t) settled sewage
                                            SS(t) settled sewage
Effluent ammonia (t)                        Q(t) settled sewage                  6    6    1
                                            Q(t) returned activated sludge
                                            Q(t) overflow consolidation tank
                                            NH4(t - 3) settled sewage
                                            CODT(t - 3) settled sewage
Mixed liquor dissolved oxygen (t)           Q(t) settled sewage                  6    6    1
                                            Q(t) returned activated sludge
                                            Q(t) overflow consolidation tank
                                            CODT(t) settled sewage
                                            CODT(t - 1) settled sewage
Volatile suspended solids in                Q(t) settled sewage                  6    6    1
returned activated sludge (t)               Q(t) returned activated sludge
                                            Q(t) overflow consolidation tank
                                            CODT(t - 3) settled sewage
                                            NH4(t - 3) settled sewage

[Figure: effluent suspended solids against time T (h), 0 to 200 h, with the learning and validation data ranges marked. Series: experimental points, optimized mechanistic model, hybrid model.]
Fig. 5. Simulation of the effluent suspended solids with and without the neural network correction.

RESULTS

Parameter optimization of the mechanistic model

Of the 61 parameters used in the mechanistic model, 20 parameters have values that were either measured experimentally or determined by the structure of the model; they were therefore not optimized. Good initial estimates of all the other parameters were obtained from the literature or from past experience. A first optimization of the remaining 41 parameters showed that few parameters varied significantly. Indeed, only three parameters varied by more than 20%: the dissolved oxygen saturation concentration, the overall oxygen transfer coefficient (KLa) of the mixed liquor in the third CSTR and the ratio of non-biodegradable particulate COD to total COD. However, the optimized value of 7.8 g/m^3 for the dissolved oxygen saturation concentration is not acceptable. Such a value corresponds to a temperature around 30°C (Ramalho, 1983) while the actual temperature experienced by the process was between 15° and 20°C. A series of optimizations was then performed using a restricted number of parameters (groups of 1-5) among the following: dissolved oxygen saturation concentration, the ratios of total COD
and the values of KLa in the first three CSTR's. The value of KLa in the fourth CSTR was assumed to be equal to the one used for the third CSTR.

As reported in Table 1, the most interesting results were obtained by optimizing only the three KLa's. The resulting improvement for the prediction of dissolved oxygen is shown in Fig. 4. The optimized values of KLa are reported in Table 2 and can be explained. The decreasing value of KLa could be related to the decreasing aeration flow rate occurring in the aeration tank (tapered aeration system) which has been modelled as four CSTR's in series. Indeed, the percentages of the total air flow-rate in the four CSTR's are respectively 45, 30, 15 and 10%. In addition, it is important to mention that, in the actual process, a sidestream of mixed liquor is saturated with pure oxygen prior to being injected at the head of the aerator (VITOX process). Because the model does not account for this additional oxygenation, the optimizing procedure led to a high value of KLa in the first CSTR in order to compensate for an underestimation of oxygen concentration. In addition, it can be observed in Table 1 that parameter optimization has not led to significant improvement for variables other than dissolved oxygen. The modelling of the effluent SS, being a linear function of the flow entering the aerator, could not be improved by optimizing the parameters. The CODT is partly a function of the SS concentration, with regard to the particulate COD; a poor fit of the effluent SS results in a poor fit of effluent CODT. It is interesting to mention that the modelling improvement obtained by fitting only the three KLa's is quite similar to the one where 41 parameters were considered (Table 1). This result clearly stresses the limit of optimizing too many parameters.

Even using the three optimized KLa's, the mechanistic model could not predict precisely the behaviour of the system. The next step was then to correct the errors observed with the mechanistic model using black-box models, in this case, the neural networks. These errors are not necessarily caused by inaccurate parameters, but could also result from simplifications in model development, from sampling errors, analytical errors and so on.

[Figure: effluent total chemical oxygen demand against time T (h), 0 to 200 h, with the learning and validation data ranges marked. Series: experimental points, optimized mechanistic model, hybrid model.]
Fig. 6. Simulation of the effluent total chemical oxygen demand with and without the neural network correction.

Coupling of neural network and mechanistic models

Neural network models have been used to predict the errors between experimental values of effluent characteristics (SS, CODT and NH4), dissolved
oxygen and returned activated sludge, and corresponding values predicted by the optimized mechanistic model. The inputs and the number of hidden neurons for each network are given in Table 3. As can be seen from Table 1, the coupling of the mechanistic model with the neural network error predictor yields significant improvement in the simulation of all variables. This is true particularly in the cases where the mathematical relations of the mechanistic model are insufficient to describe the complex phenomena occurring in the process: for example, the suspended solids in the effluent, for which a linear function of the influent flow is clearly inadequate.

Suspended solids. The suspended solids were not precisely predicted by the mechanistic model, due to the weakness of the mathematical function used to simulate this variable. As shown in Fig. 5, the corrected simulation is adequate for the validation file (beginning at t = 141). It can be observed that the increase of settled sewage flow at t = 154 has an immediate influence on the SS simulated by the mechanistic model, whereas this influence is effectively detected only 3 ~ h later. The neural network model corrected this discrepancy even if these data were not part of the learning data.

Total chemical oxygen demand. This case is similar to that of suspended solids: the mechanistic model simulation does not represent the observed behaviour very well, but rather gives an average value. Using the neural network, the modelling observed on the validation file (Fig. 6) seems adequate: even though the network cannot predict exactly the amplitude of the variations, it follows the same pattern as the experimental values. It is important to recall that what we are looking for is the general tendency of the system rather than an overfitting of individual points.

[Figure: effluent ammonia-N against time T (h), 0 to 200 h. Series: experimental points, optimized mechanistic model, hybrid model.]
Fig. 7. Simulation of the effluent ammonia-N with and without the neural network correction.

Ammonia. The effluent ammonia concentration was already well simulated by the mechanistic model, except between time steps 30 and 80. The optimization of parameters brought little improvement. The use of neural networks resulted in a reduction of the average error. Figure 7 shows that the underestimation between time steps 30 and 80 is satisfactorily corrected. The validation file is also well simulated: the fast decrease predicted by the mechanistic model between time steps 160 and 172 is rectified. Since the error was occasionally five to 10 times the magnitude of the predicted value, corrected values below zero were observed. A minimum value of zero for the response of the hybrid model had therefore to be set. However, these discrepancies were not significant as
they did not exceed -0.1 g m^-3. The value of 46% reported in Table 1 for the average relative error could appear to be in contradiction with the good fit shown in Fig. 7. This is due to the fact that numerous values of NH4 are below 0.3, which inflates the magnitude of the relative error.

Dissolved oxygen. Simulation of the dissolved oxygen with the three optimized parameters was relatively good, except for time steps 1-70. The improvement due to neural network correction is shown in Fig. 8. The fitting is improved for time steps 1-70, and the validation file is also adequately simulated, although the needed correction was minimal due to the accuracy of the model's predictions.

Part of the error could probably be caused by the saturation of the oxygen probe, as it cannot evaluate precisely dissolved oxygen concentrations below 0.4 g m^-3. Verification of the probe could not be done to support this, but a simple observation of the experimental results leads us to this conclusion. This could be the cause of the somewhat low improvement brought by neural networks on this state variable, since the learning procedure used data that did not represent with sufficient precision the physical situation occurring in the process.

Volatile suspended solids in returned activated sludge. The mechanistic model's predictions for VSS in returned activated sludge are good, except that they reach maximum values at regular intervals (Fig. 9). This is due to the flux theory used in the model. According to this theory, there exists a limiting flux of components that can be absorbed by the settler for a certain set of operating conditions (Lessard and Beck, 1993). Figure 9 points out that the slight variations of the VSS in returned activated sludge, during which the mechanistic model gives constant outputs, are relatively well represented by the hybrid model. Again, the purpose of this work is not to match exactly the experimental points, but to follow the variations. The validation file is here again satisfactorily corrected: rapid changes of concentration between time steps 154 and 166, during which the mechanistic model gives a constant value, are predicted with good accuracy.

CONCLUSION

A two step procedure was presented in order to improve the accuracy of the mechanistic model of the activated sludge process, previously described by Lessard and Beck (1993).

[Figure: mixed liquor dissolved oxygen against time T (h), 0 to 200 h, with the learning and validation data ranges marked. Series: experimental points, optimized mechanistic model, hybrid model.]
Fig. 8. Simulation of the mixed liquor dissolved oxygen with and without the neural network correction.

[Figure: VSS in returned activated sludge against time T (h), 0 to 200 h, vertical axis from 5000 to 11,000, with the learning and validation data ranges marked. Series: experimental points, optimized mechanistic model, hybrid model.]
Fig. 9. Simulation of the VSS in returned activated sludge with and without the neural network correction.

As a first step, parameter optimization of the model was performed using a least square regression analysis, the downhill simplex method, in conjunction with a large set of experimental data of five key process variables (effluent SS, CODT, NH4, dissolved oxygen in the mixed liquor, and VSS in the returned activated sludge). Among the 41 parameters that can be adjusted, the overall oxygen transfer coefficients, KLa, were found to be the best fitting parameters. The accuracy of the mechanistic model has been greatly improved, especially on the prediction of the dissolved oxygen (DO) in the mixed liquor. However, the optimization procedure did not succeed in improving the predictions of the four remaining key variables.

In the second step, feedforward neural network models were used successfully to simulate the prediction errors of the mechanistic model. The neural error predictors consisted of a compact set of equations and have been coupled easily with the mechanistic model. The resulting hybrid model was found, in general, to simulate with good accuracy the dynamics of the activated sludge process. Some deviations that have been observed could be partly explained, in this case, by noisy effluent data. Further improvements of modelling could probably be obtained by accounting for the variation of parameters during the process. Some parameters, instead of being constant, can be a function of time or of other variables such as flow, height of sludge blanket or concentration of a certain component.

In the modelling of complex systems, the coupling of mechanistic and neural models could be used iteratively. A simple mechanistic model could be calibrated initially with experimental data and the modelling errors taken into account by a neural network model in order to achieve a usable model for prediction and/or process control. With time, the mechanistic model could be improved to represent more accurately and to better comprehend the underlying phenomena. As a result, the black-box neural network would take a lesser modelling load. In addition, if the prediction of one state variable is significantly enhanced by the neural network model, it is a clear indication that the experimental data contain dynamic behaviour that has not been encapsulated in the mechanistic model and that further modelling effort would lead to a better model. If no significant improvement is observed, it may suggest that the data are not sufficiently rich in information or that important variables are not considered in the model.

Acknowledgement: The authors thank the NSERC for its financial support.

REFERENCES

Beck M. B. (1986) Identification, estimation and control of biological waste-water treatment processes. IEE Proc. 133, 254-264.
Bhat N. and McAvoy T. J. (1990) Use of neural nets for dynamic modelling and control of chemical process systems. Comput. Chem. Engng 14, 573-582.
Boger Z. (1992) Application of neural networks to water and wastewater treatment plant operation. ISA Trans. 31, 25-33.
Capodaglio A. G., Jones H. V., Novotny V. and Feng X. (1991) Sludge bulking analysis and forecasting: application of system identification and artificial neural computing technologies. Wat. Res. 25, 1217-1224.
Chitra S. P. (1993) Use neural networks for problem solving. Chem. Engng Prog. 89, 44-52.
Dick R. I. and Young K. W. (1972) Analysis of thickening performance of final settling tanks. In Proc. 27th Industrial Waste Conf. (Edited by Bell J. M.). Purdue Univ, N.C.
Dupont R. and Henze M. (1992) Modelling of the secondary clarifier combined with the activated sludge no. 1. Wat. Sci. Technol. 25, 285-300.
Henze M., Grady C. P. L., Gujer W., Marais G. v. R. and Matsuo T. (1987) Activated sludge model No 1. Scient. Tech. Rep. No. 1. IAWPRC, London.
Hoskins J. C., Kaliyur K. M. and Himmelblau D. M. (1991) Fault diagnosis in complex chemical plants using artificial neural networks. A.I.Ch.E. J. 37, 137-141.
Lessard P. and Beck M. B. (1991) Dynamic modelling of wastewater treatment processes: its current status. Envir. Sci. Technol. 25, 30-39.
Lessard P. and Beck M. B. (1993) Dynamic modelling of the activated sludge process: a case study. Wat. Res. 27, 963-978.
Lippman R. P. (1987) An introduction to computing with neural nets. IEEE ASSP, April, 4-22.
Pflanz P. (1969) Performance of secondary sedimentation basins. In Advances in Water Pollution Research (Edited by Jenkins S. H.), pp. 569-581. Pergamon Press, Oxford.
Press H. W., Flannery B. P., Teukolsky S. A. and Vetterling W. T. (1988) Numerical Recipes, the Art of Scientific Computing, pp. 307-312. Cambridge Univ. Press.
Psichogios D. C. and Ungar L. H. (1991) Direct and indirect model based control using artificial neural networks. Ind. Engng Chem. Res. 30, 2564-25.
Ramalho R. S. (1983) Introduction to Waste Water Treatment Processes, 2nd Edn. Academic Press, New York.
Stokes L., Takács I., Watson B. and Watts J. B. (1993) Dynamic modelling of an A.S.P. sewage works: a case study. In Proc. 6th IAWQ Workshop on Instrumentation, Control and Automation of Water and Wastewater Treatment and Transport Systems (Edited by Jank B.), pp. 105-115. Banff and Hamilton, Canada.
Su H.-T., McAvoy T. J. and Werbos P. (1992) Long-term predictions of chemical processes using recurrent neural networks: a parallel training approach. Ind. Engng Chem. Res. 31, 1338-1352.
Thibault J. (1991) Feedforward neural networks for identification of dynamic processes. Chem. Engng Commun. 105, 109-125.
Thibault J. and Grandjean B. P. A. (1991) A neural network methodology for heat transfer data analysis. Int. J. Heat Mass Transf. 34, 2063-2070.
Tyagi R. D. and Du Y. G. (1992) Kinetic model for the effects of heavy metals on activated sludge process using neural networks. Envir. Technol. 13, 883-890.
Venkatasubramanian V. and McAvoy T. J. (1992) Neural network applications in chemical engineering. Comput. Chem. Engng 16 (4).
Wasserman P. D. (1989) Neural Computing: Theory and Practice, pp. 43-50. ANZA Research, Van Nostrand Reinhold, New York.
You Y. and Nikolaou M. (1993) Dynamics process modelling with recurrent neural networks. A.I.Ch.E. J. 39, 1654-1667.
