Romijn 2008

Journal of Process Control 18 (2008) 906–914

Contents lists available at ScienceDirect

Journal of Process Control


journal homepage: www.elsevier.com/locate/jprocont

A grey-box modeling approach for the reduction of nonlinear systems


Reinout Romijn a, Leyla Özkan b, Siep Weiland c, Jobert Ludlage b, Wolfgang Marquardt a,*
a AVT-Process Systems Engineering, RWTH Aachen University, Turmstrasse 46, 52056 Aachen, Germany
b IPCOS, Bosscheweg 135b, 5282 WV Boxtel, The Netherlands
c Eindhoven University of Technology, P.O. Box 513, 5600 MB Eindhoven, The Netherlands

Article info

Keywords: Grey-box modeling; Hybrid modeling; Model reduction; Parameter estimation; Proper orthogonal decomposition; Distributed parameter systems

Abstract

A novel model reduction methodology is proposed to approximate large-scale nonlinear dynamical systems. The methodology amounts to finding computationally efficient substitute models for the nonlinear subsystems. Model reduction is pursued by viewing the system as a grey-box (or hybrid) model with a mechanistic (white-box) component and an empirical (black-box) component. Before identifying the substitute model, the mechanistic subsystem is reduced by projection using proper orthogonal decomposition. Subsequently, the empirical component is identified by parameter estimation to substitute the nonlinear subsystem. As a consequence, a reduced model with less nonlinear complexity is obtained. An example involving a distributed parameter system is provided.

© 2008 Elsevier Ltd. All rights reserved.

1. Introduction

Advances in computation and modeling tools have enabled the development of detailed complex mathematical models that yield reasonable and accurate predictions of the behavior of any type of (process) system. The intrinsic complexity of processes yields models that generally require a considerable computational effort to solve. One of the origins of complexity is the modeling of the spatially distributed nature of physical processes, which results in models containing partial differential equations (PDEs). The numerical solution of PDE models is often found after discretization of the spatial domain or after projection of the PDEs on spatial structures, whereby the PDEs are approximated by a finite-dimensional set of ordinary differential equations (ODEs). The resulting models generally remain of high order when general-purpose discretization schemes are employed [1,2]. The order is defined as the number of first-order differential equations. Such models are often unsuitable, and computationally very costly, for on-line, real-time applications such as model-based control and model-based dynamic optimization [3]. This necessitates the reduction of a system consisting of a large finite number of equations to a system consisting of a smaller number of equations or to a system containing a smaller number of first-order differential equations. This procedure is generally referred to as model reduction.

Significant research efforts have been dedicated to the development and implementation of model reduction techniques for high-order nonlinear systems, using singular value decomposition (SVD) [4], proper orthogonal decomposition (POD) techniques [5], and approximate inertial manifolds [6,7]. The method of POD is particularly popular in the fluid dynamics community and is of considerable interest for the reduction of distributed parameter systems. An extensive overview and an in-depth treatment of reduction methods for merely linear model representations has been presented by Antoulas [8]. An extensive discussion on reduction techniques for nonlinear systems can be found in Marquardt [9].

There are a number of key disadvantages to most model reduction techniques in which state dimensions are reduced. First, state reductions do not necessarily lead to computationally more efficient models. In some cases, this is due to the possible loss of sparsity in a model formulation after reduction. Another cause is the fact that computationally expensive function evaluations are not simplified in most model reduction schemes. These problems have been observed in several papers before [10–16]. Furthermore, few techniques are able to cope with nonlinear uncertainty in the model, and most reduction techniques do not allow the extraction of relevant features for the specific purpose for which the model is meant.

This paper is motivated by the problem of performing model approximation with the explicit aim to improve computational efficiency while keeping desirable model properties intact. A new methodology is introduced that utilizes a grey-box modeling approach to obtain a reduced model that is computationally more efficient. At least three distinctive issues motivate the combination of grey-box model structures with model reduction:

* Corresponding author. Tel.: +49 241 80 94668; fax: +49 241 80 92326.
E-mail address: wolfgang.marquardt@avt.rwth-aachen.de (W. Marquardt).

0959-1524/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.
doi:10.1016/j.jprocont.2008.06.007
R. Romijn et al. / Journal of Process Control 18 (2008) 906–914 907

(1) Model uncertainties and un-modeled dynamics can be represented as interconnections of a known system component with a (partly) unknown component. By tearing the uncertain part out of the model, the remaining (linear) part can be subject to model reduction without damaging the properties of the uncertain model part or the interconnection structure.
(2) Finite element discretization of the spatial geometry of nonlinear distributed parameter systems typically leads to performing nonlinear function evaluations in each of the mesh elements. As a consequence, the computational load per nonlinear function evaluation is critical. By separating computationally intensive functions in a model before any kind of model reduction is performed, a structure is created in which computationally intensive functions can be substituted by simpler ones, which leads to computationally more efficient approximate models.
(3) Reduction methods for nonlinear models generally do not preserve model sparsity. It will be shown that sparsity can be re-introduced in a projected low-order model by employing a grey-box model structure with a reduced size black box. This low-order sparse black-box part can be maintained when a high-order model is reconstructed from an identified low-order model.

The structure of the paper is as follows. A short overview of grey-box modeling is presented in Section 2. The reduction method POD will be summarized in Section 3. Based on these preliminaries, the grey-box model reduction methodology proposed in this paper will be described in Section 4. To illustrate the methodology, it is applied to a simple example of a distributed dynamical system in Section 5. Finally, conclusions are drawn in Section 6. This paper is an extended version of the earlier conference contributions of Romijn et al. [17] and Özkan et al. [18].

2. Grey-box modeling

A grey-box model, also referred to as a hybrid model in the literature, consists of a combination of mechanistic (first principle) and black-box (empirical) model components. Mechanistic models are usually built from physical laws, conservation relations, and established physical and chemical relations. Due to the general validity of these relations, such a model is capable of predicting the behavior of the system to a large extent. Black-box models can be viewed as models with a highly parameterized structure such that in principle any input–output map can be realized. The identification of a black-box comprises an estimation of the parameters in order to fit the measured input–output data. In general, no guarantee can be given whether the black-box can predict system behavior outside of the domain in which it has been identified. The grey-box modeling approach seeks to combine advantages of both model types. All available knowledge of the process mechanisms is used to build a white-box part, while missing information is approximated by black-boxes fitted on process data. Hereby, the need for possibly costly extra experimentation and research of unknown process mechanisms will be minimized.

Several grey-box model structures have been proposed. Psichogios and Ungar [19] proposed a serial structure where a black-box is used to approximate unknown relations which determine certain model variables. Thompson and Kramer [20] described a parallel structure which aims at compensation of the model prediction error. A general representation of the serial grey-box equation structure of a finite-dimensional ordinary differential equation system can be formulated as (e.g. [9,21])

$$\dot{x} = f_{FP}(x, u, f_{EM}(x, u, \theta)), \qquad (1)$$

where $f_{FP}$ denotes the set of first principle equations and $f_{EM}$ denotes the set of empirical black-box equations. Here, $x$ are the differential states, $u$ the system inputs and $\theta$ the parameter set which determines the input–output map of the black-box. Grey-box models of the block-structured type (for example Wiener and Hammerstein models) have been proposed by Pearson and Pottmann [22].

The extrapolation capacity and identifiability of serial grey-box model structures have been studied by several authors. Van Can et al. [23] compared the prediction of a serial grey-box model with the performance of a black-box model. The grey-box model performed better in predicting the process outputs when static model states were forced outside their identification domain or a dynamic model was excited at higher frequencies than during experiments. Kahrs and Marquardt [24] compared two criteria to define the identification domain for static systems and analyzed the validity regions of the model outputs. The results showed that the extrapolation capacity of a static grey-box model is apparent and quantifiable. Schuppert [25] studied the identifiability of black-boxes in a tree structure where black- and white-boxes are coupled in a feed-forward way. It is shown that when the block equations are sufficiently differentiable (and sufficient data is available), the black-boxes can be uniquely defined.

Grey-box models have been widely applied for the modeling of a variety of chemical and biochemical processes. Applications in biochemistry have for example been studied in [26–31]. Grey-box models for chemical reaction processes have been derived for distillation columns [32–34], polymerization [35–38], and batch-crystallizers [39,40]. Grey-box models have been studied with a focus on applications to control [41] and identification [42]. All of the cited applications involve lumped models of orders between 2 and 6, although they often represent processes of distributed nature. Distributed grey-box models have received very little attention in the literature until now. An exception is the identification of an empirical relation containing several parameters within a system which describes electromagnetic wave propagation in a two-dimensional spatial domain, reported by Akkari et al. [43].

The use of grey-box models as part of a model reduction procedure has been proposed by Hahn et al. [44] and Sun and Hahn [45]. Their model reduction technique is based on balancing of a given nonlinear system by means of empirical Gramians. The states which are unimportant from an input–output behavior point of view are residualized by setting their derivative to zero. The resulting implicit nonlinear algebraic equations are then replaced by an explicit set of equations in the form of a neural network. This network is then identified on data generated by simulation of the original nonlinear equations. This approach does not reduce the number of equations, but reduces the computational load if an explicit ODE solver can be employed to solve the system instead of a DAE solver.

Summarizing, grey-box models have been applied for modeling a variety of process systems which in principle are spatially distributed. However, these systems are, almost without exception, simplified to lumped models of low order before being used for the construction of a grey-box model. The connection of grey-box modeling and model reduction, where both order and complexity are reduced, has only rarely been discussed in the literature. This work aims to establish this connection by presenting a method with which grey-box models can be identified for high-dimensional systems.

3. Proper orthogonal decomposition

An important technique for an efficient reduction of large-scale linear and nonlinear systems is the method of POD, also known as principal component analysis (PCA) or Karhunen-Loève expansion. The application of PCA for the reduction of linear dynamic systems
to the controllable and observable subspace has been studied by Moore [46]. When the data for POD is generated by impulsive inputs distributed on the unit ball of the dimension of the input space, the POD basis is shown to span the controllable subspace.

Independent from control theory, the method has also been widely researched and applied within the field of fluid dynamics, in particular to compactly capture the behavior of turbulent flows. Lumley [47] introduced the method in fluid dynamics to find an optimal representation of an ensemble of data by those functions that maximally exploit the correlation between the ensemble data points. The idea of using only data of time-uncorrelated "snapshots" of system states was introduced by Sirovich [48] to reduce the computational power needed to compute the correlation between the states. Berkooz et al. [49] give a comprehensive review of the research up to the early 1990s. A clear treatment of the computation, the optimality properties of the POD and the projection of a PDE system on the basis generated by the POD has been compiled by Holmes et al. [50]. A recent treatment of POD and the numerical computation of the basis functions can be found in [51] and [52].

The method of POD can be summarized as follows. Given an ensemble of $K$ measurements $X_k(\cdot)$, $k = 1, \ldots, K$, with each measurement defined on some spatial domain $\Omega$, the POD method amounts to assuming that each observation $X_k$ belongs to a Hilbert space $\mathcal{H}$ of functions defined on $\Omega$. With the inner product defined on $\mathcal{H}$ and a collection of scalar functions $\{\varphi_j\}_{j=1}^{\infty}$ that defines an orthonormal basis of $\mathcal{H}$, any scalar function $X(z)$ of the spatial variable $z$, which is an element of $\mathcal{H}$, admits a representation

$$X(z) = \sum_{j=1}^{\infty} a_j \varphi_j(z), \quad z \in \Omega. \qquad (2)$$

The scalars $a_j$ are referred to as the coefficients and the $\varphi_j$ as the modes of the expansion. The truncated expansion

$$X_n(z) = \sum_{j=1}^{n} a_j \varphi_j(z), \quad z \in \Omega, \qquad (3)$$

causes an approximation error $\|X - X_n\|$ in the norm of the Hilbert space. We will call $\{\varphi_j\}_{j=1}^{\infty}$ a POD basis of $\mathcal{H}$ whenever it is an orthonormal basis of $\mathcal{H}$ for which the total approximation error

$$\sum_{k=1}^{K} \|X_k - X_{k,n}\| \qquad (4)$$

is minimal for all truncation levels $n = 1, \ldots, \infty$, where $X_{k,n}$ denotes the $n$-term truncation of $X_k$. This is an empirical basis in the sense that every POD basis depends on the data ensemble $X_k(\cdot)$, $k = 1, \ldots, K$.

Using variational calculus, the solution to this optimization problem amounts to finding the normalized eigenfunctions $\varphi_j(z)$ of a positive semi-definite operator $R : \mathcal{H} \to \mathcal{H}$ defined as

$$\langle \psi_1, R\psi_2 \rangle := \frac{1}{K} \sum_{k=1}^{K} \langle \psi_1, X_k \rangle \cdot \langle \psi_2, X_k \rangle, \qquad (5)$$

with $\psi_1, \psi_2 \in \mathcal{H}$ [15,53].

If the spatial domain $\Omega$ is gridded in $N$ distinct points, then the evaluation of $X$ on these points can be identified with a vector in $\mathbb{R}^N$, which we will call $T(t)$ with reference to the illustrating example in Section 5. Consequently, after discretization the Hilbert space $\mathcal{H}$ becomes finite dimensional. The $k$th measurement $T_k$ is then an element in $\mathbb{R}^N$ and is usually taken to be the $k$th time sample of an ensemble of time-varying measurements. That is, $T_k = T(t_k)$ where $T(t_k)$ refers to the sample of the state variable at time $t_k$, $k = 1, \ldots, K$. The operator $R$ defined by Eq. (5) is then constructed by $R = (1/K)\,\mathbf{T}\mathbf{T}^\top$, where $\mathbf{T}$ is the snapshot matrix which is constructed by columnwise concatenation of measurement data vectors $T(t_k) = T_k$:

$$\mathbf{T} = [T(t_1) \ \cdots \ T(t_K)]. \qquad (6)$$

In either case of an infinite or a finite dimensional system, a Galerkin projection on the POD basis is used to obtain a reduced model. Suppose that a system is governed by a PDE of the form

$$\frac{\partial X}{\partial t} = A(X) + B(u) + F(X, u). \qquad (7)$$

Here, $X(z,t)$ denotes the state variable at position $z$ in some spatial geometry $\Omega$ and at time $t$, and $u(z,t)$ denotes the input. $A$ is a linear operator on the state, $B$ denotes a linear operator on the input and $F$ represents nonlinear terms. Let $\mathcal{H}_n$ denote an $n$-dimensional subspace of $\mathcal{H}$ and let $P_n : \mathcal{H} \to \mathcal{H}_n$ and $I_n : \mathcal{H}_n \to \mathcal{H}$ denote the canonical projection and canonical injection maps. The reduced state vector is then constructed as $X_n = P_n X$ and the model is then given by

$$P_n \frac{\partial X_n}{\partial t} = P_n A(X_n) + P_n B(u) + P_n F(X_n, u), \qquad (8)$$

with $u \in \mathcal{U}$ and $X_n \in \mathcal{H}_n$, $\mathcal{H}_n = P_n \mathcal{H}$ for all $t$. In the specific case of a POD basis, the finite-dimensional subspace $\mathcal{H}_n = \mathrm{span}(\varphi_1, \ldots, \varphi_n)$ relies on the POD basis functions $\varphi_j$. In that case, (8) becomes a set of ordinary differential equations in the coefficients $a_j(t)$ in the expansion of $X_n$.

4. Methodology

The methodology proposed in this work considers the partial differential Eq. (7), which naturally occurs in the modeling of distributed chemical processes under weak assumptions. It extends Eq. (1) from the case of lumped systems to the case of distributed parameter systems but assumes that $f_{FP}$ is linear in $f_{EM}$. A similar separation of linear and nonlinear terms can be found in [54] for a mechanical system and in [42] for a chemical lumped parameter system. The system (7) is separated into two parts:

$$\frac{\partial X}{\partial t} = A(X) + B(u) + q, \qquad (9a)$$
$$q = F(X, u), \qquad (9b)$$

to facilitate the interpretation of the nonlinear function $F(X, u)$ as the (known or unknown) empirical part of the model. A conventional construction of a grey-box model which approximates the model (9) without any form of model order reduction consists of replacing the relation (9b) by an empirical black-box model, resulting in the system

$$\frac{\partial X}{\partial t} = A(X) + B(u) + q, \qquad (10a)$$
$$q = E(X, u, \theta), \qquad (10b)$$

where $E : \mathcal{H} \times \mathcal{U} \times \Theta \to \mathcal{H}$ defines a parameterization of $F$. In a grey-box modeling framework, $F$ is an unknown model which is approximated by an empirical model $E$. In a grey-box model reduction framework, $F$ is a possibly highly complex, computationally expensive nonlinear term that is approximated by a computationally less expensive term $E$. Identification of the grey-box as illustrated in Fig. 1 involves the following steps. An optimal parameter $\theta^* \in \Theta$ is identified, such that a criterion $J : \Theta \to \mathbb{R}$ on the interconnection (10) is minimized:

$$\min_{\theta} J(\theta), \quad J : \Theta \to \mathbb{R}. \qquad (11)$$

The criterion $J(\theta)$ could be either a measure of the equation error of Eqs. (9b) and (10b), i.e., $\|F(X, u) - E(X, u, \theta)\|$, or a measure of the state trajectory error $\|X_{\text{Eq.}(9)} - X_{\text{Eq.}(10)}\|$, where $X_{\text{Eq.}(9)}$ and $X_{\text{Eq.}(10)}$ denote the solutions $X$ of Eqs. (9) and (10), respectively. In either way, an optimal parameter $\theta^*$ is found and the identified full order grey-box model is then given by
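[Figure 1: two block diagrams. Initial Model: a high order model with inputs and outputs, interconnected with a complex or unknown relation. Identified Model: the same high order model interconnected with a high order black-box. An identification step leads from the first to the second.]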

Fig. 1. Conventional grey-box model identification.

$$\frac{\partial X}{\partial t} = A(X) + B(u) + q, \qquad (12a)$$
$$q = E(X, u, \theta^*). \qquad (12b)$$

Note that if the model part (12a) is of high order, the identified part (12b) is of high order as well, i.e., if $X \in \mathbb{R}^N$ then $q \in \mathbb{R}^N$ as well (Fig. 1).

4.1. Reduced-order grey-box model identification

The reduction methodology that we propose in this contribution is illustrated in Fig. 2. The method consists of the following steps.

4.1.1. Reduction of the mechanistic part

The model formulation (9) allows a separate treatment of the individual model parts. Following Eq. (8), a Galerkin projection is performed on the mechanistic part (9a) to yield a reduced order model

$$P_n \frac{\partial X_n}{\partial t} = P_n A(X_n) + P_n B(u) + q_n, \qquad (13a)$$
$$q_n = P_n F(I_n X_n, u), \qquad (13b)$$

where $X_n = P_n X$ and $q_n = P_n q$. $P_n : \mathcal{H} \to \mathcal{H}_n$ is the projection map and $I_n : \mathcal{H}_n \to \mathcal{H}$ is the injection map, both defined in Section 3. This step reduces the order of the model to $n$ and corresponds in Fig. 2 to the step from the model in the upper left corner to the model in the lower left corner. Observe that, although $q_n \in \mathcal{H}_n$, the complexity of the nonlinear model part (13b) is not satisfactorily reduced, as the model variables $X_n$ have to be injected in the high-dimensional space before the nonlinear function $F$ can be evaluated. After computation of the nonlinear terms, the solution is projected onto the low-dimensional space $\mathcal{H}_n$. The computation of $X_n \xrightarrow{I_n} X \xrightarrow{F} q \xrightarrow{P_n} q_n$ still requires function evaluations of $F$ on $\mathcal{H}$, which are often computationally costly. To solve this problem, the following step in the procedure is introduced.

4.1.2. Parameterization and identification of the empirical part

After reduction of Eq. (9), the projected model Eq. (13a) is extended with the relation

$$q_n = E_n(X_n, u, \theta), \qquad (14)$$

where $E_n : \mathcal{H}_n \times \mathcal{U} \times \Theta^R \to \mathcal{H}_n$ defines a parameterization of the empirical part. An optimal parameter $\theta^* \in \Theta^R$ is identified, such that a criterion function $J : \Theta^R \to \mathbb{R}$ is minimized. The criterion can either minimize an error on the approximation of $q$, i.e., $\min \|P_n F(I_n X_n, u) - E_n(X_n, u, \theta)\|$, or minimize an error on the state trajectory predictions, i.e., $\min \|X_{n,\text{Eq.}(13)} - X_{n,\text{Eqs.}(13a),(14)}\|$.

This results in the identified reduced model structure

$$P_n \frac{\partial X_n}{\partial t} = P_n A(X_n) + P_n B(u) + q_n, \qquad (15a)$$
$$q_n = E_n(X_n, u, \theta^*). \qquad (15b)$$

This step is illustrated in Fig. 2 by the step from the model in the lower left corner to the lower right corner. The low-order mechanistic model part has been combined with a newly introduced nonlinear part, the complexity of which should be less than that of (13b) by construction. The interconnection of (15a) and (15b) will be referred to as the reduced model.
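
The projection step of Section 4.1.1 can be sketched for a spatially discretized system as follows. This is a minimal illustration, not the authors' implementation: the function names (`pod_basis`, `galerkin_reduce`, `projected_nonlinearity`), the energy-based truncation rule, and the synthetic snapshot data are all assumptions made for the example. It shows that the projected nonlinear term of Eq. (13b) still evaluates F on all N grid points.

```python
import numpy as np

def pod_basis(snapshots, tol=1e-4):
    """Return the dominant POD vectors of an (N, K) snapshot matrix.

    Assumption of this sketch: truncate once the discarded relative
    'energy' (sum of squared singular values) falls below `tol`.
    """
    # Left singular vectors of the snapshot matrix are the eigenvectors
    # of R = (1/K) T T^T, ordered by decreasing eigenvalue.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    energy = np.cumsum(s**2) / np.sum(s**2)
    n = min(int(np.searchsorted(energy, 1.0 - tol)) + 1, s.size)
    return U[:, :n]

def galerkin_reduce(A, B, Phi):
    """Project the linear operators: A_r = Phi^T A Phi, B_r = Phi^T B."""
    return Phi.T @ A @ Phi, Phi.T @ B

def projected_nonlinearity(F, Phi):
    """q_n = Phi^T F(Phi x_n, u): inject, evaluate F on N points, project."""
    return lambda x_n, u: Phi.T @ F(Phi @ x_n, u)

# Synthetic rank-3 snapshot data with singular values 3, 2, 1 (hypothetical).
rng = np.random.default_rng(0)
N, K = 50, 20
modes, _ = np.linalg.qr(rng.standard_normal((N, 3)))
coeffs, _ = np.linalg.qr(rng.standard_normal((K, 3)))
snapshots = modes @ np.diag([3.0, 2.0, 1.0]) @ coeffs.T

Phi = pod_basis(snapshots)                      # projection P_n = Phi^T, injection I_n = Phi
A_r, B_r = galerkin_reduce(-np.eye(N), np.ones((N, 1)), Phi)
F = lambda X, u: 0.2 * u**4 - 0.33 * X**4       # toy elementwise nonlinearity
q_n = projected_nonlinearity(F, Phi)
```

Here the reduced state has dimension 3, but every call to `q_n` still costs N evaluations of the nonlinearity, which is the motivation for replacing it by a low-order black-box.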

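[Figure 2: four block diagrams arranged in a square, each with input u and output y. Upper left: Initial Model (linear equations coupled to nonlinear equations). Lower left: Projected Model (projected linear equations coupled to projected nonlinear equations). Lower right: Reduced Model (projected linear equations coupled to approximate nonlinear equations). Upper right: Coupled Model (linear equations coupled to approximate nonlinear equations). Arrows: projection (upper left to lower left), identification (lower left to lower right), reconstruction (lower right to upper right).]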

Fig. 2. Scheme of the proposed methodology. The initial high-dimensional model (9) is reduced by a projection method to obtain the projected model (13). Subsequently an
approximate nonlinear model is identified to obtain the reduced model (15). Finally, the approximate nonlinear model can be coupled to the original linear model equations
to obtain the reconstructed model (16). The linear and nonlinear parts of these models are coupled through the variables (x and q), (Inxn and Pnqn), (xn and qn), (Pnx and Inq),
respectively.
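
The identification step with the equation-error criterion of Section 4.1.2 can be illustrated with a small least-squares fit. This sketch makes several assumptions that are not in the paper: it uses a polynomial black-box (constant, linear and quartic regressors) instead of a neural network so that the fit reduces to a linear least-squares problem, and the names `poly_features` and `fit_black_box` as well as the toy nonlinearity and data are hypothetical.

```python
import numpy as np

def poly_features(x_n, u):
    """Hypothetical black-box regressors: constant, linear and quartic terms."""
    z = np.concatenate([x_n, u])
    return np.concatenate([[1.0], z, z**4])

def fit_black_box(X_n, U_in, Q_n):
    """Least-squares fit of q_n ~ features(x_n, u) @ W (equation-error criterion).

    X_n: (K, n) reduced states, U_in: (K, m) inputs,
    Q_n: (K, n) targets q_n = P_n F(I_n x_n, u) from the projected model.
    """
    feats = np.stack([poly_features(x, u) for x, u in zip(X_n, U_in)])
    W, *_ = np.linalg.lstsq(feats, Q_n, rcond=None)
    # The fitted map works on reduced variables only; it never touches N.
    return lambda x_n, u: poly_features(x_n, u) @ W

rng = np.random.default_rng(0)
N, n, K = 40, 3, 200
Phi, _ = np.linalg.qr(rng.standard_normal((N, n)))    # stand-in for a POD basis
F = lambda X, u: 0.2 * u[0]**4 - 0.33 * X**4          # toy full-order nonlinearity

X_n = rng.uniform(-1.0, 1.0, (K, n))
U_in = rng.uniform(0.0, 1.0, (K, 1))
Q_n = np.stack([Phi.T @ F(Phi @ x, u) for x, u in zip(X_n, U_in)])

E_n = fit_black_box(X_n, U_in, Q_n)
pred = np.stack([E_n(x, u) for x, u in zip(X_n, U_in)])
rel_err = np.linalg.norm(pred - Q_n) / np.linalg.norm(Q_n)
```

The fit is only approximate here, because the chosen regressors cannot represent the cross terms produced by projecting the quartic nonlinearity. A trajectory-error criterion would instead require simulating the reduced model inside the optimization loop.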
4.1.3. Model reconstruction

In order to obtain solutions in terms of the variable $X(z,t)$, two possibilities exist. Firstly, the solution trajectories of the reduced model can be injected to the high-dimensional space $\mathcal{H}$ after simulation. That is, $X = I_n X_n$ where $X_n$ satisfies Eq. (15). Secondly, the identified low order empirical part (15b) can be injected into $\mathcal{H}$ and coupled to the original mechanistic part (12a), in order to obtain

$$\frac{\partial X}{\partial t} = A(X) + B(u) + I_n E_n(P_n X, u, \theta^*). \qquad (16)$$

This step is illustrated in Fig. 2 by the reconstruction step from the model in the lower right corner to the upper right corner. We refer to model (16) as the coupled model. Observe that the reduced model (15) and the coupled model (16) contain the nonlinear map $E_n : \mathcal{H}_n \times \mathcal{U} \times \Theta^R \to \mathcal{H}_n$, where the high order models (9) and (12) and the projected model (13) contain the nonlinear map $E : \mathcal{H} \times \mathcal{U} \times \Theta \to \mathcal{H}$. When a projection $P_n$ has been found such that $\mathcal{H}_n$ is of lower dimension than $\mathcal{H}$, the domain and range of $E_n$ are smaller than the domain and range of $E$, and therefore one can expect that an $E_n$ can be found which is simpler and computationally less expensive than $E$.

This reduction and identification approach will result in grey-box type models for any type of system which combines linear and nonlinear terms. This type of system often arises when complex process systems are modeled. Examples of these systems are PDE systems with a nonlinear convection term and a linear diffusion term or, conversely, a linear convection term and a nonlinear diffusion term. Another example, a PDE containing a nonlinear feedback and source term, is treated in Section 5. For completely nonlinear systems, the identification procedure would still be applicable but would result in a reduced black-box system

$$P_n \frac{\partial X_n}{\partial t} = E_n(X_n, u, \theta),$$

which does not display the favorable structure which arises in the case of separability of the right-hand side.

4.2. Black-box structure

A black-box can be characterized by its structure and type. The type of the black-box refers to the type of nonlinearity inside the black-box. These nonlinearities include neural networks parameterized by their node weights, polynomial functions parameterized through their coefficients, or any other parameterized function. The structure is defined by the way the black-box is interconnected to the rest of the system and the internal complexity, e.g., the number of neurons and layers in case of the neural network, or the number of powers included in the polynomial. The methodology proposed in this paper is independent from the choice of projection method and black-box structure and type. However, an implementation of the methodology requires a suitable choice. In this work, neural networks will be used as black-box structures. The determination of a neural network black-box structure is not straightforward. Henrique et al. [55] propose a method which starts with a configuration involving a high number of neurons and neuron layers. During identification, unnecessary complexity is removed. In the example described in Section 5 in this work, structure selection is done by examining the validation error of the system for black-boxes of varying complexity.

5. Application to a heat convection-conduction model

A process for which particularly high-order models are constructed is the melting of glass. Typically, these models contain between $10^3$ and $10^8$ states and equations [15]. Recent research effort has been put into the reduction of this type of systems, both by projection of the model equations [15] and by identification of reduced order linear models [56].

In this section the grey-box model reduction methodology is applied to a highly simplified model representing some phenomena occurring in a glass melting process. Note that this example serves as an illustration of the proposed methodology rather than a demonstration of the reduction of a very large and complex system.

5.1. Model description

The energy balance for a glass melting process includes terms for the convection of heat, heat conduction, and a local boosting (heating) or cooling term. Typically, the energy balance is coupled to balance equations for mass and momentum by velocity, viscosity and density [57]. A simplified model is obtained by assuming constant velocity, viscosity and density. A one-dimensional approximate model for the glass melting process can then be modeled by the PDE for temperature $T(z,t)$ with spatial coordinate $z \in \Omega$ defined in $\Omega := (0, L)$, and temporal coordinate $t \in [0, t_{end}]$:

$$\frac{\partial T(z,t)}{\partial t} = -v \frac{\partial T(z,t)}{\partial z} + a \frac{\partial^2 T(z,t)}{\partial z^2} + Q(z,t), \qquad (17)$$

with initial conditions $T(z, t=0) = T_0(z)$ and boundary conditions $T(z=0, t) = T_b$, $\frac{\partial T(z=L,t)}{\partial z} = 0$, where $T_b$ is a constant.

The convective flow velocity $v$ and the heat conductivity $a$ are constant parameters. Following [57], a simple distributed heating term can be modeled by

$$Q(z,t) = \frac{\sigma}{\rho c_p} \left( c_1 T_{fl}^4(z,t) - c_2 T^4(z,t) \right) \qquad (18)$$

to capture radiative heating as a nonlinear function of temperature $T$ and the temperature $T_{fl}$ of heating flames above the melting glass. The flame temperature $T_{fl}$ is considered as the input variable to the system. The parameter $\sigma$ denotes the Stefan–Boltzmann constant. The parameter values of the model are listed in Table 1.

Table 1
Parameter values

Parameter                    Symbol      Value        Unit
Flow                         v           1.00E-3      m s^-1
Thermal diffusivity          a           1.00E-5      W m^-2
Density                      ρ           2300         kg m^-3
Specific heat                c_p         1300         J (kg K)^-1
Transmission coefficient     c_1, c_2    0.20, 0.33
Stefan–Boltzmann constant    σ           5.67E-8      W/(m^2 K^4)
Boundary temperature         T_b         1773         K
Glass melter length          L           10           m

In order to simulate the system, the spatial domain $\Omega$ is discretized in $N$ equidistant grid points at intervals $\delta z$. Let $T(t) = \mathrm{col}(T(z_1,t), \ldots, T(z_N,t))$ denote the temperature vector at the grid points and let $T_{fl}$ be constructed similarly. After the spatial derivatives of $T$ have been approximated by finite differences, the resulting model equations assume the form

$$\dot{T} = AT + BT_b + Q(T, T_{fl}), \qquad (19)$$

with corresponding boundary and initial conditions. A schematic representation of the discretized model is given in Fig. 3. The order of this model could in principle be chosen arbitrarily high. A very high accuracy can however already be obtained with a discretization grid of $N = 400$ points. Subsequently, the order of the resulting model is reduced until a model is found for which the relative error of approximation integrated over $\Omega$ and $t$ meets an error criterion. Using an error criterion of 1E-3, the resulting model is of order
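[Figure 3: a row of grid cells T_1, T_2, ..., T_i, ..., T_N of width dz along the z axis, with flow velocity v at both ends, the boundary temperature T_b at the inlet, and the flame temperature T_fl and heat input q above the cells.]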

Fig. 3. Schematic representation of the discretized 1D PDE model for heat convection and conduction given by Eq. (19).

130. It will be considered as the reference model from which the reduced models will be derived and with which they will be compared. The input to the model has been defined as $T_{fl}(z,t) = T_{fl,z}(z)\,u(t)$, where $T_{fl,z}(z)$ approximates the shapes of the heating flames and $u(t)$ is a pseudo-random signal [58], with varying amplitudes to cover a fair region in the input space. The temporal input sequences for identification and validation are shown in Fig. 4.

5.2. Reduction of mechanistic part and identification of empirical part

The snapshot matrix is constructed for this example according to Eq. (6) as $X = [T(t_1) \ \cdots \ T(t_K)]$. Using the criterion $\varepsilon < 1\text{E-}4$ on the error measure defined by Eq. (4), five basis vectors are found to be sufficient. Thus, the projection space $\mathcal{H}_5$ is the column span of the matrix of dominant POD vectors $\Phi := [\varphi_1, \ldots, \varphi_5]$. The canonical projection matrix is $P_5 = \Phi^\top$ and the canonical injection matrix is $I_5 = \Phi$. A Galerkin projection of the model then yields

$$\dot{T}_n = A_r T_n + B_r T_b + \Phi^\top F(\Phi T_n, T_{fl}), \qquad (20)$$

where $T_n = \langle \Phi, T \rangle$, $A_r = \Phi^\top A \Phi$ and $B_r = \Phi^\top B$. The nonlinear model part is still of high dimension ($F : \mathbb{R}^N \times \mathbb{R}^N \to \mathbb{R}^N$ with $N = 130$).

The nonlinear part in (20) is replaced by a black-box consisting of a neural network $E_n(T_n, T_{fl}, \theta)$ with one hidden layer containing sigmoid functions. The complexity has been determined a posteriori, after identification of black-boxes with varying numbers of neurons. A black-box with four neurons performed best (see Fig. 5) by showing a minimal validation error. This results in a black-box $E : \mathbb{R}^n \times \mathbb{R}^n \times \Theta \to \mathbb{R}^n$ with $\theta \in \mathbb{R}^{69}$. In this way, (20) is simplified to

$$\dot{T}_n = A_r T_n + B_r T_b + E_n(T_n, T_{fl}, \theta). \qquad (21)$$

Initial conditions are set to $T_n(0) = \Phi^\top T(0)$. The hybrid model has now been completely reduced to a system with both a reduced number of equations and a reduced number of nonlinear function evaluations. The parameter estimation problem is formulated as follows:

$$\min_{\theta} \sum_{i=1}^{n} \sum_{j=1}^{K} \left( T_{n,i,j} - \hat{T}_{n,i,j} \right)^2 \qquad (22a)$$
$$\text{s.t. Eq. (21)}, \qquad (22b)$$

where $T_{n,i,j}$ represents the simulated value of the $i$th element of $T_n$ at time instant $j$, and the corresponding measurement values $\hat{T}_{n,i,j}$ are obtained from projection of the solutions $\hat{T}$ of the reference model on the POD basis $\Phi$. Casting the parameter estimation problem into state residual form ensures minimization of the error in temperature approximation rather than in heat input ($q$), in order to anticipate a possible online setting where not all states will be measurable. A simple single-shooting approach has been used to solve the estimation problem: a standard Runge–Kutta scheme has been used to integrate the model and the sensitivity systems,

1
identification sequence
0.8

0.6
u(t)

0.4

0.2

0
0 0.5 1 1.5 2 2.5 3
4
x 10

0.8

0.6
u(t)

0.4

0.2
validation sequence
0
0 0.5 1 1.5 2 2.5 3
time [s] x 10
4

Fig. 4. Input signals used for identification (upper plot) and validation (lower plot).
while the SQP algorithm SNOPT [59] has been used to solve the parameter estimation problem.

Fig. 5. Normalized validation errors and CPU time consumption for varying black-box complexity: a black-box with four neurons shows the least error in the 2-norm and the inf-norm. The CPU time of simulation is lowest for the model with three neurons. [Horizontal axis: number of neurons (1 to 6); vertical axis: normalized error, normalized CPU time; curves: validation error $\|\varepsilon\|_2^2$, validation error $\|\varepsilon\|_{\inf}$, CPU time of simulation.]

5.3. Model restoration, computational load assessment and validation

Assessment of the computational load of the parameter estimation is given in Table 2. Due to the non-convexity of the problem, it is difficult to quantify the speed-up of the parameter estimation precisely. An approximate quantification is obtained by averaging 10 estimation runs for different random initial parameter vectors. The estimation time is fairly acceptable when compared with the estimation time of a high-dimensional grey-box (12). The identification of the reduced grey-box models was measured to be a factor 5–10 faster. This shows the benefits of model reduction within a grey-box modeling framework: model reduction clearly speeds up the identification of the black-box.

Table 2
Estimation results, averaged over 10 runs with random initial parameters

| Model | Function evaluations | Iterations | CPU time |
|---|---|---|---|
| High-order model (Eq. (12)) | 175 | 67 | 1972 s |
| Reduced order model (Eq. (21)) | 331 | 224 | 330 s |

The computational load of the different models and the distribution over the empirical part and the mechanistic part, as well as the cost of the actual evaluation of the nonlinear function, are shown in detail in Table 3. The simulation times of the reduced model configurations are lower than those of the reference model and the coupled models. This seems to originate from the fact that the ODE solver requires fewer integration steps to solve the reduced models.

The reduced models and the coupled models are validated by inspection of the quality of prediction. Fig. 6 shows the response of the identified reduced order system and the relative approximation error to the trajectory generated by the reference model. The relative error is defined as

$$\varepsilon := \frac{T - T_{ref}}{\max(T_{ref}) - \min(T_{ref})}.$$

It can be seen that the overall approximation is fairly accurate: the error stays largely below about 2%. The predictive capacity of the reduced model is assessed by exciting the model with a different input trajectory and starting from a different initial value vector. The response of the reduced model subject to these inputs is shown in Fig. 7a. The relative approximation error of the reduced model is shown in Fig. 7b. The error is higher, which could indicate an over-parameterization of the black-box. The accuracy of the coupled model (16) is shown in Fig. 8. It can be seen that the identification trajectory is approximated less accurately by this model, compared to the reduced model. This probably originates from the fact that the error of projection has been compensated by the black-box model. The validation trajectory error is of the same order if either the coupled or the reduced model is used.

Table 3
Simulation results

| Model | Reference | High order | Reduced | Coupled |
|---|---|---|---|---|
| Equation | Eq. (19) | Eq. (12) | Eq. (21) | Eq. (16) |
| Total sim. time | 6.12 s | 5.69 s | 0.604 s | 4.84 s |
| Linear + nonlin. part | 4.95 s | 4.52 s | 0.421 s | 3.55 s |
| Nonlinear part | 3.16 s | 1.77 s | 0.134 s | 0.98 s |
| Number of model eval. | 22717 | 22879 | 3031 | 22657 |
| (Linear + nonlin. part) / (number of model eval.) | 0.218 ms | 0.198 ms | 0.139 ms | 0.157 ms |
| (Nonlinear part) / (number of model eval.) | 0.139 ms | 0.077 ms | 0.044 ms | 0.043 ms |

Measured are the total simulation time including overheads, simulation time for the model consisting of linear and nonlinear part, simulation time of the nonlinear model part, and the number of model evaluations the solver needed to simulate the trajectory. Finally, the simulation time per model evaluation and the simulation time of the nonlinear part per model evaluation are listed.
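The two per-evaluation rows of Table 3 are the measured times divided by the number of model evaluations, and the overall speed-ups follow directly from Tables 2 and 3. The short Python check below recomputes them from the tabulated values. The dictionary layout and variable names are illustrative assumptions, not part of the original study.

```python
# Values transcribed from Table 3 (times in seconds); the dict layout is illustrative.
table3 = {
    "Reference (Eq. 19)":  {"total": 6.12,  "lin_nonlin": 4.95,  "nonlin": 3.16,  "evals": 22717},
    "High order (Eq. 12)": {"total": 5.69,  "lin_nonlin": 4.52,  "nonlin": 1.77,  "evals": 22879},
    "Reduced (Eq. 21)":    {"total": 0.604, "lin_nonlin": 0.421, "nonlin": 0.134, "evals": 3031},
    "Coupled (Eq. 16)":    {"total": 4.84,  "lin_nonlin": 3.55,  "nonlin": 0.98,  "evals": 22657},
}

# Time per model evaluation in milliseconds: (full model, nonlinear part only).
per_eval_ms = {
    name: (1e3 * row["lin_nonlin"] / row["evals"], 1e3 * row["nonlin"] / row["evals"])
    for name, row in table3.items()
}

# Speed-up of the reduced model in simulation (Table 3) and estimation (Table 2).
sim_speedup = table3["Reference (Eq. 19)"]["total"] / table3["Reduced (Eq. 21)"]["total"]
est_speedup = 1972 / 330  # CPU times of the high-order and reduced model in Table 2

for name, (full, nonlin) in per_eval_ms.items():
    print(f"{name:20s} {full:.3f} ms  {nonlin:.3f} ms")
print(f"simulation speed-up: {sim_speedup:.1f}x, estimation speed-up: {est_speedup:.1f}x")
```

The numbers also show where the simulation saving comes from: the reduced model needs about 7.5 times fewer model evaluations (3031 against 22717), while the cost per evaluation drops only from 0.218 ms to 0.139 ms.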

Fig. 6. (a) Nominal temperature trajectory (unshaded mesh) and simulated temperature trajectory by means of the reduced model (shaded surface) for the identification input signal. (b) Error of estimation of the reduced model. [Surface plots over z and t; panel titles: (a) Identification trajectory, (b) Identification error, with vertical axis |ε|rel.]
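The error surface in Fig. 6(b) is based on the relative error defined in Section 5.3, that is, the deviation from the reference trajectory scaled by the range of the reference trajectory. A minimal Python sketch of that computation follows. It assumes that the maximum and minimum are taken over the whole reference trajectory (all grid points and time instants), which the text does not state explicitly, and the temperature values are made up for illustration.

```python
def relative_error(T, T_ref):
    """Element-wise (T - T_ref), scaled by the range of the reference trajectory.

    T and T_ref are lists of rows, one row per time instant.
    """
    flat_ref = [value for row in T_ref for value in row]
    span = max(flat_ref) - min(flat_ref)
    return [[(a - b) / span for a, b in zip(row, ref_row)]
            for row, ref_row in zip(T, T_ref)]

# Hypothetical temperatures, two time instants on a two-point grid.
T_ref = [[772.0, 773.0], [773.5, 774.0]]
T_red = [[772.02, 772.99], [773.5, 774.01]]

eps = relative_error(T_red, T_ref)
max_abs = max(abs(e) for row in eps for e in row)
print(f"max |eps_rel| = {max_abs:.3f}")
```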
Fig. 7. (a) Nominal temperature trajectory (unshaded mesh) and simulated temperature trajectory by means of the reduced model (shaded surface) for the validation input sequence. (b) Error of validation of the reduced model. [Surface plots over z and t; panel titles: (a) Validation trajectory, (b) Validation error, with vertical axis |ε|rel.]
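The reduced model whose responses appear in Figs. 6 and 7 is Eq. (21): a Galerkin-projected linear part plus a small neural network. The NumPy sketch below assembles such a right-hand side. The values N = 130, five POD vectors and four hidden neurons come from the text. The random snapshot matrix, the matrices `A` and `B`, the scalar boundary input `T_b`, the five-dimensional reduced flame input and the bias layout are all assumptions made for illustration. With this layout the network has 10·4 + 4 + 4·5 + 5 = 69 parameters, which is consistent with θ ∈ R^69, although the paper does not spell out the architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, r, hidden = 130, 200, 5, 4  # N, r, hidden follow the text; K is hypothetical

# Hypothetical rank-r snapshot matrix X = [T(t_1) ... T(t_K)].
X = rng.standard_normal((N, r)) @ rng.standard_normal((r, K))

# POD basis: leading left singular vectors of the snapshot matrix.
U = np.linalg.svd(X, full_matrices=False)[0][:, :r]

# Hypothetical linear part of the full model, projected as in Eq. (20).
A = -np.eye(N)
B = rng.standard_normal((N, 1))
A_r, B_r = U.T @ A @ U, U.T @ B

# Weights and biases of the hidden layer and of the linear output layer.
sizes = [2 * r * hidden, hidden, hidden * r, r]
n_params = sum(sizes)

def black_box(T_n, T_fl_n, theta):
    """One hidden layer of sigmoids with a linear output (assumed architecture)."""
    W1, b1, W2, b2 = np.split(theta, np.cumsum(sizes)[:-1])
    W1, W2 = W1.reshape(hidden, 2 * r), W2.reshape(r, hidden)
    z = 1.0 / (1.0 + np.exp(-(W1 @ np.concatenate([T_n, T_fl_n]) + b1)))
    return W2 @ z + b2

def reduced_rhs(T_n, T_b, T_fl_n, theta):
    """Right-hand side of the reduced grey-box model, Eq. (21)."""
    return A_r @ T_n + B_r @ T_b + black_box(T_n, T_fl_n, theta)

theta = 0.1 * rng.standard_normal(n_params)
T_n0 = U.T @ X[:, 0]  # reduced initial condition, T_n(0) = U^T T(0)
dT = reduced_rhs(T_n0, np.array([1.0]), np.zeros(r), theta)
print(n_params, dT.shape)
```

Each call evaluates a 5-state right-hand side with a 4-neuron network instead of the 130-dimensional nonlinear function F, which is the source of the lower cost per evaluation.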

Fig. 8. Coupled model approximation errors of (a) the identification trajectory and (b) the validation trajectory. [Surface plots of |ε|rel over z and t; panel titles: (a) Identification trajectory error, (b) Validation trajectory error.]
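The identification errors reported in Figs. 6 and 8 depend on the parameters obtained from the estimation problem (22), which is solved by single shooting with Runge–Kutta integration of the model and its sensitivities. The pure-Python sketch below illustrates that idea on a scalar toy model with a one-parameter tanh black box. The toy model, the input levels and the step size are hypothetical, and a plain Gauss-Newton update stands in for the SQP solver SNOPT used in the paper.

```python
import math

# Hypothetical piecewise-constant input, each level held for 1 s (not the paper's signal).
U_LEVELS = [0.2, 0.8, 0.4, 1.0, 0.6]
DT, N_STEPS = 0.05, 100

def u(t):
    return U_LEVELS[min(int(t), len(U_LEVELS) - 1)]

def rhs(t, y, theta):
    """Augmented system: state x and sensitivity s = dx/dtheta."""
    x, s = y
    g = math.tanh(theta * u(t))
    dx = -x + g                      # toy model: linear part plus one-parameter black box
    ds = -s + u(t) * (1.0 - g * g)   # ds/dt = (df/dx) s + df/dtheta
    return (dx, ds)

def simulate(theta):
    """Single shooting: integrate from x(0) = 0 with classical RK4, return (x, s) samples."""
    y, samples = (0.0, 0.0), []
    for k in range(N_STEPS):
        t = k * DT
        k1 = rhs(t, y, theta)
        k2 = rhs(t + DT / 2, tuple(a + DT / 2 * b for a, b in zip(y, k1)), theta)
        k3 = rhs(t + DT / 2, tuple(a + DT / 2 * b for a, b in zip(y, k2)), theta)
        k4 = rhs(t + DT, tuple(a + DT * b for a, b in zip(y, k3)), theta)
        y = tuple(a + DT / 6 * (p + 2 * q + 2 * r + w)
                  for a, p, q, r, w in zip(y, k1, k2, k3, k4))
        samples.append(y)
    return samples

def estimate(measurements, theta0, iterations=20):
    """Minimize the sum of squared state residuals with Gauss-Newton steps."""
    theta = theta0
    for _ in range(iterations):
        traj = simulate(theta)
        grad = sum((x - m) * s for (x, s), m in zip(traj, measurements))
        curvature = sum(s * s for _, s in traj)
        theta -= grad / curvature
    return theta

THETA_TRUE = 1.5
data = [x for x, _ in simulate(THETA_TRUE)]  # synthetic, noise-free "measurements"
theta_hat = estimate(data, theta0=0.5)
print(f"estimated theta = {theta_hat:.6f}")
```

Every optimizer iteration requires one full simulation of the model and its sensitivities, so the cost of the estimation scales with the cost of one simulation. This is why a cheaper reduced model also shortens the identification, as reported in Table 2.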

6. Conclusions

This paper is motivated by the observation that many model reduction techniques for large-scale nonlinear dynamical systems yield lower order models that fail to be computationally efficient. Although the order of the model can be significantly reduced, the computational effort is not reduced because the favorable structure of the original model is lost in the reduced model. To remedy this problem, a novel general framework is proposed for model reduction for nonlinear systems using a grey-box modeling approach. Grey-box models allow a separation of a mechanistic, first-principles part and an empirical, black-box part. It is shown that the combination of model reduction and black-box identification applied to the linear and nonlinear parts of the mechanistic model results in a grey-box model which can be solved with significantly reduced computational complexity. The method is illustrated by means of a simple heat convection-conduction model. The prediction error of the reduced and the restored coupled grey-box model with a sparse black-box structure is reasonably small. The computational effort of the parameter estimation problem is significantly lower than in black-box modeling. Future work will analyze the extrapolation properties of these grey-box models and will apply the suggested method to other model reduction problems of industrial relevance.

Acknowledgement

This work has been supported by the European Union within the Marie-Curie Training Network PROMATCH under the Grant No. MRTN-CT-2004-512441.

References

[1] L. Lapidus, G. Pinder, Numerical Solution of Partial Differential Equations in Science and Engineering, Wiley, New York, 1982.
[2] R. Temam, Infinite-dimensional Dynamical Systems in Physics, Springer, Berlin, 1997.
[3] P. Christofides, Nonlinear and Robust Control of PDE Systems, Methods and Applications to Transport-Reaction Processes, Birkhäuser, Basel, 2000.
[4] D. Gay, H. Ray, Identification and control of distributed parameter systems by means of the singular value decomposition, Chem. Eng. Sci. 50 (1995) 1519–1539.
[5] K. Hoo, D. Zheng, Low-order control-relevant models for a class of distributed parameter systems, Chem. Eng. Sci. 56 (2001) 6683–6710.
[6] S. Shvartsman, I. Kevrekidis, Low-dimensional approximation and control of periodic solutions in spatially extended systems, Phys. Rev. E 58 (1998) 361–368.
[7] S. Shvartsman, C. Theodoropoulos, R. Rico-Martinez, I. Kevrekidis, E. Titi, T. Mountziaris, Order reduction for nonlinear dynamic models of distributed reacting systems, J. Proc. Cont. 10 (2000) 177–184.
[8] A. Antoulas, Approximation of Large-scale Dynamical Systems, Advances in Design and Control DC06, SIAM, Philadelphia, 2005.
[9] W. Marquardt, Nonlinear model reduction for optimization based control of transient chemical processes, in: AIChE Symp. Ser. 326, vol. 98, 2001, pp. 12–42.
[10] H. Loeffler, W. Marquardt, Order reduction of non-linear differential-algebraic process models, J. Proc. Cont. 1 (1991) 32–40.
[11] H. Aling, R. Kosut, A. Emami-Naeini, J. Ebert, Nonlinear model reduction with application to rapid thermal processing, in: Proc. of the 35th Conf. on Dec. and Cont., 1996.
[12] H. Aling, S. Banarjee, A. Bangia, V. Cole, J. Ebert, A. Naeini, K. Jensen, I. Kevrekidis, S. Shvartsman, Nonlinear model reduction for simulation and control of rapid thermal processing, in: Proc. of the ACC, 1997.
[13] M. Schlegel, J. v.d. Berg, W. Marquardt, O. Bosgra, Projection based model reduction for dynamic optimization, paper presented at AIChE Annual Meeting, Indianapolis, 2002.
[14] M. Rathinam, L. Petzold, A new look at proper orthogonal decomposition, SIAM J. Numer. Anal. 41 (2003) 1893–1925.
[15] P. Astrid, Reduction of Process Simulation Models: a proper orthogonal decomposition approach, Ph.D. Thesis, Technical University of Eindhoven, 2004.
[16] J. Van den Berg, Model reduction for dynamic real time optimization of chemical processes, Ph.D. Thesis, Technical University of Delft, 2005.
[17] R. Romijn, L. Özkan, S. Weiland, J. Ludlage, W. Marquardt, A grey-box modeling approach for the reduction of nonlinear systems, in: Proc. of the 8th Int. IFAC Symp. on Dyn. and Cont. of Proc. Sys. Preprints, 2007.
[18] L. Özkan, R. Romijn, S. Weiland, W. Marquardt, J. Ludlage, Model reduction of nonlinear systems: a grey-box modeling approach, in: Proc. of the 7th IFAC Symp. on Nonlin. Control Systems, Pretoria, 2007.
[19] D. Psichogios, L. Ungar, A hybrid neural network-first principles approach to process modeling, AIChE J. 38 (10) (1992) 1499–1511.
[20] M. Thompson, M. Kramer, Modeling chemical processes using prior knowledge and neural networks, AIChE J. 40 (8) (1994) 1328–1340.
[21] F.S.J. Abonyi, J. Madar, Soft Computing in industrial applications – recent advances, Springer Eng. Ser. (2002).
[22] R. Pearson, M. Pottmann, Gray-box identification of block-oriented nonlinear models, J. Proc. Cont. 10 (2000) 301–315.
[23] H.V. Can, H. te Braake, S. Dubbelman, C. Hellinga, K. Luyben, J. Heijnen, Understanding and applying the extrapolation properties of serial gray-box models, AIChE J. 44 (1998) 1071–1089.
[24] O. Kahrs, W. Marquardt, The validity domain of hybrid models and its application in process optimization, Chem. Eng. Proc. 46 (2007) 1054–1066.
[25] A. Schuppert, Extrapolability of structured hybrid models: a key to optimization of complex processes, in: B. Fiedler, K. Groeger, J. Sprekels (Eds.), Proceedings of EquaDiff'99, 2000, pp. 1135–1151.
[26] S.F. de Azevedo, B. Dahm, F. Oliveira, Hybrid modelling of biochemical processes: a comparison with the conventional approach, Comp. Chem. Eng. 21 (Suppl.) (1997) S751–S756.
[27] H. van Can, H. te Braake, A. Bijman, C. Hellinga, K. Luyben, J. Heijnen, An efficient model development strategy for bioprocesses based on neural networks in macroscopic balances: part II, Biotechnol. Bioeng. 62 (1999) 666–680.
[28] S. James, R. Legge, H. Budman, Comparative study of black-box and hybrid estimation methods in fed-batch fermentation, J. Proc. Cont. 12 (2002) 113–121.
[29] P.V. Lith, B. Betlem, B. Roffel, A structured modeling approach for dynamic hybrid fuzzy-first principles models, J. Proc. Cont. 12 (2002) 605–615.
[30] B. Duarte, P. Saraiva, Hybrid models combining mechanistic models with adaptive regression splines and local stepwise regression, Ind. Eng. Chem. Res. 42 (2003) 99–107.
[31] R. Oliveira, Combining first principles modelling and artificial neural networks: a general framework, Comp. Chem. Eng. 28 (2004) 755–766.
[32] K. Dadhe, V. Rossmann, K. Durmus, S. Engell, Neural networks as a tool for gray box modelling in reactive distillation, Fuzzy Days LNCS 2206 (2001) 576–588.
[33] P.V. Lith, B. Betlem, B. Roffel, Combining prior knowledge with data driven modeling of a batch distillation column including start-up, Comp. Chem. Eng. 27 (2003) 1021–1030.
[34] L. Chen, Y. Hontoir, D. Huang, J. Zhang, A.J. Morris, Combining first principles with black-box techniques for reaction systems, Cont. Eng. Pract. 12 (2004) 819–826.
[35] T. Crowley, C. Harrison, F. Doyle, Batch-to-batch optimization of PSD in emulsion polymerization using a hybrid model, in: Proc. of the ACC, 2001.
[36] M. Hinchliffe, G. Montague, M. Willis, A. Burke, Hybrid approach to modeling an industrial polyethylene process, AIChE J. 49 (2003) 3127–3137.
[37] B. Feil, J. Abonyi, P. Pach, S. Nemeth, P. Arva, M. Nemeth, G. Nagy, Semi-mechanistic models for state-estimation – soft sensor for polymer melt index prediction, in: ICAISC, 2004, pp. 1111–1117.
[38] C. Ng, M. Hussain, Hybrid neural network – prior knowledge model in temperature control of a semi-batch polymerization process, Chem. Eng. Proc. 43 (2004) 559–570.
[39] P. Georgieva, M. Meireles, S.F. de Azevedo, Knowledge-based hybrid modelling of a batch crystallisation when accounting for nucleation, growth and agglomeration phenomena, Chem. Eng. Sci. 58 (2003) 3699–3713.
[40] P. Georgieva, S.F. de Azevedo, Neural network-based control strategies applied to a fed-batch crystallization process, Int. J. Comp. Intell. 3 (2006) 224–233.
[41] A. Alaradi, S. Rohani, Identification and control of a riser-type FCC unit using neural networks, Comp. Chem. Eng. 26 (2002) 401–421.
[42] M. Brendel, A. Mhamdi, D. Bonvin, W. Marquardt, An incremental approach for the identification of reaction kinetics, in: Preprints of the 7th Int. Symp. on Adv. Cont. of Chem. Proc., 2004, pp. 177–182.
[43] E. Akkari, S. Chevallier, L. Boillereaux, A 2d non-linear "grey-box" model dedicated to microwave thawing: theoretical and experimental investigation, Comp. Chem. Eng. 30 (2005) 321–328.
[44] J. Hahn, S. Lextrait, T. Edgar, Nonlinear balanced model residualization via neural networks, AIChE J. 48 (6) (2002) 1353–1357.
[45] C. Sun, J. Hahn, Reduction of stable differential-algebraic equation systems via projections and system identification, J. Proc. Cont. 15 (2005) 639–650.
[46] B.C. Moore, Principal component analysis in linear systems: controllability, observability and model reduction, IEEE Trans. Autom. Cont. 26 (1) (1981) 17–32.
[47] J. Lumley, Stochastic Tools in Turbulence, Academic Press, New York, 1970.
[48] L. Sirovich, Turbulence and the dynamics of coherent structures pt. i, ii, iii, Quart. Appl. Math. 45 (1987) 561–571, 573–582, 583–590.
[49] G. Berkooz, P. Holmes, J. Lumley, The proper orthogonal decomposition in the analysis of turbulent flows, Annu. Rev. Fluid Mech. 25 (1993) 539–575.
[50] P. Holmes, J. Lumley, G. Berkooz, Turbulent Coherent Structures, Dynamical Systems and Symmetry, Cambridge University Press, Cambridge, 1996.
[51] S. Volkwein, Proper orthogonal decomposition and singular value decomposition, Tech. rep., SFB-153, Inst. for Math., Univ. Graz, 1999.
[52] K. Kunisch, S. Volkwein, Control of the Burgers equation by a reduced-order approach using proper orthogonal decomposition, J. Optim. Theory Appl. 102 (1999) 345–371.
[53] W. Cazemier, Proper orthogonal decomposition and low dimensional models for turbulent flow, Ph.D. Thesis, University of Groningen, 1997.
[54] B. Salimbahrami, J. Lienemann, B. Lohmann, J. Korvink, A simulation free reduction scheme and nonlinear modelling of an electrostatic beam, in: Proc. of the 10th IFAC/IFORS/IMACS/IFIP Symp. on Large Scale Sys., 2004.
[55] H. Henrique, E. Lima, D. Seborg, Model structure determination in neural network models, Chem. Eng. Sci. 55 (2000) 5457–5469.
[56] L. Huisman, Control of glass melting processes based on reduced CFD models, Ph.D. Thesis, Technical University of Eindhoven, 2005.
[57] R. Beerkens, in: Mathematical Simulation in Glass Technology, Springer, Berlin, 2002, pp. 17–72.
[58] L. Ljung, System Identification: Theory for the User, Prentice-Hall, Englewood Cliffs, NJ, 1987.
[59] P. Gill, W. Murray, M. Saunders, SNOPT: an SQP algorithm for large-scale constrained optimization, SIAM Rev. 47 (2005) 99–131.
