Encyclopedia of Systems and Control

John Baillieul • Tariq Samad
Editors-in-Chief

Springer Reference
Preface
It is this incredible broadening of the scope of the field that has motivated
the editors to assemble the entries that follow. This encyclopedia aims to help
students, researchers, and practitioners learn the basic elements of a vast array
of topics that are now considered part of systems and control. The goal is to
provide entry-level access to subject matter together with cross-references to
related topics and pointers to original research and source material.
Entries in the encyclopedia are organized alphabetically by title, and
extensive links to related entries are included to facilitate topical reading—
these links are listed in “Cross-References” sections within entries. All cross-referenced entries are indicated by a preceding symbol. In the electronic version of the encyclopedia, these entries are hyperlinked for ease of access.
The creation of the Encyclopedia of Systems and Control has been a major
undertaking that has unfolded over a 3-year period. We owe an enormous debt
to major intellectual leaders in the field who agreed to serve as topical section
editors. They have ensured the value of the opus by recruiting leading experts
in each of the covered topics and carefully reviewing drafts. It has been a
pleasure also to work with Oliver Jackson and Andrew Spencer of Springer,
who have been unfailingly accommodating and responsive over this time.
As we reflect back over the course of this project, we are reminded of
how it began. Gary Balas, one of the world’s experts in robust control and
aerospace applications, came to one of us after a meeting with Oliver at the
Springer booth at a conference and suggested this encyclopedia—but was
adamant that he wasn’t the right person to lead it. The two of us took the
initiative (ultimately getting Gary to agree to be the section editor for the
aerospace control entries). Gary died last year after a courageous fight with
cancer. Our sense of accomplishment is infused with sadness at the loss of a
close friend and colleague.
We hope readers find this encyclopedia a useful and valuable compendium
and we welcome your feedback.
Section Editors

Aerospace Applications
Gary Balas Deceased. Formerly at Aerospace Engineering and Mechanics
Department, University of Minnesota, Minneapolis, MN, USA
Game Theory
Tamer Başar Coordinated Science Laboratory, University of Illinois,
Urbana, IL, USA
Discrete-Event Systems
Christos G. Cassandras Division of Systems Engineering, Center for
Information and Systems Engineering, Boston University, Brookline, MA,
USA
Stochastic Control
Lei Guo Academy of Mathematics and Systems Science, Chinese Academy
of Sciences (CAS), Beijing, China
Nonlinear Control
Alberto Isidori Department of Computer and System Sciences “A. Ruberti”,
University of Rome “La Sapienza”, Rome, Italy
Hybrid Systems
Francoise Lamnabhi-Lagarrigue Laboratoire des Signaux et Systèmes –
CNRS, European Embedded Control Institute, Gif-sur-Yvette, France
Adaptive Control
Richard Hume Middleton School of Electrical Engineering and Computer
Science, The University of Newcastle, Callaghan, NSW, Australia
Intelligent Control
Thomas Parisini Department of Electrical and Electronic Engineering,
Imperial College London, London, UK
Frequency-Domain Control
Michael Sebek Department of Control Engineering, Faculty of Electrical
Engineering, Czech Technical University in Prague, Prague 6, Czech
Republic
Robotics
Bruno Siciliano Dipartimento di Informatica e Sistemistica, Università
degli Studi di Napoli Federico II, Napoli, Italy
Information-Based Control
Wing-Shing Wong Department of Information Engineering, The Chinese
University of Hong Kong, Hong Kong, China
Robust Control
Kemin Zhou Department of Electrical and Computer Engineering,
Louisiana State University, Baton Rouge, LA, USA
from exceeding a threshold that can lead to underfrequency load shedding (UFLS) or rolling blackouts. The particular threshold varies across utility grids, but the largest such threshold in North America is 1.5 Hz.

Stability issues arising from the altered control algorithms must be analyzed (Buckspan et al. 2013). The trade-offs between aggressive primary frequency control and resulting structural loads also need to be evaluated carefully. Initial research shows that potential grid support can be achieved while not causing any increases in structural loading and hence fatigue damage and operations and maintenance costs (Buckspan et al. 2012).

Wind Turbine Automatic Generation Control

Secondary frequency control, also known as automatic generation control (AGC), occurs on a slower time scale than PFC. AGC commands can be generated from highly damped proportional integral (PI) controllers or logic controllers to regulate grid frequency and are used to control the power output of participating power plants. In many geographical regions, frequency regulation services are compensated through a competitive market, where power plants that provide faster and more accurate AGC command tracking are paid more.

An active power control system that combines both primary and secondary/AGC frequency control capabilities has recently been detailed in Aho et al. (2013a). Figure 2 presents initial experimental field test results of this active power controller, in response to prerecorded frequency events, showing how responsive wind turbines can be to both manual derating commands as well as rapidly changing automatic primary frequency control commands generated via a droop curve. Overall, results indicate that wind turbines can respond more rapidly than conventional power plants. However, increasing the power control and regulation performance of a wind turbine should be carefully considered due to a number of complicating factors, including coupling with existing control loops, a desire to limit actuator usage and structural loading, and wind variability.

Active Power Control of Wind Power Plants

A wind power plant, often referred to as a wind farm, consists of many wind turbines. In wind power plants, wake effects can reduce generation in downstream turbines to less than 60 % of the lead turbine (Barthelmie et al. 2009; Porté-Agel et al. 2013). There are many emerging areas of active research, including the modeling of wakes and wake effects and how these models can then be used to coordinate the control of individual turbines so that the overall wind power plant can reliably track the desired power reference command. A wind farm controller can be interconnected with the utility grid, transmission system operator (TSO), and individual turbines as shown in Fig. 3. By properly accounting for the wakes, wind farm controllers can allocate appropriate power reference commands to the individual wind turbines. Individual turbine generator torque and blade pitch controllers, as discussed earlier, can be designed so that each turbine follows the power reference command issued by the wind farm controller. Methods for intelligent, distributed control of entire wind farms to rapidly respond to grid frequency disturbances could significantly reduce frequency deviations and improve recovery speed to such disturbances.

Combining Techniques with Other Approaches for Balancing the Grid

Ultimately, active power control of wind turbines and wind power plants should be combined with both demand-side management and storage to provide a more comprehensive solution that enables balancing electrical generation and electrical load with large penetrations of wind energy on the grid. Demand-side management (Callaway and Hiskens 2011; Kowli and Meyn 2011; Palensky and Dietrich 2011)
Active Power Control of Wind Power Plants for Grid Integration, Fig. 2 The frequency data input and power that is commanded and generated during a field test with a 550 kW research wind turbine at the US National Renewable Energy Laboratory (NREL). The frequency data was recorded on the Electric Reliability Council of Texas (ERCOT) interconnection (data courtesy of Vahan Gevorgian, NREL). The upper plot shows the grid frequency, which is passed through a 5 % droop curve with a deadband to generate a power command. The high-frequency fluctuations in the generated power would be smoothed when aggregating the power output of an entire wind power plant
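The droop-with-deadband mapping described in the Fig. 2 caption can be sketched numerically. This is an illustrative reconstruction rather than code from the entry: the 5 % droop slope follows the caption, while the deadband width, the 10 % derating, and the 60 Hz nominal frequency are assumed values.

```python
def droop_power_command(f_grid, f_nom=60.0, droop=0.05, deadband=0.036,
                        p_derated=0.9, p_rated=1.0):
    """Map measured grid frequency (Hz) to a turbine power command (pu).

    The turbine is derated to p_derated so it holds headroom for
    under-frequency events; deviations inside the deadband are ignored,
    and the remainder is scaled by the droop slope. All parameter
    defaults are assumptions for illustration.
    """
    df = f_grid - f_nom
    if abs(df) <= deadband:      # inside the deadband: no action
        df = 0.0
    elif df > 0.0:
        df -= deadband
    else:
        df += deadband
    # 5 % droop: a 0.05 pu frequency deviation maps to a 1 pu power change.
    dp = -(df / f_nom) / droop
    # Saturate between zero output and rated power.
    return min(max(p_derated + dp, 0.0), p_rated)

# Under-frequency event: the turbine raises output above its derated level.
low = droop_power_command(59.8)
# Nominal frequency: the command stays at the derated operating point.
nom = droop_power_command(60.0)
print(nom, low)
```

Aggregating this command over many turbines would, as the caption notes, smooth the high-frequency fluctuations seen at a single machine.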
Active Power Control of Wind Power Plants for Grid Integration, Fig. 3 Schematic showing the communication and coupling between the wind farm control system, individual wind turbines, utility grid, and the grid operator. The wind farm controller uses measurements of the utility grid frequency and automatic generation control power command signals from the grid operator to determine a power reference for each turbine in the wind farm
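A minimal sketch of the wind farm controller's allocation step shown in Fig. 3, under the simplifying assumption that the plant-level power command is split in proportion to each turbine's currently available (wake-reduced) power. The proportional rule and the numbers are illustrative assumptions, not the coordination algorithms surveyed in the entry.

```python
def allocate_power_references(p_plant_cmd, p_available):
    """Split a plant-level power command (MW) across turbines in
    proportion to each turbine's currently available power (MW)."""
    p_total = sum(p_available)
    if p_plant_cmd >= p_total:
        # Command exceeds what the wind can deliver: every turbine
        # simply tracks its own available power.
        return list(p_available)
    return [p_plant_cmd * p / p_total for p in p_available]

# Three turbines in a row; wakes cut downstream availability (the entry
# cites reductions to below 60 % of the lead turbine). Values are made up.
available = [2.0, 1.4, 1.1]                      # MW
refs = allocate_power_references(3.6, available)
print(refs, sum(refs))                           # shares sum to the command
```

A real wind farm controller would additionally use a wake model to decide where curtailment costs the least, but the feasibility check and the plant-level power balance above are the essential constraints.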
aims to alter the demand in order to mitigate peak electrical loads and hence to maintain sufficient control authority among generating units. As more effective and economical energy storage solutions (Pickard and Abbott 2012) at the power plant scale are developed, wind (and solar) energy can then be stored when wind (and solar) energy availability is not well matched with electrical demand. Advances in wind forecasting (Giebel et al. 2011) will also improve wind power forecasts to facilitate more accurate scheduling of larger amounts of wind power on the grid.

Cross-References

Control of Fluids and Fluid-Structure Interactions
Control Structure Selection
Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms
Electric Energy Transfer and Control via Power Electronics
Networked Control Systems: Architecture and Stability Issues
Power System Voltage Stability
Small Signal Stability in Electric Power Systems

Recommended Reading

A recent comprehensive report on active power control that covers topics ranging from control design to power system engineering to economics can be found in Ela et al. (2014) and the references therein.

Bibliography

Aho J, Pao L, Buckspan A, Fleming P (2013a) An active power control system for wind turbines capable of primary and secondary frequency control for supporting grid reliability. In: Proceedings of the AIAA aerospace sciences meeting, Grapevine, Jan 2013
Aho J, Buckspan A, Dunne F, Pao LY (2013b) Controlling wind energy for utility grid reliability. ASME Dyn Syst Control Mag 1(3):4–12
Barthelmie RJ, Hansen K, Frandsen ST, Rathmann O, Schepers JG, Schlez W, Phillips J, Rados K, Zervos A, Politis ES, Chaviaropoulos PK (2009) Modelling and measuring flow and wind turbine wakes in large wind farms offshore. Wind Energy 12:431–444
Buckspan A, Aho J, Fleming P, Jeong Y, Pao L (2012) Combining droop curve concepts with control systems for wind turbine active power control. In: Proceedings of the IEEE symposium power electronics and machines in wind applications, Denver, July 2012
Buckspan A, Pao L, Aho J, Fleming P (2013) Stability analysis of a wind turbine active power control system. In: Proceedings of the American control conference, Washington, DC, June 2013, pp 1420–1425
Callaway DS, Hiskens IA (2011) Achieving controllability of electric loads. Proc IEEE 99(1):184–199
Ela E, Milligan M, Kirby B (2011) Operating reserves and variable generation. Technical report, National Renewable Energy Laboratory, NREL/TP-5500-51928
Ela E, Gevorgian V, Fleming P, Zhang YC, Singh M, Muljadi E, Scholbrock A, Aho J, Buckspan A, Pao L, Singhvi V, Tuohy A, Pourbeik P, Brooks D, Bhatt N (2014) Active power controls from wind power: bridging the gaps. Technical report, National Renewable Energy Laboratory, NREL/TP-5D00-60574, Jan 2014
Giebel G, Brownsword R, Kariniotakis G, Denhard M, Draxl C (2011) The state-of-the-art in short-term prediction of wind power: a literature overview. Technical report, ANEMOS.plus/SafeWind, Jan 2011
Gsanger S, Pitteloud J-D (2013) World wind energy report 2012. The World Wind Energy Association, May 2013
Kowli AS, Meyn SP (2011) Supporting wind generation deployment with demand response. In: Proceedings of the IEEE power and energy society general meeting, Detroit, July 2011
Miller N, Clark K, Shao M (2011) Frequency responsive wind plant controls: impacts on grid performance. In: Proceedings of the IEEE power and energy society general meeting, Detroit, July 2011
Palensky P, Dietrich D (2011) Demand-side management: demand response, intelligent energy systems, and smart loads. IEEE Trans Ind Inform 7(3):381–388
Pickard WF, Abbott D (eds) (2012) The intermittency challenge: massive energy storage in a sustainable future. Proc IEEE 100(2):317–321. Special issue
Porté-Agel F, Wu Y-T, Chen C-H (2013) A numerical study of the effects of wind direction on turbine wakes and power losses in a large wind farm. Energies 6:5297–5313
Rebours Y, Kirschen D, Trotignon M, Rossignol S (2007) A survey of frequency and voltage control ancillary services, part I: technical features. IEEE Trans Power Syst 22(1):350–357
Wiser R, Bolinger M (2013) 2012 Wind technologies market report. Lawrence Berkeley National Laboratory Report, Aug 2013
Adaptive Control of Linear Time-Invariant Systems
where ym is the output of the reference model. For this transfer matching to be possible, Gp(s) and Wm(s) have to satisfy certain assumptions. These assumptions enable the calculation of the controller parameter vector as

θ* = F(θp*),   (2)

where F is a function of the plant parameters θp*, to satisfy the matching equation (1). The function in (2) has a special form in the case of MRC that allows the design of both direct and indirect MRAC. For more general classes of controller structures, this is not possible in general, as the function F is nonlinear. This transfer function matching guarantees that the tracking error e1 = yp − ym converges to zero for any given reference input signal r. If the plant parameter vector θp* is known, then the controller parameters θ* can be calculated using (2), and the controller C(s; θ*) can be implemented. We are considering the case where θp* is unknown. In this case, the use of the certainty equivalence (CE) approach (Astrom and Wittenmark 1995; Egardt 1979; Ioannou and Fidan 2006; Ioannou and Kokotovic 1983; Ioannou and Sun 1996; Landau 1979; Landau et al. 1998; Morse 1996; Narendra and Annaswamy 1989; Narendra and Balakrishnan 1997; Sastry and Bodson 1989; Stefanovic and Safonov 2011; Tao 2003), where the unknown parameters are replaced with their estimates, leads to the adaptive control scheme referred to as indirect MRAC, shown in Fig. 1a. The unknown plant parameter vector θp* is estimated at each time t, denoted by θp(t), using an online parameter estimator referred to as an adaptive law. The plant parameter estimate θp(t) at each time t is then used to calculate the controller parameter vector θ(t) = F(θp(t)) used in the controller C(s; θ(t)). This class of MRAC is called indirect MRAC, because the controller parameters are not updated directly, but calculated at each time t using the estimated plant parameters. Another way of designing MRAC schemes is to parameterize the plant transfer function in terms of the desired controller parameter vector θ*. This is possible in the MRC case, because the structure of the MRC law is such that we can use (2) to write

θp* = F⁻¹(θ*),   (3)

where F⁻¹ is the inverse of the mapping F(·), and then express Gp(s; θp*) = Gp(s; F⁻¹(θ*)) = Ḡp(s; θ*). The adaptive law for estimating θ* online can now be developed by using yp = Ḡp(s; θ*)up to obtain a parametric model that is appropriate for estimating the controller vector θ* as the unknown parameter vector. The MRAC can then be developed using the CE approach as shown in Fig. 1b. In this case, the controller parameter θ(t) is updated directly without any intermediate calculations, and for this reason, the scheme is called direct MRAC.

The division of MRAC into indirect and direct is, in general, unique to MRC structures, and it is possible due to the fact that the inverse maps in (2) and (3) exist, which is a direct consequence of the control objective and the assumptions the plant and reference model are required to satisfy for the control law to exist. These assumptions are summarized below:

Plant Assumptions: Gp(s) is minimum phase, i.e., has stable zeros; its relative degree, n* = number of poles − number of zeros, is known; and an upper bound n on its order is also known. In addition, the sign of its high-frequency gain is known, even though this assumption can be relaxed with additional complexity.

Reference Model Assumptions: Wm(s) has stable poles and zeros, its relative degree is equal to n*, that of the plant, and its order is equal to or less than the one assumed for the plant, i.e., n.

Adaptive Control of Linear Time-Invariant Systems, Fig. 1 Structure of (a) indirect MRAC, (b) direct MRAC

The above assumptions are also used to meet the control objective in the case of known parameters, and therefore the minimum phase and relative degree assumptions are characteristics of the control objective and do not arise because of adaptive control considerations. The relative degree matching is used to avoid the need to differentiate signals in the control law. The minimum phase assumption comes from the fact that the only way for the control law to force the closed-loop plant transfer function to be equal to that of the reference model is to cancel the zeros of the plant using feedback and replace them with those of the reference model using a feedforward term. Such zero pole cancelations are possible if the zeros are stable, i.e., the plant is minimum phase; otherwise stability cannot be guaranteed for nonzero initial conditions and/or inexact cancelations.

The design of MRAC in Fig. 1 has additional variations depending on how the adaptive law is designed. If the reference model is chosen to be strictly positive real (SPR), which limits its transfer function and that of the plant to have relative degree 1, the derivation of the adaptive law and the stability analysis are fairly straightforward, and for this reason, this class of MRAC schemes attracted a lot of interest. As the relative degree changes to 2, the design becomes more complex: in order to use the SPR property, the CE control law has to be modified by adding an extra nonlinear term. The stability analysis remains simple, as a single Lyapunov function can be used to establish stability. As the relative degree increases further, the design complexity increases by requiring the addition of more nonlinear terms in the CE control law (Ioannou and Fidan 2006; Ioannou and Sun 1996). The simplicity of using a single Lyapunov function analysis for stability remains, however. This approach covers both direct and indirect MRAC and leads to adaptive laws which contain no normalization signals (Ioannou and Fidan 2006; Ioannou and Sun 1996). A more straightforward design approach is based on the CE principle, which separates the control design from the parameter estimation part and leads to a much wider class of MRAC which can be direct or indirect. In this case, the adaptive laws need to be normalized for stability, and the analysis is far more complicated than the approach based on SPR with no normalization. An example of such a direct MRAC scheme, for the case of known sign of the high-frequency gain (assumed to be positive for both plant and reference model), is listed below:

Control law:

up = θ1ᵀ(t) [α(s)/Λ(s)] up + θ2ᵀ(t) [α(s)/Λ(s)] yp + θ3(t) yp + c0(t) r = θᵀ(t) ω,   (4)

where α(s) ≜ αn−2(s) = [s^(n−2), s^(n−3), …, s, 1]ᵀ for n ≥ 2, α(s) ≜ 0 for n = 1, and Λ(s) is a monic polynomial with stable roots and degree n − 1 having the numerator of Wm(s) as a factor.

Adaptive law:

θ̇ = Γ ε φ,   (5)

where Γ is a positive definite matrix referred to as the adaptive gain, ρ̇ = γ ε ξ, ε = (e1 − ρξ)/ms², ms² = 1 + φᵀφ + uf², ξ = θᵀφ + uf, φ = Wm(s)ω, and uf = Wm(s)up.

The stability properties of the above direct MRAC scheme, which are typical for all classes of MRAC, are the following (Ioannou and Fidan 2006; Ioannou and Sun 1996): (i) all signals in the closed-loop plant are bounded, and the tracking error e1 converges to zero asymptotically; and (ii) if the plant transfer function contains no zero pole cancelations and r is sufficiently rich of order 2n, i.e., it contains at least n distinct frequencies, then the parameter error |θ̃| = |θ − θ*| and the tracking error e1 converge to zero exponentially fast.
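For a first-order plant, the general scheme in (4) and (5) collapses to a classical scalar special case that needs no normalization, which makes the certainty equivalence idea easy to simulate. The sketch below is assembled for illustration (the plant, gains, and reference are assumptions, not taken from the entry): the control law uses the current estimate θ(t) in place of the matching value θ* = a + 1, and the adaptive law θ̇ = γ e1 yp drives the tracking error to zero.

```python
# Scalar direct MRAC (classical textbook special case, for illustration):
#   plant:      dyp/dt = a*yp + up, with a unknown (true value a = 2)
#   ref model:  dym/dt = -ym + r    (stable, relative degree 1, SPR)
#   control:    up = -theta*yp + r; matching value theta* = a + 1 = 3
#   adaptation: dtheta/dt = gamma * e1 * yp, with e1 = yp - ym
a_true, gamma, dt = 2.0, 5.0, 1e-3
yp = ym = theta = 0.0            # theta starts far from theta* = 3
err = []
for k in range(200_000):         # 200 s of simulated time
    t = k * dt
    r = 1.0 if int(t) % 2 == 0 else -1.0     # square-wave reference
    e1 = yp - ym
    up = -theta * yp + r
    dyp = a_true * yp + up       # plant derivative
    dym = -ym + r                # reference model derivative
    dth = gamma * e1 * yp        # adaptive law
    yp, ym, theta = yp + dt * dyp, ym + dt * dym, theta + dt * dth
    err.append(abs(e1))
print(theta)                 # approaches theta* = 3 for this rich reference
print(max(err[-10_000:]))    # tracking error is small over the last 10 s
```

With the square-wave (sufficiently rich) reference, θ also converges to the matching value, mirroring property (ii) above; with a constant reference, only the tracking error is guaranteed to vanish.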
between the plant and controller parameters is nonlinear, and the plant parameters cannot be expressed as a convenient function of the controller parameters. This prevents parametrization of the plant transfer function with respect to the controller parameters as done in the case of MRC. In special cases where such parametrization is possible, such as in MRAC, which can be viewed as a special case of APPC, the design of direct APPC is possible. The chapters Adaptive Control, Overview, Robust Adaptive Control, and History of Adaptive Control provide additional information regarding MRAC and APPC.

Search Methods, Multiple Models, and Switching Schemes

One of the drawbacks of APPC is the stabilizability condition, which requires the estimated plant at each time t to satisfy the detectability and stabilizability condition that is necessary for the controller parameters to exist. Since the adaptive law cannot guarantee such a property, an approach emerged that involves the pre-calculation of a set of controllers based on the partitioning of the plant parameter space. The problem then becomes one of identifying which one of the controllers is the most appropriate one. The switching to the “best” possible controller could be based on some logic that is driven by some cost index, multiple estimation models, and other techniques (Fekri et al. 2007; Hespanha et al. 2003; Kuipers and Ioannou 2010; Morse 1996; Narendra and Balakrishnan 1997; Stefanovic and Safonov 2011). One of the drawbacks of this approach is that it is difficult, if at all possible, to find a finite set of stabilizing controllers that cover the whole unknown parameter space, especially for high-order plants. If found, its dimension may be so large that it becomes impractical. Another drawback, present in all adaptive schemes, is that in the absence of persistently exciting signals, which guarantee that the input/output data have sufficient information about the unknown plant parameters, there is no guarantee that the controller the scheme converged to is indeed a stabilizing one. In other words, if switching is disengaged or the adaptive law is switched off, there is no guarantee that a small disturbance will not drive the corresponding LTI scheme unstable. Nevertheless, these techniques allow the incorporation of well-established robust control techniques in designing a priori the set of controller candidates. The problem is that if the plant parameters change in a way not accounted for a priori, no controller from the set may be stabilizing, leading to an unstable system.

Robust Adaptive Control

The MRAC and APPC schemes presented above are designed for LTI systems. Due to the adaptive law, the closed-loop system is no longer LTI but nonlinear and time varying. It has been shown using simple examples that the pure integral action of the adaptive law could cause parameter drift in the presence of small disturbances and/or unmodeled dynamics (Ioannou and Fidan 2006; Ioannou and Kokotovic 1983; Ioannou and Sun 1996), which could then excite the unmodeled dynamics and lead to instability. Modifications to counteract these possible instabilities led to the field of robust adaptive control, whose focus was to modify the adaptive law in order to guarantee robustness with respect to disturbances, unmodeled dynamics, time-varying parameters, classes of nonlinearities, etc., by using techniques such as normalizing signals, projection, and fixed and switching sigma modification.

Cross-References

Adaptive Control, Overview
History of Adaptive Control
Model Reference Adaptive Control
Robust Adaptive Control
Switching Adaptive Control

Bibliography

Astrom K, Wittenmark B (1995) Adaptive control. Addison-Wesley, Reading
Egardt B (1979) Stability of adaptive controllers. Springer, New York
Fekri S, Athans M, Pascoal A (2007) Robust multiple model adaptive control (RMMAC): a case study. Int J Adapt Control Signal Process 21(1):1–30
Goodwin G, Sin K (1984) Adaptive filtering prediction and control. Prentice-Hall, Englewood Cliffs
Hespanha JP, Liberzon D, Morse A (2003) Hysteresis-based switching algorithms for supervisory control of uncertain systems. Automatica 39(2):263–272
Ioannou P, Fidan B (2006) Adaptive control tutorial. SIAM, Philadelphia
Ioannou P, Kokotovic P (1983) Adaptive systems with reduced models. Springer, Berlin/New York
Ioannou P, Sun J (1996) Robust adaptive control. Prentice-Hall, Upper Saddle River
Kuipers M, Ioannou P (2010) Multiple model adaptive control with mixing. IEEE Trans Autom Control 55(8):1822–1836
Landau Y (1979) Adaptive control: the model reference approach. Marcel Dekker, New York
Landau I, Lozano R, M’Saad M (1998) Adaptive control. Springer, New York
Morse A (1996) Supervisory control of families of linear set-point controllers part I: exact matching. IEEE Trans Autom Control 41(10):1413–1431
Narendra K, Annaswamy A (1989) Stable adaptive systems. Prentice Hall, Englewood Cliffs
Narendra K, Balakrishnan J (1997) Adaptive control using multiple models. IEEE Trans Autom Control 42(2):171–187
Sastry S, Bodson M (1989) Adaptive control: stability, convergence and robustness. Prentice Hall, Englewood Cliffs
Stefanovic M, Safonov M (2011) Safe adaptive control: data-driven stability analysis and robust synthesis. Lecture notes in control and information sciences, vol 405. Springer, Berlin
Tao G (2003) Adaptive control design and analysis. Wiley-Interscience, Hoboken

Adaptive Control, Overview

Richard Hume Middleton
School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW, Australia

Abstract

…control has matured in many areas. This entry gives an overview of adaptive control with pointers to more detailed specific topics.

Keywords

Adaptive control; Estimation

Introduction

What Is Adaptive Control
Feedback control has a long history of using sensing, decision, and actuation elements to achieve an overall goal. The general structure of a control system may be illustrated in Fig. 1. It has long been known that high fidelity control relies on knowledge of the system to be controlled. For example, in most cases, knowledge of the plant gain and/or time constants (represented by θp in Fig. 1) is important in feedback control design. In addition, disturbance characteristics (e.g., frequency of a sinusoidal disturbance), d in Fig. 1, are important in feedback compensator design. Many control design and synthesis techniques are model based, using prior knowledge of both model structure and parameters. In other cases, a fixed controller structure is used, and the controller parameters, θC in Fig. 1, are tuned empirically during control system commissioning.

Adaptive Control, Overview, Fig. 1 (block diagram) General structure of a control system, with control K(θC), actuation u(t), plant G(θp), measurements y(t), disturbances D(θd), d(t), and noise n(t)
However, if the plant parameters vary widely with time or have large uncertainties, these approaches may be inadequate for high-performance control. There are two main ways of approaching high-performance control with unknown plant and disturbance characteristics:
1. Robust control (Optimization Based Robust Control), wherein a controller is designed to perform adequately despite the uncertainties. Variable structure control may have very high levels of robustness in some cases and therefore is a special class of robust nonlinear control.
2. Adaptive control, where the controller learns and adjusts its strategy based on measured data. This frequently takes the form where the controller parameters, θC, are time-varying functions that depend on the available data (y(t), u(t), and r(t)). Adaptive control has close links to intelligent control (including neural control (Neural Control and Approximate Dynamic Programming), where specific types of learning are considered) and also to stochastic adaptive control (Stochastic Adaptive Control).

Robust control is most useful when there are large unmodeled dynamics (i.e., structural uncertainties), relatively high levels of noise, or rapid and unpredictable parameter changes. Conversely, for slow or largely predictable parameter variations, with relatively well-known model structure and limited noise levels, adaptive control may provide a very useful tool for high-performance control (Åström and Wittenmark 2008).

Varieties of Adaptive Control

One practical variant of adaptive control is controller auto-tuning (Autotuning). Auto-tuning is particularly useful for PID and similar controllers and involves a specific phase of signal injection, followed by analysis, PID gain computation, and implementation. These techniques are an important aid to the commissioning and maintenance of distributed control systems.

There are also large classes of adaptive controllers that continuously monitor the plant input-output signals to adjust the strategy. These adjustments are often parametrized by a relatively small number of coefficients, θC. These include schemes where the controller parameters are directly adjusted using measurable data (also referred to as “implicit,” since there is no explicit plant model generated). Early examples of this often included model reference adaptive control (Model Reference Adaptive Control). Other schemes (Middleton et al. 1988) explicitly estimate a plant model θP; thereafter, online control design is performed, and therefore the adaptation of the controller parameters θC is indirect. This then led on to a range of other adaptive control techniques applicable to linear systems (Adaptive Control of Linear Time-Invariant Systems).

There have been significant questions concerning the sensitivity of some adaptive control algorithms to unmodeled dynamics, time-varying systems, and noise (Ioannou and Kokotovic 1984; Rohrs et al. 1985). This prompted a very active period of research to analyze and redesign adaptive control to provide suitable robustness (Robust Adaptive Control) (e.g., Anderson et al. 1986; Ioannou and Sun 2012) and parameter tracking for time-varying systems (e.g., Kreisselmeier 1986; Middleton and Goodwin 1988). Work in this area further spread to nonparametric methods, such as switching, or supervisory, adaptive control (Switching Adaptive Control) (e.g., Fu and Barmish 1986; Morse et al. 1992). In addition, there has been a great deal of work on the more difficult problem of adaptive control for nonlinear systems (Nonlinear Adaptive Control).

A further adaptive control technique is extremum seeking control (Extremum Seeking Control). In extremum seeking (or self-optimizing) control, the desired reference for the system is unknown; instead, we wish to maximize (or minimize) some variable in the system (Ariyur and Krstic 2003). These techniques have quite distinct modes of operation that have proven important in a range of applications.
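The auto-tuning phases listed above (signal injection, analysis, gain computation) can be sketched with the classical relay experiment in the style of Åström and Hägglund: a relay in the feedback loop induces a limit cycle whose amplitude a and period Tu yield the ultimate gain Ku ≈ 4d/(πa), from which Ziegler–Nichols-type PID settings follow. The third-order test plant and the tuning rules below are illustrative assumptions, not taken from the entry.

```python
import math

# Phase 1 - signal injection: relay feedback around a test plant
# G(s) = 1/(s+1)^3, realized as three cascaded first-order lags.
d_relay, dt = 1.0, 1e-3
x1 = x2 = x3 = 0.0
y_log = []
for k in range(60_000):                      # 60 s experiment
    u = d_relay if x3 <= 0.0 else -d_relay   # ideal relay acting on -y
    x1 += dt * (u - x1)
    x2 += dt * (x1 - x2)
    x3 += dt * (x2 - x3)
    y_log.append(x3)

# Phase 2 - analysis: limit-cycle amplitude and period from the
# steady portion of the record (last 20 s).
tail = y_log[-20_000:]
amp = (max(tail) - min(tail)) / 2.0
ups = [i for i in range(1, len(tail)) if tail[i - 1] < 0.0 <= tail[i]]
Tu = dt * (ups[-1] - ups[0]) / (len(ups) - 1)    # mean oscillation period
Ku = 4.0 * d_relay / (math.pi * amp)             # describing-function estimate

# Phase 3 - gain computation: classic Ziegler-Nichols PID rules.
Kp, Ti, Td = 0.6 * Ku, Tu / 2.0, Tu / 8.0
print(Ku, Tu)   # for this plant the exact values are Ku = 8, Tu = 2*pi/sqrt(3)
print(Kp, Ti, Td)
```

In practice, a hysteresis band is added to the relay to reject measurement noise, and the experiment is supervised for safety; those details are omitted here.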
A final control algorithm that has nonparametric features is iterative learning control (▶Iterative Learning Control) (Amann et al. 1996; Moore 1993). This control scheme considers a system with a highly structured, namely, repetitive finite run, control problem. In this case, by taking a nonparametric approach of utilizing information from previous run(s), in many cases near-perfect asymptotic tracking can be achieved.

Adaptive control has a rich history (▶History of Adaptive Control) and has been established as an important tool for some classes of control problems.

Cross-References

▶Adaptive Control of Linear Time-Invariant Systems
▶Autotuning
▶Extremum Seeking Control
▶History of Adaptive Control
▶Iterative Learning Control
▶Model Reference Adaptive Control
▶Neural Control and Approximate Dynamic Programming
▶Nonlinear Adaptive Control
▶Optimization Based Robust Control
▶Robust Adaptive Control
▶Stochastic Adaptive Control
▶Switching Adaptive Control

Bibliography

Amann N, Owens DH, Rogers E (1996) Iterative learning control using optimal feedback and feedforward actions. Int J Control 65(2):277–293
Anderson BDO, Bitmead RR, Johnson CR, Kokotovic PV, Kosut RL, Mareels IMY, Praly L, Riedle BD (1986) Stability of adaptive systems: passivity and averaging analysis. MIT, Cambridge
Ariyur KB, Krstic M (2003) Real-time optimization by extremum-seeking control. Wiley, New Jersey
Åström KJ, Wittenmark B (2008) Adaptive control. Courier Dover Publications, Mineola
Fu M, Barmish BR (1986) Adaptive stabilization of linear systems via switching control. IEEE Trans Autom Control 31(12):1097–1103
Ioannou PA, Kokotovic PV (1984) Instability analysis and improvement of robustness of adaptive control. Automatica 20(5):583–594
Ioannou PA, Sun J (2012) Robust adaptive control. Dover Publications, Mineola/New York
Kreisselmeier G (1986) Adaptive control of a class of slowly time-varying plants. Syst Control Lett 8(2):97–103
Middleton RH, Goodwin GC (1988) Adaptive control of time-varying linear systems. IEEE Trans Autom Control 33(2):150–155
Middleton RH, Goodwin GC, Hill DJ, Mayne DQ (1988) Design issues in adaptive control. IEEE Trans Autom Control 33(1):50–58
Moore KL (1993) Iterative learning control for deterministic systems. Advances in industrial control series. Springer, London/New York
Morse AS, Mayne DQ, Goodwin GC (1992) Applications of hysteresis switching in parameter adaptive control. IEEE Trans Autom Control 37(9):1343–1354
Rohrs C, Valavani L, Athans M, Stein G (1985) Robustness of continuous-time adaptive control algorithms in the presence of unmodeled dynamics. IEEE Trans Autom Control 30(9):881–889

Adaptive Cruise Control

Rajesh Rajamani
Department of Mechanical Engineering, University of Minnesota, Twin Cities, Minneapolis, MN, USA

Abstract

This chapter discusses advanced cruise control automotive technologies, including adaptive cruise control (ACC) in which spacing control, speed control, and a number of transitional maneuvers must be performed. The ACC system must satisfy difficult performance requirements of vehicle stability and string stability. The technical challenges involved and the control design techniques utilized in ACC system design are presented.

Keywords

Collision avoidance; String stability; Traffic stability; Vehicle following
Introduction

Adaptive cruise control (ACC) is an extension of cruise control. An ACC vehicle includes a radar, a lidar, or other sensor that measures the distance to any preceding vehicle in the same lane on the highway. In the absence of preceding vehicles, the speed of the car is controlled to a driver-desired value. In the presence of a preceding vehicle, the controller determines whether the vehicle should switch from speed control to spacing control. In spacing control, the distance to the preceding car is controlled to a desired value.

A different form of advanced cruise control is a forward collision avoidance (FCA) system. An FCA system uses a distance sensor to determine if the vehicle is approaching a car ahead too quickly and will automatically apply brakes to minimize the chances of a forward collision. For the 2013 model year, 29 % of vehicles have forward collision warning as an available option and 12 % include autonomous braking for a full FCA system. Examples of models in which an FCA system is standard are the Mercedes Benz G-class and the Volvo S-60, S-80, XC-60, and XC-70.

It should be noted that an FCA system does not involve steady-state vehicle following. An ACC system, on the other hand, involves control of speed and spacing to desired steady-state values. ACC systems have been in the market in Japan since 1995, in Europe since 1998, and in the US since 2000. An ACC system provides enhanced driver comfort and convenience by allowing extended operation of the cruise control option even in the presence of other traffic.

Controller Architecture

The ACC system has two modes of steady-state operation: speed control and vehicle following (i.e., spacing control). Speed control is traditional cruise control and is a well-established technology. A proportional-integral controller based on feedback of vehicle speed (calculated from rotational wheel speeds) is used in cruise control (Rajamani 2012).

Controller design for vehicle following is the primary topic of discussion in the sections titled “Vehicle Following Requirements” and “String Stability Analysis” in this chapter. Transitional maneuvers and transitional control algorithms are discussed in the section titled “Transitional Maneuvers” in this chapter.

The longitudinal control system architecture for an ACC vehicle is typically designed to be hierarchical, with an upper-level controller and a lower-level controller, as shown in Fig. 1.

[Adaptive Cruise Control, Fig. 1 Structure of longitudinal control system: the upper controller passes the desired acceleration to the lower controller, fault messages flow back up, and the lower controller issues the actuator inputs.]

The upper-level controller determines the desired acceleration for the vehicle. The lower-level controller determines the throttle and/or brake commands required to track the desired acceleration. Vehicle dynamic models, engine maps, and nonlinear control synthesis techniques are used in the design of the lower controller (Rajamani 2012). This chapter will focus only on the design of the upper controller, also known as the ACC controller.

As far as the upper-level controller is concerned, the plant model for control design is

ẍ_i = u    (1)

where the subscript i denotes the ith car in a string of consecutive ACC cars. The acceleration of the car is thus assumed to be the control input. However, due to the finite bandwidth associated
with the lower level controller, each car is actually expected to track its desired acceleration imperfectly. The objective of the upper level controller design is therefore stated as that of meeting required performance specifications robustly in the presence of a first order lag in the lower-level controller performance:

…

… vehicle speed ẋ_i. The ACC control law is said to provide individual vehicle stability if the spacing error of the ACC vehicle converges to zero when the preceding vehicle is operating at constant speed:

ẍ_{i−1} → 0  ⇒  δ_i → 0.    (4)

…

(b) The impulse response function h(t) corresponding to Ĥ(s) should not change sign (Swaroop and Hedrick 1996), i.e.,

h(t) ≥ 0,  ∀ t ≥ 0.    (7)

The reasons for these two requirements to be satisfied are described in Rajamani (2012). Roughly speaking, Eq. (6) ensures that ‖δ_i‖₂ ≤ ‖δ_{i−1}‖₂, which means that the energy in the spacing error signal decreases as the spacing error propagates towards the tail of the string. Equation (7) ensures that the steady state spacing errors of the vehicles in the string have the same sign. This is important because a positive spacing error implies that a vehicle is closer than desired, while a negative spacing error implies that it is further apart than desired. If the steady state value of δ_i is positive while that of δ_{i−1} is negative, then this might be dangerous due to the vehicle being closer, even though in terms of magnitude δ_i might be smaller than δ_{i−1}. If conditions (6) and (7) are both satisfied, then ‖δ_i‖_∞ ≤ ‖δ_{i−1}‖_∞ (Rajamani 2012).

…

Equation (9) describes the propagation of spacing errors along the vehicle string. All positive values of k_p and k_v guarantee individual vehicle stability. However, it can be shown that there are no positive values of k_p and k_v for which the magnitude of G(s) can be guaranteed to be less than unity at all frequencies. The details of this proof are available in Rajamani (2012). Thus, the constant spacing policy will always be string unstable.

Constant Time-Gap Spacing

Since the constant spacing policy is unsuitable for autonomous control, a better spacing policy that can ensure both individual vehicle stability and string stability must be used. The constant time-gap (CTG) spacing policy is such a spacing policy. In the CTG spacing policy, the desired inter-vehicle spacing is not constant but varies with velocity. The spacing error is defined as

δ_i = x_i − x_{i−1} + L_des + h ẋ_i.    (10)

[Adaptive Cruise Control, Fig. 3: R–Ṙ diagram showing the regions of operation, including the “too close” region 3 and the crash boundary; only stray axis labels survived extraction.]
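The string-instability claim for the constant spacing policy can be checked numerically. The error-propagation transfer function is not reproduced in this excerpt; the sketch below assumes the standard form G(s) = (k_v s + k_p)/(s² + k_v s + k_p) that results from a PD spacing law (Rajamani 2012), and searches a frequency grid for the peak magnitude:

```python
def peak_gain(kp, kv, wmax=50.0, n=20000):
    """Approximate, over a frequency grid, the peak of |G(jw)| for
    G(s) = (kv*s + kp) / (s^2 + kv*s + kp)."""
    peak = 0.0
    for i in range(1, n + 1):
        w = wmax * i / n
        s = 1j * w
        g = (kv * s + kp) / (s * s + kv * s + kp)
        peak = max(peak, abs(g))
    return peak

# Every positive gain pair tried yields a peak magnitude above unity,
# consistent with the constant spacing policy always being string unstable.
for kp, kv in [(1.0, 1.0), (4.0, 0.5), (0.5, 4.0), (10.0, 2.0)]:
    print(kp, kv, round(peak_gain(kp, kv), 3))
```

Under this assumed G(s), a short calculation shows |G(jω)| > 1 exactly when 0 < ω² < 2k_p, which is why no choice of positive gains can keep the magnitude at or below unity at all frequencies.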
… measured real-time values of R and Ṙ; and the R–Ṙ diagram in Fig. 3, the ACC system determines the mode of longitudinal control. For instance, in region 1, the vehicle continues to operate under speed control. In region 2, the vehicle operates under spacing control. In region 3, the vehicle decelerates at the maximum allowed deceleration so as to try and avoid a crash.

The switching line from speed to spacing control is given by

R = T Ṙ + R_final    (17)

where T is the slope of the switching line. When a slower vehicle is encountered at a distance larger than the desired final distance R_final, the switching line shown in Fig. 4 can be used to determine when and whether the vehicle should switch to spacing control. If the distance R is greater than that given by the line, speed control should be used.

The overall strategy (shown by trajectory ABC) is to first reduce the gap at constant Ṙ and then follow the desired spacing given by the switching line of Eq. (17).

The control law during spacing control on this transitional trajectory is as follows. Depending on the value of Ṙ, determine R from Eq. (17). Then use R as the desired inter-vehicle spacing in the PD control law

ẍ_des = −k_p (x_i − R) − k_d (ẋ_i − Ṙ)    (18)

Traffic Stability

In addition to individual vehicle stability and string stability, another type of stability analysis that has received significant interest in the ACC literature is traffic flow stability. Traffic flow stability refers to the stable evolution of traffic velocity and traffic density on a highway section, for given inflow and outflow conditions. One well-known result in this regard in the literature is that traffic flow is defined to be stable if ∂q/∂ρ is positive, i.e., as the density ρ of traffic increases, the traffic flow rate q must increase (Swaroop and Rajagopal 1999). If this condition is not satisfied, the highway section would be unable to accommodate any constant inflow of vehicles from an oncoming ramp. The steady state traffic flow on the highway section would come to a stop, if the ramp inflow did not stop (Swaroop and Rajagopal 1999).

It has been shown that the constant time-gap spacing policy used in ACC systems has a negative ∂q/∂ρ slope and thus does not lead to traffic flow stability (Swaroop and Rajagopal 1999). It has also been shown that it is possible to design other spacing policies (in which the desired spacing between vehicles is a nonlinear function of speed, instead of being proportional to speed) that can provide stable traffic flow (Santhanakrishnan and Rajamani 2003).

The importance of traffic flow stability has not been fully understood by the research community. Traffic flow stability is likely to become important when the number of ACC vehicles
on the highway increase and their penetration percentage into vehicles on the road becomes significant.

Recent Automotive Market Developments

The latest versions of ACC systems on the market have been enhanced with collision warning, integrated brake support, and stop-and-go operation functionality.

The collision warning feature uses the same radar as the ACC system to detect moving vehicles ahead and determine whether driver intervention is required. In this case, visual and audio warnings are provided to alert the driver and brakes are pre-charged to allow quick deceleration. On Ford’s ACC-equipped vehicles, brakes are also automatically applied when the driver lifts the foot off the accelerator pedal in a detected collision warning scenario.

When enabled with stop-and-go functionality, the ACC system can also operate at low vehicle speeds in heavy traffic. The vehicle can be automatically brought to a complete stop when needed and restarted automatically. Stop-and-go is an expensive option and requires the use of multiple radar sensors on each car. For instance, the BMW ACC system uses two short range and one long range radar sensor for stop-and-go operation.

The 2013 versions of ACC on the Cadillac ATS and on the Mercedes Distronic systems are also being integrated with camera-based lateral lane position measurement systems. On the Mercedes Distronic systems, a camera steering assist system provides automatic steering, while on the Cadillac ATS, a camera-based system provides lane departure warnings.

Future Directions

Current ACC systems use only on-board sensors and do not use wireless communication with other vehicles. There is a likelihood of evolution of current systems into co-operative adaptive cruise control (CACC) systems which utilize wireless communication with other vehicles and highway infrastructure. This evolution could be facilitated by the dedicated short-range communications (DSRC) capability being developed by government agencies in the US, Europe, and Japan. In the US, DSRC is being developed with a primary goal of enabling communication between vehicles and with infrastructure to reduce collisions and support other safety applications. In CACC, wireless communication could provide acceleration signals from several preceding downstream vehicles. These signals could be used in better spacing policies and control algorithms to improve safety, ensure string stability, and improve traffic flow.

Cross-References

▶Lane Keeping
▶Vehicle Dynamics Control
▶Vehicular Chains

Bibliography

Fancher P, Bareket Z (1994) Evaluating headway control using range versus range-rate relationships. Veh Syst Dyn 23(8):575–596
Rajamani R (2012) Vehicle dynamics and control, 2nd edn. Springer, New York. ISBN:978-1461414322
Santhanakrishnan K, Rajamani R (2003) On spacing policies for highway vehicle automation. IEEE Trans Intell Transp Syst 4(4):198–204
Swaroop D, Hedrick JK (1996) String stability of interconnected dynamic systems. IEEE Trans Autom Control 41(3):349–357
Swaroop D, Rajagopal KR (1999) Intelligent cruise control systems and traffic flow stability. Transp Res C Emerg Technol 7(6):329–352
Swaroop D, Hedrick JK, Chien CC, Ioannou P (1994) A comparison of spacing and headway control laws for automatically controlled vehicles. Veh Syst Dyn 23(8):597–625
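The negative ∂q/∂ρ slope of the constant time-gap policy, noted in the section “Traffic Stability” above, can be verified in a few lines: in steady state the CTG policy gives a spacing 1/ρ = L + h v, hence q = ρ v = (1 − Lρ)/h. The values of L and h below are illustrative, not taken from the entry:

```python
L = 5.0   # effective vehicle length plus standstill gap [m] (illustrative)
h = 1.4   # constant time gap [s] (illustrative)

def flow(rho):
    """Steady-state flow q = rho*v under CTG spacing 1/rho = L + h*v."""
    v = (1.0 / rho - L) / h     # steady-state speed consistent with density rho
    return rho * v

densities = [0.02 + 0.02 * k for k in range(8)]   # all below jam density 1/L = 0.2 veh/m
flows = [flow(r) for r in densities]
slopes = [(flows[k + 1] - flows[k]) / (densities[k + 1] - densities[k])
          for k in range(len(flows) - 1)]
print(slopes)  # every slope equals -L/h < 0: flow falls as density rises
```

Since q(ρ) is linear with slope −L/h, no choice of positive L and h makes ∂q/∂ρ positive, which is the traffic-flow-instability property attributed to the CTG policy by Swaroop and Rajagopal (1999).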
Advanced Manipulation for Underwater Sampling

Giuseppe Casalino
University of Genoa, Genoa, Italy

Keywords

Kinematic control law (KCL); Manipulator; Motion priorities

Introduction

[Advanced Manipulation for Underwater Sampling, Fig. 1 Snapshots showing the underwater floating manipulator TRIDENT when autonomously picking an identified object]
… result of the developments outlined in the next section.

The developed framework however solely represents the so-called Kinematic Control Layer (KCL) of the overall control architecture, that is, the one in charge of closed-loop real-time control generating the system velocity vector y as a reference signal, to be in turn concurrently tracked, via the action of the arm joint torques and vehicle thrusters, by an adequate underlying Dynamic Control Layer (DCL), where the overall dynamic and hydrodynamic effects are taken into account to the benefit of such velocity tracking. Since the DCL can actually be designed in a way substantially independent from the system mission(s), it will not constitute a specific topic of this entry. Its detailed dynamic-hydrodynamic model-based structuring, also including a stability analysis, can be found in Casalino (2011), together with a more detailed description of the upper-lying KCL, while more general references on underwater dynamic control aspects can be found, for instance, in Antonelli (2006).

Task-Priority-Based Control of Floating Manipulators

The above-outlined typical set of objectives (of inequality and/or equality types) to be achieved within a sampling mission are here formalized. Then some helpful generalizing definitions are given, prior to presenting the related unifying task-priority-based algorithmic framework to be used.

Inequality and Equality Objectives

One of the objectives, of inequality type, related to both arm safety and its operability is that of maintaining each joint within corresponding minimum and maximum limits, that is,

q_im < q_i < q_iM ;  i = 1, 2, …, 7

Moreover, in order to have the arm operating with dexterity, its manipulability measure (Nakamura 1991; Yoshikawa 1985) must ultimately stay above a minimum threshold value, thus also requiring the achievement of the inequality type objective

μ > μ_m

While the above objectives arise from inherently scalar variables, other objectives instead arise as conditions to be achieved within the Cartesian space, where each one of them can be conveniently expressed in terms of the modulus associated to a corresponding Cartesian vector variable. To be more specific, let us, for instance, refer to the need of avoiding occlusions between the sample and the stereo camera, which might occasionally occur due to the arm link motions. Then such need can be, for instance, translated into the ultimate achievement of the following set of inequalities, for suitably chosen values of the boundaries:

‖l‖ > l_m ;  ‖ξ‖ > ξ_m ;  ‖ξ‖ < ξ_M

where l is the vector lying on the vehicle x–y plane, joining the arm elbow with the line parallel to the vehicle z-axis and passing through the camera frame origin, as sketched in Fig. 2a. Moreover ξ is the misalignment vector, also lying on the vehicle x–y plane, joining the lines parallel to the vehicle z-axis and, respectively, passing through the elbow and the end-effector origin.

As for the vehicle, it must keep the object of interest grossly centered in the camera frame (see Fig. 2b), thus meaning that the modulus of the orientation error γ, formed by the unit vector n_p of vector p from the sample to the camera frame and the unit vector k_c of the z-axis of the camera frame itself, must ultimately satisfy the inequality

‖γ‖ < γ_M

Furthermore, the camera must also be closer than a given horizontal distance d_M to the vertical line passing through the sample, and it must lie between a maximum and minimum height with respect to the sample itself, thus implying the achievement of the following inequalities (Fig. 2c, d):
[Advanced Manipulation for Underwater Sampling, Fig. 2 Vectors allowing for the definition of some inequality objectives in the Cartesian space: (a) camera occlusion, (b) camera centering, (c) camera distance, (d) camera height]
‖d‖ < d_M ;  h_m < ‖h‖ < h_M

Also, since the vehicle should exhibit an almost horizontal attitude, this further requires the achievement of the following additional inequality:

‖φ‖ < φ_M

with the misalignment vector φ formed by the absolute vertical unit vector k_o with the vehicle z-axis one k_v.

And finally the end-effector must eventually reach the sample, for then picking it. Thus the following, now of equality type, objectives must also be ultimately achieved, where r is the position error and ϑ the orientation one of the end-effector frame with respect to the sample frame:

‖r‖ = 0 ;  ‖ϑ‖ = 0

As already repeatedly remarked, the achievement of the above inequality objectives (since related to the system safety and/or its best operability) must globally deserve a priority higher than the last equality.

Basic Definitions

The following definitions only regard a generic vector s ∈ R³ characterizing a corresponding generic objective defined in the Cartesian space
(for instance, with the exclusion of the joint and manipulability limits, all the other above-reported objectives). In this case the vector is termed to be the error vector of the objective, and it is assumed measured with components on the vehicle frame. Then its modulus

σ ≐ ‖s‖

is termed to be the error, while its unit vector

n ≐ s/σ ;  σ ≠ 0

is accordingly denoted as the unit error vector. Then the following differential Jacobian relationship can always be evaluated for each of them:

ṡ = H y

where y ∈ R^N (N = 7 + 6 for the system of Fig. 1) is the stacked vector composed of the joint velocity vector q̇ ∈ R⁷, plus the stacked vector v ∈ R⁶ of the absolute vehicle velocities (linear and angular) with components on the vehicle frame, and with ṡ clearly representing the time derivative of vector s itself, as seen from the vehicle frame and with components on it (see Casalino (2011) for details on the real-time evaluation of the Jacobian matrices H).

Obviously, for the time derivative σ̇ of the error, also the following differential relationship holds:

σ̇ = nᵀ H y

Further, to each error variable σ, a so-called error reference rate is assigned in real time of the form

σ̄̇ = −(σ − σ_o) α(σ)

where for equality objectives σ_o is the target value and α(σ) ≡ 1, while for inequality ones, σ_o is the threshold value and α(σ) is a left-cutting or right-cutting (in correspondence of σ_o) smooth sigmoidal activation function, depending on whether the objective is to force σ to be below or above σ_o, respectively.

In case σ̄̇ could be exactly assigned to its corresponding error rate σ̇, it would consequently smoothly drive σ toward the achievement of its associated objective. Note however that for inequality objectives, it would necessarily impose σ̇ = 0 in correspondence of a point located inside the interval of validity of the inequality objective itself, while instead such an error rate zeroing effect should be relaxed, for allowing the helpful subsequent system mobility increase, which allows for further progress toward other lower priority control objectives. Such a relaxation aspect will be dealt with soon.

Furthermore, in correspondence of a reference error rate σ̄̇, the so-called reference error vector rate can also be defined as

s̄̇ ≐ n σ̄̇

that for equality objectives requiring the zeroing of their error simply becomes

s̄̇ = −s

whose evaluation, since not requiring its unit vector n, will be useful for managing equality objectives.

Finally note that for each objective not defined in the Cartesian space (like, for instance, the above joint limits and manipulability), the corresponding scalar error variable, its rate, and its reference error rate can instead be managed directly, since obviously they do not require any preliminary scalar reduction process.

Managing the Higher Priority Inequality Objectives

A prioritized list of the various scalar inequality objectives, to be concurrently progressively achieved, is suitably established in a descending priority order.

Then, by starting to consider the highest priority one, we have that the linear manifold of the system velocity vector y (i.e., the arm joints velocity vector q̇ stacked with vector v of the vehicle linear and angular velocities), capable of driving toward its achievement, results at each
time instant as the set of solutions of the following minimization problem with scalar argument, with row vector G₁ ≐ α₁ n₁ᵀ H₁ and with scalar α₁ the same activation function embedded within the reference error rate σ̄̇₁:

S₁ ≐ argmin_y { | σ̄̇₁ − G₁ y |² } :
y = G₁^# σ̄̇₁ + (I − G₁^# G₁) z₁ ≐ ρ₁ + Q₁ z₁ ;  ∀ z₁    (1)

The above minimization, whose solution manifold appears at the right (also expressed in a concise notation with an obvious correspondence of terms), parameterized by the arbitrary vector z₁, has to be assumed executed without extracting the common factor α₁, that is, by evaluating the pseudo-inverse matrix G₁^# via the regularized form

G₁^# = (α₁² n₁ᵀ H₁ H₁ᵀ n₁ + p₁)⁻¹ α₁ H₁ᵀ n₁

with p₁ a suitably chosen bell-shaped, finite-support, centered-on-zero regularizing function of the norm of row vector G₁.

In the above solution manifold, when α₁ = 1 (i.e., when the first inequality is still far from being achieved), the second arbitrary term Q₁ z₁ is orthogonal to the first, thus having no influence on the generated σ̇₁ = σ̄̇₁, and it is consequently suitable to be used for also progressing toward the achievement of other lower priority objectives, without perturbing the current progressive achievement of the first one. Note however that, since in this condition the span of the second term results one dimension less than the whole system velocity space y ∈ R^N, this implies that the lower priority objectives can be progressed by only acting within a one-dimension-reduced system velocity subspace.

When α₁ = 0 (i.e., when the first inequality is achieved), since G₁^# = 0 (as granted by the regularization) and consequently y = z₁, the lower priority objectives can instead be progressed by now exploiting the whole system velocity space.

When instead α₁ is within its transition zone with 0 < α₁ < 1 (i.e., when the first inequality is near to being achieved), since the two terms of the solution manifold now become only approximately orthogonal, the usage of the second term for managing lower priority tasks can possibly counteract the first, currently acting in favor of the highest priority one, but in any case without any possibility of making the primary error variable σ₁ get out of its enlarged boundaries (i.e., the ones inclusive of the transition zone), thus meaning that once the primary variable σ₁ has entered within such larger boundaries, it will definitely never get out of them.

With the above considerations in mind, managing the remaining priority-descending sequence of inequality objectives can then be done by applying the same philosophy to each of them and within the mobility space left free by its preceding ones, that is, as the result of the following sequence of nested minimization problems:

S_i ≐ argmin_{y ∈ S_{i−1}} { | σ̄̇_i − G_i y |² } ;  i = 1, 2, …, k

with G_i ≐ α_i n_iᵀ H_i and with k indexing the lowest priority inequality objective, and where the highest priority objective has been also included for the sake of completeness (upon letting S₀ ≐ R^N). In this way the procedure guarantees the concurrent prioritized convergence (actually occurring as a sort of “domino effect” scattering along the prioritized objective list) toward the ultimate fulfillment of all inequality objectives, each one within its enlarged bounds at worst and with no possibility of getting out of them, once reached.

Further, a simple algebra allows translating the above sequence of k nested minimizations into the following algorithmic structure, with initialization ρ₀ = 0; Q₀ = I (see Casalino et al. 2012a, b for more details):

Ĝ_i ≐ G_i Q_{i−1}

T_i ≐ I − Q_{i−1} Ĝ_i^# Ĝ_i
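A drastically simplified numerical sketch of the prioritized resolution just described, for two scalar tasks acting on a small velocity vector: task 1 is solved through a regularized pseudo-inverse of its row Jacobian, and task 2 is then resolved inside the null space of task 1. The activation functions α_i and the bell-shaped regularizing functions p_i of the entry are replaced here by constants, and all Jacobian rows and reference rates are toy values:

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def row_pinv(g, p=1e-6):
    """Regularized pseudo-inverse of a 1xN row vector g: g^T / (g g^T + p)."""
    d = dot(g, g) + p
    return [x / d for x in g]

def task_priority(g1, rate1, g2, rate2):
    """Two-task prioritized velocity: y = G1# r1 + Q1 z, with z serving task 2."""
    n = len(g1)
    g1p = row_pinv(g1)
    rho1 = [gi * rate1 for gi in g1p]                     # minimum-norm task-1 term
    # Task-2 row restricted to the task-1 null space: g2 Q1, with Q1 = I - g1# g1
    g2q = [g2[k] - dot(g2, g1p) * g1[k] for k in range(n)]
    z = [gi * (rate2 - dot(g2, rho1)) for gi in row_pinv(g2q)]
    qz = [z[k] - g1p[k] * dot(g1, z) for k in range(n)]   # project z through Q1
    return [rho1[k] + qz[k] for k in range(n)]

g1 = [1.0, 0.0, 0.0]   # task-1 Jacobian row (toy)
g2 = [1.0, 1.0, 0.0]   # task-2 Jacobian row, partially conflicting (toy)
y = task_priority(g1, 0.5, g2, 1.0)
print([round(v, 3) for v in y])  # [0.5, 0.5, 0.0]: both tasks met, task 1 undisturbed
```

If task 2 were entirely in conflict with task 1 (g2 parallel to g1), its row restricted to the null space would vanish and the projection would strip its contribution, so the lower-priority task degrades gracefully instead of disturbing the higher-priority one, mirroring the role the regularized G₁^# plays in the entry.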
The International Civil Aviation Organization (ICAO) has divided the world’s airspace into flight information regions. Each region has a country that controls the airspace, and the ANSP for each country can be a government department, state-owned company, or private organization. For example, in the United States, the ANSP is the Federal Aviation Administration (FAA), which is a government department. The Canadian ANSP is NAV CANADA, which is a private company.

Each country is different in terms of the services provided by the ANSP, how the ANSP operates, and the tools available to the controllers. In the USA and Europe, the airspace is divided into sectors and areas around airports. An air traffic control center is responsible for traffic flow within its sector, and rules and procedures are in place to cover transfer of control between sectors. The areas around busy airports are usually handled by a terminal radar approach control. The air traffic control tower personnel handle departing aircraft, landing aircraft, and the movement of aircraft on the airport surface.

Air traffic controllers in developed air travel markets like the USA and Europe have tools that help them with the business of controlling and separating aircraft. Tower controllers operating at airports can see aircraft directly through windows or on computer screens through surveillance technology such as radar and Automatic Dependent Surveillance-Broadcast (ADS-B). Tower controllers may have additional tools to help detect and prevent potential collisions on the airport surface. En route controllers can see aircraft on computer screens and may have additional tools to help detect potential losses of separation between aircraft. Controllers can communicate with aircraft via radio, and some have datalink communication available such as Controller-Pilot Datalink Communications (CPDLC).

Flight crews have tools to help with navigating and flying the airplane. Autopilots and autothrottles off-load the pilot from having to continuously control the aircraft; instead, the pilot can specify the speed, altitude, and heading, and the autopilot and autothrottle will maintain those commands. Flight management systems (FMS) assist in flight planning in addition to providing lateral and vertical control of the airplane. Many aircraft have special safety systems such as the Traffic Alert and Collision Avoidance System, which alerts the flight crew to potential collisions with other airborne aircraft, and the Terrain Avoidance Warning Systems (TAWS), which alert the flight crew to potential flight into terrain.

Causes of Congestion and Delays

Congestion and delays arise for multiple reasons. These include tight scheduling; safety limitations on how quickly aircraft can take off and land and how closely they can fly; infrastructure limitations such as the number of runways at an airport and the airway structure; and disturbances such as weather and unscheduled maintenance.

Tight Scheduling

Tight scheduling is a major contributor to congestion and delays. The hub and spoke system that many major airlines operate with to minimize connection times means that aircraft arrive and depart in multiple banks during the day. During the arrival and departure banks, airports are very busy. As mentioned previously, passengers have preferred times to travel, which also increases demand at certain times. At airports that do not limit flight schedules by using slot scheduling, the number of flights scheduled can actually exceed the departure and arrival capacity of the airport even in best-case conditions. One of the reasons that airlines are asked to report on-time statistics is to make the published airline schedules more reflective of the average time from departure to arrival, not the best-case time.

Aircraft themselves are also tightly scheduled. Aircraft are an expensive capital asset. Since customers are very sensitive to ticket prices, airlines need to have their aircraft flying as many hours as possible per day. Airlines also limit the number of spare aircraft and flight crews available to fill in when operations are disrupted to control costs.
30 Air Traffic Management Modernization: Promise and Challenges
or may not have the required additional capacity, so the rerouting cascades on both sides of the weather.

Why Is Air Traffic Management Modernization So Hard?

Air traffic management modernization is difficult for financial and technical reasons. The air traffic management system operates around the clock. It cannot be taken down for a significant period of time without a major effect on commerce and the economy.

Financing is a significant challenge for air traffic management modernization. Governments worldwide are facing budgetary challenges, and improvements to air travel are one of many competing financial interests. Local airport authorities have similar challenges in raising money for airport improvements. Airlines have competitive limitations on how much ticket prices can rise and therefore need to see a payback on investment in aircraft upgrades that can be as short as 2 years.

Another financial challenge is that the entity that needs to pay for the majority of an improvement may not be the entity that gets the majority of the benefit, at least near term. One example of this is the installation of ADS-B transmitters on aircraft. Buying and installing an ADS-B transmitter costs the aircraft owner money. It benefits the ANSPs, who can receive the transmissions and have them augment or replace expensive radar surveillance, but only if a large number of aircraft are equipped. Eventually the ANSP benefit will be seen by the aircraft operator through lower operating costs, but it takes time. This is one reason that ADS-B transmitter equipage was mandated in the USA, Europe, and other parts of the world rather than letting market forces drive equipage.

All entities, whether governmental or private, need some sort of business case to justify investment, where it can be shown that the benefit of the improvement outweighs the cost. The same system complexity that makes congestion and delays in one region propagate throughout the system makes it a challenge to accurately estimate benefits. It is complicated to understand whether an improvement in one part of the system will really help or just shift where the congestion points are. Decisions need to be made on which improvements are the best to invest in. For government entities, societal benefits can be as important as financial payback, and someone needs to decide whose interests are more important. For example, the people living around an airport might want longer arrival paths at night to minimize noise, while air travelers and the airline want the airline to fly the most direct route into an airport. A combination of subject-matter expertise and simulation can provide a starting point to estimate benefit, but often only operational deployment will provide realistic estimates.

It is a long process to develop new technologies and operational procedures even when the benefit is clear and financing is available. The typical development steps include describing the operational concept; developing new controller procedures, pilot procedures, or phraseology if needed; performing a safety and performance analysis to determine high-level requirements; performing simulations that at some point may include controllers or pilots; designing and building equipment that can include software, hardware, or both; and field testing or flight testing the new equipment. Typically, new ground tools are field tested in a shadow mode, where controllers can use the tool in a mock situation driven by real data before the tool is made fully operational. Flight testing is performed on aircraft that are flying with experimental certificates so that equipment can be tested and demonstrated prior to formal certification.

Avionics need to be certified before operational use to meet the rules established to ensure that a high safety standard is applied to air travel. To support certification, standards are developed. Frequently the standards are developed through international cooperation and through consensus decision-making that includes many different organizations such as ANSPs, airlines, aircraft manufacturers, avionics suppliers, pilot associations, controller associations, and more. This is a slow process but an important one, since it
reduces development risk for avionics suppliers and helps ensure that equipment can be used worldwide.

Once new avionics or ground tools are available, it takes time for them to be deployed. For example, aircraft fleets are upgraded as aircraft come in for major maintenance rather than pulling them out of scheduled service. Flight crews need to be trained on new equipment before it can be used, and training takes time. Ground tools are typically deployed site by site, and the controllers also require training on new equipment and new procedures.

Promise for the Future

Despite the challenges and complexity of air traffic management, there is a path forward for significant improvement in both developed and developing air travel markets. Developing air travel markets in countries like China and India can improve air traffic management using procedures, tools, and technology that are already used in developed markets such as the USA and Europe. Emerging markets like China are willing to make significant investments in improving air traffic management by building new airports, expanding existing airports, changing controller procedures, and investing in controller tools. In developed markets, new procedures, tools, and technologies will need to be implemented. In some regions, mandates and financial incentives may play a part in enabling infrastructure and equipment changes that are not driven by the marketplace.

The USA and Europe are both supporting significant research, development, and implementation programs to support air traffic management modernization. In the USA, the FAA has a program known as NextGen, the Next Generation Air Transportation System. In Europe, the European Commission oversees a program known as SESAR, the Single European Sky Air Traffic Management Research, which is a joint effort between the European Union, EUROCONTROL, and industry partners. Both programs have substantial support and financing. Each program has organized its efforts differently, but there are many similarities in the operational objectives and improvements being developed.

Airport capacity problems are being addressed in multiple ways. Controllers are being provided with advanced surface movement guidance and control systems that combine radar surveillance, ADS-B surveillance, and sensors installed at the airport with value-added tools to assist with traffic control and alert controllers to potential collisions. Datalink communications between controllers and pilots will reduce radio-frequency congestion, reduce communication errors, and enable more complex communication. The USA and Europe have plans to develop a modernized datalink communication infrastructure between controllers and pilots that would include information like departure clearances and the taxiway route clearance. Aircraft on arrival to an airport will be controlled more precisely by equipping aircraft with capabilities such as the ability to fly to a required time of arrival and the ability to space with respect to another aircraft.

Domestic airspace congestion is being addressed in Europe by moving towards a single European sky, where the ANSPs for the individual nations coordinate activities and airspace is structured not as 27 national regions but operated as larger blocks. Similar efforts are underway in the USA to improve the cooperation and coordination between the individual airspace sectors. In some countries, large blocks of airspace are reserved for special use by the military. In those countries, efforts are in place to have dynamic special-use airspace that is reserved on an as-needed basis but otherwise available for civil use.

Oceanic airspace congestion is being addressed by leveraging the improved navigation performance of aircraft. Some route structures are available only to aircraft that can fly to a required navigation performance. These route structures have less required lateral separation, and thus more routes can be flown in the same airspace. Pilot tools that leverage ADS-B are allowing aircraft to make flight level changes with reduced separation and in the future
are expected to allow pilots to do additional maneuvering that is restricted today, such as passing slower aircraft.

Weather cannot be controlled, but efforts are underway to do better prediction and provide more accurate and timely information to pilots, controllers, and aircraft dispatchers at airlines. On-board radars that pilots use to divert around weather are adding more sophisticated processing algorithms to better differentiate hazardous weather. Future flight management systems will have the capability to include additional weather information. Datalinks between the air and the ground or between aircraft may be updated to include information from the on-board radar systems, allowing aircraft to act as local weather sensors. Improved weather information for pilots, controllers, and dispatchers improves flight planning and minimizes the necessary size of deviations around hazardous weather while retaining safety.

Weather is also addressed by providing aircraft and airports with equipment to improve airport access in reduced visibility. Ground-based augmentation systems installed at airports provide aircraft with the capability to do precision-based navigation for approaches to airports with low weather ceilings. Other technologies like enhanced vision and synthetic vision, which can be part of a combined vision system, provide the capability to land in poor visibility.

Summary

Air traffic management is a complex and interesting problem. The expected increase in air travel worldwide is driving a need for improvements to the existing system so that more passengers can be handled while at the same time reducing congestion and delays. Significant research and development efforts are underway worldwide to develop safe and effective solutions that include controller tools, pilot tools, aircraft avionics, infrastructure improvements, and new procedures. Despite the technical and financial challenges, many promising technologies and new procedures will be implemented in the near, mid-, and far term to support air traffic management modernization worldwide.

Cross-References

Aircraft Flight Control
Pilot-Vehicle System Modeling

Bibliography

Collinson R (2011) Introduction to avionics systems, 3rd edn. Springer, Dordrecht
http://www.faa.gov/nextgen/
http://www.sesarju.eu/
Nolan M (2011) Fundamentals of air traffic control, 5th edn. Cengage Learning, Clifton Park

Aircraft Flight Control

Dale Enns
Honeywell International Inc., Minneapolis, MN, USA

Abstract

Aircraft flight control is concerned with using the control surfaces to change aerodynamic moments, to change attitude angles of the aircraft relative to the air flow, and ultimately change the aerodynamic forces to allow the aircraft to achieve the desired maneuver or steady condition. Control laws create the commanded control surface positions based on pilot and sensor inputs. Traditional control laws employ proportional and integral compensation with scheduled gains, limiting elements, and cross feeds between coupled feedback loops. Dynamic inversion is an approach to develop control laws that systematically addresses the equivalent of gain schedules and the multivariable cross feeds, can incorporate constrained optimization for the limiting elements, and maintains the use of proportional and integral compensation to achieve the benefits of feedback.
Introduction

Flying is made possible by flight control, and this applies to birds and the Wright Flyer, as well as modern flight vehicles. In addition to balancing lift and weight forces, successful flight also requires a balance of moments or torques about the mass center. Control is a means to adjust these moments to stay in equilibrium and to perform maneuvers. While birds use their feathers and the Wright Flyer warped its wings, modern flight vehicles utilize hinged control surfaces to adjust the moments. The control action can be open or closed loop, where closed loop refers to a feedback loop consisting of sensors, computer, and actuation. A direct connection between the cockpit pilot controls and the control surfaces without a feedback loop is open loop control. The computer in the feedback loop implements a control law (computer program). The development of the control law is discussed in this entry.

Flight

Aircraft are maneuvered by changing the forces acting on the mass center, e.g., a steady level turn requires a steady force towards the direction of turn. The force is the aerodynamic lift force (L), and it is banked or rotated into the direction of the turn. The direction can be adjusted with the bank angle (φ), and for a given airspeed (V) and air density (ρ), the magnitude of the force can be adjusted with the angle-of-attack (α). This is called bank-to-turn. Aircraft, e.g., missiles, can also skid-to-turn, where the aerodynamic side force (Y) is adjusted with the sideslip angle (β), but this entry will focus on bank-to-turn.

Equations of motion (Enns et al. 1996; Stevens and Lewis 1992) can be used to relate the time rates of change of φ, α, and β to the roll (p), pitch (q), and yaw (r) rates. See Fig. 1. Approximate relations (for near steady level flight with no wind) are

Control

The control objectives are to provide stability, disturbance rejection, desensitization, and satisfactory steady-state and transient response to commands. Specifications and guidelines for these objectives are assessed quantitatively with frequency, time, and covariance analyses and simulations.

$$\dot{p} = L_p p + L_{\delta_a} \delta_a$$

where $L_p$ is the stability derivative and $L_{\delta_a}$ is the control derivative, both of which can be regarded as constants for a given airspeed and air density.

Pitch Axis or Short Period Example

Consider just the pitch and heave motion. The differential equations (Newton's 2nd law for the
Aircraft Flight Control, Fig. 2 Closed loop feedback system and desired dynamics
$$\frac{y}{y_c} = \frac{K_b \left(f_c s + f_i K_b\right)}{s^2 + K_b s + f_i K_b^2}$$

and the pilot produces the commands ($y_c$) with cockpit inceptors, e.g., sticks, pedals.

The control system robustness can be adjusted with the choices made for $y$, $K_b$, $f_i$, and $f_c$. These desired dynamics are utilized in all of the examples to follow. In the following, we use dynamic inversion (Enns et al. 1996; Wacker et al. 2001) to algebraically manipulate the equations of motion into the equivalent of the integrator for $y$ in Fig. 2.

We want to solve for $u$ to achieve a desired rate of change of $y$, so we start with

$$\dot{y} = CAx + CBu$$

If we can invert $CB$, i.e., it is not zero for the short period case, we solve for $u$ with

$$u = (CB)^{-1}\left(\dot{y}_{des} - CAx\right)$$

Implementation requires a measurement of the state, $x$, and models for the matrices $CA$ and $CB$.
The closed loop poles include the open loop zeros of the transfer function $y(s)/u(s)$ (zero dynamics) in addition to the roots of the desired dynamics characteristic equation. Closed loop stability requires stable zero dynamics. The zero dynamics have an impact on control system robustness and can influence the precise choice of $y$.

When $y = q$, the control law includes the following dynamic inversion equation

$$\delta_e = M_{\delta_e}^{-1}\left(\dot{q}_{des} - M_q q - M_\alpha \alpha\right)$$

and the open loop zero is $Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha$, which in almost every case of interest is a negative number.

Note that there are no restrictions on the open loop poles. This control law is effective and practical in stabilization of an aircraft with an open loop unstable short period mode.

Since $M_{\delta_e}$, $M_q$, and $M_\alpha$ vary with air density and airspeed, we are motivated to schedule these portions of the control law accordingly.

When $y = \alpha$, the zero dynamics are not suitable as closed loop poles. In this case, the pitch rate controller described above is the inner loop, and we apply dynamic inversion a second time as an outer loop (Enns and Keviczky 2006), where we approximate the angle-of-attack dynamics with the simplification that pitch rate has reached steady state, i.e., $\dot{q} = 0$, and regard pitch rate as the input ($u = q$) and angle-of-attack as the controlled variable ($y = \alpha$). The approximate equation of motion is

$$\dot{\alpha} = Z_\alpha \alpha + q - Z_{\delta_e} M_{\delta_e}^{-1}\left(M_\alpha \alpha + M_q q\right)
= \left(Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha\right)\alpha + \left(1 - Z_{\delta_e} M_{\delta_e}^{-1} M_q\right) q$$

This equation is inverted to give

$$q_c = \left(1 - Z_{\delta_e} M_{\delta_e}^{-1} M_q\right)^{-1}\left(\dot{\alpha}_{des} - \left(Z_\alpha - Z_{\delta_e} M_{\delta_e}^{-1} M_\alpha\right)\alpha\right)$$

$q_c$ obtained from this equation is passed to the inner loop as a command, i.e., $y_c$ of the inner loop.

Lateral-Directional Example

If we choose the two angular rates as the controlled variables ($p$, $r$), then the zero dynamics are favorable. We use the same proportional plus integral desired dynamics in Fig. 2, but there are two signals represented by each wire (one associated with $p$ and the other with $r$).

The same state space equations are used for the dynamic inversion step, but now $CA$ and $CB$ are $2 \times 4$ and $2 \times 2$ matrices, respectively, instead of scalars. The superscript in $u = (CB)^{-1}(\dot{y}_{des} - CAx)$ now means matrix inverse instead of reciprocal. The zero dynamics are assessed with the transmission zeros of the matrix transfer function $(p, r)/(\delta_a, \delta_r)$.

In the practical case where the aileron and rudder are limited, it is possible to place a higher priority on solving one equation vs. another if the equations are coupled, by proper allocation of the commands to the control surfaces, which is called control allocation (Enns 1998). In these cases, we use a constrained optimization approach

$$\min_{u_{min} \le u \le u_{max}} \left\| CBu - \left(\dot{y}_{des} - CAx\right) \right\|$$

instead of the matrix inverse followed by a limiter. In cases where there are redundant controls, i.e., the matrix $CB$ has more columns than rows, we introduce a preferred solution, $u_p$, and solve a different constrained optimization problem

$$\min_{CBu + CAx = \dot{y}_{des}} \left\| u - u_p \right\|$$

to find, among the solutions of the equations, the one closest to the preferred solution. We utilize weighted norms to accomplish the desired priority.

An outer loop to control the attitude angles ($\phi$, $\beta$) can be obtained with an approach analogous to the one used in the previous section.

Nonlinear Example

Dynamic inversion can be used directly with the nonlinear equations of motion (Enns et al. 1996; Wacker et al. 2001). General equations of motion, e.g., 6-degree-of-freedom rigid body, can be expressed with $\dot{x} = f(x, u)$ and the controlled
variable is given by $y = h(x)$. With the chain rule of calculus we obtain

$$\dot{y} = \frac{\partial h}{\partial x}(x)\, f(x, u)$$

and for a given $\dot{y} = \dot{y}_{des}$ and (measured) $x$ we can solve this equation for $u$ either directly or approximately. In practice, the first-order Taylor series approximation is effective

$$\dot{y} \cong a(x, u_0) + b(x, u_0)\left(u - u_0\right)$$

where $u_0$ is typically the past value of $u$ in a discrete implementation. As in the previous example, Fig. 2 can be used to obtain $\dot{y}_{des}$. The terms $a(x, u_0) - b(x, u_0)\, u_0$ and $b(x, u_0)$ are analogous to the terms $CAx$ and the matrix $CB$, respectively. Control allocation can be utilized in the same way as discussed above. The zero dynamics are evaluated with transmission zeros at the intended operating points. Outer loops can be employed in the same manner as discussed in the previous section.

The control law with this approach utilizes the equations of motion, which can include table lookup for aerodynamics, propulsion, mass properties, and reference geometry as appropriate. The raw aircraft data or an approximation to the data takes the place of gain schedules with this approach.

portions of the control law are scheduled. Aircraft do exhibit coupling between axes, and so multivariable feedback loop approaches are effective. Nonlinearities in the form of limits (noninvertible) and nonlinear expressions, e.g., trigonometric, polynomial, and table look-up (invertible), are present in flight control development. The dynamic inversion approach has been shown to include the traditional feedback control principles, systematically develops the equivalent of the gain schedules, applies to multivariable systems, applies to invertible nonlinearities, and can be used to avoid issues with noninvertible nonlinearities to the extent it is physically possible.

Future developments will include adaptation, reconfiguration, estimation, and nonlinear analyses. Adaptive control concepts will continue to mature and become integrated with approaches such as dynamic inversion to deal with unstructured or nonparameterized uncertainty or variations in the aircraft dynamics. Parameterized uncertainty will be incorporated with near-real-time reconfiguration of the aircraft model used as part of the control law, e.g., reallocation of control surfaces after an actuation failure. State variables used as measurements in the control law will be estimated as well as directly measured in nominal and sensor failure cases. Advances in nonlinear dynamical systems analyses will create improved intuition, understanding, and guidelines for control law development.
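The first-order Taylor series step of the Nonlinear Example admits a compact sketch in which finite differences stand in for the model terms $a(x, u_0)$ and $b(x, u_0)$, and the step is iterated so an invertible nonlinearity in $u$ is solved to high accuracy. The scalar dynamics below are invented for illustration, not taken from the entry.

```python
import numpy as np

def f(x, u):
    # Invented scalar dynamics with an invertible nonlinearity in u.
    return -np.sin(x) + 2.0 * np.tanh(u)

def invert_step(x, u0, ydot_des, eps=1e-6):
    """One approximate-inversion step: linearize f about (x, u0),
    ydot ~= a + b*(u - u0), and solve for u."""
    a = f(x, u0)                        # a(x, u0)
    b = (f(x, u0 + eps) - a) / eps      # finite-difference b(x, u0)
    return u0 + (ydot_des - a) / b

x, u = 0.3, 0.0       # measured state and past control value u0
ydot_des = 0.5        # desired rate from the Fig. 2 desired dynamics
for _ in range(5):    # iterate to handle the nonlinearity
    u = invert_step(x, u, ydot_des)

print(abs(f(x, u) - ydot_des) < 1e-8)   # achieved rate matches the desired rate
```

In a full implementation $u$ is vector valued, $b(x, u_0)$ is a matrix, and the division is replaced by a control allocation solve of the kind discussed in the Lateral-Directional Example.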
Bibliography

Enns DF et al (1996) Application of multivariable control theory to aircraft flight control laws, final report: multivariable control design guidelines. Technical report WL-TR-96-3099, Flight Dynamics Directorate, Wright-Patterson Air Force Base, OH 45433-7562, USA
Stevens BL, Lewis FL (1992) Aircraft control and simulation. Wiley, New York
Wacker R, Enns DF, Bugajski DJ, Munday S, Merkle S (2001) X-38 application of dynamic inversion flight control. In: Proceedings of the 24th annual AAS guidance and control conference, Breckenridge, 31 Jan–4 Feb 2001

Applications of Discrete-Event Systems

Spyros Reveliotis
School of Industrial & Systems Engineering, Georgia Institute of Technology, Atlanta, GA, USA

Abstract

This entry provides an overview of the problems addressed by discrete-event systems (DES) theory, with an emphasis on their connection to various application contexts. The primary intentions are to reveal the caliber and the strengths of this theory and to direct the interested reader, through the listed citations, to the corresponding literature. The concluding part of the entry also identifies some remaining challenges and further opportunities for the area.

Keywords

Applications; Discrete-event systems

Introduction

Discrete-event systems (DES) theory (Models for Discrete Event Systems: An Overview) (Cassandras and Lafortune 2008) emerged in the late 1970s/early 1980s from the effort of the controls community to address the control needs of applications concerning some complex production and service operations, like those taking place in manufacturing and other workflow systems, telecommunication and data-processing systems, and transportation systems. These operations were seeking the ability to support higher levels of efficiency and productivity and more demanding notions of quality of product and service. At the same time, the thriving computing technologies of the era, and in particular the emergence of the microprocessor, were cultivating, and to a significant extent supporting, visions of ever-increasing automation and autonomy for the aforementioned operations. The DES community set out to provide a systematic and rigorous understanding of the dynamics that drive the aforementioned operations and their complexity, and to develop a control paradigm that would define and enforce the target behaviors for those environments in an effective and robust manner.

In order to address the aforementioned objectives, the controls community had to extend its methodological base, borrowing concepts, models, and tools from other disciplines. Among these disciplines, the following two played a particularly central role in the development of DES theory: (i) Theoretical Computer Science (TCS) and (ii) Operations Research (OR). As a new research area, DES thrived on the analytical strength and the synergies that resulted from the rigorous integration of the modeling frameworks that were borrowed from TCS and OR. Furthermore, the DES community substantially extended those borrowed frameworks, bringing into them many of its control-theoretic perspectives and concepts.

In general, DES-based approaches are characterized by (i) their emphasis on a rigorous and formal representation of the investigated systems and the underlying dynamics; (ii) a double focus on time-related aspects and metrics that define traditional/standard notions of performance for the considered systems, but also on a more behaviorally oriented analysis that is necessary for ensuring fundamental notions of "correctness," "stability," and "safety" of the system operation,
especially in the context of the aspired levels of autonomy; (iii) the interplay between the two lines of analysis mentioned in item (ii) above and the further connection of this analysis to structural attributes of the underlying system; and (iv) an effort to complement the analytical characterizations and developments with design procedures and tools that will provide solutions provably consistent with the posed specifications and effectively implementable within the time and other resource constraints imposed by the "real-time" nature of the target applications.

The rest of this entry overviews the current achievements of DES theory with respect to (w.r.t.) the different classes of problems that have been addressed by it and highlights the potential that is defined by these achievements for a range of motivating applications. On the other hand, the constricted nature of this entry does not allow an expansive treatment of the aforementioned themes. Hence, the provided coverage is further supported and supplemented by an extensive list of references that will connect the interested reader to the relevant literature.

A Tour of DES Problems and Applications

DES-Based Behavioral Modeling, Analysis, and Control

The basic characterization of behavior in the DES-theoretic framework is through the various event sequences that can be generated by the underlying system. Collectively, these sequences are known as the (formal) language generated by the plant system, and the primary intention is to restrict the plant behavior within a subset of the generated event strings. The investigation of this problem is further facilitated by the introduction of certain mechanisms that act as formal representations of the studied systems, in the sense that they generate the same strings of events (i.e., the same formal language). Since these models are concerned with the representation of the event sequences that are generated by DES, and not with the exact timing of these events, they are frequently characterized as untimed DES models. In the practical applications of DES theory, the most popular such models are the Finite State Automaton (FSA) (Cassandras and Lafortune 2008; Hopcroft and Ullman 1979; Supervisory Control of Discrete-Event Systems; Diagnosis of Discrete Event Systems) and the Petri net (PN) (Cassandras and Lafortune 2008; Murata 1989; Modeling, Analysis, and Control with Petri Nets).

In the context of DES applications, these modeling frameworks have been used to provide succinct characterizations of the underlying event-driven dynamics and to design controllers, in the form of supervisors, that will restrict these dynamics so that they abide by safety, consistency, fairness, and other similar considerations (Supervisory Control of Discrete-Event Systems). As a more concrete example, in the context of contemporary manufacturing, DES-based behavioral control – frequently referred to as supervisory control (SC) – has been promoted as a systematic methodology for the synthesis and verification of the control logic that is necessary for the support of the so-called SCADA (Supervisory Control and Data Acquisition) function. This control function is typically implemented through the Programmable Logic Controllers (PLCs) that have been employed in contemporary manufacturing shop-floors, and DES SC theory can support it (i) by providing more rigor and specificity to the models that are employed for the underlying plant behavior and the imposed specifications and (ii) by offering the ability to synthesize control policies that are provably correct by construction. Some example works that have pursued the application of DES SC along these lines can be found in Balemi et al. (1993), Brandin (1996), Park et al. (1999), Chandra et al. (2003), Endsley et al. (2006), and Andersson et al. (2010).

On the other hand, the aforementioned activity has also defined a further need for pertinent interfaces that will translate (a) the plant structure and the target behavior to the necessary DES-theoretic models and (b) the obtained policies to PLC executables. This need has led to a line of research, in terms of representational models and
computational tools, that is complementary to the core DES developments described in the previous paragraphs. Indicatively, we mention the development of GRAFCET (David and Alla 1992) and of the sequential function charts (SFCs) (Lewis 1998) from the earlier times, while some more recent endeavors along these lines are reported in Wightkin et al. (2011) and Alenljung et al. (2012) and the references cited therein.

Besides its employment in the manufacturing domain, DES SC theory has also been considered for the coordination of the communicating processes that take place in various embedded systems (Feng et al. 2007); the systematic validation of the embedded software that is employed in various control applications, ranging from power systems and nuclear plants to aircraft and automotive electronics (Li and Kumar 2012); the synthesis of the control logic in the electronic switches that are utilized in telecom and data networks; and the modeling, analysis, and control of the operations that take place in health-care systems (Sampath et al. 2008). Wassyng et al. (2011) gives a very interesting account of the gains, but also the extensive challenges, experienced by a team of researchers who have tried to apply formal methods, similar to those that have been promoted by the behavioral DES theory, to the development and certification of the software that manages some safety-critical operations for Canadian nuclear plants.

Apart from control, untimed DES models have also been employed for the diagnosis of critical events, like certain failures, that cannot be observed explicitly, but whose occurrence can be inferred from some resultant behavioral patterns (Sampath et al. 1996; Diagnosis of Discrete Event Systems). More recently, the relevant methodology has been extended with prognostic capability (Kumar and Takai 2010), while an interesting variation of it addresses the "dual" problem that concerns the design of systems where certain events or behavioral patterns must remain undetectable by an external observer who has only partial observation of the system behavior; this last requirement has been formally characterized by the notion of "opacity" in the relevant literature, and it finds application in the design and operation of secure systems (Dubreil et al. 2010; Saboori and Hadjicostis 2012, 2014).

Dealing with the Underlying Computational Complexity

As revealed by the discussion of the previous paragraphs, many of the applications of DES SC theory concern the integration and coordination of behavior that is generated by a number of interacting components. In these cases, the formal models that are necessary for the description of the underlying plant behavior may grow in size very fast, and the algorithms that are involved in the behavioral analysis and control synthesis may become practically intractable. Nevertheless, the rigorous methodological base that underlies DES theory also provides a framework for addressing these computational challenges in an effective and structured manner.

More specifically, DES SC theory provides conditions under which the control specifications can be decomposed to the constituent plant components while maintaining the integrity and correctness of the overall plant behavior (Supervisory Control of Discrete-Event Systems; Wonham 2006). The aforementioned works of Brandin (1996) and Endsley et al. (2006) provide some concrete examples for the application of modular control synthesis. On the other hand, there are fundamental problems addressed by SC theory and practice that require a holistic view of the underlying plant and its operation, and thus, they are not amenable to modular solutions. For such cases, DES SC theory can still provide effective solutions through (i) the identification of special plant structure, of practical relevance, for which the target supervisors are implementable in a computationally efficient manner and (ii) the development of structured approaches that can systematically trade off the original specifications for computational tractability.

A particular application that has benefited from, and, at the same time, has significantly promoted this last capability of DES SC theory, is that concerning the deadlock-free operation of many systems where a set of processes that execute concurrently and in a staged manner are
42 Applications of Discrete-Event Systems
competing, at each of their processing stages, for the allocation of a finite set of reusable resources. In DES theory, this problem is known as the liveness-enforcing supervision of sequential resource allocation systems (RAS) (Reveliotis 2005), and it underlies the operation of many contemporary applications: from the resource allocation taking place in contemporary manufacturing shop floors (Ezpeleta et al. 1995; Reveliotis and Ferreira 1996; Jeng et al. 2002), to the traveling and/or workspace negotiation in robotic systems (Reveliotis and Roszkowska 2011), automated railway (Giua et al. 2006), and other guidepath-based traffic systems (Reveliotis 2000); to Internet-based workflow management systems like those envisioned for e-commerce and certain banking and insurance claim processing applications (Van der Aalst 1997); and to the allocation of the semaphores that control the accessibility of shared resources by concurrently executing threads in parallel computer programs (Liao et al. 2013). A systematic introduction to the DES-based modeling of RAS and their liveness-enforcing supervision is provided in Reveliotis (2005) and Zhou and Fanti (2004), while some more recent developments in the area are epitomized in Reveliotis (2007), Li et al. (2008), and Reveliotis and Nazeem (2013).

Closing the above discussion on the ability of DES theory to address effectively the complexity that underlies the DES SC problem, we should point out that the same merits of the theory have also enabled the effective management of the complexity that underlies problems related to the performance modeling and control of the various DES applications. We shall return to this capability in the next section, which discusses the achievements of DES theory in this domain.

DES Performance Control and the Interplay Among Structure, Behavior, and Performance

DES theory is also interested in the performance modeling, analysis, and control of its target applications w.r.t. time-related aspects like throughput, resource utilization, experienced latencies, and congestion patterns. To support this type of analysis, the untimed DES behavioral models are extended to their timed versions. This extension takes place by endowing the original untimed models with additional attributes that characterize the experienced delays between the activation of an event and its execution (provided that it is not preempted by some other conflicting event). Timed models are further classified by the extent and the nature of the randomness that is captured by them. A basic such categorization is between deterministic models, where the aforementioned delays take fixed values for every event, and stochastic models, which admit more general distributions. From an application standpoint, timed DES models connect DES theory to the multitude of applications that have been addressed by Dynamic Programming, Stochastic Control, and scheduling theory (Bertsekas 1995; Meyn 2008; Pinedo 2002). Also, in their most general definition, stochastic DES models provide the theoretical foundation of discrete-event simulation (Banks et al. 2009).

Similar to the case of behavioral DES theory, a practical concern that challenges the application of timed DES models for performance modeling, analysis, and control is the very large size of these models, even for fairly small systems. DES theory has tried to circumvent these computational challenges through the development of methodology that enables the assessment of the system performance, over a set of possible configurations, from the observation of its behavior and the resultant performance at a single configuration. The required observations can be obtained through simulation, and in many cases they can be collected from a single realization – or sample path – of the observed behavior; but then, the considered methods can also be applied on the actual system, and thus they become a tool for real-time optimization, adaptation, and learning. Collectively, the aforementioned methods define a "sensitivity"-based approach to DES performance modeling, analysis, and control (Cassandras and Lafortune 2008; Perturbation Analysis of Discrete Event Systems). Historically, DES sensitivity analysis originated in the early 1980s in an effort to address the performance
analysis and optimization of queueing systems w.r.t. certain structural parameters like the arrival and processing rates (Ho and Cao 1991). But the current theory addresses more general stochastic DES models that bring it closer to broader endeavors to support incremental optimization, approximation, and learning in the context of stochastic optimal control (Cao 2007). Some particular applications of DES sensitivity analysis for the performance optimization of production, telecom, and computing systems can be found in Cassandras and Strickland (1988), Cassandras (1994), Panayiotou and Cassandras (1999), Homem-de Mello et al. (1999), Fu and Xie (2002), and Santoso et al. (2005).

Another interesting development in time-based DES theory is the theory of (max,+) algebra (Baccelli et al. 1992). In its practical applications, this theory addresses the timed dynamics of systems that involve the synchronization of a number of concurrently executing processes with no conflicts among them, and it provides important structural results on the factors that determine the behavior of these systems in terms of the occurrence rates of various critical events and the experienced latencies among them. Motivational applications of (max,+) algebra can be traced in the design and control of telecommunication and data networks, manufacturing, and railway systems, and more recently the theory has found considerable practical application in the computation of repetitive/cyclical schedules that seek to optimize the throughput rate of automated robotic cells and of the cluster tools that are used in semiconductor manufacturing (Kim and Lee 2012; Lee 2008; Park et al. 1999).

Both the sensitivity-based methods and the theory of (max,+) algebra that were discussed in the previous paragraphs are enabled by the explicit, formal modeling of the DES structure and behavior in the pursued performance analysis and control. This integrative modeling capability that is supported by DES theory also enables a profound analysis of the impact of the imposed behavioral-control policies upon the system performance and, thus, the pursuance of a more integrative approach to the synthesis of the behavioral and the performance-oriented control policies that are necessary for any particular DES instantiation. This is a rather novel topic in the relevant DES literature, and some recent works in this direction can be found in Cao (2005), Li and Reveliotis (2013), Markovski and Su (2013), and David-Henriet et al. (2013).

The Roles of Abstraction and Fluidification
The notions of "abstraction" and "fluidification" play a significant role in mastering the complexity that arises in many DES applications. Furthermore, both of these concepts have an important role in defining the essence and the boundaries of DES-based modeling.

In general systems theory, abstraction can be broadly defined as the effort to develop simplified models for the considered dynamics that retain, however, adequate information to resolve the posed questions in an effective manner. In DES theory, abstraction has been pursued w.r.t. the modeling of both the timed and untimed behaviors, giving rise to hierarchical structures and models. A theory for hierarchical SC is presented in Wonham (2006), while some applications of hierarchical SC in the manufacturing domain are presented in Hill et al. (2010) and Schmidt (2012). In general, hierarchical SC relies on a "spatial" decomposition that tries to localize/encapsulate the plant behavior into a number of modules that interact through the communication structure that is defined by the hierarchy. On the other hand, when it comes to timed DES behavior and models, a popular approach seeks to define a hierarchical structure for the underlying decision-making process by taking advantage of the different time scales that correspond to the occurrence of the various event types. Some particular works that formalize and systematize this idea in the application context of production systems can be found in Gershwin (1994) and Sethi and Zhang (1994) and the references cited therein.

In fact, the DES models that have been employed in many application areas can be perceived themselves as abstractions of dynamics of a more continuous, time-driven nature, where the underlying plant undergoes some fundamental
(possibly structural) transition upon the occurrence of certain events that are defined either endogenously or exogenously w.r.t. these dynamics. The combined consideration of the discrete-event dynamics that are generated in the manner described above, with the continuous, time-driven dynamics that characterize the modalities of the underlying plant, has led to the extension of the original DES theory to the so-called hybrid systems theory. Hybrid systems theory is itself very rich, and it is covered in another section of this encyclopedia (see also Discrete Event Systems and Hybrid Systems, Connections Between). From an application standpoint, it increases substantially the relevance of the DES modeling framework and brings this framework to some new and exciting applications. Some of the most prominent applications concern the coordination of autonomous vehicles and robotic systems, and a nice anthology of works concerning the application of hybrid systems theory in this particular application area can be found in the IEEE Robotics and Automation magazine of September 2011. These works also reveal the strong affinity that exists between hybrid systems theory and the DES modeling paradigm. Along similar lines, hybrid systems theory also underlies the endeavors for the development of the Automated Highway Systems that have been explored for the support of future urban traffic needs (Horowitz and Varaiya 2000). Finally, hybrid systems theory and its DES component have been explored more recently as potential tools for the formal modeling and analysis of the molecular dynamics that are studied by systems biology (Curry 2012).

Fluidification, on the other hand, is the effort to represent as continuous flows dynamics that are essentially of discrete-event type, in order to alleviate the computational challenges that typically result from discreteness and its combinatorial nature. The resulting models serve as approximations of the original dynamics; frequently they have the formal structure of hybrid systems, and they define a basis for developing "relaxations" for the originally addressed problems. Usually, their justification is of an ad hoc nature, and the quality of the established approximations is empirically assessed on the basis of the delivered results (by comparing these results to some "baseline" performance). There are, however, a number of cases where the relaxed fluid model has been shown to retain important behavioral attributes of its original counterpart (Dai 1995). Furthermore, some recent works have investigated more analytically the impact of the approximation that is introduced by these models on the quality of the delivered results (Wardi and Cassandras 2013). Some more works regarding the application of fluidification in the DES-theoretic modeling frameworks, and the potential advantages that it brings in various application contexts, can be found in Srikant (2004), Meyn (2008), David and Alla (2005), and Cassandras and Yao (2013).

Summary and Future Directions

The discussion of the previous section has revealed the extensive application range and potential of DES theory and its ability to provide structured and rigorous solutions to complex and sometimes ill-defined problems. On the other hand, the same discussion has revealed the challenges that underlie many of the DES applications. The complexity that arises from the intricate and integrating nature of most DES models is perhaps the most prominent of these challenges. This complexity manifests itself in the involved computations, but also in the need for further infrastructure, in terms of modeling interfaces and computational tools, that will render DES theory more accessible to the practitioner.

The DES community is aware of this need, and the last few years have seen the development of a number of computational platforms that seek to implement and leverage the existing theory by connecting it to various application settings; indicatively, we mention DESUMA (Ricker et al. 2006), SUPREMICA (Akesson et al. 2006), and TCT (Feng and Wonham 2006), which support DES behavioral modeling, analysis, and control along the lines of DES SC theory, while the website entitled "The Petri Nets World" has an extensive database of tools that support modeling and
analysis through untimed and timed variations of the Petri net model. Model-checking tools, like SMV and NuSpin, that are used for verification purposes are also important enablers for the practical application of DES theory, and, of course, there are a number of programming languages and platforms, like Arena, AutoMod, and Simio, that support discrete-event simulation. However, with the exception of the discrete-event-simulation software, which is a pretty mature industry, the rest of the aforementioned endeavors currently evolve primarily within the academic and the broader research community. Hence, a remaining challenge for the DES community is the strengthening and expansion of the aforementioned computational platforms into robust and user-friendly computational tools. The availability of such industrial-strength computational tools, combined with the development of a body of control engineers well trained in DES theory, will be catalytic for bringing all the developments that were described in the earlier parts of this document even closer to industrial practice.

Cross-References

Diagnosis of Discrete Event Systems
Discrete Event Systems and Hybrid Systems, Connections Between
Modeling, Analysis, and Control with Petri Nets
Models for Discrete Event Systems: An Overview
Perturbation Analysis of Discrete Event Systems
Supervisory Control of Discrete-Event Systems

Bibliography

Akesson K, Fabian M, Flordal H, Malik R (2006) SUPREMICA – an integrated environment for verification, synthesis and simulation of discrete event systems. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 384–385
Alenljung T, Lennartson B, Hosseini MN (2012) Sensor graphs for discrete event modeling applied to formal verification of PLCs. IEEE Trans Control Syst Technol 20:1506–1521
Andersson K, Richardsson J, Lennartson B, Fabian M (2010) Coordination of operations by relation extraction for manufacturing cell controllers. IEEE Trans Control Syst Technol 18:414–429
Baccelli F, Cohen G, Olsder GJ, Quadrat JP (1992) Synchronization and linearity: an algebra for discrete event systems. Wiley, New York
Balemi S, Hoffmann GJ, Wong-Toi PG, Franklin GJ (1993) Supervisory control of a rapid thermal multiprocessor. IEEE Trans Autom Control 38:1040–1059
Banks J, Carson JS II, Nelson BL, Nicol DM (2009) Discrete-event system simulation, 5th edn. Prentice Hall, Upper Saddle River
Bertsekas DP (1995) Dynamic programming and optimal control, vols 1, 2. Athena Scientific, Belmont
Brandin B (1996) The real-time supervisory control of an experimental manufacturing cell. IEEE Trans Robot Autom 12:1–14
Cao X-R (2005) Basic ideas for event-based optimization of Markov systems. Discret Event Syst Theory Appl 15:169–197
Cao X-R (2007) Stochastic learning and optimization: a sensitivity approach. Springer, New York
Cassandras CG (1994) Perturbation analysis and "rapid learning" in the control of manufacturing systems. In: Leondes CT (ed) Dynamics of discrete event systems, vol 51. Academic, Boston, pp 243–284
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Cassandras CG, Strickland SG (1988) Perturbation analytic methodologies for design and optimization of communication networks. IEEE J Sel Areas Commun 6:158–171
Cassandras CG, Yao C (2013) Hybrid models for the control and optimization of manufacturing systems. In: Campos J, Seatzu C, Xie X (eds) Formal methods in manufacturing. CRC/Taylor and Francis, Boca Raton
Chandra V, Huang Z, Kumar R (2003) Automated control synthesis for an assembly line using discrete event system theory. IEEE Trans Syst Man Cybern Part C 33:284–289
Curry JER (2012) Some perspectives and challenges in the (discrete) control of cellular systems. In: Proceedings of the WODES 2012, Guadalajara. IFAC, pp 1–3
Dai JG (1995) On positive Harris recurrence of multiclass queueing networks: a unified approach via fluid limit models. Ann Appl Probab 5:49–77
David R, Alla H (1992) Petri nets and Grafcet: tools for modelling discrete event systems. Prentice-Hall, Upper Saddle River
David R, Alla H (2005) Discrete, continuous and hybrid Petri nets. Springer, Berlin
David-Henriet X, Hardouin L, Raisch J, Cottenceau B (2013) Optimal control for timed event graphs under partial synchronization. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Dubreil J, Darondeau P, Marchand H (2010) Supervisory control for opacity. IEEE Trans Autom Control 55:1089–1100
Endsley EW, Almeida EE, Tilbury DM (2006) Modular finite state machines: development and application to reconfigurable manufacturing cell controller generation. Control Eng Pract 14:1127–1142
Ezpeleta J, Colom JM, Martinez J (1995) A Petri net based deadlock prevention policy for flexible manufacturing systems. IEEE Trans Robot Autom 11:173–184
Feng L, Wonham WM (2006) TCT: a computation tool for supervisory control synthesis. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 388–389
Feng L, Wonham WM, Thiagarajan PS (2007) Designing communicating transaction processes by supervisory control theory. Formal Methods Syst Design 30:117–141
Fu M, Xie X (2002) Derivative estimation for buffer capacity of continuous transfer lines subject to operation-dependent failures. Discret Event Syst Theory Appl 12:447–469
Gershwin SB (1994) Manufacturing systems engineering. Prentice Hall, Englewood Cliffs
Giua A, Fanti MP, Seatzu C (2006) Monitor design for colored Petri nets: an application to deadlock prevention in railway networks. Control Eng Pract 10:1231–1247
Hill RC, Cury JER, de Queiroz MH, Tilbury DM, Lafortune S (2010) Multi-level hierarchical interface-based supervisory control. Automatica 46:1152–1164
Ho YC, Cao X-R (1991) Perturbation analysis of discrete event systems. Kluwer Academic, Boston
Homem-de Mello T, Shapiro A, Spearman ML (1999) Finding optimal material release times using simulation-based optimization. Manage Sci 45:86–102
Hopcroft JE, Ullman JD (1979) Introduction to automata theory, languages and computation. Addison-Wesley, Reading
Horowitz R, Varaiya P (2000) Control design of automated highway system. Proc IEEE 88:913–925
Jeng M, Xie X, Peng MY (2002) Process nets with resources for manufacturing modeling and their analysis. IEEE Trans Robot Autom 18:875–889
Kim J-H, Lee T-E (2012) Feedback control design for cluster tools with wafer residency time constraints. In: IEEE conference on systems, man and cybernetics, Seoul. IEEE, pp 3063–3068
Kumar R, Takai S (2010) Decentralized prognosis of failures in discrete event systems. IEEE Trans Autom Control 55:48–59
Lee T-E (2008) A review of cluster tool scheduling and control for semiconductor manufacturing. In: Proceedings of the winter simulation conference, Miami. INFORMS, pp 1–6
Lewis RW (1998) Programming industrial control systems using IEC 1131-3. Technical report, The Institution of Electrical Engineers
Li M, Kumar R (2012) Model-based automatic test generation for Simulink/Stateflow using extended finite automaton. In: Proceedings of the CASE, Seoul. IEEE
Li R, Reveliotis S (2013) Performance optimization for a class of generalized stochastic Petri nets. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Li Z, Zhou M, Wu N (2008) A survey and comparison of Petri net-based deadlock prevention policies for flexible manufacturing systems. IEEE Trans Syst Man Cybern Part C 38:173–188
Liao H, Wang Y, Cho HK, Stanley J, Kelly T, Lafortune S, Mahlke S, Reveliotis S (2013) Concurrency bugs in multithreaded software: modeling and analysis using Petri nets. Discret Event Syst Theory Appl 23:157–195
Markovski J, Su R (2013) Towards optimal supervisory controller synthesis of stochastic nondeterministic discrete event systems. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Meyn S (2008) Control techniques for complex networks. Cambridge University Press, Cambridge
Murata T (1989) Petri nets: properties, analysis and applications. Proc IEEE 77:541–580
Panayiotou CG, Cassandras CG (1999) Optimization of kanban-based manufacturing systems. Automatica 35:1521–1533
Park E, Tilbury DM, Khargonekar PP (1999) Modular logic controllers for machining systems: formal representations and performance analysis using Petri nets. IEEE Trans Robot Autom 15:1046–1061
Pinedo M (2002) Scheduling. Prentice Hall, Upper Saddle River
Reveliotis SA (2000) Conflict resolution in AGV systems. IIE Trans 32(7):647–659
Reveliotis SA (2005) Real-time management of resource allocation systems: a discrete event systems approach. Springer, New York
Reveliotis SA (2007) Algebraic deadlock avoidance policies for sequential resource allocation systems. In: Lahmar M (ed) Facility logistics: approaches and solutions to next generation challenges. Auerbach Publications, Boca Raton, pp 235–289
Reveliotis SA, Ferreira PM (1996) Deadlock avoidance policies for automated manufacturing cells. IEEE Trans Robot Autom 12:845–857
Reveliotis S, Nazeem A (2013) Deadlock avoidance policies for automated manufacturing systems using finite state automata. In: Campos J, Seatzu C, Xie X (eds) Formal methods in manufacturing. CRC/Taylor and Francis
Reveliotis S, Roszkowska E (2011) Conflict resolution in free-ranging multi-vehicle systems: a resource allocation paradigm. IEEE Trans Robot 27:283–296
Ricker L, Lafortune S, Gene S (2006) DESUMA: a tool integrating GIDDES and UMDES. In: Proceedings of the 8th international workshop on discrete event systems, Ann Arbor. IEEE, pp 392–393
Saboori A, Hadjicostis CN (2012) Opacity-enforcing supervisory strategies via state estimator constructions. IEEE Trans Autom Control 57:1155–1165
Saboori A, Hadjicostis CN (2014) Current-state opacity formulations in probabilistic finite automata. IEEE Trans Autom Control 59:120–133
Sampath M, Sengupta R, Lafortune S, Sinnamohideen K, Teneketzis D (1996) Failure diagnosis using discrete
event models. IEEE Trans Control Syst Technol 4:105–124
Sampath R, Darabi H, Buy U, Liu J (2008) Control reconfiguration of discrete event systems with dynamic control specifications. IEEE Trans Autom Sci Eng 5:84–100
Santoso T, Ahmed S, Goetschalckx M, Shapiro A (2005) A stochastic programming approach for supply chain network design under uncertainty. Eur J Oper Res 167:96–115
Schmidt K (2012) Computation of supervisors for reconfigurable machine tools. In: Proceedings of the WODES 2012, Guadalajara. IFAC, pp 227–232
Sethi SP, Zhang Q (1994) Hierarchical decision making in stochastic manufacturing systems. Birkhäuser, Boston
Srikant R (2004) The mathematics of internet congestion control. Birkhäuser, Boston
Van der Aalst W (1997) Verification of workflow nets. In: Azema P, Balbo G (eds) Lecture notes in computer science, vol 1248. Springer, New York, pp 407–426
Wardi Y, Cassandras CG (2013) Approximate IPA: trading unbiasedness for simplicity. In: Proceedings of the 52nd IEEE conference on decision and control, Florence. IEEE
Wassyng A, Lawford M, Maibaum T (2011) Software certification experience in the Canadian nuclear industry: lessons for the future. In: EMSOFT'11, Taipei
Wightkin N, Buy U, Darabi H (2011) Formal modeling of sequential function charts with time Petri nets. IEEE Trans Control Syst Technol 19:455–464
Wonham WM (2006) Supervisory control of discrete event systems. Technical report ECE 1636F/1637S 2006-07, Electrical & Computer Eng., University of Toronto
Zhou M, Fanti MP (eds) (2004) Deadlock resolution in computer-integrated systems. Marcel Dekker, Singapore

ATM Modernization

Air Traffic Management Modernization: Promise and Challenges

Auctions

Bruce Hajek
University of Illinois, Urbana, IL, USA

Abstract

Auctions are procedures for selling one or more items to one or more bidders. Auctions induce games among the bidders, so notions of equilibrium from game theory can be applied to auctions. Auction theory aims to characterize and compare the equilibrium outcomes for different types of auctions. Combinatorial auctions arise when multiple related items are sold simultaneously.

Keywords

Auction; Combinatorial auction; Game theory

Introduction

Three commonly used types of auctions for the sale of a single item are the following:
• First price auction: Each bidder submits a bid, one of the bidders submitting the maximum bid wins, and the payment for the item is the maximum bid. (In this context "wins" means receives the item, no matter what the payment.)
• Second price auction or Vickrey auction: Each bidder submits a bid, one of the bidders submitting the maximum bid wins, and the payment for the item is the second highest bid.
• English auction: The price for the item increases continuously or in some small increments, and bidders drop out at some points in time. Once all but one of the bidders has dropped out, the remaining bidder wins and the payment is the price at which the last of the other bidders dropped out.

A key goal of the theory of auctions is to predict how the bidders will bid, and predict the resulting outcomes of the auction: which bidder is the winner and what is the payment. For example, a seller may be interested in the expected payment (seller revenue). A seller may have the option to choose one auction format over another and be interested in revenue comparisons. Another item of interest is efficiency or social welfare. For sale of a single item, the outcome is efficient if the item is sold to the bidder with the highest value for the item. The book of V. Krishna (2002) provides an excellent introduction to the theory of auctions.
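The three single-item formats above can be made concrete with a small simulation. This is an illustrative sketch, not part of the entry: the tie-breaking rule (lowest-indexed highest bidder wins), the Uniform[0,1] value model, and the first-price bid function β(x) = x(n − 1)/n (the standard symmetric equilibrium for i.i.d. uniform values) are assumptions made for the example.

```python
import random

def first_price(bids):
    """Winner is a highest bidder; the payment equals the highest bid."""
    w = max(range(len(bids)), key=lambda i: bids[i])
    return w, bids[w]

def second_price(bids):
    """Winner is a highest bidder; the payment equals the second-highest bid (Vickrey)."""
    w = max(range(len(bids)), key=lambda i: bids[i])
    return w, max(bids[:w] + bids[w + 1:])

def expected_revenues(n=5, trials=50_000, seed=1):
    """Monte Carlo seller revenue for n bidders with i.i.d. Uniform[0,1] values:
    second-price bidders bid truthfully; first-price bidders use the symmetric
    equilibrium bid beta(x) = x*(n-1)/n (an assumption for this uniform model)."""
    rng = random.Random(seed)
    r_first = r_second = 0.0
    for _ in range(trials):
        vals = [rng.random() for _ in range(n)]
        r_first += first_price([v * (n - 1) / n for v in vals])[1]
        r_second += second_price(vals)[1]
    return r_first / trials, r_second / trials
```

Under these assumptions both formats yield expected revenue (n − 1)/(n + 1), a simple instance of revenue equivalence, and (as the entry notes for the English auction) the English format would reproduce the second-price outcome.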
Auctions Versus Seller Mechanisms the highest bid of the other bidders, the payoff
of bidder i is ui .xi yi / if he wins and ui .0/ if
An important class of mechanisms within the the- he does not win. Thus, bidder i would prefer to
ory of mechanism design are seller mechanisms, win whenever ui .xi yi / > ui .0/ and not win
which implement the sale of one or more items whenever ui .xi yi / < ui .0/: That is precisely
to one or more bidders. Some authors would what happens if bidder i bids xi ; no matter what
consider all such mechanisms to be auctions, but the bids of the other bidders are. That is, bidding
the definition of auctions is often more narrowly xi is a weakly dominant strategy for bidder i:
interpreted, with auctions being the subclass of Nash equilibrium can be found for the other
seller mechanisms which do not depend on the types of auctions under a model with incomplete
fine details of the set of bidders. The rules of the information, in which the type of each bidder i is
three types of auction mentioned above do not equal to the value of the object to the bidder and is
depend on fine details of the bidders, such as the modeled as a random variable Xi with a density
number of bidders or statistical information about function fi supported by some interval Œai ; bi :
how valuable the item is to particular bidders. In A simple case is that the bidders are all risk
contrast, designing a procedure to sell an item to neutral, the densities are all equal to some fixed
a known set of bidders under specific statistical density f; and the Xi ’s are mutually independent.
assumptions about the bidders’ preferences in The English auction in this context is equiva-
order to maximize the expected revenue (as in Myerson (1981)) would be considered a problem of mechanism design, which is outside the more narrowly defined scope of auctions. The narrower definition of auctions was championed by R. Wilson (1987). An article on Mechanism Design appears in this encyclopedia.

Equilibrium Strategies in Auctions

An auction induces a noncooperative game among the bidders, and a commonly used predictor of the outcome of the auction is an equilibrium of the game, such as a Nash or Bayes-Nash equilibrium. For a risk neutral bidder i with value xi for the item, if the bidder wins and the payment is Mi, the payoff of the bidder is xi − Mi. If the bidder does not win, the payoff of the bidder is zero. If, instead, the bidder is risk averse with risk aversion measured by an increasing utility function ui, the payoff of the bidder would be ui(xi − Mi) if the bidder wins and ui(0) if the bidder does not win.

The second price auction format is characterized by simplicity of the bidding strategies. If bidder i knows the value xi of the item to himself, then for the second price auction format, a weakly dominant strategy for the bidder is to truthfully report xi as his bid for the item. Indeed, if yi is the highest of the other bids, bidder i wins and pays yi exactly when xi exceeds yi, so no other bid can yield a higher payoff, whatever yi turns out to be. In this private values setting, the English auction is equivalent to the second price auction: in an English auction, dropping out when the price reaches his true value is a weakly dominant strategy for a bidder, and for the weakly dominant strategy equilibrium, the outcome of the auction is the same as for the second price auction. For the first price auction in this symmetric case, there exists a symmetric Bayesian equilibrium. It corresponds to all bidders using the bidding function β (so the bid of bidder i is β(Xi)), where β is given by β(x) = E[Y1 | Y1 < x]. The expected revenue to the seller in this case is E[Y1 | Y1 < X1], which is the same as the expected revenue for the second price auction and English auction.
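The revenue equivalence just stated can be checked numerically. The sketch below is illustrative and not from the article: it assumes n i.i.d. Uniform(0,1) private values, for which the equilibrium bid function specializes to β(x) = E[Y1 | Y1 < x] = (n−1)x/n.

```python
import random

def simulate_revenues(n_bidders=5, trials=200_000, seed=1):
    """Monte Carlo comparison of seller revenue in the second price and
    first price auctions, assuming i.i.d. Uniform(0,1) private values."""
    rng = random.Random(seed)
    # E[Y1 | Y1 < x] for the max of (n-1) uniforms conditioned below x
    beta = lambda x: (n_bidders - 1) / n_bidders * x
    rev_second, rev_first = 0.0, 0.0
    for _ in range(trials):
        vals = sorted(rng.random() for _ in range(n_bidders))
        rev_second += vals[-2]        # winner pays the second-highest value
        rev_first += beta(vals[-1])   # winner pays his own equilibrium bid
    return rev_second / trials, rev_first / trials

r2, r1 = simulate_revenues()
# both estimates approach (n-1)/(n+1), illustrating revenue equivalence
```

Both Monte Carlo estimates approach (n−1)/(n+1) = 2/3 for n = 5, in line with the equivalence of the three formats described above.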
Equilibrium for Auctions with Interdependent Valuations

Seminal work of Milgrom and Weber (1982) addresses the performance of the above three auction formats in case the bidders do not know the value of the item, but each bidder i has a private signal Xi about the value Vi of the item to bidder i. The values and signals (X1, ..., Xn, V1, ..., Vn) can be interdependent. Under the assumption of invariance of the joint distribution of (X1, ..., Xn, V1, ..., Vn) under permutation of the bidders and a strong form of positive correlation of the random variables (X1, ..., Xn, V1, ..., Vn) (see Milgrom and Weber 1982 or Krishna 2002 for details), a symmetric Bayes-Nash equilibrium is identified for each of the three auction formats mentioned above, and the expected revenues for the three auction formats are shown to satisfy the ordering R(first price) ≤ R(second price) ≤ R(English). A significant extension of the theory of Milgrom and Weber due to DeMarzo et al. (2005) is the theory of security-bid auctions in which bidders compete to buy an asset and the final payment is determined by a contract involving the value of the asset as revealed after the auction.

Combinatorial Auctions

Combinatorial auctions implement the simultaneous sale of multiple items. A simple version is the simultaneous ascending price auction with activity constraints (Cramton 2006; Milgrom 2004). Such an auction procedure was originally proposed by Preston McAfee, Paul Milgrom, and Robert Wilson for the US FCC wireless spectrum auction in 1994 and has been used for the vast majority of spectrum auctions worldwide since then (Cramton 2013). The auction proceeds in rounds. In each round a minimum price is set for each item, with the minimum prices for the initial round being reserve prices set by the seller. A given bidder may place a bid on an item in a given round such that the bid is greater than or equal to the minimum price for the item. If one or more bidders bid on an item in a round, a provisional winner of the item is selected from among the bidders with the highest bid for the item in the round, with the new provisional price being the highest bid. The minimum price for the item is then increased 10% (or some other small percentage) above the new provisional price. Once there is a round with no bids, the set of provisional winners is identified. Often constraints are placed on the bidders in the form of activity rules. An activity rule requires a bidder to maintain a history of bidding in order to continue bidding, so as to prevent bidders from not bidding in early rounds and bidding aggressively in later rounds. The motivation for activity rules is to promote price discovery to help bidders select the packages (or bundles) of items most suitable for them to buy. A key is that complementarities may exist among the items for a given bidder. Complementarity means that a bidder may place a significantly higher value on a bundle of items than the sum of values the bidder would place on the items individually. Complementarities lead to the exposure problem, which occurs when a bidder wins only a subset of items of a desired bundle at a price significantly higher than the value of that subset to the bidder. For example, a customer might place a high value on a particular pair of shoes, but little value on a single shoe alone.

A variation of simultaneous ascending price auctions for combinatorial auctions is auctions with package bidding (see, e.g., Ausubel and Milgrom 2002; Cramton 2013). A bidder will either win a package of items he bid for or no items, thereby eliminating the exposure problem. For example, in simultaneous clock auctions with package bidding, the price for each item increases according to a fixed schedule (the clock), and bidders report the packages of items they would prefer to purchase for the given prices. The price for a given item stops increasing when the number of bidders for that item drops to zero or one, and the clock phase of the auction is complete when the number of bidders for every item is zero or one. Following the clock phase, bidders can submit additional bids for packages of items. With the inputs from bidders acquired during the clock phase and supplemental bid phase, the auctioneer then runs a winner determination algorithm to select a set of bids for non-overlapping packages that maximizes the sum of the bids. This winner determination problem is NP hard, but is computationally feasible using integer programming or dynamic programming methods for moderate numbers of items (perhaps up to 30). In addition, the vector of payments charged to the winners is determined by a two-step process. First, the (generalized) Vickrey price for each bidder is determined, which is defined to be the minimum the bidder would have had to bid in order to be a winner. Secondly, the vector of Vickrey prices is projected onto the core of the reported prices. The second step ensures that no coalition of bidders, together with the seller, would prefer an alternative outcome at the reported prices.
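The winner determination step can be made concrete with a small sketch. The bid data below are hypothetical, and the exhaustive search stands in for the integer programming methods mentioned above:

```python
from itertools import combinations

def winner_determination(bids):
    """Exhaustive winner determination: choose a set of package bids on
    pairwise-disjoint item sets maximizing the total bid. Each bid is an
    (items, amount) pair. Brute force for illustration only; practical
    auctions use integer programming as noted in the text."""
    best_value, best_sel = 0.0, ()
    for r in range(1, len(bids) + 1):
        for sel in combinations(range(len(bids)), r):
            items = [it for i in sel for it in bids[i][0]]
            if len(items) == len(set(items)):        # packages must not overlap
                value = sum(bids[i][1] for i in sel)
                if value > best_value:
                    best_value, best_sel = value, sel
    return best_value, best_sel

# Complementarity: the package bid on {A, B} exceeds the sum of the best
# single-item bids, so it wins together with the bid on {C}.
bids = [({'A', 'B'}, 10.0), ({'A'}, 4.0), ({'B'}, 3.0), ({'C'}, 2.0)]
value, winners = winner_determination(bids)   # selects bids 0 and 3, value 12.0
```

The exponential search over bid subsets is exactly why the problem is NP hard; integer programming solvers exploit structure to handle realistic instances.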
Bibliography

Ausubel LM, Milgrom PR (2002) Ascending auctions with package bidding. BE J Theor Econ 1(1):Article 1
Cramton P (2006) Simultaneous ascending auctions. In: Cramton P, Shoham Y, Steinberg R (eds) Combinatorial auctions, chapter 4. MIT, Cambridge, pp 99–114
Cramton P (2013) Spectrum auction design. Rev Ind Organ 42(4):161–190
DeMarzo PM, Kremer I, Skrzypacz A (2005) Bidding with securities: auctions and security design. Am Econ Rev 95(4):936–959
Krishna V (2002) Auction theory. Academic, San Diego
Milgrom PR (2004) Putting auction theory to work. Cambridge University Press, Cambridge/New York
Milgrom PR, Weber RJ (1982) A theory of auctions and competitive bidding. Econometrica 50(5):1089–1122
Myerson R (1981) Optimal auction design. Math Oper Res 6(1):58–73
Wilson R (1987) Game theoretic analysis of trading processes. In: Bewley T (ed) Advances in economic theory. Cambridge University Press, Cambridge/New York

Autotuning

Background

In the late 1970s and early 1980s, there was a quite rapid change of controller implementation in process control. The analog controllers were replaced by computer-based controllers and distributed control systems. The functionality of the new controllers was often more or less a copy of the old analog equipment, but new functions that utilized the computer implementation were gradually introduced. One of the first functions of this type was autotuning. Autotuning is a method to tune the controllers, normally PID controllers, automatically.

What Is Autotuning?

A PID controller in its basic form has the structure
u(t) = K [ e(t) + (1/Ti) ∫_0^t e(τ) dτ + Td de(t)/dt ],

where u is the controller output and e = ysp − y is the control error, where ysp is the setpoint and y is the process output. There are three parameters in the controller, gain K, integral time Ti, and derivative time Td. These parameters have to be set by the user. Their values are dependent on the process dynamics and the specifications of the control loop.

A process control plant may have thousands of control loops, which means that maintaining high-performance controller tuning can be very time consuming. This was the main reason why procedures for automatic tuning were installed so rapidly in the computer-based controllers.

When a controller is to be tuned, the following steps are normally performed by the user:
1. To determine the process dynamics, a minor disturbance is injected by changing the control signal.
2. By studying the response in the process output, the process dynamics can be determined, i.e., a process model is derived.
3. The controller parameters are finally determined based on the process model and the specifications.

Autotuning means simply that these three steps are performed automatically. Instead of having a human perform these tasks, they are performed automatically on demand from the user. Ideally, the autotuning should be fully automatic, which means that no information about the process dynamics is required from the user.

Automatic tuning can be performed in many ways. The process disturbance can take different forms, e.g., in the form of step changes or some kind of oscillatory excitation. The model obtained can be more or less accurate. There are also many ways to tune the controller based on the process model. Here, we will discuss two main approaches for autotuning, namely, those that are based on step response analysis and those that are based on frequency response analysis.

Methods Based on Step Response Analysis

Most methods for automatic tuning of PID controllers are based on step response analysis. When the operator wishes to tune the controller, an open-loop step response experiment is performed. A process model is then obtained from the step response, and controller parameters are determined. This is usually done using simple formulas or look-up tables.

The most common process model used for PID controller tuning based on step response experiments is the first-order plus dead-time model

G(s) = Kp/(1 + sT) e^(−sL),

where Kp is the static gain, T is the apparent time constant, and L is the apparent dead time. These three parameters can be obtained from a step response experiment according to Fig. 1. Static gain Kp is given by the ratio between the steady-state change in process output and the magnitude of the control signal step, Kp = Δy/Δu. Dead-time L is determined from the time elapsed from the step change to the intersection of the largest slope of the process output with the level of the process output before the step change. Finally, time constant T is the time when the process output has reached 63% of its final value, subtracted by L.

Autotuning, Fig. 1 Determination of Kp, L, and T from a step response experiment
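The graphical construction of Fig. 1 can be mechanized. The sketch below is an illustration with an assumed synthetic process (G(s) = 2/(1+5s) e^(−2s), not from the article); it recovers Kp from the steady-state gain, L from the maximum-slope tangent, and T from the 63% rise time:

```python
import math

def fopdt_from_step(t, y, du):
    """Estimate (Kp, L, T) of a first-order-plus-dead-time model from
    sampled step-response data, following the construction of Fig. 1."""
    dy = y[-1] - y[0]
    kp = dy / du                                    # static gain Kp = dy/du
    # steepest slope and the intersection of its tangent with the initial level
    slopes = [(y[i + 1] - y[i]) / (t[i + 1] - t[i]) for i in range(len(y) - 1)]
    i = max(range(len(slopes)), key=slopes.__getitem__)
    L = t[i] - (y[i] - y[0]) / slopes[i]
    # time to reach 63% of the final change, minus the dead time
    target = y[0] + 0.63 * dy
    t63 = next(t[j] for j in range(len(y)) if y[j] >= target)
    T = t63 - L
    return kp, L, T

# synthetic unit-step response of the assumed process above
dt, du, Kp_true, L_true, T_true = 0.01, 1.0, 2.0, 2.0, 5.0
t = [k * dt for k in range(3001)]
y = [0.0 if tk < L_true else Kp_true * du * (1 - math.exp(-(tk - L_true) / T_true))
     for tk in t]
kp, L, T = fopdt_from_step(t, y, du)   # close to (2.0, 2.0, 5.0)
```

On noisy plant data the maximum-slope estimate would need filtering, which is one reason industrial autotuners favor simple, robust variants of this construction.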
Methods Based on Frequency Response Analysis

Frequency-domain characteristics of the process can be obtained by adding sinusoidals to the control signal, but without knowing the frequency response of the process, the interesting frequency range and acceptable amplitudes are not known. A method that automatically provides a relevant frequency response can be determined from experiments with relay feedback according to Fig. 2. Notice that there is a switch that selects either relay feedback or ordinary PID feedback. When it is desired to tune the system, the PID function is disconnected and the system is connected to relay feedback control. Relay feedback control is the same as on/off control, but where the on and off levels are carefully chosen and not 0 and 100%. The relay feedback makes the control loop oscillate. The period and the amplitude of the oscillation are determined when steady-state oscillation is obtained. This gives the ultimate period and the ultimate gain. The parameters of a PID controller can then be determined from these values. The PID controller is then automatically switched in again, and the control is executed with the new PID parameters.

For large classes of processes, relay feedback gives an oscillation close to the ultimate frequency ωu, as shown in Fig. 3, where the control signal is a square wave and the process output is close to a sinusoid. The gain of the transfer function at this frequency is also easy to obtain from amplitude measurements.

The oscillation can be predicted by describing function analysis. For a relay with hysteresis, the describing function is

N(a) = (4d/(πa²)) (√(a² − ε²) − iε),

where d is the relay amplitude, ε the relay hysteresis, and a the amplitude of the input signal. The negative inverse of this describing function is a straight line parallel to the real axis; see Fig. 4. The oscillation corresponds to the point where the negative inverse describing function crosses the Nyquist curve of the process, i.e., where

G(iω) = −1/N(a).

Since N(a) is known, G(iω) can be determined from the amplitude a and the frequency ω of the oscillation.

Notice that the relay experiment is easily automated. There is often an initialization phase where the noise level in the process output is determined during a short period of time. The noise level is used to determine the relay hysteresis and a desired oscillation amplitude in the process output. After this initialization phase, the relay function is introduced. Since the amplitude of the oscillation is proportional to the relay output, it is easy to control it by adjusting the relay output.

Different Adaptive Techniques

In the late 1970s, at the same time as autotuning procedures were developed and implemented in industrial controllers, there was a large academic interest in adaptive control. These two concepts
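A minimal sketch of the final tuning step: given the measured oscillation amplitude a and ultimate period Tu from the relay experiment, the ultimate gain follows from the describing function of an ideal (hysteresis-free) relay, and PID parameters can then be computed from, for example, the Ziegler-Nichols ultimate-sensitivity rules. The article does not prescribe a particular rule; Ziegler-Nichols is used here as one common choice, and the numbers are assumed:

```python
import math

def pid_from_relay(d, a, Tu):
    """Ultimate gain from an ideal-relay experiment, Ku = 4d/(pi*a),
    followed by the classical Ziegler-Nichols PID settings."""
    Ku = 4.0 * d / (math.pi * a)      # |G(i*wu)| = 1/Ku at the oscillation
    K, Ti, Td = 0.6 * Ku, Tu / 2.0, Tu / 8.0
    return K, Ti, Td

# e.g. relay amplitude 1, measured amplitude 0.2, ultimate period 4 s
K, Ti, Td = pid_from_relay(d=1.0, a=0.2, Tu=4.0)
```

In practice the relay amplitude d is chosen during the initialization phase so that the oscillation amplitude a stays comfortably above the measured noise level.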
Autotuning, Fig. 3 Process output y and control signal u during relay feedback
Bibliography

Åström KJ, Hägglund T (1995) PID controllers: theory, design, and tuning. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Åström KJ, Hägglund T (2005) Advanced PID control. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Vilanova R, Visioli A (eds) (2012) PID control in the third millennium. Springer, Dordrecht
Visioli A (2006) Practical PID control. Springer, London
Yu C-C (2006) Autotuning of PID controllers – a relay feedback approach. Springer, London

Averaging Algorithms and Consensus

Keywords

Cooperative control; Coordination; Distributed control; Multi-agent systems; Networked systems

Introduction

In the area of control of networked systems, low cost, high adaptivity and scalability, great robustness, and easy maintenance are critical factors.
To achieve these factors, distributed coordination and control algorithms that rely on only local interaction between neighboring agents to achieve collective group behavior are more favorable than centralized ones. In this article, we overview averaging algorithms and consensus in the context of distributed coordination and control of networked systems.

Distributed consensus means that a team of agents reaches an agreement on certain variables of interest by interacting with their neighbors. A consensus algorithm is an update law that drives the variables of interest of all agents in the network to converge to a common value (Jadbabaie et al. 2003; Olfati-Saber et al. 2007; Ren and Beard 2008). Examples of the variables of interest include a local representation of the center and shape of a formation, the rendezvous time, the length of a perimeter being monitored, the direction of motion for a multi-vehicle swarm, and the probability that a target has been identified. Consensus algorithms have applications in rendezvous, formation control, flocking, attitude alignment, and sensor networks (Bai et al. 2011a; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Qu 2009; Ren and Cao 2011). Distributed averaging algorithms aim at computing the average of certain variables of interest among multiple agents by local communication. Distributed averaging finds applications in distributed computing, distributed signal processing, and distributed optimization (Tsitsiklis et al. 1986). Hence the variables of interest are dependent on the applications (e.g., a sensor measurement or a network quantity). Consensus and averaging algorithms are closely connected and yet nonidentical. When all agents are able to compute the average, they essentially reach a consensus, the so-called average consensus. On the other hand, when the agents reach a consensus, the consensus value might or might not be the average value.

Graph Theory Notations. Suppose that there are n agents in a network. A network topology (equivalently, graph) G consisting of a node set V = {1, ..., n} and an edge set E ⊆ V × V will be used to model interaction (communication or sensing) between the n agents. An edge (i, j) in a directed graph denotes that agent j can obtain information from agent i, but not necessarily vice versa. In contrast, an edge (i, j) in an undirected graph denotes that agents i and j can obtain information from each other. Agent j is an (in-)neighbor of agent i if (j, i) ∈ E. Let Ni denote the neighbor set of agent i. We assume that i ∈ Ni. A directed path is a sequence of edges in a directed graph of the form (i1, i2), (i2, i3), ..., where ij ∈ V. An undirected path in an undirected graph is defined analogously. A directed graph is strongly connected if there is a directed path from every agent to every other agent. An undirected graph is connected if there is an undirected path between every pair of distinct agents. A directed graph has a directed spanning tree if there exists at least one agent that has directed paths to all other agents. For example, Fig. 1 shows a directed graph that has a directed spanning tree but is not strongly connected. The adjacency matrix A = [aij] ∈ R^(n×n) associated with G is defined such that aij (the weight of edge (j, i)) is positive if agent j is a neighbor of agent i while aij = 0 otherwise. The (nonsymmetric) Laplacian matrix (Agaev and Chebotarev 2005) L = [ℓij] ∈ R^(n×n) associated with A and hence G is defined as ℓii = Σ_{j≠i} aij and ℓij = −aij for all i ≠ j. For an undirected graph, we assume that aij = aji. A graph is balanced if for every agent the total edge weight of its incoming links equals the total edge weight of its outgoing links (Σ_{j=1}^n aij = Σ_{j=1}^n aji for all i).

Averaging Algorithms and Consensus, Fig. 1 A directed graph that characterizes the interaction among five agents, where Ai, i = 1, ..., 5, denotes agent i. An arrow from agent j to agent i indicates that agent i receives information from agent j. The directed graph has a directed spanning tree but is not strongly connected. Here both agents 1 and 2 have directed paths to all other agents
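The definitions above can be made concrete. The sketch below builds the nonsymmetric Laplacian from weighted directed edges and checks the directed-spanning-tree property by reachability. The 5-agent edge list is a hypothetical example in the spirit of Fig. 1 (agents 0 and 1 can reach everyone), not the figure's exact graph:

```python
def laplacian(n, edges):
    """Build the (nonsymmetric) Laplacian L = [l_ij] from weighted directed
    edges (j, i, a_ij), meaning agent i receives from agent j with weight
    a_ij: l_ii = sum_j a_ij and l_ij = -a_ij for j != i."""
    L = [[0.0] * n for _ in range(n)]
    for j, i, a in edges:
        L[i][j] -= a
        L[i][i] += a
    return L

def has_directed_spanning_tree(n, edges):
    """True if some agent has directed paths to all other agents."""
    out = {k: [] for k in range(n)}
    for j, i, _ in edges:
        out[j].append(i)                        # information flows j -> i
    def reach(r):
        seen, stack = {r}, [r]
        while stack:
            for v in out[stack.pop()]:
                if v not in seen:
                    seen.add(v); stack.append(v)
        return len(seen) == n
    return any(reach(r) for r in range(n))

# hypothetical graph: 0 <-> 1, 0 -> 2, 1 -> 3, 3 -> 4
edges = [(0, 1, 1.0), (1, 0, 1.0), (0, 2, 1.0), (1, 3, 1.0), (3, 4, 1.0)]
L = laplacian(5, edges)
ok = has_directed_spanning_tree(5, edges)             # True
rows_zero = all(abs(sum(row)) < 1e-12 for row in L)   # every row of L sums to 0
```

The zero row sums confirm the defining property that 1 is a right eigenvector of L for the eigenvalue zero.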
Consensus has a long history in management science, statistical physics, and distributed computing and finds recent interest in distributed control. While in the area of distributed control of networked systems the term consensus initially referred more or less exclusively to a continuous-time version of a distributed linear averaging algorithm, the term has since been broadened to a great extent. Problems related to consensus include synchronization, agreement, and rendezvous. The study of consensus can be categorized in various manners. For example, in terms of the final consensus value, the agents could reach a consensus on the average, a weighted average, the maximum value, the minimum value, or a general function of their initial conditions, or even a (changing) state that serves as a reference. A consensus algorithm could be linear or nonlinear. Consensus algorithms can be designed for agents with linear or nonlinear dynamics. As the agent dynamics become more complicated, so do the algorithm design and analysis. Numerous issues are also involved in consensus such as network topologies (fixed vs. switching, deterministic vs. random, directed vs. undirected, asynchronous vs. synchronous), time delay, quantization, optimality, sampling effects, and convergence speed. For example, in real applications, due to nonuniform communication/sensing ranges or limited field of view of sensors, the network topology could be directed rather than undirected. Also, due to unreliable communication/sensing and limited communication/sensing ranges, the network topology could be switching rather than fixed.

Consensus for Agents with Single-Integrator Dynamics

We start with a fundamental consensus algorithm for agents with single-integrator dynamics. The results in this section follow from Jadbabaie et al. (2003), Olfati-Saber et al. (2007), Ren and Beard (2008), Moreau (2005), and Agaev and Chebotarev (2000). Consider agents with single-integrator dynamics

ẋi(t) = ui(t), i = 1, ..., n, (1)

where xi is the state and ui is the control input. A common consensus algorithm for (1) is

ui(t) = Σ_{j∈Ni(t)} aij(t)[xj(t) − xi(t)], (2)

where Ni(t) is the neighbor set of agent i at time t and aij(t) is the (i, j) entry of the adjacency matrix A of the graph G at time t. A consequence of (2) is that the state xi(t) of agent i is driven toward the states of its neighbors or equivalently toward the weighted average of its neighbors' states. The closed-loop system of (1) using (2) can be written in matrix form as

ẋ(t) = −L(t)x(t), (3)

where x is a column stack vector of all xi and L is the Laplacian matrix. Consensus is reached if for all initial states, the agents' states eventually become identical. That is, for all xi(0), ‖xi(t) − xj(t)‖ approaches zero eventually.

The properties of the Laplacian matrix L play an important role in the analysis of the closed-loop system (3). When the graph G (and hence the associated Laplacian matrix L) is fixed, (3) can be analyzed by studying the eigenvalues and eigenvectors of L. Due to its special structure, for any graph G, the associated Laplacian matrix L has at least one zero eigenvalue with an associated right eigenvector 1 (column vector of all ones) and all other eigenvalues have positive real parts. To ensure consensus, it is equivalent to ensure that L has a simple zero eigenvalue. It can be shown that the following three statements are equivalent: (i) the agents reach a consensus exponentially for arbitrary initial states; (ii) the graph G has a directed spanning tree; and (iii) the Laplacian matrix L has a simple zero eigenvalue with an associated right eigenvector 1 and all other eigenvalues have positive real parts. When consensus is reached, the final consensus value is a weighted average of the initial states of those agents that have directed paths to all other agents (see Fig. 2 for an illustration).
58 Averaging Algorithms and Consensus
Averaging Algorithms 10
and Consensus, Fig. 2
Consensus for five agents 9 i=1
using the algorithm (2) for i=2
(1). Here the graph G is 8 i=3
given by Fig. 1. The initial i=4
states are chosen as i=5
7
xi .0/ D 2i , where
i D 1; : : : ; 5. Consensus is
xi
6
reached as G has a directed
spanning tree. The final
5
consensus value is a
weighted average of the
4
initial states of agents 1
and 2
3
2
0 1 2 3 4 5 6 7 8
time(s)
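A minimal simulation of the closed loop (3), using forward-Euler integration and an assumed Laplacian whose graph has a directed spanning tree rooted at the first two agents (a hypothetical graph analogous to Fig. 1, with 0-based agent indices):

```python
def simulate_consensus(L, x0, dt=0.01, steps=5000):
    """Forward-Euler integration of xdot = -L x, the closed loop (1)+(2).
    A sketch for illustration, not a production integrator."""
    n, x = len(x0), list(x0)
    for _ in range(steps):
        x = [x[i] - dt * sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
    return x

# assumed Laplacian: agents 0 and 1 listen to each other; 2 follows 0,
# 3 follows 1, 4 follows 3 (rows sum to zero, off-diagonals nonpositive)
L = [[ 1.0, -1.0,  0.0,  0.0, 0.0],
     [-1.0,  1.0,  0.0,  0.0, 0.0],
     [-1.0,  0.0,  1.0,  0.0, 0.0],
     [ 0.0, -1.0,  0.0,  1.0, 0.0],
     [ 0.0,  0.0,  0.0, -1.0, 1.0]]
x = simulate_consensus(L, [2.0, 4.0, 6.0, 8.0, 10.0])
spread = max(x) - min(x)   # -> 0: consensus reached
# final value is the average of the first two initial states, i.e. 3.0,
# since only those agents have directed paths to all others
```

The final value depends only on the initial states of the rooted agents, matching the weighted-average statement above.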
When the graph G(t) is switching at time instants t0, t1, ..., the solution to the closed-loop system (3) is given by x(t) = Φ(t, 0)x(0), where Φ(t, 0) is the transition matrix corresponding to −L(t). Consensus is reached if Φ(t, 0) eventually converges to a matrix with identical rows. Here Φ(t, 0) = Φ(t, tk)Φ(tk, tk−1) ··· Φ(t1, 0), where Φ(tk, tk−1) is the transition matrix corresponding to −L(t) on the time interval [tk−1, tk]. It turns out that each transition matrix is a row-stochastic matrix with positive diagonal entries. A square matrix is row stochastic if all its entries are nonnegative and all of its row sums are one. The consensus convergence can be analyzed by studying the product of row-stochastic matrices. Another analysis technique is a Lyapunov approach (e.g., max_i xi − min_i xi). It can be shown that the agents' states reach a consensus if there exists an infinite sequence of contiguous, uniformly bounded time intervals, with the property that across each such interval, the union of the graphs G(t) has a directed spanning tree. That is, across each such interval, there exists at least one agent that can directly or indirectly influence all other agents.

It is also possible to achieve certain nice features by designing nonlinear consensus algorithms of the form ui(t) = Σ_{j∈Ni(t)} aij(t) φ[xj(t) − xi(t)], where φ(·) is a nonlinear function satisfying certain properties. One example is a continuous nondecreasing odd function. For example, a saturation type function could be introduced to account for actuator saturation and a signum type function could be introduced to achieve finite-time convergence.

As shown above, for single-integrator dynamics, the consensus convergence is determined entirely by the network topologies. The primary reason is that the single-integrator dynamics are internally stable. However, when more complicated agent dynamics are involved, the consensus algorithm design and analysis become more complicated. On one hand, whether the graph is undirected (respectively, switching) or not has significant influence on the complexity of the consensus analysis. On the other hand, not only the network topology but also the agent dynamics themselves and the parameters in the consensus algorithm play important roles. Next we introduce consensus for agents with general linear and nonlinear dynamics.

Consensus for Agents with General Linear Dynamics

In some circumstances, it is relevant to deal with agents with general linear dynamics, which can also be regarded as linearized models of certain nonlinear dynamics. The results in this section follow from Li et al. (2010). Consider agents with general linear dynamics

ẋi = Axi + Bui,  yi = Cxi, (4)
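The switching-topology condition above (a jointly rooted union graph across bounded intervals) can be illustrated in discrete time with products of row-stochastic matrices. The two 3-agent weight matrices below are an assumed example: neither graph alone has a directed spanning tree, but their union does, rooted at agent 1:

```python
def step(A, x):
    """One iteration x[k+1] = A[k] x[k] of a discrete-time consensus
    update with a row-stochastic weight matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# row-stochastic matrices with positive diagonals (assumed example)
A1 = [[0.5, 0.5, 0.0],   # agent 0 listens to agent 1
      [0.0, 1.0, 0.0],
      [0.0, 0.0, 1.0]]
A2 = [[1.0, 0.0, 0.0],
      [0.0, 1.0, 0.0],
      [0.0, 0.5, 0.5]]   # agent 2 listens to agent 1
x = [0.0, 6.0, 12.0]
for k in range(200):                  # switch the topology at every step
    x = step(A1 if k % 2 == 0 else A2, x)
spread = max(x) - min(x)              # -> 0: consensus on agent 1's state
```

Agent 1 never updates, so the consensus value is its initial state, consistent with the weighted-average characterization for rooted agents.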
For example, the directed graph does not preserve the passivity properties in general. Moreover, the directed graph could cause difficulties in the design of (positive-definite) Lyapunov functions. One approach is to integrate the nonnegative left eigenvector of the Laplacian matrix associated with the zero eigenvalue into the Lyapunov function, which is valid for strongly connected graphs and has been applied in some problems. Another approach is based on sliding mode control. The idea is to design a sliding surface for reaching a consensus. Taking multiple Lagrangian systems as an example, the agent dynamics are represented by

neighbors but also local auxiliary variables from the neighbors. While relative physical states could be obtained through sensing, the exchange of auxiliary variables can only be achieved by communication. Hence such generalization is obtained at the price of increased communication between the neighboring agents. Unlike some other algorithms, it is generally impossible to implement the algorithm relying on purely relative sensing between neighbors without the need for communication.

Averaging Algorithms

Averaging Algorithms and Consensus, Fig. 3 Illustration of distributed averaging of multiple (time-varying) signals. Here Ai denotes agent i and ri(t) denotes a (time-varying) signal associated with agent i. Each agent needs to compute the average of all agents' signals but can communicate with only its neighbors
topology at time k, with the additional assumption that A is row stochastic and aii[k] > 0 for all i = 1, ..., n. Intuitively, the information state of each agent is updated as the weighted average of its current state and the current states of its neighbors at each iteration. Note that an agent maintains its current state if it does not exchange information with other agents at that event instant. In fact, a discretized version of the closed-loop system of (1) using (2) (with a sufficiently small sampling period) takes the form of (9). The objective here is for all agents to compute the average of their initial states by communicating with only their neighbors. That is, each xi[k] approaches (1/n) Σ_{j=1}^n xj[0] eventually. To compute the average of multiple constant signals ci, we could simply set xi[0] = ci. The algorithm (9) can be written in matrix form as x[k+1] = A[k]x[k], where x is a column stack vector of all xi and A[k] = [aij[k]] is a row-stochastic matrix.

When the graph G (and hence the matrix A) is fixed, the convergence of the algorithm (9) can be analyzed by studying the eigenvalues and eigenvectors of the row-stochastic matrix A. Because all diagonal entries of A are positive, Gershgorin's disc theorem implies that all eigenvalues of A are either within the open unit disk or at one. When the graph G is strongly connected, the Perron-Frobenius theorem implies that A has a simple eigenvalue at one with an associated right eigenvector 1 and an associated positive left eigenvector. Hence when G is strongly connected, it turns out that lim_{k→∞} A^k = 1ν^T, where ν^T is a positive left eigenvector of A associated with the eigenvalue one and satisfies ν^T 1 = 1. Note that x[k] = A^k x[0]. Hence, each agent's state xi[k] approaches ν^T x[0] eventually. If it can be further ensured that ν = (1/n)1, then averaging is achieved. It can be shown that the agents' states converge to the average of their initial values if and only if the directed graph G is both strongly connected and balanced or the undirected graph G is connected.
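For a fixed, strongly connected, and balanced graph, the limit A^k → 1ν^T with ν = (1/n)1 can be checked directly. The doubly stochastic matrix below, a directed ring with self-loops, is an assumed example:

```python
def mat_mul(A, B):
    """Plain 3x3 (square) matrix product."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# row-stochastic A with positive diagonal for the directed ring 0->1->2->0;
# columns also sum to one, so the graph is balanced (assumed example)
A = [[0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5]]
P = A
for _ in range(100):
    P = mat_mul(P, A)        # P -> 1 nu^T, a matrix with identical rows
nu = P[0]                    # each row approaches the left eigenvector nu^T
x0 = [3.0, 6.0, 9.0]
limit = sum(n_i * xi for n_i, xi in zip(nu, x0))   # -> the average, 6.0
```

Since the matrix is doubly stochastic, ν = (1/3)1 and the limit is the plain average of the initial states, matching the balanced-graph condition stated above.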
product of row-stochastic matrices. Such analysis where ˛ is a positive scalar, Ni denotes the
is closely related to Markov chains. It can be neighbor set of agent i , sgn./ is the signum func-
shown that the agents’ states converge to the tion defined componentwise, i is the internal
average of their initial values if the directed state of the estimator with i .0/ D 0, and xi
graph G is balanced at each communication event is the estimate of the average rN .t/. Due to the
and strongly connected in a joint manner or the existence of the discontinuous signum function,
undirected graph G is jointly connected. the solution of (10) is understood in the Filippov
sense (Cortes 2008).
The idea behind the algorithm (10) is as
Dynamic Averaging
follows.
Pn First, P
(10) is designed to ensure that
In a more general setting, there exist n time- n
i D1Pxi .t/ D i D1 rP
i .t/ holds for allPtime. Note
varying signals, ri .t/, i D 1; : : : ; n, which could n n n
that x
i D1 i .t/ D
i D1 i .t/ C i D1 ri .t/.
be an external signal or an output from a dy-
When the graph G is undirected and i .0/ D 0,
namical system. Here ri .t/ is available to only P Pn
it follows that R niD1 i .t/ D i D1 i .0/ C
agent i and each agent can exchange information Pn P t
˛ i D1 j 2Ni 0 sgnŒxj .
/ xi .
/d
D 0.
with only its neighbors. Each agent maintains a P P
As a result, niD1 xi .t/ D niD1 ri .t/ holds for
local estimate, denoted by xi .t/, of the average
4 P all time. Second, when G is connected, if the
of all the signals, r̄(t) = (1/n) Σ_{k=1}^n r_k(t). The objective is to design a distributed algorithm for agent i, based on r_i(t) and x_j(t), j ∈ N_i(t), such that all agents finally track the average that changes over time; that is, ‖x_i(t) − r̄(t)‖, i = 1, …, n, approaches zero eventually. Such a dynamic averaging idea finds applications in distributed sensor fusion with time-varying measurements (Bai et al. 2011b; Spanos and Murray 2005) and distributed estimation and tracking (Yang et al. 2008).
Figure 3 illustrates the dynamic averaging idea. If there exists a central station that can always access the signals of all agents, then it is trivial to compute the average. Unfortunately, in a distributed context, where there does not exist a central station and each agent can only communicate with its local neighbors, it is challenging for each agent to compute the average that changes over time. While each agent could compute the average of its own and its local neighbors' signals, this will not be the average of all signals.
When the signal r_i(t) can be arbitrary but its derivative exists and is bounded almost everywhere, a distributed nonlinear nonsmooth algorithm is designed in Chen et al. (2012) as

  φ̇_i(t) = α Σ_{j∈N_i} sgn[x_j(t) − x_i(t)],
  x_i(t) = φ_i(t) + r_i(t),  i = 1, …, n.  (10)

The algorithm (10) guarantees that all estimates x_i approach the same value in finite time; it can then be guaranteed that each estimate approaches the average of all signals in finite time.

Summary and Future Research Directions

Averaging algorithms and consensus play an important role in distributed control of networked systems. While there is significant progress in this direction, there are still numerous open problems. For example, it is challenging to achieve averaging when the graph is not balanced. It is generally not clear how to deal with a general directed or switching graph for nonlinear agents, or for nonlinear algorithms when the algorithms are based only on interagent physical state coupling, without the need for communicating additional auxiliary variables between neighbors. The study of consensus for multiple underactuated agents remains a challenge. Furthermore, when the agents' dynamics are heterogeneous, it is challenging to design consensus algorithms. In addition, in the existing studies, it is often assumed that the agents are cooperative. When there exist faulty or malicious agents, the problem becomes more involved.
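The tracking behavior of algorithm (10) can be checked in a small simulation. The graph, signals, gain α, and integration step below are illustrative choices, not taken from the entry; with φ_i(0) = 0 the sum of the estimates equals the sum of the signals at every instant, so finite-time consensus forces each x_i toward the average r̄(t).

```python
import math

# Illustrative setup (not from the entry): n = 3 agents on a complete graph
# tracking the average of three signals with bounded derivatives.
signals = [lambda t: math.sin(t), lambda t: math.cos(t), lambda t: 0.5]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
n, alpha, dt, T = 3, 5.0, 1e-3, 6.0

def sgn(v):
    return (v > 0) - (v < 0)

phi = [0.0] * n          # internal states; phi_i(0) = 0 => sum x_i(0) = sum r_i(0)
t = 0.0
while t < T:
    x = [phi[i] + signals[i](t) for i in range(n)]                      # x_i = phi_i + r_i
    dphi = [alpha * sum(sgn(x[j] - x[i]) for j in neighbors[i]) for i in range(n)]
    for i in range(n):
        phi[i] += dt * dphi[i]                                          # forward-Euler step
    t += dt

avg = sum(s(T) for s in signals) / n
x = [phi[i] + signals[i](T) for i in range(n)]
tracking_error = max(abs(xi - avg) for xi in x)
print(tracking_error)    # small: each estimate chatters tightly around the true average
```

The discontinuous sgn coupling produces chattering of order α·dt in discrete time, so the residual error shrinks with the step size.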
Averaging Algorithms and Consensus 63
We compare this BSDE with the classical stochastic differential equation (SDE)

  dX_s = σ(X_s) dB_s + b(X_s) ds

with given initial condition X_s|_{s=0} = x ∈ ℝ^n. Its integral form is

  X_t(ω) = x + ∫_0^t σ(X_s(ω)) dB_s(ω) + ∫_0^t b(X_s(ω)) ds.  (2)

Linear backward stochastic differential equations were first introduced (Bismut 1973) in stochastic optimal control problems, to solve the adjoint equation in the stochastic maximum principle of Pontryagin's type. The above existence and uniqueness theorem was obtained by Pardoux and Peng (1990). In the research domain of economics, this type of 1-dimensional BSDE was also independently derived by Duffie and Epstein (1992). The comparison theorem of BSDE was obtained in Peng (1992) and improved in El Karoui et al. (1997a). The nonlinear Feynman-Kac formula was obtained in Peng (1991, 1992) and improved in Pardoux and Peng (1992). BSDE is applied as a nonlinear Black-Scholes option pricing formula in finance; this formulation was given in El Karoui et al. (1997b). We refer to the recent survey Peng (2010) for more details.

Hedging and Risk Measuring in Finance

Multidimensional stocks with degenerate volatility matrix can be treated by constrained BSDE. Assume that a small investor, whose investment behavior cannot affect market prices, invests at time t ∈ [0,T] the amount π_t of his or her wealth Y_t in the security and π_t^0 in the bond; thus Y_t = π_t^0 + π_t. If the investment strategy is self-financing, then we have dY_t = π_t^0 dP_t^0/P_t^0 + π_t dP_t/P_t, thus

  dY_t = (rY_t + σθπ_t) dt + σπ_t dB_t,

where θ = σ^{−1}(b − r). A strategy (Y_t, π_t)_{t∈[0,T]} is said to be feasible if Y_t ≥ 0, t ∈ [0,T]. A European path-dependent contingent claim ξ settled at time T is a given nonnegative function of the path ((P_t)_{t∈[0,T]}). A feasible strategy (Y, π) is called a hedging strategy against a contingent claim ξ at the maturity T if it satisfies

  dY_t = (rY_t + σθπ_t) dt + σπ_t dB_t,  Y_T = ξ.

This problem can be regarded as finding a stochastic control π and an initial condition Y_0 such that the final state replicates the contingent claim ξ, i.e., Y_T = ξ. This type of replication is also called "exact controllability" in terms of stochastic control (see Peng 2005 for more general results).
Observe that (Y, π) is the solution of the above BSDE. It is called a superhedging strategy if there exists an increasing process K_t, often called an accumulated consumption process, such that

  dY_t = (rY_t + σθπ_t) dt + σπ_t dB_t − dK_t,  Y_T = ξ.
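In the constant-coefficient case, the linear hedging BSDE reduces to risk-neutral valuation: Y_0 = e^{−rT} E_Q[ξ]. A minimal Monte Carlo sketch (all market parameters below are illustrative assumptions, not from the entry) checks this against the closed-form Black-Scholes price of a European call ξ = (S_T − K)^+:

```python
import math, random

# Illustrative market parameters (not from the entry)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# Closed-form Black-Scholes call price
d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
d2 = d1 - sigma * math.sqrt(T)
bs_price = S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

# Y_0 = e^{-rT} E_Q[(S_T - K)^+]: simulate S_T under the risk-neutral measure Q
random.seed(0)
n_paths = 200_000
payoff_sum = 0.0
for _ in range(n_paths):
    z = random.gauss(0.0, 1.0)
    ST = S0 * math.exp((r - 0.5 * sigma**2) * T + sigma * math.sqrt(T) * z)
    payoff_sum += max(ST - K, 0.0)
mc_price = math.exp(-r * T) * payoff_sum / n_paths

print(bs_price, mc_price)   # both close to 10.45: the BSDE value Y_0 is the option price
```

The change of measure that makes this work is exactly the Girsanov transformation discussed in this entry.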
Z Z
where Œ˛C D maxf˛; 0g. A short selling con- T T
Ytu D h.XT / C l.Xs ; us /ds Zsu dBs :
straint t 0 is also a typical requirement in t t
markets. The method of constrained BSDE can
be applied to this type of problems. BSDE theory From Girsanov transformation, under the proba-
provides powerful tools to the robust pricing bility measure PQ defined by B
and risk measures for contingent claims (see El
Karoui et al. 1997a). For the dynamic risk mea- Z
d PQ T
sure under Brownian filtration, see Rosazza Gi- jT D exp b.Xs ; us /dBs
anin (2006), Peng (2004), Barrieu and El Karoui dP 0
Z T
(2005), Hu et al. (2005), and Delbaen et al. 1
(2010). jb.Xs ; us ; vs /j2 ds
2 0
Comparison Theorem
Xt is a Brownian motion, and the above BSDE is
The comparison theorem, for a real valued changed to
BSDE, tells us that, if .Yt ; Zt / and .YNt ; ZN t /
are two solutions of BSDE (1) with terminal Z T
condition YT D , YNT D N such that .!/ Ytu D h.XT / C Œl.Xs ; us / C hZsu ; b.Xs ; us /ids
N
.!/, ! 2 , then one has Yt YNt . This theorem
t
Z
holds if f and , N satisfy the abovementioned
T
Zsu dXs ;
L2 -integrability condition and f is a Lipschitz t
function in .y; z/. This theorem plays the same
important role as the maximum principle in where h; i is the Euclidean scalar product in Rd .
PDE theory. The theorem also has several very Notice that P and PQ are absolutely continuous
interesting generalizations (see Buckdahn et al. with each other. Compare this BSDE with the
2000). following one:
dXs D b.Xs ; us /ds C .Xs /dBs max min J.u; v/; J.u; v/
v u
Z T
defined in a Wiener probability space .; F ; P / DE l.Xs ; us ; vs /ds C h.XT /
with the Brownian motion Bt .!/ D !.t/ which 0
where (u_s, v_s) is formulated as above, with compact control domains u_s ∈ U and v_s ∈ V. In this case the equilibrium of the game exists if the following Isaacs condition is satisfied:

  H(x, z) := max_{v∈V} inf_{u∈U} { l(x, u, v) + ⟨z, b(x, u, v)⟩ } = inf_{u∈U} max_{v∈V} { l(x, u, v) + ⟨z, b(x, u, v)⟩ },

and the equilibrium is also obtained through a BSDE (3) defined above.

Nonlinear Feynman-Kac Formula

A very interesting situation is when f = g(X_t, y, z) and Y_T = φ(X_T) in BSDE (1). In this case we have the following relation, called the "nonlinear Feynman-Kac formula":

  Y_t = u(t, X_t),  Z_t = σ^T(X_t) ∇u(t, X_t),

where u = u(t, x) is the solution of the corresponding quasilinear PDE. Once Y_T and g are path functions of X, then the solution of the BSDE becomes also path dependent. In this sense, we can say that the PDE is in fact a "state-dependent BSDE," and BSDE gives us a new generalization of "path-dependent PDE" of parabolic and/or elliptic types. This principle was illustrated in Peng (2010) for both quasilinear and fully nonlinear situations.
Observe that BSDE (1) and the forward SDE (2) are only partially coupled. A fully coupled system of SDE and BSDE is called a forward-backward stochastic differential equation (FBSDE). It has the following form:

  dX_t = b(t, X_t, Y_t, Z_t) dt + σ(t, X_t, Y_t, Z_t) dB_t,  X_0 = x ∈ ℝ^n,
  dY_t = −f(t, X_t, Y_t, Z_t) dt + Z_t dB_t,  Y_T = φ(X_T).
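The Feynman-Kac relation Y_t = u(t, X_t) can be illustrated in its simplest linear special case g ≡ 0, where u(t, x) = E[φ(X_T) | X_t = x]. The toy data below (Brownian X, terminal function φ(x) = x², so u(t, x) = x² + (T − t)) are illustrative assumptions, not from the entry:

```python
import math, random

# Linear special case (g = 0) of the Feynman-Kac relation Y_t = u(t, X_t):
# for dX_s = dB_s and terminal condition phi(x) = x^2, the backward PDE
# solution is u(t, x) = x^2 + (T - t).  (Illustrative toy case.)
T, t, x = 1.0, 0.3, 0.7

random.seed(1)
n_paths = 100_000
acc = 0.0
for _ in range(n_paths):
    XT = x + math.sqrt(T - t) * random.gauss(0.0, 1.0)   # X_T given X_t = x
    acc += XT**2
y_mc = acc / n_paths         # Monte Carlo estimate of Y_t = E[phi(X_T) | X_t = x]
y_exact = x**2 + (T - t)     # u(t, x)

print(y_mc, y_exact)         # both close to 1.19
```

The genuinely nonlinear case (g depending on y, z) requires conditional-expectation approximations at every time step, which is what makes numerical BSDE solvers substantially harder than this sketch.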
where K is a càdlàg and increasing process with K_0 = 0 and K_t ∈ L²_P(F_t); then Y, or (Y, Z, K), is called a supersolution of the BSDE, or a g-supersolution. This notion is often used for constrained BSDEs. A typical situation is as follows: for a given continuous adapted process (L_t)_{t∈[0,T]}, find a smallest g-supersolution (Y, Z, K) such that Y_t ≥ L_t. This problem was initiated in El Karoui et al. (1997b). It is proved that this problem is equivalent to finding a triple (Y, Z, K) satisfying (4) and the following reflecting condition of Skorohod type:

  Y_s ≥ L_s,  ∫_0^T (Y_s − L_s) dK_s = 0.  (7)

In fact, τ := inf{t ∈ [0,T] : K_t > 0} is the optimal stopping time associated to this BSDE. A well-known example is the pricing of American options.
Moreover, a new type of nonlinear Feynman-Kac formula was introduced: if all coefficients are given as in the formulation of the above nonlinear Feynman-Kac formula and L_s = Φ(X_s), where Φ satisfies the same condition as φ, then we have Y_s = u(s, X_s), where u = u(t, x) is the solution of the following variational inequality:

  min{ ∂_t u + Lu + g(x, u, Du), u − Φ } = 0,  (t, x) ∈ [0,T] × ℝ^n,  (8)

with terminal condition u|_{t=T} = φ. They also demonstrated that this reflected BSDE is a powerful tool to deal with contingent claims of American type in a financial market with constraints.
BSDE reflected within two barriers, a lower one L and an upper one U, was first investigated by Cvitanic and Karatzas (1996), where a type of nonlinear Dynkin game was formulated for a two-player model with zero-sum utility in which each player chooses his own optimal exit time. Stochastic optimal switching problems can also be solved by new types of obliquely reflected BSDEs.
A more general case of constrained BSDE is to find the smallest g-supersolution (Y, Z, K) with the constraint (Y_t, Z_t) ∈ Γ_t, where, for each t ∈ [0,T], Γ_t is a given constraint set; see El Karoui and Quenez (1995), Cvitanic and Karatzas (1993), and El Karoui et al. (1997a) for the problem of superhedging in a market with convex constrained portfolios (Cvitanic et al. 1998). The case with an arbitrary closed constraint was proved in Peng (1999).

Backward Stochastic Semigroup and g-Expectations

Let E^g_{t,T}[ξ] = Y_t, where Y is the solution of BSDE (1). (E^g_{t,T}[ξ])_{0≤t≤T<∞} has the (backward) semigroup property (Peng 1997):

  E^g_{s,t}[E^g_{t,T}[ξ]] = E^g_{s,T}[ξ],  E^g_{T,T}[ξ] = ξ,  0 ≤ s ≤ t ≤ T.

For a real-valued BSDE, by the comparison theorem, the semigroup is monotone: E^g_{t,T}[ξ] ≥ E^g_{t,T}[ξ̄] if ξ ≥ ξ̄. If moreover g|_{z=0} = 0, then the semigroup is constant preserving: E^g_{t,T}[c] ≡ c. Thus the semigroup forms in fact a nonlinear expectation, called a g-expectation (since this nonlinear expectation is totally determined by the generator g).
This notion allows us to establish a nonlinear g-martingale theory, e.g., a g-supermartingale decomposition theorem. Peng (1999) claims that if Y is a square-integrable càdlàg g-supermartingale, then it has a unique decomposition: there exists a unique predictable, increasing, and càdlàg process A such that Y solves

  −dY_t = g(t, Y_t, Z_t) dt + dA_t − Z_t dB_t.

A theoretically challenging and practically important problem is as follows: given an abstract family of expectations (E_{s,t}[·])_{s≤t} satisfying the same backward semigroup properties as those of a g-expectation, can we find a function g such that E_{s,t} ≡ E^g_{s,t}? Coquet, Hu et al. (2005) proved that if E is dominated by the g_μ-expectation with g_μ(z) = μ|z| for a large enough constant μ > 0, then there exists a unique function g = g(t, ω, z) satisfying the μ-Lipschitz condition such that (E_{s,t}[·])_{s≤t} is in fact a g-expectation. For a concave dynamic expectation, with an assumption much weaker than
70 Backward Stochastic Differential Equations and Related Control Problems
robustness when implemented in a finite precision environment.
To overcome such failures, major efforts have been made in the last few decades to develop robust, well-implemented, and standardized software packages for computer-aided control systems design (Grübel 1983; NAG SLICOT 1990; Wieslander 1977). Following the standards of modern software design, such packages should consist of numerically robust routines with known performance, in terms of reliability and efficiency, that can be used to form the basis of more complex control methods. Also, to avoid duplication and to achieve efficiency and portability to different computational environments, it is essential to make maximal use of the established standard packages that are available for numerical computations, e.g., the Basic Linear Algebra Subprograms (BLAS) (Dongarra et al. 1990) or the Linear Algebra PACKage (LAPACK) (Anderson et al. 1992). On the basis of such standard packages, the next layer of more complex control methods can then be built in a robust way.
In the late 1980s, a working group was created in Europe to coordinate efforts and to integrate and extend the earlier software developments in systems and control. Thanks to the support of the European Union, this eventually led to the development of the Subroutine Library in Control Theory (SLICOT) (Benner et al. 1999; SLICOT 2012). This library contains most of the basic computational methods for the design of linear time-invariant control systems. An important feature of this and similar kinds of subroutine libraries is that the development of further higher-level methods is not restricted by specific requirements of the languages or data structures used, and that the routines can be easily incorporated within other, more user-friendly software systems (Gomez et al. 1997; MATLAB 2013). Usually, this low-level reusability can only be achieved by using a general-purpose programming language like C or Fortran.
We cannot present all the features of the SLICOT library here. Instead, we discuss its general philosophy in section "The Control Subroutine Library SLICOT" and illustrate these concepts in section "An Illustration" using one specific task, namely, checking the controllability of a system. We refer to SLICOT (2012) for more details on SLICOT and to Varga (2004) for a general discussion of numerical software for systems and control.

The Control Subroutine Library SLICOT

When designing a subroutine library of basic algorithms, one should make sure that it satisfies certain basic requirements and that it follows a strict standardization in implementation and documentation. It should also contain standardized test sets that can be used for benchmarking, and it should provide means for maintenance and portability to new computing environments. The subroutine library SLICOT was designed to satisfy the following basic recommendations that are typically expected in this context (Benner et al. 1999).
Robustness: A subroutine must either return reliable results or return an error or warning indicator if the problem has not been well posed, if the problem does not fall in the class to which the algorithm is applicable, or if the problem is too ill-conditioned to be solved in a particular computing environment.
Numerical stability and accuracy: Subroutines are supposed to return results that are as good as can be expected when working at a given precision. They should also provide an option to return a parameter estimating the accuracy actually achieved.
Efficiency: An algorithm should never be chosen for its speed if it fails to meet the usual standards of robustness, numerical stability, and accuracy, as described above. Efficiency must be evaluated, e.g., in terms of the number of floating-point operations, the memory requirements, or the number and cost of iterations to be performed.
Modern computer architectures: The requirements of modern computer architectures must be taken into account, such as shared
Basic Numerical Methods and Software for Computer Aided Control Systems Design 73
or distributed memory parallel processors, which are the standard environments of today. The differences between the various architectures may imply different choices of algorithms.
Comprehensive functional coverage: The routines of the library should solve computational problems relevant to control systems and should cover a comprehensive set of routines to make the library functional for a wide range of users. The SLICOT library covers most of the numerical linear algebra methods needed in systems analysis and synthesis problems for standard and generalized state-space models, such as Lyapunov, Sylvester, and Riccati equation solvers, transfer matrix factorizations, similarity and equivalence transformations, structure-exploiting algorithms, and condition number estimators.
The implementation of subroutines for a library should be highly standardized, and it should be accompanied by a well-written online documentation as well as a user manual (see, e.g., the standards in Denham and Benson 1981; Working Group on Software 1996) which is compatible with that of the LAPACK library (Anderson et al. 1992). Although such highly restrictive standards often put a heavy burden on the programmer, it has been observed that they are of high importance for the reusability of software, and they have also turned out to be a very valuable tool in teaching students how to implement algorithms in the context of their studies.

Benchmarking
In the validation of numerical software, it is extremely important to be able to test the correctness of the implementation as well as the performance of the method; this is one of the major steps in the construction of a software library. To achieve this, one needs a standardized set of benchmark examples that allows an evaluation of a method with respect to correctness, accuracy, and efficiency, and an analysis of the behavior of the method in extreme situations, i.e., on problems where the limit of the achievable accuracy is reached. In the context of basic systems and control methods, several such benchmark collections have been developed (see, e.g., Benner et al. 1997; Frederick 1988; or http://www.slicot.org/index.php?site=benchmarks).

Maintenance, Open Access, and Archives
It is a major challenge to keep a well-developed library accessible and usable over time, when computer architectures and operating systems are changing rapidly, while keeping the library open for access to the user community. This usually requires financial resources that either have to be provided by public funding or by licensing the commercial use.
In the SLICOT library, this challenge has been addressed by the formation of the Niconet Association (http://www.niconet-ev.info/en/), which provides the current versions of the codes and all the documentation. Those of Release 4.5 are available under the GNU General Public License or from the archives of http://www.slicot.org/.

An Illustration
To illustrate the development of a basic control system routine, we consider the specific problem of checking the controllability of a linear time-invariant control system. A linear time-invariant control problem has the form

  dx/dt = Ax + Bu,  t ∈ [t_0, ∞).  (1)

Here x denotes the state and u the input function, and the system matrices are typically of the form A ∈ ℝ^{n×n}, B ∈ ℝ^{n×m}.
One of the most important topics in control is the question whether, by an appropriate choice of the input function u(t), we can steer the system from an arbitrary state to the null state. This property, called controllability, can be characterized by one of the following equivalent conditions (see Paige 1981).
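Among the classical equivalent conditions is the Kalman rank test, rank [B, AB, …, A^{n−1}B] = n. A minimal sketch follows (the example matrices are illustrative; the test itself is numerically fragile for larger n, which is precisely what motivates staircase-type algorithms):

```python
import numpy as np

def kalman_controllable(A, B, tol=1e-10):
    """Naive controllability test: rank [B, AB, ..., A^{n-1}B] == n.

    Simple, but forming powers of A is numerically fragile for larger n,
    which is why orthogonal staircase reductions are preferred in practice.
    """
    n = A.shape[0]
    blocks, M = [], B
    for _ in range(n):
        blocks.append(M)
        M = A @ M
    C = np.hstack(blocks)
    return np.linalg.matrix_rank(C, tol) == n

# Controllable: integrator chain driven at the top
A = np.diag(np.ones(2), k=-1)          # 3x3 shift matrix
B = np.array([[1.0], [0.0], [0.0]])
print(kalman_controllable(A, B))       # True

# Uncontrollable: a decoupled mode never reached by the input
A2 = np.diag([1.0, 2.0])
B2 = np.array([[1.0], [0.0]])
print(kalman_controllable(A2, B2))     # False
```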
$$
PAP^T=\begin{bmatrix}
A_{11} & A_{12} & \cdots & A_{1,r-1} & A_{1,r}\\
A_{21} & A_{22} & \cdots & A_{2,r-1} & A_{2,r}\\
 & \ddots & \ddots & \vdots & \vdots\\
 & & A_{r-1,r-2} & A_{r-1,r-1} & A_{r-1,r}\\
0 & \cdots & 0 & 0 & A_{rr}
\end{bmatrix},
$$

with block row (and column) sizes n_1, n_2, …, n_{r−1}, n_r, and

$$
PBQ=\begin{bmatrix}
B_1 & 0\\
0 & 0\\
\vdots & \vdots\\
0 & 0
\end{bmatrix}, \tag{2}
$$

with column block sizes n_1 and m − n_1, where n_1 ≥ n_2 ≥ ⋯ ≥ n_{r−1} ≥ n_r ≥ 0, n_{r−1} > 0, and A_{i,i−1} = [Σ_{i,i−1} 0], with nonsingular blocks Σ_{i,i−1} ∈ ℝ^{n_i×n_i} and B_1 ∈ ℝ^{n_1×n_1}.
Notice that when using the reduced pair in condition (iii) of Theorem 1, the controllability condition is just n_r = 0, which is simply checked by inspection. A numerically stable algorithm to compute the staircase form of Theorem 4 is given below. It is based on the singular value decomposition, but one could instead also have used the QR decomposition with column pivoting.

$$
A := U_B^T A U_B = \begin{bmatrix} A_{11} & A_{12}\\ A_{21} & A_{22}\end{bmatrix}, \qquad
B := U_B^T B V_B = \begin{bmatrix} \Sigma_B & 0\\ 0 & 0 \end{bmatrix},
$$

with A_{11} of size n_1 × n_1.

Step 1: Perform an SVD A_{21} = U_{21} [Σ_{21} 0; 0 0] V_{21}^T with Σ_{21} ∈ ℝ^{n_2×n_2} nonsingular and diagonal. Set

$$
P_2 := \begin{bmatrix} V_{21}^T & 0\\ 0 & U_{21}^T \end{bmatrix}, \qquad P := P_2 P,
$$

so that

$$
A := P_2 A P_2^T =: \begin{bmatrix} A_{11} & A_{12} & A_{13}\\ A_{21} & A_{22} & A_{23}\\ 0 & A_{32} & A_{33}\end{bmatrix}, \qquad
B := P_2 B =: \begin{bmatrix} B_1 & 0\\ 0 & 0\\ 0 & 0\end{bmatrix},
$$

where A_{21} = [Σ_{21} 0] and B_1 := V_{21}^T Σ_B.

Step 2: i := 3.
DO WHILE (n_{i−1} > 0 AND A_{i,i−1} ≠ 0)
  Perform an SVD of A_{i,i−1} = U_{i,i−1} [Σ_{i,i−1} 0; 0 0] V_{i,i−1}^T with Σ_{i,i−1} ∈ ℝ^{n_i×n_i} nonsingular and diagonal. Set
  P_i := diag(I_{n_1}, …, I_{n_{i−2}}, V_{i,i−1}^T, U_{i,i−1}^T),  P := P_i P,
  where A_{i,i−1} = [Σ_{i,i−1} 0].
  i := i + 1
END
r := i

It is clear that this algorithm will stop with n_i = 0 or A_{i,i−1} = 0. In every step, the remaining block shrinks by at least one row/column as long as Rank A_{i,i−1} ≥ 1, so that the algorithm stops after at most n − 1 steps. It has been shown in Van Dooren (1981) that system (1) is controllable if and only if in the staircase form of (A, B) one has n_r = 0.
It should be noted that the updating transformations P_i of this algorithm will affect previously created "stairs," so that the blocks denoted as Σ_{i,i−1} will not be diagonal anymore, but their singular values are unchanged. This is critical in the decision about the controllability of the pair (A, B), since it depends on the numerical rank of the submatrices A_{i,i−1} and B (see Demmel and Kågström 1993). Based on this and on a detailed error and perturbation analysis, the Staircase Algorithm has been implemented in the SLICOT routine AB01OD; it uses in the worst case O(n⁴) flops (a "flop" is an elementary floating-point operation +, −, ×, or /). For efficiency reasons, the SLICOT routine AB01OD does not use SVDs for rank decisions, but QR decompositions with column pivoting. When applying the corresponding orthogonal transformations to the system without accumulating them, the complexity can be reduced to O(n³) flops. The routine has been provided with error bounds, condition estimates, and warning strategies.

Summary and Future Directions

We have presented the SLICOT library and the basic principles for the design of such basic subroutine libraries. To illustrate these principles, we have presented the development of a method for checking controllability of a linear time-invariant control system. But the SLICOT library contains much more than that. It essentially covers most of the problems listed in the selected reprint volume (Patel et al. 1994). This volume contained, in 1994, the state of the art in numerical methods for systems and control, but the field has strongly evolved since then. Examples of areas that were not in this volume but that are included in SLICOT are periodic systems, differential algebraic equations, and model reduction. Areas which still need new results and software are the control of large-scale systems, obtained either from discretizations of partial differential equations or from the interconnection of a large number of interacting systems. But it is unclear for the moment which will be the methods of choice for such problems. We still need to understand the numerical challenges in such areas before we can propose numerically reliable software for these problems: the area is still quite open for new developments.

Cross-References

▶ Computer-Aided Control Systems Design: Introduction and Historical Overview
▶ Interactive Environments and Software Tools for CACSD

Bibliography

Anderson E, Bai Z, Bischof C, Demmel J, Dongarra J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Ostrouchov S, Sorensen D (1995) LAPACK users' guide, 2nd edn. SIAM, Philadelphia. http://www.netlib.org/lapack/
Benner P, Laub AJ, Mehrmann V (1997) Benchmarks for the numerical solution of algebraic Riccati equations. Control Syst Mag 17:18–28
Benner P, Mehrmann V, Sima V, Van Huffel S, Varga A (1999) SLICOT-A subroutine library in systems and control theory. Appl Comput Control Signals Circuits 1:499–532
Demmel JW, Kågström B (1993) The generalized Schur decomposition of an arbitrary pencil A − λB: robust software with error bounds and applications. Part I: theory and algorithms. ACM Trans Math Softw 19:160–174
Denham MJ, Benson CJ (1981) Implementation and documentation standards for the software library in control engineering (SLICE). Technical report 81/3, Kingston Polytechnic, Control Systems Research Group, Kingston
Dongarra JJ, Du Croz J, Duff IS, Hammarling S (1990) A set of level 3 basic linear algebra subprograms. ACM Trans Math Softw 16:1–17
Frederick DK (1988) Benchmark problems for computer aided control system design. In: Proceedings of the 4th IFAC symposium on computer-aided control systems design, Beijing, pp 1–6
Golub GH, Van Loan CF (1996) Matrix computations, 3rd edn. The Johns Hopkins University Press, Baltimore
Gomez C, Bunks C, Chancelior J-P, Delebecque F (1997) Integrated scientific computing with scilab. Birkhäuser, Boston. https://www.scilab.org/
Grübel G (1983) Die regelungstechnische Programmbibliothek RASP. Regelungstechnik 31:75–81
Bilinear Control of Schrödinger PDEs 77
Luenberger DG (1967) Canonical forms for linear multivariable systems. IEEE Trans Autom Control 12(3):290–293
Paige CC (1981) Properties of numerical algorithms related to computing controllability. IEEE Trans Autom Control AC-26:130–138
Patel R, Laub A, Van Dooren P (eds) (1994) Numerical linear algebra techniques for systems and control. IEEE, Piscataway
The Control and Systems Library SLICOT (2012) The NICONET society. NICONET e.V. http://www.niconet-ev.info/en/
The MathWorks, Inc. (2013) MATLAB version 8.1. The MathWorks, Inc., Natick
The Numerical Algorithms Group (1993) NAG SLICOT library manual, release 2. The Numerical Algorithms Group, Wilkinson House, Oxford. Updates Release 1 of May 1990
The Working Group on Software (1996) SLICOT implementation and documentation standards 2.1. WGS-report 96-1. http://www.icm.tu-bs.de/NICONET/reports.html
Van Dooren P (1981) The generalized eigenstructure problem in linear system theory. IEEE Trans Autom Control AC-26:111–129
Varga A (ed) (2004) Special issue on numerical awareness in control. Control Syst Mag 24-1:14–17
Wieslander J (1977) Scandinavian control library. A subroutine library in the field of automatic control. Technical report, Department of Automatic Control, Lund Institute of Technology, Lund

Bilinear Control of Schrödinger PDEs

Karine Beauchard¹ and Pierre Rouchon²
¹CNRS, CMLS, Ecole Polytechnique, Palaiseau, France
²Centre Automatique et Systèmes, Mines ParisTech, Paris Cedex 06, France

Abstract

This entry is an introduction to modern issues about controllability of Schrödinger PDEs with bilinear controls. This model is pertinent for a quantum particle, controlled by an electric field. We review recent developments in the field, with discrimination between exact and approximate controllabilities, in finite or infinite time. We also underline the variety of mathematical tools used by various teams in the last decade. The results are illustrated on several classical examples.

Keywords

Approximate controllability; Global exact controllability; Local exact controllability; Quantum particles; Schrödinger equation; Small-time controllability

Introduction

A quantum particle, in a space with dimension N (N = 1, 2, 3), in a potential V = V(x), and in an electric field u = u(t), is represented by a wave function ψ: (t, x) ∈ ℝ × Ω → ℂ on the L²(Ω; ℂ) sphere S:

  ∫_Ω |ψ(t, x)|² dx = 1,  ∀t ∈ ℝ,

where Ω ⊂ ℝ^N is a possibly unbounded open domain. In first approximation, the time evolution of the wave function is given by the Schrödinger equation

  i∂_t ψ(t, x) = (−Δ + V)ψ(t, x) − u(t)μ(x)ψ(t, x),  t ∈ (0, +∞), x ∈ Ω,
  ψ(t, x) = 0,  x ∈ ∂Ω,  (1)

where μ is the dipolar moment of the particle and ℏ = 1 here. Sometimes, this equation is considered in the more abstract framework

  i dψ/dt = (H_0 + u(t)H_1)ψ,  (2)

where ψ lives on the unit sphere of a separable Hilbert space H and the Hamiltonians H_0, H_1 are Hermitian operators on H. A natural question, with many practical applications, is the existence of a control u that steers the wave function from a given initial state ψ_0 to a prescribed target ψ_f.
The goal of this survey is to present well-established results concerning exact and approximate controllabilities for the bilinear control system (1), with applications to relevant examples. The main difficulties are the infinite
dimension of H and the nonlinearity of the control system.

Preliminary Results

When the Hilbert space H has finite dimension n, the controllability of Eq. (2) is well understood (D'Alessandro 2008). If, for example, the Lie algebra spanned by H_0 and H_1 coincides with u(n), the set of skew-Hermitian matrices, then system (2) is globally controllable: for any initial and final states ψ_0, ψ_f ∈ H of length one, there exist T > 0 and a bounded open-loop control [0, T] ∋ t ↦ u(t) steering ψ from ψ(0) = ψ_0 to ψ(T) = ψ_f.
In infinite dimension, this idea served to intuit a negative controllability result in Mirrahimi and Rouchon (2004), but the above characterization cannot be generalized, because iterated Lie brackets of unbounded operators are not necessarily well defined. For example, the quantum harmonic oscillator

  i∂_t ψ(t, x) = −∂_x² ψ(t, x) + x² ψ(t, x) − u(t) x ψ(t, x),  x ∈ ℝ,  (3)

is not controllable (in any reasonable sense) (Mirrahimi and Rouchon 2004), even if all its Galerkin approximations are controllable (Fu et al. 2001). Thus, much care is required in the use of Galerkin approximations to prove controllability in infinite dimension. This motivates the search for different methods to study exact controllability of bilinear PDEs of form (1).
In infinite dimension, the norms need to be specified. In this article, we use Sobolev norms. For s ∈ ℕ, the Sobolev space H^s(Ω) is the space of functions ψ: Ω → ℂ with square-integrable derivatives d^k ψ for k = 0, …, s (derivatives are well defined in the distribution sense). H^s(Ω) is endowed with the norm ‖ψ‖_{H^s} := (Σ_{k=0}^s ‖d^k ψ‖²_{L²(Ω)})^{1/2}. We also use the space H_0¹(Ω), which contains the functions ψ ∈ H¹(Ω) that vanish on the boundary ∂Ω (in the trace sense) (Brézis 1999).

The first control result of the literature states the noncontrollability of system (1) in H² ∩ H_0¹(Ω) ∩ S with controls u ∈ L²((0, T), ℝ) (Ball et al. 1982; Turinici 2000). More precisely, by applying L²(0, T) controls u, the reachable wave functions ψ(T) form a subset of H² ∩ H_0¹(Ω) ∩ S with empty interior. This statement does not give obstructions for system (1) to be controllable in different functional spaces, as we will see below, but it indicates that controllability issues are much more subtle in infinite dimension than in finite dimension.

Local Exact Controllability in 1D and with Discrete Spectrum

This section is devoted to the 1D PDE:

  i∂_t ψ(t, x) = −∂_x² ψ(t, x) − u(t)μ(x)ψ(t, x),  x ∈ (0, 1), t ∈ (0, T),
  ψ(t, 0) = ψ(t, 1) = 0.  (4)

We call "ground state" the solution of the free system (u = 0) built with the first eigenvalue and eigenvector of −∂_x²: ψ_1(t, x) = √2 sin(πx) e^{−iπ²t}. Under appropriate assumptions on the dipolar moment μ, system (4) is controllable around the ground state, locally in H³_{(0)}(0, 1) ∩ S, with controls in L²((0, T), ℝ), as stated below.

Theorem 1 Assume μ ∈ H³((0, 1), ℝ) and

  | ∫_0^1 μ(x) sin(πx) sin(kπx) dx | ≥ c/k³,  ∀k ∈ ℕ*,  (5)

for some constant c > 0. Then, for every T > 0, there exists δ > 0 such that for every ψ_0, ψ_f ∈ S ∩ H³_{(0)}((0, 1), ℂ) with ‖ψ_0 − ψ_1(0)‖_{H³} + ‖ψ_f − ψ_1(T)‖_{H³} < δ, there exists u ∈ L²((0, T), ℝ) such that the solution of (4) with initial condition ψ(0, x) = ψ_0(x) satisfies ψ(T) = ψ_f.

Here, H³_{(0)}(0, 1) := {ψ ∈ H³((0, 1), ℂ); ψ = ψ″ = 0 at x = 0, 1}. We refer to Beauchard
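For Eq. (2) in finite dimension, the Lie algebra condition can be checked numerically by closing the generators iH_0, iH_1 under commutators and computing the dimension of their real span. The two-level Hamiltonians below are illustrative choices, not from the entry (H_0 is given a nonzero trace so that all of u(2), of real dimension 4, is generated):

```python
import numpy as np

# Illustrative two-level system: H0 with nonzero trace plus a sigma_x coupling,
# chosen so that iH0 and iH1 generate the full Lie algebra u(2) (real dim 4).
H0 = np.array([[1.0, 0.0], [0.0, 0.0]], dtype=complex)
H1 = np.array([[0.0, 1.0], [1.0, 0.0]], dtype=complex)

def vec(M):
    # real vectorization of a complex matrix, for linear-independence tests
    return np.concatenate([M.real.ravel(), M.imag.ravel()])

basis = []   # real basis of the generated Lie algebra, stored as vectors

def try_add(M):
    """Append M to the basis if it is nonzero and linearly independent."""
    v = vec(M)
    if np.linalg.norm(v) < 1e-12:
        return False
    if basis:
        Bm = np.array(basis).T
        coef, *_ = np.linalg.lstsq(Bm, v, rcond=None)
        if np.linalg.norm(v - Bm @ coef) < 1e-9:
            return False
    basis.append(v)
    return True

elems = [1j * H0, 1j * H1]
for M in elems:
    try_add(M)
# Close under commutators; the dimension is bounded by dim u(2) = 4, so this stops
changed = True
while changed:
    changed = False
    for X in list(elems):
        for Y in list(elems):
            Z = X @ Y - Y @ X
            if try_add(Z):
                elems.append(Z)
                changed = True

lie_dim = len(basis)
print(lie_dim)   # 4: the generated algebra is all of u(2), so system (2) is controllable
```

With traceless generators the same computation would return 3 (su(2)), which still gives controllability up to a global phase.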
and Laurent (2010) and Beauchard et al. (2013) in H 2 \ H01 and the reachable set to coincide
for proof and generalizations to nonlinear PDEs. 3
with H.0/ .
The proof relies on the linearization principle, by
applying the classical inverse mapping theorem
Open Problems in Multi-D or
to the endpoint map. Controllability of the lin-
with Continuous Spectrum B
earized system around the ground state is a con-
The linearization principle used to prove Theo-
sequence of assumption (5) and classical results
rem 1 does not work in multi-D: the trigonometric
about trigonometric moment problems. A subtle
moment problem, associated to the controllabil-
smoothing effect allows to prove C 1 regularity of
ity of the linearized system, cannot be solved.
the endpoint map.
Indeed, its frequencies, which are the eigenvalues
The assumption (5) holds for generic
2
of the Dirichlet Laplacian operator, do not satisfy
H 3 ..0; 1/; R/ and plays a key role for local ex-
a required gap condition (Loreti and Komornik
act controllability to hold in small time T . In
2005).
Beauchard and Morancey (2014), local exact con-
The study of a toy model (Beauchard 2011)
trollability is proved under the weaker assump-
suggests that if local controllability holds in 2D
tion, namely,
0 .0/ ˙
0 .1/ ¤ 0, but only in large
(with a priori bounded L2 -controls) then a posi-
time T .
tive minimal time is required, whatever
is. The
Moreover, under appropriate assumptions on
appropriate functional frame for such a result is
… not appropriate. However, closed-loop controls may be computed via numerical simulations and then applied to real quantum systems in open loop, without measurement. Then, the strategy consists in designing damping feedback laws, thanks to a controlled Lyapunov function, which encodes the distance to the target. In finite dimension, the convergence proof relies on the LaSalle invariance principle. In infinite dimension, this principle works when the trajectories of the closed-loop system are compact (in the appropriate space), which is often difficult to prove. Thus, two adaptations have been proposed: approximate convergence (Beauchard and Mirrahimi 2009; Mirrahimi 2009) and weak convergence (Beauchard and Nersesyan 2010) to the target.

Variational Methods and Global Exact Controllability

The global approximate controllability of (1), in any Sobolev space, is proved in Nersesyan (2010), under generic assumptions on (V, μ). Global exact controllability then follows by combining this with local exact controllability and with:
• Time reversibility of the Schrödinger equation (i.e., if (ψ(t, x), u(t)) is a trajectory, then so is (ψ*(T − t, x), u(T − t)), where ψ* is the complex conjugate of ψ).

Let us expose this strategy on the quantum box (6). First, one can check the assumptions of Theorem 2 with V(x) = x and μ(x) = (1 − ε)x when ε > 0 is small enough. This means that, in (6), we consider controls u(t) of the form ε + u(t). Thus, an initial condition ψ₀ ∈ H³⁺ᵋ₍₀₎ can be steered arbitrarily close to the first eigenvector φ₁,ε of −∂²ₓ + εx, in H³ norm. Moreover, by a variant of Theorem 1, the local exact controllability of (6) holds in H³₍₀₎ around φ₁,ε. Therefore, the initial condition ψ₀ ∈ H³⁺ᵋ₍₀₎ can be steered exactly to φ₁,ε in finite time. By the time reversibility of the Schrödinger equation, we can also steer exactly the solution from φ₁,ε to any target ψf ∈ H³⁺ᵋ₍₀₎. Therefore, the solution can be steered exactly from any initial condition ψ₀ ∈ H³⁺ᵋ₍₀₎ to any target ψf ∈ H³⁺ᵋ₍₀₎ in finite time.

Geometric Techniques Applied to Galerkin Approximations
… |λ_L − λ_M| ≠ |λ_{p_l} − λ_{p_{l+1}}| for all 1 ≤ l ≤ r − 1 and all (L, M) ∈ ℕ² with {L, M} ≠ {p_l, p_{l+1}}.

Then for every ε > 0 and ψ₀, ψf in the unit sphere of H, there exists a piecewise constant function u : [0, T_ε] → [0, δ] such that the solution of (2) with initial condition ψ(0) = ψ₀ satisfies ‖ψ(T_ε) − ψf‖_H < ε.

We refer to Boscain et al. (2012, 2013) and Chambrion et al. (2009) for the proof and additional results, such as estimates on the L¹ norm of the control. Note that H₀ is not necessarily of the form −(Δ + V), H₁ can be unbounded, δ may be arbitrarily small, and the two assumptions are generic with respect to (H₀, H₁). The connectivity and transition frequency conditions in Theorem 3 mean physically that each pair of H₀ eigenstates is connected via a finite number of first-order (one-photon) transitions and that the transition frequencies between pairs of eigenstates are all different.

Note that, contrary to Theorem 2, Theorem 3 cannot be combined with Theorem 1 to prove global exact controllability. Indeed, the functional spaces are different: H = L²(Ω) in Theorem 3, whereas H³-regularity is required for Theorem 1.

This kind of result applies to several relevant examples, such as the control of a particle in a quantum box by an electric field (6) and the control of the planar rotation of a linear molecule by means of two electric fields:

i ∂t ψ(t, θ) = (−∂²_θ + u₁(t) cos(θ) + u₂(t) sin(θ)) ψ(t, θ), θ ∈ T,

where T is the 1D torus. However, several other systems of physical interest are not covered by these results, such as trapped ions modeled by two coupled quantum harmonic oscillators. In Ervedoza and Puel (2009), specific methods have been used to prove their approximate controllability.

Concluding Remarks

The variety of methods developed by different authors to characterize controllability of Schrödinger PDEs with bilinear control is the sign of a rich structure and subtle nature of control issues. New methods will probably be necessary to answer the remaining open problems in the field.

This survey is far from being complete. In particular, we do not consider numerical methods to derive the steering control, such as those used in NMR (Nielsen et al. 2010) to achieve robustness versus parameter uncertainties, or monotone algorithms (Baudouin and Salomon 2008; Liao et al. 2011) for optimal control (Cancès et al. 2000). We also do not consider open quantum systems, where the state is the density operator ρ, a nonnegative Hermitian operator with unit trace on H. The Schrödinger equation is then replaced by the Lindblad equation

dρ/dt = −i [H₀ + uH₁, ρ] + Σ_ν ( L_ν ρ L_ν† − ½ (L_ν† L_ν ρ + ρ L_ν† L_ν) ),

with operators L_ν related to the decoherence channels ν. Even in the case of a finite-dimensional Hilbert space H, the controllability of such systems is not yet well understood and characterized (see Altafini (2003) and Kurniawan et al. (2012)).

Cross-References

Control of Quantum Systems
Robustness Issues in Quantum Control

Acknowledgments The authors were partially supported by the “Agence Nationale de la Recherche” (ANR), Projet Blanc EMAQS number ANR-2011-BS01-017-01.

Bibliography

Altafini C (2003) Controllability properties for finite dimensional quantum Markovian master equations. J Math Phys 44(6):2357–2372
Ball JM, Marsden JE, Slemrod M (1982) Controllability for distributed bilinear systems. SIAM J Control Optim 20:575–597
Baudouin L, Salomon J (2008) Constructive solutions of a bilinear control problem for a Schrödinger equation. Syst Control Lett 57(6):453–464
82 Boundary Control of 1-D Hyperbolic Systems
Beauchard K (2005) Local controllability of a 1-D Schrödinger equation. J Math Pures Appl 84:851–956
Beauchard K (2011) Local controllability and non-controllability for a 1D wave equation with bilinear control. J Diff Equ 250:2064–2098
Beauchard K, Laurent C (2010) Local controllability of 1D linear and nonlinear Schrödinger equations with bilinear control. J Math Pures Appl 94(5):520–554
Beauchard K, Mirrahimi M (2009) Practical stabilization of a quantum particle in a one-dimensional infinite square potential well. SIAM J Control Optim 48(2):1179–1205
Beauchard K, Morancey M (2014) Local controllability of 1D Schrödinger equations with bilinear control and minimal time. Mathematical Control and Related Fields, vol 4
Beauchard K, Nersesyan V (2010) Semi-global weak stabilization of bilinear Schrödinger equations. CRAS 348(19–20):1073–1078
Beauchard K, Coron J-M, Rouchon P (2010) Controllability issues for continuous spectrum systems and ensemble controllability of Bloch equations. Commun Math Phys 290(2):525–557
Beauchard K, Lange H, Teismann H (2013, preprint) Local exact controllability of a Bose-Einstein condensate in a 1D time-varying box. arXiv:1303.2713
Boscain U, Caponigro M, Chambrion T, Sigalotti M (2012) A weak spectral condition for the controllability of the bilinear Schrödinger equation with application to the control of a rotating planar molecule. Commun Math Phys 311(2):423–455
Boscain U, Chambrion T, Sigalotti M (2013) On some open questions in bilinear quantum control. arXiv:1304.7181
Brézis H (1999) Analyse fonctionnelle: théorie et applications. Dunod, Paris
Cancès E, Le Bris C, Pilot M (2000) Contrôle optimal bilinéaire d’une équation de Schrödinger. CRAS Paris 330:567–571
Chambrion T, Mason P, Sigalotti M, Boscain U (2009) Controllability of the discrete-spectrum Schrödinger equation driven by an external field. Ann Inst Henri Poincaré Anal Non Linéaire 26(1):329–349
Coron J-M (2006) On the small-time local controllability of a quantum particle in a moving one-dimensional infinite square potential well. C R Acad Sci Paris I 342:103–108
Coron J-M (2007) Control and nonlinearity. Mathematical surveys and monographs, vol 136. American Mathematical Society, Providence
D’Alessandro D (2008) Introduction to quantum control and dynamics. Applied mathematics and nonlinear science. Chapman & Hall/CRC, Boca Raton
Ervedoza S, Puel J-P (2009) Approximate controllability for a system of Schrödinger equations modeling a single trapped ion. Ann Inst Henri Poincaré Anal Non Linéaire 26(6):2111–2136
Fu H, Schirmer SG, Solomon AI (2001) Complete controllability of finite level quantum systems. J Phys A 34(8):1678–1690
Kurniawan I, Dirr G, Helmke U (2012) Controllability aspects of quantum dynamics: unified approach for closed and open systems. IEEE Trans Autom Control 57(8):1984–1996
Li JS, Khaneja N (2009) Ensemble control of Bloch equations. IEEE Trans Autom Control 54(3):528–536
Liao S-K, Ho T-S, Chu S-I, Rabitz HH (2011) Fast-kick-off monotonically convergent algorithm for searching optimal control fields. Phys Rev A 84(3):031401
Loreti P, Komornik V (2005) Fourier series in control theory. Springer, New York
Mirrahimi M (2009) Lyapunov control of a quantum particle in a decaying potential. Ann Inst Henri Poincaré (c) Nonlinear Anal 26:1743–1765
Mirrahimi M, Rouchon P (2004) Controllability of quantum harmonic oscillators. IEEE Trans Autom Control 49(5):745–747
Nersesyan V (2010) Global approximate controllability for Schrödinger equation in higher Sobolev norms and applications. Ann IHP Nonlinear Anal 27(3):901–915
Nersesyan V, Nersisyan H (2012a) Global exact controllability in infinite time of Schrödinger equation. J Math Pures Appl 97(4):295–317
Nersesyan V, Nersisyan H (2012b) Global exact controllability in infinite time of Schrödinger equation: multidimensional case. Preprint: arXiv:1201.3445
Nielsen NC, Kehlet C, Glaser SJ, Khaneja N (2010) Optimal control methods in NMR spectroscopy. eMagRes
Turinici G (2000) On the controllability of bilinear quantum systems. In: Le Bris C, Defranceschi M (eds) Mathematical models and methods for ab initio quantum chemistry. Lecture notes in chemistry, vol 74. Springer

Boundary Control of 1-D Hyperbolic Systems

Georges Bastin¹ and Jean-Michel Coron²
¹Department of Mathematical Engineering, University Catholique de Louvain, Louvain-La-Neuve, Belgium
²Laboratoire Jacques-Louis Lions, University Pierre et Marie Curie, Paris, France

Abstract

One-dimensional hyperbolic systems are commonly used to describe the evolution of various physical systems. For many of these systems, controls are available on the boundary. There
are then two natural questions: controllability (steer the system from a given state to a desired target) and stabilization (construct feedback laws leading to a good behavior of the closed-loop system around a given set point).

Keywords

Chromatography; Controllability; Electrical lines; Hyperbolic systems; Open channels; Road traffic; Stabilization

One-dimensional hyperbolic systems take the general form

Y_t + A(Y) Y_x = 0, (1)

where:
• t and x are two independent variables: a time variable t ∈ [0, T] and a space variable x ∈ [0, L] over a finite interval.
• Y : [0, T] × [0, L] → ℝⁿ is the vector of state variables.
• A : ℝⁿ → M_{n,n}(ℝ), where M_{n,n}(ℝ) is the set of n × n real matrices.
• Y_t and Y_x denote the partial derivatives of Y with respect to t and x, respectively.

The system (1) is hyperbolic, which means that A(Y) has n distinct real eigenvalues (called characteristic velocities) for all Y in a domain of ℝⁿ. Here are some typical examples of physical models having the form of a hyperbolic system.

Electrical Lines
First proposed by Heaviside in (1885, 1886 and 1887), the equations of (lossless) electrical lines (also called telegrapher equations) describe the propagation of current and voltage along electrical transmission lines (see Fig. 1). It is a hyperbolic system of the following form:

(I_t, V_t)ᵀ + [0 1/L_s ; 1/C_s 0] (I_x, V_x)ᵀ = 0, (2)

where I(t, x) is the current intensity, V(t, x) is the voltage, L_s is the self-inductance per unit length, and C_s is the self-capacitance per unit length. The system has two characteristic velocities (which are the eigenvalues of the matrix A):

λ₁ = 1/√(L_s C_s) > 0 > λ₂ = −1/√(L_s C_s). (3)

Saint-Venant Equations for Open Channels
The flow of water in an open channel is described by the Saint-Venant equations:

(H_t, V_t)ᵀ + [V H ; g V] (H_x, V_x)ᵀ = 0, (4)

where H(t, x) is the water depth, V(t, x) is the water horizontal velocity, and g is the gravity acceleration. Under subcritical flow conditions, the system is hyperbolic with characteristic velocities

λ₁ = V + √(gH) > 0 > λ₂ = V − √(gH). (5)

Aw-Rascle Equations for Fluid Models of Road Traffic
In the fluid paradigm for road traffic modeling, the traffic is described in terms of two basic macroscopic state variables: the density ρ(t, x) and the speed V(t, x) of the vehicles at position x along the road at time t. The following dynamical model for road traffic was proposed by Aw and Rascle in (2000):

(ρ_t, V_t)ᵀ + F(Y) (ρ_x, V_x)ᵀ = 0, F(Y) = [V ρ ; 0 V − Q(ρ)]. (6)
Boundary Control of 1-D Hyperbolic Systems, Fig. 1 Transmission line connecting a power supply to a resistive load R_ℓ; the power supply is represented by a Thevenin equivalent with emf U(t) and internal resistance R_g

Chromatography
In chromatography, a mixture of species with different affinities is injected in the carrying fluid at the entrance of the process, as illustrated in Fig. 3. The various substances travel at different propagation speeds and are ultimately separated in different bands. The dynamics of the mixture are described by a system of partial differential equations involving the function

L_i(P) = k_i P_i / (1 + Σ_j k_j P_j / P_max), (8)

where P_i (i = 1, …, n) denote the densities of the n carried species. The function L_i(P)
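The competitive saturation expressed by (8) is easy to evaluate directly; a minimal sketch, with made-up affinity constants k_i and loading P_max (illustrative values, not from this entry):

```python
import numpy as np

def langmuir(P, k, Pmax):
    """Langmuir isotherm (8): L_i(P) = k_i * P_i / (1 + sum_j k_j * P_j / Pmax)."""
    P, k = np.asarray(P, float), np.asarray(k, float)
    return k * P / (1.0 + np.dot(k, P) / Pmax)

k = np.array([1.0, 2.0, 4.0])     # species affinities (illustrative)
P = np.array([0.3, 0.2, 0.1])     # carried densities (illustrative)
L = langmuir(P, k, Pmax=1.0)
print(L)

# The shared denominator couples the species: each L_i is reduced by the
# total loading sum_j k_j * P_j / Pmax, never exceeding the dilute value k_i * P_i.
assert np.all(L <= k * P)
```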
(called the “Langmuir isotherm”) was proposed by Langmuir in (1916).

Boundary Control

Boundary control of 1-D hyperbolic systems refers to situations where manipulated control inputs are physically located at the boundaries. Formally, this means that the system (1) is considered under n boundary conditions having the general form

B(Y(t, 0), Y(t, L), U(t)) = 0, (9)

with B : ℝⁿ × ℝⁿ × ℝ^q → ℝⁿ. The dependence of the map B on (Y(t, 0), Y(t, L)) refers to natural physical constraints on the system. The function U(t) ∈ ℝ^q represents a set of q exogenous control inputs. The following examples illustrate how the control boundary conditions (9) may be defined for some commonly used control devices:
1. Electrical lines. For the circuit represented in Fig. 1, the line model (2) is to be considered under boundary conditions expressing the circuit laws at the power supply and at the load. The telegrapher equations (2) coupled with these boundary conditions constitute therefore a boundary control system with the voltage U(t) as control input.
2. Open channels. A standard situation is when the boundary conditions are assigned by tunable hydraulic gates, as in irrigation canals and navigable rivers; see Fig. 4. The hydraulic model of mobile spillways gives the boundary conditions

H(t, 0) V(t, 0) = k_G √((Z₀(t) − U₀(t))³),
H(t, L) V(t, L) = k_G √((H(t, L) − U_L(t))³),

where H(t, 0) and H(t, L) denote the water depth at the boundaries inside the pool, Z₀(t) and Z_L(t) are the water levels on the other side of the gates, k_G is a constant gate shape parameter, and U₀ and U_L represent the weir elevations. The Saint-Venant equations coupled to these boundary conditions constitute a boundary control system with U₀(t) and U_L(t) as command signals.
3. Ramp metering. Ramp metering is a strategy that uses traffic lights to regulate the flow of traffic entering freeways according to measured traffic conditions, as illustrated in Fig. 5. For the stretch of motorway represented in this figure, the boundary conditions involve the inflow rate U(t) controlled by the traffic lights. The Aw-Rascle equations (6) coupled to these boundary conditions constitute a boundary control system with U(t) as command signal.
Boundary Control of 1-D Hyperbolic Systems, Fig. 4 Hydraulic gates at the input and the output of a pool
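Taking the spillway relation in the form H(t,0)V(t,0) = k_G √((Z₀ − U₀)³) (as reconstructed above; the exponent and constant are assumptions of this sketch, not guaranteed by the entry), the weir elevation that realizes a desired gate flow follows by inversion. The helper name and numerical values are hypothetical:

```python
def weir_elevation(q_target, z_level, k_gate):
    """Invert q = k_gate * sqrt((z_level - u)**3) for the weir elevation u.

    q_target : desired boundary flow H * V through the gate
    z_level  : water level on the upstream side of the gate
    """
    return z_level - (q_target / k_gate) ** (2.0 / 3.0)

kG, Z0 = 0.6, 2.5          # illustrative gate constant and water level
q = 0.8                    # desired boundary flow H(t,0) * V(t,0)
U0 = weir_elevation(q, Z0, kG)

# The computed elevation reproduces the prescribed boundary condition
assert abs(kG * (Z0 - U0) ** 1.5 - q) < 1e-12
print(U0)
```

This is the kind of static inversion that turns the gate model into the command signal U₀(t) mentioned in item 2 above.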
Boundary Control of 1-D Hyperbolic Systems, Fig. 5 Ramp metering on a stretch of a motorway
Controllability

In this section and in the following one, Y* ∈ ℝⁿ is such that none of the eigenvalues of A(Y*) are 0. After an appropriate linear state transformation, the matrix A(Y*) can be assumed to be diagonal, with distinct and nonzero entries:

A(Y*) = diag(λ₁, λ₂, …, λₙ), (10)
λ₁ > λ₂ > ⋯ > λ_m > 0 > λ_{m+1} > ⋯ > λₙ.

Let Y⁺ ∈ ℝᵐ and Y⁻ ∈ ℝ^{n−m} be such that Y = (Y⁺ᵀ, Y⁻ᵀ)ᵀ. For the boundary control system (1), (9), the local controllability issue is to investigate if, starting from a given initial state Y₀ : x ∈ …

Feedback Stabilization

For the boundary control system (1), (9), the problem of local boundary feedback stabilization is the problem of finding boundary feedback control actions

U(t) = F(Y(t, 0), Y(t, L), Y*), F : ℝⁿ × ℝⁿ × ℝⁿ → ℝ^q, (12)

such that the system trajectory exponentially converges to a desired steady state Y* (called set point) from any initial condition Y₀(x) close to Y*. In such a case, the set point is said to be exponentially stable.

Theorem 5 (See Coron et al. 2008) If
there exists a boundary feedback U(t) = F(Y(t, 0), Y(t, L), Y*) such that the boundary conditions (9) are written in the form

(Y⁺(t, 0), Y⁻(t, L))ᵀ = G((Y⁺(t, L), Y⁻(t, 0))ᵀ), G(Y*) = Y*, (13)

then, for the boundary control system (1), (13), the set point Y* is locally exponentially stable for the H²-norm if

Inf { ‖Δ G′(Y*) Δ⁻¹‖ ; Δ ∈ D } < 1,

where ‖·‖ denotes the usual 2-norm of n × n real matrices, G′(Y*) denotes the Jacobian matrix of the map G at Y*, and D denotes the set of n × n diagonal real matrices with strictly positive diagonal entries.

For the stabilization in the C¹-norm, another sufficient condition is given in Li (1994).

As shown in Li (2010), this problem has strong connections with the controllability problem, and the system (1) is observable if the time T is large enough.

The above results are on smooth solutions of (1). However, the system (1) is known to be well posed in the class of BV-solutions (bounded variations), with extra conditions (e.g., of entropy type); see in particular Bressan (2000). There are partial results on the controllability in this class. See, in particular, Ancona and Marson (1998) and Horsin (1998) for n = 1. For n = 2, it is shown in Bressan and Coclite (2002) that Theorem 4 no longer holds in general in the BV class. However, there are positive results for important physical systems; see, for example, Glass (2007) for the 1-D isentropic Euler equation.

Cross-References
Coron J-M, Bastin G, d’Andréa-Novel B (2008) Dissipative boundary conditions for one-dimensional nonlinear hyperbolic systems. SIAM J Control Optim 47(3):1460–1498
Coron J-M, Vazquez R, Krstic M, Bastin G (2013) Local exponential H² stabilization of a 2 × 2 quasilinear hyperbolic system using backstepping. SIAM J Control Optim 51(3):2005–2035
Glass O (2007) On the controllability of the 1-D isentropic Euler equation. J Eur Math Soc (JEMS) 9(3):427–486
Heaviside O (1885, 1886 and 1887) Electromagnetic induction and its propagation. The Electrician; reprinted in Electrical Papers, 2 vols, London. Macmillan and co. 1892
Horsin T (1998) On the controllability of the Burgers equation. ESAIM Control Optim Calc Var 3:83–95 (electronic)
Krstic M, Smyshlyaev A (2008) Boundary control of PDEs. Volume 16 of advances in design and control. Society for Industrial and Applied Mathematics (SIAM), Philadelphia. A course on backstepping designs
Langmuir I (1916) The constitution and fundamental properties of solids and liquids. Part I. Solids. J Am Chem Soc 38:2221–2295
Li T (1994) Global classical solutions for quasilinear hyperbolic systems. Volume 32 of RAM: research in applied mathematics. Masson, Paris
Li T (2010) Controllability and observability for quasilinear hyperbolic systems. Volume 3 of AIMS series on applied mathematics. American Institute of Mathematical Sciences (AIMS), Springfield
Li T, Rao B-P (2003) Exact boundary controllability for quasi-linear hyperbolic systems. SIAM J Control Optim 41(6):1748–1755 (electronic)

Boundary Control of Korteweg-de Vries and Kuramoto–Sivashinsky PDEs

Abstract

… when they are posed on a bounded interval. Different choices of controls are studied for each equation.

Keywords

Controllability; Dispersive equations; Higher-order partial differential equations; Parabolic equations; Stabilizability

Introduction

The Korteweg-de Vries (KdV) and the Kuramoto-Sivashinsky (KS) equations have very different properties because they do not belong to the same class of partial differential equations (PDEs). The first one is a third-order nonlinear dispersive equation

y_t + y_x + y_xxx + y y_x = 0, (1)

and the second one is a fourth-order nonlinear parabolic equation

u_t + u_xxxx + λ u_xx + u u_x = 0, (2)
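The different character of (1) and (2) shows up already at the level of Fourier modes of the linearizations about zero (a whole-line sketch that ignores boundary conditions): for linearized KdV, a mode e^{i(kx−ωt)} oscillates with the real frequency ω(k) = k − k³ and never grows, whereas for linearized KS, a mode e^{ikx+σt} has the real growth rate σ(k) = λk² − k⁴, positive on the band 0 < |k| < √λ:

```python
import numpy as np

k = np.linspace(-3, 3, 601)

# Linearized KdV: y_t + y_x + y_xxx = 0  ->  omega(k) = k - k**3 (real)
omega = k - k**3

# Linearized KS: u_t + u_xxxx + lam*u_xx = 0  ->  sigma(k) = lam*k**2 - k**4
lam = 4.0
sigma = lam * k**2 - k**4

unstable = k[sigma > 0]
# The unstable band is exactly 0 < |k| < sqrt(lam)
assert np.all(np.abs(unstable) < np.sqrt(lam) + 1e-9)
assert np.all(np.abs(unstable) > 0)
print(unstable.min(), unstable.max())
```

The purely real dispersion relation of KdV (no damping, no growth) versus the sign-changing growth rate of KS is what makes the first equation dispersive and the second parabolic with a possible anti-diffusive instability.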
… configuration is not possible for the classical wave and heat equations, where at each extreme only one boundary condition exists and therefore controlling one or all the boundary data at one point is the same.

The KdV equation being of third order in space, three boundary conditions have to be imposed: one at the left endpoint x = 0 and two at the right endpoint x = L. For the KS equation, four boundary conditions are needed to get a well-posed system, two at each extreme. We will focus on the cases where Dirichlet and Neumann boundary conditions are considered, because lack-of-controllability phenomena appear. This holds for some special values of the length of the interval for the KdV equation and depends on the anti-diffusion coefficient for the KS equation.

The particular cases where the lack of controllability occurs can be seen as isolated anomalies. However, those phenomena give us important information on the systems. In particular, any method independent of the value of those constants cannot control or stabilize the system when acting from the corresponding control input where trouble appears. In all of these cases, for both the KdV and the KS equations, the space of uncontrollable states is finite dimensional, and therefore, some methods coming from the control of ordinary differential equations can be applied.

… the space of bounded functions. The main control properties to be mentioned in this article are controllability, stability, and stabilization. A control system is said to be exactly controllable if the system can be driven from any initial state to another one in finite time. This kind of property holds, for instance, for hyperbolic systems such as the wave equation. The notion of null-controllability means that the system can be driven to the origin from any initial state. The main example for this property is the heat equation, which presents regularizing effects. Even if the initial data is discontinuous, right after t = 0 the solution of the heat equation becomes very smooth, and therefore it is not possible to impose a discontinuous final state. A system is said to be asymptotically stable if the solutions of the system without any control converge, as time goes to infinity, to a stationary solution of the PDE. When this convergence holds with a control depending at each time on the state of the system (feedback control), the system is said to be stabilizable by means of a feedback control law.

All these properties have local versions when a smallness condition for the initial and/or the final state is added. This local character is normally due to the nonlinearity of the system.
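The regularizing effect of the heat equation invoked above is easy to observe numerically: a few explicit finite-difference steps already damp the jump in a discontinuous initial condition, which is why a discontinuous final state cannot be reached exactly. A small sketch (grid and coefficients are illustrative):

```python
import numpy as np

n, nu = 200, 0.4                       # grid points, stable ratio dt/dx**2
x = np.linspace(0.0, 1.0, n)
u = np.where(x < 0.5, 0.0, 1.0)        # discontinuous initial data

def heat_step(u, nu):
    """One explicit Euler step of u_t = u_xx with homogeneous Neumann ends."""
    v = u.copy()
    v[1:-1] = u[1:-1] + nu * (u[2:] - 2 * u[1:-1] + u[:-2])
    v[0], v[-1] = v[1], v[-2]
    return v

jump0 = np.max(np.abs(np.diff(u)))     # size of the sharpest jump (= 1 here)
for _ in range(50):
    u = heat_step(u, nu)
jump = np.max(np.abs(np.diff(u)))

assert jump < jump0                    # the solution has been smoothed
print(jump0, "->", float(jump))
```

After a handful of steps, the sharpest cell-to-cell jump has collapsed; by contrast, a purely dispersive equation would transport and oscillate the data without this smoothing.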
The KdV Equation

Thus, viewing h₁(t), h₂(t), h₃(t) ∈ ℝ as controls and the solution y(t, ·) : [0, L] → ℝ as the state, we can consider the linear control system (3)–(4) and the nonlinear one (1)–(4).

We will report on the role of each input control when the other two are off. The tools used are mainly the duality controllability-observability, Carleman estimates, the multiplier method, the compactness-uniqueness argument, the backstepping method, and fixed-point theorems. Surprisingly, the control properties of the system depend strongly on the location of the controls.

Theorem 1 The linear KdV system (3)–(4) is:
1. Null-controllable when controlled from h₁ (i.e., h₂ = h₃ = 0) (Glass and Guerrero 2008).
2. Exactly controllable when controlled from h₂ (i.e., h₁ = h₃ = 0) if and only if L does not belong to a set O of critical lengths defined in Glass and Guerrero (2010).
3. Exactly controllable when controlled from h₃ (i.e., h₁ = h₂ = 0) if and only if L does not belong to a set of critical lengths N defined in Rosier (1997).
4. Asymptotically stable to the origin if L ∉ N and no control is applied (Perla Menzala et al. 2002).
5. Stabilizable by means of a feedback law using h₁ only (i.e., h₂ = h₃ = 0) (Cerpa and Coron 2013).

If L ∈ N or L ∈ O, one says that L is a critical length, since the linear control system (3)–(4) loses controllability properties when only one control input is applied. In those cases, there exists a finite-dimensional subspace of L²(0, L) which is unreachable from 0 for the linear system. The sets N and O contain infinitely many critical lengths, but they are countable sets.

When one is allowed to use more than one boundary control input, there is no critical spatial domain, and the exact controllability holds for any L > 0. This is proved in Zhang (1999) when three boundary controls are used. The case of two control inputs is solved in Rosier (1997), Glass and Guerrero (2010), and Cerpa et al. (2013).

Previous results concern the linearized control system. Considering the nonlinearity yyₓ, we obtain the original KdV control system and the following results.

Theorem 2 The nonlinear KdV system (1)–(4) is:
1. Locally null-controllable when controlled from h₁ (i.e., h₂ = h₃ = 0) (Glass and Guerrero 2008).
2. Locally exactly controllable when controlled from h₂ (i.e., h₁ = h₃ = 0) if L does not belong to the set O of critical lengths (Glass and Guerrero 2010).
3. Locally exactly controllable when controlled from h₃ (i.e., h₁ = h₂ = 0). If L belongs to the set of critical lengths N, then a minimal time of control may be required (see Cerpa 2014).
4. Asymptotically stable to the origin if L ∉ N and no control is applied (Perla Menzala et al. 2002).
5. Locally stabilizable by means of a feedback law using h₁ only (i.e., h₂ = h₃ = 0) (Cerpa and Coron 2013).

Item 3 in Theorem 2 is a truly nonlinear result obtained by applying a power series method introduced in Coron and Crépeau (2004). All other items are implied by perturbation arguments based on the linear control system. The related control system formed by (1) with boundary controls

y(t, 0) = h₁(t), y_x(t, L) = h₂(t), and y_xx(t, L) = h₃(t), (5)

is studied in Cerpa et al. (2013), and the same phenomenon of critical lengths appears.

The KS Equation

Applying the same strategy as for KdV, we linearize (2) around the origin to get the equation

u_t + u_xxxx + λ u_xx = 0, (6)

which can be studied on the finite interval [0, 1] under the following four boundary conditions:
Boundary Control of Korteweg-de Vries and Kuramoto–Sivashinsky PDEs 91
u(t, 0) = v₁(t), u_x(t, 0) = v₂(t), u(t, 1) = v₃(t), and u_x(t, 1) = v₄(t). (7)

Thus, viewing v₁(t), v₂(t), v₃(t), v₄(t) ∈ ℝ as controls and the solution u(t, ·) : [0, 1] → ℝ as the state, we can consider the linear control system (6)–(7) and the nonlinear one (2)–(7). The role of the parameter λ is crucial. The KS equation is parabolic, and the eigenvalues of system (6)–(7) with no control (v₁ = v₂ = v₃ = v₄ = 0) go to −∞. If λ increases, then the eigenvalues move to the right. When λ > 4π², the system becomes unstable because there are a finite number of positive eigenvalues. In this unstable regime, the system loses control properties for some values of λ.

Theorem 3 The linear KS control system (6)–(7) is:
1. Null-controllable when controlled from v₁ and v₂ (i.e., v₃ = v₄ = 0). The same is true when controlling v₃ and v₄ (i.e., v₁ = v₂ = 0) (Cerpa and Mercado 2011; Lin Guo 2002).
2. Null-controllable when controlled from v₂ (i.e., v₁ = v₃ = v₄ = 0) if and only if λ does not belong to a countable set M defined in Cerpa (2010).
3. Asymptotically stable to the origin if λ < 4π² and no control is applied (Liu and Krstic 2001).
4. Stabilizable by means of a feedback law using v₂ only (i.e., v₁ = v₃ = v₄ = 0) if and only if λ ∉ M (Cerpa 2010).

In the critical case λ ∈ M, the linear system is not null-controllable anymore if we control v₂ only (item 2 in Theorem 3). The space of noncontrollable states is finite dimensional. To obtain the null-controllability of the linear system in these cases, we have to add another control. Controlling with v₂ and v₄ does not improve the situation in the critical cases. Unlike that, the system becomes null-controllable if we can act on v₁ and v₂. This result with two input controls has been proved in Lin Guo (2002) for the case λ = 0 and in Cerpa and Mercado (2011) in the general case (item 1 in Theorem 3).

It is known from Liu and Krstic (2001) that if λ < 4π², then the system is exponentially stable in L²(0, 1). On the other hand, if λ = 4π², then zero becomes an eigenvalue of the system, and therefore the asymptotic stability fails. When λ > 4π², the system has positive eigenvalues and becomes unstable. In order to stabilize this system, a finite-dimensional-based feedback law can be designed by using the pole placement method (item 4 in Theorem 3).

Previous results concern the linearized control system. If we add the nonlinearity uuₓ, we obtain the original KS control system and the following results.

Theorem 4 The KS control system (2)–(7) is:
1. Locally null-controllable when controlled from v₁ and v₂ (i.e., v₃ = v₄ = 0). The same is true when controlling v₃ and v₄ (i.e., v₁ = v₂ = 0) (Cerpa and Mercado 2011).
2. Asymptotically stable to the origin if λ < 4π² and no control is applied (Liu and Krstic 2001).

There are fewer results for the nonlinear systems than for the linear ones. This is due to the fact that the spectral techniques used to study the linear system with only one control input are not robust enough to deal with perturbations in order to address the nonlinear control system.

Summary and Future Directions

The KdV and the KS equations both possess noncontrollability results when one boundary control input is applied. This is due to the fact that both are higher-order equations, and therefore, when posed on a bounded interval, more than one boundary condition should be imposed at the same point. The KdV equation is exactly controllable when acting from the right and null-controllable when acting from the left. On the other hand, the KS equation, being parabolic as the heat equation, is not exactly controllable but null-controllable. Most of the results are implied by the behaviors of the corresponding linear system, which are very well understood.
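The loss of asymptotic stability at λ = 4π² discussed above can be verified by hand: the function u(x) = 1 − cos(2πx) satisfies u'''' + 4π² u'' = 0 together with the four homogeneous conditions u(0) = u(1) = u'(0) = u'(1) = 0 from (7), so zero is an eigenvalue of the uncontrolled system (6)–(7) at that value of λ. A minimal numerical check of this claim:

```python
import numpy as np

lam = 4 * np.pi**2
x = np.linspace(0.0, 1.0, 11)

u   = 1 - np.cos(2 * np.pi * x)
du  = 2 * np.pi * np.sin(2 * np.pi * x)
d2u = (2 * np.pi) ** 2 * np.cos(2 * np.pi * x)
d4u = -(2 * np.pi) ** 4 * np.cos(2 * np.pi * x)

# u'''' + lam * u'' vanishes identically when lam = 4 * pi**2 ...
assert np.max(np.abs(d4u + lam * d2u)) < 1e-8
# ... and u satisfies the four homogeneous boundary conditions in (7)
assert abs(u[0]) < 1e-12 and abs(u[-1]) < 1e-12
assert abs(du[0]) < 1e-12 and abs(du[-1]) < 1e-10
print("zero eigenvalue of (6)-(7) at lam =", lam)
```

For λ slightly below 4π² this mode decays and the system is exponentially stable; at λ = 4π² it becomes stationary, which is exactly the failure of asymptotic stability mentioned in the text.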
Sivashinsky GI (1977) Nonlinear analysis of hydrodynamic instability in laminar flames – I. Derivation of basic equations. Acta Astronaut 4:1177–1206
Smyshlyaev A, Krstic M (2010) Adaptive control of parabolic PDEs. Princeton University Press, Princeton
Zhang BY (1999) Exact boundary controllability of the Korteweg-de Vries equation. SIAM J Control Optim 37:543–565

Bounds on Estimation

Arye Nehorai¹ and Gongguo Tang²
¹Preston M. Green Department of Electrical and Systems Engineering, Washington University in St. Louis, St. Louis, MO, USA
²Department of Electrical Engineering & Computer Science, Colorado School of Mines, Golden, CO, USA

Abstract

We review several universal lower bounds on statistical estimation, including deterministic bounds on unbiased estimators, such as the Cramér-Rao bound and the Barankin-type bound, as well as Bayesian bounds, such as the Ziv-Zakai bound. We present explicit forms of these bounds, illustrate their usage for parameter estimation in Gaussian additive noise, and compare their tightness.

Keywords

Barankin-type bound; Cramér-Rao bound; Mean-squared error; Statistical estimation; Ziv-Zakai bound

Introduction

Statistical estimation involves inferring the values of parameters specifying a statistical model from data. The performance of a particular statistical algorithm is measured by the error between the true parameter values and those estimated by the algorithm. However, explicit forms of the estimation error are usually difficult to obtain except for the simplest statistical models. Therefore, performance bounds are derived as a way of quantifying estimation accuracy while maintaining tractability.

In many cases, it is beneficial to quantify performance using universal bounds that are independent of the estimation algorithms and rely only upon the model. In this regard, universal lower bounds are particularly useful, as they provide a means to assess the difficulty of performing estimation for a particular model and can act as benchmarks to evaluate the quality of any algorithm: the closer the estimation error of the algorithm to the lower bound, the better the algorithm. In the following, we review three widely used universal lower bounds on estimation: the Cramér-Rao bound (CRB), the Barankin-type bound (BTB), and the Ziv-Zakai bound (ZZB). These bounds find numerous applications in determining the performance of sensor arrays, radar, and nonlinear filtering; in benchmarking various algorithms; and in the optimal design of systems.

Statistical Model and Related Concepts

To formalize matters, we define a statistical model for estimation as a family of parameterized probability density functions in ℝᴺ: {p(x; θ) : θ ∈ Θ ⊂ ℝᵈ}. We observe a realization of x ∈ ℝᴺ generated from a distribution p(x; θ), where θ ∈ Θ is the true parameter to be estimated from the data x. Though we assume a single observation x, the model is general enough to encompass multiple independent, identically distributed (i.i.d.) samples by considering the joint probability distribution.

An estimator of θ is a measurable function of the observation, θ̂(x) : ℝᴺ → Θ. An unbiased estimator is one such that

E_θ{θ̂(x)} = θ, ∀θ ∈ Θ. (1)
94 Bounds on Estimation
Here we use the subscript θ to emphasize that the expectation is taken with respect to p(x; θ). We focus on the performance of unbiased estimators in this entry. There are various ways to measure the error of the estimator θ̂(x). Two typical ones are the error covariance matrix

    E_θ{(θ̂ − θ)(θ̂ − θ)^T} = Cov_θ(θ̂),    (2)

where the equation holds only for unbiased estimators, and the mean-squared error (MSE)

    E_θ{‖θ̂(x) − θ‖₂²} = trace( E_θ{(θ̂ − θ)(θ̂ − θ)^T} ).    (3)

Example 1 (Signal in additive Gaussian noise (SAGN)) To illustrate the usage of different estimation bounds, we use the following statistical model as a running example:

    x_n = s_n(θ) + w_n,  n = 0, …, N − 1.    (4)

Here θ ∈ Θ ⊂ R is a scalar parameter to be estimated, and the noise w_n follows an i.i.d. Gaussian distribution with mean 0 and known variance σ². Therefore, the density function for x is

    p(x; θ) = ∏_{n=0}^{N−1} (1/√(2πσ²)) exp( −(x_n − s_n(θ))²/(2σ²) )
            = (1/(√(2π) σ)^N) exp( −Σ_{n=0}^{N−1} (x_n − s_n(θ))²/(2σ²) ).

In particular, we consider the frequency estimation problem where s_n(θ) = cos(2πθn), with Θ = [0, 1/4).

Cramér-Rao Bound

The Cramér-Rao bound (CRB) (Kay 2001a; Stoica and Nehorai 1989; Van Trees 2001) is arguably the most well-known lower bound on estimation. Define the Fisher information matrix I(θ) via

    I_{i,j}(θ) = E_θ[ (∂/∂θ_i) log p(x; θ) · (∂/∂θ_j) log p(x; θ) ]
               = −E_θ[ ∂² log p(x; θ) / (∂θ_i ∂θ_j) ].

Then, for any unbiased estimator θ̂, the error covariance matrix is bounded by

    E_θ{(θ̂ − θ)(θ̂ − θ)^T} ⪰ [I(θ)]^{−1},    (5)

where A ⪰ B for two symmetric matrices means that A − B is positive semidefinite. The inverse of the Fisher information matrix, CRB(θ) = [I(θ)]^{−1}, is called the Cramér-Rao bound.

When θ is scalar, I(θ) measures the expected sensitivity of the density function with respect to changes in the parameter. A density family that is more sensitive to parameter changes (larger I(θ)) will generate observations that look more different when the true parameter varies, making it easier to estimate (smaller error).

Example 2 For the SAGN model (4), the CRB is

    CRB(θ) = I(θ)^{−1} = σ² / Σ_{n=0}^{N−1} [ ∂s_n(θ)/∂θ ]².    (6)

The inverse dependence on the ℓ₂ norm of the signal derivative suggests that signals more sensitive to parameter changes are easier to estimate. For the frequency estimation problem with s_n(θ) = cos(2πθn), the CRB as a function of θ is plotted in Fig. 1.

There are many modifications of the basic CRB, such as the posterior CRB (Tichavsky et al. 1998; Van Trees 2001), the hybrid CRB (Rockah and Schultheiss 1987), the modified CRB (D'Andrea et al. 1994), the concentrated CRB (Hochwald and Nehorai 1994), and the constrained CRB (Gorman and Hero 1990; Marzetta 1993; Stoica and Ng 1998). The posterior CRB takes into account the prior information of the parameters when they are modeled as random variables.
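As a concrete check of (6), the CRB for the frequency estimation example can be evaluated numerically. The following is a sketch in NumPy; the function name and the chosen values of θ, N, and σ² are our illustrative choices:

```python
import numpy as np

def crb_freq(theta, N, sigma2):
    """CRB (6) for x_n = cos(2*pi*theta*n) + w_n, with w_n ~ N(0, sigma2)."""
    n = np.arange(N)
    # derivative of s_n(theta) = cos(2*pi*theta*n) with respect to theta
    ds = -2.0 * np.pi * n * np.sin(2.0 * np.pi * theta * n)
    return sigma2 / np.sum(ds ** 2)

# more samples -> more Fisher information -> a smaller bound
bound = crb_freq(theta=0.1, N=100, sigma2=1.0)
```

Doubling the noise variance doubles the bound, while increasing N shrinks it, matching the sensitivity interpretation above.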
The ZZB relates the MSE to the error probability of a binary hypothesis test between two candidate parameter values:

    H₀: θ = φ,     x ∼ p(x; φ)
    H₁: θ = φ + δ, x ∼ p(x; φ + δ)

with

    Pr(H₀) = p_θ(φ) / ( p_θ(φ) + p_θ(φ + δ) ),
    Pr(H₁) = p_θ(φ + δ) / ( p_θ(φ) + p_θ(φ + δ) ).

For the frequency estimation problem with a uniform prior p_θ on Θ = [0, 1/4), the ZZB reads

    ZZB = (1/2) ∫₀^{1/4} V{ ∫₀^{1/4 − h} 8 Pmin(φ, φ + h) dφ } h dh,

where V{·} denotes the valley-filling operation. The binary hypothesis testing problem is to decide which one of two signals is buried in additive Gaussian noise. The optimal detector with minimal probability of error is the minimum distance receiver (Kay 2001b), and the associated probability of error is

    Pmin(φ, φ + h) = Q( √( Σ_{n=0}^{N−1} (s_n(φ) − s_n(φ + h))² / (4σ²) ) ),

where Q(h) = ∫_h^∞ (1/√(2π)) e^{−t²/2} dt. For the frequency estimation problem, we numerically estimate the integral and plot the resulting ZZB in Fig. 3, together with the mean-squared error of the maximum likelihood estimator (MLE).

Summary and Future Directions

Performance bounds provide a way of quantifying the performance of statistically modeled physical systems that is independent of any specific algorithm. Future directions of performance bounds on estimation include deriving tighter bounds, developing computational schemes to approximate existing bounds in a tractable way, and applying them to practical problems.

Cross-References

Estimation, Survey on
Particle Filters

Recommended Reading

Kay SM (2001a), Chapters 2 and 3; Stoica P, Nehorai A (1989); Van Trees HL (2001), Chapter 2.7; Forster and Larzabal (2002); Bell et al. (1997).

Acknowledgments This work was supported in part by NSF Grants CCF-1014908 and CCF-0963742, ONR Grant N000141310050, and AFOSR Grant FA9550-11-1-0210.
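The ZZB integral above can be approximated on a grid. This is a numerical sketch: the discretization scheme, function names, and default parameter values are our choices, and Pmin uses the minimum-distance-receiver expression quoted above:

```python
import numpy as np
from math import erfc, sqrt, pi

def qfunc(x):
    # Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))
    return 0.5 * erfc(x / sqrt(2.0))

def zzb_freq(N=20, sigma=1.0, grid=120):
    """Riemann-sum approximation of the ZZB for frequency estimation
    with a uniform prior on Theta = [0, 1/4)."""
    n = np.arange(N)
    s = lambda th: np.cos(2.0 * pi * th * n)
    dh = 0.25 / grid
    hs = dh * np.arange(grid)                     # h in [0, 1/4)
    inner = np.zeros(grid)
    for i, h in enumerate(hs):
        span = 0.25 - h
        phis = span * np.arange(grid) / grid      # phi in [0, 1/4 - h)
        pmin = np.array([qfunc(np.linalg.norm(s(p) - s(p + h)) / (2.0 * sigma))
                         for p in phis])
        inner[i] = 8.0 * pmin.mean() * span       # inner integral over phi
    filled = np.maximum.accumulate(inner[::-1])[::-1]   # valley-filling V{.}
    return 0.5 * float(np.sum(filled * hs) * dh)
```

As the noise grows, the bound approaches the variance of the uniform prior, (1/4)²/12, as expected for an uninformative observation.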
Building Control Systems, Fig. 1 Evolution from centralized to distributed network architectures
wiring costs and provide more modular solutions. The development of open communications protocols, such as BACnet, has enabled the use of distributed control devices from different vendors and improved the cost-effectiveness of EMCS. There has also been a recent trend towards the use of existing enterprise networks to reduce system installed costs and to more easily allow remote access and control from any Internet-accessible device.

An EMCS for a large commercial building can automate the control of many of the building and system functions, including scheduling of lights and zone thermostat settings according to occupancy patterns. Security and fire safety systems tend to be managed using separate systems. In addition to scheduling, an EMCS manages the control of individual equipment and subsystems that provide heating, ventilation, and air conditioning (HVAC) for the building. This control is achieved using a two-level hierarchical structure of local-loop and supervisory control. Local-loop control of individual set points is typically implemented using individual proportional-integral (PI) feedback algorithms that manipulate individual actuators in response to deviations from the set points. For example, supply air temperature from a cooling coil is controlled by adjusting a valve opening that provides chilled water to the coil. The second level of supervisory control specifies the set points and other modes of operation that depend on time and external conditions.

Each local-loop feedback controller acts independently, but their performance can be coupled to other local-loop controllers if not tuned appropriately. Adaptive tuning algorithms have been developed in recent years to enable controllers to automatically adjust to changing weather and load conditions. There are typically a number of degrees of freedom in adjusting supervisory control set points over a wide range while still achieving adequate comfort conditions. Optimal control of supervisory set points involves minimizing a cost function with respect to the free variables, subject to constraints. Although model-based control optimization approaches are not typically employed in buildings, they have been used to inform the development and assessment of some heuristic control strategies. Most commonly, strategies for adjusting supervisory control variables are established at the control design phase, based on some limited analysis of the HVAC system, and specified as a sequence of operations that is programmed into the EMCS.

Systems, Equipment, and Controls

The greatest challenges and opportunities for optimizing supervisory control variables exist for
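The supervisory optimization idea above (minimizing a cost function over the free set points, subject to comfort constraints) can be sketched with a toy model. The component models, coefficients, limits, and comfort proxy below are invented for illustration, not taken from this entry:

```python
import numpy as np

# Toy component models (hypothetical coefficients)
def total_power(t_chw, p_duct):
    chiller = 100.0 + 8.0 * (6.7 - t_chw)   # colder chilled water -> more chiller power
    fan = 20.0 + 30.0 * p_duct              # higher duct static pressure -> more fan power
    return chiller + fan

def comfort_ok(t_chw, p_duct):
    # crude stand-in for comfort constraints on the two set points
    return t_chw <= 9.0 and p_duct >= 0.48

# exhaustive search over the two free supervisory set points
candidates = [(t, p)
              for t in np.linspace(4.0, 12.0, 33)     # chilled water supply temp, C
              for p in np.linspace(0.2, 1.5, 27)      # duct static pressure
              if comfort_ok(t, p)]
best = min(candidates, key=lambda sp: total_power(*sp))
```

The minimizer lands on the constraint boundary (warmest allowable chilled water, lowest allowable duct pressure), which is the typical shape of such trade-offs.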
centralized cooling systems that are employed in large commercial buildings, because of the large number of control variables and degrees of freedom, along with utility rate incentives. A simplified schematic of a typical centralized cooling plant is shown in Fig. 2, with components grouped under air distribution, chilled water loop, chiller plant, and condenser water loop.

Typical air distribution systems include VAV (variable-air-volume) boxes within the zones, air-handling units, ducts, and controls. An air-handling unit (AHU) provides the primary conditioning, ventilation, and flow of air and includes cooling and heating coils, dampers, fans, and controls. A single air handler typically serves many zones, and several air handlers are utilized in a large commercial building. For each AHU, outdoor ventilation air is mixed with return air from the zones and fed to the cooling coil. Outdoor and return air dampers are typically controlled using an economizer control that selects between minimum and maximum ventilation air depending upon the condition of the outside air. The cooling coil provides both cooling and dehumidification of the process air. The air outlet temperature from the coil is controlled with a local feedback controller that adjusts the flow of water using a valve. A supply fan and return fan (not shown in Fig. 2) provide the necessary airflow to and from the zones. With a VAV system, zone temperature set points are regulated using a feedback controller applied to dampers within the VAV boxes. The overall airflow provided by the AHU is typically controlled to maintain a duct static pressure set point within the supply duct.

The chilled water loop communicates between the cooling coils within the AHUs and the chillers that provide the primary source of cooling. It consists of pumps, pipes, valves, and controls. Primary/secondary chilled water systems are commonly employed to accommodate variable-speed pumping. In the primary loop, fixed-speed pumps are used to provide relatively constant chiller flow rates, to ensure good performance and reduce the risk of evaporator tube freezing. Individual pumps are typically cycled on and off with the chiller that each serves. The secondary loop incorporates one or more variable-speed pumps that are typically controlled to maintain a set point for chilled water
loop differential pressure between the building supplies and returns.

The primary source of cooling for the system is typically provided by one or more chillers that are arranged in parallel and have dedicated pumps. Each chiller has an on-board local-loop feedback controller that adjusts its cooling capacity to maintain a specified set point for chilled water supply temperature. Additional chiller control variables include the number of chillers operating and the relative loading for each chiller. The relative loading can be controlled, for a given total cooling requirement, by utilizing different chilled water supply set points for constant individual flows, or by adjusting individual flows for identical set points. Chillers can be augmented with thermal storage to reduce the amount of chiller power required during occupied periods, in order to reduce on-peak energy and power demand costs. The thermal storage medium is cooled during the unoccupied, nighttime period using the chillers, when electricity is less expensive. During occupied times, a combination of the chillers and storage is used to meet cooling requirements. Control of thermal storage is defined by the manner in which the storage medium is charged and discharged over time.

The condenser water loop includes cooling towers, pumps, piping, and controls. Cooling towers reject energy to the ambient air through heat transfer and possibly evaporation (for wet towers). Larger systems tend to have multiple cooling towers, with each tower having multiple cells that share a common sump and individual fans having two or more speed settings. The number of operating cells and the tower fan speeds are often controlled using a local-loop feedback controller that maintains a set point for the water temperature leaving the cooling tower. Typically, condenser water pumps are dedicated to individual chillers (i.e., each pump is cycled on and off with the chiller that it serves).

In order to better understand building control variables, interactions, and opportunities, consider how controls change in response to increasing building cooling requirements for the system of Fig. 2. As energy gains to the zones increase, zone temperatures rise in the absence of any control changes. However, zone feedback controllers respond to higher temperatures by increasing VAV box airflow through increased damper openings. This leads to reduced static pressure in the primary supply duct, which causes the AHU supply fan controller to create additional airflow. The greater airflow causes an increase in the supply air temperatures leaving the cooling coils in the absence of any additional control changes. However, the supply air temperature feedback controllers respond by opening the cooling coil valves to increase the water flow and the heat transfer to the chilled water (the cooling load). For variable-speed pumping, a feedback controller would respond to the decreasing pressure differential by increasing the pump speed. The chillers would then experience increased loads, due to higher return water temperature and/or flow rate, that would lead to increases in chilled water supply temperatures. However, the chiller controllers would respond by increasing chiller cooling capacities in order to maintain the chilled water supply set points (and match the cooling coil loads). In turn, the heat rejection to the condenser water loop would increase to balance the increased energy removed by the chiller, which would increase the temperature of water leaving the condenser. The temperature of water leaving the cooling tower would then increase, due to the increase in its entering water temperature. However, a feedback controller would respond to the higher condenser water supply temperature and increase the tower airflow. At some load, the current set of operating chillers would not be sufficient to meet the load (i.e., maintain the chilled water supply set points), and an additional chiller would need to be brought online.

This example illustrated how different local-loop controllers might respond to load changes in order to maintain individual set points. Supervisory control might change these set points and modes of operation. At any given time, it is possible to meet the cooling needs with any number of different modes of operation and set points, leading to the potential for control optimization to minimize an objective function.

The system depicted in Fig. 2 and described in the preceding paragraphs represents one of
ice storage tank. In this case, the states would be constrained between limits associated with the device's practical storage potential. When variations in zone temperature set points are considered within an optimization, then state variables associated with the distributed nature of energy storage within the building structure are important to consider. The outputs are additional quantities of interest, such as equipment cooling capacities, occupant comfort conditions, etc., that

Other heuristic approaches have been developed (e.g., ASHRAE 2011; Braun 2007) for controlling the charging and discharging of ice or chilled water storage that were derived from a daily optimization formulation. For the case of real-time pricing of energy, heuristic charging and discharging strategies were derived from minimizing a daily cost function

    Σ^{N_day} ⋯
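The flavor of such a daily cost minimization, summing energy cost over the hours of the day, can be sketched with invented numbers. The price and load profiles, the storage schedule, and the lossless storage model below are ours, not from the entry (chiller COP and storage losses are ignored):

```python
import numpy as np

# Hypothetical 24-h price and cooling-load profiles (illustrative numbers only)
price = np.array([0.05] * 8 + [0.15] * 12 + [0.05] * 4)   # $/kWh
load = np.array([10.0] * 8 + [40.0] * 12 + [10.0] * 4)    # kWh of cooling per hour

def daily_cost(storage_discharge):
    """J = sum_k price_k * P_k; negative discharge means charging the storage."""
    p = load - storage_discharge
    return float(np.sum(price * np.maximum(p, 0.0)))

j_base = daily_cost(np.zeros(24))                          # no storage
# charge 120 kWh overnight, discharge it evenly over the 12 on-peak hours
sched = np.array([-15.0] * 8 + [10.0] * 12 + [0.0] * 4)
j_storage = daily_cost(sched)
```

Shifting the 120 kWh of cooling from on-peak to off-peak hours lowers the daily cost from 78.0 to 66.0 in this toy example, which is the on-peak demand-shifting effect described in the text.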
of power to customers and cascades that involve events in other infrastructures, especially since loss of electricity can significantly impair other essential or economically important infrastructures. Note that in the context of interacting infrastructures, the term "cascading failure" sometimes has the more restrictive definition of events cascading between infrastructures (Rinaldi et al. 2001).

Blackout Risk

Cascading failure is a sequence of dependent events that successively weaken the power system. At a given stage in the cascade, the previous events have weakened the power system so that further events are more likely. It is this dependence that makes the long series of cascading events that cause large blackouts likely enough to pose a substantial risk. (If the events were independent, then the probability of a large number of events would be the product of the small probabilities of the individual events and would be vanishingly small.) The statistics for the distribution of blackout sizes have correspondingly "heavy tails," indicating that blackouts of all sizes, including large blackouts, can occur. Large blackouts are rare, but they are expected to happen occasionally, and they are not "perfect storms." In particular, it has been observed in several developed countries that the probability distribution of blackout size has an approximate power law dependence (Carreras et al. 2004b; Dobson et al. 2007; Hines et al. 2009). (The power law is of course limited in extent, because every grid has a largest possible blackout in which the entire grid blacks out.) The power law region can be explained using ideas from complex systems theory. The main idea is that over the long term, power grid reliability is shaped by the engineering responses to blackouts and the slow load growth, and tends to evolve towards the power law distribution of blackout size (Dobson et al. 2007; Ren et al. 2008).

Blackout risk can be defined as the product of blackout probability and blackout cost. One simple assumption is that blackout cost is roughly proportional to blackout size, although larger blackouts may well have costs (especially indirect costs) that increase faster than linearly. In the case of the power law dependence, the larger blackouts become rarer at a similar rate as their costs increase, and then the risk of large blackouts is comparable to, or even exceeds, the risk of small blackouts. Mitigation of blackout risk should consider both small and large blackouts, because mitigating the small blackouts that are easiest to analyze may inadvertently increase the risk of large blackouts (Newman et al. 2011).

Approaches to quantifying blackout risk are challenging and emerging, but there are also valuable approaches to mitigating blackout risk that do not quantify it. The n-1 criterion, which requires the power system to survive any single component failure, has the effect of reducing cascading failures. The individual mechanisms of dependence in cascades (overloads, protection failures, voltage collapse, transient stability, lack of situational awareness, human error, etc.) can be addressed individually by specialized analyses or simulations, or by training and procedures. Credible initiating outages can be sampled and simulated, and those resulting in cascading can be mitigated (Hardiman et al. 2004). This can be thought of as a "defense in depth" approach, in which mitigating a subset of credible contingencies is likely to mitigate other possible contingencies not studied. Moreover, when blackouts occur, a postmortem analysis of that particular sequence of events leads to lessons learned that can be implemented to mitigate the risk of some similar blackouts (US-Canada Power System Outage Task Force 2004).

Simulation and Models

There are many simulations of cascading blackouts using Monte Carlo and other methods, for example, Hardiman et al. (2004), Carreras et al. (2004a), Chen et al. (2005), Kirschen et al. (2004), Anghel et al. (2007), and Bienstock and Mattia (2007). All these simulations select and approximate a modest subset of the many physical and engineering mechanisms of
cascading failure, such as line overloads, voltage collapse, and protection failures. In addition, operator actions or the effects of the engineering of the network may also be crudely represented. Some of the simulations give a set of credible cascades, and others approximately estimate blackout risk.

Except for describing the initial outages, where there is a wealth of useful knowledge, much of standard risk analysis and modeling does not easily apply to cascading failure in power systems, because of the variety of dependencies and mechanisms, the combinatorial explosion of rare possibilities, and the heavy-tailed probability distributions. However, progress has been made in probabilistic models of cascading (Chen et al. 2006; Dobson 2012; Rahnamay-Naeini et al. 2012).

The goal of high-level probabilistic models is to capture salient features of the cascade without detailed models of the interactions and dependencies. They provide insight into cascading failure data and simulations, and parameters of the high-level models can serve as metrics of cascading.

Branching process models are transient Markov probabilistic models in which, after some initial outages, the outages are produced in successive generations. Each outage in each generation (a "parent" outage) produces a probabilistic number of outages ("children" outages) in the next generation according to an offspring probability distribution. The children failures then become parents to produce the next generation, and so on, until there is a generation with zero children and the cascade stops. As might be expected, a key parameter describing the cascading is its average propagation, which is the average number of children outages per parent outage. Branching processes have traditionally been applied to many cascading processes outside of risk analysis (Harris), but they have recently been validated and applied to estimate the distribution of the total number of outages from utility outage data (Dobson 2012). A probabilistic model that tracks the cascade as it progresses in time through lumped grid states is presented in Rahnamay-Naeini et al. (2012).

There is an extensive complex networks literature on cascading in abstract networks that is largely motivated by idealized models of the propagation of failures in the Internet. The way that the failures propagate only along the network links is not realistic for power systems, which satisfy Kirchhoff's laws, so that many types of failures propagate differently. For example, line overloads tend to propagate along cutsets of the network. However, the high-level qualitative results of phase transitions in the complex networks have provided inspiration for similar effects to be discovered in power system models (Dobson et al. 2007). There is also a possible research opportunity to elaborate the complex network models to incorporate some of the realities of power systems and then validate them.

Summary and Future Directions

One challenge for simulation is what selection of phenomena to model, and in how much detail, in order to get useful engineering results. Faster simulations would help to ease the requirements of sampling appropriately from all the sources of uncertainty. Better metrics of cascading, in addition to average propagation, need to be developed and extracted from real and simulated data, in order to better quantify and understand blackout risk. There are many new ideas emerging to analyze and simulate cascading failure, and the next step is to validate and improve these new approaches by comparing them with observed blackout data. Overall, there is an exciting challenge to build on the more deterministic approaches to mitigating cascading failure and to find ways to more directly quantify and mitigate cascading blackout risk by coordinated analysis of real data, simulation, and probabilistic models.

Cross-References

Hybrid Dynamical Systems, Feedback Control of
Lyapunov Methods in Power System Stability
Power System Voltage Stability
Small Signal Stability in Electric Power Systems
and outflows of cash are random. There is a finite target for the cash balance, which could be zero in some cases. The firm wants to select a policy that minimizes the expected total discounted cost of being far away from the target during some time horizon. This time horizon is usually infinity. The firm has an incentive to keep the cash level low, because each unit of positive cash leads to a holding cost, since cash has alternative uses like dividends or investments in earning assets. The firm has an incentive to keep the cash level high, because penalty costs are generated as a result of delays in meeting demands for cash. The firm can increase its cash balance by raising new capital or by selling some earning assets, and it can reduce its cash balance by paying dividends or investing in earning assets. This control of the cash balance generates fixed and proportional transaction costs. Thus, there is a cost when the cash balance is different from its target, and there is also a cost for increasing or reducing the cash reserve. The objective of the manager is to minimize the expected total discounted cost.

Hasbrouck (2007), Madhavan and Smidt (1993), and Manaster and Mann (1996) study inventories of stocks that are similar to the cash management problem.

The Solution

The qualitative form of optimal policies for the cash management problem in discrete time was studied by Eppen and Fama (1968, 1969), Girgis (1968), and Neave (1970). However, their solutions were incomplete.

Many of the difficulties that they and other researchers encountered in a discrete-time framework disappeared when it was assumed that decisions were made continuously in time and that demand is generated by a Brownian motion with drift. Vial (1972) formulated the cash management problem in continuous time with fixed and proportional transaction costs, linear holding and penalty costs, and demand for cash generated by a Brownian motion with drift. Under very strong assumptions, Vial (1972) proved that if an optimal policy exists, then it is of the simple form (a, α, β, b). This means that the cash balance should be increased to level α when it reaches level a, and should be reduced to level β when it reaches level b. Constantinides (1976) assumed that an optimal policy exists and is of this simple form, determined the above levels, and discussed the properties of the optimal solution. Constantinides and Richard (1978) proved the main assumptions of Vial (1972) and therefore obtained rigorously a solution for the cash management problem.

Constantinides and Richard (1978) applied the theory of stochastic impulse control developed by Bensoussan and Lions (1973, 1975, 1982). They used a Brownian motion W to model the uncertainty in the inventory. Formally, they considered a probability space (Ω, F, P) together with a filtration (F_t) generated by a one-dimensional Brownian motion W. They considered X_t := the inventory level at time t, and assumed that X is an adapted stochastic process given by

    X_t = x − ∫₀ᵗ μ ds − ∫₀ᵗ σ dW_s + Σ_{i=1}^∞ I_{{τ_i < t}} ξ_i.

Here, μ > 0 is the drift of the demand and σ > 0 is the volatility of the demand. Furthermore, τ_i is the time of the i-th intervention and ξ_i is the intensity of the i-th intervention.

A stochastic impulse control is a pair

    ((τ_n); (ξ_n)) = (τ_0, τ_1, τ_2, …, τ_n, …; ξ_0, ξ_1, ξ_2, …, ξ_n, …),

where

    τ_0 = 0 < τ_1 < τ_2 < ⋯ < τ_n < ⋯

is an increasing sequence of stopping times and (ξ_n) is a sequence of random variables such that each ξ_n : Ω → R is measurable with respect to F_{τ_n}.
We note that the ξ_i and X can also take negative values. The management wants to select the pair

    ((τ_n); (ξ_n))

that minimizes the functional J defined by

    J(x; ((τ_n); (ξ_n))) := E[ ∫₀^∞ e^{−λt} f(X_t) dt + Σ_{n=1}^∞ e^{−λτ_n} g(ξ_n) I_{{τ_n < ∞}} ],

where

    f(x) = max(hx, −px)

and

    g(ξ) = C + cξ      if ξ > 0,
           min(C, D)   if ξ = 0,
           D − dξ      if ξ < 0.

Furthermore, λ > 0, C, c, D, d ∈ (0, ∞), and h, p ∈ (0, ∞). Here, f represents the running cost incurred by deviating from the aimed cash level 0, C represents the fixed cost per intervention when the management pushes the cash level upwards, D represents the fixed cost per intervention when the management pushes the cash level downwards, c represents the proportional cost per intervention when the management pushes the cash level upwards, d represents the proportional cost per intervention when the management pushes the cash level downwards, and λ is the discount rate.

The results of Constantinides were complemented, extended, or improved by Cadenillas et al. (2010), Cadenillas and Zapatero (1999), Feng and Muthuraman (2010), Harrison et al. (1983), and Ormeci et al. (2008).

Bibliography

Bensoussan A, Lions JL (1973) Nouvelle formulation de problemes de controle impulsionnel et applications. C R Acad Sci (Paris) Ser A 276:1189–1192
Bensoussan A, Lions JL (1975) Nouvelles methodes en controle impulsionnel. Appl Math Opt 1:289–312
Bensoussan A, Lions JL (1982) Controle impulsionnel et inequations quasi variationelles. Bordas, Paris
Cadenillas A, Zapatero F (1999) Optimal Central Bank intervention in the foreign exchange market. J Econ Theory 87:218–242
Cadenillas A, Lakner P, Pinedo M (2010) Optimal control of a mean-reverting inventory. Oper Res 58:1697–1710
Constantinides GM (1976) Stochastic cash management with fixed and proportional transaction costs. Manage Sci 22:1320–1331
Constantinides GM, Richard SF (1978) Existence of optimal simple policies for discounted-cost inventory and cash management in continuous time. Oper Res 26:620–636
Eppen GD, Fama EF (1968) Solutions for cash balance and simple dynamic portfolio problems. J Bus 41:94–112
Eppen GD, Fama EF (1969) Cash balance and simple dynamic portfolio problems with proportional costs. Int Econ Rev 10:119–133
Feng H, Muthuraman K (2010) A computational method for stochastic impulse control problems. Math Oper Res 35:830–850
Girgis NM (1968) Optimal cash balance level. Manage Sci 15:130–140
Harrison JM, Sellke TM, Taylor AJ (1983) Impulse control of Brownian motion. Math Oper Res 8:454–466
Hasbrouck J (2007) Empirical market microstructure. Oxford University Press, New York
Madhavan A, Smidt S (1993) An analysis of changes in specialist inventories and quotations. J Finance 48:1595–1628
Manaster S, Mann SC (1996) Life in the pits: competitive market making and inventory control. Rev Financ Stud 9:953–975
Neave EH (1970) The stochastic cash-balance problem with fixed costs for increases and decreases. Manage Sci 16:472–490
Ormeci M, Dai JG, Vande Vate J (2008) Impulse control of Brownian motion: the constrained average cost case. Oper Res 56:618–629
Vial JP (1972) A continuous time model for the cash balance problem. In: Szego GP, Shell C (eds) Mathematical methods in investment and finance. North Holland, Amsterdam
Classical Frequency-Domain Design Methods

Abstract

The design of feedback control systems in industry is probably accomplished using frequency-response (FR) methods more often than any other. Frequency-response design is popular primarily because it provides good designs in the face of uncertainty in the plant model (G(s) in Fig. 1). For example, for systems with poorly known or changing high-frequency resonances, we can temper the feedback design to alleviate the effects of those uncertainties. Currently, this tempering is carried out more easily using FR design than with any other method. The method is most effective for systems that are stable in open loop; however, it can also be applied to systems with instabilities. This section will introduce the reader to methods of design (i.e., finding D(s) in Fig. 1) using lead and lag compensation. It will also cover the use of FR design to reduce steady-state errors and to improve robustness to uncertainties in high-frequency dynamics.

Keywords

Amplitude stabilization; Bandwidth; Bode plot; Crossover frequency; Frequency response; Gain; Gain stabilization; Gain margin; Notch filter; Phase; Phase margin; Stability

Introduction

Finding an appropriate compensation (D(s) in Fig. 1) using frequency response is probably the easiest of all feedback control design methods. Designs are achievable starting with the FR plots of both magnitude and phase of G(s) and then selecting D(s) to achieve certain values of the

Design Specifications

As discussed in Section X, the gain margin (GM) is the factor by which the gain can be raised before instability results. The phase margin (PM) is the amount by which the phase of D(jω)G(jω) exceeds −180° when |D(jω)G(jω)| = 1, the crossover frequency. Performance requirements for control systems are often partially specified in terms of the PM and/or GM. For example, a typical specification might include the requirement that PM > 50° and GM > 5. It can be shown that the PM tends to correlate well with the damping ratio, ζ, of the closed-loop roots. In fact, it is shown in Franklin et al. (2010) that

    ζ ≅ PM/100

for many systems; however, the actual resulting damping and/or response overshoot of the final closed-loop system will need to be verified if they are specified as well as the PM. A PM of 50° would tend to yield a ζ of 0.5 for the closed-loop roots, which is a modestly damped system. The GM does not generally correlate directly with the damping ratio, but it is a measure of the degree of stability and a useful secondary specification to ensure stability.

Another design specification is the bandwidth, which was defined in Section X. The bandwidth is a direct measure of the frequency at which the closed-loop system starts to fail in following the input command. It is also a measure of the speed of response of a closed-loop system. Generally speaking, it correlates well with the step-response rise time of the system.

In some cases, the steady-state error must be less than a certain amount. As discussed in Franklin et al. (2010), the steady-state error is a direct function of the low-frequency gain of
Controller Plant
+ U
R Σ Y
D(s) G(s)
−
Classical Frequency-Domain Design Methods, Fig. 1 Feedback system showing compensation, D.s/ (Source:
Franklin et al. (2010, p-249), Reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
the FR magnitude plot. However, increasing the typically accomplished by lead compensation.
low-frequency gain typically will raise the en- A phase increase (or lead) is accomplished by
tire magnitude plot upward, thus increasing the placing a zero in D.s/. However, that alone
magnitude 1 crossover frequency and, therefore, would cause an undesirable high-frequency gain
increasing the speed of response and bandwidth which would amplify noise; therefore, a first-
of the system. order pole is added in the denominator at frequen-
cies substantially higher than the zero break point
of the compensator. Thus, the phase lead still
Compensation Design occurs, but the amplification at high frequencies
is limited. The resulting lead compensation has a
In some cases, the design of a feedback com- transfer function of
pensation can be accomplished by using pro-
portional control only, i.e., setting D.s/ D K Ts C1
D.s/ D K ; ˛ < 1; (1)
(see Fig. 1) and selecting a suitable value for K. ˛T s C 1
This can be accomplished by plotting the mag-
nitude and phase of G.s/, looking at jG. j!/j where 1=˛ is the ratio between the pole/zero
at the frequency where †G. j!/ D 180ı, and break-point frequencies. Figure 2 shows the fre-
then selecting K so that jKG. j!/j yields the quency response of this lead compensation. The
desired GM. Similarly, if a particular value of maximum amount of phase lead supplied is de-
PM is desired, one can find the frequency where pendent on the ratio of the pole to zero and is
†G. j!/ D 180ı C the desired PM. The value shown in Fig. 3 as a function of that ratio.
of jKG. j!/j at that frequency must equal 1; For example, a lead compensator with a
therefore, the value of jG. j!/j must equal 1=K. zero at s D 2 .T D 0:5/ and a pole at
Note that the jKG. j!/j curve moves vertically s D 10 .˛T D 0:1/ (and thus ˛ D 15 ) would
based on the value of K; however the †KG. j!/ yield the maximum phase lead of max D 40ı .
curve is not affected by the value of K. This Note from the figure that we could increase the
characteristic simplifies the design process. phase lead almost up to 90ı using higher values
In more typical cases, proportional feedback of the lead ratio, 1=˛; however, Fig. 2 shows that
alone is not sufficient. There is a need for a increasing values of 1=˛ also produces higher
certain damping (i.e., PM) and/or speed of re- amplifications at higher frequencies. Thus, our
sponse (i.e., bandwidth) and there is no value of task is to select a value of 1=˛ that is a good
K that will meet the specifications. Therefore, compromise between an acceptable PM and
some increased damping from the compensation acceptable noise sensitivity at high frequencies.
is required. Likewise, a certain steady-state error Usually the compromise suggests that a lead
requirement and its resulting low-frequency gain compensation should contribute a maximum
will cause the jD. j!/G. j!/j to be greater than of 70ı to the phase. If a greater phase lead is
desired for an acceptable PM, so more phase needed, then a double-lead compensation would
lead is required from the compensation. This is be suggested, where
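The lead compensation of Eq. (1) is easy to explore numerically. The sketch below is added here for illustration (it is not part of the original entry); it evaluates D(jω) directly and uses the standard closed-form peak-lead relation sin φmax = (1 − α)/(1 + α), attained at ω = 1/(T√α), which underlies the curve of Fig. 3:

```python
import cmath
import math

def lead_D(K, T, alpha, w):
    """Frequency response of Eq. (1): D(jw) = K*(jw*T + 1)/(jw*alpha*T + 1)."""
    jw = 1j * w
    return K * (jw * T + 1) / (jw * alpha * T + 1)

def max_phase_lead_deg(alpha):
    """Peak phase lead in degrees: sin(phi_max) = (1 - alpha)/(1 + alpha)."""
    return math.degrees(math.asin((1 - alpha) / (1 + alpha)))

# Example from the text: zero at s = -2 (T = 0.5), pole at s = -10 (alpha = 1/5).
T, alpha = 0.5, 0.2
w_peak = 1 / (T * math.sqrt(alpha))            # frequency of maximum phase lead
phi_peak = math.degrees(cmath.phase(lead_D(1, T, alpha, w_peak)))
hf_gain = abs(lead_D(1, T, alpha, 1e9))        # high-frequency gain tends to K/alpha

print(f"peak lead {phi_peak:.1f} deg at {w_peak:.2f} rad/s, HF gain {hf_gain:.2f}")
```

For 1/α = 5 this gives a peak lead of about 41.8°, consistent with the roughly 40° read off Fig. 3, while the high-frequency gain is amplified by the factor 1/α = 5 — the noise-sensitivity trade-off discussed above.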
Classical Frequency-Domain Design Methods, Fig. 2 Lead-compensation frequency response with 1/α = 10, K = 1 (Source: Franklin et al. (2010, p. 349), Reprinted by permission of Pearson Education, Inc.)

The double-lead compensation has the form

    D(s) = K [(Ts + 1)/(αTs + 1)]².

Even if a system had negligible amounts of noise present, the pole must exist at some point because of the impossibility of building a pure differentiator. No physical system – mechanical or electrical or digital – responds with infinite amplitude at infinite frequencies, so there will be a limit in the frequency range (or bandwidth) for which derivative information (or phase lead) can be provided.

As an example of designing a lead compensator, let us design compensation for a DC motor with the transfer function

    G(s) = 1/(s(s + 1)).

We wish to obtain a steady-state error of less than 0.1 for a unit-ramp input, and we desire a system bandwidth greater than 3 rad/sec. Furthermore, we desire a PM of 45°. To accomplish the error requirement, Franklin et al. shows that

    ess = lim_{s→0} s [1/(1 + D(s)G(s))] R(s),   (2)

and if R(s) = 1/s² for a unit ramp, Eq. (2) reduces to

    ess = lim_{s→0} 1/(s + D(s)[1/(s + 1)]) = 1/D(0).

Therefore, we find that D(0), the steady-state gain of the compensation, cannot be less than 10 if it is to meet the error criterion, so we pick K = 10. The frequency response of KG(s) in Fig. 4 shows that the PM = 20° if no phase lead is added by compensation. If it were possible to simply add phase without affecting the magnitude, we would need an additional phase of only 25° at the KG(s) crossover frequency of ω = 3 rad/sec. However, maintaining the same low-frequency gain and adding a compensator zero will increase the crossover frequency; hence, more than a 25° phase contribution will be required from the lead compensation. To be safe, we will design the lead compensator so that it supplies a maximum phase lead of 40°. Figure 3 shows that 1/α = 5 will accomplish that goal. We will derive the greatest benefit from the compensation if the maximum phase lead from the compensator occurs at the crossover frequency. With some trial and error, we determine

    D(s) = 10 (s/2 + 1)/(s/10 + 1).

Classical Frequency-Domain Design Methods, Fig. 3 Maximum phase increase for lead compensation (Source: Franklin et al. (2010, p. 350), Reprinted by permission of Pearson Education, Inc.)

Classical Frequency-Domain Design Methods, Fig. 4 Frequency response for lead-compensation design (Source: Franklin et al. (2010, p. 352), Reprinted by permission of Pearson Education, Inc.)
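The worked design can be verified numerically. A minimal sketch (an added illustration, not from the original entry) evaluates the loop transfer function L(jω) = D(jω)G(jω), locates the gain crossover by bisection, and reads off the phase margin and velocity constant:

```python
import cmath
import math

def loop(w):
    """L(jw) = D(jw)G(jw) with D(s) = 10(s/2 + 1)/(s/10 + 1), G(s) = 1/(s(s + 1))."""
    s = 1j * w
    D = 10 * (s / 2 + 1) / (s / 10 + 1)
    G = 1 / (s * (s + 1))
    return D * G

# Gain-crossover frequency |L(jw_c)| = 1, found by bisection on a log grid
# (|L| falls monotonically here, from roughly 10/w at low frequency).
lo, hi = 0.1, 100.0
for _ in range(80):
    mid = math.sqrt(lo * hi)
    if abs(loop(mid)) > 1:
        lo = mid
    else:
        hi = mid
w_c = lo
pm = 180 + math.degrees(cmath.phase(loop(w_c)))   # phase margin, degrees

w0 = 1e-9
kv = abs(1j * w0 * loop(w0))   # velocity constant Kv = lim_{s->0} s D(s)G(s) = D(0)

print(f"crossover {w_c:.2f} rad/s, PM {pm:.1f} deg, Kv {kv:.1f}")
```

The crossover lands near 4.8 rad/s, above the 3 rad/s bandwidth target; the PM comes out near 53°, above the 45° requirement; and Kv = D(0) = 10 gives the required ramp error ess = 1/Kv = 0.1.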
and sensor noise are both reduced by sufficiently low gain at high frequency.

Summary and Future Directions

Before the common use of computers in design, frequency-response design was the only widely used method. While it is still the most widely used method for routine designs, complex systems and their design are being carried out using a multitude of methods. This section introduces just one of many possible methods.

Cross-References

Frequency-Response and Frequency-Domain Models
Polynomial/Algebraic Design Methods
Quantitative Feedback Theory
Spectral Factorization

Bibliography

Franklin GF, Powell JD, Workman M (1998) Digital control of dynamic systems, 3rd edn. Ellis-Kagle Press, Half Moon Bay
Franklin GF, Powell JD, Emami-Naeini A (2010) Feedback control of dynamic systems, 6th edn. Pearson, Upper Saddle River
Franklin GF, Powell JD, Emami-Naeini A (2015) Feedback control of dynamic systems, 7th edn. Pearson, Upper Saddle River

Computational Complexity Issues in Robust Control

Onur Toker
Fatih University, Istanbul, Turkey

Abstract

Robust control theory has introduced several new and challenging problems for researchers. Some of these problems have been solved by innovative approaches and led to the development of new and efficient algorithms. However, some of the other problems in robust control theory attracted a significant amount of research, but none of the proposed algorithms were efficient, namely, had execution time bounded by a polynomial of the "problem size." Several important problems in robust control theory are either of decision type or of computation/approximation type, and one would like to have an algorithm which can be used to answer all or most of the possible cases and can be executed on a classical computer in a reasonable amount of time. There is a branch of theoretical computer science, called theory of computation, which can be used to study the difficulty of problems in robust control theory. In the following, classical computer system, algorithm, efficient algorithm, unsolvability, tractability, NP-hardness, and NP-completeness will be introduced in a more rigorous fashion, with applications to problems from robust control theory.

Keywords

Approximation complexity; Computational complexity; NP-complete; NP-hard; Unsolvability

Introduction

The term algorithm is used to refer to different notions which are all somewhat consistent with our intuitive understanding. This ambiguity may sometimes generate significant confusion, and therefore, a rigorous definition is of extreme importance. One commonly accepted "intuitive" definition is a set of rules that a person can perform with paper and pencil. However, there are "algorithms" which involve random number generation, for example, finding a primitive root in Zp (Knuth 1997). Based on this observation, one may ask whether a random number generation-based set of rules should also be considered as an algorithm, provided that it will terminate after finitely many steps for all instances of the
problem or for a significant majority of the cases. In a similar fashion, one may ask whether any real number, including irrational ones which cannot be represented on a digital computer without an approximation error, should be allowed as an input to an algorithm and, furthermore, whether all calculations should be limited to algebraic functions only or whether exact calculation of non-algebraic functions, e.g., trigonometric functions, the gamma function, etc., should be acceptable in an algorithm. Although all of these seem acceptable with respect to our intuitive understanding of the algorithm, from a rigorous point of view, they are different notions. In the context of robust control theory, as well as many other engineering disciplines, there is a separate and widely accepted definition of algorithm, which is based on today's digital computers, more precisely the Turing machine (Turing 1936). Alan M. Turing defined a "hypothetical computation machine" to formally define the notions of algorithm and computability. A Turing machine is, in principle, quite similar to today's digital computers widely used in many engineering applications. The engineering community seems to widely accept the use of current digital computers and Turing's definitions of algorithm and computability.

However, depending on new scientific, engineering, and technological developments, superior computation machines may be constructed. For example, there is no guarantee that quantum computing research will not lead to superior computation machines (Chen et al. 2006; Kaye et al. 2007). In the future, the engineering community may feel the need to revise formal definitions of algorithm, computability, tractability, etc., if such superior computation machines can be constructed and used for scientific/engineering applications.

Turing Machines and Unsolvability

The Turing machine is basically a mathematical model of a simplified computation device. The original definition involves a tape-like device for memory. For an easy-to-read introduction to the Turing machine model, see Garey and Johnson (1979) and Papadimitriou (1995), and for more details, see Hopcroft et al. (2001), Lewis and Papadimitriou (1998), and Sipser (2006). Despite this being a quite simple and low-performance "hardware" compared to today's engineering standards, the following two observations justify their use in the study of computational complexity of engineering problems. Anything which can be solved on today's current digital computers can be solved on a Turing machine. Furthermore, a polynomial-time algorithm on today's digital computers will correspond to again a polynomial-time algorithm on the original Turing machine, and vice versa.

A widely accepted definition for an algorithm is a Turing machine with a program, which is guaranteed to terminate after finitely many steps. For some mathematical and engineering problems, it can be shown that there can be no algorithm which can handle all possible cases. Such problems are called unsolvable. The condition "all cases" may be considered too tough, and one may argue that such negative results have only theoretical importance and no practical implications. But such results do imply that we should concentrate our efforts on alternative research directions, like the development of algorithms only for cases which appear more frequently in real scientific/engineering applications, without asking the algorithm to work for the remaining cases as well.

Here is a famous unsolvable mathematical problem: Hilbert's tenth problem is basically the development of an algorithm for testing whether a Diophantine equation has an integer solution. However, in 1970, Matijasevich showed that there can be no such algorithm (Matiyasevich 1993). Therefore, we say that the problem of checking whether a Diophantine equation has an integer solution is unsolvable.

Several unsolvability results for dynamical systems can be proved by using the Post correspondence problem (Davis 1985) and the embedding of free semigroups into matrices. For example, the problem of checking the stability of saturated linear dynamical systems is proved to be undecidable (Blondel et al. 2001), meaning that no general stability test algorithm can be developed for such systems. A similar unsolvability result is reported in Blondel and
Tsitsiklis (2000a) for boundedness of switching systems of the type

    x(k + 1) = A_f(k) x(k),

where f is assumed to be an arbitrary and unknown function from N into {0, 1}. A closely related asymptotic stability problem is equivalent to testing whether the joint spectral radius (JSR) (Rota and Strang 1960) of a set of matrices is less than one. For a quite long period of time, there was a conjecture called the finiteness conjecture (FC) (Lagarias and Wang 1995), which was generally believed or hoped to be true, at least by a group of researchers. FC may be interpreted as "For asymptotic stability of x(k + 1) = A_f(k) x(k) type switching systems, it is enough to consider periodic switchings only." There was no known counterexample, and the truth of this conjecture would imply the existence of an algorithm for the abovementioned JSR problem. However, it was shown in Bousch and Mairesse (2002) that FC is not true (see Blondel et al. (2003) for a simplified proof). There are numerous known computationally very valuable procedures related to JSR approximation; for example, see Blondel and Nesterov (2005) and references therein. However, the development of an algorithm to test whether the JSR is less than one remains an open problem.

For further results on unsolvability and unsolved problems in robust control, see Blondel et al. (1999), Blondel and Megretski (2004), and references therein.

Tractability, NP-Hardness, and NP-Completeness

The engineering community is interested in not only solution algorithms but algorithms which are fast even in the worst case and, if not, on the average. Sometimes, this speed requirement may be relaxed to being fast for most of the cases and sometimes to only a significant percentage of the cases. Currently, the theory of computation is developed around the idea of algorithms which are polynomial time even in the worst case, namely, execution time bounded by a polynomial of the problem size (Garey and Johnson 1979; Papadimitriou 1995). Such algorithms are also called efficient, and associated problems are classified as tractable. The term problem size means the number of bits used in a suitable encoding of the problem (Garey and Johnson 1979; Papadimitriou 1995).

One may argue that this worst-case approach of being always polynomial time is a quite conservative requirement. In reality, a practicing engineer may consider being polynomial time on the average quite satisfactory for many applications. The same may be true for algorithms which are polynomial time for most of the cases. However, the existing computational complexity theory is developed around this idea of being polynomial time even in the worst case. Therefore, many of the computational complexity results proved in the literature do not imply the impossibility of algorithms which are neither polynomial time on the average nor polynomial time for most of the cases. Note that despite not being efficient, such algorithms may be considered quite valuable by a practicing engineer. Tractability and efficiency can be defined in several different ways, but the abovementioned polynomial-time solvability even in the worst-case approach is widely adopted by the engineering community.

NP-hardness and NP-completeness are originally defined to express the inherent difficulty of decision-type problems, not approximation-type problems. Although approximation complexity is an important and active research area in the theory of computation (Papadimitriou 1995), most of the classical results are on decision-type problems. Many robust control-related problems can be formulated as "Check whether μ < 1," which is a decision-type problem. An approximate value of μ may not always be good enough to "solve" the problem, i.e., to decide about robust stability. For certain other engineering applications for which approximate values of optimization problems are good enough to "solve" the problem, the complexity of a decision problem may not be very relevant.
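Although deciding whether the joint spectral radius discussed above is less than one remains without a known algorithm, lower bounds are easy to generate by brute force: every periodic switching sequence of length k yields ρ(A_{i1} ⋯ A_{ik})^(1/k) ≤ JSR. The sketch below is an added illustration (not from the entry); the matrix pair used is a standard example whose JSR is reported to equal the golden ratio ≈ 1.618:

```python
import itertools
import math

def spectral_radius(M):
    """Largest eigenvalue magnitude of a 2x2 matrix [[a, b], [c, d]]."""
    (a, b), (c, d) = M
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4 * det
    if disc >= 0:
        r = math.sqrt(disc)
        return max(abs((tr + r) / 2), abs((tr - r) / 2))
    return math.sqrt(det)          # complex conjugate pair: |eig| = sqrt(det)

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def jsr_lower_bound(mats, max_len):
    """Max over all products of length k <= max_len of rho(product)**(1/k)."""
    best = 0.0
    for k in range(1, max_len + 1):
        for word in itertools.product(range(len(mats)), repeat=k):
            P = [[1, 0], [0, 1]]
            for i in word:
                P = matmul(P, mats[i])
            best = max(best, spectral_radius(P) ** (1.0 / k))
    return best

A0 = [[1, 1], [0, 1]]
A1 = [[1, 0], [1, 1]]
print(jsr_lower_bound([A0, A1], 6))   # approaches the golden ratio, ~1.618
```

Here the length-2 product already attains the bound 1.618…; the failure of the finiteness conjecture means that, in general, no finite product length is guaranteed to attain the JSR.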
For example, in a minimum effort control problem, usually there may be no point in computing the exact value of the minimum, because good approximations will be just fine for most cases. However, for a robust control problem, a result like μ = 0.99 ± 0.02 may not be enough to decide about robust stability, although the approximation error is only about 2%. Basically, both the conservativeness of the current tractability definition and the differences between decision- and approximation-type problems should always be kept in mind when interpreting computational complexity results reported here as well as in the literature.

In this subsection, and in the next one, we will consider decision problems only. The class P corresponds to decision problems which can be solved by a Turing machine with a suitable program in polynomial time (Garey and Johnson 1979). This is interpreted as decision problems which have polynomial-time solution algorithms. The definition of the class NP is more technical and involves nondeterministic Turing machines (Garey and Johnson 1979). It may be interpreted as the class of decision problems for which the truth of the problem can be verified in polynomial time. It is currently unknown whether P and NP are equal or not. This is a major open problem, and its importance in the theory of computation is comparable to the importance of the Riemann hypothesis in number theory.

A problem is NP-complete if it is in NP and every NP problem polynomially reduces to it (Garey and Johnson 1979). For an NP-complete problem, being in P is equivalent to P = NP. There are literally hundreds of such problems, and it is generally argued that since after several years of research nobody was able to develop a polynomial-time algorithm for these NP-complete problems, there is probably no such algorithm, and most likely P ≠ NP. Although current evidence leans toward P ≠ NP, this does not constitute a formal proof, and the history of mathematics and science is full of surprising discoveries.

A problem (not necessarily in NP) is called NP-hard if and only if there is an NP-complete problem which is polynomial-time reducible to it (Garey and Johnson 1979). Being NP-hard is sometimes called being intractable and means that unless P = NP, which is considered very unlikely by a group of researchers, no polynomial-time solution algorithm can be developed. All NP-complete problems are also NP-hard, but they are only as "hard" as any other problem in the set of NP-complete problems.

The first known NP-complete problem is SATISFIABILITY (Cook 1971). In this problem, there is a single Boolean equation with several variables, and we would like to test whether there is an assignment to these variables which makes the Boolean expression true. This important discovery enabled proofs of NP-completeness or NP-hardness of several other problems by using simple polynomial reduction techniques only (Garey and Johnson 1979). Among these, quadratic programming is an important one and led to the discovery of several other complexity results in robust control theory. Quadratic programming (QP) can be defined as

    q = min_{Ax ≤ b} x^T Q x + c^T x,

more precisely, testing whether q < 1 or not (decision version). When the matrix Q is positive definite, convex optimization techniques can be used; however, the general version of the problem is NP-hard.

A related problem is linear programming (LP),

    q = min_{Ax ≤ b} c^T x,

which is used in certain robust control problems (Dahleh and Diaz-Bobillo 1994) and has a quite interesting status. The simplex method (Dantzig 1963) is a very popular computational technique for LP and is known to have polynomial-time complexity on the "average" (Smale 1983). However, there are examples where the simplex method requires an exponentially growing number of steps with the problem size (Klee and Minty 1972). In 1979, Khachiyan proposed the ellipsoid algorithm for LP, which was the first known polynomial-time approximation algorithm (Schrijver 1998). Because of the nature of the problem, one can answer the decision version of LP in polynomial time by using the ellipsoid algorithm for approximation and stopping when the error is below a certain level.
But all of these results are for standard Turing machines with input parameters restricted to rational numbers. An interesting open problem is whether LP admits a polynomial algorithm in the real number model of computation.

Complexity of Certain Robust Control Problems

There are several computational complexity results for robust control problems (see Blondel and Tsitsiklis (2000b) for a more detailed survey). Here we summarize some of the key results on interval matrices and structured singular values.

The Kharitonov theorem is about robust Hurwitz stability of polynomials with coefficients restricted to intervals (Kharitonov 1978). The problem is known to have a surprisingly simple solution; however, the matrix version of the problem has a quite different nature. If we have a matrix family

    A = {A ∈ R^(n×n) : α_ij ≤ A_ij ≤ β_ij, i, j = 1, …, n},   (1)

where α_ij, β_ij are given constants for i, j = 1, …, n, then it is called an interval matrix. Such matrices do occur in descriptions of uncertain dynamical systems. The following two stability problems about interval matrices are known to be NP-hard:

Interval Matrix Problem 1 (IMP1): Decide whether a given interval matrix, A, is robust Hurwitz stable or not. Namely, check whether all members of A are Hurwitz-stable matrices, i.e., all eigenvalues are in the open left half plane.

Interval Matrix Problem 2 (IMP2): Decide whether a given interval matrix, A, has a Hurwitz-stable matrix or not. Namely, check whether there exists at least one matrix in A which is Hurwitz stable.

For a proof of NP-hardness of IMP1, see Poljak and Rohn (1993) and Nemirovskii (1993), and for a proof of IMP2, see Blondel and Tsitsiklis (1997).

Another important problem is related to structured singular values (SSV) and linear fractional transformations (LFT), which are mainly used to study systems which have component-level uncertainties (Packard and Doyle 1993). Basically, bounds on the component-level uncertainties are given, and we would like to check whether the system is robustly stable or not. This is known to be NP-hard.

Structured Singular Value Problem (SSVP): Given a matrix M and an uncertainty structure Δ, check whether the structured singular value μ_Δ(M) < 1.

This is proved to be NP-hard for real and mixed uncertainty structures (Braatz et al. 1994), as well as for complex uncertainties with no repetitions (Toker and Ozbay 1996, 1998).

Approximation Complexity

The decision version of QP is NP-hard, but approximation of QP is quite "difficult" as well. An approximation is called an ε-approximation if the absolute value of the error is bounded by ε times the absolute value of the max–min of the function. The following is a classical result on QP (Bellare and Rogaway 1995): Unless P = NP, QP does not admit a polynomial-time ε-approximation algorithm even for ε < 1/3. This is sometimes informally stated as "QP is NP-hard to approximate." Much work is needed toward similar results on robustness margin and related optimization problems of the classical robust control theory.

An interesting case is the complex structured singular value computation with no repeated uncertainties. There is a convex relaxation of the problem, the standard upper bound, which is known to result in quite tight approximations for most cases of the original problem (Packard and Doyle 1993). However, despite strong numerical evidence, a formal proof of a "good approximation for most cases" result is not available. We also do not have much theoretical information about how hard it is to approximate the complex structured singular value. For example, it is not known whether it admits a polynomial-time approximation algorithm with error bounded by, say, 5%. In summary, much work needs to be done in these
directions for many other robust control problems whose decision versions are NP-hard.

Summary and Future Directions

The study of the "Is P ≠ NP?" question turned out to be a quite difficult one. Researchers agree that really new and innovative tools are needed to study this problem. At the other extreme, one can question whether we can really say something about this problem within Zermelo-Fraenkel (ZF) set theory, or whether it will have a status similar to the axiom of choice (AC) and the continuum hypothesis (CH), where we can neither refute nor provide a proof (Aaronson 1995). Therefore, the question may be indeed much deeper than we thought, and the standard axioms of today's mathematics may not be enough to provide an answer. As for any such problem, we can still hope that in the future, new "self-evident" axioms may be discovered, and with their help, we may provide an answer.

All of the complexity results mentioned here are with respect to the standard Turing machine, which is a simplified model of today's digital computers. Depending on the progress in science, engineering, and technology, if superior computation machines can be constructed, then some of the abovementioned problems can be solved much faster on these devices, and current results/problems of the theory of computation may no longer be of great importance or relevance for engineering and scientific applications. In such a case, one may also need to revise the definitions of the terms algorithm, tractable, etc., according to these new devices.

Currently, there are several NP-hardness results about robust control problems, mostly NP-hardness of decision problems about robustness. However, much work is needed on the approximation complexity and conservatism of various convex relaxations of these problems. Even if a robust stability problem is NP-hard, a polynomial-time algorithm to estimate the robustness margin with, say, 5% error is not ruled out by the NP-hardness of the decision version of the problem. Indeed, a polynomial-time and 5% error-bounded result will be of great importance for practicing engineers. Therefore, such directions should also be studied, and various meaningful alternatives, like being polynomial time on the average or for most cases, or anything which makes sense for a practicing engineer, should be considered as an alternative direction.

In summary, computational complexity theory guides research on the development of algorithms, indicating which directions are dead ends and which directions are worth investigating.

Cross-References

Optimization Based Robust Control
Robust Control in Gap Metric
Robust Fault Diagnosis and Control
Robustness Issues in Quantum Control
Structured Singular Value and Applications: Analyzing the Effect of Linear Time-Invariant Uncertainty in Linear Systems

Bibliography

Aaronson S (1995) Is P versus NP formally independent? Technical report 81, EATCS
Bellare M, Rogaway P (1995) The complexity of approximating a nonlinear program. Math Program 69:429–441
Blondel VD, Megretski A (2004) Unsolved problems in mathematical systems and control theory. Princeton University Press, Princeton
Blondel VD, Nesterov Y (2005) Computationally efficient approximations of the joint spectral radius. SIAM J Matrix Anal 27:256–272
Blondel VD, Tsitsiklis JN (1997) NP-hardness of some linear control design problems. SIAM J Control Optim 35:2118–2127
Blondel VD, Tsitsiklis JN (2000a) The boundedness of all products of a pair of matrices is undecidable. Syst Control Lett 41:135–140
Blondel VD, Tsitsiklis JN (2000b) A survey of computational complexity results in systems and control. Automatica 36:1249–1274
Blondel VD, Sontag ED, Vidyasagar M, Willems JC (1999) Open problems in mathematical systems and control theory. Springer, London
Blondel VD, Bournez O, Koiran P, Tsitsiklis JN (2001) The stability of saturated linear dynamical systems is undecidable. J Comput Syst Sci 62:442–462
Computer-Aided Control Systems Design: Introduction and Historical Overview

Andreas Varga
Institute of System Dynamics and Control, German Aerospace Center (DLR), Oberpfaffenhofen, Wessling, Germany

Synonyms

CACSD

Abstract

Computer-aided control system design (CACSD) encompasses a broad range of methods, tools, and technologies for system modelling, control system synthesis and tuning, dynamic system analysis and simulation, as well as validation and verification. The domain of CACSD has enlarged progressively over the decades from simple collections of algorithms and programs for control system analysis and synthesis to comprehensive tool sets and user-friendly environments supporting all aspects of developing and deploying advanced control systems in various application fields. This entry gives a brief introduction to CACSD and reviews
the evolution of key concepts and technologies underlying the CACSD domain. Several cornerstone achievements in developing reliable numerical algorithms; implementing robust numerical software; and developing sophisticated integrated modelling, simulation, and design environments are highlighted.

Keywords

CACSD; Modelling; Numerical analysis; Simulation; Software tools

Introduction

To design a control system for a plant, a typical computer-aided control system design (CACSD) work flow comprises several interlaced activities. Model building is often a necessary first step, consisting of developing suitable mathematical models that accurately describe the plant's dynamical behavior. High-fidelity physical plant models, obtained, for example, by using first-principles modelling, primarily serve for analysis and validation purposes using appropriate simulation techniques. These dynamic models are usually defined by a set of ordinary differential equations (ODEs), differential algebraic equations (DAEs), or partial differential equations (PDEs). However, for control system synthesis purposes simpler models are required, which are derived by simplifying high-fidelity models (e.g., by linearization, discretization, or model reduction) or determined directly in a specific form from input-output measurement data using system identification techniques. Frequently used synthesis models are continuous- or discrete-time linear time-invariant (LTI) models describing the nominal behavior of the plant at a specific operating point. The more accurate linear parameter-varying (LPV) models may serve to account for uncertainties due to the various approximations performed, nonlinearities, or varying model parameters.

Simulation of dynamical systems is an activity closely related to modelling and is concerned with performing virtual experiments on a given plant model to analyze and predict the dynamic behavior of a physical plant. Often, modelling and simulation are closely connected parts of dedicated environments, where specific classes of models can be built and appropriate simulation methods can be employed. Simulation is also a powerful tool for the validation of mathematical models and their approximations. In the context of CACSD, simulation is frequently used as a control system tuning aid, as, for example, in an optimization-based tuning approach using time-domain performance criteria.

System analysis and synthesis are concerned with the investigation of properties of the underlying synthesis model and with the determination of a control system which fulfills basic requirements for the closed-loop controlled plant, such as stability or various time- or frequency-response requirements. The analysis also serves to check existence conditions for the solvability of synthesis problems, according to established design methodologies. An important synthesis goal is the guarantee of performance robustness. To achieve this goal, robust control synthesis methodologies often employ optimization-based parameter tuning in conjunction with worst-case analysis techniques. A rich collection of reliable numerical algorithms is available to perform such analysis and synthesis tasks. These algorithms form the core of CACSD, and their development represented one of the main motivations for CACSD-related research.

Performance robustness assessment of the resulting closed-loop control system is a key aspect of the verification and validation activity. For a reliable assessment, simulation-based worst-case analysis often represents the only way to prove the performance robustness of the synthesized control system in the presence of parametric uncertainties and variabilities.
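The model-simplification step mentioned above (linearizing a first-principles model and discretizing it into a discrete-time LTI synthesis model) can be sketched in a few lines. The damped-pendulum model and all numerical values below are illustrative assumptions, not taken from this entry:

```python
import numpy as np

# Illustrative nonlinear plant: a damped pendulum, x = [angle, angular rate].
def f(x, u):
    g_over_l, damping = 9.81, 0.1  # hypothetical physical parameters
    return np.array([x[1], -g_over_l * np.sin(x[0]) - damping * x[1] + u])

# Linearize around an operating point by central finite differences,
# giving the continuous-time LTI approximation  xdot ~ A x + B u.
def linearize(f, x0, u0, eps=1e-6):
    n = len(x0)
    A = np.zeros((n, n))
    for i in range(n):
        dx = np.zeros(n)
        dx[i] = eps
        A[:, i] = (f(x0 + dx, u0) - f(x0 - dx, u0)) / (2 * eps)
    B = (f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)
    return A, B.reshape(n, 1)

A, B = linearize(f, np.zeros(2), 0.0)  # linearize at the downward equilibrium

# Forward-Euler discretization with sampling time Ts yields a
# discrete-time LTI synthesis model  x[k+1] = Ad x[k] + Bd u[k].
Ts = 0.01
Ad, Bd = np.eye(2) + Ts * A, Ts * B
print(Ad, Bd)
```

For the pendulum at its lower equilibrium this reproduces the expected Jacobian A = [[0, 1], [-9.81, -0.1]] and B = [0, 1]; in practice one would use a dedicated linearization facility of a modelling environment rather than hand-written finite differences.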
Development of CACSD Tools

The development of CACSD tools for system analysis and synthesis started around 1960, immediately after general-purpose digital

high-performance control and systems library SLICOT (Benner et al. 1999; Huffel et al. 2004). In what follows, we highlight the main achievements along this development process.

The direct successor of FASP is the Variable Dimension Automatic Synthesis Program (VASP) (implemented in Fortran IV on an IBM 360) (White and Lee 1971), while a further development was ORACLS (Armstrong 1978), which included several routines from the newly developed eigenvalue package EISPACK (Garbow et al. 1977; Smith et al. 1976) as well as solvers for linear (Lyapunov, Sylvester) and quadratic (Riccati) matrix equations.

From this point, the mainstream development of numerical algorithms for linear system analysis and synthesis closely followed the development of algorithms and software for numerical linear algebra. A common feature of all subsequent developments was the extensive use of robust linear algebra software, with the Basic Linear Algebra Subprograms (BLAS) (Lawson et al. 1979) and the Linear Algebra Package (LINPACK) for solving linear systems (Dongarra et al. 1979). Several control libraries were developed almost simultaneously, relying on the robust numerical linear algebra core software formed of BLAS, LINPACK, and EISPACK. Notable examples are RASP (based partly on VASP and ORACLS) (Grübel 1983) – developed originally by the University of Bochum and later by the German Aerospace Center (DLR); BIMAS (Varga and Sima 1985) and BIMASC (Varga and Davidoviciu 1986) – two Romanian initiatives; and SLICOT (Boom et al. 1991) – a Benelux initiative of several universities jointly with the Numerical Algorithms Group (NAG).

The last development phase was marked by the availability of the Linear Algebra Package (LAPACK) (Anderson et al. 1992), whose declared goal was to make the widely used EISPACK and LINPACK libraries run efficiently on shared-memory vector and parallel processors. To minimize the development effort, several active research teams from Europe started, in the framework of the NICONET project, a concentrated effort to develop a high-performance numerical software library for CACSD as a new, significantly extended version of the original SLICOT. The goals of the new library were to cover the main computational needs of CACSD, by relying exclusively on LAPACK and BLAS, and to guarantee numerical performance similar to that of the LAPACK routines. The software development used rigorous standards for implementation in Fortran 77, modularization, testing, and documentation (similar to those used in LAPACK). The development of the latest versions of RASP and SLICOT eliminated practically any duplication of effort and led to a library which contained the best software from RASP, SLICOT, BIMAS, and BIMASC. The current version of SLICOT is fully maintained by the NICONET association (http://www.niconet-ev.info/en/) and serves as the basis for implementing advanced computational functions for CACSD in interactive environments such as MATLAB (http://www.mathworks.com), Maple (http://www.maplesoft.com/products/maple/), Scilab (http://www.scilab.org/) and Octave (http://www.gnu.org/software/octave/).

Interactive Tools

Early experiments during 1970–1985 included the development of several interactive CACSD tools employing menu-driven interaction, question-answer dialogues, or command languages. The April 1982 special issue of the IEEE Control Systems Magazine was dedicated to CACSD environments and presented software summaries of 20 interactive CACSD packages. This development period was marked by the establishment of new standards for programming languages (Fortran 77, C), the availability of high-quality numerical software libraries (BLAS, EISPACK, LINPACK, ODEPACK), the transition from mainframe computers to minicomputers and finally to the nowadays-ubiquitous personal computers as computing platforms, spectacular developments in graphical display technologies, and the application of sound programming paradigms (e.g., strong data typing).

A remarkable event in this period was the development of MATLAB, a command language-based interactive matrix laboratory (Moler 1980).
The original version of MATLAB was written in Fortran 77. It was primarily intended as a student teaching tool and provided interactive access to selected subroutines from LINPACK and EISPACK. The tool circulated for a while in the public domain, and its high flexibility was soon recognized. Several CACSD-oriented commercial clones were implemented in the C language, the most important among them being MATRIXx (Walker et al. 1982) and PC-MATLAB (Moler et al. 1985).

The period after 1985 until around 2000 can be seen as a consolidation and expansion period for many commercial and noncommercial tools. In an inventory of CACSD-related software issued by the Benelux Working Group on Software (WGS) under the auspices of the IEEE Control Systems Society, there were in 1992 in active development 70 stand-alone CACSD packages, 21 tools based on or similar to MATLAB, and 27 modelling/simulation environments. It is interesting to look more closely at the evolution of the two main players, MATRIXx and MATLAB, which took place under harshly competitive conditions.

MATRIXx, with its main components Xmath, SystemBuild, and AutoCode, had over many years a leading role (especially among industrial customers), excelling with a rich functionality in domains such as system identification, control system synthesis, model reduction, modelling, simulation, and code generation. After 2003, MATRIXx (http://www.ni.com/matrixx/) became a product of the National Instruments Corporation and complements its main product family LabVIEW, a visual programming language-based system design platform and development environment (http://www.ni.com/labview).

MATLAB gained broad academic acceptance by integrating many new methodological developments in the control field into several control-related toolboxes. MATLAB also evolved into a powerful programming language, which allows easy object-oriented manipulation of different system descriptions via operator overloading. At present, the CACSD tools of MATLAB and Simulink represent the industrial and academic standard for CACSD. The existing CACSD tools are constantly extended and enriched with new model classes, new computational algorithms (e.g., structure-exploiting eigenvalue computations based on SLICOT), dedicated graphical user interfaces (e.g., for tuning of PID controllers or for control-related visualizations), advanced robust control system synthesis, etc. Also, many third-party toolboxes contribute to the wide usage of this tool.

Basic CACSD functionality incorporating symbolic processing techniques and higher-precision computations is available in the Maple product MapleSim Control Design Toolbox as well as in the Mathematica Control Systems product. Free alternatives to MATLAB are the MATLAB-like environments Scilab, a French initiative pioneered by INRIA, and Octave, which has recently added some CACSD functionality.

Summary and Future Directions

The development and maintenance of integrated CACSD environments, which provide support for all aspects of the CACSD cycle such as modelling, design, and simulation, involve sustained, strongly interdisciplinary efforts. Therefore, CACSD tool development activities must rely on the expertise of many professionals covering such diverse fields as control system engineering, programming languages and techniques, man-machine interaction, numerical methods in linear algebra and control, optimization, computer visualization, and model building techniques. This may explain why currently only a few of the commercial developments of prior years are still in use and actively maintained and developed. Unfortunately, the number of actively developed noncommercial alternative products is even lower. The dominance of MATLAB, as a de facto standard for both industrial and academic usage of integrated tools covering all aspects of the broader area of computer-aided control engineering (CACE), cannot be overlooked.
The new trends in CACSD are partly related to handling more complex applications, involving time-varying (e.g., periodic, multi-rate sampled-data, and differential algebraic) linear dynamic systems, nonlinear systems with many parametric uncertainties, and large-scale models (e.g., originating from the discretization of PDEs). For many computational aspects of model building (e.g., model reduction of large-order systems), for optimization-based robust controller tuning using multiple-model approaches, and for optimization-based robustness assessment using global-optimization techniques, parallel computation allows substantial savings in computational time and facilitates addressing computational problems for large-scale systems. A topic which needs further research is the exploitation of the benefits of combining numerical and symbolic computations (e.g., in model building and manipulation).

Cross-References

Interactive Environments and Software Tools for CACSD
Model Building for Control System Synthesis
Model Order Reduction: Techniques and Tools
Multi-domain Modeling and Simulation
Optimization-Based Control Design Techniques and Tools
Robust Synthesis and Robustness Analysis Techniques and Tools
Validation and Verification Techniques and Tools

Recommended Reading

The historical development of CACSD concepts and techniques was the subject of several articles in the reference works Rimvall and Jobling (1995) and Schmid (2002). A selection of papers on the numerical algorithms underlying the development of CACSD appeared in Patel et al. (1994). The special issue No. 2/2004 of the IEEE Control Systems Magazine on Numerical Awareness in Control presents several surveys on different aspects of developing numerical algorithms and software for CACSD.

The main trends over the last three decades in CACSD-related research can be followed in the programs/proceedings of the biannual IEEE Symposia on CACSD from 1981 to 2013 (partly available at http://ieeexplore.ieee.org) as well as of the triennial IFAC Symposia on CACSD from 1979 to 2000. Additional information can be found in several CACSD-focused survey articles and special issues (e.g., No. 4/1982; No. 2/2000) of the IEEE Control Systems Magazine.

Bibliography

Anderson E, Bai Z, Bischof C, Demmel J, Du Croz J, Greenbaum A, Hammarling S, McKenney A, Ostrouchov S, Sorensen D (1992) LAPACK user's guide. SIAM, Philadelphia
Armstrong ES (1978) ORACLS – a system for linear-quadratic Gaussian control law design. Technical paper 1106, NASA
Augustin DC, Strauss JC, Fineberg MS, Johnson BB, Linebarger RN, Sansom FJ (1967) The SCi continuous system simulation language (CSSL). Simulation 9:281–303
Benner P, Mehrmann V, Sima V, Van Huffel S, Varga A (1999) SLICOT – a subroutine library in systems and control theory. In: Datta BN (ed) Applied and computational control, signals and circuits, vol 1. Birkhäuser, Boston, pp 499–539
Dongarra JJ, Moler CB, Bunch JR, Stewart GW (1979) LINPACK user's guide. SIAM, Philadelphia
Elmquist H et al (1997) Modelica – a unified object-oriented language for physical systems modeling (version 1). http://www.modelica.org/documents/Modelica1.pdf
Elmqvist H (1978) A structured model language for large continuous systems. PhD thesis, Department of Automatic Control, Lund University, Sweden
Garbow BS, Boyle JM, Dongarra JJ, Moler CB (1977) Matrix eigensystem routines – EISPACK guide extension. Springer, Heidelberg
Grace ACW (1991) SIMULAB, an integrated environment for simulation and control. In: Proceedings of the American Control Conference, Boston, pp 1015–1020
Grübel G (1983) Die regelungstechnische Programmbibliothek RASP. Regelungstechnik 31:75–81
Kalman RE, Englar TS (1966) A user's manual for the automatic synthesis program (program C). Technical report CR-475, NASA
Lawson CL, Hanson RJ, Kincaid DR, Krogh FT (1979) Basic linear algebra subprograms for Fortran usage. ACM Trans Math Softw 5:308–323
Consensus of Complex Multi-agent Systems

new form of social and economic aggregation. This has strongly pushed towards a systematic and deep study of multi-agent dynamical systems. Mathematically, they typically consist of a graph where each node possesses a state variable; states are coupled at the dynamical level through dependences determined by the edges in the graph. One of the challenging problems in the field of multi-agent systems is to analyze the emergence of complex collective phenomena from the interactions of the units, which are typically quite simple. Complexity is typically the outcome of the topology and the nature of the interconnections.

Consensus dynamics (also known as average dynamics) (Carli et al. 2008; Jadbabaie et al. 2003) is one of the most popular and simplest multi-agent dynamics. One convenient way to introduce it is with the language of the social sciences. Imagine that a number of independent units possess a piece of information represented by a real number; for instance, such a number can represent an opinion on a given fact. Units interact and change their opinions by averaging with the opinions of other units. Under certain assumptions, this will lead the whole community to converge to a consensus opinion which takes into consideration all the initial opinions of the agents. In the social sciences, empirical evidence (Galton 1907) has shown how such an aggregate opinion may give a very good estimate of unknown quantities: this phenomenon has been proposed in the literature as the wisdom of crowds (Surowiecki 2004).

Consensus Dynamics, Graphs, and Stochastic Matrices

Mathematically, consensus dynamics are special linear dynamical systems of the type

$$x(t+1) = P\,x(t), \qquad (1)$$

where $P$ is a stochastic matrix, so that the components of $x(t+1)$ are convex combinations of the components of $x(t)$: this motivates the term averaging dynamics. Stochastic matrices owe their name to their use in probability: the term $P_{vw}$ can be interpreted as the probability of making a jump in the graph from state $v$ to state $w$. In this way one constructs what is called a random walk on the graph $G$.

The network structure is hidden in the nonzero pattern of $P$. Indeed, we can associate to $P$ a graph $G_P = (V, E_P)$ where the set of edges is given by $E_P := \{(u,v) \in V \times V \mid P_{uv} > 0\}$. Elements in $E_P$ represent the communication edges among the units: if $(u,v) \in E_P$, it means that unit $u$ has access to the state of unit $v$. Denote by $\mathbf{1} \in \mathbb{R}^V$ the all-ones vector. Notice that $P\mathbf{1} = \mathbf{1}$: this shows that once the states of the units are at consensus, they will no longer evolve. Will the dynamics always converge to a consensus point?

Remarkably, some of the key properties of $P$ responsible for the transient and asymptotic behavior of the linear system (1) are determined by the connectivity properties of the underlying graph $G_P$. We recall that, given two vertices $u, v \in V$, a path (of length $l$) from $u$ to $v$ in $G_P$ is any sequence of vertices $u = u_1, u_2, \ldots, u_{l+1} = v$ such that $(u_i, u_{i+1}) \in E_P$ for every $i = 1, \ldots, l$. $G_P$ is said to be strongly connected if for any pair of vertices $u \neq v$ in $V$ there is a path in $G_P$ connecting $u$ to $v$. The period of a node $u$ is defined as the greatest common divisor of the lengths of all closed paths from $u$ to $u$. In a strongly connected graph, all nodes have the same period, and the graph is called aperiodic if such period is 1. If $x$ is a vector, we will use the notation $x^*$ to denote its transpose. If $A$ is a finite set, $|A|$ denotes the number of elements in $A$. The following classical result holds true (Gantmacher 1959):
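The averaging dynamics $x(t+1) = Px(t)$ can be illustrated with a small simulation; the 3-node row-stochastic matrix below is an arbitrary illustrative choice, not taken from this entry:

```python
import numpy as np

# Averaging (consensus) dynamics x(t+1) = P x(t) with a row-stochastic P:
# nonnegative entries, each row summing to 1, so every new state is a
# convex combination of the old ones.  (Illustrative 3-node example.)
P = np.array([
    [0.50, 0.50, 0.00],
    [0.25, 0.50, 0.25],
    [0.00, 0.50, 0.50],
])
assert np.allclose(P.sum(axis=1), 1.0)  # P 1 = 1: consensus states are fixed points

x = np.array([0.0, 1.0, 5.0])  # initial opinions
for _ in range(200):
    x = P @ x                  # each unit averages over its neighbors

# The associated graph G_P is strongly connected and aperiodic,
# so the states converge to a common consensus value.
print(x)
```

The common limit value here is the weighted average $\pi^* x(0)$, where $\pi$ is the stationary distribution of the random walk defined by $P$.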
is achieved in just one step: $x(t) = N^{-1}\mathbf{1}\mathbf{1}^* x(0)$ for all $t \geq 1$.

Example 2 (SRW on a cycle graph) Consider the symmetric cycle graph $C_N = (V, E)$ where $V = \{0, \ldots, N-1\}$ and $E = \{(v, v+1), (v+1, v)\}$, where the sum is mod $N$. The graph $C_N$ is clearly strongly connected and is also aperiodic if $N$ is odd. The corresponding SRW $P$ has eigenvalues

$$\lambda_k = \cos\frac{2\pi k}{N}$$

Therefore, if $N$ is odd, the second eigenvalue is given by

$$\lambda_2 = \cos\frac{2\pi}{N} = 1 - \frac{2\pi^2}{N^2} + o(N^{-2}) \quad \text{for } N \to +\infty \qquad (3)$$

while the corresponding convergence time is given by

$$\tau = \big(\ln \lambda_2^{-1}\big)^{-1} \approx \frac{N^2}{2\pi^2} \quad \text{for } N \to +\infty$$

consensus dynamics will necessarily exhibit a slow convergence.

Formally, given a symmetric graph $G = (V, E)$ and a subset of nodes $S \subseteq V$, define $e_S$ as the number of edges with at least one node in $S$ and $e_{SS}$ as the number of edges with both nodes in $S$. The bottleneck of $S$ in $G$ is defined as $\Phi(S) = 1 - e_{SS}/e_S$. Finally, the bottleneck ratio of $G$ is defined as

$$\Phi := \min_{S:\, e_S/e \leq 1/2} \Phi(S)$$

where $e = |E|$ is the number of edges in the graph. Let $P$ be the SRW on $G$ and let $\lambda_2$ be its second eigenvalue. Then,

Proposition 1 (Cheeger bound, Levin et al. 2008)

$$1 - \lambda_2 \leq 2\Phi. \qquad (4)$$

Example 4 (Graphs with a bottleneck) Consider two complete graphs with $n$ nodes connected by just one edge. If $S$ is the set of nodes of one of the two complete graphs, we obtain
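The eigenvalue formula of Example 2 can be checked numerically; this is a sketch in which the SRW on $C_N$ is built directly as the matrix with $P_{v, v\pm 1} = 1/2$:

```python
import numpy as np

# Numerical check of Example 2: simple random walk (SRW) on the cycle C_N.
# Each node has degree 2, so the walk jumps to either neighbor w.p. 1/2.
N = 11  # odd, so C_N is aperiodic
P = np.zeros((N, N))
for v in range(N):
    P[v, (v + 1) % N] = 0.5
    P[v, (v - 1) % N] = 0.5

eigs = np.sort(np.linalg.eigvalsh(P))[::-1]  # P symmetric: real spectrum
lam2 = eigs[1]                               # second-largest eigenvalue

# Example 2 predicts lambda_2 = cos(2*pi/N) ~ 1 - 2*pi^2/N^2
print(lam2, 1 - 2 * np.pi**2 / N**2)
```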
the two complete graphs, we obtain
is not realistic to assume that all interactions It can be proven that Q11 is asymptotically stable
happen at the same time: agents are embedded in (.Q11 /t ! 0). Henceforth, x R .t/ ! x R .1/ for
a physical continuous time, and interactions can t ! C1 with the limit opinions satisfying the
be imagined to take place at random, for instance, relation
in a pairwise fashion.
One of the most famous random consensus
model is the gossip model. Fix a real number x R .1/ D Q11 x R .1/ C Q12 x S .0/ (6)
q 2 .0; 1/ and a symmetric graph G D .V; E/.
At every time instant t, an edge .u; v/ 2 E
If we define „ WD .I Q11 /1 Q12 , we can write
is activated with uniform probability jEj1 , and
x R .1/ D „x S .0/. It is easy to see that „ has
nodes u and v exchange their states and produce P
nonnegative elements and that s „us D 1 for
a new state according to the equations
all u 2 R: asymptotic opinions of regular agents
are thus convex combinations of the opinions
xu .t C 1/ D .1 q/xu .t/ C qxv .t/
of stubborn agents. If all stubborn agents are
xv .t C 1/ D qxu .t/ C .1 q/xv .t/ in the same state x, then, consensus is reached
by all agents in the point x. However, typically,
The states of the other units remain unchanged. consensus is not reached in such a system: we
Will this dynamics lead to a consensus? If will discuss an example below.
the same edge is activated at every time instant, There is a useful alternative interpretation
clearly consensus will not be achieved. However, of the asymptotic opinions. Interpreting the
it can be shown that, with probability one, con- graph G as an electrical circuit where edges
sensus will be reached (Boyd et al. 2006). are unit resistors, relation (6) can be seen as
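The gossip update above fits in a few lines of simulation; the 5-node cycle graph, the value $q = 0.5$, and the step count are illustrative choices:

```python
import random

# Gossip consensus on a symmetric graph: at each instant one edge (u, v)
# is activated uniformly at random and u, v move toward each other's state.
def gossip(x, edges, q=0.5, steps=20000, seed=0):
    rng = random.Random(seed)
    x = list(x)
    for _ in range(steps):
        u, v = rng.choice(edges)
        xu, xv = x[u], x[v]
        x[u] = (1 - q) * xu + q * xv
        x[v] = q * xu + (1 - q) * xv
    return x

# Illustrative symmetric cycle graph on 5 nodes.
edges = [(i, (i + 1) % 5) for i in range(5)]
x = gossip([0.0, 1.0, 2.0, 3.0, 4.0], edges)
print(max(x) - min(x))  # the spread shrinks toward 0: consensus is reached
```

Note that each pairwise update preserves the sum $x_u + x_v$, so the average opinion of the community is conserved along the whole run.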
Consensus Dynamics with Stubborn Agents

In this entry, we investigate consensus dynamics models where some of the agents do not modify their own state (stubborn agents). These systems are of interest in socioeconomic models (Acemoglu et al. 2013).

Consider a symmetric connected graph $G = (V, E)$. We assume a splitting $V = S \cup R$, with the understanding that agents in $S$ are stubborn agents not changing their state, while those in $R$ are regular agents whose state evolves in time according to an SRW consensus dynamics, namely,

$$x_u(t+1) = \frac{1}{d_u} \sum_{v \in V} (A_G)_{uv}\, x_v(t), \qquad \forall u \in R$$

By assembling the states of the regular and of the stubborn agents in vectors denoted, respectively, as $x^R(t)$ and $x^S(t)$, the dynamics can be recast in matrix form as

$$x^R(t+1) = Q_{11} x^R(t) + Q_{12} x^S(t) \qquad (5)$$
$$x^S(t+1) = x^S(t)$$

It can be proven that $Q_{11}$ is asymptotically stable ($(Q_{11})^t \to 0$). Hence, $x^R(t) \to x^R(\infty)$ for $t \to +\infty$, with the limit opinions satisfying the relation

$$x^R(\infty) = Q_{11} x^R(\infty) + Q_{12} x^S(0) \qquad (6)$$

If we define $\Xi := (I - Q_{11})^{-1} Q_{12}$, we can write $x^R(\infty) = \Xi x^S(0)$. It is easy to see that $\Xi$ has nonnegative elements and that $\sum_s \Xi_{us} = 1$ for all $u \in R$: asymptotic opinions of the regular agents are thus convex combinations of the opinions of the stubborn agents. If all stubborn agents are in the same state $x$, then consensus is reached by all agents at the point $x$. However, typically, consensus is not reached in such a system: we will discuss an example below.

There is a useful alternative interpretation of the asymptotic opinions. Interpreting the graph $G$ as an electrical circuit where edges are unit resistors, relation (6) can be seen as a Laplace-type equation on the graph $G$ with boundary conditions given by assigning the voltage $x^S(0)$ to the stubborn agents. In this way, $x^R(\infty)$ can be interpreted as the vector of voltages of the regular agents when the stubborn agents have fixed voltage $x^S(0)$. Thanks to the electrical analogy, we can compute the asymptotic opinion of the agents by computing the voltages at the various nodes in the graph. We propose a concrete application in the following example.

Example 5 (Stubborn agents in a line graph) Consider the line graph $L_N = (V, E)$ where $V = \{1, 2, \ldots, N\}$ and where $E = \{(u, u+1), (u+1, u) \mid u = 1, \ldots, N-1\}$. Assume that $S = \{1, N\}$ and $R = V \setminus S$. Consider the graph as an electrical circuit. Replacing the line with a single edge connecting 1 and $N$ having resistance $N-1$ and applying Ohm's law, we obtain that the current flowing from 1 to $N$ is equal to $\Phi = (N-1)^{-1} [x_N^S(0) - x_1^S(0)]$. If we now fix an arbitrary node $v \in V$ and apply the same arguments to the part of the graph from 1 to $v$, we obtain that the voltage at $v$, $x_v^R(\infty)$, satisfies the relation $x_v^R(\infty) - x_1^S(0) = \Phi\,(v-1)$. We thus obtain
$$x_v^R(\infty) = x_1^S(0) + \frac{v-1}{N-1}\big[x_N^S(0) - x_1^S(0)\big].$$
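The closed-form limit of Example 5 can be checked by computing $\Xi = (I - Q_{11})^{-1} Q_{12}$ directly; this is a sketch in which $N$ and the stubborn opinions are arbitrary choices:

```python
import numpy as np

# Line graph with N nodes; stubborn agents at nodes 1 and N (indices 0, N-1).
N = 7
A = np.zeros((N, N))
for u in range(N - 1):
    A[u, u + 1] = A[u + 1, u] = 1.0
P = A / A.sum(axis=1, keepdims=True)  # SRW: each agent averages its neighbors

R = list(range(1, N - 1))  # regular agents
S = [0, N - 1]             # stubborn agents
Q11 = P[np.ix_(R, R)]
Q12 = P[np.ix_(R, S)]

Xi = np.linalg.solve(np.eye(len(R)) - Q11, Q12)  # Xi = (I - Q11)^{-1} Q12
xS0 = np.array([0.0, 1.0])                       # stubborn opinions x_1, x_N
xR_inf = Xi @ xS0

# Example 5 predicts a linear "voltage" profile between the stubborn agents.
print(xR_inf)  # -> [1/6, 2/6, 3/6, 4/6, 5/6] for N = 7
```

The rows of $\Xi$ sum to 1, confirming that each regular agent's limit opinion is a convex combination of the stubborn opinions.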
Control and Optimization of Batch Processes
[Table: control objectives by implementation aspect — run-time references $y_{\mathrm{ref}}(t)$ or $y_{\mathrm{ref}}[0, t_f]$; run-end references $z_{\mathrm{ref}}$]
used for low-volume production. Discontinuous operations can be of the batch or semi-batch type. In batch operations, the products to be processed are loaded in a vessel and processed without material addition or removal. This operation permits more flexibility than continuous operation by allowing adjustment of the operating conditions and the final time. Additional flexibility is available in semi-batch operations, where products are continuously added by adjusting the feed rate profile. We use the term batch process to include semi-batch processes.

Batch processes dealing with reaction and separation operations include reaction, distillation, absorption, extraction, adsorption, chromatography, crystallization, drying, filtration, and centrifugation. The operation of batch processes involves recipes developed in the laboratory. A sequence of operations is performed in a prespecified order in specialized process equipment, yielding a fixed amount of product. The sequence of tasks to be carried out on each piece of equipment, such as heating, cooling, reaction, distillation, crystallization, and drying, is predefined. The desired production volume is then achieved by repeating the processing steps on a predetermined schedule.

The main characteristics of batch process operations include the absence of a steady state, the presence of constraints, and their repetitive nature. These characteristics bring both challenges and opportunities to the operation of batch processes (Bonvin 1998). The challenges are related to the fact that the available models are often poor and incomplete, especially since they need to represent a wider range of operating conditions than in the case of continuous processes. Furthermore, although product quality must be controlled, this variable is usually not available online but only at run end. On the other hand, opportunities stem from the fact that industrial chemical processes are often slow, which facilitates larger sampling periods and extensive online computations. In addition, the repetitive nature of batch processes opens the way to run-to-run process improvement (Bonvin et al. 2006). More information on batch processes and their operation can be found in Seborg et al. (2004) and Nagy and Braatz (2003). Next, we will successively address the control and the optimization of batch processes.
is well suited to this task (Nagy and Braatz 2003).

$$\min_{u[0,t_f],\, t_f} \; J = \phi\big(x(t_f)\big) + \int_0^{t_f} L\big(x(t), u(t)\big)\, dt \qquad (1)$$
$$\text{subject to} \quad \dot{x} = F(x, u), \quad x(0) = x_0, \quad S\big(x(t), u(t)\big) \leq 0, \quad T\big(x(t_f)\big) \leq 0,$$

where $x$ represents the state vector, $J$ the scalar cost to be minimized, $S$ the run-time constraints, $T$ the run-end constraints, and $t_f$ the final time.
In constrained optimal control problems, the solution often lies on the boundary of the feasible region. Batch processes involve run-time constraints on inputs and states as well as run-end constraints.

Optimization Strategies

As can be seen from the cost objective (1), optimization requires information about the complete run and thus cannot be implemented in real time using only online measurements. Some information regarding the future of the run is needed, in the form of either a process model capable of prediction or measurements from previous runs. Accordingly, measurement-based optimization methods can be classified depending on whether or not a process model is used explicitly for implementation, as illustrated in Fig. 2 and discussed next:
1. Online explicit optimization. This approach is similar to model predictive control (Nagy and Braatz 2003). Optimization uses a process model explicitly and is repeated whenever a new set of measurements becomes available. This scheme involves two steps, namely, updating the initial conditions for the subsequent optimization (and optionally the parameters of the process model) and numerical optimization based on the updated process model (Abel et al. 2000). Since both steps are repeated as measurements become available, the procedure is also referred to as repeated online optimization. The weakness of this method is its reliance on the model; if the model is not updated, its accuracy plays a crucial role. However, when the model is updated, there is a conflict between parameter identification and optimization, since parameter identification requires persistency of excitation, that is, the inputs must be sufficiently varied to uncover the unknown parameters, a condition that is usually not satisfied when near-optimal inputs are applied. Note that, instead of computing the input $u_k[t, t_f]$, it is also possible to use a receding horizon and compute only $u_k[t, t+T]$, with $T$ the control horizon (Abel et al. 2000).
2. Online implicit optimization. In this scenario, measurements are used to update the inputs directly, that is, without the intermediary of a process model. Two classes of techniques can be identified. In the first class, an update law that approximates the optimal solution
Control and Optimization of Batch Processes, Fig. 2 Optimization strategies for batch processes. The strategies are classified according to whether or not a process model is used for implementation (horizontal division). Furthermore, each class can be implemented either online or iteratively over several runs (vertical division). EST stands for "estimation," IDENT for "identification," OPT for "optimization," and NCO for "necessary conditions of optimality"
is sought. For example, a neural network is trained with data corresponding to optimal behavior for various uncertainty realizations and used to update the inputs (Rahman and Palanki 1996). The second class of techniques relies on transforming the optimization problem into a control problem that enforces the necessary conditions of optimality (NCO) (Srinivasan and Bonvin 2007b). The NCO involve constraints that need to be made active and sensitivities that need to be pushed to zero. Since some of these NCO are evaluated at run time and others at run end, the control problem involves both run-time and run-end outputs. The main issue is the measurement or estimation of the controlled variables, that is, the constraints and sensitivities that constitute the NCO.

3. Iterative explicit optimization. The steps followed in run-to-run explicit optimization are the same as in online explicit optimization. However, there is substantially more data available at the end of the run as well as sufficient computational time to refine the model by updating its parameters and, if needed, its structure. Furthermore, data from previous runs can be collected for model update (Rastogi et al. 1992). As with online explicit optimization, this approach suffers from the conflict between estimation and optimization.

4. Iterative implicit optimization. In this scenario, the optimization problem is transformed into a control problem, for which the control approaches in the second row of Fig. 1 are used to meet the run-time and run-end objectives (Francois et al. 2005). The approach, which is conceptually simple, might be experimentally expensive since it relies more on data.

These complementary measurement-based optimization strategies can be combined by implementing some aspects of the optimization online and others on a run-to-run basis. For instance, in explicit schemes, the states can be estimated online, while the model parameters can be estimated on a run-to-run basis. Similarly, in implicit optimization, approximate update laws can be implemented online, leaving the responsibility for satisfying terminal constraints and sensitivities to run-to-run controllers.

Summary and Future Directions

Batch processing presents several challenges. Since there is little time for developing appropriate dynamic models, there is a need for improved data-driven control and optimization approaches. These approaches require the availability of online concentration-specific measurements such as chromatographic and spectroscopic sensors, which are not yet readily available in production.

Technically, the main operational difficulty in batch process improvement lies in the presence of run-end outputs such as final quality, which cannot be measured during the run. Although model-based solutions are available, process models in the batch area tend to be poor. On the other hand, measurement-based optimization for a given batch faces the challenge of having to know about the future to act during the batch. Consequently, the main research push is in the area of measurement-based optimization and the use of data from both the current and previous batches for control and optimization purposes.

Cross-References

Industrial MPC of Continuous Processes
Iterative Learning Control
Multiscale Multivariate Statistical Process Control
Scheduling of Batch Plants
State Estimation for Batch Processes

Bibliography

Abel O, Helbig A, Marquardt W, Zwick H, Daszkowski T (2000) Productivity optimization of an industrial semi-batch polymerization reactor under safety constraints. J Process Control 10(4):351–362
Bonvin D (1998) Optimal operation of batch reactors – a personal view. J Process Control 8(5–6):355–368
138 Control Applications in Audio Reproduction
Bonvin D, Srinivasan B, Hunkeler D (2006) Control and optimization of batch processes: improvement of process operation in the production of specialty chemicals. IEEE Control Syst Mag 26(6):34–45
Francois G, Srinivasan B, Bonvin D (2005) Use of measurements for enforcing the necessary conditions of optimality in the presence of constraints and uncertainty. J Process Control 15(6):701–712
Moore KL (1993) Iterative learning control for deterministic systems. Advances in industrial control. Springer, London
Nagy ZK, Braatz RD (2003) Robust nonlinear model predictive control of batch processes. AIChE J 49(7):1776–1786
Rahman S, Palanki S (1996) State feedback synthesis for on-line optimization in the presence of measurable disturbances. AIChE J 42:2869–2882
Rastogi A, Fotopoulos J, Georgakis C, Stenger HG (1992) The identification of kinetic expressions and the evolutionary optimization of specialty chemical batch reactors using tendency models. Chem Eng Sci 47(9–11):2487–2492
Seborg DE, Edgar TF, Mellichamp DA (2004) Process dynamics and control. Wiley, New York
Srinivasan B, Bonvin D (2007a) Controllability and stability of repetitive batch processes. J Process Control 17(3):285–295
Srinivasan B, Bonvin D (2007b) Real-time optimization of batch processes by tracking the necessary conditions of optimality. Ind Eng Chem Res 46(2):492–504

Control Applications in Audio Reproduction

Yutaka Yamamoto
Department of Applied Analysis and Complex Dynamical Systems, Graduate School of Informatics, Kyoto University, Kyoto, Japan

Abstract

This entry gives a brief overview of the recent developments in audio sound reproduction via modern sampled-data control theory. We first review basics in the current sound processing technology and then proceed to the new idea derived from sampled-data control theory, which is different from the conventional Shannon paradigm based on the perfect band-limiting hypothesis. The hybrid nature of sampled-data systems provides an optimal platform for dealing with signal processing where the ultimate objective is to reconstruct the original analog signal one started with. After discussing some fundamental problems in the Shannon paradigm, we give our basic problem formulation that can be solved using modern sampled-data control theory. Examples are given to illustrate the results.

Keywords

Digital signal processing; Multirate signal processing; Sampled-data control; Sampling theorem; Sound reconstruction

Introduction: Status Quo

Consider the problem of reproducing sounds from recorded media such as compact discs. The current CD format is recorded at the sampling frequency 44.1 kHz. It is commonly claimed that the highest frequency for human audibility is 20 kHz, whereas the upper bound of reproduction in this format is believed to be the half of 44.1 kHz, i.e., 22.05 kHz, and hence, this format should have about 10 % margin against the alleged audible limit of 20 kHz.

CD players of early days used to process such digital signals with the simple zero-order hold at this frequency, followed by an analog low-pass filter. This process requires a sharp low-pass characteristic to cut out unnecessary high frequency beyond 20 kHz. However, a sharp cut-off low-pass characteristic inevitably requires a high-order filter which in turn introduces a large amount of phase shift distortion around the cutoff frequency.

To circumvent this defect, there was introduced the idea of the oversampling DA converter that is realized by the combination of a digital filter and a low-order analog filter (Zelniker and Taylor 1994). This is based on the following principle: Let {f(nh)} (n = −∞, …, ∞) be a discrete-time signal obtained from a continuous-time signal f(·) by sampling it with sampling period h. The upsampler appends the value 0, M − 1 times, between two adjacent sampling points:
Control Applications in Audio Reproduction, Fig. 1 Upsampler for M = 2

    (↑M w)[k] := w(ℓ)  if k = Mℓ,  and  0  elsewhere.            (1)

See Fig. 1 for the case M = 2. This has the effect of making the unit operational time M times faster. The bandwidth will also be expanded by M times and the Nyquist frequency (i.e., half the sampling frequency) becomes Mπ/h [rad/sec]. As we see in the next section, the Nyquist frequency is often regarded as the true bandwidth of the discrete-time signal {f(nh)}. But this upsampling process just inserts zeros between sampling points, and the real information content (the true bandwidth) is not really expanded. As a result, the copy of the frequency content for [0, π/h) appears as a mirror image repeatedly over the frequency range above π/h. This distortion is called imaging. In order to avoid the effect of such mirrored frequency components, one often truncates the frequency components beyond the (original) Nyquist frequency via a digital low-pass filter that has a sharp roll-off characteristic. One can then complete the digital-to-analog (DA) conversion process by postposing a slowly decaying analog filter. This is the idea of an oversampling DA converter (Zelniker and Taylor 1994). The advantage here is that by allowing a much wider frequency range, the final analog filter can be a low-order filter and hence yields a relatively small amount of phase distortion, supported in part by the linear-phase characteristic endowed on the digital filter preceding it.

Signal Reconstruction Problem

As before, consider the sampled discrete-time signal {f(nh)} obtained from a continuous-time signal f. The main question is how we can recover the original continuous-time signal f(·) from sampled data. This is clearly an ill-posed problem without any assumption on f because there are infinitely many functions that can match the sampled data {f(nh)}. Hence, one has to impose a reasonable a priori assumption on f to sensibly discuss this problem. The following sampling theorem gives one answer to this question:

Theorem 1 Suppose that the signal f ∈ L² is perfectly band-limited, in the sense that there exists ω₀ ≤ π/h such that the Fourier transform f̂ of f satisfies

    f̂(ω) = 0,  |ω| ≥ ω₀.                                        (2)

Then

    f(t) = Σ_{n=−∞}^{∞} f(nh) · sin(π(t/h − n)) / (π(t/h − n)).  (3)

This theorem states that if the signal f does not contain any high-frequency components beyond the Nyquist frequency π/h, then the original signal f can be uniquely reconstructed from its sampled data {f(nh)}. On the other hand, if this assumption does not hold, then the result does not necessarily hold. This is easy to see via a schematic representation in Fig. 2. If we sample the sinusoid in the upper figure in Fig. 2, these sampled values would turn out to be compatible with another sinusoid with much lower frequency, as the lower figure shows. In other words, this sampling period does not have enough resolution to distinguish these two sinusoids. The maximum frequency below which such a phenomenon does not occur is the Nyquist frequency. The sampling theorem above asserts that it is half of the sampling frequency 2π/h, that is, π/h [rad/sec]. In other words, if
Control Applications in Audio Reproduction, Fig. 3 Ringing due to the Gibbs phenomenon
Control Applications in Audio Reproduction, Fig. 4 Error system for sampled-data design
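The zero-insertion upsampler (1), the aliasing phenomenon of Fig. 2, and the interpolation property of the cardinal series (3) can all be checked numerically. A minimal sketch in pure Python follows; the helper names and sample values are our own and purely illustrative.

```python
import math

def upsample(w, M):
    """Zero-insertion upsampler of (1): output index k = M*l carries w[l],
    all other indices carry 0."""
    y = [0.0] * (M * len(w))
    y[::M] = w
    return y

assert upsample([1.0, 2.0, 3.0], 2) == [1.0, 0.0, 2.0, 0.0, 3.0, 0.0]

# Aliasing (Fig. 2): two sinusoids whose frequencies differ by the sampling
# frequency 2*pi/h produce identical samples.
h = 1.0
w_lo = 0.4 * math.pi                 # below the Nyquist frequency pi/h
w_hi = w_lo + 2.0 * math.pi / h      # well above it
lo = [math.sin(w_lo * n * h) for n in range(20)]
hi = [math.sin(w_hi * n * h) for n in range(20)]
assert max(abs(a - b) for a, b in zip(lo, hi)) < 1e-9

def cardinal_series(samples, t, h=1.0):
    """Truncated version of the reconstruction formula (3)."""
    s = 0.0
    for n, f_nh in enumerate(samples):
        x = math.pi * (t / h - n)
        s += f_nh * (1.0 if abs(x) < 1e-12 else math.sin(x) / x)
    return s

# The series interpolates: it passes exactly through every sample point.
assert abs(cardinal_series(lo, 5.0, h) - lo[5]) < 1e-9
```

Note that the zero-insertion step adds no information; recovering the missing inter-sample values well is exactly the filter-design problem this entry addresses.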
Consider the situation where one plays a musical instrument, say, a guitar. A guitar naturally has a frequency characteristic. When one picks a string, it produces a certain tone along with its harmonics, as well as a characteristic transient response. All these are governed by a certain frequency decay curve, demanded by the physical characteristics of the guitar. Let us suppose that such a frequency decay is governed by a rational transfer function F(s), and it is driven by varied exogenous inputs.

Consider Fig. 4. The exogenous analog signal wc ∈ L² is applied to the analog filter F(s). This F(s) is not an ideal filter and hence its bandwidth is not limited below the Nyquist frequency. The signal wc drives F(s) to produce the target analog signal yc, which should be the signal to be reconstructed. It is then sampled by sampler Sh and becomes the recorded or transmitted digital signal yd. The objective here is to reconstruct the target analog signal yc out of this sampled signal yd. In order to recover the frequency components beyond the Nyquist frequency, one needs a faster sampling period, so we insert the upsampler ↑L to make the sampling period h/L. This upsampled signal is processed by digital filter K(z) and then becomes a continuous-time signal again by going through the hold device H_{h/L}. It will then be processed by analog filter P(s) to be smoothed out. The obtained signal is then compared with the delayed analog signal yc(t − mh) to form the delayed error signal ec. The objective is then to make this error ec as small as possible. The reason for allowing the delay e^{−mhs} is to accommodate certain processing delays. This is the idea of the block diagram Fig. 4.

The performance index we minimize is the induced norm of the transfer operator Tew from wc to ec:

    ‖Tew‖∞ := sup_{wc ≠ 0} ‖ec‖₂ / ‖wc‖₂.                        (4)
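Evaluating the sampled-data induced norm (4) requires the lifting machinery of sampled-data control theory, but the analogous H∞ norm of an ordinary discrete-time filter is simply its peak gain over the unit circle, which a frequency grid can estimate. A toy sketch (NumPy assumed; a grid estimate only, not the actual sampled-data synthesis):

```python
import numpy as np

def hinf_norm(b, a=(1.0,), n_grid=8192):
    """Frequency-grid estimate of the H-infinity norm (peak gain over the
    unit circle) of the discrete-time filter H(z) = B(z)/A(z), with
    B(z) = sum_k b[k] z^-k and A(z) = sum_k a[k] z^-k (A stable assumed)."""
    w = np.linspace(0.0, np.pi, n_grid)
    z_inv = np.exp(-1j * w)
    B = sum(bk * z_inv**k for k, bk in enumerate(b))
    A = sum(ak * z_inv**k for k, ak in enumerate(a))
    return float(np.max(np.abs(B / A)))

print(hinf_norm([0.5, 0.5]))  # two-tap averager: peak gain 1.0, attained at w = 0
```

A grid estimate lower-bounds the true norm; refining the grid (or bisecting on the performance level γ, as in the standard sampled-data solution methods cited in this entry) tightens it.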
In other words, this is the H∞-norm of the sampled-data control system in Fig. 4. Our objective is then to solve the following problem:

Filter Design Problem
Given the system specified by Fig. 4. For a given performance level γ > 0, find a filter K(z) such that

    ‖Tew‖∞ < γ.

This is a sampled-data H∞ (sub-)optimal control problem. This can be solved by using the standard solution method for sampled-data control systems (Chen and Francis 1995a; Yamamoto 1999; Yamamoto et al. 2012). The only anomaly here is that the system in Fig. 4 contains a delay element e^{−mhs} which is infinite dimensional. However, by suitably approximating this delay by a successive series of shift registers, one can convert the problem to an appropriate finite-dimensional discrete-time problem (Yamamoto et al. 1999, 2002, 2012).

This problem setting has the following features:
1. One can optimize the continuous-time performance under the constraint of discrete-time filters.
2. By setting the class of input functions as L² functions band-limited by F(s), one can capture the continuous-time error signal ec and its worst-case norm in the sense of (4).

The first feature is due to the advantage of sampled-data control theory. It is a great advantage of sampled-data control theory that it allows the mixture of continuous- and discrete-time components. This is in marked contrast to the Shannon paradigm where continuous-time performance is really demanded by the artificial perfect band-limiting hypothesis.

The second feature is an advantage due to H∞ control theory. Naturally, we cannot have access to each individual error signal ec, but we can still control the overall performance from wc to ec in terms of the H∞ norm that guarantees the worst-case performance. This is in clear contrast with the classical case where only a representative response, e.g., the impulse response in the case of H², is targeted. Furthermore, since we can control the continuous-time performance of the worst-case error signal, the present method can indeed minimize (continuous-time) phase errors. This is an advantage usually not possible with conventional methods since they mainly discuss the gain characteristics of the designed filters only. By the very property of minimizing the H∞ norm of the continuous-time error signal ec, the present method can even control the phase errors and yield much less phase distortion even around the cutoff frequency.

Figure 5 shows the response of the proposed sampled-data filter against a rectangular wave, with a suitable first- or second-order analog filter F(s); see Yamamoto et al. (2012) for more details. Unlike Fig. 3, the overshoot is controlled to be minimum.

The present method has been patented (Fujiyama et al. 2008; Yamamoto 2006; Yamamoto and Nagahara 2006) and implemented into sound processing LSI chips as a core technology by Sanyo Semiconductors and successfully used in mobile phones, digital voice recorders, and MP3 players; their cumulative production has exceeded 40 million units as of the end of 2012.

Summary and Future Directions

We have presented basic ideas of a new signal processing theory derived from sampled-data control theory. The theory has advantages that are not possible with the conventional projection methods, whether based on the perfect band-limiting hypothesis or not.

The application of sampled-data control theory to digital signal processing was first made by Chen and Francis (1995b) with the performance measure in the discrete-time domain; see also Hassibi et al. (2006). The present author and his group have pursued the idea presented in this entry since 1996 (Khargonekar and Yamamoto 1996). See Yamamoto et al. (2012) and references therein. For the background of sampled-data
Control Applications in Audio Reproduction, Fig. 5 Response of the proposed sampled-data filter against a rectangular wave

control theory, consult, e.g., Chen and Francis (1995a) and Yamamoto (1999).

The same philosophy of emphasizing the importance of analog performance was proposed and pursued recently by Unser and co-workers (1994), Unser (2005), and Eldar and Dvorkind (2006). The crucial difference is that they rely on L²/H² type optimization and orthogonal or oblique projections, which are very different from our method here. In particular, such projection methods can behave poorly for signals outside the projected space. The response shown in Fig. 3 is a typical such example.

Applications to image processing are discussed in Yamamoto et al. (2012). An application to Delta-Sigma DA converters is studied in Nagahara and Yamamoto (2012). Again, the crux of the idea is to assume a signal generator model and then design an optimal filter in the sense of Fig. 4 or a similar diagram with the same idea. This idea should be applicable to a much wider class of problems in signal processing and should prove to have more impact.

Some processed examples of still and moving images are downloadable from the site: http://www-ics.acs.i.kyoto-u.ac.jp/~yy/

For the sampling theorem, see Shannon (1949), Unser (2000), and Zayed (1996), for example. Note, however, that Shannon himself (1949) did not claim originality on this theorem; hence, it is misleading to attribute this theorem solely to Shannon. See Unser (2000) and Zayed (1996) for some historical accounts. For a general background in signal processing, Vetterli et al. (2013) is useful.

Cross-References

H-Infinity Control
Optimal Sampled-Data Control
Sampled-Data Systems

Acknowledgments The author would like to thank Masaaki Nagahara and Masashi Wakaiki for their help with the numerical examples. Part of this entry is based on the exposition (Yamamoto 2007) written in Japanese.

Bibliography

Chen T, Francis BA (1995a) Optimal sampled-data control systems. Springer, New York
Chen T, Francis BA (1995b) Design of multirate filter banks by H∞ optimization. IEEE Trans Signal Process 43:2822–2830
144 Control for High-Speed Nanopositioning
Control for High-Speed Nanopositioning, Fig. 2 A feedback loop typically encountered in nanopositioning. Purpose of the controller is to control the position of the scanner such that it follows the intended reference trajectory based on the position measurement obtained from a position sensor
the position of the scanner such that it follows a given reference trajectory based on the measurement provided by a displacement sensor. The resulting tracking error contains both deterministic and stochastic components. Deterministic errors are typically due to insufficient closed-loop bandwidth. They may also arise from excitation of mechanical resonant modes of the scanner or actuator nonlinearities such as piezoelectric hysteresis and creep (Croft et al. 2001). The factors that limit the achievable closed-loop bandwidth include phase delays and non-minimum phase zeros associated with the actuator and scanner dynamics (Devasia et al. 2007). The dynamics of the nanopositioner, the controller, and the reference trajectory selected for scanning play a key role in minimizing the deterministic component of the tracking error.

Tracking errors of a stochastic nature mostly arise from external noise and vibrations and from position measurement noise. External noise and vibrations can be significantly reduced by operating the nanopositioner in a controlled environment. However, dealing with the measurement noise is a significant challenge (Sebastian et al. 2008a). The feedback loop allows the sensing noise to generate a random positioning error that deteriorates the positioning precision. Increasing the closed-loop bandwidth (to decrease the deterministic errors) tends to worsen this effect. Low sensitivity to measurement noise is, therefore, a key requirement in feedback control design for high-speed nanopositioning and a very hard problem to address.

Summary and Future Directions

While high-precision nanoscale positioning systems have been demonstrated at low speeds, despite an intensive international race spanning several years, the longstanding challenge remains to achieve high-speed motion and positioning with Ångstrom-level accuracy. Overcoming this barrier is believed to be the necessary catalyst for the emergence of groundbreaking innovations across a wide range of scientific and technological fields. Control is a critical technology to facilitate the emergence of such systems.

Bibliography

Cherubini G, Chung CC, Messner WC, Moheimani SOR (2012) Control methods in data-storage systems. IEEE Trans Control Syst Technol 20(2):296–322
Clayton GM, Tien S, Leang KK, Zou Q, Devasia S (2009) A review of feedforward control approaches in nanopositioning for high-speed SPM. J Dyn Syst Meas Control Trans ASME 131(6):1–19
Croft D, Shed G, Devasia S (2001) Creep, hysteresis, and vibration compensation for piezoactuators: atomic force microscopy application. ASME J Dyn Syst Control 123(1):35–43
Devasia S, Eleftheriou E, Moheimani SOR (2007) A survey of control issues in nanopositioning. IEEE Trans Control Syst Technol 15(5):802–823
Gao W, Hocken RJ, Patten JA, Lovingood J, Lucca DA (2000) Construction and testing of a nanomachining instrument. Precis Eng 24(4):320–328
Krogmann D (1999) Image multiplexing system on the base of piezoelectrically driven silicon microlens arrays. In: Proceedings of the 3rd international conference on micro opto electro mechanical systems (MOEMS), Mainz, pp 178–185
Control Hierarchy of Large Processing Plants: An Overview 147
Control Hierarchy of Large Processing Plants: An Overview, Fig. 3 Information network in a modern process plant
supervision (Multiscale Multivariate Statistical Process Control), data reconciliation, inferences and estimation of unmeasured quantities, fault detection and diagnosis, or controller performance monitoring (Controller Performance Monitoring) (CPM).

The information flow becomes more complex when we move up the basic control layer, looking more like a web than a pyramid when we enter the world of what can be called generally asset (plant and equipment) management: a collection of different activities oriented to sustain performance and economic return, considering their entire cycle of life and, in particular, aspects such as maintenance, efficiency, or production organization. Above the supervisory layer, one can usually distinguish at least two levels denoted generically as manufacturing execution systems (MES) and enterprise resource planning (ERP) (Scholten 2009), as can be seen in Fig. 4.

MES are information systems that support the functions that a production department must perform in order to prepare and to manage work instructions, schedule production activities, monitor the correct execution of the production process, gather and analyze information about the production process, and optimize procedures. Notice that regarding the control of process units, up to this level no fundamental differences appear between continuous and batch processes. But at the MES level, which corresponds to RTO of Fig. 1, many process units may be involved, and the tools and problems are different, the main task in batch production being the optimal scheduling of those process units (Scheduling of Batch Plants; Mendez et al. 2006).

MES are part of a larger class of systems called manufacturing operation management (MOM) that cover not only the management of production operations but also other functions
Control Hierarchy of Large Processing Plants: An Overview, Fig. 4 Software/hardware view
Future Control and Optimization at Plant Scale

Going back to Fig. 1, the variety of control and optimization problems increases as we move up in the control hierarchy, entering the field of dynamic process operations and considering not only individual process units but also larger sets of equipment or whole plants. Examples at the RTO (or MES) level are optimal management of shared resources or utilities, production bottleneck avoidance, optimal energy use or maximum efficiency, smooth transitions against production changes, etc.

Above, we have mentioned RTO as the most common approach for plant-wide optimization. Normally, RTO systems perform the optimization of an economic cost function using a nonlinear process model in steady state and the corresponding operational constraints to generate targets for the control systems on the lower layers. The implementation of RTO provides consistent benefits by looking at the optimal operation problem from a plant-wide perspective. Nevertheless, in practice, when MPCs with local optimizers are operating the process units, many coordination problems appear between these layers, due to differences in models and targets, so that driving the operation of these process units in a coherent way with the global economic targets is an additional challenge.

A different perspective is taken by the so-called self-optimizing control (Fig. 5 right, Skogestad 2000) that, instead of implementing the RTO solution online, uses it to design a control structure that assures a near optimum
industrial applications are possible thanks to and robustness against uncertainty, and leading
the advances in optimization methods and tools to new challenges and problems for research
and computing power available on the plant and possibly large improvements of plant oper-
level, but implementation is still a challenge ations.
from many points of view, not only technical.
Few suppliers offer commercial products,
and finding optimal operation policies for a Cross-References
C
whole factory is a complex task that requires
taking into consideration many aspects and Controller Performance Monitoring
elaborate information not available directly Economic Model Predictive Control
as process measurements. Solving large Industrial MPC of Continuous Processes
NMPC problems in real time may require Model-Based Performance Optimizing Control
breaking the associated optimization problem Multiscale Multivariate Statistical Process
in subproblems that can be solved in parallel. Control
This leads to several local controllers/optimizers, Programmable Logic Controllers
each one solving one subproblem involving Real-Time Optimization of Industrial Pro-
variables of a part of the process and linked cesses
by some type of coordination. This offers a Scheduling of Batch Plants
new point of view of the control hierarchy.
Typically, three types of architectures are
mentioned for dealing with this problem,
Bibliography
represented in Fig. 7: In the hierarchical
approach, coordination between local controllers is carried out by an upper layer that deals with the interactions, assigning targets to them. In price coordination, the coordination task is performed by a market-like mechanism that assigns different prices to the cost functions of every local controller/optimizer. Finally, in the distributed approach, the local controllers coordinate their actions by interchanging information about their decisions or states with neighbors (Scattolini 2009).

Summary and Future Research

Process control is a key element in the operation of process plants. At the lowest layer, it can be considered a mature, well-proven technology, even if many problems such as control structure selection and controller tuning are in reality often not solved well. The range of problems under consideration is continuously expanding to the upper layers of the hierarchy, merging control with process operation and optimization, creating new challenges that range from modeling and estimation to efficient large-scale optimization.

Bibliography

Camacho EF, Bordóns C (2004) Model predictive control. Springer, London, pp 1–405. ISBN 1-85233-694-3
de Prada C, Gutierrez G (2012) Present and future trends in process control. Ingeniería Química 44(505):38–42. Special edition ACHEMA, ISSN 0210-2064
Engell S (2007) Feedback control for optimal process operation. J Process Control 17:203–219
Engell S, Harjunkoski I (2012) Optimal operation: scheduling, advanced control and their integration. Comput Chem Eng 47:121–133
Lucia S, Finkler T, Engell S (2013) Multi-stage nonlinear model predictive control applied to a semi-batch polymerization reactor under uncertainty. J Process Control 23:1306–1319
Lunze J, Lamnabhi-Lagarrigue F (2009) HYCON handbook of hybrid systems control. Theory, tools, applications. Cambridge University Press, Cambridge. ISBN 978-0-521-76505-3
Marchetti A, Chachuat B, Bonvin D (2009) Modifier-adaptation methodology for real-time optimization. Ind Eng Chem Res 48(13):6022–6033
Mendez CA, Cerdá J, Grossmann I, Harjunkoski I, Fahl M (2006) State-of-the-art review of optimization methods for short-term scheduling of batch processes. Comput Chem Eng 30:913–946
Scattolini R (2009) Architectures for distributed and hierarchical model predictive control – a review. J Process Control 19:723–731
Scholten B (2009) MES guide for executives: why and how to select, implement, and maintain a manufacturing execution system. ISA, Research Triangle Park. ISBN 978-1-936007-03-5
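The price-coordination mechanism described above can be sketched in a few lines: each local controller solves its own problem for a given price, and the upper layer adjusts the price until a shared-resource budget is met. The quadratic costs, the budget, and the step size below are hypothetical choices for illustration, not from this entry:

```python
# Price coordination (dual decomposition) between two local optimizers.
# Hypothetical quadratic costs; the upper layer only adjusts the price.

def local_controller(target, price):
    # Each local problem min (u - target)^2 + price*u has the
    # closed-form minimizer u = target - price/2.
    return target - price / 2.0

price = 0.0
for _ in range(200):
    u1 = local_controller(3.0, price)
    u2 = local_controller(2.0, price)
    excess = u1 + u2 - 4.0          # violation of the shared-resource budget
    price += 0.5 * excess           # market-like price update

print(round(u1, 3), round(u2, 3), round(price, 3))
```

At the equilibrium price, the independently computed local decisions satisfy the coupling constraint, which is the essence of coordination by prices.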
Skogestad S (2000) Plantwide control: the search for the self-optimizing control structure. J Process Control 10:487–507

Control of Biotechnological Processes

Rudibert King
Technische Universität Berlin, Berlin, Germany

Abstract

Closed-loop control can significantly improve the performance of bioprocesses, e.g., by an increase of the production rate of a target molecule or by guaranteeing reproducibility of the production with low variability. In contrast to the control of chemical reaction systems, the biological reactions take place inside cells which constitute highly regulated, i.e., internally controlled, systems by themselves. As a result, through evolution, the same cell can and will mimic a system of first order in some situations and a high-dimensional, highly nonlinear system in others. A complete mathematical description of the possible behaviors of the cell is still beyond reach and would be far too complicated as a basis for model-based process control. This makes supervision, control, and optimization of biosystems very demanding.

Keywords

Bioprocess control; Control of uncertain systems; Optimal control; Parameter identification; State estimation; Structure identification; Structured models

Introduction

Biotechnology offers solutions to a broad spectrum of challenges faced today, e.g., for health care, remediation of environmental pollution, new sources for energy supplies, sustainable food production, and the supply of bulk chemicals. To explain the needs for control of bioprocesses, especially for the production of high-value and/or large-volume compounds, it is instructive to have a look at the development of a new process. If a potential strain is found or genetically engineered, the biologist will determine favorable environmental factors for the growth of the cells and the production of the target product. These factors typically comprise the levels of temperature, pH, dissolved oxygen, etc. Moreover, concentration regions for the nutrients, precursors, and so-called trace elements are specified. Whereas for the former variables often "optimal" setpoints are provided which, at least in smaller-scale reactors, can be easily maintained by independent, classically designed controllers, information about the best nutrient supply is incomplete from a control engineering point of view. It is this dynamic nutrient supply which is most often not revealed in the biological laboratory and which, however, offers substantial room for production improvements by control.

Irrespective of whether bacteria, yeasts, fungi, or animal cells are used for production, these cells will consist of thousands of different compounds which react with each other in hundreds or more reactions. All reactions are tightly regulated on a molecular and genetic basis; see Deterministic Description of Biochemical Networks. For so-called unlimited growth conditions, all cellular compartments will be built up with the same specific growth rate, meaning that the cellular composition will not change over time. In a mathematical model describing growth and production, only one state variable will be needed to describe the biotic phase. This will give rise to unstructured models; see below. Whenever a cell enters a limitation, which is often needed for production, the cell will start to reorganize its internal reaction pathways. Model-based approaches of supervision and control based on unstructured models are now bound to fail. More biotic state variables are needed. However, it is not clear which and how many. As a result, modeling of limiting behaviors is challenging and crucial for the control of biotechnological processes. It requires a large amount of process-specific information. Moreover, model-based estimates of
the state of the cell and of the environment are a key factor as online measurements of the internal processes in the cell and of the nutrient concentrations are usually impossible. Finally, as models used for process control have to be limited in size and thus only give an approximative description, robustness of the methods has to be addressed.

Mathematical Models

for pH or antifoam control, are added with variable rates leading to an unsteady behavior. The system to be modeled consists of the gaseous, the liquid, and the biotic phase inside the reactor. For the former ones, balance equations can be formulated readily. The biotic phase can be modeled in a structured or unstructured way. Moreover, as not all cells behave similarly, this may give rise to a segregated model formulation which is omitted here for brevity.

Control of Biotechnological Processes, Fig. 1 Modern laboratory reactor platform for control-oriented process development
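The unstructured description mentioned in the introduction (a single biotic state variable) can be made concrete with a minimal Monod-type batch model. All parameter values below are hypothetical, chosen only to show substrate-limited growth; they are not from this entry:

```python
# Unstructured batch model: one biotic state (biomass X) and one substrate S.
# Monod kinetics with hypothetical parameters; explicit Euler integration.

mu_max, K_s, Y_xs = 0.5, 0.1, 0.5   # 1/h, g/l, g biomass per g substrate
X, S = 0.1, 10.0                    # initial biomass and substrate, g/l
dt = 0.01                           # h
for _ in range(int(24 / dt)):       # 24 h batch phase
    mu = mu_max * S / (K_s + S)     # specific growth rate
    dX = mu * X                     # biomass growth
    dS = -mu * X / Y_xs             # substrate consumption
    X += dt * dX
    S += dt * dS
    S = max(S, 0.0)                 # substrate cannot become negative

print(round(X, 2), round(S, 2))
```

Once the substrate is exhausted, growth stops and the single state X saturates near its yield-limited value; describing what the cells do after entering this limitation is exactly where unstructured models break down.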
added that describe the dependencies between individual fluxes. Then at least part of A is known. Describing the biotic phase with a higher degree of granularity does not change the measurement situation in the laboratory or in the production scale, i.e., still only very few online measurements will be available for control.

Identification

Even if the growth medium initially "only" consists of some 10–20 different, chemically well-defined substances, of which only a few are described in the model, this situation will change over the cultivation time as the organisms release further compounds of which only a few may be known. If, for economic reasons, complex raw materials are used, even the initial composition is unknown. Hence, measuring the concentrations of some of the compounds of the large set of substances as a basis for modeling is not trivial. For structured models, intracellular substances have to be determined additionally. These are embedded in an even larger matrix of compounds making chemical analysis more difficult. Therefore, the basis for parameter and structure identification is uncertain.

As the expensive experiments and chemical analysis tasks are very time consuming, sometimes lasting up to several weeks, methods of optimal experimental design should always be considered in biotechnology; see Experiment Design and Identification for Control.

The models to be built up should possess some predictive capability for a limited range of environmental conditions. This rules out unstructured models for many practical situations. However, for process control, the models should still be of manageable complexity. Medium-sized structured models seem to be well suited for such a situation. The choice of biotic states in m and possible structures for the reaction rates, however, is hardly supported by biological or chemical evidence. As a result, a combined structure and parameter identification problem has to be solved. The choices of possible terms in all the reaction rates give rise to a problem that exhibits a combinatorial explosion. Although approaches exist to support this modeling step (see Herold and King 2013 or Mangold et al. 2005), finally the modeler will have to settle for a compromise with respect to the accuracy of the model found versus the number of fully identified model candidates. As a result, all control methods applied should be robust in some sense.

Soft Sensors

Despite many advances in the development of online measurements (see Mandenius and Titchener-Hooker 2013), systems for supervision and control of biotechnical processes often include model-based estimation schemes, such as extended Kalman filters (EKF); see Kalman Filters. Concentration estimates are needed for unmeasured substances and for quantities which depend on these concentrations like the growth rate of the cells. In real applications, formulations have to be used which account for delays in laboratory analysis of up to several hours and for situations in which results from the laboratory will not be available in the same sequence as the samples were taken. An example from a real cultivation is shown in Fig. 2. Here, the at-line measurement of the biomass concentration, cX = mX/V, is the only measurement available. The result of a single measurement is obtained about 30 min after sampling. For reference, inaccessible state variables, which were analyzed later, are shown as well along with the online estimates. The scatter of the data, especially of DNA and RNA, gives a qualitative impression of the measurement accuracy in biotechnology.

Control of Biotechnological Processes, Fig. 2 Estimation of states of a structured model with an EKF with an unexpected growth delay initially. At-line measurement mX (red filled circles), initially predicted evolution of states (black), online estimated evolution (blue), off-line data analyzed after the experiment (open red circles) (Data obtained by T. Heine) [panels: mProtein, mDNA, mRNA, mX, mGlucose in g versus time in h]

Control

Besides the relatively simple control of physical parameters, such as temperature, pH, dissolved oxygen, or carbon dioxide concentration, only few biotic variables are typically controlled with respect to a setpoint. The most prominent example is the growth rate of the biomass with the goal to reach a high cell concentration in the reactor as fast as possible. This is the predominant goal when the cells are the primary target as in baker's yeast cultivations or when the expression of the desired product is growth associated. For other non-growth-associated products, a high cell mass is desirable as well, as production is proportional to the amount of cells. If the nutrient supply is maintained above a certain level, unlimited growth behavior results, allowing the use of unstructured models for model-based control. An excess of nutrients has to be avoided, though, as some organisms, like baker's yeast, will initiate an overflow metabolism, with products which may be inhibitory in later stages of the cultivation. For some products, such as the antibiotic penicillin, the organism has to grow slowly to obtain a high production rate. For these so-called secondary metabolites, low but not vanishing concentrations for some limiting substrates are needed. If setpoints are given for these concentrations instead, this can pose a rather challenging control problem. As the organisms try to grow exponentially, the controller must be able to increase the feed exponentially as well. The difficulty mainly arises from the inaccurate and infrequent measurements that the soft sensors/controller has to work with and from the danger that an intermediate shortage or oversupply with nutrients may switch the metabolism to an undesired state of low productivity.

For control of biotechnical processes, many methods explained in this encyclopedia including feedforward, feedback, model-based, optimal, adaptive, fuzzy, neural nets, etc., can be and have been used (cf. Dochain 2008; Gnoth et al. 2008; Rani and Rao 1999). As in other areas of application, (robust) model-predictive control schemes (MPC) (see Industrial MPC of Continuous Processes) are applied with great success in biotechnology.
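The exponential feeding problem sketched above can be illustrated with a toy adaptation loop: the specific feed rate is corrected from the growth rate observed between two at-line measurements, so that the culture settles at the desired specific growth rate despite a wrong assumed yield. The model, sampling interval, and gain are hypothetical illustrations, not from this entry:

```python
# Growth-rate control with infrequent at-line biomass measurements.
# The specific feed rate u (substrate per biomass and hour) is adapted
# from the growth rate observed between two measurements.

import math

mu_set = 0.15        # desired specific growth rate, 1/h
Y_assumed = 0.5      # yield assumed by the controller, g/g
Y_true = 0.4         # actual yield of the culture (model mismatch)
delta = 0.5          # h between at-line measurements
m = 1.0              # g biomass

u = mu_set / Y_assumed                       # feed-forward starting guess
m_prev = m
for k in range(30):
    mu_actual = Y_true * u                   # substrate-limited growth
    m = m_prev * math.exp(mu_actual * delta)
    mu_meas = math.log(m / m_prev) / delta   # rate seen by the soft sensor
    u += 1.0 * (mu_set - mu_meas)            # integral-type correction
    m_prev = m

print(round(mu_meas, 4))
```

Without the correction step, the mismatch in yield would make the culture grow permanently below the setpoint; the feedback from the (here noise-free) measurements removes the offset, while the exponential extrapolation between samples is what allows the feed itself to grow exponentially.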
Control of Biotechnological Processes, Fig. 3 MPC control and state estimation of a cultivation with S. tendae. At-line measurement mX (red filled circles), initially predicted evolution of states (black), online estimated evolution (blue), off-line data analyzed after the experiment (open red circles). Off-line optimal feeding profiles ui (blue broken line), MPC-calculated feeds (black, solid) (Data obtained by T. Heine) [panels: VBase, mNi, cX, cRNA, cDNA, cPr, cAm, cPh, cC, uAm, uPh, uC versus time in h]
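Why on-line re-optimization outperforms a purely off-line profile can be seen on a deliberately crude example: the same wrong model is used in both cases, but the receding-horizon variant restarts from the measured state at every step. Plant, reference, and the one-step horizon are hypothetical choices, not taken from the experiment of Fig. 3:

```python
# Off-line optimal profile vs. on-line recomputation (receding horizon)
# on a toy integrator plant with a mismatched gain.

dt = 0.1
theta_true, theta_nom = 0.7, 1.0     # true vs. nominal plant gain
ref = [dt * k for k in range(101)]   # ramp reference over 10 s

# Open loop: input profile computed once from the nominal model.
x = 0.0
u_plan = [(ref[k + 1] - ref[k]) / (dt * theta_nom) for k in range(100)]
for k in range(100):
    x += dt * theta_true * u_plan[k]
err_open = abs(ref[100] - x)

# Receding horizon: at each step, recompute the input from the
# currently measured state, using the same wrong model.
x = 0.0
for k in range(100):
    u = (ref[k + 1] - x) / (dt * theta_nom)
    x += dt * theta_true * u
err_closed = abs(ref[100] - x)

print(round(err_open, 3), round(err_closed, 4))
```

The off-line plan accumulates the full model error over the batch, while the re-optimizing scheme keeps the tracking error bounded, which is the point made in the text about complementing trajectory planning with closed-loop concepts.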
For the antibiotic production shown in Fig. 3, optimal feeding profiles ui for ammonia (AM), phosphate (PH), and glucose (C) were calculated before the experiment was performed in a trajectory optimization such that the final mass of the desired antibiotic nikkomycin (Ni) was maximized. This resulted in the blue broken lines for the feeds ui. However, due to disturbances and model inaccuracies, an MPC scheme had to significantly change the feeding profiles to actually obtain this high amount of nikkomycin; see the feeding profiles given in black solid lines. This example shows that, especially in biotechnology, off-line trajectory planning has to be complemented by closed-loop concepts.

On the other hand, the experimental data given in Fig. 2 show that significant disturbances, such as an unexpected initial growth delay, may occur in real systems as well. For this reason, the classical receding horizon MPC with an off-line determined optimal reference trajectory will not always be the best solution, and an online optimization over the whole horizon has a larger potential (cf. Kawohl et al. 2007).

Summary and Future Directions

Advanced process control including soft sensors can significantly improve biotechnical processes. Using these techniques promotes quality and reproducibility of processes (Junker and Wang 2006). These methods should, however, not only be exploited at the production scale. For new pharmaceutical products, the time to market is the decisive factor. Methods of (model-based) monitoring and control can help here to speed up process development. For a few years now, a clear trend can be seen in biotechnology to miniaturize and parallelize process development using multi-fermenter systems and robotic technologies. This trend gives rise to new challenges for modeling on the basis of huge data sets and for control at very small scales. At the same time, it is expected that a continued increase of information from bioinformatic tools will be available which has to be utilized for process control as well. Going to large-scale cultivations adds further spatial dimensions to the problem. Now, the assumption of a well-stirred, ideally mixed reactor no longer holds. Substrate concentrations will be space dependent. Cells will experience changing good and bad nutrient environments frequently. Thus, mass transfer has to be accounted for, leading to partial differential equations as models for the process.

Cross-References

Control and Optimization of Batch Processes
Deterministic Description of Biochemical Networks
Extended Kalman Filters
Experiment Design and Identification for Control
Industrial MPC of Continuous Processes
Nominal Model-Predictive Control
Nonlinear System Identification: An Overview of Common Approaches

Bibliography

Bastin G, Dochain D (2008) On-line estimation and adaptive control of bioreactors. Elsevier, Amsterdam
Dochain D (2008) Bioprocess control. ISTE, London
Gnoth S, Jentzsch M, Simutis R, Lübbert A (2008) Control of cultivation processes for recombinant protein production: a review. Bioprocess Biosyst Eng 31:21–39
Goudar C, Biener R, Zhang C, Michaels J, Piret J, Konstantinov K (2006) Towards industrial application of quasi real-time metabolic flux analysis for mammalian cell culture. Adv Biochem Eng Biotechnol 101:99–118
Herold S, King R (2013) Automatic identification of structured process models based on biological phenomena detected in (fed-)batch experiments. Bioprocess Biosyst Eng. doi:10.1007/s00449-013-1100-6
Junker BH, Wang HY (2006) Bioprocess monitoring and computer control: key roots of the current PAT initiative. Biotechnol Bioeng 95:226–261
Kawohl M, Heine T, King R (2007) Model-based estimation and optimal control of fed-batch fermentation processes for the production of antibiotics. Chem Eng Process 11:1223–1241
King R (1997) A structured mathematical model for a class of organisms: part 1–2. J Biotechnol 52:219–244
Mandenius CF, Titchener-Hooker NJ (eds) (2013) Measurement, monitoring, modelling and control of bioprocesses. Advances in biochemical engineering/biotechnology, vol 132. Springer, Heidelberg
Mangold M, Angeles-Palacios O, Ginkel M, Waschler R, Kinele A, Gilles ED (2005) Computer aided modeling of chemical and biological systems – methods, tools, and applications. Ind Eng Chem Res 44:2579–2591
Rani KY, Rao VSR (1999) Control of fermenters. Bioprocess Eng 21:77–88
Varner J, Ramkrishna D (1998) Application of cybernetic models to metabolic engineering: investigation of storage pathways. Biotech Bioeng 58:282–291

Control of Fluids and Fluid-Structure Interactions

Jean-Pierre Raymond
Institut de Mathématiques, Université Paul Sabatier Toulouse III & CNRS, Toulouse Cedex, France

Abstract

We introduce control and stabilization issues for fluid flows along with known results in the field. Some models coupling fluid flow equations and equations for rigid or elastic bodies are presented, together with a few controllability and stabilization results.

Keywords

Control; Fluid flows; Fluid-structure systems; Stabilization

Some Fluid Models

We consider a fluid flow occupying a bounded domain Ω_F ⊂ R^N, with N = 2 or N = 3, at the initial time t = 0, and a domain Ω_F(t)
at time t > 0. Let us denote by ρ(x, t) ∈ R⁺ the density of the fluid at time t at the point x ∈ Ω_F(t) and by u(x, t) ∈ R^N its velocity. The fluid flow equations are derived by writing the mass conservation

    ∂ρ/∂t + div(ρu) = 0 in Ω_F(t), for t > 0,   (1)

and the balance of momentum

    ρ(∂u/∂t + (u·∇)u) = div σ + ρf in Ω_F(t), for t > 0,   (2)

where σ is the so-called constraint tensor and f represents a volumic force. For an isothermal fluid, there is no need to complete the system by the balance of energy. The physical nature of the fluid flow is taken into account in the choice of the constraint tensor σ. When the volume is preserved by the fluid flow transport, the fluid is called incompressible. The incompressibility condition reads as div u = 0 in Ω_F(t). The incompressible Navier-Stokes equations are the classical model to describe the evolution of isothermal, incompressible, and Newtonian fluid flows. When in addition the density of the fluid is assumed to be constant, ρ(x, t) = ρ₀, the equations reduce to

    div u = 0,
    ρ₀(∂u/∂t + (u·∇)u) = ν Δu − ∇p + ρ₀ f in Ω_F(t), t > 0,   (3)

which are obtained by setting

    σ = ν(∇u + (∇u)ᵀ) − (2ν/3)(div u) I − p I,   (4)

in Eq. (2). When div u = 0, the expression of σ simplifies. The coefficient ν > 0 is the viscosity coefficient of the fluid, and p(x, t) its pressure at the point x ∈ Ω_F(t) and at time t > 0.

This model has to be completed with boundary conditions on ∂Ω_F(t) and an initial condition at time t = 0.

The incompressible Euler equations with constant density are obtained by setting ν = 0 in the above system.

The compressible Navier-Stokes system is obtained by coupling the equation of conservation of mass Eq. (1) with the balance of momentum Eq. (2), where the tensor σ is defined by Eq. (4), and by completing the system with a constitutive law for the pressure.

Control Issues

There are unstable steady states of the Navier-Stokes equations which give rise to interesting control problems (e.g., to maximize the ratio "lift over drag"), but which cannot be observed in real life because of their unstable nature. In such situations, we would like to maintain the physical model close to an unstable steady state by the action of a control expressed in feedback form, that is, as a function either depending on an estimation of the velocity or depending on the velocity itself. The estimation of the velocity of the fluid may be recovered by using some real-time measurements. In that case, we speak of a feedback stabilization problem with partial information. Otherwise, when the control is expressed in terms of the velocity itself, we speak of a feedback stabilization problem with full information.

Another interesting issue is to maintain a fluid flow (described by the Navier-Stokes equations) in the neighborhood of a nominal trajectory (not necessarily a steady state) in the presence of perturbations. This is a much more complicated issue which is not yet solved.

In the case of a perturbation in the initial condition of the system (the initial condition at time t = 0 is different from the nominal velocity field at time t = 0), the exact controllability to the nominal trajectory consists in looking for controls driving the system in finite time to the desired trajectory.

Thus, control issues for fluid flows are those encountered in other fields. However there are
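The simplification of σ under the incompressibility constraint, mentioned above, can be spelled out in one line (a short derivation in the notation of this entry):

```latex
\operatorname{div} u = 0 \;\Longrightarrow\;
\sigma = \nu\left(\nabla u + (\nabla u)^{T}\right) - p\,I,
\qquad
\operatorname{div}\sigma
  = \nu\,\Delta u + \nu\,\nabla(\operatorname{div} u) - \nabla p
  = \nu\,\Delta u - \nabla p,
```

so that, with ρ = ρ₀, Eq. (2) becomes the momentum equation of Eq. (3).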
specific difficulties which make the corresponding problems challenging. When we deal with the incompressible Navier-Stokes system, the pressure plays the role of a Lagrange multiplier associated with the incompressibility condition. Thus, we have to deal with an infinite-dimensional nonlinear differential algebraic system. In the case of a Dirichlet boundary control, the elimination of the pressure, by using the so-called Leray or Helmholtz projector, leads to an unusual form of the corresponding control operator; see Raymond (2006). In the case of an internal control, the estimation of the pressure to prove observability inequalities is also quite tricky; see Fernandez-Cara et al. (2004). From the numerical viewpoint, the approximation of feedback control laws leads to very large-size problems, and new strategies have to be found for tackling these issues.

Let us now describe the known results for the incompressible Navier-Stokes equations in 2D or 3D bounded domains, with a control acting locally in a Dirichlet boundary condition. Let us consider a given steady state (u_s, p_s) satisfying the equation

    −ν Δu_s + (u_s·∇)u_s + ∇p_s = f_s and div u_s = 0 in Ω_F,

with some boundary conditions which may be of Dirichlet type or of mixed type (Dirichlet-Neumann-Navier type). For simplicity, we only deal with the case of Dirichlet boundary conditions

    u_s = g_s on ∂Ω_F,

where g_s and f_s are time-independent functions. In the case Ω_F(t) = Ω_F, not depending on t, the corresponding instationary model is

    ∂u/∂t − ν Δu + (u·∇)u + ∇p = f_s and div u = 0 in Ω_F × (0, ∞),
    u = g_s + Σ_{i=1}^{N_c} f_i(t) g_i on ∂Ω_F × (0, ∞),   (5)
    u(0) = u₀ in Ω_F.

In this model, we assume that u₀ ≠ u_s, g_i are given functions with localized supports in ∂Ω_F, and f(t) = (f₁(t), ..., f_{N_c}(t)) is a finite-dimensional control. Due to the incompressibility condition, the functions g_i have to satisfy

    ∫_{∂Ω_F} g_i · n = 0,

for some norm Z, provided ‖u₀ − u_s‖_Z is small enough and where φ is a nondecreasing function. The mapping K, called the feedback gain, may be chosen linear.

The usual procedure to solve this stabilization problem consists in writing the system satisfied by u − u_s, in linearizing this system, and in looking for a feedback control stabilizing this linearized model. The issue is first to study the stabilizability of the linearized model and, when it is stabilizable, to find a stabilizing feedback gain. Among the feedback gains that stabilize the linearized model, we have to find one able to stabilize, at least locally, the nonlinear system too.

The linearized controlled system associated with Eq. (5) is
Additional Controllability Results for Other Fluid Flow Models

Γ_e = ∂Ω_F(t) \ ∂Ω_S(t). The boundary condition on Γ_e × (0, T) may be of the form

    u = m_c f on Γ_e × (0, ∞),   (17)

with ∫_{Γ_e} m_c f · n = 0, f is a control, and m_c a localization function.

An Elastic Beam Located at the Boundary of a Two-Dimensional Domain Filled by an Incompressible Viscous Fluid

When the structure is described by an infinite-dimensional model (a partial differential equation or a system of p.d.e.), there are a few existence results for such systems and mainly existence of weak solutions (Chambolle et al. 2005). But for stabilization and control problems of nonlinear systems, we are usually interested in strong solutions. Let us describe a two-dimensional model in which a one-dimensional structure is located on a flat part Γ_S = (0, L) × {y₀} of the boundary of the reference configuration of the fluid domain Ω_F. We assume that the structure is a Euler-Bernoulli beam with or without damping. The displacement η of the structure in the direction normal to the boundary Γ_S is described by the partial differential equation

    η_tt − bη_xx − cη_txx + aη_xxxx = F in Γ_S × (0, ∞),
    η = 0 and η_x = 0 on ∂Γ_S × (0, ∞),
    η(0) = η₀¹ and η_t(0) = η₀² in Γ_S,   (18)

where η_x, η_xx, and η_xxxx stand for the first, the second, and the fourth derivative of η with respect to x ∈ Γ_S. The other derivatives are defined in a similar way. The coefficients b and c are nonnegative, and a > 0. The term −cη_txx is a structural damping term. At time t, the structure occupies the position Γ_S(t) = {(x, y) | x ∈ (0, L), y = y₀ + η(x, t)}. When Ω_F is a two-dimensional domain, Γ_S is of dimension one, and ∂Γ_S is reduced to the two extremities of Γ_S. The momentum balance is

The domain Ω_S(t) and the flow X_S associated with the motion of the structure obey

obtained by writing that F in Eq. (18) is given by F = √(1 + η_x²) σ(u, p)ñ · n, where ñ(x, y) is the unit normal at (x, y) ∈ Γ_S(t) to Γ_S(t) outward Ω_F(t), and n is the unit normal to Γ_S outward Ω_F(0) = Ω_F. If, in addition, a control f acts as a distributed control in the beam equation, we shall have

    F = √(1 + η_x²) σ(u, p)ñ · n + f.   (19)

The equality of velocities on Γ_S(t) reads as

    u(x, y₀ + η(x, t)) = (0, η_t(x, t)), x ∈ (0, L), t > 0.   (20)

Control of Fluid-Structure Models

To control or to stabilize fluid-structure models, the control may act either in the fluid equation or in the structure equation or in both equations.

There are very few controllability and stabilization results for systems coupling the incompressible Navier-Stokes system with a structure equation. We state below two of those results. Some other results are obtained for simplified one-dimensional models coupling the viscous Burgers equation with the motion of a mass; see Badra and Takahashi (2013) and the references therein.

We also have to mention here recent papers on control problems for systems coupling quasi-stationary Stokes equations with the motion of deformable bodies, modeling microorganism swimmers at low Reynolds number; see Alouges et al. (2008).

Null Controllability of the Navier-Stokes System Coupled with the Motion of a Rigid Body

The system coupling the incompressible Navier-Stokes system Eq. (3) in the domain drawn in Fig. 1, with the motion of a rigid body
Raymond J-P (2007) Feedback boundary stabilization of the three-dimensional incompressible Navier-Stokes equations. J Math Pures Appl 87:627–669
Raymond J-P (2010) Feedback stabilization of a fluid–structure model. SIAM J Control Optim 48:5398–5443
Raymond J-P, Thevenet L (2010) Boundary feedback stabilization of the two dimensional Navier-Stokes equations with finite dimensional controllers. Discret Contin Dyn Syst A 27:1159–1187
Vazquez R, Krstic M (2008) Control of turbulent and magnetohydrodynamic channel flows: boundary stabilization and estimation. Birkhäuser, Boston

Control of Linear Systems with Delays

Wim Michiels
KU Leuven, Leuven (Heverlee), Belgium

Abstract

The presence of time delays in dynamical systems may induce complex behavior, and this behavior is not always intuitive. Even if a system's equation is scalar, oscillations may occur. Time delays in control loops are usually associated with degradation of performance and robustness, but, at the same time, there are situations where time delays are used as controller parameters.

Keywords

Delay differential equations; Delays as controller parameters; Functional differential equation

model transport phenomena and heredity, and they arise as feedback delays in control loops. An overview of applications, ranging from traffic flow control and lasers with phase-conjugate feedback, over (bio)chemical reactors and cancer modeling, to control of communication networks and control via networks, is included in Sipahi et al. (2011).

The aim of this contribution is to describe some fundamental properties of linear control systems subjected to time delays and to outline principles behind analysis and synthesis methods. Throughout the text, the results will be illustrated by means of the scalar system

    ẋ(t) = u(t − τ),   (1)

which, controlled with instantaneous state feedback, u(t) = −kx(t), leads to the closed-loop system

    ẋ(t) = −k x(t − τ).   (2)

Although this didactic example is extremely simple, we shall see that its dynamics are already very rich and shed light on delay effects in control loops.

In some works, the analysis of (2) is called the hot shower problem, as it can be interpreted as an (over)simplified model for a human adjusting the temperature in a shower: x(t) then denotes the difference between the water temperature and the desired temperature as felt by the person, the term −kx(t) models the reaction of the person by further opening or closing taps, and the delay is due to the propagation with finite speed of the water in the ducts.
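System (2) is easy to simulate once the initial function is stored over one delay interval; the following fixed-step sketch (our implementation choices, not the entry's) reproduces the oscillatory, slowly decaying response of Fig. 1 for k = 1, τ = 1:

```python
# Fixed-step simulation of the "hot shower" loop x'(t) = -k x(t - tau),
# with the constant initial function phi = 1 on [-tau, 0].

k, tau, dt = 1.0, 1.0, 0.001
n_delay = int(tau / dt)
history = [1.0] * (n_delay + 1)     # samples of x on [-tau, 0]
xs = []
for _ in range(int(5.0 / dt)):      # integrate up to t = 5
    x_delayed = history[-1 - n_delay]
    x_new = history[-1] + dt * (-k * x_delayed)   # explicit Euler step
    history.append(x_new)
    xs.append(x_new)

print(round(min(xs), 3), round(xs[-1], 3))
```

Even though the equation is scalar, the solution dips well below zero before decaying, the oscillation announced in the abstract; for k*tau > pi/2 the same loop would become unstable.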
    D(A) = {φ ∈ C([−τ, 0]; Rⁿ) : φ̇ ∈ C([−τ, 0]; Rⁿ), φ̇(0) = A₀φ(0) + A₁φ(−τ)},
    Aφ = dφ/dθ.   (5)

The relation between solutions of (3) and (4) is given by z(t)(θ) = x(t + θ), θ ∈ [−τ, 0]. Note that all system information is concentrated in the nonlocal boundary condition describing the domain of A. The representation (4) is closely related to a description by an advection PDE with a nonlocal boundary condition (Krstic 2009).

Control of Linear Systems with Delays, Fig. 1 Solution of (2) for τ = 1, k = 1, and initial condition φ ≡ 1 [plot of x versus t]

Asymptotic Growth Rate of Solutions and Stability

The reformulation of (3) into the standard form (4) allows us to define stability notions and to generalize the stability theory for ordinary differential equations in a straightforward way, with the main change that the state space is C([−τ, 0]; Rⁿ). For example, the null solution of (3) is exponentially stable if and only if there exist constants C > 0 and γ > 0 such that

    ∀φ ∈ C([−τ, 0]; Rⁿ): ‖x_t(φ)‖_s ≤ C e^{−γt} ‖φ‖_s,

where ‖·‖_s is the supremum norm, ‖φ‖_s = sup_{θ∈[−τ, 0]} ‖φ(θ)‖₂. As the system is linear,
Asymptotic stability and exponential stability are equivalent. A direct generalization of Lyapunov's second method yields:

Theorem 1 The null solution of linear system (3) is asymptotically stable if there exist a continuous functional V : C([−τ, 0]; ℝⁿ) → ℝ⁺ (a so-called Lyapunov-Krasovskii functional) and continuous nondecreasing functions u, v, w : ℝ⁺ → ℝ⁺ with

u(0) = v(0) = w(0) = 0 and u(s) > 0, v(s) > 0, w(s) > 0 for s > 0,

such that for all φ ∈ C([−τ, 0]; ℝⁿ)

u(‖φ(0)‖₂) ≤ V(φ) ≤ v(‖φ‖_s),
V̇(φ) ≤ −w(‖φ(0)‖₂),

where

V̇(φ) = lim sup_{h→0⁺} (1/h) [V(x_h(φ)) − V(φ)].

Converse Lyapunov theorems and the construction of the so-called complete-type Lyapunov-Krasovskii functionals are discussed in Kharitonov (2013). Imposing a particular structure on the functional, e.g., a form depending only on a finite number of free parameters, often leads to easy-to-check stability criteria (for instance, in the form of LMIs), yet, as a price to pay, the obtained results may be conservative, in the sense that the sufficient stability conditions might not be close to necessary conditions. As an alternative to Lyapunov functionals, Lyapunov functions can be used as well, provided that the condition V̇ < 0 is relaxed (the so-called Lyapunov-Razumikhin approach); see, for example, Gu et al. (2003).

Delay Differential Equations as Perturbation of ODEs

Many results on stability, robust stability, and control of time-delay systems are explicitly or implicitly based on a perturbation point of view, where delay differential equations are seen as perturbations of ordinary differential equations. For instance, in the literature, a classification of stability criteria is often presented in terms of delay-independent criteria (conditions holding for all values of the delays) and delay-dependent criteria (usually holding for all delays smaller than a bound). This classification has its origin in two different ways of seeing (3) as a perturbation of an ODE, with nominal system ẋ(t) = A₀x(t) and ẋ(t) = (A₀ + A₁)x(t) (the system for zero delay), respectively. This observation is illustrated in Fig. 2 for results based on input-output- and Lyapunov-based approaches.

The Spectrum of Linear Time-Delay Systems

Two Eigenvalue Problems
The substitution of an exponential solution in (3) leads us to the nonlinear eigenvalue problem

(λI − A₀ − A₁e^{−λτ})v = 0,  λ ∈ ℂ, v ∈ ℂⁿ, v ≠ 0.  (6)

The solutions of the equation det(λI − A₀ − A₁e^{−λτ}) = 0 are called characteristic roots. Similarly, formulation (4) leads to the equivalent infinite-dimensional linear eigenvalue problem

(λI − A)u = 0,  λ ∈ ℂ, u ∈ C([−τ, 0]; ℂⁿ), u ≢ 0.  (7)

The combination of these two viewpoints lies at the basis of most methods for computing characteristic roots; see Michiels (2012). On the one hand, discretizing (7), i.e., approximating A with a matrix, and solving the resulting standard eigenvalue problem allows one to obtain global information, for example, estimates of all characteristic roots in a given compact set or in a given right half plane. On the other hand, the (finitely many) nonlinear equations (6) allow one to make local corrections on characteristic root approximations up to the desired accuracy, e.g., using Newton's method or inverse residual iteration. Linear time-delay systems satisfy a spectrum-determined growth property of solutions. For instance, the zero solution of (3) is asymptotically stable if and only if all characteristic roots are in the open left half plane.
170 Control of Linear Systems with Delays
e−λτ I e−λτ−1
I
λ
e−jωτ−1
e−jωτ = 1 ≤τ
jω
Lyapunov setting:
V = xT Px + ... V = xT Px + ...
where where
A0T P + PA0 < 0 (A0 + A1)T P + P (A0 + A1) < 0
Control of Linear Systems with Delays, Fig. 2 The classification of stability criteria in delay-independent results and
delay-dependent results stems from two different perturbation viewpoints. Here, perturbation terms are printed in red
kτ=1
100 1.5
80
1
60
40 0.5
20
0
ℜ(λτ)
ℑ(λτ)
0
−20 −0.5
−40 −1
−60
−1.5
−80
−100 −2
−4.5 −4 −3.5 −3 −2.5 −2 −1.5 −1 −0.5 0 0 2 4 6 8
ℜ(λτ) kτ
Control of Linear Systems with Delays, Fig. 3 (Left) Rightmost characteristic roots of (2) for k D 1. (Right) Real
parts of rightmost characteristic roots as a function of k
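The two bounds visible in the Fig. 2 residue — |e^{−jωτ}| = 1, used by the delay-independent viewpoint, and |e^{−jωτ} − 1| ≤ ωτ, used by the delay-dependent one — can be spot-checked numerically; this tiny script is an added illustration, not part of the entry.

```python
import cmath

for omega in (0.1, 1.0, 5.0):
    for tau in (0.5, 1.0, 2.0):
        z = cmath.exp(-1j * omega * tau)
        # delay-independent viewpoint: the delay term has unit modulus
        assert abs(abs(z) - 1.0) < 1e-12
        # delay-dependent viewpoint: |e^{-jwt} - 1| = 2|sin(wt/2)| <= wt
        assert abs(z - 1.0) <= omega * tau + 1e-12
print("both bounds hold on the sample grid")
```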
In Fig. 3 (left), the rightmost characteristic roots of (2) are depicted for kτ = 1. Note that since the characteristic equation can be written as λτ + kτ e^{−λτ} = 0, k and τ can be combined into one parameter. In Fig. 3 (right), we show the real parts of the characteristic roots as a function of kτ. The plots illustrate some important spectral properties of retarded-type FDEs. First, even though there are in general infinitely many characteristic roots, the number of them in any right half plane is always finite. Second, the individual characteristic roots, as well as the spectral abscissa, i.e., the supremum of the real parts of all characteristic roots, continuously depend on parameters. Related to this, a loss or gain of stability is always associated with characteristic roots crossing the imaginary axis. Figure 3 (right) also illustrates the transition to a delay-free system as kτ → 0⁺.

Critical Delays: A Finite-Dimensional Characterization
Assume that for a given value of k, we are looking for values of the delay τ_c for which (2)
has a characteristic root jω_c on the imaginary axis. From jω = −k e^{−jωτ}, we get

ω_c = k,  τ_c = (π/2 + 2lπ)/ω_c,  l = 0, 1, …,  ℜ(dλ/dτ)|_{(τ_c, jω_c)} = ω_c²/(1 + τ_c²ω_c²) > 0.  (8)

Critical delay values τ_c are indicated with green circles on Fig. 3 (right). The above formulas first illustrate an invariance property of imaginary axis roots and their crossing direction with respect to delay shifts of 2π/ω_c. Second, the number of possible values of ω_c is one and thus finite. More generally, substituting λ = jω in (6) and treating τ as a free parameter leads to a two-parameter eigenvalue problem.

…spectral abscissa, is given by −1/τ; hence, large delays can only be tolerated at the price of a degradation of the rate of convergence. It should be noted that the limitations induced by delays are even more stringent if the uncontrolled systems are exponentially unstable, which is not the case for (2). The analysis in the previous sections gives a hint why control is difficult in the presence of delays: the system is inherently infinite dimensional. As a consequence, most control design problems which involve determining a finite number of parameters can be interpreted as reduced-order control design problems or as control design problems for under-actuated systems, which both are known to be hard problems.
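A quick numerical check of the crossing formulas in (8) (an added illustration, assuming k = 1): each τ_c should place a root exactly at jω_c, and the implicit derivative of the scalar characteristic equation, dλ/dτ = −λ²/(1 + τλ) at a root, should have the stated positive real part.

```python
import cmath
import math

k = 1.0
omega_c = k                                   # crossing frequency from (8)
for l in range(4):
    tau_c = (math.pi / 2 + 2 * math.pi * l) / omega_c
    lam = 1j * omega_c
    # lam is a characteristic root: lam + k*exp(-lam*tau_c) = 0
    assert abs(lam + k * cmath.exp(-lam * tau_c)) < 1e-9
    # crossing direction: Re(dlam/dtau) = omega_c^2 / (1 + tau_c^2 * omega_c^2) > 0
    dlam = -lam**2 / (1 + tau_c * lam)
    assert abs(dlam.real - omega_c**2 / (1 + tau_c**2 * omega_c**2)) < 1e-12
    assert dlam.real > 0
print("all sampled critical delays verified")
```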
by the red circle. In case of multiple controller parameters, the path of steepest descent in the parameter space typically has phases along a manifold characterized by the non-differentiability of the objective function.

Using Delays as Controller Parameters
In contrast to the detrimental effects of delays, there are situations where delays have a beneficial effect and are even used as controller parameters; see Sipahi et al. (2011). For instance, delayed feedback can be used to stabilize oscillatory systems, where the delay serves to adjust the phase in the control loop. Delayed terms in control laws can also be used to approximate derivatives in the control action. Control laws which depend on the difference x(t) − x(t − τ), the so-called Pyragas-type feedback, have the property that the position of equilibria and the shape of periodic orbits with period τ are not affected, in contrast to their stability properties. Last but not least, delays can be used in control schemes to generate predictions or to stabilize predictors, which allow one to compensate delays and improve performance (Krstic 2009; Zhong 2006). Let us illustrate the main idea once more with system (1).
System (1) has a special structure, in the sense that the delay is only in the input, and it is advantageous to exploit this structure in the context of control. Coming back to the didactic example, the person who is taking a shower is – possibly after some bad experiences – aware of the delay and will take into account his/her prediction of the system's reaction when adjusting the cold and hot water supply. Let us, to conclude, formalize this. The uncontrolled system can be rewritten as ẋ(t) = v(t), where v(t) = u(t − τ). We know u up to the current time t; thus, we know v up to time t + τ, and if x(t) is also known, we can predict the value of x at time t + τ,

x_p(t + τ) = x(t) + ∫_t^{t+τ} v(s) ds = x(t) + ∫_{t−τ}^{t} u(s) ds,

and use the predicted state for feedback. With the control law u(t) = −k x_p(t + τ), there is only one closed-loop characteristic root at λ = −k, i.e., as long as the model used in the predictor is exact, the delay in the loop is compensated by the prediction. For further reading on prediction-based controllers, see, e.g., Krstic (2009) and the references therein.

Conclusions

Time-delay systems, which appear in a large number of applications, are a class of infinite-dimensional systems, resulting in rich dynamics and challenges from a control point of view. The different representations and interpretations and, in particular, the combination of viewpoints lead to a wide variety of analysis and synthesis tools.

Cross-References

Control of Nonlinear Systems with Delays
H-Infinity Control
H2 Optimal Control
Optimization-Based Control Design Techniques and Tools

Bibliography

Gu K, Kharitonov VL, Chen J (2003) Stability of time-delay systems. Birkhäuser, Basel
Kharitonov VL (2013) Time-delay systems. Lyapunov functionals and matrices. Birkhäuser, Basel
Krstic M (2009) Delay compensation for nonlinear, adaptive, and PDE systems. Birkhäuser, Basel
Michiels W (2012) Design of fixed-order stabilizing and H2–H∞ optimal controllers: an eigenvalue optimization approach. In: Time-delay systems: methods, applications and new trends. Lecture notes in control and information sciences, vol 423. Springer, Berlin/Heidelberg, pp 201–216
Michiels W, Niculescu S-I (2007) Stability and stabilization of time-delay systems: an eigenvalue based approach. SIAM, Philadelphia
Niculescu S-I (2001) Delay effects on stability: a robust control approach. Lecture notes in control and information sciences, vol 269. Springer, Berlin/New York
Richard J-P (2003) Time-delay systems: an overview of recent advances and open problems. Automatica 39(10):1667–1694
Sipahi R, Niculescu S, Abdallah C, Michiels W, Gu K (2011) Stability and stabilization of systems with time-delay. IEEE Control Syst Mag 31(1):38–65
Zhong Q-C (2006) Robust control of time-delay systems. Springer, London

Control of Machining Processes

Kaan Erkorkmaz
Department of Mechanical & Mechatronics Engineering, University of Waterloo, Waterloo, ON, Canada

Abstract

Control of machining processes encompasses a broad range of technologies and innovations, ranging from optimized motion planning and servo drive loop design, to on-the-fly regulation of cutting forces and power consumption, to applying control strategies for damping out chatter vibrations caused by the interaction of the chip generation mechanism with the machine tool structural dynamics. This article provides a brief introduction to some of the concepts and technologies associated with machining process control.

Keywords

Adaptive control; Chatter vibrations; Feed drive control; Machining; Trajectory planning

Introduction

Machining is used extensively in the manufacturing industry as a shaping process, where high product accuracy, quality, and strength are required. From automotive and aerospace components, to dies and molds, to biomedical implants, and even mobile device chassis, many manufactured products rely on the use of machining.
Machining is carried out on machine tools, which are multi-axis mechatronic systems designed to provide the relative motion between the tool and workpiece, in order to facilitate the desired cutting operation. Figure 1 illustrates a single axis of a ball screw-driven machine tool, performing a milling operation. Here, the cutting process is influenced by the motion of the servo drive. The faster the part is fed in towards the rotating cutter, the larger the cutting forces become, following a typically proportional relationship that holds for a large class of milling operations (Altintas 2012). The generated cutting forces, in turn, are absorbed by the machine tool and feed drive structure. They cause mechanical deformation and may also excite the vibration modes, if their harmonic content is near the structural natural frequencies. This may, depending on the cutting speed and tool and workpiece engagement conditions, lead to forced vibrations or chatter (Altintas 2012).
The disturbance effect of cutting forces is also felt by the servo control loop, consisting of mechanical, electrical, and digital components. This disturbance may result in the degradation of tool positioning accuracy, thereby leading to part errors. Another input that influences the quality achieved in a machining operation is the commanded trajectory. Discontinuous or poorly designed motion commands, with acceleration discontinuity, typically lead to jerky motion, vibrations, and poor surface finish. Beyond motion controller design and trajectory planning, emerging trends in machining process control include regulating, by feedback, various outcomes of the machining process, such as peak resultant cutting force, spindle power consumption, and amplitude of vibrations caused by the machining process. In addition to using actuators and instrumentation already available on a machine tool, such as feed and spindle drives and current sensors, additional devices, such as dynamometers, accelerometers, as well as inertial or piezoelectric actuators, may need to be used in order to achieve the required level of feedback and control injection capability.

Servo Drive Control

Stringent requirements for part quality, typically specified in microns, coupled with disturbance
Control of Machining Processes, Fig. 1 Single axis of a ball screw-driven machine tool performing milling
force inputs coming from the machining process, which can be in the order of tens to thousands of Newtons, require that the disturbance rejection of feed drives, which act as dynamic (i.e., frequency dependent) “stiffness” elements, be kept as strong as possible. In traditional machine design, this is achieved by optimizing the mechanical structure for maximum rigidity. Afterwards, the motion control loop is tuned to yield the highest possible bandwidth (i.e., responsive frequency range), without interfering with the vibratory modes of the machine tool in a way that can cause instability. The P-PI position-velocity cascade control structure, shown in Fig. 2, is the most widely used technique in machine tool drives. Its tuning guidelines have been well established in the literature (Ellis 2004). To augment the command following accuracy, velocity and acceleration feedforward, and friction compensation terms are added. Increasing the closed-loop bandwidth yields better disturbance rejection and more accurate tracking of the commanded trajectory (Pritschow 1996), which is especially important in high-speed machining applications where elevated cutting speeds necessitate faster feed motion.
It can be seen in Fig. 3 that increased axis tracking errors (e_x and e_y) may result in increased contour error (ε). A practical solution to mitigate this problem, in machine tool engineering, is to also match the dynamics of different motion axes, so that the tracking errors always assume an instantaneous proportion that brings the actual tool position as close as possible to the desired toolpath (Koren 1983). Sometimes, the control action can be designed to directly reduce the contour error as well, which leads to the structure known as “cross-coupling control” (Koren 1980).

Trajectory Planning

Smooth trajectory planning with at least acceleration-level continuity is required in machine tool control, in order to avoid inducing unwanted vibration or excessive tracking error during the machining process. For this purpose, computer numerical control (CNC) systems are equipped with various spline toolpath interpolation functions, such as B-splines and NURBS. The feedrate (i.e., progression speed along the toolpath) is planned in the “look-ahead” function of the
Control of Machining Processes, Fig. 2 P-PI position-velocity cascade control used in machine tool drives

Control of Machining Processes, Fig. 3 Formation of contour error (ε), as a result of servo errors (e_x and e_y) in the individual axes
CNC so that the total machining cycle time is reduced as much as possible. This has to be done without violating the position-dependent feedrate limits already programmed into the numerical control (NC) code, which are specified by considering various constraints coming from the machining process.
In feedrate optimization, axis-level trajectories have to stay within the velocity and torque limits of the drives, in order to avoid damaging the machine tool or causing actuator saturation. Moreover, as an indirect way of containing tracking errors, the practice of limiting axis-level jerk (i.e., rate of change of acceleration) is applied (Gordon and Erkorkmaz 2013). This results in reduced machining cycle time, while avoiding excessive vibration or positioning error due to “jerky” motion.
An example of trajectory planning using quintic (5th degree) polynomials for toolpath parameterization is shown in Fig. 4. Here, a comparison is provided between unoptimized and optimized feedrate profiles subject to the same axis velocity, torque (i.e., control signal), and jerk limits. As can be seen, significant machining time reduction can be achieved through trajectory optimization, while retaining the dynamic tool position accuracy. While Fig. 4 shows the result of an elaborate nonlinear optimization approach (Altintas and Erkorkmaz 2003), practical look-ahead
Control of Machining Processes, Fig. 4 Example of quintic spline trajectory planning without and with feedrate optimization
algorithms have also been proposed which lead to more conservative cycle times but are much better suited for real-time implementation inside a CNC (Weck et al. 1999).

Adaptive Control of Machining

There are established mathematical methods for predicting cutting forces, torque, power, and even surface finish for a variety of machining operations like turning, boring, drilling, and milling (Altintas 2012). However, when machining complex components, such as gas turbine impellers, or dies and molds, the tool and workpiece engagement and workpiece geometry undergo continuous change. Hence, it may be difficult to apply such prediction models efficiently, unless they are fully integrated inside a computer-aided process planning environment, as reported for 3-axis machining by Altintas and Merdol (2007).
An alternative approach, which allows the machining process to take place within safe and efficient operating bounds, is to use feedback from the machine tool during the cutting process. This measurement can be of the cutting forces using a dynamometer or of the spindle power consumption. This measurement is then used inside a feedback control loop to override the commanded feedrate value, which has direct impact on the cutting forces and power consumption. This scheme can be used to ensure that the cutting forces do not exceed a certain limit for process safety or to increase the feed when the machining capacity is underutilized, thus boosting productivity. Since the geometry and tool engagement are generally
Control of Machining Processes, Fig. 5 Example of 5-axis impeller machining with adaptive force control (Source: Budak and Kops (2000), courtesy of Elsevier)

Control of Machining Processes, Fig. 6 Schematic of the chatter vibration mechanism for one degree of freedom (From: Altintas (2012), courtesy of Cambridge University Press)
continuously varying, the coefficients of a model that relates the cutting force (or power) to the feed command are also time-varying. Furthermore, in CNC controllers, depending on the trajectory generation architecture, the execution latency of a feed override command may not always be deterministic. Due to these sources of variability, rather than using classical fixed-gain feedback, machining control research has evolved around adaptive control techniques (Masory and Koren 1980; Spence and Altintas 1991), where changes in the cutting process dynamics are continuously tracked and the control law, which computes the proceeding feedrate override, is updated accordingly. This approach has produced significant cycle time reduction in 5-axis machining of gas turbine impellers, as reported in Budak and Kops (2000) and shown in Fig. 5.

Control of Chatter Vibrations

Chatter vibrations are caused by the interaction of the chip generation mechanism with the structural dynamics of the machine, tool, and workpiece assembly (see Fig. 6). The relative vibration between the tool and workpiece generates a wavy surface finish. In the consecutive tool
pass, a new wave pattern, caused by the current instantaneous vibration, is generated on top of the earlier one. If the formed chip, which has an undulated geometry, displays a steady average thickness, then the resulting cutting forces and vibrations also remain bounded. This leads to a stable steady-state cutting regime, known as “forced vibration.” On the other hand, if the chip thickness keeps increasing at every tool pass, resulting in increased cutting forces and vibrations, then chatter vibration is encountered. Chatter can be extremely detrimental to the machined part quality, tool life, and the machine tool.
Chatter has been reported in the literature to be caused by two main phenomena: self-excitation through regeneration and mode coupling. For further information on chatter theory, the reader is referred to Altintas (2012) as an excellent starting point.
Various mitigation measures have been investigated and proposed in order to avoid and control chatter. One widespread approach is to select chatter-free cutting conditions through detailed modal testing and stability analyses. Recently, to achieve higher material removal rates, the application of active damping has started to receive interest. This has been realized through specially designed tools and actuators (Munoa et al. 2013; Pratt and Nayfeh 2001) and has demonstrated productivity improvement in boring and milling operations. As another method for chatter suppression, modulation of the cutting (i.e., spindle) speed has been successfully applied as a means of interrupting the regeneration mechanism (Soliman and Ismail 1997; Zatarain et al. 2008).

Summary and Future Directions

This article has presented an overview of various concepts and emerging technologies in the area of machining process control. The new generation of machine tools, designed to meet the ever-growing productivity and efficiency demands, will likely utilize advanced forms of these ideas and technologies in an integrated manner. As more computational power and better sensors become available at lower cost, one can expect to see new features, such as more elaborate trajectory planning algorithms, active vibration damping techniques, and real-time process and machine simulation and control capability, beginning to appear in CNC units. No doubt the dynamic analysis and controller design for such complicated systems will require higher levels of rigor, so that these new technologies can be utilized reliably and at their full potential.

Cross-References

Adaptive Control, Overview
PID Control
Robot Motion Control

Bibliography

Altintas Y (2012) Manufacturing automation: metal cutting mechanics, machine tool vibrations, and CNC design, 2nd edn. Cambridge University Press, Cambridge
Altintas Y, Erkorkmaz K (2003) Feedrate optimization for spline interpolation in high speed machine tools. Ann CIRP 52(1):297–302
Altintas Y, Merdol DS (2007) Virtual high performance milling. Ann CIRP 55(1):81–84
Budak E, Kops L (2000) Improving productivity and part quality in milling of titanium based impellers by chatter suppression and force control. Ann CIRP 49(1):31–36
Ellis GH (2004) Control system design guide, 3rd edn. Elsevier Academic, New York
Gordon DJ, Erkorkmaz K (2013) Accurate control of ball screw drives using pole-placement vibration damping and a novel trajectory prefilter. Precis Eng 37(2):308–322
Koren Y (1980) Cross-coupled biaxial computer control for manufacturing systems. ASME J Dyn Syst Meas Control 102:265–272
Koren Y (1983) Computer control of manufacturing systems. McGraw-Hill, New York
Masory O, Koren Y (1980) Adaptive control system for turning. Ann CIRP 29(1):281–284
Munoa J, Mancisidor I, Loix N, Uriarte LG, Barcena R, Zatarain M (2013) Chatter suppression in ram type travelling column milling machines using a biaxial inertial actuator. Ann CIRP 62(1):407–410
Pratt JR, Nayfeh AH (2001) Chatter control and stability analysis of a cantilever boring bar under regenerative cutting conditions. Philos Trans R Soc 359:759–792
Pritschow G (1996) On the influence of the velocity gain factor on the path deviation. Ann CIRP 45(1):367–371
Soliman E, Ismail F (1997) Chatter suppression by adaptive speed modulation. Int J Mach Tools Manuf 37(3):355–369
Spence A, Altintas Y (1991) CAD assisted adaptive control for milling. ASME J Dyn Syst Meas Control 113(3):444–450
Weck M, Meylahn A, Hardebusch C (1999) Innovative algorithms for spline-based CNC controller. Ann Ger Acad Soc Prod Eng VI(1):83–86
Zatarain M, Bediaga I, Munoa J, Lizarralde R (2008) Stability of milling processes with continuous spindle speed variation: analysis in the frequency and time domains, and experimental correlation. Ann CIRP 57(1):379–384

Control of Networks of Underwater Vehicles

Naomi Ehrich Leonard
Department of Mechanical and Aerospace Engineering, Princeton University, Princeton, NJ, USA

Abstract

Control of networks of underwater vehicles is critical to underwater exploration, mapping, search, and surveillance in the multiscale, spatiotemporal dynamics of oceans, lakes, and rivers. Control methodologies have been derived for tasks including feature tracking and adaptive sampling and have been successfully demonstrated in the field despite the severe challenges of underwater operations.

Keywords

Adaptive sampling; Feature tracking; Gliders; Mobile sensor arrays; Underwater exploration

Introduction

The development of theory and methodology for control of networks of underwater vehicles is motivated by a multitude of underwater applications and by the unique challenges associated with operating in the oceans, lakes, and rivers. Tasks include underwater exploration, mapping, search, and surveillance, associated with problems that include pollution monitoring, human safety, resource seeking, ocean science, and marine archeology. Vehicle networks collect data on underwater physics, biology, chemistry, and geology for improving the understanding and predictive modeling of natural dynamics and human-influenced changes in marine environments. Because the underwater environment is opaque, inhospitable, uncertain, and dynamic, control is critical to the performance of vehicle networks.
Underwater vehicles typically carry sensors to measure external environmental signals and fields, and thus a vehicle network can be regarded as a mobile sensor array. The underlying principle of control of networks of underwater vehicles leverages their mobility and uses an interacting dynamic among the vehicles to yield a high-performing collective behavior. If the vehicles can communicate their state or measure the relative state of others, then they can cooperate and coordinate their motion.
One of the major drivers of control of underwater mobile sensor networks is the multiscale, spatiotemporal dynamics of the environmental fields and signals. In Curtin et al. (1993), the concept of the autonomous oceanographic sampling network (AOSN), featuring a network of underwater vehicles, was introduced for dynamic measurement of the ocean environment and resolution of spatial and temporal gradients in the sampled fields. For example, to understand the coupled biological and physical dynamics of the ocean, data are required both on the small-scale dynamics of phytoplankton, which are major actors in the marine ecosystem and the global climate, and on the large-scale dynamics of the flow field, temperature, and salinity.
Accordingly, control laws are needed to coordinate the motion of networks of underwater vehicles to match the many relevant spatial and temporal scales. And for a network of underwater vehicles to perform complex missions reliably and efficiently, the control must address the many
uncertainties and real-world constraints including ocean currents can sometimes reach or even ex-
the influence of currents on the motion of the ceed the speed of the gliders. Unlike a propelled
vehicles and the limitations on underwater com- AUV, which typically has sufficient thrust to
munication. maintain course despite currents, a glider trying
to move in the direction of a strong current will
make no forward progress. This makes coordi-
Vehicles nated control of gliders challenging; for instance,
two sensors that should stay sufficiently far apart
Control of networks of underwater vehicles is may be pushed toward each other leading to less
made possible with the availability of small than ideal sampling conditions.
(e.g., 1.5–2 m long), relatively inexpensive
autonomous underwater vehicles (AUVs).
Propelled AUVs such as the REMUS provide Communication and Sensing
maneuverability and speed. These kinds of AUVs
respond quickly and agilely to the needs of Underwater communication is one of the biggest
the network, and because of their speed, they challenges to the control of networks of un-
can often power through strong ocean flows. derwater vehicles and one that distinguishes it
However, propelled AUVs are limited by their from control of vehicles on land or in the air.
batteries; for extended missions, they need Radio-frequency communication is not typically
docking stations or other means to recharge their available underwater, and acoustic data telemetry
batteries. has limitations including sensitivity to ambient
Buoyancy-driven autonomous underwater noise, unpredictable propagation, limited band-
gliders, including the Slocum, the Spray, and width, and latency.
the Seaglider, are a class of endurance AUVs When acoustic communication is too limiting,
designed explicitly for collecting data over large vehicles can surface periodically and communi-
three-dimensional volumes continuously over cate via satellite. This method may be bandwidth
periods of weeks or even months (Rudnick et al. 2004). They move slowly and steadily, and, as a result, they are particularly well suited to network missions of long duration.

Gliders propel themselves by alternately increasing and decreasing their buoyancy using either a hydraulic or a mechanical buoyancy engine. Lift generated by flow over fixed wings converts the vertical ascent/descent induced by the change in buoyancy into forward motion, resulting in a sawtooth-like trajectory in the vertical plane. Gliders can actively redistribute internal mass to control attitude; for example, they pitch by sliding their battery pack forward and aft. For heading control, they shift mass to roll, bank, and turn, or deflect a rudder. Some gliders are designed for deep water, e.g., to 1,500 m, while others for shallower water, e.g., to 200 m.

Gliders are typically operated at their maximum speed and thus they move at approximately constant speed relative to the flow. Because this is relatively slow, on the order of 0.3–0.5 m/s in the horizontal direction and 0.2 m/s in the vertical,

limited and will require time and energy. However, in the case of profiling propelled AUVs or underwater gliders, they already move in the vertical plane in a sawtooth pattern and thus regularly come closer to the surface. When on the surface, vehicles can also get a GPS fix, whereas there is no access to GPS underwater. The GPS fix is used for correcting onboard dead reckoning of the vehicle's absolute position and for updating onboard estimation of the underwater currents, both helpful for control.

Vehicles are typically equipped with conductivity-temperature-density (CTD) sensors to measure temperature, salinity, and density. From this, pressure can be computed and thus depth and vertical speed. Attitude sensors provide measurements of pitch, roll, and heading. Position and velocity in the plane is estimated using dead reckoning. Many sensors for measuring the environment have been developed for use on underwater vehicles; these include chlorophyll fluorometers to estimate phytoplankton abundance, acoustic Doppler
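The surfacing-and-correction cycle described above — dead reckoning between surfacings, then a GPS fix used to estimate the depth-averaged current — can be sketched in a few lines. This is an illustrative reconstruction under a constant-current assumption, not the software of any particular vehicle; all function names are hypothetical:

```python
import math

def dead_reckon(pos, heading, speed, dt, steps):
    """Propagate position from commanded heading and water-relative speed
    only; the ocean current is unobserved while submerged."""
    x, y = pos
    for _ in range(steps):
        x += speed * math.cos(heading) * dt
        y += speed * math.sin(heading) * dt
    return (x, y)

def true_motion(pos, heading, speed, current, dt, steps):
    """Ground truth: water-relative velocity plus the (constant) current."""
    x, y = pos
    cx, cy = current
    for _ in range(steps):
        x += (speed * math.cos(heading) + cx) * dt
        y += (speed * math.sin(heading) + cy) * dt
    return (x, y)

def surface_update(dr_pos, gps_pos, elapsed):
    """At a GPS fix: the position error accumulated since the last fix,
    divided by the submerged time, estimates the depth-averaged current."""
    ex = gps_pos[0] - dr_pos[0]
    ey = gps_pos[1] - dr_pos[1]
    return (ex / elapsed, ey / elapsed)
```

Under the constant-current assumption the estimate recovers the current exactly; in practice the same correction yields a time-averaged current that can be fed back into planning.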
Control of Networks of Underwater Vehicles 181
The control methodology prescribes steering laws for vehicles operated at a constant speed. Assume that the i-th vehicle moves at unit speed in the plane in the direction θ_i(t) at time t. Then, the velocity of the i-th vehicle is ẋ_i = (cos θ_i, sin θ_i). The steering control u_i is the component of the force in the direction normal to velocity, such that θ̇_i = u_i for i = 1, …, N. Define

U(\theta_1, \ldots, \theta_N) = \frac{N}{2}\,\|p\|^2, \qquad p = \frac{1}{N}\sum_{j=1}^{N} \dot{x}_j.

U is a potential function that is maximal (where ‖p‖ = 1) when all vehicle directions are synchronized and minimal at 0 when all vehicle directions are perfectly anti-synchronized. Let x̃_i = (x̃_i, ỹ_i) = (1/N) Σ_{j=1}^{N} x_{ij} and let x̃_i^⊥ = (−ỹ_i, x̃_i). Define

S(x_1, \ldots, x_N, \theta_1, \ldots, \theta_N) = \frac{1}{2}\sum_{i=1}^{N} \big\|\dot{x}_i - \omega_0\, \tilde{x}_i^{\perp}\big\|^2,

where ω_0 ≠ 0. S is a potential function that is minimal at 0 for circular motion of the vehicles around their center of mass with radius ρ_0 = |ω_0|^{-1}. Define the steering control as

\dot{\theta}_i = \omega_0\big(1 + K_c \langle \tilde{x}_i, \dot{x}_i\rangle\big) - K \sum_{j=1}^{N} \sin(\theta_j - \theta_i),

where K_c > 0 and K are scalar gains. Then, circular motion of the network is a steady solution, with the phase-locked heading arrangement a minimum of K U, i.e., synchronized or perfectly anti-synchronized depending on the sign of K. Stability can be proved with the Lyapunov function V_c = K_c S + K U. This steering control law depends only on relative position and relative heading measurements of the other vehicles.

The general form of the methodology extends the above control law to network interconnections defined by possibly time-varying graphs with limited sensing or communication links, and it provides systematic control laws to stabilize symmetric patterns of heading distributions about noncircular closed curves. It also allows for multiple graphs to handle multiple scales. For example, in the 2006 field experiment, the default motion pattern was one in which six gliders moved in coordinated pairs around three closed curves; one graph defined the smaller-scale coordination of each pair of gliders about its curve, while a second graph defined the larger-scale coordination of gliders across the three curves.

Implementation

Implementation of control of networks of underwater vehicles requires coping with the remote, hostile underwater environment. The control methodology for motion patterns and adaptive sampling, described above, was implemented in the field using a customized software infrastructure called the Glider Coordinated Control System (GCCS) (Paley et al. 2008). The GCCS combines a simple model for control planning with a detailed model of glider dynamics to accommodate the constant speed of gliders, relatively large ocean currents, waypoint tracking routines, communication only when gliders surface (asynchronously), other latencies, and more. Other approaches consider control design in the presence of a flow field, formal methods to integrate high-resolution models of the flow field, and design tailored to propelled AUVs.

Summary and Future Directions

The multiscale, spatiotemporal dynamics of the underwater environment drive the need for well-coordinated control of networks of underwater vehicles that can manage the significant operational challenges of the opaque, uncertain, inhospitable, and dynamic oceans, lakes, and rivers. Control theory and algorithms have been developed to enable networks of vehicles to successfully operate as adaptable sensor arrays in missions that include feature tracking and adaptive sampling. Future work will improve control in the presence of strong and unpredictable flow
fields and will leverage the latest in battery and underwater communication technologies. Hybrid vehicles and heterogeneous networks of vehicles will also promote advances in control. Future work will draw inspiration from the rapidly growing literature in decentralized cooperative control strategies and complex dynamic networks. Dynamics of decision-making teams of robotic vehicles and humans is yet another important direction of research that will impact the success of control of networks of underwater vehicles.

Cross-References

Motion Planning for Marine Control Systems
Underactuated Marine Control Systems

Recommended Reading

In Bellingham and Rajan (2007), it is argued that cooperative control of robotic vehicles is especially useful for exploration in remote and hostile environments such as the deep ocean. A recent survey of robotics for environmental monitoring, including a discussion of cooperative systems, is provided in Dunbabin and Marques (2012). A survey of work on cooperative underwater vehicles is provided in Redfield (2013).

Bibliography

adaptive sampling in Monterey Bay. IEEE J Ocean Eng 31(4):935–948
Grocholsky B (2002) Information-theoretic control of multiple sensor platforms. PhD thesis, University of Sydney
Leonard NE, Paley DA, Lekien F, Sepulchre R, Fratantoni DM, Davis RE (2007) Collective motion, sensor networks, and ocean sampling. Proc IEEE 95(1):48–74
Leonard NE, Paley DA, Davis RE, Fratantoni DM, Lekien F, Zhang F (2010) Coordinated control of an underwater glider fleet in an adaptive ocean sampling field experiment in Monterey Bay. J Field Robot 27(6):718–740
Ögren P, Fiorelli E, Leonard NE (2004) Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment. IEEE Trans Autom Control 49(8):1292–1302
Paley D, Zhang F, Leonard NE (2008) Cooperative control for ocean sampling: the glider coordinated control system. IEEE Trans Control Syst Technol 16(4):735–744
Redfield S (2013) Cooperation between underwater vehicles. In: Seto ML (ed) Marine robot autonomy. Springer, New York, pp 257–286
Rudnick D, Davis R, Eriksen C, Fratantoni D, Perry M (2004) Underwater gliders for ocean research. Mar Technol Soc J 38(1):48–59
Sepulchre R, Paley DA, Leonard NE (2008) Stabilization of planar collective motion with limited communication. IEEE Trans Autom Control 53(3):706–719
Zhang F, Leonard NE (2010) Cooperative filters and control for cooperative exploration. IEEE Trans Autom Control 55(3):650–663

Control of Nonlinear Systems with Delays
the system. The analysis methodology is based on the concept of infinite-dimensional backstepping transformation – a transformation that converts the overall feedback system to a new, cascade "target system" whose stability can be studied with the construction of a Lyapunov function.

Keywords

Distributed parameter systems; Delay systems; Backstepping; Lyapunov function

Nonlinear Systems with Input Delay

Nonlinear systems of the form

\dot{X}(t) = f\big(X(t),\, U(t - D(t, X(t)))\big), \qquad (1)

where t ∈ R₊ is time, f: Rⁿ × R → Rⁿ is a vector field, X ∈ Rⁿ is the state, D: R₊ × Rⁿ → R₊ is a nonnegative function of the state of the system, and U ∈ R is the scalar input, are ubiquitous in applications. The starting point for designing a control law for (1), as well as for analyzing the dynamics of (1), is to consider the delay-free counterpart of (1), i.e., when D = 0, for which a plethora of results exists dealing with its stabilization and Lyapunov-based analysis (Krstic et al. 1995).

Systems of the form (1) constitute more realistic models for physical systems than delay-free systems. The reason is that often in engineering applications the control that is applied to the system does not immediately affect the system. This dead time until the controller can affect the system might be due to, among other things, the long distance of the controller from the system, such as, for example, in networked control systems, or due to finite-speed transport or flow phenomena, such as, for example, in additive manufacturing and cooling systems, or due to various after-effects, such as, for example, in population dynamics.

The first step toward control design and analysis for system (1) is to consider the special case in which D = const. The next step is to consider the special case of system (1) in which D = D(t), i.e., the delay is an a priori given function of time. Systems with time-varying delays model numerous real-world systems, such as networked control systems, traffic systems, or irrigation channels. Assuming that the input delay is an a priori defined function of time is a plausible assumption for some applications. Yet, the time-variation of the delay might be the result of the variation of a physical quantity that has its own dynamics, such as, for example, in milling processes (due to speed variations), 3D printers (due to distance variations), cooling systems (due to flow rate variations), and population dynamics (due to population's size variations). Processes in this category can be modeled by systems with a delay that is a function of the state of the system, i.e., by (1) with D = D(X).

In this article control designs are presented for the stabilization of nonlinear systems with input delays, with delays that are constant (Krstic 2009), time-varying (Bekiaris-Liberis and Krstic 2012), or state-dependent (Bekiaris-Liberis and Krstic 2013b), employing predictor feedback, i.e., employing a feedback law that uses the future rather than the current state of the system. Since one employs in the feedback law the future values of the state, the predictor feedback completely cancels (compensates) the input delay, i.e., after the control signal reaches the system, the state evolves as if there were no delay at all. Since the future values of the state are not a priori known, the main control challenge is the implementation of the predictor feedback law. Having determined the predictor, the control law is then obtained by replacing the current state in a nominal state-feedback law (which stabilizes the delay-free system) by the predictor.

A methodology is presented in the article for the stability analysis of the closed-loop system under predictor feedback by constructing Lyapunov functionals. The Lyapunov functionals are constructed for a transformed (rather than
P(\theta) = X(t) + \int_{t-D(t,X(t))}^{\theta} \frac{f(P(s), U(s))}{1 - D_t(\sigma(s), P(s)) - \nabla D(\sigma(s), P(s))\, f(P(s), U(s))}\, ds \qquad (3)

\sigma(\theta) = t + \int_{t-D(t,X(t))}^{\theta} \frac{1}{1 - D_t(\sigma(s), P(s)) - \nabla D(\sigma(s), P(s))\, f(P(s), U(s))}\, ds, \qquad (4)
for all t − D(t, X(t)) ≤ θ ≤ t. The signal P is the predictor of X at the appropriate prediction time σ, i.e., P(t) = X(σ(t)). For linear systems, the predictor can be computed explicitly using the variation of constants formula, with the initial condition P(t − D) = X(t), as

P(t) = e^{AD} X(t) + \int_{t-D}^{t} e^{A(t-\theta)} B\, U(\theta)\, d\theta.

For systems that are nonlinear, P(t) cannot be written explicitly, for the same reason that a nonlinear ODE cannot be solved explicitly. So P(t) is represented implicitly using the nonlinear integral equation (3). The computation of P(t) from (3) is straightforward with a discretized implementation in which P(t) is assigned values based on the right-hand side of (3), which involves earlier values of P and the values of the input U.

The case D = D(t) is considered next. As in the case of constant delays, the main goal is to implement the predictor P. One needs first to define the appropriate time interval over which the predictor of the state is needed, which, in the constant delay case, is simply D time-units in the future. The control law has to satisfy U(φ(t)) = κ(X(t)), or, U(t) = κ(X(σ(t))). Hence, one needs to find an implementable formula for P(t) = X(σ(t)). In the constant delay case the prediction horizon over which one needs to compute the predictor can be determined based on the knowledge of the delay time, since the prediction horizon and the delay time are both equal to D. This is not anymore true in the time-varying case, in which the delayed time is defined as φ(t) = t − D(t), whereas the prediction time as σ(t) = φ⁻¹(t) = t + D(σ(t)). Employing a change of variables in Ẋ(t) = f(X(t), U(t − D(t))) as t = σ(θ), for all φ(t) ≤ θ ≤ t, and integrating in θ starting at θ = φ(t), one obtains the formula for P given by (3) with D_t = D′(σ(s)), ∇D f = 0, and D = D(t).

Next the case D = D(X(t)) is considered. First one has to determine the predictor, i.e., the signal P such that P(t) = X(σ(t)), where σ(t) = φ⁻¹(t) and φ(t) = t − D(X(t)). In the case of state-dependent delay, the prediction time σ(t) depends on the predictor itself, i.e., the time when the current control reaches the system depends on the value of the state at that time; namely, the following implicit relationship holds: P(t) = X(t + D(P(t)))

was actually applied. To cancel the effect of this delay, the control law (2) is designed such that U(φ(t)) = U(t − D(X(t))) = κ(X(t)) (and X(t) = P(t − D(X(t)))). This implicit relation can be solved by proceeding as in the time-varying case, i.e., by performing the change of variables t = σ(θ), for all t − D(X(t)) ≤ θ ≤ t, in Ẋ(t) = f(X(t), U(t − D(X(t)))), and integrating in θ starting at θ = t − D(X(t)), to obtain the formula (3) for P with D_t = 0, ∇D f = ∇D(P(s)) f(P(s), U(s)), and D = D(X(t)).

Analogously, one can derive the predictor for the case D = D(t, X(t)), with the difference that now the prediction time σ is not given explicitly in terms of P, but it is defined through an implicit relation; namely, it holds that σ(t) = t + D(σ(t), P(t)). Therefore, for actually computing σ one has to proceed as in the derivation of P, i.e., to differentiate the relation σ(θ) = θ + D(σ(θ), P(θ)) and then integrate starting at the known value σ(t − D(t, X(t))) = t. It is important to note that the integral equation (4) is needed in the computation of P only when D depends on both X and t.

Backstepping Transformation and Stability Analysis

The predictor feedback designs are based on a feedback law κ(X) that renders the closed-loop system Ẋ = f(X, κ(X)) globally asymptotically stable. However, in the rest of the section it is assumed that the feedback law κ(X) renders the closed-loop system Ẋ = f(X, κ(X) + v) input-to-state stable (ISS) with respect to v, i.e., there exist a smooth function S: Rⁿ → R₊ and class K∞ functions α₁, α₂, α₃, α₄ such that

\alpha_3(|X(t)|) \le S(X(t)) \le \alpha_4(|X(t)|) \qquad (5)

\frac{\partial S(X(t))}{\partial X} f\big(X(t), \kappa(X(t)) + v(t)\big) \le -\alpha_1(|X(t)|) + \alpha_2(|v(t)|). \qquad (6)

Imposing this stronger assumption enables one to construct a Lyapunov functional for the
closed-loop systems (1)–(4) with the aid of the Lyapunov characterization of ISS defined in (5) and (6).

The stability analysis of the closed-loop systems (1)–(4) is explained next. Denote the infinite-dimensional backstepping transformation of the actuator state as

W(\theta) = U(\theta) - \kappa(P(\theta)), \quad \text{for all } t - D(t, X(t)) \le \theta \le t. \qquad (7)

With the invertibility of the backstepping transformation one can then show global asymptotic stability of the closed-loop system in the original variables (X, U). In particular, there exists a class KL function β such that

|X(t)| + \sup_{t - D(t,X(t)) \le \theta \le t} |U(\theta)| \;\le\; \beta\Big(|X(0)| + \sup_{-D(0,X(0)) \le \theta \le 0} |U(\theta)|,\; t\Big),
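The discretized predictor implementation and the resulting delay compensation can be sketched end to end for a scalar linear plant with constant input delay. Everything concrete below — the plant Ẋ = aX + U(t − D), the nominal law κ(X) = −2X, and the Euler discretization — is an illustrative assumption, not taken from the entry:

```python
# Sketch: predictor feedback for X'(t) = a*X(t) + U(t - D), a > 0 (unstable).
# The predictor P(t) is the Euler-discretized version of the integral
# equation (3) for the constant-delay case (D_t = 0, grad D = 0): integrate
# the model forward over the stored input history, then apply U = kappa(P).
a, D, dt = 1.0, 0.5, 0.002
kappa = lambda x: -2.0 * x          # nominal delay-free law: a - 2 = -1, stable

nd = int(round(D / dt))             # delay in samples
buf = [0.0] * nd                    # U(t-D), ..., U(t-dt); zero initial history
X = 1.0
for _ in range(int(8.0 / dt)):
    P = X                           # predictor computed from (3), discretized
    for u in buf:
        P += dt * (a * P + u)
    U = kappa(P)                    # predictor feedback U(t) = kappa(P(t))
    X += dt * (a * X + buf[0])      # the plant receives the D-delayed input
    buf = buf[1:] + [U]
```

Once the first control signal reaches the plant (after D seconds), the state evolves essentially as the delay-free target Ẋ = −X, so X decays toward zero even though the open-loop plant is unstable — the behavior the text describes as complete compensation of the input delay.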
the delay is a nonnegative function of the state, conditions C2–C4 are satisfied by restricting the initial state X and the initial actuator state. Therefore estimate (12) holds locally.

Cross-References

Recommended Reading

The main control design tool for general systems with input delays of arbitrary length is predictor feedback. The reader is referred to Artstein (1982) for the first systematic treatment of general linear systems with constant input delays. The applicability of predictor feedback was extended in Krstic (2009) to several classes of systems, such as nonlinear systems with constant input delays and linear systems with unknown input delays. Subsequently, predictor feedback was extended to general nonlinear systems with nonconstant input and state delays (Bekiaris-Liberis and Krstic 2013a). The main stability analysis tool for systems employing predictor feedback is backstepping. Backstepping was initially introduced for adaptive control of finite-dimensional nonlinear systems (Krstic et al. 1995). The continuum version of backstepping was originally developed for the boundary control of several classes of PDEs in Krstic and Smyshlyaev (2008).

Bibliography

Krstic M (2009) Delay compensation for nonlinear, adaptive, and PDE systems. Birkhäuser, Boston
Krstic M, Kanellakopoulos I, Kokotovic PV (1995) Nonlinear and adaptive control design. Wiley, New York
Krstic M, Smyshlyaev A (2008) Boundary control of PDEs: a course on backstepping designs. SIAM, Philadelphia

Control of Quantum Systems

Ian R. Petersen
School of Engineering and Information Technology, University of New South Wales, the Australian Defence Force Academy, Canberra, Australia

Abstract

Quantum control theory is concerned with the control of systems whose dynamics are governed by the laws of quantum mechanics. Quantum control may take the form of open loop quantum control or quantum feedback control. Also, quantum feedback control may consist of measurement based feedback control, in which the controller is a classical system governed by the laws of classical physics. Alternatively, quantum feedback control may take the form of coherent feedback control in which the controller is a quantum system governed by the laws of quantum mechanics. In the area of open loop quantum control, questions of controllability along with optimal control and Lyapunov control methods are discussed. In the case of quantum feedback control, LQG and H∞ control methods are discussed.
error correction (e.g., see Kerckhoff et al. 2010) or the preparation of quantum states. Quantum information and quantum computing in turn have great potential in solving intractable computing problems such as factoring large integers using Shor's algorithm; see Shor (1994).

Schrödinger Picture Models of Quantum Systems

The state of a closed quantum system can be represented by a unit vector |ψ⟩ in a complex Hilbert space H. Such a quantum state is also referred to as a wavefunction. In the Schrödinger picture, the time evolution of the quantum state is defined by the Schrödinger equation, which is in general a partial differential equation. An important class of quantum systems are finite-level systems, in which the Hilbert space is finite dimensional. In this case, the Schrödinger equation is a linear ordinary differential equation of the form

i\hbar \frac{\partial}{\partial t} |\psi(t)\rangle = \left(H_0 + \sum_{k=1}^{m} u_k(t) H_k\right) |\psi(t)\rangle. \qquad (1)

In this case, the Schrödinger picture model of a quantum system is given in terms of a master equation which describes the time evolution of the density operator. In the case of an open quantum system with Markovian dynamics defined on a finite dimensional Hilbert space of dimension N, the master equation is a matrix differential equation of the form

\dot{\rho}(t) = -i\left[H_0 + \sum_{k=1}^{m} u_k(t) H_k,\ \rho(t)\right] + \frac{1}{2}\sum_{j,k=0}^{N^2-1} \alpha_{j,k}\left(\left[F_j \rho(t), F_k^{\dagger}\right] + \left[F_j, \rho(t) F_k^{\dagger}\right]\right); \qquad (2)

e.g., see Breuer and Petruccione (2002). Here the notation [X, Y] = XY − YX refers to the commutation operator and the notation † denotes the adjoint of an operator. Also, {F_j}_{j=0}^{N²−1} is a basis set for the space of bounded linear operators
outcome of the quantum measurement is k, the state of the quantum system collapses to the new value of

\frac{P_k |\psi\rangle}{\sqrt{\langle \psi | P_k | \psi \rangle}}.

This change in the quantum state as a result of a measurement is an important characteristic of quantum mechanics. For an open quantum system which is in a quantum state defined by a density operator ρ, the probability of a measurement outcome k is given by tr(P_k ρ). In this case, the quantum state collapses to

\frac{P_k\, \rho\, P_k}{\mathrm{tr}(P_k \rho)}.

In the case of an open quantum system with continuous measurements of an observable X, we can consider a stochastic master equation as follows:

d\rho(t) = -i\left[H_0 + \sum_{k=1}^{m} u_k(t) H_k,\ \rho(t)\right] dt - \kappa \left[X, \left[X, \rho(t)\right]\right] dt + \sqrt{2\kappa}\left(X\rho(t) + \rho(t)X - 2\,\mathrm{tr}(X\rho(t))\,\rho(t)\right) dW \qquad (3)

where κ is a constant parameter related to the measurement strength and dW is a standard Wiener increment which is related to the continuous measurement outcome y(t) by

dW = dy - 2\sqrt{2\kappa}\,\mathrm{tr}(X\rho(t))\, dt; \qquad (4)

e.g., see Wiseman and Milburn (2010). These models are used in the measurement feedback control of Markovian open quantum systems. Also, the Eqs. (3) and (4) can be regarded as a quantum filter in which ρ(t) is the conditional density of the quantum system obtained by filtering the measurement signal y(t); e.g., see Bouten et al. (2007) and Gough et al. (2012).

Heisenberg Picture Models of Quantum Systems

In the Heisenberg picture of quantum mechanics, the observables of a system evolve with time and the quantum state remains fixed. This picture may also be extended slightly by considering the time evolution of general operators on the underlying Hilbert space rather than just observables, which are required to be self-adjoint operators. An important class of open quantum systems which are considered in the Heisenberg picture arise when the underlying Hilbert space is infinite dimensional and the system represents a collection of independent quantum harmonic oscillators interacting with a number of external quantum fields. Such linear quantum systems are described in the Heisenberg picture by linear quantum stochastic differential equations (QSDEs) of the form

dx(t) = A x(t)\, dt + B\, dw(t);
dy(t) = C x(t)\, dt + D\, dw(t) \qquad (5)

where A, B, C, D are real or complex matrices and x(t) is a vector of possibly noncommuting operators on the underlying Hilbert space H; e.g., see James et al. (2008). Also, the quantity dw(t) is decomposed as

dw(t) = \beta_w(t)\, dt + d\tilde{w}(t)

where β_w(t) is an adapted process and w̃(t) is a quantum Wiener process with Itô table

d\tilde{w}(t)\, d\tilde{w}(t)^{\dagger} = F_{\tilde{w}}\, dt.

Here, F_w̃ ≥ 0 is a real or complex matrix. The quantity w(t) represents the components of the input quantum fields acting on the system. Also, the quantity y(t) represents the components of interest of the corresponding output fields that result from the interaction of the harmonic oscillators with the incoming fields.

In order to represent physical quantum systems, the components of the vector x(t) are required to satisfy certain commutation relations of the form

\left[x_j(t), x_k(t)\right] = 2i\, \Theta_{jk}, \quad j, k = 1, 2, \ldots, n,\ \forall t,

where the matrix Θ = [Θ_{jk}] is skew symmetric. The requirement to represent a physical quantum system places restrictions on the matrices A, B, C, D, which are referred to as physical
realizability conditions; e.g., see James et al. (2008) and Shaiju and Petersen (2012). QSDE models of the form (5) arise frequently in the area of quantum optics. They can also be generalized to allow for nonlinear quantum systems such as arise in the areas of nonlinear quantum optics and superconducting quantum circuits; e.g., see Bertet et al. (2012). These models are used in the feedback control of quantum systems both in the case of classical measurement feedback and in the case of coherent feedback, in which the quantum controller is also a quantum system and is represented by such a QSDE model.

(S, L, H) Quantum System Models

An alternative method of modeling an open quantum system, as opposed to the stochastic master equation (SME) approach or the quantum stochastic differential equation (QSDE) approach, which were considered above, is to simply model the quantum system in terms of the physical quantities which underlie the SME and QSDE models. For a general open quantum system, these quantities are the scattering matrix S, which is a matrix of operators on the underlying Hilbert space; the coupling operator L, which is a vector of operators on the underlying Hilbert space; and the system Hamiltonian, which is a self-adjoint operator on the underlying Hilbert space; e.g., see Gough and James (2009). For a given (S, L, H) model, the corresponding SME model or QSDE model can be calculated using standard formulas; e.g., see Bouten et al. (2007) and James et al. (2008). Also, in certain circumstances, an (S, L, H) model can be calculated from an SME model or a QSDE model. For example, if the linear QSDE model (5) is physically realizable, then a corresponding (S, L, H) model can be found. In fact, this amounts to the definition of physical realizability.

Open Loop Control of Quantum Systems

A fundamental question in the open loop control of quantum systems is the question of controllability. For the case of a closed quantum system of the form (1), the question of controllability can be defined as follows (e.g., see Albertini and D'Alessandro 2003):

Definition 1 (Pure State Controllability) The quantum system (1) is said to be pure state controllable if for every pair of initial and final states |ψ₀⟩ and |ψ_f⟩, there exist control functions {u_k(t)} and a time T > 0 such that the corresponding solution of (1) with initial condition |ψ₀⟩ satisfies |ψ(T)⟩ = |ψ_f⟩.

Alternative definitions have also been considered for the controllability of the quantum system (1); e.g., see Albertini and D'Alessandro (2003) and Grigoriu et al. (2013) in the case of open quantum systems. The following theorem provides a necessary and sufficient condition for pure state controllability in terms of the Lie algebra L₀ generated by the matrices {iH₀, iH₁, …, iH_m}, u(N) the Lie algebra corresponding to the unitary group of dimension N, su(N) the Lie algebra corresponding to the special unitary group of dimension N, sp(N/2) the N/2-dimensional symplectic group, and L̃ the Lie algebra conjugate to sp(N/2).

Theorem 1 (See D'Alessandro 2007) The quantum system (1) is pure state controllable if and only if the Lie algebra L₀ satisfies one of the following conditions:
(1) L₀ = su(N);
(2) L₀ is conjugate to sp(N/2);
(3) L₀ = u(N);
(4) L₀ = span{iI_{N×N}} ⊕ L̃.

Similar conditions have been obtained when alternative definitions of controllability are used. Once it has been determined that a quantum system is controllable, the next task in open loop quantum control is to determine the control functions {u_k(t)} which drive a given initial state to a given final state. An important approach to this problem is the optimal control approach, in which a time optimal control problem is solved using Pontryagin's maximum principle to construct the control functions {u_k(t)} which drive the given initial state to the given final state in minimum time; e.g., see Khaneja et al. (2001).
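The Lie-algebraic test in Theorem 1 can be carried out mechanically for a small example. For a qubit with H₀ = σz and H₁ = σx (both traceless, so L₀ ⊆ su(2)), repeated commutators of {iH₀, iH₁} generate a Lie algebra of dimension 3, i.e., all of su(2), so condition (1) of the theorem holds and the system is pure state controllable. The brute-force closure computation below is an illustrative sketch, not taken from the entry:

```python
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def comm(A, B):                       # [A, B] = AB - BA
    AB, BA = mul(A, B), mul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

def vec(M):
    """Flatten a 2x2 complex matrix into a real vector in R^8."""
    return [p for row in M for z in row for p in (z.real, z.imag)]

def rank(vectors):
    """Rank over R by Gaussian elimination with a small tolerance."""
    rows, r = [v[:] for v in vectors], 0
    for col in range(8):
        piv = next((i for i in range(r, len(rows))
                    if abs(rows[i][col]) > 1e-9), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(len(rows)):
            if i != r and abs(rows[i][col]) > 1e-9:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

H0 = [[1 + 0j, 0], [0, -1]]           # sigma_z
H1 = [[0, 1 + 0j], [1, 0]]            # sigma_x
gens = [[[1j * z for z in row] for row in H] for H in (H0, H1)]

# Lie algebra closure: keep adding brackets until the dimension stops growing.
algebra = list(gens)
while True:
    dim = rank([vec(M) for M in algebra])
    brackets = [comm(A, B) for A in algebra for B in algebra]
    if rank([vec(M) for M in algebra + brackets]) == dim:
        break
    algebra += brackets
```

Here the first bracket [iσz, iσx] already produces a multiple of iσy, after which the span is closed at dimension 3 = dim su(2).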
This approach works well for low dimensional quantum systems but is computationally intractable for high dimensional quantum systems.

An alternative approach for high dimensional quantum systems is the Lyapunov control approach. In this approach, a Lyapunov function is selected which provides a measure of the distance between the current quantum state and the desired terminal quantum state. An example of such a Lyapunov function is

V = \langle \psi(t) - \psi_f \,|\, \psi(t) - \psi_f \rangle \ge 0;

e.g., see Mirrahimi et al. (2005). A state feedback control law is then chosen to ensure that the time derivative of this Lyapunov function is negative. This state feedback control law is then simulated with the quantum system dynamics (1) to give the required open loop control functions {u_k(t)}.

Classical Measurement Based Quantum Feedback Control

In this approach, the control input is chosen as a function of the conditional density operator ρ(t) produced by the quantum filter (3) and (4),

u(t) = f(\rho(t)),

where the function f(·) is designed to achieve a particular objective such as stabilization of the quantum system. Here u(t) represents the vector of control inputs u_k(t). An example of such a quantum control scheme is given in the paper Mirrahimi and van Handel (2007), in which a Lyapunov method is used to design the control law f(·) so that a quantum system consisting of an atomic ensemble interacting with an electromagnetic field is stabilized about a specified state ρ_f = |ψ_m⟩⟨ψ_m|.

A Heisenberg Picture Approach to Classical Measurement Based Quantum Feedback Control

In this Heisenberg picture approach to classical measurement based quantum feedback control, we begin with a quantum system which is described by linear quantum stochastic equations of the form (5). In these equations, it is assumed that the components of the output vector all commute with each other and so can be regarded as classical quantities. This can be achieved if each of the components is obtained via a process of homodyne detection from the corresponding electromagnetic field; e.g., see Bachor and Ralph (2004). Also, it is assumed that the input electromagnetic field w(t) can be decomposed as

dw(t) = \begin{pmatrix} \beta_u(t)\, dt + d\tilde{w}_1(t) \\ d\tilde{w}_2(t) \end{pmatrix}, \qquad (6)

where β_u(t) is generated by a classical controller of the form

dx_K(t) = A_K x_K(t)\, dt + B_K\, dy(t), \qquad \beta_u(t)\, dt = C_K x_K(t)\, dt. \qquad (7)

For a given quantum system model (5), the matrices in the controller (7) can be designed using standard classical control theory techniques such as LQG control (see Doherty and Jacobs 1999) or H∞ control (see James et al. 2008).
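The controller (7) is an ordinary linear dynamic output feedback driven by the classical measurement record. Its role can be illustrated on a purely classical scalar stand-in for the plant, with the LQG/H∞ synthesis replaced by hand-picked observer-plus-state-feedback gains; the plant and all gain values below are illustrative assumptions, not from the entry:

```python
# Classical stand-in for (5): x' = a*x + b*u, measurement y = x.
# Controller of the form (7): xK' = AK*xK + BK*y, u = CK*xK, realized as a
# state observer (gain L) combined with state feedback (gain kf):
#   AK = a - b*kf - L,  BK = L,  CK = -kf
a, b, kf, L = 1.0, 1.0, 3.0, 3.0
AK, BK, CK = a - b * kf - L, L, -kf

dt, x, xK = 0.001, 1.0, 0.0
for _ in range(int(10.0 / dt)):     # Euler integration of the closed loop
    u = CK * xK
    y = x
    x += dt * (a * x + b * u)
    xK += dt * (AK * xK + BK * y)
```

With these gains the closed-loop matrix [[a, b·CK], [BK, AK]] has both eigenvalues at −2, so the controller drives the unstable plant state to zero; in the quantum setting the same structure acts on the homodyne record dy(t) instead of a direct state measurement.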
dw_K(t) = \begin{pmatrix} dy(t) \\ d\bar{w}_K(t) \end{pmatrix} \qquad (9)

represents the quantum fields acting on the controller quantum system, where w_K(t) corresponds to a quantum Wiener process with a given Itô table. Also, y(t) represents the output quantum fields from the quantum system being controlled. Note that in the case of coherent quantum feedback control, there is no requirement that the components of y(t) commute with each other, and this in fact represents one of the main advantages of coherent quantum feedback control as opposed to classical measurement based quantum feedback control.

Cross-References

Bilinear Control of Schrödinger PDEs
Robustness Issues in Quantum Control

Bibliography

Albertini F, D'Alessandro D (2003) Notions of controllability for bilinear multilevel quantum systems. IEEE Trans Autom Control 48:1399–1403
Bachor H, Ralph T (2004) A guide to experiments in quantum optics, 2nd edn. Wiley-VCH, Weinheim
Bertet P, Ong FR, Boissonneault M, Bolduc A, Mallet F, Doherty AC, Blais A, Vion D, Esteve D (2012) Circuit quantum electrodynamics with a nonlinear resonator.
In: Dykman M (ed) Fluctuating nonlinear oscillators: from nanomechanics to quantum superconducting circuits. Oxford University Press, Oxford
Breuer H, Petruccione F (2002) The theory of open quantum systems. Oxford University Press, Oxford
Brif C, Chakrabarti R, Rabitz H (2010) Control of quantum phenomena: past, present and future. New J Phys 12:075008
Bouten L, van Handel R, James M (2007) An introduction to quantum filtering. SIAM J Control Optim 46(6):2199–2241
D'Alessandro D (2007) Introduction to quantum control and dynamics. Chapman & Hall/CRC, Boca Raton
Doherty A, Jacobs K (1999) Feedback-control of quantum systems using continuous state-estimation. Phys Rev A 60:2700–2711
Dong D, Petersen IR (2010) Quantum control theory and applications: a survey. IET Control Theory Appl 4(12):2651–2671
Dong D, Lam J, Tarn T (2009) Rapid incoherent control of quantum systems based on continuous measurements and reference model. IET Control Theory Appl 3:161–169
Gough J, James MR (2009) The series product and its application to quantum feedforward and feedback networks. IEEE Trans Autom Control 54(11):2530–2544
Gough JE, James MR, Nurdin HI, Combes J (2012) Quantum filtering for systems driven by fields in single-photon states or superposition of coherent states. Phys Rev A 86:043819
Grigoriu A, Rabitz H, Turinici G (2013) Controllability analysis of quantum systems immersed within an engineered environment. J Math Chem 51(6):1548–1560
James MR, Nurdin HI, Petersen IR (2008) H∞ control of linear quantum stochastic systems. IEEE Trans Autom Control 53(8):1787–1803
Kerckhoff J, Nurdin HI, Pavlichin DS, Mabuchi H (2010) Designing quantum memories with embedded control: photonic circuits for autonomous quantum error correction. Phys Rev Lett 105:040502
Khaneja N, Brockett R, Glaser S (2001) Time optimal control in spin systems. Phys Rev A 63:032308
Lloyd S (2000) Coherent quantum feedback. Phys Rev A 62:022108
Merzbacher E (1970) Quantum mechanics, 2nd edn. Wiley, New York
Mirrahimi M, van Handel R (2007) Stabilizing feedback controls for quantum systems. SIAM J Control Optim 46(2):445–467
Mirrahimi M, Rouchon P, Turinici G (2005) Lyapunov control of bilinear Schrödinger equations. Automatica 41:1987–1994
Nielsen M, Chuang I (2000) Quantum computation and quantum information. Cambridge University Press, Cambridge, UK
Nurdin HI, James MR, Petersen IR (2009) Coherent quantum LQG control. Automatica 45(8):1837–1846
Polderman JW, Willems JC (1998) Introduction to mathematical systems theory: a behavioral approach.
Shaiju AJ, Petersen IR (2012) A frequency domain condition for the physical realizability of linear quantum systems. IEEE Trans Autom Control 57(8):2033–2044
Shor P (1994) Algorithms for quantum computation: discrete logarithms and factoring. In: Goldwasser S (ed) Proceedings of the 35th annual symposium on the foundations of computer science. IEEE Computer Society, Los Alamitos, pp 124–134
Wang W, Schirmer SG (2010) Analysis of Lyapunov method for control of quantum states. IEEE Trans Autom Control 55(10):2259–2270
Wiseman HM, Milburn GJ (2010) Quantum measurement and control. Cambridge University Press, Cambridge, UK

Control of Ship Roll Motion

Tristan Perez¹ and Mogens Blanke²,³
¹ Electrical Engineering & Computer Science, Queensland University of Technology, Brisbane, QLD, Australia
² Department of Electrical Engineering, Automation and Control Group, Technical University of Denmark (DTU), Lyngby, Denmark
³ Centre for Autonomous Marine Operations and Systems (AMOS), Norwegian University of Science and Technology, Trondheim, Norway

Abstract

The undesirable effects of roll motion of ships (rocking about the longitudinal axis) became noticeable in the mid-nineteenth century when significant changes were introduced to the design of ships as a result of sails being replaced by steam engines and the arrangement being changed from broad to narrow hulls. The combination of these changes led to lower transverse stability (lower restoring moment for a given angle of roll) with the consequence of larger roll motion. The increase in roll motion and its effect on cargo and human performance led to the development of several control devices that aimed at reducing and controlling roll motion. The control devices most commonly used today are fin stabilizers, rudder,
Springer, New York anti-roll tanks, and gyrostabilizers. The use of
Control of Ship Roll Motion 197
different types of actuators for control of ship roll motion has been amply demonstrated for over 100 years. Performance, however, can still fall short of expectations because of difficulties associated with control system design, which have proven to be far from trivial due to fundamental performance limitations and large variations of the spectral characteristics of wave-induced roll motion. This short article provides an overview of the fundamentals of control design for ship roll motion reduction. The overview is limited to the most common control devices. Most of the material is based on Perez (Ship motion control. Advances in industrial control. Springer, London, 2005) and Perez and Blanke (Ann Rev Control 36(1):1367–5788, 2012).

Keywords

Roll damping; Ship motion control

Ship Roll Motion Control Techniques

One of the most commonly used devices to attenuate ship roll motion are fin stabilizers. These are small controllable fins located on the bilge of the hull, usually amidships. These devices attain a performance in the range of 60–90 % roll reduction (root mean square) (Sellars and Martin 1992). They require control systems that sense the vessel's roll motion and act by changing the angle of the fins. These devices are expensive and introduce underwater noise that can affect sonar performance; they add to propulsion losses, and they can be damaged. Despite this, they are among the most commonly used ship roll motion control devices. From a control perspective, highly nonlinear effects (dynamic stall) may appear when operating in severe sea states and heavy rolling conditions (Gaillarde 2002).

During studies of ship damage stability conducted in the late 1800s, it was observed that under certain conditions the water inside the vessel moved out of phase with respect to the wave profile, and thus the weight of the water on the vessel counteracted the increase of pressure on the hull, hence reducing the net roll excitation moment. This led to the development of fluid anti-roll tank stabilizers. The most common type of anti-roll tank is the U-tank, which comprises two reservoirs, located one on port and one on starboard, connected at the bottom by a duct. Anti-roll tanks can be either passive or active. In passive tanks, the fluid flows freely from side to side. According to the density and viscosity of the fluid used, the tank is dimensioned so that the time required for most of the fluid to flow from side to side equals the natural roll period of the ship. Active tanks operate in a similar manner, but they incorporate a control system that modifies the natural period of the tank to match the actual ship roll period. This is normally achieved by controlling the flow of air from the top of one reservoir to the other. Anti-roll tanks attain a medium to high performance in the range of 20–70 % roll angle reduction (RMS) (Marzouk and Nayfeh 2009). Anti-roll tanks increase the ship displacement. They can also be used to correct list (steady-state roll angle), and they are the preferred stabilizer for icebreakers.

Rudder-roll stabilization (RRS) is a technique based on the fact that the rudder is located not only aft but also below the center of gravity of the vessel, and thus the rudder imparts not only a yaw but also a roll moment. The idea of using the rudder for simultaneous course keeping and roll reduction was conceived in the late 1960s from observations of anomalous behavior of autopilots that did not have appropriate wave filtering – a feature of the autopilot that prevents the rudder from reacting to every single wave; see, for example, Fossen and Perez (2009) for a discussion of wave filtering. Rudder-roll stabilization has been demonstrated to attain medium to high performance in the range of 50–75 % roll reduction (RMS) (Baitis et al. 1983; Blanke et al. 1989; Källström et al. 1988; Oda et al. 1992; van Amerongen et al. 1990). An upgrade of the rudder machinery is required to be able to attain slew rates in the range 10–20 deg/s for RRS to have sufficient control authority.

A gyrostabilizer uses the gyroscopic effects of large rotating wheels to generate a roll-reducing torque. The use of gyroscopic effects was
proposed in the early 1900s as a method to eliminate roll rather than to reduce it. Although the performance of these systems was remarkable, up to 95 % roll reduction, their high cost, the increase in weight, and the large stress produced on the hull masked their benefits and prevented further developments. However, a recent increase in the development of gyrostabilizers has been seen in the yacht industry (Perez and Steinmann 2009).

Fins and rudders give rise to lift forces in proportion to the square of the flow velocity past the fin. Hence, roll stabilization by fin or rudder is not possible at low or zero speed. Only U-tanks and gyro devices are able to provide stabilization in these conditions. For further details about the performance of different devices, see Sellars and Martin (1992), and for a comprehensive description of the early development of devices, see Chalmers (1931).

Modeling of Ship Roll Motion for Control Design

The study of roll motion dynamics for control system design is normally done in terms of either one- or four-degrees-of-freedom (DOF) models. The choice between models of different complexity depends on the type of motion control system considered.

For a one-degree-of-freedom (1DOF) case, the following model is used:

φ̇ = p,  (1)
Ixx ṗ = Kh + Kw + Kc,  (2)

where φ is the roll angle, p is the roll rate, and Ixx is the rigid-body moment of inertia about the x-axis of a body-fixed coordinate system; Kh is the hydrostatic and hydrodynamic torque, Kw the torque generated by wave forces acting on the hull, and Kc the control torque. The hydrodynamic torque can be approximated by the following parametric model: Kh ≈ Kṗ ṗ + Kp p + Kp|p| p|p| + K(φ). The first term represents a hydrodynamic torque in roll due to pressure change that is proportional to the roll acceleration, and the coefficient Kṗ is called the roll added mass (inertia). The second term is a damping term, which captures forces due to wave making and linear skin friction, and the coefficient Kp is a linear damping coefficient. The third term is a nonlinear damping term, which captures forces due to viscous effects. The last term is the restoring torque due to gravity and buoyancy.

For a 4DOF model (surge, sway, roll, and yaw), the motion variables considered are η = [x, y, φ, ψ]ᵀ, ν = [u, v, p, r]ᵀ, τᵢ = [X, Y, K, N]ᵀ, where ψ is the yaw angle, the body-fixed velocities are u (surge) and v (sway), and r is the yaw rate. The forces and torques are X (surge), Y (sway), K (roll), and N (yaw). With these variables, the following mathematical model is usually considered:

η̇ = J(η) ν,  (3)
MRB ν̇ + CRB(ν) ν = τh + τc + τd,  (4)

where J(η) is a kinematic transformation, MRB is the rigid-body inertia matrix that corresponds to expressing the inertia tensor in body-fixed coordinates, CRB(ν) is the rigid-body Coriolis and centripetal matrix, and τh, τc, and τd represent the hydrodynamic, control, and disturbance vectors of force components and torques, respectively. The hydrostatic and hydrodynamic forces are τh = −MA ν̇ − CA(ν) ν − D(ν) ν − K(φ). The first two terms have their origin in the motion of a vessel in an irrotational flow in a nonviscous fluid. The third term corresponds to damping forces due to potential (wave-making) effects, skin friction, vortex shedding, and circulation (lift and drag). The hydrodynamic effects involved are quite complex, and different approaches based on superposition of either odd-term Taylor expansions or square-modulus (x|x|) series expansions are usually considered (Abkowitz 1964; Fedyaevsky and Sobolev 1964). The K(φ) term represents the restoring forces in roll due to buoyancy and gravity. The 4DOF model captures parameter dependency on ship speed as well as the couplings between steering and roll, and it is useful for controller design. For additional details about mathematical models of marine vehicles, see Fossen (2011).
roll moment with sea state and sailing conditions (speed and encounter angle). Fin machinery is designed so that the rate of fin motion is fast enough, and actuator rate saturation is not an issue in moderate sea states. The fins could be used to correct heeling angles (steady-state roll) when the ship makes speed, but this is avoided due to added resistance. If it is used, the integral action needs to include anti-windup. In terms of control strategies, PID, H∞, and LQR techniques have been successfully applied in practice. Highly nonlinear effects (dynamic stall) may appear when operating in severe sea states and heavy rolling conditions, and proposals for applications of model predictive control have been put forward to constrain the effective angle of attack of the fins. In addition, if the fins are located too far aft along the ship, the dynamic response from fin angle to roll can exhibit non-minimum phase dynamics, which can limit the performance at low encounter frequencies. A thorough review of the control literature can be found in Perez and Blanke (2012).

Rudder-Roll Stabilization
The problem of rudder-roll stabilization requires the 4DOF model (3) and (4), which captures the interaction between roll, sway, and yaw together with the changes in the hydrodynamic forces due to the forward speed. The response from rudder to roll is non-minimum phase (NMP), and the system is characterized by further constraints due to the single-input-two-output nature of the control problem – attenuate roll without too much interference with the heading. Studies of fundamental limitations due to NMP dynamics have been approached using standard frequency-domain tools by Hearns and Blanke (1998) and Perez (2005). A characterization of the trade-off between roll reduction and increase of interference was part of the controller design in Stoustrup et al. (1994). Perez (2005) determined the limits obtainable using optimal control with full disturbance information. The latter also incorporated constraints due to the limited authority of the control action in rate and magnitude of the rudder machinery and stall conditions of the rudder. The control design for rudder-roll stabilization has been addressed in practice using PID, LQG, H∞, and standard frequency-domain linear control designs. The characteristics of limited control authority were addressed by van Amerongen et al. (1990) using automatic gain control. In the literature, there have been proposals put forward for the use of model predictive control, QFT, sliding-mode nonlinear control, and autoregressive stochastic control. The combined use of fin and rudder has also been investigated: Grimble et al. (1993) and later Roberts et al. (1997) used H∞ control techniques. A thorough comparison of controller performances for warships was published in Crossland (2003). A thorough review of the control literature can be found in Perez and Blanke (2012).

Gyrostabilizers
Using a single-gimbal-suspension gyrostabilizer for roll damping control, the coupled vessel-roll-gyro system can be modeled as follows:

φ̇ = p,  (5)
(Ixx − Kṗ) ṗ − Kp p − K(φ) = Kw − Kg α̇ cos α,  (6)
Ip α̈ + Bp α̇ + Cp sin α = Kg p cos α + Tp,  (7)

where (6) represents the 1DOF roll dynamics and (7) represents the dynamics of the gyrostabilizer about the axis of the gimbal suspension; α is the gimbal angle, equivalent to the precession angle for a single gimbal suspension, Ip is the gimbal and wheel inertia about the gimbal axis, Bp is the damping, and Cp is a restoring term of the gyro about the precession axis due to the location of the gyro center of mass relative to the precession axis (Arnold and Maunder 1961). Tp is the control torque applied to the gimbal. The use of twin counter-spinning wheels prevents gyroscopic coupling with other degrees of freedom. Hence, the control design for gyrostabilizers can be based on a linear single-degree-of-freedom model for roll.

The wave-induced roll moment Kw excites the roll motion. As the roll motion develops, the roll rate p induces a torque along the precession axis of the gyrostabilizer. As the precession angle α
develops, there is a reaction torque done on the vessel that opposes the wave-induced moment. The latter is the roll-stabilizing torque, τg = −Kg α̇ cos α ≈ −Kg α̇. This roll torque can only be controlled indirectly through the precession dynamics in (7) via Tp. In the model above, the spin angular velocity ωspin is controlled to be constant; hence the wheels' angular momentum Kg = Ispin ωspin is constant.

The precession control torque Tp is used to control the gyro. As observed by Sperry (Chalmers 1931), the intrinsic behavior of the gyrostabilizer is to use roll rate to generate a roll torque. Hence, one could design a precession torque controller such that, from the point of view of the vessel, the gyro behaves as a damper. Depending on how the precession torque is delivered, it may be necessary to constrain the precession angle and rate. This problem has recently been considered in Donaire and Perez (2013) using passivity-based control.

U-tanks
U-tanks can be passive or active. Roll reduction is achieved by attempting to transfer energy from the roll motion to the motion of liquid within the tank and using the weight of the liquid to counteract the wave excitation moment. A key aspect of the design is the dimension and geometry of the tank, to ensure that there is enough weight due to the displaced liquid in the tank and that the oscillation of the fluid in the tank matches the vessel's natural frequency in roll; see Holden and Fossen (2012) and references therein. The design of the U-tank can ensure a single-frequency matching, at which the performance is optimized, and for this frequency the roll natural frequency is used. As the frequency of roll motion departs from this, a degradation of roll reduction occurs. Active U-tanks use valves to control the flow of air from the top of the reservoirs to extend the frequency matching in sailing conditions in which the roll dominant frequency is lower than the roll natural frequency – the flow of air is used to delay the motion of the liquid from one reservoir to the other. This control is achieved by detecting the dominant roll frequency and using this information to control the air flow from one reservoir to the other. If the roll dominant frequency is higher than the roll natural frequency, the U-tank is used in passive mode, and the standard roll reduction degradation occurs.

Summary and Future Directions

This article provides a brief summary of control aspects for the most common ship roll motion control devices. These aspects include the type of mathematical models used to design and analyze the control problem, the inherent fundamental limitations and the constraints that some of the designs are subject to, and the performance that can be expected from the different devices.

As an outlook, one of the key issues in roll motion control is model uncertainty and the adaptation to changes in the environmental conditions. As the vessel changes speed and heading, or as the seas build up or abate, the dominant frequency range of the wave-induced forces changes significantly. Due to the fundamental limitations discussed, a nonadaptive controller may produce roll amplification rather than roll reduction. This topic has received some attention in the literature via multi-mode control switching, but further work in this area could be beneficial.

In recent years, new devices have appeared for stabilization at zero speed, like flapping fins and rotating cylinders. Also, the industry's interest in roll gyrostabilizers has been re-ignited. The investigation of control designs for these devices has not yet received much attention within the control community. Hence, it is expected that this will create a potential for research activity in the future.

Cross-References

▶ Fundamental Limitation of Feedback Control
▶ H-Infinity Control
▶ H2 Optimal Control
▶ Linear Quadratic Optimal Control
▶ Mathematical Models of Ships and Underwater Vehicles
Control Structure Selection
Keywords

Economic control; Input-output controllability; Input/output selection; Plantwide control; Regulatory control; Supervisory control

Introduction

Consider the generalized controller design problem in Fig. 1, where P denotes the generalized plant model. Here, the objective is to design the controller K, which, based on the sensed outputs v, computes the inputs (MVs) u such that the variables z are kept small, in spite of variations in the variables w, which include disturbances (d), varying setpoints/references (CVs), and measurement noise (n),

w = [d; CVs; n]

The variables z, which should be kept small, typically include the control error for the selected controlled variables (CV) plus the plant inputs (u),

z = [CV − CVs; u]

The variables v, which are the inputs to the controller, include all known variables, including measured outputs (ym), measured disturbances (dm), and setpoints,

v = [ym; dm; CVs].

The cost function for designing the optimal controller K is usually the weighted control error, J′ = ‖W z‖. The reason for using a prime on J (J′) is to distinguish it from the economic cost J, which we later use for selecting the controlled variables (CV).

Notice that it is assumed in Fig. 1 that we know what to measure (v), what to manipulate (u), and, most importantly, which variables in z we would like to keep at setpoints (CV); that is, we have assumed a given control structure. The term "control structure selection" (CSS) and its synonym "control structure design" (CSD) are associated with the overall control philosophy for the system, with emphasis on the structural decisions which are a prerequisite for the controller design problem in Fig. 1:
1. Selection of controlled variables (CVs, "outputs," included in z in Fig. 1)
2. Selection of manipulated variables (MVs, "inputs," u in Fig. 1)
3. Selection of measurements y (included in v in Fig. 1)
4. Selection of the control configuration (the structure of the overall controller K that interconnects the controlled, manipulated, and measured variables; structure of K in Fig. 1)
5. Selection of the type of controller K (PID, MPC, LQG, H-infinity, etc.) and the objective function (norm) used to design and analyze it.

Decisions 2 and 3 (selection of u and y) are sometimes referred to as the input/output (IO) selection problem. In practice, the controller (K) is usually divided into several layers, operating on different time scales (see Fig. 2), which implies that in addition to selecting the (primary) controlled variables (CV1 = CV) we must also select the (secondary) variables that interconnect the layers (CV2).

Control structure selection includes all the structural decisions that the engineer needs to make when designing a control system, but it does not involve the actual design of each individual controller block. Thus, it involves the decisions necessary to make a block diagram (Fig. 1; used by control engineers) or process & instrumentation diagram (used by process engineers) for the entire plant, and it provides the starting point for a detailed controller design.

Control Structure Selection, Fig. 1 General formulation for designing the controller K. The plant P is controlled by manipulating u and is disturbed by the signals w. The controller uses the measurements v, and the control objective is to keep the outputs (weighted control error) z as small as possible
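As a small numeric illustration of the signal bookkeeping in the formulation above (the numbers are made up, and the weight is assumed diagonal for concreteness):

```python
import math

# Illustrative values only: two controlled variables and one plant input.
CV = [1.2, 0.9]        # measured controlled variables
CV_s = [1.0, 1.0]      # their setpoints
u = [0.3]              # plant input

z = [a - b for a, b in zip(CV, CV_s)] + u      # z = [CV - CVs; u]
W = [1.0, 1.0, 0.5]                            # assumed diagonal weight
J_prime = math.sqrt(sum((wi * zi) ** 2 for wi, zi in zip(W, z)))  # J' = ||W z||
print(z, round(J_prime, 4))
```

Making the input weight small relative to the error weights shifts the design toward tight setpoint tracking at the cost of larger input usage, which is the usual trade-off this cost function encodes.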
into several subcontrollers, using two main principles:
– Decentralized (local) control. This "horizontal decomposition" of the control layer is usually based on separation in space, for example, by using local control of individual units.
– Hierarchical (cascade) control. This "vertical decomposition" is usually based on time scale separation, as illustrated for a process plant in Fig. 2. The upper three layers in Fig. 2 deal explicitly with economic optimization and are not considered here. We are concerned with the two lower control layers, where the main objective is to track the setpoints specified by the layer above.

In accordance with the two main objectives for control, the control layer is in most cases divided hierarchically into two layers (Fig. 2):
1. A "slow" supervisory (economic) layer
2. A "fast" regulatory (stabilization) layer

Another reason for the separation into two control layers is that the tasks of economic operation and regulation are fundamentally different. Combining the two objectives in a single cost function, which is required for designing a single centralized controller K, is like trying to compare apples and oranges. For example, how much is an increased stability margin worth in monetary units [$]? Only if there is a reasonable benefit in combining the two layers, for example, because there is limited time scale separation between the tasks of regulation and optimal economics, should one consider combining them into a single controller.

Notation and Matrices H1 and H2 for Controlled Variable Selection

The most important notation is summarized in Table 1 and Fig. 3. To distinguish between the two control layers, we use "1" for the upper supervisory (economic) layer and "2" for the regulatory layer, which is "secondary" in terms of its place in the control hierarchy.

Control Structure Selection, Fig. 3 Block diagram of a typical control hierarchy, emphasizing the selection of controlled variables for supervisory (economic) control (CV1 = H1 y) and regulatory control (CV2 = H2 y)

There is often limited possibility to select the input set (u), as it is usually constrained by the
plant design. However, there may be a possibility to add inputs or to move some to another location, for example, to avoid saturation or to reduce the time delay and thus improve the input-output controllability.

There is much more flexibility in terms of output selection, and the most important structural decision is related to the selection of controlled variables in the two control layers, as given by the decision matrices H1 and H2 (see Fig. 3):

CV1 = H1 y

and we have by definition y = [ym; u]. Then the choice

H2 = [0 1 0 0 0; 0 0 0 0 1]

means that we have selected CV2 = H2 y = [Tb; qb]. Thus, u1 = qb is an unused input for regulatory control, and in the regulatory layer we close one loop, using u2 = qa to control y2 = Tb. If we instead select

H2 = [1 0 0 0 0; 0 0 1 0 0]
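The selection-matrix bookkeeping can be sketched in a few lines. The labels below are hypothetical placeholders for three measured outputs and two inputs (the concrete process example these matrices refer to is introduced in a part of the entry not reproduced here); only Tb, qa, and qb appear in the text above.

```python
# y = [ym; u]: three (hypothetical) measured outputs followed by two inputs.
y_labels = ["T_a", "T_b", "T_c", "q_a", "q_b"]

def selected(H, labels):
    """Labels of CV = H y for a 0/1 selection matrix H (one 1 per row)."""
    return [labels[row.index(1)] for row in H]

H2 = [[0, 1, 0, 0, 0],
      [0, 0, 0, 0, 1]]
print(selected(H2, y_labels))  # picks CV2 = [T_b; q_b]
```

Rows that pick entries of u (here q_b) correspond to inputs held at their setpoints, i.e., inputs left unused by the regulatory layer.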
an old car, the active constraint may be the maximum fuel flow (CV1 = f [l/s] with setpoint fmax). The latter corresponds to an input constraint (umax = fmax), which is trivial to implement ("full gas"); the former corresponds to an output constraint (ymax = vmax), which requires a controller ("cruise control").
• For "hard" output constraints, which cannot be violated at any time, we need to introduce a backoff (safety margin) to guarantee feasibility. The backoff is defined as the difference between the optimal value and the actual setpoint; for example, we need to back off from the speed limit because of the possibility of measurement error and imperfect control:

CV1,s = CV1,max − backoff

For example, to avoid exceeding the speed limit of 100 km/h, we may set backoff = 5 km/h and use a setpoint vs = 95 km/h rather than 100 km/h.

CV1 Rule 2. For the remaining unconstrained degrees of freedom, look for "self-optimizing" variables which, when held constant, indirectly lead to close-to-optimal operation, in spite of disturbances.
• Self-optimizing variables (CV1 = H1 y) are variables which, when kept constant, indirectly (through the action of the feedback control system) lead to close-to-optimal adjustment of the inputs (u) when there are disturbances (d).
• An ideal self-optimizing variable is the gradient of the cost function with respect to the unconstrained input, CV1 = dJ/du = Ju.
• More generally, since we rarely can measure the gradient Ju, we select CV1 = H1 y. The selection of a good H1 is a nontrivial task, but some quantitative approaches are given below.

For example, consider again the problem of driving between two cities, but assume that the objective is to minimize the total fuel, J = V [liters]. Here, driving at maximum speed will consume too much fuel, and driving too slowly is also nonoptimal. This is an unconstrained optimization problem, and identifying a good CV1 is not obvious. One option is to maintain a constant speed (CV1 = v), but the optimal value of v may vary depending on the slope of the road. A more "self-optimizing" option could be to keep a constant fuel rate (CV1 = f [l/s]), which will imply that we drive slower uphill and faster downhill. More generally, one can control combinations, CV1 = H1 y, where H1 is a "full" matrix.

CV1 Rule 3. For the unconstrained degrees of freedom, one should never control a variable that reaches its maximum or minimum value at the optimum; for example, never try to control directly the cost J. Violation of this rule gives either infeasibility (if attempting to control J at a lower value than Jmin) or nonuniqueness (if attempting to control J at a higher value than Jmin).

Assume again that we want to minimize the total fuel needed to drive between two cities, J = V [l]. Then one should avoid fixing the
total fuel, CV1 = V [l], or, alternatively, avoid fixing the fuel consumption ("gas mileage") in liters per km (CV1 = f [l/km]). Attempting to control the fuel consumption [l/km] below the car's minimum value is obviously not possible (infeasible). Alternatively, attempting to control the fuel consumption above its minimum value has two possible solutions: driving slower or faster than the optimum. Note that the policy of controlling the fuel rate f [l/s] at a fixed value will never become infeasible.

For CV1 Rule 2, it is always possible to find good variable combinations (i.e., H1 is a "full" matrix), at least locally, but whether or not it is possible to find good individual variables (H1 is a selection matrix) is not obvious. To help identify potential "self-optimizing" variables (CV1 = c), the following requirements may be used:

Requirement 1. The optimal value of c is insensitive to disturbances, that is, dcopt/dd = H1 F is small. Here F = dyopt/dd is the optimal sensitivity matrix (see below).

Requirement 2. The variable c is easy to measure and control accurately.

Requirement 3. The value of c is sensitive to changes in the manipulated variable u; that is, the gain, G = H Gy, from u to c is large (so that even a large error in the controlled variable, c, results in only a small variation in u). Equivalently, the optimum should be "flat" with respect to the variable c. Here Gy = dy/du is the measurement gain matrix (see below).

Requirement 4. For cases with two or more controlled variables c, the selected variables should not be closely correlated.

All four requirements should be satisfied. For example, for the operation of a marathon runner, the heart rate may be a good "self-optimizing" controlled variable c (to keep at a constant setpoint). Let us check this against the four requirements. The optimal heart rate is weakly dependent on the disturbances (Requirement 1), and the heart rate is easy to measure (Requirement 2). The heart rate is quite sensitive to changes in power input (Requirement 3). Requirement 4 does not apply since this is a problem with only one unconstrained input (the power). In summary, the heart rate is a good candidate.

Regions and switching. If the optimal active constraints vary depending on the disturbances, new controlled variables (CV1) must be identified (offline) for each active constraint region, and online switching is required to maintain optimality. In practice, it is easy to identify when to switch when one reaches a constraint. It is less obvious when to switch out of a constraint, but actually one simply has to monitor the value of the unconstrained CVs from the neighbouring regions and switch out of the constraint region when the unconstrained CV reaches its setpoint.

In general, one would like to simplify the control structure and reduce the need for switching. This may require using a suboptimal CV1 in some regions of active constraints. In this case, the setpoint for CV1 may not be its nominally optimal value (which is the normal choice), but rather a "robust setpoint" (with backoff) which reduces the loss when we are outside the nominal constraint region.

Structure of supervisory layer. The supervisory layer may either be centralized, e.g., using model predictive control (MPC), or decomposed into simpler subcontrollers using standard elements, like decentralized control (PID), cascade control, selectors, decouplers, feedforward elements, ratio control, split range control, and input midrange control (also known as input resetting, valve position control, or habituating control). In theory, the performance is better with the centralized approach (e.g., MPC), but the difference can be small when designed by a good engineer. The main reasons for using simpler elements are that (1) the system can be implemented in the existing "basic" control system, (2) it can be implemented with little model information, and (3) it can be built up gradually. However, such systems can quickly become complicated and difficult to understand for anyone other than the engineer who designed them. Therefore, model-based centralized solutions (MPC) are often preferred because the design is more systematic and easier to modify.
Quantitative Approach for Selecting Economic Controlled Variables, CV1

A quantitative approach for selecting economic controlled variables is to consider the effect of the choice CV1 = H1 y on the economic cost J when disturbances d occur. One should also include noise/errors (ny) related to the measurements and inputs.

Step S1. Define operational objectives (economic cost function J and constraints)
We first quantify the operational objectives in terms of a scalar cost function J [$/s] that should be minimized (or, equivalently, a scalar profit function, P = −J, that should be maximized). For process control applications, this is usually easy, and typically we have

J = cost feed + cost utilities (energy) − value products [$/s]

In practice, the optimum input uopt(d) cannot be realized, because of model error and unknown disturbances d, so we use a feedback implementation where u is adjusted to keep the selected variables CV1 at their nominally optimal setpoints.
Together with obtaining the model, the optimization step S2 is often the most time-consuming step in the entire plantwide control procedure.

Step S3. Select supervisory (economic) controlled variables, CV1

CV1-Rule 1: Control Active Constraints
A primary goal for solving the optimization problem is to find the expected regions of active constraints; a constraint is said to be "active" if g = 0 at the optimum. The optimally active constraints will vary depending on disturbances (d) and market conditions (prices).
become manipulated variables (MV1) for the supervisory layer (see Fig. 3). Furthermore, since the setpoints are set by the supervisory layer in a cascade manner, the system eventually approaches the same steady state (as defined by the choice of economic variables CV1) regardless of the choice of controlled variables in the regulatory layer.
The inputs for the regulatory layer (u2) are selected as a subset of all the available inputs (u). For stability reasons, one should avoid input saturation in the regulatory layer. In particular, one should avoid using inputs (in the set u2) that are optimally constrained in some disturbance region. Otherwise, in order to avoid input saturation, one needs to include a backoff for the input when entering this operational region, and doing so will have an economic penalty.
In the regulatory layer, the outputs (y2) are usually selected as individual measurements, and they are often not important variables in themselves. Rather, they are "extra outputs" that are controlled in order to "stabilize" the system, and their setpoints (y2s) are changed by the layer above to obtain economically optimal operation. For example, in a distillation column one may control a temperature somewhere in the middle of the column (y2 = T) in order to "stabilize" the column profile. Its setpoint (y2s = Ts) is adjusted by the supervisory layer to obtain the desired product composition (y1 = c).

Input-Output (IO) Selection for Regulatory Control (u2, y2)

Finding the truly optimal control structure, including selecting inputs and outputs for regulatory control, requires finding also the optimal controller parameters. This is an extremely difficult mathematical problem, at least if the controller K is decomposed into smaller controllers. In this section, we consider some approaches that do not require that the controller parameters be found. This is done by making assumptions related to achievable control performance (controllability) or perfect control.
Before we look at the approaches, note again that the IO-selection for regulatory control may be combined into a single decision, by considering the selection of

CV2 = [y2; u1] = H2 y

Here u1 denotes the inputs that are not used by the regulatory control layer. This follows because we want to use all inputs u for control, so assuming that the set u is given, "selection of inputs u2" (decision 2) is by elimination equivalent to "selection of inputs u1." Note that CV2 includes all variables that we keep at desired (constant) values within the fast time horizon of the regulatory control layer, including the "unused" inputs u1.

Survey by Van de Wal and Jager
Van de Wal and Jager provide an overview of methods for input-output selection, some of which include:
1. "Accessibility," based on guaranteeing a cause–effect relationship between the selected inputs (u2) and outputs (y2). Use of such measures may eliminate unworkable control structures.
2. "State controllability and state observability," to ensure that any unstable modes can be stabilized using the selected inputs and outputs.
3. "Input-output controllability" analysis, to ensure that y2 can be acceptably controlled using u2. This is based on scaling the system and then analysing the transfer matrices G2(s) (from u2 to y2) and Gd2 (from expected disturbances d to y2). Some important controllability measures are right half plane zeros (unstable dynamics of the inverse), condition number, singular values, relative gain array, etc. One problem here is that there are many different measures, and it is not clear which should be given most emphasis.
4. "Achievable robust performance." This may be viewed as a more detailed version of input-output controllability, where several relevant issues are combined into a single measure. However, this requires that the control problem can actually be formulated clearly, which may be very difficult, as already mentioned.
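As an illustration of one of the controllability measures listed above, the relative gain array Λ = G ∘ (G⁻¹)ᵀ (elementwise product) takes only a few lines to compute; the steady-state gain matrix below is an assumed toy example, not taken from this article:

```python
import numpy as np

# Relative gain array (RGA): elementwise product of G and inv(G) transposed.
# Illustrative 2x2 steady-state gain matrix (assumed numbers).
G = np.array([[0.878, -0.864],
              [1.082, -1.096]])
rga = G * np.linalg.inv(G).T
print(rga)
```

Each row and each column of the RGA sums to 1; entries far from 1 (here the diagonal elements are about 35) warn of strong two-way interaction and high sensitivity to input uncertainty for the corresponding pairing.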
In addition, it requires finding the optimal robust controller for the given problem, which may be very difficult.
Most of these methods are useful for analyzing a given structure (u2, y2) but less suitable for selection. Also, the list of methods is incomplete, as disturbance rejection, which is probably the most important issue for the regulatory layer, is hardly considered.

A Systematic Approach for IO-Selection Based on Minimizing State Drift Caused by Disturbances

The objective of the regulatory control layer may be quantified in terms of the cost

J2 = ||Wx||

which may be interpreted as the weighted state drift. One justification for considering the state drift is that the regulatory layer should ensure that the system, as measured by the weighted states Wx, does not drift too far away from the desired state, and thus stays in the "linear region" when there are disturbances. Note that the cost J2 is used to select controlled variables (CV2) and not to design the controller (for which the cost may be the control error, J2' = ||CV2 − CV2s||). Within this framework, the IO-selection problem for the regulatory layer is then to select the nonsquare matrix H2, where

CV2 = H2 y

and y = [ym; u], such that the cost J2 is minimized. The cause for changes in J2 is disturbances d, and we consider the linear model (in deviation variables)

y = Gy u + Gyd d
x = Gx u + Gxd d

where the G-matrices are transfer matrices. Here, Gxd gives the effect of the disturbances on the states with no control, and the idea is to reduce the disturbance effect by closing the regulatory control loops. Within the "slow" time scale of the supervisory layer, we can assume that CV2 is perfectly controlled and thus constant, or CV2 = 0 in terms of deviation variables. This gives

CV2 = H2 Gy u + H2 Gyd d = 0

and solving with respect to u gives u = −(H2 Gy)^{-1} H2 Gyd d. Substituting this into the model for x gives x = Pxd d, where

Pxd = Gxd − Gx (H2 Gy)^{-1} H2 Gyd

is the disturbance effect for the "partially" controlled system with only the regulatory loops closed. Note that it is not generally possible to make Pxd = 0 because we have more states than we have available inputs. To have a small "state drift," we want J2 = ||W Pxd d|| to be small, and to have a simple regulatory control system we want to close as few regulatory loops as possible. Assume that we have normalized the disturbances so that the norm of d is 1; then we can solve the following problem:

For 0, 1, 2, ... etc. loops closed, solve:
min over H2 of ||M2(H2)||,
where M2 = W Pxd and dim(u2) = dim(y2) = no. of loops closed.

By comparing the value of ||M2(H2)|| with different numbers of loops closed (i.e., with different H2), we can then decide on an appropriate regulatory layer structure. For example, assume that we find that the value of J2 is 110 (0 loops closed), 0.2 (1 loop), and 0.02 (2 loops), and assume we have scaled the disturbances and states such that a J2-value less than about 1 is acceptable; then closing 1 regulatory loop is probably the best choice.
In principle, this is straightforward, but there are three remaining issues: (1) we need to choose an appropriate norm, (2) we should include measurement noise to avoid selecting insensitive measurements, and (3) the problem must be solvable numerically.

Issue 1. The norm of M2 should be evaluated in the frequency range between the "slow" bandwidth of the supervisory control layer (ωB1) and the "fast" bandwidth of the regulatory control layer (ωB2). However, since it is likely that the system sometimes operates without the supervisory layer, it is reasonable to evaluate the norm of Pxd in the frequency range from 0 (steady state) to ωB2. Since we want H2 to be a constant (not frequency-dependent) matrix, it is reasonable to choose H2 to minimize the norm of M2 at the frequency where ||M2|| is expected to have its peak. For some mechanical systems, this may be at some resonance frequency, but for process control applications it is usually at steady state (ω = 0); that is, we can use the steady-state gain matrices when computing Pxd. In terms of the norm, we use the 2-norm (Frobenius norm), mainly because it has good numerical properties, and also because it has the interpretation of giving the expected variance in x for normally distributed disturbances.

Issues 2 and 3. If we also include measurement noise ny, which we should, then the expected value of J2 is minimized by solving the problem min over H2 of ||M2(H2)||2, where (Yelchuru and Skogestad 2013)

M2(H2) = J2uu^{1/2} (H2 Gy)^{-1} H2 Y2

As for the cost J2, it is possible to obtain analytical formulas for the optimal sensitivity, F2. Again, Wd and Wny are diagonal matrices expressing the expected magnitude of the disturbances (d) and noise (for y).
For the case when H2 is a "full" matrix, this can be reformulated as a convex optimization problem, and an explicit solution is

H2^T = (Y2 Y2^T)^{-1} Gy (Gy^T (Y2 Y2^T)^{-1} Gy)^{-1} J2uu^{1/2}

and from this we can find the optimal value of J2. It may seem restrictive to assume that H2 is a "full" matrix, because we usually want to control individual measurements, and then H2 should be a selection matrix, with 1's and 0's. Fortunately, since we in this case want to control as many measurements (y2) as inputs (u2), H2 is square in the selected set, and the optimal value of J2 when H2 is a selection matrix is the same as when H2 is a full matrix. The reason for this is that specifying (controlling) any linear combination of y2 uniquely determines the individual y2's, since dim(u2) = dim(y2). Thus, we can find the optimal selection matrix H2 by searching through all the candidate square sets of y. This can be effectively solved using the branch and bound approach of Kariwala and Cao, or alternatively it can be solved as a mixed-integer problem with a quadratic program (QP) at each node (Yelchuru and Skogestad 2012). The approach of Yelchuru and Skogestad can also be applied to the case where we allow for disjunct sets of measurement combinations, which may give a lower J2 in some cases.
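The invariance underlying this argument — recombining the controlled variables CV2 = H2 y by any invertible matrix D leaves the loss ||M2|| unchanged, since M2(D H2) = J2uu^{1/2} (D H2 Gy)^{-1} D H2 Y2 = M2(H2) — can be checked numerically. All matrices below are assumed toy data; Y2 simply stands in for the combined disturbance/noise contribution:

```python
import numpy as np

rng = np.random.default_rng(1)
ny, nu = 5, 2
Gy = rng.normal(size=(ny, nu))          # measurement gain matrix
Y2 = rng.normal(size=(ny, ny + 2))      # stand-in disturbance/noise term
J2uu_half = np.linalg.cholesky(np.array([[2.0, 0.3], [0.3, 1.0]]))

def loss(H2):
    # ||M2(H2)||_F with M2 = J2uu^{1/2} (H2 Gy)^{-1} H2 Y2
    return np.linalg.norm(J2uu_half @ np.linalg.solve(H2 @ Gy, H2 @ Y2), "fro")

H2 = rng.normal(size=(nu, ny))          # some "full" combination matrix
D  = rng.normal(size=(nu, nu))          # any invertible recombination
print(np.isclose(loss(H2), loss(D @ H2)))  # True: loss unchanged by D
```

This is why, within a selected square set of measurements, restricting H2 to a 0/1 selection matrix costs nothing relative to the best full combination of that same set.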
one should consider the second-best matrix H2 (with the second-best value of the state drift J2), and so on.
2. The state drift cost J2 = ||Wx|| is in principle independent of the economic cost (J). This is an advantage because we know that the economically optimal operation (e.g., active constraints) may change, whereas we would like the regulatory layer to remain unchanged. However, it is also a disadvantage, because the regulatory layer determines the initial response to disturbances, and we would like this initial response to be in the right direction economically, so that the required correction from the slower supervisory layer is as small as possible. Actually, this issue can be included by extending the state vector x to include also the economic controlled variables, CV1, which are selected based on the economic cost J. The weight matrix W may then be used to adjust the relative weights of avoiding drift in the internal states x and the economic controlled variables CV1.
3. The above steady-state approach does not consider input-output pairing, for which dynamics are usually the main issue. The main pairing rule is to "pair close" in order to minimize the effective time delay between the selected input and output. For a more detailed approach, decentralized input-output controllability must be considered.

Summary and Future Directions

Control structure design involves the structural decisions that must be made before designing the actual controller, and it is in most cases a much more important step than the controller design. In spite of this, the theoretical tools for making the structural decisions are much less developed than for controller design. This chapter summarizes some approaches, and it is expected, or at least hoped, that this important area will further develop in the years to come.
The most important structural decision is usually related to selecting the economic controlled variables, CV1 = H1 y, and the stabilizing controlled variables, CV2 = H2 y. However, control engineers have traditionally not used the degrees of freedom in the matrices H1 and H2, and this chapter has summarized some approaches.
There has been a belief that the use of "advanced control," e.g., MPC, makes control structure design less important. However, this is not correct, because also for MPC one must choose inputs (MV1 = CV2s) and outputs (CV1). The selection of CV1 may to some extent be avoided by use of "Dynamic Real-Time Optimization (DRTO)" or "Economic MPC," but these optimizing controllers usually operate on a slower time scale by sending setpoints to the basic control layer (MV1 = CV2s), which means that selecting the variables CV2 is critical for achieving (close to) optimality on the fast time scale.

Cross-References

Control Hierarchy of Large Processing Plants: An Overview
Industrial MPC of Continuous Processes
PID Control

Bibliography

Alstad V, Skogestad S (2007) Null space method for selecting optimal measurement combinations as controlled variables. Ind Eng Chem Res 46(3):846–853
Alstad V, Skogestad S, Hori ES (2009) Optimal measurement combinations as controlled variables. J Process Control 19:138–148
Downs JJ, Skogestad S (2011) An industrial and academic perspective on plantwide control. Ann Rev Control 17:99–110
Engell S (2007) Feedback control for optimal process operation. J Proc Control 17:203–219
Foss AS (1973) Critique of chemical process control theory. AIChE J 19(2):209–214
Kariwala V, Cao Y (2010) Bidirectional branch and bound for controlled variable selection. Part III. Local average loss minimization. IEEE Trans Ind Inform 6:54–61
Kookos IK, Perkins JD (2002) An algorithmic method for the selection of multivariable process control structures. J Proc Control 12:85–99
Controllability and Observability 215
Morari M, Arkun Y, Stephanopoulos G (1973) Studies in the synthesis of control structures for chemical processes. Part I. AIChE J 26:209–214
Narraway LT, Perkins JD (1993) Selection of control structure based on economics. Comput Chem Eng 18:S511–S515
Skogestad S (2000) Plantwide control: the search for the self-optimizing control structure. J Proc Control 10:487–507
Skogestad S (2004) Control structure design for complete chemical plants. Comput Chem Eng 28(1–2):219–234
Skogestad S (2012) Economic plantwide control, chapter 11. In: Rangaiah GP, Kariwala V (eds) Plantwide control. Recent developments and applications. Wiley, Chichester, pp 229–251. ISBN:978-0-470-98014-9
Skogestad S, Postlethwaite I (2005) Multivariable feedback control, 2nd edn. Wiley, Chichester
van de Wal M, de Jager B (2001) Review of methods for input/output selection. Automatica 37:487–510
Yelchuru R, Skogestad S (2012) Convex formulations for optimal selection of controlled variables and measurements using Mixed Integer Quadratic Programming. J Process Control 22:995–1007
Yelchuru R, Skogestad S (2013) Quantitative methods for regulatory layer selection. J Process Control 23:58–69

Controllability and Observability

H.L. Trentelman
Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, AV, The Netherlands

Abstract

State controllability and observability are key properties in linear input–output systems in state-space form. In the state-space approach, the relation between inputs and outputs is represented using the state variables of the system. A natural question is then to what extent it is possible to manipulate the values of the state vector by means of an appropriate choice of the input function. The concepts of controllability, reachability, and null controllability address this issue. Another important question is whether it is possible to uniquely determine the values of the state vector from knowledge of the input and output signals over a given time interval. This question is dealt with using the concept of observability.

Keywords

Controllability; Duality; Indistinguishability; Input–output systems in state-space form; Observability; Reachability

Introduction

In the state-space approach to input–output systems, the relation between input signals and output signals is represented by means of two equations. In the continuous-time case, the first of these equations is a first-order vector differential equation driven by the input signal and is often called the state equation. The second equation is an algebraic equation, often called the output equation. The unknown in the differential equation is called the state vector of the system. Given a particular input signal and initial value of the state vector, the state equation generates a unique solution, called the state trajectory of the system. The output equation determines the corresponding output signal as a function of this state trajectory and the input signal. Thus, in the state-space approach, the input–output behavior of the system is obtained using the state vector as an intermediate variable.
In the context of input–output systems in state-space form, the properties of controllability and observability characterize the interaction between the input, the state, and the output. In particular, controllability describes the ability to manipulate the state vector of the system by applying appropriate input signals. Observability describes the ability to determine the values of the state vector from knowledge of the input and output over a certain time interval. The properties of controllability and observability are fundamental properties that play a major role in the analysis and control of linear input–output systems in state-space form.
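The role of the state as an intermediate variable between input and output can be sketched with a small simulation (an assumed two-state example, integrated by forward Euler):

```python
import numpy as np

# Assumed example system: xdot = A x + B u, y = C x + D u.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

x = np.array([[0.0], [0.0]])          # initial state x(0)
u = np.array([[1.0]])                 # constant (step) input
dt, T = 1e-3, 10.0
for _ in range(int(T / dt)):          # state equation -> state trajectory
    x = x + dt * (A @ x + B @ u)
y = C @ x + D @ u                     # output equation -> output signal
print(float(y[0, 0]))                 # approaches the steady state 0.5
```

The input drives the state through the differential equation, and the output is then read off the state algebraically, exactly as in the two-equation setup described above.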
Systems with Inputs and Outputs

Consider a continuous-time, linear, time-invariant, input–output system in state-space form represented by the equations

ẋ(t) = A x(t) + B u(t),
y(t) = C x(t) + D u(t).    (1)

This system is referred to as Σ. In Eq. (1), A, B, C, and D are maps (or matrices), and the functions x, u, and y are considered to be defined on the real axis R or on any subinterval of it. In particular, one often assumes the domain of definition to be the nonnegative part of R, which is without loss of generality since the system is time-invariant. The function u is called the input, and its values are assumed to be given. The class of admissible input functions is denoted by U. Often, U is the class of piecewise continuous or locally integrable functions, but for most purposes, the exact class from which the input functions are chosen is not important. We assume that input functions take values in an m-dimensional space U, which we often identify with R^m. The first equation of Σ is an ordinary differential equation for the variable x. For a given initial value of x and input function u, the function x is completely determined by this equation. The variable x is called the state variable, and it is assumed to take values in an n-dimensional space X. The space X is called the state space. It is usually identified with R^n. Finally, y is called the output of the system and takes values in a p-dimensional space Y, which we identify with R^p. Since the system Σ is completely determined by the maps (or matrices) A, B, C, and D, we identify Σ with the quadruple (A, B, C, D).
The solution of the differential equation of Σ with initial value x(0) = x0 is denoted as xu(t, x0). It can be given explicitly using the variation-of-constants formula, namely,

xu(t, x0) = e^{At} x0 + ∫_0^t e^{A(t−τ)} B u(τ) dτ.    (2)

The corresponding value of y is denoted by yu(t, x0). As a consequence of (2), we have

yu(t, x0) = C e^{At} x0 + ∫_0^t K(t−τ) u(τ) dτ + D u(t),    (3)

where K(t) := C e^{At} B. In the case D = 0, it is customary to call K(t) the impulse response. In the general case, one would call the distribution K(t) + D δ(t) the impulse response.

Controllability

Controllability is concerned with the ability to manipulate the state by choosing an appropriate input signal, thus steering the current state to a desired future state in a given finite time. Thus, in particular, in the differential equation in (1), we study the relation between u and x. We investigate to what extent one can influence the state x by a suitable choice of the input u.
For this purpose, we introduce the (at time T) reachable space W_T, defined as the space of points x1 for which there exists an input u such that xu(T, 0) = x1, i.e., the set of points that can be reached from the origin at time T. It follows from the linearity of the differential equation that W_T is a linear subspace of X. In fact, (2) implies

W_T = { ∫_0^T e^{A(T−τ)} B u(τ) dτ | u ∈ U }.    (4)

We call system Σ reachable at time T if every point can be reached from the origin, i.e., if W_T = X. It follows from (2) that if the system is reachable at time T, every point can be reached from every point at time T, because the condition for the point x1 to be reachable from x0 at time T is

x1 − e^{AT} x0 ∈ W_T.

The property that every point is reachable from any point in a given time interval [0, T] is called controllability (at T). Finally, we have the concept of null controllability, i.e., the possibility to reach the origin from an arbitrary initial point. According to (2), for a point x0 to be null controllable at T, we must have
e^{AT} x0 + ∫_0^T e^{A(T−τ)} B u(τ) dτ = 0

for some u ∈ U. We observe that x0 is null controllable at T (by the control u) if and only if −e^{AT} x0 is reachable at T (by the control u). Since e^{AT} is invertible, we see that Σ is null controllable at T if and only if Σ is reachable at T. Henceforth, we refer to the equivalent properties reachability, controllability, and null controllability simply as controllability (at T). It should be remarked that the equivalence of these concepts does not hold in other situations, e.g., for discrete-time systems. We intend to obtain an explicit expression for the space W_T and, based on this, an explicit condition for controllability. This is provided by the following result.

Theorem 1 Let η be an n-dimensional row vector and T > 0. Then the following statements are equivalent:
1. η ⊥ W_T (i.e., η x = 0 for all x ∈ W_T).
2. η e^{tA} B = 0 for 0 ≤ t ≤ T.
3. η A^k B = 0 for k = 0, 1, 2, ....
4. η (B AB ⋯ A^{n−1} B) = 0.

Proof (i) ⇔ (ii): If η ⊥ W_T, then by Eq. (4):

∫_0^T η e^{A(T−τ)} B u(τ) dτ = 0    (5)

for every u ∈ U. Choosing u(t) = B^T e^{A^T (T−t)} η^T for 0 ≤ t ≤ T yields

∫_0^T ||η e^{A(T−τ)} B||^2 dτ = 0,

from which (ii) follows. Conversely, assume that (ii) holds. Then (5) holds, and hence (i) follows.
(ii) ⇔ (iii): This is obtained by power series expansion of e^{At} = Σ_{k=0}^∞ (t^k / k!) A^k.
(iii) ⇒ (iv): This follows immediately from the evaluation of the vector-matrix product.
(iv) ⇒ (iii): This implication is based on the Cayley-Hamilton Theorem. According to this theorem, A^n is a linear combination of I, A, ..., A^{n−1}. By induction, it follows that A^k (k ≥ n) is a linear combination of I, A, ..., A^{n−1} as well. Therefore, η A^k B = 0 for k = 0, 1, ..., n − 1 implies that η A^k B = 0 for all k ∈ N.

As an immediate consequence of the previous theorem, we find that the at time T reachable subspace W_T can be expressed in terms of the maps A and B as follows.

Corollary 1

W_T = im (B AB ⋯ A^{n−1} B).

This implies that, in fact, W_T is independent of T, for T > 0. Because of this, we often use W instead of W_T and call this subspace the reachable subspace of Σ. This subspace of the state space has the following geometric characterization in terms of the maps A and B.

Corollary 2 W is the smallest A-invariant subspace containing B := im B. Explicitly, W is A-invariant, B ⊆ W, and any A-invariant subspace L satisfying B ⊆ L also satisfies W ⊆ L.

We denote the smallest A-invariant subspace containing B by ⟨A|B⟩, so that we can write W = ⟨A|B⟩. For the space ⟨A|B⟩, we have the following explicit formula:

⟨A|B⟩ = B + AB + ⋯ + A^{n−1} B.

Corollary 3 The following statements are equivalent.
1. There exists T > 0 such that system Σ is controllable at T.
2. ⟨A|B⟩ = X.
3. Rank (B AB ⋯ A^{n−1} B) = n.
4. The system Σ is controllable at T for all T > 0.

We say that the matrix pair (A, B) is controllable if one of these equivalent conditions is satisfied.

Example 1 Let A and B be defined by

A := [ −2  6 ; −2  5 ],    B := [ 3 ; 2 ].

Then (B AB) = [ 3  6 ; 2  4 ], rank (B AB) = 1, and consequently, (A, B) is not controllable. The reachable subspace is the span of (B AB), i.e., the
line given by the equation 2x1 − 3x2 = 0. This can also be seen as follows. Let z := 2x1 − 3x2; then ż = z. Hence, if z(0) = 0, which is the case if x(0) = 0, we must have z(t) = 0 for all t ≥ 0.

Observability

In this section, we include the second of equations (1), y = Cx + Du, in our considerations. Specifically, we investigate to what extent it is possible to reconstruct the state x if the input u and the output y are known. The motivation is that we often can measure the output and prescribe (and hence know) the input, whereas the state variable is hidden.

Definition 2 Two states x0 and x1 in X are called indistinguishable on the interval [0, T] if for any input u we have yu(t, x0) = yu(t, x1) for all 0 ≤ t ≤ T.

Hence, x0 and x1 are indistinguishable if they give rise to the same output values for every input u. According to (3), for x0 and x1 to be indistinguishable on [0, T], we must have that

C e^{At} x0 + ∫_0^t K(t−τ) u(τ) dτ + D u(t) = C e^{At} x1 + ∫_0^t K(t−τ) u(τ) dτ + D u(t)

for 0 ≤ t ≤ T and for any input signal u. We note that the input signal does not affect distinguishability, i.e., if one u is able to distinguish between two states, then any input is. In fact, x0 and x1 are indistinguishable if and only if C e^{At} x0 = C e^{At} x1 (0 ≤ t ≤ T). Obviously, x0 and x1 are indistinguishable if and only if v := x0 − x1 and 0 are indistinguishable. By applying Theorem 1 with η = v^T nonzero and transposing the equations, it follows that C e^{At} x0 = C e^{At} x1 (0 ≤ t ≤ T) if and only if C e^{At} v = 0 (0 ≤ t ≤ T), and hence if and only if C A^k v = 0 (k = 0, 1, 2, ...). The Cayley-Hamilton Theorem implies that we need to consider the first n terms only, i.e.,

[ C ; CA ; CA^2 ; ... ; CA^{n−1} ] v = 0.    (6)

As a consequence, the distinguishability of two vectors does not depend on T. The space of vectors v for which (6) holds is denoted ⟨ker C|A⟩ and called the unobservable subspace. It is equivalently characterized as the intersection of the spaces ker CA^k for k = 0, ..., n − 1, i.e.,

⟨ker C|A⟩ = ∩_{k=0}^{n−1} ker CA^k.

Equivalently, ⟨ker C|A⟩ is the largest A-invariant subspace contained in ker C. Finally, another characterization is "v ∈ ⟨ker C|A⟩ if and only if y0(t, v) is identically zero," where the subscript "0" refers to the zero input.

Definition 3 System Σ is called observable if any two distinct states are not indistinguishable.

The previous considerations immediately lead to the following result.

Theorem 2 The following statements are equivalent.
1. The system Σ is observable.
2. Every nonzero state is not indistinguishable from the origin.
3. ⟨ker C|A⟩ = 0.
4. C e^{At} v = 0 (0 ≤ t ≤ T) ⇒ v = 0.
5. Rank [ C ; CA ; CA^2 ; ... ; CA^{n−1} ] = n.

Since observability is completely determined by the matrix pair (C, A), we will say "(C, A) is observable" instead of "system Σ is observable."
There is a remarkable relation between the controllability and observability properties, which is referred to as duality. This property is most conspicuous from the conditions (3) in Corollary 3 and (5) in Theorem 2, respectively.
Controller Performance Monitoring

Sirish L. Shah
Department of Chemical and Materials Engineering, University of Alberta, Edmonton, AB, Canada

Abstract

This entry reviews methods to assess the performance of basic control loops in a plant by monitoring the variance in the process variable and comparing it to that of a minimum variance controller. Other metrics of controller performance are also reviewed. The resulting indices of performance give an indication of the level of performance of the controller and an indication of the action required to improve it; the diagnosis of poor performance may lead one to look at remediation alternatives such as retuning controller parameters, process reengineering to reduce delays, or implementation of feed-forward control, or to attribute poor performance to faulty actuators or other process nonlinearities.

Keywords

Time series analysis; Minimum variance control; Control loop performance assessment; Performance monitoring; Fault detection and diagnosis
established reporting and remediation workflows to ensure that improvement activities are carried out in an expedient manner. Plant-wide performance metrics can provide insight into company-wide process control performance. Closed-loop tuning and modeling tools can also be deployed to aid with the improvement activities. Survey articles by Thornhill and Horch (2007) and Shardt et al. (2012) provide a good overview of the overall state of CPM and the related diagnosis issues. CPM software is now readily available from most DCS vendors and has already been implemented successfully at many large-scale industrial sites throughout the world.

Univariate Control Loop Performance Assessment with Minimum Variance Control as Benchmark

It has been shown by Harris (1989) that for a system with time delay d, a portion of the output variance is feedback control invariant and can be estimated from routine operating data. This is the so-called minimum variance output. Consider the closed-loop system shown in Fig. 1, where Q is the controller transfer function, T̃ is the process transfer function, d is the process delay (in terms of sample periods), and N is the disturbance transfer function driven by the random white-noise sequence, a_t.
In the regulatory mode (when the set point is constant), the closed-loop transfer function relating the process output and the disturbance is given by

Closed-loop response:  y_t = N / (1 + q^{−d} T̃ Q) a_t

Note that all transfer functions are expressed for the discrete-time case in terms of the backshift operator, q^{−1}. N represents the disturbance transfer function with numerator and denominator polynomials in q^{−1}. The division of the numerator by the denominator can be rewritten as N = F + q^{−d} R, where the quotient term F = F0 + F1 q^{−1} + ⋯ + F_{d−1} q^{−(d−1)} is a polynomial of order (d − 1) and the remainder R is a transfer function. The closed-loop transfer function can be reexpressed, after algebraic manipulation, as

y_t = N / (1 + q^{−d} T̃ Q) a_t
    = (F + q^{−d} R) / (1 + q^{−d} T̃ Q) a_t
    = [ F (1 + q^{−d} T̃ Q) + q^{−d} (R − F T̃ Q) ] / (1 + q^{−d} T̃ Q) a_t
    = [ F + q^{−d} (R − F T̃ Q) / (1 + q^{−d} T̃ Q) ] a_t
    = F0 a_t + F1 a_{t−1} + ⋯ + F_{d−1} a_{t−d+1}   (= e_t)
      + L0 a_{t−d} + L1 a_{t−d−1} + ⋯               (= w_{t−d})

The closed-loop output can then be expressed as

y_t = e_t + w_{t−d}

where e_t = F a_t corresponds to the first d − 1 lags of the closed-loop expression for the output y_t and, more importantly, is independent of the controller Q, i.e., it is controller invariant, while w_{t−d} is dependent on the controller. The variance of the output is then given by
A measure of controller performance index can Substituting Eq. (3) into Eq. (4) yields
then be defined as
.d / D mv
2
=y2 (4)
h i
.d / D rya
2
.0/ C rya
2
.1/ C rya
2
.2/ C C rya
2
.d 1/ =y2 a2
D
ya
2
.0/ C
ya
2
.1/ C
ya
2
.2/ C C
ya
2
.d 1/
D ZZ T
where Z is the vector of cross correlation coeffi- Although at is unknown, it can be replaced by
cients between yt and at for lags 0 to d 1 and the estimated innovation sequence aO t . The es-
is denoted as timate aO t is obtained by whitening the process
output variable yt via time series analysis. This
Z D
ya .0/
ya .1/
ya .2/ : : :
ya .d 1/ algorithm is denoted as the FCOR algorithm
Controller Performance Monitoring 223
for Filtering and CORrelation analysis (Huang and Shah 1999). This derivation assumes that the delay, d, is known a priori. In practice, however, a priori knowledge of time delays may not always be available. It is therefore useful to assume a range of time delays and then calculate performance indices over this range of time delays. The indices over a range of time delays are also known as extended horizon performance indices (Thornhill et al. 1999). Through pattern recognition, one can tell the performance of the loop by visualizing the patterns of the performance indices versus time delays. There is a clear relation between the performance indices curve and the impulse response curve of the control loop.

Consider a simple case where the process is subject to random disturbances. Figure 2 is one example of performance evaluation for a control loop in the presence of disturbances. This figure shows time series of process variable data for both loops in the left column, closed-loop impulse responses (middle column), and corresponding performance indices (labeled as PI, on the right column). From the impulse responses, one can see that the loop under the first set of tuning constants (denoted as TAG1.PV) has better performance; the loop under the second set of tuning constants (denoted as TAG5.PV) has oscillatory behavior, indicating relatively poor control performance. With performance index "1" indicating the best possible performance and index "0" indicating the worst performance, performance indices for the first controller tuning (shown on the upper-right plot) approach "1" within 4 time lags, while performance indices for the second controller tuning (shown on the bottom-right plot) take 10 time lags to approach "0.7." In addition, performance indices for the second tuning show ripples as they approach an asymptotic limit, indicating a possible oscillation in the loop.

Notice that one cannot rank the performance of these two controller settings from the noisy time-series data. Instead, we can calculate performance indices over a range of time delays (from 1 to 10). The result is shown in the right-column plots of Fig. 2. These simulations correspond to the same process with different controller tuning constants. It is clear from these plots that the performance index trajectory depends on the dynamics of the disturbance and the controller tuning.

It is important to note that the minimum variance is just one of several benchmarks for obtaining a controller performance metric. It is seldom practical to implement minimum variance control, as it typically requires aggressive actuator action. However, the minimum variance benchmark serves to provide an indication of the opportunity for improving control performance; that is, should the performance index η(d) be near or just above zero, it gives the user an idea of the benefits possible in improving the control performance of that loop.

Performance Assessment and Diagnosis of Univariate Control Loop Using Alternative Performance Indicators

In addition to the performance index for performance assessment, there are several alternative indicators of control loop performance. These are discussed next.

Autocorrelation function: The autocorrelation function (ACF) of the output error, shown in Fig. 3, is an approximate measure of how close the existing controller is to the minimum variance condition, or how predictable the error is over the time horizon of interest. If the controller is under the minimum variance condition, then the autocorrelation function should decay to zero after d − 1 lags, where d is the delay of the process. In other words, there should be no predictable information beyond time lag d − 1. The rate at which the autocorrelation decays to zero after d − 1 lags indicates how close the existing controller is to the minimum variance condition. Since it is straightforward to calculate the autocorrelation using process data, the autocorrelation function is often used as a first-pass test before carrying out further performance analysis.

Impulse response: An impulse response function curve represents the closed-loop impulse
[Fig. 2 also reports, for each controller tuning, the performance indices PI(var) over delays 1 to 10:

First tuning (TAG1.PV):   Delay   1     2     3     4     5     6     7     8     9     10
                          PI(var) 0.484 0.813 0.934 0.971 0.982 0.984 0.985 0.985 0.987 0.988

Second tuning (TAG5.PV):  Delay   1     2     3     4     5     6     7     8     9     10
                          PI(var) 0.169 0.284 0.286 0.398 0.482 0.483 0.565 0.628 0.628 0.684]
Controller Performance Monitoring, Fig. 2 Time series of process variable (top), corresponding impulse responses
(left column) and their performance indices (right column).
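The FCOR computation of η(d) sketched above can be illustrated numerically. In the sketch below, the moving-average coefficients and the delay are made-up illustrative values (not the entry's data), and the true innovation sequence a_t is used directly for brevity:

```python
import random

random.seed(0)

# Made-up closed-loop moving-average expansion (illustrative only):
# y_t = F0*a_t + ... + F_{d-1}*a_{t-d+1} + L0*a_{t-d} + ...
d = 3                                   # assumed known process delay
psi = [1.0, 0.8, 0.5, 0.3, 0.2, 0.1]    # first d coefficients are controller invariant

n = 20000
a = [random.gauss(0.0, 1.0) for _ in range(n)]
m = len(psi)
y = [sum(psi[k] * a[t - k] for k in range(m)) for t in range(m, n)]

mean_sq = lambda xs: sum(v * v for v in xs) / len(xs)
sy2, sa2 = mean_sq(y), mean_sq(a)       # output and innovation variances (zero mean)

# eta(d) = sum of squared cross-correlations rho_ya(k), k = 0..d-1  (= Z Z^T)
eta = 0.0
for k in range(d):
    c = sum(y[i] * a[m + i - k] for i in range(len(y))) / len(y)
    eta += (c / (sy2 * sa2) ** 0.5) ** 2
# theoretical value here: (1 + 0.64 + 0.25) / 2.03, about 0.93
```

In routine use the innovations are unknown; as the entry notes, they would be replaced by the whitened estimate â_t obtained from a time series model of y_t.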
Controller Performance Monitoring, Fig. 3 Autocorrelation function of the controller error

Controller Performance Monitoring, Fig. 4 Impulse responses estimated from routine operating data
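A closed-loop impulse response like those in Fig. 4 is obtained by long division of an estimated ARMA model into its infinite-order moving-average form. A minimal sketch for an ARMA(1,1) model with made-up coefficients, checking that the sum of squared impulse coefficients equals the output variance (unit noise variance):

```python
# (1 - phi*q^-1) y_t = (1 + theta*q^-1) a_t, expanded as y_t = sum_k psi_k * a_{t-k}
phi, theta = 0.7, 0.4                   # illustrative ARMA(1,1) coefficients

def impulse_coeffs(phi, theta, n):
    """First n moving-average (impulse response) coefficients via long division."""
    psi = [1.0]
    for k in range(1, n):
        psi.append(phi * psi[k - 1] + (theta if k == 1 else 0.0))
    return psi

psi = impulse_coeffs(phi, theta, 200)
var_from_psi = sum(c * c for c in psi)                          # sum of squared coefficients
var_closed_form = (1 + theta**2 + 2*phi*theta) / (1 - phi**2)   # known ARMA(1,1) variance
# the two agree, illustrating that the squared impulse coefficients sum to the variance
```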
response between the whitened disturbance sequence and the process output. This function is a direct measure of how well the controller is performing in rejecting disturbances or tracking set-point changes. Under a stochastic framework, this impulse response function may be calculated using a time series model such as an Autoregressive Moving Average (ARMA) or Autoregressive Integrated Moving Average (ARIMA) model. Once an ARMA type of time series model is estimated, the infinite-order moving average representation of the model shown in Eq. (1) can be obtained through a long division of the ARMA model. As shown in Huang and Shah (1999), the coefficients of the moving average model are the closed-loop impulse response coefficients of the process between the whitened disturbances and the process output. Figure 4 shows closed-loop impulse responses of a control loop with two different control tunings. Clearly they denote two different closed-loop dynamic responses: one is slow and smooth, and the other is relatively fast and slightly oscillatory. The sum of squares of the impulse response coefficients is the variance of the data.

Spectral analysis: The closed-loop frequency response of the process is an alternative way to assess control loop performance. Spectral analysis of output data easily allows one to detect oscillations, offsets, and measurement noise present in the process. The closed-loop frequency response is often plotted together with the closed-loop frequency response under minimum variance control. This is to check the possibility of performance improvement through controller tuning. The comparison gives a measure of how close the existing controller is to the minimum variance condition. In addition, it also provides the frequency range in which the controller significantly deviates from the minimum variance condition. Large deviation in the low-frequency range typically indicates lack of integral action or weak proportional gain. Large peaks in the medium-frequency range typically indicate an overtuned controller or the presence of oscillatory disturbances. Large deviation in the high-frequency range typically indicates significant measurement noise. As an illustrative example, frequency responses of two control loops are shown in Fig. 5. The left graph of the figure shows that the closed-loop
Controller Performance Monitoring, Fig. 5 Frequency response estimated from routine operating data
frequency response of the existing controller is almost the same as the frequency response under minimum variance control. A peak at the mid-frequency indicates possible overtuned control. The right graph of Fig. 5 shows that the frequency response of the existing controller is oscillatory, indicating a possible overtuned controller or the presence of an oscillatory disturbance at the peak frequency; otherwise the controller is close to the minimum variance condition.

Segmentation of performance indices: Most process data exhibit time-varying dynamics; i.e., the process transfer function or the disturbance transfer function is time variant. Performance assessment with a non-overlapping sliding data window that can track time-varying dynamics is therefore often desirable. For example, segmentation of the data may lead to some insight into any cyclical behavior of the variation in controller performance due, e.g., to day/night cycles or shift changes. Figure 6 is an example of performance segmentation over a 200-data-point window.

Performance Assessment of Univariate Control Loops Using User-Specified Benchmarks

The increasing level of global competitiveness has pushed chemical plants into high-performance operating regions that require advanced process control technology. See the
articles Control Hierarchy of Large Processing Plants: An Overview and Control Structure Selection. Consequently, the industry has an increasing need to upgrade the conventional PID controllers to advanced control systems. The most natural questions to ask for such an upgrading are as follows. Has the advanced controller improved performance as expected? If yes, where is the improvement and can it be justified? Has the advanced controller been tuned to its full capacity? Can this improvement also be achieved by simply retuning the existing traditional (e.g., PID) controllers? (see PID Control). In other words, what is the cost versus benefit of implementing an advanced controller? Unlike performance assessment using minimum variance control as benchmark, the solution to this problem does not require a priori knowledge of time delays. Two possible relative benchmarks may be chosen: one is the historical data benchmark or reference data set benchmark, and the other is a user-specified benchmark.

The purpose of reference data set benchmarking is to compare the performance of the existing controller with the previous controller during "normal" operation of the process. This reference data set may represent the process when the controller performance is considered satisfactory with respect to meeting the performance objectives. The reference data set should be representative of the normal conditions that the process is expected to operate at; i.e., the disturbances and set-point changes entering into the process should not be unusually different. This analysis provides the user with a relative performance index (RPI), which compares the existing control loop performance with a reference control loop benchmark chosen by the user. The RPI is bounded by 0 ≤ RPI < ∞, with "<1" indicating deteriorated performance, "1" indicating no change of performance, and ">1" indicating improved performance. Figure 7 shows a result of reference data set benchmarking. The impulse response of the benchmark or reference data smoothly decays to zero, indicating good performance of the controller. After one increases the proportional gain of the controller, the impulse response shows oscillatory behavior, with an RPI = 0.4, indicating deteriorated performance due to the oscillation.

In some cases one may wish to specify certain desired closed-loop dynamics and carry out performance analysis with respect to such desired dynamics. One such desired dynamic benchmark is the closed-loop settling time. As an illustrative example, Fig. 8 shows a system where a settling time of ten sampling units is desired for a process with a delay of five sampling units. The impulse responses show that the existing loop is close to the desired performance, and the value of RPI = 0.9918 confirms this. Thus no further tuning of the loop is necessary.

Diagnosis of Poorly Performing Loops

Whereas detection of poorly performing loops is now relatively simple, the task of diagnosing the reason(s) for poor performance and how to "mend" the loop is generally not straightforward. The reasons for poor performance could be any one of: interactions between various control loops, overtuned or undertuned controller settings, process nonlinearity, poor controller configuration (meaning the choice of pairing a process (or controlled) variable with a manipulated variable in a loop), or actuator problems such as stiction, large delays, and severe disturbances. Several studies have focused on the diagnosis issues related to actuator problems (Hägglund 2002; Choudhury et al. 2008; Srinivasan and Rengaswamy 2008; Xiang and Lakshminarayanan 2009; de Souza et al. 2012). Shardt et al. (2012) have given an overview of the overall state of CPM and the related diagnosis issues.
[Fig. 6 reports the performance index computed over successive 200-sample windows:

Time 200   400   600   800   1000  1200  1400  1600  1800  2000
PI   0.541 0.428 0.387 0.433 0.462 0.404 0.479 0.460 0.496 0.388]
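The windowing itself can be sketched as follows. The entry's Fig. 6 uses 200-sample non-overlapping windows; the per-window index below is a simplified stand-in, taking an assumed minimum-variance level over the window variance rather than the full FCOR computation used in the entry:

```python
import random

random.seed(1)

window = 200                      # non-overlapping window length, as in Fig. 6
sigma_mv2 = 1.0                   # assumed (hypothetical) minimum-variance level
y = [random.gauss(0.0, 1.5) for _ in range(2000)]   # synthetic loop output

pi_per_window = []
for start in range(0, len(y), window):
    seg = y[start:start + window]
    var = sum(v * v for v in seg) / len(seg)        # window output variance
    pi_per_window.append(sigma_mv2 / var)           # simplified stand-in index
# ten indices, one per window, each near 1 / 1.5**2 for this synthetic data
```

A trend or cycle in such per-window indices is what reveals day/night or shift-change effects in practice.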
Controller Performance Monitoring, Fig. 7 Reference benchmarking based on impulse responses

Controller Performance Monitoring, Fig. 8 User-specified benchmark based on impulse responses
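A toy illustration of reference-data-set benchmarking follows. The entry gives no closed-form RPI, so here it is assumed, purely for illustration, to be the ratio of the reference output variance to the current output variance, so that values below 1 flag deteriorated performance:

```python
def variance(xs):
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def rpi(reference, current):
    # assumed form: reference variance over current variance (> 1 means improvement)
    return variance(reference) / variance(current)

ref = [0.0, 1.0, -1.0, 0.5, -0.5] * 40   # reference ("normal") operating data
cur = [2.0 * v for v in ref]             # current data: deviations doubled
# doubling the deviations quadruples the variance, so rpi(ref, cur) = 0.25 < 1
```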
two of many large-scale industrial implementations of CPM technology appears below. It gives clear evidence of the impact of this control technology and how readily it has been embraced by industry (Shah et al. 2014).

BASF Controller Performance Monitoring Application
As part of its excellence initiative OPAL 21 (Optimization of Production Antwerp and Ludwigshafen), BASF has implemented the CPM strategy on more than 30,000 control loops at its Ludwigshafen site in Germany and on over 10,000 loops at its Antwerp production facility in Belgium. The key factor in using this technology effectively is to combine process knowledge, basic chemical engineering, and control expertise to develop solutions for the control problems that are diagnosed in the CPM software (Wolff et al. 2012).

Saudi Aramco Controller Performance Monitoring Practice
As part of its process control improvement initiative, Saudi Aramco has deployed CPM on approximately 15,000 PID loops, 50 MPC applications, and 500 smart positioners across multiple operating facilities.

The operational philosophy of the CPM engine is incorporated in the continuous improvement process at BASF and Aramco, whereby all loops are monitored in real time and a holistic performance picture is obtained for the entire plant. Unit-wide performance metrics are displayed in effective color-coded graphic forms to convey the analytics information of the process.

Poorly performing control loops can be detected, and repeated performance monitoring will indicate which loops should be retuned or which loops have not been effectively upgraded when changes in the disturbances, in the process, or in the controller itself occur. Better design, tuning, and upgrading will mean that the process operates at a point close to the economic optimum, leading to energy savings, improved safety, efficient utilization of raw materials, higher product yields, and more consistent product qualities. This entry has summarized the major features available in recent commercial software packages for control loop performance assessment. The illustrative examples have demonstrated the applicability of this technique when applied to process data. This entry has also illustrated how controllers, whether in hardware or software form, should be treated like "capital assets" and how there should be routine monitoring to ensure that they perform close to the economic optimum and that the benefits of good regulatory control will be achieved.

Cross-References

Control Hierarchy of Large Processing Plants: An Overview
Control Structure Selection
Fault Detection and Diagnosis
PID Control
Statistical Process Control in Manufacturing
Bibliography

de Prada C (2014) Overview: control hierarchy of large processing plants. In: Encyclopedia of Systems and Control. Springer, London
de Souza LCMA, Munaro CJ, Munareto S (2012) Novel model-free approach for stiction compensation in control valves. Ind Eng Chem Res 51(25):8465–8476
Hägglund T (2002) A friction compensator for pneumatic control valves. J Process Control 12(8):897–904
Harris T (1989) Assessment of closed loop performance. Can J Chem Eng 67:856–861
Huang B, Shah SL (1999) Performance assessment of control loops: theory and applications. Springer, London
Shah SL, Nohr M, Patwardhan R (2014) Success stories in control: controller performance monitoring. In: Samad T, Annaswamy AM (eds) The impact of control technology, 2nd edn. www.ieeecss.org
Shardt YAW, Zhao Y, Lee KH, Yu X, Huang B, Shah SL (2012) Determining the state of a process control system: current trends and future challenges. Can J Chem Eng 90(2):217–245
Skogestad S (2014) Control structure selection and plantwide control. In: Encyclopedia of Systems and Control. Springer, London
Srinivasan R, Rengaswamy R (2008) Approaches for efficient stiction compensation in process control valves. Comput Chem Eng 32(1):218–229
Thornhill NF, Horch A (2007) Advances and new directions in plant-wide controller performance assessment. Control Eng Pract 15(10):1196–1206
Thornhill NF, Oettinger M, Fedenczuk P (1999) Refinery-wide control loop performance assessment. J Process Control 9(2):109–124
Wolff F, Roth M, Nohr A, Kahrs O (2012) Software based control-optimization for the chemical industry. VDI, Tagungsband "Automation 2012", 13/14.06.2012, Baden-Baden
Xiang LZ, Lakshminarayanan S (2009) A new unified approach to valve stiction quantification and compensation. Ind Eng Chem Res 48(7):3474–3483

Cooperative Manipulators

Fabrizio Caccavale
School of Engineering, Università degli Studi della Basilicata, Potenza, Italy

Abstract

on cooperative manipulation. Kinematics and dynamics of robotic arms cooperatively manipulating a tightly grasped rigid object are briefly discussed. Then, this entry presents the main strategies for force/motion control of the cooperative system.

Keywords

Cooperative task space; Coordinated motion; Force/motion control; Grasping; Manipulation; Multi-arm systems

Introduction

Since the early 1970s, it has been recognized that many tasks, which are difficult or even impossible to execute by a single robotic manipulator, become feasible when two or more manipulators work in a cooperative way. Examples of typical cooperative tasks are the manipulation of heavy and/or large payloads, assembly of multiple parts, and handling of flexible and articulated objects (Fig. 1).

In the 1980s, research achieved several theoretical results related to modeling and control of single-arm robots; this further fostered research on multi-arm robotic systems. Dynamics
and control as well as force control issues were widely explored during the decade.

In the 1990s, parameterization of the constraint forces/moments acting on the object was recognized as a key to solving control problems and was studied in several papers (e.g., Sang et al. 1995; Uchiyama and Dauchez 1993; Walker et al. 1991; Williams and Khatib 1993). Several control schemes for cooperative manipulators based on the sought parameterizations have been designed, including force/motion control (Wen and Kreutz-Delgado 1992) and impedance control (Bonitz and Hsia 1996; Schneider and Cannon 1992). Other approaches are adaptive control (Hu et al. 1995), kinematic control (Chiacchio et al. 1996), task-space regulation (Caccavale et al. 2000), and model-based coordinated control (Hsu 1993). Other important topics investigated in the 1990s were the definition of user-oriented task-space variables for coordinated control (Caccavale et al. 2000; Chiacchio et al. 1996), the development of meaningful performance measures (Chiacchio et al. 1991a,b) for multi-arm systems, and the problem of load sharing (Walker et al. 1989).

Most of the abovementioned works assume that the cooperatively manipulated object is rigid and tightly grasped. However, since the 1990s, several research efforts have been focused on the control of cooperative flexible manipulators (Yamano et al. 2004), since flexible-arm robot merits (lightweight structure, intrinsic compliance, and hence safety) can be conveniently exploited in cooperative manipulation. Other research efforts have been focused on the control of cooperative systems composed of two cooperative manipulators grasping a common object.

The kinetostatic formulation proposed by Uchiyama and Dauchez (1993), i.e., the so-called symmetric formulation, is based on kinematic and static relationships between generalized forces/velocities acting at the object and their counterparts acting at the manipulators' end effectors. To this aim, the concept of virtual stick is defined as the vector which determines the position of an object-fixed coordinate frame with respect to the frame attached to each robot end effector (Fig. 2). When the object grasped by the two manipulators can be considered rigid and tightly attached to each end effector, the virtual stick behaves as a rigid stick fixed to each end effector.

According to the symmetric formulation, the vector, h, collecting the generalized forces (i.e., forces and moments) acting at each end effector is given by

h = W h_E + V h_I   (1)

where W is the so-called grasp matrix, the columns of V span the null space of the
grasp matrix, and h_I is the generalized force vector which does not contribute to the object's motion; i.e., it represents internal loading of the object (mechanical stresses) and is termed the internal forces, while h_E represents the vector of external forces, i.e., forces and moments causing the object's motion. Later, a task-oriented formulation was proposed (Chiacchio et al. 1996), aimed at defining a cooperative task space in terms of absolute and relative motion of the cooperative system, which can be directly computed from the position and orientation of the end-effector coordinate frames.

The dynamics of a cooperative multi-arm system can be written as the dynamics of the single manipulators together with the closed-chain constraints imposed by the grasped object. By eliminating the constraints, a reduced-order model can be obtained (Koivo and Unseren 1991).

Strongly related to the kinetostatics and dynamics of cooperative manipulators is the load sharing problem, i.e., distributing the load among the arms composing the system, which has been solved, e.g., in Walker et al. (1989). A very relevant problem related to load sharing is that of robust holding, i.e., the problem of determining the forces/moments applied to the object by the arms in order to keep the grasp even in the presence of disturbing forces/moments.

A major issue in robotic manipulation is performance evaluation via suitably defined indexes (e.g., manipulability ellipsoids). These concepts have been extended to multi-arm robotic systems in Chiacchio et al. (1991a,b). Namely, by exploiting the kinetostatic formulations described above, velocity and force manipulability ellipsoids can be defined by regarding the whole cooperative system as a mechanical transformer from the joint space to the cooperative task space. The manipulability ellipsoids can be seen as performance measures aimed at determining the attitude of the system to cooperate in a given configuration.

Finally, it is worth mentioning the strict relationship between problems related to grasping of objects by fingers/hands and those related to cooperative manipulation. In fact, in both cases, multiple manipulation structures grasp a commonly manipulated object. In multifingered hands, only some motion components are transmitted through the contact point to the manipulated object (unilateral constraints), while cooperative manipulation via robotic arms is achieved by rigid (or near-rigid) grasp points, and interaction takes place by transmitting all the motion components through the grasping points (bilateral constraints). While many common problems between the two fields can be tackled in a conceptually similar way (e.g., kinetostatic modeling, force control), many others are specific to each of the two application fields (e.g., form and force closure for multifingered hands).

Control

When a cooperative multi-arm system is employed for the manipulation of a common object, it is important to control both the absolute motion of the object and the internal stresses applied to it. Hence, most of the control approaches to cooperative robotic systems can be classified as force/motion control schemes.

Early approaches to the control of cooperative systems were based on the master/slave concept. Namely, the cooperative system is decomposed into a position-controlled master arm, in charge of imposing the absolute motion of the object, and the force-controlled slave arms, which are to follow (as smoothly as possible) the motion imposed by the master. A natural evolution of the above-described concept has been the so-called leader/follower approach, where the follower arm reference motion is computed via closed-chain constraints. However, such approaches suffered from implementation issues, mainly due to the fact that the compliance of the slave arms has to be very large, so as to smoothly follow the motion imposed by the master arm. Moreover, the roles of the master and slave (leader and follower) may need to be changed during the task execution.

Due to the abovementioned limitations, more natural nonmaster/slave approaches have been pursued later, where the cooperative system is seen as a whole. Namely, the reference motion
Cooperative Manipulators 233
of the object is used to determine the motion of all the arms in the system, and the interaction forces are measured and fed back so as to be directly controlled. To this aim, the mappings between forces and velocities at the end effector of each manipulator and their counterparts at the manipulated object are considered in the design of the control laws.

An approach based on the classical hybrid force/position control scheme has been proposed in Uchiyama and Dauchez (1993), by exploiting the symmetric formulation described in the previous section.

In Wen and Kreutz-Delgado (1992) a Lyapunov-based approach is pursued to devise force/position PD-type control laws. This approach has been extended in Caccavale et al. (2000), where kinetostatic filtering of the control action is performed, so as to eliminate all the components of the control input which contribute to internal stresses at the object.

A further improvement of the PD plus gravity compensation control approach has been achieved by introducing full model compensation, so as to achieve feedback linearization of the closed-loop system. The feedback linearization approach formulated at the operational space level is the base of the so-called augmented object approach (Sang et al. 1995). In this approach, the system is modeled in the operational space as a whole, by suitably expressing its inertial properties via a single augmented inertia matrix M̂, i.e.,

model (Williams and Khatib 1993) or according to the scheme proposed in Hsu (1993).

An alternative control approach is based on the well-known impedance concept (Bonitz and Hsia 1996; Schneider and Cannon 1992). In fact, when a manipulation system interacts with an external environment and/or other manipulators, large values of the contact forces and moments can be avoided by enforcing a compliant behavior with suitable dynamic features. In detail, the following mechanical impedance behavior between the object displacements and the forces due to the object-environment interaction can be enforced (external impedance):

M_E ã_E + D_E ṽ_E + K_E e_E = h_env   (3)

where e_E represents the vector of displacements between the object's desired and actual pose, ṽ_E is the difference between the object's desired and actual generalized velocities, ã_E is the difference between the object's desired and actual generalized accelerations, and h_env is the generalized force acting on the object due to the interaction with the environment. The impedance dynamics is characterized in terms of given positive definite mass, damping, and stiffness matrices (M_E, D_E, K_E). A mechanical impedance behavior between the i-th end-effector displacements and the internal forces can be imposed as well (internal impedance):

M_{I,i} ã_i + D_{I,i} ṽ_i + K_{I,i} e_i = h_{I,i}   (4)
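The external impedance relation (3) can be illustrated with a scalar, regulation-mode sketch (constant desired pose, so the tilde quantities reduce to the tracking error and its derivatives; all numerical values below are made up). With a constant interaction force h_env, the displacement settles at the compliant value h_env/K_E:

```python
# Scalar forward-Euler sketch of M*e'' + D*e' + K*e = h_env (external impedance)
M, D, K = 1.0, 4.0, 8.0     # illustrative positive mass, damping, stiffness
h_env = 2.0                 # constant environment interaction force
e, v = 0.0, 0.0             # displacement and velocity, starting at rest
dt = 0.001
for _ in range(20000):      # integrate 20 s
    acc = (h_env - D * v - K * e) / M
    v += acc * dt
    e += v * dt
# e has converged to the steady-state displacement h_env / K = 0.25
```

Larger K (stiffer impedance) would reduce the steady-state displacement at the cost of higher contact force sensitivity, which is the design trade-off the impedance matrices encode.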
Cooperative Solutions to Dynamic Games

Alain Haurie
ORDECSYS and University of Geneva, Geneva, Switzerland
GERAD-HEC Montréal, PQ, Canada

Abstract

This article presents the fundamental elements of the theory of cooperative games in the context of dynamic systems. The concepts of Pareto optimality, Nash bargaining solution, characteristic function, cores, and C-optimality are discussed, and some fundamental results are recalled.

Keywords

Cores; Nash equilibrium; Pareto optimality

Solution concepts in game theory are grouped in two main categories, called noncooperative and cooperative solutions, respectively. In the seminal book of von Neumann and Morgenstern (1944) this categorization is already made. These authors discuss zero-sum (matrix) games in normal form, where the noncooperative solution concept of saddle point was defined and characterized, and games in characteristic function form, where solution concepts for games of coalitions were introduced. In this article we present the fundamental solution concepts of the theory of cooperative games in the context of dynamical systems. The article is organized as follows: we first recall the papers which mark the origin of the development of a theory of dynamic games; then we recall the basic concept of Pareto optimality, proposed as a cooperative solution concept; we present the scalarization technique and the necessary or sufficient optimality conditions for Pareto optimality in mathematical programming and optimal control settings; we then explore the difficulties encountered when one tries to extend the Nash bargaining solution, characteristic function, and core concepts to dynamic games; and we show the links that exist with the theory of reachability for perturbed dynamic systems.

The Origins

One may consider that the first introduction of a cooperative game solution concept in systems and control science is due to L.A. Zadeh (1963). Two-player zero-sum dynamic games have been studied by R. Isaacs (1954) in a deterministic continuous-time setting and by L. Shapley (1953) in a discrete-time stochastic setting. Nonzero-sum and m-player differential games were introduced by Y.C. Ho and A.W. Starr (1969) and J.H. Case (1969). For these games cooperative solutions can be looked for to complement the noncooperative Nash equilibrium concept.

In cooperative games one is interested in nondominated solutions. This solution type is related to a concept introduced by the well-known economist V. Pareto (1896) in the context of
welfare economics. Consider a system with decision variables x ∈ X ⊂ ℝⁿ and m performance criteria x ↦ ψ_j(x) ∈ ℝ, j = 1, …, m, that one tries to maximize.

Definition 1 The decision x* ∈ X is nondominated or Pareto optimal if the following condition holds:

    ψ_j(x) ≥ ψ_j(x*)  ∀j = 1, …, m
    ⟹  ψ_j(x) = ψ_j(x*)  ∀j = 1, …, m.

In other words, it is impossible to give one criterion j a value greater than ψ_j(x*) without decreasing the value of another criterion, say ℓ, which then takes a value lower than ψ_ℓ(x*).

This vector-valued optimization framework corresponds to a situation where m players are engaged in a game, described in its normal form, where the strategies of the m players constitute the decision vector x and their respective payoffs are given by the m performance criteria ψ_j(x), j = 1, …, m. One assumes that these players jointly take a decision that is cooperatively optimal, in the sense that no player can improve his/her payoff without deteriorating the payoff of some other player.

[…] of the criteria. But this procedure will not find all of the nondominated solutions.

Conditions for Pareto Optimality in Mathematical Programming

N.O. Da Cunha and E. Polak (1967b) obtained the first necessary conditions for multiobjective optimization. The problem they consider is

    Pareto Opt  ψ_j(x),  j = 1, …, m
    s.t.  φ_k(x) ≥ 0,  k = 1, …, p

where the functions x ∈ ℝⁿ ↦ ψ_j(x) ∈ ℝ, j = 1, …, m, and x ↦ φ_k(x) ∈ ℝ, k = 1, …, p, are continuously differentiable (C¹), and where we assume that the constraint qualification conditions of mathematical programming hold for this problem. They proved the following theorem.

Theorem 1 Let x* be a Pareto optimal solution of the problem defined above. Then there exists a vector of p multipliers μ_k, k = 1, …, p, and a vector r ≠ 0 of m weights r_j ≥ 0, such that the following conditions hold: […]
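Definition 1 and the weighted-sum scalarization discussed above can be illustrated by a small brute-force computation. The sketch below is our own toy example (function names and payoff values are ours, not from the article); as the text notes, the weighted sum recovers nondominated points but not necessarily all of them.

```python
# Hypothetical illustration: brute-force Pareto filtering of a finite candidate
# set, and weighted-sum scalarization over weights r_j >= 0, sum r_j = 1.

def dominates(a, b):
    """a dominates b if a >= b in every criterion and a > b in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_set(payoffs):
    """Return the nondominated payoff vectors (Definition 1, maximization)."""
    return [p for p in payoffs if not any(dominates(q, p) for q in payoffs)]

def scalarized_best(payoffs, r):
    """Maximize the weighted sum sum_j r_j * psi_j(x)."""
    return max(payoffs, key=lambda p: sum(w * x for w, x in zip(r, p)))

payoffs = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0), (1.0, 1.0)]  # (psi_1, psi_2)
print(pareto_set(payoffs))                  # (1, 1) is dominated, the rest are not
print(scalarized_best(payoffs, (0.9, 0.1)))  # heavy weight on psi_1 -> (3.0, 1.0)
```

Every maximizer of a strictly positive weighted sum is Pareto optimal, which is the scalarization idea the section builds on.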
(Leitmann et al. 1972; Salukvadze 1971; Vincent and Leitmann 1970; Zadeh 1963), the main result being an extension of the maximum principle. Consider the controlled system

    ẋ(t) = f(x(t), u(t))    (1)
    u(t) ∈ U    (2)
    x(0) = x°    (3)
    t ∈ [0, T]    (4)

where x ∈ ℝⁿ is the state variable of the system, u ∈ U ⊂ ℝᵖ with U compact is the control variable, and [0, T] is the control horizon. The system is evaluated by m performance criteria of the form

    ψ_j(x(·), u(·)) = ∫₀ᵀ g_j(x(t), u(t)) dt + G_j(x(T)),    (5)

for j = 1, …, m. Under the usual assumptions of control theory, i.e., f(·,·) and g_j(·,·), j = 1, …, m, being C¹ in x and continuous in u, and G_j(·) being C¹ in x, one can prove the following.

Theorem 2 Let {x*(t) : t ∈ [0, T]} be a Pareto optimal trajectory, generated at initial state x° by the Pareto optimal control {u*(t) : t ∈ [0, T]}. Then there exist costate vectors {λ(t) : t ∈ [0, T]} and a vector of positive weights r ≠ 0 ∈ ℝᵐ, with components r_j ≥ 0, Σ_{j=1}^m r_j = 1, such that the following relations hold:

    ẋ*(t) = ∂/∂λ H(x*(t), u*(t); λ(t); r)    (6)
    λ̇(t) = −∂/∂x H(x*(t), u*(t); λ(t); r)    (7)
    x*(0) = x°    (8)
    λ(T) = ∂/∂x Σ_{j=1}^m r_j G_j(x*(T))    (9)

where the weighted Hamiltonian is defined by

    H(x, u; λ; r) = Σ_{j=1}^m r_j g_j(x, u) + λ′ f(x, u).

The proof of this result necessitates some additional regularity assumptions. Some of these conditions imply that there exist differentiable Bellman value functions (see, e.g., Blaquière et al. 1972); some others use the formalism of nonsmooth analysis (see, e.g., Bellassali and Jourani 2004).

The Nash Bargaining Solution

Since Pareto optimal solutions are numerous (actually, since a subset of Pareto outcomes are indexed over the weightings r, r_j > 0, Σ_{j=1}^m r_j = 1), one can expect, in the m-dimensional payoff space, to have a manifold of Pareto outcomes. Therefore, the problem that we must solve now is how to select the "best" Pareto outcome. "Best" is a misnomer here because, by their very definition, two Pareto outcomes cannot be compared or gauged. The choice of a Pareto outcome that satisfies each player must be the result of some bargaining. J. Nash addressed this problem very early, in 1951, using a two-player game setting. He developed an axiomatic approach where he proposed four behavior axioms which, if accepted, would determine a unique choice for the bargaining solution. These axioms are called, respectively, (i) invariance to affine transformations of utility representations, (ii) Pareto optimality, (iii) independence of irrelevant alternatives, and (iv) symmetry. Then the bargaining point is the Pareto optimal solution that maximizes the product

    x** = argmax_x (ψ₁(x) − ψ₁(x°)) (ψ₂(x) − ψ₂(x°))
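The two-player bargaining point just described can be computed by brute force on a discretized outcome set. The sketch below is our own toy construction (the linear frontier and status quo values are ours, not from the article); it shows how the symmetry axiom plays out on a symmetric frontier.

```python
# Toy Nash bargaining computation: maximize the Nash product
# (psi_1(x) - psi_1(x0)) * (psi_2(x) - psi_2(x0)) over outcomes that
# strictly improve on the status quo payoff pair.

def nash_bargaining(outcomes, status_quo):
    """Pick the outcome maximizing the Nash product over the status quo."""
    d1, d2 = status_quo
    candidates = [(p1, p2) for p1, p2 in outcomes if p1 > d1 and p2 > d2]
    return max(candidates, key=lambda p: (p[0] - d1) * (p[1] - d2))

# Outcomes on a linear Pareto frontier p1 + p2 = 10, status quo (2, 2):
frontier = [(i, 10 - i) for i in range(11)]
print(nash_bargaining(frontier, (2, 2)))  # symmetry selects (5, 5)
```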
solution could be obtained also as the solution of an auxiliary dynamic game in which a sequence of claims and counterclaims is made by the two players when they bargain.

When extended directly to the context of differential or multistage games, the Nash bargaining solution concept proved to lack the important property of time consistency. This was first noticed in Haurie (1976). Let a dynamic game be defined by Eqs. (1)–(5), with j = 1, 2. Suppose the status quo decision, if no agreement is reached at initial state (t = 0, x(0) = x°), consists in playing an open-loop Nash equilibrium, defined by the controls ū_j(·) : [0, T] → U_j, j = 1, 2, and generating the trajectory x̄(·) : [0, T] → ℝⁿ, with x̄(0) = x°. Now applying the Nash bargaining solution scheme to the data of this differential game played at time t = 0 and state x(0) = x°, one identifies a particular Pareto optimal solution, associated with the controls u*_j(·) : [0, T] → U_j, j = 1, 2, and generating the trajectory x*(·) : [0, T] → ℝⁿ, with x*(0) = x°. Now assume the two players renegotiate the agreement to play u*_j(·) at an intermediate point of the Pareto optimal trajectory (τ, x*(τ)), τ ∈ (0, T). When computed from that point, the status quo strategies are in general not the same as they were at (0, x°); furthermore, the shape of the Pareto frontier, when the game is played from (τ, x*(τ)), is different from what it is when the game is played at (0, x°). For these two reasons the bargaining solution at (τ, x*(τ)) will not coincide in general with the restriction to the interval [τ, T] of the bargaining solution from (0, x°). This implies that the solution concept is not time consistent. Using feedback strategies, instead of open-loop ones, does not help, as the same phenomena (change of status quo and change of Pareto frontier) occur in a feedback strategy context.

This shows that the cooperative game solutions proposed in the classical theory of games cannot be applied without precaution in a dynamic setting when players have the possibility to renegotiate agreements at any intermediary point (t, x*(t)) of the bargained solution trajectory.

Cores and C-Optimality in Dynamic Games

Characteristic functions and the associated solution concept of core are important elements in the classical theory of cooperative games. In two papers (Haurie 1975; Haurie and Delfour 1974) the basic definitions and properties of the concept of core in dynamic cooperative games were presented. Consider the multistage system, controlled by a set M of m players and defined by

    x(k + 1) = f^k(x(k), u_M(k)),  k = 0, 1, …, K − 1
    x(i) = x^i,  i ∈ {0, 1, …, K − 1}
    u_M(k) := (u_j(k))_{j∈M} ∈ U_M(k) := ∏_{j∈M} U_j(k).

From the initial point (i, x^i) a control sequence (u_M(i), …, u_M(K − 1)) generates for each player j a payoff defined as follows:

    J_j(i, x^i; u_M(i), …, u_M(K − 1)) := Σ_{k=i}^{K−1} Φ_j(x(k), u_M(k)) + Υ_j(x(K)).

A subset S of M is called a coalition. Let σ_S^k : x(k) ↦ u_S(k) ∈ ∏_{j∈S} U_j(k) be a feedback control for the coalition, defined at each stage k. A player j ∈ S considers then, from any initial point (i, x^i), his guaranteed payoff:

    Ψ_j(i, x^i; σ_S^i, …, σ_S^{K−1}) := inf_{u_{M∖S}(i) ∈ U_{M∖S}(i), …, u_{M∖S}(K−1) ∈ U_{M∖S}(K−1)} Σ_{k=i}^{K−1} Φ_j(x(k), σ_S^k(x(k)), u_{M∖S}(k)) + Υ_j(x(K)).

Definition 2 The characteristic function at stage i for coalition S ⊆ M is the mapping v^i : (S, x^i) ↦ v^i(S, x^i) ⊂ ℝ^S defined by

    ω_S := (ω_j)_{j∈S} ∈ v^i(S, x^i) […]
Finally, the household might choose to enroll in a demand response program in which it allows a demand response provider to control its TCLs to provide frequency regulation services (Callaway and Hiskens 2011). In general, the renewable aggregator, the battery vehicle aggregator, and the demand response provider can be either separate entities or they can be the same entity. In this entry, we will refer to these aggregating entities as aggregators.

The Problem of DER Coordination

Without loss of generality, denote by x_i the amount of resource provided by DER i without specifying whether it is active or reactive power. [However, it is understood that each DER provides (or consumes) the same type of resource, i.e., all the x_i's are either active or reactive power.] Let 0 < x̲_i < x̄_i, for i = 1, 2, …, n, denote the minimum (x̲_i) and maximum (x̄_i) capacity limits on the amount of resource x_i that node i can provide. Denote by X the total amount of resource that the DERs must collectively provide to satisfy the aggregator request. Let π_i(x_i) denote the price that the aggregator pays DER i per unit of resource x_i that it provides. Then, the objective of the aggregator in the DER coordination problem is to minimize the total monetary amount to be paid to the DERs for providing the total amount of resource X while satisfying the individual capacity constraints of the DERs. Thus, the DER coordination problem can be formulated as follows:

    minimize  Σ_{i=1}^n x_i π_i(x_i)
    subject to  Σ_{i=1}^n x_i = X    (1)
                0 < x̲_i ≤ x_i ≤ x̄_i,  ∀i.

By allowing heterogeneity in the price per unit of resource that the aggregator offers to each DER, we can take into account the fact that the aggregator might value classes of DERs differently. For example, the downregulation capacity provided by a residential PV installation (which is achieved by curtailing its power) might be valued differently from the downregulation capacity provided by a TCL or a PHEV (both would need to absorb additional power in order to provide downregulation).

It is not difficult to see that if the price functions π_i(·), i = 1, 2, …, n, are convex and nondecreasing, then the cost function Σ_{i=1}^n x_i π_i(x_i) is convex; thus, if the problem in (1) is feasible, then there exists a globally optimal solution. Additionally, if the price per unit of resource is linear in the amount of resource, i.e., π_i(x_i) = c_i x_i, i = 1, 2, …, n, then x_i π_i(x_i) = c_i x_i², i = 1, 2, …, n, and the problem in (1) reduces to a quadratic program. Also, if the price per unit of resource is constant, i.e., π_i(x_i) = c_i, i = 1, 2, …, n, then x_i π_i(x_i) = c_i x_i, i = 1, 2, …, n, and the problem in (1) reduces to a linear program. Finally, if π_i(x_i) = π(x_i) = c, i = 1, 2, …, n, for some constant c > 0, i.e., the price offered by the aggregator is constant and the same for all DERs, then the optimization problem in (1) becomes a feasibility problem of the form

    find  x₁, x₂, …, xₙ
    subject to  Σ_{i=1}^n x_i = X    (2)
                0 < x̲_i ≤ x_i ≤ x̄_i,  ∀i.

If the problem in (2) is indeed feasible (i.e., Σ_{l=1}^n x̲_l ≤ X ≤ Σ_{l=1}^n x̄_l), then there is an infinite number of solutions. One such solution, which we refer to as fair splitting, is given by

    x_i = x̲_i + (X − Σ_{l=1}^n x̲_l) / (Σ_{l=1}^n (x̄_l − x̲_l)) · (x̄_i − x̲_i),  ∀i.    (3)

The formulation of the DER coordination problem provided in (2) is not the only possible one. In this regard, and in the context of PHEVs, several recent works have proposed game-theoretic formulations of the problem (Gharesifard et al. 2013; Ma et al. 2013; Tushar et al. 2012).
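The fair-splitting rule (3) can be sketched in a few lines; the capacity numbers below are made up for illustration. Each DER receives its minimum plus a share of the remaining request proportional to its capacity range, which automatically satisfies the sum constraint.

```python
# Minimal sketch of the fair-splitting allocation (3) with hypothetical limits.

def fair_splitting(x_min, x_max, X):
    """Allocate total request X among DERs with capacity limits [x_min_i, x_max_i]."""
    slack = sum(x_max) - sum(x_min)       # total adjustable capacity
    surplus = X - sum(x_min)              # request beyond the minimums
    assert 0 <= surplus <= slack, "request X must be feasible"
    return [lo + surplus / slack * (hi - lo) for lo, hi in zip(x_min, x_max)]

alloc = fair_splitting(x_min=[0.1, 0.2, 0.3], x_max=[1.0, 0.6, 0.9], X=1.5)
print(alloc, sum(alloc))  # the allocations sum to X = 1.5, each within its limits
```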
[Fig. 1: panels (a) and (b) each show bus s with the power flows P_s + jQ_s, P_s^c + jQ_s^c, P_s^r + jQ_s^r, and P_s^u + jQ_s^u.]

Coordination of Distributed Energy Resources for Provision of Ancillary Services: Architectures and Algorithms, Fig. 1 Control architecture alternatives. (a) Centralized architecture. (b) Decentralized architecture
For example, in Gharesifard et al. (2013), the authors assume that each PHEV is a decision maker and can freely choose to participate after receiving a request from the aggregator. The decision that each PHEV is faced with depends on its own utility function, along with some pricing strategy designed by the aggregator. The PHEVs are assumed to be price anticipating in the sense that they are aware of the fact that the pricing is designed by the aggregator with respect to the average energy available. Another alternative is to formulate the DER coordination problem as a scheduling problem (Chen et al. 2012; Subramanian et al. 2012), where the DERs are treated as tasks. Then, the problem is to develop real-time scheduling policies to service these tasks.

Architectures

Next, we describe two possible architectures that can be utilized to implement the proper algorithms for solving the DER coordination problem as formulated in (1). Specifically, we describe a centralized architecture that requires the aggregator to communicate bidirectionally with each DER, and a distributed architecture that requires the aggregator to only unidirectionally communicate with a limited number of DERs but requires some additional exchange of information (not necessarily through bidirectional communication links) among the DERs.

Centralized Architecture
A solution can be achieved through the completely centralized architecture of Fig. 1a, where the aggregator can exchange information with each available DER. In this scenario, each DER can inform the aggregator about its active and/or reactive capacity limits and other operational constraints, e.g., its maintenance schedule. After gathering all this information, the aggregator solves the optimization program in (1), the solution of which will determine how to allocate among the resources the total amount of active power P_s^r and/or reactive power Q_s^r that it needs to provide. Then, the aggregator sends individual commands to each DER so they modify their active and/or reactive power generation according to the solution of (1) computed by the aggregator. In this centralized solution, however, it is necessary to overlay a communication network connecting the aggregator with each resource and to maintain knowledge of the resources that are available at any given time.

Algorithms

[…] the testbed described in Domínguez-García et al. (2012a), which is used to solve a particular instance of the problem in (1), uses Arduino microcontrollers (see Arduino for a description) outfitted with wireless transceivers implementing a ZigBee protocol (see ZigBee for a description).
modeled in many possible ways. One popular possibility is to model credit migrations in terms of a jump process, say C^i = (C_t^i)_{t≥0}, taking values in a finite set, say K^i := {0, 1, 2, …, K^i − 1, K^i}, representing credit ratings assigned to obligor i. Typically, the rating state K^i represents the state of default of the i-th obligor, and typically it is assumed that process C^i is absorbed at state K^i.

Frequently, the case when K^i = 1, that is, K^i := {0, 1}, is considered. In this case, one is only concerned with the jump from the pre-default state 0 to the default state 1, which is usually assumed to be absorbing – the assumption made here as well. It is assumed that process C^i starts from state 0. The (random) time of jump of process C^i from state 0 to state 1 is called the default time and is denoted as τ^i. Process C^i is now the same as the indicator process of τ^i, which is denoted as H^i and defined as H_t^i = 1_{{τ^i ≤ t}} for t ≥ 0. Consequently, modeling of the process C^i is equivalent to modeling of the default time τ^i.

The ultimate goal of credit risk modeling is to provide a feasible mathematical and computational methodology for modeling the evolution of the multivariate credit migration process C := (C^1, …, C^N), so that relevant functionals of such processes can be computed efficiently. The simplest example of such a functional is P(C_{t_j} ∈ A_j, j = 1, 2, …, J | G_s), representing the conditional probability, given the information G_s at time s ≥ 0, that process C takes values in the set A_j at time t_j ≥ 0, j = 1, 2, …, J. In particular, in the case of modeling the default times τ^i, i = 1, 2, …, N, one is concerned with computing conditional survival probabilities P(τ^1 > t_1, …, τ^N > t_N | G_s), which are the same as the probabilities P(H_{t_i}^i = 0, i = 1, 2, …, N | G_s). Based on that, one can compute more complicated functionals that naturally occur in the context of valuation and hedging of credit risk–sensitive financial instruments, such as corporate (defaultable) bonds, credit default swaps, credit spread options, collateralized bond obligations, and asset-based securities.

Modeling of Single Default Time Using Conditional Density

Traditionally, there were two main approaches to modeling default times: the structural approach and the reduced approach, also known as the hazard process approach. The main features of both these approaches are presented in Bielecki and Rutkowski (2004).

We focus here on modeling a single default time, denoted as τ, using the so-called conditional density approach of El Karoui et al. (2010). This approach allows for extension of results that can be derived using the reduced approach.

The default time τ is a strictly positive random variable defined on the underlying probability space (Ω, F, P), which is endowed with a reference filtration, say F = (F_t)_{t≥0}, representing the flow of all (relevant) market information available in the model, not including information about the occurrence of τ. The information about the occurrence of τ is carried by the (right-continuous) filtration H generated by the indicator process H := (H_t = 1_{τ≤t})_{t≥0}. The full information in the model is represented by the filtration G := F ∨ H. It is postulated that

    P(τ ∈ dθ | F_t) = α_t(θ) dθ,

for some random field α_t(θ), such that α_t(θ) is F_t ⊗ B(ℝ₊)-measurable for each t. The family α_t(θ) is called the F_t-conditional density of τ. In particular, P(τ > θ) = ∫_θ^∞ α_0(u) du. The following survival processes are associated with τ:
• S_t(θ) := P(τ > θ | F_t) = ∫_θ^∞ α_t(u) du, which is an F-martingale,
• S_t := S_t(t) = P(τ > t | F_t), which is an F-supermartingale (the Azéma supermartingale).
In particular, S_0(θ) = P(τ > θ) = ∫_θ^∞ α_0(u) du, and S_t(0) = S_0 = 1.

As an example of computations that can be done using the conditional density approach, we give the following result, in which the notations "bd" and "ad" stand for before default and at-or-after default, respectively.
248 Credit Risk Modeling
where

    Y_t^{bd} = (1 / S_t) ∫_t^∞ Y_T(θ) α_t(θ) dθ · 1_{S_t > 0},

and

    Y_t^{ad}(T, θ) = (E(Y_T(θ) α_T(θ) | F_t) / α_t(θ)) · 1_{α_t(θ) > 0}.

[…] is called the G-intensity of τ. The G-compensator Λ^G is stopped at τ, i.e., Λ_t^G = Λ_{t∧τ}^G. Hence, λ_t^G = 0 when t > τ. In particular, we have

    λ_t^G = 1_{t<τ} λ_t^F = (1 − H_t) λ_t^F.

The conditional density process and the G-intensity of τ are related as follows. For any t […]

(i) […] where M_t = ∫_0^t (α_t(u) − α_u(u)) du = E(∫_0^∞ α_u(u) du | F_t) − 1.
(ii) Let ς := inf{t ≥ 0 : S_t = 0}. Define λ_t^F = α_t(t)/S_t for t < ς and λ_t^F = λ_ς^F for t ≥ ς. Then the multiplicative decomposition of S is given as

    S_t = L_t^F e^{−∫_0^t λ_u^F du},

[…] with μ_t := sup_{s≤t} X_s. We then have S_t(θ) = G(θ) if θ ≤ t, and S_t(θ) = E(G(θ) | F_t^X) if θ > t.
• Assume that F = 1 − G and […] are absolutely continuous w.r.t. the Lebesgue measure, with respective densities f and […]. We then have

    α_t(θ) = f(θ) = α_θ(θ),  t ≥ θ,

    M_t^G = H_t − Λ_t^G […]

Example 2 This is a reduced-form-like example.
• Suppose S is a strictly positive process. Then the F-hazard process of τ, denoted Λ^F, is given as

    Λ_t^F = −ln S_t,  t ≥ 0.

[…]

    Σ_{k∈K} a_{ih,j}^k = a_{ij},  ∀ i, j, h ∈ K, i ≠ j,    (1)

    Σ_{j∈K} a_{ih,j}^k = a_{hk},  ∀ i, h, k ∈ K, h ≠ k,    (2)

Cross-References
Data Association

Yaakov Bar-Shalom and Richard W. Osborne
University of Connecticut, Storrs, CT, USA

Abstract

In tracking applications, following the signal detection process that yields measurements, there is a procedure that selects the measurement(s) to be incorporated into the state estimator – this is called data association (DA). In multitarget-multisensor tracking systems, there are generally three classes of data association: specifically, measurement-to-track association (M2TA), track-to-track association (T2TA), and measurement-to-measurement association (M2MA). M2TA is the process of associating each measurement from a list (originating from one or more sensors) to a new or existing track. T2TA is the process of associating multiple existing tracks (from multiple sensors or from different periods in time), generally with the intent of fusing them afterward. M2MA is the process of associating measurements from different sensors in order to form "composite measurements" and/or do track initialization. The processes of M2TA and T2TA will be discussed in more detail here, while details on M2MA can be found in Bar-Shalom et al. (2011).

Keywords

Clutter; Measurement origin uncertainty; Measurement validation; Persistent interference; Tracking

Introduction

In a radar the "return" from the target of interest is sought within a time interval determined by the anticipated range of the target when it reflects the energy transmitted by the radar: a "range gate" is set up, and the detection(s) within this gate can be associated with the target of interest.

In general the measurements have a higher dimension:
• Range, azimuth (bearing), elevation, or direction sines for radar, possibly also range rate
• Bearing and frequency (when the signal is narrow band) or time difference of arrival and frequency difference in passive sonar
• Two line-of-sight angles or direction sines for optical or passive electromagnetic sensors

Then a multidimensional gate is set up for detecting the signal from the target. This is done to avoid searching for the signal from the target of interest in the entire measurement space. A measurement in the gate, while not guaranteed to have originated from the target the gate pertains to, is a valid association candidate – thus, the name validation region or association region. If there
Data Association, Table 1 Gate thresholds and the probability mass P_G in the gate

    γ        1      4      6.6    9      9.2    11.4   16       25
    g        1      2      2.57   3      3.03   3.38   4        5
    n_z = 1  0.683  0.954  0.99   0.997  –      –      0.99994  1
    n_z = 2  0.393  0.865  –      0.989  0.99   –      0.9997   1
    n_z = 3  0.199  0.739  –      0.971  –      0.99   0.9989   0.99998
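Both ingredients behind Table 1 can be sketched numerically: the ellipsoidal gate test accepts a measurement when the normalized innovation squared d² = ν′S⁻¹ν is at most the threshold γ, and the in-gate probability mass P_G is a chi-square cumulative probability with n_z degrees of freedom, which has standard closed forms for n_z = 1, 2, 3. The sketch below is our own illustration (standard library only; the numeric values S, z, etc., are made up).

```python
# Gate test d2 = nu' S^-1 nu <= gamma, and the chi-square mass P_G of Table 1.

import math

def in_gate(z, z_pred, S, gamma):
    """2-D gate test with innovation covariance S (2x2 nested tuples)."""
    nu = (z[0] - z_pred[0], z[1] - z_pred[1])
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    # inverse of a 2x2 matrix, written out explicitly
    Sinv = ((S[1][1] / det, -S[0][1] / det), (-S[1][0] / det, S[0][0] / det))
    d2 = (nu[0] * (Sinv[0][0] * nu[0] + Sinv[0][1] * nu[1])
          + nu[1] * (Sinv[1][0] * nu[0] + Sinv[1][1] * nu[1]))
    return d2 <= gamma

def gate_probability(nz, gamma):
    """P_G = P(chi-square with nz dof <= gamma); closed forms for nz = 1, 2, 3."""
    g = math.sqrt(gamma)
    if nz == 1:
        return math.erf(g / math.sqrt(2))
    if nz == 2:
        return 1.0 - math.exp(-gamma / 2)
    if nz == 3:
        return math.erf(g / math.sqrt(2)) - math.sqrt(2 / math.pi) * g * math.exp(-gamma / 2)
    raise ValueError("closed forms implemented for nz = 1, 2, 3 only")

S = ((4.0, 0.0), (0.0, 1.0))                     # innovation covariance
print(in_gate((2.0, 0.5), (0.0, 0.0), S, 9.2))   # d2 = 1.25 -> True
print(round(gate_probability(1, 6.6), 3),
      round(gate_probability(2, 9.2), 3),
      round(gate_probability(3, 11.4), 3))       # reproduces the 0.99 column
```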
The Optimal Estimator for the Pure MMSE Approach

The information set available at k is

    I^k = {Z^k, U^{k−1}}    (13)

where

    Z^k = {z(j)}_{j=1}^k    (14)

is the cumulative set of observations through time k, which subsumes the initial information Z^0, and U^{k−1} is the set of known inputs prior to time k.

For a stochastic system, an information state (Striebel 1965) is a function of the available information set that summarizes the past of the system in a probabilistic sense. It can be shown that the conditional pdf of the state (15) is an information state if (i) the two noise sequences (process and measurement) are white and mutually independent and (ii) the target detection and clutter/false measurement processes are white. Once the conditional pdf (15) is available, the pure MMSE estimator, i.e., the conditional mean, as well as the conditional variance, or covariance matrix, can be obtained.

The optimal estimator, which consists of the recursive functional relationship between the information states p_{k+1} and p_k, is given by

    p_{k+1} = Φ[k + 1, p_k, z(k + 1), u(k)]    (16)

where

    Φ[k + 1, p_k, z(k + 1), u(k)]
      = (1/c) Σ_{i=1}^{M(k+1)} p[z(k + 1) | x(k + 1), A_i(k + 1)]
        · ∫ p[x(k + 1) | x(k), u(k)] p_k dx(k) · P{A_i(k + 1)}    (17)

is the transformation that maps p_k into p_{k+1}; the integration in (17) is over the range of x(k), and c is the normalization constant.

The recursion (17) shows that the optimal MMSE estimator in the presence of data association uncertainty has the following properties:
P1. The pdf p_{k+1} is a weighted sum of pdfs, conditioned on the current time association events A_i(k + 1), i = 1, …, M(k + 1), where M(k + 1) is the number of mutually exclusive and exhaustive association events at time k + 1.
P2. If the exact previous pdf, which is the sufficient statistic, is available, then only the most recent association event probabilities are needed at each time.

However, the number of terms of the mixture in the right-hand side of (17) is, by time k + 1, given by the product

    ∏_{l=1}^{k+1} M(l),

which amounts to an exponential increase in time. This increase is similar to the increase in the number of branches of the MHT hypothesis tree. A detailed derivation of the recursion for the optimal estimator can be found in Bar-Shalom et al. (2011).

Track-to-Track Association

In addition to measurement-to-track association (M2TA), an additional class of data association is track-to-track association (T2TA). Following T2TA, track-to-track fusion (T2TF) may be performed to (hopefully) improve the overall tracking accuracy. For more details on track fusion, see Bar-Shalom et al. (2011).

It is desired first to test the hypothesis that two tracks pertain to the same target. The optimal test would require using the entire database (the sequences of measurements that form the tracks) through the present time k and is not practical. In view of this, the test to be presented is based only on the most recent estimates from the tracks. The test based on the state estimates within a time window is discussed in Tian and Bar-Shalom (2009).
    D = Δ̂^{ij}(k)′ [T^{ij}(k)]^{−1} Δ̂^{ij}(k)    (34)
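A T2TA test built on a statistic of the form (34) can be sketched as follows; this is our own toy illustration (diagonal covariance T and made-up numbers, not the entry's full derivation): the quadratic form D is compared against a chi-square gate threshold, and the "same target" hypothesis is accepted when D is small.

```python
# Toy T2TA acceptance test: D = delta' T^-1 delta with diagonal T.

def t2ta_statistic(delta, T_diag):
    """Quadratic-form statistic for a track-estimate difference delta."""
    return sum(d * d / t for d, t in zip(delta, T_diag))

D = t2ta_statistic(delta=(1.0, 2.0), T_diag=(1.0, 4.0))
print(D, D <= 9.2)  # 2.0 True: the two tracks are accepted as the same target
```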
The Cross-Covariance of the Estimation Errors

The cross-covariance recursion for synchronized sensors can be shown to be (see Bar-Shalom et al. 2011) […]

Bibliography

Bar-Shalom Y, Li XR, Kirubarajan T (2001) Estimation with applications to tracking and navigation: theory, algorithms and software. Wiley, New York
Bar-Shalom Y, Willett PK, Tian X (2011) Tracking and data fusion. YBS, Storrs
Blackman SS, Popoli R (1999) Design and analysis of modern tracking systems. Artech House, Norwood
Mallick M, Krishnamurthy V, Vo BN (eds) (2013) Integrated tracking, classification, and sensor management. Wiley, Hoboken
Reid DB (1979) An algorithm for tracking multiple targets. IEEE Trans Autom Control 24:843–854
Striebel C (1965) Sufficient statistics in the optimum control of stochastic systems. J Math Anal Appl 12:576–592
Tian X, Bar-Shalom Y (2009) Track-to-track fusion configurations and association in a sliding window. J Adv Inf Fusion 4(2):146–164

Data Rate of Nonlinear Control Systems and Feedback Entropy

Christoph Kawan
Courant Institute of Mathematical Sciences, New York University, New York, USA

Abstract

Topological feedback entropy is a measure for the smallest information rate in a digital communication channel between the coder and the controller of a control system, above which the control task of rendering a subset of the state space invariant can be solved. It is defined purely in terms of the open-loop system, without making reference to a particular coding and control scheme, and can also be regarded as a measure for the inherent rate at which the system generates "invariance information."

Keywords

Communication constraints; Controlled invariance; Invariance entropy; Minimal data rates; Stabilization

Introduction

In the theory of networked control systems, the assumption of classical control theory that information can be transmitted within control loops instantaneously, losslessly, and with arbitrary precision is no longer satisfied. Realistic mathematical models of many important real-world communication and control networks have to take into account general data rate constraints in the communication channels, time delays, partial loss of information, and variable network topologies. This raises the question of the smallest possible information rate above which a given control task can be solved. Though networked control systems can have a complicated topology, consisting of multiple sensors, controllers, and actuators, a first step towards understanding the problem of minimal data rates is to analyze the simplest possible network topology, consisting of one controller and one dynamical system connected by a digital channel with a certain rate in bits per unit time. There is a wealth of literature concerned with the problem of stabilizing a system under different assumptions about the specific coding and control scheme in this context. However, with few exceptions, mainly linear systems (both deterministic and stochastic) have been considered. A comprehensive and detailed overview of this literature until 2007 can be found in the survey Nair et al. (2007). The first systematic approach to the problem of minimal data rates for set invariance and stabilization of (deterministic, nonlinear) control systems was presented in Nair et al. (2004), where the notion of topological feedback entropy was introduced. This quantity, defined in terms of the open-loop control system, is a measure for the smallest data rate a communication channel may have if the system is supposed to solve the control task of rendering a subset of the state space invariant. Other challenges that digital communication channels come along with are not yet taken into account here.

Feedback entropy was first introduced in Nair et al. (2004), using a similar approach via open covers as in the definition of topological entropy for classical dynamical systems in Adler et al. (1965). In Colonius and Kawan (2009), a quantity named invariance entropy was defined, which later turned out to be equivalent to the feedback entropy of Nair et al. (cf. Colonius et al. 2013). The notion of invariance entropy has been further
The notion of invariance entropy has been further studied in the papers Kawan (2011a), Kawan (2011b), and Kawan (2011c). Several variations and generalizations have been introduced in Colonius (2010), Colonius (2012), Colonius and Kawan (2011), Da Silva (2013), and Hagihara and Nair (2013). The research monograph Kawan (2013) provides a comprehensive presentation of the results obtained so far in the deterministic case.

Definition

Topological feedback entropy is a nonnegative real-valued quantity which serves as a measure for the smallest possible data rate in a digital channel, connecting a coder to a controller, above which the controller is able to generate inputs which guarantee invariance of a given subset of the state space. In the literature, one finds several slightly differing versions. The original definition given in Nair et al. (2004) is (with minor modifications) as follows. Consider a discrete-time control system

  x_{k+1} = F(x_k, u_k) = F_{u_k}(x_k),  k ≥ 0,

with F : X × U → X, where X is a topological space and U a nonempty set such that F_u : X → X is continuous for every u ∈ U. The transition map associated to this system is

  φ : ℕ₀ × X × U^{ℕ₀} → X,
  φ(k, x, (u_n)) := F_{u_{k−1}} ∘ ⋯ ∘ F_{u_1} ∘ F_{u_0}(x).

A compact subset K ⊂ X with nonempty interior is (strongly) controlled invariant if for every x ∈ K there is an input u ∈ U such that F_u(x) ∈ int K. A triple (𝒜, τ, G) is called an invariant open cover of K if 𝒜 is an open cover of K, τ is a positive integer, and G : 𝒜 → U^τ is a map with components G₀, G₁, …, G_{τ−1} which assign control values to all sets in 𝒜 such that for every A ∈ 𝒜 the finite sequence of controls G(A) yields φ(k, A, G(A)) ⊂ int K for k = 1, 2, …, τ.

The entropy of (𝒜, τ, G) is defined as follows. For every sequence α = (A_i)_{i≥0} of sets in 𝒜 one defines an associated sequence of controls by

  u(α) = (u₀, u₁, u₂, …)  with  (u_l)_{l=(i−1)τ}^{iτ−1} = G(A_{i−1}) for all i > 0,

and for every j ≥ 1 a set

  B_j(α) := {x ∈ X : φ(iτ, x, u(α)) ∈ A_i for i = 0, 1, …, j − 1}.

The family 𝓑_j := {B_j(α) : α ∈ 𝒜^{ℕ₀}} is an open cover of K. Letting N(𝓑_j | K) denote the minimal cardinality of a finite subcover, the entropy of (𝒜, τ, G) is

  h(𝒜, τ, G) := lim_{j→∞} (1/(jτ)) log₂ N(𝓑_j | K) = inf_{j≥1} (1/(jτ)) log₂ N(𝓑_j | K).

Finally, the topological feedback entropy (TFE) of K is given by

  h_fb(K) := inf_{(𝒜,τ,G)} h(𝒜, τ, G),

where the infimum is taken over all invariant open covers of K.

A conceptually simpler but equivalent definition, introduced in Colonius and Kawan (2009), is the following. A subset S ⊂ U^{ℕ₀} is called (τ, K)-spanning if for every x ∈ K there is u ∈ S with φ(k, x, u) ∈ int K for k = 1, …, τ. Writing r_inv(τ, K) for the minimal cardinality of such a set, it can be shown that

  h_fb(K) = lim_{τ→∞} (1/τ) log₂ r_inv(τ, K) = inf_{τ≥1} (1/τ) log₂ r_inv(τ, K).

This definition and several variations of it are mostly referred to by the name invariance entropy instead of feedback entropy. The intuition behind this definition is that a controller which receives a certain amount of information about the state, say n bits, can generate at most 2ⁿ different control sequences to steer the system on a finite time interval and hence, the number of control sequences needed to accomplish the control task on this interval is a measure for the necessary amount of information.
The different variations of feedback or invariance entropy which can be found in the literature are briefly summarized as follows. For simplicity, in this entry we refer to several of these variations by the name (topological) feedback entropy:

(i) Instead of requiring that trajectories enter the interior of K after one step of time, one can allow for a waiting time τ₀ before entering int K.
(ii) One can require that trajectories stay in K instead of int K or that they stay in an arbitrarily small neighborhood of K, respectively.
(iii) One can restrict the set of initial states to a subset K′ ⊂ K. In this case, a set S of control sequences is (τ, K′, K)-spanning if for every x ∈ K′ there is u ∈ S with φ(k, x, u) ∈ int K for k = 1, …, τ.
(iv) Feedback entropy can be defined for other classes of systems, e.g., continuous-time deterministic systems, random control systems, or systems with piecewise continuous right-hand sides.

There is also a local version of topological feedback entropy (LTFE) which measures the smallest data rate for local uniform asymptotic stabilization at an equilibrium. Also for other control tasks there have been attempts to define corresponding versions of feedback or invariance entropy.

Comparison to Topological Entropy

Though there are similarities in the definitions of TFE and topological entropy of dynamical systems, which also reflect in similar properties, there is no direct relation between these two quantities. Topological entropy detects exponential complexity in the orbit structure of a dynamical system. In contrast, TFE measures the complexity of the control task to keep a system in a subset of the state space by applying appropriate inputs. If no escape from this subset is possible, the TFE is zero, no matter how complicated the orbit structure is. Hence, topological entropy is sensitive to the local behavior of the system, while TFE in general is not. Interpreted in terms of information rates, topological entropy is a measure for the largest average rate of information about the initial state a dynamical system can generate. TFE measures the smallest rate of information about the state of the system above which a controller is able to render the set invariant. It should also be mentioned that topological entropy was first introduced as a topological counterpart of the measure-theoretic entropy defined by Kolmogorov and Sinai, and that the two notions are related by the variational principle, which asserts that the topological entropy is the supremum of the measure-theoretic entropies with respect to all invariant probability measures of the given system. For TFE, so far no convincing measure-theoretic approach exists. An excellent survey on the entropy theory of dynamical systems can be found in Katok (2007).

The Data Rate Theorem

The data rate theorem for the TFE confirms that the infimal data rate in a coding and control loop which guarantees strong controlled invariance of a set K is equal to h_fb(K). More precisely, suppose that a sensor which measures the state of the system at the discrete times kτ, k = 0, 1, 2, …, is connected to a coder which at time kτ has a finite alphabet S_k of symbols available. The measured state is coded by use of this alphabet and the corresponding symbol is sent via a noiseless digital channel to a controller which generates an input sequence of length τ. This sequence is used to steer the system until the next symbol arrives at time (k+1)τ. The associated asymptotic average bit rate, which depends on the sequence S = (S_k)_{k≥0} of coding alphabets, is given by

  R(S) = lim_{k→∞} (1/(kτ)) Σ_{i=0}^{k−1} log₂ |S_i|.

If the limit does not exist, one may replace it with lim inf or lim sup. The data rate theorem establishes the equality

  h_fb(K) = inf_S R(S),

where the infimum is taken over all coding and control loops which guarantee strong controlled invariance of K, i.e., for initial states in K they generate trajectories (x_k)_{k≥0} with x_k ∈ int K for k = 1, 2, ….
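As a quick numerical illustration of the bit-rate formula above, the following sketch computes the finite-horizon average rate; the alphabet sizes and the value of τ are arbitrary illustrative choices:

```python
from math import log2

def average_bit_rate(alphabet_sizes, tau):
    """Finite-horizon version of R(S): bits transmitted per unit time when
    the i-th symbol is drawn from an alphabet of size |S_i| and one symbol
    is sent every tau time steps."""
    k = len(alphabet_sizes)
    return sum(log2(size) for size in alphabet_sizes) / (k * tau)

# One 8-symbol alphabet every tau = 2 time steps: 3 bits per 2 steps.
print(average_bit_rate([8, 8, 8, 8], tau=2))  # 1.5
```

For constant alphabets the average is simply log₂|S| / τ; the sequence form matters only when the coder changes its alphabet over time.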
Similar data rate theorems can be proved for other variants of feedback entropy. In particular, the data rate theorem for the LTFE asserts that the infimal bit rate for local uniform asymptotic stabilization at an equilibrium is given by the LTFE. Proofs of different data rate theorems can be found in Nair et al. (2004), Hagihara and Nair (2013), and Kawan (2013).

Estimates and Formulas

Linear Systems
For linear systems, under reasonable assumptions, the feedback entropy is given by the sum of the unstable eigenvalues of the dynamical matrix, i.e., if the system is given by x_{k+1} = A x_k + B u_k, then

  h_fb(K) = Σ_{λ∈Sp(A)} max{0, n_λ log₂ |λ|},   (1)

where Sp(A) denotes the spectrum of A and n_λ is the algebraic multiplicity of the eigenvalue λ (cf. Colonius and Kawan 2009). It is worth mentioning that the TFE therefore coincides with the topological entropy of the uncontrolled system x_{k+1} = A x_k, as defined by Bowen for maps on non-compact metric spaces (cf. Bowen 1971). However, this is a special property of linear systems and is related to the facts that (i) there is no difference between the local and the global dynamical behavior of uncontrolled linear systems and that (ii) the control sequence does not affect the exponential complexity of the dynamics, since it only appears as an additive term in the transition map of the system. Formula (1) is in correspondence with several former results on minimal data rates for stabilization of linear systems. Thinking of the definition of feedback entropy via spanning sets of control sequences, its interpretation is that in order to guarantee invariance of a bounded set, the only reason for exponential growth of the number of necessary inputs as time increases is the volume expansion of the open-loop system in the unstable subspace.

Upper Bounds Under Controllability Assumptions
If the state space of the control system is a differentiable manifold and the right-hand side is continuously differentiable, under certain controllability assumptions upper bounds for the feedback entropy can be formulated in terms of the Lyapunov exponents of periodic trajectories (for the concept of Lyapunov exponents, see, e.g., Barreira and Valls 2008) (cf. Kawan 2011b, 2013; Nair et al. 2004). More precisely, if there is a periodic trajectory in the interior of the given set K such that the linearization along this trajectory is controllable, and if complete approximate controllability holds on the interior of K (cf. Colonius and Kliemann 2000), then

  h_fb(K) ≤ Σ_λ max{0, n_λ λ},   (2)

where the sum is taken over all Lyapunov exponents λ of the periodic trajectory and n_λ denotes the multiplicity of λ. Using the definition of feedback entropy in terms of (τ, K)-spanning sets, one can prove this by constructing spanning sets of control functions which first steer all initial states in K into a small neighborhood of a point on the given periodic orbit and then, by use of local controllability, keep the corresponding trajectories in a neighborhood of the periodic trajectory for arbitrary future times. Similar ideas have first been used in Nair et al. (2004) to prove that the LTFE at an equilibrium is given by the sum of the unstable eigenvalues of the linearization about this equilibrium. For systems given by differential equations the upper estimate (2) can be improved under additional regularity assumptions. Assuming that the system is smooth and satisfies the strong jet accessibility rank condition (cf. Coron 1994), one can show that both assumptions, controllability of the linearization and periodicity, can be omitted. The only restriction that remains is that the trajectory must not leave a compact subset of the interior of K.
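Formula (1) is straightforward to evaluate once the spectrum of A is known; a minimal sketch (the eigenvalue lists below are arbitrary illustrative choices, listed according to algebraic multiplicity):

```python
from math import log2

def hfb_linear(eigenvalues):
    """Evaluate formula (1): the sum of max{0, log2|lambda|} over the
    spectrum of A, with each eigenvalue repeated according to its
    algebraic multiplicity."""
    return sum(max(0.0, log2(abs(lam))) for lam in eigenvalues)

# Sp(A) = {2 (multiplicity 2), 0.5}: only the unstable eigenvalues
# contribute, giving 2 * log2(2) = 2 bits per time step.
print(hfb_linear([2.0, 2.0, 0.5]))  # 2.0

# A complex eigenvalue 1 + 1j has modulus sqrt(2) and contributes 0.5.
print(hfb_linear([1 + 1j]))  # 0.5
```

Note that abs() returns the modulus for complex eigenvalues, so conjugate pairs are handled by simply listing both members of the pair.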
However, in the case of nonperiodic trajectories, the sum of the positive Lyapunov exponents has to be replaced by the maximal Lyapunov exponent of the induced linear flow on the exterior bundle of the manifold. For control-affine systems, strong jet accessibility can be weakened to local accessibility. In general, it is unlikely that such upper bounds are tight, since they are related to very specific control strategies for making the given set invariant.

Lower Bounds, Volume Growth Rates, and Escape Rates
A general approach to obtain lower bounds of feedback entropy is via a volume growth argument, which in its simplest form works as follows. Every (τ, K)-spanning set S defines a cover of K, consisting of the sets (cf. Kawan 2011a,c)

  K_{τ,u} = {x ∈ K : φ(k, x, u) ∈ int K, 1 ≤ k ≤ τ},  u ∈ S.

It follows that φ_{τ,u}(K_{τ,u}) ⊂ K and hence, since K is bounded, the volume expansion under φ_{τ,u} = φ(τ, ·, u) gives upper bounds for the volumes of the sets K_{τ,u}, which result in a lower bound for the number of these sets. For instance, the lower estimate in (1) can be established by applying this argument to the system which arises by projection of the given linear system to the unstable subspace of the uncontrolled part x_{k+1} = A x_k. A refinement of this idea also leads to lower estimates of feedback entropy for inhomogeneous bilinear systems in terms of volume growth rates or Lyapunov exponents on unstable bundles, respectively. For nonlinear systems, in general only very rough estimates can be obtained by this method. However, a variation of the volume growth argument leads to a lower bound of the form

  h_fb(K) ≥ lim inf_{τ→∞} −(1/τ) log sup_u μ(K_{τ,u}),

where μ denotes a reference measure on the state space. The right-hand side of this inequality can be considered as a uniform escape rate from the set K, which under sufficiently strong hyperbolicity assumptions can be estimated in terms of other quantities such as Lyapunov exponents and dynamical entropies. Key references for escape rates in the classical theory of dynamical systems are Young (1990) and Demers and Young (2006).

Summary and Future Directions

The theory of feedback entropy for finite-dimensional deterministic systems is very far from being complete. The currently available results only give valuable information in very regular situations, and even those are not fully understood. For the further development of this theory, it will be necessary to combine control-theoretic methods with techniques from different fields such as classical, random, and nonautonomous dynamical systems. Some of the main focuses of future research will probably be the following:
• The generalization of feedback entropy to more complex network topologies
• The formulation of a feedback entropy theory for stochastic systems
• The development of a probabilistic (resp. measure-theoretic) version of feedback entropy for both deterministic and stochastic systems, which is related to the topological version via a variational principle
• The numerical computation of feedback entropy

Cross-References

Quantized Control and Data Rate Constraints

Bibliography

Adler RL, Konheim AG, McAndrew MH (1965) Topological entropy. Trans Amer Math Soc 114:309–319
Barreira L, Valls C (2008) Stability of nonautonomous differential equations. Lecture notes in mathematics, vol 1926. Springer, Berlin
[…] effects and derive a system of ordinary differential equations (ODEs) to model the dynamics. Formally, a biochemical network is given by r reactions R_1, …, R_r acting on n different chemical species S_1, …, S_n. Reaction R_j is given by

  R_j : α_{1,j} S_1 + ⋯ + α_{n,j} S_n → β_{1,j} S_1 + ⋯ + β_{n,j} S_n,

where α_{i,j}, β_{i,j} ∈ ℕ are called the molecularities of the species in the reaction. Their differences form the stoichiometric matrix N = (β_{i,j} − α_{i,j})_{i=1…n, j=1…r}, with N_{i,j} describing the net effect of one turnover of R_j on the copy number of species S_i. The j-th column is also called the stoichiometry of reaction R_j. The system can be opened to an un-modeled environment by introducing inflow reactions ∅ → S_i and outflow reactions S_i → ∅.

For example, consider the following reaction network:

  R_1 : E + S → E·S
  R_2 : E·S → E + S
  R_3 : E·S → E + P

Here, an enzyme E (a protein that acts as a catalyst for biochemical reactions) binds to a substrate species S, forming an intermediate complex E·S and subsequently converting S into a product P. Note that the enzyme–substrate binding is reversible, while the conversion to a product is irreversible. This network contains r = 3 reactions interconverting n = 4 chemical species S_1 = S, S_2 = P, S_3 = E, and S_4 = E·S. The stoichiometric matrix is

  N = ( −1  +1   0
         0   0  +1
        −1  +1  +1
        +1  −1  −1 )

Dynamics
Let x(t) = (x_1(t), …, x_n(t))ᵀ be the vector of concentrations, that is, x_i(t) is the concentration of S_i at time t. Abbreviating this state vector as x by dropping the explicit dependence on time, its dynamics is governed by a system of n ordinary differential equations:

  (d/dt) x = N v(x, p).   (1)

Here, the reaction rate vector v(x, p) = (v_1(x, p_1), …, v_r(x, p_r))ᵀ gives the rate of conversion of each reaction per unit-time as a function of the current system state and of a set of parameters p.

A typical reaction rate is given by the mass-action rate law

  v_j(x, p_j) = k_j ∏_{i=1}^{n} x_i^{α_{i,j}},

where the rate constant k_j > 0 is the only parameter and the rate is proportional to the concentration of each species participating as an educt (consumed component) in the respective reaction.

Equation (1) decomposes the system into a time-independent and linear part described solely by the topology and stoichiometry of the reaction network via N and a dynamic and typically nonlinear part given by the reaction rate laws v(·, ·). One can define a directed graph of the network with one vertex per state and take N as the (weighted) incidence matrix. Reaction rates are then properties of the resulting edges. In essence, the equation describes the change of each species' concentration as the sum of the current reaction rates. Each rate is weighted by the molecularity of the species in the corresponding reaction; it is negative if the species is consumed by the reaction and positive if it is produced.

Using mass-action kinetics throughout, and using the species name instead of the numeric subscript, the reaction rates and parameters of the example network are given by

  v_1(x, p_1) = k_1 x_E(t) x_S(t)
  v_2(x, p_2) = k_2 x_{ES}(t)
  v_3(x, p_3) = k_3 x_{ES}(t)
  p = (k_1, k_2, k_3)
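The decomposition ẋ = N v(x, p) can be evaluated directly for this example; a minimal sketch, in which the rate constants and concentrations are arbitrary illustrative values:

```python
# Species order: S, P, E, ES; reaction order: R1, R2, R3.
N = [
    [-1, +1,  0],   # S
    [ 0,  0, +1],   # P
    [-1, +1, +1],   # E
    [+1, -1, -1],   # ES
]

def rates(x, p):
    """Mass-action rate vector v(x, p) for the enzyme example."""
    xS, xP, xE, xES = x
    k1, k2, k3 = p
    return [k1 * xE * xS, k2 * xES, k3 * xES]

def rhs(x, p):
    """Right-hand side N v(x, p) of equation (1)."""
    v = rates(x, p)
    return [sum(Nij * vj for Nij, vj in zip(row, v)) for row in N]

print(rhs([1.0, 0.0, 0.5, 0.0], (1.0, 1.0, 1.0)))  # [-0.5, 0.0, -0.5, 0.5]
```

Each component of the result is the corresponding mass-action balance: at this state only R1 is active, so S and E are consumed and E·S is produced at the same rate.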
266 Deterministic Description of Biochemical Networks
The system equations are then

  (d/dt) x_S = −k_1 x_S x_E + k_2 x_{ES}
  (d/dt) x_P = k_3 x_{ES}
  (d/dt) x_E = −k_1 x_S x_E + k_2 x_{ES} + k_3 x_{ES}
  (d/dt) x_{ES} = k_1 x_S x_E − k_2 x_{ES} − k_3 x_{ES}.

Steady-State Analysis
A reaction network is in steady state if the production and consumption of each species are balanced. Steady-state concentrations x* then satisfy the equation

  0 = N v(x*, p).

[…] a combination of fluxes bᵀv, the total production rate of relevant metabolites to form new biomass. The biomass function, given by b ∈ ℝʳ, is determined experimentally. The technique of flux balance analysis (FBA) then solves the linear program

  max_v bᵀv
  subject to
    0 = N v
    vᵢˡ ≤ vᵢ ≤ vᵢᵘ

to yield a feasible flux vector that balances the network while maximizing growth. Alternative objective functions have been proposed, for instance, for higher organisms that do not necessarily maximize the growth of each cell.

[Figure (Deterministic Description of Biochemical Networks): input–response curves for h = 3 and h = 10; axes input [a.u.] vs. response [a.u.]]
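The example system above can be integrated numerically to observe the approach to steady state; a minimal sketch using explicit Euler integration, where the rate constants, initial concentrations, step size, and horizon are arbitrary illustrative choices:

```python
def simulate(x0, p, dt=0.01, steps=20000):
    """Explicit Euler integration of dx/dt = N v(x, p) for the enzyme
    network (species order: S, P, E, ES)."""
    k1, k2, k3 = p
    xS, xP, xE, xES = x0
    for _ in range(steps):
        v1, v2, v3 = k1 * xE * xS, k2 * xES, k3 * xES
        xS += dt * (-v1 + v2)
        xP += dt * v3
        xE += dt * (-v1 + v2 + v3)
        xES += dt * (v1 - v2 - v3)
    return xS, xP, xE, xES

xS, xP, xE, xES = simulate((1.0, 0.0, 0.5, 0.0), (1.0, 1.0, 1.0))
# Total enzyme E + ES and total substrate S + ES + P lie in the left null
# space of N and are therefore conserved; since R3 is irreversible, almost
# all substrate eventually ends up as product P.
print(round(xE + xES, 6), round(xS + xES + xP, 6))  # 0.5 1.0
```

Because conservation relations correspond to left null vectors of N, they are preserved by the Euler update itself (up to floating-point roundoff), which makes them a convenient sanity check on any hand-coded right-hand side.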
Diagnosis of Discrete Event Systems

Stéphane Lafortune
Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, USA

Abstract

We discuss the problem of event diagnosis in partially observed discrete event systems. The objective is to infer the past occurrence, if any, of […]

Introduction

In this entry, we consider discrete event systems that are partially observable and discuss the two related problems of event diagnosis and diagnosability analysis. Let the DES of interest be denoted by M with event set E. Since M is partially observable, its set of events E is the disjoint union of a set of observable events, denoted by E_o, with a set of unobservable events, denoted by E_uo: E = E_o ∪ E_uo. At this point, we do not specify how M is represented; it could be an automaton or a Petri net. Let L_M be the set of all strings of events in E that the DES M can execute, i.e., the (untimed) language model of the system; cf. the related entries Models for Discrete Event Systems: An Overview, Supervisory Control of Discrete-Event Systems, and Modeling, Analysis, and Control with Petri Nets. The set E_uo captures the fact that the set of sensors attached to the DES M is limited and may not cover all the events of the system. Unobservable events can be internal system events that are not directly "seen" by the monitoring agent that observes the behavior of M for diagnosis purposes. They can also be fault events that are included in the system model but are not directly observable by a dedicated sensor. For the purpose of diagnosis, let us designate a specific unobservable event of interest and denote it by d ∈ E_uo. Event d could be a fault event or some other significant event that is unobservable.
Before we can state the problem of event diagnosis, we need to introduce some notation. E* is the set of all strings of any length n ∈ ℕ that can be formed by concatenating elements of E. The unique string of length n = 0 is denoted by ε and is the identity element of concatenation. As in the article Supervisory Control of Discrete-Event Systems, section "Supervisory Control Under Partial Observations," we define the projection function P : E* → E_o* that "erases" the unobservable events in a string and replaces them by ε. The function P is naturally extended to a set of strings by applying it to each string in the set, resulting in a set of projected strings. The observed behavior of M is the language P(L_M) over event set E_o.

The problem of event diagnosis, or simply diagnosis, is stated as follows: how to infer the past occurrence of event d when observing strings in P(L_M) at run-time, i.e., during the operation of the system? This is model-based inferencing, i.e., the monitoring agent knows L_M and the partition E = E_o ∪ E_uo, and it observes strings in P(L_M). When there are multiple events of interest, d_1 to d_n, and these events are fault events, we have a problem of fault diagnosis. In this case, the objective is not only to determine that a fault has occurred (commonly referred to as "fault detection") but also to identify which fault has occurred, namely, which event d_i (commonly referred to as "fault isolation and identification"). Fault diagnosis requires that L_M contains not only the nominal or fault-free behavior of the system but also its behavior after the occurrence of each fault event d_i of interest, i.e., its faulty behavior. Event d_i is typically a fault of a component that leads to degraded behavior on the part of the system. It is not a catastrophic failure that would cause the system to completely stop operating, as such a failure would be immediately observable. The decision on which fault events d_i, along with their associated faulty behaviors, to include in the complete model L_M is a design one that is based on practical considerations related to the diagnosis objectives.

A complementary problem to event diagnosis is that of diagnosability analysis. Diagnosability analysis is the off-line task of determining, on the basis of L_M and of E_o and E_uo, if any and all occurrences of the given event of interest d will eventually be diagnosed by the monitoring agent that observes the system behavior.

Event diagnosis and diagnosability analysis arise in numerous applications of systems that are modeled as DES. We mention a few application areas where DES diagnosis theory has been employed. In heating, ventilation, and air-conditioning systems, components such as valves, pumps, and controllers can fail in degraded modes of operation, such as a valve getting stuck open or stuck closed or a pump or controller getting stuck on or stuck off. The available sensors may not directly observe these faults, as the sensing abilities are limited. Fault diagnosis techniques are essential, since the components of the system are often not easily accessible. In monitoring communication networks, faults of certain transmitters or receivers are not directly observable and must be inferred from the set of successful communications and the topology of the network. In document processing systems, faults of internal components can lead to jams in the paper path or a decrease in image quality, and while the paper jam or the image quality is itself observable, the underlying fault may not be, as the number of internal sensors is limited.

Without loss of generality, we consider a single event of interest to diagnose, d. When there are multiple events to diagnose, the methodologies that we describe in the remainder of this entry can be applied to each event of interest d_i, i = 1, …, n, individually; in this case, the other events of interest d_j, j ≠ i, are treated the same as the other unobservable events in the set E_uo in the process of model-based inferencing.

Problem Formulation

Event Diagnosis
We start with a general language-based formulation of the event diagnosis problem. The information available to the agent that monitors the system behavior and performs the task of event diagnosis is the language L_M and the set of observable events E_o, along with the specific string t ∈ P(L_M) that it observes at run-time.
The actual string generated by the system is s ∈ L_M where P(s) = t. However, as far as the monitoring agent is concerned, the actual string that has occurred could be any string in P⁻¹(t) ∩ L_M, where P⁻¹ is the inverse projection operation, i.e., P⁻¹(t) is the set of all strings s_t ∈ E* such that P(s_t) = t. Let us denote this estimate set by E(t) = P⁻¹(t) ∩ L_M, where "E" stands for "estimate." If a string s ∈ L_M contains event d, we write that d ∈ s; otherwise, we write that d ∉ s.

The event diagnosis problem is to synthesize a diagnostic engine that will automatically provide the following answers from the observed t and from the knowledge of L_M and E_o:
1. Yes, if and only if d ∈ s for all s ∈ E(t).
2. No, if and only if d ∉ s for all s ∈ E(t).
3. Maybe, if and only if there exist s_Y, s_N ∈ E(t) such that d ∈ s_Y and d ∉ s_N.
As defined, E(t) is a string-based estimate. In section "Diagnosis of Automata," we discuss how to build a finite-state structure that will encode the desired answers for the above three cases when the DES M is modeled by a deterministic finite-state automaton. The resulting structure is called a diagnoser automaton.

Diagnosability Analysis
Diagnosability analysis consists in determining, a priori, if any and all occurrences of event d in L_M will eventually be diagnosed, in the sense that if event d occurs, then the diagnostic engine is guaranteed to eventually issue the decision "Yes." For the sake of technical simplicity, we assume hereafter that L_M is a live language, i.e., any trace in L_M can always be extended by one more event. In this context, we would not want the diagnostic engine to issue the decision "Maybe" for an arbitrarily long number of event occurrences after event d occurs. When this outcome is possible, we say that event d is not diagnosable in language L_M.

The property of diagnosability of DES is defined as follows. In view of the liveness assumption on language L_M, any string s_Y′ that contains event d can always be extended to a longer string, meaning that it can be made "arbitrarily long" after the occurrence of d. That is, for any s_Y′ in L_M and for any n ∈ ℕ, there exists s_Y = s_Y′ t ∈ L_M where the length of t is equal to n. Event d is not diagnosable in language L_M if there exists such a string s_Y together with a second string s_N that does not contain event d, and such that P(s_Y) = P(s_N). This means that the monitoring agent is unable to distinguish between s_Y and s_N, yet the number of events after an occurrence of d can be made arbitrarily large in s_Y, thereby preventing diagnosis of event d within a finite number of events after its occurrence. On the other hand, if no such pair of strings (s_Y, s_N) exists in L_M, then event d is diagnosable in L_M. (The mathematically precise definition of diagnosability is available in the literature cited at the end of this entry.)

Diagnosis of Automata

We recall the definition of a deterministic finite-state automaton, or simply automaton, from the article Models for Discrete Event Systems: An Overview, with the addition of a set of unobservable events as in section "Supervisory Control Under Partial Observations" in the article Supervisory Control of Discrete-Event Systems. The automaton, denoted by G, is a four-tuple G = (X, E, f, x₀) where X is the finite set of states, E is the finite set of events partitioned into E = E_o ∪ E_uo, x₀ is the initial state, and f is the deterministic partial transition function f : X × E → X that is immediately extended to strings f : X × E* → X. For a DES M represented by an automaton G, L_M is the language generated by automaton G, denoted by L(G) and formally defined as the set of all strings for which the extended f is defined. It is an infinite set if the transition graph of G has one or more cycles. In view of the liveness assumption made on L_M in the preceding section, G has no reachable deadlocked state, i.e., for all s ∈ E* such that f(x₀, s) is defined, there exists σ ∈ E such that f(x₀, sσ) is also defined.

To synthesize a diagnoser automaton that correctly performs the diagnostic task formulated in the preceding section, we proceed as follows.
First, we perform the parallel composition (de- component is equal to Y and at least one state
noted by jj) of G with the two-state label au- t
pair in xDiag whose second component is equal
tomaton Alabel that is defined as follows. Alabel D to N ; we call such a state a “Maybe-state” of
.fN; Y g; fd g; flabel ; N /, where flabel has two tran- Diag.G/.
sitions defined: (i) flabel .N; d / D Y and (ii) To perform run-time diagnosis, it therefore suf-
f_label(Y, d) = Y. The purpose of A_label is to record the occurrence of event d, which causes a transition to state Y. By forming G_labeled = G || A_label, we record in the states of G_labeled, which are of the form (x_G, x_A), whether the first element of the pair, state x_G ∈ X, was reached or not by executing event d at some point in the past: if d was executed, then x_A = Y; otherwise x_A = N. (We refer the reader to Chap. 2 in Cassandras and Lafortune (2008) for the formal definition of parallel composition of automata.) By construction, L(G_labeled) = L(G).
The second step of the construction of the diagnoser automaton is to build the observer of G_labeled, denoted by Obs(G_labeled), with respect to the set of observable events E_o. (We refer the reader to Chap. 2 in Cassandras and Lafortune (2008) for the definition of the observer automaton and for its construction.) The construction of the observer involves the standard subset construction algorithm for nondeterministic automata in automata theory; here, the unobservable events are the source of nondeterminism, since they effectively correspond to ε-transitions. The diagnoser automaton of G with respect to E_o is defined as Diag(G) = Obs(G || A_label). Its event set is E_o.
The states of Diag(G) are sets of state pairs of the form (x_G, x_A) where x_A is either N or Y. Examination of the state of Diag(G) reached by string t ∈ P[L(G)] provides the answers to the event diagnosis problem. Let us denote that state by x_Diag^t. Then:
1. The diagnostic decision is Yes if all state pairs in x_Diag^t have their second component equal to Y; we call such a state a "Yes-state" of Diag(G).
2. The diagnostic decision is No if all state pairs in x_Diag^t have their second component equal to N; we call such a state a "No-state" of Diag(G).
3. The diagnostic decision is Maybe if there is at least one state pair in x_Diag^t whose second component is equal to Y and at least one whose second component is equal to N; we call such a state a "Maybe-state" of Diag(G).
Thus, to issue a diagnostic decision at run-time, it suffices to examine the current state of Diag(G). Note that Diag(G) can be computed off-line from G and stored in memory, so that run-time diagnosis requires only updating the new state of Diag(G) on the basis of the most recent observed event (which is necessarily in E_o). If storing the entire structure of Diag(G) is impractical, its current state can be computed on-the-fly on the basis of the most recent observed event and of the transition structure of G_labeled; this involves one step of the subset construction algorithm.
As a simple example, consider the automaton G1 shown in Fig. 1, where E_uo = {d}. The occurrence of event d changes the behavior of the system such that event c does not cause a return to the initial state 1 (identified by an incoming arrow); instead, the system gets stuck in state 3 after d occurs.

Diagnosis of Discrete Event Systems, Fig. 1 Automaton G1

Its diagnoser is depicted in Fig. 2. It contains one Yes-state, state {(3, Y)} (abbreviated as "3Y" in the figure), one No-state, and two Maybe-states (similarly abbreviated). Two consecutive occurrences of event c, or an occurrence of b right after c, both indicate that the system must be in state 3,
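The two-step construction just described (attach the Y/N label, then apply one step of the subset construction per observed event) can be sketched in a few lines. The three-state automaton G below is a hypothetical stand-in in the spirit of Fig. 1, which is not reproduced here, so its transitions are assumptions; the Yes/No/Maybe rule follows the definitions above.

```python
# Sketch of diagnoser-style run-time diagnosis for a DES G with
# E_o = {b, c} observable and E_uo = {d} unobservable.
# G is a hypothetical 3-state automaton, assumed for illustration only.
G = {(1, 'b'): 2, (2, 'c'): 1, (2, 'd'): 3, (3, 'c'): 3, (3, 'b'): 3}
E_uo, d = {'d'}, 'd'

def labeled_step(sx, e):
    """One transition of G || A_label: the label flips to 'Y' once d fires."""
    x, lab = sx
    if (x, e) not in G:
        return None
    return (G[(x, e)], 'Y' if (lab == 'Y' or e == d) else 'N')

def uo_closure(pairs):
    """Close a set of (state, label) pairs under unobservable transitions."""
    S, frontier = set(pairs), list(pairs)
    while frontier:
        sx = frontier.pop()
        for e in E_uo:
            t = labeled_step(sx, e)
            if t is not None and t not in S:
                S.add(t)
                frontier.append(t)
    return frozenset(S)

def diag_step(S, e_obs):
    """One on-the-fly step of Diag(G): observable move, then closure."""
    moved = {labeled_step(sx, e_obs) for sx in S}
    return uo_closure([t for t in moved if t is not None])

def decision(S):
    labels = {lab for _, lab in S}
    if labels == {'Y'}:
        return 'Yes'
    if labels == {'N'}:
        return 'No'
    return 'Maybe'

state = uo_closure([(1, 'N')])
for e in ['b', 'c', 'c']:            # an observed string in E_o*
    state = diag_step(state, e)
    print(e, sorted(state), decision(state))
```

With the observed string b c c, the decision sequence is Maybe, Maybe, Yes: after two consecutive occurrences of c, this toy diagnoser is certain that d has occurred.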
272 Diagnosis of Discrete Event Systems
event d in the twin-automaton. It can be verified that for automaton G2 in our example, event d is diagnosable. Indeed, it is clear from the structure of G2 that Diag(G2) in Fig. 4 will never loop in Maybe-state {(2, N), (3, Y)} if event d occurs; rather, after d occurs, G2 can only execute c events, and after two such events, Diag(G2) enters Yes-state {(3, Y)}.
Note that diagnosability will not hold in an automaton that contains a cycle of unobservable events after the occurrence of event d, although this is not the only instance where the property is violated, as we saw in our simple example.

Diagnosis and Diagnosability Analysis of Petri Nets

There are several approaches for diagnosis and diagnosability analysis of DES modeled by Petri nets, depending on the boundedness properties of the net and on what is observable about its behavior. Let N be the Petri net model of the system, which consists of a Petri net structure along with an initial marking of all places. If the transitions of N are labeled by events in a set E, some by observable events in E_o and some by unobservable events in E_uo, and if the contents of the Petri net places are not observed except for the initial marking of N, then we have a language-based diagnosis problem as considered so far in this entry, for language L(N) and for E = E_o ∪ E_uo, with event of interest d ∈ E_uo. In this case, if the set of reachable states of the net is bounded, then we can use the reachability graph as an equivalent automaton model of the same system and build a diagnoser automaton as described earlier. It is also possible to encode diagnoser states into the original structure of net N by keeping track of all possible net markings following the observation of an event in E_o, appending the appropriate label ("N" or "Y") to each marking in the state estimate. This is reminiscent of the on-the-fly construction of the current state of the diagnoser automaton discussed earlier, except that the possible system states are directly listed as Petri net markings on the structure of N. Regarding diagnosability analysis, it can be performed using the twin-automaton technique of the preceding section, from the reachability graph of N.
Another approach that is actively being pursued in the current literature is to exploit the structure of the net model N for diagnosis and for diagnosability analysis, instead of working with the automaton model obtained from its reachability graph. In addition to potential computational gains from avoiding the explicit generation of the entire set of reachable states, this approach is motivated by the need to handle Petri nets whose sets of reachable states are infinite, and in particular Petri nets that generate languages that are not regular and hence cannot be represented by finite-state automata. Moreover, in this approach, one can incorporate potential observability of token contents in the places of the Petri nets. We refer the interested reader to the relevant chapters in Campos et al. (2013) and Seatzu et al. (2013) for coverage of these topics.

Current and Future Directions

The basic methodologies described so far for diagnosis and diagnosability analysis have been extended in many different directions. We briefly discuss a few of these directions, which are active research areas. Detailed coverage of these topics is beyond the scope of this entry and is available in the references listed at the end.
Diagnosis of timed models of DES has been considered, for classes of timed automata and timed Petri nets, where the objective is to ensure that each occurrence of event d is detected within a bounded time delay. Diagnosis of stochastic models of DES has been considered, in particular stochastic automata, where the hard diagnosability constraints are relaxed and detection of each occurrence of event d must be guaranteed with some probability 1 − ε, for some small ε > 0. Stochastic models also allow handling of unreliable sensors or noisy environments where event observations may be corrupted with some probability, such as when an occurrence of event a is observed as event a 80 % of the time and as some other event a′ 20 % of the time.
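The stochastic setting can be illustrated with a toy Bayesian update between two hypotheses ("d has occurred" vs. "nominal"), each with its own observation statistics. The likelihood values below are invented for illustration (only the 80 %/20 % confusion of the faulty case echoes the example above); they are not part of this entry's models.

```python
# Toy Bayes update for diagnosis under noisy observations.
# Hypothetical likelihoods: after d, the sensor reports 'a' 80 % of the
# time and the corrupted symbol 'a2' 20 % of the time; the nominal
# statistics (0.3 / 0.7) are an assumption for illustration.
lik = {
    'fault':   {'a': 0.8, 'a2': 0.2},
    'nominal': {'a': 0.3, 'a2': 0.7},
}

def update(prior_fault, obs):
    """One Bayes step: posterior probability that d has occurred."""
    num = prior_fault * lik['fault'][obs]
    den = num + (1.0 - prior_fault) * lik['nominal'][obs]
    return num / den

p = 0.5                      # uninformative prior on "d occurred"
for obs in ['a', 'a', 'a2', 'a']:
    p = update(p, obs)
print(round(p, 3))           # posterior probability that d occurred
```

Rather than a hard Yes/No/Maybe verdict, the monitor maintains a fault probability, which is the quantity bounded by the relaxed (probabilistic) diagnosability requirements mentioned above.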
Decentralized diagnosis is concerned with DES that are observed by several monitoring agents i = 1, …, n, each with its own set of observable events E_o,i and each having access to the entire set of system behaviors, L_M. The task is to design a set of individual diagnosers, one for each set E_o,i, such that the n diagnosers together diagnose all the occurrences of event d. In other words, for each occurrence of event d in any string of L_M, there exists at least one diagnoser that will detect it (i.e., answer "Yes"). The individual diagnosers may or may not communicate with each other at run-time, or they may communicate with a coordinating diagnoser that will fuse their information; several decentralized diagnosis architectures have been studied and their properties characterized. The focus in these works is the decentralized nature of the information available about the strings in L_M, as captured by the individual observable event sets E_o,i, i = 1, …, n.
Distributed diagnosis is closely related to decentralized diagnosis, except that it normally refers to situations where each individual diagnoser uses only part of the entire system model. Let M be an automaton G obtained by parallel composition of subsystem models: G = ||_{i=1,…,n} G_i. In distributed diagnosis, one would want to design each individual diagnoser Diag_i on the basis of G_i alone, or on the basis of G_i and of an abstraction of the rest of the system, ||_{j=1,…,n; j≠i} G_j. Here, the emphasis is on the distributed nature of the system, as captured by the parallel composition operation. In the case where M is a Petri net N, the distributed nature of the system may be captured by individual net models N_i, i = 1, …, n, that are coupled by common places, i.e., place-bordered Petri nets.
Robust diagnosis generally refers to decentralized or distributed diagnosis, but where one or more of the individual diagnosers may fail. Thus, there must be built-in redundancy in the set of individual diagnosers so that together they may still detect every occurrence of event d even if one or more of them ceases to operate.
So far we have considered a fixed and static set of observable events, E_o ⊆ E, where every occurrence of each event in E_o is always observed by the monitoring agent. However, there are many instances where one would want the monitoring agent to dynamically activate or deactivate the observability properties of a subset of the events in E_o; this arises in situations where event monitoring is "costly" for energy, bandwidth, or security reasons. This is referred to as the case of dynamic observations, and the goal is to synthesize sensor activation policies that minimize a given cost function while preserving the diagnosability properties of the system.

Cross-References

Models for Discrete Event Systems: An Overview
Modeling, Analysis, and Control with Petri Nets
Supervisory Control of Discrete-Event Systems

Recommended Reading

There is a very large amount of literature on diagnosis and diagnosability analysis of DES that has been published in control engineering, computer science, and artificial intelligence journals and conference proceedings. We mention a few recent books or survey articles that are a good starting point for readers interested in learning more about this active area of research. In the DES literature, the study of fault diagnosis and the formalization of diagnosability properties started in Lin (1994) and Sampath et al. (1995). Chapter 2 of the textbook Cassandras and Lafortune (2008) contains basic results about diagnoser automata and diagnosability analysis of DES, following the approach introduced in Sampath et al. (1995). The research monograph Lamperti and Zanella (2003) presents DES diagnostic methodologies developed in the artificial intelligence literature. The survey paper Zaytoon and Lafortune (2013) presents a detailed overview of fault diagnosis research in the control engineering literature. The two edited books
Campos et al. (2013) and Seatzu et al. (2013) contain chapters specifically devoted to diagnosis of automata and Petri nets, with an emphasis on automated manufacturing applications for the latter. Specifically, Chaps. 5, 14, 15, 17, and 19 in Campos et al. (2013) and Chaps. 22–25 in Seatzu et al. (2013) are recommended for further reading on several aspects of DES diagnosis. Zaytoon and Lafortune (2013) and the cited chapters in Campos et al. (2013) and Seatzu et al. (2013) contain extensive bibliographies.

Bibliography

Campos J, Seatzu C, Xie X (eds) (2013) Formal methods in manufacturing. Series on industrial information technology. CRC/Taylor and Francis, Boca Raton, FL
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York, NY
Lamperti G, Zanella M (2003) Diagnosis of active systems: principles and techniques. Kluwer, Dordrecht
Lin F (1994) Diagnosability of discrete event systems and its applications. Discret Event Dyn Syst Theory Appl 4(2):197–212
Sampath M, Sengupta R, Lafortune S, Sinnamohideen K, Teneketzis D (1995) Diagnosability of discrete event systems. IEEE Trans Autom Control 40(9):1555–1575
Seatzu C, Silva M, van Schuppen J (eds) (2013) Control of discrete-event systems. Automata and Petri net perspectives. Lecture notes in control and information sciences, vol 433. Springer, London
Zaytoon J, Lafortune S (2013) Overview of fault diagnosis methods for discrete event systems. Annu Rev Control 37(2):308–320

Differential Geometric Methods in Nonlinear Control

… Henry Hermes, Alberto Isidori, Velimir Jurdjevic, Arthur Krener, Claude Lobry, and Hector Sussmann. These concepts revolutionized our knowledge of the analytic properties of control systems, e.g., controllability, observability, minimality, and decoupling. With these concepts, a theory of nonlinear control systems emerged that generalized the linear theory. This theory of nonlinear systems is largely parallel to the linear theory, but of course it is considerably more complicated.

Keywords

Codistribution; Distribution; Frobenius theorem; Involutive distribution; Lie jet

Introduction

This is a brief survey of the influence of differential geometric concepts on the development of nonlinear systems theory. Section "A Primer on Differential Geometry" reviews some concepts and theorems of differential geometry. Nonlinear controllability and nonlinear observability are discussed in sections "Controllability of Nonlinear Systems" and "Observability for Nonlinear Systems". Section "Minimal Realizations" discusses minimal realizations of nonlinear systems, and section "Disturbance Decoupling" discusses the disturbance decoupling problem.
charts overlap, the change of coordinates should be smooth. For simplicity, we restrict our attention to differential geometric objects described in local coordinates.
In local coordinates, a vector field is just an ODE of the form

ẋ = f(x)   (1)

where f(x) is a smooth ℝ^{n×1}-valued function of x. In a different coordinate chart with local coordinates z, this vector field would be represented by a different formula:

ż = g(z)

If the charts overlap, then on the overlap they are related:

f(x(z)) = (∂x/∂z)(z) g(z),   g(z(x)) = (∂z/∂x)(x) f(x)

Since f(x) is smooth, it generates a smooth flow φ(t, x⁰), where for each t the mapping x⁰ ↦ φ(t, x⁰) is a local diffeomorphism, and for each x⁰ the mapping t ↦ φ(t, x⁰) is a solution of the ODE (1) satisfying the initial condition φ(0, x⁰) = x⁰. We assume that all the flows are complete, i.e., defined for all t ∈ ℝ, x ∈ M. The flows are one-parameter groups, i.e., φ(t, φ(s, x⁰)) = φ(t + s, x⁰) = φ(s, φ(t, x⁰)).
… are locally topologically equivalent (Arnol'd 1983). This is the Grobman-Hartman theorem. There exists a local homeomorphism z = h(x) that carries x(t) trajectories into z(s) trajectories in some neighborhood of x⁰ = 0. This homeomorphism need not preserve time (t ≠ s), but it does preserve the direction of time. Whether these flows are locally diffeomorphic is a more difficult question that was explored by Poincaré. See the section on feedback linearization (Krener 2013).
If all the eigenvalues of F are in the open left half plane, then the linear dynamics (2) is globally asymptotically stable around z⁰ = 0, i.e., if the flow of (2) is ψ(t, z), then ψ(t, z¹) → 0 as t → ∞. Then it can be shown that the nonlinear dynamics is locally asymptotically stable, φ(t, x¹) → x⁰ as t → ∞ for all x¹ in some neighborhood of x⁰.
One-forms, ω(x), are dual to vector fields. The simplest example of a one-form (also called a covector field) is the differential dh(x) of a scalar-valued smooth function h(x). This is the ℝ^{1×n} covector field

ω(x) = [∂h/∂x₁(x) … ∂h/∂xₙ(x)]

Sometimes this is written as

ω(x) = Σᵢ₌₁ⁿ (∂h/∂xᵢ)(x) dxᵢ
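The one-parameter group property of a flow, φ(t, φ(s, x⁰)) = φ(t + s, x⁰), can be checked numerically for a linear vector field f(x) = Ax, whose flow is the matrix exponential. The matrix A below is an arbitrary example, not taken from the text.

```python
# Numerical illustration of the one-parameter group property of a flow,
# phi(t, phi(s, x0)) = phi(t + s, x0), for the linear vector field
# f(x) = A x, whose flow is phi(t, x) = expm(A t) x.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0],
              [-2.0, -0.5]])       # an arbitrary example matrix

def phi(t, x):
    """Flow of xdot = A x."""
    return expm(A * t) @ x

x0 = np.array([1.0, -1.0])
t, s = 0.7, 1.3

lhs = phi(t, phi(s, x0))           # flow for s, then for t
rhs = phi(t + s, x0)               # flow for t + s in one step
print(np.allclose(lhs, rhs))       # the group property holds
```

For a linear field the property reduces to expm(At) expm(As) = expm(A(t + s)), which holds because A commutes with itself; for a general smooth f the same identity holds for the (complete) flow, but no closed form is available.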
… of irrational slope, b₂/b₁ irrational. Construct the torus T² as the quotient of ℝ² by the integer lattice ℤ². The distribution passes to the quotient, and since it is one dimensional, it is clearly involutive. The leaves of the quotient distribution are curves that wind around the torus indefinitely, and each curve is dense in T². Therefore, any smooth function h(x) that is constant on such a leaf is constant on all of T². Hence, the local foliation does not extend to a global foliation.
Another delicate question is whether the quotient space of M by a foliation induced by an involutive distribution is a smooth manifold (Sussmann 1975). This is always true locally, but it may not hold globally. Think of the foliation of T² discussed above.
Given k ≤ n vector fields f¹(x), …, f^k(x) that are linearly independent at each x and that commute, [fⁱ, fʲ](x) = 0, there exists a local change of coordinates z = z(x) so that in the new coordinates the vector fields are the first k unit vectors.
The involutive closure D̄ of D is the smallest involutive distribution containing D. As with all distributions, we always assume implicitly that D̄ is nonsingular. A point x¹ is D-accessible from x⁰ if there exists a continuous and piecewise smooth curve joining x⁰ to x¹ whose left and right tangent vectors are always in D. Obviously D-accessibility is an equivalence relation.
… where the state x is given by local coordinates on an n-dimensional manifold M, the control u is restricted to lie in some set U ⊆ ℝ^m, and the output y takes values in ℝ^p. We shall only consider such systems.
A particular case is a linear system of the form

ẋ = Fx + Gu
y = Hx   (5)
x(0) = x⁰

where M = ℝⁿ and U = ℝ^m.
The set A_t(x⁰) of points accessible at time t ≥ 0 from x⁰ is the set of all x¹ ∈ M such that there exists a bounded, measurable control trajectory u(s) ∈ U, 0 ≤ s ≤ t, so that the solution of (4) satisfies x(t) = x¹. We define A(x⁰) as the union of A_t(x⁰) for all t ≥ 0. The system (4) is said to be controllable at time t > 0 if A_t(x⁰) = M, and controllable in forward time if A(x⁰) = M.
For linear systems, controllability is a rather straightforward matter, but for nonlinear systems it is more subtle, with numerous variations. The variation of constants formula gives the solution of the linear system (5) as

x(t) = e^{Ft} x⁰ + ∫₀ᵗ e^{F(t−s)} G u(s) ds
Let D₀ be the smallest distribution containing the vector fields g¹(x), …, g^m(x) and invariant under bracketing by f(x), i.e.,

[f, D₀] ⊆ D₀

and let D̄₀ be its involutive closure. Sussmann and Jurdjevic (1972) showed that A_t(x⁰) is in the leaf of D̄₀ through x⁰, and it is between an open set and its closure in the topology of this leaf.
For linear systems (5), where f(x) = Fx and g^j(x) = G^j, the j-th column of G, it is not hard to see that

ad_f^k(g^j)(x) = (−1)^k F^k G^j

so the nonlinear generalization of the controllability rank condition is that at each x the dimension of D̄₀(x) is n. This guarantees that A_t(x⁰) is between an open set and its closure in the topology of M.
The condition that

dimension D̄₀(x) = n   (7)

is referred to as the nonlinear controllability rank condition. This guarantees that A(x⁰) is between an open set and its closure in the topology of M.
There are stronger interpretations of controllability for nonlinear systems. One is short time local controllability (STLC). The definition of this is that the set of accessible points from x⁰ in any small t > 0, with state trajectories restricted to an arbitrarily small neighborhood of x⁰, should contain x⁰ in its interior. Hermes (1994) and others have done work on this.

Observability for Nonlinear Systems

Two possible initial states x⁰, x¹ for the nonlinear system are distinguishable if there exists a control u(·) such that the corresponding outputs y⁰(t), y¹(t) are not equal. They are short time distinguishable if there is a u(·) such that y⁰(t) ≠ y¹(t) for all small t > 0. They are locally short time distinguishable if in addition the corresponding state trajectories do not leave an arbitrarily small open set containing x⁰, x¹. The open set need not be connected. A nonlinear system is (short time, locally short time) observable if every pair of initial states is (short time, locally short time) distinguishable. Finally, a nonlinear system is (short time, locally short time) locally observable if every x⁰ has a neighborhood such that every other point x¹ in the neighborhood is (short time, locally short time) distinguishable from x⁰.
For a linear system (5), all these definitions coalesce into a single concept of observability, which can be checked by the observability rank condition: the rank of the stacked matrix

[H; HF; … ; HF^{n−1}]   (8)

equals n.
The corresponding concept for nonlinear systems involves E, the smallest codistribution containing dh₁(x), …, dh_p(x) that is invariant under repeated Lie differentiation by the vector fields f(x), g¹(x), …, g^m(x). Let

E(x) = {ω(x) : ω ∈ E}

The nonlinear observability rank condition is

dimension E(x) = n   (9)

for all x ∈ M. This condition guarantees that the nonlinear system is locally short time, locally observable. It follows from (3) that E is spanned by a set of exact one-forms, so its dual distribution E^⊥ is involutive.
For a linear system (5), the input–output mapping from x(0) = xⁱ, i = 0, 1, is

yⁱ(t) = H e^{Ft} xⁱ + ∫₀ᵗ H e^{F(t−s)} G u(s) ds

The difference is

y¹(t) − y⁰(t) = H e^{Ft} (x¹ − x⁰)
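The input-independence of this difference can be seen numerically: simulate the linear system from two initial states under two different inputs and compare the output differences with the closed-form H e^{Ft}(x¹ − x⁰). The matrices and inputs below are made-up illustrations.

```python
# For the linear system (5), the output difference from two initial states,
# y1(t) - y0(t) = H e^{Ft} (x1 - x0), does not depend on the input u.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

F = np.array([[0.0, 1.0], [-1.0, -0.5]])
G = np.array([[0.0], [1.0]])
H = np.array([[1.0, 0.0]])
x0, x1 = np.array([0.0, 0.0]), np.array([1.0, -1.0])
T = 1.5

def output_at_T(x_init, u):
    """y(T) for xdot = F x + G u, y = H x, from x(0) = x_init."""
    sol = solve_ivp(lambda t, x: F @ x + G @ u(t), (0.0, T), x_init,
                    rtol=1e-10, atol=1e-12)
    return (H @ sol.y[:, -1])[0]

u_zero = lambda t: np.array([0.0])
u_cos = lambda t: np.array([np.cos(3.0 * t)])

d_zero = output_at_T(x1, u_zero) - output_at_T(x0, u_zero)
d_cos = output_at_T(x1, u_cos) - output_at_T(x0, u_cos)
closed = (H @ expm(F * T) @ (x1 - x0))[0]
print(d_zero, d_cos, closed)        # all three values agree
```

The forced part of the response cancels in the difference, which is exactly why, for linear systems, one input distinguishes two states if and only if every input does.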
So if one input u(·) distinguishes x⁰ from x¹, then so does every input.
For nonlinear systems, this is not necessarily true. That prompted Gauthier et al. (1992) to introduce a stronger concept of observability for nonlinear systems. For simplicity, we describe it for scalar input and scalar output systems. A nonlinear system is uniformly observable for any input if there exist local coordinates so that it is of the form

y = x₁ + h(u)
ẋ₁ = x₂ + f₁(x₁, u)
⋮
ẋ_{n−1} = xₙ + f_{n−1}(x₁, …, x_{n−1}, u)
ẋₙ = fₙ(x, u)

Clearly, if we know u(·), y(·), then by repeated differentiation of y(·), we can reconstruct x(·).
It has been shown that for nonlinear systems that are uniformly observable for any input, the extended Kalman filter is a locally convergent observer (Krener 2002a), and the minimum energy estimator is globally convergent (Krener 2002b).

Minimal Realizations

The initialized nonlinear system (4) can be viewed as defining an input–output mapping from input trajectories u(·) to output trajectories y(·). Is it a minimal realization of this mapping, or does there exist an initialized nonlinear system on a smaller dimensional state space that realizes the same input–output mapping?
Kalman showed that (5) initialized at x⁰ = 0 is minimal iff the controllability rank condition and the observability rank condition hold. He also showed how to reduce a linear system to a minimal one.
If the controllability rank condition does not hold, then the span of (6) has dimension k < n. This subspace contains the columns of G and is invariant under multiplication by F. In fact, it is the maximal subspace with these properties. So the linear system can be restricted to this k-dimensional subspace, and it realizes the same input–output mapping from x⁰ = 0. The restricted system satisfies the controllability rank condition.
If the observability rank condition does not hold, then the kernel of (8) is a subspace of ℝⁿ of dimension n − l > 0. This subspace is in the kernel of H and is invariant under multiplication by F. In fact, it is the maximal subspace with these properties. Therefore, there is a quotient linear system on ℝⁿ mod the kernel of (8) which has the same input–output mapping. The quotient is of dimension l < n, and it realizes the same input–output mapping. The quotient system satisfies the observability rank condition.
By employing these two steps in either order, we pass to a minimal realization of the input–output map of (5) from x⁰ = 0. Kalman also showed that two linear minimal realizations differ by a linear change of state coordinates.
An initialized nonlinear system is a realization of minimal dimension of its input–output mapping if the nonlinear controllability rank condition (7) and the nonlinear observability rank condition (9) hold.
If the nonlinear controllability rank condition (7) fails to hold because the dimension of D̄₀(x) is k < n, then by replacing the state space M with the k-dimensional leaf through x⁰ of the foliation induced by D̄₀, we obtain a smaller state space on which the nonlinear controllability rank condition (7) holds. The input–output mapping is unchanged by this restriction.
Suppose the nonlinear observability rank condition (9) fails to hold because the dimension of E(x) is l < n. Then consider a convex neighborhood N of x⁰. The distribution E^⊥ induces a local foliation of N into leaves of dimension n − l > 0. The nonlinear system leaves this local foliation invariant in the following sense. Suppose x⁰ and x¹ are on the same leaf; then if xⁱ(t) is the trajectory starting at xⁱ, then x⁰(t) and x¹(t) are on the same leaf as long as the trajectories remain in N. Furthermore, h(x) is constant on leaves, so y⁰(t) = y¹(t). Hence, there exists locally a nonlinear system whose state space is the leaf space. On this leaf space, the nonlinear observability rank condition (9) holds,
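Kalman's restriction step can be sketched numerically: project onto the range of the controllability matrix and compare the Markov parameters H F^k G, which determine the input–output map from x⁰ = 0. The 3-state system with unreachable modes below is an assumption chosen for illustration.

```python
# Restrict a linear system to the span of its controllability matrix and
# verify that the reduced system has the same Markov parameters H F^k G
# (hence the same input-output map from x0 = 0).  Example is made up.
import numpy as np

F = np.array([[-1.0, 0.0, 0.0],
              [0.0, -2.0, 1.0],
              [0.0, 0.0, -3.0]])
G = np.array([[0.0], [1.0], [0.0]])   # only the second state is reachable
H = np.array([[1.0, 1.0, 1.0]])
n = F.shape[0]

# Controllability matrix [G, FG, ..., F^{n-1} G]; orthonormal basis of its
# range via the SVD (rank-revealing, unlike a plain QR here).
C = np.hstack([np.linalg.matrix_power(F, k) @ G for k in range(n)])
U, S, _ = np.linalg.svd(C)
r = int((S > 1e-9).sum())
V = U[:, :r]                          # basis of the controllable subspace

# Restricted system in coordinates xi, with x = V xi.
Fr, Gr, Hr = V.T @ F @ V, V.T @ G, H @ V

markov = lambda F_, G_, H_, k: H_ @ np.linalg.matrix_power(F_, k) @ G_
same = all(np.allclose(markov(F, G, H, k), markov(Fr, Gr, Hr, k))
           for k in range(2 * n))
print(r, same)                        # reduced dimension and agreement
```

The projection V.T @ F @ V is a valid restriction precisely because the span of the controllability matrix is F-invariant, as stated above; the same check run on the quotient (observability) step is analogous.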
and the projected system has the same input–output map as the original locally around x⁰. If the leaf space of the foliation induced by E^⊥ admits the structure of a manifold, then the reduced system can be defined globally on it.
Sussmann (1973) and Sussmann (1977) studied minimal realizations of analytic nonlinear systems.
The state spaces of two minimal nonlinear systems need not be diffeomorphic. Consider the system

ẋ = u,   y = sin x

where x, u are scalars. We can take the state space to be either M = ℝ or M = S¹, and we will realize the same input–output mapping. These two state spaces are certainly not diffeomorphic, but one is a covering space of the other.

Lie Jet and Approximations

Consider two initialized nonlinear controlled dynamics

ẋ = f⁰(x) + Σⱼ₌₁ᵐ fʲ(x) uⱼ,   x(0) = x⁰   (10)

ż = g⁰(z) + Σⱼ₌₁ᵐ gʲ(z) uⱼ,   z(0) = z⁰   (11)

Suppose that (10) satisfies the nonlinear controllability rank condition (7). Further, suppose that there is a smooth mapping Φ(x) = z and constants M > 0, δ > 0 such that for any ‖u(t)‖ < 1, the corresponding trajectories x(t) and z(t) satisfy

‖Φ(x(t)) − z(t)‖ < M t^{k+1}   (12)

for 0 ≤ t < δ.
Then it is not hard to show that the linear map L = (∂Φ/∂x)(x⁰) takes brackets up to order k of the vector fields fʲ evaluated at x⁰ into the corresponding brackets of the vector fields gʲ evaluated at z⁰,

L [f^{j_l}, [… [f^{j_2}, f^{j_1}] …]](x⁰) = [g^{j_l}, [… [g^{j_2}, g^{j_1}] …]](z⁰)   (13)

for 1 ≤ l ≤ k.
On the other hand, if there is a linear map L such that (13) holds for 1 ≤ l ≤ k, then there exists a smooth mapping Φ(x) = z and constants M > 0, δ > 0 such that for any ‖u(t)‖ < 1, the corresponding trajectories x(t) and z(t) satisfy (12).
The k-Lie jet of (10) at x⁰ is the tree of brackets [f^{j_l}, [… [f^{j_2}, f^{j_1}] …]](x⁰) for 1 ≤ l ≤ k. In some sense these are the coordinate-free Taylor series coefficients of (10) at x⁰.
The dynamics (10) is free-nilpotent of degree k if all these brackets are as linearly independent as possible consistent with skew symmetry and the Jacobi identity, and all higher degree brackets are zero. If it is free to degree k, the controlled dynamics (10) can be used to approximate any other controlled dynamics (11) to degree k. Because it is nilpotent, integrating (10) reduces to repeated quadratures. If all brackets with two or more fⁱ, 1 ≤ i ≤ m, are zero at x⁰, then (10) is linear in appropriate coordinates. If all brackets with three or more fⁱ, 1 ≤ i ≤ m, are zero at x⁰, then (10) is quadratic in appropriate coordinates. The references Krener and Schaettler (1988) and Krener (2010a,b) discuss the structure of the reachable sets for such systems.

Disturbance Decoupling

Consider a control system affected by a disturbance input w(t)

ẋ = f(x) + g(x)u + b(x)w
y = h(x)   (14)

The disturbance decoupling problem is to find a feedback u = α(x) so that in the closed-loop system, the output y(t) is not affected by the disturbance w(t). Wonham and Morse (1970) solved this problem for a linear system

ẋ = Fx + Gu + Bw
y = Hx   (15)
To do so, they introduced the concept of an (F, G)-invariant subspace. A subspace V ⊆ ℝⁿ is (F, G)-invariant if

FV ⊆ V + 𝒢   (16)

where 𝒢 is the span of the columns of G. It is easy to see that V is (F, G)-invariant iff there exists a K ∈ ℝ^{m×n} such that

(F + GK)V ⊆ V   (17)

The feedback gain K is called a friend of V.
It is easy to see from (16) that if Vⁱ is (F, G)-invariant for i = 1, 2, then V¹ + V² is also. So there exists a maximal (F, G)-invariant subspace V^max in the kernel of H. Wonham and Morse showed that the linear disturbance decoupling problem is solvable iff ℬ ⊆ V^max, where ℬ is the span of the columns of B.
Isidori et al. (1981a) and independently Hirschorn (1981) solved the nonlinear disturbance decoupling problem. A distribution D is locally (f, g)-invariant if
… distribution D^max in the kernel of dh. Moreover, this distribution is involutive. The disturbance decoupling problem is locally solvable iff the columns of b(x) are contained in D^max. If M is simply connected, then the disturbance decoupling problem is globally solvable iff the columns of b(x) are contained in D^max.

Conclusion

We have briefly described the role that differential geometric concepts played in the development of controllability, observability, minimality, approximation, and decoupling of nonlinear systems.

Cross-References

Feedback Linearization of Nonlinear Systems
Lie Algebraic Methods in Nonlinear Control
Nonlinear Zero Dynamics
Disaster Response Robot, Fig. 2 Unmanned construction system (Courtesy of Society for Unmanned Construction Systems) http://www.kenmukyou.gr.jp/f_souti.htm
Disaster Response Robot, Fig. 3 VGTV X-treme (Courtesy of Recce Robotics) http://www.recce-robotics.com/vgtv.html
Disaster Response Robot, Fig. 4 Active scope camera (Courtesy of International Rescue System Institute)
Disaster Response Robot 287
Large-sized robots can discharge large volumes of the fluid with water cannons, whereas small-sized robots have better mobility.

Response Robots for Man-Made Disasters

Explosive Ordnance Disposal (EOD)
Detection and disposal of explosive ordnance is one of the most dangerous tasks. TALON, PackBot, and Telemax are widely used by military and explosive ordnance disposal teams worldwide. Telemax has an arm with seven degrees of freedom on a tracked vehicle with four sub-tracks, as shown in Fig. 5. It can observe narrow spaces like overhead lockers of airplanes and the bottoms of automobiles by cameras, manipulate objects by the arm, and deactivate explosives by a disrupter.

CBRNE Disasters
CBRNE (chemical, biological, radiological, nuclear, and explosive) disasters have a high risk and can cause large-scale damage because humans cannot detect contamination by the Hazmat (hazardous materials). Application of robotic systems is highly expected for this type of disaster. PackBot has sensors for toxic industrial chemicals (TIC), blood agents, blister agents, volatile organic compounds (VOCs), radiation, etc., as options and can measure the Hazmat in dangerous confined spaces (Fig. 6). Quince was developed for research into technical issues of UGVs at CBRNE disasters and has high mobility on rough terrain (Fig. 7).

Fukushima Daiichi Nuclear Power Plant Accident
At the Fukushima Daiichi nuclear power plant accident caused by the tsunami in 2011, various disaster response robots were applied. They contributed to the cold shutdown and decommissioning of the plant. For example, PackBot and Quince gave essential data for task planning by shooting images and measuring radiation in the nuclear reactor buildings there. The unmanned construction system removed debris outdoors that was contaminated by radiological materials and reduced the radiation rate there significantly.
Group INTRA in France and KHG in Germany are organizations for responding to nuclear plant accidents. They are equipped with robots and remote-controlled construction machines for radiation measurement, decontamination, and construction in emergencies. In Japan, the Assist Center for Nuclear Emergencies was established after the Fukushima accident.
Disaster Response Robot, Fig. 7 Quince (Courtesy of International Rescue System Institute)
Summary and Future Directions

The data flow of disaster response robots is generally described by a feedback system as shown in Fig. 8. Robots change the states of objects and the environment by movement and task execution. Sensors measure and recognize them, and their feedback enables the robots' autonomous motion and work. The sensed data are shown to operators via communication, data processing, and the human interface. The operators give commands of motion and work to the system via the human interface. The system recognizes and transmits them to the robot.
Disaster Response Robot, Fig. 8 Data flow of remotely controlled disaster response robots
Each functional block has its own technical challenges to be fulfilled under the extreme environments of disaster spaces in response to the objectives and conditions. These challenges must be solved to improve robot performance. They include insufficient mobility in disaster spaces (steps, gaps, slippage, narrow spaces, obstacles, etc.), deficient workability (dexterity, accuracy, speed, force, workspace, etc.), poor sensors and sensor data processing (image, recognition, etc.), lack of reliability and performance of autonomy (robot intelligence, multiagent collaboration, etc.), issues of wireless and wired communication (instability, delay, capacity, tether handling, etc.), operators' limitations (situation awareness, decision ability, fatigue, mistakes, etc.), basic performance (explosion proofing, weight, durability, portability, etc.), and system integration that combines the components into a solution. Mission-critical planning and execution, including human factors, training, role sharing, logistics, etc., have to be considered at the same time.

Research into systems and control is expected to solve the abovementioned challenges of components and systems. For example, intelligent control is essential for mobility and workability under extreme conditions; control of feedback systems with long delays and dynamic instability, control of human-in-the-loop systems, and system integration of heterogeneous systems are important research topics of systems and control.

In the research field of disaster robotics, various competitions of practical robots have been held, e.g., RoboCupRescue targeting CBRNE disasters, ELROB and euRathlon for field activities, MAGIC for multi-robot autonomy, and the DARPA Robotics Challenge for humanoid robots in nuclear disasters. These competitions seek to stimulate solutions of the above-mentioned technical issues in different environments by providing practical test beds for advanced technology development.

Cross-References

Robot Teleoperation
Walking Robots
Wheeled Robots

Bibliography

ELROB (2013) www.elrob.org
Group INTRA (2013) www.groupe-intra.com
KHG (2013) www.khgmbh.de
Murphy R (2014) Disaster robotics. MIT Press, Cambridge
RoboCup (2013) www.robocup.org
Siciliano B, Khatib O (eds) (2008) Springer handbook of robotics, 1st edn. Springer, Berlin
Siciliano B, Khatib O (eds) (2014) Springer handbook of robotics, 2nd edn. Springer, Berlin
Tadokoro S (ed) (2010) Rescue robotics: DDT project on robots and systems for urban search and rescue. Springer, London
Tadokoro S, Seki S, Asama H (2013) Priority issues of disaster robotics in Japan. In: Proceedings of the IEEE region 10 humanitarian technology conference, Sendai, 27–29 Aug 2013
290 Discrete Event Systems and Hybrid Systems, Connections Between
• Logically controlled systems. Often, a physical system with a time-driven evolution is controlled in a feedback loop by means of a controller that implements discrete computations and event-based logic. This is the case of the thermostat mentioned above. Classes of systems that can be described by this paradigm are embedded systems or, when the feedback loop is closed through a communication network, cyber-physical systems.
• State-dependent mode of operation. A time-driven system can have different modes of evolution depending on its current state. As an example, consider a bouncing ball. While the ball is above the ground (vertical position h > 0), its behavior is that of a falling body subject to a constant gravitational force. However, when the ball collides with the ground (vertical position h = 0), its behavior is that of a (partially) elastic body that bounces up. Classes of systems that can be described by this paradigm are piecewise affine systems and linear complementarity systems.
• Variable structure systems. Some systems may change their structure, assuming different configurations, each characterized by a different behavior. As an example, consider a multicell voltage converter composed of a cascade of elementary commutation cells: by controlling some switches, it is possible to insert or remove cells so as to produce a desired output voltage signal. Classes of systems that can be described by this paradigm are switched systems.

While these are certainly appropriate and meaningful paradigms, they are mainly focused on the connections between time-driven and
In Fig. 6 is shown an Alur-Dill automaton that describes the thermostat, where the time-driven dynamics have been abstracted and only the timing of event occurrence is modeled, as in the timed DES automaton in Fig. 5. The only continuous variable is the value of a timer δ: when it goes beyond a certain threshold (e.g., δ > δ_a), an event occurs (e.g., event a), changing the discrete state (e.g., from OFF to ON) and resetting the timer to zero.

It is rather obvious that the behavior of the Alur-Dill automaton in Fig. 6 is equivalent to the behavior of the timed DES automaton in Fig. 5. In the former model, the notion of time is encoded by means of an explicit continuous variable δ. In the latter model, the notion of time is implicitly encoded by the timer that during an evolution will be associated to each event. In both cases, however, the overall state of the system is described by a pair (ℓ(t), x(t)), where the first element ℓ takes values in the discrete set {START, ON, OFF} and the second element is a vector (in this particular case with a single component) of timer valuations.

It should be pointed out that an Alur-Dill automaton may have a more complex structure than that shown in Fig. 6: as an example, the guard associated to a transition, i.e., the values of the timer that enable it, can be an arbitrary rectangular set. However, the same is also true for a timed discrete event system: several policies can be used to define the time intervals enabling an event (enabling policy) or to specify when a timer is reset (memory policy) (Ajmone Marsan et al. 1995). Furthermore, timed discrete event systems can have arbitrary stochastic timing structures (e.g., semi-Markovian processes, Markov chains, and queuing networks (Lafortune and Cassandras 2007)), not to mention the possibility of having an infinite discrete state space (e.g., timed Petri nets (Ajmone Marsan et al. 1995; David and Alla 2004)). As a result, we can say that timed DES automata are far more general than Alur-Dill automata and represent a meaningful subclass of hybrid systems.

From Discrete Event System to Hybrid Systems by Fluidization

The computational complexity involved in the analysis and optimization of real-scale problems often becomes intractable with discrete event models due to the very large number of reachable states, and a technique that has shown to be effective in reducing this complexity is called fluidization (Applications of Discrete-Event Systems). It should be noted that the derivation of a fluid (i.e., hybrid) model from a discrete event one is yet another example of abstraction, albeit going in the opposite direction with respect to the examples discussed in the section "From Hybrid Systems to Discrete Event System by Modeling Abstraction" above.

The main drive that motivated the fluidization approach derives from the observation that some discrete event systems are "heavily populated" in the sense that there are many identical items in some component (e.g., clients in a queue). Fluidization consists in replacing the integer counter of the number of items by a real number and in approximating the "fast" discrete event

Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 5 Timed discrete event model of the thermostat
Discrete Event Systems and Hybrid Systems, Connections Between, Fig. 6 Alur-Dill automaton of the thermostat
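The event-timing behavior abstracted in Figs. 5 and 6 can be mimicked in a few lines. The sketch below is illustrative only: it ignores the initial START location of Fig. 6, and the thresholds and time step are arbitrary. It simulates a two-location automaton whose single timer δ evolves at rate 1 and is reset to zero on every discrete transition.

```python
# Illustrative sketch only (thresholds and step are arbitrary; the START
# location of Fig. 6 is omitted): a two-location automaton whose timer
# delta evolves at rate 1 and is reset to zero on every transition.

def simulate_thermostat(delta_a=2.0, delta_b=3.0, dt=0.1, t_end=12.0):
    """Return the (time, event, new_location) transitions that fire."""
    location, delta, t = "OFF", 0.0, 0.0
    trace = []
    while t < t_end:
        delta += dt                       # timer rate 1 in every location
        t += dt
        if location == "OFF" and delta > delta_a:
            location, delta = "ON", 0.0   # event a: OFF -> ON, reset timer
            trace.append((round(t, 1), "a", location))
        elif location == "ON" and delta > delta_b:
            location, delta = "OFF", 0.0  # event b: ON -> OFF, reset timer
            trace.append((round(t, 1), "b", location))
    return trace

events = simulate_thermostat()  # alternating a/b events, as in Fig. 5
```

The trace records only event times and locations, which is exactly the information retained by the timed DES abstraction.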
Cross-References

Applications of Discrete-Event Systems
Hybrid Dynamical Systems, Feedback Control of
Models for Discrete Event Systems: An Overview
Perturbation Analysis of Discrete Event Systems
Supervisory Control of Discrete-Event Systems

Discrete Optimal Control

Abstract

Discrete optimal control is a branch of mathematics which studies optimization procedures for controlled discrete-time models – that is, the optimization of a performance index associated with a discrete-time control system. This entry gives an introduction to the topic. The formulation of a general discrete optimal control problem is described, and applications to mechanical systems are discussed.
In the sequel, the following regularity condition is assumed:

det( ∂²H_d / (∂u^a ∂u^b) ) ≠ 0.

First, we will need an introduction to discrete mechanics and variational integrators.

Discrete Mechanics
the discrete system determined by L_d must extremize the action sum given fixed endpoints q_0 and q_N. By extremizing S_d over q_k, 1 ≤ k ≤ N − 1, we obtain the system of difference equations

D_1 L_d(q_k, q_{k+1}) + D_2 L_d(q_{k-1}, q_k) = 0,    (11)

or, in coordinates,

∂L_d/∂x^i (q_k, q_{k+1}) + ∂L_d/∂y^i (q_{k-1}, q_k) = 0,

where 1 ≤ i ≤ n, 1 ≤ k ≤ N − 1, and x, y denote the n first and n second variables of the function L_d, respectively.

These equations are usually called the discrete Euler–Lagrange equations. Under some regularity hypotheses (the matrix D_{12} L_d(q_k, q_{k+1}) is regular), it is possible to define a (local) discrete flow Υ_{L_d}: Q × Q → Q × Q by Υ_{L_d}(q_{k-1}, q_k) = (q_k, q_{k+1}) from (11). Define the discrete Legendre transformations associated to L_d as

F⁻L_d: Q × Q → T*Q,   (q_0, q_1) ↦ (q_0, −D_1 L_d(q_0, q_1)),
F⁺L_d: Q × Q → T*Q,   (q_0, q_1) ↦ (q_1, D_2 L_d(q_0, q_1)),

and the discrete Poincaré–Cartan 2-form ω_d = (F⁺L_d)*ω_Q = (F⁻L_d)*ω_Q, where ω_Q is the canonical symplectic form on T*Q. The discrete algorithm determined by Υ_{L_d} preserves the symplectic form ω_d, i.e., (Υ_{L_d})*ω_d = ω_d. Moreover, if the discrete Lagrangian is invariant under the diagonal action of a Lie group G, then the discrete momentum map J_d: Q × Q → g* defined by

⟨J_d(q_k, q_{k+1}), ξ⟩ = ⟨D_2 L_d(q_k, q_{k+1}), ξ_Q(q_{k+1})⟩

formulations of optimal control problems (see Cadzow 1970; Hwang and Fan 1967; Jordan and Polak 1964).

Discrete Optimal Control of Mechanical Systems

Consider a mechanical system whose configuration space is an n-dimensional differentiable manifold Q and whose dynamics is determined by a Lagrangian L: TQ → R. The control forces are modeled as a mapping f: TQ × U → T*Q, where f(v_q, u) ∈ T*_q Q, v_q ∈ T_q Q, and u ∈ U, U being the control space. Observe that this last definition also covers configuration- and velocity-dependent forces such as dissipation or friction (see Ober-Blöbaum et al. 2011).

The motion of the mechanical system is described by applying the principle of Lagrange–d'Alembert, which requires that the solutions q(t) ∈ Q must satisfy

δ ∫₀ᵀ L(q(t), q̇(t)) dt + ∫₀ᵀ f(q(t), q̇(t), u(t)) δq(t) dt = 0,

where (q, q̇) are the local coordinates of TQ and where we consider arbitrary variations δq(t) ∈ T_{q(t)}Q with δq(0) = 0 and δq(T) = 0 (since we are prescribing fixed initial and final conditions (q(0), q̇(0)) and (q(T), q̇(T))).

As we consider an optimal control problem, the forces f must be chosen, if they exist, as the ones that extremize the cost functional

∫₀ᵀ C(q(t), q̇(t), u(t)) dt + Φ(q(T), q̇(T), u(T)).    (13)
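The discrete Euler–Lagrange equations (11) can be marched numerically by solving for q_{k+1} at each step. The following sketch assumes ingredients not prescribed by the entry: a one-dimensional harmonic oscillator L(q, v) = v²/2 − q²/2, a midpoint-rule discrete Lagrangian, numeric slot derivatives, and a secant root solver.

```python
# Sketch with assumed ingredients (1-D harmonic oscillator
# L(q, v) = v**2/2 - q**2/2, midpoint discrete Lagrangian, secant root
# solver): marching the discrete Euler-Lagrange equations (11).
h = 0.1

def L(q, v):
    return 0.5 * v ** 2 - 0.5 * q ** 2

def Ld(q0, q1):
    # midpoint quadrature of the action over one step of length h
    return h * L(0.5 * (q0 + q1), (q1 - q0) / h)

def D1(q0, q1, eps=1e-6):   # slot derivative wrt the first argument
    return (Ld(q0 + eps, q1) - Ld(q0 - eps, q1)) / (2 * eps)

def D2(q0, q1, eps=1e-6):   # slot derivative wrt the second argument
    return (Ld(q0, q1 + eps) - Ld(q0, q1 - eps)) / (2 * eps)

def step(q_prev, q_k):
    # solve D1 Ld(q_k, q) + D2 Ld(q_prev, q_k) = 0 for q by secant iteration
    g = lambda q: D1(q_k, q) + D2(q_prev, q_k)
    a, b = q_k, q_k + (q_k - q_prev) + 1e-3
    for _ in range(50):
        fa, fb = g(a), g(b)
        if abs(fb - fa) < 1e-14:
            break
        a, b = b, b - fb * (b - a) / (fb - fa)
    return b

q = [1.0, 1.0]              # two initial configurations (rest start)
for _ in range(200):
    q.append(step(q[-2], q[-1]))
# the trajectory oscillates with (nearly) constant amplitude, reflecting
# the symplecticity of the discrete flow
```

For this choice of L_d the scheme coincides with the implicit midpoint rule, which is why the oscillation amplitude does not drift.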
discrete variational techniques, we first discretize the Lagrange–d'Alembert principle and then the cost functional. We obtain a numerical method that preserves some geometric features of the original continuous system, as described in the sequel.

Discretization of the Lagrangian and Control Forces

To discretize this problem, we replace the tangent space TQ by the Cartesian product Q × Q and the continuous curves by sequences q_0, q_1, …, q_N (we are using N steps with fixed time step h, in such a way that t_k = kh and Nh = T). The discrete Lagrangian L_d: Q × Q → R is constructed as an approximation of the action integral in a single time step (see Marsden and West 2001), that is,

L_d(q_k, q_{k+1}) ≈ ∫_{kh}^{(k+1)h} L(q(t), q̇(t)) dt.

We choose the following discretization for the external forces: f_d^±: Q × Q × U → T*Q, where U ⊂ R^m, m ≤ n, such that

f_d^-(q_k, q_{k+1}, u_k) ∈ T*_{q_k} Q,
f_d^+(q_k, q_{k+1}, u_k) ∈ T*_{q_{k+1}} Q.

f_d^+ and f_d^- are the right and left discrete forces (see Ober-Blöbaum et al. 2011).

Discrete Lagrange–d'Alembert Principle

Given such forces, we define the discrete Lagrange–d'Alembert principle, which seeks sequences {q_k}_{k=0}^N that satisfy

δ Σ_{k=0}^{N-1} L_d(q_k, q_{k+1}) + Σ_{k=0}^{N-1} [ f_d^-(q_k, q_{k+1}, u_k) δq_k + f_d^+(q_k, q_{k+1}, u_k) δq_{k+1} ] = 0.

After some straightforward manipulations, we arrive at the forced discrete Euler–Lagrange equations

D_2 L_d(q_{k-1}, q_k) + D_1 L_d(q_k, q_{k+1}) + f_d^+(q_{k-1}, q_k, u_{k-1}) + f_d^-(q_k, q_{k+1}, u_k) = 0,    (14)

with k = 1, …, N − 1.

Boundary Conditions

For simplicity, we assume that the boundary conditions of the continuous optimal control problem are given by q(0) = x_0, q̇(0) = v_0, q(T) = x_T, q̇(T) = v_T. To incorporate these conditions into the discrete setting, we use both the continuous and discrete Legendre transformations. From Marsden and West (2001), given a forced system, we can define the discrete momenta

p_k = −D_1 L_d(q_k, q_{k+1}) − f_d^-(q_k, q_{k+1}, u_k),
p_{k+1} = D_2 L_d(q_k, q_{k+1}) + f_d^+(q_k, q_{k+1}, u_k).

From the continuous Lagrangian, we have the momenta

p_0 = FL(x_0, v_0) = (x_0, ∂L/∂v (x_0, v_0)),
p_T = FL(x_T, v_T) = (x_T, ∂L/∂v (x_T, v_T)).

Therefore, the natural choice of boundary conditions is

x_0 = q_0,   x_T = q_N,
FL(x_0, v_0) = −D_1 L_d(q_0, q_1) − f_d^-(q_0, q_1, u_0),
FL(x_T, v_T) = D_2 L_d(q_{N-1}, q_N) + f_d^+(q_{N-1}, q_N, u_{N-1}).
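As a quick consistency check on the forced discrete Euler–Lagrange equations (14), consider a setting not taken from the entry: a one-dimensional free particle L(q, v) = v²/2 with the simple force discretization f_d^− = f_d^+ = (h/2)u for a piecewise-constant control.

```python
# Consistency check on (14) under assumptions not in the entry: a 1-D free
# particle L(q, v) = v**2/2 and the simple force discretization
# fd_minus = fd_plus = (h/2) * u for a piecewise-constant control u.
h = 0.1

def D1Ld(q0, q1):   # derivative of Ld(q0, q1) = (q1 - q0)**2 / (2*h) wrt q0
    return -(q1 - q0) / h

def D2Ld(q0, q1):   # derivative wrt q1
    return (q1 - q0) / h

def fd(u):          # assumed left/right discrete forces
    return 0.5 * h * u

def residual(q_prev, q_k, q_next, u_prev, u_k):
    # left-hand side of the forced discrete Euler-Lagrange equations (14)
    return (D2Ld(q_prev, q_k) + D1Ld(q_k, q_next)
            + fd(u_prev) + fd(u_k))

# with constant u, (14) reduces to (q_{k+1} - 2 q_k + q_{k-1}) / h**2 = u,
# so the uniformly accelerated sequence q_k = u * (k*h)**2 / 2 satisfies it
u = 2.0
q = [0.5 * u * (k * h) ** 2 for k in range(3)]
res = residual(q[0], q[1], q[2], u, u)   # vanishes up to rounding
```

The reduction to a discrete Newton law illustrates how the discrete forces inject the control into the variational scheme.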
their application to the analysis of Clebsch optimal control problems and mechanical systems. J Geom Mech 5(1):1–38
Bonnans JF, Laurent-Varin J (2006) Computation of order conditions for symplectic partitioned Runge-Kutta schemes with application to optimal control. Numer Math 103:1–10
Bou-Rabee N, Marsden JE (2009) Hamilton-Pontryagin integrators on Lie groups: introduction and structure-preserving properties. Found Comput Math 9(2):197–219
Cadzow JA (1970) Discrete calculus of variations. Int J Control 11:393–407
de León M, Martín de Diego D, Santamaría-Merino A (2007) Discrete variational integrators and optimal control theory. Adv Comput Math 26(1–3):251–268
Hager WW (2001) Numerical analysis in optimal control. In: International series of numerical mathematics, vol 139. Birkhäuser Verlag, Basel, pp 83–93
Hairer E, Lubich C, Wanner G (2002) Geometric numerical integration: structure-preserving algorithms for ordinary differential equations. Springer series in computational mathematics, vol 31. Springer, Berlin
Hussein I, Leok M, Sanyal A, Bloch A (2006) A discrete variational integrator for optimal control problems on SO(3). In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 6636–6641
Hwang CL, Fan LT (1967) A discrete version of Pontryagin's maximum principle. Oper Res 15:139–146
Jordan BW, Polak E (1964) Theory of a class of discrete optimal control systems. J Electron Control 17:697–711
Jiménez F, Martín de Diego D (2010) A geometric approach to discrete mechanics for optimal control theory. In: Proceedings of the IEEE conference on decision and control, Atlanta, pp 5426–5431
Jiménez F, Kobilarov M, Martín de Diego D (2013) Discrete variational optimal control. J Nonlinear Sci 23(3):393–426
Junge O, Ober-Blöbaum S (2005) Optimal reconfiguration of formation flying satellites. In: IEEE conference on decision and control and European control conference ECC, Seville
Junge O, Marsden JE, Ober-Blöbaum S (2006) Optimal reconfiguration of formation flying spacecraft – a decentralized approach. In: IEEE conference on decision and control and European control conference ECC, San Diego, pp 5210–5215
Kobilarov M (2008) Discrete geometric motion control of autonomous vehicles. Thesis, Computer Science, University of Southern California
Kobilarov M, Marsden JE (2011) Discrete geometric optimal control on Lie groups. IEEE Trans Robot 27(4):641–655
Leok M (2004) Foundations of computational geometric mechanics, control and dynamical systems. Thesis, California Institute of Technology. Available at http://www.math.lsa.umich.edu/~mleok
Leyendecker S, Ober-Blöbaum S, Marsden JE, Ortiz M (2007) Discrete mechanics and optimal control for constrained multibody dynamics. In: 6th international conference on multibody systems, nonlinear dynamics, and control, ASME international design engineering technical conferences, Las Vegas
Marrero JC, Martín de Diego D, Martínez E (2006) Discrete Lagrangian and Hamiltonian mechanics on Lie groupoids. Nonlinearity 19:1313–1348. Corrigendum: Nonlinearity 19
Marrero JC, Martín de Diego D, Stern A (2010) Lagrangian submanifolds and discrete constrained mechanics on Lie groupoids. Preprint. To appear in DCDS-A 35(1), January 2015
Marsden JE, West M (2001) Discrete mechanics and variational integrators. Acta Numer 10:357–514
Marsden JE, Pekarsky S, Shkoller S (1999a) Discrete Euler-Poincaré and Lie-Poisson equations. Nonlinearity 12:1647–1662
Marsden JE, Pekarsky S, Shkoller S (1999b) Symmetry reduction of discrete Lagrangian mechanics on Lie groups. J Geom Phys 36(1–2):140–151
Ober-Blöbaum S (2008) Discrete mechanics and optimal control. Ph.D. thesis, University of Paderborn
Ober-Blöbaum S, Junge O, Marsden JE (2011) Discrete mechanics and optimal control: an analysis. ESAIM Control Optim Calc Var 17(2):322–352
Pytlak R (1999) Numerical methods for optimal control problems with state constraints. Lecture notes in mathematics, vol 1707. Springer, Berlin
Wendlandt JM, Marsden JE (1997a) Mechanical integrators derived from a discrete variational principle. Phys D 106:223–246
Wendlandt JM, Marsden JE (1997b) Mechanical systems with symmetry, variational principles and integration algorithms. In: Alber M, Hu B, Rosenthal J (eds) Current and future directions in applied mathematics (Notre Dame, IN, 1996). Birkhäuser, Boston, pp 219–261

Distributed Model Predictive Control

Gabriele Pannocchia
University of Pisa, Pisa, Italy

Abstract

Distributed model predictive control refers to a class of predictive control architectures in which a number of local controllers manipulate a subset of inputs to control a subset of outputs (states) composing the overall system. Different levels of communication and (non)cooperation exist, although in general the most compelling properties can be established only for cooperative schemes,
those in which all local controllers optimize local inputs to minimize the same plantwide objective function. Starting from state-feedback algorithms for constrained linear systems, extensions are discussed to cover output feedback, reference target tracking, and nonlinear systems. An outlook of future directions is finally presented.

properties may be even lost. Distributed predictive control architectures arise to meet performance specifications (stability at a minimum) similar to centralized predictive control systems, while still retaining the modularity and local character of the optimization problems solved by each controller.
x_i^+ = A_i x_i + B_i u_i,   y_i = C_i x_i    (5)

Therefore, an inherent mismatch exists between the model (5) used by the local controllers and the actual subsystem dynamics (2). Each local MPC solves the following finite-horizon optimal control problem (FHOCP):

P_i^{De}:  min_{u_i} V_i(·)   s.t.  u_i ∈ U_i^N,  N_i = ∅    (6)

Distributed Model Predictive Control, Fig. 1 Interconnected systems and neighbors definition
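The mismatch between the local model (5) and the true coupled dynamics can be made concrete with scalar subsystems. The numbers and the coupling term below are hypothetical, chosen only to show how a neglected interconnection shifts the one-step prediction:

```python
# Hypothetical numbers illustrating the mismatch behind (5): the local
# model ignores the interconnection term coupling subsystem 1 to
# subsystem 2 in the true dynamics.
A1, B1 = 0.9, 1.0        # local model of subsystem 1
A12 = 0.2                # true (neglected) coupling gain, assumed here

x1, x2, u1 = 1.0, 2.0, 0.5
x1_pred = A1 * x1 + B1 * u1              # prediction of the local model (5)
x1_true = A1 * x1 + B1 * u1 + A12 * x2   # actual coupled update
mismatch = x1_true - x1_pred             # 0.4: the unmodeled interaction
```

Decentralized MPC simply tolerates this error; the noncooperative and cooperative schemes instead exchange information so that the neighbors' influence enters the predictions.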
Distributed Model Predictive Control, Fig. 2 Three distributed control architectures: decentralized MPC, noncooperative MPC, and cooperative MPC
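A toy sketch of the cooperative architecture: two controllers share one plantwide quadratic objective, each minimizes it in its own variable while holding the other fixed, and the new iterate is a convex combination of the local solutions with the previous ones. The cost function, warm start, and weights are invented for illustration; real cooperative MPC optimizes constrained finite-horizon input sequences.

```python
# Toy sketch of cooperative iterations (the cost and all numbers are
# invented for illustration): two controllers share the plantwide cost
#   V(u1, u2) = u1**2 + u2**2 + u1*u2 - u1 - u2.

def V(u1, u2):
    return u1 ** 2 + u2 ** 2 + u1 * u2 - u1 - u2

def argmin_u1(u2):  # solves dV/du1 = 2*u1 + u2 - 1 = 0
    return (1.0 - u2) / 2.0

def argmin_u2(u1):  # solves dV/du2 = 2*u2 + u1 - 1 = 0
    return (1.0 - u1) / 2.0

w1 = w2 = 0.5
u1, u2 = 5.0, -3.0                          # warm start
costs = [V(u1, u2)]
for _ in range(40):
    s1, s2 = argmin_u1(u2), argmin_u2(u1)   # local problems, in parallel
    u1 = w1 * s1 + (1.0 - w1) * u1          # convex-combination update
    u2 = w2 * s2 + (1.0 - w2) * u2
    costs.append(V(u1, u2))
# the plantwide cost never increases, and the iterates approach the
# centralized optimum u1 = u2 = 1/3
```

Because V is jointly convex, each convex-combination step cannot increase the shared cost, which is the mechanism behind the cooperative scheme's guarantees.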
decrease of the global objective at each decision time (Maestre et al. 2011).

Cooperative Distributed MPC

Cooperative schemes are preferable to noncooperative schemes from many points of view, namely, in terms of superior theoretical guarantees and no larger computational requirements. In this section we focus on a prototype cooperative distributed MPC algorithm adapted from Stewart et al. (2010), highlighting the required computations and discussing the associated theoretical properties and guarantees.

Basic Algorithm

We present in Algorithm 1 a streamlined description of a cooperative distributed MPC algorithm, in which each local controller solves P_i^{CD}, given a previously computed value of all other subsystems' input sequences. For each local controller, the new iterate is defined as a convex combination of the newly computed solution with the previous iterate. A relative tolerance is defined, so that cooperative iterations stop when all local controllers have computed a new iterate sufficiently close to the previous one. A maximum number of cooperative iterations can also be defined, so that a finite bound on the execution time can be established.

Algorithm 1 (Cooperative MPC). Require: Overall warm start u^0 = (u_1^0, …, u_M^0), convex step weights w_i > 0 s.t. Σ_{i=1}^M w_i = 1, relative tolerance parameter ε > 0, maximum cooperative iterations c_max.
1: Initialize: c ← 0 and e_i ← 2ε for i = 1, …, M.
2: while (c < c_max) and (∃i | e_i > ε) do
3:   c ← c + 1.
4:   for i = 1 to M do
5:     Solve P_i^{CD} in (9), obtaining u_i*.
6:   end for
7:   for i = 1 to M do
8:     Define new iterate: u_i^c = w_i u_i* + (1 − w_i) u_i^{c−1}.
9:     Compute convergence error: e_i = ‖u_i^c − u_i^{c−1}‖ / ‖u_i^{c−1}‖.
10:  end for
11: end while

We observe that Step 8 implicitly defines the new overall iterate as a convex combination of the overall solutions achieved by each controller, that is,

u^c = Σ_{i=1}^M w_i (u_1^{c−1}, …, u_i*, …, u_M^{c−1}).    (10)

It is also important to observe that Steps 5, 8, and 9 are performed separately by each controller.

Properties

The basic cooperative MPC described in Algorithm 1 enjoys several nice theoretical and practical properties, as detailed in Rawlings and Mayne (2009, §6.3.1):
1. Feasibility of each iterate: u_i^{c−1} ∈ U_i^N implies u_i^c ∈ U_i^N, for all i = 1, …, M and c ∈ I_{>0}.
2. Cost decrease at each iteration: V(x(0), u^c) ≤ V(x(0), u^{c−1}) for all c ∈ I_{>0}.
3. Cost convergence to the centralized optimum: lim_{c→∞} V(x(0), u^c) = min_{u∈U^N} V(x(0), u), in which U = U_1 × ⋯ × U_M.

Resorting to suboptimal MPC theory, properties (1) and (2) above can be exploited to show that the origin of the closed-loop system

x^+ = A x + B κ_c(x),   with κ_c(x) = u^c(0),    (11)

is exponentially stable for any finite c ∈ I_{>0}. This result is of paramount (practical and theoretical) importance because it ensures closed-loop stability using cooperative distributed MPC with any finite number of cooperative iterations. As in centralized MPC based on the solution of an FHOCP (Rawlings and Mayne 2009, §2.4.3), particular care with the terminal cost function V_{fi}(·) is necessary, possibly in conjunction with a terminal constraint x_i(N) ∈ X_{fi}. Several options can be adopted, as discussed, e.g., in Stewart et al. (2010, 2011).

Moreover, the results in Pannocchia et al. (2011) state that well-designed distributed cooperative MPC and centralized MPC algorithms share the same guarantees in terms of stability and robustness.

Complementary Aspects

Output Feedback and Offset-Free Tracking

When the subsystem state cannot be directly measured, each local controller can use a local state estimator, namely, a Kalman filter (or Luenberger observer). Assuming that the pair (A_i, C_i) is detectable, the subsystem state estimate evolves as follows:
Recommended Reading

General overviews on DMPC can be found in Christofides et al. (2013), Rawlings and Mayne (2009), and Scattolini (2009). DMPC algorithms for linear systems are discussed in Alessio et al. (2011), Ferramosca et al. (2013), Riverso et al. (2013), Stewart et al. (2010), and Maestre et al. (2011), and for nonlinear systems in Farina et al. (2012), Liu et al. (2009), and Stewart et al. (2011). Supporting results for implementation and robustness theory can be found in Doan et al. (2011), Pannocchia and Rawlings (2003), and Pannocchia et al. (2011).

Bibliography

Alessio A, Barcelli D, Bemporad A (2011) Decentralized model predictive control of dynamically coupled linear systems. J Process Control 21:705–714
Christofides PD, Scattolini R, Muñoz de la Peña D, Liu J (2013) Distributed model predictive control: a tutorial review and future research directions. Comput Chem Eng 51:21–41
Doan MD, Keviczky T, De Schutter B (2011) An iterative scheme for distributed model predictive control using Fenchel's duality. J Process Control 21:746–755
Farina M, Ferrari-Trecate G, Scattolini R (2012) Distributed moving horizon estimation for nonlinear constrained systems. Int J Robust Nonlinear Control 22:123–143
Ferramosca A, Limon D, Alvarado I, Camacho EF (2013) Cooperative distributed MPC for tracking. Automatica 49:906–914
Liu J, Muñoz de la Peña D, Christofides PD (2009) Distributed model predictive control of nonlinear process systems. AIChE J 55(5):1171–1184
Maestre JM, Muñoz de la Peña D, Camacho EF (2011) Distributed model predictive control based on a cooperative game. Optim Control Appl Methods 32:153–176
Pannocchia G, Rawlings JB (2003) Disturbance models for offset-free model predictive control. AIChE J 49:426–437
Pannocchia G, Rawlings JB, Wright SJ (2011) Conditions under which suboptimal nonlinear MPC is inherently robust. Syst Control Lett 60:747–755
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison
Riverso S, Farina M, Ferrari-Trecate G (2013) Plug-and-play decentralized model predictive control for linear systems. IEEE Trans Autom Control 58:2608–2614
Scattolini R (2009) A survey on hierarchical and distributed model predictive control. J Process Control 19:723–731
Stewart BT, Venkat AN, Rawlings JB, Wright SJ, Pannocchia G (2010) Cooperative distributed model predictive control. Syst Control Lett 59:460–469
Stewart BT, Wright SJ, Rawlings JB (2011) Cooperative distributed model predictive control for nonlinear systems. J Process Control 21:698–704

Distributed Optimization

Angelia Nedić
Industrial and Enterprise Systems Engineering, University of Illinois, Urbana, IL, USA

Abstract

This entry provides an overview of distributed first-order optimization methods for solving a constrained convex minimization problem, where the objective function is the sum of local objective functions of the agents in a network. This problem has gained a lot of interest due to its emergence in many applications in distributed control and coordination of autonomous agents and distributed estimation and signal processing in wireless networks.

Keywords

Collaborative multi-agent systems; Consensus protocol; Gradient-projection method; Networked systems

Introduction

There has been much recent interest in distributed optimization pertinent to optimization aspects arising in control and coordination of networks consisting of multiple (possibly mobile) agents and in estimation and signal processing in sensor networks (Bullo et al. 2009; Hendrickx 2008; Kar and Moura 2011; Martinoli et al. 2013; Mesbahi and Egerstedt 2010; Olshevsky 2010). In many of these applications, the network system goal is to optimize a global objective
through local agent-based computations and local information exchange with immediate neighbors in the underlying communication network. This is motivated mainly by the emergence of large-scale data and/or large-scale networks and new networking applications such as mobile ad hoc networks and wireless sensor networks, characterized by the lack of centralized access to information and time-varying connectivity. Control and optimization algorithms deployed in such networks should be completely distributed (relying only on local observations and information), robust against unexpected changes in topology (i.e., link or node failures) and against unreliable communication (noisy links or quantized data) (see Networked Systems). Furthermore, it is desired that the algorithms are scalable in the size of the network.

Generally speaking, the problem of distributed optimization consists of three main components:
1. The optimization problem that the network of agents wants to solve collectively (specifying an objective function and constraints)
2. The local information structure, which describes what information is locally known or observable by each agent in the system (who knows what and when)
3. The communication structure, which specifies the connectivity topology of the underlying communication network and other features of the communication environment

The algorithms for solving such global network problems need to comply with the distributed knowledge about the problem among the agents and obey the local connectivity structure of the communication network (Networked Systems; Graphs for Modeling Networked Interactions).

minimize Σ_{i=1}^n f_i(x)    (1)
subject to x ∈ X.

Each f_i: R^d → R is a convex function which represents the local objective of agent i, while X ⊆ R^d is a closed convex set. The function f_i is a private function known only to agent i, while the set X is commonly known by all agents i ∈ N. The vector x ∈ X represents a global decision vector which the agents want to optimize using local information. The problem is a simple constrained convex optimization problem, where the global objective function is given by the sum of the individual objective functions f_i(x) of the agents in the system. As such, the objective function is the sum of non-separable convex functions corresponding to multiple agents connected over a network.

As an example, consider the problem arising in support vector machines (SVMs), which are a popular tool for classification problems. Each agent i has a set S_i = {(a_j^{(i)}, b_j^{(i)})}_{j=1}^{m_i} of m_i sample-label pairs, where a_j^{(i)} ∈ R^d is a data point and b_j^{(i)} ∈ {+1, −1} is its corresponding (correct) label. The number m_i of data points for every agent i is typically very large (hundreds of thousands). Without sharing the data points, the agents want to collectively find a hyperplane that separates all the data, i.e., a hyperplane that separates (with a maximal separation distance) the data with label +1 from the data with label −1 in the global data set ∪_{i=1}^n S_i. Thus, the agents need to solve an unconstrained version of the problem (1), where the decision variable x ∈ R^d is a hyperplane normal and the objective function f_i of agent i (with a regularization parameter λ > 0) is given by

f_i(x) = (λ/2)‖x‖² + Σ_{j=1}^{m_i} max{0, 1 − b_j^{(i)} x′ a_j^{(i)}},
the information from an agent to every other 2004; Olshevsky 2010; Olshevsky and Tsitsiklis
agent through local agent interactions over time 2006, 2009; Touri 2011; Vicsek et al. 1995; Wan
( Graphs for Modeling Networked Interactions). and Lemmon 2009).
To accommodate the information spread through Using the consensus technique, a class
the entire network, it is typically assumed that the of distributed algorithms has emerged, as a
network communication graph G D .N; E/ is combination of the consensus protocols and the
strongly connected ( Dynamic Graphs, Connec- gradient-type methods. The gradient-based
tivity of). In the graph, a link (i , j ) means that approaches are particularly suitable, as they
agent i 2 N receives the relevant information have a small overhead per iteration and are, in
from agent j 2 N . general, robust to various sources of errors and
uncertainties.
The technique of using the network as a
Distributed Algorithms medium to propagate the relevant information
for optimization purpose has its origins in the
The algorithms for solving problem (1) are con- work by Tsitsiklis (1984), Tsitsiklis et al. (1986),
structed by using standard optimization tech- and Bertsekas and Tsitsiklis (1997), where the
niques in combination with a mechanism for network has been used to decompose the vector x
information diffusion through local agent inter- components across different agents, while all
actions. A control point of view for the design agents share the same objective function.
of distributed algorithms has a nice exposition in
Wang and Elia (2011). Algorithms Using Weighted Averaging
One of the existing optimization techniques The technique has recently been employed in
is the so-called incremental method, where the Nedić and Ozdaglar (2009) (see also Nedić and
information is processed along a directed cycle Ozdaglar 2007, 2010) to deal with problems
in the graph. In this approach the estimate is of the form (1) when the agents have different
passed from an agent to its neighbor (along the objective functions fi , but their decisions are
cycle), and only one agent updates at a time fully coupled through the common vector vari-
(Bertsekas 1997; Blatt et al. 2007; Johansson able x. In a series of recent work (to be detailed
2008; Johansson et al. 2009; Nedić and Bertsekas later), the following distributed algorithm has
2000, 2001; Nedić et al. 2001; Rabbat and Nowak emerged. Letting xi .k/ 2 X be an estimate
2004; Ram et al. 2009; Tseng 1998); for a de- (of the optimal decision) at agent i and time
tailed literature on incremental methods, see the k, the next iterate is constructed through two
textbooks Bertsekas (1999) and Bertsekas et al. updates. The first update is a consensus-like it-
(2003). eration, whereby, upon receiving the estimates
More recently, one of the techniques that xj .k/ from its (in)neighbors j , the agent i aligns
gained popularity as a mechanism for information its estimate with its neighbors through averaging,
diffusion is a consensus protocol, in which formally given by
the agent diffuses the information through the X
network through locally weighted averaging of vi .k/ D wij xj .k/; (2)
their incoming data ( Averaging Algorithms j 2Ni
and Consensus). The problem of reaching
a consensus on a particular scalar value, or where Ni is the neighbor set
computing exact averages of the initial values of
the agents, has gained an unprecedented interest Ni D fj 2 N j.j; i / 2 Eg [ fi g:
as a central problem inherent to cooperative
behavior in networked systems (Blondel et al. The neighbor set Ni includes agent i itself, since
2005; Boyd et al. 2005; Cao et al. 2005, 2008a,b; the agent always has access to its own informa-
Jadbabaie et al. 2003; Olfati-Saber and Murray tion. The scalar wij is a nonnegative weight that
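The weighted-averaging consensus mechanism discussed above can be illustrated with a small self-contained sketch. The 4-agent path graph and the particular doubly stochastic weight matrix below are illustrative assumptions, not taken from the entry; with such weights, repeated local averaging drives every agent's value to the network-wide average of the initial data.

```python
# Illustrative sketch: consensus by locally weighted averaging.
# Assumed setup: 4 agents on an undirected path 0-1-2-3 with a
# doubly stochastic weight matrix W (rows and columns sum to 1).

def consensus_step(x, W):
    """One round: each agent i forms sum_j W[i][j] * x[j] over its neighbors."""
    n = len(x)
    return [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]

def run_consensus(x0, W, rounds=200):
    x = list(x0)
    for _ in range(rounds):
        x = consensus_step(x, W)
    return x

# Doubly stochastic weights for the path graph (an illustrative choice).
W = [
    [2/3, 1/3, 0.0, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 1/3, 1/3, 1/3],
    [0.0, 0.0, 1/3, 2/3],
]

initial = [1.0, 3.0, 5.0, 7.0]   # average is 4.0
final = run_consensus(initial, W)
# Every entry of `final` approaches the average 4.0.
```

Because W here is symmetric (hence doubly stochastic), each round preserves the network average while shrinking the disagreement geometrically, which is exactly the behavior the averaging-based protocols rely on.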
Distributed Optimization 311
The scalar w_ij is a nonnegative weight that agent i places on the incoming information from neighbor j ∈ N_i. These weights sum to 1, i.e., Σ_{j∈N_i} w_ij = 1, thus yielding v_i(k) as a (local) convex combination of the estimates x_j(k), j ∈ N_i, obtained by agent i.

After computing v_i(k), agent i computes a new iterate x_i(k+1) by performing a gradient-projection step, aimed at minimizing its own objective f_i, of the following form:

    x_i(k+1) = Π_X[v_i(k) − α(k) ∇f_i(v_i(k))],   (3)

where Π_X[x] is the projection of a point x on the set X (in the Euclidean norm), α(k) > 0 is the stepsize at time k, and ∇f_i(z) is the gradient of f_i at a point z.

When all functions are zero and X is the entire space ℝ^d, the distributed algorithm (2) and (3) reduces to the linear-iteration method

    x_i(k+1) = Σ_{j∈N_i} w_ij x_j(k),   (4)

which is known as the consensus or agreement protocol. This protocol is employed when the agents in the network wish to align their decision vectors x_i(k) to a common vector x̂. The alignment is attained asymptotically (as k → ∞).

In the presence of objective functions and constraints, the distributed algorithm in (2) and (3) corresponds to "forced alignment" guided by the gradient forces Σ_{i=1}^n ∇f_i(x). Under appropriate conditions, the alignment is forced to a common vector x that minimizes the network objective Σ_{i=1}^n f_i(x) over the set X. This corresponds to the convergence of the iterates x_i(k) to a common solution x ∈ X as k → ∞ for all agents i ∈ N.

The conditions under which the convergence of {x_i(k)} to a common solution x ∈ X occurs are a combination of the conditions needed for the consensus protocol to converge and the conditions imposed on the functions f_i and the stepsize α(k) to ensure the convergence of standard gradient-projection methods. For the consensus part, the conditions should guarantee that the pure consensus protocol in (4) converges to the average (1/n) Σ_{i=1}^n x_i(0) of the initial agent values. This requirement transfers to the condition that the weights w_ij give rise to a doubly stochastic weight matrix W, whose entries are the w_ij defined by the weights in (4), augmented by w_ij = 0 for j ∉ N_i.

Intuitively, the requirement that the weight matrix W be doubly stochastic ensures that each agent has the same influence on the system behavior in the long run. More specifically, the doubly stochastic weights ensure that the system as a whole minimizes Σ_{i=1}^n (1/n) f_i(x), where the factor 1/n is seen as the portion of the influence of agent i. When the weight matrix W is only row stochastic, the consensus protocol converges to a weighted average Σ_{i=1}^n π_i x_i(0) of the agent initial values, where π is the left eigenvector of W associated with the eigenvalue 1. In general, the values π_i can be different for different indices i, and the distributed algorithm in (2) and (3) then results in minimizing the function Σ_{i=1}^n π_i f_i(x) over X, and thus in not solving problem (1).

The distributed algorithm (2) and (3) has been proposed and analyzed in Ram et al. (2010a, 2012), where the convergence has been established for a diminishing stepsize rule (i.e., Σ_k α(k) = ∞ and Σ_k α²(k) < ∞). As seen from the convergence analysis (see, e.g., Nedić and Ozdaglar 2009; Ram et al. 2012), for the convergence of the method it is critical that the iterate disagreements ‖x_i(k) − x_j(k)‖ converge linearly in time, for all i ≠ j. This fast disagreement decay seems to be indispensable for ensuring the stability of the iterative process (2) and (3).

According to the distributed algorithm (2) and (3), each agent at first aligns its estimate x_i(k) with the estimates x_j(k) received from its neighbors and then updates based on its local objective f_i, which is to be minimized over x ∈ X. Alternatively, the distributed method can be constructed by interchanging the alignment step and the gradient-projection step. Such a method, while having the same asymptotic performance as the method in (2) and (3), exhibits a somewhat slower (transient) convergence behavior due to a larger misalignment resulting from taking the gradient-based updates at first.
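A minimal numerical sketch of the two-step iteration in (2) and (3) follows. The quadratic local objectives f_i(x) = (x − a_i)², the interval constraint X = [0, 10], the path-graph weights, and the stepsize α(k) = 0.5/(k+1) are all illustrative assumptions rather than the entry's prescription; the minimizer of Σ_i f_i over X is the mean of the a_i, which every agent should approach.

```python
# Sketch of the consensus + gradient-projection iteration (2)-(3) for
# scalar decisions. Assumed ingredients (not from the entry): quadratic
# local objectives f_i(x) = (x - a_i)^2, constraint set X = [0, 10],
# a doubly stochastic weight matrix W on a 4-agent path graph, and a
# diminishing stepsize alpha(k) = 0.5 / (k + 1).

def project(z, lo=0.0, hi=10.0):
    """Euclidean projection of a scalar onto X = [lo, hi]."""
    return min(max(z, lo), hi)

def distributed_gradient(a, W, iters=5000):
    n = len(a)
    x = [0.0] * n                                  # initial estimates in X
    for k in range(iters):
        alpha = 0.5 / (k + 1)
        # Update (2): consensus-like averaging of neighbors' estimates.
        v = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Update (3): local gradient step (grad f_i(z) = 2(z - a_i)), then projection.
        x = [project(v[i] - alpha * 2.0 * (v[i] - a[i])) for i in range(n)]
    return x

W = [
    [2/3, 1/3, 0.0, 0.0],
    [1/3, 1/3, 1/3, 0.0],
    [0.0, 1/3, 1/3, 1/3],
    [0.0, 0.0, 1/3, 2/3],
]
a = [1.0, 3.0, 5.0, 7.0]        # sum_i (x - a_i)^2 is minimized at x = 4.0
estimates = distributed_gradient(a, W)
```

Since W is doubly stochastic, the run-average of the estimates tracks the minimizer of the equally weighted sum, and the diminishing stepsize lets the disagreement between agents decay, matching the convergence discussion above.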
This alternative has been initially proposed independently in Lopes and Sayed (2006) (where simulation results have been reported) and in Nedić and Ozdaglar (2007, 2009) (where the convergence analysis for time-varying networks and a constant stepsize is given), as well as in Nedić et al. (2008) (with quantization effects), and it has been further investigated in Lopes and Sayed (2008), Nedić et al. (2010), and Cattivelli and Sayed (2010). More recently, it has been considered in Lobel et al. (2011) for state-dependent weights and in Tu and Sayed (2012), where the performance is compared with that of an algorithm of the form (2) and (3) for estimation problems.

Algorithm Extensions: Over the past years, many extensions of the distributed algorithm in (2) and (3) have been developed, including the following:

(a) Time-varying communication graphs: The algorithm naturally extends to the case of time-varying connectivity graphs {G(k)}, with G(k) = (N, E(k)) defined over the node set N and time-varying links E(k). In this case, the weights w_ij in (2) are replaced with w_ij(k) and, similarly, the neighbor set N_i is replaced with the corresponding time-dependent neighbor set N_i(k) specified by the graph G(k). The convergence of the algorithm (2) and (3) with these modifications typically requires some additional assumptions on the network connectivity over time and on the entries of the corresponding weight matrix sequence {W(k)}, where w_ij(k) = 0 for j ∉ N_i(k). These conditions are the same as those that guarantee the convergence of the (row stochastic) matrix sequence {W(k)} to a rank-one row-stochastic matrix, such as connectivity over some fixed period (a sliding time window), nonzero diagonal entries in W(k), and the existence of a uniform lower bound on the positive entries in W(k); see, for example, Cao et al. (2008b), Touri (2011), Tsitsiklis (1984), Nedić and Ozdaglar (2010), Moreau (2005), and Ren and Beard (2005).

(b) Noisy gradients: The algorithm in (2) and (3) also works when the gradient computations ∇f_i(x) in update (3) are corrupted by random errors. This corresponds to using a stochastic gradient ∇̃f_i(x) instead of ∇f_i(x), resulting in the following stochastic gradient-projection step, used instead of (3):

    x_i(k+1) = Π_X[v_i(k) − α(k) ∇̃f_i(v_i(k))].

The convergence of these methods is established for the cases when the stochastic gradients are consistent estimates of the actual gradient, i.e.,

    E[∇̃f_i(v_i(k)) | v_i(k)] = ∇f_i(v_i(k)).

The convergence of these methods typically requires the use of a non-summable but square-summable stepsize sequence {α(k)} (i.e., Σ_k α(k) = ∞ and Σ_k α²(k) < ∞); see, e.g., Ram et al. (2010a).

(c) Noisy or unreliable communication links: The communication medium is not always perfect and, often, the communication links are characterized by some random noise process. In this case, while agent j ∈ N_i sends its estimate x_j(k) to agent i, the agent does not receive the intended message. Rather, it receives x_j(k) with some random link-dependent noise ε_ij(k), i.e., it receives x_j(k) + ε_ij(k) instead of x_j(k) (see Kar and Moura 2011; Patterson et al. 2009; Touri and Nedić 2009 for the influence of noise and link failures on consensus). In such cases, the distributed optimization algorithm needs to be modified to include a stepsize for noise attenuation in addition to the standard stepsize for gradient scaling. These stepsizes are coupled through appropriate relative-growth conditions, which ensure that the gradient information is maintained at the right level and that, at the same time, the link noise is attenuated appropriately (Srivastava and Nedić 2011; Srivastava et al. 2010). Other imperfections of the communication links, such as link failures and quantization effects, can also be modeled and incorporated into the optimization method, building on the existing results for consensus protocols (e.g., Carli et al. 2007; Kar and Moura 2010, 2011; Nedić et al. 2008).
(d) Asynchronous implementations: The method in (2) and (3) has simultaneous updates, evident in all agents exchanging and updating information in synchronous time steps indexed by k. In some communication settings, the synchronization of agents is impractical, and the agents use their own clocks, which are not synchronized but do tick according to a common time interval. Such communications result in random weights w_ij(k) and a random neighbor set N_i(k) in (2), which are typically independent and identically distributed (over time). The most common models are random gossip and random broadcast. In the gossip model, at any time, two randomly selected agents i and j communicate and update, while the other agents sleep (Boyd et al. 2005; Kashyap et al. 2007; Ram et al. 2010b; Srivastava 2011; Srivastava and Nedić 2011). In the broadcast model, a random agent i wakes up and broadcasts its estimate x_i(k). Its neighbors that receive the estimate update their iterates, while the other agents (including the agent who broadcasted) do not update (Aysal et al. 2008; Nedić 2011).

(e) Distributed constraints: One of the more challenging aspects is the extension of the algorithm to the case when the constraint set X in (1) is given as an intersection of closed convex sets X_i, one set per agent. Specifically, the set X in (1) is defined by

    X = ∩_{i=1}^n X_i,

where the set X_i is known to agent i only. In this case, the algorithm has a slight modification at the update of x_i(k+1) in (3), where the projection is on the local set X_i instead of X, i.e., the update in (3) is replaced with the following update:

    x_i(k+1) = Π_{X_i}[v_i(k) − α(k) ∇f_i(v_i(k))].   (5)

The resulting method (2), (5) converges under some additional assumptions on the sets X_i, such as the nonempty interior assumption (i.e., the set ∩_{i=1}^n X_i has a nonempty interior), a linear-intersection assumption (each X_i is an intersection of finitely many linear equality and/or inequality constraints), or the Slater condition (Lee and Nedić 2012; Nedić et al. 2010; Srivastava 2011; Srivastava and Nedić 2011; Zhu and Martínez 2012).

In principle, most of the simple first-order methods that solve a centralized problem of the form (1) can also be distributed among the agents (through the use of consensus protocols) to solve the distributed problem (1). For example, the Nesterov dual-averaging subgradient method (Nesterov 2005) can be distributed as proposed in Duchi et al. (2012), a distributed Newton-Raphson method has been proposed and studied in Zanella et al. (2011), while a distributed simplex algorithm has been constructed and analyzed in Bürger et al. (2012). An interesting method based on finding, in a distributed way, a zero of the gradient ∇f = Σ_{i=1}^n ∇f_i has been proposed and analyzed in Lu and Tang (2012). Some other distributed algorithms and their implementations can be found in Johansson et al. (2007), Tsianos et al. (2012a,b), Dominguez-Garcia and Hadjicostis (2011), Tsianos (2013), Gharesifard and Cortés (2012a), Jakovetic et al. (2011a,b), and Zargham et al. (2012).

Summary and Future Directions

The distributed optimization algorithms have been developed mainly using consensus protocols that are based on weighted averaging, also known as linear-iterative methods. The convergence behavior and convergence rate analysis of these methods combine tools from optimization theory, graph theory, and matrix analysis. The main drawback of these algorithms is that they require (at least theoretically) the use of a doubly stochastic weight matrix W (or W(k) in the time-varying case) in order to solve problem (1). This requirement can be accommodated by allowing agents to locally exchange some additional information on the weights that they intend to use, or their degree knowledge. However, in general, constructing such doubly stochastic weights distributedly on directed graphs is a rather complex problem (Gharesifard and Cortés 2012b).
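On undirected (bidirectional) communication graphs, one standard way to obtain the doubly stochastic weights required above, using only locally exchanged degree information, is a Metropolis-type rule. The sketch below uses that rule purely as an illustration (it is a well-known construction, not a prescription of this entry):

```python
# Sketch: building doubly stochastic weights from local degree exchange on
# an undirected graph via a Metropolis-type rule (a standard construction,
# shown here for illustration):
#   w_ij = 1 / (1 + max(d_i, d_j))  for each edge {i, j},
#   w_ii = 1 - sum of w_ij over the neighbors of i.

def metropolis_weights(n, edges):
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        w = 1.0 / (1 + max(deg[i], deg[j]))
        W[i][j] = W[j][i] = w
    for i in range(n):
        W[i][i] = 1.0 - sum(W[i][k] for k in range(n) if k != i)
    return W

# A 4-agent ring; the resulting W is symmetric and nonnegative, and its
# rows and columns each sum to 1 (doubly stochastic).
W = metropolis_weights(4, [(0, 1), (1, 2), (2, 3), (3, 0)])
```

Each agent only needs its own degree and the degrees of its neighbors, which is exactly the kind of local information exchange mentioned in the summary; on directed graphs, by contrast, no comparably simple local rule is available.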
In: Proceedings of IEEE INFOCOM, Miami, vol 3, pp 1653–1664
Boyd S, Parikh N, Chu E, Peleato B, Eckstein J (2010) Distributed optimization and statistical learning via the alternating direction method of multipliers. Found Trends Mach Learn 3(1):1–122
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press, Princeton
Bürger M, Notarstefano G, Bullo F, Allgöwer F (2012) A distributed simplex algorithm for degenerate linear programs and multi-agent assignments. Automatica 48(9):2298–2304
Cao M, Spielman D, Morse A (2005) A lower bound on convergence of a distributed network consensus algorithm. In: Proceedings of IEEE CDC, Seville, pp 2356–2361
Cao M, Morse A, Anderson B (2008a) Reaching a consensus in a dynamically changing environment: a graphical approach. SIAM J Control Optim 47(2):575–600
Cao M, Morse A, Anderson B (2008b) Reaching a consensus in a dynamically changing environment: convergence rates, measurement delays, and asynchronous events. SIAM J Control Optim 47(2):601–623
Carli R, Fagnani F, Frasca P, Taylor T, Zampieri S (2007) Average consensus on networks with transmission noise or quantization. In: Proceedings of European control conference, Kos
Cattivelli F, Sayed A (2010) Diffusion LMS strategies for distributed estimation. IEEE Trans Signal Process 58(3):1035–1048
Dominguez-Garcia A, Hadjicostis C (2011) Distributed strategies for average consensus in directed graphs. In: Proceedings of the IEEE conference on decision and control, Orlando, Dec 2011
Duchi J, Agarwal A, Wainwright M (2012) Dual averaging for distributed optimization: convergence analysis and network scaling. IEEE Trans Autom Control 57(3):592–606
Gharesifard B, Cortés J (2012a) Distributed continuous-time convex optimization on weight-balanced digraphs. http://arxiv.org/pdf/1204.0304.pdf
Gharesifard B, Cortés J (2012b) Distributed strategies for generating weight-balanced and doubly stochastic digraphs. Eur J Control 18(6):539–557
Hendrickx J (2008) Graphs and networks for the analysis of autonomous agent systems. Ph.D. dissertation, Université Catholique de Louvain
Jadbabaie A, Lin J, Morse S (2003) Coordination of groups of mobile autonomous agents using nearest neighbor rules. IEEE Trans Autom Control 48(6):988–1001
Jakovetic D, Xavier J, Moura J (2011a) Cooperative convex optimization in networked systems: augmented Lagrangian algorithms with directed gossip communication. IEEE Trans Signal Process 59(8):3889–3902
Jakovetic D, Xavier J, Moura J (2011b) Fast distributed gradient methods. Available at: http://arxiv.org/abs/1112.2972
Johansson B (2008) On distributed optimization in networked systems. Ph.D. dissertation, Royal Institute of Technology, Stockholm
Johansson B, Rabi M, Johansson M (2007) A simple peer-to-peer algorithm for distributed optimization in sensor networks. In: Proceedings of the 46th IEEE conference on decision and control, New Orleans, Dec 2007, pp 4705–4710
Johansson B, Rabi M, Johansson M (2009) A randomized incremental subgradient method for distributed optimization in networked systems. SIAM J Control Optim 20(3):1157–1170
Kar S, Moura J (2010) Distributed consensus algorithms in sensor networks: quantized data and random link failures. IEEE Trans Signal Process 58(3):1383–1400
Kar S, Moura J (2011) Convergence rate analysis of distributed gossip (linear parameter) estimation: fundamental limits and tradeoffs. IEEE J Sel Top Signal Process 5(4):674–690
Kashyap A, Basar T, Srikant R (2007) Quantized consensus. Automatica 43(7):1192–1203
Kempe D, Dobra A, Gehrke J (2003) Gossip-based computation of aggregate information. In: Proceedings of the 44th annual IEEE symposium on foundations of computer science, Cambridge, Oct 2003, pp 482–491
Lee S, Nedić A (2012) Distributed random projection algorithm for convex optimization. IEEE J Sel Top Signal Process (accepted to appear)
Lobel I, Ozdaglar A, Feijer D (2011) Distributed multi-agent optimization with state-dependent communication. Math Program 129(2):255–284
Lopes C, Sayed A (2006) Distributed processing over adaptive networks. In: Adaptive sensor array processing workshop, MIT Lincoln Laboratory, Lexington, pp 1–5
Lopes C, Sayed A (2008) Diffusion least-mean squares over adaptive networks: formulation and performance analysis. IEEE Trans Signal Process 56(7):3122–3136
Lu J, Tang C (2012) Zero-gradient-sum algorithms for distributed convex optimization: the continuous-time case. IEEE Trans Autom Control 57(9):2348–2354
Martinoli A, Mondada F, Mermoud G, Correll N, Egerstedt M, Hsieh A, Parker L, Stoy K (2013) Distributed autonomous robotic systems. Springer tracts in advanced robotics. Springer, Heidelberg/New York
Mesbahi M, Egerstedt M (2010) Graph theoretic methods for multiagent networks. Princeton University Press, Princeton
Moreau L (2005) Stability of multiagent systems with time-dependent communication links. IEEE Trans Autom Control 50(2):169–182
Nedić A (2011) Asynchronous broadcast-based convex optimization over a network. IEEE Trans Autom Control 56(6):1337–1351
Nedić A, Bertsekas D (2000) Convergence rate of incremental subgradient algorithms. In: Uryasev S, Pardalos P (eds) Stochastic optimization: algorithms and applications. Kluwer Academic Publishers, Dordrecht, The Netherlands, pp 263–304
Nedić A, Bertsekas D (2001) Incremental subgradient methods for nondifferentiable optimization. SIAM J Optim 12(1):109–138
Nedić A, Olshevsky A (2013) Distributed optimization over time-varying directed graphs. Available at http://arxiv.org/abs/1303.2289
Nedić A, Ozdaglar A (2007) On the rate of convergence of distributed subgradient methods for multi-agent optimization. In: Proceedings of IEEE CDC, New Orleans, pp 4711–4716
Nedić A, Ozdaglar A (2009) Distributed subgradient methods for multi-agent optimization. IEEE Trans Autom Control 54(1):48–61
Nedić A, Ozdaglar A (2010) Cooperative distributed multi-agent optimization. In: Eldar Y, Palomar D (eds) Convex optimization in signal processing and communications. Cambridge University Press, Cambridge/New York, pp 340–386
Nedić A, Bertsekas D, Borkar V (2001) Distributed asynchronous incremental subgradient methods. In: Butnariu D, Censor Y, Reich S (eds) Inherently parallel algorithms in feasibility and optimization and their applications. Studies in computational mathematics. Elsevier, Amsterdam/New York
Nedić A, Olshevsky A, Ozdaglar A, Tsitsiklis J (2008) Distributed subgradient methods and quantization effects. In: Proceedings of the 47th IEEE conference on decision and control, Cancún, Dec 2008, pp 4177–4184
Nedić A, Ozdaglar A, Parrilo PA (2010) Constrained consensus and optimization in multi-agent networks. IEEE Trans Autom Control 55(4):922–938
Nesterov Y (2005) Primal-dual subgradient methods for convex problems. Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (UCL), Technical report 67
Olfati-Saber R, Murray R (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49(9):1520–1533
Olshevsky A (2010) Efficient information aggregation for distributed control and signal processing. Ph.D. dissertation, MIT
Olshevsky A, Tsitsiklis J (2006) Convergence rates in distributed consensus averaging. In: Proceedings of IEEE CDC, San Diego, pp 3387–3392
Olshevsky A, Tsitsiklis J (2009) Convergence speed in distributed consensus and averaging. SIAM J Control Optim 48(1):33–55
Patterson S, Bamieh B, Abbadi A (2009) Distributed average consensus with stochastic communication failures. IEEE Trans Signal Process 57:2748–2761
Rabbat M, Nowak R (2004) Distributed optimization in sensor networks. In: Symposium on information processing of sensor networks, Berkeley, pp 20–27
Ram SS, Nedić A, Veeravalli V (2009) Incremental stochastic sub-gradient algorithms for convex optimization. SIAM J Optim 20(2):691–717
Ram SS, Nedić A, Veeravalli VV (2010a) Distributed stochastic sub-gradient projection algorithms for convex optimization. J Optim Theory Appl 147:516–545
Ram SS, Nedić A, Veeravalli VV (2010b) Asynchronous gossip algorithms for stochastic optimization: constant stepsize analysis. In: Recent advances in optimization and its applications in engineering: the 14th Belgian-French-German conference on optimization (BFG). Springer-Verlag, Berlin/Heidelberg, pp 51–60
Ram SS, Nedić A, Veeravalli VV (2012) A new class of distributed optimization algorithms: application to regression of distributed data. Optim Method Softw 27(1):71–88
Ren W, Beard R (2005) Consensus seeking in multi-agent systems under dynamically changing interaction topologies. IEEE Trans Autom Control 50(5):655–661
Srivastava K (2011) Distributed optimization with applications to sensor networks and machine learning. Ph.D. dissertation, University of Illinois at Urbana-Champaign, Industrial and Enterprise Systems Engineering
Srivastava K, Nedić A (2011) Distributed asynchronous constrained stochastic optimization. IEEE J Sel Top Signal Process 5(4):772–790
Srivastava K, Nedić A, Stipanović D (2010) Distributed constrained optimization over noisy networks. In: Proceedings of the 49th IEEE conference on decision and control (CDC), Atlanta, pp 1945–1950
Tseng P (1998) An incremental gradient(-projection) method with momentum term and adaptive stepsize rule. SIAM J Optim 8:506–531
Tsianos K (2013) The role of the network in distributed optimization algorithms: convergence rates, scalability, communication/computation tradeoffs and communication delays. Ph.D. dissertation, McGill University, Department of Electrical and Computer Engineering
Tsianos K, Rabbat M (2011) Distributed consensus and optimization under communication delays. In: Proceedings of Allerton conference on communication, control, and computing, Monticello, pp 974–982
Tsianos K, Lawlor S, Rabbat M (2012a) Consensus-based distributed optimization: practical issues and applications in large-scale machine learning. In: Proceedings of the 50th Allerton conference on communication, control, and computing, Monticello
Tsianos K, Lawlor S, Rabbat M (2012b) Push-sum distributed dual averaging for convex optimization. In: Proceedings of the IEEE conference on decision and control, Maui
Tsitsiklis J (1984) Problems in decentralized decision making and computation. Ph.D. dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31(9):803–812
Touri B (2011) Product of random stochastic matrices and distributed averaging. Ph.D. dissertation, University of Illinois at Urbana-Champaign, Industrial and Enterprise Systems Engineering
Dynamic Graphs, Connectivity of 317
Touri B, Nedić A (2009) Distributed consensus over network with noisy links. In: Proceedings of the 12th international conference on information fusion, Seattle, pp 146–154
Tu S-Y, Sayed A (2012) Diffusion strategies outperform consensus strategies for distributed estimation over adaptive networks. http://arxiv.org/abs/1205.3993
Vicsek T, Czirok A, Ben-Jacob E, Cohen I, Schochet O (1995) Novel type of phase transitions in a system of self-driven particles. Phys Rev Lett 75(6):1226–1229
Wan P, Lemmon M (2009) Event-triggered distributed optimization in sensor networks. In: Symposium on information processing of sensor networks, San Francisco, pp 49–60
Wang J, Elia N (2011) A control perspective for centralized and distributed convex optimization. In: IEEE conference on decision and control, Florida, pp 3800–3805
Wei E, Ozdaglar A (2012) Distributed alternating direction method of multipliers. In: Proceedings of the 51st IEEE conference on decision and control and European control conference, Maui
Zanella F, Varagnolo D, Cenedese A, Pillonetto G, Schenato L (2011) Newton-Raphson consensus for distributed convex optimization. In: IEEE conference on decision and control, Florida, pp 5917–5922
Zargham M, Ribeiro A, Ozdaglar A, Jadbabaie A (2014) Accelerated dual descent for network flow optimization. IEEE Trans Autom Control 59(4):905–920
Zhu M, Martínez S (2012) On distributed convex optimization under inequality and equality constraints. IEEE Trans Autom Control 57(1):151–164

sensing and communication. This article focuses on the use of graphs as models of wireless communications. In this context, graphs have been used widely in the study of robotic and sensor networks and have provided an invaluable modeling framework to address a number of coordinated tasks ranging from exploration, surveillance, and reconnaissance to cooperative construction and manipulation. In fact, these success stories have almost always relied on efficient information exchange and coordination between the members of the team, as seen, e.g., in the case of distributed state agreement, where multi-hop communication has been proven necessary for convergence and performance guarantees.

Keywords

Algebraic graph theory; Convex optimization; Distributed and hybrid control; Graph connectivity
Introduction
Dynamic Graphs, Connectivity of, Fig. 1 (a) Disc-based model of communication; (b) weighted, proximity-based model of communication; (c) connected network of mobile robots
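The disc-based model of Fig. 1a can be sketched in a few lines: two nodes are linked whenever their distance is at most a communication radius r, and as nodes move, re-evaluating this rule yields the time-varying graph G(t). The positions and radius below are illustrative assumptions standing in for the figure.

```python
# Sketch of the disc-based communication model of Fig. 1a: an undirected
# edge exists between two nodes whenever their distance is at most a
# communication radius r. Positions and r are illustrative assumptions.
import math

def disc_graph(positions, r):
    """Return the edge set {(i, j), i < j} of the disc graph on `positions`."""
    n = len(positions)
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = positions[i], positions[j]
            if math.hypot(xi - xj, yi - yj) <= r:
                edges.add((i, j))
    return edges

# Three robots on a line, one unit apart; radius 1 links only adjacent pairs.
E = disc_graph([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], r=1.0)
```

Recomputing `disc_graph` at each time step as the positions change is what makes the communication graph dynamic in the sense used throughout this entry.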
undirected; otherwise it is called directed. The weights in W(t) typically model signal strength or channel reliability, as per the disc-based and weighted-proximity models in Fig. 1a, b. In these models, communication between nodes is related to their pairwise distance, giving rise to the dynamic or time-varying nature of the graph G(t) due to node mobility. Given an undirected dynamic graph G(t), we say that this graph is connected at time t if there exists a path, i.e., a sequence of distinct vertices such that consecutive vertices are adjacent, between any two vertices in G(t). In the case of directed graphs, two notions of connectivity are defined. A directed graph G(t) is called strongly connected if there exists a directed path between any two of its vertices or, equivalently, if every vertex is reachable from any other vertex. On the other hand, a directed graph is called weakly connected if replacing all directed edges by undirected edges produces a connected undirected graph. Finally, a collection of graphs {G(t) | t = t_0, …, t_k} is called jointly connected over time if the union graph ∪_{t=t_0}^{t_k} G(t) = {V, ∪_{t=t_0}^{t_k} E(t)} is connected. Clearly, checking for the existence of paths between all pairs of nodes in a graph is difficult, especially so as the number of nodes in the graph increases. For this reason, equivalent algebraic representations of graphs are employed that allow for efficient algebraic ways to check for connectivity, as we discuss in the following section.

While connectivity is necessary for information propagation in a network, it is also relevant to the performance of many networked dynamical processes, such as synchronization and gossiping, via its relation to the network eigenvalue spectra (Preciado 2008). For example, the spectrum of the Laplacian matrix of a network plays a key role in the analysis of synchronization in networks of nonlinear oscillators (Pecora and Carroll 1998; Preciado and Verghese 2005), distributed algorithms (Lynch 1997), and decentralized control problems (Fax and Murray 2004; Olfati-Saber and Murray 2004). Similarly, the spectrum of the adjacency matrix determines the speed of viral information spreading in a network (Van Mieghem et al. 2009). Additionally, more robust versions of connectivity, such as k-node or k-edge connectivity, can be used to introduce robustness of a network to node or link failures, respectively (Zavlanos and Pappas 2005, 2008).

Graph-Theoretic Connectivity Control

Connectivity Using the Graph Laplacian Matrix

A metric that is typically employed to capture connectivity of dynamic networks is the second smallest eigenvalue λ₂(L) of the Laplacian matrix L ∈ ℝ^{n×n} of the graph, also known as the algebraic connectivity or Fiedler value of the graph. For a weighted graph G = {V, E, W}, the entries of the Laplacian matrix are typically related to the weights in W so that the (i, j) entry of L is given by [L]_ij = Σ_{j=1}^n w_ij if i = j and [L]_ij = −w_ij if i ≠ j.
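The role of the algebraic connectivity λ₂(L) as a connectivity certificate can be checked numerically without a full eigendecomposition. The sketch below builds the Laplacian of a small graph and estimates λ₂ by power iteration on B = cI − L restricted to the subspace orthogonal to the all-ones eigenvector; it is only an illustration in the spirit of the power-iteration-based methods cited in this entry (the shift c, iteration count, and starting vector are ad hoc choices, not a specific published algorithm).

```python
# Sketch: estimating the algebraic connectivity lambda_2(L) by power
# iteration on B = c*I - L after deflating the all-ones eigenvector.
# The dominant eigenvalue of B on that subspace is c - lambda_2(L).

def laplacian(n, edges):
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L

def algebraic_connectivity(n, edges, iters=500):
    L = laplacian(n, edges)
    c = 2.0 * max(L[i][i] for i in range(n)) + 1.0   # c > lambda_max(L) by Gershgorin

    def matvec(v):   # computes B v with B = c I - L
        return [c * v[i] - sum(L[i][j] * v[j] for j in range(n)) for i in range(n)]

    def deflate(v):  # remove the component along the all-ones eigenvector
        m = sum(v) / n
        return [vi - m for vi in v]

    v = deflate([((7 * i + 3) % 11) / 11.0 for i in range(n)])  # fixed start vector
    for _ in range(iters):
        v = deflate(matvec(v))
        norm = sum(vi * vi for vi in v) ** 0.5
        v = [vi / norm for vi in v]
    Bv = matvec(v)
    mu = sum(v[i] * Bv[i] for i in range(n))   # Rayleigh quotient, approx c - lambda_2
    return c - mu

# Path 0-1-2 is connected (lambda_2 = 1); two disjoint edges are not (lambda_2 = 0).
lam_path = algebraic_connectivity(3, [(0, 1), (1, 2)])
lam_split = algebraic_connectivity(4, [(0, 1), (2, 3)])
```

Consistent with Fiedler's characterization quoted in the text, the estimate is positive exactly for connected graphs and zero for disconnected ones.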
with corresponding eigenvector the vector of all entries equal to one. Additionally, the algebraic connectivity λ₂(L) is a concave function of the Laplacian matrix that is positive if and only if the graph is connected (Fiedler 1973; Godsil and Royle 2001; Merris 1994; Mohar 1991).

As the algebraic connectivity λ₂(L) plays a critical role in determining whether a graph is connected or not, a number of methods have been proposed for its decentralized estimation and control. These range from methods that employ market-based control to underestimate the algebraic connectivity and accordingly control the network structure (Zavlanos and Pappas 2008), to methods that enforce the states of the nodes to oscillate at frequencies that correspond to the Laplacian eigenvalues and then use the fast Fourier transform to estimate these eigenvalues (Franceschelli et al. 2013), to methods that iteratively update the interval where the algebraic connectivity is supposed to lie (Montijano et al. 2011), and to methods that rely on the power iteration method and its variants (DeGennaro and Jadbabaie 2006; Kempe and McSherry 2008; Knorn et al. 2009; Oreshkin et al. 2010; Sabattini et al. 2011; Yang et al. 2010). All the above techniques are often integrated with appropriate controllers that regulate the mobility of the nodes while ensuring connectivity of the network. Another way that λ₂(L) can be used to ensure connectivity of dynamic graphs is via optimization-based methods that maximize it away from its zero value. Such approaches were initially centralized, as connectivity is a global property of a graph (Kim and Mesbahi 2006), although recently distributed subgradient algorithms (DeGennaro and Jadbabaie 2006) as well as non-iterative decomposition techniques (Simonetto et al. 2013) have also been proposed. As the algebraic connectivity is a non-differentiable function of the Laplacian matrix, designing continuous feedback controllers to maintain it positive definite is a challenging task. This problem was overcome in Zavlanos and Pappas (2007) via the use of gradient flows that maintain positive definiteness of the determinant of the Laplacian matrix projected onto the space perpendicular to the eigenvector of ones.

Connectivity Using the Graph Adjacency Matrix

Alternatively, connectivity can be captured by the sum of powers ∑_{k=0}^K A^k of the adjacency matrix A ∈ R^{n×n} of the network, for K ≥ n − 1. The entries of the adjacency matrix are typically related to the weights in W as [A]_ij = w_ij. For disc-based graphs as in Fig. 1a, the i,j entry of the kth power of the adjacency matrix, [A^k]_ij, captures the number of paths of length k between nodes i and j; for weighted graphs, [A^k]_ij captures a weighted sum of those paths. Therefore, the entries of ∑_{k=0}^K A^k represent the number of paths up to length K between every pair of nodes in the graph (Godsil and Royle 2001). By definition of graph connectivity, if all entries of ∑_{k=0}^K A^k are positive for K = n − 1, then the network is connected. Clearly, for K < n − 1, not all entries of ∑_{k=0}^K A^k are necessarily positive, even if the graph is connected. Maintaining positivity of the positive entries of ∑_{k=0}^K A^k of an initially connected graph maintains paths of length K between the corresponding nodes and, as shown in Zavlanos and Pappas (2005), is sufficient to maintain connectivity of the graph throughout.

The ability to capture graph connectivity using the adjacency matrix has given rise to optimization-based connectivity controllers (Srivastava and Spong 2008; Zavlanos and Pappas 2005) that are often centralized, owing to the multi-hop dependencies between nodes introduced by the powers of the adjacency matrix. Since smaller powers correspond to shorter dependencies (paths), decentralization becomes possible as K decreases. If K = 1, connectivity maintenance reduces to preserving the pairwise links between the nodes in an initially connected network. Since the adjacency matrix of weighted graphs is often a differentiable function, this approach can result in continuous feedback solution techniques. Discrete-time approaches are discussed in Ando et al. (1999), Notarstefano et al. (2006), and Bullo et al. (2009), while Spanos and Murray (2004), Dimarogonas and Kyriakopoulos (2008), Cornejo and Lynch (2008), Yao and Gupta (2009), Zavlanos et al. (2007), and Ji and Egerstedt (2007) rely on local gradients that
may also incorporate switching in the case of link additions. Switching between arbitrary spanning topologies has also been studied in the literature, with the spanning subgraphs being updated by local auctions (Zavlanos and Pappas 2008), distributed spanning tree algorithms (Wagenpfeil et al. 2009), combinations of information dissemination algorithms and graph picking games (Schuresko and Cortes 2009b), or intermediate rendezvous (Schuresko and Cortes 2009a; Spanos and Murray 2005). This class of approaches is typically hybrid, combining continuous link maintenance and discrete topology control. The algebraic connectivity λ₂(L) and number-of-paths ∑_{k=0}^K A^k metrics can also be combined to give controllers that maintain connectivity while enforcing desired multi-hop neighborhoods for all agents (Stump et al. 2008).

A recent, comprehensive survey on graph-theoretic approaches for connectivity control of dynamic graphs can be found in Zavlanos et al. (2011).

Applications in Mobile Robot Network Control

Methods to control connectivity of dynamic graphs have been successfully applied to multiple scenarios that require network connectivity to achieve a global coordinated objective. Indicative of the impact of this work is recent literature on connectivity-preserving rendezvous (Ando et al. 1999; Cortes et al. 2006; Dimarogonas and Kyriakopoulos 2008; Ganguli et al. 2009; Ji and Egerstedt 2007), flocking (Zavlanos et al. 2007, 2009), and formation control (Ji and Egerstedt 2007; Schuresko and Cortes 2009a), where so far connectivity had been an assumption. Further extensions and contributions involve connectivity control for double-integrator agents (Notarstefano et al. 2006), agents with bounded inputs (Ajorlou and Aghdam 2010; Ajorlou et al. 2010; Dimarogonas and Johansson 2008), and indoor navigation (Stump et al. 2008), as well as for communication based on radio signal strength (Hsieh et al. 2008; Mostofi 2009; Powers and Balch 2004; Wagner and Arkin 2004) and visibility constraints (Anderson et al. 2003; Ando et al. 1999; Arkin and Diaz 2002; Flocchini et al. 2005; Ganguli et al. 2009). Periodic connectivity for robot teams that need to occasionally split in order to achieve individual objectives (Hollinger and Singh 2010; Zavlanos 2010) and sufficient conditions for connectivity in leader-follower networks (Gustavi et al. 2010) also add to the list. Early experimental results have demonstrated the efficiency of these algorithms also in practice (Hollinger and Singh 2010; Michael et al. 2009; Tardioli et al. 2010).

Summary and Future Directions

Although graphs provide a simple abstraction of inter-robot communications, it has long been recognized that, since links in a wireless network do not entail tangible connections, associating links with arcs on a graph can be somewhat arbitrary. Indeed, topological definitions of connectivity start by setting target signal strengths to draw the corresponding graph. Even small differences in target strengths might result in dramatic differences in network topology (Lundgren et al. 2002). As a result, graph connectivity is necessary but not nearly sufficient to guarantee communication integrity, interpreted as the ability of a network to support desired communication rates.

To address these challenges, a new body of work has recently appeared that departs from traditional graph-based models of communication. Specifically, Zavlanos et al. (2013) employ a simple, yet effective, modification that relies on weighted graph models with weights that capture the packet error probability of each link (DeCouto et al. 2006). When using reliabilities as link metrics, it is possible to model routing and scheduling problems as optimization problems that accept link reliabilities as inputs (Ribeiro et al. 2007, 2008). The key idea proposed in Zavlanos et al. (2013) is to define connectivity in terms of communication rates and to use optimization formulations to describe optimal operating points of wireless networks. Then, the
communication variables are updated in discrete time via a distributed gradient descent algorithm on the dual function, while robot motion is regulated in continuous time by means of appropriate distributed barrier potentials that maintain desired communication rates. Related approaches consider optimal communications based on T-slot time averages of the primal variables for general mobility schemes (Neely 2010), as well as optimization of mobility and communications based on the end-to-end bit error rate between nodes (Ghaffarkhah and Mostofi 2011; Yan and Mostofi 2012).

Cross-References
Flocking in Networked Systems
Graphs for Modeling Networked Interactions

Bibliography
Ajorlou A, Aghdam AG (2010) A class of bounded distributed controllers for connectivity preservation of unicycles. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 3072–3077
Ajorlou A, Momeni A, Aghdam AG (2010) A class of bounded distributed control strategies for connectivity preservation in multi-agent systems. IEEE Trans Autom Control 55(12):2828–2833
Anderson SO, Simmons R, Goldberg D (2003) Maintaining line-of-sight communications networks between planetary rovers. In: Proceedings of the 2003 IEEE/RSJ international conference on intelligent robots and systems, Las Vegas, pp 2266–2272
Ando H, Oasa Y, Suzuki I, Yamashita M (1999) Distributed memoryless point convergence algorithm for mobile robots with limited visibility. IEEE Trans Robot Autom 15(5):818–828
Arkin RC, Diaz J (2002) Line-of-sight constrained exploration for reactive multiagent robotic teams. In: Proceedings of the 7th international workshop on advanced motion control, Maribor, pp 455–461
Bullo F, Cortes J, Martinez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press, Princeton
Cornejo A, Lynch N (2008) Connectivity service for mobile ad-hoc networks. In: Proceedings of the 2nd IEEE international conference on self-adaptive and self-organizing systems workshops, pp 292–297
Cortes J, Martinez S, Bullo F (2006) Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans Autom Control 51(8):1289–1298
DeCouto D, Aguayo D, Bicket J, Morris R (2006) A high-throughput path metric for multihop wireless routing. In: Proceedings of the international ACM conference on mobile computing and networking, San Diego, pp 134–146
DeGennaro MC, Jadbabaie A (2006) Decentralized control of connectivity for multi-agent systems. In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 3628–3633
Dimarogonas DV, Johansson KH (2008) Decentralized connectivity maintenance in mobile networks with bounded inputs. In: Proceedings of the IEEE international conference on robotics and automation, Pasadena, pp 1507–1512
Dimarogonas DV, Kyriakopoulos KJ (2008) Connectedness preserving distributed swarm aggregation for multiple kinematic robots. IEEE Trans Robot 24(5):1213–1223
Fax A, Murray RM (2004) Information flow and cooperative control of vehicle formations. IEEE Trans Autom Control 49:1465–1476
Fiedler M (1973) Algebraic connectivity of graphs. Czechoslovak Math J 23(98):298–305
Flocchini P, Prencipe G, Santoro N, Widmayer P (2005) Gathering of asynchronous oblivious robots with limited visibility. Theor Comput Sci 337(1–3):147–168
Franceschelli M, Gasparri A, Giua A, Seatzu C (2013) Decentralized estimation of Laplacian eigenvalues in multi-agent systems. Automatica 49(4):1031–1036
Ganguli A, Cortes J, Bullo F (2009) Multirobot rendezvous with visibility sensors in nonconvex environments. IEEE Trans Robot 25(2):340–352
Ghaffarkhah A, Mostofi Y (2011) Communication-aware motion planning in mobile networks. IEEE Trans Autom Control Spec Issue Wirel Sens Actuator Netw 56(10):2478–248
Godsil C, Royle G (2001) Algebraic graph theory. Graduate texts in mathematics, vol 207. Springer, Berlin
Gustavi T, Dimarogonas DV, Egerstedt M, Hu X (2010) Sufficient conditions for connectivity maintenance and rendezvous in leader-follower networks. Automatica 46(1):133–139
Hollinger G, Singh S (2010) Multi-robot coordination with periodic connectivity. In: Proceedings of the IEEE international conference on robotics and automation, Anchorage, Alaska, pp 4457–4462
Hsieh MA, Cowley A, Kumar V, Taylor C (2008) Maintaining network connectivity and performance in robot teams. J Field Robot 25(1–2):111–131
Ji M, Egerstedt M (2007) Coordination control of multi-agent systems while preserving connectedness. IEEE Trans Robot 23(4):693–703
Kempe D, McSherry F (2008) A decentralized algorithm for spectral analysis. J Comput Syst Sci 74(1):70–83
Kim Y, Mesbahi M (2006) On maximizing the second smallest eigenvalue of a state-dependent graph Laplacian. IEEE Trans Autom Control 51(1):116–120
Knorn F, Stanojevic R, Corless M, Shorten R (2009) A framework for decentralized feedback connectivity control with application to sensor networks. Int J Control 82(11):2095–2114
Lundgren H, Nordstrom E, Tschudin C (2002) The gray zone problem in IEEE 802.11b based ad hoc networks. ACM SIGMOBILE Mobile Comput Commun Rev 6(3):104–105
Lynch N (1997) Distributed algorithms. Morgan Kaufmann, San Francisco
Merris R (1994) Laplacian matrices of a graph: a survey. Linear Algebra Appl 197:143–176
Michael N, Zavlanos MM, Kumar V, Pappas GJ (2009) Maintaining connectivity in mobile robot networks. In: Experimental robotics. Tracts in advanced robotics. Springer, Berlin/Heidelberg, pp 117–126
Mohar B (1991) The Laplacian spectrum of graphs. In: Alavi Y, Chartrand G, Ollermann O, Schwenk A (eds) Graph theory, combinatorics, and applications. Wiley, New York, pp 871–898
Montijano E, Montijano JI, Sagues C (2011) Adaptive consensus and algebraic connectivity estimation in sensor networks with Chebyshev polynomials. In: Proceedings of the 50th IEEE conference on decision and control, Orlando, pp 4296–4301
Mostofi Y (2009) Decentralized communication-aware motion planning in mobile networks: an information-gain approach. J Intell Robot Syst 56(1–2):233–256
Neely MJ (2010) Universal scheduling for networks with arbitrary traffic, channels, and mobility. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 1822–1829
Neskovic A, Neskovic N, Paunovic G (2000) Modern approaches in modeling of mobile radio systems propagation environment. IEEE Commun Surv 3(3):1–12
Notarstefano G, Savla K, Bullo F, Jadbabaie A (2006) Maintaining limited-range connectivity among second-order agents. In: Proceedings of the 2006 American control conference, Minneapolis, pp 2124–2129
Olfati-Saber R, Murray RM (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49:1520–1533
Oreshkin BN, Coates MJ, Rabbat MG (2010) Optimization and analysis of distributed averaging with short node memory. IEEE Trans Signal Process 58(5):2850–2865
Pahlavan K, Levesque AH (1995) Wireless information networks. Wiley, New York
Parsons JD (2000) The mobile radio propagation channel. Wiley, Chichester
Pecora L, Carroll T (1998) Master stability functions for synchronized coupled systems. Phys Rev Lett 80:2109–2112
Powers M, Balch T (2004) Value-based communication preservation for mobile robots. In: Proceedings of the 7th international symposium on distributed autonomous robotic systems, Toulouse
Preciado V (2008) Spectral analysis for stochastic models of large-scale complex dynamical networks. Ph.D. dissertation, Department of Electrical Engineering and Computer Science, MIT
Preciado V, Verghese G (2005) Synchronization in generalized Erdős–Rényi networks of nonlinear oscillators. In: 44th IEEE conference on decision and control, Seville, Spain, pp 4628–463
Ribeiro A, Luo Z-Q, Sidiropoulos ND, Giannakis GB (2007) Modelling and optimization of stochastic routing for wireless multihop networks. In: Proceedings of the 26th annual joint conference of the IEEE Computer and Communications Societies (INFOCOM), Anchorage, pp 1748–1756
Ribeiro A, Sidiropoulos ND, Giannakis GB (2008) Optimal distributed stochastic routing algorithms for wireless multihop networks. IEEE Trans Wirel Commun 7(11):4261–4272
Sabattini L, Chopra N, Secchi C (2011) On decentralized connectivity maintenance for mobile robotic systems. In: Proceedings of the 50th IEEE conference on decision and control, Orlando, pp 988–993
Schuresko M, Cortes J (2009a) Distributed tree rearrangements for reachability and robust connectivity. In: Hybrid systems: computation and control. Lecture notes in computer science, vol 5469. Springer, Berlin/New York, pp 470–474
Schuresko M, Cortes J (2009b) Distributed motion constraints for algebraic connectivity of robotic networks. J Intell Robot Syst 56(1–2):99–126
Simonetto A, Keviczky T, Babuska R (2013) Constrained distributed algebraic connectivity maximization in robotic networks. Automatica 49(5):1348–1357
Spanos DP, Murray RM (2004) Robust connectivity of networked vehicles. In: Proceedings of the 43rd IEEE conference on decision and control, Bahamas, pp 2893–2898
Spanos DP, Murray RM (2005) Motion planning with wireless network constraints. In: Proceedings of the 2005 American control conference, Portland, pp 87–92
Srivastava K, Spong MW (2008) Multi-agent coordination under connectivity constraints. In: Proceedings of the 2008 American control conference, Seattle, pp 2648–2653
Stump E, Jadbabaie A, Kumar V (2008) Connectivity management in mobile robot teams. In: Proceedings of the IEEE international conference on robotics and automation, Pasadena, pp 1525–1530
Tardioli D, Mosteo AR, Riazuelo L, Villarroel JL, Montano L (2010) Enforcing network connectivity in robot team missions. Int J Robot Res 29(4):460–480
Van Mieghem P, Omic J, Kooij R (2009) Virus spread in networks. IEEE/ACM Trans Networking 17(1):1–14
Wagenpfeil J, Trachte A, Hatanaka T, Fujita M, Sawodny O (2009) A distributed minimum restrictive connectivity maintenance algorithm. In: Proceedings of the 9th international symposium on robot control, Gifu
Wagner AR, Arkin RC (2004) Communication-sensitive multi-robot reconnaissance. In: Proceedings of the
Dynamic Noncooperative Games 323
IEEE international conference on robotics and automation, New Orleans, pp 2480–2487
Yan Y, Mostofi Y (2012) Robotic router formation in realistic communication environments. IEEE Trans Robot 28(4):810–827
Yang P, Freeman RA, Gordon GJ, Lynch KM, Srinivasa SS, Sukthankar R (2010) Decentralized estimation and control of graph connectivity for mobile sensor networks. Automatica 46(2):390–396
Yao Z, Gupta K (2009) Backbone-based connectivity control for mobile networks. In: Proceedings of the IEEE international conference on robotics and automation, Kobe, pp 1133–1139
Zavlanos MM (2010) Synchronous rendezvous of very-low-range wireless agents. In: Proceedings of the 49th IEEE conference on decision and control, Atlanta, pp 4740–4745
Zavlanos MM, Pappas GJ (2005) Controlling connectivity of dynamic graphs. In: Proceedings of the 44th IEEE conference on decision and control and European control conference, Seville, pp 6388–6393
Zavlanos MM, Pappas GJ (2007) Potential fields for maintaining connectivity of mobile networks. IEEE Trans Robot 23(4):812–816
Zavlanos MM, Pappas GJ (2008) Distributed connectivity control of mobile networks. IEEE Trans Robot 24(6):1416–1428
Zavlanos MM, Jadbabaie A, Pappas GJ (2007) Flocking while preserving network connectivity. In: Proceedings of the 46th IEEE conference on decision and control, New Orleans, pp 2919–2924
Zavlanos MM, Tanner HG, Jadbabaie A, Pappas GJ (2009) Hybrid control for connectivity preserving flocking. IEEE Trans Autom Control 54(12):2869–2875
Zavlanos MM, Egerstedt MB, Pappas GJ (2011) Graph theoretic connectivity control of mobile robot networks. Proc IEEE Spec Issue Swarming Nat Eng Syst 99(9):1525–154
Zavlanos MM, Ribeiro A, Pappas GJ (2013) Network integrity in mobile robotic networks. IEEE Trans Autom Control 58(1):3–18

player is finite, and discuss briefly extensions to infinite games.

Keywords
Extensive form games; Finite games; Nash equilibrium

Introduction

Dynamic noncooperative games allow multiple actions by individual players, and include explicit representations of the information available to each player for selecting its decision. Such games have a complex temporal order of play and an information structure that reflects uncertainty as to what individual players know when they have to make decisions. This temporal order and information structure is not evident when the game is represented as a static game between players that select strategies. Dynamic games often incorporate explicit uncertainty in outcomes, by representing such outcomes as actions taken by a random player (called chance or "Nature") with known probability distributions for selecting its actions.

We focus our exposition on models of finite games and discuss briefly extensions to infinite games at the end of the entry.

Finite Games in Extensive Form
This representation of payoffs in terms of strategies transforms the game from an extensive form representation to a strategic or normal form representation, where the concept of dynamics and information has been abstracted away. The resulting strategic form looks like a static game as discussed in the encyclopedia entry Strategic Form Games and Nash Equilibrium, where each player selects his/her strategy from a finite set, resulting in a vector of payoffs for the players. Using these payoffs, one can now define solution concepts for the game. Let the notation σ_{−i} = (σ_1, …, σ_{i−1}, σ_{i+1}, …, σ_n) denote the set of strategies in a tuple excluding the ith strategy.

Theorem 1 (Nash 1950, 1951) Every finite n-person game has at least one Nash equilibrium point in mixed strategies.

Mixed strategies suggest that each player's randomization occurs before the game is played, by choosing a strategy at random from its choices of pure strategies. For games in extensive form, one can introduce a different class of strategies where a player makes a random choice of action at each information set, according to a probability distribution that depends on the specific information set. The choice of action is selected independently at each information set according
to a selected probability distribution, in a manner similar to how Nature's actions are selected. These random choice strategies are known as behavior strategies. Let Δ(A(h_i^k)) denote the set of probability distributions over the decisions A(h_i^k) for player i at information set h_i^k. A behavior strategy for player i is an element x_i ∈ ∏_{h_i^k ∈ H_i} Δ(A(h_i^k)), where x_i(h_i^k) denotes the probability distribution of the behavior strategy x_i over the available decisions at information set h_i^k.

Note that the space of admissible behavior strategies is much smaller than the space of admissible mixed strategies. To illustrate this, consider a player with K information sets and two possible decisions for each information set. The number of possible pure strategies would be 2^K, and thus the space of mixed strategies would be a probability simplex of dimension 2^K − 1. In contrast, behavior strategies would require specifying probability distributions over two choices for each of the K information sets, so the space of behavior strategies would be a product space of K probability simplices of dimension 1, totaling dimension K. One way of understanding this difference is that mixed strategies introduce correlated randomization across choices at different information sets, whereas behavior strategies introduce independent randomization across such choices.

For every behavior strategy, one can find an equivalent mixed strategy by computing the probabilities of every set of actions that result from the behavior strategy. The converse is not true for general games in extensive form. However, there is a special class of games for which the converse is true. In this class of games, players recall what actions they have selected previously and what information they knew previously. A formal definition of perfect recall is beyond the scope of this exposition but can be found in Hart (1991) and Kuhn (1953). The implication of perfect recall is summarized below:

Theorem 2 (Kuhn 1953) Given a finite n-person game in which player i has perfect recall, for each mixed strategy σ_i for player i, there exists a corresponding behavior strategy x_i that is equivalent, where every player receives the same payoffs under both strategies σ_i and x_i.

This equivalence was extended by Aumann to infinite games (Aumann 1964). In dynamic games, it is common to assume that each player has perfect recall, and thus solutions can be found in the smaller space of behavior strategies.

Computation of Equilibria

Algorithms for the computation of mixed-strategy Nash equilibria of static games can be extended to compute mixed-strategy Nash equilibria for games in extensive form when the pure strategies are enumerated as above. However, the number of pure strategies grows exponentially with the size of the extensive form tree, making these methods hard to apply. For two-person games in extensive form with perfect recall by both players, one can search for Nash equilibria in the much smaller space of behavior strategies. This was exploited in Koller et al. (1996) to obtain efficient linear complementarity problems for nonzero-sum games and linear programs for zero-sum games, where the number of variables involved is linear in the number of internal decision nodes of the extensive form of the game. A more detailed overview of computation algorithms for Nash equilibria can be found in McKelvey and McLennan (1996). The Gambit Web site (McKelvey et al. 2010) provides software implementations of several techniques for computation of Nash equilibria in two- and n-person games.

An alternative approach to computing Nash equilibria for games in extensive form is based on subgame decomposition, discussed next.

Subgames

Consider a game G in extensive form. A node c is a successor of a node n if there is a path in the game tree from n to c. Let h be a node in G that is not terminal and is the only node in its information set. Assume that if a node c is a successor of h, then every node in the information set containing c is also a successor of h. In this situation, one can define a subgame H of G with root node h, which consists of node h and
vessels, drilling rigs (semisubmersibles) and ships, shuttle tankers, cable and pipe layers, floating production, storage, and off-loading units (FPSOs), crane and heavy lift vessels, geological survey vessels, rescue vessels, and multipurpose construction vessels. DP systems are also installed on cruise ships, yachts, fishing boats, navy ships, tankers, and others.

A DP vessel is defined by the International Maritime Organization (IMO) and the maritime class societies (DNV GL, ABS, LR, etc.) as a vessel that maintains its position and heading (fixed location, denoted stationkeeping, or a predetermined track) exclusively by means of active thrusters. The DP system as defined by the class societies is not limited to the DP control system, including computers and cabling. Position reference systems of various types measuring North-East coordinates (satellites, hydroacoustic, optics, taut wire, etc.), sensors (heading, roll, pitch, wind speed and direction, etc.), the power system, the thruster and propulsion system, and an independent joystick system are also essential parts of the DP system. In addition, the DP operator is an important element securing safe and efficient DP operations. Further development of human-machine interfaces, alarm systems, and operator decision support systems is regarded as a top priority, bridging advanced and complex technology to safe and efficient marine operations. Sufficient DP operator training is a part of this.

The thruster and propulsion system controlled by the DP control system is regarded as one of the main power consumers on the DP vessel. An important control system for successful integration with the power plant and the other power consumers, such as the drilling system, process system, and heating and ventilation system, is the power and energy management system (PMS/EMS), balancing safety requirements and energy efficiency. The PMS/EMS controls the power generation and distribution and the load control of heavy power consumers. In this context, both transient and steady-state behaviors are of relevance. A thorough understanding of the hydrodynamics, the dynamics between coupled systems, the load characteristics of the various power consumers, the control system architecture, the control layers, and the power system, propulsion system, and sensors is important for successful design and operation of DP systems and DP vessels.

Thruster-assisted position mooring is another important stationkeeping application, often used for FPSOs, drilling rigs, and shuttle tanker operations, where the DP system has to be redesigned accounting for the effect of the mooring system dynamics. In thruster-assisted position mooring, the DP system is renamed a position mooring (PM) system. PM systems have been commercially available since the 1980s. While for DP-operated ships the thrusters are the sole source of the stationkeeping, in PM systems the assistance of thrusters is only complementary to the mooring system. Here, most of the stationkeeping is provided by a deployed anchor system. In severe environmental conditions, the thrust assistance is used to minimize the vessel excursions and line tension, mainly by increasing the damping in terms of velocity feedback control and by adding a bias force minimizing the mean tensions of the most loaded mooring lines. Modeling and control of turret-anchored ships are treated in Strand et al. (1998) and Nguyen and Sørensen (2009).

Overviews of DP systems, including references, can be found in Fay (1989), Fossen (2011), and Sørensen (2011). The scientific and industrial contributions since the 1960s are vast, and many research groups worldwide have provided important results. In Sørensen et al. (2012), the development of DP systems for ROVs is presented.

Mathematical Modeling of DP Vessels

Depending on the operational conditions, the vessel models may briefly be classified into stationkeeping, low-velocity, and high-velocity models. As shown in Sørensen (2011) and Fossen (2011) and the references therein, different model reduction techniques are used for the various speed regimes. Vessel motions in waves are defined as seakeeping and will here apply both for stationkeeping (zero speed) and forward speed. DP vessels or PM vessels can in general be regarded as stationkeeping and low-velocity or low Froude
Dynamic Positioning Control Systems for Ships and Underwater Vehicles 331
number applications. This assumption will particularly be used in the formulation of mathematical models used in conjunction with the controller design. It is common to use a two-time-scale formulation by separating the total model into a low-frequency (LF) model and a wave-frequency (WF) model (seakeeping) by superposition. Hence, the total motion is a sum of the corresponding LF and the WF components. The WF motions are assumed to be caused by first-order wave loads. Assuming small amplitudes, these motions will be well represented by a linear model. The LF motions are assumed to be caused by second-order mean and slowly varying wave loads, current loads, wind loads, mooring (if any), and thrust and rudder forces and moments. For underwater vehicles operating below the wave zone, estimated to be deeper than half the wavelength, the wave loads can be disregarded and of course the effect of the wind loads as well.

Modeling Issues
The mathematical models may be formulated in two complexity levels:
• Control plant model is a simplified mathematical description containing only the main physical properties of the process or plant. This model may constitute a part of the controller. The control plant model is also used in analytical stability analysis based on, e.g., Lyapunov stability.
• Process plant model or simulation model is a comprehensive description of the actual process and should be as detailed as needed. The main purpose of this model is to simulate the real plant dynamics. The process plant model is used in numerical performance and robustness analysis and testing of the control systems. As shown above, the process plant models may be implemented for off-line or real-time simulation (e.g., HIL testing; see Johansen et al. 2007) purposes defining different requirements for model fidelity.

Kinematics
The relationship between the Earth-fixed position and orientation of a floating structure and its body-fixed velocities is

$$\begin{bmatrix} \dot{\eta}_1 \\ \dot{\eta}_2 \end{bmatrix} = \begin{bmatrix} J_1(\eta_2) & 0_{3\times 3} \\ 0_{3\times 3} & J_2(\eta_2) \end{bmatrix} \begin{bmatrix} \nu_1 \\ \nu_2 \end{bmatrix}. \quad (1)$$

The vectors defining the Earth-fixed vessel position ($\eta_1$) and orientation ($\eta_2$) using Euler angles and the body-fixed translation ($\nu_1$) and rotation ($\nu_2$) velocities are given by

$$\eta_1 = [x \;\; y \;\; z]^T, \quad \eta_2 = [\phi \;\; \theta \;\; \psi]^T, \quad \nu_1 = [u \;\; v \;\; w]^T, \quad \nu_2 = [p \;\; q \;\; r]^T. \quad (2)$$

The rotation matrix $J_1(\eta_2) \in SO(3)$ and the velocity transformation matrix $J_2(\eta_2) \in \mathbb{R}^{3\times 3}$ are defined in Fossen (2011). For ships, if only surge, sway, and yaw (3 DOF) are considered, the kinematics and the state vectors are reduced to

$$\dot{\eta} = R(\psi)\nu, \quad \text{or} \quad \begin{bmatrix} \dot{x} \\ \dot{y} \\ \dot{\psi} \end{bmatrix} = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} u \\ v \\ r \end{bmatrix}. \quad (3)$$

Process Plant Model: Low-Frequency Motion
The 6-DOF LF model formulation is based on Fossen (2011) and Sørensen (2011). The equations of motion for the nonlinear LF model of a floating vessel are given by

$$M\dot{\nu} + C_{RB}(\nu)\nu + C_A(\nu_r)\nu_r + D(\nu_r) + G(\eta) = \tau_{wave2} + \tau_{wind} + \tau_{thr} + \tau_{moor}, \quad (4)$$

where $M \in \mathbb{R}^{6\times 6}$ is the system inertia matrix including added mass; $C_{RB}(\nu) \in \mathbb{R}^{6\times 6}$ and $C_A(\nu_r) \in \mathbb{R}^{6\times 6}$ are the skew-symmetric Coriolis and centripetal matrices of the rigid body and the added mass; $G(\eta) \in \mathbb{R}^6$ is the generalized restoring vector caused by the mooring lines (if any), buoyancy, and gravitation; $\tau_{thr} \in \mathbb{R}^6$ is the control vector consisting of forces and moments produced by the thruster system; $\tau_{wind}$ and $\tau_{wave2} \in \mathbb{R}^6$ are the wind and second-order wave load vectors, respectively.
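To make the reduced kinematics concrete, the sketch below Euler-integrates the 3-DOF relation of Eq. (3) for a constant body-fixed velocity. The velocity vector, time step, and horizon are illustrative assumptions, not values from the entry.

```python
import numpy as np

def R(psi):
    """Rotation from the body-fixed to the Earth-fixed frame, Eq. (3)."""
    c, s = np.cos(psi), np.sin(psi)
    return np.array([[c,  -s,  0.0],
                     [s,   c,  0.0],
                     [0.0, 0.0, 1.0]])

# Euler integration of eta_dot = R(psi) * nu for a constant body-fixed
# velocity (u = 1 m/s surge, r = 0.1 rad/s yaw rate; illustrative values).
eta = np.zeros(3)                 # [x, y, psi]
nu = np.array([1.0, 0.0, 0.1])    # [u, v, r]
dt, steps = 0.1, 100              # 10 s of simulated motion
for _ in range(steps):
    eta = eta + dt * (R(eta[2]) @ nu)
```

With a nonzero yaw rate the heading builds up linearly while the surge velocity is rotated into a curved Earth-fixed path, which is exactly why the heading-dependent rotation cannot be dropped even in the 3-DOF model.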
The damping vector may be divided into linear and nonlinear terms according to

$$D(\nu_r) = d_L(\nu_r, \zeta)\,\nu_r + d_{NL}(\nu_r, \gamma_r)\,\nu_r, \quad (5)$$

where $\nu_r \in \mathbb{R}^6$ is the relative velocity vector between the current and the vessel. The nonlinear damping, $d_{NL}$, is assumed to be caused by turbulent skin friction and viscous eddy-making, also denoted as vortex shedding (Faltinsen 1990). The strictly positive linear damping matrix $d_L \in \mathbb{R}^{6\times 6}$ is caused by linear laminar skin friction and is assumed to vanish for increasing speed according to

$$d_L(\nu_r, \zeta) = \begin{bmatrix} X_{u_r} e^{-\zeta|u_r|} & \cdots & X_r e^{-\zeta|r|} \\ \vdots & \ddots & \vdots \\ N_{u_r} e^{-\zeta|u_r|} & \cdots & N_r e^{-\zeta|r|} \end{bmatrix}, \quad (6)$$

where $\zeta$ is a positive scaling constant such that $\zeta \in \mathbb{R}^+$.

Process Plant Model: Wave-Frequency Motion
The coupled equations of the WF motions in surge, sway, heave, roll, pitch, and yaw are assumed to be linear and can be formulated as

$$M(\omega)\ddot{\eta}_{Rw} + D_p(\omega)\dot{\eta}_{Rw} + G\eta_{Rw} = \tau_{wave1}, \qquad \dot{\eta}_w = J(\bar{\eta}_2)\dot{\eta}_{Rw}, \quad (7)$$

where $\eta_{Rw} \in \mathbb{R}^6$ is the WF motion vector in the hydrodynamics frame, $\eta_w \in \mathbb{R}^6$ is the WF motion vector in the Earth-fixed frame, and $\tau_{wave1} \in \mathbb{R}^6$ is the first-order wave excitation vector, which will be modified for varying vessel headings relative to the incident wave direction. $M(\omega) \in \mathbb{R}^{6\times 6}$ is the system inertia matrix containing frequency-dependent added mass coefficients in addition to the vessel's mass and moment of inertia. $D_p(\omega) \in \mathbb{R}^{6\times 6}$ is the wave radiation (potential) damping matrix. The linearized restoring coefficient matrix $G \in \mathbb{R}^{6\times 6}$ is due to gravity and buoyancy affecting heave, roll, and pitch only. For anchored vessels, it is assumed that the mooring system will not influence the WF motions.

Remark 1. Generally, a time-domain equation cannot be expressed with a frequency-domain coefficient $\omega$. However, this is a commonly used formulation denoted as a pseudo-differential equation. An important feature of the added mass terms and the wave radiation damping terms is the memory effects, which in particular are important to consider for nonstationary cases, e.g., rapid changes of heading angle. Memory effects can be taken into account by introducing a convolution integral or a so-called retardation function (Newman 1977) or state-space models as suggested by Fossen (2011).

Control Plant Model
For the purpose of controller design and analysis, it is convenient to apply model reduction and derive a LF and WF control plant model in surge, sway, and yaw about zero vessel velocity according to

$$\dot{p}_w = A_{pw} p_w + E_{pw} w_{pw}, \quad (8)$$
$$\dot{\eta} = R(\psi)\nu, \quad (9)$$
$$\dot{b} = -T_b^{-1} b + E_b w_b, \quad (10)$$
$$M\dot{\nu} = -D_L \nu + R^T(\psi) b + \tau, \quad (11)$$
$$\dot{\omega}_p = 0, \quad (12)$$
$$y = \left[ (\eta + C_{pw} p_w)^T \;\; \omega_p \right]^T, \quad (13)$$

where $\omega_p \in \mathbb{R}$ is the peak frequency of the waves (PFW). The estimated PFW can be calculated by spectral analysis of the pitch and roll measurements, which are assumed to oscillate dominantly at the peak wave frequency. In the spectral analysis, the discrete Fourier transforms of the measured roll and pitch, collected over a period of time, are computed by taking the n-point fast Fourier transform (FFT). The PFW may be found as the frequency at which the power spectrum is maximal. The assumption $\dot{\omega}_p = 0$ is valid for a slowly varying sea state. The wave frequency may also be found using nonlinear observers/EKF and signal-processing techniques. It is assumed that the second-order linear model is sufficient to describe the first-order wave-induced motions, and then $p_w \in \mathbb{R}^6$ is the state of the WF model. $A_{pw} \in \mathbb{R}^{6\times 6}$ is assumed Hurwitz and describes the first-order wave-induced motion as a
Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 1 Controller structure
Dynamic Positioning Control Systems for Ships and Underwater Vehicles, Fig. 2 Thrust allocation
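The FFT-based peak-wave-frequency (PFW) estimation described for the control plant model can be sketched in a few lines. The sample rate, record length, and wave frequency below are assumed illustrative values, and the signal is noise-free for clarity.

```python
import numpy as np

n, fs = 512, 1.0                     # n-point FFT, 1 Hz sampling (assumed)
t = np.arange(n) / fs
f_wave = 0.125                       # "true" peak wave frequency [Hz] (assumed)
pitch = 0.05 * np.sin(2 * np.pi * f_wave * t)   # synthetic pitch record [rad]

power = np.abs(np.fft.rfft(pitch)) ** 2         # power spectrum of the record
freqs = np.fft.rfftfreq(n, d=1.0 / fs)
f_peak = freqs[np.argmax(power)]                # frequency of maximal power
```

In practice the roll and pitch records are noisy, so the spectral peak would be tracked over consecutive windows, consistent with the slowly varying sea-state assumption behind Eq. (12).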
(2008) and Smogeli and Sørensen (2009) presented the concept of anti-spin thruster control.

Cross-References
Control of Ship Roll Motion
Fault-Tolerant Control
Mathematical Models of Ships and Underwater Vehicles
Motion Planning for Marine Control Systems
Underactuated Marine Control Systems

Bibliography
Balchen JG, Jenssen NA, Sælid S (1976) Dynamic positioning using Kalman filtering and optimal control theory. In: IFAC/IFIP symposium on automation in offshore oil field operation, Amsterdam, pp 183–186
Basseville M, Nikiforov IV (1993) Detection of abrupt changes: theory and application. Prentice-Hall, Englewood Cliffs. ISBN 0-13-126780-9
Blanke M, Kinnaert M, Lunze J, Staroswiecki M (2003) Diagnostics and fault-tolerant control. Springer, Berlin
Faltinsen OM (1990) Sea loads on ships and offshore structures. Cambridge University Press, Cambridge
Fay H (1989) Dynamic positioning systems, principles, design and applications. Editions Technip, Paris. ISBN 2-7108-0580-4
Fossen TI (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Fossen TI, Strand JP (1999) Passive nonlinear observer design for ships using Lyapunov methods: experimental results with a supply vessel. Automatica 35(1):3–16
Grimble MJ, Johnson MA (1988) Optimal control and stochastic estimation: theory and applications, vols 1 and 2. Wiley, Chichester
Johansen TA, Fossen TI (2013) Control allocation—a survey. Automatica 49(5):1087–1103
Johansen TA, Sørensen AJ, Nordahl OJ, Mo O, Fossen TI (2007) Experiences from hardware-in-the-loop (HIL) testing of dynamic positioning and power management systems. In: OSV Singapore, Singapore
Newman JN (1977) Marine hydrodynamics. MIT, Cambridge
Nguyen TD, Sørensen AJ (2009) Setpoint chasing for thruster-assisted position mooring. IEEE J Ocean Eng 34(4):548–558
Nguyen TD, Sørensen AJ, Quek ST (2007) Design of hybrid controller for dynamic positioning from calm to extreme sea conditions. Automatica 43(5):768–785
Pettersen KY, Fossen TI (2000) Underactuated dynamic positioning of a ship – experimental results. IEEE Trans Control Syst Technol 8(4):856–863
Ruth E, Smogeli ØN, Perez T, Sørensen AJ (2009) Anti-spin thrust allocation for marine vessels. IEEE Trans Control Syst Technol 17(6):1257–1269
Sælid S, Jenssen NA, Balchen JG (1983) Design and analysis of a dynamic positioning system based on Kalman filtering and optimal control. IEEE Trans Autom Control 28(3):331–339
Smogeli ØN, Sørensen AJ (2009) Antispin thruster control for ships. IEEE Trans Control Syst Technol 17(6):1362–1375
Smogeli ØN, Sørensen AJ, Minsaas KJ (2008) The concept of anti-spin thruster control. Control Eng Pract 16(4):465–481
Strand JP, Fossen TI (1999) Nonlinear passive observer for ships with adaptive wave filtering. In: Nijmeijer H, Fossen TI (eds) New directions in nonlinear observer design. Springer, London, pp 113–134
Strand JP, Sørensen AJ, Fossen TI (1998) Design of automatic thruster assisted position mooring systems for ships. Model Identif Control 19(2):61–75
Sørensen AJ (2005) Structural issues in the design and operation of marine control systems. Annu Rev Control 29(1):125–149
Sørensen AJ (2011) A survey of dynamic positioning control systems. Annu Rev Control 35:123–136
Sørensen AJ, Smogeli ØN (2009) Torque and power control of electrically driven marine propellers. Control Eng Pract 17(9):1053–1064
Sørensen AJ, Strand JP (2000) Positioning of small-waterplane-area marine constructions with roll and pitch damping. Control Eng Pract 8(2):205–213
Sørensen AJ, Dukan F, Ludvigsen M, Fernandez DA, Candeloro M (2012) Development of dynamic positioning and tracking system for the ROV Minerva. In: Roberts G, Sutton B (eds) Further advances in unmanned marine vehicles. IET, London, pp 113–128, chapter 6
$$\limsup_{T \to +\infty} \frac{1}{T} \sum_{t=0}^{T-1} \ell(x(t), u(t)) \le \ell(x^*, u^*). \quad (14)$$

The proof of this fact can be found in Angeli et al. (2012) and Amrit et al. (2011). When periodic solutions are known to outperform, in an average sense, the best equilibrium/control pair, one may replace terminal equality constraints by periodic terminal constraints (see Angeli et al. 2012). This leads to an asymptotic performance at least as good as that of the solution adopted as a terminal constraint.

Stability and Convergence
It is well known that the cost-to-go $V(x)$ as defined in (8) is a natural candidate Lyapunov function for the case of tracking MPC. In fact, the following estimate holds along solutions of the closed-loop system:

$$V(x(t+1)) - V(x(t)) \le -\ell(x(t), u(t)) + \ell(x^*, u^*). \quad (15)$$

This shows, thanks to inequality (5), that $V(x(t))$ is nonincreasing. Owing to this, stability and convergence can be easily achieved under mild additional technical assumptions. While property (15) holds for economic MPC, both in the case of terminal equality constraint and terminal penalty function, it is no longer true that (5) holds. As a matter of fact, $x^*$ might even fail to be an equilibrium of the closed-loop system, and hence, convergence and stability cannot be expected in general.

Intuitively, however, when the most profitable operating regime is an equilibrium, the average performance bound provided by Theorem 3 seems to indicate that some form of stability or convergence to $x^*$ could be expected. This is true under an additional dissipativity assumption which is closely related to the property of optimal operation at steady state.

Definition 1. A system is strictly dissipative with respect to the supply function $s(x,u)$ if there exist a continuous function $\lambda: X \to \mathbb{R}$ and $\rho: X \to \mathbb{R}$ positive definite with respect to $x^*$ such that for all $x$ and $u$ in $X \times U$, it holds:

$$\lambda(f(x,u)) - \lambda(x) \le s(x,u) - \rho(x). \quad (16)$$

The next result highlights the connection between dissipativity of the open-loop system and stability of closed-loop economic MPC.

Theorem 4. Assume that either Assumption 1 or Assumption 2 together with (13) hold. Let the system (1) be strictly dissipative with respect to the supply function $s(x,u) = \ell(x,u) - \ell(x^*,u^*)$ as from Definition 1, and assume there exists a neighborhood of feasible initial states containing $x^*$ in its interior. Then, provided $V$ is continuous at $x^*$, $x^*$ is an asymptotically stable equilibrium with basin of attraction equal to the set of feasible initial states.

See Angeli et al. (2012) and Amrit et al. (2011) for proofs and discussions. Convergence results are also possible for the case of economic MPC subject to average constraints. Details can be found in Müller et al. (2013a).

Hereby it is worth mentioning that finding a function $\lambda$ satisfying (16) (should one exist) is in general a hard task (especially for nonlinear systems and/or nonconvex stage costs); it is akin to the problem of finding a Lyapunov function, and therefore general construction methods do not exist. Let us emphasize, however, that while existence of a storage function is a sufficient condition to ensure convergence of closed-loop economic MPC, formulation and resolution of the optimization problem (8) can be performed irrespectively of any explicit knowledge of such function. Also, we point out that existence of $\lambda$ as in Definition 1 and Theorem 4 is only possible if the optimal infinite-horizon regime of operation for the system is an equilibrium.

Summary and Future Directions
Economic model predictive control is a fairly recent and active area of research with great potential in those engineering applications where economic profitability is crucial rather than tracking performance.

The technical literature is rapidly growing in application areas such as chemical engineering
342 Economic Model Predictive Control
(see Heidarinejad 2012) or power systems engineering (see Hovgaard et al. 2010; Müller et al. 2013a), where the system's output is in fact a physical outflow which can be stored with relative ease.

We only dealt with the basic theoretical developments and would like to provide pointers to interesting recent and forthcoming developments in this field:
• Generalized terminal constraints: the possibility of enlarging the set of feasible initial states by using arbitrary equilibria as terminal constraints, possibly to be updated on line in order to improve asymptotic performance (see Fagiano and Teel 2012; Müller et al. 2013b).
• Economic MPC without terminal constraints: removing the need for terminal constraints by taking a sufficiently long control horizon is an interesting possibility offered by standard tracking MPC. This is also possible for economic MPC, at least under suitable technical assumptions, as investigated in Grüne (2012, 2013).
• The basic developments presented in the previous paragraph only deal with systems unaffected by uncertainty. This is a severe limitation of current approaches, and it is to be expected that, as for the case of tracking MPC, a great deal of research in this area could be developed in the future. In particular, both deterministic and stochastic uncertainties are of interest.

Cross-References
Model-Predictive Control in Practice
Optimization Algorithms for Model Predictive Control

Recommended Reading
Papers Amrit et al. (2011), Angeli et al. (2011, 2012), Diehl et al. (2011), Müller et al. (2013a), and Rawlings et al. (2008) set out the basic technical tools for performance and stability analysis of EMPC. Readers interested in the general theme of optimization of a system's economic performance and its relationship with classical turnpike theory in economics may refer to Rawlings and Amrit (2009). Potential applications of EMPC are described in Hovgaard et al. (2010), Heidarinejad (2012), and Ma et al. (2011), while Rawlings et al. (2012) is an up-to-date survey on the topic. Fagiano and Teel (2012) and Grüne (2012, 2013) deal with the issue of relaxation or elimination of terminal constraints, while Müller et al. (2013b) explore the possibility of adaptive terminal costs and generalized equality constraints.

Bibliography
Amrit R, Rawlings JB, Angeli D (2011) Economic optimization using model predictive control with a terminal cost. Annu Rev Control 35:178–186
Angeli D, Amrit R, Rawlings JB (2011) Enforcing convergence in nonlinear economic MPC. Paper presented at the 50th IEEE conference on decision and control and European control conference (CDC-ECC), Orlando, 12–15 Dec 2011
Angeli D, Amrit R, Rawlings JB (2012) On average performance and stability of economic model predictive control. IEEE Trans Autom Control 57:1615–1626
Diehl M, Amrit R, Rawlings JB (2011) A Lyapunov function for economic optimizing model predictive control. IEEE Trans Autom Control 56:703–707
Fagiano L, Teel A (2012) Model predictive control with generalized terminal state constraint. Paper presented at the IFAC conference on nonlinear model predictive control, Noordwijkerhout, 23–27 Aug 2012
Grüne L (2012) Economic MPC and the role of exponential turnpike properties. Oberwolfach Rep 12:678–681
Grüne L (2013) Economic receding horizon control without terminal constraints. Automatica 49:725–734
Heidarinejad M (2012) Economic model predictive control of nonlinear process systems using Lyapunov techniques. AIChE J 58:855–870
Hovgaard TG, Edlund K, Bagterp Jorgensen J (2010) The potential of economic MPC for power management. Paper presented at the 49th IEEE conference on decision and control, Atlanta, 15–17 Dec 2010
Ma J, Joe Qin S, Li B et al (2011) Economic model predictive control for building energy systems. Paper presented at the IEEE innovative smart grid technology conference, Anaheim, 17–19 Jan 2011
Müller MA, Angeli D, Allgöwer F (2013a) On convergence of averagely constrained economic MPC and necessity of dissipativity for optimal steady-state operation. Paper presented at the 2013 IEEE American control conference, Washington, DC, 17–19 June 2013
Müller MA, Angeli D, Allgöwer F (2013b) Economic model predictive control with self-tuning terminal weight. Eur J Control 19:408–416
Electric Energy Transfer and Control via Power Electronics 343
counterparts for electric energy transfer and control. Since the advent of power electronics in the 1950s, they have steadily gained ground in power system applications. Today, power electronic controllers are an important part of equipment for electric energy transfer and control. Their roles are growing rapidly with the continuous improvement of power electronic technologies.

Fundamentals of Power Electronics
Different from semiconductor devices in microelectronics, power electronic devices only act as switches for desired control functions, such that they incur minimum losses when they are either on (closed) or off (open). As a result, power electronic controllers are basically switching circuits. The semiconductor switches are therefore the most important elements of the power electronic controllers. Since the 1950s, many different types of power semiconductor switches have been developed and can be selected based on the application.

The performance of power semiconductor devices is mainly characterized by their voltage and current ratings, conduction or on-state loss, as well as the switching speed (or switching frequency capability) and associated switching loss. The main types of power semiconductor devices are listed with their symbols and state-of-the-art rating and frequency range shown in Table 1:
• Power diode – a two-terminal device with similar characteristics to diodes used in microelectronics but with higher voltage and power ratings.
• Thyristor – also called SCR (silicon-controlled rectifier). Unlike the diode, the thyristor is a three-terminal device with an additional gate terminal. It can be turned on by a current pulse through the gate but can only be turned off when the main current goes to zero by external means. The thyristor has low conduction loss but slow switching speed.
• GTO – stands for gate-turn-off thyristor. A GTO can be turned on like a regular thyristor and can also be turned off with a large negative gate current pulse. GTOs have been largely replaced by IGBTs and IGCTs due to their complex gate-driving needs and slow switching speed.
• Power BJT – similar to the bipolar transistor in microelectronics and requires a sustained base current to turn on and off. It has been replaced by the IGBT and power MOSFET, which have simpler gate signals and faster switching speed.
• Power MOSFET – similar to the metal-oxide-semiconductor field-effect transistor in microelectronics and can be turned on and off with a gate voltage signal. It is the fastest device available but has relatively high conduction loss and relatively low voltage/power ratings.
• IGBT – stands for insulated-gate bipolar transistor. Unlike the regular BJT, it can be turned on and off with a gate voltage like the MOSFET. It has relatively low conduction loss and fast switching speed. The IGBT is becoming the workhorse of power electronics for high-power applications.
Electric Energy Transfer and Control via Power Electronics, Table 1 Commonly used Si-based power semiconductor devices and their ratings

Types          | Voltage                     | Current | Switching frequency
Power diodes   | Max 80 kV, typical < 10 kV  | 10 kA   | Various
Thyristor      | Max 8 kV                    | 4.5 kA  | AC line frequency
GTO            | Max 10 kV                   | 6.5 kA  | < 500 Hz
Power MOSFET   | Max 4.5 kV, typical < 600 V | 1.6 kA  | Tens of kHz to MHz
IGBT           | Max 6.5 kV, typical > 600 V | 2.4 kA  | 1 kHz to tens of kHz
Electric Energy Transfer and Control via Power Electronics, Fig. 1 Commonly used basic power electronics
converter topologies (only one phase shown for the AC switch)
The power electronic controllers can be categorized as being for energy generation, delivery, and consumption. For generation, thermal and hydro generators both use synchronous machines with excitation windings on the rotor, which require DC current. A thyristor-based rectifier, called an exciter, is generally used for this purpose. Wind turbine generators usually use a back-to-back VSI to interface to the AC grid, and PV solar sources use a DC-DC converter cascaded with a VSI.

Power electronic controllers for transmission and distribution include so-called flexible AC transmission systems (FACTS) and high-voltage DC transmission (HVDC). Some of the more commonly used controllers and their functions and circuit topologies are listed in Table 2.

The main power electronic controllers for loads include variable-speed motor drives; electronic ballasts for fluorescent lights and power supplies for LEDs; various power supplies for computer, IT, and other electronic loads; and chargers for electric vehicles. The percentage of power-electronics-controlled loads in power systems has been steadily increasing. Power electronics can generally result in improved performance and efficiency.

Future Directions
• Semiconductor devices – Devices used today are almost exclusively based on silicon. The emerging devices based on wide-bandgap materials such as SiC and GaN are expected to revolutionize power electronics with their capabilities of higher voltage, lower loss, faster switching speed, higher temperature, and smaller size.
• Power electronic converters – More cost-effective and reliable converters will be developed as a result of better devices, passive components, and circuit structures. Modular, distributed, and hybrid (with non-power-electronics) approaches are expected to result in overall better benefits.
• Enhanced functions – Power electronic controllers can be designed to have multiple functions in the system. For example, wind and PV solar inverters can provide reactive power to the grid in addition to transferring real energy. Today, power electronic controllers are mostly locally controlled. With better measurement and communication technologies, they may be controlled over a wide area to support system-level functions.
• New applications – New applications in future power systems include DC grids based on multiterminal HVDC and energy storage. Critical technologies include cost-effective and efficient DC transformers and DC circuit breakers. Power electronics will play key roles in these technologies.
Electric Energy Transfer and Control via Power Electronics, Table 2 (continued)

Controller                                   | Functions                                                          | Control                                                                    | Circuit topology
TCSC – thyristor-controlled series capacitor | Power flow control; stability enhancement; fault current limiting  | Power and VAR control through varying C and L in series connection         | Controlled bidirectional AC switch
SSTS – solid-state transfer switch           | Power supply transfer for reliability and power quality            | On and off control                                                         | Controlled bidirectional AC switch
SSSC – static series synchronous compensator | Power flow control; stability enhancement                          | VAR control through voltage control in series connection                   | Bidirectional AC/DC voltage source converter
HVDC (voltage source)                        | System interconnection; power flow control; stability enhancement  | Power and VAR control through back-to-back converters in shunt connections | Bidirectional AC/DC voltage source converter
348 Engine Control
Keywords
Compression ignition; Emissions; Exhaust aftertreatment; Internal combustion engines; Spark ignition

Introduction
Most vehicles are moved by internal combustion engines (ICE), whose key function is the conversion of chemical into mechanical energy, basically by oxidation, e.g., in the case of propane
so that

$$v(t) = v_{dem}(t) \pm \Delta v \quad (3)$$

The Target System and
Engine Control, Fig. 1 Light duty engine test bench with ECU and calibration system
Engine Control, Fig. 2 Basic system scheme of CI (left) and SI (right) engines: 1a Control of the injector opening, 1b Injection premixing with air; 2 Measurement of the engine temperature; 3 Measurement of the engine rotational speed; 4 Measurement of oxygen concentration in the exhaust gases; 5a EGR valve, 5b throttle valve, 6a low-pressure EGR valve; 7a Diesel exhaust aftertreatment (DOC, SCR, DPF), 7b SI engine aftertreatment (three-way catalyst); 8 SCR dosing control
Engine Control, Fig. 3 Left: different steps of limits of emissions per km as defined by the European Union (Euro 1
introduced in 1991 and Euro 6 from 2014). Right: New European Driving Cycle (NEDC)
[Figure: Pressure trace in a fired cylinder of a CI engine triggered by a pilot and a main injection; indicated pressure and injection signal plotted against crank angle (deg)]
In CI engines, the injection leads almost immediately to the combustion, which has more the character of an explosion and starts typically at several undefined locations.

Additional control during the combustion is up to now only theoretically feasible, as the combustion takes place in an extremely short time, but also because adequate actuators are not available.

Aftertreatment
As the combustion mixture will always contain more potential reactants than oxygen and fuel, side reactions will always take place, yielding toxic products, in particular NOx, incompletely burnt fuel (HC), carbon monoxide (CO), and particulate matter (PM). Even if much effort is spent on reducing their formation, this is almost never sufficient, so additional aftertreatment equipment is used. Table 1 gives an overview of the most common aftertreatment systems as well as of their control aspects.

Thermal Management
All main properties of an engine are strongly affected by its temperature, which depends on the varying load conditions. Engine operation is typically optimal for a relatively narrow temperature range; the same is even more critical for the exhaust aftertreatment system. Engine heat is also required for other purposes (like defrosting of windshields in cold climates).

Thus the engine control system has two main tasks: bringing the engine and the exhaust aftertreatment system as fast as possible into the target temperature range and taking into account the deviation from this target. The first task is performed both by control of the cooling circuit and by specific combustion-related measures, the second one by taking the measured or estimated temperature as input for the controllers.

Fast heating is especially important for SI engines, because almost all toxic emissions are produced when the three-way catalyst is cold. To achieve faster heating, SI engines tend to operate in a less fuel-efficient, but "hotter," operation mode during this warm-up phase, one of the causes of increased consumption of cold engines and short trips.

Cranking, Idle Speed, and Gear Shifting Control
Initially, the engine is cranked by the starter until a relatively low speed, and then injection starts, bringing the engine to the minimum operational speed. If the injected fuel is not immediately burnt, very high emissions will arise. At cranking, the cylinder walls are typically very cold and combustion of a stoichiometric mixture is hardly possible. So engine control has the task to inject as little as possible but as much as needed, and only in the cylinder which is going to fire.

Normally an ICE is expected to provide a torque to the driveline, speed being the result of the balance between it and the load. In idle control, no torque is transmitted to the driveline, but the engine speed is expected to remain stable in spite of possible changes of local loads (like the cabin climate control). This boils down to a robust control problem (Hrovat and Sun 1997).
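The idle-speed problem described above can be sketched as a disturbance-rejection loop: a controller holds engine speed against an accessory load step (e.g., the climate-control compressor switching on). The first-order engine model, PI structure, and all gains below are assumed illustrative values, not from the text; a serious treatment, as in Hrovat and Sun (1997), would address robustness explicitly.

```python
# Minimal idle-speed PI loop on a first-order engine model:
#   J * domega/dt = tau_engine - tau_load   (all values illustrative)
J = 0.2          # effective crankshaft inertia [kg m^2] (assumed)
dt = 0.01        # controller sample time [s]
ref = 80.0       # idle-speed setpoint [rad/s] (~760 rpm)
kp, ki = 2.0, 10.0
omega, integ, tau_load = 75.0, 0.0, 0.0

for k in range(1000):                      # simulate 10 s
    if k * dt >= 5.0:
        tau_load = 5.0                     # accessory load step at t = 5 s
    e = ref - omega
    integ += e * dt
    tau = kp * e + ki * integ              # PI torque command
    omega += dt / J * (tau - tau_load)     # Euler step of the engine model
```

The integral action removes the steady-state speed droop after the load step, which is exactly the behavior idle control must guarantee.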
Gear shifting requires several steps. Smoothness and speed of the shifting depend on the coordination of the engine operating point change. Current hardware developments (double clutches, automated gearboxes) enable better operation, but require precise control.

New Trends
techniques (del Re et al. 2010) are increasing, but they are not yet widespread.

Cross-References
Powertrain Control for Hybrid-Electric and Electric Vehicles
Transmission
important research topic. While much progress has been made in the area over the last few years, many open problems still remain. This entry summarizes some results available for such systems and points out a few open research directions. Two popular channel models are considered – the analog erasure channel model and the digital noiseless model. Results are presented for both the multichannel and multisensor settings.

Keywords
Analog erasure channel; Digital noiseless channel; Networked control systems; Sensor fusion

Introduction
Networked control systems refer to systems in which estimation and control are done across communication channels. In other words, these systems feature data transmission among the various components – sensors, estimators, controllers, and actuators – across communication channels that may delay, erase, or otherwise corrupt the data. It has been known for a long time that the presence of communication channels has deep and subtle effects. As an instance, an asymptotically stable linear system may display chaotic behavior if the data transmitted from the sensor to the controller and from the controller to the actuator is quantized. Accordingly, the impact of communication channels on the estimation/control performance and the design of estimation/control algorithms to counter any performance loss due to such channels have both become areas of active research.

Preliminaries
It is not possible to provide a detailed overview of all the work in the area. This entry attempts to summarize the flavor of the results that are available today. We focus on two specific communication channel models – the analog erasure channel and the digital noiseless channel. Although other channel models, e.g., channels that introduce delays or additive noise, have been considered in the literature, these two models are among the ones that have been studied the most. Moreover, the richness of the field can be illustrated by concentrating on these models.

An analog erasure channel model is defined as follows. At every time step k, the channel supports as its input a real vector $i(k) \in \mathbb{R}^t$ with a bounded dimension t. The output $o(k)$ of the channel is determined stochastically. The simplest model of the channel is when the output is determined by a Bernoulli process with probability p. In this case, the output is given by

$$o(k) = \begin{cases} i(k-1) & \text{with probability } 1-p \\ \emptyset & \text{otherwise,} \end{cases}$$

where the symbol $\emptyset$ denotes the fact that the receiver does not obtain any data at that time step and, importantly, recognizes that the channel has not transmitted any data. The probability p is termed the erasure probability of the channel. More intricate models in which the erasure process is governed by a Markov chain, or by a deterministic process, have also been proposed and analyzed. In our subsequent development, we will assume that the erasure process is governed by a Bernoulli process.

A digital noiseless channel model is defined as follows. At every time step k, the channel supports at its input one out of $2^m$ symbols. The output of the channel is equal to the input. The symbol that is transmitted may be generated arbitrarily; however, it is natural to consider the channel as supporting m bits at every time step and the specific symbol transmitted as being generated according to an appropriately designed quantizer. Once again, additional complications such as delays introduced by the channel have been considered in the literature.

A general networked control problem consists of a process whose states are being measured by multiple sensors that transmit data to multiple controllers. The controllers generate control inputs that are applied by different actuators. All the data is transmitted across communication
356 Estimation and Control over Networks
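As a concrete illustration, the Bernoulli erasure channel described above can be simulated in a few lines (a minimal sketch; the function and variable names are illustrative, not from the literature):

```python
import random

def erasure_channel(packet, p, rng=random):
    """Analog erasure channel: with probability p the packet is erased
    and the receiver sees None (and knows that nothing arrived);
    otherwise the real-valued vector is delivered unchanged."""
    return None if rng.random() < p else packet

# Send 10,000 packets through a channel with erasure probability 0.3.
rng = random.Random(0)
delivered = sum(
    erasure_channel([1.0, 2.0], 0.3, rng) is not None for _ in range(10_000)
)
print(delivered / 10_000)  # close to 1 - p = 0.7
```

The key modeling point is that an erasure is announced to the receiver (the `None`), which is what distinguishes this channel from one that silently corrupts data.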
Design of control inputs when multiple controllers are present, even without the presence of communication channels, is known to be hard, since the control inputs in this case have a dual effect. It is, thus, not surprising that not many results are available for networked control systems with multiple controllers. We will thus concentrate on the case when only one controller and actuator is present. However, we will review the known results for the analog erasure channel and the digital noiseless channel models when (i) multiple sensors observe the same process and transmit information to the controller and (ii) the sensor transmits information to the controller over a network of communication channels with an arbitrary topology.

An important distinction in the networked control system literature is that of one-block versus two-block designs. Intuitively, the one-block design arises from viewing the communication channel as a perturbation to a control system designed without a channel. In this paradigm, the only block that needs to be designed is the receiver. Thus, for instance, if an analog erasure channel is present between the sensor and the estimator, the sensor continues to transmit the measurements as if no channel is present. However, the estimator present at the output of the channel is now designed to compensate for any imperfections introduced by the communication channel. On the other hand, in the two-block design paradigm, both the transmitter and the receiver are designed to optimize the estimation or control performance. Thus, if an analog erasure channel is present between the sensor and the estimator, the sensor can now transmit an appropriate function of the information it has access to. The transmitted quantity needs to satisfy the constraints introduced by the channel in terms of the dimensions, bit rate, power constraints, and so on. It is worth remembering that while the two-block design paradigm follows in spirit from communication theory, where both the transmitter and the receiver are design blocks, the specific design of these blocks is usually much more involved than in communication theory. It is not surprising that, in general, performance with two-block designs is better than with one-block designs.

Analog Erasure Channel Model

Consider the usual LQG formulation. A linear process of the form

x(k + 1) = A x(k) + B u(k) + w(k),

with state x(k) ∈ R^d and process noise w(k), is controlled using a control input u(k) ∈ R^m. The process noise is assumed to be white, Gaussian, zero mean, with covariance Σ_w. The initial condition x(0) is also assumed to be Gaussian and zero mean with covariance Π_0. The process is observed by n sensors, with the i-th sensor generating measurements of the form

y_i(k) = C_i x(k) + v_i(k),

with the measurement noise v_i(k) assumed to be white, Gaussian, zero mean, with covariance Σ_v^i. All the random variables in the system are assumed to be mutually independent. We consider two cases:
• If n = 1, the sensor communicates with the controller across a network consisting of multiple communication channels connected according to an arbitrary topology. Every communication channel is modeled as an analog erasure channel with possibly a different erasure probability. The erasure events on the channels are assumed to be independent of each other, for simplicity. The sensor and the controller then form two nodes of a network, each edge of which represents a communication channel.
• If n > 1, then every sensor communicates with the controller across an individual communication channel that is modeled as an analog erasure channel with possibly a different erasure probability. The erasure events on the channels are assumed to be independent of each other, for simplicity.
The controller calculates the control input to optimize a quadratic cost function of the form
" K1
X controller transmits the control input to the ac-
JK D E x T .k/Qx.k/ C uT .k/Ru.k/ tuator across a perfect channel. The separation
kD0 principle states that the optimal performance is
#
achieved if the control input is calculated using
C x .K/PK x.K/ :
T
the usual LQR control law, but the process state
is replaced by the minimum mean squared error
(MMSE) estimate of the state. Thus, the two-
All the covariance matrices and the cost matri- block design problem needs to be solved now for
ces Q, R, and PK are assumed to be positive an optimal estimation problem.
definite. The pair .A; B/ is controllable and the The next step is to realize that for any allowed
pair .A; C / is observable, where C is formed E
two-block design, an upper bound on estima-
by stacking the matrices Ci ’s. The system is tion performance is provided by the strategy of
said to be stabilizable if there exists a design every node transmitting every measurement it
(within the specified one-block or two-block de- has access to at each time step. Notice that this
sign framework) such that the cost limK!1 K1 JK strategy is not in the set of allowed two-block
is bounded. designs since the dimension of the transmitted
quantity is not bounded with time. However, the
A Network of Communication Channels same estimate is calculated at the decoder if the
We begin with the case when N D 1 as men- sensor transmits an estimate of the state at every
tioned above. The one-block design problem in time step and every other node (including the
the presence of a network of communication decoder) transmits the latest estimate it has access
channels is identical to the one-block design as to from either its neighbors or its memory. This
if only one channel were present. This is be- algorithm is recursive and involves every node
cause the network can be replaced by an “equiv- transmitting a quantity with bounded dimension,
alent” communication channel with the erasure however, since it leads to calculation of the same
probability as some function of the reliability estimate at the decoder, and is, thus, optimal. It
of the network. This can lead to poor perfor- is worth remarking that the intermediate nodes
mance, since the reliability may decrease quickly do not require access to the control inputs. This
as the network size increases. For this reason, is because the estimate at the decoder is a linear
we will concentrate on the two-block design function of the control inputs and the measure-
paradigm. ments: thus, the effect of control inputs in the
The two-block design paradigm permits the estimate can be separated from the effect of
nodes of the network to process the data prior the measurements and included at the controller.
to transmission and hence achieve much better Moreover, as long as the closed loop system is
performance. The only constraint imposed on the stable, the quantities transmitted by various nodes
transmitter is that the quantity that is transmitted are also bounded. Thus, the two-block design
is a causal function of the information that the problem can be solved.
node has access to, with a bounded dimension. The stability and performance analysis with
The design problem can be solved using the the optimal design can also be performed. As an
following steps. The first step is to prove that a example, a necessary and sufficient stabilizability
separation principle holds if the controller knows condition is that the inequality
the control input applied by the actuator at ev-
ery time step. This can be the case if the con- pmaxcut .A/2 < 1;
troller transmits the control input to the actuator
across a perfect channel or if the control input is holds, where .A/ is the spectral radius of A
transmitted across an analog erasure channel but and pmaxcut is the max-cut probability evaluated
the actuator can transmit an acknowledgment to as follows. Generate cut-sets from the network
the controller. For simplicity, we assume that the by dividing the nodes into two sets – a source
358 Estimation and Control over Networks
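The role of a condition of the form p_maxcut ρ(A)² < 1 can be seen in a scalar sketch. Assume, purely for illustration, a single erasure channel over which the sensor transmits its exact state, so the estimation error resets to zero on every delivery and grows with the dynamics otherwise; iterating the expected-error recursion then shows boundedness exactly when p·a² < 1:

```python
def expected_error(a, p, q, steps=200):
    """Expected estimation-error recursion for a scalar process
    x(k+1) = a*x(k) + w(k), var(w) = q, whose state is sent over a
    Bernoulli erasure channel with erasure probability p. On delivery
    the error resets to zero (the sensor is assumed to know the state
    exactly); on erasure it grows to a**2 * P + q, so
    E[P(k+1)] = p * (a**2 * E[P(k)] + q)."""
    P = 0.0
    for _ in range(steps):
        P = p * (a * a * P + q)
    return P

# p * a**2 = 0.3 * 2.25 = 0.675 < 1: bounded, limit p*q/(1 - p*a**2).
print(expected_error(a=1.5, p=0.3, q=1.0))
# p * a**2 = 0.3 * 4.0 = 1.2 > 1: the expected error diverges.
print(expected_error(a=2.0, p=0.3, q=1.0))
```

The reset-to-zero model is a simplification of the actual estimate-forwarding scheme, but it reproduces the same threshold behavior.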
For each cut-set, obtain the cut-set probability by multiplying the erasure probabilities of the channels from the source set to the sink set. The max-cut probability is the maximum such cut-set probability. The necessity of the condition follows by recognizing that the channels from the source set to the sink set need to transmit data at a high enough rate even if the channels within each set are assumed not to erase any data. The sufficiency of the condition follows by using the Ford-Fulkerson algorithm to reduce the network into a collection of parallel paths from the sensor to the controller such that each path has links with equal erasure probability and the product of these probabilities over all paths is the max-cut probability. More details can be found in Gupta et al. (2009a).

Multiple Sensors
Let us now consider the case when the process is observed using multiple sensors, each of which transmits data to the controller across an individual analog erasure channel. A separation principle to reduce the control design problem into the combination of an LQR control law and an estimation problem can once again be proven. Thus, the two-block design for the estimation problem asks the following question: what quantity should the sensors transmit such that the decoder is able to generate the optimal MMSE estimate of the state at every time step, given all the information the decoder has received till that time step? This problem is similar to the track-to-track fusion problem that has been studied since the 1980s and is still open for general cases (Chang et al. 1997). Suppose that at time k, the last successful transmission from sensor i happened at time k_i ≤ k. The optimal estimate that the decoder can ever hope to achieve is the estimate of the state x(k) based on all measurements from sensor 1 till time k_1, from sensor 2 till time k_2, and so on. However, it is not known whether this estimate is achievable if the sensors are constrained to transmit real vectors with a bounded dimension. A fairly intuitive encoding scheme is for the sensors to transmit the local estimates of the state based on their own measurements. However, it is known that the global estimate cannot, in general, be obtained from local estimates because of the correlation introduced by the process noise. If the erasure probabilities are zero, or if the process noise is not present, then the optimal encoding schemes are known. Another case for which the optimal encoding schemes are known is when the estimator sends back acknowledgments to the encoders.

Transmitting local estimates does, however, achieve optimal stability conditions as compared to the conditions obtained from the optimal (unknown) two-block design (Gupta et al. 2009b). As an example, the necessary and sufficient stability conditions for the two-sensor case are given by

p1 ρ(A1)² < 1
p2 ρ(A2)² < 1
p1 p2 ρ(A3)² < 1,

where p1 and p2 are the erasure probabilities from sensors 1 and 2, respectively, ρ(A1) is the spectral radius of the part of the matrix A unobservable from the second sensor, ρ(A2) is the spectral radius of the part of A unobservable from the first sensor, and ρ(A3) is the spectral radius of the part of A observable from both the sensors. The conditions are fairly intuitive. For instance, the first condition provides a bound on the rate of increase of the modes for which only sensor 1 can provide information to the controller, in terms of how reliable the communication channel from sensor 1 is.

Digital Noiseless Channels

Similar results as above can be derived for the digital noiseless channel model. For the digital noiseless channel model, it is easier to consider the system without either measurement or process noises (although results with such noises are available). Moreover, since quantization is inherently highly nonlinear, results such as separation between estimation and control are not available. Thus, encoders and controllers that optimize a given cost are hard to characterize; most available results instead address stabilizability.
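The intuition for stabilizability over a digital noiseless channel, developed below in the scalar case, is that the channel can shrink the controller's uncertainty by at most a factor of 2^m per step while the dynamics expand it by a factor of a; a sketch of that bookkeeping (illustrative only):

```python
from math import log2

def uncertainty(a, m, l0=1.0, steps=50):
    """Controller's uncertainty about a scalar state: each time step
    the dynamics expand the uncertainty interval by |a|, while an
    m-bit channel can shrink it by at most a factor of 2**m."""
    l = l0
    for _ in range(steps):
        l = abs(a) * l / 2 ** m
    return l

# a = 3: log2(3) ~ 1.585, so m = 2 bits suffice while m = 1 does not.
print(uncertainty(a=3.0, m=2))  # shrinks toward 0
print(uncertainty(a=3.0, m=1))  # grows without bound
print(log2(3.0))
```

This is the essence of the data-rate argument: the best case 2^m shrinkage must beat the factor-a expansion.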
Intuitively, stabilization is possible only if the uncertainty l(k) about the state at the controller remains bounded as k → ∞. Now, the evolution of l(k) is governed by two processes: at every time step, this uncertainty can be (i) decreased by a factor of at most 2^m due to the data transmission across the channel and (ii) increased by a factor of a (where a is the process matrix governing the evolution of the state) due to the process evolution. This implies that for stabilization to be possible, the inequality m > log2(a) must hold. Thus, the additive noise model is inherently wrong. Most results in the literature formalize this basic intuition (Nair et al. 2007).

A Network of Communication Channels
For the case when there is only one sensor that transmits information to the controller across a network of communication channels connected in an arbitrary topology, an analysis similar to that done for analog erasure channels can be performed (Tatikonda 2003). A max-flow min-cut-like theorem again holds. The stability condition now becomes that, for any cut-set,

Σ_j R_j > Σ_{all unstable eigenvalues} log2(λ_i),

where Σ_j R_j is the sum of the data rates supported by the channels joining the source set to the sink set for that cut-set and the λ_i are the eigenvalues of the process matrix A. Note that the summation on the right-hand side is only over the unstable eigenvalues, and such a condition for every cut-set must be satisfied. All such rate vectors stabilize the system.

Summary and Future Directions

This entry provided a brief overview of some results available in the field of networked control systems. Although the area is seeing intense research activity, many problems remain open. For control across analog erasure channels, most existing results break down if a separation principle cannot be proved. Thus, for example, if control packets are also transmitted to the actuator across an analog erasure channel, the LQG optimal two-block design is unknown. There is some recent work on analyzing stabilizability under such conditions (Gupta and Martins 2010), but the problem remains open in general. For digital noiseless channels, controllers that optimize some performance metric are largely unknown. Considering more general channel models is also an important research direction (Martins and Dahleh 2008; Sahai and Mitter 2006).

Cross-References

Averaging Algorithms and Consensus
Oscillator Synchronization
Bibliography

Chang K-C, Saha RK, Bar-Shalom Y (1997) On optimal track-to-track fusion. IEEE Trans Aerosp Electron Syst AES-33:1271–1276
Gupta V, Martins NC (2010) On stability in the presence of analog erasure channels between controller and actuator. IEEE Trans Autom Control 55(1):175–179
Gupta V, Dana AF, Hespanha J, Murray RM, Hassibi B (2009a) Data transmission over networks for estimation and control. IEEE Trans Autom Control 54(8):1807–1819
Gupta V, Martins NC, Baras JS (2009b) Stabilization over erasure channels using multiple sensors. IEEE Trans Autom Control 54(7):1463–1476
Martins NC, Dahleh M (2008) Feedback control in the presence of noisy channels: 'Bode-like' fundamental limitations of performance. IEEE Trans Autom Control 53(7):1604–1615
Nair GN, Fagnani F, Zampieri S, Evans RJ (2007) Feedback control under data rate constraints: an overview. Proc IEEE 95(1):108–137
Sahai A, Mitter S (2006) The necessity and sufficiency of anytime capacity for stabilization of a linear system over a noisy communication link – Part I: scalar systems. IEEE Trans Inf Theory 52(8):3369–3395
Tatikonda S (2003) Some scaling properties of large distributed control systems. In: 42nd IEEE Conference on Decision and Control, Maui, Dec 2003

Estimation for Random Sets

Ronald Mahler
Eagan, MN, USA

Abstract

The random set (RS) concept generalizes that of a random vector. It permits the mathematical modeling of random systems that can be interpreted as random patterns. Algorithms based on RSs have been extensively employed in image processing. More recently, they have found application in multitarget detection and tracking and in the modeling and processing of human-mediated information sources. The purpose of this entry is to briefly summarize the concepts, theory, and practical application of RSs.

Keywords

Image processing; Multitarget processing; Random finite sets; Stochastic geometry

Introduction

In ordinary signal processing, one models physical phenomena as "sources," which generate "signals" obscured by random "noise." The sources are to be extracted from the noise using optimal-estimation algorithms. Random set (RS) theory was devised about 40 years ago by mathematicians who also wanted to construct optimal-estimation algorithms. The "signals" and "noise" that they had in mind, however, were geometric patterns in images. The resulting theory, stochastic geometry, is the basis of the "morphological operators" commonly employed today in image-processing applications. It is also the basis for the theory of RSs. An important special case of RS theory, the theory of random finite sets (RFSs), addresses problems in which the patterns of interest consist of a finite number of points. It is the theoretical basis of many modern medical and other image-processing algorithms. In recent years, RFS theory has found application to the problem of detecting, localizing, and tracking unknown numbers of unknown, evasive point targets. Most recently and perhaps most surprisingly, RS theory provides a theoretically rigorous way of addressing "signals" that are human-mediated, such as natural-language statements and inference rules. The breadth of RS theory is suggested in the various chapters of Goutsias et al. (1997).

The purpose of this entry is to summarize the RS and RFS theories and their applications. It is divided into the following sections: A Simple Example, Mathematics of Random Sets, Random Sets and Image Processing, Random Sets and Multitarget Processing, Random Sets and Human-Mediated Data, Summary and Future Directions, Cross-References, and Recommended Reading.
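The cut-set rate condition stated above can be verified by brute force on a small network (an illustrative sketch; the three-node topology, rates, and eigenvalues are invented for the example):

```python
from itertools import combinations
from math import log2

def cutset_condition(nodes, edges, rates, source, sink, eigenvalues):
    """Check the cut-set rate condition: for every cut separating the
    source from the sink, the total rate of the channels crossing the
    cut must exceed sum(log2|l|) over the unstable eigenvalues."""
    need = sum(log2(abs(l)) for l in eigenvalues if abs(l) > 1)
    middle = [n for n in nodes if n not in (source, sink)]
    for k in range(len(middle) + 1):
        for chosen in combinations(middle, k):
            src_side = {source, *chosen}
            crossing = sum(
                rates[e] for e in edges
                if e[0] in src_side and e[1] not in src_side
            )
            if crossing <= need:
                return False
    return True

# Hypothetical topology: sensor s -> relay r -> controller c, plus a
# direct s -> c link; rates are in bits per time step.
edges = [("s", "r"), ("r", "c"), ("s", "c")]
rates = {("s", "r"): 2, ("r", "c"): 2, ("s", "c"): 1}
print(cutset_condition(["s", "r", "c"], edges, rates, "s", "c", [2.5, 0.5]))
```

Enumeration over all cuts is exponential in the node count; for large networks one would instead compute the min-cut via a max-flow algorithm, in the spirit of the Ford-Fulkerson reduction mentioned above.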
described by a very abstract probability measure p_Θ(O) = Pr(Θ ∈ O). This measure must be defined on the Borel-measurable subsets O ⊆ c(2^Y), with respect to the Fell-Matheron topology, where O is itself a class of subsets of Y. However, define the Choquet capacity functional by c_Θ(G) = Pr(Θ ∩ G ≠ ∅) for all open subsets G ⊆ Y. Then, the Choquet-Matheron theorem states that the probability law of Θ is completely described by the simpler, albeit nonadditive, measure c_Θ(G).

The theory of random sets has evolved into a substantial subgenre of statistical theory (Molchanov 2005). For estimation theory, the concept of the expected value E[Θ] of a random set Θ is of particular interest. Most definitions of E[Θ] are very abstract (Molchanov 2005, Chap. 2). In certain circumstances, however, more conventional-looking definitions are possible. Suppose that Y is a Euclidean space and that c(2^Y) is restricted to K(2^Y), the bounded, convex, closed subsets of Y. If C, C′ are two such subsets, their Minkowski sum is C + C′ = {c + c′ | c ∈ C, c′ ∈ C′}. Endowed with this definition of addition, K(2^Y) can be homeomorphically and homomorphically embedded into a certain space of functions (Molchanov 2005, pp. 199–200). Denote this embedding by C ↦ C*. Then, the expected value E[Θ] of Θ, defined in terms of Minkowski addition, corresponds to the conventional expected value E[Θ*] of the random function Θ*.

Random Finite Sets (Random Point Processes)
Suppose that c(2^Y) is restricted to f(2^Y), the class of finite subsets of Y. (In many formulations, f(2^Y) is taken to be the class of locally finite subsets of Y – i.e., those whose intersection with compact subsets is finite.) A random finite set (RFS) is a measurable mapping from a probability space into f(2^Y). An example: the field of twinkling stars in some patch of a night sky. RFS theory is a particular mathematical formulation of point process theory (Daley and Vere-Jones 1998; Snyder and Miller 1991; Stoyan et al. 1995).

A Poisson RFS Ψ is perhaps the simplest nontrivial example of a random point pattern. It is specified by a spatial distribution s(y) and an intensity μ. At any given instant, the probability that there will be n points in the pattern is p(n) = e^{−μ} μ^n / n! (the value of the Poisson distribution). The probability that one of these n points will be y is s(y). The function D_Ψ(y) = μ s(y) is called the intensity function of Ψ.

At any moment, the point pattern produced by Ψ is a finite set Y = {y1, …, yn} of points y1, …, yn in Y, where n = 0, 1, … and where Y = ∅ if n = 0. If n = 0 then Y represents the hypothesis that no objects at all are present. If n = 1 then Y = {y1} represents the hypothesis that a single object y1 is present. If n = 2 then Y = {y1, y2} represents the hypothesis that there are two distinct objects y1 ≠ y2. And so on.

The probability distribution of Ψ – i.e., the probability that Ψ will have Y = {y1, …, yn} as an instantiation – is entirely determined by its intensity function D_Ψ(y):

f_Ψ(Y) = f_Ψ({y1, …, yn}) = e^{−μ} D_Ψ(y1) ⋯ D_Ψ(yn).

Every suitably well-behaved RFS Ψ has a probability distribution f_Ψ(Y) and an intensity function D_Ψ(y) (a.k.a. first-moment density). A Poisson RFS is unique in that f_Ψ(Y) is completely determined by D_Ψ(y).

Conventional signal processing is often concerned with single-object random systems that have the form

Z = η(x) + V,

where x is the state of the system; η(x) is the "signal" generated by the system; the zero-mean random vector V is the random "noise" associated with the sensor; and Z is the random measurement that is observed. The purpose of signal processing is to construct an estimate x̂(z1, …, zk) of x, using the information contained in one or more draws z1, …, zk from the random variable Z.

RFS theory is analogously concerned with random systems that have the form
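A Poisson RFS as described above is easy to sample: draw the number of points from a Poisson distribution with mean μ, then draw each point independently from the spatial distribution s(y) (a minimal sketch; the uniform spatial distribution is purely illustrative):

```python
import math
import random

def sample_poisson_rfs(mu, sample_point, rng):
    """Draw one instantiation Y = {y1, ..., yn} of a Poisson RFS:
    n ~ Poisson(mu), and each point is an i.i.d. draw from the
    spatial distribution s(y)."""
    # Knuth's method for a Poisson variate (adequate for modest mu).
    threshold = math.exp(-mu)
    n, prod = 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= threshold:
            break
        n += 1
    return [sample_point(rng) for _ in range(n)]

rng = random.Random(1)
uniform_square = lambda r: (r.random(), r.random())  # s(y): unit square
counts = [len(sample_poisson_rfs(5.0, uniform_square, rng))
          for _ in range(20_000)]
print(sum(counts) / len(counts))  # close to the intensity mu = 5.0
```

Note that the empirical mean cardinality estimates μ, consistent with the fact that a Poisson RFS is completely determined by its intensity function.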
where |S| denotes the size of the set S and E[A] is the expected value of the random number A. Other distances require some definition of the expected value E[Θ] of a random set Θ. It has been shown that, under certain circumstances, certain morphological operators can be viewed as consistent maximum a posteriori (MAP) estimators of S (Goutsias et al. 1997, p. 97).

RFS Theory and Image Processing. Positron-emission tomography (PET) is one example of the application of RFS theory. In PET, tissues of interest are suffused with a positron-emitting radioactive isotope. When a positron annihilates an electron in a suitable fashion, two photons are emitted in opposite directions. These photons are detected by sensors in a ring surrounding the radiating tissue. The location of the annihilation on the line can be estimated by calculating the time difference of arrival.

Because of the physics of radioactive decay, the annihilations can be accurately modeled as a Poisson RFS Ψ. Since a Poisson RFS is completely determined by its intensity function D_Ψ(x), it is natural to try to estimate D_Ψ(x). This yields the spatial distribution s_Ψ(y) of annihilations – which, in turn, is the basis of the PET image (Snyder and Miller 1991, pp. 115–119).

Random Sets and Multitarget Processing

The purpose of this section is to summarize the application of RFS theory to multitarget detection, tracking, and localization. An example: tracking the positions of stars in the night sky over an extended period of time.

Suppose that at time t_k there are an unknown number n of targets with unknown states x1, …, xn. The state of the entire multitarget system is a finite set X = {x1, …, xn} with n ≥ 0. When interrogating a scene, many sensors (such as radars) produce a measurement of the form Z = {z1, …, zm} – i.e., a finite set of measurements. Some of these measurements are generated by background clutter. Others are generated by the targets, with some targets possibly not having generated any. Mathematically speaking, Z is a random draw from an RFS Σ_k that can be decomposed as Σ_k = Υ(X_k) ∪ C_k, where Υ(X_k) is the set of target-generated measurements and C_k is the clutter RFS.

Conventional Multitarget Detection and Tracking. This is based on a "divide and conquer" strategy with three basic steps: time update, data association, and measurement update. At time t_k we have n "tracks" τ1, …, τn (hypothesized targets). In the time update, an extended Kalman filter (EKF) is used to time-predict the tracks τi to predicted tracks τi+ at the time t_{k+1} of the next measurement set Z_{k+1} = {z1, …, zm}.

Given Z_{k+1}, we can construct the following data-association hypothesis H: for each i = 1, …, n, the predicted track τi+ generated the detection z_{ji}, for some index ji, or, alternatively, this track was not detected at all. If we remove from Z_{k+1} all of the z_{j1}, …, z_{jn}, the remaining measurements are interpreted either as being clutter or as having been generated by new targets. Enumerating all possible association hypotheses (which is a combinatorially complex procedure), we end up with a "hypothesis table" H1, …, Hν.

Given Hi, let z_{ji} be the measurement that is hypothesized to have been generated by predicted track τi+. Then, the measurement-update step of an EKF is used to construct a measurement-updated track τ_{i,ji} from τi+ and z_{ji}. Attached to each Hi is a hypothesis probability pi – the probability that the particular hypothesis Hi is the correct one. The hypothesis with the largest pi yields the multitarget estimate X̂ = {x̂1, …, x̂n̂}.

RFS Multitarget Detection and Tracking. In place of tracks and hypothesis tables, this uses multitarget state sets and multitarget probability distributions. In place of the conventional time update, data association, and measurement update, it uses a recursive Bayes filter. A random multitarget state set is an RFS Ξ_{k|k} whose points are target states. A multitarget probability distribution is the probability distribution f(X_k|Z_{1:k}) = f_{Ξ_{k|k}}(X) of the RFS Ξ_{k|k}, where Z_{1:k}: Z1, …, Zk is the time sequence of measurement sets at time t_k.
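The hypothesis-enumeration step of the conventional approach described above can be sketched for one scan in one dimension (illustrative only; real trackers use gating and EKF innovation statistics rather than this brute-force scoring):

```python
from itertools import permutations
from math import exp

def best_hypothesis(predicted, measurements, p_detect=0.9, sigma=1.0):
    """Brute-force data association for one scan: assign each predicted
    track a distinct measurement or a miss (None), score each hypothesis
    with a Gaussian position likelihood times the detection probability,
    and return the best assignment as {track index: measurement index}.
    Measurements left unassigned are treated as clutter or new targets."""
    options = list(range(len(measurements))) + [None] * len(predicted)
    best, best_score = None, -1.0
    for assignment in set(permutations(options, len(predicted))):
        score = 1.0
        for x_pred, j in zip(predicted, assignment):
            if j is None:
                score *= 1.0 - p_detect
            else:
                d = x_pred - measurements[j]
                score *= p_detect * exp(-d * d / (2.0 * sigma ** 2))
        if score > best_score:
            best, best_score = dict(enumerate(assignment)), score
    return best

# Two predicted tracks (positions 0 and 10), three measurements; the
# measurement at 25 ends up interpreted as clutter or a new target.
print(best_hypothesis([0.0, 10.0], [9.8, 0.3, 25.0]))  # {0: 1, 1: 0}
```

The factorial growth of the hypothesis set is exactly the combinatorial complexity the entry mentions, and is what motivates the RFS alternative.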
RFS Time Update. The Bayes filter time-update step f(X_k|Z_{1:k}) → f(X_{k+1}|Z_{1:k}) requires a multitarget Markov transition function f(X_{k+1}|X_k). It is the probability that the multitarget system will have multitarget state set X_{k+1} at time t_{k+1}, if it had multitarget state set X_k at time t_k. It takes into account all pertinent characteristics of the targets: individual target motion, target appearance, target disappearance, environmental constraints, etc. It is explicitly constructed from an RFS multitarget motion model using a multitarget integrodifferential calculus.

RFS Measurement Update. The Bayes filter measurement-update step f(X_{k+1}|Z_{1:k}) → f(X_{k+1}|Z_{1:k+1}) is just Bayes' rule. It requires a multitarget likelihood function f_{k+1}(Z|X) – the likelihood that a measurement set Z will be generated if a system of targets with state set X is present. It takes into account all pertinent characteristics of the sensor(s): sensor noise, fields of view and obscurations, probabilities of detection, false alarms, and/or clutter. It is explicitly constructed from an RFS measurement model using multitarget calculus.

RFS State Estimation. Determination of the number n and states x1, …, xn of the targets is accomplished using a Bayes-optimal multitarget state estimator. The idea is to determine the X_{k+1} that maximizes f(X_{k+1}|Z_{1:k+1}) in some sense.

Approximate Multitarget RFS Filters. The multitarget Bayes filter is, in general, computationally intractable. Central to the RFS approach is a toolbox of techniques – including the multitarget calculus – designed to produce statistically principled approximate multitarget filters. The two most well studied are the probability hypothesis density (PHD) filter and its generalization, the cardinalized PHD (CPHD) filter. In such filters, f(X_k|Z_{1:k}) is replaced by the first-moment density D(x_k|Z_{1:k}) of Ξ_{k|k}. These filters have been shown to be faster and perform better than conventional approaches in some applications.

Random Sets and Human-Mediated Data

Natural-language statements and inference rules have already been mentioned as examples of human-mediated information. Expert-systems theory was introduced in part to address situations – such as this – that involve uncertainties other than randomness. Expert-system methodologies include fuzzy set theory, the Dempster-Shafer (D-S) theory of uncertain evidence, and rule-based inference. RS theory provides solid Bayesian foundations for them and allows human-mediated data to be processed using standard Bayesian estimation techniques. The purpose of this section is to briefly summarize this aspect of the RS approach.

The relationships between expert-systems theory and random set theory were first established by researchers such as Orlov (1978), Höhle (1982), Nguyen (1978), and Goodman and Nguyen (1985). At a relatively early stage, it was recognized that random set theory provided a potential means of unifying much of expert-systems theory (Goodman and Nguyen 1985; Kruse et al. 1991).

A conventional sensor measurement at time t_k is typically represented as Z_k = η(x_k) + V_k – equivalently formulated as a likelihood function f(z_k|x_k). It is conventional to think of z_k as the actual "measurement" and of f(z_k|x_k) as the full description of the uncertainty associated with it. In actuality, z_k is just a mathematical model of some real-world measurement ξ_k. Thus, the likelihood actually has the form f(ξ_k|x_k) = f(z_k|x_k).

This observation assumes crucial importance when one considers human-mediated data.
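The predict-update cycle underlying the Bayes filters of the preceding section can be illustrated in its simplest single-object, discrete-state form (a sketch; the grid, transition matrix, and likelihood values are invented for the example):

```python
def bayes_filter_step(prior, transition, likelihood):
    """One predict-update cycle of a discrete-state Bayes filter:
    prior[i] = p(x_k = i | Z_1:k), transition[i][j] = p(x_{k+1} = j |
    x_k = i), likelihood[j] = f(z_{k+1} | x_{k+1} = j)."""
    n = len(prior)
    # Time update (Chapman-Kolmogorov / Markov transition).
    predicted = [
        sum(prior[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Measurement update (Bayes' rule, then normalize).
    unnormalized = [likelihood[j] * predicted[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Three grid cells; the target tends to drift right, and the new
# measurement strongly favors cell 2.
prior = [0.6, 0.3, 0.1]
transition = [[0.7, 0.3, 0.0],
              [0.0, 0.7, 0.3],
              [0.0, 0.0, 1.0]]
likelihood = [0.1, 0.2, 0.9]
posterior = bayes_filter_step(prior, transition, likelihood)
print(posterior)  # mass shifts toward cell 2
```

The multitarget versions replace the grid of single-target states with sets of states and the likelihood with a multitarget likelihood, but the predict-update skeleton is the same.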
Consider the simple natural-language statement

ξ = "The target is near the tower",

where the tower is a landmark, located at a known position (x0, y0), and where the term "near" is assumed to have the following specific meaning: (x, y) is near (x0, y0) means that (x, y) ∈ T5, where T5 is a disk of radius 5 m, centered at (x0, y0). If z = (x, y) is the actual measurement of the target's position, then ξ is equivalent to the formula z ∈ T5. Since z is just one possible draw from Z_k, we can say that ξ – or, equivalently, T5 – is actually a constraint on the underlying measurement process: Z_k ∈ T5.

Because the word "near" is rather vague, we could just as well say that z ∈ T5 is the best choice, with confidence w5 = 0.7; that z ∈ T4 is the next best choice, with confidence w4 = 0.2; and that z ∈ T6 is the least best, with confidence w6 = 0.1. Let Θ be the random subset of Z defined by Pr(Θ = Ti) = wi for i = 4, 5, 6. In this case, ξ is equivalent to the random constraint

Z_k ∈ Θ.

The probability

L_k(Θ|x_k) = Pr(η(x_k) + V_k ∈ Θ) = Pr(Z_k ∈ Θ | X_k = x_k)

is called a generalized likelihood function (GLF). GLFs can be constructed for more complex natural-language statements, for inference rules, and more. Using their GLF representations, such "nontraditional measurements" can be processed using single- and multi-object estimators in fields such as multitarget tracking and in expert-systems theory. All of these fields of application remain areas of active research.

Cross-References

Estimation, Survey on
Extended Kalman Filters
Nonlinear Filters

Recommended Reading

Molchanov (2005) provides a definitive exposition of the general theory of random sets. Two excellent references for stochastic geometry are Stoyan et al. (1995) and Barndorff-Nielsen and van Lieshout (1999). The books by Kingman (1993) and Daley and Vere-Jones (1998) are good introductions to point process theory. The application of point process theory and stochastic geometry to image processing is addressed in, respectively, Snyder and Miller (1991) and Stoyan et al. (1995). The application of RFSs to multitarget estimation is addressed in the tutorials Mahler (2004, 2013) and the book Mahler (2007). Introductions to the application of random sets to expert systems can be found in Kruse et al. (1991) and Mahler (2007), Chaps. 3–6.

Bibliography

Barndorff-Nielsen O, van Lieshout M (1999) Stochas-
recursive Bayes filters and their approximations. tic geometry: likelihood and computation. Chap-
As a consequence, it can be shown that fuzzy man/CRC, Boca Raton
logic, the D-S theory, and rule-based inference Daley D, Vere-Jones D (1998) An introduction to the
can be subsumed within a single Bayesian- theory of point processes, 1st edn. Springer, New York
Goodman I, Nguyen H (1985) Uncertainty models for
probabilistic paradigm. knowledge based systems. North-Holland, Amsterdam
Goutsias J, Mahler R, Nguyen H (eds) (1997) Random
sets: theory and applications. Springer, New York
Höhle U (1982) A mathematical theory of uncertainty:
Summary and Future Directions fuzzy experiments and their realizations. In: Yager R
(ed) Recent developments in fuzzy set and possibility
In the engineering world, the theory of random theory. Pergamon, New York, pp 344–355
sets has been associated primarily with certain Kingman J (1993) Poisson processes. Oxford University
Press, London
specialized image-processing applications, such
Kruse R, Schwencke E, Heinsohn J (1991) Uncertainty
as morphological filters and tomographic imag- and vagueness in knowledge-based systems. Springer,
ing. It has more recently found application in New York
Mahler R (2004) 'Statistics 101' for multisensor, multitarget data fusion. IEEE Trans Aerosp Electron Syst Mag Part 2: Tutorials 19(1):53–64
Mahler R (2007) Statistical multisource-multitarget information fusion. Artech House, Norwood
Mahler R (2013) 'Statistics 102' for multisensor-multitarget tracking. IEEE J Spec Top Sign Proc 7(3):376–389
Matheron G (1975) Random sets and integral geometry. Wiley, New York
Michael E (1950) Topologies on spaces of subsets. Trans Am Math Soc 71:152–182
Molchanov I (2005) Theory of random sets. Springer, London
Nguyen H (1978) On random sets and belief functions. J Math Anal Appl 65:531–542
Orlov A (1978) Fuzzy and random sets. Prikladnoi Mnogomerni Statisticheskii Analys, Moscow
Snyder D, Miller M (1991) Random point processes in time and space, 2nd edn. Springer, New York
Stoyan D, Kendall W, Meche J (1995) Stochastic geometry and its applications, 2nd edn. Wiley, New York

Estimation, Survey on

Luigi Chisci¹ and Alfonso Farina²
¹ Dipartimento di Ingegneria dell'Informazione, Università di Firenze, Firenze, Italy
² Selex ES, Roma, Italy

Abstract

This entry discusses the history and describes the multitude of methods and applications of this important branch of stochastic process theory.

Keywords

Linear stochastic filtering; Markov step processes; Maximum likelihood estimation; Riccati equation; Stratonovich-Kushner equation

Estimation is the process of inferring the value of an unknown quantity of interest from noisy, direct or indirect, observations of such a quantity. Due to its great practical relevance, estimation has a long history and an enormous variety of applications in all fields of engineering and science. A certainly incomplete list of possible application domains of estimation includes the following: statistics (Bard 1974; Ghosh et al. 1997; Koch 1999; Lehmann and Casella 1998; Tsybakov 2009; Wertz 1978), telecommunication systems (Sage and Melsa 1971; Schonhoff and Giordano 2006; Snyder 1968; Van Trees 1971), signal and image processing (Barkat 2005; Biemond et al. 1983; Elliott et al. 2008; Itakura 1971; Kay 1993; Kim and Woods 1998; Levy 2008; Najim 2008; Poor 1994; Tuncer and Friedlander 2009; Wakita 1973; Woods and Radewan 1977), aerospace engineering (McGee and Schmidt 1985), tracking (Bar-Shalom and Fortmann 1988; Bar-Shalom et al. 2001, 2013; Blackman and Popoli 1999; Farina and Studer 1985, 1986), navigation (Dissanayake et al. 2001; Durrant-Whyte and Bailey 2006a,b; Farrell and Barth 1999; Grewal et al. 2001; Mullane et al. 2011; Schmidt 1966; Smith et al. 1986; Thrun et al. 2006), control systems (Anderson and Moore 1979; Athans 1971; Goodwin et al. 2005; Joseph and Tou 1961; Kalman 1960a; Maybeck 1979, 1982; Söderström 1994; Stengel 1994), econometrics (Aoki 1987; Pindyck and Roberts 1974; Zellner 1971), geophysics (e.g., seismic deconvolution) (Bayless and Brigham 1970; Flinn et al. 1967; Mendel 1977, 1983, 1990), oceanography (Evensen 1994a; Ghil and Malanotte-Rizzoli 1991), weather forecasting (Evensen 1994b, 2007; McGarty 1971), environmental engineering (Dochain and Vanrolleghem 2001; Heemink and Segers 2002; Nachazel 1993), demographic systems (Leibungudt et al. 1983), automotive systems (Barbarisi et al. 2006; Stephant et al. 2004), failure detection (Chen and Patton 1999; Mangoubi 1998; Willsky 1976), power systems (Abur and Gómez Espósito 2004; Debs and Larson 1970; Miller and Lewis 1971; Monticelli 1999; Toyoda et al. 1970), nuclear engineering (Robinson 1963; Roman et al. 1971; Sage and Masters 1967; Venerus and Bullock 1970), biomedical engineering (Bekey 1973; Snyder 1970; Stark 1968), pattern recognition (Andrews 1972; Ho and Agrawala 1968; Lainiotis 1972), social networks (Snijders et al. 2012), etc.
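The recursive linear-Gaussian estimator that underlies many of the applications listed above, the Kalman filter, can be sketched in a few lines. The following minimal sketch is illustrative only; the scalar model and all its parameters are made up for the example and are not taken from this entry:

```python
import numpy as np

# Minimal scalar Kalman filter sketch (hypothetical model, not from the entry):
#   x[k+1] = a*x[k] + w[k],   w[k] ~ N(0, q)   (state dynamics)
#   z[k]   = h*x[k] + v[k],   v[k] ~ N(0, r)   (noisy measurement)
def kalman_filter(zs, a=0.9, h=1.0, q=0.1, r=0.5, x0=0.0, p0=1.0):
    """Run the filter over a measurement stream; return (estimate, variance) pairs."""
    x, p = x0, p0
    out = []
    for z in zs:
        # Prediction: propagate the estimate and its variance through the dynamics
        x_pred = a * x
        p_pred = a * a * p + q
        # Update: correct the prediction using the new measurement
        k_gain = p_pred * h / (h * h * p_pred + r)
        x = x_pred + k_gain * (z - h * x_pred)
        p = (1.0 - k_gain * h) * p_pred
        out.append((x, p))
    return out

# Simulate the hypothetical model and filter its noisy measurements
rng = np.random.default_rng(0)
x_true, zs = 0.0, []
for _ in range(50):
    x_true = 0.9 * x_true + rng.normal(0.0, 0.1 ** 0.5)
    zs.append(x_true + rng.normal(0.0, 0.5 ** 0.5))
est = kalman_filter(zs)
```

Note how the error variance p settles to a steady-state value (the fixed point of the associated Riccati recursion); this constant-memory, step-by-step structure is what makes cheap real-time implementation possible.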
The KF is the minimum mean-square error estimator in the Gaussian case (e.g., for normally distributed noises and initial state) and the best linear unbiased estimator irrespective of the noise and initial state distributions. From the practical viewpoint, the KF enjoys the desirable properties of being linear and acting recursively, step-by-step, on a noise-contaminated data stream. This allows for cheap real-time implementation on digital computers. Further, the universality of "state-variable representations" allows almost any estimation problem to be included in the KF framework. For these reasons, the KF is, and continues to be, an extremely effective and easy-to-implement tool for a great variety of practical tasks, e.g., to detect signals in noise or to estimate unmeasurable quantities from accessible observables. Due to the generality of the state estimation problem, which actually encompasses parameter and signal estimation as special cases, the literature on estimation from 1960 until today has mostly concentrated on extensions and generalizations of Kalman's work in several directions. Considerable efforts, motivated by the ubiquitous presence of nonlinearities in practical estimation problems, have been devoted to nonlinear and/or non-Gaussian filtering, starting from the seminal papers of Stratonovich (1959, 1960) and Kushner (1962, 1967) for continuous-time systems, Ho and Lee (1964) for discrete-time systems, and Jazwinski (1966) for continuous-time systems with discrete-time observations. In these
papers, state estimation is cast in a probabilistic (Bayesian) framework as the problem of evolving in time the state conditional probability density given the observations (Jazwinski 1970). Work on nonlinear filtering has produced over the years several nonlinear state estimation algorithms, e.g., the extended Kalman filter (EKF) (Schmidt 1966), the unscented Kalman filter (UKF) (Julier and Uhlmann 2004; Julier et al. 1995), the Gaussian-sum filter (Alspach and Sorenson 1972; Sorenson and Alspach 1970, 1971), the sequential Monte Carlo (also called particle) filter (SMCF) (Doucet et al. 2001; Gordon et al. 1993; Ristic et al. 2004), and the ensemble Kalman filter (EnKF) (Evensen 1994a,b, 2007), which have been, and still are, successfully employed in various application domains. In particular, the SMCF and EnKF are stochastic simulation algorithms that take inspiration from the work in the 1940s on the Monte Carlo method (Metropolis and Ulam 1949), which has recently received renewed interest thanks to the tremendous advances in computing technology. A thorough review of nonlinear filtering can be found, e.g., in Daum (2005) and Crisan and Rozovskii (2011).

Other interesting areas of investigation have concerned smoothing (Bryson and Frazier 1963), robust filtering for systems subject to modeling uncertainties (Poor and Looze 1981), and state estimation for infinite-dimensional (i.e., distributed-parameter and/or delay) systems (Balakrishnan and Lions 1967). Further, a lot of attention has been devoted to the implementation of the KF, specifically square-root filtering (Potter and Stern 1963) for improved numerical robustness and fast KF algorithms (Kailath 1973; Morf et al. 1974) for enhanced computational efficiency. Worthy of mention is the work over the years on theoretical bounds on the estimation performance, which originated from the seminal papers of Rao (1945) and Cramér (1946) on the lower bound of the MSE for parameter estimation and was subsequently extended in Tichavsky et al. (1998) to nonlinear filtering and in Hernandez et al. (2006) to more realistic estimation problems with possible missed and/or false measurements. An extensive review of this work on Bayesian bounds for estimation, nonlinear filtering, and tracking can be found in Van Trees and Bell (2007). A brief review of the earlier (until 1974) state of the art in estimation can be found in Lainiotis (1974).

Applications

Astronomy
The problem of making estimates and predictions on the basis of noisy observations attracted attention many centuries ago in the field of astronomy. In particular, the first attempt to provide an optimal estimate, i.e., one such that a certain measure of the estimation error is minimized, was due to Galileo Galilei who, in his Dialogue on the Two Chief World Systems (1632) (Galilei 1632), suggested, as a possible criterion for estimating the position of Tycho Brahe's supernova, the estimate that required the "minimum amendments and smallest corrections" to the data. Later, C. F. Gauss mathematically specified this criterion by introducing in 1795 the least-squares method (Gauss 1806, 1995; Legendre 1810), which was successfully applied in 1801 to predict the location of the asteroid Ceres. This asteroid, originally discovered by the Italian astronomer Giuseppe Piazzi on January 1, 1801, and then lost in the glare of the sun, was in fact recovered 1 year later by the Hungarian astronomer F. X. von Zach exploiting the least-squares predictions of Ceres' position provided by Gauss.

Statistics
Starting from the work of Fisher in the 1920s (Fisher 1912, 1922, 1925), maximum likelihood estimation has been extensively employed in statistics for estimating the parameters of statistical models (Bard 1974; Ghosh et al. 1997; Koch 1999; Lehmann and Casella 1998; Tsybakov 2009; Wertz 1978).

Telecommunications and Signal/Image Processing
Wiener-Kolmogorov theory on signal estimation, developed in the period 1940–1960 and originally conceived by Wiener during the Second World War for predicting aircraft
trajectories in order to direct antiaircraft fire, subsequently originated many applications in telecommunications and signal/image processing (Barkat 2005; Biemond et al. 1983; Elliott et al. 2008; Itakura 1971; Kay 1993; Kim and Woods 1998; Levy 2008; Najim 2008; Poor 1994; Tuncer and Friedlander 2009; Van Trees 1971; Wakita 1973; Woods and Radewan 1977). For instance, Wiener filters have been successfully applied to linear prediction, acoustic echo cancellation, signal restoration, and image/video de-noising. But it was the discovery of the Kalman filter in 1960 that revolutionized estimation by providing an effective and powerful tool for the solution of any, static or dynamic, stationary or adaptive, linear estimation problem. A recently conducted, and probably non-exhaustive, search has detected the presence of over 16,000 patents related to the "Kalman filter," spread over all areas of engineering and over a period of more than 50 years. What is astonishing is that even nowadays, more than 50 years after its discovery, one can see the continuous appearance of new patents and scientific papers presenting novel applications and/or novel extensions of the KF in many directions (e.g., to nonlinear filtering). Since 1992 the number of patents registered every year and related to the KF has followed an exponential law.

Space Navigation and Aerospace Applications
The first important application of the Kalman filter was in the NASA (National Aeronautics and Space Administration) space program. As reported in a NASA technical report (McGee and Schmidt 1985), Kalman presented his new ideas while visiting Stanley F. Schmidt at the NASA Ames Research Center in 1960, and this meeting stimulated the use of the KF during the Apollo program (in particular, in the guidance system of Saturn V during the Apollo 11 flight to the Moon) and, furthermore, in the NASA Space Shuttle and in Navy submarines and unmanned aerospace vehicles and weapons, such as cruise missiles. Further, to cope with the nonlinearity of the space navigation problem and the small word length of the onboard computer, the extended Kalman filter for nonlinear systems and square-root filter implementations for enhanced numerical robustness were developed as part of NASA's Apollo program. The aerospace field was only the first of a long and continuously expanding list of application domains where the Kalman filter and its nonlinear generalizations have found widespread and beneficial use.

Control Systems and System Identification
The work on Kalman filtering (Kalman 1960b; Kalman and Bucy 1961) also had a significant impact on control system design and implementation. In Kalman (1960a) the duality between estimation and control was pointed out, in that for a certain class of control and estimation problems one can solve the control (estimation) problem for a given dynamical system by resorting to a corresponding estimation (control) problem for a suitably defined dual system. In particular, the Kalman filter has been shown to be the dual of the linear-quadratic (LQ) regulator, and the two dual techniques constitute the linear-quadratic-Gaussian (LQG) regulator (Joseph and Tou 1961). The latter consists of an LQ regulator feeding back in a linear way the state estimate provided by a Kalman filter, which can be independently designed in view of the separation principle. The KF as well as LSE and MLE techniques are also widely used in system identification (Ljung 1999; Söderström and Stoica 1989) for both parameter estimation and output prediction purposes.

Tracking
One of the major application areas for estimation is tracking (Bar-Shalom and Fortmann 1988; Bar-Shalom et al. 2001, 2013; Blackman and Popoli 1999; Farina and Studer 1985, 1986), i.e., the task of following the motion of moving objects (e.g., aircraft, ships, ground vehicles, persons, animals) given noisy measurements of kinematic variables from remote sensors (e.g., radar, sonar, video cameras, wireless sensors, etc.). The development of the Wiener filter in the 1940s was actually motivated by radar tracking of aircraft for automatic control of antiaircraft guns. Such filters began to be used in the 1950s whenever
computers were integrated with radar systems, and then in the 1960s more advanced and better performing Kalman filters came into use. Still today it can be said that the Kalman filter and its nonlinear generalizations (e.g., the EKF (Schmidt 1966), the UKF (Julier and Uhlmann 2004), and the particle filter (Gordon et al. 1993)) represent the workhorses of tracking and sensor fusion. Tracking, however, is usually much more complicated than a simple state estimation problem due to the presence of false measurements (clutter) and multiple objects in the surveillance region of interest, as well as to the uncertainty about the origin of measurements. This requires the use, besides filtering algorithms, of smart techniques for object detection as well as for association between detected objects and measurements. The problem of joint target tracking and classification has also been formulated as a hybrid state estimation problem and addressed in a number of papers (see, e.g., Smeth and Ristic (2004) and the references therein).

Econometrics
State and parameter estimation have been widely used in econometrics (Aoki 1987) for analyzing and/or predicting financial time series (e.g., stock prices, interest rates, unemployment rates, volatility, etc.).

… and oceanography. A large-scale state-space model is typically obtained from the physical system model, expressed in terms of partial differential equations (PDEs), by means of a suitable spatial discretization technique so that data assimilation is cast into a state estimation problem. To deal with the huge dimensionality of the resulting state vector, appropriate filtering techniques with reduced computational load have been suitably developed (Evensen 2007).

Global Navigation Satellite Systems
Global Navigation Satellite Systems (GNSSs), such as GPS, put into service in 1993 by the US Department of Defense, nowadays provide a commercially widespread technology exploited by millions of users all over the world for navigation purposes, wherein the Kalman filter plays a key role (Bar-Shalom et al. 2001). In fact, the Kalman filter not only is employed in the core of the GNSS to estimate the trajectories of all the satellites, the drifts and rates of all system clocks, and hundreds of parameters related to atmospheric propagation delay, but also any GNSS receiver uses a nonlinear Kalman filter, e.g., an EKF, in order to estimate its own position and velocity along with the bias and drift of its own clock with respect to the GNSS time.
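The receiver-side computation just described can be imitated in miniature. The following sketch is hypothetical (a planar geometry with four made-up transmitter positions and invented noise levels; a real receiver works in 3-D, needs at least four satellites, and also tracks velocity and clock drift): an EKF-style measurement update estimates the receiver position and clock bias from pseudoranges.

```python
import numpy as np

# Hypothetical planar GNSS-like fix (all numbers invented for the sketch):
# a static receiver at true_pos with clock bias true_bias measures pseudoranges
#   rho_i = ||s_i - p|| + b + noise
# to transmitters at known positions s_i; an EKF estimates the state [x, y, b].
sats = np.array([[20.0, 5.0], [-15.0, 18.0], [3.0, -22.0], [-8.0, -9.0]])
true_pos, true_bias = np.array([1.0, 2.0]), 0.5
rng = np.random.default_rng(1)

est = np.zeros(3)              # state estimate [x, y, b], initialized at the origin
P = 100.0 * np.eye(3)          # large initial covariance (poor prior knowledge)
R = 0.01 * np.eye(len(sats))   # pseudorange noise covariance (std 0.1)

for _ in range(10):            # ten measurement epochs
    z = (np.linalg.norm(sats - true_pos, axis=1) + true_bias
         + rng.normal(0.0, 0.1, len(sats)))
    # Linearize h(p, b) = ||p - s_i|| + b around the current estimate
    diff = est[:2] - sats
    dist = np.linalg.norm(diff, axis=1)
    H = np.hstack([diff / dist[:, None], np.ones((len(sats), 1))])
    # Standard EKF measurement update (static state, so no prediction step)
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    est = est + K @ (z - (dist + est[2]))
    P = (np.eye(3) - K @ H) @ P
```

Re-linearizing at each epoch drives the estimate toward the true position and clock bias; the same update, extended with a prediction step for receiver motion and clock drift, is the core of a real navigation filter.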
… power systems (Abur and Gómez Espósito 2004; Debs and Larson 1970; Miller and Lewis 1971; Monticelli 1999; Toyoda et al. 1970), nuclear reactors (Robinson 1963; Roman et al. 1971; Sage and Masters 1967; Venerus and Bullock 1970), biomedical engineering (Bekey 1973; Snyder 1970; Stark 1968), pattern recognition (Andrews 1972; Ho and Agrawala 1968; Lainiotis 1972), and many others.

Connection Between Information and Estimation Theories

In this section, the link between two fundamental quantities in information theory and estimation theory, i.e., the mutual information (MI) and, respectively, the minimum mean-square error (MMSE), is investigated. In particular, a strikingly simple but very general relationship can be established between the MI of the input and the output of an additive Gaussian channel and the MMSE in estimating the input given the output, regardless of the input distribution (Guo et al. 2005). Although this functional relation holds for general settings of the Gaussian channel (e.g., both discrete-time and continuous-time, possibly vector, channels), in order to avoid the heavy mathematical preliminaries needed to treat the general problem rigorously, two simple scalar cases, a static and a (continuous-time) dynamic one, will be discussed just to highlight the main concept.

For the static case of a scalar Gaussian channel y = √γ x + v, with v standard Gaussian noise and γ the signal-to-noise ratio (SNR), the relationship reads

   dI(γ)/dγ = (1/2) E[(x − x̂(γ))²]   (2)

where x̂(γ) = E[x | √γ x + v] is the minimum mean-square error estimate of x given y. Figure 1 displays the behavior of both the MI, in natural logarithmic units of information (nats), and the MMSE versus the SNR.

As mentioned in Guo et al. (2005), the above information-estimation relationship (2) has found a number of applications, e.g., in nonlinear filtering, in multiuser detection, in power allocation over parallel Gaussian channels, in the proof of Shannon's entropy power inequality and its generalizations, as well as in the treatment of the capacity region of several multiuser channels.

Linear Dynamic Continuous-Time Case
While in the static case the MI is considered as a function of the SNR, in the dynamic case it is of great interest to investigate the relationship between the MI and the MMSE as a function of time. Consider the following first-order (scalar) linear Gaussian continuous-time stochastic dynamical system:

   dx_t = a x_t dt + dw_t
   dy_t = √γ x_t dt + dv_t   (3)

where a is a real-valued constant while w_t and v_t are independent standard Brownian motion processes that represent the process and,
respectively, the measurement noises. Defining by x_0^t ≜ {x_s, 0 ≤ s ≤ t} the collection of all states up to time t, and analogously y_0^t ≜ {y_s, 0 ≤ s ≤ t} for the channel outputs (i.e., measurements), and considering the MI between x_0^t and y_0^t as a function of time t, i.e., I(t) = I(x_0^t; y_0^t), it can be shown that (Duncan 1970; Mayer-Wolf and Zakai 1983)

   dI(t)/dt = (γ/2) E[(x_t − x̂_t)²]   (4)

where x̂_t = E[x_t | y_0^t] is the minimum mean-square error estimate of the state x_t given all the channel outputs up to time t, i.e., y_0^t. Figure 2 depicts the time behavior of both the MI and the MMSE for several values of γ and a = 1.

[Estimation, Survey on, Fig. 2: MI and MMSE for different values of γ (−10 dB, 0 dB, +10 dB) and a = 1]

Conclusions and Future Trends

Despite the long history of estimation and the huge amount of work on several theoretical and practical aspects of estimation, there is still a lot of research investigation to be done in several directions. Among the many new future trends, networked estimation and quantum estimation (briefly overviewed in the subsequent parts of this section) certainly deserve special attention due to the growing interest in wireless sensor networks and, respectively, quantum computing.

Networked Information Fusion and Estimation
Information or data fusion is about combining, or fusing, information or data from multiple sources to provide knowledge that is not evident from a single source (Bar-Shalom et al. 2013; Farina and Studer 1986). In 1986, an effort to standardize the terminology related to data fusion began and the JDL (Joint Directors of Laboratories) data fusion working group was established. The result of that effort was the conception of a process model for data fusion and a data fusion lexicon (Blasch et al. 2012; Hall and Llinas 1997). Information and data fusion are mainly supported by sensor networks, which present the following advantages over a single sensor:
• Can be deployed over wide regions
• Provide diverse characteristics/viewing angles of the observed phenomenon
• Are more robust to failures
• Gather more data that, once fused, provide a more complete picture of the observed phenomenon
• Allow better geographical coverage, i.e., wider area and fewer terrain obstructions.

Sensor network architectures can be centralized, hierarchical (with or without feedback), or distributed (peer-to-peer). Today's trend for many monitoring and decision-making tasks is to exploit large-scale networks of low-cost and low-energy-consumption devices with sensing, communication, and processing capabilities. For scalability reasons, such networks should operate in a fully distributed (peer-to-peer) fashion, i.e., with no centralized coordination, so as to achieve in each node a global estimation/decision objective through localized processing only.

The attainment of this goal actually requires several issues to be addressed, such as:
• Spatial and temporal sensor alignment
• Scalable fusion
• Robustness with respect to data incest (or double counting), i.e., repeated use of the same information
• Handling data latency (e.g., out-of-sequence measurements/estimates)
• Communication bandwidth limitations

In particular, to counteract data incest the so-called covariance intersection (Julier and Uhlmann 1997) robust fusion approach has been proposed to guarantee, at the price of some conservatism, consistency of the fused estimate when combining estimates from different nodes with unknown correlations. For scalable fusion, a consensus approach (Olfati-Saber et al. 2007) can be undertaken. This makes it possible to carry out a global (i.e., over the whole network) processing task by iterating local processing steps among neighboring nodes.

Several consensus algorithms have been proposed for distributed parameter (Calafiore and Abrate 2009) or state (Alriksson and
Rantzer 2006; Kamgarpour and Tomlin 2007; Olfati-Saber 2007; Stankovic et al. 2009; Xiao et al. 2005) estimation. Recently, Battistelli and Chisci (2014) introduced a generalized consensus on probability densities which opens up the possibility of performing, in a fully distributed and scalable way, any Bayesian estimation task over a sensor network. As by-products, this approach has allowed the derivation of consensus Kalman filters with guaranteed stability under minimal requirements of system observability and network connectivity (Battistelli et al. 2011, 2012; Battistelli and Chisci 2014), consensus nonlinear filters (Battistelli et al. 2012), and a consensus CPHD filter for distributed multitarget tracking (Battistelli et al. 2013). Despite these interesting preliminary results, networked estimation is still a very active research area with many open problems related to energy efficiency, estimation performance optimality, robustness with respect to delays and/or data losses, etc.

Quantum Estimation
Quantum estimation theory is a generalization of classical estimation theory in terms of quantum mechanics. As a matter of fact, the statistical theory can be seen as a particular case of the more general quantum theory (Helstrom 1969, 1976). Quantum mechanics has practical applications in several fields of technology (Personick 1971), such as the use of quantum random number generators in place of classical random number generators. Moreover, by manipulating the energy states of cesium atoms, it is possible to suppress the quantum noise levels and consequently improve the accuracy of atomic clocks. Quantum mechanics can also be exploited to solve optimization problems, sometimes giving optimization algorithms that are faster than conventional ones. For instance, McGeoch and Wang (2013) provided an experimental study of algorithms based on quantum annealing. Interestingly, the results of McGeoch and Wang (2013) have shown that this approach makes it possible to obtain better solutions than those found with conventional software solvers. In quantum mechanics, the Kalman filter has also found its proper form, as the quantum Kalman filter. In Iida et al. (2010) the quantum Kalman filter is applied to an optical cavity, composed of mirrors with crystals inside, which interacts with a probe laser. In particular, a form of quantum stochastic differential equation can be written for such a system so as to design the algorithm that updates the estimates of the system variables on the basis of the measurement outcome of the system.

Cross-References

Averaging Algorithms and Consensus
Bounds on Estimation
Consensus of Complex Multi-agent Systems
Data Association
Estimation and Control over Networks
Estimation for Random Sets
Extended Kalman Filters
Kalman Filters
Moving Horizon Estimation
Networked Control Systems: Estimation and Control over Lossy Networks
Nonlinear Filters
Observers for Nonlinear Systems
Observers in Linear Systems Theory
Particle Filters

Acknowledgments The authors warmly recognize the contribution of S. Fortunati (University of Pisa, Italy), L. Pallotta (University of Napoli, Italy), and G. Battistelli (University of Firenze, Italy).

Bibliography

Abur A, Gómez Espósito A (2004) Power system state estimation – theory and implementation. Marcel Dekker, New York
Aidala VJ, Hammel SE (1983) Utilization of modified polar coordinates for bearings-only tracking. IEEE Trans Autom Control 28(3):283–294
Alamo T, Bravo JM, Camacho EF (2005) Guaranteed state estimation by zonotopes. Automatica 41(6):1035–1043
Alessandri A, Baglietto M, Battistelli G (2005) Receding-horizon estimation for switching discrete-time linear systems. IEEE Trans Autom Control 50(11):1736–1748
Alessandri A, Baglietto M, Battistelli G (2008) Moving-horizon state estimation for nonlinear discrete-time systems: new stability results and approximation schemes. Automatica 44(7):1753–1765
Alriksson P, Rantzer A (2006) Distributed Kalman filtering using weighted averaging. In: Proceedings of the 17th International Symposium on Mathematical Theory of Networks and Systems, Kyoto
Alspach DL, Sorenson HW (1972) Nonlinear Bayesian estimation using Gaussian sum approximations. IEEE Trans Autom Control 17(4):439–448
Anderson BDO, Chirarattananon S (1972) New linear smoothing formulas. IEEE Trans Autom Control 17(1):160–161
Anderson BDO, Moore JB (1979) Optimal filtering. Prentice Hall, Englewood Cliffs
Andrews A (1968) A square root formulation of the Kalman covariance equations. AIAA J 6(6):1165–1166
Andrews HC (1972) Introduction to mathematical techniques in pattern recognition. Wiley, New York
Aoki M (1987) State space modeling of time-series. Springer, Berlin
Athans M (1971) An optimal allocation and guidance laws for linear interception and rendezvous problems. IEEE Trans Aerosp Electron Syst 7(5):843–853
Balakrishnan AV, Lions JL (1967) State estimation for infinite-dimensional systems. J Comput Syst Sci 1(4):391–403
Barbarisi O, Vasca F, Glielmo L (2006) State of charge Kalman filter estimator for automotive batteries. Control Eng Pract 14(3):267–275
Bard Y (1974) Nonlinear parameter estimation. Academic, New York
Barkat M (2005) Signal detection and estimation. Artech House, Boston
Bar-Shalom Y, Fortmann TE (1988) Tracking and data association. Academic, Boston
Bar-Shalom Y, Li X, Kirubarajan T (2001) Estimation with applications to tracking and navigation. Wiley, New York
Bar-Shalom Y, Willett P, Tian X (2013) Tracking and data fusion: a handbook of algorithms. YBS Publishing, Storrs, CT
Battistelli G, Chisci L, Morrocchi S, Papi F (2011) An information-theoretic approach to distributed state estimation. In: Proceedings of the 18th IFAC World Congress, Milan, pp 12477–12482
Battistelli G, Chisci L, Mugnai G, Farina A, Graziano A (2012) Consensus-based algorithms for distributed filtering. In: Proceedings of the 51st IEEE Control and Decision Conference, Maui, pp 794–799
Bayless JW, Brigham EO (1970) Application of the Kalman filter. Geophysics 35(1):2–23
Bekey GA (1973) Parameter estimation in biological systems: a survey. In: Proceedings of the 3rd IFAC Symposium Identification and System Parameter Estimation, Delft, pp 1123–1130
Benedict TR, Bordner GW (1962) Synthesis of an optimal set of radar track-while-scan smoothing equations. IRE Trans Autom Control 7(4):27–32
Benes VE (1981) Exact finite-dimensional filters for certain diffusions with non linear drift. Stochastics 5(1):65–92
Bertsekas DP, Rhodes IB (1971) Recursive state estimation for a set-membership description of uncertainty. IEEE Trans Autom Control 16(2):117–128
Biemond J, Rieske J, Gerbrands JJ (1983) A fast Kalman filter for images degraded by both blur and noise. IEEE Trans Acoust Speech Signal Process 31(5):1248–1256
Bierman GJ (1974) Sequential square root filtering and smoothing of discrete linear systems. Automatica 10(2):147–158
Bierman GJ (1977) Factorization methods for discrete sequential estimation. Academic, New York
Blackman S, Popoli R (1999) Design and analysis of modern tracking systems. Artech House, Norwood
Blasch EP, Lambert DA, Valin P, Kokar MM, Llinas J, Das S, Chong C, Shahbazian E (2012) High level information fusion (HLIF): survey of models, issues, and grand challenges. IEEE Aerosp Electron Syst Mag 27(9):4–20
Bryson AE, Frazier M (1963) Smoothing for linear and nonlinear systems, TDR 63-119, Aero System Division. Wright-Patterson Air Force Base, Ohio
Calafiore GC, Abrate F (2009) Distributed linear estimation over sensor networks. Int J Control 82(5):868–882
Carravetta F, Germani A, Raimondi M (1996) Polynomial filtering for linear discrete-time non-Gaussian systems. SIAM J Control Optim 34(5):1666–1690
Chen J, Patton RJ (1999) Robust model-based fault diagnosis for dynamic systems. Kluwer, Boston
Chisci L, Zappa G (1992) Square-root Kalman filtering of descriptor systems. Syst Control Lett 19(4):325–334
Chisci L, Garulli A, Zappa G (1996) Recursive state bounding by parallelotopes. Automatica 32(7):1049–1055
Chou KC, Willsky AS, Benveniste A (1994) Multiscale recursive estimation, data fusion, and regularization. IEEE Trans Autom Control 39(3):464–478
Combettes P (1993) The foundations of set-theoretic esti-
Battistelli G, Chisci L (2014) Kullback-Leibler average, mation. Proc IEEE 81(2):182–208
consensus on probability densities, and distributed Cramér H (1946) Mathematical methods of statistics.
state estimation with guaranteed stability. Automatica University Press, Princeton
50:707–718 Crisan D, Rozovskii B (eds) (2011) The Oxford handbook
Battistelli G, Chisci L, Fantacci C, Farina A, Graziano of nonlinear filtering. Oxford University Press, Oxford
A (2013) Consensus CPHD filter for distributed mul- Dai L (1987) State estimation schemes in singular
titarget tracking. IEEE J Sel Topics Signal Process systems. In: Preprints 10th IFAC World Congress,
7(3):508–520 Munich, vol 9, pp 211–215
Bayes T (1763) Essay towards solving a problem in the Dai L (1989) Filtering and LQG problems for discrete-
doctrine of chances. Philos Trans R Soc Lond 53: time stochastic singular systems. IEEE Trans Autom
370–418. doi:10.1098/rstl.1763.0053 Control 34(10):1105–1108
380 Estimation, Survey on
Event-Triggered and Self-Triggered Control

W.P.M.H. Heemels¹, Karl H. Johansson², and Paulo Tabuada³

¹Department of Mechanical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands
²ACCESS Linnaeus Center, Royal Institute of Technology, Stockholm, Sweden
³Department of Electrical Engineering, University of California, Los Angeles, CA, USA

Abstract

Recent developments in computer and communication technologies have led to a new type of large-scale resource-constrained wireless
embedded control systems. It is desirable in these systems to limit the sensor and control computation and/or communication to instances when the system needs attention. However, classical sampled-data control is based on performing sensing and actuation periodically rather than when the system needs attention. This article discusses event- and self-triggered control systems where sensing and actuation are performed when needed. Event-triggered control is reactive and generates sensor sampling and control actuation when, for instance, the plant state deviates more than a certain threshold from a desired value. Self-triggered control, on the other hand, is proactive and computes the next sampling or actuation instance ahead of time. The basics of these control strategies are introduced together with references for further reading.

Keywords

Event-triggered control; Hybrid systems; Real-time control; Resource-constrained embedded control; Sampled-data systems; Self-triggered control

Introduction

In standard control textbooks, e.g., Åström and Wittenmark (1997) and Franklin et al. (2010), periodic control is presented as the only choice for implementing feedback control laws on digital platforms. Although this time-triggered control paradigm has proven to be extremely successful in many digital control applications, recent developments in computer and communication technologies have led to a new type of large-scale resource-constrained (wireless) control systems that call for a reconsideration of this traditional paradigm. In particular, the increasing popularity of (shared) wired and wireless networked control systems raises the importance of explicitly addressing energy, computation, and communication constraints when designing feedback control loops. Aperiodic control strategies that allow the inter-execution times of control tasks to be varying in time offer potential advantages with respect to periodic control when handling these constraints, but they also introduce many new interesting theoretical and practical challenges.

Although the discussions regarding periodic vs. aperiodic implementation of feedback control loops date back to the beginning of computer-controlled systems, e.g., Gupta (1963), in the late 1990s two influential papers (Årzén 1999; Åström and Bernhardsson 1999) highlighted the advantages of event-based feedback control. These two papers spurred the development of the first systematic designs of event-based implementations of stabilizing feedback control laws, e.g., Yook et al. (2002), Tabuada (2007), Heemels et al. (2008), and Henningsson et al. (2008). Since then, several researchers have improved and generalized these results, and alternative approaches have appeared. In the meantime, so-called self-triggered control (Velasco et al. 2003) also emerged. Event-triggered and self-triggered control systems consist of two elements, namely, a feedback controller that computes the control input and a triggering mechanism that determines when the control input has to be updated again. The difference between event-triggered control and self-triggered control is that the former is reactive, while the latter is proactive. Indeed, in event-triggered control, a triggering condition based on current measurements is continuously monitored, and when the condition holds, an event is triggered. In self-triggered control the next update time is precomputed at a control update time based on predictions using previously received data and knowledge of the plant dynamics. In some cases, it is advantageous to combine event-triggered and self-triggered control, resulting in a control system that is reactive to unpredictable disturbances and proactive by predicting future use of resources.

Time-Triggered, Event-Triggered and Self-Triggered Control

To indicate the differences between various digital implementations of feedback control laws, consider the control of the nonlinear plant
386 Event-Triggered and Self-Triggered Control
with x ∈ ℝ^{n_x} the state variable and u ∈ ℝ^{n_u} the input variable. The system is controlled by a nonlinear state feedback law

u = h(x)   (2)

where h : ℝ^{n_x} → ℝ^{n_u} is an appropriate mapping that has to be implemented on a digital platform. Recomputing the control value and updating the actuator signals will occur at times denoted by t_0, t_1, t_2, … with t_0 = 0. If we assume the inputs to be held constant in between the successive recomputations of the control law (referred to as sample-and-hold or zero-order hold), we have

u(t) = u(t_k) = h(x(t_k)) for all t ∈ [t_k, t_{k+1}), k ∈ ℕ.   (3)

We refer to the instants {t_k}_{k∈ℕ} as the triggering times or execution times. Based on these times we can easily explain the difference between time-triggered control, event-triggered control, and self-triggered control.

In time-triggered control we have the equality t_k = kT_s with T_s > 0 being the sampling period. Hence, the updates take place equidistantly in time irrespective of how the system behaves. There is no "feedback mechanism" in determining the execution times; they are determined a priori and in "open loop." Another way of writing the triggering mechanism in time-triggered control is

t_{k+1} = t_k + T_s, k ∈ ℕ   (4)

In event-triggered control, the next execution time is instead determined by a triggering condition C involving the current state x(t) and the state x(t_k) at the last execution time:

t_{k+1} = inf{ t > t_k | C(x(t), x(t_k)) > 0 }   (5)

with t_0 = 0. Hence, it is clear from (5) that there is a feedback mechanism present in the determination of the next execution time, as it is based on the measured state variable. In this sense event-triggered control is reactive.

Finally, in self-triggered control the next execution time is determined proactively based on the measured state x(t_k) at the previous execution time. In particular, there is a function M : ℝ^{n_x} → ℝ_{≥0} that specifies the next execution time as

t_{k+1} = t_k + M(x(t_k))   (6)

with t_0 = 0. As a consequence, in self-triggered control both the control value u(t_k) and the next execution time t_{k+1} are computed at execution time t_k. In between t_k and t_{k+1}, no further actions are required from the controller. Note that the time-triggered implementation can be seen as a special case of the self-triggered implementation by taking M(x) = T_s for all x ∈ ℝ^{n_x}.

Clearly, in all three implementation schemes T_s, C, and M are chosen together with the feedback law given through h to provide stability and performance guarantees and to realize a certain utilization of computer and communication resources.

Lyapunov-Based Analysis

Much work on event-triggered control used one of the following two modeling and analysis frameworks: the perturbation approach and the hybrid system approach.
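To make the three update rules (4)–(6) concrete, the sketch below simulates a scalar plant ẋ = x + u with the feedback u = h(x) = −2x (all numerical values are illustrative assumptions, not taken from the entry). The event rule monitors the error e(t) = x(t_k) − x(t) against a relative threshold σ|x(t)|, and the self-triggered rule precomputes the next execution time by running the same threshold test on a model prediction:

```python
# Compares time-triggered (4), event-triggered (5), and self-triggered (6)
# updates on the scalar plant xdot = x + u with u = h(x) = -2x.
# All numerical values are illustrative assumptions.
a, b, sigma, dt, T = 1.0, 1.0, 0.2, 1e-3, 5.0
Ts = 0.05                                   # period for rule (4)

def h(x):                                   # feedback law (2)
    return -2.0 * x

def M(xk, tau_max=1.0):
    """Self-triggered rule (6): predict, from the model, how long the held
    input h(xk) can be applied before |e| exceeds sigma*|x|."""
    x, tau = xk, 0.0
    while abs(xk - x) <= sigma * abs(x) and tau < tau_max:
        x += dt * (a * x + b * h(xk))
        tau += dt
    return tau

def simulate(trigger):
    """Sample-and-hold loop (3); trigger(tk, xk, t, x) is True when the
    control value must be recomputed."""
    x = xk = 1.0
    tk, updates = 0.0, 1
    for i in range(int(T / dt)):
        t = (i + 1) * dt
        x += dt * (a * x + b * h(xk))       # zero-order-hold input
        if trigger(tk, xk, t, x):
            xk, tk, updates = x, t, updates + 1
    return x, updates

x_tt, n_tt = simulate(lambda tk, xk, t, x: t - tk >= Ts - dt / 2)
x_et, n_et = simulate(lambda tk, xk, t, x: abs(xk - x) > sigma * abs(x))
x_st, n_st = simulate(lambda tk, xk, t, x: t - tk >= M(xk) - dt / 2)
print(n_tt, n_et, n_st)   # event/self-triggered need far fewer updates
```

All three runs stabilize the plant, but the event- and self-triggered rules recompute the control far less often than the fixed period; since the self-triggered rule here is derived from an exact model, it reproduces the event-triggered execution times without monitoring the state continuously.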
Using this error variable we can write the closed-loop system based on (1) and (3) as

ẋ = f(x, h(x + e)).   (8)

Essentially, the three implementations discussed above have their own way of indicating when an execution takes place and the error e is reset to zero. Equation (8) clearly shows how the ideal closed-loop system is perturbed by using a time-triggered, event-triggered, or self-triggered implementation of the feedback law in (2). Indeed, when e = 0 we obtain the ideal closed loop ẋ = f(x, h(x)).

When σ < 1/β, we obtain from (11) and (13) that GAS properties are preserved for the event-triggered implementation. Besides, under certain conditions provided in Tabuada (2007), a global positive lower bound exists on the inter-execution times, i.e., there exists a τ_min > 0 such that t_{k+1} − t_k > τ_min for all k ∈ ℕ and all initial states x_0.

Also self-triggered controllers can be derived using the perturbation approach. In this case, stability properties can be guaranteed by choosing M in (6) ensuring that C(x(t), x(t_k)) ≤ 0 holds for all times t ∈ [t_k, t_{k+1}) and all k ∈ ℕ.

In the hybrid system approach, the closed loop is modeled as a hybrid system with state ξ = (x, e, τ), where τ is a timer tracking the time elapsed since the last execution:

ξ̇ = ( f(x, h(x + e)), −f(x, h(x + e)), 1 )  when 0 ≤ τ ≤ M(x + e)   (15a)
ξ⁺ = ( x, 0, 0 )  when τ = M(x + e)   (15b)

which can be used for analysis based on hybrid tools as well.

Alternative Event-Triggering Mechanisms

There are various alternative event-triggering mechanisms. A few of them are described in this section. The triggering rule

t_{k+1} = inf{ t > t_k | ‖e(t)‖ ≥ σ‖x(t)‖ + δ },   (17)

combining an absolute and a relative threshold, was proposed in Donkers and Heemels (2012). It is particularly effective in the context of output-based control.

Model-Based Triggering

In the triggering conditions discussed so far, essentially the current control value u(t) is based on a held value x(t_k) of the state variable, as specified in (3). However, if good model-based information regarding the plant is available, one can use better model-based predictions of the actuator signal. For instance, in the linear context, Lunze and Lehmann (2010) proposed to use a control input generator instead of a plain zero-order-hold function. In fact, the plant model was described by

ẋ = Ax + Bu + Ew   (18)

with x ∈ ℝ^{n_x} the state variable, u ∈ ℝ^{n_u} the input variable, and w ∈ ℝ^{n_w} a bounded disturbance input. It was assumed that a well-functioning state feedback controller u = Kx was available. The control input generator was then based on model-based predictions given for [t_k, t_{k+1}) by a state prediction x_s(t) initialized at x_s(t_k) = x(t_k). Hence, when the prediction x_s(t) deviates too far from the measured state x(t), the next event is triggered so that updates of the state are sent to the actuator. These model-based triggering schemes can enhance the communication savings as they reduce the number of events by using model-based knowledge.

Other model-based event-triggered control schemes are proposed, for instance, in Yook et al. (2002), Garcia and Antsaklis (2013), and Heemels and Donkers (2013).
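The benefit of the absolute term δ in a mixed threshold of the form ‖e‖ ≥ σ‖x‖ + δ can be seen in a small simulation (an assumed scalar setup with measurement noise; all values are illustrative, not taken from the cited works). With a purely relative threshold (δ = 0), noise near the equilibrium keeps firing events, while δ > 0 introduces a dead zone that caps the event rate:

```python
import numpy as np

# Event-triggered control of xdot = x + u, u = -2*x_k, where the trigger
# uses noisy measurements xm = x + v.  Mixed rule: trigger when
# |e| >= sigma*|xm| + delta.  All numerical values are illustrative.
rng = np.random.default_rng(1)
dt, T, sigma, noise = 1e-3, 20.0, 0.2, 0.01
v = noise * rng.standard_normal(int(T / dt))   # shared noise realization

def count_events(delta):
    x, xk, events = 1.0, 1.0, 0
    for i in range(int(T / dt)):
        x += dt * (x - 2.0 * xk)            # zero-order-hold input
        xm = x + v[i]                       # noisy measurement
        if abs(xk - xm) >= sigma * abs(xm) + delta:
            xk, events = xm, events + 1
    return events

n_rel = count_events(0.0)     # purely relative threshold
n_mix = count_events(0.05)    # mixed absolute/relative threshold
print(n_rel, n_mix)           # the absolute term removes noise-driven events
```

Both runs use the same noise realization, so the comparison is paired: once the state has decayed to the noise floor, the relative-only rule triggers almost every sample, whereas the mixed rule triggers only sporadically.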
Outlook

References

Franklin GF, Powell JD, Emami-Naeini A (2010) Feedback control of dynamical systems. Prentice Hall, Upper Saddle River
Garcia E, Antsaklis PJ (2013) Model-based event-triggered control for systems with quantization and time-varying network delays. IEEE Trans Autom Control 58(2):422–434
Goebel R, Sanfelice R, Teel AR (2009) Hybrid dynamical systems. IEEE Control Syst Mag 29:28–93
Gupta S (1963) Increasing the sampling efficiency for a control system. IEEE Trans Autom Control 8(3):263–264
Heemels WPMH, Donkers MCF (2013) Model-based periodic event-triggered control for linear systems. Automatica 49(3):698–711
Heemels WPMH, Sandee JH, van den Bosch PPJ (2008) Analysis of event-driven controllers for linear systems. Int J Control 81:571–590
Heemels WPMH, Donkers MCF, Teel AR (2013) Periodic event-triggered control for linear systems. IEEE Trans Autom Control 58(4):847–861
Henningsson T, Johannesson E, Cervin A (2008) Sporadic event-based control of first-order linear stochastic systems. Automatica 44:2890–2895
Lehmann D, Kiener GA, Johansson KH (2012) Event-triggered PI control: saturating actuators and anti-windup compensation. In: Proceedings of the IEEE conference on decision and control, Maui
Lunze J, Lehmann D (2010) A state-feedback approach to event-based control. Automatica 46:211–215
Mazo M Jr, Tabuada P (2011) Decentralized event-triggered control over wireless sensor/actuator networks. IEEE Trans Autom Control 56(10):2456–2461. Special issue on wireless sensor and actuator networks
Miskowicz M (2006) Send-on-delta concept: an event-based data-reporting strategy. Sensors 6:49–63
Olfati-Saber R, Fax JA, Murray RM (2007) Consensus and cooperation in networked multi-agent systems. Proc IEEE 95(1):215–233
Postoyan R, Anta A, Nešić D, Tabuada P (2011) A unifying Lyapunov-based framework for the event-triggered control of nonlinear systems. In: Proceedings of the joint IEEE conference on decision and control and European control conference, Orlando, pp 2559–2564
Seyboth GS, Dimarogonas DV, Johansson KH (2013) Event-based broadcasting for multi-agent average consensus. Automatica 49(1):245–252
Shi G, Johansson KH (2011) Multi-agent robust consensus—part II: application to event-triggered coordination. In: Proceedings of the IEEE conference on decision and control, Orlando
Tabuada P (2007) Event-triggered real-time scheduling of stabilizing control tasks. IEEE Trans Autom Control 52(9):1680–1685
Tallapragada P, Chopra N (2012a) Event-triggered decentralized dynamic output feedback control for LTI systems. In: IFAC workshop on distributed estimation and control in networked systems, pp 31–36
Tallapragada P, Chopra N (2012b) Event-triggered dynamic output feedback control for LTI systems. In: IEEE 51st annual conference on decision and control (CDC), Maui, pp 6597–6602
Teixeira PV, Dimarogonas DV, Johansson KH, Borges de Sousa J (2010) Event-based motion coordination of multiple underwater vehicles under disturbances. In: IEEE OCEANS, Sydney
Velasco M, Marti P, Fuertes JM (2003) The self triggered task model for real-time control systems. In: Proceedings of 24th IEEE real-time systems symposium, work-in-progress session, Cancun, Mexico
Wang X, Lemmon MD (2011) Event-triggering in distributed networked systems with data dropouts and delays. IEEE Trans Autom Control 56:586–601
Yook JK, Tilbury DM, Soparkar NR (2002) Trading computation for bandwidth: reducing communication in distributed control systems using state estimators. IEEE Trans Control Syst Technol 10(4):503–518

Evolutionary Games

Eitan Altman
INRIA, Sophia-Antipolis, France

Abstract

Evolutionary games constitute the most recent major mathematical tool for understanding, modelling, and predicting evolution in biology and other fields. They complement other well-established tools such as branching processes and the Lotka-Volterra (1910) equations (e.g., for predator-prey dynamics or for the evolution of epidemics). Evolutionary games also bring novel features to game theory. First, they focus on the dynamics of competition rather than restricting attention to the equilibrium. In particular, they try to explain how an equilibrium emerges. Second, they bring new definitions of stability that are more adapted to the context of large populations. Finally, in contrast to standard game theory, players are not assumed to be "rational" or "knowledgeable" enough to anticipate the other players' choices. The objective of this article is to present foundations as well as recent advances in evolutionary games, highlight the novel concepts that they introduce with respect
to game theory as formulated by John Nash, and describe through several examples their huge potential as tools for modeling interactions in complex systems.

Keywords

Evolutionary stable strategies; Fitness; Replicator dynamics

Introduction

Evolutionary game theory is the youngest of several mathematical tools used in describing and modeling evolution. It was preceded by the theory of branching processes (Watson and Galton 1875) and its extensions (Altman 2008), which were introduced in order to explain the evolution of family names in the English population of the second half of the nineteenth century. This theory makes use of the probabilistic distribution of the number of offspring of an individual in order to predict the probability at which the whole population would eventually become extinct. It describes the evolution of the number of offspring of a given individual. The Lotka-Volterra equations (Lotka-Volterra 1910) and their extensions are differential equations that describe the population size of each of several species that have a predator-prey type relation.

One of the foundations of evolutionary games (and their extension to population games), which is often used as the starting point in their definition, is the replicator dynamics which, similarly to the Lotka-Volterra equations, describes the evolution of the size of various species that interact with each other (or of various behaviors within a given population). In both the Lotka-Volterra equations and the replicator dynamics, the evolution of the size of one type of population may depend on the sizes of all other populations. Yet, unlike the Lotka-Volterra equations, the object of the modeling is the normalized sizes of populations rather than the sizes themselves. By the normalized size of some type, we mean the fraction of that type within the whole population. A basic feature of evolutionary games is, thus, that the evolution of the fraction of a given type in the population depends on the sizes of other types only through their normalized sizes rather than through their actual ones.

The relative rate of the decrease or increase of the normalized population size of some type in the replicator dynamics is what we call fitness and is to be understood in the Darwinian sense. If some type or some behavior increases more than another one, then it has a larger fitness. The evolution of the fitness as described by the replicator dynamics is a central object of study in evolutionary games.

So far we did not actually consider any game and just discussed ways of modeling evolution. The relation to game theory is due to the fact that under some conditions, the fitness converges to some fixed limit, which can be identified as an equilibrium of a matrix game in which the utilities of the players are the fitnesses. This limit is then called an ESS – evolutionary stable strategy – as defined by Maynard Smith and Price (1973). It can be computed using elementary tools in matrix games and then used for predicting the (long-term) distribution of behaviors within a population. Note that an equilibrium in a matrix game can be obtained only when the players of the matrix game are rational (each one maximizing its expected utility, being aware of the utilities of other players and of the fact that these players maximize their utilities, etc.). A central contribution of evolutionary games is thus to show that the evolution of possibly nonrational populations converges under some conditions to the equilibrium of a game played by rational players. This surprising relationship between the equilibrium of a noncooperative matrix game and the limit points of the fitness dynamics has been supported by a rich body of experimental results; see Friedman (1996).

On the importance of the ESS for understanding the evolution of species, Dawkins writes in his book "The Selfish Gene" (Dawkins 1976): "we may come to look back on the invention of the ESS concept as one of the most important
advances in evolutionary theory since Darwin." He further specifies: "Maynard Smith's concept of the ESS will enable us, for the first time, to see clearly how a collection of independent selfish entities can come to resemble a single organized whole."

Here we shall follow the nontraditional approach to describing evolutionary games: we shall first introduce the replicator dynamics and then introduce the game-theoretic interpretation related to it.

Replicator Dynamics

In the biological context, the replicator dynamics is a differential equation that describes the way in which the usage of strategies changes in time. It is based on the idea that the average growth rate per individual that uses a given strategy is proportional to the excess of fitness of that strategy with respect to the average fitness.

In engineering, the replicator dynamics could be viewed as a rule for updating mixed strategies by individuals. It is a decentralized rule since it only requires knowing the average utility of the population rather than the strategy of each individual.

The replicator dynamics is one of the most studied dynamics in evolutionary game theory. It was introduced by Taylor and Jonker (1978). It has been used for describing the evolution of road traffic congestion in which the fitness is determined by the strategies chosen by all drivers (Sandholm 2009). It has also been studied in the context of the association problem in wireless communications (Shakkottai et al. 2007).

Consider a set of N strategies and let p_j(t) be the fraction of the whole population that uses strategy j at time t. Let p(t) be the corresponding N-dimensional vector. A function f_j is associated with the growth rate of strategy j, and it is assumed to depend on the fraction of each of the N strategies in the population. There are various forms of replicator dynamics (Sandholm 2009), and we describe here the one most commonly used. It is given by

ṗ_j(t) = μ p_j(t) [ f_j(p(t)) − Σ_{k=1}^{N} p_k(t) f_k(p(t)) ]   (1)

where μ is some positive constant and the payoff function f_k is called the fitness of strategy k. In evolutionary games, evolution is assumed to be due to pairwise interactions between players, as will be described in the next section. Therefore, f_k has the form f_k(p) = Σ_{i=1}^{N} J(k, i) p(i), where J(k, i) is the fitness of an individual playing k if it interacts with an individual that plays strategy i.

Within quite general settings (Weibull 1995), the above replicator dynamics is known to converge to an ESS (which we introduce in the next section).

Evolutionary Games: Fitnesses

Consider an infinite population of players. Each individual i plays at times t_n^i, n = 1, 2, 3, … (assumed to constitute an independent Poisson process with some rate λ) a matrix game against some player j(n) randomly selected within the population. The choice j(n) of the other players at different times is independent. All players have the same finite space of pure strategies (also called actions) K. Each time it plays, a player may use a mixed strategy p, i.e., a probability measure over the set of pure strategies. We consider J(k, i) (defined in the previous section) to be the payoff for a tagged individual if it uses a strategy k and it interacts with an individual using strategy i. With some abuse of notation, one denotes by J(p, q) the expected payoff for a player who uses a mixed strategy p when meeting another individual who adopts the mixed strategy q. If we define a payoff matrix A and consider p and q to be column vectors, then J(p, q) = p′Aq. The payoff function J is indeed linear in p and q. A strategy q is called a Nash equilibrium if

J(q, q) ≥ J(p, q) for all p ∈ Δ(K)   (2)

where Δ(K) is the set of probability measures over the set K.
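As a numerical illustration of (1), the sketch below Euler-integrates the replicator dynamics for two strategies with μ = 1 and an arbitrarily chosen 2×2 payoff matrix J (the values are assumptions for the sketch, not taken from the entry):

```python
import numpy as np

# Euler integration of the replicator dynamics (1) for two strategies,
# with an illustrative payoff matrix J chosen so that an interior rest
# point exists at p* = (0.5, 0.5).
J = np.array([[-0.5, 1.0],
              [ 0.0, 0.5]])
mu, dt, steps = 1.0, 0.01, 20_000
p = np.array([0.2, 0.8])             # initial population shares

for _ in range(steps):
    f = J @ p                        # fitnesses f_k(p) = sum_i J(k,i) p(i)
    avg = p @ f                      # population-average fitness
    p = p + dt * mu * p * (f - avg)  # replicator update, cf. (1)
    p = p / p.sum()                  # guard against numerical drift

print(p)   # ≈ [0.5, 0.5]
```

The shares converge to the interior rest point where both fitnesses are equal, f_1(p) = f_2(p), which is exactly the indifference condition used to compute a mixed equilibrium of the associated matrix game.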
Suppose that the whole population uses a strategy q and that a small fraction ε (called "mutations") adopts another strategy p. Evolutionary forces are expected to select against p if

J(q, εp + (1 − ε)q) > J(p, εp + (1 − ε)q).   (3)

Evolutionary Stable Strategies: ESS

Definition 1 q is said to be an evolutionary stable strategy (ESS) if for every p ≠ q there exists some ε_p > 0 such that (3) holds for all ε ∈ (0, ε_p).

The definition of ESS is thus related to a robustness property against deviations by a whole (possibly small) fraction of the population. This is an important difference that distinguishes the equilibrium in populations as seen by biologists from the standard Nash equilibrium often used in an economics context, in which robustness is defined against the possible deviation of a single user. Why do we need the stronger type of robustness? Since we deal with large populations, it is to be expected that from time to time some group of individuals may deviate. Thus robustness against deviations by a single user is not sufficient to ensure that deviations will not develop and end up being used by a growing portion of the population.

Often the ESS is defined through the following equivalent characterization.

Theorem 1 (Weibull 1995, Proposition 2.1, or Hofbauer and Sigmund 1998, Theorem 6.4.1, p 63) A strategy q is an evolutionary stable strategy if and only if for all p ≠ q one of the following conditions holds:

J(q, q) > J(p, q),   (4)

or

J(q, q) = J(p, q) and J(q, p) > J(p, p).   (5)

In fact, if condition (4) is satisfied, then the fraction of mutations in the population will tend to decrease (as it has a lower fitness, meaning a lower growth rate). Thus, the strategy q is then immune to mutations. If it is not, but condition (5) still holds, then a population using q is "weakly" immune against a mutation using p. Indeed, if the mutant population grows, then we shall frequently have individuals with strategy q competing with mutants. In such cases, the condition J(q, p) > J(p, p) ensures that the growth rate of the original population exceeds that of the mutants.

A mixed strategy q that satisfies (4) for all p ≠ q is called a strict Nash equilibrium. Recall that a mixed strategy q that satisfies (2) for all p ≠ q is a Nash equilibrium. We conclude from the above theorem that being a strict Nash equilibrium implies being an ESS, and being an ESS implies being a Nash equilibrium. Note that whereas a mixed Nash equilibrium is known to exist in a matrix game, an ESS may not exist. However, an ESS is known to exist in evolutionary games where the number of strategies available to each player is 2 (Weibull 1995).

Proposition 1 In a symmetric game with two strategies for each player and no pure Nash equilibrium, there exists a unique mixed Nash equilibrium, which is an ESS.

Example: The Hawk and Dove Game

We briefly describe the hawk and dove game (Maynard Smith and Price 1973). A bird that searches for food finds itself competing with another bird over the food and has to decide whether to adopt a peaceful behavior (dove) or an aggressive one (hawk). The advantage of behaving aggressively is that in an interaction with a peaceful bird, the aggressive one gets access to all the food. This advantage comes at a cost: a hawk which meets another hawk ends up fighting with it and thus takes a risk of getting wounded. In contrast, two doves that meet in a contest over food share it without fighting. The fitnesses for player 1 (who chooses a row) are summarized in Table 1, in which the cost of fighting is taken to be some parameter δ > 1/2.

This game has a unique mixed Nash equilibrium (and thus a unique ESS) in which the fraction p of aggressive birds is given by
Evolutionary Games, Table 1 The hawk-dove game

        H         D
H    1/2 − δ      1
D       0        1/2

p = 1/(2δ).

Extension: Evolutionary Stable Sets

Assume that there are two mixed strategies p_i and p_j that have the same performance against each other, i.e., J(p_i, p_j) = J(p_j, p_j). Then neither one of them can be an ESS, even if they are quite robust against other strategies. Now assume that when excluding one of them from the set of mixed strategies, the other one is an ESS. This could imply that different combinations of these two ESS's could coexist and would together be robust to any other mutations. This motivates the following definition of an ESSet (Cressman 2003):

Definition 2 A set E of symmetric Nash equilibria is an evolutionarily stable set (ESSet) if, for all q ∈ E, we have J(q, p) > J(p, p) for all p ∉ E such that J(p, q) = J(q, q).

Properties of ESSet:

Summary and Future Directions

The entry has provided an overview of the foundations of evolutionary games, which include the ESS (evolutionary stable strategy), an equilibrium concept that is stronger than the standard Nash equilibrium, and the modeling of the dynamics of the competition through the replicator dynamics. The evolutionary game framework is a first step in linking game theory to evolutionary processes. The payoff of a player is identified with its fitness, i.e., its rate of reproduction. Further development of this mathematical tool is needed for handling hierarchical fitness, i.e., the cases where the individual that interacts cannot be directly identified with the reproduction, as it is part of a larger body. For example, the behavior of a blood cell in the human body when interacting with a virus cannot be modeled as directly related to the fitness of the blood cell but rather to that of the human body. A further development of the theory of evolutionary games is needed to define meaningful equilibrium notions and relate them to replication in such contexts.

Cross-References

Dynamic Noncooperative Games
Game Theory: Historical Overview
References

Cressman R (2003) Evolutionary dynamics and extensive form games. MIT, Cambridge
Dawkins R (1976) The selfish gene. Oxford University Press, Oxford
Friedman D (1996) Equilibrium in evolutionary games: some experimental results. Econ J 106:1–25
Hofbauer J, Sigmund K (1998) Evolutionary games and population dynamics. Cambridge University Press, Cambridge/New York
Lotka-Volterra AJ (1910) Contribution to the theory of periodic reaction. J Phys Chem 14(3):271–274
Maynard Smith J, Price GR (1973) The logic of animal conflict. Nature 246(5427):15–18
Sandholm WH (2009) Population games and evolutionary dynamics. MIT, Cambridge
Shakkottai S, Altman E, Kumar A (2007) Multihoming of users to access points in WLANs: a population game perspective. IEEE J Sel Areas Commun Spec Issue Non-Coop Behav Netw 25(6):1207–1215
Taylor P, Jonker L (1978) Evolutionary stable strategies and game dynamics. Math Biosci 16:76–83
Vincent TL, Brown JS (2005) Evolutionary game theory, natural selection & Darwinian dynamics. Cambridge University Press, Cambridge
Watson HW, Galton F (1875) On the probability of the extinction of families. J Anthropol Inst Great Br 4:138–144
Weibull JW (1995) Evolutionary game theory. MIT, Cambridge

Experiment Design and Identification for Control

Håkan Hjalmarsson
School of Electrical Engineering, ACCESS Linnaeus Center, KTH Royal Institute of Technology, Stockholm, Sweden

Abstract

Understanding the effect of the experiment on the estimation result is a crucial part of system identification. If the experiment is constrained or otherwise fixed, then the implied limitations need to be understood; if the experiment can be designed, then, given its fundamental importance, that design freedom should be fully exploited. This entry will give an understanding of how it can be exploited. We also briefly discuss the particulars of identification for model-based control, one of the main applications of system identification.

Keywords

Adaptive experiment design; Application-oriented experiment design; Cramér-Rao lower bound; Crest factor; Experiment design; Fisher information matrix; Identification for control; Least-costly identification; MultiSine; Pseudorandom binary signal (PRBS); Robust experiment design

Introduction

The accuracy of an identified model is governed by:
(i) Information content in the data used for estimation
(ii) The complexity of the model structure
The former is related to the noise properties and the "energy" of the external excitation of the system and how it is distributed. In regard to (ii), a model structure which is not flexible enough to capture the true system dynamics will give rise to a systematic error, while an overly flexible model will be overly sensitive to noise (so-called overfitting). The model complexity is closely associated with the number of parameters used. For a linear model structure with n parameters modeling the dynamics, it follows from the invariance result in Rojas et al. (2009) that to obtain a model for which the variance of the frequency function estimate is less than 1/γ over all frequencies, the signal-to-noise ratio, as measured by input energy over noise variance, must be at least nγ. With energy being power × time, and as input power is limited in physical systems, this indicates that the experiment time grows at least linearly with the number of model parameters. When the input energy budget is limited, the only way around this problem is to sacrifice accuracy over certain frequency intervals. The methodology to achieve this in a systematic way is known as experiment design.
resulting estimate ends up in E_app with high probability.

should be conducted in closed loop (Agüero and Goodwin 2007).

the design variables in section "Design Variables" and associated variables.

Experiment Design Criteria

There are two principal ways to define an optimal experiment design problem:
(i) Best effort. Here the best quality, e.g., as given by one of the quality measures in section "Model Quality Measures," is sought under constraints on the experimental effort and cost. This is the classical problem formulation.
(ii) Least-costly. The cheapest experiment is sought that results in a predefined model quality. Thus, as compared to best-effort design, the optimization criterion and constraint are interchanged. This type of design was introduced by Bombois and coworkers; see Bombois et al. (2006).
As shown in Rojas et al. (2008), the two approaches typically lead to designs only differing by a scaling factor.

Computational Issues

The optimal experiment design problem based on the Fisher information is typically non-convex. For example, consider a finite impulse response model subject to an experiment of length N with the measured outputs collected in the vector

Y = Φθ° + E,  Φ = [ u(0) … u(−(n−1)) ; ⋮ ; u(N−1) … u(N−n) ]

where E ∈ ℝ^N is zero-mean Gaussian noise with covariance matrix σ²·I_{N×N}. Then it holds that

I_F(θ°, N) = (1/σ²) Φᵀ Φ   (4)

From an experiment design point of view, the input vector u = ( u(−(n−1)) … u(N−1) )ᵀ is the design variable, but with the elements of I_F(θ°, N) being a quadratic function of the input sequence, all typical quality measures become non-convex.

While various methods for non-convex numerical optimization can be used to solve such problems, they often encounter problems with, e.g., local minima. To address this, a number of techniques have been developed where either the problem is reparametrized so that it becomes convex or a convex approximation is used. The latter technique is called convex relaxation and is often based on a reparametrization as well. We use the example above to provide a flavor of the different techniques.

Reparametrization

If the input is constrained to be periodic so that u(t) = u(t + N), t = −n, …, −1, it follows that the Fisher information is linear in the sample correlations of the input. Using these as design variables instead of u results in all the quality measures referred to above becoming convex functions.

This reparametrization thus results in the two-step procedure discussed in section "External Excitation Signals": First, the sample correlations are obtained from an optimal experiment design problem, and then an input sequence is generated that has this sample correlation. In the second step there is considerable freedom. Notice, however, that since correlations do not directly relate to the actual amplitudes of the resulting signals, it is difficult to incorporate waveform constraints in this approach. On the contrary, variance constraints are easy to incorporate.

Convex Relaxations

There are several approaches to obtain convex relaxations.

Using the per Sample Fisher Information

If the input is a realization of a stationary random process and the sample size N is large enough, I_F(θ°, N)/N is approximately equal to the per sample Fisher matrix, which only depends on the correlation sequence of the input. Using this approximation, one can now follow the same procedure as in the reparametrization approach and first optimize the input correlation sequence.
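The linearity in the correlations can be verified in a quick numerical sketch (arbitrary sizes; only NumPy is assumed): for a periodically extended input, σ²·I_F/N from (4) equals the symmetric Toeplitz matrix built from the circular sample correlations of u.

```python
import numpy as np

# For a periodic input u(t) = u(t + N), the Fisher information (4) of an
# n-tap FIR model is exactly linear in the circular sample correlations
# of the input.  Sizes and signals below are arbitrary illustrations.
rng = np.random.default_rng(0)
N, n, sigma2 = 240, 4, 0.5
u = rng.standard_normal(N)                    # one period of the input

# Regressor matrix; the periodic extension supplies u(t) for t < 0
Phi = np.column_stack([np.roll(u, i) for i in range(n)])
I_F = Phi.T @ Phi / sigma2                    # Fisher information, cf. (4)

# Circular sample correlations r(tau) = (1/N) * sum_t u(t) u(t - tau)
r = np.array([u @ np.roll(u, tau) for tau in range(n)]) / N

# Per sample Fisher matrix = Toeplitz matrix of the correlations
toe = np.array([[r[abs(i - j)] for j in range(n)] for i in range(n)])
print(np.allclose(sigma2 * I_F / N, toe))     # True
```

Optimizing over the n correlation values r(0), …, r(n−1) is therefore a convex (linear) parametrization of the information matrix, which is what makes the two-step procedure above tractable.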
400 Experiment Design and Identification for Control
The generation of a stationary signal with a certain correlation is a stochastic realization problem which can be solved using spectral factorization followed by filtering a white noise sequence, i.e., a sequence of independent identically distributed random variables, through the (stable) spectral factor (Jansson and Hjalmarsson 2005).

More generally, it turns out that the per sample Fisher information for linear models/systems depends only on the joint input/noise spectrum (or the corresponding correlation sequence). A linear parametrization of this quantity thus typically leads to a convex problem (Jansson and Hjalmarsson 2005).

The set of all spectra is infinite dimensional, and this precludes a search over all possible spectra. However, since there is a finite-dimensional parametrization of the per sample Fisher information (it is a symmetric n×n matrix), it is also possible to find finite-dimensional sets of spectra that parametrize all possible per sample Fisher information matrices. Multisines with appropriately chosen frequencies are one possibility. However, even though all per sample Fisher information matrices can be generated, the solution may be suboptimal depending on which constraints the problem contains.

The situation for nonlinear problems is conceptually the same, but here the entire probability density function of the stationary process generating the input plays the same role as the spectrum in the linear case. This is a much more complicated object to parametrize.

Lifting

An approach that can deal with amplitude constraints is based on a so-called lifting technique: Introduce the matrix U = uu^T, representing all possible products of the elements of u. This constraint is equivalent to

[ U    u ]               [ U    u ]
[ u^T  1 ] ⪰ 0,    rank [ u^T  1 ] = 1    (5)

The idea of lifting is now to observe that the Fisher information matrix is linear in the elements of U, and by dropping the rank constraint in (5) a convex relaxation is obtained, where both U and u (subject to the matrix inequality in (5)) are decision variables.

Frequency-by-Frequency Design

An approximation for linear systems that allows frequency-by-frequency design of the input spectrum and feedback is obtained by assuming that the model is of high order. Then the variance of an nth-order estimate, G(e^{iω}, θ̂_N), of the frequency function can approximately be expressed as

Var G(e^{iω}, θ̂_N) ≈ (n/N) · Φ_v(ω)/Φ_u(ω)    (6)

(► System Identification: An Overview) in the open-loop case (there is a closed-loop extension as well), where Φ_u and Φ_v are the input and noise spectra, respectively. Performance measures of the type (2) can then be written as

∫ W(e^{iω}) (Φ_v(ω)/Φ_u(ω)) dω

where the weighting W(e^{iω}) ≥ 0 depends on the application. When only variance constraints are present, such problems can be solved frequency by frequency, providing both simple calculations and insight into the design.

Implementation

We have used the notation I_F(θ_o, N) to indicate that the Fisher information typically (but not always) depends on the parameter corresponding to the true system. That the optimal design depends on the to-be-identified system is a fundamental problem in optimal experiment design. There are two basic approaches to address this problem, which are covered below. Another important aspect is the choice of waveform for the external excitation signal. This is covered last in this section.

Robust Experiment Design

In robust experiment design, it is assumed that it is known beforehand that the true parameter belongs to some set, i.e., θ_o ∈ Θ. A minimax
A PRBS is a periodic signal and can therefore be split into its Fourier terms. With a period of M, each such term corresponds to one frequency on the grid 2πk/M, k = 0, …, M−1. Such a signal can thus be used to estimate at most M parameters. Another way to generate a signal with period M is to add sinusoids corresponding to the above frequencies, with desired amplitudes. A periodic signal generated in this way is commonly referred to as a multisine. The crest factor of a multisine depends heavily on the relation between the phases of the sinusoids. It is possible to optimize the crest factor with respect to the choice of phases (Rivera et al. 2009). There exist also simple deterministic methods for choosing phases that give a good crest factor, e.g., Schroeder phasing. Alternatively, phases can be drawn randomly and independently, giving what is known as random-phase multisines (Pintelon and Schoukens 2012), a family of random signals with properties similar to Gaussian signals. Periodic signals have some useful features:
• Estimation of nonlinearities. A linear time-invariant system responds to a periodic input signal with a signal consisting of the same frequencies, but with different amplitudes and phases. Thus, it can be concluded that the system is nonlinear if the output contains other frequencies than the input. This can be explored in a systematic way to estimate also the nonlinear part of a system.
• Estimation of noise variance. For a linear time-invariant system, the difference in the output between different periods is due entirely to the noise if the system is in steady state. This can be used to devise simple methods to estimate the noise level.
• Data compression. By averaging measurements over different periods, the noise level can be reduced at the same time as the number of measurements is reduced.
Further details on waveform generation and general-purpose signals useful in system identification can be found in Pintelon and Schoukens (2012) and Ljung (1999).

Implications for the Identification Problem Per Se

In order to get some understanding of how optimal experimental conditions influence the identification problem, let us return to the finite-impulse response model example in section "Computational Issues." Consider a least-costly setting with an acceptable performance constraint. More specifically, we would like to use the minimum input energy that ensures that the parameter estimate ends up in a set of the type (3). An approximate solution to this is that a 99 % confidence ellipsoid for the resulting estimate is contained in E_app. Now, it can be shown that a confidence ellipsoid is a level set for the average least-squares cost E[V_N(θ)] = E[‖Y − Φθ‖²] = ‖θ − θ_o‖²_{Φ^TΦ} + σ². Assuming the application cost V_app also is quadratic in θ, it follows after a little algebra (see Hjalmarsson 2009) that it must hold that

E[V_N(θ)] ≥ σ² (1 + c·V_app(θ)),   ∀θ    (8)

for a constant c that is not important for our discussion. The value of E[V_N(θ)] = ‖θ − θ_o‖²_{Φ^TΦ} + σ² is determined by how large the weighting Φ^TΦ is, which in turn depends on how large the input u is. In a least-costly setting with the energy ‖u‖² as criterion, the best solution would be that we have equality in (8). Thus we see that optimal experiment design tries to shape the identification criterion after the application cost. We have the following implications of this result:
(i) Perform identification under appropriate scaling of the desired operating conditions. Suppose that V_app(θ) is a function of how the system outputs deviate from a desired trajectory (determined by θ_o). Performing an experiment which follows the desired trajectory then gives that the sum of the squared prediction errors is an approximation of V_app(θ), at least for parameters close to θ_o. Obtaining equality in (8) typically requires an additional scaling
of the input excitation or the length of the experiment. The result is intuitively appealing: The desired operating conditions should reveal the system properties that are important in the application.
(ii) Identification cost for application performance. We see that the required energy grows (almost) linearly with γ, which is a measure of how close to the ideal performance (using the true parameter θ_o) we want to come. Furthermore, it is typical that as the performance requirements in the application increase, the sensitivity to model errors increases. This means that V_app(θ) increases, which in turn means that the identification cost increases. In summary, the identification cost will be higher, the higher the performance that is required in the application. The inequality (8) can be used to quantify this relationship.
(iii) Model structure sensitivity. As V_app will be sensitive to system properties important for the application, while insensitive to system properties of little significance, with the identification criterion V_N matched to V_app, it is only necessary that the model structure is able to model the important properties of the system.
In any case, whatever model structure is used, the identified model will be the best possible in that structure for the intended application. This is very different from an arbitrary experiment, where it is impossible to control the model fit when a model of restricted complexity is used.
We conclude that optimal experiment design simplifies the overall system identification problem.

Identification for Control

Model-based control is one of the most important applications of system identification. Robust control ensures performance and stability in the presence of model uncertainty. However, the majority of such design methods do not employ the parametric ellipsoidal uncertainty sets resulting from standard system identification. In fact, only in the last decade have analysis and design tools for this type of model uncertainty started to emerge, e.g., Raynaud et al. (2000) and Gevers et al. (2003).

The advantages of matching the identification criterion to the application have long been recognized in this line of research. For control applications this typically implies that the identification experiment should be performed under the same closed-loop operating conditions as the controller to be designed. This was perhaps first recognized in the context of minimum variance control (see Gevers and Ljung 1986), where variance errors were the concern. Later on, this was recognized to be the case also for the bias error, although here pre-filtering can be used to achieve the same objective.

To account for the fact that the controller to be designed is not available, techniques where control and identification are iterated have been developed, cf. adaptive experiment design in section "Adaptive Experiment Design." Convergence of such schemes has been established when the true system is in the model set but has proved out of reach for models of restricted complexity.

In recent years, techniques integrating experiment design and model predictive control have started to appear. A general-purpose design criterion is used in Rathouský and Havlena (2013), while Larsson et al. (2013) uses an application-oriented criterion.

Summary and Future Directions

When there is the "luxury" to design the experiment, this opportunity should be seized by the user. Without informative data there is little that can be done. In this exposé we have outlined the techniques that exist but also emphasized that a well-conceived experiment, reflecting the intended application, can significantly simplify the overall system identification problem.

Further developments of computational techniques are high on the agenda, e.g., how to handle
off-line, therefore converting the MPC law into a continuous and piecewise-affine function of the parameter vector (Bemporad et al. 2002b). We review the main ideas of explicit MPC in the next section, referring the reader to Alessio and Bemporad (2009) for a more complete survey paper on explicit MPC.

Model Predictive Control Problem

Consider the following finite-time optimal control problem formulation for MPC:

V(x) = min_z  ℓ_N(x_N) + Σ_{k=0}^{N−1} ℓ(x_k, u_k)    (1a)
s.t.  x_{k+1} = A x_k + B u_k                          (1b)
      C_x x_k + C_u u_k ≤ c,   k = 0, …, N−1           (1c)
      C_N x_N ≤ c_N                                    (1d)
      x_0 = x                                          (1e)

where N is the prediction horizon; x ∈ R^m is the current state vector of the controlled system; u_k ∈ R^{n_u} is the vector of manipulated variables at prediction time k, k = 0, …, N−1; z ≜ [u_0^T … u_{N−1}^T]^T ∈ R^n, n ≜ n_u N, is the vector of decision variables to be optimized;

ℓ(x, u) = ½ (x^T Q x + u^T R u)    (2a)
ℓ_N(x) = ½ x^T P x                 (2b)

are the stage cost and terminal cost, respectively; Q, P are symmetric and positive semidefinite matrices; and R is a symmetric and positive definite matrix.

Let n_c ∈ N be the number of constraints imposed at prediction time k = 0, …, N−1, namely, C_x ∈ R^{n_c×m}, C_u ∈ R^{n_c×n_u}, c ∈ R^{n_c}, and let n_N be the number of terminal constraints, namely, C_N ∈ R^{n_N×m}, c_N ∈ R^{n_N}. The total number q of linear inequality constraints imposed in the MPC problem formulation (1) is q = N n_c + n_N.

By eliminating the states x_k = A^k x + Σ_{j=0}^{k−1} A^j B u_{k−1−j} from problem (1), the optimal control problem (1) can be expressed as the convex quadratic program (QP):

V*(x) ≜ min_z  ½ z^T H z + x^T F^T z + ½ x^T Y x    (3a)
        s.t.  G z ≤ W + S x                          (3b)

where H = H^T ∈ R^{n×n} is the Hessian matrix; F ∈ R^{n×m} defines the linear term of the cost function; Y ∈ R^{m×m} has no influence on the optimizer, as it only affects the optimal value of (3a); and the matrices G ∈ R^{q×n}, S ∈ R^{q×m}, W ∈ R^q define in a compact form the constraints imposed in (1). Because of the assumptions made on the weight matrices Q, R, P, matrix H is positive definite and the matrix [H F^T; F Y] is positive semidefinite.

The MPC control law is defined by setting

u(x) = [I 0 … 0] z(x)    (4)

where z(x) is the optimizer of the QP problem (3) for the current value of x and I is the identity matrix of dimension n_u × n_u.

Multiparametric Solution

Rather than using a numerical QP solver online to compute the optimizer z(x) of (3) for each given current state vector x, the basic idea of explicit MPC is to pre-solve the QP off-line for the entire set of states x (or for a convex polyhedral subset X ⊆ R^m of interest) to get the optimizer function z, and therefore the MPC control law u, explicitly as a function of x.

The main tool to get such an explicit solution is multiparametric quadratic programming (mpQP). For mpQP problems of the form (3), Bemporad et al. (2002b) proved that the optimizer function z : X_f → R^n is piecewise affine and continuous over the set X_f of parameters x for which the problem is feasible (X_f is a polyhedral set, possibly X_f = X) and that
Explicit Model Predictive Control 407
Explicit Model Predictive Control, Fig. 1 Explicit MPC solution for the double integrator example
the value function V : X_f → R associating with every x ∈ X_f the corresponding optimal value of (3) is continuous, convex, and piecewise quadratic.

An immediate corollary is that the explicit version of the MPC control law u in (4), being the first n_u components of vector z(x), is also a continuous and piecewise-affine state-feedback law defined over a partition of the set X_f of states into M polyhedral cells:

u(x) = { F_1 x + g_1   if  H_1 x ≤ K_1
       {     ⋮                ⋮          (5)
       { F_M x + g_M   if  H_M x ≤ K_M

An example of such a partition is depicted in Fig. 1. The explicit representation (5) has mapped the MPC law (4) into a lookup table of linear gains, meaning that for each given x, the values computed by solving the QP (3) online and those obtained by evaluating (5) are exactly the same.

Multiparametric QP Algorithms

A few algorithms have been proposed in the literature to solve the mpQP problem (3). All of them construct the solution by exploiting the Karush-Kuhn-Tucker (KKT) conditions for optimality:

H z + F x + G^T λ = 0                            (6a)
λ_i (G_i z − W_i − S_i x) = 0,  ∀i = 1, …, q     (6b)
G z ≤ W + S x                                    (6c)
λ ≥ 0                                            (6d)

where λ ∈ R^q is the vector of Lagrange multipliers. For the strictly convex QP (3), conditions (6) are necessary and sufficient to characterize optimality.

An mpQP algorithm starts by fixing an arbitrary starting parameter vector x_0 ∈ R^m (e.g., the origin x_0 = 0), solving the QP (3) to get the optimal solution z(x_0), and identifying the subset

G̃ z(x) = S̃ x + W̃    (7a)

of all constraints (6c) that are active at z(x_0), and the remaining inactive constraints:

Ĝ z(x) ≤ Ŝ x + Ŵ    (7b)
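As a concrete illustration of how the KKT conditions (6) and the active-set split (7) are used, here is a small sketch (a hypothetical two-variable strictly convex QP with a single constraint, in Python/NumPy) that solves the linear KKT system under the assumption that the constraint is active:

```python
import numpy as np

# Hypothetical strictly convex QP: min 1/2 z'Hz + q'z  s.t.  G z <= w.
H = np.array([[2.0, 0.0], [0.0, 2.0]])
q = np.array([-2.0, -4.0])
G = np.array([[1.0, 1.0]])
w = np.array([1.0])

# Assume the constraint is active (as in (7a)): combine stationarity (6a)
# with G z = w into one linear system [[H, G'], [G, 0]] [z; lam] = [-q; w].
KKT = np.block([[H, G.T], [G, np.zeros((1, 1))]])
sol = np.linalg.solve(KKT, np.concatenate([-q, w]))
z, lam = sol[:2], sol[2:]

print(np.allclose(H @ z + q + G.T @ lam, 0))  # prints True: (6a) holds
print(lam[0] >= 0 and np.allclose(G @ z, w))  # prints True: (6d) and (7a) hold
```

Since the multiplier comes out nonnegative, the assumed active set is optimal here; in a full mpQP solver this check drives the exploration of neighboring active-set combinations as x varies.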
Correspondingly, in view of the complementarity condition (6b), the vector of Lagrange multipliers is split into two subvectors:

λ̃(x) ≥ 0    (8a)
λ̂(x) = 0    (8b)

which are used to solve the mpQP problem and obtain the explicit form of the MPC controller. For example, for a tracking problem with no anticipative action (r_k ≡ r_0, ∀k = 0, …, N−1), no measured disturbance, and fixed upper and lower bounds, the explicit solution is a continuous piecewise-affine function of the parameter vector [x^T r_0^T u_{−1}^T]^T. Note that prediction models and/or weight matrices in (1) cannot be treated as parameters to maintain the mpQP formulation (3).

All methods handle the case of degeneracy, which may happen for some combinations of active constraints that are linearly dependent, that is, the associated matrix G̃ does not have full row rank (in this case, λ̃(x) may not be uniquely defined).

Linear MPC Based on Convex Piecewise-Affine Costs

A similar setting can be repeated for MPC problems based on linear prediction models and convex piecewise-affine costs, such as 1- and ∞-norms. In this case, the MPC problem is mapped into a multiparametric linear programming (mpLP) problem, whose solution is again continuous and piecewise affine with respect to the vector of parameters. For details, see Bemporad et al. (2002a).

Robust MPC

Explicit solutions to min-max MPC problems that provide robustness with respect to additive and/or multiplicative unknown-but-bounded uncertainty were proposed in Bemporad et al. (2003), based on a combination of mpLP and dynamic programming. Again the solution is piecewise affine with respect to the state vector.

Hybrid MPC

An MPC formulation based on 1- or ∞-norms and hybrid dynamics expressed in mixed-logical dynamical (MLD) form can be solved explicitly by treating the optimization problem associated with MPC as a multiparametric mixed integer linear programming (mpMILP) problem. The solution is still piecewise affine but may be discontinuous, due to the presence of binary variables (Bemporad et al. 2000). A better approach based on dynamic programming combined with mpLP (or mpQP) was proposed in Borrelli et al. (2005) for hybrid systems in piecewise-affine (PWA) dynamical form and linear (or quadratic) costs.

Applicability of Explicit MPC

Complexity of the Solution

The complexity of the solution is given by the number M of regions that form the explicit solution (5), dictating the amount of memory required to store the parametric solution (F_i, g_i, H_i, K_i, i = 1, …, M) and the worst-case execution time required to compute F_i x + g_i once the problem of identifying the index i of the region {x : H_i x ≤ K_i} containing the current state x is solved (which usually takes most of the time). The latter is called the "point location problem," and a few methods have been proposed to solve the problem more efficiently than searching linearly through the list of regions (see, e.g., the tree-based approach of Tøndel et al. 2003b).

An upper bound on M is 2^q, which is the number of all possible combinations of active constraints. In practice, M is much smaller than 2^q, as most combinations are never active at optimality for any of the vectors x (e.g., lower and upper limits on an actuation signal cannot be active at the same time, unless they coincide). Moreover, regions in which the first n_u components of the multiparametric solution z(x) are the same can be joined together, provided that their union is a convex set (an optimal merging algorithm was proposed by Geyer et al. (2008) to get a minimal number M of partitions). Nonetheless, the complexity of the explicit MPC law typically grows exponentially with the number q of constraints. The number m of parameters is less critical and mainly affects the number of elements to be stored in memory (i.e., the number of columns of the matrices F_i, H_i). The number n of free variables also affects the number M of regions, mainly because they are usually upper and lower bounded.

Computer-Aided Tools

The Model Predictive Control Toolbox (Bemporad et al. 2014) has offered functions for designing explicit MPC controllers in MATLAB since 2014. Other tools exist, such as the Hybrid Toolbox (Bemporad 2003) and the Multi-Parametric Toolbox (Kvasnica et al. 2006).
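The lookup-table nature of (5) and the point location step can be sketched as follows (a hypothetical one-state law with two regions, continuous across the boundary; a linear search is used, whereas the tree-based methods mentioned above scale better):

```python
import numpy as np

# Hypothetical explicit MPC law of the form (5) on a 1-D state,
# with two regions split at x = 0 (the law is continuous at the boundary).
regions = [
    {"H": np.array([[1.0]]),  "K": np.array([0.0]),
     "F": np.array([[-0.5]]), "g": np.array([0.0])},   # region 1: x <= 0
    {"H": np.array([[-1.0]]), "K": np.array([0.0]),
     "F": np.array([[-1.5]]), "g": np.array([0.0])},   # region 2: x >= 0
]

def u_explicit(x):
    # Point location by linear search over {x : H_i x <= K_i},
    # then evaluation of the affine gain F_i x + g_i of that region.
    for r in regions:
        if np.all(r["H"] @ x <= r["K"] + 1e-9):
            return r["F"] @ x + r["g"]
    raise ValueError("x is outside the feasible set X_f")

print(float(u_explicit(np.array([-2.0]))[0]))  # prints 1.0  (region 1)
print(float(u_explicit(np.array([2.0]))[0]))   # prints -3.0 (region 2)
```

The online work is thus reduced to region membership tests and one matrix-vector product, which is exactly what makes explicit MPC attractive for fast, small problems.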
Summary and Future Directions

Explicit MPC is a powerful tool to convert an MPC design into an equivalent control law that can be implemented as a lookup table of linear gains. Whether the explicit form is preferable to solving the QP problem online depends on available CPU time, data memory, program memory, and other practical considerations. Although suboptimal methods have been proposed to reduce the complexity of the control law, the explicit MPC approach remains convenient for relatively small problems (such as one or two command inputs, short control and constraint horizons, up to ten states). For larger problems, and/or problems that are linear time-varying, online QP solution methods tailored to embedded MPC may be preferable.

Cross-References

► Model-Predictive Control in Practice
► Nominal Model-Predictive Control
► Optimization Algorithms for Model Predictive Control

Recommended Reading

For getting started in explicit MPC, we recommend reading the paper by Bemporad et al. (2002b) and the survey paper Alessio and Bemporad (2009). Hands-on experience using one of the MATLAB tools listed above is also useful for fully appreciating the potentials and limitations of explicit MPC. For understanding how to program a good multiparametric QP solver, the reader is recommended to take the approach of Tøndel et al. (2003a) and Spjøtvold et al. (2006) or, alternatively, of Patrinos and Sarimveis (2010) or Jones and Morari (2006).

Bibliography

Alessio A, Bemporad A (2009) A survey on explicit model predictive control. In: Magni L, Raimondo DM, Allgower F (eds) Nonlinear model predictive control: towards new challenging applications. Lecture notes in control and information sciences, vol 384. Springer, Berlin/Heidelberg, pp 345–369
Baotić M (2002) An efficient algorithm for multi-parametric quadratic programming. Tech. Rep. AUT02-05, Automatic Control Institute, ETH, Zurich
Bemporad A (2003) Hybrid toolbox – user's guide. http://cse.lab.imtlucca.it/~bemporad/hybrid/toolbox
Bemporad A, Borrelli F, Morari M (2000) Piecewise linear optimal controllers for hybrid systems. In: Proceedings of American control conference, Chicago, pp 1190–1194
Bemporad A, Borrelli F, Morari M (2002a) Model predictive control based on linear programming – the explicit solution. IEEE Trans Autom Control 47(12):1974–1985
Bemporad A, Morari M, Dua V, Pistikopoulos E (2002b) The explicit linear quadratic regulator for constrained systems. Automatica 38(1):3–20
Bemporad A, Borrelli F, Morari M (2003) Min-max control of constrained uncertain discrete-time linear systems. IEEE Trans Autom Control 48(9):1600–1606
Bemporad A, Morari M, Ricker N (2014) Model predictive control toolbox for MATLAB – user's guide. The MathWorks, Inc., http://www.mathworks.com/access/helpdesk/help/toolbox/mpc/
Borrelli F, Baotić M, Bemporad A, Morari M (2005) Dynamic programming for constrained optimal control of discrete-time linear hybrid systems. Automatica 41(10):1709–1721
Borrelli F, Bemporad A, Morari M (2011, in press) Predictive control for linear and hybrid systems. Cambridge University Press
Geyer T, Torrisi F, Morari M (2008) Optimal complexity reduction of polyhedral piecewise affine systems. Automatica 44:1728–1740
Jones C, Morari M (2006) Multiparametric linear complementarity problems. In: Proceedings of the 45th IEEE conference on decision and control, San Diego, pp 5687–5692
Kvasnica M, Grieder P, Baotić M (2006) Multi-parametric toolbox (MPT). http://control.ee.ethz.ch/~mpt/
Mayne D, Rawlings J (2009) Model predictive control: theory and design. Nob Hill Publishing, LLC, Madison
Patrinos P, Bemporad A (2014) An accelerated dual gradient-projection algorithm for embedded linear model predictive control. IEEE Trans Autom Control 59(1):18–33
Patrinos P, Sarimveis H (2010) A new algorithm for solving convex parametric quadratic programs based on graphical derivatives of solution mappings. Automatica 46(9):1405–1418
Ricker N (1985) Use of quadratic programming for constrained internal model control. Ind Eng Chem Process Des Dev 24(4):925–936
Spjøtvold J, Kerrigan E, Jones C, Tøndel P, Johansen TA (2006) On the facet-to-facet property of solutions to convex parametric quadratic programs. Automatica 42(12):2209–2214
Extended Kalman Filters 411
Tøndel P, Johansen TA, Bemporad A (2003a) An algorithm for multi-parametric quadratic programming and explicit MPC solutions. Automatica 39(3):489–497
Tøndel P, Johansen TA, Bemporad A (2003b) Evaluation of piecewise affine control via binary search tree. Automatica 39(5):945–950
Wang Y, Boyd S (2010) Fast model predictive control using online optimization. IEEE Trans Control Syst Technol 18(2):267–278

Extended Kalman Filters

Frederick E. Daum
Raytheon Company, Woburn, MA, USA

Synonyms

EKF

Abstract

The extended Kalman filter (EKF) is the most popular estimation algorithm in practical applications. It is based on a linear approximation to the Kalman filter theory. There are thousands of variations of the basic EKF design, which are intended to mitigate the effects of nonlinearities, non-Gaussian errors, ill-conditioning of the covariance matrix, and uncertainty in the parameters of the problem.

Keywords

Estimation; Nonlinear filters

The extended Kalman filter (EKF) is by far the most popular nonlinear filter in practical engineering applications. It uses a linear approximation to the nonlinear dynamics and measurements and exploits the Kalman filter theory, which is optimal for linear and Gaussian problems; Gelb (1974) is the most accessible but thorough book on the EKF. The real-time computational complexity of the EKF is rather modest; for example, one can run an EKF with high-dimensional state vectors (d = several hundreds) in real time on a single microprocessor chip. The computational complexity of the EKF scales as the cube of the dimension of the state vector (d) being estimated. The EKF often gives good estimation accuracy for practical nonlinear problems, although the EKF accuracy can be very poor for difficult nonlinear non-Gaussian problems. There are many different variations of EKF algorithms, most of which are intended to improve estimation accuracy. In particular, the following types of EKFs are common in engineering practice: (1) second-order Taylor series expansion of the nonlinear functions, (2) iterated measurement updates that recompute the point at which the first-order Taylor series is evaluated for a given measurement, (3) second-order iterated (i.e., a combination of items 1 and 2), (4) special coordinate systems (e.g., Cartesian, polar or spherical, modified polar or spherical, principal axes of the covariance matrix ellipse, hybrid coordinates, quaternions rather than Euler angles, etc.), (5) preferred order of processing sequential scalar measurement updates, (6) decoupled or partially decoupled or quasi-decoupled covariance matrices, and many more variations. In fact, there is no such thing as "the" EKF; rather, there are thousands of different versions of the EKF. There are also many different versions of the Kalman filter itself, and all of these can be used to design EKFs as well. For example, there are many different equations to update the Kalman filter error covariance matrices with the intent of mitigating ill-conditioning and improving robustness, including (1) square-root factorization of the covariance matrix, (2) information matrix update, (3) square-root information update, (4) Joseph's robust version of the covariance matrix update, (5) at least three distinct algebraic versions of the covariance matrix update, as well as hybrids of the above.

Many of the good features of the Kalman filter are also enjoyed by the EKF, but unfortunately not all. For example, we have a very good theory of stability for the Kalman filter, but there is no theory that guarantees that an EKF will be stable in practical applications. The only method
to check whether the EKF is stable is to run the coordinate system. Moreover, in math and
Monte Carlo simulations that cover the relevant physics, coordinate-free methods are preferred,
regions in state space with the relevant measure- owing to their greater generality and simplicity
ment parameters (e.g., data rate and measurement and power. The physics does not depend on the
accuracy). Secondly, the Kalman filter computes specific coordinate system; this is essentially a
the theoretical error covariance matrix, but there definition of what “physics” means, and it has
is no guarantee that the error covariance matrix resulted in great progress in physics over the
computed by the EKF approximates the actual last few hundred years (e.g., general relativity,
filter errors, but rather the EKF covariance ma- gauge invariance in quantum field theory, Lorentz
trix could be optimistic by orders of magnitude invariance in special relativity, as well as a host
in real applications. Third, the numerical val- of conservation laws in classical mechanics that
ues of the process noise covariance matrix can are explained by Noether’s theorem which relates
be computed theoretically for the Kalman filter, invariance to conserved quantities). Similarly
but there is no guarantee that these will work in math, coordinate-free methods have been the
well for the EKF, but rather engineers typically royal road to progress over the last 100 years
tune the process noise covariance matrix using Monte Carlo simulations or else use a heuristic adaptive process (e.g., IMM). All of these shortcomings of the EKF compared with Kalman filter theory are due to a myriad of practical issues, including (1) nonlinearities in the dynamics or measurements, (2) non-Gaussian measurement errors, (3) unmodeled measurement error sources (e.g., residual sensor bias), (4) unmodeled errors in the dynamics, (5) data association errors, (6) unresolved measurement data, and (7) ill-conditioning of the covariance matrix. The actual estimation accuracy of an EKF can only be gauged by Monte Carlo simulations over the relevant parameter space.

The actual performance of an EKF can depend crucially on the specific coordinate system that is used to represent the state vector. This is extremely well known in practical engineering applications (e.g., see Mehra 1971; Stallard 1991; Miller 1982; Markley 2007; Daum 1983; Schuster 1993). Intuitively, this is because the dynamics and measurement equations can be exactly linear in one coordinate system but not in another; this is easy to see: start with dynamics and measurements that are exactly linear in Cartesian coordinates, transform to polar coordinates, and the result is highly nonlinear equations. Likewise, we can have approximately linear dynamics and measurements in one coordinate system but highly nonlinear equations in another. In theory, the optimal estimation accuracy does not depend on the choice of coordinates, but not so for practical engineering of EKFs, because EKFs are approximations rather than being exact, and the accuracy of the EKF approximation depends crucially on the specific coordinate system used. Moreover, the effect of ill-conditioning of the covariance matrices in EKFs depends crucially on the specific coordinate system used in the computer; for example, if we could compute the EKF in principal coordinates, then the covariance matrices would be diagonal, and there would be no effect of ill-conditioning, despite enormous condition numbers of the covariance matrices. Surprisingly, these two simple points about coordinate systems are still not well understood by many researchers in nonlinear filtering.

Cross-References

Estimation, Survey on
Kalman Filters
Nonlinear Filters
Particle Filters

Bibliography

Crisan D, Rozovskii B (eds) (2011) The Oxford handbook of nonlinear filtering. Oxford University Press, Oxford/New York
Daum FE, Fitzgerald RJ (1983) Decoupled Kalman filters for phased array radar tracking. IEEE Trans Autom Control 28:269–283
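The point above, that the actual estimation accuracy of an EKF can only be gauged by Monte Carlo simulation over the relevant parameter space, can be sketched with a minimal harness for a scalar system with a mildly nonlinear measurement. The model, noise levels, and trial count below are illustrative assumptions, not taken from the entry:

```python
import numpy as np

# Monte Carlo evaluation of EKF accuracy on a toy scalar problem
# (hypothetical model: x' = a x + w, y = h(x) + v; all values illustrative).
rng = np.random.default_rng(0)
a, Q, R, steps, trials = 0.95, 0.1, 0.5, 100, 200
h  = lambda x: x + 0.05 * x**2          # mildly nonlinear measurement
dh = lambda x: 1.0 + 0.10 * x           # its Jacobian

sq_err = []
for _ in range(trials):
    x, xh, P = 0.0, 0.0, 1.0
    for _ in range(steps):
        x = a * x + rng.normal(0.0, np.sqrt(Q))   # true state
        y = h(x) + rng.normal(0.0, np.sqrt(R))    # measurement
        xh, P = a * xh, a * a * P + Q             # EKF predict
        H = dh(xh)                                # linearize about the estimate
        K = P * H / (H * H * P + R)               # EKF gain
        xh, P = xh + K * (y - h(xh)), (1.0 - K * H) * P
    sq_err.append((x - xh) ** 2)

rmse = np.sqrt(np.mean(sq_err))
print(f"Monte Carlo RMSE over {trials} trials: {rmse:.3f}")
```

Repeating such runs over the relevant range of noise levels and initial conditions is exactly the kind of empirical accuracy assessment the entry describes.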
Extremum Seeking Control

Miroslav Krstic
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

Extremum seeking (ES) is a method for real-time non-model-based optimization. Though ES was invented in 1922, the "turn of the twenty-first century" has been its golden age, both in terms of the development of theory and in terms of its adoption in industry and in fields outside of control engineering. This entry overviews basic gradient- and Newton-based versions of extremum seeking with periodic and stochastic perturbation signals.

Keywords

Gradient climbing; Newton's method

The Basic Idea of Extremum Seeking

Many versions of extremum seeking exist, with various approaches to their stability study (Krstic and Wang 2000; Liu and Krstic 2012; Tan et al. 2006). The most common version employs perturbation signals for the purpose of estimating the gradient of the unknown map that is being optimized. To understand the basic idea of extremum seeking, it is best to first consider the case of a static single-input map of the quadratic form, as shown in Fig. 1.

Three different thetas appear in Fig. 1: θ* is the unknown optimizer of the map, θ̂(t) is the real-time estimate of θ*, and θ(t) is the actual input into the map. The actual input θ(t) is based on the estimate θ̂(t) but is perturbed by the signal a sin(ωt) for the purpose of estimating the unknown gradient f′(θ) of the map f(θ). The sinusoid is only one choice for a perturbation signal; many other perturbations, from square waves to stochastic noise, can be used in lieu of sinusoids, provided they are of zero mean. The estimate θ̂(t) is generated with the integrator k/s, with the adaptation gain k controlling the speed of estimation.

The ES algorithm is successful if the error between the estimate θ̂(t) and the unknown θ*, namely, the signal

θ̃(t) = θ̂(t) − θ*   (1)

converges to zero.
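The loop just described (perturb with a sin(ωt), demodulate the measured output with sin(ωt), and integrate with gain k) can be sketched in a few lines. The quadratic map and all numerical values below are illustrative assumptions:

```python
import numpy as np

# Minimal gradient-based extremum seeking on a static quadratic map.
# theta_star, q, and the tuning parameters are illustrative assumptions.
theta_star, q = 2.0, 1.0
f = lambda th: -q * (th - theta_star) ** 2   # unknown map with maximum at theta_star

a, omega, k = 0.2, 10.0, 2.0     # perturbation amplitude/frequency, adaptation gain
dt, T = 1e-3, 60.0
theta_hat = 0.0                   # initial estimate, far from theta_star

t = 0.0
while t < T:
    theta = theta_hat + a * np.sin(omega * t)    # perturbed input theta(t)
    y = f(theta)                                 # measured map output
    theta_hat += dt * k * np.sin(omega * t) * y  # demodulate and integrate (k/s)
    t += dt

print(f"estimate after {T:.0f}s: {theta_hat:.3f} (optimizer is {theta_star})")
```

On average the update behaves like θ̃̇ ≈ −kaqθ̃, so the estimate climbs the gradient toward the maximizer, with a small residual ripple caused by the perturbation.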
Extremum Seeking Control, Fig. 3 The ES algorithm in the presence of dynamics, with an equilibrium map θ ↦ y that satisfies the same conditions as in the static case. If the dynamics are stable and the user employs parameters in the ES algorithm that make the algorithm dynamics slower than the dynamics of the plant, convergence is guaranteed (at least locally). The two filters, s/(s + ωh) and ωl/(s + ωl), are useful in the implementation to reduce the adverse effect of the perturbation signals on asymptotic performance but are not needed in the stability analysis.
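A discrete-time sketch of the scheme in Fig. 3, with a fast first-order stable plant standing in for the dynamics and the washout and low-pass filters implemented as first-order updates, might look as follows. All gains, corner frequencies, and the map are illustrative assumptions:

```python
import numpy as np

# ES in the presence of (fast, stable) plant dynamics, with the washout
# filter s/(s+omega_h) (z = y - eta) and low-pass filter omega_l/(s+omega_l).
# The plant, map, and every numerical value are illustrative assumptions.
theta_star, q = 2.0, 1.0
f = lambda th: 5.0 - q * (th - theta_star) ** 2   # equilibrium map theta -> y

a, omega, k = 0.2, 10.0, 2.0      # perturbation and adaptation gain (slow scale)
omega_h, omega_l = 1.0, 1.0       # filter corner frequencies
dt, T = 1e-3, 80.0

theta_hat, x, eta, G_hat = 0.0, 0.0, 0.0, 0.0
t = 0.0
while t < T:
    theta = theta_hat + a * np.sin(omega * t)
    x += dt * 50.0 * (f(theta) - x)            # fast stable plant, output y = x
    y = x
    eta += dt * omega_h * (y - eta)            # z = y - eta: washed-out output
    z = y - eta
    G_hat += dt * omega_l * (z * np.sin(omega * t) - G_hat)  # demodulate + low-pass
    theta_hat += dt * k * G_hat                # integrator k/s
    t += dt

print(f"theta_hat = {theta_hat:.3f} (optimizer is {theta_star})")
```

Note the time-scale separation the caption alludes to: the plant pole (50) is fastest, the perturbation frequency (10) is intermediate, and the adaptation and filters (around 1) are slowest.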
If, for example, the map Q(·) has a maximum that is locally quadratic (which implies H = Hᵀ < 0) and if the user chooses the elements of the diagonal gain matrix K as positive, the ES algorithm is guaranteed to be locally convergent. However, the convergence rate depends on the unknown Hessian H. This weakness of the gradient-based ES algorithm is removed with the Newton-based ES algorithm.

A stochastic version of the algorithm in Fig. 2 also exists, in which S(t) and M(t) are replaced by

S(η(t)) = [a₁ sin(η₁(t)), …, aₙ sin(ηₙ(t))]ᵀ   (9)

M(η(t)) = [2 sin(η₁(t))/(a₁(1 − e^(−q₁²))), …, 2 sin(ηₙ(t))/(aₙ(1 − e^(−qₙ²)))]ᵀ   (10)

where ηᵢ = [qᵢ√εᵢ/(εᵢs + 1)]Ẇᵢ and the Ẇᵢ are independent unity-intensity white noise processes.

ES for Dynamic Systems

ES extends in a relatively straightforward manner from static maps to dynamic systems, provided the dynamics are stable and the algorithm's parameters are chosen so that the algorithm's dynamics are slower than those of the plant. The algorithm is shown in Fig. 3.

The technical conditions for convergence in the presence of dynamics are that the equilibria x = l(θ) of the system ẋ = f(x, α(x, θ)), where α(x, θ) is the control law of an internal feedback loop, are locally exponentially stable uniformly in θ and that, given the output map y = h(x), there exists at least one θ* ∈ ℝⁿ such that (∂/∂θ)(h∘l)(θ*) = 0 and (∂²/∂θ²)(h∘l)(θ*) = H < 0, H = Hᵀ.

The stability analysis in the presence of dynamics employs both averaging and singular perturbations, in a specific order. The design guidelines for the selection of the algorithm's parameters follow the analysis. Though the guidelines are too lengthy to state here, they ensure that the plant's dynamics are on a fast time scale, the perturbations are on a medium time scale, and the ES algorithm is on a slow time scale.

Newton ES Algorithm for Static Map

A Newton version of the ES algorithm, shown in Fig. 4, ensures that the convergence rate is user assignable, rather than being dependent on the unknown Hessian of the map.

The elements of the demodulating matrix N(t) for generating the estimate of the Hessian are given by
Extremum Seeking Control, Fig. 4 A Newton-based ES algorithm for a static map. The multiplicative excitation N(t) helps generate the estimate of the Hessian ∂²Q(θ)/∂θ² as Ĥ(t) = N(t)y(t). The Riccati matrix differential equation Γ̇ = ωᵣΓ − ωᵣΓĤΓ generates an estimate Γ(t) of the Hessian's inverse matrix, avoiding matrix inversions of Hessian estimates that may be singular during the transient.
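For a single-parameter map, the Newton scheme of Fig. 4 can be sketched as follows. The demodulating signals used here, M(t) = (2/a) sin(ωt) for the gradient and N(t) = −(8/a²) cos(2ωt) for the Hessian, are a common single-parameter choice; the gains, filters, and initial values are illustrative assumptions:

```python
import numpy as np

# Newton-based ES for a scalar static map, following the structure of Fig. 4:
# Hessian estimate via N(t), Riccati update Gamma -> 1/H, step -k*Gamma*G_hat.
# All numerical values are illustrative assumptions.
theta_star, q = 2.0, 1.0
f = lambda th: -q * (th - theta_star) ** 2    # unknown map, Hessian H = -2q

a, omega = 0.5, 20.0
k, wl, wr = 0.5, 1.0, 1.0
dt, T = 1e-3, 30.0

# H_hat initialized negative so the Riccati equation starts near a
# stabilizing estimate (an assumption of this sketch).
theta_hat, G_hat, H_hat, Gamma = 1.0, 0.0, -1.0, -0.5
t = 0.0
while t < T:
    y = f(theta_hat + a * np.sin(omega * t))
    M = (2.0 / a) * np.sin(omega * t)              # gradient demodulation
    N = -(8.0 / a**2) * np.cos(2.0 * omega * t)    # Hessian demodulation
    G_hat += dt * wl * (M * y - G_hat)             # low-pass gradient estimate
    H_hat += dt * wl * (N * y - H_hat)             # low-pass Hessian estimate
    Gamma += dt * (wr * Gamma - wr * Gamma * H_hat * Gamma)  # Riccati: Gamma -> 1/H
    theta_hat += dt * (-k * Gamma * G_hat)         # Newton step, rate set by k
    t += dt

print(f"theta_hat = {theta_hat:.3f}, H_hat = {H_hat:.2f} (true Hessian = {-2*q})")
```

On average θ̂̇ ≈ −kθ̃, so the convergence rate is set by the user's gain k rather than by the unknown Hessian, which is the point of the Newton version.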
Fault Detection and Diagnosis

Janos Gertler
George Mason University, Fairfax, VA, USA

Synonyms

FDD

Abstract

The fundamental concepts and methods of fault detection and diagnosis are reviewed. Faults are defined and classified as additive or multiplicative. The model-free approach of alarm systems is described and critiqued. Residual generation, using the mathematical model of the plant, is introduced. The propagation of additive and multiplicative faults to the residuals is discussed, followed by a review of the effect of disturbances, noise, and model errors. Enhanced residuals (structured and directional) are introduced. The main residual generation techniques are briefly described, including direct consistency relations, parity space, and diagnostic observers. Principal component analysis and its application to fault detection and diagnosis are outlined. The article closes with some thoughts about future directions.

Keywords

Consistency relations; Diagnostic observers; Fault detection; Fault diagnosis; Parity space; Principal component analysis; Residual generation

Introduction

Faults are malfunctions of various elements of technical systems. Extreme cases of faults, called failures, are catastrophic breakdowns of the same. The technical systems (the plant) we are concerned with range from complex production systems (chemical plants, oil refineries, power stations) through major transportation equipment (airplanes, ships) to consumer machines (automobiles, home-heating systems, etc.). The faults may affect various parts of the main technical system (motors, pumps, storage tanks, pipelines) or devices interfacing the main technical system with computers providing for control, monitoring, and operator information. These latter include sensors (measuring devices) and actuators (devices acting on the process, such as valves).

The objective of fault detection is to determine and signal if there is a fault anywhere in the system. Fault diagnosis is aimed at providing more specific information about the fault; fault isolation is to pinpoint the component(s) (sensors, actuators, or plant components) where the fault is located, while fault identification is to determine (estimate) the size of the fault and, in some cases, the time of its arrival. With the ubiquitous presence of the computer, fault detection and diagnosis (FDD) is, in general, a function of the computer interfaced to the plant.

The simplest approaches to FDD consist of comparing individual plant measurements to preset limits, without utilizing any knowledge of the plant model (limit checking or alarm systems). More sophisticated techniques rely on an explicit mathematical model of the plant. They compare plant measurements to estimates obtained, from other measurements, by the model; any discrepancy may be an indication of faults. Another class of techniques (generally but incorrectly called "data driven"), most notably principal component analysis (PCA), includes the estimation of an implicit model, from empirical plant data, and then uses this in ways similar to the model-based methods. These approaches will be described in more detail in the sequel.

Alarm Systems

Model-Based FDD Concepts

Model-based methods utilize an explicit mathematical model of the plant. Such a model is usually obtained from empirical plant data by system identification methods or, exceptionally, from a "first principles" understanding of the plant. Model building, though critical to the success of model-based FDD, is usually not considered part of the FDD effort. The models may be linear or nonlinear, static or dynamic, and continuous or discrete time. In FDD, linear discrete-time dynamic models are used most frequently.

The fundamental idea of model-based FDD is the comparison of measured plant outputs to their estimates, obtained, via the mathematical model, from measured or actuated plant inputs (Fig. 1). Any discrepancy is (at least ideally) an indication that a fault (or faults) is present in the system. Mathematically, the difference between the measured output yᵢ(t) and its estimate ŷᵢ(t) is a (primary) residual (Willsky 1976):

eᵢ(t) = yᵢ(t) − ŷᵢ(t)
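The primary-residual idea can be sketched for a hypothetical first-order discrete-time plant, with an additive sensor fault injected mid-run. The model, noise level, and threshold are all illustrative assumptions:

```python
import numpy as np

# Primary residual generation for a hypothetical first-order plant
# y(t) = 0.8 y(t-1) + 0.5 u(t-1), with a sensor bias fault arriving at t = 60.
rng = np.random.default_rng(1)
T, a, b = 120, 0.8, 0.5
u = np.sin(0.1 * np.arange(T))                       # known input sequence

y = np.zeros(T)
for t in range(1, T):
    y[t] = a * y[t-1] + b * u[t-1] + rng.normal(0, 0.02)   # plant + noise
y_meas = y + np.where(np.arange(T) >= 60, 0.5, 0.0)        # additive sensor fault

# Residual: measured output minus its model-based estimate from past data
e = np.zeros(T)
for t in range(1, T):
    y_est = a * y_meas[t-1] + b * u[t-1]
    e[t] = y_meas[t] - y_est

alarm = np.abs(e) > 0.2           # threshold chosen well above the noise level
first_alarm = int(np.argmax(alarm))
print("first alarm at t =", first_alarm)
```

Before the fault the residual is pure noise; the bias produces a clear jump at its arrival time, which a threshold test turns into a detection decision.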
test. Directional residuals are tested as vectors against multivariable distributions. The test thresholds are determined either theoretically, using assumptions about the source noise, or empirically, based on measurements from fault-free operating conditions.

Dealing with disturbances. Additive disturbances are unmeasurable inputs. If the disturbance-to-output transfer function (or equivalent state-space representation) is known, then it is possible to design residuals that are completely decoupled from (insensitive to) those disturbances. However, the FDD algorithm is subject to a certain degree of "design freedom," defined by the number of outputs in the physical system; disturbance decoupling competes for this freedom with fault isolation enhancement. If there are too many disturbances, or if their path to the outputs is unknown, then only approximate decoupling is possible, making FDD also approximate, usually designed to optimize some (H-infinity) performance index.

Dealing with model errors. Model errors are also unavoidable in most practical situations. This is the most serious obstacle in the application of model-based FDD techniques. In some very special cases, uncertainty of a particular plant parameter may be handled as a "multiplicative disturbance," and residuals designed to be explicitly decoupled from it. In general, however, only approximate solutions are possible, reducing the residuals' sensitivity to modeling errors at the expense of also reducing their sensitivity to faults. Design methods utilizing optimization techniques, mostly based on H-infinity or similar performance indices, are available in the literature (Edelmayer et al. 1994).

For linear dynamic systems, provided an exact (non-approximate) solution is possible, there are three major techniques to design residual generators: (i) direct consistency (parity) relations, (ii) parity space, and (iii) diagnostic observers. We will briefly introduce the three methods, for discrete-time plant models and additive faults. Note that though they look formally different, if designed for the same plant under the same design conditions, the three methods yield identical residuals (Gertler 1991).

Direct consistency (parity) relations (Gertler 1998). The input-output model of the plant is utilized directly in the design. The enhanced residuals are obtained from the primary residuals by a transformation W(q):

r(t) = W(q)e(t) = W(q)[y(t) − M(q)u(t)] = W(q)S(q)p(t)

The desired behavior of the residuals is specified as r(t) = Z(q)p(t), where the specification Z(q) contains the basic residual properties (structure or directions) plus the residual dynamics. The resulting design condition is W(q)S(q) = Z(q). If the S(q) matrix is square, which is usually the case (Gertler 1998), then this can be solved for W(q) by direct inversion. The residual generator has to be causal and stable; this can always be achieved by an appropriate modification of the dynamics in Z(q).

Parity space (Chow and Willsky 1984). This method, also known as the "Chow-Willsky scheme," relies on the state-space description of the system:

x(t + 1) = A x(t) + B u(t) + E p(t)
y(t) = C x(t) + D u(t) + F p(t)

Stacking n consecutive output vectors y(t) (where n is the order of the model), and chain-substituting the state x(t), yields the equation

Y(t) = J x(t − n) + K U(t) + L P(t)

where Y(t), U(t), and P(t) are stacked vectors and J, K, and L are hyper-matrices composed of the A, B, C, D, E, F matrices. Now

E(t) = Y(t) − K U(t) = L P(t) + J x(t − n)
would be a stacked vector of primary residuals, were it not for the presence of the inaccessible initial state x(t − n). To obtain true residuals, a transformation rᵢ(t) = wᵢE(t) is necessary, such that wᵢJ = 0. Any vector wᵢ satisfying this orthogonality condition is a parity vector; together they span the parity space. Any parity vector yields a true residual rᵢ(t); they can be chosen so that the set of residuals possesses structured behavior.

Diagnostic observers. Various observer schemes have been extensively investigated as possible residual generator algorithms. The basic full-order Luenberger observer (assuming D = 0) is

x̂(t + 1) = A x̂(t) + B u(t) + K e(t)

where K is the observer gain matrix and

e(t) = y(t) − C x̂(t)

is the innovation vector. If the observer is stable then, apart from the start-up transient of the observer, the innovation qualifies as the primary residual. The gain matrix K is the major design parameter; it is chosen to place the observer poles, thus achieving stability and desired dynamic behavior (e.g., noise suppression). The remaining design freedom can be utilized to influence residual properties. The latter are further affected by the transformation r(t) = H e(t), where the H matrix is an additional design parameter. Diagnostic observers can be designed for both structured and directional residuals (Chen and Patton 1999; White and Speyer 1987). Other observer schemes, most notably the unknown input observer, have also been proposed (Frank and Wunnenberg 1989). Because of their complexity, the detailed design procedures of diagnostic observers go beyond the scope of this entry.

Principal Component Analysis

By exploiting linear relations among the variables, PCA significantly reduces the dimensionality of the plant model (Kresta et al. 1991). The application of PCA for FDD implies two phases. In the training phase, an implicit plant model is created from empirical plant data. In the monitoring phase, this model is used for FDD.

Training data (measured inputs and outputs) are collected from the plant during fault-free operation. The covariance matrix of the data is formed and its eigenstructure obtained. Due to linear relations among the data, some of the eigenvalues will be zero (or near zero, in the presence of noise). The eigenvectors belonging to the nonzero eigenvalues form the data space, where the fault-free data exist, while those belonging to the zero eigenvalues form the residual space.

It is the residual space that is utilized for FDD. The projection of a measurement vector onto the residual space is the (primary) residual. A statistical test on its size leads to a detection decision (the absence or presence of faults). A threshold test is necessary because noise also causes nonzero residuals. An analysis of the eigenvectors spanning the residual space shows how the various faults propagate to the primary residual. This allows for the design of residual manipulations yielding structured or directional residuals, just like in the FDD methods based on exact models (Gertler et al. 1999).

The procedure as described above applies to sensor and actuator faults; inclusion of plant faults requires extra effort (and experiments). Also, PCA is primarily meant for static models. Its extension to discrete-time dynamic models is straightforward, but it increases the size of the model, proportionally to the dynamic order of the model.
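The two-phase PCA procedure can be sketched on synthetic data with one exact linear relation among three measured variables; the data, noise level, and thresholds are illustrative assumptions:

```python
import numpy as np

# PCA-based FDD sketch: learn the residual space from fault-free data
# (training phase), then project new measurements onto it (monitoring phase).
rng = np.random.default_rng(2)

# Training phase: 3 measured variables with one linear relation x3 = x1 + x2
x12 = rng.normal(size=(500, 2))
X = np.column_stack([x12, x12.sum(axis=1)]) + 0.01 * rng.normal(size=(500, 3))
X = X - X.mean(axis=0)

cov = np.cov(X, rowvar=False)
eigval, eigvec = np.linalg.eigh(cov)          # eigenvalues in ascending order
resid_space = eigvec[:, eigval < 0.1]         # eigenvectors of near-zero eigenvalues

# Monitoring phase: primary residual = projection onto the residual space
def residual(x):
    return resid_space.T @ x

ok    = np.array([1.0, 2.0, 3.0])             # consistent with x3 = x1 + x2
fault = np.array([1.0, 2.0, 4.0])             # sensor fault on x3
threshold = 0.05                               # set above the noise level
r_ok = float(np.linalg.norm(residual(ok)))
r_fault = float(np.linalg.norm(residual(fault)))
print(f"residual norms: ok = {r_ok:.3f}, faulty = {r_fault:.3f}")
```

The fault-free vector lies (up to noise) in the data space, so its projection onto the residual space is near zero, while the faulty vector violates the learned relation and produces a large residual.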
Summary and Future Directions

Much of the recent work is just minor refinements of earlier results. This applies particularly to the long ongoing quest to create "robust" FDD algorithms, especially in the face of model errors.

There are still open challenges in a couple of areas, most notably extensions to various nonlinear or parameter-varying problems. Another open and active area, of great practical importance, is FDD in networked control systems. What is really of the greatest interest, though, is the application of the wealth of available theoretical results and design methods to real-life problems; there has recently been some visible progress here, a most welcome development.

Cross-References

Bibliography

Isermann R (1984) Process fault detection based on modeling and estimation methods. Automatica 20:387–404
Kresta JV, MacGregor JF, Marlin TE (1991) Multivariate statistical monitoring of processes. Can J Chem Eng 69:35–47
White JE, Speyer JL (1987) Detection filter design: spectral theory and algorithm. IEEE Trans Autom Control AC-32:593–603
Willsky AS (1976) A survey of design methods for failure detection in dynamic systems. Automatica 12:601–611

Fault-Tolerant Control

Ron J. Patton
School of Engineering, University of Hull, Hull, UK
into either active/passive approaches with examples of some well-known strategies.

Keywords

Active FTC; Fault accommodation; Fault detection and diagnosis (FDD); Fault detection and isolation (FDI); Fault estimation (FE); Fault-tolerant control; Passive FTC; Reconfigurable control

Introduction

The complexity of modern engineering systems has led to strong demands for enhanced control system reliability, safety, and green operation in the presence of even minor anomalies. There is a growing need not only to determine the onset and development of process faults before they become serious but also to adaptively compensate for their effects in the closed-loop system or to use hardware redundancy to replace faulty components with duplicate, fault-free alternatives. The title "failure tolerant control" was given by Eterno et al. (1985), working on a reconfigurable flight control study, defining the meaning of control system tolerance to failures or faults. The word "failure" is used when a fault is so serious that the system function concerned fails to operate (Isermann 2006). The title failure detection has now been superseded by fault detection, e.g., in fault detection and isolation (FDI) or fault detection and diagnosis (FDD) (Chen and Patton 1999; Gertler 1998; Patton et al. 2000), motivated by studies in the 1980s on this topic (Patton et al. 1989). Fault-tolerant control (FTC) began to develop in the early 1990s (Patton 1993) and is now standard in the literature (Patton 1997; Blanke et al. 2006; Zhang and Jiang 2008), based on the aerospace subject of reconfigurable flight control making use of redundant actuators and sensors (Steinberg 2005; Edwards et al. 2010).

Definitions Relating to Fault-Tolerant Control

FTC is a strategy in control system architecture and design to ensure that a closed-loop system can continue acceptable operation in the face of bounded actuator, sensor, or process faults. The goal of FTC design is to ensure that the closed-loop system maintains satisfactory stability and acceptable performance during one or more fault actions. When prescribed stability and closed-loop performance indices are maintained despite the action of faults, the system is said to be "fault tolerant," and the control scheme that ensures the fault tolerance is the fault-tolerant controller (Blanke et al. 2006; Patton 1997).

Fault modelling is concerned with the representation of the real physical faults and their effects on the system mathematical model. Fault modelling is important to establish how a fault should be detected, isolated, or compensated.
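The additive actuator and sensor fault models commonly used in this setting can be sketched for a hypothetical scalar plant; the model and fault profiles are illustrative assumptions:

```python
# Additive actuator (f_a) and sensor (f_s) fault models for a hypothetical
# scalar discrete-time plant x(t+1) = a x(t) + b (u(t) + f_a(t)),
# y(t) = c x(t) + f_s(t). All numerical values are illustrative.
def simulate(T=50, f_a=lambda t: 0.0, f_s=lambda t: 0.0):
    a, b, c = 0.9, 1.0, 1.0
    x, ys = 0.0, []
    for t in range(T):
        u = 1.0                          # nominal constant input
        x = a * x + b * (u + f_a(t))     # actuator fault enters with the input
        ys.append(c * x + f_s(t))        # sensor fault corrupts the measurement
    return ys

y_nominal = simulate()
y_partial = simulate(f_a=lambda t: -0.5 if t >= 25 else 0.0)  # partial actuator loss
y_bias    = simulate(f_s=lambda t: 0.3 if t >= 25 else 0.0)   # sensor bias fault
print(round(y_nominal[-1], 2), round(y_partial[-1], 2), round(y_bias[-1], 2))
```

The partial actuator fault shifts the achievable steady state of the plant itself, whereas the sensor fault leaves the plant untouched and only biases the measurement, which is why the two classes call for different detection and compensation strategies.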
The faults illustrated in Fig. 1 act at system locations defined as follows (Chen and Patton 1999):

An actuator fault (fₐ(t)) corresponds to variations of the control input u(t) applied to the controlled system, either complete or partial. The complete failure of an actuator means that it produces no actuation regardless of the input applied to it, e.g., as a result of breakage or burnout of wiring. For partial actuator faults, the actuator becomes less effective and provides the plant with only a part of the normal actuation signal.

A sensor is an item of equipment that takes a measurement or observation from the system, e.g., potentiometers, accelerometers, tachometers, pressure gauges, strain gauges, etc.; a sensor fault (fₛ(t)) implies that incorrect measurements are taken from the real system. This fault can also be subdivided into either a complete or partial sensor fault. When a sensor fails, the measurements no longer correspond to the required physical parameters. For a partial sensor fault, the measurements give an inaccurate indication of the required physical parameters.

A process fault (fₚ(t)) directly affects the physical system parameters and, in turn, the input/output properties of the system. Process faults are often termed component faults, arising as variations from the structure or parameters used during system modelling, and as such cover a wide class of possible faults, e.g., dirty water having a different heat transfer coefficient compared to when it is clean, changes in the viscosity of a liquid, or components slowly degrading over time through wear and tear, aging, or environmental effects.

Architectures and Classification of FTC Schemes

FTC methods are classified according to whether they are "passive" or "active," using fixed or reconfigurable control strategies (Eterno et al. 1985). Various architectures have been proposed for the implementation of FTC schemes; for example, a structure of reconfigurable control based on generalized internal model control (GIMC) has been proposed by Zhou and Ren (2001), with other studies by Niemann and Stoustrup (2005). Figure 2 shows a suitable architecture to encompass active and passive FTC methods, in which a distinction is made between "execution" and "supervision" levels. The essential differences and requirements between passive FTC (PFTC) and active FTC (AFTC) are as follows.

PFTC is based solely on the use of robust control, in which potential faults are considered as if they are uncertain signals acting in the closed-loop system. This can be related to the concept of reliable control (Veillette et al. 1992). PFTC requires no online information from the fault diagnosis (FDI/FDD/FE) function about the occurrence or presence of faults; hence it is not by itself an adaptive system and does not involve controller reconfiguration (Patton 1993, 1997; Šiljak 1980). The PFTC approach can be used if the time window during which the system remains stabilizable in the presence of a fault is short; see, for example, the problem of the double inverted pendulum (Weng et al. 2007), which is unstable during a loop failure.

AFTC has two conceptual steps to provide the system with fault-tolerant capability (Blanke et al. 2006; Patton 1997; Zhang and Jiang 2008; Edwards et al. 2009):
• Equip the system with a mechanism to make it able to detect and isolate (or even estimate) the fault promptly, identify a faulty component, and select the required remedial action to maintain acceptable operational performance. With no fault, a baseline controller attenuates disturbances and ensures good stability and closed-loop tracking performance (Patton 1997), and the diagnostic (FDI/FDD/FE) block recognizes that the closed-loop system is fault-free, with no control law change required (supervision level).
• Make use of the supervision level information and adapt or reconfigure/restructure the controller parameters so that the required remedial activity can be achieved (execution level).

Figure 3 gives a classification of PFTC and AFTC methods (Patton 1997).
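The two conceptual AFTC steps can be sketched as a toy projection-based loop: a diagnostic block tests a one-step prediction residual, and the supervision level switches to a precomputed control law when the residual exceeds a threshold. The scalar plant and every numerical value are illustrative assumptions:

```python
# Toy projection-based AFTC loop: baseline control law, residual-based FDI
# (supervision level), and a switch to a precomputed "faulty-plant" law
# (execution level). Entirely illustrative scalar example.
def simulate(T=60, fault_time=30):
    a, r = 0.9, 1.0
    b_assumed = 1.0                        # input gain the controller believes in
    x, detected = 0.0, False
    for t in range(T):
        u = (r - a * x) / b_assumed        # current control law (tracks r)
        b_eff = 0.5 if t >= fault_time else 1.0  # actuator loses 50% effectiveness
        x_pred = a * x + b_assumed * u     # diagnostic block's one-step prediction
        x = a * x + b_eff * u              # true plant update
        if not detected and abs(x - x_pred) > 0.02:  # FDI residual test
            detected = True
            b_assumed = 0.5                # switch to precomputed faulty-plant law
    return x, detected

x_final, detected = simulate()
print("fault detected:", detected, "| final output:", round(x_final, 4))
```

Before the fault, the baseline law tracks the reference exactly and the residual stays at zero; the fault produces a nonzero residual, the supervisor switches laws, and tracking is restored.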
Fault-Tolerant Control, Fig. 2 Scheme of FTC (Adapted from Blanke et al. (2006))

Fault-Tolerant Control, Fig. 3 General classification of FTC methods
Figure 3 shows that AFTC approaches are divided into two main types of methods: projection-based methods and online automatic controller redesign methods. The latter involves the calculation of new controller parameters following control impairment, i.e., using reconfigurable control. In projection-based methods, a new precomputed control law is selected according to the required controller structure (i.e., depending on the type of isolated fault).

AFTC methods use online-fault accommodation based on unanticipated faults, classified as (Patton 1997):
(a) Based on offline (precomputed) control laws
(b) Online-fault accommodating
(c) Tolerant to unanticipated faults using FDI/FDD/FE
(d) Dependent upon use of a baseline controller

AFTC Examples

One example of AFTC is model-based predictive control (MPC), which uses online computed control redesign. MPC is online-fault accommodating; it does not use an FDI/FDD unit and is not dependent on a baseline controller. MPC has a certain degree of fault tolerance against actuator faults under some conditions, even if the faults are not detected. The representation of actuator faults in MPC is relatively natural and straightforward, since actuator faults such as jams and slew-rate reductions can be represented by changing the MPC optimization problem constraints. Other faults can be represented by modifying the internal model used by MPC (Maciejowski 1998). The fact that online-fault information is not required means that MPC is an interesting method for flight control reconfiguration, as demonstrated by Maciejowski and Jones in the GARTEUR AG 16 project on "fault-tolerant flight control" (Edwards et al. 2010).

Another interesting AFTC example that makes use of the concept of model-matching in explicit model following is the so-called pseudo-inverse method (PIM) (Gao and Antsaklis 1992), which requires the nominal or reference closed-loop system matrix to compute the new controller gain after a fault has occurred. The challenges are:
1. Guarantee of stability of the reconfigured closed-loop system
2. Minimization of the time consumed to approach the acceptable matching
3. Achieving perfect matching through use of different control methodologies
Exact model-matching may be too demanding, and some extensions to this approach make use of alternative, approximate (norm-based) model-matching through the computation of the required model-following gain. To relax the matching condition further, Staroswiecki (2005) proposed an admissible model-matching approach, which was later extended by Tornil et al. (2010) using D-region pole assignment. The PIM approach requires an FDI/FDD/FE mechanism and is online-fault accommodating only in terms of a priori anticipated faults. This limits the practical value of the approach.

As a third example, feedback linearization can be used to compensate for nonlinear dynamic effects while also implementing control law reconfiguration or restructuring. In flight control, an aileron actuator fault will cause a strong coupling between the lateral and longitudinal aircraft dynamics. Feedback linearization is an established technique in flight control (Ochi and Kanai 1991). The faults are identified indirectly by estimating aircraft flight parameters online, e.g., using a recursive least-squares algorithm to update the FTC.

Hence, an AFTC system provides fault tolerance either by selecting a precomputed control law (projection-based) (Boskovic and Mehra 1999; Maybeck and Stevens 1991; Rauch 1995) or by synthesizing a new control strategy online (online controller redesign) (Ahmed-Zaid et al. 1991; Richter et al. 2007; Efimov et al. 2012; Zou and Kumar 2011).

Another widely studied AFTC method is the estimation and compensation approach, where a fault compensation input is superimposed on the nominal control input (Noura et al. 2000; Boskovic and Mehra 2002; Sami and Patton 2013; Zhang et al. 2004). There is growing interest in robust FE methods based on sliding mode estimation (Edwards et al. 2000) and augmented observer methods (Gao and Ding 2007; Jiang et al. 2006; Sami and Patton 2013).

An important development of this approach is the so-called fault hiding strategy, which is centered on achieving FTC loop goals such that the nominal control loop remains unchanged, through the use of virtual actuators or virtual sensors (Blanke et al. 2006; Sami and Patton 2013). Fault hiding makes use of the difference between the nominal and faulty system state to adapt to changes in the system dynamics such that the required control objectives are continuously achieved even if a fault occurs. In the sensor fault case, the effect of the fault is hidden from the input of the controller. Actuator faults, however, are compensated by counteracting the effect of the fault (Lunze and Steffen 2006; Richter et al. 2007; Ponsart et al. 2010; Sami and Patton 2013), in which it is assumed that an FDI/FDD or FE scheme is available. The virtual actuator/sensor FTC can be of good practical value if FDI/FDD/FE robustness can be demonstrated.

Traditional adaptive control methods that automatically adapt controller parameters to system changes can be used in a special application of AFTC, potentially removing the need for FDI/FDD and controller redesign steps (Tang et al. 2004; Zou and Kumar 2011), but possibly using the FE function. Adaptive control is suitable for FTC on plants that have slowly
Fault-Tolerant Control 427
Isermann R (2006) Fault-diagnosis systems: an introduction from fault detection to fault tolerance. Springer, Berlin/New York
Jiang B, Staroswiecki M, Cocquempot V (2006) Fault accommodation for nonlinear dynamic systems. IEEE Trans Autom Control 51:1578–1583
Lunze J, Steffen T (2006) Control reconfiguration after actuator failures using disturbance decoupling methods. IEEE Trans Autom Control 51:1590–1601
Maciejowski JM (1998) The implicit daisy-chaining property of constrained predictive control. J Appl Math Comput Sci 8(4):101–117
Maybeck P, Stevens R (1991) Reconfigurable flight control via multiple model adaptive control methods. IEEE Trans Aerosp Electron Syst 27:470–480
Niemann H, Stoustrup J (2005) An architecture for fault tolerant controllers. Int J Control 78(14):1091–1110
Noura H, Sauter D, Hamelin F, Theilliol D (2000) Fault-tolerant control in dynamic systems: application to a winding machine. IEEE Control Syst Mag 20:33–49
Ochi Y, Kanai K (1991) Design of restructurable flight control systems using feedback linearization. J Guid Control Dyn 14(5):903–911
Patton RJ, Frank PM, Clark RN (1989) Fault diagnosis in dynamic systems: theory and application. Prentice Hall, New York
Patton RJ (1993) Robustness issues in fault tolerant control. Plenary paper, international conference TOOLDIAG'93, Toulouse, Apr 1993
Patton RJ (1997) Fault tolerant control: the 1997 situation. In: IFAC Safeprocess '97, Hull, pp 1033–1055
Patton RJ, Frank PM, Clark RN (2000) Issues of fault diagnosis for dynamic systems. Springer, London/New York
Ponsart J, Theilliol D, Aubrun C (2010) Virtual sensors design for active fault tolerant control system applied to a winding machine. Control Eng Pract 18:1037–1044
Rauch H (1995) Autonomous control reconfiguration. IEEE Control Syst 15:37–48
Richter JH, Schlage T, Lunze J (2007) Control reconfiguration of a thermofluid process by means of a virtual actuator. IET Control Theory Appl 1:1606–1620
Sami M, Patton RJ (2013) Active fault tolerant control for nonlinear systems with simultaneous actuator and sensor faults. Int J Control Autom Syst 11(6):1149–1161
Šiljak DD (1980) Reliable control using multiple control systems. Int J Control 31:303–329
Staroswiecki M (2005) Fault tolerant control using an admissible model matching approach. In: Joint Decision and Control Conference and European Control Conference, Seville, 12–15 Dec 2005, pp 2421–2426
Steinberg M (2005) Historical overview of research in reconfigurable flight control. Proc IMechE Part G J Aerosp Eng 219:263–275
Tang X, Tao G, Joshi S (2004) Adaptive output feedback actuator failure compensation for a class of non-linear systems. Int J Adapt Control Signal Process 19(6):419–444
Tornil S, Theilliol D, Ponsart JC (2010) Admissible model matching using D_R-regions: fault accommodation and robustness against FDD inaccuracies. Int J Adapt Control Signal Process 24(11):927–943
Veillette R, Medanic J, Perkins W (1992) Design of reliable control systems. IEEE Trans Autom Control 37:290–304
Weng J, Patton RJ, Cui P (2007) Active fault-tolerant control of a double inverted pendulum. J Syst Control Eng 221:895
Zhang Y, Jiang J (2008) Bibliographical review on reconfigurable fault-tolerant control systems. Annu Rev Control 32:229–252
Zhang X, Parisini T, Polycarpou M (2004) Adaptive fault-tolerant control of nonlinear uncertain systems: an information-based diagnostic approach. IEEE Trans Autom Control 49(8):1259–1274
Zhang K, Jiang B, Staroswiecki M (2010) Dynamic output feedback-fault tolerant controller design for Takagi-Sugeno fuzzy systems with actuator faults. IEEE Trans Fuzzy Syst 18:194–201
Zhou K, Ren Z (2001) A new controller architecture for high performance, robust, and fault-tolerant control. IEEE Trans Autom Control 46(10):1613–1618
Zou A, Kumar K (2011) Adaptive fuzzy fault-tolerant attitude control of spacecraft. Control Eng Pract 19:10–21

FDD

▸ Fault Detection and Diagnosis

Feedback Linearization of Nonlinear Systems

A.J. Krener
Department of Applied Mathematics, Naval Postgraduate School, Monterey, CA, USA

Abstract

Effective methods exist for the control of linear systems, but this is less true for nonlinear systems. Therefore, it is very useful if a nonlinear system can be transformed into or approximated by a linear system. Linearity is not invariant under nonlinear changes of state coordinates and nonlinear state feedback. Therefore, it may be possible to convert a nonlinear system into a linear one via these transformations. This is called feedback
linearization. This entry surveys feedback linearization and related topics.

Keywords

Distribution; Frobenius theorem; Involutive distribution; Lie derivative

Introduction

A controlled linear dynamics is of the form

$$\dot{x} = Fx + Gu \qquad (1)$$

where the state $x \in \mathbb{R}^n$ and the control $u \in \mathbb{R}^m$. A controlled nonlinear dynamics is of the form

$$\dot{x} = f(x, u) \qquad (2)$$

where $x, u$ have the same dimensions but may be local coordinates on some manifolds $\mathcal{X}$, $\mathcal{U}$. Frequently, the dynamics is affine in the control, i.e.,

$$\dot{x} = f(x) + g(x)u \qquad (3)$$

where $f(x) \in \mathbb{R}^{n \times 1}$ is a vector field and $g(x) = [g^1(x), \dots, g^m(x)] \in \mathbb{R}^{n \times m}$ is a matrix field.

Linear dynamics are much easier to analyze and control than nonlinear dynamics. For example, to globally stabilize the linear dynamics (1), all we need to do is to find a linear feedback law $u = Kx$ such that all the eigenvalues of $F + GK$ are in the open left half plane. Finding a feedback law $u = \kappa(x)$ to globally stabilize the nonlinear dynamics is very difficult and frequently impossible. Therefore, finding techniques to linearize nonlinear dynamics has been a goal for several centuries.

The simplest example of a linearization technique is to approximate a nonlinear dynamics around a critical point by its first-order terms. Suppose $x^0, u^0$ is an operating point for the nonlinear dynamics (2), that is, $f(x^0, u^0) = 0$. Define displacement variables $z = x - x^0$ and $v = u - u^0$, and assuming $f(x, u)$ is smooth around this operating point, expand (2) to first order:

$$\dot{z} = \frac{\partial f}{\partial x}(x^0, u^0)\,z + \frac{\partial f}{\partial u}(x^0, u^0)\,v + O(z, v)^2 \qquad (4)$$

Ignoring the higher order terms, we get a linear dynamics (1) where

$$F = \frac{\partial f}{\partial x}(x^0, u^0), \qquad G = \frac{\partial f}{\partial u}(x^0, u^0)$$

This simple technique works very well in many cases and is the basis for many engineering designs. For example, if the linear feedback $v = Kz$ puts all the eigenvalues of $F + GK$ in the left half plane, then the affine feedback $u = u^0 + K(x - x^0)$ makes the closed-loop dynamics locally asymptotically stable around $x^0$. So, one way to linearize a nonlinear dynamics is to approximate it by a linear dynamics.

The other way to linearize is by a nonlinear change of state coordinates and a nonlinear state feedback; because linearity is not invariant under these transformations, it may be possible to convert a nonlinear system into a linear one. To see this, suppose we have a controlled linear dynamics (1) and we make the nonlinear change of state coordinates $z = \phi(x)$ and nonlinear feedback $u = \beta(z, v)$. We assume that these transformations are invertible from some neighborhood of $x^0 = 0$, $u^0 = 0$ to some neighborhood of $z^0$, $v^0$; then (1) becomes

$$\dot{z} = \frac{\partial \phi}{\partial x}(x)\left(Fx + Gu\right) = \frac{\partial \phi}{\partial x}\left(\phi^{-1}(z)\right)\left(F\phi^{-1}(z) + G\beta(z, v)\right)$$

which is a controlled nonlinear dynamics (2). This raises the question asked by Brockett (1978): when is a controlled nonlinear dynamics a change of coordinates and feedback away from a controlled linear dynamics?
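As a numerical sketch of the first-order technique, one can build $F$ and $G$ by finite differences at an operating point and then place eigenvalues with a linear gain. The pendulum model, the operating point, and the gain below are illustrative choices, not taken from the entry.

```python
import numpy as np

# Controlled pendulum x = (angle, rate), upright operating point x0 = (pi, 0), u0 = 0.
def f(x, u):
    theta, omega = x
    return np.array([omega, -np.sin(theta) - 0.1 * omega + u])

x0, u0, eps = np.array([np.pi, 0.0]), 0.0, 1e-6

# F = df/dx(x0, u0) and G = df/du(x0, u0) by central differences.
F = np.column_stack([
    (f(x0 + eps * e, u0) - f(x0 - eps * e, u0)) / (2 * eps)
    for e in np.eye(2)
])
G = (f(x0, u0 + eps) - f(x0, u0 - eps)) / (2 * eps)

# A hand-chosen gain; the closed-loop matrix F + GK must have stable eigenvalues.
K = np.array([-3.0, -2.0])
Acl = F + np.outer(G, K)
assert max(np.linalg.eigvals(Acl).real) < 0
```

The affine feedback $u = u^0 + K(x - x^0)$ then stabilizes the nonlinear pendulum locally around the upright equilibrium, exactly as described above.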
Linearization of a Smooth Vector Field

Given a smooth dynamics $\dot{x} = Fx + f^{[2]}(x) + O(x)^3$, the quadratic change of coordinates $z = x - \phi^{[2]}(x)$ yields

$$\dot{z} = Fz + f^{[2]}(x) - \left[Fx, \phi^{[2]}(x)\right] + O(x)^3$$

where the Lie bracket of two vector fields $f(x)$, $g(x)$ is the new vector field defined by

$$[f, g](x) = \frac{\partial g}{\partial x}(x)\,f(x) - \frac{\partial f}{\partial x}(x)\,g(x).$$

A resonance of degree $r$ among the eigenvalues $\lambda_1, \dots, \lambda_n$ of $F$ is a relation of the form

$$i_1\lambda_1 + \cdots + i_n\lambda_n - \lambda_k = 0,$$

where $i_1, \dots, i_n$ are nonnegative integers with $i_1 + \cdots + i_n = r$.
If there is no resonance of degree $r$, then the degree $r$ homological equation is uniquely solvable for every $f^{[r]}(x)$. When there are no resonances of any degree, the convergence of the formal power series solution is delicate. We refer the reader to Arnol'd (1983) for the details.

Linearization of a Controlled Dynamics by Change of State Coordinates

Given a controlled affine dynamics (3), when does there exist a smooth local change of coordinates transforming it into a controlled linear dynamics (1)? A necessary and sufficient condition is that

$$\left[\operatorname{ad}^k(f)g^i,\ \operatorname{ad}^l(f)g^j\right] = 0$$

for $k = 0, \dots, n-1$, $l = 0, \dots, n$.

The proof of this theorem is straightforward. Under a change of state coordinates, the vector fields and their Lie brackets are transformed by the Jacobian of the coordinate change. Trivially, for linear systems,

$$\operatorname{ad}^k(Fx)G^i = (-1)^k F^k G^i, \qquad \left[\operatorname{ad}^k(Fx)G^i,\ \operatorname{ad}^l(Fx)G^j\right] = 0.$$
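The identity for linear vector fields can be checked symbolically; a minimal sketch (the particular matrices $F$, $G$ are arbitrary illustrative values):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])

# A concrete linear system, values chosen only for the check.
F = sp.Matrix([[0, 1], [-2, -3]])
G = sp.Matrix([1, 0])

def lie_bracket(f, g, x):
    """[f, g] = (dg/dx) f - (df/dx) g."""
    return g.jacobian(x) * f - f.jacobian(x) * g

f = F * x          # the linear vector field F x
adkg = G           # ad^0(Fx) G = G
for k in range(1, 4):
    adkg = lie_bracket(f, adkg, x)
    # ad^k(Fx) G should equal (-1)^k F^k G, a constant vector field.
    assert sp.simplify(adkg - (-1)**k * F**k * G) == sp.zeros(2, 1)
```

Since every $\operatorname{ad}^k(Fx)G^i$ is a constant vector field, all brackets between them vanish, which is the second identity above.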
If the nonlinear system is feedback linearizable, then there exists a function $h(x)$ such that

$$L_{\operatorname{ad}^{k-1}(f)g}\,h = 0, \quad k = 1, \dots, n-1, \qquad L_{\operatorname{ad}^{n-1}(f)g}\,h \neq 0$$

where the Lie derivative of a function $h$ by a vector field $g$ is given by

$$L_g h = \frac{\partial h}{\partial x}\,g$$

This is a system of first-order PDEs, and the solvability conditions are given by the classical Frobenius theorem, namely, that

$$\{g, \dots, \operatorname{ad}^{n-2}(f)g\}$$

is involutive, i.e., its span is closed under Lie bracket. For controllable systems, this is a necessary and sufficient condition. The controllability condition is that $\{g, \dots, \operatorname{ad}^{n-1}(f)g\}$ spans $x$ space.

Suppose $m = 2$ and the system has controllability (Kronecker) indices $r_1 \geq r_2$. Such a system is feedback linearizable iff

$$\{g^1,\ g^2,\ \dots,\ \operatorname{ad}^{r_i-2}(f)g^1,\ \operatorname{ad}^{r_i-2}(f)g^2\}$$

is involutive for $i = 1, 2$. Another way of putting it is that the distribution spanned by the first through $r$th rows of the following matrix must be involutive for $r = r_i - 1$, $i = 1, 2$. This is equivalent to the distribution spanned by the first through $r$th rows of the following matrix being involutive for all $r = 1, \dots, r_1$.

$$\begin{bmatrix}
g^1 & g^2 \\
\operatorname{ad}(f)g^1 & \operatorname{ad}(f)g^2 \\
\vdots & \vdots \\
\operatorname{ad}^{r_2-2}(f)g^1 & \operatorname{ad}^{r_2-2}(f)g^2 \\
\operatorname{ad}^{r_2-1}(f)g^1 & \operatorname{ad}^{r_2-1}(f)g^2 \\
\vdots & \\
\operatorname{ad}^{r_1-2}(f)g^1 & \\
\operatorname{ad}^{r_1-1}(f)g^1 &
\end{bmatrix}$$

One might ask if it is possible to use dynamic feedback to linearize a system that is not linearizable by static feedback. Suppose we treat one of the controls $u^j$ as a state and let its derivative be a new control,

$$\dot{u}^j = \bar{u}^j$$

Can the resulting system be linearized by state feedback and change of state coordinates? Loosely speaking, the effect of adding such an integrator to the $j$th control is to shift the $j$th column of the above matrix down by one row. This changes the distribution spanned by the first through $r$th rows of the above matrix and might make it involutive. A scalar input system $m = 1$ that is linearizable by dynamic state feedback is also linearizable by static state feedback. There are multi-input systems $m > 1$ that are dynamically linearizable but not statically linearizable (Charlet et al. 1989, 1991).

The generic system is not feedback linearizable, but mechanical systems with one actuator for each degree of freedom typically are feedback linearizable. This fact had been used in many applications, e.g., robotics, before the concept of feedback linearization was introduced.

One should not lose sight of the fact that stabilization, model matching, or some other performance criterion is typically the goal of controller design. Linearization is a means to the goal. We linearize because we know how to meet the performance goal for linear systems.

Even when the system is linearizable, finding the linearizing coordinates and feedback can be a nontrivial task. Mechanical systems are the exception, as the linearizing coordinates are usually the generalized positions. Since the $\operatorname{ad}^{k-1}(f)g$ for $k = 1, \dots, n-1$ are characteristic directions of the PDE for $h$, the general solutions of the ODEs

$$\dot{x} = \operatorname{ad}^{k-1}(f)g(x)$$

can be used to construct the solution (Blankenship and Quadrat 1984). The Gardner-Shadwick (GS) algorithm (1992) is the most efficient method that is known.

Linearization of discrete time systems was treated by Lee et al. (1986). Linearization of discrete time systems around an equilibrium
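For a single-input system, the bracket computations above can be carried out symbolically. A minimal sketch (the example system is an illustrative choice, not from the entry): for $n = 2$, $m = 1$ the involutivity set $\{g\}$ is a single vector field, hence trivially involutive, so the system is feedback linearizable whenever $\{g, \operatorname{ad}(f)g\}$ spans $\mathbb{R}^2$.

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])

# Illustrative single-input system: x1' = x2 + x1**2, x2' = u.
f = sp.Matrix([x2 + x1**2, 0])
g = sp.Matrix([0, 1])

def lie_bracket(a, b, x):
    """[a, b] = (db/dx) a - (da/dx) b."""
    return b.jacobian(x) * a - a.jacobian(x) * b

ad_f_g = lie_bracket(f, g, x)            # ad(f) g
C = sp.Matrix.hstack(g, ad_f_g)          # controllability test matrix
# det(C) != 0 for all x means {g, ad(f)g} spans R^2 everywhere.
assert C.det() == 1
```

Here $\operatorname{ad}(f)g = (-1, 0)^T$, so the span condition holds globally and the system is feedback linearizable.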
manifold was treated by Barbot et al. (1995) and Jakubczyk (1987). Banaszuk and Hauser have also considered the feedback linearization of the transverse dynamics along a periodic orbit (Banaszuk and Hauser 1995a,b).

Feedback linearization as presented above ignores the output of the system, but typically one uses the input to control the output. Therefore, one wants to linearize the input–output response of the system rather than the dynamics. This was first treated in Isidori and Krener (1982) and Isidori and Ruberti (1984).

Consider a scalar input, scalar output system of the form

$$\dot{x} = f(x) + g(x)u, \qquad y = h(x)$$

The relative degree of the system is the number of integrators between the input and the output. To be more precise, the system is of relative degree $r \geq 1$ if for all $x$ of interest,

$$L_{\operatorname{ad}^j(f)g}\,h(x) = 0, \quad j = 0, \dots, r-2, \qquad L_{\operatorname{ad}^{r-1}(f)g}\,h(x) \neq 0$$

In other words, the control appears first in the $r$th time derivative of the output. Of course, a system might not have a well-defined relative degree, as the $r$ might vary with $x$.

Rephrasing the result of the previous section, a scalar input nonlinear system is feedback linearizable if there exists a pseudo-output map $h(x)$ such that the resulting scalar input, scalar output system has a relative degree equal to the state dimension $n$.

Assume we have a scalar input, scalar output system with a well-defined relative degree $1 \leq r \leq n$. We can define $r$ partial coordinate functions

$$\xi_i(x) = (L_f)^{i-1} h(x), \quad i = 1, \dots, r$$

and choose $n - r$ functions $\eta_i(x)$, $i = 1, \dots, n-r$, so that $(\xi, \eta)$ are a full set of coordinates on the state space. Furthermore, it is always possible (Isidori 1995) to choose $\eta_i(x)$ so that $L_g \eta_i(x) = 0$, $i = 1, \dots, n-r$. In these coordinates the system takes the form

$$y = \xi_1$$
$$\dot{\xi}_1 = \xi_2$$
$$\vdots$$
$$\dot{\xi}_{r-1} = \xi_r$$
$$\dot{\xi}_r = f_r(\xi, \eta) + g_r(\xi, \eta)\,u$$
$$\dot{\eta} = q(\xi, \eta)$$

The feedback $u = u(\xi, \eta, v)$ defined by

$$u = \frac{v - f_r(\xi, \eta)}{g_r(\xi, \eta)}$$

transforms the system to

$$y = H\xi, \qquad \dot{\xi} = F\xi + Gv, \qquad \dot{\eta} = q(\xi, \eta)$$

where $F, G, H$ are the $r \times r$, $r \times 1$, $1 \times r$ matrices

$$F = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad G = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}, \qquad H = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 \end{bmatrix}.$$

The system has been transformed into a string of integrators plus additional dynamics that is unobservable from the output. By suitable choice of additional feedback $v = K\xi$, one can ensure that the poles of $F + GK$ are stable. The stability of the overall system then depends on the stability of the zero dynamics (Byrnes and Isidori 1984, 1988).
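The relative degree can be computed symbolically using the equivalent characterization via repeated Lie derivatives, $L_g L_f^{j} h = 0$ for $j < r-1$ and $L_g L_f^{r-1} h \neq 0$. A minimal sketch (the example system is an illustrative choice):

```python
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
x = sp.Matrix([x1, x2])

# Illustrative system: x1' = x2, x2' = -x1**3 + u, output y = x1.
f = sp.Matrix([x2, -x1**3])
g = sp.Matrix([0, 1])
h = x1

def lie_derivative(vf, fun, x):
    """L_vf fun = (dfun/dx) vf."""
    return (sp.Matrix([fun]).jacobian(x) * vf)[0, 0]

def relative_degree(f, g, h, x, nmax=10):
    Lfh = h
    for r in range(1, nmax + 1):
        if sp.simplify(lie_derivative(g, Lfh, x)) != 0:
            return r
        Lfh = lie_derivative(f, Lfh, x)
    return None

print(relative_degree(f, g, h, x))  # the control first appears in the 2nd derivative of y
```

For this example the routine returns 2, which equals the state dimension, so the system is feedback linearizable with $h$ itself as the pseudo-output.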
If this is stable, then the overall system will be stable. The zero dynamics is so called because it is the dynamics that results from imposing the constraint $y(t) = 0$ on the system. For this to be satisfied, the initial value must satisfy $\xi(0) = 0$ and the control must satisfy

$$u(t) = -\frac{f_r(0, \eta(t))}{g_r(0, \eta(t))}$$

Similar results hold in the multiple input, multiple output case; see Isidori (1995) for the details.

Approximate Feedback Linearization

When a system is not exactly linearizable, one can seek a change of coordinates and feedback transforming the affine controlled dynamics (3) to

$$\dot{z} = Fz + Gv + N(x, u)$$

where the nonlinearity $N(x, u)$ is small in some sense. In the design process, the nonlinearity is ignored, and the controller design is done on the linear model and then transformed back into a controller for the original system.

The power series approach of Poincaré was taken by Krener et al. (1987, 1988, 1991), Krener (1990), and Krener and Maag (1991). See also Kang (1994). It is applicable to dynamics which may not be affine in the control. The controlled nonlinear dynamics (2), the change of coordinates, and the feedback are expanded in a power series,

$$z = x - \phi^{[2]}(x) - \cdots, \qquad u = v + \alpha^{[2]}(x, u) + \cdots$$

The transformed system is

$$\dot{z} = Fz + Gv + f^{[2]}(x, u) - \left[Fx + Gu,\ \phi^{[2]}(x)\right] + G\alpha^{[2]}(x, u) + O(x, u)^3$$

To eliminate the quadratic terms, one seeks a solution of the degree two homological equations for $\phi^{[2]}, \alpha^{[2]}$,

$$\left[Fx + Gu,\ \phi^{[2]}(x)\right] - G\alpha^{[2]}(x, u) = f^{[2]}(x, u)$$

The corresponding linear operator is less than full rank. Hence, only an approximate, e.g., a least squares, solution is possible.

Krener has written a MATLAB toolbox (http://www.math.ucdavis.edu/~krener 1995) to compute term-by-term solutions to the homological equations. The routine fh2f_h_.m sequentially computes the least squares solutions of the homological equations to arbitrary degree.

Observers with Linearizable Error Dynamics

The dual of linear state feedback is linear input–output injection. Linear input–output injection is the transformation carrying the observed linear system $y = Hx$
$$\dot{x} = f(x, u), \qquad y = h(x)$$

If they exist, the $z$ coordinates satisfy a system of PDEs. One defines a vector field $g^j$, $1 \leq j \leq p$, for each output; these define a PDE for the change of coordinates, for which certain integrability conditions must be satisfied. The process is more complicated than feedback linearization and even less likely to be successful, so approximate solutions must be sought, which we will discuss later in this section. We refer the reader to Krener and Respondek (1985) and the related work of Zeitz (1987) and Xia and Gao (1988a,b, 1989).

Very few systems can be linearized by change of state coordinates and input–output injection, so Krener et al. (1987, 1988, 1991), Krener (1990), and Krener and Maag (1991) sought approximate solutions by the power series approach. Again, the system, the changes of coordinates, and the output injection are expanded in a power series. See the above references for details.

Conclusion

We have surveyed the various ways a nonlinear system can be approximated by a linear system.

Cross-References

▸ Differential Geometric Methods in Nonlinear Control
▸ Lie Algebraic Methods in Nonlinear Control
▸ Nonlinear Zero Dynamics

Bibliography

Arnol'd VI (1983) Geometrical methods in the theory of ordinary differential equations. Springer, Berlin
Banaszuk A, Hauser J (1995a) Feedback linearization of transverse dynamics for periodic orbits. Syst Control Lett 26:185–193
Banaszuk A, Hauser J (1995b) Feedback linearization of transverse dynamics for periodic orbits in R3 with points of transverse controllability loss. Syst Control Lett 26:95–105
Barbot JP, Monaco S, Normand-Cyrot D (1995) Linearization about an equilibrium manifold in discrete time. In: Proceedings of IFAC NOLCOS 95, Tahoe City. Pergamon
Barbot JP, Monaco S, Normand-Cyrot D (1997) Quadratic forms and approximate feedback linearization in discrete time. Int J Control 67:567–586
Bestle D, Zeitz M (1983) Canonical form observer design for non-linear time-variable systems. Int J Control 38:419–431
Blankenship GL, Quadrat JP (1984) An expert system for stochastic control and signal processing. In: Proceedings of IEEE CDC, Las Vegas. IEEE, pp 716–723
Brockett RW (1978) Feedback invariants for nonlinear systems. In: Proceedings of the international congress of mathematicians, Helsinki, pp 1357–1368
Brockett RW (1981) Control theory and singular Riemannian geometry. In: Hilton P, Young G (eds) New directions in applied mathematics. Springer, New York, pp 11–27
Brockett RW (1983) Nonlinear control theory and differential geometry. In: Proceedings of the international congress of mathematicians, Warsaw, pp 1357–1368
Brockett RW (1996) Characteristic phenomena and model problems in nonlinear control. In: Proceedings of the IFAC congress, Sydney. IFAC
Byrnes CI, Isidori A (1984) A frequency domain philosophy for nonlinear systems. In: Proceedings, IEEE conference on decision and control, Las Vegas. IEEE, pp 1569–1573
Byrnes CI, Isidori A (1988) Local stabilization of minimum-phase nonlinear systems. Syst Control Lett 11:9–17
Charlet B, Lévine J, Marino R (1989) On dynamic feedback linearization. Syst Control Lett 13:143–151
Charlet B, Lévine J, Marino R (1991) Sufficient conditions for dynamic state feedback linearization. SIAM J Control Optim 29:38–57
Gardner RB, Shadwick WF (1992) The GS-algorithm for exact linearization to Brunovsky normal form. IEEE Trans Autom Control 37:224–230
Hunt LR, Su R (1981) Linear equivalents of nonlinear time varying systems. In: Proceedings of the symposium on the mathematical theory of networks and systems, Santa Monica, pp 119–123
Hunt LR, Su R, Meyer G (1983) Design for multi-input nonlinear systems. In: Millman RS, Brockett RW, Sussmann HJ (eds) Differential geometric control theory. Birkhauser, Boston, pp 268–298
Isidori A (1995) Nonlinear control systems. Springer, Berlin
Isidori A, Krener AJ (1982) On the feedback equivalence of nonlinear systems. Syst Control Lett 2:118–121
Isidori A, Ruberti A (1984) On the synthesis of linear input–output responses for nonlinear systems. Syst Control Lett 4:17–22
Jakubczyk B (1987) Feedback linearization of discrete-time systems. Syst Control Lett 9:17–22
Jakubczyk B, Respondek W (1980) On linearization of control systems. Bull Acad Polonaise Sci Ser Sci Math 28:517–522
Kang W (1994) Approximate linearization of nonlinear control systems. Syst Control Lett 23:43–52
Korobov VI (1979) A general approach to the solution of the problem of synthesizing bounded controls in a control problem. Math USSR-Sb 37:535
Krener AJ (1973) On the equivalence of control systems and the linearization of nonlinear systems. SIAM J Control 11:670–676
Krener AJ (1984) Approximate linearization by state feedback and coordinate change. Syst Control Lett 5:181–185
Krener AJ (1986) The intrinsic geometry of dynamic observations. In: Fliess M, Hazewinkel M (eds) Algebraic and geometric methods in nonlinear control theory. Reidel, Amsterdam, pp 77–87
Krener AJ (1990) Nonlinear controller design via approximate normal forms. In: Helton JW, Grunbaum A, Khargonekar P (eds) Signal processing, Part II: control theory and its applications. Springer, New York, pp 139–154
Krener AJ, Isidori A (1983) Linearization by output injection and nonlinear observers. Syst Control Lett 3:47–52
Krener AJ, Maag B (1991) Controller and observer design for cubic systems. In: Gombani A, DiMasi GB, Kurzhansky AB (eds) Modeling, estimation and control of systems with uncertainty. Birkhauser, Boston, pp 224–239
Krener AJ, Respondek W (1985) Nonlinear observers with linearizable error dynamics. SIAM J Control Optim 23:197–216
Krener AJ, Karahan S, Hubbard M, Frezza R (1987) Higher order linear approximations to nonlinear control systems. In: Proceedings, IEEE conference on decision and control, Los Angeles. IEEE, pp 519–523
Krener AJ, Karahan S, Hubbard M (1988) Approximate normal forms of nonlinear systems. In: Proceedings, IEEE conference on decision and control, San Antonio. IEEE, pp 1223–1229
Krener AJ, Hubbard M, Karahan S, Phelps A, Maag B (1991) Poincaré's linearization method applied to the design of nonlinear compensators. In: Jacob G, Lamnahbi-Lagarrigue F (eds) Algebraic computing in control. Springer, Berlin, pp 76–114
Lee HG, Arapostathis A, Marcus SI (1986) Linearization of discrete-time systems. Int J Control 45:1803–1822
Murray RM (1995) Nonlinear control of mechanical systems: a Lagrangian perspective. In: Krener AJ, Mayne DQ (eds) Nonlinear control systems design. Pergamon, Oxford
Nonlinear Systems Toolbox, available for download at http://www.math.ucdavis.edu/~krener/1995
Sommer R (1980) Control design for multivariable nonlinear time varying systems. Int J Control 31:883–891
Su R (1982) On the linear equivalents of control systems. Syst Control Lett 2:48–52
Xia XH, Gao WB (1988a) Nonlinear observer design by canonical form. Int J Control 47:1081–1100
Xia XH, Gao WB (1988b) On exponential observers for nonlinear systems. Syst Control Lett 11:319–325
Xia XH, Gao WB (1989) Nonlinear observer design by observer error linearization. SIAM J Control Optim 27:199–216
Zeitz M (1987) The extended Luenberger observer for nonlinear systems. Syst Control Lett 9:149–156

Feedback Stabilization of Nonlinear Systems

A. Astolfi
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma Tor Vergata, Roma, Italy

Abstract

We consider the simplest design problem for nonlinear systems: the problem of rendering asymptotically stable a given equilibrium by means of state feedback. For such a problem, we provide a necessary condition, known as Brockett condition, and a sufficient condition, which relies upon the definition of a class of functions, known as control Lyapunov functions. The theory is illustrated by means of a few examples. In addition, we discuss a nonlinear enhancement of the so-called separation principle for stabilization by means of partial state information.

Keywords

Brockett theorem; Control Lyapunov function; Output feedback; State feedback

Introduction

The problem of feedback stabilization, namely, the problem of designing a feedback control law locally, or globally, asymptotically stabilizing a given equilibrium point, is the simplest design
$$\sigma(A + BK) \cap \mathbb{C}^+ \neq \emptyset \qquad (6)$$

holds, then the equilibrium of the nonlinear system cannot be stabilized by any continuously differentiable feedback such that $\alpha(x_0) = 0$. The notation $\sigma(A)$ denotes the spectrum of the matrix $A$, i.e., the eigenvalues of $A$. Note, however, that if the condition $\alpha(x_0) = 0$ is dropped, the obstruction does not hold: the zero equilibrium of $\dot{x} = x + xu$ is not stabilizable by any continuous feedback such that $\alpha(0) = 0$, yet the feedback $u = -2$ is a (global) stabilizer. On the contrary, if there exists a $K$ such that

$$\sigma(A + BK) \subset \mathbb{C}^-$$

then the feedback $\alpha(x) = Kx$ locally asymptotically stabilizes the equilibrium $x_0$ of the closed-loop system. This fact is often referred to as the linearization approach.

The equilibrium is asymptotically stabilized by the continuous feedback

$$\alpha(x) = x_2 + x_1 + \frac{4}{3}\,x_1^3\,x_2^3,$$

but it is not stabilizable by any continuously differentiable feedback. Note, in fact, that condition (6) holds.

Brockett Theorem

Brockett's necessary condition, which is far from being sufficient, provides a simple test to rule out the existence of a continuous stabilizer.

Theorem 1 Consider the system (1) and assume $x_0 = 0$ is an achievable equilibrium with $u_0 = 0$. Assume there exists a continuous stabilizing feedback $u = \alpha(x)$. Then, for each $\epsilon > 0$ there exists $\delta > 0$ such that, for all $y$ with $\|y\| < \delta$, the equation $y = F(x, u)$ has at least one solution in the set $\|x\| < \epsilon$, $\|u\| < \epsilon$.
$$\alpha(x) = \begin{cases} 0, & \text{if } \dfrac{\partial V}{\partial x}\,g(x) = 0, \\[10pt] -\dfrac{\dfrac{\partial V}{\partial x}f(x) + \sqrt{\left(\dfrac{\partial V}{\partial x}f(x)\right)^2 + \left(\dfrac{\partial V}{\partial x}g(x)\right)^4}}{\dfrac{\partial V}{\partial x}\,g(x)}, & \text{elsewhere.} \end{cases}$$
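Sontag's formula can be evaluated directly. A minimal numerical sketch, in which the scalar system $\dot{x} = x^3 + u$ and the candidate control Lyapunov function $V(x) = x^2/2$ are illustrative choices: along the closed loop, $\dot{V} = -\sqrt{(L_f V)^2 + (L_g V)^4} < 0$ whenever $L_g V \neq 0$.

```python
import math

# Illustrative scalar system x' = f(x) + g(x) u with V(x) = x**2/2, so dV/dx = x.
def f(x): return x**3
def g(x): return 1.0

def sontag(x):
    """Sontag's universal formula for a scalar input."""
    LfV = x * f(x)           # (dV/dx) f(x)
    LgV = x * g(x)           # (dV/dx) g(x)
    if LgV == 0.0:
        return 0.0
    return -(LfV + math.sqrt(LfV**2 + LgV**4)) / LgV

for x in [-2.0, -0.5, 0.7, 1.5]:
    u = sontag(x)
    Vdot = x * (f(x) + g(x) * u)   # V' along the closed loop
    assert Vdot < 0.0
```

Despite the open-loop instability of $\dot{x} = x^3$, the resulting feedback renders $\dot{V}$ strictly negative away from the origin.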
for some continuously differentiable mappings $\psi_1$ and $\psi_2$. As a result, the feedback

$$\alpha(x_1, x_2) = \psi_2(x_1, x_2) - k\left(x_2 - \alpha_1(x_1)\right),$$

with $k > 0$, yields $\dot{V} < 0$ for all nonzero $(x_1, x_2)$; hence, the feedback is a continuously differentiable stabilizer for the zero equilibrium of the system (8). Note, finally, that the function $V$ is a control Lyapunov function for the system (8); hence, Sontag's formula can also be used to construct a stabilizer.

The result discussed above is at the basis of the so-called backstepping technique for recursive stabilization of systems described, for example, by equations of the form

$$\dot{x}_1 = x_2 + \varphi_1(x_1),$$
$$\dot{x}_2 = x_3 + \varphi_2(x_1, x_2),$$
$$\dot{x}_3 = x_4 + \varphi_3(x_1, x_2, x_3),$$
$$\vdots$$
$$\dot{x}_n = u + \varphi_n(x_1, \dots, x_n),$$

with $x_i(t) \in \mathbb{R}$ for all $i \in [1, n]$, and $\varphi_i$ smooth mappings such that $\varphi_i(0) = 0$, for all $i \in [1, n]$.

Feedforward Systems

Consider a nonlinear system described by equations of the form

for all $x_2 \neq 0$. Suppose, in addition, that there exists a continuously differentiable mapping $M(x_2)$ such that

$$f_1(x_2) - \frac{\partial M}{\partial x_2} f_2(x_2) = 0,$$

$$M(0) = 0 \quad \text{and} \quad \left.\frac{\partial M}{\partial x_2}\right|_{x_2 = 0} \neq 0.$$

Existence of such a mapping is guaranteed, for example, by asymptotic stability of the linearization of the system $\dot{x}_2 = f_2(x_2)$ around the origin and controllability of the linearization of the system (9) around the origin.

Consider now the function

$$V(x_1, x_2) = \frac{1}{2}\left(x_1 - M(x_2)\right)^2 + V_2(x_2),$$

which is radially unbounded and such that $V(0,0) = 0$ and $V(x_1, x_2) > 0$ for all nonzero $(x_1, x_2)$, and note that

$$\dot{V} = \left(x_1 - M(x_2)\right)\left(f_1(x_2) - \frac{\partial M}{\partial x_2}\big(f_2(x_2) + g_2(x_2)u\big)\right) + \frac{\partial V_2}{\partial x_2}\big(f_2(x_2) + g_2(x_2)u\big)$$
$$= \frac{\partial V_2}{\partial x_2} f_2(x_2) + \left(\frac{\partial V_2}{\partial x_2} - \left(x_1 - M(x_2)\right)\frac{\partial M}{\partial x_2}\right) g_2(x_2)\,u.$$
To avoid the computation of the derivatives of $y$, we exploit the uniform observability property, which implies that the auxiliary system

$$\dot{\xi}_0 = \xi_1,$$
$$\dot{\xi}_1 = \xi_2,$$
$$\vdots \qquad (15)$$
$$\dot{\xi}_{n-1} = \phi_n\left(\psi(\xi, v), v_0, v_1, \dots, v_{n-1}\right),$$

with $\xi = (\xi_0, \xi_1, \dots, \xi_{n-1})$, has the property of reproducing $y(t)$ and its derivatives up to $y^{(n-1)}(t)$ if properly initialized. This initialization is not feasible, since it requires the knowledge of the derivatives of the output at $t = 0$. Nevertheless, the auxiliary system (15) can be modified to provide an estimate of $y$ and its derivatives. The modification is obtained by adding a linear correction term, yielding the system

$$\dot{\xi}_0 = \xi_1 + L\,c_{n-1}(y - \xi_0),$$
$$\dot{\xi}_1 = \xi_2 + L^2 c_{n-2}(y - \xi_0),$$
$$\vdots \qquad (16)$$
$$\dot{\xi}_{n-1} = \phi_n\left(\psi(\xi, v), v_0, v_1, \dots, v_{n-1}\right) + L^n c_0(y - \xi_0),$$

with $L > 0$ and the coefficients $c_0, \dots, c_{n-1}$ such that all roots of the polynomial $\lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0$ are in $\mathbb{C}^-$. The system (16) has the ability to provide asymptotic estimates of $y$ and its derivatives up to $y^{(n-1)}$ provided these are bounded and the gain $L$ is selected sufficiently large, i.e., the system (16) is a semi-global observer of $y$ and its derivatives up to $y^{(n-1)}$.

The closed-loop system obtained using the feedback law $\tilde{u} = \tilde{\alpha}\left(\psi(\xi, v), v_0, v_1, \dots, v_{n-1}\right)$ has a locally asymptotically stable equilibrium at the origin. To achieve semi-global stability, one has to select $L$ sufficiently large and replace $\psi$ with

$$\tilde{\psi}(\xi, v) = \begin{cases} \psi(\xi, v), & \text{if } \|\psi(\xi, v)\| < M, \\[6pt] M\,\dfrac{\psi(\xi, v)}{\|\psi(\xi, v)\|}, & \text{if } \|\psi(\xi, v)\| \geq M, \end{cases}$$

with $M > 0$ to be selected, as detailed in the following statement.

Theorem 3 Consider the system (10) with $m = p = 1$. Let $x_0 = 0$ be an achievable equilibrium. Assume $h(0) = 0$. Suppose the system is uniformly observable and there exists a smooth state feedback control law $u = \alpha(x)$ which globally asymptotically stabilizes the zero equilibrium and is such that $\alpha(0) = 0$.

Then for each $R > 0$, there exist $\tilde{R} > 0$ and $M^\star > 0$, and for each $M > M^\star$ there exists $L^\star > 0$, such that for each $M > M^\star$ and $L > L^\star$ the dynamic output feedback control law

$$\dot{v}_0 = v_1,$$
$$\dot{v}_1 = v_2,$$
$$\vdots$$
$$\dot{v}_{n-1} = \tilde{\alpha}\left(\tilde{\psi}(\xi, v), v_0, v_1, \dots, v_{n-1}\right),$$
$$\dot{\xi}_0 = \xi_1 + L\,c_{n-1}(y - \xi_0),$$
$$\dot{\xi}_1 = \xi_2 + L^2 c_{n-2}(y - \xi_0),$$
$$\vdots$$
$$\dot{\xi}_{n-1} = \phi_n\left(\tilde{\psi}(\xi, v), v_0, v_1, \dots, v_{n-1}\right) + L^n c_0(y - \xi_0),$$
$$u = v_0,$$

yields a closed-loop system with the following properties:
• The zero equilibrium of the system is locally asymptotically stable.
• For any $x(0)$, $v(0)$ and $\xi(0)$ such that $\|x(0)\| < R$ and $\|(v(0), \xi(0))\| < \tilde{R}$, the trajectories of the closed-loop system are such that

$$\lim_{t \to \infty} x(t) = 0, \qquad \lim_{t \to \infty} v(t) = 0, \qquad \lim_{t \to \infty} \xi(t) = 0.$$
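The correction structure in (16) is that of a high-gain observer. A minimal numerical sketch for $n = 2$, estimating the signal $y(t) = \sin t$ and its derivative from measurements of $y$ alone (the signal, the gain $L$, and the coefficients are illustrative choices, and the model term $\phi_n$ is omitted for this sketch):

```python
import math

# High-gain observer (n = 2): xi0 estimates y, xi1 estimates y'.
L, c1, c0 = 20.0, 2.0, 1.0    # roots of s**2 + 2 s + 1 lie in C^-
xi0, xi1 = 0.0, 0.0
dt, t = 1e-4, 0.0
for _ in range(200000):       # simulate 20 s by Euler integration
    y = math.sin(t)
    e = y - xi0               # output injection error
    xi0 += dt * (xi1 + L * c1 * e)
    xi1 += dt * (L**2 * c0 * e)
    t += dt

# After the transient, xi0 tracks sin(t) and xi1 tracks cos(t).
assert abs(xi0 - math.sin(t)) < 1e-2
assert abs(xi1 - math.cos(t)) < 0.2
```

Increasing $L$ shrinks the steady-state estimation error, at the price of amplifying measurement noise; this is the trade-off behind "selecting $L$ sufficiently large" in the semi-global statements above.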
Sepulchre R, Janković M, Kokotović P (1996) Constructive nonlinear control. Springer, Berlin
Sontag ED (1989) A "universal" construction of Artstein's theorem on nonlinear stabilization. Syst Control Lett 13:117–123
Teel A, Praly L (1995) Tools for semiglobal stabilization by partial state and output feedback. SIAM J Control Optim 33(5):1443–1488
van der Schaft A (2000) L2-gain and passivity techniques in nonlinear control, 2nd edn. Springer, London

Financial Markets Modeling

This model, where prices can take negative values, was changed by taking the exponential as in the celebrated Black-Scholes-Merton (BSM) model, where

$$S_t = S_0 \exp\left(\mu t + \sigma W_t\right).$$

Only two constant parameters were needed; the coefficient $\sigma$ is called the volatility. From Itô's formula, the dynamics of the BSM price are
open to more sophisticated models, with more parameters. In particular, the smile effect was seen in the data: were the BSM formula true, the price of the option would be a deterministic function of the fixed parameter $\sigma$ and of $K$ (the other parameters such as maturity, underlying price, and interest rate being fixed), and one would obtain, using $\sigma(\cdot, K)$, the inverse of $BS(\cdot, K)$, the constant $\sigma = \sigma(C^o(K), K)$, where $C^o(K)$ is the observed price associated with the strike $K$. This is not the case, the curve $K \to \sigma(C^o(K), K)$ having a smile shape (or a smirk shape). The BSM model is still used as a benchmark in the concept of implied volatility: for a given observed option price $C^o(K, T)$ (with strike $K$ and maturity $T$), one can find the value of $\sigma$ such that $BS(\sigma, K, T) = C^o(K, T)$. The surface $\sigma(K, T)$ is called the implied volatility surface and plays an important role in calibration issues.

Due to the need for more accurate formulas, many models were presented, the only (mathematical) restriction being that prices have to be semi-martingales (to avoid arbitrage opportunities).

A first class is the stochastic volatility models. Assuming that the diffusion coefficient (called the local volatility) is a deterministic function of time and underlying, i.e.,

$$dv_t = -\kappa\left(v_t - \bar{v}\right)dt + \sigma\sqrt{v_t}\,dB_t$$

where $B$ and $W$ are two Brownian motions with correlation factor $\rho$. This model is generalized by Gourieroux and Sufana (2003) as the Wishart model, where the risky asset $S$ is a $d$-dimensional process whose matrix of quadratic variation $\Sigma$ satisfies

$$dS_t = \operatorname{Diag}(S_t)\left(\mu\,dt + \sqrt{\Sigma_t}\,dW_t\right)$$

$$d\Sigma_t = \left(\Lambda\Lambda^T + M\Sigma_t + \Sigma_t M^T\right)dt + \sqrt{\Sigma_t}\,dB_t\,Q + Q^T (dB_t)^T \sqrt{\Sigma_t}$$

where $W$ is a $d$-dimensional Brownian motion and $B$ a $(d \times d)$ matrix Brownian motion, $\Lambda, M, Q$ are $(d \times d)$ matrices, $M$ is negative semidefinite, and $\Lambda\Lambda^T = \beta Q Q^T$ with $\beta \geq d - 1$ to ensure strict positivity. The efficiency of these models was checked using calibration methodology; however, the main idea of hedging for pricing issues is forgotten: there is, in general, no hedging strategy, even for common derivative products, and the validation of the model (the form of the volatility, since the drift term plays no role in pricing) is done by calibration. The risk-neutral probability is no longer unique.
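The implied-volatility construction described above is a one-dimensional inversion of the BSM pricing formula. A minimal sketch (strike, maturity, and rate values are illustrative), using bisection, which is valid because the call price is strictly increasing in $\sigma$:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(S0, K, T, r, sigma):
    """Black-Scholes-Merton price of a European call."""
    d1 = (math.log(S0 / K) + (r + 0.5 * sigma**2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r * T) * norm_cdf(d2)

def implied_vol(price, S0, K, T, r, lo=1e-6, hi=5.0, tol=1e-10):
    """Invert BS(.; K, T) by bisection: the implied volatility."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if bs_call(S0, K, T, r, mid) < price:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

# Round trip: price at sigma = 0.2, then recover sigma from the price.
p = bs_call(100.0, 110.0, 1.0, 0.03, 0.2)
assert abs(implied_vol(p, 100.0, 110.0, 1.0, 0.03) - 0.2) < 1e-6
```

Repeating the inversion over observed prices $C^o(K, T)$ produces the implied volatility surface $\sigma(K, T)$.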
$$dr_t = a\left(b - r_t\right)dt + \sigma\sqrt{r_t}\,dB_t,$$

where $ab > 0$ (so that $r$ is nonnegative). No closed form for $r$ is known; however, a closed form for the price of the associated zero coupons is known (see below in Affine Processes).

A second class is the celebrated Heath-Jarrow-Morton (HJM) model: the starting point being to model the price of a zero coupon with maturity $T$ (i.e., an asset which pays one monetary unit at time $T$, these prices being observable) in terms of the instantaneous forward rate $f(t, T)$, as

$$B(t, T) = \exp\left(-\int_t^T f(t, u)\,du\right).$$

Assuming that

A third class of models is driven by a Lévy process. They present a nice feature: even if closed forms for pricing are not known, numerical methods are efficient. One of the most popular is the Carr-Geman-Madan-Yor (CGMY) model, which is a Lévy process without Gaussian component and with Lévy density

$$C\,\frac{e^{-Mx}}{x^{Y+1}}\,\mathbb{1}_{\{x > 0\}} + C\,\frac{e^{Gx}}{|x|^{Y+1}}\,\mathbb{1}_{\{x < 0\}},$$

with $C > 0$, $M \geq 0$, $G \geq 0$, and $Y < 2$. These models are often presented as a special case of a change of time: roughly speaking, any semi-martingale is a time-changed Brownian motion, and many examples are constructed as $S_t = B_{A_t}$, where $B$ is a Brownian motion and $A$ an increasing process (the change of time), chosen independent of $B$ (for simplicity) and a Lévy process (for computational issues).

Affine Processes

Affine models were introduced by Duffie et al. (2003) and are now a standard tool for modeling stock prices or interest rates. An affine process enjoys the property that, for any affine function $g$, the conditional expectation

$$E\left(\exp\left(uX_T + \int_t^T g(X_s)\,ds\right)\Big|\,\mathcal{F}_t\right)$$

is exponentially affine in the current state. For the interest rate model above, for example, the zero-coupon price takes the form

$$\Phi(T - t)\,\exp\left[-r_t\,\Psi(T - t)\right]$$

where

$$\Psi(s) = \frac{2\left(e^{\gamma s} - 1\right)}{(\gamma + a)\left(e^{\gamma s} - 1\right) + 2\gamma}, \qquad \Phi(s) = \left(\frac{2\gamma\,e^{(\gamma + a)s/2}}{(\gamma + a)\left(e^{\gamma s} - 1\right) + 2\gamma}\right)^{2ab/\sigma^2},$$
450 Financial Markets Modeling
2 D a2 C 2 2 : P.Fi1 .i / ui ; i D 1; : : : ; n/
D N˙n N 1 .u1 /; : : : ; N 1 .un / ;
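Computing an implied volatility as described above, i.e., finding the σ for which BS(σ, K, T) matches an observed price, has no closed form, but the monotonicity of the Black-Scholes formula in σ makes a bisection search reliable. The following sketch is illustrative only (it is not part of the original entry and assumes zero interest rate and no dividends):

```python
from math import erf, log, sqrt

def norm_cdf(x):
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(sigma, S, K, T):
    """Black-Scholes call price BS(sigma, K, T) with zero rates/dividends."""
    d1 = (log(S / K) + 0.5 * sigma**2 * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * norm_cdf(d2)

def implied_vol(c_obs, S, K, T, lo=1e-6, hi=5.0, tol=1e-12):
    """BS(., K, T) is strictly increasing in sigma, so bisection finds
    the unique sigma with BS(sigma, K, T) = c_obs."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bs_call(mid, S, K, T) < c_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Evaluating implied_vol on observed prices C^o(K, T) over a grid of strikes and maturities yields the surface σ(K, T); under the BSM model this surface would be constant, and its empirical smile or smirk is precisely the deviation discussed above.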
Though more complex, the best performing trajectory tracking controller for the general case is based on feedback linearization. Spong (1987) has shown that the nonlinear state feedback

  τ = α(q, q̇, q̈, q^(3)) + β(q) v,   (10)

with

  α = M(q) q̈ + n(q, q̇) + B K^{−1} ( (M̈(q) + K) q̈ + 2 Ṁ(q) q^(3) + n̈(q, q̇) ),
  β = B K^{−1} M(q),

leads globally to the closed-loop linear system q^(4) = v.

The link is driven at the base by an actuator producing a torque and rotating on a horizontal plane. The beam is flexible in the lateral direction only, being stiff with respect to axial forces, torsion, and bending due to gravity. Deformations are small and remain in the elastic domain. The physical parameters of interest are the linear density ρ of the beam, its flexural rigidity EI, the beam length ℓ, and the hub inertia I_h (with I_t = I_h + ρℓ³/3). The equations of motion combine lumped and distributed parameter parts, with the hub rotation θ(t) and the link deformation w(x, t), x ∈ [0, ℓ] being the position along the link. From the Hamilton principle, we obtain

  I_t θ̈(t) + ∫_0^ℓ ρ x ẅ(x, t) dx = τ(t)   (12)
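In the one-degree-of-freedom linear case (constant scalar inertias m and B, joint stiffness k, and n = 0), the feedback-linearizing law (10) can be checked by direct algebra: substituting τ = α + βv into the flexible-joint dynamics must give q^(4) = v. A small illustrative Python check follows (the scalar dynamics m q̈ + k(q − θ) = 0, B θ̈ + k(θ − q) = τ are the standard flexible-joint model; all numerical values are made up):

```python
# One-DOF flexible joint: link   m*qdd  + k*(q - th) = 0,
#                         motor  B*thdd + k*(th - q) = tau.
# With constant inertia M = m and n = 0, the law (10) reduces to
#   tau = (m + B)*qdd + (B*m/k)*v,
# and the claim is that the closed loop becomes d^4 q / dt^4 = v.
m, B, k = 2.0, 0.5, 40.0     # made-up link inertia, motor inertia, stiffness
q, th, v = 0.3, 0.1, 7.0     # arbitrary configuration and new input

qdd = -(k / m) * (q - th)                 # link equation solved for qdd
tau = (m + B) * qdd + (B * m / k) * v     # feedback-linearizing law (10)
thdd = (tau - k * (th - q)) / B           # motor equation solved for thdd

# Differentiate the link equation twice: m*q4 + k*(qdd - thdd) = 0
q4 = -(k / m) * (qdd - thdd)
print(abs(q4 - v))                        # 0 (up to rounding)
```

The identity holds for any state, which is what makes the linearization global.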
The transcendental characteristic equation associated to the spatial part of the solution to Eqs. (12)-(14) is

  I_h λ³ (1 + cos(λℓ) cosh(λℓ)) + ρ (sin(λℓ) cosh(λℓ) − cos(λℓ) sinh(λℓ)) = 0.   (16)

When the hub inertia I_h → ∞, the second term can be neglected and the characteristic equation collapses into the so-called clamped condition. Equation (16) has an infinite but countable number of positive real roots λ_i, with associated eigenvalues of resonant frequencies ω_i = λ_i² √(EI/ρ) and orthonormal eigenvectors φ_i(x), which are the natural deformation shapes of the beam (Barbieri and Özgüner 1988). A finite-dimensional dynamic model is obtained by truncation to a finite number m_e of eigenvalues/shapes. From

  w(x, t) = Σ_{i=1}^{m_e} φ_i(x) δ_i(t)   (17)

we get

  I_t α̈(t) = τ(t)
  δ̈_i(t) + ω_i² δ_i(t) = φ_i′(0) τ(t),  i = 1, …, m_e,   (18)

where the rigid body motion (top equation) appears as decoupled from the flexible dynamics, thanks to the choice of the variable α rather than θ. Modal damping can be added on the left-hand sides of the lower equations through terms 2ζ_i ω_i δ̇_i with ζ_i ∈ [0, 1]. The angular position of the motor hub at the joint is given by

  θ(t) = α(t) + Σ_{i=1}^{m_e} φ_i′(0) δ_i(t),   (19)

while the tip angular position is

  y(t) = α(t) + Σ_{i=1}^{m_e} (φ_i(ℓ)/ℓ) δ_i(t).   (20)

The joint-level transfer function p_joint(s) = θ(s)/τ(s) will always have relative degree two and only minimum phase zeros. On the other hand, the tip-level transfer function p_tip(s) = y(s)/τ(s) will contain non-minimum phase zeros. This basic difference in the pattern of the transmission zeros is crucial for motion control design.

In a simpler modeling technique, a specified class of spatial functions φ_i(x) is assumed for describing the link deformation. The functions need to satisfy only a reduced set of geometric boundary conditions (e.g., clamped modes at the link base), but otherwise no dynamic equations of motion such as (13). The use of finite-dimensional expansions like (17) limits the validity of the resulting model to a maximum frequency. This truncation must be accompanied by suitable filtering of measurements and of control commands, so as to avoid or limit spillover effects (Balas 1978).

In the dynamic modeling of robots with n flexible links, the resort to assumed modes of link deformation becomes unavoidable. In practice, some form of approximation and a finite-dimensional treatment is necessary. Let θ be the n-vector of joint variables describing the rigid motion, and δ the m-vector collecting the deformation variables of all flexible links. Following a Lagrangian formulation, the dynamic model with clamped modes takes the general form (Book 1984)

  [ M_θθ(θ, δ)   M_θδ(θ, δ) ] [ θ̈ ]   [ n_θ(θ, δ, θ̇, δ̇) ]   [ 0         ]   [ τ ]
  [ M_θδ^T(θ, δ) M_δδ(θ, δ) ] [ δ̈ ] + [ n_δ(θ, δ, θ̇, δ̇) ] + [ D δ̇ + K δ ] = [ 0 ],   (21)

where the positive definite, symmetric inertia matrix M of the complete robot and the Coriolis, centrifugal, and gravitational terms n have been partitioned in blocks of suitable dimensions, K > 0 and D ≥ 0 are the robot link stiffness and damping matrices, and τ is the n-vector of actuating torques.
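The roots λ_i of a transcendental characteristic equation such as (16) must be computed numerically, e.g., by bracketing sign changes on a grid and bisecting. A minimal illustrative sketch (the beam data I_h, ρ, ℓ, EI are hypothetical placeholder values, not from the entry):

```python
import math

# Hypothetical single-link data: hub inertia Ih, linear density rho,
# length L, flexural rigidity EI (placeholder values).
Ih, rho, L, EI = 0.1, 1.0, 1.0, 2.0

def f(lam):
    """Left-hand side of the transcendental characteristic equation (16)."""
    c, ch = math.cos(lam * L), math.cosh(lam * L)
    s, sh = math.sin(lam * L), math.sinh(lam * L)
    return Ih * lam**3 * (1.0 + c * ch) + rho * (s * ch - c * sh)

def first_roots(n, step=1e-3):
    """Scan for sign changes of f and refine each bracket by bisection."""
    roots, lam = [], step
    while len(roots) < n:
        a, b = lam, lam + step
        if f(a) * f(b) < 0.0:
            for _ in range(80):
                mid = 0.5 * (a + b)
                if f(a) * f(mid) <= 0.0:
                    b = mid
                else:
                    a = mid
            roots.append(0.5 * (a + b))
        lam += step
    return roots

lams = first_roots(3)
# Resonant frequencies w_i = lam_i^2 * sqrt(EI / rho)
freqs = [lam**2 * math.sqrt(EI / rho) for lam in lams]
```

As I_h grows, the computed roots approach those of the clamped condition, consistent with the limit discussed in the text.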
456 Flexible Robots
The dynamic model (21) shows the general couplings existing between the nonlinear rigid body motion and the linear flexible dynamics. In this respect, the linear model (18) of a single flexible link is a remarkable exception.

The choice of specific assumed modes may simplify the blocks of the robot inertia matrix; e.g., orthonormal modes used for each link induce a decoupled structure of the diagonal inertia subblocks of M_δδ. Quite often the total kinetic energy of the flexible robot is evaluated only in the undeformed configuration δ = 0. With this approximation, the inertia matrix becomes independent of δ, and so do the velocity terms in the model. Furthermore, due to the hypothesis of small deformation of each link, the gravity term in the lower component n_δ is only a function of θ.

The validation of (21) goes through the experimental identification of the relevant dynamic parameters. Besides those inherited from the rigid case (mass, inertia, etc.), also the set of structural resonant frequencies and associated deformation profiles should be identified.

Control of Joint-Level Motion
When the target variables to be controlled are defined at the joint level, the control problem for robots with flexible links is similar to that of robots with flexible joints. As a matter of fact, the models (1), (2), and (21) are all passive systems with respect to the output θ; see (19) in the scalar case. For instance, regulation is achieved by a PD action with constant gravity compensation, using a control law of the form (6) without the need of feeding back the link deformation variables (De Luca and Siciliano 1993a). Similarly, stable tracking of a joint trajectory θ_d(t) is obtained by a singular perturbation control approach, with the flexible modes dynamics acting at multiple time scales with respect to the rigid body motion (Siciliano and Book 1988), or by an inversion-based control (De Luca and Siciliano 1993b), where input-output (rather than full state) exact linearization is realized and the effects of link flexibility are canceled on the motion of the robot joints. While vibrational behavior will still affect the robot at the level of end-effector motion, the closed-loop dynamics of the δ variables is stable and link deformations converge to a steady-state constant value (zero in the absence of gravity) thanks to the intrinsic damping of the mechanical structure. Improved transients are indeed obtained by active modal damping control (Cannon and Schmitz 1984).

A control approach specifically developed for the rest-to-rest motion of flexible mechanical systems is command shaping (Singer and Seering 1990). The original command designed to achieve a desired motion for a rigid robot is convolved with suitable signals delayed in time, so as to cancel (or reduce to a minimum) the effects of the excited vibration modes at the time of motion completion. For a single slewing link with linear dynamics, as in (18), the rest-to-rest input command is computed in closed form by using impulsive signals and can be made robust via an over-parameterization.

Control of Tip-Level Motion
The design of a control law that allows asymptotic tracking of a desired trajectory for the end effector of a robot with flexible links needs to face the unstable zero dynamics associated to the problem. In the linear case of a single flexible link, this is equivalent to the presence of non-minimum phase zeros in the transfer function to the tip output (20). Direct inversion of the input-output map leads to instability, due to cancellation of non-minimum phase zeros by unstable poles, with link deformation growing unbounded and control saturations.

The solution requires instead determining the unique reference state trajectory of the flexible structure that is associated to the desired tip trajectory and has bounded deformation. Based on regulation theory, the control law will be the superposition of a nominal feedforward action, which keeps the system along the reference state trajectory (and thus the output on the desired trajectory), and of a stabilizing feedback that reduces the error with respect to this state trajectory to zero without resorting to dangerous cancellations.

In general, computing such a control law requires the solution of a set of nonlinear partial
differential equations. However, in the case of a single flexible link with linear dynamics, the feedforward profile is simply derived by an inversion defined in the frequency domain (Bayo 1987). The desired tip acceleration ÿ_d(t), t ∈ [0, T], is considered as part of a rest-to-rest periodic signal, with zero mean value and zero integral. The procedure, implemented efficiently using the Fast Fourier Transform on discrete-time samples, will automatically generate bounded time signals only. The resulting unique torque profile τ_d(t) will be a noncausal command, anticipating the actual start of the output trajectory at t = 0 (so as to precharge the link to the correct initial deformation) and ending after t = T (to discharge the residual link deformation and recover the final rest configuration).

The same result was recovered by Kwon and Book (1994) in the time domain, by forward integrating in time the stable part of the inverse system dynamics and backward integrating the unstable part. An extension to the multi-link nonlinear case uses an iterative approach on repeated linear approximations of the system along the nominal trajectory (Bayo et al. 1989).

uncertainties and external disturbances (with adaptive, learning, or iterative schemes), and further exploit the design of control laws under limited measurements and noisy sensing. Beyond free motion tasks, an accurate treatment of interaction tasks with the environment, requiring force or impedance controllers, is still missing for flexible robots. In this respect, passivity-based control approaches that do not necessarily operate dynamic cancellations may take advantage of the existing compliance, trading off between improved energy efficiency and some reduction in nominal performance.

Often seen as a limiting factor for performance, the presence of joint elasticity is now becoming an explicit advantage for safe physical human-robot interaction and for locomotion. Next generation lightweight robots and humanoids will use flexible joints and also compact actuation with online controlled variable joint stiffness, an area of active research.
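The frequency-domain inversion idea (Bayo 1987) described above can be illustrated on a toy discrete-time example: dividing the DFT of a periodic desired output by the plant frequency response returns a bounded, generally noncausal, input, even for a non-minimum phase plant whose causal time-domain inverse would diverge. A minimal numpy sketch (illustrative only; the FIR plant g, with a zero at z = 2 outside the unit circle, is made up):

```python
import numpy as np

N = 256
t = np.arange(N)

# Toy non-minimum phase plant: impulse response g[n] = d[n] - 2 d[n-1],
# i.e., a zero at z = 2 (outside the unit circle). Its causal inverse
# 1/(1 - 2 z^-1) has impulse response 2^n, which blows up.
g = np.zeros(N)
g[0], g[1] = 1.0, -2.0
G = np.fft.fft(g)                 # frequency response samples; |G| >= 1 here

# Rest-to-rest periodic desired output (zero at both ends of the period)
yd = np.sin(np.pi * t / N) ** 2 * np.sin(6 * np.pi * t / N)

# Frequency-domain inversion: a bounded, noncausal feedforward signal
ud = np.real(np.fft.ifft(np.fft.fft(yd) / G))

# Verify: passing ud through the plant (circular convolution) returns yd
y = np.real(np.fft.ifft(np.fft.fft(ud) * G))
err = np.max(np.abs(y - yd))      # ~ machine precision
```

Because the inversion is performed on one period of a periodic signal, the computed input is automatically bounded; its noncausal part plays the role of the "precharging" portion of the torque profile discussed above.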
Cross-References
from various types of birds to insects and fish. The phenomena can be loosely defined as any aggregate collective behavior in parallel rectilinear formation or (in the case of fish) in collective circular motion. The mechanisms leading to such behavior have been (and continue to be) an active area of research among ecologists and evolutionary biologists, dating back to the 1950s if not earlier. The engineering interest in the topic is much more recent.

In 1986, Craig Reynolds (1987), a computer graphics researcher, developed a computer model of collective behavior for animated artificial objects called boids. The flocking model for boids was used to realistically duplicate aggregation phenomena in bird flocks and fish schools for computer animation. Reynolds developed a simple, intuitive physics-based model: each boid was a point mass subject to three simple steering forces: alignment (to steer each boid towards the average heading of its local flockmates), cohesion (steering to move towards the average position of local flockmates), and separation (to avoid crowding local flockmates). The term local should be understood as those flockmates who are within each other's influence zone, which could be a disk (or a wide-angle sector of a disk) centered at each boid with a prespecified radius. This simple zone-based model created very realistic flocking behaviors and was used in many animations (Fig. 1, which illustrates Reynolds' three rules of flocking).

Nine years later, in 1995, Vicsek and coauthors (Vicsek et al. 1995) independently developed a model for velocity alignment of self-propelled particles (SPPs) in a square with periodic boundary conditions. SPPs are essentially kinematic particles moving with constant speed, and the steering law determines the angle of the velocity vector (what control theorists call a kinematic, nonholonomic vehicle model). Vicsek et al.'s steering law was very intuitive and simple and was essentially Reynolds' zone-based alignment rule (Vicsek was not aware of Reynolds' result): each particle averages the angle of its velocity vector with that of its neighbors (those within a disk of a prespecified distance), plus a noise term used to model inaccuracies in averaging. Once the velocity vector is determined at each time, each particle takes a unit step along that direction, then determines its neighbors again and repeats the protocol.

The simulations were done in a square of unit length with periodic boundary conditions, to simulate infinite space. Vicsek and coauthors simulated this behavior and found that as the density of the particles increased, a preferred direction spontaneously emerged, resulting in a global group behavior with nearly aligned velocity vectors for all particles, despite the fact that the update protocol is entirely local.

With the interest in control theory shifting towards multiagent systems and networked coordination and control, it became clear that the mathematics of how birds flock and fish school, and how individuals in a social network reach agreement (even though they are often only influenced by other like-minded individuals), is quite related to the question of how one can engineer a swarm of robots to behave like a bird flock.
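Vicsek's update rule is compact enough to state in code. The sketch below is illustrative (function names and parameter values are ours, not from the entry); headings are averaged through their unit vectors, since naive averaging of angles is ill-defined across the ±π wrap:

```python
import numpy as np

def vicsek_step(pos, theta, r, eta, v=0.03, L=1.0, rng=None):
    """One synchronous update of the Vicsek model in an L x L periodic box.

    Each particle adopts the average heading of all particles within
    distance r (itself included), plus uniform noise in [-eta/2, eta/2],
    and then moves a step of length v along its new heading."""
    rng = rng if rng is not None else np.random.default_rng(0)
    diff = pos[:, None, :] - pos[None, :, :]
    diff -= L * np.round(diff / L)               # minimum-image (periodic) distances
    nbr = ((diff ** 2).sum(-1) <= r ** 2).astype(float)
    # Average headings through unit vectors (plain angle averaging fails at +-pi)
    ang = np.arctan2(nbr @ np.sin(theta), nbr @ np.cos(theta))
    theta_new = ang + eta * (rng.random(theta.size) - 0.5)
    pos_new = (pos + v * np.stack([np.cos(theta_new), np.sin(theta_new)], axis=1)) % L
    return pos_new, theta_new

def polarization(theta):
    """Alignment order parameter: 1 = perfectly aligned, near 0 = disordered."""
    return np.hypot(np.cos(theta).mean(), np.sin(theta).mean())
```

With eta = 0 and an interaction radius covering the whole box, a single step aligns every heading (polarization 1); lowering the density or raising the noise destroys the order, which is the density-driven transition reported by Vicsek et al.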
These questions have occupied the minds of many researchers in diverse areas ranging from control theory to robotics, mathematics, and computer science. As discussed above, most of the early research, which happened in computer graphics and statistical physics, was on modeling and simulation of collective behavior. Over the past 13 years, however, the focus has shifted to rigorous system-theoretic foundations, leading to what one might call a theory of collective phenomena in multiagent systems. This theory blends dynamical systems, graph theory, Markov chains, and algorithms.

This type of collective phenomenon is often modeled as a many-degrees-of-freedom (discrete-time or continuous-time) dynamical system with an additional twist: the interconnection structure between the individual dynamical systems changes, since the motion of each node in a flock (or the opinion of an individual) is affected primarily by those in each node's local neighborhood. The twist here is that the local neighborhood is not fixed: neighbors are defined based on the actual state of the system. For example, in the case of Vicsek's alignment rule, as each particle averages its velocity direction with that of its neighbors and then takes a step, the set of its nearest neighbors can change.

Interestingly, very similar models were developed in the statistics and mathematical sociology literature to describe how individuals in a social network update their opinions as a function of the opinions of their friends. The first such model goes back to the seminal work of DeGroot (1974). DeGroot's model simply described the evolution of a group's scalar opinion as a function of the opinions of their neighbors by an iterative averaging scheme that can be conveniently modeled as a Markov chain. Individuals are represented by nodes of a graph, start from an opinion at time zero, and then are influenced by the people in their social clique. In DeGroot's model, though, the network is given exogenously and does not change as a function of the opinions. The model therefore can be analyzed using the celebrated Perron-Frobenius theorem. The evolution of opinions is a discrete dynamic system that corresponds to an averaging map. When the network is fixed and connected (i.e., there is a path from every node to every other node) and agents also include their own opinions in the averaging, the update results in the computation of a global weighted average of the initial opinions, where the weight of each initial opinion in the final aggregate is proportional to the "importance" of each node in the network.

The flocking models of Reynolds (1987) and Vicsek et al. (1995), however, have an extra twist: the network changes as the opinions are updated. Similarly, more refined sociological models developed over the past decade also capture this endogeneity (Hegselmann and Krause 2002): each individual agent is influenced by others only when their opinion is close to her own. In other words, as opinions evolve, neighborhood structures change as a function of the evolving opinion, resulting in a switched dynamical system in which the switching is state dependent.

In a paper in 2003, Jadbabaie and coauthors (2003) studied Reynolds' alignment rule in the context of Vicsek's model when there is no exogenous noise. To model the endogeneity of the change in neighborhood structure, they developed a model based on repeated local averaging in which the neighborhood structure changes over time; therefore, instead of a simple discrete-time linear dynamical system, the model is a discrete linear inclusion, or a switched linear system. The question of interest was to determine what regimes of network changes could result in flocking. Clearly, as DeGroot's model also suggests, when the local neighbor structures do not change, connectivity is the key factor for flocking. This is a direct consequence of Perron-Frobenius theory. The result can also be described in terms of directed graphs. What is the equivalent condition in changing networks? Jadbabaie and coauthors show in their paper that indeed connectivity is important, but it need not hold at every time: rather, there need to be time periods over which the graphs are connected in time. More formally, the process of neighborhood changes due to motion in Vicsek's model can be simply abstracted as a graph in which the links "blink" on and off. For flocking, one needs to ensure that there are
Flocking in Networked Systems 461
time periods over which the union of edges (that occur as a result of proximity of particles) needs to correspond to a connected graph, and such intervals need to occur infinitely often. It turns out that many of these ideas were developed much earlier in a thesis and following paper by Tsitsiklis (1984) and Tsitsiklis et al. (1986), in the context of distributed and asynchronous computation of global averages in changing graphs, and in a paper by Chatterjee and Seneta (1977), in the context of nonhomogeneous Markov chains. The machinery for proving such results, however, is classical and has a long history in the theory of inhomogeneous Markov chains, a subject studied since the time of Markov himself, followed by Birkhoff and other mathematicians such as Hajnal, Dobrushin, Seneta, Hartfiel, Daubechies, and Lagarias, to name a few.

The interesting twist in the analysis of Vicsek's model is that the Euclidean norm of the distance to the globally converged "consensus angle" (or consensus opinion in the case of opinion models) can actually grow in a single step of the process; therefore, standard quadratic Lyapunov function arguments, which serve as the main tool for the analysis of switched linear systems, are not suitable for the analysis of such models. However, it is fairly easy to see that under the process of local averaging, the largest value cannot increase and the smallest value cannot decrease. In fact, one can show that if enough connected graphs occur as the result of the switching process, the maximum value will be strictly decreasing and the minimum value will be strictly increasing.

The paper by Jadbabaie and coauthors (2003) has led to a flurry of results in this area over the past decade. One important generalization of the results of Jadbabaie et al. (2003), Tsitsiklis (1984), and Tsitsiklis et al. (1986) came 2 years later in a paper by Moreau (2005), who showed that these results can be generalized to nonlinear updates and directed graphs. Moreau showed that any dynamic process that assigns to each node a point in the interior of the convex hull of the values of the node and its neighbors will eventually result in agreement and consensus, if and only if the union of graphs from every time step till infinity contains a directed spanning tree (a node with directed paths to every other node).

Some of these results were also extended to the analysis of Reynolds' model of flocking, including the other two behaviors. First, Tanner and coauthors (2003, 2007) showed in a series of papers that a zone-based model similar to Reynolds' can result in flocking for dynamic agents, provided that the graph representing interagent communications stays connected. Olfati-Saber and coauthors (2007) developed similar results with a slightly different model.

Many generalizations and extensions of these results exist in a diverse set of disciplines, resulting in a rich theory which has had applications from robotics (such as rendezvous of mobile robots) (Cortés et al. 2006) to mathematical sociology (Hegselmann and Krause 2002) and from economics (Golub and Jackson 2010) to distributed optimization theory (Nedic and Ozdaglar 2009). However, some of the fundamental mathematical questions related to flocking still remain open.

First, most results focus on exogenous models of network change. A notable extension is a paper by Cucker and Smale (2007), in which the authors develop and analyze an endogenous model of flocking that cleverly smooths out the discontinuous change in network structure by allowing each node's influence to decay smoothly as a function of distance.

Recently, in a series of papers, Chazelle has made progress in this arena by using tools from computational geometry and algorithms for the analysis of endogenous models of flocking (Chazelle 2012). Chazelle has introduced the notion of the s-energy of a flock, which can be thought of as a parameterized family of Lyapunov functions that represent the evolution of global misalignment between flockmates. Via tools from dynamical systems, computational geometry, combinatorics, complexity theory, and algorithms, Chazelle creates an "algorithmic calculus" for diffusive influence systems: surprisingly, he shows that the orbit or flow of such systems is attracted to a fixed point in the case of undirected graphs and a limit cycle for almost all arbitrarily small random
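The role of joint connectivity can be seen in a four-node sketch in which neither of two alternating graphs is connected on its own, while their union over every two consecutive steps is; repeated local averaging then still reaches consensus, with the maximum non-increasing and the minimum non-decreasing, as noted above. (Illustrative code; weights and values are made up.)

```python
import numpy as np

# Two graphs on 4 nodes, each disconnected on its own, whose union
# is connected: G1 links {0,1} and {2,3}; G2 links {1,2} and {3,0}.
edge_sets = [[(0, 1), (2, 3)], [(1, 2), (3, 0)]]

def averaging_matrix(edges, n=4, w=0.25):
    """Each agent moves a fraction w toward each of its current neighbors."""
    W = np.eye(n)
    for i, j in edges:
        W[i, i] -= w; W[i, j] += w
        W[j, j] -= w; W[j, i] += w
    return W

x = np.array([3.0, -1.0, 4.0, 0.0])
spread = [x.max() - x.min()]
for k in range(60):
    x = averaging_matrix(edge_sets[k % 2]) @ x   # alternate the two graphs
    spread.append(x.max() - x.min())
```

The spread max − min never increases and, because the union of the graphs over each pair of steps is connected, it contracts to zero: the agents reach consensus on the average of the initial values (here 1.5), even though no single graph is connected.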
  v_e = J(q) q̇.

ideal assumptions are relaxed. In that case the control laws may be adapted to deal with nonideal characteristics.

Indirect Force Control
The aim of indirect force control is that of achieving a desired compliant dynamic behavior of the robot's end effector in the presence of interaction with the environment.

Stiffness Control
The simplest approach is that of imposing a suitable static relationship between the deviation of the end-effector position and orientation from a desired pose and the force exerted on the environment, by using the control law

  h_c = K_P x_de − K_D v_e + γ(q),   (2)

where K_P and K_D are suitable matrix gains and x_de is a suitable error between the desired and the actual end-effector position and orientation. The position error component of x_de can be simply chosen as p_d − p_e. Concerning the orientation error component, different choices are possible (Caccavale et al. 1999), which are not all equivalent, but this issue is outside the scope of this entry.

The control input (2) corresponds to a wrench (force and moment) applied to the end effector, which includes a gravity compensation term γ(q), a viscous damping term −K_D v_e, and an elastic wrench provided by a virtual spring with stiffness matrix K_P (or, equivalently, compliance matrix K_P^{−1}) connecting the end-effector frame with a frame of desired position and orientation. This control law is known as stiffness control or compliance control (Salisbury 1980).

Using the Lyapunov method, it is possible to prove the asymptotic stability of the equilibrium solution of the equation

  K_P x_de = h_e,

meaning that, at steady state, the robot's end effector has a desired elastic behavior under the action of the external wrench h_e. It is clear that, if h_e ≠ 0, then the end effector deviates from the desired pose, which is usually denoted as the virtual pose.

Physically, the closed-loop system (1) with (2) can be seen as a 6-DOF nonlinear and configuration-dependent mass-spring-damper system with inertia (mass) matrix Λ(q) and adjustable damping K_D and stiffness K_P, under the action of the external wrench h_e.

Impedance Control
A configuration-independent dynamic behavior can be achieved if the measure of the end-effector force and moment h_e is available, by using the control law, known as impedance control (Hogan 1985):

  h_c = Λ(q) α + Γ(q, q̇) q̇ + γ(q) + h_e,

where α is chosen as

  α = v̇_d + K_M^{−1} (K_D v_de + K_P x_de − h_e).

The following expression can be found for the closed-loop system:

  K_M v̇_de + K_D v_de + K_P x_de = h_e,   (3)

representing the equation of a 6-DOF configuration-independent mass-spring-damper system with adjustable inertia (mass) matrix K_M, damping K_D, and stiffness K_P, known as mechanical impedance.

A block diagram of the resulting impedance control is sketched in Fig. 2.

The selection of good impedance parameters ensuring a satisfactory behavior is not an easy task and can be simplified under the hypothesis that all the matrices are diagonal, resulting in a decoupled behavior for the end-effector coordinates.

Moreover, the dynamics of the controlled system during the interaction depends on the dynamics of the environment that, for simplicity, can be approximated as a simple elastic law for each coordinate, of the form
466 Force Control in Robotics
Force Control in Robotics, Fig. 2 Block diagram of impedance control: the impedance control block, driven by the desired pose (p_d, R_d) and the measured wrench h_e, generates the acceleration input for the inverse dynamics, which commands the manipulator in contact with the environment; the joint variables q are fed back through the direct kinematics to recover the actual end-effector pose (p_e, R_e) and velocity
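A scalar instance of the impedance behavior (3) can be simulated directly; with a constant external force the end-effector error settles at the static deflection h_e/K_P, as predicted by the steady state of the mass-spring-damper equation. (Illustrative sketch; the gains are arbitrary.)

```python
# Scalar instance of the impedance equation (3):
#     KM * a_de + KD * v_de + KP * x_de = h_e,
# integrated with explicit Euler under a constant external force h_e.
KM, KD, KP = 1.0, 20.0, 100.0     # critically damped: KD = 2*sqrt(KM*KP)
h_e = 5.0
x_de, v_de = 0.0, 0.0
dt = 1e-3
for _ in range(20000):            # 20 s of simulated time
    a_de = (h_e - KD * v_de - KP * x_de) / KM
    x_de += dt * v_de
    v_de += dt * a_de
# Steady state: the pose error settles at the static deflection h_e / KP
print(x_de)                       # ~ 0.05
```

Raising K_P stiffens the apparent behavior (smaller deflection under the same force); lowering it makes the end effector more compliant, which is exactly the trade-off the impedance parameters expose.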
where α_ν and f_λ are properly designed control inputs, which leads to the equations

  ν̇ = α_ν,  λ = f_λ.

control schemes can be defined also in this case (Villani and De Schutter 2008).
Bibliography
Caccavale F, Natale C, Siciliano B, Villani L (1999) Six-DOF impedance control based on angle/axis representations. IEEE Trans Robot Autom 15:289–300
Chiaverini S, Sciavicco L (1993) The parallel approach to force/position control of robotic manipulators. IEEE Trans Robot Autom 9:361–373
Chiaverini S, Siciliano B, Villani L (1994) Force/position regulation of compliant robot manipulators. IEEE Trans Autom Control 39:647–652
De Schutter J, Van Brussel H (1988) Compliant robot motion I. A formalism for specifying compliant motion tasks. Int J Robot Res 7(4):3–17
De Schutter J, De Laet T, Rutgeerts J, Decré W, Smits R, Aertbeliën E, Claes K, Bruyninckx H (2007) Constraint-based task specification and estimation for sensor-based robot systems in the presence of geometric uncertainty. Int J Robot Res 26(5):433–455
Hogan N (1985) Impedance control: an approach to manipulation: parts I–III. ASME J Dyn Syst Meas Control 107:1–24
Khatib O (1987) A unified approach for motion and force control of robot manipulators: the operational space formulation. IEEE J Robot Autom 3:43–53

Abstract

In this chapter we give an introduction to frequency domain system identification. We start from the identification work loop in System Identification: An Overview, Fig. 4, and we discuss the impact of selecting the time or frequency domain approach on each of the choices in this loop. Although there is full theoretical equivalence between the time and frequency domain identification approaches, it turns out that, from a practical point of view, there can be a natural preference for one of the two domains.

Keywords

Discrete and continuous time models; Experiment setup; Frequency and time domain identification; Plant and noise model
470 Frequency Domain System Identification
Frequency Domain System Identification, Fig. 1 Comparison of the ZOH and BL signal reconstruction of a discrete time sequence: (a) time domain (BL and ZOH reconstructions); (b) spectrum of the ZOH signal; (c) spectrum of the BL signal (amplitude as a function of f/f_s)
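The qualitative content of Fig. 1 can be reproduced numerically: a ZOH reconstruction of a discrete sequence contains spectral images above the Nyquist frequency f_s/2, while a band-limited reconstruction does not. A numpy sketch (illustrative; the sequence and the oversampling factor are arbitrary):

```python
import numpy as np

N, R = 32, 8                       # sequence length and oversampling factor
rng = np.random.default_rng(0)
x = rng.standard_normal(N)         # an arbitrary discrete time sequence

# ZOH reconstruction: hold each sample over R fine-grid points
x_zoh = np.repeat(x, R)

# BL reconstruction: zero-pad the DFT (no content at or above fs/2)
X = np.fft.fft(x)
half = N // 2
Xp = np.zeros(N * R, dtype=complex)
Xp[:half] = X[:half]
Xp[-(half - 1):] = X[-(half - 1):]   # drop the Nyquist bin: strictly band-limited
x_bl = np.fft.ifft(Xp).real * R

def high_band_energy(sig):
    """Spectral energy above the original Nyquist frequency fs/2."""
    S = np.abs(np.fft.fft(sig))
    return float(np.sum(S[half : N * R - half] ** 2))

hb_zoh = high_band_energy(x_zoh)   # large: ZOH images around multiples of fs
hb_bl = high_band_energy(x_bl)     # ~ 0: band-limited by construction
```

The ZOH signal carries images of the base spectrum around multiples of f_s (as in panel b of Fig. 1), while the BL signal has no energy beyond f_s/2 (panel c).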
and discretize the continuous time input and output signals and process these on a digital computer. Also the excitation signals are mostly generated from a discrete time sequence with a digital-to-analog convertor (DAC).

The continuous time input and output signals u_c(t), y_c(t) of the system to be modeled are measured at the sampling moments t_k = kT_s, with T_s = 1/f_s the sampling period and f_s the sampling frequency: u(k) = u_c(kT_s) and y(k) = y_c(kT_s). The discrete time signals u(k), y(k), k = 1, …, N are transformed to the frequency domain using the discrete Fourier transform (DFT) ( Nonparametric Techniques in System Identification, Eq. 1), resulting in the DFT spectra U(l), Y(l) at the frequencies f_l = l f_s/N. Making some abuse of notation, we will reuse the same symbols later in this text to denote the Z-transform of the signals; for example, Y(z) will also be used for the Z-transform of y(k).

Frequency Domain Measurements
A major exception to this general trend toward time domain measurements are the (high-frequency) network analyzers that measure the transfer function of a system frequency by frequency, starting from the steady-state response to a sine excitation. The frequency is stepped over the frequency band of interest, resulting directly in a measured frequency response function at a user-selected set of frequencies ω_k, k = 1, …, F:

  G(ω_k).

From the identification point of view, we can easily fit the latter situation into the frequency domain identification framework by putting

  U(k) = 1,  Y(k) = G(ω_k).

For that reason we will focus completely on the time domain measurement approach in the remaining part of this contribution.

Zero-Order-Hold and Band-Limited Setup: Impact on the Model Choice
No information is available on how the continuous time signals u_c(t), y_c(t) vary in between the measured samples u(k), y(k). For that reason we need to make an assumption and make sure that the measurement setup is selected such that the intersample assumption is met. Two intersample assumptions are very popular (Pintelon and Schoukens 2012; Schoukens et al. 1994, pp. 498–512): the zero-order-hold (ZOH) and the band-limited (BL) assumption. Both options are shown in Fig. 1 and discussed below. The choice
472 Frequency Domain System Identification
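As an illustration, the sampling and DFT step described above can be sketched in a few lines of NumPy; the sampling rate, record length, and the gain-2 test system are illustrative choices of this sketch, not values from the text:

```python
import numpy as np

# Illustrative sampling setup (assumed values for this example)
fs = 1000.0                   # sampling frequency f_s (Hz)
N = 1024                      # number of samples
t = np.arange(N) / fs         # sampling moments t_k = k*T_s

# Simulated sampled records u(k), y(k): a sine through a static gain of 2;
# 125 Hz falls exactly on DFT bin l = 128, so the record is periodic
u = np.sin(2 * np.pi * 125.0 * t)
y = 2.0 * u

# DFT spectra U(l), Y(l) at the frequencies f_l = l*fs/N
U = np.fft.fft(u) / N
Y = np.fft.fft(y) / N
f = np.arange(N) * fs / N

l = 128
G_hat = Y[l] / U[l]           # FRF estimate at the excited bin
```

Because the excited frequency lies exactly on the DFT grid, the ratio Y(l)/U(l) recovers the gain without leakage errors.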
The choice between these assumptions does not only affect the measurement setup; it also has a strong impact on the selection between a continuous or a discrete time model.

Zero-Order Hold
The ZOH setup assumes that the excitation remains constant in between the samples. In practice, the model is identified between the discrete time reference signal in the memory of the generator and the sampled output. The intersample behavior is an intrinsic part of the model: if the intersample behavior changes, the corresponding model will also change. The ZOH assumption is very popular in digital control. In that case the sampling frequency f_s is commonly chosen 10 times larger than the frequency band of interest.

A discrete time model gives, in case of noise-free data, an exact description between the sampled input u(k) and output y(k) of the continuous time system:

y(k) = G(q, θ) u(k)

in the time domain (▶System Identification: An Overview, Eq. 6). In this expression, q denotes the shift operator (time domain); it is replaced by z in the z-domain description (transfer function description):

Y(z) = G(z, θ) U(z).

Evaluating the transfer function on the unit circle by replacing z = e^{iω} results in the frequency domain description of the system:

Y(e^{iω}) = G(e^{iω}, θ) U(e^{iω})

(▶System Identification: An Overview, Eq. 34).

Band-Limited Setup
The BL setup assumes that above a given frequency f_max < f_s/2, there is no power in the signals. The continuous time signals are filtered by well-tuned anti-alias filters (cutoff frequency f_max < f_s/2) before they are sampled. Outside the digital control world, the BL setup is the standard choice for discrete time measurements. Without anti-alias filters, large errors can be created due to aliasing effects: the high frequency (f > f_s/2) content of the measured signals is folded down into the frequency band of interest and acts there as a disturbance. For that reason it is strongly advised to always use anti-alias filters in the measurement setup.

The exact relation between BL signals is described by a continuous time model, for example, in the frequency domain:

Y(ω) = G(ω, θ) U(ω)

(Schoukens et al. 1994).

Combining Discrete Time Models and BL Data
It is also possible to identify a discrete time model between the BL data, at the cost of creating (very) small model errors (Schoukens et al. 1994). This is the standard setup used in digital signal processing applications like digital audio processing. The level of the model errors can be reduced by lowering the ratio f_max/f_s or by increasing the model order. In a properly designed setup, the discrete time model errors can be made very small, e.g., relative errors below 10⁻⁵. In Table 1 an overview of the models corresponding to the experimental conditions is given.

Extracting Continuous Time Models from ZOH Data
Although the most robust and practical choice to identify a continuous time model is to start from BL data, it is also possible to extract a continuous time model under the ZOH setup. A first possibility is to assume that the ZOH assumption is perfectly met, which is a very hard assumption to realize in practice. In that case the continuous time model can be retrieved by a linear step invariant transformation of the discrete time model. A second possibility is to select a very high sample frequency with respect to the bandwidth of the system. In that case it is advantageous to describe the discrete time model using the delta operator (Goodwin 2010), and we have that the coefficients of the discrete time model converge to those of the continuous time model.
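The step-invariant (ZOH) relation between a continuous time system and its exact discrete time model can be checked numerically; a sketch using SciPy, where the first-order system G(s) = 1/(s+1) and the sampling period are illustrative assumptions of the example:

```python
import numpy as np
from scipy import signal

# Illustrative continuous time system G(s) = 1/(s + 1), sampled at fs = 100 Hz
num, den = [1.0], [1.0, 1.0]
Ts = 0.01

# Step-invariant (ZOH) transformation: the discrete time model is exact at the
# sampling instants when the input is piecewise constant between samples
numd, dend, _ = signal.cont2discrete((num, den), Ts, method='zoh')
numd = np.squeeze(numd)

# Compare the continuous and discrete step responses at the sampling instants
tgrid = np.arange(100) * Ts
_, y_ct = signal.step((num, den), T=tgrid)
_, y_dt = signal.dstep((numd, dend, Ts), n=100)
y_dt = np.squeeze(y_dt)
```

For a step (ZOH-compatible) input the two responses coincide at the samples, which is exactly the sense in which the ZOH discrete time model is an exact description.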
Frequency Domain System Identification, Table 1 Relations between the continuous time system G(s) and the identified models as a function of the signal and model choices

              DT model (assuming ZOH setup)               CT model (assuming BL setup)
ZOH setup     Exact DT model:                             Not studied
              G(z) = (1 − z⁻¹) Z{G(s)/s}
              "standard conditions DT modelling"
BL setup      Approximate DT model G̃(z):                  Exact CT model G(s)
              G̃(z = e^{jωT_s}) ≈ G(s = jω), |ω| < ω_s/2    "standard conditions CT modelling"
              "digital signal processing field"
Models of Cascaded Systems
In some problems we want to build models for a cascade of two systems G₁, G₂. It is well known that the overall transfer function G is given by the product G(ω) = G₁(ω)G₂(ω). This result also holds for models that are identified under the BL signal assumption: the model for the cascade will be the product of the models of the individual systems. However, the result does not hold under the ZOH assumption, because the intermediate signal between G₁ and G₂ does not meet the ZOH assumption. For that reason, the ZOH model of a cascaded system is not obtained by cascading the ZOH models of the individual systems.

Experiment Design: Periodic or Random Excitations?
In general, arbitrary data can be used to identify a system as long as some basic requirements are respected (▶System Identification: An Overview, section on Experiment Design). Imposing periodic excitations can be an important restriction of the user's freedom to design the experiment, but we will show in the next sections that it also offers major advantages at many steps in the identification work loop (▶System Identification: An Overview, Fig. 4).

With the availability of arbitrary waveform generators, it became possible to generate arbitrary periodic signals. The user should make two major choices during the design of the periodic excitation: the selection of the amplitude spectrum (How is the available power distributed over the frequency?) and the choice of the frequency resolution (What is the frequency step between two successive points of the measured FRF?) (Pintelon and Schoukens 2012, Sect. 5.3).

The amplitude spectrum is mainly set by the requirement that the excited frequency band should cover the frequency band of interest. A white noise excitation covers the full frequency band, including those bands that are of no interest for the user. This is a waste of power and it should be avoided. Designing a good power spectrum for identification and control purposes is discussed in ▶Experiment Design and Identification for Control.

The frequency resolution f₀ = 1/T is set by the inverse of the period of the signal. It should be small enough so that no important dynamics are missed, e.g., a very sharp mechanical resonance.

The reader should be aware that exactly the same choices have to be made during the design of nonperiodic excitations. If, for example, a random noise excitation is used, the frequency resolution is restricted by the length of the experiment T_m, and the corresponding frequency resolution is again f₀ = 1/T_m. The power spectrum of the noise excitation should be well shaped using a digital filter.
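A minimal multisine design along these lines can be sketched as follows; the sampling rate, band of interest, and flat amplitude spectrum are illustrative assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
fs, N = 1000.0, 500           # one period of N samples -> resolution f0 = fs/N = 2 Hz
f0 = fs / N
t = np.arange(N) / fs
excited = np.arange(1, 51)    # excite bins 1..50, i.e., 2..100 Hz: the band of interest

# Flat amplitude spectrum in the band, random phases (a common multisine design);
# no power is wasted outside the band of interest
phases = 2 * np.pi * rng.random(excited.size)
u = sum(np.cos(2 * np.pi * k * f0 * t + ph) for k, ph in zip(excited, phases))

# All power sits in the chosen bins; unexcited bins carry (numerically) nothing
U = np.fft.fft(u) / N
```

The random phases are a simple way to keep the time-domain peak value reasonable; dedicated crest-factor optimization can do better but is beyond this sketch.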
Nonparametric Preprocessing of the Data in the Frequency Domain
Before a parametric model is identified from the raw data, a lot of information can be gained, almost for free, by making a nonparametric analysis of the data. This can be done with very little user interaction. Some of these methods are explicitly linked to the periodic nature of the data; other methods apply also to random excitations.

Nonparametric Frequency Analysis of Periodic Data
By using simple DFT techniques, the following frequency domain information is extracted from sampled time domain data u(t), y(t), t = 1, …, N (Pintelon and Schoukens 2012):
• The signal information: U(l), Y(l), the DFT spectra of the input and output, evaluated at the frequencies l f₀, with l = 1, 2, …, F.
• Disturbing noise variance information: The full data record is split so that each sub-record contains a single period. For each of these, the DFT spectrum is calculated. Since the signals are periodic, they do not vary from one period to the other, so that the observed variations can be attributed to the noise. By calculating the sample mean and variance over the periods at each frequency, a nonparametric noise analysis is available. The estimated variances σ̂²_U(k), σ̂²_Y(k) measure the disturbing noise power spectrum at frequency f_k on the input and the output, respectively. The covariance σ̂²_YU(k) characterizes the linear relations between the noise on the input and the output.

This is very valuable information because, even before starting the parametric identification step, we already get full access to the quality of the raw data. As a consequence, there is also no interference between the plant model estimation and the noise analysis: plant model errors do not affect the estimated noise model. It is also important to realize that no user interaction is required to make this analysis; it follows directly from a simple DFT analysis of the raw data. These are two major advantages of using periodic excitations.

Nonlinear Analysis
Using well-designed periodic excitations, it is possible to detect the presence of nonlinear distortions during the nonparametric frequency step. The level of the nonlinear distortions at the output of the system is measured as a function of the frequency, and it is even possible to differentiate between even (e.g., x²) and odd (e.g., x³) distortions. While the first only act as disturbing noise in a linear modeling framework, the latter will also affect the linearized dynamics and can change, for example, the pole positions of a system (Pintelon and Schoukens 2012, Sect. 4.3).

Noise and Data Reduction
By averaging the periodic signals over the successive periods, we get a first reduction of the noise. An additional noise reduction is possible when not all frequencies are excited. If a very wide frequency band has to be covered, a fine frequency resolution is needed at the low frequencies, whereas in the higher frequency bands the resolution can be reduced. Signals with a logarithmic frequency distribution are used for that purpose. Eliminating the unexcited frequencies does not only reduce the noise, it also significantly reduces the amount of raw data to be processed. By combining different experiments that each cover a specific frequency band, it is possible to measure a system over multiple decades, e.g., electrical machines are measured from a few mHz to a few kHz.

In a similar way, it is also possible to focus the fit on the frequency band of interest by including only those frequencies in the parametric modeling step.

High-Quality Frequency Response Function Measurements
For periodic excitations, it is very simple to obtain high-quality measurements of the nonparametric frequency response function of the system. These results can be extended to random excitations at the cost of using more advanced algorithms that require more computation time (Pintelon and Schoukens 2012, Chap. 7). This approach is discussed in detail in ▶Nonparametric Techniques in System Identification (Eq. 1).
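The per-period averaging and nonparametric noise analysis described above can be sketched as follows; the gain-2 system, the noise level, and the record sizes are illustrative assumptions of the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
fs, N, P = 1000.0, 250, 64            # P periods of N samples each (assumed values)
t = np.arange(N) / fs
u = np.sin(2 * np.pi * (8 * fs / N) * t)   # excite exactly DFT bin 8 (32 Hz)

# Measure P periods of the steady-state response of a gain-2 system + output noise
y = 2.0 * np.tile(u, P) + 0.1 * rng.standard_normal(P * N)

# DFT of each period: the periodic part is identical in every period, so the
# variation over the periods is due to the disturbing noise alone
Yp = np.fft.fft(y.reshape(P, N), axis=1) / N
Y_mean = Yp.mean(axis=0)              # averaging reduces the noise power by P
varY = Yp.var(axis=0, ddof=1)         # nonparametric noise variance per frequency

U = np.fft.fft(u) / N
G_hat = Y_mean[8] / U[8]              # FRF at the excited bin, after averaging
```

The sample variance per bin estimates the disturbing noise power spectrum with no user interaction, exactly the by-product that the periodic approach delivers for free.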
Generalized Frequency Domain Models

A very important step in the identification work loop is the choice of the model class (▶System Identification: An Overview, Fig. 4). Although most physical systems are continuous time, the models that we need might be either discrete time (e.g., digital control, computer simulations, digital signal processing) or continuous time (e.g., physical interpretation of the model, analog control) (Ljung 1999, Sects. 2.1 and 4.3). A major advantage of the frequency domain is that both model classes are described by the same transfer function model. The only difference is the choice of the frequency variable. For continuous time models we operate in the Laplace domain, and the frequency variable is retrieved on the imaginary axis by putting s = jω. For discrete time systems we work on the unit circle that is described by the frequency variable z = e^{j2πf/f_s}. The application class can even be extended to include diffusion phenomena by putting Ω = √(jω) (Pintelon et al. 2005). From now on we will use the generalized frequency variable Ω, and depending on the selected domain, the proper substitution jω, e^{j2πf/f_s}, or √(jω) should be made. The unified discrete and continuous time description

Y(k) = G(Ω_k, θ) U(k)

does not yet account for the initial transient (in the time domain) and the begin and end effects of the finite measurement window (called leakage in the frequency domain); see ▶Nonparametric Techniques in System Identification, "The Leakage Problem" section. The amazing result is that in the frequency domain, both effects are described by exactly the same mathematical expression. This leads eventually to the following model in the frequency domain that is valid for periodic and arbitrary (nonperiodic) BL or ZOH excitations (Pintelon et al. 1997; McKelvey 2002; Pintelon and Schoukens 2012, Chap. 6):

Y(k) = G(Ω_k, θ) U(k) + T_G(Ω_k, θ),

which becomes for SISO (single-input, single-output) systems:

G(Ω, θ) = B(Ω, θ)/A(Ω, θ)  and  T_G(Ω, θ) = I(Ω, θ)/A(Ω, θ).

A, B, I are all polynomials in Ω. The transient term T_G(Ω, θ) models transient and leakage effects. It is most important for the reader to realize that this is an exact description for noise-free data. Observe that it is very similar to the description in the time domain: y(t) = G(q, θ) u(t) + t_G(t, θ). In that case the transient term t_G(t, θ) models the initial transient that is due to the initial conditions.

The model parameters are estimated by minimizing the weighted least squares cost function that uses the nonparametric noise estimates as weighting:

V(θ) = (1/F) Σ_{k=1}^{F} |Ŷ(k) − G(Ω_k, θ) Û(k) − T_G(Ω_k, θ)|² / (σ̂²_Y(k) + σ̂²_U(k) |G(Ω_k, θ)|² − 2 Re(σ̂²_YU(k) G*(Ω_k, θ))),

with * denoting complex conjugation.
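A sketch of this cost function in NumPy, evaluated on hypothetical noise-free data (the first-order "true" system, the noise levels, and the explicit conjugate on G in the covariance term are illustrative assumptions of the sketch):

```python
import numpy as np

def sml_cost(G, TG, U, Y, varU, varY, covYU):
    # Residual between the measured output spectrum and the model prediction
    e = Y - G * U - TG
    # Nonparametric noise weighting: variances and covariance per frequency,
    # as delivered by the periodic preprocessing step
    den = varY + varU * np.abs(G)**2 - 2.0 * np.real(covYU * np.conj(G))
    return np.mean(np.abs(e)**2 / den)

# Noise-free sanity check with an illustrative true system G(s) = 1/(s + 1)
w = np.linspace(0.1, 10.0, 50)
G_true = 1.0 / (1j * w + 1.0)
U = np.ones_like(G_true)
Y = G_true * U
zeros = np.zeros_like(w)
cost_true = sml_cost(G_true, zeros, U, Y, 0.1 + zeros, 0.1 + zeros, zeros)
cost_wrong = sml_cost(2.0 * G_true, zeros, U, Y, 0.1 + zeros, 0.1 + zeros, zeros)
```

On noise-free data the cost is exactly zero at the true model and strictly positive for a wrong one, which is the behavior the estimator exploits.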
The properties of this estimator are fully studied in Schoukens et al. (1999) and Pintelon and Schoukens (2012, Sect. 10.3), and it is shown that it is a consistent and almost efficient estimator under very mild conditions.

The formal link with the time domain cost function, as presented in ▶System Identification: An Overview, can be made by assuming that the input is exactly known (σ̂²_U(k) = 0, σ̂²_YU(k) = 0) and by replacing the nonparametric noise model on the output by a parametric model. These changes reduce the cost function, within a parameter-independent constant, to

V_F(θ) = (1/F) Σ_{k=1}^{F} |Ŷ(k) − G(Ω_k, θ) U₀(k) − T_G(Ω_k, θ)|² / |H(Ω_k, θ)|²,

which is exactly the same expression as Eq. 37 in ▶System Identification: An Overview, provided that the frequencies Ω_k cover the full unit circle and the transient term T_G is omitted. The latter models the initial condition effects in the time domain. The expression shows the full equivalence with the classical discrete time domain formulation. If only a subsection of the full unit circle is used for the fit, the additional term N log det in ▶System Identification: An Overview, Eq. 21, should be added to the cost function V_F(θ).

Additional User Aspects in Parametric Frequency Domain System Identification

In this section we highlight some additional user aspects that are affected by the choice for a time or frequency domain approach to system identification.

Nonparametric Noise Models
The use of a nonparametric noise model is a natural choice in the frequency domain. It is of course also possible to use a parametric noise model in the frequency domain formulation, but then we would lose two major advantages of the frequency domain formulation: (i) for periodic excitations, there is no interaction between the identification of the plant model and the nonparametric noise model, so plant model errors do not show up in the noise model; and (ii) the availability of the nonparametric noise model eliminates the need for tuning the parametric noise model order, resulting in algorithms that are easier to use. A disadvantage of using a nonparametric noise model is that we can no longer express that the noise model shares some dynamics with the plant model, for example, when the disturbance is an unobserved plant input.

It is also possible to use a nonparametric noise model in the time domain. This leads to a Toeplitz weighting matrix, and the fast numerical algorithms that are used to deal with these make use internally of FFT (fast Fourier transform) algorithms, which brings us back to the frequency domain representation of the data.

Stable and Unstable Plant Models
In the frequency domain formulation, no special precaution is needed to deal with unstable models, so we can tolerate these models without any problem. There are multiple reasons why this can be advantageous. The most obvious one is the identification of an unstable system operating in a stabilizing closed loop. It can also happen that the intermediate models obtained during the optimization process are unstable. Imposing stability at each iteration can be too restrictive, resulting in estimates that are trapped in a local minimum. A last significant advantage is the possibility to split the identification problem (extract a model from the noisy data) and the approximation problem (approximate the unstable model by a stable one). This allows us to use in each step a cost function that is optimal for that step: maximum noise reduction in the first step, followed by a user-defined approximation criterion in the second step.

Model Selection and Validation
An important step in the system identification procedure is the tuning of the model complexity, followed by the evaluation of the model quality on a fresh data set.
The availability of a nonparametric noise model and a high-quality frequency response function measurement simplifies these steps significantly.

Absolute Interpretation of the Cost Function: Impact on Model Selection and Validation
The weighted least squares cost function, using the nonparametric noise weighting, is a normalized function. Its expected value equals E{V(θ̂)} = (F − n_θ/2)/F, with n_θ the number of free parameters in the model and F the number of frequency points. A cost function that is far too large points to remaining model errors. Unmodeled dynamics result in correlated residuals (the difference between the measured and the modeled FRF): the user should increase the model order to capture these dynamics in the linear model. A cost function that is far too large while the residuals are white points to the presence of nonlinear distortions: the best linear approximation is identified, but the user should be aware that this approximation is conditioned on the actual excitation signal. A cost function that is far too low points to an error in the preprocessing of the data, resulting in a bad noise model.

Missing Resonances
Some systems are lightly damped, resulting in resonant behavior, for example, a vibrating mechanical structure. By comparing the parametric transfer function model with the nonparametric FRF measurements, it becomes clearly visible if a resonance is missed in the model. This can be either due to a too simple model structure (the model order should be increased), or it can appear because the model is trapped in a local minimum. In the latter case, better numerical optimization and initialization procedures should be looked for.

Identification in the Presence of Noise on the Input and Output Measurements
Within the band-limited measurement setup, both the input and the output have to be measured. This leads in general to an identification framework where both the input and the output are disturbed by noise. Such problems are studied in the errors-in-variables (EIV) framework (Söderström 2012). A special case is the identification of a system that is captured in a feedback loop. In that case the noisy output measurements are fed back to the input of the system, which creates a dependency between the input and output disturbances. We discuss both situations briefly below.

Errors-in-Variables Framework
The major difficulty of the EIV framework is the simultaneous identification of the plant model describing the input-output relations, the noise models that describe the input and the output noise disturbances, and the signal model describing the coloring of the excitation (Söderström 2012). Advanced identification methods have been developed, but today it is still necessary to impose strong restrictions on the noise models, e.g., correlations between input and output noise disturbances are not allowed. The periodic frequency domain approach encapsulates the general EIV problem, including mutually correlated colored input-output noise. Again, a full nonparametric noise model is obtained in the preprocessing step. This reduces the complexity of the EIV problem to that of a classical weighted least squares identification problem, which makes a huge difference in practice (Pintelon and Schoukens 2012; Söderström et al. 2010).

Identification in a Feedback Loop
Identification under feedback conditions can be solved with the time domain prediction error method (Ljung 1999, Sect. 13.4; Söderström and Stoica 1989, Chap. 10). This leads to consistent estimates, provided that the exact plant and noise model structure and order are retrieved. In the periodic frequency domain approach, a nonparametric noise model is extracted (variances of the input and output noise, and the covariance between input and output noise) in the preprocessing step, without any user interaction. Next, these are used as a weighting in the weighted least squares cost function, which leads to consistent estimates provided that the plant model is flexible enough to capture the true plant
Pintelon R, Schoukens J, Pauwels L et al (2005) Diffusion systems: stability, modeling, and identification. IEEE Trans Instrum Meas 54:2061–2067
Schoukens J, Pintelon R, Van hamme H (1994) Identification of linear dynamic systems using piecewise-constant excitations – use, misuse and alternatives. Automatica 30:1153–1169
Schoukens J, Pintelon R, Vandersteen G et al (1997) Frequency-domain system identification using nonparametric noise models estimated from a small number of data sets. Automatica 33:1073–1086
Schoukens J, Pintelon R, Rolain Y (1999) Study of conditional ML estimators in time and frequency-domain system identification. Automatica 35:91–100
Söderström T (2012) System identification for the errors-in-variables problem. Trans Inst Meas Control 34:780–792
Söderström T, Stoica P (1989) System identification. Prentice-Hall, Englewood Cliffs
Söderström T, Hong M, Schoukens J et al (2010) Accuracy analysis of time domain maximum likelihood method and sample maximum likelihood method for errors-in-variables and output error identification. Automatica 46:721–727

Frequency-Response and Frequency-Domain Models

Abbas Emami-Naeini and J. David Powell
Stanford University, Stanford, CA, USA

Abstract

A major advantage of using frequency response is the ease with which experimental information can be used for design purposes. Raw measurements of the output amplitude and phase of a plant undergoing a sinusoidal input excitation are sufficient to design a suitable feedback control. No intermediate processing of the data (such as finding poles and zeros or determining system matrices) is required to arrive at the system model. The wide availability of computers has rendered this advantage less important now than it was years ago; however, for relatively simple systems, frequency response is often still the most cost-effective design method. The method is most effective for systems that are stable in open loop. Yet another advantage is that it is the easiest method to use for designing dynamic compensation.

Keywords

Bandwidth; Bode plot; Frequency response; Gain margin (GM); Magnitude; Phase; Phase margin (PM); Resonant peak; Stability

Introduction: Frequency Response

A very common way to use the exponential response of linear time-invariant systems (LTIs) is in finding the frequency response, or response to a sinusoid. First we express the sinusoid as a sum of two exponential expressions (Euler's relation):

A cos(ωt) = (A/2)(e^{jωt} + e^{−jωt}).   (1)

Suppose we have an LTI system with input u and output y. If we let s = jω in the transfer function G(s), then the response to u(t) = e^{jωt} is y(t) = G(jω)e^{jωt}; similarly, the response to u(t) = e^{−jωt} is G(−jω)e^{−jωt}. By superposition, the response to the sum of these two exponentials, which make up the cosine signal, is the sum of the responses:

y(t) = (A/2)[G(jω)e^{jωt} + G(−jω)e^{−jωt}].   (2)

The transfer function G(jω) is a complex number that can be represented in polar form or in magnitude-and-phase form as G(jω) = M(ω)e^{jφ(ω)}, or simply G = M e^{jφ}. With this substitution, Eq. (2) becomes, for a specific input frequency ω = ω₀,

y(t) = (A/2) M [e^{j(ω₀t+φ)} + e^{−j(ω₀t+φ)}] = A M cos(ω₀t + φ),   (3)

M = |G(jω₀)| = |G(s)|_{s=jω₀} = √({Re[G(jω₀)]}² + {Im[G(jω₀)]}²),

φ = ∠G(jω₀) = tan⁻¹(Im[G(jω₀)]/Re[G(jω₀)]).   (4)

This means that if an LTI system represented by the transfer function G(s) has a sinusoidal input with magnitude A, the output will be sinusoidal at the same frequency with magnitude AM and will be shifted in phase by the angle φ. M is usually referred to as the amplitude ratio or magnitude, and φ is referred to as the phase; they are both functions of the input frequency ω. The frequency response can be measured experimentally quite easily in the laboratory by driving the system with a known sinusoidal input, letting the transient response die out, and measuring the steady-state amplitude and phase of the system's output as shown in Fig. 1. The input frequency is set to sufficiently many values so that curves such as the one in Fig. 2 are obtained. Bode suggested that we plot log|M| vs. log ω and φ(ω) vs. log ω to best show the essential features of G(jω). Hence, such plots are referred to as Bode plots. Bode plotting techniques are discussed in Franklin et al. (2015).

We are interested in analyzing the frequency response not only because it will help us understand how a system responds to a sinusoidal input, but also because evaluating G(s) with s taking on values along the jω axis will prove to be very useful in determining the stability of a closed-loop system. Since the jω axis is the boundary between stability and instability, evaluating G(jω) provides information that allows us to determine closed-loop stability from the open-loop G(s).

For the second-order system

G(s) = 1/[(s/ω_n)² + 2ζ(s/ω_n) + 1],   (5)

the Bode plot is shown in Fig. 3 for various values of ζ.

A natural specification for system performance in terms of frequency response is the bandwidth, defined to be the maximum frequency at which the output of a system will track an input sinusoid in a satisfactory manner. By convention, for the system shown in Fig. 4 with a sinusoidal input r, the bandwidth is the frequency of r at which the output y is attenuated to a factor of 0.707 times the input. (If the output is a voltage across a 1-Ω resistor, the power is v², and when |v| = 0.707, the power is reduced by a factor of 2. By convention, this is called the half-power point.) Figure 5 depicts the idea graphically for the frequency response of the closed-loop transfer function

Y(s)/R(s) = T(s) = KG(s)/(1 + KG(s)).   (6)
Frequency-Response and Frequency-Domain Models, Fig. 1 Response of G(s) = 1/(s+1) to the input u = sin 10t (Source: Franklin et al. (2010), p. 298, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
Frequency-Response and Frequency-Domain Models, Fig. 2 Frequency response for G(s) = 1/(s+1)
The plot is typical of most closed-loop systems in that (1) the output follows the input (|T| ≅ 1) at the lower excitation frequencies and (2) the output ceases to follow the input (|T| < 1) at the higher excitation frequencies. The maximum value of the frequency-response magnitude is referred to as the resonant peak M_r.

Bandwidth is a measure of speed of response and is therefore similar to time-domain measures such as rise time and peak time or the s-plane measure of dominant-root(s) natural frequency. In fact, if the KG(s) in Fig. 4 is such that the closed-loop response is given by Fig. 3a, we can see that the bandwidth will equal the natural frequency of the closed-loop root (that is, ω_BW = ω_n for a closed-loop damping ratio of ζ = 0.7). For other damping ratios, the bandwidth is approximately equal to the natural frequency of the closed-loop roots, with an error typically less than a factor of 2.

For a second-order system, the time responses are functions of the pole-location parameters ζ and ω_n. If we consider the curve for ζ = 0.5 to be an average, the rise time t_r from y = 0.1 to y = 0.9 is approximately ω_n t_r = 1.8. Thus, we can say that

t_r ≅ 1.8/ω_n.   (7)

Although this relationship could be embellished by including the effect of the damping ratio, it is important to keep in mind how Eq. (7) is typically used. It is accurate only for a second-order system with no zeros; for all other systems it is a rough approximation to the relationship between t_r and ω_n. Most systems being analyzed for control systems design are more complicated than the pure second-order system, so designers use Eq. (7) with the knowledge that it is a rough approximation only. Hence, for a second-order system the bandwidth is inversely proportional to the rise time t_r, and we are able to link the time and frequency domain quantities in this way.
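The bandwidth definition and the rough rule of Eq. (7) can be checked numerically for a standard second-order closed loop; a sketch with SciPy, where ω_n = 2 rad/s and ζ = 0.5 are illustrative choices:

```python
import numpy as np
from scipy import signal

# Standard second-order closed loop: T(s) = wn^2 / (s^2 + 2*zeta*wn*s + wn^2)
wn, zeta = 2.0, 0.5
T = ([wn**2], [1.0, 2 * zeta * wn, wn**2])

# Bandwidth: the last frequency where |T(jw)| >= 0.707 (the half-power point)
w = np.linspace(0.01, 10 * wn, 20000)
_, H = signal.freqresp(T, w=w)
w_bw = w[np.abs(H) >= 0.707][-1]

# 10-90% rise time from the step response, to compare with tr ~ 1.8/wn (Eq. 7)
tgrid = np.linspace(0, 10, 20000)
_, ystep = signal.step(T, T=tgrid)
t10 = tgrid[ystep >= 0.1][0]
t90 = tgrid[ystep >= 0.9][0]
tr = t90 - t10
```

The computed bandwidth stays within the stated factor of 2 of ω_n, and ω_n·t_r comes out near (though not exactly at) 1.8, consistent with Eq. (7) being a rough approximation.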
Frequency-Response and Frequency-Domain Models, Fig. 3 Frequency responses of standard second-order systems: (a) magnitude, (b) phase (Source: Franklin et al. (2010), p. 303, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)

The definition of the bandwidth stated here is meaningful for systems that have a low-pass filter behavior, as is the case for any physical control system. In other applications the bandwidth may be defined differently. Also, if the ideal model of the system does not have a high-frequency roll-off (e.g., if it has an equal number of poles and zeros), the bandwidth is infinite; however, this does not occur in nature, as nothing responds well at infinite frequencies.

In many cases, the designer's primary concern is the error in the system due to disturbances rather than the ability to track an input. For error analysis, we are more interested in the sensitivity function S(s) = 1 − T(s) than in T(s). For most open-loop systems with high gain at low frequencies, S(s) for a disturbance input has very low values at low frequencies and grows as the frequency of the input or disturbance approaches the bandwidth. For analysis of either T(s) or S(s), it is typical to plot their response versus the frequency of the input.
Frequency-Response and Frequency-Domain Models 483
the computer or can be quickly sketched for design process as well as to perform sanity checks
simple systems using the efficient methods de- on the computer output.
scribed in Franklin et al. (2015). The methods
described next are also useful to expedite the
Neutral Stability: Gain and Phase
Margins
Frequency-Response
and Frequency-Domain
Models, Fig. 5
Definitions of bandwidth
and resonant peak (Source:
Franklin et al. (2010),
p. 304, reprinted by
permission of Pearson
Education, Inc., Upper
Saddle River, NJ)
denominator in factored form (because the factors give the system roots directly) to observe whether the real parts are positive or negative. However, the closed-loop transfer function is usually not known. In fact, the whole purpose behind understanding the root-locus technique is to be able to find the factors of the denominator of the closed-loop transfer function, given only the open-loop transfer function. Another way to determine closed-loop stability is to evaluate the frequency response of the open-loop transfer function KG(jω) and then perform a test on that response. Note that this method also does not require factoring the denominator of the closed-loop transfer function. In this section we will explain the principles of this method.

Suppose we have a system defined by Fig. 6a and whose root locus behaves as shown in Fig. 6b; that is, instability results if K is larger than 2. The neutrally stable points lie on the imaginary axis – that is, where K = 2 and s = ±j1.0. Furthermore, all points on the root locus have the property that

|KG(s)| = 1 and ∠G(s) = −180°.

At the point of neutral stability we see that these root-locus conditions hold for s = jω, so

|KG(jω)| = 1 and ∠G(jω) = −180°.   (8)

Thus, a Bode plot of a system that is neutrally stable (that is, with K defined such that a closed-loop root falls on the imaginary axis) will satisfy the conditions of Eq. (8). Figure 7 shows the frequency response for the system whose root locus is plotted in Fig. 6 for various values of K. The magnitude response corresponding to K = 2 passes through 1 at the same frequency (ω = 1 rad/s) at which the phase passes through −180°, as predicted by Eq. (8).

Having determined the point of neutral stability, we turn to a key question: Does increasing the gain increase or decrease the system's stability? We can see from the root locus in Fig. 6b that any value of K less than the value at the neutrally stable point will result in a stable system. At the frequency ω where the phase ∠G(jω) = −180° (ω = 1 rad/s), the magnitude |KG(jω)| < 1.0 for stable values of K and > 1 for unstable values of K. Therefore, we have the following trial stability condition, based on the character of the open-loop frequency response:

|KG(jω)| < 1 at ∠G(jω) = −180°.   (9)

This stability criterion holds for all systems for which increasing gain leads to instability and |KG(jω)| crosses the magnitude (= 1) once, the most common situation. However, there are systems for which an increasing gain can lead from instability to stability; in this case, the stability condition is

|KG(jω)| > 1 at ∠G(jω) = −180°.   (10)

Based on the above ideas, we can now define the robustness metrics gain and phase margins:

Phase Margin: Suppose at ω1, |G(jω1)| = 1/K. How much more phase could the system tolerate (as a time delay, perhaps) before reaching the stability boundary? The answer to this question follows from Eq. (8), i.e., the phase margin (PM) is defined as

PM = ∠G(jω1) − (−180°).   (11)

Gain Margin: Suppose at ω2, ∠G(jω2) = −180°. How much more gain could the system tolerate (as an amplifier, perhaps) before reaching the stability boundary? The answer to this question follows from Eq. (9), i.e., the gain margin (GM) is defined as

GM = 1 / (K |G(jω2)|).   (12)
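The margin definitions in Eqs. (11) and (12) can be read directly off a computed frequency response. A minimal sketch, assuming G(s) = 1/(s(s+1)²) — a hypothetical choice that reproduces the numbers quoted above (neutral stability at K = 2, ω = 1 rad/s):

```python
import numpy as np

def margins(K, w):
    """Gain and phase margins of K*G(jw) for the assumed G(s) = 1/(s(s+1)^2)."""
    s = 1j * w
    KG = K / (s * (s + 1) ** 2)
    mag = np.abs(KG)
    phase = np.unwrap(np.angle(KG))        # radians, starts near -pi/2
    i = np.argmin(np.abs(phase + np.pi))   # phase crossover: angle = -180 deg
    gm = 1.0 / mag[i]                      # Eq. (12)
    j = np.argmin(np.abs(mag - 1.0))       # gain crossover: |KG| = 1
    pm = np.degrees(phase[j] + np.pi)      # Eq. (11), in degrees
    return gm, pm

w = np.logspace(-2, 2, 100001)
gm, pm = margins(1.0, w)
print(gm, pm)   # GM ≈ 2 (gain can double before neutral stability), PM ≈ 21 deg
```

Doubling K to the neutrally stable value of 2 would drive both margins to zero, matching the root-locus picture.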
Frequency-Response and Frequency-Domain Models, Fig. 7 Stability example: (a) system definition; (b) root locus (Source: Franklin et al. (2010), p. 319, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
There are also rare cases when |KG(jω)| crosses the magnitude (= 1) more than once, or where an increasing gain can lead from instability to stability. A rigorous way to resolve these situations is to use the Nyquist stability criterion as discussed in Franklin et al. (2015).

where ωc is the crossover frequency. The closed-loop frequency-response magnitude is approximated by

Frequency-Response and Frequency-Domain Models, Fig. 8 Closed-loop bandwidth with respect to PM (Source: Franklin et al. (2010), p. 347, reprinted by permission of Pearson Education, Inc., Upper Saddle River, NJ)
logarithmic sensitivity integrals, known as the Bode integrals. Bode's work has had a lasting impact on the theory and practice of control and has inspired continued research effort, most recently leading to a variety of extensions and new results which seek to quantify design constraints and performance limitations by logarithmic integrals of Bode and Poisson type. On the other hand, the search for the best achievable performance is a natural goal in optimal control problems, which has led to bounds on optimal performance indices defined under various criteria. Likewise, the latter developments have also been substantial and are continuing to branch to different problems and different system categories.

In this entry we attempt to provide a summary overview of the key developments in the study of fundamental limitations of feedback control. While the understanding on this subject has been compelling and the results are rather prolific, we focus on Bode-type integral relations and the best achievable performance limits, two branches of the study that are believed to be most well-developed. Roughly speaking, the Bode-type integrals are most useful for quantifying the inherent design constraints and tradeoffs in the frequency domain, while the performance results provide fundamental limits of canonical control objectives defined using frequency- and time-domain criteria. Invariably, the two sets of results are intimately related and reinforce each other. The essential message then is that despite its many benefits, feedback has its own limitations and is subject to various constraints. Feedback design, for that sake, often requires a hard tradeoff.

Control Design Specifications

singular value of a matrix A will be written as σ̄(A). If A is a Hermitian matrix, we denote by λ̄(A) its largest eigenvalue. For any unitary vectors u, v ∈ C^n, we denote by ∠(u, v) the principal angle between the two one-dimensional subspaces, called the directions, spanned by u and v:

cos ∠(u, v) := |u^H v|.

For a stable continuous-time system with transfer function matrix G(s), we define its H∞ norm by

‖G‖∞ := sup_{Re(s)>0} σ̄(G(s)).

We consider the standard configuration of finite-dimensional linear time-invariant (LTI) feedback control systems given in Fig. 1. In this setup, P and K represent the transfer functions of the plant model and controller, respectively, r is a command signal, d a disturbance, n a noise signal, and y the output response. Define the open-loop transfer function, the sensitivity function, and the complementary sensitivity function by

L = PK,  S = (I + L)^−1,  T = L(I + L)^−1,

respectively. Then the output can be expressed as

y = Sd − Tn + SPr.
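For a stable G, the supremum defining ‖G‖∞ is attained on the imaginary axis, so the norm can be estimated by sweeping the largest singular value over frequency. A minimal sketch with a hypothetical 2×2 stable transfer matrix (not the entry's plant):

```python
import numpy as np

def G(w):
    # Hypothetical stable 2x2 transfer matrix, evaluated at s = jw.
    s = 1j * w
    return np.array([[2 / (s + 1), 1 / (s + 1)],
                     [0 * s,       1 / (s + 2)]])

# Every entry's magnitude peaks at w = 0 here, so the peak of the largest
# singular value is also at w = 0: sigma_max of [[2, 1], [0, 0.5]].
ws = np.concatenate(([0.0], np.logspace(-3, 3, 2000)))
hinf = max(np.linalg.svd(G(w), compute_uv=False)[0] for w in ws)
print(hinf)   # ≈ 2.2477, the largest singular value of G(0)
```

For less benign plants the sweep must be fine enough not to miss a resonant peak; dedicated state-space algorithms avoid gridding altogether.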
The goal of feedback control design is to design a controller K so that the closed-loop system is stable and that it achieves certain performance specifications. Typical design objectives include:
• Disturbance attenuation. The effect of the disturbance signal on the output should be kept small, which translates into the require-

a very negative phase, driving the phase closer to the negative 180 degree, namely, the critical point of stability. It consequently reduces the phase margin and may even cause instability. As a result, the gain-phase relationship demonstrates a conflict between the two design objectives. It is safe to claim that much of the classical feedback design theory came as a consequence of this simple relationship, aiming to shape the open-loop frequency response in the crossover region by trial and error, using lead or lag filters.

While in using the gain-phase formula the design specifications imposed on closed-loop transfer functions are translated approximately into the requirements on the open-loop transfer function, and the tradeoff between different design goals is achieved by shaping the open-loop gain and phase, a more direct vehicle to accomplish this same goal is Bode's sensitivity integral (Bode 1945).

Bode Sensitivity Integrals Let pi ∈ C+ be the unstable poles and zi ∈ C+ the nonminimum phase zeros of L(s). Suppose that the closed-loop system in Fig. 1 is stable.
(i) If L(s) has relative degree greater than one, then

∫₀^∞ log |S(jω)| dω = π Σi pi.

(ii) If L(s) contains no less than two integrators, then

∫₀^∞ (log |T(jω)| / ω²) dω = π Σi 1/zi.

In both cases the logarithmic sensitivity must abide some kind of conservation law, or invariance property: the integral of the logarithmic sensitivity magnitude over the entire frequency range must be a nonnegative constant, determined by the open-loop unstable poles. This property mandates a tradeoff between sensitivity reduction and sensitivity amplification in different frequency bands. Indeed, to achieve disturbance attenuation, the logarithmic sensitivity magnitude must stay below zero dB, the lower the better. For noise reduction and robustness, however, its tail has to roll off sufficiently fast to zero dB at high frequencies. Since, in light of the integral relation, the total area under the logarithmic magnitude curve is nonnegative, the logarithmic magnitude must rise above zero dB, so that under its curve the positive and negative areas may cancel each other to yield a nonnegative value. As such, an undesirable sensitivity amplification occurs, resulting in a fundamental tradeoff between the desirable sensitivity reduction and the undesirable sensitivity amplification, known colloquially as the waterbed effect.

MIMO Integral Relations

For a multi-input multi-output (MIMO) system depicted in Fig. 1, the sensitivity and complementary sensitivity functions, which now are transfer function matrices, satisfy similar interpolation constraints: at each RHP pole pi and zero zi of L(s), the equations

S(pi)ηi = 0,  T(pi)ηi = ηi,
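The area formula in (i) can be verified numerically. A hedged sketch, assuming a hypothetical open loop L(s) = 4/((s − 1)(s + 2)) — relative degree two, a single unstable pole at p = 1, stabilized by unity feedback — so the predicted area is π · 1:

```python
import numpy as np

# Hypothetical L(s) = 4/((s-1)(s+2)); the closed-loop poles solve
# s^2 + s + 2 = 0, so the loop is stable, and Bode's integral predicts
# that the area under log|S| equals pi * (sum of RHP open-loop poles) = pi.
w = np.linspace(0.0, 1000.0, 2_000_001)
s = 1j * w
S = (s - 1) * (s + 2) / ((s - 1) * (s + 2) + 4)   # S = 1/(1 + L)
y = np.log(np.abs(S))
area = np.sum((y[1:] + y[:-1]) * 0.5 * (w[1] - w[0]))  # trapezoidal rule
print(area, np.pi)   # area ≈ pi; the tail beyond w = 1000 contributes ~4e-3
```

The positive area (|S| > 1) that must appear somewhere to balance the low-frequency dip is exactly the waterbed effect discussed above.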
system, the measure of frequency response magnitude is now the largest singular value of a transfer function matrix, which represents the worst-case amplification of energy-bounded signals, a direct counterpart to the gain of a scalar transfer function. This fact alone proves to cast a fundamental difference and poses a formidable obstacle. From a technical standpoint, the logarithmic function of the largest singular value is no longer a harmonic function as in the SISO case, but only a subharmonic function. Much to our regret then, familiar tools found from analytic function theory, such as Cauchy and Poisson theorems, the very backbone in developing Bode integrals, cease to be applicable. Nevertheless, it remains possible to extend Bode integrals in their essential spirit. Advances are made by Chen (1995, 1998, 2000).

MIMO Bode Sensitivity Integrals Let pi ∈ C+ be the unstable poles of L(s) and zi ∈ C+ the nonminimum phase zeros of L(s). Suppose that the closed-loop system in Fig. 1 is stable.
(i) If L(s) has relative degree greater than one, then

∫₀^∞ log σ̄(S(jω)) dω ≥ π λ̄(Σi pi ηi ηi^H).

(ii) If L(s) contains no less than two integrators, then

∫₀^∞ (log σ̄(T(jω)) / ω²) dω ≥ π λ̄(Σi (1/zi) wi wi^H),

where ηi and wi are some unitary vectors related to the right pole direction vectors associated with pi and the left zero direction vectors associated with zi, respectively.

From these extensions, it is evident that the same limitations and tradeoffs on the sensitivity and complementary sensitivity functions carry over to MIMO systems; in fact, both integrals reduce to the Bode integrals when specialized to SISO systems. Yet there is something additional and unique to MIMO systems: the integrals now depend on not only the locations but also the directions of the zeros and poles. In particular, it can be shown that they depend on the mutual orientation of these directions, and the dependence can be explicitly characterized geometrically by the principal angles between the directions. This new phenomenon, which finds no analog in SISO systems, thus highlights the important role of directionality in sensitivity tradeoff and, more generally, in the design of MIMO systems.

A more sophisticated and, accordingly, more informative variant of Bode integrals is the Poisson integral for sensitivity and complementary sensitivity functions (Freudenberg and Looze 1985), which can be used to provide quantitative estimates of the waterbed effect. MIMO versions of Poisson integrals are also available (Chen 1995, 2000).

Frequency-Domain Performance Bounds

Performance bounds complement the integral relations and provide fundamental thresholds to the best possible performance ever attainable. Such bounds are useful in providing benchmarks for evaluating a system's performance prior to and after controller design. In the frequency domain, fundamental limits can be specified as the minimal peak magnitude of the sensitivity and complementary sensitivity functions achievable by feedback, or formally, the minimal achievable H∞ norms:

γ_min^S := inf { ‖S(s)‖∞ : K(s) stabilizes P(s) },
γ_min^T := inf { ‖T(s)‖∞ : K(s) stabilizes P(s) }.

Drawing upon Nevanlinna-Pick interpolation theory for analytic functions, one can obtain exact performance limits under rather general circumstances (Chen 2000).
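The interpolation constraints quoted in the MIMO discussion above (S(pi)ηi = 0 and T(pi)ηi = ηi at an RHP open-loop pole) can be checked on a toy system. A sketch assuming a hypothetical diagonal L(s) whose (1,1) entry carries the unstable pole p = 1, so the pole direction is η = (1, 0)ᵀ:

```python
import numpy as np

P_POLE = 1.0  # RHP pole of the hypothetical loop entry L11(s) = 4/((s-1)(s+2))

def S(s):
    # Sensitivity (I + L)^-1 in closed form for the diagonal L; both
    # closed-loop channels are stable.
    s11 = (s - 1) * (s + 2) / ((s - 1) * (s + 2) + 4)   # 1/(1 + L11)
    s22 = (s + 2) / (s + 3)                             # 1/(1 + L22), L22 = 1/(s+2)
    return np.array([[s11, 0.0], [0.0, s22]])

eta = np.array([1.0, 0.0])                        # pole direction for p = 1
Sp = S(P_POLE)
assert np.allclose(Sp @ eta, 0.0)                 # S(p) eta = 0
assert np.allclose((np.eye(2) - Sp) @ eta, eta)   # T(p) eta = eta
```

These pointwise constraints are exactly what the Nevanlinna-Pick machinery mentioned above interpolates against when computing the achievable H∞ peaks.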
492 Fundamental Limitation of Feedback Control
of P .s/ with right direction vectors i , where zi the output z to track a given reference input r,
and pi are all distinct. Then, based on the feedforward of the reference signal
r and the feedback of the measured output y.
r
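The formal statement of the H∞ performance limits is truncated in the source at this point. As a hedged illustration of the kind of result meant, a classical SISO special case from the fundamental-limitations literature (assumed here, not quoted from this entry) says that a plant with one real RHP zero z and one real RHP pole p forces the best achievable sensitivity peak to be at least (z + p)/|z − p|, which blows up as the zero approaches the pole:

```python
def s_peak_lower_bound(z, p):
    # Classical SISO bound for a real RHP zero z and real RHP pole p
    # (an assumed textbook special case, not this entry's general theorem).
    return (z + p) / abs(z - p)

print(s_peak_lower_bound(4.0, 1.0))   # 5/3: zero well separated from the pole
print(s_peak_lower_bound(1.2, 1.0))   # 11: near pole-zero cancellation
```

This captures the qualitative message of the section: RHP zeros and poles jointly, and especially in proximity, force large sensitivity peaks regardless of the controller.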
is LTI, but allow K to be any causal, stabilizing controller. Evidently, for a stable P, the problem is trivial; the response will restore itself to the origin and thus no energy is required. But what if the system is unstable? How much energy must the controller generate to combat the disturbance? What is the smallest amount of energy required? These questions are answered by the best achievable limits of the tracking and regulation performance (Chen et al. 2000, 2003).

Tracking and Regulation Performance Limits Let pi ∈ C+ and zi ∈ C+ be the RHP poles and zeros of P(s), respectively. Then,

inf { E : K stabilizes P(s) } = Σi pi cos² ∠(ηi, v),
inf { J : K stabilizes P(s) } = Σi (1/zi) cos² ∠(wi, v),

where ηi and wi are some unitary vectors related to the right pole direction vectors associated with pi and the left zero direction vectors associated with zi, respectively.

It becomes instantly clear that the optimal performance depends on both the pole/zero locations and their directions. In particular, it depends on the mutual orientation between the input and pole/zero directions. This sheds some interesting light. Take the tracking performance for an example. For a SISO system, the minimal tracking error can never be made zero for a nonminimum phase plant; in other words, perfect tracking can never be achieved. Yet this is possible for MIMO systems, when the input and zero directions are appropriately aligned, specifically when they are orthogonal. Interestingly, the optimal performance in both cases can be achieved by LTI controllers, though the controllers are allowed to be more general. As a result, the results herein provide the true fundamental limits that cannot be further improved, in spite of using any other more general forms such as nonlinear, time-varying feedforward and feedback. It is simply the best one can ever hope for, and the LTI controllers turn out to be optimal.

Summary and Future Directions

Whether in time or frequency domain, while the results presented herein may differ in forms and contexts, they unequivocally point to the fact that inherent constraints exist in feedback design, and fundamental limitations will necessarily arise, limiting the performance achievable regardless of controller design. Such constraints and limitations are especially exacerbated by the nonminimum phase zeros and unstable poles in the system. Understanding of these constraints and limitations proves essential to the success of control design.

For both its intrinsic appeal and fundamental implications, the study of fundamental control limitations will continue to be a topic of enduring vitality and indeed will prove timeless. Challenges are especially daunting and endeavor is called for, e.g., to incorporate information and communication constraints into control limitation studies, of which networked control and multi-agent systems serve as notable testimonies.

Cross-References

H-Infinity Control
H2 Optimal Control
Linear Quadratic Optimal Control

Bibliography

Astrom KJ (2000) Limitations on control system performance. Eur J Control 6:2–20
Bode HW (1945) Network analysis and feedback amplifier design. Van Nostrand, Princeton
Chen J (1995) Sensitivity integral relations and design tradeoffs in linear multivariable feedback systems. IEEE Trans Autom Control 40(10):1700–1716
Chen J (1998) Multivariable gain-phase and sensitivity integral relations and design tradeoffs. IEEE Trans Autom Control 43(3):373–385
Chen J (2000) Logarithmic integrals, interpolation bounds, and performance limitations in MIMO systems. IEEE Trans Autom Control 45:1098–1115
Chen J, Qiu L, Toker O (2000) Limitations on maximal tracking accuracy. IEEE Trans Autom Control 45(2):326–331
Chen J, Hara S, Chen G (2003) Best tracking and regulation performance under control energy constraint. IEEE Trans Autom Control 48(8):1320–1336
Freudenberg JS, Looze DP (1985) Right half plane zeros and poles and design tradeoffs in feedback systems. IEEE Trans Autom Control 30(6):555–565
Middleton RH (1991) Trade-offs in linear control system design. Automatica 27(2):281–292
Qiu L, Davison EJ (1993) Performance limitations of non-minimum phase systems in the servomechanism problem. Automatica 29(2):337–349
Seron MM, Braslavsky JH, Goodwin GC (1997) Fundamental limitations in filtering and control. Springer, London
as attackers, defenders, and users as well as their interaction and incentives. Hence, it facilitates making decisions on the best courses of action in addressing security problems while taking into account resource limitations, underlying incentive mechanisms, and security risks. Thus, security games and associated quantitative models have started to replace the prevalent ad hoc decision processes in a wide variety of security problems from safeguarding critical infrastructure to risk management, trust, and privacy in networked systems.

Game theory for security is a young and active research area as evidenced by the recently initiated conference series "Conference on Decision and Game Theory for Security" (www.gamesec-conf.org), the increasing number of journal and conference articles, as well as the recently published books (Alpcan and Başar 2011; Buttyan and Hubaux 2008; Tambe 2011).

Cross-References

Dynamic Noncooperative Games
Evolutionary Games
Learning in Games
Network Games
Stochastic Games and Learning
Strategic Form Games and Nash Equilibrium

Bibliography

Alpcan T, Başar T (2011) Network security: a decision and game theoretic approach. Cambridge University Press, Cambridge, UK. http://www.tansu.alpcan.org/book.php
Alpcan T, Buttyan L, Baras J (eds) (2010) Proceedings of the first international conference on decision and game theory for security, GameSec 2010, Berlin, 22–23 Nov 2010. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg. doi:10.1007/978-3-642-17197-0
Altman E, Başar T, Kavitha V (2010) Adversarial control in a delay tolerant network. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 87–106. doi:10.1007/978-3-642-17197-0_6
Baras J, Katz J, Altman E (eds) (2011) Proceedings of the second international conference on decision and game theory for security, GameSec 2011, College Park, 14–15 Nov 2011. Lecture notes in computer science, vol 7037. Springer, Berlin/Heidelberg. doi:10.1007/978-3-642-25280-8
Buttyan L, Hubaux JP (2008) Security and cooperation in wireless networks. Cambridge University Press, Cambridge. http://secowinet.epfl.ch
Chorppath AK, Alpcan T (2011) Adversarial behavior in network mechanism design. In: Proceedings of the 4th international workshop on game theory in communication networks (Gamecomm), ENS, Cachan
Grossklags J, Walrand JC (eds) (2012) Proceedings of the third international conference on decision and game theory for security, GameSec 2012, Budapest, 5–6 Nov 2012. Lecture notes in computer science, vol 7638. Springer
Gueye A, Walrand J, Anantharam V (2010) Design of network topology in an adversarial environment. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 1–20. doi:10.1007/978-3-642-17197-0_1
Guikema SD (2009) Game theory models of intelligent actors in reliability analysis: an overview of the state of the art. In: Bier VM, Azaiez MN (eds) Game theoretic risk analysis of security threats. Springer, New York, pp 1–19. doi:10.1007/978-0-387-87767-9
Kantarcioglu M, Bensoussan A, Hoe S (2011) Investment in privacy-preserving technologies under uncertainty. In: Baras J, Katz J, Altman E (eds) Decision and game theory for security. Lecture notes in computer science, vol 7037. Springer, Berlin/Heidelberg, pp 219–238. doi:10.1007/978-3-642-25280-8_17
Kashyap A, Başar T, Srikant R (2004) Correlated jamming on MIMO Gaussian fading channels. IEEE Trans Inf Theory 50(9):2119–2123. doi:10.1109/TIT.2004.833358
Kodialam M, Lakshman TV (2003) Detecting network intrusions via sampling: a game theoretic approach. In: Proceedings of 22nd IEEE conference on computer communications (Infocom), San Francisco, vol 3, pp 1880–1889
Law YW, Alpcan T, Palaniswami M (2012) Security games for voltage control in smart grid. In: 50th annual Allerton conference on communication, control, and computing (Allerton 2012), Monticello, IL, USA, pp 212–219. doi:10.1109/Allerton.2012.6483220
Manshaei MH, Zhu Q, Alpcan T, Başar T, Hubaux JP (2013) Game theory meets network security and privacy. ACM Comput Surv 45(3):25:1–25:39. doi:10.1145/2480741.2480742
Miura-Ko RA, Yolken B, Bambos N, Mitchell J (2008) Security investment games of interdependent organizations. In: 46th annual Allerton conference, Monticello, IL, USA
Mounzer J, Alpcan T, Bambos N (2010) Dynamic control and mitigation of interdependent IT security risks. In: Proceedings of the IEEE conference on communication (ICC), Cape Town. IEEE Communications Society
Roth A (2008) The price of malice in linear congestion games. In: WINE '08: proceedings of the 4th international workshop on internet and network economics, Shanghai, pp 118–125
Tambe M (2011) Security and game theory: algorithms, deployed systems, lessons learned. Cambridge University Press, New York, NY, USA
Zander J (1990) Jamming games in slotted Aloha packet radio networks. In: IEEE military communications conference (MILCOM), Morleley, vol 2, pp 830–834. doi:10.1109/MILCOM.1990.117531
Zhang N, Yu W, Fu X, Das S (2010) gPath: a game-theoretic path selection algorithm to protect Tor's anonymity. In: Alpcan T, Buttyán L, Baras J (eds) Decision and game theory for security. Lecture notes in computer science, vol 6442. Springer, Berlin/Heidelberg, pp 58–71. doi:10.1007/978-3-642-17197-0_4

Game Theory: Historical Overview

Tamer Başar
Coordinated Science Laboratory, University of Illinois, Urbana, IL, USA

Abstract

This article provides an overview of the aspects of game theory that are covered in this Encyclopedia, which includes a broad spectrum of topics on static and dynamic game theory. It starts with a brief overview of game theory, identifying its basic ingredients, and continues with a brief historical account of the development and evolution of the field. It concludes by providing pointers to other articles in the Encyclopedia on game theory, and a list of references.

Keywords

Cooperation; Dynamic games; Evolutionary games; Game theory; Historical evolution of game theory; Nash equilibrium; Stackelberg equilibrium

What Is Game Theory?

Game theory deals with strategic interactions among multiple decision makers, called players (and in some context agents), with each player's preference ordering among multiple alternatives captured in an objective function for that player, which she either tries to maximize (in which case the objective function is a utility function or a benefit function) or minimize (in which case we refer to the objective function as a cost function or a loss function). For a nontrivial game, the objective function of a player depends on the choices (actions, or equivalently decision variables) of at least one other player, and generally of all the players, and hence a player cannot simply optimize her own objective function independent of the choices of the other players. This thus brings in a coupling among the actions of the players and binds them together in decision making even in a noncooperative environment. If the players are able to enter into a cooperative agreement so that the selection of actions or decisions is done collectively and with full trust, so that all players would benefit to the extent possible, then we would be in the realm of cooperative game theory, where issues such as bargaining and characterization of fair outcomes, coalition formation, and excess utility distribution are of relevance; an article in this Encyclopedia (by Haurie) discusses cooperation and cooperative outcomes in the context of dynamic games. Other aspects of cooperative game theory can be found in several standard texts on game theory, such as Owen (1995), Vorob'ev (1977), or Fudenberg and Tirole (1991). See also the 2009 survey article Saad et al. (2009), which emphasizes applications of cooperative game theory to communication networks.

If no cooperation is allowed among the players, then we are in the realm of noncooperative game theory, where first one has to introduce a satisfactory solution concept. Leaving aside for the moment the issue of how the players can
reach such a solution point, let us address the issue of what would be the minimum features one would expect to see there. To first order, such a solution point should have the property that if all players but one stay put, then the player who has the option of moving away from the solution point should not have any incentive to do so because she cannot improve her payoff. Note that we cannot allow two or more players to move collectively from the solution point, because such a collective move requires cooperation, which is not allowed in a noncooperative game. Such a solution point where none of the players can improve her payoff by a unilateral move is known as a noncooperative equilibrium or Nash equilibrium, named after John Nash, who introduced it and proved that it exists in finite games (i.e., games where each player has only a finite number of alternatives), over 60 years ago; see Nash (1950, 1951). This result and its various extensions for different frameworks as well as its computation (both off-line and online) are discussed in several articles in this Encyclopedia. Another noncooperative equilibrium solution concept is the Stackelberg equilibrium, introduced in von Stackelberg (1934), and predating the Nash equilibrium, where there is a hierarchy in decision making among the players, with some of the players, designated as leaders, having the ability to first announce their actions (and make a commitment to play them) and the remaining players, designated as followers, taking these actions as given in the process of computation of their noncooperative (Nash) equilibria (among themselves). Before announcing their actions, the leaders would of course anticipate these responses and determine their actions in a way that the final outcome will be most favorable to them (in terms of their objective functions). For a comprehensive treatment of Nash and Stackelberg equilibria for different classes of games, see Başar and Olsder (1999).

We say that a noncooperative game is nonzero-sum if the sum of the players' objective functions cannot be made zero after appropriate positive scaling and/or translation that do not depend on the players' decision variables. We say that a two-player game is zero-sum if the sum of the objective functions of the two players is zero or can be made zero by appropriate positive scaling and/or translation that do not depend on the decision variables of the players; hence, two-player zero-sum games can be viewed as a special subclass of two-player nonzero-sum games, and in this case the Nash equilibrium becomes the saddle-point equilibrium. A game is a finite game if each player has only a finite number of alternatives, that is, the players pick their actions out of finite sets (action sets); otherwise, the game is an infinite game. Finite games are also known as matrix games. An infinite game is said to be a continuous-kernel game if the action sets of the players are continua and the players' objective functions are continuous with respect to action variables of all players. A game is said to be deterministic if the players' actions uniquely determine the outcome, as captured in the objective functions, whereas if the objective function of at least one player depends on an additional variable (state of nature) with a probability distribution known to all players (or can be learned on line), then we have a stochastic game. A game is a complete information game if the description of the game (i.e., the players, the objective functions, and the underlying probability distributions (if stochastic)) is common information to all players; otherwise, we have an incomplete information game. We say that a game is static if players have access to only the a priori information (shared by all) and none of the players has access to information on the actions of any of the other players; otherwise, what we have is a dynamic game. A game is a single-act game if every player acts only once; otherwise, the game is multi-act. Note that it is possible for a single-act game to be dynamic and for a multi-act game to be static. A dynamic game is said to be a differential game if the evolution of the decision process (controlled by the players over time) takes place in continuous time and generally involves a differential equation; if it takes place over a discrete-time horizon, the dynamic game is sometimes called a discrete-time game.
In dynamic games, as the game progresses players acquire information (complete or partial) on past actions of other players and use this information in selecting their own actions (also dictated by the equilibrium solution concept at hand). In finite dynamic games, for example, the progression of a game involves a tree structure (also called extensive form) where each node is identified with a player along with the time when she acts, and branches emanating from a node show the possible moves of that particular player. A player, at any point in time, could generally be at more than one node, which is a situation that arises when the player does not have complete information on the past moves of other players and hence may not know with certainty which particular node she is at at any particular time. This uncertainty leads to a clustering of nodes into what is called information sets for that player. What players decide on within the framework of the extensive form is not their actions, but their strategies, that is, what action they would take at each information set (in other words, correspondences between their information sets and their allowable actions). They then take specific actions (or actions are executed on their behalf), dictated by the strategies chosen as well as the progression of the game (decision) process along the tree. The equilibrium is then defined in terms of not actions but strategies.

The notion of a strategy, as a mapping from the collection of information sets to action sets, extends readily to infinite dynamic games, and hence, in both differential games and difference games, Nash equilibria are defined in terms of strategies. Several articles in this Encyclopedia discuss such equilibria, for both zero-sum and nonzero-sum dynamic games, with and without the presence of probabilistic uncertainty.

In the broad scheme of things, game theory subsumes optimization and optimal control: one-player difference or differential games can be viewed as optimal control problems.

Highlights on the History and Evolution of Game Theory

Game theory has enjoyed over 70 years of scientific development, with the publication of the Theory of Games and Economic Behavior by von Neumann and Morgenstern (1947) generally acknowledged to kick-start the field. It has experienced incessant growth in both the number of theoretical results and the scope and variety of applications. As a recognition of the vitality of the field, through 2012 a total of 10 Nobel Prizes were given in Economic Sciences for work primarily in game theory, with the first such recognition bestowed in 1994 on John Harsanyi, John Nash, and Reinhard Selten "for their pioneering analysis of equilibria in the theory of noncooperative games." The second round of Nobel Prizes in game theory went to Robert Aumann and Thomas Schelling in 2005, "for having enhanced our understanding of conflict and cooperation through game-theory analysis." The third round recognized Leonid Hurwicz, Eric Maskin, and Roger Myerson in 2007, "for having laid the foundations of mechanism design theory." And the most recent one was in 2012, recognizing Alvin Roth and Lloyd Shapley, "for the theory of stable allocations and the practice of market design." To this list of highest-level awards related to contributions to game theory, one should also add the 1999 Crafoord Prize (which is the highest prize in Biological Sciences), which went to John Maynard Smith (along with Ernst Mayr and G. Williams)
and particularly noncooperative game theory “for developing the concept of evolutionary
can be viewed as an extension of two fields, biology,” where Smith’s recognized contributions
both covered in this Encyclopedia: Mathematical had a strong game-theoretic underpinning,
Programming and Optimal Control Theory. Any through his work on evolutionary games and
problem in game theory collapses to a problem evolutionary stable equilibrium (Smith 1974,
in one of these disciplines if there is only one 1982; Smith and Price 1973); this is the topic
player. One-player static games are essentially of one of the articles in this Encyclopedia
mathematical programming problems (linear (by Altman). Several other “game theory”
programming or nonlinear programming), and articles in the Encyclopedia also relate to the
contributions of the Nobel Laureates mentioned above.

Even though von Neumann and Morgenstern's 1944 book is taken as the starting point of the scientific approach to game theory, game-theoretic notions and some isolated key results date back to earlier years and even centuries. Sixteen years earlier, in 1928, von Neumann himself had completely resolved an open fundamental problem in zero-sum games, that every finite two-player zero-sum game admits a saddle point in mixed strategies, which is known as the Minimax Theorem (von Neumann 1928) – a result which Emile Borel had conjectured to be false eight years before. Some early traces of game-theoretic thinking can be seen in the 1802 work (Considérations sur la théorie mathématique du jeu) of André-Marie Ampère (1775–1836), who was influenced by the 1777 writings (Essai d'Arithmétique Morale) of Georges Louis Buffon (1707–1788).

Which event or writing really started the game-theoretic approach to decision making (in law, politics, economics, operations research, engineering, etc.) may be a topic of debate, but what is indisputable is that in (zero-sum) differential games (which are most relevant to control theory) the starting point was the work of Rufus Isaacs at the RAND Corporation in the early 1950s, which remained classified for at least a decade before being made accessible to a broad readership in 1965 (Isaacs 1965); see also the review (Ho 1965) which first introduced the book to the control community. One of the articles in this Encyclopedia (by Bernhard) talks about this history and the theory developed by Isaacs, within the context of pursuit-evasion games, and another article (again by Bernhard) discusses the impact the zero-sum differential game framework has made on robust control design (Başar and Bernhard 1995). Extension of the game-theoretic framework to nonzero-sum differential games with Nash equilibrium as the solution concept was initiated in Starr and Ho (1969), and with Stackelberg equilibrium as the solution concept in Simaan and Cruz (1973). Systematic study of the role information structures play in the existence of such equilibria and their uniqueness or nonuniqueness (termed informational nonuniqueness) was carried out in Başar (1974, 1976, 1977).

Related Articles on Game Theory in the Encyclopedia

Several articles in the Encyclopedia introduce various subareas of game theory and discuss important developments (past and present) in each corresponding area.

The article Strategic Form Games and Nash Equilibrium introduces the static game framework along with the Nash equilibrium concept, for both finite and infinite games, and discusses the issues of existence and uniqueness as well as efficiency. The article Dynamic Noncooperative Games focuses on dynamic games, again for both finite and infinite games, and discusses extensive form descriptions of the underlying dynamic decision process, either as trees (in finite games) or difference equations (in discrete-time infinite games). Bernhard, in two articles, discusses continuous-time dynamic games, described by differential equations (so-called differential games), but in the two-person zero-sum case. One of these articles, Pursuit-Evasion Games and Zero-Sum Two-Person Differential Games, describes the framework initiated by Isaacs, and several of its extensions for pursuit-evasion games; the other one, Linear Quadratic Zero-Sum Two-Person Differential Games, presents results on the special case of linear quadratic differential games, with an important application of that framework to robust control and more precisely H∞-optimal control.

When the number of players in a nonzero-sum game is countably infinite, or even just sufficiently large, some simplifications arise in the computation and characterization of Nash equilibria. The mathematical framework applicable to this context is provided by mean field theory, which is the topic of the article Mean Field Games, which discusses this relatively new theory within the context of stochastic differential games.
Bibliography

Başar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn. SIAM, Philadelphia. (First edition: Academic, London, 1982)
Fudenberg D, Tirole J (1991) Game theory. MIT, Cambridge
Ho Y-C (1965) Review of 'Differential Games' by R. Isaacs. IEEE Trans Autom Control AC-10(4):501–503
Isaacs R (1975) Differential games, 2nd edn. Kruger, New York. (First edition: Wiley, New York, 1965)
Nash JF Jr (1950) Equilibrium points in N-person games. Proc Natl Acad Sci 36(1):48–49
Nash JF Jr (1951) Non-cooperative games. Ann Math 54(2):286–295
Owen G (1995) Game theory, 3rd edn. Academic, New York
Saad W, Han Z, Debbah M, Hjorungnes A, Başar T (2009) Coalitional game theory for communication networks [a tutorial]. IEEE Signal Process Mag 26(5):77–97. Special issue on Game Theory
Simaan M, Cruz JB Jr (1973) On the Stackelberg strategy in nonzero sum games. J Optim Theory Appl 11:533–555
Smith JM (1974) The theory of games and the evolution of animal conflicts. J Theor Biol 47:209–221
Smith JM (1982) Evolution and the theory of games. Cambridge University Press, Cambridge
Smith JM, Price GR (1973) The logic of animal conflict. Nature 246:15–18
Starr AW, Ho Y-C (1969) Nonzero-sum differential games. J Optim Theory Appl 3:184–206
von Neumann J (1928) Zur Theorie der Gesellschaftsspiele. Mathematische Annalen 100:295–320
von Neumann J, Morgenstern O (1947) Theory of games and economic behavior, 2nd edn. Princeton University Press, Princeton. (First edition: 1944)
von Stackelberg H (1934) Marktform und Gleichgewicht. Springer, Vienna. (An English translation appeared in 1952 entitled "The theory of the market economy," published by Oxford University Press, Oxford)
Vorob'ev NH (1977) Game theory. Springer, Berlin

Generalized Finite-Horizon Linear-Quadratic Optimal Control

Augusto Ferrante¹ and Lorenzo Ntogramatzidis²
¹Dipartimento di Ingegneria dell'Informazione, Università di Padova, Padova, Italy
²Department of Mathematics and Statistics, Curtin University, Perth, WA, Australia

Abstract

The linear-quadratic (LQ) problem is the prototype of a large number of optimal control problems, including the fixed-endpoint, the point-to-point, and several H2/H∞ control problems, as well as the dual counterparts. In the past 50 years, these problems have been addressed using different techniques, each tailored to their specific structure. It is only in the last 10 years that it was recognized that a unifying framework is available. This framework hinges on formulae that parameterize the solutions of the Hamiltonian differential equation in the continuous-time case and the solutions of the extended symplectic system in the discrete-time case. Whereas traditional techniques involve the solutions of Riccati differential or difference equations, the formulae used here to solve the finite-horizon LQ control problem rely only on solutions of the algebraic Riccati equations. In this article, aspects of the framework are described within a discrete-time context.

Keywords

Cyclic boundary conditions; Discrete-time linear systems; Fixed end-point; Initial value; Point-to-point boundary conditions; Quadratic cost; Riccati equations

Introduction

Ever since the linear-quadratic (LQ) optimal control problem was introduced in the 1960s by Kalman in his pioneering paper (1960), it has found countless applications in areas such as chemical process control, aeronautics, robotics, servomechanisms, and motor control, to name but a few.

For details on the raisons d'être of LQ problems, readers are referred to the classical textbooks on this topic (Anderson and Moore 1971; Kwakernaak and Sivan 1972) and to the Special Issue on LQ optimal control problems in IEEE Trans. Aut. Contr., vol. AC-16, no. 6, 1971. The LQ regulator is not only important per se. It is also the prototype of a variety of fundamental optimization problems. Indeed, several optimal control problems that are extremely relevant in
practice can be recast into composite LQ, dual LQ, or generalized LQ problems. Examples include LQG, H2, and H∞ problems, and Kalman filtering problems. Moreover, LQ optimal control is intimately related, via matrix Riccati equations, to absolute stability, dissipative networks, and optimal filtering. The importance of LQ problems is not restricted to linear systems. For example, LQ control techniques can be used to modify an optimal control law in response to perturbations in the dynamics of a nonlinear plant. For these reasons, the LQ problem is universally regarded as a cornerstone of modern control theory.

In its simplest and most classical version, the finite-horizon discrete LQ optimal control problem can be stated as follows:

Problem 1 Let A ∈ R^{n×n} and B ∈ R^{n×m}, and consider the linear system

x_{t+1} = A x_t + B u_t,   y_t = C x_t + D u_t,   (1)

where the initial state x_0 ∈ R^n is given. Let W = W^T ∈ R^{n×n} be positive semidefinite. Find a sequence of inputs u_t, with t = 0, 1, ..., N−1, minimizing the cost function

J_{N,x_0}(u) ≝ Σ_{t=0}^{N−1} ‖y_t‖² + x_N^T W x_N.   (2)

Historically, LQ problems were first introduced and solved by Kalman in (1960). In this paper, Kalman showed that the LQ problem can be solved for any initial state x_0, and the optimal control can be written as a state feedback u(t) = K(t) x(t), where K(t) can be found by solving a famous quadratic matrix difference equation known as the Riccati equation. When W is no longer assumed to be positive semidefinite, the optimal solution may or may not exist. A complete analysis of this case has been worked out in Bilardi and Ferrante (2007). In the infinite-horizon case (i.e., when N is infinite), the optimal control (when it exists) is stationary and may be computed by solving an algebraic Riccati equation (Anderson and Moore 1971; Kwakernaak and Sivan 1972).

Since its introduction, the original formulation of the classic LQ optimal control problem has been generalized in several different directions, to accommodate the need of considering more general scenarios than the one represented by Problem 1. Examples include the so-called fixed-endpoint LQ problem, in which the extreme states are sharply assigned, and the point-to-point case, in which the initial and terminal values of an output of the system are constrained to be equal to specified values. This led to a number of contributions in the area where different adaptations of the Riccati theory were tailored to these diversified contexts of LQ optimal control. These variations of the classic LQ problem are becoming increasingly important due to their use in several applications of interest. Indeed, many applications, including spacecraft, aircraft, and chemical processes, involve maneuvering between two states during some phases of a typical mission. Another interesting example is the H2-optimization of transients in switching plants, where the problem can be divided into a set of finite-horizon LQ problems with welding conditions on the optimal arcs at each switch instant. This problem has been the object of a large number of contributions in the recent literature, under different names: parameter-varying systems, jump linear systems, switching systems, and bumpless systems are definitions extensively used to denote different classes of systems affected by sensible changes in their parameters or structures (Balas and Bokor 2004). In recent years, a new unified approach emerged in Ferrante et al. (2005), Ferrante and Ntogramatzidis (2005), Ferrante and Ntogramatzidis (2007a), and Ferrante and Ntogramatzidis (2007b) that solves the finite-horizon LQ optimal control problem via a formula which parameterizes the set of trajectories generated by the corresponding Hamiltonian differential equation in the continuous-time case and the extended symplectic difference equation in the discrete case. Loosely, we can say that the expressions parameterizing the trajectories of the Hamiltonian differential equation and the extended symplectic difference equation using this approach hinge on a pair of matrices: the solution of an algebraic Riccati equation and that of a related closed-loop Lyapunov equation.
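The classical route just described — iterating the Riccati difference equation backward from the terminal condition X_N = W and then applying the resulting feedback gains forward — can be sketched numerically as follows. This is a minimal illustration, not code from the article; the helper name and data are arbitrary, and Q = C^T C, S = C^T D, R = D^T D are the standard identifications induced by the cost (2):

```python
import numpy as np

def lq_finite_horizon(A, B, C, D, W, x0, N):
    """Backward Riccati recursion for the finite-horizon LQ problem.

    Minimizes sum_t ||y_t||^2 + x_N' W x_N subject to
    x_{t+1} = A x_t + B u_t,  y_t = C x_t + D u_t.
    Assumes R + B' X B is invertible at every step.
    """
    Q, S, R = C.T @ C, C.T @ D, D.T @ D
    X = W                                   # terminal condition X_N = W
    gains = [None] * N
    for t in range(N - 1, -1, -1):
        Rt = R + B.T @ X @ B
        Kt = -np.linalg.solve(Rt, S.T + B.T @ X @ A)   # feedback gain K_t
        X = Q + A.T @ X @ A + (S + A.T @ X @ B) @ Kt   # Riccati update
        gains[t] = Kt
    # Forward pass: apply u_t = K_t x_t from the given initial state.
    x, xs, us = x0, [x0], []
    for t in range(N):
        u = gains[t] @ x
        x = A @ x + B @ u
        us.append(u)
        xs.append(x)
    return xs, us
```

When W is not positive semidefinite, the recursion may break down, in line with the existence discussion above.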
M_{t+1} − [X_t 0; 0 0] = [S + A^T X_{t+1} B; R + B^T X_{t+1} B] (R + B^T X_{t+1} B)^{-1} [(S + A^T X_{t+1} B)^T   R + B^T X_{t+1} B] ≥ 0.

Hence, (5) takes the form

J_{N,x_0}(u) = Σ_{t=0}^{N−1} ‖(R + B^T X_{t+1} B)^{-1/2} (S^T + B^T X_{t+1} A) x_t + (R + B^T X_{t+1} B)^{1/2} u_t‖₂² + x_0^T X_0 x_0.   (7)

Now it is clear that u_t is optimal if and only if

(R + B^T X_{t+1} B)^{-1/2} (S^T + B^T X_{t+1} A) x_t + (R + B^T X_{t+1} B)^{1/2} u_t = 0,

whose solutions are parameterized by the feedback control

u_t = K_t x_t + G_t v_t,   (8)

where K_t ≝ −(R + B^T X_{t+1} B)^{-1} (S^T + B^T X_{t+1} A).

More General Linear-Quadratic Problems

The problem discussed in the previous section presents some limitations that prevent its applicability in several important situations. In particular, three relevant generalizations of the classical problem are:
1. The fixed-endpoint case, where the states at the endpoints x_0 and x_N are both assigned.
2. The point-to-point case, where the initial and terminal values z_0 and z_N of a linear combination z_t = V x_t of the state of the dynamical system described by (1) are constrained to be equal to two assigned vectors.
3. The cyclic case, where the states at the endpoints x_0 and x_N are not sharply assigned, but they are constrained to be equal (clearly, we can have combinations of (2) and (3)).
All these problems are special cases of a general LQ problem that can be stated as follows:

Problem 2 Consider the dynamical setting (1) of Problem 1. Find a sequence of inputs u_t, with t = 0, 1, ..., N−1, and an initial state x_0 minimizing the cost function

J_{N,x_0}(u) = Σ_{t=0}^{N−1} [x_t^T u_t^T] [Q S; S^T R] [x_t; u_t] + [x_0 − x̄_0; x_N − x̄_N]^T W [x_0 − x̄_0; x_N − x̄_N],   (11)

subject to the hard constraint

V [x_0; x_N] = v,   (12)

where Q, S, R are defined as in (3). Here W ≝ [W_11 W_12; W_12^T W_22] is a positive semidefinite matrix (partitioned in four blocks) that quadratically penalizes the differences between the initial state x_0 and a desired initial state x̄_0 and between the final state x_N and a desired final state x̄_N. (This general problem formulation can also encompass problems where the difference x_0 − x_N is not fixed but has to be quadratically penalized with a matrix Λ = Λ^T ≥ 0 in a performance index of the form J_{N,x_0}(u) = Σ_{t=0}^{N−1} [x_t^T u_t^T] Π [x_t; u_t] + (x_0 − x_N)^T Λ (x_0 − x_N). It is simple to see that this performance index can be brought back to (11) by setting W = [Λ −Λ; −Λ Λ] and x̄_0 = x̄_N = 0.) Equation (12) also permits imposing a hard constraint on an arbitrary linear combination of the initial and final states.

The solution of Problem 2 can be obtained by parameterizing the solutions of the so-called extended symplectic system (see Ferrante and Levy (1998) and references therein for a discussion on symplectic matrices and pencils). This solution can be convenient also for the classical case of Problem 1. In fact, it does not require iterating the difference Riccati equation (which can be undesirable if the time horizon is large) but only solving an algebraic Riccati equation and a related discrete Lyapunov equation.

This solution requires some definitions, preliminary results, and standing assumptions (see Ntogramatzidis and Ferrante (2013) and Ferrante and Ntogramatzidis (2013a) for a more general approach which does not require such assumptions). A detailed proof of the main result can be found in Ferrante and Ntogramatzidis (2007b); see also Zattoni (2008) and Ferrante and Ntogramatzidis (2012). The extended symplectic pencil is defined by

zF − G,   F ≝ [I_n 0 0; 0 A^T 0; 0 −B^T 0],   G ≝ [A 0 B; −Q I_n −S; S^T 0 R].   (13)

We make the following assumptions:
(A1) The pair (A, B) is modulus controllable, i.e., for all λ ∈ C\{0} at least one of the two matrices [λI − A | B] and [λ^{-1}I − A | B] has full row rank.
(A2) The pencil zF − G is regular (i.e., det(zF − G) is not the zero polynomial) and has no generalized eigenvalues on the unit circle.

Consider the discrete algebraic Riccati equation

X = A^T X A − (A^T X B + S)(R + B^T X B)^{-1}(S^T + B^T X A) + Q.   (14)

Under assumptions (A1)–(A2), (14) admits a strongly unmixed solution X = X^T, i.e., a solution X for which the corresponding closed-loop matrix

A_X ≝ A − B K_X,   K_X ≝ (R + B^T X B)^{-1}(S^T + B^T X A),   (15)

has a spectrum that does not contain reciprocal values, i.e., λ ∈ σ(A_X) implies λ^{-1} ∉ σ(A_X). It can now be proven that the following closed-loop Lyapunov equation admits a unique solution Y = Y^T ∈ R^{n×n}:

A_X Y A_X^T − Y + B (R + B^T X B)^{-1} B^T = 0.   (16)

The following theorem provides an explicit formula parameterizing all the optimal state and control trajectories for Problem 2. Notice that this formula can be readily implemented starting from the problem data.

Theorem 1 With reference to Problem 2, assume that (A1) and (A2) are satisfied. Let X = X^T be any strongly unmixed solution of (14), and let Y = Y^T be the corresponding solution of (16). Let N_V be a basis matrix of the null space of V (in the case when ker V = {0}, we consider N_V to be void). Moreover, let
F ≝ A_X^N,
K_⋆ ≝ K_X Y A_X^T − (R + B^T X B)^{-1} B^T,
X̂ ≝ diag(X, −X),   x̄ ≝ [x̄_0; x̄_N],
w ≝ [v; N_V^T W x̄],   L ≝ [I_n  Y F^T; F  Y],
U ≝ [0  F^T; 0  I_n],
M ≝ [V L; N_V^T ((X̂ − W) L − U)].

Problem 2 admits solutions if and only if w ∈ im M. In this case, let N_M be a basis matrix of the null space of M, and define

P ≝ { π = M^† w + N_M ζ : ζ arbitrary }.   (17)

Then, the set of optimal state and control trajectories of Problem 2 is parameterized, in terms of π ∈ P, by

[x(t); u(t)] = [A_X^t   Y (A_X^T)^{N−t}; −K_X A_X^t   K_⋆ (A_X^T)^{N−t−1}] π,   0 ≤ t ≤ N−1,
x(N) = [A_X^N   Y] π,   t = N.   (18)

The interpretation of the above result is the following. As π varies, (18) describes the trajectories of the extended symplectic system. The set P defined in (17) is the set of π for which these trajectories satisfy the boundary conditions. All the details of this construction can be found in Ferrante and Ntogramatzidis (2007b). If the pair (A, B) is stabilizable, we can choose X = X^T to be the stabilizing solution of (14). In such a case, the matrices A_X^t, (A_X^T)^{N−t}, and (A_X^T)^{N−t−1} appearing in (18) are asymptotically stable for all t = 0, ..., N. Thus, in this case, the optimal state trajectory and control are expressed in terms of powers of strictly stable matrices over the whole time interval, thus ensuring the robustness of the obtained solution even for very large time horizons. Indeed, the stabilizing solution of an algebraic Riccati equation and the solution of a Lyapunov equation may be computed by standard and robust algorithms available in any control package (see the MATLAB routines dare.m and dlyap.m). We refer to Ferrante et al. (2005) and Ferrante and Ntogramatzidis (2013b) for the continuous-time counterpart of the above results.

Summary

With the technique discussed in this article, a large number of LQ problems can be tackled in a unified framework. Moreover, several finite-horizon LQ problems that are interesting and useful in practice can be recovered as particular cases of the control problem considered here. The generality of the optimal control problem considered herein is crucial in the solution of several H2/H∞ optimization problems whose optimal trajectory is composed of a set of arcs, each solving a parametric LQ subproblem on a specific time horizon, all joined together at the endpoints of each subinterval. In these cases, in fact, a very general form of constraint on the extreme states is essential in order to express the condition of conjunction of each pair of subsequent arcs at the endpoints.

Cross-References

H2 Optimal Control
H-Infinity Control
Linear Quadratic Optimal Control

Bibliography

Anderson BDO, Moore JB (1971) Linear optimal control. Prentice Hall International, Englewood Cliffs
Balas G, Bokor J (2004) Detection filter design for LPV systems – a geometric approach. Automatica 40:511–518
Bilardi G, Ferrante A (2007) The role of terminal cost/reward in finite-horizon discrete-time LQ optimal control. Linear Algebra Appl (Spec Issue in honor of Paul Fuhrmann) 425:323–344
Ferrante A (2004) On the structure of the solution of discrete-time algebraic Riccati equation with singular
510 Graphs for Modeling Networked Interactions
if information can flow from vertex i to vertex j. If the information exchange is sensor based, and if there is a state x_i associated with vertex i, e.g., its position, then this information is typically relative, i.e., the states are measured relative to each other and the information obtained along the edge is x_j − x_i. If, on the other hand, the information is communicated, then the full state information x_i can be transmitted along the edge (v_i, v_j). The graph abstraction for a network of agents with sensor-based information exchange is illustrated in Fig. 1.

It is often useful to differentiate between scenarios where the information exchange is bidirectional – if agent i can get information from agent j, then agent j can get information from agent i – and when it is not. In the language of graph theory, an undirected graph is one where (v_i, v_j) ∈ E implies that (v_j, v_i) ∈ E, while a directed graph is one where such an implication may not hold.

Graph-Based Coordination Models

Graphs provide structural insights into how different coordination algorithms behave over a network. A coordination algorithm, or protocol, is an update rule that describes how the node states should evolve over time. To understand such protocols, one needs to connect the interaction dynamics to the underlying graph structure. This connection is facilitated through the common intersection of linear system theory and graph theory, namely, the broad discipline of algebraic graph theory, by first associating matrices with graphs. For undirected graphs, the following matrices play a key role:

Degree matrix:  Δ = Diag(deg(v_1), ..., deg(v_N)),
Adjacency matrix:  A = [a_ij],

where Diag denotes a diagonal matrix whose diagonal consists of its argument, and deg(v_i) is the degree of vertex i in the graph, i.e., the cardinality of the set of edges incident on vertex i. Moreover,

a_ij = 1 if (v_j, v_i) ∈ E, and a_ij = 0 otherwise.

As an example of how these matrices come into play, the so-called consensus protocol over scalar states can be compactly written in ensemble form as

ẋ = −L x,

where x = [x_1, ..., x_N]^T and L is the graph Laplacian:

L = Δ − A.

A useful matrix for directed networks is the incidence matrix, obtained by associating an index to each edge in E. We say that v_i = tail(e_j) if edge e_j starts at node v_i, and v_i = head(e_j) if e_j ends up at v_i, leading to the

Incidence matrix:  D = [d_ij].
Applications

Graph-based coordination has been used in a number of application domains, such as multi-agent robotics, mobile sensor and communication networks, formation control, and biological systems. One way in which the consensus protocol can be used is illustrated in Fig. 2.

Graphs for Modeling Networked Interactions, Fig. 2: Fifteen mobile robots are forming the letter "G" by executing a weighted version of the consensus protocol. (a) Formation control (t = 0). (b) Formation control (t = 5).

Summary and Future Directions

A number of issues pertaining to graph-based distributed control remain to be resolved. These include how heterogeneous networks, i.e., networks comprising agents with different capabilities, can be designed and understood. A variation on this theme is networks of networks, i.e., networks that are loosely coupled together and that must coordinate at a higher level of abstraction. Another key issue concerns how human operators should interact with networked control systems.

Cross-References

Averaging Algorithms and Consensus
Distributed Optimization
Dynamic Graphs, Connectivity of
Flocking in Networked Systems
Networked Systems
Optimal Deployment and Spatial Coverage
Oscillator Synchronization
Vehicular Chains

Recommended Reading

There are a number of research manuscripts and textbooks that explore the role of network structure in the system-theoretic aspects of networked dynamic systems and its many ramifications. Some of these references are listed below.

Bibliography

Bai H, Arcak M, Wen J (2011) Cooperative control design: a systematic, passivity-based approach. Springer, Berlin
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks: a mathematical approach to motion coordination algorithms. Princeton University Press, Princeton
Mesbahi M, Egerstedt M (2010) Graph theoretic methods in multiagent networks. Princeton University Press, Princeton
Ren W, Beard R (2008) Distributed consensus in multi-vehicle cooperative control. Springer, Berlin
resulting closed-loop system is internally stable, and the H2 norm of the resulting closed-loop transfer matrix from the disturbance input w to the controlled output z, denoted by T_zw(s), is minimized. For a stable transfer matrix T_zw(s), the H2 norm is defined as

‖T_zw‖₂ = [ (1/2π) ∫_{−∞}^{∞} trace( T_zw^H(jω) T_zw(jω) ) dω ]^{1/2},   (4)

where T_zw^H is the conjugate transpose of T_zw. Note that the H2 norm is equal to the energy of the impulse response associated with T_zw(s), and this is finite only if the direct feedthrough term of the transfer matrix is zero.

It is standard to make the following assumptions on the problem data: D_11 = 0, D_22 = 0, (A, B) is stabilizable, and (A, C_1) is detectable. The last two assumptions are necessary for the existence of an internally stabilizing control law. The first assumption can be made without loss of generality via a constant loop transformation. Finally, either the assumption D_22 = 0 can be achieved by a pre-static feedback law, or the problem does not yield a solution that has finite H2 closed-loop norm.

There are two main groups into which all H2 optimal control problems can be divided. The first group, referred to as regular H2 optimal control problems, consists of those problems for which the given plant satisfies two additional assumptions:
1. The subsystem from the control input to the controlled output, i.e., (A, B, C_2, D_2), has no invariant zeros on the imaginary axis, and its direct feedthrough matrix, D_2, is injective (i.e., it is tall and of full rank).
2. The subsystem from the exogenous disturbance to the measurement output, i.e., (A, E, C_1, D_1), has no invariant zeros on the imaginary axis, and its direct feedthrough matrix, D_1, is surjective (i.e., it is fat and of full rank).
Assumption 1 implies that (A, B, C_2, D_2) is left invertible with no infinite zero, and Assumption 2 implies that (A, E, C_1, D_1) is right invertible with no infinite zero. The second group, referred to as singular H2 optimal control problems, consists of those which are not regular.

Most of the research in the literature was expended on regular problems. Also, most of the available textbooks and review articles — see, for example, Anderson and Moore (1989), Bryson and Ho (1975), Fleming and Rishel (1975), Kailath (1974), Kwakernaak and Sivan (1972), Lewis (1986), and Zhou et al. (1996), to name a few — cover predominantly only a subset of regular problems. The singular H2 control problem with state feedback was studied in Geerts (1989) and Willems et al. (1986). Using different classes of state- and measurement-feedback control laws, Stoorvogel et al. (1993) studied the general H2 optimal control problems for the first time. In particular, necessary and sufficient conditions are provided therein for the existence of a solution in the case of state-feedback control, and in the case of measurement-feedback control. Following this, Trentelman and Stoorvogel (1995) explored necessary and sufficient conditions for the existence of an H2 optimal controller within the context of discrete-time and sampled-data systems. At the same time, Chen et al. (1993, 1994a) provided a thorough treatment of the H2 optimal control problem with state-feedback controllers. This includes a parameterization and construction of the set of all H2 optimal controllers and the associated sets of H2 optimal fixed modes and H2 optimal fixed decoupling zeros. Also, they provided a computationally feasible design algorithm for selecting an H2 optimal state-feedback controller that places the closed-loop poles at desired locations whenever possible. Furthermore, Chen and Saberi (1993) and Chen et al. (1996) developed the necessary and sufficient conditions for the uniqueness of an H2 optimal controller. Interested readers are referred to the textbook Saberi et al. (1995) for a detailed treatment of H2 optimal control problems in their full generality.

Regular Case

Solving regular H2 optimal control problems is relatively straightforward. In the case that all of
the state variables of the given plant are available for feedback, i.e., y = x, and Assumption 1 holds, the corresponding H2 optimal control problem can be solved in terms of the unique positive semi-definite stabilizing solution P ≥ 0 of the following algebraic Riccati equation:

A^T P + P A + C2^T C2 − (P B + C2^T D2)(D2^T D2)^{-1}(B^T P + D2^T C2) = 0   (5)

Note that the H2 optimal state-feedback control law is generally nonunique. A trivial example is the case when E = 0, whereby every stabilizing control law is an optimal solution. It is also interesting to note that the closed-loop system comprising the given plant with y = x and the state-feedback control law of Eq. (6) has poles at all the stable invariant zeros and all the mirror images of the unstable invariant zeros of (A, B, C2, D2), together with some other fixed locations in the left half complex plane. More detailed results about the optimal fixed modes and fixed decoupling zeros for general H2 optimal control can be found in Chen et al. (1993).

It can be shown that the well-known linear quadratic regulation (LQR) problem can be reformulated as a regular H2 optimal control problem. For a given plant

ẋ = A x + B u,   x(0) = X0   (8)

and a quadratic cost with weighting matrices R⋆ > 0 and Q⋆ ≥ 0 such that (A, Q⋆^{1/2}) is detectable, the LQR problem is equivalent to finding a static state-feedback H2 optimal control law for the following auxiliary plant Σ_LQR:

ẋ = A x + B u + X0 w   (10)

For the measurement-feedback case, one also needs the unique positive semi-definite stabilizing solution Q ≥ 0 of the dual algebraic Riccati equation:

Q A^T + A Q + E E^T − (Q C1^T + E D1^T)(D1 D1^T)^{-1}(D1 E^T + C1 Q) = 0   (13)

The H2 optimal measurement-feedback law is then given by

v̇ = (A + B F + K C1) v − K y,   u = F v   (14)

where F is as given in Eq. (6) and

K = −(Q C1^T + E D1^T)(D1 D1^T)^{-1}   (15)

In fact, such an optimal control law is unique, and the resulting closed-loop transfer matrix from w to z, Tzw(s), has the following property:

‖Tzw‖2 = { trace(E^T P E) + trace[(A^T P + P A + C2^T C2) Q] }^{1/2}
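As a concrete illustration of the regular-case recipe, the sketch below (a minimal numerical experiment, not taken from the entry; the function name `h2_measurement_feedback` is an assumption) solves Eq. (5) for P and Eq. (13) for Q with SciPy's `solve_continuous_are`, forms the gains F and K of Eqs. (14) and (15), and evaluates the closed-loop H2 norm formula.

```python
import numpy as np
from scipy.linalg import solve_continuous_are


def h2_measurement_feedback(A, B, E, C1, D1, C2, D2):
    """Sketch of the regular-case H2 design for a plant satisfying
    Assumptions 1 and 2: solve Eq. (5) for P and the dual Eq. (13) for Q,
    form the feedback gain F and observer gain K, and return the optimal
    closed-loop H2 norm {tr(E'PE) + tr[(A'P + PA + C2'C2) Q]}^(1/2)."""
    # Eq. (5): A'P + PA + C2'C2 - (PB + C2'D2)(D2'D2)^{-1}(B'P + D2'C2) = 0
    P = solve_continuous_are(A, B, C2.T @ C2, D2.T @ D2, s=C2.T @ D2)
    # Eq. (13) is a CARE for the transposed data (A', C1') with cross term E D1'
    Q = solve_continuous_are(A.T, C1.T, E @ E.T, D1 @ D1.T, s=E @ D1.T)
    F = -np.linalg.solve(D2.T @ D2, B.T @ P + D2.T @ C2)   # state-feedback gain
    K = -(Q @ C1.T + E @ D1.T) @ np.linalg.inv(D1 @ D1.T)  # observer gain, Eq. (15)
    norm_sq = np.trace(E.T @ P @ E) + np.trace((A.T @ P + P @ A + C2.T @ C2) @ Q)
    return P, Q, F, K, float(np.sqrt(max(norm_sq, 0.0)))
```

For the regular problem this stabilizing-solution route and the largest-LMI-solution route of the singular case coincide, as noted later in the entry.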
Likewise, the standard linear quadratic Gaussian (LQG) problem can be cast as an H2 optimal control problem for the auxiliary plant

ẋ = A x + B u + [G⋆ 0] w   (20)
y = C x + [0 N⋆] w   (21)
z = [H⋆; 0] x + [0; R⋆] u   (22)

The H2 optimal control problem for discrete-time systems can be solved in a similar way via the corresponding discrete-time algebraic Riccati equations. It is worth noting that many works in the literature deal with solutions to discrete-time algebraic Riccati equations related to optimal control problems; see, for example, Kucera (1972), Pappas et al. (1980), and Silverman (1976), to name a few. It is proven in Chen et al. (1994b) that solutions to the discrete- and continuous-time algebraic Riccati equations for optimal control problems can be unified. More specifically, a discrete-time Riccati equation can be solved through an equivalent continuous-time one and vice versa.

Singular Case

As in the previous section, only the key procedure for solving the singular H2 optimization problem for continuous-time systems is addressed. For the singular problem, it is generally not possible to obtain an optimal solution, except in some situations when the given plant satisfies certain geometric constraints; see, e.g., Chen et al. (1993) and Stoorvogel et al. (1993). It is more feasible to characterize the best achievable performance and to construct suboptimal controllers that approach it. To this end, one finds the largest solution P ≥ 0 of the linear matrix inequality

F(P) = [A^T P + P A + C2^T C2, P B + C2^T D2; B^T P + D2^T C2, D2^T D2] ≥ 0   (23)

and the largest solution Q ≥ 0 of

G(Q) = [A Q + Q A^T + E E^T, Q C1^T + E D1^T; C1 Q + D1 E^T, D1 D1^T] ≥ 0   (24)

Note that by decomposing the quadruples (A, B, C2, D2) and (A, E, C1, D1) into various subsystems in accordance with their structural properties, solutions to the above linear matrix inequalities can be obtained by solving a Riccati equation similar to those in Eq. (5) or Eq. (13) for the regular case. In fact, for the regular problem, the largest solution P ≥ 0 for Eq. (23) and the stabilizing solution P ≥ 0 for Eq. (5) are identical. Similarly, the largest solution Q ≥ 0 for Eq. (24) and the stabilizing solution Q ≥ 0 for Eq. (13) are also the same. Interested readers are referred to Stoorvogel et al. (1993) for more details or to Chen et al. (2004) for a more systematic treatment of the structural decomposition of linear systems and its connection to the solutions of the linear matrix inequalities.

It can be shown that the best achievable H2 norm of the closed-loop transfer matrix from w to z, i.e., the best possible performance over all internally stabilizing control laws, is given by

γ⋆2 = { trace(E^T P E) + trace[(A^T P + P A + C2^T C2) Q] }^{1/2}   (25)
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Geerts T (1989) All optimal controls for the singular linear quadratic problem without stability: a new interpretation of the optimal cost. Linear Algebra Appl 122:65–104
Kailath T (1974) A view of three decades of linear filtering theory. IEEE Trans Inf Theory 20:146–180
Kucera V (1972) The discrete Riccati equation of optimal control. Kybernetika 8:430–447
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York
Lewis FL (1986) Optimal control. Wiley, New York
Pappas T, Laub AJ, Sandell NR Jr (1980) On the numerical solution of the discrete-time algebraic Riccati equation. IEEE Trans Autom Control AC-25:631–641
Saberi A, Sannuti P, Chen BM (1995) H2 optimal control. Prentice Hall, London
Silverman L (1976) Discrete Riccati equations: alternative algorithms, asymptotic properties, and system theory interpretations. Control Dyn Syst 12:313–386
Stoorvogel AA (1992) The singular H2 control problem. Automatica 28:627–631
Stoorvogel AA, Saberi A, Chen BM (1993) Full and reduced order observer based controller design for H2-optimization. Int J Control 58:803–834
Trentelman HL, Stoorvogel AA (1995) Sampled-data and discrete-time H2 optimal control. SIAM J Control Optim 33:834–862
Willems JC, Kitapci A, Silverman LM (1986) Singular optimal control: a geometric approach. SIAM J Control Optim 24:323–337
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River

H-Infinity Control

Keith Glover
Department of Engineering, University of Cambridge, Cambridge, UK

closed-loop system can be derived. Connections to other problems, such as game theory and risk-sensitive control, are discussed, and finally appropriate problem formulations to produce "good" controllers using this methodology are outlined.

Keywords

Loop-shaping; Robust control; Robust stability

Introduction

The H∞-norm probably first entered the study of robust control with the observations made by Zames (1981) in considering optimal sensitivity. The so-called H∞ methods were subsequently developed and are now routinely available to control engineers. In this entry we consider the H∞ methods for control and, for simplicity of exposition, we restrict our attention to linear, time-invariant, finite-dimensional, continuous-time systems. Such systems can be represented by their transfer function matrix, G(s), which will then be a rational function of s. Although the Hardy space, H∞, also includes nonrational functions, a rational G(s) is in H∞ if and only if it is proper and all its poles are in the open left half plane, in which case the H∞-norm is defined as:

‖G(s)‖∞ = sup_{Re s > 0} σ_max(G(s)) = sup_{−∞ < ω < ∞} σ_max(G(jω))
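Computationally, the H∞-norm is not evaluated from this definition directly. A standard approach, sketched below as an illustration (not taken from the entry; the function name and tolerances are assumptions), bisects on γ using the fact that, for a stable G, ‖G‖∞ < γ exactly when an associated Hamiltonian matrix has no eigenvalues on the imaginary axis.

```python
import numpy as np


def hinf_norm(A, B, C, D, tol=1e-6):
    """Estimate ||G||_inf for G(s) = C (sI - A)^{-1} B + D, with A stable,
    by bisection on gamma using the Hamiltonian eigenvalue test."""
    m, p = B.shape[1], C.shape[0]

    def gamma_is_upper_bound(gam):
        # gamma > ||G||_inf  iff  H(gamma) has no imaginary-axis eigenvalues
        R = gam**2 * np.eye(m) - D.T @ D
        Rinv = np.linalg.inv(R)
        H = np.block([
            [A + B @ Rinv @ D.T @ C, B @ Rinv @ B.T],
            [-C.T @ (np.eye(p) + D @ Rinv @ D.T) @ C,
             -(A + B @ Rinv @ D.T @ C).T],
        ])
        return not np.any(np.abs(np.linalg.eigvals(H).real) < 1e-9)

    lo = np.linalg.norm(D, 2) + 1e-12   # ||G||_inf >= sigma_max(D)
    hi = max(2.0 * lo, 1.0)
    while not gamma_is_upper_bound(hi):  # grow until hi exceeds the norm
        hi *= 2.0
    while hi - lo > tol * hi:            # bisect down onto the norm
        mid = 0.5 * (lo + hi)
        if gamma_is_upper_bound(mid):
            hi = mid
        else:
            lo = mid
    return hi
```

The same γ-bisection pattern reappears later in the entry when searching for the minimum achievable closed-loop norm.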
Hence in a control context the H∞-norm can give a measure of the gain, for example, from disturbances to the resulting errors. In the interconnection of systems, the property that ‖P(s)Q(s)‖∞ ≤ ‖P(s)‖∞ ‖Q(s)‖∞ is often useful.

The second reason for using the H∞-norm is in representing uncertainty in the plant being controlled, e.g., the nominal plant is Po(s) but the actual plant is P(s) = Po(s) + Δ(s), where ‖Δ(s)‖∞ ≤ δ.

A typical control design problem is given in Fig. 1, i.e.,

[z̄; ȳ] = P [w̄; ū] = [P11 w̄ + P12 ū; P21 w̄ + P22 ū]
ū = K ȳ

That is, the controller is designed to minimize the worst-case effect of the disturbance w on the output/error signal z as measured by the L2 norm of the signals. This article will describe the solution to this problem.

Robust Stability

Before we describe the solution to the synthesis problem, consider the problem of the robust stability of an uncertain plant with a feedback controller. Suppose the plant is given by the upper LFT, Fu(P, Δ), with ‖Δ‖∞ ≤ 1/γ, as illustrated in Fig. 2:

ȳ = Fu(P, Δ) ū   (1)
P = (I + W1 Δ W2) Po = Fu([0, W2 Po; W1, Po], Δ)

z(t) = C1 x(t) + D12 u(t)   (4)
y(t) = C2 x(t) + D21 w(t)   (5)

i.e., in Fig. 1,

P = [A, B1, B2; C1, 0, D12; C2, D21, 0]

where we also assume, with little loss of generality, that D12^T D12 = I, D21 D21^T = I, D12^T C1 = 0, and B1 D21^T = 0. Since we wish to have ‖Tzw‖∞ < γ, we need to find u such that

‖z‖2^2 − γ^2 ‖w‖2^2 < 0 for all w ≠ 0 ∈ L2(0, ∞).

We could consider w to be an adversary trying to make this expression positive, while u has to ensure that it always remains negative in spite of the malicious intentions of w, as in a noncooperative game. Suppose that there exists a solution, X∞, to the Algebraic Riccati Equation (ARE) in (6); then a controller K can be constructed such that ‖Fl(P, K)‖∞ < γ. In addition, since transposing a system does not change its H∞-norm, the following dual ARE will also have a solution, Y∞ ≥ 0:

A Y∞ + Y∞ A^T + B1 B1^T + Y∞ (γ^{−2} C1^T C1 − C2^T C2) Y∞ = 0   (8)

To obtain a solution to the output feedback case, note that (7) implies that ‖z‖2^2 < γ^2 ‖w‖2^2 if and only if ‖v‖2^2 < γ^2 ‖r‖2^2, where v̄ = Fl(Ptmp, K) r̄ with

[v̄; ȳ] = Ptmp [r̄; ū]

and

Ptmp = [A + γ^{−2} B1 B1^T X∞, B1, B2; B2^T X∞, 0, I; C2, D21, 0]
giving feedback from a state estimator in the presence of an estimate of the worst-case disturbance. As γ → ∞, the standard LQG controller is obtained, with state feedback of a state estimate obtained from a Kalman filter. In contrast to the LQG problem, the controller depends on the value of γ, and if this is chosen to be too small, then one of the conditions in (9) will be violated. In order to determine the minimum achievable value of γ, a bisection search over γ can be performed, checking (9) for each candidate value of γ.

In the limit as γ → γ_opt (its minimum value), a variety of situations can arise, and the formulae given here may become ill-conditioned. Typically, achieving γ_opt is more of an interesting and sometimes challenging mathematical exercise than a control system requirement.

This control problem does not have a unique solution, and all solutions can be characterized by an LFT form such as K = Fl(M, Q), where Q ∈ H∞ with ‖Q‖∞ < γ; the present solution, obtained with Q = 0, is sometimes referred to as the "central solution."

Relations for Other Solution Methods and Problem Formulations

The H∞-control problem has been shown to be related to an extraordinarily wide variety of mathematical techniques and to other problem areas, and investigations of these connections have been most fruitful. Earlier approaches (see Francis 1988) first used the characterization of all stabilizing controllers of Youla et al. (see Vidyasagar 1985), which shows that all stable closed-loop transfer matrices can be written as Tzw = T1 − T2 Q T3, where the Ti are fixed stable systems and Q is a free stable transfer function; this gives a model matching problem. The chain-scattering approach instead works with J-lossless factorizations, where J = [I, 0; 0, −I], and generates state-space solutions to these problems (Kimura 1997).

The derivation above clearly demonstrates relations to noncooperative differential games, and this is fully developed in Başar and Bernhard (1995) and Green and Limebeer (1995).

The model matching problem is clearly a convex optimization problem. The solution of linear matrix inequalities can give effective methods for solving certain convex optimization problems (e.g., calculating the H∞ norm using the bounded real lemma) and can be exploited in the H∞-control problem. See Boyd and Barratt (1991) for a variety of results on convex optimization and control and Dullerud and Paganini (2000) for this approach in robust control.

As noted above, there is a family of solutions to the H∞-control problem. The central solution in fact minimizes the entropy integral given by

I(Tzw; γ) := −(γ^2 / 2π) ∫_{−∞}^{∞} ln |det(I − γ^{−2} Tzw(jω)^∗ Tzw(jω))| dω   (10)

It can be seen that this criterion will penalize the singular values of Tzw(jω) from being close to γ for a large range of frequencies.

One of the more surprising connections is with the risk-sensitive stochastic control problem (Whittle 1990), where w is assumed to be Gaussian white noise and it is desired to minimize

JT(θ) := −(2/(θT)) ln E{ e^{−(θ/2) VT} }   (11)

where VT := ∫_{−T}^{T} z(t)^T z(t) dt   (12)

When θ is chosen to be too small, Whittle refers to the controller having a "neurotic breakdown" because the cost will be infinite for all possible control laws! If in (11) we set θ = −γ^{−2}, then the entropy minimizing controller will have θ < 0 and will be risk-averse. The risk-neutral controller is obtained when θ → 0, γ → ∞, and gives the standard LQG case. If θ > 0, then the controller will be risk-seeking, believing that large variance will be in its favor.

Controller Design with H∞ Optimization

The above solutions to the H∞ problem do not give guidance on how to set up a problem to give a "good" control system design. The problem formulation typically involves identifying frequency-dependent weighting matrices to characterize the disturbances, w, and the relative importance of the errors, z (see Skogestad and Postlethwaite 1996). The choice of weights should also incorporate system uncertainty to obtain a robust controller.

One approach that combines both closed-loop system gain and system uncertainty is called H∞ loop-shaping, where the desired closed-loop behavior is determined by the design of the loop-shape using pre- and post-compensators and the system uncertainty is represented in the gap metric (see Vinnicombe 2001). This makes classical criteria such as low-frequency tracking error, bandwidth, and high-frequency roll-off all easily incorporated. In this framework the performance and robustness measures are very well matched to each other. Such an approach has been successfully exploited in a number of practical examples (e.g., Hyde (1995) for flight control taken through to successful flight tests). Standard control design software packages now routinely have H∞-control design modules.

Summary and Future Directions

We have outlined the derivation of H∞ controllers with straightforward assumptions that nevertheless exhibit most of the features of linear time-invariant systems without such assumptions and for which routine design software is now available. Connections to a surprisingly large range of other problems are also discussed. Generalizations to more general cases, such as time-varying and nonlinear systems, where the norm is interpreted as the induced norm of the system in L2, can be derived, although the computational aspects are no longer routine. For the problems of robust control, there are necessarily continuing efforts to match the mathematical representation of system uncertainty and system performance to the physical system requirements and to have such representations amenable to analysis and computation.

Cross-References

Fundamental Limitation of Feedback Control
H2 Optimal Control
Linear Quadratic Optimal Control
LMI Approach to Robust Control
Robust H2 Performance in Feedback Control
Structured Singular Value and Applications: Analyzing the Effect of Linear Time-Invariant Uncertainty in Linear Systems

Bibliography

Başar T, Bernhard P (1995) H∞-optimal control and related minimax design problems, 2nd edn. Birkhäuser, Boston
Boyd SP, Barratt CH (1991) Linear controller design: limits of performance. Prentice Hall, Englewood Cliffs
Doyle JC, Glover K, Khargonekar PP, Francis BA (1989) State-space solutions to standard H2 and H∞ control problems. IEEE Trans Autom Control 34(8):831–847
Dullerud GE, Paganini F (2000) A course in robust control theory: a convex approach. Springer, New York
Francis BA (1988) A course in H∞ control theory. Lecture notes in control and information sciences, vol 88. Springer, Berlin, Heidelberg
Green M, Limebeer D (1995) Linear robust control. Prentice Hall, Englewood Cliffs
Hyde RA (1995) H∞ aerospace control design: a VSTOL flight application. Springer, London
Kimura H (1997) Chain-scattering approach to H∞-control. Birkhäuser, Basel
Skogestad S, Postlethwaite I (1996) Multivariable feedback control: analysis and design. Wiley, Chichester
Vidyasagar M (1985) Control system synthesis: a factorization approach. MIT, Cambridge
Vinnicombe G (2001) Uncertainty and feedback: H∞ loop-shaping and the ν-gap metric. Imperial College Press, London
Whittle P (1990) Risk-sensitive optimal control. Wiley, Chichester
Zames G (1981) Feedback and optimal sensitivity: model reference transformations, multiplicative seminorms, and approximate inverses. IEEE Trans Automat Control 26:301–320
Zhou K, Doyle JC, Glover K (1996) Robust and optimal control. Prentice Hall, Upper Saddle River, New Jersey

History of Adaptive Control

Karl Åström
Department of Automatic Control, Lund University, Lund, Sweden

Abstract

This entry gives an overview of the development of adaptive control, starting with the early efforts in flight and process control. Two popular schemes, the model reference adaptive controller and the self-tuning regulator, are described with a thumbnail overview of theory and applications. There is currently a resurgence in adaptive flight control as well as in other applications. Some reflections on future development are also given.

Keywords

Adaptive control; Auto-tuning; Flight control; History; Model reference adaptive control; Process control; Robustness; Self-tuning regulators; Stability

Introduction

In everyday language, to adapt means to change a behavior to conform to new circumstances, for example, when the pupil area changes to accommodate variations in ambient light. The distinction between adaptation and conventional feedback is subtle because feedback also attempts to reduce the effects of disturbances and plant uncertainty. Typical examples are adaptive optics and adaptive machine tool control, which are conventional feedback systems with controllers having constant parameters. In this entry we take the pragmatic attitude that an adaptive controller is a controller that can modify its behavior, by adjusting the controller parameters, in response to changes in the dynamics of the process and the character of the disturbances.

Adaptive control has had a colorful history with many ups and downs and intense debates in the research community. It emerged in the 1950s, stimulated by attempts to design autopilots for supersonic aircraft. Autopilots based on constant-gain, linear feedback worked well in one operating condition but not over the whole flight envelope. In process control there was also a need for automatic tuning of simple controllers.

Much research in the 1950s and early 1960s contributed to conceptual understanding of adaptive control. Bellman showed that dynamic programming could capture many aspects of adaptation (Bellman 1961). Feldbaum introduced the notion of dual control, meaning that control should be probing as well as directing; the controller should thus inject test signals to obtain better information. Tsypkin showed that schemes for learning and adaptation could be captured in a common framework (Tsypkin 1971).

Gabor's work on adaptive filtering (Gabor et al. 1959) inspired Widrow to develop an analogue neural network (Adaline) for adaptive control (Widrow 1962). Widrow's adaptation mechanism was inspired by Hebbian learning in biological systems (Hebb 1949).

There are adaptive control problems in economics and operations research. In these fields the problems are often called decision making under uncertainty. A simple idea, called the certainty equivalence principle, proposed by Simon (1956), is to neglect uncertainty and treat estimates as if they were true. Certainty equivalence was commonly used in early work on adaptive control.
A period of intense research and ample funding ended dramatically in 1967 with a crash of the rocket-powered X15-3 using Honeywell's MH-96 self-oscillating adaptive controller. The self-oscillating adaptive control system has, however, been successfully used in several missiles.

Research in adaptive control resurged in the 1970s, when two schemes, the model reference adaptive controller (MRAC) and the self-tuning regulator (STR), emerged together with successful applications. The research was influenced by stability theory and advances in the field of system identification. There was an intensive period of research from the late 1970s through the 1990s. The insight and understanding of stability, convergence, and robustness increased. Recently there has been renewed interest because of flight control (Hovakimyan and Cao 2010; Lavretsky and Wise 2013) and other applications; there is, for example, a need for adaptation in autonomous systems.

The Brave Era

Supersonic flight posed new challenges for flight control. Researchers were eager to obtain results, and there was a very short path from idea to flight test with very little theoretical analysis in between. A number of research projects were sponsored by the US Air Force. Adaptive flight control systems were developed by General Electric, Honeywell, MIT, and other groups. The systems are documented in the Self-Adaptive Flight Control Systems Symposium held at the Wright Air Development Center in 1959 (Gregory 1959) and the book by Mishkin and Braun (1961).

Whitaker of the MIT team proposed the model reference adaptive control system, which is based on the idea of specifying the performance of a servo system by a reference model. Honeywell proposed a self-oscillating adaptive system (SOAS) which attempted to keep a given gain margin by bringing the system to self-oscillation. The system was flight-tested on several aircraft. It experienced a disaster in a test on the X-15. Combined with the success of gain scheduling based on air data sensors, the interest in adaptive flight control diminished significantly.

There was also interest in adaptation for process control. Foxboro patented an adaptive process controller with a pneumatic adaptation mechanism in 1950 (Foxboro 1950). DuPont had joint studies with IBM aimed at computerized process control. Kalman worked for a short time at the Engineering Research Laboratory at DuPont, where he started work that led to a paper (Kalman 1958), which is the inspiration of the self-tuning regulator. The abstract of this paper has the statement, "This paper examines the problem of building a machine which adjusts itself automatically to control an arbitrary dynamic process," which clearly captures the dream of early adaptive control.

Draper and Li investigated the problem of operating aircraft engines optimally, and they developed a self-optimizing controller that would drive the system towards optimal working conditions. The system was successfully flight-tested (Draper and Li 1966) and initiated the field of extremal control.

Many of the ideas that emerged in the brave era inspired future research in adaptive control. The MRAC, the STR, and extremal control are typical examples.

Model Reference Adaptive Control (MRAC)

The MRAC was one idea from the early work on flight control that had a significant impact on adaptive control. A block diagram of a system with model reference adaptive control is shown in Fig. 1. The system has an ordinary feedback loop with a controller, having adjustable parameters, and the process. There is also a reference model which gives the ideal response ym to the command signal uc, and a mechanism for adjusting the controller parameters θ. The parameter adjustment is based on the process output y, the control signal u, and the output ym of the reference model. Whitaker proposed the following rule for adjusting the parameters:

dθ/dt = −γ e (∂e/∂θ)   (1)

where e = y − ym is the model error and γ is the adaptation gain; this adjustment law became known as the MIT rule.
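To make the scheme concrete, here is a minimal simulation (an illustration, not from the entry; the function name and numerical values are assumptions) of the MIT rule on the classic problem of adapting a feedforward gain: the plant is k G(s) and the model k0 G(s) with G(s) = 1/(s + 1), the control law is u = θ uc, and θ is adjusted as in Eq. (1), with ∂e/∂θ proportional to ym.

```python
def mit_rule_gain_adaptation(k=2.0, k0=1.0, gamma=0.5, dt=1e-3, t_end=100.0):
    """MIT-rule sketch for feedforward gain adaptation:
    plant:  dy/dt  = -y  + k*theta*uc   (controller u = theta*uc)
    model:  dym/dt = -ym + k0*uc
    rule:   dtheta/dt = -gamma * e * ym, with e = y - ym.
    The gain theta should approach k0/k, where plant and model coincide."""
    theta, y, ym, t = 0.0, 0.0, 0.0, 0.0
    while t < t_end:
        uc = 1.0 if (t % 20.0) < 10.0 else -1.0  # square-wave command (excitation)
        e = y - ym
        y += dt * (-y + k * theta * uc)          # forward-Euler plant step
        ym += dt * (-ym + k0 * uc)               # reference model step
        theta += dt * (-gamma * e * ym)          # MIT rule, Eq. (1)
        t += dt
    return theta
```

For small adaptation gains the averaged dynamics drive θ toward k0/k; for large gains or fast commands the MIT rule can go unstable, which is one motivation for the normalized laws discussed later in this entry.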
History of Adaptive Control, Fig. 1 Block diagram of a feedback system with a model reference adaptive controller (MRAC); the diagram shows the reference model (output ym), the adjustment mechanism for the controller parameters, and the controller-plant loop with signals uc, u, and y
y(t) + a1 y(t − h) + ⋯ + an y(t − nh) = b1 u(t − h) + ⋯ + bn u(t − nh) + c1 w(t − h) + ⋯ + cn w(t − nh) + e(t)   (2)

where u is the control signal, y the process output, w a measured disturbance, and e a stochastic disturbance. Furthermore, h is the sampling period, and ak, bk, and ck are the parameters. Parameter estimation is typically done using least squares, and a control design that minimized the variance of the variations was well suited for regulation. A surprising result was that if the estimates converge, the limiting controller is a minimum variance controller even if the disturbance e is colored noise (Åström and Wittenmark 1973). Convergence conditions for the self-tuning regulator were given in Goodwin et al. (1980), and a very detailed analysis was presented in Guo and Chen (1991).

The problem of output feedback does not appear for the model (2) because the sequence of past inputs and outputs y(t − h), ..., y(t − nh), u(t − h), ..., u(t − nh) is indeed a state, albeit not a minimal state representation. The continuous-time analogue would be to use derivatives of states and inputs, which is not feasible because of measurement noise. The selection of the sampling period is, however, important.

Early industrial experience indicated that the ability of the STR to adapt feedforward gains was particularly useful, because feedforward control requires good models.

Insight from system identification showed that excitation is required to obtain good estimates. In the absence of excitation, a phenomenon of bursting could be observed. There could be epochs with small control actions due to insufficient excitation. The estimated parameters then drifted towards values close to or beyond the stability boundary, generating large control actions. Good parameter estimates were then obtained, and the system quickly recovered stability. The behavior then repeated in an irregular fashion. There are two ways to deal with the problem. One possibility is to detect when there is poor excitation and stop adaptation (Hägglund and Åström 2000). The other is to inject perturbations when there is poor excitation, in the spirit of dual control.

Robustness and Unification

The model reference adaptive controller and the self-tuning regulator originate from different application domains, flight control and process control. The differences are amplified because they are typically presented in different frameworks, continuous time for the MRAC and discrete time for the STR. The schemes are, however, not too different. For a given process model and design criterion, the process model can often be re-parameterized in terms of controller parameters, and the STR is then equivalent to an MRAC. Similarly, there are indirect MRACs where the process parameters are estimated (Egardt 1979).
A fundamental assumption made in the early analyses of model reference adaptive controllers was that the process model used for analysis had the same structure as the real process. Rohrs at MIT showed that systems with guaranteed convergence could be very sensitive to unmodeled dynamics, which generated a good deal of research to explore robustness to unmodeled dynamics. Averaging theory, which is based on the observation that there are two loops in an adaptive system, a fast ordinary feedback loop and a slow parameter adjustment loop, turned out to be a key tool for understanding the behavior of adaptive systems. A large body of theory was generated and many books were written (Ioannou and Sun 1995; Sastry and Bodson 1989).

The theory resulted in several improvements of the adaptive algorithms. In the MIT rule (1) and similar adaptation laws derived from Lyapunov theory, the rate of change of the parameters is a product of the error e and other signals in the system. The adaptation rate may then become very large when signals are large. The analysis of robustness showed that there were advantages in avoiding large adaptation rates by normalizing the signals. The stability analysis also required that parameter estimates be bounded. To achieve this, parameters were projected onto regions given by prior parameter bounds. The projection did, however, require prior process knowledge. The improved insight obtained from the robustness analysis is well described in the books Goodwin and Sin (1984), Egardt (1979), Åström and Wittenmark (1989), Narendra and Annaswamy (1989), Sastry and Bodson (1989), Anderson et al. (1986), and Ioannou and Sun (1995).

Applications

There were severe practical difficulties in implementing the early adaptive controllers using the analogue technology available in the brave era. Kalman used a hybrid computer when he attempted to implement his controller. There were dramatic improvements when mini- and microcomputers appeared in the 1970s. Since computers were still slow at the time, it was natural that most experiments were executed in process control or ship steering, which are slow processes. Advances in computing eliminated the technological barriers rapidly.

Self-oscillating adaptive controllers are used in several missiles. In piloted aircraft there were complaints about the perturbation signals that were always exciting the system.

Self-tuning regulators have been used industrially since the early 1970s. Adaptive autopilots for ship steering were developed at the same time. They outperformed conventional autopilots based on PID control, because disturbances generated by waves were estimated and compensated for. These autopilots are still on the market (Northrop Grumman 2005). Asea (now ABB) developed a small distributed control system, Novatune, which had blocks for self-tuning regulators based on least-squares estimation and minimum variance control. The company First Control, formed by members of the Novatune team, has delivered SCADA systems with adaptive control since 1985. The controllers are used for high-performance process control systems for pulp mills, paper machines, rolling mills, and pilot plants for chemical process control. The adaptive controllers are based on recursive estimation of a transfer function model and a control law based on pole placement. The controller also admits feedforward. The algorithm is provided with extensive safety logic: parameters are projected, and adaptation is interrupted when variations in measured signals and control signals are too small.

The most common industrial use of adaptive techniques is automatic tuning of PID controllers. The techniques are used both in single-loop controllers and in DCS systems. Many different techniques are used, pattern recognition as well as parameter estimation. The relay auto-tuner has proven very useful and has been shown to be very robust because it provides proper excitation of the process automatically. Some of the systems use automatic tuning to automatically generate gain schedules, and they also have adaptation of feedback and feedforward gains (Åström and Hägglund 2005).
History of Adaptive Control 531
Summary and Future Directions

Adaptive control has had a turbulent history with alternating periods of optimism and pessimism. This history is reflected in the conferences. When the IEEE Conference on Decision and Control started in 1962, it included a Symposium on Adaptive Processes, which was discontinued after the 20th CDC in 1981. There were two IFAC symposia on the Theory of Self-Adaptive Control Systems, the first in Rome in 1962 and the second in Teddington in 1965 (Hammond 1966). The symposia were discontinued but reappeared when the Theory Committee of IFAC created a working group on adaptive control, chaired by Prof. Landau, in 1981. The group brought the control and signal processing communities together, and a workshop on Adaptation and Learning in Signal Processing and Control (ALCOSP) was created. The first symposium was held in San Francisco in 1983 and the 11th in Caen in 2013.

Adaptive control can give significant benefits: it can deliver good performance over wide operating ranges, and commissioning of controllers can be simplified. Automatic tuning of PID controllers is now widely used in the process industry. Auto-tuning of more general controllers is clearly of interest. Regulation performance is often characterized by the Harris index, which compares actual performance with minimum variance control. Such evaluation can be dispensed with by applying a self-tuning regulator.

There are adaptive controllers that have been in operation for more than 30 years, for example, in ship steering and rolling mills. There is a variety of products that use scheduling, MRAC, and STR in different ways. Automatic tuning is widely used; virtually all new single-loop controllers have some form of automatic tuning. Automatic tuning is also used to build gain schedules semiautomatically. The techniques appear in tuning devices, in single-loop controllers, in distributed systems for process control, and in controllers for special applications. There are strong similarities between adaptive filtering and adaptive control. Noise cancellation and adaptive equalization are widespread uses of adaptation. The signal processing applications are a little easier to analyze because the systems do not have a feedback controller. New adaptive schemes are appearing. The L1 adaptive controller is one example; it inherits features of both the STR and the MRAC. The model-free controller by Fliess and Join (2013) is another example; it is similar to a continuous-time version of the self-tuning regulator.

There is renewed interest in adaptive control in the aerospace industry, both for aircraft and missiles (Lavretsky and Wise 2013). Good results in flight tests have been reported both using MRAC and the recently developed L1 adaptive controller (Hovakimyan and Cao 2010).

Adaptive control is a rich field, and to understand it well, it is necessary to know a wide range of techniques: nonlinear, stochastic, and sampled-data systems, stability, robust control, and system identification.

In the early development of adaptive control, there was a dream of the universal adaptive controller that could be applied to any process with very little prior process knowledge. The insight gained by robustness analysis shows that knowledge of bounds on the parameters is essential to ensure robustness. With the knowledge available today, adaptive controllers can be designed for particular applications. Design of proper safety nets is an important practical issue. One useful approach is to start with a basic constant-gain controller and provide adaptation as an add-on. This approach also simplifies the design of supervision and safety networks.

There are still many unsolved research problems. Methods to determine the achievable adaptation rates are not known. Finding ways to provide proper excitation is another problem. The dual control formulation is very attractive because it automatically generates proper excitation when it is needed, but the computations required to solve the Bellman equations are prohibitive, except in very simple cases. The self-oscillating adaptive system, which has been successfully applied to missiles, does provide excitation. The success of the relay auto-tuner for simple controllers indicates that it may be called in to provide excitation of adaptive controllers. Adaptive control can be an important component of emerging autonomous systems. One may expect that the current upswing in systems biology will provide more inspiration, because many biological systems clearly have adaptive capabilities.

Cross-References

▶ Adaptive Control, Overview
▶ Autotuning
▶ Extremum Seeking Control
▶ Model Reference Adaptive Control
▶ PID Control

Bibliography

Anderson BDO, Bitmead RR, Johnson CR, Kokotović PV, Kosut RL, Mareels I, Praly L, Riedle B (1986) Stability of adaptive systems. MIT, Cambridge
Åström KJ, Hägglund T (2005) Advanced PID control. ISA – The Instrumentation, Systems, and Automation Society, Research Triangle Park
Åström KJ, Wittenmark B (1973) On self-tuning regulators. Automatica 9:185–199
Åström KJ, Wittenmark B (1989) Adaptive control. Addison-Wesley, Reading. Second 1994 edition reprinted by Dover 2006
Bellman R (1961) Adaptive control processes—a guided tour. Princeton University Press, Princeton
Butchart RL, Shackcloth B (1965) Synthesis of model reference adaptive control systems by Lyapunov's second method. In: Proceedings of the 1965 IFAC symposium on adaptive control, Teddington
Draper CS, Li YT (1966) Principles of optimalizing control systems and an application to the internal combustion engine. In: Oldenburger R (ed) Optimal and self-optimizing control. MIT, Cambridge
Egardt B (1979) Stability of adaptive controllers. Springer, Berlin
Fliess M, Join C (2013) Model-free control.
Foxboro (1950) Control system with automatic response adjustment. US Patent 2,517,081
Gabor D, Wilby WPL, Woodcock R (1959) A universal non-linear filter, predictor and simulator which optimizes itself by a learning process. Proc Inst Electron Eng 108(Part B):1061–
Goodwin GC, Sin KS (eds) (1984) Adaptive filtering prediction and control. Prentice Hall, Englewood Cliffs
Goodwin GC, Ramadge PJ, Caines PE (1980) Discrete-time multivariable adaptive control. IEEE Trans Autom Control AC-25:449–456
Gregory PC (ed) (1959) Proceedings of the self adaptive flight control symposium. Wright Air Development Center, Wright-Patterson Air Force Base, Ohio
Guo L, Chen HF (1991) The Åström–Wittenmark self-tuning regulator revisited and ELS-based adaptive trackers. IEEE Trans Autom Control 30(7):802–812
Hägglund T, Åström KJ (2000) Supervision of adaptive control algorithms. Automatica 36:1171–1180
Hammond PH (ed) (1966) Theory of self-adaptive control systems. In: Proceedings of the second IFAC symposium on the theory of self-adaptive control systems, 14–17 Sept, National Physical Laboratory, Teddington. Plenum, New York
Hebb DO (1949) The organization of behavior. Wiley, New York
Hovakimyan N, Cao C (2010) L1 adaptive control theory. SIAM, Philadelphia
Ioannou PA, Sun J (1995) Stable and robust adaptive control. Prentice-Hall, Englewood Cliffs
Kalman RE (1958) Design of a self-optimizing control system. Trans ASME 80:468–478
Krstić M, Kanellakopoulos I, Kokotović PV (1993) Nonlinear and adaptive control design. Prentice Hall, Englewood Cliffs
Kumar PR, Varaiya PP (1986) Stochastic systems: estimation, identification and adaptive control. Prentice-Hall, Englewood Cliffs
Landau ID (1979) Adaptive control—the model reference approach. Marcel Dekker, New York
Lavretsky E, Wise KA (2013) Robust and adaptive control with aerospace applications. Springer, London
Mishkin E, Braun L (1961) Adaptive control systems. McGraw-Hill, New York
Monopoli RV (1974) Model reference adaptive control with an augmented error signal. IEEE Trans Autom Control AC-19:474–484
Morse AS (1980) Global stability of parameter-adaptive control systems. IEEE Trans Autom Control AC-25:433–439
Narendra KS, Annaswamy AM (1989) Stable adaptive systems. Prentice Hall, Englewood Cliffs
Northrop Grumman (2005) SteerMaster. http://www.srhmar.com/brochures/as/SPERRY%20SteerMaster%20Control%20System.pdf
Parks PC (1966) Lyapunov redesign of model reference adaptive control systems. IEEE Trans Autom Control AC-11:362–367
Sastry S, Bodson M (1989) Adaptive control: stability, convergence and robustness. Prentice-Hall, New Jersey
Simon HA (1956) Dynamic programming under uncertainty with a quadratic criterion function. Econometrica 24:74–81
Stein G (1980) Adaptive flight control: a pragmatic view. In: Narendra KS, Monopoli RV (eds) Applications of adaptive control. Academic, New York
Tsypkin YaZ (1971) Adaptation and learning in automatic systems. Academic, New York
Widrow B (1962) Generalization and information storage in network of Adaline neurons. In: Yovits et al. (ed) Self-organizing systems. Spartan Books, Washington
Hybrid Dynamical Systems, Feedback Control of 533
Abstract

The control of systems with hybrid dynamics requires algorithms capable of dealing with the intricate combination of continuous and discrete behavior, which typically emerges from the presence of continuous processes, switching devices, and logic for control. Several analysis and design techniques have been proposed for the control of nonlinear continuous-time plants, but little is known about controlling plants that feature truly hybrid behavior. This short entry focuses on recent advances in the design of feedback control algorithms for hybrid dynamical systems. The focus is on hybrid feedback controllers that are systematically designed employing Lyapunov-based methods. The control design techniques summarized in this entry include control Lyapunov function-based control, passivity-based control, and trajectory tracking control.

Motivation

Hybrid dynamical systems are ubiquitous in science and engineering as they permit capturing the complex and intertwined continuous/discrete behavior of a myriad of systems with variables that flow and jump. The recent popularity of feedback systems combining physical and software components demands tools for stability analysis and control design that can systematically handle such a complex combination. To avoid the issues due to approximating the dynamics of a system, in numerous settings it is mandatory to keep the system dynamics as pure as possible and to be able to design feedback controllers that can cope with flow and jump behavior in the system.

Modeling Hybrid Dynamical Control Systems
[Fig. 1 shows a feedback loop: a plant with output $y = h_P(z,u)$, connected through A/D and D/A interfaces (with ZOH), a network, and distributed sensors and actuators, to a controller $\kappa(\xi)$, with environmental disturbances acting on the plant.]

Hybrid Dynamical Systems, Feedback Control of, Fig. 1 A hybrid control system: a feedback system with a plant, controller, and interfaces/signal conditioners (along with environmental disturbances) as subsystems featuring variables that flow and, at times, jump
$$\mathcal{H}_P : \begin{cases} \dot z \in F_P(z,u), & (z,u) \in C_P \\ z^+ \in G_P(z,u), & (z,u) \in D_P \\ \ y = h_P(z,u) \end{cases} \tag{1}$$

where $z$ is the state of the plant and takes values in the Euclidean space $\mathbb{R}^{n_P}$, $u$ is the input and takes values in $\mathbb{R}^{m_P}$, $y$ is the output and takes values in the output space $\mathbb{R}^{r_P}$, and $(C_P, F_P, D_P, G_P, h_P)$ is the data of the hybrid system. The set $C_P$ is the flow set, the set-valued map $F_P$ is the flow map, the set $D_P$ is the jump set, the set-valued map $G_P$ is the jump map, and the single-valued map $h_P$ is the output map. (This hybrid inclusion captures the dynamics of (constrained or unconstrained) continuous-time systems when $D_P = \emptyset$ and $G_P$ is arbitrary. Similarly, it captures the dynamics of (constrained or unconstrained) discrete-time systems when $C_P = \emptyset$ and $F_P$ is arbitrary. Note that while the output inclusion does not explicitly include a constraint on $(z,u)$, the output map is only evaluated along solutions.)

Given an input $u$, a solution to a hybrid inclusion is defined by a state trajectory $\phi$ that satisfies the inclusions. Both the input and the state trajectory are functions of $(t,j) \in \mathbb{R}_{\geq 0} \times \mathbb{N} := [0,\infty) \times \{0,1,2,\dots\}$, where $t$ keeps track of the amount of flow, while $j$ counts the number of jumps of the solution. These functions are given by hybrid arcs and hybrid inputs, which are defined on hybrid time domains. More precisely, hybrid time domains are subsets $E$ of $\mathbb{R}_{\geq 0} \times \mathbb{N}$ such that, for each $(T,J) \in E$, $E \cap ([0,T] \times \{0,1,\dots,J\})$ can be written in the form

$$\bigcup_{j=0}^{J-1} [t_j, t_{j+1}] \times \{j\}$$

for some finite sequence of times $0 = t_0 \leq t_1 \leq t_2 \leq \dots \leq t_J$. A hybrid arc $\phi$ is a function on a hybrid time domain. The set $E \cap ([0,T] \times \{0,1,\dots,J\})$ defines a compact hybrid time domain since it is bounded and closed. The hybrid time domain of $\phi$ is denoted by $\operatorname{dom}\phi$. A hybrid arc is such that, for each $j \in \mathbb{N}$, $t \mapsto \phi(t,j)$ is absolutely continuous on intervals of flow $I^j := \{t : (t,j) \in \operatorname{dom}\phi\}$ with nonzero Lebesgue measure. A hybrid input $u$ is a function on a hybrid time domain such that, for each $j \in \mathbb{N}$, $t \mapsto u(t,j)$ is Lebesgue measurable and locally essentially bounded on the interval $I^j$.

In this way, a solution to the plant $\mathcal{H}_P$ is given by a pair $(\phi, u)$ with $\operatorname{dom}\phi = \operatorname{dom} u$ ($= \operatorname{dom}(\phi,u)$) satisfying

(S0) $(\phi(0,0), u(0,0)) \in \overline{C}_P$ or $(\phi(0,0), u(0,0)) \in D_P$, and $\operatorname{dom}\phi = \operatorname{dom} u$;

(S1) For each $j \in \mathbb{N}$ such that $I^j$ has nonempty interior $\operatorname{int}(I^j)$, we have $(\phi(t,j), u(t,j)) \in C_P$ for all $t \in \operatorname{int}(I^j)$ and

$$\frac{d}{dt}\phi(t,j) \in F_P(\phi(t,j), u(t,j)) \quad \text{for almost all } t \in I^j$$
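To make the hybrid time domain bookkeeping concrete, the following sketch (our illustration, not part of the original entry) simulates the classic bouncing ball as a hybrid system with state $z = (\text{height}, \text{velocity})$, flow set $C = \{z : z_1 \geq 0\}$, flow map $\dot z = (z_2, -g)$, jump set $D = \{z : z_1 \leq 0, z_2 < 0\}$, and jump map $z^+ = (0, -\lambda z_2)$; solutions are indexed by hybrid time $(t,j)$, with $t$ frozen at jumps.

```python
import numpy as np

def simulate_bouncing_ball(z0, t_end=5.0, dt=1e-4, lam=0.8, g=9.81):
    """Simulate the hybrid system on the hybrid time domain (t, j):
    flow  z' = (z2, -g)        while z in C = {z1 >= 0},
    jump  z+ = (0, -lam*z2)    when  z in D = {z1 <= 0, z2 < 0}."""
    z = np.array(z0, dtype=float)
    t, j = 0.0, 0
    traj = [(t, j, z.copy())]
    while t < t_end:
        if z[0] <= 0.0 and z[1] < 0.0:   # z in D: jump
            z = np.array([0.0, -lam * z[1]])
            j += 1                        # j advances, t is frozen
        else:                             # z in C: flow (Euler step)
            z = z + dt * np.array([z[1], -g])
            t += dt
        traj.append((t, j, z.copy()))
    return traj

traj = simulate_bouncing_ball([1.0, 0.0])
num_jumps = traj[-1][1]
```

Each bounce increments $j$ while $t$ stands still, which is exactly the $[t_j, t_{j+1}] \times \{j\}$ structure of a hybrid time domain; the restitution coefficient $\lambda < 1$ makes the jumps dissipative.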
particular, due to the hybrid controller's state including timers that persistently evolve within a bounded time interval and logic variables that take values from discrete sets. Denoting by $\mathcal{A}$ the set of points to stabilize for the closed-loop system $\mathcal{H}$ and by $|\cdot|_{\mathcal{A}}$ the distance to this set, the following property captures the typically desired properties outlined above. A closed set $\mathcal{A}$ is said to be:

(S) Stable: for each $\varepsilon > 0$ there exists $\delta > 0$ such that each maximal solution $\phi$ to $\mathcal{H}$ with $\phi(0,0) = x_0$, $|x_0|_{\mathcal{A}} \leq \delta$ satisfies $|\phi(t,j)|_{\mathcal{A}} \leq \varepsilon$ for all $(t,j) \in \operatorname{dom}\phi$.

(A) Attractive: there exists $\mu > 0$ such that every maximal solution $\phi$ to $\mathcal{H}$ with $\phi(0,0) = x_0$, $|x_0|_{\mathcal{A}} \leq \mu$ is bounded and, if it is complete, satisfies $\lim_{(t,j)\in\operatorname{dom}\phi,\ t+j\to\infty} |\phi(t,j)|_{\mathcal{A}} = 0$.

(AS) Asymptotically stable: it is stable and attractive.

The basin of attraction of an asymptotically stable set $\mathcal{A}$ is the set of points from which the attractivity property holds. The set $\mathcal{A}$ is said to be globally asymptotically stable when the basin of attraction is the entire state space.

A dynamical system with assigned inputs is said to be detectable when its output being held to zero implies that its state converges to the origin. A similar property can be defined for hybrid dynamical systems. For the closed-loop system $\mathcal{H}$, given sets $\mathcal{A}$ and $K$, the distance to $\mathcal{A}$ is 0-input detectable relative to $K$ for $\mathcal{H}$ if every complete solution $\phi$ to $\mathcal{H}$ satisfies

$$\phi(t,j) \in K \ \ \forall (t,j) \in \operatorname{dom}\phi \ \implies\ \lim_{(t,j)\in\operatorname{dom}\phi,\ t+j\to\infty} |\phi(t,j)|_{\mathcal{A}} = 0$$

where "$\phi(t,j) \in K$" captures the "output being held to zero" property in the usual detectability notion.

Feedback Control Design for Hybrid Dynamical Systems

Several methods for the design of a hybrid controller $\mathcal{H}_K$ rendering a given set asymptotically stable are given below. At the core of these methods are sufficient conditions in terms of Lyapunov functions guaranteeing that the asymptotic stability property defined in section "Definitions and Notions" holds. Some of the methods presented below exploit such sufficient conditions when applied to the closed-loop system $\mathcal{H}$, while others exploit the properties of the hybrid plant to design controllers with a particular structure. The design methods are presented in order of complexity of the controller, namely, from a static state-feedback law to a generic algorithm with true hybrid dynamics.

CLF-Based Control Design

In simple terms, a control Lyapunov function (CLF) is a regular enough scalar function that decreases along solutions to the system for some values of the unassigned input. When such a function exists, it is very tempting to exploit its properties to construct an asymptotically stabilizing control law. Following ideas from the literature on continuous-time and discrete-time nonlinear systems, we define control Lyapunov functions for hybrid plants $\mathcal{H}_P$ and present results on CLF-based control design. For simplicity, as mentioned in the input and output modeling remark in section "Definitions and Notions," we use inputs $u_c$ and $u_d$ instead of $u$. Also, we restrict the discussion to sets $\mathcal{A}$ that are compact as well as hybrid plants with $F_P$, $G_P$ single valued and such that $h_P(z,u) = z$. For notational convenience, we use $\Pi$ to denote the "projection" of $C_P$ and $D_P$ onto $\mathbb{R}^{n_P}$, i.e., $\Pi(C_P) = \{z : \exists u_c \text{ s.t. } (z,u_c) \in C_P\}$ and $\Pi(D_P) = \{z : \exists u_d \text{ s.t. } (z,u_d) \in D_P\}$, and the set-valued maps $\Psi_c(z) = \{u_c : (z,u_c) \in C_P\}$ and $\Psi_d(z) = \{u_d : (z,u_d) \in D_P\}$.

Given a compact set $\mathcal{A}$, a continuously differentiable function $V : \mathbb{R}^{n_P} \to \mathbb{R}$ is a control Lyapunov function for $\mathcal{H}_P$ with respect to $\mathcal{A}$ if there exist $\alpha_1, \alpha_2 \in \mathcal{K}_\infty$ and a continuous, positive definite function $\rho$ such that

$$\alpha_1(|z|_{\mathcal{A}}) \leq V(z) \leq \alpha_2(|z|_{\mathcal{A}}) \quad \forall z \in \mathbb{R}^{n_P}$$

$$\inf_{u_c \in \Psi_c(z)} \langle \nabla V(z), F_P(z,u_c) \rangle \leq -\rho(|z|_{\mathcal{A}})$$
$$\Gamma_c(z,u_c,r) := \begin{cases} \langle \nabla V(z), F_P(z,u_c) \rangle + \dfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_c) \in C_P \cap (\mathcal{I}(r) \times \mathbb{R}^{m_{P,c}}) \\ -\infty & \text{otherwise} \end{cases}$$

and, for each $(z,u_d)$ and $r \geq 0$, the function

$$\Gamma_d(z,u_d,r) := \begin{cases} V(G_P(z,u_d)) - V(z) + \dfrac{1}{2}\rho(|z|_{\mathcal{A}}) & \text{if } (z,u_d) \in D_P \cap (\mathcal{I}(r) \times \mathbb{R}^{m_{P,d}}) \\ -\infty & \text{otherwise} \end{cases}$$

The following result states conditions on the data of $\mathcal{H}_P$ guaranteeing that, for each $r > 0$, there exists a continuous state-feedback law $z \mapsto \kappa(z) = (\kappa_c(z), \kappa_d(z))$ rendering the compact set

$$\mathcal{A}_r := \{z \in \mathbb{R}^{n_P} : V(z) \leq r\}$$

asymptotically stable. This property corresponds to a practical version of asymptotic stabilizability.

Theorem 1 Given a hybrid plant $\mathcal{H}_P = (C_P, F_P, D_P, G_P, h_P)$, a compact set $\mathcal{A}$, and a control Lyapunov function $V$ for $\mathcal{H}_P$ with respect to $\mathcal{A}$, if

(C1) $C_P$ and $D_P$ are closed sets, and $F_P$ and $G_P$ are continuous;

(C2) The set-valued maps $\Psi_c(z) = \{u_c : (z,u_c) \in C_P\}$ and $\Psi_d(z) = \{u_d : (z,u_d) \in D_P\}$ are lower semicontinuous with convex values;

(C3) For every $r > 0$, we have that, for every $z \in \Pi(C_P) \cap \mathcal{I}(r)$, the function $u_c \mapsto \Gamma_c(z,u_c,r)$ is convex on $\Psi_c(z)$ and that, for every $z \in \Pi(D_P) \cap \mathcal{I}(r)$, the function $u_d \mapsto \Gamma_d(z,u_d,r)$ is convex on $\Psi_d(z)$;

then, for every $r > 0$, the compact set $\mathcal{A}_r$ is asymptotically stabilizable for $\mathcal{H}_P$ by a state-feedback law $z \mapsto \kappa(z) = (\kappa_c(z), \kappa_d(z))$ with $\kappa_c$ continuous on $\Pi(C_P) \cap \mathcal{I}(r)$ and $\kappa_d$ continuous on $\Pi(D_P) \cap \mathcal{I}(r)$.

Theorem 1 assures the existence of a continuous state-feedback law practically asymptotically stabilizing $\mathcal{A}$. However, Theorem 1 does not provide an expression of an asymptotically stabilizing control law. The following result provides an explicit construction of such a control law.

Theorem 2 Given a hybrid plant $\mathcal{H}_P = (C_P, F_P, D_P, G_P, h_P)$, a compact set $\mathcal{A}$, and a control Lyapunov function $V$ for $\mathcal{H}_P$ with respect to $\mathcal{A}$, if (C1)–(C3) in Theorem 1 hold then, for every $r > 0$, the state-feedback law pair

$$\kappa_c : \Pi(C_P) \to \mathbb{R}^{m_{P,c}}, \qquad \kappa_d : \Pi(D_P) \to \mathbb{R}^{m_{P,d}}$$

defined on $\Pi(C_P)$ and $\Pi(D_P)$ as

$$\kappa_c(z) := \arg\min \{|u_c| : u_c \in \mathcal{T}_c(z)\} \quad \forall z \in \Pi(C_P) \cap \mathcal{I}(r)$$

$$\kappa_d(z) := \arg\min \{|u_d| : u_d \in \mathcal{T}_d(z)\} \quad \forall z \in \Pi(D_P) \cap \mathcal{I}(r)$$
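A minimal sketch of the pointwise min-norm selection of Theorem 2, for an assumed scalar flow-only plant $\dot z = z + u$ with CLF $V(z) = z^2/2$ relative to $\mathcal{A} = \{0\}$ and margin $\rho(s) = s^2$ (all of these choices are ours, for illustration); the feasible set $\mathcal{T}_c(z)$ is approximated on a finite input grid:

```python
import numpy as np

# Toy data: flow-only plant zdot = z + u, CLF V(z) = z^2/2 w.r.t. A = {0},
# decrease margin rho(s) = s^2 (illustrative choices, not from the entry).
def gamma_c(z, u):
    """Gamma_c for this toy plant: <dV/dz, z + u> + rho(|z|)/2."""
    return z * (z + u) + 0.5 * z**2

def kappa_min_norm(z, u_grid):
    """Min-norm selection: smallest |u| on the grid with Gamma_c(z,u) <= 0."""
    feasible = [u for u in u_grid if gamma_c(z, u) <= 0.0]
    return min(feasible, key=abs)

u_grid = np.linspace(-10.0, 10.0, 4001)   # grid approximation of Psi_c(z)
z = 2.0
u = kappa_min_norm(z, u_grid)
```

For this plant the constraint reads $\tfrac{3}{2}z^2 + zu \leq 0$, so the min-norm choice is $\kappa_c(z) = -\tfrac{3}{2}z$ (here $u \approx -3$), and the closed loop $\dot z = -z/2$ decays as required.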
respectively, renders the compact set $\mathcal{A}_r$ asymptotically stable for $\mathcal{H}_P$, where $\mathcal{T}_c(z) = \Psi_c(z) \cap \{u_c : \Gamma_c(z,u_c,V(z)) \leq 0\}$ and $\mathcal{T}_d(z) = \Psi_d(z) \cap \{u_d : \Gamma_d(z,u_d,V(z)) \leq 0\}$. Furthermore, if the set-valued maps $\Psi_c$ and $\Psi_d$ have a closed graph, then $\kappa_c$ and $\kappa_d$ are continuous on $\Pi(C_P) \cap \mathcal{I}(r)$ and $\Pi(D_P) \cap \mathcal{I}(r)$, respectively.

The stability properties guaranteed by Theorems 1 and 2 are practical. Under further properties, similar results hold when the input $u$ is not partitioned into $u_c$ and $u_d$. To achieve asymptotic stability (or stabilizability) of $\mathcal{A}$ with a continuous state-feedback law, extra conditions are required to hold near the compact set, which, for the case of stabilization of continuous-time systems, are the so-called small control properties. Furthermore, the continuity of the feedback law assures that the closed-loop system has closed flow and jump sets as well as continuous flow and jump maps, which, in turn, due to the compactness of $\mathcal{A}$, implies that the asymptotic stability property is robust. Robustness follows from results for hybrid systems without inputs.

Passivity-Based Control Design

Dissipativity and its special case, passivity, provide a useful physical interpretation of a feedback control system as they characterize the exchange of energy between the plant and its controller. For an open system, passivity (in its purest form) is the property that the energy stored in the system is no larger than the energy it has absorbed over a period of time. The energy stored in a system is given by the difference between the initial and final energy over a period of time, where the energy function is typically called the storage function. Hence, conveniently, passivity can be expressed in terms of the derivative of a storage function (i.e., the rate of change of the internal energy) and the product between inputs and outputs (i.e., the system's power flow). Under further observability conditions, this power inequality can be employed as a design tool by selecting a control law that makes the rate of change of the internal energy negative. This method is called passivity-based control design.

The passivity-based control design method can be employed in the design of a controller for a "passive" hybrid plant $\mathcal{H}_P$, in which energy might be dissipated during flows, jumps, or both. Passivity notions and a passivity-based control design method for hybrid plants are given next. Since the form of the plant's output plays a key role in asserting a passivity property, and this property may not necessarily hold both during flows and jumps, as mentioned in the input and output modeling remark in section "Definitions and Notions," we define outputs $y_c$ and $y_d$, which, for simplicity, are assumed to be single valued: $y_c = h_c(x)$ and $y_d = h_d(x)$. Moreover, we consider the case when the dimension of the space of the inputs $u_c$ and $u_d$ coincides with that of the outputs $y_c$ and $y_d$, respectively, i.e., a "duality" of the output and input spaces.

Given a compact set $\mathcal{A}$ and functions $h_c$, $h_d$ such that $h_c(\mathcal{A}) = h_d(\mathcal{A}) = 0$, a hybrid plant $\mathcal{H}_P$ for which there exists a continuously differentiable function $V : \mathbb{R}^{n_P} \to \mathbb{R}_{\geq 0}$ satisfying, for some functions $\omega_c : \mathbb{R}^{m_{P,c}} \times \mathbb{R}^{n_P} \to \mathbb{R}$ and $\omega_d : \mathbb{R}^{m_{P,d}} \times \mathbb{R}^{n_P} \to \mathbb{R}$,

$$\langle \nabla V(z), F_P(z,u_c) \rangle \leq \omega_c(u_c,z) \quad \forall (z,u_c) \in C \tag{5}$$

$$V(G_P(z,u_d)) - V(z) \leq \omega_d(u_d,z) \quad \forall (z,u_d) \in D \tag{6}$$

is said to be passive with respect to a compact set $\mathcal{A}$ if

$$(u_c,z) \mapsto \omega_c(u_c,z) = u_c^\top y_c \tag{7}$$

$$(u_d,z) \mapsto \omega_d(u_d,z) = u_d^\top y_d \tag{8}$$

The function $V$ is the so-called storage function. If (5) holds with $\omega_c$ as in (7), and (6) holds with $\omega_d \equiv 0$, then the system is called flow-passive, i.e., the power inequality holds only during flows. If (5) holds with $\omega_c \equiv 0$, and (6) holds with $\omega_d$ as in (8), then the system is called jump-passive, i.e., the energy of the system decreases only during jumps.

Under additional detectability properties, these passivity notions can be used to design static output feedback controllers. The following result gives two design methods for hybrid plants.
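A numerically checkable instance of these notions (our example, in the spirit of the impact systems studied by Naldi and Sanfelice 2013): an oscillator with restitution-type jumps is flow-passive with storage $V(z) = (z_1^2 + z_2^2)/2$ and flow output $y_c = z_2$, and the static output feedback $u_c = -k\,y_c$ then dissipates the stored energy along flows.

```python
import numpy as np

# Illustrative plant: flow z' = (z2, -z1 + uc) with output yc = z2,
# jump z+ = (z1, -lam*z2) with no input.  Storage V(z) = (z1^2 + z2^2)/2:
#   <gradV, F(z,uc)> = uc*yc                 -> flow passivity, (5) with (7)
#   V(G(z)) - V(z) = (lam^2 - 1)*z2^2/2 <= 0 -> (6) with omega_d = 0
lam = 0.8
V = lambda z: 0.5 * (z[0]**2 + z[1]**2)

def flow_power(z, uc):          # <gradV, F(z, uc)>
    return z[0] * z[1] + z[1] * (-z[0] + uc)

def jump_change(z):             # V(G(z)) - V(z)
    return V((z[0], -lam * z[1])) - V(z)

def closed_loop_flow(z0, k=2.0, dt=1e-3, steps=5000):
    """Passivity-based damping uc = -k*yc, so Vdot = -k*z2^2 <= 0."""
    z = np.array(z0, float)
    for _ in range(steps):
        uc = -k * z[1]
        z = z + dt * np.array([z[1], -z[0] + uc])
    return z

z_end = closed_loop_flow([1.0, 0.0])
```

The feedback only uses the measured output $y_c$, which is the appeal of passivity-based design; the detectability condition mentioned above is what lets the energy decrease be turned into convergence.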
$\mathcal{A}$. For notational convenience, we define $x = (z, \eta, \tau, k)$,

$$C = \big\{x : (z, \kappa_c(\eta, z, r(\tau,k))) \in C_P,\ \tau \in [t_k^r, t_{k+1}^r],\ (\eta, z, r(\tau,k)) \in C_K\big\}$$

$$F(z,\eta,\tau,k) = \big(F_P(z, \kappa_c(\eta,z,r(\tau,k))),\ F_K(\eta,z,r(\tau,k)),\ 1,\ 0\big)$$

$$D = \big\{x : (z, \kappa_c(\eta,z,r(\tau,k))) \in D_P,\ (\tau,k) \in T_r\big\} \cup \big\{x : \tau \in [t_k^r, t_{k+1}^r),\ (\eta,z,r(\tau,k)) \in D_K\big\}$$

$$G_1(z,\eta,\tau,k) = \big(G_P(z, \kappa_c(\eta,z,r(\tau,k))),\ \eta,\ \tau,\ k+1\big)$$

$$G_2(z,\eta,\tau,k) = \big(z,\ G_K(\eta,z,r(\tau,k)),\ \tau,\ k\big)$$

Theorem 4 Given a complete reference trajectory $r : \operatorname{dom} r \to \mathbb{R}^{n_P}$ and associated tracking set $\mathcal{A}$ in (9), if there exists a hybrid controller $\mathcal{H}_K$ guaranteeing that

(1) The jumps of $r$ and $\mathcal{H}_P$ occur simultaneously;

(2) There exist a function $V : \mathbb{R}^{n_P} \times \mathbb{R}^{n_K} \times \mathbb{R}_{\geq 0} \times \mathbb{N} \to \mathbb{R}$ that is continuously differentiable; functions $\alpha_1, \alpha_2 \in \mathcal{K}_\infty$; and continuous, positive definite functions $\rho_1, \rho_2, \rho_3$ such that

(a) For all $(z,\eta,\tau,k) \in C \cup D \cup G_1(D) \cup G_2(D)$

$$\alpha_1(|(z,\eta,\tau,k)|_{\mathcal{A}}) \leq V(z,\eta,\tau,k) \leq \alpha_2(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(b) For all $(z,\eta,\tau,k) \in C$ and all $\xi \in F(z,\eta,\tau,k)$

$$\langle \nabla V(z,\eta,\tau,k), \xi \rangle \leq -\rho_1(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(c) For all $(z,\eta,\tau,k) \in D_1$ and all $\xi \in G_1(z,\eta,\tau,k)$

$$V(\xi) - V(z,\eta,\tau,k) \leq -\rho_2(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

(d) For all $(z,\eta,\tau,k) \in D_2$ and all $\xi \in G_2(z,\eta,\tau,k)$

$$V(\xi) - V(z,\eta,\tau,k) \leq -\rho_3(|(z,\eta,\tau,k)|_{\mathcal{A}})$$

then $\mathcal{A}$ is globally asymptotically stable.

Theorem 4 imposes that the jumps of the plant and of the reference trajectory occur simultaneously. Though restrictive at times, this property can be enforced by proper design of the controller.

Summary and Future Directions

Advances over the last decade on modeling and robust stability of hybrid dynamical systems (without control inputs) have paved the road for the development of systematic methods for the design of control algorithms for hybrid plants. The results selected for this short expository entry, along with recent efforts on multimode/logic-based control, event-based control, and backstepping, which were not covered here, contribute to that long-term goal. Future research directions include the development of more powerful tracking control design methods, state observers, and optimal controllers for hybrid plants.

Cross-References

▶ Lyapunov's Stability Theory
▶ Output Regulation Problems in Hybrid Systems
▶ Stability Theory for Hybrid Dynamical Systems

Bibliography

Set-Valued Dynamics and Variational Analysis:
Aubin J-P, Frankowska H (1990) Set-valued analysis. Birkhäuser, Boston
Rockafellar RT, Wets RJ-B (1998) Variational analysis. Springer, Berlin/Heidelberg

Modeling and Stability:
Branicky MS (2005) Introduction to hybrid systems. In: Handbook of networked and embedded control systems. Springer, New York, pp 91–116
Haddad WM, Chellaboina V, Nersesov SG (2006) Impulsive and hybrid dynamical systems: stability,
dissipativity, and control. Princeton University Press, Princeton
Goebel R, Sanfelice RG, Teel AR (2012) Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton
Lygeros J, Johansson KH, Simić SN, Zhang J, Sastry SS (2003) Dynamical properties of hybrid automata. IEEE Trans Autom Control 48(1):2–17
van der Schaft A, Schumacher H (2000) An introduction to hybrid dynamical systems. Lecture notes in control and information sciences. Springer, London

Control:
Biemond JJB, van de Wouw N, Heemels WPMH, Nijmeijer H (2013) Tracking control for hybrid systems with state-triggered jumps. IEEE Trans Autom Control 58(4):876–890
Forni F, Teel AR, Zaccarian L (2013) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(6):1470–1485
Lygeros J (2005) An overview of hybrid systems control. In: Handbook of networked and embedded control systems. Springer, New York, pp 519–538
Naldi R, Sanfelice RG (2013) Passivity-based control for hybrid systems with applications to mechanical systems exhibiting impacts. Automatica 49(5):1104–1116
Sanfelice RG (2013a) On the existence of control Lyapunov functions and state-feedback laws for hybrid systems. IEEE Trans Autom Control 58(12):3242–3248
Sanfelice RG (2013b) Control of hybrid dynamical systems: an overview of recent advances. In: Daafouz J, Tarbouriech S, Sigalotti M (eds) Hybrid systems with constraints. Wiley, Hoboken, pp 146–177
Sanfelice RG, Biemond JJB, van de Wouw N, Heemels WPMH (2013, to appear) An embedding approach for the design of state-feedback tracking controllers for references with jumps. Int J Robust Nonlinear Control

Hybrid Observers

Abstract

[...] available in the literature related to the observability and observer design for different classes of hybrid systems are introduced.

Keywords

Hybrid systems; Observer design; Observability; Switching systems

Introduction

Observer design, which is used to estimate the unmeasured plant state, has received a lot of attention since the late '60s. One of the first leading contributions to clearly formalize the estimation problem and propose a solution in the linear case is due to Luenberger (1966). The recipe to implement a Luenberger-type observer for a continuous-time linear system described by

$$\dot x = Ax + Bu, \quad y = Cx + Du, \tag{1}$$

with $x \in \mathbb{R}^n$, $u \in \mathbb{R}^p$, $y \in \mathbb{R}^m$, $A \in \mathbb{R}^{n\times n}$, $B \in \mathbb{R}^{n\times p}$, $C \in \mathbb{R}^{m\times n}$, and $D \in \mathbb{R}^{m\times p}$, has three main ingredients: the system data, the correction term commonly referred to as output injection, and the observability/detectability/determinability conditions. A Luenberger-type observer for (1), which consists in a copy of the (system data) dynamics (1) with a linear correction term $L(y - \hat y)$, is given by [...]

[...] is not already Hurwitz, exists if the pair $(A, C)$ is detectable or (sufficient condition) observable. The observer in (2) exploits the injection term only in the continuous-time dynamics (flow map), and one may ask how profitable resets of the observer state (jump map) could be when designing a hybrid observer.

The observer design for hybrid systems is a relatively new area of research, and results are consolidated only for a few classes of linear hybrid systems.

In section "Continuous-Time Plants," a hybrid redesign of the observer (2) is discussed first and then a more general design for nonlinear systems is introduced, whereas in section "Systems with Flows and Jumps" recent results related to observability and observer design for hybrid systems are discussed. Conclusions are given in section "Summary and Future Directions."

Hybrid Observers: Different Strategies

The community of researchers working on hybrid observers, which is a quite recent area and the subject of growing interest, is wide, and a unique formal definition/notation has not been reached yet. This fact is strictly related to the large number of different hybrid system models currently adopted by researchers. To render this short presentation as simple as possible, we let the state $x(t)$ of a hybrid system be driven by the flow map (differential equation) when $t \neq t_j$ and by the jump map (difference equation) when $t = t_j$, with $x(t)$ right continuous, i.e., $\lim_{t \to t_j^+} x(t) = x(t_j)$.

[...] the linear correction term $K(t_j^-)\big(y(t_j^-) - C\hat x(t_j^-)\big)$ at jump times, yielding

$$\dot{\hat x}(t) = A\hat x(t) + Bu(t) + L\big(y(t) - C\hat x(t)\big), \tag{3a}$$

$$\hat x(t_j) = \hat x(t_j^-) + K(t_j^-)\big(y(t_j^-) - C\hat x(t_j^-)\big), \tag{3b}$$

where $t_0 = 0$, $t_{j+1} - t_j = T > 0$, $j \in \mathbb{N}$, and $T$ is a parameter that defines the interval between resets and has to be chosen such that

$$\operatorname{Im}(\lambda_p - \lambda_r)\,T \neq 2\pi k, \quad k \in \mathbb{Z}\setminus\{0\}, \tag{4}$$

for each pair $(\lambda_p, \lambda_r)$ of complex eigenvalues of the matrix $A - LC$. This preserves the (continuous-time, or flow) observability of the system (1) when sampled at the time instants $t_j$ and allows one to select a matrix $K_0$ such that $(I - K_0 C)\exp((A-LC)T)$ has all its eigenvalues at zero. Then, the estimation error $e(t)$ converges to zero in finite time ($nT$) if (1) is observable and the matrix $K(t) : \mathbb{R}_{\geq 0} \to \mathbb{R}^{n\times q}$ is selected as $K(t) = K_0$ if $t \leq t_n$ and $K(t) = 0$ otherwise. It is important to note that the state reset (3b) yields a hybrid estimation error system given by

$$\dot e(t) = (A - LC)\, e(t), \tag{5a}$$

$$e(t_j) = \big(I - K(t_j^-)\,C\big)\, e(t_j^-). \tag{5b}$$

The stability property of the origin can be easily deduced by noting that

$$e(t_j) = \prod_{k=1}^{j} \big(I - K(t_k^-)\,C\big) \,[\dots]$$
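The finite-time mechanism behind (3)–(5) can be sketched numerically (our construction; the matrices are chosen for illustration): pick $L$ so that $A - LC$ is Hurwitz, then choose $K_0$ so that $(I - K_0 C)\,e^{(A-LC)T}$ is nilpotent, which drives the error (5) to zero after $n$ resets.

```python
import numpy as np
from scipy.linalg import expm

# Illustrative plant: double integrator with position measurement.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
C = np.array([[1.0, 0.0]])
L = np.array([[2.0],
              [1.0]])                  # A - L@C has eigenvalues -1, -1
T = 0.5                                # reset period

M = expm((A - L @ C) * T)              # flow of the error (5a) over one period
# Choose K0 = [k1, k2]' so that P = (I - K0@C) @ M is nilpotent:
# det(P) = (1 - k1)*det(M) = 0 -> k1 = 1;  trace(P) = 0 -> k2 = M[1,1]/M[0,1].
K0 = np.array([[1.0],
               [M[1, 1] / M[0, 1]]])
P = (np.eye(2) - K0 @ C) @ M           # inter-reset error transition map

e = np.array([[3.0], [-1.0]])          # initial estimation error
for _ in range(2):                     # after n = 2 resets ...
    e = P @ e                          # ... the error is (numerically) zero
```

Since $P^2 = 0$, the error dies after two resets, i.e., the finite-time ($nT$) convergence with $n = 2$ described above.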
proposed in Prieur et al. (2012) allows to limit is the sample-data (discrete-time) version of (6)
the peaking phenomena for a class of high-gain with sampling time T D tj 1 tj . Then, it
observers opportunely resetting its (augmented) state. Moreover, when the output of (1) is a nonlinear function of the state, y = h(x), with h(·) not invertible (e.g., the saturation function), it would be possible to rewrite (1) as a hybrid system with linear flow map and augmented state, designing a hybrid observer as in Carnevale and Astolfi (2009).

Nonlinear Case
When the input of a continuous-time plant is piecewise-constant, the hybrid observer proposed in Moraal and Grizzle (1995), exploiting sampled measurements, can be successfully applied for a class of nonlinear continuous (or discrete-time) systems

   dx/dt = f(x(t), u(t)),   y(t) = h(x(t), u(t)),   (6)

with sufficiently smooth maps f(·,·) and h(·,·) and where

   x(t_j) = F(x(t_{j-1}), u(t_{j-1})).   (7)

It is possible to define a hybrid observer of the following type:

   dx̂(t)/dt = f(x̂(t), u(t)),   (8a)
   x̂(t_j) = γ(y(t_j), x̂(t_j^-), η(t_j)),   (8b)

where the reset map γ and the dynamics of the new variable η(t) have to be properly defined. The main idea in Moraal and Grizzle (1995) is that the Newton method, in continuous and discrete time, can be used to estimate the value of ξ that renders zero the function

   W_j^N(ξ) = Y_j^N - H(ξ, U_j^N),   (9)

where U_j^N = [u'(t_{j-N+1}), ..., u'(t_j)]' and Y_j^N = [y'(t_{j-N+1}), ..., y'(t_j)]' are the sampled input and output vectors, respectively, and H : R^n × R^{mN} → R^N maps the state x(t_j) and the N-tuple of control inputs U_j^N into the output vector Y_j^N, i.e., H(x(t_j), U_j^N) = Y_j^N, and is defined as

   H(x, U_j^N) = [ h(F^{-1}(F^{-1}(...), u(t_{j-N+1})), u(t_{j-N+1}));
                   ...;
                   h(F^{-1}(x, u(t_{j-1})), u(t_{j-1}));
                   h(x, u(t_j)) ],   (10)

where F^{-1} shortly represents the inverse of the map F, such that x(t_{j-1}) = F^{-1}(x(t_j), u(t_{j-1})). The system (6)–(7) is said to be N-observable, for some N ≥ 1 (the generic selection is N = 2n + 1), when W_j^N(ξ) = 0 holds only if ξ = x(t_j), uniformly in U_j^N. Then, under certain technical assumptions (see Moraal and Grizzle 1995) related to the derivatives of f and h and the invertibility of the Jacobian matrix J(x) = ∂H(x)/∂x, it is possible to select

   dξ(t)/dt = k J(ξ(t))^{-1} (Y_j^N - H(ξ(t), U_j^N)),   (11a)
   ξ(t_j) = F(ξ(t_j^-), u(t_{j-1})),   (11b)

with a sufficiently high gain k > 0 and the reset map γ(·) = F(ξ(t_j^-), u(t_{j-1})). Note that (11a) is commonly referred to as the Newton flow. This approach could be easily extended to other
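As a hedged illustration of the discrete-time Newton estimation behind (9)–(11), the sketch below applies a few Gauss–Newton iterations to a toy sampled system. The system, its numbers, and the forward-simulation parameterization of H (which avoids the explicit inverse map F^{-1}) are assumptions for illustration only, not taken from Moraal and Grizzle (1995):

```python
import numpy as np

# Toy sampled system (assumed for illustration): a pendulum-like map
T = 0.1
def F(x, u):                       # state transition x(t_{j+1}) = F(x(t_j), u)
    return np.array([x[0] + T * x[1], x[1] + T * (-np.sin(x[0]) + u)])

def h(x):                          # scalar output
    return x[0]

def H(xi, U):
    """Stack N outputs by forward simulation from the window-start state xi;
    this re-parameterizes Eq. (10) for code, instead of iterating F^-1."""
    x, ys = np.array(xi, float), []
    for u in U:
        ys.append(h(x))
        x = F(x, u)
    return np.array(ys), x          # x is the implied current state

def newton_estimate(xi, U, Y, iters=20, eps=1e-6):
    """Newton (Gauss-Newton) iterations on W(xi) = Y - H(xi, U) = 0, cf. (11a)."""
    for _ in range(iters):
        W = Y - H(xi, U)[0]
        J = np.column_stack([                       # numerical Jacobian of H
            (H(xi + eps * e, U)[0] - H(xi - eps * e, U)[0]) / (2 * eps)
            for e in np.eye(len(xi))])
        xi = xi + np.linalg.pinv(J) @ W
    return xi

# N = 5 >= 2n + 1 samples for n = 2 states
U = [np.sin(0.3 * k) for k in range(5)]
x, Y = np.array([0.5, -0.2]), []
for u in U:
    Y.append(h(x))
    x = F(x, u)
xi_hat = newton_estimate(np.zeros(2), U, np.array(Y))
x_now = H(xi_hat, U)[1]             # estimate of the current state
```

With exactly consistent data, the iterations recover the window-start state (and hence the current state) to numerical precision; in practice convergence relies on the N-observability and Jacobian-invertibility assumptions stated above.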
544 Hybrid Observers
and C = [0, 1, 0]. Evidently the flow is not observable in the classic sense, given that O_flow = [C', (CA)', (CA^2)']' is not full rank, and the flow-unobservable subspace …

With y = h(x, u), a frequent choice is to consider the hybrid observer of the form

   dx̂(t)/dt = f(x̂, u) + l(y, x̂, u),   (14a)
   x̂(t_j) = g(x̂(t_j^-), u(t_j)) + m(x̂(t_j^-), u(t_j)),   (14b)

where the maps l, g, and m are designed so that the origin of the estimation error system is GES. In this case, the standard observability condition for the couple (A, C) is required.

The estimation of the discrete state is related to the system current-location observation tree. This graph is iteratively explored at each new input to determine the node associated with the current value of q(t). Then, a linear (switched) Luenberger-type observer for the estimation of the state z, assuming a minimum dwell-time and observability of each pair (A_q, C_q), is proposed.

Summary and Future Directions

Observer design and observability properties of general hybrid systems are an active field of research, and a number of different results have been proposed, although not yet consolidated as for classical linear systems. The results are based on different notations and definitions for hybrid systems. Efforts to provide a unified approach, in many cases considering the general framework for hybrid systems proposed in Goebel et al. (2009), are being pursued by the scientific community to improve the consistency and cohesion of the general results. Observer designs, observability properties, and a separation principle even with linear flow and jump maps are not yet completely characterized and, in the nonlinear case, only a few works have been proposed (see Teel (2010)), providing open challenges for the scientific community.

Cross-References

Hybrid Dynamical Systems, Feedback Control of
Observer-Based Control
Observers for Nonlinear Systems
Observers in Linear Systems Theory

Bibliography

Ahrens JH, Khalil HK (2009) High-gain observers in the presence of measurement noise: a switched-gain approach. Automatica 45(5):936–943
Babaali M, Pappas GJ (2005) Observability of switched linear systems in continuous time. In: Morari M, Thiele L (eds) Hybrid systems: computation and control. Lecture notes in computer science, vol 3414. Springer, Berlin/Heidelberg, pp 103–117
Balluchi A, Benvenuti L, Benedetto MDD, Vincentelli ALS (2002) Design of observers for hybrid systems. In: Hybrid systems: computation and control, Stanford. Lecture notes in computer science, vol 2289
Biyik E, Arcak M (2006) A hybrid redesign of Newton observers in the absence of an exact discrete-time model. Syst Control Lett 55(8):429–436
Branicky MS (1998) Multiple Lyapunov functions and other analysis tools for switched and hybrid systems. IEEE Trans Autom Control 43(5):475–482
Carnevale D, Astolfi A (2009) Hybrid observer for global frequency estimation of saturated signals. IEEE Trans Autom Control 54(13):2461–2464
Forni F, Teel A, Zaccarian L (2013) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(8):1470–1485
Goebel R, Sanfelice R, Teel AR (2009) Hybrid dynamical systems. IEEE Control Syst Mag 29:28–93
Heemels WPMH, Camlibel MK, Schumacher J, Brogliato B (2011) Observer-based control of linear complementarity systems. Int J Robust Nonlinear Control 21(13):1193–1218. Special issue on hybrid systems
Juloski AL, Heemels WPMH, Weiland S (2007) Observer design for a class of piecewise linear systems. Int J Robust Nonlinear Control 17(15):1387–1404
Liu Y (1997) Switching observer design for uncertain nonlinear systems. IEEE Trans Autom Control 42(12):1699–1703
Luenberger DG (1966) Observers for multivariable systems. IEEE Trans Autom Control 11:190–197
Martinelli F, Menini L, Tornambè A (2004) Observability, reconstructibility and observer design for linear mechanical systems unobservable in absence of impacts. J Dyn Syst Meas Control 125:549
Moraal P, Grizzle J (1995) Observer design for nonlinear systems with discrete-time measurements. IEEE Trans Autom Control 40(3):395–404
Prieur C, Tarbouriech S, Zaccarian L (2012) Hybrid high-gain observers without peaking for planar nonlinear systems. In: 2012 IEEE 51st annual conference on decision and control (CDC), Maui, pp 6175–6180
Raff T, Allgöwer F (2008) An observer that converges in finite time due to measurement-based state updates. In: Proceedings of the 17th IFAC world congress, COEX, South Korea, vol 17, pp 2693–2695
Sassano M, Carnevale D, Astolfi A (2011) Extremum seeking-like observer for nonlinear systems. In: 18th IFAC world congress, Milano, vol 18, pp 1849–1854
Tanwani A, Shim H, Liberzon D (2013) Observability for switched linear systems: characterization and observer design. IEEE Trans Autom Control 58(5):891–904
Teel A (2010) Observer-based hybrid feedback: a local separation principle. In: American control conference (ACC), 2010, Baltimore, pp 898–903
Vidal R, Chiuso A, Soatto S, Sastry S (2003) Observability of linear hybrid systems. In: Maler O, Pnueli A (eds) Hybrid systems: computation and control. Lecture notes in computer science, vol 2623. Springer, Berlin/Heidelberg, pp 526–539
Identification and Control of Cell Populations

Mustafa Khammash¹ and J. Lygeros²
¹Department of Biosystems Science and Engineering, Swiss Federal Institute of Technology at Zurich (ETHZ), Basel, Switzerland
²Automatic Control Laboratory, Swiss Federal Institute of Technology Zurich (ETHZ), Zurich, Switzerland

Abstract

We explore the problem of identification and control of living cell populations. We describe how de novo control systems can be interfaced with living cells and used to control their behavior. Using computer-controlled light pulses in combination with a genetically encoded light-responsive module and a flow cytometer, we demonstrate how in silico feedback control can be configured to achieve precise and robust set point regulation of gene expression. We also outline how external control inputs can be used in experimental design to improve our understanding of the underlying biochemical processes.

Keywords

Extrinsic variability; Heterogeneous populations; Identification; Intrinsic variability; Population control; Stochastic biochemical reactions

Introduction

Control systems, particularly those that employ feedback strategies, have been used successfully in engineered systems for centuries. But natural feedback circuits evolved in living organisms much earlier, as they were needed for regulating the internal milieu of the early cells. Owing to modern genetic methods, engineered feedback control systems can now be used to control biological systems in real time, much like they control any other process. The challenges of controlling living organisms are unique. To be successful, suitable sensors must be used to measure the output of a single cell (or a sample of cells in a population), actuators are needed to effect control action at the cellular level, and a controller that connects the two should be suitably designed. As a model-based approach is needed for effective control, methods for identification of models of cellular dynamics are also needed. In this entry, we give a brief overview of the problem of identification and control of living cells. We discuss the dynamic model that can be used, as well as the practical aspects of selecting sensors and actuators. The control systems can either be realized on a computer (in silico feedback) or through genetic manipulations (in vivo feedback). As an example, we describe how de novo control systems can be interfaced with living cells and used to control their behavior. Using computer-controlled light pulses in combination with a genetically encoded light-responsive module and a flow cytometer, we demonstrate how in silico feedback control can
the information that particular experiments contain about the model parameters; an approximation of the Fisher information based on the first two moments was derived in Komorowski et al. (2011), and an improved estimate using correction terms based on moments up to order 4 was derived in Ruess et al. (2013). Once an estimate of the Fisher information matrix is available, one can design experiments to maximize the information gained about the parameters of the model. The resulting optimization problem (over an appropriate parametrization of the space of possible experiments) is again challenging but can be approached by randomized optimization methods.

Control of Cell Populations

There are two control strategies that one can implement. The control systems can be realized on a computer, using real-time measurements from the cell population to be controlled. These cells must be equipped with actuators that respond to the computer signals that close the feedback loop. We will refer to this as in silico feedback. Alternatively, one can implement the sensors, actuators, and control system in their entirety within the machinery of the living cells. At least in principle, this can be achieved through genetic manipulation techniques that are common in synthetic biology. We shall refer to this type of control as in vivo feedback. Of course, some combination of the two strategies can be envisioned. In vivo feedback is generally more difficult to implement, as it involves working within the noisy, uncertain environment of the cell and requires implementations that are biochemical in nature. Such controllers will work autonomously and are heritable, which could prove advantageous in some applications. Moreover, coupled with intercellular signaling mechanisms such as quorum sensing, in vivo feedback may lead to tighter regulation (e.g., reduced variance) of the cell population. On the other hand, in silico controllers are much easier to program, debug, and implement and can have much more complex dynamics than would be possible with in vivo controllers. However, in silico controllers require a setup that maintains contact with all the cells to be controlled and cannot independently control large numbers of such cells. In this entry we focus exclusively on in silico controllers.

The Actuator

There could be several ways to send actuating signals into living cells. One consists of chemical inducers that the cells respond to either through receptors outside the cell or through translocation of the inducer molecules across the cellular membrane. The chemical signal captured by these inducers is then transduced to affect gene expression. Another approach, which we will describe here, is induction through light. There are several light systems that can be used. One of these includes a light-sensitive protein called phytochrome B (PhyB). When red light of wavelength 650 nm is shined on PhyB in the presence of the phycocyanobilin (PCB) chromophore, it is activated. In this active state it binds to another protein, Pif3, with high affinity, forming the PhyB-Pif3 complex. If a far-red light (730 nm) is then shined, PhyB is deactivated and it dissociates from Pif3. This can be exploited for controlling gene expression as follows: PhyB is fused to a GAL4 binding domain (GAL4BD), which then binds to DNA at a specific site just upstream of the gene of interest. Pif3 in turn is fused to a GAL4 activating domain (GAL4AD). Upon red light induction, the Pif3-Gal4AD complex is recruited to PhyB, where Gal4AD acts as a transcription factor to initiate gene expression. After far-red light is shined, the dissociation of the GAL4BD-PhyB complex from Pif3-Gal4AD means that Gal4AD no longer activates gene expression, and the gene is off. This way, one can control gene expression – at least in open loop.

The Sensor

To measure the output protein concentration in cell populations, a fluorescent protein tag is needed. This tag can be fused to the protein of interest, and the fluorescence intensity emanating from each cell is a direct measure of the protein concentration in that cell. There are several technologies for measuring fluorescence of cell
[Figure: schematic of the PhyB/Pif3 light switch – red (650 nm) and far-red (730 nm) pulses toggling a Gal4BD/Gal4AD–Gal1 UAS construct driving a YFP fluorescent reporter, and light sequences applied to a yeast population]

Identification and Control of Cell Populations, Fig. 1  Top figure: shows a yeast cell whose gene expression can be induced by light; red light turns on gene expression, while far-red turns it off. Bottom figure: each input light sequence can be applied to a culture of light-responsive yeast cells, resulting in a corresponding gene expression pattern that is measured by flow cytometry. By applying multiple carefully chosen light input test sequences and looking at their corresponding gene expression patterns, a dynamic model of gene expression can be identified
populations. While fluorimeters measure the overall intensity of a population, flow cytometry and microscopy can measure the fluorescence of each individual cell in a population sample at a given time. This provides a snapshot measurement of the probability density function of the protein across the population. Repeated measurements over time can be used as a basis for model identification (Fig. 1).

The Control System

Equipped with sensors, actuators, and a model identified with the methods outlined above, one can proceed to design control algorithms to regulate the behavior of living cells. Even though moment equations lead to models that look like conventional ordinary differential equations, from a control theory point of view, cell population systems offer a number of challenges. Biochemical processes, especially genetic regulation, are often very slow, with time constants of the order of tens of minutes. This suggests that pure feedback control without some form of preview may be insufficient. Moreover, due to our incomplete understanding of the underlying biology, the available models are typically inaccurate, or even structurally wrong. Finally, the control signals are often unconventional; for example, for the light control system outlined above, experimental limitations imply that the system must be controlled using discrete light pulses, rather than continuous signals.

Fortunately, advances in control theory allow one to effectively tackle most of these challenges. The availability of a model, for example, enables the use of model predictive control methods that introduce the necessary preview into the feedback process. The presence of unconventional inputs may make the resulting optimization problems difficult, but the slow dynamics work in our favor,
[Figure: closed-loop setup – cell population kept in a dark vessel with an electronically controlled light source; output measurement by flow cytometry; in silico feedback implementing a Kalman filter plus model predictive controller; measured YFP fluorescence distributions (density/cdf vs. log GFP intensity) shown at 0–5 h]

Identification and Control of Cell Populations, Fig. 2  Architecture of the closed-loop light control system. Cells are kept in darkness until they are exposed to light pulse sequences from the in silico feedback controller. Cell culture samples are passed to the flow cytometer, whose output is fed back to the computer, which implements a Kalman filter plus a model predictive controller. The objective of the control is to have the mean gene expression level follow a desired set value
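A heavily simplified caricature of the Fig. 2 loop is sketched below: a scalar surrogate model of mean expression, a scalar Kalman filter, and a predictive controller that enumerates discrete light-pulse sequences. All constants and the model itself are assumptions for illustration; the actual implementation (Milias-Argeitis et al. (2011)) is far richer:

```python
import itertools
import numpy as np

# Assumed scalar surrogate: mean fluorescence x, light pulse u in {0, 1}
a, b, c = 0.95, 0.5, 1.0      # slow first-order gene-expression dynamics
q, r = 0.01, 0.09             # process / measurement noise variances

def kalman_update(xhat, p, u, y):
    """One predict/correct step of a scalar Kalman filter."""
    xhat, p = a * xhat + b * u, a * p * a + q        # predict
    k = p * c / (c * p * c + r)                      # gain
    return xhat + k * (y - c * xhat), (1 - k * c) * p

def mpc_pulse(xhat, ref, horizon=6):
    """Enumerate all binary pulse sequences (inputs are discrete light
    pulses, not continuous signals) and apply the first move of the best."""
    best_u, best_cost = 0, np.inf
    for seq in itertools.product((0, 1), repeat=horizon):
        x, cost = xhat, 0.0
        for u in seq:
            x = a * x + b * u
            cost += (x - ref) ** 2
        if cost < best_cost:
            best_u, best_cost = seq[0], cost
    return best_u

# Closed loop: regulate mean expression to a set point of 7
rng = np.random.default_rng(0)
x, xhat, p, ref = 0.0, 0.0, 1.0, 7.0
for _ in range(200):
    u = mpc_pulse(xhat, ref)
    x = a * x + b * u + rng.normal(0, q ** 0.5)      # "plant"
    y = c * x + rng.normal(0, r ** 0.5)              # cytometry measurement
    xhat, p = kalman_update(xhat, p, u, y)
```

The estimator supplies the preview-capable controller with a state estimate between sparse, noisy measurements, and the controller duty-cycles the pulses so that the mean settles near the set point.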
[Figure: two panels of average fluorescence fold change vs. time (min, 0–400) for set points 7 (left, open loop OL vs. closed loop CL) and 5 (right, experiments CL1–CL4)]

Identification and Control of Cell Populations, Fig. 3  Left panel: the closed-loop control strategy (orange) enables set point tracking, whereas an open-loop strategy (green, precomputed input model with no feedback) does not. Right panel: four different experiments, each with a different initial condition (random pulses for t < 0; feedback control for t ≥ 0). Closed-loop control turned on at time t = 0 shows that tracking can be achieved regardless of initial condition. (See Milias-Argeitis et al. (2011))
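The qualitative message of Fig. 3 can be reproduced with a toy scalar model: under a plant/model mismatch, a precomputed (open-loop) input misses the set point, while feedback does not. All numbers here are assumed for illustration:

```python
import numpy as np

# Assumed toy dynamics x+ = a*x + b*u, with the model's gain wrong
a, b_model, b_plant, ref = 0.9, 0.5, 0.35, 7.0

# Open loop: steady input computed from the *model*, applied blindly
u_ol = ref * (1 - a) / b_model
x = 0.0
for _ in range(300):
    x = a * x + b_plant * u_ol
x_open = x            # settles at ref * b_plant / b_model, not at ref

# Closed loop: simple integral feedback on the measured error
x, u = 0.0, 0.0
for _ in range(300):
    u += 0.2 * (ref - x)              # integral action
    x = a * x + b_plant * u
x_closed = x          # settles at the set point despite the wrong gain
```

Integral action removes the steady offset caused by the gain error, which is the essence of why the closed-loop runs in Fig. 3 track the set point regardless of initial condition while the open-loop run does not.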
Cross-References

Deterministic Description of Biochemical Networks
Modeling of Dynamic Systems from First Principles
Stochastic Description of Biochemical Networks
System Identification: An Overview

Industrial MPC of Continuous Processes

Abstract

Model predictive control (MPC) has become the standard for implementing constrained, multivariable control of industrial continuous processes. These are processes which are designed to operate around nominal steady-state values and include many of the important processes found in the refining and chemical industries. The following provides an overview
Keywords
Introduction
Industrial MPC of Continuous Processes, Fig. 2  Predictive control approach

[Figure: past/future timeline at sample k showing the reference trajectory y^ref_t|k, past outputs y_t, corrected prediction ŷ_t|k, open-loop prediction ŷ⁰_t|k, disturbance estimate d_k|k, and future input moves û_t|k over the instants k−1, k, k+1, k+M−1, k+T_ss, k+P]
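The prediction mechanics sketched in Fig. 2 can be illustrated with a step-response (dynamic-matrix) model. The fragment below is a minimal, unconstrained, single-variable sketch with assumed coefficients; industrial packages add the constraints and steady-state optimization discussed in this entry. The `lam` term is the move-suppression penalty:

```python
import numpy as np

# Assumed plant: step-response coefficients of a stable first-order process
s = np.array([1 - 0.8 ** k for k in range(1, 31)])   # s_1 .. s_30
P, M, lam = 30, 4, 0.1            # prediction/control horizons, move suppression

# Dynamic matrix: future outputs = A @ future moves (plus the free response)
A = np.zeros((P, M))
for j in range(M):
    A[j:, j] = s[: P - j]

e = np.ones(P) * 5.0              # future error (setpoint minus free response)
# Least-squares moves with move suppression on the input changes:
du = np.linalg.solve(A.T @ A + lam * np.eye(M), A.T @ e)
u_move = du[0]                    # only the first move is implemented
```

Only the first computed move is applied at each sample; the prediction is then corrected with the new measurement (the d_k|k term in Fig. 2) and the optimization repeats, which is the receding-horizon mechanism underlying the formulations that follow.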
and also Internal Model Control (IMC) (Garcia and Morari 1982), although these techniques did not solve an online optimization problem. Optimization-based control approaches became feasible for industrial applications due to (1) the slower sampling requirements of most industrial control problems (on the order of minutes) and (2) the hierarchical implementations in which MPC provides setpoints to lower-level PID controllers which execute on a much faster sample time (on the order of seconds or faster).

Although the basic ideas behind MPC remain, industrial MPC technology has changed considerably since the first formulations in the late 1970s. Qin and Badgwell (2003) describe the enhancements to MPC technology that occurred over the next 20-plus years until the late 1990s. Enhancements since that time are highlighted in Darby and Nikolaou (2012). These improvements to MPC reflect increases in computer processing capability and additional requirements of industry, which have led to increased functionality and tools/techniques to simplify implementation. A summary of the significant enhancements that have been made to industrial MPC is highlighted below.

Constraints: Posing input and output constraints as linear inequalities, expressed as a function of the future input sequence (Garcia and Morshedi 1986), and solved by a standard quadratic program or an iterative scheme which approximates one.

Two-Stage Formulations: Limitations of a single objective function led to two-stage formulations to handle MV degrees of freedom (constraint pushing) and steady-state optimization via a linear program (LP).

Integrators: In their native form, impulse and step response models can be applied only to stable systems (in which the impulse response model coefficients approach zero). Extensions to handle integrating variables included embedding a model of the difference of the integrating signal or integrating a fraction of the current prediction error into the future (implying an increasing |d_{k+j|k}| for j ≥ 1 in Fig. 2). The desired value of an integrator at steady state (e.g., zero slope) has been incorporated into two-stage formulations (see, e.g., Lee and Xiao 2000).

State Space Models: The first state space formulation of MPC, introduced in the late 1980s (Marquis and Broustail 1988), allowed MPC to be extended to integrating and unstable processes. It also made use of the Kalman filter, which provided additional capability to estimate plant states and unmeasured disturbances. Later, a state space MPC offering was developed based on an infinite horizon (for both control and prediction) (Froisy 2006). These state space approaches provided a connection back to unconstrained LQR theory.

Nonlinear MPC: The first applications of nonlinear MPC, which appeared in the 1990s, were based on neural net models. In these approaches, a linear dynamic model was combined with a neural net model that accounted for static nonlinearity (Demoro et al. 1997; Zhao et al. 2001). The late 1990s saw the introduction of an industrial nonlinear MPC based on first-principles models derived from differential mass and energy balances and reaction kinetic expressions, expressed in differential algebraic equation (DAE) form (Young et al. 2002). A process where nonlinear MPC is routinely applied is polymer manufacturing.

Identification Techniques: Multivariable prediction error techniques are now routinely used. More recently, industrial application of subspace identification methods has appeared, following the development of these algorithms in the 1990s. Subspace methods incorporate the correlation of output measurements in the identification of a multivariable state space model, which can be used directly in a state space MPC or converted to an impulse or step response model based MPC.

Testing Methods: The 1990s saw increased use of automatic testing methods to generate data for (linear) dynamic model identification using uncorrelated binary signals. Since 2000, closed-loop testing methods have received considerable attention.
where the minimization is performed over the future sequence of inputs Û = {û_{k|k}, û_{k+1|k}, ..., û_{k+M-1|k}}. The four terms in the objective function represent conflicting quadratic penalties (‖x‖²_A = x'Ax); the penalty matrices are almost always diagonal. The first term penalizes the error relative to a desired reference trajectory (cf. Fig. 2) originating at ŷ_{k|k} and terminating at a desired steady state, y_ss; the second term penalizes output constraint violations over the prediction horizon (constraint softening); the third term penalizes input deviations from a desired steady state, either manually specified or calculated. The fourth term penalizes input changes as a means of trading off output tracking and input movement (move suppression).

The above formulation applies to both linear and nonlinear MPC. For linear MPCs, except for state space formulations, there are no state equations, and the outputs in the dynamic model are a function of only past inputs, such as with the finite step response model.

When a steady-state optimizer is present in the MPC, it provides the steady-state targets for u_ss (in the third quadratic term) and y_ss (in the output reference trajectory). Consider the case of linear MPC with an LP as the steady-state optimizer. The LP is typically formulated as

   min_{Δu_ss}  c_u' Δu_ss + c_y' Δy_ss + q_+' s_+ + q_-' s_-

subject to the model equations

   Δy_ss = G_ss Δu_ss
   u_ss = u_{k-1} + Δu_ss
   y_ss = ŷ⁰_{k+T_ss|k} + Δy_ss
the output constraints

   y_min - s_- ≤ y_ss ≤ y_max + s_+

and the input constraints

   u_min ≤ u_ss ≤ u_max,
   -M Δu_max ≤ Δu_ss ≤ M Δu_max.

G_ss is formed from the gains of the linear dynamic model. Deviations outside minimum and maximum output limits (s_- and s_+, respectively) are penalized, which provides constraint softening in the event all outputs cannot be simultaneously controlled within limits. The weightings q_- and q_+ determine the relative priorities of the output constraints. The input constraints, expressed in terms of Δu_max, prevent targets from being passed to the dynamic optimization that cannot be achieved. The resulting solution – u_ss and y_ss – provides a consistent, achievable steady state for the dynamic MPC controller. Notice that for inputs, the steady-state delta is applied to the current value and, for outputs, the steady-state delta is applied to the steady-state prediction of the output without future moves, after correcting for the current model error (cf. Fig. 2). If a real-time optimizer is present, its outputs, which may be targets for CVs and/or MVs, are passed to the MPC steady-state optimizer and considered with other objectives but at lower weights or priorities.

Some additional differences or features found in industrial MPCs include:
1-norm formulations, where absolute deviations, instead of quadratic deviations, are penalized.
Use of zone trajectories or "funnels," with small or no penalty applied if predictions remain within the specified zone boundaries.
Use of a minimum-movement criterion in either the dynamic or steady-state optimizations, which only leads to MV movement when CV predictions go outside specified limits. This can provide controller robustness to modeling errors.
Multiobjective formulations, which solve a series of QP or LP problems instead of a single one and can be applied to the dynamic or steady-state optimizations. In these formulations, higher priority objectives are solved first, followed by lesser priority objectives, with the solution of the higher priority objectives becoming equality constraints in subsequent optimizations (Maciejowski 2002).

MPC Design

Key design decisions for a given application are the number of MPC controllers and the selection of the MVs, DVs, and CVs for each controller; however, design decisions are not limited to just the MPC layer. The design problem is one of deciding on the best overall structure for the MPC(s) and the regulatory controls, given the control objectives, expected constraints, qualitative knowledge of the expected disturbances, and robustness considerations. It may be that existing measurements are insufficient and additional sensors may be required. In addition, a measurement may not be updated on a time interval consistent with acceptable dynamic control, for example, laboratory measurements and process composition analyzers. In this case, a soft sensor, or inferential estimator, may need to be developed from temperature and pressure measurements.

MPC is frequently applied to a major plant unit, with the MVs selected based on their sensitivity to key unit CVs and plant economics. Decisions regarding the number and size of the MPCs for a given application depend on plant objectives, (expected) constraints, and also designer preferences. When the objective is to minimize energy consumption based on a fixed or specified feed rate, multiple smaller controllers can be used. In this situation, controllers are normally designed based on the grouping of MVs with the largest effect on the identified CVs, often leading to MPCs designed for individual sections of equipment, such as reactors and distillation columns. When the objective is to maximize feed (or certain products), larger controllers are normally designed, especially if there are multiple constraints that can limit plant throughput. The MPC steady-state LP or QP is ideally suited to solving the throughput maximization problem by utilizing all available MVs. The location of the most limiting constraints can impact the number of MPCs. If the major constraints are near the front end of the plant, one MPC can be designed
which connects these constraints with key MVs such as feed rates, and other MPCs designed for the rest of the plant. If the major constraints are located near the back of the plant, then a single MPC is normally considered; alternatively, an MPC cascade could be considered, although this is not a common practice across the industry (and often requires customization).

The feed maximization objective is a major reason why MPCs have become larger with the advances in computer processing capability. However, there is generally a higher requirement on model consistency for larger controllers due to the increased number of possible constraint sets against which the MPC can operate. A larger controller can also be harder to implement and understand. This is a reason why some practitioners prefer implementing smaller MPCs at the potential loss of benefits.

MPC Practice

An MPC project is typically implemented in the following sequence:
Pretest and preliminary MPC design
Plant testing
Model and controller development
Commissioning
These tasks apply whether the MPC is linear or nonlinear, but with some differences, primarily in model development and plant testing. In nonlinear MPC, key decisions are related to the model form and level of rigor. Note that with a fundamental model, lower-level PID loops must be included in the model if their dynamics are significant; this is in contrast to empirical modeling, where the dynamics of the PID loops are embedded in the plant responses. A fundamental model will typically require less plant testing and make use of historical operating data to fit certain model parameters such as heat transfer coefficients and reaction constants. Historical data and/or data from a validated nonlinear static model can also be used to develop nonlinear static models (e.g., neural net) to combine with empirical dynamic models. As mentioned earlier, most industrial applications continue to rely on empirical linear dynamic models, fit to data from a dedicated plant test. This will be the basis in the following discussion.

In the pretest phase of work, the key activity is one of determining the base-level regulatory controls for MPC, tuning these controls, and determining if the current plant instrumentation is adequate. It is common to retune a significant number of PID loops, with significant benefits often resulting from this step alone.

A range of testing approaches is used in plant testing for linear MPC, including both manual and automatic (computer-generated) test signal designs, most often in open loop but, increasingly, in closed loop. Most input testing continues to be based on uncorrelated signals, implemented either manually or from computer-generated random sequences. Model accuracy requirements dictate accuracy across a range of frequencies, which is achieved by varying the duration of the steps. Model identification runs are made throughout the course of a test to determine when model accuracy is sufficient and a test can be stopped.

In the next phase of work, modeling of the plant is performed. This includes constructing the overall MPC model from individual identification runs; for example, deciding which models are significant and judging the model characteristics (dead times, inverse response, settling time, gains) based on engineering/process and a priori knowledge. An important step is analyzing, and adjusting if necessary, the gains of the constructed model to ensure the model gains satisfy mass balances and gain ratios do not result in fictitious degrees of freedom (due to model errors) that the steady-state optimizer could exploit. Also included is the development of any required inferentials or soft sensors, typically based on multivariate regression techniques such as principal component regression (PCR), principal component analysis (PCA), and partial least squares (PLS), or sometimes based on a fundamental model.

During controller development, initial controller tuning is performed. This relates to establishing criteria for utilizing available degrees of freedom and setting control variable priorities. In addition, initial tuning values are established for
the dynamic control. Steady state responses corresponding to expected constraint scenarios are analyzed to determine if the controller behaves as expected, especially with respect to the steady-state changes in the manipulated variables.

Commissioning involves testing and tuning the controller against different constraint sets. It is not unusual to modify or revisit model decisions made earlier. In the worst case, control performance may be deemed unacceptable and the control engineer is forced to revisit earlier decisions such as the base level regulatory strategy or plant model quality, which would require re-testing and re-identification of portions of the plant model. The main commissioning effort typically takes place over a two to three week period, but can vary based on the size and model density of the controller. In reality, commissioning, or more accurately, controller maintenance, is an ongoing activity. It is important that the operating company have in-house expertise that can be used to answer questions ("why is the controller doing that?"), troubleshoot, and modify the controller to reflect new operating modes and constraint sets.

Future Directions

Likely future developments are expected to follow extensions of current approaches. Due to the success in automatic, closed-loop testing, one possibility is extending it to "dual" or "joint" control, where control and identification objectives are combined and allow the user to select how much the control (e.g., output variance) can be affected by test perturbation signals. Another is in formulating the plant test as a DOE (design of experiments) optimization problem that could, for example, target specific models or model parameters. In the identification area, extensions have started to appear which allow constraints to be imposed, for example, on dead-times or gains, thus allowing a priori knowledge to be used. Another important area that has seen recent emphasis, and in which more development can be expected, is monitoring and diagnosis, for example, detecting which submodels of MPC have become inaccurate and require re-identification.

As mentioned earlier, one of the advantages of state-space modeling is the inherent flexibility to model unmeasured disturbances (i.e., $d_{k+j|j}$, cf. Fig. 2); however, these have not found widespread use in industry. A useful enhancement would be a framework for developing and implementing improved estimators in a convenient and transparent manner that would be applicable to traditional FIR- and FSR-based MPCs.

In the area of nonlinear control, the use of hybrid modeling approaches has increased, for example, integrating known fundamental model relationships with neural net or linear time-varying dynamic models. The motivation is in reducing complexity and controller execution times. The use of hybrid techniques can be expected to further increase, especially if nonlinear control is to be applied more broadly to larger control problems. Even in situations where control with linear MPC is adequate, there may be benefits from the use of hybrid or fundamental models, even if the models are not directly used in the control calculation. The resulting model could be used offline in model development or online to update the linear MPC model. Benefits would come from reduced plant testing and in ensuring model consistency. In the longer term, one can foresee a more general modeling and control environment where the user would not have to be concerned with the distinction between linear and nonlinear models and would be able to easily incorporate known relationships into the controller model.

An area that has not received significant attention, but is suggested as worth pursuing, concerns MPC cascades. Most of the applications and research are based on a single MPC or multiple distributed MPCs. An MPC cascade would permit the lower MPC to run at a faster time period and allow the user to decide which degrees of freedom are to be used for higher level objectives, such as feed maximization.

Cross-References

Control Hierarchy of Large Processing Plants: An Overview
Control Structure Selection
Model-Based Performance Optimizing Control
Model-Predictive Control in Practice
Nominal Model-Predictive Control
Real-Time Optimization of Industrial Processes
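The gain-consistency step discussed in this entry (checking that the constructed model's steady-state gains satisfy mass balances and that gain ratios do not hand the steady-state optimizer fictitious degrees of freedom) can be sketched numerically. The gain matrix below is hypothetical, and screening for near-collinear gain directions via singular values is an illustrative check, not a procedure prescribed by the entry:

```python
# Sketch: screen a steady-state gain matrix for fictitious degrees of freedom.
# G is a hypothetical 3 CV x 3 MV gain matrix (NOT from this entry); row 2 is
# almost exactly twice row 1, i.e., the gain ratios are nearly collinear.
import numpy as np

G = np.array([
    [1.0, 0.50, 0.2],
    [2.0, 1.01, 0.4],
    [0.3, 0.80, 1.5],
])

_, svals, _ = np.linalg.svd(G)
cond = svals[0] / svals[-1]
for i, sv in enumerate(svals):
    print(f"sigma_{i} = {sv:.4f}")
print(f"condition number = {cond:.1f}")
# A tiny trailing singular value (large condition number) marks a direction
# the steady-state optimizer could "use" that the real plant does not offer.
```

A very small singular value relative to the largest one indicates a direction in CV–MV space that the model presents as independently controllable but which, within model error, is not; such directions are exactly what the optimizer can exploit.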
where $i \in \{1, 2, \ldots, L\} =: \mathcal{L}$ and $x_0$, $w^i_{[0,T-1]}$, $v^i_{[0,T-1]}$ are mutually independent random variables with specified probability distributions. Here, we use the notation $w_{[0,t]} := \{w_s, 0 \le s \le t\}$.

The DMs are allowed to exchange limited information: see Fig. 1. The information exchange is facilitated by an encoding protocol $\mathcal{E}$ which is a collection of admissible encoding functions described as follows. Let the information available to DM $i$ at time $t$ be

$$I^i_t = \{y^i_{[1,t]},\, u^i_{[1,t-1]},\, z^{i,j}_{[0,t]},\, z^{j,i}_{[0,t]},\, j \in \mathcal{L}\},$$

where $z^{i,j}_t$ takes values in $\mathcal{Z}^{i,j}_t$ and is the information variable transmitted from DM $i$ to DM $j$ at time $t$, generated with

$$z^i_t = \{z^{i,j}_t,\, j \in \mathcal{L}\} = E^i_t(I^i_{t-1}, u^i_{t-1}, y^i_t), \qquad (3)$$

and for $t = 0$, $z^i_0 = \{z^{i,j}_0,\, j \in \mathcal{L}\} = E^i_0(y^i_0)$. The control actions are generated with the control policies applied to this information, $u^i_t = \gamma^i_t(I^i_t)$.

$$C(R) := \inf_{\gamma, \mathcal{E}} \Big\{ E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u_t)\Big] \,:\, \tfrac{1}{T}\, R(z_{[0,T-1]}) \le R \Big\}.$$

We can define a dual function.

Definition 2 Given a decentralized control problem as above, the team rate-cost function $\mathcal{R} : \mathbb{R} \to \mathbb{R}$,

$$\mathcal{R}(C) := \inf_{\gamma, \mathcal{E}} \Big\{ \tfrac{1}{T}\, R(z_{[0,T-1]}) \,:\, E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u_t)\Big] \le C \Big\},$$

is called the communication complexity for this objective.

The above is a fixed-rate formulation for communication complexity, since for any two coder outputs, a fixed number of bits is used at any given time. One could also use variable-rate formulations. The variable-rate formulation exploits the probabilistic distribution of the system variables: see Cover and Thomas (1991).

The formulation here can be adjusted to include sequential (iterative) information exchange given a fixed ordering of actions, as opposed to a simultaneous (parallel) information exchange at any given time $t$: instead of (3), the encoder of DM $i$ at time $t$ may then also use the messages already transmitted at time $t$ by the DMs acting earlier in the given order.

Communication Complexity for Decentralized Dynamic Optimization

Let $\mathcal{E}^i = \{E^i_t,\, t \ge 0\}$ and $\gamma^i = \{\gamma^i_t,\, t \ge 0\}$. Under a team-encoding policy $\mathcal{E} = \{\mathcal{E}^1, \mathcal{E}^2, \ldots, \mathcal{E}^L\}$ and a team-control policy $\gamma = \{\gamma^1, \gamma^2, \ldots, \gamma^L\}$, let the induced cost be

$$E^{\gamma,\mathcal{E}}\Big[\sum_{t=0}^{T-1} c(x_t, u^1_t, u^2_t, \ldots, u^L_t)\Big], \qquad (4)$$

with

$$c(s, u^1, u^2) = (s - u^1)^2 + (s - u^2)^2.$$

Suppose that we wish to compute the minimum expected cost subject to a total rate of 2 bits that can be exchanged. Under a sequential scheme, if we allow DM 1 to encode $y^1$ to DM 2 with 1 bit, then a cost of 0 is achieved since DM 2 knows the relevant information that needs to be transmitted to DM 1, again with 1 bit: if $p = 0$, $x^1$ is the relevant random variable with an optimal policy $u^1 = u^2 = x^1$, and if $p = 1$, $x^2$ is relevant with an optimal policy $u^1 = u^2 = x^2$, and a cost of 0 is achieved. However, if the information exchange is parallel, then DM 2 does not know which state is the relevant one, and it can be shown that a cost of 0 cannot be achieved under any policy.

The formulation in Definition 1 can also be adjusted to allow for multiple rounds of communication per time stage. Having multiple rounds can enhance the performance for a class of team problems while keeping the total rate constant.

Communication Complexity in Decentralized Computation

where $|s(x,y)| = \sum_{i=1}^{t} |m_i|$ and $\mathcal{E}$ is a protocol which dictates the iterative encoding functions as in (5) and $\gamma$ is a decision policy.
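The sequential-versus-parallel team example discussed above lends itself to a brute-force check. The binary instantiation below is hypothetical (the entry does not fix the distributions or the observation structure): $x^1$, $x^2$, and $p$ are independent fair bits, DM 1 observes $x^1$, DM 2 observes $(p, x^2)$, and the relevant state is $s = x^1$ if $p = 0$ and $s = x^2$ otherwise.

```python
# Brute-force check of the sequential-vs-parallel claim in the two-DM team
# example, for a hypothetical binary instantiation (our own choice):
# cost c(s, u1, u2) = (s - u1)**2 + (s - u2)**2.
from itertools import product

states = list(product([0, 1], repeat=3))            # (p, x1, x2), prob 1/8 each

def relevant(p, x1, x2):
    return x1 if p == 0 else x2

# Sequential scheme, 2 bits total: DM 1 first sends m1 = x1; DM 2 (which knows
# p) can then compute the relevant value and send it back; both play u = s.
seq_cost = 0.0
for p, x1, x2 in states:
    m1 = x1
    m2 = m1 if p == 0 else x2                       # DM 2 relays the relevant value
    s = relevant(p, x1, x2)
    seq_cost += ((s - m2) ** 2 + (s - s) ** 2) / 8  # u1 = m2, u2 = s

# Parallel scheme, 1 bit each sent simultaneously: enumerate all encoders and
# give each DM the cost-minimizing decoder for its resulting information.
best_parallel = float("inf")
for f_tab in product([0, 1], repeat=2):             # m1 = f(x1)
    for g_tab in product([0, 1], repeat=4):         # m2 = g(p, x2)
        cost = 0.0
        for dm in (1, 2):
            groups = {}
            for p, x1, x2 in states:
                m1, m2 = f_tab[x1], g_tab[2 * p + x2]
                info = (x1, m2) if dm == 1 else (p, x2, m1)
                groups.setdefault(info, []).append(relevant(p, x1, x2))
            for vals in groups.values():            # best response per info state
                cost += min(sum((v - u) ** 2 for v in vals) for u in (0, 1)) / 8
        best_parallel = min(best_parallel, cost)

print("sequential cost:", seq_cost)
print("best parallel cost:", best_parallel)
```

Under these assumptions the enumeration confirms the claim: the two-bit sequential exchange reaches cost 0, while no pair of one-bit parallel encoders does.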
For such problems, obtaining good lower bounds is in general challenging. One lower bound for such problems is obtained through the following reasoning: A subset of the form $A \times B$, where $A$ and $B$ are subsets of $\{0,1\}^n$, is called an $f$-monochromatic rectangle if for every $x \in A$, $y \in B$, $f(x,y)$ is the same. It can be shown that given any finite message sequence $\{m_1, m_2, \ldots, m_t\}$, the set $\{(x,y) : s(x,y) = \{m_1, m_2, \ldots, m_t\}\}$ is an $f$-monochromatic rectangle. Hence, to minimize the number of messages, one needs to minimize the number of $f$-monochromatic rectangles, which has led to research in this direction. Upper bounds are typically obtained by explicit constructions. For a comprehensive review, see Kushilevitz and Nisan (2006).

For control systems, the discussion takes further aspects into account, including a design objective, system dynamics, and the uncertainty in the system variables.

where $R(\gamma, \mathcal{E}, \alpha, \beta, x_0)$ denotes the communication rate under the control and coding functions $\gamma, \mathcal{E}$, which satisfies the objectives given by the choices $\alpha, \beta$ and initial condition $x_0$.

Wong obtains a cut-set type lower bound: Given a fixed initial state, a lower bound is given by $2D(f)$, where $f$ is a function of the objective choices and $D(f)$ is a variation of $R(f)$ introduced in (6) with the additional property that both DMs know $f$ at the end of the protocol. An upper bound is established by the exchange of the initial states and objective functions, also taking into account signaling, that is, the communication through control actions, which is discussed further below in the context of stabilization. Wong and Baillieul (2012) consider a detailed analysis for a real-valued bilinear controlled decentralized system.
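The transcript-rectangle property stated above can be verified exhaustively for a small protocol. The concrete protocol below is our own illustrative choice (not from the entry): for EQUALITY on 3-bit strings, Alice first announces her whole input, then Bob announces the function value.

```python
# Exhaustive check that the set of input pairs sharing a protocol transcript
# is an f-monochromatic rectangle. Protocol (illustrative): Alice sends x,
# Bob replies with f(x, y); f = EQUALITY on n-bit strings.
from itertools import product
from collections import defaultdict

n = 3
inputs = list(product([0, 1], repeat=n))

def f(x, y):                        # EQUALITY
    return 1 if x == y else 0

def transcript(x, y):               # the full message sequence of the protocol
    return (x, f(x, y))             # m1 = x (n bits), m2 = f(x, y) (1 bit)

classes = defaultdict(list)
for x, y in product(inputs, inputs):
    classes[transcript(x, y)].append((x, y))

for pairs in classes.values():
    A = {x for x, _ in pairs}
    B = {y for _, y in pairs}
    assert set(pairs) == set(product(A, B))          # rectangle property
    assert len({f(x, y) for x, y in pairs}) == 1     # monochromatic
print(f"all {len(classes)} transcript classes are f-monochromatic rectangles")
```

Every transcript class comes out as a product set $A \times B$ on which $f$ is constant, which is exactly the structure the lower-bound argument exploits.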
Connections with Information Theory
with $d(l, m)$ denoting the graph distance (number of edges in a shortest path) between DM $l$ and DM $m$ in $G$, and $[x_i]$ denoting the subspace spanned by $x_i$. Furthermore, there exist stabilizing coding and control policies whose sum rate is arbitrarily close to this bound. When different Jordan blocks may admit repeated and possibly complex eigenvalues, variations of the result above are applicable. In the special case where there is a centralized controller which receives information from multiple sensors (under stabilizability and joint detectability), even in the presence of noise, to achieve asymptotic stability it suffices to have the average total rate be greater than $\sum_{|\lambda_i| > 1} \log_2(|\lambda_i|)$. The results above follow from Matveev and Savkin (2008) and Yüksel and Başar (2013). For the case with a single sensor, this result has been studied extensively in networked control (see the chapter on Quantized Control and Data Rate Constraints in the Encyclopedia).

Summary and Future Directions

In this text, we discussed the problem of communication complexity in networked control systems. Our analysis considered both cost minimization and controllability/reachability problems subject to information constraints. We also discussed the communication complexity in distributed computing as has been studied in the computer science community and provided a brief discussion on the information theoretic approaches for such problems together with structural results. There are many relevant open problems on structural results for optimal policies, explicit solutions, as well as nontrivial upper and lower bounds on the optimal performance.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Flocking in Networked Systems
Information-Based Multi-Agent Systems
Networked Control Systems: Estimation and Control over Lossy Networks
Quantized Control and Data Rate Constraints

Recommended Reading

The information exchange requirements for decentralized optimization depend also on the structural properties of the cost functional to be minimized. For a class of team problems, one might simply need to exchange a sufficient statistic needed for optimal solutions. For some problems, there may be no need for an exchange at all, if the sufficient statistics are already available, as in the case of mean field equilibrium problems when the number of decision makers is unbounded or very large for almost optimal solutions; see Huang et al. (2006) and Lasry and Lions (2007). In case there is no common probabilistic information, the problem considered becomes further involved. The consensus literature, under both Bayesian and non-Bayesian contexts, aims to achieve agreement on a class of system variables under information constraints: see, e.g., Tsitsiklis et al. (1986). Optimization under local interaction and sparsity constraints and various criteria have been investigated in a number of publications including Rotkowitz and Lall (2006). A review of the literature on norm-optimal control as well as optimal stochastic dynamic teams is provided in Mahajan et al. (2012). Tsitsiklis and Athans (1985) have observed that from a computational complexity viewpoint, obtaining optimal solutions for a class of such communication protocol design problems is non-tractable (NP-hard).

Even though obtaining explicit solutions for optimal coding and control results may be difficult, it is useful to obtain structural results on optimal coding and control policies, since one can reduce the search space to a smaller class of functions. For dynamic team problems, these typically follow from the construction of a controlled Markov chain (see Walrand and Varaiya 1983) and applying tools from stochastic control theory which obtain structural results on optimal coding and control policies (see Nayyar et al. 2013). Along these lines, for system (1)–(2), if the DMs can agree on a joint belief $P(x_t \in \cdot \mid I^i_t, i \in \mathcal{L})$ at every time stage, then the optimal cost that would be achieved under a centralized system could be attained (see Yüksel and Başar 2013). As a further important illustrative case, if the problem described in Definition 1 is for a real-time estimation problem for a Markov source, then the optimal causal fixed-rate coder minimizing any cost function uses only the last source symbol and the information at the controller's memory: see Witsenhausen (1979). We also note that the optimal design of information channels for optimization under information constraints is a non-convex problem; see Yüksel and Linder (2012) and Yüksel and Başar (2013) for a review of the literature and certain topological properties of the problem. We refer the reader to Nemirovsky and Yudin (1983) for a comprehensive resource on information complexity for optimization problems. A sequential setting with an information theoretic approach to the formulation of communication complexity has been considered in Raginsky and Rakhlin (2011). A formulation relevant to the one in Definition 1 has been considered in Teneketzis (1979) with mutual information constraints. Giridhar and Kumar (2006) discuss distributed computation for a class of symmetric functions under information constraints and present a comprehensive review.

Bibliography

Cover TM, Thomas JA (1991) Elements of information theory. Wiley, New York
Gamal AE, Kim YH (2012) Network information theory. Cambridge University Press, UK
Giridhar A, Kumar P (2006) Toward a theory of in-network computation in wireless sensor networks. IEEE Commun Mag 44:98–107
Huang M, Caines PE, Malhamé RP (2006) Large population stochastic dynamic games: closed-loop McKean–Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6:221–251
Kushilevitz E, Nisan N (2006) Communication complexity, 2nd edn. Cambridge University Press, New York
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229–260
Ma N, Ishwar P (2011) Some results on distributed source coding for interactive function computation. IEEE Trans Inf Theory 57:6180–6195
Mahajan A, Martins N, Rotkowitz M, Yüksel S (2012) Information structures in optimal decentralized control. In: IEEE conference on decision and control, Hawaii
Matveev AS, Savkin AV (2008) Estimation and control over communication networks. Birkhäuser, Boston
Nayyar A, Mahajan A, Teneketzis D (2013) The common-information approach to decentralized stochastic control. In: Como G, Bernhardsson B, Rantzer A (eds) Information and control in networks. Springer International Publishing, Switzerland
Nemirovsky A, Yudin D (1983) Problem complexity and method efficiency in optimization. Wiley-Interscience, New York
Orlitsky A, Roche JR (2001) Coding for computing. IEEE Trans Inf Theory 47:903–917
Raginsky M, Rakhlin A (2011) Information-based complexity, feedback and dynamics in convex programming. IEEE Trans Inf Theory 57:7036–7056
Rotkowitz M, Lall S (2006) A characterization of convex problems in decentralized control. IEEE Trans Autom Control 51:274–286
Teneketzis D (1979) Communication in decentralized control. PhD dissertation, MIT
Tsitsiklis J, Athans M (1985) On the complexity of decentralized decision making and detection problems. IEEE Trans Autom Control 30:440–446
Tsitsiklis J, Bertsekas D, Athans M (1986) Distributed asynchronous deterministic and stochastic gradient optimization algorithms. IEEE Trans Autom Control 31:803–812
Walrand JC, Varaiya P (1983) Optimal causal coding-decoding problems. IEEE Trans Inf Theory 29:814–820
Witsenhausen HS (1976) The zero-error side information problem and chromatic numbers. IEEE Trans Inf Theory 22:592–593
Witsenhausen HS (1979) On the structure of real-time source coders. Bell Syst Tech J 58:1437–1451
Wong WS (2009) Control communication complexity of distributed control systems. SIAM J Control Optim 48:1722–1742
Wong WS, Baillieul J (2012) Control communication complexity of distributed actions. IEEE Trans Autom Control 57:2731–2745
Yao ACC (1979) Some complexity questions related to distributive computing. In: Proceedings of the 11th annual ACM symposium on theory of computing, Atlanta
Yüksel S, Başar T (2013) Stochastic networked control systems: stabilization and optimization under information constraints. Birkhäuser, Boston
Yüksel S, Linder T (2012) Optimization and convergence of observation channels in stochastic control. SIAM J Control Optim 50:864–887
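The minimum-rate condition quoted in the stabilization discussion above (average total rate greater than $\sum_{|\lambda_i| > 1} \log_2 |\lambda_i|$) can be evaluated directly from the open-loop system matrix. The matrix below is hypothetical; only eigenvalues with magnitude greater than 1 contribute:

```python
# Minimum average total rate (bits per time step) for stabilizing a
# discrete-time linear system x_{t+1} = A x_t + ...: sum of log2|lambda_i|
# over the unstable eigenvalues. A is a hypothetical example matrix.
import numpy as np

A = np.array([
    [2.0, 1.0,  0.0],
    [0.0, 0.5,  0.0],
    [0.0, 0.0, -4.0],
])

eigs = np.linalg.eigvals(A)
rate = float(sum(np.log2(np.abs(l)) for l in eigs if np.abs(l) > 1))
print("eigenvalues:", np.round(eigs, 3))
print("minimum average total rate:", rate, "bits per time step")
```

Here the unstable eigenvalues are $2$ and $-4$, so the bound is $\log_2 2 + \log_2 4 = 3$ bits per time step; the stable eigenvalue $0.5$ does not contribute.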
Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions
Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 1 The evolution of control systems (open loop, feedback, digital control, decentralized control, networked control systems). Modern "networked control systems" (also called "cyber-physical systems") are decentralized and networked using communication channels.
Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 3 The Witsenhausen counterexample is a deceptively simple two-time-step two-controller decentralized control problem. The weak and the blurry controllers, $C_w$ and $C_b$, act in a sequential manner. (Block diagram: $x_0 \sim N(0, \sigma_0^2)$; $x_1 = x_0 + u_w$; $C_b$ observes $x_1 + z$, $z \sim N(0, 1)$; $x_2 = x_1 - u_b$; objective $\min\, k^2 E[u_w^2] + E[x_2^2]$.)

(Fig. 4 plot area: $x_1$ versus $x_0$, both axes from $-15$ to $15$, for $k^2 = 0.5$.)
Information Structures, the Witsenhausen Counterexample, and Communicating Using Actions, Fig. 4 The optimization solution of Baglietto et al. (1997) for $k^2 = 0.5$, $\sigma_0^2 = 5$. The information-theoretic strategy of "dirty-paper coding" (Costa 1983) also yields the same curve (Grover and Sahai 2010).

In the counterexample, two controllers (denoted here by $C_w$ for "weak" and $C_b$ for "blurry") act one after the other in two time-steps to minimize a quadratic cost function. The system state is denoted by $x_t$, where $t$ is the time index. $u_w$ and $u_b$ denote the inputs generated by the two controllers. The cost function is $k^2 E[u_w^2] + E[x_2^2]$ for some constant $k$. The initial state $x_0$ and the noise $z$ at the input of the blurry controller are assumed to be Gaussian distributed and independent, with variances $\sigma_0^2$ and 1 respectively. The problem is a "linear-quadratic-Gaussian" (LQG) problem, i.e., the state evolution is linear, the costs are quadratic, and the primitive random variables are Gaussian.

Why is the problem called a "counterexample"? The traditional "certainty-equivalence" principle (Bertsekas 1995) shows that for all centralized LQG problems, linear control laws are optimal. Witsenhausen (1968) provided a nonlinear strategy for the Witsenhausen problem which outperforms all linear strategies. Thus, the counterexample showed that the certainty-equivalence doctrine does not extend to decentralized control.

What is this strategy of Witsenhausen that outperforms all linear strategies? It is, in fact, a quantization-based strategy, as suggested in our inverted-pendulum story above. Further, it was shown by Mitter and Sahai (1999) that multipoint quantization strategies can outperform linear strategies by an arbitrarily large factor! This observation, combined with the simplicity of the counterexample, makes the problem very important in decentralized control. This simple two-time-step two-controller LQG problem needs to be understood to have any hope of understanding larger and more complex problems.

While the optimal costs for the problem are still unknown (even though it is known that an optimal strategy exists (Witsenhausen 1968; Wu and Verdú 2011)), there exists a wealth of understanding of the counterexample that has helped address more complicated problems. A body of work, starting with that of Baglietto et al. (1997), numerically obtained solutions that could be close to optimal (although there is no mathematical proof thereof). All these solutions have a consistent form (illustrated in Fig. 4), with slight improvements in the optimal cost. Because the discrete version of the problem, appropriately relaxed, is known to be NP-complete (Papadimitriou and Tsitsiklis 1986), this approach cannot be used to understand the entire parameter space and hence has focused on one point: $k^2 = 0.5$, $\sigma_0^2 = 5$. Nevertheless, the approach has proven to be insightful: a recent information-theoretic body of work shows that the strategies of Fig. 4 can be thought of as information-theoretic strategies of "dirty-paper coding" (Costa 1983) that is related to the idea of
embedding information in the state. The question here is: how do we embed the information about the state in the state itself?

An information-theoretic view of the counterexample: This information-theoretic approach that culminated in Grover et al. (2013) also obtained the first approximately optimal solutions to the Witsenhausen counterexample as well as its vector extensions. The result is established by analyzing information flows in the counterexample that work toward minimizing the knowledge gradient, effectively an information pattern in which $C_w$ can predict the observation of $C_b$ more precisely. The analysis provides an information-theoretic lower bound on cost that holds irrespective of what strategy is used. For the original problem, this characterizes the optimal costs (with associated strategies) within a factor of 8 for all problem parameters (i.e., $k$ and $\sigma_0^2$). For any finite-length extension, uniform finite-ratio approximations also exist (Grover et al. 2013). The asymptotically infinite-length extension has been solved exactly (Choudhuri and Mitra 2012).

The problem has also driven delineation of decentralized LQG control problems with optimal linear solutions and those with nonlinear optimal solutions. This led to the development and understanding of many variations of the counterexample (Bansal and Başar 1987; Başar 2008; Ho et al. 1978; Rotkowitz 2006) and understanding that can extend to larger decentralized control problems. More recent work shows that the promise of the Witsenhausen counterexample was not a misplaced one: the information-theoretic approach that provides approximately optimal solutions to the counterexample (Grover et al. 2013) yields solutions to other more complex (e.g., multi-controller, more time-steps) problems as well (Grover 2010; Park and Sahai 2012).

by the Witsenhausen counterexample. However, the nonclassical information pattern for some simple problems – starting with the counterexample – has recently been explored via an information-theoretic lens, yielding the first optimal or approximately optimal solutions to these problems. This approach is promising for larger decentralized control problems as well. It is now important to explore what is the simplest decentralized control problem that cannot be solved (exactly or approximately) using ideas developed for the counterexample. In this manner, the Witsenhausen counterexample can provide a platform to unify the more modern (i.e., external-channel centric) approaches (see Quantized Control and Data Rate Constraints; Data Rate of Nonlinear Control Systems and Feedback Entropy; Networked Control Systems: Architecture and Stability Issues; Networked Control Systems: Estimation and Control over Lossy Networks; Information and Communication Complexity of Networked Control Systems; in the encyclopedia) with the more classical decentralized LQG problems, leading to enriching and useful formulations.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Information and Communication Complexity of Networked Control Systems
Networked Control Systems: Architecture and Stability Issues
Networked Control Systems: Estimation and Control over Lossy Networks
Quantized Control and Data Rate Constraints
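The quantization-versus-linear comparison discussed in this entry can be checked by direct simulation. The sketch below is illustrative only: it evaluates the best affine first-stage strategy (with the LMMSE second stage) against one simple lattice-quantization strategy, at a small-$k$ operating point of our own choosing ($k^2 = 0.01$, $\sigma_0^2 = 25$) rather than the benchmark point of Fig. 4, and neither strategy is claimed optimal.

```python
# Monte Carlo comparison on the Witsenhausen counterexample:
# best affine first-stage strategy vs. a simple lattice-quantization strategy.
# Cost: k^2 E[u_w^2] + E[x_2^2], with x1 = x0 + u_w, y = x1 + z, x2 = x1 - u_b.
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
k2, var0 = 0.01, 25.0                        # illustrative small-k point (our choice)
x0 = rng.normal(0.0, np.sqrt(var0), n)
z = rng.normal(0.0, 1.0, n)

def cost(uw, ub_of_y):
    x1 = x0 + uw
    ub = ub_of_y(x1 + z)
    return k2 * np.mean(uw**2) + np.mean((x1 - ub) ** 2)

# Affine strategies: u_w = (lam - 1) x0, so x1 = lam x0; the optimal second
# stage for a Gaussian x1 is the LMMSE estimator of x1 from y.
def affine_cost(lam):
    g = lam**2 * var0 / (lam**2 * var0 + 1.0)
    return cost((lam - 1.0) * x0, lambda y: g * y)

best_affine = min(affine_cost(l) for l in np.linspace(-1.0, 2.0, 301))

# Quantization strategy: u_w snaps x0 to the nearest multiple of b; the
# second stage decodes y to the nearest lattice point.
b = 10.0
uw = np.round(x0 / b) * b - x0
quant = cost(uw, lambda y: np.round(y / b) * b)

print(f"best affine cost  ~ {best_affine:.3f}")
print(f"quantization cost ~ {quant:.3f}")
```

At this operating point the lattice strategy pays a small first-stage cost to make the second controller's decoding essentially noise-free, and it comes out well below every affine strategy on the grid, in the spirit of the Mitter and Sahai (1999) observation quoted above.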
forefront (Bullo et al. 2009). At the same time, machine intelligence, machine learning, machine autonomy, and theories of operation of mixed teams of humans and robots have considerably extended the intellectual frontiers of information-based multi-agent systems (Baillieul et al. 2012). A further important development has been the study of action-mediated communication and the recently articulated theory of control communication complexity (Wong and Baillieul 2012). These developments may shed light on nonverbal forms of communication among biological organisms (including humans) and on the intrinsic energy requirements of information processing.

In conventional decentralized control, the control objective is usually well-defined and known to all agents. Multi-agent information-based control encompasses a broader scenario, where the objective can be agent dependent and is not necessarily explicitly announced to all. For illustration, consider the power control problem in wireless communication – one of the earliest engineering systems that can be regarded as multi-agent information based. It is common that multiple transmitter-receiver communication pairs share the same radio frequency band and the transmission signals interfere with each other. The power control problem searches for feedback control for each transmitter to set its power level. The goal is for each transmitter to achieve a targeted signal-to-interference ratio (SIR) level by using information of the observed levels at the intended receiver only.

A popular version of the power control problem (Foschini and Miljanic 1993) defines each individual objective target level by means of a requirement threshold, known only to the intended transmitter. As SIR measurements naturally reside on a receiver, the observed SIR needs to be communicated back to the transmitter. For obvious reasons, the bandwidth for such communication is limited. The resulting model fits the bill of multi-agent information-based control. In Sung and Wong (1999), a tristate power control strategy is proposed so that the power control outputs are either increased or decreased by a fixed dB, or no change at all. Convergence of the feedback algorithm was shown using a Lyapunov-like function.

This entry surveys key topics related to multi-agent information-based control systems, including control complexity, control with data-rate constraints, robotic networks and formation control, action-mediated communication, and multi-objective distributed systems.

Control Complexity

In information-based distributed control systems, how to efficiently share computational and communication resources is a fundamental issue. One of the earliest investigations on how to schedule communication resources to support a network of sensors and actuators is discussed in Brockett (1995). The concept of communication sequencing was introduced to describe how the communication channel is utilized to convey feedback control information in a network consisting of interacting subsystems. In Brockett (1997), the concept of control attention was introduced to provide a measure of the complexity of a control law against its performance. As attention is a shared, limited resource, the goal is to find minimum attention control. Another approach to gauge control complexity in a distributed system is by means of the minimum amount of communicated data required to accomplish a given control task.

Control with Data-Rate Constraints

A fundamental challenge in any control implementation in which system components communicate with each other over communication links is ensuring that the channel capacity is large enough to deal with the fastest time constants among the system components. In a single agent system, the so-called Data-Rate Theorem has been formulated in various ways to understand the constraints imposed between the sensor and the controller and between the controller and the actuator. Extensions to this fundamental result have been focused on addressing similar
problems in the network control system context. Information on such extensions in the distributed control setting can be found in Nair and Evans (2004) and Yüksel and Başar (2007).

Robotic Networks and Formation Control

The defining characteristic of robotic networks within the larger class of multi-agent systems is the centrality of spatial relationships among network nodes. Graph theory has been shown to provide a generally convenient mathematical language in which to describe spatial concepts, and it is the key to understanding spatial rigidity related to the control of formations of autonomous vehicles (Anderson et al. 2008), or in flocking systems (Leonard et al. 2012), or in consensus problems (Su and Huang 2012), or in rendezvous problems (Cortés et al. 2006). For these distributed control research topics, readers can consult other sections in this Encyclopedia for a comprehensive reference list.

Much of the recent work on formation control has included information limitation considerations. For consensus problems, for example, Olfati-Saber and Murray (2004) introduced a sensing cost constraint, in Ren and Beard (2005) information exchange constraints are considered, and in Yu and Wang (2010) communication delays are explicitly modeled.

Action-Mediated Communication

Biological organisms communicate through motion. Examples of this include prides of lions or packs of wolves whose pursuit of prey is a cooperative effort, and competitive team athletics in the case of humans. Recent research has been aimed at developing a theoretical foundation of action-mediated communication. Communication protocols for motion-based signaling between mobile robots have been developed (Raghunathan and Baillieul 2009), and preliminary steps towards a theory of artistic expression through controlled movements in dance have been reported in Baillieul and Özcimder (2012). Motion-based communication of this type involves specially tailored motion description languages in which sequences of motion primitives are assembled with the objective of conveying artistic intent, while minimizing the use of limited energy resources in carrying out the movement. These motion primitives constitute the alphabet that enables communication, and physical constraints on the motions define the grammatical rules that govern the ways in which motion sequences may be constructed.

Research on action-mediated communication helps illustrate the close connection between control and information theory. Further discussion of the deep connection between the two can be found, for example, in Park and Sahai (2011), which argues for the equivalence between the stabilization of a distributed linear system and the capacity characterization in linear network coding.

Multi-objective Distributive Systems

In a multi-agent system, agents may aim to carry out individual objectives. These objectives can either be cooperatively aligned (such as in a cooperative control setting) or may contend antagonistically (such as in a zero-sum game setting). In either case, a common assumption is that the objective functions are a priori known to all agents. However, in many practical applications, agents do not know the objectives of other agents, at least not precisely. For example, in the power control problem alluded to earlier, the signal-to-interference requirement of a user may be unknown to other users. Yet this does not prevent the possibility of deriving convergence algorithms to allow the joint goals to be achieved.

The issue of unknown objectives in a multi-agent system is formally analyzed in Wong (2009) via the introduction of choice-based actions. In an open access network, objectives of an individual agent may be known only partially, via the form of a random distribution in some cases. In order to achieve a joint control objective in general, some communication via the system
574 Information-Based Multi-Agent Systems

is required if there is no side communication channel. A basic issue is how to measure the minimum amount of information exchange that is required to perform a specific control task. Motivated by the idea of communication complexity in computer science, the idea of control communication complexity, which can provide such a measure, was introduced in Wong (2009). In Wong and Baillieul (2009), the idea was extended to a rich class of nonlinear systems that arise as models of physical processes ranging from rigid body mechanics to quantum spin systems.

In some special cases, control objectives can be achieved without any communication among the agents. For systems with bilinear input–output mapping, including the Brockett integrator, it is possible to derive conditions that guarantee this property (Wong and Baillieul 2012). Moreover, for a quadratic type of control cost, it is possible to compute the optimal control cost. Similar results can be extended to linear systems, as discussed in Liu et al. (2013). This circle of ideas is connected to the so-called standard parts problem as investigated in Baillieul and Wong (2009). Another connection is to correlated equilibrium problems that have recently been studied by game theorists (Shoham and Leyton-Brown 2009).

Cross-References

Motion Description Languages and Symbolic Control
Multi-vehicle Routing
Networked Systems

Bibliography

Altman E, Avrachenkov K, Menache I, Miller G, Prabhu BJ, Shwartz A (2009) Dynamic discrete power control in cellular networks. IEEE Trans Autom Control 54(10):2328–2340
Anderson B, Yu C, Fidan B, Hendrickx J (2008) Rigid graph control architectures for autonomous formations. IEEE Control Syst Mag 28(6):48–63. doi:10.1109/MCS.2008.929280
Baillieul J, Özcimder K (2012) The control theory of motion-based communication: problems in teaching robots to dance. In: Proceedings of the American control conference (ACC), Montreal, 27–29 June 2012, pp 4319–4326
Baillieul J, Wong WS (2009) The standard parts problem and the complexity of control communication. In: Proceedings of the 48th IEEE conference on decision and control, held jointly with the 28th Chinese control conference, Shanghai, 16–18 Dec 2009, pp 2723–2728
Baillieul J, Leonard NE, Morgansen KA (2012) Interaction dynamics: the interface of humans and smart machines. Proc IEEE 100(3):776–801. doi:10.1109/JPROC.2011.2180055
Brockett RW (1995) Stabilization of motor networks. In: Proceedings of the 34th IEEE conference on decision and control, New Orleans, 13–15 Dec 1995, pp 1484–1488
Brockett RW (1997) Minimum attention control. In: Proceedings of the 36th IEEE conference on decision and control, San Diego, 10–12 Dec 1997, pp 2628–2632
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks: a mathematical approach to motion coordination algorithms. Princeton University Press, Princeton
Cortés J, Martínez S, Bullo F (2006) Robust rendezvous for mobile autonomous agents via proximity graphs in arbitrary dimensions. IEEE Trans Autom Control 51(8):1289–1298
Foschini GJ, Miljanic Z (1993) A simple distributed autonomous power control algorithm and its convergence. IEEE Trans Veh Technol 42(4):641–646
Ho YC (1972) Team decision theory and information structures in optimal control problems - part I. IEEE Trans Autom Control 17(1):15–22
Leonard NE, Young G, Hochgraf K, Swain D, Chen W, Marshall S (2012) In the dance studio: analysis of human flocking. In: Proceedings of the 2012 American control conference, Montreal, 27–29 June 2012, pp 4333–4338
Liu ZC, Wong WS, Guo G (2013) Cooperative control of linear systems with choice actions. In: Proceedings of the American control conference, Washington, DC, 17–19 June 2013, pp 5374–5379
Nair GN, Evans RJ (2004) Stabilisability of stochastic linear systems with finite feedback data rates. SIAM J Control Optim 43(2):413–436
Olfati-Saber R, Murray RM (2004) Consensus problems in networks of agents with switching topology and time-delays. IEEE Trans Autom Control 49(9):1520–1533
Park SY, Sahai A (2011) Network coding meets decentralized control: capacity-stabilizability equivalence. In: Proceedings of the 50th IEEE conference on decision and control and European control conference, Orlando, 12–15 Dec 2011, pp 4817–4822
Raghunathan D, Baillieul J (2009) Motion based communication channels between mobile robots - a novel paradigm for low bandwidth information exchange. In: Proceedings of the IEEE/RSJ international conference
Input-to-State Stability

The notion of input to state stability (ISS) was introduced in Sontag (1989), and it provides theoretical concepts used to describe stability features of a mapping (u(·), x(0)) ↦ x(·) that sends initial states and input functions into states (or, more generally, outputs). Prominent among these features are that inputs that are bounded, "eventually small," "integrally small," or convergent should lead to outputs with the respective property. In addition, ISS and related notions quantify in what manner initial states affect transient behavior. The formal definition is as follows.

A system is said to be input to state stable (ISS) if there exist some β ∈ KL and γ ∈ K∞ such that

$$|x(t)| \;\le\; \beta(|x_0|, t) + \gamma(\|u\|_\infty) \qquad \text{(ISS)}$$

holds for all solutions (meaning that the estimate is valid for all inputs u(·), all initial conditions x₀, and all t ≥ 0). Note that the supremum sup_{s∈[0,t]} γ(|u(s)|) over the interval [0, t] is the same as γ(‖u_{[0,t]}‖∞) = γ(sup_{s∈[0,t]} |u(s)|), because the function γ is increasing, so one may replace this term by γ(‖u‖∞), where ‖u‖∞ = sup_{s∈[0,∞)} |u(s)| is the sup norm of the input, because the solution x(t) depends only on the values u(s), s ≤ t (so, one could equally well consider the input that has values 0 for all s > t).

Since, in general, max{a, b} ≤ a + b ≤ max{2a, 2b}, one can restate the ISS condition in a slightly different manner, namely, asking for the existence of some β ∈ KL and γ ∈ K∞ (in general, different from the ones in the ISS definition) such that

$$|x(t)| \;\le\; \max\{\beta(|x_0|, t),\; \gamma(\|u\|_\infty)\}$$

holds for all solutions. Such redefinitions, using "max" instead of sum, are also possible for each of the other concepts to be introduced later.

Input-to-State Stability, Fig. 1 ISS combines overshoot and asymptotic behavior: (∃β ∈ KL) |x(t, x₀)| ≤ β(|x₀|, t), ∀x₀, ∀t ≥ 0

Intuitively, the definition of ISS requires that, for t large, the size of the state must be bounded by some function of the sup norm – that is to say, the amplitude – of inputs (because β(|x₀|, t) → 0 as t → ∞). On the other hand, the β(|x₀|, 0) term may dominate for small t, and this serves to quantify the magnitude of the transient (overshoot) behavior as a function of the size of the initial state x₀ (Fig. 1). The ISS superposition theorem, discussed later, shows that ISS is, in a precise mathematical sense, the conjunction of two properties, one of them dealing with asymptotic bounds on the state as a function of the magnitude of the input and the other one providing a transient term obtained when one ignores inputs.

For internally stable linear systems ẋ = Ax + Bu, the variation of parameters formula gives immediately the following inequality:

$$|x(t)| \;\le\; \beta(t)\,|x_0| + \gamma\,\|u\|_\infty,$$

where

$$\beta(t) = \|e^{tA}\| \to 0 \qquad \text{and} \qquad \gamma = \|B\| \int_0^\infty \|e^{sA}\|\, ds < \infty.$$
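This linear bound can be checked concretely. The following sketch (illustrative only, not from the entry) takes the scalar internally stable system ẋ = −x + u, for which β(t) = e^{−t} and γ = ∫₀^∞ e^{−s} ds = 1, integrates it by the forward-Euler method, and verifies the estimate along the computed trajectory:

```python
import math

def simulate(f, x0, u, T=10.0, dt=1e-3):
    """Forward-Euler integration of x' = f(x, u(t)); returns sampled (t, x) pairs."""
    t, x, traj = 0.0, x0, [(0.0, x0)]
    while t < T:
        x += dt * f(x, u(t))
        t += dt
        traj.append((t, x))
    return traj

# Scalar internally stable linear system x' = -x + u (A = -1, B = 1).
# Variation of parameters gives |x(t)| <= e^{-t}|x0| + gamma * ||u||_inf,
# with beta(t) = e^{-t} and gamma = 1.
x0 = 2.0
u = lambda t: math.sin(3.0 * t)          # bounded input, ||u||_inf = 1
traj = simulate(lambda x, v: -x + v, x0, u)

sup_u = 1.0
bound_holds = all(abs(x) <= math.exp(-t) * abs(x0) + sup_u + 1e-6
                  for t, x in traj)
```

The step size, horizon, and test input here are arbitrary illustrative choices; the estimate itself holds for any bounded input and any initial condition.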
The notion of ISS arose originally as a way to precisely formulate, and then answer, the following question. Suppose that, as in many problems in control theory, a system ẋ = f(x, u) has been stabilized by means of a feedback law u = k(x) (Fig. 2), that is to say, k was chosen such that the origin of the closed-loop system ẋ = f(x, k(x)) is globally asymptotically stable. (See, e.g., Sontag 1999 for a discussion of mathematical aspects of state feedback stabilization.) Typically, the design of k was performed by ignoring the effect of possible input disturbances d(·) (also called actuator disturbances). These "disturbances" might represent true noise or perhaps errors in the calculation of the value k(x) by a physical controller or modeling uncertainty in the controller or the system itself. What is the effect of considering disturbances? In order to analyze the problem, d is incorporated into the model, and one studies the new system ẋ = f(x, k(x) + d), where d is seen as an input (Fig. 3). One may then ask what is the effect of d on the behavior of the system.

Input-to-State Stability, Fig. 2 Feedback stabilization, closed-loop system ẋ = f(x, k(x))

Input-to-State Stability, Fig. 3 Actuator disturbances, closed-loop system ẋ = f(x, k(x) + d)

Disturbances d may well destabilize the system, and the problem may arise even when using a routine technique for control design, feedback linearization. To appreciate this issue, take the following very simple example. Given is the system

$$\dot x = f(x, u) = x + (x^2 + 1)\, u.$$

In order to stabilize it, substitute u = ũ/(x² + 1) (a preliminary feedback transformation), rendering the system linear with respect to the new input ũ: ẋ = x + ũ, and then use ũ = −2x in order to obtain the closed-loop system ẋ = −x. In other words, in terms of the original input u, the feedback law is

$$k(x) = \frac{-2x}{x^2 + 1},$$

so that f(x, k(x)) = −x. This is a GAS system. The effect of the disturbance input d is analyzed as follows. The system ẋ = f(x, k(x) + d) is

$$\dot x = -x + (x^2 + 1)\, d.$$

This system has solutions which diverge to infinity even for inputs d that converge to zero; moreover, the constant input d ≡ 1 results in solutions that explode in finite time. Thus k(x) = −2x/(x² + 1) was not a good feedback law, in the sense that its performance degraded drastically once actuator disturbances were taken into account. The key observation for what follows is that if one adds a correction term "−x" to the above formula for k(x), so that now

$$\tilde k(x) = \frac{-2x}{x^2 + 1} - x,$$

then the system ẋ = f(x, k̃(x) + d) with disturbance d as input becomes instead

$$\dot x = -2x - x^3 + (x^2 + 1)\, d.$$
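The contrast between the two closed loops can be seen numerically. The following sketch (not from the entry; plain forward-Euler integration with illustrative step sizes) drives both ẋ = −x + (x² + 1)d and the corrected ẋ = −2x − x³ + (x² + 1)d with the constant disturbance d ≡ 1:

```python
def euler(f, x0, d, T, dt=1e-4, cap=1e6):
    """Integrate x' = f(x, d(t)); stop early if |x| exceeds cap (finite-time escape)."""
    t, x = 0.0, x0
    while t < T:
        x += dt * f(x, d(t))
        t += dt
        if abs(x) > cap:
            break                      # numerical blow-up detected
    return t, x

d = lambda t: 1.0                      # constant disturbance d = 1

# Naive feedback-linearizing law: closed loop x' = -x + (x^2+1)d.
t_bad, x_bad = euler(lambda x, w: -x + (x * x + 1.0) * w, 0.0, d, T=5.0)

# Law with the "-x" correction term: closed loop x' = -2x - x^3 + (x^2+1)d.
t_good, x_good = euler(lambda x, w: -2.0 * x - x ** 3 + (x * x + 1.0) * w,
                       0.0, d, T=5.0)
```

The first run escapes to a huge value well before T = 5 (the exact solution blows up in finite time, at roughly t ≈ 2.4), while the corrected loop settles near a small equilibrium.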
This system is not only GAS when d ≡ 0 (it reduces then to ẋ = −2x − x³), but, in addition, it is ISS (easy to verify directly, or appealing to some of the characterizations mentioned later). Intuitively, for large x, the term −x³ serves to dominate the term (x² + 1)d, for all bounded disturbances d(·), and this prevents the state from getting too large.

This example is an instance of a general result, which says that, whenever there is some feedback law that stabilizes a system, there is also a (possibly different) feedback so that the system with external input d is ISS.

Theorem 1 (Sontag 1989). Consider a system affine in controls

$$\dot x = f(x, u) = g_0(x) + \sum_{i=1}^{m} u_i\, g_i(x) \qquad (g_0(0) = 0)$$

and suppose that there is some differentiable feedback law u = k(x) so that

$$\dot x = f(x, k(x))$$

has x = 0 as a GAS equilibrium. Then, there is a feedback law u = k̃(x) such that

$$\dot x = f(x, \tilde k(x) + d)$$

is ISS with input d(·).

The reader is referred to the book Krstić et al. (1995), and the references given later, for many further developments on the subjects of recursive feedback design, the "backstepping" approach, and other far-reaching extensions.

For the zero input u ≡ 0, the gain term γ(‖u‖∞) in the ISS estimate vanishes, leaving precisely the GAS property. In particular, then, the system ẋ = f(x, u) is 0-stable, meaning that the origin of the system without inputs ẋ = f(x, 0) is stable in the sense of Lyapunov: for each ε > 0, there is some δ > 0 such that |x₀| < δ implies |x(t, x₀)| < ε. (In comparison-function language, one can restate 0-stability as follows: there is some γ ∈ K such that |x(t, x₀)| ≤ γ(|x₀|) holds for all small x₀.)

On the other hand, since β(|x₀|, t) → 0 as t → ∞, for t large one has that the first term in the ISS estimate |x(t)| ≤ max{β(|x₀|, t), γ(‖u‖∞)} vanishes. Thus an ISS system satisfies the following asymptotic gain property ("AG"): there is some γ ∈ K∞ so that

$$\limsup_{t\to+\infty} |x(t, x_0, u)| \;\le\; \gamma(\|u\|_\infty) \qquad \forall\, x_0,\ u(\cdot) \qquad \text{(AG)}$$

(see Fig. 4). In words, for all large enough t, the trajectory exists, and it gets arbitrarily close to a sphere whose radius is proportional, in a possibly nonlinear way quantified by the function γ, to the amplitude of the input. In the language of robust control, the estimate (AG) would be called an "ultimate boundedness" condition; it is a generalization of attractivity (all trajectories converge to zero, for a system ẋ = f(x) with no inputs) to the case of systems with inputs; the "lim sup" is required since the limit of x(t) as t → ∞ may well not exist. From now on (and analogously when defining other properties), we …
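The asymptotic gain property can likewise be probed numerically on the ISS example ẋ = −2x − x³ + (x² + 1)d from above (again a forward-Euler sketch with illustrative parameters, not from the entry): a convergent input drives the state to zero, and a bounded input leaves the trajectory ultimately inside a ball whose radius depends only on the input amplitude.

```python
import math

def euler_tail(f, x0, d, T=20.0, dt=1e-3):
    """Integrate x' = f(x, d(t)); record the largest |x| over the second half of the run."""
    t, x, tail_sup = 0.0, x0, 0.0
    while t < T:
        x += dt * f(x, d(t))
        t += dt
        if t >= T / 2:
            tail_sup = max(tail_sup, abs(x))
    return x, tail_sup

rhs = lambda x, w: -2.0 * x - x ** 3 + (x * x + 1.0) * w

# Convergent input: the state converges to zero.
x_end, _ = euler_tail(rhs, 1.0, lambda t: 2.0 * math.exp(-t))

# Bounded input of amplitude 0.5: the trajectory is ultimately bounded,
# illustrating the "ultimate boundedness" reading of (AG).
_, tail_sup = euler_tail(rhs, 1.0, lambda t: 0.5 * math.sin(t))
```

Here T = 20, the step size, and the particular inputs are arbitrary choices; the qualitative behavior they exhibit is what (AG) guarantees for every input.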
Equivalences for ISS

… (the lower bound amounts to properness and V(x) > 0 for x ≠ 0, while the upper bound guarantees V(0) = 0). For convenience, V̇ : ℝⁿ × ℝᵐ → ℝ is the function

$$\dot V(x, u) := \nabla V(x) \cdot f(x, u),$$

which provides, when evaluated at (x(t), u(t)), the derivative dV(x(t))/dt along solutions of ẋ = f(x, u).

An ISS-Lyapunov function for ẋ = f(x, u) is by definition a smooth storage function V for which there exist functions γ, α ∈ K∞ so that

$$\dot V(x, u) \;\le\; -\alpha(|x|) + \gamma(|u|) \qquad \forall\, x, u. \qquad \text{(L-ISS)}$$

Integrating, an equivalent statement is that, along all trajectories of the system, there holds the following dissipation inequality:

$$V(x(t_2)) - V(x(t_1)) \;\le\; \int_{t_1}^{t_2} w(u(s), x(s))\, ds,$$

where, using the terminology of Willems (1976), the "supply" function is w(u, x) = γ(|u|) − α(|x|). For systems with no inputs, an ISS-Lyapunov function is precisely the same object as a Lyapunov function in the usual sense.

Theorem 5 (Sontag and Wang 1995). A system is ISS if and only if it admits a smooth ISS-Lyapunov function.

Since α(|x|) ≥ α(α₂⁻¹(V(x))) (with α₂ denoting the upper bound in the storage-function estimate), the ISS-Lyapunov condition can be restated as

$$\dot V(x, u) \;\le\; -\tilde\alpha(V(x)) + \gamma(|u|) \qquad \forall\, x, u,$$

so that, at each instant, either 2γ(|u(t)|) ≥ α̃(V(x(t))) or V̇ ≤ −α̃(V)/2. From here, one deduces by a comparison theorem that, along all solutions,

$$V(x(t)) \;\le\; \max\{\beta(V(x_0), t),\; \tilde\alpha^{-1}(2\gamma(\|u\|_\infty))\},$$

where the KL function β(s, t) is the solution y(t) of the initial value problem

$$\dot y = -\tfrac{1}{2}\,\tilde\alpha(y), \qquad y(0) = s.$$

Finally, an ISS estimate is obtained from V(x₀) ≤ α₂(|x₀|).

The proof of the converse part of the theorem is based upon first showing that ISS implies robust stability in the sense already discussed and then obtaining a converse Lyapunov theorem for robust stability for the system ẋ = f(x, dρ(|x|)) = g(x, d) (with ρ a suitable robustness margin), which is asymptotically stable uniformly on all Lebesgue-measurable functions d(·) : ℝ≥0 → B(0, 1). This last theorem was given in Lin et al. (1996) and is basically a theorem on Lyapunov functions for differential inclusions. The classical result of Massera (1956) for differential equations (with no inputs) becomes a special case.

Using "Energy" Estimates Instead of Amplitudes

In linear control theory, H∞ theory studies L² → L² induced norms, which under coordinate changes leads to the following type of estimate:

$$\int_0^t \alpha(|x(s)|)\, ds \;\le\; \alpha_0(|x_0|) + \int_0^t \gamma(|u(s)|)\, ds + \alpha\!\left(\int_0^t \gamma(|u(s)|)\, ds\right)$$

for appropriate K∞ functions (note the additional "α").

Theorem 7 (Angeli et al. 2000b). A system satisfies a weak integral to integral estimate if and only if it is iISS.

This theorem is quite easy to prove, in view of previous results. A sketch of proof is as follows. If the system is ISS, then there is an ISS-Lyapunov function satisfying V̇(x, u) ≤ −V(x) + γ(|u|), so, integrating along any solution,

$$\int_0^t V(x(s))\, ds \;\le\; \int_0^t V(x(s))\, ds + V(x(t)) \;\le\; V(x(0)) + \int_0^t \gamma(|u(s)|)\, ds,$$

and thus an integral-integral estimate holds. Conversely, if such an estimate holds, one can prove …

Another interesting variant is found when considering mixed integral/supremum estimates:

$$\alpha(|x(t)|) \;\le\; \beta(|x_0|, t) + \int_0^t \gamma_1(|u(s)|)\, ds + \cdots$$

… as γ(|u(t)|) is larger than the range of α. This is also clear from the iISS definition, since a constant input with |u(t)| = r results in a term in the right-hand side that grows like rt.

An interesting general class of examples is given by bilinear systems

$$\dot x = \Big(A + \sum_{i=1}^{m} u_i A_i\Big)\, x + Bu$$

for which the matrix A is Hurwitz. Such systems are always iISS (see Sontag 1998), but they are not in general ISS. For instance, in the case when B = 0, boundedness of trajectories for all constant inputs already implies that A + Σᵢ₌₁ᵐ uᵢAᵢ must have all eigenvalues with nonpositive real part, for all u ∈ ℝᵐ, which is a condition involving the matrices Aᵢ (e.g., ẋ = −x + ux is iISS but it is not ISS).

… in the following sense: there exists a storage function V and γ ∈ K∞, α positive definite, so that

$$\dot V(x, u) \;\le\; \gamma(|u|) - \alpha(h(x))$$

holds for all (x, u). Angeli et al. (2000b) contains several additional characterizations of iISS.

Superposition Principles for iISS

There are also asymptotic gain characterizations for iISS. A system is bounded energy weakly converging state (BEWCS) if there exists some σ ∈ K∞ so that the following implication holds:

$$\int_0^{+\infty} \sigma(|u(s)|)\, ds < +\infty \;\Longrightarrow\; \liminf_{t\to+\infty} |x(t, x_0, u)| = 0 \qquad \text{(BEWCS)}$$
… possible that allows consideration of more arbitrary attractors, as well as robust and/or adaptive concepts. Much else has been omitted from this entry. Most importantly, one of the key results is the ISS small-gain theorem due to Jiang et al. (1994), which provides a powerful sufficient condition for the interconnection of ISS systems being itself ISS.

Other topics not treated include, among many others, all notions involving outputs; ISS properties of time-varying (and in particular periodic) systems; ISS for discrete-time systems; questions of sampling, relating ISS properties of continuous- and discrete-time systems; ISS with respect to a closed subset K; stochastic ISS; applications to tracking, vehicle formations ("leader to followers" stability); and averaging of ISS systems. Sontag (2006) may also be consulted for further references, a detailed development of some of these ideas, and citations to the literature for others. In addition, the textbooks Isidori (1999), Krstić et al. (1995), Khalil (1996), Sepulchre et al. (1997), Krstić and Deng (1998), Freeman and Kokotović (1996), and Isidori et al. (2003) contain many extensions of the theory as well as applications.

Cross-References

Feedback Stabilization of Nonlinear Systems
Fundamental Limitation of Feedback Control
Linear State Feedback
Lyapunov's Stability Theory
Stability and Performance of Complex Systems Affected by Parametric Uncertainty
Stability: Lyapunov, Linear Systems

Bibliography

Angeli D, Sontag ED, Wang Y (2000a) A characterization of integral input-to-state stability. IEEE Trans Autom Control 45(6):1082–1097
Angeli D, Sontag ED, Wang Y (2000b) Further equivalences and semiglobal versions of integral input to state stability. Dyn Control 10(2):127–149
Angeli D, Ingalls B, Sontag ED, Wang Y (2004) Separation principles for input-output and integral-input-to-state stability. SIAM J Control Optim 43(1):256–276
Freeman RA, Kokotović PV (1996) Robust nonlinear control design, state-space and Lyapunov techniques. Birkhäuser, Boston
Isidori A (1999) Nonlinear control systems II. Springer, London
Isidori A, Marconi L, Serrani A (2003) Robust autonomous guidance: an internal model-based approach. Springer, London
Jiang Z-P, Teel A, Praly L (1994) Small-gain theorem for ISS systems and applications. Math Control Signals Syst 7:95–120
Khalil HK (1996) Nonlinear systems, 2nd edn. Prentice-Hall, Upper Saddle River
Krstić M, Deng H (1998) Stabilization of uncertain nonlinear systems. Springer, London
Krstić M, Kanellakopoulos I, Kokotović PV (1995) Nonlinear and adaptive control design. Wiley, New York
Lin Y, Sontag ED, Wang Y (1996) A smooth converse Lyapunov theorem for robust stability. SIAM J Control Optim 34(1):124–160
Massera JL (1956) Contributions to stability theory. Ann Math 64:182–206
Praly L, Wang Y (1996) Stabilization in spite of matched unmodelled dynamics and an equivalent definition of input-to-state stability. Math Control Signals Syst 9:1–33
Sepulchre R, Jankovic M, Kokotović PV (1997) Constructive nonlinear control. Springer, New York
Sontag ED (1989) Smooth stabilization implies coprime factorization. IEEE Trans Autom Control 34(4):435–443
Sontag ED (1998) Comments on integral variants of ISS. Syst Control Lett 34(1–2):93–100
Sontag ED (1999) Stability and stabilization: discontinuities and the effect of disturbances. In: Nonlinear analysis, differential equations and control (Montreal, 1998). NATO science series C, Mathematical and physical sciences, vol 528. Kluwer Academic, Dordrecht, pp 551–598
Sontag ED (2006) Input to state stability: basic concepts and results. In: Nistri P, Stefani G (eds) Nonlinear and optimal control theory. Springer, Berlin, pp 163–220
Sontag ED, Wang Y (1995) On characterizations of the input-to-state stability property. Syst Control Lett 24(5):351–359
Sontag ED, Wang Y (1996) New characterizations of input-to-state stability. IEEE Trans Autom Control 41(9):1283–1294
Willems JC (1976) Mechanisms for the stability and instability in feedback systems. Proc IEEE 64:24–35
584 Interactive Environments and Software Tools for CACSD
Interactive Environments and Software Tools for CACSD, Fig. 1 Hierarchy of the software components incorpo-
rated in an interactive CACSD environment
the tools to be used. The process can be repeated until a satisfactory behavior is obtained.

Usually, the underlying computational tools on which the interactive environments are based are hidden from the user. Moreover, software for extensive testing is not normally provided, but demonstrators running a few examples are offered. Unfortunately, even mathematically simple problems of small dimension can lead to wrong results when using unsuitable algorithms. Illustrative control-related examples are given, e.g., in Van Huffel et al. (2004). Since system analysis and design tasks usually involve sequential or iterative solution of large and complex subproblems, it follows that the quality of the intermediate results is of utmost importance. Consequently, the interactive environments for CACSD should be based on reliable, efficient, and thoroughly tested computational building blocks, which are called at the lower layers of calculations. These blocks constitute the computational engine of an interactive environment. Figure 1 gives a typical hierarchy of the software components incorporated in an interactive CACSD environment.

Interactive Environments for CACSD

Main Functionality
A comprehensive set of functions of and requirements for interactive environments for control engineering is described in MacFarlane et al. (1989), but such a set has probably not yet been covered by any single environment. State-of-the-art interactive environments for CACSD include many attractive functional features:
• Define or find (via first principles or system identification) various system models (e.g., state-space models or transfer-function matrices) and convert between different representations
• Find reduced-order (or simplified) models, which can more economically be used for simulation, control, prediction, etc.
• Analyze basic system properties, like stability, controllability, observability, stabilizability, detectability, minimality, properness, etc.
• Analyze interactively the behavior of a control system for various scenarios
• Provide alternative tools for different categories of users, from novice to expert, and from classical to "modern" or advanced analysis and synthesis techniques, in the time domain or frequency domain
• Provide a wide range of tools, covering modeling, system identification, filtering, control system design, simulation, real-time behavior, hardware-in-the-loop simulation, and code generation for easy deployment, and ensure their interoperability
• Allow the user to add extensions at various levels (new functions, interfaces, or even toolboxes or packages), which can be made available to a general community, and allow customization

In addition to the functional and computational tools, essential components of an interactive environment are the user interface, the application program interface (API), and the support tools which enable the user to easily specify, document, and store a design solution, to visualize and interpret the results, to export them to other applications for further processing, to generate reports, etc.
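As a toy illustration of the "analyze basic system properties" item, the following sketch checks stability and controllability of a 2 × 2 state-space model ẋ = Ax + Bu by hand (the matrices are hypothetical examples; real CACSD environments, such as those discussed below, provide general-purpose numerical routines for this):

```python
# Example state-space data (illustrative values only).
A = [[0.0, 1.0], [-2.0, -3.0]]
B = [0.0, 1.0]

# Stability (2x2 case): the eigenvalues of A lie in the open left half-plane
# if and only if trace(A) < 0 and det(A) > 0.
trace_A = A[0][0] + A[1][1]
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
stable = trace_A < 0 and det_A > 0

# Controllability: rank [B, A B] = 2, i.e., the controllability matrix
# has a nonzero determinant.
AB = [A[0][0] * B[0] + A[0][1] * B[1],
      A[1][0] * B[0] + A[1][1] * B[1]]
ctrb_det = B[0] * AB[1] - B[1] * AB[0]
controllable = abs(ctrb_det) > 1e-12
```

The hand-rolled 2 × 2 shortcuts used here do not generalize; for larger models one relies on the eigenvalue and staircase-form algorithms that the environments below wrap in single commands.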
A good paradigm for the data environment is object orientation.

It is a common feature of an interactive environment for CACSD to address the requirements of a large diversity of users, in various stages of familiarity with the environment. This feature is expressed, e.g., by the option to use either a graphical user interface or a command language to call and sequence various computational procedures. In addition, tools for easily building new computational or graphical procedures, or for managing the codes and results, are often included. The command language should operate both on low-level data constructs, such as a matrix, and on high-level ones (e.g., system objects), and it should allow operator overloading (e.g., taking G1*G2 as the result of a series interconnection of the systems represented by the system objects G1 and G2).

An environment for CACSD should integrate advanced user interfaces and APIs, a collection of problem solvers based on reliable and efficient numerical and possibly symbolic algorithms, and tools for visualizing and interpreting the results. Widely used such environments are, for instance, MATLAB from The MathWorks, Inc., Mathematica from Wolfram Research, or Maple from Waterloo Maple Inc. (Maplesoft). Earlier developments of CACSD packages are surveyed in Frederick et al. (1991). There are also environments dedicated to modeling and simulation, which cover a broad range of technical and engineering computations, including those for mechanical, electrical, thermodynamic, hydraulic, pneumatic, or thermal systems. An example is Dymola, presented in a subsequent subsection.

Reference interactive environments and tools for CACSD are presented in the following (sub)sections.

Reference Interactive Environments
MATLAB (MATrix LABoratory) is an integrated, interactive environment for technical computing, visualization, and programming (MathWorks 2013). Based on a powerful high-level interpreter language and development tools, an easy-to-use, flexible, and customizable graphical user interface, complemented with attractive visualization capabilities, and open for extensions with new toolkits, MATLAB can be used for solving intricate scientific and engineering problems, as well as for the development and deployment of applications. MATLAB® and Simulink® are registered trademarks of The MathWorks, Inc. MATLAB, Simulink, and several toolboxes, including System Identification Toolbox, Control System Toolbox, and Robust Control Toolbox, are suitable for solving various control engineering problems; other toolboxes, such as Signal Processing Toolbox, Optimization Toolbox, and Symbolic Math Toolbox, offer additional useful facilities. See http://www.mathworks.com/products/.

Simulink is a high-level implementation of the engineering approach, based on block diagrams, to analyzing and designing control systems. It is also a powerful modeling and multi-domain simulation and model-based design tool for dynamic systems, which supports hierarchical system-level design, simulation, automatic code generation, and continuous test and verification of embedded systems. Simulink offers a graphical editor, customizable block libraries, and solvers for modeling and simulating dynamic systems. The models may include MATLAB algorithms, and the simulation results may be further processed in MATLAB. Managing projects (files, components, data), connecting to hardware for real-time testing, and deploying the designed system are additional, useful Simulink features. Real-Time Workshop code generation allows one to speed up the design and implementation by generating syntactically and semantically correct code which can be uploaded to the target machine.

The MATLAB environment is very suitable for rapid prototyping, seen in a broad sense. This may include not only fully designing and implementing a new control law, testing it on a host computer, and deploying it on a target computer but also support for developing and testing new mathematical or control theories and algorithms. Born around 1980, MATLAB has evolved and improved impressively. Since 2004, two releases have been issued each year. There was a major change of the interface in Release 2012b, visible
both in the core MATLAB "Desktop" and in Simulink. The so-called Toolstrip interface replaces former menus and toolbars and includes tabs which group functionality for common tasks. A gallery of applications from the MATLAB family of products is additionally available and can be extended by the user.

MATLAB supports developing applications with graphical user interface (GUI) features; this itself can be done graphically using GUIDE (the GUI development environment). MATLAB has support for object-oriented programming and for interfacing with other languages or connecting to similar environments such as Maple or Mathematica. When using the command-line interface, MATLAB helps the user, e.g., by showing the arguments of the typed MATLAB functions; also, MATLAB allows execution profiling, for increasing the computational efficiency, and its editor can suggest changes in the user functions (the so-called M-files) for improving the performance.

MATLAB users may upload their own contributions to the MATLAB Central website or may download tools developed by other people. User feedback is used by the MATLAB developers to improve the functionality, reliability, and efficiency of the computations.

Commercial competitors to MATLAB include Mathematica, Maple, and IDL; free open-source alternatives are, e.g., GNU Octave, FreeMat, and Scilab, intended to be mostly compatible with the MATLAB language. For instance, a set of free CACSD tools for GNU Octave version 3.6.0 or beyond has very recently been developed (see http://octave.sourceforge.net/control/). The Octave extension package called control is based on the SLICOT Library and includes functionalities for system identification, system analysis, control system design (including H∞ synthesis), and model reduction, which are the basic steps of the control engineer's design workflow.

Mathematica is an interactive environment which supports complete computational workflows, making it suitable for a convenient endeavor from ideas to deployed solutions (see http://www.wolfram.com/mathematica/). Mathematica offers, e.g., tools for 2D and 3D data and function visualization and animation, numeric and symbolic tools for discrete and continuous calculus, a toolkit for adding user interfaces to applications, control systems libraries, tools for parallel programming, etc. High-performance computing capabilities include the use of packed and sparse arrays, multiple precision arithmetic, automatic multi-threading on multi-core computers (based on processor-specific optimized libraries), hardware accelerators, support for grid technology, and CUDA and OpenCL GPU hardware. Mathematica and SystemModeler (based on the Modelica® language) offer numerous built-in functions which allow the user to design, analyze, and simulate continuous- and discrete-time control systems; simplify models; interactively test controllers; and document the design. Both classical and modern techniques are provided. A powerful symbolic-numeric computation engine and highly efficient numerical algorithms are used. Mathematica allows the user to define system models in a more natural form than MATLAB. It can analyze not only numeric systems but also symbolic ones, represented by state-space or transfer-function models. The computational precision and algorithms can be automatically controlled and selected, respectively, and using arbitrary precision arithmetic is possible.

Maple is a computer algebra system which combines a powerful engine for mathematical calculations with an intuitive user interface (see http://www.maplesoft.com/). Classical mathematical notation can be used, and the interface is customizable. Arbitrary precision numerical computations, as well as symbolic computations, can be performed. The Maple language is provided by a small kernel. NAG Numerical Libraries, ATLAS libraries, and other libraries written in this language are used for numerical calculations. Symbolic expressions are stored as directed acyclic graphs. The latest release, Maple 17, added hundreds of new problem-solving commands and interface enhancements. Many calculations recorded an impressive improvement in efficiency, compared
588 Interactive Environments and Software Tools for CACSD
to the previous release. Examples include calculations with complex floating-point numbers and linear algebra operations. It is possible to use multiple cores and CPUs. The parallel memory management has been improved. Maple includes some CACSD tools for linear and nonlinear dynamic systems. For instance, the built-in package DynamicSystems (available since the Maple 12 release) covers the analysis of linear time-invariant systems. Numerical solvers for Sylvester and Lyapunov equations have been added to the LinearAlgebra package in Maple 13, and solvers for algebraic Riccati equations, based on SLICOT Library routines, have been included in Maple 14 (available in multiple precision arithmetic since Maple 15). Moreover, the MapleSim environment, based on Modelica, is dedicated to physical modeling and simulation. Symbolic simplification, numerical solution of differential-algebraic equations (DAEs), and model post-processing (sensitivity analysis, linearization, parameter optimization, code generation, etc.) can be performed in MapleSim. Its Control Design Toolbox provides solutions for optimal control, Kalman filtering, pole assignment, etc. Bidirectional communication with MATLAB is possible.

MuPAD is another computer algebra system, initially developed by a group at the University of Paderborn, Germany, and then in cooperation with SciFace Software GmbH & Co. KG, a company purchased in 2008 by The MathWorks, Inc. MuPAD has been used with Scilab, and it is now available in the Symbolic Math Toolbox. MuPAD is able to operate on formulas symbolically or numerically (with specified accuracy). It offers a programming language allowing object-oriented and functional programming; several packages for linear algebra, differential equations, number theory, and statistics; an interactive graphical system supporting animations and transparent areas in three-dimensional images; etc.

LabVIEW (Laboratory Virtual Instrumentation Engineering Workbench), from National Instruments, is an interactive development environment, based on MATRIXx, for a visual programming language mainly used for data acquisition, instrument control, and industrial automation. Its Control Design and Simulation Module (see http://sine.ni.com/psp/app/doc/p/id/psp-648/lang/en) can be used to build process and controller models using transfer-function, state-space, or zero-pole-gain representations, analyze the open- and closed-loop system behavior, deploy the designed controllers to real-time hardware using built-in functions and the LabVIEW Real-Time Module, etc.

Software Tools for CACSD

The software tools for CACSD are formally divided below into computational and support tools. The SLICOT Library and Dymola serve as illustrative examples. The support tools can also include computational components.

Computational Tools
The computational tools for CACSD implement the main numerical algorithms of systems and control theory and should satisfy several strong requirements:
• Reliability or guaranteed accuracy, which implies the use of numerically stable algorithms as much as possible and the estimation of the problem sensitivity (conditioning) and of the accuracy of the results; backward numerical stability ensures that the computed results are exact for slightly perturbed original data.
• Computational efficiency, which is important for large-scale engineering design problems or for real-time control.
• Robustness, which is mainly ensured by avoiding overflows, harmful underflows, and unacceptable accumulation of rounding errors; scaling the data may be essential.
• Ease of use, achieved by a simplified user interface (hiding the details) and default values for algorithmic parameters, such as tolerances.
• Wide scope and rich functionality, which address the range of problems and system representations that can be handled.
• Portability to various platforms, in the sense of functional correctness.
• Reusability, in building several dedicated engineering software systems or environments.
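The reliability requirement above can be made concrete with a few lines of code. The sketch below (plain NumPy, not taken from any of the packages discussed) solves Ax = b for a well-conditioned and a nearly singular matrix: the LAPACK-backed solver is backward stable in both cases, so any loss of accuracy in the second case is explained by the conditioning of the data, not by the algorithm.

```python
import numpy as np

def solve_and_report(A, b):
    """Solve A x = b and report the condition number and residual.

    np.linalg.solve is backward stable: the computed x is the exact
    solution of a slightly perturbed system, so cond(A) bounds the
    relative error amplification caused by the data.
    """
    x = np.linalg.solve(A, b)
    kappa = np.linalg.cond(A)       # 2-norm condition number
    residual = np.linalg.norm(A @ x - b)
    return x, kappa, residual

A_good = np.array([[2.0, 1.0], [1.0, 3.0]])
A_bad = np.array([[1.0, 1.0], [1.0, 1.0 + 1e-12]])  # nearly singular
b = np.array([1.0, 2.0])

for A in (A_good, A_bad):
    x, kappa, res = solve_and_report(A, b)
    print(f"cond(A) = {kappa:.2e}, residual = {res:.1e}")
```

Estimating and reporting the condition number alongside the solution, as SLICOT-style routines do, is exactly the "estimation of the problem sensitivity" called for by the first requirement.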
Interactive Environments and Software Tools for CACSD 589
More details are given, e.g., in Van Huffel et al. (2004). An example addressing all these aspects is discussed in what follows.

The SLICOT Library (Benner et al. 1999; Van Huffel et al. 2004) is one of the most comprehensive libraries for numerical computations in control theory, containing over 500 subroutines which cover system analysis, benchmark and test problems, data analysis, filtering, identification, mathematical routines, some capabilities for nonlinear systems, synthesis, system transformation, and utility routines (see http://www.slicot.org/). The requirements above have been taken into account in the SLICOT Library development. Some of the SLICOT components are used in several interactive environments for CACSD, including MATLAB, Maple, Scilab, and the Octave control package. The library is still under development. It is worth mentioning the new focus on structure-preserving algorithms, which offer increased accuracy, reliability, and efficiency in comparison with standard solvers. Many procedures for optimal control and filtering, model reduction, etc., can benefit from using the "structured" solvers. There are also separate SLICOT-based toolboxes for MATLAB (Benner et al. 2010). SLICOT components follow predefined implementation and documentation standards. SLICOT Library routines, and functions from many interactive environments for CACSD, call components from the Basic Linear Algebra Subprograms (BLAS, see Dongarra et al. 1990 and the references therein) and the Linear Algebra PACKage (LAPACK, Anderson et al. 1999). This approach enhances portability and efficiency, since optimized BLAS and LAPACK libraries are provided for major computer platforms.

Support Tools
The support software tools for CACSD offer additional capabilities compared to computational tools. They may include alternative algorithms, symbolic computations (usually for low-dimensional problems), and extended functionality, e.g., for modeling/simulation of nonlinear systems, code generation, etc. The support tools can be used by software developers of CACSD environments or computational tools, or directly by other users. For instance, symbolic calculations are useful for checking the accuracy of numerical algorithms. The code generation facility offers safe and convenient support for deploying a design solution to the control hardware. A reference support software tool is briefly presented below.

Dymola (Dynamic Modeling Laboratory), from Dassault Systèmes (see http://www.3ds.com/products/catia/portfolio/dymola), deals with high-fidelity modeling and simulation of complex systems from various domains, like aerospace, automotive, robotics, process control, and other applications. Compatible and comprehensive model libraries, developed by leading experts, exist for many engineering branches. The users may create their own libraries or adapt existing libraries. This flexibility and openness is provided by the use of the open, object-oriented modeling language Modelica, currently further developed by the Modelica Association. Equation-oriented models, based on DAEs, and symbolic manipulation are used, stimulating the reuse of components and enhancing the reliability and efficiency of the calculations. This approach makes it simpler to generate the equations which result from interconnecting various subsystems, and to deal with algebraic loops and structurally singular models. Algebraic loops are encountered when some auxiliary variables depend algebraically upon each other in a mutual way (Cellier and Elmqvist 1992). Structural singularities are related to DAEs of index higher than 1.

Dymola allows performing hardware-in-the-loop simulation and real-time 3D animation. A model can be built by graphical composition, connecting components from various libraries using simple drag-and-drop operations. The parameters a model depends on can be tuned either by parameter estimation (also called model calibration), which minimizes the error between the physical measurements and simulation results, or by optimization, which minimizes certain performance criteria. Sometimes, e.g., when designing certain controllers, the criteria values are obtained by simulation. Dymola also offers facilities for model management, including
The Dynamic Lot Size Model
This is an analogue of the EOQ model when the demand varies over time. Wagner and Whitin (1958) developed it in the discrete-time finite horizon framework. With $D(t)$ denoting the demand in period $t$ and other costs similar to those in the EOQ model, they showed that there exists an optimal policy in which an order will be issued just as the inventory level reaches zero, except for the first order. This policy is called the zero-inventory policy. With this in hand, the problem reduces to selecting only the order times. This is accomplished by applying a shortest path algorithm. Moreover, there are forward (recursion) procedures for solving the problem.

An important feature of this model is that in most cases one can detect a forecast horizon which essentially separates earlier periods from later ones. More specifically, $T$ is a forecast horizon if the first order in a $T$-horizon problem remains optimal in any finite horizon problem with horizon longer than $T$, regardless of the demands beyond the period $T$. For an extensive bibliography of this literature, see Chand et al. (2002).

Stochastic Demand
We shall discuss three classical models and some of their extensions.

The Single-Period Problem: The Newsvendor Model
The problem of a newsvendor is to decide on an order quantity of newspapers to meet a stochastic demand at a minimum cost. If the realized demand is larger than the ordered quantity, the excess demand is lost and there is an opportunity loss of $c_u$ (selling price minus purchase cost) for each paper short. On the other hand, for each paper ordered but not sold, there is an opportunity loss of $c_o$ (purchase cost plus holding cost). The newsvendor conceptualizes the decision by treating each additional paper as a separate marginal contribution. The first is almost certain to be sold. Each additional paper is less likely to be sold than the previous one. Thus, each additional paper will be worth somewhat less, and the marginal paper at the optimum should be worth exactly zero. Thus, $c_u$ times the probability of selling the marginal paper minus $c_o$ times the probability of not selling it should equal zero. Now, if $F$ denotes the cumulative probability distribution function of the demand $D$, then clearly the optimal order quantity $Q$ satisfies $c_u(1 - F(Q)) - c_o F(Q) = 0$, which gives us the famous newsvendor formula for the optimal order quantity

$$Q = F^{-1}\left(\frac{c_u}{c_u + c_o}\right), \qquad (2)$$

where $c_u/(c_u + c_o)$ is known as the critical fractile.

If $p$ denotes the unit sale price, $c$ the unit cost, and $h$ the holding cost per unit per unit time, then $c_u = p - c$ and $c_o = c + h$, and therefore the critical fractile can be expressed as $(p - c)/(p + h)$. An extension of the newsvendor formula to allow for a unit cost $g$ of lost goodwill and a unit salvage value $s$ received at the end of the period for each unit not sold is immediate. If we let $\alpha > 0$ denote the periodic discount factor, then $c_u = p + g - c$ and $c_o = c + h - \alpha s$, the critical fractile becomes $(p + g - c)/(p + g + h - \alpha s)$, and therefore

$$Q = F^{-1}\left(\frac{p + g - c}{p + g + h - \alpha s}\right). \qquad (3)$$

The newsvendor model has been used extensively in the context of supply chain management with multiple agents maximizing their individual objectives. In this case, inefficiencies arise due to double marginalization. Then, a question of appropriate contracts that can lead to the first-best solution, or coordinate the supply chain, becomes important. Cachon (2003) surveys this literature.

Multi-period Inventory Models: No Fixed Cost
The newsvendor model is a single-period model, and its multi-period generalization requires that the inventory not sold in a period be carried over to the next period. This results in the multi-period inventory model with lost sales. It is assumed that demand in each period is independent and identically distributed (i.i.d.) with $F$ denoting its cumulative probability distribution function. A rigorous analysis requires the method of dynamic programming, and it shows that there is a stock level $S_t$, called the base stock in period $t$, that we would ideally like to have at the beginning of period $t$. Thus, the optimal policy in period $t$, called the base stock policy, is to order

$$Q_t(x) = \begin{cases} S_t - x & \text{if } x < S_t, \\ 0 & \text{if } x \ge S_t. \end{cases} \qquad (4)$$

In the special case when the terminal salvage value of an item is exactly equal to its cost $c$, it is possible to come up with the optimal policy using intuition. Since we do not need to salvage unused items in the multi-period setting, one can argue that an item carried over to the next period is worth its purchase cost $c$. Therefore, its presence means that the next period will need to order one less and thus save an amount $c$. In the last period, when there is no next period, our terminal salvage value assumption also guarantees a leftover item's worth to be $c$. Thus, we can modify (3) and obtain a stationary base stock level

$$S = F^{-1}\left(\frac{p + g - c}{(p + g - c) + (c + h - \alpha c)}\right) = F^{-1}\left(\frac{p + g - c}{p + g + h - \alpha c}\right) \qquad (5)$$

for each period $t$.

Thus, the elimination of the endgame effect delivers a myopic policy: a policy optimal in the single-period case is also optimal in the dynamic multi-period setting. A more general concept than the optimality of a myopic policy is that of the forecast horizon mentioned earlier in the context of the dynamic lot size model.

Sometimes, when the demand exceeds the on-hand inventory in the period, the demand is not lost but backlogged. In this case, each unit of backlogged demand is satisfied in the next period, and the unit revenue $p$ is recovered, but a unit backlogging cost $b$ is incurred, due to expediting, special handling, delayed receipt of revenue, and loss of goodwill. Thus, $c_u = b - (1 - \alpha)c$, where the second term represents the savings due to postponing the purchase of the backlogged demand unit by one period, and $c_o = c + h - \alpha c$ as in (5). This gives us the base stock level

$$S = F^{-1}\left(\frac{b - (1 - \alpha)c}{b + h}\right), \qquad (6)$$

which can be used in (4) to give the optimal policy.

Sometimes it is possible to have multiple delivery modes, such as fast, regular, and slow, as well as demand forecast updates. Then, at the beginning of each period, on-hand inventory and demand information are updated. At the same time, decisions on how much to order using each of the modes are made. Fast, regular, and slow orders are delivered at the ends of the current period, the next period, and the period beyond the next, respectively. In such models, a modified base stock policy is optimal only for the two fastest modes. For details and further generalization, see Sethi et al. (2005).

An important extension includes serial inventory systems where stage 1 receives supplies from an outside source and each downstream stage receives supplies from its immediate upstream stage. Clark and Scarf (1960) introduced the notion of the echelon inventory position at a stage to consist of the stock at that stage plus the stock in transit to that stage plus all downstream stock minus the amount backlogged at the final stage. Then, the optimal ordering policy at each stage is given by an echelon base stock policy with respect to the echelon inventory position at that stage. It is known that assembly systems can be reduced to a serial system. Details can be found in Zipkin (2000).

Multi-period Inventory Models: Fixed Cost
When there is a fixed cost of ordering, it is clear that it would not be reasonable to follow the base stock policy when the inventory level is not much below the base stock level. Indeed, Scarf (1960) proved that there are numbers $s_t$ and $S_t$, $s_t < S_t$, for period $t$ such that the optimal policy in period $t$ is to order

$$Q_t(x) = \begin{cases} S_t - x & \text{if } x \le s_t, \\ 0 & \text{if } x > s_t. \end{cases} \qquad (7)$$
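For a concrete illustration, the critical fractile formulas above are easy to evaluate once a demand distribution is fixed. The following Python sketch assumes normally distributed demand (an assumption made here for illustration only; formulas (2) and (6) hold for any distribution F), with illustrative cost parameters that are not from the text:

```python
from statistics import NormalDist

def newsvendor_q(c_u, c_o, demand):
    """Optimal order quantity (2): Q = F^{-1}(c_u / (c_u + c_o))."""
    return demand.inv_cdf(c_u / (c_u + c_o))

def backlog_base_stock(b, h, c, alpha, demand):
    """Backlogging base stock level (6): S = F^{-1}((b - (1 - alpha) c) / (b + h))."""
    return demand.inv_cdf((b - (1 - alpha) * c) / (b + h))

# Illustrative numbers: demand ~ N(100, 20^2).
demand = NormalDist(mu=100, sigma=20)
p, c, h = 10.0, 6.0, 1.0          # unit sale price, unit cost, holding cost
Q = newsvendor_q(c_u=p - c, c_o=c + h, demand=demand)
S = backlog_base_stock(b=3.0, h=h, c=c, alpha=0.95, demand=demand)
print(f"newsvendor Q = {Q:.1f}, backlogging base stock S = {S:.1f}")
```

With these numbers the newsvendor fractile is 4/11, below one half, so the order quantity falls below the mean demand, while the backlogging fractile of 0.675 puts the base stock level above it.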
594 Inventory Theory
depending on the information the agents have. Some of them are time consistent or subgame perfect and some are not. Regardless, there are inefficiencies that arise from these decentralized game settings, and developing contracts for coordinating dynamic supply chains remains a wide open topic of research.

Cross-References
Nonlinear Filters
Stochastic Dynamic Programming

Bibliography
Arrow KJ, Karlin S, Scarf H (1958) Studies in the mathematical theory of inventory and production. Stanford University Press, Stanford
Axsäter S (2006) Inventory control. Springer, New York
Bensoussan A (2011) Dynamic programming and inventory control. IOS Press, Washington, DC
Bensoussan A, Cakanyildirim M, Feng Q, Sethi SP (2009) Optimal ordering policies for stochastic inventory problems with observed information delays. Prod Oper Manag 18(5):546–559
Bensoussan A, Cakanyildirim M, Sethi SP (2010) Filtering for discrete-time Markov processes and applications to inventory control with incomplete information. In: Crisan D, Rozovsky B (eds) Handbook on nonlinear filtering. Oxford University Press, Oxford, pp 500–525
Beyer D, Cheng F, Sethi SP, Taksar MI (2010) Markovian demand inventory models. International series in operations research and management science. Springer, New York
Beyer D, Sethi SP (1998) A proof of the EOQ formula using quasi-variational inequalities. Int J Syst Sci 29(11):1295–1299
Cachon GP (2003) Supply chain coordination with contracts. In: de Kok AG, Graves S (eds) Supply chain management: handbooks in OR/MS, chapter 6. North-Holland, Amsterdam, pp 229–340
Chand S, Hsu VN, Sethi SP (2002) Forecast, solution and rolling horizons in operations management problems: a classified bibliography. Manuf Serv Oper Manag 4(1):25–43
Clark AJ, Scarf H (1960) Optimal policies for a multi-echelon inventory problem. Manag Sci 6(4):475–490
Erlenkotter D (1990) Ford Whitman Harris and the economic order quantity model. Oper Res 38(6):937–946
Harris FW (1913) How many parts to make at once. Fact Mag Manag 10(2):135–136, 152. Reprinted (1990) Oper Res 38(6):947–950
Ozer O (2011) Inventory management: information, coordination, and rationality. In: Kempf KG, Keskinocak P, Uzsoy R (eds) Planning production and inventories in the extended enterprise. Springer, New York
Porteus EL (2002) Stochastic inventory theory. Stanford University Press, Stanford, CA
Presman E, Sethi SP (2006) Inventory models with continuous and Poisson demands and discounted and average costs. Prod Oper Manag 15(2):279–293
Scarf H (1960) The optimality of (s, S) policies in dynamic inventory problems. In: Arrow KJ, Karlin S, Suppes P (eds) Mathematical methods in the social sciences. Stanford University Press, Stanford, pp 196–202
Sethi SP (2010) i3: incomplete information inventory models. Decis Line Oct:16–19
Sethi SP, Yan H, Zhang H (2005) Inventory and supply chain management with forecast updates. Springer, New York
Wagner HM, Whitin TM (1958) Dynamic version of the economic lot size model. Manag Sci 5:89–96
Zipkin P (2000) Foundations of inventory management. McGraw Hill, New York

Investment-Consumption Modeling

L.C.G. Rogers
University of Cambridge, Cambridge, UK

Abstract
The simplest investment-consumption problem is the celebrated example of Robert Merton (J Econ Theory 3(4):373–413, 1971). This survey shows three different ways of solving the problem, each of which is a valuable solution method for more complicated versions of the question.

Keywords
Budget constraint; Hamilton-Jacobi-Bellman (HJB) equation; Merton problem; Value function

Introduction

Consider an investor in a market with a riskless bank account accruing continuously compounded
596 Investment-Consumption Modeling
interest at rate rt , and with a single risky asset The Main Techniques
whose price St at time t evolves as
We present here three important techniques for
dSt D St .t d Wt C
t dt/; (1) solving such problems: the value function ap-
proach; the use of dual variables; and the use of
where W is a standard Brownian motion, and martingale representation. The first two methods
and
are processes previsible with respect to the only work if the problem is Markovian; the third
filtration of W . The investor starts with initial only works if the market is complete. There is
wealth w0 and chooses the rate ct of consuming, a further method, the Pontryagin-Lagrange ap-
and the wealth t to invest in the risky asset, so proach; see Sect. 1.5 in Rogers (2013). While
that his overall wealth evolves as this is a quite general approach, we can only
expect explicit solutions when further structure is
d wt D t .t d Wt C
t dt/Crt .wt t /dt ct dt available.
(2)
The Value Function Approach
D rt wt dt C t ft d Wt C .
t rt /dtg ct dt: To illustrate this, we focus on the original Merton
(3) problem (Merton 1971), where and
are both
constant, and the utility U is constant relative risk
For convenience, we assume that , 1 , and
aversion (CRRA):
are bounded. See Rogers and Williams (2000a,b)
for background information on stochastic pro- U 0 .x/ D x R .x > 0/ (5)
cesses. The three terms in (2) have natural in-
terpretations: The first expresses the evolution of
for some R > 0 different from 1. The case
the wealth invested in the stock, the second the
R D 1 corresponds to logarithmic utility, and
interest accruing on the wealth .w/ invested in
can be solved by similar methods. Perhaps the
the bank account, and the third is the cash being
best starting point is the Davis-Varaiya Martin-
withdrawn for consumption.
gale Principle of Optimal Control
R t (MPOC): The
To avoid so-called doubling strategies, we in-
process Yt D e t V .wt / C 0 e s U.cs / ds
sist that the wealth process so generated by the
is a supermartingale under any control, and a
controls .c; / must remain bounded below in
martingale under optimal control. If we use Itô’s
some suitable way, which here is just the condi-
formula, we find that
tion wt 0 for all t 0; any .c; / satisfying
this condition will be called admissible. The set
of admissible .c; / will be denoted A.w0 /, a e t d Yt D V .wt /dt C V 0 .wt /d wt
notation which makes explicit the dependence on C 12 2 t2 V 00 .wt /dt C U.ct /dt
the investor’s initial wealth. :
D ŒV C ft .
r/ ct C rgV 0
The investor’s objective is taken to be to obtain
Z
C 12 2 t2 V 00 C U.ct / dt; (6)
1
V .w0 / sup E e t U.ct / dt (4) :
.c; /2A.w0 / 0 where the symbol D denotes that the two sides
differ by a (local) martingale. If the MPOC is to
for some constant > 0. The problem cannot hold, then we expect that the drift in d Y should
be solved explicitly at this level of generality, be non-positive under any control, and equal to
but if we take some special cases, we are able zero under optimal control. We simply assume
to illustrate the main methods used to attack for now that local martingales are martingales;
it. Many other objectives with various different this is of course not true in general, and is a point
constraints can be handled by similar techniques: that needs to be handled carefully in a rigorous
see Rogers (2013) for a wide range of examples. proof. Directly from (6), we then deduce the
with $R' \equiv 1/R$. We are then able to perform the optimizations in (7) quite explicitly to obtain

$$0 = -\rho V + rwV' + \tilde{U}(V') - \tfrac{1}{2}\kappa^2\,\frac{(V')^2}{V''}, \qquad (9)$$

where

$$\kappa \equiv \frac{\mu - r}{\sigma}. \qquad (10)$$

Nonlinear PDEs arising from stochastic optimal control problems are not in general easy to solve, but (9) is tractable in this special setting, because the assumed CRRA form of $U$ allows us to deduce by a scaling argument that $V(w) \propto w^{1-R} \propto U(w)$, and we find that

$$V(w) = \gamma_M^{-R}\, U(w), \qquad (11)$$

where

$$R\gamma_M = \rho + (R - 1)\left(r + \tfrac{1}{2}\kappa^2/R\right). \qquad (12)$$

The optimal investment and consumption behavior is easily deduced from the optimal choices which took us from (7) to (9). After some calculations, we discover that

$$\theta_t = \frac{\mu - r}{\sigma^2 R}\, w_t, \qquad c_t = \gamma_M\, w_t \qquad (13)$$

specifies the optimal investment/consumption behavior in this example. (The positivity of $\gamma_M$ is necessary and sufficient for the problem to be well posed; see Sect. 1.6 in Rogers (2013).) Unsurprisingly, the optimal solution scales linearly with wealth.

$$J(z) = V(w) - wz. \qquad (14)$$

Simple calculus gives us $J' = -w$, $J'' = -1/V''$, so that the HJB equation (9) transforms into

$$0 = \tilde{U}(z) - \rho J(z) + (\rho - r)\,zJ'(z) + \tfrac{1}{2}\kappa^2 z^2 J''(z), \qquad (15)$$

which is now a second-order linear ODE, which can be solved by traditional methods; see Sect. 1.3 of Rogers (2013) for more details.

Use of Martingale Representation
This time, we shall suppose that the coefficients $\sigma_t$, $r_t$, and $\mu_t$ in the wealth evolution (3) are general previsible processes; to keep things simpler, we shall suppose that $\mu$, $r$, $\sigma$, and $\sigma^{-1}$ are all bounded previsible processes. The Markovian nature of the problem which allowed us to find the HJB equation in the first two cases is now destroyed, and a completely different method is needed. The way in is to define a positive semimartingale $\zeta$ by

$$d\zeta_t = -\zeta_t(r_t\, dt + \kappa_t\, dW_t), \qquad \zeta_0 = 1, \qquad (16)$$

where $\kappa_t = (\mu_t - r_t)/\sigma_t$ is a previsible process, bounded by hypothesis. This process, called the state-price density process, or the pricing kernel, has the property that if $w$ evolves as (3), then $M_t \equiv \zeta_t w_t + \int_0^t \zeta_s c_s\, ds$ is a positive local martingale.

Since positive local martingales are supermartingales, we deduce from this that

$$M_0 = w_0 \ge E\int_0^\infty \zeta_s c_s\, ds. \qquad (17)$$

Thus, for any $(c, \theta) \in \mathcal{A}(w_0)$, the budget constraint (17) must hold. So the solution method here is to maximize the objective (4) subject to the constraint (17). Absorbing the constraint with a Lagrange multiplier $\lambda$, we find the unconstrained optimization problem

$$\sup E\int_0^\infty \{e^{-\rho s}\, U(c_s) - \lambda\zeta_s c_s\}\, ds + \lambda w_0, \qquad (18)$$

whose optimal solution is given by

$$e^{-\rho s}\, U'(c_s) = \lambda\zeta_s, \qquad (19)$$

and this determines the optimal $c$, up to knowledge of the Lagrange multiplier $\lambda$, whose value is fixed by matching the budget constraint (17) with equality.

Of course, the missing logical piece of this argument is that if we are given some $c \ge 0$ satisfying the budget constraint, is there necessarily some $\theta$ such that the pair $(c, \theta)$ is admissible for initial wealth $w_0$? In this setting, this can be shown to follow from the Brownian integral representation theorem, since we are in a complete market; however, in a multidimensional setting, this can fail, and then the problem is effectively insoluble.

Summary and Future Directions

Cross-References
Risk-Sensitive Stochastic Control
Stochastic Dynamic Programming
Stochastic Linear-Quadratic Control
Stochastic Maximum Principle

Bibliography
Cvitanic J, Zhang J (2012) Contract theory in continuous-time models. Springer, Berlin/Heidelberg
Merton RC (1971) Optimum consumption and portfolio rules in a continuous-time model. J Econ Theory 3(4):373–413
Rogers LCG (2013) Optimal investment. Springer, Berlin/New York
Rogers LCG, Williams D (2000a) Diffusions, Markov processes and martingales, vol 1. Cambridge University Press, Cambridge/New York
Rogers LCG, Williams D (2000b) Diffusions, Markov processes and martingales, vol 2. Cambridge University Press, Cambridge

ISS

Input-to-State Stability

Iterative Learning Control

David H. Owens
University of Sheffield, Sheffield, UK
for task to task learning, the need for acceptable Step two: (Initialization) Given a demand sig-
task-to-task performance and the implications of nal r.t/; t 2 Œ0; T , choose an initial input
modeling errors for task-to-task robustness. u0 .t/; t 2 Œ0; T and set k D 0.
Step three: (Response measurement) Return
the plant to a defined initial state. Find the
output response yk to the input uk . Construct
Keywords
the tracking error ek D r yk . Store data.
Step four: (Input signal update) Use past
Adaptation; Optimization; Repetition; Robust-
records of inputs used and tracking er-
ness
rors generated to construct a new input
ukC1 .t/; t 2 Œ0; T to be used to improve
tracking accuracy on the next trial.
Introduction

Iterative learning control (ILC) is relevant to trajectory tracking control problems on a finite interval [0, T] (Ahn et al. 2007b; Bien and Xu 1998; Chen and Wen 1999). It has close links to multipass process theory (Edwards and Owens 1982) and repetitive control (Rogers et al. 2007) plus conceptual links to adaptive control. It focuses on problems where the repetition of a specified task creates the possibility of improving tracking accuracy from task to task and, in principle, reducing the tracking error to exactly zero. The iterative nature of the control schemes proposed, the use of past executions of the control to update/improve control action, and the asymptotic learning of the required control signals put the topic in the area of adaptive control, although other areas of study are reflected in its methodologies.

Application areas include robotic assembly (Arimoto et al. 1984), electromechanical test systems (Daley et al. 2007), and medical rehabilitation robotics (Rogers et al. 2010). For example, consider a manufacturing robot required to undertake an indefinite number of identical tasks (such as "pick and place" of components) specified by a spatial trajectory on a defined time interval. The problem is two-dimensional. More precisely, the controlled system evolves with two variables, namely, time t ∈ [0, T] (elapsed in each iteration) and iteration index k ≥ 0. Data consists of signals f_k(t) denoting the value of the signal f at time t on iteration k. The conceptual algorithm used is:

Step one: (Preconditioning) Implement loop controllers to condition plant dynamics.

Step five: (Termination/task repetition) Either terminate the sequence or increase k by unity and return to step 3.

It is the updating of the input signal based on observation that provides the conceptual link to adaptive control. ILC causality defines "past data" at time t on iteration k as data on the interval [0, t] on that iteration plus all data on [0, T] on all previous iterations. Feedback plus feedforward control normally contains feedforward transfer of information from past iterations to the current iteration.

Modeling Issues

Design approaches have been model-based. Most nonlinear problems assume nonlinear state-space models relating the ℓ × 1 input vector u(t) to the m × 1 output vector y(t) via an n × 1 state vector x(t) as follows:

ẋ(t) = f(x(t), u(t)),  y(t) = h(x(t), u(t)),

where t ∈ [0, T], x(0) = x_0, and f and h are vector-valued functions. The discrete-time (sampled-data) version replaces derivatives by a forward shift, where t is now a sample counter, 0 ≤ t ≤ N (the index of the last sample). The continuous-time linear model is

ẋ(t) = Ax(t) + Bu(t),  y(t) = Cx(t) + Du(t),

with an analogous model for discrete systems. In both cases, the matrices A, B, C, D are constant or time varying of appropriate dimension.
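For the discrete-time linear model with zero initial state, the input-to-output map over one trial can be collected into a lower-triangular "lifted" matrix built from the Markov parameters D, CB, CAB, .... A minimal numpy sketch (the system matrices and trial length here are illustrative, not taken from the text):

```python
import numpy as np

# Illustrative discrete-time SISO system x(t+1) = A x(t) + B u(t), y(t) = C x(t) + D u(t)
A = np.array([[0.8, 0.1], [0.0, 0.9]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.0, 1.0]])
D = np.array([[0.0]])
N = 5  # number of samples in the trial

def lifted_matrix(A, B, C, D, N):
    """Lower-triangular matrix G whose diagonals hold the Markov parameters D, CB, CAB, ..."""
    markov = [D[0, 0]] + [(C @ np.linalg.matrix_power(A, j) @ B)[0, 0] for j in range(N - 1)]
    G = np.zeros((N, N))
    for i in range(N):
        for j in range(i + 1):
            G[i, j] = markov[i - j]
    return G

G = lifted_matrix(A, B, C, D, N)

# With x(0) = 0, the lifted map reproduces the whole output trial from the whole input trial
u = np.ones(N)
x = np.zeros(2)
y_sim = []
for t in range(N):
    y_sim.append((C @ x + D * u[t])[0, 0])
    x = A @ x + (B * u[t]).ravel()
assert np.allclose(G @ u, y_sim)
```

This lifted (supervector) form is what allows trial-to-trial error dynamics to be analyzed with ordinary matrix algebra.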
600 Iterative Learning Control
defined in terms of the Markov parameter matrices D, CB, CAB, ... of the plant. This structure has led to a focus on the discrete-time, time-invariant case and exploitation of matrix algebra techniques.

More generally, G : U → Y can be a bounded linear operator between suitable signal spaces U and Y. Taking G as a convolution operator, the representation (1) also applies to time-varying continuous-time and discrete-time systems. The representation also applies to differential-delay systems, coupled algebraic and differential systems, multi-rate systems, and other situations of interest.

Formal Design Objectives

Problem Statement: Given a reference signal r ∈ Y and an initial input signal u_0 ∈ U, construct a causal control update rule/algorithm generating a sequence of inputs {u_k}_{k≥0} such that the tracking errors e_k = r − y_k converge to zero as k → ∞.

ROBUST ILC: An ILC algorithm is said to be robust if convergence is retained in the presence of a defined class of modeling errors.

Results from multipass systems theory (Edwards and Owens 1982) indicate robust convergence of the sequence {e_k}_{k≥0} to a limit e_∞ ∈ Y (in the presence of small modeling errors) if the spectral radius condition

r[(I + GK_0)^{-1}(W_0 − GK_1)] < 1    (4)

is satisfied, where r[·] denotes the spectral radius of its argument. However, the desired condition e_∞ = 0 is true only if W_0 = I. For a given r, it may be possible to retain the benefits of choosing W_0 ≠ I and still ensure that e_∞ is sufficiently small for the application in mind, e.g., by limiting limit errors to a high-frequency band. This and other spectral radius conditions form the underlying convergence condition when choosing controller elements but are rarely computed. The simplest algorithm using eigenvalue
computation for a linear discrete-time system defines the relative degree to be k* = 0 if D ≠ 0, and the smallest integer k* such that CA^{k*−1}B ≠ 0 otherwise. Replacing Y by the range of G, choosing W_0 = I, K_0 = 0, and K_1 = I, and supposing that k* ≥ 1, the Arimoto input update rule u_{k+1}(t) = u_k(t) + e_k(t + k*), 0 ≤ t ≤ N + 1 − k*, provides robust convergence if, and only if, r[I − CA^{k*−1}B] < 1. It does not imply that the error signal necessarily improves each iteration. Errors can reach very high values before finally converging to zero. However, if (4) is replaced by the operator norm condition

‖(I + GK_0)^{-1}(W_0 − GK_1)‖ < 1,    (5)

then {‖e_k − e_∞‖_Q}_{k≥0} monotonically decreases to zero.

The spectral radius condition throws light on the nature of ILC robustness. Choosing, for simplicity, K_0 = 0 and W_0 = I, the requirement that r[I − GK_1] < 1 will be satisfied by a wide range of processes G, namely, those for which the eigenvalues of I − GK_1 lie in the open unit circle of the complex plane. Translating this requirement into useful robustness tests may not be easy in general. The discussion does, however, show that the behavior of GK_1 must be "sign-definite" to some extent as, if r[I − GK_1] < 1, then r[I − (−εG)K_1] > 1 for any ε > 0, i.e., replacing the plant by −εG (no matter how small) will inevitably produce non-convergent behavior. A more detailed characterization of this property is possible for inverse model ILC.

Inverse Model-Based Iteration

The inverse model update

u_{k+1} = u_k + βG^{-1}e_k,    (6)

where β is a learning gain, produces the dynamics

e_{k+1} = (1 − β)e_k, i.e., e_{k+1} = (1 − β)^{k+1}e_0,

proving that zero error is attainable with added flexibility in convergence rate control by choosing β ∈ (0, 2). Errors in the system model used in (6) are an issue. Insight into this problem has been obtained for single-input, single-output discrete-time systems with multiplicative plant uncertainty U, as retention of monotonic convergence is ensured (Owens and Chu 2012) by the frequency domain condition

|U(e^{iθ}) − 1/β| < 1/β, for all θ ∈ [0, 2π],    (7)

which illustrates a number of general empirical rules for ILC robust design. The first is that a small learning gain (and hence small input update changes and slow convergence) will tend to increase robustness and, hence, that it is necessary that multiplicative uncertainties satisfy some form of strict positive real condition which, for (6), is

Re U(e^{iθ}) > 0, for all θ ∈ [0, 2π],    (8)

a condition that limits high-frequency roll-off error and constrains phase errors to the range (−π/2, π/2). The second observation is that if G is non-minimum phase, the inverse G^{-1} is unstable, a situation that cannot be tolerated in practice.

Optimization-Based Iteration

Design criteria can be strengthened by a monotonicity requirement. Measuring error magnitude by a norm ‖e‖_Q on Y, such as the weighted mean square error (with Q symmetric, positive definite), the condition ‖e_{k+1}‖_Q < ‖e_k‖_Q for all k ≥ 0 provides a performance improvement from iteration to iteration. This idea leads to a number of design approaches, Owens and Daley (2008) and Ahn et al. (2007b) (which also examines aspects of robustness).
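The inverse-model update (6) can be illustrated numerically in lifted form: with u_{k+1} = u_k + βG^{-1}e_k, the error contracts by the factor (1 − β) on every trial, so the error norm decays monotonically and geometrically. A sketch under assumed data (the Markov parameters, reference, and gain below are illustrative):

```python
import numpy as np

# Illustrative lifted plant matrix G (lower triangular, invertible: nonzero leading Markov parameter)
N = 8
markov = [0.5 * 0.7**j for j in range(N)]   # assumed Markov parameters of the plant
G = np.array([[markov[i - j] if j <= i else 0.0 for j in range(N)] for i in range(N)])

r = np.sin(np.linspace(0.0, np.pi, N))      # reference trajectory over the trial
beta = 0.5                                   # learning gain chosen in (0, 2)

u = np.zeros(N)
norms = []
for k in range(10):
    e = r - G @ u                            # tracking error on trial k
    norms.append(np.linalg.norm(e))
    u = u + beta * np.linalg.solve(G, e)     # u_{k+1} = u_k + beta * G^{-1} e_k

# e_{k+1} = (1 - beta) e_k  =>  monotone geometric decay of the error norm
assert all(n2 <= n1 * (1.0 - beta) + 1e-9 for n1, n2 in zip(norms, norms[1:]))
```

With a perfect model the contraction is exact; equations (7)-(8) describe how much multiplicative model error the same update can tolerate before monotonicity is lost.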
links between parameter choice and convergence rates (Ahn et al. 2007a; Bristow et al. 2006; Owens and Daley 2008; Wang et al. 2009; Xu 2011). Optimization-based algorithms provide a structured approach to convergence and have a familiar quadratic optimal control structure. Despite the practical benefits of monotonic error norms, this approach underlines the difficulties induced by non-minimum-phase (NMP) properties of the plant. Operator representations extend this theory to include more general problems, such as the intermediate point tracking problem, and, where solutions are non-unique, can be converted into iterative algorithms that inherit the properties of NOILC but converge to a solution that also minimizes an auxiliary optimization criterion.

Many of the challenges addressed by NOILC are inherited by other algorithms, many of which mimic established control design paradigms. For example, the commonly used PD update law

1. Extending current ILC knowledge to other classes of model needed for applications.
2. Integration of online data-based modeling into ILC schemes to enhance adaptive control options.
3. Ensuring the property of error monotonicity or characterizing any non-monotonicity to be expected.
4. The construction of robustness tests and using the ideas in new robust design methodologies.
5. Providing a better understanding of the effect of noise and disturbances on algorithm performance.
6. Extending the range of tasks to include, for example, different challenges for different outputs on different subintervals of [0, T].
7. Creating design tools for nonlinear plant that ensure convergence and a degree of robustness but, in particular, provide some control of internal plant states that may be subject to dynamical constraints.
References

Arimoto S, Miyazaki F, Kawamura S (1984) Bettering operation of robots by learning. J Robot Syst 1(2):123–140
Bien Z, Xu J-X (1998) Iterative learning control: analysis, design, integration and applications. Kluwer Academic, Boston
Bristow DA, Tharayil M, Alleyne AG (2006) A survey of iterative learning control: a learning-based method for high-performance tracking control. IEEE Control Syst Mag 26(3):96–114
Chen Y, Wen C (1999) Iterative learning control: convergence, robustness and applications. Lecture notes in control and information sciences, vol 248. Springer, London/New York
Chen YQ, Li Y, Ahn H-S, Tian G (2013) A survey on fractional-order iterative learning control. J Optim Theory Appl 156:127–140
Daley S, Owens DH, Hatonen J (2007) Application of optimal iterative learning control to the dynamic testing of mechanical structures. Proc Inst Mech Eng Part I: J Syst Control Eng 221:211–222
Edwards JB, Owens DH (1982) Analysis and control of multipass processes. Research Studies Press/Wiley, Chichester
Owens DH, Chu B (2010) Modelling of non-minimum phase effects in discrete-time norm optimal iterative learning control. Int J Control 83(10):2012–2027
Owens DH, Chu B (2012) Combined inverse and gradient iterative learning control: performance, monotonicity, robustness and non-minimum-phase zeros. Int J Robust Nonlinear Control (2):203–226. doi:10.1002/rnc.2893
Owens DH, Daley S (2008) Iterative learning control – monotonicity and optimization. Int J Appl Math Comput Sci 18(3):279–293
Owens DH, Chu B, Songjun M (2012) Parameter-optimal iterative learning control using polynomial representations of the inverse plant. Int J Control 85(5):533–544
Owens DH, Freeman CT, Chu B (2013) Multivariable norm optimal iterative learning control with auxiliary optimization. Int J Control 1(1):1–2
Rogers E, Galkowski K, Owens DH (2007) Control systems theory and applications for linear repetitive processes. Lecture notes in control and information sciences, vol 349. Springer, Berlin
Rogers E, Owens DH, Werner H, Freeman CT, Lewin PL, Schmidt C, Kirchhoff S, Lichtenberg G (2010) Norm-optimal iterative learning control with application to problems in acceleration-based free electron lasers and rehabilitation robotics. Eur J Control 5:496–521
Wang Y, Gao F, Doyle FJ III (2009) Survey on iterative learning control, repetitive control and run-to-run control. J Process Control 19:1589–1600
Xu J-X (2011) A survey on iterative learning control for nonlinear systems. Int J Control 84(7):1275–1294
definite, although weaker conditions are also sufficient for stability in some cases; see Kailath et al. (2000) for such details on the stability of the Kalman filter. Kalman filter stability is connected with observability and controllability of the input-output model of the relevant dynamical system in Kalman (1963). The corresponding algorithm for continuous-time linear measurements (with Gaussian additive white noise) and continuous-time linear dynamical systems (with Gaussian additive white process noise) is called the Kalman-Bucy filter; see Kalman (1961).

Design Issues

In engineering practice, almost all real-world applications are nonlinear or non-Gaussian, and therefore, they do not fit the Kalman filter theory. Nevertheless, by approximating the nonlinear dynamics and measurements with linear equations, one can apply the Kalman filter theory; this is called the "extended Kalman filter" (EKF); see Gelb et al. (1974). The linearization of the nonlinear dynamics and measurements is made by computing the first-order Taylor series expansion and evaluating it at the estimated state vector; this is a very simple and fast approximation that is widely used in real-world applications, and it often gives good estimation accuracy, although there is no guarantee of that. Moreover, there is no guarantee that the EKF will be stable, even if the linearized system satisfies all the theoretical requirements for stability of the Kalman filter.

Even if the dynamics and measurements are exactly linear and if the measurement noise and process noise and initial uncertainty are all exactly Gaussian with exactly known means and covariance matrices, there can still be significant practical problems with Kalman filter accuracy, owing to ill-conditioning. In particular, the Kalman filter can be extremely sensitive to quantization errors in the computer arithmetic and storage. On the other hand, there are many different methods to try to mitigate ill-conditioning, including (1) double or quadruple or octuple precision arithmetic, (2) making the covariance matrices symmetric before and after every operation, (3) Tychonov regularization, (4) tuning the process noise covariance matrix, (5) coding the Kalman filter in principal coordinates or approximately principal coordinates (i.e., aligned with the eigenvectors of the state vector error covariance matrix), (6) sequential scalar measurement updates in a preferred order, and (7) various factorizations of the covariance matrices (e.g., square root, information matrix, information square root, upper triangular and lower triangular factorization, UDL, etc.). The classic book on error covariance matrix factorizations is by Bierman (2006). Unfortunately, there is no guarantee that the Kalman filter will work well even if all of these mitigation methods are used. Moreover, there is no useful theoretical analysis of this phenomenon, with the exception of a few not very tight upper bounds on the condition number. Plotting the numerical values of the condition number of the covariance matrix vs. time is often a helpful diagnostic. In certain real-world applications, the condition number of the Kalman filter error covariance matrix can be ten billion or larger.

Why Is the Kalman Filter So Useful and Popular?

The Kalman filter has been enormously successful in real-world applications, and it is interesting to reflect on why it has been so useful and so popular. In particular, Kalman himself believes that his filter was successful because it was based on probability rather than statistics; see Kalman (1978). For example, the error covariance matrix for the Kalman filter is computed from the assumed dynamics and measurement model and the assumed values of the initial state uncertainty and process noise and measurement noise covariance matrices, rather than by computing sample covariance matrices. Likewise, the Kalman filter computes the estimated state vector from assumed Gaussian and linear probability models, rather than computing the sample mean, as would be done in statistics. There is substantial wisdom in Kalman's assertion, owing to the difficulty of estimating sample covariance matrices
Kalman Filters 607
and sample vectors that are sufficiently accurate, given a limited number of samples and ill-conditioning and high-dimensional state vectors.

A second reason that Kalman filters are so popular is that the real-time computational complexity is very reasonable for modern digital computers, even for problems with a high-dimensional state vector. In particular, the computational complexity of the Kalman filter scales as the cube of the dimension of the state vector; for example, the modern GPS system uses a Kalman filter with a state vector of dimension of about 1,000 to jointly estimate the orbits of the satellites in the GPS constellation. In 1960, when Kalman's paper was first published, digital computers were starting to become fast enough at reasonable cost to multiply large matrices in real time, which is the most challenging computation in a Kalman filter. Today computers are roughly ten orders of magnitude faster per unit cost than in 1960, and hence, we can run Kalman filters for high-dimensional problems on very inexpensive computers that fit into your wristwatch. A third reason that Kalman filters are popular is that the algorithms are easy to understand and code and test. A fourth reason is the guaranteed stability of the Kalman filter under very mild conditions which can always be satisfied in practical applications. A fifth reason is that the Kalman filter is optimal for time-varying unstable linear dynamics with time-varying measurement noise covariance and process noise covariance. A sixth reason is that one can use Kalman filters for nonlinear problems by approximating the nonlinear dynamics and measurement equations with a first-order Taylor series; this is called the extended Kalman filter (EKF), which is without a doubt the most widely used algorithm in real-world estimation applications. A seventh reason is that the Kalman filter automatically provides a convenient quantification of uncertainty of the estimated state vector using the error covariance matrix. The final reason is that it is easy to test the accuracy of the Kalman filter by comparing the theoretical error covariance matrix to errors computed by Monte Carlo simulations of the filter; the two errors should agree approximately, and statistically significant discrepancies suggest bugs in the code or ill-conditioning of the error covariance matrices or nonlinearities or errors in modeling the dynamics or measurements.

Kalman's 1960 paper represented a big paradigm shift in two ways: (1) it exploited fast low-cost modern digital computers, whereas the literature up to that time did not, and (2) it used time domain methods rather than the ubiquitous Fourier transform methods, which limited the dynamics to steady state asymptotic in time, which in turn limited the theory to cover stable dynamics. Of course, today we take both of these big points as normal engineering rather than revolutionary and surprising. The state of the art prior to Kalman's 1960 paper was the Wiener filter, which was based firmly on the Fourier transform. The Wiener filter required very lengthy and cumbersome algebraic spectral factorization with complex variables, resulting in erroneous formulas published in certain books, owing to algebraic errors which were not obvious and which could not be checked by computers, owing to the nonexistence of computer algebra software (e.g., MATHEMATICA) in 1960. Kalman explains many other problems with the Wiener filter in Kalman (2003).

Summary and Future Directions

To a large extent the Kalman filter theory is complete. There is a simple and useful theory of stability of the Kalman filter, which is completely lacking for nonlinear filters including the extended Kalman filter (EKF) and particle filters. Moreover, there are many robust versions of the Kalman filter that have been invented to mitigate ill-conditioning of the covariance matrix as well as uncertainty in system models and system parameters. However, there remain many important design issues that need to be addressed for practical applications; see Daum (2005) for details. The most obvious issues include nonlinear measurements, nonlinear plant models, robustness to uncertainty in measurement models and plant models, non-Gaussian measurement noise and plant noise, non-Gaussian initial uncertainty in the state vector, and ill-conditioning.
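Two cheap safeguards against the ill-conditioning discussed above — forcing covariance symmetry after every operation (mitigation 2 in the list) and the Joseph-form measurement update (a standard covariance-preserving form, not named explicitly in the text) — can be shown in a minimal linear Kalman filter sketch. The constant-velocity model and noise levels below are illustrative assumptions:

```python
import numpy as np

# Illustrative 1-D constant-velocity model: state = [position, velocity]
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])    # state transition
H = np.array([[1.0, 0.0]])               # position-only measurement
Q = 0.01 * np.eye(2)                     # process noise covariance
R = np.array([[0.25]])                   # measurement noise covariance

def symmetrize(P):
    """Force symmetry of a covariance matrix (mitigation against round-off drift)."""
    return 0.5 * (P + P.T)

def kf_step(x, P, z):
    # Predict
    x = F @ x
    P = symmetrize(F @ P @ F.T + Q)
    # Update in Joseph form: numerically safer, keeps P positive semidefinite
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)
    x = x + (K @ (z - H @ x)).ravel()
    I_KH = np.eye(2) - K @ H
    P = symmetrize(I_KH @ P @ I_KH.T + K @ R @ K.T)
    return x, P

rng = np.random.default_rng(0)
truth = np.array([0.0, 1.0])
x, P = np.zeros(2), 10.0 * np.eye(2)
for _ in range(50):
    truth = F @ truth
    z = H @ truth + rng.normal(0.0, 0.5, size=1)
    x, P = kf_step(x, P, z)

assert np.allclose(P, P.T)               # symmetry preserved
assert np.all(np.linalg.eigvalsh(P) > 0) # covariance stays positive definite
```

For seriously ill-conditioned problems the factorized filters cited in the text (square-root, UDL, etc.) replace this plain covariance recursion entirely.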
608 KYP Lemma and Generalizations/Applications
In these equations, the particular matrix Θ is chosen to describe the gain bound condition as a special case of the quadratic form (1), and we observe that Θ appears in the LMI as in (2). It turns out that the equivalence of (1) and (2) holds not only for this particular Θ but also for an arbitrary symmetric matrix Θ. This result is called the KYP lemma, which states that, given arbitrary matrices A, B, and Θ = Θᵀ, the FDI (1) holds for all frequencies ω if and only if there exists a matrix P = Pᵀ satisfying the LMI (2), provided A has no eigenvalues on the imaginary axis.

The FDI in (1) can be specialized to an FDI

[L(jω); I]* Π [L(jω); I] < 0    (3)

(here [X; Y] denotes vertical stacking) on the transfer function

L(s) := C(sI − A)^{-1}B + D

by choosing

Θ := [C D; 0 I]ᵀ Π [C D; 0 I].    (4)

The choice of matrix Π allows for characterizations of important system properties involving gain and phase of L(s). For instance, the FDI (3) with

Π := [0 −I; −I 0]

gives L(jω) + L(jω)* > 0. This is called the positive real property, with which the phase angle remains between ±90° when L(jω) is a scalar.

Generalization

The standard KYP lemma deals with FDIs that are required to hold for all frequencies. To allow for more flexibility in practical system designs, the KYP lemma has been generalized to deal with FDIs in (semi)finite frequency ranges. For instance, a version of the generalized KYP lemma states that the FDI (1) holds in the low frequency range |ω| ≤ ϖℓ if and only if there exist matrices P = Pᵀ and Q = Qᵀ > 0 satisfying

[A B; I 0]ᵀ [−Q P; P ϖℓ²Q] [A B; I 0] + Θ < 0,    (5)

provided A has no imaginary eigenvalues in the frequency range. In the limiting case where ϖℓ approaches infinity and the FDI is required to hold for the entire frequency range, the solution Q to (5) approaches zero, and we recover (2).

The role of the additional parameter Q is to enforce the FDI only in the low frequency range. To see this, consider the case where the system is stable and a sinusoidal input u = Re[û e^{jωt}], with (complex) phasor vector û, is applied. The state converges to the sinusoid x = Re[x̂ e^{jωt}] in the steady state, where x̂ := G(jω)û. Multiplying (5) by the column vector obtained by stacking x̂ and û from the right, and by its complex conjugate transpose from the left, we obtain

(ϖℓ² − ω²) x̂* Q x̂ + [x̂; û]* Θ [x̂; û] < 0.

In the low frequency range |ω| ≤ ϖℓ, the first term is nonnegative, enforcing the second term to be negative, which is exactly the FDI in (1). If ω is outside of the range, however, the first term is negative, and the FDI is not required to hold.

Similar results hold for various frequency ranges. The term involving Q in (5) can be expressed as the Kronecker product Ψ ⊗ Q with Ψ being a diagonal matrix with entries (−1, ϖℓ²). The matrix Ψ arises from characterization of the low frequency range:

[jω; 1]* Ψ [jω; 1] = ϖℓ² − ω² ≥ 0.

By different choices of Ψ, middle and high frequency ranges can also be characterized:
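The specialization (3)-(4) can be checked numerically for a concrete transfer function. A sketch under an assumed example system (not from the text): L(s) = 1/(s + 1) is positive real, so the quadratic form in (3) with Π = [[0, −1], [−1, 0]] should be negative at every sampled frequency.

```python
import numpy as np

# Illustrative state-space data for L(s) = C (sI - A)^{-1} B + D = 1/(s + 1)
A, B, C, D = -1.0, 1.0, 1.0, 0.0
Pi = np.array([[0.0, -1.0], [-1.0, 0.0]])   # Pi for the positive real property

def fdi_value(omega):
    """[L(jw); 1]^* Pi [L(jw); 1] -- equals -2 Re L(jw), negative iff Re L(jw) > 0."""
    L = C / (1j * omega - A) * B + D
    v = np.array([L, 1.0])
    return np.real(np.conj(v) @ Pi @ v)

# FDI (3) holds at all sampled frequencies for this positive real L(s)
values = [fdi_value(w) for w in np.logspace(-3, 3, 200)]
assert all(val < 0 for val in values)
```

Sampling frequencies only spot-checks the FDI, of course; the point of the KYP lemma is that the single LMI (2) (or (5) for a frequency range) certifies it exactly.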
inequality conditions (6). In general, both coefficient matrices F and Θ may depend on the design parameters, but if the poles of the controller are fixed (as in PID control), then the design parameters will appear only in Θ. If in addition an FDI specifies a convex region for L(jω) on the complex plane, then the corresponding inequality (6) gives a convex constraint on P, Q, and the design parameters. This is the case for specifications of a gain upper bound (disk: |L| < γ) and a phase bound (half plane constraint on ∠L). A gain lower bound |L| > γ is not convex but can often be approximated by a half plane. The design parameters satisfying the specifications can then be computed via convex programming.

Various design problems other than the open-loop shaping can also be solved in a similar manner, including finite impulse response (FIR) digital filter design with gain and phase constraints in a passband and stop-band and sensor or actuator placement for mechanical control systems (Hara et al. 2006; Iwasaki et al. 2003). Control design with the Youla parametrization also falls within the framework if a basis expansion is used for the Youla parameter and the coefficients are sought to satisfy convex FDI constraints on closed-loop transfer functions.

An important problem, which falls outside of this framework and remains open, is the design of feedback controllers to satisfy multiple FDIs on closed-loop transfer functions in various frequency ranges. There have been some attempts to address this problem, but none of them has so far succeeded to give an exact solution.

The KYP lemma has been extended in other directions as well, including FDIs with frequency-dependent weights (Graham and de Oliveira 2010), internally positive systems (Tanaka and Langbort 2011), full rank polynomials (Ebihara et al. 2008), real multipliers (Pipeleers and Vandenberghe 2011), a more general class of FDIs (Gusev 2009), multidimensional systems (Bachelier et al. 2008), negative imaginary systems (Xiong et al. 2012), symmetric formulations for robust stability analysis (Tanaka and Langbort 2013), and multiple frequency intervals (Pipeleers et al. 2013). Extensions of the KYP lemma and related S-procedures are thoroughly reviewed in Gusev and Likhtarnikov (2006). A comprehensive tutorial of robust LMI relaxations is provided in Scherer (2006), where variations of the KYP lemma, including the generalized KYP lemma as a special case, are discussed in detail.
Graham M, de Oliveira M (2010) Linear matrix inequality tests for frequency domain inequalities with affine multipliers. Automatica 46:897–901
Gusev S (2009) Kalman-Yakubovich-Popov lemma for matrix frequency domain inequality. Syst Control Lett 58(7):469–473
Gusev S, Likhtarnikov A (2006) Kalman-Yakubovich-Popov lemma and the S-procedure: a historical essay. Autom Remote Control 67(11):1768–1810
Hara S, Iwasaki T, Shiokata D (2006) Robust PID control using generalized KYP synthesis. IEEE Control Syst Mag 26(1):80–91
Iwasaki T, Hara S (2005) Generalized KYP lemma: unified frequency domain inequalities with design applications. IEEE Trans Autom Control 50(1):41–59
Iwasaki T, Meinsma G, Fu M (2000) Generalized S-procedure and finite frequency KYP lemma. Math Probl Eng 6:305–320
Iwasaki T, Hara S, Yamauchi H (2003) Dynamical system design from a control perspective: finite frequency positive-realness approach. IEEE Trans Autom Control 48(8):1337–1354
Kalman R (1963) Lyapunov functions for the problem of Lur'e in automatic control. Proc Natl Acad Sci 49(2):201–205
Pipeleers G, Vandenberghe L (2011) Generalized KYP lemma with real data. IEEE Trans Autom Control 56(12):2940–2944
Pipeleers G, Iwasaki T, Hara S (2013) Generalizing the KYP lemma to the union of intervals. In: Proceedings of the European control conference, Zurich, pp 3913–3918
Rantzer A (1996) On the Kalman-Yakubovich-Popov lemma. Syst Control Lett 28(1):7–10
Scherer C (2006) LMI relaxations in robust control. Eur J Control 12(1):3–29
Tanaka T, Langbort C (2011) The bounded real lemma for internally positive systems and H-infinity structured static state feedback. IEEE Trans Autom Control 56(9):2218–2223
Tanaka T, Langbort C (2013) Symmetric formulation of the S-procedure, Kalman-Yakubovich-Popov lemma and their exact losslessness conditions. IEEE Trans Autom Control 58(6):1486–1496
Willems J (1971) Least squares stationary optimal control and the algebraic Riccati equation. IEEE Trans Autom Control 16:621–634
Xiong J, Petersen I, Lanzon A (2012) Finite frequency negative imaginary systems. IEEE Trans Autom Control 57(11):2917–2922
Lane keeping systems are vehicle guidance systems that aim at preventing lane departure maneuvers, which may lead to accidents, i.e., collision with surrounding obstacles and vehicles.

Lane Keeping Systems Architecture

The main components of a lane keeping system and their interconnections are shown in Fig. 1.

[Fig. 1: The camera and radar provide lane data and obstacle data to a sensor fusion module, which feeds the decision-making and control module; steering and braking commands are executed by low-level controllers]
Relative positions and velocities of the host vehicle w.r.t. the surrounding environment are measured by one radar, typically installed on the front of the vehicle, and possibly by the camera, typically installed on the windshield. Position of the host vehicle within the lane and further information, e.g., road geometry, are measured by the camera. These measurements are then fused by the sensor fusion module to provide accurate measurements of the position and velocity of the vehicle w.r.t. the surrounding environment and the lane in the widest range of operating conditions and scenarios.

The task of the decision-making and control module is to assess the risk that the vehicle crosses the lane in a dangerous way and, possibly, to take an action that can range from warning the driver to issuing an assisting intervention, e.g., braking and/or steering. Such steering and braking commands are actually implemented by low-level controllers.

The different modules will be overviewed in the following sections.

Sensing and Actuation

Radar
Radars for automotive applications are placed in the front of the car, typically behind the grille. The radar emits radio waves, and the distance from the vehicle ahead is calculated by measuring the arrival time and direction of the reflected radio waves. The relative velocity is determined by relying on the Doppler effect, i.e., by measuring the frequency change of the reflected waves. Relative distance and velocity measurements are typically updated with a frequency of 10 Hz. Radars for automotive applications emit waves with a frequency of 77 GHz and detect objects within an approximate range of 150 m and a view angle of about ±10°, with a deviation of 20–30 cm from the correct value for 95% of the measurements (Eidehall 2004). New radar systems increase the range up to about 200 m with a view angle of about ±10° (News Releases DENSO Corporation 2013a).

Typically, radar units are equipped with computer systems running signal processing algorithms that detect and track objects and, for each of them, calculate relative position, speed, and azimuth angle, also providing additional information, e.g., the time an object has been tracked and a flag indicating that a target has been locked. Such additional information is typically used in the logics implementing the decision-making algorithms of, among others, the lane keeping system.

There are several issues arising from the use of a radar in automotive applications, e.g., wave reflections due to road bumps and barriers that may lead the signal processing algorithms to false object detections (Eidehall 2004). Moreover, interference and the vehicle dynamics (News Releases DENSO Corporation 2013a), e.g., pitching due to braking, may limit the capability of the signal processing algorithms to correctly detect and track the surrounding objects. The latter may be solved by, e.g., using electric motors that
Lane Keeping 617
adjust the radar antenna axes in order to compensate for the vehicle dynamics (News Releases DENSO Corporation 2013a).

Vision Systems
Vision systems in lane keeping applications are typically based on a single CCD camera mounted next to the rear-view mirror placed at the center of the windshield. The image is typically captured at 640 × 480 pixels and then processed by an image processing unit. The sampling time of the vision system is about 0.1 s, but it can change depending on, e.g., the complexity of the scene, for example, in city traffic (Eidehall 2004).

Lane markings are detected by using differences in the image contrast (Technology Daimler and Safety Innovation 2013). The camera can be either monochrome or full colored. The latter is used to enhance the detection of lane markings, which have different colors around the world (News Releases DENSO Corporation 2013b). Distances to the lane markings and road geometry parameters, like heading angle and curvature, are determined by the image processing algorithms, which must be robust to poor images due to bad weather conditions or worn lane markings. Estimation of road geometry parameters, like curvature measurement, can be a challenging problem (Lundquist and Schön 2011), especially during rain or fog (Eidehall 2004).

Depending on the image processing algorithms the cameras are equipped with, surrounding objects can also be detected and tracked. In particular, pattern recognition algorithms can be used to find objects in the images and classify them into cars, trucks, motorcycles, and pedestrians. Vehicles (or other objects) can typically be detected in a range of about 60–70 m, with lower accuracy than a radar (Eidehall 2004).

Actuators
In order to keep the vehicle within its lane, the most convenient actuator is the steering. Hence, a lane keeping system can be quite easily built in those vehicles equipped with electric power-assisted steering (EPAS) systems. In particular, an additional steering torque can be added by the EPAS to the driver's steering torque, in order to generate the desired yaw moment calculated by the decision-making and control module.

Clearly, the steering command is not the only one available to affect the vehicle yaw motion, thus changing its orientation and lateral position within the lane. Individual wheel braking may also be used (Technology Daimler and Safety Innovation 2013). In particular, in vehicles equipped with a yaw motion control system via individual braking, a braking torque request for each wheel can be sent to the yaw motion control system in order to generate the desired yaw motion.

Decision Making and Control

The decision making and control in a lane keeping problem can be conceptually divided into two tasks: the threat assessment and the lane position control. The threat assessment problem can be stated as the problem of detecting the risk of accident due to an unintended lane departure, for a given situation of the surrounding environment (i.e., surrounding vehicles and obstacles). The lane position control problem is the problem of controlling the vehicle yaw and lateral motion in order to stay within the lane. The lane position control is activated once the threat assessment detects the risk of accident.

We point out that the border between the corresponding modules executing these two tasks may be blurred for different existing commercial lane keeping systems. That is, the two problems may not be solved by two separate modules, but rather seen and solved as a single problem. Moreover, the following presentation of the threat assessment and the lane position control problems and approaches abstracts from the implementation of a particular lane keeping system available on the market, rather focusing on fundamental concepts.

Threat Assessment
The core information in a threat assessment algorithm for lane keeping applications is given by a measure called time to lane crossing (TLC). This is the predicted time when a front tire intersects a
lane boundary. As explained in van Winsum et al. (2000), the TLC can be calculated in different ways. Its simplest expression is reported as (Eidehall 2004)

TLC = (W/2 − W_veh/2 − y_off) / ẏ_off,   (1)

where W is the lane width, y_off is the vehicle lateral position within the lane, and W_veh is the vehicle width. Equation (1) can be easily modified to calculate the TLC w.r.t. any lane boundary of the adjacent lanes.

The simplest way of using the TLC is just monitoring it and triggering an action as the TLC passes a threshold. Nevertheless, depending on the vehicle manufacturer, more sophisticated logic can be developed in order to correctly interpret the driver's intention and minimize unnecessary assisting interventions. A few scenarios follow that must be taken into account while developing such logic in order not to interfere with the driver. In particular, the threat assessment module should stop or not trigger any assisting intervention while the vehicle is approaching or crossing a lane boundary if
• The indicators are active,
• A risk of collision with the vehicle ahead is detected, such that the vehicle is crossing the lane markings as a result of an evasive maneuver,
• The radar detects a slower vehicle ahead and the driver accelerates, since this may be an overtaking maneuver (Technology Daimler and Safety Innovation 2013),
• The driver's steering wheel torque indicates that the driver is acting against the system,
• The driver manually initiates a maneuver, driving the vehicle back to its lane (i.e., the driver executes "the right" maneuver),
• The vehicle enters a motor highway or a bend (Technology Daimler and Safety Innovation 2013).

Part of the threat assessment task is predicting the trajectories of the surrounding vehicles. For instance, if a threat vehicle is traveling in the adjacent lane (in the same or opposite direction), its position has to be predicted at the TLC in order to decide whether to trigger an intervention, if a collision is predicted, or not (Eidehall 2004). This step is repeated for all the detected threat vehicles, provided that the onboard radar and the camera support multiple-target tracking.

In order to minimize the interference of the lane keeping system with the driver and/or to not let the system perform dangerous maneuvers, assisting interventions should not be triggered if the quality of the measurements is such that the information about the surrounding environment is poor. For instance, in case of low visibility that limits the detection of the lane markings and the estimation of the road geometry, the system should be temporarily deactivated or downgraded.

In summary, the threat assessment module has to be designed with the objective of detecting the risk of accident due to lane departure while not interfering with the driver through unnecessary interventions (i.e., nuisance minimization).

Lane Position Control
As observed in section "Actuators," the vehicle motion within the lane can be affected in two ways, i.e., through steering and individual wheel braking. Clearly, a steering command can be issued by both the driver and the lane keeping system.

Before issuing a steering command, in order to minimize the system nuisance, the lane keeping system may issue other types of low-intrusiveness interventions. For instance, if a "low"-level threat is detected by the threat assessment module (i.e., a threat where the risk of accident is not imminent), warnings or other stimuli may be issued in order to induce the driver to execute the right maneuver. For instance, based on, e.g., spectrum analysis of the driver's steering command, driver inattention or drowsiness may be detected and a warning issued. As observed in Technology Daimler and Safety Innovation (2013), different types of warning can be used for different vehicle types. In passenger cars, in such cases, a vibration motor in the steering wheel may warn the driver. In trucks, audible, directional warning signals can be used to let the driver know
that the vehicle trajectory needs to be adjusted. In buses, in order to avoid bothering the passengers, driver warnings are issued through vibration motors placed in the driver's seat.

Other types of "soft intervention" aim at increasing the steering impedance in the direction leading to a lane crossing that might cause a collision with surrounding vehicles. Generating the desired steering impedance can be easily formulated as a steering torque control problem. Nevertheless, tuning the control algorithm to obtain the desired steering feel can be an involved and time-consuming procedure based on extensive in-vehicle testing.

Besides warnings and "soft interventions" aiming at inducing the driver to perform correct maneuvers, as part of the lane position control task in a lane keeping system, a lateral control algorithm w.r.t. the lane boundaries is needed. Consider the vehicle sketched in Fig. 2. The equations describing the vehicle motion within the lane can be compactly written in state-space form as

ẋ = A x + B δ + D ψ̇_des,   (2)

where x = [e_y, ė_y, e_ψ, ė_ψ]ᵀ, ψ̇_des is the desired yaw rate, e.g., calculated based on the road curvature, and A, B, D are speed-dependent matrices that can be found in Rajamani (2006). The (unstable) system can be stabilized by a state-feedback control law

δ = −K x + δ_ff,   (3)
where K is a stabilizing static gain and δ_ff is a feedforward term that can be used to compensate for the road curvature. In Rajamani (2006), it is shown that, while e_y(t) → 0 as t → ∞, e_ψ approaches a nonzero steady-state value, no matter how δ_ff is chosen, for non-straight roads.

Despite the simple problem formulation and solution, controlling the vehicle position within the lane is not a trivial task. Indeed, having the control law (3) active all the time may increase the nuisance, leading to an unacceptable driving experience. For this reason, the steering command calculated through (3) may be active only when the vehicle significantly deviates from the road centerline, i.e., approaches the lane markings. Clearly, adding such logic complicates the analysis of the closed-loop behavior, thus making extensive in-vehicle tuning and verification necessary.

Summary and Future Directions

In this chapter, we have overviewed the general issues and requirements that must be considered in the design of a lane-keeping system.

The variety of environmental conditions the sensing system should operate in, together with the range of diverse scenarios the decision-making module should cope with, render the design and verification problems challenging, costly, and time consuming for a lane-keeping system. It is, therefore, necessary to approach the design of such systems by also providing safety guarantees to the largest extent, yet minimizing conservatism and intrusiveness of the overall system. Model-based approaches to threat assessment and decision-making problems, as proposed in Falcone et al. (2011) for a lane departure application, provide neat design and verification frameworks, which can clearly describe the safe operation of the overall system. Adopting such design methodologies can potentially contribute to a consistent reduction of the development time by reducing the a posteriori safety verification phase. On the other hand, the computational complexity of formal model-based verification methods can dramatically increase in those scenarios where system nonlinearity and nonconvex state spaces become relevant. Hence, future research efforts aiming at developing low-complexity verification methods might greatly impact the future development of automated driving systems.

Cross-References

▶ Adaptive Cruise Control
▶ Vehicle Dynamics Control

Acknowledgments The author would like to thank Dr. Erik Coelingh for the helpful discussion.

Bibliography

Eidehall A (2004) An automotive lane guidance system. Lic thesis 1122, Linköping University
Falcone P, Ali M, Sjöberg J (2011) Predictive threat assessment via reachability analysis and set invariance theory. IEEE Trans Intell Transp Syst 12(4):1352–1361
Lundquist C, Schön TB (2011) Joint ego-motion and road geometry estimation. Inf Fusion 4(12):253–263
News Releases DENSO Corporation (2013a) Denso develops higher performance millimeter-wave radar, June 2013
News Releases DENSO Corporation (2013b) Denso develops new vision sensor for active safety systems, June 2013
Rajamani R (2006) Vehicle dynamics and control, chap. 2. Springer, New York
Technology Daimler and Safety Innovation (2013) Lane keeping assist: always on the right track, June 2013
van Winsum W, Brookhuis KA, de Waard D (2000) A comparison of different ways to approximate time-to-line crossing (TLC) during car driving. Accid Anal Prev 32(1):47–56

Learning in Games

Jeff S. Shamma
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA, USA

Abstract

In a Nash equilibrium, each player selects a strategy that is optimal with respect to the strategies of other players. This definition does not mention the process by which players reach a
Nash equilibrium. The topic of learning in games seeks to address this issue in that it explores how simplistic learning/adaptation rules can lead to Nash equilibrium. This entry presents a selective sampling of learning rules and their long-run convergence properties, i.e., conditions under which player strategies converge or not to Nash equilibrium.

Keywords

Cournot best response; Fictitious play; Log-linear learning; Mixed strategies; Nash equilibrium

Introduction

In a Nash equilibrium, each player's strategy is optimal with respect to the strategies of other players. Accordingly, Nash equilibrium offers a predictive model of the outcome of a game. That is, given the basic elements of a game – (i) a set of players; (ii) for each player, a set of strategies; and (iii) for each player, a utility function that captures preferences over strategies – one can model/assert that the strategies selected by the players constitute a Nash equilibrium.

In making this assertion, there is no suggestion of how players may come to reach a Nash equilibrium. A motivating quotation in this regard is:

The attainment of equilibrium requires a disequilibrium process (Arrow 1986).

Such considerations motivate the study of dynamic processes that can lead to Nash equilibrium, as well as understand possible barriers to reaching Nash equilibrium.

In the setup of learning in games, players repetitively play a game over a sequence of stages. At each stage, players use past experiences/observations to select a strategy for the current stage. Once player strategies are selected, the game is played, information is updated, and the process is repeated. The question is then to understand the long-run behavior, e.g., whether or not player strategies converge to Nash equilibrium.

Traditionally, the dynamic processes considered under learning in games have players selecting strategies based on a myopic desire to optimize for the current stage. That is, players do not consider long-run effects in updating their strategies. Accordingly, while players are engaged in repetitive play, the dynamic processes generally are not optimal in the long run (as in the setting of "repeated games"). Indeed, the survey article of Hart (2005) refers to the dynamic processes of learning in games as "adaptive heuristics." This distinction is important in that an implicit concern in learning in games is to understand how "low rationality" (i.e., suboptimal and heuristic) processes can lead to the "high rationality" (i.e., mutually optimal) notion of Nash equilibrium.

This entry presents a sampling of results from the learning in games literature through a selection of illustrative dynamic processes, a review of their long-run behaviors relevant to Nash equilibrium, and pointers to further work.
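The basic elements listed above – players, strategies, utilities – and the Nash equilibrium condition can be made concrete in a short script. The sketch below uses a hypothetical two-commuter game with invented payoffs (none of the numbers come from this entry): each player picks path "A" or "B", utility is negative travel time, and sharing a path adds congestion. It checks the equilibrium condition by testing unilateral deviations.

```python
from itertools import product

# Illustrative (invented) travel times: a shared path is slower.
def travel_time(own, other):
    return 30 if own == other else 20

paths = ("A", "B")

def best_response(other_path):
    # Set of strategies maximizing own utility against the other's fixed path.
    best = max(-travel_time(p, other_path) for p in paths)
    return {p for p in paths if -travel_time(p, other_path) == best}

def is_nash(profile):
    # A (pure) Nash equilibrium: no player gains by a unilateral deviation,
    # i.e., each player's strategy lies in its best-response set.
    return all(profile[i] in best_response(profile[1 - i]) for i in (0, 1))

equilibria = [p for p in product(paths, repeat=2) if is_nash(p)]
print(equilibria)  # → [('A', 'B'), ('B', 'A')]
```

In this toy game, the two "split" profiles are the pure equilibria: whenever both commuters share a path, either one can shorten its travel time by switching.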
An illustrative example is a commuting game, in which a collection of commuters must each choose a path from an origin to a destination, e.g., for the daily commute. Each day, every commuter uses past information and observations to select that day's path according to some selection rule, and this process is repeated day after day.

In game-theoretic terms, player "strategies" are paths linking their origins to destinations, and player "utility functions" reflect travel times. At a Nash equilibrium, players have selected paths such that no individual player can find a shorter travel time given the chosen paths of others. The learning in games question is then whether player paths indeed converge to Nash equilibrium in the long run. Not surprisingly, the answer depends on the specific process that players use to select paths and possible additional structure of the commuting game.

Suppose that one of the players, say "Alice," is choosing among a collection of paths. For the sake of illustration, let us give Alice the following capabilities: (i) Alice can observe the paths chosen by all other players, and (ii) Alice can compute off-line her travel time as a function of her path and the paths of others.

With these capabilities, Alice can compute running averages of the travel times along all available paths. Note that the assumed capabilities allow Alice to compute the travel time of a path, and hence its running average, whether or not she took the path on that day. With average travel time values in hand, two possible learning rules are:
– Exploitation: Choose the path with the lowest average travel time.
– Exploitation with Exploration: With high probability, choose the path with the lowest average travel time, and with low probability, choose a path at random.

Assuming that all players implement the same learning rule, each case induces a dynamic process that governs the daily selection of paths and determines the resulting long-run behavior. We will revisit these processes in a more formal setting in the next section.

A noteworthy feature of these learning rules is that they do not explicitly depend on the utility functions of other players. For example, suppose one of the other players is willing to trade off travel time for more scenic routes. Similarly, suppose one of the other players prefers to travel on high-congestion paths, e.g., a rolling billboard seeking to maximize exposure. The aforementioned learning rules for Alice remain unchanged. Of course, Alice's actions implicitly depend on the utility functions of other players, but only indirectly through their selected paths. This characteristic of no explicit dependence on the utility functions of others is known as "uncoupled" learning, and it can have major implications on the achievable long-run behavior (Hart and Mas-Colell 2003a).

In assuming the ability to observe the paths of other players and to compute off-line travel times as a function of these paths, these learning rules impose severe requirements on the information available to each player. Less restrictive are learning rules that are "payoff based" (Young 2005). A simple modification that leads to payoff-based learning is as follows. Alice maintains an empirical average of the travel times of a path using only the days that she took that path. Note the distinction – on any given day, Alice remains unaware of travel times for the routes not selected. Using these empirical average travel times, Alice can then mimic any of the aforementioned learning rules. As intended, she does not directly observe the paths of others, nor does she have a closed-form expression for travel times as a function of player paths. Rather, she can only select a path and measure the consequences. As before, all players implementing such a learning rule induce a dynamic process, but the ensuing analysis in payoff-based learning can be more subtle.

Learning Dynamics

We now give a more formal presentation of selected learning rules and results concerning their long-run behavior.

Preliminaries
We begin with the basic setup of games with a finite set of players, {1, 2, …, N}, and for each player i, a finite set of strategies, A_i. Let

A = A_1 × ⋯ × A_N
denote the set of strategy profiles. Each player, i, is endowed with a utility function

u_i : A → ℝ.

Utility functions capture player preferences over strategy profiles. Accordingly, for any a, a′ ∈ A, the condition

u_i(a) > u_i(a′)

indicates that player i prefers the strategy profile a over a′.

The notation −i indicates the set of players other than player i. Accordingly, we sometimes write a ∈ A as (a_i, a_−i) to isolate a_i, the strategy of player i, versus a_−i, the strategies of other players. The notation −i is used in other settings as well.

Utility functions induce best-response sets. For a_−i ∈ A_−i, define

B_i(a_−i) = { a_i : u_i(a_i, a_−i) ≥ u_i(a′_i, a_−i) for all a′_i ∈ A_i }.

In words, B_i(a_−i) denotes the set of strategies that are optimal for player i in response to the strategies of other players, a_−i.

A strategy profile a* ∈ A is a Nash equilibrium if for any player i and any a′_i ∈ A_i,

u_i(a*_i, a*_−i) ≥ u_i(a′_i, a*_−i).

Players may also select their strategies probabilistically. A mixed strategy for player i is a probability distribution over A_i, written α_i ∈ Δ(A_i), and a mixed strategy profile is α = (α_1, …, α_N) with α_i ∈ Δ(A_i) for each i. Let us assume that players choose a strategy randomly and independently according to these mixed strategies. Accordingly, define Pr[a; α] to be the probability of strategy a under the mixed strategy profile α, and define the expected utility of player i as

U_i(α) = Σ_{a ∈ A} u_i(a) Pr[a; α].

A mixed strategy Nash equilibrium is a mixed strategy profile, α*, such that for any player i and any α′_i ∈ Δ(A_i),

U_i(α*_i, α*_−i) ≥ U_i(α′_i, α*_−i).

Special Classes of Games
We will reference three special classes of games: (i) zero-sum games, (ii) potential games, and (iii) weakly acyclic games.

Zero-sum games: There are only two players (i.e., N = 2), and u_1(a) = −u_2(a).

Potential games: There exists a (potential) function

Φ : A → ℝ

such that for any pair of strategies, a = (a_i, a_−i) and a′ = (a′_i, a_−i), that differ only in the strategy of player i,

u_i(a_i, a_−i) − u_i(a′_i, a_−i) = Φ(a_i, a_−i) − Φ(a′_i, a_−i).

Weakly acyclic games: There exists a function Φ : A → ℝ such that, from any strategy profile that is not a Nash equilibrium, some player can strictly increase both its own utility and Φ by unilaterally changing its strategy; iterating such unilateral improvements eventually reaches a Nash equilibrium.

The aforementioned commuting game constitutes a potential game under certain special assumptions. These are as follows: (i) the delay on a road only depends on the number of users (and not their identities) and (ii) all players measure delay in the same manner (Monderer and Shapley 1996).

Weakly acyclic games are a generalization of potential games. In potential games, there exists a potential function that captures differences in utility under unilateral (i.e., single player) changes in strategy. In weakly acyclic games (see Young 1998), if a strategy profile is not a Nash equilibrium, then there exists a player who can simultaneously achieve an increase in utility while increasing the potential function. The characterization of weakly acyclic games through a potential function herein is not traditional and is borrowed from Marden et al. (2009a).

Forecasted Best-Response Dynamics
One family of learning dynamics involves players formulating a forecast of the strategies of other players based on past observations and then playing a best response to this forecast.

Cournot Best-Response Dynamics
The simplest illustration is Cournot best-response dynamics. Players repetitively play the same game over stages t = 0, 1, 2, …. At stage t, a player forecasts that the strategies of other players are the strategies played at the previous stage t − 1. The following rules specify Cournot best response with inertia. For each stage t and for each player i:
• With probability p ∈ (0, 1), a_i(t) = a_i(t − 1) (inertia).
• With probability 1 − p, a_i(t) ∈ B_i(a_−i(t − 1)) (best response).
• If a_i(t − 1) ∈ B_i(a_−i(t − 1)), then a_i(t) = a_i(t − 1) (continuation).

Proposition 1 For weakly acyclic (and hence potential) games, player strategies under Cournot best-response dynamics with inertia converge to a Nash equilibrium.

Cournot best-response dynamics need not always converge in games with a Nash equilibrium, hence the restriction to weakly acyclic games.

Fictitious Play
In fictitious play, introduced in Brown (1951), players also use past observations to construct a forecast of the strategies of other players. Unlike Cournot best-response dynamics, this forecast is probabilistic.

As a simple example, consider the commuting game with two players, Alice and Bob, who both must choose between two paths, A and B. Now suppose that at stage t = 10, Alice has observed that Bob used path A for 6 out of the previous 10 days and path B for the remaining days. Then Alice's forecast of Bob is that he will choose path A with 60 % probability and path B with 40 % probability. Alice then chooses between paths A and B in order to optimize her expected utility. Likewise, Bob uses Alice's empirical averages to form a probabilistic forecast of her next choice and selects a path to optimize his expected utility.

More generally, let π_j(t) ∈ Δ(A_j) denote the empirical frequency vector for player j at stage t. This vector is a probability distribution that indicates the relative frequency of times player j played each strategy in A_j over stages 0, 1, …, t − 1. In fictitious play, player i assumes (incorrectly) that at stage t, other players will select their strategies independently and randomly according to their empirical frequency vectors. Let Π_−i(t) denote the induced probability distribution over A_−i at stage t. Under fictitious play, player i selects an action according to

a_i(t) ∈ arg max_{a_i ∈ A_i} Σ_{a_−i ∈ A_−i} u_i(a_i, a_−i) Pr[a_−i; Π_−i(t)].

In words, player i selects the action that maximizes expected utility assuming that other players select their strategies randomly and independently according to their empirical frequencies.

Proposition 2 For (i) zero-sum games, (ii) potential games, and (iii) two-player games
in which one player has only two actions, player empirical frequencies under fictitious play converge to a mixed strategy Nash equilibrium.

These results are reported in Fudenberg and Levine (1998), Hofbauer and Sandholm (2002), and Berger (2005). Fictitious play need not converge to Nash equilibria in all games. An early counterexample is reported in Shapley (1964), which constructs a two-player game with a unique mixed strategy Nash equilibrium. A weakly acyclic game with multiple pure (i.e., non-mixed) Nash equilibria under which fictitious play does not converge is reported in Foster and Young (1998).

A variant of fictitious play is "joint strategy" fictitious play (Marden et al. 2009b). In this framework, players construct as forecasts empirical frequencies of the joint play of other players. This formulation is in contrast to constructing and combining empirical frequencies for each player. In the commuting game, it turns out that joint strategy fictitious play is equivalent to the aforementioned "exploitation" rule of selecting the path with the lowest average travel time. Marden et al. (2009b) show that action profiles under joint strategy fictitious play (with inertia) converge to a Nash equilibrium in potential games.

Log-Linear Learning
Under forecasted best-response dynamics, players choose a best response to the forecasted strategies of other players. Log-linear learning, introduced in Blume (1993), allows the possibility of "exploration," in which players can select nonoptimal strategies but with relatively low probabilities.

Log-linear learning proceeds as follows. First, introduce a "temperature" parameter, T > 0.
– At stage t, a single player, say player i, is selected at random.
– For player i, each strategy a_i ∈ A_i is selected with probability

Pr[a_i(t) = a_i] = (1/Z) e^{u_i(a_i, a_−i(t−1))/T},

while the remaining players repeat their previous strategies, i.e., a_−i(t) = a_−i(t − 1).

In words, under log-linear learning, only a single player performs a strategy update at each stage. The probability of selecting a strategy is exponentially proportional to the utility garnered from that strategy (with other players repeating their previous strategies). In the above description, the dummy parameter Z is a normalizing variable used to define a probability distribution. In fact, the specific probability distribution for strategy selection is a Gibbs distribution with temperature parameter, T. For very large T, strategies are chosen approximately uniformly at random. However, for small T, the selected strategy is a best response (i.e., a_i(t) ∈ B_i(a_−i(t − 1))) with high probability, and an alternative strategy is selected with low probability.

Because of the inherent randomness, strategy profiles under log-linear learning never converge. Nonetheless, the long-run behavior can be characterized probabilistically as follows.

Proposition 3 For potential games with potential function Φ(·) under log-linear learning, for any a ∈ A,

lim_{t→∞} Pr[a(t) = a] = (1/Z) e^{Φ(a)/T}.

In words, the long-run probabilities of strategy profiles conform to a Gibbs distribution constructed from the underlying potential function. This characterization has the important implication of (probabilistic) equilibrium selection. Prior convergence results stated convergence to Nash equilibria but did not specify which Nash equilibrium in the case of multiple equilibria. Under log-linear learning, there is a probabilistic preference for the Nash equilibrium that maximizes the underlying potential function.
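The log-linear update can be simulated directly. The sketch below uses an assumed setup not taken from this entry: a hypothetical two-player identical-interest game on strategies {0, 1}, whose utilities equal an invented potential function Φ favoring the profile (1, 1), with an illustrative temperature. One randomly chosen player updates per stage via Gibbs probabilities, and the long-run frequencies are compared against the Gibbs distribution of Proposition 3.

```python
import math
import random
from collections import Counter

random.seed(0)

A = [0, 1]

def phi(a):
    # Invented potential: coordination is rewarded, (1, 1) most of all.
    return {(0, 0): 1.0, (1, 1): 2.0}.get(a, 0.0)

def u(i, a):
    # Identical-interest game: every player's utility equals the potential.
    return phi(a)

T = 0.5            # temperature parameter
a = (0, 0)         # initial strategy profile
counts = Counter()
for _ in range(200_000):
    i = random.randrange(2)          # a single player updates each stage
    weights = []
    for s in A:                      # Gibbs weights exp(u_i/T), others fixed
        prof = list(a)
        prof[i] = s
        weights.append(math.exp(u(i, tuple(prof)) / T))
    prof = list(a)
    prof[i] = random.choices(A, weights=weights)[0]
    a = tuple(prof)
    counts[a] += 1

# Proposition 3: long-run frequencies approach exp(Phi(a)/T)/Z.
profiles = [(x, y) for x in A for y in A]
Z = sum(math.exp(phi(p) / T) for p in profiles)
pred = math.exp(phi((1, 1)) / T) / Z
emp = counts[(1, 1)] / sum(counts.values())
print(f"predicted {pred:.2f}, empirical {emp:.2f}")
```

For this game, the potential-maximizing equilibrium (1, 1) dominates the long-run distribution, illustrating the equilibrium-selection property discussed above.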
Extensions and Variations

Payoff-based learning. Payoff-based learning rules, in which players only measure the utility garnered in each stage, impose less restrictive informational requirements. See Young (2005) for a general discussion, as well as Marden et al. (2009c), Marden and Shamma (2012), and Arslan and Shamma (2004) for various payoff-based extensions.

No-regret learning. The broad class of so-called "no-regret" learning rules has the desirable property of converging to broader solution concepts (namely, Hannan consistency sets and correlated equilibria) in general games. See Hart and Mas-Colell (2000, 2001, 2003b) for an extensive discussion.

Calibrated forecasts. Calibrated forecasts are more sophisticated than empirical frequencies in that they satisfy certain long-run consistency properties. Accordingly, forecasted best-response learning using calibrated forecasts has stronger guaranteed convergence properties, such as convergence to correlated equilibria. See Foster and Vohra (1997), Kakade and Foster (2008), and Mannor et al. (2007).

Summary and Future Directions

We have presented a selection of learning dynamics and their long-run characteristics, specifically in terms of convergence to Nash equilibria. As stated early on, the original motivation of learning in games research has been to add credence to solution concepts such as Nash equilibrium as a model of the outcome of a game. An emerging line of research stems from engineering considerations, in which the objective is to use the framework of learning in games as a design tool for distributed decision architecture settings such as autonomous vehicle teams, communication networks, or smart grid energy systems. A related emerging direction is social influence, in which the objective is to steer the collective behaviors of human decision makers towards a socially desirable situation through the disbursement of incentives. Accordingly, learning in games can offer baseline models of how individuals update their behaviors to guide and inform social influence policies.
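As a small worked complement to the convergence results surveyed in this entry, fictitious play can be sketched on matching pennies, a zero-sum game whose unique (mixed) equilibrium plays each action with probability 1/2; by the zero-sum case of the convergence result above, the empirical frequencies should approach (1/2, 1/2). The initial histories and tie-breaking below are illustrative choices, not from the entry.

```python
from collections import Counter

A = ["H", "T"]

def u1(a1, a2):
    # Player 1 wants to match the opponent; player 2 wants to mismatch.
    return 1.0 if a1 == a2 else -1.0

def u2(a1, a2):
    return -u1(a1, a2)          # zero-sum

# Arbitrary (illustrative) initial fictitious histories.
counts = [Counter({"H": 1}), Counter({"T": 1})]
for _ in range(20_000):
    n1 = sum(counts[0].values())
    n2 = sum(counts[1].values())
    # Each player best-responds to the opponent's empirical frequencies.
    a1 = max(A, key=lambda s: sum(u1(s, o) * counts[1][o] / n2 for o in A))
    a2 = max(A, key=lambda s: sum(u2(o, s) * counts[0][o] / n1 for o in A))
    counts[0][a1] += 1
    counts[1][a2] += 1

# Play cycles, but empirical frequencies approach the mixed equilibrium.
freq1 = counts[0]["H"] / sum(counts[0].values())
freq2 = counts[1]["H"] / sum(counts[1].values())
print(f"{freq1:.2f} {freq2:.2f}")
```

Note that the realized action profile never settles down; it is only the empirical frequencies that converge, which is exactly the sense of convergence in the zero-sum result above.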
Recommended Reading

– Hart S (2005) Adaptive heuristics. Econometrica 73(5):1401–1430
– Young HP (2007) The possible and the impossible in multi-agent learning. Artif Intell 171:429–433

Articles relating learning in games to distributed control:
– Li N, Marden JR (2013) Designing games for distributed optimization. IEEE J Sel Top Signal Process 7:230–242
– Mannor S, Shamma JS (2007) Multi-agent learning for engineers. Artif Intell 171:417–422
– Marden JR, Arslan G, Shamma JS (2009) Cooperative control and potential games. IEEE Trans Syst Man Cybern B Cybern 39:1393–1407

Bibliography

Arieli I, Babichenko Y (2012) Average-testing and Pareto efficiency. J Econ Theory 147:2376–2398
Arrow KJ (1986) Rationality of self and others in an economic system. J Bus 59(4):S385–S399
Arslan G, Shamma JS (2004) Distributed convergence to Nash equilibria with local utility measurements. In: Proceedings of the 43rd IEEE conference on decision and control, Atlantis, Paradise Island, Bahamas
Berger U (2005) Fictitious play in 2×n games. J Econ Theory 120(2):139–154
Blume L (1993) The statistical mechanics of strategic interaction. Games Econ Behav 5:387–424
Brown GW (1951) Iterative solutions of games by fictitious play. In: Koopmans TC (ed) Activity analysis of production and allocation. Wiley, New York, pp 374–376
Foster DP, Vohra R (1997) Calibrated learning and correlated equilibrium. Games Econ Behav 21:40–55
Foster DP, Young HP (1998) On the nonconvergence of fictitious play in coordination games. Games Econ Behav 25:79–96
Fudenberg D, Levine DK (1998) The theory of learning in games. MIT, Cambridge
Hart S (2005) Adaptive heuristics. Econometrica 73(5):1401–1430
Hart S, Mansour Y (2007) The communication complexity of uncoupled Nash equilibrium procedures. In: STOC'07 proceedings of the 39th annual ACM symposium on the theory of computing, San Diego, CA, USA, pp 345–353
Hart S, Mansour Y (2010) How long to equilibrium? The communication complexity of uncoupled equilibrium procedures. Games Econ Behav 69(1):107–126
Hart S, Mas-Colell A (2000) A simple adaptive procedure leading to correlated equilibrium. Econometrica 68(5):1127–1150
Hart S, Mas-Colell A (2001) A general class of adaptive strategies. J Econ Theory 98:26–54
Hart S, Mas-Colell A (2003a) Uncoupled dynamics do not lead to Nash equilibrium. Am Econ Rev 93(5):1830–1836
Hart S, Mas-Colell A (2003b) Regret based continuous-time dynamics. Games Econ Behav 45:375–394
Hart S, Mas-Colell A (2006) Stochastic uncoupled dynamics and Nash equilibrium. Games Econ Behav 57(2):286–303
Hofbauer J, Sandholm W (2002) On the global convergence of stochastic fictitious play. Econometrica 70:2265–2294
Kakade SM, Foster DP (2008) Deterministic calibration and Nash equilibrium. J Comput Syst Sci 74:115–130
Mannor S, Shamma JS, Arslan G (2007) Online calibrated forecasts: memory efficiency vs universality for learning in games. Mach Learn 67(1):77–115
Marden JR, Shamma JS (2012) Revisiting log-linear learning: asynchrony, completeness and a payoff-based implementation. Games Econ Behav 75(2):788–808
Marden JR, Arslan G, Shamma JS (2009a) Cooperative control and potential games. IEEE Trans Syst Man Cybern Part B Cybern 39:1393–1407
Marden JR, Arslan G, Shamma JS (2009b) Joint strategy fictitious play with inertia for potential games. IEEE Trans Autom Control 54:208–220
Marden JR, Young HP, Arslan G, Shamma JS (2009c) Payoff based dynamics for multi-player weakly acyclic games. SIAM J Control Optim 48:373–396
Marden JR, Peyton Young H, Pao LY (2011) Achieving Pareto optimality through distributed learning. Oxford economics discussion paper #557
Monderer D, Shapley LS (1996) Potential games. Games Econ Behav 14:124–143
Pradelski BR, Young HP (2012) Learning efficient Nash equilibria in distributed systems. Games Econ Behav 75:882–897
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT, Cambridge
Shamma JS, Arslan G (2005) Dynamic fictitious play, dynamic gradient play, and distributed convergence to Nash equilibria. IEEE Trans Autom Control 50(3):312–327
Shapley LS (1964) Some topics in two-person games. In: Dresher M, Shapley LS, Tucker AW (eds) Advances in game theory. University Press, Princeton, pp 1–29
Skyrms B (1992) Chaos and the explanatory significance of equilibrium: strange attractors in evolutionary game dynamics. In: PSA: proceedings of the biennial meeting of the Philosophy of Science Association. Volume Two: symposia and invited papers, East Lansing, MI, USA, pp 274–394
Young HP (1998) Individual strategy and social structure. Princeton University Press, Princeton
Young HP (2005) Strategic learning and its limits. Oxford University Press, Oxford
Learning Theory
m labelled samples, which is then passed on to the oracle for labeling; this is known as "active learning." More common is "passive learning," in which the sequence of training samples {x_i}, i ≥ 1, is generated at random, in an independent and identically distributed (i.i.d.) fashion, according to some probability distribution P. In this case, even the hypothesis H_m and the generalization error are random, because they depend on the randomly generated training samples. This is the rationale behind the nomenclature "probably approximately correct." The hypothesis H_m is not expected to equal the unknown target concept T exactly, only approximately. Even that is only probably true, because in principle it is possible that the randomly generated training samples could be totally unrepresentative and thus lead to a poor hypothesis. If we toss a coin many times, there is a small but always positive probability that it could turn up heads every time. As the coin is tossed more and more times, this probability becomes smaller, but will never equal zero.

Examples

Example 1 Consider the situation where the concept class consists of all half-planes in R², as indicated in the left side of Fig. 1. Here the unknown target concept T is some fixed but unknown half-plane. The symbol T is next to the boundary of the half-plane, and all points to the right of the line constitute the target half-plane. The training samples, generated at random according to some unknown probability distribution P, are also shown in the figure. The samples that belong to T are shown as blue rectangles, while those that do not belong to T are shown as red dots. Knowing only these labelled samples, the learner is expected to guess what T might be.

A reasonable approach is to choose some half-plane that agrees with the data and correctly classifies the labelled data. For instance, the well-known support vector machine (SVM) algorithm chooses the unique half-plane such that the closest sample to the dividing line is as far as possible from it; see the paper by Cortes and Vapnik (1997).

The symbol H denotes the boundary of a hypothesis, which is another half-plane. The shaded region is the symmetric difference between the two half-planes. The set T △ H is the set of points that are misclassified by the hypothesis H. Of course, we do not know what this set is, because we do not know T. It can be shown that, whenever the hypothesis H is chosen to be consistent in the sense of correctly classifying all labelled samples, the generalization error goes to zero as the number of samples approaches infinity.

(Fig. 1: Left, a target half-plane T and a hypothesis half-plane H, with their symmetric difference shaded; right, a target convex polygon T and a convex-hull hypothesis H in the domain X = [0, 1]².)

Example 2 Now suppose the concept class consists of all convex polygons in the unit square, and let T denote the (unknown) target convex polygon. This situation is depicted in the right side of Fig. 1. This time let us assume that the probability distribution that generates the samples is the uniform distribution on X. Given a set of positively and negatively labelled samples (the same convention as in Example 1), let us choose the hypothesis H to be the convex hull of all positively labelled samples, as shown in the figure. Since every positively labelled sample
belongs to T, and T is a convex set, it follows that H is a subset of T. Moreover, P(T \ H) is the generalization error. It can be shown that this algorithm also "works" in the sense that the generalization error goes to zero as the number of samples approaches infinity.

Vapnik-Chervonenkis Dimension

Given any concept class C, there is a single integer that offers a measure of the richness of the class, known as the Vapnik-Chervonenkis (or VC) dimension, after its originators.

Definition 1 A set S ⊆ X is said to be shattered by a concept class C if, for every subset B ⊆ S, there is a set A ∈ C such that S ∩ A = B. The VC dimension of C is the largest integer d such that there is a finite set of cardinality d that is shattered by C.

Example 3 It can be shown that the set of half-planes in R² has VC dimension three. Choose a set S = {x, y, z} consisting of three points that are not collinear, as in Fig. 2. Then there are 2³ = 8 subsets of S. The point is to show that for each of these eight subsets, there is a half-plane that contains precisely that subset, nothing more and nothing less. That this is possible is shown in Fig. 2. Four out of the eight situations are depicted in this figure, and the remaining four situations can be covered by taking the complement of the half-plane shown. It is also necessary to show that no set with four or more elements can be shattered, but that step is omitted; instead the reader is referred to any standard text such as Vidyasagar (1997). More generally, it can be shown that the set of half-planes in R^k has VC dimension k + 1.

(Fig. 2: (a) four of the eight subsets of S = {x, y, z} picked out by half-planes, including B = {y} and B = {z}; (b) a strictly convex set.)

Example 4 The set of convex polygons has infinite VC dimension. To see this, let S be a strictly convex set, as shown in Fig. 2b. (Recall that a set is "strictly convex" if none of its boundary points is a convex combination of other points in the set.) Choose any finite collection of its boundary points, call it S_n = {x₁, ..., x_n}. If B is a subset of S_n, then the convex hull of B does not contain any other point of S_n, due to the strict convexity property. Since this argument holds for every integer n, the class of convex polygons has infinite VC dimension.

Two Important Theorems

Out of the many important results in learning theory, two are noteworthy.

Theorem 1 (Blumer et al. 1989) A concept class is distribution-free PAC learnable if and only if it has finite VC dimension.

Theorem 2 (Benedek and Itai 1991) Suppose P is a fixed probability distribution. Then the concept class C is PAC learnable if and only if, for every positive number ε, it is possible to cover C by a finite number of balls of radius ε, with respect to the pseudometric d_P.

Now let us return to the two examples studied previously. Since the set of half-planes has finite VC dimension, it is distribution-free PAC learnable.
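The convergence of the generalization error for consistent hypotheses, used in both examples, is easy to observe numerically. The following sketch is an illustration, not part of the entry: it assumes a square target T = [0.3, 0.7]² and the uniform distribution on X = [0, 1]², and it estimates P(T \ H) for the convex-hull learner of Example 2 at two sample sizes.

```python
import random

def convex_hull(pts):
    """Andrew's monotone chain; returns hull vertices in CCW order."""
    pts = sorted(set(pts))
    if len(pts) <= 2:
        return pts
    def cross(o, a, b):
        return (a[0]-o[0])*(b[1]-o[1]) - (a[1]-o[1])*(b[0]-o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def in_target(p):
    # the (normally unknown) target concept T, assumed here to be a square
    return 0.3 <= p[0] <= 0.7 and 0.3 <= p[1] <= 0.7

def generalization_error(m, n_test=4000, seed=1):
    rng = random.Random(seed)
    rand_pt = lambda: (rng.random(), rng.random())
    # hypothesis H = convex hull of the positively labelled samples
    hull = convex_hull([p for p in (rand_pt() for _ in range(m))
                        if in_target(p)])
    def in_hull(p):
        if len(hull) < 3:                  # degenerate hull has zero area
            return False
        for i in range(len(hull)):
            o, a = hull[i], hull[(i + 1) % len(hull)]
            if (a[0]-o[0])*(p[1]-o[1]) - (a[1]-o[1])*(p[0]-o[0]) < 0:
                return False
        return True
    # H is a subset of T, so the error P(T \ H) is the probability of
    # disagreement between target and hypothesis on a fresh sample.
    bad = 0
    for _ in range(n_test):
        p = rand_pt()
        if in_target(p) != in_hull(p):
            bad += 1
    return bad / n_test

print(generalization_error(20))     # coarse hypothesis, larger error
print(generalization_error(2000))   # error shrinks as m grows
```

The estimated error for m = 2000 comes out far below the estimate for m = 20, matching the claim that the error of the consistent convex-hull learner goes to zero as the number of samples grows.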
The set of convex polygons can be shown to satisfy the conditions of Theorem 2 if P is the uniform distribution and is therefore PAC learnable. However, since it has infinite VC dimension, it follows from Theorem 1 that it is not distribution-free PAC learnable.

Summary and Future Directions

This brief entry presents only the most basic aspects of PAC learning theory. Many more results are known about PAC learning theory, and of course many interesting problems remain unsolved. Some of the known extensions are:
• Learning under an "intermediate" family of probability distributions 𝒫 that is not necessarily equal to 𝒫*, the set of all distributions (Kulkarni and Vidyasagar 1997)
• Relaxing the requirement that the algorithm should work uniformly well for all target concepts and requiring instead only that it should work with high probability (Campi and Vidyasagar 2001)
• Relaxing the requirement that the training samples are independent of each other and permitting them to have Markovian dependence (Gamarnik 2003; Meir 2000) or β-mixing dependence (Vidyasagar 2003)

There is considerable research in finding alternate sets of necessary and sufficient conditions for learnability. Unfortunately, many of these conditions are unverifiable and amount to tautological restatements of the problem under study.

Cross-References

Bibliography

Anthony M, Biggs N (1992) Computational learning theory. Cambridge University Press, Cambridge
Benedek G, Itai A (1991) Learnability by fixed distributions. Theor Comput Sci 86:377–389
Blumer A, Ehrenfeucht A, Haussler D, Warmuth M (1989) Learnability and the Vapnik-Chervonenkis dimension. J ACM 36(4):929–965
Campi M, Vidyasagar M (2001) Learning with prior information. IEEE Trans Autom Control 46(11):1682–1695
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
Gamarnik D (2003) Extension of the PAC framework to finite and countable Markov chains. IEEE Trans Inf Theory 49(1):338–345
Kearns M, Vazirani U (1994) Introduction to computational learning theory. MIT, Cambridge
Kulkarni SR, Vidyasagar M (1997) Learning decision rules under a family of probability measures. IEEE Trans Inf Theory 43(1):154–166
Meir R (2000) Nonparametric time series prediction through adaptive model selection. Mach Learn 39(1):5–34
Natarajan BK (1991) Machine learning: a theoretical approach. Morgan-Kaufmann, San Mateo
van der Vaart AW, Wellner JA (1996) Weak convergence and empirical processes. Springer, New York
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Vapnik VN (1998) Statistical learning theory. Wiley, New York
Vidyasagar M (1997) A theory of learning and generalization. Springer, London
Vidyasagar M (2003) Learning and generalization: with applications to neural networks. Springer, London

Lie Algebraic Methods in Nonlinear Control

Matthias Kawski
School of Mathematical and Statistical Sciences, Arizona State University, Tempe, AZ, USA
rank conditions determine controllability, observability, and optimality. Lie algebraic methods are also employed for state-space realization, control design, and path planning.

Keywords

Baker-Campbell-Hausdorff formula; Chen-Fliess series; Lie bracket

Definition

This article considers generally nonlinear control systems (affine in the control) of the form

  ẋ = f₀(x) + u₁ f₁(x) + … + u_m f_m(x),   y = φ(x)   (1)

where the state x takes values in Rⁿ, or more generally in an n-dimensional manifold Mⁿ, the f_i are smooth vector fields, φ: Rⁿ → R^p is a smooth output function, and the controls u = (u₁, …, u_m): [0, T] → U are piecewise continuous, or, more generally, measurable functions taking values in a closed convex subset U ⊆ R^m that contains 0 in its interior.

Lie algebraic techniques refer to analyzing the system (1) and designing controls and stabilizing feedback laws by employing relations satisfied by iterated Lie brackets of the system vector fields f_i.

Introduction

Systems of the form (1) contain as a special case time-invariant linear systems ẋ = Ax + Bu, y = Cx (with constant matrices A ∈ R^{n×n}, B ∈ R^{n×m}, and C ∈ R^{p×n}) that are well studied and are a mainstay of classical control engineering. Properties such as controllability, stabilizability, observability, and optimal control and various others are determined by relationships satisfied by higher-order matrix products of A, B, and C. Since the early 1970s, it has been well understood that the appropriate generalization of this matrix algebra, and, e.g., invariant linear subspaces, to nonlinear systems is in terms of the Lie algebra generated by the vector fields f_i, integral submanifolds of this Lie algebra, and the algebra of iterated Lie derivatives of the output function.

The Lie bracket of two smooth vector fields f, g: M → TM is defined as the vector field [f, g]: M → TM that maps any smooth function φ ∈ C^∞(M) to the function [f, g]φ = fgφ − gfφ. In local coordinates, if

  f(x) = Σ_{i=1}^n f^i(x) ∂/∂x^i   and   g(x) = Σ_{i=1}^n g^i(x) ∂/∂x^i,

then

  [f, g](x) = Σ_{i,j=1}^n ( f^j(x) ∂g^i/∂x^j(x) − g^j(x) ∂f^i/∂x^j(x) ) ∂/∂x^i.

With some abuse of notation, one may abbreviate this to [f, g] = (Dg)f − (Df)g, where f and g are considered as column vector fields and Df and Dg denote their Jacobian matrices of partial derivatives.

Note that with this convention the Lie bracket corresponds to the negative of the commutator of matrices: if P, Q ∈ R^{n×n} define, in matrix notation, the linear vector fields f(x) = Px and g(x) = Qx, then [f, g](x) = (QP − PQ)x = −[P, Q]x.

Noncommuting Flows

Geometrically the Lie bracket of two smooth vector fields f₁ and f₂ is an infinitesimal measure of the lack of commutativity of their flows. For a smooth vector field f and an initial point x(0) = p ∈ M, denote by e^{tf} p the solution of the differential equation ẋ = f(x) at time t. Then
  [f₁, f₂]φ(p) = lim_{t→0} (1/t²) ( φ( e^{−t f₂} e^{−t f₁} e^{t f₂} e^{t f₁} p ) − φ(p) ).

As a most simple example, consider parallel parking a unicycle, moving it sideways without slipping. Introduce coordinates (x, y, θ) for the location in the plane and the steering angle. The dynamics are governed by ẋ = u₁ cos θ, ẏ = u₁ sin θ, and θ̇ = u₂, where the control u₁ is interpreted as the signed rolling speed and u₂ as the angular velocity of the steering angle. Written in the form (1), one has f₁ = (cos θ, sin θ, 0)ᵀ and f₂ = (0, 0, 1)ᵀ. (In this case the drift vector field f₀ ≡ 0 vanishes.) If the system starts at (0, 0, 0)ᵀ, then via the sequence of control actions of the form turn left, roll forward, turn back, and roll backwards, one may steer the system to a point (0, y, 0)ᵀ with y > 0. This sideways motion corresponds to the value (0, 1, 0)ᵀ of the Lie bracket [f₂, f₁] = (−sin θ, cos θ, 0)ᵀ at the origin. It encapsulates that steering and rolling do not commute. This example is easily expanded to model, e.g., the sideways motion of a car, or a truck with multiple trailers; see, e.g., Bloch (2003), Bressan and Piccoli (2007), and Bullo and Lewis (2005). In such cases longer iterated Lie brackets correspond to the required more intricate control actions needed to obtain, e.g., a pure sideways motion.

In the case of linear systems, if the Kalman rank condition rank[B, AB, A²B, …, A^{n−1}B] = n is not satisfied, then all solution curves of the system starting from the same point x(0) = p are at all times T > 0 constrained to lie in a proper affine subspace. In the nonlinear setting the role of the compound matrix of that condition is taken by the Lie algebra L = L(f₀, f₁, …, f_m) of all finite linear combinations of iterated Lie brackets of the vector fields f_i. As an immediate consequence of the Frobenius integrability theorem, if at a point x(0) = p the vector fields in L span the whole tangent space, then it is possible to reach an open neighborhood of the initial point by concatenating flows of the system (1) that correspond to piecewise constant controls. Conversely, in the case of analytic vector fields and a compact set U of admissible control values, the Hermann-Nagano theorem guarantees that if the dimension of the subspace L(p) = {f(p) : f ∈ L} is not maximal (less than n), then all such trajectories are confined to stay in a lower-dimensional proper integral submanifold of L through the point p. For a comprehensive introduction, see, e.g., the textbooks Bressan and Piccoli (2007), Isidori (1995), and Sontag (1998).

Controllability

Define the reachable set R_T(p) as the set of all terminal points x(T; u, p) at time T of trajectories of (1) that start at the initial point x(0) = p and correspond to admissible controls. Commonly known as the Lie algebra rank condition (LARC), the above condition determines whether the system is accessible from the point p, which means that for arbitrarily small time T > 0, the reachable set R_T(p) has nonempty n-dimensional interior. For most applications one desires stronger controllability properties. Most amenable to Lie algebraic methods, and practically relevant, is small-time local controllability (STLC): the system is STLC from p if p lies in the interior of R_T(p) for every T > 0. In the case that there is no drift vector field f₀, accessibility is equivalent to STLC. However, in general, the situation is much more intricate, and a rich literature is devoted to various necessary or sufficient conditions for STLC. A popular such condition is the Hermes condition. For this define the subspaces S¹ = span{(ad^j f₀, f_i) : 1 ≤ i ≤ m, j ∈ Z₊} and, recursively, S^{k+1} = span{[g₁, g_k] : g₁ ∈ S¹, g_k ∈ S^k}. Here (ad⁰ f, g) = g and, recursively, (ad^{k+1} f, g) = [f, (ad^k f, g)]. The Hermes condition guarantees, in the case of analytic vector fields and, e.g., U = [−1, 1]^m, that if the system satisfies the LARC and for every k ≥ 1, S^{2k}(p) ⊆ S^{2k−1}(p), then the system is STLC. For more general conditions, see Sussmann (1987) and also Kawski (1990) for a broader discussion.

The importance and value of Lie algebraic conditions may in large part be ascribed to their geometric character, their being invariant under
coordinate changes and feedback. In particular, in the analytic case, the Lie relations completely determine the local properties of the system, in the sense that a Lie algebra homomorphism between two systems gives rise to a local diffeomorphism that maps trajectories to trajectories (Sussmann 1974).

Exponential Lie Series

A central analytic tool in Lie algebraic methods that takes the role of Taylor expansions in classical analysis of dynamical systems is the Chen-Fliess series, which associates to every admissible control u: [0, T] → U a formal power series

  CF(u, T) = Σ_I ( ∫₀ᵀ du_I ) X_{i₁} ⋯ X_{i_s}   (2)

where the sum ranges over multi-indices I = (i₁, …, i_s) and ∫₀ᵀ du_I denotes the corresponding iterated integral.

Since truncations of this series in general never correspond to control systems of the same form, much work has been done in recent years to rewrite this series in more useful formats. For example, the infinite directed exponential product expansion in Sussmann (1986) that uses Hall trees immediately may be interpreted in terms of free nilpotent systems and consequently helps in the construction of nilpotent approximating systems. More recent work, much of it of a combinatorial algebra nature and utilizing the underlying Hopf algebras, further simplifies similar expansions and in particular yields explicit formulas for a continuous Baker-Campbell-Hausdorff formula or for the logarithm of the Chen-Fliess series (Gehrig and Kawski 2008).
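The noncommuting-flows computation behind the parallel-parking example can be checked numerically. The sketch below is an illustration, not part of the entry: it composes the exact flows of f₁ = (cos θ, sin θ, 0)ᵀ and f₂ = (0, 0, 1)ᵀ for the maneuver turn, roll, turn back, roll back, and recovers the predicted sideways displacement of order t² in the direction (0, 1, 0)ᵀ.

```python
import math

def roll(p, t):
    """Exact flow of f1 = (cos th, sin th, 0): roll distance t at heading th."""
    x, y, th = p
    return (x + t * math.cos(th), y + t * math.sin(th), th)

def turn(p, t):
    """Exact flow of f2 = (0, 0, 1): change the steering angle by t."""
    x, y, th = p
    return (x, y, th + t)

def parallel_park(t):
    """Turn left, roll forward, turn back, roll backwards."""
    p = (0.0, 0.0, 0.0)
    p = turn(p, t)
    p = roll(p, t)
    p = turn(p, -t)
    p = roll(p, -t)
    return p

t = 1e-3
x, y, th = parallel_park(t)
# The leading-order displacement is t^2 times the sideways direction (0, 1, 0);
# the x-displacement and the heading change are of higher order.
print(y / t**2)    # close to 1
print(abs(x), th)
```

Shrinking t confirms the quadratic order: halving t divides the sideways displacement by four, which is exactly the infinitesimal noncommutativity that the Lie bracket measures.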
Linear Matrix Inequality Techniques in Optimal Control

by minimizing a scalar functional, but a search is required to find the right functional. Globally convergent algorithms are available to do just that for quadratic functionals, but more direct methods are now available.

Since the early 1990s, the focus for linear system design has been to pose control problems as feasibility problems, to satisfy multiple constraints. Since then, feasibility approaches have dominated design decisions, and such feasibility problems may be convex or not. If the problem can be reduced to a set of linear matrix inequalities (LMIs) to solve, then convexity is proven. However, failure to find such LMI formulations of the problem does not mean it is not convex, and computer-assisted methods for convex problems are available to avoid the search for LMIs (see Camino et al. 2003).

In the case of linear dynamic models of stochastic processes, optimization methods led to the popularization of linear quadratic Gaussian (LQG) optimal control, which had globally optimal solutions (see Skelton 1988). The first two moments of the stochastic process (the mean and the covariance) can be controlled with these methods, even if the distribution of the random variables involved is not Gaussian. Hence, LQG became just an acronym for the solution of quadratic functionals of control and state variables, even when the stochastic processes were not Gaussian. The label LQG was often used even for deterministic problems, where a time integral, rather than an expectation operator, was minimized, with given initial conditions or impulse excitations. These were formally called LQR (linear quadratic regulator) problems. Later the book (Skelton 1988) gave the formal conditions under which the LQG and the LQR answers were numerically identical, and this particular version of LQR was called the deterministic LQG.

It was always recognized that the quadratic form of the state and control in the LQG problem was an artificial goal. The real control goals usually involved prespecified performance bounds on each of the outputs and bounds on each channel of control. This leads to matrix inequalities (MIs) rather than scalar minimizations. While it was known early that any stabilizing linear controller could be obtained by some choice of weights in an LQG optimization problem (see Chap. 6 and references in Skelton 1988), it was not known until the 1980s what particular choice of weights in LQG would yield a solution to the matrix inequality (MI) problem. See early attempts in Skelton (1988), and see Zhu and Skelton (1992) and Zhu et al. (1997) for a globally convergent algorithm to find such LQG weights when the MI problem has a solution. Since then, rather than stating a minimization problem for a meaningless sum of outputs and inputs, linear control problems can now be stated simply in terms of norm bounds on each input vector and/or each output vector of the system (L₂ bounds, L∞ bounds, or variance bounds and covariance bounds). These feasibility problems are convex for state feedback or full-order output feedback controllers (the focus of this elementary introduction), and these can be solved using linear matrix inequalities (LMIs), as illustrated in this article. However, the earliest approach to these MI problems was iterative LQG solutions (to find the correct weights to use in the quadratic penalty of the state), as in Skelton (1988), Zhu and Skelton (1992), and Zhu et al. (1997).

Matrix Inequalities

Let Q be any square matrix. The linear matrix inequality (LMI) "Q > 0" is just a short-hand notation to represent a certain scalar inequality. That is, the matrix notation "Q > 0" means "the scalar xᵀQx is positive for all values of x, except x = 0." Obviously this is a property of Q, not x, hence the abbreviated matrix notation Q > 0. This is called a linear matrix inequality (LMI), since the matrix unknown Q appears linearly in the inequality Q > 0. Note also that any square matrix Q can be written as the sum of a symmetric matrix Q_s = ½(Q + Qᵀ) and a skew-symmetric matrix Q_k = ½(Q − Qᵀ), but xᵀQ_k x = 0, so only the symmetric part of the matrix Q affects the scalar xᵀQx. We assume hereafter without loss of generality that Q is symmetric.
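As a small numerical illustration of this notation (a sketch, not from the entry): for a symmetric matrix, "Q > 0" amounts to all eigenvalues being positive, and a matrix Q satisfying both Q > 0 and QA + AᵀQ < 0 is a standard Lyapunov certificate that A is stable.

```python
import numpy as np

def is_pd(M):
    """Check "M > 0", i.e. x^T M x > 0 for all x != 0."""
    Ms = (M + M.T) / 2                 # only the symmetric part matters
    return bool(np.all(np.linalg.eigvalsh(Ms) > 0))

A = np.array([[-1.0, 2.0],
              [0.0, -3.0]])            # eigenvalues -1 and -3: stable

Q = np.eye(2)                          # candidate Lyapunov certificate
lyap = Q @ A + A.T @ Q

print(is_pd(Q))       # True:  Q > 0
print(is_pd(-lyap))   # True:  QA + A^T Q < 0
print(bool(np.all(np.linalg.eigvals(A).real < 0)))  # True: A is stable
```

The example matrix A and the choice Q = I are hypothetical, chosen only so that both inequalities hold; for a general stable A one would solve a Lyapunov equation for Q instead of guessing it.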
The notation "Q ≥ 0" means "the scalar xᵀQx cannot be negative for any x."

Lyapunov proved that x(t) converges to zero if there exists a matrix Q such that, along the nonzero trajectories of a dynamic system (e.g., the system ẋ = Ax), two scalars have the property x(t)ᵀQx(t) > 0 and (d/dt)(x(t)ᵀQx(t)) < 0. This proves that the following statements are all equivalent:
1. For any initial condition x(0) of the system ẋ = Ax, the state x(t) converges to zero.
2. All eigenvalues of A lie in the open left half plane.
3. There exists a matrix Q with the two properties Q > 0 and QA + AᵀQ < 0.
4. The set of all quadratic Lyapunov functions that can be used to prove the stability or instability of the null solution of ẋ = Ax is given by xᵀQ⁻¹x, where Q is any square matrix with the two properties of item 3 above.

LMIs are prevalent throughout the fundamental concepts of control theory, such as controllability and observability. For the linear system example ẋ = Ax + Bu, y = Cx, the "Observability Gramian" is the infinite integral Q = ∫₀^∞ e^{Aᵀt} CᵀC e^{At} dt. Furthermore Q > 0 if and only if (A, C) is an observable pair, and Q is bounded only if the observable modes are asymptotically stable. When it exists, the solution of QA + AᵀQ + CᵀC = 0 satisfies Q > 0 if and only if the matrix pair (A, C) is observable. Likewise the "Controllability Gramian" X = ∫₀^∞ e^{At} BBᵀ e^{Aᵀt} dt > 0 if and only if the pair (A, B) is controllable. If X exists, it satisfies XAᵀ + AX + BBᵀ = 0, and X > 0 if and only if (A, B) is a controllable pair. Note also that the matrix pair (A, B) is controllable for any A if BBᵀ > 0, and the matrix pair (A, C) is observable for any A if CᵀC > 0. Hence, the existence of Q > 0 or X > 0 satisfying either (QA + AᵀQ < 0) or (AX + XAᵀ < 0) is equivalent to the statement that A is asymptotically stable.

When a controller G must also be found, the difficulty with the resulting MI is the appearance of the product of the two unknowns Q and G, so more work is required to show how to use LMIs to solve this problem. In the sequel some techniques are borrowed from linear algebra, where a linear matrix equality (LME) ΓGΛ = Θ may or may not have a solution G. For LMEs there are two separate questions to answer. The first question is "Does there exist a solution?" and the answer is "if and only if ΓΓ⁺ΘΛ⁺Λ = Θ." The second question is "What is the set of all solutions?" and the answer is "G = Γ⁺ΘΛ⁺ + Z − Γ⁺ΓZΛΛ⁺, where Z is arbitrary and the ⁺ symbol denotes the (Moore-Penrose) pseudoinverse." LMI approaches employ the same two questions by formulating the necessary and sufficient conditions for the existence of an LMI solution and then parametrizing all solutions.

Perhaps the earliest book on LMI control methods was Boyd et al. (1994), but the results and notations used herein are taken from Skelton et al. (1998). Other important LMI papers and books can give the reader a broader background, including Iwasaki and Skelton (1994), Gahinet and Apkarian (1994), de Oliveira et al. (2002), Li et al. (2008), de Oliveira and Skelton (2001), Camino et al. (2001, 2003), Boyd and Vandenberghe (2004), Iwasaki et al. (2000), Khargonekar and Rotea (1991), Vandenberghe and Boyd (1996), Scherer (1995), Scherer et al. (1997), Balakrishnan et al. (1994), Gahinet et al. (1995), and Dullerud and Paganini (2000).

Control Design Using LMIs

Consider the feedback control system

  [ẋ_p; y; z] = [A_p D_p B_p; C_p D_y B_y; M_p D_z 0] [x_p; w; u],
and w is the external disturbance (in some cases below we treat w as a zero-mean white noise). We seek to choose the control matrix G to satisfy given upper bounds on the output covariance, E[yyᵀ] ≤ Ȳ, where E represents the steady-state expectation operator in the stochastic case (i.e., when w is white noise), and in the deterministic case E represents the infinite integral of the matrix [yyᵀ]. The math is the same in each case, with appropriate interpretations of certain matrices. For a rigorous equivalence of the deterministic and stochastic interpretations, see Skelton (1988). By defining the matrices

  [A_cl B_cl; C_cl D_cl] = [A D; C F] + [B; H] G [M E],   (2)

  A = [A_p 0; 0 0],  B = [B_p 0; 0 I],  M = [M_p 0; 0 I],  D = [D_p; 0],  E = [D_z; 0],   (3)

  C = [C_p 0],  H = [B_y 0],  F = D_y,   (4)

one can write the closed-loop system dynamics in the form (2), in which the controller parameter G enters affinely.

For any given matrices Γ, Λ, Θ, Chap. 9 of the book (Skelton et al. 1998) provides all G which solve

  ΓGΛ + (ΓGΛ)ᵀ + Θ < 0   (6)

and proves that there exists such a matrix G if and only if the following two conditions hold:

  U_Γᵀ Θ U_Γ < 0  or  ΓΓᵀ > 0,   (7)

  V_Λᵀ Θ V_Λ < 0  or  ΛᵀΛ > 0,   (8)

where the columns of U_Γ and V_Λ form bases of the null spaces of Γᵀ and Λ, respectively. If G exists, then one set of such G is given by (9), where ρ > 0 is an arbitrary scalar such that

  Φ = (ρ ΓΓᵀ − Θ)⁻¹ > 0.   (10)

All G which solve the problem are given by Theorem 2.3.12 in Skelton et al. (1998). As elaborated in Chap. 9 of Skelton et al. (1998), 17 different control problems (using either state feedback or full-order dynamic controllers) all reduce to this same mathematical problem. That is, by defining the appropriate Θ, Λ, Γ, a very large number of different control problems can be treated, including the characterization of all stabilizing controllers.

For state feedback, for example, stabilizing controllers exist if and only if there is an X > 0 satisfying B⊥(AX + XAᵀ)(B⊥)ᵀ < 0, where B⊥ denotes the left null space of B. In this case all stabilizing controllers may be parametrized by G = −BᵀP + LQ^{1/2}, for any Q > 0 and a P > 0 satisfying PA + AᵀP − PBBᵀP + Q = 0. The matrix L is any matrix that satisfies the norm bound ‖L‖ < 1. Youla et al. (1976) provided a parametrization of the set of all stabilizing controllers, but the parametrization was infinite dimensional (as it did not impose any restriction on the order or form of the controller). So for finite calculations one had to truncate the set to a finite number before optimization or stabilization started. As noted above, on the other hand, all of these problems reduce to the single finite-dimensional matrix inequality (6).

Suppose the required performance is to keep the integral squared output (‖y‖²_{L₂}) less than a prespecified value, ‖y‖_{L₂} < γ, for any vector w₀ such that w₀ᵀw₀ ≤ 1, and x₀ = 0. This problem is labeled linear quadratic regulator (LQR). The following statements are equivalent:
(i) There exists a controller G that solves the LQR problem.
(ii) There exists a matrix Y > 0 such that ‖DᵀYD‖ < γ² and (7) and (8) hold, where the matrices are defined by

  [Γ Λᵀ Θ] = [B XMᵀ AX + XAᵀ XCᵀ D; H 0 CX −I F; 0 Eᵀ Dᵀ Fᵀ −I]   (15)

(Θ occupies the last three columns). Proof is provided by Theorem 9.1.5 in Skelton et al. (1998).

L∞ Control

The peak value of the frequency response is controlled by the above H∞ controller. A similar theorem can be written to control the peak in the time domain. Define sup_t y(t)ᵀy(t) = ‖y‖²_{L∞}, and let the statement ‖y‖_{L∞} < γ mean that the peak value of y(t)ᵀy(t) is less than γ². Suppose that D_y = 0 and B_y = 0. There exists a controller G which maintains ‖y‖_{L∞} < γ in the presence of any energy-bounded input w(t) (i.e., ∫₀^∞ wᵀw dt ≤ 1) if and only if there exists a matrix X > 0 such that CXCᵀ < γ²I and (7) and (8) hold, where the matrices [Γ Λᵀ Θ] are defined analogously to (15).

In most research literature, the sensors and actuators have already been selected. Yet the selection of sensors and actuators and their locations greatly affects the ability of the control system to do its job efficiently. Perhaps in one location a high-precision sensor is needed, and in another location high precision is not needed, and paying for high precision in that location would therefore be a waste of resources. These decisions must be influenced by the control dynamics which are yet to be designed. How does one know where to effectively spend money to improve the system? To answer this question, we must optimize the information architecture jointly with the control law.

Let us consider the problem of selecting the control law jointly with the selection of the precision (defined here as the inverse of the noise intensity) of each actuator/sensor, subject to the constraint of specified upper bounds on the covariance of output error and control signals, and specified upper bounds on the sensor/actuator cost. We assume the cost of these devices is proportional to their precision (i.e., the cost is equal to the price per unit of precision, times the precision). Traditionally, with full-order controllers and prespecified sensor/actuator instruments (with specified precisions), this is a well-known solved convex problem (which means it can be converted to an LMI problem if desired); see Chap. 6 of Skelton et al. (1998). If we enlarge the domain of freedom to include sensor/actuator precisions, it is not obvious whether the feasibility problem is convex or not. The following shows that this problem of including the sensor/actuator precisions within the control design problem is indeed convex, where P = diag(P_ii) and W⁻¹ = diag(W_ii⁻¹).

Consider the control system (1). Suppose that D_y = 0, B_y = 0, w = [w_sᵀ w_aᵀ]ᵀ is the zero-mean sensor/actuator noise, D_p = [0 D_a], and D_z = [D_s 0]. If $̄ represents the allowed upper bound on sensor/actuator costs, there exists a dynamic controller G that satisfies the constraints

  E[uuᵀ] < Ū,   E[yyᵀ] < Ȳ,   tr(PW⁻¹) < $̄   (17)
Linear Matrix Inequality Techniques in Optimal Control
in the presence of sensor/actuator noise with intensity diag(W_ii) = W (which, like G, should be considered a design variable, not fixed a priori) if and only if there exist matrices L, F, Q, X, Z, W⁻¹ such that

tr(PW⁻¹) < $̄,   (18)

[ Ȳ  C_pX  C_p ; (C_pX)ᵀ  X  I ; C_pᵀ  I  Z ] > 0,   [ Ū  L  0 ; Lᵀ  X  I ; 0  I  Z ] > 0,   (19)

[ Φ̂₁₁  Φ̂₂₁ᵀ ; Φ̂₂₁  W⁻¹ ] < 0.   (20)

One can fix any two of these prespecified upper bounds and iteratively reduce the level set value of the third "constraint" until feasibility is lost. This process minimizes the resource expressed by the third constraint, while enforcing the other two constraints.

As an example, if cost is not a concern, one can always set large limits for $̄ and discover the best assignment of sensor/actuator precisions for the specified performance requirements. The precisions produced by the algorithm are the values W_ii⁻¹ obtained from the solution (18)–(20), where the observed rankings W_ii⁻¹ > W_jj⁻¹ > W_kk⁻¹ > ⋯ indicate which sensors or actuators are most critical to the required performance goals (Ū, Ȳ, $̄). If any precision W_nn⁻¹ is essentially zero compared to the other required precisions, then that sensor or actuator is not needed.
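The feasibility tests invoked throughout this entry are semidefinite programs. As a minimal sketch (not taken from the entry; it assumes NumPy and SciPy are available), the simplest such certificate — ẋ = Ax is stable iff there exists P > 0 with AᵀP + PA < 0 — can be produced by solving the corresponding Lyapunov equation, an equality form of the matrix inequality:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Simplest instance of an LMI feasibility certificate: x' = Ax is stable
# iff there exists P > 0 with A^T P + P A < 0.  Solving the Lyapunov
# equation A^T P + P A = -Q for any Q > 0 yields such a P when A is stable.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])    # eigenvalues -1, -2 (stable test matrix)
Q = np.eye(2)

# solve_continuous_lyapunov(a, q) solves a X + X a^H = q,
# so passing a = A^T gives A^T P + P A = -Q.
P = solve_continuous_lyapunov(A.T, -Q)

is_certificate = np.all(np.linalg.eigvalsh(P) > 0)  # P > 0 proves stability
```

For the full sensor/actuator-precision problems above, the same check would be posed with the LMIs (18)–(20) in a general-purpose semidefinite-programming solver.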
Proceeding in this manner until feasibility of the problem is lost, the algorithm, stopping at the previous iteration, has selected the best distribution of sensors/actuators for solving the specific problem specified by the allowable bounds ($̄, Ū, Ȳ).

The most important contribution of the above algorithm has been to extend control theory to solve system design problems that involve more than just designing control gains. This enlarges the set of solved linear control problems, from solutions of linear controllers with sensors/actuators prespecified to solutions which specify the sensor/actuator requirements jointly with the control solution.

Summary

LMI techniques provide more powerful tools for designing dynamic linear systems than techniques that minimize a scalar functional for optimization, since multiple goals (bounds) can be achieved for each of the outputs and inputs. Optimal control has been a pillar of control theory for the last 50 years. In fact, all of the problems discussed in this article can perhaps be solved by minimizing a scalar functional, but a search is required to find the right functional. Globally convergent algorithms are available to do just that for quadratic functionals. But more direct methods are now available (since the early 1990s) for satisfying multiple constraints. Since then, feasibility approaches have dominated design decisions (at least for linear systems), and such feasibility problems may be convex or not. If the problem can be reduced to a set of LMIs to solve, then convexity is proven. However, failure to find such LMI formulations of the problem does not mean it is not convex, and computer-assisted methods for convex problems are available to avoid the search for LMIs (see Camino et al. 2003). Optimization can also be achieved with LMI methods by reducing the level set for one of the bounds, while maintaining all the other bounds. This level set is reduced iteratively, between convex (LMI) solutions, until feasibility is lost. A most amazing fact is that most of the common linear control design problems all reduce to exactly the same matrix inequality problem (6). The set of such equivalent problems includes LQR, the set of all stabilizing controllers, the set of all H∞ controllers, and the set of all L∞ controllers. The discrete and robust versions of these problems are also included in this equivalent set; 17 control problems have been found to be equivalent to LMI problems.

LMI techniques extend the range of solvable system design problems beyond just control design. By integrating information architecture and control design, one can simultaneously choose the control gains and the precision required of all sensors/actuators to satisfy the closed-loop performance constraints. These techniques can be used to select the information (with precision requirements) required to solve a control or estimation problem, using the best economic solution (minimal precision). For a more complete discussion of LMI problems in control, read Dullerud and Paganini (2000), de Oliveira et al. (2002), Li et al. (2008), de Oliveira and Skelton (2001), Gahinet and Apkarian (1994), Iwasaki and Skelton (1994), Camino et al. (2001, 2003), Skelton et al. (1998), Boyd and Vandenberghe (2004), Boyd et al. (1994), Iwasaki et al. (2000), Khargonekar and Rotea (1991), Vandenberghe and Boyd (1996), Scherer (1995), Scherer et al. (1997), Balakrishnan et al. (1994), and Gahinet et al. (1995).

Cross-References

H-Infinity Control
H2 Optimal Control
Linear Quadratic Optimal Control
LMI Approach to Robust Control
Stochastic Linear-Quadratic Control

Bibliography

Balakrishnan V, Huang Y, Packard A, Doyle JC (1994) Linear matrix inequalities in analysis with multipliers. In: Proceedings of the 1994 American control conference, Baltimore, vol 2, pp 1228–1232
Boyd SP, Vandenberghe L (2004) Convex optimization. Cambridge University Press, Cambridge
Boyd SP, El Ghaoui L, Feron E, Balakrishnan V (1994) Linear matrix inequalities in system and control theory. SIAM, Philadelphia
Camino JF, Helton JW, Skelton RE, Ye J (2001) Analysing matrix inequalities systematically: how to get Schur complements out of your life. In: Proceedings of the 5th SIAM conference on control & its applications, San Diego
Camino JF, Helton JW, Skelton RE, Ye J (2003) Matrix inequalities: a symbolic procedure to determine convexity automatically. Integral Equ Oper Theory 46(4):399–454
de Oliveira MC, Skelton RE (2001) Stability tests for constrained linear systems. In: Reza Moheimani SO (ed) Perspectives in robust control. Lecture notes in control and information sciences. Springer, New York, pp 241–257. ISBN:1852334525
de Oliveira MC, Geromel JC, Bernussou J (2002) Extended H2 and H∞ norm characterizations and controller parametrizations for discrete-time systems. Int J Control 75(9):666–679
Dullerud G, Paganini F (2000) A course in robust control theory: a convex approach. Texts in applied mathematics. Springer, New York
Gahinet P, Apkarian P (1994) A linear matrix inequality approach to H∞ control. Int J Robust Nonlinear Control 4(4):421–448
Gahinet P, Nemirovskii A, Laub AJ, Chilali M (1995) LMI control toolbox user's guide. The MathWorks Inc., Natick
Hamilton WR (1834) On a general method in dynamics; by which the study of the motions of all free systems of attracting or repelling points is reduced to the search and differentiation of one central relation, or characteristic function. Philos Trans R Soc (part II):247–308
Hamilton WR (1835) Second essay on a general method in dynamics. Philos Trans R Soc (part I):95–144
Iwasaki T, Skelton RE (1994) All controllers for the general H∞ control problem – LMI existence conditions and state-space formulas. Automatica 30(8):1307–1317
Iwasaki T, Meinsma G, Fu M (2000) Generalized S-procedure and finite frequency KYP lemma. Math Probl Eng 6:305–320
Khargonekar PP, Rotea MA (1991) Mixed H2/H∞ control: a convex optimization approach. IEEE Trans Autom Control 39:824–837
Li F, de Oliveira MC, Skelton RE (2008) Integrating information architecture and control or estimation design. SICE J Control Meas Syst Integr 1(2):120–128
Scherer CW (1995) Mixed H2/H∞ control. In: Isidori A (ed) Trends in control: a European perspective. Springer, Berlin, pp 173–216
Scherer CW, Gahinet P, Chilali M (1997) Multiobjective output-feedback control via LMI optimization. IEEE Trans Autom Control 42(7):896–911
Skelton RE (1988) Dynamic systems control: linear systems analysis and synthesis. Wiley, New York
Skelton RE, Iwasaki T, Grigoriadis K (1998) A unified algebraic approach to control design. Taylor & Francis, London
Vandenberghe L, Boyd SP (1996) Semidefinite programming. SIAM Rev 38:49–95
Youla DC, Bongiorno JJ, Jabr HA (1976) Modern Wiener–Hopf design of optimal controllers, part II: the multivariable case. IEEE Trans Autom Control 21:319–338
Zhu G, Skelton R (1992) A two-Riccati, feasible algorithm for guaranteeing output L∞ constraints. J Dyn Syst Meas Control 114(3):329–338
Zhu G, Rotea M, Skelton R (1997) A convergent algorithm for the output covariance constraint control problem. SIAM J Control Optim 35(1):341–361


Linear Quadratic Optimal Control

H.L. Trentelman
Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen, The Netherlands

Abstract

Linear quadratic optimal control is a collective term for a class of optimal control problems involving a linear input-state-output system and a cost functional that is a quadratic form of the state and the input. The aim is to minimize this cost functional over a given class of input functions. The optimal input depends on the initial condition, but can be implemented by means of a state feedback control law independent of the initial condition. Both the feedback gain and the optimal cost can be computed in terms of solutions of Riccati equations.

Keywords

Algebraic Riccati equation; Finite horizon; Infinite horizon; Linear systems; Optimal control; Quadratic cost functional; Riccati differential equation
J(x₀, u) = ∫₀^∞ ‖z(t)‖² dt,   (2)

where ‖·‖ denotes the Euclidean norm. Our aim to keep the values of the output as small as possible can then be expressed as requiring this integral to be as small as possible by suitable choice of input function u. In this way we arrive at the linear quadratic optimal control problem:

Problem 1  Consider the system Σ: ẋ(t) = Ax(t) + Bu(t), z(t) = Cx(t) + Du(t). Determine for every initial state x₀ an input u ∈ U (a space of functions [0, ∞) → U) such that

J(x₀, u) := ∫₀^∞ ‖z(t)‖² dt   (3)

is minimal. Here z(t) denotes the output trajectory z_u(t; x₀) of Σ corresponding to the initial state x₀ and input function u.

Since the system is linear and the integrand in the cost functional is a quadratic function of z, the problem is called linear quadratic. Of course, ‖z‖² = xᵀCᵀCx + 2uᵀDᵀCx + uᵀDᵀDu, so the integrand can also be considered as a quadratic function of (x, u). The convergence of the integral in (3) is of course a point of concern. Therefore, one often considers the corresponding finite-horizon problem in a preliminary investigation. In this problem, a final time T is given and one wants to minimize the integral

J(x₀, u, T) := ∫₀^T ‖z(t)‖² dt.   (4)

Problem 2  In the situation of Problem 1, determine for every initial state x₀ an input u ∈ U such that x_u(t; x₀) → 0 (t → ∞) and such that, under this condition, J(x₀, u) is minimized.

In the literature various special cases of these problems have been considered, and names have been associated to these special cases. In particular, Problems 1 and 2 are called regular if the matrix D is injective, equivalently, DᵀD > 0. If, in addition, CᵀD = 0 and DᵀD = I, then the problems are said to be in standard form. In the standard case, the integrand in the cost functional reduces to ‖z‖² = xᵀCᵀCx + uᵀu. We often write Q = CᵀC. The standard case is a special case, which is not essentially simpler than the general regular problem, but which gives rise to simpler formulas. The general regular problem can be reduced to the standard case by means of a suitable feedback transformation.

The Finite-Horizon Problem

The finite-horizon problem in standard form is formulated as follows:

Problem 3  Given the system ẋ(t) = Ax(t) + Bu(t), a final time T > 0, and symmetric matrices N and Q such that N ≥ 0 and Q ≥ 0, determine for every initial state x₀ a piecewise continuous input function u: [0, T] → U such that the integral

J(x₀, u, T) := ∫₀^T x(t)ᵀQx(t) + u(t)ᵀu(t) dt + x(T)ᵀNx(T)   (5)

is minimized.
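The finite-horizon solution is governed by a Riccati differential equation (the corresponding equations of the entry are not reproduced in this excerpt). As a rough numerical sketch, assuming the standard form of Problem 3 and only NumPy, the standard equation −Ṗ = AᵀP + PA − PBBᵀP + Q with P(T) = N can be integrated backward from the final time; for the scalar system ẋ = u the exact solution P(t) = tanh(T − t) is known, which makes the sketch easy to check:

```python
import numpy as np

# Backward (Euler) integration of the Riccati differential equation
#   -dP/dt = A^T P + P A - P B B^T P + Q,   P(T) = N,
# for the scalar system x' = u (A = 0, B = 1) with Q = 1, N = 0.
# The exact solution is P(t) = tanh(T - t), so P(0) ~ tanh(T).
A = np.array([[0.0]])
B = np.array([[1.0]])
Q = np.eye(1)
N = np.zeros((1, 1))

T, steps = 10.0, 50_000
dt = T / steps
P = N.copy()
for _ in range(steps):                 # step from t = T back to t = 0
    minus_Pdot = A.T @ P + P @ A - P @ B @ B.T @ P + Q
    P = P + dt * minus_Pdot
P0 = float(P[0, 0])                    # close to tanh(10), i.e. close to 1
```

The same backward sweep, stored at each time step, yields the time-varying optimal gain for the finite-horizon problem.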
Theorem 2  Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

J(x₀, u) := ∫₀^∞ x(t)ᵀQx(t) + u(t)ᵀu(t) dt,

with Q ≥ 0. Factorize Q = CᵀC. Then, the following statements are equivalent:
1. For every x₀ ∈ X, there exists u ∈ U such that J(x₀, u) < ∞.
2. The ARE (9) has a real symmetric positive semidefinite solution P.

Assume that one of the above conditions holds. Then, there exists a smallest real symmetric positive semidefinite solution of the ARE, i.e., there exists a real symmetric solution P⁻ ≥ 0 such that for every real symmetric solution P ≥ 0, we have P⁻ ≤ P. For every x₀, we have

J*(x₀) := inf{J(x₀, u) | u ∈ U} = x₀ᵀP⁻x₀.

Furthermore, for every x₀, there is exactly one optimal input function, i.e., a function u* ∈ U such that J(x₀, u*) = J*(x₀). This optimal input is generated by the time-invariant feedback law

u(t) = −BᵀP⁻x(t).

In addition to the free endpoint problem, we consider the version of the linear quadratic problem with zero endpoint. In this case the aim is to minimize for every x₀ the cost functional over all inputs u such that x_u(t; x₀) → 0 (t → ∞). For each x₀ such u exists if and only if the pair (A, B) is stabilizable. A solution to the regular standard form version of Problem 2 is stated next:

Theorem 3  Consider the system ẋ(t) = Ax(t) + Bu(t) together with the cost functional

J(x₀, u) := ∫₀^∞ x(t)ᵀQx(t) + u(t)ᵀu(t) dt,

with Q ≥ 0. Assume that (A, B) is stabilizable. Then:
1. There exists a largest real symmetric solution of the ARE, i.e., there exists a real symmetric solution P⁺ such that for every real symmetric solution P, we have P ≤ P⁺. P⁺ is positive semidefinite.
2. For every initial state x₀, we have J₀*(x₀) = x₀ᵀP⁺x₀.
3. For every initial state x₀, there exists an optimal input function, i.e., a function u* ∈ U with x(∞) = 0 such that J(x₀, u*) = J₀*(x₀), if and only if every eigenvalue of A on the imaginary axis is (Q, A) observable, i.e., rank [A − λI; Q] = n for all λ ∈ σ(A) with Re(λ) = 0.

Under this assumption we have:
4. For every initial state x₀, there is exactly one optimal input function u*. This optimal input function is generated by the time-invariant feedback law u(t) = −BᵀP⁺x(t).
5. The optimal closed-loop system ẋ(t) = (A − BBᵀP⁺)x(t) is stable. In fact, P⁺ is the unique real symmetric solution of the ARE for which σ(A − BBᵀP⁺) ⊂ C⁻.

Summary

Linear quadratic optimal control deals with finding an input function that minimizes a quadratic cost functional for a given linear system. The cost functional is the integral of a quadratic form in the input and state variable of the system. If the integral is taken over a finite time interval, the problem is called a finite-horizon problem, and the optimal cost and optimal state feedback gain can be expressed in terms of the solution of an associated Riccati differential equation. If we integrate over an infinite time interval, the problem is called an infinite-horizon problem. The optimal cost and optimal feedback gain for the free endpoint problem can be found in terms of the smallest nonnegative real symmetric solution of the associated algebraic Riccati equation.
Linear Quadratic Zero-Sum Two-Person Differential Games
For the zero endpoint problem, these are given in terms of the largest real symmetric solution of the algebraic Riccati equation. The theory is worked out in detail in Hautus and Silverman (1983) and Willems et al. (1986). For a more recent reference, including an extensive list of references, we refer to the textbook of Trentelman et al. (2001).
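Numerically, the optimal gains of Theorems 2 and 3 come from an algebraic Riccati equation. A minimal sketch, assuming SciPy is available (the solver call is not part of the entry), for the standard problem with DᵀD = I:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Infinite-horizon LQ problem in standard form:
# minimize the integral of x^T Q x + u^T u; the optimal input is
# u = -B^T P x, with P a solution of A^T P + P A - P B B^T P + Q = 0.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])       # double integrator
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.eye(1)                    # standard case: D^T D = I

P = solve_continuous_are(A, B, Q, R)    # stabilizing ARE solution
F = -B.T @ P                            # optimal state feedback gain
closed_loop_eigs = np.linalg.eigvals(A + B @ F)
stable = bool(np.all(closed_loop_eigs.real < 0))
```

For the zero endpoint problem this is the appropriate computation, since the stabilizing solution is the one that makes A − BBᵀP stable (item 5 of Theorem 3).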
Cross-References

Bibliography
However, contrary to the control case, existence of the solution of the Riccati equation is not necessary for the existence of a closed-loop saddle point. One may "survive" a particular, nongeneric, type of conjugate point. An important application of LQDGs is the so-called H∞-optimal control, appearing in the theory of robust control.

Keywords

Differential games; Finite horizon; H-infinity control; Infinite horizon

Perfect State Measurement

Linear quadratic differential games are a special case of differential games (DG). See the article Pursuit-Evasion Games and Zero-Sum Two-Person Differential Games. They were first investigated by Ho et al. (1965), in the context of a linearized pursuit-evasion game. This subsection is based upon Bernhard (1979, 1980). A linear quadratic DG is defined as

ẋ = Ax + Bu + Dv,   x(t₀) = x₀.

…only necessary for some of the following results. Detailed results without that assumption were obtained by Zhang (2005) and Delfour (2005). We chose to set the cross term in uv in the criterion null; this is to simplify the results and is not necessary. This problem satisfies Isaacs' condition (see article DG) even with nonzero such cross terms. Using the change of control variables, one obtains a DG with the same structure, with modified matrices A and Q, but without the cross terms in xu and xv. (This extends to the case with nonzero cross terms in uv.) Thus, without loss of generality, we will proceed with (S₁ S₂) = (0 0).

The existence of open-loop and closed-loop solutions to that game is ruled by two Riccati equations for symmetric matrices P and P*, respectively, and by a pair of canonical equations that we shall see later:

Ṗ + PA + AᵀP − PBR⁻¹BᵀP + γ⁻²PDDᵀP + Q = 0,   P(T) = K,   (1)

Ṗ* + P*A + AᵀP* + γ⁻²P*DDᵀP* + Q = 0,   P*(T) = K.   (2)
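Whether (1) has a solution on all of [t₀, T] can be checked by integrating it backward from P(T) = K; a finite escape time signals a conjugate point. A scalar sketch (the data A = 0, B = D = R = Q = 1, K = 0 are chosen for illustration only and are not taken from the entry):

```python
# For scalar data A = 0, B = D = R = Q = 1, K = 0, equation (1) reads,
# going backward in time with s = T - t,
#   dP/ds = 1 + (1/gamma^2 - 1) P^2,  P(0) = 0.
# For gamma > 1 this exists on any horizon; for gamma < 1 it escapes in
# finite time (a conjugate point), so (1) has no solution on [t0, T].
def escapes(gamma, horizon=2.0, dt=1e-4, cap=1e6):
    c = 1.0 / gamma**2 - 1.0
    p, s = 0.0, 0.0
    while s < horizon:
        p += dt * (1.0 + c * p * p)   # forward Euler on the backward time s
        s += dt
        if abs(p) > cap:
            return True               # finite escape: no solution on [t0, T]
    return False

ok_large_gamma = not escapes(2.0)     # solution exists on the whole horizon
bad_small_gamma = escapes(0.5)        # conjugate point inside the horizon
```

For γ = 0.5 the backward solution is tan-like and blows up near s ≈ 0.91, well inside the horizon; for γ = 2 it settles to a constant.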
• …given by (φ*, ψ*) in (3), is that Eq. (1) has a solution P(t) defined over [t₀, T].
• A necessary and sufficient condition for the existence of an open-loop saddle point is that Eq. (2) has a solution over [t₀, T] (and then so does (1)). In that case, the pairs (û(·), v̂(·)), (û(·), ψ*), and (φ*, ψ*) are saddle points.
• A necessary and sufficient condition for (φ*, v̂(·)) to be a saddle point is that Eq. (1) has a solution over [t₀, T].
• In all cases where a saddle point exists, the Value function is V(t, x) = ‖x‖²_{P(t)}.

However, Eq. (1) may fail to have a solution and a closed-loop saddle point still exists. The precise necessary condition is as follows: let X(·) and Y(·) be two square matrix function solutions of the canonical equations

[Ẋ; Ẏ] = [A, −BR⁻¹Bᵀ + γ⁻²DDᵀ; −Q, −Aᵀ] [X; Y],   [X(T); Y(T)] = [I; K].

The matrix P(t) exists for t ∈ [t₀, T] if and only if X(t) is invertible over that range, and then P(t) = Y(t)X⁻¹(t). Assume that the rank of X(t) is piecewise constant, and let X†(t) denote the pseudo-inverse of X(t) and R(X(t)) its range.

Theorem 2  A necessary and sufficient condition for a closed-loop saddle point to exist, which is then given by (3) with P(t) = Y(t)X†(t), is that
1. x₀ ∈ R(X(t₀)),
2. for almost all t ∈ [t₀, T], R(D(t)) ⊆ R(X(t)),
3. for all t ∈ [t₀, T], Y(t)X†(t) ≥ 0.

In a case where X(t) is only singular at an isolated instant t* (then conditions 1 and 2 above are automatically satisfied), called a conjugate point, but where YX⁻¹ remains positive definite on both sides of it, the conjugate point is called even. The feedback gain F = −R⁻¹BᵀP diverges upon reaching t*, but on a trajectory generated by this feedback, the control u(t) = F(t)x(t) remains finite. (See an example in Bernhard 1979.)

If T = ∞, with all system and payoff matrices constant and Q > 0, Mageirou (1976) has shown that if the algebraic Riccati equation obtained by setting Ṗ = 0 in (1) admits a positive definite solution P, the game has a Value ‖x‖²_P, but (3) may not be a saddle point. (ψ* may not be an equilibrium strategy.)

H∞-Optimal Control

This subsection is entirely based upon Başar and Bernhard (1995). It deals with imperfect state measurement, using Bernhard's nonlinear minimax certainty equivalence principle (Bernhard and Rapaport 1996).

Several problems of robust control may be brought to the following one: a linear, time-invariant system with two inputs (control input u ∈ Rᵐ and disturbance input w ∈ Rˡ) and two outputs (measured output y ∈ Rᵖ and controlled output z ∈ R^q) is given. One wishes to control the system with a nonanticipative controller u(·) = φ(y(·)) in order to minimize the induced linear operator norm, between spaces of square-integrable functions, of the resulting operator w(·) ↦ z(·).

It turns out that the problem which has a tractable solution is a kind of dual one: given a positive number γ, is it possible to make this norm no larger than γ? The answer to this question is yes if and only if

inf_{φ∈Φ} sup_{w(·)∈L²} ∫_{−∞}^{∞} (‖z(t)‖² − γ²‖w(t)‖²) dt ≤ 0.

We shall extend somewhat this classical problem by allowing either a time-variable system, with a finite horizon T, or a time-invariant system with an infinite horizon. The dynamical system is

ẋ = Ax + Bu + Dw,   (4)
y = Cx + Ew,   (5)
z = Hx + Gu,   (6)
and we assume that E is onto (N > 0) and G is one-to-one (R > 0).

Finite Horizon

In this part, we consider a time-varying system, with all matrix functions measurable. Since the state is not known exactly, we assume that the initial state is not known either. The issue is therefore to decide whether the criterion

J_γ = ‖x(T)‖²_K + ∫_{t₀}^T (‖z(t)‖² − γ²‖w(t)‖²) dt − γ²‖x₀‖²_{Σ₀}   (7)

may be kept finite and with which strategy. Let

γ* = inf{γ | inf_{φ∈Φ} sup_{x₀∈Rⁿ, w(·)∈L²} J_γ < ∞}.

Theorem 3  γ ≥ γ* if and only if the following three conditions are satisfied:
1. The following Riccati equation has a solution over [t₀, T]:

−Ṗ = PA + AᵀP − (PB + S)R⁻¹(BᵀP + Sᵀ) + γ⁻²PMP + Q,   P(T) = K.   (8)

…and the certainty equivalent controller

φ*(y(·))(t) = −R⁻¹(BᵀP + Sᵀ)x̂(t).   (12)

Infinite Horizon

The infinite-horizon case is the traditional H∞-optimal control problem reformulated in a state-space setting. We let all matrices defining the system be constant. We take the integral in (7) from −∞ to +∞, with no initial or terminal term of course. We add the hypothesis that the pairs (A, B) and (A, D) are stabilizable and the pairs (C, A) and (H, A) detectable. Then, the theorem is as follows:

Theorem 4  γ ≥ γ* if and only if the following conditions are satisfied: the algebraic Riccati equations obtained by placing Ṗ = 0 and Σ̇ = 0 in (8) and (9) have positive definite solutions, which satisfy the spectral radius condition (10). The optimal controller is given by Eqs. (11) and (12), where P and Σ are the minimal positive definite solutions of the algebraic Riccati equations, which can be obtained as the limit of the solutions of the differential equations as t → −∞ for P and t → ∞ for Σ.
on predictions on the system behavior derived from a process model (as in open-loop control) but also on information about the actual behavior (closed-loop feedback control).

Linear Continuous-Time Systems

Consider, to begin with, time-invariant systems described by the state variable description

ẋ = Ax + Bu,   y = Cx + Du,   (1)

in which x(t) ∈ Rⁿ is the state, u(t) ∈ Rᵐ is the input, y(t) ∈ Rᵖ is the output, and A ∈ Rⁿˣⁿ, B ∈ Rⁿˣᵐ, C ∈ Rᵖˣⁿ, D ∈ Rᵖˣᵐ are constant matrices. In this case, the linear state feedback (lsf) control law is selected as

u(t) = Fx(t) + r(t),   (2)

where F ∈ Rᵐˣⁿ is the constant gain matrix and r(t) ∈ Rᵐ is a new external input.

Substituting (2) into (1) yields the closed-loop state variable description, namely,

ẋ = (A + BF)x + Br,   y = (C + DF)x + Dr.   (3)

Appropriately selecting F, primarily to modify A + BF, one affects and improves the behavior of the system.

A number of comments are in order:
– Feeding back the information from the state x of the system is expected to be, and it is, an effective way to alter the system behavior. This is because knowledge of the (initial) state and the input uniquely determines the system's future behavior, and intuitively, using the state information should be a good way to control the system, i.e., to modify its behavior.
– In a state feedback control law, the input u can be any function of the state, u = f(x, r), not necessarily linear with constant gain F as in (2). Typically, given (1), the control law (2) is selected as the linear state feedback primarily because the resulting closed-loop description (3) is also a linear time-invariant system. However, depending on the application needs, the state feedback control law (2) can be more complex.
– Although the Eqs. (3) that describe the closed-loop behavior are different from Eq. (1), this does not imply that the system parameters have changed. The way feedback control acts is not by actually changing the system parameters A, B, C, D but by changing u so that the closed-loop system behaves as if the parameters were changed. When one applies, say, a step via r(t) in the closed-loop system, then u(t) in (2) is modified appropriately so the system behaves in a desired way.
– It is possible to implement u in (2) as an open-loop control law, namely,

û(s) = F[sI − (A + BF)]⁻¹x(0) + [I − F(sI − A)⁻¹B]⁻¹r̂(s),   (4)

where Laplace transforms have been used for notational convenience. Equation (4) produces exactly the same input as Eq. (2), but it has the serious disadvantage that it is based exclusively on prior knowledge of the system (notably x(0) and the parameters A, B). As a result, when there are uncertainties (and there always are), the open-loop control law (4) may fail, while the closed-loop control law (2) succeeds.
– Analogous definitions exist for continuous-time, time-varying systems described by the equations

ẋ = A(t)x + B(t)u,   y = C(t)x + D(t)u.   (5)

In this framework, the control law is described by

u = F(t)x + r,   (6)

and the resulting closed-loop system is

ẋ = [A(t) + B(t)F(t)]x + B(t)r,   y = [C(t) + D(t)F(t)]x + D(t)r.   (7)
Linear State Feedback
Linear Discrete-Time Systems

For the discrete-time, time-invariant case, the system description is

x(k+1) = Ax(k) + Bu(k),   y(k) = Cx(k) + Du(k),   (8)

the linear state feedback control law is defined as

u(k) = Fx(k) + r(k),   (9)

and the closed-loop system is described by

x(k+1) = (A + BF)x(k) + Br(k),   y(k) = (C + DF)x(k) + Dr(k).   (10)

For time-varying systems described by

x(k+1) = A(k)x(k) + B(k)u(k),   y(k) = C(k)x(k) + D(k)u(k),   (11)

the control law is

u(k) = F(k)x(k) + r(k),   (12)

and the resulting closed-loop system is

x(k+1) = [A(k) + B(k)F(k)]x(k) + B(k)r(k),   y(k) = [C(k) + D(k)F(k)]x(k) + D(k)r(k).   (13)

Selecting the Gain F

F (or F(t)) is selected so that the closed-loop system has certain desirable properties. Stability is of course of major importance. Many control problems are addressed using linear state feedback, including tracking and regulation, diagonal decoupling, and disturbance rejection. Here we shall focus on stability.

Stability can be achieved under appropriate controllability assumptions. In the time-varying case, one way to determine such stabilizing F(t) (or F(k)) is to use results from the optimal linear quadratic regulator (LQR) theory, which yields the "best" F(t) (or F(k)) in some sense.

In the time-invariant case, one can also use an LQR formulation, but here stabilization is equivalent to the problem of assigning the n eigenvalues of (A + BF) in the stable region of the complex plane. If λᵢ, i = 1, …, n, are the eigenvalues of A + BF, then F should be chosen so that, for all i = 1, …, n, the real part of λᵢ satisfies Re(λᵢ) < 0 in the continuous-time case, and the magnitude of λᵢ satisfies |λᵢ| < 1 in the discrete-time case. Eigenvalue assignment is therefore an important problem, which is discussed hereafter.

Theorem 1  The eigenvalue assignment problem has a solution if and only if the pair (A, B) is reachable.

For single-input systems, that is, for systems with m = 1, the eigenvalue assignment problem has a simple solution, as illustrated in the following statement:

Proposition 1  Consider system (1) or (8). Let m = 1. Assume that

rank R = n,   where   R = [B, AB, …, Aⁿ⁻¹B],

that is, the system is reachable. Let p(s) be a desired monic polynomial of degree n. Then there is a (unique) linear state feedback gain matrix F such that the characteristic polynomial of A + BF is equal to p(s). Such a linear state feedback gain matrix F is given by
F = −[0 0 ⋯ 1] R⁻¹ p(A).   (14)

Proposition 1 provides a constructive way to assign the characteristic polynomial, hence the eigenvalues, of the matrix A + BF. Note that, for low-order systems, i.e., if n = 2 or n = 3, it may be convenient to compute directly the characteristic polynomial of A + BF and then compute F using the principle of identity of polynomials, i.e., F should be such that the coefficients of the polynomials det(sI − (A + BF)) and p(s) coincide. Equation (14) is known as Ackermann's formula.

The result summarized in Proposition 1 can be extended to multi-input systems.

Proposition 2  Consider system (1) or (8). Suppose

rank R = n,

that is, the system is reachable. Let p(s) be a monic polynomial of degree n. Then there is a linear state feedback gain matrix F such that the characteristic polynomial of A + BF is equal to p(s).

Note that in the case m > 1 the linear state feedback gain matrix F assigning the characteristic polynomial of the matrix A + BF is not unique. To compute such a gain matrix, one may exploit the following fact:

Lemma 1  Consider system (1). Suppose

rank R = n,

that is, the system is reachable. Let bᵢ be a nonzero column of the matrix B. Then there is a matrix G such that the single-input system

ẋ = (A + BG)x + bᵢv   (15)

is reachable.

Exploiting Lemma 1, it is possible to design a matrix F such that the characteristic polynomial of A + BF equals some monic polynomial p(s) of degree n in two steps. First, we compute a matrix G such that the system (15) is reachable, and then we use Ackermann's formula to compute a linear state feedback gain matrix F such that the characteristic polynomial of

A + BG + bᵢF

is p(s). Note also that if (A, B) is reachable, under mild conditions on A, there exists a vector g so that (A, Bg) is reachable.

There are many other methods to assign the eigenvalues, which may be found in the references below.

Transfer Functions

If H_F(s) is the transfer function matrix of the closed-loop system (3), it is of interest to find its relation to the open-loop transfer function H(s) of (1). It can be shown that

H_F(s) = H(s)[I − F(sI − A)⁻¹B]⁻¹ = H(s)[F(sI − (A + BF))⁻¹B + I].

In the single-input, single-output case, it can be readily shown that the linear state feedback control law (2) only changes the coefficients of the denominator polynomial in the transfer function (this result is also true in the multi-input, multi-output case). Therefore, if any of the (stable) zeros of H(s) need to be changed, the only way to accomplish this via linear state feedback is by pole-zero cancelation (assigning closed-loop poles at the open-loop zero locations; in the MIMO case, closed-loop eigenvalue directions also need to be assigned for cancelations to take place). Note that it is impossible to change the unstable zeros of H(s) under stability, since they would have to be canceled with unstable poles. Similar results are true for discrete-time systems (8).

Observer-Based Dynamic Controllers

When the state x is not available for feedback, an asymptotic estimator (a Luenberger observer) is typically used to estimate the state. The estimate
x̃ of the state, instead of the actual x, is then used in (2) to control the system, in what is known as the certainty equivalence architecture.

Summary

The notion of state feedback for linear systems has been discussed. It has been shown that state feedback modifies the closed-loop behavior. The related problem of eigenvalue assignment has been discussed, and its connection with the reachability (controllability) properties of the system has been highlighted. The class of feedback laws considered is the simplest possible one. If additional constraints on the input signal, or on the closed-loop performance, are imposed, then one perhaps has to resort to nonlinear state feedback, for example, if the input signal is bounded in amplitude or rate. If constraints such as decoupling of the system into m noninteracting subsystems or tracking under asymptotic stability are imposed, then dynamic state feedback may be necessary.

Cross-References

Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
Observer-Based Control

Bibliography

Wonham WM (1967) On pole assignment in multi-input controllable linear systems. IEEE Trans Autom Control AC-12:660–665
Wonham WM (1985) Linear multivariable control: a geometric approach, 3rd edn. Springer, New York


Linear Systems: Continuous-Time Impulse Response Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

An important input–output description of a linear continuous-time system is its impulse response, which is the response h(t, τ) to an impulse applied at time τ. In time-invariant systems that are also causal and at rest at time zero, the impulse response is h(t, 0) and its Laplace transform is the transfer function of the system. Expressions for h(t, τ) when the system is described by state-variable equations are also derived.

Keywords

Continuous-time; Impulse response descriptions; Linear systems; Time-invariant; Time-varying; Transfer function descriptions
Introduction
where it was assumed that x(t₀) = 0, i.e., the system is at rest at t₀. Here Φ(t, τ) is the state transition matrix of the system defined by the Peano-Baker series

Φ(t, t₀) = I + ∫_{t₀}^{t} A(σ₁) dσ₁ + ∫_{t₀}^{t} A(σ₁) [ ∫_{t₀}^{σ₁} A(σ₂) dσ₂ ] dσ₁ + ⋯;

see ▶ Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions.

Comparing (13) with (10), the impulse response is

H(t, τ) = C(t)Φ(t, τ)B(τ) + D(t)δ(t − τ) for t ≥ τ, and 0 for t < τ.   (14)

Similarly, when the system is time-invariant and is described by (3),

y(t) = ∫_{t₀}^{t} [C e^{A(t−τ)} B + D δ(t − τ)] u(τ) dτ,   (15)

Ĥ(s) = C(sI − A)^{−1} B + D,   (18)

which is the transfer function matrix in terms of the coefficient matrices in the state-variable description (3). Note that (18) can also be derived directly from (3) by assuming zero initial conditions (x(0) = 0) and taking the Laplace transform of both sides.

Finally, it is easy to show that equivalent state-variable descriptions give rise to the same impulse responses.

Summary

The continuous-time impulse response is an external, input–output description of linear, continuous-time systems. When the system is time-invariant, the Laplace transform of the impulse response h(t, 0) (which is the output response at time t due to an impulse applied at time zero with initial conditions taken to be zero) is the transfer function of the system – another very common input–output description. The relationships with the state-variable descriptions are shown.
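As a small numerical sketch (my own example, not part of the entry), the correspondence between the transfer function Ĥ(s) = C(sI − A)^{−1}B + D in (18) and the state-space matrices can be checked with SciPy; the matrices below are arbitrary.

```python
import numpy as np
from scipy.signal import ss2tf

# Arbitrary state-space description (assumed example)
A = np.array([[0., 1.], [-2., -3.]])
B = np.array([[0.], [1.]])
C = np.array([[1., 0.]])
D = np.array([[0.]])

# Polynomial coefficients of the transfer function H(s)
num, den = ss2tf(A, B, C, D)

# Evaluate both descriptions at an arbitrary complex point s
s = 1.0 + 0.5j
H_direct = (C @ np.linalg.inv(s * np.eye(2) - A) @ B + D)[0, 0]
H_poly = np.polyval(num[0], s) / np.polyval(den, s)
assert np.isclose(H_direct, H_poly)
```

Here `den` is the characteristic polynomial of A, so the poles of Ĥ(s) are (a subset of) the eigenvalues of A.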
Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
DeCarlo RA (1989) Linear systems. Prentice-Hall, Englewood Cliffs
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs
Sontag ED (1990) Mathematical control theory: deterministic finite dimensional systems. Texts in applied mathematics, vol 6. Springer, New York

Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions

Introduction

Linear, continuous-time systems are of great interest because they model, exactly or approximately, the behavior over time of many practical physical systems of interest. We are particularly interested in systems whose behavior is described by linear, ordinary differential equations with constant coefficients. Such descriptions can always be rewritten as a set of first-order differential equations, typically in the following convenient state variable form:
Solving (4) is an initial value problem. It can be shown that there always exists a unique solution φ(t) such that

φ̇(t) = Aφ(t);  φ(0) = x₀.

To find the unique solution, consider first the one-dimensional case, namely,

ẏ(t) = a y(t);  y(0) = y₀,

the unique solution of which is

y(t) = e^{at} y₀,  t ≥ 0.

The scalar exponential e^{at} can be expressed in the series form

e^{at} = Σ_{k=0}^{∞} (tᵏ/k!) aᵏ  (= 1 + ta + (1/2)t²a² + (1/6)t³a³ + ⋯).

The generalization to the n×n matrix exponential (A is n×n) is given by

e^{At} = Σ_{k=0}^{∞} (tᵏ/k!) Aᵏ  (= Iₙ + At + (1/2)A²t² + ⋯),   (5)

the time-invariant case becomes the defining series for the matrix exponential (5).

System Response

Based on the solution of the homogeneous equation (4), shown in (6), the solution of the state equation in (1) can be shown to be

x(t) = e^{At} x₀ + ∫₀ᵗ e^{A(t−τ)} B u(τ) dτ.   (7)

The following properties of the matrix exponential e^{At} can be shown directly from the defining series:
1. A e^{At} = e^{At} A.
2. (e^{At})^{−1} = e^{−At}.

Equation (7), which is known as the variation of constants formula, can be derived as follows. Consider ẋ = Ax + Bu and let z(t) = e^{−At} x(t). Then x(t) = e^{At} z(t), and substituting,

A e^{At} z(t) + e^{At} ż(t) = A e^{At} z(t) + B u(t),

or ż(t) = e^{−At} B u(t), from which

z(t) − z(0) = ∫₀ᵗ e^{−Aτ} B u(τ) dτ.
The output y(t) is then

y(t) = C e^{At} x₀ + ∫₀ᵗ C e^{A(t−τ)} B u(τ) dτ + D u(t)   (8)
     = C e^{At} x₀ + ∫₀ᵗ [C e^{A(t−τ)} B + D δ(t − τ)] u(τ) dτ.

The second expression involves the Dirac (or impulse or delta) function δ(t), and it is derived based on the basic property of δ(t), namely,

f(t) = ∫_{−∞}^{∞} δ(t − τ) f(τ) dτ.

It is clear that the matrix exponential e^{At} plays a central role in determining the response of a linear continuous-time, time-invariant system described by (1). Given A, e^{At} may be determined using several methods, including its defining series, diagonalization of A using a similarity transformation (PAP^{−1}), the Cayley-Hamilton theorem, expressions involving the modes of the system (e^{At} = Σ_{i=1}^{n} A_i e^{λ_i t} when A has n distinct eigenvalues λ_i; A_i = v_i ṽ_i with v_i, ṽ_i the right and left eigenvectors of A that correspond to λ_i (ṽ_i v_j = 1 for i = j and ṽ_i v_j = 0 for i ≠ j)), or the Laplace transform (e^{At} = L^{−1}[(sI − A)^{−1}]). See the references below for detailed algorithms.

where

Ã = PAP^{−1},  B̃ = PB,  C̃ = CP^{−1},  D̃ = D.

The state variable descriptions (9) and (10) are called equivalent, and P is the equivalence transformation. This transformation corresponds to a change in the basis of the state space, which is a vector space. Appropriately selecting P, one can simplify the structure of A (Ã = PAP^{−1}); the matrices Ã and A are called similar. When the eigenvectors of A are all linearly independent (this is the case, e.g., when all eigenvalues λ_i of A are distinct), then P may be found so that Ã is diagonal. When e^{At} is to be determined and Ã = PAP^{−1} = diag[λ_i] (Ã and A have the same eigenvalues), then

e^{At} = P^{−1} e^{Ãt} P = P^{−1} diag[e^{λ_i t}] P.

Note that it can be easily shown that equivalent state space representations give rise to the same impulse response and transfer function (see ▶ Linear Systems: Continuous-Time Impulse Response Descriptions).

Summary
Abstract

Continuous-time processes that can be modeled by linear differential equations with time-varying coefficients can be written in terms of state

Solving ẋ(t) = A(t)x(t);  x(t₀) = x₀

Consider the homogeneous equation with the initial condition

The solution of

ẋ(t) = A(t)x(t) + B(t)u(t);  x(t₀) = x₀   (9)

can be shown to be

φ(t) = Φ(t, t₀)x₀ + ∫_{t₀}^{t} Φ(t, τ) B(τ) u(τ) dτ.   (10)

Equation (10) is the variation of constants formula. This result can be shown via direct substitution of (10) into (9); note that φ(t₀) = Φ(t₀, t₀)x₀ = x₀. That (10) is a solution can also be shown using a change of variables in (9), namely,

z(t) = Φ(t₀, t)x(t).

Equation (10) is the sum of two parts, the state response (when u(t) = 0 and the system is driven only by the initial state conditions) and the input response (when x₀ = 0 and the system is driven only by the input u(t)); this illustrates the linear system principle of superposition.

In view of (10), the output y(t) (= C(t)x(t) + D(t)u(t)) is

y(t) = C(t)Φ(t, t₀)x₀ + ∫_{t₀}^{t} C(t)Φ(t, τ)B(τ)u(τ) dτ + D(t)u(t)
     = C(t)Φ(t, t₀)x₀ + ∫_{t₀}^{t} [C(t)Φ(t, τ)B(τ) + D(t)δ(t − τ)] u(τ) dτ.

Properties of the State Transition Matrix Φ(t, t₀)

In general it is difficult to determine Φ(t, t₀) explicitly; however, Φ(t, t₀) may be readily determined in a number of special cases, including the cases in which A(t) = A, A(t) is diagonal, or A(t)A(τ) = A(τ)A(t).

Consider ẋ = A(t)x. We can derive a number of important properties, which are described below. It can be shown that given n linearly independent initial conditions x₀ᵢ, the corresponding n solutions φᵢ(t) are also linearly independent. Let a fundamental matrix Ψ(t) of ẋ = A(t)x be an n×n matrix, the columns of which are a set of linearly independent solutions φ₁(t), …, φₙ(t). The state transition matrix Φ is the fundamental matrix determined from solutions that correspond to the initial conditions [1, 0, 0, …]ᵀ, [0, 1, 0, …, 0]ᵀ, …, [0, 0, …, 1]ᵀ (recall that Φ(t₀, t₀) = I). The following are properties of Φ(t, t₀):
(i) Φ(t, t₀) = Ψ(t)Ψ^{−1}(t₀) with Ψ(t) any fundamental matrix.
(ii) Φ(t, t₀) is nonsingular for all t and t₀.
(iii) Φ(t, τ) = Φ(t, σ)Φ(σ, τ) (semigroup property).
(iv) [Φ(t, t₀)]^{−1} = Φ(t₀, t).

In the special case of time-invariant systems and ẋ = Ax, the above properties can be written in terms of the matrix exponential since
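As an illustrative sketch (my own example, not from the entry), Φ(t, t₀) for a time-varying A(t) can be computed by integrating the matrix differential equation Φ̇ = A(t)Φ with Φ(t₀, t₀) = I, after which properties (iii) and (iv) can be checked numerically.

```python
import numpy as np
from scipy.integrate import solve_ivp

def A(t):
    # arbitrary time-varying system matrix (assumed example)
    return np.array([[0., 1.], [-2. - np.sin(t), -1.]])

def phi(t, t0):
    """State transition matrix Phi(t, t0) via integration of dPhi/dt = A(t) Phi."""
    def rhs(t, y):
        return (A(t) @ y.reshape(2, 2)).ravel()
    sol = solve_ivp(rhs, (t0, t), np.eye(2).ravel(), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1].reshape(2, 2)

P = phi(2.0, 0.0)
# semigroup property (iii): Phi(t, tau) = Phi(t, sigma) Phi(sigma, tau)
assert np.allclose(P, phi(2.0, 1.0) @ phi(1.0, 0.0), atol=1e-6)
# property (iv): Phi(t, t0)^{-1} = Phi(t0, t) (backward integration)
assert np.allclose(np.linalg.inv(P), phi(0.0, 2.0), atol=1e-6)
```

Integrating the n×n matrix equation column by column is exactly the "fundamental matrix from unit initial conditions" construction described above.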
y(k) = Σ_{l=−∞}^{+∞} h(k, l) u(l)   (4)

y(k) = Σ_{l=0}^{k} H(k − l) u(l),  k ≥ 0,   (9)

where we chose k₀ = 0 without loss of generality. Equation (9) is the description for causal, time-invariant systems, at rest at k = 0. Equation (9) is a convolution sum, and if we take the (one-sided or unilateral) z-transform of both sides,

y(k) = Σ_{l=k₀}^{k} H(k, l) u(l).   (8)

y(k) = Σ_{l=k₀}^{k−1} C A^{k−(l+1)} B u(l) + D u(k),  k > k₀,   (13)

where x(k₀) = 0 and

H(k, l) = H(k − l) = C A^{k−(l+1)} B for k > l;  D for k = l;  0 for k < l.   (14)
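A small self-check (my own sketch, not from the entry): the convolution description (13)–(14) should reproduce the output of the state recursion x(k+1) = Ax(k) + Bu(k), y(k) = Cx(k) + Du(k) started from x(0) = 0. The matrices below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[0.5, 0.2], [0.0, 0.3]])
B = np.array([[0.], [1.]])
C = np.array([[1., 1.]])
D = np.array([[0.5]])
u = rng.standard_normal(20)

# state recursion from x(0) = 0
x = np.zeros((2, 1))
y_rec = []
for k in range(20):
    y_rec.append((C @ x + D * u[k])[0, 0])
    x = A @ x + B * u[k]

# discrete impulse response H(k - l) from (14)
def H(j):
    if j > 0:
        return (C @ np.linalg.matrix_power(A, j - 1) @ B)[0, 0]
    if j == 0:
        return D[0, 0]
    return 0.0

# convolution sum (13): y(k) = sum_{l=0}^{k-1} C A^{k-l-1} B u(l) + D u(k)
y_conv = [sum(H(k - l) * u[l] for l in range(k + 1)) for k in range(20)]
assert np.allclose(y_rec, y_conv)
```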
Cross-References

▶ Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
▶ Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Keywords

Discrete-time; Linear systems; State variable descriptions; Time-invariant

Introduction
x(k) = [x₁(k), …, xₙ(k)]ᵀ. Description (1) can also be derived from continuous-time system descriptions by sampling (see ▶ Sampled-Data Systems).

The advantage of the above state variable description is that given any input u(k) and initial conditions x(0), its solution (state trajectory or motion) can be conveniently and systematically characterized. This is done below. We first consider the solutions of the homogeneous equation x(k + 1) = Ax(k).

Solving x(k + 1) = Ax(k);  x(0) = x₀

Consider the homogeneous equation

x(k + 1) = Ax(k);  x(0) = x₀   (2)

where k ∈ Z⁺ is a nonnegative integer, x(k) = [x₁(k), …, xₙ(k)]ᵀ is the state column vector of dimension n, and A is an n×n matrix with real entries (i.e., A ∈ Rⁿˣⁿ).

Write (2) for k = 0, 1, 2, …, namely, x(1) = Ax(0), x(2) = Ax(1) = A²x(0), …, to derive the solution

x(k) = Aᵏ x(0),  k > 0.   (3)

This result can be shown formally by induction. Note that A⁰ = I by convention, and so (3) also satisfies the initial condition. If the initial time were some (integer) k₀ instead of zero, then the solution would be

x(k) = A^{k−k₀} x(k₀),  k > k₀,   (4)

which depends only on the time elapsed (k − k₀) and not on the actual initial time k₀.

In view of (3), it is clear that Aᵏ plays an important role in the solutions of the difference state equations that describe linear, discrete-time, time-invariant systems; it is actually analogous to the role e^{At} plays in the solutions of the linear differential state equations that describe linear, continuous-time, time-invariant systems.

Notice that in (3), k > 0. This is so because Aᵏ for k < 0 may not exist; this is the case, for example, when A is a singular matrix – it has at least one eigenvalue at the origin. In contrast, e^{At} exists for any t, positive or negative. The implication is that in discrete-time systems we may not be able to determine uniquely the initial past state x(0) from a current state value x(k); in contrast, in continuous-time systems, it is always possible to go backwards in time.

There are several methods to calculate Aᵏ that mirror the methods to calculate e^{At}. One could, for example, use similarity transformations, or the z-transform. When all eigenvectors of A are linearly independent (this is the case, e.g., when all eigenvalues λᵢ of A are distinct), then a similarity transformation exists so that

PAP^{−1} = Ã = diag[λᵢ].

Then

Aᵏ = P^{−1} Ãᵏ P = P^{−1} diag[λ₁ᵏ, …, λₙᵏ] P.

[ṽ₁; ⋮; ṽₙ] = [v₁ ⋯ vₙ]^{−1},

and Aᵢλᵢᵏ are the modes of the system. One could also use the Cayley-Hamilton theorem to determine Aᵏ.

Equivalence of State Variable Descriptions

Given description (1), consider the new state vector x̃ where

x̃(k) = P x(k)
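As a numerical sketch (my own example, not from the entry), the diagonalization formula for Aᵏ can be checked against repeated multiplication when A has distinct eigenvalues:

```python
import numpy as np

A = np.array([[0.6, 0.3], [0.1, 0.4]])   # distinct eigenvalues 0.7 and 0.3
k = 7

lam, V = np.linalg.eig(A)   # columns of V are right eigenvectors v_i
# A^k = V diag(lam^k) V^{-1}; in the notation above, P^{-1} = V
Ak_eig = V @ np.diag(lam**k) @ np.linalg.inv(V)
assert np.allclose(Ak_eig, np.linalg.matrix_power(A, k))
```

The rows of V^{−1} are the left eigenvectors ṽᵢ (normalized so that ṽᵢ vⱼ equals 1 for i = j and 0 otherwise), matching the mode expansion Aᵏ = Σᵢ Aᵢ λᵢᵏ with Aᵢ = vᵢ ṽᵢ.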
The state variable descriptions received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman. For historical comments and extensive references, see Kailath (1980). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Franklin GF, Powell DJ, Workman ML (1998) Digital control of dynamic systems, 3rd edn. Addison-Wesley Longman, Menlo Park, CA
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs

Linear Systems: Discrete-Time, Time-Varying, State Variable Descriptions

Panos J. Antsaklis
Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Discrete-time processes that can be modeled by linear difference equations with time-varying coefficients can be written in terms of state variable descriptions of the form x(k + 1) = A(k)x(k) + B(k)u(k); y(k) = C(k)x(k) + D(k)u(k). The response of such systems due to a given input and initial conditions is derived. Equivalence of state variable descriptions is also discussed.

Discrete-time systems arise in a variety of ways in the modeling process. There are systems that are inherently defined only at discrete points in time; examples include digital devices, inventory systems, and economic systems such as banking, where interest is calculated and added to savings accounts at discrete time intervals. There are also systems that describe continuous-time systems at discrete points in time; examples include simulations of continuous processes using digital computers and feedback control systems that employ digital controllers and give rise to sampled-data systems.

Dynamical processes that can be described or approximated by linear difference equations with time-varying coefficients can also be described, via a change of variables, by state variable descriptions of the form

x(k + 1) = A(k)x(k) + B(k)u(k);  x(k₀) = x₀
y(k) = C(k)x(k) + D(k)u(k).   (1)

Above, the state vector x(k) (k ∈ Z, the set of integers) is a column vector of dimension n (x(k) ∈ Rⁿ); the output is y(k) ∈ Rᵖ and the input is u(k) ∈ Rᵐ. A(k), B(k), C(k), and D(k) are matrices whose entries are functions of time k, A(k) = [aᵢⱼ(k)], aᵢⱼ(k): Z → R (A(k) ∈ Rⁿˣⁿ, B(k) ∈ Rⁿˣᵐ, C(k) ∈ Rᵖˣⁿ, D(k) ∈ Rᵖˣᵐ). The vector difference equation in (1) is the state equation, while the algebraic equation is the output equation. Note that in the time-invariant case, A(k) = A, B(k) = B, C(k) = C, and D(k) = D.

The advantage of the state variable description (1) is that given an input u(k), k ≥ k₀, and an initial condition x(k₀) = x₀, the state trajectories or motions for k ≥ k₀ can be conveniently characterized. To determine the expressions, we first consider the homogeneous state equation and the corresponding initial value problem.
x(k₀ + 1) = A(k₀)x(k₀)
x(k₀ + 2) = A(k₀ + 1)A(k₀)x(k₀)
⋮
x(k) = A(k − 1)A(k − 2) ⋯ A(k₀)x(k₀) = [ Π_{j=k₀}^{k−1} A(j) ] x(k₀),  k > k₀.

This result can be shown formally by induction. The solution of (2) is then

x(k) = Φ(k, k₀)x(k₀),   (3)

where Φ(k, k₀) is the state transition matrix of (2) given by

Φ(k, k₀) = Π_{j=k₀}^{k−1} A(j),  k > k₀;  Φ(k₀, k₀) = I.   (4)

Note that in the time-invariant case, Φ(k, k₀) = A^{k−k₀}.

System Response

Consider now the state equation in (1). It can be easily shown that the solution is

x(k) = Φ(k, k₀)x(k₀) + Σ_{j=k₀}^{k−1} Φ(k, j + 1)B(j)u(j),  k > k₀,   (5)

and the response y(k) of (1) is

y(k) = C(k)Φ(k, k₀)x(k₀) + C(k) Σ_{j=k₀}^{k−1} Φ(k, j + 1)B(j)u(j) + D(k)u(k),  k > k₀.   (6)

Equation (5) is the sum of two parts, the state response (when u(k) = 0 and the system is driven only by the initial state conditions) and the input response (when x(k₀) = 0 and the system is driven only by the input u(k)); this illustrates the linear systems principle of superposition.

Equivalence of State Variable Descriptions

Given (1), consider the new state vector x̃ where

x̃(k) = P(k)x(k),

where P^{−1}(k) exists. Then

x̃(k + 1) = Ã(k)x̃(k) + B̃(k)u(k)
y(k) = C̃(k)x̃(k) + D̃(k)u(k)   (7)

where

Ã(k) = P(k + 1)A(k)P^{−1}(k),
B̃(k) = P(k + 1)B(k),
C̃(k) = C(k)P^{−1}(k),
D̃(k) = D(k)

is equivalent to (1). It can be easily shown that equivalent descriptions give rise to the same discrete impulse responses.

Summary

State variable descriptions for linear discrete-time time-varying systems were introduced, and the state and output responses to inputs and initial conditions were derived. The equivalence of state variable representations was also discussed.

Cross-References

▶ Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
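As a small sketch (my own example, not from the entry), the closed-form response (5) can be checked against direct iteration of the recursion in (1):

```python
import numpy as np

rng = np.random.default_rng(1)
n, k0, kf = 2, 0, 8
A = [rng.standard_normal((n, n)) for _ in range(kf)]   # A(0), ..., A(kf-1), arbitrary
B = [rng.standard_normal((n, 1)) for _ in range(kf)]
u = rng.standard_normal(kf)
x0 = rng.standard_normal((n, 1))

def Phi(k, j):
    """State transition matrix: product A(k-1) ... A(j), with Phi(j, j) = I."""
    M = np.eye(n)
    for i in range(j, k):
        M = A[i] @ M
    return M

# direct iteration of x(k+1) = A(k)x(k) + B(k)u(k)
x = x0
for k in range(k0, kf):
    x = A[k] @ x + B[k] * u[k]

# closed-form solution (5)
x_cf = Phi(kf, k0) @ x0 + sum(Phi(kf, j + 1) @ B[j] * u[j] for j in range(k0, kf))
assert np.allclose(x, x_cf)
```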
▶ Linear Systems: Discrete-Time Impulse Response Descriptions
▶ Linear Systems: Continuous-Time, Time-Varying State Variable Descriptions
▶ Sampled-Data Systems

Recommended Reading

The state variable descriptions received wide acceptance in systems theory beginning in the late 1950s. This was primarily due to the work of R.E. Kalman. For historical comments and extensive references, see Kailath (1980). The use of state variable descriptions in systems and control opened the way for the systematic study of systems with multi-inputs and multi-outputs.

Bibliography

Antsaklis PJ, Michel AN (2006) Linear systems. Birkhauser, Boston
Astrom KJ, Wittenmark B (1997) Computer controlled systems: theory and design, 3rd edn. Prentice Hall, Upper Saddle River, NJ
Franklin GF, Powell DJ, Workman ML (1998) Digital control of dynamic systems, 3rd edn. Addison-Wesley, Menlo Park, CA
Jury EI (1958) Sampled-data control systems. Wiley, New York
Kailath T (1980) Linear systems. Prentice-Hall, Englewood Cliffs
Ragazzini JR, Franklin GF (1958) Sampled-data control systems. McGraw-Hill, New York
Rugh WJ (1996) Linear systems theory, 2nd edn. Prentice-Hall, Englewood Cliffs

LMI Approach to Robust Control

Kang-Zhi Liu
Department of Electrical and Electronic Engineering, Chiba University, Chiba, Japan

Abstract

In the analysis and design of robust control systems, the LMI method plays a fundamental role. This article gives a brief introduction to this topic. After the introduction of LMI, it is illustrated how a control design problem is related with matrix inequality. Then, two methods are explained on how to transform a control problem characterized by matrix inequalities to LMIs, which is the core of the LMI approach. Based on this knowledge, the LMI solutions to various kinds of robust control problems are illustrated. Included are H∞ and H2 control, regional pole placement, and gain-scheduled control.

Keywords

Gain-scheduled control; H∞ and H2 control; LMI; Multi-objective control; Regional pole placement; Robust control

Introduction of LMI

A matrix inequality of the form

F(x) = F₀ + Σ_{i=1}^{m} xᵢ Fᵢ > 0   (1)

is called an LMI (linear matrix inequality). Here, x = [x₁ ⋯ xₘ] is the unknown vector and each Fᵢ (i = 1, …, m) is a symmetric matrix. F(x) is an affine function of x. The inequality means that F(x) is positive definite.

LMIs can be solved effectively by numerical algorithms such as the famous interior point method (Nesterov and Nemirovskii 1994). MATLAB has an LMI toolbox (Gahinet et al. 1995) tailored for solving the related control problems. Boyd et al. (1994) provide detailed theoretic fundamentals of LMI. A comprehensive and up-to-date treatment of the applications of LMI in robust control is given in Liu and Yao (2014).

The notation He(A) = A + Aᵀ is used to simplify the presentation of large matrices; A⊥ is a matrix whose columns form a basis of the kernel space of A, i.e., AA⊥ = 0. Further, A ⊗ B denotes the Kronecker product of the matrices (A, B).
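A minimal illustration (my own sketch, not from the article): a classical LMI is the Lyapunov inequality AᵀP + PA < 0, P > 0, which is feasible exactly when A is Hurwitz. Rather than calling a general-purpose LMI solver, a feasible point can be produced here with SciPy by solving the Lyapunov equation AᵀP + PA = −Q for some Q > 0:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

A = np.array([[0., 1.], [-2., -3.]])   # Hurwitz (eigenvalues -1, -2)
Q = np.eye(2)

# Solve A^T P + P A = -Q; for stable A the solution is a feasible
# point of the LMI  A^T P + P A < 0, P > 0.
P = solve_continuous_lyapunov(A.T, -Q)
P = 0.5 * (P + P.T)   # symmetrize against round-off

assert np.all(np.linalg.eigvalsh(P) > 0)                 # P > 0
assert np.all(np.linalg.eigvalsh(A.T @ P + P @ A) < 0)   # A^T P + P A < 0
```

General LMI problems (with inequality constraints rather than equalities) require a semidefinite-programming solver, e.g., the interior-point methods cited above.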
The matrix inequality

EᵀXF + FᵀXᵀE + G < 0   (3)

has a solution X if and only if the following two inequalities hold simultaneously:

E⊥ᵀ G E⊥ < 0,  F⊥ᵀ G F⊥ < 0.   (4)

[ẋ_P; ẋ_K] = A_c [x_P; x_K],  A_c = [A + B D_K C, B C_K; B_K C, A_K].   (8)

The stability condition is that the matrix inequality

A_cᵀ P + P A_c < 0   (9)

A_c = A̅ + B̅ K C̅,  K = [D_K, C_K; B_K, A_K]   (10)

in which A̅ = diag(A, 0), B̅ = diag(B, I), and C̅ = diag(C, I), all being block diagonal. Then, based on Lemma 1, the stability condition reduces to the existence of symmetric matrices X, Y satisfying the LMIs

(B̅ᵀ)⊥ᵀ (A̅X + XA̅ᵀ) (B̅ᵀ)⊥ < 0   (11)

(C̅⊥)ᵀ (YA̅ + A̅ᵀY) C̅⊥ < 0.   (12)

Meanwhile, the positive definiteness of the matrix P is guaranteed by Eq. (5) in Lemma 2.

From BMI to LMI: Variable Change

We may also use the method of variable change to transform a BMI into an LMI. This method is good at multi-objective optimization.

The detail is as follows (Gahinet 1996). A positive-definite matrix can always be factorized as the quotient of two triangular matrices, i.e.,

P Π₁ = Π₂,  Π₁ = [X, I; Mᵀ, 0],  Π₂ = [I, Y; 0, Nᵀ].   (13)

P > 0 is guaranteed by Eq. (5) for a full-order controller. Further, the matrices M, N are computed from MNᵀ = I − XY. Consequently, they are nonsingular.

An equivalent inequality Π₁ᵀA_cᵀΠ₂ + Π₂ᵀA_cΠ₁ < 0 is obtained by multiplying Eq. (9) with Π₁ᵀ and Π₁. After a change of variables, this inequality reduces to the LMI

He [AX + BC, A + BDC; A, YA + BC] < 0.   (14)

The new variables A, B, C, D are set as

A = N A_K Mᵀ + N B_K C X + Y B C_K Mᵀ + Y (A + B D_K C) X,
B = N B_K + Y B D_K,  C = C_K Mᵀ + D_K C X,  D = D_K.   (15)

The coefficient matrices of the controller become

D_K = D,  C_K = (C − D_K C X) M^{−T},  B_K = N^{−1} (B − Y B D_K),
A_K = N^{−1} (A − N B_K C X − Y B C_K Mᵀ − Y (A + B D_K C) X) M^{−T}.   (16)

H2 and H∞ Control

In system optimization, the H2 and H∞ norms are the most popular and effective performance indices. The H2 norm of a transfer function is closely related with the squared area of its impulse response, so a smaller H2 norm implies a faster response. Meanwhile, the H∞ norm of a transfer function is the largest magnitude of its frequency response. Hence, for a transfer function from the disturbance to the controlled output, a smaller H∞ norm guarantees a better disturbance attenuation.

Usually H2 and H∞ optimization problems are treated in the generalized feedback system of Fig. 1. Here, the generalized plant G(s) includes the nominal plant, the performance index, and the weighting functions. Let the generalized plant G(s) be

G(s) = [C₁; C₂] (sI − A)^{−1} [B₁ B₂] + [D₁₁, D₁₂; D₂₁, 0].   (17)

Further, the stabilizability of (A, B₂) and the detectability of (C₂, A) are assumed.
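As a consistency sketch (my own check, not from the article), the variable change (15) and the controller recovery (16) are algebraic inverses of each other; with random matrices of compatible dimensions the round trip can be verified numerically:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 3, 2, 2   # state, input, output dimensions (arbitrary)

A = rng.standard_normal((n, n)); B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
X = rng.standard_normal((n, n)); Y = rng.standard_normal((n, n))
M = rng.standard_normal((n, n)); N = rng.standard_normal((n, n))
AK = rng.standard_normal((n, n)); BK = rng.standard_normal((n, p))
CK = rng.standard_normal((m, n)); DK = rng.standard_normal((m, p))

# variable change (15): script variables Ah, Bh, Ch, Dh
Ah = N @ AK @ M.T + N @ BK @ C @ X + Y @ B @ CK @ M.T + Y @ (A + B @ DK @ C) @ X
Bh = N @ BK + Y @ B @ DK
Ch = CK @ M.T + DK @ C @ X
Dh = DK

# controller recovery (16)
Mit = np.linalg.inv(M.T)
DK2 = Dh
CK2 = (Ch - DK2 @ C @ X) @ Mit
BK2 = np.linalg.inv(N) @ (Bh - Y @ B @ DK2)
AK2 = np.linalg.inv(N) @ (Ah - N @ BK2 @ C @ X - Y @ B @ CK2 @ M.T
                          - Y @ (A + B @ DK2 @ C) @ X) @ Mit

for orig, rec in [(AK, AK2), (BK, BK2), (CK, CK2), (DK, DK2)]:
    assert np.allclose(orig, rec)
```

Note that the round trip holds for any invertible M and N; the extra condition MNᵀ = I − XY is what makes P in (13) a genuine positive-definite Lyapunov matrix.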
The closed-loop transfer matrix from the disturbance w to the performance output z is denoted by

H_zw(s) = C_c (sI − A_c)^{−1} B_c + D_c.   (18)

The condition for H_zw(s) to have an H2 norm less than γ, i.e., ‖H_zw‖₂ < γ, is that there are symmetric matrices P and W satisfying

[P A_c + A_cᵀ P, C_cᵀ; C_c, −I] < 0,  [W, B_cᵀ P; P B_c, P] > 0   (19)

as well as Tr(W) < γ². Here, Tr(W) denotes the trace of the matrix W, i.e., the sum of its diagonal entries.

The LMI solution is derived via the application of the variable change method, as given below.

Theorem 1 Suppose that D₁₁ = 0. The H2 control problem is solvable if and only if there exist symmetric matrices X, Y, W and matrices A, B, C, D

When the LMI Eqs. (20) and (21) have solutions, an H2 controller is given by Eq. (16) by setting D = 0.

The H∞ control problem is to design a controller so that ‖H_zw‖∞ < γ. The starting point of H∞ control is the famous bounded real lemma, which states that H_zw(s) has an H∞ norm less than γ if and only if there is a positive-definite matrix P satisfying

[A_cᵀ P + P A_c, P B_c, C_cᵀ; B_cᵀ P, −γI, D_cᵀ; C_c, D_c, −γI] < 0.   (22)

There are two kinds of LMI solutions to this control problem: one based on variable elimination and one based on variable change. To state the first solution, define the following matrices first:

N_Y = [C₂ D₂₁]⊥,  N_X = [B₂ᵀ D₁₂ᵀ]⊥.   (23)

Theorem 2 The H∞ control problem has a solution if and only if Eq. (5) and the following LMIs have positive-definite solutions X, Y:

[N_X, 0; 0, I]ᵀ [AX + XAᵀ, XC₁ᵀ, B₁; C₁X, −γI, D₁₁; B₁ᵀ, D₁₁ᵀ, −γI] [N_X, 0; 0, I] < 0   (24)

[N_Y, 0; 0, I]ᵀ [AᵀY + YA, YB₁, C₁ᵀ; B₁ᵀY, −γI, D₁₁ᵀ; C₁, D₁₁, −γI] [N_Y, 0; 0, I] < 0.   (25)

Once a matrix P is computed according to Lemma 2, Eq. (22) becomes an LMI and its solution yields the controller.

The second solution is given below.

Theorem 3 The H∞ control problem has a solution if and only if Eq. (5) and the following LMI have solutions X, Y and A, B, C, D:

He [AX + B₂C, A + B₂DC₂, B₁ + B₂DD₂₁, 0;
    A, YA + BC₂, YB₁ + BD₂₁, 0;
    0, 0, −(γ/2)I, 0;
    C₁X + D₁₂C, C₁ + D₁₂DC₂, D₁₁ + D₁₂DD₂₁, −(γ/2)I] < 0.   (26)
The controller is given by Eq. (16).

convex regions characterized by LMI, the design method is mature and proven effective in practice.

Let us see how to characterize a convex region. It is easy to see that a complex number z is inside the disk of Fig. 2a if and only if it satisfies

[−r, z + c; z̄ + c, −r] < 0.

Similarly, z is inside the sector of Fig. 2b if and only if

[(z + z̄) sin θ, (z − z̄) cos θ; (z̄ − z) cos θ, (z + z̄) sin θ] < 0.

Generally, the set of complex numbers z characterized by

D = {z ∈ C | L + zM + z̄Mᵀ < 0}   (27)

is called an LMI region, in which L is a symmetric matrix. For the dynamic system

ẋ = Ax,   (28)

all of its poles are in the LMI region D if and only if there is a positive-definite matrix P satisfying the corresponding LMI. For the disk region in Fig. 2a, this reads

[−rP, cP + AP; cP + (AP)ᵀ, −rP] < 0.   (30)

Meanwhile, for the sector region in Fig. 2b, the corresponding LMI is

[(AP + PAᵀ) sin θ, (AP − PAᵀ) cos θ; (PAᵀ − AP) cos θ, (AP + PAᵀ) sin θ] < 0.   (31)

Moreover, for a composite LMI region, such as the intersection of the disk and the sector, the pole placement is guaranteed by enforcing a common solution P to all the corresponding LMIs.

In the pole placement design, only the variable change method is applicable. For example, in the nominal closed-loop system Eq. (8), the pole placement condition is that the LMI

L ⊗ [X, I; I, Y] + He( M ⊗ [AX + BC, A + BDC; A, YA + BC] ) < 0   (32)

and Eq. (5) are solvable (Chilali and Gahinet 1996).

For systems with norm-bounded parameter uncertainty, a robust pole placement method is provided in Chilali et al. (1999).
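A small numerical sketch (my own, not from the article): for a scalar z, the disk condition [−r, z + c; z̄ + c, −r] < 0 holds exactly when |z + c| < r, i.e., when z lies in the disk of radius r centered at −c. The eigenvalues of the 2×2 Hermitian matrix are −r ± |z + c|, so negative definiteness is equivalent to disk membership.

```python
import numpy as np

def in_disk_lmi(z, c, r):
    """Negative definiteness of [[-r, z + c], [conj(z + c), -r]]."""
    F = np.array([[-r, z + c], [np.conj(z + c), -r]])
    return bool(np.all(np.linalg.eigvalsh(F) < 0))

c, r = 2.0, 1.5   # disk centered at -c = -2 with radius 1.5 (arbitrary)
assert in_disk_lmi(-2.0 + 0.5j, c, r)    # |z + c| = 0.5 < r: inside
assert not in_disk_lmi(0.0, c, r)        # |z + c| = 2.0 > r: outside
```

Replacing z by A and 1 by P yields the matrix version (30) used for pole placement.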
Gain-Scheduled Control

In practice, many nonlinear systems can be expressed in the form of linear systems with state-dependent coefficients, which is known as the LPV (linear parameter-varying) form. For example, in the model of a robot arm, J θ̈ + mgl sin θ = u, if we define a parameter p(t) = sin θ/θ, then it can be written as the LPV model J θ̈(t) + mgl p(t) θ(t) = u(t). In this class of systems, when the parameter p(t) is available online and its range is finite, one may tune the controller parameters based on the information of p(t), so as to achieve a higher performance. This is referred to as gain-scheduled control.

Consider the following affine model:

Multi-objective Control

When the structure of the gain-scheduled controller is chosen as summarized above, the solvability conditions reduce to those at all vertices of the scheduling parameter vector p(t). Further, a multi-objective design is achieved by enforcing a common solution P to all the corresponding matrix inequality (LMI) conditions. Some concrete examples are illustrated below:

H∞ Norm Spec: The conditions of Theorem 3 are satisfied at all vertices of the parameter vector p(t).
H2 Norm Spec: The conditions of Theorem 1 are satisfied at all vertices of the parameter vector p(t).
Regional Pole Placement: Eq. (32) is satisfied at all vertices of the parameter vector p(t), and Eq. (5) holds.

Moreover, a different gain-scheduled method is proposed in Packard (1994) for parametric systems with norm-bounded uncertainty.
∂A(x_s) ⊆ ∪_{x_i ∈ E ∩ ∂A(x_s)} Wˢ(x_i)

The following two theorems give interesting results on the structure of the equilibrium points on the stability boundary. Moreover, they present a necessary condition for the existence of certain types of equilibrium points on a bounded stability boundary.

Theorem 3 (Structure of Equilibrium Points on the Stability Boundary (Chiang and Thorp 1989)) If there exists an energy function for system (1) which has an asymptotically stable equilibrium point x_s (but is not globally asymptotically stable), then the stability boundary ∂A(x_s) must contain at least one type-one equilibrium point. If, furthermore, the stability region is bounded, then the stability boundary ∂A(x_s) must contain at least one type-one equilibrium point and one source.

Theorem 4 (Sufficient Condition for Unbounded Stability Region (Chiang et al. 1987)) If there exists an energy function for system (1) which has an asymptotically stable equilibrium point x_s (but is not globally asymptotically stable) and if ∂A(x_s) contains no source, then the stability region A(x_s) is unbounded.

A direct application of this is that the stability boundary ∂A(x_s) of an (asymptotically) stable equilibrium point of the classical power system stability model (2) is unbounded.

Optimally Estimating Stability Region Using Energy Functions

In this section, we focus on how to optimally determine the critical level value of an energy function for estimating the stability boundary ∂A(x_s). We consider the following set:

S_V(k) = {x ∈ Rⁿ : V(x) < k}   (4)

where V(·): Rⁿ → R is an energy function. We shall call the boundary of the set (4), ∂S(k) := {x ∈ Rⁿ : V(x) = k}, the level set (or constant energy surface) and k the level value. Generally speaking, this set S(k) can be very complicated with several connected components, even for the 2-dimensional case. We use the notation S_k(x_s) to denote the one component of the several disjoint connected components of S_k that contains the stable equilibrium point x_s.

Theorem 5 (Optimal Estimation) Consider the nonlinear system (1) which has an energy function V(x). Let x_s be an asymptotically stable equilibrium point whose stability region A(x_s) is not dense in Rⁿ. Let E₁ be the set of type-one equilibrium points and ĉ = min_{x_i ∈ ∂A(x_s) ∩ E₁} V(x_i); then:
1. S_ĉ(x_s) ⊂ A(x_s).
2. The set {S_b(x_s) ∩ Āᶜ(x_s)} is nonempty for any number b > ĉ.

This theorem leads to an optimal estimation of the stability region A(x_s) via an energy function V(·) (Chiang and Thorp 1989). For the purpose of illustration, we consider the following simple example:

ẋ₁ = −sin x₁ − 0.5 sin(x₁ − x₂) + 0.01
ẋ₂ = −0.5 sin x₂ − 0.5 sin(x₂ − x₁) + 0.05   (5)

It is easy to show that the following function is an energy function for system (5):

V(x₁, x₂) = −2 cos x₁ − cos x₂ − cos(x₁ − x₂) − 0.02 x₁ − 0.1 x₂   (6)

The point x_s = (x₁ˢ, x₂ˢ) = (0.02801, 0.06403) is the stable equilibrium point whose stability region is to be estimated. Applying the optimal scheme to system (5), we obtain the critical level value of −0.31329. Curve A in Fig. 1 is the exact stability boundary ∂A(x_s), while Curve B is the stability boundary estimated by the connected component (containing the s.e.p. x_s) of the constant energy surface. It can be seen that the critical level value, −0.31329, is indeed the optimal value.
Lyapunov Methods in Power System Stability, Fig. 1 Curve A is the exact stability boundary ∂A(x_s) of system (5), while Curve B is the stability boundary estimated by the constant energy surface (with level value −0.31329) of the energy function

Applications

After decades of research and development in the energy-function-based direct methods and the time-domain simulation approach, it has become clear that the capabilities of direct methods and those of the time-domain approach complement each other. The current direction of development is to include appropriate direct methods and time-domain simulation programs within the body of overall power system stability simulation programs (Chiang 1999, 2011; Chiang et al. 1995; Fouad and Vittal 1991; Sauer and Pai 1998). For example, the direct method provides the advantages of fast computational speed and energy margins, which make it a good complement to the traditional time-domain approach. The energy margin and its functional relations to certain power system parameters are an effective complement to develop tools such as preventive control schemes for credible contingencies which are unstable, and to develop fast calculators for available transfer capability limited by transient stability.

An effective, theory-based methodology for online screening and ranking of a large set of contingencies at operating points obtained from state estimators has been developed in Chiang et al. (2013). A set of improved BCU classifiers, along with their analytical basis, has been developed. Extensive evaluation of the improved BCU classifiers on a large test system and on the actual PJM interconnection system for fast screening has been performed. This evaluation study is the largest in terms of system size, 14,500 buses and 3,000 generators, for a practical online transient stability assessment

Constructing Analytical Energy Functions for Transient Stability Models

The task of constructing an energy function for a (post-fault) transient stability model is essential to direct stability analysis of power systems. The role of the energy function is to make feasible a direct determination of whether a given point (such as the initial point of a post-fault power system) lies inside the stability region of the post-fault SEP without performing numerical integration. It has been shown that a general (analytical) energy function for power systems with losses does not exist (Chiang 1989). One key implication is that any general procedure attempting to construct
an energy function for a lossy power system application. The evaluation results, performed
transient stability model must include a step that on a total number of 5.3 million contingencies,
checks for the existence of an energy function. were very promising in terms of speed, accuracy,
This step essentially plays the same role as the reliability, and robustness (Chiang et al. 2013).
Lyapunov equation in determining the stability of This study also confirms the practicality of
an equilibrium point. theory-based methodology for online transient
Several schemes are available for constructing stability assessment of large-scale power
numerical energy functions for power system systems; in particular, theory-based methods are
transient stability models expressed as a set of suitable for power system online applications
general differential-algebraic equations (DAEs) which demand speed, accuracy, reliability, and
(Chu and Chiang 1999, 2005). robustness.
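As a quick numerical sanity check (a sketch added here, not part of the printed entry), one can verify that V in (6) decreases along trajectories of system (5): the vector field of (5) equals −½∇V, so V̇ = ∇V·f = −½‖∇V‖² ≤ 0, and the reported s.e.p. is indeed an equilibrium.

```python
import numpy as np

def f(x):
    # Post-fault system (5)
    x1, x2 = x
    return np.array([
        -np.sin(x1) - 0.5 * np.sin(x1 - x2) + 0.01,
        -0.5 * np.sin(x2) - 0.5 * np.sin(x2 - x1) + 0.05,
    ])

def grad_V(x):
    # Gradient of the energy function (6)
    x1, x2 = x
    return np.array([
        2 * np.sin(x1) + np.sin(x1 - x2) - 0.02,
        np.sin(x2) + np.sin(x2 - x1) - 0.1,
    ])

# Here f = -0.5 * grad_V, so Vdot = grad_V . f = -0.5 * ||grad_V||^2 <= 0.
rng = np.random.default_rng(0)
samples = rng.uniform(-np.pi, np.pi, size=(1000, 2))
vdot = np.array([grad_V(x) @ f(x) for x in samples])
assert np.all(vdot <= 1e-12)          # energy decreases everywhere sampled

# The reported s.e.p. (0.02801, 0.06403) is (approximately) an equilibrium.
xs = np.array([0.02801, 0.06403])
assert np.linalg.norm(f(xs)) < 1e-4
```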
Lyapunov’s Stability Theory 685
theoretical foundation, the ability to conclude stability without knowledge of the solution (no extensive simulation effort), and an analytical framework that makes it possible to study the effect of model perturbations and design feedback control. Its main drawback is the need to search for an auxiliary function that satisfies certain conditions.

Stability of Equilibrium Points

We consider a nonlinear system represented by the state model

$$\dot{x} = f(x) \qquad (1)$$

where f(0) = 0, so that the origin x = 0 is an equilibrium point of (1). The origin is stable if solutions starting sufficiently close to it remain close to it for all time, and asymptotically stable if, in addition, such solutions converge to the origin; the region of attraction of an asymptotically stable origin is the set of all initial states from which the solution starting at time t = 0 approaches the origin as t tends to ∞. When the region of attraction is the whole space, we say that the origin is globally asymptotically stable. A stronger form of asymptotic stability arises when there exist positive constants c, k, and λ such that the solutions of Eq. (1) satisfy the inequality

$$\|x(t)\| \le k\,\|x(0)\|\,e^{-\lambda t}, \quad \forall\, t \ge 0 \qquad (2)$$

for all ‖x(0)‖ < c. In this case, the equilibrium point x = 0 is said to be exponentially stable. It is said to be globally exponentially stable if the inequality is satisfied for any initial state x(0).
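To see the exponential bound (2) concretely, consider an illustrative linear system of our own choosing (not taken from the entry): for a Hurwitz matrix A, every solution obeys ‖x(t)‖ ≤ k‖x(0)‖e^{−λt}, with λ the slowest decay rate and k a constant depending on A.

```python
import numpy as np

# Hurwitz A with eigenvalues -1 and -2; solutions satisfy the bound (2)
# with lambda = 1 (the slowest mode) and a suitable constant k.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
evals, V = np.linalg.eig(A)
Vinv = np.linalg.inv(V)

def x_at(t, x0):
    # x(t) = V diag(e^{lambda_i t}) V^{-1} x0  (A is diagonalizable here)
    return np.real(V @ np.diag(np.exp(evals * t)) @ Vinv @ x0)

k, lam = 7.0, 1.0                 # k = 7 suffices for this particular A
rng = np.random.default_rng(3)
for x0 in rng.normal(size=(20, 2)):
    for t in np.linspace(0.0, 10.0, 50):
        bound = k * np.linalg.norm(x0) * np.exp(-lam * t) + 1e-12
        assert np.linalg.norm(x_at(t, x0)) <= bound
```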
This suggests that in a small neighborhood of the origin we can approximate the nonlinear system ẋ = f(x) by its linearization about the origin ẋ = Ax. Indeed, we can draw conclusions about the stability of the origin as an equilibrium point for the nonlinear system by examining the eigenvalues of A. The origin of Eq. (1) is exponentially stable if and only if A is Hurwitz. It is unstable if Re[λᵢ] > 0 for one or more of the eigenvalues of A. If Re[λᵢ] ≤ 0 for all i, with Re[λᵢ] = 0 for some i, we cannot draw a conclusion about the stability of the origin of Eq. (1).

Lyapunov's Method

Let V(x) be a continuously differentiable scalar function defined in a domain D ⊂ Rⁿ that contains the origin. The function V(x) is said to be positive definite if V(0) = 0 and V(x) > 0 for x ≠ 0. It is said to be positive semidefinite if V(x) ≥ 0 for all x. A function V(x) is said to be negative definite or negative semidefinite if −V(x) is positive definite or positive semidefinite, respectively. The derivative of V along the trajectories of Eq. (1) is given by

$$\dot{V}(x) = \sum_{i=1}^{n} \frac{\partial V}{\partial x_i}\,\dot{x}_i = \frac{\partial V}{\partial x}\,f(x)$$

where [∂V/∂x] is a row vector whose ith component is ∂V/∂xᵢ.

Lyapunov's stability theorem states that the origin is stable if there is a continuously differentiable positive definite function V(x) so that V̇(x) is negative semidefinite, and it is asymptotically stable if V̇(x) is negative definite. A function V(x) satisfying the conditions for stability is called a Lyapunov function. The surface V(x) = c, for some c > 0, is called a Lyapunov surface or a level surface.

When V̇(x) is only negative semidefinite, we may still conclude asymptotic stability of the origin if we can show that no solution can stay identically in the set {V̇(x) = 0}, other than the zero solution x(t) ≡ 0. Under this condition, V(x(t)) must decrease toward 0, and consequently x(t) converges to zero as t tends to infinity. This extension of the basic theorem is known as the invariance principle.

Lyapunov functions can be used to estimate the region of attraction of an asymptotically stable origin, that is, to find sets contained in the region of attraction. Let V(x) be a Lyapunov function that satisfies the conditions of asymptotic stability over a domain D. For a positive constant c, let Ω_c be the component of {V(x) ≤ c} that contains the origin in its interior. The properties of V guarantee that, by choosing c small enough, Ω_c will be bounded and contained in D. Then, every trajectory starting in Ω_c remains in Ω_c and approaches the origin as t → ∞. Thus, Ω_c is an estimate of the region of attraction. If D = Rⁿ and V(x) is radially unbounded, that is, ‖x‖ → ∞ implies that V(x) → ∞, then any point x ∈ Rⁿ can be included in a bounded set Ω_c by choosing c large enough. Therefore, the origin is globally asymptotically stable if there is a continuously differentiable, radially unbounded function V(x) such that, for all x ∈ Rⁿ, V(x) is positive definite and V̇(x) is either negative definite or negative semidefinite but no solution can stay identically in the set {V̇(x) = 0} other than the zero solution x(t) ≡ 0.

Time-Varying Systems

Equation (1) is time-invariant because f does not depend on t. The more general time-varying system is represented by

$$\dot{x} = f(t, x) \qquad (4)$$

In this case, we may allow the Lyapunov function candidate V to depend on t. Let V(t, x) be a continuously differentiable function defined for all t ≥ 0 and x ∈ D. The derivative of V along the trajectories of Eq. (4) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x)$$

If there are positive definite functions W₁(x), W₂(x), and W₃(x) such that

$$W_1(x) \le V(t, x) \le W_2(x) \qquad (5)$$
$$\dot{V}(t, x) \le -W_3(x) \qquad (6)$$
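The sublevel-set estimate Ω_c just described can be illustrated on a simple system of our own choosing (not taken from the entry): a quadratic V certifies a disk-shaped estimate of the region of attraction, and a simulated trajectory starting on the boundary of Ω_c converges to the origin.

```python
import numpy as np

# Illustrative system:  x1' = -x1 + x2^2,  x2' = -x2.
# With V(x) = 0.5*(x1^2 + x2^2), Vdot = -x1^2 - x2^2 + x1*x2^2, which is
# negative definite on D = {||x|| < 1}.  Hence any sublevel set
# Omega_c = {V(x) <= c} with c < 0.5 lies in D and estimates the
# region of attraction.
def f(x):
    return np.array([-x[0] + x[1] ** 2, -x[1]])

def V(x):
    return 0.5 * (x[0] ** 2 + x[1] ** 2)

def Vdot(x):
    return -x[0] ** 2 - x[1] ** 2 + x[0] * x[1] ** 2

c = 0.45
rng = np.random.default_rng(1)
pts = rng.uniform(-1, 1, size=(5000, 2))
inside = [p for p in pts if 1e-3 < V(p) <= c]
assert all(Vdot(p) < 0 for p in inside)   # Vdot < 0 on Omega_c away from 0

# A trajectory starting on the boundary of Omega_c converges to the origin.
x = np.array([0.9, 0.3])                  # V(x) = 0.45, on the boundary
for _ in range(20000):                    # forward Euler, dt = 1e-3, T = 20
    x = x + 1e-3 * f(x)
assert np.linalg.norm(x) < 1e-3
```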
688 Lyapunov’s Stability Theory
for all t ≥ 0 and all x ∈ D, then the origin is uniformly asymptotically stable, where "uniformly" indicates that the ε–δ definition of stability and the convergence of x(t) to zero are independent of the initial time t₀. Such a uniformity annotation is not needed with time-invariant systems, since the solution of a time-invariant state equation starting at time t₀ depends only on the difference t − t₀, which is not the case for time-varying systems. If the inequalities of Eqs. (5) and (6) hold globally and W₁(x) is radially unbounded, then the origin is globally uniformly asymptotically stable. If W₁(x) = k₁‖x‖ᵃ, W₂(x) = k₂‖x‖ᵃ, and W₃(x) = k₃‖x‖ᵃ for some positive constants k₁, k₂, k₃, and a, then the origin is exponentially stable.

Perturbed Systems

Consider the system

$$\dot{x} = f(t, x) + g(t, x) \qquad (7)$$

where the origin of the nominal system ẋ = f(t, x) is exponentially stable, with a Lyapunov function V(t, x) that satisfies

$$c_1 \|x\|^2 \le V(t, x) \le c_2 \|x\|^2 \qquad (8)$$
$$\frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x) \le -c_3 \|x\|^2 \qquad (9)$$
$$\left\| \frac{\partial V}{\partial x} \right\| \le c_4 \|x\| \qquad (10)$$

for all x ∈ D, for some positive constants c₁, c₂, c₃, and c₄. Suppose the perturbation term g(t, x) satisfies the linear growth bound

$$\|g(t, x)\| \le \gamma \|x\|, \quad \forall\, t \ge 0,\ \forall\, x \in D \qquad (11)$$

where γ is a nonnegative constant. We use V as a Lyapunov function candidate to investigate the stability of the origin as an equilibrium point for the perturbed system. The derivative of V along the trajectories of Eq. (7) is given by

$$\dot{V}(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x}\,f(t, x) + \frac{\partial V}{\partial x}\,g(t, x)$$
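Combining (9)–(11) yields V̇ ≤ −(c₃ − γc₄)‖x‖², so exponential stability is preserved whenever γ < c₃/c₄ (see Khalil 2002). A minimal numerical sketch with an illustrative scalar system of our own choosing:

```python
import math

# Nominal system x' = -x with V(x) = x^2: c1 = c2 = 1, c3 = 2, c4 = 2,
# so the bound requires gamma < c3/c4 = 1.  Take the perturbation
# g(t, x) = 0.5 * x * sin(t), i.e., gamma = 0.5.
def xdot(t, x):
    return -x + 0.5 * x * math.sin(t)

dt, x, t = 1e-3, 1.0, 0.0
for _ in range(10000):            # forward Euler up to t = 10
    x += dt * xdot(t, x)
    t += dt

# Exponential decay survives the perturbation; in fact, in closed form
# x(t) = x(0) * exp(-t + 0.5 * (1 - cos t)), so |x(10)| ~ 1e-4.
assert abs(x) < 0.01
```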
there is no systematic procedure for finding Lyapunov functions. Second, the conditions of the theory are only sufficient; they are not necessary. Failure of a Lyapunov function candidate to satisfy the conditions for stability or asymptotic stability does not mean that the origin is not stable or asymptotically stable. These drawbacks have been mitigated by a long history of using the method in the analysis and design of engineering systems, where various techniques for finding Lyapunov functions for specific systems have been determined.

Cross-References

Feedback Stabilization of Nonlinear Systems
Input-to-State Stability
Regulation and Tracking of Nonlinear Systems

Recommended Reading

For an introduction to Lyapunov's stability theory at the level of first-year graduate students, the textbooks Khalil (2002), Sastry (1999), Slotine and Li (1991), and Vidyasagar (2002) are recommended. The books by Bacciotti and Rosier (2005) and Haddad and Chellaboina (2008) cover a wider set of topics at the same introductory level. A deeper look into the theory is provided in the monographs Hahn (1967), Krasovskii (1963), Rouche et al. (1977), Yoshizawa (1966), and Zubov (1964). Lyapunov's theory for discrete-time systems is presented in Haddad and Chellaboina (2008) and Qu (1998). The monograph Michel and Wang (1995) presents Lyapunov's stability theory for general dynamical systems, including functional and partial differential equations.

Bibliography

Bacciotti A, Rosier L (2005) Liapunov functions and stability in control theory, 2nd edn. Springer, Berlin
Haddad WM, Chellaboina V (2008) Nonlinear dynamical systems and control. Princeton University Press, Princeton
Hahn W (1967) Stability of motion. Springer, New York
Khalil HK (2002) Nonlinear systems, 3rd edn. Prentice Hall, Upper Saddle River
Krasovskii NN (1963) Stability of motion. Stanford University Press, Stanford
Michel AN, Wang K (1995) Qualitative theory of dynamical systems. Marcel Dekker, New York
Qu Z (1998) Robust control of nonlinear uncertain systems. Wiley-Interscience, New York
Rouche N, Habets P, Laloy M (1977) Stability theory by Lyapunov's direct method. Springer, New York
Sastry S (1999) Nonlinear systems: analysis, stability, and control. Springer, New York
Slotine J-JE, Li W (1991) Applied nonlinear control. Prentice-Hall, Englewood Cliffs
Vidyasagar M (2002) Nonlinear systems analysis, classic edn. SIAM, Philadelphia
Yoshizawa T (1966) Stability theory by Liapunov's second method. The Mathematical Society of Japan, Tokyo
Zubov VI (1964) Methods of A. M. Lyapunov and their applications. P. Noordhoff Ltd., Groningen
Markov Chains and Ranking Problems in Web Search

Hideaki Ishii¹ and Roberto Tempo²
¹Tokyo Institute of Technology, Yokohama, Japan
²CNR-IEIIT, Politecnico di Torino, Torino, Italy

Abstract

Markov chains refer to stochastic processes whose states change according to transition probabilities determined only by the states of the previous time step. They have been crucial for modeling large-scale systems with random behavior in various fields such as control, communications, biology, optimization, and economics. In this entry, we focus on their recent application to the area of search engines, namely, the PageRank algorithm employed at Google, which provides a measure of importance for each page in the web. We present several research studies carried out with control theoretic tools such as aggregation, distributed randomized algorithms, and PageRank optimization. Due to the large size of the web, computational issues are the underlying motivation of these studies.

Keywords

Aggregation; Distributed randomized algorithms; Markov chains; Optimization; PageRank; Search engines; World wide web

Introduction

For various real-world large-scale dynamical systems, reasonable models describing highly complex behaviors can be expressed as stochastic systems, and one of the most well-studied classes of such systems is that of Markov chains. A characteristic feature of Markov chains is that their behavior does not carry any memory. That is, the current state of a chain is completely determined by the state of the previous time step and not at all by the states prior to that step (Kumar and Varaiya 1986; Norris 1997).

Recently, Markov chains have gained renewed interest due to the extremely successful applications in the area of web search. The search engine of Google has been employing an algorithm known as PageRank to assist the ranking of search results. This algorithm models the network of web pages as a Markov chain whose states represent the pages that web surfers with various interests visit in a random fashion. The objective is to find an order among the pages according to their popularity and importance, and this is done by focusing on the structure of hyperlinks among pages.

In this entry, we first provide a brief overview on the basics of Markov chains and then introduce the problem of PageRank computation. We proceed to provide further discussions on control theoretic approaches dealing with PageRank problems. The topics covered include aggregated Markov chains, distributed randomized algorithms, and Markov decision problems for link optimization.

The transition probabilities $p_{ij}$ of the chain, arranged in the matrix P, sum up to 1 along each column, i.e., $\sum_{i=1}^{n} p_{ij} = 1$ for j ∈ X. In this respect, the matrix P is (column) stochastic (Horn and Johnson 1985).

In this entry, we assume that the Markov chain is ergodic, meaning that for any pair of states, the chain can make a transition from one to the other over time. In this case, the chain and the matrix P are called irreducible. This property is known to imply that P has a simple eigenvalue of 1. Thus, there exists a unique steady-state probability distribution π ∈ Rⁿ given by

$$\pi = P\pi, \quad \mathbf{1}^T \pi = 1, \quad \pi_i > 0,\ \forall i \in X,$$

where $\mathbf{1}$ denotes the all-ones vector.
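The defining properties above can be checked numerically: for a small column-stochastic, irreducible P (an arbitrary example matrix of our choosing), the steady-state distribution is the normalized eigenvector of P for the eigenvalue 1.

```python
import numpy as np

# A small column-stochastic transition matrix P (columns sum to 1);
# entry P[i, j] is the probability of moving from state j to state i.
P = np.array([
    [0.1, 0.5, 0.3],
    [0.6, 0.2, 0.3],
    [0.3, 0.3, 0.4],
])
assert np.allclose(P.sum(axis=0), 1.0)

# The steady-state distribution pi is the eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P)
pi = np.real(v[:, np.argmin(np.abs(w - 1.0))])
pi = pi / pi.sum()

assert np.allclose(P @ pi, pi)      # pi = P pi
assert np.all(pi > 0)               # positive (P is irreducible: all > 0)
assert np.isclose(pi.sum(), 1.0)    # a probability distribution
```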
will be repeated. The PageRank value for a page represents the probability of the surfer visiting the page. It is thus higher for pages visited more often by the surfer.

It is now clear that PageRank is obtained by describing the random surfer model as a Markov chain and then finding its stationary distribution. First, we express the network of web pages as the directed graph G = (V, E), where V = {1, 2, …, n} is the set of nodes corresponding to web page indices, while E ⊂ V × V is the set of edges for links among pages. Node i is connected to node j by an edge, i.e., (i, j) ∈ E, if page i has an outgoing link to page j.

Let $x_i(k)$ be the distribution of the random surfer visiting page i at time k, and let x(k) be the vector containing all $x_i(k)$. Given the initial distribution x(0), which is a probability vector, i.e., $\sum_{i=1}^{n} x_i(0) = 1$, the evolution of x(k) can be expressed as

$$x(k+1) = A\,x(k). \qquad (1)$$

The link matrix $A = (a_{ij}) \in \mathbb{R}^{n \times n}$ is given by $a_{ij} = 1/n_j$ if (j, i) ∈ E and 0 otherwise, where $n_j$ is the number of outgoing links of page j. Note that this matrix A is the transition probability matrix of the random surfer. Clearly, it is stochastic, and thus x(k) remains a probability vector, so that $\sum_{i=1}^{n} x_i(k) = 1$ for all k.

As mentioned above, PageRank is the stationary distribution of the process (1) under the assumption that the limit exists. Hence, the PageRank vector is given by $x^* := \lim_{k \to \infty} x(k)$. In other words, it is the solution of the linear equation

$$x^* = A x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1. \qquad (2)$$

Notice that the PageRank vector x* is a nonnegative unit eigenvector for the eigenvalue 1 of A. Such a vector exists since the matrix A is stochastic, but it may not be unique; the reason is that A is a reducible matrix, since in the web not every pair of pages can be connected by simply following links. To resolve this issue, a slight modification is necessary in the random surfer model.

The idea of the teleportation model is that the random surfer, after a while, becomes bored and stops following the hyperlinks. At such an instant, the surfer "jumps" to another page not directly connected to the one currently visited. This page can in fact be completely unrelated in domain and/or content. All n pages in the web have the same probability 1/n of being reached by a jump.

The probability of making such a jump is denoted by m ∈ (0, 1). The original transition probability matrix A is now replaced with the modified one $M \in \mathbb{R}^{n \times n}$ defined by

$$M := (1 - m)A + \frac{m}{n}\mathbf{1}\mathbf{1}^T. \qquad (3)$$

For the value of m, we take m = 0.15 as reported in the original algorithm in Brin and Page (1998). Notice that M is a positive stochastic matrix. By Perron's theorem (Horn and Johnson 1985), the eigenvalue 1 is of multiplicity 1 and is the unique eigenvalue with the maximum modulus. Further, the corresponding eigenvector is positive. Hence, we redefine the vector x* in (2) by using M instead of A as follows:

$$x^* = M x^*, \quad x^* \in [0, 1]^n, \quad \mathbf{1}^T x^* = 1.$$

Due to the large dimension of the link matrix M, the computation of x* is difficult. The solution employed in practice is based on the power method given by

$$x(k+1) = M x(k) = (1 - m)A\,x(k) + \frac{m}{n}\mathbf{1}, \qquad (4)$$

where the initial vector x(0) ∈ Rⁿ is a probability vector. The second equality above follows from the fact that $\mathbf{1}^T x(k) = 1$ for k ∈ Z₊. For implementation, the form on the far right-hand side is important, using only the sparse matrix A and not the dense matrix M. This method asymptotically finds the value vector as x(k) → x*, k → ∞.
obtain groups for a given web. Once the grouping is settled, an aggregation-based algorithm can be applied, which computes an approximated PageRank vector. A characteristic feature is the tradeoff between the accuracy in PageRank computation and the node parameter δ. More accurate computation requires a larger number of groups and thus a smaller δ.

Distributed Randomized Computation

For large-scale computation, distributed algorithms can be effective by employing multiple processors to compute in parallel. There are several methods of constructing algorithms to find stationary distributions of large Markov chains. In this section, motivated by the current literature on multi-agent systems, sequential distributed randomized approaches of gossip type are described for the PageRank problem.

In gossip-type distributed algorithms, nodes make decisions and transmit information to their neighbors in a random fashion. That is, at any time instant, each node decides whether to communicate or not depending on a random variable. The random property is important to make the communication asynchronous, so that simultaneous transmissions resulting in collisions can be avoided. Moreover, there is no need of any centralized decision maker or fixed order among pages.

More precisely, each page i ∈ V is equipped with a random process θᵢ(k) ∈ {0, 1} for k ∈ Z₊. If at time k, θᵢ(k) is equal to 1, then page i broadcasts its information to its neighboring pages connected by outgoing links. All pages involved at this time renew their values based on the latest available data. Here, θᵢ(k) is assumed to be an independent and identically distributed (i.i.d.) random process, and its probability distribution is given by Prob{θᵢ(k) = 1} = α, k ∈ Z₊. Hence, all pages are given the same probability α to initiate an update.

One of the proposed randomized approaches is based on the so-called asynchronous iteration algorithms for distributed computation of fixed points in the field of numerical analysis (Bertsekas and Tsitsiklis 1989). The distributed update recursion is given as

$$\bar{x}(k+1) = \bar{M}_{\theta_1(k), \ldots, \theta_n(k)}\,\bar{x}(k), \qquad (6)$$

where the initial state x̄(0) is a probability vector and the distributed link matrices $\bar{M}_{p_1, \ldots, p_n}$ are given as follows: the (i, j)th entry is equal to $(1 - m)a_{ij} + m/n$ if pᵢ = 1; 1 if pᵢ = 0 and i = j; and 0 otherwise. Clearly, these matrices keep the rows of the original link matrix M in (3) for pages initiating updates. Other pages just keep their previous values. Thus, these matrices are not stochastic. From this update recursion, the PageRank x* is probabilistically obtained (in the mean square sense and with probability one), where the convergence speed is exponential in time k. Note that in this scheme (6), due to the way the distributed link matrices are constructed, each page needs to know which pages have links pointing towards it. This implies that popular pages linked by a number of pages must have extra memory to keep the data of such links.

Another recently developed approach, Ishii and Tempo (2010) and Zhao et al. (2013), has several notable differences from the asynchronous iteration approach above. First, the pages need to transmit their states only over their outgoing links; the information of such links is by default available locally, and thus pages are not required to have the extra memory regarding incoming links. Second, it employs stochastic matrices in the update, as in the centralized scheme; this aspect is utilized in the convergence analysis. As a consequence, it is established that the PageRank vector x* is computed in a probabilistic sense through the time average of the states x(0), …, x(k) given by

$$y(k) := \frac{1}{k+1} \sum_{\ell=0}^{k} x(\ell).$$

The convergence speed in this case is of order 1/k.

PageRank Optimization via Hyperlink Designs

For owners of websites, it is of particular interest to raise the PageRank values of their web pages. Especially in the area of e-business, this can
be critical for increasing the number of visitors to their sites. The values of PageRank can be affected by changing the structure of hyperlinks in the owned pages. Based on the random surfer model, intuitively, it makes sense to arrange the links so that surfers will stay within the domain of the owners as long as possible.

PageRank optimization problems have rigorously been considered in, for example, de Kerchove et al. (2008) and Fercoq et al. (2013). In general, these are combinatorial optimization problems, since they deal with the issue of where to place hyperlinks, and thus the computation for solving them can be prohibitive, especially when the web data is large. However, the work Fercoq et al. (2013) has shown that the problem can be solved in polynomial time. In what follows, we discuss a simplified discrete version of the problem setup of this work.

Consider a subset V₀ ⊂ V of web pages over which a webmaster has control. The objective is to maximize the total PageRank of the pages in this set V₀ by finding the outgoing links from these pages. Each page i ∈ V₀ may have constraints, such as links that must be placed within the page and those that cannot be allowed. All other links, i.e., those that one can decide to have or not, are the design parameters. Hence, the PageRank optimization problem can be stated as

$$\max \left\{ U(x^*, M) : x^* = M x^*,\ x^* \in [0, 1]^n,\ \mathbf{1}^T x^* = 1,\ M \in \mathcal{M} \right\},$$

where U is the utility function $U(x^*, M) := \sum_{i \in V_0} x^*_i$ and $\mathcal{M}$ represents the set of admissible link matrices in accordance with the constraints introduced above.

In Fercoq et al. (2013), an extended continuous problem is also studied, where the set $\mathcal{M}$ of link matrices is a polytope of stochastic matrices and a more general utility function is employed. The motivation for such a problem comes from having weighted links, so that webmasters can determine which links should be placed in a more visible location inside their pages to increase clickings on those hyperlinks. Both discrete and continuous problems are shown to be solvable in polynomial time by modeling them as constrained Markov decision processes with ergodic rewards (see, e.g., Puterman 1994).

Summary and Future Directions

Markov chains form one of the simplest classes of stochastic processes but have been found powerful in their capability to model large-scale complex systems. In this entry, we introduced them mainly from the viewpoint of PageRank algorithms in the area of search engines and with a particular emphasis on recent works carried out based on control theoretic tools. Computational issues will remain in this area as major challenges, and further studies will be needed. As we have observed in PageRank-related problems, it is important to pay careful attention to the structures of the particular problems.

Cross-References

Randomized Methods for Control of Uncertain Systems

Bibliography

Aldhaheri R, Khalil H (1991) Aggregation of the policy iteration method for nearly completely decomposable Markov chains. IEEE Trans Autom Control 36:178–187
Bertsekas D, Tsitsiklis J (1989) Parallel and distributed computation: numerical methods. Prentice-Hall, Englewood Cliffs
Brin S, Page L (1998) The anatomy of a large-scale hypertextual web search engine. Comput Netw ISDN Syst 30:107–117
de Kerchove C, Ninove L, Van Dooren P (2008) Influence of the outlinks of a page on its PageRank. Linear Algebra Appl 429:1254–1276
Fercoq O, Akian M, Bouhtou M, Gaubert S (2013) Ergodic control and polyhedral approaches to PageRank optimization. IEEE Trans Autom Control 58:134–148
Horn R, Johnson C (1985) Matrix analysis. Cambridge University Press, Cambridge
Ishii H, Tempo R (2010) Distributed randomized algorithms for the PageRank computation. IEEE Trans Autom Control 55:1987–2002
Ishii H, Tempo R, Bai EW (2012) A web aggregation approach for distributed randomized PageRank algorithms. IEEE Trans Autom Control 57:2703–2717
Kumar P, Varaiya P (1986) Stochastic systems: estimation, identification, and adaptive control. Prentice Hall, Englewood Cliffs
Langville A, Meyer C (2006) Google's PageRank and beyond: the science of search engine rankings. Princeton University Press, Princeton
Meyer C (1989) Stochastic complementation, uncoupling Markov chains, and the theory of nearly reducible systems. SIAM Rev 31:240–272
Norris J (1997) Markov chains. Cambridge University Press, Cambridge
Phillips R, Kokotovic P (1981) A singular perturbation approach to modeling and control of Markov chains. IEEE Trans Autom Control 26:1087–1094
Puterman M (1994) Markov decision processes: discrete stochastic dynamic programming. Wiley, New York
Zhao W, Chen H, Fang H (2013) Convergence of distributed randomized PageRank algorithms. IEEE Trans Autom Control 58:3255–3259

Mathematical Models of Marine Vehicle-Manipulator Systems

Gianluca Antonelli
University of Cassino and Southern Lazio, Cassino, Italy

Abstract

Marine intervention requires the use of manipulators mounted on support vehicles. Such systems, defined as vehicle-manipulator systems, exhibit specific mathematical properties and require proper control design methodologies. This article briefly discusses the mathematical model within a control perspective, as well as sensing and actuation peculiarities.

Keywords

Floating-base manipulators; Marine robotics; Underwater intervention; Underwater robotics

Introduction

In case of marine operations that require interaction with the environment, the underwater vehicle is usually equipped with one or more manipulators; such systems are defined underwater vehicle-manipulator systems (UVMSs). A UVMS holding six degree-of-freedom (DOF) manipulators is illustrated in Fig. 1. The vehicle carrying the manipulator may or may not be connected to the surface; in the first case we face a so-called remotely operated vehicle (ROV), while in the latter an autonomous underwater vehicle (AUV). ROVs, being physically linked via the tether to an operator that can be on a submarine or on a surface ship, receive power as well as control commands. AUVs, on the other hand, are supposed to be completely autonomous, thus relying on onboard power systems and intelligence.

Remotely controlled UVMSs represent the state of the art in underwater manipulation, while autonomous or semiautonomous UVMSs are still in their embryonic stage. All over the world, a few experimental setups have been developed within dedicated projects; see, e.g., the European project Trident (2012).

Sensory System

Any manipulation task requires that some variables be measured; these may concern the internal state of the system, such as the end effector as well as the vehicle position, orientation, or velocities. Others concern the surrounding environment, as in the case of vision systems or range measurements. Underwater sensing is characterized by poorer performance with respect to the corresponding ground variables, due to the physical properties of water as the medium carrying the electromagnetic or acoustic signals. One of the major challenges in underwater robotics is localization, due to the absence of a single proprioceptive sensor that measures the vehicle position and the impossibility of using the Global Navigation Satellite System (GNSS) under the water. The use of redundant multisensor systems is thus common, in order to perform sensor fusion and give fault detection and tolerance capabilities to the vehicle.
Mathematical Models of Marine Vehicle-Manipulator Systems, Fig. 1 Sketch of a UVMS; the inertial frame as well as the frames attached to all the rigid bodies are highlighted
differential kinematic algorithms. In particular, algorithms for redundant systems need to be considered, since a UVMS always possesses at least six DOFs. Moving a UVMS requires handling additional variables with respect to the end effector, such as the vehicle roll and pitch to preserve energy, the robot manipulability, the mechanical joint limits, or eventual directional sensing.
• Motion control. Low-level control algorithms are designed to allow the system to track the desired trajectory. UVMSs are characterized by different dynamics between vehicle and manipulator, uncertainty in the model parameter knowledge, poor sensing performance, and limit cycles in the thruster model. On the other hand, the limited bandwidth of the closed-loop system allows the use of simple control approaches.
• Interaction control. Several applications require exchange of forces with the environment. A pure motion control algorithm is not devised for such operation, and specific force control algorithms, both direct and indirect, may be necessary. Master/slave systems or haptic devices may be used for the purpose, while autonomous interaction control is still in the research phase for the marine environment.

Summary and Future Directions

This article is aimed at giving a short overview of the main mathematical and technological challenges arising with UVMSs. All the components of an underwater mission (perception, actuation, and communication with the surface) are characterized by poorer performances with respect to current industrial or advanced robotics applications.

The underwater environment is hostile; as an example, the marine current provides disturbances to be counteracted by the dynamic controller, or the sand's whirlwinds obfuscate the vision-based operations close to the sea bottom. Both teleoperated and autonomous underwater missions require a significant human effort in planning, testing, and monitoring all the operations. Fault detection and recovery policies are necessary in each step to avoid loss of expensive hardware. Future generations of UVMSs need to be autonomous, to perceive and contextualize the environment, to react to unplanned situations, and to safely reschedule the tasks of complex missions. These characteristics are shared by all the branches of service robotics.

Cross-References

Advanced Manipulation for Underwater Sampling
Mathematical Models of Ships and Underwater Vehicles
Redundant Robots

Recommended Reading

The book of Fossen (1994) is one of the first books dedicated to control problems of marine systems, both underwater and surface. The same author presents, in Fossen (2002), an updated and extended version of the topics developed in the first book and, in Fossen (2011), a handbook on marine craft hydrodynamics and control. A short introductory chapter on marine robotics may be found in Antonelli et al. (2008). Robotics fundamentals are also useful and can be found in Siciliano et al. (2009). To the best of our knowledge, Antonelli (2014) is the only monograph devoted to addressing the specific problems of underwater vehicle-manipulator systems.

Bibliography

Antonelli G (2014) Underwater robots: motion and force control of vehicle-manipulator systems. Springer tracts in advanced robotics, 3rd edn. Springer, Heidelberg
Antonelli G, Fossen T, Yoerger D (2008) Underwater robotics. In: Siciliano B, Khatib O (eds) Springer handbook of robotics. Springer, Heidelberg, pp 987–1008
Fossen T (1994) Guidance and control of ocean vehicles. Wiley, Chichester
Mathematical Models of Ships and Underwater Vehicles 701
Fossen T (2002) Marine control systems: guidance, navigation and control of ships, rigs and underwater vehicles. Marine Cybernetics, Trondheim
Fossen T (2011) Handbook of marine craft hydrodynamics and motion control. Wiley, Chichester
Siciliano B, Sciavicco L, Villani L, Oriolo G (2009) Robotics: modelling, planning and control. Springer, London
TRIDENT (2012) Marine robots and dexterous manipulation for enabling autonomous underwater multipurpose intervention missions. http://www.irs.uji.es/trident

Mathematical Models of Ships and Underwater Vehicles

Thor I. Fossen
Department of Engineering Cybernetics, Centre for Autonomous Marine Operations and Systems, Norwegian University of Science and Technology, Trondheim, Norway

Abstract

we mean "any large floating vessel capable of crossing open waters," as opposed to a boat, which is generally a smaller craft. An underwater vehicle is a "small vehicle that is capable of propelling itself beneath the water surface as well as on the water's surface." This includes unmanned underwater vehicles (UUVs), remotely operated vehicles (ROVs), autonomous underwater vehicles (AUVs), and underwater robotic vehicles (URVs).

This entry is based on Fossen (1991, 2011), which contains a large number of standard models for ships, rigs, and underwater vehicles. There exist a large number of textbooks on mathematical modeling of ships; see Rawson and Tupper (1994), Lewandowski (2004), and Perez (2005). For underwater vehicles, see Allmendinger (1990), Sutton and Roberts (2006), Inzartsev (2009), Antonelli (2010), and Wadoo and Kachroo (2010). Some useful references on ship hydrodynamics are Newman (1977), Faltinsen (1990), and Bertram (2012).
where the matrices $M_A$ and $C_A(\nu)$ represent hydrodynamic added mass due to the acceleration $\dot\nu$ and Coriolis acceleration due to the rotation of $\{b\}$ about the geographical frame $\{n\}$. The potential and viscous damping terms are lumped together into a nonlinear matrix $D(\nu)$, while $g(\eta)$ is a vector of generalized restoring forces. The control inputs are generalized forces given by $\tau$.

Formulae (8) and (9) together with (4) are the fundamental equations when deriving the ship and underwater vehicle models. This is the topic of the next sections.

Ship Model

The ship equations of motion are usually represented in three DOFs by neglecting heave, roll, and pitch. Combining (4), (8), and (9) we get:

$$\dot\eta = R(\psi)\nu \tag{10}$$

$$M\dot\nu + C(\nu)\nu + D(\nu)\nu = \tau + \tau_{\mathrm{wind}} + \tau_{\mathrm{wave}} \tag{11}$$

where $\eta := [x,\, y,\, \psi]^\top$, $\nu := [u,\, v,\, r]^\top$, and

$$R(\psi) = \begin{bmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix} \tag{12}$$

is the rotation matrix in yaw. It is assumed that the wind- and wave-induced forces $\tau_{\mathrm{wind}}$ and $\tau_{\mathrm{wave}}$ can be linearly superpositioned. The system matrices $M = M_{RB} + M_A$ and $C(\nu) = C_{RB}(\nu) + C_A(\nu)$ are usually derived under the assumption of port-starboard symmetry and that surge can be decoupled from sway and yaw (Fossen 2011). Moreover,

$$M = \begin{bmatrix} m - X_{\dot u} & 0 & 0 \\ 0 & m - Y_{\dot v} & m x_g - Y_{\dot r} \\ 0 & m x_g - N_{\dot v} & I_z - N_{\dot r} \end{bmatrix} \tag{13}$$

$$C_{RB}(\nu) = \begin{bmatrix} 0 & -mr & -m x_g r \\ mr & 0 & 0 \\ m x_g r & 0 & 0 \end{bmatrix} \tag{14}$$

$$C_{A}(\nu) = \begin{bmatrix} 0 & 0 & Y_{\dot v} v + Y_{\dot r} r \\ 0 & 0 & -X_{\dot u} u \\ -Y_{\dot v} v - Y_{\dot r} r & X_{\dot u} u & 0 \end{bmatrix} \tag{15}$$

Hydrodynamic damping will, in its simplest form, be linear:

$$D = -\begin{bmatrix} X_u & 0 & 0 \\ 0 & Y_v & Y_r \\ 0 & N_v & N_r \end{bmatrix} \tag{16}$$

while a nonlinear expression based on second-order modulus functions describing quadratic drag and cross-flow drag is:

$$D(\nu) = -\begin{bmatrix} X_{|u|u}|u| & 0 & 0 \\ 0 & Y_{|v|v}|v| + Y_{|r|v}|r| & Y_{|v|r}|v| + Y_{|r|r}|r| \\ 0 & N_{|v|v}|v| + N_{|r|v}|r| & N_{|v|r}|v| + N_{|r|r}|r| \end{bmatrix} \tag{17}$$

Other nonlinear representations are found in Fossen (1994, 2011).

In the case of irrotational ocean currents, we introduce the relative velocity vector:

$$\nu_r = \nu - \nu_c$$

where $\nu_c = [u_c^b,\, v_c^b,\, 0]^\top$ is a vector of current velocities in $\{b\}$. Hence, the kinetic model takes the form:

$$\underbrace{M_{RB}\dot\nu + C_{RB}(\nu)\nu}_{\text{rigid-body forces}} + \underbrace{M_A\dot\nu_r + C_A(\nu_r)\nu_r + D(\nu_r)\nu_r}_{\text{hydrodynamic forces}} = \tau + \tau_{\mathrm{wind}} + \tau_{\mathrm{wave}}$$
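The 3-DOF maneuvering model (10)-(11) can be simulated directly once numerical values for the rigid-body and hydrodynamic parameters are available. A minimal sketch with purely illustrative (non-physical) coefficients, using forward-Euler integration:

```python
import numpy as np

# Illustrative (non-physical) parameters; by the SNAME sign convention the
# hydrodynamic derivatives below are negative, so -X_u etc. act as damping.
m, x_g, I_z = 1.0, 0.1, 0.5
X_ud, Y_vd, Y_rd, N_vd, N_rd = -0.1, -0.2, -0.05, -0.05, -0.1  # added mass
X_u, Y_v, Y_r, N_v, N_r = -0.5, -0.6, -0.1, -0.1, -0.4         # linear damping

M = np.array([[m - X_ud, 0.0, 0.0],
              [0.0, m - Y_vd, m*x_g - Y_rd],
              [0.0, m*x_g - N_vd, I_z - N_rd]])          # eq. (13)
D = -np.array([[X_u, 0.0, 0.0],
               [0.0, Y_v, Y_r],
               [0.0, N_v, N_r]])                         # eq. (16)

def C(nu):
    # C(nu) = C_RB(nu) + C_A(nu), eqs. (14) and (15)
    u, v, r = nu
    C_RB = np.array([[0.0, -m*r, -m*x_g*r],
                     [m*r, 0.0, 0.0],
                     [m*x_g*r, 0.0, 0.0]])
    C_A = np.array([[0.0, 0.0, Y_vd*v + Y_rd*r],
                    [0.0, 0.0, -X_ud*u],
                    [-(Y_vd*v + Y_rd*r), X_ud*u, 0.0]])
    return C_RB + C_A

def R(psi):
    # rotation matrix in yaw, eq. (12)
    return np.array([[np.cos(psi), -np.sin(psi), 0.0],
                     [np.sin(psi),  np.cos(psi), 0.0],
                     [0.0, 0.0, 1.0]])

def simulate(tau, T=60.0, dt=1e-3):
    # forward-Euler integration of eqs. (10)-(11)
    eta, nu = np.zeros(3), np.zeros(3)
    for _ in range(int(T/dt)):
        nu = nu + dt * np.linalg.solve(M, tau - C(nu) @ nu - D @ nu)
        eta = eta + dt * (R(eta[2]) @ nu)
    return eta, nu

# A pure surge force decouples from sway and yaw, and the surge speed
# settles at tau_u / (-X_u) = 2 for these numbers.
eta, nu = simulate(np.array([1.0, 0.0, 0.0]))
```

For these coefficients a constant surge force leaves $v = r = 0$ identically, which illustrates the surge/sway-yaw decoupling assumption made above.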
This model can be simplified if the ocean currents are assumed to be constant and irrotational in $\{n\}$. According to Fossen (2011, Property 8.1), $M_{RB}\dot\nu + C_{RB}(\nu)\nu \equiv M_{RB}\dot\nu_r + C_{RB}(\nu_r)\nu_r$ if the rigid-body Coriolis and centripetal matrix satisfies $C_{RB}(\nu_r) = C_{RB}(\nu)$. One parametrization satisfying this is (14). Hence, the Coriolis and centripetal matrix satisfies $C(\nu_r) = C_{RB}(\nu_r) + C_A(\nu_r)$, and it follows that:

$$M\dot\nu_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r = \tau + \tau_{\mathrm{wind}} + \tau_{\mathrm{wave}} \tag{18}$$

The kinematic equation (10) can be modified to include the relative velocity $\nu_r$ according to:

$$\dot\eta = R(\psi)\nu_r + [u_c^n,\, v_c^n,\, 0]^\top \tag{19}$$

where the ocean current velocities $u_c^n = \text{constant}$ and $v_c^n = \text{constant}$ in $\{n\}$. Notice that the body-fixed velocities $\nu_c = R(\psi)^\top [u_c^n,\, v_c^n,\, 0]^\top$ will vary with the heading angle $\psi$.

The maneuvering model presented in this entry is intended for controller-observer design, prediction, and computer simulations, as well as system identification and parameter estimation. A large number of application-specific models for marine craft are found in Fossen (2011, Chap. 7).

Hydrodynamic programs compute mass, inertia, potential damping, and restoring forces, while a more detailed treatment of viscous dissipative forces (damping) and sealoads is found in the extensive literature on hydrodynamics – see Faltinsen (1990) and Newman (1977).

Underwater Vehicle Model

The 6-DOF underwater vehicle equations of motion follow from (4), (8), and (9) under the assumption that wave-induced motions can be neglected:

$$\dot\eta = J(\eta)\nu \tag{20}$$

$$M\dot\nu + C(\nu)\nu + D(\nu)\nu + g(\eta) = \tau \tag{21}$$

with generalized position $\eta := [x,\, y,\, z,\, \phi,\, \theta,\, \psi]^\top$ and velocity $\nu := [u,\, v,\, w,\, p,\, q,\, r]^\top$. Assume that the gravitational force acts through the center of gravity (CG) defined by the vector $r_g := [x_g,\, y_g,\, z_g]^\top$ with respect to the coordinate origin $\{b\}$. Similarly, the buoyancy force acts through the center of buoyancy (CB) defined by the vector $r_b := [x_b,\, y_b,\, z_b]^\top$. For most vehicles $y_g = y_b = 0$.

For a port-starboard symmetrical vehicle with homogeneous mass distribution, CG satisfying $y_g = 0$, and products of inertia $I_{xy} = I_{yz} = 0$, the system inertia matrix becomes:

$$M := \begin{bmatrix}
m - X_{\dot u} & 0 & -X_{\dot w} & 0 & m z_g - X_{\dot q} & 0 \\
0 & m - Y_{\dot v} & 0 & -m z_g - Y_{\dot p} & 0 & m x_g - Y_{\dot r} \\
-X_{\dot w} & 0 & m - Z_{\dot w} & 0 & -m x_g - Z_{\dot q} & 0 \\
0 & -m z_g - Y_{\dot p} & 0 & I_x - K_{\dot p} & 0 & -I_{zx} - K_{\dot r} \\
m z_g - X_{\dot q} & 0 & -m x_g - Z_{\dot q} & 0 & I_y - M_{\dot q} & 0 \\
0 & m x_g - Y_{\dot r} & 0 & -I_{zx} - K_{\dot r} & 0 & I_z - N_{\dot r}
\end{bmatrix} \tag{22}$$
$$C_{RB}(\nu) = \begin{bmatrix}
0 & -mr & mq & m z_g r & -m x_g q & -m x_g r \\
mr & 0 & -mp & 0 & m(z_g r + x_g p) & 0 \\
-mq & mp & 0 & -m z_g p & -m z_g q & m x_g p \\
-m z_g r & 0 & m z_g p & 0 & -I_{xz} p + I_z r & -I_y q \\
m x_g q & -m(z_g r + x_g p) & m z_g q & I_{xz} p - I_z r & 0 & -I_{xz} r + I_x p \\
m x_g r & 0 & -m x_g p & I_y q & I_{xz} r - I_x p & 0
\end{bmatrix} \tag{25}$$

Notice that this representation of $C_{RB}(\nu)$ only depends on the angular velocities $p$, $q$, and $r$, and not the linear velocities $u$, $v$, and $w$. This property will be exploited when including drift due to ocean currents.

Linear damping for a port-starboard symmetrical vehicle takes the following form:

$$D = -\begin{bmatrix}
X_u & 0 & X_w & 0 & X_q & 0 \\
0 & Y_v & 0 & Y_p & 0 & Y_r \\
Z_u & 0 & Z_w & 0 & Z_q & 0 \\
0 & K_v & 0 & K_p & 0 & K_r \\
M_u & 0 & M_w & 0 & M_q & 0 \\
0 & N_v & 0 & N_p & 0 & N_r
\end{bmatrix} \tag{26}$$

Let $W = mg$ and $B = \rho g \nabla$ denote the weight and buoyancy, where $m$ is the mass of the vehicle including water in free-flooding spaces, $\nabla$ the volume of fluid displaced by the vehicle, $g$ the acceleration of gravity (positive downward), and $\rho$ the water density. Hence, the generalized restoring forces for a vehicle satisfying $y_g = y_b = 0$ become (Fossen 1994, 2011):

$$g(\eta) = \begin{bmatrix}
(W - B)\sin\theta \\
-(W - B)\cos\theta\sin\phi \\
-(W - B)\cos\theta\cos\phi \\
(z_g W - z_b B)\cos\theta\sin\phi \\
(z_g W - z_b B)\sin\theta + (x_g W - x_b B)\cos\theta\cos\phi \\
-(x_g W - x_b B)\cos\theta\sin\phi
\end{bmatrix} \tag{27}$$

The expression for $D$ can be extended to include nonlinear damping terms if necessary. Quadratic damping is important at higher speeds, since the Coriolis and centripetal terms $C(\nu)\nu$ can destabilize the system if only linear damping is used.

In the presence of irrotational ocean currents, we can rewrite (20) and (21) in terms of the relative velocity $\nu_r = \nu - \nu_c$ according to:

$$\dot\eta = J(\eta)\nu_r + [u_c^n,\, v_c^n,\, w_c^n,\, 0,\, 0,\, 0]^\top \tag{28}$$

$$M\dot\nu_r + C(\nu_r)\nu_r + D(\nu_r)\nu_r + g(\eta) = \tau \tag{29}$$

where it is assumed that $C_{RB}(\nu_r) = C_{RB}(\nu)$, which clearly is satisfied for (25). In addition, it is assumed that $u_c^n$, $v_c^n$, and $w_c^n$ are constant. For more details see Fossen (2011).
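With the inertia matrix (22) and the restoring vector (27) in hand, their basic structural properties can be checked numerically. A small sketch with hypothetical data (all coefficients below are illustrative; SNAME hydrodynamic derivatives are negative):

```python
import numpy as np

# Illustrative rigid-body and added-mass data (hypothetical).
m, x_g, z_g = 30.0, 0.05, 0.02
Ix, Iy, Iz, Izx = 1.0, 4.0, 4.0, 0.1
Xud, Yvd, Zwd = -5.0, -20.0, -20.0
Kpd, Mqd, Nrd = -0.5, -2.0, -2.0
Xwd, Xqd, Ypd, Yrd, Zqd, Krd = -1.0, -0.5, -0.3, -0.8, -0.6, -0.05

# System inertia matrix, eq. (22)
M = np.array([
    [m - Xud, 0, -Xwd, 0, m*z_g - Xqd, 0],
    [0, m - Yvd, 0, -m*z_g - Ypd, 0, m*x_g - Yrd],
    [-Xwd, 0, m - Zwd, 0, -m*x_g - Zqd, 0],
    [0, -m*z_g - Ypd, 0, Ix - Kpd, 0, -Izx - Krd],
    [m*z_g - Xqd, 0, -m*x_g - Zqd, 0, Iy - Mqd, 0],
    [0, m*x_g - Yrd, 0, -Izx - Krd, 0, Iz - Nrd]])
assert np.allclose(M, M.T)                 # symmetric by construction
assert np.all(np.linalg.eigvalsh(M) > 0)   # positive definite for sensible data

def g(eta, W, B, x_b, z_b):
    # generalized restoring forces, eq. (27), valid for y_g = y_b = 0
    phi, theta = eta[3], eta[4]
    c, s = np.cos, np.sin
    return np.array([
        (W - B) * s(theta),
        -(W - B) * c(theta) * s(phi),
        -(W - B) * c(theta) * c(phi),
        (z_g*W - z_b*B) * c(theta) * s(phi),
        (z_g*W - z_b*B) * s(theta) + (x_g*W - x_b*B) * c(theta) * c(phi),
        -(x_g*W - x_b*B) * c(theta) * s(phi)])

# A neutrally buoyant vehicle (W = B) with CB on the same vertical line as
# CG (x_b = x_g) produces no restoring forces at level attitude:
g0 = g(np.zeros(6), W=294.0, B=294.0, x_b=x_g, z_b=-0.01)
```

A small roll angle then produces a nonzero fourth (roll-moment) entry, which is the restoring effect exploited in attitude control.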
706 Mean Field Games
Bibliography

$$dx_i(t) = f[t, x_i(t), u_i(t), \mu_t]\,dt + \sigma\,dw_i(t), \quad 1 \le i \le N,$$
limit of the long-range average is used for ergodic MFG problems in Lasry and Lions (2006a, 2007) and Li and Zhang (2008) and in the analysis of adaptive MFG systems (Kizilkale and Caines 2013).

The Existence of Equilibria

The objective of each agent is to find strategies (i.e., control laws) which are admissible with respect to information and other constraints and which minimize its performance function. The resulting problem is necessarily game theoretic, and consequently central results of the topic concern the existence of Nash equilibria and their properties.

The basic linear-quadratic mean field problem has an explicit solution characterizing a Nash equilibrium (see Huang et al. 2003, 2007). Consider the scalar infinite time horizon discounted case, with nonuniform parameterized agents $A_\theta$ with parameter distribution $F(\theta)$, $\theta \in \mathcal{A}$, and dynamical parameters identified as $a_\theta := F_\theta$, $b_\theta := G_\theta$, $Q := 1$, $r := R$; then the so-called Nash Certainty Equivalence (NCE) equation scheme generating the equilibrium solution takes the form
$$\rho s_\theta = \frac{ds_\theta}{dt} + a_\theta s_\theta - \frac{b_\theta^2}{r}\Pi_\theta s_\theta - x^*,$$

$$\frac{d\bar x_\theta}{dt} = \Big(a_\theta - \frac{b_\theta^2}{r}\Pi_\theta\Big)\bar x_\theta - \frac{b_\theta^2}{r}s_\theta, \quad 0 \le t < \infty,$$

$$\bar x(t) = \int_{\mathcal{A}} \bar x_\theta(t)\,dF(\theta),$$

$$x^*(t) = \gamma(\bar x(t) + \eta),$$

$$\rho\Pi_\theta = 2a_\theta\Pi_\theta - \frac{b_\theta^2}{r}\Pi_\theta^2 + 1, \quad \Pi_\theta > 0, \quad \text{(Riccati equation)}$$
where the control action of the generic parameterized agent $A_\theta$ is given by $u_\theta^0(t) = -\frac{b_\theta}{r}(\Pi_\theta x_\theta(t) + s_\theta(t))$, $0 \le t < \infty$. $u_\theta^0$ is the optimal tracking feedback law with respect to $x^*(t)$, which is an affine function of the mean field term $\bar x(t)$, the mean with respect to the parameter distribution $F$ of the $\theta \in \mathcal{A}$ parameterized state means of the agents. Subject to the conditions for the NCE scheme to have a solution, each agent is necessarily in a Nash equilibrium in all full information causal (i.e., non-anticipative) feedback laws with respect to the remainder of the agents when these are employing the law $u_\theta^0$. It is an important feature of the best response control law $u_\theta^0$ that its form depends only on the parametric data of the entire set of agents, and at any instant it is a feedback function of only the state of the agent $A_\theta$ itself and the deterministic mean field-dependent offset $s_\theta$.

For the general nonlinear case, the MFG equations on $[0, T]$ are given by the linked equations for (i) the performance function $V$ for each agent in the continuum, (ii) the FPK for the MV-SDE for that agent, and (iii) the specification of the best response feedback law depending on the mean field measure $\mu_t$ and the agent's state $x(t)$. In the uniform agents case, these take the following form.

The Mean Field Game HJB-(MV) FPK Equations
$$\text{[MV-HJB]} \quad -\frac{\partial V(t,x)}{\partial t} = \inf_{u \in U}\left\{ f[t, x, u, \mu_t]\,\frac{\partial V(t,x)}{\partial x} + L[t, x, u, \mu_t] \right\} + \frac{\sigma^2}{2}\,\frac{\partial^2 V(t,x)}{\partial x^2},$$

$$V(T, x) = 0, \quad (t, x) \in [0, T] \times \mathbb{R},$$

$$\text{[MV-FPK]} \quad \frac{\partial \mu(t,x)}{\partial t} = -\frac{\partial\{f[t, x, u(t), \mu_t]\,\mu(t,x)\}}{\partial x} + \frac{\sigma^2}{2}\,\frac{\partial^2 \mu(t,x)}{\partial x^2},$$
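For the scalar linear-quadratic problem shown earlier, the stationary (constant-coefficient) version of the NCE scheme can be solved directly: the algebraic Riccati equation gives $\Pi$, and the equilibrium mean field $\bar x$ follows from a contraction fixed point. A minimal sketch with hypothetical uniform-agent parameters:

```python
import math

# Illustrative uniform-agent parameters (all values hypothetical).
a, b, r, rho = 1.0, 1.0, 1.0, 0.5
gamma, eta = 0.6, 1.0
k = b**2 / r

# Positive root of the algebraic Riccati equation
#   rho*Pi = 2*a*Pi - k*Pi**2 + 1,  Pi > 0.
Pi = ((2.0*a - rho) + math.sqrt((2.0*a - rho)**2 + 4.0*k)) / (2.0*k)
assert a - k * Pi < 0.0  # closed-loop drift is stable

# Stationary NCE scheme (ds/dt = dx_bar/dt = 0):
#   rho*s = a*s - k*Pi*s - x_star  =>  s = -x_star / (rho - a + k*Pi)
#   0 = (a - k*Pi)*x_bar - k*s     =>  x_bar = k*s / (a - k*Pi)
#   x_star = gamma*(x_bar + eta)
# Solve the fixed point in x_bar by iteration.
x_bar = 0.0
for _ in range(200):
    x_star = gamma * (x_bar + eta)
    s = -x_star / (rho - a + k * Pi)
    x_bar = k * s / (a - k * Pi)

# Each agent's equilibrium feedback is then u0(x) = -(b/r)*(Pi*x + s).
print(Pi, x_bar)
```

For these numbers the Riccati root is $\Pi = 2$ and the mean field settles at $\bar x = 2/3$; the fixed-point map is a contraction with factor $0.4$, so the iteration converges quickly.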
The general nonlinear MFG problem is approached by different routes in Huang et al. (2006) and Nourian and Caines (2013), and Lasry and Lions (2006a,b, 2007) and Cardaliaguet (2012), respectively. In the former, the so-called probabilistic method solves the MFG equations directly. Subject to technical conditions, an iterated contraction argument establishes the existence of a solution to the HJB-(MV) FPK equations; the best response control laws are obtained from these MFG equations, and these are necessarily Nash equilibria within all causal feedback laws for the infinite population problem. In Lasry and Lions (2006a, 2007) the MFG equations on the infinite time interval (i.e., the ergodic case) are obtained as the limit of Nash equilibria for increasing finite populations, while in the expository notes of Cardaliaguet (2012) the analytic properties of solutions to the HJB-FPK equations on the finite interval are analyzed using PDE methods including the theory of viscosity solutions.

In Huang et al. (2003, 2006, 2007), Nourian and Caines (2013), and Cardaliaguet (2012), it is shown that, subject to technical conditions, the solutions to the HJB-FPK scheme yield ε-Nash solutions for finite population MFGs in that for any ε > 0, there exists a population size N_ε such that for all larger populations the use of the feedback law given by the MFG infinite population scheme gives each agent a value to its performance function within ε of the infinite population Nash value.

A counterintuitive feature of these results is that, asymptotically in population size, observations of the states of rival agents are of no value to any given agent; this is in contrast to the situation in single-agent optimal control theory, where the value of observations on an agent's environment is in general positive.

Current Developments and Open Problems

There is now an extensive literature on Mean Field Games, the following being a sample: the mathematical literature has focused on the study of general classes of solutions to the fundamental HJB-FPK equations (see, e.g., Cardaliaguet (2013)), while in systems and control, the theory of major-minor agent MFG problems (in economics terminology, atoms and continua) is being developed (Huang 2010; Nourian and Caines 2013; Nguyen and Huang 2012), adaptive control extensions of the LQG theory have been carried out (Kizilkale and Caines 2013), and the risk-sensitive case has
been analyzed (Tembine et al. 2012). Much work is now under way in the applications of MFG theory to economics, finance, distributed energy systems, and electrical power markets. Each of these areas has significant open problems, including the application of mathematical transport theory to HJB-FPK equations, the role of MFG theory in portfolio optimization, and the analysis of systems where the presence of partially observed major and minor agent states incurs mean field and agent state estimation.

Cross-References

Dynamic Noncooperative Games
Game Theory: Historical Overview
Stochastic Dynamic Programming
Stochastic Linear-Quadratic Control
Stochastic Maximum Principle

Bibliography

Altman E, Basar T, Srikant R (2002) Nash equilibria for combined flow control and routing in networks: asymptotic behavior for a large number of users. IEEE Trans Autom Control 47(6):917–930. Special issue on Control Issues in Telecommunication Networks
Aumann RJ, Shapley LS (1974) Values of non-atomic games. Princeton University Press, Princeton
Basar T, Ho YC (1974) Informational properties of the Nash solutions of two stochastic nonzero-sum games. J Econ Theory 7:370–387
Basar T, Olsder GJ (1999) Dynamic noncooperative game theory. SIAM, Philadelphia
Bensoussan A, Frehse J (1984) Nonlinear elliptic systems in stochastic game theory. J Reine Angew Math 350:23–67
Bergin J, Bernhardt D (1992) Anonymous sequential games with aggregate uncertainty. J Math Econ 21:543–562
Cardaliaguet P (2012) Notes on mean field games. Collège de France
Cardaliaguet P (2013) Long term average of first order mean field games and weak KAM theory. Dyn Games Appl 3:473–488
Correa JR, Stier-Moses NE (2010) Wardrop equilibria. In: Cochran JJ (ed) Wiley encyclopedia of operations research and management science. John Wiley & Sons, Chichester
Haurie A, Marcotte P (1985) On the relationship between Nash-Cournot and Wardrop equilibria. Networks 15(3):295–308
Ho YC (1980) Team decision theory and information structures. Proc IEEE 68(6):15–22
Huang MY (2010) Large-population LQG games involving a major player: the Nash certainty equivalence principle. SIAM J Control Optim 48(5):3318–3353
Huang MY, Caines PE, Malhamé RP (2003) Individual and mass behaviour in large population stochastic wireless power control problems: centralized and Nash equilibrium solutions. In: IEEE conference on decision and control, Maui, pp 98–103
Huang MY, Malhamé RP, Caines PE (2006) Large population stochastic dynamic games: closed loop McKean-Vlasov systems and the Nash certainty equivalence principle. Commun Inf Syst 6(3):221–252
Huang MY, Caines PE, Malhamé RP (2007) Large population cost-coupled LQG problems with nonuniform agents: individual-mass behaviour and decentralized ε-Nash equilibria. IEEE Trans Autom Control 52(9):1560–1571
Jovanovic B, Rosenthal RW (1988) Anonymous sequential games. J Math Econ 17(1):77–87
Kizilkale AC, Caines PE (2013) Mean field stochastic adaptive control. IEEE Trans Autom Control 58(4):905–920
Lasry JM, Lions PL (2006a) Jeux à champ moyen. I – Le cas stationnaire. Comptes Rendus Math 343(9):619–625
Lasry JM, Lions PL (2006b) Jeux à champ moyen. II – Horizon fini et contrôle optimal. Comptes Rendus Math 343(10):679–684
Lasry JM, Lions PL (2007) Mean field games. Jpn J Math 2:229–260
Li T, Zhang JF (2008) Asymptotically optimal decentralized control for large population stochastic multiagent systems. IEEE Trans Autom Control 53(7):1643–1660
Neyman A (2002) Values of games with infinitely many players. In: Aumann RJ, Hart S (eds) Handbook of game theory, vol 3. North Holland, Amsterdam, pp 2121–2167
Nourian M, Caines PE (2013) ε-Nash mean field games theory for nonlinear stochastic dynamical systems with major and minor agents. SIAM J Control Optim 51(4):3302–3331
Nguyen SL, Huang M (2012) Linear-quadratic-Gaussian mixed games with continuum-parametrized minor players. SIAM J Control Optim 50(5):2907–2937
Tembine H, Zhu Q, Basar T (2012) Risk-sensitive mean field games. arXiv:1210.2806
Wardrop JG (1952) Some theoretical aspects of road traffic research. In: Proceedings of the institute of civil engineers, London, part II, vol 1, pp 325–378
Weintraub GY, Benkard C, Van Roy B (2005) Oblivious equilibrium: a mean field approximation for large-scale dynamic games. In: Advances in neural information processing systems. MIT, Cambridge
712 Mechanism Design
it is typically assumed the designer does not know the utilities of agents at the time the mechanism is chosen – even though agents do know their own utilities at the time the resulting game is actually played. See, e.g., Moore (1992) for an overview of mechanism design with Nash equilibrium as the solution concept.

The Vickrey-Clarke-Groves Mechanisms

In this section, we describe a seminal example of mechanism design at work: the Vickrey-Clarke-Groves (VCG) class of mechanisms. The key insight behind VCG mechanisms is that by structuring payment rules correctly, individuals can be incentivized to truthfully declare their utility functions to the market and in turn achieve an efficient allocation. VCG mechanisms are an example of mechanism design with dominant strategies and with the goal of welfare maximization, i.e., efficiency. The presentation here is based on the material in Chap. 5 of Berry and Johari (2011), and the reader is referred there for further discussion. See also Vickrey (1961), Clarke (1971), and Groves (1973) for the original papers discussing this class of mechanisms.

To illustrate the principle behind VCG mechanisms, consider a simple example where we allocate a single resource of unit capacity among $R$ competing users. Each user's utility is measured in terms of a common currency unit; in particular, if the allocated amount is $x_r$ and the payment to user $r$ is $t_r$, then her utility is $U_r(x_r) + t_r$; we refer to $U_r$ as the valuation function, and let the space of valuation functions be denoted by $\mathcal{U}$. For simplicity we assume the valuation functions are continuous. In line with our discussion of efficiency above, it can be shown that the Pareto efficient allocation is obtained by solving the following:

$$\text{maximize} \quad \sum_{r} U_r(x_r) \tag{1}$$

$$\text{subject to} \quad \sum_{r} x_r \le 1, \quad x \ge 0. \tag{2}$$

However, achieving the efficient allocation requires knowledge of the utility functions; what can we do if these are unknown? The key insight is to make each user act as if they are optimizing the aggregate utility, by structuring payments appropriately. The basic approach in a VCG mechanism is to let the strategy space of each user $r$ be the set $\mathcal{U}$ of possible valuation functions and make a payment $t_r$ to user $r$ so that her net payoff has the same form as the social objective (1). In particular, note that if user $r$ receives an allocation $x_r$ and a payment $t_r$, the payoff to user $r$ is

$$U_r(x_r) + t_r.$$

On the other hand, the social objective (1) can be written as

$$U_r(x_r) + \sum_{s \ne r} U_s(x_s).$$

Comparing the preceding two expressions, the most natural means to align user objectives with the social planner's objectives is to define the payment $t_r$ as the sum of the valuations of all users other than $r$.

A VCG mechanism first asks each user to declare a valuation function. For each $r$, we use $\hat U_r$ to denote the declared valuation function of user $r$ and use $\hat U = (\hat U_1, \ldots, \hat U_R)$ to denote the vector of declared valuations. Formally, given a vector of declared valuation functions $\hat U$, a VCG mechanism chooses the allocation $x(\hat U)$ as an optimal solution to (1)–(2) given $\hat U$, i.e.,

$$x(\hat U) \in \arg\max_{x \ge 0:\ \sum_r x_r \le 1} \sum_r \hat U_r(x_r). \tag{4}$$

The payments are then structured so that

$$t_r(\hat U) = \sum_{s \ne r} \hat U_s(x_s(\hat U)) + h_r(\hat U_{-r}). \tag{5}$$
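The mechanism (4)-(5) can be exercised numerically. The sketch below uses two users, a grid search in place of the continuous program (4), and the common Clarke ("pivot") choice for the term $h_r$, which (5) otherwise leaves unspecified; the valuation functions are hypothetical:

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 1001)

def allocate(v1, v2):
    # brute-force version of (4): split one unit of capacity so as to
    # maximize the declared welfare v1(x1) + v2(1 - x1)
    w = v1(grid) + v2(1.0 - grid)
    i = int(np.argmax(w))
    return grid[i], 1.0 - grid[i]

def vcg(v1_hat, v2_hat):
    # allocation (4) plus payments (5) with the Clarke ("pivot") choice
    # h_r = -(best declared welfare the other users could get on their own)
    x1, x2 = allocate(v1_hat, v2_hat)
    t1 = v2_hat(x2) - np.max(v2_hat(grid))
    t2 = v1_hat(x1) - np.max(v1_hat(grid))
    return (x1, x2), (t1, t2)

U1 = lambda x: 2.0 * np.sqrt(x)   # true valuations (hypothetical)
U2 = lambda x: np.sqrt(x)

def payoff1(report):
    # user 1's realized payoff U1(x1) + t1 when declaring `report`
    (x1, _), (t1, _) = vcg(report, U2)
    return U1(x1) + t1

truthful = payoff1(U1)
for k in (0.25, 0.5, 1.0, 4.0):
    assert truthful >= payoff1(lambda x: k * np.sqrt(x)) - 1e-9
```

The final loop checks, on a few scaled misreports, that truthful declaration weakly dominates; the small tolerance reflects both the weak dominance and the grid discretization.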
$$U_r(x_r(\hat U)) + \sum_{s \ne r} \hat U_s(x_s(\hat U)).$$

Now note that given $\hat U_{-r}$, the above expression is bounded above by

$$\max_{x \ge 0:\ \sum_r x_r \le 1} \left[ U_r(x_r) + \sum_{s \ne r} \hat U_s(x_s) \right].$$

But since $x(\hat U)$ satisfies (4), user $r$ can achieve the preceding maximum by truthfully declaring $\hat U_r = U_r$. Since this optimal strategy does not depend on the valuation functions $(\hat U_s,\, s \ne r)$ declared by the other users, we recover the important fact that in a VCG mechanism, truthful declaration is a weakly dominant strategy for user $r$.

For our purposes, the interesting feature of the VCG mechanism is that it elicits the true utilities from the users and in turn (because of the definition of $x(\hat U)$) chooses an efficient allocation. The feature that truthfulness is a dominant strategy is known as incentive compatibility: the individual incentives of users are aligned, or "compatible," with overall efficiency of the system. The VCG mechanism achieves this by effectively paying each agent to tell the truth. The significance of the approach is that this payment can be properly structured even if the resource manager does not have prior knowledge of the true valuation functions.

We have only considered implementation in static environments. Most practical mechanism design settings are dynamic. Dynamic mechanism design remains an active area of fruitful research.

Cross-References

Auctions
Game Theory: Historical Overview
Linear Quadratic Zero-Sum Two-Person Differential Games

Bibliography

Berry RA, Johari R (2011) Economic modeling in networking: a primer. Found Trends Netw 6(3):165–286
Clarke EH (1971) Multipart pricing of public goods. Public Choice 11(1):17–33
Groves T (1973) Incentives in teams. Econometrica 41(4):617–631
Hajek B (2013) Auction theory. In: Encyclopedia of systems and control. Springer
Mas-Colell A, Whinston M, Green J (1995) Microeconomic theory. Oxford University Press, New York
Moore J (1992) Implementation, contracts, and renegotiation in environments with complete information. Adv Econ Theory 1:182–281
Myerson RB (1981) Optimal auction design. Math Oper Res 6(1):58–73
Nisan N, Roughgarden T, Tardos E, Vazirani V (eds) (2007) Algorithmic game theory. Cambridge University Press, Cambridge/New York
Vickrey W (1961) Counterspeculation, auctions, and competitive sealed tenders. J Financ 16(1):8–37
the system under study with information coming aerospace systems; advanced energy conversion
from experimental data. In this article the role systems. All the abovementioned applications
of mathematical models in control system design possess at least one of the following features,
and the problem of developing compact control- which in turn call for accurate mathematical
oriented models are discussed. modeling for the design of the control system:
closed-loop performance critically depends on
the dynamic behavior of the plant; the system
Keywords is made of many closely interacting subsystems;
advanced control systems are required to obtain
Analytical models; Computational modeling; competitive performance, and these in turn
Continuous-time systems; Control-oriented mod- depend on explicit mathematical models for their
eling; Discrete-time systems; Parameter-varying design; the system is safety critical and requires
systems; Simulation; System identification; extensive validation of closed-loop stability and
Time-invariant systems; Time-varying systems; performance by simulation.
Uncertainty Therefore, building control-oriented mathe-
matical models of physical systems is a crucial
prerequisite to the design process itself (see, e.g.,
Introduction Lovera (2014) for a more detailed treatment of
this topic).
The design of automatic control systems requires In the following, two aspects related to mod-
the availability of some knowledge of the dynam- eling for control system synthesis will be dis-
ics of the process to be controlled. In this respect, cussed, namely, the role of models for control
current methods for control system synthesis can system synthesis and the actual process of model
be classified in two broad categories: model-free building itself.
and model-based ones.
The former aim at designing (or tuning) con-
trollers solely on the basis of experimental data The Role of Models for Control
collected directly on the plant, without resorting System Synthesis
to mathematical models.
The latter, on the contrary, assume that suit- Mathematical models play a number of different
able models of the plant to be controlled are roles in the design of control systems. In particu-
available, and rely on this information to work lar, different classes of mathematical models are
out control laws capable of meeting the design usually employed: detailed, high-fidelity models
requirements. for system simulation and compact models for
While the research on model-free design control design. In this section the two model
methods is a very active field, the vast majority classes are presented and their respective roles in
of control synthesis methods and tools fall in the design of control systems are described. Note,
the model-based category and therefore assume in passing, that although hybrid system control is
that knowledge about the plant to be controlled an interesting and emerging field, this entry will
is encoded in the form of dynamic models of focus on purely continuous-time physical mod-
the plant itself. Furthermore, in an increasing els, with application to the design of continuous-
number of application areas, control system time or sampled-time control systems.
performance is becoming a key competitive
factor for the success of innovative, high- Detailed Models for System Simulation
tech systems. Consider, for example, high- Detailed models play a double role in the control
performance mechatronic systems (such as design process. On one hand, they allow checking
robots); vehicles enhanced by active integrated how good (or crude) the compact model is, com-
stability, suspension, and braking control; pared to a more detailed description, thus helping
Model Building for Control System Synthesis 717
to develop good compact models. On the other connectors (actuator inputs and sensor outputs)
hand, they allow closed-loop performance verifi- to a causal controller model in a causal simu-
cation of the controlled system, once a controller lation environment. From a mathematical point
design is available. Indeed, verifying closed-loop of view, this corresponds to reformulating (1) in
performance using the same simplified model state-space form, by means of analytical and/or
that was used for control system design is not numerical transformations.
a sound practice; conversely, verification per- Finally, it is important to point out that
formed with a more detailed model is usually physical model descriptions based on partial-
deemed a good indicator of the control system differential equations (PDEs) can be handled in
performance, whenever experimental validation the OOM framework by means of discretization
is not possible for some reason. using finite volume, finite elements, or finite
Object-oriented modeling (OOM) method- differences methods.
ologies and equation-based, object-oriented
languages (EOOLs) provide very good support for the development of such high-fidelity models, thanks to equation-based modeling, acausal physical ports, hierarchical system composition, and inheritance; see Tiller (2001) for a comprehensive overview. Any continuous-time EOOL model is equivalent to the set of differential-algebraic equations (DAEs)

    F(ẋ(t), x(t), u(t), y(t), p, t) = 0,     (1)

where x is the vector of dynamic variables, u is the vector of input variables, y is the vector of algebraic variables, p is the vector of parameters, and t is the time. It is interesting to highlight that the object-oriented approach (in particular, the use of replaceable components) allows defining and managing families of models of the same plant with different levels of complexity, by providing more or less detailed implementations of the same abstract interfaces. This feature of OOM allows the development of simulation models for different purposes and with different degrees of detail throughout the entire life of an engineering project, from preliminary design down to commissioning and personnel training, all within a coherent framework.

In particular, when focusing on control systems verification (and regardless of the actual control design methodology), once the controller has been set up, an OOM tool can be used to run closed-loop simulations, including both the plant and the controller model. Many OOM tools provide model export facilities, which allow one to connect a plant model with only causal external …

Compact Models for Control Design

The requirements for a control-oriented model can vary significantly from application to application. Design models can be tentatively classified in terms of two key features: complexity and accuracy. For a dynamic model, complexity can be measured in terms of its order; accuracy, on the other hand, can be measured using many different metrics (e.g., time-domain simulation or prediction error, frequency-domain matching with the real plant, etc.) related to the capability of the model to reproduce the behavior of the true system in the operating conditions of interest. Broadly speaking, it can be safely stated that the required level of closed-loop performance drives the requirements on the accuracy and complexity of the design model. Similarly, it is intuitive that more complex models have the potential for being more accurate. So, one might be tempted to resort to very detailed mathematical representations of the plant to be controlled in order to maximize closed-loop performance. This consideration, however, is moderated by a number of additional requirements, which actually end up driving the control-oriented modeling process. First of all, present-day controller synthesis methods and tools have computational limitations in terms of the complexity of the mathematical models they can handle, so compact models representative of the dominant dynamics of the system under study are what is really needed. Furthermore, for many synthesis methods (such as, e.g., LQG or H∞ synthesis), the complexity of the design model has an impact on the complexity of the controller, …
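The implicit DAE form (1) can be made concrete with a minimal Python sketch. The example below is a hypothetical series RC circuit (the residual function, variable names, and parameter values are illustrative and are not taken from the encyclopedia entry):

```python
import numpy as np

def dae_residual(x_dot, x, u, y, p, t):
    """Residual F(x', x, u, y, p, t) = 0 for a toy series RC circuit:
    dynamic variable   x = [v_C]  (capacitor voltage)
    algebraic variable y = [i]    (loop current)
    equations:  C*v_C' - i = 0  and  u - R*i - v_C = 0.
    """
    R, C = p
    return np.array([
        C * x_dot[0] - y[0],     # differential equation (charge balance)
        u[0] - R * y[0] - x[0],  # algebraic equation (Kirchhoff voltage law)
    ])

# At the consistent steady state v_C = u, i = 0, the residual vanishes.
res = dae_residual(np.array([0.0]), np.array([5.0]), np.array([5.0]),
                   np.array([0.0]), (1e3, 1e-6), 0.0)
```

A DAE solver (or an EOOL tool) would drive this residual to zero along the whole trajectory; here it only illustrates how dynamic variables, algebraic variables, inputs, and parameters enter F.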
718 Model Building for Control System Synthesis
conditions (either equilibria or trajectories). Many design techniques are now available for this problem (see, e.g., Mohammadpour and Scherer 2012), provided that a suitable model in parameter-dependent form has been derived for the system to be controlled. Linear parameter-varying (LPV) models, described in state-space form as

    ẋ(t) = A(p(t))x(t) + B(p(t))u(t)
    y(t) = C(p(t))x(t) + D(p(t))u(t)     (7)

are linear models whose state-space representation depends on a parameter vector p that can be time varying. The elements of vector p may or may not be measurable, depending on the specific problem formulation. The present state of the art of LPV modeling can be briefly summarized by defining two classes of approaches (see Lopes dos Santos et al. (2011) for details). Analytical methods are based on the availability of reliable nonlinear equations for the dynamics of the plant, from which suitable control-oriented representations can be derived (by resorting to, broadly speaking, suitable extensions of the familiar notion of linearization, developed in order to take into account off-equilibrium operation of the system). Experimental methods are based entirely on identification, i.e., aimed at deriving LPV models for the plant directly from input/output data. In particular, some LPV identification techniques assume one global identification experiment in which both the control input and the parameter vector are (persistently) excited in a simultaneous way, while others aim at deriving a parameter-dependent model on the basis of local …

Most of the existing control design literature assumes that the plant model is given in the form of a linear fractional transformation (LFT) (see, e.g., Skogestad and Postlethwaite (2007) for an introduction to LFT modeling of uncertainty and Hecker et al. (2005) for a discussion of algorithms and software tools). LFT models consist of a feedback interconnection between a nominal LTI plant and a (usually norm-bounded) operator Δ which represents model uncertainties, e.g., poorly known or time-varying parameters, nonlinearities, etc. A generic such LFT interconnection is depicted in Fig. 1, where the nominal plant is denoted with P and the uncertainty block is denoted with Δ. The LFT formalism can also be used to provide a structured representation for the state-space form of LPV models, as depicted in Fig. 2, where the block Δ(α) takes into account the presence of the uncertain parameter vector α, while the block Δ(p) models the effect of the varying operating point, parameterized by the vector of time-varying parameters p. Therefore, LFT models can be used for the design of robust and gain-scheduling controllers; in addition, they can also serve as a basis for structured model identification techniques, where the uncertain parameters that appear in the feedback blocks are estimated based on input/output data sequences.

Fig. 1 A generic LFT interconnection: uncertainty block Δ in feedback with the nominal plant P (input u, output y)

The process of extracting uncertain/scheduling parameters from the design model of the system
to be controlled is a highly complex one, in which symbolic techniques play a very important role. Tools already exist to perform this task (see, e.g., Hecker et al. 2005), while a recent overview of the state of the art in this research area can be found in Hecker and Varga (2006).

Finally, it is important to point out that there is a vast body of advanced control techniques which are based on discrete-time models:

    x(k+1) = f(x(k), u(k), p, k)
    y(k) = g(x(k), u(k), p, k)     (8)

where the integer time step k usually corresponds to multiples of a sampling period Ts. Many techniques are available to transform (2) into (8). Furthermore, LTI, LTV, and LPV models can be formulated in discrete time rather than in continuous time.

Building Models for Control System Synthesis

The development of control-oriented models of physical systems is a complex task, which in general implies a careful combination of prior knowledge about the physics of the system under study with information coming from experimental data. In particular, this process can follow very different paths depending on the type of information which is available on the plant to be controlled. Such paths are typically classified in the literature as follows (see, e.g., Ljung (2008) for a more detailed discussion).

White box modeling refers to the development of control-oriented models on the basis of first principles only. In this framework, one uses the available information on the plant to develop a detailed model using OOM or EOOL tools and subsequently works out a compact control-oriented model from it. If the adopted tool only supports simulation, then one can run simulations of the plant model, subject to suitably chosen excitation inputs (ranging from steps to persistently exciting input sequences such as, e.g., pseudorandom binary sequences and sine sweeps), and then reconstruct the dynamics by means of system identification methods. Note that in this way the structure/order selection stage of the system identification process provides effective means to manage the complexity-versus-accuracy tradeoff in the derivation of the compact model. A more direct approach, presently supported by many tools, is to directly compute the A, B, C, D matrices of the linearized system around specified equilibrium (trim) points, using symbolic and/or numerical linearization techniques. The result is usually a high-order linear system, which then can (sometimes must) be reduced to a low-order system by using model order reduction techniques (such as, e.g., balanced truncation). Model reduction techniques (see Antoulas (2009) for an in-depth treatment of this topic) allow one to automatically obtain approximated compact models such as (3), starting from much more detailed simulation models, by formulating specific approximation bounds in control-relevant terms (e.g., percentage errors of steady-state output values, norm-bounded additive or multiplicative errors of weighted transfer functions, or L2-norm errors of output transients in response to specified input signals).

Black box modeling, on the other hand, corresponds to situations in which the modeling activity is entirely based on input-output data collected on the plant (which therefore must be already available), possibly in dedicated, suitably designed experiments (see Ljung 1999). Regardless of the type of model to be built (i.e., linear or nonlinear, time invariant or time varying, discrete time or continuous time), the black box approach consists of a number of well-defined steps. First of all, the structure of the model to be identified must be defined: in the linear time-invariant case, this corresponds to the choice of the number of poles and zeros for an input-output model or to the choice of model order for a state-space representation; in the nonlinear case, structure selection is a much more involved process in view of the much larger number of degrees of freedom which are potentially involved. Once a model structure has been defined, a suitable
cost function to measure the model performance must be selected (e.g., time-domain simulation or prediction error, frequency-domain model fitting, etc.), and the experiments to collect identification and validation data must be designed. Finally, the uncertain model parameters must be estimated from the available identification dataset, and the model must be validated on the validation dataset.

Grey box modeling (in various shades) corresponds to the many possible intermediate cases which can occur in practice, ranging from the white box approach to the black box one. As recently discussed in Ljung (2008), the critical issue in the development of an effective approach to control-oriented grey box modeling lies in the integration of existing methods and tools for physical systems modeling and simulation with methods and tools for parameter estimation. Such integration can take place in a number of different ways depending on the relative role of data and priors on the physics of the system in the specific application. A typical situation which occurs frequently in applications is when a white box model (developed by means of OOM or EOOL tools) contains parameters having unknown or uncertain numerical values (such as, e.g., damping factors in structural models, aerodynamic coefficients in aircraft models, and so on). Then, one may rely on input-output data collected in dedicated experiments on the real system to refine the white box model by estimating the parameters using the information provided by the data. This process is typically dependent on the specific application domain, as the type of experiment, the number of measurements, and the estimation technique must meet application-specific constraints (see, e.g., Klein and Morelli (2006) for an overview of grey box modeling in aerospace applications).

Summary and Future Directions

In this article the problem of model building for control system synthesis has been considered. An overview of the different uses of mathematical models in control system design has been provided, and the process of building compact control-oriented models starting from prior knowledge about the system and/or experimental data has been discussed. Present-day modeling and simulation tools could support advanced control system design in a much more direct way. In particular, while methods and tools for the individual steps in the modeling process (such as OOM, linearization and model reduction, parameter estimation) are available, an integrated environment enabling the pursuit of all the abovementioned paths to the development of compact control-oriented models is still a subject for future development. The availability of such a tool might further promote the application of advanced, model-based techniques that are currently limited by the model development process.

Cross-References

Modeling of Dynamic Systems from First Principles
Multi-domain Modeling and Simulation
System Identification: An Overview

Bibliography

Antoulas A (2009) Approximation of large-scale dynamical systems. SIAM, Philadelphia
Hecker S, Varga A (2006) Symbolic manipulation techniques for low order LFT-based parametric uncertainty modelling. Int J Control 79(11):1485–1494
Hecker S, Varga A, Magni J-F (2005) Enhanced LFR-toolbox for MATLAB. Aerosp Sci Technol 9(2):173–180
Klein V, Morelli EA (2006) Aircraft system identification: theory and practice. AIAA, Reston
Ljung L (1999) System identification: theory for the user. Prentice-Hall, New Jersey
Ljung L (2008) Perspectives on system identification. In: 2008 IFAC world congress, Seoul
Abstract

Model order reduction (MOR) is here understood as a computational technique to reduce the order of a dynamical system described by a set of ordinary or differential-algebraic equations (ODEs or DAEs) to facilitate or enable its simulation, the design of a controller, or optimization and design of the physical system modeled. It focuses on representing the map from inputs into the system to its outputs, while its dynamics are treated as a black box so that the large-scale set of describing ODEs/DAEs can be replaced by a much smaller set of ODEs/DAEs without sacrificing the accuracy of the input-to-output behavior.

… E ẋ(t) = Ax(t) + Bu(t), where the mass matrix M, the stiffness matrix K, and the damping matrix D are square matrices in R^(s×s), F ∈ R^(s×m), Cp, Cv ∈ R^(q×s), x(t) ∈ R^s, u(t) ∈ R^m, y(t) ∈ R^q. Such second-order differential equations are typically transformed to a mathematically equivalent first-order differential equation

    E ż(t) = A z(t) + B u(t),   y(t) = C z(t),

with

    E = [ I 0 ; 0 M ],  A = [ 0 I ; -K -D ],  B = [ 0 ; F ],  C = [ Cp Cv ],  z(t) = [ x(t) ; ẋ(t) ].
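The block-matrix assembly above is straightforward to carry out numerically. The following numpy sketch (with small illustrative matrices, not taken from the entry) builds E, A, B, C from M, D, K, F, Cp, Cv:

```python
import numpy as np

def second_to_first_order(M, D, K, F, Cp, Cv):
    """Assemble E z'(t) = A z(t) + B u(t), y = C z(t) with z = [x; x']."""
    s = M.shape[0]
    I, Z = np.eye(s), np.zeros((s, s))
    E = np.block([[I, Z], [Z, M]])
    A = np.block([[Z, I], [-K, -D]])
    B = np.vstack([np.zeros_like(F), F])
    C = np.hstack([Cp, Cv])
    return E, A, B, C

# Single mass-spring-damper: mass 2, damping 0.5, stiffness 8,
# force input, position output.
E, A, B, C = second_to_first_order(np.array([[2.0]]), np.array([[0.5]]),
                                   np.array([[8.0]]), np.array([[1.0]]),
                                   np.array([[1.0]]), np.array([[0.0]]))
```

For s second-order equations the resulting descriptor system has order 2s, which is exactly the kind of growth that motivates the reduction techniques discussed below.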
Model Order Reduction: Techniques and Tools 723
There are a number of different methods to construct ROMs; see, e.g., Antoulas (2005), Benner et al. (2005), Obinata and Anderson (2001), and Schilders et al. (2008). Here we concentrate on projection-based methods, which restrict the full state x(t) to an r-dimensional subspace by choosing x̃(t) = W*x(t), where W is an n×r matrix. Here the conjugate transpose of a complex-valued matrix Z is denoted by Z*, while the transpose of a matrix Y will be denoted by Y^T. Choosing V ∈ C^(n×r) such that W*V = I ∈ R^(r×r) yields an n×n projection matrix Π = VW*, which projects onto the r-dimensional subspace spanned by the columns of V along the kernel of W*. Applying this projection to (1), one obtains the reduced-order LTI system (3) with

    Ẽ = W*EV,  Ã = W*AV,  B̃ = W*B,  C̃ = CV     (5)

and an unchanged D̃ = D. If V = W, Π is an orthogonal projector and is called a Galerkin projection. If V ≠ W, Π is an oblique projector, sometimes called a Petrov-Galerkin projection.

In the following, we will briefly discuss the main classes of methods to construct suitable matrices V and W: truncation-based methods and interpolation-based methods. Other methods, in particular combinations of the two classes discussed here, can be found in the literature. In case the original LTI system is real, it is often desirable …

… (i.e., all eigenvalues lie in the open left half complex plane). This implies that the system is stable. Let V be the n×r matrix consisting of the first r columns of T, and let W* be the first r rows of T^(-1), so that W*V = I. Applying the transformation T to the LTI system (1) yields

    T^(-1)ẋ(t) = (T^(-1)AT)T^(-1)x(t) + (T^(-1)B)u(t)     (7)
    y(t) = (CT)T^(-1)x(t) + Du(t)     (8)

with

    T^(-1)AT = [ W*AV 0 ; 0 A2 ],  T^(-1)B = [ W*B ; B2 ],

and CT = [ CV C2 ], where W*AV = diag(λ1, …, λr) and A2 = diag(λ_{r+1}, …, λ_n). Preserving the r dominant poles (eigenvalues with largest real part) by truncating the rest (i.e., discarding A2, B2, and C2 from (7)) yields the ROM as in (5). It can be shown that the error bound

    ‖G(·) - G̃(·)‖_H∞ ≤ (1 / |Re(λ_{r+1})|) ‖C2‖ ‖B2‖

holds (Benner 2006). As eigenvalues contain only limited information about the system, this is not necessarily a meaningful reduced-order system. In particular, the dependence of the input–output relation on B and C is completely ignored.
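For a diagonalizable A, modal truncation as described above fits in a few lines of numpy. The system below is an illustrative example (not from the entry); for simplicity E = I and D = 0:

```python
import numpy as np

def modal_truncation(A, B, C, r):
    """Keep the r dominant poles (largest real part) of x' = Ax + Bu, y = Cx."""
    lam, T = np.linalg.eig(A)         # A = T diag(lam) T^{-1} (A diagonalizable)
    order = np.argsort(-lam.real)     # dominant eigenvalues first
    lam, T = lam[order], T[:, order]
    V = T[:, :r]                      # first r columns of T
    W = np.linalg.inv(T)[:r, :]       # first r rows of T^{-1}, so W @ V = I_r
    return np.diag(lam[:r]), W @ B, C @ V

A = np.array([[-1.0, 0.0, 0.0],
              [0.0, -5.0, 1.0],
              [0.0, 0.0, -10.0]])
B = np.array([[1.0], [1.0], [1.0]])
C = np.array([[1.0, 1.0, 1.0]])
Ar, Br, Cr = modal_truncation(A, B, C, r=1)   # keeps the pole at -1
```

The reduced system retains the dominant pole and its residue exactly, but, as noted above, the choice of retained modes ignores how strongly B and C excite and observe them.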
This can be enhanced by more refined dominance measures taking B and C into account; see, e.g., Varga (1995) and Benner et al. (2011).

More suitable reduced-order systems can be obtained by balanced truncation. To introduce this concept, we no longer need to assume A to be diagonalizable, but we require the stability of A in the sense of (6). For simplicity, we assume E = I. For treatment of the DAE case (E ≠ I), see Benner et al. (2005, Chap. 3). Loosely speaking, a balanced representation of an LTI system is obtained by a change of coordinates such that the states which are hard to reach are at the same time those which are difficult to observe. This change of coordinates amounts to an equivalence transformation of the realization (A, B, C, D) of (1), called state-space transformation as in (7), where T now is the matrix representing the change of coordinates. The new system matrices (T^(-1)AT, T^(-1)B, CT, D) form a balanced realization of (1). Truncating in this balanced realization the states that are hard to reach and difficult to observe results in a ROM. Consider the Lyapunov equations

    AP + PA^T + BB^T = 0,   A^T Q + QA + C^T C = 0.     (9)

The solution matrices P and Q are called controllability and observability Gramians, respectively. If both Gramians are positive definite, the LTI system is minimal. This will be assumed from here on in this section.

In balanced coordinates the Gramians P and Q of a stable minimal LTI system satisfy P = Q = diag(σ1, …, σn), with the Hankel singular values σ1 ≥ σ2 ≥ … ≥ σn > 0. The Hankel singular values are the positive square roots of the eigenvalues of the product of the Gramians PQ, σk = √λk(PQ). They are system invariants, i.e., they are independent of the chosen realization of (1), as they are preserved under state-space transformations.

Given the LTI system (1) in a non-balanced coordinate form and the Gramians P and Q satisfying (9), the transformation matrix T which yields an LTI system in balanced coordinates can be computed via the so-called square root algorithm as follows:
• Compute the Cholesky factors S and R of the Gramians such that P = S^T S, Q = R^T R.
• Compute the singular value decomposition SR^T = ΦΣΨ^T, where Φ and Ψ are orthogonal matrices and Σ is a diagonal matrix with the Hankel singular values on its diagonal. T = S^T Φ Σ^(-1/2) yields the balancing transformation (note that T^(-1) = Σ^(-1/2) Ψ^T R).
• Partition Φ, Σ, Ψ into blocks of corresponding sizes,

    Σ = [ Σ1 0 ; 0 Σ2 ],  Φ = [ Φ1 Φ2 ],  Ψ = [ Ψ1 Ψ2 ],

with Σ1 = diag(σ1, …, σr), and apply T to (1) to obtain (7) with

    T^(-1)AT = [ W^T AV A12 ; A21 A22 ],  T^(-1)B = [ W^T B ; B2 ],     (10)

and CT = [ CV C2 ], for W = R^T Ψ1 Σ1^(-1/2) and V = S^T Φ1 Σ1^(-1/2). Preserving the r dominant Hankel singular values by truncating the rest yields the reduced-order model as in (5). As W^T V = I, balanced truncation is an oblique projection method. The reduced-order model is stable with the Hankel singular values σ1, …, σr. It can be shown that if σr > σ_{r+1}, the error bound

    ‖G(·) - G̃(·)‖_H∞ ≤ 2 Σ_{k=r+1}^{n} σk     (11)

holds. Given an error tolerance, this allows one to choose the appropriate order r of the reduced system in the course of the computations.

As the explicit computation of the balancing transformation T is numerically hazardous, one usually uses the equivalent balancing-free square root algorithm (Varga 1991), in which orthogonal bases for the column spaces of V and W are computed. The ROM so obtained is no longer balanced, but preserves all other properties (error bound, stability). Furthermore, it is shown in Benner et al. (2000) how to implement the balancing-free square root algorithm using low-rank approximations to S and R without ever …
This reduced-order model makes use of the information in the matrices A12, A21, A22, B2, and C2 discarded by balanced truncation. It fulfills …

Consider the (block) Krylov subspace Kk(F, H) = span{H, FH, F²H, …, F^(k-1)H} for F = (A - s0 E)^(-1) E and H = (A - s0 E)^(-1) B with
an appropriately chosen expansion point s0, which may be real or complex. From the definitions of A, B, and E, it follows that F ∈ K^(n×n) and H ∈ K^(n×m), where K = R or K = C depending on whether s0 is chosen in R or in C. Considering Kk(F, H) column by column, this leads to the observation that the number of column vectors in {H, FH, F²H, …, F^(k-1)H} is given by r = m·k, as there are k blocks F^j H ∈ K^(n×m), j = 0, …, k-1. In the case when all r column vectors are linearly independent, the dimension of the Krylov subspace Kk(F, H) is m·k. Assume that a unitary basis for this block Krylov subspace is generated such that the column space of the resulting unitary matrix V ∈ C^(n×r) spans Kk(F, H). Applying the Galerkin projection Π = VV* to (1) yields a reduced system whose TFM satisfies the following (Hermite) interpolation conditions at s0:

    G̃^(j)(s0) = G^(j)(s0),  j = 0, 1, …, k-1.

That is, the first k-1 derivatives of G and G̃ coincide at s0. Considering the power series expansion (12) of the original and the reduced-order TFM, this is equivalent to saying that at least the first k moments M̃j(s0) of the transfer function G̃(s) of the reduced system (3) are equal to the first k moments Mj(s0) of the TFM G(s) of the original system (1) at the expansion point s0:

    Mj(s0) = M̃j(s0),  j = 0, 1, …, k-1.

If further the r columns of the unitary matrix W span the block Krylov subspace Kk(F, H) for F = (A - s0 E)^(-T) E^T and H = (A - s0 E)^(-T) C^T, applying the Petrov-Galerkin projection Π = V(W*V)^(-1)W* to (1) yields a reduced system whose TFM matches at least the first 2k moments of the TFM of the original system.

Theoretically, the matrix V (and W) can be computed by explicitly forming the columns which span the corresponding Krylov subspace Kk(F, H) and using the Gram-Schmidt algorithm to generate unitary basis vectors for Kk(F, H). The forming of the moments (the Krylov subspace blocks F^j H) is, however, numerically precarious and has to be avoided under all circumstances; using Krylov subspace methods to obtain an interpolation-based ROM as described above is recommended instead. The unitary basis of a (block) Krylov subspace can be computed by employing a (block) Arnoldi or (block) Lanczos method; see, e.g., Antoulas (2005), Golub and Van Loan (2013), and Freund (2003).

In the case when an oblique projection is to be used, it is not necessary to compute two unitary bases as above. An alternative is then to use the nonsymmetric Lanczos process (Golub and Van Loan 2013). It computes bi-unitary (i.e., W*V = Ir) bases for the above-mentioned Krylov subspaces and the reduced-order model as a by-product of the Lanczos process. An overview of the computational techniques for moment matching and Padé approximation summarizing the work of a decade is given in Freund (2003) and the references therein.

In general, the discussed MOR approaches are instances of rational interpolation. When the expansion point is chosen to be s0 = ∞, the moments are called Markov parameters, and the approximation problem is known as partial realization. If s0 = 0, the approximation problem is known as Padé approximation.

As the use of one single expansion point s0 leads to good approximation only close to s0, it might be desirable to use more than one expansion point. This leads to multipoint moment-matching methods, also called rational Krylov methods; see, e.g., Ruhe and Skoogh (1998), Antoulas (2005), and Freund (2003).

In contrast to balanced truncation, these (rational) interpolation methods do not necessarily preserve stability. Remedies have been suggested; see, e.g., Freund (2003).

The use of complex-valued expansion points will lead to a complex-valued reduced-order system (3). In some applications (in particular, in case the original system is real valued), this is undesired. In that case one can always use complex-conjugate pairs of expansion points, as then the entire computations can be done in real arithmetic.

The methods just described provide good approximation quality locally around the expansion
points. They do not aim at a global approximation as measured by the H2- or H∞-norms. In Gugercin et al. (2008), an iterative procedure is presented which determines locally optimal expansion points w.r.t. the H2-norm approximation under the assumption that the order r of the reduced model is prescribed and only 0th- and 1st-order derivatives are matched. Also, for multi-input multi-output systems (i.e., m and p in (1) are both larger than one), no full moment matching is achieved, but only tangential interpolation:

    G(sj)bj = G̃(sj)bj,  cj*G(sj) = cj*G̃(sj),  cj*G′(sj)bj = cj*G̃′(sj)bj,

for certain vectors bj, cj determined together with the optimal sj by the iterative procedure.

• Oberwolfach Model Reduction Benchmark Collection
  http://simulation.uni-freiburg.de/downloads/benchmark/
• NICONET Benchmark Examples
  http://www.icm.tu-bs.de/NICONET/benchmodred.html

The MOR WiKi http://morwiki.mpi-magdeburg.mpg.de/morwiki/ is a platform for MOR research and provides discussions of a number of methods, links to further software packages (e.g., MOREMBS and MORPACK), as well as additional benchmark examples.
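A rough numerical illustration of single-point moment matching as described earlier is given below. It is a simplified sketch, assuming E = I and a real expansion point s0, and it uses a QR factorization of the explicitly formed Krylov block matrix in place of a proper Arnoldi iteration (acceptable only for the tiny, well-conditioned example used here; the test system is invented):

```python
import numpy as np

def krylov_rom(A, B, C, s0, k):
    """One-sided moment matching at expansion point s0 for x' = Ax + Bu, y = Cx."""
    n = A.shape[0]
    F = np.linalg.inv(A - s0 * np.eye(n))    # F = (A - s0 I)^{-1}; H = F @ B
    blocks = [F @ B]
    for _ in range(k - 1):
        blocks.append(F @ blocks[-1])        # {H, F H, ..., F^{k-1} H}
    V, _ = np.linalg.qr(np.hstack(blocks))   # unitary basis of K_k(F, H)
    return V.T @ A @ V, V.T @ B, C @ V       # Galerkin projection Pi = V V^T

def tf(A, B, C, s):
    """Transfer function value C (sI - A)^{-1} B of a SISO system."""
    return (C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B)).item()

rng = np.random.default_rng(0)
A = -np.diag(np.arange(1.0, 7.0)) + 0.1 * rng.standard_normal((6, 6))
B = rng.standard_normal((6, 1))
C = rng.standard_normal((1, 6))
Ar, Br, Cr = krylov_rom(A, B, C, s0=1.0, k=2)
# tf(Ar, Br, Cr, 1.0) interpolates tf(A, B, C, 1.0)
```

As stated in the text, the reduced transfer function interpolates the original one (and its first moments) at s0, but nothing is guaranteed far away from the expansion point.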
estimation, however, stability can be proved for direct and indirect MRAC schemes. For MRAC systems designed with the MIT rule, local stability can be established under more restrictive conditions, such as when the parameters are close to the ideal ones.

It should be noted that the adaptive control algorithm in the original form has been shown to have robustness issues, and extensive publications in the 1980s and 1990s were devoted to robust adaptive control in attempts to mitigate the problem. Many modifications have been proposed and shown to be effective in "robustifying" the MRAC; interested readers are referred to the article on Robust Adaptive Control for more details.

Parameter convergence is not an intrinsic requirement for MRAC, as tracking error convergence can be achieved without parameter convergence. It has been shown, however, that parameter convergence could enhance robustness, particularly for indirect schemes. As in the case for parameter identification, a persistent excitation (PE) condition needs to be imposed on the regression signal to assure parameter convergence in MRAC. In general, PE is accomplished by properly choosing the reference input r. It can be established for most MRAC approaches that parameter convergence is achieved if, in addition to the conditions required for stability, the reference input r is sufficiently rich of order 2n, ṙ is bounded, and there is no pole-zero cancelation in the plant transfer function. A signal is said to be sufficiently rich of order m if it contains at least m/2 distinct frequencies.

Summary and Future Directions

MRAC was a very active and fruitful research topic from the 1960s to the 1990s, and it formed important foundations for modern adaptive control theory. It also found many successful applications ranging from chemical process controls to automobile engine controls. More recent efforts have been mostly devoted to integrating it with other design approaches to treat nonstandard MRAC problems for nonlinear and complex dynamic systems.

Cross-References

Adaptive Control of Linear Time-Invariant Systems
Adaptive Control, Overview
History of Adaptive Control
Robust Adaptive Control

Recommended Reading

MRAC has been well covered in several textbooks and research monographs. Astrom and Wittenmark (1994) presented different MRAC schemes in a tutorial fashion. Narendra and Annaswamy (1989) focused on stability of deterministic MRAC systems. Ioannou and Sun (1996) covered the detailed derivation and analysis of different MRAC schemes and provided a unified treatment for their stability and robustness analysis. MRAC systems for discrete-time (Goodwin and Sin 1984) and for nonlinear (Krstic et al. 1995) processes are also well explored.
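As a toy complement to the MIT rule and excitation conditions discussed above, the following sketch adapts a scalar feedforward gain for a pure-gain plant (all numbers are hypothetical; this is the simplest possible gradient adaptation, not a full MRAC design):

```python
import numpy as np

# Plant: y = kp*u with unknown gain kp; reference model: ym = km*r.
# Feedforward control u = theta*r gives e = (kp*theta - km)*r; the MIT rule
# adjusts theta by gradient descent on e^2/2.
kp, km = 2.0, 1.0          # true plant gain (unknown to the designer), model gain
gamma, dt = 0.5, 0.01      # adaptation gain and Euler integration step
theta = 0.0
for k in range(20000):
    t = k * dt
    r = np.sin(t) + 0.5 * np.sin(3.0 * t)   # persistently exciting reference
    e = kp * theta * r - km * r             # tracking error y - ym
    theta -= gamma * e * r * dt             # MIT-rule update
# theta converges toward the matching gain km/kp = 0.5
```

With a reference that is not persistently exciting (e.g., r identically zero), theta would simply stay wherever it is, which is the scalar version of the parameter-convergence caveat above.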
    min_u T_C(ȳ, u)

    T_C = Σ_{n=1}^{N} Σ_{i=1}^{P} γ_{n,i} ( y_{n,ref}(k+i) - ȳ_n(k+i) )² + Σ_{l=1}^{R} Σ_{j=1}^{M} α_{l,j} u_l(k+j)²
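Numerically, a cost of this form is just two weighted sums of squares. A small numpy sketch (with hypothetical weight arrays standing in for γ and α) is:

```python
import numpy as np

def tracking_cost(y_ref, y_pred, u, gamma, alpha):
    """T_C = sum_n sum_i gamma[n,i]*(y_ref[n,i] - y_pred[n,i])**2
           + sum_l sum_j alpha[l,j]*u[l,j]**2."""
    return float(np.sum(gamma * (y_ref - y_pred) ** 2) + np.sum(alpha * u ** 2))

# Two outputs over a prediction horizon of 3, one input over a control horizon of 2.
cost = tracking_cost(np.ones((2, 3)), np.zeros((2, 3)),
                     0.5 * np.ones((1, 2)),
                     np.ones((2, 3)), np.ones((1, 2)))
# cost == 6.5 (tracking term 6.0 plus control penalty 0.5)
```

An NMPC controller would minimize this quantity over the input sequence u subject to the plant model and constraints; here it only makes the structure of the criterion explicit.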
Model-Based Performance Optimizing Control 737
Similar to any NMPC controller that is designed for reference tracking, a successful implementation will require careful engineering such that as many uncertainties as possible are compensated by simple feedback controllers, and only the key dynamic variables are handled by the optimizing controller, based on a rigorous model of the essential dynamics and of the stationary relations of the plant without too much detail.

History and Examples

The idea of economic or performance optimizing control originated from the process control community. Among the first papers on directly integrating economic considerations into model-predictive control, Zanin et al. (2000) proposed to achieve a better economic performance by adding an economic term to a classical tracking performance criterion and applied this to the control of a fluidized bed catalytic cracker. Helbig et al. (2000) discussed different ways to integrate optimization and feedback control, including direct dynamic optimization, for the example of a semi-batch reactor. Toumi and Engell (2004) and Erdem et al. (2004) demonstrated online performance optimizing control schemes for simulated moving bed (SMB) chromatographic separations in lab scale. SMB processes are periodic processes and constitute prototypical examples where additional degrees of freedom can be used to simultaneously optimize system performance and meet product specifications. Bartusiak (2005) already reported industrial applications of carefully engineered performance optimizing NMPC controllers.

Direct performance optimizing control was suggested as a promising general new control paradigm for the process industries by Rolandi and Romagnoli (2005), Engell (2006, 2007), Rawlings and Amrit (2009), and others. Meanwhile it has been demonstrated in many simulation studies that direct optimization of a performance criterion can lead to superior economic performance compared to classical tracking (N)MPC, e.g., Ochoa et al. (2010) for a bioethanol process and Idris and Engell (2012) for a reactive distillation column.

Further Issues

Modeling and Robustness

In a direct performance optimizing control approach, sufficiently accurate dynamic nonlinear process models are needed. While, in the process industries, nonlinear steady-state models are nowadays available for many processes because they are built and used extensively in the process design phase, there is still a considerable additional effort required to formulate, implement, and validate nonlinear dynamic process models. The effort for rigorous or semi-rigorous modeling usually dominates the cost of an advanced control project. The alternative approach of using black-box or gray-box models, as proposed frequently in nonlinear model-predictive control, may be effective for regulatory control, where the model only has to capture the essential dynamic features of the plant near an operating point, but it seems to be less suitable for optimizing control, where the optimal plant performance is aimed at and hence the best stationary values of the inputs and of the controlled variables have to be computed by the controller. As increasingly so-called operator training simulators are built in parallel to the construction of a plant and are continuously used and updated after the commissioning phase, it seems attractive to use the models contained in the simulators also for online optimization. However, the model formulations often are not suitable for this purpose.

Model inaccuracies always have to be taken into account. They not only lead to suboptimal performance but can also cause the constraints, even on measured variables, not to be met in the future because of an insufficient back-off from the constraints. A new approach to deal with uncertainties about model parameters and future influences on the process is multistage scenario-based optimization with recourse. Here the model uncertainties are represented by a set of scenarios of parameter variations, and the future availability
of additional information is taken into account. It has been demonstrated that this is an effective tool to handle model uncertainties and to automatically generate the necessary back-off without being overly conservative (Lucia et al. 2013).

… years, important results on closed-loop stability-guaranteeing formulations have nonetheless been obtained, involving terminal constraints or a quasi-infinite horizon (Angeli et al. 2012; Diehl et al. 2011; Grüne 2013).
chosen standard regulatory control layer leads of exothermic semi-batch polymerizations. Ind Eng
to a close-to-optimal operation, there is no need Chem Res 52:5906–5920
Finkler TF, Kawohl M, Piechottka U, Engell S (2014) Re-
for optimizing control. If the disturbances that alization of online optimizing control in an industrial
affect profitability and cannot be handled well semi-batch polymerization. J Process Control 24:399–
by the regulatory layer (in terms of economic 414
performance) are slow, the combination of reg- Grüne L (2013) Economic receding horizon control with-
out terminal constraints. Automatica 49:725–734
ulatory control and RTO is sufficient. In a more Helbig A, Abel O, Marquardt W (2000) Structural
dynamic situation or for complex nonlinear mul- concepts for optimization based control of transient
tivariable plants, the idea of direct performance processes. In: Nonlinear model predictive control.
optimizing control should be explored and im- Allgöwer F and Zheng A, eds. Birkhäuser, Basel
pp 295–311
plemented if significant gains can be realized in Idris IAN, Engell S (2012) Economics-based NMPC
simulations. strategies for the operation and control of a contin-
uous catalytic distillation process. J Process Control
22:1832–1843
Lucia S, Finkler T, Basak D, Engell S (2013) A new
Cross-References Robust NMPC Scheme and its application to a semi-
batch reactor example. J Process Control 23:1306–
Control and Optimization of Batch 1319
Processes Marlin TE, Hrymak AN (1997) Real-time operations op-
timization of continuous processes. In: Proceedings of
Control Hierarchy of Large Processing Plants: CPC V, Lake Tahoe, AIChE symposium series, vol 93,
An Overview pp 156–164
Control Structure Selection Morari M, Stephanopoulos G, Arkun Y (1980) Studies
Economic Model Predictive Control in the synthesis of control structures for chemical
processes, part I. AIChE J 26:220–232
Extended Kalman Filters Ochoa S, Wozny G, Repke J-U (2010) Plantwide opti-
Model-Predictive Control in Practice mizing control of a continuous bioethanol production
Moving Horizon Estimation process. J Process Control 20:983–998
Pham LC, Engell S (2011) A procedure for systematic
Particle Filters
control structure selection with application to reac-
Real-Time Optimization of Industrial Pro- tive distillation. In: Proceedings of 18th IFAC world
cesses congress, Milan, pp 4898–4903
Qin SJ, Badgwell TA (2003) A survey of industrial
model predictive control technology. Control Eng
Pract 11:733–764
Bibliography Rao CV, Rawlings JB, Mayne DQ (2003) Constrained
state estimation for nonlinear discrete-time systems.
Angeli D, Amrit R, Rawlings J (2012) On average per- IEEE Trans Autom Control 48:246–258
formance and stability of economic model predictive Rawlings J, Amrit R (2009) Optimizing process eco-
control. IEEE Trans Autom Control 57(7):1615–1626 nomic performance using model predictive control In:
Bartusiak RD (2005) NMPC: a platform for optimal Nonlinear model predictive control – towards new
control of feed- or product-flexible manufacturing. challenging applications. Springer, Berlin, pp 119–
Preprints International workshop on assessment and 138
future directions of NMPC, Freudenstadt, pp 3–14 Rolandi PA, Romagnoli JA (2005) A framework for
Diehl M, Amrit R, Rawlings J, Angeli D (2011) A Lya- online full optimizing control of chemical processes.
punov function for economic optimizing model pre- In: Proceedings of ESCAPE 15. Barcelona, Elsevier,
dictive control. IEEE Trans Autom Control 56(3):703– pp 1315–1320
707 Skogestad S (2000) Plantwide control: the search for the
Engell S (2006, 2007) Feedback control for optimal self-optimizing control structure. J Process Control
process operation. Plenary paper IFAC ADCHEM, 10:487–507
Gramado, 2006; J Process Control 17:203–219 Toumi A, Engell S (2004) Optimization-based control of
Erdem G, Abel S, Morari M, Mazzotti M, Morbidelli a reactive simulated moving bed process for glucose
M (2004) Automatic control of simulated moving isomerization. Chem Eng Sci 59:3777–3792
beds. Part II: nonlinear isotherms. Ind Eng Chem Res Zanin AC, Tvrzska de Gouvea M, Odloak D (2000)
43:3895–3907 Industrial implementation of a real-time optimization
Finkler T, Lucia S, Dogru M, Engell S (2013) A strategy for maximizing production of LPG in a FCC
simple control scheme for batch time minimization unit. Comput Chem Eng 24:525–531
Modeling of Dynamic Systems from First Principles
dx/dt = f(x, u; θ),   y = h(x, u; θ)     (1)

where u is the input, y is the output, and the state x contains internal physical variables, while θ contains parameters. Typically all of these are vectors. The model is known as a state space model. In many cases some elements in θ are unknown.

Physical Analogies and General Structures

Physical Analogies
Physicists and engineers have noted that modeling in different areas of physics often gives very similar models. The term "analogies" is
often used in modeling to describe this fact. Here we will show some analogies between electrical and mechanical phenomena. Consider the electric circuit given in Fig. 1. An ideal voltage source is connected in series with an inductor, a resistor, and a capacitor.

[Fig. 1: electric circuit with voltage source u, inductor L, resistor R, and capacitor C. Fig. 2: mechanical system with external force F applied to mass m, damper b, and spring k.]

Using u and v to denote the voltages over the voltage source and capacitor, respectively, and i to denote the current, a mathematical model is

C dv/dt = i,   L di/dt + Ri + v = u     (4)

The first equation uses the definition of capacitance and the second one uses Kirchhoff's voltage law. Compare this to the mechanical system of Fig. 2 where an external force F is applied to a mass m that is also connected to a damper b and a spring with spring constant k. If S is the elongation force of the spring and w the velocity of the mass, a system model is

k⁻¹ dS/dt = w,   m dw/dt + bw + S = F     (5)

The two models coincide if variables and coefficients are matched according to

u ↔ F,   i ↔ w,   v ↔ S     (6)
L ↔ m,   R ↔ b,   C ↔ k⁻¹     (7)

Note that the products (voltage)·(current) and (force)·(velocity) give the power.

Bond Graphs
The bond graph is a tool to do systematic modeling based on the analogies of the previous section. The basic element is the bond

e
⇀
f

formed by a half arrow showing the direction of positive energy flow. Two variables are associated with the bond, the effort variable e and the flow variable f. The product ef of these variables gives the power. In the electric domain e is voltage and f is current. For mechanical systems e is force, while f is velocity. Bond graph theory has three basic components to describe storage and dissipation of energy. The relations

α de/dt = f,   β df/dt = e,   γf = e     (8)
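The analogy between the circuit model (4) and the mechanical model can be checked numerically. The sketch below is illustrative only: the parameter values are invented, and forward Euler is used merely for demonstration. Under the correspondences between u and F, i and w, v and S, L and m, R and b, C and 1/k, the two simulations produce the same trajectories.

```python
# Forward-Euler simulation of the series RLC circuit (4) and of the
# analogous mass-spring-damper system; parameter values are illustrative.

def rlc(L, R, C, u, dt=1e-3, steps=5000):
    # C dv/dt = i ;  L di/dt = u - R i - v
    i = v = 0.0
    for _ in range(steps):
        i, v = i + dt * (u - R * i - v) / L, v + dt * i / C
    return i, v

def mass_spring_damper(m, b, k, F, dt=1e-3, steps=5000):
    # (1/k) dS/dt = w ;  m dw/dt = F - b w - S
    w = S = 0.0
    for _ in range(steps):
        w, S = w + dt * (F - b * w - S) / m, S + dt * k * w
    return w, S

# Correspondences: u <-> F, i <-> w, v <-> S, L <-> m, R <-> b, C <-> 1/k
print(rlc(L=2.0, R=0.5, C=4.0, u=1.0))
print(mass_spring_damper(m=2.0, b=0.5, k=0.25, F=1.0))  # matches the circuit's (i, v)
```

Current i and velocity w (and likewise v and S) coincide step by step, which is exactly the content of the analogy.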
Modeling of Dynamic Systems from First Principles, Fig. 3 Bond graph for mechanical or electric system

F = N + S + T     (9)

Here T and N denote the forces associated with the damper and the mass, respectively. From (8) it follows that

k⁻¹ dS/dt = w,   m dw/dt = N,   bw = T     (10)

Together (9) and (10) give the same model as (5). Using the correspondences (6), (7) it is seen that the same bond graph can also represent the electric circuit (4). An overview of bond graph modeling is given in Rosenberg and Karnopp (1983). A general overview of modeling, including bond graphs and the connection with identification, can be found in Ljung and Glad (1994b).

Port-Controlled Hamiltonian Systems
Many physical processes can be modeled as port Hamiltonian systems. This means that there are state variables x, a scalar function H, and a skew symmetric matrix M so that the system dynamics is

dx/dt = M ∇H(x)     (11)

The function H is called the Hamiltonian of the system. To be useful in a control context, this model class has to be extended to handle inputs. For the mechanical example, H₁ denotes the energy stored in the spring (12), and with x₂ = mw the momentum of the mass, the stored kinetic energy is

H₂(x₂) = x₂²/(2m),   ∂H₂/∂x₂ = m⁻¹x₂ = w     (13)

Let H = H₁ + H₂ be the total energy. Then the following relation holds:

d/dt [x₁; x₂] = ([0 1; −1 0] − [0 0; 0 b]) [∂H/∂x₁; ∂H/∂x₂] + [0; 1] F     (14)

This model is a special case of

dx/dt = (M − R) ∇H(x) + Bu     (15)

where M is a skew symmetric and R a nonnegative definite matrix, respectively. This model type is called a port-controlled Hamiltonian system with dissipation. Without external input (B = 0) and dissipation (R = 0), it reduces to an ordinary Hamiltonian system of the form (11). For systems generated by simple bond graphs, it can be shown that the junction structure gives the skew symmetric M, while the R elements give the matrix R. The storage of energy in I and C elements is reflected in H. The reader is directed to Duindam et al. (2009) for a description of the port-Hamiltonian approach to modeling.

Component-Based Models and Modeling Languages

Since engineering systems are usually assembled from components, it is natural to treat their mathematical models in the same way. This is the idea behind block-oriented models where the output
of one model is connected to the input of another one:

u₁ → [system 1] → y₁ = u₂ → [system 2] → y₂

A nice feature of this block connection is that the state space description is preserved. Suppose that the individual models are of the form (1)

dxᵢ/dt = fᵢ(xᵢ, uᵢ),   yᵢ = hᵢ(xᵢ, uᵢ),   i = 1, 2     (16)

Then the connection u₂ = y₁ immediately gives the state space model

d/dt [x₁; x₂] = [f₁(x₁, u₁); f₂(x₂, h₁(x₁, u₁))],   y₂ = h₂(x₂, h₁(x₁, u₁))     (17)

with input u₁, output y₂, and state (x₁, x₂). This fact is the basis of block-oriented modeling and simulation tools like the MATLAB-based Simulink. Unfortunately the preservation of the state space structure does not extend to more general connections of systems. Consider, for instance, two pieces of rotating machinery described by

Jᵢ dωᵢ/dt = −bᵢωᵢ + Mᵢ,   i = 1, 2     (18)

where ωᵢ is the angular velocity, Mᵢ the external torque, Jᵢ the moment of inertia, and bᵢ the damping coefficient. Suppose the pieces are joined together so that they rotate with the same angular velocity. The mathematical model would then be

J₁ dω₁/dt = −b₁ω₁ + M₁
J₂ dω₂/dt = −b₂ω₂ + M₂
ω₁ = ω₂
M₁ = −M₂     (19)

This is no longer a state space model of the form (1), but a mixture of dynamic and static relationships, usually referred to as a differential algebraic equation (DAE). The difference from the connection of blocks in block diagrams is that now the connection is not between an input and an output. Instead there are the equations ω₁ = ω₂ and M₁ = −M₂ that destroy the state space structure. There exist modeling languages like Modelica (Fritzson 2000; Tiller 2001) or SimMechanics in MATLAB (MathWorks 2002) that accept this more general type of model. It is then possible to form model libraries of basic components that can be interconnected in very general ways to form models of complex systems. However, this more general structure poses some challenges when it comes to analysis and simulation that are described in the next section.

Differential Algebraic Equations (DAE)

The model (19) is a special case of the general differential algebraic equation

F(dz/dt, z, u) = 0     (20)

A good description of both theory and numerical properties of such equations is given in Kunkel and Mehrmann (2006). In many cases it is possible to split the variables and equations in such a way that the following structure is achieved:

F₁(dz₁/dt, z₁, z₂, u) = 0,   F₂(z₁, z₂, u) = 0     (21)

If z₂ can be solved from the second equation and substituted into the first one, and if dz₁/dt can then be solved from the first equation, the problem is reduced to an ordinary differential equation in z₁. Often, however, the situation is not as simple as that. For the example (19) an addition of the first two equations gives

(J₁ + J₂) dω₁/dt = −(b₁ + b₂)ω₁     (22)

which is a standard first-order system description. Note, however, that in order to arrive at this result, the relation ω₁ = ω₂ has to be differentiated. This DAE thus includes an implicit differentiation.
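A quick numerical illustration of the reduction from (19) to (22), with invented parameter values: the common angular velocity is integrated via the ODE (22), and the internal coupling torque M₁, which the DAE determines only implicitly, is recovered afterwards from the first equation of (19).

```python
import math

# Reduction of the coupled-shafts DAE (19) to the ODE (22):
#   (J1 + J2) dw/dt = -(b1 + b2) w,  with w = w1 = w2.
# The constraint w1 = w2 had to be differentiated to get here; the
# internal torque M1 is recovered from J1 dw/dt = -b1 w + M1.
J1, J2, b1, b2 = 1.0, 2.0, 0.3, 0.7   # illustrative values
w, dt = 5.0, 1e-3                     # initial common speed, Euler step

for _ in range(1000):                 # integrate over 1 s
    dw = -(b1 + b2) * w / (J1 + J2)
    M1 = J1 * dw + b1 * w             # torque transmitted through the joint
    w += dt * dw

print(round(w, 3), round(5 * math.exp(-1 / 3), 3))  # Euler result vs exact solution
```

The exact solution of (22) is w(t) = w(0)·exp(−(b₁+b₂)t/(J₁+J₂)), which the Euler iteration approaches as dt shrinks.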
In the general case one can investigate how many times (20) has to be differentiated in order to get an explicit expression for dz/dt. This number is called the (differentiation) index. Both theoretical analysis and practical experience show that the numerical difficulties encountered when solving a DAE increase with increasing index; see, e.g., the classical reference Brenan et al. (1987). It turns out that mechanical systems in particular give high-index models when constructed by joining components, and this has been an obstacle to the use of DAE models. For linear DAE models the role of the index can be seen more easily. A linear model is given by

E dz/dt + Fz = Gu     (23)

where the matrix E is singular (if E is invertible, multiplication with E⁻¹ from the left gives an ordinary differential equation). The system can be transformed by multiplying with P from the left and changing variables with z = Qw (P, Q nonsingular matrices). The transformed model is now

PEQ dw/dt + PFQw = PGu     (24)

If λE + F is nonsingular for some value of the scalar λ, then it can be shown that there is a choice of P and Q such that (24) takes the form of a decoupled system containing a nilpotent matrix N.

This expression shows that an index k > 1 implies differentiation of the input (unless NB₂ happens to be zero). This in turn implies potential difficulties, e.g., if u is a measured signal.

Identification of DAE Models
The extended use of DAE models in modern modeling tools also means that there is a need to use these models in system identification. To fully use system identification theory, one needs a stochastic model of disturbances. The inclusion of such disturbances leads to a class of models described as stochastic differential algebraic equations. The treatment of such models leads to some interesting problems. In the previous section it was seen that DAE models often contain implicit differentiations of external signals. If a DAE model is to be well posed, this differentiation must not affect signals modeled as white noise. In Gerdin et al. (2007), conditions are given that guarantee that stochastic DAEs are well posed. There it is also described how a maximum likelihood estimate can be made for DAE models, laying the basis for parameter estimation.

Differential Algebra
For the case where models consist of polynomial equations, it is possible to manipulate them in a very systematic way. The model (20) is then generalized to
a set of polynomial equations, where y contains measured signals, z contains unmeasured variables, and θ is a vector of parameters to be identified, while F is a vector of polynomials in these variables. It was shown in Ljung and Glad (1994a) that there is an algorithm giving for each parameter θₖ a polynomial relation (30). This relation can be regarded as a polynomial in θₖ where all coefficients are expressed in measured quantities. The local or global identifiability will then be determined by the number of solutions. If θₖ is unidentifiable, then no equation of the form (30) will exist, and this fact will also be demonstrated by the output of the algorithm.

Summary and Future Directions

There is no general method to derive models from first principles. However, modeling techniques based on bond graphs or port-controlled Hamiltonian systems offer a systematic approach for large model classes. Modeling languages like Modelica make the practical work with modeling much easier. A fundamental problem that comes up is that models are not necessarily in state space form but are so-called differential algebraic equation (DAE) models. Much of the future work is expected to deal with the handling of DAE models and with the development of modeling languages.

A general reference for port-Hamiltonian techniques is Duindam et al. (2009). The Modelica modeling language is treated in Tiller (2001) and Fritzson (2000). The former emphasizes the physical modeling point of view; the latter also gives details of the language itself.

Bibliography

Brenan KE, Campbell SL, Petzold LR (1987) Numerical solution of initial-value problems in differential-algebraic equations. Classics in applied mathematics (Book 14). SIAM, Philadelphia
Duindam V, Macchelli A, Stramigioli S, Bruyninckx H (eds) (2009) Modeling and control of complex physical systems: the port-Hamiltonian approach. Springer, Berlin
Fritzson P (2000) Principles of object-oriented modeling and simulation with Modelica 2.1. IEEE/Wiley Interscience, Piscataway NJ
Gerdin M, Schön T, Glad T, Gustafsson F, Ljung L (2007) On parameter and state estimation for linear differential-algebraic equations. Automatica 43:416–425
Kunkel P, Mehrmann V (2006) Differential-algebraic equations. European Mathematical Society, Zürich
Ljung L, Glad ST (1994a) On global identifiability of arbitrary model parameterizations. Automatica 30(2):265–276
Ljung L, Glad T (1994b) Modeling of dynamic systems. Prentice Hall, Englewood Cliffs
Ritt JF (1950) Differential algebra. American Mathematical Society, Providence
Rosenberg RC, Karnopp D (1983) Introduction to physical system dynamics. McGraw-Hill, New York
The MathWorks (2002) SimMechanics user's guide. The MathWorks, Natick
Tiller MM (2001) Introduction to physical modeling with Modelica. Kluwer Academic, Boston
sharing some basic relevant features, such as minimality in the number of primitives, locality of the states and actions (with consequences for model construction), or temporal realism. The global state of a system is obtained by the juxtaposition of the different local states. We should initially distinguish between autonomous formalisms and those extended by interpretation. Models in the latter group are obtained by restricting the underlying autonomous behaviors by means of constraints that can be related to different kinds of external events, in particular to time. This article first describes place/transition nets (PT-nets), by default simply called Petri nets (PNs). Other formalisms are then mentioned. As a system theory modeling paradigm for concurrent DES, Petri nets are used in a wide variety of application fields.

Keywords

Condition/event nets (CE-nets); Continuous Petri nets (CPNs); Diagrams; Fluidization; Grafcet; Hybrid Petri nets (HPNs); Marking; Petri nets; Place/transition nets (PT-nets); High-level Petri nets (HLPNs)

Introduction

Petri nets (PNs) are able to model concurrent and distributed DES (Models for Discrete Event Systems: An Overview). They constitute a powerful family of formalisms with different expressive purposes and power. They may be applied, inter alia, to modeling, logical analysis, performance evaluation, parametric optimization, dynamic control (minimum makespan, supervisory control, or other kinds), diagnosis, and implementation issues (eventually fault tolerant). Hybrid and continuous PNs are particularly useful when some parts of the system are highly populated. Being multidisciplinary, formalisms belonging to the Petri nets paradigm may cover several phases of the life cycle of complex DES.

A Petri net can be represented as a bipartite directed graph provided with arc inscriptions; alternatively, this structure can be represented in algebraic form using some matrices. As in the case of differential equations, an initial condition or state should be defined in order to represent a dynamic system. This is done by means of an initial distributed state. The English translation of Carl Adam Petri's seminal work, presented in 1962, is Petri (1966).

Untimed Place/Transition Net Systems

A place/transition net (PT-net) can be viewed as N = ⟨P, T, Pre, Post⟩, where:
• P and T are disjoint and finite nonempty sets of places and transitions, respectively.
• Pre and Post are |P| × |T| sized, natural-valued (zero included), incidence matrices. The net is said to be ordinary if Pre and Post are valued on {0, 1}. Weighted arcs permit the abstract modeling of bulk services and arrivals.

A PT-net is a structure. The Pre (Post) function defines the connections from places to transitions (transitions to places). Those two functions can alternatively be defined as weighted flow relations (nets as graphs). Thus, PT-nets can be represented as bipartite directed graphs with places (p, using circles) and transitions (t, using bars or rectangles) as nodes: N = ⟨P, T, F, W⟩, where F ⊆ (P × T) ∪ (T × P) is the flow relation (set of directed arcs, with dom(F) ∪ range(F) = P ∪ T), and W: F → ℕ⁺ assigns a natural weight to each arc.

The net structure represents the static part of the DES model. Furthermore, a "distributed state" is defined over the set of places, known as the marking. This is "numerically quantified" (not in an arbitrary alphabet, as in automata), associating natural values to the local state variables, the places. If a place p has a value ν (m(p) = ν), it is said to have ν tokens (frequently depicted in graphic terms with ν black dots or just the number inside the place). The places are "state variables," while the markings are their "values"; the global state is defined through the concatenation of local states. The net structure, provided with an initial marking m₀, constitutes a net system (N, m₀).
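The algebraic representation just described can be sketched for a small, purely hypothetical net; the token flow (incidence) matrix C = Post − Pre derived here is the object on which structural techniques operate.

```python
# A PT-net N = <P, T, Pre, Post> in matrix form, for a hypothetical
# two-place, two-transition cycle. Rows are places, columns transitions;
# entries are natural-valued arc weights (0 = no arc).
Pre  = [[1, 0],
        [0, 1]]      # Pre[p][t]: weight of the arc from place p to transition t
Post = [[0, 1],
        [1, 0]]      # Post[p][t]: weight of the arc from transition t to place p

# This net is ordinary: every entry is 0 or 1.
assert all(x in (0, 1) for M in (Pre, Post) for row in M for x in row)

# Token flow (incidence) matrix C = Post - Pre.
C = [[Post[p][t] - Pre[p][t] for t in range(2)] for p in range(2)]
print(C)  # [[-1, 1], [1, -1]]
```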
Modeling, Analysis, and Control with Petri Nets

Modeling, Analysis, and Control with Petri Nets, Fig. 1 Most basic PN constructions: the logical OR is present around places, in choices (or branches) and attributions (or meets); the logical AND is formed around transitions, in joins (or waits or rendezvous) and forks (or splits)

Modeling, Analysis, and Control with Petri Nets, Fig. 2 Only transitions b and c are initially enabled. The results of firing b or c are shown subsequently
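The behavior that Fig. 2 illustrates (enabling, then firing) follows a simple rule: t is enabled at marking m iff m(p) ≥ Pre[p][t] for every place p, and firing it yields m′ = m − Pre[·][t] + Post[·][t]. A sketch on a small hypothetical net (the figure's own net is not reproduced here):

```python
# Enabling and firing in a marked PT-net (hypothetical three-place net).
Pre  = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]]
Post = [[0, 0, 1],
        [1, 0, 0],
        [0, 1, 0]]
m0 = [1, 1, 0]                       # initial marking

def enabled(m, t):
    # t is enabled iff every input place holds enough tokens
    return all(m[p] >= Pre[p][t] for p in range(len(m)))

def fire(m, t):
    # firing consumes Pre[., t] tokens and produces Post[., t] tokens
    assert enabled(m, t)
    return [m[p] - Pre[p][t] + Post[p][t] for p in range(len(m))]

print([t for t in range(3) if enabled(m0, t)])  # [0, 1]
print(fire(m0, 1))                              # [0, 0, 1]
```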
• It represents a necessary but not sufficient condition for reachability; the problem is that the existence of a solution does not guarantee that a corresponding sequence is firable from m₀; thus, certain solutions – called spurious (Silva et al. 1998) – are not reachable. This implies that – except in certain net system subclasses – only semi-decision algorithms can usually be derived.
• All variables are natural numbers, which implies computational complexity.

It should be pointed out that in finite-state machines, the state is a single variable taking values in a symbolic unstructured set, while in PT-net systems, it is structured as a vector of nonnegative integers. This allows analysis techniques that do not require the enumeration of the state space.

At a structural level, observe that the negation is missing in Fig. 1; its inclusion leads to the so-called inhibitor arcs, an extension in expressive power. In its most basic form, if the place at the origin of an inhibitor arc is marked, it "inhibits" the enabling of the target transition. PT-net systems can model infinite-state systems, but not Turing machines. PT-net systems provided with inhibitor arcs (or priorities on the firing of transitions) can do it.

With this conceptually simple formalism, it is not difficult to express basic synchronization schemas (Fig. 3). All the illustrated examples use joins. When weights are allowed in the arcs, another kind of synchronization appears: several copies of the same resource are needed (or produced) in a single operation. Being able to express concurrency and synchronization, when viewing the system at a higher level, it is possible to build cooperation and competition relationships.

Analysis and Control of Untimed PT Models

The behavior of a concurrent (eventually distributed) system is frequently difficult to understand and control. Thus, misunderstandings and mistakes are frequent during the design cycle. A way of cutting down the cost and duration of the design process is to express in a formalized way properties that the system should enjoy and to use formal proof techniques. Errors can be eventually detected close to the moment they are introduced, reducing their propagation to subsequent stages. The goal in verification is to ensure that a given system is correct with respect to its specification (perhaps expressed in temporal-logic terms) or to a certain set of predetermined properties.

Among the most basic qualitative properties of "net systems" are the following: (1) reachability of a marking from a given one; (2) boundedness, characterizing finiteness of the state space; (3) liveness, related to potential fireability of all transitions starting on an arbitrary reachable marking (deadlock-freeness is a weaker condition in which only global infinite fireability of the net system model is guaranteed, even if some transitions no longer fire); (4) reversibility, characterizing recoverability of the initial marking from any reachable one; and (5) mutual exclusion of two places, dealing with the impossibility of reaching markings in which both places are simultaneously marked.

All the above are behavioral properties, which depend on the net system (N, m₀). In practice, sometimes problems with a net model are rooted in the net structure; thus, the study of the structural counterpart of certain behavioral properties may be of interest. For example, a "net" is structurally bounded if it is bounded for any initial marking; a "net" is structurally live if an initial marking exists that makes the net system live (otherwise stated, structural non-liveness reflects non-liveness for arbitrary initial markings, a pathology of the net).

Basic techniques to analyze net systems include: (1) enumeration, in its most basic form based on the construction of a reachability graph (a sequentialized view of the behavior); if the net system is not bounded, losing some information, a finite coverability graph can be constructed; (2) transformation, based on an iterative rewriting process in which a net system enjoys a certain property if and only if a transformed ("simpler" to analyze) one also does; if the new system is easier to analyze, and the transformation is computationally cheap, the process may be extremely interesting; and (3) structural, based on the structure of the net model.
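The enumeration technique (1) can be sketched as a breadth-first construction of the reachability graph; the net below is a hypothetical bounded example, so the construction terminates.

```python
from collections import deque

# Breadth-first construction of the reachability graph of a small
# (hypothetical) net system; vertices are markings, edges are firings.
Pre  = [[1, 0], [0, 1]]
Post = [[0, 1], [1, 0]]
m0 = (1, 0)

def successors(m):
    for t in range(2):
        if all(m[p] >= Pre[p][t] for p in range(2)):   # t enabled at m
            yield t, tuple(m[p] - Pre[p][t] + Post[p][t] for p in range(2))

reachable, frontier, edges = {m0}, deque([m0]), []
while frontier:
    m = frontier.popleft()
    for t, m2 in successors(m):
        edges.append((m, t, m2))
        if m2 not in reachable:
            reachable.add(m2)
            frontier.append(m2)

print(sorted(reachable))  # [(0, 1), (1, 0)]: finite state space, so bounded
print(len(edges))         # 2 firings in the graph
```

For an unbounded net system this loop would not terminate; that is where the finite coverability graph mentioned above comes in.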
Modeling, Analysis, and Control with Petri Nets, Fig. 3 Basic synchronization schemes: (1) Join or rendezvous, RV; (2) Semaphore, S; (3) Mutual exclusion semaphore (mutex), R, representing a shared resource; (4) Symmetric RV built with two semaphores; (5) Asymmetric RV built with two semaphores (master/slave); (6) Fork-join (or par-begin/par-end); (7) Non-recursive subprogram (places i and j cannot be simultaneously marked – must be in mutex – to remember the returning point; for simplicity, it is assumed that the subprogram is single input/single output); (8) Guard (a self-loop from a place through a transition); its role is like a traffic light: if at least one token is present at the place, it allows the evolution, but it is not consumed. Synchronizations can also be modeled by the weights associated to the arcs going to transitions

In this context, three kinds of related notions that must be differentiated are the following: (1) some natural vectors (left and right annullers of the token flow matrix C: P-semiflows and T-semiflows), (2) some invariant laws (token conservation and repetitive behaviors), and (3) some peculiar subnets (conservative and consistent components, generated by the subsets of nodes in the P- and T-semiflows, respectively).

More than analysis, control leads to synthesis problems. The idea is to enforce the given system in order to fulfill a specification (e.g., to enforce certain mutual exclusion properties). Technically speaking, the idea is to "add" some elements in order to constrain the behavior in such a way that a correct execution is obtained. Questions related to control, observation, diagnosis, or identification are all areas of ongoing research.

With respect to classical control theory, there are two main differences: models are DES and untimed (autonomous, fully nondeterministic, eventually labeling the transitions in order to be able to consider the PN languages). Let us remark that for control purposes, the transitions should be partitioned into controllable (when enabled, you can either force or block the firing) and uncontrollable (if enabled, the firing is nondeterministic). A natural approach to synthesize a control is to start modeling the plant dynamics (by means of a PN, P) and adopting a specification for the desired closed-loop system (S). The goal is to compute a controller (L) such that S equals the parallel composition of P and L; in other words, controllers (called "supervisors") are designed to ensure that only behaviors consistent with the specification may occur. The previous equality is not always possible, and the goal is usually relaxed to minimally limit the behavior within the specified legality (i.e., to compute maximally permissive controllers). For an approach in the framework of finite-state machines and regular languages, see Supervisory Control of Discrete-Event Systems. The synthesis in the framework of Petri nets, with goals such as enforcing generalized mutual exclusion constraints on markings or avoiding deadlocks, for example, can be efficiently approached by means of the so-named structure theory, based on the direct exploitation of the structure of the net model (using graph or mathematical programming theories and algorithms, where the initial marking is a parameter).

Similarly, transitions (or places) can be partitioned into observable and unobservable. Many observability problems may be of interest; for example, observing the firing of a subset of transitions to compute the subset of markings in which the system may be. Related to observability, diagnosis is the process of detecting a failure (any deviation of a system from its intended behavior) and identifying the cause of the abnormality. Diagnosability, like observability or controllability, is a logical criterion. If a model is diagnosable with respect to a certain subset of possible faults (i.e., it is possible to detect the occurrence of those faults in finite time), a diagnoser can be constructed (see section "Diagnosis and Diagnosability Analysis of Petri Nets" in Diagnosis of Discrete Event Systems). Identification of DES is also a question that has required attention in recent years. In general, the starting point is a behavioral observation, the goal being to construct a PN model that generates the observed behavior, either from examples/counterexamples of its language or from the structure of a reachability graph. So the results are derived models, not human-made models (i.e., not made by designers).

The Petri Nets Modeling Paradigm

Along the life cycle of DES, designers may deal with basic modeling, analysis, and synthesis from different perspectives together with implementation and operation issues. Thus, the designer may be interested in expressing the basic structure, understanding untimed possible behaviors, checking logic properties on the model when provided with some timing (e.g., in order to guarantee if a certain reaction is possible before 3 ms; something relevant in real-time systems), computing some performance indices on timed models (related to the throughput in the firing of a given transition, or to the length of a waiting line,
expressed as the number of tokens in a place), computing a schedule or control that optimizes a certain objective function, decomposing the model in order to prepare an efficient implementation, efficiently determining redundancies in order to increase the degree of fault tolerance, etc. For these different tasks, different formalisms may be used. Nevertheless, it seems desirable to have a family of related formalisms rather than a collection of "unrelated" or weakly related formalisms. The expected advantages would include coherence among models usable in different phases, economy in the transformations, and synergy in the development of models and theories.

Other Untimed PN Formalisms: Levels of Expressive Power
PT-net systems are more powerful than condition/event (CE) systems, roughly speaking the basic seminal formalism of Carl Adam Petri in which places can be marked only with zero or one token (Boolean marking). CE-systems can model only finite-state systems. As already said, "extensions" of the expressive power of untimed PT-net systems to the level of Turing machines are obtained by adding inhibitor arcs or priorities to the firing of transitions.

An important idea is adding the notion of individuals to tokens (e.g., from anonymous to labeled or colored tokens). Information in tokens allows the objects to be named (they are no longer indistinguishable) and dynamic associations to be created. Moving from PT-nets to so-called high-level PNs (HLPNs) is something like "moving from assembler to a high-level programming language," obtaining more compact and structured models, while keeping the same theoretical expressive power of PT-nets (i.e., we can speak of "abbreviations," not of "extensions"). In other cases, object-oriented concepts from computer programming are included in certain HLPNs. The analysis techniques of HLPNs can be approached with techniques based on enumeration, transformation, or structural considerations and simulation, generalizing those developed for PT-net systems.

Extending Net Systems with External Events and Time: Nonautonomous Formalisms

When dealing with net systems that interact with some specific environment, the marking evolution rule must be slightly modified. This can be done in an enormous number of ways, considering external events and logical conditions as inputs to the net model, in particular some depending on time. The same interpretation given to a graph in order to define a finite-state diagram can be used to define a marking diagram, a formalism in which the key point is to recognize that the state is now numerical (for PT-net systems) and distributed. For example, drawing a parallel with Moore automata, the transitions should be labeled with logical conditions and events, while unconditional actions are associated to the places. If a place is marked, the associated actions are emitted.

Even if only time-based interpretations are considered, there are a large number of successful proposals for formalisms. For example, it should be specified if time is associated to the firing of
high-level programming languages,” or, at transitions (T-timing may be atomic or in three
the computational level, like “moving from phases), to the residence of tokens in places (P-
pure numerical to a symbolic level.” There timed), to the arcs of the net, as tags to the
are many proposals in this respect, the more tokens, etc. Moreover, even for T-timed models,
important being predicate/transition nets and there are many ways of defining the timing: time
colored PNs. Sometimes, this type of abstraction intervals, stochastic or possibilistic forms, and
has the same theoretical expressiveness as the deterministic case as a particular one. If the
PT-net systems (e.g., colored PNs if the firing of transitions follows exponential pdfs, and
number of colors is finite); in other words, the conflict resolution follows the race policy
high-level views may lead to more compact (i.e., fire in conflicts the first that ends the task,
not a preselection policy), the underlying Markov chain is isomorphic to the reachability graph (due to the Markovian “memoryless” property). Moreover, the addition of immediate transitions (whose firing is instantaneous) enriches the practical modeling possibilities, eventually complicating the analysis techniques. Timed models are used to compute minimum and maximum time delays (when time intervals are provided, in real-time problems) or performance figures (throughput, utilization rate of resources, average number of tokens – clients – in services, etc.). For performance evaluation, there is an array of techniques to compute bounds, approximated values, or exact values, sometimes generalizing those used in certain classes of queueing network models. Simulation techniques are frequently very helpful in practice to produce an educated guess about the expected performance.

Time constraints on Petri nets may change logical properties of models (e.g., mutual exclusion constraints, deadlock-freeness, etc.), calling for new analysis techniques. For example, certain timings on transitions can transform a live system into a non-live one: if deterministic times are associated to the transitions of the net system in Fig. 2, with a race policy and the time of transition c smaller than that of transition a, then transition b cannot be fired after firing transition d; thus the timed system is non-live, while the untimed model was live. By the addition of some time constraints, the transformation of a non-live model into a live one is also possible. So additional analysis techniques need to be considered, redefining the state, which now depends on time as well, not just on the marking.

Finally, as in any DES, the optimal control of timed Petri net models (scheduling, sequencing, etc.) may be approached by techniques such as dynamic programming or perturbation analysis (presented in the context of queueing networks and Markov chains; see Perturbation Analysis of Discrete Event Systems). In practice, those problems are frequently approached by means of partially heuristic strategies. About the diagnosis of timed Petri nets, see Diagnosis of Discrete Event Systems. Of course, all these tasks can be done with HLPNs.

Fluid and Hybrid PN Models

Different ideas may lead to different kinds of hybrid PNs. One is to fluidize (here, to relax the natural numbers of discrete markings into the nonnegative reals) the firing of transitions that are enabled “most of the time.” The relaxed model then has discrete and continuous transitions, thus also discrete and continuous places. If all transitions are fluidized, the PN system is said to be fluid or continuous, even if technically it is a hybrid one. In this approach, the main goal is to try to overcome the state explosion problem inherent to enumeration techniques. Proceeding in that way, some computationally NP-hard problems may become much easier to solve, eventually in polynomial time. In other words, fluidization is an abstraction that tries to make tractable certain real-scale DES problems (see Discrete Event Systems and Hybrid Systems, Connections Between).

When transitions are timed with the so-called infinite server semantics, the PN system can be observed as a time differentiable piecewise affine system. Thus, even if the relaxation “simplifies” computations, it should be taken into account that continuous PNs with infinite server semantics are able to simulate Turing machines. From a different perspective, the steady-state throughput of a given transition may be non-monotonic with respect to the firing rates or the initial marking (e.g., if faster or more machines are used, the uncontrolled system may be slower); moreover, due to the important expressive power, discontinuities may even appear with respect to continuous design parameters such as firing rates.

An alternative way to define hybrid Petri nets is a generalization of hybrid automata: the net system is a DES, but sets of differential equations are associated to the marking of places. If a place is marked, the corresponding differential equations contribute to define its evolution.

Summary and Future Directions

Petri nets designate a broad family of related DES formalisms (a modeling paradigm), each one specifically tailored to approach certain problems. Conceptual simplicity coexists with
powerful modeling, analysis, and synthesis capabilities. From a control theory perspective, much work remains to be done for both untimed and timed formalisms (remember, there are many different ways of timing), particularly when dealing with optimal control of timed models. In engineering practice, approaches to the latter class of problems frequently use heuristic strategies. From a broader perspective, future research directions include improvements required to deal with controllability and the design of controllers, with observability and the design of observers, with diagnosability and the design of diagnosers, and with identification. This work is not limited to the strict DES framework but also applies to analogous problems relating to relaxations into hybrid or fluid approximations (particularly useful when high populations are considered). The distributed nature of systems is more and more frequent and is introducing new constraints, a subject requiring serious attention. In all cases, in contrast with firing-language approaches, the so-named structure theory of Petri nets should gain more interest.

Cross-References

Applications of Discrete-Event Systems
Diagnosis of Discrete Event Systems
Discrete Event Systems and Hybrid Systems, Connections Between
Models for Discrete Event Systems: An Overview
Perturbation Analysis of Discrete Event Systems
Supervisory Control of Discrete-Event Systems

Recommended Reading

Topics related to PNs are considered in well over a hundred thousand papers and reports. The first generation of books concerning this field is Brauer (1980), immediately followed by Starke (1980), Peterson (1981), Brams (1983), Reisig (1985), and Silva (1985). The fact that they are written in English, French, German, and Spanish is proof of the rapid dissemination of this knowledge. Most of these books deal essentially with PT-net systems. Complementary surveys are Murata (1989), Silva (1993), and David and Alla (1994), the latter also considering some continuous and hybrid models. Concerning high-level PNs, Jensen and Rozenberg (1991) is a selection of papers covering the main developments during the 1980s. Jensen and Kristensen (2009) focuses on state space methods and simulation, where elements of timed models are taken into account, but performance evaluation of stochastic systems is not covered. Approaching the present day, relevant works written with complementary perspectives include, inter alia, Girault and Valk (2003), Diaz (2009), David and Alla (2010), and Seatzu et al. (2013). The consideration of time in nets with an emphasis on performance and performability evaluation is addressed in monographs such as Ajmone Marsan et al. (1995), Bause and Kritzinger (1996), Balbo and Silva (1998), and Haas (2002), while timed models under different fuzzy interpretations are the subject of Cardoso and Camargo (1999). Structure-based approaches to controlling PN models are the main subject in Iordache and Antsaklis (2006) and Chen and Li (2013). Different kinds of hybrid PN models are studied in Di Febbraro et al. (2001), Villani et al. (2007), and David and Alla (2010), while a broad perspective on modeling, analysis, and control of continuous (untimed and timed) PNs is provided by Silva et al. (2011).

DiCesare et al. (1993) and Desrochers and Al-Jaar (1995) are devoted to the applications of PNs to manufacturing systems. A comprehensive updated introduction to business process systems and PNs can be found in van der Aalst and Stahl (2011). Special volumes dealing with other monographic topics are, for example, Billington et al. (1999), Agha et al. (2001), and Cortadella et al. (2002). An application domain for Petri nets emerging over the last two decades is systems biology, a model-based approach devoted to the analysis of biological systems (Koch et al. 2011; Wingender 2011). Furthermore, it should be pointed out that Petri nets have also been employed in many other application domains (e.g., from logistics to musical systems).
For an overall perspective of the field over the five decades that have elapsed since the publication of Carl Adam Petri’s PhD thesis, including historical, epistemological, and technical aspects, see Silva (2013).

Bibliography

Agha G, de Cindio F, Rozenberg G (eds) (2001) Concurrent object-oriented programming and Petri nets, advances in Petri nets. Volume 2001 of LNCS. Springer, Berlin/Heidelberg/New York
Ajmone Marsan M, Balbo G, Conte G, Donatelli S, Franceschinis G (1995) Modelling with generalized stochastic Petri nets. Wiley, Chichester/New York
Balbo G, Silva M (eds) (1998) Performance models for discrete event systems with synchronizations: formalisms and analysis techniques. Proceedings of the human capital and mobility MATCH performance advanced school, Jaca. Available online: http://webdiis.unizar.es/GISED/?q=news/matchbook
Bause F, Kritzinger P (1996) Stochastic Petri nets: an introduction to the theory. Vieweg, Braunschweig
Billington J, Diaz M, Rozenberg G (eds) (1999) Application of Petri nets to communication networks, advances in Petri nets. Volume 1605 of LNCS. Springer, Berlin/Heidelberg/New York
Brams GW (1983) Réseaux de Petri: théorie et pratique. Masson, Paris
Brauer W (ed) (1980) Net theory and applications. Volume 84 of LNCS. Springer, Berlin/New York
Cardoso J, Camargo H (eds) (1999) Fuzziness in Petri nets. Volume 22 of studies in fuzziness and soft computing. Physica-Verlag, Heidelberg/New York
Chen Y, Li Z (2013) Optimal supervisory control of automated manufacturing systems. CRC, Boca Raton
Cortadella J, Yakovlev A, Rozenberg G (eds) (2002) Concurrency and hardware design, advances in Petri nets. Volume 2549 of LNCS. Springer, Berlin/Heidelberg/New York
David R, Alla H (1994) Petri nets for modeling of dynamic systems – a survey. Automatica 30(2):175–202
David R, Alla H (2010) Discrete, continuous and hybrid Petri nets. Springer-Verlag, Berlin/Heidelberg
Desrochers A, Al-Jaar RY (1995) Applications of Petri nets in manufacturing systems. IEEE, New York
Diaz M (ed) (2009) Petri nets: fundamental models, verification and applications. Control systems, robotics and manufacturing series (CAM). Wiley, London
DiCesare F, Harhalakis G, Proth JM, Silva M, Vernadat FB (1993) Practice of Petri nets in manufacturing. Chapman & Hall, London/Glasgow/New York
Di Febbraro A, Giua A, Menga G (eds) (2001) Special issue on hybrid Petri nets. Discret Event Dyn Syst 11(1–2):5–185
Girault C, Valk R (2003) Petri nets for systems engineering: a guide to modeling, verification, and applications. Springer, Berlin
Haas PJ (2002) Stochastic Petri nets: modelling, stability, simulation. Springer series in operations research. Springer, New York
Iordache MV, Antsaklis PJ (2006) Supervisory control of concurrent systems: a Petri net structural approach. Birkhäuser, Boston
Jensen K, Kristensen LM (2009) Coloured Petri nets: modelling and validation of concurrent systems. Springer, Berlin
Jensen K, Rozenberg G (eds) (1991) High-level Petri nets. Springer, Berlin
Koch I, Reisig W, Schreiber F (eds) (2011) Modeling in systems biology: the Petri net approach. Computational biology, vol 16. Springer, Berlin
Murata T (1989) Petri nets: properties, analysis and applications. Proc IEEE 77(4):541–580
Peterson JL (1981) Petri net theory and the modeling of systems. Prentice-Hall, Upper Saddle River
Petri CA (1966) Communication with automata. Rome Air Development Center-TR-65-377, New York
Reisig W (1985) Petri nets: an introduction. Volume 4 of EATCS monographs on theoretical computer science. Springer-Verlag, Berlin/Heidelberg/New York
Seatzu C, Silva M, Schuppen J (eds) (2013) Control of discrete-event systems: automata and Petri net perspectives. Number 433 in lecture notes in control and information sciences. Springer, London
Silva M (1985) Las Redes de Petri: en la Automática y la Informática. Ed. AC, Madrid (2nd ed., Thomson-AC, 2002)
Silva M (1993) Introducing Petri nets. In: Practice of Petri nets in manufacturing. Chapman and Hall, London/New York, pp 1–62
Silva M (2013) Half a century after Carl Adam Petri’s PhD thesis: a perspective on the field. Annu Rev Control 37(2):191–219
Silva M, Teruel E, Colom JM (1998) Linear algebraic and linear programming techniques for the analysis of net systems. Volume 1491 of LNCS, advances in Petri nets. Springer, Berlin/Heidelberg/New York, pp 309–373
Silva M, Júlvez J, Mahulea C, Vázquez C (2011) On fluidization of discrete event models: observation and control of continuous Petri nets. Discret Event Dyn Syst 21:427–497
Starke P (1980) Petri-Netze. Deutscher Verlag der Wissenschaften, Berlin
van der Aalst W, Stahl C (2011) Modeling business processes: a Petri net oriented approach. MIT, Cambridge
Villani E, Miyagi PE, Valette R (2007) Modelling and analysis of hybrid supervisory systems: a Petri net approach. Springer, Berlin
Wingender E (ed) (2011) Biological Petri nets. Studies in health technology and informatics, vol 162. IOS Press, Lansdale
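The marking evolution (enabling and firing) rule and the race policy for exponentially timed transitions, both discussed in this entry, can be sketched in a few lines. The net below, its Pre/Post incidence matrices, and its firing rates are invented for illustration and are not taken from the entry.

```python
import random

# Pre and Post incidence matrices (rows: places, columns: transitions) of a
# small illustrative PT-net: a cycle p0 -t0-> p1 -t1-> p0, two tokens in p0.
PRE  = [[1, 0],
        [0, 1]]
POST = [[0, 1],
        [1, 0]]
m0 = [2, 0]  # initial marking

def enabled(m, t):
    """Transition t is enabled if every input place holds enough tokens."""
    return all(m[p] >= PRE[p][t] for p in range(len(m)))

def fire(m, t):
    """Firing removes PRE tokens and adds POST tokens (marking evolution rule)."""
    assert enabled(m, t)
    return [m[p] - PRE[p][t] + POST[p][t] for p in range(len(m))]

# Untimed behavior: an occurrence sequence t0 t0 t1.
m = fire(m0, 0)   # [1, 1]
m = fire(m, 0)    # [0, 2]
m = fire(m, 1)    # [1, 1]

# T-timed race policy with exponential firing delays (rates are invented):
# among the enabled transitions, the one whose sampled delay ends first fires.
rates = [2.0, 1.0]

def race_step(m, rng):
    cand = [t for t in range(len(rates)) if enabled(m, t)]
    delays = {t: rng.expovariate(rates[t]) for t in cand}
    winner = min(delays, key=delays.get)
    return fire(m, winner), winner

rng = random.Random(1)
m, t = race_step(m0, rng)  # from [2, 0] only t0 is enabled, so t0 fires
```

Under the race policy, sampling all enabled delays and firing the minimum is what makes the underlying process Markovian when the delays are exponential.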
Model-Predictive Control in Practice
next control interval. The various commercial MPC algorithms differ in such details as the mathematical form of the dynamic model and the specific formulations of the state estimation, steady-state optimization, and dynamic optimization problems (Qin and Badgwell 2003).

In the general case, the MPC algorithm must solve the three optimization problems outlined above at each control interval. For the case of linear models and reasonable tuning parameters, these problems take the form of a convex quadratic program (QP) with a constant, positive-definite Hessian. As such, they can be solved relatively easily using standard optimization codes. For the case of a linear state-space model, the structure can be exploited even further to develop a specialized solution algorithm using an interior-point method (Rao et al. 1998).

For the case of nonlinear models, these problems take the form of a nonlinear program (NLP) for which the solution domain is no longer convex, greatly complicating the numerical solution. A typical strategy is to iterate on a linearized version of the problem until convergence (Biegler 2010).

Implementation

The combined experience of thousands of MPC applications in the process industries has led to a near consensus on the steps required for a successful implementation:
• Justification – make the economic case for the application.
• Pre-test – design the control and test sensors and actuators.
• Step-test – generate process response data.
• Modeling – develop a model from process response data.
• Configuration – configure the software and test preliminary tuning by simulation.
• Commissioning – turn on and test the controller.
• Post-audit – measure and certify economic performance.
• Sustainment – monitor and maintain the application.

The most expensive of these steps, both in terms of engineering time and lost production, is the generation of process response data through the step test. This is accomplished, in principle, by making significant adjustments to each variable that will be adjusted by the MPC while operating open loop to prevent compensating control action. This will necessarily cause abnormal movement in key operating variables, which may lead to lower throughput and off-spec products. Significant progress has been made in recent years to minimize these difficulties through the use of approximate closed-loop step testing (Darby and Nikolaou 2012).

Once the application has been commissioned, it is critical to set up an aggressive monitoring and sustainment program. MPC application benefits can fall off quickly due to changes in the process operation and as new personnel interact with it. New constraint variables may need to be added and key sections of the model may need
to be updated as time goes on. The mathematical problem of MPC monitoring remains a topic of current academic research (Zagrobelny et al. 2012).

Note that the implementation steps outlined above must be carried out by a carefully selected project team that typically includes, in addition to the MPC expert, an engineer with detailed knowledge of the process and an operator with significant relevant experience.

Summary and Future Directions

Model predictive control is now a mature technology in the process industries. A representative MPC algorithm in this domain includes a state estimator, a steady-state optimizer, and a dynamic optimizer, running once a minute. A successful MPC application usually starts with a careful economic justification, includes significant participation from process engineers and operators, and is maintained with an aggressive sustainment program. Many thousands of such applications are currently operating around the world, generating billions of dollars per year in economic benefits.

Likely future directions for MPC practice include increasing use of nonlinear models, improved state estimation through unmeasured disturbance modeling (Pannocchia and Rawlings 2003), and development of more efficient numerical solution methods (Zavala and Biegler 2009).

and Ramaker (1979). A detailed summary of the history of MPC technology development, as well as a survey of commercial offerings through 2003, can be found in the review article by Qin and Badgwell (2003). Darby and Nikolaou present a more recent summary of MPC practice (Darby and Nikolaou 2012). Textbook descriptions of MPC theory and design, suitable for classroom use, include Rawlings and Mayne (2009) and Maciejowski (2002). The book by Ljung (1999) provides a good summary of methods for identifying dynamic models from test data. Theoretical properties of MPC are analyzed in a highly cited paper by Mayne and coworkers (2000). Guidelines for designing disturbance models so as to achieve offset-free control can be found in Pannocchia and Rawlings (2003). Numerical solution strategies for the nonlinear programs found in MPC are discussed in the book by Biegler (2010). An efficient interior-point method for solving the linear MPC dynamic optimization is described in Rao et al. (1998). A promising algorithm for solving the nonlinear MPC dynamic optimization is outlined in Zavala and Biegler (2009). A data-based method for tuning Kalman filters, which are often used for MPC state estimation, is described in Odelson et al. (2006). A new method for monitoring the performance of MPC is summarized in Zagrobelny et al. (2012). A readable summary of refining operations can be found in Gary et al. (2007).

Bibliography
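For the linear, unconstrained case described above, the dynamic optimization reduces to a QP with a constant, positive-definite Hessian. The following minimal sketch is my own construction (a scalar plant x⁺ = a·x + b·u with invented weights, no input or output constraints, pure Python): it builds the batch Hessian and gradient over the horizon and solves the QP by Gaussian elimination.

```python
def gauss_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - sum(A[i][j] * x[j] for j in range(i + 1, n))) / A[i][i]
    return x

def mpc_qp(a, b, q, r, N, x0):
    """Horizon-N input sequence minimizing sum q*x_k^2 + r*u_k^2 for x+ = a x + b u.

    Predicted states: x_k = F[k] x0 + sum_j S[k][j] u_j, so the cost is a QP
    in u with constant Hessian H = 2 (q S'S + r I), gradient g = 2 q S'F x0.
    """
    S = [[b * a ** (k - 1 - j) if j < k else 0.0 for j in range(N)]
         for k in range(1, N + 1)]
    F = [a ** k for k in range(1, N + 1)]
    H = [[2 * (q * sum(S[m][i] * S[m][j] for m in range(N))
               + (r if i == j else 0.0)) for j in range(N)] for i in range(N)]
    g = [2 * q * x0 * sum(S[m][i] * F[m] for m in range(N)) for i in range(N)]
    return gauss_solve(H, [-gi for gi in g])

# Receding horizon: solve, apply only the first move, re-solve next interval.
u = mpc_qp(a=0.9, b=0.5, q=1.0, r=0.1, N=5, x0=2.0)
u_applied = u[0]
```

Because the Hessian is constant across intervals, commercial codes can factor it once and reuse the factorization, which is one reason the linear case is considered easy.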
Models for Discrete Event Systems: An Overview

It is for this reason that DES are often referred to as event driven, to contrast them to classical time-driven systems based on the laws of physics; in the latter, as time evolves, state variables such as position, velocity, temperature, voltage, etc., also continuously evolve. In order to capture event-driven state dynamics, however, different mathematical models are necessary.

In addition, uncertainties are inherent in the technological environments where DES are encountered. Therefore, associated mathematical models and methods for analysis and control must incorporate such uncertainties. Finally, complexity is also inherent in DES of practical interest, usually manifesting itself in the form of combinatorially explosive state spaces. Although purely analytical methods for DES design, analysis, and control are limited, they have still enabled reliable approximations of their dynamic behavior and the derivation of useful structural properties and provable performance guarantees. Much of the progress made in this field, however, has relied on new paradigms characterized by a combination of mathematical techniques, computer-based tools, and effective processing of experimental data.

Event-driven and time-driven system components are often viewed as coexisting and interacting and are referred to as hybrid systems (separately considered in the Encyclopedia, including the article Discrete Event Systems and Hybrid Systems, Connections Between). Arguably, most contemporary technological systems are combinations of time-driven components (typically, the physical parts of a system) and event-driven components (usually, the computer-based controllers that collect data from and issue commands to the physical parts).

Event-Driven vs. Time-Driven Systems

In order to explain the difference between time-driven and event-driven behavior, we begin with the concept of “event.” An event should be thought of as occurring instantaneously and causing transitions from one system state value to another. It may be identified with an action (e.g., pressing a button), a spontaneous natural occurrence (e.g., a random equipment failure), or the result of conditions met by the system state (e.g., the fluid level in a tank exceeds a given value). For the purpose of developing a model for DES, we will use the symbol e to denote an event. Since a system is generally affected by different types of events, we assume that we can define a discrete event set E with e ∈ E.

In a classical system model, the “clock” is what drives a typical state trajectory: with every “clock tick” (which may be thought of as an “event”), the state is expected to change, since continuous state variables continuously change with time. This leads to the term time driven. In the case of time-driven systems described by continuous variables, the field of systems and control has based much of its success on the use of well-known differential-equation-based models, such as

    ẋ(t) = f(x(t), u(t), t),   x(t0) = x0    (1)
    y(t) = g(x(t), u(t), t),                 (2)

where (1) is a (vector) state equation with initial conditions specified and (2) is a (vector) output equation. As is common in system theory, x(t) denotes the state of the system, y(t) is the output, and u(t) represents the input, often associated with controllable variables used to manipulate the state so as to attain a desired output. Common physical quantities such as position, velocity, temperature, pressure, flow, etc., define state variables in (1). The state generally changes as time changes, and, as a result, the time variable t (or some integer k = 0, 1, 2, … in discrete time) is a natural independent variable for modeling such systems.

In contrast, in a DES, time no longer serves the purpose of driving such a system and may no longer be an appropriate independent variable. Instead, at least some of the state variables are discrete, and their values change only at certain points in time through instantaneous transitions which we associate with “events.” If a clock is used, consider two possibilities: (i) At every clock tick, an event e is selected from the event set E (if no event takes place, we use a “null event” as a member of E such that it causes no state change), and (ii) at various time instants (not necessarily known in advance or coinciding with clock ticks), some event e “announces” that it is occurring. Observe that in (i) state transitions are synchronized by the clock, which is solely responsible for any possible state transition. In (ii), every event e ∈ E defines a distinct process through which the time instants when e occurs are determined. State transitions are the result of combining these asynchronous concurrent event processes. Moreover, these processes need not be independent of each other. The distinction between (i) and (ii) gives rise to the terms time-driven and event-driven systems, respectively.

Comparing state trajectories of time-driven and event-driven systems is useful in understanding the differences between the two and setting the stage for DES modeling frameworks. Thus, in Fig. 1, we observe the following: (i) For the time-driven system shown, the state space X is the set of real numbers R, and x(t) can take any value from this set. The function x(t) is the solution of a differential equation of the general form ẋ(t) = f(x(t), u(t), t), where u(t) is the input. (ii) For the event-driven system, the state space is some discrete set X = {s1, s2, s3, s4}. The sample path can only jump from one state to another whenever an event occurs. Note that an event may take place but not cause a state transition, as in the case of e4. There is no immediately obvious analog to ẋ(t) = f(x(t), u(t), t), i.e., no mechanism to specify how events might interact over time or how their time of occurrence might be determined. Thus, a large part of the early developments in the DES field has been devoted to the specification of an appropriate mathematical model containing the same expressive power as (1)–(2) (Baccelli et al. 1992; Cassandras and Lafortune 2008; Glasserman and Yao 1994).

We should point out that a time-driven system with continuous state variables, usually modeled through (1)–(2), may be abstracted as a DES through some form of discretization in time and quantization in the state space. We should also point out that discrete event systems should not be confused with discrete time systems. The class of discrete time systems contains both time-driven and event-driven systems.
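The event-driven alternative to ẋ(t) = f(x(t), u(t), t) can be made concrete as a transition function over a discrete state set. The sketch below echoes the states s1–s4 and events e1–e5 of Fig. 1, but the particular transitions are assumptions for illustration, since the figure itself is not reproduced here.

```python
# Discrete state space and event set; delta is a (partial) transition function.
# The specific transitions are invented for illustration.
X = {"s1", "s2", "s3", "s4"}
E = {"e1", "e2", "e3", "e4", "e5"}
delta = {
    ("s1", "e1"): "s2",
    ("s2", "e2"): "s3",
    ("s3", "e3"): "s4",
    ("s4", "e4"): "s4",  # the event occurs but causes no state transition
    ("s4", "e5"): "s1",
}

def run(x0, events):
    """State trajectory driven by an event sequence, not by a clock."""
    traj = [x0]
    for e in events:
        traj.append(delta[(traj[-1], e)])
    return traj

trajectory = run("s1", ["e1", "e2", "e3", "e4", "e5"])
# the state jumps only at event occurrences; e4 leaves the state unchanged
```

There is no independent time variable here: the index into the event sequence plays the role that t plays in (1)–(2).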
execute is called the timed language model of specification (Cassandras and Lafortune 2008;
the system. The word “language” comes from Moody and Antsaklis 1998; Ramadge and
the fact that we can think of the event E as an Wonham 1987).
“alphabet” and of (finite) sequences of events as On the other hand, we may be interested in
“words” (Hopcroft and Ullman 1979). We can event timing in order to answer questions such
further refine such a model by adding statistical as the following: “How much time does the sys-
information regarding the set of state trajectories tem spend at a particular state?” or “Can this
(sample paths) of the system. Let us assume that sequence of events be completed by a partic-
probability distribution functions are available ular deadline?” More generally, event timing
about the “lifetime” of each event type e 2 E, is important in assessing the performance of a
that is, the elapsed time between successive DES often measured through quantities such as
occurrences of this particular e. A stochastic throughput or response time. In these instances,
timed language is a timed language together with we need to consider the timed language model
associated probability distribution functions for of the system. Since DES frequently operate in a
the events. Stochastic timed language modeling is the most detailed in the sense that it contains event information in the form of event occurrences and their orderings, information about the exact times at which the events occur (not only their relative ordering), and statistical information about successive occurrences of events. If we delete the timing information from a timed language, we obtain an untimed language, or simply language, which is the set of all possible orderings of events that could happen in the given system. For example, the untimed sequence corresponding to the timed sequence of events in Fig. 1 is {e1, e2, e3, e4, e5}.

Untimed and timed languages represent different levels of abstraction at which DES are modeled and studied. The choice of the appropriate level of abstraction clearly depends on the objectives of the analysis. In many instances, we are interested in the "logical behavior" of the system, that is, in ensuring that all the event sequences it can generate satisfy a given set of specifications, e.g., maintaining a precise ordering of events. In this context, the actual timing of events is not required, and it is sufficient to model only the untimed behavior of the system. Supervisory control, discussed in the article Supervisory Control of Discrete-Event Systems, is the term established for describing the systematic means (i.e., enabling or disabling events which are controllable) by which the logical behavior of a DES is regulated to achieve a given specification.

In a stochastic setting, an additional level of complexity is introduced, necessitating the development of probabilistic models and related analytical methodologies for design and performance analysis based on stochastic timed language models. Sample path analysis and perturbation analysis, discussed in the entry Perturbation Analysis of Discrete Event Systems, refer to the study of sample paths of DES, focusing on the extraction of information for the purpose of efficiently estimating performance sensitivities of the system and, ultimately, achieving online control and optimization (Cassandras and Lafortune 2008; Glasserman 1991; Ho and Cao 1991; Ho and Cassandras 1983).

These different levels of abstraction are complementary, as they address different issues about the behavior of a DES. Although the language-based approach to DES modeling is attractive, it is by itself not convenient to address verification, controller synthesis, or performance issues. This motivates the development of discrete event modeling formalisms which represent languages in a manner that highlights structural information about the system behavior and can be used to address analysis and controller synthesis issues. Next, we provide an overview of three major modeling formalisms which are used by most (but not all) system and control theoretic methodologies pertaining to DES. Additional modeling formalisms encountered in the computer science literature include process algebras (Baeten and Weijland 1990) and communicating sequential processes (Hoare 1985).
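As a small illustration of the projection from timed to untimed strings described above, consider the following sketch (the event names and occurrence times are hypothetical, not those of Fig. 1):

```python
# Illustrative only: a timed string is a sequence of (event, time) pairs;
# deleting the timing information yields the corresponding untimed string.
# Event names and occurrence times below are hypothetical.

def untime(timed_string):
    """Project a timed string [(event, time), ...] onto its event ordering."""
    return [event for event, _ in timed_string]

timed = [("e1", 0.7), ("e2", 1.3), ("e3", 1.9), ("e4", 2.4), ("e5", 3.1)]
print(untime(timed))  # ['e1', 'e2', 'e3', 'e4', 'e5']
```

Collecting the projections of all timed strings a system can generate yields its untimed language.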
764 Models for Discrete Event Systems: An Overview
and the event scores N_i, i ∈ E, are defined by

    N_i′ = N_i + 1  if i = e′ or i ∉ Γ(x),
    N_i′ = N_i      otherwise,                for i ∈ Γ(x′).    (6)

In addition, initial conditions are y_i = v_{i,1} and N_i = 1 for all i ∈ Γ(x_0). If i ∉ Γ(x_0), then y_i is undefined and N_i = 0.

A simple interpretation of this elaborate definition is as follows. Given that the system is at some state x, the next event e′ is the one with the smallest clock value among all feasible events i ∈ Γ(x). The corresponding clock value, y*, is the interevent time between the occurrence of e and e′, and it provides the amount by which the time, t, moves forward: t′ = t + y*. Clock values for all events that remain active in state x′ are decremented by y*, except for the triggering event e′ and all newly activated events, which are assigned a new lifetime v_{i,N_i+1}. Event scores are incremented whenever a new lifetime is assigned to them. It is important to note that the "system clock" t is fully controlled by the occurrence of events, which cause it to move forward; if no event occurs, the system remains at the last state observed.

Comparing x′ = f(x, e′) to the state equation (1) for time-driven systems, we see that the former can be viewed as the event-driven analog of the latter. However, the simplicity of x′ = f(x, e′) is deceptive: unless an event sequence is given, determining the triggering event e′ which is required to obtain the next state x′ involves the combination of (3)–(6). Therefore, the analog of (1) as a "canonical" state equation for a DES requires all Eqs. (3)–(6). Observe that this timed automaton generates a timed language, thus extending the untimed language generated by the original automaton G.

In the definition above, the clock structure V is assumed to be fully specified in a deterministic sense, and so are state transitions dictated by x′ = f(x, e′). The sequences {v_i}, i ∈ E, can be extended to be specified only as stochastic sequences through distribution functions denoted by F_i, i ∈ E. Thus, the stochastic clock structure (or stochastic timing structure) associated with an event set E is a set of distribution functions F = {F_i : i ∈ E} characterizing the stochastic clock sequences

    {V_{i,k}} = {V_{i,1}, V_{i,2}, ...},  i ∈ E,  V_{i,k} ∈ ℝ⁺,  k = 1, 2, ...

It is usually assumed that each clock sequence consists of random variables which are independent and identically distributed (iid) and that all clock sequences are mutually independent. Thus, each {V_{i,k}} is completely characterized by a distribution function F_i(t) = P[V_i ≤ t]. There are, however, several ways in which a clock structure can be extended to include situations where elements of a sequence {V_{i,k}} are correlated or two clock sequences are interdependent. As for state transitions, which may be nondeterministic in nature, such behavior is modeled through state transition probabilities, as explained next.

Stochastic Timed Automaton. We can extend the definition of a timed automaton by viewing the state, event, and all event scores and clock values as random variables denoted respectively by capital letters X, E, N_i, and Y_i, i ∈ E. Thus, a stochastic timed automaton is a six-tuple

    (X, E, Γ, p, p_0, F),

where X is a countable state space; E is a countable event set; Γ(x) is the active event set (or feasible event set); p(x′; x, e′) is a state transition probability defined for all x, x′ ∈ X, e′ ∈ E and such that p(x′; x, e′) = 0 for all e′ ∉ Γ(x); p_0 is the probability mass function P[X_0 = x], x ∈ X, of the initial state X_0; and F is a stochastic clock structure. The automaton generates a stochastic state sequence {X_0, X_1, ...} through a transition mechanism (based on observations X = x, E′ = e′):

    X′ = x′ with probability p(x′; x, e′)    (7)

and it is driven by a stochastic event sequence {E_1, E_2, ...} generated exactly as in (3)–(6) with random variables E, Y_i, and N_i, i ∈ E, instead
of deterministic quantities and with {V_{i,k}} ∼ F_i (∼ denotes "with distribution"). In addition, initial conditions are X_0 ∼ p_0(x), Y_i = V_{i,1}, and N_i = 1 if i ∈ Γ(X_0). If i ∉ Γ(X_0), then Y_i is undefined and N_i = 0.

It is conceivable for two events to occur at the same time, in which case we need a priority scheme to overcome a possible ambiguity in the selection of the triggering event in (3). In practice, it is common to expect that every F_i in the clock structure is absolutely continuous over [0, ∞) (so that its density function exists) and has a finite mean. This implies that two events can occur at exactly the same time only with probability 0.

A stochastic process {X(t)} with state space X which is generated by a stochastic timed automaton (X, E, Γ, p, p_0, F) is referred to as a generalized semi-Markov process (GSMP). This process is used as the basis of much of the sample path analysis methods for DES (see Cassandras and Lafortune 2008; Glasserman 1991; Ho and Cao 1991).

Petri Nets

An alternative modeling formalism for DES is provided by Petri nets, originating in the work of C. A. Petri in the early 1960s. Like an automaton, a Petri net (Peterson 1981) is a device that manipulates events according to certain rules. One of its features is the inclusion of explicit conditions under which an event can be enabled. The Petri net modeling framework is the subject of the article Modeling, Analysis, and Control with Petri Nets. Thus, we limit ourselves here to a brief introduction. First, we define a Petri net graph, also called the Petri net structure. Then, we adjoin to this graph an initial state, a set of marked states, and a transition labeling function, resulting in the complete Petri net model, its associated dynamics, and the languages that it generates and marks.

Petri Net Graph. A Petri net is a directed bipartite graph with two types of nodes, places and transitions, and arcs connecting them. Events are associated with transition nodes. In order for a transition to occur, several conditions may have to be satisfied. Information related to these conditions is contained in place nodes. Some such places are viewed as the "input" to a transition; they are associated with the conditions required for this transition to occur. Other places are viewed as the output of a transition; they are associated with conditions that are affected by the occurrence of this transition. A Petri net graph is formally defined as a weighted directed bipartite graph (P, T, A, w), where P is the finite set of places (one type of node in the graph), T is the finite set of transitions (the other type of node in the graph), A ⊆ (P × T) ∪ (T × P) is the set of arcs with directions from places to transitions and from transitions to places in the graph, and w: A → {1, 2, 3, ...} is the weight function on the arcs. Let P = {p_1, p_2, ..., p_n} and T = {t_1, t_2, ..., t_m}. It is convenient to use I(t_j) to represent the set of input places to transition t_j. Similarly, O(t_j) represents the set of output places from transition t_j. Thus, we have I(t_j) = {p_i ∈ P : (p_i, t_j) ∈ A} and O(t_j) = {p_i ∈ P : (t_j, p_i) ∈ A}.

Petri Net Dynamics. Tokens are assigned to places in a Petri net graph in order to indicate the fact that the condition described by that place is satisfied. The way in which tokens are assigned to a Petri net graph defines a marking. Formally, a marking x of a Petri net graph (P, T, A, w) is a function x: P → ℕ = {0, 1, 2, ...}. Marking x defines the row vector x = [x(p_1), x(p_2), ..., x(p_n)], where n is the number of places in the Petri net. The ith entry of this vector indicates the (nonnegative integer) number of tokens in place p_i, x(p_i) ∈ ℕ. In Petri net graphs, a token is indicated by a dark dot positioned in the appropriate place. The state of a Petri net is defined to be its marking vector x. The state transition mechanism of a Petri net is captured by the structure of its graph and by "moving" tokens from one place to another. A transition t_j ∈ T in a Petri net is said to be enabled if

    x(p_i) ≥ w(p_i, t_j) for all p_i ∈ I(t_j).    (8)
In words, transition t_j in the Petri net is enabled when the number of tokens in p_i is at least as large as the weight of the arc connecting p_i to t_j, for all places p_i that are input to transition t_j. When a transition is enabled, it can occur or fire. The state transition function of a Petri net is defined through the change in the state of the Petri net due to the firing of an enabled transition. The state transition function, f: ℕⁿ × T → ℕⁿ, of Petri net (P, T, A, w, x) is defined for transition t_j ∈ T if and only if (8) holds. Then, we set x′ = f(x, t_j), where

    x′(p_i) = x(p_i) − w(p_i, t_j) + w(t_j, p_i),  i = 1, ..., n.    (9)

An "enabled transition" is therefore equivalent to a "feasible event" in an automaton. But whereas in automata the state transition function enumerates all feasible state transitions, here the state transition function is based on the structure of the Petri net. Thus, the next state defined by (9) explicitly depends on the input and output places of a transition and on the weights of the arcs connecting these places to the transition. According to (9), if p_i is an input place of t_j, it loses as many tokens as the weight of the arc from p_i to t_j; if it is an output place of t_j, it gains as many tokens as the weight of the arc from t_j to p_i. Clearly, it is possible that p_i is both an input and output place of t_j.

In general, it is entirely possible that, after several transition firings, the resulting state is x = [0, ..., 0] or that the number of tokens in one or more places grows arbitrarily large after an arbitrarily large number of transition firings. The latter phenomenon is a key difference with automata, where finite-state automata have only a finite number of states, by definition. In contrast, a finite Petri net graph may result in a Petri net with an unbounded number of states. It should be noted that a finite-state automaton can always be represented as a Petri net; on the other hand, not all Petri nets can be represented as finite-state automata.

Similar to timed automata, we can define timed Petri nets by introducing a clock structure, except that now a clock sequence v_j is associated with a transition t_j. A positive real number, v_{j,k}, assigned to t_j has the following meaning: when transition t_j is enabled for the kth time, it does not fire immediately, but incurs a firing delay given by v_{j,k}; during this delay, tokens are kept in the input places of t_j. Not all transitions are required to have firing delays. Thus, we partition T into subsets T_0 and T_D, such that T = T_0 ∪ T_D. T_0 is the set of transitions always incurring zero firing delay, and T_D is the set of transitions that generally incur some firing delay. The latter are called timed transitions. The clock structure (or timing structure) associated with a set of timed transitions T_D ⊆ T of a marked Petri net (P, T, A, w, x) is a set V = {v_j : t_j ∈ T_D} of clock (or lifetime) sequences

    v_j = {v_{j,1}, v_{j,2}, ...},  t_j ∈ T_D,  v_{j,k} ∈ ℝ⁺,  k = 1, 2, ...

A timed Petri net is a six-tuple (P, T, A, w, x, V), where (P, T, A, w, x) is a marked Petri net and V = {v_j : t_j ∈ T_D} is a clock structure. It is worth mentioning that this general structure allows for a variety of behaviors in a timed Petri net, including the possibility of multiple transitions being enabled at the same time or an enabled transition being preempted by the firing of another, depending on the values of the associated firing delays. The need to analyze and control such behavior in DES has motivated the development of a considerable body of analysis techniques for Petri net models, which have been proven to be particularly suitable for this purpose (Moody and Antsaklis 1998; Peterson 1981).

Dioid Algebras

Another modeling framework is based on developing an algebra using two operations: min{a, b} (or max{a, b}) for any real numbers a and b, and addition (a + b). The motivation comes from the observation that the operations "min" and "+" are the only ones required to develop the timed automaton model. Similarly, the operations "max" and "+" are the only ones used in developing the timed Petri net models described above. The operations are formally named addition and multiplication and denoted by ⊕ and ⊗
respectively. However, their actual meaning (in terms of regular algebra) is different. For any two real numbers a and b, we define

    Addition:        a ⊕ b ≜ max{a, b}    (10)
    Multiplication:  a ⊗ b ≜ a + b.       (11)

This dioid algebra is also called a (max, +) algebra (Baccelli et al. 1992; Cuninghame-Green 1979). If we consider a standard linear discrete time system, its state equation is of the form

    x(k + 1) = Ax(k) + Bu(k),

which involves (regular) multiplication (·) and addition (+). It turns out that we can use a (max, +) algebra with DES, replacing the (+, ·) algebra of conventional time-driven systems, and come up with a representation similar to the one above, thus paralleling to a considerable extent the analysis of classical time-driven linear systems. We should emphasize, however, that this particular representation is only possible for a subset of DES. Moreover, while conceptually this offers an attractive way to capture the event timing dynamics in a DES, from a computational standpoint one still has to confront the complexity of performing the "max" operation when numerical information is ultimately needed to analyze the system or to design controllers for its proper operation.

Control and Optimization of Discrete Event Systems

The various control and optimization methodologies developed to date for DES depend on the modeling level appropriate for the problem of interest.

Logical Behavior. Issues such as ordering events according to some specification or ensuring the reachability of a particular state are normally addressed through the use of automata and Petri nets (Chen and Lafortune 1991; Moody and Antsaklis 1998; Ramadge and Wonham 1987). Supervisory control theory provides a systematic framework for formulating and solving problems of this type; a comprehensive coverage can be found in Cassandras and Lafortune (2008). Logical behavior issues are also encountered in the diagnosis of partially observed DES, a topic covered in the article Diagnosis of Discrete Event Systems.

Event Timing. When timing issues are introduced, timed automata and timed Petri nets are invoked for modeling purposes. Supervisory control in this case becomes significantly more complicated. An important class of problems, however, does not involve the ordering of individual events, but rather the requirement that selected events occur within a given "time window" or with some desired periodic characteristics. Models based on the algebraic structure of timed Petri nets or the (max, +) algebra provide convenient settings for formulating and solving such problems (Baccelli et al. 1992; Glasserman and Yao 1994).

Performance Analysis. As in classical control theory, one can define a performance (or cost) function as a means for quantifying system behavior. This approach is particularly crucial in the study of stochastic DES. Because of the complexity of DES dynamics, analytical expressions for such performance metrics in terms of controllable variables are seldom available. This has motivated the use of simulation and, more generally, the study of DES sample paths; these have proven to contain a surprising wealth of information for control purposes. The theory of perturbation analysis presented in the article Perturbation Analysis of Discrete Event Systems has provided a systematic way of estimating performance sensitivities with respect to system parameters (Cassandras and Lafortune 2008; Cassandras and Panayiotou 1999; Glasserman 1991; Ho and Cao 1991).

Discrete Event Simulation. Because of the aforementioned complexity of DES dynamics, simulation becomes an essential part of DES performance analysis (Law and Kelton 1991). Discrete event simulation can be defined as a
systematic way of generating sample paths of a DES by means of a timed automaton or its stochastic counterpart. The same process can be carried out using a Petri net model or one based on the dioid algebra setting.

Optimization. Optimization problems can be formulated in the context of both untimed and timed models of DES. Moreover, such problems can be formulated in both a deterministic and a stochastic setting. In the latter case, the ability to efficiently estimate performance sensitivities with respect to controllable system parameters provides a powerful tool for stochastic gradient-based optimization (when one can define derivatives) (Vázquez-Abad et al. 1998). A treatment of all such problems from an application-oriented standpoint, along with further details on the use of the modeling frameworks discussed in this entry, can be found in the article Applications of Discrete-Event Systems.

Cross-References

Applications of Discrete-Event Systems
Diagnosis of Discrete Event Systems
Discrete Event Systems and Hybrid Systems, Connections Between
Modeling, Analysis, and Control with Petri Nets
Perturbation Analysis of Discrete Event Systems
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization
Supervisory Control of Discrete-Event Systems

Bibliography

Baccelli F, Cohen G, Olsder GJ, Quadrat JP (1992) Synchronization and linearity. Wiley, Chichester/New York
Baeten JCM, Weijland WP (1990) Process algebra. Volume 18 of Cambridge tracts in theoretical computer science. Cambridge University Press, Cambridge/New York
Cassandras CG, Lafortune S (2008) Introduction to discrete event systems, 2nd edn. Springer, New York
Cassandras CG, Panayiotou CG (1999) Concurrent sample path analysis of discrete event systems. J Discret Event Dyn Syst Theory Appl 9:171–195
Chen E, Lafortune S (1991) Dealing with blocking in supervisory control of discrete event systems. IEEE Trans Autom Control AC-36(6):724–735
Cuninghame-Green RA (1979) Minimax algebra. Number 166 in lecture notes in economics and mathematical systems. Springer, Berlin/New York
Glasserman P (1991) Gradient estimation via perturbation analysis. Kluwer Academic, Boston
Glasserman P, Yao DD (1994) Monotone structure in discrete-event systems. Wiley, New York
Ho YC (ed) (1991) Discrete event dynamic systems: analyzing complexity and performance in the modern world. IEEE, New York
Ho YC, Cao X (1991) Perturbation analysis of discrete event dynamic systems. Kluwer Academic, Dordrecht
Ho YC, Cassandras CG (1983) A new approach to the analysis of discrete event dynamic systems. Automatica 19:149–167
Hoare CAR (1985) Communicating sequential processes. Prentice-Hall, Englewood Cliffs
Hopcroft JE, Ullman J (1979) Introduction to automata theory, languages, and computation. Addison-Wesley, Reading
Law AM, Kelton WD (1991) Simulation modeling and analysis. McGraw-Hill, New York
Moody JO, Antsaklis P (1998) Supervisory control of discrete event systems using Petri nets. Kluwer Academic, Boston
Peterson JL (1981) Petri net theory and the modeling of systems. Prentice Hall, Englewood Cliffs
Ramadge PJ, Wonham WM (1987) Supervisory control of a class of discrete event processes. SIAM J Control Optim 25(1):206–230
Vázquez-Abad FJ, Cassandras CG, Julka V (1998) Centralized and decentralized asynchronous optimization of stochastic discrete event systems. IEEE Trans Autom Control 43(5):631–655

Monotone Systems in Biology

David Angeli
Department of Electrical and Electronic Engineering, Imperial College London, London, UK
Dipartimento di Ingegneria dell'Informazione, University of Florence, Italy

Abstract

Mathematical models arising in biology might sometimes exhibit the remarkable feature of preserving ordering of their solutions with
respect to initial data: in words, the "more" of x (the state variable) at time 0, the more of it at all subsequent times. Similar monotonicity properties are possibly exhibited also with respect to input levels. When this is the case, important features of the system's dynamics can be inferred on the basis of purely qualitative or relatively basic quantitative knowledge of the system's characteristics. We will discuss how monotonicity-related tools can be used to analyze and design biological systems with prescribed dynamical behaviors such as global stability, multistability, or periodic oscillations.

Keywords

Feedback interconnections; Monotone dynamics; Monotonicity checks

Introduction

Ordinary differential equations of a scalar unknown, under suitable assumptions for unicity of solutions, trivially enjoy the property that any pair of ordered initial conditions (according to the standard order defined for real numbers) gives rise to ordered solutions at all positive times (as well as negative, though this is less relevant for the developments that follow). Monotone systems are a special but significant class of dynamical models, possibly evolving in high-dimensional or even infinite-dimensional state spaces, that are nevertheless characterized by a similar property holding with respect to a suitably defined notion of partial order. They became the focus of considerable interest in mathematics after a series of seminal papers by Hirsch (1985, 1988) provided the basic definitions as well as deep results showing how generic convergence properties of their solutions are expected under suitable technical assumptions. Shortly before that, Smale's construction (Smale 1976) had already highlighted how specific solutions could instead exhibit arbitrary behavior (including periodic or chaotic). Further results along these lines provide insight into which set of extra assumptions allows one to strengthen generic convergence to global convergence, including, for instance, existence of positive first integrals (Banaji and Angeli 2010; Mierczynski 1987), tridiagonal structure (Smillie 1984), or positive translation invariance (Angeli and Sontag 2008a).

While these tools were initially developed having in mind applications arising in ecology, epidemiology, chemistry, or economy, it was due to the increased importance of mathematical modeling in molecular biology and the subsequent rapid development of systems biology as an emerging independent field of investigation that they became particularly relevant to biology.

The paper Angeli and Sontag (2003) first introduced the notion of monotone control systems, including input and output variables, which is of interest if one is looking at systems arising from the interconnection of monotone modules. Small-gain theorems and related conditions were defined to study both positive (Angeli and Sontag 2004b) and negative (Angeli and Sontag 2003) feedback interconnections by relating their asymptotic behavior to properties of the discrete iterations of a suitable map, called the steady-state characteristic of the system.

In particular, convergence of this map is related to convergent solutions for the original continuous time system; on the other hand, specific negative feedback interconnections can instead give rise to oscillations as a result of Hopf bifurcations, as in Angeli and Sontag (2008b), or to relaxation oscillators, as highlighted in Gedeon and Sontag (2007).

A parallel line of investigation, originated in the work of Volpert et al. (1994), exploited the specific features of models arising in biochemistry by focusing on structural conditions for monotonicity of chemical reaction networks (Angeli et al. 2010; Banaji 2009). Monotonicity is only one of the possible tools adopted in the study of dynamics for such a class of models in the related field of chemical reaction networks theory.
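One way to make such structural (sign-based) monotonicity conditions concrete is a numerical sign check of the Jacobian: a system is cooperative (monotone with respect to the positive orthant) when all off-diagonal Jacobian entries are nonnegative throughout the state space. The two-species mutual-activation model below is a hypothetical illustration, not taken from the cited works:

```python
import numpy as np

# Numerical sketch of a cooperativity check on a HYPOTHETICAL two-species
# mutual-activation model. Cooperativity requires every off-diagonal
# Jacobian entry to be nonnegative at every sampled state.

def f(x):
    # toy dynamics: each species activates the other and degrades linearly
    x1, x2 = x
    return np.array([x2 / (1.0 + x2) - 0.5 * x1,
                     x1 / (1.0 + x1) - 0.5 * x2])

def jacobian(fun, x, h=1e-6):
    """Central finite-difference Jacobian of fun at x."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (fun(x + e) - fun(x - e)) / (2.0 * h)
    return J

def looks_cooperative(fun, samples, tol=1e-9):
    """Sampled test: off-diagonal Jacobian entries >= 0 at every sample."""
    for x in samples:
        J = jacobian(fun, np.asarray(x, dtype=float))
        off_diag = J - np.diag(np.diag(J))
        if np.any(off_diag < -tol):
            return False
    return True

rng = np.random.default_rng(0)
print(looks_cooperative(f, rng.uniform(0.0, 5.0, size=(100, 2))))  # True
```

Such a sampled check can of course only falsify cooperativity; establishing it over the whole state space requires the symbolic sign information discussed in the sequel.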
conditions also exist to assess strong monotonicity, for instance, in the case of orthant cones. Finding whether there exists an order (as induced, for instance, by a suitable cone K) which can make a system monotone is instead a harder task, which normally entails a good deal of insight in the system's dynamics.

It is worth mentioning that for the special case of linear systems, monotonicity is just equivalent to invariance of the cone K, as incremental properties (referred to pairs of solutions) are just equivalent to their non-incremental counterparts (referred to the 0 solution). In this respect, a substantial amount of theory exists starting from classical works such as the Perron-Frobenius theory on positive and cone-preserving maps; this is, however, outside the scope of this entry, and the interested reader may refer to Farina and Rinaldi (2000) for a recent book on the subject.

Monotone Dynamics

We divide this section in three parts; first we summarize the main tools for checking monotonicity with respect to orthant cones, then we recall some of the main consequences of monotonicity for the long-term behavior of solutions, and, finally, we study interconnections of monotone systems.

Checking Monotonicity

Orthant cones and the partial orders they induce play a major role in biology applications. In fact, for systems described by equations

    ẋ = f(x)    (5)

with X ⊆ ℝⁿ open and f: X → ℝⁿ of class C¹, the following characterization holds:

Proposition 1 The system φ induced by the set of differential equations (5) is cooperative if and only if the Jacobian ∂f/∂x is a Metzler matrix for all x ∈ X.

We recall that M is Metzler if m_ij ≥ 0 for all i ≠ j. Let Λ = diag[λ_1, ..., λ_n] with λ_i ∈ {1, −1} and assume that the orthant O = Λ[0, +∞)ⁿ. It is straightforward to see that x_1 ≤_O x_2 ⟺ Λx_1 ≤ Λx_2, where ≤ denotes the partial order induced by the positive orthant, while ≤_O denotes the order induced by O. This means that we may check monotonicity with respect to O by performing a simple change of coordinates z = Λx. As a corollary:

Proposition 2 The system φ induced by the set of differential equations (5) is monotone with respect to O if and only if Λ (∂f/∂x) Λ is a Metzler matrix for all x ∈ X.

Notice that the conditions of Propositions 1 and 2 can be expressed in terms of sign constraints on off-diagonal entries of the Jacobian; in biological terms, a sign constraint on an off-diagonal entry amounts to asking that a particular species (meaning chemical compound or otherwise) consistently exhibit, throughout the considered model's state space, either an excitatory or inhibitory effect on some other species of interest. Qualitative diagrams showing effects of species on each other are commonly used by biologists to understand the working principles of biomolecular networks.

Remarkably, Proposition 2 also has an interesting graph-theoretical interpretation if one thinks of sign(∂f/∂x) as the adjacency matrix of a graph with nodes x_1, ..., x_n corresponding to the state variables of the system.

Proposition 3 The system φ induced by the set of differential equations (5) is monotone with respect to some orthant order O if and only if the directed graph with adjacency matrix sign(∂f/∂x) (neglecting diagonal entries) has undirected loops with an even number of negative edges.

This means in particular that ∂f/∂x must be sign symmetric (no predator-prey-type interactions) and in addition that a similar parity property has to hold on undirected loops of arbitrary length. Sufficient conditions for strong monotonicity are also known, for instance, in terms of irreducibility of the Jacobian matrix (Kamke's condition; see Hirsch and Smith 2005).

Asymptotic Dynamics

As previously mentioned, several important implications of monotonicity are with respect to
asymptotic dynamics. Let E denote the set of asymptotically stable equilibrium kx .u/ and
equilibria of '. The following result is due to the map kx .u/ is continuous. If moreover the
Hirsch (1985). equilibrium is hyperbolic, then kx is called a
non-degenerate characteristic. The input–output
Theorem 1 Let ' be a strongly monotone sys-
characteristic is defined as ky .u/ D h.kx .u//.
tem with bounded solutions. There exists a zero
measure set Q such that each solution starting in Let ' be system with a well-defined input–output
X nQ converges toward E. characteristic ky ; we may define the iteration
Global convergence results can be achieved for
important classes of monotone dynamics. For ukC1 D ky .uk /: (6)
instance, when increasing conservation laws are
present (see Banaji and Angeli 2010):
Theorem 2 Let X K Rn be any two proper It is clear that fixed points of (6) correspond to
cones. Let ' on X be strongly monotone with input values (and therefore to equilibria through
respect to the partial order induced by K and the characteristic map kx ) of the closed-loop
preserving a K-increasing first integral. Then every bounded solution converges.

Dually to first integrals, positive translation invariance of the dynamics also provides grounds for global convergence (see Angeli and Sontag 2008a):

Theorem 3 If a system is strongly monotone and fulfills φ(t, x0 + hv) = φ(t, x0) + hv for all h ∈ ℝ and some v > 0, then all solutions with bounded projections in v⊥ converge.

The class of tridiagonal cooperative systems has also been investigated as a remarkable class of globally convergent dynamics; see Smillie (1984). These arise from differential equations ẋ = f(x) when ∂fi/∂xj = 0 for all |i − j| > 1.

Finally, it is worth emphasizing how significant for biological systems, often subject to phenomena evolving at different timescales, are also results on singular perturbations (Gedeon and Sontag 2007; Wang and Sontag 2008).

Interconnected Monotone Systems

Results on interconnected monotone SISO systems are surveyed in Angeli and Sontag (2004a). The main tool used in this context is the notion of input-state and input–output steady-state characteristic.

Definition 3 A control system admits a well-defined input-state characteristic if for all constant inputs u there exists a unique globally asymptotically stable equilibrium.

The discrete iteration (6) is the system derived by considering the unity feedback interconnection u = y. What is remarkable for monotone systems is that, both in the case of positive and negative feedback and in a precise sense, stability properties of the fixed points of the discrete iteration (6) are matched by stability properties of the corresponding solutions of the original continuous-time system. See Angeli and Sontag (2004b) for the case of positive feedback interconnections and Angeli et al. (2004) for applications of such results to synthesis and detection of multistability in molecular biology.

Multistability, in particular, is an important dynamical feature of specific cellular systems and can be achieved, with a good degree of robustness with respect to different types of uncertainties, by means of positive feedback interconnections of monotone subsystems. The typical input–output characteristic ky giving rise to such behavior is, in the SISO case, that of a sigmoidal function intersecting the diagonal u = y in 3 points. Two of the fixed points, namely, u1 and u3 (see Fig. 1), are asymptotically stable for (6), and the corresponding equilibria of the original continuous-time monotone system are also asymptotically stable with a basin of attraction which covers almost all initial conditions. The fixed point u2 is unstable and the corresponding equilibrium is also such (under a suitable technical assumption on the non-degeneracy of the I/O characteristic). Extensions of similar criteria to the MIMO case are presented in Enciso and Sontag (2005).
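The fixed-point dichotomy just described is easy to reproduce numerically. The sketch below is purely illustrative and not taken from this entry: it assumes a Hill-type sigmoidal characteristic ky (with made-up parameters chosen so that the diagonal u = ky(u) is crossed at u1 = 0, u2 = 0.5, and u3 = 2.0) and iterates the discrete map u_{k+1} = ky(u_k).

```python
# Illustrative sketch: fixed points of a sigmoidal I/O characteristic k_y
# under the discrete iteration u_{k+1} = k_y(u_k).  The Hill-type function
# and its parameters are assumptions chosen so that the diagonal is
# crossed at u1 = 0 (stable), u2 = 0.5 (unstable), u3 = 2.0 (stable).

def k_y(u, a=2.5, K=1.0):
    """Sigmoidal (Hill-type) steady-state characteristic."""
    return a * u**2 / (K**2 + u**2)

def iterate(u0, steps=60):
    """Iterate u_{k+1} = k_y(u_k) from the initial guess u0."""
    u = u0
    for _ in range(steps):
        u = k_y(u)
    return u

# Initial conditions below u2 = 0.5 are attracted to u1 = 0; those above
# are attracted to u3 = 2.0, mirroring the almost-global bistability.
for u0 in (0.1, 0.4, 0.6, 3.0):
    print(f"u0 = {u0:3.1f}  ->  {iterate(u0):.6f}")
```

Initial conditions on either side of the unstable fixed point converge to the two stable ones, which is exactly the bistable behavior induced in the continuous-time interconnection.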
774 Monotone Systems in Biology
Monotone Systems in Biology, Fig. 1 Fixed points u1, u2, u3 of a sigmoidal input–output characteristic ky(u)

Monotone Systems in Biology, Fig. 2 Fixed point u1 of a decreasing input–output characteristic ky(u)
tions, as well as in Chemical Reaction Networks Theory. Generally speaking, while monotonicity as a whole cannot be expected in large networks, experimental data shows that the number of negative feedback loops in biological regulatory networks is significantly lower than in a random signed graph of comparable size (Maayan et al. 2008).

Analyzing the properties of monotone dynamics may potentially lead to a better understanding of the key regulatory mechanisms of complex networks as well as the development of bottom-up approaches for the identification of meaningful submodules in biological networks. Potential research directions may include both novel computational tools and specific applications to systems biology, for instance:
• Algorithms for detection of monotonicity with respect to exotic orders (such as arbitrary polytopic cones or even state-dependent cones)
• Application of monotonicity-based ideas to control synthesis (see, for instance, Aswani and Tomlin (2009), where the special class of piecewise affine systems is considered)

Cross-References

Deterministic Description of Biochemical Networks
Spatial Description of Biochemical Networks
Stochastic Description of Biochemical Networks

Recommended Reading

For readers interested in the mathematical details of monotone systems theory we recommend the following:

Smith H (1995) Monotone dynamical systems: an introduction to the theory of competitive and cooperative systems. Mathematical surveys and monographs, vol 41. AMS, Providence

A more recent technical survey of aspects related to asymptotic dynamics of monotone systems is:

Hirsch MW, Smith H (2005) Monotone dynamical systems (Chapter 4). In: Canada A, Drábek P, Fonda A (eds) Handbook of differential equations: ordinary differential equations, vol 2. Elsevier

Bibliography

Angeli D, Sontag ED (2003) Monotone control systems. IEEE Trans Autom Control 48:1684–1698
Angeli D, Sontag ED (2004a) Interconnections of monotone systems with steady-state characteristics. In: Optimal control, stabilization and nonsmooth analysis. Lecture notes in control and information sciences, vol 301. Springer, Berlin, pp 135–154
Angeli D, Sontag ED (2004b) Multi-stability in monotone input/output systems. Syst Control Lett 51:185–202
Angeli D, Sontag ED (2008a) Translation-invariant monotone systems, and a global convergence result for enzymatic futile cycles. Nonlinear Anal Ser B Real World Appl 9:128–140
Angeli D, Sontag ED (2008b) Oscillations in I/O monotone systems. IEEE Trans Circuits Syst 55:166–176
Angeli D, Sontag ED (2011) A small-gain result for orthant-monotone systems in feedback: the non sign-definite case. In: 50th IEEE conference on decision and control, Orlando, 12–15 Dec 2011
Angeli D, Ferrell JE, Sontag ED (2004) Detection of multistability, bifurcations, and hysteresis in a large class of biological positive-feedback systems. Proc Natl Acad Sci USA 101:1822–1827
Angeli D, de Leenheer P, Sontag ED (2010) Graph-theoretic characterizations of monotonicity of chemical networks in reaction coordinates. J Math Biol 61:581–616
Aswani A, Tomlin C (2009) Monotone piecewise affine systems. IEEE Trans Autom Control 54:1913–1918
Banaji M (2009) Monotonicity in chemical reaction systems. Dyn Syst 24:1–30
Banaji M, Angeli D (2010) Convergence in strongly monotone systems with an increasing first integral. SIAM J Math Anal 42:334–353
Banaji M, Angeli D (2012) Addendum to "Convergence in strongly monotone systems with an increasing first integral". SIAM J Math Anal 44:536–537
Enciso GA, Sontag ED (2005) Monotone systems under positive feedback: multistability and a reduction theorem. Syst Control Lett 54:159–168
Enciso GA, Sontag ED (2006) Global attractivity, I/O monotone small-gain theorems, and biological delay systems. Discret Contin Dyn Syst 14:549–578
Enciso GA, Sontag ED (2008) Monotone bifurcation graphs. J Biol Dyn 2:121–139
Farina L, Rinaldi S (2000) Positive linear systems: theory and applications. Wiley, New York
Gedeon T, Sontag ED (2007) Oscillations in multi-stable monotone systems with slowly varying feedback. J Differ Equ 239:273–295
Hirsch MW (1985) Systems of differential equations that are competitive or cooperative II: convergence almost everywhere. SIAM J Math Anal 16:423–439
Hirsch MW (1988) Stability and convergence in strongly monotone dynamical systems. J Reine Angew Math 383:1–53
Hirsch MW, Smith HL (2003) Competitive and cooperative systems: a mini-review. In: Positive systems, Rome. Lecture notes in control and information science, vol 294. Springer, pp 183–190
Maayan A, Iyengar R, Sontag ED (2008) Intracellular regulatory networks are close to monotone systems. IET Syst Biol 2:103–112
Mierczynski J (1987) Strictly cooperative systems with a first integral. SIAM J Math Anal 18:642–646
Smale S (1976) On the differential equations of species in competition. J Math Biol 3:5–7
Smillie J (1984) Competitive and cooperative tridiagonal systems of differential equations. SIAM J Math Anal 15:530–534
Volpert AI, Volpert VA, Volpert VA (1994) Traveling wave solutions of parabolic systems. Translations of mathematical monographs, vol 140. AMS, Providence
Wang L, Sontag ED (2008) Singularly perturbed monotone systems and an application to double phosphorylation cycles. J Nonlinear Sci 18:527–550

Motion Description Languages and Symbolic Control

Sean B. Andersson
Mechanical Engineering and Division of Systems Engineering, Boston University, Boston, MA, USA

Abstract

The fundamental idea behind symbolic control is to mitigate the complexity of a dynamic system by limiting the set of available controls to a typically finite collection of symbols. Each symbol represents a control law that may be either open or closed loop. With these symbols, a simpler description of the motion of the system can be created, thereby easing the challenges of analysis and control design. In this entry, we provide a high-level description of symbolic control; discuss briefly its history, connections, and applications; and provide a few insights into where the field is going.

Keywords

Abstraction; Complex systems; Formal methods

Introduction

Systems and control theory is a powerful paradigm for analyzing, understanding, and controlling dynamic systems. Traditional tools in the field for developing and analyzing control laws, however, face significant challenges when one needs to deal with the complexity that arises in many practical, real-world settings such as the control of autonomous, mobile systems operating in uncertain and changing physical environments. This is particularly true when the tasks to be achieved are not easily framed in terms of motion to a point in the state space. One of the primary goals of symbolic control is to mitigate this complexity by abstracting some combination of the system dynamics, the space of control inputs, and the physical environment to a simpler, typically finite, model.

This fundamental idea, namely, that of abstracting away the complexity of the underlying dynamics and environment, is in fact quite a natural one. Consider, for example, how you give instructions to another person wanting to go to a point of interest. It would be absurd to provide details at the level of their actuators, namely, with commands to their individual muscles (or, to carry the example to an even more absurd extreme, to the dynamic components that make up those muscles). Rather, very high-level commands are given, such as "follow the road," "turn right," and so on. Each of these provides a description of what to do with the understanding that the person can carry out those commands in their own fashion. Similarly, the environment itself is abstracted, and only elements meaningful to the task at hand are described. Thus, continuing the example above, rather than
providing metric information or a detailed map, the instructions may use environmental features to determine when particular actions should be terminated and the next begun, such as "follow the road until the second intersection, then turn right."

Underlying the idea of symbolic control is the notion that rich behaviors can result from simple actions. This premise was used in many early robots and can be traced back at least to the ideas of Norbert Wiener on cybernetics (see Arkin 1998). It is at the heart of the behavior-based approach to robotics (Brooks 1986). Similar ideas can also be seen in the development of a high-level language (G-codes) for Computer Numerically Controlled (CNC) machines. The key technical ideas in the more general setting of symbolic control for dynamic systems can be traced back to Brockett (1988), which introduced ideas of formalizing a modular approach to programming motion control devices through the development of a Motion Description Language (MDL).

The goal of the present work is to introduce the interested reader to the general ideas of symbolic control as well as to some of its application areas and research directions. While it is not a survey paper, a few select references are provided throughout to point the reader in hopefully fruitful directions into the literature.

Models and Approaches

There are at least two related but distinct approaches to symbolic control. Both begin with a mathematical description of the system, typically given as an ordinary differential equation of the form

ẋ = f(x, u, t), y = h(x, t) (1)

where x is a vector describing the state of the system, y is the output of the sensors of the system, and u is the control input.

Under the first approach to symbolic control, the focus is on reducing the complexity of the space of possible control signals by limiting the system to a typically finite collection of control symbols. Each of these symbols represents a control law that may be open loop or may utilize output feedback. For example, follow the road could be a feedback control law that uses sensor measurements to determine the position relative to the road and then applies steering commands so that the system stays on the road while simultaneously maintaining a constant speed. There are, of course, many ways to accomplish the specifics of this task, and the details will depend on the particular system. Thus, an autonomous four-wheeled vehicle equipped with a laser range finder, an autonomous motorcycle equipped with ultrasonic sensors, or an autonomous aerial vehicle with a camera would each carry out the command in its own way, and each would have a very different trajectory. They would all, however, satisfy the notion of follow the road. The behavior of the system can then be described in terms of the abstract symbols rather than in terms of the details of the trajectories.

Typically each of these symbols describes an action that, at least conceptually, is simple. In order to generate rich motions to carry out complex tasks, the system is switched between the available symbols. Switching conditions are often referred to as interrupts. Interrupts may be purely time-based (e.g., apply a given symbol for T seconds) or may be expressed in terms of symbols representing certain environmental conditions. These may be simple functions of the measurements (e.g., interrupt when an intersection is detected) or may represent more complicated scenarios with history and dynamics (e.g., interrupt after the second intersection is detected). Just as the input symbols abstract away the details of the control space and of the motion of the system, the interrupt symbols abstract away the details of the environment. For example, intersection has a clear high-level meaning but a very different sensor "signature" for particular systems.

As a simple illustrative example, consider a collection of control symbols designed for moving along a system of roads, {follow road, turn right, turn left}, and a collection of interrupt symbols for triggering changes in such a setting, {in intersection, clear of intersection}. Suppose there are two vehicles that can each interpret these symbols, an autonomous car and a snakelike robot, as illustrated in Fig. 1. It is reasonable to assume that the control symbols each describe relatively complex dynamics that allow, for example, for obstacle avoidance while carrying out the action. Figure 1 illustrates a possible situation where the two systems carry out the plan defined by the symbolic sequence:

(Follow the road UNTIL in intersection)
(Turn right UNTIL clear of intersection)

The intent of this plan is for the system to navigate a right-hand turn. As shown in the figure, the actual trajectories followed by the systems can be markedly different due in part to system dynamics (the snakelike robot undulates, while the car does not) as well as to different sensor responses (when the car goes through, there is a parked vehicle that it must navigate around, while the snakelike robot found a clear path during its execution). Despite these differences, both systems achieve the goal of the plan.

Motion Description Languages and Symbolic Control, Fig. 1 Simple example of symbolic control with a focus on abstracting the inputs. Two systems, a snakelike robot and an autonomous car, are given a high-level plan in terms of symbols for navigating a right-hand turn. The systems each interpret the same symbols in their own ways, leading to different trajectories due both to differences in dynamics and also to different sensor cues as caused, for example, by the parked vehicle encountered by the car in this scenario

The collection of control and interrupt symbols can be thought of as a language for describing and specifying motion and is used to write programs that can be compiled into an executable for a specific system. Different rules for doing this can be established that define different languages, analogous to different high-level programming languages such as C++, Java, or Python. Further details can be found in, for example, Manikonda et al. (1998).

Under the second approach, the focus is on representing the dynamics and state space (or environment) of the system in an abstract, symbolic way. The fundamental idea is to lump all the states in a region into a single abstract element and to then represent the entire system with a finite number of these elements. Control laws are then defined that steer all the states in one element into some state in a different region. The symbolic control system is then the finite set of elements representing regions together with the finite set of controllers for moving between them. It can be thought of essentially as a graph (or more accurately as a transition system) in which the nodes represent regions in the state space and the edges represent achievable transitions between them. The goal of this abstraction step is for the two representations to be equivalent (or at least approximately equivalent) in that any motion that can be achieved in one can be achieved in the other (in an appropriate sense). Planning and analysis can then
(Figure panels: environment grid with regions R1–R10; center panel labeled "Symbolic model and planning")

Motion Description Languages and Symbolic Control, Fig. 2 Simple example of symbolic control with a focus on abstracting the system dynamics and environment. The initial environment (left image) is segmented into different regions and simple controllers developed for moving from region to region. The image shows two possible controllers: one that actuates the robot through a tight slither pattern to move forward by one region and one that twists the robot to face the cell to the left before slithering across and then reorienting. The combination of regions and actions yields a symbolic abstraction (center image) that allows for planning to achieve specific goals, such as moving through the right-hand turn. Executing this plan leads to a physical trajectory of the system (right image)
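The pairing of control symbols with interrupts can be made concrete with a toy interpreter. The sketch below is a hypothetical illustration, not an implementation from this entry: control and interrupt symbols are modeled as plain functions acting on a simulated "road" state (arc length s and heading in degrees, with made-up thresholds), and the plan mirrors (Follow the road UNTIL in intersection)(Turn right UNTIL clear of intersection).

```python
# Toy motion-description-language (MDL) interpreter: a plan is a list of
# (control symbol, interrupt symbol) pairs, each executed as
#   apply the control law UNTIL the interrupt fires.
# The "dynamics" and thresholds below are illustrative assumptions.

def follow_road(state):
    state["s"] += 1.0            # advance one unit along the road

def turn_right(state):
    state["heading"] -= 15.0     # steer clockwise, 15 degrees per step
    state["s"] += 0.5            # still moving while turning

def in_intersection(state):
    return state["s"] >= 10.0            # "sensor" detects the intersection

def clear_of_intersection(state):
    return state["heading"] <= -90.0     # right-hand turn completed

def run(plan, state, max_steps=1000):
    """Execute each (control, interrupt) pair until its interrupt fires."""
    for control, interrupt in plan:
        for _ in range(max_steps):
            if interrupt(state):
                break
            control(state)
    return state

plan = [(follow_road, in_intersection),
        (turn_right, clear_of_intersection)]
final = run(plan, {"s": 0.0, "heading": 0.0})
print(final)
```

A car and a snakelike robot would bind the same symbol names to entirely different low-level controllers and sensor tests, which is precisely the abstraction the symbols provide.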
Girard A, Pappas GJ (2007) Approximation metrics for discrete and continuous systems. IEEE Trans Autom Control 52(5):782–798
Johnson S (2002) Emergence: the connected lives of ants, brains, cities, and software. Scribner, New York
Klavins E (2007) Programmable self-assembly. IEEE Control Syst 27(4):43–56
Kress-Gazit H (2011) Robot challenges: toward development of verification and synthesis techniques [from the Guest Editors]. IEEE Robot Autom Mag 18(3):22–23
Kuipers B (2000) The spatial semantic hierarchy. Artif Intell 119(1–2):191–233
Lahijanian M, Andersson SB, Belta C (2012) Temporal logic motion planning and control with probabilistic satisfaction guarantees. IEEE Trans Robot 28(2):396–409
Manikonda V, Krishnaprasad PS, Hendler J (1998) Languages, behaviors, hybrid architectures, and motion control. In: Baillieul J, Willems JC (eds) Mathematical control theory. Springer, New York, pp 199–226
Murray RM, Deno DC, Pister KSJ, Sastry SS (1992) Control primitives for robot systems. IEEE Trans Syst Man Cybern 22(1):183–193
Tabuada P (2006) Symbolic control of linear systems based on symbolic subsystems. IEEE Trans Autom Control 51(6):1003–1013
Tarraf DC, Megretski A, Dahleh MA (2008) A framework for robust stability of systems over finite alphabets. IEEE Trans Autom Control 53(5):1133–1146

Motion Planning for Marine Control Systems

Andrea Caiti
DII – Department of Information Engineering & Centro "E. Piaggio", ISME – Interuniversity Research Centre on Integrated Systems for the Marine Environment, University of Pisa, Pisa, Italy

Abstract

In this chapter we review motion planning algorithms for ships, rigs, and autonomous marine vehicles. Motion planning includes path and trajectory generation, and it goes from optimized route planning (off-line long-range path generation through operating research methods) to reactive on-line trajectory reference generation, as given by the guidance system. Crucial to the marine systems case is the presence of environmental external forces (sea state, currents, winds) that drive the optimized motion generation process.

Keywords

Configuration space; Dynamic programming; Grid search; Guidance controller; Guidance system; Maneuvering; Motion plan; Optimization algorithms; Path generation; Route planning; Trajectory generation; World space

Introduction

Marine control systems include primarily ships and rigs moving on the sea surface, but also underwater systems, manned (submarines) or unmanned, and eventually can be extended to any kind of off-shore moving platform.

A motion plan consists in determining what motions are appropriate for the marine system to reach a goal, or a target/final state (LaValle 2006). Most often, the final state corresponds to a geographical location or destination, to be reached by the system while respecting constraints of physical and/or economical nature. Motion planning in marine systems hence starts from route planning, and then it covers desired path generation and trajectory generation. Path generation involves the determination of an ordered sequence of states that the system has to follow; trajectory generation requires that the states in a path are reached at a prescribed time.

Route, path, and trajectory can be generated off-line or on-line, exploiting the feedback from the system navigation and/or from external sources (weather forecast, etc.). In the feedback case, planning overlaps with the guidance system, i.e., the continuous computation of the reference (desired) state to be used as reference input by the motion control system (Fossen 2011).

Formal Definitions and Settings

Definitions and classifications as in Goerzen et al. (2010) and Petres et al. (2007) are followed
throughout the section. The marine systems under consideration live in a physical space referred to as the world space (e.g., a submarine lives in a 3-D Euclidean space). A configuration q is a vector of variables that define the position and orientation of the system in the world space. The set of all possible configurations is the configuration space, or C-space. The vector of configuration and configuration rates of change is the state of the system x = [qᵀ q̇ᵀ]ᵀ, and the set of all the possible states is the state space. The kino-dynamic model associated to the system is represented by the system state equations. The regions of C-space free from obstacles are called C-free.

The path planning problem consists in determining a curve γ: [0, 1] → C-free, s ↦ γ(s), with γ(0) corresponding to the initial configuration and γ(1) corresponding to the goal configuration. Both initial and goal configurations are in C-free. The trajectory planning problem consists in determining a curve γ and a time law t ↦ s(t) such that the trajectory is γ(s(t)). In both cases, either the path or the trajectory must be compatible with the system state equations. In the following, definitions of motion algorithm properties are given referring to path planning, but the same definitions can be easily extended to trajectory planning.

A motion planning algorithm is complete if it finds a path when one exists and returns a proper flag when no path exists. The algorithm is optimal when it provides the path that minimizes some cost function J. The (strictly positive) cost function J is isotropic when it depends only on the system configuration (J = J(q)), or anisotropic when it depends also on an external force field f (e.g., sea currents, sea state, weather perturbations) (J = J(q, f)). The cost function J induces a pseudometric in the configuration space; the distance d between configurations q1 and q2 through the path γ, with γ(0) = q1 and γ(1) = q2, is the "cost-to-go" from q1 to q2 along γ:

d_γ(q1, q2) = ∫₀¹ J(γ(s), f) ds (1)

An optimal motion planning problem is:
– Static if there is perfect knowledge of the environment at any time, dynamic otherwise
– Time-invariant when the environment does not evolve (e.g., a coastline that limits the C-free subspace), time-variant otherwise (e.g., other systems – ships, rigs – in navigation)
– Differentially constrained if the system state equations act as a constraint on the path, differentially unconstrained otherwise

In practice, optimal motion planning problems are solved numerically through discretization of the C-space. Resolution completeness/optimality of an algorithm implies the achievement of the solution as the discretization interval tends to zero. Probabilistic completeness/optimality implies that the probability of finding the solution tends to 1 as the computation time approaches infinity. Complexity of the algorithm refers to the computational time required to find a solution as a function of the dimension of the problem.

The scale of the motion with respect to the scale of the system defines the specific setting of the problem. In cargo ship route planning from one port call to the next, the problem is stated first as a static, time-invariant, differentially unconstrained path planning problem; once a large-scale route is thus determined, it can be refined on smaller scales, e.g., by smoothing it, to make it compatible with ship maneuverability. Maneuvering the same cargo ship in the approaches to a harbor has to be cast as a dynamic, time-variant, differentially constrained trajectory planning problem.

Large-Scale, Long-Range Path Planning

Route determination is a typical long-range path planning problem for a marine system. The geographical map is discretized into a grid, and the optimal path between the approaches of the starting and destination ports is determined as a sequence of adjacent grid nodes. The problem is taken as time-invariant and differentially unconstrained, at least in the first stages of the procedure. It is assumed that the ship will cruise at its own (constant) most economical speed to optimize bunker consumption, the major
source of operating costs (Wang and Meng updated as the ship is in transit along the route.
2012). Navigation constraints (e.g., allowed ship The path planning algorithms can/must be rerun,
traffic corridors for the given ship class) are over a grid in the neighborhood of the nominal
considered, as well as weather forecasts and path, each time new environmental information
predicted/prevailing currents and winds. The becomes available. Kino-dynamic model of the
cost-to-go Eq. (1) is built between adjacent nodes ship must be included, allowing for deviation
either in terms of time to travel or in terms of from the established path and ship speed varia-
operating costs, both computed correcting the tion around the nominal most economical speed.
nominal speed with the environmental forces. The latter case is particularly important: increas-
Optimality is defined in terms of shortest ing/decreasing the speed to avoid a weather per-
time/minimum operating cost; the anisotropy turbation keeping the same route may indeed
introduced by sea/weather conditions is the result in a reduced operating cost with respect
driving element of the optimization, making to path modifications keeping a constant speed.
the difference with respect to straightforward This implies that the timing over the path must be
shortest route computation. The approach is specified. Dynamic programming is well suited
iterated, starting with a coarse grid and then for this transition from path to trajectory gen-
increasing grid resolution in the neighborhood of eration, and it is still the most commonly used
the previously found path. approach to trajectory (re)planning in reaction to
The most widely used optimization approach environmental predictions update.
for surface ships is dynamic programming When discretizing the world space, the min-
(LaValle 2006); alternatively, since the deter- imum grid size should still be large enough to
mination of the optimal path along the grid nodes allow for ship maneuvering between grid nodes.
is equivalent to a search over a graph, the A* This is required for safety, to allow evasive ma-
algorithm is applied (Delling et al. 2009). As the neuvering when other ships are at close ranges,
discretization grid gets finer, system dynamics are and for the generation of smooth, dynamics-
introduced, accounting for ship maneuverability compliant trajectories between grid points. This
and allowing for deviation from the constant latter aspect bridges motion planning with guid-
ship speed assumption. Dynamic programming ance.
allows to include system dynamics at any level
of resolution desired; however, when system
dynamics are considered, the problem dimension Trajectory Planning, Maneuvering
grows from 2-D to 3-D (2-D space plus time). Generation, and Guidance Systems
In the case of underwater navigation, path
planning takes place in a 3-D world space, and Once a path has been established over a spatial
the inclusion of system dynamics makes it a grid, a continuous reference has to be generated,
4-D problem; moreover, bathymetry has to be linking the nodes over the grid. The generation
included as an additional constraint to shape of the reference trajectory has to take into ac-
the C -free subspace. Dynamic programming count all the relevant dynamic properties and
may become unfeasible, due to the increase constraints of the marine system, so that the
in dimensionality. Computationally feasible reference motion is feasible. In this scenario, the
algorithms for this case include global search path/trajectory nodes are way-points, and the tra-
strategies with probabilistic optimality, as genetic jectory generation connects the way-points along
algorithms (Alvarez et al. 2004), or improved the route. The approaches to trajectory generation
grid-search methods with resolution optimality, can be divided between those that do not compute
as FM* (Petres et al. 2007). explicitly in advance the whole trajectory and
Environmental force fields are intrinsically dy- those that do.
namic fields; moreover, the prediction of such Among the approaches that do not need ex-
fields at the moment of route planning may be plicit trajectory computation between way-points,
Motion Planning for Marine Control Systems 785
Motion Planning for Marine Control Systems, Fig. 1 Generation of a reference trajectory with a system model and
a guidance controller (Adapted from Fossen (2011))
the most common is the line-of -sight (LOS) is used in simulation, with a (simulated) feed-
guidance law (Pettersen and Lefeber 2001). LOS back controller (guidance controller), the next
guidance can be considered a path generation, way-point as input, and the (simulated) system
more than a trajectory generation, since it does position and velocity as output. The simulated
not impose a time law over the path; it computes directly the desired ship reference heading on the basis of the current ship position and the previous and next way-point positions. A review of other guidance approaches can be found in Breivik and Fossen (2008), where maneuvering along the path and steering around the way-points are also discussed. From such a set of different maneuvers, a library of motion primitives can be built (Greytak and Hover 2010), so that any motion can be specified as a sequence of primitives. While each primitive is feasible by construction, an arbitrary sequence of primitives may not be feasible. An optimized search algorithm (dynamic programming, A*) is again needed to determine the optimal feasible maneuvering sequence.

Path/trajectory planning, which explicitly computes the desired motion between two way-points, may or may not include a system dynamic model. In the latter case, a sufficiently smooth curve that connects two way-points is generated, for instance, as splines or as Dubins paths (LaValle 2006). Curve generation parameters must be set so that the "sufficiently smooth" part is guaranteed. After curve generation, a trim velocity is imposed over the path (path planning), or a time law is imposed, e.g., by smoothly varying the system reference velocity with the local curvature radius. Planners that do use a system dynamic model are described in Fossen (2011) as part of the guidance system. In practice, the dynamic model results are feasible maneuvers by construction and can be given as reference position/velocity to the physical control system (Fig. 1).

Summary and Future Directions

Motion planning for marine control systems employs methodological tools that range from operations research to guidance, navigation, and control systems. A crucial role in marine applications is played by the anisotropy induced by the dynamically changing environmental conditions (weather, sea state, winds, currents – the external force fields). The quality of the plan will depend on the quality of environmental information and predictions.

While motion planning can be considered a mature issue for ships, rigs, and even standalone autonomous vehicles, current and future research directions will likely focus on the following items:
– Coordinated motion planning and obstacle avoidance for teams of autonomous surface and underwater vehicles (Aguiar and Pascoal 2012; Casalino et al. 2009)
– Naval traffic regulation compliant maneuvering in restricted spaces and collision evasion maneuvering (Tam and Bucknall 2010)
– Underwater intervention robotics (Antonelli 2006; Sanz et al. 2010)
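The curve-generation step described above can be sketched in code. The following is a minimal illustration, not any specific planner from the cited literature: a cubic Hermite curve connects two way-points with prescribed headings, and a constant trim velocity is then imposed over the path to obtain a time law. The `scale` parameter and all numeric values are assumptions made for this example.

```python
import math

def hermite_path(p0, psi0, p1, psi1, scale=None, n=200):
    """Cubic Hermite curve from way-point p0 (heading psi0) to p1 (heading psi1).

    Headings are converted to endpoint tangent vectors; `scale` controls how
    strongly the curve aligns with the requested headings at the endpoints.
    """
    d = math.hypot(p1[0] - p0[0], p1[1] - p0[1])
    s = scale if scale is not None else d
    t0 = (s * math.cos(psi0), s * math.sin(psi0))
    t1 = (s * math.cos(psi1), s * math.sin(psi1))
    pts = []
    for i in range(n + 1):
        u = i / n
        h00 = 2*u**3 - 3*u**2 + 1
        h10 = u**3 - 2*u**2 + u
        h01 = -2*u**3 + 3*u**2
        h11 = u**3 - u**2
        x = h00*p0[0] + h10*t0[0] + h01*p1[0] + h11*t1[0]
        y = h00*p0[1] + h10*t0[1] + h01*p1[1] + h11*t1[1]
        pts.append((x, y))
    return pts

def time_law(pts, v_trim):
    """Impose a constant trim speed over the path: timestamp each sample."""
    times = [0.0]
    for (xa, ya), (xb, yb) in zip(pts, pts[1:]):
        times.append(times[-1] + math.hypot(xb - xa, yb - ya) / v_trim)
    return times

path = hermite_path((0.0, 0.0), 0.0, (100.0, 50.0), math.pi / 2)
schedule = time_law(path, v_trim=5.0)  # 5 m/s reference speed
```

As the entry stresses, such a geometric construction does not guarantee feasibility for the vessel dynamics; a dynamic model or a primitive library would still be needed to check the resulting maneuver.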
Motion Planning for PDEs
a systematic inversion-based motion planning approach, which is based on the parametrization of any system variable by means of a flat or basic output. With this, the motion planning problem can be solved rather intuitively, as is illustrated for linear and semilinear PDEs.

nonlinearities as well as formal integration for semilinear PDEs using a generalized Cauchy-Kowalevski approach. To illustrate the principal ideas and the evolving research results starting with Fliess et al. (1997), subsequently different techniques are introduced based on selected example problems. For this, the exposition is primarily restricted to parabolic PDEs, with a brief discussion of motion planning for hyperbolic PDEs before concluding with possible future research directions.

Keywords

Basic output; Flatness; Formal integration; Formal power series; Trajectory assignment; Trajectory planning; Transition path
Linear PDEs

In order to be able to solve (3) for x̂_n(t), it is hence required to impose x̂_0(t) = x̂(0, t). Denoting y(t) = x(0, t) or, respectively,

x̂_0(t) = y(t)   (3c)

implies

x̂_{2n}(t) = (∂_t − r)^n ∘ y(t),  x̂_{2n+1}(t) = 0.   (4)

Hence, any series coefficient in (2) can be differentially parametrized by means of y(t). Taking into account the inhomogeneous boundary condition in (1b), i.e.,

u(t) = x(1, t) = Σ_{n=0}^∞ x̂_n(t)/n! = Σ_{n=0}^∞ x̂_{2n}(t)/(2n)!   (5)

yields that y(t) = x(0, t) can be considered as a flat or basic output. In particular, by prescribing a suitable trajectory t ↦ y*(t) ∈ C^∞(ℝ) for y(t), the evaluation of (5) yields the feedforward control u*(t), which is required to realize the spatial-temporal path x*(z, t) obtained from the substitution of y*(t) into (2) with coefficients parametrized by (4). This, however, relies on the uniform convergence of (2) in view of (4) with at least a unit radius of convergence in z. For this, the notion of a Gevrey class function is needed (Rodino 1993).

Definition 1 (Gevrey class) The function y(t) is in G_{D,α}(Ω), the Gevrey class of order α in Ω ⊆ ℝ, if y(t) ∈ C^∞(Ω) and for every closed subset Ω′ of Ω there exists a D > 0 such that sup_{t∈Ω′} |∂_t^n y(t)| ≤ D^{n+1} (n!)^α.

The set G_{D,α}(Ω) forms a linear vector space and a ring with respect to the arithmetic product of functions, which is closed under the standard rules of differentiation. Gevrey class functions of order α < 1 are entire and are analytic if α = 1.

The proof of this result can be found, e.g., in Laroche et al. (2000) and Lynch and Rudolph (2002) and relies on the analysis of the recursion (3) taking into account the assumptions on the function y(t).

Trajectory Assignment

To apply these results for the solution of the motion planning problem to achieve finite-time transitions between stationary profiles, it is crucial to properly assign the desired trajectory y*(t) for the basic output y(t). For this, observe that, due to the flatness property, stationary profiles x^s(z) = x^s(z; y^s) are governed by

0 = ∂_z^2 x^s(z) + r x^s(z)   (6a)
∂_z x^s(0) = 0,  x^s(0) = y^s   (6b)

(classically, stationary solutions are defined in terms of stationary input values x^s(1) = u^s). Hence, assigning different y^s results in different stationary profiles x^s(z; y^s). The connection between an initial stationary profile x_0^s(z; y_0^s) and a final stationary profile x_T^s(z; y_T^s) is achieved by assigning y*(t) such that

y*(0) = y_0^s,  y*(T) = y_T^s
∂_t^n y*(0) = 0,  ∂_t^n y*(T) = 0,  n ≥ 1.

This implies that y*(t) has to be locally nonanalytic at t ∈ {0, T} and, in view of the previous discussion, has thus to be a Gevrey class function of order α ∈ (1, 2). For specific problems, different functions have been suggested fulfilling these properties. In the following, the ansatz

y*(t) = y_0^s + (y_T^s − y_0^s) Φ_{T,γ}(t)   (7a)

is used with

Φ_{T,γ}(t) = 0 for t ≤ 0,  Φ_{T,γ}(t) = (∫_0^t h_{T,γ}(τ) dτ) / (∫_0^T h_{T,γ}(τ) dτ) for t ∈ (0, T),  Φ_{T,γ}(t) = 1 for t ≥ T.   (7b)

The function (7b) is a Gevrey class function of order α = 1 + 1/γ (Fliess et al. 1997). Alternative functions are presented, e.g., in Rudolph (2003). The resulting feedforward control and spatial-temporal transition path are shown in Fig. 1.
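A numerical sketch of the transition function (7b) and the reference trajectory (7a) follows. The entry does not spell out the kernel h_{T,γ}; the bump kernel used below is a common choice in the flatness literature and is an assumption of this example, as are the midpoint quadrature rule and all numeric values.

```python
import math

def h(tau, T, gamma):
    """Bump kernel; a common (assumed) choice yielding Gevrey order 1 + 1/gamma."""
    s = tau / T
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return math.exp(-1.0 / (s * (1.0 - s)) ** gamma)

def phi(t, T, gamma, n=2000):
    """Transition Phi_{T,gamma}: 0 for t <= 0, 1 for t >= T, smooth in between.

    The two integrals in (7b) are approximated with the midpoint rule.
    """
    if t <= 0.0:
        return 0.0
    if t >= T:
        return 1.0
    dt = T / n
    num = sum(h((k + 0.5) * dt, T, gamma) for k in range(n) if (k + 0.5) * dt <= t)
    den = sum(h((k + 0.5) * dt, T, gamma) for k in range(n))
    return num / den

def y_ref(t, T=1.0, gamma=2.0, y0s=0.0, yTs=1.0):
    """Reference trajectory (7a) between stationary values y0s and yTs."""
    return y0s + (yTs - y0s) * phi(t, T, gamma)
```

By construction, all derivatives of the resulting y*(t) vanish at t = 0 and t = T, which is exactly the property required for connecting two stationary profiles.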
converging or even divergent series expansions (Laroche et al. 2000; Meurer and Zeitz 2005). Flatness-based techniques for motion planning can also be embedded into an operator-theoretic context using semigroup theory by restricting the analysis to so-called Riesz spectral operators. This enables the analysis of coupled systems of linear PDEs with both boundary and in-domain control in single and multiple spatial coordinates within a common framework (Meurer 2011, 2013). In addition, experimental results for flexible beam and plate structures with embedded piezoelectric actuators confirm the applicability of this design approach and the achievable high tracking accuracy when transiently shaping the deflection profile (Schröck et al. 2013).

Semilinear PDEs

Flatness can be extended to semilinear PDEs. This is subsequently illustrated for the diffusion-reaction system

∂_t x(z, t) = ∂_z^2 x(z, t) + r(x(z, t))   (8a)
∂_z x(0, t) = 0,  x(1, t) = u(t)   (8b)
x(z, 0) = 0   (8c)

with boundary input u(t). Similar to the previous section, the motion planning problem refers to the determination of a feedforward control t ↦ u*(t) to realize finite-time transitions starting at the initial profile x_0(z) = x(z, 0) = 0 to achieve a final stationary profile x_T(z) for t ≥ T.

Formal Power Series

If r(x(z, t)) is a polynomial in x(z, t) or an analytic function, then, similar to the previous section, formal power series can be applied to solve the motion planning problem. This, however, relies on the successive evaluation of Cauchy's product formula. As an example, consider

r(x(z, t)) = r_1 x(z, t) + r_2 x^2(z, t),

then the formal power series ansatz (2) results in the recursion

x̂_n(t) = ∂_t x̂_{n−2}(t) − r_1 x̂_{n−2}(t) − r_2 Σ_{j=0}^{n−2} (n−2 choose j) x̂_j(t) x̂_{n−2−j}(t),  n ≥ 2   (9a)
x̂_1(t) = 0.   (9b)

Similar to the linear setting in the section "Linear PDEs" above, the recursion can be solved for x̂_n(t) by imposing x̂_0(t) = x̂(0, t) or, respectively,

x̂_0(t) = y(t).   (9c)

As a result, also in this nonlinear setting any series coefficient can be expressed in terms of y(t) and its time derivatives. Hence, y(t) = x(0, t) denotes a basic output for the semilinear PDE (8). The uniform series convergence can be analyzed by restricting any trajectory y(t) to a certain Gevrey order α while simultaneously restricting the absolute values of d, r_1, and r_2 (Dunbar et al. 2003; Lynch and Rudolph 2002). These restrictions can be approached using, e.g., resummation techniques to sum slowly converging or divergent series to a meaningful limit. The reader is therefore referred to Meurer and Zeitz (2005) or Meurer and Krstic (2011), with the latter introducing a PDE-based approach for formation control of multi-agent systems.

Formal Integration

A generalization of these results has been recently suggested in Schörkhuber et al. (2013) by making use of an abstract Cauchy-Kowalevski theorem in Gevrey classes. To illustrate this, solve (8a) for ∂_z^2 x(z, t) and formally integrate with respect to z, taking into account the boundary conditions (8b). This yields the implicit solution

x(z, t) = x(0, t) + ∫_0^z ∫_0^p [∂_t x(q, t) − r(x(q, t))] dq dp   (10a)
u(t) = x(1, t),   (10b)

which can be used to develop a flatness-based design systematics for motion planning for semilinear PDEs. For this, introduce

0 = ∂_z^2 x^s(z) + r(x^s(z))   (14a)
∂_z x^s(0) = 0,  x^s(0) = y^s.   (14b)
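The recursion (9) can be evaluated numerically when y(t) is a polynomial, so that time derivatives are exact. The sketch below represents each coefficient x̂_n(t) by a list of polynomial coefficients and evaluates a truncated version of the boundary series u(t) = x(1, t) = Σ x̂_n(t)/n!. The truncation order N and all helper names are choices made for this illustration; none of the formal convergence machinery discussed above is reproduced.

```python
from math import comb, factorial

def p_add(a, b):
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(n)]

def p_scale(a, c):
    return [c * x for x in a]

def p_mul(a, b):
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def p_diff(a):
    return [i * a[i] for i in range(1, len(a))] or [0.0]

def p_eval(a, t):
    return sum(c * t**k for k, c in enumerate(a))

def series_coeffs(y_poly, r1, r2, N):
    """Recursion (9): xhat_n = d/dt xhat_{n-2} - r1*xhat_{n-2}
                               - r2 * sum_j comb(n-2, j) * xhat_j * xhat_{n-2-j}."""
    xh = [y_poly, [0.0]]  # xhat_0 = y(t), xhat_1 = 0
    for n in range(2, N + 1):
        term = p_add(p_diff(xh[n - 2]), p_scale(xh[n - 2], -r1))
        acc = [0.0]
        for j in range(n - 1):
            acc = p_add(acc, p_scale(p_mul(xh[j], xh[n - 2 - j]), comb(n - 2, j)))
        xh.append(p_add(term, p_scale(acc, -r2)))
    return xh

def u_of_t(t, y_poly, r1, r2, N=10):
    """Truncated evaluation of the boundary input u(t) = sum_n xhat_n(t)/n!."""
    xh = series_coeffs(y_poly, r1, r2, N)
    return sum(p_eval(xh[n], t) / factorial(n) for n in range(N + 1))
```

For r_1 = r_2 = 0 and y(t) = t, the series terminates and reproduces the exact heat-equation input u(t) = t + 1/2 (from x(z, t) = t + z²/2), which provides a quick consistency check.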
Motorcycle Dynamics and Control

motorcycle is not intuitive like that of a car. To turn a car right, the driver simply turns the steering wheel right; on a motorcycle, the rider initially turns the handlebars to the left. This is called counter-steering and is not intuitive.

A Basic Model

To obtain a basic motorcycle model, we start with four rigid bodies: the rear frame (which includes a rigidly attached rigid rider), the front frame (which includes the handlebars and front forks), the rear wheel, and the front wheel; see Fig. 1. We assume that both frames and wheels have a plane of symmetry which is vertical when the bike is in its nominal upright configuration. The front frame can rotate relative to the rear frame about the steering axis; the steering axis is in the plane of symmetry of each frame and, in the nominal upright configuration of the bike, the angle it makes with the vertical is called the rake angle or caster angle and is denoted by ε. The rear wheel rotates relative to the rear frame about an axis perpendicular to the rear plane of symmetry and is symmetrical with respect to this axis. The same relationship holds between the front wheel and the front frame. Although each wheel can be three dimensional, we model the wheels as terminating in a knife edge at their boundaries and contacting the ground at a single point. Points Q and P are the points on the ground in contact with the front and rear wheels, respectively.

Each of the above four bodies is described by its mass, mass center location, and a 3×3 inertia matrix. Two other important parameters are the wheelbase w and the trail c. The wheelbase is the distance between the contact points of the two wheels in the nominal configuration, and the trail is the distance from the front wheel contact point Q to the intersection S of the steering axis with the ground. The trail is normally positive, that is, Q is behind S. The point G locates the mass center of the complete bike in its nominal configuration, whereas G_A is the location of the mass center of the front assembly (front frame and wheel).

Description of Motion

Considering a right-handed reference frame e = (ê_1, ê_2, ê_3) with origin O fixed in the ground, the bike motion can be described by the location of the rear wheel contact point P relative to O, the orientation of the rear frame relative to e, and the orientation of the front frame relative to the rear frame; see Fig. 2. Assuming the bike is moving along a horizontal plane, the location of P is usually described by Cartesian coordinates x and y. Let reference frame b be fixed in the rear frame with b̂_1 and b̂_3 in the plane of symmetry with b̂_1 along the nominal P S

Motorcycle Dynamics and Control, Fig. 1 Basic model

Motorcycle Dynamics and Control, Fig. 2 Description of motion
line; see Fig. 2. Using this reference frame, the orientation of the rear frame is described by a 3-1-2 Euler angle sequence, which consists of a yaw rotation by ψ about the 3-axis, followed by a lean (roll) rotation by φ about the 1-axis, and finally by a pitch rotation by θ about the 2-axis. The orientation of the front frame relative to the rear frame can be described by the steer angle δ. Assuming both wheels remain in contact with the ground, the pitch angle is not independent; it is uniquely determined by δ and φ. In considering small perturbations from the upright nominal configuration, the variation in pitch is usually ignored. Here we consider it to be zero. Also, the dynamic behavior of the bike is independent of the coordinates x, y, and ψ. These coordinates can be obtained by integrating the velocity of P and ψ̇.

The Whipple Bicycle Model

The "simplest" model which captures all the salient features of a single-track vehicle for a basic understanding of low-speed dynamics and control is that originally due to Whipple (1899). We consider the linearized version of this model, which is further expounded on in Meijaard et al. (2007). The salient feature of this model is that there is no slip at each wheel. This means that the velocity of the point on the wheel which is instantaneously in contact with the ground is zero; this is illustrated in Fig. 3 for the rear wheel. No slip implies that there is no sideslip, which means that the velocity of the wheel contact point (v̄_P in Fig. 3) is parallel to the intersection of the wheel plane with the ground plane; the wheel contact point is the point moving along the ground which is in contact with the wheel.

The rest of this entry is based on linearization of motorcycle dynamics about an equilibrium configuration corresponding to the bike traveling upright in a straight line at constant forward speed v := v_P, the speed of the rear wheel contact point P. In the linearized system, the longitudinal dynamics are independent of the lateral dynamics, and in the absence of driving or braking forces, the speed v is constant.

With no sideslip at both wheels, kinematical considerations (see Fig. 4) show that the yaw rate ψ̇ is determined by δ; specifically, for small angles we have the following linearized relationship:

ψ̇ = σ v δ + κ δ̇   (1)

where σ = c_ε/w, c_ε = cos ε, and κ = c c_ε/w is the normalized mechanical trail. In Fig. 4, δ_f = c_ε δ is the effective steer angle; it is the angle between the intersections of the front wheel plane and the rear frame plane with the ground. Thus we can completely describe the lateral bike dynamics with the roll angle φ and the steer angle δ. To obtain the above relationship, first note that, as a consequence of no sideslip, v̄_P = v b̂_1 and v̄_Q is perpendicular to f̂_2. Taking the dot product of the expression

v̄_Q = v̄_P + (w + c) ψ̇ b̂_2 − c (ψ̇ + δ̇_f) f̂_2

with f̂_2, while noting that b̂_1 · f̂_2 = −sin δ_f and b̂_2 · f̂_2 = cos δ_f, results in

0 = −v sin δ_f + (w + c) ψ̇ cos δ_f − c (ψ̇ + δ̇_f).

Linearization about δ = 0 yields the desired result.

The relationship in (1) also holds for four-wheel vehicles. There one can achieve a desired

Motorcycle Dynamics and Control, Fig. 3 No slip: v_{P⁰} = 0

Motorcycle Dynamics and Control, Fig. 4 Some kinematics
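The linearized relationship (1) is easy to exercise numerically. In the sketch below, the wheelbase, trail, and rake values are illustrative assumptions rather than data from the entry; the check confirms that a constant steer angle yields a steady turn of radius R = 1/(σδ) = w/(δ cos ε).

```python
import math

# Illustrative parameters (assumed values, not from the entry)
w = 1.5                    # wheelbase [m]
c = 0.08                   # trail [m]
eps = math.radians(27.0)   # rake (caster) angle

c_eps = math.cos(eps)
sigma = c_eps / w          # steer-to-yaw gain factor from (1)
kappa = c * c_eps / w      # normalized mechanical trail

def yaw_rate(v, delta, delta_dot):
    """Linearized relationship (1): psi_dot = sigma*v*delta + kappa*delta_dot."""
    return sigma * v * delta + kappa * delta_dot

# Steady turn: delta constant (delta_dot = 0) at speed v
v, delta = 10.0, 0.05
psi_dot = yaw_rate(v, delta, 0.0)
R = v / psi_dot            # turn radius; equals 1/(sigma*delta) = w/(delta*cos(eps))
```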
constant yaw rate ψ̇_d by simply letting the steer angle δ = ψ̇_d/(σv). However, as we shall see, a motorcycle with its steering fixed at a constant steer angle is unstable. Neglecting gyroscopic terms, it is simply an inverted pendulum. With its steering free, a motorcycle can be stable over a certain speed range and, if unstable, can be easily stabilized above a very low speed by most people. Trials riders can stabilize a motorcycle at any speed, including zero.

To help understand the effect of steer angle on bike behavior, we initially ignore the mass and inertia of the front assembly along with gyroscopic effects, and we assume that the b̂_1 axis is a principal axis of inertia of the rear frame with moment of inertia I_xx. Angular momentum considerations about the b̂_1 axis and linearization result in

I_xx φ̈ + m h a_B = m g h φ + (N_f c c_ε) δ   (2)

where N_f is the normal force (vertical and upwards) on the front wheel and a_B is the lateral acceleration (perpendicular to the rear frame) of point B, which is the projection of G onto the b̂_1 axis. By considering a moment balance about the pitch axis b̂_2 through P, one can obtain that N_f = mgb/w. Notice that, with the steering fixed at δ = 0, Eq. (2) is the equation of motion of a simple inverted pendulum whose support axis is accelerating horizontally with acceleration a_B. This is illustrated in Fig. 5.

Motorcycle Dynamics and Control, Fig. 5 Inverted pendulum with accelerating support point

Basic kinematics reveal that a_B = v ψ̇ + b ψ̈ and, recalling relationship (1), Eq. (2) now yields the lean equation

I_xx φ̈ − m g h φ = −m_δ δ̈ − c_δ v δ̇ − k_δ(v) δ   (3)

where m_δ = κ m h b > 0, c_δ = m h (κ + σ b) > 0, and k_δ(v) = −κ m g b + σ m h v². Note that v is a constant parameter corresponding to the nominal speed of the rear wheel contact point.

With δ = 0, we have a system whose behavior is characterized by two real eigenvalues, ±√(mgh/I_xx). This system is unstable due to the positive eigenvalue +√(mgh/I_xx). For v sufficiently large, the coefficient k_δ(v) is positive, and one can readily show that the above system can be stabilized with positive feedback δ = Kφ provided K > mgh/k_δ(v). This helps explain the stabilizing effect of a rider turning the handlebars in the direction the bike is falling. Actually, the rider input is a steer torque T_δ about the steer axis.

To explain why an uncontrolled motorcycle can be stable or easily stabilized, one also has to look at the effect that φ has on δ; in general, a lean perturbation results in the front assembly turning in the same direction, that is, a positive perturbation of φ results in a positive change in δ.

The lean equation also explains why a motorcycle must lean when cornering above a certain speed. Suppose the motorcycle is in a right-hand corner of radius R at some constant speed v: in this scenario, ψ̇ = v/R and, with δ constant, (1) implies that δ = ψ̇/(σv) = 1/(σR); with δ and φ constant, the lean equation now requires that φ = k_δ(v)δ/(mgh) = k_δ(v)/(mghσR). For higher speeds, k_δ(v) ≈ σmhv²; hence φ ≈ v²/(gR). Since a_B = v²/R, the lean angle is approximately a_B/g. Hence, to corner with a lateral acceleration a_B = v²/R, the motorcycle must lean at an angle of approximately a_B/g.

The lean equation can also help explain counter-steering; that is, at speeds above a reasonably low speed, one can initiate a turn by turning the handlebars in the opposite direction to that in which one wants to go; to turn right, one initially turns the handlebars to the left. See Limebeer and Sharp (2006) for further discussion.

Taking into account the mass and inertia of the front assembly, gyroscopic effects, and cross products of inertia of the rear frame, one can show (see Meijaard et al. 2007) that the lean equation (3) still holds with modified coefficients: m_δ gains terms involving the inertia cross products I_xz and I_Ax, c_δ gains terms involving I_xz and the gyroscopic coefficients S_T and S_F, and the stiffness becomes

k_δ(v) = k_{0δ} + k_{2δ} v²,  k_{0δ} = −S_A g,  k_{2δ} = σ (m h + S_T).

Here I_xx is the moment of inertia of the total motorcycle and rider in the nominal configuration about the b̂_1 axis, and I_xz is the inertia cross product w.r.t. the b̂_1 and b̂_3 axes. The term I_Ax is the front assembly inertia cross product with respect to the steering axis and the b̂_1 axis; see Meijaard et al. (2007) for a further description of this parameter. Also, S_A = κ m b + m_A u_A, where m_A is the mass of the front assembly (front wheel and front frame) and u_A is the offset of the mass center of the front assembly from the steering axis, that is, the distance of this mass center from the steering axis; see Fig. 1. The terms S_F = I_{Fyy}/r_F and S_T = I_{Ryy}/r_R + I_{Fyy}/r_F are gyroscopic terms due to the rotation of the front and rear wheels, where r_F and r_R are the radii of the front and rear wheels, while I_{Fyy} and I_{Ryy} are the moments of inertia of the front and rear wheels about their axles. It is assumed that the mass center of each wheel is located at its geometric center.

By considering an angular momentum balance about a vertical axis through P, one can obtain an expression for the lateral force at the front wheel. Angular momentum considerations about the steering axis for the front assembly and linearization then yield the steer equation (4), with coefficients k_{δφ} = k_{φδ}, k_{δδ}(v) = k_{0δδ} + k_{2δδ} v², c_{φδ}, and c_{δδ} that involve S_A, S_F, s_ε = sin ε, I_zz, and I_Az; see Meijaard et al. (2007) for the complete expressions. Here, I_zz is the moment of inertia of the total motorcycle and rider in the nominal configuration about the b̂_3 axis, I_A is the moment of inertia of the front assembly about the steering axis, and I_Az is the front assembly inertia cross product with respect to the steering axis and the vertical axis through P. The lean equation (3) combined with the steer equation (4) provide an initial model for motorcycle dynamics. This is a linear model with the nominal speed v as a constant parameter and the rider's steering torque T_δ as an input.

Modes of Whipple Model

At v = 0, the linearized Whipple model (3)–(4) has two pairs of real eigenvalues, ±p_1, ±p_2 with p_2 > p_1 > 0; see Fig. 6. The pair ±p_1 roughly describes inverted pendulum behavior of the whole bike with fixed steering, while ±p_2 describes inverted pendulum behavior of the front assembly with the rear frame fixed upright. As v increases, the real eigenvalues corresponding to p_1 and p_2 meet and from there on form a
Cross-References

Bibliography

Åström KJ, Klein RE, Lennartsson A (2005) Bicycle dynamics and control. IEEE Control Syst Mag 25(4):26–47
Cossalter V, Lot R (2002) A motorcycle multi-body model for real time simulations based on the natural coordinates approach. Veh Syst Dyn 37(6):423–447
Cossalter V, Doria A, Lot R, Ruffo N, Salvador M (2003) Dynamic properties of motorcycle and scooter

Moving Horizon Estimation

Synonyms

Abstract

Moving horizon estimation (MHE) is a state estimation method that is particularly useful for nonlinear or constrained dynamic systems for which few general methods with established properties are available. This entry explains the concept of full information estimation and introduces moving horizon estimation as a computable approximation of full information. The basic design methods for ensuring stability of MHE are presented. The relationships of full information and MHE to other state estimation methods such as Kalman filtering and statistical sampling are discussed.

Keywords

The state of the system is x ∈ ℝⁿ, the measurement is y ∈ ℝᵖ, and the notation x⁺ means x at the next sample time. A control input u may be included in the model, but it is considered a known variable, and its inclusion is irrelevant to state estimation, so we suppress it in the model under consideration here. We receive measurement y from the sensor, but the process disturbance, w ∈ ℝᵍ, measurement disturbance, v ∈ ℝᵖ, and system initial state, x(0), are considered unknown variables.

The goal of state estimation is to construct or estimate the trajectory of x from only the measurements y. Note that for control purposes, we are usually interested in the estimate of the state at the current time, T, rather than the entire trajectory over the time interval [0, T]. In the moving horizon estimation (MHE) method, we use optimization to achieve this goal. We have two sources of error: the state transition is affected by an unknown process disturbance (or noise), w, and the measurement process is affected by another disturbance, v. In the MHE approach, we formulate the optimization objective to minimize the size of these errors, thus finding a trajectory of the state that comes close to satisfying the (error-free) model while still fitting the measurements.

First, we define some notation necessary to distinguish the system variables from the estimator variables. We have already introduced the system variables (x, w, y, v). In the estimator optimization problem, these have corresponding decision variables, which we denote by the Greek letters (χ, ω, η, ν). The relationships between these variables are given by the system model (2).

The full information objective function is

V_T(χ(0), ω) = ℓ_x(χ(0), x̄₀) + Σ_{i=0}^{T−1} ℓ_i(ω(i), ν(i))   (3)

subject to (2), in which T is the current time, ω is the estimated sequence of process disturbances, (ω(0), …, ω(T−1)), y(i) is the measurement at time i, and x̄₀ is the prior, i.e., available, value of the initial state. Full information here means that we use all the data on the time interval [0, T] to estimate the state (or state trajectory) at time T. The stage cost ℓ_i(ω, ν) costs the model disturbance and the fitting error, the two error sources that we reconcile in all state estimation problems.

The full information estimator is then defined as the solution to

min_{χ(0), ω} V_T(χ(0), ω)   (4)

The solution to the optimization exists for all T ∈ 𝕀≥0 under mild continuity assumptions and choice of stage cost. Many choices of (positive, continuous) stage costs ℓ_x(·) and ℓ_i(·) are possible, providing a rich class of estimation problems
Moving Horizon Estimation, Fig. 1 The state, measured output, and disturbance variables appearing in the state estimation optimization problem. The state trajectory (gray circles in lower half) is to be reconstructed given the measurements (black circles in upper half)

that can be tailored to different applications. Because the system model (1) and cost function (3) are so general, it is perhaps best to start off by specializing them to see the connection to some classic results.

Related Problem: The Kalman Filter

If we specialize to the linear dynamic model f(x, w) = Ax + Gw, h(x) = Cx, and let x(0), w, and v be independent, normally distributed random variables, the classic Kalman filter is known to be the statistically optimal estimator, i.e., the Kalman filter produces the state estimate that maximizes the conditional probability of x(T) given y(0), …, y(T). The full information estimator is equivalent to the Kalman filter given the linear model assumption and the following choice of quadratic stage costs

ℓ_x(χ(0), x̄₀) = (1/2) ‖χ(0) − x̄₀‖²_{P₀⁻¹}
ℓ_i(ω, ν) = (1/2) (‖ω‖²_{Q⁻¹} + ‖ν‖²_{R⁻¹})

in which the random variable x(0) is assumed to have mean x̄₀ and variance P₀, and the random variables w and v are assumed zero mean with variances Q and R, respectively. The Kalman filter is also a recursive solution to the state estimation problem, so that only the current mean x̂ and variance P of the conditional density are required to be stored, instead of the entire history of measurements y(i), i = 0, …, T. This computational efficiency is critical for success in online application for processes with short time scales requiring fast processing.

But if we consider nonlinear models, the maximization of conditional density is usually an intractable problem, especially in online applications. So, MHE becomes a natural alternative for nonlinear models or if an application calls for hard constraints to be imposed on the estimated variables.
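For the linear scalar case, the equivalence between full information estimation and the Kalman filter can be verified directly: with the quadratic stage costs above, minimizing (3) is a linear least-squares problem. The scalar model f(x, w) = ax + w, h(x) = cx and all numeric values below are illustrative assumptions of this sketch, and the Kalman recursion is written in one-step-ahead (predictor) form so that its output matches the terminal state of the full information trajectory.

```python
import numpy as np

def full_information_estimate(y, a, c, x0bar, P0, Q, R):
    """Scalar linear case: minimize (3) with the quadratic stage costs above.

    Decision variables are the trajectory chi(0..T); omega(i) = chi(i+1) - a*chi(i)
    and nu(i) = y(i) - c*chi(i), so the problem is a linear least squares.
    """
    T = len(y)
    A = np.zeros((1 + 2 * T, T + 1))
    rhs = np.zeros(1 + 2 * T)
    A[0, 0] = 1.0 / np.sqrt(P0)                 # prior residual on chi(0)
    rhs[0] = x0bar / np.sqrt(P0)
    for i in range(T):
        A[1 + i, i + 1] = 1.0 / np.sqrt(Q)      # omega(i) residual
        A[1 + i, i] = -a / np.sqrt(Q)
        A[1 + T + i, i] = c / np.sqrt(R)        # nu(i) residual
        rhs[1 + T + i] = y[i] / np.sqrt(R)
    chi, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return chi  # chi[-1] estimates x(T) given y(0..T-1)

def kalman_predictor(y, a, c, x0bar, P0, Q, R):
    """One-step-ahead Kalman recursion for the same scalar model."""
    xhat, P = x0bar, P0
    for yi in y:
        K = a * P * c / (c * c * P + R)
        xhat = a * xhat + K * (yi - c * xhat)
        P = a * a * P * R / (c * c * P + R) + Q
    return xhat

y = [0.9, 1.4, 1.1, 2.0]
fi = full_information_estimate(y, a=0.95, c=1.0, x0bar=0.0, P0=1.0, Q=0.1, R=0.5)
kf = kalman_predictor(y, a=0.95, c=1.0, x0bar=0.0, P0=1.0, Q=0.1, R=0.5)
```

The two terminal estimates agree to machine precision, which is the scalar instance of the equivalence stated above; for nonlinear f or h, the least-squares structure is lost and the full optimization problem must be solved instead.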
of the state vector n, however. The efficiency of the sampling strategy is particularly important for online use of state estimators. Rawlings and Bakshi (2006) and Rawlings and Mayne (2009, pp. 329–355) provide some comparisons of particle filtering with MHE and also describe some hybrid methods combining MHE and particle filtering.

Summary and Future Directions

MHE is one of few state estimation methods that can be applied to nonlinear models for which properties such as estimator stability can be established (Rao et al. 2003; Rawlings and Mayne 2009). The required online solution of an optimization problem is computationally demanding in some applications but can provide significant benefits in estimator accuracy and rate of convergence (Patwardhan et al. 2012). Current topics for MHE theoretical research include treating bounded rather than convergent disturbances and establishing properties of suboptimal MHE (Rawlings and Ji 2012). The current main focus for MHE applied research involves reducing the online computational complexity to reliably handle challenging large dimensional, nonlinear applications (Kuhl et al. 2011; Lopez-Negrete and Biegler 2012; Zavala and Biegler 2009; Zavala et al. 2008).

Cross-References

Bounds on Estimation
Estimation, Survey on
Extended Kalman Filters
Nonlinear Filters
Particle Filters

Recommended Reading

The references provide either (i) general background required to understand MHE theory and its relationship to other methods, or (ii) computational methods for solving the real-time MHE optimization problem, or (iii) challenging nonlinear applications that demonstrate benefits and probe the current limits of MHE implementations.

Bibliography

Bertsekas DP (1995) Dynamic programming and optimal control, vol 1. Athena Scientific, Belmont
Kuhl P, Diehl M, Kraus T, Schloder JP, Bock HG (2011) A real-time algorithm for moving horizon state and parameter estimation. Comput Chem Eng 35:71–83
Kwakernaak H, Sivan R (1972) Linear optimal control systems. Wiley, New York. ISBN 0-471-51110-2
Lopez-Negrete R, Biegler LT (2012) A moving horizon estimator for processes with multi-rate measurements: a nonlinear programming sensitivity approach. J Process Control 22:677–688
Patwardhan SC, Narasimhan S, Jagadeesan P, Gopaluni B, Shah SL (2012) Nonlinear Bayesian state estimation: a review of recent developments. Control Eng Pract 20:933–953
Rao CV, Rawlings JB, Mayne DQ (2003) Constrained state estimation for nonlinear discrete-time systems: stability and moving horizon approximations. IEEE Trans Autom Control 48(2):246–258
Rawlings JB, Bakshi BR (2006) Particle filtering and moving horizon estimation. Comput Chem Eng 30:1529–1541
Rawlings JB, Ji L (2012) Optimization-based state estimation: current status and some new results. J Process Control 22:1439–1444
Rawlings JB, Mayne DQ (2009) Model predictive control: theory and design. Nob Hill Publishing, Madison, 576 p. ISBN 978-0-9759377-0-9
Sontag ED, Wang Y (1997) Output-to-state stability and detectability of nonlinear systems. Syst Control Lett 29:279–290
Zavala VM, Biegler LT (2009) Optimization-based strategies for the operation of low-density polyethylene tubular reactors: nonlinear model predictive control. Comput Chem Eng 33(10):1735–1746
Zavala VM, Laird CD, Biegler LT (2008) A fast moving horizon estimation algorithm based on nonlinear programming sensitivity. J Process Control 18:876–884
MSPCA

Multiscale Multivariate Statistical Process Control

Multi-domain Modeling and Simulation

Martin Otter
Institute of System Dynamics and Control, German Aerospace Center (DLR), Wessling, Germany

Abstract

One starting point for the analysis and design of a control system is the block diagram representation of a plant. Since it is nontrivial to convert a physical model of a plant into a block diagram, this can be performed manually only for small plant models. Based on research from the last 35 years, more and more mature tools are available to achieve this transformation fully automatically. As a result, multi-domain plants, for example, systems with electrical, mechanical, thermal, and fluid parts, can be modeled in a unified way and can be used directly as input–output blocks for control system design. An overview of the basic principles of this approach is given. This also provides the possibility to use nonlinear, multi-domain plant models directly in a controller. Finally, the low-level "Functional Mockup Interface" standard is sketched, which serves to exchange multi-domain models between many different modeling and simulation environments.

Keywords

Block diagram; Bond graph; Differential-algebraic equation (DAE) system; Flow variable;

Introduction

Methods and tools for control system analysis and design usually require an input–output block diagram description of the plant to be controlled. Apart from small systems, it is nontrivial to derive such models from first principles of physics. For a long time, methods and tools have been available to construct such models automatically for one domain, for example, a mechanical model or an electronic or hydraulic circuit. These domain-specific methods and tools are, however, of only limited use for the modeling of multi-domain systems.

In the dissertation (Elmqvist 1978), a suitable approach for multi-domain, object-oriented modeling was developed by introducing a modeling language to define models on a high level based on first principles. The resulting DAE (differential-algebraic equation) systems are automatically transformed by suitable algorithms into a block diagram description with input and output signals based on ODEs (ordinary differential equations).

In 1978, computers were not powerful enough to apply this method to larger systems. This changed in the 1990s; since then the technology has been substantially improved, many different modeling languages appeared (and also disappeared), and the technology was introduced in commercial simulation environments.

Table 1 gives an overview of the most important standards, languages, and tools for multi-domain modeling in the year 2013. The Modelica language is a standard from the Modelica Association (Modelica Association 2012). The first version was released in 1997. A large free library is also provided, with about 1,300 model components from many domains. There are several software tools supporting this modeling language and the free Modelica
806 MRAC
Multi-domain Modeling and Simulation, Table 1 Multi-domain modeling and simulation environments
Tool name Web (accessed December 2013)
Environments based on the Modelica Standard (https://www.Modelica.org)
CyModelica http://cydesign.com/
Dymola http://www.dymola.com/
JModelica.org http://www.jmodelica.org/
LMS Imagine.Lab AMESim http://www.lmsintl.com/LMS-Imagine-Lab-AMESim
MapleSim http://www.maplesoft.com/products/maplesim
MWorks http://en.tongyuan.cc/
OpenModelica https://openmodelica.org/
SimulationX http://www.itisim.com/simulationx/
Wolfram SystemModeler http://www.wolfram.com/system-modeler/
Environments based on the VHDL-AMS Standard (http://www.eda.org/twiki/bin/view.cgi/P10761)
ANSYS Simplorer http://www.ansys.com/Products
Saber http://www.synopsys.com/Systems/Saber
SMASH http://www.dolphin.fr/medal/products/smash/smashoverview.php
SystemVision http://www.mentor.com/products/sm/systemvision
Virtuoso AMS designer http://www.cadence.com
Environments with vendor-specific multi-domain modeling languages
EcosimPro http://www.ecosimpro.com/
gPROMS http://www.psenterprise.com/gproms
OpenMAST http://www.openmast.org/
Simscape https://www.mathworks.com/products/simscape
Environments based on the Bond Graph Methodology
20-sim http://www.20sim.com/
Standard Library. The examples of this entry are mostly provided from this standard.
The following registered trademarks are referenced:

Trademark: Registered owner of trademark
AMESim: IMAGINE SA
ANSYS: ANSYS Inc.
Dymola: Dassault Systemes AB
EcosimPro: Empresarios Agrupados A.I.E.
gPROMS: Process Systems Enterprise Limited
MATLAB: The MathWorks Inc
Modelica: Modelica Association
Saber: Sabremark Limited partnership
SimulationX: ITI GmbH
Simulink: The MathWorks Inc
SystemVision: Mentor Graphics Corporation
Virtuoso: Cadence Design

• The VHDL-AMS language is a standard from IEEE (IEEE 1076.1-2007 2007), first released in 1999. It is an extension of the widely used VHDL hardware description language. This language is especially used in the electronics community.
• There are several vendor-specific modeling languages, notably Simscape from MathWorks as an extension to Simulink, as well as MAST, the underlying modeling language of Saber (Mantoolh and Vlach 1992). In 2004, MAST was published as OpenMAST under an open source license.
• Bond graphs (see, e.g., Karnopp et al. 2012) are a special graphical notation to define multi-domain systems based on energy flow. They were invented in 1959 by Henry M. Paynter.

In the section "Modeling Language Principles", the principles of multi-domain modeling based on a modeling language are summarized. In the section "Models for Control Systems", it is shown how such models can be used not only for simulation but also as components in nonlinear control systems. Finally, in the section
“The Functional Mockup Interface”, an overview A component, like a resistor, rotational inertia,
about a low-level standard for the exchange of or convective heat transfer, is shown as an icon
multi-domain systems is described. in the diagram. On the border of a component,
small rectangular or circular signs are present
representing the “physical ports.” Ports are con-
nected by lines and model the (idealized) physical
Modeling Language Principles or signal interaction between ports of different
components, for example, the flow of electrical
Schematics: The Graphical View current or heat or the rigid mechanical coupling.
Modelers nowadays require a simple to use Components are built up hierarchically from
graphical environment to build up models. With other components. On the lowest level, compo-
very few exceptions, multi-domain environments nents are described textually with the respec-
define models by schematic diagrams. A typical tive modeling language (see section “Component
example is given in Fig. 1, showing a simple Equations”).
direct-current electrical motor in Modelica.
In the lower left part, the electrical circuit
diagram of the DC motor is visible, consisting Coupling Components by Ports
mainly of the armature resistance and inductance The ports define how of a component can interact
of the motor, a voltage source, and component with other components. A port contains (a) a def-
“emf” to model in an idealized way the electro- inition of the variables that describe the interface
motoric forces in the air gap. On the lower right and (b) defines in which way a tool can automat-
part, the motor inertia, a gear box, and a load ically construct the equations of connections. A
inertia are present. In the upper part, the heat typical scenario is shown in Fig. 2 where the ports
transfer of the resistor losses to the environment of the three components A, B, C are connected
is modeled with lumped elements. together at one point P:
Multi-domain Modeling and Simulation, Fig. 1 Modelica schematic of DC motor with mechanical load and heat losses (components: signalVoltage; resistor R=13.8; inductor L=0.061; emf k=1.016; ground; motorInertia; gear ratio=105; loadInertia; heatCapacitor; thermalCond; convection; fixedTemp T=20 degC)
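A schematic like Fig. 1 is, in essence, a set of component instances plus port-to-port connections. The following sketch of such a netlist data structure is purely illustrative (the Python representation and port names are made up; only the component names and parameter values are taken from the labels of Fig. 1):

```python
# Hypothetical netlist for the electrical and mechanical parts of Fig. 1.
# Parameter values follow the component labels of the schematic.
components = {
    "signalVoltage": {},
    "resistor": {"R": 13.8},
    "inductor": {"L": 0.061},
    "emf": {"k": 1.016},
    "gear": {"ratio": 105},
    "motorInertia": {},
    "loadInertia": {},
    "ground": {},
}
connections = [
    ("signalVoltage.p", "resistor.p"),
    ("resistor.n", "inductor.p"),
    ("inductor.n", "emf.p"),
    ("emf.n", "signalVoltage.n"),
    ("signalVoltage.n", "ground.p"),
    ("emf.flange", "motorInertia.a"),
    ("motorInertia.b", "gear.a"),
    ("gear.b", "loadInertia.a"),
]
# Each connection joins two ports; multi-port connection points are
# handled by the connection semantics described in the following section.
assert all(len(c) == 2 for c in connections)
```

Such a structure is all a tool needs in order to generate the connection equations automatically.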
When cutting away the connection lines, the resulting system consists of three decoupled components A, B, C and a new component around P describing the infinitesimally small connection point. The balance equations and the boundary conditions of the respective domain must hold for all these components. When drawing the connection lines, enough information must be available in the port definitions so that the tool can construct the equations of the infinitesimally small connection points automatically.
To summarize, the component developer is responsible that the balance equations and boundary conditions are fulfilled for every component (A, B, C in Fig. 2), and the tool is responsible that the balance equations and boundary conditions are also fulfilled at the points where the components are connected together (P in Fig. 2). As a consequence, the balance equations and boundary conditions are fulfilled in the overall model containing all components and all connections.
In order that a tool can automatically construct the equations at a connection point, every port variable needs to be associated with a port variable type. In Table 2, some port variable types of Modelica are shown. In this table it is assumed that u1, u2, ..., un, y, v1, v2, ..., vn, f1, f2, ..., fn, s1, s2, ..., sn are corresponding port variables from different components that are connected together at the same point P.
Port variable types "input" and "output" define the "usual" signal connections in block diagrams. "Potential variables" and "flow variables" are used to define standard physical connections. For example, an electrical port contains the electrical potential and the electrical current at the port, and when connecting electrical ports together, the electrical potentials are identical and the sum of the electrical currents is zero, according to Table 2. This corresponds exactly to Kirchhoff's voltage and current laws.
"Stream variables" are used to describe the connection semantics of intensive quantities in bidirectional fluid flow, such as specific enthalpy or mass fraction. Here, the idealized balance equation at a connection point states, for example, that the sum of the port enthalpy flow rates is zero, where the port enthalpy flow rate is computed as the product of the mass flow rate (a flow variable fi) and the directional specific enthalpy ŝi, which is either the (yet unknown) mixing-specific enthalpy smix when the flow is from the connection point to the port or the specific enthalpy si in the port when the flow is from the port to the connection point. More details can be found in Franke et al. (2009).

Multi-domain Modeling and Simulation, Fig. 2 Cutting the connections around the connection point P results in three decoupled components A, B, C and a new component around P describing the infinitesimally small connection point
Multi-domain Modeling and Simulation, Table 2 Some port variable types in Modelica

Port variable type: Connection semantics
Input variables ui, output variable y: u1 = u2 = ... = un = y (exactly one output variable can be connected to n input variables)
Potential variables vi: v1 = v2 = ... = vn
Flow variables fi: 0 = Σ fi
Stream variables si (with associated flow variables fi): 0 = Σ fi·ŝi, where ŝi = smix if fi > 0 and ŝi = si if fi ≤ 0 (together with 0 = Σ fi)
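The potential/flow semantics of Table 2 can be turned into a small equation generator, as the following sketch illustrates (the data structure and names are made up; only the generated equations follow Table 2): potentials of connected ports are set pairwise equal, and the flows sum to zero, exactly as in Kirchhoff's laws.

```python
def connection_equations(ports):
    """Generate idealized equations for ports joined at one connection point.

    Each port is a (potential_name, flow_name) pair. Potential variables of
    all connected ports are set equal; flow variables sum to zero (Table 2).
    """
    eqs = []
    # v1 = v2 = ... = vn, expressed as pairwise equalities
    for (va, _), (vb, _) in zip(ports, ports[1:]):
        eqs.append(f"{va} = {vb}")
    # 0 = sum of all flow variables at the connection point
    eqs.append("0 = " + " + ".join(f for _, f in ports))
    return eqs

# Three electrical ports of components A, B, C connected at point P (Fig. 2):
print(connection_equations([("A.v", "A.i"), ("B.v", "B.i"), ("C.v", "C.i")]))
# → ['A.v = B.v', 'B.v = C.v', '0 = A.i + B.i + C.i']
```

For electrical ports these generated equations are exactly Kirchhoff's voltage and current laws at the connection point.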
Multi-domain Modeling and Simulation, Table 3 Some port definitions from Modelica
Domain Port variables
Electrical analog Electrical potential in [V] (pot.)
electrical current in [A] (flow)
Elec. multiphase Vector of electrical ports
Electrical quasi-stationary Complex elec. potential (pot.)
complex elec. current (flow)
Magnetic flux tubes Magnetic potential in [A] (pot.)
magnetic flux in [Wb] (flow)
Translational (1-dim. mechanics) Distance in [m] (pot.)
cut-force in [N] (flow)
Rotational (1-dim. mechanics) Absolute angle in [rad] (pot.)
cut-torque in [Nm] (flow)
2-dim. mechanics Position in x-direction in [m] (pot.)
position in y-direction in [m] (pot.)
absolute angle in [rad] (pot.)
cut-force in x-direction in [N] (flow)
cut-force in y-direction in [N] (flow)
cut-torque in z-direc. in [Nm] (flow)
3-dim. mechanics Position vector in [m] (pot.)
transformation matrix in [1] (pot.)
cut-force vector in [N] (flow)
cut-torque vector in [Nm] (flow)
1-dim. heat transfer Temperature in [K] (pot.)
heat flow rate in [W] (flow)
1-dim. thermo-fluid pipe flow Pressure in [Pa] (pot.)
mass flow rate in [kg/s] (flow)
spec. enthalpy in [J/kg] (stream)
mass fractions in [1] (stream)
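The stream-variable semantics of Table 2 determine the mixing quantity smix at a connection point. A minimal numeric sketch (sign convention assumed here: fi > 0 means flow from the connection point into the port) solves 0 = Σ fi·ŝi for smix:

```python
def mixing_quantity(flows, stream_values):
    """Solve 0 = sum(f_i * s_hat_i) for s_mix, with s_hat_i = s_mix when
    f_i > 0 (flow into the port) and s_hat_i = s_i otherwise (Table 2)."""
    inflow = sum(f for f in flows if f > 0)  # coefficient of s_mix
    transported = sum(f * s for f, s in zip(flows, stream_values) if f <= 0)
    return -transported / inflow

# Two ports deliver fluid with specific enthalpies 10 and 40 (f <= 0), one
# port receives the mixture (f > 0); mass flows balance: 3 - 1 - 2 = 0.
s_mix = mixing_quantity([3.0, -1.0, -2.0], [0.0, 10.0, 40.0])
print(s_mix)  # 30.0 = (1*10 + 2*40) / 3
```

Note that the stream value of the receiving port (here 0.0) does not enter the result; only the delivering ports determine the mixture, which is what makes the formulation robust for bidirectional flow.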
Component Equations
Implementing a component in a modeling language means to (a) define the ports of the component and (b) provide the equations describing the relationships between the port variables. For example, an electrical capacitor with constant capacitance C can be defined by the equations on the right side of Fig. 3.
Such a component has two ports, the pins "p" and "n," and the port variables are the electrical currents ip, in flowing into the respective ports and the electrical potentials vp, vn at the ports. The first component equation states that if the current ip at port "p" is positive, then the current in at port "n" is negative (therefore, the current flowing into "p" is flowing out of "n").

Multi-domain Modeling and Simulation, Fig. 3 Equations of a capacitor component

Furthermore, the two remaining equations state that the derivative of the difference of the port potentials is proportional to the current flowing into port "p."
One important question is how many equations are needed to describe such a component. For an input–output block, this is simple: all input variables are known, and for all other variables, one equation per unknown is needed. Counting equations for physical components, such as a capacitor, is more involved: the requirement
that any type of component connection shall always result in identical numbers of unknowns and equations of the overall system leads to the following counting rule (for a proof, see Olsson et al. 2008):
1. The number of potential and the number of flow variables in a port must be identical.
2. Input variables and variables that appear differentiated are treated as known variables.
3. The number of equations of a component must be equal to the number of unknowns minus the number of flow variables.
In the example of the capacitor, there are 5 unknowns (ip, in, vp, vn, du/dt) and 2 flow variables (ip, in). Therefore, 5 − 2 = 3 equations are needed to define this component.
Modeling languages are used to provide a textual description of the ports and of the equations in a specific syntax. For example, in Modelica the capacitor from Fig. 3 can be defined as in Fig. 4 (keywords of the Modelica language are written in boldface). In VHDL-AMS the capacitor model can be defined as shown in Fig. 5.

Multi-domain Modeling and Simulation, Fig. 4 Modelica model of capacitor component
Multi-domain Modeling and Simulation, Fig. 5 VHDL-AMS model of capacitor component

One difference between Modelica and VHDL-AMS is that in Modelica all equations need to be explicitly given and port variables (such as p.i) can be directly accessed in the model (Fig. 4). Instead, in VHDL-AMS (and some other modeling languages), port variables cannot be accessed in a model; via the "quantity .. across .. through .. to .." construction, the relationships between the port variables are implicitly defined and correspond to the Modelica equations "0 = p.i + n.i" and "u = p.v − n.v".

Simulation of Multi-domain Systems
Collecting all the component equations of a multi-domain system model together with all connection equations results in a DAE (differential-algebraic equation) system:

0 = f(ẋ, x, w, y, u, t)    (1)

where t ∈ ℝ is time, x(t) ∈ ℝ^nx are variables appearing differentiated, w(t) ∈ ℝ^nw are algebraic variables, y(t) ∈ ℝ^ny are outputs, u(t) ∈ ℝ^nu are inputs, and f ∈ ℝ^(nx+nw+ny) are the DAE equations. Equation (1) can be solved numerically with an integrator for DAE systems; see, for example, Brenan et al. (1996). For DAEs that are linear in their unknowns, a complete theory for solvability is available based on matrix pencils (see, e.g., Brenan et al. 1996) and also reliable software for their analysis (Varga 2000).
Unfortunately, only certain classes of nonlinear DAEs can be directly solved numerically
in a reliable way. Domain-specific software, e.g., for mechanical systems, transforms the underlying DAE into a form that can be more reliably solved, using domain-specific knowledge. This is performed by differentiating certain equations of the DAE analytically and utilizing special integration methods for the resulting overdetermined set of differential-algebraic equations. Multi-domain simulation software uses the following approaches:
(a) The DAE (1) is directly solved numerically using an implicit integration method, such as a linear multistep method. Typically, all VHDL-AMS simulators use this approach.
(b) The DAE (1) is symbolically transformed into a form that is equivalent to a set of ODEs (ordinary differential equations), and then either explicit or implicit ODE or DAE integration methods are used to numerically solve the transformed system. The transformation is based on the algorithms of Pantelides (1988) and of Mattsson and Söderlind (1993) and might require analytically differentiating equations. Typically, all Modelica-based simulators, but also EcosimPro, use this approach.
For many models both approaches can be applied successfully. There are, however, systems where approach (a) is successful and (b) fails, or vice versa.
DAEs (1) derived from modeling languages usually have a large number of equations but only a few unknowns in every equation. In order to solve DAEs of this kind efficiently, both with (a) and (b), typically graph theory and/or sparse matrix methods are utilized. For method (b) the fundamental algorithms have been developed in Elmqvist (1978) and later improved in further publications. For a recent survey and comparison of some of the algorithms, see Frenkel et al. (2012).
Solving the DAE (1) means to solve an initial value problem. In order that this can be performed, a consistent set of initial variables ẋ0 = ẋ(t0), x0 = x(t0), w0 = w(t0), y0 = y(t0), u0 = u(t0) has to be determined first at the initial time t0. In general, this is a nontrivial task. For example, often (1) shall start in steady state, that is, it is required that ẋ0 = 0, and therefore at the initial time (1) is required to satisfy

0 = f(0, x0, w0, y0, u0, t0)    (2)

Equation (2) is a nonlinear algebraic system of equations in the unknowns x0, w0, y0, u0. These are nx + nw + ny equations for nx + nw + ny + nu unknowns. Therefore, nu further conditions must be provided (usually some elements of u0 and/or y0 are fixed to desired physical values). Solving (2) for the unknowns is also called "DC operating point calculation" or "trimming." Nonlinear equation solvers are based on iterative methods that usually require a sufficiently accurate initial guess for all unknowns. In a large multi-domain system model, this is not practical, and therefore, methods are needed to solve (2) even if the generic guess values provided in a library might be far from the solution of the system at hand.
For analog electronic circuit simulations, a large body of theory, algorithms, and software is available to solve (2) based on homotopy methods. The basic idea is to solve a sequence of nonlinear algebraic equation systems by starting with an easy-to-solve simplified system, characterized by the homotopy parameter λ = 0. This system is continuously "deformed" until the desired one is reached at λ = 1. The solution at iteration i is used as guess value for iteration i + 1, and at every iteration, the solution is usually computed with a Newton-Raphson method.
The simplest such approach is "source stepping": the initial guess values of all electrical components are set to "zero voltage" and/or "zero current." All (voltage and current) sources start at zero, and their values are gradually increased until the desired source values are reached. This method may not converge, typically due to the severe nonlinearities at switching thresholds in logical circuits.
There are several, more involved approaches, called "probability one homotopy" methods. For these method classes, proofs exist that they converge with probability one (so practically always). These algorithms can only be applied for certain classes of DAEs; see, for example, the
“Variable Stimulus Probability One Homotopy” Multi-domain models might also be used
from Melville et al. (1993). directly in nonlinear Kalman filters, moving
Although strong results exist for analog elec- horizon estimators, or nonlinear model predictive
trical circuit simulators, it is difficult to generalize control. For example, the company ABB is using
them to the large class of multi-domain systems moving horizon estimation and nonlinear model
covered by a modeling language. In Modelica predictive control based on Modelica models
a “homotopy” operator was introduced into the to significantly improve the start-up process of
language (Sielemann et al. 2011) in order that a power plants (Franke and Doppelhamer 2006).
library developer can formulate simple homotopy
methods like the “source stepping” in a com- Inverse Models
ponent library. A generalization of probability A large body of literature exists about the theory
one methods for multi-domain systems was de- of nonlinear control systems that are based on
veloped in the dissertation of Sielemann (2012) inverse plant models; see, for example, Isidori
and was successfully applied to air distribution (1995). Methods such as feedback linearization,
systems described as 1-dim. thermo-fluid pipe nonlinear dynamic inversion, or flat systems use
flow. an inverse plant model in the control loop. How-
ever, a major obstacle is how to automatically
utilize an inverse plant model in a controller with-
Models for Control Systems out being forced to manually set up the equations
in the needed form which is not practical for
Models for Analysis larger systems. Modeling languages can solve
The multi-domain models from section “Mod- this problem as discussed below.
eling Language Principles” can be utilized to Nonlinear inverse models can be utilized in
evaluate the properties of a control system by various ways in a control system. The simplest
simulation. Also control systems can be designed approach, as feed forward controller, is shown in
by nonlinear optimization where at every opti- Fig. 6.
mization step one or several simulations of a plant Under the assumption that the models of the
model are executed. Furthermore, modeling en- plant and of the inverse plant are completely
vironments usually provide a means to linearize identical and start at the same initial state, then
the nonlinear DAE (1) of the underlying model from the construction the control error e is zero
around an operating point: and y = T(s) yref where T is a diagonal matrix
with the transfer functions of the low-pass filters
x .t/ xop C x.t/; w.t/ wop C w.t/; on the diagonal (so y yref for reference signals
y.t/ yop C y.t/; u.t/ uop C u.t/
(3)
resulting in
, inverse plant ,
Pxred D A xred C B u model
(4)
y D C xred C D u
that have a frequency spectrum below the cutoff frequency of the low-pass filters). Since in practice the assumption is usually not fulfilled, there will be a nonzero control error e, and the feedback controller has to cope with it. This controller structure with a nonlinear inverse plant model has the advantage that the feed forward part is useful over the complete operating range of the plant.
Various other structures with nonlinear plant models are discussed in Looye et al. (2005), such as compensation controllers, feedback linearization controllers, and nonlinear disturbance observers.
It turns out that nonlinear inverse plant models can be generated automatically with the techniques that have been developed for modeling languages; see section "Modeling Language Principles". In particular, constructing an inverse model from (1) means that the inputs u are defined to be outputs, so they are no longer knowns but unknowns, and the outputs y are defined to be inputs, so they are no longer unknowns but knowns. The resulting system is still a DAE and can therefore be handled as any other DAE.
Therefore, defining an inverse model with a modeling language just requires exchanging the definition of input and output signals. In Modelica, this can be graphically performed with the nonstandard input–output block from Fig. 7. This block has two inputs and two outputs and is described by the equations

u1 = u2;  y1 = y2

From a block diagram point of view this looks strange. However, from a DAE point of view, this just states constraints between two input and two output signals. In Fig. 8, it is shown how this block can be used to invert a simple second-order system: the output of the low-pass filter is connected to the output of the second-order system, and therefore this model computes the input of the second-order system from the input of the filter.
A Modelica environment will generate from this type of definition the inverse model, thereby differentiating equations analytically and solving for algebraic variables of the model in a different way than for a simulation model. The whole transformation is nontrivial, but it is just the standard method used by Modelica tools for any other type of DAE system.
The question arises whether a solution of the inverse model exists, is unique, and whether the model is stable (otherwise, it cannot be applied in a control system). In general, a nonlinear inverse model consists of linear and/or nonlinear algebraic equation systems and of linear and/or nonlinear differential equations. Therefore, from a formal point of view, the same theorems as for a general DAE apply; see, for example, Brenan et al. (1996). Furthermore, all these equations need to be solved with a numerical method. For some classes of systems, it can be shown that mathematically a unique solution exists and that the system is stable. However, in general, one cannot expect that it is possible to provide such a proof for complex inverse plant models. Still, inverse plant models have been successfully utilized by automatic generation from a Modelica tool, e.g., for robots, satellites, aircraft, vehicles, and thermo-fluid systems.

The Functional Mockup Interface

Many different types of simulation environments are in use. One cannot expect that a generic approach as sketched in section "Modeling Language Principles" will replace all these environments with their rich set of domain-specific knowledge, analysis, and synthesis features. Practically all simulation environments provide a vendor-specific interface in order that a user can import components that are not describable by the simulation environment itself. Typically, this requires providing a component as a set of C or Fortran functions with a particular calling interface. In the control community, the most widely used approach of this kind is the S-Function interface from The MathWorks, where Simulink is used as integration platform, and model components from other environments are imported as S-Functions.
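The idea behind such calling interfaces can be sketched as follows. This is a made-up Python analogue, not the actual S-Function or FMI API (which are C function sets with fixed signatures): the host environment only sees generic entry points for setting inputs, evaluating derivatives, and reading outputs, and the host owns the integration loop.

```python
class ExportedFirstOrderModel:
    """A plant dx/dt = (u - x)/tau hidden behind a generic interface.

    The method names are hypothetical; real interfaces of this kind
    expose equivalent C functions instead of Python methods.
    """

    def __init__(self, tau=1.0):
        self.tau = tau
        self._x = 0.0   # internal state, opaque to the host
        self._u = 0.0

    def set_input(self, u):
        self._u = u

    def get_state(self):
        return self._x

    def set_state(self, x):
        self._x = x

    def derivative(self):
        return (self._u - self._x) / self.tau

    def get_output(self):
        return self._x


def host_simulate(model, u, t_end, h=0.01):
    """Host-owned explicit Euler loop: the host never sees the equations."""
    t = 0.0
    while t < t_end:
        model.set_input(u)
        model.set_state(model.get_state() + h * model.derivative())
        t += h
    return model.get_output()


y = host_simulate(ExportedFirstOrderModel(tau=1.0), u=1.0, t_end=5.0)
print(y)  # approaches the steady state u = 1 (roughly 1 - exp(-5))
```

This separation is what allows multi-domain models to be moved between environments: only the calling interface, not the modeling language, must be agreed upon.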
Multi-domain Modeling and Simulation, Fig. 7 Modelica InverseBlockConstraint block (signals u1, u2, y1, y2; schematic parameters: w=0.1, f_cut=10)
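What the tool-generated inverse model of the previous section computes can be illustrated numerically. The following is a hand-written sketch (not what a Modelica tool emits): for a second-order plant y'' + a·y' + b·y = u, exchanging input and output means computing the plant input u from a desired, twice-differentiable output y(t).

```python
import math

# Plant: y'' + a*y' + b*y = u. The inverse model computes u from a
# sufficiently smooth desired output trajectory, here y(t) = sin(t).
a, b = 2.0, 1.0

def inverse_input(t):
    """Inverse model evaluated along the desired output y(t) = sin(t)."""
    y, yd, ydd = math.sin(t), math.cos(t), -math.sin(t)
    return ydd + a * yd + b * y   # u = y'' + a*y' + b*y

# Feeding this u into a simulation of the plant reproduces y(t)
# (semi-implicit Euler, initial state matching y(0) = 0, y'(0) = 1):
h, y, yd = 1e-3, 0.0, 1.0
for k in range(1000):             # integrate up to t = 1
    ydd = inverse_input(k * h) - a * yd - b * y
    yd += h * ydd
    y += h * yd
print(abs(y - math.sin(1.0)))     # small tracking error
```

In the Modelica setting the differentiation of the desired output is not done by hand as here; it is obtained from the low-pass filters, whose states provide the required derivatives.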
In 2010 the vendor-independent standard "Functional Mockup Interface 1.0" was published (FMI Group 2010). This is a low-level standard for the exchange of models between different simulation environments. The standard allows exchanging either only the model equations (called "FMI for Model Exchange") or the model equations with an embedded solver (called "FMI for Co-Simulation"). This standard was quickly adopted by many simulation environments, and in 2013 there are more than 40 tools that support it (for an actual list of tools, see https://www.fmi-standard.org/tools). In particular, nearly all Modelica environments can export Modelica models in this format, and therefore, Modelica multi-domain models can be imported in other environments with low effort.
A software component which implements the FMI is called a Functional Mockup Unit (FMU). An FMU consists of one zip-file with extension ".fmu" containing all necessary components to utilize the FMU either for Model Exchange, for Co-Simulation, or for both. The following summary is an adapted version from Blochwitz et al. (2012):
1. An XML-file contains the definition of all exposed variables of the FMU, as well as other model information. It is then possible to run the FMU on a target system without this information, i.e., without unnecessary overhead. Furthermore, this allows determining all properties of an FMU from a text file, without actually loading and running the FMU.
2. A set of C-functions is provided to execute model equations for the Model Exchange case and to simulate the equations for the Co-Simulation case. These C-functions can be provided either in binary form for different platforms or in source code. The different forms can be included in the same model zip-file.
3. Further data can be included in the FMU zip-file, especially a model icon (bitmap file), documentation files, maps and tables needed by the model, and/or all object libraries or DLLs that are utilized.

Summary and Future Directions

Multi-domain modeling based on a DAE description and defined with a modeling language is an established approach, and many tools support it. It allows conveniently defining plant models from many domains for the design and evaluation of control systems. Furthermore, nonlinear inverse plant models can be easily constructed with the same methodology and can be utilized in various ways in nonlinear control systems.
Current research focuses on the support of the complete life cycle: defining requirements of a system formally on a "high level,"
considerably improving testing by checking these requirements automatically when evaluating a system design by simulations, and providing complete tool chains from nonlinear multi-domain models to embedded systems. The latter will allow convenient and fast target code generation of nonlinear controllers, extended and unscented Kalman filters, optimization-based controllers, or moving horizon estimators.
Furthermore, the methodology itself is being further improved. For example, in 2012, Modelica was extended with language elements to define multi-rate sampled data systems in a precise way, as well as state machines.

Cross-References

Computer-Aided Control Systems Design: Introduction and Historical Overview
Extended Kalman Filters
Feedback Linearization of Nonlinear Systems
Interactive Environments and Software Tools for CACSD
Model Building for Control System Synthesis

Bibliography

Blochwitz T, Otter M, Akesson J, Arnold M, Clauß C, Elmqvist H, Friedrich M, Junghanns A, Mauss J, Neumerkel D, Olsson H, Viel A (2012) The functional mockup interface 2.0: the standard for tool independent exchange of simulation models. In: Proceedings of 9th international Modelica conference, Munich, 3–5 Sept 2012, pp 173–184. http://www.ep.liu.se/ecp/076/017/ecp12076017.pdf
Brenan KE, Campbell SL, Petzold LR (1996) Numerical solution of initial-value problems in differential-algebraic equations. SIAM, Philadelphia
Elmqvist H (1978) A structured model language for large continuous systems. Dissertation. Report CODEN:LUTFD2/(TFRT–1015), Department of Automatic Control, Lund Institute of Technology, Lund. http://www.control.lth.se/database/publications/article.pike?action=fulltext&artkey=elm78dis
FMI Group (2010) The Functional Mockup Interface for Model Exchange and for Co-Simulation, version 1.0. https://www.fmi-standard.org
Franke R, Doppelhamer J (2006) Online application of Modelica models in the industrial IT extended automation system 800xA. In: Proceedings of 6th Modelica conference, Vienna, 4–5 Sept 2006, pp 293–302. https://modelica.org/events/modelica2006/Proceedings/sessions/Session3c2.pdf
Franke R, Casella F, Otter M, Sielemann M, Mattsson SE, Olsson H, Elmqvist H (2009) Stream connectors – an extension of Modelica for device-oriented modeling of convective transport phenomena. In: Proceedings of 7th Modelica conference, Como, pp 108–121. https://www.modelica.org/events/modelica2009/Proceedings/memorystick/pages/papers/0078/0078.pdf
Frenkel J, Kunze G, Fritzson P (2012) Survey of appropriate matching algorithms for large scale systems of differential algebraic equations. In: Proceedings of the 9th international Modelica conference, Munich, 3–5 Sept 2012, pp 433–442. http://www.ep.liu.se/ecp/076/045/ecp12076045.pdf
IEEE 1076.1-2007 (2007) IEEE standard VHDL analog and mixed-signal extensions. Standard of IEEE. http://standards.ieee.org/findstds/standard/1076.1-2007.html
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Karnopp DC, Margolis DL, Rosenberg RC (2012) System dynamics: modeling, simulation, and control of mechatronic systems, 5th edn. Wiley, Hoboken
Looye G, Thümmel M, Kurze M, Otter M, Bals J (2005) Nonlinear inverse models for control. In: Proceedings of the 4th international Modelica conference, Hamburg, 7–8 March 2005, p 267. https://www.modelica.org/events/Conference2005/onlineproceedings/Session3/Session3c3.pdf
Mattsson SE, Söderlind G (1993) Index reduction in differential-algebraic equations using dummy derivatives. SIAM J Sci Comput 14:677–692
Mantoolh HA, Vlach M (1992) Beyond SPICE with Saber and MAST. In: IEEE international symposium on circuits and systems, San Diego, May 10–13 1992, vol 1, pp 77–80
Melville RC, Trajkovic L, Fang SC, Watson LT (1993) Artificial parameter homotopy methods for the DC operating point problem. IEEE Trans Comput Aided Des Integr Circuits Syst 12:861–877. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.212.9327&rep=rep1&type=pdf
Modelica Association (2012) Modelica – a unified object-oriented language for systems modeling. Language specification, version 3.3. https://www.modelica.org/documents/ModelicaSpec33.pdf
Olsson H, Otter M, Mattsson SE, Elmqvist H (2008) Balanced models in Modelica 3.0 for increased model quality. In: Proceedings of 6th Modelica conference, Bielefeld, pp 21–33. https://modelica.org/events/modelica2008/Proceedings/sessions/session1a3.pdf
Pantelides CC (1988) The consistent initialization of differential-algebraic systems. SIAM J Sci Stat Comput 9:213–233
816 Multiscale Multivariate Statistical Process Control
Sieleman M (2012) Device-oriented modeling and simu- scales (or frequencies) and the signals in each
lation in aircraft energy systems design. Dissertation, decomposed scale are then used for MSP which
Dr. Hut. ISBN 3843905045
Sielemann M, Casella F, Otter M, Clauß C, Eborn J,
provides an indirect way of handling process
Mattsson SE, Olsson H (2011) Robust initialization dynamics.
of differential-algebraic equations using homotopy.
In: 8th international Modelica conference, Dres-
den. https://www.modelica.org/events/modelica2011/
Proceedings/pages/papers/041ID154afv.pdf
Keywords
Varga A (2000) A descriptor systems toolbox for Matlab.
In: Proceedings of CACSD’2000 IEEE international Multiresolution analysis; Partial least squares
symposium on computer-aided control system design, (PLS); Principal component analysis (PCA);
Anchorage, 25–27 Sept 2000, pp 150–155. http://elib.
dlr.de/11629/01/vargacacsd2000p2.pdf
Projection to latent structures (PLS); Wavelet
transform
the variables as well as changes in relationships among the variables. This article focuses on multiscale-multiway PCA using batchwise data unfolding. However, the methodology can equally be applied to PLS-based process performance monitoring (MSPC).

In multivariate statistical process control (MSPC), the multivariate statistical techniques of principal component analysis (PCA) and projection to latent structures (partial least squares, PLS), together with monitoring metrics based on Hotelling's $T^2$ (directly related to the Mahalanobis distance, which monitors the fit of new observations to the model space) and the squared prediction error (SPE) or $Q$ statistic (which monitors the residual-space model mismatch), are used to simultaneously monitor the process variables (Kourti and MacGregor 1996; Qin 2003). A recent survey provides an excellent state-of-the-art review of the methods and applications of data-driven fault detection and diagnosis developed over the last two decades (Qin 2012).

Process measurements typically exhibit multiscale behavior as a consequence of representing the cumulative effect of a number of underlying process phenomena, including process dynamics, measurement noise, and disturbances. To address these issues, methodologies are required that handle (i) the multiscale nature of process data and (ii) the inability of some existing algorithms to deal with autocorrelation. One approach is through the use of multiresolution analysis and wavelets (Mallat 1998). Informative discussions and application studies on using multiresolution analysis and wavelet decompositions to enhance PCA-based process monitoring and fault detection have been presented, for example, by Bakshi (1998), Misra et al. (2002), Aradhye et al. (2003), Lu et al. (2003), Yoon and MacGregor (2004), and Reis and Saraiva (2006). Yoon and MacGregor, in their comprehensive MSPCA study, discussed their approach in the context of other multiscale approaches and illustrated the methodology using simulated data from a continuous stirred-tank reactor system. A major contribution of the paper was to extend fault isolation methods based on contribution plots to multiscale PCA approaches. Although some 9 years old, Ganesan et al. (2004) provided a review of wavelet-based multiscale statistical process monitoring.

The Approach

Multiresolution analysis (MRA) provides the theoretical basis for the derivation of a computationally efficient algorithm for the wavelet transform (Mallat 1998). MRA allows the dynamic aspects of the data to be taken into account in MSPC. The individual signals are decomposed into different scales (frequencies), and the data in each decomposed scale are then used for MSPC, which provides an indirect approach to handling process dynamics. Multiscale PCA (MSPCA) enables the simultaneous extraction of process correlations across data as well as accounting for autocorrelation within sensor data. In this way, it captures correlations among the process variables induced by the various events occurring at different scales.

MSPCA calculates the principal components of wavelet coefficients at each scale and combines these at the relevant wavelet scales. Due to its multiscale nature, MSPCA is very useful for modeling data containing contributions from events whose behavior changes over both time and frequency. Process monitoring by MSPCA, and process prediction by MSPLS, involves combining those scales where significant events are detected. Approximate de-correlation of the wavelet coefficients also makes MSPCA effective for monitoring autocorrelated measurements.

The Algorithm

Wavelets are a family of basis functions that provide a mapping from the time domain to the time-frequency domain. They can be used to decompose a signal into different resolutions by projecting onto the corresponding wavelet basis functions using multiresolution analysis (MRA). A wavelet set is constructed from a fundamental basis function, the mother wavelet, by a process of translation and dilation. Wavelet analysis provides
methodologies for the extraction of the time and frequency content of a signal. Conventional frequency analysis based on the Fourier transform consists of decomposing a signal into sine waves of different frequencies. Wavelet analysis decomposes the original signal in a similar manner. The major difference is that while Fourier analysis uses sine waves of infinite length, multiresolution analysis uses waveforms of finite length. The finite length of the wavelets allows them to describe local events in both the time and frequency domains.

The wavelet transform, an extension of the Fourier transform, projects the original signal onto wavelet basis functions, providing a mapping from the time domain to the timescale plane. The wavelet functions, which are localized in the time and frequency domains, are obtained from a single prototype wavelet, the mother wavelet, by dilation and translation. The wavelet set is defined as

\[ \psi_{a,b}(t) = \frac{1}{\sqrt{|a|}}\, \psi\!\left(\frac{t-b}{a}\right) \]

where $\psi$ is the mother wavelet function, $a$ the dilation parameter, and $b$ the translation parameter; the factor $1/\sqrt{|a|}$ is used to ensure that each wavelet function has the same energy as the mother wavelet. The discrete wavelet transform with dyadic dilation and translation is used in this overview. A definition of continuous and discrete wavelet transforms can be found in Daubechies (1992). In the discrete case, the dilation and translation parameters are discretized as $a = a_0^m$ and $b = k b_0 a_0^m$. If $a_0 = 2$ and $b_0 = 1$, a dyadic dilation and translation is carried out; however, $a_0$ and $b_0$ are not restricted to these values. The discrete wavelet form, which is widely used in process monitoring and chemical signal analysis, is

\[ \psi_{jk}(t) = a_0^{-j/2}\, \psi\!\left(a_0^{-j} t - k b_0\right) \]

A recursive algorithm for wavelet decomposition and reconstruction of a discrete signal of dyadic length is often used (Mallat 1998) and is known as the pyramid algorithm. The fast discrete wavelet decomposition consists of three components: low-pass filters $L(n)$, high-pass filters $H(n)$, and dyadic decimation. By passing the input signal through this pair of filters, the projection of the original signal onto the scaling and wavelet functions of the multiresolution analysis is performed. Dyadic decimation, or down-sampling, removes every odd member of a sequence, thus halving the original number of samples. The low-pass filter resembles a moving average, while the high-pass filter extracts the detailed information contained in the signal. The discrete wavelet transform operates by taking a sequence of values, applying $L(n)$ and $H(n)$, and then repeating the same procedure on the approximation coefficients. In this way, the original signal vector is smoothed and halved through $L$, and the vector of approximation coefficients is again smoothed and halved through $L$. Successive application of the low-pass filter results in the approximation coefficients becoming an increasingly smooth version of the original signal. While smoothing the signal, each iteration also extracts the high-frequency information in the data. The repeated application of $L$, followed by $H$, is, in effect, a band-pass filter. The result of applying the high-/low-pass filters to a signal is a set of coefficients describing the details of the signal, $D_L$, and a second set describing the approximations of the signal, $A_L$. The original signal $x$ can then be represented by

\[ x(t) = \sum_{j=1}^{L} D_j(t) + A_L(t) \]

where $D_j$ and $A_j$ are referred to as the $j$th-level wavelet detail and approximation, respectively. Figure 1 shows schematically the multiresolution-based wavelet decomposition.

One of the most popular choices of wavelets is the Daubechies family. These wavelets are compactly supported in the time domain and have good frequency-domain decay. Moreover, Daubechies wavelets (DaubN) possess a type of smoothness that is determined by the number of vanishing moments $N$. This makes it possible to match the wavelet smoothness to the smoothness of the signals to be analyzed. The signal can then be decomposed
Multiscale Multivariate Statistical Process Control, Fig. 1 Schematic of multiresolution wavelet decomposition
into its contributions from multiple scales as a weighted sum of dyadically discretized orthonormal basis functions:

\[ x(t) = \sum_{m=1}^{L} \sum_{k=1}^{N} d_{mk}\, \psi_{mk}(t) + \sum_{k=1}^{N} a_{Lk}\, \varphi_{Lk}(t) \]

where $x(t)$ represents the process measurements, $d_{mk}$ the wavelet or detail coefficient at scale $m$ and location $k$, and $a_{Lk}$ the scaling-function coefficient of $\varphi(t)$ at the coarsest scale $L$ and location $k$. The scaling function, or father wavelet, $\varphi_{mk}$ captures the low-frequency content of the original signal that is not captured by the wavelets at the corresponding or finer scales.

The wavelet transformation is applied to decompose a multivariate signal $X$ into its approximation, $A_1$ to $A_L$, and detail, $D_1$ to $D_L$, coefficients for the first to $L$th level, respectively. For more information, see Bakshi (1998), Misra et al. (2002), and Aradhye et al. (2003). Figure 2 shows a schematic representation of a typical MSPCA multivariate statistical process control scheme.

An example of the application of multiway-multiscale MPCA to a benchmark fed-batch fermentation process (Birol et al. 2002; http://www.chee.iit.edu~control/software.html) was presented by Alawi and Morris (2007). The application used a combination of multiblock statistical modeling approaches together with multiscale-multiway batch monitoring. Figure 3 shows the multiscale-multiway monitoring scheme for process monitoring and fault detection. At every time point, the batch process variables are decomposed into scales in the wavelet domain and then reconstructed back to the time domain. The scales/details and the approximations are collected into separate matrices (blocks). Multiblock PCA is then applied to the wavelet details and approximations. Fault detection based on the $T_s^2$ and $Q_s$ statistics was used along with contribution plots incorporating confidence bounds to enhance fault diagnosis.

Figure 4 compares the monitoring statistics of multiscale-multiway PCA and of conventional multiway PCA for a slowly drifting sensor fault, showing the potential of multiscale MPCA (MSPCA) to detect subtle process and sensor faults faster than conventional multiway MSPC. It is noted that sensor drift is confined to one scale band at low frequency. It has been observed that multiscale approaches appear to provide little improvement if a fault effect is
Multiscale Multivariate Statistical Process Control, Fig. 2 Schematic representation of multiscale PCA-based MSPC

Multiscale Multivariate Statistical Process Control, Fig. 3 Multiscale-multiway batch process monitoring scheme
spread over more than one frequency band or if the fault effect occurs mainly in a scale with dominant variance. Thus, the monitoring method that gives the best detection and identification of faults will depend on the fault characteristics, with multiscale approaches providing an advantage when the faults are localized in frequency or appear in scales that normally have small variance.

Other Applications of Multiscale MPCA

There have been a number of nonlinear extensions. For example, multiscale PLS approaches have been developed, e.g., Teppola and Minkkinen (2000) and Lee et al. (2009). Nonlinear approaches have also been explored. For example, Lee et al. (2004) proposed a batch monitoring approach using multiway kernel principal component analysis, Shao et al. (1999) proposed a wavelet-based nonlinear PCA algorithm, Choi et al. (2008) described a kernel-based MSPCA algorithm for nonlinear multiscale monitoring, and most recently Zhang and Ma (2011) compared fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Wavelet multiscale approaches have also been widely discussed in spectroscopic data processing (Shao et al. 2004).
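The MSPCA workflow outlined above, pyramid-algorithm wavelet decomposition of each measured variable followed by PCA with Hotelling's $T^2$ and $Q$ (SPE) statistics on the coefficients at each scale, can be sketched as follows. This is a minimal illustration, not code from any of the works cited here; the Haar filter pair, the number of retained components, and the simulated data are all assumptions made for the example.

```python
# Minimal MSPCA sketch: Haar pyramid decomposition of each variable,
# then PCA-based T^2 and Q (SPE) monitoring statistics per scale.
import numpy as np

def haar_dwt(x):
    """One level of the pyramid algorithm: low-pass (approximation)
    and high-pass (detail) filtering with dyadic decimation."""
    x = np.asarray(x, dtype=float)
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail coefficients
    return a, d

def wavedec(x, levels):
    """Full decomposition of a dyadic-length signal: [D1, ..., DL, AL]."""
    coeffs, a = [], x
    for _ in range(levels):
        a, d = haar_dwt(a)
        coeffs.append(d)
    coeffs.append(a)
    return coeffs

def pca_monitor(X, n_pc):
    """Fit a PCA model on X and return T^2 and Q (SPE) per sample."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_pc].T                        # loadings
    lam = (S[:n_pc] ** 2) / (len(X) - 1)   # retained PC variances
    T = Xc @ P                             # scores
    t2 = np.sum(T ** 2 / lam, axis=1)      # Hotelling's T^2
    resid = Xc - T @ P.T
    q = np.sum(resid ** 2, axis=1)         # Q / SPE statistic
    return t2, q

rng = np.random.default_rng(0)
X = rng.standard_normal((256, 5))          # 256 samples, 5 process variables
levels = 3
decomps = [wavedec(X[:, j], levels) for j in range(X.shape[1])]
# one coefficient matrix per scale: D1, D2, D3, A3
per_scale = [np.column_stack([d[s] for d in decomps])
             for s in range(levels + 1)]
stats = [pca_monitor(Xs, n_pc=2) for Xs in per_scale]
```

In a full MSPCA implementation, control limits would be computed for the statistics at each scale, and only the scales where significant events are detected would be recombined before monitoring the reconstructed signal.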
Multiscale Multivariate Statistical Process Control, Fig. 4 Comparison of MPCA and multiscale MSPCA for a range of subtle sensor drift faults: fault magnitude against fault detection delay

Cross-References

Controller Performance Monitoring
Fault Detection and Diagnosis
Statistical Process Control in Manufacturing

Bibliography

Alawi A, Zhang J, Morris AJ (2007) Multiscale multiblock batch monitoring: sensor and process drift and degradation. DOI: 10.1021/op400337x, April 25, 2014
Aradhye HB, Bakshi BR, Strauss A, Davis JF (2003) Multiscale SPC using wavelets: theoretical analysis and properties. AIChE J 49(4):939–958
Bakshi BR (1998) Multiscale PCA with application to multivariate statistical process monitoring. AIChE J 44:1596–1610
Birol G, Undey C, Cinar A (2002) A modular simulation package for fed-batch fermentation: penicillin production. Comput Chem Eng 26:1553–1565
Choi SW, Morris J, Lee I-B (2008) Nonlinear multiscale modelling for fault detection and identification. Chem Eng Sci 63:2252–2266
Daubechies I (1992) Ten lectures on wavelets. SIAM, Philadelphia
Ganesan R, Das TT, Venkataraman V (2004) Wavelet-based multiscale statistical process monitoring: a literature review. IIE Trans 36:787–806
Kourti T, MacGregor JF (1996) Multivariate SPC methods for process and product monitoring. J Qual Technol 28:409–428
Lee J-M, Yoo C-K, Lee I-B (2004) Fault detection of batch processes using multiway kernel principal component analysis. Comput Chem Eng 28(9):1837–1847
Lee HW, Lee MW, Park JM (2009) Multi-scale extension of PLS algorithm for advanced on-line process monitoring. Chemom Intell Lab Syst 98:201–212
Liu Z, Cai W, Shao X (2009) A weighted multiscale regression for multivariate calibration of near infrared spectra. Analyst 134:261–266
Lu N, Wang F, Gao F (2003) Combination method of principal component and wavelet analysis for multivariate process monitoring and fault diagnosis. Ind Eng Chem Res 42:4198–4207
Mallat SG (1998) Multiresolution approximations and wavelet orthonormal bases. Trans Am Math Soc 315:69–87
Misra MH, Yue H, Qin SJ, Ling C (2002) Multivariate process monitoring and fault diagnosis by multi-scale PCA. Comput Chem Eng 26:1281–1293
Qin SJ (2003) Statistical process monitoring: basics and beyond. J Chemom 17:480–502
Qin SJ (2012) Survey on data-driven industrial process monitoring and diagnosis. Annu Rev Control 36:220–234
Reis MS, Saraiva PM (2006) Multiscale statistical process control with multiresolution data. AIChE J 52:2107–2119
Shao R, Jia F, Martin EB, Morris AJ (1999) Wavelets and nonlinear principal components analysis for process monitoring. Control Eng Pract 7:865–879
Shao X-G, Leung AK-M, Chau F-T (2004) Wavelet: a new trend in chemistry. Acc Chem Res 36:276–283
Teppola P, Minkkinen P (2000) Wavelet-PLS regression models for both exploratory data analysis and process monitoring. J Chemom 14:383–399
Yoon S, MacGregor JF (2004) Principal-component analysis of multiscale data for process monitoring and fault diagnosis. AIChE J 50(11):2891–2903
Zhang Y, Ma C (2011) Fault diagnosis of nonlinear processes using multiscale KPCA and multiscale KPLS. Chem Eng Sci 66:64–72

Multi-vehicle Routing

Emilio Frazzoli¹ and Marco Pavone²
¹Massachusetts Institute of Technology, Cambridge, MA, USA
²Stanford University, Stanford, CA, USA

Abstract

Multi-vehicle routing problems in systems and control theory are concerned with the design of control policies to coordinate several vehicles
moving in a metric space, in order to complete spatially localized, exogenously generated tasks in an efficient way. Control policies depend on several factors, including the definition of the tasks, the task-generation process, the vehicle dynamics and constraints, the information available to the vehicles, and the performance objective. Ensuring the stability of the system, i.e., the uniform boundedness of the number of outstanding tasks, is a primary concern. Typical performance objectives are measures of quality of service, such as the average or worst-case time a task spends in the system before being completed, or the percentage of tasks completed before certain deadlines. The scalability of the control policies to large groups of vehicles often drives the choice of the information structure, requiring distributed computation.

Keywords

Cooperative control; Decentralized control; Dynamic routing; Networked robots; Task allocation

Introduction

Multi-vehicle routing problems in systems and control theory are concerned with the design of control policies to coordinate several vehicles moving in a metric space, in order to complete spatially localized, exogenously generated tasks in an efficient way. Key features of the problem are that tasks arrive sequentially over time and that planning algorithms should provide control policies (in contrast to preplanned routes) prescribing how the routes should be updated as a function of those inputs that change in real time. This problem is usually referred to as dynamic vehicle routing (DVR). In DVR problems, ensuring the stability of the system, i.e., the uniform boundedness of the number of outstanding tasks, is a primary concern.

Motivation

As a motivating example, consider a scenario in which a team of unmanned aerial vehicles (UAVs) is responsible for investigating possible threats over a region of interest. As possible threats are detected, by intelligence, by high-altitude or orbiting platforms, or by ground sensor networks, one of the UAVs must visit the threat's location and investigate the cause of the alarm, in order to enable an appropriate response if necessary. Performing this task may require the UAV not only to fly to the possible threat's location but also to spend additional time on site. The objective is to minimize the average time between the appearance of a possible threat and the time one of the UAVs completes the close-range inspection task. Variations may include priority levels, time windows during which the inspection task must be completed, and sensors with limited range.

In order to perform the required mission, the UAVs (or, more generally, mission control) need to repeatedly solve three coupled decision-making problems:
1. Task allocation: which UAV shall pursue each task? What policy is used to assign tasks to UAVs? How often should the assignment be revised?
2. Service scheduling: given the list of tasks to be pursued, what is the most efficient ordering of these tasks?
3. Loitering paths: what should UAVs without pending assignments do?
The optimization process must take into account, for example, algebraic or differential constraints (such as obstacle avoidance or bounded curvature, respectively), sensing constraints, communication constraints, and energy constraints. Furthermore, one might require a decentralized control architecture.

DVR problems, including the above UAV routing problem, are generally intractable due to their multifaceted combinatorial, differential, and stochastic nature; consequently, solution approaches have been devised that rely either on heuristic algorithms or on approximation algorithms with some guarantee on their performance.
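To make the interplay of the three decision problems concrete, the following sketch (all names and greedy rules are illustrative assumptions, not a method from this entry) allocates each new task to the nearest vehicle and schedules service in greedy nearest-neighbor order:

```python
# Toy dynamic-routing sketch: nearest-vehicle task allocation plus
# nearest-neighbor service scheduling. Purely illustrative.
import math

class Vehicle:
    def __init__(self, x, y):
        self.pos = (x, y)
        self.queue = []          # pending task locations

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def allocate(vehicles, task):
    """Task allocation: give the task to the vehicle whose current
    position is closest (ties broken by list order)."""
    v = min(vehicles, key=lambda v: dist(v.pos, task))
    v.queue.append(task)

def next_task(vehicle):
    """Service scheduling: serve the pending task nearest to the
    vehicle's current position (greedy nearest neighbor)."""
    if not vehicle.queue:
        return None              # a loitering policy would go here
    t = min(vehicle.queue, key=lambda q: dist(vehicle.pos, q))
    vehicle.queue.remove(t)
    vehicle.pos = t              # vehicle travels to the task location
    return t

vehicles = [Vehicle(0, 0), Vehicle(10, 10)]
for task in [(1, 1), (9, 9), (2, 0), (8, 10)]:
    allocate(vehicles, task)
# the first vehicle then serves its pending tasks in greedy order
order = [next_task(vehicles[0]), next_task(vehicles[0])]
```

In a dynamic setting, the allocation step would be rerun as tasks arrive, which is exactly the re-optimization trade-off discussed below.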
problem (VRP), whereby (i) a team of $n$ vehicles is required to service a set of $n_T$ "static" tasks in a metric space, (ii) each task requires a certain amount of on-site service, and (iii) the goal is to compute a set of routes that minimizes the cost of servicing the tasks; see Toth and Vigo (2001) for a thorough introduction to this problem. The VRP is static in the sense that vehicle routes are computed assuming that no new tasks arrive. The VRP is an important research topic in the operations research community.

Approaches for Multi-vehicle Routing

Broadly speaking, there are three main approaches available in the literature to tackle dynamic vehicle routing problems. The first approach relies on heuristic algorithms. In the second approach, called the "competitive analysis approach," routing policies are designed to minimize the worst-case ratio between their performance and the performance of an optimal off-line algorithm that has a priori knowledge of the entire input sequence. In the third approach, the routing problem is embedded within the framework of queueing theory. Routing policies are then designed to stabilize the system, in terms of uniform boundedness of the number of outstanding tasks, and to minimize typical queueing-theoretical cost functions such as the expected time tasks remain in the queue. Since the generation of tasks and the motion of the vehicles take place in a Euclidean space, one can refer to this third approach as "spatial queueing theory."

Heuristic Approach
The main aspect of the heuristic approach is that routing algorithms are evaluated primarily via numerical, statistical, and experimental studies, and formal performance guarantees are not available. A naïve, yet reasonable, approach to designing a heuristic algorithm for DVR would be to adapt classic queueing policies. However, perhaps surprisingly, this adaptation is not at all straightforward. For example, routing algorithms based on a first-come-first-served policy, whereby tasks are fulfilled in the order in which they arrive, are unable to stabilize the system for all stabilizable task arrival rates, in the sense that with such routing algorithms the average number of tasks grows over time without bound, even though there exist alternative routing algorithms that would maintain the number of tasks uniformly bounded (Bertsimas and van Ryzin 1991).

The most widely applied approach is to combine static routing methods (e.g., VRP-like methods, nearest-neighbor strategies, or genetic algorithms) with sequential re-optimization, where the re-optimization horizon is chosen heuristically. In particular, greedy nearest-neighbor strategies, whose formal characterization still represents an open problem, are known to perform particularly well in some notable cases (Bertsimas and van Ryzin 1991). However, the joint selection of a static routing method and of the re-optimization horizon in the presence of vehicle and task constraints (e.g., differential motion constraints or task priorities) makes the application of this approach far from trivial. For example, one can show that an erroneous selection of the re-optimization horizon can lead to pathological scenarios where no task ever receives service (Pavone 2010). Additionally, performance criteria in dynamic settings commonly differ from those of the corresponding static problems. For example, in a dynamic setting, the time needed to complete a task may be a more important factor than the total vehicle travel cost.

Competitive Analysis Approach
The distinctive feature of the competitive analysis approach is the method used to evaluate an algorithm's performance, called competitive analysis. In competitive analysis, the performance of a (causal) algorithm is compared to the performance of a corresponding off-line algorithm (i.e., a non-causal algorithm that has a priori knowledge of the entire input) in the worst-case scenario. Specifically, an algorithm $\mathcal{A}$ is $c$-competitive if its cost on any problem instance $I$ is at most $c$ times the cost of an optimal off-line algorithm:

\[ \mathrm{cost}_{\mathcal{A}}(I) \le c \cdot \mathrm{cost}_{\mathrm{OPT}}(I) \quad \text{for all instances } I. \]
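A competitive ratio can also be probed empirically. The sketch below (purely illustrative; the one-dimensional instances, the first-come-first-served online rule, and the brute-force off-line optimum are assumptions made for the example) estimates, from below, the competitive ratio of a first-come-first-served policy on a line:

```python
# Empirical lower estimate of a competitive ratio: serve tasks in
# arrival order (causal) versus the optimal order found in hindsight.
import itertools
import random

def path_cost(start, order):
    """Total travel cost of visiting points in the given order."""
    cost, pos = 0.0, start
    for x in order:
        cost += abs(x - pos)
        pos = x
    return cost

random.seed(1)
worst_ratio = 1.0
for _ in range(200):
    tasks = [random.uniform(0, 1) for _ in range(6)]  # arrival order
    online = path_cost(0.0, tasks)                    # FCFS, ignores future
    offline = min(path_cost(0.0, p)                   # optimal with hindsight
                  for p in itertools.permutations(tasks))
    worst_ratio = max(worst_ratio, online / offline)
# worst_ratio lower-bounds the competitive ratio c of FCFS on such instances
```

The off-line cost is never larger than the on-line cost (the arrival order is itself one of the permutations), so the ratio is always at least one; the worst instance found gives the empirical estimate.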
quantifying the ratio between an upper bound on the cost delivered by that algorithm and a lower bound for the best achievable cost. Such a ratio, being an estimate of the degree of optimality of the algorithm, should be close to one and possibly independent of system parameters. The proposed algorithms are finally evaluated via numerical, statistical, and experimental studies, including Monte Carlo comparisons with alternative approaches.

An interesting feature of this approach is that the performance analysis usually yields scaling laws for the quality of service in terms of model data, which can be used as guidelines to select system parameters when feasible (e.g., the number of vehicles).

In order to make the model tractable, the arrival process of tasks is assumed stationary (with possibly unknown parameters), with statistically independent arrival times. These assumptions, however, can be unrealistic in some scenarios, in which case the competitive analysis approach may represent a better alternative. From a technical standpoint, one should note that spatial queueing models are inherently different from traditional, nonspatial queueing models. The main reason is that in spatial queueing models the "service time" per task has both a travel and an on-site component. Although the on-site service requirements can often be modeled as statistically independent, the travel times are inherently statistically coupled. Hence, in contrast to standard queueing models, service times in spatial queueing models are statistically dependent, and this deeply affects the solution of the problem.

Pioneering work in this context is that of Bertsimas and van Ryzin (1991), who introduced queueing methods to solve the baseline DVR problem (a vehicle moves along straight lines and visits tasks whose time of arrival, location, and on-site service are stochastic; information about task location is communicated to the vehicle upon task arrival). The next section provides an overview of the application of spatial queueing theory to this simplified DVR problem, referred to in the literature as the dynamic traveling repairman problem (DTRP).

Applying Spatial Queueing Theory to DVR Problems

Spatial Queueing Theory Workflow for the DTRP

Model
The DTRP, which, incidentally, captures well the salient features of the UAV scenario outlined in the Motivation section, can be modeled as follows. In a geographical region $Q$ of area $A$, a dynamic process generates spatially localized tasks. The process generating tasks is modeled as a spatio-temporal Poisson process, i.e., (i) the time between consecutive generation instants has an exponential distribution with intensity $\lambda > 0$, and (ii) upon arrival, the locations of tasks are independently and uniformly distributed in $Q$. The location of each new task is assumed to be immediately available to a team of $n$ servicing vehicles. The vehicles provide service in $Q$, traveling at a speed at most equal to $v$; the vehicles are assumed to have unlimited fuel and task-servicing capabilities. Each task requires an independent and identically distributed amount of on-site service with finite mean duration $\bar{s} > 0$. A task is completed when one of the vehicles moves to its location and performs its on-site service. The objective is to design a routing policy that maximizes the quality of service delivered by the vehicles in terms of the average steady-state time delay $T$ between the generation of a task and the time it is completed (in general, in a dynamic setting, the focus is on the quality of service as perceived by the "end user," rather than, for example, on fuel economies achieved by the vehicles). Other quantities of interest are the average number $n_T$ of tasks waiting to be completed and the waiting time $W$ of a task before its location is reached by a vehicle. These quantities, however, are related according to $T = W + \bar{s}$ (by definition) and by Little's law, which states that $n_T = \lambda W$ for stable queues.

The system is considered stable if the expected number of waiting tasks is uniformly bounded at all times or, equivalently, if tasks are removed from the system at least at the same rate at which they are generated. In the case at hand,
the time to complete a task is the sum of the time to reach its location (which depends on the routing policy) plus the time spent at that location in on-site service (which is independent of the routing policy). Since, by definition, the service time is no shorter than the on-site service time $\bar{s}$, a weaker necessary condition for stability is $\varrho := \lambda \bar{s}/n < 1$; the quantity $\varrho$ measures the fraction of time the vehicles spend performing on-site service. Remarkably, it turns out that this is also a sufficient condition for stability, in the sense that, if this condition is satisfied, one can find a stabilizing policy. Note that this stability condition is independent of the size and shape of $Q$ and of the speed of the vehicles.

Note that under this strategy, "regions of dominance" are implicitly assigned to vehicles according to a median Voronoi tessellation.

The heavy-load case is more challenging. Consider, first, the following single-vehicle routing policy, based on a partition of $Q$ into $p \ge 1$ subregions $\{Q_1, Q_2, \dots, Q_p\}$ of equal area $A/p$. Such a partition can be obtained, e.g., as sectors centered at the median of $Q$. Define a cyclic ordering of the subregions such that, e.g., if the vehicle is in region $Q_i$, the "next" region is $Q_j$, where $j$ follows $i$ in the cyclic ordering (in other words, $j = (i+1) \bmod p$).

1. If there are no outstanding tasks, move to the median of the region $Q$.
2. Otherwise, visit the "next" subregion; subregions with no tasks are skipped. Compute a minimum-length path from the vehicle's current position through all the outstanding tasks in that subregion. Complete all tasks on this path, ignoring new tasks generated in the meantime. Repeat.

The problem of computing the shortest path through a number of points is related to the well-known traveling salesman problem (TSP). While the TSP is a prototypically hard combinatorial optimization problem, it is well known that the Euclidean version of the problem can be approximated efficiently (Vazirani 2001). Furthermore, the length $\mathrm{ETSP}(n_T)$ of a Euclidean TSP through

\[ V_i = \bigl\{ q \in Q \;:\; \|q - p_i\| \le \|q - p_j\|,\ \forall j \neq i,\ j \in \{1, \dots, n\} \bigr\}, \]

where $V_i$ is the region associated with the $i$th "generator" point $p_i$ (see also Optimal Deployment and Spatial Coverage). The distance $H_n$ certainly provides a lower bound on the expected distance traveled by a vehicle to reach a task, and hence one obtains the lower bound

\[ T \ge \frac{H_n}{v} + \bar{s}, \]

where $\gamma(1) = \beta_2^2$ and $\gamma(p) \to \beta_2^2/2$ for large $p$. These results critically exploit the statistical characterization of the length of an optimal TSP tour. Hence, the proposed policy achieves a quality of service that is arbitrarily close to the optimal one in the asymptotic regime of heavy load (and, indeed, also of light load).

The above single-vehicle routing policies can be fairly easily lifted to an efficient multi-vehicle routing policy. The key idea (akin to the one in the light-load case) is to (1) partition the workspace into $n$ regions of dominance (with disjoint interiors and whose union is $Q$), (2) assign one vehicle to each region, and (3) have each vehicle follow a single-vehicle routing policy within its own region. This approach leads to the following multi-vehicle routing policy for the DTRP:

1. Partition $Q$ into $n$ regions of dominance of equal area and assign one vehicle to each region.
2. Each vehicle executes a single-vehicle DTRP policy in its own subregion.

Using as the single-vehicle policy the routing policy described above, the average system time $T$ in heavy load satisfies

\[ T \simeq \gamma(p)\, \frac{\lambda A}{v^2 n^2 (1 - \varrho)^2} + \bar{s}, \qquad (\varrho \to 1^-). \]

Hence, by comparing this result with the corresponding lower bound, one concludes that a
828 Multi-vehicle Routing
simple partitioning strategy leads to a multi-vehicle routing policy whose performance is arbitrarily close to the optimal one in heavy load.

Mode of Implementation
The scalability of the control policies to large groups of vehicles often requires a distributed implementation of multi-vehicle routing strategies. For the DTRP, a distributed implementation can be obtained by devising decentralized algorithms for environment partitioning. In the solution proposed in Pavone (2010), power diagrams are the key geometric concept to obtain, in a decentralized fashion, partitions suitable for both the light-load case (requiring, as seen before, a Median Voronoi Tessellation) and the heavy-load case (requiring an equal-area partition). The power diagram of Q is defined as

    V_i = { q ∈ Q : ‖q − p_i‖² − w_i ≤ ‖q − p_j‖² − w_j, ∀ j ≠ i, j ∈ {1, ..., n} },

where (p_i, w_i) ∈ Q × R are a set of "power points" and V_i is the subregion associated with the i-th power point. Note that power diagrams are a generalization of Voronoi diagrams: when all weights are equal, the power diagram and the Voronoi diagram are identical. The basic idea, then, is to associate to each vehicle i a virtual power point, which is an artificial (or logical) variable whose value is locally controlled by the i-th vehicle. The cell V_i becomes the region of dominance for vehicle i, and each vehicle updates its own power point according to a decentralized gradient-descent law with respect to a coverage function (Optimal Deployment and Spatial Coverage), until the desired partition is achieved. The reader is referred to Pavone (2010) for more details.

Extensions and Discussion
By integrating additional ideas from dynamics, teaming, and distributed algorithms, the spatial queueing theory approach has been recently applied to scenarios with complex models for the tasks, such as time constraints, service priorities, translating tasks, and adversarial generation; has been extended to address aspects concerning robotic implementation such as complex vehicle dynamics, limited sensing range, and team forming; and has even been tailored to integrate humans in the design space; see Bullo et al. (2011) and references therein. Despite the significant modeling differences, the "workflow" is essentially the same as in the DTRP: a queueing model that captures the salient features of the problem at hand, characterization of the fundamental limitations of performance, and design of algorithms with provable performance bounds. The last step, as for the DTRP, often involves lifting a single-vehicle policy to a multi-vehicle policy through the strategy of environment partitioning. Within this context, a number of partitioning schemes and corresponding decentralized partitioning algorithms relevant to a large variety of DVR problems are discussed in Pavone et al. (2009).

This workflow efficiently and transparently decouples the three decision-making problems mentioned in the Introduction section, i.e., "task allocation," "service scheduling," and "loitering paths." In fact, task allocation is addressed via the strategy of environment partitioning, service scheduling is addressed by applying a single-vehicle routing policy within the individual regions of dominance, and the loitering paths reduce to placing the vehicles at or around specific points within the dominance regions (e.g., the median). Note, however, that in some important cases, e.g., DVR problems where goods have to be transported from a pickup location to a delivery location or where vehicles are differentially constrained and operate in a "congested" workspace, multi-vehicle policies that rely on static partitions perform poorly or are not even feasible (Pavone et al. 2009), and task allocation and service scheduling need to be addressed as tightly coupled.

Through spatial queueing theory one is usually able to characterize the performance of multi-vehicle routing policies in asymptotic regimes. To ensure "satisfactory" performance under general operating conditions, a common strategy is to consider heuristic modifications to a baseline asymptotically efficient routing policy in such a way that, on the one hand, asymptotic performance is preserved and, on the other hand, light- and heavy-load performances are "smoothly" and efficiently blended in the intermediate-load case. The interested reader can find more information in Bullo et al. (2011).

Future research directions include the study of intermediate-load regimes (current optimality results are only available either in the light- or heavy-load regimes), online estimation of the statistical parameters (e.g., the spatial distribution of the tasks), and formulations that take into account second-order moments and large-deviation probabilities.
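The single-vehicle heavy-load policy described above lends itself to a compact sketch. The code below is illustrative rather than taken from the entry: it uses a greedy nearest-neighbor path as a stand-in for the efficient Euclidean-TSP approximation the policy calls for, and the helper names are hypothetical.

```python
import math

def nearest_neighbor_path(start, tasks):
    """Greedy stand-in for an (approximate) Euclidean TSP path through
    the outstanding tasks of one subregion."""
    path, remaining, cur = [], list(tasks), start
    while remaining:
        nxt = min(remaining, key=lambda t: math.dist(cur, t))
        remaining.remove(nxt)
        path.append(nxt)
        cur = nxt
    return path

def next_subregion(i, tasks_by_region, p):
    """Advance in the cyclic order j = (i + 1) mod p, skipping
    subregions with no outstanding tasks (step 2 of the policy)."""
    for step in range(1, p + 1):
        j = (i + step) % p
        if tasks_by_region[j]:
            return j
    return None  # no outstanding tasks anywhere: go to the median (step 1)

# Three subregions; subregion 1 is empty and is therefore skipped.
tasks = {0: [], 1: [], 2: [(0.2, 0.4), (0.1, 0.1)]}
j = next_subregion(0, tasks, 3)
print(j)                                             # 2
print(nearest_neighbor_path((0.0, 0.0), tasks[j]))   # [(0.1, 0.1), (0.2, 0.4)]
```

In a full simulation, the vehicle would service the returned path while newly generated tasks accumulate for the next pass, and each of the n vehicles would run this loop inside its own region of dominance.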
Cross-References
The Internet is shared by billions of users, and the actions of these users have to be regulated so that they share the resources in the network in a fair manner. Equivalently, this problem can be viewed as one in which a network designer has to design a collection of protocols so that the users of the network can equitably allocate the available resources among themselves without the intervention of a central authority. Such protocols are built into every computer connected to the Internet today, to allow for seamless operation of the network. The problem of designing such protocols can be posed as a game-theoretic problem in which the players are the network and the traffic sources using the network (Kelly 1997).
• Routing Games: Finding appropriate routes for each user's data traffic is a particular form of the resource allocation mentioned above. However, routing has applications beyond communication networks (with the other major application area being transportation networks), so it is useful to discuss routing separately. In communication networks, each user may attempt to find the minimum-delay route for its traffic, with help from the network, to minimize the delay experienced by its packets. In a transportation network, each automobile on the road attempts to take the path of least congestion through the network. An active area of research in game theory is one which tries to understand the impact of individual user decisions on the global performance of the network (Roughgarden 2005). An interesting result in this regard is the Braess paradox, an example of a road transportation network in which the addition of a road leads to increased delays when each user selfishly chooses a route to minimize its delay. Of course, if routes are chosen to minimize the overall delay experienced in the network, such a paradox will not arise.
• Peer-to-Peer Applications: Many studies have indicated that file sharing between users (also known as peers) directly, without using a centralized web site such as YouTube, is a dominant source of traffic in the Internet. For such a peer-to-peer service to work, each peer should not only download files from others but should also be willing to sacrifice some of its resources to upload files to others. Naturally, peers would prefer to only download and not upload, to minimize their resource usage. The design of incentive schemes to induce users to both download and upload files is another example of a game-theoretic problem in a network (Qiu and Srikant 2004).
• Network Economics: In addition to end-user interaction, Internet service providers (ISPs) have to interact with each other to allow their customers access to all the web sites in the world. For example, one ISP may have a customer who wants to access a web site connected to another ISP. In this case, the data traffic must cross ISP boundaries, and thus, one ISP has to transport data destined for a customer of another ISP. Thus, ISPs must be willing to contribute resources to satisfy the needs of customers who do not directly pay them. In such a situation, ISPs must have bilateral agreements (commonly known as peering agreements) to ensure that the selfish interest of each ISP to minimize its resource usage is aligned with the needs of its customers. Again, game theory is the right tool to study such inter-ISP interactions (Courcoubetis and Weber 2003).
• Spectrum Sharing: Large portions of the radio spectrum are severely underutilized. Typically, portions of the spectrum are assigned to a primary user, but the primary user does not use it most of the time. There has been a surge of interest recently in the concept of cognitive radio, whereby radios are cognizant of the presence or absence of the primary user, and when the primary user is absent, another radio can use the spectrum to transmit its data. When there are many users and the available spectrum is split into many channels, it is impossible for users to perfectly coordinate their transmissions to achieve maximum network utilization. In these situations, game-theoretic protocols which take into account the noncooperative behavior of the users can be designed to allow secondary users to access the available channels as efficiently as possible (Saad et al. 2009).

In the next section, we will elaborate on one of the applications above, namely, resource allocation in the Internet, and show how game-theoretic modeling can be used to design fair resource sharing.

where we have used the notation y_l := Σ_{r : l ∈ r} x_r to denote the total data rate on link l. If p is known, then the optimal x can be calculated by solving

    max_{x ≥ 0} L(x, p).

Notice that the optimal solution for each x_r can be obtained by solving
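The derivation is truncated here, but in Kelly's framework the maximization of L(x, p) separates across sources: each x_r solves a one-dimensional problem involving only the prices on its own route. The sketch below assumes weighted logarithmic (proportionally fair) utilities U_r(x_r) = w_r log(x_r), which are not stated in the surviving text; all names are illustrative.

```python
def optimal_rate(w_r, route, prices):
    """argmax over x_r >= 0 of  w_r*log(x_r) - x_r*q_r,  where q_r is the
    sum of the link prices p_l along route r.  Setting the derivative
    w_r/x_r - q_r to zero gives the closed form x_r = w_r / q_r."""
    q_r = sum(prices[l] for l in route)
    return w_r / q_r

# Two links with prices p_l; one route traverses both.
prices = {"l1": 0.5, "l2": 1.5}
print(optimal_rate(2.0, ["l1", "l2"], prices))  # 1.0
```

Iterating this source update against a price update driven by the excess link rates y_l is the standard dual-decomposition reading of the protocol-design problem sketched in this section.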
Moving forward, two areas which require considerable further research are the following: (i) inter-ISP routing and (ii) spectrum sharing. The Internet is a fairly reliable network, and any unreliability often arises due to routing issues among ISPs. As mentioned in the introduction, peering arrangements between ISPs are necessary to make sure that ISPs carry each other's traffic and are appropriately compensated for it, either through reciprocal traffic-carrying agreements or actual monetary transfers. Thus, the policy that an ISP uses to route traffic may be governed by these peering agreements. The more complicated these policies are, the more chances there are for routing misconfigurations that lead to service interruptions. This interplay between policies and technology in the form of routing algorithms is an interesting topic for further study.

Cognitive radios and spectrum sharing are expected to be significant technological components of future wireless networks. Designing algorithms for selfish radios to share the available spectrum while respecting the rights of the primary user of the spectrum is a challenge that requires considerable further attention. This area of research requires one to combine sensing technologies, to sense the presence of other users, with game-theoretic models, to ensure fair channel access to the secondary users, subject to the constraint that the primary user should not be affected by the presence of the secondary users.

Cross-References

Game Theory: Historical Overview
Networked Systems
Optimal Deployment and Spatial Coverage

Bibliography

Courcoubetis C, Weber R (2003) Pricing communication networks: economics, technology and modelling. Wiley, Hoboken
Hardin G (1968) The tragedy of the commons. Science 162:1243–1248
Johari R, Tsitsiklis JN (2004) Efficiency loss in a network resource allocation game. Math Oper Res 29:407–435
Kelly FP (1997) Charging and rate control for elastic traffic. Eur Trans Telecommun 8:33–37
Qiu D, Srikant R (2004) Modeling and performance analysis of BitTorrent-like peer-to-peer networks. Proc ACM SIGCOMM, ACM Comput Commun Rev 34:367–378
Roughgarden T (2005) Selfish routing and the price of anarchy. MIT Press, Cambridge
Saad W, Han Z, Debbah M, Hjorungnes A, Basar T (2009) Coalitional game theory for communication networks: a tutorial. IEEE Signal Process Mag 26(5):77–97
Shakkottai S, Srikant R (2007) Network optimization and control. NoW Publishers, Boston-Delft
Yang S, Hajek B (2007) VCG-Kelly mechanisms for allocation of divisible goods: adapting VCG mechanisms to one-dimensional signals. IEEE J Sel Areas Commun 25:1237–1243

Networked Control Systems: Architecture and Stability Issues

Linda Bushnell¹ and Hong Ye²
¹Department of Electrical Engineering, University of Washington, Seattle, WA, USA
²The MathWorks, Inc., Natick, MA, USA

Abstract

When shared, band-limited, real-time communication networks are employed in a control system to exchange information between spatially distributed components, such as controllers, actuators, and sensors, it is categorized as a networked control system (NCS). The primary advantages of a NCS are reduced complexity and wiring, reduced design and implementation cost, ease of system maintenance and modification, and efficient data sharing. In addition, this unique architecture creates a way to connect the cyberspace to the physical space for remote operation of systems. The NCS architecture allows for performing more complex tasks but also requires taking the network effects into account when designing control laws and stability conditions. In this entry, we review significant results on the architecture and stability analysis of a NCS. The results presented address communication network-induced challenges such as time delays, scheduling, and information packet dropouts.
From the washing machine, air conditioner, and microwave oven to the telephone, stereo, and automobile, embedded computers are present in the modern home. In a factory environment, there are thousands of networked smart sensors and actuators with embedded processors, working to complete a coordinated task. The trend in manufacturing plants, homes, buildings, aircraft, and automobiles is toward distributed networking. This trend can be inferred from many proposed or emerging network standards, such as controller area network (CAN) for automotive and industrial automation, BACnet for building automation, PROFIBUS and WorldFIP fieldbus for process control, and the IEEE 802.11 and Bluetooth wireless standards for applications such as mobile sensor networks, HVAC systems, and unmanned aerial vehicles.

The traditional dedicated point-to-point wired connection in control systems has been successfully implemented in industry for decades. With the advance of communication network and hardware technologies, it is common to integrate the communication network into the control system to replace the dedicated point-to-point connection, to achieve reduced weight and power, lower cost, simpler installation and maintenance, and higher reliability, to name a few advantages. For example, a typical new automobile has two controller area networks (CANs): a high-speed one in front of the firewall for the engine, transmission, and traction control and a low-speed one for locks, windows, and other devices (Johansson et al. 2005).

The conventional definition of a networked control system (NCS) is as follows: When a feedback control system is closed via a communication channel, which may be shared with other nodes outside the control system, then the control system is called a NCS. A NCS can also

The architecture of a NCS consists of a band-limited, digital communication network physically and electronically integrated with a spatially distributed control system, operated on a given plant. Digital information, such as controller signals, actuator signals, sensor signals, and operator input, is transmitted via the network. The components connected by the network include all nodes of the control system, such as the supervisory (or "network owner") computer, controller software and hardware, actuators, and sensors. In this structure, the feedback control system's loops are closed over the shared communication network. The communication network can be wired or wireless and may be shared with other unrelated nodes outside the control system. As illustrated in Fig. 1, the shared communication channel, which multiplexes signals from the sensors to the controllers and/or from the controllers to the actuators, serves many other uses besides control. Each of the system components directly connected to the network via a network interface is denoted a physical node. Besides the network interface, the sensor and actuator nodes are typically smart nodes with embedded microprocessors. Sometimes, the controller is colocated with the smart actuator. Several key issues make networked control systems distinct from traditional control systems (Hespanha et al. 2007; Yang 2006).

Networked Control Systems: Architecture and Stability Issues, Fig. 1 A general networked control system (NCS) architecture

Band-Limited Channels
Bandwidth limitation of the shared communication channel requires that all nodes in the network share (e.g., by time sharing or frequency sharing) the common network resource without interfering with each other.

Sampling and Delays
In a NCS, the plant outputs are sampled by the sensors, which can convert continuous-time analog signals to digital signals; perform preprocessing, filtering, and encoding; and package the data signal so that it is ready for transmission. After winning the medium access control and being transmitted over the network, the package containing the sampled data signal arrives at the receiver side, which could be a controller or a smart actuator with a controller collocated with it. The receiver unpacks and decodes the signal. This process is quite different from the traditional periodic sampling in digital control. The overall delay between sampling and the eventual decoding of the transmitted packet at the receiver can be time varying and random due to both the network access delay (i.e., the time it takes for a shared network to accept the data) and the transmission delay (i.e., the time during which data are in transit inside the network). This also depends on the highly variable network conditions, such as congestion and channel quality. In some NCSs, the data transmitted are time stamped, which means that the receiver may have an estimate of the delay's duration and could take appropriate corrective action. Given the rapid advance of embedded computation and communication hardware technology today, the transmission delay in many embedded systems can be neglected when compared with the magnitude of the network access delay.

Packet Dropouts
It is possible in a NCS that a packet may be lost while it is in transit through the network. A packet that contains important sampling data or control signals may be dropped occasionally due to transmission errors on the physical network link, message collisions, or node failures, to name a few causes. Overflow in a queue or buffer can lead to network congestion and packet loss. Thus, the use of queues is not favored by NCSs in general. Packet dropouts also happen if the receiver discards outdated arrivals that have long delays. Most network protocols are equipped with transmission-retry mechanisms, such as TCP, that guarantee the eventual delivery of packets. These protocols, unfortunately, are not appropriate for a NCS since the retransmission of old sensor data or calculated control signals is generally not very useful when new, time-critical data are available. Using selected old data for estimation or prediction is an exception, where old data may be packaged with the new
data in one packet. It is advantageous to discard the old, un-transmitted data and transmit a new packet if and when it becomes available. In this way, the controller always receives fresh data for its control calculation, and the actuator always executes the up-to-date command to control the plant.

Modeling Errors, Uncertainties, and Disturbances
In a distributed NCS, modeling errors, uncertainties, and disturbances always exist when using a mathematical model to describe the physical process. These factors may have a major impact on the overall system performance and cause failure in fulfilling the desired objectives. Wang and Hovakimyan (2013) proposed a reference model-based architecture to decouple the design of the controller and the communication schemes. A reference model is introduced in each subsystem as a bridge to build the connection between the real system and an ideal model, free of uncertainties. The closeness between the real system and the reference model is associated only with the plant uncertainties, and the difference between the reference model and the ideal model lies only in the communication constraints.

Stability of Networked Control Systems

The stability of a control system is often extremely important and is generally a safety requirement. Examples include the control of rockets, robots, airplanes, automobiles, or ships. Instability in any one of these systems can result in an unimaginable accident and loss of life. The stability of a general dynamical system with no input can be described with the Lyapunov stability criteria; for example, a linear system is stable if its impulse response approaches zero as time approaches infinity or if every bounded input produces a bounded output.

When sensors, controllers, and actuators are not colocated and use a shared network to communicate, the feedback loop of a NCS is closed over the network. Network-induced, variable delays and packet dropouts can degrade the performance of a NCS. For example, the NCS may have a longer settling time or a bigger overshoot in the step response. Furthermore, the NCS may become unstable when delays and/or packet dropouts exceed a certain range. Designers choosing to use a NCS architecture, however, are motivated not by performance but by cost, maintenance, and reliability gains.

Band-Limited Channels
Inspired by Shannon's results on the maximum bit rate that a communication channel can carry reliably, a significant research effort has been devoted to the problem of determining the minimum bit rate that is needed to stabilize a system through feedback over a finite-capacity channel (Baillieul 1999; Nair and Evans 2000; Tatikonda and Mitter 2004; Wong and Brockett 1999; Baillieul and Antsaklis 2007). This has been solved exactly for linear plants, but only conservative results have been obtained for nonlinear plants. The data-rate theorem, which quantifies a fundamental relationship between unstable physical systems and the rate at which information must be processed in order to stably control them, was proved independently under a variety of assumptions. Minimum bit rate and quantization become especially important for networks designed to carry very small packets with little overhead, because encoding measurements or actuation signals with fewer bits can save network bandwidth.

Most of the NCS stability results presented here, however, are based on the observation that the channel can transmit a finite number of packets per unit of time (the packet rate) and each packet can carry a certain number of bits in the data field. The packets on a real-time control network typically are frequent and have small data segments compared to their headers. For example, a CAN II packet with a single 16-bit data sample has a fixed 64 bits of overhead associated with the identifier, control field, CRC, ACK field, and frame delimiter, resulting in 25 % utilization, and this utilization can never exceed 50 % (the data field length is limited to 64 bits). Thus, the quantization effects imposed by the communication networks are generally ignored.
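For discrete-time linear plants, the data-rate theorem discussed above has a well-known closed form: stabilization over the channel is possible only if the rate exceeds the sum of the base-2 logarithms of the unstable open-loop eigenvalue magnitudes, in bits per sample. A minimal illustration (the function name is ours, not from the literature):

```python
import numpy as np

def min_bit_rate(A):
    """Data-rate lower bound (bits per sample) for stabilizing x[k+1] = A x[k]:
    R > sum of log2|lambda_i| over eigenvalues with |lambda_i| > 1."""
    return sum(np.log2(abs(lam)) for lam in np.linalg.eigvals(A) if abs(lam) > 1)

# One stable and one unstable mode; only the unstable mode (lambda = 2)
# contributes, so at least 1 bit per sample must cross the channel.
A = np.array([[2.0, 0.0], [0.0, 0.5]])
print(min_bit_rate(A))  # 1.0
```

The bound makes precise why quantization matters only for very coarse channels: a plant with mildly unstable eigenvalues requires only a few bits per sample, far below the data field of typical control-network packets.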
An analysis in the setting of Lp stability was provided by Tabbara et al. (2007). Their presentation provides sharper results for both the gain and the MATI than previously obtainable and details the property of uniformly persistently exciting scheduling protocols. This class of protocols was shown to lead to stability for high enough transmission rates. This was a natural property to demand, especially in the design of wireless scheduling protocols. The property is used directly in a novel proof technique based on the notions of vector comparison and (quasi-)monotone systems. Via simulations and analytical and numerical comparisons, it is verified that the uniform persistence of excitation property of protocols is, in some sense, the "finest" property that can be extracted from wireless scheduling protocols.

Delays and Packet Dropouts
Packet dropouts can be modeled as either stochastic or deterministic phenomena. For a one-channel feedback NCS, Zhang and Branicky (2001) consider a deterministic dropout model, with packet dropouts occurring at an asymptotic rate. Stability conditions were studied for a NCS with deterministic and stochastic dropouts (Seiler and Sengupta 2005).

Sometimes, the NCS is characterized as a continuous-time delayed differential equation (DDE) with a time-varying delay τ(t). One important advantage is that the equations are still valid even when the delays exceed the sampling interval. Researchers have successfully used the Lyapunov–Krasovskii (Yue et al. 2004) and Razumikhin (Yu et al. 2004) theorems to study the stability of a NCS that is modeled as a DDE.

Summary and Future Directions

This article introduced the concept of a networked control system and its general architecture. Several key issues specific to a NCS, such as band-limited channels, network-induced delays, and information packet dropouts, were explained. The stability condition of a NCS with various network effects was discussed with several common modeling techniques.

In terms of future directions, there has been significant effort in analyzing networked control systems with variable sampling rates, but most results investigate stability for a given worst-case interval between consecutive sampling times, leading to conservative results. An open area of research would be to look at methods that take into account a stochastic characterization of the inter-sampling times. Substantial work has also been devoted to determining the stability of a NCS, as described in this article. Possible open areas of research would be to consider design issues related to the joint stability and performance of the system. The design and development of controllers for a NCS is also an open area of research. In designing a controller for a NCS, one has to take into account the challenges introduced by the communication network. Only afterward can analysis of the whole system take place.

Cross-References

Data Rate of Nonlinear Control Systems and Feedback Entropy
Information and Communication Complexity of Networked Control Systems
Networked Control Systems: Estimation and Control over Lossy Networks
Networked Systems
Quantized Control and Data Rate Constraints

Bibliography

Baillieul J (1999) Feedback designs for controlling device arrays with communication channel bandwidth constraints. In: Lecture notes of the fourth ARO workshop on smart structures, Penn State University
Baillieul J, Antsaklis PJ (2007) Control and communication challenges in networked control systems. Proc IEEE 95(1):9–28
Hespanha JP, Naghshtabrizi P, Xu Y (2007) A survey of recent results in networked control systems. Proc IEEE 95(1):138–162
Johansson KH, Torngren M, Nielsen L (2005) Vehicle applications of controller area network. In: Levine WS, Hristu-Varsakelis D (eds) Handbook of networked and embedded control systems. Birkhäuser, Boston, pp 741–765
Lin H, Zhai G, Antsaklis PJ (2003) Robust stability and disturbance attenuation analysis of a class of networked control systems. In: 42nd IEEE conference on decision and control, Maui, vol 2, pp 1182–1187
Lin H, Zhai G, Fang L, Antsaklis PJ (2005) Stability and H∞ performance preserving scheduling policy for networked control systems. In: Proceedings of the 16th IFAC world congress, Prague
Montestruque LA, Antsaklis PJ (2004) Stability of model-based networked control systems with time-varying transmission times. IEEE Trans Autom Control 49(9):1562–1572
Nair GN, Evans RJ (2000) Stabilization with data-rate-limited feedback: tightest attainable bounds. Syst Control Lett 41(1):49–56
Nesic D, Teel A (2004a) Input-output stability properties of networked control systems. IEEE Trans Autom Control 49(10):1650–1667
Nesic D, Teel A (2004b) Input-to-state stability of networked control systems. Automatica 40(12):2121–2128
Onat A, Naskali T, Parlakay E, Mutluer O (2011) Control over imperfect networks: model-based predictive networked control systems. IEEE Trans Ind Electron 58:905–913
Seiler P, Sengupta R (2005) An H∞ approach to networked control. IEEE Trans Autom Control 50(3):356–364
Tabbara M, Nesic D, Teel A (2007) Stability of wireless and wireline networked control systems. IEEE Trans Autom Control 52(9):1615–1630
Tatikonda S, Mitter S (2004) Control under communication constraints. IEEE Trans Autom Control 49(7):1056–1068
Walsh G, Ye H (2001) Scheduling of networked control systems. IEEE Control Syst Mag 21(1):57–65
Walsh G, Ye H, Bushnell L (2001a) Stability analysis of networked control systems. IEEE Trans Control Syst Technol 10(3):438–446
Walsh G, Beldiman O, Bushnell L (2001b) Asymptotic behavior of nonlinear networked control systems. IEEE Trans Autom Control 44:1093–1097
Wang X, Hovakimyan N (2013) Distributed control of uncertain networked systems: a decoupled design. IEEE Trans Autom Control 58(10):2536–2549
Wong WS, Brockett RW (1999) Systems with finite communication bandwidth constraints-II: stabilization with limited information feedback. IEEE Trans Autom Control 44(5):1049–1053
Yang TC (2006) Networked control system: a brief survey. IEE Proc Control Theory Appl 153(4):403–412
Ye H, Walsh G, Bushnell L (2001) Real-time mixed-traffic wireless networks. IEEE Trans Ind Electron 48(5):883–890
Yu M, Wang L, Chu T, Hao F (2004) An LMI approach to networked control systems with data packet dropout and transmission delays. In: 43rd IEEE conference on decision and control, Paradise Island, Bahamas, vol 4, pp 3545–3550
Yue D, Han QL, Peng C (2004) State feedback controller design for networked control systems. IEEE Trans Circuits Syst 51(11):640–644
Zhai G, Hu B, Yasuda K, Michel A (2002) Qualitative analysis of discrete-time switched systems. In: Proceedings of the American control conference, vol 3, pp 1880–1885
Zhang W, Branicky MS (2001) Stability of networked control systems with time-varying transmission period. In: Allerton conference on communication, control, and computing, Monticello, IL
Zhang W, Branicky MS, Phillips SM (2001) Stability of networked control systems. IEEE Control Syst Mag 21(1):84–99

Networked Control Systems: Estimation and Control Over Lossy Networks

João P. Hespanha¹ and Alexandre R. Mesquita²
¹Center for Control, Dynamical Systems and Computation, University of California, Santa Barbara, CA, USA
²Department of Electronics Engineering, Federal University of Minas Gerais, Belo Horizonte, Brazil

Abstract

This entry discusses optimal estimation and control for lossy networks. Conditions for stability are provided both for two-link and multiple-link networks. The online adaptation of network resources (controlled communication) is also considered.

Keywords

Automatic control; Communication networks; Controlled communication; Estimation; Networked control systems; Stability

Introduction

Network Control Systems (NCSs) are spatially distributed systems in which the communication between sensors, actuators, and controllers
occurs through a shared band-limited digital communication network. In this entry, we consider the problem of estimation and control over such networks.

A significant difference between NCSs and standard digital control is the possibility that data may be lost while in transit through the network. Typically, packet dropouts result from transmission errors in physical network links (which is far more common in wireless than in wired networks) or from buffer overflows due to congestion. Long transmission delays sometimes result in packet reordering, which essentially amounts to a packet dropout if the receiver discards "outdated" arrivals. Reliable transmission protocols, such as TCP, guarantee the eventual delivery of packets. However, these protocols are not appropriate for NCSs, since the retransmission of old data is generally not useful. Another important difference between NCSs and standard digital control systems is that, due to the nature of network traffic, delays in the control loop may be time varying and nondeterministic.

In this entry, we concentrate on the problem of control and estimation in the presence of packet losses, leaving other important features of NCSs (such as quantization and random delays) to be addressed in other entries of this encyclopedia. Consequently, we assume that the network can be viewed as a channel that can carry real numbers without distortion, but that some of the messages may be lost. This network model is appropriate when the number of bits in each data packet is sufficiently large so that quantization effects can be ignored, but packet dropouts cannot. For more general channel models, see, for example, Imer and Basar (2005).

This entry also does not address network transmission delays explicitly. In general, network delays have two components: one due to the time spent transmitting packets and another due to the time packets wait in buffers before being transmitted. Delays due to packet transmission present little variation and may be modeled as constants; for control design purposes, these delays may be incorporated into the plant model. Delays due to buffering depend on the network traffic and are typically random; they can be analyzed using the techniques developed in Antunes et al. (2012).

Notation and Basic Definitions. Throughout the entry, $\mathbb{R}$ stands for the real numbers and $\mathbb{N}$ for the nonnegative integers. For a given matrix $A \in \mathbb{R}^{n \times n}$ and vector $x \in \mathbb{R}^n$, $\|x\| := \sqrt{x'x}$ denotes the Euclidean norm of $x$, and $\sigma(A)$ the set of eigenvalues of $A$. Random variables are generally denoted in boldface. For a random variable $y$, $\mathrm{E}[y]$ stands for the expectation of $y$.

Two-Link Networks

Here, we consider a control/estimation problem when all network effects can be modeled using two erasure channels: one from the sensor to the controller and the other from the controller to the actuator (see Fig. 1).

[Networked Control Systems: Estimation and Control Over Lossy Networks, Fig. 1 Control system with two network links: the plant's actuator and sensor communicate with the controller through two erasure channels]

We restrict our attention to a linear time-invariant (LTI) plant with intermittent observation and control packets:

$$x_{k+1} = A x_k + \nu_k B u_k + w_k, \qquad (1a)$$
$$y_k = \gamma_k C x_k + v_k, \qquad (1b)$$

$\forall k \in \mathbb{N}$, $x_k, w_k \in \mathbb{R}^n$, $y_k, v_k \in \mathbb{R}^p$, where $(x_0, w_k, v_k)$ are mutually independent, zero-mean Gaussian with covariance matrices $(P_0, R_w, R_v)$, and $\gamma_k, \nu_k \in \{0, 1\}$ are i.i.d. Bernoulli random variables with $\Pr\{\gamma_k = 1\} = \bar\gamma$ and $\Pr\{\nu_k = 1\} = \bar\nu$. The variable $\gamma_k$ models the packet loss between sensor and controller, whereas $\nu_k$ models the packet loss between controller and actuator. When there is a packet drop from controller
to actuator, we set the actuator's output to zero. Different strategies, such as holding the control input, could still be modeled using (1) by augmenting the state vector.

The information available to the controller up to time $k$ is given by the information set

$$I_k = \{P_0\} \cup \{y_\ell, \gamma_\ell : \ell \le k\} \cup \{\nu_\ell : \ell \le k-1\}.$$

Here, we make an important assumption that acknowledgment packets from the actuator are always received by the controller, so that $\nu_\ell$, $\ell \le k-1$, is available at time $k$ to the remote estimator.

Optimal Estimation with Remote Computation

The optimal mean-square estimate of $x_k$, given the information known to the remote estimator at time $k$, is given by $\hat{x}_{k|k} := \mathrm{E}[x_k \mid I_k]$:

$$\hat{x}_{0|-1} = 0, \qquad (2a)$$
$$\hat{x}_{k|k} = \hat{x}_{k|k-1} + \gamma_k F_k \big(y_k - C \hat{x}_{k|k-1}\big), \qquad (2b)$$
$$\hat{x}_{k+1|k} = A \hat{x}_{k|k} + \nu_k B u_k, \qquad (2c)$$

with the gain matrix $F_k$ calculated recursively as follows:

$$F_k = P_k C' (C P_k C' + R_v)^{-1},$$
$$P_{k+1} = A P_k A' + R_w - \gamma_k A F_k (C P_k C' + R_v) F_k' A'.$$

Each $P_k$ corresponds to the estimation error covariance matrix

$$P_k = \mathrm{E}\big[(x_k - \hat{x}_{k|k-1})(x_k - \hat{x}_{k|k-1})'\big].$$

For this estimator, there exists a critical value $\gamma_c$ for the dropout rate $\bar\gamma$, above which the estimation error covariance becomes unbounded:

Theorem 1 (Sinopoli et al. 2004) Assume that $(A, R_w^{1/2})$ is controllable, $(A, C)$ is observable, and $A$ is unstable. Then there exists a critical value $\gamma_c \in (0, 1]$ such that

$$\mathrm{E}[P_k] \le M,\ \forall k \in \mathbb{N} \iff \bar\gamma > \gamma_c,$$

where $M$ is a positive definite matrix that may depend on $P_0$. Furthermore, the critical value $\gamma_c$ satisfies $\gamma_{\min} \le \gamma_c \le \gamma_{\max}$, where the lower bound is given by

$$\gamma_{\min} = 1 - \frac{1}{\big(\max\{|\sigma(A)|\}\big)^2}, \qquad (3)$$

and the upper bound is given by the solution to the following (quasi-convex) optimization problem:

$$\gamma_{\max} = \min\{\gamma \ge 0 : \Psi(Y, Z) > 0,\ 0 \le Y \le I \text{ for some } Y, Z\},$$

where

$$\Psi(Y, Z) = \begin{bmatrix} Y & \sqrt{\gamma}\,(YA + ZC) & \sqrt{1-\gamma}\,YA \\ \sqrt{\gamma}\,(A'Y + C'Z') & Y & 0 \\ \sqrt{1-\gamma}\,A'Y & 0 & Y \end{bmatrix}.$$

Remark 1 In some special cases, the lower bound in (3) is tight in the sense that $\gamma_c = \gamma_{\min}$. The largest class of systems known for which this occurs is that of the nondegenerate systems defined in Mo and Sinopoli (2012). Examples of systems in this class include (1) those for which the matrix $C$ is invertible and (2) those with a detectable pair $(A, C)$ and such that the matrix $A$ is diagonalizable with unstable eigenvalues having distinct absolute values.
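As a minimal numerical sketch of the estimator above (the plant matrices, noise covariances, and arrival probabilities are illustrative assumptions, not values from the entry), the recursions (2a)-(2c) and the lower bound (3) can be exercised as follows:

```python
import numpy as np

# Illustrative sketch of the intermittent Kalman filter (2a)-(2c) for the
# plant (1a)-(1b); all numerical values below are assumed for illustration.
rng = np.random.default_rng(1)

A = np.array([[1.4, 0.3], [0.0, 0.7]])   # unstable: eigenvalues 1.4 and 0.7
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
Rw, Rv = 0.05 * np.eye(2), np.array([[0.01]])
gamma_bar, nu_bar = 0.9, 0.9             # packet-arrival probabilities

# Lower bound (3) on the critical arrival probability.
gamma_min = 1 - 1 / max(abs(np.linalg.eigvals(A))) ** 2   # = 1 - 1/1.96

x = np.zeros(2)
x_hat = np.zeros(2)                      # \hat{x}_{k|k-1}, initialized at 0 (2a)
P = np.eye(2)
u = np.array([0.0])                      # open-loop run, control held at zero

for k in range(500):
    gamma = rng.random() < gamma_bar     # sensor -> controller arrival
    nu = rng.random() < nu_bar           # controller -> actuator arrival
    y = gamma * (C @ x) + rng.multivariate_normal(np.zeros(1), Rv)

    # Gain and covariance recursion; measurement update (2b).
    F = P @ C.T @ np.linalg.inv(C @ P @ C.T + Rv)
    x_filt = x_hat + gamma * F @ (y - C @ x_hat)
    P = A @ P @ A.T + Rw - gamma * A @ F @ (C @ P @ C.T + Rv) @ F.T @ A.T
    # Time update (2c); nu is known here thanks to the acknowledgments.
    x_hat = A @ x_filt + nu * (B @ u).ravel()

    # Plant step (1a).
    x = A @ x + nu * (B @ u).ravel() + rng.multivariate_normal(np.zeros(2), Rw)

print(gamma_min, np.trace(P))  # gamma_bar = 0.9 > gamma_min, so P stays bounded
```

Lowering `gamma_bar` below `gamma_min` (about 0.49 for this assumed plant) makes the covariance recursion grow without bound on average, consistent with Theorem 1.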
Optimal Control with Remote Computation

From a control perspective, one may also be interested in finding control sequences $u^N = \{u_1, \ldots, u_{N-1}\}$, as functions of the information set $I_N$, which minimize cost functions of the form

$$J = \lim_{N \to \infty} \frac{1}{N}\, \mathrm{E}\left[\sum_{k=0}^{N-1} \big(x_k' W x_k + \nu_k u_k' U u_k\big) \,\Big|\, I_k\right].$$

Theorem 2 (Schenato et al. 2007) Assume that $(A, B)$ and $(A, R_w^{1/2})$ are controllable, $(A, C)$ and $(A, W^{1/2})$ are observable, and $A$ is unstable. Then, finite control costs $J$ are achievable if and only if $\bar\gamma > \gamma_c$ and $\bar\nu > \nu_c$, where the critical value $\nu_c$ is given by the (quasi-convex) optimization problem

$$\nu_c = \min\{\nu \ge 0 : \Psi(Y, Z) > 0,\ 0 \le Y \le I \text{ for some } Y, Z\},$$

where

$$\Psi(Y, Z) = \begin{bmatrix}
Y & Y & Z U^{1/2} & \sqrt{\nu}\,(Y A' + Z B') & \sqrt{1-\nu}\,Y A' \\
Y & W^{-1} & 0 & 0 & 0 \\
U^{1/2} Z' & 0 & I & 0 & 0 \\
\sqrt{\nu}\,(A Y + B Z') & 0 & 0 & Y & 0 \\
\sqrt{1-\nu}\,A Y & 0 & 0 & 0 & Y
\end{bmatrix}.$$

Notice that now we are applying the time-varying Kalman filter (TVKF) to estimate $\tilde{x}_k$, which is fully observable. Since $\gamma_{\min}$ and $\gamma_{\max}$ in Theorem 1 are equal for fully observable processes (Schenato et al. 2007), the local computation scheme grants a minimal critical value $\gamma_c$, as stated in the theorem below.

Theorem 3 Assume that $(A, R_w^{1/2})$ is controllable, $(A, C)$ is observable, and $A$ is unstable. Then the critical value $\gamma_c$ is given by $\gamma_{\min}$ in (3), i.e.,

$$\mathrm{E}[P_k] \le M,\ \forall k \in \mathbb{N} \iff \bar\gamma > \gamma_{\min},$$

where $M$ is a positive definite matrix that may depend on $P_0$.

Drops in the Acknowledgement Packets

When there are drops in the acknowledgment channel from the actuator to the controller, the controller does not always know $\nu_k$ and, therefore, might not always have access to the control inputs that are actually applied to the plant. In this case, the posterior state probability becomes a Gaussian mixture distribution with infinitely many components, and the separation principle no longer holds (Schenato et al. 2007). This makes the estimation and control problems computationally more difficult, and, due to the smaller information set, some degradation in the control performance should be expected. For this reason, it is generally a good design choice to keep controller and actuator collocated when drops in the acknowledgment channels are significant.

Buffering

As an alternative to the approach described in section "Estimation with Local Computation" (using local computation at a smart sensor to allow for larger probabilities of drop), the designer may also consider the transmission of a sequence of previous measurements $y_k, y_{k-1}, \ldots, y_{k-N}$ in each packet. This approach is motivated by the fact that data packets can often carry much more than one vector of measured outputs. When $N$ is reasonably large, one should expect estimation/control performance similar to that of the approach described in section "Estimation with Local Computation", but with a reduced computational effort at the sensor.

Analogously, an improvement over zeroing or simply holding the control input in case of packet drops between controller and actuator is for the controller to transmit a control sequence $u_k, u_{k+1}, \ldots, u_{k+N}$ that contains not only the control $u_k$ to be used at the current time instant but also a few future controls. In the case of packet drops between controller and actuator, the actuator can use previously received "future" control inputs in lieu of the one contained in the lost packet. The sequence of future control inputs may be obtained, e.g., by an optimal receding horizon control strategy (Gupta et al. 2006).

Estimation with Markovian Drops

When $\gamma_k$ is a Markov process, we no longer have a separation principle, and the optimal controller may depend on the drop sequence. Yet, optimal state estimates are obtained using the same TVKF presented earlier. Below, we give conditions for the stability of the error covariance when drops are governed by the Gilbert-Elliot model: $\Pr\{\gamma_{k+1} = j \mid \gamma_k = i\} = p_{ij}$, $i, j \in \{0, 1\}$.

Theorem 4 (Mo and Sinopoli 2012) Assume that $(A, R_w^{1/2})$ is controllable, $A$ is unstable, and the system given by the pair $(A, C)$ is nondegenerate as discussed in Remark 1. Moreover, suppose that the transition probabilities of the Gilbert-Elliot model satisfy $p_{01}, p_{10} > 0$. Then the expected error covariance $\mathrm{E}[P_k]$ is uniformly bounded if

$$p_{01} > \gamma_{\min},$$

and it is unbounded for some initial condition if $p_{01} < \gamma_{\min}$.
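The use of buffered "future" control inputs described in the Buffering subsection can be sketched with a small, hypothetical actuator class (the class, its names, and the three-step horizon are illustrative assumptions, not an implementation from the literature):

```python
# Sketch: an actuator that stores the most recent control sequence
# u_k, ..., u_{k+N} it has received, so that on a dropped packet it can
# apply the stored "future" input instead of zeroing or holding.
class BufferedActuator:
    def __init__(self):
        self.seq = []        # last received sequence of planned inputs
        self.stamp = -1      # time index of the first element of seq

    def receive(self, k, seq):
        """Called when the packet sent at time k actually arrives."""
        self.seq, self.stamp = list(seq), k

    def actuate(self, k):
        """Input applied at time k: buffered value if available, else 0."""
        i = k - self.stamp
        if self.stamp >= 0 and 0 <= i < len(self.seq):
            return self.seq[i]
        return 0.0

act = BufferedActuator()
act.receive(0, [1.0, 0.5, 0.25])   # packet at k = 0 carries 3 planned inputs
applied = [act.actuate(k) for k in range(4)]   # packets at k = 1,2,3 all lost
print(applied)  # [1.0, 0.5, 0.25, 0.0]
```

Once the buffer is exhausted (here at $k = 3$), the actuator falls back to zeroing, exactly the situation that a longer transmitted horizon or a receding-horizon controller is meant to postpone.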
Networks with Multiple Links

We now consider feedback loops that are closed over a network of communication links, each of which drops packets according to a Bernoulli process. The sensor communicates with a controller across the network, and we assume that controller and actuator are collocated. The network may be represented by a graph $G$ with nodes in the set $V$ and edges in the set $E$, where edges are drawn between two communicating nodes. We denote by $p_{ij}$ the probability of a drop when node $i$ transmits to node $j$. Drops are assumed to be independent across links and time.

To maximize robustness with respect to drops, sensors use a Kalman filter to compute an optimal estimate for the state of the process based on their measurements and transmit this estimate across the network. When the sensors do not have access to the process input, they can take advantage of the linearity of the Kalman filter: as the output of a Kalman filter is the sum of a term due to the measurements and another term due to the control inputs, sensors may compute only the contribution due to the measurements and transmit it to the controller, which can subsequently add the contribution due to the control inputs. This guarantees that optimal state estimates can still be computed at the control node, even when the sensors do not know the control input (Gupta et al. 2009).

The communication in the network goes as follows. Sensors time stamp their estimates and broadcast them to all nodes in their communication ranges. After receiving information from their neighbors, nodes compare time stamps and keep only the most recent estimates. These estimates are then broadcast to all neighboring nodes. When the controller receives new information, the optimal Kalman estimate is reconstructed, taking into account the total transmission delay (learned from the packet time stamps), and a standard LQG control can be used (Gupta et al. 2009).

To determine whether or not this procedure results in a stable closed loop, one defines a cut $C = (S, T)$ to be a partition of the node set $V$ such that the sensor node is in $S$ and the controller node is in $T$. The cut-set is then defined as the set of edges $(i, j) \in E$ such that $i \in S$ and $j \in T$, i.e., the set of edges that connect the sets $S$ and $T$. The max-cut probability is then defined as

$$p_{\text{max-cut}} = \max_{\text{cuts } (S,T)}\ \prod_{(i,j) \in S \times T} p_{ij}.$$

The above maximization can be rewritten as a minimization over the sums of $-\log p_{ij}$, which leads to a linear program known as the minimum cut problem in network optimization theory (Cook 1995).

Theorem 5 (Gupta et al. 2009) Assume that $R_w, R_v > 0$, that $(A, B)$ is stabilizable, that $(A, C)$ is observable, and that $A$ is unstable. Then the control and communication policy described above is optimal for quadratic costs, and the expected state covariance is bounded if and only if

$$p_{\text{max-cut}}\, \big(\max\{|\sigma(A)|\}\big)^2 < 1.$$

Estimation with Controlled Communication

To actively reduce network traffic and power consumption, sensor measurements may not be sent to the remote estimator at every time step. In addition, one may have the ability to somewhat control the probability of packet drops by varying the transmit power or by transmitting copies of the same message through multiple channel realizations. This is known as controlled communication, and it allows the designer to establish a trade-off between communication and estimation performance.

We consider the local estimation scenario described in section "Estimation with Local Computation", with the difference that the Bernoulli drops are now modulated as follows:

$$\gamma_k = \begin{cases} 1 & \text{with prob. } \varepsilon_k \\ 0 & \text{with prob. } 1 - \varepsilon_k \end{cases}$$

where the sensor is free to choose $\varepsilon_k \in [0, p_{\max}]$ as a function of the information available up to time $k$. With its choice, the sensor incurs a communication cost $c(\varepsilon_k)$ at time $k$, where $c(\cdot)$ is some increasing function that may represent, for example, the energy needed in order to transmit with a probability of drop equal to $1 - \varepsilon_k$.
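A toy simulation of this modulated-drop model is sketched below. The scalar plant, the two-level choice of $\varepsilon_k$, the threshold rule, and the quadratic cost are all illustrative assumptions; the local-remote error is simply assumed to reset whenever a transmission succeeds.

```python
import numpy as np

# Illustrative sketch: controlled communication for a scalar process. The
# sensor picks eps_k in {0, p_max}, "paying" for a transmission attempt
# only when the local-remote error is large.
rng = np.random.default_rng(7)

a = 1.1                 # scalar unstable dynamics
p_max = 0.7             # best achievable arrival probability
threshold = 1.0         # attempt a transmission when |error| exceeds this

def cost(eps):
    return eps ** 2     # assumed increasing communication cost c(.)

err, total_cost, transmissions = 0.0, 0.0, 0
for k in range(2000):
    eps = p_max if abs(err) > threshold else 0.0   # threshold policy
    transmissions += eps > 0
    total_cost += cost(eps)
    arrived = rng.random() < eps                   # gamma_k = 1 w.p. eps_k
    d = rng.normal(0.0, 0.1)                       # innovation-like noise
    err = d if arrived else a * err + d            # reset on success, else grow

print(transmissions / 2000, total_cost / 2000)
```

Since $(1 - p_{\max})\,a^2 = 0.3 \times 1.21 < 1$ for these assumed values, the error stays stochastically bounded while the sensor transmits only a fraction of the time, which is the communication/estimation trade-off being described.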
Note that transmission scheduling, where $\varepsilon_k$ is either $0$ or $p_{\max}$, is a special case of this framework.

In order to choose $\varepsilon_k$, the sensor considers the estimation error $\tilde{e}_k := \tilde{x}_{k|k} - \hat{x}_{k|k-1}$ between the local and the remote estimators. This error evolves according to

$$\tilde{e}_{k+1} = \begin{cases} d_k & \text{with prob. } \varepsilon_k \\ A \tilde{e}_k + d_k & \text{with prob. } 1 - \varepsilon_k \end{cases}$$

and the sensor minimizes the cost

$$\tilde{J} := \lim_{K \to \infty} \frac{1}{K}\, \mathrm{E}\left[\sum_{k=0}^{K-1} \big(\|\tilde{e}_k\|^2 + \lambda\, c(\varepsilon_k)\big)\right], \qquad \lambda > 0, \qquad (4)$$

which penalizes a linear combination of the remote estimation error variance $\mathrm{E}\big[\|\tilde{e}_k\|^2\big]$ and the average communication cost $\mathrm{E}[c(\varepsilon_k)]$. In this context, a communication policy should be understood as a rule that selects $\varepsilon_k$ as a function of the information available to the sensor.

When

$$(1 - p_{\max})\, \big(\max\{|\sigma(A)|\}\big)^2 < 1,$$

there exists an optimal communication policy that chooses $\varepsilon_k$ as a function of $\tilde{e}_k$, which may be computed via dynamic programming and value iteration (Mesquita et al. 2012). While this procedure can be computationally difficult, it is often possible to obtain suboptimal but reasonable performance with rollout policies such as the following one:

When computing $\tilde{e}_k$ and $\varepsilon_k$ in (5) is computationally too costly for the sensor, one may prefer to make $\varepsilon_k$ a function of the number of consecutive dropped packets $\ell_k$. In this case, minimizing $\tilde{J}$ in (4) is equivalent to minimizing the cost

$$\bar{J} := \lim_{K \to \infty} \frac{1}{K}\, \mathrm{E}\left[\sum_{k=0}^{K-1} \big(\operatorname{trace} \Sigma_{\ell_k} + \lambda\, c(\varepsilon_k)\big)\right].$$

Summary and Future Directions

Most positive results in the subject rely on the assumption of perfect acknowledgments and on actuators and controllers being collocated. Future research should address ways of circumventing these assumptions.

Cross-References

▶ Data Rate of Nonlinear Control Systems and Feedback Entropy
▶ Information and Communication Complexity of Networked Control Systems
▶ Networked Control Systems: Architecture and Stability Issues
▶ Networked Systems
▶ Quantized Control and Data Rate Constraints
agents may be changing and dynamic. Such interactions may be structured across different layers, which themselves might be organized in a hierarchical fashion. Networked systems may also interact with external entities that specify high-level commands that trickle down through the system all the way to the agent level.

A defining characteristic of a networked system is the fact that information, understood in a broad sense, is sparse and distributed across the agents. As such, different individuals have access to information of varying degrees of quality. As part of the operation of the networked system, mechanisms are in place to share, transmit, and/or aggregate this information. Some information may be disseminated throughout the whole network or, in some cases, all information can be made centrally available at a reasonable cost. In other scenarios, however, the latter might turn out to be too costly, unfeasible, or undesirable because of privacy and security considerations. Individual agents are the basic unit for decision making, but decisions might be made from intermediate levels of the networked system all the way to a central planner. The combination of information availability and decision-making capabilities gives rise to an ample spectrum of possibilities between the centralized control paradigm, where all information is available at a central planner who makes the decisions, and the fully distributed control paradigm, where individual agents only have access to the information shared by their neighbors in addition to their own.

Perspective from Systems and Control

There are many aspects that come into play when dealing with networked systems regarding computation, processing, sensing, communication, planning, motion control, and decision making. This complexity makes their study challenging and fascinating and explains the interest that, with different emphases, they generate in a large number of disciplines. In biology, scientists analyze synchronization phenomena and self-organized swarming behavior in groups with distributed agent-to-agent interactions (Okubo 1986; Parrish et al. 2002; Conradt and Roper 2003; Couzin et al. 2005). In robotics, engineers design algorithmic solutions to help multivehicle networks and embedded systems coordinate their actions and perform challenging spatially distributed tasks (Arkin 1998; Committee on Networked Systems of Embedded Computers 2001; Balch and Parker 2002; Howard et al. 2006; Kumar et al. 2008). Graph theorists and applied mathematicians study the role played by the interconnection among agents in the emergence of phase transition phenomena (Bollobás 2001; Meester and Roy 2008; Chung 2010). This interest is also shared in communication and information theory, where researchers strive to design efficient communication protocols and examine the effect of topology control on group connectivity and information dissemination (Zhao and Guibas 2004; Giridhar and Kumar 2005; Lloyd et al. 2005; Santi 2005; Franceschetti and Meester 2007). Game theorists study the gap between the performance achieved by global, network-wide optimizers and the configurations that result from selfish agents interacting locally in social and economic systems (Roughgarden 2005; Nisan et al. 2007; Easley and Kleinberg 2010; Marden and Shamma 2013). In mechanism design, researchers seek to align the objectives of individual self-interested agents with the overall goal of the network. Static and mobile networked systems and their applications to the study of natural phenomena in oceans (Paley et al. 2008; Graham and Cortés 2012; Zhang and Leonard 2010; Das et al. 2012; Ouimet and Cortés 2013), rivers (Ru and Martínez 2013; Tinka et al. 2013), and the environment (DeVries and Paley 2012) also raise exciting challenges in estimation theory, computational geometry, and spatial statistics.

The field of systems and control brings a comprehensive approach to the modeling, analysis, and design of networked systems. Emphasis is put on the understanding of the general principles that explain how specific collective behaviors emerge from basic interactions; the establishment of models, abstractions, and tools that allow us to reason rigorously about complex
Networked Systems 851
interconnected systems; and the development of systematic methodologies that help engineer their behavior. The ultimate goal is to establish a science for integrating individual components into complex, self-organizing networks with predictable behavior. To realize the "power of many" and expand the realm of what is possible to achieve beyond the individual agent capabilities, special care is taken to obtain precise guarantees on the stability properties of coordination algorithms, understand the conditions and constraints under which they work, and characterize their performance and robustness against a variety of disturbances and disruptions.

Research Issues – and How the Entries in the Encyclopedia Address Them

Given the key role played by agent-to-agent interactions in networked systems, the Encyclopedia entries ▶ Graphs for Modeling Networked Interactions and ▶ Dynamic Graphs, Connectivity of deal with how their nature and effect can be modeled through graphs. This includes diverse aspects such as deterministic and stochastic interactions, static and dynamic graphs, state-dependent and time-dependent neighboring relationships, and connectivity. The importance of maintaining a certain level of coordination and consistency across the networked system is manifested in the various entries that deal with coordination tasks that are, in some way or another, related to some form of agreement. These include consensus (▶ Averaging Algorithms and Consensus), formation control (▶ Vehicular Chains), cohesiveness, flocking (▶ Flocking in Networked Systems), synchronization (▶ Oscillator Synchronization), and distributed optimization (▶ Distributed Optimization). A great deal of work (e.g., see ▶ Optimal Deployment and Spatial Coverage and ▶ Multi-vehicle Routing) is also devoted to the design of cooperative strategies that achieve spatially distributed tasks such as optimal coverage, space partitioning, vehicle routing, and servicing. These entries explore the optimal placement of agents, the optimal tuning of sensors, and the distributed optimization of network resources. The entry ▶ Estimation and Control over Networks explores the impact that communication channels may have on the execution of estimation and control tasks over networks of sensors and actuators.

A strong point of commonality among the contributions is the precise characterization of the scalability of coordination algorithms, together with the rigorous analysis of their correctness and stability properties. Another focal point is the analysis of the performance gap between centralized and distributed approaches in regard to the ultimate network objective.

Further information about other relevant aspects of networked systems can be found throughout this Encyclopedia. Among these, we highlight the synthesis of cooperative strategies for data fusion, distributed estimation, and adaptive sampling; the analysis of the network operation under communication constraints (e.g., limited bandwidth, message drops, delays, and quantization); the treatment of game-theoretic scenarios that involve interactions among multiple players and where security concerns might be involved; distributed model predictive control; and the handling of uncertainty, imprecise information, and events via discrete-event systems and triggered control.

Summary and Future Directions

In conclusion, this entry has illustrated ways in which systems and control can help us design and analyze networked systems. We have focused on the role that information and agent interconnection play in shaping their behavior. We have also emphasized the increasingly rich set of methods and techniques that make it possible to provide correctness and performance guarantees. The field of networked systems is vast, and the amount of work is impossible to survey in this brief entry. The reader is invited to further explore additional topics beyond the ones mentioned here. The monographs (Ren and Beard 2008; Bullo et al. 2009; Mesbahi and Egerstedt 2010; Alpcan and Başar 2010) and edited
volumes (Kumar et al. 2004; Shamma 2008; Saligrama 2008), and manuscripts (Olfati-Saber et al. 2007; Baillieul and Antsaklis 2007; Leonard et al. 2007; Kim and Kumar 2012), together with the references provided in the Encyclopedia entries mentioned above, are a good starting point to undertake this enjoyable effort. Given the big impact that networked systems have, and will continue to have, in our society, from energy and transportation, through human interaction and healthcare, to biology and the environment, there is no doubt that the coming years will witness the development of more tools, abstractions, and models that allow us to reason rigorously about intelligent networks, and of techniques that help design truly autonomous and adaptive networks.

Cross-References

▶ Averaging Algorithms and Consensus
▶ Dynamic Graphs, Connectivity of
▶ Distributed Optimization
▶ Estimation and Control over Networks
▶ Flocking in Networked Systems
▶ Graphs for Modeling Networked Interactions
▶ Multi-vehicle Routing
▶ Optimal Deployment and Spatial Coverage
▶ Oscillator Synchronization
▶ Vehicular Chains

Bibliography

Ahuja RK, Magnanti TL, Orlin JB (1993) Network flows: theory, algorithms, and applications. Prentice Hall, Englewood Cliffs
Alpcan T, Başar T (2010) Network security: a decision and game-theoretic approach. Cambridge University Press, Cambridge, UK
Arkin RC (1998) Behavior-based robotics. MIT, Cambridge, MA
Baillieul J, Antsaklis PJ (2007) Control and communication challenges in networked real-time systems. Proc IEEE 95(1):9–28
Balch T, Parker LE (eds) (2002) Robot teams: from diversity to polymorphism. A. K. Peters, Wellesley, MA
Bollobás B (2001) Random graphs, 2nd edn. Cambridge University Press, Cambridge, UK
Bullo F, Cortés J, Martínez S (2009) Distributed control of robotic networks. Applied mathematics series. Princeton University Press, Princeton, NJ. Electronically available at http://coordinationbook.info
Cantoni M, Weyer E, Li Y, Ooi SK, Mareels I, Ryan M (2007) Control of large-scale irrigation networks. Proc IEEE 95(1):75–91
Chiang HD, Chu CC, Cauley G (1995) Direct stability analysis of electric power systems using energy functions: theory, applications, and perspective. Proc IEEE 83(11):1497–1529
Chow JH (1982) Time-scale modeling of dynamic networks with applications to power systems. Springer, New York, NY
Chung FRK (2010) Graph theory in the information age. Not AMS 57(6):726–732
Committee on Networked Systems of Embedded Computers (2001) Embedded, everywhere: a research agenda for networked systems of embedded computers. National Academy Press, Washington, DC
Conradt L, Roper TJ (2003) Group decision-making in animals. Nature 421(6919):155–158
Couzin ID, Krause J, Franks NR, Levin SA (2005) Effective leadership and decision-making in animal groups on the move. Nature 433(7025):513–516
Das J, Py F, Maughan T, O'Reilly T, Messié M, Ryan J, Sukhatme GS, Rajan K (2012) Coordinated sampling of dynamic oceanographic features with AUVs and drifters. Int J Robot Res 31(5):626–646
DeVries L, Paley D (2012) Multi-vehicle control in a strong flowfield with application to hurricane sampling. AIAA J Guid Control Dyn 35(3):794–806
Dörfler F, Chertkov M, Bullo F (2013) Synchronization in complex oscillator networks and smart grids. Proc Natl Acad Sci 110(6):2005–2010
Easley D, Kleinberg J (2010) Networks, crowds, and markets: reasoning about a highly connected world. Cambridge University Press, Cambridge, UK
Franceschetti M, Meester R (2007) Random networks for communication. Cambridge University Press, Cambridge, UK
Giridhar A, Kumar PR (2005) Computing and communicating functions over sensor networks. IEEE J Sel Areas Commun 23(4):755–764
Graham R, Cortés J (2012) Adaptive information collection by robotic sensor networks for spatial estimation. IEEE Trans Autom Control 57(6):1404–1419
Howard A, Parker LE, Sukhatme GS (2006) Experiments with a large heterogeneous mobile robot team: exploration, mapping, deployment, and detection. Int J Robot Res 25(5–6):431–447
Jackson MO (2010) Social and economic networks. Princeton University Press, Princeton, NJ
Kim KD, Kumar PR (2012) Cyberphysical systems: a perspective at the centennial. Proc IEEE 100(Special Centennial Issue):1287–1308
Kumar V, Leonard NE, Morse AS (eds) (2004) Cooperative control. Lecture notes in control and information sciences, vol 309. Springer, New York, NY
Neural Control and Approximate Dynamic Programming 853
Kumar V, Rus D, Sukhatme GS (2008) Networked robots. Strogatz SH (2003) SYNC: the emerging science of spon-
In: Siciliano B, Khatib O (eds) Springer handbook of taneous order. Hyperion, New York, NY
robotics. Springer, New York, NY, pp 943–958 Tinka A, Rafiee M, Bayen A (2013) Floating sensor
Neural Control and Approximate Dynamic Programming

Frank L. Lewis¹ and Kyriakos G. Vamvoudakis²
¹Arlington Research Institute, University of Texas, Fort Worth, TX, USA
²Center for Control, Dynamical Systems and Computation (CCDC), University of California, Santa Barbara, CA, USA

Abstract

There has been great interest recently in "universal model-free controllers" that do not need a mathematical model of the controlled plant but mimic the functions of biological processes to learn about the systems they are controlling online, so that performance improves automatically. Neural network (NN) control has had two major thrusts: approximate dynamic programming, which uses NNs to approximately solve the optimal control problem, and NNs in closed-loop feedback control.

Keywords

Adaptive control; Learning systems; Neural networks; Optimal control; Reinforcement learning

Neural Feedback Control

The objective is to design NN feedback controllers that cause a system to follow, or track,
a prescribed trajectory or path. Consider the dynamics of an $n$-link robot manipulator

$$M(q)\ddot q + V_m(q,\dot q)\dot q + G(q) + F(\dot q) + \tau_d = \tau \qquad (1)$$

with $q(t)\in\mathbb{R}^n$ the joint variable vector, $M(q)$ an inertia matrix, $V_m$ a centripetal/Coriolis matrix, $G(q)$ a gravity vector, and $F(\cdot)$ representing friction terms. Bounded unknown disturbances and modeling errors are denoted by $\tau_d$, and the control input torque is $\tau(t)$. The sliding mode control approach (Slotine and Li 1987) can be generalized to NN control systems. Given a desired trajectory $q_d(t)\in\mathbb{R}^n$, define the tracking error $e(t)=q_d(t)-q(t)$ and the sliding variable error $r=\dot e+\Lambda e$ with $\Lambda=\Lambda^T>0$. Define the nonlinear robot function

$$f(x) = M(q)(\ddot q_d + \Lambda\dot e) + V_m(q,\dot q)(\dot q_d + \Lambda e) + G(q) + F(\dot q)$$

with the known vector $x(t)$ of measured signals selected as $x=[e^T\ \dot e^T\ q_d^T\ \dot q_d^T\ \ddot q_d^T]^T$.

NN Controller for Continuous-Time Systems

The NN controller is designed based on the functional approximation properties of NNs, as shown in Lewis et al. (1999). Thus, assume that $f(x)$ can be approximated by $\hat f(x)=\hat W^T\sigma(\hat V^T x)$, with $\hat V,\hat W$ the estimated NN weights. Select the control input $\tau=\hat W^T\sigma(\hat V^T x)+K_v r-v$, with $K_v$ a symmetric positive definite gain and $v(t)$ a robustifying function. This NN control structure is shown in Fig. 1. The outer proportional-derivative (PD) tracking loop guarantees robust behavior. The inner loop containing the NN is known as a feedback linearization loop, and the NN effectively learns the unknown dynamics online to cancel the nonlinearities of the system. Let the estimated sigmoid Jacobian be $\hat\sigma'\equiv d\sigma(z)/dz|_{z=\hat V^T x}$. Then, the NN weight tuning laws are provided by

$$\dot{\hat W} = F\hat\sigma r^T - F\hat\sigma'\hat V^T x\, r^T - \kappa F\|r\|\hat W, \qquad \dot{\hat V} = G x\,(\hat\sigma'^T\hat W r)^T - \kappa G\|r\|\hat V,$$

with any constant symmetric matrices $F,G>0$ and scalar tuning parameter $\kappa>0$.

Neural Control and Approximate Dynamic Programming, Fig. 1 Neural network robot controller

NN Controller for Discrete-Time Systems

Most feedback controllers today are implemented on digital computers. This requires the specification of control algorithms in discrete-time or digital form (Lewis et al. 1999). To design such controllers, one may consider the discrete-time dynamics $x_{k+1}=f(x_k)+g(x_k)u_k$ with unknown functions $f(\cdot),g(\cdot)$. The digital NN controller derived in this situation has the form of a feedback linearization controller, as shown in Fig. 1. One can derive tuning algorithms, for a discrete-time neural network controller with $L$ layers, that guarantee system stability and robustness (Lewis et al. 1999). For the $i$-th layer, the weight updates are of the form

$$\hat W_i(k+1) = \hat W_i(k) - \alpha_i\hat\varphi_i(k)\hat y_i^T(k) - \Gamma\,\|I-\alpha_i\hat\varphi_i(k)\hat\varphi_i^T(k)\|\,\hat W_i(k),$$

where $\hat\varphi_i(k)$ are the output functions of layer $i$, $0<\Gamma<1$ is a design parameter, and

$$\hat y_i(k) = \begin{cases}\hat W_i^T\hat\varphi_i(k)+K_v r(k) & \text{for } i=1,\dots,L-1,\\ r(k+1) & \text{for } i=L,\end{cases}$$

with $r(k)$ a filtered error.

Feedforward Neurocontroller

Industrial, aerospace, DoD, and MEMS assembly systems have actuators that generally contain deadzone, backlash, and hysteresis. Since these actuator nonlinearities appear in the feedforward loop, the NN compensator must also appear in the feedforward loop. This design is significantly more complex than for feedback NN controllers. Details are given in Lewis et al. (2002). Feedforward controllers can offset the effects of deadzone if properly designed. It can be shown that a NN deadzone compensator has the structure shown in Fig. 2.

Neural Control and Approximate Dynamic Programming, Fig. 2 Feedforward NN for deadzone compensation

The NN compensator consists of two NNs. NN II is in the direct feedforward control loop, and NN I is not directly in the control loop but serves as an observer to estimate the (unmeasured) applied torque $\tau(t)$. The feedback stability and performance of the NN deadzone compensator have been rigorously proven using nonlinear stability proof techniques. The two NNs were each selected as having one tunable layer, namely, the output weights. The activation functions were set as a basis by selecting fixed random values for the first-layer weights. To guarantee stability, the output weights of the inversion NN II (subscript $i$ denotes weights and sigmoids of the inversion) and the estimator NN I should be tuned respectively with update laws having design matrices $T,S>0$ and tuning gains $k_1,k_2$.
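The shapes involved in the continuous-time tuning laws are easy to get wrong, so a minimal sketch may help; it implements $\hat f(x)=\hat W^T\sigma(\hat V^T x)$ and one Euler step of the two weight update laws. Layer sizes, gains, and signals below are illustrative assumptions, not values from the entry.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 10, 8, 2                 # x = [e, e_dot, q_d, qd_dot, qd_ddot], 2 joints

V = 0.1 * rng.standard_normal((n_in, n_hid))  # estimated first-layer weights V_hat
W = 0.1 * rng.standard_normal((n_hid, n_out)) # estimated output weights W_hat

def sigma(z):
    """Sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-z))

def tuning_step(V, W, x, r, F=10.0, G=10.0, kappa=0.1, dt=1e-3):
    """One Euler step of the weight tuning laws; the e-modification terms
    (-kappa*F*||r||*W and -kappa*G*||r||*V) keep the weights bounded."""
    s = sigma(V.T @ x)                         # sigma_hat
    sp = np.diag(s * (1.0 - s))                # estimated sigmoid Jacobian sigma_hat'
    W_dot = (F * np.outer(s, r)
             - F * np.outer(sp @ (V.T @ x), r)
             - kappa * F * np.linalg.norm(r) * W)
    V_dot = (G * np.outer(x, sp.T @ (W @ r))
             - kappa * G * np.linalg.norm(r) * V)
    return V + dt * V_dot, W + dt * W_dot

x = rng.standard_normal(n_in)                  # measured signal vector
r = 0.05 * rng.standard_normal(n_out)          # sliding variable error
V1, W1 = tuning_step(V, W, x, r)
f_hat = W1.T @ sigma(V1.T @ x)                 # NN estimate of the robot function
```

In a full controller this step would run inside the closed-loop simulation, with $x$ and $r$ supplied by the tracking loop at each sample instant.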
Approximate Dynamic Programming for Feedback Control

The current status of work in approximate dynamic programming (ADP) for feedback control is given in Lewis and Liu (2012). ADP is a form of reinforcement learning based on an actor/critic structure. Reinforcement learning (RL) is a class of methods used in machine learning to methodically modify the actions of an agent based on observed responses from its environment (Sutton and Barto 1998). Actor/critic structures are RL systems that have two learning structures: a critic network evaluates the performance of the current action policy and, based on that evaluation, an actor structure updates the action policy, as shown in Fig. 3. Adaptive optimal controllers (Lewis et al. 2012b) have been proposed by adding optimality criteria to an adaptive controller or adding adaptive characteristics to an optimal controller.

Consider the discrete-time dynamics $x_{k+1}=f(x_k)+g(x_k)u_k$ with state $x_k\in\mathbb{R}^n$ and control input $u_k\in\mathbb{R}^m$. A deterministic control policy is defined as a function $h$ from state space to control space, $h:\mathbb{R}^n\to\mathbb{R}^m$. That is, for every state $x_k$, the policy defines a control action $u_k=h(x_k)$ as a feedback controller. Define a deterministic cost function that yields the value

$$V(x_k) = \sum_{i=k}^{\infty} \gamma^{\,i-k}\, r(x_i,u_i),$$

with $0<\gamma\le 1$ a discount factor, stage cost $r(x_k,u_k)=Q(x_k)+u_k^T R\,u_k$ with $Q(x_k),R>0$, and $u_k=h(x_k)$ a prescribed feedback control policy. The optimal value is given by Bellman's optimality equation

$$V^*(x_k) = \min_{h(\cdot)} \big( r(x_k,h(x_k)) + \gamma V^*(x_{k+1}) \big),$$

which is the discrete-time Hamilton-Jacobi-Bellman (HJB) equation. Two forms of RL can be based on policy iteration (PI) and value iteration (VI). For temporal difference learning, PI is written as follows in terms of the deterministic Bellman equation.
Algorithm 1 PI for discrete-time systems
1: procedure
2: Given admissible policy $h^{(0)}(x_k)$
3: while $\|V^{(i)}-V^{(i-1)}\|\ge\varepsilon_{ac}$ do
4: Solve for the value $V^{(i)}(x)$ using the deterministic Bellman equation
$$V^{(i)}(x_k) = r\big(x_k,h^{(i)}(x_k)\big) + \gamma V^{(i)}(x_{k+1})$$
5: Update the control policy $h^{(i+1)}$ using
$$h^{(i+1)}(x_k) = \arg\min_u \big( r(x_k,u) + \gamma V^{(i)}(x_{k+1}) \big)$$
6: $i := i+1$
7: end while
8: end procedure
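For the special case of linear dynamics $x_{k+1}=Ax_k+Bu_k$ with quadratic cost and $\gamma=1$, both steps of Algorithm 1 can be carried out in closed form, which makes the algorithm easy to sketch: with a linear policy $u=-Kx$, the value is $V(x)=x^TPx$, policy evaluation reduces to a Lyapunov equation, and policy improvement is an exact argmin. The matrices below are illustrative assumptions, not data from the entry.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.array([[1.0, 1.0]])          # admissible initial policy (closed loop stable)

for i in range(50):
    Acl = A - B @ K
    # Step 4 (policy evaluation): solve P = Q + K^T R K + Acl^T P Acl by iteration.
    P = np.zeros((2, 2))
    for _ in range(2000):
        P = Q + K.T @ R @ K + Acl.T @ P @ Acl
    # Step 5 (policy improvement): u = argmin_u ( r(x,u) + V(Ax + Bu) ).
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# On convergence, P should solve the discrete-time algebraic Riccati equation.
residual = Q + A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A) - P
```

This linear-quadratic specialization of PI is the classical Hewer iteration; the nonlinear Algorithm 1 replaces both closed-form steps with function approximation.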
A related PI algorithm was developed in Abu-Khalaf and Lewis (2005) that will give us the structure for the proposed online algorithms that follow. Consider the following nonlinear, time-invariant dynamical system, affine in the input:

$$\dot x(t) = f(x(t)) + g(x(t))u(t), \qquad x(0)=x_0. \qquad (3)$$

To find the optimal control solution for the problem, one needs to solve the HJB equation (6) for the value function and then substitute in (5) to obtain the optimal control. However, due to the nonlinear nature of the HJB equation, finding its solution is generally difficult or impossible. The following PI algorithm is an iterative algorithm for solving optimal control problems and will give us the structure for the online learning algorithm.

Algorithm 2 PI for continuous-time systems
1: procedure
2: Given admissible policy $u^{(0)}$
3: while $\|V^{u^{(i)}}-V^{u^{(i-1)}}\|\ge\varepsilon_{ac}$ do
4: Solve for the value $V^{(i)}(x)$ using Bellman's equation
$$Q(x) + \Big(\frac{\partial V^{u^{(i)}}}{\partial x}\Big)^{T}\big(f(x)+g(x)u^{(i)}\big) + u^{(i)T} R\, u^{(i)} = 0, \qquad V^{u^{(i)}}(0)=0$$
5: Update the control policy $u^{(i+1)}$ using
$$u^{(i+1)} = -\tfrac{1}{2} R^{-1} g^{T}(x)\,\frac{\partial V^{u^{(i)}}}{\partial x}$$
6: $i := i+1$
7: end while
8: end procedure

A PI algorithm that solves the HJB equation online without full information of the plant dynamics is proposed in Vrabie et al. (2009), where the Bellman equation is proved to be equivalent to the integral reinforcement learning (IRL) form, with the optimal value given for some $T>0$ as

$$V^*(x(t)) = \min_u \Big( \int_t^{t+T} r(x(\tau),u(\tau))\,d\tau + V^*(x(t+T)) \Big).$$

Therefore, the temporal difference error for continuous-time systems can be defined as

$$e(t:t+T) = \rho(t:t+T) + V(x(t+T)) - V(x(t)),$$

with $\rho(t:t+T)\equiv\int_t^{t+T} r(x(\tau),u(\tau))\,d\tau$, without any information about the plant dynamics. The IRL controller just given tunes the critic neural network to determine the value while holding the control policy fixed. The IRL algorithm can be implemented online by RL techniques using the value function approximation $\hat V(x)=\hat W_1^T\phi(x)$ in a critic approximator network. Using that approximation in the PI algorithm, one can use batch least squares or recursive least squares to update the value function; on convergence of the value parameters, the action is updated. The implementation of the IRL optimal adaptive control algorithm is shown in Fig. 4.

Neural Control and Approximate Dynamic Programming, Fig. 4 Hybrid optimal adaptive controller based on IRL
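The IRL critic update can be sketched for a scalar linear plant, where the value is $V(x)=px^2$ and a single measured trajectory segment determines $p$ from the IRL Bellman equation $p\,x(t)^2=\rho(t{:}t{+}T)+p\,x(t{+}T)^2$. The critic update uses only trajectory data; the actor update $u=-(bp/R)x$ uses the input gain $b$. All plant numbers are illustrative assumptions.

```python
import numpy as np

a, b, q, R = 1.0, 1.0, 1.0, 1.0          # plant x_dot = a x + b u, r(x,u) = q x^2 + R u^2
K = 2.0                                   # admissible initial policy: a - b*K = -1 < 0
T, n = 0.5, 5000                          # reinforcement interval and integration grid

for _ in range(20):
    # "Measure" one closed-loop segment starting from x(t) = 1 under u = -K x.
    t = np.linspace(0.0, T, n + 1)
    x = np.exp((a - b * K) * t)           # closed-loop solution
    u = -K * x
    cost = q * x**2 + R * u**2
    rho = float(np.sum((cost[1:] + cost[:-1]) * 0.5) * (t[1] - t[0]))  # trapezoid rule
    # Critic update (batch least squares; one data pair suffices in the scalar case):
    p = rho / (x[0]**2 - x[-1]**2)
    # Actor update: u = -(1/2) R^{-1} g dV/dx with dV/dx = 2 p x.
    K = b * p / R

# p should now solve the scalar continuous-time ARE:  2 a p - b^2 p^2 / R + q = 0.
are_residual = 2 * a * p - b**2 * p**2 / R + q
```

Note that the plant parameter $a$ appears above only to simulate the "measured" trajectory, mirroring the fact that the IRL critic needs no drift dynamics.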
The work in Vamvoudakis and Lewis (2010) presents a way of finding the optimal control solution in a synchronous manner, along with stability and convergence guarantees, but with known dynamics. This procedure is more nearly in line with accepted practice in adaptive control. A synchronous online learning algorithm that avoids knowledge of the drift dynamics is proposed in Vamvoudakis et al. (2013).

Learning in Games

Reinforcement learning techniques have been applied to design adaptive controllers that converge to the solution of two-player zero-sum games in Vamvoudakis and Lewis (2012) and Vrabie et al. (2012), of multiplayer nonzero-sum games in Vamvoudakis et al. (2012a), and of Stackelberg games in Vamvoudakis et al. (2012b). In these cases, the adaptive control structure has multiple loops, with action networks and critic networks for each player. The adaptive controller for zero-sum games finds the solution to the H-infinity control problem online in real time. This adaptive controller does not require any system dynamics information.

Summary and Future Directions

This entry discusses some neuro-inspired adaptive control techniques. These controllers have multi-loop, multi-timescale structures and can learn the solutions to Hamilton-Jacobi design equations, such as the Riccati equation, online without knowing the full dynamical model of the system. A method known as Q-learning allows the learning of optimal control solutions online, in the discrete-time case, for completely unknown systems. Q-learning has not yet been fully investigated for continuous-time systems.

Cross-References

▶ Adaptive Control, Overview
▶ Stochastic Games and Learning
▶ Optimal Control and the Dynamic Programming Principle

Acknowledgments This material is based upon work supported by NSF grant ECCS-1128050, ARO grant W91NF-05-1-0314, and AFOSR grant FA9550-09-1-0278.

Bibliography

Abu-Khalaf M, Lewis FL (2005) Nearly optimal control laws for nonlinear systems with saturating actuators using a neural network HJB approach. Automatica 41(5):779–791
Al-Tamimi A, Lewis FL, Abu-Khalaf M (2008) Discrete-time nonlinear HJB solution using approximate dynamic programming: convergence proof. IEEE Trans Syst Man Cybern Part B 38(4):943–949
Lewis FL, Liu D (2012) Reinforcement learning and approximate dynamic programming for feedback control. IEEE Press computational intelligence series. Wiley-Blackwell, Oxford
Lewis FL, Vamvoudakis KG (2011) Reinforcement learning for partially observable dynamic processes: adaptive dynamic programming using measured output data. IEEE Trans Syst Man Cybern Part B 41(1):14–25
Lewis FL, Jagannathan S, Yesildirek A (1999) Neural network control of robot manipulators and nonlinear systems. Taylor and Francis, London
Lewis FL, Campos J, Selmic R (2002) Neuro-fuzzy control of industrial systems with actuator nonlinearities. Society of Industrial and Applied Mathematics Press, Philadelphia
Lewis FL, Vrabie D, Syrmos VL (2012a) Optimal control. Wiley, New York
Lewis FL, Vrabie D, Vamvoudakis KG (2012b) Reinforcement learning and feedback control: using natural decision methods to design optimal adaptive controllers. IEEE Control Syst Mag 32(6):76–105
Slotine JJE, Li W (1987) On the adaptive control of robot manipulators. Int J Robot Res 6(3):49–59
Sutton RS, Barto AG (1998) Reinforcement learning – an introduction. MIT Press, Cambridge
Vamvoudakis KG, Lewis FL (2010) Online actor-critic algorithm to solve the continuous-time infinite horizon optimal control problem. Automatica 46(5):878–888
Vamvoudakis KG, Lewis FL (2011) Multi-player non-zero-sum games: online adaptive learning solution of coupled Hamilton-Jacobi equations. Automatica 47(8):1556–1569
Vamvoudakis KG, Lewis FL (2012) Online solution of nonlinear two-player zero-sum games using synchronous policy iteration. Int J Robust Nonlinear Control 22(13):1460–1483
Vamvoudakis KG, Lewis FL, Hudas GR (2012a) Multi-agent differential graphical games: online adaptive learning solution for synchronization with optimality. Automatica 48(8):1598–1611
Vamvoudakis KG, Lewis FL, Johnson M, Dixon WE (2012b) Online learning algorithm for Stackelberg games in problems with hierarchy. In: Proceedings of the 51st IEEE conference on decision and control, Maui, pp 1883–1889
Vamvoudakis KG, Vrabie D, Lewis FL (2013) Online adaptive algorithm for optimal control with integral reinforcement learning. Int J Robust Nonlinear Control. doi:10.1002/rnc.3018
Vrabie D, Pastravanu O, Lewis FL, Abu-Khalaf M (2009) Adaptive optimal control for continuous-time linear systems based on policy iteration. Automatica 45(2):477–484
Vrabie D, Vamvoudakis KG, Lewis FL (2012) Optimal adaptive control and differential games by reinforcement learning principles. Control engineering series. IET Press, London
Werbos PJ (1989) Neural networks for control and system identification. In: Proceedings of the IEEE conference on decision and control, Tampa
Werbos PJ (1992) Approximate dynamic programming for real-time control and neural modeling. In: White DA, Sofge DA (eds) Handbook of intelligent control. Van Nostrand Reinhold, New York


Nominal Model-Predictive Control

Lars Grüne
Mathematical Institute, University of Bayreuth, Bayreuth, Germany

Abstract

Model-predictive control is a controller design method which synthesizes a sampled data feedback controller from the iterative solution of open-loop optimal control problems. We describe the basic functionality of MPC controllers, their properties regarding feasibility, stability, and performance, and the assumptions needed in order to rigorously ensure these properties in a nominal setting.

Keywords

Recursive feasibility; Sampled-data feedback; Stability

Introduction

Model-predictive control (MPC) is a method for the optimization-based control of linear and nonlinear dynamical systems. While the literal meaning of "model-predictive control" applies to virtually every model-based controller design method, nowadays the term commonly refers to control methods in which pieces of open-loop optimal control functions or sequences are put together in order to synthesize a sampled data feedback law. As such, it is often used synonymously with "receding horizon control."

The concept of MPC was first presented in Propoĭ (1963) and was reinvented several times already in the 1960s. Due to the lack of sufficiently fast computer hardware, for a while these ideas did not have much of an impact. This changed during the 1970s, when MPC was successfully used in chemical process control. At that time, MPC was mainly applied to linear systems with quadratic cost and linear constraints, since for this class of problems algorithms were sufficiently fast for real-time implementation – at least for the typically relatively slow dynamics of process control systems. The 1980s then saw the development of theory and increasingly sophisticated concepts for linear MPC, while in the 1990s nonlinear MPC (often abbreviated as NMPC) attracted the attention of the MPC community. After the year 2000, several gaps in the analysis of nonlinear MPC without terminal constraints and costs were closed, and increasingly faster algorithms were developed. Together with the progress in hardware, this has considerably broadened the possible applications of both linear and nonlinear MPC.

In this entry, we explain the functionality of nominal MPC along with its most important properties and the assumptions needed to rigorously ensure these properties. We also give some hints on the underlying proofs. The term nominal MPC refers to the assumption that the mismatch between our model and the real plant is sufficiently small to be neglected in the following considerations. If this is not the case, methods from robust MPC must be used (▶ Robust Model-Predictive Control). We describe all concepts for nonlinear discrete-time systems, noting that the basic results outlined in this entry are conceptually similar for linear and for continuous-time systems.
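The receding-horizon idea described above can be sketched for an unconstrained linear-quadratic problem, where each open-loop optimal control problem is solved exactly by backward dynamic programming (a Riccati recursion) and only the first element of the optimal sequence is applied before re-optimizing. System, weights, and horizon below are illustrative assumptions.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])   # sampled double integrator
B = np.array([[0.5], [1.0]])
Q, R, N = np.eye(2), np.array([[1.0]]), 10

def mpc_first_input(x):
    """Solve the N-step open-loop LQ problem from x; return only the first optimal input."""
    P = np.zeros((2, 2))                  # zero terminal cost, F = 0
    K = None
    for _ in range(N):                    # backward dynamic programming over the horizon
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + K.T @ R @ K + (A - B @ K).T @ P @ (A - B @ K)
    return -K @ x                         # first element of the optimal open-loop sequence

x = np.array([5.0, -2.0])
for k in range(40):                       # closed loop: apply, discard the rest, re-optimize
    u = mpc_first_input(x)
    x = A @ x + B @ u
```

A production MPC solver would of course handle state and input constraints, which is where numerical optimization replaces the closed-form Riccati recursion.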
$$\ell(x^*,u^*)=0 \quad\text{and}\quad \alpha_1(|x|)\le\ell(x,u) \qquad (6)$$

for all $x\in X$ and $u\in U$. Here $\alpha_1$ is a $\mathcal{K}_\infty$ function, i.e., a continuous function $\alpha_1:[0,\infty)\to[0,\infty)$ which is strictly increasing, unbounded, and satisfies $\alpha_1(0)=0$. With $|x|$, we denote the norm on $X$. In this entry, we exclusively discuss stage costs $\ell$ satisfying (6). More general settings using appropriate detectability conditions are discussed in Rawlings and Mayne (2009, Sect. 2.7) or Grimm et al. (2005) in the context of stabilizing MPC. Even more general $\ell$ are allowed in the context of economic MPC; see the ▶ Economic Model Predictive Control article.

In case of terminal constraints and terminal costs, a compatibility condition between $\ell$ and $F$ is needed on $X_0$ in order to ensure stability. More precisely, we demand that for each $x\in X_0$ there exists a control value $u\in U$ such that $f(x,u)\in X_0$ and

$$F(f(x,u)) \le F(x) - \ell(x,u) \qquad (7)$$

holds. Observe that the condition $f(x,u)\in X_0$ is again the viability condition which we already imposed for ensuring recursive feasibility. Note that (7) is trivially satisfied for $F\equiv 0$ in case of $X_0=\{x^*\}$ by choosing $u=u^*$.

Stability is now concluded by using the optimal value function

$$V_N(x_0) := \inf_{u\ \text{s.t.}\ (4)} J_N(x_0,u)$$

as a Lyapunov function. This will yield stability on $S=X_N$, as $X_N$ is exactly the set on which $V_N$ is defined. In order to prove that $V_N$ is a Lyapunov function, we need to check that $V_N$ is bounded from below and above by $\mathcal{K}_\infty$ functions $\alpha_1$ and $\alpha_2$ and that $V_N$ is strictly decaying along the closed-loop solution.

The first amounts to checking

$$\alpha_1(|x|) \le V_N(x) \le \alpha_2(|x|) \qquad (8)$$

for all $x\in X_N$. The lower bound follows immediately from (6) (with the same $\alpha_1$), and the upper bound can be ensured by conditions on the problem data (see, e.g., Rawlings and Mayne 2009, Sect. 4.5; Grüne and Pannek 2011, Sect. 5.3).

For ensuring that $V_N$ is strictly decreasing along the closed-loop solutions, we need to prove

$$V_N(f(x,\mu_N(x))) \le V_N(x) - \ell(x,\mu_N(x)). \qquad (9)$$

In order to prove this inequality, one uses on the one hand the dynamic programming principle stating that

$$V_{N-1}(f(x,\mu_N(x))) = V_N(x) - \ell(x,\mu_N(x)). \qquad (10)$$

On the other hand, one shows that (7) implies

$$V_{N-1}(x) \ge V_N(x) \qquad (11)$$

for all $x\in X_N$. Inserting (11) with $f(x,\mu_N(x))$ in place of $x$ into (10) then immediately yields (9). Details of this proof can be found, e.g., in Mayne et al. (2000), Rawlings and Mayne (2009), or Grüne and Pannek (2011). The survey Mayne et al. (2000) is probably the first paper which develops the conditions needed for this proof in a systematic way; a continuous-time version of these results can be found in Fontes (2001).

Summarizing, for MPC with terminal constraints and costs, under the conditions (6)–(8), we obtain asymptotic stability of $x^*$ on $S=X_N$.

For MPC without terminal constraints and costs, i.e., with $X_0=X$ and $F\equiv 0$, these conditions can never be satisfied, as (7) would immediately imply $\ell(x,u)\le 0$ for all $x\in X$, contradicting (6). Moreover, without terminal constraints and costs, one cannot expect (9) to be true. This is because without terminal constraints, the inequality $V_{N-1}(x)\le V_N(x)$ holds, which together with the dynamic programming principle implies that if (9) holds, then it holds with equality. This, however, would imply that $\mu_N$ is the infinite horizon optimal feedback law, which – though not impossible – is very unlikely to hold.
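The compatibility condition (7) can be checked numerically in the linear-quadratic case: choosing the terminal cost $F(x)=x^TPx$ with $P$ the solution of the discrete-time algebraic Riccati equation and $u$ the associated LQ feedback makes (7) hold, in fact with equality. The system and weights below are illustrative assumptions.

```python
import numpy as np

A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.5], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

# Solve the Riccati equation by fixed-point iteration of the backward recursion.
P = np.eye(2)
for _ in range(500):
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    P = Q + K.T @ R @ K + (A - B @ K).T @ P @ (A - B @ K)

# Check F(f(x,u)) <= F(x) - l(x,u) for u = -Kx over random sample states.
rng = np.random.default_rng(1)
worst = -np.inf
for _ in range(100):
    x = rng.standard_normal(2)
    u = -K @ x
    xplus = A @ x + B @ u
    F_x, F_next = x @ P @ x, xplus @ P @ xplus
    stage = x @ Q @ x + u @ R @ u                # stage cost l(x,u)
    worst = max(worst, F_next - (F_x - stage))   # should be <= 0 (here, numerically ~0)
```

That equality holds here is exactly the closed-loop form of the Riccati equation; any $P$ "larger" than the Riccati solution would make (7) a strict inequality.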
Thus, we need to relax (9). In order to do so, instead of (9) we assume the relaxed inequality

$$V_N(f(x,\mu_N(x))) \le V_N(x) - \alpha\,\ell(x,\mu_N(x)) \qquad (12)$$

for some $\alpha>0$ and all $x\in X$, which is still enough to conclude asymptotic stability of $x^*$ if (6) and (8) hold. The existence of such an $\alpha$ can be concluded from bounds on the optimal value function $V_N$: assuming the existence of constants $\gamma_K\ge 0$ such that the inequality $V_K(x)\le\gamma_K\min_{u\in U}\ell(x,u)$ holds for all $x\in X$ and all horizons $K$, one can compute $\alpha$ explicitly.

A natural way to measure the closed-loop performance is to evaluate the infinite horizon functional corresponding to (3) along the closed loop, i.e.,

$$J_\infty^{cl}(x_0,\mu_N) := \sum_{k=0}^{\infty} \ell\big(x_{\mu_N}(k), \mu_N(x_{\mu_N}(k))\big)$$

with $x_{\mu_N}(0)=x_0$. This value can then be compared with the optimal infinite horizon value

$$V_\infty(x_0) := \inf_{u:\,u(k)\in U,\ x_u(k)\in X} J_\infty(x_0,u).$$
that the closed-loop system, described by the equations

$$\dot x = F(x, v(x,\hat\theta,r), \theta), \qquad \dot{\hat\theta} = w(x,\hat\theta,r), \qquad (4)$$

has specific properties. For example, one could require that all trajectories be bounded and the $x$-component of the state converge to a given value $x^*$ (this is the so-called adaptive regulation requirement) or that the input-output behavior of the system from the input $r$ to some user-defined output signal coincide with that of a given reference model (this is the so-called model reference adaptive control requirement).

A natural way to characterize design specifications for the adaptive control problem, and to facilitate its solution, is to assume the existence of a known parameter controller, described by the equation

$$u = v^*(x,\theta,r), \qquad (5)$$

such that the nonadaptive closed-loop system $\dot x = F(x, v^*(x,\theta,r), \theta)$ satisfies the given design specifications. In this perspective, the adaptive control problem boils down to the design of the update law (2) and of the feedback law (3) such that the behavior of the adaptive closed-loop system matches that of the nonadaptive closed-loop system $\dot x = F(x, v^*(x,\theta,r), \theta)$.

The above description suggests a design method for the feedback law: one could replace $\theta$ with $\hat\theta$ in Eq. (5). This design is often known as certainty equivalence design and lends itself to the interpretation that $\hat\theta$ be an estimate for $\theta$. Naturally, one could also modify the feedback law, replacing $\theta$ with $\hat\theta$ and adding $x$-dependent terms: this is often called a redesign. Redesign may be guided by various considerations; for example, it may be based on the use of a specific Lyapunov function (yielding the so-called Lyapunov redesign), on structural properties of the system, or on robustness constraints.

The interpretation of $\hat\theta$ as an estimate for $\theta$ leads to two similar approaches for the design of the update law. The former, pursued in the so-called indirect adaptive control, relies on the design of a parameter estimator, for example, using recursive least-squares methods. This approach has its roots in identification theory and has been studied in depth for linear systems. The latter relies on the observation that the design of an update law is equivalent to the design of a (reduced-order) observer for the extended system

$$\dot x = F(x,u,\theta), \qquad \dot\theta = 0,$$

with output $y=x$. This approach has its roots in the theory of nonlinear observer design.

The approaches described so far rely on a sort of separation principle: the update law and the feedback law are designed separately. While this approach may be adequate for linear systems, for nonlinear systems it is often necessary to design the update law and the feedback law in one step, i.e., the selection of the feedback law depends upon the selection of the update law and vice versa. To illustrate this design method, and to provide some explicit adaptive control design tools, we focus on a special class of nonlinear systems: systems which are linearly parameterized in the unknown parameter.

Linearly Parameterized Systems

Consider the system (1) and assume the mapping $F$ is affine in the parameter $\theta$ and in the control $u$, namely,

$$F(x,u,\theta) = f_0(x) + g(x)u + f_1(x)\theta, \qquad (6)$$

with $f_0:\mathbb{R}^n\to\mathbb{R}^n$, $g:\mathbb{R}^n\to\mathbb{R}^{n\times m}$, and $f_1:\mathbb{R}^n\to\mathbb{R}^{n\times q}$ smooth mappings. For this class of systems, under additional assumptions, it is possible to provide systematic adaptive control design tools. We provide two formal results; additional results (depending on the specific assumptions imposed on the system) may be derived. In both cases the focus is on the adaptive stabilization problem: the goal of the adaptive controller is to render a given equilibrium stable, in the sense of Lyapunov, and to guarantee convergence of the $x$-component of the state (recall that the state of the adaptive closed-loop system is the vector $(x,\hat\theta)$).
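The recursive least-squares estimator mentioned above in connection with indirect adaptive control can be sketched in a few lines for a linear regression $y_k=\varphi_k^T\theta$; the true parameter, the regressors, and the initial covariance below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
theta = np.array([1.5, -0.7])            # unknown parameter (used only to simulate data)

theta_hat = np.zeros(2)
Pcov = 100.0 * np.eye(2)                 # large initial covariance = low prior confidence

for k in range(200):
    phi = rng.standard_normal(2)         # regressor (persistently exciting here)
    y = phi @ theta                      # measured output
    e = y - phi @ theta_hat              # prediction error
    gain = Pcov @ phi / (1.0 + phi @ Pcov @ phi)
    theta_hat = theta_hat + gain * e     # estimate update
    Pcov = Pcov - np.outer(gain, phi) @ Pcov   # covariance update (forgetting factor 1)
```

In an indirect adaptive scheme, $\hat\theta$ produced this way would be fed into a certainty-equivalence feedback law; convergence of $\hat\theta$ to $\theta$ requires the regressor to be persistently exciting.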
Theorem 1 Consider the system (6) and a point $x^*$. Assume there exist smooth mappings $v_0:\mathbb{R}^n\to\mathbb{R}^m$ and $v_1:\mathbb{R}^n\to\mathbb{R}^{m\times q}$, and a positive definite and radially unbounded function $V:\mathbb{R}^n\to\mathbb{R}$, such that $V(x^*)=0$ and

$$\frac{\partial V}{\partial x}\, f(x,\theta) < 0$$

for all $x\ne x^*$, where $f(x,\theta)=f_0(x)+f_1(x)\theta+g(x)(v_0(x)+v_1(x)\theta)$ is the closed-loop vector field obtained with the known parameter controller $u=v_0(x)+v_1(x)\theta$. Then the update law

$$\dot{\hat\theta} = -\Big(\frac{\partial V}{\partial x}\, g(x)\, v_1(x)\Big)^{\!\top}$$

and the feedback law

$$u = v_0(x) + v_1(x)\hat\theta$$

are such that all trajectories of the closed-loop system are bounded and $\lim_{t\to\infty} x(t)=x^*$.

Theorem 2 Consider the system (6) and a point $x^*$. Assume there exists a known parameter controller $u=v(x,\theta)$ such that the closed-loop system

$$\dot x = f(x,\theta),$$

where $f(x,\theta)=f_0(x)+f_1(x)\theta+g(x)v(x,\theta)$, has a globally asymptotically stable equilibrium at $x^*$. Assume, in addition, that there exists a mapping $\beta:\mathbb{R}^n\to\mathbb{R}^q$ such that all trajectories of the system

$$\dot z = -\frac{\partial\beta}{\partial x}\, f_1(x)\, z, \qquad \dot x = f(x,\theta) + g(x)\big(v(x,\theta+z)-v(x,\theta)\big) \qquad (7)$$

are bounded and satisfy

$$\lim_{t\to\infty}\big[\, g(x(t))\big(v(x(t),\theta+z(t))-v(x(t),\theta)\big)\,\big]=0.$$

Then the update law

$$\dot{\hat\theta} = -\frac{\partial\beta}{\partial x}\Big( f_0(x) + g(x)u + f_1(x)\big(\hat\theta+\beta(x)\big) \Big)$$

and the feedback law

$$u = v(x,\hat\theta+\beta(x))$$

are such that all trajectories of the closed-loop system are bounded and $\lim_{t\to\infty} x(t)=x^*$.

The stability properties of the adaptive closed-loop system in Theorem 1 can be studied with the Lyapunov function $W(x,\hat\theta)=V(x)+\tfrac12\|\hat\theta-\theta\|^2$, whereas a Lyapunov analysis for the adaptive closed-loop system of Theorem 2 can be carried out, under additional assumptions, via a Lyapunov function of the form $W(x,\hat\theta)=V(x)+\tfrac12\|\hat\theta-\theta+\beta(x)\|^2$. This suggests that in Theorem 1 $\hat\theta$ plays the role of the estimate of $\theta$, whereas in Theorem 2 such a role is played by $\hat\theta+\beta(x)$. Note that in neither of the theorems is the parameter estimate required to converge to the true value of the parameters, although in Theorem 2 the feedback law is required to converge, along trajectories, to the known parameter controller. This has a very important, possibly counterintuitive, consequence: the asymptotic nonadaptive controller $u=v(x,\hat\theta_\infty)$, where $\hat\theta_\infty=\lim_{t\to\infty}\hat\theta(t)$, provided the limit exists, is not in general a stabilizing controller for system (6).

Example 1 Consider the nonlinear system described by the equation $\dot x = u + \theta x^2$, with $x(t)\in\mathbb{R}$, $u(t)\in\mathbb{R}$, and $\theta\in\mathbb{R}$. A known parameter controller satisfying the assumptions of Theorem 1 (with $v_0(x)=-x$, $v_1(x)=-x^2$, and $V(x)=x^2/2$) and of Theorem 2 (with $\beta(x)=x$) is $u=-x-\theta x^2$. The resulting update laws and feedback laws are

$$\dot{\hat\theta}_1 = x^3, \qquad u_1 = -x - \hat\theta_1 x^2,$$

and

$$\dot{\hat\theta}_2 = x, \qquad u_2 = -x - (\hat\theta_2+x)x^2,$$
respectively; the subscripts "1" and "2" refer to the constructions in Theorems 1 and 2.

The basic building blocks in Theorems 1 and 2 can be exploited repeatedly to design adaptive controllers for systems with a specific structure, for example, for systems described by the equations

$$\dot x_1 = x_2 + \varphi_1(x_1)^{\top}\theta,$$
$$\dot x_2 = x_3 + \varphi_2(x_1,x_2)^{\top}\theta,$$
$$\vdots$$
$$\dot x_i = x_{i+1} + \varphi_i(x_1,\dots,x_i)^{\top}\theta,$$
$$\vdots$$
$$\dot x_n = u + \varphi_n(x_1,\dots,x_n)^{\top}\theta, \qquad (9)$$

with $x_i(t)\in\mathbb{R}$ for $i=1,\dots,n$, $u(t)\in\mathbb{R}$, $\varphi_i:\mathbb{R}^i\to\mathbb{R}^q$ for $i=1,\dots,n$ smooth mappings, and $\theta\in\mathbb{R}^q$. Note that the last of the equations (9) can be replaced by

$$\dot x_n = \bar\theta u + \varphi_n(x_1,\dots,x_n)^{\top}\theta,$$

with $\bar\theta\in\mathbb{R}$, provided its sign is known (this condition may be removed using the so-called Nussbaum gain). The parameter $\bar\theta$ is often referred to as the high-frequency gain of the system: a terminology borrowed from linear systems theory.

Similar tools are available for systems which are linearly parameterized in the unmeasured states, namely, systems described by equations of the form

$$\dot x_1 = x_2 + \delta_1(x_1) + \varphi_1(x_1)^{\top}\theta,$$
$$\dot x_2 = x_3 + \delta_2(x_1) + \varphi_2(x_1)^{\top}\theta,$$
$$\vdots$$
$$\dot x_i = x_{i+1} + \delta_i(x_1) + \varphi_i(x_1)^{\top}\theta + b_i u,$$
$$\vdots$$
$$\dot x_{n-1} = x_n + \delta_{n-1}(x_1) + \varphi_{n-1}(x_1)^{\top}\theta + b_{n-1} u,$$
$$\dot x_n = \delta_n(x_1) + \varphi_n(x_1)^{\top}\theta + b_n u,$$
$$y = x_1,$$

with $x_i(t)\in\mathbb{R}$ for $i=1,\dots,n$, $u(t)\in\mathbb{R}$, $y(t)\in\mathbb{R}$, $\varphi_i:\mathbb{R}\to\mathbb{R}^q$ and $\delta_i:\mathbb{R}\to\mathbb{R}$ for $i=1,\dots,n$ smooth mappings, $\theta\in\mathbb{R}^q$, and $b=[b_i,\dots,b_{n-1},b_n]^{\top}$ unknown, but such that the sign of $b_n$ is known and the polynomial $b_n s^{n-i}+b_{n-1}s^{n-i-1}+\dots+b_i$ has all roots with negative real part (this implies that the system, with input $u$ and output $y$, is minimum phase).

Nonlinear Parameterized Systems

Adaptive control of nonlinearly parameterized systems is an open area of research. The design of adaptive controllers often relies upon structural assumptions, for example, the existence of a monotonic parameterization, as in the system described by the equation
with x(t) ∈ ℝ, u(t) ∈ ℝ, and θ ∈ ℝ; this system may be rewritten in over-parameterized form as

ẋ = u + φ_1(x) θ_1 + φ_2(x) θ_2,

with θ_i ∈ ℝ, for i = 1, 2. Note that the over-parameterized form overlooks the important information that θ_1² + θ_2² = 1.

Summary and Future Directions

The problem of adaptive stabilization for nonlinear systems has been discussed. Two conceptual building blocks for the design of stabilizing adaptive controllers have been presented, together with classes of systems for which these blocks allow one to explicitly design adaptive controllers. The role of parameter convergence, or lack thereof, has been briefly discussed, together with connections between adaptive and observer designs. The difficulties associated with non-full state measurement and with nonlinear parameterization have also been briefly highlighted. Several problems have not been discussed, for example, model reference adaptive control, robust adaptive control, universal adaptive controllers, and the use of projections to incorporate prior knowledge on the parameter. Details on these can be found in the bibliography below.

Cross-References

Adaptive Control, Overview
History of Adaptive Control
Stochastic Adaptive Control
Switching Adaptive Control

Bibliography

Astolfi A, Karagiannis D, Ortega R (2008) Nonlinear and adaptive control with applications. Springer, London
Hovakimyan N, Cao C (2010) L1 adaptive control theory. SIAM, Philadelphia
Ilchmann A (1997) Universal adaptive stabilization of nonlinear systems. Dyn Control 7(3):199–213
Ilchmann A, Ryan EP (1994) Universal λ-tracking for nonlinearly-perturbed systems in the presence of noise. Automatica 30(2):337–346
Jiang Z-P, Praly L (1998) Design of robust adaptive controllers for nonlinear systems with dynamic uncertainties. Automatica 34(7):825–840
Kanellakopoulos I, Kokotović PV, Morse AS (1991) Systematic design of adaptive controllers for feedback linearizable systems. IEEE Trans Autom Control 36(11):1241–1253
Krstić M, Kanellakopoulos I, Kokotović P (1995) Nonlinear and adaptive control design. Wiley, New York
Marino R, Tomei P (1995) Nonlinear control design: geometric, adaptive and robust. Prentice-Hall, London
Nussbaum RD (1982) Some remarks on a conjecture in parameter adaptive control. Syst Control Lett 3(5):243–246
Pomet J-B, Praly L (1992) Adaptive nonlinear regulation: estimation from the Lyapunov equation. IEEE Trans Autom Control 37(6):729–740
Sastry SS, Isidori A (1989) Adaptive control of linearizable systems. IEEE Trans Autom Control 34(11):1123–1131
Spooner JT, Maggiore M, Ordonez R, Passino KM (2002) Stable adaptive control and estimation for nonlinear systems. Wiley, New York
Townley S (1999) An example of a globally stabilizing adaptive controller with a generically destabilizing parameter estimate. IEEE Trans Autom Control 44(11):2238–2241

Nonlinear Filters

Frederick E. Daum
Raytheon Company, Woburn, MA, USA

Abstract

Nonlinear filters estimate the state of dynamical systems given noisy measurements related to the state vector. In theory, such filters can provide optimal estimation accuracy for nonlinear measurements with nonlinear dynamics and non-Gaussian noise. However, in practice, the actual performance of nonlinear filters is limited by the curse of dimensionality. There are many different types of nonlinear filters, including the extended Kalman filter, the unscented Kalman filter, and particle filters.
time random processes for the evolution of the state vector (x) in many practical applications. Similarly, one can model measurements in continuous time using Itô calculus:

dz = h(x(t), t) dt + dv

Most engineers consider continuous-time measurement models as impractical and unnecessarily complicated mathematically, because digital computers always require discrete-time measurements and there are no practical analog computers that can be used for nonlinear filtering, owing to the overwhelming superiority of digital computers in terms of accuracy, stability, dynamic range, and flexibility. Nevertheless, there are many papers published by researchers using continuous-time measurement models. But the vast majority of practical papers on nonlinear filters use discrete-time measurement models for obvious reasons. This contrasts sharply with the practical importance of correctly modeling continuous-time random processes for the evolution of the state vector (x).

Nonlinear Filter Algorithms

There is no universally best nonlinear filter for all applications, and there is much debate about which is the best nonlinear filter for any given application. Even if we knew the best nonlinear filter for a given computer, the answer could be very different for a different computer; in particular, some filters can exploit massively parallel processing architectures, whereas others cannot. Research and development of nonlinear filters should continue rapidly for the foreseeable future. More generally, there is no universal theory of computational complexity for practical algorithms of this type; perhaps the closest approximation to such a theory is "information-based complexity" (IBC); e.g., see Traub and Werschulz (1998) and Dick et al. (2013). The estimation accuracy of x and the computational complexity of the nonlinear filter are intimately connected, as shown below for particle filters.

There is no useful way to quantify the computational complexity of nonlinear filters without also quantifying estimation accuracy of x. This contrasts with standard computational complexity theory (e.g., P vs. NP) because we are interested in approximations rather than exact solutions. This is the basic idea of IBC. In practice, engineers compare the estimation accuracy and computational complexity of different nonlinear filters using Monte Carlo simulations for specific applications and specific computers.

The most active area of current research in nonlinear filters is focused on particle filters, which have the promise of optimal accuracy for essentially any nonlinear filter problem, at the cost of very high computational complexity for high-dimensional problems. In the early days (1994–2004), researchers often asserted that particle filters "beat the curse of dimensionality," but it is well known today that this assertion is wrong (e.g., see Daum 2005). Unfortunately, there is no useful theory of computational complexity for particle filters, but rather the currently available theory gives asymptotic bounds on accuracy with generic "constants." Such bounds on the variance of estimation error are generally of the form c/N, in which N is the number of particles and c is the generic so-called constant. But we know that the so-called constant actually varies by many orders of magnitude depending on the specifics of the problem, including the following: (1) dimension of the state vector being estimated, (2) uncertainty in the initial state vector, (3) measurement accuracy, (4) stability of the dynamical system that describes the time evolution of the state vector, (5) geometry of the conditional probability densities (e.g., unimodal, log-concave, multimodal, etc.), (6) Lipschitz constants of the log probability densities, (7) curvature of the nonlinear dynamics and measurements, (8) ill-conditioning of the Fisher information matrix for the estimation problem, etc. Moreover, there are no tight bounds on the so-called constant c for practical nonlinear filter problems, but rather the best bounds for simple MCMC problems are known to be 30 orders of magnitude too large; see Dick et al. (2013).
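The c/N error bound discussed above is easy to check empirically on a toy problem. The sketch below is our own illustration, not from this entry: it estimates the variance of a plain N-sample Monte Carlo mean, for which the so-called constant is simply c = Var(z), and quadrupling N should roughly quarter the variance.

```python
import random

def mc_variance(N, trials=1000, seed=0):
    """Empirical variance of an N-sample Monte Carlo estimate of E[z]
    for z ~ N(0, 1); theory predicts c/N with c = Var(z) = 1."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # one realization of the N-sample estimator of E[z] = 0
        estimates.append(sum(rng.gauss(0.0, 1.0) for _ in range(N)) / N)
    mean = sum(estimates) / trials
    return sum((e - mean) ** 2 for e in estimates) / trials

# quadrupling the number of samples cuts the variance roughly by four
print(mc_variance(100), mc_variance(400))
```

For a particle filter the same 1/N rate holds, but the constant c depends strongly on the eight problem features listed above, which is why a particle count that works in one application can fail badly in another.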
Discrete-Time Measurement Models

Research papers on nonlinear filters are often mathematically abstract, but advanced math is not required for practical engineering applications (e.g., see Ho and Lee 1964). In particular, one can avoid the advanced stochastic mathematics used for continuous-time measurements by using discrete-time measurements, which is the practical case of interest anyway, owing to the use of digital computers to implement such algorithms. The notion that continuous-time measurements result in simpler, better, or more elegant results for nonlinear filters is misleading; for example, we have the elegant innovation theory for continuous-time measurements (Kailath 1970), but this theory is not applicable for discrete-time measurements, likewise with the elegant formula for propagating the conditional mean for continuous-time measurements (the so-called Fujisaki-Kallianpur-Kunita formula). More generally, the simple discrete-time version of Bayes' rule suffices for practical real-world engineering applications; there is rarely a need to employ the more complex continuous-time version. The discrete-time formula for Bayes' rule is simply

p(x(t_k), t_k | Z_k) = p(x(t_k), t_k | Z_{k-1}) p(z_k | x(t_k)) / p(z_k | Z_{k-1})

in which
p(x(t_k), t_k | Z_k) = probability density of x at time t_k conditioned on Z_k; this is also called the "posterior probability density"
x(t) = state vector of the dynamical system at time t
Z_k = set of all measurements up to and including time t_k
z_k = measurement vector at time t_k
p(z_k | x(t_k)) = probability density of z_k conditioned on x(t_k); this is also called the "likelihood"
p(A | B) = probability density of A conditioned on B

This is all one needs to know about Bayes' rule for practical engineering applications of nonlinear filtering; see Ho and Lee (1964). Bayes' rule is a simple formula that multiplies two probability densities and normalizes the result by dividing by p(z_k | Z_{k-1}). In most applications, there is no need to normalize the density, and hence, Bayes' rule for the unnormalized conditional density is even simpler:

p(x(t_k), t_k | Z_k) = p(x(t_k), t_k | Z_{k-1}) p(z_k | x(t_k))

We see that Bayes' rule for the unnormalized conditional density is simply a multiplication of two densities (i.e., the likelihood and the prior).

Summary and Future Directions

In practical applications, the most popular nonlinear filter is the extended Kalman filter (EKF), followed by the unscented Kalman filter (UKF). These two filters give good accuracy and robust performance for many practical applications. The computational complexity of both the EKF and UKF grows as the cube of the dimension of the state vector, and hence, they are very practical to run in real time on laptops or PCs for many real-world applications. But there are also many difficult nonlinear or non-Gaussian problems for which the EKF and UKF give suboptimal accuracy, and in some cases, they give surprisingly bad accuracy. The accuracy of optimal nonlinear filters is limited by the curse of dimensionality. We know how to write the equations for the optimal nonlinear filter, but the solution generally takes an exponentially increasing time to compute as the dimension of the state vector grows. There are many different kinds of nonlinear filters, and this is still an active field of research, as shown in Crisan and Rozovskii (2011). Future research is likely to exploit advances in computational complexity theory for approximation of functions in the style of information-based complexity (IBC) rather than P vs. NP theory. This is because we want good fast approximations rather than exact algorithms. A lucid introduction to IBC is Traub and Werschulz (1998), and recent work is surveyed in Dick et al. (2013). Another fruitful direction of research is to exploit the recent advances in transport theory,
as explained in Daum (2013); the best introduction to transport theory is the book by Villani (2003), which is very accessible yet thorough. Research in exact finite-dimensional filters is difficult but could yield substantial improvements in accuracy and computational complexity; for example, see Benes (1981), Marcus (1984), and Daum (2005). Progress in nonlinear filter research could be inspired by many diverse fields, including fluid dynamics, quantum chemistry, quantum field theory, gauge theory, string theory, Lie superalgebras, Lie supergroups, and neuroscience. An important open research topic is the stability of nonlinear filters, which is obviously a fundamental limitation to good theoretical upper bounds on estimation error. We still do not have a practical theory of stability for nonlinear filters. Perhaps the closest approximation to such a theory is the lucid paper by van Handel (2010), which makes an interesting attempt at understanding the stability of nonlinear filters. In particular, van Handel's paper aims to generalize Kalman's theory of stability for the Kalman filter by connecting stability with the essence of controllability and observability. A good survey of what is known about stability theory for nonlinear filters is given in various articles in Crisan and Rozovskii (2011).

Cross-References

Estimation, Survey on
Extended Kalman Filters
Kalman Filters
Particle Filters

Bibliography

Arasaratnam I, Haykin S, Hurd TR (2010) Cubature Kalman filtering for continuous-discrete systems. IEEE Trans Signal Process 58:4977–4993
Benes V (1981) Exact finite-dimensional filters for certain diffusions with nonlinear drift. Stochastics 5:65–92
Chorin A, Tu X (2009) Implicit sampling for particle filters. Proc Natl Acad Sci 106:17249–17254
Crisan D, Rozovskii B (eds) (2011) Oxford handbook of nonlinear filtering. Oxford University Press, Oxford
Daum F (2005) Nonlinear filters: beyond the Kalman filter. IEEE AES Magazine 20:57–69
Daum F, Huang J (2013a) Particle flow with non-zero diffusion for nonlinear filters. In: Proceedings of SPIE conference, San Diego
Daum F, Huang J (2013b) Particle flow and Monge-Kantorovich transport. In: Proceedings of IEEE FUSION conference, Singapore
Dick J, Kuo F, Peters G, Sloan I (eds) (2013) Monte Carlo and quasi-Monte Carlo methods 2012. Proceedings of conference, Sydney. Springer, Heidelberg
Doucet A, Johansen AM (2011) A tutorial on particle filtering and smoothing: fifteen years later. In: Crisan D, Rozovskii B (eds) The Oxford handbook of nonlinear filtering. Oxford University Press, Oxford, pp 656–704
Gelb A et al (1974) Applied optimal estimation. MIT, Cambridge
Ho Y-C, Lee RCK (1964) A Bayesian approach to problems in stochastic estimation and control. IEEE Trans Autom Control 9:333–339
Jazwinski A (1998) Stochastic processes and filtering theory. Dover, Mineola
Julier S, Uhlmann J (2004) Unscented filtering and nonlinear estimation. IEEE Proc 92:401–422
Kailath T (1970) The innovations approach to detection and estimation theory. Proc IEEE 58:680–695
Kushner HJ (1964) On the differential equations satisfied by conditional probability densities of Markov processes. SIAM J Control 2:106–119
Marcus SI (1984) Algebraic and geometric methods in nonlinear filtering. SIAM J Control Optim 22:817–844
Noushin A, Daum F (2008) Some interesting observations regarding the initialization of unscented and extended Kalman filters. In: Proceedings of SPIE conference, Orlando
Ristic B, Arulampalam S, Gordon N (2004) Beyond the Kalman filter. Artech House, Boston
Sorenson HW (1974) On the development of practical nonlinear filters. Inf Sci 7:253–270
Sorenson HW (1980) Parameter estimation. Marcel-Dekker, New York
Sorenson HW (1988) Recursive estimation for nonlinear dynamic systems. In: Spall J (ed) Bayesian analysis of time series and dynamic models. Marcel-Dekker, New York, pp 127–165
Stratonovich RL (1960) Conditional Markov processes. Theory Probab Appl 5:156–178
Traub J, Werschulz A (1998) Complexity and information. Cambridge University Press, Cambridge
van Handel R (2010) Nonlinear filters and system theory. In: Proceedings of 19th international symposium on mathematical theory of networks and systems, Budapest
Villani C (2003) Topics in optimal transportation. American Mathematical Society, Providence
Zakai M (1969) On the optimal filtering of diffusion processes. Z für Wahrscheinlichkeitstheorie und verw Geb 11:230–243
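The discrete-time Bayes recursion at the heart of this entry (predict, then multiply the prior by the likelihood and normalize) can be sketched directly on a one-dimensional grid. The random-walk model, noise levels, and grid below are our own illustrative choices, not from the entry.

```python
import math

def gauss(x, m, s):
    # Gaussian density N(x; m, s^2)
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2 * math.pi))

def grid_filter(measurements, q=0.5, r=0.5, lo=-8.0, hi=8.0, n=201):
    """Discrete-time Bayes recursion on a grid: predict with the
    transition density f(x'|x) = N(x'; x, q^2), then apply Bayes' rule
    with the likelihood p(z_k | x) = N(z_k; x, r^2) and renormalize."""
    dx = (hi - lo) / (n - 1)
    grid = [lo + i * dx for i in range(n)]
    p = [1.0 / (hi - lo)] * n                     # flat prior density
    for z in measurements:
        # time update (prediction): convolution with the transition kernel
        pred = [sum(gauss(xi, xj, q) * pj for xj, pj in zip(grid, p)) * dx
                for xi in grid]
        # measurement update: prior times likelihood, then normalize
        post = [pi * gauss(z, xi, r) for xi, pi in zip(grid, pred)]
        s = sum(post) * dx
        p = [pi / s for pi in post]
    return sum(xi * pi for xi, pi in zip(grid, p)) * dx   # posterior mean

print(grid_filter([2.0, 2.2, 1.9, 2.1]))  # posterior mean near 2
```

A grid filter like this is exact up to discretization error in one dimension, but the number of grid points grows exponentially with the state dimension, which is precisely the curse of dimensionality that limits all optimal nonlinear filters.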
Nonlinear Sampled-Data Systems

Nonlinear Sampled-Data Systems, Fig. 1 Sampled-data (control) system: a discrete-time controller produces u(t_k), which a D/A converter holds as the continuous-time plant input u(t); the plant output y(t) is sampled by an A/D converter to give y(t_k), which is fed back to the controller
Note that the system in Fig. 1 can be generalized in many ways. An important generalization is multi-rate sampling, where the output of the system is sampled at one sampling rate while the control inputs are updated at a different sampling rate. Another generalization is networked control systems, which are discussed in the last section.

Modeling

The combination of continuous-time and discrete-time components renders the analysis and the design of sampled-data systems challenging. Still, linear systems allow for computationally efficient analysis and design techniques that benefit from the z and δ transforms, as well as convex optimization (Chen and Francis 1994). Nonlinear sampled-data systems, on the other hand, are much harder to deal with since the aforementioned methods do not apply in this case. This inherent difficulty has led to a variety of models for different analysis and design methods:
1. Continuous-time models
2. Discrete-time models
3. Sampled-data models
We discuss below each of these models, their features, and the analysis or design methods that exploit them.

Continuous-time models basically ignore the sampling process and assume that all signals are continuous time. They are the coarsest approximation of the sampled-data system and they are useful only for very small sampling periods. Nevertheless, they are invaluable and are used as the first step in the controller/observer design in the so-called emulation design approach.

Discrete-time models only capture the behavior of the sampled-data system at sampling instants. Indeed, they ignore the inter-sample behavior of the system and this is their main drawback. There are two ways in which nonlinear discrete-time models arise: (i) from the identification of the plant model using the sampled measurements and (ii) from the discretization of a known continuous-time plant model. For instance, black box identification methods often lead to nonlinear discrete-time models in input-output form, such as NARMA (nonlinear autoregressive moving average) models (Chen et al. 1989; Juditsky et al. 1995). Depending on the approximating functions used, the nonlinearities can be polynomial, neural network type, fuzzy type, and so on. On the other hand, the discretization of the continuous-time plant model requires an exact analytic solution of a set of nonlinear differential equations. When such an analytic solution exists, we can obtain the exact discrete-time models of the system; this is typically assumed for linear plants. Nonlinear sampled-data systems are different from their linear counterparts in that it is typically impossible to obtain the exact discrete-time model and only approximate discrete-time models are available for analysis and design (Nešić et al. 1999; Nešić and Teel 2004).

Sampled-data models capture the true behavior of the sampled-data system including its
inter-sample behavior. There are several ways in which this can be achieved. One way is to model the piecewise constant signals that arise from zero-order hold devices as signals with a time-varying delay; this gives rise to time-delay nonlinear models (Teel et al. 1998). Another recently proposed approach is to model nonlinear sampled-data systems as hybrid dynamical systems (Goebel et al. 2012). An extensive analysis and design toolbox has been developed for hybrid dynamical systems and these results can be used for nonlinear sampled-data systems. Another class of models, based on the so-called lifting, has been applied for linear systems, where the system is represented as a discrete-time system with infinite-dimensional input and output spaces. While this approach has been very successful in the linear context (Chen and Francis 1994), it appears that it is not as useful for nonlinear systems due to difficulties arising from harder analysis and prohibitive computational requirements.

The Main Issues and Analysis

Controllability/observability: Issues arising due to sampling in linear systems transfer to the nonlinear context although they are less understood in this case. For instance, it is well known that sampling may "destroy" the controllability and/or observability properties of the system (Chen and Francis 1994). In other words, if the continuous-time plant model is controllable/observable, then the corresponding exact discrete-time model of the plant may not verify these properties for some sampling periods. A simple test is available for linear systems to avoid this phenomenon, but we are not aware of similar results in the nonlinear context.

Finite escape times: A major difference between continuous-time linear and nonlinear systems is that the former have well-defined solutions for constant control inputs and arbitrarily long sampling periods. This is not the case, in general, for nonlinear systems as they may exhibit finite escape times. In other words, for a constant input it may happen for some initial conditions of a nonlinear system that solutions blow up within a time that is shorter than the sampling period. As a consequence, for such an initial condition and input, the exact discrete-time system cannot be defined. This is a fundamental obstacle to achieving global stability results for nonlinear systems if the sampling period is fixed and independent of the size of the initial state. Nevertheless, it is possible to ensure semi-global stability properties for very general nonlinear systems, which means that any compact domain of convergence can be achieved if the sampling period is sufficiently reduced (Nešić and Teel 2004).

Model structure is changed: An important issue for nonlinear sampled-data systems is that the sampling modifies the structure of the model. When the continuous-time plant model has a certain structure, such as triangular or affine in the input, the corresponding exact discrete-time model will not inherit it; see Monaco and Normand-Cyrot (2007) and Yuz and Goodwin (2005). This significantly complicates the design of sampled-data systems via the discrete-time approach since many nonlinear design techniques, like backstepping or forwarding, are heavily reliant on the structure of the model.

Zero dynamics: Probably the most significant aspect of the changed structure is the so-called sampling zeros. In linear systems, it is well known that if a continuous-time linear system of relative degree r ≥ 2 is sampled, then generically for fast sampling the discrete-time models of the plant will have relative degree r = 1. In other words, sampling introduces extra zeros in the model which are often unstable and thus render the system non-minimum phase. It is well known that the controller design is much harder for non-minimum phase systems, and, moreover, there are certain fundamental performance limitations in this case. Recently, results that extend the notion of sampling zeros to nonlinear sampled-data systems have been reported; see the references in Monaco and Normand-Cyrot (2007).

Passivity: Some plant properties like passivity are much more restrictive in discrete time than in continuous time. Indeed, it is necessary for a
which are generally referred to as networked control systems (NCS); see Heemels et al. (2010) and the references cited therein. These systems exploit digital wired or wireless communication networks within the control loops. Such a setup is introduced to reduce the cost, weight, and volume of the engineered systems, but its special structure imposes new challenges due to the communication constraints, data packet dropouts, quantization of data, varying sampling periods, time delays, etc. At the same time, these systems provide new flexibilities due to the distributed computation within the control system that can be used to improve the performance and mitigate some of the undesirable network effects on the overall system performance. Moreover, embedded microprocessors allow for event-triggered and self-triggered sampling (Anta and Tabuada 2010), which are still largely unexplored, especially for nonlinear systems. Design of NCS was identified as one of the biggest challenges to the control research community in the twenty-first century, and more than a decade of intense research on this topic still has not provided a comprehensive and unifying approach for their analysis and design. Novel results on modeling and Lyapunov stability theory for (nonlinear) hybrid dynamical systems appear to offer the right analysis and design tools, but they are still to be converted into efficient and easy-to-use design tools in the control engineers' toolbox.

Cross-References

Event-Triggered and Self-Triggered Control
Hybrid Dynamical Systems, Feedback Control of
Optimal Sampled-Data Control
Sampled-Data Systems

Bibliography

Anta A, Tabuada P (2010) To sample or not to sample: self-triggered control for nonlinear systems. IEEE Trans Autom Control 55:2030–2042
Arapostathis A, Jakubczyk B, Lee HG, Marcus SI, Sontag ED (1989) The effect of sampling on linear equivalence and feedback linearization. Syst Control Lett 13:373–381
Chen T, Francis B (1994) Optimal sampled-data systems. Springer, New York
Chen S, Billings SA, Luo W (1989) Orthogonal least squares methods and their application to non-linear system identification. Int J Control 50:1873–1896
Goebel R, Sanfelice RG, Teel AR (2012) Hybrid dynamical systems. Princeton University Press, Princeton
Grizzle JW (1987) Feedback linearization of discrete-time systems. Syst Control Lett 9:411–416
Heemels M, Teel AR, van de Wouw N, Nešić D (2010) Networked control systems with communication constraints: tradeoffs between transmission intervals, delays and performance. IEEE Trans Autom Control 55:1781–1796
Juditsky A, Hjalmarsson H, Benveniste A, Delyon B, Ljung L, Sjöberg J, Zhang Q (1995) Nonlinear black-box models in system identification: mathematical foundations. Automatica 31:1725–1750
Khalil HK (2004) Performance recovery under output feedback sampled-data stabilization of a class of nonlinear systems. IEEE Trans Autom Control 49:2173–2184
Kötta U (1995) Inversion method in the discrete-time nonlinear control systems synthesis problems. Lecture notes in control and information sciences, vol 205. Springer, Berlin
Laila DS, Nešić D, Teel AR (2002) Open and closed loop dissipation inequalities under sampling and controller emulation. Eur J Control 18:109–125
Monaco S, Normand-Cyrot D (2007) Advanced tools for nonlinear sampled-data systems' analysis and control. Eur J Control 13:221–241
Nešić D, Teel AR (2004) A framework for stabilization of nonlinear sampled-data systems based on their approximate discrete-time models. IEEE Trans Autom Control 49:1103–1122
Nešić D, Teel AR, Kokotović PV (1999) Sufficient conditions for stabilization of sampled-data nonlinear systems via discrete-time approximations. Syst Control Lett 38:259–270
Nešić D, Teel AR, Carnevale D (2009) Explicit computation of the sampling period in emulation of controllers for nonlinear sampled-data systems. IEEE Trans Autom Control 54:619–624
Teel AR, Nešić D, Kokotović PV (1998) A note on input-to-state stability of sampled-data nonlinear systems. In: Proceedings of the conference on decision and control '98, Tampa, pp 2473–2478
Yuz JI, Goodwin GC (2005) On sampled-data models for nonlinear systems. IEEE Trans Autom Control 50:1477–1488
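The finite-escape-time obstruction discussed under "The Main Issues" can be made concrete with the classical scalar example ẋ = x² (our own choice of example, not from this entry): the solution x(t) = x₀/(1 − x₀t) blows up at t = 1/x₀, so the exact discrete-time map exists only when the sampling period T is shorter than the escape time, while an approximate (Euler) model is always defined but increasingly inaccurate as T grows.

```python
def exact_step(x, T):
    """Exact sampled-data model of xdot = x^2 over one period T.
    The solution x(t) = x0 / (1 - x0*t) escapes at t = 1/x0, so for
    x0 > 0 the exact map is only defined when T < 1/x0."""
    if x > 0 and T >= 1.0 / x:
        raise ValueError("finite escape time before the next sample")
    return x / (1.0 - T * x)

def euler_step(x, T):
    """Euler approximate discrete-time model: defined for every T."""
    return x + T * x * x

T = 0.5
print(exact_step(1.0, T))   # 2.0: exact state at the next sampling instant
print(euler_step(1.0, T))   # 1.5: the approximation under-predicts growth
try:
    exact_step(2.1, T)      # escape time 1/2.1 < T: no exact model exists
except ValueError as err:
    print(err)
```

This is why stability results that fix the sampling period independently of the initial state can only be semi-global for such systems.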
Nonlinear System Identification Using Particle Filters
Before the SMC methods are introduced in section "Sequential Monte Carlo", their need is clearly explained by formulating both the Bayesian and the maximum likelihood identification problems in sections "Bayesian Problem Formulation" and "Maximum Likelihood Problem Formulation", respectively. Solutions to these problems are then provided in sections "Bayesian Solutions" and "Maximum Likelihood Solutions", respectively. Finally, we give some intuition for online (recursive) solutions in section "Online Solutions", and in section "Summary and Future Directions", we conclude with a summary and directions for future research.

Bayesian Problem Formulation

In formulating the Bayesian problem, the parameters are modeled as unknown stochastic variables, i.e., the model (1) needs to be augmented with a prior density for the parameters, θ ~ p(θ). The aim in Bayesian system identification is to compute the posterior density of θ given the measurements, p(θ | y_{1:T}). More generally, we typically compute the joint posterior of the parameters θ and the states x_{1:T}, p(θ, x_{1:T} | y_{1:T}).

Maximum Likelihood Problem Formulation

The maximum likelihood (ML) method provides estimates of the unknown parameters in a model, by making use of the information available in the obtained measurements y_{1:T}. The ML estimate is obtained by finding the θ that maximizes the so-called log-likelihood function, which is defined as

ℓ_T(θ) ≜ log p_θ(y_{1:T}) = Σ_{t=1}^{T} log p_θ(y_t | y_{1:t-1}).   (3)

Note that we use θ as a subindex to denote that the corresponding probability density function is parameterized by θ, analogously to what was done in (1). The one-step-ahead predictor p_θ(y_t | y_{1:t-1}) is computed by marginalizing p_θ(y_t, x_t | y_{1:t-1}) = h_θ(y_t | x_t) p_θ(x_t | y_{1:t-1}) w.r.t. x_t, i.e., integrating out x_t from p_θ(y_t, x_t | y_{1:t-1}). To summarize, the ML estimate θ̂_ML is obtained by solving the following optimization problem:

θ̂_ML ≜ arg max_θ Σ_{t=1}^{T} log ∫ h_θ(y_t | x_t) p_θ(x_t | y_{1:t-1}) dx_t.   (4)
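For the special case of a scalar linear Gaussian state-space model, the one-step-ahead predictors in the log-likelihood decomposition (3) are Gaussian and available in closed form from a Kalman filter, so ℓ_T(θ) can be evaluated exactly; in the general nonlinear setting of (4) the integral must instead be approximated, e.g., by the SMC methods below. The model and numbers in this sketch are our own illustration, not from this entry.

```python
import math, random

def loglik(theta, y, q=0.1, r=0.1):
    """l_T(theta) = sum_t log p_theta(y_t | y_{1:t-1}) for the model
    x_{t+1} = theta*x_t + v_t, y_t = x_t + e_t, v_t ~ N(0, q),
    e_t ~ N(0, r), x_1 ~ N(0, 1), via the Kalman filter predictors."""
    m, P = 0.0, 1.0                        # predictive mean/variance of x_t
    ll = 0.0
    for yt in y:
        S = P + r                          # variance of y_t given y_{1:t-1}
        ll += -0.5 * (math.log(2 * math.pi * S) + (yt - m) ** 2 / S)
        K = P / S                          # Kalman gain; measurement update
        m, P = m + K * (yt - m), (1 - K) * P
        m, P = theta * m, theta ** 2 * P + q   # time update to x_{t+1}
    return ll

# simulate data with theta = 0.8 and compare two candidate parameters
rng = random.Random(1)
x, y = 0.0, []
for _ in range(500):
    y.append(x + rng.gauss(0.0, math.sqrt(0.1)))
    x = 0.8 * x + rng.gauss(0.0, math.sqrt(0.1))
print(loglik(0.8, y) > loglik(0.2, y))   # the true parameter scores higher
```

Maximizing ℓ_T(θ) over θ is exactly the optimization problem (4); for nonlinear or non-Gaussian models the same quantity has to be estimated, for instance with the particle filter.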
Importance Sampling

Let z be a random variable distributed according to some complicated density π(z) and let φ(·) be some function of interest. Importance sampling offers a systematic way of evaluating integrals of the form

E_π[φ(z)] = ∫ φ(z) π(z) dz,   (5)

without requiring samples directly generated from π(z). The density π(z) is referred to as the target density, i.e., the density we are trying to sample from. The importance sampler relies on a proposal density q(z), from which it is simple to generate samples; let z^i ~ q(z), i = 1, ..., N. Since each sample z^i is drawn from the proposal density rather than from the target density π(z), we must somehow account for this discrepancy. The so-called importance weights w̃^i = π(z^i)/q(z^i) encode the difference. By normalizing the weights, w^i = w̃^i / Σ_{j=1}^{N} w̃^j, we obtain a set of weighted samples {z^i, w^i}_{i=1}^{N} that can be used to approximately evaluate the integral (5), resulting in E_π[φ(z)] ≈ Σ_{i=1}^{N} w^i φ(z^i). Schön and Lindsten (2014) provide an introduction to importance sampling within a dynamical systems setting, whereas Robert and Casella (2004) provide a general treatment.

Particle Filter

The solution to the nonlinear filtering problem is provided by the following two recursive equations:

p(x_t | y_{1:t}) = h(y_t | x_t) p(x_t | y_{1:t-1}) / p(y_t | y_{1:t-1}),   (6a)

p(x_t | y_{1:t-1}) = ∫ f(x_t | x_{t-1}) p(x_{t-1} | y_{1:t-1}) dx_{t-1}.   (6b)

These recursions cannot in general be solved in closed form; the particle filter instead maintains an empirical approximation of the form

p̂^N(x_{t-1} | y_{1:t-1}) = Σ_{i=1}^{N} w^i_{t-1} δ_{x^i_{t-1}}(x_{t-1}),   (7)

where δ_{x^i_{t-1}}(x_{t-1}) denotes the Dirac delta mass located at x^i_{t-1}. Furthermore, w^i_{t-1} and x^i_{t-1} are referred to as the weights and the particles, respectively. We will now derive the particle filter by designing an importance sampler allowing us to approximately solve (6). The derivation is performed in an inductive fashion, starting by assuming that p(x_{t-1} | y_{1:t-1}) is approximated by (7). Inserting (7) into (6b) results in p̂^N(x_t | y_{1:t-1}) = Σ_{i=1}^{N} w^i_{t-1} f(x_t | x^i_{t-1}), which is used in (6a) to compute an approximation of the filtering density p(x_t | y_{1:t}) up to proportionality. Hence, this allows us to target p(x_t | y_{1:t}) using an importance sampler, where the form of p̂^N(x_t | y_{1:t-1}) suggests that new samples can be proposed according to

x^i_t ~ q(x_t | y_{1:t}) = Σ_{i=1}^{N} w^i_{t-1} f(x_t | x^i_{t-1}).   (8)

It is worth noting that we can obtain a more general algorithm by replacing f(x_t | x^i_{t-1}) in the above mixture with a density q(x_t | x^i_{t-1}, y_t). However, in the interest of a simple, but still highly useful algorithm, we keep (8). The proposal density (8) is a weighted mixture consisting of N components, which means that we can generate a sample x̃^i_t from it via a two-step procedure: first we select which component to sample from, and secondly we generate a sample from that component. More precisely, the first
part amounts to selecting one of the N particles {x^i_{t-1}}_{i=1}^N according to

P(x̃_{t-1} = x^i_{t-1} | {x^j_{t-1}, w^j_{t-1}}_{j=1}^N) = w^i_{t-1},

where the selected particle is denoted as x̃_{t-1}. By repeating this N times, we obtain a set of equally weighted particles {x̃^i_{t-1}}_{i=1}^N, constituting an empirical approximation of p(x_{t-1} | y_{1:t-1}), analogously to (7). We can then draw x^i_t ∼ f(x_t | x̃^i_{t-1}) to generate a realization from the proposal (8). This procedure, which turns a weighted set of samples into an unweighted one, is commonly referred to as resampling.

Finally, using the approximation p̂^N(x_t | y_{1:t-1}) in (6a) and the proposal density according to (8) allows us to compute the weights as w̃^i_t = h(y_t | x^i_t). Once all the N weights are computed and normalized, we obtain a collection of weighted particles {x^i_t, w^i_t}_{i=1}^N targeting the filtering density at time t. We have now (in a slightly nonstandard fashion) derived the so-called bootstrap particle filter, summarized in Algorithm 1, which was the first particle filter, introduced by Gordon et al. (1993) two decades ago.

Algorithm 1 Bootstrap particle filter (for i = 1, …, N)
1. Initialization (t = 1):
   (a) Sample x^i_1 ∼ μ(x_1).
   (b) Compute the importance weights w̃^i_1 = h(y_1 | x^i_1) and normalize: w^i_1 = w̃^i_1 / Σ_{j=1}^N w̃^j_1.
2. For t = 2 to T do:
   (a) Resample {x^i_{t-1}, w^i_{t-1}}, resulting in equally weighted particles {x̃^i_{t-1}, 1/N}.
   (b) Sample x^i_t ∼ f(x_t | x̃^i_{t-1}).
   (c) Compute the importance weights w̃^i_t = h(y_t | x^i_t) and normalize: w^i_t = w̃^i_t / Σ_{j=1}^N w̃^j_t.

Since the introduction of Algorithm 1, the surrounding theory and practice have undergone significant developments; see, e.g., Doucet and Johansen (2011) for an up-to-date survey. The weights {w^i_{1:T}}_{i=1}^N and the particles {x^i_{1:T}}_{i=1}^N are random variables, and in executing the algorithm we generate one realization of them. This is a useful insight both when it comes to understanding and when it comes to the analysis of particle filters. There is by now a fairly good understanding of the convergence properties of the particle filter; see, e.g., Doucet and Johansen (2011) for basic results and further pointers into the literature.

Particle Smoother
A particle smoother is an SMC method targeting the joint smoothing density p(x_{1:T} | y_{1:T}) (or one of its marginals). There are several different strategies for deriving particle smoothers. Rather than mentioning them all, we introduce one powerful and increasingly popular strategy based on backward simulation, giving rise to the family of forward filtering/backward simulation (FFBSi) samplers.

In an FFBSi sampler, the joint smoothing density p(x_{1:T} | y_{1:T}) is targeted by complementing a forward particle filter with a second recursion evolving in the time-reversed direction. The following factorization of the joint smoothing density,

p(x_{1:T} | y_{1:T}) = ( ∏_{t=1}^{T-1} p(x_t | x_{t+1}, y_{1:t}) ) p(x_T | y_{1:T}),

immediately suggests a highly useful time-reversed recursion. Start by generating a sample x̃_T ∼ p(x_T | y_{1:T}). We then continue generating samples backward in time by sampling from the so-called backward kernel p(x_t | x_{t+1}, y_{1:t}), according to x̃_t ∼ p(x_t | x̃_{t+1}, y_{1:t}), for t = T-1, …, 1. The resulting sample x̃_{1:T} ≜ (x̃_1, …, x̃_T) is then by construction a sample from the joint smoothing density. Hence, by performing M backward simulations, we obtain the following approximation of the joint smoothing density:

p̂^M(x_{1:T} | y_{1:T}) = (1/M) Σ_{i=1}^M δ_{x̃^i_{1:T}}(x_{1:T}).   (9)

For details on how to design algorithms implementing the backward simulation strategy,
derivations, properties, and references, we refer to the recent survey on backward simulation methods by Lindsten and Schön (2013).

Bayesian Solutions

Strategies
The posterior density (2) is analytically intractable, but we can make use of Markov chain Monte Carlo (MCMC) samplers to address the inference problem. An MCMC sampler allows us to approximately generate samples from an arbitrary target density π(z). This is done by simulating a Markov chain (i.e., a Markov process) {z[r]}_{r≥1}, which is constructed in such a way that the stationary distribution of the chain is given by π(z). The sample paths {z[r]}_{r=1}^R of the chain can then be used to draw inference about the target distribution. Two constructive ways of finding a suitable Markov chain to simulate are provided by the Metropolis-Hastings (MH) and the Gibbs samplers, where the latter can be interpreted as a special case of the former. See, e.g., Robert and Casella (2004) for details on MCMC. A Gibbs sampler targeting p(θ, x_{1:T} | y_{1:T}) is given by
(i) Draw θ′ ∼ p(θ | x_{1:T}, y_{1:T}).
(ii) Draw x′_{1:T} ∼ p(x_{1:T} | θ′, y_{1:T}).
The second step is hard, since it requires us to generate a sample from the joint smoothing density. Simply replacing step (ii) with a backward simulator does not result in a valid method (Andrieu et al. 2010).

One interesting solution is provided by the family of particle MCMC (PMCMC) samplers, first introduced by Andrieu et al. (2010). PMCMC provides a systematic way of combining SMC and MCMC, where SMC is used to construct the proposal density for the MCMC sampler. The so-called particle Gibbs (PG) sampler resolves the problems briefly mentioned above by a nontrivial modification of the SMC algorithm. Introducing the PG sampler lies outside the scope of this work; we refer the reader to the groundbreaking work by Andrieu et al. (2010). During the past 3 years, the PG samplers have developed quite a lot, and improved versions are surveyed and explained by Lindsten and Schön (2013).

A Nontrivial Example
To place PMCMC in the context of nonlinear system identification, we will now solve a nontrivial identification problem. The PG sampler is used to compute the posterior density for a general Wiener model (a linear Gaussian system followed by a static nonlinearity) (Giri and Bai 2010):

x_{t+1} = [A B] (x_t; u_t) + v_t,   v_t ∼ N(0, Q),   (10a)
z_t = C x_t,   (10b)
y_t = g(z_t) + e_t,   e_t ∼ N(0, r).   (10c)

Based on observed inputs u_{1:T} and outputs y_{1:T}, we wish to identify the model (10). We place a matrix normal inverse Wishart (MNIW) prior on {(A, B), Q}, an inverse gamma prior on r, and a Gaussian process prior (Rasmussen and Williams 2006) on the function g, resulting in a semiparametric model. We can without loss of generality fix the matrix C according to C = (1, 0, …, 0). For a complete model specification, we refer to Lindsten et al. (2013).

The posterior distribution p(θ, x_{1:T} | y_{1:T}) is computed using a newly developed PG sampler referred to as particle Gibbs with ancestor sampling (PGAS); see Lindsten and Schön (2013). In the present experiment we make use of T = 1,000 observations. The dimension of the state space is 6, the linear dynamics contains complex poles resulting in oscillations, as seen in Fig. 1, and the nonlinearity is non-monotonic; see Fig. 2. A subspace method is used to find an initial guess for the linear system, and the static nonlinearity is initialized using a linear function (i.e., a straight line).
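The two-step Gibbs structure of steps (i) and (ii) can be illustrated on a toy conjugate model in which both conditionals are Gaussian and can be sampled exactly. This is only a sketch of the Gibbs principle, not of the PG/PGAS sampler used above; the model, priors, and all numerical values are assumptions chosen for illustration.

```python
import random

random.seed(0)

# Toy conjugate model (an assumption for illustration): theta ~ N(0, 10),
# a single latent "state" x ~ N(theta, 1), and observations y_i ~ N(x, 1).
y = [2.0 + random.gauss(0.0, 1.0) for _ in range(50)]
n, ybar = len(y), sum(y) / len(y)

theta, x = 0.0, 0.0
theta_draws = []
for r in range(3000):
    # Step (i): draw theta ~ p(theta | x): N(0,10) prior combined with N(x; theta, 1)
    prec_t = 1.0 / 10.0 + 1.0
    theta = random.gauss(x / prec_t, prec_t ** -0.5)
    # Step (ii): draw x ~ p(x | theta, y): N(theta,1) prior combined with n observations
    prec_x = 1.0 + n
    x = random.gauss((theta + n * ybar) / prec_x, prec_x ** -0.5)
    if r >= 500:  # discard burn-in
        theta_draws.append(theta)

post_mean = sum(theta_draws) / len(theta_draws)
print(round(post_mean, 2))
```

In the nonlinear SSM setting of this article, step (ii) is exactly the hard step: x is an entire trajectory x_{1:T}, the conditional is the joint smoothing density, and PMCMC replaces the exact conditional draw with a particle-based one.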
[Fig. 1: Bode diagram (magnitude in dB and phase in deg vs. frequency in rad/s) of the sixth-order linear system. The black curve is the true system, the red curve is the estimated posterior mean of the Bode diagram, and the shaded area is the 99% Bayesian credibility interval.]

[Fig. 2: The black curve is the true static nonlinearity (non-monotonic), the red curve is the estimated posterior mean of the static nonlinearity, and the shaded area is the 99% Bayesian credibility interval.]

Of the PGAS iterations, the first 10,000 are discarded. The result is visualized in Figs. 1 and 2, where we plot the Bode diagram for the linear system and the static nonlinearity, respectively. In both figures we also provide the 99% Bayesian credibility interval. MATLAB code for Bayesian identification of Wiener models is available from user.it.uu.se/~thosc112/research/software.html.

The resonance peaks are accurately modeled, but the result is less accurate at low frequencies (likely due to a lack of excitation). The fact that the posterior mean is inaccurate at low frequencies is encoded in our estimate of the posterior distribution, as shown by the credibility intervals.

It is worth pausing for a moment to reflect upon the posterior distribution p(θ, x_{1:T} | y_{1:T}) that we are computing. The unknown "parameters" live in the space Θ = R^64 × F, where F is an appropriate function space. The states x_{1:T} live in the space R^{6·1,000}. Hence, p(θ, x_{1:T} | y_{1:T}) is actually a rather complicated object for this example. It would be most interesting if we
can come up with ways in which we could visualize the uncertainty inherent in general nonlinear dynamical systems.

Maximum Likelihood Solutions

Identifying the parameters in a general nonlinear SSM using maximum likelihood amounts to solving the optimization problem (3). This is a challenging problem for several reasons; for example, it requires the computation of the predictor density p_θ(y_t | y_{1:t-1}). Furthermore, its gradient (possibly also its Hessian) is very useful in setting up an efficient optimization algorithm. There are no closed-form solutions available for these objects, forcing us to rely on approximations. The SMC methods briefly introduced in section "Sequential Monte Carlo" provide rather natural tools for this task, since they are capable of producing approximations whose accuracy is only limited by the available computational resources.

To establish a clear interface between the maximum likelihood problem (3) and the SMC methods, it has proven natural to make use of the expectation maximization (EM) algorithm (Dempster et al. 1977). The EM algorithm proceeds in an iterative fashion to compute ML estimates of unknown parameters in probabilistic models involving latent variables. The strategy underlying the EM algorithm is to exploit the structure inherent in the probabilistic model to separate the original problem into two closely linked problems. The first problem amounts to computing the so-called intermediate quantity

Q(θ, θ′) ≜ ∫ log p_θ(x_{1:T}, y_{1:T}) p_{θ′}(x_{1:T} | y_{1:T}) dx_{1:T} = E_{θ′}[log p_θ(x_{1:T}, y_{1:T}) | y_{1:T}],   (11)

where we have already made use of the fact that the latent variables in an SSM are given by the states. Furthermore, θ′ denotes a particular value for the parameters θ. We can show that by choosing a new θ such that Q(θ, θ′) ≥ Q(θ′, θ′), the likelihood is either increased or left unchanged, i.e., ℓ_T(θ) ≥ ℓ_T(θ′). The EM algorithm now suggests itself in that we can generate a sequence of iterates {θ_k}_{k≥1} that guarantees that the log-likelihood is not decreased for increasing k by alternating the following two steps: (1) (expectation) compute the intermediate quantity Q(θ, θ_k) and (2) (maximization) compute the subsequent iterate θ_{k+1} by maximizing Q(θ, θ_k) w.r.t. θ. This procedure is then repeated until convergence, guaranteeing convergence to a stationary point on the likelihood surface.

The FFBSi particle smoother offers an approximation of the joint smoothing density p_{θ′}(x_{1:T} | y_{1:T}) according to (9), which, inserted into (11), provides an approximate solution Q̂^M(θ, θ′) to the expectation step. In solving the maximization step, we typically want gradients of the intermediate quantity, ∇_θ Q̂^M(θ, θ′). These can also be approximated using (9). The above development is summarized in Algorithm 2, providing a solution where the basic building blocks are themselves complex algorithms: an SMC algorithm for the E step and a nonlinear optimization algorithm for the M step. This means that we have the option of replacing the FFBSi particle smoother in step 2a with any other algorithm capable of producing estimates of the joint smoothing density. The family of PMCMC methods introduced in section "Bayesian Solutions" contains several highly interesting alternatives. A detailed account of Algorithm 2 is provided by Schön et al. (2011); see also Cappé et al. (2005).

Algorithm 2 EM for nonlinear system identification
1. Initialization: Set k = 0 and initialize θ_k.
2. Expectation (E) step:
   (a) Compute an approximation p̂^M_{θ_k}(x_{1:T} | y_{1:T}), for example, using an FFBSi sampler.
   (b) Calculate Q̂^M(θ, θ_k).
3. Maximization (M) step: Compute
   θ_{k+1} = arg max_θ Q̂^M(θ, θ_k).
4. Check termination condition. If satisfied, terminate; otherwise, update k → k + 1 and return to step 2.
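The bootstrap particle filter of Algorithm 1 also yields an unbiased estimate of the likelihood, which gives a simpler stand-in for the full EM machinery: score a grid of parameter values by the filter's log-likelihood estimate and pick the best one. The scalar model x_{t+1} = θx_t + v_t, y_t = x_t + e_t with unit-variance Gaussian noise, and all numerical values below, are assumptions chosen for illustration.

```python
import math
import random

def simulate(theta, T, rng):
    """Simulate y_1:T from the assumed scalar SSM."""
    x = rng.gauss(0.0, 1.0)
    ys = []
    for _ in range(T):
        ys.append(x + rng.gauss(0.0, 1.0))   # y_t = x_t + e_t
        x = theta * x + rng.gauss(0.0, 1.0)  # x_{t+1} = theta*x_t + v_t
    return ys

def log_lik_pf(theta, ys, N, rng):
    """Bootstrap particle filter (Algorithm 1) returning the estimate of
    sum_t log p_theta(y_t | y_{1:t-1})."""
    parts = [rng.gauss(0.0, 1.0) for _ in range(N)]  # x_1^i ~ mu(x_1)
    ll = 0.0
    for y in ys:
        # importance weights w~^i = h(y_t | x_t^i), here a unit Gaussian density
        w = [math.exp(-0.5 * (y - x) ** 2) for x in parts]
        ll += math.log(sum(w) / N) - 0.5 * math.log(2 * math.pi)
        # multinomial resampling, then propagation through f(x_t | x~_{t-1})
        parts = rng.choices(parts, weights=w, k=N)
        parts = [theta * x + rng.gauss(0.0, 1.0) for x in parts]
    return ll

ys = simulate(0.8, 150, random.Random(1))  # data from true theta = 0.8
grid = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
# Reusing the same seed for every theta (common random numbers) reduces the
# Monte Carlo variability of the comparison across the grid.
scores = {th: log_lik_pf(th, ys, 500, random.Random(2)) for th in grid}
best = max(scores, key=scores.get)
print(best)
```

The grid search is only meant to expose the likelihood estimator; Algorithm 2 replaces it with smoothing-based E and M steps, which scale far better in the number of parameters.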
Finally, we mention Fisher's identity, opening up yet another avenue for designing ML estimators using SMC approximations. Even if we are not interested in using EM when solving the nonlinear system identification problem, the intermediate quantity (11) is useful. The reason is provided via Fisher's identity,

∇_θ ℓ_T(θ)|_{θ=θ′} = ∇_θ Q(θ, θ′)|_{θ=θ′} = ∫ ∇_θ log p_θ(x_{1:T}, y_{1:T})|_{θ=θ′} p_{θ′}(x_{1:T} | y_{1:T}) dx_{1:T},   (12)

which provides a means to compute approximations of the log-likelihood gradient. Hessian approximations are also available, but these are more involved. Hence, Fisher's identity opens up for direct use of any off-the-shelf gradient-based optimization method in solving (4).

Online Solutions

Online (also referred to as recursive or adaptive) identification refers to the problem where the parameter estimate is updated based on the parameter estimate at the previous time step and the new measurement. This is used when we are dealing with big data sets and in real-time situations. SMC offers interesting opportunities when it comes to deriving online solutions for nonlinear state-space models. The most direct idea is simply to make use of a gradient method,

θ_t = θ_{t-1} + γ_t ∇_θ log p_θ(y_t | y_{1:t-1}),

where {γ_t} is the sequence of step sizes. Fisher's identity (12) opens up for the use of SMC in approximating ∇_θ log p_θ(y_t | y_{1:t-1}). However, this leads to a rapidly increasing variance, something that can be dealt with by the so-called "marginal" Fisher identity; see Poyiadjis et al. (2011) for details.

An interesting alternative is provided by an online EM algorithm; see, e.g., Cappé (2011) for a solid introduction. The online EM approaches rely on the additive properties of the Q-function. The area of online solutions via SMC is likely to grow in the future, as there is a clear need motivated by constantly growing data sets, and there are also clear theoretical opportunities.

Summary and Future Directions

We have discussed how SMC samplers can be used to solve nonlinear system identification problems, by sketching both Bayesian and ML solutions. A common feature of the resulting algorithms is that they are (nontrivial) combinations of more basic algorithms. We have, for example, seen the combined use of a particle smoother and a nonlinear optimization solver in Algorithm 2 to compute ML estimates. As another example, we have the class of PMCMC methods, where the basic building blocks are provided by SMC samplers and MCMC samplers. The use of SMC and MCMC methods for nonlinear system identification has only just started to take off, and it presents very interesting future prospects. Some directions for future research are as follows: (1) The family of PMCMC algorithms is rich and fast growing, with great potential for further developments. For example, its use in solving the state smoothing problem (i.e., computing p(x_{1:T} | y_{1:T})) is likely to provide better algorithms in the near future. (2) Related to this is the potential to design new particle smoothers capable of generating new particles also in the time-reversed direction. (3) There are open and highly relevant challenges when it comes to designing backward simulators for Bayesian nonparametric methods (Hjort et al. 2010). A key question here is how to represent the backward kernel p(x_t | x_{t+1}, y_{1:t}) in such nonparametric settings. (4) The use of Bayesian nonparametric models will open up interesting possibilities for hybrid system identification, since they allow us to systematically express and work with uncertainties over segmentations.

Cross-References
▸ Nonlinear System Identification: An Overview of Common Approaches
▸ System Identification: An Overview
890 Nonlinear System Identification: An Overview of Common Approaches
The main remaining task of gray-box system identification is to estimate the parameter vector θ from the data set Z^N. The identification criterion is typically defined with the aid of an output predictor derived from the system model. A natural output predictor is simply based on the numerical solution of the state equation (1): for some given value of θ, initial state x(0), and some assumed inter-sample behavior of the input u(t) (e.g., with a zero-order hold), the trajectory of x(t), denoted by x̂(t|θ), is computed with a numerical ODE solver; then the output prediction is computed as

ŷ(t|θ) = h(x̂(t|θ), u(t); θ).   (4)

The parameter vector θ is typically estimated by minimizing the sum of squared prediction errors ε(t|θ) = y(t) − ŷ(t|θ). See ▸System Identification: An Overview and Bohlin (2006) for more details.

The predictor based on the numerical solution of the state equation (1) (known as a simulator) may be in trouble if this equation with the given value of θ is unstable. Moreover, the state equation (1) may also be subject to some modeling error that should be taken into account in the output predictor. In such cases, the output predictor can be made with the aid of some nonlinear state observer (Gauthier and Kupka 2001) or some nonlinear filtering algorithm (Doucet and Johansen 2011).

Alternatively, sequential Monte Carlo (SMC) methods can also be applied to the identification of (small size) nonlinear state-space systems, typically assuming a discrete-time counterpart of the model described by Eqs. (1) and (2). See ▸Nonlinear System Identification Using Particle Filters.

The gray-box approach is particularly useful in an engineering field when some software library of commonly used components is available. In this case, a system model can be built by connecting available component models. Nevertheless, the "connection" of the component models may introduce algebraic constraints through variables shared by connected components, leading to differential algebraic equations (DAE), which are a wider class of dynamic system models than the abovementioned state-space models (▸Modeling of Dynamic Systems from First Principles). For most dynamic systems, it is possible to avoid the DAE formulation by causality analysis, so that the connections between different system components are treated as information flow instead of algebraic constraints. There exist also some theoretic studies on DAE system identifiability (Ljung and Glad 1994) and some recent developments on the identification of such systems (Gerdin et al. 2007).

Combined Physical and Black-Box Models
It may happen that, in a complex system, part of the components are well described by physical laws (possibly with available models from a software library), but some other components are not well studied. In this case, the latter components can be dealt with by black-box models (or possibly empirical models). The entire model can be fitted to a collected data set Z^N, as in the case of the previous subsection.

Block-Oriented Models
Complex systems, notably those studied in engineering, are often made of a certain number of components; thus a system model can be built by connecting component models. In this sense, such component-based models could be said to be "block-oriented." In the system identification literature, the term block-oriented model is often used in a particular context (Giri and Bai 2010), where it is typically assumed that each component is either a linear dynamic subsystem or a nonlinear static one. Here, the term "static" means that the behavior of the component is memoryless and can be described by an algebraic equation. The study of system identification with such models is motivated by the fact that, when a controlled system is stabilized around a working point, its dynamic behavior can be well described by a linear model, but its actuators and sensors may exhibit significant nonlinear behaviors like saturation or dead zone. The choice of a particular block-oriented model structure depends on the prior knowledge about the underlying system.
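The simulation-based output predictor (4) and the minimization of the squared prediction errors can be sketched for a hypothetical first-order gray-box model x′(t) = −θx(t) + u(t), y = x, with known initial state x(0) = 0, using a forward-Euler ODE solver and a simple grid search; the model and all numerical values are assumptions chosen for illustration.

```python
def predict(theta, u, dt, x0=0.0):
    """Output predictor y_hat(t | theta): forward-Euler solution of the state
    equation x'(t) = -theta*x(t) + u(t) under a zero-order hold on u."""
    x, out = x0, []
    for uk in u:
        x = x + dt * (-theta * x + uk)  # one Euler step
        out.append(x)
    return out

def sse(theta, u, y, dt):
    """Sum of squared prediction errors eps(t | theta) = y(t) - y_hat(t | theta)."""
    yh = predict(theta, u, dt)
    return sum((a - b) ** 2 for a, b in zip(y, yh))

dt = 0.05
u = [1.0 if k < 100 else 0.0 for k in range(200)]  # step input, then release
y = predict(1.5, u, dt)  # noise-free "measured" data; true theta = 1.5

# Estimate theta by minimizing the identification criterion over a grid.
cands = [0.1 * k for k in range(1, 51)]
theta_hat = min(cands, key=lambda th: sse(th, u, y, dt))
print(theta_hat)
```

In practice, the grid search would be replaced by a numerical optimizer, and the forward-Euler simulator by a proper ODE solver, but the structure (simulate for a candidate θ, score by the prediction-error criterion, minimize) is the same.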
Hammerstein System Identification
A SISO Hammerstein system is typically formulated as

x(t) = f(u(t)),   (5a)
y(t) + a_1 y(t-1) + ⋯ + a_{na} y(t-na) = b_1 x(t-1) + ⋯ + b_{nb} x(t-nb) + v(t).   (5b)

If the nonlinearity f(·) is expressed in the form

f(u) = Σ_{l=1}^m η_l β_l(u)   (6)

with some chosen basis functions β_l(·), then the identification problem amounts to fitting the model parameters η_l, a_i, b_j to a collected data set Z^N. A well-known method is based on over-parametrization (Bai 1998): replace in (5b) each x(t-j) with the right-hand side of (6) and treat each parameter product b_j η_l as an individual parameter; then the newly parametrized model is equivalent to a linear ARX model (▸System Identification: An Overview), which can be estimated by linear least squares.

Wiener System Identification
A SISO Wiener system is typically formulated as

z(t) = Σ_{k=1}^∞ h_k u(t-k),   (7a)
y(t) = g(z(t)) + v(t),   (7b)

where the sequence h_1, h_2, … is the impulse response of the linear subsystem, g(·) is some nonlinear function, and v(t) is a noise independent of the input u(t).

Some methods for Wiener system identification assume a finite impulse response (FIR) of the linear subsystem. In this case, the linear subsystem model is characterized by the vector collecting the FIR coefficients, h^T = [h_1, h_2, …, h_n]. There are two typical kinds of efficient solutions, assuming either a Gaussian distribution of the input u(t) (Greblicki 1992) or the monotonicity of the nonlinear function g(·) (Bai and Reyland Jr 2008). In both cases, it is possible to directly estimate the FIR coefficients h from the input-output data Z^N, without explicitly estimating the unknown nonlinear function g(·). The estimated h can be used to compute the internal variable z(t). It then becomes relatively easy to estimate the nonlinear function g(·) from the computed z(t) and the measured y(t).
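The over-parametrization method of Bai (1998) can be sketched on a small simulated Hammerstein system; the particular system, the polynomial basis functions, and the noise-free setting are assumptions chosen so that the least-squares estimate is easy to check.

```python
import random

random.seed(3)

# Hypothetical Hammerstein system: static nonlinearity x(t) = u(t) + 0.3*u(t)^2
# (i.e., eta = [1, 0.3] with basis functions beta_1(u) = u, beta_2(u) = u^2)
# followed by the linear dynamics y(t) = 0.5*y(t-1) + x(t-1), noise-free.
T = 300
u = [random.uniform(-1.0, 1.0) for _ in range(T)]
y = [0.0] * T
for t in range(1, T):
    x_prev = u[t - 1] + 0.3 * u[t - 1] ** 2
    y[t] = 0.5 * y[t - 1] + x_prev

# Over-parametrization: regress y(t) on [y(t-1), u(t-1), u(t-1)^2], treating
# each product b_1*eta_l as an individual linear parameter.
rows = [[y[t - 1], u[t - 1], u[t - 1] ** 2] for t in range(1, T)]
rhs = y[1:]

# Solve the 3x3 normal equations (A^T A) p = A^T y by Gaussian elimination.
n = 3
ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
aty = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(n)]
for col in range(n):
    piv = max(range(col, n), key=lambda r: abs(ata[r][col]))  # partial pivoting
    ata[col], ata[piv] = ata[piv], ata[col]
    aty[col], aty[piv] = aty[piv], aty[col]
    for r in range(col + 1, n):
        f = ata[r][col] / ata[col][col]
        for c in range(col, n):
            ata[r][c] -= f * ata[col][c]
        aty[r] -= f * aty[col]
p = [0.0] * n
for r in range(n - 1, -1, -1):
    s = sum(ata[r][c] * p[c] for c in range(r + 1, n))
    p[r] = (aty[r] - s) / ata[r][r]

print([round(v, 3) for v in p])  # approximately [0.5, 1.0, 0.3]
```

With noisy data, the recovered products b_j*eta_l are then typically projected back onto the rank-one structure implied by (6), e.g., via a singular value decomposition, to obtain the individual b_j and eta_l.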
Other Block-Oriented Model Structures
Among block-oriented models composed of more blocks, the most well-known ones concern the Hammerstein-Wiener system and the Wiener-Hammerstein system. They are both composed of three blocks connected in series; the former has a linear dynamic block preceded and followed by two nonlinear static blocks, and the latter has a nonlinear static block in the middle of two linear dynamic blocks. In general, the prediction error method (PEM) (Ljung 1999) is applied to the identification of such systems, with heuristic methods for the initialization of model parameters. Some recent results on Hammerstein-Wiener system identification have been reported in Wills et al. (2013). There exist also some other variants, with parallel blocks or feedback loops. In most cases, each block is either linear dynamic or nonlinear static, but there is a notable exception: hysteresis blocks. Hysteresis is a phenomenon typically observed in some magnetic or mechanical systems. Its mathematical description is both dynamic and strongly nonlinear and cannot be decomposed into linear dynamic and nonlinear static blocks. Due to the importance of hysteresis components in some control systems, system identification involving such blocks is currently an active research topic (Giri et al. 2008).

LPV Models
Linear parameter-varying (LPV) models could be classified as black-box models, because typically they rely more on experimental data than on prior knowledge. However, engineers often have good insights into such models; they are thus presented in the gray-box section.

From Gain Scheduling to LPV Models
Gain scheduling is a method originally developed for the control of nonlinear systems. It consists in designing different controllers for different working points of a nonlinear system and in switching among the designed controllers according to the actual working point. It is typically assumed that the working point is determined by some observed variable (vector) referred to as the scheduling variable, denoted here by ρ. Around each considered working point, the nonlinear system is linearized so that the corresponding controller can be designed with linear control theory. A by-product of this controller design procedure is a collection of linearized models indexed by the scheduling variable ρ. This collection, seen as a whole model of the globally nonlinear system, is known as an LPV model (Toth 2010). This approach has been particularly successful in the field of flight control.

An LPV model can be formulated either in input-output form or in state-space form. In the input-output form, a SISO model can be written as

y(t) + a_1(ρ) y(t-1) + ⋯ + a_{na}(ρ) y(t-na) = b_1(ρ) u(t-1) + ⋯ + b_{nb}(ρ) u(t-nb) + v(t),   (8)

and in the state-space form as

x(t+1) = A(ρ) x(t) + B(ρ) u(t) + w(t),   (9a)
y(t) = C(ρ) x(t) + D(ρ) u(t) + v(t).   (9b)

As a global model of the whole nonlinear system, the ρ-dependent parameters (matrices) a_i(ρ), b_j(ρ), A(ρ), etc., are functions defined for all ρ ∈ P, where P is the relevant working range of the considered system (a compact subset of a real vector space). If originally the LPV model was built through a collection of linearized models around different working points, then the values of these functions are first defined for the corresponding discrete values of ρ. For other values of ρ ∈ P, these functions can be defined by interpolation. Alternatively, by choosing some parametric forms of a_i(ρ), b_j(ρ), A(ρ), etc., the whole LPV model can also be estimated by fitting it to a data set Z^N, through nonlinear optimization (Toth 2010).

Local Linear Models
In an LPV model, the model parameters can in principle depend on the scheduling variable ρ in
any chosen manner. A particularly useful case is when they are formulated as expansions over local basis functions. For example, in (8), the parameter a_i(ρ) may be expressed as

a_i(ρ) = Σ_{l=1}^m a_{i,l} κ_l(ρ),   (10)

where the κ_l(ρ) are some chosen bell-shaped (local) basis functions, typically Gaussian functions, centered at different positions ρ = c_l ∈ P, and the a_{i,l} are coefficients of the expansion. Similarly,

b_j(ρ) = Σ_{l=1}^m b_{j,l} κ_l(ρ).   (11)

Assume that the basis functions are normalized such that

Σ_{l=1}^m κ_l(ρ) = 1   (12)

for all ρ ∈ P. Then the LPV model (8) can be viewed as an interpolation of m "local" models

y(t) + a_{1,l} y(t-1) + ⋯ + a_{na,l} y(t-na) = b_{1,l} u(t-1) + ⋯ + b_{nb,l} u(t-nb) + v(t),   (13)

indexed by l = 1, 2, …, m. Each of these linear models is valid for ρ close to c_l, the center of the corresponding basis function κ_l(ρ); hence, they are called local linear models.

If the local basis functions κ_l(ρ) are viewed as membership functions of fuzzy sets, then the local linear model is strongly related to the Takagi-Sugeno fuzzy model (Takagi and Sugeno 1985). An advantage of this point of view is the possibility of incorporating prior knowledge in the form of linguistic rules and of interpreting some local linear models resulting from system identification.

There are two approaches to building local linear models. The first one is the local approach: for each chosen value of c_l ∈ P, a local model is estimated from data corresponding to the values of ρ within a neighborhood of c_l. This approach has the advantages of being computationally efficient, easily updatable, and well understood by engineers. The second one is the global approach: all the model parameters are estimated simultaneously by solving a single optimization problem for the whole model. This approach can produce more accurate models in terms of prediction error, but it is numerically much more expensive and may lead to models that are difficult for engineers to interpret.

The practical success of local linear models strongly depends on the possibility of finding a scheduling variable ρ of small dimension that relevantly determines the working point of the considered system. If there exists a nonlinear state-space model of the system, then in principle the working point is determined jointly by the state and the input of the system. As physically meaningful state variables are quite often not fully observed, they cannot be used in the definition of ρ. It is possible to define ρ as delayed output and input variables, e.g.,

ρ^T = [y(t), …, y(t-na), u(t-1), …, u(t-nb)],

but this typically leads to a vector ρ of quite large dimension. It is thus important to use practical insights about a given system to find a relevant vector ρ of reduced dimension.

For a single-dimensional ρ, the choice of the local basis function centers c_l can be made following some practical insight, or they can be equally spaced within P. For a large-dimensional ρ, this task is more difficult. The equally spaced approach would lead to too many local models, as their number would increase exponentially with the dimension of ρ. In this case, an empirical approach, called the local linear model tree (LOLIMOT) (Nelles 2001), can be applied. It iteratively partitions P in order to place the local basis functions where the system is more likely nonlinear or where the available data are more concentrated.

Black-Box Models

Ideally speaking, a black-box model should be solely built from experimental data, without any prior knowledge. In practice, some prior knowledge is always necessary, though experimental
896 Nonlinear System Identification: An Overview of Common Approaches
data play a much more important role. For in- more efficient than FIR models, in the sense of
stance, the choice of the input and output vari- requiring fewer model parameters. By analogy,
ables, implying some causality relationship, is an the nonlinear ARX model takes the form
important prior knowledge.
With the fast development of electronic de- yO .t/ D f .y .t 1/ ; : : : ; y .t na /;
vices, more and more sensor signals are available
u .t 1/ ; : : : :u .t nb // : (15)
in various fields, notably for engineering, envi-
ronmental, and biomedical systems. Meanwhile,
the processing power of modern computers in- This is likely the most frequently used black-
creases every year. Black-box modeling has thus box model structure for nonlinear dynamic sys-
more and more potential applications. Neverthe- tem identification (Sjöberg et al. 1995; Juditsky
less, the importance of prior knowledge in a et al. 1995).
modeling procedure should not be forgotten. In
general prior knowledge leads to more reliable Nonlinear Function Estimators
models in terms of validity range, as the validity For a nonlinear ARX model in the form of (15),
of physical equations is often well understood. In the nonlinear function f ./ has to be estimated
contrast, for a black-box model essentially based from an available input-output data set Z N . Typ-
on experimental data, it may be hard to ensure ically, an estimator of f ./ with some chosen
its validity for interpolation and even harder for parametric structure is used. Let
extrapolation.
T .t/ D Œy .t 1/ ; : : : ; y .t na /;
typically chosen as a (Gaussian) bell-shaped function in radial basis networks or a sigmoid (S-shaped) function in multiple-layer neural networks.

Another approach to nonlinear function estimation is called nonparametric estimation. Its main idea is to estimate f(φ*) for any given value of φ* by the (weighted) average of the values of y(t) in the available data set corresponding to values of φ(t) close to φ*. This category includes kernel estimators (Nadaraya 1964) and memory-based estimators (Specht 1991).

The nonlinear function estimation problem as formulated in (17) can also be addressed with the Gaussian process model. Assume that g in (17) is a Gaussian process whose covariance matrix for any regressor pair is a known function of the regressor pair; then the posterior distribution of g given observations of y(t) can be computed by applying Bayes' theorem under certain assumptions (Rasmussen and Williams 2006). This method is strongly related to least squares support vector machines (Suykens et al. 2002) and is to some extent similar to kernel estimators.

The difficulty of estimating a nonlinear function f(φ) strongly depends on the dimension of φ. In the single-dimensional case, most existing methods can produce satisfactory results. When the dimension of φ, say n, increases, the number of data points must increase exponentially with n in order to keep the data "density" unchanged. This fact implies that, in the high-dimensional case (say n > 10), for most practically available data sets, the data points are sparse in the space of φ. It is thus practically impossible to estimate f(φ) with good accuracy everywhere in the space of φ. To remedy this problem, prior knowledge can be used to form a more elaborate vector φ of reduced dimension, instead of the simple form of past input and output variables. The resulting model will be more of gray-box nature. If this approach is not possible, one has to expect that the estimation algorithm automatically discovers some low-dimensional nature of the nonlinear relationship being estimated. The success would depend on the suitability of the chosen particular nonlinear function estimator for the considered system.

State-Space Black-Box Models
For a gray-box model in the form of (1) and (2), it is assumed that the parametric forms of the nonlinear functions f(·) and h(·) are known from prior knowledge. If no such knowledge is available, it is possible to estimate these nonlinear functions with some function estimator, like those introduced in the previous subsection. Such an approach leads to state-space black-box models. In practice, it is easier to use the discrete-time counterpart of the state equation (1). Because the state vector x(t) is typically not directly observed, the estimation of f(·) and h(·) cannot be formulated as nonlinear regression problems, in contrast to the case of input-output black-box models. Another difficulty is related to the nonuniqueness of the state-space representation of a given system: any (linear or nonlinear) state transformation would lead to a different state-space representation of the same system. In some existing methods, a linear state-space model is first estimated; then nonlinear function estimators are used to compensate the residuals of f(·) and h(·) after their linear approximations (Paduart et al. 2010).

Multiple-Input Multiple-Output Systems

For multiple-input multiple-output (MIMO) systems, state-space models like (1)–(2) remain in the same form, by considering vector values of the notations u(t) and y(t) at each time instant, up to some similar adaptation of the other involved notations. For input-output models like (15), the involved notations can also be vector valued, but the fact that different inputs and/or outputs can have different delays makes the notations more complicated. For block-oriented models, though a MIMO linear block is usually described by a general linear model in state-space form or in input-output form, there is no consensus on the structural choice of MIMO nonlinear blocks.
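The kernel (Nadaraya–Watson) estimator mentioned above — f(φ*) as a weighted average of the outputs y(t) whose regressors φ(t) lie close to φ* — can be sketched as follows. The quadratic test function, the Gaussian kernel, and the bandwidth h are illustrative assumptions, not choices made in this entry:

```python
import numpy as np

def kernel_estimate(phi_star, Phi, y, h=0.2):
    """Nadaraya-Watson estimate of f(phi*): a weighted average of the
    observed outputs y(t), with Gaussian-kernel weights based on the
    distance between each regressor phi(t) and the query point phi*."""
    d2 = np.sum((Phi - phi_star) ** 2, axis=1)
    w = np.exp(-0.5 * d2 / h ** 2)
    return np.sum(w * y) / np.sum(w)

# Illustrative data set: y(t) = f(phi(t)) with f(phi) = phi1**2 + phi2
rng = np.random.default_rng(0)
Phi = rng.uniform(-1, 1, size=(2000, 2))
y = Phi[:, 0] ** 2 + Phi[:, 1]

phi_star = np.array([0.5, -0.2])
f_hat = kernel_estimate(phi_star, Phi, y)
print(f_hat)  # approaches f(phi*) = 0.05 for small h and dense data
```

Shrinking h reduces the smoothing bias but increases the variance of the average — the same accuracy/data-density trade-off discussed above, which worsens quickly as the dimension of φ grows.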
898 Nonlinear System Identification: An Overview of Common Approaches
relevant classes of nonlinear systems. As a matter of fact, a nonlinear system in which the zero dynamics possess a globally asymptotically stable equilibrium can be robustly stabilized, globally or at least with a guaranteed region of attraction, by means of output feedback. This is a nonlinear analogue of a well-known property of linear systems, namely, the property that an n-dimensional linear system having n − 1 zeros with negative real part can be stabilized by means of proportional output feedback, if the feedback gain is sufficiently large. The concept of zero dynamics also plays a relevant role in a variety of other problems of feedback design, such as input-output linearization with internal stability, noninteracting control with internal stability, output regulation, and feedback equivalence to passive systems.

The Zero Dynamics

One of the cornerstones of the geometric theory of control systems (for linear as well as for nonlinear systems) is the analysis of how the observability property can be influenced by feedback. This study, originally conceived in the context of the problem of disturbance decoupling, had far-reaching consequences in a number of other domains. One of these consequences is the possibility of characterizing in "geometric terms" the notion of zero of the transfer function of a system. In a (single-input single-output and minimal) linear system, a complex number z is a zero of the transfer function if and only if the input u(t) = exp(zt) yields – for a suitable choice of the initial state – a forced response in which the output is identically zero. This "open-loop" and "time-domain" characterization has a "closed-loop" and "geometric" counterpart: all such z's coincide with the eigenvalues of the unobservable part of the system, once the latter has been rendered maximally unobservable by means of feedback. One of the earlier successes of the geometric approach to the analysis and design of nonlinear systems was the possibility of extending these equivalent characterizations to the domain of nonlinear systems.

To see how this is possible, consider for simplicity the case of a system modeled by equations of the form

ẋ = f(x) + g(x)u
y = h(x)

with state x ∈ ℝⁿ, input u ∈ ℝ, output y ∈ ℝ, and in which f(x), g(x), h(x) are smooth functions. Systems of this form are called input-affine systems. The analysis of such systems is rendered particularly simple if appropriate notations are used. Given any real-valued smooth function λ(x) and any n-vector-valued smooth function X(x), let L_X λ(x) denote the (directional) derivative of λ(x) along X(x), that is, the real-valued smooth function

L_X λ(x) = Σ_{i=1}^{n} X_i(x) ∂λ(x)/∂x_i,

and, recursively, set L_X^d λ = L_X L_X^{d−1} λ(x) for any d ≥ 1.

Suppose there exists an integer r ≥ 1 with the following properties:

L_g h(x) = L_g L_f h(x) = ⋯ = L_g L_f^{r−2} h(x) = 0   ∀x ∈ ℝⁿ
L_g L_f^{r−1} h(x) ≠ 0   ∀x ∈ ℝⁿ.

If this is the case, it is possible to show that the set

Z* = {x ∈ ℝⁿ : h(x) = L_f h(x) = ⋯ = L_f^{r−1} h(x) = 0}

is a smooth submanifold of ℝⁿ, of codimension r. It is also easy to show that the state-feedback law

u*(x) = − L_f^r h(x) / (L_g L_f^{r−1} h(x))

renders the vector

f*(x) = f(x) + g(x)u*(x)
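The Lie-derivative tests above can be sketched numerically. The toy input-affine system, the test point, and the finite-difference implementation of L_X λ below are all illustrative assumptions:

```python
import numpy as np

# Illustrative input-affine system (not from the entry):
#   x1' = x2,  x2' = -x1 - x2 + u,  y = x1   =>  relative degree r = 2.
f = lambda x: np.array([x[1], -x[0] - x[1]])
g = lambda x: np.array([0.0, 1.0])
h = lambda x: x[0]

def lie(phi, X, x, eps=1e-6):
    """Directional (Lie) derivative L_X phi(x) = sum_i X_i(x) dphi/dx_i,
    with the partial derivatives computed by central differences."""
    grad = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x)); e[i] = eps
        grad[i] = (phi(x + e) - phi(x - e)) / (2 * eps)
    return float(np.dot(X(x), grad))

x0 = np.array([0.7, -1.2])
Lgh = lie(h, g, x0)                    # L_g h = 0, so r > 1
Lfh = lambda x: lie(h, f, x)
LgLfh = lie(Lfh, g, x0)                # L_g L_f h = 1 != 0, so r = 2

# Feedback u*(x) = -L_f^2 h(x) / (L_g L_f h(x)) from the formula above
Lf2h = lie(Lfh, f, x0)
u_star = -Lf2h / LgLfh
```

Here Z* = {x : x₁ = x₂ = 0} has codimension r = 2, and u* cancels the output's second derivative at the test point.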
Nonlinear Zero Dynamics 901
The coordinate-free construction presented above becomes even more transparent if special coordinates are chosen. To this end, set

g̃(x) = (1 / (L_g L_f^{r−1} h(x))) g(x)

and define, recursively,

X₀(x) = g̃(x),   X_k(x) = [f(x), X_{k−1}(x)],

for 1 ≤ k ≤ r − 1, in which [Y(x), X(x)] denotes the Lie bracket of Y(x) and X(x). It is possible to show that if the vector fields X₀(x), …, X_{r−1}(x) are complete, there exists a smooth nonlinear, globally defined change of variables by means of which the system can be transformed into a system of the form

The latter provide a simple characterization of the zero dynamics of the system, once the latter has been brought to its normal form.

It is worth observing that, in the case of a linear system, the functions f₀(z, ξ) and q₀(z, ξ) are linear functions, and b(z, ξ) is a constant. Consequently, the normal form can be written as

ż = Fz + Gξ₁
ξ̇ = A_r ξ + B_r [Hz + Kξ + bu]
y = C_r ξ.

It is also easy to check that the transfer function of the system can be expressed as

T(s) = b · det(sI − F) / det(sI − A),
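The formula T(s) = b·det(sI − F)/det(sI − A) — whose consequence is that the zeros of the transfer function are the eigenvalues of F — can be checked numerically on a normal-form realization. All numerical values below are illustrative assumptions:

```python
import numpy as np

# Normal form of a linear SISO system with r = 2 and a one-dimensional
# zero dynamics: z' = F z + G xi1, xi1' = xi2, xi2' = H z + K xi + b u,
# y = xi1 (all numerical values are illustrative).
F, G, H = -3.0, 2.0, 1.0
K1, K2, b = -1.0, -2.0, 1.0

A = np.array([[F,   G,   0.0],
              [0.0, 0.0, 1.0],
              [H,   K1,  K2]])
B = np.array([[0.0], [0.0], [b]])
C = np.array([[0.0, 1.0, 0.0]])

# Compare T(s) = C (sI - A)^{-1} B with b * det(sI - F) / det(sI - A)
max_err = max(
    abs((C @ np.linalg.inv(s * np.eye(3) - A) @ B)[0, 0]
        - b * (s - F) / np.linalg.det(s * np.eye(3) - A))
    for s in [1.0 + 0.5j, -0.7 + 2.0j, 3.0]
)
print(max_err)  # ~ 0: the transfer-function zero sits at s = F = -3
```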
Thus, it is concluded that the unforced internal dynamics of the inverse system coincide with the zero dynamics as defined above.

It should be stressed, though, that the coincidence is limited to the case of single-input single-output systems. For multi-input multi-output nonlinear systems, the link between zero dynamics and the dynamics of the inverse system is more subtle. This is essentially due to the fact that while the concept of zero dynamics only seeks to determine the dynamics compatible with

whose input-output behavior (between input v and output y) is fully linear (and stable if K_r is chosen so that the matrix A_r + B_r K_r is Hurwitz). In fact, the law in question renders the system partially unobservable, with all nonlinearities confined to its unobservable part (Isidori et al. 1981). This control law is clearly non-robust, as it relies upon exact cancelation of possibly uncertain terms, but it can be rendered robust by means of appropriate dynamic compensation (Freidovich and Khalil 2008).
The system obtained in this way has the structure of a cascade of two sub-systems, one of which, modeled as

ż = f₀(z, ξ),

is seen as "driven" by the input ξ. This motivates the interest in classifying the asymptotic properties of such a subsystem, as discussed below.

Asymptotic Properties of the Zero Dynamics

Linear systems with no zeros in the right-half complex plane are traditionally called minimum-phase systems, in view of certain properties of the Bode gain and phase plots of their transfer functions. Thus, in view of the interpretation given above, linear systems whose zero dynamics are asymptotically stable are minimum-phase systems. This terminology has been (somewhat abusively, but with the clear intent of providing a concise and expressive characterization) borrowed to classify nonlinear systems whose zero dynamics have desirable (from the stability viewpoint) properties. Assuming that z = 0 is an equilibrium of ż = f₀(z, 0), the following cases are considered:
• A nonlinear system is locally minimum-phase (respectively, locally exponentially minimum-phase) if the equilibrium z = 0 of ż = f₀(z, 0) is locally asymptotically (respectively, locally exponentially) stable (Byrnes and Isidori 1984).
• A nonlinear system is globally minimum-phase if the equilibrium z = 0 of ż = f₀(z, 0) is globally asymptotically stable (Byrnes and Isidori 1991).
• A nonlinear system is strongly minimum-phase if the system ż = f₀(z, ξ), viewed as a system with input ξ and state z, is input-to-state stable (Liberzon 2002).

According to the well-known criterion of Sontag (1995) for input-to-state stability, a system is strongly minimum phase if and only if there exists a positive definite and proper smooth real-valued function V(z), class K_∞ functions α̲(·), ᾱ(·), α(·), and a class K function γ(·) satisfying

α̲(|z|) ≤ V(z) ≤ ᾱ(|z|)   ∀z
(∂V/∂z) f₀(z, ξ) ≤ −α(|z|)   ∀(z, ξ) such that |z| ≥ γ(|ξ|).

As a special case, it is seen that a system is globally minimum phase if and only if there exists a function V(z), bounded as above, such that

(∂V/∂z) f₀(z, 0) ≤ −α(|z|)   ∀z.

If, instead, the weaker inequality

(∂V/∂z) f₀(z, 0) ≤ 0   ∀z

holds, the system is said to be globally weakly minimum-phase. The criterion summarized above is of paramount importance in the design of feedback laws for the purpose of stabilizing nonlinear systems that are globally (or strongly) minimum phase, as will be seen below.

Zero Dynamics and Stabilization

The first and foremost immediate implication of the properties described above is the fact that the feedback law

u = (1 / b(z, ξ)) [−q₀(z, ξ) + K_r ξ],

if K_r is chosen so that the matrix A_r + B_r K_r is Hurwitz, globally asymptotically stabilizes the equilibrium (z, ξ) = (0, 0) of a strongly minimum-phase system. In fact, as observed, the corresponding closed-loop system can be seen as an asymptotically stable (linear) system driving an input-to-state stable (nonlinear) system. As already observed, this control mode is non-robust (as it relies upon exact cancelations) and requires the availability of the full state (z, ξ) of the controlled system.
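The stabilizing law u = (1/b(z, ξ))[−q₀(z, ξ) + K_r ξ] can be illustrated on a toy strongly minimum-phase system in normal form with r = 1. The particular f₀, q₀, b, gain, and initial condition below are illustrative assumptions:

```python
# Illustrative system in normal form (r = 1):
#   z'  = f0(z, xi) = -z + xi          (ISS zero dynamics)
#   xi' = q0(z, xi) + b u,  with q0 = z**2 + xi and constant b = 2
f0 = lambda z, xi: -z + xi
q0 = lambda z, xi: z ** 2 + xi
b = 2.0
Kr = -1.0                      # A_r + B_r*Kr = -1 is Hurwitz (r = 1)

z, xi, dt = 2.0, -3.0, 0.01
for _ in range(1500):          # 15 s of forward-Euler integration
    u = (1.0 / b) * (-q0(z, xi) + Kr * xi)   # cancel q0, assign xi' = Kr*xi
    z, xi = z + dt * f0(z, xi), xi + dt * (q0(z, xi) + b * u)
print(abs(z), abs(xi))  # both states decay toward the origin
```

The closed loop is exactly the cascade described above: the linear subsystem ξ̇ = K_r ξ drives the ISS zero dynamics ż = f₀(z, ξ), so (z, ξ) → (0, 0).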
Both these deficiencies can, to some extent, be fixed by means of appropriate techniques, which will be briefly reviewed below.

If the requirement of global stability is replaced by the (weaker) requirement of stability with a guaranteed region of attraction, then the desired control goal can be achieved by means of a much simpler law, depending only on the partial state ξ and not requiring cancelations. Stability with a guaranteed region of attraction essentially means that a given equilibrium is rendered asymptotically stable, with a region of attraction that contains an a priori fixed compact set. In this context, the most relevant results can be summarized as follows.

Assume the system possesses a globally defined normal form and, without loss of generality, let b(z, ξ) > 0. Let the system be controlled by a "partial state" feedback of the form

control in question by means of a dynamic control law that only depends on the output y, following a design paradigm originally proposed by H. Khalil. In fact, if the system is strongly minimum phase and also locally exponentially minimum phase and if q₀(0, 0) = 0, asymptotic stability with a guaranteed region of attraction can be achieved by means of a dynamic feedback law of the form (Khalil and Esfandiari 1993)

ξ̂̇₁ = ξ̂₂ + κ c_{r−1}(y − ξ̂₁)
ξ̂̇₂ = ξ̂₃ + κ² c_{r−2}(y − ξ̂₁)
⋮
ξ̂̇_{r−1} = ξ̂_r + κ^{r−1} c₁(y − ξ̂₁)
ξ̂̇_r = κ^r c₀(y − ξ̂₁)
u = L(k K_r ξ̂),
and e is a set of regulated variables. The exogenous variables are thought of as generated by an autonomous system

ẇ = s(w)

known as the exosystem. The problem is to design a (possibly dynamic) controller

ẋ_c = f_c(x_c, e)
u = h_c(x_c, e)

driven by the regulated variable e, such that in the resulting closed-loop system all trajectories are ultimately bounded and lim_{t→∞} e(t) = 0. The problem in question has been the object of intensive research in the past years. In what follows, we limit ourselves to highlighting the role of the concept of zero dynamics in this problem.

Assume that the set W where the exosystem evolves is compact and invariant, and suppose a controller exists that solves the problem of output regulation. Then, the associated closed loop has a steady-state locus (see Isidori and Byrnes 2008), the graph of a possibly set-valued map defined on W. Suppose the map in question is single-valued, which means that for each given exogenous input function w(t), there exists a unique steady-state response, expressed as x(t) = π(w(t)) and x_c(t) = π_c(w(t)). If, in addition, π(w) and π_c(w) are continuously differentiable, it is readily seen that

L_s π(w) = f(w, π(w), ψ(w))
0 = h(w, π(w))
L_s π_c(w) = f_c(π_c(w), 0)
ψ(w) = h_c(π_c(w), 0)
∀w ∈ W.

the controlled plant. The second two equations, on the other hand, interpret the ability of the controller to generate the feedforward input necessary to keep e(t) = 0 in steady state. This is a nonlinear version of the well-known internal model principle of Francis and Wonham (1975).

Passivity

Consider a nonlinear input-affine system having the same number m of inputs and outputs, and recall that this system is said to be passive if there exists a continuous nonnegative real-valued function W(x), with W(0) = 0, that satisfies

W(x(t)) − W(x(0)) ≤ ∫₀ᵗ yᵀ(s) u(s) ds

along trajectories. The function W(x) is the so-called storage function of the system.

It is well known that the notion of passivity plays an important role in system analysis and that the theory of passive systems leads to powerful methodologies for the design of feedback laws for nonlinear systems. In this context, the question of whether a given, non-passive, nonlinear system could be rendered passive by means of state feedback is indeed relevant. It turns out that this possibility can be simply expressed as a property of the zero dynamics of the system.

Suppose that L_g h(x) is nonsingular and set g̃(x) = g(x)[L_g h(x)]⁻¹. If the m columns of g̃(x) are complete and commuting vector fields, there exists a globally defined change of coordinates that brings the system into normal form
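The defining dissipation inequality of passivity can be verified numerically along a simulated trajectory. The scalar system, the input signal, and the storage function below are illustrative assumptions:

```python
import numpy as np

# Illustrative passive system: x' = -x**3 + u, y = x, with storage
# W(x) = x**2 / 2, since dW/dt = -x**4 + y*u <= y*u along trajectories.
dt, x = 1e-3, 1.0
W0, supplied = 0.5 * x ** 2, 0.0
for k in range(20000):                 # 20 s with input u(t) = sin(t)
    u = np.sin(k * dt)
    y = x
    supplied += y * u * dt             # accumulate the supplied energy
    x += dt * (-x ** 3 + u)            # forward-Euler step
stored = 0.5 * x ** 2 - W0             # W(x(T)) - W(x(0))
print(stored, supplied)                # stored <= supplied (dissipation)
```

The slack between the two quantities is the energy dissipated by the −x⁴ term; a lossless system would achieve equality.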
Hirschorn RM (1979) Invertibility for multivariable nonlinear control systems. IEEE Trans Autom Control AC-24:855–865
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Isidori A, Byrnes CI (1990) Output regulation of nonlinear systems. IEEE Trans Autom Control AC-35:131–140
Isidori A, Byrnes CI (2008) Steady-state behaviors in nonlinear systems, with an application to robust disturbance rejection. Ann Rev Control 32:1–16
Isidori A, Moog C (1988) On the nonlinear equivalent of the notion of transmission zeros. In: Byrnes CI, Kurzhanski A (eds) Modelling and adaptive control. Lecture notes in control and information sciences, vol 105. Springer, Berlin/New York, pp 445–471
Isidori A, Krener AJ, Gori-Giorgi C, Monaco S (1981) Nonlinear decoupling via feedback: a differential geometric approach. IEEE Trans Autom Control AC-26:331–345
Khalil HK, Esfandiari F (1993) Semiglobal stabilization of a class of nonlinear systems using output feedback. IEEE Trans Autom Control AC-38:1412–1415
Liberzon D, Morse AS, Sontag ED (2002) Output-input stability and minimum-phase nonlinear systems. IEEE Trans Autom Control AC-47:422–436
Seron MM, Braslavsky JH, Kokotovic PV, Mayne DQ (1999) Feedback limitations in nonlinear systems: from Bode integrals to cheap control. IEEE Trans Autom Control AC-44:829–833
Singh SN (1981) A modified algorithm for invertibility in nonlinear systems. IEEE Trans Autom Control AC-26:595–598
Silverman LM (1969) Inversion of multivariable linear systems. IEEE Trans Autom Control AC-14:270–276
Sontag ED (1995) On the input-to-state stability property. Eur J Control 1:24–36
Wonham WM (1979) Linear multivariable control: a geometric approach. Springer, New York

Nonparametric Techniques in System Identification

parametric methods, these techniques require no detailed structural information to get insight into the dynamic behavior of complex systems. Therefore, nonparametric methods are used in system identification to get an initial idea of the model complexity and for model validation purposes (e.g., detection of unmodeled dynamics). Their drawback is the increased variability compared with the parametric estimates. Although the main focus of this entry is on the classical identification framework (estimation of dynamical systems operating in open loop from known input, noisy output observations), the reader will also learn more about (i) the connection between transient and leakage errors, (ii) the estimation of dynamical systems operating in closed loop, (iii) the estimation in the presence of input noise, and (iv) the influence of nonlinear distortions on the linear framework. All results are valid for discrete- and continuous-time systems. The entry concludes with some user choices and practical guidelines for setting up a system identification experiment and choosing an appropriate estimation method.

Keywords
Best linear approximation; Correlation method; Empirical transfer function estimate; Errors-in-variables; Feedback; Frequency response function; Gaussian process regression; Impulse transient response modeling method; Local polynomial method; Local rational method; Noise (co)variances; Noise power spectrum; Spectral analysis

Nonparametric Techniques in System Identification, Fig. 1 Classical identification framework: discrete- or continuous-time plant operating in open loop; known input u(t), noisy output y(t) observations; and v(t) filtered discrete-time or band-limited continuous-time white noise e(t) that is independent of u(t). y₀(t) denotes the true output of the plant. In the continuous-time case, it is assumed that the unobserved driving noise source e(t) has finite variance and constant (white) power spectrum within the acquisition bandwidth.

to detect and quantify the nonlinear distortions in the FRF estimate. As such, without estimating a parametric model, the users can easily decide whether or not the linear framework is accurate enough for their particular application.

The estimation of the nonparametric models typically starts from sampled input-output signals u(nT_s) and y(nT_s), n = 0, 1, …, N − 1, that are transformed to the frequency domain via the discrete Fourier transform (DFT)

X(k) = (1/√N) Σ_{n=0}^{N−1} x(nT_s) e^{−j2πkn/N}   (1)
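Eq. (1) is a DFT with unitary 1/√N scaling; it can be checked against a standard FFT routine. The test signal and the values of N and T_s below are illustrative:

```python
import numpy as np

N, Ts = 256, 0.01
t = np.arange(N) * Ts
x = np.sin(2 * np.pi * 5 / (N * Ts) * t)   # exactly 5 periods in the record

# Eq. (1): X(k) = N**-0.5 * sum_n x(n*Ts) * exp(-j*2*pi*k*n/N)
n = np.arange(N)
X = np.array([np.sum(x * np.exp(-2j * np.pi * k * n / N))
              for k in range(N)]) / np.sqrt(N)

# Same convention as a scaled FFT; all signal energy sits in bin k = 5
print(np.argmax(np.abs(X[:N // 2])))
```

Because the sinusoid completes an integer number of periods, its energy is confined to a single DFT bin; the leakage problem discussed next arises precisely when this is not the case.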
In (1), T_s is the sampling period, x = u or y, and X = U or Y. One of the main difficulties in estimating an FRF and noise power spectrum is the leakage error in the DFT spectrum X(k) = DFT(x(t)) (1). It is due to the finite duration NT_s of the experiment, and it increases the mean square error of the nonparametric estimates. Therefore, all methods try to suppress the leakage error as much as possible.

This entry starts with a detailed analysis of the leakage problem (section "The Leakage Problem"), followed by an overview of standard and advanced nonparametric time-domain (section "Nonparametric Time-Domain Techniques") and frequency-domain (section "Nonparametric Frequency-Domain Techniques") techniques. First, it is assumed that the system operates in open loop (see Fig. 1) and that known input, noisy output observations are available (sections "Nonparametric Time-Domain Techniques" and "Nonparametric Frequency-Domain Techniques"). Next, section "Extensions" extends the results to systems operating in closed loop (section "Systems Operating in Feedback"); to noisy input, noisy output observations (section "Noisy Input, Noisy Output Observations"); and to nonlinear systems (section "Nonlinear Systems"). Finally, some user choices are discussed (section "User Choices") and some practical guidelines are given (section "Guidelines"). Unless otherwise stated, the input u(t) and the disturbing noise v(t) are assumed to be statistically uncorrelated.

The Leakage Problem

For arbitrary excitations u(t), the relationship between the true input U(k) and true output Y₀(k) DFT spectra (1) of a linear dynamic system is given by

Y₀(k) = G(Ω_k) U(k) + T_G(Ω_k)   (2)

where Ω_k = jω_k or exp(jω_k T_s) for, respectively, continuous- and discrete-time systems; ω_k = 2πk/(NT_s); G(Ω_k) is the plant frequency response function; and T_G(Ω_k) is the leakage error due to the plant dynamics (Pintelon and Schoukens 2012, Section 6.3.2). The leakage error T_G(Ω) is a smooth function of the frequency that decreases to zero as O(N^{−1/2}) for N increasing to infinity. It depends on the difference between the initial and final conditions of the experiment and has exactly the same poles as the plant transfer function.
Therefore, the time-domain response of T_G(Ω) decays exponentially to zero as a transient error. From this short discussion, it can be concluded that the leakage error in the frequency domain is equivalent to the transient error in the time domain. The only difference is that the former depends on the difference between the initial and final conditions, while the latter solely depends on the initial conditions.

Standard spectral analysis methods (see section "Spectral Analysis Method") suppress the leakage term T_G(Ω_k) in (2) by multiplying the time-domain signals with a window w(t) before taking the DFT (1)

(X(k))_W = (1/(√N w_rms)) Σ_{n=0}^{N−1} w(nT_s) x(nT_s) e^{−j2πkn/N}   (3)

with w_rms = (Σ_{n=0}^{N−1} |w(nT_s)|² / N)^{1/2} the root mean square (rms) value of the window w(t). The scaling in (3) is such that the transformation preserves the rms value of the signal. The relationship between the DFT spectra (U(k))_W and (Y₀(k))_W of the windowed input-output signals w(t)u(t) and w(t)y₀(t) is given by

(Y₀(k))_W = G(Ω_k)(U(k))_W + E_int(k) + E_leak(k)   (4)

where E_int(k) and E_leak(k) are, respectively, the interpolation error and the remaining leakage error

E_int(k) = (G(Ω_k)U(k))_W − G(Ω_k)(U(k))_W   (5)
E_leak(k) = (T_G(Ω_k))_W   (6)

Note that E_int(k) = 0 if G(Ω_k) is constant within the bandwidth of W(k), while the interpolation error is large if the FRF varies significantly within the window bandwidth. To keep E_int(k) small, the frequency resolution 1/(NT_s) should be sufficiently large and the window bandwidth should be small enough. On the other hand, a larger window bandwidth is beneficial for reducing the leakage error E_leak(k). Hence, choosing an appropriate window for nonparametric FRF and noise power spectrum estimation is making a trade-off between the reduction of the leakage error E_leak(k) and the increase of the interpolation error E_int(k) (Schoukens et al. 2006).

Note that exactly the same analysis can be made for the continuous- or discrete-time dynamics of the disturbing output noise v(t) in Fig. 1

V(k) = H(Ω_k) E(k) + T_H(Ω_k)   (7)

with H(Ω_k) the noise frequency response function, E(k) the DFT of the unobserved driving discrete-time or band-limited continuous-time white noise source e(t) (Pintelon and Schoukens 2012, Section 6.7.3), and T_H(Ω_k) the noise leakage (transient) term. The noise leakage term is often neglected but can be important for lightly damped systems (e.g., in modal analysis). Most nonparametric techniques suppress the sum of the plant and noise leakage errors T_G(Ω_k) + T_H(Ω_k).

If an integer number of periods of the steady-state response to a periodic excitation is measured, then the plant leakage error T_G(Ω_k) in (2) is zero, which simplifies the estimation problem significantly. Therefore, for the frequency-domain techniques, a distinction is made between periodic and nonperiodic excitations. Note, however, that the noise leakage (transient) term T_H(Ω_k) in (7) remains different from zero.

Nonparametric Time-Domain Techniques

The time-domain methods estimate the impulse response of the plant via the time-domain relationship that the true output y₀(t) equals the convolution product of the impulse response g(t) and the true input u(t). For discrete-time systems, it takes the form

y₀(t) = Σ_{n=0}^{∞} g(n) u(t − n)   (8)

In practice, only a finite number of impulse response coefficients g(t) can be estimated from the N available input-output samples.
Therefore, (8) is approximated by a finite sum

y₀(t) ≈ Σ_{n=0}^{L} g(n) u(t − n)   (9)

where L ≤ N − 1 should also be determined from the data. From (9), it can be seen that the response depends on the past input values u(−1), u(−2), …, u(−L). Since these values are unknown, an exponentially decaying transient error is present in the first L samples of the predicted output (9). This transient error is the time-domain equivalent of the leakage error T_G(Ω_k) in (2). To remove the transient error, the first L output samples can be discarded in the predicted output (9). This reduces the amount of data from N to N − L and, hence, increases the mean square error of the estimates. If it is known that the transfer function has no direct term, then g(0) = 0, and the sum (9) starts from n = 1.

Correlation Methods
Correlation methods have been studied intensively since the end of the 1950s (see Eykhoff 1974) and are nowadays still used in telecommunication channel estimation and equalization. The impulse response coefficients are found by minimizing the sum of the squared differences between the observed output samples and the output samples predicted by (9)

Σ_{t=L}^{N−1} (y(t) − Σ_{n=0}^{L} g(n) u(t − n))²   (10)

w.r.t. g(m), m = 0, 1, …, L. The solution of this linear least squares problem is given by the famous Wiener-Hopf equation

R̂_yu(m) = Σ_{n=0}^{L} g(n) R̂_uu(m − n)   (11)

for m = 0, 1, …, L, where R̂_yu and R̂_uu are estimates of, respectively, the cross- and autocorrelation functions R_yu(τ) = E{y(t)u(t − τ)} and R_uu(τ) = E{u(t)u(t − τ)}

R̂_yu(m) = (1/(N − L)) Σ_{t=L}^{N−1} y(t) u(t − m)   (12)

R̂_uu(m − n) = (1/(N − L)) Σ_{t=L}^{N−1} u(t − n) u(t − m)   (13)

(Godfrey 1993, Chapter 1; Ljung 1999, Chapter 6). Since the number of estimated impulse response coefficients L can grow with the amount of data N, the correlation method (11) is classified as being nonparametric. If the input is white noise, then the expected value of R̂_uu(m) is proportional to the Kronecker delta δ(m), and the cross-correlation R̂_yu(m) (11) is – within a scaling factor – a good approximation of the impulse response. This property is used in blind channel estimation.

Gaussian Process Regression
The linear least squares (10) solution can be (very) sensitive to disturbing output noise if L is not much smaller than N. This problem is circumvented by the Gaussian process regression approach. The key idea consists in modeling the impulse response coefficients g(n) as a zero-mean Gaussian process with a certain covariance structure P_L that depends on a few hyper-parameters (Pillonetto et al. 2011). In Chen et al. (2012), it has been shown that Gaussian process regression is equivalent to the following regularized (see also System Identification Techniques: Convexification, Regularization, and Relaxation) linear least squares problem

Σ_{t=L}^{N−1} (y(t) − Σ_{n=0}^{L} g(n) u(t − n))² + σ² gᵀ P_L⁻¹ g   (14)

where g = (g(0), g(1), …, g(L))ᵀ and with σ² the variance of the output disturbance. The hyper-parameters defining P_L and the noise variance σ² are estimated via an empirical Bayes method.
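A minimal sketch of the plain least squares (10) versus the regularized least squares (14) estimate of a finite impulse response. The simulated plant, the exponentially decaying kernel used for P_L, and the fixed hyper-parameters are illustrative assumptions — in practice the hyper-parameters are tuned via empirical Bayes, as stated above:

```python
import numpy as np

rng = np.random.default_rng(1)
N, L = 400, 60
g_true = 0.8 ** np.arange(L + 1)          # illustrative impulse response

u = rng.standard_normal(N)
y = np.convolve(u, g_true)[:N] + 0.1 * rng.standard_normal(N)

# Regressor matrix of (9)-(10): row t holds u(t), u(t-1), ..., u(t-L)
Phi = np.array([u[t - L:t + 1][::-1] for t in range(L, N)])
Y = y[L:]                                 # first L samples discarded

# Plain least squares (10)
g_ls = np.linalg.lstsq(Phi, Y, rcond=None)[0]

# Regularized least squares (14) with prior covariance P_L[i,j] = c**max(i,j)
k = np.arange(L + 1)
P = 0.85 ** np.maximum.outer(k, k)        # illustrative decaying kernel
sigma2 = 0.1 ** 2                         # assumed known noise variance
g_reg = np.linalg.solve(Phi.T @ Phi + sigma2 * np.linalg.inv(P), Phi.T @ Y)

err_ls = np.linalg.norm(g_ls - g_true)
err_reg = np.linalg.norm(g_reg - g_true)
print(err_ls, err_reg)
```

The prior shrinks the (noisy) tail coefficients toward zero, which typically matches or improves on the plain least squares error, especially when L is not much smaller than N.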
var(Ĝ(Ω_k)) = σ_V²(k) / |U(k)|²   (16)

where |U(k)| is the magnitude of U(k).

Applying (15) to random excitations gives the empirical transfer function estimate (Ljung 1999, Section 6.3). Due to the presence of the plant leakage error T_G(Ω_k)/U(k), the statistical properties of (15) for random inputs are quite different from those for periodic inputs. While the empirical transfer function estimate (ETFE) is unbiased and has finite variance (16) for periodic inputs, it is biased and has infinite variance for random inputs (Broersen 2004). To improve the statistical properties of the ETFE for random inputs, one can either approximate the ETFE locally by a polynomial (Stenman et al. 2000) or perform a weighted average of ETFEs over

R̂_yu(τ) = (1/N) Σ_{t=τ}^{N−1} y(t) u(t − τ)   (18)

Ŝ_yu(k) = (1/√N) Σ_{τ=0}^{N−1} w(τ) R̂_yu(τ) e^{−j2πkτ/N}   (19)

resulting in an FRF estimate (17) at the full frequency resolution 1/(NT_s) of the measurement. It can be shown that (19) is a smoothed version of the periodogram Y(k)Ū(k), where Ū is the complex conjugate of U (Brillinger 1981, Chapter 5).

In the Welch approach (Welch 1967; Pintelon and Schoukens 2012, Section 2.6), the N input-output samples are split into M subrecords of N/M samples each.
input and output samples are calculated via (3) more experiments (input-output data records) are
where N is replaced by N=M , giving available. If the number of measured records M
increases to infinity, then (22) converges to the
1 X Œm Œm
M
true value, provided a perfect suppression of the
SOYW UW .k/ D Y .k/ W U .k/ W leakage error.
M mD1
(20) In measurement devices, the quality of the
XM
ˇ Œm ˇ2 spectral analysis estimate (22) is often quantified
1 ˇ U .k/ W ˇ
SOUW UW .k/ D (21) via the coherence 2 .!/
M mD1
ˇ ˇ
The spectral analysis estimate of the FRF and its ˇSyu ./ˇ2
.!/ D
2
(25)
variance are then given by Syy ./Suu ./
O
O k / D SYW UW .k/
G. (22)
which is comprised between 0 and 1. It is related
SOUW UW .k/ to the variance of the spectral analysis estimate as
n o
O k // D 1 .!k / .!k / jG.k /j2
2 2 2
O k // V .k/ E SOU1 U .k/
var.G. (23) var.G.
M W W
(Brillinger 1981, Chapter 8; Heath 2007). Finally,
A coherence smaller than 1 indicates the presence
the output noise variance V2 .k/ in (23) is esti-
of disturbing noise, residual leakage errors, non-
mated as
linear distortions, or a nonobserved input.
0 ˇ ˇ2 1
ˇO ˇ Following the same lines of Welch (1967),
M BO ˇ YW UW
S .k/ ˇ C the statistical properties of the spectral analysis
O V .k/ D
2
@SYW YW .k/ A
M 1 SOUW UW .k/ estimate (22) can be improved via overlapping
subrecords in the cross- and autopower spectra
(24)
estimates (20) and (21). This has been studied in
(Brillinger 1981, Chapter 8; Pintelon and detail for noise power spectra in Carter and Nut-
Schoukens 2012, Section 2.5.4). Due to tall (1980) and for FRFs in Antoni and Schoukens
the spectral width of the window used, the (2007).
estimates (22) and (24) are correlated over
the frequency (the correlation length is about Advanced Methods
twice the spectral width). Note that (21) is used The goal of the advanced methods is to estimate
for estimating noise power spectra (Brillinger the FRF at the full frequency resolution 1=.NT s /
1981, Chapter 5). Note also that for periodic of the experiment duration NT s while suppressing
excitations combined with a rectangular window the influence of the leakage and the noise errors.
w.nT s / D 1, the spectral analysis estimate (22), Without some extra information, it is impossi-
where each subrecord is equal to a signal period, ble to achieve this goal via (2). The additional
simplifies to the ETFE (15). piece of information that allows one to solve the
Compared with the Blackman-Tukey proce- problem is that the FRF and the leakage error are
dure (19), the FRF estimate (22) based on the locally smooth functions of the frequency.
Welch approach (20) and (21) has a frequency The local polynomial method (Pintelon and
resolution and a variance (23) that are M times Schoukens 2012, Chapter 7) approximates the
smaller. In measurement devices, the FRFs are FRF and the leakage error in (2) locally in the
estimated using the Welch approach (20)–(22) frequency band Œk n; k C n by a polynomial.
where each subrecord is an independent measure- From the residuals of the local linear least squares
ment with a fixed number of samples. The reason solution, one also gets an estimate of the output
for this is that the cross- and autopower spectra noise variance V2 and, hence, also of the variance
estimates (20) and (21) can easily be updated as of the FRF. The whole procedure is repeated for
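As a concrete illustration of the averaging in (20)–(22) and the coherence (25), the following minimal Python sketch (not part of the original entry; the FIR plant, noise level, and record count M are arbitrary choices, a rectangular window is used, and each subrecord is an independent record) accumulates cross- and autopower spectra over M records and evaluates the FRF estimate and coherence at one DFT bin:

```python
# Welch-type averaging (20)-(22) and coherence (25) for a simulated FIR plant.
# Illustrative only: plant coefficients, noise level, and M are arbitrary.
import cmath
import random

def dft(x):
    # plain DFT: X(k) = sum_n x[n] exp(-2*pi*j*k*n/N)
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

random.seed(0)
N, M = 16, 200                    # subrecord length and number of records
Syu = [0j] * N
Suu = [0.0] * N
Syy = [0.0] * N
for _ in range(M):
    u = [random.gauss(0.0, 1.0) for _ in range(N)]
    # circular convolution y[n] = 0.5 u[n] + 0.25 u[n-1]  =>  no leakage error
    y = [0.5 * u[n] + 0.25 * u[n - 1] + 0.1 * random.gauss(0.0, 1.0)
         for n in range(N)]
    U, Y = dft(u), dft(y)
    for k in range(N):
        Syu[k] += Y[k] * U[k].conjugate() / M    # cross-power spectrum, Eq. (20)
        Suu[k] += abs(U[k]) ** 2 / M             # input autopower spectrum, Eq. (21)
        Syy[k] += abs(Y[k]) ** 2 / M

k = 3                                            # inspect one DFT bin
G_hat = Syu[k] / Suu[k]                          # spectral analysis estimate, Eq. (22)
coh = abs(Syu[k]) ** 2 / (Syy[k] * Suu[k])       # coherence, Eq. (25)
G_true = 0.5 + 0.25 * cmath.exp(-2j * cmath.pi * k / N)
# G_hat approximates G_true; coh is slightly below 1 due to the output noise
```

Because the input is periodic in each subrecord sense (circular convolution), no leakage is present here, so the coherence deficit is attributable to the additive output noise alone.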
all DFT frequencies \Omega_k in the frequency band of interest. The correlation length of the estimates equals ±2n, which is twice the local bandwidth of the polynomial approximation.

The local rational method (McKelvey and Guérin 2012) follows the same lines as the local polynomial method, except that the FRF and the leakage error in (2) are locally approximated by rational forms with the same poles (G = B/A and T_G = I/A). Due to the common poles, the local rational approximation problem can be transformed into a local linear least squares problem. The method is biased but better suppresses the plant leakage error of lowly damped systems.

The transient impulse response modeling method (Hägg and Hjalmarsson 2012) approximates the FRF and the leakage error by, respectively, finite impulse and transient response models, giving a large sparse global linear least squares problem. From the residuals of the global linear least squares solution, one gets an estimate of the output noise variance \sigma_V^2 and, hence, also of the variance of the FRF. This approach has the best smoothing properties and is recommended in case the noise error is dominant.

Extensions

In sections “Nonparametric Time-Domain Techniques” and “Nonparametric Frequency-Domain Techniques,” it is assumed that the linear plant operates in open loop and that the input is known exactly. If the plant operates in feedback and/or the input observations are noisy, then the presented time- and frequency-domain techniques are biased. In sections “Systems Operating in Feedback” and “Noisy Input, Noisy Output Observations,” it is shown that the estimation bias can be avoided if a known external reference signal is available (typically the signal stored in the arbitrary waveform generator).

Since most real-life systems behave to some extent nonlinearly, it is important to detect and quantify the nonlinear effects in FRF estimates. This issue is handled in section “Nonlinear Systems.”

Systems Operating in Feedback

The key difficulty of estimating the FRF of a plant operating in feedback (see Fig. 2) using nonperiodic excitations is that the true input u(t) is correlated with the process noise v(t). The direct approaches of sections “Nonparametric Time-Domain Techniques” and “Nonparametric Frequency-Domain Techniques” lead to biased estimates (Wellstead 1981). This can easily be seen from the ETFE (15) applied to the feedback setup in Fig. 2

  \hat{G}(\Omega_k) = \frac{G(\Omega_k)\, G_{\mathrm{act}}(\Omega_k)\, R(k) + V(k)}{G_{\mathrm{act}}(\Omega_k)\, R(k) - G_{\mathrm{fb}}(\Omega_k)\, V(k)}    (26)

where G_{\mathrm{act}}(\Omega_k) and G_{\mathrm{fb}}(\Omega_k) are, respectively, the actuator and feedback dynamics. From (26), it follows that in those frequency bands where the process noise V(k) dominates, one rather estimates minus the inverse of the feedback dynamics instead of the plant FRF. On the other hand, at those frequencies where the reference signal injects most power, the ETFE (26) will be close to the plant FRF.
Nonparametric Techniques in System Identification, Fig. 2 Plant operating in closed loop (actuator, plant, and feedback blocks): r(t) is the external reference signal; the known input u(t) depends on the process noise v(t); and y(t) is the noisy output observation
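The bias mechanism in (26), and the remedy offered by a known external reference, can be previewed numerically. The following Python fragment is illustrative only: it reduces the loop of Fig. 2 to a single frequency with real scalar gains (plant G, feedback gain c, unit actuator), so that u = (r − cv)/(1 + cG) and y = (Gr + v)/(1 + cG), and compares the direct estimate S_yu/S_uu with the reference-based indirect estimate S_yr/S_ur used later in the entry:

```python
# Direct vs. indirect FRF estimation in closed loop, reduced to one frequency
# with hypothetical real scalar gains (plant G_true, feedback gain c).
import random

random.seed(1)
G_true, c = 2.0, 0.5
M = 100_000
Syu = Suu = Syr = Sur = 0.0
for _ in range(M):
    r = random.gauss(0.0, 1.0)            # known external reference
    v = random.gauss(0.0, 1.0)            # process noise, independent of r
    u = (r - c * v) / (1.0 + c * G_true)  # closed-loop input, correlated with v
    y = (G_true * r + v) / (1.0 + c * G_true)
    Syu += y * u / M
    Suu += u * u / M
    Syr += y * r / M
    Sur += u * r / M

G_direct = Syu / Suu     # direct (ETFE-style) estimate: biased in closed loop
G_indirect = Syr / Sur   # reference-based indirect estimate: consistent
```

With these (arbitrary) gains the direct estimate settles near (G − c)/(1 + c²) = 1.2 rather than G = 2, while the indirect estimate recovers G; letting the noise power dominate drives the direct estimate toward −1/c, as predicted by (26).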
Nonparametric Techniques in System Identification, Fig. 3 Errors-in-variables framework (actuator and plant blocks): r(t) is the external reference signal; n_g(t) is the generator noise; m_u(t), m_y(t) are the input and output measurement errors; v(t) is the process noise; and u(t), y(t) are the noisy input, noisy output observations
If a known external reference signal is available, then the bias is avoided via the indirect method proposed in Wellstead (1981)

  G(\Omega) = \frac{S_{yr}(\Omega)/S_{rr}(\Omega)}{S_{ur}(\Omega)/S_{rr}(\Omega)} = \frac{S_{yr}(\Omega)}{S_{ur}(\Omega)}.    (27)

The basic idea consists in modeling the feedback setup (see Fig. 2) from the known reference to the input and output simultaneously. This reduces the single-input, single-output closed-loop problem to a single-input, two-output open-loop problem. Since the process noise v(t) is independent of the reference signal r(t), the direct estimate of the single-input, two-output FRF is unbiased. Calculating the ratio of the two FRFs finally gives the indirect estimate (27). This procedure can be applied to any of the direct methods of sections “Nonparametric Time-Domain Techniques” and “Nonparametric Frequency-Domain Techniques.” Proceeding in this way, unstable plants operating in a stabilizing feedback loop can also be handled.

If the excitation is periodic, then the process noise v(t) is independent of the periodic part of the input u(t), and the ETFE (15) converges to the true value as the number of periods P tends to infinity (Pintelon and Schoukens 2012, Section 2.5). Hence, in the periodic case, no external reference is needed.

Noisy Input, Noisy Output Observations

The key difficulty of estimating the FRF of a plant excited by a nonperiodic signal from noisy input, noisy output observations (see Fig. 3) is that the input autopower spectrum in (17) is biased. Indeed, due to the noise on the input, S_{uu}(\Omega) is too large, resulting in too small direct FRF estimates. This is true for all direct FRF approaches in sections “Nonparametric Time-Domain Techniques” and “Nonparametric Frequency-Domain Techniques.” Applying the indirect method of section “Systems Operating in Feedback” removes the bias because the noise on the input is independent of the reference signal (e.g., see (27)). Proceeding in this way, the closed-loop case (see Fig. 2) with noisy input, noisy output observations is also solved by the indirect method.

If the excitation is periodic, then the mean value of the input-output DFT spectra over the P consecutive periods converges to their true values as P tends to infinity (Pintelon and Schoukens 2012, Section 2.5). Hence, the ETFE (15) is still consistent, and no external reference is needed. The same conclusion is valid for systems operating in feedback.

Nonlinear Systems

The classes of nonlinear systems considered are those systems whose steady-state response to a periodic input is periodic with the same period as the input. It excludes phenomena such as chaos and subharmonics but allows for hard nonlinearities such as saturation, dead zones, and clipping.

The classes of excitations considered are stationary random signals with a specified power spectrum and probability density function. An important special case is the class of
methods have been developed which all try to minimize the sensitivity (bias and variance) of the nonparametric estimates to disturbing noise, nonlinear distortion, and transient (leakage) errors.

The renewed research interest in nonparametric techniques should be continued to handle the following challenging problems: short data sets, missing data, detection and quantification of time-variant behavior, modeling of time-variant dynamics, and modeling of nonlinear dynamics.

Cross-References

Frequency Domain System Identification
Frequency-Response and Frequency-Domain Models
System Identification: An Overview
System Identification Techniques: Convexification, Regularization, and Relaxation

Recommended Reading

The classical correlation (see section “Correlation Methods”) and spectral analysis (see section “Spectral Analysis Method”) methods are well covered by the textbooks listed below. The recommended reading list includes the basic papers on the spectral analysis methods (Blackman and Tukey 1958; Welch 1967; Wellstead 1981) and the most recent developments described in sections “Gaussian Process Regression” and “Advanced Methods.”

Acknowledgments This work is sponsored by the Research Foundation Flanders (FWO-Vlaanderen), the Flemish Government (Methusalem Fund, METH1), the Belgian Federal Government (Interuniversity Attraction Poles programme IAP VII, DYSCO), and the European Research Council (ERC Advanced Grant SNLSID).

Bibliography

Antoni J, Schoukens J (2007) A comprehensive study of the bias and variance of frequency-response-function measurements: optimal window selection and overlapping strategies. Automatica 43(10):1723–1736
Bendat JS, Piersol AG (1980) Engineering applications of correlation and spectral analysis. Wiley, New York
Blackman RB, Tukey JW (1958) The measurement of power spectra from the point of view of communications engineering – Part II. Bell Syst Tech J 37(2):485–569
Brillinger DR (1981) Time series: data analysis and theory. McGraw-Hill, New York
Broersen PMT (2004) Mean square error of the empirical transfer function estimator for stochastic input signals. Automatica 40(1):95–100
Carter GC, Nuttall AH (1980) On the weighted overlapped segment averaging method for power spectral estimation. Proc IEEE 68(10):1352–1353
Chen T, Ohlsson H, Ljung L (2012) On the estimation of transfer functions, regularizations and Gaussian processes – revisited. Automatica 48(8):1525–1535
Enqvist M, Ljung L (2005) Linear approximations of nonlinear FIR systems for separable input processes. Automatica 41(3):459–473
Eykhoff P (1974) System identification. Wiley, New York
Godfrey K (1993) Perturbation signals for system identification. Prentice-Hall, Englewood Cliffs
Hägg P, Hjalmarsson H (2012) Non-parametric frequency response function estimation using transient impulse response modelling. Paper presented at the 16th IFAC symposium on system identification, Brussels, 11–13 July, pp 43–48
Heath WP (2007) Choice of weighting for averaged nonparametric transfer function estimates. IEEE Trans Autom Control 52(10):1914–1920
Ljung L (1999) System identification: theory for the user, 2nd edn. Prentice-Hall, Upper Saddle River
McKelvey T, Guérin G (2012) Non-parametric frequency response estimation using a local rational model. Paper presented at the 16th IFAC symposium on system identification, Brussels, 11–13 July, pp 49–54
Pillonetto G, Chiuso A, De Nicolao G (2011) Prediction error identification of linear systems: a nonparametric Gaussian regression approach. Automatica 47(2):291–305
Pintelon R, Schoukens J (2012) System identification: a frequency domain approach, 2nd edn. IEEE, Piscataway/Wiley, Hoboken
Schoukens J, Rolain Y, Pintelon R (2006) Analysis of windowing/leakage effects in frequency response function measurements. Automatica 42(1):27–38
Stenman A, Gustafsson F, Rivera DE, Ljung L, McKelvey T (2000) On adaptive smoothing of empirical transfer function estimates. Control Eng Pract 8(11):1309–1315
Welch PD (1967) The use of the fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Trans Audio Electroacoust 15(2):70–73
Wellstead PE (1981) Non-parametric methods of system identification. Automatica 17(1):55–69
Numerical Methods for Continuous-Time Stochastic Control Problems

lem” uses a one-dimensional case as an example.

to denote the collection of controls, which are determined by a sequence of measurable functions F_n^h(\cdot) such that u_n^h = F_n^h(\alpha_k^h, k \le n; u_k^h, k < n). Denote the conditional expectation given \{\alpha_j^h, u_j^h : j \le n,\ \alpha_n^h = x,\ u_n^h = v\} by E_n^h. We say that a control policy is locally consistent if

  b^+(x,v)\, \frac{V^h(x+h) - V^h(x)}{h} - b^-(x,v)\, \frac{V^h(x) - V^h(x-h)}{h} + \frac{a(x)}{2}\, \frac{V^h(x+h) - 2V^h(x) + V^h(x-h)}{h^2} + R(x,v) = 0,

where b^+ and b^- are the positive and negative parts of b, respectively. Comparing with the dynamic programming equation, we obtain the transition probabilities

  p^h\big((x, x+h)\,|\,v\big) = \frac{a(x)/2 + h\, b^+(x,v)}{Q},
  p^h\big((x, x-h)\,|\,v\big) = \frac{a(x)/2 + h\, b^-(x,v)}{Q},
  p^h(\cdot) = 0 \ \text{otherwise}, \qquad \Delta t^h(x,v) = \frac{h^2}{Q},

with Q = a(x) + h|b(x,v)| being well defined. With the transition probabilities given above, we can proceed to verify the local consistency by straightforward calculations and prove the desired convergence.

Numerical Computation

To numerically approximate the controlled diffusions, frequently used methods are either value iterations or policy iterations (iteration in policy space). Using Markov chain approximation in conjunction with either value iteration or iteration in policy space, we can further obtain a sequence of value functions \{V^{h,n}\} such that V^{h,n} \to V^h as n \to \infty. The procedures can be described as follows.

Value Iteration
1. Given a tolerance ε > 0, set n = 0; for x ∈ D_h^0, set V^{h,0} = constant (for instance, 0).
2. Use V^{h,n} obtained in (7) to obtain V^{h,n+1}.
3. If |V^{h,n+1} − V^{h,n}| > ε, go to Step 2 above with n → n + 1.

Policy Iteration
1. Given a tolerance ε > 0, set n = 0; for x ∈ D_h^0, take an initial control u_0^h(x) = constant. Use u_0^h(x) in lieu of v, and solve (7) to find V^{h,0}(·).
2. Find an improved control by u^{h,n+1}(x) := \operatorname{argmin}_{v \in U^h} \big[ \sum_y p^h((x,y)|v)\, V^{h,n}(y) + R(x,v)\, \Delta t^h(x,v) \big].
3. Find V^{h,n+1}(·) with u^{h,n+1}(·) by solving (7). If |V^{h,n+1} − V^{h,n}| > ε, go to Step 2 above with n → n + 1.

Further Remarks

Variations of the Problems. Variants of the problems can be considered. For example, one may consider nonlinear filtering problems or singularly perturbed control and filtering problems. For problems arising in manufacturing systems, one often needs to treat controlled Markov chains with no diffusion terms. Such a case can also be handled by the Markov chain approximation methods; see Sethi and Zhang (1994) for the problem and Yin and Zhang (2013, Chapter 9) for the numerical methods. In this article, we mainly discussed a probabilistic approach for obtaining the weak convergence of the interpolations of the controlled Markov chain. One can also use the so-called viscosity solution methods to treat the convergence; see Barles and Souganidis (1991) (also Kushner and Dupuis 1992, Chapter 11).

Variance Control. In this entry, only the drift involves a control term. When the diffusion term is also subject to controls, the problem becomes more difficult. In Peng (1990), the idea of using backward stochastic differential equations was initiated, which has had a significant impact on the development of such stochastic control problems. Detailed discussions can be found in Yong and Zhou (1999). The numerical problems for a diffusion term involving controls can also be treated; see Kushner (2000) for further discussion. In this case, the so-called numerical noise or numerical viscosity can be introduced, so care must be taken.
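The transition probabilities and the value iteration above can be sketched in a few lines of Python. The test problem is hypothetical (not from the entry): a 1-D controlled diffusion with drift b(x,v) = v, constant a(x) = σ², running cost R(x,v) = 1 + v², a finite control set, and absorption at the boundary of [0, 1] with zero boundary cost:

```python
# Value iteration with the Markov chain approximation transition probabilities
# p^h(x, x±h | v) = (a(x)/2 + h b^±(x,v)) / Q,  Δt^h = h²/Q,  Q = a(x) + h|b(x,v)|.
# Hypothetical test problem: dx = v dt + σ dW on [0, 1], absorbed at the
# boundary with zero cost, running cost R(x, v) = 1 + v².
h = 0.05
a = 0.1                                   # a(x) = σ², taken constant here
U = [-1.0, 0.0, 1.0]                      # hypothetical finite control set U^h
xs = [i * h for i in range(int(round(1 / h)) + 1)]
V = [0.0] * len(xs)                       # V^{h,0} = 0

for _ in range(500):                      # value iteration V^{h,n} -> V^{h,n+1}
    Vnew = V[:]
    for i in range(1, len(xs) - 1):       # boundary states stay absorbed at 0
        costs = []
        for v in U:
            Q = a + h * abs(v)
            p_up = (a / 2 + h * max(v, 0.0)) / Q    # uses b^+(x, v)
            p_dn = (a / 2 + h * max(-v, 0.0)) / Q   # uses b^-(x, v)
            dt = h * h / Q                          # interpolation interval Δt^h
            costs.append(p_up * V[i + 1] + p_dn * V[i - 1] + (1 + v * v) * dt)
        Vnew[i] = min(costs)
    V = Vnew
```

Note that p_up + p_dn = 1 for every (x, v), so the approximating chain is a genuine controlled random walk; the value function is largest in the middle of the interval, where exiting is most expensive.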
Complex Models Involving Jump and Switching. Note that only controlled diffusions are considered in this entry. More complex models such as controlled jump diffusions (Kushner and Dupuis 1992), switching diffusions (Yin and Zhu 2010), and switching jump diffusions can be treated (Song et al. 2006). Differential games can also be treated (Kushner 2002; Song et al. 2008).

Differential Delay Systems. Stochastic differential delay systems may come into play. The corresponding numerical algorithms have been studied extensively in Kushner (2008). Due to their inherent infinite dimensionality, a main issue here concerns suitable finite approximations to the memory segments.

Rates of Convergence. This entry mainly discusses the convergence of the approximation methods. There is also much interest in ascertaining rates of convergence. Such effort goes back to the paper of Menaldi (1989) (see also Zhang 2006). Subsequently, there has been a resurgence of effort in dealing with this issue from a nonlinear partial differential equation point of view; see Krylov (2000). Our recent work Song and Yin (2009) complements the study by providing a probabilistic approach for treating switching diffusions.

Stochastic Approximation. In certain optimal control problems, the optimal controls or near-optimal controls turn out to be of threshold type. An alternative way of solving such problems, leading to at least suboptimal or near-optimal control, is to use a stochastic approximation approach; see Kushner and Yin (2003) for a comprehensive treatment of stochastic approximation algorithms. Some successful examples include manufacturing systems (Yin and Zhang 2013, Section 9.3) and liquidation decision making (Yin et al. 2002).

Cross-References

Stochastic Dynamic Programming
Stochastic Maximum Principle

Acknowledgments The research of this author was supported in part by the Army Research Office under grant W911NF-12-1-0223.

Bibliography

Barles G, Souganidis P (1991) Convergence of approximation schemes for fully nonlinear second order equations. J Asymptot Anal 4:271–283
Bertsekas DP, Castanon DA (1989) Adaptive aggregation methods for infinite horizon dynamic programming. IEEE Trans Autom Control 34:589–598
Chancelier P, Gomez C, Quadrat J-P, Sulem A, Blankenship GL, La Vigna A, MaCenary DC, Yan I (1986) An expert system for control and signal processing with automatic FORTRAN program generation. In: Mathematical systems symposium, Stockholm. Royal Institute of Technology, Stockholm
Chancelier P, Gomez C, Quadrat J-P, Sulem A (1987) Automatic study in stochastic control. In: Fleming W, Lions PL (eds) IMA volume in mathematics and its applications, vol 10. Springer, Berlin
Crandall MG, Lions PL (1983) Viscosity solutions of Hamilton-Jacobi equations. Trans Am Math Soc 277:1–42
Crandall MG, Ishii H, Lions PL (1992) User’s guide to viscosity solutions of second order partial differential equations. Bull Am Math Soc 27:1–67
Fleming WH, Rishel RW (1975) Deterministic and stochastic optimal control. Springer, New York
Fleming WH, Soner HM (1992) Controlled Markov processes and viscosity solutions. Springer, New York
Jin Z, Wang Y, Yin G (2011) Numerical solutions of quantile hedging for guaranteed minimum death benefits under a regime-switching-jump-diffusion formulation. J Comput Appl Math 235:2842–2860
Jin Z, Yin G, Zhu C (2012) Numerical solutions of optimal risk control and dividend optimization policies under a generalized singular control formulation. Automatica 48:1489–1501
Jin Z, Yang HL, Yin G (2013) Numerical methods for optimal dividend payment and investment strategies of regime-switching jump diffusion models with capital injections. Automatica 49:2317–2329
Krylov NV (2000) On the rate of convergence of finite-difference approximations for Bellman’s equations with variable coefficients. Probab Theory Relat Fields 117:1–16
Kushner HJ (1977) Probability methods for approximation in stochastic control and for elliptic equations. Academic, New York
Kushner HJ (1990a) Weak convergence methods and singularly perturbed stochastic control and filtering problems. Birkhäuser, Boston
Kushner HJ (1990b) Numerical methods for stochastic control problems in continuous time. SIAM J Control Optim 28:999–1048
Numerical Methods for Nonlinear Optimal Control Problems
a discount rate δ ≥ 0, we consider the optimal control problem

  \underset{u(\cdot) \in U^T(x)}{\text{minimize}} \; J^T(x, u(\cdot))    (2)

where

  J^T(x, u(\cdot)) := \int_0^T e^{-\delta s}\, g\big(x(s, x, u(\cdot)), u(s)\big)\, ds + e^{-\delta T} F\big(x(T, x, u(\cdot))\big)    (3)

and

  U^T(x) := \Big\{ u(\cdot) \in L^\infty(\mathbb{R}, U) \;\Big|\; x(s, x, u(\cdot)) \in X \text{ for all } s \in [0, T] \Big\}.    (4)

In addition to this finite horizon optimal control problem, we also consider the infinite horizon problem in which T is replaced by “∞,” i.e.,

  \underset{u(\cdot) \in U^\infty(x)}{\text{minimize}} \; J^\infty(x, u(\cdot))    (5)

where

  J^\infty(x, u(\cdot)) := \int_0^\infty e^{-\delta s}\, g\big(x(s, x, u(\cdot)), u(s)\big)\, ds    (6)

and

  U^\infty(x) := \Big\{ u(\cdot) \in L^\infty(\mathbb{R}, U) \;\Big|\; x(s, x, u(\cdot)) \in X \text{ for all } s \ge 0 \Big\},    (7)

respectively.

The term “solving” (2)–(4) or (5)–(7) can have various meanings. First, the optimal value functions

  V^T(x) = \inf_{u(\cdot) \in U^T(x)} J^T(x, u(\cdot)) \quad \text{or} \quad V^\infty(x) = \inf_{u(\cdot) \in U^\infty(x)} J^\infty(x, u(\cdot))

may be of interest. Second, and often more importantly, one would like to know the optimal control policy. This can be expressed in open-loop form u^* : \mathbb{R} \to U, in which the function u^* depends on the initial value x and on the initial time, which we set to 0 here. Alternatively, the optimal control can be computed in state- and time-dependent closed-loop form, in which a feedback law \mu^* : \mathbb{R} \times X \to U is sought. Via u^*(t) = \mu^*(t, x(t)), this feedback law can then be used in order to generate the time-dependent optimal control function for all possible initial values. Since the feedback law is evaluated along the trajectory, it is able to react to perturbations and uncertainties which may make x(t) deviate from the predicted path. Finally, knowing u^* or \mu^*, one can reconstruct the corresponding optimal trajectory by solving

  \dot{x}(t) = f(x(t), u^*(t)) \quad \text{or} \quad \dot{x}(t) = f\big(x(t), \mu^*(t, x(t))\big).

Hamilton-Jacobi-Bellman Approach

In this section we describe the numerical approach to solving optimal control problems via Hamilton-Jacobi-Bellman equations. We first describe how this approach can be used in order to compute approximations to the optimal value functions V^T and V^∞, respectively, and afterwards how the optimal control can be synthesized using these approximations. In order to formulate this approach for finite horizon T, we interpret V^T(x) as a function in T and x. We denote differentiation w.r.t. T and x with subscripts T and x, i.e., V_x^T(x) = dV^T(x)/dx, V_T^T(x) = dV^T(x)/dT, etc.

We define the Hamiltonian of the optimal control problem as

  H(x, p) := \max_{u \in U} \big\{ -g(x, u) - p \cdot f(x, u) \big\},

with x, p ∈ R^n, f from (1), g from (3) or (6), and “·” denoting the inner product in R^n. Then, under appropriate regularity conditions on the problem data, the optimal value functions V^T and V^∞ satisfy the first order partial differential equations (PDEs)
  V_T^T(x) + \delta V^T(x) + H\big(x, V_x^T(x)\big) = 0

and

  \delta V^\infty(x) + H\big(x, V_x^\infty(x)\big) = 0

advantage of the fact that, by the chain rule, for p = V_x^\infty(x) and constant control functions u, the identity

  \delta V^\infty(x) - p \cdot f(x, u) = -\frac{d}{dt}\Big|_{t=0} (1 - \delta t)\, V^\infty\big(x(t, x, u)\big)

holds. Hence, the left-hand side of this equality can be approximated by the difference quotient

  \frac{V^\infty(x) - (1 - \delta h)\, V^\infty\big(x(h, x, u)\big)}{h}

for small h > 0. Inserting this approximation into the Hamilton-Jacobi-Bellman equation, replacing x(h, x, u) by a numerical approximation \tilde{x}(h, x, u) (in the simplest case, the Euler method \tilde{x}(h, x, u) = x + h f(x, u)), multiplying by h, and rearranging terms, one arrives at the equation

  V_h^\infty(x) = \min_{u \in U} \big\{ h\, g(x, u) + (1 - \delta h)\, V_h^\infty\big(\tilde{x}(h, x, u)\big) \big\};

the minimizing control value

  \mu_h^*(x) \in \operatorname{argmin}_{u \in U} \big\{ h\, g(x, u) + (1 - \delta h)\, V_h^\infty\big(\tilde{x}(h, x, u)\big) \big\}

is an optimal feedback control value for this discrete-time problem for the next time step, i.e., on the time interval [t, t+h) if x = x(t). This feedback law will be approximately optimal for the continuous-time control system when applied as a discrete-time feedback law, and this approximate optimality remains true if we replace V_h^\infty in the definition of \mu_h^* by its numerically computable approximation \tilde{V}_h^\infty. A similar construction can be made based on any other numerical approximation \tilde{V}^\infty \approx V^\infty, but the explicit correspondence of the semi-Lagrangian scheme to a discrete-time auxiliary system facilitates the interpretation and error analysis of the resulting control law.

The main advantage of the Hamilton-Jacobi approach is that it directly computes an approximately optimal feedback law. Its main
p(0) = p_0 and control function u(t). Then, we proceed iteratively as follows:

(0) Find initial guesses p_0^0 ∈ R^n and u^0(t) for the initial costate and the control, fix ε > 0, and set k := 0.
(1) Solve (11) and (12) numerically with initial values x_0 and p_0^k and control function u^k. Denote the resulting trajectories by \tilde{x}^k(t) and \tilde{p}^k(t).
(2) Apply one step of an iterative method for solving the zero-finding problem G(p) = 0 with

  G(p_0^k) := \tilde{p}^k(T) - F_x\big(\tilde{x}^k(T)\big)

for computing p_0^{k+1}. For instance, in the case of the Newton method we get

  p_0^{k+1} := p_0^k - DG(p_0^k)^{-1} G(p_0^k).

If \|p_0^{k+1} - p_0^k\| < ε, stop; else compute

… < t_N = T and, in addition to p_0^k, introduces variables x_1^k, …, x_{N-1}^k, p_1^k, …, p_{N-1}^k ∈ R^n. Then, starting from initial guesses p_0^0, u^0, and x_1^0, …, x_{N-1}^0, p_1^0, …, p_{N-1}^0, in each iteration the Eqs. (11)–(14) are solved numerically on the intervals [t_j, t_{j+1}] with initial values x_j^k and p_j^k, respectively. We denote the respective solutions in the k-th iteration by \tilde{x}_j^k and \tilde{p}_j^k. In order to enforce that the trajectory pieces computed on the individual intervals [t_j, t_{j+1}] fit together continuously, the map G is redefined as

  G\big(x_1^k, \ldots, x_{N-1}^k, p_0^k, p_1^k, \ldots, p_{N-1}^k\big) =
  \begin{pmatrix}
  \tilde{x}_0^k(t_1) - x_1^k \\
  \vdots \\
  \tilde{x}_{N-2}^k(t_{N-1}) - x_{N-1}^k \\
  \tilde{p}_0^k(t_1) - p_1^k \\
  \vdots \\
  \tilde{p}_{N-2}^k(t_{N-1}) - p_{N-1}^k \\
  \tilde{p}_{N-1}^k(T) - F_x\big(\tilde{x}_{N-1}^k(T)\big)
  \end{pmatrix}.
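For a scalar example, the single-shooting variant of this Newton iteration can be written out directly. The sketch below is not from the entry: it uses the LQ test problem minimize ∫₀¹ (x² + u²) dt with ẋ = u and x(0) = 1, for which the maximum principle gives u = −p/2, ṗ = −2x, and the terminal condition p(1) = 0 (no terminal cost), and it approximates the derivative DG by a finite difference:

```python
# Single-shooting Newton iteration on the costate boundary condition for the
# scalar LQ problem: minimize ∫₀¹ (x² + u²) dt,  dx/dt = u,  x(0) = 1.
# The maximum principle gives u = -p/2, dp/dt = -2x, and G(p0) = p(1) - 0.
def shoot(p0, steps=1000):
    # Euler integration of the coupled state/costate system; returns p(1)
    x, p, dt = 1.0, p0, 1.0 / steps
    for _ in range(steps):
        x, p = x + dt * (-p / 2.0), p + dt * (-2.0 * x)
    return p

p0, eps = 0.0, 1e-6
for _ in range(20):
    G = shoot(p0)
    if abs(G) < 1e-10:
        break
    dG = (shoot(p0 + eps) - shoot(p0)) / eps   # finite-difference approximation of DG
    p0 -= G / dG                               # Newton step
# p0 now approximates the exact initial costate 2*tanh(1) ≈ 1.523
```

Because this problem is linear, the Newton iteration converges essentially in one step; for nonlinear systems the multiple shooting version above is far more robust, at the price of extra variables.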
Despite being the most straightforward and simple of the approaches described in this article, the direct discretization approach is currently the most widely used approach for computing single finite horizon optimal trajectories. In the direct approach, we first discretize the problem and then solve a finite dimensional nonlinear optimization problem (NLP), i.e., we first discretize, then optimize. The main reasons for the popularity of this approach are the simplicity with which constraints can be handled and the numerical efficiency due to the availability of fast and reliable NLP solvers.

The direct approach again applies to the finite horizon problem and computes an approximation to a single optimal trajectory x^*(t) and control function u^*(t) for a given initial value x_0 ∈ X. To this end, a time grid 0 = t_0 < t_1 < t_2 < … < t_N = T and a set U_d of control functions which are parameterized by finitely many values are selected. The simplest way to do so is to choose u(t) ≡ u_i ∈ U for all t ∈ [t_i, t_{i+1}). However, other approaches like piecewise linear or piecewise polynomial control functions are possible, too. We use a numerical algorithm for ordinary differential equations in order to approximately solve the initial value problems

  \dot{x}(t) = f(x(t), u_i), \qquad x(t_i) = x_i    (15)

for i = 0, …, N−1 on [t_i, t_{i+1}]. We denote the exact and numerical solution of (15) by x(t; t_i, x_i, u_i) and \tilde{x}(t; t_i, x_i, u_i), respectively. Finally, we choose a numerical integration rule in order to compute an approximation

  I(t_i, t_{i+1}, x_i, u_i) \approx \int_{t_i}^{t_{i+1}} e^{-\delta t}\, g\big(x(t; t_i, x_i, u_i), u(t)\big)\, dt.

In the simplest case, one might choose \tilde{x} as the Euler scheme and I as the rectangle rule, leading to

  \tilde{x}(t_{i+1}; t_i, x_i, u_i) = x_i + (t_{i+1} - t_i)\, f(x_i, u_i)

and

  I(t_i, t_{i+1}, x_i, u_i) = (t_{i+1} - t_i)\, e^{-\delta t_i}\, g(x_i, u_i).

Introducing the optimization variables u_0, …, u_{N−1} ∈ R^m and x_1, …, x_N ∈ R^n, the discretized version of (2)–(4) reads

  \underset{x_j \in \mathbb{R}^n,\ u_j \in \mathbb{R}^m}{\text{minimize}} \; \sum_{i=0}^{N-1} I(t_i, t_{i+1}, x_i, u_i) + e^{-\delta T} F(x_N)

subject to the constraints

  u_j ∈ U, j = 0, …, N−1
  x_j ∈ X, j = 1, …, N
  x_{j+1} = \tilde{x}(t_{j+1}; t_j, x_j, u_j), j = 0, …, N−1.

This way, we have converted the optimal control problem (2)–(4) into a finite dimensional nonlinear optimization problem (NLP). As such, it can be solved with any numerical method for solving such problems. Popular methods are, for instance, sequential quadratic programming (SQP) or interior point (IP) algorithms. The convergence of this approach was proved in Malanowski et al. (1998); for an up-to-date account on theory and practice of the method, see Gerdts (2012) and Betts (2010). These references also explain how information about the costates p(t) can be extracted from a direct discretization, thus linking the approach to the maximum principle.

The direct method sketched here is again a multiple shooting method, and the benefit of this approach is the same as for solving boundary value problems: thanks to the short intervals [t_i, t_{i+1}], the solutions depend much less sensitively on the data than the solution on the whole interval [0, T], thus making the iterative solution of the resulting discretized NLP much easier. The price to pay is again the increase of the number of optimization variables. However, due to the particular structure of the constraints guaranteeing continuity of the solution, the resulting matrices in the NLP have a particular structure which can be exploited numerically by a method called condensing (see Bock and Plitt 1984).
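As a toy instance of “first discretize, then optimize,” the following Python sketch applies the Euler/rectangle-rule discretization above to the unconstrained scalar problem minimize ∫₀¹ (x² + u²) dt, ẋ = u, x(0) = 1 (δ = 0, F = 0). To stay dependency-free it eliminates the states through the Euler recursion (a sequential, single-shooting flavor of the direct approach) and minimizes over the controls by a crude finite-difference gradient descent, standing in for the SQP or IP solvers one would use in practice:

```python
# "First discretize, then optimize" for the scalar test problem
#   minimize ∫₀¹ (x² + u²) dt,  dx/dt = u,  x(0) = 1   (δ = 0, F = 0),
# using the Euler scheme and the rectangle rule.  The states are eliminated
# via the Euler recursion; the finite-dimensional cost is minimized by a
# crude finite-difference gradient descent (a stand-in for SQP/IP solvers).
N = 20
dt = 1.0 / N

def cost(u):
    x, J = 1.0, 0.0
    for i in range(N):
        J += dt * (x * x + u[i] * u[i])    # rectangle-rule term I(t_i, t_{i+1}, x_i, u_i)
        x = x + dt * u[i]                  # Euler step x~(t_{i+1}; t_i, x_i, u_i)
    return J

u = [0.0] * N
for _ in range(2000):
    J0, eps, grad = cost(u), 1e-6, []
    for i in range(N):
        up = u[:]
        up[i] += eps
        grad.append((cost(up) - J0) / eps)
    u = [ui - 2.0 * gi for ui, gi in zip(u, grad)]
```

The converged cost lies close to the continuous-time optimal value tanh(1) ≈ 0.76 of this LQ problem, and the first control is negative, steering the state toward the origin; a full transcription would instead keep the x_j as variables with the continuity constraints, which is what makes the structured (condensed) NLP solvers pay off.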
An alternative to multiple shooting methods are collocation methods, in which the internal variables of the numerical algorithm for solving (15) are also optimization variables. However, nowadays, the multiple shooting approach as described above is usually preferred. For a more detailed description of various direct approaches, see also Binder et al. (2001), Sect. 5.

Further Approaches for Infinite Horizon Problems

The last two approaches only apply to finite horizon problems. While the maximum principle approach can be generalized to infinite horizon problems, the necessary conditions become weaker and the numerical solution becomes considerably more involved (see Grass et al. 2008). Both the maximum principle and the direct approach can, however, be applied in a receding horizon fashion, in which an infinite horizon problem is approximated by the iterative solution of finite horizon problems. The resulting control technique is known under the name of model predictive control (MPC; see Grüne and Pannek 2011), and under suitable assumptions, a rigorous approximation result can be established.

Summary and Future Directions

Bellman equations and direct discretization. For the former, the development of discretization schemes suitable for increasingly higher dimensional problems is in the focus. For the latter, the popularity of these methods in online applications like MPC triggers continuing effort to make this approach faster and more reliable.

Beyond ordinary differential equations, the development of numerical algorithms for the optimal control of partial differential equations (PDEs) has attracted considerable attention during the last years. While many of these methods are still restricted to linear systems, in the near future we can expect to see many extensions to (classes of) nonlinear PDEs. It is worth noting that for PDEs, maximum principle-like approaches are more popular than for ordinary differential equations.

Cross-References

Discrete Optimal Control
Economic Model Predictive Control
Nominal Model-Predictive Control
Optimal Control and the Dynamic Programming Principle
Optimal Control and Pontryagin’s Maximum Principle
Optimization Algorithms for Model Predictive Control

Bibliography

Bryson AE, Ho YC (1975) Applied optimal control, revised printing. Hemisphere Publishing Corp., Washington, DC
Falcone M (1997) Numerical solution of dynamic programming equations. Appendix A in: Bardi M, Capuzzo Dolcetta I (eds) Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Birkhäuser, Boston
Falcone M, Ferretti R (2013) Semi-Lagrangian approximation schemes for linear and Hamilton-Jacobi equations. SIAM, Philadelphia
Gerdts M (2012) Optimal control of ODEs and DAEs. De Gruyter textbook. Walter de Gruyter & Co., Berlin
Grass D, Caulkins JP, Feichtinger G, Tragler G, Behrens DA (2008) Optimal control of nonlinear processes. Springer, Berlin
Grüne L, Pannek J (2011) Nonlinear model predictive control: theory and algorithms. Springer, London
Malanowski K, Büskens C, Maurer H (1998) Convergence of approximations to nonlinear optimal control problems. In: Fiacco AV (ed) Mathematical programming with data perturbations. Lecture notes in pure and applied mathematics, vol 195. Dekker, New York, pp 253–284
Malanowski K, Maurer H, Pickenhain S (2004) Second-order sufficient conditions for state-constrained optimal control problems. J Optim Theory Appl 123(3):595–617
Maurer H (1981) First and second order sufficient optimality conditions in mathematical programming and optimal control. Math Program Stud 14:163–177
McEneaney WM (2006) Max-plus methods for nonlinear control and estimation. Systems & control: foundations & applications. Birkhäuser, Boston
Pesch HJ (1994) A practical guide to the solution of real-life optimal control problems. Control Cybern 23(1–2):7–60
Vinter R (2000) Optimal control. Systems & control: foundations & applications. Birkhäuser, Boston
Consider the controlled and observed system $\Sigma$:

$$\dot x(t) = A x(t) + B u(t) + E d(t), \qquad y(t) = C x(t), \qquad z(t) = H x(t), \qquad (1)$$

with $x(t) \in X = \mathbb{R}^n$ the state, $u(t) \in \mathbb{R}^m$ the control input, and $y(t) \in \mathbb{R}^p$ the measured output. The signal $d(t)$ may represent a disturbance input or a desired reference signal, while the signal $z(t)$ is a controlled output signal. $A$, $B$, $C$, $E$, and $H$ are maps (or matrices).

In general, a linear controller for this system is a finite-dimensional linear time-invariant system $\Omega$ represented by

$$\dot w(t) = K w(t) + L y(t), \qquad u(t) = M w(t) + N y(t). \qquad (2)$$

The state space of the controller is assumed to be $W = \mathbb{R}^q$ for some positive integer $q$. $K$, $L$, $M$, and $N$ are assumed to be linear maps (or matrices). The controller (2) takes the observations $y$ as its input and generates the control function $u$ as its output. The closed-loop system resulting from the interconnection of $\Sigma$ and $\Omega$ is described by the equations

$$\begin{pmatrix} \dot x(t) \\ \dot w(t) \end{pmatrix} = \begin{pmatrix} A + BNC & BM \\ LC & K \end{pmatrix} \begin{pmatrix} x(t) \\ w(t) \end{pmatrix} + \begin{pmatrix} E \\ 0 \end{pmatrix} d(t), \qquad z(t) = \begin{pmatrix} H & 0 \end{pmatrix} \begin{pmatrix} x(t) \\ w(t) \end{pmatrix}. \qquad (3)$$

The control action of interconnecting the controller with the system (1) is called dynamic feedback. The state space of the closed-loop system (3) is called the extended state space and is equal to the Cartesian product $X \times W = \mathbb{R}^{n+q}$. In general, a feedback control problem amounts to finding linear maps $K$, $L$, $M$, and $N$ such that the closed-loop system (3) satisfies the control design specifications.

Given the system (1) and a control objective, the problem thus arises of how to determine the maps $K$, $L$, $M$, and $N$ so that the closed-loop system meets the objective. As an example, take the special case when $E$ in (1) is equal to zero (i.e., the system has no external disturbances or reference signals) and suppose that we wish the closed-loop system (3) to be internally stable, i.e., we want to find the maps $K$, $L$, $M$, and $N$ so that the eigenvalues $\lambda_i$ of the system map of (3) are in the open left half-plane, i.e., satisfy $\mathrm{Re}(\lambda_i) < 0$ for all $i$. If we had access to the entire state variable $x$ (instead of only to the linear function $y = Cx$), then this problem would be simpler: assuming that the system is stabilizable (the system $\dot x = Ax + Bu$ is called stabilizable if there exists a map $F$ such that $A + BF$ has all its eigenvalues in the open left half-plane), find a map $F$ such that the eigenvalues of $A + BF$ are in the open left half-plane; then take the static state feedback controller $u = Fx$ as the control law. That is, we would choose the state space dimension of the controller equal to 0 and the maps $K$, $L$, and $M$ to be void, and we would take $N = F$.

In general, however, we only have access to a given linear function $y = Cx$ of $x$, determined by the output map $C$. The key idea of observer-based control is the following: use the theory of observer design to find an observer for the state $x$ of the system (1), i.e., an observer that generates an estimate of the system state $x$ based on the measured output $y$ and the control input $u$. Next, apply a static feedback $u = F\hat x$ mimicking the (not permissible) control law $u = Fx$.

This idea leads to a dynamic feedback controller (2) of a very particular structure: the controller is the combination of a state observer (with a certain state space dimension) and a static control law acting on the state estimate. This two-stage structure, separating estimation and control, is often called the separation principle. We will work out this idea in more detail for the case when $E = 0$ (no external disturbances or reference signals) and the aim is to obtain internal stability of the system.
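The separation idea described above can be checked numerically. The sketch below is a hypothetical instance (a double integrator with measured position, and hand-picked gains): in the coordinates $(x, e)$, with $e = x - \hat x$, the closed-loop matrix is block triangular, so its eigenvalues are the union of those of $A + BF$ and those of the observer error matrix $A - GC$ (the observer gain $G$ is introduced in the sequel).

```python
import numpy as np

# Hypothetical example system: double integrator, position measured.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

# Static feedback gain F: A + BF has both eigenvalues at -1.
F = np.array([[-1.0, -2.0]])
# Observer (output injection) gain G: A - GC has both eigenvalues at -2.
G = np.array([[4.0], [4.0]])

# Closed loop in (x, e) coordinates is block triangular:
#   d/dt [x; e] = [[A + BF, -BF], [0, A - GC]] [x; e]
Acl = np.block([[A + B @ F, -B @ F],
                [np.zeros((2, 2)), A - G @ C]])

eigs = np.sort(np.linalg.eigvals(Acl).real)
print(eigs)  # approximately [-2, -2, -1, -1]: the separation principle
```

All eigenvalues have negative real part, so this observer-based controller internally stabilizes the interconnection even though only $y = Cx$ is measured.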
Observer-Based Control 933
stability of the system. Before doing this, we first explain the most important material on observers that is needed in the sequel.

State Observers

If the state is not available for measurement, one can try to reconstruct it using a system, called an observer, that takes the control input and the measured output of the original system as inputs and yields an output that is an estimate of the state of the original system. Consider again the case that in the system (1) we have $E = 0$, i.e., there are no disturbance signals. This is illustrated in the following picture:

[Figure: the system $\Sigma$ with input $u$ and output $y$, a duplicate system $\Sigma_{\mathrm{dup}}$ driven by the same input, and a gain $G$ acting on the difference between $y$ and the duplicate's output.]

Introducing the estimation error $e := x - \hat x$ and interconnecting the system (1) with (5), we find that the error $e$ satisfies the differential equation

$$\dot e(t) = (A - GC)\, e(t). \qquad (6)$$

Hence all possible errors converge to 0 as $t$ tends to infinity if and only if $A - GC$ is a stability matrix, i.e., has all its eigenvalues in the open left half-plane. In that case, we call (5) a stable state observer. Thus, a stable state observer exists if and only if $G$ can be found such that $A - GC$ is a stability matrix. The problem of finding such a $G$ is dual to the problem of finding, for a pair $(A, B)$, a matrix $F$ such that $A + BF$ is a stability matrix.

Definition 1 The pair $(C, A)$ is called detectable if there exists a matrix $G$ such that $A - GC$ is a stability matrix, i.e., has all its eigenvalues in the open left half-plane.
[Figure: the observer-based controller, consisting of an observer $\Omega$ driven by $u$ and $y$, whose state estimate is fed back through the static gain $F$.]

where $F$ is any map such that $A + BF$ is a stability matrix and $G$ is any map such that $A - GC$ is a stability matrix. The controller (8) is an observer-based dynamic feedback controller, since it is composed of a state observer and a static feedback part.
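A minimal numerical sketch of the error dynamics (6): assuming a hypothetical harmonic-oscillator model with measured position and a hand-picked gain $G$ making $A - GC$ a stability matrix, any initial estimation error decays to zero.

```python
import numpy as np

# Error dynamics of a stable state observer: e_dot = (A - GC) e  (Eq. (6)).
# Hypothetical model: harmonic oscillator with position measured.
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])
G = np.array([[3.0], [2.0]])           # chosen so A - GC is a stability matrix

Aerr = A - G @ C
assert np.all(np.linalg.eigvals(Aerr).real < 0)

# Forward-Euler simulation of Eq. (6): the error converges to 0
# regardless of the (unknown) initial error.
e = np.array([1.0, -2.0])
dt = 0.001
for _ in range(20000):                 # simulate 20 s
    e = e + dt * (Aerr @ e)
print(np.linalg.norm(e))               # close to 0
```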
$$\delta^s(t) \in \Delta^s(t), \quad \delta^m(t) \in \Delta^m(t), \quad \delta^h(t) \in \Delta^h(t),$$

respectively, and

$$\delta^s_k \in \Delta^s_k, \quad \delta^m_k \in \Delta^m_k, \quad \delta^h_k \in \Delta^h_k,$$

respectively.

Observation problem: At each time $t$, given the function $s \in [t - T, t] \mapsto y(s)$, […] from all that the a priori information makes possible, and to eliminate what is not consistent with the a posteriori information. In the set-valued observer setting, in the discrete-time case, this gives the following observer. To ease its reading, we underline the data given by the a priori information. It requires the introduction of two sets $\mathcal{C}_k$ and $\mathcal{C}_{k|k-1}$ which are updated at each time $k$ when a new measurement $y_k$ is made available. $\mathcal{C}_k$ is the set which $x_k$ is guaranteed to belong to at time $k$, knowing all the measurements up to time $k$, and $\mathcal{C}_{k|k-1}$ is the same but with measurements known up to time $k - 1$.

Set-valued observer:

Initialization: $\mathcal{C}_0 = X_0$
At each time $k$:
  prediction (flowing): $\mathcal{C}_{k|k-1} = f_{k-1}(\mathcal{C}_{k-1}, \Delta^s_{k-1})$
  restriction (consistency): $\mathcal{C}_k = \{x \in \mathcal{C}_{k|k-1} \cap X_k : y_k \in h_k(x, \Delta^m_k)\}$
  estimation: $\hat x_k \in \mathcal{C}_k$

A key feature here is that this observer has a state $\mathcal{C}_k$ – a set – and is a dynamical system in the form

$$\mathcal{C}_{k+1} = \varphi_k(\mathcal{C}_k, y_k), \qquad \hat x_k \in \mathcal{C}_k,$$

with $y$ as input and $\hat x$ as output, which is not single valued. Important also, the initial condition of the state is given by the a priori information.

In the stochastic setting, following the Bayesian paradigm, the observer has the same structure but with the state $\mathcal{C}_k$ being a conditional probability. See Jazwinski (2007, Theorem 6.4) or Candy (2009, Table 2.1). In that setting too, the observer state is not a single point; it is the (a posteriori) conditional probability of the random variable $x_k$ given the a priori information and the sequence of measurements $l \in \{k - K, \ldots, k\} \mapsto y_l$.

Comments

Implementation: For the time being, except for very specific cases (Kalman filter, …), the set-valued and the conditional probability-valued observers remain conceptual, since we do not know how to manipulate numerically sets and probability laws. Their implementation requires approximations. For instance, see Milanese et al. (1996) and Witsenhausen (1966) for the set case, and Arulampalam et al. (2002), Bucy and Joseph (1987), Candy (2009), and Jazwinski (2007) for the conditional probability case.

Need of finite, or infinite but fading, memory: In these observers, model states $x$ which are consistent with the a priori information but do not agree with the a posteriori information are eliminated (set intersection or probability product). But once a point is eliminated, this is forever. As a consequence, if there is, at some time, a misfit between a priori and a posteriori information, it is mistakenly propagated to future times. A way around this problem is to keep the information memory finite, or infinite but fading. In particular, with fixed-length memory, consistent points which were disregarded due to measurements which are no longer in the memory are reintroduced. This says also that observers should not be sensitive to their initial condition.

Not single-valued estimate: The observers introduced above realize a lossless data compression, extracting and preserving all that concerns the hidden variables in the redundant data given by the a priori and a posteriori information. But this "lossless compression" answer is not single valued (it is set valued or conditional probability valued), as a result of taking uncertainties into account. Actually, to get a single-valued answer, the observation problem must be complemented by making precise what the estimation is made for. For instance, we may want to select the most likely, or the average, or more generally some cost-minimizing estimate $\hat x$ among all the possible ones given by $\mathcal{C}$. In this way we obtain an observer giving a single-valued estimate:

$$\mathcal{C}_{k+1} = \varphi_k(\mathcal{C}_k, y_k), \qquad \hat x_k = \psi_k(\mathcal{C}_k),$$

respectively,

$$\dot{\mathcal{C}}(t) = \varphi(\mathcal{C}(t), y(t), t), \qquad \hat x(t) = \psi(\mathcal{C}(t), t). \qquad (6)$$

938 Observers for Nonlinear Systems

But then, in general, we lose information, and in particular we have no idea of the confidence level this estimate has. Also, since the function $\psi$, at least, encodes what the estimate $\hat x$ is used for, different uses may require different functions $\psi$.

[…] state reproduces the measurement. Namely, we get

$$\dot{\hat x}(t) = f(\hat x(t), t, 0) + E(\{\tau \mapsto y(\tau)\}, \hat x(t), y(t), t).$$
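For a concrete (hypothetical) scalar model with interval-bounded disturbance and measurement noise, the prediction/restriction recursion above reduces to interval arithmetic; the true state is guaranteed to stay inside the computed set, and a single-valued estimate can then be extracted, e.g., as the interval midpoint.

```python
import numpy as np

# Interval implementation of the set-valued observer recursion for the
# hypothetical scalar model:  x_{k+1} = a x_k + s_k,  |s_k| <= ds,
#                             y_k     = x_k + m_k,    |m_k| <= dm.
a, ds, dm = 0.9, 0.05, 0.1
rng = np.random.default_rng(0)

x = 2.0
lo, hi = -5.0, 5.0                       # initialization: the a priori set X_0
for k in range(50):
    # prediction (flowing): propagate the set through the dynamics
    lo, hi = a * lo - ds, a * hi + ds
    # simulate the true system and its measurement
    x = a * x + rng.uniform(-ds, ds)
    y = x + rng.uniform(-dm, dm)
    # restriction (consistency): keep only states compatible with y_k
    lo, hi = max(lo, y - dm), min(hi, y + dm)
    assert lo <= x <= hi                 # guaranteed membership
xhat = 0.5 * (lo + hi)                   # one possible single-valued estimate
print(lo, hi)
```

The interval width after each restriction is at most $2\,dm$, illustrating how the a posteriori information shrinks the set of consistent states.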
exists a set $Z_a(t) \subset Z(t)$ such that, on the domain of existence of the solution, a distance between the point $(X(x, t, s), \Xi((x, \xi), t, s))$ and the set $Z_a(s)$ is upper-bounded by a real function $s \mapsto \beta^c_{x,\xi,t}(s)$, possibly dependent on $(x, \xi, t)$, with nonnegative values, strictly decreasing and going to zero as $s$ goes to infinity.

Necessary Conditions for Observer Convergence

No Restriction on $\psi$

It is possible to prove that if the observer is convergent, then:

Necessity of detectability: When $h$ and $\psi$ are uniformly continuous in $x$ and $\xi$, respectively, the estimate $\hat x$ does converge to the model state $x$. In this case, two solutions of the model (7) which produce the same measurement must converge to each other. This is an asymptotic distinguishability property called detectability. If we are interested not only in the asymptotic behavior but also in the transient (as for output feedback), a property stronger than detectability is needed. In particular, instantaneous distinguishability (see section "Observers Based on Instantaneous Distinguishability") is necessary if we want to be able to impose the decay rate of the function $\beta^c_{x,\xi,t}$.

Necessity of $m \ge n - p$: For each $t$, there exists a subset $X_a(t)$ of the model state space, supposed to collect the model states which can be asymptotically estimated, and such that we can associate, to each of its points $x$, a set $\Xi_i(x, t)$ allowing us to redefine the set $Z_a(t)$ as

$$Z_a(t) = \{(x, \xi) : x \in X_a(t) \;\&\; \xi \in \Xi_i(x, t)\}.$$

This implies that for each $t$ and each $x$ in $X_a(t)$, there is a point $\xi$ satisfying

$$x = \psi(\xi, h(x, t), t). \qquad (9)$$

This is a surjectivity property of the function $\psi$, but of a special kind, since $h(x, t)$ is an argument of $\psi$. We say that, for each $t$, the function $\psi$ is surjective to $X_a(t)$ given $h$. In a "generic" situation this property requires the dimension $m$ of the observer state to be larger than or equal to the dimension $n$ of the model state $x$ minus the dimension $p$ of the measurement $y$.

$\psi$ Is Injective Given h

We consider now the case where the observer has been designed with a function $\psi$ which is injective given $h$; namely, we have the following implication, when $x$ is in $X_a(t)$:

$$\left[\psi(\xi_1, h(x, t), t) = \psi(\xi_2, h(x, t), t) \;\&\; \xi_1, \xi_2 \in \Xi_i(x, t)\right] \implies \xi_1 = \xi_2.$$

In a "generic" situation, this property, together with the surjectivity given $h$, implies that the dimension $m$ of the observer state should be between $n - p$ and $n$.

If a convergent observer has such a function $\psi$, then $(x, t) \mapsto \Xi_i(x, t)$, which is (of course) a single-valued function, admits a Lie derivative

$$L_f \Xi_i(x, t) = \lim_{dt \to 0} \frac{\Xi_i(X(x, t, t + dt), t + dt) - \Xi_i(x, t)}{dt}$$

satisfying

$$L_f \Xi_i(x, t) = \varphi(\Xi_i(x, t), h(x, t), t) \quad \forall x \in X_a(t). \qquad (10)$$

This says (very approximately) that $\varphi$ is nothing but the image of the vector field $f$ under the change of coordinates $(x, t) \mapsto (\Xi_i(x, t), t)$, but again all this given $h$. As partly obtained in the optimization approach, the observer dynamics are then a copy of the model dynamics, with maybe a correction term which is zero when the estimated state reproduces the measurement.

If moreover the functions $h$ and $\psi$ are uniformly continuous in $x$ and $\xi$, respectively, then, given $\xi_1$ and $\xi_2$, a distance between $\Xi((x, \xi_1), t, s)$ and $\Xi((x, \xi_2), t, s)$ goes to zero as $s$ goes to infinity. This property is related to what was called extreme stability (see Yoshizawa 1966) in the 1950s and 1960s and is called incremental stability today (see Angeli 2002). It holds when, denoting by $\Xi_y(\xi, t, s)$ the solution at time $s$ of the observer dynamics
$$\dot\xi(t) = \varphi(\xi(t), y(t), t)$$

going through $\xi$ at time $t$ and under the action of $y$, the flow $\xi \mapsto \Xi_y(\xi, t, s)$ is a strict contraction (see Jouffroy (2005) for a bibliography on contraction) for each $s > t$ or, at least, if a distance between any two solutions $\Xi_y(\xi_1, t, s)$ and $\Xi_y(\xi_2, t, s)$, with the same input $y$, converges to 0.

Sufficient Conditions

Knowing now how a convergent observer should look, we move to a quick description of some such observers.

$$\dot\xi(t) = \varphi(\xi(t), y(t), t) = A\,\xi(t) + B(y(t), t), \quad \text{where} \quad L_f \Xi_i(x, t) = A\,\Xi_i(x, t) + B(h(x, t), t). \qquad (11)$$

To obtain a convergent observer, it is then sufficient that there exists a (uniformly continuous) function $\psi$ satisfying

$$x = \psi(\Xi_i(x, t), h(x, t), t).$$

For this to be possible, the function $\Xi_i$ should be injective given $h$. This injectivity holds when the observer state has dimension $m \ge 2(n + 1)$, the model is distinguishable, and provided the eigenvalues of $A$ have a sufficiently negative real part and are not in a set of zero Lebesgue measure. Unfortunately, we are facing again a possible difficulty in the implementation, since an expression for a function $\Xi_i$ satisfying (11) is needed and the function $\psi : (\xi, y, t) \mapsto \hat x(t)$ is known implicitly only via $\xi(t) = \Xi_i(x(t), t)$. See Andrieu and Praly (2006), Luenberger (1964), and Shoshitaishvili (1990).

Observers Based on Instantaneous Distinguishability

Instantaneous distinguishability means that we can distinguish as quickly as we want two model states by looking at the paths of the measurements they generate. A sufficient condition to have this property can be obtained by looking at the Taylor expansion in $s$ of $h(X(x, t, s), s)$. Indeed, we have […], where $h_i$ is a function obtained recursively. If there exists an integer $m$ such that, in some uniform way with respect to $t$, the function

$$x \mapsto H_m(x, t) = (h_0(x, t), \ldots, h_{m-1}(x, t))$$

is injective, then we do have instantaneous distinguishability. We say the system is differentially observable of order $m$ when this injectivity property holds. When a system has such a property, the model state space has a very specific structure, as discussed in Isidori (1995, Section 1.9). It means that we can reconstruct $x$ from the knowledge of $y$ and its $m - 1$ first time derivatives, i.e., there exists a function $\Phi$ such that we have

$$x = \Phi(H_m(x, t), t).$$

This way, we are left with estimating the derivatives of $y$. This can be done as follows. With the notation $\xi_i = h_{i-1}(x, t)$, we obtain:
$$\dot\xi(t) = F\,\xi(t) + G\, h_m(\Phi(\xi(t), t), t),$$

where $y = h(x, t)$ and

$$F\xi = (\xi_2, \ldots, \xi_m, 0)^T, \qquad G = (0, \ldots, 0, 1)^T.$$

When the last term on the right-hand side is Lipschitz, we can find a convergent observer in the form

$$\dot{\hat\xi}(t) = F\,\hat\xi(t) + G\, h_m(\hat x(t), t) + K\,(y(t) - \hat\xi_1(t)), \qquad \hat x(t) = \bar\Phi(\hat\xi(t), t),$$

with $\hat\xi$ being actually an estimation of $\xi$, where $K$ is a constant matrix and $\bar\Phi$ is a modified version of $\Phi$ keeping the estimated state in its a priori given set $X(t)$. This is the high-gain observer paradigm. See Gauthier and Kupka (2001) and Tornambe (1988). The implementation difficulty is in the function $\bar\Phi$, not to mention sensitivity to measurement uncertainty.

Observers with $\psi$ Bijective Given h

Case Where $\psi$ Is the Identity Function: A convergent observer whose function $\psi$ is the identity has the following form:

$$\dot\xi(t) = f(\xi(t), t) + E(\{\tau \mapsto y(\tau)\}, \xi(t), y(t), t), \qquad \hat x(t) = \xi(t). \qquad (12)$$

The only piece remaining to be designed is the correction term $E$. It has to ensure convergence and maybe also other properties like symmetry preserving (see Bonnabel et al. 2008). For this design, a first step is to exhibit some specific properties of the vector field $f$ by writing it in appropriate coordinates. For example, there may exist coordinates such that the expression of $f$ takes the form $f(x(t), h(x, t), t)$ and the corresponding observer (12) is such that there exists a positive definite matrix $P$ for which the function

$$s \mapsto \big(X(x, t, s) - \hat X((x, \hat x), t, s)\big)^T P \,\big(X(x, t, s) - \hat X((x, \hat x), t, s)\big)$$

is strictly decaying (if not zero). A necessary condition for this to be possible is that $f$ is monotonic tangentially to the level sets of the function $h$, i.e., for all $(x, y, v, t)$ satisfying $\frac{\partial h}{\partial x}(x, t)\, v = 0$, we have:

$$v^T P\, \frac{\partial f}{\partial x}(x, y, t)\, v \le 0. \qquad (13)$$

This is another way of expressing a detectability condition. This expression is coordinate dependent, hence the importance of choosing the coordinates properly. When this condition is strict and uniform in $t$, it is sufficient to get a locally convergent observer, and even a nonlocal one when $h$ is linear in $x$, i.e., $h(x, t) = H(t)x$, again a coordinate-dependent condition. In this latter case the observer takes the form

$$\dot\xi(t) = f(\xi(t), y(t), t) + \ell(\xi(t))\, P^{-1} H(t)^T \big[y(t) - H(t)\,\xi(t)\big], \qquad \hat x(t) = \xi(t),$$

where $\ell$ is a real function to be chosen with sufficiently large values. If (13) is strict and uniform and holds for all $v$, the correction term is not needed. There are many other results of this type, exploiting one or another specificity of the dependence on $x$ of the function $f$ – monotonicity, convexity, …. See Fan and Arcak (2003), Krener and Isidori (1983), Respondek et al. (2004), Sanfelice and Praly (2012), ….

Case Where $(x, t) \mapsto (\Xi_i(x, t), h(x, t), t)$ Is a Diffeomorphism: At each time $t$ we know already that the model state $x$ we want to estimate satisfies $y(t) = h(x, t)$. So, as remarked in Luenberger (1964), when $(h(x, t), t)$ can be used as part of coordinates for $(x, t)$, we need to estimate the remaining part only. This can be done if we find a function $\Xi_i$, whose values are $n - p$ dimensional, such that $(x, t) \mapsto (y, \xi, t) = (h(x, t), \Xi_i(x, t), t)$ is a diffeomorphism and the flow $\xi \mapsto \Xi_y(\xi, t, s)$ generated by

$$\dot\xi(t) = \frac{\partial \Xi_i}{\partial x}(x(t), t)\, f(x(t), t) + \frac{\partial \Xi_i}{\partial t}(x(t), t) = \varphi(\xi(t), y(t), t)$$
is a strict contraction for all $s > t$. Indeed, in this case the observer dynamics can be chosen as

$$\dot\xi(t) = \varphi(\xi(t), y(t), t),$$

and the estimate $\hat x(t)$ is obtained as the solution of

$$\Xi_i(\hat x(t), t) = \xi(t), \qquad h(\hat x(t), t) = y(t).$$

This is the reduced-order observer paradigm. See, for instance, Besançon (2000, Proposition 3.2), Carnevale et al. (2008), and Luenberger (1964, Theorem 4).

Cross-References

Differential Geometric Methods in Nonlinear Control
Observers in Linear Systems Theory
Regulation and Tracking of Nonlinear Systems

Bibliography

Alamir M (2007) Nonlinear moving horizon observers: theory and real-time implementation. In: Nonlinear observers and applications. Lecture notes in control and information sciences. Springer, Berlin/New York
Andrieu V, Praly L (2006) On the existence of Kazantzis-Kravaris/Luenberger observers. SIAM J Control Optim 45(2):432–456
Angeli D (2002) A Lyapunov approach to incremental stability properties. IEEE Trans Autom Control 47(3):410–421
Arulampalam M, Maskell S, Gordon N, Clapp T (2002) A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking. IEEE Trans Signal Process 50(2):174–188
Bain A, Crisan D (2009) Fundamentals of stochastic filtering. Stochastic modelling and applied probability, vol 60. Springer, New York/London
Başar T, Bernhard P (1995) H∞ optimal control and related minimax design problems: a dynamic game approach, revised 2nd edn. Birkhäuser, Boston
Bertsekas D, Rhodes IB (1971) On the minimax reachability of target sets and target tubes. Automatica 7:233–247
Besançon G (2000) Remarks on nonlinear adaptive observer design. Syst Control Lett 41(4):271–280
Bonnabel S, Martin P, Rouchon P (2008) Symmetry-preserving observers. IEEE Trans Autom Control 53(11):2514–2526
Bucy R, Joseph P (1987) Filtering for stochastic processes with applications to guidance, 2nd edn. Chelsea Publishing Company, Chelsea
Candy J (2009) Bayesian signal processing: classical, modern, and particle filtering methods. Wiley series in adaptive learning systems for signal processing, communications and control. Wiley, Hoboken
Carnevale D, Karagiannis D, Astolfi A (2008) Invariant manifold based reduced-order observer design for nonlinear systems. IEEE Trans Autom Control 53(11):2602–2614
Cox H (1964) On the estimation of state variables and parameters for noisy dynamic systems. IEEE Trans Autom Control 9(1):5–12
Fan X, Arcak M (2003) Observer design for systems with multivariable monotone nonlinearities. Syst Control Lett 50:319–330
Gauthier J-P, Kupka I (2001) Deterministic observation theory and applications. Cambridge University Press, Cambridge/New York
Isidori A (1995) Nonlinear control systems, 3rd edn. Springer, Berlin/New York
Jazwinski A (2007) Stochastic processes and filtering theory. Dover, Mineola
Jouffroy J (2005) Some ancestors of contraction analysis. In: Proceedings of the IEEE conference on decision and control, Seville, pp 5450–5455
Krener A, Isidori A (1983) Linearization by output injection and nonlinear observers. Syst Control Lett 3(1):47–52
Luenberger D (1964) Observing the state of a linear system. IEEE Trans Mil Electron MIL-8:74–80
Milanese M, Norton J, Piet-Lahanier H, Walter E (eds) (1996) Bounding approaches to system identification. Plenum Press, New York
Respondek W, Pogromsky A, Nijmeijer H (2004) Time scaling for observer design with linearizable error dynamics. Automatica 40:277–285
Sanfelice R, Praly L (2012) Convergence of nonlinear observers on ℝⁿ with a Riemannian metric (Part I). IEEE Trans Autom Control 57(7):1709–1722
Shoshitaishvili A (1990) Singularities for projections of integral manifolds with applications to control and observation problems. In: Arnol'd VI (ed) Advances in Soviet mathematics. Theory of singularities and its applications, vol 1. American Mathematical Society, Providence
Tornambe A (1988) Use of asymptotic observers having high gains in the state and parameter estimation. In: Proceedings of the IEEE conference on decision and control, Austin, pp 1791–1794
Willems JC (2004) Deterministic least squares filtering. J Econom 118:341–373
Witsenhausen H (1966) Minimax control of uncertain systems. Elec. Syst. Lab. M.I.T. Rep. ESL-R-269M, Cambridge, May 1966
Observers in Linear Systems Theory 943

Yoshizawa T (1966) Stability theory by Lyapunov's second method. The Mathematical Society of Japan, Tokyo

Observers in Linear Systems Theory

A. Astolfi¹,² and Panos J. Antsaklis³
¹ Department of Electrical and Electronic Engineering, Imperial College London, London, UK
² Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma Tor Vergata, Roma, Italy
³ Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN, USA

Abstract

Observers are dynamical systems which process the input and output signals of a given dynamical system and deliver an online estimate of the internal state of the given system which asymptotically converges to the exact value of the state. For linear, finite-dimensional, time-invariant systems, observers can be designed provided a weak observability property, known as detectability, holds.

Keywords

[…]

Consider the system (1) […] with $x(t) \in \mathbb{R}^n$, $u(t) \in \mathbb{R}^m$, $y(t) \in \mathbb{R}^p$, and $A$, $B$, $C$, and $D$ matrices of appropriate dimensions and with constant entries, and the problem of estimating its state from measurements of the input and output signals. In Eq. (1), $x^+(t)$ stands for $\dot x(t)$, if the system is continuous-time, and for $x(t+1)$, if the system is discrete-time. In addition, if the system is continuous-time, then $t \in \mathbb{R}^+$, i.e., the set of nonnegative real numbers, whereas if the system is discrete-time, then $t \in \mathbb{Z}^+$, i.e., the set of nonnegative integers.

We are interested in determining an online estimate $x_e(t) \in \mathbb{R}^n$, i.e., the estimate at time $t$ has to be a function of the available information (input and output) at the same time instant. This implies that the estimate is generated by means of a device (known as a filter) processing the current input and output of the system and generating a state estimate. The filter may be instantaneous, i.e., the estimate is generated instantaneously by processing the available information. In this case we have a static filter. Alternatively, the state estimate can be generated by processing the available information through a dynamical device. In this case we have a dynamic filter.

Assume, for simplicity, that $D = 0$. This assumption is without loss of generality. In fact, if $y = Cx + Du$ and $u$ are measurable, then also $\tilde y = Cx$ is measurable. Assume, in addition, that the filter which generates the online estimate is linear, finite-dimensional, and time-invariant. Then we may have the following two configurations:

• Static filter. The state estimate is generated via the relation […].
• Dynamic filter. The state estimate is generated via the relations

$$\xi^+(t) = F\xi(t) + Ly(t) + Hu(t), \qquad x_e(t) = M\xi(t) + Ny(t) + Pu(t), \qquad (4)$$

with $F$, $L$, $H$, $M$, $N$, and $P$ constant matrices of appropriate dimensions. The resulting interconnected system is described by the equations

$$x^+ = Ax + Bu, \qquad \xi^+ = F\xi + LCx + Hu, \qquad x_e = M\xi + NCx + Pu. \qquad (5)$$

In what follows we study in detail the dynamic filter configuration. This is mainly due to the fact that this configuration allows us to solve most estimation problems for linear systems. Moreover, while the use of a static filter is very appealing, it provides a useful alternative only in very specific situations.

State Observer

A state observer is a filter that allows one to estimate, asymptotically or in finite time, the state of a system from measurements of the input and output signals. The simplest possible observer can be constructed by considering a copy of the system the state of which has to be estimated. This means that a candidate observer for system (1) is given by

$$\xi^+ = A\xi + Bu, \qquad x_e = \xi. \qquad (6)$$

To assess the properties of this candidate state observer, let $e = x - x_e$ be the estimation error and note that $e^+ = Ae$. As a result, if $e(0) = 0$, then $e(t) = 0$ for all $t$ and for any input signal $u$. However, if $e(0) \ne 0$, then, for any input signal $u$, $e(t)$ is bounded only if the system (1) is stable, and converges to zero only if the system (1) is asymptotically stable. If these conditions do not hold, the estimation error is not bounded and system (6) does not qualify as a state observer for system (1). The intrinsic limitation of the observer (6) is that it does not use all the available information, i.e., it does not use the knowledge of the output signal $y$. This observer is therefore an open-loop observer.

To exploit the knowledge of $y$, we modify the observer (6) by adding a term which depends upon the available information on the estimation error, which is given by $y_e = Cx_e - y$. This modification yields a candidate state observer described by

$$\xi^+ = A\xi + Bu + Ly_e, \qquad x_e = \xi. \qquad (7)$$

To assess the properties of this candidate state observer, note that $e = x - x_e$ is such that

$$e^+ = (A + LC)\,e. \qquad (8)$$

The matrix $L$ (known as the output injection gain) can be used to shape the dynamics of the estimation error. In particular, we may select $L$ to assign the characteristic polynomial $p(s)$ of $A + LC$. To this end, note that

$$p(s) = \det(sI - (A + LC)) = \det(sI - (A' + C'L')).$$

Hence, there is a matrix $L$ which arbitrarily assigns the characteristic polynomial of $A + LC$ if and only if the system

$$\xi^+ = A'\xi + C'v$$

is reachable or, equivalently, if and only if the system (1) is observable. We summarize the above discussion with two formal statements.

Proposition 1 Consider system (1) and suppose the system is observable. Let $p(s)$ be a monic polynomial of degree $n$. Then there is a matrix $L$ such that the characteristic polynomial of $A + LC$ is equal to $p(s)$. Note that for single-output systems, the matrix $L$ assigning the characteristic polynomial of $A + LC$ is unique.

Proposition 2 System (1) is observable if and only if it is possible to arbitrarily assign the eigenvalues of $A + LC$.
  ẇ − x̂̇2 = F w + H v + G u + L(Ã11 x̂1 + Ã12 x̂2 + B̃1 u) − (Ã21 x̂1 + Ã22 x̂2 + B̃2 u)
          = F w + (H + LÃ11 − Ã21) x̂1 + (LÃ12 − Ã22) x̂2 + (G + LB̃1 − B̃2) u.   (9)

To have convergence of the estimation error to zero, regardless of the initial conditions and of the input signal, we must have

  (w − x̂2)˙ = F (w − x̂2)   (10)

and F must have all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Comparing Eqs. (9) and (10), we obtain that the matrices F, H, G, and L must be such that

  Ã22 − LÃ12 = F,
  H + LÃ11 − Ã21 = FL,
  G + LB̃1 − B̃2 = 0.

We now show how the previous equations can be solved and how the stability condition on F can be enforced. Detectability of the system implies that the reduced system ξ̇ = Ã22 ξ with output ỹ = Ã12 ξ is detectable. As a result, there exists a matrix L such that the matrix

  F = Ã22 − LÃ12

has all eigenvalues with negative real part, in the case of continuous-time systems, or with modulus smaller than one, in the case of discrete-time systems. Then the remaining equations are solved by

  H = FL − LÃ11 + Ã21,
  G = −LB̃1 + B̃2.

Finally, from x̂1 = v and the estimate w of x̂2, we build an estimate xe of the state x by inverting the transformation T, i.e.,

  [x1e]   [I  C2] [v]
  [x2e] = [0  I ] [w].

Summary and Future Directions

The problem of estimating the state of a linear system from input and output measurements can be solved provided a weak observability condition holds. The problem addressed in this entry is the simplest possible estimation problem: the underlying system is linear and all variables are exactly measured. Observers for nonlinear systems and in the presence of signals corrupted by noise can also be designed exploiting some of the basic ingredients discussed in this entry, such as the notions of error system and of output injection.

Cross-References

  Controllability and Observability
  Estimation, Survey on
  Hybrid Observers
  Kalman Filters
  Linear Systems: Continuous-Time, Time-Invariant State Variable Descriptions
  Linear Systems: Discrete-Time, Time-Invariant State Variable Descriptions
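The construction above can be exercised numerically. The sketch below uses a hypothetical third-order example of our own (the matrices and observer poles are not from the entry): L is obtained by pole placement on the pair (Ã22ᵀ, Ã12ᵀ), the observer matrices follow from the matching conditions, and with these conditions the combination w + Lv converges to x2.

```python
import numpy as np
from scipy.signal import place_poles

# Partitioned dynamics in the transformed coordinates, with x1 measured
# (v = x1).  Hypothetical example: x1 scalar, x2 in R^2.
A11 = np.array([[0.0]]);         A12 = np.array([[1.0, 0.0]])
A21 = np.array([[0.0], [1.0]]);  A22 = np.array([[0.0, 1.0], [-2.0, -1.0]])
B1  = np.array([[0.0]]);         B2  = np.array([[0.0], [1.0]])

# F = A22 - L*A12 must be Hurwitz: pole placement on the transposed pair
# (A22^T, A12^T), possible here since (A12, A22) is detectable.
L = place_poles(A22.T, A12.T, [-4.0, -5.0]).gain_matrix.T

# Observer matrices from the matching conditions in the text.
F = A22 - L @ A12
H = F @ L - L @ A11 + A21
G = B2 - L @ B1

# Simulate plant and observer (forward Euler), input u(t) = sin(t).
dt, T = 1e-3, 10.0
x = np.array([1.0, 0.5, -0.5])   # true state [x1, x2]
w = np.zeros(2)                  # observer state
for k in range(int(T / dt)):
    u = np.sin(k * dt)
    x1, x2 = x[:1], x[1:]
    dw  = F @ w + H @ x1 + G.ravel() * u
    dx1 = A11 @ x1 + A12 @ x2 + B1.ravel() * u
    dx2 = A21 @ x1 + A22 @ x2 + B2.ravel() * u
    w = w + dt * dw
    x = np.concatenate([x1 + dt * dx1, x2 + dt * dx2])

x2_hat = w + L @ x[:1]   # in this sketch, w + L v serves as the estimate of x2
```

The error e = w + Lx1 − x2 obeys ė = Fe exactly, so the estimate converges at the rate set by the assigned poles.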
Optimal Control and Mechanics
Abstract

There are very natural close connections between mechanics and optimal control, as both involve variational problems. This is a huge subject and we just touch on some interesting connections here. A survey and history may be found in Sussmann and Willems (1997). Other aspects may be found in Bloch (2003).

Keywords

Nonholonomic integrator; Sub-Riemannian optimal control; Variational problems

  (d/dt) ∂Λ/∂q̇ (q, q̇) − ∂Λ/∂q (q, q̇) = 0,   Φ(q, q̇) = 0.   (2)

We can rewrite this equation in Hamiltonian form and show that the resulting equations are equivalent to the equations of motion given by the maximum principle for a suitable optimal control problem. Set p = ∂Λ/∂q̇ (q, q̇) and consider this equation together with the constraints Φ(q, q̇) = 0. We can solve these two equations for q̇ under suitable conditions, as discussed in Bloch (2003). We obtain the standard Hamiltonian equations with H(q, p) = p q̇(q, p) − L(q, q̇(q, p)). We now compare this to the optimal control problem
We assume that the system satisfies the accessibility rank condition and is thus controllable, since there is no drift term. Then we can pose the optimal control problem

  min_{u(·)} ∫₀ᵀ (1/2) Σᵢ₌₁ᵐ uᵢ²(t) dt,   (9)

where A1(x, y) and A2(x, y) are smooth functions of x and y. Taking A1 = −y and A2 = x recovers the Heisenberg/nonholonomic integrator equations. More generally we get the flow of a particle in a magnetic field; it is not hard to carry out the optimal control analysis to see this. Details are in Bloch (2003).
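The nonholonomic (Heisenberg) integrator mentioned above can be simulated directly. In the sketch below (the loop control and step count are our own choices, not the optimal controls of problem (9)), a circular input loop of radius r returns (x, y) to the origin while z accumulates twice the enclosed area, 2πr².

```python
import numpy as np

# Heisenberg/nonholonomic integrator: xdot = u1, ydot = u2,
# zdot = x*u2 - y*u1.  Drive (x, y) around a circle of radius r with
# u1 = -r sin t, u2 = r cos t on [0, 2*pi]; the loop closes in (x, y)
# while z picks up twice the enclosed area: z(2*pi) = 2*pi*r^2.
r = 0.5
n = 100_000
dt = 2.0 * np.pi / n
x = y = z = 0.0
for k in range(n):
    t = k * dt
    u1, u2 = -r * np.sin(t), r * np.cos(t)
    dx, dy, dz = u1, u2, x * u2 - y * u1   # forward Euler derivatives
    x, y, z = x + dt * dx, y + dt * dy, z + dt * dz
```

The net displacement in z produced by a closed loop in (x, y) is the geometric phase characteristic of this sub-Riemannian system.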
the data for which comprise a number T > 0, functions f : [0, T] × Rⁿ × Rᵐ → Rⁿ, L : [0, T] × Rⁿ × Rᵐ → R and g : Rⁿ × Rⁿ → R, and sets C ⊂ Rⁿ × Rⁿ and Ω ⊂ Rᵐ.

It is assumed that the set C has the functional inequality and equality constraint set representation

  C = {(x0, x1) ∈ Rⁿ × Rⁿ : φᵢ(x0, x1) ≤ 0 for i = 1, 2, …, k1 and
       ψᵢ(x0, x1) = 0 for i = 1, 2, …, k2},   (1)

in which φᵢ : Rⁿ × Rⁿ → R, i = 1, …, k1, and ψᵢ : Rⁿ × Rⁿ → R, i = 1, …, k2, are given functions.

A control function is a measurable function u(·) : [0, T] → Rᵐ satisfying u(t) ∈ Ω a.e. t ∈ [0, T]. A state trajectory x(·) associated with a control function u(·) is a solution to the differential equation ẋ(t) = f(t, x(t), u(t)). A pair of functions (x(·), u(·)) comprising a control function u(·) and an associated state trajectory x(·) satisfying the condition (x(0), x(T)) ∈ C is a feasible process. A feasible process (x̄(·), ū(·)) which achieves the minimum of J(x(·), u(·)) over all feasible processes is called a minimizer. Frequently, the initial state is fixed, i.e., C takes the form

  C = {x0} × C1 for some x0 ∈ Rⁿ and some C1 ⊂ Rⁿ.

… derivation of general necessary conditions of optimality.

The Maximum Principle

The centerpiece of optimal control theory is a set of conditions that a minimizer (x̄(·), ū(·)) must satisfy, known as Pontryagin's Maximum Principle or, simply, the Maximum Principle. It came to prominence through a 1961 book, which appeared in English translation as Pontryagin et al. (1962). It bears the name of L. S. Pontryagin because of his role as leader of the research group at the Steklov Institute, Moscow, which achieved this advance. But the first proof is attributed to Boltyanskii. For given λ ≥ 0, define the Hamiltonian function Hλ : [0, T] × Rⁿ × Rⁿ × Rᵐ → R,

  Hλ(t, x, p, u) := pᵀ f(t, x, u) − λ L(t, x, u).

Theorem 1 (The Maximum Principle) Let (x̄(·), ū(·)) be a minimizer for (P). Assume that the following hypotheses are satisfied:
(i) g is continuously differentiable.
(ii) φᵢ, i = 1, …, k1, and ψᵢ, i = 1, …, k2, are continuously differentiable.
(iii) With f̃(t, x, u) = (L(t, x, u), f(t, x, u)), f̃(·, ·, ·) is continuous, f̃(t, ·, u) is continuously differentiable for each (t, u), and there exist ε > 0 and k(·) ∈ L¹ such that …
The Adjoint Equation:

  −ṗ(t) = (∂f/∂x)ᵀ(t, x̄(t), ū(t)) p(t) − λ (∂L/∂x)ᵀ(t, x̄(t), ū(t)), a.e.

The Maximization of the Hamiltonian Condition:

  Hλ(t, x̄(t), p(t), ū(t)) = max_{u∈Ω} Hλ(t, x̄(t), p(t), u) a.e.

The Transversality Condition:

  (pᵀ(0), −pᵀ(T)) = λ ∇g(x̄(0), x̄(T)) + Σᵢ₌₁^{k1} αᵢ ∇φᵢ(x̄(0), x̄(T)) + Σᵢ₌₁^{k2} βᵢ ∇ψᵢ(x̄(0), x̄(T)),

and αᵢ = 0 for all i ∈ {1, …, k1} such that φᵢ(x̄(0), x̄(T)) < 0; in which …

… the differential equations for the pᵢ(·)'s is, first, to construct the Hamiltonian Hλ(t, x, p, u) = pᵀ f(t, x, u) − λ L(t, x, u) and, second, to use the fact that the i-th component pᵢ(·) of the costate p(t) = [p1(t), …, pn(t)]ᵀ satisfies the equation:

  ṗᵢ(t) = −(∂/∂xᵢ) Hλ(t, x̄(t), p(t), ū(t)) for i = 1, …, n.

The preceding equations are of course merely a component-wise statement of the costate equation above. In many applications the endpoint constraints take the form

  xᵢ(0) = ξᵢ⁰ for i ∈ J0 and xᵢ(0) ∈ R for i ∉ J0,
  xᵢ(T) = ξᵢ¹ for i ∈ J1 and xᵢ(T) ∈ R for i ∉ J1,

for given index sets J0, J1 ⊂ {1, …, n} and numbers ξᵢ⁰ for i ∈ J0 and ξᵢ¹ for i ∈ J1, i.e., the endpoints of each state trajectory component are either "fixed" or "free." In such cases the rules for setting up the boundary conditions on the pᵢ(·)'s are …
for given L : [0, T] × Rⁿ × Rⁿ → R and (a, b) ∈ Rⁿ × Rⁿ. This problem is a special case of (P) in which f(t, x, u) = u, Ω = Rⁿ, k1 = 0, k2 = 2n and

  (ψ¹(x0, x1), …, ψⁿ(x0, x1); ψⁿ⁺¹(x0, x1), …, ψ²ⁿ(x0, x1)) = (x0ᵀ − aᵀ, x1ᵀ − bᵀ).

It is a straightforward exercise to deduce from the Maximum Principle, in this special case, that a minimizer satisfies the classical Euler–Lagrange and Weierstrass conditions and also that the minimizer and associated costate arc satisfy Hamilton's system of equations, under an additional uniform convexity hypothesis on L(t, x, ·). Thus, the Maximum Principle unifies many of the classical necessary conditions from the calculus of variations and, furthermore, validates them under reduced hypotheses. But it has far-reaching implications, beyond these conditions, because it allows the presence of pathwise constraints on the velocities, expressed in terms of a controlled differential equation and a control constraint set, which are encountered in engineering design, econometrics, and other areas.

The Hamiltonian System

In favorable circumstances, we are justified in setting the cost multiplier λ = 1 and, furthermore, the maximization of the Hamiltonian condition permits us, for each t, to express u as a function of x and p:

  u = u*(t, x, p).

The Maximum Principle now asserts that a minimizing arc x̄(·) is the first component of a pair of absolutely continuous functions (x̄(·), p(·)) satisfying Hamilton's system of equations:

  (−ṗᵀ(t), ẋ̄ᵀ(t)) = ∇ₓₚ H¹(t, x̄(t), p(t), u*(t, x̄(t), p(t))) a.e.,   (4)

in which ∇ₓₚ H¹ denotes the gradient of H¹(t, x, p, u) w.r.t. the vector [xᵀ, pᵀ]ᵀ variable for fixed (t, u), together with the endpoint conditions

  (x̄(0), x̄(T)) ∈ C and
  (pᵀ(0), −pᵀ(T)) = ∇g(x̄(0), x̄(T)) + Σᵢ₌₁^{k1} αᵢ ∇φᵢ(x̄(0), x̄(T)) + Σᵢ₌₁^{k2} βᵢ ∇ψᵢ(x̄(0), x̄(T)),

for some nonnegative numbers {αᵢ} and numbers {βᵢ} satisfying

  αᵢ = 0 for all i ∈ {1, …, k1} such that φᵢ(x̄(0), x̄(T)) < 0,

where ∇g, ∇φᵢ and ∇ψᵢ, etc., are as defined in (2). The minimizing control satisfies the relation

  ū(t) = u*(t, x̄(t), p(t)).

Notice that the first-order vector differential equation (4) is a system of 2n scalar, first-order differential equations. Let us suppose that k̄1 inequality endpoint constraints are active at (x̄(0), x̄(T)). Then, satisfaction of the active constraints and the transversality condition imposes 2n + k̄1 + k2 conditions on the boundary values of (x̄(·), p(·)). Taking account of the fact, however, that there are k̄1 + k2 unknown endpoint multipliers, we see that the effective number of endpoint constraints accompanying the differential equation (4) is

  2n + k̄1 + k2 − (k̄1 + k2) = 2n.

Thus, the set of 2n scalar first-order differential equations (4) defining the "two-point boundary value problem" to determine (x̄, p) has the "right" number of endpoint conditions.
Optimal Control and Pontryagin's Maximum Principle
Free-Time Problems: Consider a variant on the "autonomous" case of problem (P) (L and f do not depend on t), call it (FT), in which the terminal time T is no longer fixed, but is a choice variable along with the control function and the initial state, and the cost function is

  J̃(T, x(·), u(·)) := ∫₀ᵀ L(x(t), u(t)) dt + g̃(T, x(0), x(T))

for some function g̃(·, ·, ·). Take a minimizer (T̄, x̄(·), ū(·)) for (FT). Assume, in addition to hypotheses (i)–(iii), that Ω is bounded and the function k(·) in (iii) is bounded. Then the Maximum Principle conditions (for data in which the end time is frozen at T = T̄) continue to be satisfied for some p(·) : [0, T̄] → Rⁿ and λ, including the constancy of the Hamiltonian condition

  Hλ(x̄(t), p(t), ū(t)) = c a.e. t ∈ [0, T̄]

for some constant c. But a new condition is required to reflect the extra degree of freedom in the new problem specification, namely, the free end time. This is an additional transversality condition involving the constant value c of the Hamiltonian:

Free Time Transversality Condition: c = −(∂/∂T) g̃(T̄, x̄(0), x̄(T̄)).

Other Refinements: Versions of the Maximum Principle are available to take account of pathwise functional inequality constraints on state variables ("pure" state constraints) and of both state and control variables ("mixed" constraints). Maximum Principle-like conditions have also been derived for optimal control problems in which the dynamic constraint takes the form of a retarded differential equation with control terms and in which the class of control functions is enlarged to include Dirac delta functions ("impulse" optimal control problems).

In early derivations of the Maximum Principle, it was assumed that the functions f(t, x, u) and L(t, x, u) were continuously differentiable with respect to the x variable. A major research endeavor since the early 1970s has been to find versions of the Maximum Principle that remain valid when the functions f(t, x, u) and L(t, x, u) satisfy merely a "bounded slope" or, synonymously, a Lipschitz continuity condition with respect to x. Such functions are "nonsmooth" in the sense that they can fail to be differentiable, in the conventional sense, at some points in their domains. An overview of the Maximum Principle would be incomplete without reference to such advances.

The search for nonsmooth optimality conditions is motivated by a desire to solve optimal control problems where, in particular, the function f(t, x, u) is a piecewise linear function of x (for fixed (t, u)). Such functions arise, for example, when f(t, x, u) is constructed empirically via a lookup table and linear interpolation. Nonsmooth cost integrands are encountered when they are constructed using "pointwise supremum" and/or "absolute value" operations. The function

  J(x(·)) = ∫₀¹ |x(t)| dt + max{x(1), 0},

which penalizes the L¹ norm of the state trajectory and the terminal value of the scalar state, but only if this is nonnegative, is a case in point.

When attempting to generalize the Maximum Principle to allow for nonsmooth data, we encounter the challenge of interpreting the adjoint equation, which can be written as

  −ṗ(t) = (∂/∂x) H(t, x̄(t), p(t), ū(t)),

in circumstances when the x-gradients of f and L are not defined, at least not in a conventional sense. One approach to dealing with this problem is via the Clarke generalized gradient ∂m of a function m : Rⁿ → R at a point x̄:
  ∂m(x̄) := co { ξ | there exist sequences xⁱ → x̄, ξⁱ → ξ such that, for each i, m(·) is Fréchet differentiable at xⁱ and ξⁱ = (∂/∂x) m(xⁱ) }.

Here, "co" means closed convex hull. In a landmark paper, Clarke (1976) proved a necessary condition commonly referred to as the nonsmooth Maximum Principle, in which the adjoint equation is replaced by a differential inclusion involving the (partial) generalized gradient.

  Minimize T
  over times T > 0, measurable functions u(·) : [0, T] → R, and
  absolutely continuous functions x(·) : [0, T] → R² such that
    ẋ(t) = A x(t) + b u(t) a.e.,
    u(t) ∈ Ω a.e.,
    (x1(0), x2(0)) = (1, 0) and (x1(T), x2(T)) = (0, 0),

where

  [ẋ1(t)]   [0 1] [x1(t)]   [0]
  [ẋ2(t)] = [0 0] [x2(t)] + [1] u(t).   (5)

The Maximum Principle relations are satisfied by

  ū(t) = −1 if t ∈ [0, 1), +1 if t ∈ [1, 2],
  p1(t) = −1 and p2(t) = −1 + t for t ∈ [0, 2].
The Maximum Principle is a necessary condition of optimality. Since a minimizer exists and since (T̄, x̄(·), ū(·), p(·)) is a unique solution to the Maximum Principle relations, it follows that (T̄, x̄(·), ū(·)) is the solution to the problem. This problem is amenable to simpler, more elementary, solution techniques. But the above solution is enlightening, because it highlights important generic features of the Maximum Principle. We see how the "maximization of the Hamiltonian condition" can be used to eliminate the control function and thereby to set up a two-point boundary value problem for x̄(·) and p(·) (a very nonclassical construction).

For versions of the Maximum Principle based on techniques of nonsmooth analysis, we refer to:

Clarke FH (1983) Optimization and nonsmooth analysis. Wiley-Interscience, New York
Clarke FH (2013) Functional analysis, calculus of variations and optimal control. Graduate texts in mathematics. Springer, London
Vinter RB (2000) Optimal control. Birkhäuser, Boston

Engineering texts illustrating the application of the Maximum Principle to solve problems of optimal control and design, in flight mechanics and other areas, include:

Bryson AE, Ho Y-C (1975) Applied optimal control (revised edn). Halstead Press (a division of John Wiley and Sons), New York
Bryson AE (1999) Dynamic optimization. Addison Wesley Longman, Menlo Park

Optimal Control and the Dynamic Programming Principle
  Jx(α) = ∫₀^∞ g(yx(s; α), α(s)) e^{−λs} ds   (3)

for a given λ > 0. The function g represents the running cost and λ is the discount factor, which can be used to take into account the reduced value, at the initial time, of future costs. From a technical point of view, the presence of the discount factor ensures that the integral is finite whenever g is bounded. Note that one can also consider the undiscounted problem (λ = 0) provided the integral is still finite. The goal of optimal control is to find an optimal pair (y*, α*) that minimizes the cost functional. If we seek optimal controls in open-loop form, i.e., as functions of t, then the Pontryagin Maximum Principle furnishes necessary conditions for a pair (y*, α*) to be optimal.

A major drawback of an open-loop control is that, being constructed as a function of time, it cannot take into account errors in the true state of the system, due, for example, to model errors or external disturbances, which may take the evolution far from the optimal forecasted trajectory. Another limitation of this approach is that a new computation of the control is required whenever the initial state is changed. For these reasons, we are interested in the so-called feedback controls, that is, controls expressed as functions of the state of the system. Under feedback control, if the system trajectory is perturbed, the system reacts by changing its control strategy according to the change in the state. One of the main motivations for using the DPP is that it yields solutions to optimal control problems in the form of feedback controls.

DPP for the Infinite Horizon Problem

The starting point of dynamic programming is to introduce an auxiliary function, the value function, which for our problem is

  v(x) = inf_{α∈A} Jx(α),   (4)

where, as above, x is the initial position of the system. The value function has a clear meaning: it is the optimal cost associated with the initial position x. This is a reference value which can be useful to evaluate the efficiency of a control: if Jx(ᾱ) is close to v(x), this means that ᾱ is "efficient."

Bellman's dynamic programming principle provides a first characterization of the value function.

Proposition 1 (DPP for the infinite horizon problem) Under the assumptions of Theorem 1, for all x ∈ R^d and τ > 0,

  v(x) = inf_{α∈A} { ∫₀^τ g(yx(s; α), α(s)) e^{−λs} ds + e^{−λτ} v(yx(τ; α)) }.   (5)

Proof Denote by v̄(x) the right-hand side of (5). First, we remark that for any x ∈ R^d and ᾱ ∈ A,

  Jx(ᾱ) = ∫₀^∞ g(ȳ(s), ᾱ(s)) e^{−λs} ds
        = ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + ∫_τ^∞ g(ȳ(s), ᾱ(s)) e^{−λs} ds
        = ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + e^{−λτ} ∫₀^∞ g(ȳ(s + τ), ᾱ(s + τ)) e^{−λs} ds
        ≥ ∫₀^τ g(ȳ(s), ᾱ(s)) e^{−λs} ds + e^{−λτ} v(ȳ(τ))

(here, yx(s; ᾱ) is abbreviated as ȳ(s)). Taking the infimum over all trajectories, first on the right-hand side and then on the left of this inequality, yields

  v(x) ≥ v̄(x).   (6)

To prove the opposite inequality, we recall that v̄ is defined as an infimum, and so, for any x ∈ R^d and ε > 0, there exists a control ᾱε (and the corresponding evolution ȳε) such that
  v̄(x) + ε ≥ ∫₀^τ g(ȳε(s), ᾱε(s)) e^{−λs} ds + e^{−λτ} v(ȳε(τ)).   (7)

On the other hand, the value function v being also defined as an infimum, … (Note that the control α̂(·) obtained in this way is still measurable.) Since ε is arbitrary, (9) finally yields v(x) ≤ v̄(x).

We observe that this proof crucially relies on the fact that the control defined by (10) still belongs to A, being a measurable control. The possibility of obtaining an admissible control by joining together two different measurable controls is known as the concatenation property.

The Hamilton–Jacobi–Bellman Equation

The DPP can be used to characterize the value function in terms of a nonlinear partial differential equation. In fact, let α* ∈ A be the optimal control, and y* the associated evolution (to simplify, we are assuming that the infimum is a minimum). Then,

  v(x) = ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds + e^{−λτ} v(y*(τ)),

that is,

  v(x) − e^{−λτ} v(y*(τ)) = ∫₀^τ g(y*(s), α*(s)) e^{−λs} ds.

Dividing by τ and letting τ → 0⁺ leads to the following, where we have assumed that α*(·) is continuous at 0. Then, we can conclude

  λ v(x) − Dv(x) · f(x, a*) − g(x, a*) = 0,   (11)

where a* = α*(0). Similarly, using the equivalent form

  v(x) + sup_{α∈A} { −∫₀^τ g(y(s), α(s)) e^{−λs} ds − e^{−λτ} v(y(τ)) } = 0

of the DPP and the inequality, this implies, for any (continuous at 0) control α ∈ A,

  λ v(x) − Dv(x) · f(x, a) − g(x, a) ≤ 0, for every a ∈ A.   (12)

Combining (11) and (12), we obtain the Hamilton–Jacobi–Bellman equation (or dynamic programming equation):
  H(x, u, Du) = 0,

with x ∈ R^d, and

  H(x, u, p) = λ u(x) + sup_{a∈A} { −f(x, a) · p − g(x, a) }.   (14)

Note that H(x, u, ·) is convex (being the sup of a family of linear functions) and that H(x, ·, p) is monotone (since λ > 0). It is also easy to see that the solution u is not differentiable even when f and g are smooth functions (i.e., f, g ∈ C¹), so we need to deal with weak solutions of the Bellman equation. This can be done in the framework of viscosity solutions, a theory initiated by Crandall and Lions in the 1980s which has been successfully applied in many areas, such as optimal control, fluid dynamics, and image processing (see the books Barles (1994) and Bardi and Capuzzo Dolcetta (1997) for an extended introduction and numerous applications to optimal control). Typically, viscosity solutions are Lipschitz continuous solutions, so they are differentiable almost everywhere.

An Extension to the Minimum Time Problem

In the minimum time problem, we want to minimize the time of arrival of the state on a given target set T. We will assume that T ⊂ R^d is a closed set. Then our cost functional will be given by

  J(x, α) = tx(α).

The main difference with respect to the previous problem is that now the value function T will be finite valued only on a subset R which depends on the target, on the dynamics, and on the set of admissible controls.

Definition 1 The reachable set R is defined by

  R := ∪_{t>0} R(t) = {x ∈ Rⁿ : T(x) < +∞},

where, for t > 0, R(t) := {x ∈ Rⁿ : T(x) < t}. The meaning is clear: R is the set of initial points which can be driven to the target in finite time.

The system is said to be controllable on T if, for all t > 0, T ⊂ int(R(t)) (here, int(D) denotes the interior of the set D). Assuming controllability in a neighborhood of the target, one gets the continuity of the minimum time function, and under the assumptions made on f, A, and T, one can prove some interesting properties:
(i) R is open.
(ii) T is continuous on R.
(iii) lim_{x→x0} T(x) = +∞, for any x0 ∈ ∂R.

Now let us denote by χ_D the characteristic function of the set D. Using in R arguments similar to the proof of the DPP in the previous section, one can obtain the following DPP:

Proposition 2 (DPP for the minimum time problem) For any x ∈ R, the value function satisfies

  T(x) = inf_{α∈A} { t ∧ tx(α) + χ_{{t < tx(α)}} T(yx(t; α)) } for any t ≥ 0.   (16)
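The DPP (5) also suggests a numerical method: iterate its right-hand side with a small time step on a grid (a semi-Lagrangian value iteration). The sketch below uses a toy one-dimensional problem of our own choosing, not from the entry (f(x, a) = a with a ∈ {−1, 0, 1}, g(x, a) = x², λ = 1), for which the value at x = 1 is 1 − 2/e in closed form.

```python
import numpy as np

# Semi-Lagrangian value iteration based on the DPP: iterate
#   v(x) <- min_a { dt*g(x,a) + exp(-lam*dt) * v(x + dt*f(x,a)) }
# on a grid, with linear interpolation between grid points.
lam, dt = 1.0, 0.02
xs = np.linspace(-2.0, 2.0, 401)
actions = (-1.0, 0.0, 1.0)
v = np.zeros_like(xs)
for _ in range(2000):
    candidates = [dt * xs**2 + np.exp(-lam * dt)
                  * np.interp(np.clip(xs + dt * a, -2.0, 2.0), xs, v)
                  for a in actions]
    v_new = np.min(candidates, axis=0)
    if np.max(np.abs(v_new - v)) < 1e-10:   # contraction fixed point
        v = v_new
        break
    v = v_new

v1 = np.interp(1.0, xs, v)   # approximates 1 - 2/e, up to O(dt) error
```

Because the discount makes each sweep a contraction with factor e^{−λ·dt}, the iteration converges geometrically to the unique fixed point, and the minimizing action at each grid point gives a feedback control.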
Optimal Control via Factorization and Model Matching

Michael Cantoni
Department of Electrical & Electronic Engineering, The University of Melbourne, Parkville, VIC, Australia

Abstract

One approach to linear control system design involves the matching of certain input-output models with respect to a quantification of closed-loop performance. The approach is based on a parametrization of all stabilizing feedback controllers, which relies on the existence of coprime factorizations of the plant model. This parametrization and spectral factorization methods for solving model-matching problems are described within the context of impulse-response energy and worst-case energy-gain measures of controller performance.

Keywords

Coprime factorization; H2 control; H∞ control; Spectral factorization; Youla-Kučera controller parametrization

Introduction

… which relates the input w and the output z when v1 = 0 and v2 = 0. The objective is to select K to minimize this measure of performance. Alternatively, controllers that achieve a specified upper bound are sought. It is also usual to require internal stability, which pertains to the fictitious signals v1 and v2, as discussed subsequently. The best known examples are H2 and H∞ control problems. In the former, performance is quantified as the energy (resp. power) of z when w is impulsive (resp. unit white noise), and in the latter, as the worst-case energy gain from w to z, which can be used to reflect robustness to model uncertainty; see Zhou et al. (1996).

The special case of G22 = 0 gives rise to a (weighted) model-matching problem, in that the corresponding performance map H(G, K) = G11 + G12 K G21 exhibits affine dependence on the design variable K, which is chosen to match G12 K G21 to G11 with respect to the scalar quantification of performance. Any internally stabilizable problem with G22 ≠ 0 can be converted into a model-matching problem. The key ingredients in this transformation are coprime factorizations of the plant model. The role of these and other factorizations in a model-matching approach to H2 and H∞ control problems is the focus of this article.

For the sake of argument, finite-dimensional linear time-invariant systems are considered via real-rational transfer functions in the frequency domain, as the existence of all factorizations employed is well understood in this setting. Indeed, constructions via state-space realizations and Riccati equations are well known. The merits of the model-matching approach pursued here are at least twofold: (i) the underlying algebraic input-output perspective extends to more abstract settings, including classes of distributed-parameter and time-varying systems (Desoer et al. 1980; Vidyasagar 1985; Curtain and Zwart 1995; Feintuch 1998; Quadrat 2006); and (ii) model matching is a convex problem for various measures of performance (including mixed indexes) and controller constraints. The latter can be exploited to devise numerical algorithms for controller optimization (Boyd and Barratt 1991; Dahleh and Diaz-Bobillo 1995; Qi et al. 2004).

First, some notation regarding transfer functions and two measures of performance for control system design is defined. Coprime factorizations are then described within the context of a well-known parametrization of stabilizing controllers, originally discovered by Youla et al. (1976) and Kučera (1975). This yields an affine parametrization of performance maps for problems in standard form, and thus, a transformation to a model-matching problem. Finally, the role of spectral factorizations in solving model-matching problems with respect to impulse-response energy (H2) and worst-case energy-gain (H∞) measures of performance is discussed.

Notation and Nomenclature

R generically denotes a linear space of matrices having fixed row and column dimensions, which are not reflected in the notation for convenience, and entries that are proper real-rational functions of the complex variable s, i.e., Σ_{k=0}^{m} b_k s^k / Σ_{k=0}^{n} a_k s^k for sets of real coefficients {a_k} and {b_k} with m ≤ n < ∞. The compatibility of matrix dimensions is implicitly assumed henceforth. All matrices in R have (nonunique) "state-space" realizations of the form C(sI − A)⁻¹B + D, where A, B, C, and D are real-valued matrices. This form naturally arises in frequency-domain analysis of the input-output map associated with the time-domain model ẋ(t) = Ax(t) + Bu(t), with initial condition x(0) = 0 and output equation y(t) = Cx(t) + Du(t), where ẋ denotes the time derivative of x and u is the input. The study of such linear time-invariant differential equation models via the Laplace transform and multiplication by real-rational transfer function matrices is fundamental in linear systems theory (Kailath 1980; Francis 1987; Zhou et al. 1996).

P ∈ R has an inverse P⁻¹ ∈ R if and only if lim_{|s|→∞} P(s) is a nonsingular matrix. The superscripts T and * denote the transpose and complex conjugate transpose. For a matrix Z = Z* with complex entries, Z > 0 means z*Zz ≥ ε z*z for some ε > 0 and all complex vectors z of compatible dimension. P~(s) := P(−s)ᵀ, whereby (P(jω))* = P~(jω) for all real ω, with j := √(−1). Zeros of transfer function denominators are called poles.

In subsequent sections, several subspaces of R are used to define and solve two standard linear control problems. The subspace B ⊂ R comprises transfer functions that have no poles on the imaginary axis in the complex plane. For P ∈ B, the scalar performance index

  ‖P‖∞ := max_{−∞≤ω≤∞} σ̄(P(jω))

is finite; the real number σ̄(Z) is the maximum singular value of the matrix argument Z. This index measures the worst-case energy gain from an input signal u to the output signal y = Pu. Note that ‖P‖∞ < γ if and only if γ²I − P(jω)(P(jω))* > 0 for all −∞ ≤ ω ≤ ∞.

The subspace S ⊂ B ⊂ R consists of transfer functions that have no poles with positive real part. A transfer function in S is called stable, because the corresponding input-output map is causal in the time domain, as well as bounded-in-bounded-out (in various senses). If P ∈ S is such that P~P = I, then it is called inner. If P, P⁻¹ ∈ S, then both are called outer.

Let L denote the subspace of strictly-proper transfer functions in B; i.e., for all entries of the
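The index ‖P‖∞ can be estimated numerically by sweeping the frequency axis and taking the largest maximum singular value. The 2 × 2 transfer matrix below is our own hypothetical example, not from the entry:

```python
import numpy as np

# Estimate ||P||_inf = max_w sigma_max(P(jw)) by a frequency sweep.
# Example: P(s) = [[1/(s+1), 0], [1/(s+2), 2/(s+3)]], all poles stable.
def P(s):
    return np.array([[1.0 / (s + 1.0), 0.0],
                     [1.0 / (s + 2.0), 2.0 / (s + 3.0)]])

ws = np.concatenate(([0.0], np.logspace(-3, 3, 4000)))
hinf = max(np.linalg.svd(P(1j * w), compute_uv=False)[0] for w in ws)
# For this low-pass example the maximum occurs at w = 0.
```

A sweep of this kind only lower-bounds the norm between grid points; state-space (Hamiltonian/bisection) methods give certified values, but the sweep suffices to illustrate the definition.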
M̃⁻¹Ñ are said to be (doubly) coprime over S if N, M, Ñ, M̃ are all elements of S and there exist U0, V0, Ũ0, Ṽ0, all in S, such that (3) holds.

… stabilizable if there exists a K ∈ R such that the nine transfer functions associated with the map from the vector of signals (w, v1, v2) to the vector of signals (z, u, y), which includes the performance map H(G, K) = G11 + G12 K (I − G22 K)⁻¹ G21, are all elements of S. Accounting in this way for the influence of the …

… in the sense that (3) holds for some U0, V0, Ũ0, Ṽ0 ∈ S. Indeed, since 0 = G22 − G22 = M̃⁻¹(M̃N − ÑM)M⁻¹, it follows that
… where inf denotes greatest lower bound (infimum) and Ti ∈ S, i = 1, 2, 3:

  inf_{Q∈S} ‖T1 + T2 Q T3‖₂.

A minimizer of the convex functional

  f : Q ∈ H ↦ ⟨(T1 + T2 Q T3), (T1 + T2 Q T3)⟩

is a solution of the model-matching problem. Assume that T2(s) and T3(s) have full column and row rank, respectively, for s on the extended imaginary axis. Also assume that T1 is strictly proper, whereby Q must be strictly proper, and thus an element of H ⊂ S, for the performance index to be finite. Under this standard collection of assumptions, the infimum is achieved as shown below.

Given spectral factorizations Φ~Φ = T2~T2 and ΨΨ~ = T3T3~ (i.e., Φ, Φ⁻¹, Ψ, Ψ⁻¹ ∈ S), which exist by the assumptions on the problem data, let R := ΦQΨ and W := Φ⁻~ T2~ T1 T3~ Ψ⁻~. Then for Q ∈ H, which is equivalent to R ∈ H by the properties of spectral factors, it follows that

  f(Q) = ⟨T1, T1⟩ + ⟨W, R⟩ + ⟨R, W⟩ + ⟨R, R⟩   (8)
       = ⟨T1, T1⟩ + ⟨(Π₋(W) + Π₊(W) + R), (Π₋(W) + Π₊(W) + R)⟩ − ⟨W, W⟩
       = ⟨T1, T1⟩ − ⟨Π₊(W), Π₊(W)⟩ + ⟨(Π₊(W) + R), (Π₊(W) + R)⟩,   (9)

where the second last equality holds by "completion-of-squares" and the last equality holds since ⟨Π₊(W), Π₋(W)⟩ = 0 = ⟨R, Π₋(W)⟩. From (9) it is apparent that

  Q = −Φ⁻¹ Π₊(Φ⁻~ T2~ T1 T3~ Ψ⁻~) Ψ⁻¹

is a minimizer of f. As above, spectral factorization is a key component of the so-called Wiener-Hopf approach of Youla et al. (1976) and DeSantis et al. (1978).

Now consider the H∞ model-matching problem

  inf_{Q∈S} ‖T1 + T2 Q T3‖∞,

given Ti ∈ S, i = 1, 2, 3. This is more challenging than the problem discussed above, where ‖·‖₂ is the performance index. While sufficient conditions are again available for the infimum to be achieved, computing a minimizer is generally difficult; see Francis and Doyle (1987) and Glover et al. (1991). As such, nearly optimal solutions are often sought by considering the relaxed problem of finding the set of Q ∈ S that satisfy ‖T1 + T2 Q T3‖∞ < γ for a value of γ > 0 greater than, but close to, the infimum.
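For scalar transfer functions, a spectral factor Φ with Φ~Φ = T~T and Φ, Φ⁻¹ ∈ S can be computed by reflecting the right-half-plane zeros of T across the imaginary axis while keeping its (Hurwitz) denominator. A minimal sketch on an example of our own, T(s) = (s − 1)/(s + 2), whose spectral factor is (s + 1)/(s + 2):

```python
import numpy as np

# Scalar spectral factorization Phi~ Phi = T~ T with Phi, 1/Phi stable:
# reflect right-half-plane zeros of T across the imaginary axis and
# keep the Hurwitz denominator.
num = np.array([1.0, -1.0])   # numerator s - 1 (a RHP zero at s = 1)
den = np.array([1.0,  2.0])   # denominator s + 2 (Hurwitz)

zeros = np.atleast_1d(np.roots(num))
phi_zeros = np.where(zeros.real > 0, -np.conj(zeros), zeros)  # reflect
phi_num = np.real(np.poly(phi_zeros)) * num[0]

def mag2(n, d, w):
    """|n(jw)/d(jw)|^2 evaluated on the imaginary axis."""
    s = 1j * w
    return abs(np.polyval(n, s) / np.polyval(d, s)) ** 2

ws = np.linspace(0.0, 10.0, 101)
same_magnitude = np.allclose([mag2(num, den, w) for w in ws],
                             [mag2(phi_num, den, w) for w in ws])
```

Reflection preserves |Φ(jω)| at every frequency because |jω − z| = |jω + z̄|, so the factor matches T in magnitude while all its zeros and poles sit in the closed left half plane.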
  [T2~T1  I]~ [I   0  ] [T2~T1  I]   [I  W̄]~ [I   0  ] [I  W̄]
  [I      0]  [0 −γ²I] [I      0] = [0  I ]  [0 −γ²I] [0  I ],   (12)

it follows using (12) that there exists a Q ∈ S such that (10) holds if and only if there exists a spectral factor Ω, Ω⁻¹ ∈ S, with Ω11⁻¹ ∈ S, that satisfies

  [T2~T1  I]~ [I   0  ] [T2~T1  I]        [I   0  ]
  [I      0]  [0 −γ²I] [I      0]  =  Ω~ [0 −γ²I] Ω,   (13)

in which case ‖T1 + T2Q‖∞ ≤ γ if and only if Q = Q1 Q2⁻¹, where [Q1ᵀ  Q2ᵀ] := [Sᵀ  I] Ω⁻ᵀ, with S ∈ S and ‖S‖∞ ≤ γ; see Green et al. (1990). So-called J-spectral factorizations of the kind in (12) and (13) also appear in the chain-scattering/conjugation approach of Kimura (1989, 1997) and the factorization approach of Ball et al. (1991), for example.

Bibliography

Ball JA, Ran ACM (1987) Optimal Hankel norm model reductions and Wiener-Hopf factorization I: the canonical case. SIAM J Control Optim 25(2):362–382
Ball JA, Helton JW, Verma M (1991) A factorization principle for stabilization of linear control systems. Int J Robust Nonlinear Control 1(4):229–294
Boyd SP, Barratt CH (1991) Linear controller design: limits of performance. Prentice Hall, Englewood Cliffs
Curtain RF, Zwart HJ (1995) An introduction to infinite-dimensional linear systems theory. Volume 21 of Texts in applied mathematics. Springer, New York
Dahleh MA, Diaz-Bobillo IJ (1995) Control of uncertain systems: a linear programming approach. Prentice Hall, Upper Saddle River
DeSantis RM, Saeks R, Tung LJ (1978) Basic optimal estimation and control problems in Hilbert space. Math Syst Theory 12(1):175–203
Desoer C, Liu R-W, Murray J, Saeks R (1980) Feedback system design: the fractional representation approach to analysis and synthesis. IEEE Trans Autom Control 25(3):399–412
Feintuch A (1998) Robust control theory in Hilbert space. Applied mathematical sciences. Springer, New York
Francis BA (1987) A course in H∞ control theory. Lecture notes in control and information sciences. Springer, Berlin/New York
968 Optimal Control with State Space Constraints
$$\Psi(T, x(T)) = 0,$$

with the property that the gradients $\nabla \psi_i = \left(\frac{\partial \psi_i}{\partial t}, \frac{\partial \psi_i}{\partial x}\right)$ (which we write as row vectors) are linearly independent on $N$. The terminal time $T$ can be free or fixed; a fixed terminal time would simply be prescribed by one of the functions $\psi_i$. The question of the existence of optimal solutions under state space constraints is well established and will not be addressed here (e.g., see Cesari 1983 or the Filippov-Cesari theorem in Hartl et al. 1995).

Necessary Conditions for Optimality

(b) The optimal control minimizes the Hamiltonian over the control set $U$ along $(\lambda(t), x_*(t))$.

Controlled trajectories $(x, u)$ for which there exist multipliers such that these conditions are satisfied are called extremals. In general, it cannot be excluded that $\lambda_0$ vanishes; extremals with $\lambda_0 = 0$ are called abnormal, while those with $\lambda_0 > 0$ are called normal. In this case, the multiplier can be normalized to $\lambda_0 = 1$.

Special Case: A Mayer Problem for Single-Input Control-Linear Systems

Under the general assumptions formulated above, the sets $R_\alpha \subset [t_0, T]$ on which a particular constraint is active can be arbitrarily complicated. Defining

$$F_0(t, x) = \begin{pmatrix} 1 \\ f(t, x) \end{pmatrix} \quad \text{and} \quad G(t, x) = \begin{pmatrix} 0 \\ g(t, x) \end{pmatrix},$$

let $L_{F_0} k$ and $L_G k$ represent the Lie (or directional) derivatives of a function $k$ along the vector fields $F_0$ and $G$, respectively. In terms of this notation, the derivative of the function $h_\alpha$ (defining the manifold $M_\alpha$) along trajectories of the system is given by

$$\dot h_\alpha(t, x(t)) = \frac{d}{dt} h_\alpha(t, x(t)) = L_{F_0} h_\alpha(t, x(t)) + u(t)\, L_G h_\alpha(t, x(t)).$$

If the function $L_G h_\alpha$ does not vanish at a point $(\tilde t, \tilde x) \in M_\alpha$, then there exists a neighborhood $V$ of $(\tilde t, \tilde x)$ on which there exists a unique
control $u_\alpha = u_\alpha(t, x)$ which solves the equation $\dot h_\alpha(t, x) = 0$ on $V$, and $u_\alpha$ is given in feedback form as

$$u_\alpha(t, x) = -\frac{L_{F_0} h_\alpha(t, x)}{L_G h_\alpha(t, x)}.$$

The manifold $M_\alpha$ is said to be control invariant of relative degree 1 if the Lie derivative of $h_\alpha$ with respect to $G$, $L_G h_\alpha$, does not vanish anywhere on $M_\alpha$ and if the function $u_\alpha(t, x)$ is admissible, i.e., takes values in the control set $[a, b]$.

Thus, for a control-invariant submanifold of relative degree 1, the control that keeps the manifold invariant is unique, and the corresponding dynamics induce a unique flow on the constraint. This assumption corresponds to the least degenerate, i.e., in some sense most generic or common, scenario and is satisfied for many practical problems.

Suppose the reference extremal is normal and let $\gamma_\alpha$ be an $M_\alpha$-boundary arc defined over an open interval $I$ with corresponding boundary control $u_\alpha$ that takes values in the interior of the control set along $\gamma_\alpha$. Then the Radon measure $\mu_\alpha$ is absolutely continuous with respect to Lebesgue measure on $I$ with continuous and nonnegative Radon-Nikodym derivative $\eta_\alpha(t)$ given by

$$\eta_\alpha(t) = \frac{\lambda(t)\left(\dfrac{\partial g}{\partial t}(t, x_*(t)) + [f, g](t, x_*(t))\right)}{L_G h_\alpha(t, x_*(t))},$$

where $[f, g]$ denotes the Lie bracket of the time-varying vector fields $f$ and $g$ in the variable $x$,

$$[f, g](t, x) = \frac{\partial g}{\partial x}(t, x)\, f(t, x) - \frac{\partial f}{\partial x}(t, x)\, g(t, x).$$

In particular, in this case, the adjoint equation can be expressed in the more common form

$$\dot\lambda(t) = -\lambda(t)\, \frac{\partial F}{\partial x}(t, x_*, u_*) - \eta_\alpha(t)\, \frac{\partial h_\alpha}{\partial x}(t, x_*),$$

with all partial derivatives evaluated along the reference trajectory. Furthermore, the multiplier $\lambda$ remains continuous at entry or exit if the controlled trajectory $(x_*, u_*)$ meets the constraint $M_\alpha$ transversally (e.g., see Schättler 2006). This follows from the following characterization of transversal connections between interior and boundary arcs due to Maurer (1977): if $\tau$ is an entry or exit junction time between an interior arc and an $M_\alpha$-boundary arc for which the reference control $u$ has a limit at $\tau$ along the interior arc, then the interior arc is transversal to $M_\alpha$ at entry or exit if and only if the control $u$ is discontinuous at $\tau$.

Informal Formulation of Necessary Conditions

In order to ensure the practicality of necessary conditions for optimality, it is essential that, besides atomistic structures at junctions that lead to computable jumps in the multipliers, the Radon measures $\mu_\alpha$ have no singular parts with respect to Lebesgue measure. If it is assumed a priori that optimal controlled trajectories are finite concatenations of interior and boundary arcs, and if the constraint sets have a reasonably regular structure (embedded submanifolds and transversal intersections thereof) and satisfy a rather technical constraint qualification (see Hartl et al. 1995) that guarantees that the restrictions of the system to active constraints have solutions, then it is possible to specify the above necessary conditions further and formulate more user-friendly versions for the determination of the multipliers. Such formulations have become the standard for numerical computations, but they still have not always been established rigorously and somewhat carry the stigma of a heuristic nature. Nevertheless, it is often this more concrete set of conditions that allows one to solve problems numerically and analytically. If then, in conjunction with sufficient conditions for optimality, it is possible to verify the optimality of the computed extremal solutions, this generates a satisfactory theoretical procedure. Such conditions, following Hartl et al. (1995), are generally referred to as the "informal theorem."

Suppose $(x_*, u_*)$ is a normal extremal controlled trajectory defined over the interval $[t_0, T]$ with the property that the graph of $x_*$ is a finite concatenation of interior and boundary arcs with junction times $\tau_i$, $i = 1, \ldots, k$, $t_0 = \tau_0 < \tau_1 <$
$\cdots < \tau_k < \tau_{k+1} = T$. Under an appropriate constraint qualification, there exist a multiplier $\lambda$, $\lambda : [t_0, T] \to (\mathbb{R}^n)^*$, which is absolutely continuous on each subinterval $[\tau_i, \tau_{i+1}]$; multipliers $\eta_\alpha$, $\eta = (\eta_1, \ldots, \eta_r) : [t_0, T] \to (\mathbb{R}^r)^*$, which are continuous on each interval $(\tau_i, \tau_{i+1})$; a vector $\nu \in (\mathbb{R}^k)^*$; and vectors $\sigma(\tau_i) \in (\mathbb{R}^r)^*$, $i = 1, \ldots, k$, with nonnegative entries such that:

(a) (Adjoint equation) On each interval $(\tau_i, \tau_{i+1})$, $i = 0, \ldots, k$, $\lambda$ satisfies the adjoint equation in the form

$$\dot\lambda(t) = -\frac{\partial L}{\partial x}(t, x_*(t), u_*(t)) - \lambda(t)\,\frac{\partial F}{\partial x}(t, x_*(t), u_*(t)) - \sum_{\alpha=1}^{r} \eta_\alpha(t)\, \frac{\partial h_\alpha}{\partial x}(t, x_*),$$

with $\eta_\alpha(t) = 0$ if the constraint $M_\alpha$ is not active at time $t$. Assuming that no state space constraint is active at the terminal time, the value of the multiplier $\lambda$ at the terminal time is given by the transversality condition

$$\lambda(T) = \frac{\partial \Phi}{\partial x}(T, x_*(T)) + \nu\, \frac{\partial \Psi}{\partial x}(T, x_*(T)).$$

At any junction time $\tau_i$ between an interior arc and a boundary arc, the multiplier $\lambda$ may be discontinuous, satisfying a jump condition of the form

$$\lambda(\tau_i-) = \lambda(\tau_i+) + \sigma(\tau_i)\, \frac{\partial h}{\partial x}(\tau_i, x_*(\tau_i)),$$

and the complementary slackness condition

$$\sigma(\tau_i)\, h(\tau_i, x_*(\tau_i)) = 0$$

holds.

(b) The optimal control minimizes the Hamiltonian over the control set $U$ along $(\lambda(t), x_*(t))$:

$$H(t, \lambda(t), x_*(t), u_*(t)) = \min_{v \in U} H(t, \lambda(t), x_*(t), v),$$

and at the junction times $\tau_i$ we have that

$$H(\tau_i, \lambda(\tau_i-), x_*(\tau_i), u_*(\tau_i-)) = H(\tau_i, \lambda(\tau_i+), x_*(\tau_i), u_*(\tau_i+)) - \sigma(\tau_i)\, \frac{\partial h}{\partial t}(\tau_i, x_*(\tau_i)).$$

Sufficient Conditions for Optimality

The literature on sufficient conditions for optimality for optimal control problems with state space constraints is limited. The value function for an optimal control problem at a point $(t, x)$ in the extended state space, $V = V(t, x)$, is defined as the infimum over all admissible controls $u$ for which the corresponding trajectory starts at the point $x$ at time $t$ and satisfies all the constraints of the problem,

$$V(t, x) = \inf_{u \in \mathcal{U}} J(u).$$

Any sufficiency theory for optimal control problems, one way or another, deals with the solution of the corresponding Hamilton-Jacobi-Bellman (HJB) equation:

$$\frac{\partial V}{\partial t}(t, x) + \min_{u \in U} \left\{ \frac{\partial V}{\partial x}(t, x)\, F(t, x, u) + L(t, x, u) \right\} = 0, \qquad V(T, x) = \Phi(T, x) \ \text{whenever} \ \Psi(T, x) = 0.$$

Value functions for optimal control problems rarely are differentiable everywhere, but generally have singularities along lower-dimensional submanifolds. Nevertheless, under some technical assumptions and with proper interpretations of the derivatives, this equation describes the evolution of the value function of an optimal control problem and, if an appropriate solution can be constructed, indeed solves the optimal control problem.

There exists a broad theory of viscosity solutions to the HJB equation (e.g., Fleming and
Soner 2005; Bardi and Capuzzo-Dolcetta 2008) that is also applicable to problems with state space constraints (Soner 1986) and, under varying technical assumptions, characterizes the value function $V$ as the unique viscosity solution to the HJB equation. This has led to the development of algorithms that can be used to compute numerical solutions.

A more classical and more geometric approach to solving the HJB equation is based on the method of characteristics and goes back to the work of Boltyansky on a regular synthesis for optimal control problems without state space constraints (Boltyanskii 1966). This work follows classical ideas of fields of extremals from the calculus of variations and imposes technical conditions that allow one to handle the singularities that arise in the value functions (e.g., see Schättler and Ledzewicz 2012). Stalford's results in Stalford (1971) follow this approach for problems with state space constraints, but a broadly applicable theory of regular synthesis, as it was developed by Piccoli and Sussmann (2000) for problems without state space constraints, does not yet exist for problems with state space constraints. Results that embed a controlled reference extremal into a local field of extremals have been given by Bonnard et al. (2003) and Schättler (2006), and these constructions show the applicability of the concepts of a regular synthesis to problems with state space constraints as well.

Examples of Local Embeddings of Boundary Arcs

We illustrate the typical, i.e., in some sense most common, generic structures of local embeddings of boundary arcs in Figs. 1 and 2. The state constraint $M_\alpha$ is a control-invariant submanifold of relative degree 1 and is represented by a horizontal line, as it arises when limits on the size of a particular state are imposed. Figure 1 shows the typical entry-boundary-exit concatenation of an interior arc followed by a boundary arc and another interior arc. The local embedding of the boundary arc differs substantially from classical local embeddings for unconstrained problems in the sense that this field necessarily contains small pieces of trajectories which, when propagated backward, are not close to the reference trajectory. This, however, does not affect the memoryless properties required for a synthesis forward in time, and strong local optimality of the reference trajectory can be proven by combining synthesis-type arguments with homotopy-type approximations of the synthesis (Schättler 2006). The one trajectory marked as a black line in Fig. 1 corresponds to an optimal trajectory that meets the constraint only at the junction point and immediately bounces back into the interior. Such a trajectory arises as the limit when the concatenation structure of optimal controlled trajectories changes from interior-boundary-interior arcs to trajectories that do not meet the constraint. These structures are one of the extra sources for singularities in the value function that come up in optimal control problems with state space constraints. Switching surfaces for the interior arcs, as one is also shown in this figure, do not cause such a loss of differentiability if they are crossed transversally by the extremal trajectories of the field.

Figure 2 depicts the structure of an optimal synthesis for a problem from electronics, the problem of minimizing the base transit time of bipolar homogeneous transistors. The electrical field that determines the transit time is controlled by tailoring a distribution of dopants in the base region, and this dopant profile becomes an important design parameter determining the speed of the device. But due to physical and engineering limitations, the variables describing the dopants need to be limited, and thus this becomes an optimal control problem with state space constraints represented by hard limits on the variables. The constraints here are control invariant of relative degree 1. Optimal solutions, in the presence of initial and terminal constraints, have portions along both the upper and lower control limits of the constrained variable and typically proceed from the upper to the lower values along an optimal singular control (which takes values in the interior of the control set) in the interior of the admissible domain, possibly with saturation if the control limits are reached.

Cross-References

Numerical Methods for Nonlinear Optimal Control Problems
Optimal Control and Pontryagin's Maximum Principle

Bibliography

Bardi M, Capuzzo-Dolcetta I (2008) Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations. Springer, New York
Boltyanskii VG (1966) Sufficient conditions for optimality and the justification of the dynamic programming principle. SIAM J Control Optim 4:326–361
Bonnard B, Faubourg L, Launay G, Trélat E (2003) Optimal control with state space constraints and the space shuttle re-entry problem. J Dyn Control Syst 9:155–199
Cesari L (1983) Optimization – theory and applications. Springer, New York
Fleming WH, Soner HM (2005) Controlled Markov processes and viscosity solutions. Springer, New York
Frankowska H (2006) Regularity of minimizers and of adjoint states in optimal control under state constraints. J Convex Anal 13:299–328
Hartl RF, Sethi SP, Vickson RG (1995) A survey of the maximum principles for optimal control problems with state constraints. SIAM Rev 37:181–218
Ioffe AD, Tikhomirov VM (1979) Theory of extremal problems. North-Holland, Amsterdam/New York
Maurer H (1977) On optimal control problems with bounded state variables and control appearing linearly. SIAM J Control Optim 15:345–362
Piccoli B, Sussmann H (2000) Regular synthesis and sufficient conditions for optimality. SIAM J Control Optim 39:359–410
Pontryagin LS, Boltyanskii VG, Gamkrelidze RV, Mishchenko EF (1962) Mathematical theory of optimal processes. Wiley-Interscience, New York
Rinaldi P, Schättler H (2003) Minimization of the base transit time in semiconductor devices using optimal control. In: Feng W, Hu S, Lu X (eds) Dynamical systems and differential equations: proceedings of the 4th international conference on dynamical systems and differential equations, Wilmington, May 2002, pp 742–751
Schättler H (2006) A local field of extremals for optimal control problems with state space constraints of relative degree 1. J Dyn Control Syst 12:563–599
Optimal Deployment and Spatial Coverage 975
Schättler H, Ledzewicz U (2012) Geometric optimal control. Springer, New York
Soner HM (1986) Optimal control with state-space constraints I. SIAM J Control Optim 24:552–561
Stalford H (1971) Sufficient conditions for optimal control with state and control constraints. J Optim Theory Appl 7:118–135
Vinter RB (2000) Optimal control. Birkhäuser, Boston
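The boundary feedback $u_\alpha = -L_{F_0} h_\alpha / L_G h_\alpha$ derived above admits a simple numerical illustration. The two-dimensional control-affine system and the velocity-style constraint in the following sketch are hypothetical choices (not from this entry), and the Lie derivatives are approximated by finite differences.

```python
import numpy as np

# Sketch of u_a = -(L_F0 h)/(L_G h) for a hypothetical system
#   xdot = f(x) + u g(x)  with constraint h(x) = x2 - c <= 0,
# which is control invariant of relative degree 1 (L_G h = 1 != 0).

c = 1.0
f = lambda x: np.array([x[1], -x[0]])     # drift vector field
g = lambda x: np.array([0.0, 1.0])        # control vector field
h = lambda x: x[1] - c                    # constraint, active where h = 0

def lie(fun, vf, x, eps=1e-6):
    """Directional (Lie) derivative of fun along vector field vf at x."""
    grad = np.array([(fun(x + eps * e) - fun(x - eps * e)) / (2 * eps)
                     for e in np.eye(len(x))])
    return grad @ vf(x)

def u_boundary(x):
    """Feedback keeping the boundary arc on h = 0."""
    return -lie(h, f, x) / lie(h, g, x)

x = np.array([0.7, c])                    # a point on the boundary arc
u = u_boundary(x)
hdot = lie(h, f, x) + u * lie(h, g, x)    # should vanish along the boundary arc
print(u, hdot)
```

With this data, $L_{F_0} h = -x_1$ and $L_G h = 1$, so $u_\alpha(x) = x_1$ and $\dot h \equiv 0$ on the arc, matching the finite-difference computation.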
Optimal Deployment and Spatial Coverage

Sonia Martínez
Department of Mechanical and Aerospace Engineering, University of California, San Diego, La Jolla, CA, USA

Abstract

Optimal deployment refers to the problem of how to allocate a finite number of resources over a spatial domain to maximize a performance metric that encodes a certain quality of service. Depending on the deployment environment, the type of resource, and the metric used, the solutions to this problem can vary greatly.

Keywords

Coverage control algorithms; Facility location problems

Introduction

The problem of deciding what are optimal geographic locations to place a set of facilities has a long history and is a main subject in operations research and management science; see Drezner (1995). A facility can be broadly understood as a service such as a school; a hospital; an airport; an emergency service, such as a fire station; or, more generally, the routes of a vehicle, from buses to aircraft, an autonomous vehicle, or a mobile sensor.

The specific formulation of facility location problems depends very much on the particular underlying application. A distinguishing feature is that all involve strategic planning, accounting for the long-term impact on the facility operating cost and its fast response to demand. Thus, these problems lead to constrained optimization formulations which are typically very hard to solve optimally. The computational complexity of such problems, which even in their most basic formulations are typically NP-hard, made their solution largely intractable until the advent of high-speed computing.

Locational optimization techniques have also been employed to solve optimal estimation problems with static sensor networks, as well as in mesh and grid optimization design, clustering analysis, data compression, and statistical pattern recognition; see Du et al. (1999). However, these solutions typically require centralized computations and the availability of information at all times.

When the facilities are multiple vehicles or mobile sensors, the underlying dynamics may require additional changes and further analysis to guarantee overall system stability. In what follows, we review a particular coverage control problem formulation in terms of the so-called expected-value multicenter functions that makes the analysis tractable, leading to robust, distributed algorithm implementations employing computational geometric objects such as Voronoi partitions.

Basic Ingredients from Computational Geometry

In order to formulate a basic optimal deployment problem and algorithm, we require several notions from computational geometry; see Bullo et al. (2009) for more information.

Let $S$ be a measurable set of $\mathbb{R}^m$, for $m \in \mathbb{N}$, consider a distance function $d$ on $\mathbb{R}^m$, and let $P = \{p_1, \ldots, p_n\}$ be $n$ distinct points of $S$, corresponding to the locations of certain facilities. The Voronoi partition of $S$ generated by $P$ and associated with $d$ is given by $\mathcal{V}(P) = \{V_1(P), \ldots, V_n(P)\}$, where
$$V_i(P) = \left\{ q \in S \mid d(p_i, q) \le d(p_j, q),\ j \ne i \right\}, \quad i \in \{1, \ldots, n\}.$$

Given $r \in \mathbb{R}_{>0}$, denote by $\bar B(p_i, r)$ the closed ball of center $p_i$ and radius $r$. The $r$-limited Voronoi partition of $S$ generated by $P$ and associated with $d$ is the Voronoi partition of the set $S \cap \bigcup_{i=1}^n \bar B(p_i, r)$, denoted as $\mathcal{V}_r(P) = \{V_{1,r}(P), \ldots, V_{n,r}(P)\}$.

Let $\phi : S \to \mathbb{R}_{\ge 0}$ be a measurable density function on $S$. The area and the centroid (or center of mass) of $W \subseteq S$ with respect to $\phi$ are the values

$$A_\phi(W) = \int_W \phi(q)\, dq, \qquad CM_\phi(W) = \frac{1}{A_\phi(W)} \int_W q\, \phi(q)\, dq.$$

We say that the set of distinct points $P$ in $S$ is a centroidal Voronoi configuration (resp., an $r$-limited centroidal Voronoi configuration) if each $p_i$ is at the centroid of its own Voronoi cell; that is, $p_i = CM_\phi(V_i(P))$, $i \in \{1, \ldots, n\}$ (resp., $p_i = CM_\phi(V_{i,r}(P))$, $i \in \{1, \ldots, n\}$). Voronoi partitions and centroidal Voronoi configurations help assess the distribution of locations in a spatial domain, as we establish below.

A Voronoi partition induces a natural proximity graph, called the Delaunay graph, over the set of points $P$. We recall that a graph $G$ is a pair $G = (V, E)$, where $V$ is a set of $n$ vertices and $E \subseteq V \times V$ is a set of ordered pairs of vertices, called the edge set. A proximity graph is a graph function defined on the set $S$, which assigns to a set of distinct points $P \subset S$ a graph $G(P) = (P, E(P))$, where $E(P)$ is a function of the relative locations of the point set. Example graphs include the following:
1. The $r$-disk graph, $G_{\mathrm{disk},r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{\mathrm{disk},r}(P)$ if $d(p_i, p_j) \le r$.
2. The Delaunay graph, $G_D$. We have $(p_i, p_j) \in E_D(P)$ if $V_i(P) \cap V_j(P) \ne \emptyset$.
3. The $r$-limited Delaunay graph, $G_{\mathrm{LD},r}$, for $r \in \mathbb{R}_{>0}$. Here, $(p_i, p_j) \in E_{\mathrm{LD},r}(P)$ if $V_{i,r}(P) \cap V_{j,r}(P) \ne \emptyset$.

Expected-Value Multicenter Functions

Facility location problems consist of spatially allocating a number of sites to provide a certain quality of service. Problems of this class are formulated in terms of multicenter functions and, in particular, expected-value multicenter functions. To define these, consider $\phi : S \to \mathbb{R}_{\ge 0}$, a density function over a bounded measurable set $S \subset \mathbb{R}^m$. One can regard $\phi$ as a function measuring the probability that some event takes place over the environment. The larger the value of $\phi(q)$, the more importance the location $q$ has. We refer to a nonincreasing and piecewise continuously differentiable function $f : \mathbb{R}_{\ge 0} \to \mathbb{R}$, possibly with finite jump discontinuities, as a performance function.

Performance functions describe the utility of placing a node at a certain distance from a location in the environment. The smaller the distance, the larger the value of $f$; that is, the better the performance. For instance, in sensing problems, performance functions can encode the signal-to-noise ratio between a source with an unknown location and a sensor attempting to locate it. Without loss of generality, it can be assumed that $f(0) = 0$.

An expected-value multicenter function models the expected value of the coverage over any point in $S$ provided by a set of points $p_1, \ldots, p_n$. Formally,

$$\mathcal{H}(p_1, \ldots, p_n) = \int_S \max_{i \in \{1, \ldots, n\}} f(\|q - p_i\|_2)\, \phi(q)\, dq, \tag{1}$$

where $\|\cdot\|_2$ denotes the 2-norm of $\mathbb{R}^m$. This definition can be understood as follows: consider the best coverage of $q \in S$ among those provided by each of the nodes $p_1, \ldots, p_n$, which corresponds to the value $\max_{i \in \{1, \ldots, n\}} f(\|q - p_i\|_2)$. Then, modulate the performance by the importance $\phi(q)$ of the location $q$. Finally, the infinitesimal sum of this quantity over the environment $S$ gives rise to $\mathcal{H}(p_1, \ldots, p_n)$ as a measure of the overall coverage provided by $p_1, \ldots, p_n$.
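On a discretized domain, the Voronoi cells, areas, and centroids introduced above can be approximated as follows. This is a grid-based sketch with a uniform density on the unit square, not an exact computational-geometry routine, and the point data is illustrative.

```python
import numpy as np

# Grid approximation of Voronoi cells, areas A(V_i), and centroids CM(V_i)
# for n = 5 illustrative facility locations in S = [0,1]^2, uniform density.

rng = np.random.default_rng(1)
P = rng.random((5, 2))                           # facility locations p_1..p_n

g = np.linspace(0.0, 1.0, 200)
Q = np.array(np.meshgrid(g, g)).reshape(2, -1).T # dense sample of S

# each sample point q is assigned to its nearest generator (Euclidean d)
d = np.linalg.norm(Q[:, None, :] - P[None, :, :], axis=2)
owner = np.argmin(d, axis=1)

areas = np.array([np.sum(owner == i) for i in range(len(P))]) / len(Q)
centroids = np.array([Q[owner == i].mean(axis=0) for i in range(len(P))])

print(areas.sum())                               # the cells partition S
```

Since the cells partition $S$, the approximated areas sum to the total area of the domain.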
From here, we can formulate a geometric optimization problem known as the continuous $p$-median problem; see Drezner (1995).

Different performance functions lead to different expected-value multicenter functions. Let us examine some important cases.

Distortion Problem

Consider the performance function $f(x) = -x^2$. Then, on $S^n \setminus \mathcal{C}$, the expected-value multicenter function takes the form

$$\mathcal{H}_{\mathrm{distor}}(p_1, \ldots, p_n) = -\sum_{i=1}^n \int_{V_i(P)} \|q - p_i\|_2^2\, \phi(q)\, dq.$$

In signal compression, $\mathcal{H}_{\mathrm{distor}}$ is referred to as the distortion function and is relevant in many disciplines, including vector quantization, signal compression, and numerical integration; see Gray and Neuhoff (1998) and Du et al. (1999). Here, distortion refers to the average deformation (weighted by the density $\phi$) caused by reproducing $q \in S$ with the location $p_i$ in $P = \{p_1, \ldots, p_n\}$ such that $q \in V_i(P)$. By means of the Parallel Axis Theorem (see Hibbeler 2006), it is possible to express $\mathcal{H}_{\mathrm{distor}}$ as a sum

$$\mathcal{H}_{\mathrm{distor}}(p_1, \ldots, p_n, W_1, \ldots, W_n) = -\sum_{i=1}^n J_\phi(W_i, CM_\phi(W_i)) - \sum_{i=1}^n A_\phi(W_i)\, \|p_i - CM_\phi(W_i)\|_2^2, \tag{6}$$

where $J_\phi(W, p) = \int_W \|q - p\|_2^2\, \phi(q)\, dq$ is the so-called moment of inertia of the region $W$ about $p$ with respect to $\phi$. In this way, the terms $J_\phi(W_i, CM_\phi(W_i))$ only depend on the partition of $S$, whereas the second terms, multiplied by $A_\phi(W_i)$, include the particular location of the points. As a consequence of this observation, the optimality of the centroid locations for $\mathcal{H}_{\mathrm{distor}}$ follows; see Bullo et al. (2009). More precisely, let $\{W_1, \ldots, W_n\}$ be a partition of $S$. Then, for any set of points $P = \{p_1, \ldots, p_n\}$ in $S$,

$$\mathcal{H}_{\mathrm{distor}}(CM_\phi(W_1), \ldots, CM_\phi(W_n), W_1, \ldots, W_n) \ge \mathcal{H}_{\mathrm{distor}}(p_1, \ldots, p_n, W_1, \ldots, W_n),$$

and the inequality is strict if there exists $i \in \{1, \ldots, n\}$ for which $W_i$ has nonvanishing area and $p_i \ne CM_\phi(W_i)$. In other words, the centroid locations $CM_\phi(W_1), \ldots, CM_\phi(W_n)$ are optimal for $\mathcal{H}_{\mathrm{distor}}$ among all configurations in $S$. Note that when $n = 1$, the node location that optimizes $p \mapsto \mathcal{H}_{\mathrm{distor}}(p)$ is the centroid of the set $S$, denoted by $CM_\phi(S)$.

Recall that the gradient of $\mathcal{H}_{\mathrm{distor}}$ on $S^n \setminus \mathcal{C}$ takes the form

$$\frac{\partial \mathcal{H}_{\mathrm{distor}}}{\partial p_i}(P) = 2\, A_\phi(V_i(P)) \left( CM_\phi(V_i(P)) - p_i \right), \quad i \in \{1, \ldots, n\};$$

that is, the $i$th component of the gradient points in the direction of the vector going from $p_i$ to the centroid of its Voronoi cell. The critical points of $\mathcal{H}_{\mathrm{distor}}$ are therefore the set of centroidal Voronoi configurations in $S$. This is a natural generalization of the result for the case $n = 1$, where the optimal node location is the centroid $CM_\phi(S)$.

Area Problem

For $r \in \mathbb{R}_{>0}$, consider the performance function $f(x) = 1_{[0,r]}(x)$, that is, the indicator function of the closed interval $[0, r]$. Then, the expected-value multicenter function becomes

$$\mathcal{H}_{\mathrm{area},r}(p_1, \ldots, p_n) = \sum_{i=1}^n A_\phi\!\left( V_i(P) \cap \bar B(p_i, r) \right) = A_\phi\!\left( \bigcup_{i=1}^n \bar B(p_i, r) \right),$$

which corresponds to the area, measured according to $\phi$, covered by the union of the $n$ balls $\bar B(p_1, r), \ldots, \bar B(p_n, r)$.

Let us see how the computation of the partial derivatives of $\mathcal{H}_{\mathrm{area},r}$ specializes in this case. Here, the performance function is differentiable everywhere except at a single discontinuity, and its derivative is identically zero. Therefore, the first term in (5) vanishes.
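The gradient identity $\partial \mathcal{H}_{\mathrm{distor}}/\partial p_i = 2 A_\phi(V_i(P))(CM_\phi(V_i(P)) - p_i)$ stated above can be checked numerically on a grid. The two-point configuration and uniform density below are illustrative assumptions.

```python
import numpy as np

# Check dH_distor/dp_i = 2 A(V_i)(CM(V_i) - p_i) against a finite-difference
# gradient of the grid approximation of H_distor (uniform density on [0,1]^2).

g = np.linspace(0.0, 1.0, 250)
Q = np.array(np.meshgrid(g, g)).reshape(2, -1).T
w = 1.0 / len(Q)                                  # uniform quadrature weight

def H_distor(P):
    d2 = np.sum((Q[:, None, :] - P[None, :, :]) ** 2, axis=2)
    return -np.sum(np.min(d2, axis=1)) * w        # -sum_i int_{V_i} ||q-p_i||^2

P = np.array([[0.3, 0.4], [0.7, 0.6]])
d2 = np.sum((Q[:, None, :] - P[None, :, :]) ** 2, axis=2)
owner = np.argmin(d2, axis=1)                     # Voronoi assignment of q

i = 0
A_i = np.sum(owner == i) * w                      # A(V_i)
cm_i = Q[owner == i].mean(axis=0)                 # CM(V_i)
grad_formula = 2 * A_i * (cm_i - P[i])

eps = 1e-4                                        # central finite differences
grad_fd = np.zeros(2)
for k in range(2):
    Pp, Pm = P.copy(), P.copy()
    Pp[i, k] += eps
    Pm[i, k] -= eps
    grad_fd[k] = (H_distor(Pp) - H_distor(Pm)) / (2 * eps)

print(grad_formula, grad_fd)
```

The two computations agree because the integrand is continuous across Voronoi boundaries, so no boundary terms appear in the gradient.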
The gradient of $\mathcal{H}_{\mathrm{area},r}$ on $S^n \setminus \mathcal{C}$ then takes the form, for each $i \in \{1, \ldots, n\}$,

$$\frac{\partial \mathcal{H}_{\mathrm{area},r}}{\partial p_i}(P) = \int_{V_i(P) \cap \partial \bar B(p_i, r)} n_{\mathrm{out}}(q)\, \phi(q)\, dq,$$

where $n_{\mathrm{out}}$ is the outward normal vector to $\bar B(p_i, r)$. The critical points of $\mathcal{H}_{\mathrm{area},r}$ correspond to configurations with the property that each $p_i$ is a local maximum for the area of $V_{i,r}(P) = V_i(P) \cap \bar B(p_i, r)$ at fixed $V_i(P)$. We refer to these configurations as $r$-limited area-centered Voronoi configurations.

Optimal Deployment Algorithms

Once a set of optimal deployment configurations has been characterized, the next step is to devise a distributed algorithm that allows a group of mobile robots to converge to such configurations. Gradient algorithms are the first of the options that should be explored. For the expected-value multicenter functions and robots whose dynamics can be described by first-order integrator dynamics and which can communicate at predetermined communication rounds of a fixed time schedule, these laws present a similar structure, loosely described as follows:

[Informal description] In each communication round, each robot performs the following tasks: (i) it transmits its position and receives its neighbors' positions; (ii) it computes a notion of the geometric center of its own cell, determined according to some notion of partition of the environment; and (iii) between communication rounds, each robot moves toward this center.

The notions of geometric center and of partition of the environment differ depending on the type of expected-value multicenter function used. In the Voronoi-centroid deployment algorithm, the geometric center just reduces to $CM_\phi(V_i)$. In the limited-Voronoi-normal deployment algorithm, in (ii) each agent computes the direction of $v = \frac{\partial \mathcal{H}_{\mathrm{area},r}}{\partial p_i}$ for some $r$, and in (iii) it moves for a maximum step size in this direction, chosen to ensure that the area function does not decrease.

The Voronoi-centroid deployment algorithm achieves convergence of a set of nodes to a centroidal Voronoi configuration, thus maximizing the expected-value multicenter function $\mathcal{H}_{\mathrm{distor}}$. The algorithm is distributed over the proximity graph $G_D$, as the computation of the centroids requires information in $N_{G_D}(p_i)$, for each $i \in \{1, \ldots, n\}$. Additional properties of this algorithm are that it is adaptive to agent departures or arrivals and amenable to asynchronous implementations.

On the other hand, the limited-Voronoi-normal deployment algorithm achieves convergence to a set that locally maximizes the area covered by the set of sensing balls. The algorithm is distributed in the sense that agents only need to know information from neighbors in the proximity graph $G_{2r}$ or, more precisely, $G_{\mathrm{LD},r}$. Thus, it can be implemented by agents that employ range-limited interactions. It enjoys robustness properties similar to those of the Voronoi-centroid deployment algorithm.

Simulation Results

We show evolutions of the Voronoi-centroid deployment algorithm in Fig. 1. One can verify that the final network configuration is a centroidal Voronoi configuration. For each evolution we depict the initial positions, the trajectories, and the final positions of all robots.

Finally, we show an evolution of the limited-Voronoi-normal deployment algorithm in Fig. 2. One can verify that the final network configuration is an $\frac{r}{2}$-limited area-centered Voronoi configuration. In other words, the deployment task is achieved.

Future Directions for Research

The algorithms described above achieve locally optimal deployment configurations with respect to expected-value multicenter functions. However, this simplified setting does not account for many important constraints, such as obstacles and deployment in non-convex environments (Pimenta et al. 2008; Caicedo-Nùñez and Žefran 2008), deployment with visibility sensors, and range-limited and wedge-shaped sensors.
Optimal Deployment and Spatial Coverage, Fig. 1 The evolution of the Voronoi-centroid deployment algorithm with n = 20 robots. The left-hand (resp., right-hand) figure illustrates the initial (resp., final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. After 13 s, the value of H_distor has monotonically increased to approximately 0.515.

Optimal Deployment and Spatial Coverage, Fig. 2 The evolution of the limited-Voronoi-normal deployment algorithm with n = 20 robots and r = 0.4. The left-hand (resp., right-hand) figure illustrates the initial (respectively, final) locations and Voronoi partition. The central figure illustrates the evolution of the robots. The r/2-limited Voronoi cell of each robot is plotted in light gray. After 36 s, the value of H_area, with a = r/2, has monotonically increased to approximately 14.141.
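The Voronoi-centroid iteration described in the informal description above can be sketched as a synchronous Lloyd-type loop. The uniform density on the unit square and the grid approximation are illustrative assumptions.

```python
import numpy as np

# Sketch of the Voronoi-centroid (Lloyd-type) deployment loop: in each
# round, every robot moves to the centroid of its own Voronoi cell
# (uniform density on [0,1]^2, synchronous rounds for simplicity).

rng = np.random.default_rng(7)
P = rng.random((10, 2))                   # initial robot positions

g = np.linspace(0.0, 1.0, 150)
Q = np.array(np.meshgrid(g, g)).reshape(2, -1).T

def centroids(P):
    owner = np.argmin(np.linalg.norm(Q[:, None] - P[None], axis=2), axis=1)
    return np.array([Q[owner == i].mean(axis=0) if np.any(owner == i) else P[i]
                     for i in range(len(P))])

for _ in range(50):                       # communication rounds
    P = centroids(P)                      # steps (ii)-(iii): move to the centroid

# at convergence each p_i approximately equals the centroid of its own cell,
# i.e., the network is (approximately) a centroidal Voronoi configuration
drift = np.linalg.norm(P - centroids(P), axis=1).max()
print(drift)
```

The fixed points of this iteration are exactly the centroidal Voronoi configurations characterized above as critical points of $\mathcal{H}_{\mathrm{distor}}$.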
(or an A/D converter) $\mathcal{S}$. This sampler reads out the values of $e(t)$ at every time step $h$, called the sampling period, and produces a discrete-time signal $e_d[k]$, $k = 0, 1, 2, \ldots$ (Fig. 2). In particular, the sampling operator $\mathcal{S}$ acts on a continuous-time signal $w(t)$, $t \ge 0$, as

$$\mathcal{S}(w)[k] := w(kh), \quad k = 0, 1, 2, \ldots,$$

while the hold operator $\mathcal{H}$ produces a continuous-time signal from a discrete-time signal $w[k]$ according to

$$(\mathcal{H}(w[k]))(t) := w[k], \quad \text{for } kh \le t < (k+1)h.$$

A typical sample-hold action is shown in Fig. 2. While one can consider a nonlinear plant $P$ or controller $C$, or infinite-dimensional $P$ and $C$, we confine ourselves to linear and finite-dimensional $P$ and $C$, and we also suppose that $P$ and $C$ are time invariant in continuous time and in discrete time, respectively.
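The sampler $\mathcal{S}$ and hold $\mathcal{H}$ just defined can be sketched directly; the sampling period and test signal below are illustrative choices.

```python
import numpy as np

# Sketch of the ideal sampler S and zero-order hold H, with a
# continuous-time signal represented as a Python function of t.

h = 0.5                                   # sampling period (illustrative)

def sample(w, kmax):
    """S(w)[k] := w(kh)."""
    return [w(k * h) for k in range(kmax)]

def hold(wd):
    """(H(w[k]))(t) := w[k] for kh <= t < (k+1)h."""
    def wc(t):
        return wd[min(int(t // h), len(wd) - 1)]
    return wc

w = lambda t: np.sin(t)
wd = sample(w, 8)                         # discrete-time signal w(kh)
wc = hold(wd)                             # piecewise-constant reconstruction

print(wd[2], wc(1.2))                     # wc is constant on [kh, (k+1)h)
```

Note that $\mathcal{S}\mathcal{H}$ is the identity on discrete-time signals, i.e., sampling the held signal recovers the original sequence.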
The lack of time invariance implies that we cannot naturally associate with sampled-data systems such classical concepts as transfer functions, steady-state response, and frequency response. One can regard Fig. 1 as a time-invariant discrete-time system by ignoring the intersample behavior and focusing attention on the sample-point behavior only. But the model so obtained does not reflect what happens between sampling times. This approach can lead to the neglect of undesirable intersample oscillations, called ripples. To monitor the intersample behavior, the notion of the modified z-transform was introduced; see, e.g., Jury (1958) and Ragazzini and Franklin (1958). However, this transform is usable only after the controller has been designed and hence not for the design problems considered in this article.

Lifting: A Modern Approach

Lifting is based on the correspondence

$$\mathcal{L} : f \mapsto \{ f[k](\cdot) \}_{k=0}^{\infty}, \qquad f[k](\theta) = f(kh + \theta), \quad 0 \le \theta < h.$$

This idea makes it possible to view a (time-invariant or even periodically time-varying) continuous-time system as a linear, time-invariant discrete-time system. Let

$$\dot x(t) = A x(t) + B u(t), \qquad y(t) = C x(t) \tag{2}$$

be a given continuous-time plant, and lift the input $u(t)$ to obtain $u[k](\cdot)$. We apply this lifted input with the timing $t = kh$ ($h$ is the prespecified sampling period as above) and observe how it affects the system. Let $x[k]$ be the state at time $t = kh$. The state $x[k+1]$ at time $(k+1)h$ is given by

$$x[k+1] = e^{Ah} x[k] + \int_0^h e^{A(h-\tau)} B\, u[k](\tau)\, d\tau. \tag{3}$$

The right-hand side integral defines an operator

$$L^2[0, h) \to \mathbb{R}^n : u(\cdot) \mapsto \int_0^h e^{A(h-\tau)} B\, u(\tau)\, d\tau.$$

As such, the lifted output $y[k](\cdot)$ is given by

$$y[k](\theta) = C e^{A\theta} x[k] + C \int_0^\theta e^{A(\theta - \tau)} B\, u[k](\tau)\, d\tau, \quad 0 \le \theta < h.$$
( e^{jωh} [ I 0 ; 0 I ] − exp h[ A + BR^{-1}D*C   BR^{-1}B* ; −C*(I + DR^{-1}D*)C   −(A + BR^{-1}D*C)* ] ) [ x ; p ] = 0   (16)

where R = (γ²I − D*D). The important point to be noted here is that all the operators appearing here are actually matrices. For example, B is an operator from L²[0, h) to R^n, and its adjoint B* is an operator from R^n to L²[0, h). Hence, the composition BR^{-1}B* is a linear operator from R^n into itself, i.e., a matrix. Thus, for a given γ the singular value equation admits a nontrivial solution.

• Sampled-data H∞ design with the generalized plant

Optimal Sampled-Data Control, Fig. 5: Disturbance rejection
Optimal Sampled-Data Control, Fig. 6: Frequency responses, h = 0.5
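The state recursion (3) underlies the familiar zero-order-hold discretization: if the lifted input u[k](·) happens to be constant on the sampling interval, the integral collapses to A_d = e^{Ah} and B_d = ∫_0^h e^{Aτ} B dτ, and both matrices can be read off a single matrix exponential of an augmented matrix. A minimal sketch (the double-integrator plant and the truncated-series exponential are illustrative assumptions, not taken from the text):

```python
import numpy as np

def expm(M, terms=40):
    """Matrix exponential via a truncated Taylor series (sufficient for
    this small example); production code would use scaling and squaring."""
    E, T = np.eye(len(M)), np.eye(len(M))
    for k in range(1, terms):
        T = T @ M / k
        E = E + T
    return E

def zoh_discretize(A, B, h):
    """x[k+1] = Ad x[k] + Bd u[k] for u held constant on [kh, (k+1)h).
    Exponentiating h * [[A, B], [0, 0]] yields [[Ad, Bd], [0, I]]."""
    n, m = A.shape[0], B.shape[1]
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    E = expm(M * h)
    return E[:n, :n], E[:n, n:]

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (illustrative)
B = np.array([[0.0], [1.0]])
Ad, Bd = zoh_discretize(A, B, 0.5)
```

For this nilpotent A the series is exact after three terms, giving Ad = [[1, h], [0, 1]] and Bd = [[h²/2], [h]].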
Summary, Bibliographical Notes, and Future Directions

We have given a short summary of the main achievements of modern sampled-data control theory. In particular, we have reviewed how the technique of lifting resolved the intrinsic difficulty arising from the mixture of two distinct time sets: continuous and discrete. This idea further led to the new notions of transfer operators and frequency response. These notions together enabled us to treat optimal sampled-data control problems in a unified and transparent way. We have outlined how the sampled-data H∞ control problem can equivalently be reduced to a corresponding discrete-time H∞ problem, without sacrificing the performance in the intersample behavior. This has been exemplified by a numerical example.

There are other performance indices for optimality, typically those arising from H² and L¹ norms. These problems have also been studied extensively, and fairly complete solutions are available. For lack of space, we cannot list all references; the reader is referred to Chen and Francis (1995) and Yamamoto (1999) for a more concrete survey and the references therein.

For classical treatments of sampled-data control, it is instructive to consult Jury (1958) and Ragazzini and Franklin (1958). The textbook Åström and Wittenmark (1996) covers both classical and modern aspects of digital control. For a mathematical background on the computation of adjoints treated in section "H∞ Norm Computation and Reduction to Finite Dimension," consult Yamamoto (2012) as well as Yamamoto (1993).

Since control devices are now mostly digital, the importance of sampled-data control will definitely increase. While the linear, time-invariant case as treated here is now fairly complete, sampled-data control for a nonlinear or an infinite-dimensional plant seems to be still quite an open issue, although it is unclear if the methodology treated here is effective for such classes of plants.

Sampled-data control has much to do with signal processing. Indeed, since it can optimize continuous-time performance, it can shed new light on digital signal processing. Traditionally, Shannon's paradigm based on the perfect band-limiting hypothesis and the sampling theorem has been prevalent in the signal processing community. Since the sampling theorem opts for perfect reconstruction, the resulting theory reduces mostly to discrete-time problems. In other words, the intersample information is buried in the sampling theorem. It should, however, be noted that the very stringent band-limiting hypothesis is almost never satisfied in reality, and various approximations are necessitated. In contrast, sampled-data control can provide an optimal platform for dealing with and optimizing the response between sampling points when the band-limiting hypothesis does not hold. See, for example, Yamamoto et al. (2012) and Nagahara and Yamamoto (2012) for the idea and some efforts in this direction.

Cross-References
Control Applications in Audio Reproduction
H2 Optimal Control
H-Infinity Control
Optimal Control via Factorization and Model Matching

Acknowledgments The author would like to thank Masaaki Nagahara and Masashi Wakaiki for their help with the numerical examples.

Bibliography
Åström KJ, Wittenmark B (1996) Computer controlled systems—theory and design, 3rd edn. Prentice Hall, Upper Saddle River
Optimization Algorithms for Model Predictive Control
restrict ourselves to the time-independent case, and the OCP we treat in this article is stated as follows:

minimize over X, U:   Σ_{i=0}^{N−1} L(x_i, u_i) + E(x_N)   (1a)
subject to   x_0 − x̄_0 = 0,   (1b)
             x_{i+1} − f(x_i, u_i) = 0,  i = 0, ..., N−1,   (1c)
             h(x_i, u_i) ≤ 0,  i = 0, ..., N−1,   (1d)
             r(x_N) ≤ 0.   (1e)

The MPC objective is stated in Eq. (1a), the system dynamics enter via Eq. (1c), while path and terminal constraints enter via Eqs. (1d) and (1e). All functions are assumed to be differentiable and to have appropriate dimensions (h(x, u) ∈ R^{n_h} and r(x) ∈ R^{n_r}). Note that x̄_0 ∈ R^{n_x} is not an optimization variable but a parameter upon which the OCP depends via the initial value constraint in Eq. (1b). The optimal solution trajectories depend only on this value and can thus be denoted by X*(x̄_0) and U*(x̄_0). Obtaining them, in particular the first control value u_0*(x̄_0), as fast and reliably as possible for each new value of x̄_0 is the aim of all MPC optimization algorithms. The most important dividing line is between convex and non-convex optimal control problems (OCPs). If the OCP is convex, algorithms exist that find a global solution reliably and in computable time. If the OCP is not convex, one usually needs to be satisfied with approximations of locally optimal solutions. The OCP (1) is convex if the objective (1a) and all components of the inequality constraint functions (1d) and (1e) are convex and if the equality constraints (1c) are linear. We typically speak of linear MPC when the OCP to be solved is convex, and otherwise of nonlinear MPC.

General Algorithmic Features for MPC Optimization

In MPC we would dream of having the solution to a new optimal control problem instantly, which is impossible due to computational delays. Several ideas can help us to deal with this issue.

Off-line Precomputations and Code Generation
As consecutive MPC problems are similar and differ only in the value x̄_0, many computations can be done once and for all before the MPC controller execution starts. Careful preprocessing and code optimization for the model routines are essential, and many tools automatically generate custom solvers in low-level languages. The generated code has fixed matrix and vector dimensions, has no online memory allocations, and contains a minimal number of if-then-else statements to ensure a smooth computational flow.

Delay Compensation by Prediction
When we know how long our computations for solving an MPC problem will take, it is a good idea not to address a problem starting at the current state but to predict at which state the system will be once the problem has been solved. This can be done using the MPC system model and the open-loop control inputs that we will apply in the meantime. This feature is used in many practical MPC schemes with non-negligible computation time.

Division into Preparation and Feedback Phase
A third ingredient of several MPC algorithms is to divide the computations in each sampling time into a preparation phase and a feedback phase. The more CPU-intensive preparation phase is performed with a predicted state x̄_0, before the most current state estimate, say x̄_0′, is available. Once x̄_0′ is available, the feedback phase quickly delivers an approximate solution to the optimization problem for x̄_0′.

Warmstarting and Shift
An obvious way to transfer solution information from one solved MPC problem to the next uses the existing optimal solution as an initial guess to start the iterative solution procedure of the next problem. We can either directly use the existing solution without modification for warmstarting, or we can first shift it in order to
account for the advancement of time, which is particularly advantageous for systems with time-varying dynamics or objectives.

Iterating While the Problem Changes
A last important ingredient of some MPC algorithms is the idea to work on the optimization problem while it changes, i.e., to never iterate the optimization procedure to convergence for an MPC problem that is getting older and older during the iterations, but rather to work with the most current information in each new iteration.

Convex Optimization for Linear MPC

Linear MPC is based on a linear system model of the form x_{i+1} = A x_i + B u_i and convex objective and constraint functions in (1a), (1d), and (1e). The most widespread linear MPC setting uses a convex quadratic objective function and affine constraints and solves the following quadratic program (QP):

minimize over X, U:   Σ_{i=0}^{N−1} (1/2) [x_i; u_i]^T [Q S; S^T R] [x_i; u_i] + ⋯   (2)

Here, b, c are vectors and Q, S, R, P, C, D, F are matrices; the matrices [Q S; S^T R] and P are symmetric and positive semi-definite to ensure that the QP is convex.

Sparsity Exploitation
The QP (2) has a specific sparsity structure that can be exploited in different ways. One way is to reduce the variable space by a procedure called condensing and then to solve a smaller-scale QP instead of (2). Another way is to use a banded matrix factorization.

Condensing
The constraints (2b) and (2c) can be used to eliminate the state trajectory X. This yields an equivalent but smaller-scale QP of the following form:

minimize over U ∈ R^{N n_u}:   (1/2) [U; x̄_0]^T [H G; G^T J] [U; x̄_0]   (3a)
subject to   d + K x̄_0 + M U ≤ 0.   (3b)

The number of inequality constraints is the same as in the original QP (2) and given by m = N n_h + n_r. Note that in the simplest case without inequalities (m = 0), the solution U*(x̄_0) of the condensed QP can be obtained by setting the gradient of the objective to zero, i.e., by solving H U*(x̄_0) + G x̄_0 = 0. The factorization of a dense matrix H with dimension N n_u × N n_u needs O(N³ n_u³) arithmetic operations, i.e., the computational cost of condensing-based algorithms typically grows cubically with the horizon length N.

For the alternative based on a banded matrix factorization, one works with the Lagrangian of QP (2),

L(W) = Σ_{i=0}^{N−1} { (1/2) [x_i; u_i]^T [Q S; S^T R] [x_i; u_i] + ⋯ + y_{i+1}^T (x_{i+1} − A x_i − B u_i) }.   (4)

If we reorder all unknowns that enter the Lagrangian and summarize them in the vector

W = [y_0^T, x_0^T, u_0^T, y_1^T, x_1^T, u_1^T, ..., y_N^T, x_N^T]^T,

the optimal solution W*(x̄_0) is uniquely characterized by the first-order optimality condition.
(d + K x̄_0 + M U)_i μ_i + τ = 0,  i = 1, ..., m.   (5b)

These conditions form a smooth nonlinear system of equations that uniquely determines a primal-dual solution U*(x̄_0, τ) and μ*(x̄_0, τ) in the interior of the feasible set. They are not equivalent to the KKT conditions, but for τ → 0, their solution tends to the exact QP solution. An interior point algorithm solves the system (5a) and (5b) by Newton's method. Simultaneously, the path parameter τ, which was initially set to a large value, is iteratively reduced, making the nonlinear set of equations a closer approximation of the original KKT system. In each Newton iteration, a linear system needs to be factored and solved, which constitutes the major computational cost of an interior point algorithm. For the condensed QP (3) with dense matrices H, M, the cost per Newton iteration is of order O(N³). But the interior point algorithm can also be applied to the uncondensed sparse QP (2), in which case each iteration has a runtime of order O(N). In practice, for both cases, 10–30 Newton iterations usually suffice to obtain very accurate solutions. As an interior point method always needs to start with a high value of τ and then reduces it during the iterations, warmstarting is of minor benefit. There exist efficient code generation tools that export convex interior point solvers as plain C-code, such as CVXGEN and FORCES.

Gradient Projection Methods
Gradient projection methods do not need to factorize any matrix but only evaluate the gradient of the objective function, H U^[k] + G x̄_0, in each iteration. They can only be implemented efficiently if the feasible set is simple in the sense that a projection P(U) onto this set is very cheap to compute, as, e.g., for upper and lower bounds on the variables U, and if we know an upper bound L_H > 0 on the eigenvalues of the Hessian H. The simple gradient projection algorithm starts with an initialization U^[0] and proceeds as follows:

U^[k+1] = P( U^[k] − (1/L_H)(H U^[k] + G x̄_0) ).

An improved version of the gradient projection algorithm is called the optimal or fast gradient method and has probably the best possible iteration complexity of all gradient-type methods. All variants of gradient projection algorithms are easy to warmstart. Though they are not as versatile as active set or interior point methods, they have short code sizes and can offer advantages on embedded computational hardware, such as the code generated by the tool FIORDOS.

Optimization Algorithms for Nonlinear MPC

When the dynamic system x_{i+1} = f(x_i, u_i) is not affine, the optimal control problem (1) is non-convex, and we speak of a nonlinear MPC (NMPC) problem. NMPC optimization algorithms only aim at finding a locally optimal solution of this problem, and they usually do so in a Newton-type framework. For ease of notation, we summarize problem (1) in the form of a general nonlinear programming problem (NLP):

minimize over X, U:   Φ(X, U)   (6a)
subject to   G_eq(X, U, x̄_0) = 0,   (6b)
             G_ineq(X, U) ≤ 0.   (6c)

Let us first discuss a fundamental choice that regards the problem formulation and the number of optimization variables.

Simultaneous vs. Sequential Formulation
When an optimization algorithm addresses problem (6) iteratively, it works intermediately with nonphysical, infeasible trajectories that violate the system constraints (6b). Only at the optimal solution is the constraint residual brought to zero and a physical simulation achieved. We speak of a simultaneous approach in this case.
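The gradient projection recursion described above is compact enough to state in full. A minimal sketch on randomly generated condensed-QP data with box constraints (the matrix H, the vector g standing in for G x̄_0, and the bounds are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
H = M @ M.T + np.eye(n)            # positive definite Hessian of the condensed QP
g = rng.standard_normal(n)         # plays the role of G @ xbar0
lo, hi = -1.0, 1.0                 # simple box constraints on U

def project(U):
    """Projection P(U) onto the box; cheap by construction."""
    return np.clip(U, lo, hi)

L_H = np.linalg.eigvalsh(H).max()  # upper bound on the Hessian eigenvalues
U = np.zeros(n)                    # initialization U[0]
for _ in range(500):
    # U[k+1] = P( U[k] - (1/L_H)(H U[k] + g) )
    U = project(U - (H @ U + g) / L_H)
```

At convergence, U satisfies the projected-gradient fixed-point condition U = P(U − (1/L_H)(H U + g)), which characterizes the QP solution on the box.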
L(V, Y, μ) = Φ(V) + Y^T G_eq(V, x̄_0) + μ^T G_ineq(V).   (8)

Here, the subindex "lin" in the constraints G_·,lin expresses that a first-order Taylor expansion at
V^[k] is used, while the QP objective is given by Φ_quad(V; V^[k], Y^[k], μ^[k]) = Φ_lin(V; V^[k]) + (1/2)(V − V^[k])^T ∇²_V L(·)(V − V^[k]). Note that the QP has the same sparsity structure as the QP (2) resulting from linear MPC, with the only difference that all matrices are now time varying over the MPC horizon. In the case that the Hessian matrix is positive semi-definite, this QP is convex, so that global solutions can be found reliably with any of the methods from section "Convex Optimization for Linear MPC." The solution of the QP along with the corresponding constraint multipliers gives the next SQP iterate (V^[k+1], Y^[k+1], μ^[k+1]). Apart from the presented "exact Hessian" SQP variant, which has quadratic convergence speed, several other SQP variants exist, which make use of other Hessian approximations. A particularly useful Hessian approximation for NMPC is possible if the original objective function Φ(V) is convex quadratic, and the resulting SQP variant is called the generalized Gauss-Newton method. In this case, one can just use the original objective as cost function in the QP (9a), resulting in convex QP subproblems and (often fast) linear convergence speed.

Nonlinear Interior Point (NIP) Method
In contrast to SQP methods, an alternative way to address the solution of the KKT system is to replace the last nonsmooth KKT conditions by a smooth nonlinear approximation, with τ > 0:

∇_V L(V, Y, μ) = 0   (10a)
G_eq(V, x̄_0) = 0   (10b)
G_ineq,i(V) μ_i + τ = 0,  i = 1, ..., m.   (10c)

We summarize all variables in a vector W = [V^T, Y^T, μ^T]^T and summarize the above set of equations as

G_NIP(W, x̄_0, τ) = 0.   (11)

The resulting root-finding problem is then solved with Newton's method, for a descending sequence of path parameters τ^[k]. The NIP method proceeds thus exactly as in an interior point method for convex problems, with the only difference that it has to re-linearize all problem functions in each iteration. An excellent software implementation of the NIP method is given in the form of the code IPOPT.

Continuation Methods and Tangential Predictors
In nonlinear MPC, a sequence of OCPs with different initial values x̄_0^[0], x̄_0^[1], x̄_0^[2], ... is solved. For the transition from one problem to the next, it is beneficial to take into account the fact that the optimal solution W*(x̄_0) depends almost everywhere differentiably on x̄_0. The concept of a continuation method is most easily explained in the context of an NIP method with fixed path parameter τ̄ > 0. In this case, the solution W*(x̄_0, τ̄) of the smooth root-finding problem G_NIP(W*(x̄_0, τ̄), x̄_0, τ̄) = 0 from Eq. (11) is smooth with respect to x̄_0. This smoothness can be exploited by making use of a tangential predictor in the transition from one value of x̄_0 to another. Unfortunately, the interior point solution manifold is strongly nonlinear at points where the active set changes, and the tangential predictor is not a good approximation when we linearize at such points.

Generalized Tangential Predictor and Real-Time Iterations
In fact, the true NLP solution is not determined by a smooth root-finding problem (10a)–(10c) but by the (nonsmooth) KKT conditions. The solution manifold has smooth parts when the active set does not change, but non-differentiable points occur whenever the active set changes. We can deal with this fact naturally in an SQP framework by solving one QP of form (9) in order to generate a tangential predictor that is also valid in the presence of active set changes. In the extreme case that only one such QP is solved per sampling time, we speak of a real-time iteration (RTI) algorithm. The computations in each iteration can be subdivided into two phases: the preparation phase, in which the derivatives are computed and the QP is condensed, and the feedback phase, which
only starts once x̄_0^[k+1] becomes available and in which only a condensed QP of form (3) is solved, minimizing the feedback delay. This NMPC algorithm can be generated as plain C-code, e.g., by the tool ACADO. Another class of real-time NMPC algorithms based on a continuation method can be generated by the tool AutoGenU.

Cross-References
Explicit Model Predictive Control
Model-Predictive Control in Practice
Numerical Methods for Nonlinear Optimal Control Problems

Recommended Reading
Many of the algorithmic ideas presented in this article can be used in different combinations than those presented, and several other ideas had to be omitted for the sake of brevity. Some more details can be found in the following two overview articles on MPC optimization: Binder et al. (2001) and Diehl et al. (2009). The general field of numerical optimal control is treated in Bryson and Ho (1975) and Betts (2010), and the even broader field of numerical optimization is covered in the excellent textbooks Fletcher (1987), Wright (1997), Nesterov (2004), Gill et al. (1999), Nocedal and Wright (2006), and Biegler (2010). General-purpose open-source software for MPC and NMPC is described in the following papers: FORCES (Domahidi et al. 2012), CVXGEN (Mattingley and Boyd 2009), qpOASES (Ferreau et al. 2008), FiOrdOs (Richter et al. 2011), AutoGenU (Ohtsuka and Kodama 2002), ACADO (Houska et al. 2011), and IPOPT (Wächter and Biegler 2006).

Bibliography
Betts JT (2010) Practical methods for optimal control and estimation using nonlinear programming, 2nd edn. SIAM, Philadelphia
Biegler LT (2010) Nonlinear programming. SIAM, Philadelphia
Binder T, Blank L, Bock HG, Bulirsch R, Dahmen W, Diehl M, Kronseder T, Marquardt W, Schlöder JP, Stryk OV (2001) Introduction to model based optimization of chemical processes on moving horizons. In: Grötschel M, Krumke SO, Rambau J (eds) Online optimization of large scale systems: state of the art. Springer, Berlin, pp 295–340
Bryson AE, Ho Y-C (1975) Applied optimal control. Wiley, New York
Diehl M, Ferreau HJ, Haverbeke N (2009) Efficient numerical methods for nonlinear MPC and moving horizon estimation. In: Nonlinear model predictive control. Lecture notes in control and information sciences, vol 384. Springer, Berlin, pp 391–417
Domahidi A, Zgraggen A, Zeilinger MN, Morari M, Jones CN (2012) Efficient interior point methods for multistage problems arising in receding horizon control. In: IEEE conference on decision and control (CDC), Maui, Dec 2012, pp 668–674
Ferreau HJ, Bock HG, Diehl M (2008) An online active set strategy to overcome the limitations of explicit MPC. Int J Robust Nonlinear Control 18(8):816–830
Fletcher R (1987) Practical methods of optimization, 2nd edn. Wiley, Chichester
Gill PE, Murray W, Wright MH (1999) Practical optimization. Academic, London
Houska B, Ferreau HJ, Diehl M (2011) An auto-generated real-time iteration algorithm for nonlinear MPC in the microsecond range. Automatica 47(10):2279–2285
Mattingley J, Boyd S (2009) Automatic code generation for real-time convex optimization. In: Convex optimization in signal processing and communications. Cambridge University Press, New York, pp 1–43
Nesterov Y (2004) Introductory lectures on convex optimization: a basic course. Applied optimization, vol 87. Kluwer, Boston
Nocedal J, Wright SJ (2006) Numerical optimization. Springer series in operations research and financial engineering, 2nd edn. Springer, New York
Ohtsuka T, Kodama A (2002) Automatic code generation system for nonlinear receding horizon control. Trans Soc Instrum Control Eng 38(7):617–623
Richter S, Morari M, Jones CN (2011) Towards computational complexity certification for constrained MPC based on Lagrange relaxation and the fast gradient method. In: 50th IEEE conference on decision and control and European control conference (CDC-ECC), Orlando, Dec 2011, pp 5223–5229
Wächter A, Biegler LT (2006) On the implementation of a primal-dual interior point filter line search algorithm for large-scale nonlinear programming. Math Program 106(1):25–57
Wright SJ (1997) Primal-dual interior-point methods. SIAM, Philadelphia
Optimization Based Robust Control
Keywords
Linear systems; Optimization; Robust control

Linear Robust Control

Robust control allows dealing with uncertainty affecting a dynamical system and its environment. In this section, we assume that we have a mathematical model of the dynamical system without uncertainty (the so-called nominal system) jointly with a mathematical model of the uncertainty. We restrict ourselves to linear systems: if the dynamical system we want to control has some nonlinear components (e.g., input saturation), they must be embedded in the uncertainty model. Similarly, we assume that the control system is relatively small scale (low number of states): higher-order dynamics (e.g., highly oscillatory but low-energy components) are embedded in the uncertainty model. Finally, for conciseness, we focus exclusively on continuous-time systems, even though most of the techniques described in this section can be transposed readily to discrete-time systems.

A static output feedback control law

u = Ky,

modeled by a matrix K ∈ R^{m×p}, must be designed to overcome the effect of the uncertainty Δ while optimizing some performance criterion (e.g., pole placement, disturbance rejection, H2 or H∞ norm). Sometimes, a relevant performance criterion is that the control should be stabilizing for the largest possible uncertainty (measured, e.g., by some norm on Δ). In this section, for conciseness, we restrict our attention to static output feedback control laws, but most of the results can be extended to dynamical output feedback control laws, where the control signal u is the output of a controller (a linear system to be designed) whose input is y.

Uncertainty Models

Amongst the simplest possible uncertainty models, we can find the following:
• Unstructured uncertainty, also called norm-bounded uncertainty, where

Δ = { δ ∈ R^q : ||δ|| ≤ 1 }

and the given norm can be a standard vector norm or a more complicated matrix norm if δ is
interpreted as a vector obtained by stacking the columns of a matrix
• Structured uncertainty, also called polytopic uncertainty, where

Δ = conv{ δ_i, i = 1, ..., N }

is a polytope modeled as the convex combination of a finite number of given vertices δ_i ∈ R^q, i = 1, ..., N
We can find more complicated uncertainty models (e.g., combinations of the two above: see Zhou et al. 1996), but to keep the developments elementary, they are not discussed here.

Nonconvex Nonsmooth Robust Optimization

The main difficulties faced when seeking a feedback matrix K are as follows:
• Nonconvexity: The stability conditions are typically nonconvex in K.
• Nondifferentiability: The performance criterion to be optimized is typically a non-differentiable function of K.
• Robustness: Stability and performance should be ensured for every possible instance of the uncertainty.
So if we are to formulate the robust control problem as an optimization problem, we should be ready to develop and use techniques from nonconvex, nondifferentiable, robust optimization.

Let us first elaborate on the first difficulty faced by optimization-based robust control, namely, the nonconvexity of the stability conditions. In continuous time, stability of a linear system ẋ = Ax is equivalent to negativity of the spectral abscissa, which is defined as the maximum real part of the eigenvalues of A. For example, there is no (k1, k2, k3) ∈ R³ such that k1² + k2² + k3² < 1 and α(A(K)) < 0 for

A(K) = [ 1  k1 ; k2  k3 ].

There exist various approaches to handling nonconvexity. One possibility consists of building convex inner approximations of the stability region in the parameter space. The approximations can be polytopes, balls, ellipsoids, or more complicated convex objects described by linear matrix inequalities (LMI). The resulting stability conditions are convex, but surely conservative, in the sense that the conditions are only sufficient for stability and not necessary. Another approach to handling nonconvexity consists of formulating the stability conditions algebraically (e.g., via the Routh-Hurwitz stability criterion or its symmetric version by Hermite) and using converging hierarchies of LMI relaxations to solve the resulting nonconvex polynomial optimization problem: see, e.g., Henrion and Lasserre (2004) and Chesi (2010).

The second difficulty characteristic of optimization-based robust control is the potential nondifferentiability of the objective function. Consider for illustration one of the simplest optimization problems, which consists of minimizing the spectral abscissa α(A(K)) of a matrix A(K) depending linearly on a matrix K. Such a minimization makes sense since negativity of the spectral abscissa is equivalent to system stability. Then typically, α(A(K)) is a continuous but non-Lipschitz function of K, which means that its gradient can be unbounded locally. In Fig. 2, we plot the spectral abscissa α(A(K)) for

A(K) = [ 0  1 ; K  K ].
[Figure over parameters K1, K2, K3]

Optimization Based Robust Control, Fig. 2: The spectral abscissa is typically nonconvex and nonsmooth (plotted here against a scalar parameter K ∈ [−2, 2]).
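The non-Lipschitz behavior of K ↦ α(A(K)) can be probed numerically for the 2×2 family just displayed; the probe points below are arbitrary illustrative choices:

```python
import numpy as np

def alpha(M):
    """Spectral abscissa: maximum real part of the eigenvalues."""
    return np.linalg.eigvals(M).real.max()

def A(K):
    # the 2x2 family A(K) = [[0, 1], [K, K]] from the Fig. 2 discussion
    return np.array([[0.0, 1.0], [K, K]])

# near K = 0 the abscissa behaves like sqrt(K) for K > 0, so the
# one-sided difference quotient blows up as K -> 0+
quotients = [(alpha(A(eps)) - alpha(A(0.0))) / eps
             for eps in (1e-2, 1e-4, 1e-6)]
```

The growing quotients (roughly 1/sqrt(K)) exhibit the locally unbounded gradient mentioned in the text.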
et al. 2001, 2006b). In Fig. 3, we represent graphs of the spectral abscissa (with flipped vertical axis for better visualization) of some small-size matrices depending on two real parameters, with randomly generated parametrizations. We observe the typical nonconvexity and lack of smoothness around local and global optima.

Optimization Based Robust Control, Fig. 3: The graph of the negative spectral abscissa for some randomly generated matrix parametrizations

The third difficulty for optimization-based robust control is the uncertainty. As explained above, optimization of a performance criterion with respect to controller parameters is already a potentially difficult problem for a nominal system (i.e., when the uncertainty parameter is equal to zero). This becomes even more difficult when this optimization must be carried out for all possible instances of the uncertainty δ in Δ. This is where the above assumption that the uncertainty set has a simple description proves useful. If the uncertainty δ is unstructured and not time varying, then it can be handled with the complex stability radius (Ackermann 1993), the pseudospectral abscissa (Trefethen and Embree 2005), or via an H∞ norm constraint (Zhou et al. 1996). If the uncertainty δ is structured, then we can try to optimize a performance criterion at every vertex of the polytopic description (which is a relaxation of the problem of stabilizing the whole polytope). An example is the problem of simultaneous stabilization, where a controller K must be found such that the maximum spectral abscissa of several matrices A_i(K), i = 1, ..., N is negative (Blondel 1994). Finally, if the uncertainty δ is time varying, then performance and stability guarantees can still be achieved with the help of Lyapunov certificates or potentially conservative convex LMI conditions: see, e.g., Boyd et al. (1994) and Scherer et al. (1997).

A unified approach to addressing conflicting performance criteria and uncertainty consists of searching for locally optimal solutions of a nonsmooth optimization problem that is built to incorporate minimization objectives and constraints for multiple plants. This is called (linear robust) multiobjective control, and formally, it can be expressed as the following optimization problem:

min_K  max_{i=1,...,N} { g_i(K) : β_i = ∞ }
s.t.   g_i(K) ≤ β_i,  i = 1, ..., N,

where each g_i(K) is a function of the closed-loop matrix A_i(K) (e.g., a spectral abscissa or an H∞ norm) and the scalars β_i are given and such that if β_i = ∞ for some i, then g_i appears in the objective function and not in a constraint: see Gumussoy et al. (2009) for details. In the above problem, the objective function, a maximum of nonsmooth and nonconvex functions, is typically also nonsmooth and nonconvex. Moreover, without loss of generality,
we can easily impose a sparsity pattern on Henrion D, Lasserre JB (2004) Solving nonconvex op-
controller matrix K to account for structural timization problems – how GloptiPoly is applied to
problems in robust and nonlinear control. IEEE Con-
constraints (e.g., a low-order decentralized trol Syst Mag 24(3):72–83
controller). Scherer CW, Gahinet P, Chilali M (1997) Multi-objective
output feedback control via LMI optimization. IEEE
Trans Autom Control 42(7):896–911
Trefethen LN, Embree M (2005) Spectra and pseudospec-
Software Packages tra: the behavior of nonnormal matrices and operators.
Princeton University Press, Princeton
Algorithms for nonconvex nonsmooth opti- Zhou K, Doyle JC, Glover K (1996) Robust and optimal
mization have been developed and interfaced control. Prentice Hall, Upper Saddle River
for linear robust multiobjective control in the
public domain Matlab package HIFOO released
in Burke et al. (2006a) and based on the theory
described in Burke et al. (2006b). In 2011,
Optimization-Based Control Design
The MathWorks released HINFSTRUCT, a
Techniques and Tools
commercial implementation of these techniques
based on the theory described in Apkarian and
Pierre Apkarian1 and Dominikus Noll2
Noll (2006). 1
DCSD, ONERA – The French Aerospace Lab,
Toulouse, France
2
Institut de Mathématiques, Université de
Cross-References Toulouse, Toulouse, France
H-Infinity Control
LMI Approach to Robust Control
Abstract

In the modern high-technology field of control, engineers usually face a large variety of concurrent design specifications such as noise or gain attenuation in prescribed frequency bands, damping, decoupling, constraints on settling or rise time, and much else. In addition, as plant models are generally only approximations of the true system dynamics, control laws have to be robust with respect to uncertainty in physical parameters or with regard to unmodeled high-frequency phenomena. Not surprisingly, such a plethora of constraints presents a major challenge for controller tuning, not only due to the ever-growing number of such constraints but also because of their very different provenance. The dramatic increase in plant complexity is exacerbated by the desire that regulators should be as simple as possible, easy to understand and to tune by practitioners, convenient to implement in hardware, and generally available at low cost. Such practical constraints explain the limited use of black-box controllers, and they are the driving force for the implementation of structured control architectures, as well as for the tendency to replace hand-tuning methods by rigorous algorithmic optimization tools.

Structured Controllers

Before addressing specific optimization techniques, we introduce some basic terminology for control design problems with structured controllers. A controller with state x_K ∈ R^{n_K} is called structured if the (real) matrices A_K, B_K, C_K, D_K depend smoothly on a design parameter x ∈ R^n, referred to as the vector of tunable parameters. Formally, we have differentiable mappings

    A_K = A_K(x),  B_K = B_K(x),  C_K = C_K(x),  D_K = D_K(x),

and we abbreviate these by the notation K(x) for short to emphasize that the controller is structured with x as tunable elements.

A structured controller synthesis problem is then an optimization problem of the form

    minimize    ‖T_wz(P, K(x))‖
    subject to  K(x) closed-loop stabilizing,    (3)
                K(x) structured, x ∈ R^n,

where T_wz(P, K) = F_ℓ(P, K) is the lower feedback connection of (1) with (2) as in Fig. 1 (left), also called the linear fractional transformation (Varga and Looye 1999). The norm ‖·‖ stands for the H∞ norm, the H2 norm, or any other system norm, while the optimization variable x ∈ R^n regroups the tunable parameters in the design.

Standard examples of structured controllers K(x) include realizable PIDs and observer-based, reduced-order, or decentralized controllers. In the structured 2-DOF architecture of Fig. 1 (right), for instance, the controller is block-diagonal in state space,

    K(x) := [ K1(s)    0
                0    K2(s) ],

interconnected with a suitably partitioned plant P.

Optimization-Based Control Design Techniques and Tools, Fig. 1 Black-box full-order controller K on the left, structured 2-DOF control architecture with K = block-diag(K1, K2) on the right

Optimization-Based Control Design Techniques and Tools, Fig. 2 Synthesis of K = block-diag(K1, …, KN) against multiple requirements or models P(1), …, P(M). Each Ki(x) can be structured

Synthesis against multiple requirements or models P(1), …, P(M), as in Fig. 2, leads to minimax programs of the form

    minimize  f(x) = max_{k ∈ SOFT, i ∈ I_k} ‖ T^(k)_{w_i z_i}(K(x)) ‖
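A minimax program of this kind can be sketched numerically. The snippet below is a hedged illustration, not the authors' method: it tunes a structured static output-feedback gain K (the simplest structured controller) against several made-up random models by direct search on the worst-case closed-loop spectral abscissa, with Nelder–Mead as a derivative-free stand-in for the dedicated nonsmooth solvers (bundle and gradient-sampling methods) underlying HIFOO and HINFSTRUCT.

```python
import numpy as np
from scipy.optimize import minimize

# Random placeholder models, not data from this entry: N closed-loop
# matrices A_k + B_k K C_k with a static output-feedback gain K.
rng = np.random.default_rng(0)
N, n, m, p = 3, 4, 2, 2
models = [(rng.normal(size=(n, n)) - 2.0 * np.eye(n),  # A_k
           rng.normal(size=(n, m)),                    # B_k
           rng.normal(size=(p, n)))                    # C_k
          for _ in range(N)]

def spectral_abscissa(M):
    """alpha(M): largest real part of the eigenvalues of M."""
    return np.max(np.linalg.eigvals(M).real)

def worst_case(x):
    """f(x) = max_k alpha(A_k + B_k K(x) C_k): nonsmooth and nonconvex."""
    K = x.reshape(m, p)
    return max(spectral_abscissa(A + B @ K @ C) for A, B, C in models)

# Derivative-free local search over the tunable parameters x.
x0 = np.zeros(m * p)
res = minimize(worst_case, x0, method="Nelder-Mead", options={"maxiter": 2000})
print("worst-case abscissa: before", worst_case(x0), "after", res.fun)
```

A negative final value would certify simultaneous stabilization of all sampled models by the tuned structured gain; a smooth general-purpose solver offers no such guarantee, which is precisely why specialized nonsmooth methods are used in practice.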
Optimization-Based Control Design Techniques and Tools, Fig. 3 Flowchart of proximity control bundle algorithm
Optimization-Based Control Design Techniques and Tools, Fig. 4 Synthesis interconnection for reliable control (Simulink diagram: setpoint commands in μ, α, and β; integral action Ki; state feedback Kx with x = [u w q v p r]; actuator outage block; F16 aircraft model; band-limited white noise wg driving a Dryden wind gust filter)
Optimization-Based Control Design Techniques and Tools, Table 1 Outage scenarios, where 0 stands for failure

Outage cases                                Diagonal of outage gain
Nominal mode                                1 1 1 1 1
Right elevator outage                       0 1 1 1 1
Left elevator outage                        1 0 1 1 1
Right aileron outage                        1 1 0 1 1
Left aileron outage                         1 1 1 0 1
Left elevator and right aileron outage      1 0 0 1 1
Right elevator and right aileron outage     0 1 0 1 1
Right elevator and left aileron outage      0 1 1 0 1
Left elevator and left aileron outage       1 0 1 0 1
In reliable flight control, one has to maintain stability and adequate performance not only in nominal operation but also in various scenarios where the aircraft undergoes outages in elevator and aileron actuators. In particular, wind gusts must be alleviated in all outage scenarios to maintain safety. Variants of this problem are addressed in Liao et al. (2002).

The open-loop F16 aircraft in the scheme of Fig. 4 has six states, the body velocities u, v, w and the pitch, roll, and yaw rates q, p, r. The state is available for control, as is the flight-path bank angle rate μ (deg/s), the angle of attack α (deg), and the sideslip angle β (deg). Control inputs are the left and right elevator, left and right aileron, and rudder deflections (deg). The elevators are grouped symmetrically to generate the angle of attack. The ailerons are grouped antisymmetrically to generate roll motion. This leads to three control actions as shown in Fig. 4. The controller consists of two blocks, a 3 × 6 state-feedback gain matrix Kx in the inner loop and a 3 × 3 integral gain matrix Ki in the outer loop, which leads to a total of 27 = dim x parameters to tune.

In addition to nominal operation, we consider the eight outage scenarios shown in Table 1. The different models associated with the outage scenarios are readily obtained by premultiplication of the aircraft control input by a diagonal matrix built from the rows in Table 1.
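The premultiplication just described is easy to mechanize. Below is a minimal sketch; the input matrix B is a random placeholder, since the actual F16 state-space data are not reproduced in this entry.

```python
import numpy as np

# Outage gains from Table 1 (columns: right elevator, left elevator,
# right aileron, left aileron, rudder; 0 stands for failure).
outage_rows = [
    [1, 1, 1, 1, 1],  # nominal mode
    [0, 1, 1, 1, 1],  # right elevator outage
    [1, 0, 1, 1, 1],  # left elevator outage
    [1, 1, 0, 1, 1],  # right aileron outage
    [1, 1, 1, 0, 1],  # left aileron outage
    [1, 0, 0, 1, 1],  # left elevator and right aileron outage
    [0, 1, 0, 1, 1],  # right elevator and right aileron outage
    [0, 1, 1, 0, 1],  # right elevator and left aileron outage
    [1, 0, 1, 0, 1],  # left elevator and left aileron outage
]

# Placeholder input matrix B (6 states x 5 surface deflections).
rng = np.random.default_rng(1)
B = rng.normal(size=(6, 5))

# Premultiplying the control input by diag(row) is equivalent to zeroing
# the corresponding columns of B: a failed surface contributes nothing.
B_models = [B @ np.diag(row) for row in outage_rows]
```

Each of the nine resulting models then enters the multi-model synthesis as one more requirement, exactly in the spirit of Fig. 2.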
Optimization-Based Control Design Techniques and Tools, Fig. 5 Responses to step changes in μ, α, and β for nominal design

The design requirements are as follows:
• Good tracking performance in μ, α, and β with adequate decoupling of the three axes.
• Adequate rejection of wind gusts of 5 m/s.
• Maintain stability and acceptable performance in the face of actuator outage.

Tracking is addressed by an LQG cost (Maciejowski 1989), which penalizes integrated tracking error e and control effort u via

    J = lim_{T→∞} E ( (1/T) ∫₀ᵀ ‖We e‖² + ‖Wu u‖² dt ).    (6)

Diagonal weights We and Wu provide tuning knobs for the trade-off between responsiveness, control effort, and balancing of the three channels. We use We = diag(20, 30, 20), Wu = I₃ for normal operation and We = diag(8, 12, 8), Wu = I₃ for outage conditions.
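For a stable linear system driven by white noise, a cost of the form (6) can be evaluated without simulation as a squared H2 norm via a Lyapunov equation. A small self-contained illustration follows; the matrices are arbitrary stand-ins, not the F16 model.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# For stable xdot = A x + B w with unit white noise w, the steady-state
# cost E||C x||^2 equals trace(C X C'), where X solves the Lyapunov
# equation A X + X A' + B B' = 0 (X is the controllability Gramian).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

X = solve_continuous_lyapunov(A, -B @ B.T)
J = float(np.trace(C @ X @ C.T))
print("steady-state LQG-type cost:", J)   # 1/12 for this data
```

Evaluating (6) this way inside an optimization loop is far cheaper than time-domain simulation, which is what makes multi-model tuning over all nine outage scenarios tractable.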
Optimization-Based Control Design Techniques and Tools, Fig. 6 Responses to step changes in μ, α, and β for fault-tolerant design

Model-dependent weights allow expressing the fact that nominal operation prevails over failure cases. Weights for failure cases are used to achieve limited deterioration of performance or of gust alleviation under deflection surface breakdown.

The second requirement, wind gust alleviation, is treated as a hard constraint limiting the variance of the error signal e in response to the white noise wg driving the Dryden wind gust model. In particular, the variance of e is limited to 0.01 for normal operation and to 0.03 for the outage scenarios.

With the notation of section “Non-smooth Optimization Techniques,” the functions f(x) and g(x) in (5) are f(x) := max_{k=1,…,9} ‖T^(k)_{rz}(x)‖₂ and g(x) := max_{k=1,…,9} ‖T^(k)_{wg e}(x)‖₂, where r denotes the set-point inputs in μ, α, and β. The regulated output z is
Option Games: The Interface Between Optimal Stopping and Game Theory

Formulating the right strategy in the right competitive environment at the right time is a nontrivial task. Whether to invest in a new technology, a new product, or to enter a new market is a strategic decision of immense importance. Corporate management must assess strategic options with proper analytical tools that can help determine whether to commit to a particular strategic path, given scarce or costly resources, or whether to stay flexible. Oftentimes, firms need to position themselves flexibly to capitalize on future opportunities as they emerge, while limiting potential losses arising from adverse future circumstances. In many cases, corporate managers find themselves in need to revise their decision plans in view of actual market developments when facing an uncertain future; they can then decide to undertake only those projects with sufficiently high prospects in the future to justify commitment at that time. This needs to be balanced with the need to make irreversible strategic commitments to seize first-mover advantage, presenting rivals with a fait accompli to which they have no choice but to adapt.

Capital Budgeting Ignoring Strategic Interactions

Net Present Value
Prevailing management approaches simplify matters and often lead to investment decisions that are detrimental to the firm's long-term well-being. Suppose a firm's future cash flow at time t is given by a random variable Xt. Cash flows then evolve as a geometric Brownian motion

    dXt = g Xt dt + σ Xt dBt  and  X0 = x

with drift parameter g and volatility σ. The Brownian motion (Bt; t ≥ 0) captures exogenous market uncertainty. The standard criterion used in corporate finance is based on discounted cash flows (DCF) or net present value (NPV). This consists in assessing the current value of a project by discounting the expected future cash flows E[Xt] at a constant discount rate, r. Management supposedly creates shareholder value by undertaking projects with positive NPV, i.e., projects for which the present value of cash flows, v(x) = ∫₀^∞ e^(−rt) E[Xt] dt, exceeds the necessary investment cost, I. In the present case, the firm will invest under the zero-NPV criterion if

    x / (r − g) ≥ I.    (1)

This traditional criterion views investment opportunities as now-or-never decisions under passive management. However, this precludes the possibility to adjust future decisions in case the market develops off the expected path. While market uncertainty is factored in through the discount rate, the flexibility management has is typically not properly accounted for.

Real Options Analysis
It has become standard practice in finance and strategy to interpret real investment opportunities as being analogous to financial options. This view is well accepted among academics and practitioners alike and is at the core of real options analysis (ROA). ROA is an extension of option-pricing theory to real investment situations (Myers 1977; Trigeorgis 1996). This approach effectively allows one to capture the dynamic nature of decision-making since it factors in management's flexibility to revise and adapt its decision in the face of market uncertainty. ROA allows managers with flexibility to adapt to actual market developments as uncertainty gets resolved. Managers may, for example, delay the start (or closure) of a project depending on its prospects. This approach leverages optimal stopping theory (e.g., see Bensoussan and Lions 1982; Dixit and Pindyck 1994) and is considered to be more reflective of real decision-making than traditional methods. In the case the firm can delay the decision to invest, for example, the problem is one of optimal stopping:

    V(x) = max_T E[ e^(−rT) (v(X_T) − I) ].

In ROA, the discount rate r is the risk-free interest rate (Dixit and Pindyck 1994; Trigeorgis 1996). The time of managerial action, T, is now a strategic decision variable, random by nature, as the decision maker faces an uncertain environment. This problem has an analytical solution characterized by a threshold policy, say a trigger X̄, given by

    X̄ / (r − g) = ( b / (b − 1) ) I,    (2)

where b is the positive root of a quadratic function (e.g., see Dixit and Pindyck 1994) and b/(b − 1) > 1.
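Equations (1) and (2) are straightforward to evaluate numerically. The following sketch uses made-up parameter values; b is computed as the positive root of the fundamental quadratic ½σ²b(b−1) + gb − r = 0 of Dixit and Pindyck (1994).

```python
import math

# Made-up parameters: discount rate, drift, volatility, investment cost.
r, g, sigma, I = 0.06, 0.02, 0.2, 1.0

# Positive root b > 1 of 0.5*sigma^2*b*(b-1) + g*b - r = 0.
a2 = 0.5 * sigma ** 2
a1 = g - 0.5 * sigma ** 2
b = (-a1 + math.sqrt(a1 ** 2 + 4.0 * a2 * r)) / (2.0 * a2)

x_npv = (r - g) * I                    # zero-NPV rule (1): invest when x/(r-g) >= I
x_opt = b / (b - 1.0) * (r - g) * I    # option trigger (2): markup b/(b-1) > 1

print("b =", b, " markup =", b / (b - 1.0))
print("NPV trigger:", x_npv, " option trigger:", x_opt)
```

The option trigger always strictly exceeds the NPV trigger, which is the quantitative content of the value-of-waiting argument in the text.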
When decisions are costly or difficult to reverse, corporate managers would be more cautious and careful in making decisions. A firm should not always commit immediately – even if the NPV criterion (1) indicates so – but wait until the gross project value is sufficiently positive to cover the investment cost I by a factor larger than one, as expressed in (2). Investing prematurely may destroy shareholder value. Real options may sometimes justify undertaking projects with negative (static) net present value if they create a platform for growth options, or delaying projects with positive NPV.

Accounting for Strategic Interactions in Capital Budgeting

Strategic Uncertainty
As natural monopolies have lost their secular, well-protected positions owing to market liberalization in the European Union and elsewhere across the globe, strategic interdependencies have become a new key challenge for managers. At the same time, sectors traditionally populated by multiple firms have undergone significant consolidation, often resulting in oligopolistic situations with a reduced number of players. The ongoing economic crisis has amplified these consolidation pressures. These two ongoing phenomena – liberalization and consolidation – have put the assessment of strategic options under competition high on the corporate agenda. Standard real options analysis often examines investment decisions as if the option holder had a proprietary right to exercise. This perspective may not be realistic in the new oligopolistic environment, as several firms may share the right to a related investment opportunity in the industry.

Game Theory
In oligopolistic industries, firms often have difficulty predicting how rivals will behave and make decisions based on beliefs about their likely behavior. A theory that helps characterize beliefs and form predictions about which strategies opponents will follow is helpful in analyzing such oligopolistic situations. Game theory has traditionally been used to frame strategic interactions arising in conflict situations involving parties with different objectives or interests. It attempts to model behavior in strategic situations or games in which one party's success in making choices depends on the choices of other players, each influencing the others' welfare. Game theory adopts a different perspective on optimization, as the focus is on the formation of beliefs about rivals' optimal strategies. Finance theory has been primarily concerned with “moves by nature,” while game theory focuses on “optimization problems” involving multiple players. To solve a game, one needs to reduce a complex multiplayer problem into a simpler structure that captures the essence of the conflict situation. One can then derive useful predictions about how rivals are likely to react in a given situation. Game theory helped reshape microeconomics by providing analytical foundations for the study of market behavior and has been at the foundation of the Nobel-prize-winning research field of industrial organization.

Dynamic game theory (see, e.g., Basar and Olsder 1999) addresses problems in which several parties are in repeated interaction. Strategic management approaches based on dynamic economic theory can provide a richer foundation for understanding developments and competitive reactions within an industry. As firm competitiveness involves interactions among several players (rivals, suppliers, or clients), game-theoretic analysis brings important insights into strategic management in addressing such issues as first- and second-mover advantages, firm entry and exit decisions, strategic commitment, reputation, signaling, and other informational
effects. A key lesson is that, when firms react to one another, it may sometimes be appropriate for one firm to take an aggressive stance in the expectation that rivals will back off. Dynamic industrial organization includes the analysis of “games of timing,” such as preemption games or wars of attrition, whereby firms decide on appropriate investment timing under rivalry.

Option Games

The earlier optimal stopping problem falls in the category of “games of timing” when a firm's entry decision influences another firm's market strategy. Option games are most suitable to help model situations where a firm that has a real option to (dis)invest faces rivalry. Here, the problem consists in finding a Nash equilibrium solution for the two-player equivalent of the above optimal stopping problem. This solution must also satisfy certain dynamic consistency criteria. For sequential investments, the follower faces a single-agent optimal investment timing problem; it will thus enter if the gross project value exceeds the investment cost by a sufficient factor. A firm entering the market early on, i.e., a leader, earns temporary monopoly rents as long as demand remains below the follower's entry threshold. Following the follower's entry, the firms act as a duopoly. As long as the leader's value exceeds the follower's, there is an incentive for one firm to invest, but not necessarily for both of them, leading to a “coordination problem.” The competitive pressure will dissipate away the leader's first-mover advantage, leading to a market entry point that is not socially optimal and to rent dissipation. Unfortunately, the multiplayer problem does not admit a simple analytical solution, since at each point a duopolist firm might end up in any of four distinct situations (a two-by-two matrix) depending on the rival's entry decision. Option games indicate in each situation which driving force (commitment vs. flexibility) prevails and whether to go ahead with the investment or wait and see. Main drivers of the prevailing market equilibrium include the riskiness of the venture σ, the magnitude of the first-mover advantage, and the exclusive or shared ability to reap the benefits of the investment vis-à-vis rivals. When firms can grasp a large first-mover advantage from investing early but cannot differentiate themselves sufficiently from each other, they may be tempted to wage a preemptive war, investing prematurely at an early market stage in a way that actually kills option value. If firms are more on an equal footing but do not see much benefit from investing early, they may prefer to wait and invest (jointly) at a later stage when the future market is sufficiently mature. If, however, one firm has a comparative cost advantage that dominates its rival (e.g., a radical or drastic technological superiority), industry participants may prefer a consensual leader-follower investment arrangement involving less option value destruction.

Conclusions

Corporate management's strategic tool kit should provide clearer guidance on whether to pursue a wait-and-see stance in the face of uncertain market developments or jump on the first-mover bandwagon to build competitive advantage. We discussed two different modeling approaches that provide complementary perspectives and insights to help management deal with issues of flexibility versus commitment: real options and dynamic game theory. While each approach separately might turn a blind eye to flexibility or commitment, an integrative perspective through “option games” might provide the right balance and serve as a tool kit for adaptive competitive strategy. Both perspectives ultimately aim to derive better insights into industry dynamics under conditions characterized by both market and strategic uncertainty.

Option games pave the way for a consistent approach in addressing managerial decision-making, elevating the art of strategy to scientific analysis. Option games integrate in a common, consistent framework recent advances made in
Cross-References

Auctions
Learning in Games

Recommended Reading

Smit and Trigeorgis (2004) discuss related trade-offs with discrete-time real option techniques. Grenadier (2000) and Huisman (2001) examine a number of continuous-time models. Chevalier-Roignant and Trigeorgis (2011) synthesize both types of “option games.” An overview of the literature is provided in Chevalier-Roignant et al. (2011).

Bibliography

Basar T, Olsder GJ (1999) Dynamic noncooperative game theory, 2nd edn. SIAM, Philadelphia
Bensoussan A, Lions J-L (1982) Application of variational inequalities in stochastic control. North Holland, Amsterdam
Chevalier-Roignant B, Trigeorgis L (2011) Competitive strategy: options and games. MIT, Cambridge, MA
Chevalier-Roignant B, Flath CM, Huchzermeier A, Trigeorgis L (2011) Strategic investment under uncertainty: a synthesis. Eur J Oper Res 215(3):639–650
Dixit AK, Pindyck RS (1994) Investment under uncertainty. Princeton University Press, Princeton
Grenadier S (2000) Game choices: the intersection of real options and game theory. Risk Books, London
Huisman KJM (2001) Technology investment: a game theoretic real options approach. Springer, Boston
Myers SC (1977) Determinants of corporate borrowing. J Financ Econ 5(2):147–175
Smit HTJ, Trigeorgis L (2004) Strategic investment: real options and games. Princeton University Press, Princeton
Trigeorgis L (1996) Real options. MIT, Cambridge, MA

Oscillator Synchronization

Abstract

The nonlinear Kuramoto equations for n coupled oscillators are derived and studied. The oscillators are defined to be synchronized when they oscillate at the same frequency and their phases are all equal. A control-theoretic viewpoint reveals that synchronized states of Kuramoto oscillators are locally asymptotically stable if every oscillator is coupled to all others. The problem of synchronization in Kuramoto oscillators is closely related to rendezvous, consensus, and flocking problems in distributed control. These problems, with their elegant solution by graph theory, are discussed briefly.

Keywords

Graph theory; Kuramoto model; Laplacian; Oscillator; Synchronization

Introduction

An oscillator is an electronic circuit or other kind of dynamical system that produces a periodic signal. If several oscillators are coupled together in some fashion and the periodic signals that they each produce are of the same frequency and are in phase, the oscillators are said to be synchronized. The book Sync: The Emerging Science of Spontaneous Order, by Strogatz, introduces a wide variety of phenomena where oscillators synchronize. Some examples from biology: networks of pacemaker cells in the heart, circadian pacemaker cells in the suprachiasmatic nucleus of the brain,
positive definite function. Indeed, let r e^(jψ) denote the average of the points e^(jθ₁), …, e^(jθₙ). Of course, r and ψ are functions of θ, and so we have

    r(θ) e^(jψ(θ)) = (1/n) ( e^(jθ₁) + ··· + e^(jθₙ) )

and therefore

    r(θ) = (1/n) | e^(jθ₁) + ··· + e^(jθₙ) |.

The average of n points on the unit circle lives inside the unit disc, and therefore r(θ) is a real number between 0 and 1. It equals 1 if and only if the n points are equal, that is, the n phases are equal, and this is the state where the phases are synchronized.

Define the function

    V(θ) = (n²/2) r(θ)²
         = (1/2) | e^(jθ₁) + ··· + e^(jθₙ) |²
         = (1/2) ( e^(jθ₁) + ··· + e^(jθₙ) )* ( e^(jθ₁) + ··· + e^(jθₙ) ).

Thus

    ∂V(θ)/∂θₖ = sin(θ₁ − θₖ) + ··· + sin(θₙ − θₖ),

and therefore (3) can be written as θ̇ = ∂V(θ)/∂θ. This is a gradient equation. If θ(0) is chosen so that all the phases are close enough together, then r(θ(0)) will be close to 1, and therefore θ will move in a direction that increases V(θ), that is, increases r(θ), until in the limit r(θ) = 1 and the phases are synchronized.

There are results, e.g., Sepulchre et al. (2008), for the case when the coupling is not all-to-all. Also, the term “synchronization” is used more generally than just for oscillators; see Wieland et al. (2011).
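The gradient ascent toward r(θ) = 1 described above is easy to reproduce numerically. The sketch below (initial phases and step size are made-up choices) integrates the all-to-all coupling θ̇ₖ = Σᵢ sin(θᵢ − θₖ) with forward Euler and watches the order parameter r(θ) climb to 1.

```python
import numpy as np

def order_parameter(theta):
    """r(theta) = |average of e^{j theta_k}|: equals 1 exactly at synchrony."""
    return abs(np.exp(1j * theta).mean())

rng = np.random.default_rng(2)
n, dt, steps = 10, 0.01, 5000
theta = 0.5 * rng.uniform(-1.0, 1.0, n)   # phases start within a small arc
r0 = order_parameter(theta)

for _ in range(steps):
    # theta_dot_k = sum_i sin(theta_i - theta_k): gradient ascent on V(theta)
    theta = theta + dt * np.sin(theta[None, :] - theta[:, None]).sum(axis=1)

print("r before:", r0, " r after:", order_parameter(theta))
```

Starting the phases close together matters: the convergence argument in the text is local, and widely scattered initial phases can sit near other critical points of V.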
Rendezvous, Consensus, Flocking, and Infinitely Many Oscillators

Synchronization of coupled oscillators is closely related to other problems known as rendezvous, consensus, or flocking problems. Phase synchronization is replaced by the requirement of mobile robots gathering at some location, by the requirement of temperature sensors in a sensor network converging to the same temperature estimate, or by the requirement that mobile robots should head in the same direction. The simplest form of these problems has the equations

    θ̇ₖ = Σ_{i ∈ Nₖ} (θᵢ − θₖ),  k = 1, …, n.    (4)

Notice that this can be obtained from the Kuramoto model (2) merely by replacing sin(θᵢ − θₖ) by θᵢ − θₖ in (2), that is, by linearizing the latter at a synchronized state. We shall continue to call θₖ a phase of an oscillator. When do the phases evolving according to (4) synchronize? The answer to the question involves a lovely collaboration between graph theory and dynamics.

Introduce a directed graph that is in one-to-one correspondence with the neighbor structure. The graph is made up of n nodes, one for each oscillator. From each node there is an arrow to every neighbor of that node; that is, from node k there is an arrow to every node in Nₖ. Denote the adjacency matrix and the degree matrix of the graph by, respectively, A and D. That is, aᵢⱼ = 1 if j is a neighbor of i, and dᵢᵢ equals the sum of the elements on row i of A. The Laplacian of the graph is defined to be L = D − A. Then (4) is equivalent to simply

    θ̇ = −Lθ,    (5)

where θ is still the vector with elements θ₁, …, θₙ. Whether or not synchronization occurs depends on the connectivity of the graph. We stop here and refer the reader to the articles Averaging Algorithms and Consensus and Flocking in Networked Systems.

Suppose there are an infinite but countable number of oscillators in the model (5). When will they synchronize? To answer this, we have to be more specific. Let us allow an infinite number of oscillators numbered by the integers: positive, zero, and negative. Denote the phases by θₖ and let θ
denote the phase vector, whose kth component is θₖ. Assume each oscillator has only finitely many neighbors, let Nₖ denote the set of neighbors of oscillator k, and let L be the Laplacian of the associated graph. Finally, let θ(t) evolve according to Eq. (5). This equation isn't automatically well posed, in the sense that there may not be a solution defined for all t > 0. We have to impose a framework so that solutions do indeed exist. One natural space in which to place θ(0) is ℓ², the Hilbert space of square-summable sequences. If L is a bounded operator on ℓ², then so is e^(−Lt) for every t > 0, and hence the phase vector exists and belongs to ℓ² for every t > 0. Another natural space in which to place θ(0) is ℓ^∞, the Banach space of bounded sequences. Again, a phase vector exists for all t > 0 if L is a bounded operator on ℓ^∞.

The following example is from Feintuch and Francis (2012). Take the neighbor sets to be Nₖ = {k − 1}. The graph is a chain: There is an arrow from node k to node k − 1, for every k, and the Laplacian is the infinite matrix with 1 on the diagonal, −1 on the first subdiagonal, and zero elsewhere. This Laplacian is a bounded operator on both ℓ² and ℓ^∞. Now the vector c1, where 1 is the vector of all 1's, belongs to ℓ^∞ for every real number c, but it belongs to ℓ² only for c = 0. So the phases can potentially synchronize at any value in ℓ^∞, but only at 0 in ℓ². For the example under discussion, if the initial phase vector is in ℓ², then the phases synchronize at 0. By contrast, there exist initial phase vectors in ℓ^∞ such that synchronization does not occur. Even worse, lim_{t→∞} θ(t) does not exist. The conclusion is that whether or not the oscillators will synchronize is a difficult question in general.
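For a finite connected graph, by contrast, the situation is clean: the linear flow (5) drives all phases to a common value. The sketch below uses an undirected n-node path graph as a finite stand-in for the infinite chain (an assumption for illustration; the chain in the text is directed and infinite), in which case consensus occurs at the average of the initial phases.

```python
import numpy as np

# Build the Laplacian L = D - A of an undirected path graph on n nodes.
n = 8
A = np.zeros((n, n))
for k in range(n - 1):                 # edges k <-> k+1
    A[k, k + 1] = A[k + 1, k] = 1.0
L = np.diag(A.sum(axis=1)) - A

rng = np.random.default_rng(3)
theta0 = rng.normal(size=n)
theta = theta0.copy()
dt = 0.05
for _ in range(4000):                  # Euler steps of theta_dot = -L theta
    theta = theta - dt * (L @ theta)

print("spread:", np.ptp(theta))                      # ~0: phases agree
print("consensus vs initial average:", theta[0], theta0.mean())
```

Because L is symmetric here, 1ᵀL = 0 and the average phase is invariant along trajectories, which pins down the consensus value; the directed infinite chain of the example above has no such conserved quantity, and that is one way to see why its behavior is so different.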
Summary and Future Directions

The Kuramoto model is a widely used paradigm for coupled oscillators. The model has the form θ̇ = f(Eθ), where θ is the vector of phases, the matrix E maps θ into the vector of possible differences θᵢ − θₖ, and f is a function. The Kuramoto model considered in this essay is not the most general. A more general model allows different frequencies ωₖ instead of just one, and also a coupling gain K, leading to the model

    θ̇ₖ = ωₖ + (K/n) Σ_{i ∈ Nₖ} sin(θᵢ − θₖ),  k = 1, …, n.    (6)

An important problem associated with the Kuramoto model is to determine which synchronized states are stable. The linearized equation is interesting in its own right and relates to problems of rendezvous, consensus, and flocking.

Reference Dörfler and Bullo (2014) offers some questions for future study. In particular, it would be interesting to extend the Kuramoto model beyond the first-order oscillators of (2). Also, the case of general neighbor sets has much room for exploration.

Asymptotic stability is a robust property. For example, if the origin is asymptotically stable for the system ẋ = Ax, it remains so if A is perturbed by a sufficiently small amount. This is because the spectrum of a matrix is a continuous function of the matrix. The sketch in Fig. 1 vividly depicts the concept of synchronized oscillators. A topic for future study is that of robustness. Mathematically, if the two metronomes are identical, they will synchronize perfectly – this can be proved. Of course, physically two metronomes cannot be identical, and yet they will synchronize if they are close enough physically. A mathematical study of this phenomenon might be interesting.

Cross-References

Averaging Algorithms and Consensus
Flocking in Networked Systems
Graphs for Modeling Networked Interactions
Networked Systems
Vehicular Chains

Recommended Reading

The literature on the Kuramoto model is huge – there are now many hundreds of journal
papers continuing the study of oscillators using Kuramoto's model. There is space here only to highlight a few sources.

You can find a mathematical study of coupled metronomes in Pantaleone (2002). Pantaleone's webpage (Pantaleone) also describes some experimental observations. Kuramoto's original paper is Kuramoto (1975). Dörfler and Bullo have recently written a comprehensive survey (Dörfler and Bullo 2014). Strogatz has written extensively on oscillator synchronization. His book Sync is fascinating and is highly recommended (Strogatz 2004). See also Strogatz (2000) and Strogatz and Stewart (1993). The papers Scardovi et al. (2007) and Dörfler and Bullo (2011) are recommended for more recent results, the latter treating the general model (6).

Getting phases in oscillators to synchronize is a special case of getting the states or outputs of coupled systems asymptotically to converge to a common value. There is a very large number of references on these subjects, a seminal one being Jadbabaie et al. (2003); others are Lin et al. (2007) and Moreau (2005). Regarding infinitely many oscillators, the physics literature treats only a continuum of oscillators, whereas countably many oscillators are the subject of Feintuch and Francis (2012).

Acknowledgments I greatly appreciate the help from Luca Scardovi, Florian Dörfler, and Francesco Bullo.

Bibliography

Lin Z, Francis BA, Maggiore M (2007) State agreement for coupled nonlinear systems with time-varying interaction. SIAM J Control Optim 46:288–307
Moreau L (2005) Stability of multi-agent systems with time-dependent communication links. IEEE Trans Autom Control 50:169–182
Pantaleone J. Webpage. http://salt.uaa.alaska.edu/jim/
Pantaleone J (2002) Synchronization of metronomes. Am J Phys 70(10):992–1000
Scardovi L, Sarlette A, Sepulchre R (2007) Synchronization and balancing on the N-torus. Syst Control Lett 56:335–341
Sepulchre R, Paley DA, Leonard NE (2007) Stabilization of planar collective motion: all-to-all communication. IEEE Trans Autom Control 52(5):811–824
Sepulchre R, Paley DA, Leonard NE (2008) Stabilization of planar collective motion with limited communication. IEEE Trans Autom Control 53(3):706–719
Strogatz SH (2000) From Kuramoto to Crawford: exploring the onset of synchronization in populations of coupled oscillators. Physica D 143:1–20
Strogatz SH (2004) Sync: the emerging science of spontaneous order. Hyperion Books, New York
Strogatz SH, Stewart I (1993) Coupled oscillators and biological synchronization. Sci Am 269:102–109
Wieland P, Sepulchre R, Allgower F (2011) An internal model principle is necessary and sufficient for linear output synchronization. Automatica 47:1068–1074

Output Regulation Problems in Hybrid Systems

Sergio Galeani
Dipartimento di Ingegneria Civile e Ingegneria Informatica, Università di Roma “Tor Vergata”, Roma, Italy
and the plant P will be described at time (t, k) by

  ẋ = Ax + Bu + Pw,  (x, u, t, k) ∈ C_P,   (2a)
  x⁺ = Ex + Rw,  (x, u, t, k) ∈ D_P,   (2b)
  e = Cx + Qw,   (2c)

with x(t, k) ∈ ℝⁿ, u(t, k) ∈ ℝᵐ, e(t, k) ∈ ℝᵖ, w(t, k) ∈ ℝ^q, and suitably defined flow sets C_E, C_P and jump sets D_E, D_P.

Stabilization Obstructions in Hybrid Regulation

The achievement of asymptotic stabilization of the desired (zero output) steady-state responses for the considered class of linear hybrid systems crucially depends on whether the plant P and the exosystem E have synchronous jump times or not.

Asynchronous Jumps
Typically, jumps in P and E will be asynchronous, and this causes the undesirable phenomenon that genuinely close trajectories look "distant" around each jump when the distance is measured according to the usual Euclidean norm. The simplest illustration of such a phenomenon consists in considering two trajectories of the same system starting from ε-close initial conditions. Consider the system

  v̇ = 1, v ∈ [0, 1];  v⁺ = 0, v ∉ (0, 1);

with the initial states v₀ = 0 and v₁ = ε, 0 < ε < 1. The two ensuing solutions at time (t, k) are immediately computed as

  v(t, k; v₀) = t − k,  t ∈ [k, k + 1],

  v(t, k; v₁) = t − k + ε for t ∈ [k, k + 1 − ε],  and  t − k + ε − 1 for t ∈ [k + 1 − ε, k + 1].

Hence, the (Euclidean) distance between the two solutions at time (t, k) is given by

  d(t, k) = ε for t ∈ [k, k + 1 − ε],  and  1 − ε for t ∈ [k + 1 − ε, k + 1];

in other words, choosing ε > 0 as small as desired, arbitrarily close initial conditions generate trajectories which are apart by a finite amount (as close as desired to 1) during the arbitrarily small time intervals where t ∈ [k + 1 − ε, k + 1].

Since stability deals with trajectories remaining close forever and attractivity deals with trajectories getting closer and closer, examples such as the one above pose serious issues when defining (let alone establishing) stability and attractivity in the hybrid case. Similar problems arise not only in output regulation problems but also in other areas like state tracking, observers, and general interconnections of hybrid systems.

However, intuition suggests (and mathematics confirms, by using a suitable notion of "distance") that such trajectories are indeed close. Several approaches have been proposed in order to overcome this difficulty. Considering as an example a bouncing ball tracking another bouncing ball, the problematic time intervals are those between the bounce of the first ball hitting the ground and the bounce of the other ball; in such a case, the modified distances are defined by either
• Allowing the exclusion of sufficiently short "problematic" intervals (possibly requiring that their length asymptotically tends to zero); see, e.g., Galeani et al. (2008, 2012)
• Considering alternative "mirrored" trajectories computed as if the last jump did not happen; see, e.g., Forni et al. (2013a,b)
• Using a "stretched" distance function δ such that when point a is in the jump set and its image via the jump map is g(a), then δ(a, b) = δ(g(a), b); see, e.g., Biemond et al. (2013).
While the first approach was proposed first, the other two (which are strongly related) have the advantage of providing (under mild additional hypotheses) global control Lyapunov functions.
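The ε versus 1 − ε distance behavior of the timer example above can be checked numerically; the following is a small illustrative sketch (not part of the original entry), evaluating both solutions and their distance for ε = 0.1:

```python
import numpy as np

def v(t, k, v0):
    """Solution of the timer vdot = 1, v+ = 0, started at v(0, 0) = v0.

    During interval k the state flows from its post-jump value; the
    solution started at v0 = eps jumps eps time units earlier in each period.
    """
    s = t - k + v0                    # flow from the initial offset
    return s if s <= 1 else s - 1     # wrap caused by the earlier jump

eps = 0.1
k = 0
for t in [0.0, 0.5, 1.0 - eps / 2]:
    d = abs(v(t, k, 0.0) - v(t, k, eps))
    print(t, d)   # d = eps on [k, k+1-eps], d = 1-eps on [k+1-eps, k+1]
```

Before the early jump the two solutions differ by ε; right after it, they differ by 1 − ε, however small ε is chosen.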
Finally, it is worth noting that the most adequate tools to address similar issues for general hybrid systems are the "graphical distance" among hybrid arcs and related concepts (see Goebel et al. 2012, Chap. 5).

Synchronous Jumps
When synchronous jumps are considered, the above issue disappears, and asymptotic stabilization becomes a much simpler matter. Although synchronous jumps look more like an exception than a rule in hybrid systems, they are very reasonable for specific classes of problems.

In order to have synchronous jumps, some authors have considered the use of "jump inputs" which impose a jump at a certain time; this can be physically reasonable in some systems, e.g., two tanks separated by a movable wall, assuming that when the wall is removed the fluid reaches the equilibrium configuration almost instantaneously.

Another relevant class consists of "sampled-data" systems, whose jumps are essentially due to digital components operating at a fixed sampling rate; this class will be considered in the rest of this entry. In such a case, letting M be the sampling period, the time domain of the hybrid system is fixed as (see Fig. 1)

  T := {(t, k) : t ∈ [kM, (k + 1)M], k ∈ ℤ},   (3)

all jumps happen exactly for (t, k) with t = (k + 1)M, and then (1) can be simplified as

  ẇ = Sw,   (4a)
  w⁺ = Jw,   (4b)

and (2) can be simplified as

  ẋ = Ax + Bu + Pw,   (5a)
  x⁺ = Ex + Rw,   (5b)
  e = Cx + Qw,   (5c)

since flow and jump times are clear from the context.

For the latter class of systems, by using linear time-invariant hybrid control laws and observers (and an easily provable separation principle), it is easily shown that:
• Under a hybrid stabilizability hypothesis, state feedback stabilization of (5) is easily achieved.
• Output feedback stabilization of (5) from e is also trivial under an additional hybrid detectability hypothesis.
• Under hybrid detectability of the cascade of (4) and (5), w can be asymptotically estimated from e.
Due to the above facts, it can be assumed without loss of generality that (5) is asymptotically stable (equivalently, that all eigenvalues of E e^{AM} have modulus strictly less than one). Asymptotic stability then yields incremental stability: letting x̂ and x̄ denote two motions under the same inputs u, w and only differing in their initial states, it is immediate to see that their difference x̃ := x̂ − x̄ evolves as

  x̃̇ = x̂̇ − x̄̇ = Ax̂ + Bu + Pw − (Ax̄ + Bu + Pw),
  x̃⁺ = x̂⁺ − x̄⁺ = Ex̂ + Rw − (Ex̄ + Rw),

that is, x̃̇ = Ax̃, x̃⁺ = Ex̃, and so x̃ is just a free motion of the plant, asymptotically converging to zero. Incremental stability implies that regulation is achieved as soon as it is shown that for any exogenous input w it is possible to find an input u and an initial state of (5) such that e is identically zero, since then any other motion arising from a different initial state will asymptotically converge to the motion with identically zero e. Moreover, it is easy to see that asymptotic stability of the origin actually implies uniform, global, and exponential stability of any trajectory for such systems.

Hybrid Steady-State Generation

From this point on, the rest of the presentation will be focused only on the case where the problem data are of the form (3) to (5), since this allows one to provide an uncluttered view on some […]
  t_k := kM,  τ(t, k) := t − kM;

the arguments of τ(t, k) will usually be omitted, since they are clear from the context. Note that τ satisfies τ̇ = 1, τ⁺ = 0, and it is often explicitly introduced as an additional timer variable. The candidate steady state is then taken of the form

  [x_ss(t, k); u_ss(t, k)] = [Π(τ); Γ(τ)] w(t, k).   (6)

Requiring that such expressions actually characterize a response of the considered plant, and that the associated output is zero, amounts to asking that:
• During flows, ẋ_ss(t, k) has to satisfy the two equations obtained by differentiating (6) along (4a) and by imposing (5a); holding for every w, they yield

  Π̇(τ) + Π(τ)S = AΠ(τ) + BΓ(τ) + P.   (7a)

• Across jumps, (5b) and (4b) require

  Π(0)J = EΠ(M) + R.   (7b)

• Zero error in (5c) requires

  CΠ(τ) + Q = 0.   (7c)

Equations (7) can be shown to be both necessary and sufficient for (6) to solve the output regulation problem under the considered assumptions. Once a solution of (7) is available, the full information regulator simply reduces to the time-varying static feedforward controller

  u(t, k) = Γ(τ) w(t, k).   (8)

Since the plant is incrementally stable (as follows from its asymptotic stability, which was assumed without loss of generality), its output response under the control law (8) must converge to the output response associated to (6).

For later use, note that in the non-hybrid case where P and E only flow,

  ẇ = Sw,   (9a)
the candidate steady state (6) is replaced by

  [x_ss(t); u_ss(t)] = [Π; Γ] w(t),   (10)

and (7) reduces to the celebrated regulator equations (or Francis equations)

  ΠS = AΠ + BΓ + P,   (11a)
  0 = CΠ + Q,   (11b)

and, as above, assuming without loss of generality that the plant is asymptotically stable, the full information regulator reduces to the time-invariant static feedforward controller

  u(t, k) = Γ w(t, k).   (12)

The Error Feedback Case
When the exosystem state is not measured, a dynamic compensator of the form

  η̇ = Fη + Ge,   (13a)
  η⁺ = Lη,   (13b)
  u = Hη,   (13c)

which is also supposed to flow and jump according to the a priori fixed time domain T considered for the plant, is introduced, and the corresponding candidate steady-state motion including η is

  [x_ss(t, k); η_ss(t, k); u_ss(t, k)] = [Π(τ); Σ(τ); Γ(τ)] w(t, k).   (14)

By following similar steps as above, requiring invariance of such a manifold in the space of (x, η, u, w), as well as zero output on it, leads to the conclusion that in addition to (7), the following relations must be satisfied as well:

  Σ̇(τ) + Σ(τ)S = FΣ(τ),   (15a)
  Σ(0)J = LΣ(M),   (15b)
  Γ(τ) = HΣ(τ).   (15c)

Equations (7) and (15) can be shown to be both necessary and sufficient for (13) to solve the output regulation problem under the considered assumptions; they generalize the corresponding conditions for the non-hybrid case where P and E only flow (see (9)), in which (13) and (14) are replaced by

  η̇ = Fη + Ge,   (16a)
  u = Hη,   (16b)
  [x_ss(t); η_ss(t); u_ss(t)] = [Π; Σ; Γ] w(t),   (16c)

and (15) reduces to

  ΣS = FΣ,   (17a)
  Γ = HΣ.   (17b)

Relations (17) are an expression of the internal model principle, stating that in order to achieve error feedback regulation, the compensator C must include a suitable "copy" of the exosystem: namely, (17a) imposes a constraint on the dynamics of C which, coupled with (17b), ensures that the signal u_ss = Γw used in the full information case can be equivalently produced (without measuring w!) as u_ss = HΣw. A similar interpretation can be given to (15), which must be required in addition to (7) in order for (13) to solve the hybrid error feedback output regulation problem.

Key Features in Hybrid vs Classical Output Regulation

While the previous section mainly aimed at showing how the classical theory generalizes in the hybrid case (at least for a special class of hybrid systems), the aim of this section is to point out some of the striking differences between the two
cases. Before proceeding further, and in order to keep focus on the characterization of the steady-state response, it is worth mentioning here that although time-varying systems will be considered, no issue regarding nonuniform stability (like in general nonautonomous systems) arises, since the timer τ just ranges in the compact set [0, M] due to the assumed periodic structure of T (see also the end of section "Synchronous Jumps").

Comparing the classical and the hybrid output regulator and considering that P and E are time invariant, it seems somewhat strange that in the output feedback case the linear time-invariant regulator (16a) and (16b) generalizes to a hybrid linear time-invariant regulator (13), whereas in the full information case the linear time-invariant […] where Λ(S) denotes the spectrum of S. In such a case, (11) amounts to a system of nq + pq linear equations in nq + mq unknowns (the elements of Π, Γ), which might be expected to be satisfied since m ≥ p. If one were trying to use the unique constant solution (Π, Γ) of (11) as a solution of (7), clearly (7a) and (7c) would be satisfied, but then (7b) would impose other nq equations on Π which would be unlikely to be satisfied. For this reason, apparently the additional degree of freedom offered by choosing time-dependent Π and Γ might be of help. In fact, it can be shown that if m = p and under a hybrid nonresonance condition (involving E e^{AM} and J e^{SM}) between P and E, (7a) and (7b) have a unique solution for any choice of Γ(τ), so that the design boils down to satisfying (7c) by choosing Γ(τ); but is this always possible? In order to answer this nontrivial question, a different path must be followed. While a complete formal analysis can be performed, the following discussion will be mainly based on showing the simplest examples exhibiting the pathologies of interest.

Consider the system with M = 1 (so that t_k = k, for all k ∈ ℤ) and

  ẇ = 0,   (18a)
  w⁺ = −w,   (18b)

  [ẋ₁; ẋ₂] = [[−1, 0], [0, 2]] [x₁; x₂] + [0; 1] u + [0; 1] w,   (18c)

  [x₁⁺; x₂⁺] = [[0, −1], [−2e, 1]] [x₁; x₂].   (18d)

[…] At steady state,

  x₂,ss(t, k) = w(t, k) = (−1)ᵏ w(0, 0),  ∀(t, k) ∈ T,

which in turn implies that ẋ₂,ss = 0 for all t ∈ (k, k + 1), k ∈ ℤ, and the unique steady-state input

  u_ss = −2x₂,ss − w.

Since (18d) implies that x₁,ss(t_{k+1}, k + 1) = −x₂,ss(t_{k+1}, k) = −w(t_{k+1}, k), and (18c) implies that x₁,ss(t, k) = e^{−(t−k)} x₁,ss(t_k, k) for t ∈ (t_k, t_{k+1}), it follows that

  x₁,ss(t, k) = e^{−(t−t_k)} w(t_k, k),  t ∈ (t_k, t_{k+1}),   (19)

which finally is coherent with the jump equation for x₂,ss in (18d) since
  x₂,ss(t_{k+1}, k + 1) = −2e·x₁,ss(t_{k+1}, k) + x₂,ss(t_{k+1}, k)
    = −2e(e⁻¹) w(t_k, k) + w(t_k, k)   (20a)
    = −w(t_k, k)   (20b)
    = (−1)^{k+1} w(0, 0).   (20c)

Before commenting on the meaning of the above derived steady-state evolution, it is worth noting that (18) might actually derive from an original system with (18c) replaced by

  [ẋ₁; ẋ₂] = [[−1, 0], [1, 2]] [x₁; x₂] + [0; 1] u + [0; 1] w,   (21)

under the preliminary state feedback

  u = −x₁ + v.   (22)

Such a feedback renders the subspace {x : x₂ = 0} unobservable (when the system only flows) and reveals that the dynamics of x₁ in (18c) is the flow zero dynamics of P, that is, the zero dynamics of P when jumps are inhibited. Having set the stage, several interesting observations can be made now.

The flow zero dynamics samples the exogenous signal w at jumps and then evolves according to its own modes (see (19)). In fact, while in the classical case (10) the state and input at steady state can be expressed as a constant matrix times the current value of w, the real nature of the time dependence of Γ and Π in (6) is linked to this phenomenon of sampling w(t_k, k) and propagating it along the zero dynamics. A suitable analysis shows that Π(τ), Γ(τ) contain products of matrices with rightmost factor e^{−Sτ} (which recovers w(t_k, k) = e^{−Sτ} w(t, k) from the current value w(t, k) of w) and leftmost factor containing the fundamental matrix of the flow zero dynamics. It is worth mentioning that the "motion along the zeros" in the present context is strongly related to the same kind of motions used […] that case are provided by copying them in the compensator dynamics!

An even stronger consequence of the analysis above is a flow zero dynamics internal model principle, which essentially states that any output feedback compensator solving the output regulation problem must be able to produce as free responses (during flow) a suitable subset of the natural modes of the flow zero dynamics (and a suitably modified version applies to the feedforward static compensator (8)). It is worth noting that while the classical internal model principle requires exact knowledge of the exosystem modes (which is a rather mild requirement, especially when the exosystem models references or constant offsets), the flow zero dynamics internal model principle requires exact knowledge of the modes of the zero dynamics, which typically depend on imprecisely known plant parameters; clearly, this fact poses serious questions in view of the achievement of robust regulation.

A final point, also raising serious issues about what can be robustly achieved (and how) in the setting of hybrid output regulation, is the fact that generically, existence of solutions is not robust to arbitrarily small parameter variations. In particular, looking again at the computations in (20), it should be clear that the involved functions are all fixed by previous reasonings, whereas satisfaction of (20) crucially depends on exact cancellations of certain coefficients. Any small variation of such coefficients in (18d) implies that the problem admits no steady state yielding zero output. This fact is in sharp contrast with classical regulation, where the nonresonance condition ensures existence of (different) solutions for small parameter variations. It has to be noted, though, that under additional conditions, robust existence of solutions is guaranteed if the plant is fat, that is, m > p. Using again the previous example, this is the case if an additional input is introduced […]
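Returning to the classical (flow-only) case: the Francis equations (11) can be solved as a single linear system using Kronecker products and the vec() identity. The sketch below (with illustrative matrices, not taken from the entry) computes (Π, Γ) for a scalar plant against a harmonic exosystem and checks the residuals of (11a)–(11b):

```python
import numpy as np

# Illustrative data (hypothetical): harmonic exosystem, scalar stable plant.
A = np.array([[-1.0]]); B = np.array([[1.0]])
P = np.array([[1.0, 0.0]])
C = np.array([[1.0]]);  Q = np.array([[-1.0, 0.0]])
S = np.array([[0.0, 1.0], [-1.0, 0.0]])

n, m, p, q = 1, 1, 1, 2
I = np.eye

# vec() identities turn  Pi*S = A*Pi + B*Gam + P,  0 = C*Pi + Q
# into one square linear system in (vec Pi, vec Gam) when m = p.
top = np.hstack([np.kron(I(q), A) - np.kron(S.T, I(n)), np.kron(I(q), B)])
bot = np.hstack([np.kron(I(q), C), np.zeros((p * q, m * q))])
K = np.vstack([top, bot])
rhs = -np.concatenate([P.flatten('F'), Q.flatten('F')])

sol = np.linalg.solve(K, rhs)
Pi = sol[: n * q].reshape((n, q), order='F')    # steady-state state map
Gam = sol[n * q:].reshape((m, q), order='F')    # steady-state input map

print(Pi, Gam)
print(np.abs(Pi @ S - (A @ Pi + B @ Gam + P)).max())   # residual of (11a)
print(np.abs(C @ Pi + Q).max())                        # residual of (11b)
```

For these data the solution can be checked by hand: Π = [1, 0] and Γ = [0, 1].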
Bibliography

Cox N, Teel AR, Marconi L (2011) Hybrid output regulation for minimum phase linear systems. In: American control conference, San Francisco, pp 863–868
Cox N, Marconi L, Teel AR (2012) Hybrid internal models for robust spline tracking. In: Conference on decision and control, Maui, pp 4877–4882
Davison E (1976) The robust control of a servomechanism problem for linear time-invariant multivariable systems. IEEE Trans Autom Control 21(1):25–34
Forni F, Teel AR, Zaccarian L (2013a) Follow the bouncing ball: global results on tracking and state estimation with impacts. IEEE Trans Autom Control 58(6):1470–1485
Forni F, Teel AR, Zaccarian L (2013b) Reference mirroring for control with impacts. In: Daafouz J, Tarbouriech S, Sigalotti M (eds) Hybrid systems with constraints. Wiley, Hoboken, pp 213–260. doi:10.1002/9781118639856.ch8
Francis B, Wonham W (1976) The internal model principle of control theory. Automatica 12(5):457–465
Galeani S, Menini L, Potini A, Tornambe A (2008) Trajectory tracking for a particle in elliptical billiards. Int J Control 81(2):189–213
Galeani S, Menini L, Potini A (2012) Robust trajectory tracking for a class of hybrid systems: an internal model principle approach. IEEE Trans Autom Control 57(2):344–359
Goebel R, Sanfelice R, Teel A (2012) Hybrid dynamical systems: modeling, stability, and robustness. Princeton University Press, Princeton
Huang J (2004) Nonlinear output regulation: theory and applications, vol 8. Society for Industrial and Applied Mathematics, Philadelphia
Marconi L, Teel AR (2010) A note about hybrid linear regulation. In: Conference on decision and control, Atlanta, pp 1540–1545
Marconi L, Teel AR (2013) Internal model principle for linear systems with periodic state jumps. IEEE Trans Autom Control 58(11):2788–2802
Morarescu I, Brogliato B (2010) Trajectory tracking control of multiconstraint complementarity Lagrangian systems. IEEE Trans Autom Control 55(6):1300–1313
Pavlov AV, van de Wouw N, Nijmeijer H (2005) Uniform output regulation of nonlinear systems: a convergent dynamics approach. Birkhäuser, Boston
Saberi A, Stoorvogel A, Sannuti P (2000) Control of linear systems with regulation and input constraints. Springer, London
Sanfelice RG, Biemond JJ, van de Wouw N, Heemels W (2013) An embedding approach for the design of state-feedback tracking controllers for references with jumps. Int J Robust Nonlinear Control. doi:10.1002/rnc.2944
Trentelman HL, Stoorvogel AA, Hautus MLJ (2001) Control theory for linear systems. Springer, London
Wonham W (1985) Linear multivariable control: a geometric approach. Applications of mathematics, vol 10, 3rd edn. Springer, New York
Parallel Robots

Frank C. Park
Robotics Laboratory, Seoul National University, Seoul, Korea

Abstract

Parallel robots are closed chains consisting of a fixed and moving platform that are connected by a set of serial chain legs. Parallel robots typically possess both actuated and passive joints and may even be redundantly actuated. Although more structurally complex and possessing a smaller workspace, parallel robots are usually designed to exploit one or more of the natural advantages they possess over their serial counterparts, e.g., higher stiffness, increased positioning accuracy, and higher speeds and accelerations. In this chapter we provide an overview of the kinematic and dynamic modeling of parallel robots, a description of their singularity behavior, and basic methods developed for their control.

Keywords

Closed kinematic chain; Closed loop mechanism; Parallel manipulator

Introduction

A parallel robot refers to a kinematic chain in which a fixed platform and a moving platform are connected to each other by several serial chains, or legs. The legs, which typically have the same kinematic structure, are connected to the fixed and moving platforms at points that are distributed in a geometrically symmetric fashion. The Stewart–Gough platform (Fig. 1) is a well-known example of a parallel robot: each of the six legs is a UPS structure (i.e., consisting of rigid links serially connected by a universal, prismatic, and spherical joint), with the prismatic joint actuated. Other examples of parallel robots include the 6-RUS platform of Fig. 2, the haptic interface device of Fig. 3, and the eclipse mechanism of Fig. 4.

Parallel robots can be regarded as a special class of closed chain mechanisms (i.e., chains that contain one or more closed loops) and are purposely designed to exploit the specific advantages afforded by the closed chain structure, e.g., improved stiffness, greater positioning accuracy, or higher speed. Parallel robots should be distinguished from two or more cooperating serial robots that may form closed loops during execution of a task (e.g., a robotic hand grasping an object). Some of the fastest velocities and accelerations recorded by industrial robots have been achieved by parallel robots, primarily by […]
Parallel Robots, Fig. 3 A 3-PUU haptic interface
Parallel Robots, Fig. 4 The 3-PPRS eclipse parallel mechanism

Dynamics

In the case of a parallel robot whose actuated degrees of freedom coincide with its kinematic mobility m, it is possible to choose an independent set of generalized coordinates of dimension m, denoted q ∈ ℝᵐ and typically identified with the actuated joints, and to express the dynamics in the standard form

  M(q)q̈ + C(q, q̇)q̇ + G(q) = τ,   (1)

where τ ∈ ℝᵐ denotes the vector of input joint torques, M(q) denotes the m × m mass matrix, the matrix-vector product C(q, q̇)q̇ denotes the vector of Coriolis terms, and G(q) ∈ ℝᵐ denotes the vector of gravitational forces. The structure of the dynamic equations is identical to that for serial chain robots. Also as in the case of serial chain robots, the Coriolis matrix term C(q, q̇) ∈ ℝ^{m×m} is not unique, so one should ensure that the correct C(q, q̇) is used in, e.g., any control law whose stability depends on the matrix Ṁ(q) − 2C(q, q̇) being skew-symmetric.
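The skew-symmetry condition on Ṁ(q) − 2C(q, q̇) can be verified numerically. The sketch below uses a standard two-link arm model with hypothetical lumped parameters (an illustrative model, not one from this entry); when C is built from the Christoffel symbols, Ṁ − 2C is skew-symmetric:

```python
import numpy as np

# Hypothetical lumped inertia parameters of a textbook two-link arm.
a, b, d = 3.0, 1.0, 1.0

def M(q):
    """Mass matrix M(q) of the two-link model."""
    c2 = np.cos(q[1])
    return np.array([[a + 2 * b * c2, d + b * c2],
                     [d + b * c2, d]])

def C(q, qd):
    """Christoffel-symbol Coriolis matrix C(q, qdot)."""
    s2 = np.sin(q[1])
    return b * s2 * np.array([[-qd[1], -(qd[0] + qd[1])],
                              [qd[0], 0.0]])

def Mdot(q, qd):
    """Analytic time derivative of M(q) along qdot."""
    s2 = np.sin(q[1])
    return -b * s2 * qd[1] * np.array([[2.0, 1.0],
                                       [1.0, 0.0]])

q = np.array([0.3, 0.7])
qd = np.array([-1.2, 0.5])
Skew = Mdot(q, qd) - 2 * C(q, qd)
print(np.abs(Skew + Skew.T).max())   # ~0: Mdot - 2C is skew-symmetric
```

A C(q, q̇) not derived from the Christoffel symbols can still satisfy (1) while breaking this property, which is exactly why the text warns about using the "correct" C.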
It is also important to keep in mind that q must satisfy the kinematic constraint equations imposed by the loop closure constraints. That is, if θ ∈ ℝⁿ denotes the vector of all joints (both actuated and passive), then q ∈ ℝᵐ, m ≤ n, will be a subset of θ whose values can only be obtained by solution of the kinematic constraint equations; depending on the nature of the kinematic constraints, one may have to resort to iterative numerical methods.

If the parallel robot is redundantly actuated, then the dynamics are subject to a further set of constraints on the input torques. Letting q_e denote the set of independent generalized coordinates and q_a the vector of all actuated joints, the vector of actuated joint torques τ_a must then further satisfy Sᵀτ_a = Wᵀτ, where τ denotes the vector of joint torques for an equivalent tree structure system that moves identically to the redundantly actuated parallel robot, and W and S are defined, respectively, by

  S = ∂θ/∂q_e,  W = ∂q_a/∂q_e.   (2)

Compared to the dynamics of serial chain robots, the dynamics of parallel robots are, in general, considerably more complex and computationally involved. The recursive algorithms that are available for computing the inverse and forward dynamics of serial chain robots can also be used to develop similar recursive algorithms for parallel robot dynamics; however, the computations will be considerably more involved and require multiple iterations.

Motion Control

Exactly Actuated Parallel Robots
For parallel robots whose actuated degrees of freedom match the kinematic mobility (this excludes the set of all redundantly actuated parallel robots), most control laws developed for serial chain robots are also applicable. This is not altogether surprising in light of the similarity in the structure of the kinematic and dynamic equations between serial and parallel robots. Control laws for serial robots are also covered in this handbook, and we refer the reader to Linear Matrix Inequality Techniques in Optimal Control for the essential details. Here we summarize the most basic control laws and point out any additional computational or other requirements that are needed when applying these laws to parallel robots. Note that other control laws and techniques developed for serial chain robots, e.g., robust and sliding mode control, can also be applied with the same additional considerations and requirements outlined below:
1. Computed torque control: Computed torque control for parallel robots has the same control law structure as for serial robots, i.e.,

  τ = M(q)(−K_p e − K_v ė) + τ_ff,   (3)

where e denotes the tracking error, K_p and K_v are the proportional and derivative feedback gain matrices, and τ_ff denotes the feedforward term required to cancel the nonlinear dynamics. Robust versions of computed torque control are also applicable to parallel robots under the same set of conditions, e.g., establishing appropriate bounds on the mass matrix eigenvalues and on the norm of the Coriolis matrix.
2. Augmented PD control: The augmented PD control law for serial robots is also applicable to parallel robots, i.e.,

  τ = −K_p e − K_v ė + M(q)q̈_d + C(q, q̇)q̇ + G(q),   (4)

where q_d is the reference trajectory to be tracked and K_p and K_v are the proportional and derivative feedback gains. Asymptotic stability is also established under the same conditions.
3. Adaptive control: Because the dynamic equations for parallel robots are also linear in the link mass and inertial parameters, i.e.,

  M(q)q̈ + C(q, q̇)q̇ + G(q) = Φ(q, q̇, q̈)p,   (5)
where p denotes the vector of link mass and inertial parameters, adaptive control laws developed for serial robots can also be used.
4. Other control methods: There exist numerous control methods developed for serial robots, e.g., task space or operational space control, sliding mode control, and various nonlinear control techniques; with few exceptions, most of these algorithms can also be applied to exactly actuated parallel robots with minimal modification.

Redundantly Actuated Parallel Robots
As described earlier, parallel robots exhibit a much more diverse range of singularity behavior than their serial counterparts, many of which depend on the choice of actuated joints (actuator singularities). One way to eliminate actuator singularities is via redundant actuation, i.e., making the total degrees of freedom of the actuated joints exceed the kinematic mobility of the mechanism. Redundant actuation offers some protection in the event of failed actuators and, when combined with an appropriate control law, offers an effective means of reducing joint backlash, increasing speed, payload, and stiffness, controlling compliance through the generation of internal forces, and even improving power efficiency (as an analogy, the human musculoskeletal system is redundantly actuated by antagonistic muscles). Of course, redundant actuation introduces a new set of control challenges, since the control inputs must be designed so as not to conflict with the kinematic constraints inherent in the parallel robot; loosely speaking, the actuated joints can no longer be independently controlled, and the consequences of unintended antagonistic actuation may be catastrophic.

The control of cooperating manipulators (see Optimal Control and Mechanics) has a long history in robotics, and many of the control techniques developed for such multi-arm systems can also be applied to redundantly actuated parallel robots. One can also apply the control strategies developed for exactly actuated parallel robots to the redundantly actuated case, but modifications are necessary to account for the different structure of the dynamic equations.

Like all model-based control algorithms, the above control laws are subject to model uncertainties. Whereas in serial chains the effects of model uncertainty simply lead to errors in tracking, for redundantly actuated parallel robots the consequences can include internal forces in addition to end-effector tracking errors. Perhaps the most significant effect of any modeling errors is that, unlike the serial chain case, the kinematic errors can potentially alter the shape of the configuration space (recall that the configuration space will in general be a curved space for closed chains) and also interfere with any PD feedback introduced into the control. The development of control laws that are robust to such modeling errors and disturbances remains an open and ongoing area of research in parallel robot control.

Force Control

Both hybrid force-position control and impedance control are well-known and widely applied concepts in serial robots and can be extended in a straightforward manner to exactly actuated parallel robots. Recall that the basic feature of hybrid force-position control is that the task space is decomposed into force- and position-controlled directions, whereas in impedance control the goal is to have the robot maintain a certain desired spatial stiffness in the task space. Controllers that combine aspects of force-position and impedance control have also been proposed and developed for both serial robots and exactly actuated parallel robots. Modeling errors will cause deviations in both the force- and position-controlled directions, leading to motions in force-controlled directions and forces in position-controlled directions, which can be addressed by, e.g., a switching control strategy.

The problem of force control for redundantly actuated parallel robots, which encompasses both force-position and impedance control, has also received some attention in the literature.
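To make the motion control laws above concrete, the following sketch simulates the augmented PD law (4) on an illustrative two-link model with exact model knowledge (hypothetical parameters, gravity omitted, explicit Euler integration); the tracking error converges as the stability argument predicts:

```python
import numpy as np

# Hypothetical lumped parameters of a textbook two-link arm (G omitted).
a, b, d = 3.0, 1.0, 1.0

def M(q):
    c2 = np.cos(q[1])
    return np.array([[a + 2 * b * c2, d + b * c2], [d + b * c2, d]])

def C(q, qd):
    s2 = np.sin(q[1])
    return b * s2 * np.array([[-qd[1], -(qd[0] + qd[1])], [qd[0], 0.0]])

Kp, Kv = 100.0 * np.eye(2), 20.0 * np.eye(2)
qd_ref = lambda t: np.array([np.sin(t), np.cos(t)])        # reference q_d
qd_ref_d = lambda t: np.array([np.cos(t), -np.sin(t)])
qd_ref_dd = lambda t: np.array([-np.sin(t), -np.cos(t)])

q, v = np.array([0.5, -0.5]), np.zeros(2)   # start off the reference
dt = 1e-3
for i in range(10000):                      # 10 s of simulation
    t = i * dt
    e, ed = q - qd_ref(t), v - qd_ref_d(t)
    # augmented PD law (4), with exact M and C and no gravity term
    tau = -Kp @ e - Kv @ ed + M(q) @ qd_ref_dd(t) + C(q, v) @ v
    acc = np.linalg.solve(M(q), tau - C(q, v) @ v)   # forward dynamics
    q, v = q + dt * v, v + dt * acc

print(np.abs(q - qd_ref(10.0)).max())   # tracking error after 10 s (small)
```

With the exact model, the closed-loop error obeys M(q)ë + (C + K_v)ė + K_p e = 0, so the initial offset decays; model mismatch would leave a residual error, as discussed above.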
However, the PF does this in a more natural way.

The Basic Particle Filter

Nonlinear filtering aims at estimating the distribution of a state sequence x_{1:N} = (x₁, x₂, ..., x_N) from a sequence of observations y_{1:N} = (y₁, y₂, ..., y_N), given a state space model of the form

  x_{k+1} = f(x_k, v_k)  or  p(x_{k+1} | x_k),   (1a)
  y_k = h(x_k, e_k)  or  p(y_k | x_k).   (1b)

[…]
2 Resampling: Resample each particle with probability w_k^{(i)}.
3 Time update: Simulate one time step by taking v_k^{(i)} ~ p_v(v_k) and then setting x_{k+1}^{(i)} = f(x_k^{(i)}, v_k^{(i)}).

The main design parameter here is the number N of particles. A common trick to make the filter more robust is to increase the variance of p_v and p_e above. This is called dithering (or jittering) and is a practical way to get more robust nonlinear filters.

To illustrate some of the aspects and for later reference, a simple example will be introduced.
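The steps above can be sketched as a bootstrap particle filter for a scalar linear Gaussian model; this is an illustrative Python reimplementation (not the entry's Matlab code of Table 1), assuming the standard bootstrap weight step w_k^{(i)} ∝ p(y_k | x_k^{(i)}):

```python
import numpy as np

rng = np.random.default_rng(0)

# Scalar linear Gaussian model (illustrative stand-in for the entry's
# example): x_{k+1} = a x_k + v_k,  y_k = h x_k + e_k.
a, h, Qv, Re = 0.9, 1.0, 0.1, 0.1
T, N = 50, 1000

# simulate ground truth and measurements
x = np.zeros(T)
for k in range(1, T):
    x[k] = a * x[k - 1] + rng.normal(0, np.sqrt(Qv))
y = h * x + rng.normal(0, np.sqrt(Re), T)

# bootstrap particle filter: weight, resample (step 2), time update (step 3)
xp = rng.normal(0, 1, N)                 # initial particle cloud
est = np.zeros(T)
for k in range(T):
    w = np.exp(-0.5 * (y[k] - h * xp) ** 2 / Re)   # likelihood weights
    w /= w.sum()
    est[k] = np.sum(w * xp)                        # weighted state estimate
    xp = rng.choice(xp, size=N, p=w)               # step 2: resampling
    xp = a * xp + rng.normal(0, np.sqrt(Qv), N)    # step 3: time update

rmse = np.sqrt(np.mean((est - x) ** 2))
print(rmse)   # should be close to the optimal (Kalman) error for this model
```

For a linear Gaussian model the Kalman filter is exact, which is what makes this toy case useful as a reference (cf. Fig. 1).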
Particle Filters, Table 1 Matlab code for scalar linear Gaussian model

Particle Filters, Fig. 1 Set of samples {x_k^(i)}_{i=1:N} compared first to a Gaussian approximation of the particles (blue) and second to the true posterior distribution provided by the Kalman filter (green)
Particle Filters, Fig. 2 Set of trajectories {x_{1:10}^(i)}_{i=1:N} for two different proposal distributions, one bad one (prior) leading to particle depletion and one good one (likelihood). Note that the smoothing distribution p(x_k | y_{1:10}) can be approximated with the set of particles {x_k^(i)}_{i=1:N} using the marginalization principle
doing the time update, but one has to look at a full cycle of the iteration scheme.

If a proposal distribution of the functional form q(x_k | x_{k-1}, y_k) is used, then steps 1 and 3 have to be modified as follows:

1. Weight update: Time and measurement updates:

w_{k|k-1}^(i) ∝ w_{k-1|k-1}^(i) p(x_k^(i) | x_{k-1}^(i)) / q(x_k^(i) | x_{k-1}^(i), y_k),  (3a)

w_{k|k}^(i) ∝ w_{k|k-1}^(i) p(y_k | x_k^(i)).  (3b)

3. Prediction: Generate samples from the proposal

x_{k+1}^(i) ~ q(x_{k+1} | x_k^(i), y_{k+1}).  (3c)

The most natural proposal distributions are the following:

• The prior (the dynamics), as used in the basic PF.
• The likelihood q(x_{k+1} | x_k^(i), y_{k+1}) ∝ p(y_{k+1} | x_{k+1}). For the model (2), the proposal becomes N(y_k / h, R_k / h²).
• The optimal (minimizing the weight variance) choice q(x_{k+1} | x_k^(i), y_{k+1}) ∝ p(y_{k+1} | x_{k+1}) p(x_{k+1} | x_k^(i)). For the model (2), the optimal proposal is provided by one cycle of the Kalman filter, initialized with the particle x_k^(i).

The optimal proposal keeps the weights constant, and this would in theory avoid depletion, where depletion is interpreted as excessive weight variance. Figure 2 compares the sets of trajectories for the prior and likelihood proposals, respectively. Apparently, the likelihood proposal is to be preferred here, since it suffers less from depletion in the particle history. The practical limitation with the last two alternatives is that one has to be able to sample from the likelihood, so in practice there needs to be more measurements than states in the model.
Particle Filters, Fig. 3 The efficient number of particles N_eff(k)/N for the prior proposal (blue) and the likelihood proposal (green) (N = 1,000)
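The efficient number of particles shown in Fig. 3 is the standard depletion diagnostic; for normalized weights it is N_eff(k) = 1 / Σ_i (ω_k^(i))², which equals N for uniform weights and 1 when a single particle carries all the mass. A minimal helper:

```python
def effective_sample_size(weights):
    """N_eff = 1 / sum_i w_i^2 for a list of (possibly unnormalized) weights."""
    s = sum(weights)
    w = [wi / s for wi in weights]
    return 1.0 / sum(wi * wi for wi in w)
```

A common variation is to resample only when N_eff(k)/N drops below some threshold (e.g., 0.5) instead of at every time step.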
Resampling was the main contribution in Gordon et al. (1993) to get a working algorithm.

Marginalization

The posterior distribution approximation provided by the particle filter converges with the number of particles. In theory, the convergence rate is g_k/N, where g_k is a polynomial function in time k. In practice, it appears that the required number of particles increases very quickly with the state dimension. Unless a very good proposal distribution is found, the practical limit for the state dimension is around 3-4 as a rough rule of thumb.

For applications with a large number of states, one can in many cases still use the particle filter. The idea is to find a linear Gaussian substructure in the model and then divide the state vector x_k into two parts: x_k^l for the states that appear linearly and x_k^n for the remaining states. Bayes' rule provides the factorization

p(x_k^l, x_{1:k}^n | y_{1:k}) = p(x_k^l | x_{1:k}^n, y_{1:k}) p(x_{1:k}^n | y_{1:k}).  (5)

With a linear Gaussian substructure for x_k^l, given the whole trajectory x_{1:k}^n, the Kalman filter applies and provides a Gaussian distribution for the first factor in (5). The second factor is resolved using a marginalization procedure so that the particle filter can be applied. The bottom line is that each particle is associated with one Kalman filter. The method is called the Rao-Blackwellized particle filter, or marginalized particle filter, in the literature.

Illustrative Application: Navigation by Map Matching

A nontrivial application where the PF solves a nonlinear filtering problem in which Kalman filter-based approaches would fail is described in Forssell et al. (2002). The problem is to compute a robust estimate of the position of a car without using infrastructure such as cellular networks or satellites. The approach is based on measuring wheel speeds on one axle and from that dead reckoning a nominal trajectory using standard odometric formulas. A road map is then used as a measurement, to rule out impossible or unlikely maneuvers. There is no numeric measurement y_k in this approach, but the likelihood p(y_k | x_k) in the model (1b) is large when x_k corresponds to a road position and decays quickly to zero outside the road network. Figure 4 illustrates how the particle cloud gradually focuses around the true position with time. In particular, the number of modes in the posterior distribution is rather high initially but decreases over time, in particular after each turn.

The particle filter is sometimes believed to be too computer intensive for real-time applications. As described in Forssell et al. (2002), this demonstrator implemented a particle filter on a pocket computer anno 2001 with N = 15,000 particles running at 10 Hz. Thus, the computational complexity of the particle filter should not be overemphasized in practice.

Particle Filters, Fig. 4 Car navigation using wheel speed and a street map. The figures illustrate how the particle representation of the position posterior distribution (the course state is not shown) improves over time. After four turns, the posterior is essentially unimodal, and a position marker can be shown. The circle denotes the GPS position, which is only used for comparison. (a) After first turn. (b) After second turn. (c) After third turn. (d) After fourth turn

Summary and Future Directions

The particle filter can be seen as a black-box solution to the nonlinear filtering problem, where any nonlinear dynamical model with arbitrary noise distributions can be plugged in. The main tuning parameter is the number of particles, and the PF will work in theory if this number is large enough. In practice, there are many tricks the user has to be aware of to mitigate the curse of dimensionality (depletion) that occurs for large state spaces (more than three states) or long time sequences (more than a couple of samples). One engineering trick is dithering: increasing the variance of the involved noise distributions from their nominal values. More theoretical ways to mitigate depletion include clever choices of proposal distributions to sample from and marginalization (to solve a subset of the estimation problem with a Kalman filter).

Current and future research directions include the issues above. A further trend concerns the related smoothing problem, which is interesting in itself, but which has turned out to be instrumental in joint state and parameter estimation problems. There are also many attempts to make the particle filter more robust, including ideas of filter banks and invoking a second layer of sampling algorithms to implement the proposal distribution. There is also a trend to use the particle filter as a computational engine for more complex problems, such as the simultaneous localization and mapping (SLAM) problem, and to approximate the probability hypothesis density (PHD) for multi-sensor multi-target tracking. Finally, there are a large number of papers reporting on applications in traditional as well as new disciplines.
Perturbation Analysis of Discrete Event Systems
closely related to the condition that, w.p.1, the random function L(θ) is continuous throughout Θ. As a matter of fact, the two conditions are practically synonymous. However, shortly after the emergence of IPA, it became apparent that in many queueing models of interest, L(θ) is not continuous and, hence, IPA is not unbiased (Heidelberger et al. 1988). Subsequently, various techniques to overcome this problem were explored, including the so-called cut and paste of the sample paths and re-parametrization of the underlying probability space via statistical conditioning. For a more comprehensive coverage of such techniques, please see Cassandras and Lafortune (2008) and references therein. These techniques can yield unbiased gradient estimators in principle, but often at the expense of prohibitive computing costs. Recently an alternative approach has emerged, based on stochastic flow models (SFM) that are comprised of fluid queues (Cassandras et al. 2002). It extends, significantly, the class of models and problems where IPA is unbiased and has the added advantage of yielding very simple gradient estimators.

The following sections of this article present IPA in the general setting of DES, explain the limits of its scope in queueing models, describe alternative PA techniques for extending those limits, present the SFM approach, and conclude with some thoughts on future research directions.

... the triggering event E' = arg min_{i ∈ Γ(x)} {Y_i}, where Y* = min_{i ∈ Γ(x)} {Y_i} is the inter-event time (the time elapsed since the last event occurrence). To simplify the notation we define e' := E'. Thus, with e' determined, the state transition probabilities p(x'; x, e') are used to specify the next state x'. Finally, the clock values are updated: Y_i is decremented by Y* for all i (other than the triggering event) which remain feasible in x', while the triggering event (and all other events which are activated upon entering x') is assigned a new lifetime sampled from a distribution G_i. The set G = {G_i : i ∈ E} defines the stochastic clock structure of the automaton.

Let T_{α,n} denote the nth occurrence time of event α ∈ E, and let V_{α,n} denote a realization of the lifetime distribution G_α such that V_{α,n} = T_{α,n} − T_{β,m} for some (any) event β ∈ E and m ∈ {1, 2, ...}. One can then always write T_{α,n} = V_{β_1,k_1} + ... + V_{β_s,k_s} for some s. Let us now consider a parameter θ ∈ R which can only affect one or more of the event lifetime distributions G_α(x; θ); in particular, θ does not affect the state transition mechanism. The case where θ ∈ R^n can be handled in similar ways, but the one-dimensional case permits us to use the derivative notation rather than the gradient symbol, which simplifies the presentation. Viewing lifetimes as functions of θ, V_{α,k}(θ), it can be shown (under mild technical conditions, see Glasserman (1991)) that ...

... with event space {arrival, departure}, state space {0, 1, ...} representing the queue's occupancy, and feasible event set {arrival} when x = 0 and {arrival, departure} when x > 0.

Let θ ∈ R be a parameter of the distribution of the inter-arrival times; hence, the realizations of v depend on θ in a functional manner. For example, if the arrival process is Poisson and θ is its rate, then a realization of the inter-arrival times has the form v = −(1/θ) ln(1 − ω), where ω ∈ [0, 1] is a uniform variate. To emphasize the dependence of v on θ, we denote it by v(θ), while its dependence on ω is implicit. Similarly, the arrival times depend on θ and, hence, are denoted by a_k(θ), but the service times s_k do not depend on θ. The departure time of the kth customer and its delay depend on θ and are denoted by d_k(θ) and D_k(θ), respectively. The forthcoming paragraphs discuss the derivatives of these functions with respect to θ; we use the prime symbol to indicate such derivatives in order to simplify the notation.

Define the sample performance function

L_N(θ) := N^{-1} Σ_{k=1}^{N} D_k(θ)

for a given N > 0. Under stability conditions, with probability 1 (w.p.1), lim_{N→∞} L_N(θ) = J(θ), where J(θ) denotes the mean of the delay's stationary distribution. The role of IPA is to estimate J'(θ) via the sample derivative L'_N(θ), and this is justified as long as lim_{N→∞} L'_N(θ) = J'(θ) w.p.1. In this case, IPA is said to be strongly consistent. In contrast, unbiasedness pertains to the performance function J_N(θ) := E[L_N(θ)] and means that E[L'_N(θ)] = J'_N(θ). The concepts of strong consistency and unbiasedness are closely related, except that the latter concerns finite-horizon processes while the former pertains to stationary distributions in steady state. Both concepts have been extensively investigated in recent years: strong consistency in the setting of Markov chains and Markov decision processes (Cao 2007) and unbiasedness in the context of stochastic hybrid systems, as will be described in the sequel. We will focus the rest of the discussion on the issue of unbiasedness.

Since L_N(θ) = N^{-1} Σ_{k=1}^{N} D_k(θ), its IPA derivative is L'_N(θ) = N^{-1} Σ_{k=1}^{N} D'_k(θ), and since D_k(θ) = d_k(θ) − a_k(θ), it follows that D'_k(θ) = d'_k(θ) − a'_k(θ). The last two derivative terms are computable via the following recursive procedures. First, a_k(θ) = a_{k−1}(θ) + v_k(θ) and, hence,

a'_k(θ) = a'_{k−1}(θ) + v'_k(θ).

The term v'_k(θ) has to be obtained directly from the realization of v, and this often is possible since such realizations depend functionally on θ; for instance, in the previous example, v = −(1/θ) ln(1 − ω) and, hence, v'(θ) = (1/θ²) ln(1 − ω) = −v(θ)/θ. Next, d_k(θ) is given by the Lindley equation

d_k(θ) = max{a_k(θ), d_{k−1}(θ)} + s_k

and, therefore, denoting by ℓ_k the index of the customer that started the busy period containing customer k, we have that

d'_k(θ) = a'_{ℓ_k}(θ).

From these recursive relations it follows that D'_k(θ) = 0 if customer k starts a busy period and D'_k(θ) = −Σ_{i=ℓ_k+1}^{k} v'_i(θ) if customer k does not start a busy period.

These equations shed light on the structure of IPA in a general class of queueing networks. First there is the perturbation generation, namely, the sampling of derivative (gradient) terms directly from the sample sequence of variates (ω) which defines the sample path; that was v'(θ) in the above example. These terms drive the recursive equations that yield the IPA derivatives. The recursive equations often are based on the tracking of certain events, such as the start of busy periods or idle periods, and the process of tracking the derivatives through them is referred to as perturbation propagation. In the above example it is obvious how the perturbation propagation tracks the busy periods at the queue. Furthermore, in a network setting, the perturbations can propagate from one queue to the next in a natural fashion. For instance, suppose that customers departing ...
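The recursions above translate directly into a sample-path algorithm. A minimal Python sketch for a single FIFO queue with Poisson(θ) arrivals (the fixed service-time sequence and all numerical values are illustrative assumptions):

```python
import math, random

def delays_and_ipa(theta, omegas, services):
    """Lindley recursion d_k = max{a_k, d_{k-1}} + s_k for a FIFO queue with
    inter-arrival times v_k = -ln(1 - omega_k)/theta, together with the IPA
    recursions a'_k = a'_{k-1} + v'_k and d'_k = a'_{l_k}, where l_k is the
    customer starting the busy period containing customer k."""
    a = a_prime = d = bp_prime = 0.0
    D, Dp = [], []
    for omega, s in zip(omegas, services):
        v = -math.log(1.0 - omega) / theta
        v_prime = math.log(1.0 - omega) / theta ** 2   # v'(theta) = -v/theta
        a += v
        a_prime += v_prime
        if a >= d:                 # customer starts a new busy period
            bp_prime = a_prime     # a'_{l_k}
        d = max(a, d) + s
        D.append(d - a)                 # delay D_k = d_k - a_k
        Dp.append(bp_prime - a_prime)   # D'_k = d'_k - a'_k
    return D, Dp

random.seed(2)
N = 5000
omegas = [random.random() for _ in range(N)]
services = [random.expovariate(1.5) for _ in range(N)]  # s_k do not depend on theta
theta = 1.0
D, Dp = delays_and_ipa(theta, omegas, services)
ipa = sum(Dp) / N        # sample derivative L'_N(theta) of the mean delay
```

Because the sample derivative reuses the same ω sequence, it agrees with a finite difference taken along the nominal sample path (common random numbers) up to higher-order terms in Δθ.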
The main idea of SPA lies in the "smoothing property" of conditional expectation. If we are willing to extract information from a sample path and denote it by Z, called the characterization of the sample path, then we can evaluate not just the sample function L(θ) but also the conditional expectation E[L(θ) | Z] (provided we have some distributional knowledge based on which this expectation can be evaluated). This can result in a much smoother function of θ than L(θ). Thus, starting with the condition for an IPA estimator to be unbiased,

∇J(θ) = E[∇L(θ)],

we rewrite the left-hand side above as shown below, replacing J(θ) = E[L(θ)] by the expectation of a conditional expectation:

∇J(θ) = ∇E[L(θ)] = ∇E{E[L(θ) | Z]},  (3)

where the inner expectation is a conditional one and the conditioning is on the characterization Z. Treating E[L(θ) | Z] as the new sample function, we expect it to be "smoother" than L(θ) and, in particular, continuous in θ. Then, under some additional conditions (comparable to those made in the development of IPA), the interchange of differentiation and expectation in (3) can be justified:

∇J(θ) = E{∇E[L(θ) | Z]}.

Letting L_Z(θ) := E[L(θ) | Z], the SPA estimator of ∇J(θ) is

[∇J(θ)]_SPA = ∇L_Z(θ).  (4)

Naturally, the idea is to minimize the amount of added information represented by Z, since this incurs added costs we would like to avoid. The choice of the characterization Z generally depends on the sample function considered and the system under study.

A number of other extensions to IPA have also been developed (see Cassandras and Lafortune 2008). It is also worth mentioning that the PA approach can be applied to a parameter taking values from a finite set Θ = {θ_0, θ_1, ..., θ_M}. The theory of concurrent estimation and sample path constructability provides techniques to estimate performance measures of the DES through the process of constructing sample paths under each of θ_0, θ_1, ..., θ_M; for details see Cassandras and Lafortune (2008).

SFM Framework for IPA

The stochastic flow model (SFM) framework essentially consists of fluid queues which forego the notion of the individual customer and focus instead on the aggregate flow. In such a fluid queue, traffic and service processes are characterized by instantaneous flow rates, as opposed to the arrival, departure, and service times of discrete customers. The SFM qualifies as a stochastic hybrid system with bilayer dynamics: discrete event dynamics at the upper layer and time-driven dynamics at the lower layer. The discrete events are associated with abrupt (discontinuous) changes in traffic-flow processes, such as the boundaries of busy periods at the queues. In contrast, the time-driven dynamics describe the continuous evolution of flow rates between successive discrete events, usually by differential equations or explicit functional terms. Performance metrics that are natural to SFMs typically reflect quantitative measures of flow rates, like average throughput, buffer workload, and loss.

Due to the smoothing effects of SFMs, they appear to provide a far more natural setting for IPA than their analogous discrete queueing counterparts. Furthermore, their IPA gradients often are computable via extremely simple algorithms that are based entirely on the observed sample path. Consequently, SFMs could, in principle, be implemented on the sample paths generated by an actual system rather than simulations thereof and thus be used in real-time optimization.

All of this will next be explained via a concrete example of a queue which, though simple, captures the salient features of the SFM setting for IPA. For a more comprehensive discussion, please see Cassandras et al. (2010), Wardi et al. (2010), and Yao and Cassandras (2013).
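Returning to SPA, the smoothing effect of conditioning can be seen in a toy example of our own (not from the article): L(θ) = 1{θU + W > c} with U ~ Uniform(0, 1) and W ~ N(0, 1). The IPA derivative ∇L is zero almost everywhere, but conditioning on the characterization Z = U gives E[L(θ) | Z] = Φ(θU − c), which is smooth in θ, so its derivative U·φ(θU − c) serves as the SPA estimator:

```python
import math, random

def phi(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def spa_gradient(theta, c, n, seed=0):
    """SPA estimator of d/dtheta E[1{theta*U + W > c}]: average the derivative
    of the conditional expectation E[L | Z=U] = Phi(theta*U - c) over draws of U."""
    rng = random.Random(seed)
    return sum(u * phi(theta * u - c) for u in (rng.random() for _ in range(n))) / n
```

Note that the discontinuous sample function L itself is never differentiated; only its conditional expectation is.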
Cassandras CG, Wardi Y, Panayiotou CG, Yao C (2010) Perturbation analysis and optimization of stochastic hybrid systems. Eur J Control 16:642–664
Glasserman P (1991) Gradient estimation via perturbation analysis. Kluwer, Boston
Gong WB, Ho YC (1987) Smoothed perturbation analysis of discrete event systems. IEEE Trans Autom Control 32:858–866
Heidelberger P, Cao X-R, Zazanis M, Suri R (1988) Convergence properties of infinitesimal perturbation analysis estimates. Manag Sci 34:1281–1302
Ho YC, Cao X-R (1991) Perturbation analysis of discrete event dynamical systems. Kluwer, Boston
Ho YC, Cassandras CG (1983) A new approach to the analysis of discrete event dynamic systems. Automatica 19:149–167
Ho YC, Eyler MA, Chien DT (1979) A gradient technique for general buffer storage design in a serial production line. Int J Prod Res 17:557–580
Ho YC, Cao X-R, Cassandras CG (1983) Infinitesimal and finite perturbation analysis for queueing networks. Automatica 19:439–445
Wardi Y, Adams R, Melamed B (2010) A unified approach to infinitesimal perturbation analysis in stochastic flow models: the single-stage case. IEEE Trans Autom Control 55:89–103
Yao C, Cassandras CG (2013) Perturbation analysis and optimization of multiclass multiobjective stochastic flow models. Discret Event Dyn Syst 21:219–256

Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization

Xi-Ren Cao
Department of Finance and Department of Automation, Shanghai Jiao Tong University, Shanghai, China
Institute of Advanced Study, Hong Kong University of Science and Technology, Hong Kong, China

Abstract

We introduce the theories and methodologies that utilize the special features of discrete event dynamic systems (DEDSs) for perturbation analysis (PA) and optimization of steady-state performance. Such theories and methodologies usually take different perspectives from the traditional optimization approaches and therefore may lead to new insights and efficient algorithms. The topics discussed include gradient-based optimization for systems with continuous parameters and direct-comparison-based optimization for systems with discrete policies, which is an alternative to dynamic programming and may apply when the latter fails. Furthermore, these new insights can also be applied to continuous-time and continuous-state dynamic systems, leading to a new paradigm of optimal control.

Keywords

Gradient estimation; Sample-path techniques; Sensitivity analysis; Queueing networks

Introduction

In this chapter, we introduce the theories and methodologies that utilize the special features of discrete event dynamic systems (DEDSs) for perturbation analysis (PA) and optimization of steady-state performance. Such theories and methodologies usually take different perspectives from the traditional optimization approaches and therefore may lead to new insights and efficient algorithms. Furthermore, these new insights can also be applied to continuous-time and continuous-state dynamic systems, leading to a new paradigm of optimal control.

As discussed in Perturbation Analysis of Discrete Event Systems, perturbation analysis (PA) can be applied both to performance in a finite period and to steady-state performance. This chapter will mainly focus on the latter and a related topic, the sensitivity-based optimization of steady-state performance of stochastic discrete event dynamic systems.

Gradient-Based Approaches

Basic Ideas

The gradient-based performance optimization of discrete event dynamic systems (DEDSs) consists of three steps:
1. Developing efficient algorithms to estimate the performance gradients using the special features of a DEDS
2. Studying the properties of the gradient estimates, including investigating whether they are unbiased and/or strongly consistent
3. With the gradient estimates, developing efficient optimization algorithms

Steps 1 and 2 are referred to as PA, and Step 3 is usually done together with standard gradient-based optimization approaches, such as hill-climbing approaches and stochastic approximation approaches such as the Robbins-Monro algorithm (Robbins and Monro 1951).

Our focus here is on PA for steady-state performance. The main principle for estimating the gradients of steady-state performance is decomposition. In a DEDS, a small change in the value of a system parameter induces a series of changes on the system's sample path; each of such changes is called a perturbation. A single perturbation alone will affect the sample path and therefore affect the system performance. Such an effect is typically small, and therefore, linear superposition usually holds. Thus, the effect of a small parameter change on the steady-state performance can be determined by summing up the effects of all the perturbations induced (or generated) by the parameter change. This principle is illustrated by Fig. 1, and it applies to many different systems with different performance criteria. Following the principle, efficient algorithms can be developed, and their strong consistency can be proved (Cao 2007). Its application to queueing systems and Markov systems will be discussed in the following two subsections.

Queueing Systems

The gradient-based approach for DEDSs starts with queueing systems, known as infinitesimal perturbation analysis (IPA), or simply PA (see Perturbation Analysis of Discrete Event Systems). Queueing systems are widely used as a model for many DEDSs in the literature and have very unique structural features, and PA utilizes such special features to develop fast algorithms to estimate the performance gradients.

We first give a brief explanation of the simple rules of PA of queueing systems (Ho and Cao 1983, 1991). Consider a closed Jackson network with M servers and N customers. The service times of customers at server i are exponentially distributed with service rate μ_i, i = 1, 2, ..., M. After a customer completes its service at server i, it goes to server j with routing probability q_{ij}, i, j = 1, 2, ..., M. The service discipline is first come first served. The number of customers at server i is denoted as n_i, and the system state is denoted as n = (n_1, n_2, ..., n_M). The state process is n(t) = (n_1(t), n_2(t), ..., n_M(t)), with n_i(t) being the number of customers at server i at time t, i = 1, 2, ..., M, t ≥ 0. Define T_l as the lth state transition time of the process n(t).

Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 1 The decomposition of performance changes
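The closed Jackson network just described is straightforward to simulate event by event. A minimal sketch (the initial placement of all customers at server 0 and the concrete rates and routing matrix used in testing are illustrative assumptions):

```python
import math, random

def simulate_closed_jackson(M, N, mu, Q, T, seed=0):
    """Event-driven simulation of a closed Jackson network: M exponential
    servers with rates mu[i], N customers, routing matrix Q (rows sum to 1),
    FCFS. Returns the piecewise-constant state trajectory [(t, n)] up to T."""
    rng = random.Random(seed)
    n = [N] + [0] * (M - 1)            # all customers start at server 0
    def draw(i):                        # fresh exponential service time at server i
        return -math.log(1.0 - rng.random()) / mu[i]
    clock = [draw(i) if n[i] > 0 else math.inf for i in range(M)]
    traj = [(0.0, tuple(n))]
    while True:
        i = min(range(M), key=lambda k: clock[k])   # next service completion
        if clock[i] > T:
            return traj
        t = clock[i]
        # route the finished customer from server i to server j with prob. Q[i][j]
        r, j, acc = rng.random(), 0, Q[i][0]
        while acc < r and j < M - 1:
            j += 1
            acc += Q[i][j]
        n[i] -= 1
        n[j] += 1
        clock[i] = t + draw(i) if n[i] > 0 else math.inf
        if n[j] == 1 and j != i:        # server j leaves an idle period
            clock[j] = t + draw(j)
        traj.append((t, tuple(n)))
```

The trajectory returned is exactly the state process n(t) of the text, sampled at the transition times T_l.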
Perturbation Analysis of Steady-State Performance and Sensitivity-Based Optimization, Fig. 2 Perturbation generation and propagation in a queueing network
Figure 2 illustrates an example of a sample path where each stair-style line represents a trajectory of the number of customers at one server, and the customer transitions among servers are indicated by dotted arrows.

An exponentially distributed service time with rate μ can be generated according to s = −(1/μ) ln ξ, where ξ is a uniformly distributed random number in [0, 1]. Now let the service rate of a server, say server k, change from μ_k to μ_k + Δμ_k, Δμ_k << μ_k. Then the service time s will change to s' = −(1/(μ_k + Δμ_k)) ln ξ (the same ξ is used for both s and s'). Thus, the perturbation of the service time induced by the change in μ_k is

Δs = s' − s ≈ −(Δμ_k / μ_k) s.  (1)

In summary, because of a small (infinitesimal) change Δμ_k of the service rate, every customer's service completion time at server k will obtain (be delayed by) a perturbation that is proportional to its original service time, with multiplier −Δμ_k / μ_k. This is the rule of perturbation generation.

Next, we observe that a perturbation will affect the service starting and completion times of other customers in the same server and in other servers in the network. We say a perturbation will be propagated through the network. First, the perturbations generated at a server will accumulate until this server is idle. When the server enters an idle period, all its perturbations will be lost. (After the idle period, the service starting time is determined by the customer that terminates the idle period, which carries the perturbation of another server.) Second, when a customer finishes its service at server i and enters server j after an idle period of server j, server j will obtain the same amount of perturbation as server i. If server j is not idle at this time, the perturbation at server i will only affect the arrival time of this customer and will not affect any other customers/servers.

Figure 2 illustrates the perturbation generation and propagation process of a network in which the service rate of server 1 is decreased by an infinitesimal amount. Given a system sample path obtained by simulation or observation, we can record the perturbations of all servers as they are generated and propagated along the sample path. From the perturbations we can get a perturbed sample path and finally get the performance changes caused by the change in the service rate. Specifically, we have the following algorithm for the perturbations of the service completion times (if server k's service rate is perturbed). In Step 2, we add (Δμ_k / μ_k) s_{k,l} as the perturbation generated, instead of −(Δμ_k / μ_k) s_{k,l} as indicated by (1).
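The generation rule (1) is easy to check numerically with the inverse-transform construction above (the numerical values are illustrative):

```python
import math, random

def service_time(mu, xi):
    """Exponential service time via inverse transform: s = -(1/mu) * ln(xi)."""
    return -math.log(xi) / mu

random.seed(0)
mu, dmu = 2.0, 1e-6          # small rate change, dmu << mu
xi = random.random()          # the same uniform variate is reused for s and s'
s = service_time(mu, xi)
ds_exact = service_time(mu + dmu, xi) - s   # s' - s
ds_rule = -(dmu / mu) * s                   # perturbation generation rule (1)
```

The two agree to first order in Δμ_k; the O(Δμ_k²) gap vanishes in the infinitesimal limit.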
dη/dδ = lim_{δ→0} Δη/δ = π Q g = π (P' − P) g.  (7)

If the reward function also changes, from f to f(δ) = f + δ(f' − f), then (7) becomes

dη/dδ = π [(P'g + f') − (Pg + f)].  (8)

Further Works

The ideas described in the previous subsections may stimulate new research topics in theoretical analysis, estimation algorithms, and applications. Here we can only give a very brief review of some of them.

Direct Comparison and Policy Iteration

The sensitivity-based view has been extended to optimization in discrete spaces of policies. With this view, we can develop a new approach to performance optimization based on a direct comparison of the performance of any two policies. This provides an alternative to standard dynamic programming for solving Markov decision process (MDP) types of problems; it has also been applied to solve some problems where the standard MDP approach fails (Cao 2007; Cao and Wan 2013).
In an MDP, there is an action space denoted as A. For simplicity, we only consider the discrete case. When the system is at any state i ∈ S, an action α = d(i) is taken, which controls the system transition probability, denoted as p^α(j|i), i, j ∈ S, and the reward function, denoted as f(i, α). The mapping d : S → A is called a policy. Since a policy corresponds to a transition probability matrix P^d = [p^{d(i)}(j|i)]_{i,j=1}^{M} and the reward vector f^d = (f(1, d(1)), ..., f(M, d(M)))^T, we also call the pair (P, f) a policy.

Consider two policies (P, f) and (P', f'), and assume that the Markov chain under both policies is ergodic. We use the prime "'" to denote the quantities associated with (P', f'). First, multiplying both sides of the Poisson equation (6) with π' on the left and after some calculations, we get

η' − η = π' {(P'g + f') − (Pg + f)}.  (9)

This is called a performance difference formula, and many optimization results can be derived from it in an intuitive way.

Policy Iteration and the Optimality Equation

The difference formula (9) has a nice decomposition structure: it contains two factors, the first one π', which does not depend on P, and the second one (P'g + f') − (Pg + f), in which all the parameters are known except the performance potential g, which can be obtained by analyzing only the system with P. This nice feature makes the difference formula the basis of performance optimization.

For two M-dimensional vectors a and b, we define a = b if a(i) = b(i) for all i = 1, 2, ..., M; a ≤ b if a(i) ≤ b(i) for all i = 1, 2, ..., M; a < b if a(i) < b(i) for all i = 1, 2, ..., M; and a ⪇ b if a(i) < b(i) for at least one i and a(j) = b(j) for the other components. The relation ≤ includes =, ⪇, and <. Similar definitions are used for the relations ≥, >, and ⪈.

Next, we note that π'(i) > 0 for all i = 1, 2, ..., M. Thus, from (9), we know that if (P' − P)g + (f' − f) ⪈ 0, then η' − η > 0. From (9) and the fact π' > 0, the proof of the following lemma is straightforward.

Lemma 1 If Pg + f ⪇ (≤) P'g + f', then η < (≤) η'.

It is interesting to note that in the lemma we use only the potentials of one Markov chain, i.e., g. Thus, because of the special structure of the performance difference formula (9), if the condition in Lemma 1 holds, to compare the performance measures under two policies, we may only need the potentials of one policy.

Policy iteration and the optimality equation can be easily derived from (9) and Lemma 1.

Algorithm 2 Policy Iteration
1. Guess an initial policy d_0, and set k = 0.
2. (Policy evaluation) Obtain the potential g^{d_k} by solving the Poisson equation (I − P^{d_k}) g^{d_k} + η^{d_k} e = f^{d_k} or estimating it on a sample path. (The superscript "d_k" is added to quantities associated with policy d_k.)
3. (Policy improvement) Choose

d_{k+1} ∈ arg max_{d∈D} {f^d + P^d g^{d_k}},  (10)

component-wise (i.e., to determine an action for each state). If in state i action d_k(i) attains the maximum, then set d_{k+1}(i) = d_k(i).
4. If d_{k+1} = d_k, stop; otherwise, set k := k + 1 and go to Step 2.

It follows directly from Lemma 1 that when the iteration does not stop, the performance improves at each iteration. It can be proved easily by construction that the iteration stops at an optimal policy. Again, from Lemma 1, when the iteration stops at a policy d̂, it holds that

η^{d̂} e + g^{d̂} = max_{d∈D} {f^d + P^d g^{d̂}}.  (11)

This is the Hamilton-Jacobi-Bellman (HJB) optimality equation. If we further define the Q-factor

Q^d(i, α) = f(i, α) + Σ_j p^α(j|i) g^d(j),  (12)

then the policy iteration equation (10) and the HJB equation (11) become
   d_{k+1}(i) := arg max_{α∈A} { Q^{d_k}(i, α) }   (13)

and

   Q^{d̂}(i, d̂(i)) = max_{β∈A} { Q^{d̂}(i, β) }.   (14)

The only difference between (8) and (9) is that π in (8) is replaced by π′ in (9). This leads to an interesting observation: policy iteration in MDPs in fact chooses the policy with the steepest directional derivative as the policy for the next iteration. Therefore, policy iteration can in fact be viewed as "gradient-based" optimization in a discrete space.

In our approach, the HJB equation and policy iteration are obtained from the performance difference equation (9), which compares the performance of any two policies. It is hence called a direct-comparison-based approach. This approach also applies to more general problems, such as multichain Markov systems, systems with absorbing states, and problems with other performance criteria, such as the discounted performance, the bias, etc.

The direct-comparison approach has been successfully applied to the nth-bias optimality problem (Cao 2007). Essentially, starting with the performance difference formulas (those similar to (9) for different performance criteria), we can develop a simple and direct approach to derive results that are equivalent to sensitive discount optimality for multichain Markov systems with long-run average criteria (Puterman 1994); no discounting is needed and no dynamic programming is used. The approach, motivated by the development for discrete event dynamic systems, provides a clear overall picture for the area of MDPs.

The direct-comparison approach can also be applied to some problems where dynamic programming fails, including event-based optimization problems, where the sequence of events may not be Markovian; see the next section.

Event-Based Optimization

It is well known that for most systems modeled by Markov processes, the state spaces are too large, and it is not computationally feasible to implement policy iteration or to solve the HJB equations. On the other hand, in many practical problems in engineering, finance, and the social sciences, control actions are taken only when certain events occur. For example, in the traffic control of a network of subnetworks, oftentimes one cannot control the traffic within a subnetwork, and control actions are applied only when packets are transferred among different subnetworks. In a portfolio management problem, the investor sells or buys stocks when the price history exhibits some predetermined pattern (e.g., reaches some level). In a sensor network, actions are taken only when one of the sensors detects some abnormal situation. In a material handling problem, actions are taken when the inventory level falls below a certain threshold.

Conceptually, anything that happened in the past can be chosen as an event. However, the number of such events is too big (much bigger than the number of states), and studying all these events makes analysis infeasible and defeats our original purpose. Therefore, to properly define events, we need to strike a balance between generality on the one hand and applicability on the other.

In the event-based setting, an event e is defined as a set of state transitions with certain common properties. That is, e := { ⟨i, j⟩ : i, j ∈ S and ⟨i, j⟩ has the common properties }, where ⟨i, j⟩ denotes a state transition from i to j. This definition can also be easily generalized to represent a finite sequence of state transitions. We shall see that in many real problems, the number of events requiring control actions is usually much smaller than that of the states.

An event-based policy d is defined as a mapping from E to A, with E being the space of all events; that is, d : E → A. Let D_e denote the set of all the stationary and deterministic policies. The reward function under policy d is denoted as f^d = f(i, d(e)), and the associated long-run average performance is denoted as η^d. When an event e happens, we choose an action a = d(e) according to a policy d, where e ∈ E and d ∈ D_e. Our goal is to find an optimal policy d̂ which maximizes the long-run average performance as follows.
   d̂ = arg max_{d∈D_e} { η^d }.   (15)

The main difficulty in developing event-based optimization theories and algorithms lies in the fact that the sequence of events is usually not Markovian. Standard optimization approaches such as dynamic programming do not apply to such problems. However, we can apply the direct-comparison approach to this event-based optimization problem. Here we give a very brief discussion.

Consider an event-based policy d, d ∈ D_e. When event e occurs, the conditional transition probability is denoted as p^{d(e)}(j|i, e). Let π^d(i|e) be the conditional steady-state probability of state i when event e occurs under policy d. Define the aggregated Q-factor

   Q^d(e, α) = Σ_i π^d(i|e) [ f(i, α) + Σ_j p^α(j|i, e) g^d(j) ].   (16)

We may use these aggregated Q-factors to develop an event-based policy iteration algorithm as Algorithm 2 and obtain the policy iteration equation (cf. (13))

   d_{k+1}(e) := arg max_{α∈A} { Q^{d_k}(e, α) },   e ∈ E,   (17)

and the event-based HJB optimality equation (cf. (14)) is

   Q^{d̂}(e, d̂(e)) = max_{β∈A} { Q^{d̂}(e, β) }.   (18)

It can be proved that if the conditional probability π^h(i|e) does not depend on the policy, i.e.,

   π^h(i|e) = π^d(i|e),   ∀ i, e, and h, d,   (19)

then policy iteration (17) indeed leads to a sequence of policies with increasing performance, and (18) specifies an event-based optimal policy.

The event-based aggregated Q-factor (16) can be estimated on a sample path in the same way as the potentials (Cao 2007). The number of events requiring actions is usually much smaller than the number of states, and in some cases, such as the network admission control problem, it is linear in the system size.

The crucial condition for the above event-based optimization is (19). There are many problems, such as the control of networks of networks and the portfolio management problem, for which the condition holds; there are also many problems for which it does not. The error in applying (18) and policy iteration (17) comes from the difference between π^h(i|e) and π^d(i|e) at each iteration. Further research is needed.

We use a computer network as an example, in which each computer or router can be modeled as a queueing subnetwork, and the computer network is then a network of such subnetworks. Assume that there are M subnetworks and that subnetwork m, m = 1, 2, …, M, consists of k_m servers. The number of customers at the jth server of subnetwork m is denoted as n_{m,j}, j = 1, 2, …, k_m, and the number of customers in all the servers of subnetwork m is denoted as N_m = Σ_{j=1}^{k_m} n_{m,j}. Suppose the service times are exponentially distributed; then the system state is n := (n_{1,1}, …, n_{1,k_1}; …; n_{M,1}, …, n_{M,k_M}), and the aggregated state is N := (N_1, …, N_M). Suppose that the transition probabilities among the servers in the same subnetwork are fixed, that we can only control the transition probabilities among the subnetworks, and, furthermore, that we can only observe N. Then the problem can be modeled as an event-based optimization with an event being a customer transition among subnetworks, while the aggregated state is N. It can be proved that condition (19) holds, and therefore we may apply policy iteration (17) and the HJB equation (18).

Many existing problems fit the event-based framework. For example, in a partially observable Markov decision process (POMDP), we may define an observation, or a sequence of observations, as an event. Other examples include state and time aggregations, hierarchical control (hybrid systems), and options. Different events can be defined to capture the special features of these different problems. In this sense, the event-based
approach may provide a unified view of these different problems (Cao 2007).

The Sensitivity-Based Approach and Optimal Control

We refer to the approaches discussed in the previous sections as the sensitivity-based approach. When the parameters are continuous, it is the gradient-based approach, and the focus is on developing efficient algorithms that utilize the particular system structure to estimate the gradients (to justify the algorithms, the unbiasedness or the consistency of the estimates should be proved). When the parameters are discrete, it is the direct-comparison approach that leads to policy iteration and the HJB equations, and policy iteration can be viewed as the gradient-based method in discrete spaces. This approach is different from conventional dynamic programming, and it has been successfully applied to MDPs with different criteria and to the n-bias optimization problems, as well as to event-based optimization and some other problems where dynamic programming fails.

This sensitivity-based approach was motivated by the study of discrete event dynamic systems (DEDS). It has been realized that the principles and methodologies developed for DEDSs also apply to the optimization of continuous-time and continuous-state (CTCS) systems. Here are some examples:
1. CTCS systems: For CTCS systems, the dynamics are driven by Brownian motions or Levy processes. The transition probability matrix in DEDS should be replaced by the infinitesimal generator in CTCS, which is an operator on the space of continuous functions. With the performance difference formulas, we can redevelop the stochastic optimal control theory for many performance measures, including the long-run average, discounted performance, and finite-horizon problems, with no dynamic programming; see Cao et al. (2011).
2. Time-inconsistent optimization problems: In behavioral finance, people's preferences are modeled with a distorted probability. For example, a risk-taking person buys lotteries because in her/his mind s/he enlarges the possibility of winning a large sum, and a risk-averse person buys insurance because s/he is afraid of a big loss and therefore enlarges its probability.

The optimization problem with a distorted probability suffers from the time-inconsistency issue; i.e., an optimal policy for the problem in the period [t, T), 0 < t < T, is not optimal in the same period for the problem in [0, T]. Thus, standard dynamic programming fails.

The gradient-based approach has been applied to the portfolio management problem with probability distortion. With this approach, we discovered that the performance with a distorted probability maintains some sort of linearity called mono-linearity. This property sheds new light on the portfolio management problem and the nonlinear expected utility theory (Cao and Wan 2013).

Conclusion

A sensitivity-based approach has been developed for the performance optimization of discrete event dynamic systems (DEDS). The approach utilizes the dynamic structure of DEDS. For systems with continuous parameters, it is the gradient-based optimization, in which the special features of a DEDS help in developing efficient algorithms to estimate the performance derivatives; for systems with discrete policies, it is the direct-comparison-based approach, with which policy iteration and the HJB equations can be derived intuitively by using the performance difference formulas. Policy iteration can be viewed as the gradient method in a discrete space. The estimation of gradients and the implementation of policy iteration can be carried out on a given sample path, and efficient online learning algorithms can be developed (Cao 2007).

The sensitivity-based approach was developed for DEDSs, but its principle also applies to systems with continuous-state spaces. The approach provides an alternative to traditional dynamic programming and therefore can be applied to some problems where dynamic programming does not work.
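The potential-based policy iteration of (10)–(13) can be made concrete in a short numerical sketch. The following Python/NumPy code is our own illustration, not code from this entry; the function names and the toy data are hypothetical. It evaluates a policy by solving the Poisson equation for the potential g, forms the Q-factors of (12), and improves the policy component-wise as in (10) and (13).

```python
import numpy as np

def potentials(P, f):
    """Solve the Poisson equation (I - P) g = f - eta e for the performance
    potential g (normalized so that pi g = 0) and the average reward eta."""
    M = P.shape[0]
    # steady-state distribution: pi P = pi, pi e = 1 (least-squares solve)
    A = np.vstack([P.T - np.eye(M), np.ones(M)])
    b = np.zeros(M + 1); b[-1] = 1.0
    pi = np.linalg.lstsq(A, b, rcond=None)[0]
    eta = pi @ f
    # (I - P + e pi) g = f - eta e has a unique solution with pi g = 0
    g = np.linalg.solve(np.eye(M) - P + np.outer(np.ones(M), pi), f - eta)
    return g, eta

def policy_iteration(P_all, f_all, d0):
    """P_all[a] is the transition matrix under action a; f_all[i, a] is the
    reward in state i under action a; d0 is the initial policy."""
    M, A_n = f_all.shape
    d = np.array(d0)
    while True:
        # policy evaluation: potentials of the current policy (Poisson eq.)
        P = np.array([P_all[d[i]][i] for i in range(M)])
        f = np.array([f_all[i, d[i]] for i in range(M)])
        g, eta = potentials(P, f)
        # Q-factors of Eq. (12): Q(i, a) = f(i, a) + sum_j p^a(j|i) g(j)
        Q = np.array([[f_all[i, a] + P_all[a][i] @ g for a in range(A_n)]
                      for i in range(M)])
        # improvement as in (10)/(13), keeping the current action when it
        # already attains the maximum
        d_new = Q.argmax(axis=1)
        keep = np.isclose(Q[np.arange(M), d], Q.max(axis=1))
        d_new[keep] = d[keep]
        if np.array_equal(d_new, d):
            return d, eta
        d = d_new
```

On a toy two-state, two-action MDP this iterates to the average-reward-optimal policy in a few steps; in an online setting the potentials or Q-factors would instead be estimated from a sample path, as discussed in the entry.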
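As the Conclusion notes, the potentials can be estimated on a given sample path, which is what makes online learning possible. The sketch below is again our own illustration with hypothetical names: it estimates η as the path average of the reward and g(i) as the average truncated sum of f(X_t) − η following visits to state i (truncating at a finite horizon is a simplification; the truncation length is a tuning choice).

```python
import numpy as np

def estimate_potentials(path, f, num_states, horizon=50):
    """Monte Carlo estimates of the average reward eta and the potentials g
    from one observed trajectory: g(i) ~ E[sum_{t<horizon}(f(X_t) - eta) | X_0 = i]."""
    rewards = np.asarray([f[s] for s in path], dtype=float)
    eta = rewards.mean()  # sample average reward along the path
    sums = np.zeros(num_states)
    counts = np.zeros(num_states)
    # accumulate a truncated centered-reward sum after each visit to a state
    for t in range(len(path) - horizon):
        i = path[t]
        sums[i] += (rewards[t:t + horizon] - eta).sum()
        counts[i] += 1
    g = np.divide(sums, counts, out=np.zeros(num_states), where=counts > 0)
    return g, eta

def simulate(P, x0, steps, rng):
    """Generate a sample path of the Markov chain with transition matrix P."""
    path = [x0]
    for _ in range(steps):
        path.append(rng.choice(len(P), p=P[path[-1]]))
    return path
```

For the two-state chain with P = [[0.9, 0.1], [0.5, 0.5]] and f = (1, 0), the analytic values are η = 5/6 and g(0) − g(1) = 5/3, which the estimates approach as the path length grows.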
1062 PID Control
able to provide with a wide range of processes, it has become the de facto standard controller in industry. It has been evolving along with the progress of technology, and nowadays it is very often implemented in digital form rather than with pneumatic or electrical components. It can be found in virtually all kinds of control equipment, either as a stand-alone (single-station) controller or as a functional block in Programmable Logic Controllers (PLCs) and Distributed Control Systems (DCSs). Actually, the new potentialities offered by the development of digital technology and of software packages have led to a significant growth of research in the PID control field: new effective tools have been devised for the improvement of the analysis and design methods of the basic algorithm, as well as for the improvement of the additional functionalities that are implemented with the basic algorithm in order to increase its performance and its ease of use.

The success of PID controllers is also enhanced by the fact that they often represent the fundamental component of more sophisticated control schemes that can be implemented when the basic control law is not sufficient to achieve the required performance or when a more complicated control task is of concern.

Basics

Using a PID controller means applying a feedback controller that consists of the sum of three types of control actions: a proportional action, an integral action, and a derivative action.

The proportional control action is proportional to the current control error, according to the expression

   u(t) = K_p e(t) = K_p (r(t) − y(t)),   (1)

where u is the controller output, K_p is the proportional gain, r is the reference signal, and y is the process output. Its meaning is straightforward, since it implements the typical operation of increasing the control variable when the control error is large (with the appropriate sign). The transfer function of a proportional controller can be trivially derived as

   C(s) = K_p.   (2)

The main drawback of using a pure proportional controller is that, in general, it cannot set the steady-state error to zero. This motivates the addition of a bias (or reset) term u_b, namely,

   u(t) = K_p e(t) + u_b.   (3)

The value of u_b can then be adjusted manually until the steady-state error is reduced to zero. In commercial products, the proportional gain is often replaced by the proportional band PB, which is the range of error that causes a full-range change of the control variable, i.e.,

   PB = 100 / K_p.   (4)

The integral action is proportional to the integral of the control error, i.e.,

   u(t) = K_i ∫_0^t e(τ) dτ,   (5)

where K_i is the integral gain. It appears that the integral action is related to the past values of the control error. The corresponding transfer function is

   C(s) = K_i / s.   (6)

The presence of an integral action allows the steady-state error to be reduced to zero when a step reference signal is applied or a step load disturbance occurs. In other words, the integral action is able to set automatically the correct value of u_b in (3) so that the steady-state error is zero. For this reason, the integral action is also often called automatic reset.

While the proportional action is based on the current value of the control error and the integral action is based on its past values, the derivative action is based on the predicted future values of the control error. An ideal derivative control law can be expressed as